Abstract
We present new algorithms for approximate string matching based on simple but efficient ideas. First, we present an algorithm for string matching with mismatches, based on arithmetical operations, that runs in linear worst-case time for most practical cases. This is a new approach to string searching. Second, we present an algorithm for string matching with errors, based on partitioning the pattern, that requires linear expected time for typical inputs.
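The abstract gives no details of the partitioning idea, so the following is only a minimal sketch of one common way to realize it, not the authors' algorithm. It assumes the standard pigeonhole observation: if the pattern is cut into k + 1 pieces, any occurrence with at most k errors must contain at least one piece unchanged. Exact hits of the pieces then act as a filter, and only a small window around each hit is verified with a dynamic-programming check. The function names `sellers_ends` and `partition_search`, the piece sizes, and the window bounds are all choices made for this illustration.

```python
def sellers_ends(text, pattern, k, lo=0, hi=None):
    """End indices (exclusive) of substrings of text[lo:hi] that are
    within edit distance k of pattern (classic column-wise DP)."""
    if hi is None:
        hi = len(text)
    m = len(pattern)
    col = list(range(m + 1))  # distances against an empty text prefix
    ends = []
    for j in range(lo, hi):
        prev_diag, col[0] = col[0], 0  # a match may start at any position
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == text[j] else 1
            cur = min(prev_diag + cost, col[i] + 1, col[i - 1] + 1)
            prev_diag, col[i] = col[i], cur
        if col[m] <= k:
            ends.append(j + 1)
    return ends


def partition_search(text, pattern, k):
    """Filter-then-verify search (hypothetical helper for illustration):
    split pattern into k + 1 pieces, locate each piece exactly, and run
    the DP only in a window around each exact hit."""
    m = len(pattern)
    if not 0 <= k < m:
        raise ValueError("need 0 <= k < len(pattern)")
    size = m // (k + 1)  # every piece has at least one character
    ends = set()
    for idx in range(k + 1):
        off = idx * size
        # the last piece absorbs any leftover characters
        piece = pattern[off:] if idx == k else pattern[off:off + size]
        p = text.find(piece)
        while p != -1:
            # an occurrence containing this hit lies inside this window
            lo = max(0, p - off - k)
            hi = min(len(text), p - off + m + k)
            ends.update(sellers_ends(text, pattern, k, lo, hi))
            p = text.find(piece, p + 1)
    return sorted(ends)
```

Calling `partition_search(text, pattern, k)` returns the same end positions as running `sellers_ends` over the whole text. When exact piece hits are rare, most of the text is never passed to the quadratic verification step, which is the intuition behind the expected-time claim for typical inputs.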
This work was partially supported by Grant C-11001 from Fundación Andes and Grant DTI I-3084-9222 of the University of Chile.
Copyright information
© 1992 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Baeza-Yates, R.A., Perleberg, C.H. (1992). Fast and practical approximate string matching. In: Apostolico, A., Crochemore, M., Galil, Z., Manber, U. (eds) Combinatorial Pattern Matching. CPM 1992. Lecture Notes in Computer Science, vol 644. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-56024-6_15
DOI: https://doi.org/10.1007/3-540-56024-6_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-56024-1
Online ISBN: 978-3-540-47357-2
eBook Packages: Springer Book Archive