Fast and practical approximate string matching

  • Conference paper
Combinatorial Pattern Matching (CPM 1992)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 644))

Abstract

We present new algorithms for approximate string matching based on simple but efficient ideas. First, we present an algorithm for string matching with mismatches, based on arithmetical operations, that runs in linear worst-case time for most practical cases. This is a new approach to string searching. Second, we present an algorithm for string matching with errors, based on partitioning the pattern, that requires linear expected time for typical inputs.
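The pattern-partitioning idea can be illustrated with a small sketch. It rests on a pigeonhole argument: if a pattern is cut into k + 1 pieces and an occurrence contains at most k errors, at least one piece must appear in the text unchanged. The Python code below is an illustration under that assumption, not the authors' implementation; the function names `min_edit_distance` and `approx_find`, the even split into pieces, and the use of `str.find` for the exact search are all choices made here for brevity.

```python
def min_edit_distance(pattern, window):
    """Smallest edit distance between `pattern` and any substring of `window`."""
    m = len(pattern)
    prev = list(range(m + 1))      # column for the empty text: D[i] = i
    best = prev[m]
    for ch in window:
        cur = [0]                  # a match may start at any text position
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            cur.append(min(prev[i - 1] + cost,   # match or substitution
                           prev[i] + 1,          # extra character in the text
                           cur[i - 1] + 1))      # pattern character missing
        prev = cur
        best = min(best, prev[m])
    return best


def approx_find(pattern, text, k):
    """Return True if some substring of `text` is within k edits of `pattern`."""
    m = len(pattern)
    if k >= m:
        return True                # the empty substring is already close enough
    # Split the pattern into k + 1 non-empty pieces of nearly equal length.
    base, extra = divmod(m, k + 1)
    pieces, start = [], 0
    for j in range(k + 1):
        length = base + (1 if j < extra else 0)
        pieces.append((start, pattern[start:start + length]))
        start += length
    # Search each piece exactly; verify only the neighbourhood of each hit.
    for offset, piece in pieces:
        pos = text.find(piece)
        while pos != -1:
            lo = max(0, pos - offset - k)
            hi = min(len(text), pos - offset + m + k)
            if min_edit_distance(pattern, text[lo:hi]) <= k:
                return True
            pos = text.find(piece, pos + 1)
    return False
```

For example, `approx_find("survey", "a surgery was done", 2)` succeeds because the piece "su" occurs exactly and the surrounding window is within two edits of the pattern. The costly dynamic-programming check runs only near exact hits of a piece, which is why such a filter can be fast when hits are rare.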

This work was partially supported by Grant C-11001 from Fundación Andes and Grant DTI I-3084-9222 of the University of Chile.

Editor information

Alberto Apostolico Maxime Crochemore Zvi Galil Udi Manber

Copyright information

© 1992 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Baeza-Yates, R.A., Perleberg, C.H. (1992). Fast and practical approximate string matching. In: Apostolico, A., Crochemore, M., Galil, Z., Manber, U. (eds) Combinatorial Pattern Matching. CPM 1992. Lecture Notes in Computer Science, vol 644. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-56024-6_15

  • DOI: https://doi.org/10.1007/3-540-56024-6_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-56024-1

  • Online ISBN: 978-3-540-47357-2

  • eBook Packages: Springer Book Archive
