Indexing Structures for Approximate String Matching

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2653)

Abstract

In this paper we give what are, to our knowledge, the first structures and corresponding algorithms for approximate indexing under the Hamming distance that have the following properties.

  i) Their size is, on average, linear in the size of the text times a polylogarithmic factor in that size.

  ii) For each pattern x, the time our algorithms spend finding the list occ(x) of all occurrences of x in the text, up to a certain distance, is proportional on average to |x| + |occ(x)|, under an additional but realistic hypothesis.
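To make the problem statement concrete, the following minimal sketch computes occ(x) for a Hamming distance bound k by scanning the text directly. It is a brute-force baseline written for illustration only, not the indexing structure of the paper: it uses no index and takes time proportional to |text| · |x| rather than |x| + |occ(x)|. The names `hamming_distance`, `occurrences`, and the parameter `k` are assumptions introduced here.

```python
def hamming_distance(u: str, v: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(u) != len(v):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(a != b for a, b in zip(u, v))


def occurrences(text: str, x: str, k: int) -> list[int]:
    """Start positions i where text[i:i+len(x)] is within Hamming distance k of x.

    Brute-force scan (illustrative assumption, not the paper's method):
    every window of length len(x) is compared against the pattern.
    """
    m = len(x)
    return [
        i
        for i in range(len(text) - m + 1)
        if hamming_distance(text[i : i + m], x) <= k
    ]


# With at most one mismatch allowed, "abd" at position 3 also counts.
print(occurrences("abcabdabc", "abc", 1))  # [0, 3, 6]
```

An index in the sense of the abstract would preprocess the text once so that each query avoids this full scan.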

Supported by MIUR National Project PRIN “Linguaggi Formali e Automi: teoria ed applicazioni”

Supported by Progetto Giovani Ricercatori anno 1999 - Comitato 01

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gabriele, A., Mignosi, F., Restivo, A., Sciortino, M. (2003). Indexing Structures for Approximate String Matching. In: Petreschi, R., Persiano, G., Silvestri, R. (eds) Algorithms and Complexity. CIAC 2003. Lecture Notes in Computer Science, vol 2653. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44849-7_20

  • DOI: https://doi.org/10.1007/3-540-44849-7_20

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40176-6

  • Online ISBN: 978-3-540-44849-5

  • eBook Packages: Springer Book Archive
