Indexing and Dictionary Matching with One Error (Extended Abstract)

  • Amihood Amir
  • Dmitry Keselman
  • Gad M. Landau
  • Moshe Lewenstein
  • Noa Lewenstein
  • Michael Rodeh
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1663)

Abstract

The indexing problem is the one where a text is preprocessed and subsequent queries of the form: “Find all occurrences of pattern P in the text” are answered in time proportional to the length of the query and the number of occurrences. In the dictionary matching problem a set of patterns is preprocessed and subsequent queries of the form: “Find all occurrences of dictionary patterns in text T” are answered in time proportional to the length of the text and the number of occurrences.
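To make the two query types concrete, here is a minimal brute-force sketch of their exact-match semantics. It does no preprocessing, so it does not achieve the query times described above; the function names `index_query` and `dictionary_query` are hypothetical and chosen only for illustration.

```python
def index_query(text, pattern):
    """Return every start position at which `pattern` occurs in `text`.

    Naive scan for illustration only: a real index would preprocess
    `text` once and answer each query far faster.
    """
    m = len(pattern)
    return [i for i in range(len(text) - m + 1) if text[i:i + m] == pattern]


def dictionary_query(dictionary, text):
    """Return sorted (position, pattern) pairs for every occurrence of a
    dictionary pattern in `text`.

    Naive scan for illustration only: a real solution would preprocess
    `dictionary` once and then scan each text.
    """
    hits = []
    for pattern in dictionary:
        for position in index_query(text, pattern):
            hits.append((position, pattern))
    return sorted(hits)


if __name__ == "__main__":
    print(index_query("banana", "ana"))
    print(dictionary_query(["an", "na"], "banana"))
```

The two functions differ only in which input is fixed in advance: the text for indexing, the pattern set for dictionary matching.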

In this paper we present a uniform deterministic solution to both the indexing and the general dictionary matching problem with one error. We preprocess the data in time O(n log^2 n), where n is the text size in the indexing problem and the dictionary size in the dictionary matching problem. Our query time for the indexing problem is O(m log n log log n + tocc), where m is the query string size and tocc is the number of occurrences.

Our query time for the dictionary matching problem is O(n log^3 d log log d + tocc), where n is the text size and d the dictionary size.
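The one-error variant of the indexing query can be sketched by brute force as follows. This is an illustration of the query semantics only, under a loud assumption: the abstract does not specify the error model, and the sketch treats "one error" as at most one mismatched character (a substitution). It does no preprocessing and therefore does not achieve the bounds stated above. The names `within_one_mismatch` and `index_query_one_error` are hypothetical.

```python
def within_one_mismatch(a, b):
    """True if equal-length strings `a` and `b` differ in at most one position.

    Assumption: "one error" is modeled as a single substitution; the
    abstract itself does not state the error model.
    """
    if len(a) != len(b):
        return False
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > 1:
                return False
    return True


def index_query_one_error(text, pattern):
    """Return every start position where `pattern` matches `text`
    with at most one mismatch (naive scan, no preprocessing)."""
    m = len(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if within_one_mismatch(text[i:i + m], pattern)
    ]


if __name__ == "__main__":
    print(index_query_one_error("abcabd", "abd"))
```

Exact occurrences are reported too, since zero mismatches also satisfies the at-most-one condition.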

Keywords

Query Processing, Exact Match, Range Query, Query Time, String Match
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 1999

Authors and Affiliations

  • Amihood Amir (1)
  • Dmitry Keselman (2)
  • Gad M. Landau (3)
  • Moshe Lewenstein (4)
  • Noa Lewenstein (4)
  • Michael Rodeh (5)

  1. Department of Computer Science, Bar-Ilan University, Israel
  2. Simons Technologies, Decatur
  3. Department of Computer Science, Haifa University, Haifa, Israel
  4. Department of Mathematics and Computer Science, Bar-Ilan University, Ramat-Gan, Israel
  5. Computer Science Department, Technion, Haifa, Israel
