Indexing and Dictionary Matching with One Error (Extended Abstract)
In the indexing problem, a text is preprocessed so that subsequent queries of the form “Find all occurrences of pattern P in the text” are answered in time proportional to the length of the query and the number of occurrences. In the dictionary matching problem, a set of patterns is preprocessed so that subsequent queries of the form “Find all occurrences of dictionary patterns in text T” are answered in time proportional to the length of the text and the number of occurrences.
In this paper we present a uniform deterministic solution to both the indexing and the general dictionary matching problem with one error. We preprocess the data in time O(n log^2 n), where n is the text size in the indexing problem and the dictionary size in the dictionary matching problem. Our query time for the indexing problem is O(m log n log log n + tocc), where m is the query string size and tocc is the number of occurrences.
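To make the query semantics concrete, here is a brute-force reference for the indexing query with one error. It is only a sketch: it assumes that "one error" means a single substituted character (the abstract does not spell out the error model, and insertions or deletions are not handled), the function names are hypothetical, and it runs in O(nm) time with no preprocessing, so it is not the data structure the paper describes.

```python
from typing import List


def differs_in_at_most_one(a: str, b: str) -> bool:
    """Return True if equal-length strings a and b differ in at most one position."""
    if len(a) != len(b):
        return False
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > 1:
                return False
    return True


def find_with_one_mismatch(text: str, pattern: str) -> List[int]:
    """Brute-force reference (assumed error model: one substitution).

    Returns every start position i such that text[i:i+m] equals the
    pattern up to at most one mismatched character.
    """
    m = len(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if differs_in_at_most_one(text[i:i + m], pattern)
    ]


print(find_with_one_mismatch("banana", "nana"))  # [0, 2]
```

Position 0 is reported because "bana" differs from "nana" in one character, and position 2 is an exact occurrence. A checker like this is useful mainly as a correctness oracle when testing a faster index.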
Our query time for the dictionary matching problem is O(n log^3 d log log d + tocc), where n is the text size and d the dictionary size.
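The dictionary matching query with one error can be stated the same way. The sketch below again assumes that an error is a single substituted character, uses a hypothetical function name, and simply scans the text once per pattern; it illustrates what is being computed, not how the paper achieves its query time.

```python
from typing import Iterable, List, Tuple


def dictionary_match_one_mismatch(
    patterns: Iterable[str], text: str
) -> List[Tuple[int, str]]:
    """Brute-force reference (assumed error model: one substitution).

    Returns sorted (position, pattern) pairs such that the pattern
    matches the text at that position with at most one mismatch.
    """
    hits = []
    for p in patterns:
        m = len(p)
        for i in range(len(text) - m + 1):
            # Count mismatched characters between p and the window at i.
            mismatches = sum(1 for x, y in zip(p, text[i:i + m]) if x != y)
            if mismatches <= 1:
                hits.append((i, p))
    return sorted(hits)


print(dictionary_match_one_mismatch(["cat", "dog"], "cotdig"))
# [(0, 'cat'), (3, 'dog')]
```

Here "cot" is within one substitution of "cat" and "dig" is within one substitution of "dog", so both are reported even though neither pattern occurs exactly.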
Keywords: Query Processing, Exact Match, Range Query, Query Time, String Match