Approximate word sequence matching over Sparse Suffix Trees
In this paper, we discuss word sequence matching and adapt the common edit distance metric for approximate string matching to searching for words and sequences of words. We furthermore create a variant of the Sparse Suffix Tree and adapt algorithms for approximate word and word sequence matching to this variant. The algorithms have been implemented and tested in a WWW information retrieval environment, and performance data are presented.
Keywords: Leaf Node, Edit Distance, Semantic Interpretation, Suffix Tree, Edit Operation
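To illustrate what adapting edit distance from characters to words can look like, here is a minimal sketch in Python. It is not the paper's algorithm: the unit edit costs, the whitespace tokenization, the function names, and the `max_char_errors` threshold (two words count as equal when their character-level distance is within it) are all assumptions made for illustration, and the sketch uses plain dynamic programming rather than a Sparse Suffix Tree.

```python
def char_edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance between two strings, unit costs."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def word_sequence_distance(query: str, text: str, max_char_errors: int = 1) -> int:
    """Edit distance where the edit unit is a whole word (assumed cost model).

    Two words are treated as matching when their character-level edit
    distance is at most max_char_errors; otherwise substituting one for
    the other costs 1, as does inserting or deleting a word.
    """
    q, t = query.split(), text.split()
    prev = list(range(len(t) + 1))
    for i, qw in enumerate(q, start=1):
        curr = [i]
        for j, tw in enumerate(t, start=1):
            cost = 0 if char_edit_distance(qw, tw) <= max_char_errors else 1
            curr.append(min(prev[j] + 1,          # delete a query word
                            curr[j - 1] + 1,      # insert a text word
                            prev[j - 1] + cost))  # match or substitute a word
        prev = curr
    return prev[-1]
```

With these assumed costs, comparing "approximate word matching" against "approximate words sequence matching" yields a distance of 1 (one inserted word) when one character error per word is tolerated, and 2 when words must match exactly.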
- Cobbs A.L. (1995) “Fast Approximate Matching using Suffix Trees,” In Proceedings of the Sixth Symposium on Combinatorial Pattern Matching (CPM'95), Springer Verlag, pp. 41–54.
- Gonnet G.H., Baeza-Yates R.A., Snider T. (1991) “Lexicographical indices for text: Inverted files vs. PAT trees,” Technical Report OED-91-10, Center for the new OED, University of Waterloo.
- Kärkkäinen J., Ukkonen E. (1996) “Sparse Suffix Trees,” In Proceedings of the Second Annual International Computing and Combinatorics Conference (COCOON 96), Springer Verlag, pp. 219–230.
- Levenshtein V.I. (1965) “Binary codes capable of correcting deletions, insertions, and reversals,” (in Russian) Doklady Akademii Nauk SSSR, Vol. 163, No. 4, pp. 845–848 (also Cybernetics and Control Theory, Vol. 10, No. 8, pp. 707–710, 1966).
- Morrison D.R. (1968) “PATRICIA — Practical Algorithm To Retrieve Information Coded in Alphanumeric,” Journal of the ACM, Vol. 15, pp. 514–534.
- Shang H., Merrett T.H. (1996) “Tries for Approximate String Matching,” IEEE Transactions on Knowledge and Data Engineering, Vol. 8, No. 4, pp. 540–547.
- Ukkonen E. (1985) “Finding Approximate Patterns in Strings,” Journal of Algorithms, Vol. 6, pp. 132–137.
- Weiner P. (1973) “Linear pattern matching algorithms,” In Proceedings of the IEEE 14th Annual Symposium on Switching and Automata Theory, pp. 1–11.