An Efficient Algorithm for Generating Super Condensed Neighborhoods
Indexing methods for the approximate string matching problem spend considerable effort generating condensed neighborhoods. Here, we point out that condensed neighborhoods are not a minimal representation of a pattern neighborhood. We show that we can restrict our attention to super condensed neighborhoods, which are minimal. We then present an algorithm for generating super condensed neighborhoods. The algorithm runs in O(m⌈ m / w ⌉ s) time, where m is the pattern size, s is the size of the super condensed neighborhood, and w is the size of the processor word. Previous algorithms took O(m⌈ m / w ⌉ c) time, where c is the size of the condensed neighborhood. We further improve this algorithm by using bit-parallelism and increased bit-parallelism techniques. Our experimental results show that the resulting algorithm is very fast.
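The difference between the two representations can be illustrated with a brute-force sketch. This is not the algorithm of the paper; it only enumerates the sets on a toy input. It assumes the usual definitions, which the abstract does not spell out: the k-neighborhood of a pattern is the set of strings within edit distance k of it, the condensed neighborhood keeps the strings that have no proper prefix in the neighborhood, and the super condensed neighborhood keeps the strings that have no proper substring in the neighborhood. The function names and the restriction k < m are choices made for this illustration.

```python
from itertools import product


def edit_distance(a, b):
    """Unit-cost edit distance (insert, delete, substitute) by dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # match or substitute
        prev = cur
    return prev[-1]


def neighborhood(pattern, k, alphabet):
    """All strings over `alphabet` within edit distance k of `pattern`.

    Exhaustive enumeration, exponential in the string length: toy inputs only.
    Assumes k < len(pattern), so the empty string is never a member.
    """
    m = len(pattern)
    assert 0 <= k < m
    result = set()
    for n in range(m - k, m + k + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if edit_distance(pattern, s) <= k:
                result.add(s)
    return result


def condensed(nbhd):
    """Members with no proper prefix that is also a member (assumed definition)."""
    return {s for s in nbhd
            if not any(s[:i] in nbhd for i in range(1, len(s)))}


def super_condensed(nbhd):
    """Members with no proper substring that is also a member (assumed definition)."""
    return {s for s in nbhd
            if not any(s[i:j] in nbhd
                       for i in range(len(s))
                       for j in range(i + 1, len(s) + 1)
                       if (i, j) != (0, len(s)))}


if __name__ == "__main__":
    nbhd = neighborhood("abc", 1, "abc")
    print(len(nbhd), len(condensed(nbhd)), sorted(super_condensed(nbhd)))
```

For the pattern abc with k = 1 over the alphabet {a, b, c}, the super condensed set under these definitions is {ab, ac, bc}, which is strictly smaller than the condensed set. This is the gap between s and c in the running times quoted above.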
Keywords: Edit Distance, Approximate String Match, Computer Word, Dynamic Programming Table, Canonical Path