Abstract
The task of approximate string matching is to find all locations at which a pattern string p of length m matches a substring of a text string t of length n with at most k differences. It is common to use Levenshtein distance [5], which allows the differences to be single-character insertions, deletions, or substitutions. Recently, in [3], the IndelMYE, IndelWM and IndelBYN algorithms were introduced as modified versions of the bit-parallel algorithms of Myers [6], Wu & Manber [10] and Baeza-Yates & Navarro [1], respectively. These modified versions were made to support the indel distance, under which only single-character insertions and deletions are allowed. In this paper we present an improved version of IndelMYE that makes better use of the bit-operations and runs 24.5 percent faster in practice. Finally, we present a complete set of experimental results to support our findings.
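To make the problem statement concrete, the following is a minimal sketch of the plain dynamic-programming baseline for approximate matching under indel distance. It is not the IndelMYE algorithm or the improved version presented in the paper, neither of which is described in this abstract; the function name `indel_search` and the convention of reporting 1-based end positions are assumptions made for illustration.

```python
def indel_search(pattern, text, k):
    """Return 1-based end positions j in text where some substring
    ending at j is within indel distance k of pattern.

    Illustrative O(m*n) baseline only; not the bit-parallel method.
    """
    m = len(pattern)
    # col[i] holds D[i][j]: the smallest indel distance between
    # pattern[:i] and any substring of text ending at position j.
    col = list(range(m + 1))  # j = 0: deleting i pattern characters
    ends = []
    for j, c in enumerate(text, start=1):
        prev_diag = col[0]  # D[0][j-1]
        col[0] = 0          # a match may start anywhere in the text
        for i in range(1, m + 1):
            old = col[i]    # D[i][j-1]
            # One insertion or one deletion; substitutions are not allowed.
            best = min(old, col[i - 1]) + 1
            if pattern[i - 1] == c:
                best = min(best, prev_diag)  # matching characters cost 0
            col[i] = best
            prev_diag = old
        if col[m] <= k:
            ends.append(j)
    return ends


if __name__ == "__main__":
    print(indel_search("abc", "xxabcxx", 1))
```

Because a substitution must be simulated by one deletion plus one insertion, the pattern "abc" has no occurrence in "axc" with k = 1 under indel distance, although it would have one under Levenshtein distance.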
References
1. Baeza-Yates, R., Navarro, G.: Faster approximate string matching. Algorithmica 23(2), 127–158 (1999)
2. Hyyrö, H., Fredriksson, K., Navarro, G.: Increased bit-parallelism for approximate string matching. In: Ribeiro, C.C., Martins, S.L. (eds.) WEA 2004. LNCS, vol. 3059, pp. 285–298. Springer, Heidelberg (2004)
3. Hyyrö, H., Pinzon, Y., Shinohara, A.: Fast bit-vector algorithms for approximate string matching under indel distance. In: Vojtáš, P., Bieliková, M., Charron-Bost, B., Sýkora, O. (eds.) SOFSEM 2005. LNCS, vol. 3381, pp. 380–384. Springer, Heidelberg (2005)
4. Hyyrö, H.: Explaining and extending the bit-parallel approximate string matching algorithm of Myers. Technical Report A-2001-10, Dept. of Computer and Information Sciences, University of Tampere, Tampere, Finland (2001)
5. Levenshtein, V.I.: Binary codes capable of correcting spurious insertions and deletions of ones (original in Russian). Russian Problemy Peredachi Informatsii 1, 12–25 (1965)
6. Myers, G.: A fast bit-vector algorithm for approximate string matching based on dynamic programming. Journal of the ACM 46(3), 395–415 (1999)
7. Navarro, G.: A guided tour to approximate string matching. ACM Computing Surveys 33(1), 31–88 (2001)
8. Ukkonen, E.: Finding approximate patterns in strings. Journal of Algorithms 6, 132–137 (1985)
9. Wright, A.: Approximate string matching using within-word parallelism. Software Practice and Experience 24(4), 337–362 (1994)
10. Wu, S., Manber, U.: Fast text searching allowing errors. Comm. of the ACM 35(10), 83–91 (1992)
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hyyrö, H., Pinzon, Y., Shinohara, A. (2005). New Bit-Parallel Indel-Distance Algorithm. In: Nikoletseas, S.E. (eds) Experimental and Efficient Algorithms. WEA 2005. Lecture Notes in Computer Science, vol 3503. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11427186_33
DOI: https://doi.org/10.1007/11427186_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25920-6
Online ISBN: 978-3-540-32078-4
eBook Packages: Computer Science, Computer Science (R0)