Abstract
Given a text T[1..n] and a pattern P[1..m], the classic dynamic programming algorithm for computing the edit distance between P and every location of T runs in time O(nm). The bit-parallel computation of the dynamic programming matrix [6] runs in time \(O(n \left\lceil m/w \right\rceil)\), where w is the number of bits in a computer word. We present a new method that rearranges the bit-parallel computations, achieving time \(O(\left\lceil n/w \right\rceil (m+\sigma\log_2(\sigma))+n)\), where σ is the size of the alphabet. The algorithm is then modified to solve the k differences problem. Its expected running time is \(O(\left\lceil n/w \right\rceil (L(k)+\sigma\log_2(\sigma))+R)\), where L(k) depends on k and R is the number of occurrences. The space usage is O(σ + m). In practice it is much faster than the existing \(O(n \left\lceil k/w \right\rceil)\) algorithm [6]. The new method is applicable only to small (e.g. DNA) alphabets, but for these it becomes the fastest algorithm for small m or moderate k/m. If we want to search for multiple patterns in a row, the method becomes attractive for large alphabet sizes too. We also consider applying 128-bit vector instructions to the bit-parallel computations.
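To make the two baselines mentioned in the abstract concrete, the following is a minimal Python sketch, not code from the paper. `sellers_distances` is the classic O(nm) dynamic programming recurrence, and `myers_distances` is the column-wise bit-parallel computation of [6] in the commonly used formulation attributed to Hyyrö. The function names are hypothetical, and the sketch assumes the whole pattern fits in one bit-vector (Python integers are unbounded, so a mask of m bits stands in for the w-bit machine word). It does not implement the row-wise tiling proposed in this paper.

```python
def sellers_distances(pattern, text):
    """Classic O(nm) DP: edit distance between the pattern and the best
    substring of the text ending at each text position."""
    m = len(pattern)
    col = list(range(m + 1))          # column 0: D[i][0] = i
    out = []
    for c in text:
        prev_diag = col[0]            # D[0][j-1], always 0
        col[0] = 0                    # a match may start at any text position
        for i in range(1, m + 1):
            above_left = prev_diag
            prev_diag = col[i]        # save D[i][j-1] before overwriting it
            col[i] = min(above_left + (pattern[i - 1] != c),  # substitute/match
                         prev_diag + 1,                       # D[i][j-1] + 1
                         col[i - 1] + 1)                      # D[i-1][j] + 1
        out.append(col[m])
    return out


def myers_distances(pattern, text):
    """Bit-parallel version of the same computation: one column of the DP
    matrix is encoded as vertical +1/-1 delta bit-vectors (pv, mv)."""
    m = len(pattern)
    if m == 0:
        return [0] * len(text)
    mask = (1 << m) - 1               # emulates an m-bit machine word
    high = 1 << (m - 1)               # bit of the last pattern row
    peq = {}                          # peq[c]: bit i set iff pattern[i] == c
    for i, c in enumerate(pattern):
        peq[c] = peq.get(c, 0) | (1 << i)

    pv, mv, score = mask, 0, m        # column 0: every vertical delta is +1
    out = []
    for c in text:
        eq = peq.get(c, 0)
        xv = eq | mv
        xh = ((((eq & pv) + pv) ^ pv) | eq) & mask
        ph = (mv | ~(xh | pv)) & mask  # horizontal delta +1
        mh = pv & xh                   # horizontal delta -1
        if ph & high:
            score += 1
        elif mh & high:
            score -= 1
        ph = (ph << 1) & mask          # row 0 has horizontal delta 0 in search
        mh = (mh << 1) & mask
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv
        out.append(score)
    return out
```

Both functions return one distance per text position; position j is reported as an occurrence with at most k differences when the value at j is at most k. The bit-parallel version does a constant number of word operations per text character, which is where the \(O(n \left\lceil m/w \right\rceil)\) bound comes from once the pattern is split into \(\left\lceil m/w \right\rceil\) words.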
References
1. Baeza-Yates, R.A., Navarro, G.: Faster approximate string matching. Algorithmica 23(2), 127–158 (1999)
2. Chang, W.I., Lampe, J.: Theoretical and empirical comparisons of approximate string matching algorithms. In: Bezem, M., Groote, J.F. (eds.) TLCA 1993. LNCS, vol. 664, pp. 175–184. Springer, Heidelberg (1993)
3. Fredriksson, K., Navarro, G.: Average-optimal multiple approximate string matching. In: Baeza-Yates, R., Chávez, E., Crochemore, M. (eds.) CPM 2003. LNCS, vol. 2676, pp. 109–128. Springer, Heidelberg (2003)
4. Hyyrö, H.: Extending and explaining the bit-parallel approximate string matching algorithm of Myers. Technical report A2001-10, Department of Computer and Information Sciences, University of Tampere (2001)
5. Hyyrö, H., Navarro, G.: Faster bit-parallel approximate string matching. In: Apostolico, A., Takeda, M. (eds.) CPM 2002. LNCS, vol. 2373, pp. 203–224. Springer, Heidelberg (2002)
6. Myers, G.: A fast bit-vector algorithm for approximate string matching based on dynamic programming. J. Assoc. Comput. Mach. 46(3), 395–415 (1999)
7. Navarro, G.: A guided tour to approximate string matching. ACM Computing Surveys 33(1), 31–88 (2001)
8. Navarro, G.: Indexing text using the Ziv-Lempel trie. In: Laender, A.H.F., Oliveira, A.L. (eds.) SPIRE 2002. LNCS, vol. 2476, pp. 325–336. Springer, Heidelberg (2002)
9. Navarro, G., Baeza-Yates, R.: Very fast and simple approximate string matching. Information Processing Letters 72, 65–70 (1999)
10. Navarro, G., Raffinot, M.: Fast and flexible string matching by combining bit-parallelism and suffix automata. ACM Journal of Experimental Algorithmics (JEA) 5(4) (2000), http://www.jea.acm.org/2000/NavarroString
11. Sellers, P.H.: The theory and computation of evolutionary distances: Pattern recognition. J. Algorithms 1(4), 359–373 (1980)
12. Ukkonen, E.: Algorithms for approximate string matching. Inf. Control 64(1-3), 100–118 (1985)
13. Wright, H.: Approximate string matching using within-word parallelism. Softw. Pract. Exp. 24(4), 337–362 (1994)
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fredriksson, K. (2003). Row-wise Tiling for the Myers’ Bit-Parallel Approximate String Matching Algorithm. In: Nascimento, M.A., de Moura, E.S., Oliveira, A.L. (eds) String Processing and Information Retrieval. SPIRE 2003. Lecture Notes in Computer Science, vol 2857. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39984-1_6
DOI: https://doi.org/10.1007/978-3-540-39984-1_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20177-9
Online ISBN: 978-3-540-39984-1
eBook Packages: Springer Book Archive