Row-wise Tiling for the Myers’ Bit-Parallel Approximate String Matching Algorithm
Given a text T[1..n] and a pattern P[1..m], the classic dynamic programming algorithm for computing the edit distance between P and every location of T runs in time O(nm). The bit-parallel computation of the dynamic programming matrix runs in time \(O(n \left\lceil m/w \right\rceil)\), where w is the number of bits in a computer word. We present a new method that rearranges the bit-parallel computations, achieving time \(O(\left\lceil n/w \right\rceil (m+\sigma\log_2(\sigma))+n)\), where σ is the size of the alphabet. The algorithm is then modified to solve the k differences problem. Its expected running time is \(O(\left\lceil n/w \right\rceil (L(k)+\sigma\log_2(\sigma))+R)\), where L(k) depends on k, and R is the number of occurrences. The space usage is O(σ + m). In practice it is much faster than the existing \(O(n \left\lceil k/w \right\rceil)\) algorithm. The new method is applicable only to small (e.g. DNA) alphabets, where it becomes the fastest algorithm for small m or moderate k/m. If we want to search for multiple patterns in a row, the method becomes attractive for large alphabets too. We also consider applying 128-bit vector instructions to the bit-parallel computations.
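For orientation, the two baselines the abstract compares against can be sketched as follows. This is a minimal illustration, not the row-wise tiling method of the paper: `dp_search` is the classic O(nm) column-by-column dynamic programming, and `myers_search` is the standard column-wise bit-parallel formulation attributed to Myers. The function names are hypothetical, and Python's unbounded integers stand in for a machine word, so the sketch does not model the \(\left\lceil m/w \right\rceil\) blocking needed when m exceeds w.

```python
def dp_search(pattern, text):
    """Classic O(nm) DP: returns, for each text position j, the minimum
    edit distance between the pattern and any substring ending at j."""
    m = len(pattern)
    col = list(range(m + 1))          # column 0: D[i][0] = i
    out = []
    for c in text:
        prev_diag = col[0]            # D[0][j-1]; top row is always 0
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == c else 1
            new = min(prev_diag + cost, col[i - 1] + 1, col[i] + 1)
            prev_diag = col[i]
            col[i] = new
        out.append(col[m])
    return out


def myers_search(pattern, text):
    """Column-wise bit-parallel version: vertical deltas of a DP column are
    packed into two bit vectors (pv: +1, mv: -1) and updated per character."""
    m = len(pattern)
    if m == 0:
        return [0] * len(text)
    mask = (1 << m) - 1               # keep only the m low bits
    high = 1 << (m - 1)               # bit of the last pattern row
    peq = {}                          # match vector per pattern character
    for i, ch in enumerate(pattern):
        peq[ch] = peq.get(ch, 0) | (1 << i)
    pv, mv, score = mask, 0, m
    out = []
    for c in text:
        eq = peq.get(c, 0)
        xv = eq | mv
        xh = ((((eq & pv) + pv) ^ pv) | eq) & mask
        ph = (mv | ~(xh | pv)) & mask  # horizontal +1 deltas
        mh = pv & xh                   # horizontal -1 deltas
        if ph & high:
            score += 1
        elif mh & high:
            score -= 1
        ph = (ph << 1) & mask          # top row stays 0 when searching
        mh = (mh << 1) & mask
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv
        out.append(score)
    return out
```

Both functions return the last row of the dynamic programming matrix; for the k differences problem, the positions j whose value is at most k are the reported occurrences. For example, with pattern `"abc"` and text `"xabcx"` both return `[3, 2, 1, 0, 1]`.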
Keywords: Edit Distance, Current Block, Approximate String Match, Computer Word, Large Alphabet