# Row-wise Tiling for the Myers’ Bit-Parallel Approximate String Matching Algorithm

Kimmo Fredriksson

Conference paper. Part of the Lecture Notes in Computer Science book series (LNCS, volume 2857).

## Abstract

Given a text T[1..n] and a pattern P[1..m], the classic dynamic programming algorithm for computing the edit distance between P and every location of T runs in time O(nm). The bit-parallel computation of the dynamic programming matrix runs in time $$O(n \left\lceil m/w \right\rceil)$$, where w is the number of bits in a computer word. We present a new method that rearranges the bit-parallel computations, achieving time $$O(\left\lceil n/w \right\rceil (m+\sigma\log_2(\sigma))+n)$$, where σ is the size of the alphabet. The algorithm is then modified to solve the k differences problem. The expected running time is $$O(\left\lceil n/w \right\rceil (L(k)+\sigma\log_2(\sigma))+R)$$, where L(k) depends on k, and R is the number of occurrences. The space usage is O(σ + m). In practice, the new algorithm is much faster than the existing $$O(n \left\lceil k/w \right\rceil)$$ algorithm. The new method is applicable only to small (e.g. DNA) alphabets, but for these it becomes the fastest algorithm for small m or moderate k/m. If we want to search multiple patterns in a row, the method becomes attractive for large alphabet sizes too. We also consider applying 128-bit vector instructions for bit-parallel computations.
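To make the two baselines named in the abstract concrete, the following is a minimal Python sketch of the classic O(nm) dynamic programming and of the standard column-wise bit-parallel computation attributed to Myers. It does not implement the row-wise tiling proposed in the paper. The function names `sellers_dp` and `myers_bitvector` are illustrative assumptions, and a Python integer masked to m bits stands in for the w-bit machine word, so the sketch corresponds to the single-word case m ≤ w.

```python
def sellers_dp(pattern, text):
    """Classic O(nm) DP. Returns, for each text position j, the minimum
    edit distance between the pattern and any substring ending at j."""
    m = len(pattern)
    col = list(range(m + 1))  # column for the empty text prefix
    out = []
    for c in text:
        prev_diag = col[0]  # D[i-1][j-1]; col[0] stays 0 (free start)
        for i in range(1, m + 1):
            old = col[i]  # D[i][j-1], needed as the next diagonal
            col[i] = min(prev_diag + (pattern[i - 1] != c),  # (mis)match
                         old + 1,                            # skip text char
                         col[i - 1] + 1)                     # skip pattern char
            prev_diag = old
        out.append(col[m])
    return out


def myers_bitvector(pattern, text):
    """Column-wise bit-parallel computation of the same values.
    Bit i-1 of pv / mv encodes D[i][j] - D[i-1][j] == +1 / -1."""
    m = len(pattern)
    mask = (1 << m) - 1       # emulate an m-bit word
    top = 1 << (m - 1)        # bit of the last pattern row
    peq = {}                  # peq[c]: bit i set iff pattern[i] == c
    for i, c in enumerate(pattern):
        peq[c] = peq.get(c, 0) | (1 << i)
    pv, mv, score = mask, 0, m
    out = []
    for c in text:
        eq = peq.get(c, 0)
        xv = eq | mv
        xh = ((((eq & pv) + pv) ^ pv) | eq) & mask
        ph = (mv | ~(xh | pv)) & mask  # horizontal delta +1
        mh = pv & xh                   # horizontal delta -1
        if ph & top:
            score += 1
        elif mh & top:
            score -= 1
        ph = (ph << 1) & mask  # shift in 0: row 0 is all zeros in searching
        mh = (mh << 1) & mask
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv
        out.append(score)
    return out


if __name__ == "__main__":
    print(sellers_dp("abc", "abc"))       # [2, 1, 0]
    print(myers_bitvector("abc", "abc"))  # [2, 1, 0]
```

A text position j is an occurrence with at most k differences when the reported value is at most k. The bit-parallel loop does a constant number of word operations per text character, which is where the $$O(n \left\lceil m/w \right\rceil)$$ bound comes from once patterns longer than w are split across several words.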

## Keywords

Edit Distance, Current Block, Approximate String Match, Computer Word, Large Alphabet
