Advertisement

Row-wise Tiling for the Myers’ Bit-Parallel Approximate String Matching Algorithm

  • Kimmo Fredriksson
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2857)

Abstract

Given a text T[1..n] and a pattern P[1..m] the classic dynamic programming algorithm for computing the edit distance between P and every location of T runs in time O(nm). The bit-parallel computation of the dynamic programming matrix [6] runs in time \(O(n \left\lceil m/w \right\rceil)\), where w is the number of bits in computer word. We present a new method that rearranges the bit-parallel computations, achieving time \(O(\left\lceil n/w \right\rceil (m+\sigma\log_2(\sigma))+n)\), where σ is the size of the alphabet. The algorithm is then modified to solve the k differences problem. The expected running time is \(O(\left\lceil n/w \right\rceil (L(k)+\sigma\log_2(\sigma))+R)\), where L(k) depends on k, and R is the number of occurrences. The space usage is O(σ + m). It is in practice much faster than the existing \(O(n \left\lceil k/w \right\rceil)\) algorithm [6]. The new method is applicable only for small (e.g. dna) alphabets, but this becomes the fastest algorithm for small m, or moderate k/m. If we want to search multiple patterns in a row, the method becomes attractive for large alphabet sizes too. We also consider applying 128-bit vector instructions for bit-parallel computations.

Keywords

Edit Distance Current Block Approximate String Match Computer Word Large Alphabet 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Baeza-Yates, R.A., Navarro, G.: Faster approximate string matching. Algorithmica 23(2), 127–158 (1999)zbMATHCrossRefMathSciNetGoogle Scholar
  2. 2.
    Chang, W.I., Lampe, J.: Theoretical and empirical comparisons of approximate string matching algorithms. In: Bezem, M., Groote, J.F. (eds.) TLCA 1993. LNCS, vol. 664, pp. 175–184. Springer, Heidelberg (1993)Google Scholar
  3. 3.
    Fredriksson, K., Navarro, G.: Average-optimal multiple approximate string matching. In: Baeza-Yates, R., Chávez, E., Crochemore, M. (eds.) CPM 2003. LNCS, vol. 2676, pp. 109–128. Springer, Heidelberg (2003)CrossRefGoogle Scholar
  4. 4.
    Hyyrö, H.: Extending and explaining the bit-parallel approximate string matching algorithm of Myers. Technical report A2001-10, Department of Computer and Information Sciences, University of Tampere (2001)Google Scholar
  5. 5.
    Hyyrö, H., Navarro, G.: Faster bit-parallel approximate string matching. In: Apostolico, A., Takeda, M. (eds.) CPM 2002. LNCS, vol. 2373, pp. 203–224. Springer, Heidelberg (2002)CrossRefGoogle Scholar
  6. 6.
    Myers, G.: A fast bit-vector algorithm for approximate string matching based on dynamic programming. J. Assoc. Comput. Mach. 46(3), 395–415 (1999)zbMATHGoogle Scholar
  7. 7.
    Navarro, G.: A guided tour to approximate string matching. ACM Computing Surveys 33(1), 31–88 (2001)CrossRefGoogle Scholar
  8. 8.
    Navarro, G.: Indexing text using the ziv-lempel trie. In: Laender, A.H.F., Oliveira, A.L. (eds.) SPIRE 2002. LNCS, vol. 2476, pp. 325–336. Springer, Heidelberg (2002)CrossRefGoogle Scholar
  9. 9.
    Navarro, G., Baeza-Yates, R.: Very fast and simple approximate string matching. Information Processing Letters 72, 65–70 (1999)CrossRefMathSciNetGoogle Scholar
  10. 10.
    Navarro, G., Raffinot, M.: Fast and flexible string matching by combining bit-parallelism and suffix automata. ACM Journal of Experimental Algorithmics (JEA) 5(4) (2000), http://www.jea.acm.org/2000/NavarroString
  11. 11.
    Sellers, P.H.: The theory and computation of evolutionary distances: Pattern recognition. J. Algorithms 1(4), 359–373 (1980)zbMATHCrossRefMathSciNetGoogle Scholar
  12. 12.
    Ukkonen, E.: Algorithms for approximate string matching. Inf. Control 64(1-3), 100–118 (1985)zbMATHCrossRefMathSciNetGoogle Scholar
  13. 13.
    Wright, H.: Approximate string matching using within-word parallelism. Softw. Pract. Exp. 24(4), 337–362 (1994)zbMATHCrossRefGoogle Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Kimmo Fredriksson
    • 1
  1. 1.Department of Computer ScienceUniversity of JoensuuJoensuu

Personalised recommendations