Row-wise Tiling for the Myers’ Bit-Parallel Approximate String Matching Algorithm

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2857)

Abstract

Given a text T[1..n] and a pattern P[1..m], the classic dynamic programming algorithm for computing the edit distance between P and every location of T runs in O(nm) time. The bit-parallel computation of the dynamic programming matrix [6] runs in \(O(n \left\lceil m/w \right\rceil)\) time, where w is the number of bits in a computer word. We present a new method that rearranges the bit-parallel computations, achieving \(O(\left\lceil n/w \right\rceil (m+\sigma\log_2(\sigma))+n)\) time, where σ is the size of the alphabet. The algorithm is then modified to solve the k differences problem. Its expected running time is \(O(\left\lceil n/w \right\rceil (L(k)+\sigma\log_2(\sigma))+R)\), where L(k) depends on k and R is the number of occurrences. The space usage is O(σ + m). In practice it is much faster than the existing \(O(n \left\lceil k/w \right\rceil)\) algorithm [6]. The new method is applicable only to small (e.g. DNA) alphabets, but there it becomes the fastest algorithm for small m or moderate k/m. If multiple patterns are searched in a row, the method becomes attractive for large alphabets too. We also consider applying 128-bit vector instructions to the bit-parallel computations.
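For orientation, the two baselines named in the abstract can be sketched as follows. This is a minimal illustration, not the paper's row-wise tiling method: dp_search is the classic O(nm) column-by-column dynamic programming, and myers_search is the column-wise bit-vector recurrence in the commonly published form of [6]. The function names are hypothetical, and the sketch assumes m ≥ 1 and uses Python's arbitrary-precision integers in place of a w-bit machine word, so the ⌈m/w⌉ blocking is not modelled.

```python
def dp_search(pattern, text):
    """Classic O(nm) DP: edit distance of the pattern against every text end position."""
    m = len(pattern)
    col = list(range(m + 1))          # column before any text: D[i][0] = i
    out = []
    for c in text:
        prev_diag = col[0]            # D[i-1][j-1] for i = 1
        col[0] = 0                    # a match may start at any text position
        for i in range(1, m + 1):
            old = col[i]              # D[i][j-1], needed as the next diagonal
            col[i] = min(prev_diag + (pattern[i - 1] != c),  # match or substitute
                         old + 1,                             # extra text character
                         col[i - 1] + 1)                      # skipped pattern character
            prev_diag = old
        out.append(col[m])
    return out


def myers_search(pattern, text):
    """Bit-vector version of the same computation (assumes len(pattern) >= 1)."""
    m = len(pattern)
    mask = (1 << m) - 1               # emulates an m-bit machine word
    top = 1 << (m - 1)                # bit of the last pattern row
    peq = {}                          # peq[ch]: bit i set iff pattern[i] == ch
    for i, ch in enumerate(pattern):
        peq[ch] = peq.get(ch, 0) | (1 << i)
    pv, mv = mask, 0                  # vertical deltas: all +1 in the first column
    score = m                         # D[m][0]
    out = []
    for c in text:
        eq = peq.get(c, 0)
        xv = eq | mv
        xh = (((eq & pv) + pv) ^ pv) | eq
        ph = (mv | ~(xh | pv)) & mask # horizontal delta +1
        mh = pv & xh                  # horizontal delta -1
        if ph & top:
            score += 1
        elif mh & top:
            score -= 1
        ph = (ph << 1) & mask         # no carry-in: row 0 is all zeros when searching
        mh = (mh << 1) & mask
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv
        out.append(score)
    return out
```

Both functions return one value per text position, so a k differences search under this sketch is simply the set of positions j where the returned value at j is at most k.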

References

  1. Baeza-Yates, R.A., Navarro, G.: Faster approximate string matching. Algorithmica 23(2), 127–158 (1999)

  2. Chang, W.I., Lampe, J.: Theoretical and empirical comparisons of approximate string matching algorithms. In: Bezem, M., Groote, J.F. (eds.) TLCA 1993. LNCS, vol. 664, pp. 175–184. Springer, Heidelberg (1993)

  3. Fredriksson, K., Navarro, G.: Average-optimal multiple approximate string matching. In: Baeza-Yates, R., Chávez, E., Crochemore, M. (eds.) CPM 2003. LNCS, vol. 2676, pp. 109–128. Springer, Heidelberg (2003)

  4. Hyyrö, H.: Extending and explaining the bit-parallel approximate string matching algorithm of Myers. Technical report A2001-10, Department of Computer and Information Sciences, University of Tampere (2001)

  5. Hyyrö, H., Navarro, G.: Faster bit-parallel approximate string matching. In: Apostolico, A., Takeda, M. (eds.) CPM 2002. LNCS, vol. 2373, pp. 203–224. Springer, Heidelberg (2002)

  6. Myers, G.: A fast bit-vector algorithm for approximate string matching based on dynamic programming. J. Assoc. Comput. Mach. 46(3), 395–415 (1999)

  7. Navarro, G.: A guided tour to approximate string matching. ACM Computing Surveys 33(1), 31–88 (2001)

  8. Navarro, G.: Indexing text using the Ziv-Lempel trie. In: Laender, A.H.F., Oliveira, A.L. (eds.) SPIRE 2002. LNCS, vol. 2476, pp. 325–336. Springer, Heidelberg (2002)

  9. Navarro, G., Baeza-Yates, R.: Very fast and simple approximate string matching. Information Processing Letters 72, 65–70 (1999)

  10. Navarro, G., Raffinot, M.: Fast and flexible string matching by combining bit-parallelism and suffix automata. ACM Journal of Experimental Algorithmics (JEA) 5(4) (2000), http://www.jea.acm.org/2000/NavarroString

  11. Sellers, P.H.: The theory and computation of evolutionary distances: Pattern recognition. J. Algorithms 1(4), 359–373 (1980)

  12. Ukkonen, E.: Algorithms for approximate string matching. Inf. Control 64(1-3), 100–118 (1985)

  13. Wright, H.: Approximate string matching using within-word parallelism. Softw. Pract. Exp. 24(4), 337–362 (1994)

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fredriksson, K. (2003). Row-wise Tiling for the Myers’ Bit-Parallel Approximate String Matching Algorithm. In: Nascimento, M.A., de Moura, E.S., Oliveira, A.L. (eds) String Processing and Information Retrieval. SPIRE 2003. Lecture Notes in Computer Science, vol 2857. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39984-1_6

  • DOI: https://doi.org/10.1007/978-3-540-39984-1_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20177-9

  • Online ISBN: 978-3-540-39984-1

  • eBook Packages: Springer Book Archive
