Abstract
We present a new efficient algorithm for exact matching in encoded DNA sequences and on binary strings. Our algorithm combines a multi-pattern version of the BNDM algorithm with a simplified version of the Commentz-Walter algorithm. We also performed experimental comparisons with the most efficient algorithms presented in the literature. Experimental results show that the newly presented algorithm outperforms existing solutions in most cases.
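As background for the BNDM building block named in the abstract, the following is a minimal sketch of the textbook single-pattern BNDM search over ordinary characters. It is not the algorithm presented in this paper: the multi-pattern extension, the Commentz-Walter component, and the handling of packed DNA or binary encodings are not shown. The function name bndm_search and the bit layout are choices made for this illustration only.

```python
def bndm_search(pattern, text):
    """Return the start positions of all occurrences of pattern in text.

    Illustrative single-pattern BNDM: the window is scanned right to left
    while a bit vector D tracks which pattern factors still match.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # masks[c] has bit (m - 1 - i) set when pattern[i] == c.
    masks = {}
    for i, c in enumerate(pattern):
        masks[c] = masks.get(c, 0) | (1 << (m - 1 - i))

    all_ones = (1 << m) - 1
    top_bit = 1 << (m - 1)
    occurrences = []
    pos = 0
    while pos <= n - m:
        j = m          # number of window characters not yet read
        last = m       # shift to apply once this window is abandoned
        d = all_ones   # every factor is still a candidate
        while d != 0:
            d &= masks.get(text[pos + j - 1], 0)
            j -= 1
            if d & top_bit:
                if j > 0:
                    # The suffix read so far is a prefix of the pattern:
                    # remember it as the next safe window start.
                    last = j
                else:
                    # The whole window matched.
                    occurrences.append(pos)
            d = (d << 1) & all_ones
        pos += last
    return occurrences


print(bndm_search("ACA", "ACACACA"))  # [0, 2, 4]
```

Python integers are arbitrary precision, so this sketch places no limit on the pattern length. In a language with fixed-width words, the same scheme assumes the pattern fits in one machine word.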
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Faro, S., Lecroq, T. (2009). An Efficient Matching Algorithm for Encoded DNA Sequences and Binary Strings. In: Kucherov, G., Ukkonen, E. (eds) Combinatorial Pattern Matching. CPM 2009. Lecture Notes in Computer Science, vol 5577. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02441-2_10
DOI: https://doi.org/10.1007/978-3-642-02441-2_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02440-5
Online ISBN: 978-3-642-02441-2
eBook Packages: Computer Science (R0)