Abstract
We present a new, efficient algorithm for the 'threshold all vs. all' problem: given two strings of lengths N and M, find all maximal approximate matches of length at least S with at most K differences. The algorithm is based on a novel graph model and solves the problem in O(NMK²) time.
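To make the match criterion concrete, the following minimal sketch checks whether one given pair of substrings qualifies as an approximate match. It is not the graph-based algorithm of the paper, and it does not test maximality. It assumes that "differences" means unit-cost edit operations (insertion, deletion, substitution) and that the length bound S applies to both substrings; the names `edit_distance` and `is_threshold_match` are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance with unit costs."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def is_threshold_match(x: str, y: str, s: int, k: int) -> bool:
    """Assumed criterion: both substrings have length >= s and differ by <= k edits."""
    return min(len(x), len(y)) >= s and edit_distance(x, y) <= k
```

For instance, under these assumptions `is_threshold_match("ACGTACGT", "ACGAACGT", 8, 1)` holds because the two strings differ by a single substitution. Applying such a check to every pair of substrings would be far slower than the O(NMK²) bound stated above, which is the motivation for a dedicated algorithm.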
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Barsky, M., Stege, U., Thomo, A., Upton, C. (2006). A New Algorithm for Fast All-Against-All Substring Matching. In: Crestani, F., Ferragina, P., Sanderson, M. (eds) String Processing and Information Retrieval. SPIRE 2006. Lecture Notes in Computer Science, vol 4209. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11880561_31
DOI: https://doi.org/10.1007/11880561_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-45774-9
Online ISBN: 978-3-540-45775-6
eBook Packages: Computer Science, Computer Science (R0)