Abstract
We consider the local alignment problem for sequences that may contain masked regions. The bases in masked regions are either unspecified or unknown, and they are denoted by N. We present an efficient algorithm that finds an optimal local alignment by skipping such masked regions. Our algorithm works for both the affine gap penalty model and the linear gap penalty model. Its time complexity is O((n-T)(m-S)+vm+wn), where n and m are the lengths of the given sequences a and b, T and S are the numbers of N bases in a and b, and v and w are the numbers of masked regions in a and b, respectively.
This work was supported by the MOST grant M1-0309-06-0003.
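To make the quantities in the stated bound concrete, the following minimal sketch counts them for two example sequences and evaluates the expression (n-T)(m-S)+vm+wn alongside the nm cells of a full dynamic programming table. It is not the paper's alignment algorithm. The function names `mask_stats` and `cost_terms`, the example sequences, and the assumption that a masked region is a maximal run of consecutive N characters are all illustrative choices, not taken from the paper.

```python
import re


def mask_stats(seq):
    """Return (length, number of N bases, number of masked regions).

    Assumption: a masked region is a maximal run of consecutive 'N' characters.
    """
    regions = re.findall(r"N+", seq)
    return len(seq), sum(len(r) for r in regions), len(regions)


def cost_terms(a, b):
    """Evaluate (n-T)(m-S) + v*m + w*n and n*m for sequences a and b.

    These are operation-count expressions only; constant factors are ignored.
    """
    n, T, v = mask_stats(a)
    m, S, w = mask_stats(b)
    masked_aware = (n - T) * (m - S) + v * m + w * n
    full_table = n * m
    return masked_aware, full_table


if __name__ == "__main__":
    a = "ACGTNNNNACGT"  # n = 12, T = 4, v = 1
    b = "ACNNGTNNNA"    # m = 10, S = 5, w = 2
    print(mask_stats(a), mask_stats(b))
    print(cost_terms(a, b))
```

For these toy inputs the expression evaluates to 74 against 120 cells for the full table. The gap between the two grows with the fraction of masked bases, as long as the number of masked regions stays small.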
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, J.W., Park, K. (2004). An Efficient Local Alignment Algorithm for Masked Sequences. In: Chwa, KY., Munro, J.I.J. (eds) Computing and Combinatorics. COCOON 2004. Lecture Notes in Computer Science, vol 3106. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27798-9_47
DOI: https://doi.org/10.1007/978-3-540-27798-9_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22856-1
Online ISBN: 978-3-540-27798-9
eBook Packages: Springer Book Archive