Abstract
Multiple sequence alignment (MSA) is a central task in many areas of modern biology. In this paper, we present a new progressive alignment algorithm for this difficult problem. Given two groups A and B of aligned sequences, the algorithm uses dynamic programming and the sum-of-pairs objective function to determine an optimal alignment C of A and B. The proposed algorithm has a much lower time complexity than a previously published algorithm for the same task [11]. Its performance is extensively assessed on the well-known BAliBASE benchmarks and compared with several state-of-the-art MSA tools.
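To make the group-to-group step concrete, the following is a minimal sketch of aligning two already-aligned groups by dynamic programming under a sum-of-pairs score. It is not the algorithm proposed in the paper: the function names (`pair_score`, `cross_score`, `align_groups`), the match/mismatch/gap values, the linear gap penalty, and the convention that a gap-gap pair scores 0 are all illustrative assumptions. Under those assumptions the pairs inside A and inside B contribute a constant, so maximizing the cross-group score shown here also maximizes the full sum-of-pairs score.

```python
from itertools import product

GAP = "-"


def pair_score(x, y, match=1, mismatch=-1, gap=-2):
    """Score one pair of symbols (illustrative values, not from the paper)."""
    if x == GAP and y == GAP:
        return 0  # assumption: gap-gap pairs are neutral
    if x == GAP or y == GAP:
        return gap
    return match if x == y else mismatch


def cross_score(col_a, col_b):
    """Sum-of-pairs score between one column of A and one column of B."""
    return sum(pair_score(x, y) for x, y in product(col_a, col_b))


def align_groups(A, B):
    """Align two groups (lists of equal-length gapped strings).

    Returns the merged alignment rows (A's rows first) and its cross score.
    """
    n, m = len(A[0]), len(B[0])
    cols_a = [[s[i] for s in A] for i in range(n)]
    cols_b = [[s[j] for s in B] for j in range(m)]
    gap_a = [GAP] * len(A)  # all-gap column inserted into group A
    gap_b = [GAP] * len(B)  # all-gap column inserted into group B

    # score[i][j]: best score for the first i columns of A and first j of B
    score = [[0] * (m + 1) for _ in range(n + 1)]
    move = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + cross_score(cols_a[i - 1], gap_b)
        move[i][0] = "up"
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + cross_score(gap_a, cols_b[j - 1])
        move[0][j] = "left"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            candidates = [
                (score[i - 1][j - 1]
                 + cross_score(cols_a[i - 1], cols_b[j - 1]), "diag"),
                (score[i - 1][j] + cross_score(cols_a[i - 1], gap_b), "up"),
                (score[i][j - 1] + cross_score(gap_a, cols_b[j - 1]), "left"),
            ]
            score[i][j], move[i][j] = max(candidates, key=lambda c: c[0])

    # Trace back from the bottom-right cell to recover the merged columns.
    merged = []
    i, j = n, m
    while i > 0 or j > 0:
        step = move[i][j]
        if step == "diag":
            merged.append(cols_a[i - 1] + cols_b[j - 1])
            i, j = i - 1, j - 1
        elif step == "up":
            merged.append(cols_a[i - 1] + gap_b)
            i -= 1
        else:
            merged.append(gap_a + cols_b[j - 1])
            j -= 1
    merged.reverse()

    rows = ["".join(col[k] for col in merged) for k in range(len(A) + len(B))]
    return rows, score[n][m]
```

For example, `align_groups(["AC-T", "ACGT"], ["ACT"])` returns the rows `["AC-T", "ACGT", "AC-T"]` with a score of 4. This naive version recomputes every cross-group pair in each cell, so it costs on the order of n·m·|A|·|B| operations for groups of lengths n and m; it says nothing about how the paper achieves its lower complexity.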
References
Dayhoff, M.O., Schwartz, R.M., Orcutt, B.C.: A model of evolutionary change in proteins. In: Dayhoff, M.O. (ed.) Atlas of Protein Sequence and Structure. National Biomedical Research Foundation, ch. 22, vol. 5, pp. 345–352 (1978)
Derrien, V., Richer, J-M., Hao, J-K.: Plasma: un nouvel algorithme progressif pour l’alignement multiple de séquences. Actes des Premières Journées Francophones de Programmation par Contraintes (JFPC’05), Lens, France (2005)
Do, C.B., Mahabhashyam, S.P., Brudno, M., Batzoglou, S.: Probabilistic consistency-based multiple sequence alignment. Genome Research 15, 330–340 (2005)
Edgar, R.C.: Muscle: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 32, 1792–1797 (2004)
Feng, D.-F., Doolittle, R.F.: Progressive sequence alignment as a prerequisite to correct phylogenetic trees. Journal of Molecular Evolution 25, 351–360 (1987)
Gotoh, O.: An improved algorithm for matching biological sequences. Journal of Molecular Biology 162, 705–708 (1982)
Gotoh, O.: A weighting system and algorithm for aligning many phylogenetically related sequences. Computer Applications in the Biosciences (CABIOS) 11, 543–551 (1995)
Gotoh, O.: Multiple sequence alignment: algorithms and applications. Adv. Biophys. 36, 159–206 (1999)
Henikoff, S., Henikoff, J.G.: Amino acid substitution matrices from protein blocks. In: Proceedings of the National Academy of Sciences, vol. 89, pp. 10915–10919 (1992)
Carrillo, H., Lipman, D.: The multiple sequence alignment problem in biology. SIAM Journal on Applied Mathematics 48(5), 1073–1082 (1988)
Kececioglu, J., Starrett, D.: Aligning alignments exactly. In: Proceedings of RECOMB 2004, San Diego, March 27-31, 2004, pp. 85–96 (2004)
Katoh, K., Misawa, K., Kuma, K., Miyata, T.: Mafft: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research 30, 3059–3066 (2002)
Lipman, D.J., Altschul, S.F., Kececioglu, J.D.: A tool for multiple sequence alignment. In: Proc. Natl. Acad. Sci., vol. 86, pp. 4412–4415 (1989)
Needleman, S.B., Wunsch, C.D.: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology 48(3), 443–453 (1970)
Notredame, C., Higgins, D., Heringa, J.: T-coffee: A novel method for multiple sequence alignments. Journal of Molecular Biology 302, 205–217 (2000)
Notredame, C., Higgins, D.G.: Saga: Sequence alignment by genetic algorithm. Nucl. Acids Res. 24(8), 1515–1524 (1996)
Notredame, C., Higgins, D.G., Heringa, J.: T-coffee: A novel method for fast and accurate multiple sequence alignment. Journal of Molecular Biology 302, 205–217 (2000)
Notredame, C., Holm, L., Higgins, D.G.: Coffee: A new objective function for multiple sequence alignment. Bioinformatics 14(5), 407–422 (1998)
Saitou, N., Nei, M.: The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425 (1987)
Subramanian, A.R., Weyer-Menkhoff, J., Kaufmann, M., Morgenstern, B.: Dialign-t: An improved algorithm for segment-based multiple sequence alignment. BMC Bioinformatics 6, 66 (2005)
Thompson, J.D., Higgins, D.G., Gibson, T.J.: Clustalw: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucl. Acids Res. 22, 4673–4690 (1994)
Thompson, J.D., Plewniak, F., Poch, O.: Balibase: A benchmark alignments database for the evaluation of multiple sequence alignment programs. Bioinformatics 15, 87–88 (1999)
Thompson, J.D., Plewniak, F., Poch, O.: A comprehensive comparison of multiple sequence alignment programs. Nucleic Acids Research 27, 2682–2690 (1999)
Wang, L., Jiang, T.: On the complexity of multiple sequence alignment. Journal of Computational Biology 1(4), 337–348 (1994)
Waterman, M.S.: Introduction to Computational Biology. Chapman and Hall/CRC, Boca Raton (2000)
Yamada, S., Gotoh, O., Yamana, H.: Improvement in accuracy of multiple sequence alignment using novel group-to-group sequence alignment algorithm with piecewise linear gap cost. BMC Bioinformatics 7(1), 524 (2006)
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Richer, JM., Derrien, V., Hao, JK. (2007). A New Dynamic Programming Algorithm for Multiple Sequence Alignment. In: Dress, A., Xu, Y., Zhu, B. (eds) Combinatorial Optimization and Applications. COCOA 2007. Lecture Notes in Computer Science, vol 4616. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73556-4_8
DOI: https://doi.org/10.1007/978-3-540-73556-4_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73555-7
Online ISBN: 978-3-540-73556-4
eBook Packages: Computer Science, Computer Science (R0)