Abstract
The multiple sequence alignment problem is one of the most common tasks in the analysis of sequential data, especially in bioinformatics. In this paper, we propose using a genetic algorithm to compute a multiple sequence alignment by optimizing a simple scoring function. Although the idea of using genetic algorithms is not new, the presented approach differs in its representation of the multiple alignment and in the simplicity of its genetic operators. The results obtained so far are reported and discussed in this paper.
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Botta, M., Negro, G. (2010). Multiple Sequence Alignment with Genetic Algorithms. In: Masulli, F., Peterson, L.E., Tagliaferri, R. (eds) Computational Intelligence Methods for Bioinformatics and Biostatistics. CIBB 2009. Lecture Notes in Computer Science, vol 6160. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14571-1_15
DOI: https://doi.org/10.1007/978-3-642-14571-1_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14570-4
Online ISBN: 978-3-642-14571-1
eBook Packages: Computer Science (R0)