A Parallel Algorithm for Multiple Biological Sequence Alignment

  • Conference paper
Information Processing in Cells and Tissues (IPCAT 2012)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7223)

Abstract

Finding a multiple sequence alignment (MSA) is a well-known problem in bioinformatics that consists of aligning three or more biological sequences. In this paper, we propose a parallel iterative algorithm for the global alignment of multiple biological sequences. In this algorithm, a number of processes work independently and concurrently, each searching for the best MSA of a set of sequences. The algorithm uses a Longest Common Subsequence (LCS) technique to generate an initial MSA. An iterative process then improves the MSA by applying a number of operators designed to produce more accurate alignments. Simulations were run using sequences from the UniProtKB protein database. A preliminary performance analysis and a comparison with several common MSA methods show promising results. The implementation was developed on a cluster platform using the standard Message Passing Interface (MPI) library.
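
To make the LCS step concrete, the following is a minimal Python sketch of the standard dynamic-programming LCS for two sequences, followed by a simple way to turn the LCS matches into a gapped pairwise alignment. It is an illustration only: the abstract does not say how the authors extend LCS to three or more sequences, so the pairwise setting, the function names `lcs_pairs`, `lcs`, and `align_by_lcs`, and the gap-padding rule between matches are all assumptions, not the paper's method.

```python
from typing import List, Tuple


def lcs_pairs(a: str, b: str) -> List[Tuple[int, int]]:
    """Return index pairs (i, j), in increasing order, of one LCS of a and b."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of the prefixes a[:i] and b[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # Trace back from the bottom-right corner to recover the matched positions.
    pairs = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs


def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b as a string."""
    return "".join(a[i] for i, _ in lcs_pairs(a, b))


def align_by_lcs(a: str, b: str, gap: str = "-") -> Tuple[str, str]:
    """Build a gapped pairwise alignment that places LCS matches in shared columns.

    Assumption for illustration: residues between two consecutive matches are
    left-justified and the shorter side is padded with gaps.
    """
    row_a, row_b = [], []
    prev_a = prev_b = 0
    # A sentinel pair past the end flushes the trailing unmatched segments.
    for i, j in lcs_pairs(a, b) + [(len(a), len(b))]:
        seg_a, seg_b = a[prev_a:i], b[prev_b:j]
        width = max(len(seg_a), len(seg_b))
        row_a.append(seg_a.ljust(width, gap))
        row_b.append(seg_b.ljust(width, gap))
        if i < len(a):  # a real match, not the sentinel
            row_a.append(a[i])
            row_b.append(b[j])
        prev_a, prev_b = i + 1, j + 1
    return "".join(row_a), "".join(row_b)
```

For example, `lcs("AGGTAB", "GXTXAYB")` returns `"GTAB"`, and `align_by_lcs` on the same inputs yields two rows of equal length that reduce to the original sequences once the gaps are removed. An alignment of this kind could serve as a starting point that later refinement steps improve.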

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Andalon-Garcia, I.R., Chavoya, A., Meda-Campaña, M.E. (2012). A Parallel Algorithm for Multiple Biological Sequence Alignment. In: Lones, M.A., Smith, S.L., Teichmann, S., Naef, F., Walker, J.A., Trefzer, M.A. (eds) Information Processing in Cells and Tissues. IPCAT 2012. Lecture Notes in Computer Science, vol 7223. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28792-3_31

  • DOI: https://doi.org/10.1007/978-3-642-28792-3_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28791-6

  • Online ISBN: 978-3-642-28792-3

  • eBook Packages: Computer Science, Computer Science (R0)
