
How Do MSA Programs Work?

Chapter in: Multiple Sequenzalignments

Summary

How well an MSA program is suited to aligning a particular collection of sequences depends strongly on how it works. This chapter gives a rough overview of the methods of multiple sequence alignment. Section 2.2 first looks at how pairs of sequences are aligned to each other. Section 2.3 then examines how this approach can be extended to data sets with more than two sequences, which problems arise in doing so, and how they are circumvented.
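As a rough illustration of the pairwise step mentioned above, the following sketch computes a global alignment score with dynamic programming in the style of Needleman and Wunsch. The function name `global_alignment_score` and the scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for illustration and are not taken from the chapter; real programs typically use substitution matrices and more elaborate gap models.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1,
                           mismatch: int = -1,
                           gap: int = -1) -> int:
    """Return the optimal global alignment score of sequences a and b.

    Scoring values are illustrative assumptions, not values from the chapter.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty sequence costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Align a[i-1] with a gap in b.
            up = score[i - 1][j] + gap
            # Align b[j-1] with a gap in a.
            left = score[i][j - 1] + gap
            score[i][j] = max(diagonal, up, left)

    return score[-1][-1]


if __name__ == "__main__":
    # Three matches and one gap under the assumed scoring: prints 2
    print(global_alignment_score("ACGT", "AGT"))
```

The table has one cell per pair of prefixes, so time and memory grow with the product of the two sequence lengths. Applying the same idea directly to many sequences at once would add one table dimension per sequence, which hints at why the extension to more than two sequences needs workarounds.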




Copyright information

© 2019 Springer-Verlag GmbH Deutschland, part of Springer Nature

Cite this chapter

Sperlea, T. (2019). Wie funktionieren MSA-Programme? In: Multiple Sequenzalignments. Springer Spektrum, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-58811-6_2
