Abstract
How well an MSA program is suited to aligning a given collection of sequences depends strongly on how it works. This chapter gives a rough overview of the methods of multiple sequence alignment. Sect. 2.2 first looks at how pairs of sequences are aligned to each other. Sect. 2.3 then examines how this approach can be extended to data sets with more than two sequences, which problems arise in doing so, and how they are circumvented.
Copyright information
© 2019 Springer-Verlag GmbH Deutschland, part of Springer Nature
About this chapter
Cite this chapter
Sperlea, T. (2019). Wie funktionieren MSA-Programme?. In: Multiple Sequenzalignments. Springer Spektrum, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-58811-6_2
DOI: https://doi.org/10.1007/978-3-662-58811-6_2
Publisher Name: Springer Spektrum, Berlin, Heidelberg
Print ISBN: 978-3-662-58810-9
Online ISBN: 978-3-662-58811-6
eBook Packages: Life Science and Basic Disciplines (German Language)