Multiple Protein Sequence Alignment with MSAProbs

  • Protocol
Multiple Sequence Alignment Methods

Part of the book series: Methods in Molecular Biology (MIMB, volume 1079)

Abstract

Multiple sequence alignment (MSA) underpins many bioinformatics studies that analyze functional, structural, and evolutionary relationships between sequences. Because the exact approach to producing optimal multiple alignments has exponential computational complexity, most state-of-the-art MSA algorithms are built on the progressive alignment heuristic. In this chapter, we outline MSAProbs, a parallelized MSA algorithm for protein sequences based on progressive alignment. To achieve high alignment accuracy, the algorithm combines a pair hidden Markov model with a partition function to calculate posterior probabilities. We also provide practical advice on using the algorithm.
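
As a rough illustration of the posterior-probability idea mentioned in the abstract, the sketch below computes partition-function match posteriors for two short sequences and then merges two posterior matrices. It is not the MSAProbs implementation: the function names, the toy match/mismatch/gap scores, the linear gap model, the temperature, and the root-mean-square merge rule are all assumptions chosen for brevity, and the pair hidden Markov model that would supply the second matrix is not shown.

```python
import math


def partition_posteriors(x, y, match=2.0, mismatch=-1.0, gap=-2.0, temperature=1.0):
    """Posterior probability that x[i] is aligned to y[j] under a
    Boltzmann-weighted ensemble of global alignments.

    Toy scoring scheme (assumption): identity match/mismatch scores and a
    linear gap penalty, rather than a substitution matrix with affine gaps.
    """
    n, m = len(x), len(y)
    g = math.exp(gap / temperature)  # Boltzmann weight of one gap column

    def w(a, b):
        # Boltzmann weight of aligning residue a with residue b
        return math.exp((match if a == b else mismatch) / temperature)

    # fwd[i][j]: total weight of all alignments of x[:i] with y[:j]
    fwd = [[0.0] * (m + 1) for _ in range(n + 1)]
    fwd[0][0] = 1.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0:
                fwd[i][j] += fwd[i - 1][j] * g
            if j > 0:
                fwd[i][j] += fwd[i][j - 1] * g
            if i > 0 and j > 0:
                fwd[i][j] += fwd[i - 1][j - 1] * w(x[i - 1], y[j - 1])

    # bwd[i][j]: total weight of all alignments of x[i:] with y[j:]
    bwd = [[0.0] * (m + 1) for _ in range(n + 1)]
    bwd[n][m] = 1.0
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i < n:
                bwd[i][j] += bwd[i + 1][j] * g
            if j < m:
                bwd[i][j] += bwd[i][j + 1] * g
            if i < n and j < m:
                bwd[i][j] += bwd[i + 1][j + 1] * w(x[i], y[j])

    z = fwd[n][m]  # partition function over all global alignments
    return [
        [fwd[i][j] * w(x[i], y[j]) * bwd[i + 1][j + 1] / z for j in range(m)]
        for i in range(n)
    ]


def combine_rms(p_a, p_b):
    """Merge two posterior matrices element-wise by root mean square
    (assumed merge rule for this sketch)."""
    return [
        [math.sqrt((a * a + b * b) / 2.0) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(p_a, p_b)
    ]


if __name__ == "__main__":
    post = partition_posteriors("ACD", "ACD")
    for row in post:
        print(" ".join(f"{p:.3f}" for p in row))
```

In a progressive aligner, a merged matrix of this kind would typically feed the pairwise distance estimates and the scoring of profile-profile alignments; those steps are omitted here.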

Copyright information

© 2014 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Liu, Y., Schmidt, B. (2014). Multiple Protein Sequence Alignment with MSAProbs. In: Russell, D. (ed) Multiple Sequence Alignment Methods. Methods in Molecular Biology, vol 1079. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-646-7_14

  • DOI: https://doi.org/10.1007/978-1-62703-646-7_14

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-62703-645-0

  • Online ISBN: 978-1-62703-646-7

  • eBook Packages: Springer Protocols
