Abstract
Positions in a protein are thought to coevolve to maintain important structural and functional interactions over evolutionary time. Detecting putative coevolving positions can provide new insights into a protein family, in the same way that knowledge is gained by recognizing evolutionarily conserved characters and characteristics. Putatively coevolving positions can be detected with statistical methods that identify covarying positions. However, positions in protein alignments can covary for many reasons other than coevolution; thus, it is crucial to create high-quality multiple sequence alignments for coevolution inference. Furthermore, it is important to understand common signs and sources of error. When confounding factors are accounted for, coevolution analysis is a rich source of information for protein engineering.
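As an illustration of what a covariation statistic looks like, the following minimal sketch computes the mutual information between two alignment columns. The abstract does not name a specific statistic, so the choice of mutual information, the function names, and the toy alignment are assumptions made for illustration. The sketch applies no correction for phylogeny, entropy, or alignment error, which are the kinds of confounding factors the abstract warns about, and it treats gap characters like any other symbol.

```python
import math
from collections import Counter


def column(alignment, i):
    """Return column i of an alignment given as equal-length strings."""
    return [seq[i] for seq in alignment]


def entropy(symbols):
    """Shannon entropy (in bits) of a list of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def mutual_information(alignment, i, j):
    """Uncorrected mutual information between columns i and j.

    MI(i, j) = H(i) + H(j) - H(i, j), where H(i, j) is the entropy of
    the residue pairs observed in the two columns.
    """
    col_i = column(alignment, i)
    col_j = column(alignment, j)
    pairs = list(zip(col_i, col_j))
    return entropy(col_i) + entropy(col_j) - entropy(pairs)


# Hypothetical toy alignment (not real data): columns 1 and 3 change
# together, column 2 varies independently, column 0 is fully conserved.
toy_alignment = ["AKLE", "AKIE", "ARLD", "ARID"]

mi_covarying = mutual_information(toy_alignment, 1, 3)
mi_independent = mutual_information(toy_alignment, 1, 2)
mi_conserved = mutual_information(toy_alignment, 0, 1)
```

In this toy case the covarying pair scores higher than the independent pair, and the conserved column carries no covariation signal at all. On real alignments, a high raw score may still reflect shared ancestry or misalignment rather than coevolution.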
Copyright information
© 2014 Springer Science+Business Media New York
About this protocol
Cite this protocol
Dickson, R.J., Gloor, G.B. (2014). Bioinformatics Identification of Coevolving Residues. In: Edgell, D. (eds) Homing Endonucleases. Methods in Molecular Biology, vol 1123. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-62703-968-0_15
DOI: https://doi.org/10.1007/978-1-62703-968-0_15
Published:
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-62703-967-3
Online ISBN: 978-1-62703-968-0
eBook Packages: Springer Protocols