Using Evolutionary Information to Find Specificity-Determining and Co-evolving Residues

  • Grigory Kolesov
  • Leonid A. Mirny
Part of the Methods in Molecular Biology book series (MIMB, volume 541)


Intricate networks of protein interactions rely on the ability of a protein to recognize its targets: other proteins, ligands, and sites on DNA and RNA. It has been suggested that a protein recognizes other molecules through a small set of specificity-determining residues (SDRs). How can one find these residues in proteins and distinguish them from other functionally important amino acids? A number of bioinformatics methods for predicting SDRs have been developed in recent years. These methods use genomic information and multiple sequence alignments to identify positions that exhibit a specific pattern of conservation and variability. The challenge is to distinguish the evolutionary pattern of SDRs from that of the active-site residues and of the residues responsible for forming the protein’s structure. The phylogenetic history of a protein family makes such analysis particularly hard. Here we present two methods for finding the SDRs and the co-evolving residues (CERs) in proteins. We use a Monte Carlo approach for statistical inference, which allows us to reveal specific evolutionary patterns of SDRs and CERs. We apply these methods to study specific recognition in the bacterial two-component system and in the class Ia aminoacyl-tRNA synthetases. Our results agree well with structural information and experimental analyses of these systems, and they point to complex and distinct patterns characteristic of the evolution of specificity in these systems.
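As a rough illustration of the general idea, and not the chapter's own procedure, the sketch below scores each alignment column by its mutual information with a grouping of the sequences into specificity classes, then estimates significance by random sampling. Everything in it is an assumption made for illustration: the function names, the toy alignment, the group labels, and above all the null model, which simply shuffles the group labels. Such a shuffle ignores the phylogenetic history of the family, which the abstract identifies as the hard part, so it should be read only as a naive baseline.

```python
import math
import random
from collections import Counter


def mutual_information(column, labels):
    """Mutual information (bits) between residues in a column and group labels."""
    n = len(column)
    joint = Counter(zip(column, labels))
    res_counts = Counter(column)
    lab_counts = Counter(labels)
    mi = 0.0
    for (res, lab), c in joint.items():
        # p(res, lab) * log2( p(res, lab) / (p(res) * p(lab)) )
        mi += (c / n) * math.log2(c * n / (res_counts[res] * lab_counts[lab]))
    return mi


def shuffle_p_value(column, labels, n_shuffles=1000, seed=0):
    """Observed MI and the fraction of label shuffles with MI at least as large.

    Assumption: a plain label shuffle is used as the null model. It does not
    account for phylogeny and is not the authors' Monte Carlo scheme.
    """
    rng = random.Random(seed)
    observed = mutual_information(column, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if mutual_information(column, shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (n_shuffles + 1)


# Hypothetical toy data: 8 aligned sequences, 3 columns, 2 specificity groups.
alignment = ["ARK", "ARL", "ARK", "ARM", "ADK", "ADL", "ADM", "ADK"]
groups = ["x", "x", "x", "x", "y", "y", "y", "y"]

for pos in range(len(alignment[0])):
    col = [seq[pos] for seq in alignment]
    mi, p = shuffle_p_value(col, groups)
    print(f"column {pos}: MI = {mi:.3f} bits, shuffle p = {p:.3f}")
```

In this toy alignment the first column is fully conserved and the third varies independently of the groups, so only the second column, whose residue tracks the group label, carries information about specificity. A conserved column scoring zero reflects the distinction the abstract draws between SDRs and residues conserved for catalysis or structure.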

Key words

Specificity-determining residues, co-evolving residues, correlated mutations, mutual information, Monte Carlo, protein evolution, two-component system, aminoacyl-tRNA synthetase



Copyright information

© Humana Press, a part of Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Grigory Kolesov (1)
  • Leonid A. Mirny (1)
  1. Harvard-MIT Division of Health Sciences and Technology, Massachusetts Institute of Technology, Cambridge, USA
