Correlated positions in protein evolution and engineering

Biocatalysis - Review


Statistical analysis of a protein multiple sequence alignment can reveal groups of positions that undergo interdependent mutations throughout evolution. At these so-called correlated positions, only certain combinations of amino acids appear to be viable for maintaining proper folding, stability, catalytic activity or specificity. It is therefore often speculated that they could guide semi-rational protein engineering. Because they are a fingerprint of protein evolution, their analysis may provide valuable insight into a protein’s structure or function, and they may also be suitable target positions for mutagenesis. Unfortunately, little is currently known about the properties of these correlation networks and how they should be used in practice. This review summarises recent findings, opportunities and pitfalls of the concept.
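As a rough illustration of what such a statistical analysis can look like, the sketch below scores pairs of alignment columns with mutual information. This is only one of several possible correlation measures and is not necessarily the one favoured in the review. The toy alignment, the function names and the lack of any handling for gaps, sequence weighting or phylogenetic bias are all simplifying assumptions made for this example.

```python
from collections import Counter
from math import log2


def column(alignment, index):
    """Return the residues found at one alignment position."""
    return [sequence[index] for sequence in alignment]


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    Assumes equal-length sequences; gaps are treated like any other symbol.
    """
    n = len(alignment)
    col_i = column(alignment, i)
    col_j = column(alignment, j)
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))

    mi = 0.0
    for (a, b), n_ab in count_ij.items():
        p_ab = n_ab / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment (not real data): position 0 is fully conserved,
# positions 1 and 3 always change together, position 2 varies independently.
toy_alignment = [
    "AKLE",
    "AKLE",
    "ARLD",
    "ARLD",
    "AKVE",
    "ARVD",
]

if __name__ == "__main__":
    for i, j in [(1, 3), (1, 2), (0, 1)]:
        score = mutual_information(toy_alignment, i, j)
        print(f"positions {i} and {j}: MI = {score:.3f} bits")
```

In this toy case only the pair (1, 3) receives a high score, because K always pairs with E and R with D. A fully conserved position scores zero against every other position, which illustrates why conservation and correlation carry different information.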


Correlated positions, Coevolution, Protein engineering, Correlated mutation analysis



The authors wish to thank the Fund for Scientific Research-Flanders (FWO-Vlaanderen) for financial support (doctoral scholarship for JF).



Copyright information

© Society for Industrial Microbiology and Biotechnology 2016

Authors and Affiliations

  1. Department of Biochemical and Microbial Technology, Centre for Industrial Biotechnology and Biocatalysis, Ghent University, Ghent, Belgium
