A Hybrid Approach for Protein Structure Determination Combining Sparse NMR with Evolutionary Coupling Sequence Data
While 3D structure determination of small (<15 kDa) proteins by solution NMR is largely automated and routine, structural analysis of larger proteins remains more challenging. An emerging hybrid strategy for modeling protein structures combines sparse NMR data, which can be obtained for larger proteins, with sequence co-variation data, called evolutionary couplings (ECs), obtained from multiple sequence alignments of protein families. This hybrid “EC-NMR” method can be used to accurately model larger (15–60 kDa) proteins, and to determine structures of smaller (5–15 kDa) proteins more rapidly using only backbone NMR data. The accuracies of the resulting structures, measured against reference structures, are comparable to those obtained with full backbone and sidechain NMR resonance assignments. Requiring that ECs be consistent with NMR data recorded on a specific member of a protein family, under specific conditions, may also allow identification of ECs that reflect alternative allosteric or excited states of the protein structure.
Keywords: Hybrid methods; Protein NMR spectroscopy; Protein families; Multiple sequence alignment; Maximum entropy; Evolutionary couplings; Automated NMR data analysis; AutoStructure/ASDP
This work was supported by National Institutes of Health grants 1R01-GM120574 (to G.T.M.) and 1R01-GM106303 (to C.S. and D.M.). We thank all of the members of the Northeast Structural Genomics Consortium who generated and archived NMR data used in this work, particularly scientists in the laboratories of C. Arrowsmith, M. Kennedy, G.T. Montelione, T. Szyperski, and J. Prestegard.