Abstract
The function and dynamics of many proteins are best understood not from a single structure but from an ensemble. A high-quality ensemble is necessary for accurately delineating protein dynamics. However, conformations in an ensemble are generally given equal weights. Few attempts have been made to assign relative populations to the conformations, mainly because suitable experimental data have been lacking. Here we propose a method for assigning relative populations to ensembles using experimental residual dipolar couplings (RDCs) as constraints, and show that relative populations can significantly enhance an ensemble’s ability to represent the native states and dynamics. The method works by identifying conformation states within an ensemble and assigning appropriate relative populations to them. Each of these conformation states is represented by a sub-ensemble consisting of a subset of the conformations. Application to the ubiquitin X-ray ensemble clearly identifies two key conformation states, with relative populations in excellent agreement with previous work. We then apply the method to a reprotonated ERNST ensemble that is enhanced with a switched conformation, and show that, as a result of population reweighting, not only is the reproduction of RDCs significantly improved, but common conformational features (particularly the distributions of the dihedral angles \( \phi_{53} \) and \( \psi_{52} \)) also emerge for both the X-ray ensemble and the reprotonated ERNST ensemble.
Acknowledgments
Funding from the National Science Foundation (CAREER award, CCF-0953517) is gratefully acknowledged. The authors would also like to thank the two anonymous reviewers for their insightful comments.
Appendices
Appendix 1: Calculation of RDCs
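Given a 3D structure of a protein, the RDC \( D_{ij} \) can be expressed in the molecular frame. First, with \( \delta_{kl} \) the Kronecker delta and the angle brackets denoting an average over molecular orientations, the elements of the Saupe matrix are defined as:
\[ S_{kl} = \left\langle \frac{3\cos\beta_{k}\cos\beta_{l} - \delta_{kl}}{2} \right\rangle, \qquad k, l \in \{x, y, z\} \]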
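where \( \beta_{l} \) denotes the orientation of the \( l \)-th molecular axis with respect to the external magnetic field. The RDC \( D_{ij} \) can be reformulated in the molecular frame as:
\[ D_{ij} = \frac{-\mu_{0} h \gamma_{i} \gamma_{j}}{(2\pi r)^{3}} \sum_{k,l \in \{x,y,z\}} S_{kl}\, \alpha_{k} \alpha_{l} \]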
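where \( \alpha_{x} \), \( \alpha_{y} \), and \( \alpha_{z} \) are the cosines of the angles between the bond vector of the two nuclei and the \( x \), \( y \), and \( z \) axes of the molecular frame. Let \( \alpha_{xk} \), \( \alpha_{yk} \), and \( \alpha_{zk} \) denote the values of \( \alpha_{x} \), \( \alpha_{y} \), and \( \alpha_{z} \) for the \( k \)-th bond vector. Because the Saupe matrix is symmetric and traceless, only five of its elements are independent; taking these to be \( S_{yy} \), \( S_{zz} \), \( S_{xy} \), \( S_{xz} \), and \( S_{yz} \) (one common choice), we have the following formula when all the bond vectors are considered:
\[ \begin{pmatrix} D_{\text{exp},1} \\ \vdots \\ D_{\text{exp},N} \end{pmatrix} = \frac{-\mu_{0} h \gamma_{i} \gamma_{j}}{(2\pi r)^{3}} \begin{pmatrix} \alpha_{y1}^{2} - \alpha_{x1}^{2} & \alpha_{z1}^{2} - \alpha_{x1}^{2} & 2\alpha_{x1}\alpha_{y1} & 2\alpha_{x1}\alpha_{z1} & 2\alpha_{y1}\alpha_{z1} \\ \vdots & \vdots & \vdots & \vdots & \vdots \\ \alpha_{yN}^{2} - \alpha_{xN}^{2} & \alpha_{zN}^{2} - \alpha_{xN}^{2} & 2\alpha_{xN}\alpha_{yN} & 2\alpha_{xN}\alpha_{zN} & 2\alpha_{yN}\alpha_{zN} \end{pmatrix} \begin{pmatrix} S_{yy} \\ S_{zz} \\ S_{xy} \\ S_{xz} \\ S_{yz} \end{pmatrix} \qquad (6) \]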
where \( D_{\text{exp}} \) denotes the experimental RDCs and \( N \) is the total number of data points. Eq. 6 can be rewritten in the following matrix form:
\[ D_{\text{exp}} = c\,A\,S \]
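where \( c \) is the constant \( \frac{-\mu_{0} h \gamma_{i} \gamma_{j}}{(2\pi r)^{3}} \) (with \( \mu_{0} \) the vacuum permeability, \( h \) Planck's constant, \( \gamma_{i} \) and \( \gamma_{j} \) the gyromagnetic ratios of the two nuclei, and \( r \) the internuclear distance), \( A \) is the \( N \times 5 \) matrix in Eq. 6, and \( S \) is the \( 5 \times 1 \) vector of independent Saupe matrix elements. The optimal \( S \), and thereby \( D_{\text{calc}} \) (i.e., the calculated RDCs), can be computed using the Moore–Penrose pseudoinverse \( A^{+} \) of matrix \( A \), obtained by singular value decomposition:
\[ S = \frac{1}{c} A^{+} D_{\text{exp}}, \qquad D_{\text{calc}} = c\,A\,S = A A^{+} D_{\text{exp}} \]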
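The single-structure fit can be sketched numerically as follows. This is a minimal illustration, not the authors' code. It assumes that the five independent Saupe elements are ordered as \( (S_{yy}, S_{zz}, S_{xy}, S_{xz}, S_{yz}) \), so that row \( k \) of \( A \) is \( (\alpha_{yk}^{2} - \alpha_{xk}^{2},\ \alpha_{zk}^{2} - \alpha_{xk}^{2},\ 2\alpha_{xk}\alpha_{yk},\ 2\alpha_{xk}\alpha_{zk},\ 2\alpha_{yk}\alpha_{zk}) \). The function names `design_matrix` and `fit_saupe`, the placeholder value of `c`, and the synthetic bond vectors are all hypothetical.

```python
import numpy as np

def design_matrix(bond_vectors):
    """Build the N x 5 matrix A from N bond vectors given in the molecular frame."""
    v = np.asarray(bond_vectors, dtype=float)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)   # rows become direction cosines
    ax, ay, az = v[:, 0], v[:, 1], v[:, 2]
    # Column order assumes S = (S_yy, S_zz, S_xy, S_xz, S_yz).
    return np.column_stack([ay**2 - ax**2, az**2 - ax**2,
                            2 * ax * ay, 2 * ax * az, 2 * ay * az])

def fit_saupe(A, d_exp, c=1.0):
    """Least-squares S and back-calculated RDCs via the SVD-based pseudoinverse."""
    S = np.linalg.pinv(A) @ np.asarray(d_exp, dtype=float) / c
    d_calc = c * (A @ S)
    return S, d_calc

# Synthetic, noise-free example (all numbers are illustrative placeholders).
rng = np.random.default_rng(0)
bond_vectors = rng.normal(size=(40, 3))        # 40 made-up bond vectors
S_true = np.array([2e-4, -5e-4, 1e-4, 3e-4, -2e-4])
c = -2.0e4                                     # placeholder constant, not a physical value
A = design_matrix(bond_vectors)
d_exp = c * (A @ S_true)
S_fit, d_calc = fit_saupe(A, d_exp, c)
```

With noise-free input and more than five well-spread bond vectors, the fitted vector matches the one used to generate the data; with real measurements the fit is a least-squares compromise.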
Residual dipolar coupling (RDC) calculation from an ensemble
The RDC calculation method for a single structure can be extended to take ensemble averaging into account so that the ensemble \( D_{\text{calc}} \) can be obtained. First, let us assume that all structures contribute equally to the experimental RDCs \( D_{\text{exp}} \). For an ensemble of \( n \) structures with equal weights, we have the following formula:
\[ D_{\text{exp}} = c \left( \frac{1}{n} \sum_{k=1}^{n} A_{k} \right) S \]
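where \( A_{k} \) is the \( A \) matrix obtained from the \( k \)-th structure in the ensemble. \( S \) can be obtained, again using the Moore–Penrose pseudoinverse (denoted by the superscript \( + \)), from the following equation:
\[ S = \frac{1}{c} \left( \frac{1}{n} \sum_{k=1}^{n} A_{k} \right)^{+} D_{\text{exp}} \]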
Strictly speaking, the Saupe matrix might vary for different conformations of the protein. In this work we assume the same Saupe matrix for all the conformations. This assumption is reasonable, especially for proteins that undergo only small conformational changes, as is the case with ubiquitin.
Now let us consider the case in which the structures in an ensemble have different populations and thus contribute differently to the experimental observations \( D_{\text{exp}} \). Weights (representing the relative populations) are therefore assigned to the structures, and the following formula is used to represent the combination:
\[ D_{\text{exp}} = c \left( \sum_{k=1}^{n} w_{k} A_{k} \right) S \]
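where \( n \) is the total number of structures and \( w_{k} \) and \( A_{k} \) are, respectively, the relative population (or weight) and the \( A \) matrix of the \( k \)-th structure. Thus, \( S \) can be obtained, with the superscript \( + \) denoting the Moore–Penrose pseudoinverse, from the following formula:
\[ S = \frac{1}{c} \left( \sum_{k=1}^{n} w_{k} A_{k} \right)^{+} D_{\text{exp}} \]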
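The effect of the weights on the fit can be illustrated with a short sketch. It assumes a single Saupe vector shared by all structures, as stated above, and forms the population-weighted matrix \( \sum_{k} w_{k} A_{k} \) before applying the pseudoinverse. The function name `fit_weighted_ensemble` is hypothetical, and the random arrays only stand in for real \( A \) matrices.

```python
import numpy as np

def fit_weighted_ensemble(A_list, weights, d_exp, c=1.0):
    """Fit one shared Saupe vector S to a population-weighted ensemble.

    A_list  : n arrays of shape (N, 5), one A matrix per structure
    weights : n relative populations (normalised here to sum to 1)
    Returns (S, d_calc, rmsd).
    """
    A = np.asarray(A_list, dtype=float)            # shape (n, N, 5)
    d_exp = np.asarray(d_exp, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    A_avg = np.tensordot(w, A, axes=1)             # sum_k w_k A_k, shape (N, 5)
    S = np.linalg.pinv(A_avg) @ d_exp / c
    d_calc = c * (A_avg @ S)
    rmsd = float(np.sqrt(np.mean((d_calc - d_exp) ** 2)))
    return S, d_calc, rmsd

# Synthetic example: data generated with unequal populations (arbitrary units).
rng = np.random.default_rng(1)
A_list = rng.normal(size=(3, 30, 5))               # random stand-ins for A matrices
w_true = np.array([0.6, 0.3, 0.1])
S_true = np.array([1.0, -0.5, 0.3, 0.2, -0.4])
d_exp = np.tensordot(w_true, A_list, axes=1) @ S_true

_, _, rmsd_true = fit_weighted_ensemble(A_list, w_true, d_exp)
_, _, rmsd_equal = fit_weighted_ensemble(A_list, np.ones(3), d_exp)
```

Passing uniform weights reproduces the equal-weight case, so the same function covers both formulas. In this synthetic setting the fit with the generating populations leaves essentially no residual, while the equal-weight fit does not.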
Our problem is thus to find the optimal relative populations for the structures in the ensemble so that the experimental RDCs are best reproduced. The solution to this problem is given in Appendix 2.
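To make the optimization problem concrete, the sketch below alternates between two linear least-squares steps: fitting \( S \) with the weights held fixed, and fitting the weights with \( S \) held fixed. It is a generic alternating scheme written for illustration only and should not be read as a reproduction of the algorithm in Appendix 2. The function name `reweight_ensemble`, the clipping of negative weights, the normalisation to unit sum, and the random stand-in data are all assumptions.

```python
import numpy as np

def reweight_ensemble(A_list, d_exp, c=1.0, n_iter=50):
    """Alternate between fitting S (weights fixed) and weights (S fixed).

    Illustrative only: negative weights are clipped to zero, weights are
    renormalised to sum to 1, and the best iterate (lowest RMSD) is returned.
    """
    A = np.asarray(A_list, dtype=float)            # shape (n, N, 5)
    d_exp = np.asarray(d_exp, dtype=float)
    n = A.shape[0]

    def fit_S(w):
        A_avg = np.tensordot(w, A, axes=1)         # sum_k w_k A_k
        S = np.linalg.pinv(A_avg) @ d_exp / c
        rmsd = float(np.sqrt(np.mean((c * (A_avg @ S) - d_exp) ** 2)))
        return S, rmsd

    w = np.full(n, 1.0 / n)                        # start from equal weights
    S, rmsd = fit_S(w)
    best_w, best_rmsd = w, rmsd
    for _ in range(n_iter):
        M = c * (A @ S).T                          # column k: RDCs predicted by structure k
        w_new, *_ = np.linalg.lstsq(M, d_exp, rcond=None)
        w_new = np.clip(w_new, 0.0, None)          # populations cannot be negative
        if w_new.sum() == 0.0:
            break
        w = w_new / w_new.sum()                    # overall scale is absorbed by S
        S, rmsd = fit_S(w)
        if rmsd < best_rmsd:
            best_w, best_rmsd = w, rmsd
    return best_w, best_rmsd

# Synthetic example with made-up A matrices and populations (arbitrary units).
rng = np.random.default_rng(2)
A_list = rng.normal(size=(3, 30, 5))
w_true = np.array([0.6, 0.3, 0.1])
S_true = np.array([1.0, -0.5, 0.3, 0.2, -0.4])
d_exp = np.tensordot(w_true, A_list, axes=1) @ S_true

weights, rmsd = reweight_ensemble(A_list, d_exp)

# Equal-weight reference fit for comparison.
A_eq = A_list.mean(axis=0)
S_eq = np.linalg.pinv(A_eq) @ d_exp
rmsd_equal = float(np.sqrt(np.mean((A_eq @ S_eq - d_exp) ** 2)))
```

Because the scheme starts from equal weights and keeps the best iterate, the returned RMSD can never exceed that of the equal-weight fit. Convergence to the generating populations is not guaranteed by this simple scheme.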
Appendix 2
The iterative least squares fitting algorithm to a single RDC data set
The iterative least squares fitting algorithm to multiple RDC data sets
Cite this article
Vammi, V., Lin, TL. & Song, G. Enhancing the quality of protein conformation ensembles with relative populations. J Biomol NMR 58, 209–225 (2014). https://doi.org/10.1007/s10858-014-9818-2