Enhancing the quality of protein conformation ensembles with relative populations

  • Article
  • Journal of Biomolecular NMR

Abstract

The function and dynamics of many proteins are best understood not from a single structure but from an ensemble. A high-quality ensemble is necessary for accurately delineating protein dynamics. However, conformations in an ensemble are generally given equal weights. Few attempts have been made to assign relative populations to the conformations, mainly because of the lack of suitable experimental data. Here we propose a method for assigning relative populations to ensembles using experimental residual dipolar couplings (RDCs) as constraints, and show that relative populations can significantly enhance an ensemble’s ability to represent the native states and dynamics. The method works by identifying conformation states within an ensemble and assigning appropriate relative populations to them. Each of these conformation states is represented by a sub-ensemble consisting of a subset of the conformations. Application to the ubiquitin X-ray ensemble clearly identifies two key conformation states, with relative populations in excellent agreement with previous work. We then apply the method to a reprotonated ERNST ensemble that is enhanced with a switched conformation, and show that, as a result of population reweighting, not only is the reproduction of RDCs significantly improved, but common conformational features (particularly the dihedral angle distributions of ϕ53 and ψ52) also emerge for both the X-ray ensemble and the reprotonated ERNST ensemble.

Acknowledgments

Funding from the National Science Foundation (CAREER award, CCF-0953517) is gratefully acknowledged. The authors would also like to thank the two anonymous reviewers for their insightful comments.

Author information

Corresponding authors

Correspondence to Vijay Vammi or Guang Song.

Appendices

Appendix 1: Calculation of RDCs

Given a 3D structure of a protein, the RDC \( D_{ij} \) can be expressed in the molecular frame. First, the elements of the Saupe matrix are defined as:

$$ S_{lm} = \left\langle \frac{3\cos \beta_{l} \cos \beta_{m} - k_{lm}}{2} \right\rangle $$
(4)

where \( \beta_{l} \) denotes the angle between the l-th molecular axis and the external magnetic field, \( k_{lm} \) is the Kronecker delta, and the angle brackets denote an average over the molecular motion. The RDC \( D_{ij} \) can be reformulated in the molecular frame as:

$$ D_{ij} = \frac{-\mu h r_{i} r_{j}}{(2\pi r)^{3}} \left( \begin{array}{ccccc} \alpha_{y}^{2} - \alpha_{x}^{2} & \alpha_{z}^{2} - \alpha_{x}^{2} & 2\alpha_{x}\alpha_{y} & 2\alpha_{x}\alpha_{z} & 2\alpha_{y}\alpha_{z} \end{array} \right) \left( \begin{array}{c} S_{yy} \\ S_{zz} \\ S_{xy} \\ S_{xz} \\ S_{yz} \end{array} \right) $$
(5)

where \( \alpha_{x} \), \( \alpha_{y} \), and \( \alpha_{z} \) are the cosines of the angles between the bond vector of the two nuclei and the x, y, and z axes of the molecular frame. Only five Saupe elements appear because the Saupe matrix is symmetric and traceless, so that \( S_{xx} = -(S_{yy} + S_{zz}) \). Let \( \alpha_{x,k} \), \( \alpha_{y,k} \), and \( \alpha_{z,k} \) denote the direction cosines of the k-th bond vector. When all the bond vectors are considered, we have the following formula:

$$ D_{\exp} = \left( \frac{-\mu h r_{i} r_{j}}{(2\pi r)^{3}} \right) \left( \begin{array}{ccc} \alpha_{y,1}^{2} - \alpha_{x,1}^{2} & \cdots & 2\alpha_{y,1}\alpha_{z,1} \\ \vdots & \ddots & \vdots \\ \alpha_{y,N}^{2} - \alpha_{x,N}^{2} & \cdots & 2\alpha_{y,N}\alpha_{z,N} \end{array} \right) \left( \begin{array}{c} S_{yy} \\ S_{zz} \\ S_{xy} \\ S_{xz} \\ S_{yz} \end{array} \right) $$
(6)

where \( D_{\exp} \) is the vector of experimental RDCs and N is the total number of data points. Eq. 6 can be rewritten in the following matrix form:

$$ D_{\exp } = cAS $$
(7)

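where c is the constant \( \frac{-\mu h r_{i} r_{j}}{(2\pi r)^{3}} \), A is the N × 5 matrix in Eq. 6, and S is the 5 × 1 vector of Saupe matrix elements. The optimal S, and thereby \( D_{\text{calc}} \) (i.e., the calculated RDCs), can be computed in the least-squares sense from the Moore–Penrose pseudoinverse of A, which is obtained by singular value decomposition. In the equations below, \( A^{-1} \) denotes this pseudoinverse (A is not square), and the constant c is absorbed into S: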

$$ S = A^{ - 1} D_{\exp } $$
(8)
$$ D_{\text{calc}} = AA^{ - 1} D_{\exp } $$
(9)
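The single-structure fit of Eqs. 6 to 9 can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' code: the function names `design_matrix` and `fit_saupe`, the random bond vectors, and the values in `S_true` are assumptions made for the example, and the constant c is taken as absorbed into S.

```python
import numpy as np

def design_matrix(bond_vectors):
    """Build the N x 5 matrix A of Eq. 6 from bond vectors in the molecular frame."""
    v = np.asarray(bond_vectors, dtype=float)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)  # rows become direction cosines
    ax, ay, az = v[:, 0], v[:, 1], v[:, 2]
    return np.column_stack([ay**2 - ax**2, az**2 - ax**2,
                            2 * ax * ay, 2 * ax * az, 2 * ay * az])

def fit_saupe(A, d_exp):
    """Least-squares Saupe vector (Eq. 8) and back-calculated RDCs (Eq. 9)."""
    S = np.linalg.pinv(A) @ d_exp  # pinv computes the pseudoinverse via SVD
    return S, A @ S

# Synthetic test case (hypothetical numbers, not experimental data).
rng = np.random.default_rng(0)
bonds = rng.normal(size=(40, 3))
A = design_matrix(bonds)
S_true = np.array([3.0, -5.0, 1.0, 0.5, -2.0])  # order: Syy, Szz, Sxy, Sxz, Syz
d_exp = A @ S_true

S_fit, d_calc = fit_saupe(A, d_exp)
rmsd = np.sqrt(np.mean((d_calc - d_exp) ** 2))
```

With noise-free synthetic data and more than five independent bond vectors, the fit recovers `S_true` and the RMSD is zero to numerical precision; with real data the RMSD measures how well a single structure reproduces the RDCs.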

Residual dipolar coupling (RDC) calculation from an ensemble

The RDC calculation for a single structure can be extended to take ensemble averaging into account, so that the ensemble \( D_{\text{calc}} \) can be obtained. First, assume that all structures contribute equally to the experimental RDCs \( D_{\exp} \). For an ensemble with equal weights, we have the following formula:

$$ \left( \frac{A_{1}}{n} + \frac{A_{2}}{n} + \cdots + \frac{A_{k}}{n} + \cdots + \frac{A_{n}}{n} \right) S = D_{\exp} $$
(10)

where \( A_{k} \) is the A matrix obtained from the k-th structure in the ensemble and n is the number of structures. S can be obtained from the following equation, in which, as in Eq. 8, the inverse denotes the Moore–Penrose pseudoinverse:

$$ S = \left( \frac{A_{1}}{n} + \frac{A_{2}}{n} + \cdots + \frac{A_{k}}{n} + \cdots + \frac{A_{n}}{n} \right)^{-1} D_{\exp} $$
(11)
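As an illustration of Eqs. 10 and 11, the sketch below averages the A matrices of all ensemble members and fits a single Saupe vector to the average. The helper names, the five synthetic "conformations" (a base set of bond vectors plus random perturbations), and the values in `S_true` are assumptions made for the example only.

```python
import numpy as np

def design_matrix(bond_vectors):
    """Build the N x 5 matrix A of Eq. 6 from bond vectors in the molecular frame."""
    v = np.asarray(bond_vectors, dtype=float)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)
    ax, ay, az = v[:, 0], v[:, 1], v[:, 2]
    return np.column_stack([ay**2 - ax**2, az**2 - ax**2,
                            2 * ax * ay, 2 * ax * az, 2 * ay * az])

def equal_weight_fit(structures, d_exp):
    """Fit one Saupe vector to the equally weighted ensemble (Eqs. 10 and 11)."""
    A_avg = np.mean([design_matrix(b) for b in structures], axis=0)
    S = np.linalg.pinv(A_avg) @ d_exp
    return S, A_avg @ S

def rmsd(a, b):
    return np.sqrt(np.mean((a - b) ** 2))

# Synthetic ensemble: five perturbed copies of the same 30 bond vectors.
rng = np.random.default_rng(1)
base = rng.normal(size=(30, 3))
members = [base + 0.2 * rng.normal(size=base.shape) for _ in range(5)]

# Synthetic "experimental" RDCs generated from the ensemble average.
S_true = np.array([4.0, -6.0, 1.5, -0.5, 2.0])
d_exp = np.mean([design_matrix(m) for m in members], axis=0) @ S_true

S_ens, d_ens = equal_weight_fit(members, d_exp)
S_one, d_one = equal_weight_fit(members[:1], d_exp)  # single structure, for comparison
rmsd_ens, rmsd_one = rmsd(d_ens, d_exp), rmsd(d_one, d_exp)
```

Because the synthetic data were generated from the ensemble average, the ensemble fit reproduces them exactly, while a fit to any single member leaves a residual.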

Strictly speaking, the Saupe matrix might vary for different conformations of the protein. In this work we assume the same Saupe matrix for all the conformations. This assumption is reasonable, especially for proteins that undergo only small conformational changes, as is the case with ubiquitin.

Now consider the case in which the structures in an ensemble have different populations and thus contribute differently to the experimental observations \( D_{\exp} \). Weights (representing the relative populations) are assigned to the structures, and the following formula is used to represent the combination:

$$ \left( w_{1} A_{1} + w_{2} A_{2} + \cdots + w_{k} A_{k} + \cdots + w_{n} A_{n} \right) S = D_{\exp} $$
(12)

where n is the total number of structures, and \( w_{k} \) and \( A_{k} \) are, respectively, the relative population (or weight) and the A matrix of the k-th structure. Thus, S can be obtained from the following formula, in which the inverse again denotes the Moore–Penrose pseudoinverse:

$$ S = \left( w_{1} A_{1} + w_{2} A_{2} + \cdots + w_{k} A_{k} + \cdots + w_{n} A_{n} \right)^{-1} D_{\exp} $$
(13)
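For a given set of weights, Eqs. 12 and 13 amount to a weighted sum of the A matrices followed by the same pseudoinverse fit. The sketch below is illustrative only: `weighted_fit` is a hypothetical helper, the A matrices are random stand-ins rather than matrices computed from real structures, and normalising the weights to sum to one is an assumption of this example (rescaling all weights only rescales S and leaves the calculated RDCs unchanged).

```python
import numpy as np

def weighted_fit(mats, weights, d_exp):
    """Saupe vector and calculated RDCs for fixed populations (Eqs. 12 and 13)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # assumption: populations sum to one
    A_w = np.tensordot(w, mats, axes=1)  # sum_k w_k A_k, shape (N, 5)
    S = np.linalg.pinv(A_w) @ d_exp
    return S, A_w @ S

def rmsd(a, b):
    return np.sqrt(np.mean((a - b) ** 2))

# Synthetic stand-ins for the A matrices of n = 4 structures, N = 30 RDCs each.
rng = np.random.default_rng(2)
mats = rng.normal(size=(4, 30, 5))
w_true = np.array([0.4, 0.3, 0.2, 0.1])
S_true = np.array([2.0, -3.0, 0.5, 1.0, -1.5])
d_exp = np.tensordot(w_true, mats, axes=1) @ S_true

S_w, d_w = weighted_fit(mats, w_true, d_exp)       # correct populations
S_eq, d_eq = weighted_fit(mats, np.ones(4), d_exp)  # equal weights (Eq. 10)
rmsd_w, rmsd_eq = rmsd(d_w, d_exp), rmsd(d_eq, d_exp)
```

In this synthetic setting the correct populations reproduce the data exactly, whereas equal weights leave a residual, which is the effect that population reweighting is meant to remove.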

Our problem is thus to find the optimal relative populations for the structures in the ensemble so that the experimental RDCs are best reproduced. The solution to this problem is given in Appendix 2.
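To make the optimisation problem concrete, the sketch below handles the simplest case of two conformation states by scanning the population of the first state on a grid and keeping the value that best reproduces the RDCs. This brute-force scan is a stand-in chosen for clarity; it is not the iterative least squares algorithm of Appendix 2, and all names and numbers in it are hypothetical.

```python
import numpy as np

def design_matrix(bond_vectors):
    """Build the N x 5 matrix A of Eq. 6 from bond vectors in the molecular frame."""
    v = np.asarray(bond_vectors, dtype=float)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)
    ax, ay, az = v[:, 0], v[:, 1], v[:, 2]
    return np.column_stack([ay**2 - ax**2, az**2 - ax**2,
                            2 * ax * ay, 2 * ax * az, 2 * ay * az])

def scan_two_state(A1, A2, d_exp, grid=None):
    """Return (w1, rmsd, S) minimising the RDC misfit over a grid of populations."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 101)
    best = None
    for w in grid:
        A_w = w * A1 + (1.0 - w) * A2      # Eq. 12 with n = 2
        S = np.linalg.pinv(A_w) @ d_exp    # Eq. 13
        err = np.sqrt(np.mean((A_w @ S - d_exp) ** 2))
        if best is None or err < best[1]:
            best = (w, err, S)
    return best

# Two synthetic states: state 2 is a perturbed copy of state 1.
rng = np.random.default_rng(3)
bonds1 = rng.normal(size=(40, 3))
bonds2 = bonds1 + 0.5 * rng.normal(size=bonds1.shape)
A1, A2 = design_matrix(bonds1), design_matrix(bonds2)

# Synthetic data generated with a 70:30 population split.
S_true = np.array([3.0, -4.0, 1.0, -1.0, 2.0])
d_exp = (0.7 * A1 + 0.3 * A2) @ S_true

w_best, rmsd_best, S_best = scan_two_state(A1, A2, d_exp)
```

A grid scan becomes impractical beyond a few states, since the number of grid points grows exponentially with the number of free populations; it serves here only to show what "best reproduced" means for Eqs. 12 and 13.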

Appendix 2

The iterative least squares fitting algorithm to a single RDC data set

The iterative least squares fitting algorithm to multiple RDC data sets

Cite this article

Vammi, V., Lin, TL. & Song, G. Enhancing the quality of protein conformation ensembles with relative populations. J Biomol NMR 58, 209–225 (2014). https://doi.org/10.1007/s10858-014-9818-2
