# Comparison of molecular dynamics and superfamily spaces of protein domain deformation


## Abstract

### Background

The strong relationship between protein structure and flexibility, on the one hand, and biological function, on the other, is well known. From a technical standpoint, exploring protein flexibility is an essential task in many applications, such as protein structure prediction and modeling. In this contribution we compare two different approaches to exploring the flexibility space of protein domains: i) molecular dynamics (MD-space), and ii) the study of structural changes within a superfamily (SF-space).

### Results

Our analysis indicates that the MD-space and the SF-space display a significant overlap, but are still different enough to be considered complementary. The SF-space is wider but less complex than the MD-space, irrespective of the number of members in the superfamily. Also, the SF-space does not sample all the possibilities offered by the MD-space, but often introduces very large changes along just a few deformation modes, whose number tends to a plateau as the number of related folds in the superfamily increases.

### Conclusion

Theoretically, we reach two conclusions. First, function restricts the access of evolution to some flexibility patterns: when a superfamily member changes to become another, the path does not completely overlap with the physical deformability. Second, conformational changes arising from variation within a superfamily are larger and much simpler than those allowed by physical deformability. Methodologically, the conclusion is that the two spaces studied are complementary and differ in size and complexity. We expect this to find application in fields such as 3D-EM/X-ray hybrid models or *ab initio* protein folding.

### Keywords

Molecular Dynamic, Singular Vector, Molecular Dynamic Trajectory, Superfamily Member, Deformation Space

## Background

The central dogma of structural biology asserts that the amino acid sequence contains all the information needed for a protein to adopt a structure, and that structure determines function. The connection between sequence and structure has been the focus of a great deal of work, and detailed theories of protein folding exist [1], but predicting structure or function from sequence remains an extremely complex task except in cases of high sequence identity between the target protein and a well-annotated homolog [2]. There are many cases of non-homologous proteins sharing a given fold or function, as well as proteins with reasonably similar sequences having quite different structures.

Flexibility seems to play an important role in protein function, as in many cases movements are key for activity. Unfortunately, even less is known about the connection between flexibility and function and, specifically, about the conformational changes that a protein must undergo to perform its biological function [3, 4, 5]. Just as structures able to perform a specific function are conserved by evolution, which does not tolerate mutations that seriously modify them, it is plausible that mutations disrupting the flexibility pattern of a given protein will not be accepted either [3, 6, 7, 8, 9].

Two scenarios are, *a priori*, possible:

i) If physical deformability is crucial to protein function, conformational changes introduced by sequence modifications will happen as orthogonal as possible to the physical deformation pattern.

ii) The physical deformation pattern traces movements that allow quite significant conformational changes without disrupting the function(s) associated with a fold. Mutations leading to conformational changes along this pattern of flexibility will be better tolerated, as they will not affect the function. This would suggest a good overlap between the physical space studied by MD and the conformational space explored by the members of a superfamily.

Our results show that the relative flexibility among domains of a given superfamily is restricted to just a few "directions of change" (SF-space), which overlap only partially with the "directions of change" indicated by MD (MD-space). For technical purposes, the conclusion is that both spaces can be combined to increase the dimensionality of the search space when performing any kind of computational-biology task that requires the exploration of possible protein deformations.

## Results and discussion

*versus* size, a rough increase in the ratio between SF- and MD-space variances with protein size is found (Figure 2b), and the same incremental tendency is observed for the variance ratio plotted against the number of superfamily members (Figure 2c). Again, a similar reasoning explains it: a greater size of the superfamily implies a parallel increase in the possibilities of sequence variation, while it does not affect the variance of the MD-space.

*physically* possible, but they are not well populated within the experimental ensembles of the superfamilies, meaning that they have not been tolerated by evolution.

*versus* complete MD-ensemble) or relative (*versus* reduced MDp-ensemble) coverage (Figure 4), confirming that larger superfamilies do not necessarily sample the physical deformation space better than smaller ones.

Taking all the analyses discussed above together, we conclude that many deformation patterns appear to be physically possible but are not explored within a superfamily, and that the overlap between MD- and SF-spaces is only partial. These findings could be related to the bias of the SF-space towards insertions, deletions, and amino acid changes, which lead to bigger deformations in the structure than the simple variation of torsion angles explored in the physical space. Other reasons are probably related to the inability of the SF-space to explore movements that might challenge protein functionality.

The structural changes inside a superfamily can be severe in extent but are easily represented by a few essential movements. We cannot completely rule out the possibility that the overlap between the spaces would increase if the structures of more members of a given superfamily were solved, but our results suggest that there is an inherent limit. In summary, as suggested by the complexity analysis, the SF-space is quickly saturated.

*Thermus thermophilus* (1a8h001) are very flexible in our MD simulations performed in the absence of RNA, but they are frozen in the biologically-relevant RNA-bound form [22]. Similarly, the C-terminal region of Germin from *Hordeum vulgare* (1fi2A00, Figure 8, red), required for dimer formation [23], is exposed and flexible in the MD trajectory of the monomer, while in the dimer the contacts trap it.

Taking into account local and global behavior together, we distinguish three groups among the 55 studied superfamilies:

i) Superfamilies (with both small and large number of members) showing poor overlap between SF- and MD-spaces (Hess index < 0.15, Additional file 1) and low correspondence between B-factor plots (Figure 6a). This group is largely enriched in enzymes of the α+β structural class. We can expect that flexibility will be a crucial issue in these proteins and accordingly the deformation pattern should be very well preserved, which means that changes in the SF-space happen as orthogonal as possible to the functionally relevant MD-space [24, 25, 26].

ii) Superfamilies with a high number of members (n > 40), good overlap of SF- and MD-spaces (Hess index > 0.25, Additional file 1) and relatively good correspondence between the B-factor plots (Figure 6c). Here we find domains with structural or binding roles and fewer enzymes, with a preference for α and β motifs. In this group the superfamilies have been able to explore many physically-available deformation modes of the MD-space which do not interfere with function.

iii) Superfamilies with a low number of members (n < 40), some overlap in the deformation spaces (Hess index > 0.15, Additional file 1) and poor correspondence between B-factor plots (Figure 6b). This group shows diverse families both in structural and functional terms. The physical deformability space has been explored only to a small extent, but the residues that are not essential for function introduce large local structural changes, reflected in the poor B-factor correspondence.

## Conclusion

Our technical analysis comparing the spaces of structural variation within superfamilies (SF-space) and along atomistic MD simulations (MD-space) sheds light on the connection between physical flexibility and the conformational variation that accompanies compositional change in the amino acid sequence. The overall picture is more complex than we originally thought, in part because we are comparing a set of different structures in a superfamily with the MD of just one of them. First, we have observed that when the sequence of a protein changes to become another member of the superfamily, the change follows a path that does not completely overlap with that expected from the intrinsic physical deformability of the protein, which suggests that functional restriction limits the access of evolution to some flexibility patterns. This effect is especially clear for enzymes, which show the worst overlap between SF- and MD-spaces. Second, our analysis shows that conformational changes resulting from sequence variation tend to be larger and much simpler than those allowed by individual physical flexibility. Interestingly, the threshold for achieving the maximum overlap between the SF- and MD-spaces seems to lie around 40 superfamily members (Figure 3b), suggesting some saturation in the deformation along the superfamily when compared to the physical space.

MD- and SF-spaces are comparable, but they also have important differences, and some words of caution are necessary. Superfamily members vary in sequence, in some cases quite dramatically, and are therefore expected to have different structures, whereas an MD simulation samples the flexibility of a single sequence. It is thus not surprising that MD does not explain instances where there are specific chemical interactions.

The strength of our analysis lies in its methodological implications. As the deformation spaces have different size and complexity and do not fully overlap, they can be considered complementary. Flexibility analysis derived from the study of structural variation along superfamilies can provide useful and easy-to-manage descriptions [21, 27], although there is a limit to the physical complexity that it can describe. In much the same way, physical descriptions of isolated domains that do not consider their possible interactions have a limited capability to predict flexibility in the context of protein-protein complexes, and variation along domains in a superfamily is a good way of obtaining that information. In other words, taking the SF- and MD-spaces together enriches our view of the conformational freedom of proteins.

This is expected to be of special interest in the areas of 3D-EM/X-ray hybrid models or *ab initio* protein folding, where exploring the physical conformational space exclusively with high-dimensionality methods such as Molecular Dynamics or Normal Mode Analysis could be over-conservative. We suggest that the most important singular vectors of the SF-space (about 6) provide a complementary deformation space that can be very useful in sampling [27], since it will draw quite distant structures towards the common fold. A sequential combination of both spaces can help to improve these areas of protein structure prediction.

## Methods

### Superfamily space of flexibility

*cdv's*):

where *x*, *y*, *z* stand for the coordinates of the same backbone atom *n* (C_{α}, O, N and C) in two structurally aligned amino acids, each belonging to one domain (*i* for the reference, *j* for the aligned). A CDV vector was created from all the *cdv's* obtained for the atoms of a given aligned domain, placing the *x*, *y*, *z* coordinates in consecutive indexes. A **CDV** matrix was then built with all the CDVs as its columns (one per aligned domain). The **CDV** matrix was decomposed with the incremental singular value decomposition (ISVD) algorithm [29] to capture the main axes of variation (Figure 1). The use of ISVD, a variant of the singular value decomposition (SVD) method [30], allows us to manage superfamilies with incomplete information in the core due to gaps in the alignment, since it can handle matrices in which some elements are unknown. In any case, amino acids in the reference domain that could not be aligned in any of the pairwise alignments using MAMMOTH (black box, Figure 9) were excluded from further analysis. When ISVD is applied to the **CDV** matrix it produces:

$$\mathbf{CDV} = \mathbf{U}\,\mathbf{S}\,\mathbf{V}^{T}$$

**U** - (4·3·*m*) × (*n*−1) matrix containing an orthogonal basis for the multi-dimensional space defined by the CDVs, where *m* is the number of amino acids in the core and *n* is the number of superfamily members used in the procedure. The factor 4 comes from the 4 backbone atoms employed and the factor 3 from the *x*, *y*, *z* coordinates.

**S** - (*n*−1) × (*n*−1) diagonal matrix containing the *n*−1 singular values of the decomposition.

**V** - (*n*−1) × (*n*−1) matrix containing an orthogonal basis for the space of the rows of **CDV**.
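The construction of the **CDV** matrix and its decomposition can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses synthetic coordinates, assumes complete alignments (so plain SVD from NumPy stands in for ISVD), and assumes each cdv is the aligned coordinate minus the reference coordinate. The function name `build_cdv_matrix` is hypothetical.

```python
import numpy as np

def build_cdv_matrix(reference, aligned):
    """Stack coordinate-difference vectors as columns.

    reference: array (m, 4, 3), backbone atoms of the m core residues.
    aligned:   list of arrays (m, 4, 3), one per aligned domain.
    Assumption: cdv = aligned - reference (sign convention not confirmed).
    """
    ref = np.asarray(reference, dtype=float)
    # ravel() in C order keeps x, y, z of each atom in consecutive indexes.
    columns = [(np.asarray(dom, dtype=float) - ref).ravel() for dom in aligned]
    return np.column_stack(columns)

# Synthetic example: m core residues, n superfamily members
# (one reference plus n - 1 aligned domains).
rng = np.random.default_rng(0)
m, n = 10, 6
reference = rng.normal(size=(m, 4, 3))
aligned = [reference + 0.5 * rng.normal(size=(m, 4, 3)) for _ in range(n - 1)]

cdv = build_cdv_matrix(reference, aligned)

# Plain SVD (no missing data): cdv = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(cdv, full_matrices=False)
print(cdv.shape, U.shape, s.shape, Vt.shape)
```

With complete data, `U` has 4·3·*m* rows and *n*−1 columns, matching the dimensions listed above; handling gaps requires the incremental variant described next.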

The columns of **U** define a new basis for **CDV** which, ranked by the relative value of the singular values in **S**, best explains the structural variation among the aligned domains. The ISVD algorithm estimates the incomplete columns of the original **CDV** matrix during the decomposition procedure in an incremental fashion, starting with the columns with the fewest missing values. If the next CDV vector **c** has missing values, denoted as **c**_{0}, they are estimated by:

$$\mathbf{c}_{0} = \mathbf{U}'_{0}\,\mathbf{S}'\,\mathbf{Z}$$

where **Z** is the set of values that minimize the sum of squared errors for the known values, denoted as **c**_{•}, when solving:

$$\mathbf{c}_{\bullet} = \mathbf{U}'_{\bullet}\,\mathbf{S}'\,\mathbf{Z}$$

Here **U'**_{0} and **U'**_{•} are the rows of **U'** for the missing and known data, respectively. **U'** and **S'** are the decomposition matrices calculated in intermediate steps of the ISVD procedure. The interested reader is referred to [29] for the theory behind the ISVD, and to [21] for a complete explanation of the adaptation of ISVD to structural alignments of superfamilies. As in Principal Component Analysis (PCA), the result of both SVD and ISVD calculations is a transformation of the initial variation matrix into a set of orthogonal movements characterized by a set of singular vectors (which indicate the nature of the essential movement) and a set of singular values which, after transformation by eq. 5, are equal to the PCA eigenvalues.
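The missing-value step can be illustrated with an ordinary least-squares solve. The sketch below shows only the imputation of one incomplete column from an intermediate basis; it does not perform the incremental update of the decomposition itself, for which [29] should be consulted. The function name `impute_missing` and the use of NaN to mark gaps are assumptions made for this example.

```python
import numpy as np

def impute_missing(U, s, c):
    """Estimate the NaN entries of column c from an intermediate basis.

    U: (rows, r) intermediate left singular vectors.
    s: (r,) intermediate singular values.
    c: (rows,) new column, with np.nan marking missing entries.
    Returns the completed column and the least-squares coefficients Z.
    """
    c = np.asarray(c, dtype=float)
    known = ~np.isnan(c)
    A = U * s  # equivalent to U @ np.diag(s)
    # Z minimizes the squared error on the known entries only.
    Z, *_ = np.linalg.lstsq(A[known], c[known], rcond=None)
    filled = c.copy()
    filled[~known] = A[~known] @ Z
    return filled, Z

# Synthetic check: a rank-3 data set and a new column in the same subspace.
rng = np.random.default_rng(0)
basis = rng.normal(size=(12, 3))
complete_columns = basis @ rng.normal(size=(3, 5))
U_full, s_full, _ = np.linalg.svd(complete_columns, full_matrices=False)
U_r, s_r = U_full[:, :3], s_full[:3]

truth = basis @ rng.normal(size=3)
c = truth.copy()
c[[1, 7]] = np.nan  # pretend two coordinates fall in an alignment gap

filled, Z = impute_missing(U_r, s_r, c)
print(np.abs(filled - truth).max())
```

Because the synthetic column lies exactly in the span of the basis, the gaps are recovered to numerical precision; with real structures the estimate is only as good as the intermediate basis.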

where *n* is the number of snapshots used for the decomposition, *l*_{ i } is the PCA eigenvalue and *s*_{ i } is the [I]SVD singular value. Note that the original protein Cartesian coordinates appear now as projections onto the space defined by the singular vectors without any loss of structural information.
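The correspondence between singular values and PCA eigenvalues can be checked numerically. The sketch below assumes the usual unbiased covariance normalisation, *l*_i = *s*_i²/(*n*−1), and mean-centred data; the exact normalisation used in the original equation is not recoverable from the text, so both choices are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_snapshots, n_coords = 50, 12

# Synthetic "trajectory": rows are snapshots, columns are coordinates.
X = rng.normal(size=(n_snapshots, n_coords)) * np.linspace(3.0, 0.5, n_coords)
Xc = X - X.mean(axis=0)  # centre each coordinate

# Singular values of the centred data matrix (descending order).
s = np.linalg.svd(Xc, compute_uv=False)
eig_from_svd = s**2 / (n_snapshots - 1)  # assumed normalisation

# Eigenvalues of the covariance matrix (PCA), sorted in descending order.
cov = np.cov(Xc, rowvar=False)
eig_pca = np.sort(np.linalg.eigvalsh(cov))[::-1]

print(np.allclose(eig_from_svd, eig_pca))
```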

### Molecular-dynamics space of flexibility

The range of conformations accessible to a protein under normal physiological conditions can be well explored by molecular dynamics (MD) simulations. The technique samples the movements of macromolecules by integrating Newton's equations of motion, with the forces obtained from an accurate potential function (the force field) fitted to reproduce highly accurate quantum mechanical data for small model systems [31, 32]. In contrast to Normal Mode Analysis, atomistic MD does not assume that the protein is confined to a harmonic well around the experimental structure, and thus allows large conformational transitions if the physics of the system requires them. It is the best technique to explore the physical deformation space of proteins.

The reference protein domains were simulated in the context of the whole native protein. All protein structures were titrated, neutralized by ions, minimized, hydrated, heated and equilibrated (for at least 0.5 ns) using a well established protocol [20]. Trajectories were collected using the AMBER parm99 force field [33] in conjunction with Jorgensen's TIP3P model [34, 35] to represent water molecules. The Particle Mesh Ewald approach was used to deal with long-range effects [36]. The equations of motion were integrated every 1 fs, with the vibrations of bonds involving hydrogen atoms removed by the SHAKE algorithm [37]. Production runs were obtained with the program AMBER8 [38] and were extended for 10 ns. The computational effort corresponds to more than 20 CPU years and was made possible by access to large supercomputer resources.

### Statistical descriptors for comparison

The MD and SF-spaces were subjected, for comparison purposes, to a modified version of the essential dynamics procedure [39] using SVD (with MD-space) and ISVD (with SF-space) decompositions. Many comparisons can be easily made using the singular vectors and values provided by the decomposition algorithms:

*1) The size of deformability space* was measured by the variance in MD or superfamily ensembles, summing the square of the singular values obtained after the decomposition. To avoid bias related to the limited number of structures in most superfamilies, the analysis of MD variance was repeated also using as many equally spaced MD snapshots as superfamily members (partial-MD space; MDp). The average values for 100 windows were computed.

*2) The complexity of the deformability space* was determined by the number of singular vectors needed to explain 90% of the variance.
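Descriptors 1 and 2 follow directly from the singular values, as the sketch below shows. The function names `space_size` and `space_complexity` are hypothetical, and the input is assumed to be the singular values of either decomposition.

```python
import numpy as np

def space_size(singular_values):
    """Size of the deformability space: sum of squared singular values."""
    s = np.asarray(singular_values, dtype=float)
    return float(np.sum(s**2))

def space_complexity(singular_values, fraction=0.90):
    """Number of singular vectors needed to explain `fraction` of the variance."""
    sq = np.sort(np.asarray(singular_values, dtype=float) ** 2)[::-1]
    cumulative = np.cumsum(sq) / sq.sum()
    # First index whose cumulative share reaches the threshold, as a count.
    count = int(np.searchsorted(cumulative, fraction)) + 1
    return min(count, len(sq))

example = [3.0, 2.0, 1.0, 1.0]
print(space_size(example))        # squared values sum to 9 + 4 + 1 + 1
print(space_complexity(example))  # cumulative shares: 0.60, 0.87, 0.93, 1.00
```

A space dominated by a few large singular values, as reported here for the SF-space, gives a large size but a small complexity.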

*3) The overlap between the SF- and MD-spaces* was determined using the Hess metric [40] and associated Z-score (eqs. 6 and 7; [41]), where *X* and *Y* stand for the two methods, the indexes *i* and *j* stand for the orders of the eigenvectors (ranked according to their contribution to the structural variance), and *n* stands for the number of superfamily members.

Pure random models were obtained by decomposition of a pseudo-covariance matrix obtained by random permutation of the backbone atoms for each snapshot in a trajectory, and the standard deviation (std) was obtained by considering 500 different pseudo-covariance matrices.
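A subspace-overlap calculation of this kind can be sketched as follows. The code assumes the commonly used form of the overlap, (1/*n*) Σ_i Σ_j (**v**_i^X · **v**_j^Y)², and a Z-score defined as (observed − mean of random) / std of random; neither formula is taken verbatim from the original equations. Random orthonormal bases are used here as a simple stand-in for the permutation-based random models described above, and all names are hypothetical.

```python
import numpy as np

def hess_overlap(vecs_x, vecs_y):
    """Overlap between two sets of n orthonormal vectors stored as columns."""
    vx = np.asarray(vecs_x, dtype=float)
    vy = np.asarray(vecs_y, dtype=float)
    assert vx.shape == vy.shape, "both sets must contain the same n vectors"
    n = vx.shape[1]
    dots = vx.T @ vy  # all pairwise dot products
    return float(np.sum(dots**2) / n)

def z_score(observed, random_values):
    """Distance of the observed overlap from the random-model distribution."""
    r = np.asarray(random_values, dtype=float)
    return float((observed - r.mean()) / r.std(ddof=1))  # ddof=1 is an assumption

rng = np.random.default_rng(0)
dims, n = 30, 5

# Two similar subspaces: one is a slightly perturbed copy of the other.
basis_md, _ = np.linalg.qr(rng.normal(size=(dims, n)))
basis_sf, _ = np.linalg.qr(basis_md + 0.05 * rng.normal(size=(dims, n)))
observed = hess_overlap(basis_sf, basis_md)

# Stand-in null model: 500 random orthonormal bases (not the paper's procedure).
random_overlaps = [
    hess_overlap(np.linalg.qr(rng.normal(size=(dims, n)))[0], basis_md)
    for _ in range(500)
]
print(observed, z_score(observed, random_overlaps))
```

The overlap is 1 for identical subspaces and 0 for orthogonal ones, which is why it is read together with a Z-score against a null model.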

Additional Z-scores* (labeled with * to avoid confusion with the previous Z-scores derived from purely random models), showing the relevance of the values for H in a more chemically sound environment, were computed from models where the chemical connectivity was maintained and steric collapses were avoided. For this purpose, we performed several 10 ns discrete dynamics simulations for each protein with a simplified force-field defined by covalent bonds plus a hard sphere potential for each atom [42]. Essential dynamics from these trajectories provided sets of singular vectors that are representative of random movements but still consistent with the basic physics of the protein. The standard deviations needed for Z-score calculations were evaluated from independent discrete dynamics simulations.

*4) The coverage of MD-space achieved by the SF-space* was measured by analyzing the distribution of the projections of the superfamily members on the essential subspace defined by the first two singular vectors of the MD-space (essential MD-space). The essential MD-space was divided into 9 equivalent portions, where the limits in X and Y were determined by the smallest and largest projection values achieved during the 10 ns trajectories. The coverage was evaluated as the number of portions of the essential MD-space that were visited by at least one superfamily member (example in Figure 10). Similar results were obtained when changing the number of portions. Note that a low coverage can be due to the intrinsic differences between MD and superfamily-derived samplings, but also to the limited number of superfamily members available. In order to distinguish between both sources of deviation we also computed the coverage for the partial MD-space.
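The coverage count can be sketched as a 3 × 3 grid over the range of the MD projections. The text does not say how superfamily members that project outside the MD range are treated; the sketch assumes they are ignored. The function name `coverage` and the input layout are hypothetical.

```python
import numpy as np

def coverage(md_proj, sf_proj, bins=3):
    """Count grid cells of the essential MD-space visited by superfamily members.

    md_proj: (frames, 2) projections of MD snapshots on the first two modes.
    sf_proj: (members, 2) projections of superfamily members on the same modes.
    Assumption: members outside the MD range are ignored.
    """
    md = np.asarray(md_proj, dtype=float)
    sf = np.asarray(sf_proj, dtype=float)
    lo, hi = md.min(axis=0), md.max(axis=0)

    inside = np.all((sf >= lo) & (sf <= hi), axis=1)
    sf = sf[inside]

    # Cell index along each axis; points on the upper edge fall in the last cell.
    idx = np.floor((sf - lo) / (hi - lo) * bins).astype(int)
    idx = np.clip(idx, 0, bins - 1)
    return len({tuple(cell) for cell in idx})

md_proj = [[0.0, 0.0], [3.0, 3.0]]           # sets the grid limits
sf_proj = [[0.5, 0.5], [0.6, 0.4],           # same cell
           [2.5, 2.5],                       # opposite corner cell
           [10.0, 10.0]]                     # outside the MD range, ignored
print(coverage(md_proj, sf_proj))            # number of visited cells out of 9
```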

*5) Individual mobility of residues* was determined by the residue B-factors:

$$B = \frac{8\pi^{2}}{3}\,\langle \Delta r^{2} \rangle$$

where ⟨Δ*r*^{2}⟩ stands for the mean-square oscillations of atoms around their equilibrium positions.

where *n* is the number of superfamily members and $\Delta {D}_{i}^{X}$ stands for a displacement along a given mode (*i*) in the space *X*. ${k}_{i}^{X}$ is the stiffness constant associated with a deformation mode, computed as *k*_{ b }*T*/(2*l*_{ i }), with k_{b} being Boltzmann's constant, *l*_{ i }the corresponding PCA eigenvalue and T the absolute temperature.
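The two quantities defined in this subsection can be computed as in the sketch below. It assumes the standard isotropic relation B = (8π²/3)⟨Δr²⟩ for B-factors, snapshots that are already superposed, coordinates in Å, eigenvalues in Å², and a temperature of 300 K; the temperature and units are assumptions, and the function names are hypothetical.

```python
import numpy as np

K_B = 0.0019872041  # Boltzmann's constant in kcal/(mol K)

def b_factors(snapshots):
    """Per-atom B-factors from superposed snapshots of shape (frames, atoms, 3)."""
    xyz = np.asarray(snapshots, dtype=float)
    disp = xyz - xyz.mean(axis=0)                   # displacement from the mean position
    msf = np.mean(np.sum(disp**2, axis=2), axis=0)  # mean-square fluctuation per atom
    return (8.0 * np.pi**2 / 3.0) * msf

def stiffness(eigenvalues, temperature=300.0):
    """Stiffness constants k_i = k_B T / (2 l_i), as stated in the text."""
    l = np.asarray(eigenvalues, dtype=float)
    return K_B * temperature / (2.0 * l)

# One atom oscillating by +/- 1 along x: <dr^2> = 1.
toy = [[[1.0, 0.0, 0.0]], [[-1.0, 0.0, 0.0]]]
print(b_factors(toy))         # equals 8*pi^2/3 for unit mean-square fluctuation
print(stiffness([1.0, 4.0]))  # stiffer modes have smaller eigenvalues
```

Residue B-factors would then be obtained by averaging the per-atom values over the atoms of each residue.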

## Notes

### Acknowledgements

The authors thank Tim Meyer for helpful suggestions. This work was partially funded by the European Union (FP6-502828 and UE-512092), the USA National Institutes of Health (HL740472), the Spanish Comisión Interministerial de Ciencia y Tecnología (BIO2007-67150-C03-01, BIO2007-67150-C03-03, BIO2006-01602, CONSOLIDER CSD2006-23, CONSOLIDER CSDooC-06.0892), the Spanish Ministry of Health (COMBIOMED RD07/0067/0009), the Government of Madrid (S-Gen-0166/2006), the National Institute of Bioinformatics (a project of Genoma España), and the Fundación Marcelino Botín. JAVM and MR are supported by a MEC Postdoctoral Fellowship, IC is supported by a Spanish Postdoctoral Fellowship (FIS-CD07/00131) and APM is supported by the Spanish Ramón y Cajal program. We acknowledge the Barcelona Supercomputing Center for providing us with computer resources.

## Supplementary material

### References

1. Shea JE, Brooks CL 3rd: **From folding theories to folding proteins: a review and assessment of simulation studies of protein folding and unfolding.** *Annu Rev Phys Chem* 2001, **52**:499–535.
2. Rost B: **Twilight zone of protein sequence alignments.** *Protein Eng* 1999, **12**(2):85–94.
3. Gerstein M, Krebs W: **A database of macromolecular motions.** *Nucleic Acids Res* 1998, **26**(18):4280–4290.
4. Gerstein M, Lesk AM, Chothia C: **Structural Mechanisms for Domain Movements in Proteins.** *Biochemistry* 1994, **33**(22):6739–6749.
5. Goh C-S, Milburn D, Gerstein M: **Conformational changes associated with protein-protein interactions.** *Curr Opin Struct Biol* 2004, **14**:104–109.
6. Qian B, Ortiz AR, Baker D: **Improvement of comparative model accuracy by free-energy optimization along principal components of natural structural variation.** *Proc Natl Acad Sci USA* 2004, **101**(43):15346–15351.
7. Goldstein RA: **The structure of protein evolution and the evolution of protein structure.** *Curr Opin Struct Biol* 2008, **18**(2):170–177.
8. Daniel RM, Dumm RV, Finney JL, Smith JC: **The role of dynamics in enzyme activity.** *Annual Review Biophysics and Biomolecular Structure* 2003, **32**:69–92.
9. Kuhlman B, Baker D: **Native protein sequences are close to optimal for their structures.** *Proc Natl Acad Sci USA* 2000, **97**(19):10383–10388.
10. Andreeva A, Howorth D, Chandonia JM, Brenner SE, Hubbard TJ, Chothia C, Murzin AG: **Data growth and its impact on the SCOP database: new developments.** *Nucleic Acids Res* 2008, (36 Database):D419–425.
11. Pearl F, Todd A, Sillitoe I, Dibley M, Redfern O, Lewis T, Bennett C, Marsden R, Grant A, Lee D, *et al*.: **The CATH Domain Structure Database and related resources Gene3D and DHS provide comprehensive domain family information for genome analysis.** *Nucleic Acids Res* 2005, (33 Database):D247–251.
12. Holm L, Ouzounis C, Sander C, Tuparev G, Vriend G: **A database of protein structure families with common folding motifs.** *Protein Sci* 1992, **1**(12):1691–1698.
13. Flores TP, Orengo CA, Moss DS, Thornton JM: **Comparison of conformational characteristics in structurally similar protein pairs.** *Protein Sci* 1993, **2**(11):1811–1826.
14. Pang A, Arinaminpathy Y, Sansom MS, Biggin PC: **Comparative molecular dynamics – similar folds and similar motions?** *Proteins* 2005, **61**(4):809–822.
15. Maguid S, Fernandez-Alberti S, Ferrelli L, Echave J: **Exploring the common dynamics of homologous proteins. Application to the globin family.** *Biophysical journal* 2005, **89**(1):3–13.
16. Maguid S, Fernandez-Alberti S, Echave J: **Evolutionary conservation of protein vibrational dynamics.** *Gene* 2008.
17. Maguid S, Fernandez-Alberti S, Parisi G, Echave J: **Evolutionary conservation of protein backbone flexibility.** *Journal of molecular evolution* 2006, **63**(4):448–457.
18. Leo-Macias A, Lopez-Romero P, Lupyan D, Zerbino D, Ortiz AR: **An analysis of core deformations in protein superfamilies.** *Biophysical journal* 2005, **88**(2):1291–1299.
19. Leo-Macias A, Lopez-Romero P, Lupyan D, Zerbino D, Ortiz AR: **Core deformations in protein families: a physical perspective.** *Biophys Chem* 2005, **115**(2–3):125–128.
20. Rueda M, Ferrer-Costa C, Meyer T, Perez A, Camps J, Hospital A, Gelpi JL, Orozco M: **A consensus view of protein dynamics.** *Proc Natl Acad Sci USA* 2007, **104**(3):796–801.
21. Velazquez-Muriel JA, Carazo JM: **Flexible fitting in 3D-EM with incomplete data on superfamily variability.** *J Struct Biol* 2007, **158**(2):165–181.
22. Sugiura I, Nureki O, Ugaji-Yoshikawa Y, Kuwabara S, Shimada A, Tateno M, Lorber B, Giege R, Moras D, Yokoyama S, *et al*.: **The 2.0 A crystal structure of Thermus thermophilus methionyl-tRNA synthetase reveals two RNA-binding modules.** *Structure* 2000, **8**(2):197–208.
23. Woo EJ, Dunwell JM, Goodenough PW, Marvier AC, Pickersgill RW: **Germin is a manganese containing homohexamer with oxalate oxidase and superoxide dismutase activities.** *Nat Struct Biol* 2000, **7**(11):1036–1040.
24. Henzler-Wildman K, Kern D: **Dynamic personalities of proteins.** *Nature* 2007, **450**(7172):964–972.
25. Henzler-Wildman KA, Lei M, Thai V, Kerns SJ, Karplus M, Kern D: **A hierarchy of timescales in protein dynamics is linked to enzyme catalysis.** *Nature* 2007, **450**(7171):913–916.
26. Henzler-Wildman KA, Thai V, Lei M, Ott M, Wolf-Watz M, Fenn T, Pozharski E, Wilson MA, Petsko GA, Karplus M, *et al*.: **Intrinsic motions along an enzymatic reaction trajectory.** *Nature* 2007, **450**(7171):838–844.
27. Velazquez-Muriel JA, Valle M, Santamaria-Pang A, Kakadiaris IA, Carazo JM: **Flexible fitting in 3D-EM guided by the structural variability of protein superfamilies.** *Structure* 2006, **14**(7):1115–1126.
28. Ortiz AR, Strauss CE, Olmea O: **MAMMOTH (matching molecular models obtained from theory): an automated method for model comparison.** *Protein Sci* 2002, **11**(11):2606–2621.
29. Brand ME: **Incremental Singular Value Decomposition of Uncertain Data with Missing Values.** In *Lecture Notes in Computer Science*. *Volume 2350*. European Conference on Computer Vision (ECCV); 2002:707–720.
30. Press WH, Flannery BP, Teukolsky SA, Vetterling WT: *Numerical Recipes in C: The Art of Scientific Computing.* 1st edition. UK: Cambridge University Press; 1988.
31. Karplus M, Kuriyan J: **Molecular dynamics and protein function.** *Proc Natl Acad Sci USA* 2005, **102**(19):6679–6685.
32. Karplus M: **Molecular dynamics of biological macromolecules: A brief history and perspective.** *Biopolymers* 2003, **68**(3):350–358.
33. Cornell WD, Cieplak P, Bayly CI, Gould IR, Merz KM Jr, Ferguson DM, Spellmeyer DC, Fox T, Caldwell JW, Kollman PA: **A Second Generation Force Field for the Simulation of Proteins, Nucleic Acids, and Organic Molecules.** *Journal of the American Chemical Society* 1995, **117**(19):5179–5197.
34. Jorgensen WL, Chandrasekhar J, Madura JD, Impey RW, Klein ML: **Comparison of simple potential functions for simulating liquid water.** *Journal of Chemical Physics* 1983, **79**(2):926–935.
35. Mahoney MW, Jorgensen WL: **A five-site model for liquid water and the reproduction of the density anomaly by rigid, nonpolarizable potential functions.** *Journal of Chemical Physics* 2000, **112**(20):8910–8922.
36. Darden TL, York D, Pedersen L: **Particle Mesh Ewald: An N-log(N) method for Ewald sums in large systems.** *Journal of Chemical Physics* 1993, **98**:10089–10092.
37. Andersen HC: **Rattle: a velocity version of the SHAKE algorithm for molecular dynamics calculations.** *Journal of Computational Physics* 1983, **52**:24–34.
38. Case DA, Cheatham TE 3rd, Darden T, Gohlke H, Luo R, Merz KM Jr, Onufriev A, Simmerling C, Wang B, Woods RJ: **The Amber biomolecular simulation programs.** *J Comput Chem* 2005, **26**(16):1668–1688.
39. Amadei A, Linssen AB, Berendsen HJ: **Essential dynamics of proteins.** *Proteins* 1993, **17**(4):412–425.
40. Hess B: **Similarities between principal components of protein dynamics and random diffusion.** *Phys Rev E Stat Phys Plasmas Fluids Relat Interdiscip Topics* 2000, **62**(6 Pt B):8438–8448.
41. Rueda M, Chacon P, Orozco M: **Thorough Validation of Protein Normal Mode Analysis: A Comparative Study with Essential Dynamics.** *Structure* 2007, **15**(5):565–575.
42. Emperador A, Carrillo O, Rueda M, Orozco M: **Exploring the suitability of coarse-grained techniques for the representation of protein dynamics.** *Biophysical journal* 2008, **95**(5):2127–2138.

## Copyright information

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.