WEBnm@: a web application for normal mode analyses of proteins
Normal mode analysis (NMA) has become the method of choice to investigate the slowest motions in macromolecular systems. NMA is especially useful for large biomolecular assemblies, such as transmembrane channels or virus capsids. NMA relies on the hypothesis that the vibrational normal modes having the lowest frequencies (also named soft modes) describe the largest movements in a protein and are the ones that are functionally relevant.
We developed a web-based server to perform normal mode calculations and different types of analyses. Starting from a structure file provided by the user in the PDB format, the server calculates the normal modes and subsequently offers the user a series of automated calculations: normalized squared atomic displacements, vector field representation and animation of the first six vibrational modes. Each analysis is performed independently of the others and results can be visualized using only a web browser. No additional plug-in or software is required. For users who would like to analyze the results with their favorite software, raw results can also be downloaded. The application is available at http://www.bioinfo.no/tools/normalmodes. We present here the underlying theory, the application architecture and an illustration of its features using a large transmembrane protein as an example.
We built an efficient and modular web application for normal mode analysis of proteins. Non-specialists can easily and rapidly evaluate the degree of flexibility of multi-domain protein assemblies and characterize the large amplitude movements of their domains.
Keywords: Normal Mode; Atomic Displacement; Deformation Energy; MscL; Normal Mode Analysis
Molecular modeling provides several powerful tools for computing the dynamics of proteins. Normal Mode Analysis (NMA) is a well-suited approach to study the dynamics of proteins, especially when the protein is relatively big (several thousand amino acids) and the time scale of the dynamical events of interest is longer than what molecular dynamics (MD) simulations can reach, typically a few nanoseconds. NMA-based methods rely on the hypothesis that the vibrational normal modes exhibiting the lowest frequencies (also named soft modes) describe the largest movements in a protein and are the ones that are functionally relevant.
Several tools based on NMA have been developed [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16] and successfully applied to predict the collective, large amplitude motions of several macromolecules of different sizes, e.g. the F(1)-ATPase, RNA polymerases or bigger systems such as virus capsids. Lately, web tools have appeared, making this technique accessible to a larger number of users. elNémo, the web interface to the Elastic Network Model, offers normal mode calculations and a fairly large number of analyses for each calculated mode: degree of collectivity, animation (downloadable PDB files or animated GIF images) for each mode using three different views of the protein, comparison between experimental and predicted B-factors, maximum distance fluctuation between all pairs of Cα atoms and normalized mean squared atomic displacements. If two structures are uploaded, the cumulative overlap between the modes and the conformational difference is calculated. Delarue et al. have developed another application based on the Elastic Network Model. The application offers calculations of normal modes on all atoms (users can also choose to use only Cα atoms) and provides an animation for each calculated mode (PDBmovies) that can be visualized with e.g. PyMol. The same group has developed a server performing normal mode calculations using a more general molecular mechanics force field, Gromacs, which also provides animations of the vibrations corresponding to each calculated mode. The use of such a force field increases the computational cost, and the system size is therefore limited to 5000 atoms. The NMA movie generator, available from the web pages of the database of macromolecular movements (MolMovDB), calculates the five lowest frequency normal modes for a PDB structure file, which can be either uploaded to the server or chosen by its PDB or SCOP identifier. Animated GIF images of the vibrations are generated and compared with the pre-calculated flexibility regions based on supplied B-factors or multiple structural alignments for the corresponding fold family for one-domain fold proteins.
The Molecular Vibrations Evaluation Server (MoVies) provides vibrational analyses of proteins and nucleic acids, using a modified AMBER force field and a self-consistent harmonic approximation method. Starting from a structure file in the PDB format, the application performs normal mode calculations and several analyses, and on completion the results are sent to the user by email. Of special interest is the evaluation of hydrogen bond disruption probability.
The ProMode database is a database of normal mode analyses of proteins. Results of normal mode analysis for a large number of proteins are made accessible via a web interface. For each mode, an animation and the axes of the movement (as calculated by DynDom) can be viewed using the Chime plugin. Fluctuations of atom positions and torsion angles, and correlations between Cα atom displacements, are plotted for each mode; the averages of these values over all modes are also stored in the database. Dynamical domains for each mode, characterized using DynDom, are given. Although NMA results for a large number of proteins can be retrieved very quickly from ProMode, not all proteins available in the Protein Data Bank are present and users cannot submit their own structure file.
We developed a web application for normal mode analysis which offers fast calculation of the 200 lowest frequency modes and different types of analyses: deformation energy, animation of the vibration, atomic squared displacements and vector field analysis. Results of each analysis can be visualized using only a web browser, without any additional plug-in or program. Alternatively, users can download raw data and visualize them using their favorite software. We have carefully designed our web application as a set of independent modules so that users can perform only the analyses they are interested in, and in this way avoid waiting for results of analyses irrelevant to their particular question. The modular structure will, in the future, allow us to easily add new functionality. The core of the application is written in the Python programming language, using the Molecular Modeling ToolKit (MMTK). MMTK contains an implementation of the approximate normal mode analysis method developed by Hinsen, which calculates low-frequency domain motions at negligible computational cost. Zope is used for the web interface, which communicates with the core through an application server. Details of the implementation are given below, followed by an example calculation on a large transmembrane protein.
The first step for the user is to upload a PDB file containing the structure. Pressing the submit button starts the normal mode calculation, which runs to completion without doing any further analysis. No limit is set for the system size (i.e. number of residues). When the calculation is finished, the user is directed to a page which displays the result of the deformation energy analysis. Low average deformation energy indicates a mode with large rigid regions, i.e. a mode with a large degree of collectivity, which has a good chance of describing domain motions. This page is meant to help users judge for which mode(s), if any, the analysis will be significant in terms of large collective movements. They can then decide to perform further analysis of the calculated modes and can choose among three different analyses (see description below). Results of each analysis are stored and can at any time be viewed in a separate window or downloaded as a ZIP archive together with the results of all other analyses performed up to that moment.
Normalized squared atomic displacements can be retrieved in two different formats. Users can download text files containing two columns, the first one corresponding to the amino acid numbers of the sequence in the submitted structure file (PDB) and the second one containing the normalized displacement corresponding to each amino acid. Alternatively, the user can retrieve PDF plots representing the variation of normalized atomic displacements vs. amino acid number. These plots are generated using the R programming language and RPy, a Python interface to R. Users can thus see the results directly in their web browser without any additional plugin or program, while those who want more flexibility can work from the raw data.
Mode animations are provided for the first six significant modes (i.e. modes 7 to 12, see Methods section), as animated GIF images or as DCD trajectory files. The DCD file format is a binary format for trajectories from MD simulations that is common to the CHARMm, XPlor and NAMD programs. DCD files can be read by VMD. Unlike animated GIFs, DCD files visualized with VMD allow users to manipulate the protein themselves (rotate, zoom, highlight specific regions, etc.), which might offer better insight into the calculated domain movements. On the other hand, this requires that the user has VMD installed and is sufficiently familiar with it. Therefore, we let the user choose the orientation of the protein before the animated GIF images are generated. RasMol [34, 35] is used to generate image files of the different conformations along the mode vector (see Methods section). The images are then concatenated to produce an animation (animated GIF file) using ImageMagick. The resulting animation is a sequence of five conformations, with a delay of 1/25 second between them.
Vector field representations help characterize the domain displacements with vectors representing the direction and the relative displacements of the different regions of the protein. Using VMD, the web application generates a picture of the protein and the vectors for modes 7 to 12. Using the same setup as for the mode animations, the user can choose the orientation of the system. Additionally, VMD 'state' files are generated and available for download, allowing a more interactive inspection of the vector fields.
2. Application server
Results: example calculation on SERCA1 Ca-ATPase
The calcium ATPase from the sarcoplasmic reticulum consists of three cytoplasmic domains, named Actuator (A, amino acids 1 to 40 (N-terminus) and 124 to 243), Nucleotidic (N, 360 to 604) and Phosphorylation (P, 330 to 359 and 605 to 737), and 10 transmembrane helices hosting the calcium binding sites. It is known that the cytoplasmic domains undergo large amplitude movements during the active transport of calcium ions. We recently reported an NMA study of the E1Ca form of the Ca-ATPase, starting from its X-ray structure (PDB ref 1EUL). Using MMTK, we could show that the N and A domains undergo the largest amplitude movements, as revealed by the lowest frequency modes. We highlighted a large amplitude movement of the transmembrane helices, which "twist-opens" the lumenal side of the protein.
The user can then choose to proceed to further analysis (cf. Figure 2b), for example to generate an animation for each of the first six modes (7 through 12). The next page (Figure 2c) lets users orient the system to ensure the best view of the movements by choosing a rotation angle about the x, y and z axes. A preview is generated for each chosen set of angles. Once the user has decided upon a set of angles, they can check the 'I'm done' radio button and then press the 'Perform' button, and the animations are generated. The user is then brought back to the 'Analysis' page (Figure 2d), where an icon has now appeared next to 'Mode Animation'. Clicking on this icon opens a new window containing the animated images (GIF format) (Figure 2e). The same applies to all additional analyses: a click on an icon opens a new window with the results of the corresponding analysis. At any moment, one can download the analyses performed up to that point as a ZIP archive that contains all result files.
WEBnm@ allows efficient calculation of normal modes for proteins and is available to everyone from http://www.bioinfo.no/tools/normalmodes. Calculation of the modes for the Ca-ATPase, which contains 994 residues, takes about 4 minutes. Our web application has several other advantages: users can choose which analyses to perform so that no time is wasted on analyses they are not interested in. Result pages for each analysis are independent and open in separate windows. All results are presented on the web pages; no additional programs or plugins are needed for visualization. However, results are also provided in other formats (x, y format for normalized squared atomic displacements, PDB for structures and DCD for trajectories) in case users want to use their favorite program to visualize and analyze their results. This allows anyone to calculate normal modes for relatively large systems without having the required resources (i.e. memory) to do it in-house. At any time, result files of the calculations performed up to that moment can be downloaded in a ZIP file. Although WEBnm@ is not the first tool of its kind, it is probably the fastest and provides functionalities that are not found elsewhere.
The architecture of WEBnm@ is fully modular. It is designed to accommodate an increasing number of functionalities (structure comparison between different conformations of a protein, domain determination, etc.). Decisions on future developments will also be based on users' requests.
Normal modes calculations
A normal mode analysis (NMA) consists of the diagonalization of the matrix of the second derivatives of the energy with respect to the displacements of the atoms, in mass-weighted coordinates (Hessian matrix). The eigenvectors of the Hessian matrix are the normal modes, and its eigenvalues are the squares of the associated frequencies. We use the approximate normal mode calculation method developed by Hinsen and implemented in the MMTK package. This method represents the low-frequency domain motions very well at negligible computational cost. The force field used is slightly different from the one used in the original publication and has been described previously. It uses only the Cα atoms of the protein, which are assigned the masses of the whole residues they represent.
Briefly, the functional form of the force field is
V(r) is the harmonic pair potential describing the interaction between the Cα atoms:
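The diagonalization described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the MMTK implementation used by the server. It assumes that every pair of Cα atoms interacts through V(r) = k(|R0|)(|r| - |R0|)^2, where R0 is the pair distance vector in the input structure, and that the force constant k follows the distance dependence published for Hinsen's Cα force field; the numerical constants in `pair_force_constant` are quoted from memory and should be checked against the original reference. The function names, the nanometre units and the residue masses passed in are choices made for this example.

```python
import numpy as np


def pair_force_constant(r_nm):
    """Assumed distance-dependent force constant (kJ mol^-1 nm^-2).

    The constants are an assumption taken from the published C-alpha
    force field; verify them before any quantitative use.
    """
    return np.where(r_nm < 0.4, 8.6e5 * r_nm - 2.39e5, 128.0 / r_nm**6)


def calpha_normal_modes(coords_nm, masses):
    """Return eigenvalues and eigenvectors of the mass-weighted Hessian.

    coords_nm: (N, 3) C-alpha positions in nm (the input structure).
    masses:    (N,) residue masses assigned to the C-alpha atoms.
    """
    coords = np.asarray(coords_nm, dtype=float)
    masses = np.asarray(masses, dtype=float)
    n = len(coords)
    hessian = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            rij = coords[j] - coords[i]
            dist = np.linalg.norm(rij)
            unit = rij / dist
            # Second derivative of k * (|r| - |R0|)^2 at |r| = |R0|.
            block = 2.0 * pair_force_constant(dist) * np.outer(unit, unit)
            hessian[3*i:3*i+3, 3*j:3*j+3] -= block
            hessian[3*j:3*j+3, 3*i:3*i+3] -= block
            hessian[3*i:3*i+3, 3*i:3*i+3] += block
            hessian[3*j:3*j+3, 3*j:3*j+3] += block
    # Mass weighting: H'_ab = H_ab / sqrt(m_a * m_b).
    inv_sqrt_m = np.repeat(1.0 / np.sqrt(masses), 3)
    weighted = hessian * np.outer(inv_sqrt_m, inv_sqrt_m)
    # eigh returns eigenvalues in ascending order; columns are the modes.
    return np.linalg.eigh(weighted)
```

For a non-degenerate structure, the first six eigenvalues returned are numerically zero (global translation and rotation), and column 6 of the eigenvector matrix is the mode referred to as mode 7 in the text.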
Two hundred modes are calculated for proteins containing fewer than 1200 residues. For proteins containing more than 1200 residues, N/6 modes are calculated (N being the number of residues). The first six modes (zero-frequency modes) correspond to global rotation and translation of the system and are ignored in the analyses. Thus, the lowest frequency mode of interest is mode 7. Deformation energy and normalized atomic displacement analyses are performed for modes 7 through 20, while mode animations and vector fields are calculated for modes 7 through 12.
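The mode-count rule and the mode ranges above can be written down compactly. The following sketch uses hypothetical names; the integer division for N/6 and the cap at 3N (the total number of degrees of freedom of a Cα model) are assumptions, since the text does not say how fractional values or very small proteins are handled.

```python
def number_of_modes(n_residues: int) -> int:
    """Number of modes to calculate for a protein of n_residues residues."""
    if n_residues <= 1200:
        requested = 200
    else:
        # Assumption: N/6 is rounded down to an integer.
        requested = n_residues // 6
    # Assumption: a C-alpha model cannot have more than 3N modes.
    return min(requested, 3 * n_residues)


# Modes are numbered from 1; modes 1 to 6 are the zero-frequency modes.
DEFORMATION_AND_DISPLACEMENT_MODES = range(7, 21)  # modes 7 through 20
ANIMATION_AND_VECTOR_FIELD_MODES = range(7, 13)    # modes 7 through 12
```

Both branches of the rule agree at exactly 1200 residues, where N/6 equals 200.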
As in DomainFinder [10, 11], a deformation energy is calculated for each atom. Deformation energy depends on the changes in the distance between the atom in question and each of its close neighbors. Low deformation energies indicate relatively rigid regions, whereas high deformation energies indicate flexible regions. The application returns the average deformation energy for each mode. Low average deformation energy indicates a mode with large rigid regions, which has a good chance of describing domain motions.
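A per-atom deformation energy of this kind can be sketched as follows. The expression used, E_i = 1/2 Σ_j k(|R_ij|) [(d_i - d_j)·R_ij]^2 / |R_ij|^2, is an assumption modelled on the DomainFinder approach and is not taken from the server code; the weighting function `force_constant` is supplied by the caller and all names are hypothetical.

```python
import numpy as np


def deformation_energies(coords, displacements, force_constant):
    """Per-atom deformation energy for one mode (assumed formula).

    coords:         (N, 3) C-alpha positions of the input structure.
    displacements:  (N, 3) displacement vectors of the mode.
    force_constant: callable mapping a pair distance to a weight k.
    """
    coords = np.asarray(coords, dtype=float)
    disp = np.asarray(displacements, dtype=float)
    n = len(coords)
    energies = np.zeros(n)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            rij = coords[i] - coords[j]
            dist = np.linalg.norm(rij)
            # Change of the pair distance to first order in the displacements.
            stretch = np.dot(disp[i] - disp[j], rij) / dist
            energies[i] += 0.5 * force_constant(dist) * stretch**2
    return energies


def average_deformation_energy(coords, displacements, force_constant):
    """Average over all atoms, the single number reported per mode."""
    return deformation_energies(coords, displacements, force_constant).mean()
```

Because only changes in pair distances enter, a rigid-body translation or an infinitesimal rotation gives zero deformation energy, which is why low values point to modes with large rigid regions.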
Normalized squared atomic displacements
Normalized squared atomic displacements ($D_i$) for each amino acid (resid) or Cα atom ($i = 1$ to $n$) are calculated as follows:
$$D_i = \frac{d_i^2}{\sum_{j=1}^{n} d_j^2}$$
where $d_i$ is the component of the eigenvector corresponding to the $i$th residue.
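As an illustration, the normalization can be computed directly from a mode vector. The sketch below assumes that the squared displacement of residue i is the squared length of its three-component part of the eigenvector, and that the normalization divides by the sum over all residues so that the values add up to one; the function name is hypothetical.

```python
import numpy as np


def normalized_squared_displacements(mode):
    """Return D_i for each residue from a mode of length 3N (or shape (N, 3))."""
    vectors = np.asarray(mode, dtype=float).reshape(-1, 3)
    squared = np.sum(vectors**2, axis=1)  # |d_i|^2 for each residue
    return squared / squared.sum()        # assumed normalization: sum of D_i is 1
```

Writing the residue numbers and the returned values side by side gives the kind of two-column text file that the server offers for download.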
Normal mode animations
Successive structures of a given animation are generated by applying the eigenvector of the corresponding mode to the Cα coordinates of the structure submitted to the server. Two structures of the protein are generated in each direction (i.e. +a*mode, +2*a*mode, -a*mode, -2*a*mode). The 'a' factor is arbitrary; we set it to 10 by default since this gives the best visual insight into the movements.
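The frame generation can be sketched as below. The sketch assumes that the mode is already expressed as Cartesian displacement vectors in the same units as the coordinates, and that the frames are ordered from -2a to +2a with the submitted structure in the middle; neither detail is stated in the text, and the function name is hypothetical.

```python
import numpy as np


def animation_frames(coords, mode, amplitude=10.0, steps=2):
    """Return 2*steps + 1 conformations displaced along one mode.

    coords:    (N, 3) C-alpha coordinates of the submitted structure.
    mode:      displacement vectors of the mode, shape (N, 3) or length 3N.
    amplitude: the arbitrary scaling factor 'a' (10 by default).
    """
    coords = np.asarray(coords, dtype=float)
    mode = np.asarray(mode, dtype=float).reshape(coords.shape)
    factors = range(-steps, steps + 1)  # -2, -1, 0, +1, +2 for steps=2
    return [coords + f * amplitude * mode for f in factors]
```

With the default of two steps in each direction this yields five conformations, matching the five-frame animations produced by the server.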
A vector field representation is calculated as described by Thomas et al. The vector field is calculated over cubic regions with an edge length of 3 Å, containing on average 1.3 Cα atoms. The vector field, defined on a regular lattice at the center of each cube, is the mass-weighted average of the displacements of the atoms in the cube.
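The coarse-graining step can be sketched as follows: atoms are binned into cubes of 3 Å edge and each occupied cube receives the mass-weighted average of the displacements of its atoms. Placing the lattice origin at the coordinate origin and skipping empty cubes are assumptions made for this example, and the function name is hypothetical.

```python
from collections import defaultdict

import numpy as np


def vector_field(coords, displacements, masses, edge=3.0):
    """Map each occupied cube to (cube center, mass-weighted mean displacement).

    coords and displacements are (N, 3) arrays in angstrom; masses is (N,).
    """
    coords = np.asarray(coords, dtype=float)
    disp = np.asarray(displacements, dtype=float)
    masses = np.asarray(masses, dtype=float)
    # Integer cube index of each atom (assumed lattice origin at 0, 0, 0).
    cells = np.floor(coords / edge).astype(int)
    weighted_sum = defaultdict(lambda: np.zeros(3))
    total_mass = defaultdict(float)
    for cell, d, m in zip(cells, disp, masses):
        key = tuple(int(c) for c in cell)
        weighted_sum[key] += m * d
        total_mass[key] += m
    return {
        key: ((np.array(key) + 0.5) * edge, weighted_sum[key] / total_mass[key])
        for key in weighted_sum
    }
```

Each entry of the returned dictionary corresponds to one arrow of the picture: the arrow starts at the cube center and points along the averaged displacement.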
Funding for this work was provided by FUGE (Norwegian functional genomics program) through the technology platform for bioinformatics. Inge Jonassen and Konrad Hinsen are gratefully acknowledged for their pertinent advice and careful reading of our manuscript.
- 9.Cornell WD, Louise-May S: Normal Mode Analysis. In Encyclopedia of Computational Chemistry. Edited by: Schleyer PvR, Allinger NL, Clark T, Gasteiger J, Kollman PA, Schaefer HF, Schreiner PR. Chichester, UK, John Wiley & Sons; 1998:1904–1913.
- 15.Hayward S: Normal mode analysis of biological molecules. In Computational biochemistry and biophysics. Edited by: Becker OM, MacKerell AD, Roux B and Watanabe M. New York, Marcel Dekker, Inc.; 2001:153–168.
- 28.Zope Open Source web application server. [http://www.zope.org/]
- 29.R [http://www.r-project.org/]
- 31.Brünger AT: "XPLOR Manual Version 3.1". Yale University Press; New Haven 1993.
- 35.Bernstein HJ: Recent changes to RasMol, recombining the variants. TIBS 2000, 9: 453–455.
- 36.Cristy J, Randers-Pehrson G: Image Magick. 2003.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.