Correlating protein function and stability through the analysis of single amino acid substitutions
Mutations resulting in the disruption of protein function are the underlying causes of many genetic diseases. Some mutations affect the number of expressed proteins while others alter the activity on a per-molecule basis. Single amino acid substitutions as caused by non-synonymous Single Nucleotide Polymorphisms (nsSNPs) often disrupt function by altering protein structure and/or stability, but can also wreak havoc by directly impacting functional binding sites. Given the experimental three-dimensional (3D) structure of a protein, we can try to differentiate between the "effect on structure/stability" and the "effect on binding". However, experimental 3D structures are available for only 1% of all known proteins; the magnitude of stability change caused by a given mutation is more widely available.
Here, we analyze to what extent the functional effect of a mutation can be predicted from its effect on protein stability. We find that simple sequence-based methods succeed in predicting functional effects of nsSNPs. In fact, such methods consistently outperform approaches that predict functional change by applying binary thresholds to stability change. We also observed that functional change is easier to predict when stability is affected than when it is not.
Our results confirmed that stability change is somehow related to function change. However, we also show that the knowledge of stability changes in no way suffices to predict functional changes and that many function changing mutations have no effect on stability.
Keywords: Functional Effect; Functional Change; Protein Stability; Single Amino Acid Substitution; Stability Change
SNAP: Screening for Non-Acceptable Polymorphisms
SIFT: Sorting Intolerant From Tolerant
PMD: Protein Mutant Database
PDB: Protein Data Bank
nsSNP: non-synonymous Single Nucleotide Polymorphism
Genetic variation is evolution's way of making children adapt better to the environment than their parents. Unfortunately for us, the specific changes in our genetic make-up are more often deleterious than beneficial. When contrasting the concerns of individuals with that of the species we find that most mutations are bad (for the individual) but the diversity created by these mutations helps the species survive. Here, we aim at predicting the effect of each mutation on the particular gene-product. Such predictions could help in addressing problems that originate from the negative perspectives for individuals.
Most genetic variation is accounted for by SNPs (single nucleotide polymorphisms). An estimated eleven million SNPs (~11 M) occur in the human genome (dbSNP release 129 already contains ~15 M human entries, but only ~6.5 M are validated). SNPs vary by their location and effect but can be grouped based on their position in the coding or the non-coding regions of DNA. Furthermore, SNPs resulting in a single amino acid substitution in the translated protein sequence (nsSNPs; non-synonymous SNPs) are differentiated from those that, due to the redundancy of the genetic code, do not. Only ~52 K frequent nsSNPs (> 5% in population) are known in human; a total of 67–200 K nsSNPs is expected. nsSNPs are as small a subset of all SNPs as coding nucleotides are of all nucleotides. By analogy, however, we expect the importance of nsSNPs to be as disproportionate as that of protein-coding regions relative to all of DNA. It is therefore not surprising that an increasingly large number of diseases and defects reported in the human mutation databases HGMD and OMIM pertain to nsSNPs. A vast number of all single amino acid substitutions originate from nsSNPs. For simplicity we use the term "nsSNPs" interchangeably with "single amino acid substitutions" in the context of this study.
Not all single amino acid substitutions are deleterious to molecular protein function. By some estimates only 20–30% of the nsSNPs result in an observable functional change. The ability to differentiate disruptive mutations from neutral ones is necessary for a better understanding of protein function. A given nsSNP may disrupt function in two ways: (1) by directly changing the "active" residue (e.g. by replacing a residue involved in ligand binding, catalysis, allosteric regulation, or post-translational modification), or (2) by affecting the scaffolding of the protein (e.g. by deforming and/or destabilizing the binding site or the entire protein structure). Wang and Moult have suggested that disease-associated mutations in humans most often belong to the second class of functional disruptions (in their data set 83% of disease-associated mutations affect protein stability).
Understanding of functional changes due to structural alterations could potentially be derived from either of these contexts. Several methods infer functional effects of nsSNPs based on features that include changes in the 3D structure [8, 9, 10, 11, 12] and, often, these are more reliable than purely sequence-based approaches [13, 14].
Unfortunately, experimental 3D structures are only available for one percent of all proteins. The estimate of reduction in protein stability in terms of unfolding energy changes (Eqn. 1) is experimentally simpler and less expensive to obtain than structure identification, translates directly to a reduced number of folded molecules under normal physiological conditions, and in cases of large changes can be regularly expected to diminish function [15, 16, 17]. However, no well-defined algorithm currently succeeds in translating energy changes into functional effects. Such a goal is further complicated by the fact that most single amino acid substitutions result in significant changes to protein stability (this includes at least 30% of the mutations that are not associated with disease). Most often a large destabilization changes or even eliminates function. However, the definition of the word "large" is unclear. In fact, that precise threshold likely differs from one protein to another.
At least one study has attempted to systematically infer functional effects from both known and predicted structural changes. The authors utilized a number of advanced tools and databases to map SNPs to their structural effects in an effort to infer functional alterations. Their method was somewhat successful in identifying functionally important substitutions (albeit less so than the algorithms designed specifically for evaluating functional effects). However, these important results by no means provided a succinct description of the relationship between changes in protein stability and those in function.
Given that it remains unclear how structural disruptions translate into functional change, we set out to formally evaluate the predictive ability of mutation-associated ΔΔG values on a set of experimentally annotated (both structurally and functionally) single amino acid substitutions. Considering that we were only able to identify a small number of such mutants, we also extracted computational predictions of ΔΔG values for a set of experimentally functionally annotated nsSNPs. We then compared the function-prediction power of ΔΔG to that of directly evaluating changes using sequence-based methods developed specifically for this purpose (SNAP, Screening for Non-Acceptable Polymorphisms; SIFT, Sorting Intolerant from Tolerant; Methods). We find that, in the general case, SNAP and SIFT identify functional disruptions better than an algorithm using a simple threshold-based binary classification of stability changes.
Results and discussion
Stability changes are not easily translated into functional changes
There are at least three reasons why there is no single threshold in ΔΔG stability change at which we are certain that function changes. (1) The threshold at which a mutation is destabilizing enough to disrupt function by reducing the number of folded active proteins depends on the unfolding energy of the wild-type molecule, which ranges from 3–15 kcal/mol [15, 20]; i.e. for the inherently more stable proteins a larger change is necessary to significantly alter the concentration of active molecules. (2) Without knowing the particular mechanism of protein function exactly, protein destabilization or stabilization events are equally likely to alter function. (3) Destabilizations affecting active sites of the protein may not be manifest in a large ΔΔG, but can still affect function. Keeping these issues in mind we set up an experiment to gauge the resolution limits of binary classification of experimental ΔΔG in predicting nsSNP-associated function changes; i.e. we tried to answer the question of whether there is one threshold at which most mutations can be considered deleterious. Alternatively, we would find that the distribution of correct and incorrect functional annotations would be similar throughout the spectrum of possible ΔΔG thresholds. Note that, to address the functional differences between stabilizing and destabilizing mutations, we considered these two types of data points separately.
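Reason (1) can be illustrated with a simple two-state folding model. The model, the temperature, the function names, and the example energies are assumptions made for this sketch and are not part of the analysis described here. Under that model the folded fraction is 1 / (1 + exp(-ΔG/RT)), where ΔG is the unfolding free energy.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def folded_fraction(dg_unfold, temp_k=298.15):
    """Fraction of folded molecules in a two-state model (an assumption).

    dg_unfold: unfolding free energy in kcal/mol (positive = stable protein).
    """
    return 1.0 / (1.0 + math.exp(-dg_unfold / (R_KCAL * temp_k)))


def loss_of_folded(dg_wild, ddg, temp_k=298.15):
    """Drop in folded fraction when a mutation destabilizes by ddg kcal/mol.

    ddg > 0 means destabilizing, matching the sign convention in the text.
    """
    before = folded_fraction(dg_wild, temp_k)
    after = folded_fraction(dg_wild - ddg, temp_k)
    return before - after
```

Calling `loss_of_folded(3.0, 2.5)` and `loss_of_folded(15.0, 2.5)` compares the two ends of the quoted 3–15 kcal/mol range: the same ΔΔG removes a substantial share of folded molecules from the marginally stable protein and a negligible share from the very stable one.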
Data set summary*.
Our results demonstrate that in general magnitudes of both destabilizing and stabilizing changes are not very informative. For destabilizing mutations, using ΔΔG is worse than random (filled squares). At the best cutoff of ΔΔG = +0.5 kcal/mol (peak point in filled squares in Fig. 1) 78% of the functionally disruptive destabilizing mutants are identified at 89% accuracy, but only 22% of the functionally neutral mutations are found (at 20% accuracy; Eqn. 2, PA = 89%, PC = 78%, NA = 20%, NC = 22%, Q2 = 67%). Relative to an uninformed 50/50 guess applied to the "real data" 80/20 distribution of non-neutrals to neutrals in our set, this amounts to a gain of 9% in accuracy and 28% in coverage of non-neutrals and a 28% loss in coverage of neutrals. (Note, if we use the suspected natural distribution of 20–30% non-neutrals for random classification, the same ΔΔG cutoff will generate results with the same accuracy, but the gain in non-neutral coverage will come at the cost of a greater loss in neutral coverage.) For stabilizing mutations (open squares, Fig. 1) using ΔΔG is slightly better than random at the peak (cutoff = -0.5 kcal/mol; PA = 89%, PC = 50%, NA = 20%, NC = 67%, Q2 = 53%), but also below random overall. Thus, over the entire set of mutants, using the ΔΔG cutoff of +/- 0.5 kcal/mol results in proper identification of functional change for only (Q2 =) 62% of the data set. These results demonstrate that a binary functional classification of mutations using a ΔΔG cutoff is not very accurate.
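The measures quoted above (PA, PC, NA, NC, Q2) refer to Eqn. 2, which is not reproduced in this text. The sketch below uses the conventional definitions of accuracy and coverage for the non-neutral ("positive") and neutral ("negative") classes; treating these as the definitions behind Eqn. 2 is an assumption, and the function name is illustrative.

```python
def performance(tp, fp, tn, fn):
    """Accuracy/coverage measures for a neutral vs non-neutral classifier.

    Assumed definitions (positive = non-neutral, negative = neutral):
      PA = TP / (TP + FP)   accuracy of non-neutral predictions
      PC = TP / (TP + FN)   coverage of non-neutral mutants
      NA = TN / (TN + FN)   accuracy of neutral predictions
      NC = TN / (TN + FP)   coverage of neutral mutants
      Q2 = (TP + TN) / N    overall two-state accuracy
    """
    def ratio(num, den):
        # Avoid division by zero when a class is never predicted or observed.
        return num / den if den else float("nan")

    return {
        "PA": ratio(tp, tp + fp),
        "PC": ratio(tp, tp + fn),
        "NA": ratio(tn, tn + fn),
        "NC": ratio(tn, tn + fp),
        "Q2": ratio(tp + tn, tp + fp + tn + fn),
    }
```

With a class split as skewed as 80/20, Q2 alone can look acceptable while NA and NC remain poor, which is why all five values are reported together.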
Larger data sets necessary to confirm structure/function correlations
Given the small number of mutations in the PMD-exp data set, it is possible that the suboptimal performance demonstrated by ΔΔG in the functional classification is an artifact of the small sample size. To check the validity of our suspicion we needed to collect a larger data set of mutations with known functional and structural effects (ideally reported as ΔΔG). For this purpose, we first extracted 3981 mutants (in 705 proteins) from the PMD that had both annotated stability and functional changes (PMD-all; Methods). According to the binary annotation of these mutants, ~71% affected function and ~67% affected protein stability. This distribution of functionally and structurally important mutations was slightly different from that of the PMD-exp data set, where 82% affected function, and, at 0.5 kcal/mol cutoff, 54% affected stability. Thus, if the PMD-exp classification performance were a statistical artifact we could expect to see improvement for this data set.
Both the structural and the functional changes recorded in PMD are in qualitative format (instead of ΔΔG; Methods). Using this form of annotation we could only generate one measurement of usefulness of stability changes in predicting functional ones (stability change = function change; PA = 73%, PC = 68%, NA = 30%, NC = 36%, Q2 = 59%). This performance was even worse than was expected from PMD-exp results, but the comparison was not exact; i.e. we could not search for an optimal cutoff in a binary classification of stability change. To use PMD-all more directly for comparison with the PMD-exp we used FoldX, a structure-based program for energy calculations in proteins, to annotate the mutants in the PMD-all set with ΔΔG values. FoldX predictions of ΔΔG changes due to single amino acid substitutions were found in a previous study to be very well correlated with the experimental energy changes. However, since FoldX requires the presence of a known wild-type protein 3D structure we were limited to a subset of only those substitutions for which this structure was available (PMD-pdb; Methods). This set contained 1657 mutants in 232 PDB structures (~75% functionally and ~71% structurally important; distribution very similar to PMD-all).
Some energy predictions agree with expert annotations of stability
Given the weakness of the correlation between the experimental data reported in PMD and the predictions from FoldX, we needed to identify a subset of our data that could be trusted for attributing correct functional and structural changes to specific mutations. For this purpose, we retained for further testing only those mutations that were correctly classified by PMD reports at our chosen FoldX cutoffs (PMD-foldx; Methods). This selection assured that subjective study opinions (or errors in manual annotation) reflected in PMD entries corresponded to signals that could potentially be picked up using a predicted (possibly erroneously) ΔΔG threshold; i.e. if reports of stability changes agreed between expert annotation (PMD) and prediction (FoldX), both could be expected to be "biologically" correct. Of the mutants in the resulting data set ~75% altered function and ~57% altered the stability of the protein.
Larger data set confirms difficulty in converting energy changes to function changes
Computational methods are better at identifying functional effects of mutations
The problems with using energy thresholds for identifying functional changes suggest taking a different route to this type of classification. For instance, using SNAP or SIFT (Methods), methods specifically designed for evaluating functional consequences of single amino acid substitutions from sequence, we were able to obtain equally good or better prediction performance. SNAP outperformed the energy threshold throughout most of the spectrum of accuracy/coverage values for both destabilizing and stabilizing examples of the PMD-exp (Fig. 1) and the PMD-foldx (Fig. 3) data sets. As expected, accuracy was better at higher RI values (i.e. more reliable predictions), but even using default cutoffs produced better overall results. Similarly, SIFT did better than ΔΔG (but slightly worse than SNAP; at cutoffs described: PMD-exp Q2 ΔΔG = 62%, SNAP = 73%, SIFT = 63%; PMD-foldx Q2 ΔΔG = 63%, SNAP = 70%, SIFT = 68%; Eqn. 2). Interestingly, of mutations with altered stability, SNAP correctly assigned functional changes to 73%, while only 65% of the ones with unchanged stability were classified properly. This result suggests that mutations with significant structural alterations are easier to differentiate in terms of their effect on function than the ones that remain structurally intact.
Accounting for overlap with training data confirms accuracy of performance estimates
Protein stability and function are correlated but not equivalent
Overall, our results are in line with earlier findings from Wang et al., Steward et al. and DePristo et al., which suggest that while many functionally disruptive mutations are due to structural changes, a fairly large segment of functionally neutral mutants is also structurally disruptive. For instance, in the PMD-exp set 69% of mutations affecting function damaged the structure while 19% of the structurally disruptive mutants did not affect function. Trends of function/structure disruption correlation were similar for all data sets except PMD-pdb where 61% of the functionally deleterious mutations were not associated with a structural change (Table 1). This difference, however, can be attributed to the high threshold for considering a destabilizing substitution to be structurally disruptive.
From the data presented above it is clear that functional changes due to mutations do not always correspond to changes in stability. This concept is not novel, nor is it particularly surprising biologically. For instance, mutations eliminating binding residues do not have to be very destabilizing to be damaging while promiscuity of enzyme binding sites that associates with destabilization may not reflect on function. In our data sets we found quite a few examples of both stability-neutral/function-non-neutral mutants and vice versa. For example, mutagenesis studies of the carboxyl tail of the mouse PKA (protein kinase A; SwissProt [26, 27] id: KAPCA_MOUSE) have shown that a tyrosine in position 330 of the protein is very important for maintaining kinase activity but not its structural integrity. Thus, many substitutions at this position yield structurally normal, yet functionally delinquent molecules. On the other end of the spectrum, there is the alanine to leucine mutation in position 172 of the 3-isopropylmalate dehydrogenase in T. thermophilus (SwissProt id: LEU3_THETH). This mutation affects the interdomain interface of the enzyme and produces a much more closed conformation of the molecule (which, incidentally, is a lot more stable). However, the substitution does not affect the domains' ability to move as necessary to maintain wild-type activity.
These examples are just some of the many that fall into a category of mutations that would always be misclassified by a "stability only"-based algorithm. Interestingly, a major novel finding of this study is that only about a third of mutations fall into this category; i.e. about two thirds of all mutants can in fact be correctly classified for their functional effects by considering the associated stability changes. On the other hand, we also find that functional classification is more precise using a computational method specifically developed for this purpose. The latter is also more advantageous for proteins with minimal information available (i.e. only sequence). Another surprising finding is that functional annotation of mutations that are disruptive stability-wise is simpler than when no stability changes are involved. Overall, we believe that the knowledge of protein stability/function correlation gleaned from this work will contribute significantly to the understanding of the field and to the development of algorithms capable of identifying functional importance of nsSNPs.
We collected experimental and computationally derived information regarding the effects of single amino acid substitutions on protein stability and function. For each of the available data sets we predicted functional effects of mutations using SNAP and SIFT, computational methods developed specifically for this purpose, and using stability alterations reported as changes in unfolding energy of the protein (ΔΔG). Comparing the predictive abilities of both approaches we find that, for our data, SNAP and SIFT perform better than using a binary threshold of ΔΔG. These results suggest that there is no simple relation that associates protein stability change to protein function change. In fact, for about a third of the mutants, these two features appear uncorrelated. Mutations that affect stability are better differentiated in terms of their effect on function than those that do not affect stability. Future implementations of computational algorithms could therefore benefit from using stability information, where available, in making predictions of functional effects of mutations.
PMD (Protein Mutant Database) [22, 23] is a literature-based database containing experimentally derived annotations of changes in protein function and/or structure due to mutation. For the purposes of this study we extracted from PMD only those entries describing single amino acid substitutions. Changes in function and structure are reported in PMD in a qualitative form ([---] means significant decrease in function/stability, [=] means no change, and [+] stands for increase in functionality/stability, etc.). Of the set extracted above we chose only those entries containing an annotated functional change (FUNCTION field) and/or annotated stability change (STABILITY field). Changes in function and stability were recorded in a binary format (neutral = identical to wild type; non-neutral = different from wild-type). If two studies of the same mutant differed in their annotations, the effect of the substitution was recorded as non-neutral. The version of PMD used to train SNAP and to extract the PMD-exp,-all,-pdb,-foldx data sets was created in Dec. 2005.
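The conversion from qualitative PMD annotations to binary labels can be sketched as follows. The input representation (a list of annotation strings per mutant) and the function name are hypothetical; only the rules themselves ("[=]" is neutral, anything else is non-neutral, disagreement between studies counts as non-neutral) come from the description above.

```python
def binarize(annotations):
    """Collapse qualitative annotations for one mutant to a binary label.

    annotations: list of strings such as "[=]", "[-]", "[---]" or "[+]",
    one per study reporting on the mutant (hypothetical representation).
    Returns "neutral" only if every report is "[=]". Any reported change,
    including disagreement between studies, gives "non-neutral".
    """
    if not annotations:
        raise ValueError("no annotation available for this mutant")
    if all(a.strip() == "[=]" for a in annotations):
        return "neutral"
    return "non-neutral"
```

One consequence of this rule is that a single dissenting report is enough to label a mutant non-neutral, which biases the binary sets toward the non-neutral class.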
We took from the Guerois et al. study all single amino acid substitutions that have been experimentally annotated with ΔΔG (Eqn. 1). We used BLAST for each of the proteins in this set to obtain alignments (at 100% sequence identity) to the sequences annotated with functional changes in the PMD database. We recorded functional effects of mutations from PMD corresponding to mutations in the Guerois et al. data set for all aligned sequences to make the PMD-exp data set.
We extracted from PMD a set of all entries containing experimental annotations of their effects on structural stability of the affected protein (STABILITY field) and on its function (FUNCTION field) to make the PMD-all data set.
FoldX  is a computational method that uses experimental protein structure data to estimate the value of all atomic interactions on the stability of said protein. Using these values FoldX can calculate the effect of mutation on protein stability. To obtain FoldX predictions of mutant ΔΔG's we extracted from the PMD-all data set all mutants in sequences mapping to a PDB id to generate the PMD-pdb data set. The mapping was accomplished by intersecting PMD PDB annotations of the selected entries with BLAST alignments of PDB chains to the reported PMD sequences (at ≥ 97% sequence identity). If the mutated residue reported in the PMD entry differed from the PDB chain residue at the same position, the entry was discarded from the data set.
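The final consistency check (discarding entries whose reported wild-type residue does not match the PDB chain) might look like the sketch below. The mutation notation ("A172L"), the function names, and the `offset` parameter are assumptions for illustration; in practice the position would first have to be mapped from PMD numbering onto the aligned PDB chain sequence.

```python
import re

# One-letter wild-type residue, 1-based position, one-letter mutant residue.
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(code):
    """Split a code such as 'A172L' into (wild_type, position, mutant)."""
    match = _MUTATION.match(code)
    if match is None:
        raise ValueError(f"unrecognized mutation code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def keep_entry(chain_sequence, mutation_code, offset=0):
    """Return True if the chain carries the reported wild-type residue.

    offset: hypothetical shift between the reported numbering and the
    0-based index into chain_sequence, taken from the alignment.
    """
    wild_type, position, _mutant = parse_mutation(mutation_code)
    index = position - 1 + offset
    if not 0 <= index < len(chain_sequence):
        return False  # position falls outside the resolved chain
    return chain_sequence[index] == wild_type
```

Entries for which `keep_entry` returns False would be dropped, mirroring the rule that mismatching residues remove a mutant from PMD-pdb.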
We selected from the PMD-pdb data set only those mutations that agreed (at ΔΔG = -0.5/2.5 kcal/mol cutoffs) with the binary annotations in PMD; i.e. we selected only those mutants that were annotated as neutral in PMD and had a FoldX prediction of > -0.5 and < 2.5 kcal/mol or those that were non-neutral according to PMD and had a FoldX prediction ≤ -0.5 or ≥ 2.5 kcal/mol.
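The selection rule for PMD-foldx can be written as a small filter. The record layout (dictionaries with `stability` and `ddg` keys) and the function names are hypothetical, and the sketch assumes that the PMD annotation being compared is the binary stability label.

```python
STABILIZING_CUTOFF = -0.5    # kcal/mol, from the text
DESTABILIZING_CUTOFF = 2.5   # kcal/mol, from the text


def foldx_says_neutral(ddg):
    """Neutral by prediction if strictly between the two cutoffs."""
    return STABILIZING_CUTOFF < ddg < DESTABILIZING_CUTOFF


def agrees_with_pmd(pmd_label, ddg):
    """True if the binary PMD label and the thresholded prediction match."""
    if pmd_label not in ("neutral", "non-neutral"):
        raise ValueError(f"unexpected label: {pmd_label!r}")
    return (pmd_label == "neutral") == foldx_says_neutral(ddg)


def select_pmd_foldx(mutants):
    """Keep only mutants whose annotation agrees with the prediction."""
    return [m for m in mutants if agrees_with_pmd(m["stability"], m["ddg"])]
```

Note that the two cutoffs are asymmetric: a predicted stabilization of 0.5 kcal/mol already counts as a change, whereas a destabilization must reach 2.5 kcal/mol.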
We extracted mutations from the new version of PMD (created in Mar. 2007) in the same manner as was applied to create PMD-foldx (we used the same cutoffs of -0.5 and 2.5 kcal/mol to identify mutations for which expert annotations agreed with FoldX predictions). To test accuracy of SNAP on novel samples we collected from this set only those mutations that were not present in the original PMD-foldx set.
Stabilizing and destabilizing mutations
In the PMD-exp data set two mutations, for which ΔΔG = 0, were not considered as part of either the stabilizing or the destabilizing data set. In all other sets, we classified as stabilizing those mutations where predicted ΔΔG ≤ 0 and as destabilizing those where ΔΔG > 0.
To evaluate the correlation of the structural and functional effects we varied the threshold of ΔΔG at which a mutation is assigned to be functionally neutral. We considered cutoffs of 0.01, 0.3, 0.5, 0.8, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, and 4.0 kcal/mol in both stabilizing and destabilizing directions.
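The cutoff scan can be sketched as below. Mutants are represented as hypothetical `(ddg, changes_function)` tuples; the split follows the rule given for the predicted sets (ΔΔG ≤ 0 is stabilizing), and calling a mutation non-neutral when |ΔΔG| reaches the cutoff (rather than strictly exceeds it) is an assumption.

```python
CUTOFFS = (0.01, 0.3, 0.5, 0.8, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0)


def split_by_direction(mutants):
    """Separate (ddg, changes_function) tuples by the sign of ddg."""
    stabilizing = [m for m in mutants if m[0] <= 0]
    destabilizing = [m for m in mutants if m[0] > 0]
    return stabilizing, destabilizing


def q2_at_cutoff(mutants, cutoff):
    """Fraction of mutants whose functional label matches the prediction.

    A mutant is predicted non-neutral if |ddg| >= cutoff (assumption).
    """
    if not mutants:
        return float("nan")
    correct = sum(
        (abs(ddg) >= cutoff) == changes_function
        for ddg, changes_function in mutants
    )
    return correct / len(mutants)


def sweep(mutants):
    """Q2 for every cutoff listed in the text, for one direction."""
    return {cutoff: q2_at_cutoff(mutants, cutoff) for cutoff in CUTOFFS}
```

Running `sweep` separately on the two lists returned by `split_by_direction` gives one curve per direction, which matches the separate treatment of stabilizing and destabilizing mutations.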
SNAP is a neural network based method for identifying functionally disruptive single amino acid substitutions from sequence. The inputs to SNAP include secondary structure and solvent accessibility predictions, evolutionary and family information, biophysical differences between the wild type and mutant amino acids, statistical likelihoods of observing residue triplets around the mutation site, SIFT predictions, and SwissProt annotations (if available). SNAP outputs both a binary prediction (neutral/non-neutral) and a reliability index (RI; 0–9) representative of the likely accuracy of prediction. For all mutants in the PMD-exp, PMD-foldx, and PMD07 data sets we ran SNAP and recorded both the binary predictions and the RI. SNAP is generally more accurate at higher RI. To obtain the ROC curves, we dialed through the RI values (in both positive/non-neutral and negative/neutral directions) as the threshold for assigning a neutral prediction.
SIFT is a sequence-based method that uses sequence homology and biophysical amino acid similarity to predict functional effects of single amino acid substitutions. SIFT outputs both a binary prediction (tolerated/deleterious) and a score (0–1, where score ≤ 0.05 means that the mutation is deleterious). SIFT scores are not meant to be used as prediction accuracy estimators. For all mutants in PMD-exp and PMD-foldx data sets we ran SIFT and recorded the binary prediction (tolerated/deleterious). For some of the mutants (4 in PMD-exp and 25 in PMD-foldx) we were unable to obtain SIFT predictions. For these instances a random prediction was generated (50/50 chance of being classified as tolerated or deleterious).
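The handling of SIFT output described above, including the random fallback for missing predictions, can be sketched as follows. The function name and the use of `None` to mark a missing score are assumptions; the 0.05 boundary and the 50/50 fallback come from the text.

```python
import random

SIFT_DELETERIOUS_MAX = 0.05  # score <= 0.05 means deleterious


def sift_call(score, rng=random):
    """Binary call from a SIFT-style score in [0, 1].

    score: float, or None when no prediction could be obtained.
    rng: source of randomness for the 50/50 fallback; pass a seeded
         random.Random instance for reproducible runs.
    """
    if score is None:
        return rng.choice(("tolerated", "deleterious"))
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")
    return "deleterious" if score <= SIFT_DELETERIOUS_MAX else "tolerated"
```

Because the fallback is random, results that include mutants without predictions will vary slightly between runs unless the generator is seeded.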
Thanks to Joerg Wicker and Stephen Kramer (both TUM) for all the helpful discussions and to Marco Punta (Columbia) and Lothar Richter (TUM) for help with the manuscript. Particular thanks to Guy Yachdav and Laszlo Kajan (both Columbia) for maintaining SNAP and for all technical help. The authors were supported by the grants R02-LM07329 from the National Library of Medicine (NLM) and U54-GM074958-01 to the Northeast Structural Genomics consortium (NESG) from the Protein Structure Initiative (PSI) of the National Institutes of Health (NIH). Last, not least, thanks to all those who deposit their experimental data in public databases, and to those who maintain these databases.
This article has been published as part of BMC Bioinformatics Volume 10 Supplement 8, 2009: Proceedings of the European Conference on Computational Biology (ECCB) 2008 Workshop: Annotation, interpretation and management of mutations. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcbioinformatics/10?issue=S8.
- 2. Database of Single Nucleotide Polymorphisms (dbSNP). Bethesda (MD): National Center for Biotechnology Information, National Library of Medicine. (dbSNP Build ID: 129) [http://www.ncbi.nlm.nih.gov/SNP/]
- 6. Hamosh A, Scott AF, Amberger JS, Bocchini CA, McKusick VA: Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res 2005, (33 Database):D514–517.
- 11. Bao L, Zhou M, Cui Y: nsSNPAnalyzer: identifying disease-associated nonsynonymous single nucleotide polymorphisms. Nucleic Acids Res 2005, (33 Web Server):W480–482.
- 19. Worth CL, Bickerton GR, Schreyer A, Forman JR, Cheng TM, Lee S, Gong S, Burke DF, Blundell TL: A structural bioinformatics approach to the analysis of nonsynonymous single nucleotide polymorphisms (nsSNPs) and their relation to disease. J Bioinform Comput Biol 2007, 5(6):1297–1318.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.