The amino-acid mutational spectrum of human genetic disease

Vitkup, Dennis; Sander, Chris; Church, George M

doi:10.1186/gb-2003-4-11-r72

Research
Published: 30 October 2003

Genome Biology, volume 4, article number R72 (2003)

Dennis Vitkup¹,
Chris Sander²,³ &
George M Church¹

Abstract

Background

Nonsynonymous mutations in the coding regions of human genes are responsible for phenotypic differences between humans and for susceptibility to genetic disease. Computational methods were recently used to predict deleterious effects of nonsynonymous human mutations and polymorphisms. Here we focus on understanding the amino-acid mutation spectrum of human genetic disease. We compare the disease spectrum to the spectra of mutual amino-acid mutation frequencies, non-disease polymorphisms in human genes, and substitutions fixed between species.

Results

We find that the disease spectrum correlates well with the amino-acid mutation frequencies based on the genetic code. Normalized by the mutation frequencies, the spectrum can be rationalized in terms of chemical similarities between amino acids. The disease spectrum is almost identical for membrane and non-membrane proteins. Mutations at arginine and glycine residues are together responsible for about 30% of genetic diseases, whereas random mutations at tryptophan and cysteine have the highest probability of causing disease.
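One way to picture "mutation frequencies based on the genetic code" is to count how many single-nucleotide changes convert a codon for one amino acid into a codon for another. The sketch below is an illustration under simplifying assumptions, not the authors' procedure: it uses the standard genetic code and treats every single-nucleotide change as equally likely, ignoring codon usage, transition/transversion bias, and CpG effects, none of which the abstract specifies. The names `CODON_TABLE` and `substitution_counts` are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code in the conventional TCAG ordering
# (first base varies slowest, third base fastest); '*' marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def substitution_counts(table=CODON_TABLE):
    """Count single-nucleotide changes leading from each amino acid to another.

    Assumption: all single-nucleotide changes are weighted equally.
    Synonymous changes and changes to or from stop codons are skipped.
    """
    counts = Counter()
    for codon, aa in table.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                new_aa = table[mutant]
                if new_aa != aa and new_aa != "*":
                    counts[(aa, new_aa)] += 1
    return counts


if __name__ == "__main__":
    counts = substitution_counts()
    # List the amino acids reachable from tryptophan (W) by one base change.
    for (src, dst), n in sorted(counts.items()):
        if src == "W":
            print(f"{src} -> {dst}: {n}")
```

Dividing an observed disease spectrum by counts of this kind would give a spectrum normalized by code-based mutability, subject to the same uniform-rate assumption.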

Conclusions

The overall disease spectrum mainly reflects the mutability of the genetic code. We corroborate earlier results that the probability of a nonsynonymous mutation causing a genetic disease increases monotonically with an increase in the degree of evolutionary conservation of the mutation site and a decrease in the solvent-accessibility of the site; opposite trends are observed for non-disease polymorphisms. We estimate that the rate of nonsynonymous mutations with a negative impact on human health is less than one per diploid genome per generation.

Acknowledgements

We thank Jay Shendure, John Aach, Patrik D'haeseleer, Daniel Segre, Peter Kharchenko, and Tzachi Pilpel for discussions. This work was supported in part by the US Department of Energy through grant DE-FG02-87-ER60565.

Author information

Chris Sander
Present address: Computational Biology Center, Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, New York, NY, 10021, USA

Authors and Affiliations

Lipper Center for Computational Genetics and Department of Genetics, Harvard Medical School, Boston, MA, 02115, USA
Dennis Vitkup & George M Church
Whitehead Institute for Biomedical Research, Nine Cambridge Center, Cambridge, MA, 02142, USA
Chris Sander

Corresponding author

Correspondence to George M Church.

About this article

Cite this article

Vitkup, D., Sander, C. & Church, G.M. The amino-acid mutational spectrum of human genetic disease. Genome Biol 4, R72 (2003). https://doi.org/10.1186/gb-2003-4-11-r72

Received: 03 July 2003
Revised: 24 September 2003
Accepted: 30 September 2003
Published: 30 October 2003
DOI: https://doi.org/10.1186/gb-2003-4-11-r72
