Protein Substitution Model and Evolutionary Distance

  • Xuhua Xia


In addition to nucleotide-based substitution models, there are also models based on amino acid and codon sequences. Observed substitutions between two sense codons depend on codon frequencies, the difference between the two encoded amino acids, and the number of nucleotide site differences between the two codons (which could differ at 1, 2, or all 3 sites). Similarly, observed substitutions between two amino acids depend on amino acid frequencies and amino acid dissimilarities. This chapter focuses on amino acid substitution models with their parameters derived from empirical substitution matrices. How is an empirical substitution matrix compiled? How to derive transition probability and rate matrices from an empirical matrix? How to derive evolutionary distances from these matrices? Under what circumstances one may fail to obtain an evolutionary distance? These questions are addressed in detail with numerical illustrations.


  1. Grantham R (1974) Amino acid difference formula to help explain protein evolution. Science 185:862–864CrossRefPubMedGoogle Scholar
  2. Jones DT, Taylor WR, Thornton JM (1992) The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci 8:275–282PubMedGoogle Scholar
  3. Kimura M (1968) Evolutionary rate at the molecular level. Nature 217:624–626CrossRefPubMedGoogle Scholar
  4. Kimura M (1977) Preponderance of synonymous changes as evidence for the neutral theory of molecular evolution. Nature 267:275–276CrossRefPubMedGoogle Scholar
  5. Kimura M (1983) The neutral theory of molecular evolution. Cambridge University Press, CambridgeCrossRefGoogle Scholar
  6. King MC, Jukes TH (1969) Non-Darwinian evolution. Science 164:788–798CrossRefPubMedGoogle Scholar
  7. Miyata T, Yasunaga T (1980) Molecular evolution of mRNA: a method for estimating evolutionary rates of synonymous and amino acid substitutions from homologous nucleotide sequences and its application. J Mol Evol 16(1):23–36CrossRefPubMedGoogle Scholar
  8. Miyata T, Miyazawa S, Yasunaga T (1979) Two types of amino acid substitutions in protein evolution. J Mol Evol 12(3):219–236CrossRefPubMedGoogle Scholar
  9. Palidwor GA, Perkins TJ, Xia X (2010) A general model of codon bias due to GC mutational bias. PLoS One 5(10):e13431CrossRefPubMedPubMedCentralGoogle Scholar
  10. Xia X (1998b) The rate heterogeneity of nonsynonymous substitutions in mammalian mitochondrial genes. Mol Biol Evol 15:336–344CrossRefPubMedGoogle Scholar
  11. Xia X (2013) DAMBE5: a comprehensive software package for data analysis in molecular biology and evolution. Mol Biol Evol 30:1720–1728PubMedPubMedCentralCrossRefGoogle Scholar
  12. Xia X (2017d) Self-organizing map for characterizing heterogeneous nucleotide and amino acid sequence motifs. Computation 5(4):43CrossRefGoogle Scholar
  13. Xia X, Li WH (1998) What amino acid properties affect protein evolution? J Mol Evol 47(5):557–564CrossRefPubMedGoogle Scholar
  14. Xia X, Xie Z (2002) Protein structure, neighbor effect, and a new index of amino acid dissimilarities. Mol Biol Evol 19(1):58–67CrossRefPubMedGoogle Scholar

Copyright information

© Springer Science+Business Media LLC 2018

Authors and Affiliations

  • Xuhua Xia
    • 1
  1. 1.University of Ottawa CAREG and Biology DepartmentOttawaCanada

Personalised recommendations