Journal of Mathematical Biology, Volume 78, Issue 6, pp 1953–1979

Quantifying the accuracy of ancestral state prediction in a phylogenetic tree under maximum parsimony

  • Lina Herbst
  • Heyang Li
  • Mike Steel


In phylogenetic studies, biologists often wish to estimate the ancestral discrete character state at an interior vertex v of an evolutionary tree T from the states that are observed at the leaves of the tree. A simple and fast estimation method—maximum parsimony—takes the ancestral state at v to be any state that minimises the number of state changes in T required to explain its evolution on T. In this paper, we investigate the reconstruction accuracy of this estimation method further, under a simple symmetric model of state change, and obtain a number of new results, both for 2-state characters, and r-state characters (\(r>2\)). Our results rely on establishing new identities and inequalities, based on a coupling argument that involves a simpler ‘coin toss’ approach to ancestral state reconstruction.
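To make the parsimony estimate concrete, the following is a minimal sketch of the standard bottom-up (Fitch-style) pass, restricted to the root of a rooted binary tree. It is an illustration only, not the authors' analysis: the nested-tuple tree encoding, the function name `fitch_root_states`, and the example trees are assumptions introduced here. For each vertex it returns the set of candidate states together with the minimum number of state changes needed in the subtree below it.

```python
def fitch_root_states(tree):
    """Return (candidate_states, min_changes) for a rooted binary tree.

    Hypothetical encoding (not from the paper): a leaf is its observed
    state (e.g. the string "a"); an interior vertex is a pair
    (left_subtree, right_subtree).
    """
    if not isinstance(tree, tuple):
        # A leaf: its observed state is the only candidate, no changes needed.
        return {tree}, 0

    left_states, left_changes = fitch_root_states(tree[0])
    right_states, right_changes = fitch_root_states(tree[1])

    common = left_states & right_states
    if common:
        # The children can agree on a state, so no extra change is required.
        return common, left_changes + right_changes
    # The children disagree: take the union and count one additional change.
    return left_states | right_states, left_changes + right_changes + 1


# Two small 2-state examples (states "a" and "b").
unambiguous_tree = (("a", "a"), ("a", "b"))
ambiguous_tree = (("a", "a"), ("b", ("a", "b")))

states_1, changes_1 = fitch_root_states(unambiguous_tree)
states_2, changes_2 = fitch_root_states(ambiguous_tree)
print(sorted(states_1), changes_1)
print(sorted(states_2), changes_2)
```

In the first tree the estimate is a single state; in the second, both states are equally parsimonious at the root, which is the kind of tie a reconstruction method has to resolve in some way, for example by choosing uniformly at random among the candidates.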


Keywords

Phylogenetic tree, Markov process, Maximum parsimony, Coupling

Mathematics Subject Classification

05C05 92D15 



Lina Herbst thanks the University of Greifswald for the Landesgraduiertenförderung studentship and the German Academic Exchange Service (DAAD) for the DAAD-Doktorandenstipendium. Mike Steel thanks the New Zealand Marsden Fund (UOC-1709). We also thank Mareike Fischer for several helpful comments, Santiago Catalano for references to some recent biological studies, and the two anonymous reviewers for numerous helpful comments on an earlier version of this manuscript.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, Germany
  2. School of Mathematics and Statistics, University of Canterbury, Christchurch, New Zealand
  3. Biomathematics Research Centre, University of Canterbury, Christchurch, New Zealand
