Phylogenetic Reconstruction Methods: An Overview

  • Alexandre De Bruyn
  • Darren P. Martin
  • Pierre Lefeuvre
Part of the Methods in Molecular Biology book series (MIMB, volume 1115)


Initially designed to infer evolutionary relationships from morphological and physiological characters, phylogenetic reconstruction methods have benefited greatly from recent developments in molecular biology and sequencing technologies, and a number of powerful methods have been developed specifically to infer phylogenies from macromolecular data. This chapter presents an overview of the basic concepts and methods used in phylogenetic reconstruction, but it is primarily intended as a simplified step-by-step guide to constructing phylogenetic trees from nucleotide sequences using fairly up-to-date maximum likelihood methods implemented in freely available computer programs. Although the analysis of chloroplast sequences from various Vanilla species is used as an illustrative example, the techniques covered here are relevant to the comparative analysis of homologous sequence datasets sampled from any group of organisms.

Key words

Phylogeny, DNA sequence, Alignment, Phylogenetic tree, Maximum likelihood



ADB is supported by the Conseil Général de La Réunion and CIRAD. DPM is supported by the Wellcome Trust. PL is supported by CIRAD, the Conseil Régional de La Réunion, and the European Union (FEDER). The authors wish to thank Dr. Jean-Michel Lett for his helpful comments.



Copyright information

© Springer New York 2014

Authors and Affiliations

  • Alexandre De Bruyn (1)
  • Darren P. Martin (2)
  • Pierre Lefeuvre (1)

  1. Pôle de Protection des Plantes, CIRAD, UMR PVBMT, Université de la Réunion, Saint-Pierre, France
  2. Institute of Infectious Diseases and Molecular Medicine, University of Cape Town, Cape Town, South Africa
