Phylogenetic Trees: Applications, Construction, and Assessment

  • Surekha Challa
  • Nageswara Rao Reddy Neelapu
Chapter

Abstract

Molecular phylogeny studies the relationships among a set of objects by generating a phylogenetic, or evolutionary, tree. The objects can be organisms or biomolecules such as genes or proteins. The evolutionary history hidden in the biomolecules emerges as evolutionary patterns in the form of a tree when suitable data, substitution models, and tree construction methods are used. These patterns are used to study the relationships among the objects, but they sometimes make those relationships difficult to infer. The variety of tree construction methods, such as the unweighted pair group method with arithmetic mean (UPGMA), neighbor joining, minimum evolution, Fitch-Margoliash, maximum parsimony, maximum likelihood, Monte Carlo simulation, and Bayesian inference, together with the types of data used in the analysis, complicates inference further. These methods follow different principles to construct a phylogenetic tree. Most often, the tree topologies that different methods generate from the same data are the same, but in some cases they differ in their internal branching. Such differences make it difficult to assess the confidence of the phylogenetic tree. The combinations of tree construction methods and data used by phylogeny program packages such as MEGA, Molphy, Phylip, PAML, and PAUP add to this difficulty. Molecular phylogeny has a wide range of applications, such as establishing the taxonomic affiliation of an organism, studying reproductive biology in lower organisms, assessing cryptic speciation within a species, understanding the history of life, resolving controversies in the history of life, reconstructing the paths of infection in epidemiology, and classifying proteins or genes into families.
If the evolutionary patterns are interpreted inappropriately, the inferences of the study may be misleading. Thus, interpretation of the tree and of the relationships among the organisms always depends on assessing the confidence of the phylogenetic tree. A review of the literature shows that sampling methods such as bootstrapping, jackknifing, and Bayesian simulation and statistical methods such as the Kishino-Hasegawa test and the Shimodaira-Hasegawa test are used to assess this confidence. This chapter reviews the applications, construction, and assessment of phylogenetic trees.
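
The abstract names UPGMA among the tree construction methods without describing it. As an illustration only, the following is a minimal sketch of the standard average-linkage formulation of UPGMA, not the chapter's own implementation. The function name `upgma`, the dictionary-of-frozensets distance input, and the toy distances are assumptions made for this example, and branch lengths are omitted to keep the sketch short.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa with UPGMA and return the topology as nested tuples.

    names: list of taxon labels (hypothetical input format)
    dist:  dict mapping frozenset({a, b}) -> pairwise distance
    """
    # Each active cluster maps to the number of leaves it contains.
    sizes = {name: 1 for name in names}
    d = dict(dist)  # work on a copy so the caller's matrix is untouched
    while len(sizes) > 1:
        # Join the two active clusters separated by the smallest distance.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        na, nb = sizes.pop(a), sizes.pop(b)
        # The distance from the new cluster to every other cluster is the
        # size-weighted mean of the distances from its two children.
        for c in sizes:
            d[frozenset((merged, c))] = (
                na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]
            ) / (na + nb)
        sizes[merged] = na + nb
    return next(iter(sizes))


# Toy distance matrix for four taxa (illustrative values, not real data).
toy = {
    frozenset(("A", "B")): 2.0,
    frozenset(("A", "C")): 6.0,
    frozenset(("A", "D")): 10.0,
    frozenset(("B", "C")): 6.0,
    frozenset(("B", "D")): 10.0,
    frozenset(("C", "D")): 10.0,
}
tree = upgma(["A", "B", "C", "D"], toy)
print(tree)  # ('D', ('C', ('A', 'B')))
```

In this toy case A and B are joined first, C joins that pair next, and D attaches last. A different method applied to the same distances could, in principle, resolve the internal branching differently, which is the situation the abstract describes.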

Keywords

Phylogenetic tree; Molecular phylogeny; Phylogeny packages; Tree construction methods; Bootstrapping; Jackknifing; Bayesian simulation; Kishino-Hasegawa test; Shimodaira-Hasegawa test

Notes

Acknowledgment

The authors are grateful to Gandhi Institute of Technology and Management (GITAM) Deemed-to-be-University, for providing necessary facilities to carry out the research work and for extending constant support in writing this review.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Surekha Challa (1)
  • Nageswara Rao Reddy Neelapu (1)
  1. Department of Biochemistry and Bioinformatics, Institute of Science, Gandhi Institute of Technology and Management (GITAM) (Deemed to be University), Visakhapatnam, India
