Abstract
Molecular phylogenetics examines how biological sequences evolve and the historical relationships between them. An important aspect of many such studies is the estimation of a phylogenetic tree, which explicitly describes evolutionary relationships between the sequences. This chapter provides an introduction to evolutionary trees and some commonly used inferential methodology, focusing on the assumptions made and how they affect an analysis. It also discusses in detail some common algorithms used for phylogenetic tree estimation. Finally, it offers a few practical guidelines, including how to combine multiple software packages to improve inference, and a comparison between Bayesian and maximum likelihood phylogenetics.
Acknowledgments
S.W. is funded by EMBL. Comments and suggestions from Nick Goldman, Lars Jermiin, Ari Loytynoja, and Fabio Pardi all helped improve previous versions of the manuscript.
Copyright information
© 2008 Humana Press, a part of Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Whelan, S. (2008). Inferring Trees. In: Keith, J.M. (eds) Bioinformatics. Methods in Molecular Biology™, vol 452. Humana Press. https://doi.org/10.1007/978-1-60327-159-2_14
DOI: https://doi.org/10.1007/978-1-60327-159-2_14
Publisher Name: Humana Press
Print ISBN: 978-1-58829-707-5
Online ISBN: 978-1-60327-159-2
eBook Packages: Springer Protocols