Abstract
In this chapter, we present recent computational and algorithmic advances for improving the inference of phylogenetic trees from the analysis of homologous genetic sequences under the maximum likelihood criterion. In particular, we detail how the use of matrix algebra at the core of Felsenstein’s pruning algorithm, combined with the architecture of modern-day computer processors, leads to efficient techniques for optimizing edge lengths. We also discuss some properties of the likelihood function when considering the optimization of the parameters of mixture models that are used to describe the variation of rates across sites.
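As a rough illustration of the matrix operations that the pruning algorithm relies on, the following minimal Python sketch computes the likelihood of a single alignment site. It is not the chapter's implementation: it assumes the Jukes-Cantor (JC69) substitution model with uniform base frequencies, and the tree encoding, the function names, and the four-taxon example tree are all hypothetical choices made for this sketch.

```python
import math

STATES = "ACGT"

def jc69_matrix(t):
    """4x4 JC69 transition probabilities for an edge of length t
    (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    same, diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
    return [[same if i == j else diff for j in range(4)] for i in range(4)]

def tip_vector(base):
    """Partial likelihoods at a tip: 1 for the observed base, 0 otherwise."""
    return [1.0 if s == base else 0.0 for s in STATES]

def mat_vec(P, v):
    """Matrix-vector product, the core operation of the pruning recursion."""
    return [sum(P[i][j] * v[j] for j in range(4)) for i in range(4)]

def partial_likelihood(node):
    """Post-order recursion. A node is either a base character (tip) or a
    list of (child, edge_length) pairs (internal node); this encoding is an
    assumption of the sketch."""
    if isinstance(node, str):
        return tip_vector(node)
    partial = [1.0] * 4
    for child, length in node:
        message = mat_vec(jc69_matrix(length), partial_likelihood(child))
        partial = [p * m for p, m in zip(partial, message)]
    return partial

def site_likelihood(root):
    """Average the root partial likelihoods over the uniform JC69 frequencies."""
    return sum(0.25 * x for x in partial_likelihood(root))

# Hypothetical tree ((A:0.1, A:0.1):0.05, (C:0.1, G:0.2):0.05) for one site.
tree = [([("A", 0.1), ("A", 0.1)], 0.05), ([("C", 0.1), ("G", 0.2)], 0.05)]
lik = site_likelihood(tree)
```

Each edge contributes one matrix-vector product per site, so the likelihood can be viewed as a function of a single edge length with the rest of the tree held fixed, which is the setting in which edge lengths are optimized.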
Acknowledgements
We would like to thank Alexandros Stamatakis for helpful suggestions on how to improve this chapter and Tandy Warnow for inviting us to celebrate Bernard Moret’s contributions to the field of computational evolution.
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Guindon, S., Gascuel, O. (2019). Numerical Optimization Techniques in Maximum Likelihood Tree Inference. In: Warnow, T. (eds) Bioinformatics and Phylogenetics. Computational Biology, vol 29. Springer, Cham. https://doi.org/10.1007/978-3-030-10837-3_2
DOI: https://doi.org/10.1007/978-3-030-10837-3_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-10836-6
Online ISBN: 978-3-030-10837-3
eBook Packages: Computer Science, Computer Science (R0)