Bioinformatics and the Cell pp 381-395

# Maximum Likelihood in Molecular Phylogenetics

Chapter

## Abstract

Maximum likelihood (ML) methods remain the gold standard in molecular phylogenetics. The calculation of likelihood, given a topology and a substitution model, is illustrated with both a brute-force approach and the pruning algorithm, a dynamic programming algorithm that is the most fundamental algorithm in likelihood calculation. The likelihood calculation is presented separately for trees without and with a molecular clock. While ML is the most robust of all methods in molecular phylogenetics, it may suffer from bias when missing data are coupled with rate heterogeneity over sites.
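As a rough illustration of the two approaches named in the abstract, the sketch below computes the likelihood of a single aligned site on a fixed rooted topology, once by summing over every assignment of bases to the internal nodes (brute force) and once with the pruning recursion. It is not the chapter's own code. The JC69 substitution model, the four-taxon tree `TREE`, its branch lengths, the taxon names `s1` to `s4`, and all function names are assumptions chosen for the example.

```python
import math
from itertools import product

BASES = "ACGT"

def jc69(i, j, t):
    """JC69 probability that base i becomes base j over branch length t
    (t in expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e

# Hypothetical rooted tree ((s1,s2),(s3,s4)).
# A leaf is a taxon name; an internal node is a list of (child, branch_length).
TREE = [
    ([("s1", 0.10), ("s2", 0.20)], 0.05),
    ([("s3", 0.15), ("s4", 0.10)], 0.05),
]

def partial(node, site):
    """Pruning step: return L[x] = P(tips below node | node has base x)."""
    if isinstance(node, str):  # leaf: 1 for the observed base, 0 otherwise
        return [1.0 if b == site[node] else 0.0 for b in BASES]
    result = [1.0] * len(BASES)
    for child, t in node:
        child_l = partial(child, site)
        for xi, x in enumerate(BASES):
            result[xi] *= sum(jc69(x, y, t) * child_l[yi]
                              for yi, y in enumerate(BASES))
    return result

def pruning_likelihood(tree, site):
    """Site likelihood: root partials weighted by JC69 frequencies (1/4 each)."""
    return sum(0.25 * l for l in partial(tree, site))

def collect(node, edges, internals):
    """Flatten the tree into (parent, child, length) edges; return the node id."""
    if isinstance(node, str):
        return node
    nid = len(internals)
    internals.append(nid)
    for child, t in node:
        edges.append((nid, collect(child, edges, internals), t))
    return nid

def brute_force_likelihood(tree, site):
    """Sum the joint probability over all 4^k internal-node assignments."""
    edges, internals = [], []
    collect(tree, edges, internals)  # the root receives id 0
    total = 0.0
    for states in product(BASES, repeat=len(internals)):
        state = dict(zip(internals, states))
        p = 0.25  # frequency of the root base
        for parent, child, t in edges:
            child_base = site[child] if isinstance(child, str) else state[child]
            p *= jc69(state[parent], child_base, t)
        total += p
    return total

if __name__ == "__main__":
    site = {"s1": "A", "s2": "A", "s3": "C", "s4": "G"}
    print(pruning_likelihood(TREE, site), brute_force_likelihood(TREE, site))
```

Both functions should return the same value up to floating-point error. The brute-force sum grows as 4^k with the number of internal nodes k, whereas the pruning recursion visits each branch once per site. For a full alignment, the per-site log-likelihoods would be summed, and branch lengths would be optimized rather than fixed as they are here.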

## Copyright information

© Springer Science+Business Media LLC 2018