Maximum Likelihood in Molecular Phylogenetics

Bioinformatics and the Cell

Abstract

Maximum likelihood (ML) methods remain the gold standard in molecular phylogenetics. The calculation of likelihood, given a topology and a substitution model, is illustrated with both a brute-force approach and the pruning algorithm, which is the most fundamental algorithm in likelihood calculation and is itself a dynamic programming algorithm. The likelihood calculation is presented separately without and with a molecular clock. While ML is the most robust of all methods in molecular phylogenetics, it may suffer from bias when handling missing data coupled with rate heterogeneity over sites.




Postscript


We have covered the maximum likelihood framework in molecular phylogenetics in depth, but this book does not cover the Bayesian approach, which extends the likelihood framework to incorporate prior knowledge. The Bayesian framework can not only help us with molecular phylogenetics but also reduce our tendency to develop prejudice and social bias.

Suppose we live in a multiracial society and need to decide whom our family should interact with. We would implicitly want to estimate the proportion of good people (Pgood) in a race (or an ethnic group), with “good people” defined as those with whom we have had pleasant experiences. Naturally one wants to interact with people in a race whose Pgood is high and avoid people in a race whose Pgood is low.

Now suppose we have interacted with a small number of people, say three, in one race and our experiences are all bad. The maximum likelihood estimate of Pgood is then 0, because it is based on the data only (zero good experiences out of three). If we take this estimated Pgood seriously in spite of the small sample size of three, then we become racists.

With the Bayesian approach, we would first form a prior for Pgood before any interaction with people of different races. If we are fair-minded, our prior for Pgood will be the same for all races to start with. If we are unfortunate enough to have a bad experience with a member of one race, we would reduce Pgood for that race a bit. If our second encounter with people of this race is also bad, then we reduce Pgood still further for that race. Eventually these different Pgood values for different races constitute our private model of racial differences, and the model, right or wrong, will affect our behavior.
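The contrast between the two estimates can be made concrete with a small sketch. It assumes a Beta prior on Pgood, which is a common textbook choice and not one specified in the text; the function names and the default uniform Beta(1, 1) prior are hypothetical and chosen only for illustration. Under that assumption, the posterior mean after observing some good encounters out of a total is (prior_good + good) / (prior_good + prior_bad + total), whereas the ML estimate is simply good / total.

```python
from fractions import Fraction


def ml_estimate(good, total):
    """Maximum likelihood estimate of Pgood: the observed proportion."""
    return Fraction(good, total)


def posterior_mean(good, total, prior_good=1, prior_bad=1):
    """Posterior mean of Pgood under an assumed Beta(prior_good, prior_bad) prior.

    The Beta prior is an illustrative assumption; Beta(1, 1) is uniform,
    giving a prior mean of 1/2 before any encounter.
    """
    return Fraction(prior_good + good, prior_good + prior_bad + total)


# The scenario in the text: three encounters, all bad.
outcomes = [0, 0, 0]  # 1 = pleasant encounter, 0 = unpleasant encounter

print("Prior mean:", posterior_mean(0, 0))
print("ML estimate after 3 bad encounters:",
      ml_estimate(sum(outcomes), len(outcomes)))

good = 0
for n, outcome in enumerate(outcomes, start=1):
    good += outcome
    print(f"Posterior mean after {n} encounter(s):", posterior_mean(good, n))
```

With these assumed settings the posterior mean falls step by step from 1/2 to 1/3, 1/4 and 1/5 but never reaches the ML estimate of 0, which mirrors the gradual reduction of Pgood described above.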

The model of racial differences thus developed in our mind may be quite different from the models in other people’s minds, because different people often interact with different samples from different races. Because few of us could claim to have a representative sample of people to interact with, Pgood is almost always biased. However, it may not be as biased as what one gets from a likelihood framework.

In this context of unrepresentative samples from different races, racism, like other kinds of prejudice, is almost inevitable. What is important to keep in mind is that much of the difference in Pgood among races or ethnic groups is due to historical differences in racial environment. If a little boy is driven by poverty to steal a loaf of bread for his sick and hungry mother, then it is the ruler of the society, not the boy, who is bad. May the joint effort of mankind lead to a monotonic increase in Pgood in all races.


Copyright information

© 2018 Springer Science+Business Media LLC

About this chapter


Cite this chapter

Xia, X. (2018). Maximum Likelihood in Molecular Phylogenetics. In: Bioinformatics and the Cell. Springer, Cham. https://doi.org/10.1007/978-3-319-90684-3_16
