Selection of Models of DNA Evolution with jModelTest

Posada, David

doi:10.1007/978-1-59745-251-9_5

Part of the book series: Methods in Molecular Biology (MIMB, volume 537)

Abstract

jModelTest is a bioinformatic tool for choosing among different models of nucleotide substitution. The program implements five model selection strategies: hierarchical and dynamical likelihood ratio tests (hLRT and dLRT), Akaike and Bayesian information criteria (AIC and BIC), and a performance-based decision theory method (DT). The output includes estimates of model selection uncertainty, parameter importances, and model-averaged parameter estimates, including model-averaged phylogenies. jModelTest is a Java program that runs under Mac OS X, Windows, and Unix systems with a Java Runtime Environment installed, and it can be freely downloaded from http://darwin.uvigo.es.
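
As a rough illustration of the information criteria named in the abstract, the following Python sketch computes AIC and BIC scores and Akaike weights from a set of log-likelihoods. It is not jModelTest code: the model names are standard, but the log-likelihood values, the parameter counts (substitution parameters only, ignoring branch lengths), and the use of alignment length as the BIC sample size are all assumptions made for this example.

```python
import math

def aic(lnL, k):
    """Akaike information criterion: 2k - 2 lnL."""
    return 2 * k - 2 * lnL

def bic(lnL, k, n):
    """Bayesian information criterion: k ln(n) - 2 lnL."""
    return k * math.log(n) - 2 * lnL

def akaike_weights(scores):
    """Turn a dict of AIC scores into normalized Akaike weights."""
    best = min(scores.values())
    rel = {m: math.exp(-0.5 * (s - best)) for m, s in scores.items()}
    total = sum(rel.values())
    return {m: r / total for m, r in rel.items()}

# Hypothetical (log-likelihood, free parameters) per candidate model.
candidates = {
    "JC": (-2500.0, 0),
    "HKY": (-2450.0, 4),
    "GTR": (-2447.0, 8),
}
n_sites = 1000  # assumed sample size: alignment length

aic_scores = {m: aic(lnL, k) for m, (lnL, k) in candidates.items()}
bic_scores = {m: bic(lnL, k, n_sites) for m, (lnL, k) in candidates.items()}
weights = akaike_weights(aic_scores)

# The model with the lowest score is preferred under each criterion.
best_aic = min(aic_scores, key=aic_scores.get)
best_bic = min(bic_scores, key=bic_scores.get)

for m in candidates:
    print(f"{m}: AIC={aic_scores[m]:.2f} BIC={bic_scores[m]:.2f} w={weights[m]:.3f}")
print("best by AIC:", best_aic, "| best by BIC:", best_bic)
```

The weights sum to one and can be read as the relative support for each candidate model, which is the kind of quantity that underlies model selection uncertainty and model-averaged estimates.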

Author information

Authors and Affiliations

Departamento de Genética, Bioquímica e Inmunología, Facultad de Biología, Universidad de Vigo, Vigo, Spain
David Posada

Editor information

Editors and Affiliations

Dept. Bioquímica, Genética e Inmunología, Universidad de Vigo, Vigo, 36310, Spain
David Posada

Cite this protocol

Posada, D. (2009). Selection of Models of DNA Evolution with jModelTest. In: Posada, D. (ed.) Bioinformatics for DNA Sequence Analysis. Methods in Molecular Biology, vol 537. Humana Press. https://doi.org/10.1007/978-1-59745-251-9_5

DOI: https://doi.org/10.1007/978-1-59745-251-9_5
Published: 28 February 2009
Publisher Name: Humana Press
Print ISBN: 978-1-58829-910-9
Online ISBN: 978-1-59745-251-9
eBook Packages: Springer Protocols
