Abstract
jModelTest is a bioinformatic tool for choosing among different models of nucleotide substitution. The program implements five model selection strategies: hierarchical and dynamical likelihood ratio tests (hLRT and dLRT), Akaike and Bayesian information criteria (AIC and BIC), and a performance-based decision theory method (DT). The output includes estimates of model selection uncertainty, parameter importances, and model-averaged parameter estimates, including model-averaged phylogenies. jModelTest is a Java program that runs under Mac OS X, Windows, and Unix systems with a Java Runtime Environment installed, and it can be freely downloaded from http://darwin.uvigo.es.
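As a rough illustration of the information criteria and model-averaged quantities mentioned in the abstract, the following Python sketch computes AIC and BIC scores, the corresponding model weights, a parameter importance, and a model-averaged estimate from the standard textbook formulas. It is not jModelTest's code: the function names, log-likelihoods, parameter counts, sample size, and parameter estimates are all hypothetical values chosen for illustration.

```python
import math

def aic(log_likelihood, num_params):
    # AIC = -2 lnL + 2K
    return -2.0 * log_likelihood + 2.0 * num_params

def bic(log_likelihood, num_params, sample_size):
    # BIC = -2 lnL + K ln(n)
    return -2.0 * log_likelihood + num_params * math.log(sample_size)

def model_weights(scores):
    # Weight of model i is exp(-delta_i / 2), normalised over all models,
    # where delta_i is the score difference from the best (lowest) score.
    best = min(scores.values())
    rel = {m: math.exp(-0.5 * (s - best)) for m, s in scores.items()}
    total = sum(rel.values())
    return {m: r / total for m, r in rel.items()}

def parameter_importance(weights, models_with_param):
    # Sum of the weights of the models that include the parameter.
    return sum(weights[m] for m in models_with_param)

def model_averaged_estimate(weights, estimates):
    # Weighted mean of the estimates, over the models that include the parameter.
    total = sum(weights[m] for m in estimates)
    return sum(weights[m] * estimates[m] for m in estimates) / total

# Hypothetical fits: (log-likelihood, number of free parameters).
fits = {
    "JC": (-3200.0, 0),
    "HKY": (-3105.0, 4),
    "HKY+G": (-3080.5, 5),
    "GTR+G": (-3078.0, 9),
}
sample_size = 1000  # assumption: alignment length used as sample size

aic_scores = {m: aic(lnl, k) for m, (lnl, k) in fits.items()}
bic_scores = {m: bic(lnl, k, sample_size) for m, (lnl, k) in fits.items()}
aic_weights = model_weights(aic_scores)

# Hypothetical gamma shape estimates from the two models that include +G.
alpha_estimates = {"HKY+G": 0.52, "GTR+G": 0.48}
alpha_importance = parameter_importance(aic_weights, alpha_estimates)
alpha_averaged = model_averaged_estimate(aic_weights, alpha_estimates)

best_aic = min(aic_scores, key=aic_scores.get)
best_bic = min(bic_scores, key=bic_scores.get)
```

In this sketch the weights sum to one across the candidate set, so they can be read as the relative support for each model, and the spread of the weights gives a simple picture of model selection uncertainty.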
Copyright information
© 2009 Humana Press, a part of Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Posada, D. (2009). Selection of Models of DNA Evolution with jModelTest. In: Posada, D. (ed.) Bioinformatics for DNA Sequence Analysis. Methods in Molecular Biology, vol 537. Humana Press. https://doi.org/10.1007/978-1-59745-251-9_5
DOI: https://doi.org/10.1007/978-1-59745-251-9_5
Publisher Name: Humana Press
Print ISBN: 978-1-58829-910-9
Online ISBN: 978-1-59745-251-9
eBook Packages: Springer Protocols