Phylogenetic Approaches to Molecular Epidemiology

Abstract

Phylogenies, diagrams of branching patterns that represent the estimated evolutionary histories of organisms or their parts (Crandall, 2001), have become essential tools in the study of the molecular epidemiology of disease agents. While the idea of using phylogenetic approaches to study epidemiology is not new (Harvey et al., 1996; Harvey and Nee, 1994), this book is a testament to the extraordinary information that can be obtained through phylogenetic analysis of the etiological agents of disease. A prime example of the trouble encountered when the phylogenetic approach is ignored comes from the outbreak of the West Nile Virus in New York City. This virus was responsible for multiple deaths in New York, yet the Centers for Disease Control and Prevention (CDC) initially misidentified the causative agent as St. Louis encephalitis virus because no appropriate phylogenetic comparison had been made (Enserink, 1999). Questions about the origins, spread, and diversity of pathogens are clearly evolutionary questions. Only after the serological evidence was coupled with strong phylogenetic evidence was the etiological agent responsible for the encephalitis outbreak in New York correctly identified as the West Nile Virus (Lanciotti et al., 1999). Likewise, other chapters in this book provide extensive examples of the insights obtained through phylogenetic thinking.

References

  • Altschul S.F., Gish W., Miller W., Myers E., and Lipman D.J. 1990. Basic local alignment search tool. J Mol Biol 215:403–410.

  • Altschul S.F., Madden T.L., Schaffer A.A., Zhang J., Zhang Z. et al. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402.

  • Bandelt H.-J. and Dress A.W.M. 1992. Split decomposition: A new and useful approach to phylogenetic analysis of distance data. Mol Phylogen Evol 1:242–252.

  • Bart A., Barnabé C., Achtman M., Dankert J., van der Ende A. et al. 2001. The population structure of Neisseria meningitidis serogroup A fits the predictions for clonality. Infect Gen Evol 1:117–122.

  • Brauer M.J., Holder M.T., Dries L.A., Zwickl D.J., Lewis P.O. et al. 2002. Genetic algorithms and parallel processing in maximum-likelihood phylogeny inference. Mol Biol Evol: in press.

  • Brown C.J., Garner E.C., Dunker A.K., and Joyce P. 2001. The power to detect recombination using the coalescent. Mol Biol Evol 18:1421–1424.

  • Bush R.M., Bender C.A., Subbarao K., Cox N.J., and Fitch W.M. 1999. Predicting the evolution of human influenza A. Science 286: 1921–1925.

  • Cavalli-Sforza L.L. and Edwards A.W.F. 1967. Phylogenetic analysis: models and estimation procedures. Evolution 21:550–570.

  • Crandall K.A. 2001. Phylogeny. In Encyclopedia of Genetics, p. 1465–1466, Brenner S. and Miller J.H., eds. Academic Press, London.

  • Crandall K.A., Kelsey C.R., Imamichi H., and Salzman N.P. 1999a. Parallel evolution of drug resistance in HIV: failure of nonsynonymous/synonymous substitution rate ratio to detect selection. Mol Biol Evol 16:372–382.

  • Crandall K.A. and Templeton A.R. 1999. Statistical methods for detecting recombination. In The Evolution of HIV, p. 153–176, Crandall K.A., ed. The Johns Hopkins University Press, Baltimore, MD.

  • Crandall K.A., Vasco D., Posada D., and Imamichi H. 1999b. Advances in understanding the evolution of HIV. AIDS 13:S39–S47.

  • Dorman K.S., Kaplan A.H., and Sinsheimer J.S. 2002. Bootstrap confidence levels for HIV-1 recombination. J Mol Evol 54:200–209.

  • Edwards A.W.F. 1996. The origin and early development of the method of minimum evolution for the reconstruction of phylogenetic trees. Syst Biol 45:79–91.

  • Edwards A.W.F. and Cavalli-Sforza L.L. 1964. Reconstruction of evolutionary trees. In Phenetic and phylogenetic classification, p. 67–76, McNeill J. ed. Systematics Association Publication, London.

  • Enserink M. 1999. Groups race to sequence and identify New York virus. Science 286:206–207.

  • Excoffier L. and Smouse P.E. 1994. Using allele frequencies and geographic subdivision to reconstruct gene trees within a species: Molecular variance parsimony. Genetics 136:343–359.

  • Falush D., Kraft C., Taylor N.S., Correa P., Fox J.G. et al. 2001. Recombination and mutation during long-term gastric colonization by Helicobacter pylori: Estimates of clock rates, recombination size, and minimal age. Proc Natl Acad Sci USA 98:15056–15061.

  • Feil E.J., Holmes E.C., Bessen D.E., Chan M.-S., Day N.P.J. et al. 2001. Recombination within natural populations of pathogenic bacteria: Short-term empirical estimates and long-term phylogenetic consequences. Proc Natl Acad Sci USA 98:182–187.

  • Feil E.J., Maiden M.C.J., Achtman M., and Spratt B.G. 1999. The relative contributions of recombination and mutation to the divergence of clones of Neisseria meningitidis. Mol Biol Evol 16:1496–1502.

  • Felsenstein J. 1981. Evolutionary trees from DNA sequences: A maximum likelihood approach. J Mol Evol 17:368–376.

  • Felsenstein J. 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39:783–791.

  • Fitch W., Brisse S., Stevens J., and Tibayrenc M. 2001. Infectious diseases and the golden age of phylogenetics: An E-debate. Infect Gen Evol 1:69–74.

  • Gibbs M.J., Armstrong J.S., and Gibbs A.J. 2001. Recombination in the hemagglutinin gene of the 1918 “Spanish Flu”. Science 293:1842–1845.

  • Giribet G. 2001. Exploring the behavior of POY, a program for direct optimization of molecular data. Cladistics 17:S60–S70.

  • Goldman N., Anderson J.P. and Rodrigo A.G. 2000. Likelihood-based tests of topologies in phylogenetics. Syst Biol 49:652–670.

  • Goldman N. and Yang Z. 1994. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol Biol Evol 11:725–736.

  • Graybeal A. 1998. Is it better to add taxa or characters to a difficult phylogenetic problem? Syst Biol 47:9–17.

  • Guttman D.S. and Dykhuizen D.E. 1994. Clonal divergence in Escherichia coli as a result of recombination, not mutation. Science 266:1380–1383.

  • Harvey P.H., Leigh Brown A.J., Maynard Smith J., and Nee S., eds. 1996. New Uses for New Phylogenies. Oxford University Press, Oxford, England.

  • Harvey P.H. and Nee S. 1994. Phylogenetic epidemiology lives. Trends Ecol Evol 9:361–363.

  • Hendy M.D. and Penny D. 1982. Branch and bound algorithms to determine minimal evolutionary trees. Math Biosci 59:277–290.

  • Hillis D.M. 1994. Homology in molecular biology. In Homology: The Hierarchical Basis of Comparative Biology, p. 339–368, Hall B.K., ed. Academic Press, Inc., New York.

  • Hillis D.M. 1998. Taxonomic sampling, phylogenetic accuracy, and investigator bias. Syst Biol 47:3–8.

  • Hillis D.M. 1999. Phylogenetics and the study of HIV. In The Evolution of HIV, Crandall K.A., ed. Johns Hopkins University Press, Baltimore, MD.

  • Hillis D.M. and Bull J.J. 1993. An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis. Syst Biol 42:182–192.

  • Huelsenbeck J.P. and Crandall K.A. 1997. Phylogeny estimation and hypothesis testing using maximum likelihood. Annu Rev Ecol Syst 28:437–466.

  • Huelsenbeck J.P., Rannala B., and Masly J.P. 2000. Accommodating phylogenetic uncertainty in evolutionary studies. Science 288:2349–2350.

  • Huelsenbeck J.P. and Ronquist F. 2001. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17:754–755.

  • Huelsenbeck J.P., Ronquist F., Nielsen R., and Bollback J.P. 2001. Bayesian inference of phylogeny and its impact on evolutionary biology. Science 294:2310–2314.

  • Jenkins G.M., Rambaut A., Pybus O.G., and Holmes E.C. 2002. Rates of molecular evolution in RNA viruses: A quantitative phylogenetic analysis. J Mol Evol 54:156–165.

  • Kelsey C.R., Crandall K.A. and Voevodin A.F. 1999. Different models, different trees: The geographic origin of PTLV-I. Mol Phylogen Evol 13:336–347.

  • Kim J. 1998. Large-scale phylogenies and measuring the performance of phylogenetic estimators. Syst Biol 47:43–60.

  • Kishino H. and Hasegawa M. 1989. Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in Hominoidea. J Mol Evol 29:170–179.

  • Korber B.T.M., Learn G., Mullins J.I., Hahn B.H., and Wolinsky S. 1995. Protecting HIV databases. Nature 378:242–243.

  • Lanciotti R.S., Roehrig J.T., Deubel V., Smith J., Parker M. et al. 1999. Origin of the West Nile Virus responsible for an outbreak of encephalitis in the Northeastern United States. Science 286:2333–2337.

  • Levin B.R., Lipsitch M., and Bonhoeffer S. 1999. Population biology, evolution, and infectious disease: convergence and synthesis. Science 283:806–809.

  • Lewis P.O. 1998. A genetic algorithm for maximum-likelihood phylogeny inference using nucleotide sequence data. Mol Biol Evol 15:277–283.

  • Maddison D.R. 1991. The discovery and importance of multiple islands of most-parsimonious trees. Syst Zool 40:315–328.

  • Maddison D.R. and Maddison W.P. 2000. MacClade 4: Analysis of Phylogeny and Character Evolution. Sinauer Associates, Sunderland, MA.

  • McClellan D.A. and McCracken K.G. 2001. Estimating the influence of selection on the variable amino acid sites of the cytochrome B protein functional domain. Mol Biol Evol 18:917–925.

  • Muse S. 1999. Modeling the molecular evolution of HIV sequences. In The Evolution of HIV, in press, Crandall K.A., ed. Johns Hopkins University Press, Baltimore, MD.

  • Muse S.V. and Gaut B.S. 1994. A likelihood approach for comparing synonymous and nonsynonymous nucleotide substitution rates, with application to the chloroplast genome. Mol Biol Evol 11:715–724.

  • Nei M. and Gojobori T. 1986. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol 3:418–426.

  • Nielsen R. and Yang Z. 1998. Likelihood models for detecting positively selected amino acid sites and applications to the HIV-1 envelope gene. Genetics 148:929–936.

  • Pedersen A.-M. K. and Jensen J.L. 2001. A dependent-rates model and an MCMC-based methodology for the maximum-likelihood analysis of sequences with overlapping reading frames. Mol Biol Evol 18:691–699.

  • Poe S. 1998. Sensitivity of phylogeny estimation to taxonomic sampling. Syst Biol 47:18–31.

  • Poe S. and Swofford D.L. 1999. Taxon sampling revisited. Nature 398:299–300.

  • Pollock D.D., Zwickl D.J., McGuire J.A., and Hillis D.M. 2002. Increased taxon sampling is advantageous for phylogenetic inference. Syst Biol: in press.

  • Posada D. 2001. The effect of branch length variation on the selection of models of molecular evolution. J Mol Evol 52:434–444.

  • Posada D. 2002. Evaluation of methods for detecting recombination from DNA sequences: Empirical data. Mol Biol Evol 19: in press.

  • Posada D. and Crandall K.A. 1998. Modeltest: Testing the model of DNA substitution. Bioinformatics 14:817–818.

  • Posada D. and Crandall K.A. 2001a. A comparison of different strategies for selecting models of DNA substitution. Syst Biol 50:580–601.

  • Posada D. and Crandall K.A. 2001b. Evaluation of methods for detecting recombination from DNA sequences: Computer simulations. Proc Natl Acad Sci USA 98:13757–13762.

  • Posada D. and Crandall K.A. 2001c. Intraspecific gene genealogies: trees grafting into networks. Trends Ecol Evol 16:37–45.

  • Posada D. and Crandall K.A. 2001d. Selecting models of nucleotide substitution: An application to Human Immunodeficiency Virus 1 (HIV-1). Mol Biol Evol 18:897–906.

  • Posada D. and Crandall K.A. 2002. The effect of recombination on the accuracy of phylogeny estimation. J Mol Evol 54:396–402.

  • Posada D., Crandall K.A., and Hillis D.M. 2001. Phylogenetics of HIV. In Computational and Evolutionary Analysis of HIV Molecular Sequences, p. 121–160, Rodrigo A.G. and Learn G.H. Jr., eds. Kluwer Academic Publishers, Dordrecht, The Netherlands.

  • Posada D., Crandall K.A., and Holmes E.C. 2002. Recombination in evolutionary genomics. Annu Rev Genet: in press.

  • Posada D., Crandall K.A., Nguyen M., Demma J.C., and Viscidi R.P. 2000. Population genetics of the porB gene of Neisseria gonorrhoeae. Mol Biol Evol:423–436.

  • Rambaut A. 2002. Se-Al: Sequence Alignment Editor. Department of Zoology, University of Oxford (http://evolve.zoo.ox.ac.uk).

  • Rich S.M., Sawyer S.A., and Barbour A.G. 2001. Antigen polymorphism in Borrelia hermsii, a clonal pathogenic bacterium. Proc Natl Acad Sci USA 98:15038–15043.

  • Robertson D.L., Hahn B.H., and Sharp P.M. 1995. Recombination in AIDS viruses. J Mol Evol 40:249–259.

  • Rosenberg M.S. and Kumar S. 2001. Incomplete taxon sampling is not a problem for phylogenetic inference. Proc Natl Acad Sci USA 98:10751–10756.

  • Rzhetsky A. and Nei M. 1992. A simple method for estimating and testing minimum-evolution trees. Mol Biol Evol 9:945–967.

  • Salter L.A. 2001. Complexity of the likelihood surface for a large DNA dataset. Syst Biol 50:970–978.

  • Sanderson M.J. and Wojciechowski M.F. 2000. Improved bootstrap confidence limits in large-scale phylogenies, with an example from Neo-Astragalus (Leguminosae). Syst Biol 49:671–685.

  • Schierup M.H. and Hein J. 2000. Consequences of recombination on traditional phylogenetic analysis. Genetics 156:879–891.

  • Sharp P.M. 1997. In search of molecular Darwinism. Nature 385:111–112.

  • Shimodaira H. and Hasegawa M. 1999. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol Biol Evol 16:1114–1116.

  • Strimmer K. and Moulton V. 2000. Likelihood analysis of phylogenetic networks using directed graphical methods. Mol Biol Evol 17:875–881.

  • Sullivan J., Swofford D.L., and Naylor G.J.P. 1999. The effect of taxon sampling on estimating rate heterogeneity parameters of maximum-likelihood models. Mol Biol Evol 16:1347–1356.

  • Swofford D.L. 2000. PAUP*: Phylogenetic Analysis Using Parsimony (*and Other Methods). Sinauer Associates, Sunderland, MA.

  • Swofford D.L., Olsen G.J., Waddell P.J., and Hillis D.M. 1996. Phylogenetic Inference. In Molecular Systematics, p. 407–514, Hillis D.M., Moritz C., and Mable B.K., eds. Sinauer Associates, Inc., Sunderland, MA.

  • Templeton A.R. 1983. Phylogenetic inference from restriction endonuclease cleavage site maps with particular reference to the evolution of humans and the apes. Evolution 37:221–244.

  • Templeton A.R. 1992. Human origins and analysis of mitochondrial DNA sequences. Science 255:737.

  • Templeton A.R., Crandall K.A., and Sing C.F. 1992. A cladistic analysis of phenotypic associations with haplotypes inferred from restriction endonuclease mapping and DNA sequence data. III. Cladogram estimation. Genetics 132:619–633.

  • Templeton A.R., Routman E., and Phillips C.A. 1995. Separating population structure from population history: a cladistic analysis of geographical distribution of mitochondrial DNA haplotypes in the tiger salamander, Ambystoma tigrinum. Genetics 140:767–782.

  • Thompson J.D., Gibson T.J., Plewniak F., Jeanmougin F., and Higgins D.G. 1997. The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res 25:4876–4882.

  • Wiuf C., Christensen T., and Hein J. 2001. A simulation study of the reliability of recombination detection methods. Mol Biol Evol: in press.

  • Woolley S., Johnson J., Smith M.J., Crandall K.A., and McClellan D.A. 2002. TreeSAAP: A phylogenetic approach to identifying selective influences on amino acid properties. Bioinformatics: submitted.

  • Yang Z. 1994. Estimating the pattern of nucleotide substitution. J Mol Evol 39:105–111.

  • Yang Z. 1996. Among-site rate variation and its impact on phylogenetic analyses. Trends Ecol Evol 11:367–372.

  • Yang Z. 1998. Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol Biol Evol 15:568–573.

  • Yang Z. 2001. PAML: Phylogenetic Analysis by Maximum Likelihood. University College London, London.

  • Yang Z. and Bielawski J.P. 2000. Statistical methods for detecting molecular adaptation. Trends Ecol Evol 15:496–503.

  • Yang Z. and Nielsen R. 1998. Synonymous and nonsynonymous rate variation in nuclear genes of mammals. J Mol Evol 46:409–418.

  • Yang Z. and Nielsen R. 2002. Codon-substitution models for detecting molecular adaptation at individual sites along specific lineages. Mol Biol Evol: in press.

  • Yang Z., Nielsen R., Goldman N., and Pedersen A.-M. K. 2000. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155:431–449.

  • Zanotto P.M., Kallas E.G., Souza R.F., and Holmes E.C. 1999. Genealogical evidence for positive selection in the nef gene of HIV-1. Genetics 153:1077–1089.

  • Zhang J. and Madden T.L. 1997. PowerBLAST: A new network BLAST application for interactive or automated sequence analysis and annotation. Genome Research 7:649–656.

  • Zharkikh A. and Li W.-H. 1995. Estimation of confidence in phylogeny: The complete-and-partial bootstrap technique. Mol Phylogen Evol 4:44–63.

Copyright information

© 2002 Springer Science+Business Media New York

About this chapter

Cite this chapter

Crandall, K.A., Posada, D. (2002). Phylogenetic Approaches to Molecular Epidemiology. In: Leitner, T. (ed.) The Molecular Epidemiology of Human Viruses. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-1157-1_3

  • DOI: https://doi.org/10.1007/978-1-4615-1157-1_3

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-5420-8

  • Online ISBN: 978-1-4615-1157-1

  • eBook Packages: Springer Book Archive
