Reconstructing the Ancestral Relationships Between Bacterial Pathogen Genomes

  • Protocol
Bacterial Pathogenesis

Part of the book series: Methods in Molecular Biology (MIMB, volume 1535)

Abstract

Following recent developments in DNA sequencing technology, it is now possible to sequence hundreds of whole genomes from bacterial isolates at relatively low cost. Analyzing this growing wealth of genomic data in terms of ancestral relationships can reveal many interesting aspects of the evolution, ecology, and epidemiology of bacterial pathogens. However, reconstructing the ancestry of a sample of bacteria remains challenging, especially for the majority of species where recombination is frequent. Here, we review and describe the computational techniques currently available to infer ancestral relationships, including phylogenetic methods that either ignore or account for the effect of recombination, as well as model-based and model-free phylogeny-independent approaches.

Author information

Corresponding authors

Correspondence to Caitlin Collins or Xavier Didelot.

Copyright information

© 2017 Springer Science+Business Media New York

About this protocol

Cite this protocol

Collins, C., Didelot, X. (2017). Reconstructing the Ancestral Relationships Between Bacterial Pathogen Genomes. In: Nordenfelt, P., Collin, M. (eds) Bacterial Pathogenesis. Methods in Molecular Biology, vol 1535. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-6673-8_8

  • DOI: https://doi.org/10.1007/978-1-4939-6673-8_8

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-6671-4

  • Online ISBN: 978-1-4939-6673-8

  • eBook Packages: Springer Protocols
