Efficient Generation of Uniform Samples from Phylogenetic Trees

  • Paul Kearney
  • J. Ian Munro
  • Derek Phillips
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2812)


In this paper, we introduce new algorithms for selecting taxon (leaf) samples from large phylogenetic trees, uniformly at random, under certain biologically relevant constraints on the taxa. All the algorithms run in polynomial time and have been implemented.

The algorithms have direct applications to the evaluation of phylogenetic tree and supertree construction methods using biologically curated data.

We also relate one of the sampling problems to the well-known clique problem on undirected graphs. From this, we obtain an interesting new class of graphs for which many open problems exist.


Keywords: Phylogenetic Tree, Valid Sample, Sampling Problem, Chordal Graph, Phylogeny Reconstruction
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Paul Kearney (1)
  • J. Ian Munro (2)
  • Derek Phillips (2)

  1. Caprion Pharmaceuticals, Montréal, Canada
  2. School of Computer Science, University of Waterloo, Waterloo, Canada
