How to validate phylogenetic trees? A stepwise procedure

  • Conference paper
Data Science, Classification, and Related Methods

Summary

In this paper, I review some of the methods and tests currently available to validate trees, focussing on phylogenetic trees (dendrograms and cladograms). I first present some of the more commonly used techniques for comparing a tree with the data from which it is derived (internal validation), or for comparing a tree with one or more other trees (external validation). I also discuss some of the advantages of performing combined (total evidence) versus separate (consensus) analyses of independent data sets for validation purposes. A stepwise validation procedure defined across all levels of comparison is introduced, along with a corresponding statistical test: a phylogeny will be said to be globally validated only if it satisfies all the tests. An application to the phylogeny of kangaroos is presented to illustrate the stepwise procedure.


References

  • Adams, F. N., III. (1972): Consensus techniques and the comparison of taxonomic trees Systematic Zoology, 21, 390–397.

    Google Scholar 

  • Alroy. J. (1994): Four permutation tests for the presence of phytogenetic structure, Systematic Biology, 43, 430–437.

    Google Scholar 

  • Anderbcrg, A. and Tehler, A. (1990): Consensus trees, a necessity in taxonomic practice, Cladistics, 6, 399–402.

    Article  Google Scholar 

  • Archie, J. W. (1989a): A randomization test for phytogenetic information in systematic data Systematic Zoology, 38, 219–252.

    Google Scholar 

  • Archie, J. W. (1989b): Homoplasy excess ratios: New indices for measuring levels of homoplasy in phytogenetic systematics and a critique of the consistency index, Systematic Zoology, 38, 253–269.

    Article  Google Scholar 

  • Archie, J. W. (1989c): Phylogenies of plant families: A demonstration of phylogenetic randomness in DNA sequence data derived from proteins, Evolution, 43, 1796–1800.

    Article  Google Scholar 

  • Archie, J. W. (1990): Homoplasy excess statistics and retention indices: A reply to Farris Systematic Zoology, 39, 169–174.

    Google Scholar 

  • Archie, J. W. and Felsenstein, J. (1993): The number of evolutionary steps on random and minimum lengths trees for random evolutionary data. Theoretical Population Biology, 43, 52–79.

    Article  MATH  Google Scholar 

  • Bandelt. H. J. (1995): Combination of data in phylogenetic analysis Plant Systematics and Evolution Supplementum 9, 355–361.

    Google Scholar 

  • Barrett, M. et al. (1991): Against consensus Systematic Zoology 40, 486–493.

    Google Scholar 

  • Barrett, M. et al. (1993): Crusade’? A response to Nelson Systematic Biology 42, 216–217.

    Google Scholar 

  • Barthélemy, J.-P. and McMorris, F. R. (1986): The median procedure for n-trees Journal of Classification 3, 329–334.

    Google Scholar 

  • Baum, B. R. (1992): Combining trees as a way of combining data for phylogenetic inference, and the desirability of combining gene trees, Taxon, 41, 3–10.

    Article  Google Scholar 

  • Baum, B. R. and Ragan, M. A. (1993): Reply to A. G. Rodrigo’s “A comment on Baum’s method for combining phylogenetic trees, Taxon, 42, 637–640.

    Article  Google Scholar 

  • Baverstock, P. R. et al. (1989): Albumin immunologic relationships of the Macropodidae (Marsupialia) Systematic Zoology 38, 38–50.

    Google Scholar 

  • Berry, V. and Gascuel, O. (1996): On the interpretation of bootstrap trees: Appropriate threshold of clade selection and induced gain, Molecular Biology and Evolution, 13, 999–1011.

    Article  Google Scholar 

  • Bledsoe, A. H. and Raikow, R. J. (1990): A quantitative assessment of congruence between molecular and nonmolecular estimates of phylogeny, Journal of Molecular Evolution, 30, 247–259.

    Article  Google Scholar 

  • Bleiweiss, R. et al. (1994): DNA-DNA hybridization-based phylogeny of “higher nonpasserines: Reevaluating a key portion of the avian family tree, Molecular Phylogenetics and Evolution, 3, 248–255.

    Article  Google Scholar 

  • Bock, II. H. (1985): On some significance tests in cluster analysis, Journal of Classification, 2, 77–108. Bosibud, H. M. and Bosibud, L. E. (1972): A metric for classifications, Taxon, 21, 607–613.

    Google Scholar 

  • Bourque, M. (1978): Arbres de Steiner et réseaux dont varie l’emplacement de certains sommets. Ph. D. Thesis, Département d’Informatique et de Recherche Operatiouelle, Unversité de Montréal, Montréal.

    Google Scholar 

  • Bremer, K. (1990): Combinable component consensus, Cladistics, 6, 369–372. Bremer, K. (1995): Branch support and tree stability, Cladistics, 10, 295–304. Brown, J. K. M. (1994): Probabilities of evolutionary trees, Systematic Biology, 43, 78–91.

    Google Scholar 

  • Bryant, H. N. (1992): The role of permutation tail probability tests in phylogenetic systematics Systematic Biology 41, 258–263.

    Google Scholar 

  • Bull, J. J. et al. (1993): Partitioning and combining data in phylogenetic analysis, Systematic Biology, 42, 384–397.

    Google Scholar 

  • Buneman, P. (1971): The recovery of trees from measures of dissimilarity. In: Mathematics in Archeological and Historical Sciences, Hodson, F. R. et al. (eds.), 387–395, Edinburgh University Press, Edinburgh.

    Google Scholar 

  • Buneman, P. (1974): A note on the metric properties of trees, Journal of Combinatorial Theory (B), 17, 48–50.

    Article  MathSciNet  MATH  Google Scholar 

  • Carpenter, J. M. (1992): Random cladistics Cladistics 8, 147–153.

    Google Scholar 

  • Carter, M. et al. (1990): On the distribution of lengths of evolutionary trees SIAM Journal of Discrete ai’lathematics 3, 38–47.

    Google Scholar 

  • Chìppindale, P. T. and Wiens, J. J. (1994): Weighting, partitioning, and combining characters in phylogenetic analysis, Systematic Biology, 43, 278–287.

    Google Scholar 

  • Colless, D. H. (1980): Congruence between morphometric and allozyme data for Menidia species: A reappraisal Systematic Zoology 29, 288–299 .

    Google Scholar 

  • Critchlow, D. E. et al. (1996): The triples distance for rooted bifurcating phylogenetic trees Systematic Biology 45, 323–334.

    Google Scholar 

  • Cucumel, G. and Lapointe, F.-J. (1997): Un test de la pertinence du consensus par une méthode de permutations. In: Actes des XXIXe journées de statistique 299–300, Carcassonne.

    Google Scholar 

  • Davis, J. I. (1993): Character removal as a means for assessing stability of clades, Cladistics, 9, 201–210.

    Article  Google Scholar 

  • Day, W. H. E. (1983a): The role of complexity in comparing classifications, Mathematical Biosciences, 66, 97–114.

    Article  MathSciNet  MATH  Google Scholar 

  • Day, W. H. E. (1983b): Distributions of distances between pairs of classifications. In: Numerical Taxonomy Felsenstein, J. (ed.), 127–131, Springer-Verlag, Berlin.

    Google Scholar 

  • Day, W. H. E. (1983c): Computationally difficult parsimony problems in phylogenetic systematics Journal of Theoretical Biology 103, 429–438.

    Google Scholar 

  • Day, W. H. E. (1986): Analysis of quartet dissimilarity measures between undirected phylogenetic trees Systematic Zoology 35, 325–333.

    Google Scholar 

  • Day, W. H. E. (1987): Computational complexity of inferring phylogenies from dissimilarity matrices Bulletin of Mathematical Biology 49, 461–467.

    Google Scholar 

  • Day, W. H. E. and McMorris, F. R. (1985): A formalization of consensus index methods Bulletin of Mathematical Biology 47, 215–229.

    Google Scholar 

  • de Queiroz, A. (1993): For consensus (sometimes) Systematic Biology 42, 368–372.

    Google Scholar 

  • de Queiroz, A. et al. (1995): Separate versus combined analysis of phylogenetic evidence Annual Review of Ecology and Systematics 26, 657–681.

    Google Scholar 

  • Dopazo, J. (1994): Estimating errors and confidence intervals for branch lengths in phylogenetic tres by a bootstrap approach. Journal of Molecular Evolution, 38, 300–304.

    Article  Google Scholar 

  • Dubes, R. and Jain, A. K. (1979): Validity studies in clustering methodologies, Pattern Recognition, 11, 235–254.

    Article  MATH  Google Scholar 

  • Dwass, M. (1957): Modified randomization tests for nonparametric hypotheses Annals of Mathematics and Statistics 28, 181–187.

    Google Scholar 

  • Edgington, E. S. (1995): Randomization tests, 3rd Edition, Revised and Expanded. Marcel Dekker, New York.

    Google Scholar 

  • Eernisse, D. J. and Kluge, A. G. (1993): Taxonomic congruence versus total evidence, and the phylogeny of amniotes inferred from fossils, molecules and morphology, Molecular Biology and Evolution, 10, 1170–1195.

    Google Scholar 

  • Efron, B. (1979): Bootstrapping methods: Another look at the jackknife Annals of Statistics 7, 1–26.

    Google Scholar 

  • Efron, B. and Gong, G. (1983): A leisurely look at the bootstrap, the jackknife, and cross-validation American Statistician 37, 36–48.

    Google Scholar 

  • Efron, B. and Tibshirani, R. J. (1993): An introduction to the bootstrap, Chapman and Hall, New York.

    Google Scholar 

  • Efron, B. et al. (1996): Bootstrap confidence levels for phylogenetic trees Proceedings of the National Academy of Sciences USA93, 13429–13434.

    Google Scholar 

  • Estabrook, G. F. (1992): Evaluating undirected positional congruence of individual taxa between two estimates of the phylogenetic tree for a group of taxa, Systematic Biology, 41, 172–177.

    Google Scholar 

  • Estabrook, G. F. et al. (1985): Comparison of undirected phylogenetic trees based ou subtrees of four evolutionary units, Systematic Zoology, 34, 193–200.

    Article  Google Scholar 

  • Faith, D. P. (1991): Cladistic permutation tests for monophyly and nonmonophyly, Systematic Zoology, 40, 366–375.

    Article  Google Scholar 

  • Faith, D. P. (1992): Ou corroboration: A reply to Carpenter Cladistics 8, 265–273.

    Google Scholar 

  • Faith, D. P. and Ballard, J. W. O. (1994): Length differences topology-dependent tests: A response to Källersjö et al, Cladistics, 10, 57–64.

    Article  Google Scholar 

  • Faith, D. P. and Belbin, L. (1986): Comparison of classifications using measures intermediate between metric dissimilarity and consensus similarity, Journal of Classification, 3, 257–280.

    Article  MATH  Google Scholar 

  • Faith, D. P. and Cranston, P. S. (1991): Could a cladogram this short have arisen by chance alone? on permutation tests for cladistic structure, Cladistics, 71–28.

    Article  Google Scholar 

  • Faith, D. P. and Trueman, J. W. H. (1996): When the topology-dependent permutation test (T-PTP) for monophyly returns significant support for monophyly, should that be equated with (a) rejecting a null hypothesis of nonmonophyly, (b) rejecting a null hypothesis of “no structure,” (c) failing to falsify a hypothesis of monophyly, or (d) none of the above? Systematic Biology, 45, 580–586.

    Article  Google Scholar 

  • Farris, J. S. (1989a): The retention index and the resealed consistency index, Cladistics, 5, 417–419. Farris, J. S. (1989b): The retention index and homoplasy excess, Systematic Zoology, 38, 406–407. Farris, J. S. (1991): Excess homoplasy ratios, Cladistics, 7,81–91.

    Article  Google Scholar 

  • Farris, J. S. et al. (1995a): Constructing a significance test for incongruence Systematic Biology44, 570572.

    Google Scholar 

  • Farris, J. S. et al. (1995b): Testing significance of incongruencies, Cladistics, 10, 315–370. Felsenstein, J. (1978): The number of evolutionary trees, Systematic Zoology, 27, 27–33.

    Google Scholar 

  • Felsenstein, J. (1985): Confidence limits on phylogenies: An approach using the bootstrap, Evolution, 39, 783–791.

    Article  Google Scholar 

  • Felsenstein, J. (1993): PHYLIP: Phylogeny inference package, version 3.5c, distributed by the author, University of Washington, Seattle.

    Google Scholar 

  • Felsenstein, J. and Kishino, H. (1993): Is there something wrong with the bootstrap on phylogenies? A reply to Hillis and Bull, Systematic Biology, 42, 193–200.

    Google Scholar 

  • Finden, C. R. and Gordon, A. D. (1985): Obtaining common pruned trees Journal of Classification 2, 225–276.

    Google Scholar 

  • Fowlkes, E. B. and Mallows, C. L. (1983): A method for comparing two hierarchical clusterings, Journal.

    Google Scholar 

  • of the American Statistical Association,78, 553–569.

    Google Scholar 

  • Pumas, G. W. (1984): The generation of random, binary unordered trees Journal of Classification 1 187–233.

    Google Scholar 

  • Goloboff, P. (1991a): Homoplasy and the choice among cladograms,•Cladistics, 7, 215–232. Goloboff, P. (1991b): Random data, homoplasy and information, Cladistics,7 395–406.

    Article  Google Scholar 

  • Gordon, A. D. (1986): Consensus supertrees: the synthesis of rooted trees containing overlapping sets of labeled leaves, Journal of Classification, 3, 335–348.

    Article  MathSciNet  MATH  Google Scholar 

  • Gordon, A. D. (1987): A review of hierarchical classifications Journal of the Royal Statistical Society (A)150, 119–137.

    Google Scholar 

  • Gower, J. C. (1983): Comparing classifications. In: Numerical Taxonomy, Felsenstein, J. (ed.), 137–155, Springer-Verlag, Berlin.

    Chapter  Google Scholar 

  • Graham, R. L. and Foulds, L. R. (1982): Unlikelihood that minimal phylogenies for a realistic biological study can be constructed in reasonable computational time, Mathematical Biosciences, 60, 133–142.

    Google Scholar 

  • Hall, P. and Martin, M. A. (1988): On bootstrap resampling and iterations Biometrika 75, 661–671.

    Google Scholar 

  • Harding, E. F. (1971): The probabilities of rooted tree-shapes generated by random bifurcations Advances in Applied Probability 4, 44–77.

    Google Scholar 

  • -Iarshman, J. (1994): The effect of irrelevant characters on bootstrap values, Systematic Biology, 43, 419–424.

    Google Scholar 

  • Hartigan, J. A. (1967): Representation of similarity matrices by trees Journal of the American Statistical Association 62, 1140–1158.

    Google Scholar 

  • Hedges, S. B. (1992): The number of replications needed for accurate estimation of the bootstrap P value in phylogenetic studies, Molecular Biology and Evolution, 9, 366–369.

    Google Scholar 

  • Hendy, M. D. et al. (1984): Comparing trees with pendant vertices labelled SIAM Journal in Applied Mathematics 44, 1054–1065.

    Google Scholar 

  • Hillis, D. M. (1987): Molecular versus morphological approaches to systematics Annual Review of Ecology and Systematics 18, 23–42.

    Google Scholar 

  • Hillis, D. M. (1991): Discriminatin g between phylogenetic signal and random noise in DNA sequences, In: Phylogenetic analysis of DNA sequences, Miyamoto, M. M. and Cracraft, J. (eds.), 278–294, Oxford University Press, New York.

    Google Scholar 

  • Hillis, D. M. (1995): Approaches for assessing phylogenetic accuracy Systematic Biology 44, 3–16.

    Google Scholar 

  • Hillis, D. M. and Bull, J. J. (1993): An empirical test of bootstrapping as a method for assessing confidence in phylogenetic analysis, Systematic Biology, 42, 182–192.

    Google Scholar 

  • Hubert, L. J. and Baker, F. B. (1977): The comparison and fitting of given classification schemes Journal of Mathematical Psychology 16, 233–253.

    Google Scholar 

  • uelsenbeck, J. P. (1995): Performance of phylogenetic methods in simulation, Systematic Biology, 44, 17–48.

    Google Scholar 

  • Huelsenbeck, J. P. and Bull, J. J. (1996): A likelihood ratio test for detection of conflicting phylogenetic signal, Systematic Biology, 45, 92–98.

    Article  Google Scholar 

  • Huelsenbeck, J. P. et al. (1994): Is character weighting a panacea for the problem of data heterogeneity in phylogenetic analysis?, Systematic Biology, 43, 288–291.

    Google Scholar 

  • Huelsenbeck, J. P. et al. (1995): Parametric bootstrapping in molecular phylogenetics: Applications and performance, In: Molecular Zoology: Strategies and Protocols, Ferraris, J and Palumbi, S. (eds.), Wiley, New York.

    Google Scholar 

  • Huelsenbeck, J. P. et al. (1996): Combining data in phylogenetic analysis, Trends in Ecology and Evolution, 11, 152–158.

    Article  Google Scholar 

  • Jardine, C. J. et al. (1967): The structure and construction of taxonomic hierarchies Mathematical Biosciences 1, 173–179.

    Google Scholar 

  • Källersjö, M. et al. (1992): Skewness and permutation Cladistics8, 275–287.

    Google Scholar 

  • Kim, J. (1993): Improving the accuracy of phylogenetic estimation by combining different methods, Systematic Biology, 42, 331–340.

    Google Scholar 

  • Kirsch, J. A. W. et al. (1995): Resolution of portions of the kangaroo phylogeny (Marsupialia: Macropodidae) using DNA hybridization Biological Journal of the Linnean Society 55, 309–328.

    Google Scholar 

  • Kirsch, J. A. W. et al. (1997): DNA-hybridisation studies of marsupials and their implications for metatherian classification. Australian Journal of Zoology, in press.

    Google Scholar 

  • Klassen, G. J. et al. (1991): Consistency indices and random data Systematic Zoology 40, 446–457.

    Google Scholar 

  • Kluge, A. G. (1989): A concern for evidence and a phylogenetic hypothesis of relationships among Epicrates (Boidae, Serpentes) Systematic Biology 38, 7–25.

    Google Scholar 

  • Kluge, A. G. and Farris, J. S. (1969): Quantitative phyletics and the evolution of anurans Systematic Zoology 18, 1–32.

    Google Scholar 

  • Krajewski, C. and Dickerman, A. W. (1990): Bootstrap analysis of phylogenetic trees derived from DNA hybridization matrices, Systematic Zoology, 39, 383–390.

    Article  Google Scholar 

  • Lanyon, S. (1985): Detecting internal inconsistencies in distance data Systematic Zoology 34, 397–403.

    Google Scholar 

  • Lanyon, S. (1993): Phylogenetic frameworks: Towards a firmer foundation for the comparative approach Biological Journal of the Linnean Society 49, 45–61.

    Google Scholar 

  • Lapointe, F.-J. and Cucumel, G. (1997): The average consensus procedure: combination of weighted trees containing identical or overlapping sets of objects, Systematic Biology, 46, 306–312.

    Article  Google Scholar 

  • Lapointe, F.-J. and Legendre, P. (1990): A statistical framework to test the consensus of two nested classifications, Systematic Zoology, 39, 1–13.

    Article  Google Scholar 

  • Lapointe, F.-J. and Legendre, P. (1991): The generation of random ultrametric matrices representing dendrograms, Journal of Classification, 8, 177–200.

    Article  Google Scholar 

  • Lapointe, F.-J. and Legendre, P. (1992a): A statistical framework to test the consensus among additive trees (cladograms), Systematic Biology, 41, 158–171.

    Google Scholar 

  • Lapointe, F.-J. and Legendre, P. (1992b): Statistical significance of the matrix correlation coefficient for comparing independent phylogenetic trees, Systematic Biology, 41, 378–384.

    Google Scholar 

  • Lapointe, F.-J. and Legendre, P. (1994): A classification of pure. malt Scotch whiskies Applied Statistics 43, 237–257.

    Google Scholar 

  • Lapointe, F.-J. and Kirsch, J. A. W. (1995): Estimating phylogenies from lacunose distance matrices, with special reference to DNA hybridization data, Molecular Biology and Evolution, 12, 266–284.

    Google Scholar 

  • Lapointe, F.-J. and Legendre, P. (1995): Comparison tests for dendrograms: A comparative evaluation Journal of Classification 12, 265–282.

    Google Scholar 

  • Lapointe, F.-J. et al. (1994): Jackknifing of weighted trees: Validation of phylogenies reconstructed from distances matrices, Molecular Phylogenetics and Evolution, 3, 256–267.

    Article  Google Scholar 

  • Leclerc, B. and Cucumel, G. (1987): Consensus en classification: Une revue bibliographique Mathématiques et Sciences Humaines 100, 109–128.

    Google Scholar 

  • Lecointre, G. H. et al. (1993): Species sampling has a major impact on phylogenetic inference Molecular Phylogenetics and Evolution 2, 205–224.

    Google Scholar 

  • Lefkovitch, L. P. (1985): Euclidean consensus dendrograms and other classification structures Mathematical Biosciences 74, 1–15.

    Google Scholar 

  • Le Quesne, W. (1989): Frequency distributions of lengths of possible networks from a data matrix Cladistics 5, 395–407.

    Google Scholar 

  • Li, W.-H. and Guoy, M. (1991): Statistical methods for testing phylogenies, In: Phylogenetic analysis of DNA sequences Miyamoto, M. M. and Cracraft, J. (eds.), 249–277, Oxford University Press, New York.

    Google Scholar 

  • Li, W.-H. and Zharkikh, A. (1994): What is the bootstrap technique?, Systematic Biology, 43, 424–430. Li, W.-H. and Zharkikh, A. (1995): Statistical tests of DNA phylogenies, Systematic Biology, 44, 49–63.

    Google Scholar 

  • Ling, R. F. (1973): A probability theory of cluster analysis Journal of the American Statistical Association 68, 159–164.

    Google Scholar 

  • Mantel, N. (1967): The detection of disease clustering and a generalized regression approach Cancer Research 27, 209–220.

    Google Scholar 

  • Margush, T. (1982): Distances between trees Discrete Applied Mathematics 4, 281–290.

    Google Scholar 

  • Margush, T. and McMorris, F. R. (1981): Consensus n-trees, Bulletin of Mathematical Biology, 43, 239244.

    Google Scholar 

  • Marshall, C. R. (1991): Statistical tests and bootstrapping: Assessing the reliability of phylogenies based on distance data, Molecular Biology and Evolution, 8, 386–391.

    Google Scholar 

  • Mason-Gamer, R. J. and Kellogg, E. K. (1996): Testing for phylogenetic conflict among molecular data.

    Google Scholar 

  • sets in the tribe Triticeae (Gramineae), Systematic Biology,45 524–545.

    Google Scholar 

  • McMorris, F. R. (1985): Axioms for consensus functions ou undirected phylogenetic trees Mathematical Biosciences 74 17–21.

    Google Scholar 

  • McMorris, F. R. et al. (1983): A view of some consensus methods for trees. In: Numerical Taxonomy Felsenstein, J. (ed.), 122–126, Springer-Verlag, Berlin.

    Google Scholar 

  • McMorris, F. R. and Neumann, D. (1983): Consensus functions defined on trees Mathematical Social Sciences 4 131–136.

    Google Scholar 

  • Meier, R. et al. (1991): Homoplasy slope ratio: A better measurement of observed homoplasy in cladistic analyses, Systematic Zoology, 40, 74–88.

    Article  Google Scholar 

  • Mickevich, M. F. (1978): Taxonomic congruence, Systematic Zoology, 27, 143–158.

    Article  Google Scholar 

  • Milligan, G. W. (1981): A Monte-Carlo study of 30 internal criterion measures for cluster-analysis, Psychometrika, 46, 187–195.

    Article  MathSciNet  MATH  Google Scholar 

  • Miyamoto, M. M. (1985): Consensus cladograms and general classifications Cladistics 1186–189.

    Google Scholar 

  • Miyamoto, M. M. et al. (1994): A congruence test of reliability using linked mitochondria) DNA sequences, Systematic Biology, 43, 236–249.

    Google Scholar 

  • Miyamoto, M. M. and Fitch, W. M. (1995): Testing species phylogenies and phylogenetic methods with congruence, Systematic Biology, 44, 64–76.

    Google Scholar 

  • Mueller, L. D. and Ayala, F. J. (1982): Estimation and interpretation of genetic distances in empirical studies, Genetical Research, 40, 127–137.

    Article  Google Scholar 

  • Murtagh, F. (1984): Counting dendrograms: A survey, Discrete Applied Mathematics, 7, 191–199.

    Article  MathSciNet  MATH  Google Scholar 

  • Nelson, G. (1979): Cladistic analysis and synthesis: Principles and definitions, with a historical note on Adauson’s Famille des Plantes (1763–1764), Systematic Zoology, 28, 1–21.

    Article  Google Scholar 

  • Nelson, G. (1993): Why crusade against consensus? A reply to Barrett, Donoghue, and Sober Systematic Biology 42 215–216.

    Google Scholar 

  • Nemec, A. F. L. and Brinkburst, R. O. (1988): The Fowlkes-Mallows statistic and the comparison of two independently determined dendrograms, Canadian Journal of Fisheries and Aquatic Sciences, 45, 97 1975.

    Google Scholar 

  • Neumann, D. A. (1983): Faithful consensus methods for n-trees, Mathematical Biosciences, 63, 271–287. Nixon, K. C. and J. M. Carpenter. (1996): On simultaneous analysis, Cladistics, 12, 221–241.

    Google Scholar 

  • Oden, N. L. and Shao, K. T. (1984): An algorithm to equiprobably generate all directed trees with k labeled terminal nodes and unlabeled interior nodes, Bulletin of Mathematical Biology, 46, 379–387.

    MathSciNet  MATH  Google Scholar 

  • Olmstead, R. G. and Sweere, J. A. (1994): Combining data in phylogenetic systematics: An empirical approach using three molecular data sets in the Solanacae, Systematic Biology, 43, 467–481.

    Article  Google Scholar 

  • Omland, K. E. (1994): Character congruence between a molecular and a morphological phylogeny for dabbling ducks (Arras), Systematic Biology, 43, 369–386.

    Google Scholar 

  • Page, R. D. M. (1988): Quantitative cladistic biogeography: Constructing and comparing area cladograms, Systematic Zoology, 37, 254–270.

    Article  Google Scholar 

  • Page, R. D. M. (1991): Random dendrograms and null hypotheses in cladistic biogeography Systematic Zoology 40 54–62.

    Google Scholar 

  • Patterson, C. et al. (1993): Congruence between molecular and morphological phylogenies Annual Review of Ecology and Systematics 24 153–188.

    Google Scholar 

  • Penny, D. and Hendy, M. D. (1985a): The use of tree comparison metrics, Systematic Zoology, 34, 75–82. Penny, D. and Hendy, M. D. (1985b): Testing methods of evolutionary tree construction, Cladistics, 1, 266–278.

    Article  Google Scholar 

  • Penny, D. et al. (1982): Testing the theory of evolution by comparing phylogenetic trees constructed from five different protein sequences, Nature, 297, 197–200.

    Article  Google Scholar 

  • Penny, D. et al. (1992): Progress with methods for constructing evolutionary trees Trends in Ecology and Evolution 7, 73–79.

    Google Scholar 

  • Phillips, C. and Warnow, T. J. (1996): The asymmetric median tree–A new model for building consensus trees, Discrete Applied Mathematics, 71, 311–335.

    Article  MathSciNet  MATH  Google Scholar 

  • Phipps, J. B. (1975): The numbers of classifications, Canadian Journal of Botany, 54, 686–688.

    Article  Google Scholar 

  • Podani, J. and Dickinson, T. A. (1984): Comparison of dendrograms: A multivariate approach Canadian Journal of Botany 62 2765–2778.

    Google Scholar 

  • Poe, S. 1996. Data set incongrence and the phylogeny of Crocodilians, Systematic Biology, 45, 393–414.

    Article  Google Scholar 

  • Prager, E. M. and Wilson, A. C. (1976): Congruency of phylogenies derived from different proteins, Journal of Molecular Evolution, 9, 45–57.

    Article  Google Scholar 

  • Proskurowski, A. (1980): On the generation of binary trees Journal of the Association of Computing Machinery 27 1–2.

    Google Scholar 

  • Purvis, A. (1995a): A modification to Baum and Ragan’s method for combining phylogenetic trees, Systematic Biology, 44, 251–255.

    Google Scholar 

  • Purvis, A. (1995b): A composite estimate of primate phylogeny Philosophical Transactions of the Royal Society of London (B) 348 405–421.

    Google Scholar 

  • Quiroz, A. J. (1989): Fast random generation of binary, t-ary and other types of trees Journal of Classification 6 223–231.

    Google Scholar 

  • Ragan, M. A. (1992): Phylogeuetic inference based on matrix representation of trees Molecular Phylogenetics and Evolution 1 53–58.

    Google Scholar 

  • Robinson, D. F. (1971): Comparison of labeled trees with valency Three Journal of Combinatorial Theory 11 105–119.

    Google Scholar 

  • Robinson, D. F. and Foulds, L. R. (1979): Comparison of weighted labelled trees. In: Lecture Notes in Matehmatics Volume 748, 119–126, Springer-Verlag, Berlin.

    Google Scholar 

  • Robinson, D. F. and Foulds, L. R. (1981): Comparison of phylogenetic trees, Mathematical Biosciences, 53, 131–147.

    Article  MathSciNet  MATH  Google Scholar 

  • Rodrigo, A. G. (1993a): Calibrating the bootstrap test of monophyly, International Journal of Parasitology, 23, 507–514.

    Article  Google Scholar 

  • Rodrigo, A. G. (19936): A comment on Baum’s method for combining phylogenetic trees, Taxon, 42, 63 1636.

    Google Scholar 

  • Rodrigo, A. G. et al. (1993): A randomisation test of the null hypothesis that two cladograms are sample estimates of a parametric phylogenetic tree, New Zealand Journal of Botany, 31, 257–268.

    Article  Google Scholar 

  • Rohlf, F. J. (1974): Methods of comparing classifications, Annual Review of Ecology and Systematics, 5, 101–113.

    Article  Google Scholar 

  • Rohlf, F. J. (1982): Consensus indices for comparing classifications, Mathematical Biosciences, 59, 13 1144.

    Google Scholar 

  • Ronquist, F. (1996): Matrix representations of trees, redudancy and weighting, Systematic Biology, 45, 247–253.

    Article  Google Scholar 

  • Russo, C. A. M. et al. (1996): Efficiencies of different genes and different tree-building methods in recovering a known vertebrate phylogeny, Molecular Biology and Evolution, 13, 525–536.

    Article  Google Scholar 

  • Sanderson, M. J. (1989): Confidence limits on phylogenies: The bootstrap revisited, (laths es, 5, 113129.

    Google Scholar 

  • Sanderson, M. J. (1995): Objections of bootstrapping phylogenies: A critique, Systematic Biology, 44, 299–320.

    Google Scholar 

  • Savage, H. M. (1983): The shape of evolution: Systematic tree topology Biological Journal of the Linneae Society20, 225–244.

    Google Scholar 

  • Shao, K. and Rohlf, F. J. (1983): Sampling distribution of consensus indices when all bifurcating trees are equally likely. In: Numerical Taxonomy, Felsenstein, J. (ed.), 132–136, Springer-Verlag, Berlin.

    Chapter  Google Scholar 

  • Shoo, K. and Sokal, R. R. (1986): Significance tests of consensus indices, Systematic Zoology, 35, 58 2590.

    Google Scholar 

  • Simberloff, D. (1987): Calculating probabilities that cladograms match: A method of biogeographic inference, Systematic Zoology, 36, 175–195.

    Article  Google Scholar 

  • Simberloff, D. et al. (1981): There have been no statistical tests of cladistics biogeographical hypotheses. In: Vicariance Biogeography: A Critique, Nelson, G. and Rosen, D. E. (eds.), 40–63, Columbia University Press, New York.

    Google Scholar 

  • Sneath, P. H. A. (1967): Some statistical problems in numerical taxonomy, The Statistician, 17, 1–12.

    Article  Google Scholar 

  • Sokal R. R. and Rohlf, F. J. (1962): The comparison of dendrograms by objective methods, Taxon, 9, 3340.

    Google Scholar 

  • Sokal R. R. and Rohlf, F. J. (1981): Taxonomic congruence in the Leptopodomorpha re-examined Systematic Zoology30, 309–325.

    Google Scholar 

  • Steel, M. A. (1988): Distribution of the symmetric difference metric on phylogenetic trees, SIAM Journal on Discrete Mathematics, 1, 541–555.

  • Steel, M. A. (1992): The complexity of reconstructing trees from qualitative characters and subtrees, Journal of Classification, 9, 91–116.

  • Steel, M. A. and Penny, D. (1993): Distribution of tree comparison metrics - some new results, Systematic Biology, 42, 126–141.

  • Steel, M. A. et al. (1992): Significance of the length of the shortest tree, Journal of Classification, 9, 63–70.

  • Stinebrickner, R. (1982): S-consensus trees and indices, Bulletin of Mathematical Biology, 46, 923–935.

  • Stinebrickner, R. (1984): An extension of intersection methods from trees to dendrograms, Systematic Zoology, 33, 381–386.

  • Sullivan, J. (1996): Combining data with different distributions of among-site variation, Systematic Biology, 45, 375–379.

  • Swofford, D. L. (1991): When are phylogeny estimates from molecular and morphological data incongruent?, In: Phylogenetic analysis of DNA sequences, Miyamoto, M. M. and Cracraft, J. (eds.), 295–333, Oxford University Press, New York.

  • Swofford, D. L. et al. (1996a): Phylogenetic inference, In: Molecular Systematics, 2nd edition, Hillis, D. M. et al. (eds.), 407–514, Sinauer, Sunderland.

  • Swofford, D. L. et al. (1996b): The topology-dependent permutation test for monophyly does not test for monophyly, Systematic Biology, 45, 575–579.

  • Waterman, M. S. and Smith, T. F. (1978): On the similarity of dendrograms, Journal of Theoretical Biology, 73, 789–800.

  • Wiens, J. J. and Chippindale, P. T. (1994): Combining and weighting characters and the prior agreement approach revisited, Systematic Biology, 43, 564–566.

  • Wiens, J. J. and Reeder, T. W. (1995): Combining data sets with different numbers of taxa for phylogenetic analysis, Systematic Biology, 44, 548–558.

  • Wilkinson, M. (1994): Common cladistic information and its consensus representation: Reduced Adams and reduced cladistic consensus trees and profiles, Systematic Biology, 43, 343–368.

  • Wilkinson, M. (1996): Majority-rule reduced consensus trees and their use in bootstrapping, Molecular Biology and Evolution, 13, 437–444.

  • Williams, D. M. (1994): Combining trees and combining data, Taxon, 43, 449–453.

  • Williams, W. T. and Clifford, H. T. (1971): On the comparison of two classifications of the same set of elements, Taxon, 20, 519–522.

  • Zaretskii, K. (1965): Constructing a tree on the basis of a set of distances between the hanging vertices, Uspekhi Matematicheskikh Nauk, 20, 90–92 (in Russian).

  • Zharkikh, A. and Li, W.-H. (1992a): Statistical properties of bootstrap estimation of phylogenetic variability from nucleotide sequences. I. Four taxa with a molecular clock, Molecular Biology and Evolution, 9, 1119–1147.

  • Zharkikh, A. and Li, W.-H. (1992b): Statistical properties of bootstrap estimation of phylogenetic variability from nucleotide sequences. II. Four taxa without a molecular clock, Journal of Molecular Evolution, 35, 356–366.

  • Zharkikh, A. and Li, W.-H. (1995): Estimation of confidence in phylogeny: The full-and-partial bootstrap technique, Molecular Phylogenetics and Evolution, 4, 44–63.

© 1998 Springer Japan

Lapointe, FJ. (1998). How to validate phylogenetic trees? A stepwise procedure. In: Hayashi, C., Yajima, K., Bock, HH., Ohsumi, N., Tanaka, Y., Baba, Y. (eds) Data Science, Classification, and Related Methods. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Tokyo. https://doi.org/10.1007/978-4-431-65950-1_6

  • DOI: https://doi.org/10.1007/978-4-431-65950-1_6

  • Publisher Name: Springer, Tokyo

  • Print ISBN: 978-4-431-70208-5

  • Online ISBN: 978-4-431-65950-1

  • eBook Packages: Springer Book Archive
