How to validate phylogenetic trees? A stepwise procedure

Lapointe, François-Joseph

doi:10.1007/978-4-431-65950-1_6

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Summary

In this paper, I review some of the methods and tests currently available to validate trees, focussing on phylogenetic trees (dendrograms and cladograms). I first present some of the more commonly used techniques to compare a tree with the data from which it is derived (internal validation), or to compare a tree with one or more other trees (external validation). I also discuss some of the advantages of performing combined (total evidence) versus separate (consensus) analyses of independent data sets for validation purposes. A stepwise validation procedure defined across all levels of comparison is introduced, along with a corresponding statistical test: a phylogeny is said to be globally validated only if it satisfies all the tests. An application to the phylogeny of kangaroos illustrates the stepwise procedure.
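
The decision rule stated in the summary (a phylogeny is globally validated only if it satisfies all the tests) can be illustrated with a minimal Python sketch. This is not the chapter's implementation: the function name `stepwise_validate`, the step labels, the placeholder tests, and the choice to stop at the first failed step are all assumptions made for illustration, since the individual tests are not specified here.

```python
from typing import Callable, List, Tuple

# A step pairs a label with a test that returns True when the tree passes.
Step = Tuple[str, Callable[[], bool]]


def stepwise_validate(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run the tests in order and stop at the first failure.

    Returns (globally_validated, labels of the steps that were run).
    Stopping early is an assumption; the verdict is the same either way.
    """
    run: List[str] = []
    for label, test in steps:
        run.append(label)
        if not test():
            return False, run  # one failed test is enough to reject
    return True, run


# Hypothetical placeholder tests; real ones would compare trees and data.
steps = [
    ("internal", lambda: True),
    ("external", lambda: True),
    ("combined", lambda: False),
]
validated, run = stepwise_validate(steps)
```

In this sketch the tree passes the first two placeholder steps and fails the third, so it is not globally validated.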

Author information

Authors and Affiliations

Département de sciences biologiques, Université de Montréal, C.P. 6128, Succursale centre-ville, Montréal, Québec, H3C 3J7, Canada
François-Joseph Lapointe

Editor information

Editors and Affiliations

The Institute of Statistical Mathematics, 4-6-7 Minami-Azabu, Minato-ku, Tokyo 106, Japan
Chikio Hayashi, Noboru Ohsumi & Yasumasa Baba
School of Management, Science University of Tokyo, 500 Shimokiyoku, Kuki, Saitama 346, Japan
Keiji Yajima
Institut für Statistik, Rheinisch-Westfälische Technische Hochschule (RWTH), D-52056, Aachen, Germany
Hans-Hermann Bock
Faculty of Environmental Science & Technology, Okayama University, 2-1-1 Tsushima-naka, Okayama 700, Japan
Yutaka Tanaka

About this paper

Cite this paper

Lapointe, FJ. (1998). How to validate phylogenetic trees? A stepwise procedure. In: Hayashi, C., Yajima, K., Bock, HH., Ohsumi, N., Tanaka, Y., Baba, Y. (eds) Data Science, Classification, and Related Methods. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Tokyo. https://doi.org/10.1007/978-4-431-65950-1_6

DOI: https://doi.org/10.1007/978-4-431-65950-1_6
Publisher Name: Springer, Tokyo
Print ISBN: 978-4-431-70208-5
Online ISBN: 978-4-431-65950-1
eBook Packages: Springer Book Archive
