A General Approach to Test the Pertinence of a Consensus Classification

  • Guy Cucumel
  • François-Joseph Lapointe
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)

Abstract

Many techniques have been proposed to combine classifications defined on the same set of objects. All the methods that have been developed are designed to return a solution, but validation of the solution is seldom performed. In this paper we propose a general approach to test the pertinence of a consensus classification and discuss the choices that one has to make at each step of the method.

Keywords

Branch Length; Terminal Node; Consensus Method; Label Tree; Consensus Classification
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin · Heidelberg 2000

Authors and Affiliations

  • Guy Cucumel (1)
  • François-Joseph Lapointe (2)
  1. École des sciences de la gestion, Université du Québec à Montréal, Montréal, Canada
  2. Département de sciences biologiques, Université de Montréal, Montréal, Canada
