Probability Distributions on Indexed Dendrograms and Related Problems of Classifiability

Van Cutsem, Bernard; Ycart, Bernard

doi:10.1007/978-3-642-80098-6_7

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Summary

This paper studies the dendrograms produced by classification algorithms such as the Single Link Algorithm. We introduce probability distributions on dendrograms corresponding to distinct non-classifiability hypotheses. The distributions of the height of a random dendrogram under these hypotheses are studied, and their asymptotics are computed explicitly. This leads to statistical tests for non-classifiability.
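
As a rough illustration of the objects mentioned in the summary, the sketch below builds the merge levels of a single-link dendrogram from a table of pairwise dissimilarities and applies it to dissimilarities drawn independently and uniformly at random. Several things here are assumptions for illustration only and are not taken from the chapter: the function names `single_link_merge_levels` and `random_dissimilarity`, the choice of i.i.d. uniform dissimilarities as a stand-in for a non-classifiability hypothesis, and the use of the last merge level as a height-type statistic. The chapter's own definitions may differ.

```python
import itertools
import random


def single_link_merge_levels(n, dissimilarity):
    """Return the n-1 merge levels of a single-link dendrogram on points 0..n-1.

    `dissimilarity` maps each pair (i, j) with i < j to a number.
    Single link merges two clusters at the smallest dissimilarity between
    any of their members, so scanning pairs in increasing order and
    merging with union-find yields the merge levels in order.
    """
    parent = list(range(n))

    def find(x):
        # Follow parent links to the cluster representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    levels = []
    for (i, j), d in sorted(dissimilarity.items(), key=lambda kv: kv[1]):
        ri, rj = find(i), find(j)
        if ri != rj:  # the pair joins two different clusters
            parent[ri] = rj
            levels.append(d)
    return levels


def random_dissimilarity(n, rng):
    """Hypothetical null model: i.i.d. uniform dissimilarities on all pairs."""
    return {pair: rng.random() for pair in itertools.combinations(range(n), 2)}


# Small deterministic check: points 0 and 1 merge at level 1, then 2 joins at level 2.
toy_levels = single_link_merge_levels(3, {(0, 1): 1.0, (0, 2): 3.0, (1, 2): 2.0})

# One random dendrogram on 6 points under the assumed null model.
rng = random.Random(0)
levels = single_link_merge_levels(6, random_dissimilarity(6, rng))
top_level = levels[-1]  # level at which all points are in one cluster
```

Repeating the random draw many times gives an empirical distribution of `top_level` under the assumed null model, against which the value observed on real data could be compared. This is only a simulation-based sketch and does not reproduce the explicit asymptotics the summary refers to.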

Author information

Authors and Affiliations

Laboratoire Modélisation et Calcul - I.M.A.G., B.P. 53, F-38041, Grenoble Cedex 9, France
Bernard Van Cutsem & Bernard Ycart

Editor information

Editors and Affiliations

Institut für Statistik und Wirtschaftsmathematik, Rheinisch-Westfälische Technische Hochschule Aachen (RWTH), Wüllnerstr. 3, D-52056, Aachen, Germany
Hans-Hermann Bock
Institut für Statistik und Ökonometrie, Universität Basel, Holbeinstr. 12, CH-4051, Basel, Switzerland
Wolfgang Polasek

Cite this paper

Van Cutsem, B., Ycart, B. (1996). Probability Distributions on Indexed Dendrograms and Related Problems of Classifiability. In: Bock, HH., Polasek, W. (eds) Data Analysis and Information Systems. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-80098-6_7

DOI: https://doi.org/10.1007/978-3-642-80098-6_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60774-8
Online ISBN: 978-3-642-80098-6
eBook Packages: Springer Book Archive
