On the Minimum Description Length (MDL) Principle for Hierarchical Classifications

Bryant, Peter G.

doi:10.1007/978-4-431-65950-1_17

Conference paper

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Summary

Hierarchical clustering procedures, such as the single-, average-, and complete-link methods, produce a series of groupings of the data arranged as a hierarchy, or tree structure. In most cases, the choice of where to “cut” the tree is left to the user. The formal guidelines that have occasionally been proposed are usually based on an assumption of random sampling, but that assumption is often violated in the contexts in which cluster analysis is used. This paper explores the application of Rissanen’s MDL principle to derive possible guidelines for cutting the tree. These guidelines do not assume random sampling.
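The idea of choosing a cut by description length can be illustrated with a minimal sketch. It is not the criterion derived in the paper, which is not reproduced in this summary. Everything below is an assumption made for illustration: one-dimensional data, a single-link tree, a crude two-part code (cluster labels, two parameters per cluster, and a Gaussian code for the values at a fixed recording precision), and the hypothetical names `single_link_cuts` and `description_length`. The Gaussian density is used only as a coding device, not as a claim that the data were randomly sampled.

```python
import math


def single_link_cuts(points):
    """Return {k: clusters} for every cut of a single-link tree on 1-D data."""
    clusters = [[x] for x in sorted(points)]
    cuts = {len(clusters): [list(c) for c in clusters]}
    while len(clusters) > 1:
        # In one dimension the closest pair of clusters is always adjacent,
        # and their single-link distance is the gap between them.
        gaps = [clusters[i + 1][0] - clusters[i][-1]
                for i in range(len(clusters) - 1)]
        i = gaps.index(min(gaps))
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
        cuts[len(clusters)] = [list(c) for c in clusters]
    return cuts


def description_length(clusters, precision=0.1):
    """Crude two-part code length in bits (an illustrative assumption)."""
    n = sum(len(c) for c in clusters)
    k = len(clusters)
    bits = n * math.log2(k)  # cost of stating each point's cluster label
    for c in clusters:
        m = len(c)
        mean = sum(c) / m
        # Floor the variance at the recording precision to avoid log(0).
        var = max(sum((x - mean) ** 2 for x in c) / m, precision ** 2)
        # Gaussian code for the m values, discretized to `precision`.
        bits += 0.5 * m * math.log2(2 * math.pi * math.e * var)
        bits -= m * math.log2(precision)
        # Two parameters (mean, variance) at about 0.5 * log2(n) bits each.
        bits += math.log2(n)
    return bits


# Made-up data with two well-separated groups.
data = [0.9, 1.0, 1.1, 1.2, 4.9, 5.0, 5.1, 5.2]
cuts = single_link_cuts(data)
lengths = {k: description_length(c) for k, c in cuts.items()}
best_k = min(lengths, key=lengths.get)
print(best_k, cuts[best_k])
```

The cut with the shortest total code length is selected. Finer cuts pay more for labels and parameters, while coarser cuts pay more to encode values that lie far from their cluster mean.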

References

Bryant, P. (1996): The Minimum Description Length Principle for Gaussian Regression. Working Paper 1996–08, University of Colorado at Denver, Graduate School of Business Administration. Denver, Colorado 80217–3364.
Duda, R. O. and Hart, P.E. (1973): Pattern Classification and Scene Analysis. John Wiley & Sons, New York.
Everitt, B. S. (1993): Cluster Analysis. Edward Arnold, London.
Johnson, R. A. and Wichern, D. W. (1988): Applied Multivariate Statistical Analysis, second edition. Prentice-Hall, Englewood Cliffs, N. J.
Rissanen, J. (1987): Stochastic complexity. Journal of the Royal Statistical Society, Series B, 49, 3, 223–265.
Rissanen, J. (1989): Stochastic Complexity in Statistical Inquiry. World Scientific Publishing Co., Singapore.
Rissanen, J. (1996): Shannon-Wiener information and stochastic complexity. In: Proceedings, N. Wiener Centenary Congress, East Lansing, Michigan.

Author information

Authors and Affiliations

Graduate School of Business Administration, University of Colorado at Denver, Campus Box 165, Denver, Colorado, 80217-3364, USA
Peter G. Bryant

Editor information

Editors and Affiliations

The Institute of Statistical Mathematics, 4-6-7 Minami-Azabu, Minato-ku, Tokyo 106, Japan
Chikio Hayashi, Noboru Ohsumi & Yasumasa Baba
School of Management, Science University of Tokyo, 500 Shimokiyoku, Kuki, Saitama 346, Japan
Keiji Yajima
Institut für Statistik, Rheinisch-Westfälische Technische Hochschule (RWTH), D-52056, Aachen, Germany
Hans-Hermann Bock
Faculty of Environmental Science & Technology, Okayama University, 2-1-1 Tsushima-naka, Okayama 700, Japan
Yutaka Tanaka

About this paper

Cite this paper

Bryant, P.G. (1998). On the Minimum Description Length (MDL) Principle for Hierarchical Classifications. In: Hayashi, C., Yajima, K., Bock, H.-H., Ohsumi, N., Tanaka, Y., Baba, Y. (eds) Data Science, Classification, and Related Methods. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Tokyo. https://doi.org/10.1007/978-4-431-65950-1_17

DOI: https://doi.org/10.1007/978-4-431-65950-1_17
Publisher Name: Springer, Tokyo
Print ISBN: 978-4-431-70208-5
Online ISBN: 978-4-431-65950-1
eBook Packages: Springer Book Archive
