
On the Minimum Description Length (MDL) Principle for Hierarchical Classifications

  • Conference paper

Summary

Hierarchical clustering procedures such as single-, average-, or complete-link procedures produce a series of groupings of the data arranged in the form of a hierarchy, or tree structure. In most cases, the choice of where to “cut” the tree is left to the user. Occasional formal guidelines have usually been based on ideas of random sampling, but that assumption is often violated in the contexts in which cluster analysis is used. This paper explores the application of Rissanen’s MDL principle to derive possible guidelines for cutting the tree. These guidelines do not assume random sampling.
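The idea of scoring each possible cut by a description length can be illustrated with a minimal sketch. This is not the criterion derived in the paper, which the summary does not state. The sketch assumes one-dimensional data, a single-link tree, Gaussian clusters with a shared variance, and a crude two-part code (parameter bits, label bits, data bits). The function names and the `precision` parameter are hypothetical and introduced only for illustration.

```python
import math


def single_link_partitions(points):
    """Return {k: clusters} for every level of a single-link tree (1-D data only)."""
    clusters = [[p] for p in sorted(points)]
    levels = {len(clusters): [c[:] for c in clusters]}
    while len(clusters) > 1:
        # For sorted 1-D data the closest pair of clusters under single link
        # is always a pair of neighbours, so only adjacent gaps are compared.
        gaps = [clusters[i + 1][0] - clusters[i][-1] for i in range(len(clusters) - 1)]
        i = gaps.index(min(gaps))
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
        levels[len(clusters)] = [c[:] for c in clusters]
    return levels


def description_length(clusters, precision):
    """Illustrative two-part code length in bits (an assumption, not the paper's formula)."""
    n = sum(len(c) for c in clusters)
    k = len(clusters)
    within_ss = 0.0
    for c in clusters:
        mean = sum(c) / len(c)
        within_ss += sum((x - mean) ** 2 for x in c)
    # Floor the shared variance at the quantisation variance of the recorded data,
    # so that singleton clusters cannot make the data cost arbitrarily small.
    variance = max(within_ss / n, precision ** 2 / 12)
    model_bits = 0.5 * (k + 1) * math.log2(n)   # k means plus one shared variance
    label_bits = n * math.log2(k)               # cluster label for each point
    # Gaussian negative log-likelihood in bits; the constant n * log2(1 / precision)
    # is the same for every cut and is therefore omitted.
    data_bits = (0.5 * n * math.log2(2 * math.pi * variance)
                 + within_ss / (2 * variance * math.log(2)))
    return model_bits + label_bits + data_bits


def best_cut(points, precision):
    """Score every level of the tree and return the cut with the shortest description."""
    levels = single_link_partitions(points)
    scores = {k: description_length(c, precision) for k, c in levels.items()}
    best_k = min(scores, key=scores.get)
    return best_k, levels[best_k], scores


# Toy data: three well-separated groups recorded to one decimal place.
data = [0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3, 10.0, 10.1, 10.2, 10.3]
best_k, clusters, scores = best_cut(data, precision=0.1)
print(best_k, clusters)
```

On this toy data the shortest description is obtained by cutting the tree into three groups. Note that the score makes no appeal to random sampling: it only compares how compactly each cut encodes the observed values.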

References

  • Bryant, P. (1996): The Minimum Description Length Principle for Gaussian Regression. Working Paper 1996–08, University of Colorado at Denver, Graduate School of Business Administration, Denver, Colorado 80217–3364.

  • Duda, R. O. and Hart, P. E. (1973): Pattern Classification and Scene Analysis. John Wiley & Sons, New York.

  • Everitt, B. S. (1993): Cluster Analysis. Edward Arnold, London.

  • Johnson, R. A. and Wichern, D. W. (1988): Applied Multivariate Statistical Analysis, second edition. Prentice-Hall, Englewood Cliffs, N. J.

  • Rissanen, J. (1987): Stochastic complexity. Journal of the Royal Statistical Society, Series B, 49, 3, 223–265.

  • Rissanen, J. (1989): Stochastic Complexity in Statistical Inquiry. World Scientific Publishing Co., Singapore.

  • Rissanen, J. (1996): Shannon-Wiener information and stochastic complexity. In: Proceedings, N. Wiener Centenary Congress, East Lansing, Michigan.

Copyright information

© 1998 Springer Japan

About this paper

Cite this paper

Bryant, P.G. (1998). On the Minimum Description Length (MDL) Principle for Hierarchical Classifications. In: Hayashi, C., Yajima, K., Bock, HH., Ohsumi, N., Tanaka, Y., Baba, Y. (eds) Data Science, Classification, and Related Methods. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Tokyo. https://doi.org/10.1007/978-4-431-65950-1_17

  • DOI: https://doi.org/10.1007/978-4-431-65950-1_17

  • Publisher Name: Springer, Tokyo

  • Print ISBN: 978-4-431-70208-5

  • Online ISBN: 978-4-431-65950-1

  • eBook Packages: Springer Book Archive
