On the Minimum Description Length (MDL) Principle for Hierarchical Classifications

  • Peter G. Bryant
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)

Summary

Hierarchical clustering procedures such as single-, average-, or complete-link procedures produce a series of groupings of the data arranged in the form of a hierarchy, or tree structure. In most cases, the choice of where to “cut” the tree is left to the user. Occasional formal guidelines have usually been based on ideas of random sampling, but that assumption is often violated in the contexts in which cluster analysis is used. This paper explores the application of Rissanen’s MDL principle to derive possible guidelines for cutting the tree. These guidelines do not assume random sampling.

Keywords

Complete Linkage; Minimum Description Length; Aggregation Criterion; Hierarchical Method; Penalize Maximum Likelihood
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. Bryant, P. (1996): The Minimum Description Length Principle for Gaussian Regression. Working Paper 1996–08, University of Colorado at Denver, Graduate School of Business Administration, Denver, Colorado 80217–3364.
  2. Duda, R. O. and Hart, P. E. (1973): Pattern Classification and Scene Analysis. John Wiley & Sons, New York.
  3. Everitt, B. S. (1993): Cluster Analysis. Edward Arnold, London.
  4. Johnson, R. A. and Wichern, D. W. (1988): Applied Multivariate Statistical Analysis, second edition. Prentice-Hall, Englewood Cliffs, N.J.
  5. Rissanen, J. (1987): Stochastic complexity. Journal of the Royal Statistical Society, Series B, 49, 3, 223–265.
  6. Rissanen, J. (1989): Stochastic Complexity in Statistical Inquiry. World Scientific Publishing Co., Singapore.
  7. Rissanen, J. (1996): Shannon-Wiener information and stochastic complexity. In: Proceedings, N. Wiener Centenary Congress, East Lansing, Michigan.

Copyright information

© Springer Japan 1998

Authors and Affiliations

  • Peter G. Bryant (1)
  1. Graduate School of Business Administration, University of Colorado at Denver, Denver, USA
