MDL-based selection of the number of components in mixture models for pattern classification

  • Hiroshi Tenmoto
  • Mineichi Kudo
  • Masaru Shimbo
Poster Papers
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1451)


A new method is proposed for selecting the optimal number of components in a mixture model for pattern classification. We approximate each class-conditional density by a mixture of Gaussian components. We estimate the parameters of the mixture components with the EM (Expectation-Maximization) algorithm and select the optimal number of components on the basis of the MDL (Minimum Description Length) principle. We evaluate the goodness of an estimated model as a trade-off between the number of misclassified training samples and the complexity of the model.


Mixture Model, Training Sample, Recognition Rate, Mixture Component, Pattern Classification
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • Hiroshi Tenmoto (1)
  • Mineichi Kudo (1)
  • Masaru Shimbo (1)
  1. Division of Systems and Information Engineering, Graduate School of Engineering, Hokkaido University, Sapporo, Japan
