Number of Components and Initialization in Gaussian Mixture Model for Pattern Recognition

  • Pavel Paclík
  • Jana Novovičová
Conference paper


The number of components and the initial parameter estimates are of crucial importance for successful mixture estimation using the Expectation-Maximization (EM) algorithm. This paper proposes a method for complete mixture initialization based on a product kernel estimate of the probability density function. The mixture components are assumed to correspond to local maxima of the optimally smoothed kernel density estimate. A gradient method is used to find the local extrema. The local extrema are then grouped together to form component candidates, which are merged by a hierarchical clustering method. Finally, the initial mixture parameters are estimated. A comparison with scale-space approaches for finding the number of components is given on examples.
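The steps listed in the abstract can be illustrated with a minimal one-dimensional sketch. It is not the authors' implementation and rests on several assumptions that do not come from the paper: a Gaussian kernel, a normal-reference rule-of-thumb bandwidth in place of the optimal smoothing, mean-shift iterations as the gradient-ascent step, and a single-linkage merge with a cutoff of one bandwidth. The names `mean_shift_mode`, `init_mixture_1d` and `merge_factor` are hypothetical.

```python
import math
import random
import statistics


def mean_shift_mode(x, data, h, tol=1e-6, max_iter=500):
    """Climb the Gaussian kernel density estimate from x to a local maximum.

    Mean-shift is used here as the gradient-ascent step (an assumption).
    """
    for _ in range(max_iter):
        w = [math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data]
        x_new = sum(wi * xi for wi, xi in zip(w, data)) / sum(w)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x


def init_mixture_1d(data, merge_factor=1.0):
    """Return initial (weight, mean, var) estimates, one dict per component."""
    n = len(data)
    # Assumption: rule-of-thumb bandwidth, not the paper's optimal smoothing.
    h = 1.06 * statistics.stdev(data) * n ** (-0.2)

    # 1. Find the local maximum reached from every data point.
    modes = [mean_shift_mode(x, data, h) for x in data]

    # 2. Group and merge modes. In 1-D, single-linkage clustering with a
    #    distance cutoff equals cutting the sorted modes at large gaps.
    order = sorted(range(n), key=lambda i: modes[i])
    labels = [0] * n
    k = 0
    for prev, cur in zip(order, order[1:]):
        if modes[cur] - modes[prev] > merge_factor * h:
            k += 1
        labels[cur] = k

    # 3. Estimate initial mixture parameters from each group of points.
    components = []
    for c in range(k + 1):
        members = [data[i] for i in range(n) if labels[i] == c]
        mean = sum(members) / len(members)
        if len(members) > 1:
            var = sum((m - mean) ** 2 for m in members) / len(members)
        else:
            var = h ** 2  # fallback for a singleton group
        components.append(
            {"weight": len(members) / n, "mean": mean, "var": var}
        )
    return components


# Synthetic demo data: two well-separated Gaussian clusters.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(150)]
sample += [rng.gauss(10.0, 1.0) for _ in range(150)]

components = init_mixture_1d(sample)
for comp in components:
    print(comp)
```

The returned weights, means and variances could then serve as the starting point for EM. The number of components is simply the number of merged mode groups, so it depends on the bandwidth and on the merge cutoff.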


Keywords: Mixture Model; Gaussian Mixture Model; Local Extremum; Finite Mixture Model; Pattern Recognition Letter





Copyright information

© Springer-Verlag Wien 2001

Authors and Affiliations

  • Pavel Paclík (1, 2)
  • Jana Novovičová (3)

  1. Pattern Recognition Group, TU Delft, Delft, The Netherlands
  2. Faculty of Transportation Sciences, Czech Technical University, Prague, Czech Republic
  3. Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic, Prague, Czech Republic
