Pattern Analysis and Applications

Volume 21, Issue 1, pp 181–192

Estimating number of components in Gaussian mixture model using combination of greedy and merging algorithm

  • Karla Štepánová
  • Michal Vavrečka
Theoretical Advances


Abstract

The brain must process a massive flow of sensory information without receiving any prior information. When building cognitive models, it is therefore important to extract as much information as possible from the data itself. Moreover, the brain has to handle an unknown number of components (concepts) in a dataset without any prior knowledge. Most algorithms in use today cannot replicate this strategy effectively. We propose a novel approach based on neural modelling fields (NMF) theory to overcome this problem. The algorithm combines NMF with greedy learning of Gaussian mixture models. The novelty lies in combining an information criterion with a merging algorithm. The performance of the algorithm was compared with other well-known algorithms and tested on both artificial and real-world datasets.


Keywords: Clustering · Mixture model · Gaussian mixture model · Number of clusters · EM algorithm
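The core idea the abstract describes, estimating the number of mixture components by fitting Gaussian mixture models of increasing size and scoring them with an information criterion, can be illustrated with a minimal sketch. This is not the paper's NMF-plus-merging algorithm; it is a hypothetical baseline using plain EM on 1-D data and the BIC, with all function names and parameters chosen here for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D data drawn from two well-separated Gaussian components.
data = np.concatenate([rng.normal(-4.0, 1.0, 300), rng.normal(4.0, 1.0, 300)])

def fit_gmm_1d(x, k, n_iter=200):
    """Basic EM for a k-component 1-D Gaussian mixture; returns the log-likelihood."""
    n = x.size
    # Initialise means from quantiles, shared variance, uniform weights.
    means = np.quantile(x, np.linspace(0.1, 0.9, k))
    variances = np.full(k, x.var())
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        dens = weights * np.exp(-0.5 * (x[:, None] - means) ** 2 / variances) \
               / np.sqrt(2.0 * np.pi * variances)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, variances (floored to avoid collapse).
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp * x[:, None]).sum(axis=0) / nk
        variances = np.maximum(
            (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk, 1e-3)
    return np.log(dens.sum(axis=1)).sum()

def bic(loglik, k, n):
    # A k-component 1-D mixture has 3k - 1 free parameters (weights sum to 1).
    return (3 * k - 1) * np.log(n) - 2.0 * loglik

# Fit models with 1..4 components and keep the one with the lowest BIC.
scores = {k: bic(fit_gmm_1d(data, k), k, data.size) for k in range(1, 5)}
best_k = min(scores, key=scores.get)
print(best_k)
```

For this data the criterion selects two components, since the penalty term outweighs the marginal likelihood gain of extra components. The paper's approach differs by starting from a deliberately large model and merging components rather than scanning candidate sizes independently.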



This research has been supported by SGS Grant No. 10/279/OHK3/3T/13, sponsored by the CTU in Prague, and by the research programme MSM6840770012 Transdisciplinary Research in the Area of Biomedical Engineering II of the CTU in Prague, sponsored by the Ministry of Education, Youth and Sports of the Czech Republic.



Copyright information

© Springer-Verlag London 2016

Authors and Affiliations

  1. Department of Cybernetics, Czech Technical University in Prague, Prague 2, Czech Republic
  2. Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Prague 2, Czech Republic
