Gaussian parsimonious clustering models with covariates and a noise component

  • Keefe Murphy
  • Thomas Brendan Murphy
Regular Article

Abstract

We consider model-based clustering methods for continuous, correlated data that account for external information in the form of mixed-type fixed covariates by proposing the MoEClust suite of models. These models allow different subsets of covariates to influence the component weights and/or component densities by modelling the parameters of the mixture as functions of the covariates. A familiar range of constrained eigen-decomposition parameterisations of the component covariance matrices is also accommodated. This paper thus addresses the equivalent aims of including covariates in Gaussian parsimonious clustering models and incorporating parsimonious covariance structures into all special cases of the Gaussian mixture of experts framework. The MoEClust models demonstrate significant improvement from both perspectives in applications to both univariate and multivariate data sets. Novel extensions to include a uniform noise component for capturing outliers and to address initialisation of the EM algorithm, model selection, and the visualisation of results are also proposed.
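The density the abstract describes — component weights produced by a multinomial-logit gating network and component means produced by expert regressions, both driven by covariates — can be sketched as follows. This is an illustrative NumPy implementation of a generic Gaussian mixture of experts, not the MoEClust software (which is an R package); all function and argument names here are hypothetical.

```python
import numpy as np

def moe_density(y, x, gating_coefs, expert_coefs, covs):
    """Evaluate a Gaussian mixture-of-experts density at response y,
    given covariate vector x (illustrative sketch; names are assumed).

    gating_coefs : (G, q) array -- multinomial-logit gating coefficients
    expert_coefs : (G, q, d) array -- per-component regression coefficients
    covs         : (G, d, d) array -- component covariance matrices
    """
    G, d = covs.shape[0], covs.shape[1]
    # Gating network: component weights as a softmax of linear predictors in x
    eta = gating_coefs @ x
    tau = np.exp(eta - eta.max())
    tau /= tau.sum()
    dens = 0.0
    for g in range(G):
        mu = expert_coefs[g].T @ x          # expert mean depends on covariates
        diff = y - mu
        prec = np.linalg.inv(covs[g])
        quad = diff @ prec @ diff
        norm = (2 * np.pi) ** (-d / 2) * np.linalg.det(covs[g]) ** -0.5
        dens += tau[g] * norm * np.exp(-0.5 * quad)
    return dens
```

With a single component and zero coefficients this reduces to a standard multivariate Gaussian density; the parsimonious part of MoEClust would additionally constrain the eigen-decomposition of each `covs[g]`, which is omitted here for brevity.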

Keywords

Model-based clustering · Mixtures of experts · EM algorithm · Parsimony · Multivariate response · Covariates · Noise component

Mathematics Subject Classification

62H25 · 62J12

Notes

Acknowledgements

This work was supported by the Science Foundation Ireland funded Insight Centre for Data Analytics in University College Dublin under Grant Number SFI/12/RC/2289_P2.


Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. School of Mathematics and Statistics, University College Dublin, Dublin, Ireland
  2. Insight Centre for Data Analytics, University College Dublin, Dublin, Ireland
