Parsimonious Generalized Linear Gaussian Cluster-Weighted Models

  • Conference paper
Advances in Statistical Models for Data Analysis

Abstract

Mixtures with random covariates are statistical models that can be used for clustering and for density estimation of a random vector composed of a response variable and a set of covariates. Within this class, the generalized linear Gaussian cluster-weighted model (GLGCWM) assumes, in each mixture component, an exponential family distribution for the response variable and a multivariate Gaussian distribution for the vector of real-valued covariates. For the sake of parsimony, a family of fourteen models is introduced here by imposing constraints on the eigen-decomposed covariance matrices of the Gaussian distribution. An EM algorithm for computing maximum likelihood estimates of the parameters of these models is described. Finally, this novel family of models is applied to a real data set, where it achieves good classification performance, especially when compared with other well-established mixture-based approaches.
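
The constraints on the eigen-decomposed covariance matrices can be illustrated with a minimal sketch. It assumes the usual volume-shape-orientation factorization of a covariance matrix, Sigma = lam * Gamma * Delta * Gamma', where lam is a scalar (volume), Delta is a diagonal matrix with unit determinant (shape), and Gamma is an orthogonal matrix of eigenvectors (orientation). The function names and symbols below are illustrative assumptions, not notation taken from the paper, and the sketch does not implement the EM algorithm.

```python
import numpy as np


def decompose_covariance(sigma):
    """Split a covariance matrix into volume, shape and orientation.

    Returns (lam, delta, gamma) such that
    sigma == lam * gamma @ delta @ gamma.T, where
      lam   = det(sigma) ** (1 / d)          (volume, a positive scalar)
      delta = diagonal matrix with det == 1  (shape)
      gamma = orthogonal eigenvector matrix  (orientation)
    Names are hypothetical and chosen for illustration only.
    """
    sigma = np.asarray(sigma, dtype=float)
    d = sigma.shape[0]
    # eigh is meant for symmetric matrices; eigenvalues come back ascending.
    eigvals, gamma = np.linalg.eigh(sigma)
    # Reorder so that the largest eigenvalue comes first.
    order = np.argsort(eigvals)[::-1]
    eigvals, gamma = eigvals[order], gamma[:, order]
    lam = np.prod(eigvals) ** (1.0 / d)
    delta = np.diag(eigvals / lam)
    return lam, delta, gamma


def rebuild_covariance(lam, delta, gamma):
    """Recombine volume, shape and orientation into a covariance matrix."""
    return lam * gamma @ delta @ gamma.T


if __name__ == "__main__":
    sigma_1 = np.array([[4.0, 1.0], [1.0, 3.0]])
    lam_1, delta_1, gamma_1 = decompose_covariance(sigma_1)

    # The three factors reproduce the original matrix.
    print(np.allclose(rebuild_covariance(lam_1, delta_1, gamma_1), sigma_1))

    # One example of a constraint: a second component that shares shape and
    # orientation with the first and differs only in volume.
    sigma_2 = rebuild_covariance(2.0 * lam_1, delta_1, gamma_1)
    print(sigma_2)
```

Holding one or more of the three factors equal across mixture components, or restricting them to diagonal or spherical forms, reduces the number of free covariance parameters, which is the sense in which such models are parsimonious.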

Acknowledgements

The authors acknowledge the financial support from the grant “Finite mixture and latent variable models for causal inference and analysis of socio-economic data” (FIRB 2012-Futuro in ricerca) funded by the Italian Government (RBFR12SHVV).

Corresponding author

Correspondence to Antonio Punzo.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Punzo, A., Ingrassia, S. (2015). Parsimonious Generalized Linear Gaussian Cluster-Weighted Models. In: Morlini, I., Minerva, T., Vichi, M. (eds) Advances in Statistical Models for Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-17377-1_21
