Scale-constrained approaches for maximum likelihood estimation and model selection of clusterwise linear regression models

  • Roberto Di Mari
  • Roberto Rocci
  • Stefano Antonio Gattone
Original Paper


We consider an equivariant approach that imposes data-driven bounds on the component variances to avoid singular and spurious solutions in maximum likelihood estimation of clusterwise linear regression models. We investigate its use in choosing the number of components, and we propose a computational shortcut that significantly reduces the time needed to tune the bounds on the data. In a simulation study and two real-data applications, we show that the proposed methods give a reliable assessment of the number of components compared to standard unconstrained methods, together with accurate estimation of the model parameters and accurate cluster recovery.
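
To make the problem of singular solutions concrete, the following is a minimal sketch under assumed notation. The symbols $G$, $\pi_g$, $\beta_g$, $\sigma_g^2$, $a$ and $b$ are introduced here for illustration and do not appear in the abstract, and the box constraint on the last line is a generic form, not necessarily the exact data-driven bounds used in the paper.

```latex
% Clusterwise linear regression with G Gaussian components (assumed notation)
f(y_i \mid x_i; \psi) = \sum_{g=1}^{G} \pi_g \,
    \phi\!\left(y_i;\, x_i^{\top}\beta_g,\, \sigma_g^2\right),
\qquad
\ell(\psi) = \sum_{i=1}^{n} \log f(y_i \mid x_i; \psi).

% Singular solution: for G >= 2, pick \beta_1 so that component 1 fits
% observation 1 exactly, y_1 = x_1^{\top}\beta_1. Then
\phi\!\left(y_1;\, x_1^{\top}\beta_1,\, \sigma_1^2\right)
    = \left(2\pi\sigma_1^2\right)^{-1/2} \longrightarrow \infty
\quad \text{as } \sigma_1^2 \to 0,
% while every other observation keeps positive density through the
% remaining components, so \ell(\psi) is unbounded above.

% Generic remedy: restrict the variances to a compact interval
0 < a \le \sigma_g^2 \le b < \infty, \qquad g = 1, \dots, G.
```

With fixed constants $a$ and $b$, the constrained estimator would change if the response were rescaled. One way to read "equivariant" and "data-driven" in the abstract is that the bounds are tied to the scale of the data, so that rescaling the response rescales the bounds accordingly.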


Clusterwise linear regression; Mixtures of linear regression models; Data-driven constraints; Equivariant estimators; Computationally efficient approach; Model selection




Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Economics and Business, University of Catania, Catania, Italy
  2. Department of Economics and Finance, University of Rome Tor Vergata, Rome, Italy
  3. Department of Philosophical and Social Sciences, Economics and Quantitative Methods, University G. d’Annunzio, Chieti-Pescara, Italy
