TEST

pp 1–19

A robust conditional maximum likelihood estimator for generalized linear models with a dispersion parameter

  • Alfio Marazzi
  • Marina Valdora
  • Victor Yohai
  • Michael Amiguet
Original Paper

Abstract

Highly robust and efficient estimators for generalized linear models with a dispersion parameter are proposed. The estimators are computed in three steps. In the first step, the maximum rank correlation estimator is used to consistently estimate the slopes up to a scale factor; the scale factor, the intercept, and the dispersion parameter are then robustly estimated using a simple regression model. In the second step, randomized quantile residuals based on these initial estimators are used to define a region S such that observations outside S are treated as outliers. In the third step, a conditional maximum likelihood (CML) estimator given the observations in S is computed. We show that, under the model, S tends to the whole space as the sample size increases. Therefore, the CML estimator tends to the unconditional maximum likelihood estimator, which implies that it is asymptotically fully efficient. Moreover, the CML estimator maintains the high degree of robustness of the initial one. The negative binomial regression case is studied in detail.
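As a rough illustration of the residual step described in the abstract, the following minimal sketch computes randomized quantile residuals for a negative binomial response and flags observations whose residual is large in absolute value. It is not the authors' procedure. The function names, the mean-dispersion parameterization (variance mu + alpha * mu^2), and the fixed cutoff of 3 are assumptions made for this example, and the fitted means `mus` and dispersion `alpha` are simply taken as given rather than obtained from a robust initial estimator. In particular, a fixed cutoff does not reproduce the property that S tends to the whole space as the sample size increases.

```python
import math
import random
from statistics import NormalDist


def nb_pmf(y, mu, alpha):
    """Negative binomial pmf with mean mu and dispersion alpha
    (assumed parameterization: variance = mu + alpha * mu**2)."""
    r = 1.0 / alpha
    log_p = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
             + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))
    return math.exp(log_p)


def nb_cdf(y, mu, alpha):
    """P(Y <= y), computed by direct summation (adequate for moderate y)."""
    if y < 0:
        return 0.0
    return min(1.0, sum(nb_pmf(k, mu, alpha) for k in range(y + 1)))


def randomized_quantile_residual(y, mu, alpha, rng):
    """Draw u uniformly between F(y - 1) and F(y), then map it to the
    standard normal scale with the inverse normal cdf."""
    lower = nb_cdf(y - 1, mu, alpha)
    upper = nb_cdf(y, mu, alpha)
    u = rng.uniform(lower, upper)
    # Clamp away from 0 and 1 so that inv_cdf stays finite.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return NormalDist().inv_cdf(u)


def flag_outliers(ys, mus, alpha, cutoff=3.0, seed=0):
    """Return True for observations whose residual exceeds the
    (hypothetical, fixed) cutoff in absolute value."""
    rng = random.Random(seed)
    return [abs(randomized_quantile_residual(y, m, alpha, rng)) > cutoff
            for y, m in zip(ys, mus)]


if __name__ == "__main__":
    counts = [2, 3, 1, 200]       # the last count is a gross outlier
    fitted = [2.5] * len(counts)  # assumed fitted means
    print(flag_outliers(counts, fitted, alpha=0.5))
```

Under the model, such residuals are approximately standard normal, which is why a normal-scale cutoff is a natural way to describe a region of plausible observations. The observations that are not flagged would then play the role of the set on which a conditional likelihood is evaluated.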

Keywords

Generalized linear model · Conditional maximum likelihood · Negative binomial regression · Overdispersion · Robust regression

Mathematics Subject Classification

62F10 62F12 62F35 62J12 62J20 

Supplementary material

11749_2018_624_MOESM1_ESM.pdf (332 kb)
Supplementary material 1 (pdf 332 KB)

Copyright information

© Sociedad de Estadística e Investigación Operativa 2018

Authors and Affiliations

  1. Institute of Social and Preventive Medicine, Lausanne, Switzerland
  2. Nice Computing, Le Mont-sur-Lausanne, Switzerland
  3. Departamento de matematicas and Instituto de cálculo, Facultad de ciencias exactas y naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
  4. CONICET, Buenos Aires, Argentina