Random-intercept misspecification in generalized linear mixed models for binary responses

  • Original Paper
Statistical Methods & Applications

Abstract

We study the properties of maximum likelihood estimators of parameters in generalized linear mixed models for a binary response when the random-intercept model is misspecified. By further exploiting a test that was proposed in existing work and originally designed to detect general random-effects misspecification, we are able to reveal how the true random-intercept distribution deviates from the assumed one. Beyond this advance over existing methods, we also provide theoretical insight into when and why the proposed test has low power to identify certain forms of misspecification. A large-sample numerical study and finite-sample simulation experiments illustrate the theoretical findings.
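As a rough illustration of the setting described above (not the authors' code, and not their test), the following sketch simulates clustered binary responses from a random-intercept model in which the true intercept distribution is a skewed two-component normal mixture, whereas an analyst would typically assume normality. The logit link, the mixture form, the single covariate, and all function names and parameter values are assumptions made purely for illustration.

```python
import numpy as np

def normal_intercepts(n, rng, sigma=1.0):
    """Assumed model: random intercepts b_i ~ N(0, sigma^2)."""
    return rng.normal(0.0, sigma, size=n)

def mixture_intercepts(n, rng, sigma=1.0, delta=1.5, w=0.7):
    """Hypothetical 'true' model: a skewed two-component normal mixture,
    centred at 0 and rescaled so its standard deviation is sigma."""
    in_first = rng.random(n) < w
    # Component means chosen so the mixture mean is exactly 0.
    means = np.where(in_first, -(1.0 - w) * delta, w * delta)
    raw = means + rng.normal(0.0, 1.0, size=n)
    # Mixture variance = within-component variance + between-component variance.
    raw_sd = np.sqrt(1.0 + w * (1.0 - w) * delta**2)
    return sigma * raw / raw_sd

def simulate_clustered_binary(n_clusters, cluster_size, beta0, beta1,
                              intercept_sampler, rng):
    """Draw y_ij ~ Bernoulli(expit(beta0 + beta1 * x_ij + b_i))."""
    b = intercept_sampler(n_clusters, rng)            # one intercept per cluster
    x = rng.normal(size=(n_clusters, cluster_size))   # illustrative covariate
    eta = beta0 + beta1 * x + b[:, None]              # linear predictor
    p = 1.0 / (1.0 + np.exp(-eta))                    # inverse logit
    y = rng.binomial(1, p)
    return x, y, b

if __name__ == "__main__":
    rng = np.random.default_rng(2017)
    for name, sampler in [("normal", normal_intercepts),
                          ("mixture", mixture_intercepts)]:
        x, y, b = simulate_clustered_binary(500, 5, -0.5, 1.0, sampler, rng)
        skew = np.mean((b - b.mean()) ** 3) / b.std() ** 3
        print(f"{name}: mean(y)={y.mean():.3f}, intercept skewness={skew:.3f}")
```

Both samplers share the same mean and standard deviation, so any difference between the two data sets is due to the shape of the intercept distribution. Fitting a normal-intercept model to the mixture data would then be one way to explore the kind of misspecification the abstract refers to.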

Acknowledgements

The authors are grateful to the referees, the Associate Editor, and the Editor for their helpful suggestions, which led to a substantially improved manuscript. This work was supported by the National Science Foundation (US) under Grant DMS-1006222.

Author information

Corresponding author

Correspondence to Xianzheng Huang.

Electronic supplementary material

Supplementary material 1 (pdf 448 KB)

About this article

Cite this article

Yu, S., Huang, X. Random-intercept misspecification in generalized linear mixed models for binary responses. Stat Methods Appl 26, 333–359 (2017). https://doi.org/10.1007/s10260-017-0376-0

  • DOI: https://doi.org/10.1007/s10260-017-0376-0
