The Performance of the Gradient-Like Influence Measure in Generalized Linear Mixed Models

  • Conference paper
Advances in Statistical Models for Data Analysis

Abstract

A gradient-like statistic, recently introduced as an influence measure, has been proven to work well in large samples, thanks to its asymptotic properties. In this work, small-scale simulation schemes are used to further investigate the performance of this diagnostic measure in terms of its concordance with the main influence measures used for outlier identification. The simulation studies are based on generalized linear mixed models (GLMMs).

References

  1. Bates, D., Maechler, M., Bolker, B.: lme4: Linear mixed-effects models using S4 classes. R package version 0.999999-2. http://CRAN.R-project.org/package=lme4 (2013)

  2. Broström, G., Holmberg, H.: glmmML: Generalized linear models with clustering. R package version 0.82-1. http://CRAN.R-project.org/package=glmmML (2011)

  3. Cook, R.D.: Detection of influential observations in linear regression. Technometrics 19, 15–18 (1977)

  4. Cook, R.D.: Assessment of local influence. J. R. Stat. Soc. B Met. 48(2), 133–169 (1986)

  5. Cook, R.D., Weisberg, S.: Residuals and Influence in Regression. Chapman and Hall, London (1982)

  6. Enea, M., Plaia, A.: Influence diagnostics for meta-analysis of individual patient data using generalized linear mixed models. In: Vicari, D., Okada, A., Ragozini, G., Weihs, C. (eds.) Analysis and Modeling of Complex Data in Behavioral and Social Sciences. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, New York (2014)

  7. Fahrmeir, L., Tutz, G.: Multivariate Statistical Modelling Based on Generalized Linear Models. Springer, New York (1994)

  8. Lemonte, A.J.: On the gradient statistic under model misspecification. Stat. Prob. Lett. 83, 390–398 (2013)

  9. McCulloch, C.E.: Maximum likelihood algorithm for generalized linear mixed models: applications to clustered data. J. Am. Stat. Assoc. 92, 162–170 (1997)

  10. McCulloch, C.E., Searle, S.R.: Generalized, Linear, and Mixed Models. Wiley, New York (2001)

  11. Ouwens, M.J.N.M., Tan, F.E.S., Berger, M.P.F.: Local influence to detect influential data structures for generalized linear mixed models. Biometrics 57(4), 1166–1172 (2001)

  12. R Core Team: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna (2012) [ISBN 3-900051-07-0]

  13. Terrell, G.R.: The gradient statistic. Comput. Sci. Stat. 34, 206–215 (2002)

  14. Xiang, L., Tse, S.-K., Lee, A.H.: Influence diagnostics for generalized linear mixed models: applications to clustered data. Comput. Stat. Data Anal. 40, 759–774 (2002)

  15. Xu, L., Lee, S., Poon, W.: Deletion measures for generalized linear mixed models. Comput. Stat. Data Anal. 51, 1131–1146 (2006)

  16. Zhu, H., Lee, S., Wei, B., Zhou, J.: Case-deletion measures for models with incomplete data. Biometrika 88(3), 727–737 (2001)


Corresponding author

Correspondence to Marco Enea.

Appendix

The following R [12] code performs cluster-level influence diagnostics on an object returned by glmer for binomial or Poisson random-intercept models. Currently, the code works under lme4 version 0.999999-2. At the time of writing, package lme4 had been updated to version 1.0-5, but some bugs concerning the conditional variances of the random effects had not yet been fixed. Further, as explained by Enea and Plaia [6], the information matrix, which is needed to compute the diagnostics C_i and CD_i, can be obtained from package glmmML [2], which uses the same estimation method and provides the same estimates as lme4. A more complete version of the code, which allows diagnostics at the observation level, for random intercept/slope models and for specified parameter subsets, is not reported here because of space limits and can be requested from the authors.
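Reading the function below line by line, the four cluster-level quantities it returns can be summarised as follows. The notation is an assumption introduced here for illustration and may differ from the paper's: \psi = (\beta, \delta) collects the fixed effects and the random-intercept variance, \hat\psi_{(i)} is the estimate obtained with cluster i deleted, \Delta_i is the row of the matrix Delta for cluster i (its score-type contribution), \ell is the log-likelihood of the full data, and H is the information matrix passed as argument H.

```latex
\begin{aligned}
GD_i &= 2\,\bigl|\Delta_i^{\top}\bigl(\hat\psi - \hat\psi_{(i)}\bigr)\bigr|, \\
LD_i &= 2\,\bigl|\ell(\hat\psi) - \ell(\hat\psi_{(i)})\bigr|, \\
C_i  &= 2\,\bigl|\Delta_i^{\top} H^{-1} \Delta_i\bigr|, \\
CD_i &= \bigl(\hat\psi - \hat\psi_{(i)}\bigr)^{\top} H\,\bigl(\hat\psi - \hat\psi_{(i)}\bigr).
\end{aligned}
```

Only GD_i and LD_i can be computed from the glmer fit alone; C_i and CD_i are returned as NULL unless H is supplied.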

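influence.mer <- function(obj,H=NULL){
  # obj: fit returned by glmer (lme4 0.999999-2), fitted with x=TRUE
  # H:   information matrix (optional; needed only for Ci and CDi)
  old <- options(warn=-1)            # silence warnings during the refits
  on.exit(options(old))              # restore the user's setting on exit
  parf <- obj@fixef                  # fixed-effects estimates
  nparf <- length(parf)
  oneresp <- is.null(ncol(obj@frame[[1]]))  # FALSE for a cbind() binomial response
  Y <- (if (oneresp) obj@frame[[1]] else obj@frame[[1]][,1])
  m <- if(oneresp) rep(1,length(Y)) else rowSums(obj@frame[[1]])  # number of trials
  nobs <- as.vector(table(obj@flist[,ncol(obj@flist)]))
  iclus <- obj@flist[,ncol(obj@flist)]      # cluster membership of each observation
  clus <- levels(iclus)
  nclus <- length(clus)
  logLik1 <- logLik(obj)[1]          # log-likelihood at the full-data estimates
  delta <- VarCorr(obj)[[1]]         # random-intercept variance
  names(delta) <- "delta"
  psi <- c(parf,delta)
  bi <- ranef(obj,postVar=TRUE)[[1]] # predicted random effects with conditional variances
  Di <- c()
  # contribution of each cluster for the variance component
  for (i in 1:nclus) Di[i] <- (attributes(bi)$postVar[,,i]+bi[i,]^2)/(2*delta^2)
  E <- Y-fitted(obj)*m               # raw residuals
  logLik2 <- c()
  offset <- if (length(obj@offset)>0) exp(obj@offset) else rep(1,length(Y))  # not used below
  sDelta <- matrix(,nclus,nparf)
  Dpsi <- matrix(,nclus,length(psi))
  for (j in 1:nclus){
    yes <- (iclus==clus[j])
    sDelta[j,] <- crossprod(obj@X[yes,],E[yes])   # contribution of cluster j for the fixed effects
    newobj <- update(obj,data=obj@frame[!yes,])   # refit without cluster j
    deltai <- VarCorr(newobj)[[1]]
    Dpsi[j,] <- psi-c(fixef(newobj),deltai)       # parameter displacement
    # log-likelihood of the full data at the deleted-cluster estimates (no iterations)
    logLik2[j] <- logLik(update(obj,data=obj@frame,start=list(ST=newobj@ST,
                     fixef=fixef(newobj)),control=list(maxFN=0,maxIter=0)))[1]
  }
  Delta <- cbind(sDelta,Di)
  DD <- Delta*Dpsi
  sGD <- 2*abs(DD)                   # parameter-wise terms (computed, not returned)
  GD <- 2*abs(rowSums(DD))           # gradient-like influence measure
  colnames(sGD) <- colnames(Delta) <- colnames(Dpsi) <- names(psi)
  Ci <- if (!is.null(H)) 2*diag(abs(Delta%*%solve(H)%*%t(Delta))) else NULL
  CDi <- if (!is.null(H)) diag(Dpsi%*%H%*%t(Dpsi)) else NULL
  return(list("GDi"=GD,"LDi"=2*abs(logLik1-logLik2),"Ci"=Ci,"CDi"=CDi))
}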

library(lme4)     # version 0.999999-2 is required, as noted above
library(glmmML)
library(mvtnorm)

simul.pois <- function(j,n,param){ # create an artificial data set
  # param[1:2]: fixed intercept and slope; param[3:6]: 2x2 random-effects covariance
  pa <- as.vector(rmvnorm(j,c(0,0),matrix(param[3:6],2,2)))
  clus <- kronecker(1:j,rep(1,n))
  x <- rep((1:n)/n ,j)
  resp <- rpois(n*j,lambda=exp(param[1]+param[2]*x+cbind(kronecker(diag(j),
            rep(1,n)),kronecker(diag(j),(1:n/n)))%*%pa ))
  data.frame(clus,x,resp)
}

a <- c(1,0.5,0.5,1) # for variance/covariance components
dad <- simul.pois(j=10,n=30,param=c(1,-1,a))
m0 <- glmer(resp ~ x + (1|clus),data=dad, family=poisson, x=TRUE)
m0b <- glmmML(resp ~ x, cluster=clus,data=dad, family=poisson)
r0 <- influence.mer(obj=m0,H=solve(m0b$variance)) # H: information matrix from glmmML
r01 <- r0
r01$Ci <- r01$Ci/2
r01$GDi <- r01$GDi/2 # GDi is the gradient-like influence measure
r01 <- do.call("cbind",r01)
matplot(r01,lty=1:4,type="l",col=1:4,ylab="influence",xlab="cluster index")
legend("topright",c("GDi/2","LDi","Ci/2","CDi"),lty=1:4,col=1:4)
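Because the code above depends on the internals of lme4 0.999999-2, the deletion scheme itself can be easier to follow on a simpler model. The following is a minimal sketch, assuming an ordinary Poisson GLM fitted with glm (no random effects, so there is no variance component), that mirrors how influence.mer builds the four measures: refit without each cluster, take the parameter displacement, and combine it with the cluster's score contribution and the information matrix. The simulated data, the seed, and the names dat, fit, loglik_at and infl are illustrative assumptions and are not taken from the paper.

```r
# Cluster-deletion influence for a plain Poisson GLM (illustration only;
# the paper's setting is a GLMM with a random intercept).
set.seed(1)
n_clus <- 8
n_per  <- 20
dat <- data.frame(
  clus = factor(rep(seq_len(n_clus), each = n_per)),
  x    = rep(seq_len(n_per) / n_per, times = n_clus)
)
dat$resp <- rpois(nrow(dat), lambda = exp(1 - dat$x))

fit  <- glm(resp ~ x, family = poisson, data = dat)
X    <- model.matrix(fit)
beta <- coef(fit)
H    <- solve(vcov(fit))          # information matrix at the full-data estimate
res  <- dat$resp - fitted(fit)    # raw residuals y - mu

# Poisson log-likelihood of the full data at a given coefficient vector
loglik_at <- function(b) sum(dpois(dat$resp, exp(drop(X %*% b)), log = TRUE))

infl <- t(sapply(levels(dat$clus), function(k) {
  keep  <- dat$clus != k
  b_del <- coef(update(fit, data = dat[keep, ]))   # estimate without cluster k
  d_psi <- beta - b_del                            # parameter displacement
  # score contribution of cluster k, evaluated at the full-data estimate
  score <- drop(crossprod(X[!keep, , drop = FALSE], res[!keep]))
  c(GDi = 2 * abs(sum(score * d_psi)),
    LDi = 2 * abs(loglik_at(beta) - loglik_at(b_del)),
    Ci  = 2 * abs(drop(score %*% solve(H, score))),
    CDi = drop(d_psi %*% H %*% d_psi))
}))
round(infl, 4)
```

Each row of infl corresponds to one deleted cluster, so the columns can be plotted against the cluster index with matplot in the same way as r01 above. In the mixed-model case the score and displacement vectors gain one extra component for the variance of the random intercept.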

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Enea, M., Plaia, A. (2015). The Performance of the Gradient-Like Influence Measure in Generalized Linear Mixed Models. In: Morlini, I., Minerva, T., Vichi, M. (eds) Advances in Statistical Models for Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-17377-1_12
