Part of the book series: ICSA Book Series in Statistics ((ICSABSS,volume 9))

Abstract

Many fields of study use longitudinal datasets, which usually consist of repeated measurements of a response variable, often accompanied by a set of covariates, for each subject or unit. Such datasets are challenging to analyze because repeated measurements on the same subject are inherently correlated. For example, one would expect correlation among a patient’s health-status measurements over time or among a student’s performance scores over time. When the responses are correlated, the underlying joint distribution cannot readily be obtained; hence, there is no closed-form joint likelihood function of the kind available for the standard logistic regression model. One remedy, explored in this chapter, is to fit a generalized estimating equations (GEE) logistic regression model to the data. The chapter addresses repeated measures of the sampling unit, showing how the GEE method accommodates missing values within a subject without discarding all of that subject’s data, and how time-varying predictors can appear in the model. The method requires a large number of subjects and provides estimates of the marginal model parameters. We fit this model in SAS, SPSS, and R, basing our work on the mean-variance relationship methods of Zeger and Liang (Biometrics 42:121–130, 1986a, Biometrics 73:13–22, 1986b) and Liang and Zeger (Biometrika 73:13–22, 1986).
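
For orientation, the marginal model and estimating equations behind a GEE logistic regression can be sketched as follows. The notation is an assumption that follows the general form in Liang and Zeger (1986), not necessarily the chapter's own symbols: subject i contributes n_i binary responses y_ij with covariate vectors x_ij, and R_i(α) is a working correlation matrix chosen by the analyst.

```latex
% Marginal mean model (logit link) for response j of subject i
\operatorname{logit}(\mu_{ij}) = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta},
\qquad \mu_{ij} = \mathrm{E}(y_{ij} \mid \mathbf{x}_{ij}),
\qquad \mathrm{Var}(y_{ij}) = \mu_{ij}(1 - \mu_{ij})

% Working covariance built from the mean-variance relationship
% and a working correlation matrix R_i(alpha)
\mathbf{V}_i = \mathbf{A}_i^{1/2}\,\mathbf{R}_i(\alpha)\,\mathbf{A}_i^{1/2},
\qquad \mathbf{A}_i = \operatorname{diag}\{\mu_{i1}(1-\mu_{i1}), \dots, \mu_{in_i}(1-\mu_{in_i})\}

% Generalized estimating equations solved for beta over K subjects
\sum_{i=1}^{K} \mathbf{D}_i^{\top}\,\mathbf{V}_i^{-1}\,(\mathbf{y}_i - \boldsymbol{\mu}_i) = \mathbf{0},
\qquad \mathbf{D}_i = \frac{\partial \boldsymbol{\mu}_i}{\partial \boldsymbol{\beta}^{\top}}
```

Because each subject enters the sum through its own n_i observed responses, a subject with some missing measurements still contributes its remaining observations, and no joint likelihood for the correlated responses is needed.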

References

  • Ballinger, G. A. (2004). Using generalized estimating equations for longitudinal data analysis. Organizational Research Methods, 7, 127–150.

  • Breslow, N. E. (1989). Score tests in overdispersed GLMs. In A. Decarli, B. J. Francis, R. Gilchrist, & G. U. H. Seeber (Eds.), Workshop on statistical modeling (pp. 64–74). New York: Springer.

  • Davidian, M., & Carroll, R. J. (1987). Variance function estimation. Journal of the American Statistical Association, 82, 1079–1091.

  • Diggle, P. J., Liang, K. Y., & Zeger, S. L. (1994). Analysis of longitudinal data. New York: Oxford University Press.

  • Galbraith, S., Daniel, J. A., & Vissel, B. (2010). A study of clustered data and approaches to its analysis. Journal of Neuroscience, 30, 10601–10608.

  • Gibbons, R. D., & Hedeker, D. H. (1997). Random effects probit and logistic regression models for three-level data. Biometrics, 53, 1527–1537.

  • Hardin, J. W., & Hilbe, J. M. (2003). Generalized estimating equations. New York: Wiley.

  • Hu, F. B., Goldberg, J., Hedeker, D., Flay, B. R., & Pentz, M. A. (1998). Comparison of population-averaged and subject-specific approaches for analyzing repeated binary outcomes. American Journal of Epidemiology, 147(7), 694–703.

  • Liang, K. Y., & Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika, 73(1), 13–22.

  • McCullagh, P., & Nelder, J. (1989). Generalized linear models (2nd ed.). London: Chapman and Hall.

  • Pan, W., & Connett, J. E. (2002). Selecting the working correlation structure in generalized estimating equations with application to the lung health study. Statistica Sinica, 12(2), 475–490.

  • Sullivan Pepe, M., & Anderson, G. L. (1994). A cautionary note on inference for marginal regression models with longitudinal data and general correlated response data. Communications in Statistics—Simulation and Computation, 23(4), 939–951.

  • Wilson, P. M., & Wilson, J. R. (1992). Environmental influences on adolescent educational aspirations: A logistic transform model. Youth & Society, 24(1), 52–70.

  • Zeger, S. L., & Liang, K. Y. (1986a). Longitudinal data analysis for discrete and continuous outcomes. Biometrics, 42, 121–130.

  • Zeger, S. L., & Liang, K. Y. (1986b). Longitudinal data analysis using generalized linear models. Biometrics, 73, 13–22.

  • Zeger, S. L., & Liang, K. Y. (1992). An overview of methods for the analysis of longitudinal data. Statistics in Medicine, 11(14–15), 1825–1839.

1 Electronic Supplementary Material

Below is the link to the electronic supplementary material.

Ch6 Pred Prob (XLSX 333 KB)

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Wilson, J.R., Lorenz, K.A. (2015). Generalized Estimating Equations Logistic Regression. In: Modeling Binary Correlated Responses using SAS, SPSS and R. ICSA Book Series in Statistics, vol 9. Springer, Cham. https://doi.org/10.1007/978-3-319-23805-0_6
