Statistical Methodologies for Dealing with Incomplete Longitudinal Outcomes Due to Dropout Missing at Random

  • A. Satty
  • H. Mwambi
  • G. Molenberghs
Part of the ICSA Book Series in Statistics book series (ICSABSS)


Longitudinal studies repeatedly measure an outcome of interest and covariates over a sequence of time points. Such studies play a vital role in many scientific disciplines, including medicine, epidemiology, ecology and public health. However, the resulting data are often incomplete because of dropout or intermittent missingness, which can seriously bias the analysis. In this chapter we confine our attention to the dropout pattern. Dropout raises a practical question for researchers: which methods can be used to handle it while limiting bias in the results? This chapter considers key modelling techniques and basic issues in statistical data analysis for addressing dropout in longitudinal studies. The main objective is to provide an overview of the issues and methodologies that arise when subjects drop out, for both continuous and discrete outcomes. The chapter focuses on methods that are valid under the missing at random (MAR) mechanism, and the missingness patterns of interest are monotone; in the context of longitudinal data these are referred to as dropout. The fundamental concepts of dropout patterns and mechanisms are discussed. The techniques investigated for handling dropout are: (1) multiple imputation (MI); (2) likelihood-based methods, in particular generalized linear mixed models (GLMMs); (3) multiple imputation based generalized estimating equations (MI-GEE); and (4) weighted generalized estimating equations (WGEE). For each method, the key assumptions underlying its application are presented. The existing literature examining the effectiveness of these methods in the analysis of incomplete longitudinal data is discussed in detail. Two application examples are presented to study the potential strengths and weaknesses of the methods under an MAR dropout mechanism.
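As a rough illustration of how multiple imputation yields a single inference, the sketch below pools completed-data results with the standard combining rules (Rubin's rules). It is a minimal example under stated assumptions, not the chapter's own implementation: the function name `pool_rubin` and the numbers in the usage lines are hypothetical, and the imputation step that would produce the per-dataset estimates is assumed to have been carried out already.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m completed-data results with Rubin's combining rules.

    estimates: point estimates of one parameter, one per imputed dataset
    variances: the corresponding squared standard errors
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)       # pooled point estimate
    within = mean(variances)      # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical results from m = 3 imputed datasets (illustrative values only).
pooled_estimate, total_variance = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled_estimate, total_variance)
```

The total variance adds the between-imputation spread, inflated by 1 + 1/m, to the average within-imputation variance, so uncertainty due to the missing values is carried into the pooled standard error.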


Multiple imputation GEE; Weighted GEE; Generalized linear mixed model (GLMM); Likelihood analysis; Incomplete longitudinal outcome; Missing at random (MAR); Dropout



Copyright information

© Springer Nature Singapore Pte Ltd. 2017

Authors and Affiliations

  1. Faculty of Mathematical Sciences and Statistics, Alneelain University, Khartoum, Sudan
  2. School of Mathematics, Statistics and Computer Science, University of KwaZulu-Natal, Pietermaritzburg, South Africa
  3. I-BioStat, Universiteit Hasselt & KU Leuven, Hasselt, Belgium
