Statistical Modelling of Partially Observed Data Using Multiple Imputation: Principles and Practice

Modern Methods for Epidemiology

Abstract

Missing data are ubiquitous in experimental and observational epidemiological research. Despite a steady flow of theoretical work in this area from the mid-1970s onwards, recent studies have shown that the way partially observed data are reported and analysed in experimental research falls far short of best practice (Wood et al. 2004; Chan and Altman 2005; Sterne et al. 2009). The aim of this chapter is thus to present an accessible review of the issues raised by missing data, together with the advantages and disadvantages of different approaches to the analysis.

References

  • Blatchford, P., Goldstein, H., Martin, C., & Browne, W. (2002). A study of class size effects in English school reception year classes. British Educational Research Journal, 28, 169–185.

  • Cao, W., Tsiatis, A. A., & Davidian, M. (2009). Improving efficiency and robustness of the doubly robust estimator for a population mean with incomplete data. Biometrika, 96, 723–734.

  • Carpenter, J. R., & Kenward, M. G. (2008). Missing data in clinical trials: A practical guide. Birmingham: National Health Service Co-ordinating Centre for Research Methodology. Freely downloadable from www.missingdata.org.uk. Accessed 25 Jan 2012.

  • Carpenter, J. R., & Plewis, I. (2011). Coming to terms with non-response in longitudinal studies. In M. Williams & P. Vogt (Eds.), SAGE handbook of methodological innovation. London: Sage.

  • Carpenter, J. R., Kenward, M. G., & Vansteelandt, S. (2006). A comparison of multiple imputation and inverse probability weighting for analyses with missing data. Journal of the Royal Statistical Society: Series A (Statistics in Society), 169, 571–584.

  • Carpenter, J. R., Kenward, M. G., & White, I. R. (2007). Sensitivity analysis after multiple imputation under missing at random: A weighting approach. Statistical Methods in Medical Research, 16, 259–275.

  • Carpenter, J. R., Roger, J. H., & Kenward, M. G. (2012). Analysis of longitudinal trials with missing data: A framework for relevant, accessible assumptions, and inference via multiple imputation (Submitted).

  • Chan, A., & Altman, D. G. (2005). Epidemiology and reporting of randomised trials published in PubMed journals. The Lancet, 365, 1159–1162.

  • Clayton, D., Spiegelhalter, D., Dunn, G., & Pickles, A. (1998). Analysis of longitudinal binary data from multi-phase sampling (with discussion). Journal of the Royal Statistical Society, Series B (Statistical Methodology), 60, 71–87.

  • Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm (with discussion). Journal of the Royal Statistical Society, Series B (Statistical Methodology), 39, 1–38.

  • Goldstein, H., Carpenter, J. R., Kenward, M. G., & Levin, K. (2009). Multilevel models with multivariate mixed response types. Statistical Modelling, 9, 173–197.

  • Kang, J. D. Y., & Schafer, J. L. (2007). Demystifying double robustness: A comparison of alternative strategies for estimating a population mean from incomplete data (with discussion). Statistical Science, 22, 523–539.

  • Kenward, M. G., & Carpenter, J. R. (2007). Multiple imputation: Current perspectives. Statistical Methods in Medical Research, 16, 199–218.

  • Klebanoff, M. A., & Cole, S. R. (2008). Use of multiple imputation in the epidemiologic literature. American Journal of Epidemiology, 168, 355–357.

  • Little, R. J. A., & Rubin, D. B. (2002). Statistical analysis with missing data (2nd ed.). Chichester: Wiley.

  • Louis, T. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society, Series B, 44, 226–233.

  • Orchard, T., & Woodbury, M. (1972). A missing information principle: Theory and applications. In L. M. Le Cam, J. Neyman, & E. L. Scott (Eds.), Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability: Vol. 1 (pp. 697–715). Berkeley: University of California Press.

  • Royston, P. (2007). Multiple imputation of missing values: Further update of ice with emphasis on interval censoring. The Stata Journal, 7, 445–464.

  • Rubin, D. B. (1976). Inference and missing data. Biometrika, 63, 581–592.

  • Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York: Wiley.

  • Rubin, D. B. (1996). Multiple imputation after 18 years. Journal of the American Statistical Association, 91, 473–490.

  • Schafer, J. L. (1997). Analysis of incomplete multivariate data. London: Chapman and Hall.

  • Spratt, M., Sterne, J. A. C., Tilling, K., Carpenter, J. R., & Carlin, J. B. (2010). Strategies for multiple imputation in longitudinal studies. American Journal of Epidemiology, 172, 478–487.

  • Sterne, J. A. C., White, I. R., Carlin, J. B., Spratt, M., Royston, P., Kenward, M. G., Wood, A. M., & Carpenter, J. R. (2009). Multiple imputation for missing data in epidemiological and clinical research: Potential and pitfalls. British Medical Journal, 339, 157–160.

  • van Buuren, S., Boshuizen, H. C., & Knook, D. L. (1999). Multiple imputation of missing blood pressure covariates in survival analysis. Statistics in Medicine, 18, 681–694.

  • van Buuren, S., Brand, J. P. L., Groothuis-Oudshoorn, C. G. M., & Rubin, D. B. (2006). Fully conditional specification in multivariate imputation. Journal of Statistical Computation and Simulation, 76, 1049–1064.

  • Vansteelandt, S., Carpenter, J. R., & Kenward, M. G. (2010). Analysis of incomplete data using inverse probability weighting and doubly robust estimators. Methodology, 6, 37–48.

  • White, I. R., & Royston, P. (2009). Imputing missing covariate values for the Cox model. Statistics in Medicine, 28, 1982–1998.

  • Wood, A. M., White, I. R., & Thompson, S. G. (2004). Are missing outcome data adequately handled? A review of published randomized controlled trials in major medical journals. Clinical Trials, 1, 368–376.

Acknowledgements

James Carpenter is funded by ESRC research fellowship RES-063-27-0257. We are grateful to Peter Blatchford for permission to use the class size data.

Author information

Corresponding author

Correspondence to James R. Carpenter.

Copyright information

© 2012 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Carpenter, J.R., Goldstein, H., Kenward, M.G. (2012). Statistical Modelling of Partially Observed Data Using Multiple Imputation: Principles and Practice. In: Tu, YK., Greenwood, D. (eds) Modern Methods for Epidemiology. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-3024-3_2
