Abstract
Missing data are ubiquitous in experimental and observational epidemiological research. Despite a steady flow of theoretical work in this area from the mid-1970s onwards, recent studies have shown that the way partially observed data are reported and analysed in experimental research falls far short of best practice (Wood et al. 2004; Chan and Altman 2005; Sterne et al. 2009). The aim of this chapter is thus to present an accessible review of the issues raised by missing data, together with the advantages and disadvantages of different approaches to the analysis.
References
Blatchford, P., Goldstein, H., Martin, C., & Browne, W. (2002). A study of class size effects in English school reception year classes. British Educational Research Journal, 28, 169–185.
Cao, W., Tsiatis, A. A., & Davidian, M. (2009). Improving efficiency and robustness of the doubly robust estimator for a population mean with incomplete data. Biometrika, 96, 723–734.
Carpenter, J. R., & Kenward, M. G. (2008). Missing data in clinical trials — A practical guide. Birmingham: National Health Service Co-ordinating Centre for Research Methodology. Freely downloadable from www.missingdata.org.uk. Accessed 25 Jan 2012.
Carpenter, J. R., & Plewis, I. (2011). Coming to terms with non-response in longitudinal studies. In M. Williams & P. Vogt (Eds.), SAGE handbook of methodological innovation. London: Sage.
Carpenter, J. R., Kenward, M. G., & Vansteelandt, S. (2006). A comparison of multiple imputation and inverse probability weighting for analyses with missing data. Journal of the Royal Statistical Society: Series A (Statistics in Society), 169, 571–584.
Carpenter, J. R., Kenward, M. G., & White, I. R. (2007). Sensitivity analysis after multiple imputation under missing at random — A weighting approach. Statistical Methods in Medical Research, 16, 259–275.
Carpenter, J. R., Roger, J. H., & Kenward, M. G. (2012). Analysis of longitudinal trials with missing data: A framework for relevant, accessible assumptions, and inference via multiple imputation (Submitted).
Chan, A., & Altman, D. G. (2005). Epidemiology and reporting of randomised trials published in PubMed journals. The Lancet, 365, 1159–1162.
Clayton, D., Spiegelhalter, D., Dunn, G., & Pickles, A. (1998). Analysis of longitudinal binary data from multi-phase sampling (with discussion). Journal of the Royal Statistical Society, Series B (Statistical Methodology), 60, 71–87.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm (with discussion). Journal of the Royal Statistical Society, Series B (Statistical Methodology), 39, 1–38.
Goldstein, H., Carpenter, J. R., Kenward, M. G., & Levin, K. (2009). Multilevel models with multivariate mixed response types. Statistical Modelling, 9, 173–197.
Kang, J. D. Y., & Schafer, J. L. (2007). Demystifying double robustness: A comparison of alternative strategies for estimating a population mean from incomplete data (with discussion). Statistical Science, 22, 523–539.
Kenward, M. G., & Carpenter, J. R. (2007). Multiple imputation: Current perspectives. Statistical Methods in Medical Research, 16, 199–218.
Klebanoff, M. A., & Cole, S. R. (2008). Use of multiple imputation in the epidemiologic literature. American Journal of Epidemiology, 168, 355–357.
Little, R. J. A., & Rubin, D. B. (2002). Statistical analysis with missing data (2nd ed.). Chichester: Wiley.
Louis, T. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society, Series B, 44, 226–233.
Orchard, T., & Woodbury, M. (1972). A missing information principle: Theory and applications. In L. M. Le Cam, J. Neyman, & E. L. Scott (Eds.), Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability: Vol. 1 (pp. 697–715). Berkeley: University of California Press.
Royston, P. (2007). Multiple imputation of missing values: Further update of ice with emphasis on interval censoring. The Stata Journal, 7, 445–464.
Rubin, D. B. (1976). Inference and missing data. Biometrika, 63, 581–592.
Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York: Wiley.
Rubin, D. B. (1996). Multiple imputation after 18 years. Journal of the American Statistical Association, 91, 473–490.
Schafer, J. L. (1997). Analysis of incomplete multivariate data. London: Chapman and Hall.
Spratt, M., Sterne, J. A. C., Tilling, K., Carpenter, J. R., & Carlin, J. B. (2010). Strategies for multiple imputation in longitudinal studies. American Journal of Epidemiology, 172, 478–487.
Sterne, J. A. C., White, I. R., Carlin, J. B., Spratt, M., Royston, P., Kenward, M. G., Wood, A. M., & Carpenter, J. R. (2009). Multiple imputation for missing data in epidemiological and clinical research: Potential and pitfalls. British Medical Journal, 339, 157–160.
van Buuren, S., Boshuizen, H. C., & Knook, D. L. (1999). Multiple imputation of missing blood pressure covariates in survival analysis. Statistics in Medicine, 18, 681–694.
van Buuren, S., Brand, J. P. L., Groothuis-Oudshoorn, C. G. M., & Rubin, D. B. (2006). Fully conditional specification in multivariate imputation. Journal of Statistical Computation and Simulation, 76, 1049–1064.
Vansteelandt, S., Carpenter, J. R., & Kenward, M. G. (2010). Analysis of incomplete data using inverse probability weighting and doubly robust estimators. Methodology, 6, 37–48.
White, I. R., & Royston, P. (2009). Imputing missing covariate values for the Cox model. Statistics in Medicine, 28, 1982–1998.
Wood, A. M., White, I. R., & Thompson, S. G. (2004). Are missing outcome data adequately handled? A review of published randomized controlled trials in major medical journals. Clinical Trials, 1, 368–376.
Acknowledgements
James Carpenter is funded by ESRC research fellowship RES-063-27-0257. We are grateful to Peter Blatchford for permission to use the class size data.
Copyright information
© 2012 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Carpenter, J.R., Goldstein, H., Kenward, M.G. (2012). Statistical Modelling of Partially Observed Data Using Multiple Imputation: Principles and Practice. In: Tu, YK., Greenwood, D. (eds) Modern Methods for Epidemiology. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-3024-3_2
DOI: https://doi.org/10.1007/978-94-007-3024-3_2
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-007-3023-6
Online ISBN: 978-94-007-3024-3
eBook Packages: Biomedical and Life Sciences (R0)