Techniques for Analyzing Incomplete Data in Public Health Research

  • Chapter
Innovative Statistical Methods for Public Health Data

Part of the book series: ICSA Book Series in Statistics (ICSABSS)

Abstract

Statistical inference from incomplete data has been an obstacle in numerous areas of research, and public health studies are no exception. Because studies in this field are often survey-based and can center on sensitive personal information, they are susceptible to missing records. This chapter discusses the causes of incomplete data and the problems it creates, and recommends techniques for handling it through multiple imputation.

Acknowledgements

The authors wish to thank Dr. Seth Kalichman for generously sharing his data. This project was supported in part by the National Institute of Mental Health, Award Number K01MH087219. The content is solely the responsibility of the authors, and it does not represent the official views of the National Institute of Mental Health or the National Institutes of Health.

Author information

Corresponding author

Correspondence to Ofer Harel.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Pare, V., Harel, O. (2015). Techniques for Analyzing Incomplete Data in Public Health Research. In: Chen, DG., Wilson, J. (eds) Innovative Statistical Methods for Public Health Data. ICSA Book Series in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-18536-1_8
