A guide to missing data for the pediatric nephrologist

  • Nicholas G. Larkins
  • Jonathan C. Craig
  • Armando Teixeira-Pinto
Educational Review

Abstract

Missing data are a common and important source of bias in clinical research. Readers should be alert to missing data and consider their impact when appraising studies. Beyond preventing missing data in the first place through good study design and conduct, several strategies are available for handling data that contain missing observations. Complete case analysis is often biased unless data are missing completely at random. Better methods of handling missing data include multiple imputation and models using likelihood-based estimation. With advancing computing power and modern statistical software, these methods are within the reach of clinician-researchers working under the guidance of a biostatistician. As clinicians reading papers, we need to keep updating our understanding of statistical methods so that we understand the limitations of these techniques and can critically interpret the literature.
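The claim that complete case analysis is often biased unless data are missing completely at random (MCAR) can be illustrated with a small simulation. The sketch below is not taken from the review: the toy model, the missingness probabilities, and all function names are assumptions chosen for illustration. It also uses a simple regression adjustment as a stand-in for the model-based approaches mentioned above; it is not multiple imputation and produces no valid standard errors.

```python
import random
import statistics

TRUE_MEAN = 2.0  # assumed population mean of the outcome y


def simulate(n=20000, seed=1):
    """Return rows of (x, y, missing_mcar, missing_mar) from a toy model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                  # covariate, always observed
        y = TRUE_MEAN + x + rng.gauss(0.0, 1.0)  # outcome, sometimes missing
        missing_mcar = rng.random() < 0.5        # unrelated to x or y
        # Missing at random: the chance of a missing outcome depends
        # only on the observed covariate x, not on y itself.
        missing_mar = rng.random() < (0.8 if x > 0 else 0.2)
        rows.append((x, y, missing_mcar, missing_mar))
    return rows


def complete_case_mean(rows, flag):
    """Mean of y among rows with an observed outcome (flag 2 = MCAR, 3 = MAR)."""
    return statistics.fmean(r[1] for r in rows if not r[flag])


def regression_adjusted_mean(rows, flag):
    """Fit y ~ x on complete cases, then average predictions over all rows."""
    obs = [(r[0], r[1]) for r in rows if not r[flag]]
    mx = statistics.fmean(x for x, _ in obs)
    my = statistics.fmean(y for _, y in obs)
    sxy = sum((x - mx) * (y - my) for x, y in obs)
    sxx = sum((x - mx) ** 2 for x, _ in obs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return statistics.fmean(intercept + slope * r[0] for r in rows)


if __name__ == "__main__":
    rows = simulate()
    print("true mean:                 ", TRUE_MEAN)
    print("complete case, MCAR:       ", round(complete_case_mean(rows, 2), 3))
    print("complete case, MAR:        ", round(complete_case_mean(rows, 3), 3))
    print("regression-adjusted, MAR:  ", round(regression_adjusted_mean(rows, 3), 3))
```

Under these assumed settings, the complete case mean should sit close to the true value when outcomes are MCAR, fall noticeably below it when missingness depends on x (about 1.5 in expectation), and return to roughly the true value once the observed covariate is used to adjust. The adjustment works here only because missingness depends on a variable that is fully observed and included in the model.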

Keywords

Multiple imputation, Statistics, Epidemiology, Nephrology

Notes

Funding source

This review is supported by the National Health and Medical Research Council (APP1092957 program grant including ATP; GNT1114218 to NL).

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Copyright information

© IPNA 2018

Authors and Affiliations

  1. Department of Nephrology, Princess Margaret Hospital, Subiaco, Australia
  2. Sydney School of Public Health, University of Sydney, Sydney, Australia
  3. Centre for Kidney Research, The Children’s Hospital at Westmead, Westmead, Australia
