A guide to missing data for the pediatric nephrologist
Missing data are a common and important source of bias in clinical research. Readers should be alert to missing data and consider their impact when reading studies. Beyond preventing missing data in the first place through good study design and conduct, several strategies are available for handling data that contain missing observations. Complete case analysis is often biased unless data are missing completely at random. Better methods of handling missing data include multiple imputation and models using likelihood-based estimation. With advancing computing power and modern statistical software, these methods are within the reach of clinician-researchers working under the guidance of a biostatistician. As clinicians reading papers, we need to keep updating our understanding of statistical methods, so that we understand the limitations of these techniques and can critically interpret the literature.
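The claim that complete case analysis is biased unless data are missing completely at random can be illustrated with a small simulation. The sketch below is not taken from the review: the variable names, sample size, and missingness probabilities are all assumptions chosen for illustration. It compares the complete case mean of an outcome when missingness is unrelated to the data (MCAR) with the case where missingness depends on an observed covariate (missing at random, MAR). The regression-adjusted estimate at the end is a deliberately simplified stand-in for the model-based methods mentioned above. It is not multiple imputation.

```python
import random
import statistics

rng = random.Random(2024)      # fixed seed so the sketch is reproducible
n = 20_000

# Hypothetical data: covariate x is always observed; outcome y depends on x.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [xi + rng.gauss(0, 1) for xi in x]   # the true mean of y is 0

# MCAR: every y has the same 50% chance of being observed.
seen_mcar = [rng.random() < 0.5 for _ in range(n)]

# MAR: y is observed less often when x is high (assumed probabilities).
seen_mar = [rng.random() < (0.2 if xi > 0 else 0.8) for xi in x]


def complete_case_mean(y, seen):
    """Mean of y using only the rows where y was observed."""
    return statistics.fmean(yi for yi, s in zip(y, seen) if s)


def regression_adjusted_mean(x, y, seen):
    """Fit y ~ x on complete cases, then average predictions over all x.

    A simplified, single-model adjustment that uses the covariate
    driving the missingness; it is not multiple imputation.
    """
    xo = [xi for xi, s in zip(x, seen) if s]
    yo = [yi for yi, s in zip(y, seen) if s]
    mx, my = statistics.fmean(xo), statistics.fmean(yo)
    slope = (sum((a - mx) * (b - my) for a, b in zip(xo, yo))
             / sum((a - mx) ** 2 for a in xo))
    intercept = my - slope * mx
    return intercept + slope * statistics.fmean(x)


cc_mcar = complete_case_mean(y, seen_mcar)
cc_mar = complete_case_mean(y, seen_mar)
adj_mar = regression_adjusted_mean(x, y, seen_mar)

print(f"complete case mean, MCAR:      {cc_mcar:.3f}")
print(f"complete case mean, MAR:       {cc_mar:.3f}")
print(f"regression-adjusted mean, MAR: {adj_mar:.3f}")
```

Under these assumptions the MCAR complete case mean should sit close to the true value of 0, the MAR complete case mean should be clearly negative, and the regression-adjusted estimate should return close to 0 because it uses the covariate that explains the missingness.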
Keywords: Multiple imputation, Statistics, Epidemiology, Nephrology
This review is supported by the National Health and Medical Research Council (APP1092957 program grant including ATP; GNT1114218 to NL).
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.