An Empirical Comparison of Statistical Methods for Missing Data in Randomized, Double-Blind, Placebo-Controlled, Phase 3 Clinical Trials for Chronic Pain and Lipid-Lowering Products



Missing data are data that were not collected but that are meaningful for the statistical analysis because of their clinical relevance to a properly specified estimand. Although efforts to prevent or minimize missing data are common in clinical trials, missing data still occur in practice. Choosing a statistical method that handles missing data in a way that targets the specified estimand provides more reliable estimates of treatment effects.


We considered longitudinal clinical settings with different degrees of missing data and different treatment effects, and simulated different missing-data mechanisms using data from randomized, double-blind, placebo-controlled, phase 3 confirmatory clinical trials of approved drugs. We compared four commonly used statistical methods for handling missing data in clinical trials.
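As an illustration of what simulating different missing-data mechanisms can look like, the following is a minimal Python sketch, not the authors' code. It assumes a hypothetical two-arm trial with four visits, an outcome for which higher values are better, and monotone dropout driven by a logistic model. The function names (`simulate_trial`, `apply_dropout`) and all parameter values are assumptions made for this example only.

```python
import numpy as np

def simulate_trial(n_per_arm=200, n_visits=4, effect=1.0, seed=0):
    """Simulate complete outcomes for placebo (arm 0) and active (arm 1)."""
    rng = np.random.default_rng(seed)
    arm = np.repeat([0, 1], n_per_arm)
    subject = rng.normal(0.0, 1.0, size=arm.size)   # random intercept per subject
    visits = np.arange(1, n_visits + 1)
    # Assumed effect profile: grows linearly to `effect` at the last visit.
    mean = np.outer(arm, effect * visits / n_visits)
    noise = rng.normal(0.0, 1.0, size=mean.shape)
    return arm, mean + subject[:, None] + noise

def apply_dropout(y, mechanism, rate=0.15, seed=1):
    """Return a copy of y with monotone dropout (NaN from the dropout visit on)."""
    rng = np.random.default_rng(seed)
    y_obs = y.copy()
    n, n_visits = y.shape
    active = np.ones(n, dtype=bool)          # subjects still in the trial
    base = np.log(rate / (1.0 - rate))       # logit of the baseline dropout rate
    for j in range(1, n_visits):             # visit 1 is always observed
        if mechanism == "MCAR":
            score = np.zeros(n)              # dropout ignores the outcomes
        elif mechanism == "MAR":
            score = -y[:, j - 1]             # depends on the last observed value
        elif mechanism == "MNAR":
            score = -y[:, j]                 # depends on the current, unobserved value
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        p_drop = 1.0 / (1.0 + np.exp(-(base + score)))
        drop = active & (rng.random(n) < p_drop)
        y_obs[drop, j:] = np.nan
        active &= ~drop
    return y_obs

arm, y = simulate_trial()
for mech in ("MCAR", "MAR", "MNAR"):
    y_obs = apply_dropout(y, mech)
    print(mech, f"missing at final visit: {np.isnan(y_obs[:, -1]).mean():.2f}")
```

The only difference between the three mechanisms in this sketch is which value feeds the dropout probability: none, the previous observed value, or the current unobserved value.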


We found that when data are missing not at random (MNAR) and missing rates are higher, the mixed model for repeated measurements (MMRM) overestimates the treatment difference. In our studies, pattern-mixture model estimates were more conservative than MMRM estimates under MNAR assumptions, which are more realistic for missing data in clinical trials.
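To give some intuition for why a pattern-mixture analysis can be more conservative, here is a deliberately simplified Python sketch of a delta adjustment. It is not the pattern-mixture model or multiple-imputation procedure used in the study. It assumes a single endpoint for which higher values are better, replaces each missing value with its arm's observed mean (a stand-in for an MAR-based imputation), and then shifts the imputed values in the active arm down by a hypothetical penalty `delta`. The function name `delta_adjusted_difference` is an assumption made for this example.

```python
import numpy as np

def delta_adjusted_difference(y_active, y_placebo, delta):
    """Treatment difference after mean imputation with a delta shift.

    Missing values (NaN) are replaced by the arm's observed mean. In the
    active arm the imputed values are additionally lowered by `delta`,
    which encodes the assumption that dropouts did worse than completers.
    """
    def imputed_mean(y, shift):
        y = np.asarray(y, dtype=float)
        filled = y.copy()
        filled[np.isnan(y)] = np.nanmean(y) + shift
        return filled.mean()

    return imputed_mean(y_active, -delta) - imputed_mean(y_placebo, 0.0)

# Toy data: half of the active arm and a quarter of the placebo arm dropped out.
active = [2.0, 3.0, np.nan, np.nan]
placebo = [1.0, 1.0, 1.0, np.nan]

print(delta_adjusted_difference(active, placebo, delta=0.0))  # no penalty
print(delta_adjusted_difference(active, placebo, delta=1.0))  # penalized dropouts
```

In this simplified setting the estimated difference shrinks by `delta` times the fraction of active-arm subjects with missing data, so the adjustment matters more as the missing rate grows.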


We emphasize the importance of preventing missing data and of specifying the estimand in advance, based on the trial objectives. A properly specified estimand and an appropriate statistical method may be key to drawing value from clinical trial results despite missing data.

Figure 1.
Figure 2.



This work was supported in part by the Oak Ridge Institute for Science and Education (ORISE) summer fellowship program. This paper reflects the views of the authors and should not be construed to represent FDA’s views or policies. We would like to thank the anonymous reviewers for the careful reading of our manuscript and for providing us with critical and insightful comments.

Author information



Corresponding author

Correspondence to Yoonhee Kim PhD.

Ethics declarations

Conflict of interest

The authors declare that they have no conflicts of interest.

Cite this article

Gnang, J., Kim, Y., Ren, Y. et al. An Empirical Comparison of Statistical Methods for Missing Data in Randomized, Double-Blind, Placebo-Controlled, Phase 3 Clinical Trials for Chronic Pain and Lipid-Lowering Products. Ther Innov Regul Sci (2020).


  • MAR
  • MNAR
  • MMRM
  • Pattern-mixture model
  • Multiple imputation