A Statistical Roadmap for Journey from Real-World Data to Real-World Evidence


Randomized controlled clinical trials are the gold standard for evaluating the safety and efficacy of pharmaceutical drugs, but their cost, duration, limited generalizability, and, in some settings, ethical or technical infeasibility have led some researchers to consider real-world studies as alternatives. Evidence from real-world data, however, can be much less convincing because treatment is not randomized and estimates are subject to confounding bias. In this article, we propose a statistical roadmap for translating real-world data (RWD) into robust real-world evidence (RWE). The Food and Drug Administration (FDA) is working on guidelines, with a target of releasing a draft by 2021, to harmonize RWD applications and to monitor the safety and effectiveness of pharmaceutical drugs using RWE. The proposed roadmap aligns with the framework for FDA’s RWE Program released in December 2018, and we hope it will be useful to statisticians who are eager to embark on their own journeys in real-world research.






The comments provided here are solely those of the presenters and do not necessarily reflect the positions, policies, or practices of the authors’ employers.

Author information



Corresponding author

Correspondence to Yixin Fang PhD.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.


About this article


Cite this article

Fang, Y., Wang, H. & He, W. A Statistical Roadmap for Journey from Real-World Data to Real-World Evidence. Ther Innov Regul Sci 54, 749–757 (2020). https://doi.org/10.1007/s43441-019-00008-2



  • Causal inference
  • Clinical trials
  • Confounding bias
  • Statistical methods
  • Real-world studies