The Data: Observational Studies and Sequentially Randomized Trials

Chakraborty, Bibhas; Moodie, Erica E. M.

doi:10.1007/978-1-4614-7428-9_2

Part of the book series: Statistics for Biology and Health (SBH)

Abstract

The data for constructing (optimal) dynamic treatment regimes that we consider are obtained from either longitudinal observational studies or sequentially randomized trials. In this chapter, we review these two types of data sources, their advantages and drawbacks, and the assumptions required to perform valid analyses in each, along with some examples. We also discuss a basic framework of causal inference in the context of observational studies, and power and sample size issues in the context of randomized studies.

Notes

1. In this book, we use the term treatment generically to denote either a medical treatment or an exposure (which is the preferred term in the causal inference literature and more generally in epidemiology).
2. While the term stage is commonly used in the randomized trial literature, the term interval is more popular in the causal inference literature. In this book, for consistency, we will use the term stage for both observational and randomized studies.

Author information

Authors and Affiliations

Department of Biostatistics, Columbia University, New York, USA
Bibhas Chakraborty
Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, Québec, Canada
Erica E. M. Moodie

About this chapter

Cite this chapter

Chakraborty, B., Moodie, E.E.M. (2013). The Data: Observational Studies and Sequentially Randomized Trials. In: Statistical Methods for Dynamic Treatment Regimes. Statistics for Biology and Health. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-7428-9_2

DOI: https://doi.org/10.1007/978-1-4614-7428-9_2
Published: 15 April 2013
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-7427-2
Online ISBN: 978-1-4614-7428-9
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
