Estimation of Optimal DTRs by Directly Modeling Regimes

Statistical Methods for Dynamic Treatment Regimes

Part of the book series: Statistics for Biology and Health ((SBH))

Abstract

In this chapter, we consider several approaches to estimating the optimal dynamic treatment regime by modeling the regimes directly, rather than modeling the conditional mean outcome: inverse probability of treatment weighting, marginal structural models, and classification-based methods. The fundamental difference between these approaches and those considered in previous chapters (e.g., Q-learning and G-estimation) lies in the primary target of estimation and inference: the methods in this chapter target the parameters of the decision rule itself.
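To make the idea of targeting the parameters of the decision rule concrete, the following is a minimal sketch, not the chapter's own implementation. It assumes a hypothetical one-stage randomized trial with a known treatment probability of 0.5, a single covariate `x`, and a class of threshold rules "treat if x > psi". The simulated data, the function names (`simulate_trial`, `ipw_value`, `best_threshold`), and the grid search are all illustrative assumptions. Each candidate rule is scored with a simple inverse-probability-weighted estimate of its mean outcome, and the threshold with the highest estimated value is selected.

```python
import numpy as np


def simulate_trial(n, rng):
    """Simulate a hypothetical one-stage randomized trial (illustrative only)."""
    x = rng.uniform(-1.0, 1.0, size=n)   # baseline covariate
    a = rng.integers(0, 2, size=n)       # treatment, randomized with P(A=1) = 0.5
    # By construction, treatment helps only when x > 0.3,
    # so the best threshold rule in this simulation is psi = 0.3.
    y = a * 4.0 * (x - 0.3) + rng.normal(0.0, 0.5, size=n)
    return x, a, y


def ipw_value(psi, x, a, y, prop_treated=0.5):
    """IPW estimate of the mean outcome under the rule 'treat if x > psi'."""
    d = (x > psi).astype(int)            # treatment the rule recommends
    follows = (a == d)                   # observed treatment agrees with the rule
    # Probability of the treatment actually received (assumed known here).
    p_obs = np.where(a == 1, prop_treated, 1.0 - prop_treated)
    return np.mean(y * follows / p_obs)


def best_threshold(x, a, y, grid):
    """Grid search over psi; returns the maximizer and all estimated values."""
    values = np.array([ipw_value(psi, x, a, y) for psi in grid])
    return grid[int(np.argmax(values))], values


rng = np.random.default_rng(0)
x, a, y = simulate_trial(10_000, rng)
grid = np.linspace(-1.0, 1.0, 41)
psi_hat, values = best_threshold(x, a, y, grid)
print(f"estimated threshold: {psi_hat:.2f}")
```

Note that no model for the conditional mean of the outcome is fitted: the only quantity being estimated is the rule parameter `psi`. In observational data the treatment probabilities would themselves have to be estimated, which this sketch does not attempt.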

Notes

  1. While the term feasibility is commonly used in the causal inference literature, absolute continuity is an older concept in measure-theoretic probability.

  2. The fourth was not FDA-approved at the time CATIE began enrollment; consequently more than a third of the study participants were not eligible to receive it. We therefore excluded all participants assigned to this drug in our analysis.

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

Chakraborty, B., Moodie, E.E.M. (2013). Estimation of Optimal DTRs by Directly Modeling Regimes. In: Statistical Methods for Dynamic Treatment Regimes. Statistics for Biology and Health. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-7428-9_5
