Abstract
In this chapter, we consider several approaches to estimating the optimal dynamic treatment regime by directly modeling the regimes, rather than modeling the conditional mean outcome: inverse probability of treatment weighting, marginal structural models, and classification-based methods. The fundamental difference between these approaches and those considered in previous chapters (e.g., Q-learning and G-estimation) lies in the primary target of estimation and inference: the methods considered here target the parameters of the decision rule itself.
Notes
- 1. While the term feasibility is commonly used in the causal inference literature, absolute continuity is an older concept in measure-theoretic probability.
- 2. The fourth was not FDA-approved at the time CATIE began enrollment; consequently, more than a third of the study participants were not eligible to receive it. We therefore excluded all participants assigned to this drug from our analysis.
Copyright information
© 2013 Springer Science+Business Media New York
About this chapter
Cite this chapter
Chakraborty, B., Moodie, E.E.M. (2013). Estimation of Optimal DTRs by Directly Modeling Regimes. In: Statistical Methods for Dynamic Treatment Regimes. Statistics for Biology and Health. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-7428-9_5
DOI: https://doi.org/10.1007/978-1-4614-7428-9_5
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-7427-2
Online ISBN: 978-1-4614-7428-9
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)