Abstract
Transitioning from a failing antiretroviral regimen to a new regimen is a critical period in managing treatments to suppress HIV-1 RNA because it can have lasting effects on the durability of disease and likelihood of developing resistant mutations. Evaluating the timing of a switch to the subsequent therapy is difficult because patients are not randomly assigned to switch failing regimens at designed time points. Li et al. (J. Am. Stat. Assoc. 107:542–554, 2012) proposed and applied doubly robust semi-parametric methods to evaluate the effect of early versus late regimen switch in a two-stage design setting. These semi-parametric estimators are consistent if a parametric treatment model is correctly specified and achieve optimal performance if a parametric outcome model is also correctly specified. Here, we propose a new non-parametric estimator of the same causal estimand using an ensemble-type statistical learner. Compared to earlier estimators, the proposed estimator requires fewer model assumptions and can easily accommodate a large number of potential confounders. We illustrate the methods through simulation studies and application to data from the AIDS Clinical Trials Group Study A5095.
References
Binder, H., Tutz, G.: A comparison of methods for the fitting of generalized additive models. Stat. Comput. 18, 87–99 (2008)
Borra, S., Ciaccio, A.: Improving nonparametric regression methods by bagging and boosting. Comput. Stat. Data Anal. 38, 407–420 (2002). doi:10.1016/S0167-9473(01)00068-8
Breiman, L.: Prediction Games and Arcing Algorithms. Technical Report 504. Statistics Department, University of California, Berkeley (1997/1998), revised. http://stat-www.berkeley.edu/tech-reports/index.html
Bühlmann, P., Hothorn, T.: Boosting algorithms: regularization, prediction and model fitting. Stat. Sci. 22, 477–505 (2007). doi:10.1214/07-STS242
Cao, W., Tsiatis, A.A., Davidian, M.: Improving efficiency and robustness of the doubly robust estimator for a population mean with incomplete data. Biometrika 96, 723–734 (2009)
Cheng, P.E.: Nonparametric estimation of mean functionals with data missing at random. J. Am. Stat. Assoc. 89, 81–87 (1994)
Efron, B., Tibshirani, R.: Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy. Stat. Sci. 1, 54–75 (1986). doi:10.1214/ss/1177013815
Fan, J., Gijbels, I.: Local polynomial fitting. In: Schimek, M.G. (ed.) Smoothing and Regression: Approaches, Computation and Application, pp. 228–275. Wiley, New York (2000)
Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 55, 119–139 (1997)
Freund, Y., Schapire, R.E.: A short introduction to boosting. J. Jpn. Soc. Artif. Intell. 14, 771–780 (1999)
Friedman, J.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 29, 1189–1232 (2001)
Friedman, J., Hastie, T., Tibshirani, R.: Additive logistic regression: a statistical view of boosting. Ann. Stat. 28, 337–374 (2000)
Friedman, J., Hastie, T., Tibshirani, R.: Rejoinder for additive logistic regression: a statistical view of boosting. Ann. Stat. 28, 400–407 (2000)
Gu, C.: Smoothing Spline ANOVA Models. Springer, New York (2002)
Gulick, R.M., Ribaudo, H.J., Lustgarten, S., Squires, K.E., Meyer, W.A., Acosta, E.P., Schackman, B.R., Pilcher, C.D., Murphy, R.L., Maher, W.L., Witt, M.D., Reichman, R.C., Snyder, S., Klingman, K.L., Kuritzkes, D.R.: Triple-nucleoside regimens versus efavirenz-containing regimens for the initial treatment of HIV-1 infection. N. Engl. J. Med. 350, 1850–1861 (2004)
Gulick, R.M., Ribaudo, H.J., Shikuma, C.M., Lalama, C., Schackman, B.R., Meyer, W.A. 3rd., Acosta, E.P., Schouten, J., Squires, K.E., Pilcher, C.D., Murphy, R.L., Koletar, S.L., Carlson, M., Reichman, R.C., Bastow, B., Klingman, K.L., Kuritzkes, D.R., AIDS Clinical Trials Group (ACTG) A5095 Study Team: Three- vs four-drug antiretroviral regimens for the initial treatment of HIV-1 infection: a randomized controlled trial. J. Am. Med. Assoc. 296 (7), 768–781 (2006)
Hastie, T., Tibshirani, R.: Generalized Additive Models, 1st edn. Monographs on Statistics and Applied Probability. Chapman and Hall/CRC, Boca Raton (1990)
Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd edn. Springer, New York (2009)
Johnson, B.A., Ribaudo, H., Gulick, R.M., Eron, J.J.: Modeling clinical endpoints as a function of time of switch to second-line ART with incomplete data on switching times. Biometrics 69, 732–740 (2013)
Li, L., Eron, J., Ribaudo, H., Gulick, R.M., Johnson, B.A.: Evaluating the effect of early versus late ARV regimen change after failure on the initial regimen: results from the AIDS clinical trials group study A5095. J. Am. Stat. Assoc. 107, 542–554 (2012)
Lunceford, J., Davidian, M., Tsiatis, A.: Estimation of survival distributions of treatment policies in two-stage randomization designs in clinical trials. Biometrics 58, 48–57 (2002)
McCullagh, P., Nelder, J.A.: Generalized Linear Models, 1st edn. Chapman and Hall, London (1983)
Nadaraya, E.A.: On estimating regression. Theory Probab. Appl. 9 (1), 141–142 (1964). doi:10.1137/1109020
Petersen, M.L., van der Laan, M.J., Napravnik, S., Eron, J., Moore, R., Deeks, S.: Long term consequences of the delay between virologic failure of highly active antiretroviral therapy and regimen modification: a prospective cohort study. AIDS 22, 2097–2106 (2008)
Riddler, S., Jiang, H., Tenorio, A., Huang, H., Kuritzkes, D., Acosta, E., Landay, A., Bastow, B., Haas, D., Tashima, K., Jain, M., Deeks, S., Bartlett, J.: A randomized study of antiviral medication switch at lower- versus higher-switch thresholds: AIDS clinical trials group study A5115. Antivir. Ther. 12, 531–541 (2007)
Robins, J.M., Rotnitzky, A., Zhao, L.P.: Estimation of regression coefficients when some regressors are not always observed. J. Am. Stat. Assoc. 89, 846–866 (1994)
Robins, J.M., Rotnitzky, A., Zhao, L.P.: Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. J. Am. Stat. Assoc. 90, 106–121 (1995)
Rosenbaum, P.R., Rubin, D.B.: The central role of the propensity score in observational studies for causal effects. Biometrika 70, 41–55 (1983)
Shao, J., Sitter, R.R.: Bootstrap for imputed survey data. J. Am. Stat. Assoc. 91, 1278–1288 (1996)
Simonoff, J.: Smoothing Methods in Statistics. Springer Science and Business Media, New York (1996)
Stone, R.M., Berg, D.T., George, S.L., Dodge, R.K., Paciucci, P.A., Schulman, P., Lee, E.J., Moore, J.O., Powell, B.L., Schiffer, C.A.: Granulocyte-macrophage colony-stimulating factor after initial chemotherapy for elderly patients with primary acute myelogenous leukemia. N. Engl. J. Med. 332, 1671–1677 (1995)
Tan, Z.: A distributional approach for causal inference using propensity scores. J. Am. Stat. Assoc. 101, 1619–1637 (2006)
Tan, Z.: Understanding OR, PS and DR. Stat. Sci. 22, 560–568 (2007)
Watson, G.S.: Smooth regression analysis. Sankhyā Indian J. Stat. Ser. A 26(4), 359–372 (1964) [JSTOR 25049340]
Appendix
Boosting is a machine learning algorithm that attempts to construct a strong learner from a series or collection of weak learners; the earliest substantial contributions are widely attributed to [9, 10]. Freund and Schapire developed an early adaptive resampling and combination scheme that became adaptive boosting, or AdaBoost. Breiman [3] showed that AdaBoost can be viewed as functional gradient descent in function space, while Friedman et al. [12, 13] linked AdaBoost and other boosting algorithms to a statistical framework for function estimation. The work of Breiman [3] and Friedman et al. [12, 13] brought boosting to a wide array of statistical regression and prediction applications beyond classification, and our proposed estimator builds on this idea of function estimation.
Bühlmann and Hothorn [4] recently reviewed the literature on boosting and aggregation, and their review informed our outline here. Boosting algorithms can be written as functional gradient descent techniques [3, 12, 13], and we adopt this view here. Briefly, the goal of functional gradient descent is to estimate a function by minimizing an expected loss
$$\displaystyle{f^{\ast}(\cdot ) = \mbox{ argmin}_{f(\cdot )}E\left[\rho \{Y,f(X)\}\right],}$$
where ρ(⋅ , ⋅ ) is a (loss) function of data O ≡ { (X 1, Y 1), …, (X n , Y n )} and convex with respect to the second argument. Friedman [11] provided a generic outline of a descent algorithm through the following steps:
1. Initialize \(\hat{f}^{0}(\cdot )\) with an offset value. A common choice is
$$\displaystyle{\hat{f}^{0}(\cdot ) = \mbox{ argmin}_{c} \frac{1} {n}\sum \limits _{i=1}^{n}\rho (Y _{i},c)}$$
for a constant c, or let \(\hat{f}^{0}(\cdot ) = 0\). Set m = 0.
2. Increase m by 1. Compute the negative gradient − (∂∕∂ f)ρ(Y, f) and evaluate it at \(\hat{f}^{m-1}(X_{i})\):
$$\displaystyle{U_{i} = -\frac{\partial \rho (Y _{i},f)} {\partial f} \bigg\vert _{f=\hat{f}^{m-1}(X_{i})},\quad i = 1,\ldots,n.}$$
3. Fit the negative gradient vector U 1, …, U n to X 1, …, X n by the real-valued base procedure
$$\displaystyle{(X_{i},U_{i})_{i=1}^{n}\stackrel{\mbox{ base procedure}}{\longrightarrow }\hat{g}^{m}(\cdot ).}$$
4. Update \(\hat{f}^{m}(\cdot ) =\hat{ f}^{m-1}(\cdot ) +\upsilon \hat{ g}^{m}(\cdot )\), where υ is a step-length factor.
5. Iterate steps 2–4 until m = m stop for some stopping iteration m stop.
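As an illustration, the five steps above can be sketched for squared-error loss with a regression-stump base procedure (an L2Boost-type variant; the data-generating function, sample size, and tuning values below are ours, chosen only for illustration):

```python
import numpy as np

def fit_stump(x, u):
    """Base procedure (step 3): fit a one-split regression stump to (x, u),
    choosing the threshold that minimizes the residual sum of squares."""
    best = None
    for t in np.unique(x):
        left, right = u[x <= t], u[x > t]
        if left.size == 0 or right.size == 0:
            continue
        lm, rm = left.mean(), right.mean()
        sse = ((left - lm) ** 2).sum() + ((right - rm) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    _, t, lm, rm = best
    return lambda z: np.where(z <= t, lm, rm)

def l2_boost(x, y, nu=0.1, m_stop=200):
    """Functional gradient descent with rho(y, f) = (y - f)^2 / 2.
    Step 1: offset argmin_c sum rho(Y_i, c) is the sample mean of y.
    Step 2: the negative gradient is U_i = Y_i - f(X_i), i.e., the residual.
    Steps 3-4: fit the base procedure to the residuals, step of length nu."""
    f0 = y.mean()
    learners = []
    fx = np.full_like(y, f0, dtype=float)
    for _ in range(m_stop):          # step 5: iterate until m = m_stop
        u = y - fx                   # negative gradient vector
        g = fit_stump(x, u)          # base procedure
        learners.append(g)
        fx = fx + nu * g(x)          # update f^m = f^{m-1} + nu * g^m

    def predict(z):
        out = np.full_like(z, f0, dtype=float)
        for g in learners:
            out = out + nu * g(z)
        return out
    return predict

rng = np.random.default_rng(0)
x = rng.uniform(-2.0, 2.0, 200)
y = np.sin(x) + rng.normal(scale=0.1, size=200)
fhat = l2_boost(x, y, nu=0.1, m_stop=200)
mse = float(np.mean((fhat(x) - y) ** 2))
```

Because the base learner is refit to the current residuals at every iteration, the training error decreases monotonically in m; in practice m stop would be chosen by cross-validation rather than fixed in advance, as discussed next.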
Two user-defined parameters must be determined in the algorithm above: the stopping iteration m stop in step 5 and the step-length factor υ in step 4. The stopping iteration m stop is typically determined via cross-validation or an information criterion, such as the corrected AIC. The step-length factor υ is chosen to be sufficiently small (e.g., υ = 0.1). Popular loss functions ρ(y, f) are exp{ − (2y − 1)f} and log2[1 + exp{ − (2y − 1)f}] for binary outcomes, and squared error loss for continuous outcomes.
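For these losses the negative gradients in step 2 have closed forms; writing binary y ∈ {0, 1} and c = 2y − 1 (the factor 1∕2 in the squared-error loss is a common normalizing convention, adopted here for tidy derivatives):

```latex
% Squared-error loss (continuous Y): the negative gradient is the residual.
\[
\rho(y,f) = \tfrac{1}{2}(y-f)^{2}
\quad\Longrightarrow\quad
U = -\frac{\partial \rho}{\partial f} = y - f.
\]
% Exponential (AdaBoost) loss, rho(y,f) = exp{-(2y-1)f}.
\[
\rho(y,f) = e^{-cf}
\quad\Longrightarrow\quad
U = c\,e^{-cf}, \qquad c = 2y-1.
\]
% Binomial log-likelihood (logit) loss, rho(y,f) = log_2[1 + exp{-(2y-1)f}].
\[
\rho(y,f) = \log_{2}\!\bigl(1 + e^{-cf}\bigr)
\quad\Longrightarrow\quad
U = \frac{c}{\ln 2}\cdot\frac{e^{-cf}}{1 + e^{-cf}}
  = \frac{c}{\ln 2\,\bigl(1 + e^{cf}\bigr)}.
\]
```

In each case U is plugged into step 3 as the working response for the base procedure, which is what lets the same generic algorithm cover classification and regression.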
BlackBoost, developed by Friedman [11], uses regression trees as the base learner. Bühlmann and Hothorn [4] reviewed both theory and applications and highlighted the advantage that the resulting estimates are invariant under monotone transformations of the covariates. In addition, regression trees handle continuous and categorical covariates in a unified way.
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Li, L., Johnson, B.A. (2016). Causal Ensembles for Evaluating the Effect of Delayed Switch to Second-Line Antiretroviral Regimens. In: He, H., Wu, P., Chen, DG. (eds) Statistical Causal Inferences and Their Applications in Public Health Research. ICSA Book Series in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-41259-7_11
Print ISBN: 978-3-319-41257-3
Online ISBN: 978-3-319-41259-7