Optimizing variance-bias trade-off in the TWANG package for estimation of propensity scores
While propensity score weighting has been shown to reduce bias in treatment effect estimation when selection bias is present, such weighting can perform poorly if the estimated propensity score weights are highly variable. Various approaches have been proposed that can reduce the variability of the weights and the risk of poor performance, particularly those based on machine learning methods. In this study, we closely examine approaches to fine-tuning one machine learning technique, generalized boosted models (GBM), to select propensity scores that optimize the variance-bias trade-off inherent in most propensity score analyses. Specifically, we propose and evaluate three approaches for selecting the optimal number of trees for the GBM in the twang package in R. Normally, twang iteratively selects the number of trees that maximizes balance between the treatment groups being considered. Because the selected number of trees may lead to highly variable propensity score weights, we examine alternative ways to tune the number of trees used in the estimation of propensity score weights, sacrificing some balance on the pre-treatment covariates in exchange for less variable weights. We use simulation studies to illustrate these methods and to describe the potential advantages and disadvantages of each. We apply these methods to two case studies: one examining the effect of dog ownership on the owner’s general health using data from a large, population-based survey in California, and a second investigating the relationship between abstinence and a long-term economic outcome among a sample of high-risk youth.
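To make the trade-off described above concrete, the following is a minimal Python sketch rather than the twang implementation or any of the three approaches evaluated in this study. It assumes inverse-probability weights for the average treatment effect, uses Kish's effective sample size as a summary of weight variability, and applies a hypothetical selection rule: among candidate numbers of trees whose covariate imbalance is within a tolerance of the best achieved, take the one with the largest effective sample size. The function names, the tolerance, and all numbers are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

# Each candidate is (number of trees, imbalance measure, effective sample size).
Candidate = Tuple[int, float, float]


def ate_weights(ps: Sequence[float], treat: Sequence[int]) -> List[float]:
    """Inverse-probability weights for the ATE: 1/p if treated, 1/(1-p) otherwise."""
    return [1.0 / p if t == 1 else 1.0 / (1.0 - p) for p, t in zip(ps, treat)]


def effective_sample_size(weights: Sequence[float]) -> float:
    """Kish's approximation: (sum of w)^2 / (sum of w^2).

    Equal weights give the actual sample size; a few very large
    weights pull the value down sharply.
    """
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def pick_n_trees(candidates: Sequence[Candidate], tolerance: float) -> int:
    """Hypothetical rule (an assumption, not the authors' method):
    keep candidates whose imbalance is within `tolerance` of the best
    imbalance, then return the number of trees with the largest
    effective sample size among them."""
    best_imbalance = min(imbalance for _, imbalance, _ in candidates)
    eligible = [c for c in candidates if c[1] <= best_imbalance + tolerance]
    return max(eligible, key=lambda c: c[2])[0]


# Made-up values for illustration only: more trees improve balance
# here but make the weights more variable (smaller effective sample size).
toy_candidates: List[Candidate] = [
    (500, 0.12, 180.0),
    (2000, 0.05, 140.0),
    (5000, 0.04, 90.0),
]
chosen = pick_n_trees(toy_candidates, tolerance=0.02)
```

With a tolerance of zero the rule reduces to choosing the best-balanced candidate; a larger tolerance gives up some balance in exchange for less variable weights.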
Keywords: Causal inference, Propensity score, Machine learning
This study was funded by National Institutes of Health grant 1R01DA034065-01A1 and National Institute of Child Health and Human Development grant R01HD066591.
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval and informed consent
This study used only secondary de-identified datasets.