Abstract
In previous chapters we considered modeling techniques that assume the predictor of the model to be an additive function of the covariates. Although this assumption is intuitive and facilitates interpretation of the models, it may happen that additive predictors do not capture the true data structure. This is, for example, the case when interactions between categorical covariates are present. In this chapter we consider recursive partitioning techniques (also termed “tree-based methods”), which are a popular approach to estimate non-additive predictors. Starting with an introduction to recursive partitioning, we consider two strategies to adapt tree-based methods to discrete survival data. The two strategies, which are described in detail in Sects. 6.2 and 6.3, are characterized by different choices for the split criterion to form the nodes of the trees. Section 6.4 contains an overview of tree ensembles for discrete survival data, which are designed to reduce the variance and to increase the prediction accuracy of single-tree methods.
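To make the idea of a tree-based hazard estimate concrete, the following is a minimal sketch and not the chapter's own algorithm. It assumes that discrete survival data are expanded into one binary row per subject and period at risk, and that a single split is chosen by Gini impurity; the split criteria discussed in Sects. 6.2 and 6.3 may differ. The names `Subject`, `to_person_period`, `gini`, `best_split`, and `hazard`, as well as the toy data, are hypothetical and introduced only for illustration.

```python
from collections import namedtuple

# Hypothetical record: observed discrete time, event indicator (1 = event,
# 0 = censored), and one binary covariate x.
Subject = namedtuple("Subject", ["time", "event", "x"])


def to_person_period(subjects):
    """Expand each subject into one binary row per period at risk."""
    rows = []
    for s in subjects:
        for t in range(1, s.time + 1):
            # y = 1 only in the last observed period of an uncensored subject.
            y = 1 if (t == s.time and s.event == 1) else 0
            rows.append({"t": t, "x": s.x, "y": y})
    return rows


def gini(rows):
    """Gini impurity of the binary response y in a node."""
    if not rows:
        return 0.0
    p = sum(r["y"] for r in rows) / len(rows)
    return 2.0 * p * (1.0 - p)


def best_split(rows, features):
    """Return (feature, cut point, impurity decrease) of the best split
    of the form 'value <= cut point', or None if no split is possible."""
    parent = gini(rows)
    best = None
    for f in features:
        values = sorted({r[f] for r in rows})
        for c in values[:-1]:  # the largest value cannot be a cut point
            left = [r for r in rows if r[f] <= c]
            right = [r for r in rows if r[f] > c]
            weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            gain = parent - weighted
            if best is None or gain > best[2]:
                best = (f, c, gain)
    return best


def hazard(rows):
    """Node-wise hazard estimate: share of events among rows at risk."""
    return sum(r["y"] for r in rows) / len(rows)


# Toy data (invented): subjects with x = 1 tend to fail early.
subjects = [
    Subject(1, 1, 1), Subject(1, 1, 1), Subject(2, 1, 1),
    Subject(4, 0, 0), Subject(4, 0, 0), Subject(3, 1, 0),
]
rows = to_person_period(subjects)
feature, cut, gain = best_split(rows, ["t", "x"])
left = [r for r in rows if r[feature] <= cut]
right = [r for r in rows if r[feature] > cut]
print(feature, cut, round(gain, 4), hazard(left), hazard(right))
```

Including the period `t` among the candidate split variables lets the estimated hazard vary over time, and applying `best_split` recursively to the resulting child nodes would grow a full tree whose branches can represent interactions that an additive predictor would miss.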
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Tutz, G., Schmid, M. (2016). Tree-Based Approaches. In: Modeling Discrete Time-to-Event Data. Springer Series in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-28158-2_6
DOI: https://doi.org/10.1007/978-3-319-28158-2_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-28156-8
Online ISBN: 978-3-319-28158-2
eBook Packages: Mathematics and Statistics (R0)