This chapter continues to build on the idea of ensembles of statistical learning procedures. It introduces random forests, an extremely useful approach that extends and improves on bagging. As before, there is an ensemble of classification or regression trees, and votes over trees serve to regularize. Additional randomness is introduced: at each potential partitioning for each tree, a random subset of predictors is selected for evaluation. This has a variety of benefits, some of which can be quite subtle. Also discussed are supplementary algorithms for random forests that allow a peek into the black box.
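The mechanics summarized above (bootstrap samples, a random subset of predictors evaluated at each split, and a vote over trees) can be sketched in a few lines. What follows is a minimal pure-Python illustration under stated assumptions, not the chapter's implementation: the function names, the Gini impurity criterion, the default of roughly the square root of the number of predictors per split, the depth limit, and the toy data are all choices made for this example.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, n_try, rng):
    """Best (impurity, predictor, threshold) among n_try randomly chosen predictors."""
    best = None
    for j in rng.sample(range(len(rows[0])), n_try):  # random predictor subset
        values = sorted({r[j] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between adjacent observed values
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, j, t)
    return best

def grow_tree(rows, labels, n_try, rng, depth=0, max_depth=5):
    """Grow one classification tree; leaves are class labels, nodes are tuples."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(set(labels)) == 1:
        return majority
    split = best_split(rows, labels, n_try, rng)
    if split is None:  # every sampled predictor is constant in this node
        return majority
    _, j, t = split
    left = [i for i, r in enumerate(rows) if r[j] <= t]
    right = [i for i, r in enumerate(rows) if r[j] > t]
    return (j, t,
            grow_tree([rows[i] for i in left], [labels[i] for i in left],
                      n_try, rng, depth + 1, max_depth),
            grow_tree([rows[i] for i in right], [labels[i] for i in right],
                      n_try, rng, depth + 1, max_depth))

def predict_tree(tree, row):
    """Drop a row down one tree and return the leaf's class label."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def grow_forest(rows, labels, n_trees=25, n_try=None, seed=0):
    """Grow n_trees trees, each on its own bootstrap sample of the data."""
    rng = random.Random(seed)
    n = len(rows)
    if n_try is None:
        n_try = max(1, int(len(rows[0]) ** 0.5))  # assumed default, about sqrt(p)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        forest.append(grow_tree([rows[i] for i in idx],
                                [labels[i] for i in idx], n_try, rng))
    return forest

def predict_forest(forest, row):
    """Classify a row by majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

# Toy data (hypothetical): two informative predictors and one pure-noise predictor.
data_rng = random.Random(1)
rows, labels = [], []
for _ in range(100):
    y = data_rng.randrange(2)
    shift = 0.6 * y  # class 0 lies in [0, 0.4], class 1 in [0.6, 1.0]
    rows.append((shift + 0.4 * data_rng.random(),
                 shift + 0.4 * data_rng.random(),
                 data_rng.random()))
    labels.append(y)

forest = grow_forest(rows, labels)
print(predict_forest(forest, (0.1, 0.1, 0.5)), predict_forest(forest, (0.9, 0.9, 0.5)))
```

Because only a subset of predictors is considered at each split, individual trees will sometimes split on the noise predictor and so differ from one another more than bagged trees would; the vote over the ensemble is what averages those differences out.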