Random Forests

  • Richard A. Berk
Part of the Springer Texts in Statistics book series (STS)


This chapter continues to build on the idea of ensembles of statistical learning procedures. It introduces random forests, an extremely useful approach that extends and improves on bagging. As before, there is an ensemble of classification or regression trees, with votes taken over trees to regularize. Additional randomness is introduced: at each potential partitioning for each tree, a random subset of predictors is selected for evaluation. This has a variety of benefits, some of which can be quite subtle. Also discussed are supplementary algorithms for random forests that allow a peek into the black box.
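The two sources of randomness described above (bootstrap resampling of the data for each tree, plus a random subset of predictors considered at each split) can be sketched in miniature. The following is an illustrative toy implementation, not the chapter's own code: it grows an ensemble of single-split trees ("stumps"), each fit to a bootstrap sample and restricted to `m_try` randomly chosen predictors, and classifies by majority vote. All function names (`bootstrap`, `best_stump`, `fit_forest`, `predict`) and the misclassification-count split criterion (a stand-in for the Gini index) are hypothetical simplifications.

```python
import random
from collections import Counter

def bootstrap(X, y, rng):
    """Draw a bootstrap sample: n rows sampled with replacement."""
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    return [X[i] for i in idx], [y[i] for i in idx]

def best_stump(X, y, features):
    """Best single-feature threshold split among the given features,
    scored by misclassification count (a stand-in for Gini impurity)."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue  # degenerate split, skip
            l_lab, l_count = Counter(left).most_common(1)[0]
            r_lab, r_count = Counter(right).most_common(1)[0]
            errors = (len(left) - l_count) + (len(right) - r_count)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best  # (errors, feature, threshold, left_label, right_label)

def fit_forest(X, y, n_trees=25, m_try=1, seed=0):
    """Grow the ensemble: each stump sees its own bootstrap sample and,
    at its split, only a random subset of m_try predictors."""
    rng = random.Random(seed)
    p = len(X[0])
    forest = []
    for _ in range(n_trees):
        Xb, yb = bootstrap(X, y, rng)
        feats = rng.sample(range(p), m_try)   # random predictor subset
        stump = best_stump(Xb, yb, feats)
        if stump is not None:
            forest.append(stump)
    return forest

def predict(forest, row):
    """Classify a new case by majority vote over all trees."""
    votes = [(l if row[f] <= t else r) for _, f, t, l, r in forest]
    return Counter(votes).most_common(1)[0][0]
```

In full-scale random forests the trees are grown deep rather than as stumps, and the predictor subset is redrawn at every node of every tree; the sketch keeps only enough structure to show how the two layers of randomness combine before the ensemble votes.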



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Richard A. Berk
    1. Department of Criminology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, USA
