BayesRandomForest: An R Implementation of Bayesian Random Forest for Regression Analysis of High-Dimensional Data

  • Oyebayo Ridwan Olaniran (corresponding author)
  • Mohd Asrul Affendi Bin Abdullah
Conference paper

Abstract

This paper presents methods of Bayesian inference for the Random Forest (RF) procedure with high-dimensional data. The new method, termed Bayesian Random Forest (BRF), is developed to tackle sparsity in regression analysis of high-dimensional data. The bootstrap sampling and the choice of variable subsample size (mtry) used by RF are replaced with full Bayesian inference based on binomial and hypergeometric random sampling from independent and dependent finite populations, respectively. Furthermore, the individual tree parameter estimates in the forest are obtained using a Metropolis-Hastings algorithm to achieve efficient posterior inference. We also introduce the application of the procedure in R via the package BayesRandomForest. Monte Carlo simulations of the Friedman five-dimensional dataset, extended to varying numbers of predictors, were used to demonstrate the efficiency of BRF relative to competing methods. Results from the simulations revealed that BRF is more efficient than the competing frequentist and Bayesian methods.
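
As an illustration of the simulation setting, the sketch below generates data from the benchmark function commonly attributed to Friedman (1991), in which only the first five predictors affect the response and any further predictors are pure noise. It is written in Python for illustration only and does not use or reproduce the BayesRandomForest package. The function name `friedman_sample`, the sample size, the number of predictors, and the noise standard deviation are assumptions; the settings actually used in the paper are not stated in the abstract.

```python
import math
import random


def friedman_sample(n, p, noise_sd=1.0, seed=0):
    """Simulate n observations with p >= 5 uniform(0, 1) predictors.

    Only the first five predictors enter the mean function; the
    remaining p - 5 are irrelevant, which makes the problem sparse.
    All default values here are illustrative assumptions.
    """
    if p < 5:
        raise ValueError("p must be at least 5")
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.random() for _ in range(p)]
        # Friedman's benchmark mean function on the first five predictors.
        signal = (
            10.0 * math.sin(math.pi * row[0] * row[1])
            + 20.0 * (row[2] - 0.5) ** 2
            + 10.0 * row[3]
            + 5.0 * row[4]
        )
        X.append(row)
        y.append(signal + rng.gauss(0.0, noise_sd))
    return X, y


# Hypothetical high-dimensional case: more predictors than observations.
X, y = friedman_sample(n=50, p=100)
```

Increasing `p` while holding `n` fixed gives the "varying dimensions" flavour of the experiment: the signal stays five-dimensional while the number of irrelevant predictors grows.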

Keywords

Random forest; Bayesian additive regression trees; High-dimensional

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • Oyebayo Ridwan Olaniran (1), corresponding author
  • Mohd Asrul Affendi Bin Abdullah (1)
  1. Department of Mathematics and Statistics, Faculty of Applied Sciences and Technology, Universiti Tun Hussein Onn Malaysia, Pagoh Educational Hub, Pagoh, Malaysia
