
Bayesian Random Forest for the Classification of High-Dimensional mRNA Cancer Samples

  • Oyebayo Ridwan Olaniran
  • Mohd Asrul Affendi Bin Abdullah
Conference paper

Abstract

The goal of many machine learning algorithms is to identify the informative biomarkers in biological samples that are useful for predicting disease outcome. Several algorithms have been proposed for this task using high-dimensional genomic messenger ribonucleic acid (mRNA) data. High-dimensionality poses serious problems for statistical analysis in terms of parameter estimation and inference. Random forest is a powerful method developed to address this problem. Although random forest can handle high-dimensional data, it is an algorithmic rather than a probabilistic learning method, so the uncertainty in its predictions cannot be quantified. In this paper, we develop a Bayesian Random Forest (BRF) model for the classification of high-dimensional mRNA data. Bayesian procedures have emerged as a solution in many areas of applied statistics and, in theory, attain the minimum (Bayes) error rate; in addition, they yield appealing results in terms of parameter uncertainty, model uncertainty and data uncertainty. BRF model fitting and inference are achieved via a Metropolis-Hastings (MH) MCMC algorithm. The strength of the model is illustrated using a bake-off of 10 different mRNA cancer datasets. Results from this calibration show an appreciable advantage of BRF over competing methods.
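Since the abstract states that BRF fitting and inference rely on a Metropolis-Hastings (MH) MCMC algorithm, a minimal generic MH sketch is shown below for orientation. It is not the authors' BRF sampler: the `log_posterior` callable, the Gaussian random-walk proposal, and the `step` tuning parameter are illustrative assumptions; only the accept/reject rule itself is standard.

```python
# Minimal sketch of a generic random-walk Metropolis-Hastings sampler.
# NOT the paper's BRF implementation; it only illustrates the MH
# accept/reject rule that such samplers rely on.
import numpy as np

def metropolis_hastings(log_posterior, theta0, n_iter=5000, step=0.1, rng=None):
    """Propose theta' ~ N(theta, step^2 I) and accept with probability
    min(1, p(theta' | data) / p(theta | data))."""
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta0, dtype=float)
    log_p = log_posterior(theta)
    samples = []
    for _ in range(n_iter):
        proposal = theta + step * rng.standard_normal(theta.shape)
        log_p_prop = log_posterior(proposal)
        # Symmetric proposal, so the proposal densities cancel in the ratio.
        if np.log(rng.uniform()) < log_p_prop - log_p:
            theta, log_p = proposal, log_p_prop
        samples.append(theta.copy())
    return np.array(samples)

# Toy usage: draw from a standard bivariate normal "posterior".
draws = metropolis_hastings(lambda t: -0.5 * np.sum(t**2), theta0=np.zeros(2))
```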

Keywords

Bayesian Random Forest · mRNA · High-dimensional · Classification

Notes

Funding

This work was supported by Universiti Tun Hussein Onn Malaysia [grant number Vot U607].


Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • Oyebayo Ridwan Olaniran (1)
  • Mohd Asrul Affendi Bin Abdullah (1)
  1. Faculty of Applied Sciences and Technology, Department of Mathematics and Statistics, Universiti Tun Hussein Onn Malaysia, Pagoh Educational Hub, Pagoh, Malaysia
