Diversified Random Forests Using Random Subspaces

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 8669)

Abstract

Random Forest is an ensemble learning method used for classification and regression. In such an ensemble, each constituent classifier casts one vote for its predicted class label, and majority voting then determines the class label for unlabelled instances. Since it has been shown empirically that ensembles tend to yield better results when there is significant diversity among the constituent models, many extensions have been developed over the past decade that induce diversity in the constituent models in order to improve Random Forests in terms of both speed and accuracy. In this paper, we propose a method to promote Random Forest diversity by using randomly selected subspaces, assigning each subspace a weight according to its predictive power, and using that weight in majority voting. An experimental study on 15 real datasets showed favourable results, demonstrating the potential of the proposed method.
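The abstract describes the method only at a high level. The Python sketch below illustrates the general idea under stated assumptions: the class name DiversifiedRandomForest, the subspace_size parameter, and the use of a held-out validation split to estimate each subspace's predictive power are all hypothetical, since the paper's actual weighting scheme is not given in the abstract and may differ.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

class DiversifiedRandomForest:
    """Hypothetical sketch: one tree per random feature subspace, with
    votes weighted by each subspace's estimated predictive power."""

    def __init__(self, n_trees=50, subspace_size=0.5, random_state=None):
        self.n_trees = n_trees              # number of subspaces/trees
        self.subspace_size = subspace_size  # fraction of features per subspace
        self.rng = np.random.default_rng(random_state)

    def fit(self, X, y):
        n_features = X.shape[1]
        k = max(1, int(self.subspace_size * n_features))
        self.classes_ = np.unique(y)
        self.trees_, self.subspaces_, self.weights_ = [], [], []
        for _ in range(self.n_trees):
            # Draw a random subspace of k distinct features.
            feats = self.rng.choice(n_features, size=k, replace=False)
            # Assumption: a subspace's weight is the accuracy of its tree
            # on a held-out split (the paper may estimate it differently).
            X_tr, X_val, y_tr, y_val = train_test_split(
                X[:, feats], y, test_size=0.3,
                random_state=int(self.rng.integers(1 << 31)))
            tree = DecisionTreeClassifier().fit(X_tr, y_tr)
            self.trees_.append(tree)
            self.subspaces_.append(feats)
            self.weights_.append(tree.score(X_val, y_val))
        return self

    def predict(self, X):
        # Weighted majority voting: each tree's vote counts w, not 1.
        votes = np.zeros((X.shape[0], len(self.classes_)))
        for tree, feats, w in zip(self.trees_, self.subspaces_, self.weights_):
            pred = tree.predict(X[:, feats])
            for ci, c in enumerate(self.classes_):
                votes[pred == c, ci] += w
        return self.classes_[np.argmax(votes, axis=1)]

Note that each tree sees only its own subspace's features at both training and prediction time, so diversity among the trees comes from the feature sampling, while the validation-accuracy weights let stronger subspaces dominate the vote.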




Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Fawagreh, K., Gaber, M.M., Elyan, E. (2014). Diversified Random Forests Using Random Subspaces. In: Corchado, E., Lozano, J.A., Quintián, H., Yin, H. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2014. IDEAL 2014. Lecture Notes in Computer Science, vol 8669. Springer, Cham. https://doi.org/10.1007/978-3-319-10840-7_11

  • DOI: https://doi.org/10.1007/978-3-319-10840-7_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-10839-1

  • Online ISBN: 978-3-319-10840-7

  • eBook Packages: Computer Science, Computer Science (R0)
