On Cesáro Averages for Weighted Trees in the Random Forest

Pham, Hieu; Olafsson, Sigurður

doi:10.1007/s00357-019-09322-8

Published: 30 March 2019

Journal of Classification, Volume 37, pages 223–236 (2020)

Abstract

The random forest is a popular and effective classification method. It uses a combination of bootstrap resampling and subspace sampling to construct an ensemble of decision trees that are then averaged for a final prediction. In this paper, we propose a potential improvement on the random forest that can be thought of as applying a weight to each tree before averaging. The new method is motivated by the potential instability of averaging predictions of trees that may be of highly variable quality, and because of this, we replace the regular average with a Cesáro average. We provide both a theoretical analysis that gives exact conditions under which the new approach outperforms the traditional random forest, and numerical analysis that shows the new approach is competitive when training a classification model on numerous realistic data sets.
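
To make the contrast between a regular average and a Cesáro average concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not say how trees are ordered or weighted, so the example assumes the per-tree votes are already sorted by some quality measure (a hypothetical choice) and simply takes the Cesáro mean, that is, the average of the running averages. Under that assumption the implied weight on tree i is w_i = (1/n) * sum_{k=i}^{n} 1/k, so earlier trees count more and the weights sum to one. The names `cesaro_average`, `cesaro_weights`, and `votes` are illustrative only.

```python
from itertools import accumulate


def cesaro_average(values):
    """Cesaro mean: the average of the running averages of `values`."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    running_sums = accumulate(values)
    running_means = [s / k for k, s in enumerate(running_sums, start=1)]
    return sum(running_means) / n


def cesaro_weights(n):
    """Implied per-item weights w_i = (1/n) * sum_{k=i}^{n} 1/k."""
    weights = []
    tail = 0.0
    # Build the harmonic tail sums from the last item back to the first.
    for k in range(n, 0, -1):
        tail += 1.0 / k
        weights.append(tail / n)
    weights.reverse()
    return weights


# Hypothetical votes for the positive class from five trees,
# assumed to be ordered from most to least trusted.
votes = [1, 1, 0, 1, 0]

plain = sum(votes) / len(votes)        # regular random forest average
cesaro = cesaro_average(votes)         # Cesaro average of the same votes
weights = cesaro_weights(len(votes))   # equivalent per-tree weights
```

Because the weights decrease with position, the ordering of the trees matters here, unlike in the regular average; how that ordering should be chosen is exactly the kind of detail the full paper would need to be consulted for.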

Author information

Authors and Affiliations

Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, 50011, USA
Hieu Pham & Sigurður Olafsson

Corresponding author

Correspondence to Hieu Pham.

About this article

Cite this article

Pham, H., Olafsson, S. On Cesáro Averages for Weighted Trees in the Random Forest. J Classif 37, 223–236 (2020). https://doi.org/10.1007/s00357-019-09322-8

Issue Date: April 2020
DOI: https://doi.org/10.1007/s00357-019-09322-8
