Abstract
This paper investigates enhancements to decision tree bagging that aim mainly at improving computation times, but also accuracy. Three questions are reconsidered: discretization of continuous attributes, tree pruning, and sampling schemes. A very simple discretization procedure is proposed, yielding a dramatic speedup without a significant loss of accuracy. A new method is then proposed to prune an ensemble of trees in a combined fashion, which is significantly more effective than pruning each tree individually. Finally, different resampling schemes are considered, leading to different CPU time/accuracy trade-offs. Combining all these enhancements makes it possible to apply tree bagging to very large datasets, with computational performance similar to single-tree induction. Simulations are carried out on two synthetic databases and four real-life datasets.
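To make the ingredients mentioned above concrete, the sketch below shows a generic form of tree bagging combined with a coarse equal-width pre-discretization of continuous attributes and sub-sampling without replacement. It is a minimal illustration under stated assumptions, not the paper's exact procedure: the helper names, the `n_bins` and `sample_fraction` parameters, and the equal-width binning are illustrative choices of ours.

```python
# Minimal sketch (illustrative, not the paper's algorithm): bagging of
# unpruned decision trees with (i) a fixed equal-width pre-discretization
# of continuous attributes and (ii) sub-sampling without replacement in
# place of bootstrap sampling.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def discretize(X, n_bins=16):
    """Map each continuous attribute onto n_bins equal-width intervals."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    width = np.where(hi > lo, (hi - lo) / n_bins, 1.0)
    return np.clip(((X - lo) / width).astype(int), 0, n_bins - 1)

def bagged_trees(X, y, n_trees=25, sample_fraction=0.5, seed=None):
    """Grow an ensemble of fully developed trees on random sub-samples."""
    rng = np.random.default_rng(seed)
    n = len(y)
    trees = []
    for _ in range(n_trees):
        idx = rng.choice(n, size=int(sample_fraction * n), replace=False)
        trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return trees

def predict(trees, X):
    """Majority vote over the ensemble (assumes integer class labels)."""
    votes = np.stack([t.predict(X) for t in trees]).astype(int)
    return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)

# Usage: Xd = discretize(X_train); ensemble = bagged_trees(Xd, y_train)
# then predict(ensemble, discretize(X_test)) for the aggregated prediction.
```

Smaller values of `sample_fraction` trade a little accuracy for a roughly proportional reduction in training time, which is the kind of CPU time/accuracy trade-off the abstract refers to.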
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Geurts, P. (2000). Some Enhancements of Decision Tree Bagging. In: Zighed, D.A., Komorowski, J., Żytkow, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 2000. Lecture Notes in Computer Science, vol. 1910. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45372-5_14
DOI: https://doi.org/10.1007/3-540-45372-5_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41066-9
Online ISBN: 978-3-540-45372-7
eBook Packages: Springer Book Archive