Abstract
Boosting is a powerful method for improving the predictive accuracy of classifiers. The AdaBoost algorithm of Freund and Schapire has been applied successfully to many domains [2, 10, 12], and its combination with the C4.5 decision tree algorithm has been called the best off-the-shelf learning algorithm in practice. Unfortunately, in some applications the number of decision trees AdaBoost requires to reach a reasonable accuracy is enormously large, making the ensemble very space consuming. This problem was first studied by Margineantu and Dietterich [7], who proposed an empirical method, called Kappa pruning, that prunes a boosted ensemble of decision trees without sacrificing much accuracy. In this work-in-progress we propose a potential improvement to the Kappa pruning method and also study the boosting pruning problem from a theoretical perspective. We show that the boosting pruning problem is intractable even to approximate, and we suggest a margin-based theoretical heuristic for it.
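The Kappa pruning idea referenced above can be illustrated with a minimal sketch. The function names (`kappa`, `kappa_prune`) and greedy pair-selection details below are illustrative assumptions, not the authors' exact procedure: the kappa statistic measures agreement between two classifiers beyond chance, and pruning keeps the members of the most-disagreeing (most diverse) pairs.

```python
from itertools import combinations

def kappa(preds_a, preds_b, n_classes):
    """Cohen's kappa between two classifiers' predictions on the same data:
    (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(preds_a)
    # observed agreement: fraction of examples where the two classifiers agree
    theta1 = sum(a == b for a, b in zip(preds_a, preds_b)) / n
    # chance agreement: product of each classifier's marginal label frequencies
    theta2 = 0.0
    for c in range(n_classes):
        pa = sum(p == c for p in preds_a) / n
        pb = sum(p == c for p in preds_b) / n
        theta2 += pa * pb
    return (theta1 - theta2) / (1 - theta2)

def kappa_prune(all_preds, n_classes, keep):
    """Greedily select classifiers from the pairs with lowest pairwise kappa
    (i.e., the most diverse pairs) until `keep` classifiers are chosen."""
    pairs = sorted(
        combinations(range(len(all_preds)), 2),
        key=lambda ij: kappa(all_preds[ij[0]], all_preds[ij[1]], n_classes),
    )
    chosen = []
    for i, j in pairs:
        for k in (i, j):
            if k not in chosen and len(chosen) < keep:
                chosen.append(k)
        if len(chosen) >= keep:
            break
    return sorted(chosen)
```

For example, given three classifiers' predictions on a validation set, `kappa_prune(preds, n_classes, 2)` keeps the two that disagree most; in practice the pruned ensemble would then vote with the original AdaBoost weights.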
References
Y. Freund and R.E. Schapire. A decision-theoretic generalization of online learning and an application to boosting. J. Comp. System Sciences, 55(1):119–139, 1997.
Y. Freund and R.E. Schapire. Experiments with a new boosting algorithm. Proc. 13th Int. Conf. on Machine Learning, 148–156, 1996.
M.R. Garey and D.S. Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. W.H. Freeman and Company, 1979.
D. Hochbaum. Approximation Algorithms for NP-hard Problems. PWS Publishing Company, 1997.
W. Hoeffding. Probability inequalities for sums of bounded random variables. J. American Stat. Assoc., 58:13–30, 1963.
V. Kann. Polynomially bounded minimization problems that are hard to approximate. Nordic Journal of Computing, 1:317–331, 1994.
D. Margineantu and T.G. Dietterich. Pruning adaptive boosting. Proc. 14th Int. Conf. Machine Learning, 211–218, 1997.
C.J. Merz and P.M. Murphy. UCI Repository of Machine Learning Databases. Tech. Report, U.C. Irvine, CA.
J.R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993.
J.R. Quinlan. Bagging, boosting, and C4.5. Proc. 13th Nat. Conf. Artificial Intelligence, 725–730, 1996.
R.E. Schapire, Y. Freund, P. Bartlett, and W.S. Lee. Boosting the margin: a new explanation for the effectiveness of voting methods. The Annals of Statistics, 26(5):1651–1686, 1998.
R.E. Schapire and Y. Singer. Improved boosting algorithms using confidence-rated predictions. Proc. 11th Ann. Conf. Comp. Learning Theory, 80–91, 1998.
© 2000 Springer-Verlag Berlin Heidelberg
Cite this paper
Tamon, C., Xiang, J. (2000). On the Boosting Pruning Problem. In: López de Mántaras, R., Plaza, E. (eds) Machine Learning: ECML 2000. ECML 2000. Lecture Notes in Computer Science(), vol 1810. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45164-1_41
Print ISBN: 978-3-540-67602-7
Online ISBN: 978-3-540-45164-8