Abstract
Top-down algorithms such as C4.5 and CART for constructing decision trees are known to perform boosting, with the procedure of choosing classification rules at internal nodes regarded as the base learner. In this work, by introducing a notion of pseudo-entropy functions for measuring the loss of hypotheses, we give new insight into this boosting scheme from an information-theoretic viewpoint: whenever the base learner produces hypotheses with non-zero mutual information, the top-down algorithm reduces the conditional entropy (uncertainty) about the target function as the tree grows. Although the theoretical guarantee on its performance is weaker than that of other popular boosting algorithms such as AdaBoost, the top-down algorithm can naturally handle multiclass classification problems. Furthermore, we propose a base learner LIN that produces linear classification functions, and we carry out experiments to examine the performance of the top-down algorithm with LIN as the base learner. The results show that the algorithm can sometimes perform as well as or better than AdaBoost.
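To make the scheme concrete, the following is a minimal Python sketch, not the authors' implementation: a tree is grown top-down by repeatedly replacing a leaf with the split proposed by a base learner, and growth stops once the base learner can no longer supply non-zero information gain. The stump_learner below is a hypothetical axis-parallel threshold learner standing in for the linear base learner LIN, and Shannon entropy stands in for a general pseudo-entropy function.

```python
import numpy as np

def entropy(y):
    """Shannon entropy of the empirical label distribution at a node (multiclass)."""
    if len(y) == 0:
        return 0.0
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def stump_learner(X, y):
    """Hypothetical base learner: best single-feature threshold split,
    scored by information gain (entropy reduction) about the labels."""
    best_gain, best_rule = 0.0, None
    parent = entropy(y)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:
            mask = X[:, j] <= t
            child = (mask.mean() * entropy(y[mask])
                     + (1 - mask.mean()) * entropy(y[~mask]))
            if parent - child > best_gain:
                best_gain, best_rule = parent - child, (j, t)
    return best_rule, best_gain

def top_down_boost(X, y, max_leaves=8):
    """Top-down growth: each round, expand the leaf whose split gives the
    largest weighted entropy reduction over the whole sample, so the
    conditional entropy of the labels given the leaves decreases whenever
    the base learner achieves non-zero information gain."""
    leaves = [(X, y)]
    n = len(y)
    while len(leaves) < max_leaves:
        scored = []
        for i, (Xl, yl) in enumerate(leaves):
            rule, gain = stump_learner(Xl, yl)
            scored.append((len(yl) / n * gain, i, rule))
        score, i, rule = max(scored, key=lambda s: s[0])
        if rule is None or score <= 0:
            break  # base learner provides no further information
        j, t = rule
        Xl, yl = leaves.pop(i)
        mask = Xl[:, j] <= t
        leaves += [(Xl[mask], yl[mask]), (Xl[~mask], yl[~mask])]
        cond_H = sum(len(yl) / n * entropy(yl) for _, yl in leaves)
        print(f"{len(leaves):2d} leaves: conditional entropy = {cond_H:.3f}")
    return leaves

if __name__ == "__main__":
    # toy four-class sample to illustrate the multiclass setting
    rng = np.random.default_rng(0)
    X = rng.normal(size=(300, 2))
    y = (X[:, 0] > 0).astype(int) + 2 * (X[:, 1] > 0.5).astype(int)
    top_down_boost(X, y)
```

On this toy sample the printed conditional entropy decreases as leaves are added, illustrating the entropy-reduction argument stated in the abstract.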
References
M. Anthony and P. Bartlett. Neural Network Learning: Theoretical Foundations. Cambridge University Press, Cambridge, 1999.
J. A. Aslam. Improving algorithms for boosting. In 13th COLT, pages 200–207, 2000.
L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone. Classification and Regression Trees. Wadsworth International Group, 1984.
Y. Freund. Boosting a weak learning algorithm by majority. Inform. Comput., 121(2):256–285, Sept. 1995. Also appeared in COLT90.
Y. Freund and R. E. Schapire. Game theory, on-line prediction and boosting. In Proc. 9th Annu. Conf. on Comput. Learning Theory, pages 325–332, 1996.
Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci., 55(1):119–139, 1997.
J. Friedman, T. Hastie, and R. Tibshirani. Additive logistic regression: a statistical view of boosting. Technical report, Stanford University, 1998.
M. Kearns and Y. Mansour. On the boosting ability of top-down decision tree learning algorithms. J. of Comput. Syst. Sci., 58(1):109–128, 1999.
Y. Mansour and D. McAllester. Boosting using branching programs. In 13th COLT, pages 220–224, 2000.
B. K. Natarajan. Machine Learning: A Theoretical Approach. Morgan Kaufmann, San Mateo, CA, 1991.
J. R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993.
R. E. Schapire. The strength of weak learnability. Machine Learning, 5(2):197–227, 1990.
R. E. Schapire, Y. Freund, P. Bartlett, and W. S. Lee. Boosting the margin: a new explanation for the effectiveness of voting methods. In Proc. 14th International Conference on Machine Learning, pages 322–330. Morgan Kaufmann, 1997.
R. E. Schapire and Y. Singer. Improved boosting algorithms using confidence-rated predictions. In Proc. 11th Annu. Conf. on Comput. Learning Theory, 1998.
E. Takimoto and A. Maruoka. Top-down decision tree learning as information based boosting. To appear in Theoretical Computer Science. Earlier version in [16].
E. Takimoto and A. Maruoka. On the boosting algorithm for multiclass functions based on information-theoretic criterion for approximation. In Proc. 1st International Conference on Discovery Science, volume 1532 of Lecture Notes in Artificial Intelligence, pages 256–267. Springer-Verlag, 1998.
E. Takimoto, I. Tajika, and A. Maruoka. Mutual information gaining algorithm and its relation to PAC-learning algorithm. In Proc. 5th Int. Workshop on Algorithmic Learning Theory, volume 872 of Lecture Notes in Artificial Intelligence, pages 547–559. Springer-Verlag, 1994.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Takimoto, E., Maruoka, A. (2002). Top-Down Decision Tree Boosting and Its Applications. In: Arikawa, S., Shinohara, A. (eds) Progress in Discovery Science. Lecture Notes in Computer Science, vol 2281. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45884-0_23
DOI: https://doi.org/10.1007/3-540-45884-0_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43338-5
Online ISBN: 978-3-540-45884-5