Abstract
Tree-based models are popular and widely used because they are simple, flexible, and powerful tools for classification. Unfortunately, they are not stable classifiers. A significant improvement in model stability and prediction accuracy can be obtained by aggregating multiple classification trees. The reduction in classification error results from decreasing the bias and/or variance of the committee of trees (also called an ensemble or a forest). In this paper we discuss and compare different methods for model aggregation. We also address the problem of finding the minimal number of trees sufficient for the forest.
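To make the aggregation concrete, the sketch below (illustrative only, not the paper's own code; the use of scikit-learn, the synthetic dataset, and all parameter values are assumptions) grows a bagged forest of classification trees incrementally and tracks its out-of-bag error, a simple empirical way to judge when adding further trees stops paying off.

```python
# A minimal sketch, not the paper's method: classification trees are
# aggregated by bagging (a random forest), and the out-of-bag (OOB)
# error is recorded as trees are added, to gauge how many trees
# suffice. The synthetic dataset and parameter values are assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20,
                           n_informative=5, random_state=0)

# warm_start=True keeps the previously fitted trees and only adds new
# ones each time n_estimators is raised.
forest = RandomForestClassifier(warm_start=True, oob_score=True,
                                random_state=0)
for n_trees in range(10, 201, 10):
    forest.set_params(n_estimators=n_trees)
    forest.fit(X, y)
    print(f"{n_trees:4d} trees: OOB error = {1 - forest.oob_score_:.3f}")
```

In such experiments the OOB error typically levels off well before the maximum forest size, which is the empirical behaviour behind asking for a minimal sufficient number of trees.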
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
Cite this paper
Gatnar, E. (2005). Randomization in Aggregated Classification Trees. In: Baier, D., Wernecke, KD. (eds) Innovations in Classification, Data Science, and Information Systems. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-26981-9_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23221-6
Online ISBN: 978-3-540-26981-6