Abstract
Several methods for comparing and combining classification trees on the basis of proximity measures have been proposed in the last few years. These methods can be used to analyse a set of trees obtained from independent data sets, or from resampling methods such as the bootstrap or cross-validation applied to the same training sample. In this paper we consider, as an alternative to pruning techniques, a modified version of a consensus algorithm that we previously proposed for combining trees obtained from bootstrap samples. The consensus algorithm is based on a recently proposed dissimilarity measure. Experimental results on two real data sets illustrate the performance of the proposed consensus method.
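The general idea of a dissimilarity-based consensus over bootstrap trees can be pictured with a small sketch. It is not the algorithm or the dissimilarity measure of this paper: as a stand-in, each tree is represented only by its predicted labels on a common sample, dissimilarity is taken to be the fraction of observations on which two trees disagree, and the "consensus" is simply the tree with the smallest total dissimilarity to the others (a medoid). The function names and the data are hypothetical.

```python
from itertools import combinations


def disagreement(pred_a, pred_b):
    """Fraction of observations on which two trees predict different classes.

    Assumption: a simple stand-in for a tree dissimilarity measure.
    """
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction vectors must have equal length")
    return sum(a != b for a, b in zip(pred_a, pred_b)) / len(pred_a)


def medoid_tree(predictions):
    """Index of the tree with the smallest total dissimilarity to all others."""
    n = len(predictions)
    totals = [0.0] * n
    for i, j in combinations(range(n), 2):
        d = disagreement(predictions[i], predictions[j])
        totals[i] += d
        totals[j] += d
    return min(range(n), key=totals.__getitem__)


# Hypothetical predicted labels of four bootstrap trees on the same six observations.
bootstrap_predictions = [
    [0, 0, 1, 1, 0, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 1, 1, 1, 0, 1],
    [1, 0, 0, 1, 0, 0],
]

consensus_index = medoid_tree(bootstrap_predictions)
print("tree closest to all others:", consensus_index)
```

Selecting an existing tree keeps the result a single interpretable tree, which is the point of using a consensus as an alternative to pruning. A measure that also compares tree structure, as the abstract indicates, would replace `disagreement` here.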
© 2005 Springer-Verlag Berlin · Heidelberg
About this paper
Cite this paper
Miglio, R., Soffritti, G. (2005). Simplifying Classification Trees Through Consensus Methods. In: Bock, HH., et al. New Developments in Classification and Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-27373-5_4
DOI: https://doi.org/10.1007/3-540-27373-5_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23809-6
Online ISBN: 978-3-540-27373-8
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)