Abstract
An active research area in machine learning is the construction of multiple classifier systems that increase the accuracy of simple classifiers. In this paper we present E-CIDIM, a multiple classifier system designed to improve the performance of CIDIM, an algorithm that induces small and accurate decision trees. E-CIDIM keeps at most a fixed number of trees and induces new trees that may replace older trees in the ensemble. The substitution process stops when, after a pre-configured number of attempts, no new tree improves the accuracy of any tree in the ensemble. In this way, E-CIDIM improves on the accuracy obtained by a single instance of CIDIM. Regarding the accuracy of the generated ensembles, E-CIDIM competes well against bagging and boosting at statistically significant confidence levels, and it usually outperforms them in both accuracy and the average size of the trees in the ensemble.
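The maintenance loop described in the abstract can be sketched as follows. This is only an illustrative reconstruction from the abstract's description, not the authors' implementation: the names `induce_tree`, `accuracy`, `max_trees`, and `max_attempts` are hypothetical stand-ins for CIDIM's tree inducer, an accuracy estimator, the ensemble-size bound, and the pre-configured attempt limit, and the majority-vote combiner is a common default rather than a rule stated in the abstract.

```python
import random
from collections import Counter

def e_cidim_sketch(induce_tree, accuracy, max_trees=10, max_attempts=5):
    """Illustrative sketch of E-CIDIM's ensemble maintenance (names hypothetical).

    induce_tree() stands in for one run of CIDIM; accuracy(tree) stands in
    for an accuracy estimate of a single tree.
    """
    # Start with an ensemble of max_trees induced trees.
    ensemble = [induce_tree() for _ in range(max_trees)]
    failures = 0
    # Induce candidate trees; a candidate replaces the ensemble's worst tree
    # only if it is more accurate. Stop after max_attempts consecutive
    # candidates fail to improve any tree in the ensemble.
    while failures < max_attempts:
        candidate = induce_tree()
        worst_i = min(range(len(ensemble)), key=lambda i: accuracy(ensemble[i]))
        if accuracy(candidate) > accuracy(ensemble[worst_i]):
            ensemble[worst_i] = candidate
            failures = 0
        else:
            failures += 1
    return ensemble

def vote(ensemble, x):
    # Combine member predictions by simple majority vote (a common choice;
    # the paper's actual combination rule may differ).
    return Counter(tree(x) for tree in ensemble).most_common(1)[0][0]

# Toy demo: "trees" are constant classifiers tagged with a random quality
# score that plays the role of their estimated accuracy.
random.seed(0)
def induce_tree():
    q = random.random()
    tree = lambda x: int(q > 0.5)
    tree.quality = q
    return tree

ensemble = e_cidim_sketch(induce_tree, lambda t: t.quality,
                          max_trees=5, max_attempts=3)
```

The key point the sketch captures is that the ensemble size never exceeds its bound; substitution only ever raises the accuracy of the worst member, so the ensemble's quality is monotonically non-decreasing over the run.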
This work has been partially supported by the MOISES project (TIC2002-04019-C03-02) of the MCyT, Spain.
© 2005 Springer-Verlag Berlin Heidelberg
Ramos-Jiménez, G., del Campo-Ávila, J., Morales-Bueno, R. (2005). E-CIDIM: Ensemble of CIDIM Classifiers. In: Li, X., Wang, S., Dong, Z.Y. (eds) Advanced Data Mining and Applications. ADMA 2005. Lecture Notes in Computer Science, vol 3584. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11527503_14
Print ISBN: 978-3-540-27894-8
Online ISBN: 978-3-540-31877-4