Abstract
We consider learning bagging ensembles of rule classifiers from imbalanced data. We claim that simply replacing a single classifier with bagging may not bring the expected improvement in recognizing the minority class. The reason lies in the classification strategies of the component classifiers, which are biased toward the majority classes when no rule matches an example or when multiple matching rules conflict. We argue that abstaining, i.e., allowing component classifiers to refrain from giving a prediction in ambiguous situations, may help to recognize minority examples correctly. Our evaluation on 17 imbalanced datasets and 5 classification strategies shows that bagging with abstaining is better than both standard bagging and single rule-based classifiers.
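The effect of abstaining on the ensemble vote can be illustrated with a minimal sketch. It is not the authors' implementation and does not reproduce any of the five classification strategies evaluated in the paper. The function names `component_votes` and `aggregate`, the use of `None` as an abstention marker, and the fallback label are all assumptions made for illustration. The sketch assumes that each component classifier reports whether it hit a no-matching or multiple-matching conflict, and that the standard strategy resolves such conflicts in favor of the majority class.

```python
from collections import Counter
from typing import List, Optional, Sequence


def component_votes(
    decisions: Sequence[Optional[str]],
    ambiguous: Sequence[bool],
    majority_label: str,
    abstain: bool,
) -> List[Optional[str]]:
    """Turn raw component decisions into votes.

    For an ambiguous component (no rule matched, or matching rules
    conflicted) the vote is either withheld (abstain=True) or, as an
    assumed stand-in for a majority-biased strategy, set to the
    majority class (abstain=False).
    """
    votes: List[Optional[str]] = []
    for label, is_ambiguous in zip(decisions, ambiguous):
        if not is_ambiguous:
            votes.append(label)
        elif abstain:
            votes.append(None)  # hypothetical abstention marker
        else:
            votes.append(majority_label)
    return votes


def aggregate(votes: Sequence[Optional[str]], fallback: str) -> str:
    """Plurality vote over the components that did not abstain."""
    counts = Counter(v for v in votes if v is not None)
    if not counts:
        # Every component abstained; the fallback is an assumption.
        return fallback
    return counts.most_common(1)[0][0]


# Five bagged components: three are certain, two face a rule conflict.
decisions = ["minority", "minority", "majority", None, None]
ambiguous = [False, False, False, True, True]

standard = aggregate(
    component_votes(decisions, ambiguous, "majority", abstain=False),
    fallback="majority",
)
abstaining = aggregate(
    component_votes(decisions, ambiguous, "majority", abstain=True),
    fallback="majority",
)
print(standard, abstaining)
```

In this toy case the two ambiguous components tip the standard vote toward the majority class, while with abstaining only the three confident components are counted and the minority label wins.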
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Napierala, K., Stefanowski, J. (2012). Modifications of Classification Strategies in Rule Set Based Bagging for Imbalanced Data. In: Corchado, E., Snášel, V., Abraham, A., Woźniak, M., Graña, M., Cho, SB. (eds) Hybrid Artificial Intelligent Systems. HAIS 2012. Lecture Notes in Computer Science, vol 7209. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28931-6_49
DOI: https://doi.org/10.1007/978-3-642-28931-6_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28930-9
Online ISBN: 978-3-642-28931-6
eBook Packages: Computer Science (R0)