Modifications of Classification Strategies in Rule Set Based Bagging for Imbalanced Data

Napierala, Krystyna; Stefanowski, Jerzy

doi:10.1007/978-3-642-28931-6_49

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7209)

Included in the following conference series:

International Conference on Hybrid Artificial Intelligence Systems

Abstract

Learning bagging ensembles of rule classifiers from imbalanced data is considered. We claim that simply introducing bagging instead of single classifiers may not bring the expected improvement in recognizing a minority class. The reason lies in the classification strategies of component classifiers, which are biased toward majority classes when no-matching or multiple-matching conflicts between rules occur. We argue that abstaining, i.e. allowing component classifiers to refrain from giving a prediction in ambiguous situations, may help to correctly recognize minority examples. Our evaluation on 17 imbalanced datasets and 5 classification strategies shows that bagging with abstaining is better than both standard bagging and single rule based classifiers.
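The abstaining idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the rule format, the function names (`matches`, `classify_with_abstaining`, `vote`), and the toy data are all assumptions made for illustration, and the five classification strategies evaluated in the paper are not reproduced here. Each component classifier returns a label only when the rules matching an example agree, and abstains on a no-matching or multiple-matching conflict; the ensemble then takes a majority vote over the components that did not abstain.

```python
from collections import Counter

# Hypothetical rule format: (conditions, label), where conditions maps an
# attribute name to the value it must take. Not the paper's representation.

def matches(conditions, example):
    """True when the example satisfies every condition of the rule."""
    return all(example.get(attr) == value for attr, value in conditions.items())

def classify_with_abstaining(rules, example):
    """Return a label only when all matching rules agree; otherwise abstain (None)."""
    labels = {label for conditions, label in rules if matches(conditions, example)}
    if len(labels) == 1:
        return next(iter(labels))
    return None  # no rule matched, or matching rules point to different classes

def vote(rule_sets, example, default=None):
    """Majority vote over the component classifiers that did not abstain."""
    votes = Counter()
    for rules in rule_sets:
        label = classify_with_abstaining(rules, example)
        if label is not None:
            votes[label] += 1
    if not votes:
        return default  # every component abstained
    return votes.most_common(1)[0][0]

# Toy ensemble: in real bagging each rule set would be induced from a
# bootstrap sample of the training data; here they are written by hand.
rule_sets = [
    [({"a": 1}, "minority")],                             # unambiguous match
    [({"a": 1}, "minority"), ({"b": 2}, "majority")],     # multiple-matching conflict
    [({"c": 9}, "majority")],                             # no-matching case
]
example = {"a": 1, "b": 2, "c": 0}
prediction = vote(rule_sets, example)
```

In this toy case only the first component casts a vote, so the two ambiguous components cannot pull the decision toward the majority class. How a fully abstaining ensemble should be resolved (the `default` argument here) is a design choice the abstract does not specify.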

Author information

Authors and Affiliations

Institute of Computing Science, Poznań University of Technology, 60–965, Poznań, Poland
Krystyna Napierala & Jerzy Stefanowski

Editor information

Editors and Affiliations

Universidad de Salamanca, Plaza de la Merced S/N, 37008, Salamanca, Spain
Emilio Corchado
VŠB-TU Ostrava, 17. listopadu 15, 70833, Ostrava, Czech Republic
Václav Snášel
Machine Intelligence Research Labs (MIR Labs), Scientific Network for Innovation and Research Excellence, P.O. Box 2259, 98071, Auburn, Washington, USA
Ajith Abraham
Wroclaw University of Technology, Wybrzeze Wyspianskiego 27, 50-370, Wroclaw, Poland
Michał Woźniak
University of the Basque Country, Pº Manuel Lardizabal 1, 20018, San Sebastian, Spain
Manuel Graña
Yonsei University, 134 Shinchon-dong, 120-749, Sudaemoon-ku, Seoul, Korea
Sung-Bae Cho

Cite this paper

Napierala, K., Stefanowski, J. (2012). Modifications of Classification Strategies in Rule Set Based Bagging for Imbalanced Data. In: Corchado, E., Snášel, V., Abraham, A., Woźniak, M., Graña, M., Cho, SB. (eds) Hybrid Artificial Intelligent Systems. HAIS 2012. Lecture Notes in Computer Science, vol 7209. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28931-6_49

DOI: https://doi.org/10.1007/978-3-642-28931-6_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28930-9
Online ISBN: 978-3-642-28931-6
eBook Packages: Computer Science, Computer Science (R0)
