Controlled Under-Sampling with Majority Voting Ensemble Learning for Class Imbalance Problem

  • Conference paper
Intelligent Computing (SAI 2018)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 857)

Abstract

The class imbalance problem has been widely studied in data mining. In this paper, we present a new filter approach to the problem that uses repeated under-sampling to create balanced data sets and then applies majority-voting ensemble learning to build a meta-classifier. We test our method on five imbalanced data sets and compare its performance with that of three other techniques. We show that our method significantly improves the prediction accuracy of the under-represented class while also reducing the gap in prediction accuracy between the two classes.
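The pipeline described in the abstract (repeated under-sampling into balanced sets, then a majority vote over the resulting classifiers) can be illustrated with a minimal sketch. This is a generic illustration, not the authors' procedure: the abstract does not say how the under-sampling is controlled or which base learner is used, so sampling without replacement, the nearest-centroid base learner, and every function name below are assumptions made for the example.

```python
import random
from collections import Counter


def balanced_subsets(X, y, n_subsets, seed=0):
    """Yield balanced training sets for a two-class problem.

    Each set keeps every minority row and adds an equal-sized random draw
    (without replacement, an assumption) from the majority rows.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    minority = min(counts, key=counts.get)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    maj_idx = [i for i, label in enumerate(y) if label != minority]
    for _ in range(n_subsets):
        idx = min_idx + rng.sample(maj_idx, len(min_idx))
        yield [X[i] for i in idx], [y[i] for i in idx]


def fit_centroids(X, y):
    """Hypothetical stand-in base learner: one mean vector per class."""
    counts = Counter(y)
    sums = {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_centroid(model, row):
    """Return the class whose centroid is closest to row (squared Euclidean)."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(model, key=lambda label: dist2(model[label]))


def fit_ensemble(X, y, n_subsets=5, seed=0):
    """Train one base learner per balanced subset."""
    return [fit_centroids(Xs, ys) for Xs, ys in balanced_subsets(X, y, n_subsets, seed)]


def predict_majority(models, row):
    """Meta-classifier: the label predicted by most base learners wins."""
    votes = Counter(predict_centroid(m, row) for m in models)
    return votes.most_common(1)[0][0]


# Toy imbalanced data: 40 majority rows (class 0), 5 minority rows (class 1).
X = [[i * 0.1, i * 0.1] for i in range(40)] + [[5 + i * 0.1, 5 + i * 0.1] for i in range(5)]
y = [0] * 40 + [1] * 5
models = fit_ensemble(X, y, n_subsets=5)
minority_prediction = predict_majority(models, [5.2, 5.2])
majority_prediction = predict_majority(models, [0.5, 0.5])
```

An odd number of subsets avoids tied votes in the two-class case. Any classifier with a fit and predict step could replace the centroid learner without changing the sampling or voting logic.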

Author information

Corresponding author

Correspondence to Riyaz Sikora.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Sikora, R., Raina, S. (2019). Controlled Under-Sampling with Majority Voting Ensemble Learning for Class Imbalance Problem. In: Arai, K., Kapoor, S., Bhatia, R. (eds) Intelligent Computing. SAI 2018. Advances in Intelligent Systems and Computing, vol 857. Springer, Cham. https://doi.org/10.1007/978-3-030-01177-2_3
