Oversample Based Large Scale Support Vector Machine for Online Class Imbalance Problem

  • D. Himaja
  • T. Maruthi Padmaja
  • P. Radha Krishna
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11297)

Abstract

Dealing with online class imbalance in an evolving data stream is a more critical issue than the conventional class imbalance problem. Class imbalance occurs when one class of data severely outnumbers the other classes, which leads to skewed class boundaries. In the online setting, the degree of class imbalance changes over time, and the current state of imbalance is not known a priori to the learner. To address this problem, we present an Oversampling based Online Large Scale Support Vector Machine (OOLASVM) algorithm, a hybrid of active sample selection and oversampling of support vectors, so that oversampling and undersampling coexist while the new boundary is learned. Further, OOLASVM maintains a balanced boundary throughout the learning process. Results on simulated and real-world datasets demonstrate that the proposed OOLASVM yields better performance than existing approaches such as Generalized Oversampling based Online Imbalanced Learners and Over Online Bagging.
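
The abstract does not spell out how the two ingredients are implemented, so the following is only a rough sketch of the general idea, not the authors' OOLASVM procedure. It assumes that active selection picks, from a small candidate pool, the sample closest to the current decision boundary, and that oversampling creates synthetic minority points by interpolating between pairs of minority support vectors. The function names, the linear decision function, and the toy data are all hypothetical.

```python
import random


def select_most_informative(pool, decision_fn):
    """Active selection (assumed rule): return the pool item whose
    decision value is closest to zero, i.e. nearest the boundary."""
    return min(pool, key=lambda x: abs(decision_fn(x)))


def oversample_support_vectors(minority_svs, n_new, rng):
    """Oversampling (assumed rule): create n_new synthetic points by
    linear interpolation between random pairs of minority support vectors."""
    if len(minority_svs) < 2:
        # Not enough vectors to interpolate: fall back to plain copies.
        return [list(sv) for sv in minority_svs for _ in range(n_new)][:n_new]
    synthetic = []
    for _ in range(n_new):
        a, b = rng.sample(minority_svs, 2)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])
    return synthetic


# Hypothetical linear boundary x0 + x1 = 1, standing in for a trained SVM.
def decision_fn(x):
    return x[0] + x[1] - 1.0


rng = random.Random(0)
pool = [(0.9, 0.9), (0.4, 0.55), (0.0, 0.1)]
picked = select_most_informative(pool, decision_fn)

minority_svs = [(0.6, 0.5), (0.7, 0.4)]
new_points = oversample_support_vectors(minority_svs, 3, rng)
print(picked, len(new_points))
```

In this sketch, skipping pool items that lie far from the boundary acts as an implicit form of undersampling, while the interpolated points add minority mass near the boundary. That is one plausible reading of how the two effects could coexist; the paper itself should be consulted for the actual update rules.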

Keywords

Online learning; Dynamic class imbalance; Active learning; Support Vector Machines; Oversampling

Notes

Acknowledgement

This work is supported by the Defense Research and Development Organization (DRDO), India, under the sanction code: ERIPR/GIA/17-18/038. The Center for Artificial Intelligence and Robotics (CAIR) is acting as the reviewing lab for this work.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • D. Himaja (1)
  • T. Maruthi Padmaja (1)
  • P. Radha Krishna (2)
  1. Department of Computer Science and Engineering, VFSTR University, Guntur, India
  2. Department of Computer Science and Engineering, National Institute of Technology, Warangal, India