GOS-IL: A Generalized Over-Sampling Based Online Imbalanced Learning Framework

Barua, Sukarna; Islam, Md. Monirul; Murase, Kazuyuki

doi:10.1007/978-3-319-26532-2_75

Conference paper
First Online: 12 November 2015

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9489)

Abstract

Online imbalanced learning has two defining characteristics: samples of one class (the minority class) are under-represented in the data set, and samples arrive at the learner incrementally. Such data pose several problems for the learner. First, the minority class cannot be determined beforehand because the learner never has a complete view of the data. Second, the degree of imbalance may change over time. To handle such data efficiently, we present a dynamic and adaptive algorithm, the Generalized Over-Sampling based Online Imbalanced Learning (GOS-IL) framework. The proposed algorithm updates a base learner incrementally. An update is triggered when the number of errors made by the learner crosses a threshold value. This deferred update shields the learner from the immediate harm of noisy samples and helps it achieve better generalization in the long run. In addition, correctly classified samples are not used to update the learner, which avoids over-fitting. Simulation results on several artificial and real-world data sets show the effectiveness of the proposed method on two performance metrics: recall and g-mean.
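
The error-triggered, deferred update described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the base learner or the over-sampling step, so `CentroidLearner` is a hypothetical stand-in base learner, `DeferredUpdater` and `error_threshold` are illustrative names, the over-sampling step is omitted, and "crosses a threshold" is assumed to mean "reaches or exceeds". The `g_mean` helper uses the usual definition of g-mean as the geometric mean of per-class recalls.

```python
from collections import defaultdict


class CentroidLearner:
    """Hypothetical toy base learner: keeps a running mean per class."""

    def __init__(self):
        self.sums = {}
        self.counts = defaultdict(int)

    def partial_fit(self, samples):
        # Fold each (features, label) pair into the per-class running sums.
        for x, y in samples:
            if y not in self.sums:
                self.sums[y] = [0.0] * len(x)
            self.sums[y] = [s + v for s, v in zip(self.sums[y], x)]
            self.counts[y] += 1

    def predict(self, x):
        if not self.sums:
            return None  # nothing learned yet, so the sample counts as an error

        def squared_distance(label):
            centroid = [s / self.counts[label] for s in self.sums[label]]
            return sum((a - b) ** 2 for a, b in zip(x, centroid))

        return min(self.sums, key=squared_distance)


class DeferredUpdater:
    """Buffers misclassified samples and updates the learner only when
    the number of buffered errors reaches `error_threshold` (assumed rule)."""

    def __init__(self, learner, error_threshold=3):
        self.learner = learner
        self.error_threshold = error_threshold
        self.buffer = []
        self.updates = 0

    def observe(self, x, y):
        prediction = self.learner.predict(x)
        if prediction != y:
            # Only misclassified samples are kept for the next update.
            self.buffer.append((x, y))
        if len(self.buffer) >= self.error_threshold:
            # The over-sampling step of GOS-IL would go here; it is omitted
            # because the abstract does not describe it.
            self.learner.partial_fit(self.buffer)
            self.buffer = []
            self.updates += 1
        return prediction


def g_mean(y_true, y_pred, labels):
    """Geometric mean of per-class recalls; 0.0 if a class has no samples."""
    product = 1.0
    for label in labels:
        indices = [i for i, t in enumerate(y_true) if t == label]
        if not indices:
            return 0.0
        recall = sum(y_pred[i] == label for i in indices) / len(indices)
        product *= recall
    return product ** (1.0 / len(labels))
```

In this sketch a correctly classified sample never enters the buffer, so it never influences the learner, and a single noisy sample cannot trigger an update on its own when `error_threshold` is greater than one.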

Acknowledgments

This research work has been done in the Department of Computer Science & Engineering of Bangladesh University of Engineering and Technology (BUET). The authors would like to acknowledge BUET for its generous support.

Author information

Authors and Affiliations

Bangladesh University of Engineering and Technology (BUET), Dhaka, Bangladesh
Sukarna Barua & Md. Monirul Islam
University of Fukui, Fukui, Japan
Kazuyuki Murase

Corresponding author

Correspondence to Sukarna Barua.

Editor information

Editors and Affiliations

University of Istanbul, Istanbul, Turkey
Sabri Arik
University at Qatar, Doha, Qatar
Tingwen Huang
Tunku Abdul Rahman University College, Kuala Lumpur, Malaysia
Weng Kin Lai
University of Science Technology, Wuhan, China
Qingshan Liu

Cite this paper

Barua, S., Islam, M.M., Murase, K. (2015). GOS-IL: A Generalized Over-Sampling Based Online Imbalanced Learning Framework. In: Arik, S., Huang, T., Lai, W., Liu, Q. (eds) Neural Information Processing. ICONIP 2015. Lecture Notes in Computer Science, vol 9489. Springer, Cham. https://doi.org/10.1007/978-3-319-26532-2_75

DOI: https://doi.org/10.1007/978-3-319-26532-2_75
Published: 12 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26531-5
Online ISBN: 978-3-319-26532-2
eBook Packages: Computer Science; Computer Science (R0)
