Classifying imbalanced data using BalanceCascade-based kernelized extreme learning machine

Raghuwanshi, Bhagat Singh; Shukla, Sanyam

doi:10.1007/s10044-019-00844-w

Theoretical advances
Published: 03 September 2019

Pattern Analysis and Applications, Volume 23, pages 1157–1182 (2020)

Bhagat Singh Raghuwanshi¹ &
Sanyam Shukla¹


Abstract

Learning from imbalanced data is one of the major challenges in data mining. Datasets with a skewed class distribution hinder conventional learning methods, which give equal importance to all examples and therefore produce predictions biased toward the majority class. Numerous strategies have been proposed to address this intrinsic deficiency, such as the weighted extreme learning machine (WELM) and boosting WELM (BWELM). This work designs a novel BalanceCascade-based kernelized extreme learning machine (BCKELM) to tackle the class imbalance problem more effectively. BalanceCascade combines the merits of random undersampling and ensemble methods. The proposed method uses random undersampling to construct balanced training subsets, and the ensemble generates its base learners sequentially. In each iteration, the correctly classified majority-class examples are replaced by other majority-class examples to create a new balanced training subset, so the base learners differ in the balanced training subset on which they are trained. The cardinality of the balanced training subsets depends on the imbalance ratio. A kernelized extreme learning machine (KELM) serves as the base learner because it is stable and generalizes well. The time complexity of BCKELM is considerably lower than that of BWELM, BalanceCascade, EasyEnsemble and hybrid artificial bee colony WELM. An extensive experimental evaluation on real-world benchmark datasets demonstrates the efficacy of the proposed method.
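The procedure outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, since the full text is not reproduced here. The names `rbf_kernel`, `KELM`, `balance_cascade_kelm` and `ensemble_predict` are hypothetical, as are the Gaussian kernel, the parameters `C` and `gamma`, the rule tying the number of base learners to the imbalance ratio, and the aggregation by summed decision values. The base learner uses the standard kernelized ELM solution beta = (K + I/C)^(-1) t. Labels are assumed to be +1 for the minority class and -1 for the majority class, with both classes present.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq_dist = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


class KELM:
    """Kernelized ELM for binary targets in {-1, +1} (illustrative sketch)."""

    def __init__(self, C=10.0, gamma=1.0):
        self.C = C          # regularization parameter (assumed value)
        self.gamma = gamma  # kernel width (assumed value)

    def fit(self, X, y):
        self.X_train = np.asarray(X, dtype=float)
        K = rbf_kernel(self.X_train, self.X_train, self.gamma)
        # Output weights: solve (K + I/C) beta = t
        self.beta = np.linalg.solve(K + np.eye(len(K)) / self.C, np.asarray(y, dtype=float))
        return self

    def decision_function(self, X):
        return rbf_kernel(np.asarray(X, dtype=float), self.X_train, self.gamma) @ self.beta


def balance_cascade_kelm(X, y, n_learners=None, C=10.0, gamma=1.0, seed=0):
    """Train KELM base learners sequentially on balanced subsets."""
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    rng = np.random.default_rng(seed)
    min_idx = np.flatnonzero(y == 1)
    maj_idx = np.flatnonzero(y == -1)
    n_min = len(min_idx)
    if n_learners is None:
        # Assumption: one learner per "imbalance ratio" worth of majority data.
        n_learners = max(1, int(np.ceil(len(maj_idx) / n_min)))
    pool = rng.permutation(maj_idx)   # majority examples not yet used
    current = pool[:n_min].copy()     # majority part of the balanced subset
    pool = pool[n_min:]
    learners = []
    for _ in range(n_learners):
        idx = np.concatenate([min_idx, current])
        model = KELM(C, gamma).fit(X[idx], y[idx])
        learners.append(model)
        # Majority examples the new learner already classifies correctly.
        correct_pos = np.flatnonzero(model.decision_function(X[current]) < 0)
        n_swap = min(len(correct_pos), len(pool))
        if n_swap == 0:
            break  # nothing left to replace, so the next subset would be identical
        current[correct_pos[:n_swap]] = pool[:n_swap]
        pool = pool[n_swap:]
    return learners


def ensemble_predict(learners, X):
    """Combine base learners by summing their decision values (assumed rule)."""
    score = sum(m.decision_function(X) for m in learners)
    return np.where(score >= 0, 1, -1)
```

A typical call would be `learners = balance_cascade_kelm(X, y)` followed by `ensemble_predict(learners, X_new)`. Because every base learner sees all minority examples but only as many majority examples, each kernel matrix stays small, which is one plausible reading of why such an ensemble can be cheaper to train than a single learner on the full dataset.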


Author information

Authors and Affiliations

Maulana Azad National Institute of Technology, Bhopal, Madhya Pradesh, 462003, India
Bhagat Singh Raghuwanshi & Sanyam Shukla

Corresponding author

Correspondence to Sanyam Shukla.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Human and animal rights

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Raghuwanshi, B.S., Shukla, S. Classifying imbalanced data using BalanceCascade-based kernelized extreme learning machine. Pattern Anal Applic 23, 1157–1182 (2020). https://doi.org/10.1007/s10044-019-00844-w

Received: 01 August 2018
Accepted: 22 August 2019
Published: 03 September 2019
Issue Date: August 2020
DOI: https://doi.org/10.1007/s10044-019-00844-w
