Abstract
An imbalanced dataset is one in which the number of samples per class is highly uneven. This makes classification difficult, because the result may be biased toward the dominant (majority) class. Misclassifying a minority-class sample, which is usually the class of interest, is however far more costly. Among the various approaches studied to address this problem, sampling techniques have been successfully adopted to preprocess imbalanced datasets. This paper presents an experimental comparison of two pioneering sampling techniques, SMOTE and MWMOTE, using the classification models SVM, RBF, and MLP.
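To illustrate what oversampling as a preprocessing step can look like, the following is a minimal sketch of SMOTE-style interpolation, not the implementation used in the paper. The function name `smote_like_oversample`, the toy data, and the parameters `k` and `seed` are assumptions made for illustration. Each synthetic point is placed on the line segment between a minority sample and one of its nearest minority neighbours. MWMOTE additionally weights minority samples before generating new points, which this sketch does not attempt.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: 8 majority points and 3 minority points.
majority = [(float(x), float(y)) for x in range(4) for y in range(2)]
minority = [(5.0, 5.0), (6.0, 5.5), (5.5, 6.5)]

# Generate enough synthetic minority points to match the majority count.
new_points = smote_like_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))
```

The balanced training set would then be passed to a classifier such as an SVM or MLP. Oversampling is normally applied to the training split only, so that synthetic points do not leak into the evaluation data.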
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Pattanayak, S.S., Rout, M. (2018). Experimental Comparison of Sampling Techniques for Imbalanced Datasets Using Various Classification Models. In: Saeed, K., Chaki, N., Pati, B., Bakshi, S., Mohapatra, D. (eds) Progress in Advanced Computing and Intelligent Engineering. Advances in Intelligent Systems and Computing, vol 564. Springer, Singapore. https://doi.org/10.1007/978-981-10-6875-1_2
DOI: https://doi.org/10.1007/978-981-10-6875-1_2
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6874-4
Online ISBN: 978-981-10-6875-1
eBook Packages: Engineering, Engineering (R0)