Abstract
To address the inconsistency that arises because the Synthetic Minority Over-sampling Technique (SMOTE) and the Support Vector Machine (SVM) operate in different spaces, this paper presents a kernel-based SMOTE approach for SVM classification on imbalanced data sets. The method first oversamples the minority instances in the feature space; the pre-images of the synthetic samples are then found from a distance relation between the feature space and the input space. Finally, these pre-images are appended to the original data set to train an SVM. Experiments on real data indicate that, compared with SMOTE, the proposed method constructs higher-quality samples. As a result, SVM classification on imbalanced data sets is improved.
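The three steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an RBF kernel k(x, y) = exp(-gamma * ||x - y||^2), for which the feature-space and input-space squared distances are related by d_feat^2 = 2 - 2 * exp(-gamma * d_in^2), and it replaces the paper's pre-image procedure with a simple linearised least-squares fit to the recovered distances. All function names, the parameters `k`, `gamma`, and `seed`, and the choice of neighbours as anchor points are hypothetical.

```python
import numpy as np

def rbf(a, b, gamma):
    """RBF kernel values; broadcasts over the leading axis of a."""
    return np.exp(-gamma * np.sum((a - b) ** 2, axis=-1))

def feature_sq_dists(x_i, x_j, delta, anchors, gamma):
    """Squared feature-space distances from the synthetic point
    s = (1 - delta) * phi(x_i) + delta * phi(x_j) to each phi(anchor)."""
    k_ij = rbf(x_i, x_j, gamma)
    # ||s||^2, using k(x, x) = 1 for the RBF kernel
    s_norm = (1 - delta) ** 2 + 2 * delta * (1 - delta) * k_ij + delta ** 2
    # <s, phi(anchor)> for every anchor
    s_dot = (1 - delta) * rbf(anchors, x_i, gamma) + delta * rbf(anchors, x_j, gamma)
    return s_norm - 2 * s_dot + 1.0

def input_sq_dists(feat_sq, gamma):
    """Invert d_feat^2 = 2 - 2 * exp(-gamma * d_in^2) (assumed RBF relation)."""
    inner = np.clip(1.0 - feat_sq / 2.0, 1e-12, 1.0)
    return -np.log(inner) / gamma

def preimage(anchors, sq_dists):
    """Least-squares point whose squared distances to the anchors match sq_dists.
    Subtracting the first distance equation from the others makes the system linear."""
    a0, d0 = anchors[0], sq_dists[0]
    A = 2.0 * (anchors[1:] - a0)
    b = np.sum(anchors[1:] ** 2, axis=1) - np.sum(a0 ** 2) - sq_dists[1:] + d0
    z, *_ = np.linalg.lstsq(A, b, rcond=None)
    return z

def kernel_smote(X_min, n_new, k=5, gamma=0.5, seed=0):
    """Generate n_new synthetic minority samples as pre-images of
    feature-space interpolations between a point and one of its neighbours."""
    rng = np.random.default_rng(seed)
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        # For the RBF kernel, feature-space and input-space neighbour orders agree.
        order = np.argsort(np.sum((X_min - X_min[i]) ** 2, axis=1))[: k + 1]
        j = rng.choice(order[1:])          # a neighbour other than x_i itself
        delta = rng.random()               # interpolation weight in [0, 1)
        anchors = X_min[order]
        feat = feature_sq_dists(X_min[i], X_min[j], delta, anchors, gamma)
        out.append(preimage(anchors, input_sq_dists(feat, gamma)))
    return np.vstack(out)

# Hypothetical toy minority class in two dimensions.
X_min = np.array([[0.0, 0.0], [1.0, 0.2], [0.3, 1.1],
                  [1.2, 1.0], [0.5, 0.4], [0.9, 1.5]])
X_new = kernel_smote(X_min, n_new=4)
```

The synthetic rows in `X_new` would then be appended to the original training set before fitting an SVM with the same kernel. The least-squares step needs more anchors than input dimensions and anchors that do not all lie in a lower-dimensional subspace; otherwise the pre-image is not uniquely determined.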
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zeng, ZQ., Gao, J. (2009). Improving SVM Classification with Imbalance Data Set. In: Leung, C.S., Lee, M., Chan, J.H. (eds) Neural Information Processing. ICONIP 2009. Lecture Notes in Computer Science, vol 5863. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10677-4_44
DOI: https://doi.org/10.1007/978-3-642-10677-4_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10676-7
Online ISBN: 978-3-642-10677-4
eBook Packages: Computer Science (R0)