Abstract
In large-scale multi-label classification, the use of non-linear kernel support vector machines (SVMs) is limited by excessive training time. To address this problem, we propose the Approximate Extreme Points Multi-label Support Vector Machine (AEMLSVM) classification algorithm. AEMLSVM first applies the approximate extreme points method to extract subsets of the training dataset, called representative sets, and then trains the SVM on these representative sets. AEMLSVM can also adopt a cost-sensitive method to deal with imbalanced data. Experimental results on three large-scale public datasets show that AEMLSVM substantially shortens training time while achieving results similar to those of the traditional multi-label SVM classification algorithm. It also outperforms existing fast multi-label SVM classification algorithms in both training time and effectiveness. In addition, AEMLSVM has an advantage in classification time.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (NSFC) under grant numbers 61170258, 61103196, 61379127, 61379128, and 61572448, and by the Shandong Provincial Natural Science Foundation of China under grant number ZR2014JL043.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Sun, Z., Guo, Z., Jiang, M., Wang, X., Liu, C. (2016). Research and Application of Fast Multi-label SVM Classification Algorithm Using Approximate Extreme Points. In: Wang, Y., Yu, G., Zhang, Y., Han, Z., Wang, G. (eds) Big Data Computing and Communications. BigCom 2016. Lecture Notes in Computer Science, vol 9784. Springer, Cham. https://doi.org/10.1007/978-3-319-42553-5_4
DOI: https://doi.org/10.1007/978-3-319-42553-5_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42552-8
Online ISBN: 978-3-319-42553-5
eBook Packages: Computer Science, Computer Science (R0)