Research and Application of Fast Multi-label SVM Classification Algorithm Using Approximate Extreme Points

Sun, Zhongwei; Guo, Zhongwen; Jiang, Mingxing; Wang, Xi; Liu, Chao

doi:10.1007/978-3-319-42553-5_4

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9784)

Included in the following conference series:

International Conference on Big Data Computing and Communications

Abstract

In large-scale multi-label classification, the use of non-linear kernel support vector machines (SVMs) is restricted by excessive training time. To address this problem, we propose the Approximate Extreme Points Multi-label Support Vector Machine (AEMLSVM) classification algorithm. AEMLSVM first uses the approximate extreme points method to extract training subsets, called representative sets, from the training dataset. An SVM is then trained on the representative sets. AEMLSVM can also adopt a cost-sensitive method to deal with imbalanced data. Experimental results on three large-scale public datasets show that AEMLSVM substantially shortens training time while obtaining results similar to those of the traditional multi-label SVM classification algorithm. It also outperforms the existing fast multi-label SVM classification algorithm in both training time and effectiveness. In addition, AEMLSVM has an advantage in classification time.
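The data-preparation stages named in the abstract (reduce the training set, then train with class-dependent costs) can be illustrated with a minimal sketch. It is not the authors' implementation. Greedy farthest-point sampling stands in for the approximate extreme points method, the per-label split assumes a binary relevance decomposition, and the inverse-frequency weights are one common cost-sensitive choice; the abstract confirms none of these details. All function names are hypothetical.

```python
import math
from collections import Counter


def farthest_point_subset(points, k):
    """Greedy farthest-point sampling.

    A simple stand-in for the approximate extreme points method,
    NOT the procedure used in the paper.
    """
    if k >= len(points):
        return list(range(len(points)))
    chosen = [0]  # arbitrary starting point (an assumption)
    # Distance from every point to its nearest chosen point.
    dist = [math.dist(p, points[0]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=lambda i: dist[i])
        chosen.append(nxt)
        for i, p in enumerate(points):
            dist[i] = min(dist[i], math.dist(p, points[nxt]))
    return chosen


def binary_relevance(label_sets, n_labels):
    """Split multi-label targets into one +1/-1 vector per label."""
    return [[1 if j in s else -1 for s in label_sets]
            for j in range(n_labels)]


def cost_weights(y):
    """Inverse-frequency class weights: n / (n_classes * count)."""
    counts = Counter(y)
    n = len(y)
    return {c: n / (len(counts) * cnt) for c, cnt in counts.items()}


def build_training_tasks(points, label_sets, n_labels, k):
    """Return, per label, the reduced sample, its targets and class weights.

    Sharing one subset across all labels is a simplification made here.
    """
    idx = farthest_point_subset(points, k)
    tasks = []
    for y in binary_relevance(label_sets, n_labels):
        y_sub = [y[i] for i in idx]
        tasks.append(([points[i] for i in idx], y_sub, cost_weights(y_sub)))
    return tasks


points = [(0, 0), (0.1, 0), (10, 0), (10, 0.1), (5, 5)]
label_sets = [{0}, {0}, {1}, {1}, {0, 1}]
tasks = build_training_tasks(points, label_sets, n_labels=2, k=3)
```

Each entry of `tasks` could then be passed to any SVM trainer that accepts per-class weights. The rarer class in each reduced problem receives the larger weight, so its misclassifications cost more.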

Acknowledgments

This work is supported by the National Natural Science Foundation of China (NSFC) under grant numbers 61170258, 61103196, 61379127, 61379128, and 61572448, and by the Shandong Provincial Natural Science Foundation of China under grant number ZR2014JL043.

Author information

Authors and Affiliations

Department of Computer Science and Technology, Ocean University of China, Qingdao, China
Zhongwei Sun, Zhongwen Guo, Xi Wang & Chao Liu
Department of Computer Foundation, Ocean University of China, Qingdao, China
Mingxing Jiang

Corresponding author

Correspondence to Zhongwen Guo.

Editor information

Editors and Affiliations

Department of Computer Science, University of N. Carolina at Charlotte, Charlotte, North Carolina, USA
Yu Wang
Northeastern University, College of Information Science and Engineering, Shenyang, Liaoning, China
Ge Yu
Department of Electrical & Computer Engineering, Rutgers University, Piscataway, New Jersey, USA
Yanyong Zhang
Department of Electrical and Computer Engineering, University of Houston Department of Engineering, Houston, Texas, USA
Zhu Han
College of Information Science and Engineering, Northeastern University, Shenyang, Liaoning, China
Guoren Wang

About this paper

Cite this paper

Sun, Z., Guo, Z., Jiang, M., Wang, X., Liu, C. (2016). Research and Application of Fast Multi-label SVM Classification Algorithm Using Approximate Extreme Points. In: Wang, Y., Yu, G., Zhang, Y., Han, Z., Wang, G. (eds) Big Data Computing and Communications. BigCom 2016. Lecture Notes in Computer Science, vol 9784. Springer, Cham. https://doi.org/10.1007/978-3-319-42553-5_4

DOI: https://doi.org/10.1007/978-3-319-42553-5_4
Published: 19 July 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-42552-8
Online ISBN: 978-3-319-42553-5
eBook Packages: Computer Science, Computer Science (R0)
