Research and Application of Fast Multi-label SVM Classification Algorithm Using Approximate Extreme Points

  • Conference paper
Big Data Computing and Communications (BigCom 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9784)

Abstract

In large-scale multi-label classification, the use of non-linear kernel support vector machines (SVMs) is restricted by excessive training time. To address this problem, we propose the Approximate Extreme Points Multi-label Support Vector Machine (AEMLSVM) classification algorithm. AEMLSVM first uses the approximate extreme points method to extract training subsets, called representative sets, from the training dataset, and then trains the SVM on these representative sets. AEMLSVM can also adopt a cost-sensitive method to deal with imbalanced data. Experimental results on three large-scale public datasets show that AEMLSVM substantially shortens training time while obtaining results similar to those of the traditional multi-label SVM classification algorithm. It also outperforms existing fast multi-label SVM classification algorithms in both training time and effectiveness. In addition, AEMLSVM has an advantage in classification time.
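The pipeline outlined in the abstract can be sketched roughly as follows. This is only an illustrative outline under stated assumptions, not the authors' implementation: it assumes a one-vs-rest (binary relevance) decomposition, which the abstract does not specify; it assumes inverse-frequency class costs as the cost-sensitive scheme; and it leaves the representative-set extraction as a pluggable `select` function, because the abstract gives no details of the approximate extreme points method. All function names are hypothetical.

```python
from collections import Counter


def one_vs_rest_targets(label_sets, label):
    """Binary targets (+1 / -1) for one label (assumed binary relevance split)."""
    return [1 if label in labels else -1 for labels in label_sets]


def balanced_costs(targets):
    """Assumed cost-sensitive scheme: cost inversely proportional to class frequency."""
    counts = Counter(targets)
    n = len(targets)
    return {cls: n / (2 * counts[cls]) for cls in counts}


def keep_all(features, targets):
    """Placeholder selector that keeps every sample.

    A real representative-set extractor (e.g. approximate extreme points)
    would return a much smaller list of indices here.
    """
    return list(range(len(targets)))


def build_label_problems(features, label_sets, select):
    """Build one reduced, cost-weighted binary problem per label.

    Returns {label: (subset_features, subset_targets, class_costs)};
    each entry would then be handed to an SVM trainer.
    """
    all_labels = sorted(set().union(*label_sets))
    problems = {}
    for label in all_labels:
        targets = one_vs_rest_targets(label_sets, label)
        keep = select(features, targets)  # indices of the representative set
        sub_x = [features[i] for i in keep]
        sub_y = [targets[i] for i in keep]
        problems[label] = (sub_x, sub_y, balanced_costs(sub_y))
    return problems


# Toy data: four samples, two labels ("a" is frequent, so its negatives cost more).
features = [[0.0, 0.1], [0.2, 0.1], [0.9, 1.0], [1.0, 0.8]]
label_sets = [{"a"}, {"a", "b"}, {"b"}, {"a"}]
problems = build_label_problems(features, label_sets, keep_all)
```

Swapping `keep_all` for a selector that returns fewer indices is where the training-time saving described in the abstract would come from; the per-class costs are recomputed on the reduced set so that the imbalance correction reflects what the SVM actually sees.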

Acknowledgments

This work is supported by the National Natural Science Foundation of China (NSFC) under grant numbers 61170258, 61103196, 61379127, 61379128, and 61572448, and by the Shandong Provincial Natural Science Foundation of China under grant number ZR2014JL043.

Author information

Corresponding author

Correspondence to Zhongwen Guo.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Sun, Z., Guo, Z., Jiang, M., Wang, X., Liu, C. (2016). Research and Application of Fast Multi-label SVM Classification Algorithm Using Approximate Extreme Points. In: Wang, Y., Yu, G., Zhang, Y., Han, Z., Wang, G. (eds) Big Data Computing and Communications. BigCom 2016. Lecture Notes in Computer Science, vol 9784. Springer, Cham. https://doi.org/10.1007/978-3-319-42553-5_4

  • DOI: https://doi.org/10.1007/978-3-319-42553-5_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-42552-8

  • Online ISBN: 978-3-319-42553-5

  • eBook Packages: Computer Science (R0)
