Abstract

We present a new parallel incremental Support Vector Machine (SVM) algorithm for classifying very large datasets on graphics processing units (GPUs). SVMs and related kernel methods have been shown to build accurate models, but training usually requires solving a quadratic program, so learning from large datasets demands large memory capacity and long training times. We extend the recent finite Newton classifier to build a parallel incremental algorithm. The new algorithm uses graphics processors to obtain high performance at low cost. Numerical tests on datasets from the UCI and Delve repositories showed that our parallel incremental algorithm using GPUs is about 45 times faster than a CPU implementation and often significantly more than 100 times faster than the state-of-the-art algorithms LibSVM, SVM-perf and CB-SVM.
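The abstract does not give the formulation, so the following is only a minimal CPU sketch of what an incremental finite Newton linear classifier could look like. It assumes the objective (ν/2)·‖(e − D(Aw − eγ))₊‖² + ½‖(w, γ)‖² from the finite Newton literature. The function names (`block_terms`, `accumulate`, `train`, `predict`), the Armijo constants and the block layout are hypothetical choices for illustration, not the authors' implementation. The point it illustrates is that the gradient and generalized Hessian are sums over row blocks, so only one block and an (n+1)×(n+1) matrix need to be in memory at a time.

```python
import numpy as np


def block_terms(A_blk, d_blk, u, nu):
    """Loss, gradient and generalized-Hessian terms for one block of rows."""
    E = np.hstack([A_blk, -np.ones((A_blk.shape[0], 1))])  # E = [A, -e]
    H = d_blk[:, None] * E                                 # rows scaled by labels
    r = np.maximum(1.0 - H @ u, 0.0)                       # (e - H u)_+
    Ea = E[r > 0]                                          # rows with nonzero loss
    return 0.5 * nu * (r @ r), -nu * (H.T @ r), nu * (Ea.T @ Ea)


def accumulate(blocks, u, nu):
    """Sum the per-block terms; start from the regularizer 0.5 * ||u||^2."""
    obj = 0.5 * (u @ u)
    grad = u.copy()
    hess = np.eye(u.size)
    for A_blk, d_blk in blocks:
        o, g, h = block_terms(A_blk, d_blk, u, nu)
        obj += o
        grad += g
        hess += h
    return obj, grad, hess


def train(blocks, n_features, nu=1.0, tol=1e-6, max_iter=50):
    """Newton iterations with Armijo backtracking; u = [w, gamma]."""
    u = np.zeros(n_features + 1)
    for _ in range(max_iter):
        obj, grad, hess = accumulate(blocks, u, nu)
        if np.linalg.norm(grad) < tol:
            break
        step = np.linalg.solve(hess, grad)   # small (n+1) x (n+1) system
        t = 1.0
        while t > 1e-8:                      # backtrack until sufficient decrease
            cand = u - t * step
            if accumulate(blocks, cand, nu)[0] <= obj - 0.25 * t * (grad @ step):
                break
            t *= 0.5
        u = cand
    return u[:-1], u[-1]


def predict(A, w, gamma):
    """Classify rows of A by the sign of A w - gamma."""
    return np.where(A @ w - gamma >= 0, 1.0, -1.0)
```

Here `blocks` is a list of `(A_blk, d_blk)` pairs with labels in {+1, −1}. In this sketch the per-block matrix products are the dominant cost, which is the kind of work a GPU BLAS library could take over; that mapping is an assumption here, not something the abstract spells out.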

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Do, TN., Nguyen, VH., Poulet, F. (2008). A Fast Parallel SVM Algorithm for Massive Classification Tasks. In: Le Thi, H.A., Bouvry, P., Pham Dinh, T. (eds) Modelling, Computation and Optimization in Information Systems and Management Sciences. MCO 2008. Communications in Computer and Information Science, vol 14. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87477-5_45

  • DOI: https://doi.org/10.1007/978-3-540-87477-5_45

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-87476-8

  • Online ISBN: 978-3-540-87477-5

  • eBook Packages: Computer Science, Computer Science (R0)