A Fast Parallel SVM Algorithm for Massive Classification Tasks

Do, Thanh-Nghi; Nguyen, Van-Hoa; Poulet, François

doi:10.1007/978-3-540-87477-5_45

Part of the book series: Communications in Computer and Information Science (CCIS, volume 14)

Included in the following conference series:

International Conference on Modelling, Computation and Optimization in Information Systems and Management Sciences

Abstract

This paper presents a new parallel incremental Support Vector Machine (SVM) algorithm for classifying very large datasets on graphics processing units (GPUs). SVMs and related kernel methods have been shown to build accurate models, but training usually requires solving a quadratic program, so learning from large datasets demands a large amount of memory and a long training time. We extend the recent finite Newton classifier to build a parallel incremental algorithm. The new algorithm uses graphics processors to obtain high performance at low cost. Numerical tests on datasets from the UCI and Delve repositories show that our parallel incremental algorithm on GPUs is about 45 times faster than a CPU implementation and often well over 100 times faster than the state-of-the-art algorithms LibSVM, SVM-perf and CB-SVM.
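
The abstract does not spell out the formulation, so the following is only a minimal sketch of how an incremental finite Newton linear SVM might look, not the authors' implementation. It assumes the squared-hinge objective 0.5*||u||^2 + (nu/2)*sum(max(0, 1 - y*(x.w - b))^2) with u = (w, b), takes plain Newton steps without a line search, and runs in pure Python on the CPU. The names `newton_svm`, `solve`, `nu` and the toy data are hypothetical. The point it illustrates is that the gradient and the generalized Hessian are sums over rows, so they can be accumulated chunk by chunk while only a small (dim+1) x (dim+1) system stays in memory; on a GPU the per-chunk accumulation would presumably be done with matrix products.

```python
def solve(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x


def newton_svm(chunks, dim, nu=1.0, iters=20, tol=1e-16):
    """Train a linear SVM from (rows, labels) chunks, labels in {-1, +1}.

    Assumed objective: 0.5*||u||^2 + (nu/2)*sum(max(0, 1 - e.u)^2),
    where u = (w, b) and e = y * (x, -1) for each training row.
    """
    n = dim + 1
    u = [0.0] * n
    for _ in range(iters):
        grad = u[:]  # gradient of the regularizer
        hess = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
        for rows, labels in chunks:  # one chunk at a time; only n x n state is kept
            for x, y in zip(rows, labels):
                e = [y * xi for xi in x] + [-y]
                slack = 1.0 - sum(ei * ui for ei, ui in zip(e, u))
                if slack > 0:  # only margin violators contribute
                    for i in range(n):
                        grad[i] -= nu * slack * e[i]
                        for j in range(n):
                            hess[i][j] += nu * e[i] * e[j]
        step = solve(hess, grad)
        u = [ui - si for ui, si in zip(u, step)]
        if sum(s * s for s in step) < tol:
            break
    return u[:dim], u[dim]


# Toy data (hypothetical): two symmetric, well-separated clusters.
pos = [(2.0, 2.0), (3.0, 2.0), (2.0, 3.0), (3.0, 3.0)]
rows = pos + [(-p, -q) for p, q in pos]
labels = [1] * 4 + [-1] * 4
chunks = [(rows[i:i + 3], labels[i:i + 3]) for i in range(0, 8, 3)]
w, b = newton_svm(chunks, dim=2)
print(w, b)
```

A new point x is then classified by the sign of x.w - b. Because the accumulation is a plain sum over rows, feeding the data in chunks of three rows gives the same model as feeding it in one block.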

Author information

Authors and Affiliations

CIT, CanTho University, Viet Nam
Thanh-Nghi Do
IRISA, Rennes, France
Van-Hoa Nguyen & François Poulet

Editor information

Editors and Affiliations

Laboratory of Theoretical and Applied Computer Science, UFR MIM, Paul Verlaine University of Metz, Metz, France
Hoai An Le Thi
Faculty of Sciences, Technology and Communications, University of Luxembourg, Luxembourg
Pascal Bouvry
Laboratory of Mathematics, National Institute for Applied Sciences, Rouen, Mont-Saint-Aignan, France
Tao Pham Dinh

About this paper

Cite this paper

Do, TN., Nguyen, VH., Poulet, F. (2008). A Fast Parallel SVM Algorithm for Massive Classification Tasks. In: Le Thi, H.A., Bouvry, P., Pham Dinh, T. (eds) Modelling, Computation and Optimization in Information Systems and Management Sciences. MCO 2008. Communications in Computer and Information Science, vol 14. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87477-5_45

DOI: https://doi.org/10.1007/978-3-540-87477-5_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87476-8
Online ISBN: 978-3-540-87477-5
eBook Packages: Computer Science, Computer Science (R0)
