Abstract
The new parallel incremental Support Vector Machine (SVM) algorithm aims to classify very large datasets on graphics processing units (GPUs). SVMs and related kernel methods have been shown to build accurate models, but training usually requires solving a quadratic program, so learning from large datasets demands large memory capacity and long training times. We extend the recent finite Newton classifier to build a parallel incremental algorithm. The new algorithm uses graphics processors to achieve high performance at low cost. Numerical tests on datasets from the UCI and Delve repositories showed that our parallel incremental algorithm using GPUs is about 45 times faster than a CPU implementation and often significantly more than 100 times faster than the state-of-the-art algorithms LibSVM, SVM-perf and CB-SVM.
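The incremental idea summarized above can be illustrated with a minimal CPU-only sketch. It is not the authors' implementation: it assumes the linear finite Newton formulation, minimizing (nu/2)·||(e − D(Aw − eb))₊||² + (1/2)(w'w + b²), and it omits the Armijo line search usually paired with that method. The names `solve` and `train_incremental`, the parameter `nu`, and the toy data are hypothetical. The point is that each Newton iteration only needs a small (n+1)×(n+1) system whose terms are sums over row blocks, so the data can be streamed chunk by chunk; the per-block products are the part a GPU version would presumably offload.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def train_incremental(chunks, dim, nu=1.0, iters=50, tol=1e-8):
    """Newton iterations on z = [w; b], accumulating over row blocks.

    chunks: list of (rows, labels) pairs; labels are +1 or -1.
    Assumption: plain Newton steps, no Armijo line search.
    """
    n = dim + 1
    z = [0.0] * n
    for _ in range(iters):
        grad = list(z)  # gradient of the 0.5 * ||z||^2 term
        hess = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
        for rows, labels in chunks:  # one block in memory at a time
            for x, y in zip(rows, labels):
                e = list(x) + [-1.0]  # row of E = [A, -e]
                slack = 1.0 - y * sum(ei * zi for ei, zi in zip(e, z))
                if slack <= 0.0:
                    continue  # point is outside the margin, no contribution
                for i in range(n):
                    grad[i] -= nu * y * slack * e[i]
                    for j in range(n):
                        hess[i][j] += nu * e[i] * e[j]
        step = solve(hess, grad)
        z = [zi - si for zi, si in zip(z, step)]
        if max(abs(s) for s in step) < tol:
            break
    return z[:-1], z[-1]


# Toy, linearly separable data split into blocks of two rows each.
X = [(2, 2), (3, 3), (2, 3), (0, 0), (-1, 0), (0, -1)]
y = [1, 1, 1, -1, -1, -1]
chunks = [(X[i:i + 2], y[i:i + 2]) for i in range(0, len(X), 2)]
w, b = train_incremental(chunks, dim=2)
predictions = [1 if sum(wi * xi for wi, xi in zip(w, x)) - b > 0 else -1
               for x in X]
```

Because the gradient and Hessian are plain sums over rows, the result does not depend on how the rows are split into blocks; only the block size, and hence the memory footprint, changes.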
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Do, TN., Nguyen, VH., Poulet, F. (2008). A Fast Parallel SVM Algorithm for Massive Classification Tasks. In: Le Thi, H.A., Bouvry, P., Pham Dinh, T. (eds) Modelling, Computation and Optimization in Information Systems and Management Sciences. MCO 2008. Communications in Computer and Information Science, vol 14. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87477-5_45
DOI: https://doi.org/10.1007/978-3-540-87477-5_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87476-8
Online ISBN: 978-3-540-87477-5
eBook Packages: Computer Science, Computer Science (R0)