CloudSVM: Training an SVM Classifier in Cloud Computing Systems

Catak, F. Ozgur; Balaban, M. Erdal

doi:10.1007/978-3-642-37015-1_6

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCCN, volume 7719)

Abstract

In conventional distributed machine learning methods, distributed support vector machine (SVM) algorithms are trained over pre-configured intranet/internet environments to find an optimal classifier. These methods are complicated and costly for large datasets. Hence, we propose a method, referred to as the Cloud SVM training mechanism (CloudSVM), that uses the MapReduce technique in a cloud computing environment for distributed machine learning applications. The method (i) trains the SVM algorithm on distributed cloud storage servers that work concurrently; (ii) merges the support vectors from every trained cloud node; and (iii) iterates these two steps until the SVM converges to the optimal classifier function. A single computer is incapable of training an SVM algorithm on large-scale data sets. The results of this study are important for training on large-scale data sets in machine learning applications. We show that iterative training of a split data set in a cloud computing environment using SVM converges to a global optimal classifier in a finite number of iterations.
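The three steps listed in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation and rests on several assumptions: the function names `train_svm`, `cloud_svm`, and `predict` are hypothetical; the local solver is a plain dual coordinate descent for a linear soft-margin SVM, swapped in for whatever solver the paper uses; the "nodes" are processed sequentially instead of through MapReduce; and both the feedback of merged support vectors into the next round and the stopping rule (the merged set stops changing) are assumed details.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_svm(samples, C=1.0, passes=50):
    """Linear soft-margin SVM via dual coordinate descent (stand-in solver).

    samples: non-empty list of (features, label) pairs, label in {-1, +1}.
    Returns (w, support_vectors); the last entry of w plays the role of the bias.
    """
    xs = [tuple(x) + (1.0,) for x, _ in samples]  # constant feature acts as bias
    ys = [y for _, y in samples]
    alpha = [0.0] * len(xs)
    w = [0.0] * len(xs[0])
    for _ in range(passes):
        for i, (x, y) in enumerate(zip(xs, ys)):
            grad = y * dot(w, x) - 1.0
            # Exact one-dimensional update, clipped to the box [0, C].
            new_alpha = min(max(alpha[i] - grad / dot(x, x), 0.0), C)
            step = (new_alpha - alpha[i]) * y
            if step != 0.0:
                w = [wj + step * xj for wj, xj in zip(w, x)]
                alpha[i] = new_alpha
    # Support vectors are the samples with a non-zero dual coefficient.
    support_vectors = {s for s, a in zip(samples, alpha) if a > 0.0}
    return w, support_vectors

def cloud_svm(partitions, max_rounds=20):
    """Iterate: train on each partition, merge support vectors, repeat."""
    global_svs = set()
    for round_no in range(1, max_rounds + 1):
        merged = set()
        for chunk in partitions:  # step (i): would run concurrently on real nodes
            # Assumption: each node also sees the previous round's merged SVs.
            _, local_svs = train_svm(sorted(set(chunk) | global_svs))
            merged |= local_svs   # step (ii): merge support vectors
        if merged == global_svs:  # step (iii): assumed stopping rule
            break
        global_svs = merged
    w, _ = train_svm(sorted(global_svs))  # final classifier from merged SVs
    return w, global_svs, round_no

def predict(w, x):
    return 1 if dot(w, tuple(x) + (1.0,)) >= 0 else -1

# Toy, well-separated data split across two hypothetical "nodes".
positives = [((2.0, 2.0), 1), ((3.0, 2.5), 1), ((2.5, 4.0), 1), ((4.0, 3.0), 1)]
negatives = [((-x1, -x2), -1) for (x1, x2), _ in positives]
partitions = [positives[:2] + negatives[:2], positives[2:] + negatives[2:]]

w, svs, rounds = cloud_svm(partitions)
```

In this sketch only the support vectors travel between rounds, which is the property that makes the scheme attractive when the full data set does not fit on one machine. A real deployment would replace the inner loop over `partitions` with map tasks and the set union with a reduce step.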

Author information

Authors and Affiliations

National Research Institute of Electronics and Cryptology (UEKAE), Tubitak, Turkey
F. Ozgur Catak
Quantitative Methods, Istanbul University, Turkey
M. Erdal Balaban

Editor information

Editors and Affiliations

Wuhan University of Technology, Heping Road 1178, Wuchang District, 430081, Wuhan, Hubei, China
Qiaohong Zu
Hayes Park Central, Fujitsu Laboratories of Europe Ltd., Hayes End Road, UB4 8FE, Hayes, Middlesex, UK
Bo Hu
Department of Electrical and Electronics Engineering, Aksaray University, Merkez Kampüsü, 68100, Aksaray, Turkey
Atilla Elçi

About this paper

Cite this paper

Catak, F.O., Balaban, M.E. (2013). CloudSVM: Training an SVM Classifier in Cloud Computing Systems. In: Zu, Q., Hu, B., Elçi, A. (eds) Pervasive Computing and the Networked World. ICPCA/SWS 2012. Lecture Notes in Computer Science, vol 7719. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37015-1_6

DOI: https://doi.org/10.1007/978-3-642-37015-1_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-37014-4
Online ISBN: 978-3-642-37015-1
eBook Packages: Computer Science, Computer Science (R0)
