Abstract
We consider the problem of selecting and tuning the learning parameters of support vector machines, especially for the classification of large and unbalanced data sets. We show why and how simple models with few parameters should be refined, and we propose an automated approach for tuning the increased number of parameters in the extended model. Based on a sensitive quality measure, we analyze correlations between the number of parameters, the learning cost, and the performance of the trained SVM in classifying independent test data. In addition, we study the influence of the quality measure on the classification performance and compare the behavior of serial and asynchronous parallel parameter tuning on an IBM p690 cluster.
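To make the two ideas in the abstract concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not define its quality measure, so the mean of sensitivity and specificity is used here as an assumed stand-in for a measure that is sensitive to the minority class. The names `balanced_quality`, `compass_search`, and `toy_objective` are hypothetical, and `toy_objective` merely replaces the expensive step "train an SVM with these parameters and score it on validation data". The search polls all trial points of one iteration in parallel but waits for all of them, so it is a synchronous simplification; an asynchronous variant would act on results as they arrive.

```python
from concurrent.futures import ThreadPoolExecutor


def accuracy(y_true, y_pred):
    """Fraction of correctly classified points."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def balanced_quality(y_true, y_pred):
    """Assumed quality measure: mean of sensitivity and specificity.

    Labels are expected in {+1, -1}. Unlike plain accuracy, this value
    drops when the minority class is ignored.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == -1)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    sensitivity = tp / pos if pos else 0.0
    specificity = tn / neg if neg else 0.0
    return 0.5 * (sensitivity + specificity)


def compass_search(objective, start, step=1.0, min_step=1e-3, workers=4):
    """Derivative-free maximization by polling +/- step on each coordinate.

    All trial points of one iteration are evaluated in parallel; the step
    is halved whenever no trial point improves on the current best.
    """
    best, best_val = list(start), objective(start)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while step >= min_step:
            trials = []
            for i in range(len(best)):
                for sign in (1.0, -1.0):
                    point = list(best)
                    point[i] += sign * step
                    trials.append(point)
            values = list(pool.map(objective, trials))
            top = max(range(len(trials)), key=values.__getitem__)
            if values[top] > best_val:
                best, best_val = trials[top], values[top]
            else:
                step *= 0.5
    return best, best_val


def toy_objective(log_params):
    """Hypothetical stand-in for SVM validation quality.

    The three coordinates play the role of log-scaled parameters, e.g. one
    penalty per class plus a kernel width (an assumption for illustration).
    """
    log_c_pos, log_c_neg, log_gamma = log_params
    return -((log_c_pos - 2.0) ** 2
             + (log_c_neg + 1.0) ** 2
             + (log_gamma - 0.5) ** 2)


if __name__ == "__main__":
    # 5 positive and 95 negative points; the classifier predicts all negative.
    y_true = [1] * 5 + [-1] * 95
    y_pred = [-1] * 100
    print("accuracy:", accuracy(y_true, y_pred))
    print("balanced quality:", balanced_quality(y_true, y_pred))
    print("search result:", compass_search(toy_objective, [0.0, 0.0, 0.0]))
```

On the unbalanced example, plain accuracy rewards a classifier that never predicts the minority class, while the balanced measure does not, which is why the choice of quality measure matters for tuning. Searching in log-scaled coordinates is a common convention for SVM parameters and is likewise an assumption here.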
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Eitrich, T., Lang, B. (2005). Parallel Tuning of Support Vector Machine Learning Parameters for Large and Unbalanced Data Sets. In: Berthold, M.R., Glen, R.C., Diederichs, K., Kohlbacher, O., Fischer, I. (eds) Computational Life Sciences. CompLife 2005. Lecture Notes in Computer Science, vol 3695. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11560500_23
DOI: https://doi.org/10.1007/11560500_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29104-6
Online ISBN: 978-3-540-31726-5
eBook Packages: Computer Science, Computer Science (R0)