Abstract
We present a general decomposition algorithm that is uniformly applicable to every (suitably normalized) instance of Convex Quadratic Optimization and efficiently approaches the optimal solution. The number of iterations required to come within ε of optimality grows linearly with 1/ε and quadratically with the number m of variables. The working set selection can be performed in polynomial time. If we restrict our considerations to instances of Convex Quadratic Optimization with at most k₀ equality constraints for some fixed constant k₀, plus some so-called box constraints (conditions that hold for most variants of SVM optimization), the working set can be found in linear time. Our analysis builds on a generalization of the concept of rate certifying pairs introduced by Hush and Scovel. In order to extend their results to arbitrary instances of Convex Quadratic Optimization, we introduce the general notion of a rate certifying q-set. We improve on the results of Hush and Scovel [8] in several ways. First, our result holds for Convex Quadratic Optimization, whereas the results of Hush and Scovel are specialized to SVM optimization. Second, we achieve a higher rate of convergence even for the special case of SVM optimization (despite the generality of our approach). Third, our analysis is technically simpler.
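To illustrate the class of algorithms the abstract refers to, here is a minimal sketch of a two-variable decomposition method (SMO-style) for the SVM-type special case: a box-constrained convex QP with a single equality constraint. It uses maximal-violating-pair working set selection, a simple instance of the rate-certifying-pair idea credited to Hush and Scovel; the paper's rate certifying q-sets generalize this. All names and the concrete selection rule below are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def smo_decomposition(Q, c, y, C, tol=1e-6, max_iter=10_000):
    """Two-variable decomposition for
           min 0.5 a^T Q a + c^T a   s.t.  y^T a = 0,  0 <= a <= C,
    with Q positive semidefinite and y in {+1, -1}^n (SVM-type instance).
    Working set: the maximal violating pair (a simple rate-certifying pair)."""
    n = len(c)
    a = np.zeros(n)
    grad = c.copy()  # gradient of the objective at a = 0
    for _ in range(max_iter):
        # Feasible ascent/descent candidates: moving a_k along +y_k ("up")
        # or -y_k ("down") must stay inside the box [0, C].
        up = [(-y[k] * grad[k], k) for k in range(n)
              if (y[k] > 0 and a[k] < C) or (y[k] < 0 and a[k] > 0)]
        down = [(-y[k] * grad[k], k) for k in range(n)
                if (y[k] > 0 and a[k] > 0) or (y[k] < 0 and a[k] < C)]
        if not up or not down:
            break
        gi, i = max(up)    # most "violating" index to push up
        gj, j = min(down)  # most "violating" index to push down
        if gi - gj < tol:  # KKT violation small => near-optimal
            break
        # Direction d = y_i e_i - y_j e_j preserves the equality constraint.
        d = np.zeros(n)
        d[i], d[j] = y[i], -y[j]
        curv = Q[i, i] + Q[j, j] - 2 * y[i] * y[j] * Q[i, j]
        t = (gi - gj) / curv if curv > 0 else np.inf
        # Clip the step so both coordinates stay in [0, C].
        t = min(t,
                (C - a[i]) if y[i] > 0 else a[i],
                a[j] if y[j] > 0 else (C - a[j]))
        a += t * d
        grad += t * (Q @ d)  # rank-two gradient update, O(n) per iteration
    return a
```

Each iteration touches only two variables, so the per-iteration cost is dominated by the O(n) selection scan and gradient update; the paper's contribution is bounding how many such iterations suffice to come within ε of optimality.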
This work was supported in part by the IST Programme of the European Community, under the PASCAL Network of Excellence, IST-2002-506778. This publication only reflects the authors’ views. This work was furthermore supported by the Deutsche Forschungsgemeinschaft Grant SI 498/7-1.
References
Boser, B.E., Guyon, I.M., Vapnik, V.N.: A training algorithm for optimal margin classifiers. In: Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory, pp. 144–152 (1992)
Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)
Chang, C.-C., Hsu, C.-W., Lin, C.-J.: The analysis of decomposition methods for support vector machines. IEEE Transactions on Neural Networks 11(4), 1003–1008 (2000)
Chang, C.-C., Lin, C.-J.: LIBSVM: A library for support vector machines (2001), Available from http://www.csie.ntu.edu.tw/~cjlin/libsvm
Chen, P.-H., Fan, R.-E., Lin, C.-J.: A study on SMO-type decomposition methods for support vector machines (2005), Available from http://www.csie.ntu.edu.tw/~cjlin/papers/generalSMO.pdf
Dunn, J.: Rates of convergence for conditional gradient algorithms near singular and non-singular extremals. SIAM Journal on Control and Optimization 17(2), 187–211 (1979)
Hsu, C.-W., Lin, C.-J.: A simple decomposition method for support vector machines. Machine Learning 46(1–3), 291–314 (2002)
Hush, D., Scovel, C.: Polynomial-time decomposition algorithms for support vector machines. Machine Learning 51(1), 51–71 (2003)
Joachims, T.: Making large scale SVM learning practical. In: Schölkopf, B., Burges, C.J.C., Smola, A.J. (eds.) Advances in Kernel Methods—Support Vector Learning, pp. 169–184. MIT Press, Cambridge (1998)
Keerthi, S.S., Gilbert, E.G.: Convergence of a generalized SMO algorithm for SVM classifier design. Machine Learning 46(1–3), 351–360 (2002)
Keerthi, S.S., Shevade, S.K., Bhattacharyya, C., Murthy, K.R.K.: Improvements to SMO algorithm for SVM regression. IEEE Transactions on Neural Networks 11(5), 1188–1193 (2000)
Keerthi, S.S., Shevade, S.K., Bhattacharyya, C., Murthy, K.R.K.: Improvements to Platt’s SMO algorithm for SVM classifier design. Neural Computation 13(3), 637–649 (2001)
Laskov, P.: Feasible direction decomposition algorithms for training support vector machines. Machine Learning 46(1–3), 315–349 (2002)
Liao, S.-P., Lin, H.-T., Lin, C.-J.: A note on the decomposition methods for support vector regression. Neural Computation 14(6), 1267–1281 (2002)
Lin, C.-J.: Linear convergence of a decomposition method for support vector machines (2001), Available from http://www.csie.ntu.edu.tw/~cjlin/papers/linearconv.pdf
Lin, C.-J.: On the convergence of the decomposition method for support vector machines. IEEE Transactions on Neural Networks 12(6), 1288–1298 (2001)
Lin, C.-J.: Asymptotic convergence of an SMO algorithm without any assumptions. IEEE Transactions on Neural Networks 13(1), 248–250 (2002)
Lin, C.-J.: A formal analysis of stopping criteria of decomposition methods for support vector machines. IEEE Transactions on Neural Networks 13(5), 1045–1052 (2002)
List, N.: Convergence of a generalized gradient selection approach for the decomposition method. In: Ben-David, S., Case, J., Maruoka, A. (eds.) ALT 2004. LNCS (LNAI), vol. 3244, pp. 338–349. Springer, Heidelberg (2004)
List, N., Simon, H.U.: A general convergence theorem for the decomposition method. In: Shawe-Taylor, J., Singer, Y. (eds.) COLT 2004. LNCS (LNAI), vol. 3120, pp. 363–377. Springer, Heidelberg (2004)
Mangasarian, O.L., Musicant, D.R.: Successive overrelaxation for support vector machines. IEEE Transactions on Neural Networks 10(5), 1032–1037 (1999)
Mangasarian, O.L., Musicant, D.R.: Active support vector machine classification. In: Advances in Neural Information Processing Systems 12, pp. 577–583. MIT Press, Cambridge (2000)
Mangasarian, O.L., Musicant, D.R.: Lagrangian support vector machines. Journal of Machine Learning Research 1, 161–177 (2001)
Megiddo, N.: Linear programming in linear time when the dimension is fixed. Journal of the Association for Computing Machinery 31(1), 114–127 (1984)
Osuna, E.E., Freund, R., Girosi, F.: Training support vector machines: an application to face detection. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 130–136 (1997)
Platt, J.C.: Fast training of support vector machines using sequential minimal optimization. In: Schölkopf, B., Burges, C.J.C., Smola, A.J. (eds.) Advances in Kernel Methods—Support Vector Learning, pp. 185–208. MIT Press, Cambridge (1998)
Saunders, C., Stitson, M.O., Weston, J., Bottou, L., Schölkopf, B., Smola, A.J.: Support vector machine reference manual. Technical Report CSD-TR-98-03, Royal Holloway, University of London, Egham, UK (1998)
Simon, H.U.: On the complexity of working set selection. In: Ben-David, S., Case, J., Maruoka, A. (eds.) ALT 2004. LNCS (LNAI), vol. 3244, pp. 324–337. Springer, Heidelberg (2004)
Vapnik, V.: Statistical Learning Theory. Series on Adaptive and Learning Systems for Signal Processing, Communications, and Control. John Wiley & Sons, Chichester (1998)
Vishwanathan, S.V.N., Smola, A.J., Murty, M.N.: SimpleSVM. In: Proceedings of the 20th International Conference on Machine Learning (2003)
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
List, N., Simon, H.U. (2005). General Polynomial Time Decomposition Algorithms. In: Auer, P., Meir, R. (eds) Learning Theory. COLT 2005. Lecture Notes in Computer Science(), vol 3559. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11503415_21
Print ISBN: 978-3-540-26556-6
Online ISBN: 978-3-540-31892-7