An Efficient SMO Algorithm for Solving Non-smooth Problem Arising in \(\varepsilon \)-Insensitive Support Vector Regression

  • Aykut Kocaoğlu


Classical support vector regression (C-SVR) is a powerful function approximation method that is robust against noise and generalizes well, since it is formulated with a regularized error function employing the \(\varepsilon \)-insensitiveness property. To exploit the kernel trick, C-SVR generally solves the Lagrangian dual problem. This paper proposes an efficient sequential minimal optimization (SMO) algorithm for solving a convex non-smooth dual optimization problem, obtained by reformulating the dual problem of C-SVR with the \(l_2\) error loss function, which is equivalent to the \(\varepsilon \)-insensitive version of least squares support vector regression (LSSVR). The algorithm uses a novel, easy-to-compute working set selection (WSS) based on minimizing an upper bound on the difference between consecutive loss function values. The asymptotic convergence of the proposed SMO algorithm to the optimum is also proved. The proposed algorithm for the non-smooth problem covers the SMO algorithms for both LSSVR and C-SVR; indeed, it becomes equivalent to the SMO algorithm with second-order WSS for solving LSSVR when \(\varepsilon =0\). It has the advantage of handling half as many optimization variables as C-SVR, which results in fewer kernel-related matrix evaluations than the standard SMO algorithm developed for C-SVR and raises the probability that the required matrix entries have already been computed and cached. Therefore, the proposed SMO algorithm achieves a shorter training time than the standard SMO algorithm for solving C-SVR, especially when caching is used. Moreover, the superiority of the proposed WSS over its first-order counterpart for solving the non-smooth optimization problem is demonstrated.
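To make the idea of an SMO step on a non-smooth dual concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes one common way of writing the reformulated problem, namely minimizing \(\tfrac{1}{2}\beta ^{\top }Q\beta - y^{\top }\beta + \varepsilon \Vert \beta \Vert _1\) subject to \(\sum _i \beta _i = 0\), with \(Q = K + I/C\); the abstract does not state the exact scaling, so this form is an assumption. The names `objective` and `pair_update` are hypothetical, and the pairs are visited in a plain cyclic sweep as a placeholder for the proposed WSS, which the abstract does not specify.

```python
def objective(beta, Q, y, eps):
    """Assumed non-smooth dual: 0.5*b^T Q b - y^T b + eps*||b||_1."""
    n = len(beta)
    quad = sum(beta[i] * Q[i][j] * beta[j] for i in range(n) for j in range(n))
    lin = sum(y[i] * beta[i] for i in range(n))
    return 0.5 * quad - lin + eps * sum(abs(b) for b in beta)


def pair_update(beta, Q, y, eps, i, j):
    """Exactly minimise along beta_i += t, beta_j -= t (keeps sum(beta) fixed)."""
    n = len(beta)
    # Gradient of the smooth (quadratic) part at the current point.
    g_i = sum(Q[i][k] * beta[k] for k in range(n)) - y[i]
    g_j = sum(Q[j][k] * beta[k] for k in range(n)) - y[j]
    a = Q[i][i] + Q[j][j] - 2.0 * Q[i][j]  # curvature; > 0 when Q = K + I/C
    b = g_i - g_j                          # slope of the smooth part at t = 0

    def phi(t):
        # Objective change along the direction, up to an additive constant.
        return (0.5 * a * t * t + b * t
                + eps * (abs(beta[i] + t) + abs(beta[j] - t)))

    # phi is convex and piecewise quadratic, so its minimiser is either a
    # kink of one absolute value or the stationary point of a smooth piece.
    candidates = [0.0, -beta[i], beta[j]]
    for s_i in (-1.0, 1.0):
        for s_j in (-1.0, 1.0):
            candidates.append(-(b + eps * (s_i - s_j)) / a)
    t = min(candidates, key=phi)

    new_beta = list(beta)
    new_beta[i] += t
    new_beta[j] -= t
    return new_beta


if __name__ == "__main__":
    # Toy data with a linear kernel; all values are illustrative only.
    x = [0.0, 1.0, 2.0]
    y = [0.1, 0.9, 2.1]
    C, eps = 10.0, 0.1
    Q = [[x[p] * x[q] + (1.0 / C if p == q else 0.0) for q in range(3)]
         for p in range(3)]
    beta = [0.0, 0.0, 0.0]  # feasible start: the entries sum to zero
    for _ in range(20):      # cyclic sweep, a placeholder for a real WSS
        for p in range(3):
            for q in range(p + 1, 3):
                beta = pair_update(beta, Q, y, eps, p, q)
    print(beta, objective(beta, Q, y, eps))
```

In this sketch there is one variable \(\beta _i\) per training point rather than a pair of multipliers, which mirrors the halving of the number of variables described above. With `eps = 0` every stationary candidate collapses to `-b / a`, the usual two-variable step for a smooth quadratic, which is consistent with the stated reduction to LSSVR when \(\varepsilon =0\).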


Sequential minimal optimization; Support vector regression; Non-smooth optimization; Working set selection; Least squares support vector regression



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Technical Programs, İzmir Vocational School, Dokuz Eylul University, Izmir, Turkey
