Stream-suitable optimization algorithms for some soft-margin support vector machine variants

Japanese Journal of Statistics and Data Science

Abstract

Soft-margin support vector machines (SVMs) are an important class of classification models, well known to be highly accurate in a variety of settings and across many applications. Training an SVM usually requires that the data be available all at once, in batch. The stochastic majorization–minimization (SMM) algorithm framework instead allows SVMs to be trained on streamed data. We use the SMM framework to construct training algorithms for the hinge loss, squared hinge loss, and logistic loss SVM variants. We prove that each of our three SMM algorithms is convergent and demonstrate that the algorithms are competitive with some state-of-the-art SVM-training methods. An application to the well-known MNIST data set demonstrates the potential of our algorithms.
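
The paper's three algorithms are developed in the full text and are not reproduced here. As a rough illustration of the general idea only, the sketch below applies an averaged-surrogate stochastic MM update to the logistic-loss case: each streamed observation contributes a quadratic majorizer of its logistic loss, built from the classical uniform curvature bound (1/4) x xᵀ, and the decaying-weight running average of these surrogates is minimized in closed form. The function name `smm_logistic`, the weight schedule n^(-alpha), and the ridge penalty `lam` are illustrative assumptions, not the paper's specification; the hinge and squared-hinge variants would require different majorizers.

```python
import numpy as np

def sigmoid(u):
    """Plain logistic function; adequate for a sketch."""
    return 1.0 / (1.0 + np.exp(-u))

def smm_logistic(stream, dim, lam=1e-2, alpha=0.6, seed=0):
    """Hypothetical one-pass SMM-style fit of a logistic-loss linear
    classifier on a stream of (x, y) pairs with y in {-1, +1}.

    Each observation contributes a quadratic majorizer of
    log(1 + exp(-y * theta @ x)), using the uniform curvature bound
    (1/4) * x x^T on the logistic loss; the weighted running average
    of these surrogates (plus an assumed ridge penalty) is minimized
    in closed form at every step.
    """
    rng = np.random.default_rng(seed)
    theta = rng.normal(scale=0.01, size=dim)
    A = np.zeros((dim, dim))  # averaged surrogate curvature
    b = np.zeros(dim)         # averaged surrogate linear term
    for n, (x, y) in enumerate(stream, start=1):
        u = y * (theta @ x)
        g = -sigmoid(-u) * y * x       # loss gradient at current theta
        H = 0.25 * np.outer(x, x)      # majorizing curvature bound
        w = n ** (-alpha)              # decaying averaging weight
        A = (1.0 - w) * A + w * (H + lam * np.eye(dim))
        b = (1.0 - w) * b + w * (H @ theta - g)
        theta = np.linalg.solve(A, b)  # minimize the averaged surrogate
    return theta

# Toy usage on a synthetic, linearly separable stream.
rng = np.random.default_rng(1)
truth = rng.normal(size=5)
xs = rng.normal(size=(5000, 5))
stream = ((x, 1.0 if truth @ x > 0 else -1.0) for x in xs)
theta_hat = smm_logistic(stream, dim=5)
```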

Acknowledgements

We thank the Associate Editor and Reviewer of the article for making helpful comments that greatly improved our exposition. HDN was supported by Australian Research Council (ARC) Grant DE170101134. GJM was supported by ARC Grant DP170100907.

Author information

Corresponding author

Correspondence to Hien D. Nguyen.

About this article

Cite this article

Nguyen, H.D., Jones, A.T. & McLachlan, G.J. Stream-suitable optimization algorithms for some soft-margin support vector machine variants. Jpn J Stat Data Sci 1, 81–108 (2018). https://doi.org/10.1007/s42081-018-0001-y
