Abstract
We explore an algorithm for training kernel SVMs that can represent the learned rule using arbitrary basis vectors, not just the support vectors (SVs) from the training set. This added flexibility yields two benefits. First, it makes it possible to find sparser solutions of good quality, substantially speeding up prediction. Second, the improved sparsity can also make training of kernel SVMs more efficient, especially for high-dimensional and sparse data (e.g., text classification). This has the potential to make training of kernel SVMs tractable for large training sets, where conventional methods scale quadratically due to the linear growth of the number of SVs. In addition to a theoretical analysis of the algorithm, we also present an empirical evaluation.
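To see why the choice of basis vectors matters for prediction speed, recall that a kernel SVM's decision rule is a kernel expansion whose cost is linear in the number of expansion vectors. The following minimal Python sketch illustrates this form; the function names, the RBF kernel choice, and the toy values are illustrative assumptions, not the authors' implementation.

import numpy as np

def rbf_kernel(a, b, gamma=0.1):
    """Gaussian RBF kernel between two feature vectors."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def predict(x, basis_vectors, weights, bias, kernel=rbf_kernel):
    """Evaluate the kernel expansion f(x) = sum_i w_i * K(b_i, x) + bias.

    A conventional kernel SVM must draw basis_vectors from the training
    set (its support vectors), so prediction cost grows with the number
    of SVs. The algorithm in this paper instead learns a small set of
    arbitrary basis vectors, shrinking this sum and speeding up
    prediction proportionally.
    """
    return sum(w * kernel(b, x) for w, b in zip(weights, basis_vectors)) + bias

# Toy usage: a rule with only two basis vectors in R^2 scores a test point.
basis = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
weights = [0.7, -0.3]
print(predict(np.array([0.5, 0.5]), basis, weights, bias=0.1))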
Additional information
Editors: Aleksander Kołcz, Dunja Mladenić, Wray Buntine, Marko Grobelnik, and John Shawe-Taylor.
Cite this article
Joachims, T., Yu, C.-N. J. Sparse kernel SVMs via cutting-plane training. Mach Learn 76, 179–193 (2009). https://doi.org/10.1007/s10994-009-5126-6