Support Vector Machines for Classification: A Statistical Portrait

  • Protocol

Part of the book series: Methods in Molecular Biology (MIMB, volume 620)

Abstract

The support vector machine is a supervised learning technique for classification that is increasingly used in data mining, engineering, and bioinformatics. This chapter provides an introduction to the method, ranging from the basic concept of the optimal separating hyperplane to its nonlinear generalization through kernels. A general framework of kernel methods that encompasses the support vector machine as a special case is outlined. In addition, statistical properties that illuminate both the advantages and the limitations of the method, which stem from its specific mechanism for classification, are briefly discussed. To illustrate the method and related practical issues, an application to real data with high-dimensional features is presented.
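
For orientation, the standard textbook form of the method summarized above can be written as follows. This is a sketch in assumed notation, not necessarily the notation used in the chapter: training pairs $(x_i, y_i)$ with labels $y_i \in \{-1, +1\}$, a tuning parameter $\lambda > 0$, a kernel $K$ with reproducing kernel Hilbert space $\mathcal{H}_K$, and $(t)_+ = \max(t, 0)$.

```latex
% Linear soft-margin SVM: hinge loss plus a ridge-type penalty
% on the normal vector of the separating hyperplane.
\min_{\beta_0,\,\beta}\;
  \frac{1}{n}\sum_{i=1}^{n}\bigl(1 - y_i(\beta_0 + \beta^{\top}x_i)\bigr)_{+}
  + \lambda\,\lVert\beta\rVert^{2}

% Kernel generalization: f = b + h with h in the RKHS H_K.
\min_{b,\;h\in\mathcal{H}_K}\;
  \frac{1}{n}\sum_{i=1}^{n}\bigl(1 - y_i f(x_i)\bigr)_{+}
  + \lambda\,\lVert h\rVert_{\mathcal{H}_K}^{2},
  \qquad f(x) = b + h(x)

% By the representer theorem the minimizer has a finite expansion,
% and a new point x is classified by the sign of f(x).
\hat f(x) = \hat b + \sum_{i=1}^{n}\hat c_i\,K(x_i, x),
  \qquad \hat y(x) = \operatorname{sign}\bigl(\hat f(x)\bigr)
```

With the linear kernel $K(x, x') = x^{\top}x'$ the second problem reduces to the first, which is one way to read the statement that the kernel framework contains the separating-hyperplane classifier as a special case.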


© 2010 Humana Press, a part of Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Lee, Y. (2010). Support Vector Machines for Classification: A Statistical Portrait. In: Bang, H., Zhou, X., van Epps, H., Mazumdar, M. (eds) Statistical Methods in Molecular Biology. Methods in Molecular Biology, vol 620. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-60761-580-4_11

  • DOI: https://doi.org/10.1007/978-1-60761-580-4_11

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-60761-578-1

  • Online ISBN: 978-1-60761-580-4

  • eBook Packages: Springer Protocols
