Abstract
The support vector machine is a supervised learning technique for classification that is increasingly used in many applications of data mining, engineering, and bioinformatics. This chapter aims to provide an introduction to the method, ranging from the basic concept of the optimal separating hyperplane to its nonlinear generalization through kernels. A general framework of kernel methods that encompasses the support vector machine as a special case is outlined. In addition, statistical properties that illuminate both the advantages and the limitations of the method, which stem from its specific mechanism for classification, are briefly discussed. To illustrate the method and related practical issues, an application to real data with high-dimensional features is presented.
Copyright information
© 2010 Humana Press, a part of Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Lee, Y. (2010). Support Vector Machines for Classification: A Statistical Portrait. In: Bang, H., Zhou, X., van Epps, H., Mazumdar, M. (eds) Statistical Methods in Molecular Biology. Methods in Molecular Biology, vol 620. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-60761-580-4_11
DOI: https://doi.org/10.1007/978-1-60761-580-4_11
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-60761-578-1
Online ISBN: 978-1-60761-580-4
eBook Packages: Springer Protocols