Support Vector Machines

Part of the book series: Springer Handbooks of Computational Statistics (SHCS)

Abstract

This chapter introduces the basic concepts and ideas of Support Vector Machines (SVMs). The first section formulates the learning problem in a statistical framework, with a special focus on the concept of consistency, which leads to the principle of structural risk minimization (SRM). Applying these ideas to classification problems yields the basic, linear formulation of the SVM, described in Sect. 30.3. We then introduce the so-called “kernel trick” as a tool for building a non-linear SVM and for applying an SVM to non-vectorial data (Sect. 30.4). Practical issues in implementing SVM training algorithms, and the related optimization problems, are the topic of Sect. 30.5. Extensions of the SVM algorithms to non-linear regression and novelty detection are presented in Sect. 30.6. Section 30.7 briefly describes the most successful applications of the SVM. Finally, Sect. 30.8 summarizes the main ideas of the chapter.

Author information

Correspondence to Konrad Rieck.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Rieck, K. et al. (2012). Support Vector Machines. In: Gentle, J., Härdle, W., Mori, Y. (eds) Handbook of Computational Statistics. Springer Handbooks of Computational Statistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21551-3_30
