Support Vector Machines

  • Mahesh Pal
  • Pakorn Watanachaturaporn


Support Vector Machines (SVMs) are a relatively new generation of techniques for classification and regression problems. They are based on Statistical Learning Theory, which has its origins in Machine Learning, a field defined by Kohavi and Foster (1998) as:

...Machine Learning is the field of scientific study that concentrates on induction algorithms and on other algorithms that can be said to “learn.”
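As a concrete illustration of the binary classification setting the chapter addresses, the following sketch trains a linear SVM by stochastic sub-gradient descent on the primal hinge-loss objective. This is a minimal pedagogical example, not the chapter's own formulation: the function names, hyperparameters, and toy data are all ours.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Train a linear SVM with stochastic sub-gradient descent.

    Minimizes the primal objective
        lam/2 * ||w||^2 + mean(max(0, 1 - y_i * (w . x_i + b)))
    where X is a list of feature vectors and y holds labels in {-1, +1}.
    """
    d = len(X[0])
    w = [0.0] * d
    b = 0.0
    rng = random.Random(0)          # fixed seed for reproducibility
    idx = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(idx)
        for i in idx:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            if margin < 1:
                # Margin constraint violated: hinge sub-gradient plus
                # regularization term.
                w = [wj - lr * (lam * wj - y[i] * xj)
                     for wj, xj in zip(w, X[i])]
                b += lr * y[i]
            else:
                # Only the regularization term contributes.
                w = [wj - lr * lam * wj for wj in w]
    return w, b

def predict(w, b, x):
    """Classify x by the sign of the decision function w . x + b."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Toy linearly separable data: class +1 near (2, 2), class -1 near (-2, -2).
X = [[2, 2], [3, 2], [2, 3], [-2, -2], [-3, -2], [-2, -3]]
y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X, y)
print(all(predict(w, b, x) == t for x, t in zip(X, y)))  # → True
```

Real SVM training solves the dual quadratic program (as in Platt's SMO or the decomposition methods cited below) rather than this primal sub-gradient loop, but the sketch shows the margin-maximizing objective in its simplest runnable form.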


Keywords: Support Vector Machine · Empirical Risk · Hypothesis Space · Binary Classification Problem · Multiclass Classification




References

  1. Bennett KP, Campbell C (2000) Support vector machines: hype or hallelujah? SIGKDD Explorations 2(2): 1–13
  2. Bennett KP, Mangasarian OL (1992) Robust linear programming discrimination of two linearly inseparable sets. Optimization Methods and Software 1: 23–34
  3. Boser BE, Guyon IM, Vapnik VN (1992) A training algorithm for optimal margin classifiers. In: Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory, Pittsburgh, PA, pp 144–152
  4. Campbell C (2002) Kernel methods: a survey of current techniques. Neurocomputing 48: 63–84
  5. Campbell C, Cristianini N (1998) Simple training algorithms for support vector machines. Technical Report, Bristol University
  6. Cortes C, Vapnik VN (1995) Support vector networks. Machine Learning 20: 273–297
  7. Courant R, Hilbert D (1970) Methods of mathematical physics, Vols. I and II. Wiley Interscience, New York
  8. Cover TM (1965) Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition. IEEE Transactions on Electronic Computers EC-14: 326–334
  9. CPLEX Optimization Inc. (1992) CPLEX user's guide. Incline Village, NV
  10. Crammer K, Singer Y (2001) On the algorithmic implementation of multiclass kernel-based vector machines. Journal of Machine Learning Research 2: 265–292
  11. Cristianini N, Shawe-Taylor J (2000) An introduction to support vector machines and other kernel-based learning methods. Cambridge University Press, Cambridge, UK
  12. Ferris MC, Munson TS (2000a) Interior point methods for massive support vector machines. Data Mining Institute Technical Report 00-05, Computer Science Department, University of Wisconsin, Madison, WI
  13. Ferris MC, Munson TS (2000b) Semi-smooth support vector machines. Data Mining Institute Technical Report 00-09, Computer Science Department, University of Wisconsin, Madison, WI
  14. Friedman JH (1994) Flexible metric nearest neighbor classification. Technical Report, Department of Statistics, Stanford University
  15. Friedman JH (1996) Another approach to polychotomous classification. Technical Report, Department of Statistics and Stanford Linear Accelerator Center, Stanford University
  16. Gray RM, Davisson LD (1986) Random processes: a mathematical approach for engineers. Prentice-Hall, Englewood Cliffs, NJ
  17. Hastie TJ, Tibshirani RJ (1996) Discriminant adaptive nearest neighbor classification. IEEE Transactions on Pattern Analysis and Machine Intelligence 18(6): 607–615
  18. Hastie TJ, Tibshirani RJ (1998) Classification by pairwise coupling. In: Jordan MI, Kearns MJ, Solla SA (eds) Advances in neural information processing systems 10. The MIT Press, Cambridge, MA, pp 507–513
  19. Haykin S (1999) Neural networks: a comprehensive foundation. Prentice Hall, Upper Saddle River, NJ
  20. Hsu C-W, Lin CJ (2002) A simple decomposition method for support vector machines. Machine Learning 46: 291–314
  21. Hughes GF (1968) On the mean accuracy of statistical pattern recognizers. IEEE Transactions on Information Theory 14(1): 55–63
  22. Knerr S, Personnaz L, Dreyfus G (1990) Single-layer learning revisited: a stepwise procedure for building and training a neural network. In: Neurocomputing: algorithms, architectures and applications, NATO ASI. Springer Verlag, Berlin
  23. Kohavi R, Foster P (1998) Editorial: glossary of terms. Machine Learning 30: 271–274
  24. Kreßel U (1999) Pairwise classification and support vector machines. In: Schölkopf B, Burges CJC, Smola AJ (eds) Advances in kernel methods – support vector learning. The MIT Press, Cambridge, MA, pp 255–268
  25. Lee Y, Lin Y, Wahba G (2001) Multicategory support vector machines. Technical Report 1043, Department of Statistics, University of Wisconsin, Madison, WI
  26. Luenberger D (1984) Linear and nonlinear programming, 2nd edition. Addison-Wesley, Menlo Park, CA
  27. Mangasarian OL, Musicant DR (1998) Successive overrelaxation for support vector machines. Technical Report, Computer Sciences Department, University of Wisconsin, Madison, WI
  28. Mangasarian OL, Musicant DR (2000a) Active support vector machine classification. Technical Report 0004, Data Mining Institute, Computer Sciences Department, University of Wisconsin, Madison, WI
  29. Mangasarian OL, Musicant DR (2000b) Lagrangian support vector machines. Technical Report 0006, Data Mining Institute, Computer Sciences Department, University of Wisconsin, Madison, WI
  30. Mather PM (1999) Computer processing of remotely-sensed images: an introduction, 2nd edition. John Wiley and Sons, Chichester, UK
  31. Mercer J (1909) Functions of positive and negative type, and their connection with the theory of integral equations. Philosophical Transactions of the Royal Society of London (A) 209: 415–446
  32. Minsky ML, Papert SA (1969) Perceptrons. MIT Press, Cambridge, MA
  33. Murtagh BA, Saunders MA (1987) MINOS 5.1 user's guide. Technical Report SOL 83-20R, Stanford University
  34. Osuna EE, Freund R, Girosi F (1997) Support vector machines: training and applications. A.I. Memo No. 1602, CBCL Paper No. 144, Artificial Intelligence Laboratory, Massachusetts Institute of Technology
  35. Pal M (2002) Factors influencing the accuracy of remote sensing classifications: a comparative study. PhD Thesis (unpublished), University of Nottingham, UK
  36. Platt JC (1999) Fast training of support vector machines using sequential minimal optimization. In: Schölkopf B, Burges CJC, Smola AJ (eds) Advances in kernel methods – support vector learning. The MIT Press, Cambridge, MA, pp 185–208
  37. Platt JC, Cristianini N, Shawe-Taylor J (2000) Large margin DAGs for multiclass classification. In: Solla SA, Leen TK, Müller K-R (eds) Advances in neural information processing systems 12. The MIT Press, Cambridge, MA, pp 547–553
  38. Richards JA, Jia X (1999) Remote sensing digital image analysis: an introduction, 3rd edition. Springer Verlag, Berlin, Heidelberg, New York
  39. Schölkopf B (1997) Support vector learning. PhD Thesis, Technische Universität Berlin
  40. Schölkopf B, Smola AJ (2002) Learning with kernels – support vector machines, regularization, optimization and beyond. The MIT Press, Cambridge, MA
  41. Takahashi F, Abe S (2002) Decision-tree-based multiclass support vector machines. In: Proceedings of the 9th International Conference on Neural Information Processing (ICONIP'02), vol 3, pp 1418–1422
  42. Tso BCK, Mather PM (2001) Classification methods for remotely sensed data. Taylor and Francis, London
  43. Vanderbei RJ (1997) User's manual – version 3.10. Technical Report SOR-97-08, Statistics and Operations Research, Princeton University, Princeton, NJ
  44. Vapnik VN (1982) Estimation of dependences based on empirical data. Springer Verlag, Berlin
  45. Vapnik VN (1995) The nature of statistical learning theory. Springer Verlag, New York
  46. Vapnik VN (1998) Statistical learning theory. John Wiley and Sons, New York
  47. Vapnik VN (1999) An overview of statistical learning theory. IEEE Transactions on Neural Networks 10: 988–999
  48. Vapnik VN, Chervonenkis AJ (1971) On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and its Applications 17: 264–280
  49. Vapnik VN, Chervonenkis AJ (1974) Theory of pattern recognition (in Russian). Nauka, Moscow
  50. Vapnik VN, Chervonenkis AJ (1979) Theory of pattern recognition (in German). Akademie Verlag, Berlin
  51. Weston J, Watkins C (1998) Multi-class support vector machines. Technical Report CSD-TR-98-04, Royal Holloway, University of London, UK
  52. Weston J, Watkins C (1999) Multi-class support vector machines. In: Verleysen M (ed) Proceedings of ESANN'99, 7th European Symposium on Artificial Neural Networks, D-Facto, Brussels, Belgium, pp 219–224

Copyright information

© Springer-Verlag Berlin Heidelberg 2004
