Data Discrimination via Nonlinear Generalized Support Vector Machines

  • O. L. Mangasarian
  • David R. Musicant
Part of the Applied Optimization book series (APOP, volume 50)


The main purpose of this paper is to show that new formulations of support vector machines can generate nonlinear separating surfaces which can discriminate between elements of a given set better than a linear surface. The principal approach used is that of generalized support vector machines (GSVMs) [21] which employ possibly indefinite kernels. The GSVM training procedure is carried out by either a simple successive overrelaxation (SOR) [22] iterative method or by linear programming. This novel combination of powerful support vector machines [28, 7] with the highly effective SOR computational algorithm [19, 20, 17], or with linear programming, allows us to use a nonlinear surface to discriminate between elements of a dataset that belong to one of two categories. Numerical results on a number of datasets show improved testing set correctness, by as much as a factor of two, when comparing the nonlinear GSVM surface to a linear separating surface.
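The SOR training procedure referenced above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the Gaussian kernel, the parameter values, and the device of adding 1 to the kernel to absorb the bias term (so that the dual becomes a simple box-constrained quadratic program) are assumptions made for the sketch.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gaussian kernel matrix K[i, j] = exp(-gamma * ||A_i - B_j||^2)
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * sq)

def sor_svm(X, y, C=1.0, gamma=1.0, omega=1.0, n_sweeps=200):
    """Train a kernel SVM by successive overrelaxation (SOR) on the dual:

        minimize 0.5 u'Qu - e'u   subject to   0 <= u <= C,

    where Q = diag(y) (K + 1) diag(y); adding 1 to the kernel folds the
    bias into Q, leaving only box constraints (an assumption of this sketch).
    """
    m = X.shape[0]
    Q = np.outer(y, y) * (rbf_kernel(X, X, gamma) + 1.0)
    u = np.zeros(m)
    for _ in range(n_sweeps):
        for i in range(m):
            # Gauss-Seidel-style update of one multiplier, relaxed by omega,
            # then projected back onto the box [0, C].
            grad_i = Q[i] @ u - 1.0
            u[i] = np.clip(u[i] - omega * grad_i / Q[i, i], 0.0, C)
    return u

def predict(X_train, y, u, X_test, gamma=1.0):
    # Nonlinear separating surface: sign of the kernel expansion.
    return np.sign((rbf_kernel(X_test, X_train, gamma) + 1.0) @ (u * y))
```

For example, on the XOR points, which no linear surface can separate, the nonlinear (Gaussian-kernel) classifier trained this way labels all four points correctly; this illustrates the gap between linear and nonlinear separating surfaces that the paper quantifies on larger datasets.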


Keywords: Support Vector Machine · Linear Complementarity Problem · Linear Kernel · Linear Programming Formulation · Convex Quadratic Program




  1. [1] K. P. Bennett, D. Hui, and L. Auslender. On support vector decision trees for database marketing. Department of Mathematical Sciences Math Report No. 98–100, Rensselaer Polytechnic Institute, Troy, NY 12180, March 1998.
  2. [2] B. E. Boser, I. M. Guyon, and V. N. Vapnik. A training algorithm for optimal margin classifiers. In D. Haussler, editor, Proceedings of the 5th Annual ACM Workshop on Computational Learning Theory, pages 144–152, Pittsburgh, PA, July 1992. ACM Press.
  3. [3] P. S. Bradley and O. L. Mangasarian. Feature selection via concave minimization and support vector machines. In J. Shavlik, editor, Machine Learning Proceedings of the Fifteenth International Conference (ICML ’98), pages 82–90, San Francisco, California, 1998. Morgan Kaufmann.
  4. [4] E. J. Bredensteiner. Optimization Methods in Data Mining and Machine Learning. PhD thesis, Department of Mathematical Sciences, Rensselaer Polytechnic Institute, Troy, NY, 1997.
  5. [5] E. J. Bredensteiner and K. P. Bennett. Feature minimization within decision trees. Computational Optimization and Applications, 10:111–126, 1998.
  6. [6] C. J. C. Burges. A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2(2):121–167, 1998.
  7. [7] V. Cherkassky and F. Mulier. Learning from Data — Concepts, Theory and Methods. John Wiley & Sons, New York, 1998.
  8. [8] G. B. Dantzig. Linear Programming and Extensions. Princeton University Press, Princeton, New Jersey, 1963.
  9. [9] R. De Leone and O. L. Mangasarian. Serial and parallel solution of large scale linear programs by augmented Lagrangian successive overrelaxation. In A. Kurzhanski, K. Neumann, and D. Pallaschke, editors, Optimization, Parallel Processing and Applications, pages 103–124, Berlin, 1988. Springer-Verlag. Lecture Notes in Economics and Mathematical Systems 304.
  10. [10] R. De Leone, O. L. Mangasarian, and T.-H. Shiau. Multi-sweep asynchronous parallel successive overrelaxation for the nonsymmetric linear complementarity problem. Annals of Operations Research, 22:43–54, 1990.
  11. [11] R. De Leone and M. A. Tork Roth. Massively parallel solution of quadratic programs via successive overrelaxation. Concurrency: Practice and Experience, 5:623–634, 1993.
  12. [12] M. C. Ferris and O. L. Mangasarian. Parallel constraint distribution. SIAM Journal on Optimization, 1(4):487–500, 1991.
  13. [13] T.-T. Frieß. Support vector neural networks: The kernel adatron with bias and soft margin. Technical report, Department of Automatic Control and Systems Engineering, University of Sheffield, Sheffield, England, 1998.
  14. [14] T.-T. Frieß, N. Cristianini, and C. Campbell. The kernel-adatron algorithm: A fast and simple learning procedure for support vector machines. In J. Shavlik, editor, Machine Learning Proceedings of the Fifteenth International Conference (ICML ’98), pages 188–196, San Francisco, 1998. Morgan Kaufmann.
  15. [15] Tin Kam Ho and Eugene M. Kleinberg. Building projectable classifiers of arbitrary complexity. In Proceedings of the 13th International Conference on Pattern Recognition, pages 880–885, Vienna, Austria, 1996.
  16. [16] L. Kaufman. Solving the quadratic programming problem arising in support vector classification. In B. Schölkopf, C. J. C. Burges, and A. J. Smola, editors, Advances in Kernel Methods — Support Vector Learning, pages 147–167. MIT Press, 1999.
  17. [17] Z.-Q. Luo and P. Tseng. Error bounds and convergence analysis of feasible descent methods: A general approach. Annals of Operations Research, 46:157–178, 1993.
  18. [18] O. L. Mangasarian. Nonlinear Programming. McGraw-Hill, New York, 1969. Reprint: SIAM Classics in Applied Mathematics 10, Philadelphia, 1994.
  19. [19] O. L. Mangasarian. Solution of symmetric linear complementarity problems by iterative methods. Journal of Optimization Theory and Applications, 22(4):465–485, August 1977.
  20. [20] O. L. Mangasarian. On the convergence of iterates of an inexact matrix splitting algorithm for the symmetric monotone linear complementarity problem. SIAM Journal on Optimization, 1:114–122, 1991.
  21. [21] O. L. Mangasarian. Generalized support vector machines. In A. Smola, P. Bartlett, B. Schölkopf, and D. Schuurmans, editors, Advances in Large Margin Classifiers, pages 135–146, Cambridge, MA, 2000. MIT Press.
  22. [22] O. L. Mangasarian and D. R. Musicant. Successive overrelaxation for support vector machines. IEEE Transactions on Neural Networks, 10:1032–1037, 1999.
  23. [23] Matlab. User’s Guide. The MathWorks, Inc., Natick, MA 01760, 1992.
  24. [24] Matlab. Application Program Interface Guide. The MathWorks, Inc., Natick, MA 01760, 1997.
  25. [25] P. M. Murphy and D. W. Aha. UCI repository of machine learning databases, 1992.
  26. [26] B. A. Murtagh and M. A. Saunders. MINOS 5.0 user’s guide. Technical Report SOL 83.20, Stanford University, December 1983. MINOS 5.4 Release Notes, December 1992.
  27. [27] B. T. Polyak. Introduction to Optimization. Optimization Software, Inc., Publications Division, New York, 1987.
  28. [28] V. N. Vapnik. The Nature of Statistical Learning Theory. Springer, New York, 1995.

Copyright information

© Springer Science+Business Media Dordrecht 2001

Authors and Affiliations

  • O. L. Mangasarian (1)
  • David R. Musicant (1)
  1. Computer Sciences Department, University of Wisconsin, Madison, USA
