Large-margin Distribution Machine-based regression

  • Reshma Rastogi
  • Pritam Anand
  • Suresh Chandra
Original Article


Abstract

This paper presents an efficient and robust Large-margin Distribution Machine formulation for regression. The proposed model, termed the 'Large-margin Distribution Machine-based Regression' (LDMR) model, is in the spirit of the Large-margin Distribution Machine (LDM) classification model (Zhang and Zhou, in: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining, ACM, 2014). The LDM model optimizes the margin distribution instead of only the single-point minimum margin, as is done in the traditional SVM. The optimization problem of the LDMR model is derived mathematically from that of the LDM model using an interesting result of Bi and Bennett (Neurocomputing 55(1):79–108, 2003). The resulting LDMR formulation minimizes the \(\epsilon\)-insensitive loss function and the quadratic loss function simultaneously. Further, the successive over-relaxation technique (Mangasarian and Musicant, IEEE Trans Neural Netw 10(5):1032–1037, 1999) is applied to speed up the training of the proposed LDMR model. Experimental results on artificial datasets, UCI datasets and financial time-series datasets show that the proposed LDMR model has better generalization ability than other existing models and is less sensitive to the presence of outliers.
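To make the loss structure concrete, the following is a minimal illustrative sketch of a combined \(\epsilon\)-insensitive plus quadratic loss of the kind the abstract describes. It is not the paper's exact LDMR objective (which also involves margin-distribution terms and regularization); the trade-off weight `lam` is a hypothetical parameter introduced here for illustration only.

```python
import numpy as np

def epsilon_insensitive_loss(residuals, eps):
    """Standard SVR epsilon-insensitive loss: zero inside the eps-tube,
    linear outside it."""
    return np.maximum(np.abs(residuals) - eps, 0.0)

def quadratic_loss(residuals):
    """Quadratic (squared) loss on the residuals."""
    return residuals ** 2

def combined_loss(y_true, y_pred, eps=0.1, lam=1.0):
    """Illustrative combined objective in the spirit of LDMR: a weighted
    sum of the eps-insensitive and quadratic losses over all residuals.
    `lam` is a hypothetical trade-off weight, not a parameter from the
    paper; the actual LDMR formulation differs."""
    r = y_true - y_pred
    return float(np.sum(epsilon_insensitive_loss(r, eps)
                        + lam * quadratic_loss(r)))
```

The quadratic term penalizes every residual (tying the fit to the whole residual distribution), while the \(\epsilon\)-insensitive term ignores small deviations inside the tube, which is the intuition behind combining the two.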


Keywords: Support vector machine · Regression · Large-margin Distribution Machine · \(\epsilon\)-insensitive loss · Quadratic loss · Successive over-relaxation



Acknowledgements

We would like to thank the learned referees for their valuable comments and suggestions, which have substantially improved the content and presentation of the manuscript. We would also like to acknowledge the Ministry of Electronics and Information Technology, Government of India, as this work has been funded by it under the Visvesvaraya Ph.D. Scheme for Electronics and IT, Order No. Phd-MLA/4(42)/2015-16.

Compliance with ethical standards

Conflict of Interest

The authors declare that they have no conflict of interest.


References

  1. Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20(3):273–297
  2. Burges CJC (1998) A tutorial on support vector machines for pattern recognition. Data Min Knowl Discov 2(2):121–167
  3. Cherkassky V, Mulier F (2007) Learning from data: concepts, theory and methods. Wiley, New York
  4. Vapnik V (1998) Statistical learning theory, vol 1. Wiley, New York
  5. Osuna E, Freund R, Girosi F (1997) Training support vector machines: an application to face detection. In: Proceedings of IEEE computer vision and pattern recognition, San Juan, Puerto Rico, pp 130–136
  6. Joachims T (1998) Text categorization with support vector machines: learning with many relevant features. In: European conference on machine learning. Springer, Berlin
  7. Schölkopf B, Tsuda K, Vert JP (2004) Kernel methods in computational biology. MIT Press, Cambridge
  8. Lal TN, Schröder M, Hinterberger T, Weston J, Bogdan M, Birbaumer N, Schölkopf B (2004) Support vector channel selection in BCI. IEEE Trans Biomed Eng 51(6):1003–1010
  9. Bradley P, Mangasarian OL (2000) Massive data discrimination via linear support vector machines. Optim Methods Softw 13(1):1–10
  10. Freund Y, Schapire RE (1995) A decision-theoretic generalization of on-line learning and an application to boosting. In: Proceedings of the 2nd European conference on computational learning theory, Barcelona, Spain, pp 23–37
  11. Zhou ZH (2012) Ensemble methods: foundations and algorithms. CRC Press, Boca Raton
  12. Breiman L (1999) Prediction games and arcing classifiers. Neural Comput 11(7):1493–1517
  13. Schapire RE, Freund Y, Bartlett PL, Lee WS (1998) Boosting the margin: a new explanation for the effectiveness of voting methods. Ann Stat 26(5):1651–1686
  14. Reyzin L, Schapire RE (2006) How boosting the margin can also boost classifier complexity. In: Proceedings of the 23rd international conference on machine learning, Pittsburgh, PA, pp 753–760
  15. Wang L, Sugiyama M, Yang C, Zhou ZH, Feng J (2008) On the margin explanation of boosting algorithms. In: Proceedings of the 21st annual conference on learning theory, Helsinki, Finland, pp 479–490
  16. Gao W, Zhou ZH (2013) On the doubt about margin explanation of boosting. Artif Intell 199–200:22–44
  17. Zhang T, Zhou ZH (2014) Large margin distribution machine. In: Proceedings of the 20th ACM SIGKDD international conference on knowledge discovery and data mining. ACM
  18. Vapnik V, Golowich SE, Smola AJ (1997) Support vector method for function approximation, regression estimation and signal processing. In: Mozer M, Jordan M, Petsche T (eds) Advances in neural information processing systems. MIT Press, Cambridge, pp 281–287
  19. Drucker H, Burges CJC, Kaufman L, Smola AJ, Vapnik V (1997) Support vector regression machines. In: Mozer MC, Jordan MI, Petsche T (eds) Advances in neural information processing systems. MIT Press, Cambridge, pp 155–161
  20. Bi J, Bennett KP (2003) A geometric approach to support vector regression. Neurocomputing 55(1):79–108
  21. Suykens JAK, Lukas L, Van Dooren P, De Moor B, Vandewalle J (1999) Least squares support vector machine classifiers: a large scale algorithm. In: Proceedings of the European conference on circuit theory and design, pp 839–842
  22. Suykens JAK, Vandewalle J (1999) Least squares support vector machine classifiers. Neural Process Lett 9(3):293–300
  23. Shao YH, Zhang C, Yang Z, Deng N (2013) An \(\epsilon\)-twin support vector machine for regression. Neural Comput Appl 23(1):175–185
  24. Tanveer M, Mangal M, Ahmad I, Shao YH (2016) One norm linear programming support vector regression. Neurocomputing 173:1508–1518
  25. Mangasarian OL, Musicant DR (1999) Successive overrelaxation for support vector machines. IEEE Trans Neural Netw 10(5):1032–1037
  26. Luo ZQ, Tseng P (1993) Error bounds and convergence analysis of feasible descent methods: a general approach. Ann Oper Res 46(1):157–178
  27. Chang CC, Lin CJ (2011) LIBSVM: a library for support vector machines. ACM Trans Intell Syst Technol 2(3):27
  28. Blake CL, Merz CJ (1998) UCI repository of machine learning databases [*mlearn/MLRepository.html]
  29. Huang X, Shi L, Suykens JAK (2014) Support vector machine classifier with pinball loss. IEEE Trans Pattern Anal Mach Intell 36(5):984–997
  30. Hsu CW, Lin CJ (2002) A comparison of methods for multiclass support vector machines. IEEE Trans Neural Netw 13:415–425
  31. Duda RO, Hart PE, Stork DG (2001) Pattern classification, 2nd edn. Wiley, Hoboken
  32. Kohavi R (1995) A study of cross-validation and bootstrap for accuracy estimation and model selection. In: IJCAI, vol 14, no 2

Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2018

Authors and Affiliations

  1. Faculty of Mathematics and Computer Science, South Asian University, New Delhi, India
  2. Department of Mathematics, Indian Institute of Technology Delhi, New Delhi, India
