
A Generalized Quadratic Loss for SVM and Deep Neural Networks

  • Conference paper
  • First Online:
Machine Learning, Optimization, and Data Science (LOD 2020)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 12565)


  • The original version of this chapter was revised: the wrong RTS values in Table 3 and the author's e-mail address have been corrected. Additionally, a link was added to the abstract. The correction to this chapter is available at https://doi.org/10.1007/978-3-030-64583-0_64

Abstract

We consider some supervised binary classification tasks and a regression task, where SVM and Deep Learning currently exhibit the best generalization performance. We extend the work [3] on a generalized quadratic loss for learning problems that examines pattern correlations in order to concentrate the learning problem in input-space regions where patterns are more densely distributed. From a shallow-methods point of view (e.g. SVM), since the mathematical derivation of problem (9) in [3] is incorrect, we restart from problem (8) in [3] and try to solve it with a procedure that iterates over the dual variables until the primal and dual objective functions converge. In addition, we propose another algorithm that tries to solve the classification problem directly from the primal problem formulation. We also make use of Multiple Kernel Learning to improve generalization performance. Moreover, we introduce for the first time a custom loss that takes pattern correlation into consideration for a shallow and a Deep Learning task. We propose some pattern selection criteria and report the results on 4 UCI data-sets for the SVM method. We also report the results on a larger binary classification data-set based on Twitter, again drawn from UCI, combined with shallow Neural Networks, with and without the generalized quadratic loss. Finally, we test our loss with a Deep Neural Network on a larger regression task taken from UCI. We compare the results of our optimizers with the well-known solver \(\text {SVM}^{\text {light}}\) and with Keras multi-layer Neural Networks with standard losses and with a parameterized generalized quadratic loss, and we obtain comparable results (Code is available at: https://osf.io/fbzsc/wiki/home/).
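The paper's precise loss is not reproduced on this page, but, following [3], a generalized quadratic loss replaces the usual sum of independent per-pattern losses with a quadratic form \(\xi^\top S \xi\), in which a similarity matrix S couples the error of each pattern with the errors of correlated patterns, so that densely populated input-space regions receive more weight. Below is a minimal sketch of that idea for the Keras setting mentioned in the abstract; the RBF similarity, the bandwidth sigma, and all function names are illustrative assumptions rather than the paper's implementation.

import tensorflow as tf

def similarity_matrix(x, sigma=1.0):
    """RBF similarity S_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)) over one batch x (an assumed choice of S)."""
    sq = tf.reduce_sum(tf.square(x), axis=1, keepdims=True)
    d2 = sq - 2.0 * tf.matmul(x, x, transpose_b=True) + tf.transpose(sq)
    return tf.exp(-d2 / (2.0 * sigma ** 2))

def generalized_quadratic_loss(y_true, y_pred, S):
    """Quadratic form e^T S e over the batch errors e, normalized by the batch size."""
    e = tf.reshape(tf.cast(y_true, y_pred.dtype), (-1, 1)) - tf.reshape(y_pred, (-1, 1))
    n = tf.cast(tf.shape(e)[0], e.dtype)
    return tf.squeeze(tf.matmul(e, tf.matmul(S, e), transpose_a=True)) / n

# Sketch of a custom training step, where (x, y) is one mini-batch and `model`
# and `optimizer` are ordinary Keras objects:
# with tf.GradientTape() as tape:
#     y_pred = model(x, training=True)
#     loss = generalized_quadratic_loss(y, y_pred, similarity_matrix(x, sigma=1.0))
# grads = tape.gradient(loss, model.trainable_variables)
# optimizer.apply_gradients(zip(grads, model.trainable_variables))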

Supported by Università degli Studi “Ca’ Foscari” di Venezia.


Change history

  • 30 March 2021

    The original version of chapter 2 was inadvertently published with wrong RTS values in Table 3: “Results comparison with RTS, S, and SVMlight with standard linear loss with a 10-fold cross validation procedure.”

    The RTS values were corrected by replacing the wrong values with the appropriate ones.

    A footnote reading “Code is available at: https://osf.io/fbzsc/” was added to the last sentence of the abstract.


Notes

  1. https://archive.ics.uci.edu/ml/datasets.php.

  2. This software runs on an Intel(R) Core(TM) i7-6700 CPU @ 3.40 GHz with 32,084 MB of RAM, 32,084 MB of swap space, and an SSD of 512 GB.

References

  1. Vapnik, V.N.: Statistical Learning Theory. Wiley, New York (1998)

  2. Platt, J.: Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines. Advances in Kernel Methods - Support Vector Learning (1998)

  3. Portera, F., Sperduti, A.: A generalized quadratic loss for Support Vector Machines. In: ECAI 2004: Proceedings of the 16th European Conference on Artificial Intelligence, pp. 628–632 (2004)

  4. Aiolli, F., Sperduti, A.: An efficient SMO-like algorithm for multiclass SVM. In: Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing (2002)

  5. Schölkopf, B., Herbrich, R., Smola, A.J.: A generalized representer theorem. In: Helmbold, D., Williamson, B. (eds.) COLT 2001. LNCS (LNAI), vol. 2111, pp. 416–426. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-44581-1_27

  6. Joachims, T.: Learning to Classify Text Using Support Vector Machines (2002)

  7. Sze, V., Chen, Y.-H., Yang, T., Emer, J.S.: Efficient processing of deep neural networks: a tutorial and survey. Proc. IEEE 105(12) (2017)

  8. Bertin-Mahieux, T., Ellis, D.P.W., Whitman, B., Lamere, P.: The Million Song Dataset. In: Proceedings of the 12th International Society for Music Information Retrieval Conference (ISMIR 2011) (2011)

  9. Lauriola, I., Gallicchio, C., Aiolli, F.: Enhancing deep neural networks via multiple kernel learning. Pattern Recogn. 101, 107194 (2020)

  10. Qiu, S., Lane, T.: A framework for multiple kernel support vector regression and its applications to siRNA efficacy prediction. IEEE/ACM Trans. Comput. Biol. Bioinf. 6(2), 190–199 (2009)

  11. Lanckriet, G.R.G., Cristianini, N., Bartlett, P., El Ghaoui, L., Jordan, M.I.: Learning the kernel matrix with semidefinite programming. In: Proceedings of the 19th International Conference on Machine Learning (2002)

  12. Courcoubetis, C., Weber, R.: Lagrangian Methods for Constrained Optimization. Wiley (2003). ISBN 0-470-85130-9

  13. Rodriguez, J.D., Perez, A., Lozano, J.A.: Sensitivity analysis of k-fold cross validation in prediction error estimation. IEEE Trans. Pattern Anal. Mach. Intell. 32(3), 569–575 (2010)

  14. Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20, 273–297 (1995)

  15. Fan, R.-E., Chen, P.-H., Lin, C.-J.: Working set selection using second order information for training SVM. J. Mach. Learn. Res. 6, 1889–1918 (2005)

  16. Severyn, A., Moschitti, A.: Twitter sentiment analysis with deep convolutional neural networks. In: SIGIR 2015: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 959–962 (2015)

  17. Cliche, M.: BB_twtr at SemEval-2017 task 4: twitter sentiment analysis with CNNs and LSTMs. In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval 2017), pp. 573–580 (2017)

  18. Hernandez-Lobato, J.M., Adams, R.P.: Probabilistic backpropagation for scalable learning of Bayesian neural networks. In: ICML 2015: Proceedings of the 32nd International Conference on Machine Learning - Volume 37, pp. 1861–1869 (2015)

  19. Lakshminarayanan, B., Pritzel, A., Blundell, C.: Simple and scalable predictive uncertainty estimation using deep ensembles. In: NIPS 2017: Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 6405–6416 (2017)


Acknowledgments

I would like to express my gratitude to Giovanna Zamara, Fabrizio Romano, Fabio Aiolli, Alessio Micheli, Ralf Herbrich, Alex Smola, Alessandro Sperduti for their insightful suggestions.

Author information


Corresponding author

Correspondence to Filippo Portera.


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper


Cite this paper

Portera, F. (2020). A Generalized Quadratic Loss for SVM and Deep Neural Networks. In: Nicosia, G., et al. Machine Learning, Optimization, and Data Science. LOD 2020. Lecture Notes in Computer Science, vol. 12565. Springer, Cham. https://doi.org/10.1007/978-3-030-64583-0_2


  • DOI: https://doi.org/10.1007/978-3-030-64583-0_2

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-64582-3

  • Online ISBN: 978-3-030-64583-0

  • eBook Packages: Computer Science, Computer Science (R0)
