Abstract
We consider several supervised binary classification tasks and a regression task, for which SVM and Deep Learning currently exhibit the best generalization performances. We extend the work [3] on a generalized quadratic loss for learning problems that exploits pattern correlations in order to concentrate the learning problem in input-space regions where patterns are more densely distributed. From a shallow-methods point of view (e.g. SVM), since the mathematical derivation of problem (9) in [3] is incorrect, we restart from problem (8) in [3] and attempt to solve it with a procedure that iterates over the dual variables until the primal and dual objective functions converge. In addition, we propose another algorithm that tries to solve the classification problem directly from the primal problem formulation. We also make use of Multiple Kernel Learning to improve generalization performance. Moreover, we introduce for the first time a custom loss that takes pattern correlation into consideration, for both a shallow and a Deep Learning task. We propose some pattern selection criteria and report results on 4 UCI data sets for the SVM method. We also report results on a larger binary classification data set based on Twitter, again drawn from UCI, combined with shallow-learning neural networks, with and without the generalized quadratic loss. Finally, we test our loss with a Deep Neural Network on a larger regression task taken from UCI. We compare the results of our optimizers with the well-known solver \(\text {SVM}^{\text {light}}\) and with Keras multi-layer neural networks with standard losses and with a parameterized generalized quadratic loss, and we obtain comparable results (code is available at: https://osf.io/fbzsc/wiki/home/).
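As a rough illustration of the idea, a generalized quadratic loss of the kind described in [3] can be written as \(e^\top S\, e\), where \(e\) is the vector of per-pattern errors and \(S\) is a similarity matrix over the input patterns, so that errors on nearby (correlated) patterns are penalized jointly. The sketch below is only an assumption of this general form, not the paper's exact formulation; the RBF similarity and the parameter `gamma` are illustrative choices.

```python
import numpy as np

def rbf_similarity(X, gamma=0.1):
    """Pairwise RBF similarity matrix S over the batch patterns.

    S[i, j] = exp(-gamma * ||x_i - x_j||^2); S is symmetric and
    positive semidefinite, so the quadratic form below is nonnegative.
    """
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def generalized_quadratic_loss(y_true, y_pred, S):
    """Quadratic form e^T S e on the error vector e = y_true - y_pred.

    With S = I this reduces to the ordinary sum-of-squares loss;
    off-diagonal entries of S couple the errors of similar patterns,
    concentrating the penalty in densely populated input regions.
    """
    e = y_true - y_pred
    return float(e @ S @ e)
```

With `S = np.eye(n)` the loss coincides with the plain squared error, which makes the standard quadratic loss a special case of this family.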
Supported by Università degli Studi “Ca’ Foscari” di Venezia.
Change history
30 March 2021
The original version of chapter 2 was inadvertently published with wrong RTS values in Table 3: “Results comparison with RTS, S, and SVMlight with standard linear loss with a 10-fold cross validation procedure.”
The RTS values were corrected by replacing the wrong values with the appropriate ones.
The footnote “Code is available at: https://osf.io/fbzsc/” has been added to the last sentence of the abstract.
Notes
- 1.
Code is available at: https://osf.io/fbzsc/.
- 2.
This software runs on an Intel(R) Core(TM) i7-6700 CPU @ 3.40 GHz with 32,084 MB of RAM, 32,084 MB of swap space, and a 512 GB SSD.
References
Vapnik, V.N.: Statistical Learning Theory. Wiley, New York (1998)
Platt, J.: Sequential minimal optimization: a fast algorithm for training support vector machines. In: Advances in Kernel Methods - Support Vector Learning (1998)
Portera, F., Sperduti, A.: A generalized quadratic loss for Support Vector Machines. In: ECAI 2004 Proceedings of the 16th European Conference on Artificial Intelligence, pp. 628–632 (2004)
Aiolli, F., Sperduti, A.: An efficient SMO-like algorithm for multiclass SVM. In: Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing (2002)
Schölkopf, B., Herbrich, R., Smola, A.J.: A generalized representer theorem. In: Helmbold, D., Williamson, B. (eds.) COLT 2001. LNCS (LNAI), vol. 2111, pp. 416–426. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-44581-1_27
Joachims, T.: Learning to Classify Text Using Support Vector Machines (2002)
Sze, V., Chen, Y.-H., Yang, T., Emer, J.S.: Efficient processing of deep neural networks: a tutorial and survey. Proc. IEEE 105(12) (2017)
Bertin-Mahieux, T., Ellis, D.P.W., Whitman, B., Lamere, P.: The Million Song Dataset. In: Proceedings of the 12th International Society for Music Information Retrieval Conference (ISMIR 2011) (2011)
Lauriola, I., Gallicchio, C., Aiolli, F.: Enhancing deep neural networks via multiple kernel learning. Pattern Recogn. 101, 107194 (2020)
Qiu, S., Lane, T.: A framework for multiple kernel support vector regression and its applications to siRNA efficacy prediction. IEEE/ACM Trans. Comput. Biol. Bioinf. 6(2), 190–199 (2009)
Lanckriet, G.R.G., Cristianini, N., Bartlett, P., El Ghaoui, L., Jordan, M.I.: Learning the kernel matrix with semidefinite programming. In: Proceedings of the 19th International Conference on Machine Learning (2002)
Courcoubetis, C., Weber, R.: Lagrangian Methods for Constrained Optimization. Wiley (2003). ISBN 0-470-85130-9
Rodriguez, J.D., Perez, A., Lozano, J.A.: Sensitivity analysis of k-Fold cross validation in prediction error estimation. IEEE Trans. Pattern Anal. Mach. Intell. 32(3), 569–575 (2010)
Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20, 273–297 (1995)
Fan, R.-E., Chen, P.-H., Lin, C.-J.: Working set selection using second order information for training SVM. J. Mach. Learn. Res. 6, 1889–1918 (2005)
Severyn, A., Moschitti, A.: Twitter sentiment analysis with deep convolutional neural networks. In: SIGIR 2015: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 959–962 (2015)
Cliche, M.: BB\_twtr at SemEval-2017 task 4: twitter sentiment analysis with CNNs and LSTMs. In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval 2017), pp. 573–580 (2017)
Hernandez-Lobato, J.M., Adams, R.P.: Probabilistic backpropagation for scalable learning of bayesian neural networks. In: ICML 2015: Proceedings of the 32nd International Conference on International Conference on Machine Learning - Volume 37, pp. 1861–1869 (2015)
Lakshminarayanan, B., Pritzel, A., Blundell, C.: Simple and scalable predictive uncertainty estimation using deep ensembles. In: NIPS 2017: Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 6405–6416 (2017)
Acknowledgments
I would like to express my gratitude to Giovanna Zamara, Fabrizio Romano, Fabio Aiolli, Alessio Micheli, Ralf Herbrich, Alex Smola, Alessandro Sperduti for their insightful suggestions.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Portera, F. (2020). A Generalized Quadratic Loss for SVM and Deep Neural Networks. In: Nicosia, G., et al. (eds.) Machine Learning, Optimization, and Data Science. LOD 2020. Lecture Notes in Computer Science, vol. 12565. Springer, Cham. https://doi.org/10.1007/978-3-030-64583-0_2
DOI: https://doi.org/10.1007/978-3-030-64583-0_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-64582-3
Online ISBN: 978-3-030-64583-0
eBook Packages: Computer Science (R0)