Abstract
We consider several supervised binary classification tasks and a regression task, for which SVM and Deep Learning currently exhibit the best generalization performances. We extend the work [3] on a generalized quadratic loss for learning problems that exploits pattern correlations in order to concentrate the learning problem in input-space regions where patterns are more densely distributed. From a shallow-methods point of view (e.g. SVM), since the mathematical derivation of problem (9) in [3] is incorrect, we restart from problem (8) in [3] and attempt to solve it with a procedure that iterates over the dual variables until the primal and dual objective functions converge. In addition, we propose another algorithm that tries to solve the classification problem directly from the primal problem formulation. We also make use of Multiple Kernel Learning to improve generalization performance. Moreover, we introduce for the first time a custom loss that takes pattern correlation into consideration, for both a shallow and a Deep Learning task. We propose some pattern selection criteria and report results on 4 UCI data sets for the SVM method. We also report results on a larger binary classification data set based on Twitter, again drawn from UCI, combined with shallow-learning neural networks, with and without the generalized quadratic loss. Finally, we test our loss with a Deep Neural Network on a larger regression task taken from UCI. We compare the results of our optimizers with the well-known solver \(\text {SVM}^{\text {light}}\) and with Keras multi-layer neural networks with standard losses and with a parameterized generalized quadratic loss, and we obtain comparable results (code is available at: https://osf.io/fbzsc/wiki/home/).
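As a rough illustration of the idea, a generalized quadratic loss of the kind described in [3] can be written as \(e^\top S\, e\), where \(e\) is the vector of per-pattern errors and \(S\) is a similarity matrix over the input patterns, so that errors on nearby (correlated) patterns are penalized jointly. The sketch below is only an assumption of this general form, not the paper's exact formulation; the RBF similarity and the parameter `gamma` are illustrative choices.

```python
import numpy as np

def rbf_similarity(X, gamma=0.1):
    """Pairwise RBF similarity matrix S over the batch patterns.

    S[i, j] = exp(-gamma * ||x_i - x_j||^2); S is symmetric and
    positive semidefinite, so the quadratic form below is nonnegative.
    """
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def generalized_quadratic_loss(y_true, y_pred, S):
    """Quadratic form e^T S e on the error vector e = y_true - y_pred.

    With S = I this reduces to the ordinary sum-of-squares loss;
    off-diagonal entries of S couple the errors of similar patterns,
    concentrating the penalty in densely populated input regions.
    """
    e = y_true - y_pred
    return float(e @ S @ e)
```

With `S = np.eye(n)` the loss coincides with the plain squared error, which makes the standard quadratic loss a special case of this family.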
Supported by Università degli Studi “Ca’ Foscari” di Venezia.
Change history
30 March 2021
The original version of chapter 2 was inadvertently published with wrong RTS values in Table 3: “Results comparison with RTS, S, and SVMlight with standard linear loss with a 10-fold cross validation procedure.”
The RTS values were corrected by replacing the wrong values with the appropriate ones.
The footnote “Code is available at: https://osf.io/fbzsc/” has been added to the last sentence of the abstract.
Notes
- 1.
Code is available at: https://osf.io/fbzsc/.
- 2.
This software runs on an Intel(R) Core(TM) i7-6700 CPU @ 3.40 GHz with 32,084 MB of RAM, 32,084 MB of swap space, and a 512 GB SSD.
References
Vapnik, V.N.: Statistical Learning Theory. Wiley, New York (1998)
Platt, J.: Sequential minimal optimization: a fast algorithm for training support vector machines. In: Advances in Kernel Methods - Support Vector Learning (1998)
Portera, F., Sperduti, A.: A generalized quadratic loss for Support Vector Machines. In: ECAI 2004 Proceedings of the 16th European Conference on Artificial Intelligence, pp. 628–632 (2004)
Aiolli, F., Sperduti, A.: An efficient SMO-like algorithm for multiclass SVM. In: Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing (2002)
Schölkopf, B., Herbrich, R., Smola, A.J.: A generalized representer theorem. In: Helmbold, D., Williamson, B. (eds.) COLT 2001. LNCS (LNAI), vol. 2111, pp. 416–426. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-44581-1_27
Joachims, T.: Learning to Classify Text Using Support Vector Machines (2002)
Sze, V., Chen, Y.-H., Yang, T., Emer, J.S.: Efficient processing of deep neural networks: a tutorial and survey. Proc. IEEE 105(12) (2017)
Bertin-Mahieux, T., Ellis, D.P.W., Whitman, B., Lamere, P.: The Million Song Dataset. In: Proceedings of the 12th International Society for Music Information Retrieval Conference (ISMIR 2011) (2011)
Lauriola, I., Gallicchio, C., Aiolli, F.: Enhancing deep neural networks via multiple kernel learning. Pattern Recogn. 101, 107194 (2020)
Qiu, S., Lane, T.: A framework for multiple kernel support vector regression and its applications to siRNA efficacy prediction. IEEE/ACM Trans. Comput. Biol. Bioinf. 6(2), 190–199 (2009)
Lanckriet, G.R.G., Cristianini, N., Bartlett, P., El Ghaoui, L., Jordan, M.I.: Learning the kernel matrix with semidefinite programming. In: Proceedings of the 19th International Conference on Machine Learning (2002)
Courcoubetis, C., Weber, R.: Lagrangian Methods for Constrained Optimization. Wiley (2003). ISBN 0-470-85130-9
Rodriguez, J.D., Perez, A., Lozano, J.A.: Sensitivity analysis of k-Fold cross validation in prediction error estimation. IEEE Trans. Pattern Anal. Mach. Intell. 32(3), 569–575 (2010)
Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20, 273–297 (1995)
Fan, R.-E., Chen, P.-H., Lin, C.-J.: Working set selection using second order information for training SVM. J. Mach. Learn. Res. 6, 1889–1918 (2005)
Severyn, A., Moschitti, A.: Twitter sentiment analysis with deep convolutional neural networks. In: SIGIR 2015: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 959–962 (2015)
Cliche, M.: BB\_twtr at SemEval-2017 task 4: twitter sentiment analysis with CNNs and LSTMs. In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval 2017), pp. 573–580 (2017)
Hernandez-Lobato, J.M., Adams, R.P.: Probabilistic backpropagation for scalable learning of bayesian neural networks. In: ICML 2015: Proceedings of the 32nd International Conference on International Conference on Machine Learning - Volume 37, pp. 1861–1869 (2015)
Lakshminarayanan, B., Pritzel, A., Blundell, C.: Simple and scalable predictive uncertainty estimation using deep ensembles. In: NIPS 2017: Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 6405–6416 (2017)
Acknowledgments
I would like to express my gratitude to Giovanna Zamara, Fabrizio Romano, Fabio Aiolli, Alessio Micheli, Ralf Herbrich, Alex Smola, Alessandro Sperduti for their insightful suggestions.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Portera, F. (2020). A Generalized Quadratic Loss for SVM and Deep Neural Networks. In: Nicosia, G., et al. (eds.) Machine Learning, Optimization, and Data Science. LOD 2020. Lecture Notes in Computer Science, vol. 12565. Springer, Cham. https://doi.org/10.1007/978-3-030-64583-0_2
DOI: https://doi.org/10.1007/978-3-030-64583-0_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-64582-3
Online ISBN: 978-3-030-64583-0
eBook Packages: Computer Science (R0)