Abstract
We propose to formulate the training of neural networks with side optimization goals, such as obtaining structured weight matrices, as a lexicographic optimization problem. The lexicographic order can be maintained during training by optimizing the side optimization goal exclusively in the null space of the batch activations. We call the resulting training method Safe Regularization, because the side optimization goal can be safely integrated into training with limited influence on the main optimization goal. Moreover, this makes training more robust to the choice of regularization hyperparameters. We validate our training method on multiple real-world regression data sets with the side optimization goal of obtaining sparse weight matrices.
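The core mechanism described above can be illustrated with a minimal NumPy sketch (not the authors' implementation; the function names and the fixed step size are illustrative). For one layer with batch activations X and weights W, a step on a sparsity (L1) side objective is projected onto the null space of X, so the layer's outputs X @ W are unchanged to first order:

```python
import numpy as np

def null_space_projector(A, tol=1e-10):
    """Orthogonal projector onto {w : A @ w = 0}, via SVD."""
    _, s, Vt = np.linalg.svd(A, full_matrices=True)
    rank = int(np.sum(s > tol))
    V_null = Vt[rank:].T          # orthonormal basis of the null space of A
    return V_null @ V_null.T      # projector onto that null space

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 16))      # batch activations: 8 samples, 16 features
W = rng.normal(size=(16, 4))      # weight matrix of one layer

P = null_space_projector(X)

# Gradient step of an L1 side objective (promoting sparse weights),
# restricted to directions that leave the batch outputs unchanged.
g_side = np.sign(W)
W_new = W - 0.1 * (P @ g_side)

# The outputs on this batch are (numerically) unchanged.
print(np.allclose(X @ W_new, X @ W))  # → True
```

Because X @ P = 0, the side-objective update cannot alter the loss on the current batch, which is what lets the side goal be pursued "safely" below the main goal in the lexicographic order.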
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Kissel, M., Gottwald, M., Diepold, K. (2020). Neural Network Training with Safe Regularization in the Null Space of Batch Activations. In: Farkaš, I., Masulli, P., Wermter, S. (eds) Artificial Neural Networks and Machine Learning – ICANN 2020. Lecture Notes in Computer Science, vol 12397. Springer, Cham. https://doi.org/10.1007/978-3-030-61616-8_18
Print ISBN: 978-3-030-61615-1
Online ISBN: 978-3-030-61616-8