Abstract
Deep neural networks remain poorly understood mathematically, yet a substantial body of recent work analyzes and explains their success across a variety of pattern recognition tasks. We describe some of the mathematical techniques used to characterize neural networks, either in terms of the complexity of the assigned classification or regression task or in terms of the functions they learn, and relate these characterizations to architectural choices. We explain some of the measurable quantities that have been used to define the expressivity of neural networks, including homological complexity and curvature. We also describe neural networks from the viewpoint of scattering transforms and present some of the mathematical and intuitive justifications for this perspective. Finally, we present a technique for visualizing and analyzing neural networks based on the concept of Riemannian curvature.
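As a concrete illustration of one such expressivity measure (an aside, not part of the chapter itself), the following minimal NumPy sketch estimates the trajectory-length statistic of Raghu et al.: the length of a simple input curve (a circle) is tracked as it passes through successive randomly initialized ReLU layers, and its rapid growth with depth is one quantitative signature of expressivity. The widths, depth, weight scale, and number of sample points below are arbitrary illustrative choices, not values taken from the chapter.

import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def trajectory_length(points):
    # Sum of Euclidean distances between consecutive points on the trajectory.
    return np.sum(np.linalg.norm(np.diff(points, axis=0), axis=1))

# A circle of inputs embedded in the first two coordinates of the input space.
n_points, width, depth, sigma_w = 500, 100, 8, 2.0
theta = np.linspace(0.0, 2.0 * np.pi, n_points)
x = np.zeros((n_points, width))
x[:, 0], x[:, 1] = np.cos(theta), np.sin(theta)

rng = np.random.default_rng(0)
lengths = [trajectory_length(x)]
h = x
for layer in range(depth):
    # Random Gaussian weights, scaled by 1/sqrt(width) as in mean-field analyses.
    W = rng.normal(0.0, sigma_w / np.sqrt(width), size=(width, width))
    h = relu(h @ W)
    lengths.append(trajectory_length(h))

print("trajectory length per layer:", [f"{l:.1f}" for l in lengths])

Running the sketch typically shows the trajectory length increasing layer by layer, which is the qualitative behaviour that motivates using it as an expressivity measure.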
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Kaul, P., Lall, B. (2020). Theoretical Characterization of Deep Neural Networks. In: Pedrycz, W., Chen, S.M. (eds.) Deep Learning: Concepts and Architectures. Studies in Computational Intelligence, vol. 866. Springer, Cham. https://doi.org/10.1007/978-3-030-31756-0_2
DOI: https://doi.org/10.1007/978-3-030-31756-0_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-31755-3
Online ISBN: 978-3-030-31756-0
eBook Packages: Intelligent Technologies and Robotics (R0)