Theoretical Characterization of Deep Neural Networks

Part of the book series: Studies in Computational Intelligence (SCI, volume 866)

Abstract

Deep neural networks are poorly understood mathematically; however, a considerable body of recent work has focused on analyzing and understanding their success in a variety of pattern recognition tasks. We describe some of the mathematical techniques used to characterize neural networks in terms of the complexity of the classification or regression task assigned, or of the functions learned, and try to relate this to architecture choices for neural networks. We explain some of the measurable quantifiers that can be used to define the expressivity of a neural network, including homological complexity and curvature. We also describe neural networks from the viewpoint of scattering transforms and share some of the mathematical and intuitive justifications for them. Finally, we share a technique for visualizing and analyzing neural networks based on the concept of Riemann curvature.

Author information

Correspondence to Piyush Kaul.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Kaul, P., Lall, B. (2020). Theoretical Characterization of Deep Neural Networks. In: Pedrycz, W., Chen, SM. (eds) Deep Learning: Concepts and Architectures. Studies in Computational Intelligence, vol 866. Springer, Cham. https://doi.org/10.1007/978-3-030-31756-0_2
