An Introduction to Neural Networks

Abstract

Artificial neural networks are popular machine learning techniques that simulate the mechanism of learning in biological organisms. The human nervous system contains cells, which are referred to as neurons. The neurons are connected to one another with the use of axons and dendrites, and the connecting regions between axons and dendrites are referred to as synapses. These connections are illustrated in Figure 1.1(a). The strengths of synaptic connections often change in response to external stimuli. This change is how learning takes place in living organisms.
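
The biological analogy maps directly onto an artificial neuron: a vector of weights plays the role of the synaptic strengths, and learning amounts to adjusting those weights in response to training stimuli. The following is a minimal sketch, assuming NumPy, of a perceptron-style weight update in this spirit; the function name, learning rate, and toy data are illustrative assumptions, not code from the chapter.

    import numpy as np

    def perceptron_step(w, x, y, learning_rate=0.1):
        # w: "synaptic strengths", x: stimulus, y: desired response in {-1, +1}.
        prediction = 1.0 if np.dot(w, x) >= 0 else -1.0   # does the neuron fire?
        if prediction != y:                               # external feedback: wrong answer
            w = w + learning_rate * y * x                 # strengthen/weaken the connections
        return w

    # Two toy stimuli in two dimensions.
    w = np.zeros(2)
    for x, y in [(np.array([1.0, 0.5]), +1.0), (np.array([-0.5, 1.0]), -1.0)]:
        w = perceptron_step(w, x, y)
    print(w)   # weights after seeing both stimuli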

“Thou shalt not make a machine to counterfeit a human mind.”—Frank Herbert


Notes

  1. The ReLU shows asymmetric saturation (see the code sketch after these notes).

  2. Examples include Torch [572], Theano [573], and TensorFlow [574].

  3. Weight decay is generally used with other loss functions in single-layer models and in all multi-layer models with a large number of parameters (also illustrated in the sketch below).

  4. This is an overloading of the terminology used in convolutional neural networks. The meaning of the word “depth” is inferred from the context in which it is used.
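
Notes 1 and 3 can be made concrete in a few lines of code. The following is a minimal sketch, assuming NumPy; the function names relu and loss_with_weight_decay and the coefficient lam are illustrative choices, not code from the chapter.

    import numpy as np

    def relu(z):
        # Note 1: the ReLU saturates only on the negative side -- outputs are
        # pinned at 0 for z < 0 but grow without bound for z > 0.
        return np.maximum(0.0, z)

    def loss_with_weight_decay(base_loss, w, lam=1e-4):
        # Note 3: weight decay augments whatever base loss is being used with
        # an L2 penalty on the parameters; lam is an illustrative coefficient.
        return base_loss + lam * np.sum(w ** 2)

    print(relu(np.array([-2.0, -0.1, 0.0, 3.0])))          # [0. 0. 0. 3.]
    print(loss_with_weight_decay(0.5, np.array([1.0, -2.0])))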

Bibliography

  1. C. Aggarwal. Data classification: Algorithms and applications. CRC Press, 2014.
  2. C. Aggarwal. Data mining: The textbook. Springer, 2015.
  3. C. Aggarwal. Machine learning for text. Springer, 2018.
  4. Y. Bengio. Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2(1), pp. 1–127, 2009.
  5. Y. Bengio, A. Courville, and P. Vincent. Representation learning: A review and new perspectives. IEEE TPAMI, 35(8), pp. 1798–1828, 2013.
  6. Y. Bengio and O. Delalleau. On the expressive power of deep architectures. Algorithmic Learning Theory, pp. 18–36, 2011.
  7. J. Bergstra et al. Theano: A CPU and GPU math compiler in Python. Python in Science Conference, 2010.
  8. C. M. Bishop. Pattern recognition and machine learning. Springer, 2007.
  9. C. M. Bishop. Neural networks for pattern recognition. Oxford University Press, 1995.
  10. L. Breiman. Random forests. Machine Learning, 45(1), pp. 5–32, 2001.
  11. A. Bryson. A gradient method for optimizing multi-stage allocation processes. Harvard University Symposium on Digital Computers and their Applications, 1961.
  12. D. Ciresan, U. Meier, L. Gambardella, and J. Schmidhuber. Deep, big, simple neural nets for handwritten digit recognition. Neural Computation, 22(12), pp. 3207–3220, 2010.
  13. T. Cover. Geometrical and statistical properties of systems of linear inequalities with applications to pattern recognition. IEEE Transactions on Electronic Computers, pp. 326–334, 1965.
  14. N. de Freitas. Machine Learning, University of Oxford (Course Video), 2013. https://www.youtube.com/watch?v=w2OtwL5T1ow&list=PLE6Wd9FREdyJ5lbFl8Uu-GjecvVw66F6
  15. N. de Freitas. Deep Learning, University of Oxford (Course Video), 2015. https://www.youtube.com/watch?v=PlhFWT7vAEw&list=PLjK8ddCbDMphIMSXn-1IjyYpHU3DaUYw
  16. O. Delalleau and Y. Bengio. Shallow vs. deep sum-product networks. NIPS Conference, pp. 666–674, 2011.
  17. Y. Freund and R. Schapire. Large margin classification using the perceptron algorithm. Machine Learning, 37(3), pp. 277–296, 1999.
  18. K. Fukushima. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics, 36(4), pp. 193–202, 1980.
  19. S. Gallant. Perceptron-based learning algorithms. IEEE Transactions on Neural Networks, 1(2), pp. 179–191, 1990.
  20. A. Ghodsi. STAT 946: Topics in Probability and Statistics: Deep Learning, University of Waterloo, Fall 2015. https://www.youtube.com/watch?v=fyAZszlPphs&list=PLehuLRPyt1Hyi78UOkMP-WCGRxGcA9NVOE
  21. X. Glorot and Y. Bengio. Understanding the difficulty of training deep feedforward neural networks. AISTATS, pp. 249–256, 2010.
  22. I. Goodfellow, Y. Bengio, and A. Courville. Deep learning. MIT Press, 2016.
  23. A. Graves, A. Mohamed, and G. Hinton. Speech recognition with deep recurrent neural networks. Acoustics, Speech and Signal Processing (ICASSP), pp. 6645–6649, 2013.
  24. A. Graves, G. Wayne, and I. Danihelka. Neural Turing machines. arXiv:1410.5401, 2014. https://arxiv.org/abs/1410.5401
  25. K. Greff, R. K. Srivastava, and J. Schmidhuber. Highway and residual networks learn unrolled iterative estimation. arXiv:1612.07771, 2016. https://arxiv.org/abs/1612.07771
  26. D. Hassabis, D. Kumaran, C. Summerfield, and M. Botvinick. Neuroscience-inspired artificial intelligence. Neuron, 95(2), pp. 245–258, 2017.
  27. T. Hastie, R. Tibshirani, and J. Friedman. The elements of statistical learning. Springer, 2009.
  28. S. Haykin. Neural networks and learning machines. Pearson, 2008.
  29. K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, 2016.
  30. G. Hinton and R. Salakhutdinov. Reducing the dimensionality of data with neural networks. Science, 313(5766), pp. 504–507, 2006.
  31. S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Computation, 9(8), pp. 1735–1785, 1997.
  32. J. J. Hopfield. Neural networks and physical systems with emergent collective computational abilities. National Academy of Sciences of the USA, 79(8), pp. 2554–2558, 1982.
  33. K. Hornik, M. Stinchcombe, and H. White. Multilayer feedforward networks are universal approximators. Neural Networks, 2(5), pp. 359–366, 1989.
  34. D. Hubel and T. Wiesel. Receptive fields of single neurones in the cat’s striate cortex. The Journal of Physiology, 124(3), pp. 574–591, 1959.
  35. E. Kandel, J. Schwartz, T. Jessell, S. Siegelbaum, and A. Hudspeth. Principles of neural science. McGraw Hill, 2012.
  36. A. Karpathy, J. Johnson, and L. Fei-Fei. Stanford University Class CS231n: Convolutional neural networks for visual recognition, 2016. http://cs231n.github.io/
  37. H. J. Kelley. Gradient theory of optimal flight paths. ARS Journal, 30(10), pp. 947–954, 1960.
  38. T. Kietzmann, P. McClure, and N. Kriegeskorte. Deep neural networks in computational neuroscience. bioRxiv, 133504, 2017. https://www.biorxiv.org/content/early/2017/05/04/133504
  39. J. Kivinen and M. Warmuth. The perceptron algorithm vs. winnow: linear vs. logarithmic mistake bounds when few input variables are relevant. Computational Learning Theory, pp. 289–296, 1995.
  40. D. Koller and N. Friedman. Probabilistic graphical models: principles and techniques. MIT Press, 2009.
  41. A. Krizhevsky, I. Sutskever, and G. Hinton. Imagenet classification with deep convolutional neural networks. NIPS Conference, pp. 1097–1105, 2012.
  42. H. Larochelle. Neural Networks (Course). Universite de Sherbrooke, 2013. https://www.youtube.com/watch?v=SGZ6BttHMPw&list=PL6Xpj9I5qXYEcOhn7-TqghAJ6NAPrNmUBH
  43. H. Larochelle, D. Erhan, A. Courville, J. Bergstra, and Y. Bengio. An empirical evaluation of deep architectures on problems with many factors of variation. ICML Conference, pp. 473–480, 2007.
  44. Y. LeCun, Y. Bengio, and G. Hinton. Deep learning. Nature, 521(7553), pp. 436–444, 2015.
  45. Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), pp. 2278–2324, 1998.
  46. Y. LeCun, C. Cortes, and C. Burges. The MNIST database of handwritten digits, 1998. http://yann.lecun.com/exdb/mnist/
  47. C. Manning and R. Socher. CS224N: Natural language processing with deep learning. Stanford University School of Engineering, 2017. https://www.youtube.com/watch?v=OQQ-W_63UgQ
  48. W. S. McCulloch and W. H. Pitts. A logical calculus of the ideas immanent in nervous activity. The Bulletin of Mathematical Biophysics, 5(4), pp. 115–133, 1943.
  49. G. Miller, R. Beckwith, C. Fellbaum, D. Gross, and K. J. Miller. Introduction to WordNet: An on-line lexical database. International Journal of Lexicography, 3(4), pp. 235–312, 1990. https://wordnet.princeton.edu/
  50. M. Minsky and S. Papert. Perceptrons: An introduction to computational geometry. MIT Press, 1969.
  51. G. Montufar. Universal approximation depth and errors of narrow belief networks with discrete units. Neural Computation, 26(7), pp. 1386–1407, 2014.
  52. R. Miotto, F. Wang, S. Wang, X. Jiang, and J. T. Dudley. Deep learning for healthcare: review, opportunities and challenges. Briefings in Bioinformatics, pp. 1–11, 2017.
  53. H. Poon and P. Domingos. Sum-product networks: A new deep architecture. Computer Vision Workshops (ICCV Workshops), pp. 689–690, 2011.
  54. V. Romanuke. Parallel Computing Center (Khmelnitskiy, Ukraine) represents an ensemble of 5 convolutional neural networks which performs on MNIST at 0.21 percent error rate. Retrieved 24 November 2016.
  55. F. Rosenblatt. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6), 386, 1958.
  56. D. Rumelhart, G. Hinton, and R. Williams. Learning representations by back-propagating errors. Nature, 323(6088), pp. 533–536, 1986.
  57. D. Rumelhart, G. Hinton, and R. Williams. Learning internal representations by back-propagating errors. In Parallel Distributed Processing: Explorations in the Microstructure of Cognition, pp. 318–362, 1986.
  58. J. Schmidhuber. Deep learning in neural networks: An overview. Neural Networks, 61, pp. 85–117, 2015.
  59. H. Siegelmann and E. Sontag. On the computational power of neural nets. Journal of Computer and System Sciences, 50(1), pp. 132–150, 1995.
  60. S. Shalev-Shwartz, Y. Singer, N. Srebro, and A. Cotter. Pegasos: Primal estimated sub-gradient solver for SVM. Mathematical Programming, 127(1), pp. 3–30, 2011.
  61. B. W. Silverman. Density estimation for statistics and data analysis. Chapman and Hall, 1986.
  62. S. Wang, C. Aggarwal, and H. Liu. Using a random forest to inspire a neural network and improving on it. SIAM Conference on Data Mining, 2017.
  63. A. Wendemuth. Learning the unlearnable. Journal of Physics A: Math. Gen., 28, pp. 5423–5436, 1995.
  64. P. Werbos. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. PhD thesis, Harvard University, 1974.
  65. P. Werbos. The roots of backpropagation: from ordered derivatives to neural networks and political forecasting (Vol. 1). John Wiley and Sons, 1994.
  66. P. Werbos. Backpropagation through time: what it does and how to do it. Proceedings of the IEEE, 78(10), pp. 1550–1560, 1990.
  67. J. Weston, S. Chopra, and A. Bordes. Memory networks. ICLR, 2015.
  68. B. Widrow and M. Hoff. Adaptive switching circuits. IRE WESCON Convention Record, 4(1), pp. 96–104, 1960.
  69. http://caffe.berkeleyvision.org/
  70. http://torch.ch/
  71. http://deeplearning.net/software/theano/
  72. https://www.tensorflow.org/
  73. https://keras.io/
  74. https://lasagne.readthedocs.io/en/latest/
  75. http://www.image-net.org/
  76. http://www.image-net.org/challenges/LSVRC/
  77. https://deeplearning4j.org/
  78. https://www.wikipedia.org/
  79. https://science.education.nih.gov/supplements/webversions/BrainAddiction/guide/lesson2-1.html
  80. https://www.ibm.com/us-en/marketplace/deep-learning-platform
  81. https://www.coursera.org/learn/neural-networks
  82. https://archive.ics.uci.edu/ml/datasets.html
  83. https://www.youtube.com/watch?v=2pWv7GOvuf0


Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this chapter

Cite this chapter

Aggarwal, C.C. (2018). An Introduction to Neural Networks. In: Neural Networks and Deep Learning. Springer, Cham. https://doi.org/10.1007/978-3-319-94463-0_1

  • DOI: https://doi.org/10.1007/978-3-319-94463-0_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-94462-3

  • Online ISBN: 978-3-319-94463-0

  • eBook Packages: Computer Science, Computer Science (R0)