Initialising Deep Neural Networks: An Approach Based on Linear Interval Tolerance

  • Cosmin Stamate
  • George D. Magoulas
  • Michael S. C. Thomas
Conference paper
Part of the Lecture Notes in Networks and Systems book series (LNNS, volume 16)

Abstract

Deep neural networks, supported by recent advances in hardware and the availability of computational resources, have managed to outperform multilayer neural networks with one or two hidden layers, producing impressive results on several difficult tasks. Nevertheless, training deep networks remains considerably challenging, and there is a lack of approaches for initialising deep architectures. In this paper we present an approach that builds on interval analysis to provide weight initialisation for deep neural networks. We build on our previous work presented in [1], making the adjustments necessary to tailor it to deeper architectures. We conducted an empirical study to provide a preliminary evaluation of this approach using well-known benchmarks from the deep learning literature.
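The abstract does not detail the initialisation algorithm itself, so the following is only an orienting sketch of what interval-based weight initialisation looks like in general: each layer's weights are drawn uniformly from a per-layer interval. The interval bounds here use a simple fan-in/fan-out heuristic in the spirit of Glorot and Bengio [5] purely as a placeholder; in the approach of [1] these intervals would instead come from solving a linear interval tolerance problem, which is not reproduced here. The function name and layer sizes are illustrative assumptions, not the paper's code.

```python
import numpy as np

def interval_init(layer_sizes, bound_fn=None, seed=0):
    """Illustrative interval-based weight initialisation (not the paper's algorithm).

    Weights of layer l are drawn uniformly from [-b_l, b_l], where the
    half-width b_l is supplied by `bound_fn`. In the method of [1] these
    intervals would be obtained from a linear interval tolerance solution;
    here a fan-in/fan-out heuristic stands in as a placeholder.
    """
    rng = np.random.default_rng(seed)
    if bound_fn is None:
        # Placeholder bound: shrinks with layer width so pre-activations stay moderate.
        bound_fn = lambda fan_in, fan_out: np.sqrt(6.0 / (fan_in + fan_out))
    weights, biases = [], []
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        b = bound_fn(fan_in, fan_out)
        weights.append(rng.uniform(-b, b, size=(fan_in, fan_out)))
        biases.append(np.zeros(fan_out))
    return weights, biases

# Example: a deep MLP for 28x28 MNIST-style inputs with three hidden layers.
W, b = interval_init([784, 512, 256, 128, 10])
print([w.shape for w in W])  # [(784, 512), (512, 256), (256, 128), (128, 10)]
```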

Keywords

Deep learning · Weight initialisation · Linear interval tolerance · Deep neural networks · Multilayer perceptron · GPGPU computing

References

  1. Adam, S.P., Karras, D.A., Magoulas, G.D., Vrahatis, M.N.: Solving the linear interval tolerance problem for weight initialization of neural networks. Neural Netw. 54, 17–37 (2014)
  2. Bengio, Y., Lamblin, P., Popovici, D., Larochelle, H.: Greedy layer-wise training of deep networks. In: Advances in Neural Information Processing Systems, vol. 19. MIT Press, Cambridge, MA (2007)
  3. Bengio, Y.: Learning deep architectures for AI. Found. Trends Mach. Learn. 2(1), 1–127 (2009). Also published as a book by Now Publishers (2009)
  4. Dauphin, Y.N., Pascanu, R., Gulcehre, C., Cho, K., Ganguli, S., Bengio, Y.: Identifying and attacking the saddle point problem in high-dimensional non-convex optimization. In: NIPS, pp. 2933–2941 (2014)
  5. Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: AISTATS, JMLR Proceedings, vol. 9, pp. 249–256. JMLR.org (2010)
  6. Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural Comput. 18, 1527–1554 (2006)
  7. Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images. Master's thesis, Department of Computer Science, University of Toronto (2009)
  8. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86, 2278–2324 (1998)
  9. Magoulas, G.D., Vrahatis, M.N., Androulakis, G.S.: On the alleviation of the problem of local minima in back-propagation. Nonlinear Anal.: Theory Methods Appl. 30(7), 4545–4550 (1997)
  10. Schmidhuber, J.: Deep learning in neural networks: an overview. Neural Netw. 61, 85–117 (2015)

Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Cosmin Stamate (1)
  • George D. Magoulas (1)
  • Michael S. C. Thomas (2)
  1. Department of Computer Science, Birkbeck, University of London, London, UK
  2. Department of Psychological Sciences, Birkbeck, University of London, London, UK