Initialising Deep Neural Networks: An Approach Based on Linear Interval Tolerance
Deep neural networks, supported by recent advances in hardware and the availability of computational resources, have managed to outperform shallow multilayer networks with one or two hidden layers, producing impressive results on several difficult tasks. Nevertheless, training deep networks remains considerably challenging, and there is a lack of approaches for initialising deep architectures. In this paper we present an approach that builds on interval analysis to provide weight initialisation for deep neural networks. We build on our previous work, making the necessary adjustments to tailor it to deeper architectures. We conducted an empirical study to preliminarily evaluate this approach using well-known benchmarks from the deep learning literature.
Keywords: Deep learning · Weight initialisation · Linear interval tolerance · Deep neural networks · Multilayer perceptron · GPGPU computing
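To illustrate the general idea behind interval-based weight initialisation, the sketch below samples each layer's weights from an interval chosen so that every pre-activation is guaranteed to fall inside the non-saturated band of a sigmoid/tanh unit. This is a minimal illustration of the interval-tolerance principle, not the authors' actual algorithm: the bound derivation (a simple triangle-inequality argument), the function name `lit_style_init`, and the band half-width `a` are assumptions introduced here for the example.

```python
import numpy as np

def lit_style_init(fan_in, fan_out, x_bound, a=4.0, rng=None):
    """Sample a (fan_in, fan_out) weight matrix uniformly from
    [-b, b] with b = a / (fan_in * x_bound).

    If every input satisfies |x_j| <= x_bound, the triangle
    inequality gives |sum_j w_ij x_j| <= fan_in * b * x_bound = a,
    so each pre-activation lies in [-a, a], the roughly linear
    (non-saturated) region of a sigmoid/tanh activation.
    Note: this conservative bound is an illustrative assumption,
    not the linear-interval-tolerance solution from the paper.
    """
    rng = rng or np.random.default_rng()
    bound = a / (fan_in * x_bound)
    return rng.uniform(-bound, bound, size=(fan_in, fan_out))

# Example: a layer with 100 inputs in [-1, 1]; all pre-activations
# are then guaranteed to stay within [-4, 4].
W = lit_style_init(fan_in=100, fan_out=50, x_bound=1.0)
```

Compared with variance-based schemes such as Glorot initialisation, an interval formulation gives a hard guarantee on the pre-activation range rather than a statistical one, at the cost of a more conservative (smaller) weight scale as fan-in grows.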