Abstract
This paper presents a method for initializing the weights of an LSTM network and estimating the number of hidden units needed, with the goal of reducing training time in function approximation tasks. The method is motivated by the behavior of the hidden units and by the complexity of the function to be approximated. Results for 1-D and 2-D functions show that the proposed methodology improves network performance and stabilizes the training phase.
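The paper's exact initialization scheme is not reproduced in this preview, so the following is a minimal sketch assuming a Nguyen-Widrow-style initialization (Nguyen & Widrow, 1990, which the paper cites) applied to the input weights of each LSTM gate. The function name nguyen_widrow_init, the gate loop, and the choice of n_hidden are illustrative assumptions, not the authors' code; in the paper, the number of hidden units would be estimated from the complexity of the target function.

```python
import numpy as np

def nguyen_widrow_init(n_in, n_hidden, rng=None):
    """Nguyen-Widrow-style initialization for one weight matrix.

    Spreads the active regions of the hidden units across the input
    space instead of leaving them randomly clustered, which is the
    classic route to faster early training. This is a generic sketch
    of the cited Nguyen & Widrow (1990) scheme, not the exact
    initialization proposed in the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Scale factor chosen so the units' active regions tile the input space.
    beta = 0.7 * n_hidden ** (1.0 / n_in)
    w = rng.uniform(-0.5, 0.5, size=(n_hidden, n_in))
    # Rescale each unit's weight vector to magnitude beta.
    w *= beta / np.linalg.norm(w, axis=1, keepdims=True)
    b = rng.uniform(-beta, beta, size=n_hidden)
    return w, b

# Hypothetical usage: initialize the input weights of each LSTM gate
# for a network approximating a 1-D function. n_hidden = 10 is an
# arbitrary placeholder for the paper's complexity-based estimate.
n_in, n_hidden = 1, 10
gates = {g: nguyen_widrow_init(n_in, n_hidden)
         for g in ("input", "forget", "output", "cell")}
```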
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Corrêa, D.C., Levada, A.L.M., Saito, J.H. (2008). Improving the Learning Speed in 2-Layered LSTM Network by Estimating the Configuration of Hidden Units and Optimizing Weights Initialization. In: Kůrková, V., Neruda, R., Koutník, J. (eds) Artificial Neural Networks - ICANN 2008. ICANN 2008. Lecture Notes in Computer Science, vol 5163. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87536-9_12
DOI: https://doi.org/10.1007/978-3-540-87536-9_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87535-2
Online ISBN: 978-3-540-87536-9