Abstract
This paper presents a method for initializing the weights of an LSTM network and estimating the number of hidden units needed, with the goal of reducing training time in function approximation tasks. The method is motivated by the behavior of the hidden units and by the complexity of the function to be approximated. Results for 1-D and 2-D functions show that the proposed methodology improves network performance and stabilizes the training phase.
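The paper's exact initialization scheme is not reproduced in this preview, so the following is a minimal sketch assuming a Nguyen-Widrow-style initialization (Nguyen & Widrow, 1990, which the paper cites) applied to the input weights of each LSTM gate. The function name nguyen_widrow_init, the gate loop, and the choice of n_hidden are illustrative assumptions, not the authors' code; in the paper, the number of hidden units would be estimated from the complexity of the target function.

```python
import numpy as np

def nguyen_widrow_init(n_in, n_hidden, rng=None):
    """Nguyen-Widrow-style initialization for one weight matrix.

    Spreads the active regions of the hidden units across the input
    space instead of leaving them randomly clustered, which is the
    classic route to faster early training. This is a generic sketch
    of the cited Nguyen & Widrow (1990) scheme, not the exact
    initialization proposed in the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Scale factor chosen so the units' active regions tile the input space.
    beta = 0.7 * n_hidden ** (1.0 / n_in)
    w = rng.uniform(-0.5, 0.5, size=(n_hidden, n_in))
    # Rescale each unit's weight vector to magnitude beta.
    w *= beta / np.linalg.norm(w, axis=1, keepdims=True)
    b = rng.uniform(-beta, beta, size=n_hidden)
    return w, b

# Hypothetical usage: initialize the input weights of each LSTM gate
# for a network approximating a 1-D function. n_hidden = 10 is an
# arbitrary placeholder for the paper's complexity-based estimate.
n_in, n_hidden = 1, 10
gates = {g: nguyen_widrow_init(n_in, n_hidden)
         for g in ("input", "forget", "output", "cell")}
```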
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Corrêa, D.C., Levada, A.L.M., Saito, J.H. (2008). Improving the Learning Speed in 2-Layered LSTM Network by Estimating the Configuration of Hidden Units and Optimizing Weights Initialization. In: Kůrková, V., Neruda, R., Koutník, J. (eds) Artificial Neural Networks - ICANN 2008. ICANN 2008. Lecture Notes in Computer Science, vol 5163. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87536-9_12
DOI: https://doi.org/10.1007/978-3-540-87536-9_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87535-2
Online ISBN: 978-3-540-87536-9