Fast Training of Deep LSTM Networks

  • Conference paper

Advances in Neural Networks – ISNN 2019

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11554)

Included in the conference series: ISNN: International Symposium on Neural Networks

Abstract

Deep recurrent neural networks (RNNs), such as LSTM networks, have many advantages over feedforward networks. However, standard LSTM training methods, such as backpropagation through time (BPTT), are slow.

In this paper, we separate the LSTM cell into forward and recurrent substructures and obtain a training method that is much simpler and faster than BPTT; the idea is sketched in the equations below. The deep LSTM is modified by combining a deep RNN with a multilayer perceptron (MLP). Simulation results show that our fast training method outperforms BPTT-trained LSTM.
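For context, the following is a minimal sketch of the standard LSTM cell (Hochreiter and Schmidhuber, 1997), with each gate's pre-activation written as the sum of an input-driven term and a state-driven term. This split is one plausible reading of "forward and recurrent substructures"; it is illustrative only and is not the paper's exact decomposition.

$$
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f)\\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i)\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o)\\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c)\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t\\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
$$

Under this reading, the $W_{\ast} x_t$ terms depend only on the current input and behave like an MLP layer (the forward substructure), while the $U_{\ast} h_{t-1}$ terms carry the recurrence (the recurrent substructure). Training the input-driven part with ordinary backpropagation, rather than unrolling every weight through time, is what would make such a scheme faster than full BPTT.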


Author information

Corresponding author

Correspondence to Wen Yu.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Yu, W., Li, X., Gonzalez, J. (2019). Fast Training of Deep LSTM Networks. In: Lu, H., Tang, H., Wang, Z. (eds) Advances in Neural Networks – ISNN 2019. Lecture Notes in Computer Science, vol 11554. Springer, Cham. https://doi.org/10.1007/978-3-030-22796-8_1

  • DOI: https://doi.org/10.1007/978-3-030-22796-8_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-22795-1

  • Online ISBN: 978-3-030-22796-8

  • eBook Packages: Computer Science, Computer Science (R0)
