Fast Training of Deep LSTM Networks

Yu, Wen; Li, Xiaoou; Gonzalez, Jesus

doi:10.1007/978-3-030-22796-8_1

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11554)

Included in the following conference series:

International Symposium on Neural Networks

Abstract

Deep recurrent neural networks (RNNs), such as LSTM networks, have many advantages over feedforward networks. However, standard LSTM training methods, such as backpropagation through time (BPTT), are slow.

In this paper, by separating the LSTM cell into forward and recurrent substructures, we propose a training method that is much simpler and faster than BPTT. The deep LSTM is modified by combining the deep RNN with a multilayer perceptron (MLP). Simulation results show that the proposed fast training method performs better than BPTT for LSTM.
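
The abstract does not spell out the decomposition, so the following is only an illustrative sketch and not the authors' algorithm. It assumes a standard LSTM cell and shows one plausible reading of "forward and recurrent substructures": each gate's pre-activation is the sum of a feedforward term that depends only on the current input and a recurrent term that depends only on the previous hidden state. All function names and the parameter layout (`forward_part`, `recurrent_part`, `lstm_step`, `params`) are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(M, v):
    # Plain matrix-vector product on nested lists.
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def forward_part(W, b, x):
    # Feedforward substructure (assumed): uses only the current input x_t.
    return [a + bi for a, bi in zip(matvec(W, x), b)]


def recurrent_part(U, h_prev):
    # Recurrent substructure (assumed): uses only the previous hidden state.
    return matvec(U, h_prev)


def lstm_step(params, x, h_prev, c_prev):
    # params maps each gate name to (W, U, b); this layout is hypothetical.
    pre = {}
    for name in ("i", "f", "o", "g"):
        W, U, b = params[name]
        ff = forward_part(W, b, x)
        rec = recurrent_part(U, h_prev)
        pre[name] = [p + q for p, q in zip(ff, rec)]
    i = [sigmoid(z) for z in pre["i"]]    # input gate
    f = [sigmoid(z) for z in pre["f"]]    # forget gate
    o = [sigmoid(z) for z in pre["o"]]    # output gate
    g = [math.tanh(z) for z in pre["g"]]  # candidate cell update
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c
```

In this sketch, setting every `U` to zero removes the recurrent path through the hidden state, so the gates behave like a feedforward layer on `x`. How the paper trains the two parts separately is not stated in the abstract and is not shown here.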

Author information

Authors and Affiliations

Departamento de Control Automático, CINVESTAV-IPN (National Polytechnic Institute), Mexico City, Mexico
Wen Yu & Jesus Gonzalez
Departamento de Computación, CINVESTAV-IPN (National Polytechnic Institute), Mexico City, Mexico
Xiaoou Li

Corresponding author

Correspondence to Wen Yu.

Editor information

Editors and Affiliations

Dalian University of Technology, Dalian, China
Huchuan Lu
Sichuan University, Chengdu, China
Huajin Tang
Northeastern University, Shenyang, China
Zhanshan Wang

About this paper

Cite this paper

Yu, W., Li, X., Gonzalez, J. (2019). Fast Training of Deep LSTM Networks. In: Lu, H., Tang, H., Wang, Z. (eds) Advances in Neural Networks – ISNN 2019. ISNN 2019. Lecture Notes in Computer Science, vol 11554. Springer, Cham. https://doi.org/10.1007/978-3-030-22796-8_1

DOI: https://doi.org/10.1007/978-3-030-22796-8_1
Published: 26 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22795-1
Online ISBN: 978-3-030-22796-8
eBook Packages: Computer Science, Computer Science (R0)
