Abstract
Deep recurrent neural networks (RNNs), such as the long short-term memory (LSTM) network, have many advantages over feedforward networks. However, standard LSTM training methods, such as backpropagation through time (BPTT), are slow.
In this paper, by separating the LSTM cell into forward and recurrent substructures, we propose a training method that is much simpler and faster than BPTT. The deep LSTM is modified by combining the deep RNN with a multilayer perceptron (MLP). Simulation results show that our fast training method for the LSTM outperforms BPTT.
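For reference, the standard LSTM cell whose training the abstract contrasts with BPTT can be sketched as below. This is a minimal NumPy sketch of the textbook LSTM equations, not the authors' proposed forward/recurrent decomposition; the function name and the stacked-gate weight layout are illustrative assumptions.

```python
import numpy as np

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One step of a standard LSTM cell (textbook formulation, a sketch).

    The four gate pre-activations are computed jointly as
    z = W @ x + U @ h_prev + b, then split into the input (i),
    forget (f), and output (o) gates and the candidate state (g).
    """
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b               # shape (4n,)
    i = 1.0 / (1.0 + np.exp(-z[:n]))         # input gate, sigmoid
    f = 1.0 / (1.0 + np.exp(-z[n:2*n]))      # forget gate, sigmoid
    o = 1.0 / (1.0 + np.exp(-z[2*n:3*n]))    # output gate, sigmoid
    g = np.tanh(z[3*n:])                     # candidate cell state
    c = f * c_prev + i * g                   # new cell state
    h = o * np.tanh(c)                       # new hidden state
    return h, c
```

Training all of W, U, b through time is what makes BPTT expensive: gradients must be propagated backward across every time step, which is the cost the paper's decomposition aims to avoid.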
© 2019 Springer Nature Switzerland AG
Cite this paper
Yu, W., Li, X., Gonzalez, J. (2019). Fast Training of Deep LSTM Networks. In: Lu, H., Tang, H., Wang, Z. (eds) Advances in Neural Networks – ISNN 2019. Lecture Notes in Computer Science, vol 11554. Springer, Cham. https://doi.org/10.1007/978-3-030-22796-8_1
Print ISBN: 978-3-030-22795-1
Online ISBN: 978-3-030-22796-8