
Overview of Long Short-Term Memory Neural Networks

Deep Learning Classifiers with Memristive Networks

Part of the book series: Modeling and Optimization in Science and Technologies (MOST, volume 14)

Abstract

Long Short-Term Memory was designed to avoid the vanishing and exploding gradient problems of recurrent neural networks. Over the last twenty years, various modifications of the original LSTM cell have been proposed. This chapter gives an overview of basic LSTM cell structures and demonstrates forward and backward propagation within the most widely used configuration, called the traditional LSTM cell. In addition, LSTM neural network configurations are described.



Author information


Corresponding author

Correspondence to Alex Pappachen James.


Chapter Highlights

  • Long short-term memory (LSTM) is a special type of recurrent neural network (RNN).

  • An LSTM unit has a memory cell and multiple weighted gates. Therefore, it does not suffer from the vanishing and exploding gradient problems of a standard RNN and can process sequences of arbitrary length.

  • The original LSTM unit has no forget gate (NFG).

  • Traditional LSTM configuration (a minimal forward-pass sketch in code follows this list):

    $$ \begin{pmatrix} g_{t} \\ i_{t} \\ f_{t} \\ o_{t} \end{pmatrix} = \begin{pmatrix} \tanh \\ \sigma \\ \sigma \\ \sigma \end{pmatrix} \cdot \begin{pmatrix} W^{(g)} & U^{(g)} \\ W^{(i)} & U^{(i)} \\ W^{(f)} & U^{(f)} \\ W^{(o)} & U^{(o)} \end{pmatrix} \cdot \begin{pmatrix} x_{t} \\ h_{t-1} \end{pmatrix};$$
    $$C_{t} = f_{t} \odot C_{t-1} + i_{t} \odot g_{t}; \qquad h_{t} = o_{t} \odot \tanh(C_{t}).$$
  • The traditional LSTM with peephole connections can learn precise timing and is often referred to as the ‘Vanilla’ LSTM.

  • ConvLSTM networks are effective in spatiotemporal sequence problems.

  • Updates in the Phased LSTM occur at irregularly sampled time points \(t_{j}\), which can be controlled.

  • Depending on the lengths of the input and output sequences, the following LSTM models are distinguished: ‘One-to-One’, ‘One-to-Many’, ‘Many-to-One’ and ‘Many-to-Many’.

  • An LSTM architecture can differ in directionality and dimensionality, or combine both (e.g. bidirectional and multidimensional LSTMs).
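
A minimal NumPy sketch of the traditional LSTM forward pass listed above, intended only to make the gate equations concrete: the function and variable names (lstm_step, n_in, n_hid) are illustrative rather than taken from the chapter, bias terms are omitted to match the equations, and the weights are random rather than trained.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def lstm_step(x_t, h_prev, C_prev, W, U):
        """One forward step of the traditional LSTM cell (no bias terms)."""
        # W and U are dicts keyed by 'g', 'i', 'f', 'o' holding the input
        # weights W^(k) and recurrent weights U^(k) of each gate.
        g_t = np.tanh(W['g'] @ x_t + U['g'] @ h_prev)   # candidate cell input
        i_t = sigmoid(W['i'] @ x_t + U['i'] @ h_prev)   # input gate
        f_t = sigmoid(W['f'] @ x_t + U['f'] @ h_prev)   # forget gate
        o_t = sigmoid(W['o'] @ x_t + U['o'] @ h_prev)   # output gate
        C_t = f_t * C_prev + i_t * g_t                  # element-wise (Hadamard) products
        h_t = o_t * np.tanh(C_t)
        return h_t, C_t

    # Example: run the cell over a random sequence of 5 input vectors.
    rng = np.random.default_rng(0)
    n_in, n_hid = 3, 4
    W = {k: 0.1 * rng.standard_normal((n_hid, n_in)) for k in 'gifo'}
    U = {k: 0.1 * rng.standard_normal((n_hid, n_hid)) for k in 'gifo'}
    h, C = np.zeros(n_hid), np.zeros(n_hid)
    for x in rng.standard_normal((5, n_in)):
        h, C = lstm_step(x, h, C, W, U)
    print(h)  # final hidden state of the sequence

Keeping only the final hidden state, as in the loop above, corresponds to the ‘Many-to-One’ configuration; collecting \(h_{t}\) at every time step instead yields a ‘Many-to-Many’ model.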


Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Smagulova, K., James, A.P. (2020). Overview of Long Short-Term Memory Neural Networks. In: James, A. (eds) Deep Learning Classifiers with Memristive Networks. Modeling and Optimization in Science and Technologies, vol 14. Springer, Cham. https://doi.org/10.1007/978-3-030-14524-8_11
