Abstract
Previous work on learning regular languages from exemplary training sequences showed that Long Short-Term Memory (LSTM) outperforms traditional recurrent neural networks (RNNs). Here we demonstrate LSTM's superior performance on context free language (CFL) benchmarks, and show that it works even better than previous hardwired or highly specialized architectures. To the best of our knowledge, LSTM variants are also the first RNNs to learn a context sensitive language (CSL), namely, a^n b^n c^n.
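To make the CSL benchmark concrete, the sketch below frames a^n b^n c^n as a next-symbol prediction task, which is the standard way such language-learning experiments are set up: symbols are fed to the network one at a time and it must predict which symbols may come next. The start/end markers 'S' and 'T' and the helper names are illustrative assumptions, not taken verbatim from the paper. The key difficulty is visible in the target function: inside the a-block the continuation is inherently ambiguous, so the network can only succeed by counting the a's and carrying that count across the b-block to predict exactly where the c-block and the string end.

```python
# Minimal sketch of the a^n b^n c^n next-symbol prediction task.
# Assumption: each string is wrapped in a start marker 'S' and an end
# marker 'T'; these names are illustrative, not from the paper.

def make_string(n):
    """One training string of the CSL {a^n b^n c^n : n >= 1},
    wrapped in start/end markers."""
    return "S" + "a" * n + "b" * n + "c" * n + "T"

def legal_next_symbols(prefix):
    """Symbols that can legally follow `prefix` in some string
    S a^n b^n c^n T: the prediction target a recurrent net must
    learn to output at every time step."""
    na, nb, nc = (prefix.count(ch) for ch in "abc")
    if na == 0:                   # right after 'S': a string must start
        return {"a"}
    if nb == 0:                   # inside the a-block: it may continue,
        return {"a", "b"}         # or the b-block may start (ambiguous)
    if nc == 0:                   # inside the b-block: must match the
        return {"b"} if nb < na else {"c"}   # a-count before switching
    if nc < na:                   # inside the c-block: count up to n
        return {"c"}
    return {"T"}                  # n c's seen: the string must end

# Every symbol after the first is predictable only up to the ambiguity
# inside the a-block; everything else requires the stored count n.
s = make_string(3)                # 'SaaabbbcccT'
for t in range(1, len(s)):
    assert s[t] in legal_next_symbols(s[:t])
```

Because the deterministic predictions after the a-block depend on an unbounded count n, no finite-state machine can produce this target function for all n, which is what makes a^n b^n c^n a genuinely context sensitive test for a recurrent network.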
Copyright information
© 2001 Springer-Verlag Wien
About this paper
Cite this paper
Gers, F.A., Schmidhuber, J. (2001). Long Short-Term Memory Learns Context Free and Context Sensitive Languages. In: Kůrková, V., Neruda, R., Kárný, M., Steele, N.C. (eds) Artificial Neural Nets and Genetic Algorithms. Springer, Vienna. https://doi.org/10.1007/978-3-7091-6230-9_32
DOI: https://doi.org/10.1007/978-3-7091-6230-9_32
Publisher Name: Springer, Vienna
Print ISBN: 978-3-211-83651-4
Online ISBN: 978-3-7091-6230-9