Long Short-Term Memory Learns Context Free and Context Sensitive Languages
Previous work on learning regular languages from exemplary training sequences showed that Long Short-Term Memory (LSTM) outperforms traditional recurrent neural networks (RNNs). Here we demonstrate LSTM's superior performance on context-free language (CFL) benchmarks, and show that it works even better than previous hardwired or highly specialized architectures. To the best of our knowledge, LSTM variants are also the first RNNs to learn a context-sensitive language (CSL), namely, a^n b^n c^n.
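To make the benchmark concrete, the CSL mentioned above consists of strings with equal-length runs of a's, b's, and c's in that order. A minimal membership test (an illustrative sketch, not code from the paper) can be written as:

```python
def is_anbncn(s: str) -> bool:
    """Check membership in the context-sensitive language a^n b^n c^n (n >= 1)."""
    n = len(s) // 3
    return n >= 1 and len(s) == 3 * n and s == "a" * n + "b" * n + "c" * n

# Strings of this form (and near-miss counterexamples) are the kind of
# sequences a network must classify or predict in such benchmarks.
print(is_anbncn("aabbcc"))   # equal counts, correct order
print(is_anbncn("aabbc"))    # counts mismatch
print(is_anbncn("abcabc"))   # wrong ordering
```

No finite automaton or single-stack pushdown automaton can recognize this language, which is why it is a standard test of a network's ability to maintain multiple interacting counters.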
Keywords: Recurrent Neural Network · Training Sequence · Output Unit · Regular Language · Memory Block