Abstract
Memory-based neural networks can remember information longer while modelling temporal data. To improve the LSTM's memory, we encode a novel Relational Memory Core (RMC) as the cell state inside an LSTM cell, using the standard multi-head self-attention mechanism with a variable-length memory pointer; we call the result \(\text {LSTM}_{\textit{RMC}}\). Two improvements are claimed: the area on which the RMC operates is expanded to create the new memory as more data is seen with each time step, and the expanded area is treated as a fixed-size kernel with shared weights in the form of the query, key, and value projection matrices. We design a novel sentence encoder using \(\text {LSTM}_{\textit{RMC}}\) and test our hypotheses on four NLP tasks, showing improvements over the standard LSTM and the Transformer encoder as well as over state-of-the-art general-purpose sentence encoders.
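To make the mechanism the abstract describes concrete, here is a minimal PyTorch sketch of an LSTM-style cell whose candidate cell state is produced by multi-head self-attention over a memory that grows by one slot per time step (our reading of the variable-length memory pointer), with the shared query, key, and value projections acting as the fixed-size kernel over the expanding region. This is not the authors' implementation: the class name RMCCellSketch, the gate layout, and the zero-seeded memory are illustrative assumptions.

```python
# Minimal sketch (not the paper's code) of an LSTM-style cell whose
# candidate cell state comes from multi-head self-attention over a
# memory that grows by one slot per time step.
import torch
import torch.nn as nn

class RMCCellSketch(nn.Module):
    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        # Shared Q/K/V projections live inside MultiheadAttention and act
        # as a fixed-size kernel applied to the expanding memory.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gates = nn.Linear(2 * d_model, 3 * d_model)  # input, forget, output

    def forward(self, x_t, h_prev, memory):
        # memory: (batch, slots, d_model); one slot is appended per step,
        # so the area the attention operates on expands over time.
        mem = torch.cat([memory, x_t.unsqueeze(1)], dim=1)
        attended, _ = self.attn(mem, mem, mem)
        c_tilde = attended[:, -1]                  # attention-based candidate state
        i, f, o = self.gates(torch.cat([x_t, h_prev], dim=-1)).chunk(3, dim=-1)
        c_prev = memory[:, -1]                     # most recent memory slot
        c_t = torch.sigmoid(f) * c_prev + torch.sigmoid(i) * torch.tanh(c_tilde)
        h_t = torch.sigmoid(o) * torch.tanh(c_t)
        return h_t, torch.cat([memory, c_t.unsqueeze(1)], dim=1)

# Usage: encode a sentence of 10 token embeddings.
cell = RMCCellSketch(d_model=64)
x = torch.randn(2, 10, 64)                         # (batch, seq_len, d_model)
h = torch.zeros(2, 64)
memory = torch.zeros(2, 1, 64)                     # seed memory with one empty slot
for t in range(x.size(1)):
    h, memory = cell(x[:, t], h, memory)           # h is the sentence state so far
```

In a full sentence encoder one would presumably cap or pool the memory; this sketch keeps every slot so that the expanding attention region is explicit.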
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Ahmed, M., Mercer, R.E. (2020). Investigating Relational Recurrent Neural Networks with Variable Length Memory Pointer. In: Goutte, C., Zhu, X. (eds.) Advances in Artificial Intelligence. Canadian AI 2020. Lecture Notes in Computer Science, vol. 12109. Springer, Cham. https://doi.org/10.1007/978-3-030-47358-7_3
DOI: https://doi.org/10.1007/978-3-030-47358-7_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-47357-0
Online ISBN: 978-3-030-47358-7