Abstract
Memory-based neural networks can remember information longer while modelling temporal data. To improve the LSTM's memory, we encode a novel Relational Memory Core (RMC) as the cell state inside an LSTM cell, using the standard multi-head self-attention mechanism with a variable-length memory pointer; we call the result \(\text {LSTM}_{\textit{RMC}}\). Two improvements are claimed: the area on which the RMC operates is expanded to create the new memory as more data is seen with each time step, and the expanded area is treated as a fixed-size kernel with shared weights in the form of the query, key, and value projection matrices. We design a novel sentence encoder using \(\text {LSTM}_{\textit{RMC}}\) and test our hypotheses on four NLP tasks, showing improvements over the standard LSTM and the Transformer encoder as well as over state-of-the-art general-purpose sentence encoders.
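To make the mechanism the abstract describes concrete, here is a minimal PyTorch sketch of an LSTM-style cell whose candidate cell state is produced by multi-head self-attention over a memory that grows by one slot per time step (our reading of the variable-length memory pointer), with the shared query, key, and value projections acting as the fixed-size kernel over the expanding region. This is not the authors' implementation: the class name RMCCellSketch, the gate layout, and the zero-seeded memory are illustrative assumptions.

```python
# Minimal sketch (not the paper's code) of an LSTM-style cell whose
# candidate cell state comes from multi-head self-attention over a
# memory that grows by one slot per time step.
import torch
import torch.nn as nn

class RMCCellSketch(nn.Module):
    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        # Shared Q/K/V projections live inside MultiheadAttention and act
        # as a fixed-size kernel applied to the expanding memory.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gates = nn.Linear(2 * d_model, 3 * d_model)  # input, forget, output

    def forward(self, x_t, h_prev, memory):
        # memory: (batch, slots, d_model); one slot is appended per step,
        # so the area the attention operates on expands over time.
        mem = torch.cat([memory, x_t.unsqueeze(1)], dim=1)
        attended, _ = self.attn(mem, mem, mem)
        c_tilde = attended[:, -1]                  # attention-based candidate state
        i, f, o = self.gates(torch.cat([x_t, h_prev], dim=-1)).chunk(3, dim=-1)
        c_prev = memory[:, -1]                     # most recent memory slot
        c_t = torch.sigmoid(f) * c_prev + torch.sigmoid(i) * torch.tanh(c_tilde)
        h_t = torch.sigmoid(o) * torch.tanh(c_t)
        return h_t, torch.cat([memory, c_t.unsqueeze(1)], dim=1)

# Usage: encode a sentence of 10 token embeddings.
cell = RMCCellSketch(d_model=64)
x = torch.randn(2, 10, 64)                         # (batch, seq_len, d_model)
h = torch.zeros(2, 64)
memory = torch.zeros(2, 1, 64)                     # seed memory with one empty slot
for t in range(x.size(1)):
    h, memory = cell(x[:, t], h, memory)           # h is the sentence state so far
```

In a full sentence encoder one would presumably cap or pool the memory; this sketch keeps every slot so that the expanding attention region is explicit.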
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Ahmed, M., Mercer, R.E. (2020). Investigating Relational Recurrent Neural Networks with Variable Length Memory Pointer. In: Goutte, C., Zhu, X. (eds.) Advances in Artificial Intelligence. Canadian AI 2020. Lecture Notes in Computer Science, vol. 12109. Springer, Cham. https://doi.org/10.1007/978-3-030-47358-7_3
DOI: https://doi.org/10.1007/978-3-030-47358-7_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-47357-0
Online ISBN: 978-3-030-47358-7