Knowledge Memory Based LSTM Model for Answer Selection

  • Weijie An
  • Qin Chen
  • Yan Yang
  • Liang He
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10635)


Recurrent neural networks (RNNs) have achieved great success on the answer selection task in recent years. Although attention mechanisms have been widely used to enhance the interaction between questions and answers, knowledge remains a gap between their representations. In this paper, we propose a knowledge memory based RNN model, which incorporates the knowledge learned from the data sets into the question representations. Experiments on two benchmark data sets show clear advantages of the proposed model over its counterpart without the knowledge memory. Furthermore, our model outperforms most recent approaches to question answering.
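The core idea in the abstract, enriching a question representation with a readout from a learned knowledge memory, can be illustrated with a minimal attention sketch. This is not the paper's implementation: the function and variable names (`knowledge_enriched_question`, `mem_keys`, `mem_values`) and the additive combination are assumptions for illustration only.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def knowledge_enriched_question(q, mem_keys, mem_values):
    """Attend over a knowledge memory and add the readout to the
    question vector (names and combination are illustrative)."""
    scores = mem_keys @ q            # (num_slots,) similarity to each slot
    weights = softmax(scores)        # attention distribution over slots
    readout = weights @ mem_values   # (dim,) weighted sum of memory values
    return q + readout               # knowledge-enriched question vector

rng = np.random.default_rng(0)
dim, slots = 8, 5
q = rng.standard_normal(dim)               # question representation (e.g. from an LSTM)
K = rng.standard_normal((slots, dim))      # memory keys learned from the data set
V = rng.standard_normal((slots, dim))      # memory values learned from the data set
q_enriched = knowledge_enriched_question(q, K, V)
```

The enriched vector `q_enriched` would then be matched against candidate answer representations, e.g. by cosine similarity, in place of the raw question vector.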


Knowledge memory · Answer selection · Deep learning



This work was supported by Xiaoi Research, the Shanghai Municipal Commission of Economy and Information under a Grant Project (No. 201602024), and the Natural Science Foundation of Shanghai (No. 172R1444900).


References

  1. Yih, W.T., Chang, M.W., Meek, C., Pastusiak, A.: Question answering using enhanced lexical semantic models. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (vol. 1: Long Papers), pp. 1744–1753. Association for Computational Linguistics, Sofia (2013)
  2. Heilman, M., Smith, N.A.: Tree edit models for recognizing textual entailments, paraphrases, and answers to questions. In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp. 1011–1019. Association for Computational Linguistics, Los Angeles (2010)
  3. Wang, M., Smith, N.A., Mitamura, T.: What is the Jeopardy model? A quasi-synchronous grammar for QA. In: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), pp. 22–32. Association for Computational Linguistics, Prague (2007)
  4. Wang, B., Liu, K., Zhao, J.: Inner attention based recurrent neural networks for answer selection. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (vol. 1: Long Papers), pp. 1288–1297. Association for Computational Linguistics, Berlin (2016)
  5. Yin, W., Schütze, H., Xiang, B., Zhou, B.: ABCNN: attention-based convolutional neural network for modeling sentence pairs. arXiv preprint arXiv:1512.05193 (2015)
  6. Sukhbaatar, S., Szlam, A., Weston, J., Fergus, R.: End-to-end memory networks. In: Advances in Neural Information Processing Systems, vol. 28, pp. 2440–2448. Curran Associates, Inc. (2015)
  7. Miller, A., Fisch, A., Dodge, J., Karimi, A.H., Bordes, A., Weston, J.: Key-value memory networks for directly reading documents. In: Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pp. 1400–1409. Association for Computational Linguistics, Austin (2016)
  8. Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., Dean, J.: Distributed representations of words and phrases and their compositionality. In: Advances in Neural Information Processing Systems, pp. 3111–3119 (2013)
  9. Severyn, A., Moschitti, A.: Automatic feature engineering for answer selection and extraction. In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 458–467. Association for Computational Linguistics, Seattle (2013)
  10. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
  11. Wang, D., Nyberg, E.: A long short-term memory model for answer sentence selection in question answering. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (vol. 2: Short Papers), pp. 707–712. Association for Computational Linguistics, Beijing (2015)
  12. Mueller, J., Thyagarajan, A.: Siamese recurrent architectures for learning sentence similarity. In: AAAI, pp. 2786–2792 (2016)
  13. Hu, Q., Pei, Y., Chen, Q., He, L.: SG++: word representation with sentiment and negation for Twitter sentiment classification. In: Proceedings of the 39th ACM SIGIR, pp. 997–1000 (2016)
  14. Grbovic, M., Djuric, N., Radosavljevic, V., Silvestri, F., Bhamidipati, N.: Context- and content-aware embeddings for query rewriting in sponsored search. In: Proceedings of the 38th ACM SIGIR, pp. 383–392 (2015)
  15. Yang, Y., Yih, W.T., Meek, C.: WikiQA: a challenge dataset for open-domain question answering. In: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pp. 2013–2018 (2015)
  16. Severyn, A., Moschitti, A.: Learning to rank short text pairs with convolutional deep neural networks. In: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 373–382. ACM (2015)
  17. Wang, Z., Ittycheriah, A.: FAQ-based question answering via word alignment. arXiv preprint arXiv:1507.02628 (2015)

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Computer Science and Technology, East China Normal University, Shanghai, China
  2. Shanghai Engineering Research Center of Intelligent Service Robot, Shanghai, China
