
Attention-based encoder-decoder model for answer selection in question answering

  • Yuan-ping Nie
  • Yi Han
  • Jiu-ming Huang
  • Bo Jiao
  • Ai-ping Li
Article

Abstract

A key challenge in question answering is bridging the lexical gap between questions and answers, since a question and its correct answer may share no words. Machine translation models have been shown to help with this lexical gap problem between question-answer pairs. In this paper, we introduce an attention-based deep learning model for the answer selection task in question answering. The proposed model employs a bidirectional long short-term memory (LSTM) encoder-decoder, an architecture that has proven effective in machine translation, to bridge the lexical gap between questions and answers. The model also uses a step attention mechanism that allows the question to focus on a particular part of the candidate answer. Finally, we evaluate the model on a benchmark dataset, and the results show that our approach outperforms existing approaches. Integrating the model also significantly improves the performance of our question answering system in the TREC 2015 LiveQA task.
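To make the idea of a question "focusing on" part of a candidate answer concrete, here is a minimal sketch of a generic attention step. It is not the authors' step attention mechanism, whose details are not given in the abstract. The dot-product scoring, the function names `softmax` and `attend`, and the toy vectors are all assumptions for illustration; in a real model the vectors would come from the LSTM encoder and decoder.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(question_vec, answer_states):
    """Weight each answer position by its relevance to the question.

    Assumption: relevance is a plain dot product between the question
    vector and each answer hidden state. The paper may score differently.
    """
    scores = [
        sum(q * h for q, h in zip(question_vec, state))
        for state in answer_states
    ]
    weights = softmax(scores)
    dim = len(question_vec)
    # Context vector: attention-weighted sum of the answer hidden states.
    context = [
        sum(w * state[i] for w, state in zip(weights, answer_states))
        for i in range(dim)
    ]
    return weights, context


# Hypothetical toy inputs: one question vector, three answer positions.
question_vec = [1.0, 0.0]
answer_states = [[0.1, 0.9], [2.0, 0.1], [0.3, 0.3]]

weights, context = attend(question_vec, answer_states)
print(weights)
print(context)
```

In this toy input the second answer position has the largest dot product with the question vector, so it receives the largest attention weight and dominates the context vector.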

Key words

Question answering, Answer selection, Attention, Deep learning

CLC number

TP391.4 



Copyright information

© Zhejiang University and Springer-Verlag GmbH Germany, part of Springer Nature 2017

Authors and Affiliations

  • Yuan-ping Nie (1)
  • Yi Han (2)
  • Jiu-ming Huang (1)
  • Bo Jiao (3)
  • Ai-ping Li (1)

  1. College of Computer, National University of Defense Technology, Changsha, China
  2. Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
  3. Luoyang Electronic Equipment Test Center, Luoyang, China
