Temporality-enhanced knowledge memory network for factoid question answering

  • Xin-yu Duan
  • Si-liang Tang
  • Sheng-yu Zhang
  • Yin Zhang
  • Zhou Zhao
  • Jian-ru Xue
  • Yue-ting Zhuang
  • Fei Wu

Abstract

Question answering is an important problem that aims to deliver specific answers to questions posed by humans in natural language. How to efficiently identify the exact answer to a given question has become an active line of research. Previous approaches to factoid question answering typically focus on modeling the semantic relevance or syntactic relationship between a given question and its corresponding answer. Most of these models struggle when a question contains very little content that is indicative of the answer. In this paper, we devise an architecture named the temporality-enhanced knowledge memory network (TE-KMN) and apply the model to a factoid question answering dataset from a trivia competition called quiz bowl. Unlike most existing approaches, our model encodes not only the content of questions and answers, but also the temporal cues in a sequence of ordered sentences that gradually reveal clues to the answer. Moreover, our model collaboratively uses external knowledge for a better understanding of a given question. The experimental results demonstrate that our method achieves better performance than several state-of-the-art methods.
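To make the high-level idea concrete, the sketch below shows one minimal way to combine temporal encoding of ordered question sentences with attention over an external knowledge memory. It is an illustrative PyTorch sketch, not the authors' TE-KMN implementation: the class name TemporalKnowledgeQA, the mean-pooled sentence vectors, the single GRU, the learned knowledge slots, and all dimensions are assumptions made for this example only.

```python
# Minimal sketch (PyTorch) of a temporality-aware, knowledge-augmented QA scorer.
# NOT the paper's TE-KMN: component names, pooling choices, and dimensions below
# are illustrative assumptions made for this example only.
import torch
import torch.nn as nn


class TemporalKnowledgeQA(nn.Module):
    def __init__(self, vocab_size, num_answers, num_facts, dim=128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, dim, padding_idx=0)
        # GRU over the ordered sentences of a question: the hidden state
        # accumulates clues in the order in which they are revealed.
        self.sent_gru = nn.GRU(dim, dim, batch_first=True)
        # External knowledge memory: one learned slot per knowledge entry.
        self.knowledge = nn.Embedding(num_facts, dim)
        self.answer_emb = nn.Embedding(num_answers, dim)

    def forward(self, sentences):
        # sentences: (batch, num_sents, num_words) word ids, 0 = padding
        words = self.word_emb(sentences)               # (B, S, W, D)
        sent_vec = words.mean(dim=2)                   # crude sentence vectors
        states, _ = self.sent_gru(sent_vec)            # (B, S, D), order preserved
        query = states[:, -1]                          # state after the last clue
        mem = self.knowledge.weight                    # (F, D)
        att = torch.softmax(query @ mem.t(), dim=-1)   # attention over knowledge
        evidence = att @ mem                           # (B, D) retrieved evidence
        fused = query + evidence                       # question + knowledge
        scores = fused @ self.answer_emb.weight.t()    # (B, num_answers)
        return scores


# Toy usage: 2 questions, each with 4 ordered sentences of 12 word ids.
model = TemporalKnowledgeQA(vocab_size=10000, num_answers=500, num_facts=2000)
print(model(torch.randint(1, 10000, (2, 4, 12))).shape)  # torch.Size([2, 500])
```

The key design point the sketch tries to convey is that the question is read sentence by sentence in its given order, so the representation after an early sentence can be scored before later, more revealing clues arrive, which matches the incremental nature of quiz bowl questions.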

Keywords

Question answering; Knowledge memory; Temporality interaction

CLC number

TP391 



Copyright information

© Zhejiang University and Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. College of Computer Science and Technology, Zhejiang University, Hangzhou, China
  2. School of Information Management, Wuhan University, Wuhan, China
  3. Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, China
