Neural architecture for question answering using a knowledge graph and web corpus

  • Uma Sawant
  • Saurabh Garg
  • Soumen Chakrabarti
  • Ganesh Ramakrishnan
Knowledge Graphs and Semantics in Text Analysis and Retrieval


In Web search, entity-seeking queries often trigger a special question answering (QA) system. It may use a parser to translate the question into a structured query, execute that query on a knowledge graph (KG), and return direct entity responses. QA systems based on precise parsing tend to be brittle: minor syntax variations may dramatically change the response. Moreover, KG coverage is patchy. At the other extreme, a large corpus may provide broader coverage, but in an unstructured, unreliable form. We present AQQUCN, a QA system that gracefully combines KG and corpus evidence. AQQUCN accepts a broad spectrum of query syntax, ranging from well-formed questions to short “telegraphic” keyword sequences. In the face of inherent query ambiguities, AQQUCN aggregates signals from KGs and large corpora to directly rank KG entities, rather than commit to one semantic interpretation of the query. AQQUCN models the ideal interpretation as an unobservable or latent variable. Interpretations and candidate entity responses are scored as pairs, by combining signals from multiple convolutional networks that operate collectively on the query, KG and corpus. On four public query workloads, amounting to over 8000 queries with diverse query syntax, we see 5–16% absolute improvement in mean average precision (MAP), compared to the entity ranking performance of recent systems. Our system is also competitive at entity set retrieval, almost doubling F1 scores for challenging short queries.
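The idea of ranking entities without committing to a single interpretation can be illustrated with a minimal sketch. It assumes that each (interpretation, entity) pair already has a score and that entity scores are obtained by taking the maximum (or, optionally, the sum) over interpretations. The abstract does not state which aggregation AQQUCN uses, so the function `rank_entities`, the aggregation choice, and the toy scores below are hypothetical and serve only as an illustration.

```python
from collections import defaultdict


def rank_entities(pair_scores, aggregate=max):
    """Rank entities by aggregating pair scores over latent interpretations.

    pair_scores maps (interpretation, entity) -> score. The aggregate
    function (max by default) is an assumption made for illustration.
    """
    by_entity = defaultdict(list)
    for (_interpretation, entity), score in pair_scores.items():
        by_entity[entity].append(score)
    entity_scores = {e: aggregate(s) for e, s in by_entity.items()}
    # Sort by descending score; break ties alphabetically for determinism.
    return sorted(entity_scores, key=lambda e: (-entity_scores[e], e))


# Hypothetical scores for an ambiguous telegraphic query.
toy_scores = {
    ("type=city, relation=capital_of", "Canberra"): 0.9,
    ("type=city, relation=largest_city", "Sydney"): 0.6,
    ("type=city, relation=capital_of", "Sydney"): 0.2,
    ("type=person, relation=leader_of", "Canberra"): 0.1,
}

ranking = rank_entities(toy_scores)
print(ranking)
```

In this sketch no interpretation is ever selected as "the" parse: every interpretation contributes evidence, and only the entity ranking is returned.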


Question answering, Knowledge graph, Neural network, Convolutional network, Entity ranking



Thanks to the reviewers for their constructive suggestions. Thanks to Elmar Haußmann for generous help with AQQU. Thanks to Doug Oard for advice on set versus ranked retrieval. Thanks to Saurabh Sarda for migrating the code of Joshi et al. (2014) to use AQQU. Partly supported by grants from IBM and nVidia.


  1. Andreas, J., Rohrbach, M., Darrell, T., & Klein, D. (2016). Learning to compose neural networks for question answering. arXiv preprint arXiv:1601.01705.
  2. Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. CoRR. arXiv:1409.0473
  3. Balog, K., Azzopardi, L., & de Rijke, M. (2006). Formal models for expert finding in enterprise corpora. In SIGIR conference (pp. 43–50).
  4. Balog, K., Azzopardi, L., & de Rijke, M. (2009). A language modeling framework for expert finding. Information Processing and Management, 45(1), 1–19.
  5. Bast, H., & Buchhold, B. (2017). QLever: A query engine for efficient sparql+text search. In CIKM (pp. 647–656).
  6. Bast, H., & Haußmann, E. (2015). More accurate question answering on freebase. In CIKM (pp. 1431–1440).
  7. Berant, J., & Liang, P. (2015). Imitation learning of agenda-based semantic parsers. TACL 3, 545–558.
  8. Berant, J., Chou, A., Frostig, R., & Liang, P. (2013). Semantic parsing on Freebase from question-answer pairs. In EMNLP conference (pp. 1533–1544).
  9. Bollacker, K., Evans, C., Paritosh, P., Sturge, T., & Taylor, J. (2008). Freebase: A collaboratively created graph database for structuring human knowledge. In SIGMOD conference (pp. 1247–1250).
  10. Bordes, A., Chopra, S., & Weston, J. (2014). Question answering with subgraph embeddings. arXiv preprint arXiv:1406.3676.
  11. Bordes, A., Usunier, N., Chopra, S., & Weston, J. (2015). Large-scale simple question answering with memory networks. arXiv preprint arXiv:1506.02075.
  12. Cardie, C. (2012). CS 4740: Introduction to natural language processing.
  13. Chakrabarti, S. (2010). Bridging the structured-unstructured gap: Searching the annotated Web. Keynote talk at WSDM 2010.
  14. CodaLab. (2016). Webquestions benchmark for question answering.
  15. Cornolti, M., Ferragina, P., Ciaramita, M., Rued, S., & Schuetze, H. (2014). The SMAPH system for query entity recognition and disambiguation. In ERD challenge workshop.
  16. CSAW. (2018). The CSAW project at IIT Bombay.
  17. Dalton, J., Dietz, L., & Allan, J. (2014). Entity query feature expansion using knowledge base links. In SIGIR conference.
  18. Dong, L., & Lapata, M. (2016). Language to logical form with neural attention. In ACL (Vol. 1, pp. 33–43). arXiv:1601.01280.
  19. Dong, L., Wei, F., Zhou, M., & Xu, K. (2015). Question answering over freebase with multi-column convolutional neural networks. In ACL conference.
  20. Fang, Y., Si, L., & Mathur, A. P. (2010). Discriminative models of integrating document evidence and document-candidate associations for expert search. In SIGIR conference.
  21. Ferragina, P., & Scaiella, U. (2010). TAGME: On-the-fly annotation of short text fragments (by wikipedia entities). arXiv:1006.3498.
  22. Gabrilovich, E., Ringgaard, M., & Subramanya, A. (2013). FACC1: Freebase annotation of ClueWeb corpora, version 1 (Release date 2013-06-26, Format version 1, Correction level 0).
  23. Ganea, O. E., & Hofmann, T. (2017). Deep joint entity disambiguation with local neural attention. arXiv preprint arXiv:1704.04920.
  24. Globerson, A., Lazic, N., Chakrabarti, S., Subramanya, A., Ringgaard, M., & Pereira, F. (2016). Collective entity resolution with multi-focal attention. In ACL conference (pp. 621–631).
  25. Huang, E. H., Socher, R., Manning, C. D., & Ng, A. Y. (2012). Improving word representations via global context and multiple word prototypes. In ACL conference.
  26. Hui, K., Yates, A., Berberich, K., & de Melo, G. (2017). PACRR: A position-aware neural IR model for relevance matching. arXiv preprint arXiv:1704.03940.
  27. Hui, K., Yates, A., Berberich, K., & de Melo, G. (2018). Co-PACRR: A context-aware neural IR model for ad-hoc retrieval. In WSDM conference (pp. 279–287).
  28. Iyyer, M., Boyd-Graber, J., Claudino, L., Socher, R., & Daumé III, H. (2014). A neural network for factoid question answering over paragraphs. In EMNLP conference (pp. 633–644).
  29. Joachims, T. (2002). Optimizing search engines using clickthrough data. In SIGKDD conference, ACM (pp. 133–142).
  30. Joshi, M., Sawant, U., & Chakrabarti, S. (2014). Knowledge graph and corpus driven segmentation and answer inference for telegraphic entity-seeking queries. In EMNLP conference (pp. 1104–1114).
  31. Kasneci, G., Suchanek, F. M., Ifrim, G., Ramanath, M., & Weikum, G. (2008). NAGA: Searching and ranking knowledge. In ICDE, IEEE.
  32. Kim, Y. (2014). Convolutional neural networks for sentence classification. arXiv preprint arXiv:1408.5882.
  33. Kwiatkowski, T., Choi, E., Artzi, Y., & Zettlemoyer, L. S. (2013). Scaling semantic parsers with on-the-fly ontology matching. In EMNLP conference (pp. 1545–1556).
  34. Liang, C., Berant, J., Le, Q., Forbus, K. D., & Lao, N. (2016). Neural symbolic machines: Learning semantic parsers on Freebase with weak supervision. arXiv:1611.00020.
  35. Lin, T., Pantel, P., Gamon, M., Kannan, A., & Fuxman, A. (2012). Active objects: Actions for entity-centric search. In WWW conference, ACM (pp. 589–598).
  36. Ling, X., & Weld, D. S. (2012). Fine-grained entity recognition. In AAAI conference.
  37. Liu, T. Y. (2009). Learning to rank for information retrieval. In Foundations and trends in information retrieval (Vol. 3, pp. 225–331). Now Publishers.
  38. Lv, Y., & Zhai, C. (2009). Positional language models for information retrieval. In SIGIR conference (pp. 299–306).
  39. MacAvaney, S., Yates, A., Cohan, A., Soldaini, L., Hui, K., Goharian, N., & Frieder, O. (2018). Characterizing question facets for complex answer retrieval. arXiv preprint arXiv:1805.00791.
  40. Macdonald, C., & Ounis, I. (2006). Voting for candidates: Adapting data fusion techniques for an expert search task. In CIKM (pp. 387–396).
  41. Macdonald, C., & Ounis, I. (2011). Learning models for ranking aggregates. In Advances in information retrieval. LNCS (Vol. 6611, pp. 517–529). New York: Springer.
  42. Miller, A. H., Fisch, A., Dodge, J., Karimi, A., Bordes, A., & Weston, J. (2016). Key-value memory networks for directly reading documents. arXiv:1606.03126.
  43. Murdock, J. W., Kalyanpur, A., Welty, C., Fan, J., Ferrucci, D. A., Gondek, D. C., Zhang, L., & Kanayama, H. (2012). Typing candidate answers using type coercion. IBM Journal of Research and Development 56(3/4), 7:1–7:13.
  44. Petkova, D., & Croft, W. B. (2007). Proximity-based document representation for named entity retrieval. In CIKM (pp. 731–740). ACM.
  45. Pound, J., Hudek, A. K., Ilyas, I. F., & Weddell, G. (2012). Interpreting keyword queries over Web knowledge bases. In CIKM.
  46. Reed, S., & De Freitas, N. (2015). Neural programmer-interpreters. arXiv preprint arXiv:1511.06279.
  47. Ribeiro, M. T., Singh, S., & Guestrin, C. (2016). “Why should I trust you?” Explaining the predictions of any classifier. In SIGKDD conference (pp. 1135–1144).
  48. Roth, D. (2017). On the necessity of learning and reasoning: A perspective from natural language understanding. McCarthy award acceptance speech at IJCAI 2017.
  49. Saha, A., Pahuja, V., Khapra, M. M., Sankaranarayanan, K., & Chandar, S. (2018). Complex sequential question answering: Towards learning to converse over linked question answer pairs with a knowledge graph. arXiv preprint arXiv:1801.10314.
  50. Savenkov, D., & Agichtein, E. (2016). When a knowledge base is not enough: Question answering over knowledge bases with external text data. In SIGIR conference (pp. 235–244).
  51. Savenkov, D., & Agichtein, E. (2017). Evinets: Neural networks for combining evidence signals for factoid question answering. In ACL conference (Vol. 2, pp. 299–304).
  52. Sawant, U., & Chakrabarti, S. (2013). Features and aggregators for web-scale entity search. arXiv:1303.3164.
  53. Severyn, A., & Moschitti, A. (2015). Learning to rank short text pairs with convolutional deep neural networks. In Proceedings of the 38th international ACM SIGIR conference on research and development in information retrieval (pp. 373–382).
  54. Shalev-Shwartz, S., & Shashua, A. (2016). On the sample complexity of end-to-end training vs. semantic abstraction training. arXiv:1604.06915
  55. Wang, M. (2006). A survey of answer extraction techniques in factoid question answering. Computational Linguistics 1(1).
  56. West, R., Gabrilovich, E., Murphy, K., Sun, S., Gupta, R., & Lin, D. (2014). Knowledge base completion via search-based question answering. In WWW conference (pp. 515–526).
  57. Xiong, C., Callan, J., & Liu, T. Y. (2017). Word-entity duet representations for document ranking. In SIGIR conference (pp. 763–772). arXiv:1706.06636
  58. Xu, K., Reddy, S., Feng, Y., Huang, S., & Zhao, D. (2016). Question answering on Freebase via relation extraction and textual evidence. arXiv preprint arXiv:1603.00957.
  59. Yahya, M., Berberich, K., Elbassuoni, S., Ramanath, M., Tresp, V., & Weikum, G. (2012). Natural language questions for the Web of data. In EMNLP conference, Jeju Island, Korea (pp. 379–390).
  60. Yang, M. C., Duan, N., Zhou, M., & Rim, H. C. (2014). Joint relational embeddings for knowledge-based question answering. In EMNLP conference (pp. 645–650).
  61. Yao, X. (2015). Lean question answering over Freebase from scratch. In NAACL conference (pp. 66–70).
  62. Yao, X., & Van Durme, B. (2014). Information extraction over structured data: Question answering with Freebase. In ACL conference, ACL.
  63. Yavuz, S., Gur, I., Su, Y., Srivatsa, M., & Yan, X. (2016). Improving semantic parsing via answer type inference. In EMNLP conference (pp. 149–159).
  64. Yih, S. W.-t., Chang, M. W., He, X., & Gao, J. (2015). Semantic parsing via staged query graph generation: Question answering with knowledge base. In ACL conference (pp. 1321–1331).
  65. Zhiltsov, N., Kotov, A., & Nikolaev, F. (2015). Fielded sequential dependence model for ad-hoc entity retrieval in the Web of data. In SIGIR conference (pp. 253–262).
  66. Zhong, V., Xiong, C., & Socher, R. (2017). Seq2SQL: Generating structured queries from natural language using reinforcement learning. arXiv preprint arXiv:1709.00103.

Copyright information

© Springer Nature B.V. 2019

Authors and Affiliations

  • Uma Sawant (1)
  • Saurabh Garg (1)
  • Soumen Chakrabarti (1), corresponding author
  • Ganesh Ramakrishnan (1)
  1. IIT Bombay, Powai, Mumbai, India
