
Improving Question Answering by Commonsense-Based Pre-training

  • Conference paper
  • First Online:
Natural Language Processing and Chinese Computing (NLPCC 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11838)

Abstract

Although neural network approaches achieve remarkable success on a variety of NLP tasks, many of them struggle to answer questions that require commonsense knowledge. We believe the main reason is the lack of commonsense connections between concepts. To remedy this, we provide a simple and effective method that leverages an external commonsense knowledge base such as ConceptNet. We pre-train direct and indirect relational functions between concepts, and show that these pre-trained functions can easily be added to existing neural network models. Results show that incorporating the commonsense-based functions improves the state of the art on three question answering tasks that require commonsense reasoning. Further analysis shows that our system discovers and leverages useful evidence from an external commonsense knowledge base, evidence which is missing in existing neural network models and which helps derive the correct answer.

This work was done during an internship at Microsoft Research Asia.
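The chapter body is not included in this preview, so no code from the paper is available here. The sketch below is only an illustration of the general idea stated in the abstract: pre-training a relational scoring function between concepts on ConceptNet-style triples, assuming a TransE-style margin objective [3] implemented in PyTorch. All names (RelationFunction, pretrain) and hyperparameters are hypothetical and not taken from the paper.

# Illustrative sketch (not the authors' code): pre-train a relational
# function over ConceptNet-style (head, relation, tail) triples with a
# TransE-style margin loss [3]. Names and hyperparameters are hypothetical.
import random
import torch
import torch.nn as nn

class RelationFunction(nn.Module):
    """Scores how plausibly two concepts are related; after pre-training it
    could be attached to an existing QA model as an extra feature."""

    def __init__(self, num_concepts, num_relations, dim=100):
        super().__init__()
        self.concept_emb = nn.Embedding(num_concepts, dim)
        self.relation_emb = nn.Embedding(num_relations, dim)

    def score(self, head, relation, tail):
        # TransE-style score: smaller distance = more plausible triple.
        h = self.concept_emb(head)
        r = self.relation_emb(relation)
        t = self.concept_emb(tail)
        return torch.norm(h + r - t, p=2, dim=-1)

def pretrain(model, triples, num_concepts, epochs=5, margin=1.0, lr=1e-3):
    """Pre-train on integer-id triples extracted from ConceptNet."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MarginRankingLoss(margin=margin)
    for _ in range(epochs):
        for h, r, t in triples:
            # Negative sample: corrupt the tail with a random concept.
            t_neg = random.randrange(num_concepts)
            pos = model.score(torch.tensor([h]), torch.tensor([r]), torch.tensor([t]))
            neg = model.score(torch.tensor([h]), torch.tensor([r]), torch.tensor([t_neg]))
            # Push the true triple's distance below the corrupted one's by the margin.
            loss = loss_fn(neg, pos, torch.ones_like(pos))
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model

Under these assumptions, the pre-trained concept embeddings and scoring function could then be queried for question-answer concept pairs and appended as additional features to an existing QA model, which is the spirit of the approach the abstract describes.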


Notes

  1. In this work, concepts are words and phrases that can be extracted from natural language text [20].

  2. The definitions of contexts in these tasks are slightly different; we describe the details in the next section.

  3. https://competitions.codalab.org/competitions/17184.

  4. http://data.allenai.org/arc/arc-corpus/.

  5. http://data.allenai.org/OpenBookQA.

  6. http://conceptnet.io/.

  7. During the SemEval evaluation, systems including TriAN report results based on model pre-training on the RACE dataset [8] and system ensembling. In this work, we report SemEval numbers without pre-training on RACE and without ensembling.

References

  1. Annervaz, K., Chowdhury, S.B.R., Dukkipati, A.: Learning beyond datasets: knowledge graph augmented neural networks for natural language processing. arXiv preprint arXiv:1802.05930 (2018)

  2. Boratko, M., et al.: A systematic classification of knowledge, reasoning, and context within the ARC dataset. arXiv preprint arXiv:1806.00358 (2018)

  3. Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., Yakhnenko, O.: Translating embeddings for modeling multi-relational data. In: NIPS, pp. 2787–2795 (2013)


  4. Clark, P., et al.: Think you have solved question answering? Try ARC, the AI2 reasoning challenge. arXiv preprint arXiv:1803.05457 (2018)

  5. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)

  6. Hirsch, E.D.: Reading comprehension requires knowledge—of words and the world. Am. Educator 27(1), 10–13 (2003)


  7. Jia, R., Liang, P.: Adversarial examples for evaluating reading comprehension systems. arXiv preprint arXiv:1707.07328 (2017)

  8. Lai, G., Xie, Q., Liu, H., Yang, Y., Hovy, E.: RACE: Large-scale reading comprehension dataset from examinations. arXiv preprint arXiv:1704.04683 (2017)

  9. Levesque, H.J., Davis, E., Morgenstern, L.: The Winograd schema challenge. In: AAAI Spring Symposium: Logical Formalizations of Commonsense Reasoning, vol. 46, p. 47 (2011)


  10. Liang, P.: Learning executable semantic parsers for natural language understanding. Commun. ACM 59(9), 68–76 (2016)


  11. Mahajan, D., et al.: Exploring the limits of weakly supervised pretraining. arXiv preprint arXiv:1805.00932 (2018)

  12. Mihaylov, T., Clark, P., Khot, T., Sabharwal, A.: Can a suit of armor conduct electricity? A new dataset for open book question answering. arXiv preprint arXiv:1809.02789 (2018)

  13. Mihaylov, T., Frank, A.: Knowledgeable reader: enhancing cloze-style reading comprehension with external commonsense knowledge. arXiv preprint arXiv:1805.07858 (2018)

  14. Miller, A.H., Fisch, A., Dodge, J., Karimi, A., Bordes, A., Weston, J.: Key-value memory networks for directly reading documents. CoRR abs/1606.03126 (2016). http://arxiv.org/abs/1606.03126

  15. Mitchell, J., Lapata, M.: Composition in distributional models of semantics. Cogn. Sci. 34(8), 1388–1429 (2010)


  16. Ostermann, S., Roth, M., Modi, A., Thater, S., Pinkal, M.: SemEval-2018 task 11: machine comprehension using commonsense knowledge. In: Proceedings of the 12th International Workshop on Semantic Evaluation, pp. 747–757 (2018)


  17. Pennington, J., Socher, R., Manning, C.: GloVe: global vectors for word representation. In: EMNLP, pp. 1532–1543 (2014)


  18. Rajpurkar, P., Zhang, J., Lopyrev, K., Liang, P.: SQuAD: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250 (2016)

  19. Scarselli, F., Gori, M., Tsoi, A.C., Hagenbuchner, M., Monfardini, G.: The graph neural network model. IEEE Trans. Neural Networks 20(1), 61–80 (2009)


  20. Speer, R., Havasi, C.: Representing general relational knowledge in ConceptNet 5. In: LREC, pp. 3679–3686 (2012)


  21. Tandon, N., de Melo, G., Weikum, G.: WebChild 2.0: fine-grained commonsense knowledge distillation. In: Proceedings of ACL 2017, System Demonstrations, pp. 115–120 (2017)


  22. Yang, B., Mitchell, T.: Leveraging knowledge bases in LSTMs for improving machine reading. In: ACL, pp. 1436–1446 (2017)



Acknowledgements

This work was supported by the National Key R&D Program of China (2018YFB1004404), the Key R&D Program of Guangdong Province (2018B010107005), and the National Natural Science Foundation of China (U1711262, U1401256, U1501252, U1611264, U1711261, 61673403, U1611262).

Author information


Corresponding author

Correspondence to Jian Yin.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 155 KB)


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Zhong, W., Tang, D., Duan, N., Zhou, M., Wang, J., Yin, J. (2019). Improving Question Answering by Commonsense-Based Pre-training. In: Tang, J., Kan, M.Y., Zhao, D., Li, S., Zan, H. (eds) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol 11838. Springer, Cham. https://doi.org/10.1007/978-3-030-32233-5_2

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-32233-5_2

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-32232-8

  • Online ISBN: 978-3-030-32233-5

  • eBook Packages: Computer Science (R0)
