Abstract
Although neural network approaches achieve remarkable success on a variety of NLP tasks, many of them struggle to answer questions that require commonsense knowledge. We believe the main reason is the lack of commonsense connections between concepts. To remedy this, we provide a simple and effective method that leverages an external commonsense knowledge base such as ConceptNet. We pre-train direct and indirect relational functions between concepts, and show that these pre-trained functions can be easily added to existing neural network models. Results show that incorporating the commonsense-based functions improves the state of the art on three question answering tasks that require commonsense reasoning. Further analysis shows that our system discovers and leverages useful evidence from an external commonsense knowledge base, which is missing in existing neural network models and helps derive the correct answer.
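The abstract's "direct relational function between concepts" can be sketched in miniature. The following is a hypothetical illustration, not the paper's implementation: it assumes a TransE-style translation scoring function (Bordes et al., cited in the references), with toy concept and relation vocabularies invented for the example; an "indirect" score is approximated here as the best single-relation score between two concepts.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy vocabularies; a real system would extract these from ConceptNet.
concepts = {"suit_of_armor": 0, "metal": 1, "electricity": 2}
relations = {"MadeOf": 0, "Conducts": 1}

dim = 8
concept_emb = rng.normal(size=(len(concepts), dim))    # one vector per concept
relation_emb = rng.normal(size=(len(relations), dim))  # one vector per relation

def direct_score(head: str, rel: str, tail: str) -> float:
    """TransE-style direct relational function: score = -||h + r - t||.
    Higher (closer to 0) means the triple is more plausible."""
    h = concept_emb[concepts[head]]
    r = relation_emb[relations[rel]]
    t = concept_emb[concepts[tail]]
    return -np.linalg.norm(h + r - t)

def indirect_score(head: str, tail: str) -> float:
    """Crude stand-in for an indirect relational function: relatedness of two
    concepts via the best-scoring relation between them."""
    return max(direct_score(head, r, tail) for r in relations)

print(direct_score("suit_of_armor", "MadeOf", "metal"))
print(indirect_score("metal", "electricity"))
```

In this reading, the pre-trained scores would be fed into an existing QA model as extra features connecting question and answer concepts; the embeddings here are random, whereas the paper pre-trains them on knowledge-base triples.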
Work was done during an internship at Microsoft Research Asia.
Notes
- 1.
In this work, concepts are words and phrases that can be extracted from natural language text [20].
- 2.
The definitions of contexts in these tasks are slightly different and we will describe the details in the next section.
- 7.
During the SemEval evaluation, systems including TriAN report results based on model pre-training on the RACE dataset [8] and system ensembling. In this work, we report numbers on SemEval without pre-training on RACE or ensembling.
References
Annervaz, K., Chowdhury, S.B.R., Dukkipati, A.: Learning beyond datasets: knowledge graph augmented neural networks for natural language processing. arXiv preprint arXiv:1802.05930 (2018)
Boratko, M., et al.: A systematic classification of knowledge, reasoning, and context within the ARC dataset. arXiv preprint arXiv:1806.00358 (2018)
Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., Yakhnenko, O.: Translating embeddings for modeling multi-relational data. In: NIPS, pp. 2787–2795 (2013)
Clark, P., et al.: Think you have solved question answering? Try ARC, the AI2 reasoning challenge. arXiv preprint arXiv:1803.05457 (2018)
Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
Hirsch, E.D.: Reading comprehension requires knowledge—of words and the world. Am. Educator 27(1), 10–13 (2003)
Jia, R., Liang, P.: Adversarial examples for evaluating reading comprehension systems. arXiv preprint arXiv:1707.07328 (2017)
Lai, G., Xie, Q., Liu, H., Yang, Y., Hovy, E.: RACE: Large-scale reading comprehension dataset from examinations. arXiv preprint arXiv:1704.04683 (2017)
Levesque, H.J., Davis, E., Morgenstern, L.: The Winograd schema challenge. In: AAAI Spring Symposium: Logical Formalizations of Commonsense Reasoning, vol. 46, p. 47 (2011)
Liang, P.: Learning executable semantic parsers for natural language understanding. Commun. ACM 59(9), 68–76 (2016)
Mahajan, D., et al.: Exploring the limits of weakly supervised pretraining. arXiv preprint arXiv:1805.00932 (2018)
Mihaylov, T., Clark, P., Khot, T., Sabharwal, A.: Can a suit of armor conduct electricity? A new dataset for open book question answering. arXiv preprint arXiv:1809.02789 (2018)
Mihaylov, T., Frank, A.: Knowledgeable reader: enhancing cloze-style reading comprehension with external commonsense knowledge. arXiv preprint arXiv:1805.07858 (2018)
Miller, A.H., Fisch, A., Dodge, J., Karimi, A., Bordes, A., Weston, J.: Key-value memory networks for directly reading documents. arXiv preprint arXiv:1606.03126 (2016)
Mitchell, J., Lapata, M.: Composition in distributional models of semantics. Cogn. Sci. 34(8), 1388–1429 (2010)
Ostermann, S., Roth, M., Modi, A., Thater, S., Pinkal, M.: SemEval-2018 task 11: machine comprehension using commonsense knowledge. In: Proceedings of the 12th International Workshop on Semantic Evaluation, pp. 747–757 (2018)
Pennington, J., Socher, R., Manning, C.: GloVe: global vectors for word representation. In: EMNLP, pp. 1532–1543 (2014)
Rajpurkar, P., Zhang, J., Lopyrev, K., Liang, P.: SQuAD: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250 (2016)
Scarselli, F., Gori, M., Tsoi, A.C., Hagenbuchner, M., Monfardini, G.: The graph neural network model. IEEE Trans. Neural Networks 20(1), 61–80 (2009)
Speer, R., Havasi, C.: Representing general relational knowledge in ConceptNet 5. In: LREC, pp. 3679–3686 (2012)
Tandon, N., de Melo, G., Weikum, G.: WebChild 2.0: fine-grained commonsense knowledge distillation. In: Proceedings of ACL 2017, System Demonstrations, pp. 115–120 (2017)
Yang, B., Mitchell, T.: Leveraging knowledge bases in LSTMs for improving machine reading. In: ACL, pp. 1436–1446 (2017)
Acknowledgments
This work is supported by the National Key R&D Program of China (2018YFB1004404), Key R&D Program of Guangdong Province (2018B010107005), National Natural Science Foundation of China (U1711262, U1401256, U1501252, U1611264, U1711261, 61673403, U1611262).
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Zhong, W., Tang, D., Duan, N., Zhou, M., Wang, J., Yin, J. (2019). Improving Question Answering by Commonsense-Based Pre-training. In: Tang, J., Kan, M.Y., Zhao, D., Li, S., Zan, H. (eds) Natural Language Processing and Chinese Computing. NLPCC 2019. Lecture Notes in Computer Science, vol. 11838. Springer, Cham. https://doi.org/10.1007/978-3-030-32233-5_2
DOI: https://doi.org/10.1007/978-3-030-32233-5_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-32232-8
Online ISBN: 978-3-030-32233-5