Gated Convolutional Networks for Commonsense Machine Comprehension

  • Wuya Chen
  • Xiaojun Quan
  • Chengbo Chen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11301)

Abstract

In this paper, we study the problem of commonsense machine comprehension and propose a new model based on convolutional neural networks and Gated Tanh-ReLU Units. The new model, which serves as an alternative to existing recurrent models, consists of three layers: an input layer, a gated convolutional layer, and an output layer. The input layer produces representations from various features, such as part-of-speech and relation embeddings. The gated convolutional layer, the key component of our model, extracts n-gram features at different granularities and models the interactions between the different texts (questions, answers, and passages). The output layer uses bilinear interactions to capture the relations among the final representations and to produce the answers. We evaluate our model on the SemEval-2018 Machine Comprehension Using Commonsense Knowledge task. Experimental results show that our model is highly competitive with state-of-the-art models while being much faster. To our knowledge, this is the first time a non-recurrent approach has achieved performance competitive with strong recurrent models for commonsense machine comprehension.
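The core building block named in the abstract, the Gated Tanh-ReLU Unit, combines a tanh content path with a ReLU gate applied element-wise. As a rough illustration (not the authors' implementation; the linear maps stand in for the convolution filters, and all names and shapes here are hypothetical), it can be sketched as:

```python
import numpy as np

def gated_tanh_relu(X, W, V, b, c):
    """Gated Tanh-ReLU Unit (illustrative sketch).

    X: (n, d_in) feature windows; W, V: (d_in, d_out) weights
    standing in for convolution filters; b, c: (d_out,) biases.
    Returns tanh(XW + b) gated element-wise by relu(XV + c).
    """
    a = np.tanh(X @ W + b)           # content path, values in [-1, 1]
    g = np.maximum(0.0, X @ V + c)   # ReLU gate, values in [0, inf)
    return a * g                     # element-wise gating

# Toy example: 5 positions with 8-dim features -> 4-dim gated features.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
W, V = rng.normal(size=(8, 4)), rng.normal(size=(8, 4))
b, c = np.zeros(4), np.zeros(4)
H = gated_tanh_relu(X, W, V, b, c)
print(H.shape)  # (5, 4)
```

Where the ReLU gate outputs zero, the corresponding tanh features are suppressed entirely, which is the selection behavior gated units are used for here.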

Keywords

Convolutional neural networks · Gated mechanism · Reading comprehension · Commonsense knowledge

Acknowledgments

This work was supported by the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (No. 2017ZT07X355).

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. School of Data and Computer Science, Sun Yat-sen University, Guangzhou, China