Self-inhibition Residual Convolutional Networks for Chinese Sentence Classification

  • Mengting Xiong
  • Ruixuan Li
  • Yuhua Li
  • Qi Yang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11301)

Abstract

Convolutional networks have become a dominant approach to many Natural Language Processing (NLP) tasks. However, most such networks are shallow and simple, so they are unable to capture the hierarchical features of text. In addition, the text preprocessing these models apply to Chinese is quite coarse, which loses rich semantic information. In this paper, we explore deep convolutional networks for Chinese sentence classification and present a new model named the Self-Inhibition Residual Convolutional Network (SIRCNN). The model employs additional Chinese character information and replaces the standard convolutional block with a self-inhibiting residual convolutional block to improve the performance of deep networks. It is one of the few explorations of deep convolutional networks across a variety of text classification tasks. Experiments show that our model achieves state-of-the-art accuracy on three different datasets with a better convergence rate.
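The abstract describes a residual convolutional block augmented with a "self-inhibition" mechanism, but does not give the exact formula. The following is a minimal, hypothetical sketch of one plausible reading: a sigmoid gate, computed from the input, scales (inhibits) the convolutional output before the residual addition, in the spirit of residual connections and gated highway layers. The function name, the single-channel 1-D setting, and the gating form are all assumptions for illustration, not the paper's definitive architecture.

```python
import numpy as np

def self_inhibiting_residual_block(x, kernel, gate_kernel):
    """Sketch of a self-inhibiting residual conv block (1-D, single channel).

    x           : input sequence, shape (seq_len,)
    kernel      : weights of the feature convolution
    gate_kernel : weights of the inhibition (gating) convolution

    Returns x + gate * conv(x): when the gate saturates near 0, the
    convolutional branch is fully inhibited and the block reduces to
    the identity, which is what lets deep stacks of such blocks train.
    """
    # Feature branch: 'same'-padded convolution keeps the sequence length.
    h = np.convolve(x, kernel, mode="same")
    # Inhibition gate: sigmoid of a second convolution over the same input.
    g = 1.0 / (1.0 + np.exp(-np.convolve(x, gate_kernel, mode="same")))
    # Residual addition with the gated (possibly inhibited) feature branch.
    return x + g * h
```

With a strongly negative gate kernel the sigmoid saturates near zero and the block passes its input through unchanged; with a strongly positive one it behaves like a plain residual convolution. The actual SIRCNN block operates on multi-channel character embeddings, which this single-channel sketch omits for brevity.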

Keywords

Self-inhibiting residual · Text classification · Character embedding

Notes

Acknowledgement

This work is supported by the National Key Research and Development Program of China under grants 2016QY01W0202 and 2016YFB0800402, National Natural Science Foundation of China under grants 61572221, U1401258, 61433006 and 61502185, Major Projects of the National Social Science Foundation under grant 16ZDA092, Science and Technology Support Program of Hubei Province under grant 2015AAA013, and Science and Technology Program of Guangdong Province under grant 2014B010111007.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China