Attention-based BiGRU-CNN for Chinese question classification

  • Jin Liu
  • Yihe Yang
  • Shiqi Lv
  • Jin Wang
  • Hui Chen (corresponding author)
Original Research


Chinese question classification is an essential task in natural language processing (NLP) for the Chinese language because of the language's distinctive characteristics. Methods presented in the literature are usually based on rules or traditional machine learning, both of which require manually created rules or features, so classification accuracy is constrained by the inherent limitations of these methods. Because deep learning-based methods have been shown to mine deep information from text, this article proposes a novel deep neural network model, the Attention-Based BiGRU-CNN network (ABBC), and applies it to the Chinese question classification task. The model combines the characteristics and advantages of convolutional neural networks, the attention mechanism, and recurrent neural networks. It not only extracts the features of Chinese questions effectively but also learns the context information of words, which addresses the loss of position features in the Text-CNN model. In a comparison with four other classic models, the experimental results show that our model achieves the best performance on the Chinese question classification task.


Chinese question classification; Gated recurrent unit; Convolutional neural network; Attention-based BiGRU-CNN



This work is supported by the National Natural Science Foundation of China (61872231, 61772454, 61701297, 61811530332, 61811540410).



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  • Jin Liu (1)
  • Yihe Yang (1)
  • Shiqi Lv (1)
  • Jin Wang (2)
  • Hui Chen (3, corresponding author)
  1. College of Information Engineering, Shanghai Maritime University, Shanghai, China
  2. School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, China
  3. College of Education, Shanghai Normal University, Shanghai, China
