A Sequence Transformation Model for Chinese Named Entity Recognition

  • Qingyue Wang
  • Yanjing Song
  • Hao Liu
  • Yanan Cao
  • Yanbing Liu
  • Li Guo
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11061)


Chinese Named Entity Recognition (NER), one of the basic natural language processing tasks, remains a tough problem due to the polysemy and complexity of Chinese. In recent years, most previous work has treated NER as a sequence tagging task, using either statistical models or deep learning methods. In this paper, we instead formulate NER as a sequence transformation task in which unlabeled sequences (source texts) are converted into labeled sequences (NER labels). To model this task, we design a sequence-to-sequence neural network that combines a Conditional Random Fields (CRF) layer, which efficiently exploits sentence-level tag information, with an attention mechanism, which captures the most important semantic information in the encoded sequence. In experiments, we evaluate different models both on a standard corpus of news data and on an unnormalized one of short messages. Experimental results show that our model outperforms state-of-the-art methods at recognizing short, interdependent entities.


Keywords: Named Entity Recognition · Deep learning · Sequence-to-sequence neural network · Conditional Random Fields



This work was supported by the National Key Research and Development Program of China (No. 2016YFB0801300) and the National Natural Science Foundation of China grant (No. 61602466).



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Qingyue Wang (1, 3)
  • Yanjing Song (2)
  • Hao Liu (2)
  • Yanan Cao (3, corresponding author)
  • Yanbing Liu (3)
  • Li Guo (3)
  1. School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China
  2. Software Institute, Beijing Institute of Technology, Beijing, China
  3. Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China
