Multichannel LSTM-CRF for Named Entity Recognition in Chinese Social Media

  • Chuanhai Dong (Email author)
  • Huijia Wu
  • Jiajun Zhang
  • Chengqing Zong
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10565)


Named Entity Recognition (NER) is a difficult task in Chinese social media because a large portion of the text is informal writing. Existing research uses only limited in-domain annotated data and achieves low performance. In this paper, we use a domain adaptation method to exploit both the limited in-domain data and sufficient out-of-domain data. We propose a multichannel LSTM-CRF model that employs different channels to capture general patterns, in-domain patterns and out-of-domain patterns in Chinese social media. Extensive experiments show that our model yields a 9.8% improvement over previous state-of-the-art methods. We further find that a shared embedding layer is important and that randomly initialized embeddings work better than pretrained ones.
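The intuition behind separate general, in-domain and out-of-domain channels can be illustrated with the classic feature-augmentation scheme for domain adaptation (Daumé III, "Frustratingly easy domain adaptation"). This is not the multichannel LSTM-CRF itself, whose channels are network components rather than copied feature vectors; it is only a minimal sketch of the shared-plus-domain-specific idea. The function name `augment` and the domain labels "in" and "out" are hypothetical choices made for this illustration.

```python
from typing import List


def augment(features: List[float], domain: str) -> List[float]:
    """Feature augmentation: [general | in-domain | out-of-domain].

    The general copy is always active. Only the copy that matches the
    example's domain is kept; the other domain-specific copy is zeroed.
    The labels "in" and "out" are hypothetical names for this sketch.
    """
    if domain not in ("in", "out"):
        raise ValueError("domain must be 'in' or 'out'")
    zeros = [0.0] * len(features)
    in_copy = list(features) if domain == "in" else zeros
    out_copy = list(features) if domain == "out" else zeros
    return list(features) + in_copy + out_copy


# An in-domain example activates the general and in-domain blocks.
print(augment([1.0, 2.0], "in"))
# An out-of-domain example activates the general and out-of-domain blocks.
print(augment([1.0, 2.0], "out"))
```

A model trained on such inputs can place weight on the general block for patterns that hold in both domains and on a domain-specific block for patterns that hold in only one, which is the same division of labour the abstract attributes to its channels.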


Keywords: Multichannel, Named entity recognition, Chinese social media



The research work has been supported by the Natural Science Foundation of China under Grant No. 61403379 and No. 61402478.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Chuanhai Dong (1, 2), Email author
  • Huijia Wu (1, 2)
  • Jiajun Zhang (1, 2)
  • Chengqing Zong (1, 2, 3)
  1. CASIA, National Laboratory of Pattern Recognition, Beijing, China
  2. University of Chinese Academy of Sciences, Beijing, China
  3. CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai, China
