Abstract
Recently, Chinese word segmentation (CWS) methods using neural networks have made impressive progress. Most of them regard the CWS as a sequence labeling problem which construct models based on local features rather than considering global information of input sequence. In this paper, we cast the CWS as a sequence translation problem and propose a novel sequence-to-sequence CWS model with an attention-based encoder-decoder framework. The model captures the global information from the input and directly outputs the segmented sequence. It can also tackle other NLP tasks with CWS jointly in an end-to-end mode. Experiments on Weibo, PKU and MSRA benchmark datasets show that our approach has achieved competitive performances compared with state-of-the-art methods. Meanwhile, we successfully applied our proposed model to jointly learning CWS and Chinese spelling correction, which demonstrates its applicability of multi-task fusion.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
- 1.
Executable source code is available at https://github.com/SourcecodeSharing/CWSpostediting.
- 2.
All data and the program are available at https://github.com/FudanNLP/NLPCC-WordSeg-Weibo.
- 3.
- 4.
All data and the program are available at http://sighan.cs.uchicago.edu/bakeoff2005/.
- 5.
Implementations are available at https://github.com/lisa-groundhog/GroundHog.
- 6.
Available online at https://github.com/HIT-SCIR/ltp.
References
Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014)
Cai, D., Zhao, H.: Neural word segmentation learning for Chinese. arXiv preprint arXiv:1606.04300 (2016)
Che, W., Li, Z., Liu, T.: LTP: a Chinese language technology platform. In: Proceedings of the 23rd International Conference on Computational Linguistics: Demonstrations, pp. 13–16. Association for Computational Linguistics (2010)
Chen, X., Qiu, X., Zhu, C., Huang, X.: Gated recursive neural network for Chinese word segmentation. In: ACL, vol. 1, pp. 1744–1753 (2015)
Chen, X., Qiu, X., Zhu, C., Liu, P., Huang, X.: Long short-term memory neural networks for Chinese word segmentation. In: EMNLP, pp. 1197–1206 (2015)
Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., Bengio, Y.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078 (2014)
Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., Kuksa, P.: Natural language processing (almost) from scratch. J. Mach. Learn. Res. 12, 2493–2537 (2011)
Emerson, T.: The second international Chinese word segmentation bakeoff. In: Proceedings of the Fourth SIGHAN Workshop on Chinese Language Processing, vol. 133 (2005)
Gehring, J., Auli, M., Grangier, D., Yarats, D., Dauphin, Y.N.: Convolutional sequence to sequence learning. ArXiv e-prints, May 2017
Koehn, P., Hoang, H., Birch, A., Callison-Burch, C., Federico, M., Bertoldi, N., Cowan, B., Shen, W., Moran, C., Zens, R., et al.: Moses: open source toolkit for statistical machine translation. In: Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions, pp. 177–180. Association for Computational Linguistics (2007)
Lafferty, J., McCallum, A., Pereira, F., et al.: Conditional random fields: probabilistic models for segmenting and labeling sequence data. In: Proceedings of the Eighteenth International Conference on Machine Learning, ICML, vol. 1, pp. 282–289 (2001)
Lin, C.Y., Och, F.J.: Automatic evaluation of machine translation quality using longest common subsequence and skip-bigram statistics. In: Proceedings of the 42nd Annual Meeting on Association for Computational Linguistics, p. 605. Association for Computational Linguistics (2004)
Pei, W., Ge, T., Chang, B.: Max-margin tensor neural network for Chinese word segmentation. In: ACL, vol. 1, pp. 293–303 (2014)
Peng, F., Feng, F., McCallum, A.: Chinese segmentation and new word detection using conditional random fields. In: Proceedings of the 20th International Conference on Computational Linguistics. p. 562. Association for Computational Linguistics (2004)
Qiu, P., Qiu, X., Huang, X.: A new psychometric-inspired evaluation metric for Chinese word segmentation. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pp. 2185–2194 (2016)
Qiu, X., Qian, P., Shi, Z.: Overview of the NLPCC-ICCPOL 2016 shared task: Chinese word segmentation for micro-blog texts. In: Lin, C.-Y., Xue, N., Zhao, D., Huang, X., Feng, Y. (eds.) ICCPOL/NLPCC-2016. LNCS, vol. 10102, pp. 901–906. Springer, Cham (2016). doi:10.1007/978-3-319-50496-4_84
Srivastava, N., Hinton, G.E., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)
Sun, X., Li, W., Wang, H., Lu, Q.: Feature-frequency-adaptive on-line training for fast and accurate natural language processing. Comput. Linguist. 40(3), 563–586 (2014)
Sutskever, I., Vinyals, O., Le, Q.V.: Sequence to sequence learning with neural networks. In: Advances in Neural Information Processing Systems, pp. 3104–3112 (2014)
Wu, Y., Schuster, M., Chen, Z., Le, Q.V., Norouzi, M., Macherey, W., Krikun, M., Cao, Y., Gao, Q., Macherey, K., et al.: Google’s neural machine translation system: bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144 (2016)
Xue, N., Shen, L.: Chinese word segmentation as LMR tagging. In: Proceedings of the Second SIGHAN Workshop on Chinese Language Processing, vol. 17, pp. 176–179 (2003)
Yu, L.C., Lee, L.H., Tseng, Y.H., Chen, H.H., et al.: Overview of SIGHAN 2014 bake-off for Chinese spelling check. In: Proceedings of the 3rd CIPSSIGHAN Joint Conference on Chinese Language Processing (CLP 2014), pp. 126–132 (2014)
Zhang, L., Wang, H., Sun, X., Mansur, M.: Exploring representations from unlabeled data with co-training for Chinese word segmentation In: Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pp. 311–321 (2013)
Zhao, H., Huang, C.N., Li, M., Lu, B.L.: A unified character-based tagging framework for chinese word segmentation. ACM Trans. Asian Lang. Inf. Process. (TALIP) 9(2), 5 (2010)
Zheng, X., Chen, H., Xu, T.: Deep learning for Chinese word segmentation and pos tagging. In: EMNLP, pp. 647–657 (2013)
Ziemski, M., Junczys-Dowmunt, M., Pouliquen, B.: The united nations parallel corpus v1.0. In: Proceedings of the Tenth International Conference on Language Resources and Evaluation, LREC, pp. 23–28 (2016)
Acknowledgments
This work was supported by the National Basic Research Program (973) of China (No. 2013CB329303) and the National Natural Science Foundation of China (No. 61132009).
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2017 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Shi, X., Huang, H., Jian, P., Guo, Y., Wei, X., Tang, YK. (2017). Neural Chinese Word Segmentation as Sequence to Sequence Translation. In: Cheng, X., Ma, W., Liu, H., Shen, H., Feng, S., Xie, X. (eds) Social Media Processing. SMP 2017. Communications in Computer and Information Science, vol 774. Springer, Singapore. https://doi.org/10.1007/978-981-10-6805-8_8
Download citation
DOI: https://doi.org/10.1007/978-981-10-6805-8_8
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6804-1
Online ISBN: 978-981-10-6805-8
eBook Packages: Computer ScienceComputer Science (R0)