A Chunk-Based Multi-strategy Machine Translation Method
In this paper, a chunk-based multi-strategy machine translation method is proposed. First, an English-Chinese bilingual treebank is constructed. Then, in the translation stage, a chunk-based translation strategy that combines statistics and rules is applied. The input sentence is divided into a sequence of chunks, each of which may be further divided into hierarchical sub-chunks. For each chunk, the corresponding instance is searched for in the corpus. Translation is completed by recursive refinement from chunks down to words. A Conditional Random Fields model is used to divide sentences into chunks. An experimental English-Chinese translation system is deployed, and experimental results show that the system performs better than the Systran system.
Keywords: Machine translation, Chunks parsing, Grammar induction, Conditional random fields
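The recursive refinement from chunks to words described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the chunk tree is written by hand rather than produced by a Conditional Random Fields model, the names `chunk_text`, `translate`, `examples`, and `lexicon` are hypothetical, and the example base and word lexicon are toy data. Sub-chunk translations are simply concatenated in source order, so the reordering rules and statistical scoring that a real system would need are omitted.

```python
from typing import Dict, List, Tuple, Union

# A chunk is either a single word (str) or a (label, sub-chunks) pair.
Chunk = Union[str, Tuple[str, List["Chunk"]]]


def chunk_text(chunk: Chunk) -> str:
    """Return the source-language surface string covered by a chunk."""
    if isinstance(chunk, str):
        return chunk
    _, children = chunk
    return " ".join(chunk_text(child) for child in children)


def translate(chunk: Chunk, examples: Dict[str, str], lexicon: Dict[str, str]) -> str:
    """Translate a chunk, refining recursively from chunks down to words.

    1. If the whole chunk matches an instance in the example base, reuse it.
    2. Otherwise, if the chunk is a single word, fall back to the word lexicon
       (unknown words are passed through unchanged).
    3. Otherwise, translate each sub-chunk and concatenate the results
       (assumption: source order is kept, with no reordering rules).
    """
    source = chunk_text(chunk)
    if source in examples:
        return examples[source]
    if isinstance(chunk, str):
        return lexicon.get(chunk, chunk)
    _, children = chunk
    return "".join(translate(child, examples, lexicon) for child in children)


# Toy data (hypothetical): a hand-built chunk tree, example base, and lexicon.
sentence: Chunk = (
    "S",
    [
        ("NP", ["the", "system"]),
        ("VP", ["translates", ("NP", ["the", "sentence"])]),
    ],
)
examples = {"the system": "系统", "the sentence": "句子"}
lexicon = {"translates": "翻译"}

result = translate(sentence, examples, lexicon)
print(result)
```

In this sketch the full sentence and the verb phrase have no instance in the example base, so the lookup descends to their sub-chunks. The two noun phrases are matched as whole chunks, and only the verb is resolved at the word level.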
The authors are very grateful to the Special Projects for Reform and Development of the Beijing Institute of Science and Technology Information (2018) (Information rapid processing capacity building with applied artificial intelligence and big data technology) for its support and assistance.