Abstract
Grammatical error correction (GEC) is an important research topic in natural language processing. This work targets genitive and ergative case errors in formal Tibetan text. We collect 1,793,563 consecutive sentence pairs as a training set and use two test sets: 5,000 sentence pairs drawn from the same distribution as the training data, and 1,159 sentence pairs drawn from different distributions. Our approach first preprocesses the Tibetan text with compositional rules and then applies a neural network that combines BERT with a Bi-LSTM to estimate the probability that a given token is genitive or ergative. In experiments, the proposed model reaches accuracies of 98.38% and 86.16% on the two test sets, respectively.
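To make the task concrete, the following minimal sketch locates the positions in a sentence that a genitive/ergative classifier would have to score. It is not the authors' preprocessing or model: the abstract does not spell out the compositional rules, so the function names (`split_syllables`, `case_candidates`) and the syllable-level lookup are illustrative assumptions. The particle inventory covers only the free-standing genitive forms (gi, kyi, gyi, yi) and their ergative counterparts (gis, kyis, gyis, yis); forms fused onto the preceding syllable are left out.

```python
# Hypothetical sketch: flag free-standing case particles in Tibetan text.
# This is an assumption for illustration, not the paper's rule set.

TSHEG = "\u0f0b"  # syllable delimiter
SHAD = "\u0f0d"   # clause/sentence delimiter

GENITIVE = {
    "\u0f42\u0f72",        # gi
    "\u0f40\u0fb1\u0f72",  # kyi
    "\u0f42\u0fb1\u0f72",  # gyi
    "\u0f61\u0f72",        # yi
}
# Each ergative form is the genitive form plus a final sa: gis, kyis, gyis, yis.
ERGATIVE = {p + "\u0f66" for p in GENITIVE}


def split_syllables(text):
    """Split on tsheg, treating shad as a delimiter as well."""
    parts = text.replace(SHAD, TSHEG).split(TSHEG)
    return [s.strip() for s in parts if s.strip()]


def case_candidates(text):
    """Return (syllable index, syllable, label) for each free-standing particle."""
    found = []
    for i, syl in enumerate(split_syllables(text)):
        if syl in GENITIVE:
            found.append((i, syl, "genitive"))
        elif syl in ERGATIVE:
            found.append((i, syl, "ergative"))
    return found


# bod kyi skad yig: "the Tibetan language" (literally "language of Tibet")
phrase = (
    "\u0f56\u0f7c\u0f51\u0f0b"  # bod
    "\u0f40\u0fb1\u0f72\u0f0b"  # kyi
    "\u0f66\u0f90\u0f51\u0f0b"  # skad
    "\u0f61\u0f72\u0f42\u0f0d"  # yig + shad
)
print(case_candidates(phrase))
```

In a pipeline like the one the abstract describes, each flagged position would be passed to the neural model, which decides whether the genitive or the ergative form is the correct one in context; a lookup of this kind can only find candidates, not judge them.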
Acknowledgement
This work was supported by the National Key R&D Program of China (2017YFB1402200), the National Natural Science Foundation of China (61063033, 61662061), and the National Social Science Fund of China (14BYY132).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Jiacuo, C., Jia, S., Duanzhu, S., Jia, C. (2020). Tibetan Case Grammar Error Correction Method Based on Neural Networks. In: Hong, JF., Zhang, Y., Liu, P. (eds) Chinese Lexical Semantics. CLSW 2019. Lecture Notes in Computer Science, vol 11831. Springer, Cham. https://doi.org/10.1007/978-3-030-38189-9_43
DOI: https://doi.org/10.1007/978-3-030-38189-9_43
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-38188-2
Online ISBN: 978-3-030-38189-9
eBook Packages: Computer Science, Computer Science (R0)