Tibetan Case Grammar Error Correction Method Based on Neural Networks

  • Conference paper
Chinese Lexical Semantics (CLSW 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11831)

Abstract

Grammar Error Correction (GEC) is an important research topic in Natural Language Processing. In this work, we address genitive and ergative case errors in formal Tibetan text. We collect 1,793,563 consecutive sentence pairs as the training set and use two test sets: 5,000 sentence pairs drawn from the same distribution and 1,159 sentence pairs drawn from different distributions. Our approach first preprocesses the Tibetan text with compositional rules and then builds a neural network that combines BERT and Bi-LSTM to estimate the probability that a given token is genitive or ergative. In experiments, the proposed model achieves accuracies of 98.38% and 86.16% on the two test sets, respectively.
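The abstract frames the task as deciding whether a case-particle token should be genitive or ergative. As a rough illustration only, the Python sketch below shows one way such classification instances might be derived from a sentence and how accuracy could be scored. It is not the authors' preprocessing or model: the function names `make_examples` and `accuracy`, the `[MASK]` placeholder, and the particle inventory are assumptions made for this example. The inventory covers only free-standing particles (gi, kyi, gyi, yi and their ergative counterparts ending in -s) and ignores the fused suffix forms, which would need syllable-internal analysis.

```python
TSHEG = "\u0f0b"  # Tibetan syllable delimiter
SHAD = "\u0f0d"   # Tibetan clause/sentence terminator
MASK = "[MASK]"   # hypothetical placeholder for the hidden particle

# Free-standing genitive particles (Wylie transliteration in comments).
GENITIVE = {
    "\u0f42\u0f72",        # gi
    "\u0f40\u0fb1\u0f72",  # kyi
    "\u0f42\u0fb1\u0f72",  # gyi
    "\u0f61\u0f72",        # yi
}
# The matching ergative forms add a final sa: gis, kyis, gyis, yis.
ERGATIVE = {particle + "\u0f66" for particle in GENITIVE}


def make_examples(sentence):
    """Return one (masked_syllables, position, label) triple per case particle."""
    cleaned = sentence.strip().rstrip(SHAD)
    syllables = [s for s in cleaned.split(TSHEG) if s]
    examples = []
    for i, syllable in enumerate(syllables):
        if syllable in GENITIVE:
            label = "genitive"
        elif syllable in ERGATIVE:
            label = "ergative"
        else:
            continue
        # Hide the particle so a classifier must infer it from context.
        masked = syllables[:i] + [MASK] + syllables[i + 1:]
        examples.append((masked, i, label))
    return examples


def accuracy(predicted, gold):
    """Fraction of positions where the predicted label equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)
```

In a setup like the one the abstract describes, the masked syllable sequence would be passed to a contextual encoder and classifier, and a predicted label that disagrees with the particle actually written would flag a candidate error. That downstream step is omitted here because the page gives no architectural details beyond "BERT and Bi-LSTM".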

Acknowledgement

This work was supported by the National Key R&D Program of China (2017YFB1402200), the National Natural Science Foundation of China (61063033, 61662061), and the National Social Science Fund of China (14BYY132).

Author information

Corresponding author

Correspondence to Cairang Jia.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Jiacuo, C., Jia, S., Duanzhu, S., Jia, C. (2020). Tibetan Case Grammar Error Correction Method Based on Neural Networks. In: Hong, JF., Zhang, Y., Liu, P. (eds) Chinese Lexical Semantics. CLSW 2019. Lecture Notes in Computer Science, vol 11831. Springer, Cham. https://doi.org/10.1007/978-3-030-38189-9_43

  • DOI: https://doi.org/10.1007/978-3-030-38189-9_43

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-38188-2

  • Online ISBN: 978-3-030-38189-9

  • eBook Packages: Computer Science, Computer Science (R0)
