Bidirectional LSTM Tagger for Latvian Grammatical Error Detection

Deksne, Daiga

doi:10.1007/978-3-030-27947-9_5

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11697)

Included in the following conference series:

International Conference on Text, Speech, and Dialogue

Abstract

This paper reports on the development of a grammar error labeling system for the Latvian language. We label six error types that, according to a survey of native Latvian speakers, are crucial for understanding a text: an incorrect use of a preposition, an incorrect agreement in a phrase, an incorrect verb form, an incorrect noun form, an incorrect choice of the definite/indefinite ending of an adjective, and a missing comma. Training a neural network model requires a large amount of error-annotated data. To cope with the lack of manually annotated data, we generate artificial errors in correct text. We use a bidirectional Long Short-Term Memory neural network, because several authors consider this architecture the best for detecting erroneous words. We train several models: models that label a single error type and models that label all six error types. For all types of errors, the precision reaches 94.61% and the recall 94.08%.
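The idea of generating artificial errors in correct text can be illustrated for one of the six error types, the missing comma. The following Python sketch is only an illustration under stated assumptions and is not the procedure used in the paper: the function names, the label strings `OK` and `MISSING_COMMA`, the choice to attach the label to the token preceding the removed comma, and the token-level definition of precision and recall are all hypothetical.

```python
import random


def inject_missing_commas(tokens, drop_prob=0.5, rng=None):
    """Remove comma tokens at random from a correct sentence.

    Returns the corrupted token list and one label per remaining token.
    Assumption: the token that preceded a removed comma is labeled
    MISSING_COMMA; every other token is labeled OK.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is repeatable
    out_tokens, labels = [], []
    for tok in tokens:
        # Only drop a comma that has a preceding token to carry the label.
        if tok == "," and out_tokens and rng.random() < drop_prob:
            labels[-1] = "MISSING_COMMA"
            continue
        out_tokens.append(tok)
        labels.append("OK")
    return out_tokens, labels


def precision_recall(gold, pred, ok="OK"):
    """Token-level precision and recall over all non-OK labels."""
    true_pos = sum(1 for g, p in zip(gold, pred) if p != ok and p == g)
    predicted = sum(1 for p in pred if p != ok)
    actual = sum(1 for g in gold if g != ok)
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


if __name__ == "__main__":
    # "Es zinu, ka viņš nāks." (I know that he will come.), pre-tokenized.
    sentence = "Es zinu , ka viņš nāks .".split()
    corrupted, gold = inject_missing_commas(sentence, drop_prob=1.0)
    print(list(zip(corrupted, gold)))

    # A hypothetical tagger output with one correct and one spurious label.
    predicted = ["OK", "MISSING_COMMA", "OK", "MISSING_COMMA", "OK", "OK"]
    print(precision_recall(gold, predicted))
```

Pairs of corrupted tokens and labels produced this way could serve as training data for a sequence tagger, and the second function shows one plausible way to compute figures comparable to the reported precision and recall. The other five error types would need language-specific corruption rules that this sketch does not attempt.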

Acknowledgment

The research has been supported by the European Regional Development Fund within the project “Neural Network Modelling for Inflected Natural Languages” No. 1.1.1.1/16/A/215.

Author information

Authors and Affiliations

Tilde, Vienibas gatve 75a, Riga, Latvia
Daiga Deksne

Corresponding author

Correspondence to Daiga Deksne.

Editor information

Editors and Affiliations

University of West Bohemia, Pilsen, Czech Republic
Kamil Ekštein

About this paper

Cite this paper

Deksne, D. (2019). Bidirectional LSTM Tagger for Latvian Grammatical Error Detection. In: Ekštein, K. (eds) Text, Speech, and Dialogue. TSD 2019. Lecture Notes in Computer Science, vol 11697. Springer, Cham. https://doi.org/10.1007/978-3-030-27947-9_5

DOI: https://doi.org/10.1007/978-3-030-27947-9_5
Published: 06 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-27946-2
Online ISBN: 978-3-030-27947-9
eBook Packages: Computer Science, Computer Science (R0)