Abstract
Punctuation detection and correction is among the hardest automatic grammar checking tasks for the Czech language. The paper compares available grammar and punctuation correction programs on several data sets. It also describes a set of improvements to one of the available tools, which lead to significantly better recall as well as precision.
Notes
- 1.
- 2. SET is an abbreviation of “syntactic engineering tool”.
- 3.
- 4.
References
Behún, D.: Kontrola české gramatiky pro MS Office - konec korektorů v Čechách? (2005). https://interval.cz/clanky/kontrola-ceske-gramatiky-pro-ms-office-konec-korektoru-v-cechach
Boháč, M., Blavka, K., Kuchařová, M., Škodová, S.: Post-processing of the recognized speech for web presentation of large audio archive. In: 2012 35th International Conference on Telecommunications and Signal Processing (TSP), pp. 441–445 (2012)
Holan, T., Kuboň, V., Plátek, M.: A prototype of a grammar checker for Czech. In: Proceedings of the 5th Conference on Applied Natural Language Processing, pp. 147–154. Association for Computational Linguistics (1997)
Horák, A.: Computer Processing of Czech Syntax and Semantics. Librix.eu, Brno (2008)
Jakubíček, M., Horák, A.: Punctuation detection with full syntactic parsing. Res. Comput. Sci. Spec. issue: Nat. Lang. Process. Appl. 46, 335–343 (2010)
Kovář, V.: Partial grammar checking for Czech using the set parser. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds.) TSD 2014. LNCS, vol. 8655, pp. 308–314. Springer, Heidelberg (2014)
Kovář, V., Horák, A., Jakubíček, M.: Syntactic analysis using finite patterns: a new parsing system for Czech. In: Vetulani, Z. (ed.) LTC 2009. LNCS, vol. 6562, pp. 161–171. Springer, Heidelberg (2011)
Lingea s.r.o.: Grammaticon (2003). www.lingea.cz/grammaticon.htm
Oliva, K., Petkevič, V., Microsoft s.r.o.: Czech Grammar Checker (2005). http://office.microsoft.com/word
Pala, K.: Pište dopisy konečně bez chyb – Český gramatický korektor pro Microsoft Office. Computer, 13–14 (2005)
Petkevič, V.: Kontrola české gramatiky (český grammar checker). Studie z aplikované lingvistiky-Stud. Appl. Linguist. 5(2), 48–66 (2014)
Sedláček, R., Smrž, P.: A new Czech morphological analyser ajka. In: Matoušek, V., Mautner, P., Mouček, R., Tauser, K. (eds.) TSD 2001. LNCS (LNAI), vol. 2166, pp. 100–107. Springer, Heidelberg (2001)
Suchomel, V., Michelfeit, J., Pomikálek, J.: Text tokenisation using unitok. In: Eighth Workshop on Recent Advances in Slavonic Natural Language Processing, pp. 71–75. Tribun EU, Brno (2014)
Šmerk, P.: Unsupervised learning of rules for morphological disambiguation. In: Sojka, P., Kopeček, I., Pala, K. (eds.) TSD 2004. LNCS (LNAI), vol. 3206, pp. 211–216. Springer, Heidelberg (2004)
Acknowledgments
This work has been partly supported by the Grant Agency of CR within the project 15-13277S. The research leading to these results has received funding from the Norwegian Financial Mechanism 2009–2014 and the Ministry of Education, Youth and Sports under Project Contract no. MSMT-28477/2014 within the HaBiT Project 7F14047. This work was also partly supported by Student Grant Scheme 2016 of Technical University of Liberec.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Kovář, V., Machura, J., Zemková, K., Rott, M. (2016). Evaluation and Improvements in Punctuation Detection for Czech. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech, and Dialogue. TSD 2016. Lecture Notes in Computer Science, vol 9924. Springer, Cham. https://doi.org/10.1007/978-3-319-45510-5_33
DOI: https://doi.org/10.1007/978-3-319-45510-5_33
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45509-9
Online ISBN: 978-3-319-45510-5
eBook Packages: Computer Science, Computer Science (R0)