Abstract
The morphosyntactic disambiguation of verbs is a crucial pre-processing step for the syntactic analysis of morphologically rich languages such as German and of domains with complex clause structures such as law texts. This paper explores how much linguistically motivated rules can contribute to the task. It introduces an incremental system of verbal morphosyntactic disambiguation that exploits the concept of topological fields. The system reduces the rate of POS-tagging errors from 10.2% to 1.6%. The evaluation shows that most of this reduction is gained by checking the compatibility of morphosyntactic features within the long-distance syntactic relationships of discontinuous verbal elements. Furthermore, the study shows that in law texts the average distance between the left and right bracket of clauses is relatively large (9.5 tokens), so that a wide context window is necessary for the morphosyntactic disambiguation of verbs in this domain.
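The compatibility check between discontinuous verbal elements can be illustrated with a minimal Python sketch. It is not the system described in the paper: the compatibility table, the function names, and the definition of bracket distance as a difference of token indices are all assumptions made for illustration. The tags follow the STTS tagset (VVFIN, VVINF, VVPP).

```python
# Hypothetical table (not taken from the paper): which STTS tags the verb in
# the right bracket may carry, given the class of the finite verb in the
# left bracket.
COMPATIBLE_RIGHT_TAGS = {
    "haben": {"VVPP"},            # perfect: "hat ... erhalten"
    "sein": {"VVPP"},             # perfect: "ist ... gekommen"
    "werden": {"VVPP", "VVINF"},  # passive or future
    "modal": {"VVINF"},           # "muss ... erhalten"
}


def disambiguate_right_bracket(left_class, candidate_tags):
    """Keep only the candidate tags compatible with the left bracket.

    If the left-bracket class is unknown, or filtering would remove every
    candidate, the original candidates are returned unchanged.
    """
    candidates = set(candidate_tags)
    allowed = COMPATIBLE_RIGHT_TAGS.get(left_class)
    if allowed is None:
        return candidates
    filtered = candidates & allowed
    return filtered if filtered else candidates


def mean_bracket_distance(bracket_pairs):
    """Average token distance over (left_index, right_index) pairs.

    Distance is assumed here to be the difference of token indices;
    the list of pairs must not be empty.
    """
    distances = [right - left for left, right in bracket_pairs]
    return sum(distances) / len(distances)


# "erhalten" is ambiguous between finite verb, infinitive and past participle.
ambiguous = {"VVFIN", "VVINF", "VVPP"}
print(disambiguate_right_bracket("haben", ambiguous))
print(disambiguate_right_bracket("modal", ambiguous))
print(mean_bracket_distance([(1, 5), (2, 12)]))
```

In this sketch the form "erhalten" is resolved to a past participle after "haben" and to an infinitive after a modal verb, while it stays ambiguous after "werden". The wider the distance between the two brackets, the larger the context window such a check has to span.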
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sugisaki, K., Höfler, S. (2013). Verbal Morphosyntactic Disambiguation through Topological Field Recognition in German-Language Law Texts. In: Mahlow, C., Piotrowski, M. (eds) Systems and Frameworks for Computational Morphology. SFCM 2013. Communications in Computer and Information Science, vol 380. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40486-3_8
DOI: https://doi.org/10.1007/978-3-642-40486-3_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40485-6
Online ISBN: 978-3-642-40486-3
eBook Packages: Computer Science (R0)