Verbal Morphosyntactic Disambiguation through Topological Field Recognition in German-Language Law Texts

Sugisaki, Kyoko; Höfler, Stefan

doi:10.1007/978-3-642-40486-3_8

Part of the book series: Communications in Computer and Information Science (CCIS, volume 380)

Included in the following conference series:

International Workshop on Systems and Frameworks for Computational Morphology

Abstract

The morphosyntactic disambiguation of verbs is a crucial pre-processing step for the syntactic analysis of morphologically rich languages like German and domains with complex clause structures like law texts. This paper explores how much linguistically motivated rules can contribute to the task. It introduces an incremental system of verbal morphosyntactic disambiguation that exploits the concept of topological fields. The system presented is capable of reducing the rate of POS-tagging mistakes from 10.2% to 1.6%. The evaluation shows that this reduction is mostly gained through checking the compatibility of morphosyntactic features within the long-distance syntactic relationships of discontinuous verbal elements. Furthermore, the present study shows that in law texts, the average distance between the left and right bracket of clauses is relatively large (9.5 tokens), and that in this domain, a wide context window is therefore necessary for the morphosyntactic disambiguation of verbs.
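
The compatibility check between discontinuous verbal elements can be illustrated with a minimal sketch. It is not the authors' system: the licensing table, the function names, and the example sentence are assumptions made for illustration, and only the STTS tag names and the two error rates come from the literature and the abstract. The sketch assumes the left and right bracket positions are already known, counts distance as the number of intervening tokens (the abstract does not say how the paper counts it), and also works out the relative error reduction implied by the reported rates.

```python
# Hypothetical licensing table using STTS tag names. It lists which
# right-bracket readings a finite verb in the left bracket allows.
# Illustrative only; this is not the rule set of the paper.
LICENSED_RIGHT_BRACKET = {
    "VMFIN": {"VVINF", "VAINF", "VMINF"},         # modal + infinitive
    "VAFIN": {"VVPP", "VAPP", "VMPP", "VVINF"},   # auxiliary + participle, or werden + infinitive
}


def disambiguate_right_bracket(tokens, left, right):
    """Keep only the right-bracket readings licensed by the left-bracket verb.

    tokens is a list of (word form, set of candidate tags) pairs.
    If no reading survives, the original candidates are returned unchanged.
    """
    allowed = set()
    for tag in tokens[left][1]:
        allowed |= LICENSED_RIGHT_BRACKET.get(tag, set())
    kept = tokens[right][1] & allowed
    return kept if kept else set(tokens[right][1])


def bracket_distance(left, right):
    """Number of tokens strictly between the two brackets (assumed definition)."""
    return right - left - 1


def relative_error_reduction(before, after):
    """Share of the original errors that is removed."""
    return (before - after) / before


# "Der Bundesrat kann die Einzelheiten in einer Verordnung regeln."
# The form "regeln" is ambiguous between a finite verb and an infinitive.
sentence = [
    ("Der", {"ART"}), ("Bundesrat", {"NN"}), ("kann", {"VMFIN"}),
    ("die", {"ART"}), ("Einzelheiten", {"NN"}), ("in", {"APPR"}),
    ("einer", {"ART"}), ("Verordnung", {"NN"}),
    ("regeln", {"VVFIN", "VVINF"}), (".", {"$."}),
]

readings = disambiguate_right_bracket(sentence, left=2, right=8)
print(readings)                                   # {'VVINF'}
print(bracket_distance(2, 8))                     # 5
print(round(relative_error_reduction(10.2, 1.6), 3))  # 0.843
```

In this toy sentence the modal in the left bracket rules out the finite reading of the clause-final verb, even though five tokens separate the two. By the same arithmetic, a drop from 10.2% to 1.6% removes roughly 84% of the original tagging errors.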

Author information

Authors and Affiliations

Institute of Computational Linguistics, University of Zurich, Binzmühlestrasse 14, 8050, Zürich, Switzerland
Kyoko Sugisaki & Stefan Höfler

Editor information

Editors and Affiliations

University of Konstanz, 78457, Konstanz, Germany
Cerstin Mahlow
University of Zurich, Binzmühlestr. 14, Zurich, Switzerland
Michael Piotrowski

About this paper

Cite this paper

Sugisaki, K., Höfler, S. (2013). Verbal Morphosyntactic Disambiguation through Topological Field Recognition in German-Language Law Texts. In: Mahlow, C., Piotrowski, M. (eds) Systems and Frameworks for Computational Morphology. SFCM 2013. Communications in Computer and Information Science, vol 380. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40486-3_8

DOI: https://doi.org/10.1007/978-3-642-40486-3_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40485-6
Online ISBN: 978-3-642-40486-3
eBook Packages: Computer Science, Computer Science (R0)
