German Treebanks: TIGER and TüBa-D/Z

Handbook of Linguistic Annotation

Abstract

German is a language that is closely related to English but has a richer morphology and freer word order than English. Additionally, German has four existing major treebanks, which differ considerably in their syntactic annotation schemes. All treebanks use a combination of constituent structure and grammatical functions, but the decisions with regard to other phenomena differ significantly, for example in the treatment of discontinuous structures. This makes German a good choice for a comparative analysis of treebanks. This chapter presents two major treebanks of German, TIGER and TüBa-D/Z. We describe the projects in which the two treebanks were annotated, discuss the respective annotation schemes, the processes used for annotation, and the data formats. We also discuss the usage of both treebanks, as well as other German treebanks, and we present a comparison of the two annotation schemes along with their advantages and disadvantages.

We would like to thank Heike Zinsmeister for insightful comments and for providing us with references, and we would like to thank the two anonymous reviewers for valuable comments.

Notes

  1. Project websites are available at http://www.ims.uni-stuttgart.de/forschung/projekte/tiger.html (TIGER) and http://www.sfs.uni-tuebingen.de/en/ascl/resources/corpora/tueba-dz.html (TüBa-D/Z). All URLs provided in this paper have been accessed Nov 28, 2016.

  2. Secondary edges were already proposed in the context of the NEGRA project [80] but had not been used in the actual annotation of the NEGRA corpus.

  3. This period was chosen because it covers a globally relevant event: the assassination of Israeli Prime Minister Yitzhak Rabin. The idea was to keep open the option of building a multilingual corpus, because it would be rather easy to find news about this event in many different languages. A drawback is that there is some overlap in content among the articles of the two weeks.

     The NEGRA corpus also consists of texts from ‘Frankfurter Rundschau’, from 1991 and 1992. As far as we know, there is no overlap in texts between the NEGRA and TIGER corpora.

  4. TüBa-D/Z is short for ‘Tübinger Baumbank des Deutschen/Zeitungssprache’ (Tübingen Treebank of German/Newspaper), i.e., the Z denotes newspaper texts while the S in TüBa-D/S denotes spontaneous speech.

  5. For more information on these projects, see http://www.sfs.uni-tuebingen.de/en/ascl/resources/corpora/tueba-dz.html.

  6. Note that we only have a parenthetical construction if the matrix clause is embedded into the direct speech. If the parenthetical were annotated as the head of the direct speech, this would result in a crossing branch, which is not an option in the TüBa-D/Z annotation scheme.

  7. Besides NEGRA, TIGER, TüBa-D/Z, and the Verbmobil treebanks, Annotate was also used, e.g., for the Potsdam Commentary Corpus [84], the Mercurius Treebank [18], the Deutsche Diachrone Baumbank [39], and SMULTRON [93]. The tool is no longer maintained.

  8. TigerMorph was developed by Berthold Crysmann and was only used in the TIGER project. It is not available.

  9. The transfer system of the XEROX Translation Environment (XTE) by Martin Kay, which was part of the XLE development platform.

  10. The grammar was later improved and extended, and, as of 2006, had a coverage of 86% in terms of full parses, and dependency-based F-scores of 84% [24, 71].

  11. Flickinger et al. (chapter “Sustainable Development and Refinement of Complex Linguistic Annotations at Scale”) discuss the use of discriminants in grammar-based treebanking. Discriminants encode the features distinguishing competing analyses and can support annotators in disambiguating complex structures. Such an approach was later adapted to LFG in the INESS project, which developed the LFG Parsebanker. This tool has been applied in creating the Norwegian LFG treebank [56, 73].

  12. For discussions of these and similar formats, see also Ide et al. (chapter “Designing Annotation Schemes: From Model to Representation”).

  13. This description refers to the NEGRA export format 4. There is a previous version, export format 3, which lacks the lemma column, but is otherwise identical.

  14. SynAF is a standard developed by the International Organization for Standardisation in ISO/TC37/SC4 (Language Resources Management); http://www.tc37sc4.org/, see Ide et al. (chapter “Community Standards for Linguistically-Annotated Resources”).

  15. The script was part of the NEGRA corpus deliverable. The script could not deal correctly with some kinds of crossing branches and was not maintained after the end of NEGRA.

  16. http://www.ims.uni-stuttgart.de/forschung/ressourcen/werkzeuge/Tiger2Dep.en.html.

  17. To enhance readability, we provide indentation in the example presented in Fig. 20.

  18. The license can be signed here: http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/TIGERCorpus/license/index.html.

  19. The license is available from http://www.sfs.uni-tuebingen.de/en/ascl/resources/corpora/tueba-dz.html.

  20. http://weblicht.sfs.uni-tuebingen.de/weblichtwiki/index.php/Main_Page.

  21. http://weblicht.sfs.uni-tuebingen.de/weblichtwiki/index.php/Tundra.

  22. http://www.ims.uni-stuttgart.de/forschung/ressourcen/werkzeuge/icarus.html.

  23. http://www.deutschestextarchiv.de/.

  24. There is work in progress for the Copenhagen Dependency Treebank, but the annotations have not been released yet (http://code.google.com/p/copenhagen-dependency-treebank/wiki/CDT). After the time of writing, the Hamburg Dependency Treebank was announced in 2014; it consists of approx. 200,000 manually annotated sentences plus 55,000 automatically parsed sentences, see https://corpora.uni-hamburg.de/drupal/de/islandora/object/treebank:hdt.

References

  1. Albert, S., Anderssen, J., Bader, R., Becker, S., Bracht, T., Brants, S., Brants, T., Demberg, V., Dipper, S., Eisenberg, P., Hansen, S., Hirschmann, H., Janitzek, J., Kirstein, C., Langner, R., Michelbacher, L., Plaehn, O., Preis, C., Pußel, M., Rower, M., Schrader, B., Schwartz, A., Smith, G., Uszkoreit, H.: TIGER Annotationsschema. Technical report, Universität des Saarlandes, Universität Stuttgart and Universität Potsdam (2003). http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/TIGERCorpus/annotation/tiger_scheme-syntax.pdf

  2. Bosch, S., Choi, K.-S., de la Clergerie, É., Fang, A.C., Faaß, G., Lee, K., Pareja-Lora, A., Romary, L., Witt, A., Zeldes, A., Zipser, F.: <tiger2/> as a standardised serialisation for ISO 24615 – SynAF. In: Proceedings of the Eleventh International Workshop on Treebanks and Linguistic Theories (TLT), Lisbon, Portugal, pp. 37–60 (2012)

  3. Brants, S., Hansen, S.: Developments in the TIGER annotation scheme and their realization in the corpus. In: Proceedings of the Third Conference on Language Resources and Evaluation LREC-02, Las Palmas de Gran Canaria, pp. 1643–1649 (2002)

  4. Brants, S., Dipper, S., Eisenberg, P., Hansen-Schirra, S., König, E., Lezius, W., Rohrer, C., Smith, G., Uszkoreit, H.: TIGER: linguistic interpretation of a German corpus. Res. Lang. Comput., Special Issue 2(4), 597–620 (2004)

  5. Brants, T.: The NeGra Export Format for Annotated Corpora. Universität des Saarlandes, Computational Linguistics, Saarbrücken, Germany (1997). CLAUS Report No. 98, http://www.coli.uni-saarland.de/~thorsten/publications/Brants-CLAUS98.pdf

  6. Brants, T.: Cascaded Markov models. In: Proceedings of EACL-99, Bergen, Norway, pp. 118–125 (1999)

  7. Brants, T.: Inter-annotator agreement for a German newspaper corpus. In: Proceedings of Second International Conference on Language Resources and Evaluation LREC-2000, Athens, Greece (2000)

  8. Brants, T.: TnT – a statistical part-of-speech tagger. In: Proceedings of the Sixth Conference on Applied Natural Language Processing ANLP-2000, Seattle, Washington, pp. 224–231 (2000)

  9. Brants, T., Skut, W.: Automation of treebank annotation. In: Proceedings of the Joint Conference on New Methods in Natural Language Processing and Computational Language Learning. NeMLaP3/CoNLL98, Sydney, Australia, pp. 49–57 (1998)

  10. Brants, T., Skut, W., Uszkoreit, H.: Syntactic annotation of a German newspaper corpus. In: Proceedings of the ATALA Treebank Workshop, Paris, France, pp. 69–76 (1999)

  11. Brants, T., Skut, W., Uszkoreit, H.: Syntactic annotation of a German newspaper corpus. In: Abeillé, A. (ed.) Treebanks: Building and Using Parsed Corpora. Text, Speech and Language Technology, vol. 20, pp. 73–87. Springer, The Netherlands (2003)

  12. Bresnan, J.: The Mental Representation of Grammatical Relations. MIT Press, Cambridge (1982)

  13. Buchholz, S., Marsi, E.: CoNLL-X shared task on multilingual dependency parsing. In: Proceedings of the Tenth Conference on Computational Language Learning (CoNLL), New York, NY, pp. 149–164 (2006)

  14. Butt, M., Dyvik, H., King, T.H., Masuichi, H., Rohrer, C.: The parallel grammar project. In: Proceedings of COLING-2002 Workshop on Grammar Engineering and Evaluation, Taipei, Taiwan, vol. 15, pp. 1–7 (2002)

  15. Corazza, A., Lavelli, A., Satta, G.: An information-theoretic measure to evaluate parsing difficulty across treebanks. ACM Trans. Speech Lang. Process. 9(4) (2013)

  16. Crouch, D., Dalrymple, M., Kaplan, R.M., King, T.H., Maxwell III, J.T., Newman, P.: XLE documentation. Technical report, Palo Alto Research Center

  17. Crysmann, B., Hansen-Schirra, S., Smith, G., Ziegler-Eisele, D.: TIGER Morphologie-Annotationsschema. Technical report, Universität des Saarlandes, Universität Stuttgart and Universität Potsdam (2005). http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/TIGERCorpus/annotation/tiger_scheme-morph.pdf

  18. Demske, U.: Das Mercurius-Projekt: eine Baumbank für das Frühneuhochdeutsche. In: Zifonun, G., Kallmeyer, W. (eds.) Sprachkorpora - Datenmengen und Erkenntnisfortschritt, Jahrbuch des Instituts für deutsche Sprache 2006, pp. 91–104. de Gruyter, Berlin (2007)

  19. Dipper, S.: Grammar-based corpus annotation. In: Proceedings of the COLING Workshop on Linguistically Interpreted Corpora (LINC-2000), Luxembourg, pp. 56–64 (2000)

  20. Dipper, S.: Implementing and Documenting Large-Scale Grammars – German LFG. Ph.D. thesis, IMS, University of Stuttgart (2003). Working papers of the Institut für Maschinelle Sprachverarbeitung (AIMS), vol. 9(1)

  21. Dipper, S.: Querying topological fields in the TIGER scheme with TIGERSearch. In: Proceedings of the 13th International Workshop on Treebanks and Linguistic Theories (TLT13), Tübingen, Germany, pp. 37–50 (2014)

  22. Drach, E.: Grundgedanken der Deutschen Satzlehre. Diesterweg, Frankfurt am Main (1937)

  23. Erdmann, O.: Grundzüge der deutschen Syntax nach ihrer geschichtlichen Entwicklung dargestellt. Verlag der Cotta’schen Buchhandlung, Stuttgart (1886). Erste Abteilung

  24. Forst, M.: Treebank conversion – establishing a testsuite for a broad-coverage LFG from the TIGER treebank. In: Proceedings of the EACL Workshop on Linguistically Interpreted Corpora (LINC 2003), Budapest, pp. 25–32 (2003)

  25. Forst, M., Bertomeu, N., Crysmann, B., Fouvry, F., Hansen-Schirra, S., Kordoni, V.: Towards a dependency-based gold standard for German parsers - the TiGer dependency bank. In: Proceedings of LINC 2004 (2004)

  26. Frank, A., King, T.H., Kuhn, J., Maxwell, J.: Optimality theory style constraint ranking in large-scale LFG grammars. In: Proceedings of the Third LFG Conference, Brisbane, Australia (1998)

  27. Gärtner, M., Thiele, G., Seeker, W., Björkelund, A., Kuhn, J.: ICARUS – an extensible graphical search tool for dependency treebanks. In: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55–60, Sofia, Bulgaria, August 2013. Association for Computational Linguistics

  28. Gastel, A., Schulze, S., Versley, Y., Hinrichs, E.: Annotation of explicit and implicit discourse relations in the TüBa-D/Z Treebank. In: Proceedings of the Conference of the German Society for Computational Linguistics and Language Technology (GSCL), Hamburg, Germany (2011)

  29. Hajič, J., Ciaramita, M., Johansson, R., Kawahara, D., Martí, M.A., Màrquez, L., Meyers, A., Nivre, J., Padó, S., Štěpánek, J., Straňák, P., Surdeanu, M., Xue, N., Zhang, Y.: The CoNLL-2009 shared task: syntactic and semantic dependencies in multiple languages. In: Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL 2009): Shared Task, Boulder, Colorado, pp. 1–18, June 2009. Association for Computational Linguistics

  30. Harbusch, K.: Incremental sentence production inhibits clausal coordinate ellipsis: a treebank study into Dutch and German. Dialogue Discourse. Special issue on Incremental Processing in Dialogue 2(1), 313–332 (2011)

  31. Harbusch, K., Kempen, G.: Clausal coordinate ellipsis in German: the TIGER treebank as a source of evidence. In: Proceedings of NODALIDA 2007 – Sixteenth Nordic Conference of Computational Linguistics, Tartu, Estonia (2007)

  32. Hinrichs, E., Beck, K.: Auxiliary fronting in German: a walk in the woods. In: Proceedings of the Twelfth Workshop on Treebanks and Linguistic Theories (TLT), Sofia, Bulgaria, pp. 61–72 (2013)

  33. Hinrichs, E., Telljohann, H.: Constructing a valence lexicon for a treebank of German. In: Proceedings of the 7th International Workshop on Treebanks and Linguistic Theories (TLT), Groningen, The Netherlands, pp. 41–52 (2009)

  34. Hinrichs, E.W., Kübler, S.: Treebank profiling of spoken and written German. In: Proceedings of the Fourth Workshop on Treebanks and Linguistic Theories, Barcelona, Spain, pp. 65–76 (2005)

  35. Hinrichs, E.W., Kübler, S.: What linguists always wanted to know about German and did not know how to ask. In: Suominen, M., Arppe, A., Airola, A., Heinämäki, O., Miestamo, M., Määttä, U., Niemi, J., Pitkänen, K.K., Sinnemäki, K. (eds.) A Man of Measure: Festschrift in Honour of Fred Karlsson on his 60th Birthday. SKY Journal of Linguistics, vol. 19, pp. 24–33. The Linguistic Association of Finland (2006). Special Supplement

  36. Hinrichs, E.W., Bartels, J., Kawata, Y., Kordoni, V., Telljohann, H.: The Tübingen treebanks for spoken German, English, and Japanese. In: Wahlster, W. (ed.) Verbmobil: Foundations of Speech-to-Speech Translation, pp. 550–574. Springer, Berlin (2000)

  37. Hinrichs, E.W., Bartels, J., Kawata, Y., Kordoni, V., Telljohann, H.: The Verbmobil treebanks. In: Proceedings of KONVENS 2000, 5. Konferenz zur Verarbeitung natürlicher Sprache, Ilmenau, Germany, pp. 107–112 (2000)

  38. Hinrichs, E.W., Filippova, K., Wunsch, H.: What treebanks can do for you: rule-based and machine-learning approaches to anaphora resolution in German. In: Civit, M., Kübler, S., Martí, M.A. (eds.) Proceedings of the Fourth Workshop on Treebanks and Linguistic Theories (TLT 2005), Barcelona, Spain, pp. 77–88 (2005)

  39. Hirschmann, H., Linde, S.: Annotationsguidelines zur Deutschen Diachronen Baumbank. Technical report, Humboldt-Universität zu Berlin (2010). http://korpling.german.hu-berlin.de/ddb-doku

  40. Höhle, T.: Der Begriff “Mittelfeld”, Anmerkungen über die Theorie der topologischen Felder. In: Akten des Siebten Internationalen Germanistenkongresses 1985, Göttingen, Germany, pp. 329–340 (1986)

  41. Kallmeyer, L., Maier, W.: Data-driven parsing using probabilistic linear context-free rewriting systems. Comput. Linguist. 39(1), 87–119 (2013)

  42. King, T.H., Crouch, R., Riezler, S., Dalrymple, M., Kaplan, R.M.: The PARC700 dependency bank. In: Proceedings of the 4th International Workshop on Linguistically Interpreted Corpora (LINC-03) at EACL-03, pp. 1–8 (2003)

  43. King, T.H., Dipper, S., Frank, A., Kuhn, J., Maxwell, J.: Ambiguity management in grammar writing. Res. Lang. Comput. 2, 259–280 (2004)

  44. Kountz, M.: Extraktion von Dependenztripeln aus der TIGER-Baumbank (2006). Studienarbeit, Universität Stuttgart

  45. Kübler, S.: How do treebank annotation schemes influence parsing results? Or how not to compare apples and oranges. In: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP), Borovets, Bulgaria, pp. 293–300 (2005)

  46. Kübler, S.: The PaGe shared task on parsing German. In: Proceedings of the ACL Workshop on Parsing German, Columbus, Ohio, pp. 55–63 (2008)

  47. Kübler, S., Telljohann, H.: Towards a dependency-based evaluation for partial parsing. In: Proceedings of the LREC-Workshop Beyond PARSEVAL – Towards Improved Evaluation Measures for Parsing Systems, Las Palmas, Gran Canaria, pp. 9–16 (2002)

  48. Kübler, S., Hinrichs, E.W., Maier, W.: Is it really that difficult to parse German? In: Proceedings of the 2006 Conference on Empirical Methods in Natural Language Processing (EMNLP), Sydney, Australia, pp. 111–119 (2006)

  49. Kübler, S., Maier, W., Rehbein, I., Versley, Y.: How to compare treebanks. In: Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC), Marrakech, Morocco, pp. 2322–2329 (2008)

  50. Kübler, S., Rehbein, I., van Genabith, J.: TePaCoC – a corpus for testing parser performance on complex German grammatical constructions. In: Proceedings of the Seventh International Workshop on Treebanks and Linguistic Theories, Groningen, The Netherlands, pp. 15–28 (2009)

  51. Kübler, S., Beck, K., Hinrichs, E., Telljohann, H.: Chunking German: an unsolved problem. In: Proceedings of the Fourth Linguistic Annotation Workshop (LAW), Uppsala, Sweden, pp. 147–151 (2010)

  52. Kunze, C., Lemnitzer, L.: GermaNet – representation, visualization, application. In: Proceedings of the Third International Conference on Language Resources and Evaluation (LREC), Las Palmas, Canary Islands, pp. 1485–1491 (2002)

  53. Lezius, W.: Ein Suchwerkzeug für syntaktisch annotierte Textkorpora. Ph.D. thesis, Universität Stuttgart (2002). Arbeitspapiere des Instituts für Maschinelle Sprachverarbeitung (AIMS), vol. 8(4)

  54. Martens, S.: TüNDRA: a web application for treebank search and visualization. In: Proceedings of the Twelfth Workshop on Treebanks and Linguistic Theories (TLT), Sofia, Bulgaria, pp. 133–144 (2013)

  55. Mengel, A., Lezius, W.: An XML-based representation format for syntactically annotated corpora. In: Proceedings of the International Conference on Language Resources and Evaluation, pp. 121–126 (2000)

  56. Meurer, P., Dyvik, H., Rosén, V., De Smedt, K., Lyse, G.I., Losnegaard, G.S., Thunes, M.: The INESS treebanking infrastructure. In: Proceedings of the 19th Nordic Conference of Computational Linguistics (NODALIDA 2013). NEALT Proceedings, Oslo, Norway, vol. 16, pp. 453–458 (2013)

  57. Meurers, D., Müller, S.: Corpora and syntax. In: Lüdeling, A., Kytö, M. (eds.) Corpus Linguistics: An International Handbook, pp. 920–933. Mouton de Gruyter, Berlin (2009)

  58. Müller, F.H.: Stylebook for the Tübingen partially parsed corpus of written German (TüPP-D/Z). Technical report, Seminar für Sprachwissenschaft, Universität Tübingen (2004). http://www.sfs.uni-tuebingen.de/tupp/doc/stylebook.ps

  59. Naumann, K.: Manual for the annotation of in-document referential relations. Technical report, Universität Tübingen (2007). http://www.sfs.uni-tuebingen.de/resources/tuebadz-coreference-manual-2007.pdf

  60. Nivre, J., Hall, J., Kübler, S., McDonald, R., Nilsson, J., Riedel, S., Yuret, D.: The CoNLL 2007 shared task on dependency parsing. In: Proceedings of the CoNLL 2007 Shared Task. Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), Prague, Czech Republic, pp. 915–932 (2007)

  61. Orasan, C.: PALinkA: A highly customizable tool for discourse annotation. In: Proceedings of the 4th SIGdial Workshop on Discourse and Dialog, Sapporo, Japan, pp. 39–43 (2003)

  62. Pappert, S., Schließer, J., Janssen, D., Pechmann, T.: Corpus- and psycholinguistic investigations of linguistic constraints on German object order. In: Späth, A. (ed.) Interfaces and Interface Conditions, pp. 299–328. Mouton de Gruyter, Berlin (2007)

  63. Plaehn, O.: Annotate: Bedienungsanleitung. Technical report, Universität des Saarlandes (1998). http://www.coli.uni-saarland.de/projects/sfb378/negra-corpus/annotate-manual.ps.gz

  64. Plaehn, O.: Probabilistic parsing with discontinuous phrase structure grammar. Master’s thesis, Department of Computational Linguistics, University of the Saarland, Saarbrücken, Germany (1999)

  65. Plaehn, O., Brants, T.: Annotate – an efficient interactive annotation tool. In: Proceedings of the 6th Conference on Applied Natural Language Processing (ANLP-2000), Seattle, WA (2000)

  66. Pollard, C., Sag, I.A.: Head-Driven Phrase Structure Grammar. Studies in Contemporary Linguistics. University of Chicago Press, Chicago (1994)

  67. Rehbein, I., van Genabith, J.: Treebank annotation schemes and parser evaluation for German. In: Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), Prague, Czech Republic, pp. 630–639 (2007)

  68. Rehbein, I., van Genabith, J.: Why is it so difficult to compare treebanks? TIGER and TüBa-D/Z revisited. In: Proceedings of the Sixth International Workshop on Treebanks and Linguistic Theories (TLT), Bergen, Norway, pp. 115–126 (2007)

  69. Rehm, G., Witt, A., Zinsmeister, H., Dellert, J.: Masking treebanks for the free distribution of linguistic resources and other applications. In: Proceedings of the Sixth International Workshop on Treebanks and Linguistic Theories (TLT), Bergen, Norway (2007)

  70. Reis, M.: Zum Subjektbegriff im Deutschen. In: Abraham, W. (ed.) Satzglieder im Deutschen: Vorschläge zur syntaktischen, semantischen und pragmatischen Fundierung, pp. 171–211. Narr, Tübingen (1982)

  71. Rohrer, C., Forst, M.: Improving coverage and parsing quality of a large-scale LFG for German. In: Proceedings of the Language Resources and Evaluation Conference (LREC-2006), Genoa, Italy, pp. 2206–2211 (2006)

  72. Romary, L., Zeldes, A., Zipser, F.: <tiger2/> – Serialising the ISO SynAF syntactic object model. Lang. Resour. Eval. (to appear)

  73. Rosén, V., Meurer, P., De Smedt, K.: LFG Parsebanker: a toolkit for building and searching a treebank as a parsed corpus. In: Proceedings of the Seventh International Workshop on Treebanks and Linguistic Theories, Utrecht, pp. 127–133 (2009)

  74. Roussel, A.: Documentation of the tool TIGER Tree Enricher (2014). http://www.linguistics.ruhr-uni-bochum.de/resources/software/tte

  75. Samuelsson, Y., Volk, M.: Automatic node insertion for treebank deepening. In: Proceedings of the Third Workshop on Treebanks and Linguistic Theories (TLT), Tübingen, pp. 127–136 (2004)

  76. Schiller, A., Teufel, S., Stöckert, C., Thielen, C.: Guidelines für das Tagging deutscher Textcorpora mit STTS (Kleines und großes Tagset). Technical report, Universität Stuttgart and Universität Tübingen (1999). http://www.ims.uni-stuttgart.de/forschung/ressourcen/lexika/TagSets/stts-1999.pdf

  77. Seeker, W., Kuhn, J.: Making ellipses explicit in dependency conversion for a German treebank. In: Proceedings of the 8th International Conference on Language Resources and Evaluation, Istanbul, Turkey, pp. 3132–3139 (2012)

  78. Simon, S., Hinrichs, E., Schulze, S., Versley, Y.: Handbuch zur Annotation expliziter und impliziter Diskursrelationen im Korpus der Tübinger Baumbank des Deutschen (TüBa-D/Z). Universität Tübingen (2011)

  79. Skut, W., Brants, T., Krenn, B., Uszkoreit, H.: A linguistically interpreted corpus of German newspaper text. In: Proceedings of the ESSLLI Workshop on Recent Advances in Corpus Annotation, pp. 705–711 (1998)

  80. Skut, W., Krenn, B., Brants, T., Uszkoreit, H.: An annotation scheme for free word order languages. In: Proceedings of the Fifth Conference on Applied Natural Language Processing ANLP 1997, Washington, DC, pp. 88–95 (1997)

  81. Smith, G.: A brief introduction to the TIGER Treebank, version 1. Technical report, Universität Potsdam (2003). http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/TIGERCorpus/tiger_introduction.pdf

  82. Smith, G.: Searching for morphological structure with regular expressions. Technical report, Universität Potsdam (2003). http://www.ims.uni-stuttgart.de/forschung/ressourcen/korpora/TIGERCorpus/tiger_regex.pdf

  83. Spreyer, K., Frank, A.: The TIGER 700 RMRS Bank: RMRS construction from dependencies. In: Proceedings of LINC 2005, Jeju Island, Korea, pp. 1–10 (2005)

  84. Stede, M.: The Potsdam commentary corpus. In: Proceedings of the ACL-04 Workshop on Discourse Annotation, Barcelona, pp. 96–102 (2004)

  85. Steiner, I.: Partial agreement in German: a processing issue? In: Proceedings of the International Conference on Linguistic Evidence, Tübingen, Germany (2009)

  86. Telljohann, H., Hinrichs, E., Kübler, S.: The TüBa-D/Z treebank: annotating German with a context-free backbone. In: Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC), Lisbon, Portugal, pp. 2229–2235 (2004)

  87. Telljohann, H., Hinrichs, E.W., Kübler, S., Zinsmeister, H., Beck, K.: Stylebook for the Tübingen Treebank of Written German (TüBa-D/Z). Universität Tübingen, Germany, Seminar für Sprachwissenschaft (2015)

  88. Thielen, C., Schiller, A.: Ein kleines und erweitertes Tagset fürs Deutsche. In: Feldweg, H., Hinrichs, E. (eds.) Lexikon & Text, pp. 193–203. Niemeyer, Tübingen (1994)

  89. Trushkina, J.: Morpho-Syntactic Annotation and Dependency Parsing of German. Ph.D. thesis, Universität Tübingen (2004)

  90. Ule, T.: Treebank Refinement: Optimising Representations of Syntactic Analyses for Probabilistic Context-Free Parsing. Ph.D. thesis, Universität Tübingen (2007)

  91. Veenstra, J., Müller, F.H., Ule, T.: Topological fields chunking for German. In: Proceedings of the Sixth Conference on Natural Language Learning (CoNLL 2002), Taipei, Taiwan, pp. 56–62 (2002)

  92. Versley, Y., Beck, K., Hinrichs, E., Telljohann, H.: A syntax-first approach to high-quality morphological analysis and lemma disambiguation for the TüBa-D/Z Treebank. In: Proceedings of the Ninth International Workshop on Treebanks and Linguistic Theories (TLT), Tartu, Estonia, pp. 233–244 (2010)

  93. Volk, M., Göhring, A., Marek, T., Samuelsson, Y.: SMULTRON (version 3.0) – The Stockholm MULtilingual parallel TReebank (2010). An English-French-German-Spanish-Swedish parallel treebank with sub-sentential alignments. http://www.cl.uzh.ch/research/parallelcorpora/paralleltreebanks_en.html

  94. Wahlster, W. (ed.): Verbmobil: Foundations of Speech-to-Speech Translation. Springer, Berlin (2000)

  95. Zarrieß, S., Cahill, A., Kuhn, J.: To what extent does sentence-internal realisation reflect discourse context? A study on word order. In: Proceedings of the 13th Conference of the European Chapter of the ACL, Avignon, France, pp. 767–776 (2012)

  96. Zeldes, A., Ritz, J., Lüdeling, A., Chiarcos, C.: ANNIS: a search tool for multi-layer annotated corpora. In: Proceedings of Corpus Linguistics 2009, Liverpool, UK (2009)

  97. Zinsmeister, H.: Treebank data as linguistic evidence? Coordination in TüBa-D/Z. In: Proceedings of the International Conference on Linguistic Evidence, Tübingen, Germany (2006)

  98. Zinsmeister, H., Kuhn, J., Dipper, S.: Utilizing LFG parses for treebank annotation. In: Proceedings of the LFG-02 Conference, Athens, Greece, pp. 427–447 (2002). CSLI Publications

Author information

Corresponding author

Correspondence to Stefanie Dipper.

Copyright information

© 2017 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Dipper, S., Kübler, S. (2017). German Treebanks: TIGER and TüBa-D/Z. In: Ide, N., Pustejovsky, J. (eds) Handbook of Linguistic Annotation. Springer, Dordrecht. https://doi.org/10.1007/978-94-024-0881-2_22

  • DOI: https://doi.org/10.1007/978-94-024-0881-2_22

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-024-0879-9

  • Online ISBN: 978-94-024-0881-2

  • eBook Packages: Social Sciences (R0)
