Characterizing Discontinuity in Constituent Treebanks

  • Conference paper
Formal Grammar (FG 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5591)

Abstract

Measures for the degree of non-projectivity in dependency grammar have received attention on both the formal and the empirical side. The empirical characterization of discontinuity in constituent treebanks annotated with crossing branches, however, has so far been neglected. In this paper, we present two measures that characterize both the discontinuity of constituent structures and the non-projectivity of dependency structures. An empirical evaluation on German data and an investigation of the relation between the measures and grammars extracted from treebanks show their relevance.
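
The abstract does not name its two measures. As an illustration of the kind of quantity involved, the following minimal sketch computes the gap degree, a measure commonly used in the non-projectivity literature, here assumed to be applied to node yields given as sets of token positions. The function names and the toy trees are hypothetical and are not taken from the paper.

```python
from typing import Iterable


def gap_count(positions: Iterable[int]) -> int:
    """Count the gaps in one node's yield (a set of token positions).

    A gap is a pair of neighbouring positions in the sorted yield
    that are not adjacent in the sentence.
    """
    ordered = sorted(set(positions))
    return sum(1 for left, right in zip(ordered, ordered[1:]) if right - left > 1)


def gap_degree(yields: Iterable[Iterable[int]]) -> int:
    """Gap degree of a tree: the maximum gap count over all node yields."""
    return max((gap_count(node_yield) for node_yield in yields), default=0)


# Hypothetical toy trees, each given as the list of its node yields.
continuous_tree = [[0, 1, 2], [1, 2]]
discontinuous_tree = [[0, 1, 2, 3], [0, 3], [1, 2]]  # the yield {0, 3} has one gap

print(gap_degree(continuous_tree))
print(gap_degree(discontinuous_tree))
```

Under this definition a tree without crossing branches has gap degree 0, and each additional interruption in some node's yield raises that node's gap count by one.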

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Maier, W., Lichte, T. (2011). Characterizing Discontinuity in Constituent Treebanks. In: de Groote, P., Egg, M., Kallmeyer, L. (eds) Formal Grammar. FG 2009. Lecture Notes in Computer Science, vol 5591. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20169-1_11

  • DOI: https://doi.org/10.1007/978-3-642-20169-1_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20168-4

  • Online ISBN: 978-3-642-20169-1

  • eBook Packages: Computer Science, Computer Science (R0)