Part of the book series: Studies in Computational Intelligence (SCI, volume 589)

Abstract

State-of-the-art parsers are currently trained on versions of the Penn Treebank converted into dependency representations, which, however, do not include null elements. This is done to facilitate structural learning and to prevent the probabilistic engine from postulating the existence of deprecated null elements everywhere (see [19]). As a result, however, the semantics of the representation used and produced is inconsistent, which dramatically reduces its usefulness in real-life applications such as Q/A and other semantically driven fields, because it hampers the mapping of a complete logical form. What systems have come up with instead are “quasi”-logical forms, or partial logical forms, mapped directly from the surface representation in dependency structure. We show the most common problems derived from the conversion and then describe an algorithm that we have implemented and applied to our converted Italian Treebank, and that can be used on any CoNLL-like treebank or representation to produce an almost complete, semantically consistent dependency treebank.

References

  1. Afonso, S., Bick, E., Haber, R., Santos, D.: Floresta sintá(c)tica: a treebank for Portuguese. In: Rodríguez, M.G., Araujo, C.P. (eds.) Proceedings of LREC 2002, pp. 1698–1703. ELRA, Spain (2002)

  2. Attardi, G.: Experiments with a multilanguage non-projective dependency parser. In: Proceedings of the Tenth Conference on Computational Natural Language Learning (CoNLL-X), New York (2006)

  3. Bikel, D.M.: Intricacies of Collins’ parsing model. Comput. Linguist. 30(4), 479–511 (2004)

  4. Black, E., Abney, S., Flickinger, D., Gdaniec, C., Grishman, R., Harrison, P., Hindle, D., Ingria, R., Jelinek, F., Klavans, J., Liberman, M., Marcus, M., Roukos, S., Santorini, B., Strzalkowski, T.: A procedure for quantitatively comparing the syntactic coverage of English grammars. In: Proceedings of the DARPA Speech and Natural Language Workshop, pp. 306–311 (1991)

  5. Brants, T.: TnT: a statistical part-of-speech tagger. In: ANLP 2000. Seattle (2000)

  6. Brill, E.: Transformation-based error-driven learning and natural language processing: a case study in part-of-speech tagging. Comput. Linguist. 21, 543–565 (1995)

  7. Carroll, J., Briscoe, T., Sanfilippo, A.: Parser evaluation: a survey and a new proposal. In: Proceedings of the [First] International Conference on Language Resources and Evaluation, pp. 447–454 (1998)

  8. Collins, M.: A new statistical parser based on bigram lexical dependencies. In: Proceedings of the 34th Annual Meeting of the Association for Computational Linguistics, pp. 184–191 (1996)

  9. Corazza, A., Lavelli, A., Satta, G., Zanoli, R.: Analyzing an Italian treebank with state-of-the-art statistical parsers. In: Proceedings of the 3rd Workshop on Treebanks and Linguistic Theories (TLT-2004), pp. 39–50. Tübingen, Germany (2004)

  10. Delmonte, R., Bristot, A., Tonelli, S.: VIT - Venice Italian Treebank: syntactic and quantitative features. In: De Smedt, K., Hajic, J., Kübler, S. (eds.) Proceedings of the Sixth International Workshop on Treebanks and Linguistic Theories. NEALT Proceedings Series, vol. 1, pp. 43–54 (2007)

  11. Delmonte, R., Chiran, L., Bacalu, C.: Elementary trees for syntactic and statistical disambiguation. In: Proceedings TAG+5, pp. 237–240. Paris (2000)

  12. Delmonte, R.: From shallow parsing to functional structure. In: Atti del Workshop AI*IA— “Elaborazione del Linguaggio e Riconoscimento del Parlato”, pp. 8–19. IRST, Trento (1999)

  13. Delmonte, R.: How to annotate linguistic information in FILES and SCAT. In: Atti del Workshop “La Treebank Sintattico-Semantica dell’Italiano di SI-TAL”, pp. 75–84. Bari (2001)

  14. Delmonte, R.: Strutture Sintattiche dall’Analisi Computazionale di Corpora di Italiano. In: Cardinaletti, A. (ed.) Intorno all’Italiano Contemporaneo, pp. 187–220. Franco Angeli, Milano (2004)

  15. Delmonte, R., Dolci, R.: Parsing Italian with a context-free recognizer. Annali di Ca’ Foscari XXVIII(1–2), 123–161 (1989)

  16. Delmonte, R.: Shallow Parsing and Functional Structure in Italian Corpora, pp. 113–119. LREC, Athens (2000)

  17. Delmonte, R.: Treebanking in VIT: from phrase structure to dependency representation. In: Nirenburg, S. (ed.) Language Engineering for Lesser-Studied Languages, pp. 51–80. IOS Press, The Netherlands (2009)

  18. Delmonte, R.: Computational Linguistic Text Processing—Lexicon Grammar Parsing and Anaphora Resolution. Nova Science Publishers, New York (2009)

  19. Gaizauskas, R.: Investigations into the grammar underlying the Penn treebank II. Technical Report CS-95-25, Department of Computer Science, University of Sheffield (1995)

  20. Harper, M.P., Helzerman, R.A.: Extensions to constraint dependency parsing for spoken language processing. Comput. Speech Lang. 9, 187–234 (1995)

  21. Hellwig, P.: Dependency unification grammar. In: Proceedings COLING-86, pp. 195–198 (1986)

  22. Hudson, R.: Word Grammar. Blackwell, London (1984)

  23. Hudson, R.: English Word Grammar. Blackwell, London (1990)

  24. Jackendoff, R.: X-Bar Syntax. The MIT Press, Cambridge (1977)

  25. Jaervinen, T., Tapanainen, P.: Towards an implementable dependency grammar. In: Kahane, S., Polguère, A. (eds.) Proceedings of the Workshop on Processing of Dependency-Based Grammars, pp. 1–10 (1998)

  26. Lesmo, L., Lombardo, V., Bosco, C.: Treebank development: the TUT approach. In: Proceedings of ICON 2002. Mumbai (2002)

  27. Marcus, M., et al.: Building a large annotated corpus of English: the Penn Treebank. Comput. Linguist. 19(2), 313–330 (1993)

  28. Martí, M.A., Taulé, M., Márquez, L., Bertran, M.: Ancora: A Multilingual and Multilevel Annotated Corpus in http://clic.ub.edu/ancora/publications/ (2007)

  29. Maruyama, H.: Structural disambiguation with constraint propagation. In: Proceedings of the 28th Meeting of the Association for Computational Linguistics (ACL), pp. 31–38. Pittsburgh (1990)

  30. Mel’cuk, I.: Dependency Syntax: Theory and Practice. State University of New York Press, New York (1988)

  31. Menzel, W., Schroeder, I.: Decision procedures for dependency parsing using graded constraints. In: Kahane, S., Polguère, A. (eds.) Proceedings of the Workshop on Processing of Dependency-Based Grammars, pp. 78–87 (1998)

  32. Montemagni, S., et al.: The Italian Syntactic-Semantic Treebank: Architecture, Annotation, Tools and Evaluation, pp. 18–27. LINC, ACL, Luxembourg (2000)

Author information

Correspondence to Rodolfo Delmonte.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Delmonte, R. (2015). Dependency Treebank Annotation and Null Elements: An Experiment with VIT. In: Basili, R., Bosco, C., Delmonte, R., Moschitti, A., Simi, M. (eds) Harmonization and Development of Resources and Tools for Italian Natural Language Processing within the PARLI Project. Studies in Computational Intelligence, vol 589. Springer, Cham. https://doi.org/10.1007/978-3-319-14206-7_2

  • DOI: https://doi.org/10.1007/978-3-319-14206-7_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-14205-0

  • Online ISBN: 978-3-319-14206-7

  • eBook Packages: Engineering, Engineering (R0)
