EusPropBank: Integrating Semantic Information in the Basque Dependency Treebank

  • Conference paper
Computational Linguistics and Intelligent Text Processing (CICLing 2010)

Abstract

This paper addresses theoretical problems encountered while annotating semantic roles in the Basque Dependency Treebank (BDT). We present the resources used and the way the annotation is being carried out. Following the model proposed in the PropBank project, we describe the problems found in the annotation process and the decisions we have taken. The representation of the semantic tag has been established and detailed guidelines for the annotation process have been defined, although the task needs continuous updating. In addition, we have adapted AbarHitz, a tool used in the construction of the BDT, to this task.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Aldezabal, I., Aranzabe, M.J., Díaz de Ilarraza, A., Estarrona, A., Uria, L. (2010). EusPropBank: Integrating Semantic Information in the Basque Dependency Treebank. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2010. Lecture Notes in Computer Science, vol 6008. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12116-6_6

  • DOI: https://doi.org/10.1007/978-3-642-12116-6_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12115-9

  • Online ISBN: 978-3-642-12116-6

  • eBook Packages: Computer Science, Computer Science (R0)
