Abstract
Recent work on transferring semantic information across languages has been applied to the development of resources annotated with Frame information for several non-English European languages. This work assumes that parallel corpora annotated for English can be used to transfer the semantic information to the target languages. This paper presents a robust method based on a statistical machine translation step augmented with simple rule-based post-processing. The method alleviates problems related to preprocessing errors and to the complex optimization required by syntax-dependent models of the cross-lingual mapping. Different alignment strategies are investigated on the Europarl corpus. Results suggest that the quality of the derived annotations is surprisingly good and well suited for training semantic role labeling systems.
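The general idea of projecting annotations through word alignments can be illustrated with a minimal sketch. This is not the method of the paper, whose details are not given in the abstract. It only assumes that a word alignment between a source and a target sentence is already available, for example from a statistical machine translation toolkit. The sentences, the alignment pairs, the role labels and the helper `project_span` are all hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

# A word alignment as a set of (source_index, target_index) pairs.
Alignment = Set[Tuple[int, int]]


def project_span(span: Tuple[int, int], alignment: Alignment) -> Optional[Tuple[int, int]]:
    """Project an inclusive source token span onto the target sentence.

    Returns the smallest contiguous target span covering every target token
    aligned to a source token inside `span`, or None if nothing is aligned.
    """
    start, end = span
    targets = [t for s, t in alignment if start <= s <= end]
    if not targets:
        return None
    return min(targets), max(targets)


# Hypothetical sentence pair (English source, Italian target).
source = ["The", "committee", "approved", "the", "proposal"]
target = ["Il", "comitato", "ha", "approvato", "la", "proposta"]

# Hypothetical alignment; the auxiliary "ha" (index 2) is left unaligned.
alignment: Alignment = {(0, 0), (1, 1), (2, 3), (3, 4), (4, 5)}

# Hypothetical annotation over the source tokens (inclusive indices).
source_roles: Dict[str, Tuple[int, int]] = {
    "TARGET": (2, 2),
    "ROLE_A": (0, 1),
    "ROLE_B": (3, 4),
}

# Transfer each annotated span to the target side.
projected: Dict[str, str] = {}
for label, span in source_roles.items():
    tgt = project_span(span, alignment)
    if tgt is not None:
        projected[label] = " ".join(target[tgt[0]: tgt[1] + 1])

print(projected)
```

In this toy case the predicate projects to "approvato" only, leaving out the unaligned auxiliary "ha". Gaps of this kind are one reason a projection step might be followed by post-processing rules.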
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Basili, R., De Cao, D., Croce, D., Coppola, B., Moschitti, A. (2009). Cross-Language Frame Semantics Transfer in Bilingual Corpora. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2009. Lecture Notes in Computer Science, vol 5449. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00382-0_27
DOI: https://doi.org/10.1007/978-3-642-00382-0_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-00381-3
Online ISBN: 978-3-642-00382-0
eBook Packages: Computer Science (R0)