Abstract
In this paper, we introduce a methodology for annotating compositional operations in natural language text and describe the Generative Lexicon Mark-up Language (GLML), a mark-up language inspired by the Generative Lexicon (GL) model, for identifying such operations. While most annotation systems capture surface relationships, GLML captures the “compositional history” of argument selection relative to the predicate. We provide a brief overview of GL before presenting our proposed methodology for annotating with GLML. The paper describes three main tasks: (i) Argument Selection and Coercion Annotation for the SemEval-2010 competition; (ii) Qualia Selection in modification constructions; (iii) Type Selection in modification constructions and verb-noun combinations involving dot objects. The first task is based on atomic semantic types, while the other two exploit more fine-grained meaning parameters encoded in the Qualia Structure roles. We explain what each task comprises and include the XML format for annotated sample sentences. We show that identifying and annotating the typing and subtyping shifts in these constructions gives insight into the general mechanisms of composition.
Notes
- 1. A complete overview of the GLML specification as well as updates on the annotation effort can be found at www.glml.org.
- 2. Specifically, they argue that it is the particular quale binding the two nouns that determines the choice. They correlate the use of da with the Telic quale, while di can be associated with either the Agentive or the Constitutive quale.
- 3.
Acknowledgements
The idea for annotating a corpus according to principles of argument selection within GL arose during a discussion at GL2007 in Paris, between one of the authors (James Pustejovsky) and Nicoletta Calzolari and Pierrette Bouillon. The authors would like to thank the members of the GLML Working Group and the organizers of the ASC task at SemEval-2010 for their fruitful feedback. In particular, we would like to thank Nicoletta Calzolari, Elisabetta Jezek, Alessandro Lenci, Valeria Quochi, Jan Odijk, Tommaso Caselli, Claudia Soria, Chu-Ren Huang, Marc Verhagen, and Kiyong Lee. The contribution by Olga Batiukova was partially financed by the Ministry of Economy and Competitiveness of Spain under Grant No. FFI2009-12191 (Subprogram FILO).
Copyright information
© 2014 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Pustejovsky, J., Rumshisky, A., Batiukova, O., Moszkowicz, J.L. (2014). Annotation of Compositional Operations with GLML. In: Bunt, H., Bos, J., Pulman, S. (eds) Computing Meaning. Text, Speech and Language Technology, vol 47. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-7284-7_12
DOI: https://doi.org/10.1007/978-94-007-7284-7_12
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-007-7283-0
Online ISBN: 978-94-007-7284-7
eBook Packages: Computer Science, Computer Science (R0)