Annotation of Compositional Operations with GLML

  • Chapter
Computing Meaning

Part of the book series: Text, Speech and Language Technology (TLTB, volume 47)

Abstract

In this paper, we introduce a methodology for annotating compositional operations in natural language text and describe the Generative Lexicon Mark-up Language (GLML), a mark-up language inspired by the Generative Lexicon (GL) model, for identifying such relations. While most annotation systems capture surface relationships, GLML captures the “compositional history” of the argument selection relative to the predicate. We provide a brief overview of GL before moving on to our proposed methodology for annotating with GLML. The paper describes three main tasks. The first is based on atomic semantic types, and the other two exploit more fine-grained meaning parameters encoded in the Qualia Structure roles: (i) Argument Selection and Coercion Annotation for the SemEval-2010 competition; (ii) Qualia Selection in modification constructions; (iii) Type Selection in modification constructions and verb-noun combinations involving dot objects. We explain what each task comprises and include the XML format for annotated sample sentences. We show that by identifying and subsequently annotating the typing and subtyping shifts in these constructions, we gain insight into the workings of the general mechanisms of composition.

Notes

  1. A complete overview of the GLML specification as well as updates on the annotation effort can be found at www.glml.org.

  2. Specifically, they argue that it is the particular quale binding the two nouns that determines the choice. They correlate the use of da with the Telic quale while di can be associated with either Agentive or Constitutive.

  3. See Pustejovsky (2005), Rumshisky et al. (2007) for an expanded listing of dot objects.

References

  • Asher, N., & Pustejovsky, J. (2006). A type composition logic for Generative Lexicon. Journal of Cognitive Science, 6, 1–38.

  • Bisetto, A., & Scalise, S. (2005). The classification of compounds. Lingue e Linguaggio, 2, 319–332.

  • BNC (2000). The British National Corpus. The BNC Consortium, University of Oxford. http://www.natcorp.ox.ac.uk/.

  • Bouillon, P. (1997). Polymorphie et sémantique lexicale: le cas des adjectifs. PhD dissertation, Paris VII, Paris.

  • Burchardt, A., Erk, K., Frank, A., Kowalski, A., Pado, S., & Pinkal, M. (2006). The SALSA corpus: A German corpus resource for lexical semantics. In Proceedings of LREC, Genoa, Italy.

  • Chierchia, G. (1998). Reference to kinds across languages. Natural Language Semantics, 6(4), 339–405.

  • Egg, M. (2005). Flexible semantics for reinterpretation phenomena. Stanford: CSLI.

  • Groenendijk, J., & Stokhof, M. (1989). Type-shifting rules and the semantics of interrogatives (Vol. 2, pp. 21–68). Dordrecht: Kluwer.

  • Hanks, P. (2009). Corpus pattern analysis. CPA Project Page. Retrieved April 11, 2009, from http://nlp.fi.muni.cz/projekty/cpa/.

  • Hanks, P., & Pustejovsky, J. (2005). A pattern dictionary for natural language processing. Revue Française de Linguistique Appliquée, X, 63–82.

  • Hobbs, J. R., Stickel, M., & Martin, P. (1993). Interpretation as abduction. Artificial Intelligence, 63, 69–142.

  • Johnston, M., & Busa, F. (1999). The compositional interpretation of compounds. In E. Viegas (Ed.), Breadth and depth of semantic lexicons (pp. 167–187). Dordrecht: Kluwer.

  • Kilgarriff, A., Rychly, P., Smrz, P., & Tugwell, D. (2004). The sketch engine. In Proceedings of EURALEX, Lorient, France (pp. 105–116).

  • Kipper, K. (2005). VerbNet: A broad-coverage, comprehensive verb lexicon. PhD dissertation, University of Pennsylvania, PA.

  • Levi, J. N. (1978). The syntax and semantics of complex nominals. New York: Academic Press.

  • Markert, K., & Nissim, M. (2007). Metonymy resolution at SemEval I: Guidelines for participants. In Proceedings of the ACL 2007 conference.

  • Meyers, A., Reeves, R., Macleod, C., Szekely, R., Zielinska, V., Young, B., & Grishman, R. (2004). The NomBank project: An interim report. In HLT-NAACL 2004 workshop: Frontiers in corpus annotation (pp. 24–31).

  • Nunberg, G. (1979). The non-uniqueness of semantic solutions: Polysemy. Linguistics and Philosophy, 3, 143–184.

  • Palmer, M., Gildea, D., & Kingsbury, P. (2005). The proposition bank: An annotated corpus of semantic roles. Computational Linguistics, 31(1), 71–106.

  • Partee, B., & Rooth, M. (1983). Generalized conjunction and type ambiguity (pp. 361–383). Berlin: de Gruyter.

  • Pinkal, M. (1999). On semantic underspecification. In H. Bunt & R. Muskens (Eds.), Proceedings of the 2nd international workshop on computational semantics (IWCS 2), January 13–15, The Netherlands: Tilburg University.

  • Pradhan, S., Hovy, E., Marcus, M., Palmer, M., Ramshaw, L., & Weischedel, R. (2007). OntoNotes: A unified relational semantic representation. In ICSC 2007, International conference on semantic computing (pp. 517–526).

  • Pustejovsky, J. (1991). The Generative Lexicon. Computational Linguistics, 17(4), 409–441.

  • Pustejovsky, J. (1995). The Generative Lexicon. Cambridge: MIT Press.

  • Pustejovsky, J. (2000). Events and the semantics of opposition. In C. Tenny & J. Pustejovsky (Eds.), Events as grammatical objects (pp. 445–482). Stanford: CSLI.

  • Pustejovsky, J. (2001). Type construction and the logic of concepts. In The syntax of word meaning, Cambridge: Cambridge University Press.

  • Pustejovsky, J. (2005). A survey of dot objects (Technical report). Brandeis University.

  • Pustejovsky, J. (2006a). Type theory and lexical decomposition. Journal of Cognitive Science, 6, 39–76.

  • Pustejovsky, J. (2006b). Unifying linguistic annotations: A TimeML case study. In Proceedings of TSD 2006, Brno, Czech Republic.

  • Pustejovsky, J. (2011). Coercion in a general theory of argument selection. Linguistics, 49(6), 1401–1431.

  • Pustejovsky, J., Hanks, P., & Rumshisky, A. (2004). Automated induction of sense in context. In COLING 2004, Geneva, Switzerland (pp. 924–931).

  • Pustejovsky, J., Knippen, R., Littman, J., & Sauri, R. (2005). Temporal and event information in natural language text. Language Resources and Evaluation, 39(2), 123–164.

  • Pustejovsky, J., Rumshisky, A., Plotnick, A., Jezek, E., Batiukova, O., & Quochi, V. (2010). SemEval-2010 task 7: Argument selection and coercion. In Proceedings of the 5th international workshop on semantic evaluation, Uppsala, Sweden (pp. 27–32). Stroudsburg: Association for Computational Linguistics.

  • Pustejovsky, J., & Stubbs, A. (2012). Natural language annotation for machine learning. Sebastopol: O’Reilly Publishers.

  • Rumshisky, A., & Batiukova, O. (2008). Polysemy in verbs: Systematic relations between senses and their effect on annotation. In COLING workshop on human judgement in computational linguistics (HJCL-2008), Manchester, England.

  • Rumshisky, A., Grinberg, V., & Pustejovsky, J. (2007). Detecting selectional behaviour of complex types in text. In 4th international workshop on Generative Lexicon, Paris.

  • Rumshisky, A., Hanks, P., Havasi, C., & Pustejovsky, J. (2006). Constructing a corpus-based ontology using model bias. In The 19th international FLAIRS conference, FLAIRS 2006, Melbourne Beach, Florida, USA.

  • Ruppenhofer, J., Ellsworth, M., Petruck, M., Johnson, C., & Scheffczyk, J. (2006). FrameNet II: Extended theory and practice. Berkeley, California: International Computer Science Institute.

  • Spencer, A. (1991). Morphological theory: An introduction to word structure in generative grammar. Oxford, UK and Cambridge, USA: Blackwell Textbooks in Linguistics.

  • Subirats, C. (2004). FrameNet Español. Una red semántica de marcos conceptuales. In VI international congress of Hispanic linguistics, Leipzig.

  • Verhagen, M. (2010). The Brandeis annotation tool. In Language resources and evaluation conference, LREC 2010, Malta.

  • Warren, B. (1978). Semantic patterns of noun-noun compounds. Göteborg: Acta Universitatis Gothoburgensis.

Acknowledgements

The idea for annotating a corpus according to principles of argument selection within GL arose during a discussion at GL2007 in Paris, between one of the authors (James Pustejovsky) and Nicoletta Calzolari and Pierrette Bouillon. The authors would like to thank the members of the GLML Working Group and the organizers of the ASC task at SemEval-2010 for their fruitful feedback. In particular, we would like to thank Nicoletta Calzolari, Elisabetta Jezek, Alessandro Lenci, Valeria Quochi, Jan Odijk, Tommaso Caselli, Claudia Soria, Chu-Ren Huang, Marc Verhagen, and Kiyong Lee. The contribution by Olga Batiukova was partially financed by the Ministry of Economy and Competitiveness of Spain under Grant No. FFI2009-12191 (Subprogram FILO).

Author information

Corresponding author

Correspondence to James Pustejovsky.

Copyright information

© 2014 Springer Science+Business Media Dordrecht

About this chapter

Cite this chapter

Pustejovsky, J., Rumshisky, A., Batiukova, O., Moszkowicz, J.L. (2014). Annotation of Compositional Operations with GLML. In: Bunt, H., Bos, J., Pulman, S. (eds) Computing Meaning. Text, Speech and Language Technology, vol 47. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-7284-7_12

  • DOI: https://doi.org/10.1007/978-94-007-7284-7_12

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-94-007-7283-0

  • Online ISBN: 978-94-007-7284-7

  • eBook Packages: Computer Science, Computer Science (R0)
