Annotation of Compositional Operations with GLML

Pustejovsky, James; Rumshisky, Anna; Batiukova, Olga; Moszkowicz, Jessica L.

doi:10.1007/978-94-007-7284-7_12

Part of the book series: Text, Speech and Language Technology (TLTB, volume 47)

Abstract

In this paper, we introduce a methodology for annotating compositional operations in natural language text and describe the Generative Lexicon Mark-up Language (GLML), a mark-up language inspired by the Generative Lexicon (GL) model, for identifying these operations. While most annotation systems capture surface relationships, GLML captures the “compositional history” of the argument selection relative to the predicate. We provide a brief overview of GL before moving on to our proposed methodology for annotating with GLML. The paper describes three main tasks. The first is based on atomic semantic types, and the other two exploit more fine-grained meaning parameters encoded in the Qualia Structure roles: (i) Argument Selection and Coercion Annotation for the SemEval-2010 competition; (ii) Qualia Selection in modification constructions; (iii) Type Selection in modification constructions and verb-noun combinations involving dot objects. We explain what each task comprises and include the XML format for annotated sample sentences. We show that by identifying and subsequently annotating the typing and subtyping shifts in these constructions, we gain insight into the workings of the general mechanisms of composition.

Notes

1. A complete overview of the GLML specification as well as updates on the annotation effort can be found at www.glml.org.
2. Specifically, they argue that it is the particular quale binding the two nouns that determines the choice. They correlate the use of da with the Telic quale while di can be associated with either Agentive or Constitutive.
3. See Pustejovsky (2005), Rumshisky et al. (2007) for an expanded listing of dot objects.

References

Asher, N., & Pustejovsky, J. (2006). A type composition logic for Generative Lexicon. Journal of Cognitive Science, 6, 1–38.
Bisetto, A., & Scalise, S. (2005). The classification of compounds. Lingue e Linguaggio, 2, 319–332.
BNC (2000). The British National Corpus. The BNC Consortium, University of Oxford. http://www.natcorp.ox.ac.uk/.
Bouillon, P. (1997). Polymorphie et sémantique lexicale: le cas des adjectifs. PhD dissertation, Paris VII, Paris.
Burchardt, A., Erk, K., Frank, A., Kowalski, A., Pado, S., & Pinkal, M. (2006). The SALSA corpus: A German corpus resource for lexical semantics. In Proceedings of LREC, Genoa, Italy.
Chierchia, G. (1998). Reference to kinds across languages. Natural Language Semantics, 6(4), 339–405.
Egg, M. (2005). Flexible semantics for reinterpretation phenomena. Stanford: CSLI.
Groenendijk, J., & Stokhof, M. (1989). Type-shifting rules and the semantics of interrogatives. In G. Chierchia, B. Partee, & R. Turner (Eds.), Properties, types and meaning (Vol. 2, pp. 21–68). Dordrecht: Kluwer.
Hanks, P. (2009). Corpus pattern analysis. CPA Project Page. Retrieved April 11, 2009, from http://nlp.fi.muni.cz/projekty/cpa/.
Hanks, P., & Pustejovsky, J. (2005). A pattern dictionary for natural language processing. Revue Française de Linguistique Appliquée, X, 63–82.
Hobbs, J. R., Stickel, M., & Martin, P. (1993). Interpretation as abduction. Artificial Intelligence, 63, 69–142.
Johnston, M., & Busa, F. (1999). The compositional interpretation of compounds. In E. Viegas (Ed.), Breadth and depth of semantic lexicons (pp. 167–187). Dordrecht: Kluwer.
Kilgarriff, A., Rychly, P., Smrz, P., & Tugwell, D. (2004). The sketch engine. In Proceedings of EURALEX, Lorient, France (pp. 105–116).
Kipper, K. (2005). VerbNet: A broad-coverage, comprehensive verb lexicon. PhD dissertation, University of Pennsylvania, PA.
Levi, J. N. (1978). The syntax and semantics of complex nominals. New York: Academic Press.
Markert, K., & Nissim, M. (2007). Metonymy resolution at SemEval I: Guidelines for participants. In Proceedings of the ACL 2007 conference.
Meyers, A., Reeves, R., Macleod, C., Szekely, R., Zielinska, V., Young, B., & Grishman, R. (2004). The NomBank project: An interim report. In HLT-NAACL 2004 workshop: Frontiers in corpus annotation (pp. 24–31).
Nunberg, G. (1979). The non-uniqueness of semantic solutions: Polysemy. Linguistics and Philosophy, 3, 143–184.
Palmer, M., Gildea, D., & Kingsbury, P. (2005). The proposition bank: An annotated corpus of semantic roles. Computational Linguistics, 31(1), 71–106.
Partee, B., & Rooth, M. (1983). Generalized conjunction and type ambiguity. In R. Bäuerle, C. Schwarze, & A. von Stechow (Eds.), Meaning, use, and interpretation of language (pp. 361–383). Berlin: de Gruyter.
Pinkal, M. (1999). On semantic underspecification. In H. Bunt & R. Muskens (Eds.), Proceedings of the 2nd international workshop on computational semantics (IWCS 2), January 13–15, The Netherlands: Tilburg University.
Pradhan, S., Hovy, E., Marcus, M., Palmer, M., Ramshaw, L., & Weischedel, R. (2007). Ontonotes: A unified relational semantic representation. In ICSC 2007, International conference on semantic computing (pp. 517–526).
Pustejovsky, J. (1991). The Generative Lexicon. Computational Linguistics, 17(4), 409–441.
Pustejovsky, J. (1995). Generative Lexicon. Cambridge: MIT Press.
Pustejovsky, J. (2000). Events and the semantics of opposition. In C. Tenny & J. Pustejovsky (Eds.), Events as grammatical objects (pp. 445–482). Stanford: CSLI.
Pustejovsky, J. (2001). Type construction and the logic of concepts. In The syntax of word meaning, Cambridge: Cambridge University Press.
Pustejovsky, J. (2005). A survey of dot objects (Technical report). Brandeis University.
Pustejovsky, J. (2006a). Type theory and lexical decomposition. Journal of Cognitive Science, 6, 39–76.
Pustejovsky, J. (2006b). Unifying linguistic annotations: A TimeML case study. In Proceedings of TSD 2006, Brno, Czech Republic.
Pustejovsky, J. (2011). Coercion in a general theory of argument selection. Linguistics, 49(6), 1401–1431.
Pustejovsky, J., Hanks, P., & Rumshisky, A. (2004). Automated induction of sense in context. In COLING 2004, Geneva, Switzerland (pp. 924–931).
Pustejovsky, J., Knippen, R., Littman, J., & Sauri, R. (2005). Temporal and event information in natural language text. Language Resources and Evaluation, 39(2), 123–164.
Pustejovsky, J., Rumshisky, A., Plotnick, A., Jezek, E., Batiukova, O., & Quochi, V. (2010). Semeval-2010 task 7: Argument selection and coercion. In Proceedings of the 5th international workshop on semantic evaluation, Uppsala, Sweden (pp. 27–32). Stroudsburg: Association for Computational Linguistics.
Pustejovsky, J., & Stubbs, A. (2012). Natural language annotation for machine learning. Sebastopol: O’Reilly Publishers.
Rumshisky, A., & Batiukova, O. (2008). Polysemy in verbs: Systematic relations between senses and their effect on annotation. In COLING workshop on human judgement in computational linguistics (HJCL-2008), Manchester, England.
Rumshisky, A., Grinberg, V., & Pustejovsky, J. (2007). Detecting selectional behaviour of complex types in text. In 4th international workshop on Generative Lexicon, Paris.
Rumshisky, A., Hanks, P., Havasi, C., & Pustejovsky, J. (2006). Constructing a corpus-based ontology using model bias. In The 19th international FLAIRS conference, FLAIRS 2006, Melbourne Beach, Florida, USA.
Ruppenhofer, J., Ellsworth, M., Petruck, M., Johnson, C., & Scheffczyk, J. (2006). FrameNet II: Extended theory and practice. Berkeley: California International Computer Sciences Institute.
Spencer, A. (1991). Morphological theory: An introduction to word structure in generative grammar. Oxford, UK and Cambridge, USA: Blackwell Textbooks in Linguistics.
Subirats, C. (2004). FrameNet Español. Una red semántica de marcos conceptuales. In VI international congress of Hispanic linguistics, Leipzig.
Verhagen, M. (2010). The Brandeis annotation tool. In Language resources and evaluation conference, LREC 2010, Malta.
Warren, B. (1978). Semantic patterns of noun-noun compounds. Göteborg: Acta Universitatis Gothoburgensis.

Acknowledgements

The idea for annotating a corpus according to principles of argument selection within GL arose during a discussion at GL2007 in Paris, between one of the authors (James Pustejovsky) and Nicoletta Calzolari and Pierrette Bouillon. The authors would like to thank the members of the GLML Working Group and the organizers of the ASC task at SemEval-2010 for their fruitful feedback. In particular, we would like to thank Nicoletta Calzolari, Elisabetta Jezek, Alessandro Lenci, Valeria Quochi, Jan Odijk, Tommaso Caselli, Claudia Soria, Chu-Ren Huang, Marc Verhagen, and Kiyong Lee. The contribution by Olga Batiukova was partially financed by the Ministry of Economy and Competitiveness of Spain under Grant No. FFI2009-12191 (Subprogram FILO).

Author information

Authors and Affiliations

Department of Computer Science, Brandeis University, 415 South Street, Waltham, MA, 02454, USA
James Pustejovsky & Jessica L. Moszkowicz
Department of Computer Science, University of Massachusetts, One University Avenue, Lowell, MA, 01854, USA
Anna Rumshisky
Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
Anna Rumshisky
Department of Spanish Philology, Autonomous University of Madrid, Ciudad Universitaria Cantoblanco, Carretera de Colmenar, km. 15, 28049, Cantoblanco, Madrid, Spain
Olga Batiukova

Corresponding author

Correspondence to James Pustejovsky.

Editor information

Editors and Affiliations

Tilburg Center for Cognition and Communication, Tilburg University, Tilburg, The Netherlands
Harry Bunt
Alfa-Informatica, Rijksuniversiteit Groningen, Groningen, The Netherlands
Johan Bos
Department of Computer Science, Wolfson Building, Oxford University, Oxford, United Kingdom
Stephen Pulman

About this chapter

Cite this chapter

Pustejovsky, J., Rumshisky, A., Batiukova, O., Moszkowicz, J.L. (2014). Annotation of Compositional Operations with GLML. In: Bunt, H., Bos, J., Pulman, S. (eds) Computing Meaning. Text, Speech and Language Technology, vol 47. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-7284-7_12

DOI: https://doi.org/10.1007/978-94-007-7284-7_12
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-007-7283-0
Online ISBN: 978-94-007-7284-7
eBook Packages: Computer Science, Computer Science (R0)