A Comparison of Word Similarity Measures for Noun Compound Disambiguation

Nulty, Paul; Costello, Fintan

doi:10.1007/978-3-642-17080-5_25

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6206)

Included in the following conference series:

Irish Conference on Artificial Intelligence and Cognitive Science

Abstract

Noun compounds occur frequently in many languages, and the semantic disambiguation of these phrases has many potential applications in natural language processing and other areas. One very common approach to this problem is to define a set of semantic relations that capture the interaction between the modifier and the head noun, and then to assign one of these relations to each compound. For example, the compound flu virus could be assigned the semantic relation causal (the virus causes the flu); the relation for desert wind could be location (the wind is located in the desert). In this paper we investigate methods for learning the correct semantic relation for a given noun compound by comparing the new compound to a training set of hand-tagged instances, using the similarity of the words in each compound. The main contribution of this paper is a direct comparison of distributional and knowledge-based word similarity measures for this task, using various datasets and corpora. We find that the knowledge-based system performs much better when adequate training data is available.
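The general idea in the abstract, labelling a new compound by comparing it to hand-tagged compounds through word similarity, can be illustrated with a minimal sketch. This is not the authors' system: the tiny is-a taxonomy, the path-based similarity, the averaging of modifier and head similarity, and the single-nearest-neighbour decision rule are all assumptions made for illustration, and every name below (PARENT, TRAINING, classify, and so on) is hypothetical. A real knowledge-based measure would draw on a resource such as WordNet, and a distributional measure on corpus co-occurrence statistics.

```python
# Hypothetical toy is-a taxonomy (child -> parent), standing in for a
# knowledge base such as WordNet. Not taken from the paper.
PARENT = {
    "flu": "illness", "cold": "illness", "illness": "entity",
    "virus": "microbe", "bacterium": "microbe", "microbe": "entity",
    "desert": "place", "mountain": "place", "place": "entity",
    "wind": "weather", "storm": "weather", "weather": "entity",
}

# Hand-tagged training compounds: ((modifier, head), relation).
# The two examples are the ones given in the abstract.
TRAINING = [
    (("flu", "virus"), "causal"),
    (("desert", "wind"), "location"),
]


def ancestors(word):
    """Return the chain from word up to the taxonomy root, word first."""
    chain = [word]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def word_similarity(a, b):
    """Path-based similarity: 1 / (1 + edges on the shortest is-a path)."""
    chain_a, chain_b = ancestors(a), ancestors(b)
    for steps_a, node in enumerate(chain_a):
        if node in chain_b:
            return 1.0 / (1 + steps_a + chain_b.index(node))
    return 0.0  # no shared ancestor, e.g. an unknown word


def compound_similarity(c1, c2):
    """Average of modifier-to-modifier and head-to-head similarity."""
    return (word_similarity(c1[0], c2[0]) + word_similarity(c1[1], c2[1])) / 2


def classify(compound):
    """Assign the relation of the most similar training compound (1-NN)."""
    _, relation = max(TRAINING, key=lambda ex: compound_similarity(compound, ex[0]))
    return relation


if __name__ == "__main__":
    for compound in [("cold", "bacterium"), ("mountain", "storm")]:
        print(compound, "->", classify(compound))
```

Swapping word_similarity for a different measure while leaving the rest unchanged is one simple way to set up the kind of comparison the abstract describes.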

Author information

Authors and Affiliations

School of Computer Science and Informatics, University College Dublin, Dublin 4, Ireland
Paul Nulty & Fintan Costello

Editor information

Editors and Affiliations

Lero, International Science Centre, University of Limerick, Limerick, Ireland
Lorcan Coyle
CSIRO Tasmanian ICT Centre, GPO Box 1538, 7001, Hobart, Tasmania, Australia
Jill Freyne

About this paper

Cite this paper

Nulty, P., Costello, F. (2010). A Comparison of Word Similarity Measures for Noun Compound Disambiguation. In: Coyle, L., Freyne, J. (eds) Artificial Intelligence and Cognitive Science. AICS 2009. Lecture Notes in Computer Science, vol 6206. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17080-5_25

DOI: https://doi.org/10.1007/978-3-642-17080-5_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17079-9
Online ISBN: 978-3-642-17080-5
eBook Packages: Computer Science, Computer Science (R0)
