Measuring Similarity of Word Meaning in Context with Lexical Substitutes and Translations

McCarthy, Diana

doi:10.1007/978-3-642-19400-9_19

Diana McCarthy

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6608)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

Representation of word meaning has been a topic of considerable debate within the field of computational linguistics, and particularly in the subfield of word sense disambiguation. While word senses enumerated in manually produced inventories have been very useful as a starting point for research, we know that the inventory should be selected for the purposes of the application. Unfortunately, we have no clear understanding of how to determine the appropriateness of an inventory for monolingual applications, or for cross-lingual applications when the target language is unknown. In this paper we examine datasets which have paraphrases or translations as alternative annotations of lexical meaning on the same underlying corpus data. We demonstrate that overlap in lexical paraphrases (substitutes) between two uses of the same lemma correlates with overlap in translations. We compare the degree of overlap with annotations of usage similarity on the same data and show that the overlaps in paraphrases or translations also correlate with the similarity judgements. This bodes well for using any of these methods to evaluate unsupervised representations of lexical semantics. We do, however, find that the relationship breaks down for some lemmas, but this lemma-by-lemma behaviour itself correlates with low inter-tagger agreement and higher proportions of mid-range points on a usage similarity dataset. Lemmas which have many inter-related usages might potentially be predicted from such data.
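
The comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not state which overlap measure or correlation statistic the paper uses, so the code below assumes a multiset Jaccard-style overlap and Spearman's rank correlation. The sentence identifiers, substitutes, and translations are invented for a hypothetical lemma and are not taken from the datasets; they only show the mechanics of scoring every pair of usages twice and correlating the two score lists.

```python
from collections import Counter
from itertools import combinations


def overlap(a, b):
    """Multiset overlap (assumed measure): shared mass / combined mass, in [0, 1]."""
    ca, cb = Counter(a), Counter(b)
    shared = sum((ca & cb).values())   # multiset intersection
    total = sum((ca | cb).values())    # multiset union
    return shared / total if total else 0.0


def ranks(values):
    """1-based ranks; tied values receive the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; returns NaN when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical annotations for four usages of one lemma (made-up data).
substitutes = {
    "s1": ["shiny", "shiny", "luminous"],
    "s2": ["shiny", "luminous", "vivid"],
    "s3": ["clever", "smart", "smart"],
    "s4": ["smart", "intelligent", "clever"],
}
translations = {
    "s1": ["brillante", "luminoso"],
    "s2": ["brillante", "luminoso", "vivo"],
    "s3": ["inteligente", "listo"],
    "s4": ["inteligente", "listo", "listo"],
}

# Score every pair of usages under both annotation types, then correlate.
pairs = list(combinations(sorted(substitutes), 2))
sub_scores = [overlap(substitutes[p], substitutes[q]) for p, q in pairs]
trans_scores = [overlap(translations[p], translations[q]) for p, q in pairs]
rho = spearman(sub_scores, trans_scores)
print(f"Spearman rho over {len(pairs)} usage pairs: {rho:.3f}")
```

The same `spearman` helper could be applied to either overlap list against human usage-similarity ratings for the same pairs, and computed separately per lemma to look for lemmas where the relationship weakens.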

Author information

Authors and Affiliations

Lexical Computing Ltd., Brighton, BN1 6WE, UK
Diana McCarthy

Editor information

Editors and Affiliations

Center for Computing Research, National Polytechnic Institute, Mexico
Alexander F. Gelbukh

About this paper

Cite this paper

McCarthy, D. (2011). Measuring Similarity of Word Meaning in Context with Lexical Substitutes and Translations. In: Gelbukh, A.F. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2011. Lecture Notes in Computer Science, vol 6608. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19400-9_19

DOI: https://doi.org/10.1007/978-3-642-19400-9_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19399-6
Online ISBN: 978-3-642-19400-9
eBook Packages: Computer Science, Computer Science (R0)
