A Quantitative Evaluation of Global Word Sense Induction

Apidianaki, Marianna; Van de Cruys, Tim

doi:10.1007/978-3-642-19400-9_20


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6608)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics


Abstract

Word sense induction (WSI) is the task of automatically identifying the senses of words in texts, without the need for handcrafted resources or annotated data. To date, most WSI algorithms have extracted the different senses of a word ‘locally’, on a per-word basis: the senses of each word are determined separately. In this paper, we compare the performance of such algorithms to that of an algorithm that uses a ‘global’ approach, in which the senses of a particular word are determined by comparing them to, and demarcating them from, the senses of other words in a full-blown word space model. We adopt the evaluation framework proposed in the SemEval-2010 Word Sense Induction & Disambiguation task. All systems that participated in this task use a local scheme for determining the different senses of a word. We compare their results to the ones obtained by the global approach, and discuss the advantages and weaknesses of both approaches.
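
To make the evaluation setting concrete: the unsupervised part of the SemEval-2010 framework scores the induced sense clusters of a target word against gold-standard senses with clustering measures, one of which is V-measure (the harmonic mean of homogeneity and completeness). The following is a minimal sketch of that measure, not the official task scorer; the function name `v_measure` and the toy labels are illustrative assumptions.

```python
from collections import Counter
from math import log


def v_measure(gold, induced):
    """Return (homogeneity, completeness, V) for two parallel label lists.

    gold[i] is the gold sense of instance i; induced[i] is the cluster the
    system assigned to it. Names and input layout are assumptions made for
    this sketch.
    """
    n = len(gold)
    joint = Counter(zip(gold, induced))   # co-occurrence counts (sense, cluster)
    gold_counts = Counter(gold)
    ind_counts = Counter(induced)

    def entropy(counts):
        return -sum(c / n * log(c / n) for c in counts.values())

    h_gold = entropy(gold_counts)
    h_ind = entropy(ind_counts)

    # Conditional entropies H(gold | induced) and H(induced | gold).
    h_gold_given_ind = -sum(
        c / n * log(c / ind_counts[k]) for (g, k), c in joint.items()
    )
    h_ind_given_gold = -sum(
        c / n * log(c / gold_counts[g]) for (g, k), c in joint.items()
    )

    # By convention both scores are 1 when the reference entropy is zero.
    homogeneity = 1.0 if h_gold == 0 else 1.0 - h_gold_given_ind / h_gold
    completeness = 1.0 if h_ind == 0 else 1.0 - h_ind_given_gold / h_ind

    total = homogeneity + completeness
    v = 0.0 if total == 0 else 2 * homogeneity * completeness / total
    return homogeneity, completeness, v


# Toy data: four instances of one target word with two gold senses.
gold = ["s1", "s1", "s2", "s2"]
print(v_measure(gold, [1, 1, 0, 0]))   # clusters match senses up to renaming
print(v_measure(gold, [0, 1, 2, 3]))   # one cluster per instance
print(v_measure(gold, [0, 0, 0, 0]))   # everything in a single cluster
```

The second toy call shows why the measure matters when comparing systems: placing every instance in its own cluster is perfectly homogeneous but only half complete, so a system that induces many fine-grained clusters is rewarded on homogeneity and penalised on completeness.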



Author information

Authors and Affiliations

Domaine de Voluceau, Alpage, INRIA & University Paris 7, Rocquencourt, B.P. 105, F-78153, Le Chesnay cedex, France
Marianna Apidianaki & Tim Van de Cruys


Editor information

Editors and Affiliations

Center for Computing Research, National Polytechnic Institute, Mexico
Alexander F. Gelbukh


Cite this paper

Apidianaki, M., Van de Cruys, T. (2011). A Quantitative Evaluation of Global Word Sense Induction. In: Gelbukh, A.F. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2011. Lecture Notes in Computer Science, vol 6608. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19400-9_20


DOI: https://doi.org/10.1007/978-3-642-19400-9_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19399-6
Online ISBN: 978-3-642-19400-9
eBook Packages: Computer Science, Computer Science (R0)
