An Experimental Study on Unsupervised Graph-based Word Sense Disambiguation

Tsatsaronis, George; Varlamis, Iraklis; Nørvåg, Kjetil

doi:10.1007/978-3-642-12116-6_16

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6008)

Included in the following conference series:

International Conference on Intelligent Text Processing and Computational Linguistics

Abstract

Recent work on unsupervised word sense disambiguation reports performance gains that narrow the gap with supervised approaches to the same task. Among the latest state-of-the-art methods, those that use semantic graphs have reported the best results. Such methods create a graph comprising the words to be disambiguated and their candidate senses. The graph is then expanded by adding semantic edges and nodes from a thesaurus. The most appropriate sense for each word occurrence is selected using graph processing algorithms that assign a degree of importance to the graph vertices. In this paper we experimentally investigate the performance of such methods. We additionally evaluate a new method based on P-Rank, a recently introduced algorithm for computing similarity between graph vertices. We evaluate all alternatives on two benchmark data sets, Senseval 2 and 3, using WordNet. The study shows how the performance of each method differs when all are applied to the same semantic graph representation, and analyzes the pros and cons of each method for each part of speech separately. Furthermore, it analyzes the level of agreement among the methods in sense selection, giving further insight into how these methods could be combined in an unsupervised ensemble for word sense disambiguation.
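
The pipeline outlined in the abstract (build a graph of candidate senses, expand it with thesaurus nodes, rank the vertices, pick the top-ranked sense per word) can be illustrated with a minimal sketch. This is not the authors' implementation: PageRank serves here as a stand-in for the vertex-ranking step, and the graph, the sense labels such as `bank#finance`, and the helper names `pagerank` and `disambiguate` are all hypothetical toy choices rather than WordNet data.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank on an undirected graph given as (u, v) pairs."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    nodes = sorted(neighbors)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Each node keeps a teleport share and receives an equal split
        # of every neighbor's current rank.
        rank = {
            node: (1 - damping) / n
            + damping * sum(rank[m] / len(neighbors[m]) for m in neighbors[node])
            for node in nodes
        }
    return rank


def disambiguate(candidates, edges):
    """Pick, for each word, the candidate sense with the highest rank."""
    rank = pagerank(edges)
    return {
        word: max(senses, key=lambda s: rank.get(s, 0.0))
        for word, senses in candidates.items()
    }


# Hypothetical toy input: three ambiguous words and their candidate senses.
CANDIDATES = {
    "bank": ["bank#finance", "bank#river"],
    "deposit": ["deposit#money", "deposit#sediment"],
    "interest": ["interest#money", "interest#curiosity"],
}

# Hypothetical semantic edges. The plain labels ("river", "curiosity",
# "wonder") play the role of nodes added from a thesaurus.
EDGES = [
    ("bank#finance", "deposit#money"),
    ("bank#finance", "interest#money"),
    ("deposit#money", "interest#money"),
    ("bank#river", "river"),
    ("deposit#sediment", "river"),
    ("interest#curiosity", "curiosity"),
    ("curiosity", "wonder"),
]

if __name__ == "__main__":
    for word, sense in disambiguate(CANDIDATES, EDGES).items():
        print(word, "->", sense)
```

In this toy graph the three financial senses link directly to one another, so they outrank the alternatives and are selected. Replacing `pagerank` with another vertex-scoring function while keeping the graph fixed mirrors the kind of comparison the study describes.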

Author information

Authors and Affiliations

Department of Computer and Information Science, Norwegian University of Science and Technology,
George Tsatsaronis & Kjetil Nørvåg
Department of Informatics and Telematics, Harokopio University of Athens,
Iraklis Varlamis

Editor information

Editors and Affiliations

Center for Computing Research, National Polytechnic Institute, 07738, Mexico City, Mexico
Alexander Gelbukh

Cite this paper

Tsatsaronis, G., Varlamis, I., Nørvåg, K. (2010). An Experimental Study on Unsupervised Graph-based Word Sense Disambiguation. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2010. Lecture Notes in Computer Science, vol 6008. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12116-6_16

DOI: https://doi.org/10.1007/978-3-642-12116-6_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12115-9
Online ISBN: 978-3-642-12116-6
eBook Packages: Computer Science, Computer Science (R0)
