Abstract
Recent work on unsupervised word sense disambiguation reports performance gains that narrow the gap with supervised approaches to the same task. Among the latest state-of-the-art methods, those that use semantic graphs have reported the best results. Such methods create a graph comprising the words to be disambiguated and their corresponding candidate senses. The graph is expanded by adding semantic edges and nodes from a thesaurus. The most appropriate sense for each word occurrence is then selected using graph processing algorithms that rank the graph vertices by importance. In this paper we experimentally investigate the performance of such methods. We additionally evaluate a new method based on P-Rank, a recently introduced algorithm for computing similarity between graph vertices. We evaluate the performance of all alternatives on two benchmark data sets, Senseval 2 and 3, using WordNet. The study shows how the methods differ in performance when applied to the same semantic graph representation, and analyzes the pros and cons of each method for each part of speech separately. Furthermore, it analyzes the level of agreement among the methods in sense selection, giving further insight into how these methods could be employed in an unsupervised ensemble for word sense disambiguation.
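The general pipeline described in the abstract (build a graph over candidate senses, rank the vertices, keep the top-ranked sense per word) can be illustrated with a minimal sketch. It uses plain PageRank as a stand-in for the vertex-ranking step. The toy graph, the node labels, and the function names `pagerank` and `pick_senses` are hypothetical and chosen for illustration only; they are not the paper's graph construction, its WordNet-based expansion, or its P-Rank method.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank on an undirected graph given as (u, v) pairs."""
    neighbors = defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    nodes = sorted(neighbors)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Each node receives a uniform teleport share plus a damped share
        # of every neighbor's rank, split evenly over that neighbor's edges.
        rank = {
            node: (1 - damping) / n
            + damping * sum(rank[m] / len(neighbors[m]) for m in neighbors[node])
            for node in nodes
        }
    return rank


def pick_senses(candidates, rank):
    """For each word, keep the candidate sense with the highest rank."""
    return {
        word: max(senses, key=lambda s: rank.get(s, 0.0))
        for word, senses in candidates.items()
    }


# Hypothetical candidate senses for two words in one context.
candidates = {
    "bank": ["bank#finance", "bank#river"],
    "deposit": ["deposit#money", "deposit#sediment"],
}

# Hypothetical semantic edges, standing in for thesaurus relations.
edges = [
    ("bank#finance", "deposit#money"),
    ("bank#finance", "financial_institution"),
    ("deposit#money", "financial_institution"),
    ("bank#finance", "money"),
    ("deposit#money", "money"),
    ("bank#river", "slope"),
    ("deposit#sediment", "geology"),
]

rank = pagerank(edges)
senses = pick_senses(candidates, rank)
print(senses)
```

In this toy graph the two financial senses support each other through shared neighbors, so they end up ranked above the isolated alternatives. Swapping `pagerank` for another vertex-scoring function leaves `pick_senses` unchanged, which is the kind of like-for-like comparison on a fixed graph that the abstract describes.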
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tsatsaronis, G., Varlamis, I., Nørvåg, K. (2010). An Experimental Study on Unsupervised Graph-based Word Sense Disambiguation. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2010. Lecture Notes in Computer Science, vol 6008. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12116-6_16
DOI: https://doi.org/10.1007/978-3-642-12116-6_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12115-9
Online ISBN: 978-3-642-12116-6
eBook Packages: Computer Science, Computer Science (R0)