Abstract
This paper presents a novel approach to generating context templates for word sense disambiguation (WSD). The context information of an ambiguous word, in the form of feature vectors, is first classified into coarse-grained semantic categories by topic features using the latent Dirichlet allocation (LDA) algorithm. To further refine the sense tags, all feature vectors of the ambiguous word under the same topic are recast into a network. Various centrality measures are derived to identify the features, or context words, in the context templates that are most influential in disambiguation. WSD is achieved by identifying the maximum pairwise similarities between the context encoded in the templates and the sentence. The correct sense of an ambiguous word is resolved by selecting the most activated template, without being trapped in a subjective linguistic quagmire. The approach is assessed on a corpus of more than 1,000,000 words. Experimental results show that the best measures perform comparably to the state of the art.
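The network and template steps described above can be illustrated with a minimal sketch. It is not the author's implementation: it skips the LDA topic step, assumes contexts are already grouped by sense, uses weighted degree as the only centrality measure, and replaces the pairwise similarity with exact word overlap. All function names and the toy data for the word "bank" are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_network(contexts):
    """Undirected co-occurrence graph; an edge weight counts the
    contexts in which two words appear together."""
    graph = defaultdict(Counter)
    for words in contexts:
        for a, b in combinations(sorted(set(words)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def degree_centrality(graph):
    """Weighted degree of each word, scaled by the largest value."""
    strength = {node: sum(nbrs.values()) for node, nbrs in graph.items()}
    top = max(strength.values(), default=1)
    return {node: s / top for node, s in strength.items()}


def make_template(contexts, size=3):
    """Keep the `size` most central context words as the template."""
    centrality = degree_centrality(build_network(contexts))
    ranked = sorted(centrality, key=lambda w: (-centrality[w], w))
    return {w: centrality[w] for w in ranked[:size]}


def disambiguate(sentence, templates):
    """Return the sense whose template is most activated by the sentence.
    Activation here is the summed centrality of matching words, a simple
    stand-in for the similarity measure mentioned in the abstract."""
    words = set(sentence)
    scores = {
        sense: sum(weight for w, weight in tpl.items() if w in words)
        for sense, tpl in templates.items()
    }
    return max(scores, key=scores.get), scores


# Toy sense-grouped contexts for the ambiguous word "bank" (invented data).
contexts_finance = [
    ["money", "loan", "deposit"],
    ["loan", "interest", "money"],
    ["deposit", "account", "money"],
]
contexts_river = [
    ["river", "water", "shore"],
    ["river", "fishing", "water"],
    ["shore", "river", "mud"],
]
templates = {
    "finance": make_template(contexts_finance),
    "river": make_template(contexts_river),
}
sentence = ["he", "sat", "on", "the", "bank", "watching", "the", "water"]
best_sense, scores = disambiguate(sentence, templates)
print(best_sense)  # river
```

Other centrality measures, such as betweenness or closeness, could be swapped in for `degree_centrality` without changing the rest of the sketch.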
Copyright information
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Chan, S.W.K. (2013). Generating Context Templates for Word Sense Disambiguation. In: Cranefield, S., Nayak, A. (eds) AI 2013: Advances in Artificial Intelligence. AI 2013. Lecture Notes in Computer Science, vol 8272. Springer, Cham. https://doi.org/10.1007/978-3-319-03680-9_47
DOI: https://doi.org/10.1007/978-3-319-03680-9_47
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-03679-3
Online ISBN: 978-3-319-03680-9
eBook Packages: Computer Science, Computer Science (R0)