Abstract
With the spread of semantic technologies, more and more companies manage their own knowledge graphs (KGs) and apply them, among other tasks, to text analysis. However, proprietary KGs are domain-specific by design and do not include all possible meanings of the words used in a corpus. To enable the use of these KGs for automatic text annotation, we introduce a robust method for discriminating word senses using sense indicators found in the KG: types, synonyms and/or hypernyms. The method uses collocations to induce word senses and to discriminate the sense included in the KG from the other senses, requiring neither information about the latter nor manual effort. On two datasets created specifically for this task, the method outperforms the baseline and achieves accuracy above 80%.
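The idea summarized in the abstract can be illustrated with a small, heavily simplified sketch. It is not the authors' implementation: the Dice coefficient as the association measure, the fixed threshold, the use of connected components as induced senses, and all function names and toy contexts are assumptions made purely for illustration.

```python
from collections import defaultdict
from itertools import combinations


def dice(pair_count, count_a, count_b):
    """Dice coefficient of two words from their (co-)occurrence counts."""
    return 2.0 * pair_count / (count_a + count_b)


def induce_senses(contexts, threshold=0.5):
    """Group context words into clusters (assumed stand-in for senses).

    Words are linked when their Dice score reaches the threshold;
    each connected component of the resulting graph is one cluster.
    """
    word_count = defaultdict(int)
    pair_count = defaultdict(int)
    for ctx in contexts:
        words = sorted(set(ctx))
        for w in words:
            word_count[w] += 1
        for a, b in combinations(words, 2):
            pair_count[(a, b)] += 1

    graph = defaultdict(set)
    for (a, b), n in pair_count.items():
        if dice(n, word_count[a], word_count[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)

    seen, clusters = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


def is_kg_sense(context, clusters, indicators):
    """True if the best-matching cluster contains a KG sense indicator."""
    words = set(context)
    best = max(clusters, key=lambda c: len(c & words), default=set())
    return bool(best & words) and bool(best & indicators)


# Hypothetical contexts of the ambiguous word "americano" (target removed).
contexts = [
    ["cocktail", "campari", "vermouth"],
    ["cocktail", "campari", "soda"],
    ["campari", "vermouth", "soda"],
    ["coffee", "espresso", "water"],
    ["coffee", "espresso", "milk"],
    ["espresso", "water", "milk"],
]
clusters = induce_senses(contexts)
indicators = {"cocktail"}  # assumed hypernym taken from the KG

in_kg = is_kg_sense(["campari", "soda", "ice"], clusters, indicators)
not_in_kg = is_kg_sense(["espresso", "milk"], clusters, indicators)
print(len(clusters), in_kg, not_in_kg)
```

Only the sense that shares a cluster with a KG indicator is accepted, so the other senses need no description of their own; this mirrors the claim in the abstract, though a real system would use a far larger corpus and a more careful clustering step.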
Notes
- 1. https://meshb.nlm.nih.gov/ (visited on 05.03.2019).
- 2. https://meshb.nlm.nih.gov/MeSHonDemand (visited on 05.03.2019).
- 3. The additional “hyper” stems from “hypernym” as one of the most useful sense indicators.
- 4. github.com/artreven/thesaural_wsi (visited on 05.03.2019).
- 5. vocabulary.semantic-web.at/cocktails (visited on 05.03.2019).
- 6. www.nlm.nih.gov/mesh/ (visited on 05.03.2019).
- 7. www.cs.york.ac.uk/semeval-2013/task13/ (visited on 05.03.2019).
- 8. wacky.sslmit.unibo.it/doku.php (visited on 05.03.2019).
Acknowledgements
This work was supported in part by the H2020 project Prêt-à-LLOD under Grant Agreement number 825182.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Revenko, A., Mireles, V. (2019). The Use of Class Assertions and Hypernyms to Induce and Disambiguate Word Senses. In: Anderst-Kotsis, G., et al. Database and Expert Systems Applications. DEXA 2019. Communications in Computer and Information Science, vol 1062. Springer, Cham. https://doi.org/10.1007/978-3-030-27684-3_22
DOI: https://doi.org/10.1007/978-3-030-27684-3_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-27683-6
Online ISBN: 978-3-030-27684-3
eBook Packages: Computer Science (R0)