Abstract
This paper studies the convenience of storing a large lexical database (WordNet) in a graph database management system (Neo4j). The result is G-WordNet, a freely available lexical database based on Princeton WordNet 3.0 and all its sense-annotated corpora. We justify the need for this resource and discuss the advantages of using a graph database over previous approaches. In addition, we present an application example of G-WordNet for the task of lexical semantic similarity using the declarative query language Cypher. We also discuss possible usage scenarios of G-WordNet, particularly in lexicography, computational linguistics, natural language processing, and NoSQL databases, for research, development, and teaching.
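To illustrate the kind of computation that lexical semantic similarity over a graph involves, the following is a minimal Python sketch, not the paper's implementation. It assumes a path-based measure, 1 / (1 + shortest path length), which is a common WordNet-style choice; the abstract does not say which measure G-WordNet uses. The edge list and all function names are hypothetical and chosen only for illustration.

```python
from collections import deque

# Toy hypernym edges (child, parent). Hypothetical data for illustration,
# not taken from G-WordNet or WordNet 3.0.
HYPERNYM_EDGES = [
    ("dog", "canine"),
    ("canine", "carnivore"),
    ("cat", "feline"),
    ("feline", "carnivore"),
    ("carnivore", "mammal"),
]


def build_graph(edges):
    """Build an undirected adjacency map from (child, parent) pairs."""
    graph = {}
    for child, parent in edges:
        graph.setdefault(child, set()).add(parent)
        graph.setdefault(parent, set()).add(child)
    return graph


def shortest_path_length(graph, source, target):
    """Breadth-first search; returns the edge count, or None if unreachable."""
    if source not in graph or target not in graph:
        return None
    queue = deque([(source, 0)])
    seen = {source}
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def path_similarity(graph, a, b):
    """Assumed measure: 1 / (1 + shortest path length between a and b)."""
    dist = shortest_path_length(graph, a, b)
    return None if dist is None else 1.0 / (1 + dist)


if __name__ == "__main__":
    g = build_graph(HYPERNYM_EDGES)
    print(path_similarity(g, "dog", "cat"))
```

In a graph database, the breadth-first search would typically be delegated to the query engine rather than written by hand, which is one reason a graph store is a natural fit for this kind of measure.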
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Jimenez, S., Dueñas, G. (2017). G-WordNet: Moving WordNet 3.0 and Its Resources to a Graph Database. In: Solano, A., Ordoñez, H. (eds) Advances in Computing. CCC 2017. Communications in Computer and Information Science, vol 735. Springer, Cham. https://doi.org/10.1007/978-3-319-66562-7_8
DOI: https://doi.org/10.1007/978-3-319-66562-7_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66561-0
Online ISBN: 978-3-319-66562-7
eBook Packages: Computer Science (R0)