G-WordNet: Moving WordNet 3.0 and Its Resources to a Graph Database

Jimenez, Sergio; Dueñas, George

doi:10.1007/978-3-319-66562-7_8


Part of the book series: Communications in Computer and Information Science (CCIS, volume 735)

Included in the following conference series:

Colombian Conference on Computing


Abstract

This paper studies the convenience of storing a large lexical database (WordNet) in a graph database management system (Neo4j). The result is G-WordNet, a freely available lexical database based on the Princeton WordNet 3.0 and all its sense-annotated corpora. We justify the need for this resource and describe the advantages of using a graph database over previous approaches. In addition, we present an example application of G-WordNet to the task of lexical semantic similarity using the declarative query language Cypher. We also discuss possible usage scenarios of G-WordNet for research, development, and teaching, particularly in lexicography, computational linguistics, natural language processing, and NoSQL databases.
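To make the lexical similarity application more concrete, the following is a minimal Python sketch of one common path-based measure, 1 / (1 + d), where d is the shortest-path length between two concepts in a hypernym graph. It is an illustration only and not the authors' method or the G-WordNet schema: the edge list is a small hand-picked fragment with simplified labels, and the names `HYPERNYM_EDGES`, `build_adjacency`, `shortest_path_length`, and `path_similarity` are hypothetical. In a graph database the traversal step would normally be delegated to the query engine rather than coded by hand.

```python
from collections import deque

# Toy hypernym edges (child, parent). This is an assumed, hand-picked fragment
# for illustration; it is NOT the G-WordNet data model or its contents.
HYPERNYM_EDGES = [
    ("dog", "canine"),
    ("canine", "carnivore"),
    ("cat", "feline"),
    ("feline", "carnivore"),
    ("carnivore", "mammal"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map so paths can go up and down the hierarchy."""
    adjacency = {}
    for child, parent in edges:
        adjacency.setdefault(child, set()).add(parent)
        adjacency.setdefault(parent, set()).add(child)
    return adjacency


def shortest_path_length(adjacency, source, target):
    """Return the number of edges on a shortest path, or None if no path exists."""
    if source not in adjacency or target not in adjacency:
        return None
    if source == target:
        return 0
    visited = {source}
    queue = deque([(source, 0)])
    # Breadth-first search: the first time the target is reached, the path is shortest.
    while queue:
        node, distance = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor == target:
                return distance + 1
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, distance + 1))
    return None


def path_similarity(adjacency, first, second):
    """Path-based similarity 1 / (1 + d); None when the concepts are not connected."""
    distance = shortest_path_length(adjacency, first, second)
    if distance is None:
        return None
    return 1.0 / (1.0 + distance)


if __name__ == "__main__":
    graph = build_adjacency(HYPERNYM_EDGES)
    print(path_similarity(graph, "dog", "cat"))
    print(path_similarity(graph, "dog", "canine"))
```

The sketch treats hypernym edges as undirected so that two concepts can be connected through a shared ancestor; other WordNet-based measures weight depth or information content instead and would need additional data.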



Author information

Authors and Affiliations

Instituto Caro y Cuervo, Bogotá, Colombia
Sergio Jimenez & George Dueñas


Corresponding author

Correspondence to Sergio Jimenez.

Editor information

Editors and Affiliations

Universidad Autónoma de Occidente, Cali, Colombia
Andrés Solano
Universidad de San Buenaventura, Cali, Colombia
Hugo Ordoñez


About this paper

Cite this paper

Jimenez, S., Dueñas, G. (2017). G-WordNet: Moving WordNet 3.0 and Its Resources to a Graph Database. In: Solano, A., Ordoñez, H. (eds) Advances in Computing. CCC 2017. Communications in Computer and Information Science, vol 735. Springer, Cham. https://doi.org/10.1007/978-3-319-66562-7_8


DOI: https://doi.org/10.1007/978-3-319-66562-7_8
Published: 17 August 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66561-0
Online ISBN: 978-3-319-66562-7
eBook Packages: Computer Science, Computer Science (R0)
