G-WordNet: Moving WordNet 3.0 and Its Resources to a Graph Database

  • Conference paper
Advances in Computing (CCC 2017)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 735))

Abstract

This paper studies the suitability of storing a large lexical database (WordNet) in a graph database management system (Neo4j). The result is G-WordNet, a freely available lexical database based on Princeton WordNet 3.0 and all of its sense-annotated corpora. We justify the need for this resource and describe the advantages of a graph database over previous approaches. In addition, we present an example application of G-WordNet to the task of lexical semantic similarity, written in the declarative query language Cypher. We also discuss possible usage scenarios for G-WordNet in research, development, and teaching, particularly in the fields of lexicography, computational linguistics, natural language processing, and NoSQL databases.

Author information

Corresponding author

Correspondence to Sergio Jimenez.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Jimenez, S., Dueñas, G. (2017). G-WordNet: Moving WordNet 3.0 and Its Resources to a Graph Database. In: Solano, A., Ordoñez, H. (eds) Advances in Computing. CCC 2017. Communications in Computer and Information Science, vol 735. Springer, Cham. https://doi.org/10.1007/978-3-319-66562-7_8

  • DOI: https://doi.org/10.1007/978-3-319-66562-7_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-66561-0

  • Online ISBN: 978-3-319-66562-7

  • eBook Packages: Computer Science (R0)
