A Semantic Similarity Measurement Tool for WordNet-Like Databases

  • Marek Kubis
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10930)

Abstract

The paper describes a new framework for computing the semantic similarity of words and concepts using WordNet-like databases. The main advantage of the presented approach is that similarity measures can be implemented as concise expressions in the embedded query language. Preliminary results of using the framework to model the semantic similarity of Polish nouns are reported.

Keywords

WordNet-based Similarity Measures; Polish Nouns; Embedded Query Languages; plWordNet; PolNet
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Faculty of Mathematics and Computer Science, Department of Computer Linguistics and Artificial Intelligence, Adam Mickiewicz University, Poznań, Poland
