Analysis of the Semantic Distance of Words in the RuWordNet Thesaurus

  • Conference paper
Data Analytics and Management in Data Intensive Domains (DAMDID/RCDL 2020)

Abstract

This article analyzes analogues in the RuWordNet thesaurus with respect to their semantic distance. The analysis is comprehensive by nature: it identifies the possible routes between each pair of words, applying both traditional linguistic methods and approaches from cognitive and corpus linguistics. The lack of semantic connections between some members of a series, together with the inconsistency of their routes with the general linguistic representations established in dictionaries, accounts for the distance between words in the RuWordNet thesaurus. Methods for bringing closer together analogues that are remote in RuWordNet are proposed on the basis of a comparative analysis of data from explanatory dictionaries and from the New Explanatory Dictionary of Synonyms of the Russian Language, taking into account the semantic relations of inclusion or intersection of meanings. Based on the data obtained, recommendations for reducing the distances between analogues in RuWordNet are formulated, and general principles of analysis are proposed that could be useful for verifying RuWordNet.
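As an illustration of what a "route" and a "distance" between two words can mean in a thesaurus graph, the following minimal sketch computes the number of edges on a shortest route between two entries using breadth-first search. It assumes that distance is measured as a shortest path length over undirected relations; the abstract does not state the exact measure used. The function name `semantic_distance`, the toy graph, and its English words are hypothetical and are not taken from RuWordNet.

```python
from collections import deque


def semantic_distance(graph, source, target):
    """Return the number of edges on a shortest route from source to target.

    graph maps each word to the set of words it is directly related to.
    Returns None when no route exists (the words are not connected).
    """
    if source == target:
        return 0
    visited = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == target:
                return dist + 1
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


# Hypothetical toy graph (assumption: relations are symmetric).
toy_graph = {
    "noise": {"sound", "din"},
    "din": {"noise"},
    "sound": {"noise", "phenomenon"},
    "phenomenon": {"sound"},
    "silence": set(),
}

print(semantic_distance(toy_graph, "din", "phenomenon"))
print(semantic_distance(toy_graph, "din", "silence"))
```

Under this reading, a pair of analogues with no route, or with a long one, is a candidate for the kind of expert review the abstract describes.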

Acknowledgments

This research was financially supported by RFBR, grant No. 18-00-01238, and by the Russian Government Program of Competitive Growth of Kazan Federal University.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Usmanova, L., Erofeeva, I., Solovyev, V., Bochkarev, V. (2021). Analysis of the Semantic Distance of Words in the RuWordNet Thesaurus. In: Sychev, A., Makhortov, S., Thalheim, B. (eds) Data Analytics and Management in Data Intensive Domains. DAMDID/RCDL 2020. Communications in Computer and Information Science, vol 1427. Springer, Cham. https://doi.org/10.1007/978-3-030-81200-3_5

  • DOI: https://doi.org/10.1007/978-3-030-81200-3_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-81199-0

  • Online ISBN: 978-3-030-81200-3

  • eBook Packages: Computer Science (R0)
