RuThes Cloud: Towards a Multilevel Linguistic Linked Open Data Resource for Russian

  • Conference paper
Knowledge Engineering and Semantic Web (KESW 2017)

Abstract

In this paper we present a new multilevel Linguistic Linked Open Data (LLOD) resource for Russian. It covers four linguistic levels: semantic, lexical, morphological, and syntactic. The resource is built on the basis of the well-known RuThes thesaurus and the original, hitherto unpublished Extended Zaliznyak grammatical dictionary. It is represented in terms of the SKOS, Lemon, and LexInfo ontologies and a new custom ontology. In building the resource, we automatically completed the following tasks: merging the source resources on common lexical entries, decomposing complex lexical entries, and publishing the constructed resource as an LLOD-compatible dataset. We demonstrate a use case in which the developed resource is exploited in an information retrieval (IR) task. We hope that our work can serve as a crystallization point of the LLOD cloud for Russian.
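To make the merging step more concrete, the following is a minimal sketch of joining two sources on a shared lemma and emitting SKOS, Lemon, and LexInfo style triples. It is not the authors' pipeline: the toy records, the function names `merge_on_lemma` and `to_triples`, and the `http://example.org/` base namespace are assumptions made purely for illustration, and the custom ontology mentioned in the abstract is not modelled.

```python
# Illustrative sketch only; the data and the BASE namespace are hypothetical.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SKOS = "http://www.w3.org/2004/02/skos/core#"
LEMON = "http://lemon-model.net/lemon#"
LEXINFO = "http://www.lexinfo.net/ontology/2.0/lexinfo#"
BASE = "http://example.org/ruthes-sketch/"  # placeholder, not a real dataset URI


def merge_on_lemma(thesaurus, grammar):
    """Join thesaurus rows (lemma -> concept ids) with grammatical rows
    (lemma -> part of speech) on the shared lemma."""
    merged = {}
    for lemma, concept_ids in thesaurus.items():
        merged[lemma] = {"concepts": list(concept_ids), "pos": None}
    for lemma, pos in grammar.items():
        # Lemmas found only in the grammatical source get an empty concept list.
        merged.setdefault(lemma, {"concepts": [], "pos": None})["pos"] = pos
    return merged


def to_triples(merged):
    """Turn merged entries into (subject, predicate, object) triples."""
    triples = []
    for lemma in sorted(merged):
        info = merged[lemma]
        entry = f"{BASE}entry/{lemma}"
        triples.append((entry, RDF_TYPE, LEMON + "LexicalEntry"))
        if info["pos"] is not None:
            triples.append((entry, LEXINFO + "partOfSpeech", LEXINFO + info["pos"]))
        for cid in info["concepts"]:
            sense = f"{BASE}sense/{lemma}-{cid}"
            concept = f"{BASE}concept/{cid}"
            # A Lemon sense links the lexical entry to the concept it denotes.
            triples.append((entry, LEMON + "sense", sense))
            triples.append((sense, LEMON + "reference", concept))
            triples.append((concept, RDF_TYPE, SKOS + "Concept"))
    return triples


# Hypothetical toy input: one lemma present in both sources, one in each only.
thesaurus = {"дом": ["c1"], "лес": ["c2"]}
grammar = {"дом": "noun", "идти": "verb"}

merged = merge_on_lemma(thesaurus, grammar)
triples = to_triples(merged)
```

In this sketch a lemma present in only one source is kept with the missing side left empty; whether the actual resource keeps or discards such entries is not stated in the abstract.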

Acknowledgements

The main part of the reported work was funded by the Russian Science Foundation under research project no. 16-18-02074. Development of the semantic publishing technological platform was funded by the subsidy allocated to Kazan Federal University for the state assignment in the sphere of scientific activities, grant agreement no. 1.2368.2017.

Author information

Corresponding author

Correspondence to Alexander Kirillovich.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Kirillovich, A., Nevzorova, O., Gimadiev, E., Loukachevitch, N. (2017). RuThes Cloud: Towards a Multilevel Linguistic Linked Open Data Resource for Russian. In: Różewski, P., Lange, C. (eds) Knowledge Engineering and Semantic Web. KESW 2017. Communications in Computer and Information Science, vol 786. Springer, Cham. https://doi.org/10.1007/978-3-319-69548-8_4

  • DOI: https://doi.org/10.1007/978-3-319-69548-8_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-69547-1

  • Online ISBN: 978-3-319-69548-8

  • eBook Packages: Computer Science, Computer Science (R0)
