Knowledge Graph Embeddings over Hundreds of Linked Datasets

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1057)

Abstract

There is an increasing trend of using Linked Datasets to create embeddings from URI sequences, since such embeddings can be exploited for several tasks, e.g., machine learning problems and tasks related to content-based similarity. Existing techniques exploit either a single dataset or a few datasets (or RDF graphs) to create URI sequences for one or more entities. However, no available approach combines data from multiple datasets to enrich the URI sequences for a given entity. For this reason, we introduce a prototype, called LODVec, that exploits the LODsyndesis knowledge graph, which is the largest knowledge graph including all inferred equivalence relationships. LODVec uses this graph to create URI sequences for millions of entities by combining data from 400 datasets, and it offers several configurable options, based on metadata (e.g., provenance), for creating such URI sequences. Moreover, it uses the produced URI sequences as input for creating URI embeddings through the word2vec model. We evaluate the gain of exploiting several datasets (instead of a single one or a few) and the impact of cross-dataset reasoning on machine-learning-based tasks (i.e., classification and regression), and we compare the effectiveness of several configurations and machine learning models.
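
The idea of enriching the URI sequences of an entity across datasets can be illustrated with a minimal sketch. It is not LODVec's implementation: the function names, the quadruple layout (subject, predicate, object, dataset), and the sample triples are all assumptions made for illustration. Equivalent URIs are merged with a union-find over `owl:sameAs` links, and an optional provenance filter restricts which datasets contribute sequences.

```python
from collections import defaultdict

SAME_AS = "owl:sameAs"

# Hypothetical sample data: (subject, predicate, object, source dataset).
TRIPLES = [
    ("dbpedia:Aristotle", "owl:sameAs", "wikidata:Q868", "dbpedia"),
    ("dbpedia:Aristotle", "dbo:birthPlace", "dbpedia:Stagira", "dbpedia"),
    ("wikidata:Q868", "wdt:P106", "wikidata:Q4964182", "wikidata"),
    ("yago:Aristotle", "owl:sameAs", "dbpedia:Aristotle", "yago"),
    ("yago:Aristotle", "yago:influences", "yago:Alexander_the_Great", "yago"),
]


def canonical_map(triples):
    """Build a union-find over owl:sameAs links.

    Returns a function mapping any URI to the representative of its
    equivalence class (the lexicographically smallest member).
    """
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for s, p, o, _ in triples:
        if p == SAME_AS:
            rs, ro = find(s), find(o)
            if rs != ro:
                if ro < rs:
                    rs, ro = ro, rs
                parent[ro] = rs  # the smaller root stays representative
    return find


def uri_sequences(triples, datasets=None):
    """Group (subject, predicate, object) sequences per canonical entity.

    `datasets` is an optional provenance filter: when given, only triples
    from those datasets contribute sequences. Equivalences are still
    resolved over all triples.
    """
    find = canonical_map(triples)
    sequences = defaultdict(set)
    for s, p, o, dataset in triples:
        if p == SAME_AS:
            continue
        if datasets is not None and dataset not in datasets:
            continue
        sequences[find(s)].add((find(s), p, find(o)))
    return {entity: sorted(seqs) for entity, seqs in sequences.items()}


if __name__ == "__main__":
    for entity, seqs in uri_sequences(TRIPLES).items():
        for seq in seqs:
            print(" ".join(seq))
```

In this sketch, each resulting tuple plays the role of one "sentence" that could be passed to a word2vec implementation. Restricting `datasets` to a single source shows how much shorter the training input becomes without cross-dataset merging.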

Acknowledgements

The research work was supported by the Hellenic Foundation for Research and Innovation (HFRI) and the General Secretariat for Research and Technology (GSRT), under the HFRI PhD Fellowship grant (GA. No. 166).

Author information

Corresponding author

Correspondence to Michalis Mountantonakis.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Mountantonakis, M., Tzitzikas, Y. (2019). Knowledge Graph Embeddings over Hundreds of Linked Datasets. In: Garoufallou, E., Fallucchi, F., William De Luca, E. (eds) Metadata and Semantic Research. MTSR 2019. Communications in Computer and Information Science, vol 1057. Springer, Cham. https://doi.org/10.1007/978-3-030-36599-8_13

  • DOI: https://doi.org/10.1007/978-3-030-36599-8_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-36598-1

  • Online ISBN: 978-3-030-36599-8

  • eBook Packages: Computer Science (R0)
