Analytics over RDF Graphs

  • Maria-Evangelia Papadaki
  • Yannis Tzitzikas
  • Nicolas Spyratos
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1197)


The continuous accumulation of multi-dimensional data and the growth of Semantic Web and Linked Data published in RDF bring new requirements for data analytics tools. Such tools should take into account the special features of RDF graphs, exploit the semantics of RDF, and support flexible aggregate queries. In this paper, we present an approach for applying analytics to RDF data, based on a high-level functional query language called HIFUN. In that language, each analytical query is a well-formed expression of a functional algebra, and its definition is independent of the nature and structure of the data. We detail the required transformations and the translation of HIFUN queries to SPARQL, and we introduce a first implementation of a tool developed for these purposes.
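To give a feel for what translating a functional analytical query into SPARQL might involve, here is a minimal sketch. It assumes a query can be modelled as a triple of a grouping attribute, a measured attribute, and an aggregate operation, and that each attribute maps directly to a single RDF property. The class `HifunQuery`, the function `to_sparql`, and the `example.org` IRIs are hypothetical names chosen for illustration; this is not the translation algorithm described in the paper.

```python
from dataclasses import dataclass

# Aggregates accepted by this sketch (all are standard SPARQL 1.1 aggregates).
ALLOWED_OPS = {"SUM", "AVG", "MIN", "MAX", "COUNT"}


@dataclass(frozen=True)
class HifunQuery:
    """Hypothetical model of an analytical query as (grouping, measuring, operation)."""
    grouping: str   # IRI of the property used to group items (assumption)
    measuring: str  # IRI of the property whose values are aggregated (assumption)
    operation: str  # name of the aggregate operation, e.g. "SUM"


def to_sparql(q: HifunQuery) -> str:
    """Build a SPARQL GROUP BY query for the simplest case: two direct properties."""
    op = q.operation.upper()
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported aggregate: {q.operation}")
    return (
        f"SELECT ?g ({op}(?m) AS ?result)\n"
        "WHERE {\n"
        f"  ?item <{q.grouping}> ?g .\n"
        f"  ?item <{q.measuring}> ?m .\n"
        "}\n"
        "GROUP BY ?g"
    )


# Example: total quantity per branch, using made-up property IRIs.
query = HifunQuery(
    grouping="http://example.org/branch",
    measuring="http://example.org/quantity",
    operation="sum",
)
sparql = to_sparql(query)
print(sparql)
```

The sketch covers only direct properties. Property paths, restrictions on the grouped or measured values, and RDF-specific issues such as missing or multi-valued properties would need additional handling that is not shown here.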


Keywords: Analytics, RDF, Linked data



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Maria-Evangelia Papadaki (1, 2; email author)
  • Yannis Tzitzikas (1, 2)
  • Nicolas Spyratos (3)

  1. Institute of Computer Science, FORTH, Heraklion, Greece
  2. Computer Science Department, University of Crete, Heraklion, Greece
  3. Laboratoire de Recherche en Informatique, Université de Paris-Sud, Orsay, France