Exploring Importance Measures for Summarizing RDF/S KBs

  • Alexandros Pappas
  • Georgia Troullinou
  • Giannis Roussakis
  • Haridimos Kondylakis (Email author)
  • Dimitris Plexousakis
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10249)

Abstract

Given the explosive growth in the size and complexity of the Data Web, there is now, more than ever, a need for methods and tools that facilitate the understanding and exploration of RDF/S Knowledge Bases (KBs). To this end, summarization approaches try to produce an abridged version of the original data source that highlights its most representative concepts. Two questions are central to summarization: how to identify the most important nodes, and how to link them in order to produce a valid sub-schema graph. In this paper, we address the first question by revisiting six well-known measures from graph theory and adapting them for RDF/S KBs. We then model the problem of linking those nodes as a graph Steiner-Tree problem (GSTP), employing approximations and heuristics to speed up the execution of the respective algorithms. Our experiments show the added value of the approach: (a) our adaptations outperform current state-of-the-art measures for selecting the most important nodes, and (b) the constructed summary is of better quality in terms of the additional nodes introduced into the generated summary.
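
To make the two steps concrete, the following is a minimal sketch, not the authors' implementation. It assumes a tiny, invented schema graph (the class names are hypothetical), ranks nodes with plain harmonic centrality rather than the adapted measures evaluated in the paper, and links the selected nodes with a simple shortest-path attachment heuristic standing in for the GSTP approximations mentioned above.

```python
from collections import deque

# Hypothetical toy schema graph (undirected adjacency lists); names are invented.
GRAPH = {
    "Person": ["Author", "Organization", "Event"],
    "Author": ["Person", "Publication"],
    "Publication": ["Author", "Venue", "Topic"],
    "Venue": ["Publication", "Event"],
    "Event": ["Person", "Venue"],
    "Organization": ["Person"],
    "Topic": ["Publication"],
}


def bfs_distances(graph, source):
    """Return hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def harmonic_centrality(graph):
    """Sum of inverse distances to all other reachable nodes (unadapted measure)."""
    scores = {}
    for node in graph:
        dist = bfs_distances(graph, node)
        scores[node] = sum(1.0 / d for other, d in dist.items() if other != node)
    return scores


def top_k(scores, k):
    """The k highest-scoring nodes; ties are broken alphabetically."""
    return sorted(scores, key=lambda n: (-scores[n], n))[:k]


def path_to_tree(graph, source, tree_nodes):
    """Shortest path from source to the nearest node already in the tree."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node in tree_nodes:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path
        for neighbor in graph[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None


def link_terminals(graph, terminals):
    """Heuristic Steiner tree: attach each terminal to the growing tree in turn."""
    tree_nodes = {terminals[0]}
    tree_edges = set()
    for terminal in terminals[1:]:
        path = path_to_tree(graph, terminal, tree_nodes)
        if path is None:
            raise ValueError(f"{terminal} cannot be connected to the tree")
        for a, b in zip(path, path[1:]):
            tree_edges.add(frozenset((a, b)))
        tree_nodes.update(path)
    return tree_nodes, tree_edges


if __name__ == "__main__":
    scores = harmonic_centrality(GRAPH)
    selected = top_k(scores, 2)
    nodes, edges = link_terminals(GRAPH, selected)
    print("selected:", selected)
    print("summary nodes:", sorted(nodes))
    print("added nodes:", sorted(nodes - set(selected)))
```

In this toy graph the two selected classes are not adjacent, so the linking step has to introduce an intermediate node. The number of such additional nodes is the quantity the abstract uses to judge the quality of a summary.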

Keywords

Semantic summaries, Schema summary, RDF/S Knowledge Bases, Graph theory

Notes

Acknowledgments

This work was partially supported by the EU projects iManageCancer and CloudSocket under contracts H2020-643529 and H2020-644690.


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Alexandros Pappas (1)
  • Georgia Troullinou (2)
  • Giannis Roussakis (2)
  • Haridimos Kondylakis (2), Email author
  • Dimitris Plexousakis (2)

  1. Computer Science Department, University of Crete, Heraklion, Greece
  2. Institute for Computer Science, FORTH, Heraklion, Greece