RDF Graph Summarization Based on Approximate Patterns

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 622)

Abstract

The Linked Open Data (LOD) cloud brings together information described in RDF and stored on the web in (possibly distributed) RDF Knowledge Bases (KBs). The data in these KBs are not necessarily described by a known schema, and it is often extremely time-consuming to query all the interlinked KBs to acquire the necessary information. Even when the KB schema is known, we still need to know which parts of the schema are actually used. We solve this problem by summarizing large RDF KBs using top-K approximate RDF graph patterns, which we transform into an RDF schema that describes the contents of the KB. This schema describes the KB accurately, even more accurately than an existing schema, because it reflects the schema actually in use, which corresponds to the existing data. We add information on the number of instances of each pattern, which allows the expected number of query results to be estimated. We can then query the RDF graph summary to determine whether the necessary information is present and, if it is present in significant numbers, whether it should be included in a federated query result.
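
The idea of a pattern-based summary annotated with instance counts can be illustrated with a minimal Python sketch. It is not the method of the paper: instead of mining approximate patterns, it simply groups subjects by their exact set of properties and keeps the K most frequent sets. The toy triples, the prefixed names, and the functions `summarize` and `estimate_matches` are all hypothetical and introduced only for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical toy triples (subject, predicate, object); not data from the paper.
TRIPLES = [
    ("ex:a1", "rdf:type", "mo:MusicArtist"),
    ("ex:a1", "foaf:name", "Artist One"),
    ("ex:a1", "foaf:made", "ex:r1"),
    ("ex:a2", "rdf:type", "mo:MusicArtist"),
    ("ex:a2", "foaf:name", "Artist Two"),
    ("ex:a2", "foaf:made", "ex:r2"),
    ("ex:r1", "rdf:type", "mo:Record"),
    ("ex:r1", "dc:title", "Record One"),
    ("ex:r2", "rdf:type", "mo:Record"),
    ("ex:r2", "dc:title", "Record Two"),
    ("ex:r3", "dc:title", "Record Three"),
]


def summarize(triples, k):
    """Return the k most frequent property sets with their instance counts.

    Simplification: patterns are exact property sets, not approximate ones.
    """
    props_by_subject = defaultdict(set)
    for subject, predicate, _ in triples:
        props_by_subject[subject].add(predicate)
    counts = Counter(frozenset(props) for props in props_by_subject.values())
    # Sort by descending count; break ties on the sorted property names
    # so that the result is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], sorted(item[0])))
    return ranked[:k]


def estimate_matches(summary, required_props):
    """Estimate how many subjects carry all required properties,
    using only the summary and never the original triples."""
    required = set(required_props)
    return sum(count for pattern, count in summary if required <= pattern)


if __name__ == "__main__":
    summary = summarize(TRIPLES, k=2)
    for pattern, count in summary:
        print(sorted(pattern), count)
    print(estimate_matches(summary, ["dc:title"]))
```

With `k=2` the rare pattern of `ex:r3` falls outside the summary, so the estimate for `dc:title` undercounts by one. This is the kind of trade-off a top-K summary accepts in exchange for a much smaller structure to query.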

Notes

  1. http://dbtune.org/jamendo/

  2. http://musicontology.com/

  3. http://www.geonames.org/ontology

  4. http://musicbrainz.org/

Author information

Corresponding author

Correspondence to Dimitris Kotzinos.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Zneika, M., Lucchese, C., Vodislav, D., Kotzinos, D. (2016). RDF Graph Summarization Based on Approximate Patterns. In: Grant, E., Kotzinos, D., Laurent, D., Spyratos, N., Tanaka, Y. (eds) Information Search, Integration, and Personalization. ISIP 2015. Communications in Computer and Information Science, vol 622. Springer, Cham. https://doi.org/10.1007/978-3-319-43862-7_4

  • DOI: https://doi.org/10.1007/978-3-319-43862-7_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-43861-0

  • Online ISBN: 978-3-319-43862-7

  • eBook Packages: Computer Science, Computer Science (R0)
