On Designing Archiving Policies for Evolving RDF Datasets on the Web

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8824)

Abstract

When dealing with dynamically evolving datasets, users are often interested in the state of affairs in previous versions of a dataset, and would like to run queries on those versions as well as queries that compare the state of affairs across versions. This is especially true for datasets published on the Web, where interlinking, combined with the lack of central control, does not allow interlinked datasets to evolve in a synchronized way. The obvious way to address this requirement is to store all previous versions, but this can quickly increase the space requirements; an alternative is to store suitable deltas between versions, which are generally smaller, but this adds the overhead of reconstructing versions at query time. This paper studies the trade-offs between these approaches in the context of archiving dynamic RDF datasets on the Web. Our main message is that a hybrid policy would work better than either approach alone, and we describe our proposed methodology for establishing a cost model that determines when each of the two standard methods (version-based or delta-based storage) should be used within a hybrid policy.
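
To make the trade-off concrete, the following is a minimal sketch of a hybrid archive, not the cost model proposed in the paper (which the abstract does not spell out). It assumes a version is a set of RDF triples and a delta is a pair of added and removed triples. The class `HybridArchive` and its thresholds `max_chain` and `max_delta_ratio` are hypothetical names introduced only for illustration: a full version is stored when the chain of deltas since the last full version gets too long, or when the delta is large relative to the version; otherwise only the delta is stored.

```python
from dataclasses import dataclass

# An RDF triple as (subject, predicate, object); a simplifying assumption.
Triple = tuple[str, str, str]


@dataclass(frozen=True)
class Delta:
    added: frozenset[Triple]
    removed: frozenset[Triple]


class HybridArchive:
    """Illustrative hybrid policy; the thresholds are hypothetical stand-ins
    for whatever a real cost model would compute."""

    def __init__(self, max_chain: int = 3, max_delta_ratio: float = 0.5):
        self.max_chain = max_chain              # max consecutive deltas
        self.max_delta_ratio = max_delta_ratio  # delta size / version size
        self.entries: list[tuple[str, object]] = []  # ("full" | "delta", payload)
        self._latest: frozenset[Triple] = frozenset()
        self._chain = 0                         # deltas since last full version

    def commit(self, triples) -> int:
        """Archive a new version and return its version number."""
        new = frozenset(triples)
        delta = Delta(new - self._latest, self._latest - new)
        delta_size = len(delta.added) + len(delta.removed)
        store_full = (
            not self.entries                                  # first version
            or self._chain >= self.max_chain                  # chain too long
            or delta_size > self.max_delta_ratio * len(new)   # delta too large
        )
        if store_full:
            self.entries.append(("full", new))
            self._chain = 0
        else:
            self.entries.append(("delta", delta))
            self._chain += 1
        self._latest = new
        return len(self.entries) - 1

    def checkout(self, version: int) -> frozenset[Triple]:
        """Rebuild a version from the nearest earlier full version."""
        start = version
        while self.entries[start][0] != "full":
            start -= 1
        triples = set(self.entries[start][1])
        # Every entry between the full version and the target is a delta.
        for _, delta in self.entries[start + 1 : version + 1]:
            triples -= delta.removed
            triples |= delta.added
        return frozenset(triples)
```

In this sketch, a small `max_chain` bounds the number of deltas applied at query time at the price of more stored full versions, while a large value saves space but makes `checkout` slower; choosing that balance is the kind of decision a cost model would drive.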

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Stefanidis, K., Chrysakis, I., Flouris, G. (2014). On Designing Archiving Policies for Evolving RDF Datasets on the Web. In: Yu, E., Dobbie, G., Jarke, M., Purao, S. (eds) Conceptual Modeling. ER 2014. Lecture Notes in Computer Science, vol 8824. Springer, Cham. https://doi.org/10.1007/978-3-319-12206-9_4

  • DOI: https://doi.org/10.1007/978-3-319-12206-9_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-12205-2

  • Online ISBN: 978-3-319-12206-9

  • eBook Packages: Computer Science, Computer Science (R0)
