Abstract
To integrate various Linked Datasets, data warehousing and live query processing provide two extremes for optimized response time and quality respectively. The first approach provides very fast responses but with low-quality because changes of original data are not immediately reflected on materialized data. The second approach provides accurate responses but it is notorious for long response times. A hybrid SPARQL query processor provides a middle ground between two specified extremes by splitting the triple patterns of SPARQL query between live and local processors based on a predetermined coherence threshold specified by the administrator. Considering quality requirements while splitting the SPARQL query, enables the processor to eliminate the unnecessary live execution and releases resources for other queries. This requires estimating the quality of response provided with current materialized data, compare it with user requirements and determine the most selective sub-queries which can boost the response quality up to the specified level with least computational complexity. In this work, we propose solutions for estimating the freshness of materialized data, as one dimension of the quality, by extending cardinality estimation techniques. Experimental results show that we can estimate the freshness of materialized data with a low error rate.
Keywords
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsPreview
Unable to display preview. Download preview PDF.
References
Bishop, B., Kiryakov, A., Ognyanov, D., Peikov, I., Tashev, Z., Velkov, R.: Factforge: A fast track to the web of data. Semantic Web 2(2), 157–166 (2011)
Bizer, C., Schultz, A.: The berlin sparql benchmark. International Journal on Semantic Web and Information Systems (IJSWIS) 5, 1–24 (2009)
Bouzeghoub, M.: A framework for analysis of data freshness. In: Proceedings of the 2004 International Workshop on Information Quality in Information Systems, pp. 59–67. ACM (2004)
Castillo, R., Rothe, C., Ulf, L.: Idexing RDF Data for SPARQL Queries. Professoren des Inst. für Informatik, RDFMatView (2010)
Dey, D., Kumar, S.: Data quality of query results with generalized selection conditions. Operations Research 61(1), 17–31 (2013)
Goasdoué, F., Karanasos, K., Leblay, J., Manolescu, I.: View selection in semantic web databases. Proceedings of the VLDB Endowment 5(2), 97–108 (2011)
Goldstein, J., Per-Åke, L.: Optimizing queries using materialized views: a practical, scalable solution. In: ACM SIGMOD Record 30, pp. 331–342. ACM (2001)
Hartig, O., Bizer, C., Freytag, J.-C.: Executing sparql queries over the web of linked data. In: Bernstein, A., Karger, D.R., Heath, T., Feigenbaum, L., Maynard, D., Motta, E., Thirunarayan, K. (eds.) ISWC 2009. LNCS, vol. 5823, pp. 293–309. Springer, Heidelberg (2009)
Kuno, H., Graefe, G.: Deferred maintenance of indexes and of materialized views. In: Kikuchi, S., Madaan, A., Sachdeva, S., Bhalla, S. (eds.) DNIS 2011. LNCS, vol. 7108, pp. 312–323. Springer, Heidelberg (2011)
Labrinidis, A., Qu, H., Xu, J.: Quality contracts for real-time enterprises. In: Bussler, C.J., Castellanos, M., Dayal, U., Navathe, S. (eds.) BIRTE 2006. LNCS, vol. 4365, pp. 143–156. Springer, Heidelberg (2007)
Labrinidis, A., Roussopoulos, N.: Exploring the tradeoff between performance and data freshness in database-driven web servers. The VLDB Journal 13(3), 240–255 (2004)
Lupei, D., Shaikhha, A., Koch, C., Nötzli, A.: Oliver Andrzej Kennedy, Milos Nikolic, and Yanif Ahmad. Higher-order delta processing for dynamic, frequently fresh views. Technical report, Dbtoaster (2013)
Neumann, T., Moerkotte, G.: Characteristic sets: Accurate cardinality estimation for rdf queries with multiple joins. In: 2011 IEEE 27th International Conference on Data Engineering (ICDE), pp. 984–994. IEEE (2011)
Parssian, A., Sarkar, S., Jacob, V.S.: Assessing information quality for the composite relational operation join. In: IQ, pp. 225–237 (2002)
Poosala, V., Haas, P.J., Loannidis, Y.E., Shekita, E.J.: Improved histograms for selectivity estimation of range predicates. ACM SIGMOD Record 25(2), 294–305 (1996)
Tummarello, G., Delbru, R., Oren, E.: Sindice.com: weaving the open linked data. In: Aberer, K., Choi, K.-S., Noy, N., Allemang, D., Lee, K.-I., Nixon, L.J.B., Golbeck, J., Mika, P., Maynard, D., Mizoguchi, R., Schreiber, G., Cudré-Mauroux, P. (eds.) ASWC 2007 and ISWC 2007. LNCS, vol. 4825, pp. 552–565. Springer, Heidelberg (2007)
Umbrich, J., Hose, K., Karnstedt, M., Harth, A., Polleres, A.: Comparing data summaries for processing live queries over linked data. World Wide Web 14(5–6), 495–544 (2011)
Parreira, J.X., Umbrich, J., Karnstedt, M., Hogan, A.: Hybrid sparql queries: fresh vs. fast results. In: Cudré-Mauroux, P., Heflin, J., Sirin, E., Tudorache, T., Euzenat, J., Hauswirth, M., Parreira, J.X., Hendler, J., Schreiber, G., Bernstein, A., Blomqvist, E. (eds.) ISWC 2012, Part I. LNCS, vol. 7649, pp. 608–624. Springer, Heidelberg (2012)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Dehghanzadeh, S., Parreira, J.X., Karnstedt, M., Umbrich, J., Hauswirth, M., Decker, S. (2015). Optimizing SPARQL Query Processing on Dynamic and Static Data Based on Query Time/Freshness Requirements Using Materialization. In: Supnithi, T., Yamaguchi, T., Pan, J., Wuwongse, V., Buranarach, M. (eds) Semantic Technology. JIST 2014. Lecture Notes in Computer Science(), vol 8943. Springer, Cham. https://doi.org/10.1007/978-3-319-15615-6_19
Download citation
DOI: https://doi.org/10.1007/978-3-319-15615-6_19
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-15614-9
Online ISBN: 978-3-319-15615-6
eBook Packages: Computer ScienceComputer Science (R0)