Abstract
With the widespread adoption of Linked Data, the efficient processing of SPARQL queries is gaining importance. A crucial category of queries that is amenable to optimization is “top-k” queries, i.e. queries returning the top k results ordered by a specified ranking function. Top-k queries can be expressed in SPARQL by appending to a SELECT query the ORDER BY and LIMIT clauses, which impose a sorting order on the result set and limit the number of results. However, in the SPARQL algebra the ORDER BY and LIMIT clauses are result modifiers, i.e. they are evaluated only after the other query clauses. In SPARQL engines, evaluating ORDER BY and LIMIT typically requires processing all the matching solutions (possibly thousands), followed by a monolithic computation of the ranking function for each solution, even if only a limited number of them (e.g. k = 10) was requested, which leads to poor performance.
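To make the cost of result-modifier evaluation concrete, the following minimal Python sketch contrasts the "materialize, sort everything, then cut" strategy with a bounded-heap selection. It is only an illustration: the function names, the `rank` callable, and the dictionary representation of a solution are assumptions made here and do not come from any SPARQL engine.

```python
import heapq
from typing import Callable, Iterable, List

# Hypothetical representation: one query solution is a dict of variable bindings.
Solution = dict


def top_k_full_sort(solutions: Iterable[Solution],
                    rank: Callable[[Solution], float],
                    k: int) -> List[Solution]:
    """Result-modifier style: score and sort every solution, then keep k."""
    return sorted(solutions, key=rank, reverse=True)[:k]


def top_k_bounded_heap(solutions: Iterable[Solution],
                       rank: Callable[[Solution], float],
                       k: int) -> List[Solution]:
    """Streams the solutions while keeping at most k candidates in memory."""
    return heapq.nlargest(k, solutions, key=rank)
```

Both functions return the same k solutions. The second one avoids a total sort, but it still computes the ranking function for every matching solution, so it does not remove the monolithic scoring step described above.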
In this paper, we present \(\mathcal{S}\)PARQL-\(\mathcal{R}{\rm ANK}\), an extension of the SPARQL algebra and execution model that supports ranking as a first-class SPARQL construct. The new algebra and execution model allow the ranking function to be split and interleaved with other operations. We also provide a prototype open-source implementation of \(\mathcal{S}\)PARQL-\(\mathcal{R}{\rm ANK}\) based on ARQ, and we carry out a series of preliminary experiments.
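The general idea of splitting a ranking function and interleaving its parts with other work can be sketched as follows. This is not the algebra or the operators of the paper; it is a generic upper-bound pruning illustration under stated assumptions: the ranking function is a weighted sum of two criteria normalised to [0, 1], solutions arrive in descending order of the first criterion, and the second criterion is costly to evaluate. All names (`top_k_interleaved`, `p2`, `w1`, `w2`) are hypothetical.

```python
import heapq
from typing import Callable, Hashable, Iterable, List, Tuple


def top_k_interleaved(sorted_by_p1: Iterable[Tuple[Hashable, float]],
                      p2: Callable[[Hashable], float],
                      k: int,
                      w1: float = 0.5,
                      w2: float = 0.5) -> Tuple[List[Tuple[float, Hashable]], int]:
    """Return the top-k (score, solution) pairs and the number of p2 evaluations.

    sorted_by_p1 yields (solution, p1) pairs in descending p1 order.
    Both p1 and p2(solution) are assumed to lie in [0, 1].
    """
    best: list = []   # min-heap of (score, arrival order, solution)
    evaluated = 0
    for seq, (solution, p1) in enumerate(sorted_by_p1):
        # Best score this (or any later) solution could reach if p2 were 1.0.
        upper_bound = w1 * p1 + w2 * 1.0
        if len(best) == k and best[0][0] >= upper_bound:
            break     # no remaining solution can enter the top k
        score = w1 * p1 + w2 * p2(solution)
        evaluated += 1
        if len(best) < k:
            heapq.heappush(best, (score, seq, solution))
        elif score > best[0][0]:
            heapq.heapreplace(best, (score, seq, solution))
    ranked = sorted(best, key=lambda t: (-t[0], t[1]))
    return [(score, sol) for score, _, sol in ranked], evaluated
```

With four solutions whose first criterion is 1.0, 0.75, 0.25 and 0.125, and k = 2, the loop can stop after scoring only the first two whenever their full scores already exceed the upper bound of the third, so the costly criterion is never computed for the rest.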
© 2012 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Bozzon, A., Della Valle, E., Magliacane, S. (2012). Extending SPARQL Algebra to Support Efficient Evaluation of Top-K SPARQL Queries. In: Ceri, S., Brambilla, M. (eds) Search Computing. Lecture Notes in Computer Science, vol 7538. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34213-4_10
DOI: https://doi.org/10.1007/978-3-642-34213-4_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-34212-7
Online ISBN: 978-3-642-34213-4
eBook Packages: Computer Science (R0)