Abstract
The increasing use of data analytics on Linked Data leads to the requirement for SPARQL engines to efficiently execute Online Analytical Processing (OLAP) queries. While SPARQL 1.1 provides basic constructs, further development on optimising OLAP queries lacks benchmarks that mimic the data distributions found in Linked Data. Existing work on OLAP benchmarking for SPARQL has usually adopted queries and data from relational databases, which may not represent Linked Data well. We propose an approach that maps typical OLAP operations to SPARQL and a tool named ASPG to automatically generate OLAP queries from real-world Linked Data. We evaluate ASPG by constructing a benchmark called DBOB from the online DBpedia endpoint, and use DBOB to measure the performance of the Virtuoso engine.
Notes
- 1.
- 2.
Mapping an OLAP data point to a subject is just one intuitive approach. An OLAP data point can be mapped to any RDF term.
- 3.
It is enough to GROUP BY a subset of all variables that uniquely identifies an entity. Variables excluded from GROUP BY can be selected using the SAMPLE aggregation.
- 4.
SPARQL 1.1 doesn’t have the ability to define new functions, and therefore cat should be considered as a macro in Query 3.
- 5.
It requires calculating the position of an item in a linked list and identifying the maximum item in a set. Refer to https://git.io/vwP0t for more details.
- 6.
The complexity of a BGP is also affected by the number of intermediate results in each join. However, the latter requires detailed statistics to estimate, and these are not always available.
- 7.
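
As an illustration of note 3 (grouping by a subset of variables and selecting the rest with SAMPLE), the following is a minimal sketch of how such a query string could be assembled. It is not ASPG's implementation: the helper name `build_group_query`, the `_sample` suffix, and the example triple patterns are assumptions made for this sketch. The projected SAMPLE variables are given new names because SPARQL 1.1 does not allow `AS` to rebind a variable that is already in scope.

```python
# Hypothetical helper, not part of ASPG: builds a SPARQL 1.1 query that
# groups by a subset of variables and projects the rest through SAMPLE.

PREFIXES = (
    "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>"
)


def build_group_query(key_vars, other_vars, patterns):
    """Return a SELECT query grouped by key_vars only.

    key_vars   -- variable names (without '?') that identify an entity
    other_vars -- variable names excluded from GROUP BY, selected via SAMPLE
    patterns   -- triple patterns (without trailing '.') for the WHERE clause
    """
    if not key_vars:
        raise ValueError("at least one grouping variable is required")
    keys = " ".join(f"?{v}" for v in key_vars)
    # Each non-grouped variable must appear inside an aggregate; SAMPLE
    # returns an arbitrary value from the group.
    samples = " ".join(f"(SAMPLE(?{v}) AS ?{v}_sample)" for v in other_vars)
    projection = f"{keys} {samples}".strip()
    body = " .\n  ".join(patterns)
    return (
        f"{PREFIXES}\n"
        f"SELECT {projection}\n"
        f"WHERE {{\n  {body} .\n}}\n"
        f"GROUP BY {keys}"
    )


# Illustrative patterns only; any BGP binding the listed variables works.
query = build_group_query(
    key_vars=["city"],
    other_vars=["label", "country"],
    patterns=[
        "?city a dbo:City",
        "?city rdfs:label ?label",
        "?city dbo:country ?country",
    ],
)
print(query)
```

Grouping on `?city` alone keeps the GROUP BY clause short, while `?label_sample` and `?country_sample` still carry one value per group.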
Copyright information
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Wang, X., Staab, S., Tiropanis, T. (2016). ASPG: Generating OLAP Queries for SPARQL Benchmarking. In: Li, YF., et al. Semantic Technology. JIST 2016. Lecture Notes in Computer Science, vol 10055. Springer, Cham. https://doi.org/10.1007/978-3-319-50112-3_13
DOI: https://doi.org/10.1007/978-3-319-50112-3_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50111-6
Online ISBN: 978-3-319-50112-3
eBook Packages: Computer Science, Computer Science (R0)