
QED: Out-of-the-Box Datasets for SPARQL Query Evaluation

  • Veronika Thost
  • Julian Dolby
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11503)

Abstract

In this paper, we present SPARQL QED, a system that generates out-of-the-box datasets for SPARQL queries over linked data. QED groups queries by the SPARQL features they use and creates, for each query, a small but exhaustive dataset comprising linked data and the query answers over this data. These datasets can support the development of applications based on SPARQL query answering in various ways. For instance, they may serve as SPARQL compliance tests or as training examples in query-by-example systems. We ensure that the created datasets are diverse, cover practical use cases, and include the correct answer sets. Example tests generated from DBpedia queries and data have revealed bugs in Jena and Virtuoso.
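To illustrate how such query–data–answer bundles can drive compliance testing, the following sketch pairs a toy DBpedia-style dataset with a single-feature query and its expected answer set, and checks an engine against it using rdflib. The data, query, variable names, and expected answers are made up for illustration and are not actual QED output.

```python
# Minimal sketch of a QED-style compliance test: a query, a small dataset,
# and the answer set declared as correct for that data. rdflib stands in
# for the SPARQL engine under test.
from rdflib import Graph

# Tiny linked-data fragment (Turtle), loosely modelled on DBpedia-style triples.
DATA = """
@prefix dbr: <http://dbpedia.org/resource/> .
@prefix dbo: <http://dbpedia.org/ontology/> .

dbr:Dresden  dbo:country  dbr:Germany .
dbr:Leipzig  dbo:country  dbr:Germany .
dbr:Zurich   dbo:country  dbr:Switzerland .
"""

# A query exercising a single SPARQL feature (a basic graph pattern).
QUERY = """
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?city WHERE { ?city dbo:country dbr:Germany . }
"""

# The answer set the bundle declares as correct for this data.
EXPECTED = {
    "http://dbpedia.org/resource/Dresden",
    "http://dbpedia.org/resource/Leipzig",
}


def run_compliance_test() -> None:
    g = Graph()
    g.parse(data=DATA, format="turtle")
    actual = {str(row.city) for row in g.query(QUERY)}
    # The engine passes the test iff its answer set matches the expected one.
    assert actual == EXPECTED, f"engine returned {actual}, expected {EXPECTED}"


if __name__ == "__main__":
    run_compliance_test()
    print("compliance test passed")
```

In this style, each generated dataset fixes the expected answers in advance, so any deviation by an engine (as observed for Jena and Virtuoso in the paper) surfaces as a failed assertion.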

Keywords

SPARQL datasets · Compliance tests · Benchmark

Notes

Acknowledgments

This work is partly supported by the German Research Foundation (DFG) in the Cluster of Excellence “Center for Advancing Electronics Dresden” in CRC 912.


Copyright information

© Springer Nature Switzerland AG 2019

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Authors and Affiliations

  1. MIT-IBM Watson AI Lab, IBM Research, Cambridge, USA
  2. IBM Research, Yorktown Heights, USA
