List.MID: A MIDI-Based Benchmark for Evaluating RDF Lists

  • Albert Meroño-Peñuela
  • Enrico Daga
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11779)

Abstract

Linked lists represent a countable number of ordered values, and are among the most important abstract data types in computer science. With the advent of RDF as a highly expressive knowledge representation language for the Web, various implementations of RDF lists have been proposed. Yet, so far there is no benchmark dedicated to evaluating the performance of triple stores and SPARQL query engines in dealing with ordered linked data. Moreover, essential tasks for evaluating RDF lists, like generating datasets containing RDF lists of various sizes, or generating the same RDF list using different modelling choices, are cumbersome and unprincipled. In this paper, we propose List.MID, a systematic benchmark for evaluating systems serving RDF lists. List.MID consists of a dataset generator, which creates RDF list data in various models and of different sizes, and a set of SPARQL queries. The RDF list data is coherently generated from a large, community-curated base collection of Web MIDI files, rich in lists of musical events of arbitrary length. We describe the List.MID benchmark, and discuss its impact and adoption, reusability, design, and availability.
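
To make "the same RDF list using different modelling choices" concrete, the following minimal Python sketch serialises one ordered sequence of events under two standard RDF vocabularies: the rdf:first / rdf:rest linked list and the rdf:Seq container. It is an illustration only and not the List.MID generator; the function names, the `ex:` identifiers, and the choice of these two models are assumptions, since the abstract does not state which models the benchmark covers.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def as_rdf_list(head, items):
    """Model items as an rdf:first / rdf:rest linked list.

    Hypothetical helper: returns (subject, predicate, object) tuples.
    An empty input yields no triples (a real empty list is rdf:nil itself).
    """
    # One node per element: the given head, then blank nodes for the rest.
    nodes = [head] + [f"_:n{i}" for i in range(1, len(items))]
    triples = []
    for i, item in enumerate(items):
        # The last node points to rdf:nil, which terminates the list.
        nxt = nodes[i + 1] if i + 1 < len(items) else RDF + "nil"
        triples.append((nodes[i], RDF + "first", item))
        triples.append((nodes[i], RDF + "rest", nxt))
    return triples


def as_rdf_seq(head, items):
    """Model items as an rdf:Seq container using rdf:_1, rdf:_2, ..."""
    triples = [(head, RDF + "type", RDF + "Seq")]
    for i, item in enumerate(items, start=1):
        # The position is encoded in the membership property itself.
        triples.append((head, f"{RDF}_{i}", item))
    return triples


# Illustrative event identifiers (assumed, not taken from the benchmark data).
events = ["ex:noteOn60", "ex:noteOff60", "ex:noteOn64"]
list_triples = as_rdf_list("ex:track1", events)
seq_triples = as_rdf_seq("ex:track1", events)
```

The two outputs carry the same ordering but differ in shape: the linked list needs two triples per element and a traversal to reach position n, whereas the container needs one triple per element and exposes the position directly in the predicate. Differences of this kind are what a benchmark over multiple list models would be expected to expose.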

Keywords

Linked lists, RDF, Benchmarks

Notes

Acknowledgements

This work was partially funded by the CLARIAH project of the Dutch Science Foundation (NWO). We are grateful to all participants of the online survey.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Computer Science Department, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
  2. Knowledge Media Institute, The Open University, Milton Keynes, UK