Detecting SPARQL Query Templates for Data Prefetching

  • Johannes Lorey
  • Felix Naumann
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7882)

Abstract

Publicly available Linked Data repositories provide a multitude of information. By utilizing SPARQL, Web sites and services can consume this data and present it in a user-friendly form, e.g., in mash-ups. To gather RDF triples for this task, machine agents typically issue similarly structured queries with recurring patterns against the SPARQL endpoint. These queries usually differ only in a small number of individual triple pattern parts, such as resource labels or literals in objects. We present an approach to detect such recurring patterns in queries and introduce the notion of query templates, which represent clusters of similar queries exhibiting these recurrences. We describe a matching algorithm to extract query templates and illustrate the benefits of prefetching data by utilizing these templates. Finally, we comment on the applicability of our approach using results from real-world SPARQL query logs.
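To make the idea of a query template concrete, the following minimal Python sketch generalizes two similarly structured queries into one template. It is only an illustration and not the matching algorithm of the paper: it assumes each query is already given as a list of (subject, predicate, object) triple patterns in the same order, and the names `make_template`, `q1`, `q2`, the `%slotN%` placeholder syntax, and the example prefixes are all hypothetical.

```python
from typing import List, Optional, Tuple

# A triple pattern as (subject, predicate, object) strings (assumed representation).
Triple = Tuple[str, str, str]


def make_template(q1: List[Triple], q2: List[Triple]) -> Optional[List[Triple]]:
    """Generalize two queries into a template, or return None if their sizes differ.

    Naive assumption: both queries list their triple patterns in the same order,
    so patterns are compared position by position.
    """
    if len(q1) != len(q2):
        return None
    template: List[Triple] = []
    slot = 0
    for t1, t2 in zip(q1, q2):
        parts = []
        for a, b in zip(t1, t2):
            if a == b:
                # Identical part: keep it as a fixed element of the template.
                parts.append(a)
            else:
                # Differing part: replace it with a numbered placeholder.
                parts.append(f"%slot{slot}%")
                slot += 1
        template.append((parts[0], parts[1], parts[2]))
    return template


# Two hypothetical queries that differ only in the label literal.
q1 = [("?city", "rdf:type", "dbo:City"), ("?city", "rdfs:label", '"Berlin"@en')]
q2 = [("?city", "rdf:type", "dbo:City"), ("?city", "rdfs:label", '"Potsdam"@en')]

template = make_template(q1, q2)
print(template)
```

In this sketch the two queries collapse into a single template whose only placeholder is the label literal. A realistic approach would also have to handle reordered triple patterns and renamed variables, which this position-by-position comparison does not attempt.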

Keywords

Graph Pattern, SPARQL Query, Query Pattern, Distance Score, Triple Pattern
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Johannes Lorey (1)
  • Felix Naumann (1)
  1. Hasso Plattner Institute, Potsdam, Germany
