Identifying and Caching Hot Triples for Efficient RDF Query Processing

  • Conference paper
Database Systems for Advanced Applications (DASFAA 2015)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9050)

Abstract

The Resource Description Framework (RDF) is widely used as a general model for conceptual description and information modelling. As the number and volume of RDF datasets have grown, many techniques have been developed to accelerate query answering on triple stores, which manage large-scale RDF data. Caching is one popular solution. Non-RDBMS triple stores, which exploit the intrinsic graph nature of RDF data, have been attracting increasing research attention in recent years. However, because their fundamental structure differs from that of RDBMS-based triple stores, they cannot leverage the RDBMS caching mechanism. In this paper, we develop a time-aware, frequency-based caching algorithm to address this issue. Our approach retrieves the accessed triples by analyzing and expanding previous queries, and it collects the most frequently accessed triples by estimating their access frequencies with Exponential Smoothing, a forecasting method. We evaluate our approach using real-world queries from a publicly available SPARQL endpoint. Our theoretical analysis and empirical results show that the proposed approach achieves higher hit rates than state-of-the-art approaches.
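
To make the idea of a time-aware, frequency-based cache concrete, the following is a minimal Python sketch, not the algorithm from the paper itself. It assumes the textbook simple exponential smoothing update, E_t = alpha * x_t + (1 - alpha) * E_(t-1), where x_t is 1 if the triple is accessed at logical time t and 0 otherwise, and it evicts the triple with the lowest current estimate. The class name `HotTripleCache`, its methods, and the default `alpha` are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


@dataclass
class HotTripleCache:
    """Illustrative cache; assumes capacity >= 1 and 0 < alpha < 1."""
    capacity: int
    alpha: float = 0.1          # smoothing factor (assumed value)
    clock: int = 0              # logical time: one tick per access
    # triple -> (estimate at last access, time of last access)
    stats: Dict[Triple, Tuple[float, int]] = field(default_factory=dict)

    def estimate(self, triple: Triple) -> float:
        # Decay the stored estimate over the ticks with no access (x_t = 0).
        est, last = self.stats[triple]
        return est * (1.0 - self.alpha) ** (self.clock - last)

    def access(self, triple: Triple) -> bool:
        """Record one access; return True on a cache hit."""
        self.clock += 1
        hit = triple in self.stats
        carried = self.estimate(triple) if hit else 0.0
        if not hit and len(self.stats) >= self.capacity:
            # Evict the triple whose decayed estimate is currently lowest.
            coldest = min(self.stats, key=self.estimate)
            del self.stats[coldest]
        # x_t = 1 at this tick, so the new estimate is alpha + decayed history.
        self.stats[triple] = (self.alpha + carried, self.clock)
        return hit
```

In this sketch a smaller `alpha` gives more weight to long-term frequency, while a larger `alpha` makes the estimate behave more like pure recency. Storing the time of the last access lets each estimate be decayed lazily instead of updating every cached triple on every tick.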

Author information

Corresponding author

Correspondence to Wei Emma Zhang.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhang, W.E., Sheng, Q.Z., Taylor, K., Qin, Y. (2015). Identifying and Caching Hot Triples for Efficient RDF Query Processing. In: Renz, M., Shahabi, C., Zhou, X., Cheema, M. (eds) Database Systems for Advanced Applications. DASFAA 2015. Lecture Notes in Computer Science, vol 9050. Springer, Cham. https://doi.org/10.1007/978-3-319-18123-3_16

  • DOI: https://doi.org/10.1007/978-3-319-18123-3_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18122-6

  • Online ISBN: 978-3-319-18123-3

  • eBook Packages: Computer Science, Computer Science (R0)
