Identifying and Caching Hot Triples for Efficient RDF Query Processing

Zhang, Wei Emma; Sheng, Quan Z.; Taylor, Kerry; Qin, Yongrui

doi:10.1007/978-3-319-18123-3_16

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9050)

Included in the following conference series:

International Conference on Database Systems for Advanced Applications

Abstract

The Resource Description Framework (RDF) is widely used as a general model for conceptual description and information modelling. As the number and volume of RDF datasets have grown, many techniques have been developed to accelerate query answering on triple stores, which handle large-scale RDF data. Caching is one popular solution. Non-RDBMS-based triple stores, which leverage the intrinsic graph nature of RDF data, have been attracting growing research attention in recent years. However, because their fundamental structure differs from that of RDBMS-based triple stores, they cannot leverage RDBMS caching mechanisms. In this paper, we develop a time-aware, frequency-based caching algorithm to address this issue. Our approach retrieves the accessed triples by analyzing and expanding previous queries, and collects the most frequently accessed triples by estimating their access frequencies with Exponential Smoothing, a forecasting method. We evaluate our approach using real-world queries from a publicly available SPARQL endpoint. Our theoretical analysis and empirical results show that the proposed approach outperforms state-of-the-art approaches, achieving higher hit rates.
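
As a rough illustration of how exponential smoothing can rank triples by a time-aware access frequency, the following Python sketch applies simple exponential smoothing to per-time-slice access indicators. The function names `smoothed_scores` and `hot_triples`, the slice-based access log, and the smoothing constant `alpha` are assumptions made for this example; it is not the algorithm from the paper, whose details the abstract does not give.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical representation of an RDF triple: (subject, predicate, object).
Triple = Tuple[str, str, str]


def smoothed_scores(slices: Iterable[Set[Triple]],
                    alpha: float = 0.5) -> Dict[Triple, float]:
    """Estimate access frequency with simple exponential smoothing.

    For each time slice t and triple, the update is
        score_t = alpha * x_t + (1 - alpha) * score_{t-1},
    where x_t is 1 if the triple was accessed in slice t and 0 otherwise.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    scores: Dict[Triple, float] = {}
    for accessed in slices:
        # Decay every known triple, as if it had not been accessed (x_t = 0).
        for triple in scores:
            scores[triple] *= 1.0 - alpha
        # Add alpha * x_t with x_t = 1 for triples accessed in this slice.
        for triple in accessed:
            scores[triple] = scores.get(triple, 0.0) + alpha
    return scores


def hot_triples(scores: Dict[Triple, float], k: int) -> List[Triple]:
    """Return the k triples with the highest smoothed score."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [triple for triple, _ in ranked[:k]]
```

In this sketch a larger `alpha` weights recent slices more heavily, while a smaller `alpha` lets a long access history outweigh a single recent access; a cache would keep the top-ranked triples and evict the rest.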

Author information

Authors and Affiliations

School of Computer Science, The University of Adelaide, Adelaide, SA, 5005, Australia
Wei Emma Zhang, Quan Z. Sheng & Yongrui Qin
CSIRO, Canberra, ACT, 2601, Australia
Kerry Taylor

Corresponding author

Correspondence to Wei Emma Zhang.

Editor information

Editors and Affiliations

Universität München, München, Germany
Matthias Renz
University of Southern California, Los Angeles, USA
Cyrus Shahabi
University of Queensland, Brisbane, Australia
Xiaofang Zhou
Monash University, Clayton, Australia
Muhammad Aamir Cheema

About this paper

Cite this paper

Zhang, W.E., Sheng, Q.Z., Taylor, K., Qin, Y. (2015). Identifying and Caching Hot Triples for Efficient RDF Query Processing. In: Renz, M., Shahabi, C., Zhou, X., Cheema, M. (eds) Database Systems for Advanced Applications. DASFAA 2015. Lecture Notes in Computer Science, vol 9050. Springer, Cham. https://doi.org/10.1007/978-3-319-18123-3_16

DOI: https://doi.org/10.1007/978-3-319-18123-3_16
Published: 09 April 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18122-6
Online ISBN: 978-3-319-18123-3
eBook Packages: Computer Science, Computer Science (R0)
