Abstract
As an essential part of the W3C’s semantic web stack and linked data initiative, RDF data management systems (also known as triplestores) have drawn a lot of research attention. The majority of these systems use value-based indexes (e.g., B + -trees) for physical storage, and ignore many of the structural aspects present in RDF graphs. Structural indexes, on the other hand, have been successfully applied in XML and semi-structured data management to exploit structural graph information in query processing. In those settings, a structural index groups nodes in a graph based on some equivalence criterion, for example, indistinguishability with respect to some query workload (usually XPath). Motivated by this body of work, we have started the SAINT-DB project to study and develop a native RDF management system based on structural indexes. In this paper we present a principled framework for designing and using RDF structural indexes for practical fragments of SPARQL, based on recent formal structural characterizations of these fragments. We then explain how structural indexes can be incorporated in a typical query processing workflow; and discuss the design, implementation, and initial empirical evaluation of our approach.
Chapter PDF
Similar content being viewed by others
Keywords
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
References
Abadi, D., et al.: SW-Store: a vertically partitioned DBMS for semantic web data management. VLDB J. 18, 385–406 (2009)
Abiteboul, S., Hull, R., Vianu, V.: Foundations of Databases. Addison-Wesley (1995)
Arias, M., Fernández, J.D., Martínez-Prieto, M.A., de la Fuente, P.: An empirical study of real-world SPARQL queries. In: USEWOD (2011)
Arion, A., Bonifati, A., Manolescu, I., Pugliese, A.: Path summaries and path partitioning in modern XML databases. WWW 11(1), 117–151 (2008)
Brenes Barahona, S.: Structural summaries for efficient XML query processing. PhD thesis, Indiana University (2011)
Bröcheler, M., Pugliese, A., Subrahmanian, V.S.: DOGMA: A Disk-Oriented Graph Matching Algorithm for RDF Databases. In: Bernstein, A., Karger, D.R., Heath, T., Feigenbaum, L., Maynard, D., Motta, E., Thirunarayan, K. (eds.) ISWC 2009. LNCS, vol. 5823, pp. 97–113. Springer, Heidelberg (2009)
Fletcher, G.H.L., Beck, P.W.: Scalable indexing of RDF graphs for efficient join processing. In: CIKM, Hong Kong, pp. 1513–1516 (2009)
Fletcher, G.H.L., Hidders, J., Vansummeren, S., Luo, Y., Picalausa, F., De Bra, P.: On guarded simulations and acyclic first-order languages. In: DBPL, Seattle (2011)
Fletcher, G.H.L., Van Gucht, D., Wu, Y., Gyssens, M., Brenes, S., Paredaens, J.: A methodology for coupling fragments of XPath with structural indexes for XML documents. Information Systems 34(7), 657–670 (2009)
Gentilini, R., Piazza, C., Policriti, A.: From bisimulation to simulation: Coarsest partition problems. J. Autom. Reasoning 31(1), 73–103 (2003)
Guo, Y., Pan, Z., Heflin, J.: LUBM: A benchmark for OWL knowledge base systems. J. Web Sem. 3(2-3), 158 (2005)
Luo, Y., Picalausa, F., Fletcher, G.H.L., Hidders, J., Vansummeren, S.: Storing and indexing massive rdf datasets. In: De Virgilio, R., et al. (eds.) Semantic Search over the Web, Data-Centric Systems and Applications, pp. 29–58. Springer, Heidelberg (2012)
Milo, T., Suciu, D.: Index Structures for Path Expressions. In: Beeri, C., Bruneman, P. (eds.) ICDT 1999. LNCS, vol. 1540, pp. 277–295. Springer, Heidelberg (1998)
Neumann, T., Weikum, G.: Scalable join processing on very large RDF graphs. In: SIGMOD, pp. 627–640 (2009)
Neumann, T., Weikum, G.: The RDF-3X engine for scalable management of RDF data. VLDB J. 19(1), 91–113 (2010)
Picalausa, F., Vansummeren, S.: What are real SPARQL queries like? In: Proceedings of the International Workshop on Semantic Web Information Management, SWIM 2011, pp. 7:1–7:6. ACM, New York (2011)
Prud’hommeaux, E., Seaborne, A.: SPARQL query language for RDF. Technical report, W3C Recommendation (2008)
Sidirourgos, L., et al.: Column-store support for RDF data management: not all swans are white. Proc. VLDB Endow. 1(2), 1553–1563 (2008)
Tran, T., Ladwig, G.: Structure index for RDF data. In: Workshop on Semantic Data Management, SemData@ VLDB (2010)
Udrea, O., Pugliese, A., Subrahmanian, V.S.: GRIN: A graph based RDF index. In: AAAI, Vancouver, B.C., pp. 1465–1470 (2007)
van Glabbeek, R.J., Ploeger, B.: Correcting a Space-Efficient Simulation Algorithm. In: Gupta, A., Malik, S. (eds.) CAV 2008. LNCS, vol. 5123, pp. 517–529. Springer, Heidelberg (2008)
Weiss, C., Karras, P., Bernstein, A.: Hexastore: Sextuple Indexing for Semantic Web Data Management. In: VLDB, Auckland, New Zealand (2008)
Wylot, M., Pont, J., Wisniewski, M., Cudré-Mauroux, P.: dipLODocus[RDF]—Short and Long-Tail RDF Analytics for Massive Webs of Data. In: Aroyo, L., Welty, C., Alani, H., Taylor, J., Bernstein, A., Kagal, L., Noy, N., Blomqvist, E. (eds.) ISWC 2011, Part I. LNCS, vol. 7031, pp. 778–793. Springer, Heidelberg (2011)
Zou, L., Mo, J., Chen, L., Özsu, M.T., Zhao, D.: gStore: Answering SPARQL queries via subgraph matching. Proc. VLDB Endow. 4(8), 482–493 (2011)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Picalausa, F., Luo, Y., Fletcher, G.H.L., Hidders, J., Vansummeren, S. (2012). A Structural Approach to Indexing Triples. In: Simperl, E., Cimiano, P., Polleres, A., Corcho, O., Presutti, V. (eds) The Semantic Web: Research and Applications. ESWC 2012. Lecture Notes in Computer Science, vol 7295. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30284-8_34
Download citation
DOI: https://doi.org/10.1007/978-3-642-30284-8_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30283-1
Online ISBN: 978-3-642-30284-8
eBook Packages: Computer ScienceComputer Science (R0)