Abstract
In this paper we present WaterFowl, a novel approach for the storage of RDF triples that addresses scalability issues through compression. The architecture of our prototype, largely based on succinct data structures, represents triples in a self-indexed, compact manner that does not require decompression at query answering time. Moreover, it efficiently supports the RDF and RDFS entailment regimes thanks to an optimized encoding of ontology concepts and properties that requires neither complete inference materialization nor query reformulation. This approach requires distinguishing between the terminological and the assertional components of the knowledge base early in the data preparation process, i.e., when preprocessing the data before storing it in our structures. The paper describes our system’s architecture and presents some preliminary results obtained from evaluations on different datasets.
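To make the notion of a self-indexed succinct structure more concrete, here is a minimal Python sketch of the rank and select operations that such structures are commonly built on. It is only an illustration under stated assumptions: the class `BitVector`, the helper `group_range`, and the toy bitmap layout are hypothetical and are not taken from the WaterFowl prototype, and the bits are stored uncompressed with a plain prefix-sum array, whereas a real succinct implementation would use sublinear auxiliary space.

```python
from bisect import bisect_left
from itertools import accumulate


class BitVector:
    """Uncompressed bit sequence with rank/select (illustrative, not succinct)."""

    def __init__(self, bits):
        self.bits = list(bits)
        # prefix[i] holds the number of 1s in bits[0:i]
        self.prefix = [0] + list(accumulate(self.bits))

    def rank1(self, i):
        """Number of 1s in positions [0, i)."""
        return self.prefix[i]

    def select1(self, k):
        """0-based position of the k-th 1, with k counted from 1."""
        if not 1 <= k <= self.prefix[-1]:
            raise ValueError("k out of range")
        # The first index whose prefix count reaches k lies one past the k-th 1.
        return bisect_left(self.prefix, k) - 1


def group_range(bv, j):
    """Half-open range [start, end) of positions in the j-th group (1-based).

    Assumed toy layout: a 1 bit marks the first element of each group.
    """
    total_groups = bv.rank1(len(bv.bits))
    start = bv.select1(j)
    end = bv.select1(j + 1) if j < total_groups else len(bv.bits)
    return start, end


# Hypothetical example: six objects grouped under three predicates.
objects = ["o1", "o2", "o3", "o4", "o5", "o6"]
starts = BitVector([1, 0, 0, 1, 1, 0])

lo, hi = group_range(starts, 3)
third_group = objects[lo:hi]           # objects attached to the third predicate
owner_of_o4 = starts.rank1(3 + 1)      # group number owning position 3
```

In this toy layout the bitmap answers both "which objects belong to a given predicate" (via select) and "which predicate owns a given object" (via rank) directly on the stored sequence, which is the sense in which such a representation can be queried without a separate index or a decompression step.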
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Curé, O., Blin, G., Revuz, D., Faye, D.C. (2014). WaterFowl: A Compact, Self-indexed and Inference-Enabled Immutable RDF Store. In: Presutti, V., d’Amato, C., Gandon, F., d’Aquin, M., Staab, S., Tordai, A. (eds) The Semantic Web: Trends and Challenges. ESWC 2014. Lecture Notes in Computer Science, vol 8465. Springer, Cham. https://doi.org/10.1007/978-3-319-07443-6_21
DOI: https://doi.org/10.1007/978-3-319-07443-6_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07442-9
Online ISBN: 978-3-319-07443-6
eBook Packages: Computer Science, Computer Science (R0)