FILT – Filtering Indexed Lucene Triples – A SPARQL Filter Query Processing Engine

  • Magnus Stuhr
  • Csaba Veres
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8132)

Abstract

The Resource Description Framework (RDF) is the W3C-recommended standard for data on the semantic web, and the SPARQL Protocol and RDF Query Language (SPARQL) is the query language used to retrieve RDF triples. RDF data often contain valuable information that can only be queried through filter functions. SPARQL queries can include filter clauses that define specific data criteria, such as full-text searches, numerical filtering, and constraints and relationships between data resources. The downside of SPARQL filter queries, however, is that they frequently execute slowly. This paper presents FILT (Filtering Indexed Lucene Triples), a SPARQL filter query-processing engine for conventional triplestores. FILT is built on top of the Apache Lucene framework for storing and retrieving indexed documents, and is compatible with unmodified SPARQL queries. The objective of FILT was to decrease the execution time of SPARQL filter queries. This was evaluated by benchmarking FILT against the Joseki triplestore in two use cases: SPARQL regular expression filtering in medical data, and SPARQL numerical/logical filtering of geo-coordinates in geographical locations.
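
To illustrate why an index can speed up the two kinds of filter named in the abstract, here is a minimal Python sketch. It is not FILT's implementation and does not use Apache Lucene: the toy triples, the prefixed names, and every function name are assumptions made for illustration. It contrasts a full regex scan with a token lookup in an inverted index, and answers a numeric range filter with binary search over a sorted list.

```python
import bisect
import re
from collections import defaultdict

# Hypothetical toy data: (subject, predicate, object) triples.
TRIPLES = [
    ("ex:drug1", "rdfs:label", "Aspirin tablet"),
    ("ex:drug2", "rdfs:label", "Ibuprofen gel"),
    ("ex:drug3", "rdfs:label", "Aspirin powder"),
    ("ex:bergen", "geo:lat", 60.39),
    ("ex:oslo", "geo:lat", 59.91),
    ("ex:tromso", "geo:lat", 69.65),
]


def scan_regex(triples, predicate, pattern):
    """Baseline: apply the regex to every candidate literal (a full scan)."""
    rx = re.compile(pattern, re.IGNORECASE)
    return sorted(
        s for s, p, o in triples
        if p == predicate and isinstance(o, str) and rx.search(o)
    )


def build_text_index(triples):
    """Inverted index: (predicate, lowercased token) -> set of subjects."""
    index = defaultdict(set)
    for s, p, o in triples:
        if isinstance(o, str):
            for token in re.findall(r"\w+", o.lower()):
                index[(p, token)].add(s)
    return index


def lookup_term(index, predicate, term):
    """Index lookup for a whole-word term; no scan over the literals."""
    return sorted(index.get((predicate, term.lower()), set()))


def build_numeric_index(triples, predicate):
    """Sorted (value, subject) pairs for one numeric predicate."""
    return sorted(
        (o, s) for s, p, o in triples
        if p == predicate and isinstance(o, (int, float))
    )


def range_query(numeric_index, low, high):
    """Subjects with low <= value <= high, located by binary search."""
    lo = bisect.bisect_left(numeric_index, (low, ""))
    hi = bisect.bisect_right(numeric_index, (high, "\uffff"))
    return [s for _, s in numeric_index[lo:hi]]


if __name__ == "__main__":
    text_index = build_text_index(TRIPLES)
    lat_index = build_numeric_index(TRIPLES, "geo:lat")
    print(scan_regex(TRIPLES, "rdfs:label", "aspirin"))
    print(lookup_term(text_index, "rdfs:label", "aspirin"))
    print(range_query(lat_index, 59.0, 61.0))
```

The token lookup matches the regex scan only for whole-word terms; arbitrary regular expressions need more than this toy index offers. The point of the sketch is only that the indexed lookups avoid touching every triple at query time.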

Keywords

RDF full-text search, SPARQL filter queries, SPARQL regex filtering, SPARQL numerical filtering, RDF data indexing, Lucene

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Magnus Stuhr (1)
  • Csaba Veres (2)
  1. Computas AS, Norway
  2. University of Bergen, Bergen, Norway