Part of the book series: Atlantis Ambient and Pervasive Intelligence (ATLANTISAPI, volume 2)

Abstract

Implementing scalable RDF triple stores that can store many triples and process many queries concurrently is challenging. Several projects have investigated the use of distributed hash tables (DHTs) for this task, but query planning has received little attention in this context so far. Given the distributed nature of DHTs, message latencies and limited network bandwidth are crucial factors to consider. Also, because DHTs lack global knowledge, query planning differs from that in centralized databases. This book chapter discusses a set of heuristics and evaluates their performance on the Lehigh University Benchmark, with emphasis on network traffic. The results show the importance of query planning in DHT-based RDF triple stores.¹

Bibliography

  1. D. Battré. Query Planning in DHT based RDF stores. In Proceedings of Fourth International IEEE Conference on Signal-Image Technologies and Internet-Based System (SITIS 2008), pp. 187–194, (2008).

  2. T. Berners-Lee, J. Hendler, and O. Lassila, The Semantic Web, Scientific American. 284(5), 28–47 (May, 2001).

  3. E. Oren, B. Heitmann, and S. Decker, ActiveRDF: Embedding Semantic Web data into object-oriented languages, Web Semantics: Science, Services and Agents on the World Wide Web. 6(3), 191–202, (2008).

  4. M. Hepp. GoodRelations: An Ontology for Describing Products and Services Offers on the Web. In Proceedings of the 16th International Conference on Knowledge Engineering and Knowledge Management (EKAW2008), (2008).

  5. N. Shadbolt, T. Berners-Lee, and W. Hall, The Semantic Web Revisited, IEEE Intelligent Systems. 21(3), 96–101, (2006).

  6. C. Bizer, T. Heath, D. Ayers, and Y. Raimond. Linking Open Data (ESWC 2007 Poster), (2007).

  7. M. Hausenblas, W. Halb, Y. Raimond, and T. Heath. What is the Size of the Semantic Web? In Proceedings of I-SEMANTICS 08 - International Conference on Semantic Systems, pp. 9–16, (2008).

  8. E. K. Lua, J. Crowcroft, M. Pias, R. Sharma, and S. Lim, A survey and comparison of peer-to-peer overlay network schemes, IEEE Communications Surveys & Tutorials. 7(2), 72–93, (2005).

  9. D. J. Abadi, A. Marcus, S. R. Madden, and K. Hollenbach. Scalable semantic web data management using vertical partitioning. In VLDB ’07: Proceedings of the 33rd international conference on Very large data bases, pp. 411–422. VLDB Endowment, (2007). ISBN 978-1-59593-649-3.

  10. K. Wilkinson, C. Sayers, H. A. Kuno, and D. Reynolds. Efficient RDF Storage and Retrieval in Jena2. In Proceedings of SWDB’03, The first International Workshop on Semantic Web and Databases, pp. 131–150, (2003).

  11. J. J. Carroll, I. Dickinson, C. Dollin, D. Reynolds, A. Seaborne, and K. Wilkinson. Jena: implementing the semantic web recommendations. In WWW Alt. ’04: Proceedings of the 13th international World Wide Web conference on Alternate track papers & posters, pp. 74–83, New York, NY, USA, (2004). ACM Press. ISBN 1-58113-912-8. doi: http://doi.acm.org/10.1145/1013367.1013381.

  12. E. I. Chong, S. Das, G. Eadon, and J. Srinivasan. An efficient SQL-based RDF querying scheme. In VLDB ’05: Proceedings of the 31st international conference on Very large data bases, pp. 1216–1227. VLDB Endowment, (2005). ISBN 1-59593-154-6.

  13. J. Broekstra, A. Kampman, and F. van Harmelen. Sesame: A Generic Architecture for Storing and Querying RDF and RDF Schema. In ISWC ’02: Proceedings of the First International Semantic Web Conference on The Semantic Web, pp. 54–68, London, UK, (2002). Springer-Verlag. ISBN 3-540-43760-6.

  14. S. Harris and N. Gibbins. 3store: Efficient Bulk RDF Storage. In eds. R. Volz, S. Decker, and I. F. Cruz, Proceedings of the First International Workshop on Practical and Scalable Semantic Systems, vol. 89, CEUR Workshop Proceedings. CEUR-WS.org, (2003).

  15. T. Neumann and G. Weikum, RDF-3X: a RISC-style engine for RDF, Proceedings of the VLDB Endowment. 1(1), 647–659, (2008). doi: http://doi.acm.org/10.1145/1453856.1453927.

  16. W. Nejdl, B. Wolf, C. Qu, S. Decker, M. Sintek, A. Naeve, M. Nilsson, M. Palmér, and T. Risch. EDUTELLA: A P2P networking infrastructure based on RDF. In WWW ’02: Proceedings of the 11th international conference on World Wide Web, pp. 604–615, New York, NY, USA, (2002). ACM Press. ISBN 1-58113-449-5. doi: http://doi.acm.org/10.1145/511446.511525.

  17. W. Nejdl, W. Siberski, U. Thaden, and W.-T. Balke. Top-k Query Evaluation for Schema-Based Peer-to-Peer Networks. In eds. S. A. McIlraith, D. Plexousakis, and F. van Harmelen, The Semantic Web - ISWC 2004: Third International Semantic Web Conference, Lecture Notes in Computer Science, vol. 3298, pp. 137–151 (Jan., 2004).

  18. W. Nejdl, M. Wolpers, W. Siberski, C. Schmitz, M. Schlosser, I. Brunkhorst, and A. Löser. Super-Peer-Based Routing and Clustering Strategies for RDF-Based Peer-To-Peer Networks. In WWW ’03: Proceedings of the 12th international conference on World Wide Web, pp. 536–543, New York, NY, USA, (2003). ACM Press. ISBN 1-58113-680-3. doi: http://doi.acm.org/10.1145/775152.775229.

  19. I. Brunkhorst, H. Dhraief, A. Kemper, W. Nejdl, and C. Wiesner. Distributed Queries and Query Optimization in Schema-Based P2P-Systems. In International Workshop On Databases, Information Systems and Peer-to-Peer Computing, pp. 184–199, (2003).

  20. G. Kokkinidis, L. Sidirourgos, and V. Christophides, Semantic Web and Peer-to-Peer, In Semantic Web and Peer-to-Peer, chapter Query Processing in RDF/S-based P2P Database Systems, pp. 59–81. Springer, (2006).

  21. M. Cai and M. Frank. RDFPeers: A Scalable Distributed RDF Repository based on A Structured Peer-to-Peer Network. In Proceedings of the 13th International World Wide Web Conference (WWW2004), pp. 650–657 (May, 2004).

  22. M. Cai, M. Frank, B. Pan, and R. MacGregor, A Subscribable Peer-to-Peer RDF Repository for Distributed Metadata Management, Journal of Web Semantics: Science, Services and Agents on the World Wide Web. 2(2), 109–130, (2004).

  23. M. Cai, M. Frank, J. Chen, and P. Szekely, MAAN: A Multi-Attribute Addressable Network for Grid Information Services, Journal of Grid Computing. 2(1), (2004).

  24. A. Matono, S. M. Pahlevi, and I. Kojima. RDFCube: A P2P-Based Three-Dimensional Index for Structural Joins on Distributed Triple Stores. In Ref. [55], pp. 323–330. ISBN 978-3-540-71660-0.

  25. F. Heine, M. Hovestadt, and O. Kao. Processing complex RDF queries over P2P networks. In P2PIR’05: Proceedings of the 2005 ACM workshop on Information retrieval in peer-to-peer networks, pp. 41–48. ACM Press, (2005). ISBN 1-59593-164-3. doi: http://doi.acm.org/10.1145/1096952.1096960.

  26. F. Heine. Scalable P2P based RDF Querying. In InfoScale ’06: Proceedings of the 1st international conference on Scalable information systems, p. 17, New York, NY, USA, (2006). ACM Press. ISBN 1-59593-428-6. doi: http://doi.acm.org/10.1145/1146847.1146864.

  27. B. H. Bloom, Space/Time Trade-offs in Hash Coding with Allowable Errors, Communications of the ACM. 13(7), 422–426, (1970).

  28. D. Battré, F. Heine, and O. Kao. Top k RDF Query Evaluation in Structured P2P Networks. In eds. W. Nagel, W. Walter, and W. Lehner, Euro-Par 2006 Parallel Processing: 12th International Euro-Par Conference, vol. 4128, LNCS, pp. 995–1004. Springer Berlin / Heidelberg, (2006). doi: 10.1007/11823285.

  29. D. Battré, F. Heine, A. Höing, and O. Kao. On Triple Dissemination, Forward-Chaining, and Load Balancing in DHT Based RDF Stores. In Ref. [55], pp. 343–354. ISBN 978-3-540-71660-0.

  30. D. Battré, F. Heine, A. Höing, and O. Kao. Load-balancing in P2P based RDF stores. In Proceedings of Second International Workshop on Scalable Semantic Web Knowledge Base Systems (SSWS 2006), pp. 21–34, (2006).

  31. D. Battré, Caching of intermediate results in DHT-based RDF stores, International Journal of Metadata, Semantics and Ontologies. 3(1), 84–93, (2008).

  32. M. Koubarakis, I. Miliaraki, Z. Kaoudi, M. Magiridou, and A. Papadakis-Pesaresi. Semantic Grid Resource Discovery using DHTs in Atlas. In 3rd GGF Semantic Grid Workshop (Feb., 2006).

  33. E. Liarou, S. Idreos, and M. Koubarakis. Evaluating Conjunctive Triple Pattern Queries over Large Structured Overlay Networks. In eds. I. Cruz, S. Decker, D. Allemang, C. Preist, D. Schwabe, P. Mika, M. Uschold, and L. Aroyo, The Semantic Web – ISWC 2006, vol. 4273, LNCS, pp. 399–413 (Nov., 2006).

  34. E. Liarou, S. Idreos, and M. Koubarakis. Continuous RDF Query Processing over DHTs. In Ref. [56], pp. 324–339. ISBN 978-3-540-76297-3.

  35. Z. Kaoudi, I. Miliaraki, and M. Koubarakis. RDFS Reasoning and Query Answering on Top of DHTs. In 7th International Semantic Web Conference (ISWC 2008), (2008).

  36. S. Rhea, D. Geels, T. Roscoe, and J. Kubiatowicz. Handling churn in a DHT. In ATEC’04: Proceedings of the USENIX Annual Technical Conference 2004 on USENIX Annual Technical Conference, pp. 10–10, Berkeley, CA, USA, (2004). USENIX Association.

  37. K. Aberer, P. Cudré-Mauroux, M. Hauswirth, and T. van Pelt. GridVine: Building Internet-Scale Semantic Overlay Networks. In International Semantic Web Conference (ISWC), vol. 3298, LNCS, pp. 107–121, (2004).

  38. P. Cudré-Mauroux, S. Agarwal, and K. Aberer, GridVine: An Infrastructure for Peer Information Management, IEEE Internet Computing. 11(5), 36–44, (2007). ISSN 1089-7801. doi: http://doi.ieeecomputersociety.org/10.1109/MIC.2007.108.

  39. K. Aberer, P. Cudré-Mauroux, A. Datta, Z. Despotovic, M. Hauswirth, M. Punceva, and R. Schmidt, P-Grid: a self-organizing structured P2P system, SIGMOD Rec. 32(3), 29–33, (2003). ISSN 0163-5808. doi: http://doi.acm.org/10.1145/945721.945729.

  40. A. Harth, J. Umbrich, A. Hogan, and S. Decker. YARS2: A Federated Repository for Querying Graph Structured Data from the Web. In Ref. [56], pp. 211–224. ISBN 978-3-540-76297-3.

  41. A. Harth and S. Decker. Optimized Index Structures for Querying RDF from the Web. In LA-WEB ’05: Proceedings of the Third Latin American Web Congress, pp. 71–80, Washington, DC, USA, (2005). IEEE Computer Society. ISBN 0-7695-2471-0. doi: http://dx.doi.org/10.1109/LAWEB.2005.25.

  42. O. Hartig and R. Heese. The SPARQL Query Graph Model for Query Optimization. In The Semantic Web: Research and Applications (ESWC 2007), vol. 4519/2007, LNCS, pp. 564–578, (2007). doi: 10.1007/978-3-540-72667-8.

  43. M. Stocker, A. Seaborne, A. Bernstein, C. Kiefer, and D. Reynolds. SPARQL Basic Graph Pattern Optimization Using Selectivity Estimation. In Proceedings of the 17th International World Wide Web Conference (WWW), (2008).

  44. D. Battré. Efficient Query Processing in DHT-based RDF Stores. PhD thesis, Technische Universität Berlin, Germany (Dec., 2008). URL http://nbn-resolving.de/urn:nbn:de:kobv:83-opus-21188.

  45. A. Rao, K. Lakshminarayanan, S. Surana, R. Karp, and I. Stoica. Load Balancing in Structured P2P Systems. In Proceedings of the 2nd International Workshop on Peer-to-Peer Systems (IPTPS 03). Springer, (2003).

  46. S. Surana, B. Godfrey, K. Lakshminarayanan, R. Karp, and I. Stoica, Load Balancing in Dynamic Structured P2P Systems, Performance Evaluation. 63(6), 217–240 (Mar., 2006).

  47. Y. Zhu and Y. Hu, Efficient, Proximity-Aware Load Balancing for DHT-Based P2P Systems, IEEE Transactions on Parallel and Distributed Systems. 16(4), 349–361, (2005).

  48. Y. Guo, Z. Pan, and J. Heflin, LUBM: A Benchmark for OWL Knowledge Base Systems, Journal of Web Semantics. 3(2), 158–182, (2005).

  49. D. E. Knuth, The Art of Computer Programming, Volume 3: Sorting and Searching. (Addison Wesley, 1998).

  50. F. Heine. P2P based RDF Querying and Reasoning for Grid Resource Description and Matching. PhD thesis, University of Paderborn, Germany (July, 2006).

  51. N. J. A. Harvey, M. B. Jones, S. Saroiu, M. Theimer, and A. Wolman. SkipNet: A Scalable Overlay Network with Practical Locality Properties. In USENIX Symposium on Internet Technologies and Systems, Seattle, WA (Mar., 2003).

  52. P. Ganesan, M. Bawa, and H. Garcia-Molina. Online Balancing of Range-Partitioned Data with Applications to Peer-to-Peer Systems. In eds. M. A. Nascimento, M. T. Özsu, D. Kossmann, R. J. Miller, J. A. Blakeley, and K. B. Schiefer, (e)Proceedings of the Thirtieth International Conference on Very Large Data Bases, pp. 444–455. Morgan Kaufmann, (2004). ISBN 0-12-088469-0.

  53. Y. Chen and W. Benn. Query Evaluation for Distributed Heterogeneous Relational Databases. In COOPIS ’98: Proceedings of the 3rd IFCIS International Conference on Cooperative Information Systems, pp. 44–53, Washington, DC, USA, (1998). IEEE Computer Society.

  54. M.-S. Chen, P. S. Yu, and K.-L. Wu, Optimization of Parallel Execution for Multi-Join Queries, IEEE Transactions on Knowledge and Data Engineering. 8, 416–428, (1996).

  55. G. Moro, S. Bergamaschi, S. Joseph, J.-H. Morin, and A. M. Ouksel, Eds. Databases, Information Systems, and Peer-to-Peer Computing, International Workshops, DBISP2P 2005/2006, Trondheim, Norway, August 28-29, 2005, Seoul, Korea, September 11, 2006, Revised Selected Papers, vol. 4125, Lecture Notes in Computer Science, (2007). Springer. ISBN 978-3-540-71660-0.

  56. K. Aberer, K.-S. Choi, N. F. Noy, D. Allemang, K.-I. Lee, L. J. B. Nixon, J. Golbeck, P. Mika, D. Maynard, R. Mizoguchi, G. Schreiber, and P. Cudré-Mauroux, Eds. The Semantic Web, 6th International Semantic Web Conference, 2nd Asian Semantic Web Conference, ISWC 2007 + ASWC 2007, Busan, Korea, November 11-15, 2007, vol. 4825, Lecture Notes in Computer Science, (2007). Springer. ISBN 978-3-540-76297-3.

Author information

Correspondence to Dominic Battré.

Copyright information

© 2010 Atlantis Press/World Scientific

About this chapter

Cite this chapter

Battré, D. (2010). Query Planning in DHT Based RDF Stores. In: Web-Based Information Technologies and Distributed Systems. Atlantis Ambient and Pervasive Intelligence, vol 2. Atlantis Press. https://doi.org/10.2991/978-94-91216-32-9_4
