Optimizing the Performance of Concurrent RDF Stream Processing Queries

  • Chan Le Van
  • Feng Gao
  • Muhammad Intizar Ali
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10249)

Abstract

With the growing popularity of the Internet of Things (IoT) and sensing technologies, a large number of data streams are being generated at a very rapid pace. To explore the potential of integrating IoT and semantic technologies, a few RDF Stream Processing (RSP) query engines have been made available that can process, analyze and reason over semantic data streams in real time. In this way, RSP mitigates data interoperability issues and promotes knowledge discovery and smart decision making for time-sensitive applications. However, a major hurdle to the wide adoption of RSP systems is their query performance. In particular, the ability of RSP engines to handle a large number of concurrent queries is very limited, which deters large-scale stream processing applications (e.g. smart city applications) from adopting RSP. In this paper, we propose a shared-join based approach to improve the performance of an RSP engine for concurrent queries. We also leverage query federation mechanisms to allow distributed query processing over multiple RSP engine instances, in order to gain performance for concurrent and distributed queries. We apply load balancing strategies to distribute queries and further optimize concurrent query performance. We provide a proof-of-concept implementation by extending the CQELS RSP engine and evaluate our approach using existing benchmark datasets for RSP. We also compare the performance of our proposed approach with the state-of-the-art implementation of the CQELS RSP engine.

Keywords

Linked Data, RDF Stream Processing, Query optimization

Notes

Acknowledgment

The authors are extremely thankful to John Breslin, Alessandra Mileo and Danh Le-Phuoc for their valuable feedback and guidance. The work conducted during this study is supported by Science Foundation Ireland (SFI) under grant No. SFI/12/RC/2289.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. INSIGHT Centre for Data Analytics, National University of Ireland, Galway, Ireland
