Multi-tenant Pub/Sub Processing for Real-Time Data Streams

  • Álvaro Villalba
  • David Carrera
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11339)

Abstract

Devices and sensors generate streams of data across a diversity of locations and protocols. That data usually reaches a central platform that stores and processes the streams. Processing can be done in real time, with transformations and enrichment happening on the fly, but it can also happen after the data has been stored and organized in repositories. In the former case, stream processing technologies are required to operate on the data; in the latter, batch analytics and queries are commonly used.

This paper introduces a runtime that dynamically constructs data stream processing topologies from user-supplied code. These topologies are built on the fly using a data subscription model defined by the applications that consume the data. Each user-defined processing unit is called a Service Object. Every Service Object consumes input data streams and may produce output streams that others can consume. The subscription-based programming model enables multiple users to deploy their own data-processing services. The runtime dynamically forwards data and executes the Service Objects of different users. Data streams can originate in real-world devices, or they can be the outputs of Service Objects.
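The subscription model described above can be illustrated with a small in-memory sketch. It is not the paper's runtime or API: the names `ServiceObject`, `Runtime`, `deploy`, `publish`, the stream names, and the two example tenants are all hypothetical, and a single-process queue stands in for the distributed forwarding that the real system delegates to Apache Storm.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Update = Dict[str, float]  # hypothetical shape of one data update


@dataclass
class ServiceObject:
    """Hypothetical user-defined processing unit owned by one tenant."""
    owner: str
    output_stream: str
    subscriptions: List[str]
    # User-supplied code: returns a new update, or None to emit nothing.
    transform: Callable[[Update], Optional[Update]]


class Runtime:
    """Toy single-process stand-in for the multi-tenant runtime."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[ServiceObject]] = defaultdict(list)
        self.log: List[Tuple[str, Update]] = []  # every update seen, in order

    def deploy(self, so: ServiceObject) -> None:
        # The topology grows as Service Objects register their subscriptions.
        for stream in so.subscriptions:
            self._subscribers[stream].append(so)

    def publish(self, stream: str, update: Update) -> None:
        # Forward the update to subscribers; their outputs are forwarded in turn.
        # Note: a cycle of subscriptions would loop forever in this sketch.
        pending = deque([(stream, update)])
        while pending:
            name, data = pending.popleft()
            self.log.append((name, data))
            for so in self._subscribers.get(name, []):
                result = so.transform(data)
                if result is not None:
                    pending.append((so.output_stream, result))


def run_demo() -> List[Tuple[str, Update]]:
    runtime = Runtime()
    # Tenant "alice" converts a device stream from Celsius to Fahrenheit.
    runtime.deploy(ServiceObject(
        "alice", "alice/temp_f", ["device/temp_c"],
        lambda u: {"value": u["value"] * 9 / 5 + 32}))
    # Tenant "bob" consumes alice's output and emits only readings above 80 F.
    runtime.deploy(ServiceObject(
        "bob", "bob/alerts", ["alice/temp_f"],
        lambda u: {"value": u["value"]} if u["value"] > 80 else None))
    runtime.publish("device/temp_c", {"value": 20.0})
    runtime.publish("device/temp_c", {"value": 30.0})
    return runtime.log


if __name__ == "__main__":
    for stream_name, data in run_demo():
        print(stream_name, data)
```

In this sketch the second tenant never refers to the device stream directly; it only subscribes to the first tenant's output, which is how independently deployed services chain into a larger topology.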

The runtime leverages Apache Storm for parallel data processing, which, combined with dynamic user-code injection, provides multi-tenant stream processing topologies. In this work we describe the runtime, its features and implementation details, and include a performance evaluation of some of its core components.

Keywords

Big Data Analytics, Stream processing, Real-time data processing, Programming models, Internet of Things, IoT

Notes

Acknowledgments

This work is partially supported by the European Research Council (ERC) under the EU Horizon 2020 programme (GA 639595), the Spanish Ministry of Economy, Industry and Competitiveness (TIN2015-65316-P) and the Generalitat de Catalunya (2014-SGR-1051).

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Technical University of Catalonia (UPC), Barcelona, Spain
  2. Barcelona Supercomputing Center (BSC), Barcelona, Spain
