Abstract
In light of growing data volumes and continuing digitization in fields such as Industry 4.0 and the Internet of Things, data stream processing has gained popularity and importance. Enterprises in particular can benefit from this development by augmenting their vital core business data with up-to-date streaming information. Enriching this transactional data with detailed information from high-frequency data streams allows new analytical questions to be answered and current analyses to be improved, e.g., regarding predictive maintenance. Comparing data stream processing architectures for use in an enterprise context, i.e., when combining streaming and business data, is currently a challenging task because there is no suitable benchmark.
In this paper, we give an overview of performance benchmarks in the area of data stream processing. We highlight shortcomings of existing benchmarks and present the need for a new benchmark with a focus on an enterprise context. Furthermore, we introduce the ideas behind Senska, a new enterprise streaming benchmark intended to fill this gap, and its architecture.
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Hesse, G., Reissaus, B., Matthies, C., Lorenz, M., Kraus, M., Uflacker, M. (2018). Senska – Towards an Enterprise Streaming Benchmark. In: Nambiar, R., Poess, M. (eds) Performance Evaluation and Benchmarking for the Analytics Era. TPCTC 2017. Lecture Notes in Computer Science, vol 10661. Springer, Cham. https://doi.org/10.1007/978-3-319-72401-0_3
DOI: https://doi.org/10.1007/978-3-319-72401-0_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-72400-3
Online ISBN: 978-3-319-72401-0
eBook Packages: Computer Science (R0)