Combining Stream Processing Engines and Big Data Storages for Data Analysis

  • Thomas Steinmaurer
  • Patrick Traxler
  • Michael Zwick
  • Reinhard Stumptner
  • Christian Lettner
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8502)


We propose a system that combines stream processing engines with big data storage to analyze large volumes of streaming data. It allows data to be analyzed online and stored for later offline analysis. The design emphasizes making data analysis algorithms simple to implement.
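The combination of online analysis and storage for offline analysis can be illustrated with a minimal sketch. It is not the authors' system: the class `StreamPipeline`, its methods, and the in-memory list standing in for a big data store are all hypothetical names chosen for illustration, and the sliding-window mean is only one example of a continuous query.

```python
from collections import deque
from statistics import mean


class StreamPipeline:
    """Hypothetical sketch: one ingest path feeds an online query and a store."""

    def __init__(self, window_size):
        # Sliding window of the most recent items, used by the online query.
        self.window = deque(maxlen=window_size)
        # Assumption: a plain list stands in for a big data storage system.
        self.store = []

    def ingest(self, item):
        self.store.append(item)    # persist every item for later offline analysis
        self.window.append(item)   # oldest item drops out once the window is full
        return mean(self.window)   # continuous query: mean over the current window

    def offline_mean(self):
        # Offline analysis runs over everything that was stored.
        return mean(self.store)


pipeline = StreamPipeline(window_size=3)
online_results = [pipeline.ingest(x) for x in [1.0, 2.0, 3.0, 4.0]]
```

Each call to `ingest` returns the online result for the current window, while `offline_mean` can be run at any later time over the full stored history. An analysis algorithm only needs to supply the function applied to the window or to the store.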


Data Stream; Data Item; Query Language; Continuous Query; Test Data Generator
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Thomas Steinmaurer (1)
  • Patrick Traxler (1)
  • Michael Zwick (1)
  • Reinhard Stumptner (1)
  • Christian Lettner (1)
  1. Software Competence Center Hagenberg, Austria
