Encyclopedia of Database Systems

2018 Edition
Editors: Ling Liu, M. Tamer Özsu

Big Stream Systems

  • Nathan Backman
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_80702

Synonyms

Continuous workflow execution frameworks; Distributed stream processing

Definition

Big stream systems aim to bring the scalability of batch processing frameworks to stream applications. Stream processing systems face different constraints and a different set of challenges than batch processing systems. The unbounded and potentially high-volume nature of streams requires stream applications to execute continuously and to limit the role of disk-based storage. The throughput of high-volume streams can exceed the throughput of disks, and the stream data may not have any lasting value beyond the meaning that can be extracted from them. Big stream systems address the challenge of achieving high scalability in stream processing by (1) keeping data moving and off disk, (2) implementing fault-tolerance strategies so that stream data persist in the event of faults, and (3) spreading computational workloads across many nodes while preserving the integrity and order of the...
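The third strategy in the definition, spreading work across nodes while preserving order, can be illustrated with a small sketch. The entry does not name a partitioning scheme, so everything here is an assumption made for illustration: the names `worker_for` and `run` are hypothetical, the "workers" are simulated in a single process, and a stable hash of the key stands in for whatever routing a real system would use. The sketch routes every tuple with the same key to the same worker, so per-key arrival order is kept, and it holds all state in memory rather than on disk.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def worker_for(key: str, num_workers: int) -> int:
    """Map a key to a worker index with a stable hash (hypothetical helper).

    The same key always maps to the same worker, so tuples that share a key
    are processed by one worker in the order they arrived.
    """
    return zlib.crc32(key.encode("utf-8")) % num_workers


def run(stream: Iterable[Tuple[str, int]],
        num_workers: int) -> Iterator[Tuple[int, str, int]]:
    """Consume (key, value) tuples one at a time and emit running sums per key.

    Assumption: workers are simulated as in-memory tables in one process.
    Nothing is buffered to disk; each tuple is handled as it arrives.
    """
    state: List[Dict[str, int]] = [defaultdict(int) for _ in range(num_workers)]
    for key, value in stream:
        w = worker_for(key, num_workers)
        state[w][key] += value
        # Emit the updated result immediately instead of waiting for the
        # (unbounded) stream to end.
        yield w, key, state[w][key]


if __name__ == "__main__":
    events = iter([("a", 1), ("b", 2), ("a", 3), ("c", 4), ("a", 5)])
    for worker, key, total in run(events, num_workers=3):
        print(f"worker={worker} key={key} running_sum={total}")
```

Because `run` is a generator, results are produced continuously as tuples arrive. This sketch does not address the fault-tolerance strategy: the in-memory state would be lost if the process failed.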

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Computer Science, Buena Vista University, Storm Lake, USA

Section editors and affiliations

  • Ugur Cetintemel (1)
  1. Department of Computer Science, Brown University, Providence, USA