Apache Flink

Hueske, Fabian; Walther, Timo

doi:10.1007/978-3-319-63962-8_303-1

Living reference work entry
First Online: 24 April 2018

Synonyms

Stratosphere platform

Overview

Today, virtually all data is continuously generated as streams of events. This includes business transactions, interactions with web or mobile applications, sensor or device logs, and database modifications. There are two ways to process continuously produced data: batch processing and stream processing. In stream processing, a continuously running application ingests and processes the data as soon as it arrives. In batch processing, the data is first recorded and persisted in a storage system, such as a file system or database system, and is then processed (periodically) by an application that operates on a bounded data set. Stream processing typically produces results with lower latency, but it introduces operational challenges, because streaming applications that run 24 × 7 place high demands on failure recovery and consistency guarantees.
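
The contrast between the two processing styles can be illustrated with a minimal sketch in plain Python. It does not use the Flink API; the event shape `(key, value)`, the function names `process_batch` and `process_stream`, and the sample sensor data are all assumptions made for illustration. The batch function sees the complete, bounded data set before it computes anything, while the stream function keeps running state and emits an updated result for every event as it arrives.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical event shape: (key, value), e.g. a sensor id and a reading.
Event = Tuple[str, int]


def process_batch(recorded: List[Event]) -> Dict[str, int]:
    """Batch style: the bounded data set is fully recorded before processing."""
    totals: Counter = Counter()
    for key, value in recorded:
        totals[key] += value
    # A single result is produced once all recorded events were read.
    return dict(totals)


def process_stream(events: Iterable[Event]) -> Iterator[Tuple[str, int]]:
    """Stream style: each arriving event updates state and yields a result."""
    totals: Counter = Counter()  # state that lives as long as the application
    for key, value in events:
        totals[key] += value
        # An updated per-key total is emitted immediately, with low latency.
        yield key, totals[key]


# Illustrative data only.
events = [("sensor-a", 3), ("sensor-b", 5), ("sensor-a", 4)]

batch_result = process_batch(events)
stream_results = list(process_stream(iter(events)))
```

In this sketch the running state of `process_stream` exists only in memory, so it would be lost if the process failed. That is the kind of failure recovery and consistency concern the paragraph above attributes to long-running streaming applications.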

The most fundamental difference between batch and stream processing applications is that...

Author information

Authors and Affiliations

data Artisans GmbH, Berlin, Germany
Fabian Hueske & Timo Walther

Corresponding author

Correspondence to Fabian Hueske.

Editor information

Editors and Affiliations

Institute of Computer Science, University of Tartu, Tartu, Estonia
Sherif Sakr
School of Information Technologies, Building J12, University of Sydney, Sydney, Australia
Albert Zomaya

Section Editor information

Politecnico di Milano http://home.deib.polimi.it/margara/
Alessandro Margara
Database Systems and Information Management Group, Technische Universität Berlin, Berlin, Germany
Tilmann Rabl

About this entry

Cite this entry

Hueske, F., Walther, T. (2018). Apache Flink. In: Sakr, S., Zomaya, A. (eds) Encyclopedia of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-63962-8_303-1

DOI: https://doi.org/10.1007/978-3-319-63962-8_303-1
Received: 23 February 2018
Accepted: 11 March 2018
Published: 24 April 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63962-8
Online ISBN: 978-3-319-63962-8
eBook Packages: Springer Reference Mathematics; Reference Module Computer Science and Engineering

Chapter history

Latest: Apache Flink, published 17 May 2022. DOI: https://doi.org/10.1007/978-3-319-63962-8_303-2
Original: Apache Flink, published 24 April 2018. DOI: https://doi.org/10.1007/978-3-319-63962-8_303-1
