Abstract
In 2004, Google introduced the MapReduce framework as a simple and powerful programming model that enables the easy development of scalable parallel applications to process vast amounts of data on large clusters of commodity machines (Dean and Ghemawat, OSDI, 2004, [20]). In particular, the implementation described in the original paper is designed primarily to achieve high performance on large clusters of commodity PCs. One of the main advantages of this approach is that it isolates the application from the details of running a distributed program, such as data distribution, scheduling, and fault tolerance. In this model, the computation takes a set of key-value pairs as input and produces a set of key-value pairs as output.
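The key-value model described above can be illustrated with the canonical word-count example from the original paper. The sketch below is a minimal, single-process simulation: `map_fn`, `reduce_fn`, and `map_reduce` are illustrative names chosen here, and the in-memory grouping stands in for the distributed shuffle that a real MapReduce framework performs transparently.

```python
from collections import defaultdict
from typing import Iterable, Iterator, Tuple

# Map phase: emit an intermediate (key, value) pair for each word.
def map_fn(document: str) -> Iterator[Tuple[str, int]]:
    for word in document.split():
        yield (word, 1)

# Reduce phase: combine all values that share the same key.
def reduce_fn(key: str, values: Iterable[int]) -> Tuple[str, int]:
    return (key, sum(values))

def map_reduce(documents):
    # Group intermediate pairs by key; in a real distributed run the
    # framework's shuffle stage does this across machines.
    groups = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return dict(reduce_fn(k, vs) for k, vs in groups.items())

counts = map_reduce(["big data", "big clusters"])
# counts == {"big": 2, "data": 1, "clusters": 1}
```

Because the user supplies only the map and reduce functions, the framework is free to partition the input, schedule the map and reduce tasks across machines, and re-execute failed tasks without any change to the application code.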
© 2016 The Author(s)
Sakr, S. (2016). General-Purpose Big Data Processing Systems. In: Big Data 2.0 Processing Systems. SpringerBriefs in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-319-38776-5_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-38775-8
Online ISBN: 978-3-319-38776-5