Abstract
Spark is a next-generation big data framework for processing and analyzing large data sets. It offers a unified engine with high-level APIs in Scala, Python, Java, and R, along with powerful libraries: Spark SQL for SQL support, MLlib for machine learning, Spark Streaming for real-time streaming, and GraphX for graph processing.[i] Spark was created by Matei Zaharia at the University of California, Berkeley’s AMPLab and was later donated to the Apache Software Foundation, becoming a top-level project on February 24, 2014.[ii] Version 1.0 was released on May 30, 2014.[iii]
Copyright information
© 2018 Butch Quinto
About this chapter
Cite this chapter
Quinto, B. (2018). Introduction to Spark. In: Next-Generation Big Data. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-3147-0_5
DOI: https://doi.org/10.1007/978-1-4842-3147-0_5
Publisher Name: Apress, Berkeley, CA
Print ISBN: 978-1-4842-3146-3
Online ISBN: 978-1-4842-3147-0
eBook Packages: Professional and Applied Computing, Apress Access Books, Professional and Applied Computing (R0)