Introduction to Spark

Quinto, Butch

doi:10.1007/978-1-4842-3147-0_5

Abstract

Spark is a next-generation big data framework for processing and analyzing large data sets. It provides a unified engine with high-level APIs in Scala, Python, Java, and R, along with libraries that include Spark SQL for SQL support, MLlib for machine learning, Spark Streaming for real-time streaming, and GraphX for graph processing.ⁱ Spark was created by Matei Zaharia at the University of California, Berkeley’s AMPLab and was later donated to the Apache Software Foundation, becoming a top-level project on February 24, 2014.ⁱⁱ Version 1.0 was released on May 30, 2014.ⁱⁱⁱ

Author information

Authors and Affiliations

Plumpton, Victoria, Australia
Butch Quinto

About this chapter

Cite this chapter

Quinto, B. (2018). Introduction to Spark. In: Next-Generation Big Data. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-3147-0_5

DOI: https://doi.org/10.1007/978-1-4842-3147-0_5
Published: 13 June 2018
Publisher Name: Apress, Berkeley, CA
Print ISBN: 978-1-4842-3146-3
Online ISBN: 978-1-4842-3147-0
eBook Packages: Professional and Applied Computing; Apress Access Books; Professional and Applied Computing (R0)
