Introduction to Spark

  • Zubair Nabi


Spark was first open sourced in 2010 and entered Apache incubation in 2013. By early 2014, it had been promoted to a top-level Apache project. It has already replaced Hadoop as the Big Data processing engine of choice in most organizations, which is a testament to its maturity and the richness of its design. Batch processing, iterative and interactive computation, stream processing, graph analytics, ETL, machine learning, and data warehousing: you name it, and Spark can already handle it. This chapter is a hands-on primer on Spark that sets the stage for the rest of the book.



Copyright information

© Zubair Nabi 2016

Authors and Affiliations

  • Zubair Nabi, Lahore, Pakistan
