Spark SQL, DataFrames, and Datasets

  • Subhashini Chellappan
  • Dharanitharan Ganesan


In the previous chapter on Spark Core, you learned about RDD transformations and actions, the fundamental building blocks of Apache Spark. In this chapter, you will learn the concepts behind Spark SQL, DataFrames, and Datasets. The Spark SQL DataFrame and Dataset APIs let you process structured file data without writing core RDD transformations and actions directly. This allows developers to analyze structured data much faster than they could by applying transformations to RDDs by hand.

Copyright information

© Subhashini Chellappan, Dharanitharan Ganesan 2018

Authors and Affiliations

  • Subhashini Chellappan (1)
  • Dharanitharan Ganesan (2)

  1. Bangalore, India
  2. Krishnagiri, India
