Abstract
In the previous chapter on Spark Core, you learned about RDD transformations and actions, the fundamental building blocks of Apache Spark. In this chapter, you will learn the concepts of Spark SQL, DataFrames, and Datasets. The Spark SQL DataFrame and Dataset APIs let you process structured data without using core RDD transformations and actions. This allows developers to analyze structured data much faster than they could by applying transformations to RDDs directly.
Copyright information
© 2018 Subhashini Chellappan, Dharanitharan Ganesan
About this chapter
Cite this chapter
Chellappan, S., Ganesan, D. (2018). Spark SQL, DataFrames, and Datasets. In: Practical Apache Spark. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-3652-9_4
DOI: https://doi.org/10.1007/978-1-4842-3652-9_4
Publisher Name: Apress, Berkeley, CA
Print ISBN: 978-1-4842-3651-2
Online ISBN: 978-1-4842-3652-9
eBook Packages: Professional and Applied Computing, Apress Access Books, Professional and Applied Computing (R0)