Abstract
Preparing data for analysis is a major part of any analytics project. Raw data comes from a variety of sources, such as classical relational databases, flat files, and spreadsheets, as well as unstructured sources such as social media text. A project may contain both structured and unstructured data and, to add to the complexity, may draw on numerous data sources. As you would expect, the data poses many challenges, both in quality and in quantity. An analyst first needs to read the data from its sources, which can itself be a challenging task, and then parse it into a form that is useful for further analysis. SAS needs data to be in its own datasets before you can use any of its routines for analysis. In short, raw data is not always ready for analysis; it needs to be validated and cleaned first.
Copyright information
© 2015 Venkat Reddy Konasani
About this chapter
Cite this chapter
Konasani, V.R., Kadre, S. (2015). Data Exploration, Validation, and Data Sanitization. In: Practical Business Analytics Using SAS. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-0043-8_7
DOI: https://doi.org/10.1007/978-1-4842-0043-8_7
Publisher Name: Apress, Berkeley, CA
Print ISBN: 978-1-4842-0044-5
Online ISBN: 978-1-4842-0043-8
eBook Packages: Professional and Applied Computing; Apress Access Books; Professional and Applied Computing (R0)