Augmenting ETL Processes
This chapter covers how to use PowerShell to augment ETL development. We often need to load external files into a SQL Server database, but these files usually need some preparation before they can be loaded. Perhaps they arrive via FTP from an external source. They may be compressed. Perhaps they need to be scrubbed of bad values or modified to make them easier to load. After the files are loaded, the business may want them archived and retained for a period of time. Before PowerShell, legacy-style batch files were often employed for these tasks. However, batch files are cryptic, difficult to maintain, and lack support for reusability. In this chapter, we will see how PowerShell scripts can accomplish these tasks instead.

Rather than define a specific business scenario, we treat this as a common ETL pattern from which we can choose the tasks that apply. In this pattern, files arrive in a folder and are copied to a local server, then decompressed, loaded, and archived. Typically, email notifications are sent out when the job starts and ends. Sometimes there are additional requirements. We will discuss functions that help with these tasks. Let’s consider the potential ETL steps already mentioned as a template from which we can pick what we need.
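To make the pattern concrete before looking at the individual functions, here is a minimal sketch of the copy, decompress, load, and archive steps as one PowerShell function. It is an illustration under stated assumptions, not the implementation developed in this chapter: the function name Invoke-EtlFilePattern, its parameter names, and the two script block parameters are all hypothetical. The load and notification steps are placeholders that only write verbose messages; a real job would bulk load into SQL Server and send email at those points. The sketch also assumes compressed files are zip archives and that PowerShell 5.0 or later is available for Expand-Archive.

```powershell
# Hypothetical sketch of the ETL file pattern; every name here is illustrative.
function Invoke-EtlFilePattern {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory)] [string] $InboundPath,   # folder where files arrive
        [Parameter(Mandatory)] [string] $WorkPath,      # local working folder
        [Parameter(Mandatory)] [string] $ArchivePath,   # folder where loaded files are retained
        # Placeholder for the real load step (for example, a bulk load into SQL Server).
        [scriptblock] $LoadAction   = { param($File) Write-Verbose "Load $($File.FullName)" },
        # Placeholder for the real notification step (for example, sending an email).
        [scriptblock] $NotifyAction = { param($Message) Write-Verbose $Message }
    )

    # Create the working and archive folders if they do not exist yet.
    foreach ($dir in $WorkPath, $ArchivePath) {
        if (-not (Test-Path -LiteralPath $dir)) {
            New-Item -ItemType Directory -Path $dir | Out-Null
        }
    }

    & $NotifyAction 'ETL job started'

    # Step 1: copy the arriving files to the local working folder.
    Get-ChildItem -LiteralPath $InboundPath -File | Copy-Item -Destination $WorkPath

    # Step 2: decompress zip files, then delete the local copy of each zip.
    # The list is collected first so that extracted files do not affect the enumeration.
    $zipFiles = @(Get-ChildItem -LiteralPath $WorkPath -Filter '*.zip' -File)
    foreach ($zip in $zipFiles) {
        Expand-Archive -LiteralPath $zip.FullName -DestinationPath $WorkPath -Force
        Remove-Item -LiteralPath $zip.FullName
    }

    # Steps 3 and 4: load each file, then move it to the archive folder.
    $dataFiles = @(Get-ChildItem -LiteralPath $WorkPath -File)
    foreach ($file in $dataFiles) {
        & $LoadAction $file
        Move-Item -LiteralPath $file.FullName -Destination $ArchivePath -Force
    }

    & $NotifyAction 'ETL job ended'
}
```

A call might look like the following, where the three folder paths are made up for the example: Invoke-EtlFilePattern -InboundPath 'C:\etl\inbound' -WorkPath 'C:\etl\work' -ArchivePath 'C:\etl\archive' -Verbose. Passing the load and notification steps as script blocks keeps the skeleton reusable, which is the main advantage over a batch file: each job supplies only the steps it needs. Note that this sketch copies rather than moves the inbound files, so cleaning up the inbound folder is left to the caller.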