Populating the Data Warehouse


Now that we have extracted the data from the source system, we'll use it to populate the normalized data store and the dimensional data store. In this chapter, I will discuss five main subjects regarding data warehouse population, in the sequence in which they occur in a data warehouse system:
  1. Loading the stage: We load the source system data into the stage. Usually, the focus is to extract the data as quickly as possible without doing too much transformation. In other words, the structure of the stage tables is similar to that of the source system tables. In the previous chapter, we discussed the extraction; in this chapter, we will focus on the loading.

  2. Creating the data firewall: We check the data quality when the data is loaded from the stage into the NDS (normalized data store) or ODS (operational data store). The check uses predefined rules that define what action to take: reject the data, allow the data, or fix the data.

  3. Populating a normalized data store: This is when we load the data from the stage into the NDS or ODS, after the data passes through the data firewall. Both are normalized data stores consisting of entities with minimal data redundancy. Here we deal with data normalization and key management.

  4. Populating dimension tables: This is when we load the NDS or ODS data into the dimension tables of the DDS (dimensional data store). This is done after we populate the normalized data store. The DDS is a dimensional store where the data is denormalized, so when populating dimension tables, we deal with issues such as denormalization and slowly changing dimensions.
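
To make the data firewall step more concrete, here is a minimal sketch in Python of rules that reject, allow, or fix rows on their way from the stage to the normalized store. It is only an illustration under stated assumptions: the `Rule` and `Action` types, the `apply_firewall` function, the rule names, and the sample rows are all hypothetical and do not come from the chapter.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Action(Enum):
    """The three actions a data firewall rule can take."""
    ALLOW = "allow"
    REJECT = "reject"
    FIX = "fix"


@dataclass
class Rule:
    """Hypothetical rule: a violation test plus the action to take."""
    name: str
    is_violated: Callable[[dict], bool]
    action: Action
    fix: Optional[Callable[[dict], dict]] = None  # used only when action is FIX


def apply_firewall(rows, rules):
    """Split stage rows into those that may be loaded and those rejected."""
    passed, rejected = [], []
    for row in rows:
        current = dict(row)  # work on a copy so the stage row is untouched
        reject_reason = None
        for rule in rules:
            if not rule.is_violated(current):
                continue
            if rule.action is Action.REJECT:
                reject_reason = rule.name
                break
            if rule.action is Action.FIX and rule.fix is not None:
                current = rule.fix(current)
            # Action.ALLOW: the violation is tolerated and the row passes as-is
        if reject_reason is None:
            passed.append(current)
        else:
            rejected.append((row, reject_reason))
    return passed, rejected


# Illustrative rules; the column names are assumptions.
rules = [
    Rule("missing_customer_id", lambda r: not r.get("customer_id"), Action.REJECT),
    Rule(
        "untrimmed_name",
        lambda r: r.get("name", "") != r.get("name", "").strip(),
        Action.FIX,
        lambda r: {**r, "name": r["name"].strip()},
    ),
    Rule("unknown_country", lambda r: r.get("country") is None, Action.ALLOW),
]

stage_rows = [
    {"customer_id": 1, "name": " Alice ", "country": "UK"},
    {"customer_id": None, "name": "Bob", "country": "US"},
    {"customer_id": 3, "name": "Carol", "country": None},
]

passed, rejected = apply_firewall(stage_rows, rules)
```

In this sketch, rejected rows are kept together with the name of the rule that rejected them, so they could be written to an audit table for later review. A real implementation would more likely express the rules as set-based queries against the stage tables.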




Copyright information

© Vincent Rainardi 2008
