
A case study on implementation of slowly changing dimension

A typical slowly changing dimension is a dimension in which the detailed description of a given attribute is occasionally adjusted. There are three main techniques for handling slowly changing dimensions in a data warehouse. This case study describes how each of them can be implemented.

Slowly Changing Dimension


A dimensional data warehouse database consists of a large central fact table with a multipart key. A single layer of smaller dimension tables, each containing a single primary key, surrounds this fact table. In a dimensional database, the problem of describing the past accurately mostly involves slowly changing dimensions.

A typical slowly changing dimension is a product dimension in which the detailed description of a given product is occasionally adjusted. For example, a minor ingredient change or a minor packaging change may be so small that production does not assign the product a new Product Id (which the data warehouse has been using as the primary key in the product dimension), but nevertheless gives the data warehouse team a revised description of the product. The data warehouse team faces a dilemma when this happens. If they want the data warehouse to track both the old and new descriptions of the product, what do they use for the key? And where do they put the two values of the changed ingredient attribute?

There are three main techniques for handling slowly changing dimensions in a data warehouse:
1) Overwriting (Type1)
2) Creating another dimension record (Type2)
3) Creating a current value field (Type3)

Each technique handles the problem differently. The designer chooses among these techniques depending on the users' needs. The table below summarizes the features of each technique.

Type of Slowly Changing Dimension    Objective
Type1                                No history, easy to implement
Type2                                Complete history, complex to implement
Type3                                Partial history, moderate to implement

ETL

Overwriting-Type1 approach using DataStage


Type1 is very easy to implement: the old values are simply replaced with the new ones. Consider a product dimension with the fields Product Id, Product Name, Product Description and Effective Start Date, in which Product Description is the changing field. The job is a simple one-to-one mapping: the new Product Description replaces the old Product Description, so no history is maintained.

Fig.1.Type1
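Outside DataStage, the Type1 overwrite can be illustrated with a minimal Python sketch. The ProductRow class, the apply_type1 function and the sample values are hypothetical names introduced only for this illustration; the sketch assumes the dimension is held in memory as a dictionary keyed by Product Id and that the whole row is overwritten, as in a one-to-one mapping.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict


@dataclass
class ProductRow:
    product_id: int
    product_name: str
    product_description: str
    effective_start_date: date


def apply_type1(dimension: Dict[int, ProductRow], incoming: ProductRow) -> None:
    """Overwrite the row for incoming.product_id; no history is kept."""
    # New and existing products are handled the same way:
    # the incoming row becomes the only row for that Product Id.
    dimension[incoming.product_id] = incoming


# Hypothetical sample data, not taken from the case study.
type1_dim: Dict[int, ProductRow] = {}
apply_type1(type1_dim, ProductRow(101, "Soap", "Original formula", date(2020, 1, 1)))
apply_type1(type1_dim, ProductRow(101, "Soap", "New packaging", date(2021, 6, 1)))
print(type1_dim[101].product_description)
```

After the second call only the latest description remains for Product Id 101, which is exactly the loss of history that Type1 accepts.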

Creating another dimension record-Type2


This is the most commonly used solution for slowly changing dimensions. For example, consider a slowly changing product dimension with the fields Product Id, Product Description and Effective Start Date, in which Product Description changes over time. In this approach we insert a new row for the changed dimension record with its Effective Start Date, and update the row holding the old product description with an Effective End Date. Type2 can be implemented using the following methods:
1. By using Effective Start Date and Effective End Date
2. By using a flag
3. By using a version

Ex1. By Using Effective Start Date and Effective End Date


Fig.2

To achieve this in DataStage, follow these steps:


Fig.3

Fig.4.Lookup_Transformation Stage
1) Connect to the source table by editing the Source_Product_Table stage.
2) Connect to the target table as a lookup by editing the Look_up_on_Target stage.
3) Create the mappings for the intermediate lookup transformation as shown in Fig.4.
4) Set the constraint to update the changed dimension as (DSLink8.ID = DSLink9.ID) And IsNull(DSLink9.ENDDATE)
5) Set the constraint for inserting a new product as IsNull(DSLink9.ID)
6) In the link properties, check the "Reference link with multi row result set" box.


Fig.5.Update_Insert_Transformation
7) If the Product Id already exists, compare the product descriptions in the next transformation by setting the constraint (DSLink10.olddesc <> DSLink10.newdesc) as shown in Fig.5.
8) Update the Effective End Date for DSLink12 and insert the new product description, generating a surrogate key with the DS Transform KeyMgtGetNextValue for the KEY field of DSLink13.

Fig.6.New_Product_Insert_Transformation
9) Else, if the Product Id is new, insert a new row with the values from the source table. For KEY, generate the next sequence number by calling the DS Transform KeyMgtGetNextValue as shown in Fig.6.
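The routing logic of the steps above (look up the open row, compare descriptions, close the old row, insert a new one) can be sketched in plain Python. This is only an illustration under stated assumptions, not the DataStage job itself: ProductVersion, apply_type2_dates and the in-memory key counter are hypothetical names, the counter merely stands in for KeyMgtGetNextValue, and the sketch assumes the old row is closed with the same date on which the new row starts.

```python
from dataclasses import dataclass
from datetime import date
from itertools import count
from typing import List, Optional


@dataclass
class ProductVersion:
    key: int                                   # surrogate key
    product_id: int                            # natural key from the source
    product_description: str
    effective_start_date: date
    effective_end_date: Optional[date] = None  # None marks the current row


# Hypothetical in-memory sequence standing in for KeyMgtGetNextValue.
_date_keys = count(1)


def apply_type2_dates(dimension: List[ProductVersion], product_id: int,
                      new_description: str, load_date: date) -> None:
    # Lookup: find the open (current) row for this Product Id, if any.
    current = next((row for row in dimension
                    if row.product_id == product_id
                    and row.effective_end_date is None), None)
    if current is not None:
        if current.product_description == new_description:
            return  # description unchanged, nothing to do
        # Changed description: close the old row (assumed end date = load date).
        current.effective_end_date = load_date
    # New product or changed description: insert a row with a fresh key.
    dimension.append(ProductVersion(next(_date_keys), product_id,
                                    new_description, load_date))


# Hypothetical sample data, not taken from the case study.
dates_dim: List[ProductVersion] = []
apply_type2_dates(dates_dim, 101, "Original formula", date(2020, 1, 1))
apply_type2_dates(dates_dim, 101, "New packaging", date(2021, 6, 1))
for row in dates_dim:
    print(row)
```

Both rows survive, so a query can join facts to the description that was valid on the fact date. Whether the end date equals the new start date or the day before it is a design choice that the case study does not state.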

Ex2. Updating Flag

ETL Strategy for DataStage .

Fig.7
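As a rough illustration of the flag variant, the following Python sketch marks the current row with a flag instead of an end date. FlaggedProduct, apply_type2_flag, the "Y"/"N" flag values and the key counter are assumptions made for this example; the case study does not name the flag column or its values.

```python
from dataclasses import dataclass
from itertools import count
from typing import List


@dataclass
class FlaggedProduct:
    key: int                  # surrogate key
    product_id: int           # natural key from the source
    product_description: str
    current_flag: str = "Y"   # assumed values: "Y" = current, "N" = historical


# Hypothetical in-memory sequence standing in for KeyMgtGetNextValue.
_flag_keys = count(1)


def apply_type2_flag(dimension: List[FlaggedProduct], product_id: int,
                     new_description: str) -> None:
    # Lookup: find the row currently flagged as active for this Product Id.
    current = next((row for row in dimension
                    if row.product_id == product_id
                    and row.current_flag == "Y"), None)
    if current is not None:
        if current.product_description == new_description:
            return  # description unchanged, nothing to do
        current.current_flag = "N"  # retire the old row
    # New product or changed description: insert a row flagged as current.
    dimension.append(FlaggedProduct(next(_flag_keys), product_id, new_description))


# Hypothetical sample data, not taken from the case study.
flag_dim: List[FlaggedProduct] = []
apply_type2_flag(flag_dim, 101, "Original formula")
apply_type2_flag(flag_dim, 101, "New packaging")
print([(row.key, row.product_description, row.current_flag) for row in flag_dim])
```

A flag makes it cheap to select the current rows, but unlike the date columns it does not record when each description was valid.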

To achieve this in DataStage, follow these steps:

Fig.8
1) Connect to the source table by editing the Source_Product_Table stage.
2) Connect to the target table as a lookup by editing the Look_up_on_Target stage.


Fig.9.Lookup_Transformation
3) Create the mappings for the intermediate lookup transformation as shown in Fig.9.
4) Set the constraint to update the changed dimension as (DSLink8.ID = DSLink9.ID) And IsNull(DSLink9.ENDDATE)
5) Set the constraint for inserting a new product as IsNull(DSLink9.ID)
6) In the link properties, check the "Reference link with multi row result set" box.

Fig.10.Update_Insert_Transformation
7) If the Product Id already exists, compare the product descriptions in the next transformation by setting the constraint (DSLink10.olddesc <> DSLink10.newdesc) as shown in Fig.10.
8) Update the Version for DSLink12 and insert the new product description, generating a surrogate key with the DS Transform KeyMgtGetNextValue for the KEY field of DSLink13.

Fig.11.New_Product_Insert_Transformation
9) Else, if the Product Id is new, insert a new row with the values from the source table. For KEY, generate the next sequence number by calling the DS Transform KeyMgtGetNextValue as shown in Fig.11.

Ex3. Updating Version

Fig.12
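The version variant can be sketched in Python as follows. VersionedProduct, apply_type2_version and the key counter are hypothetical names for this illustration. The sketch assumes that the highest version number identifies the current row, so older rows are left untouched; the DataStage job described here also updates the existing row, but the text does not say which field, so that step is not modelled.

```python
from dataclasses import dataclass
from itertools import count
from typing import List


@dataclass
class VersionedProduct:
    key: int                  # surrogate key
    product_id: int           # natural key from the source
    product_description: str
    version_no: int           # 1 for the first row of a product


# Hypothetical in-memory sequence standing in for KeyMgtGetNextValue.
_version_keys = count(1)


def apply_type2_version(dimension: List[VersionedProduct], product_id: int,
                        new_description: str) -> None:
    # Lookup: collect all existing versions of this Product Id.
    versions = [row for row in dimension if row.product_id == product_id]
    if versions:
        latest = max(versions, key=lambda row: row.version_no)
        if latest.product_description == new_description:
            return  # description unchanged, nothing to do
        next_version = latest.version_no + 1
    else:
        next_version = 1  # new Product Id starts at version 1
    dimension.append(VersionedProduct(next(_version_keys), product_id,
                                      new_description, next_version))


# Hypothetical sample data, not taken from the case study.
version_dim: List[VersionedProduct] = []
apply_type2_version(version_dim, 101, "Original formula")
apply_type2_version(version_dim, 101, "New packaging")
apply_type2_version(version_dim, 202, "Plain label")
print([(row.product_id, row.version_no) for row in version_dim])
```

Version numbers preserve the order of changes per product without storing dates, at the cost of a slightly more expensive lookup for the current row.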


To achieve this in DataStage, follow these steps:

Fig.13
1) Connect to the source table by editing the Source_Product_Table stage.
2) Connect to the target table as a lookup by editing the Look_up_on_Target stage.

Fig.14.Lookup_Transformation
3) Create the mappings for the intermediate lookup transformation as shown in Fig.14.
4) Set the constraint to update the changed dimension as (DSLink8.ID = DSLink9.ID) And IsNull(DSLink9.ENDDATE)
5) Set the constraint for inserting a new product as IsNull(DSLink9.ID)
6) In the link properties, check the "Reference link with multi row result set" box.

Fig.15.Update_Insert_Transformation
7) If the Product Id already exists, compare the product descriptions by setting the constraint (DSLink10.olddesc <> DSLink10.newdesc) as shown in Fig.15.
8) Then update the existing row and insert the new product description, generating a surrogate key with the DS Transform KeyMgtGetNextValue for the KEY field of DSLink13, as shown in Fig.15.

Fig.16.Product_New_Insert_Transformation
9) Else, if the Product Id is new, insert a new row with the values from the source table and a Version No of 1. For KEY, generate the next sequence number by calling the DS Transform KeyMgtGetNextValue as shown in Fig.16.


Creating a current value field Type3


In the Type3 approach, a new field is created instead of a new row, so there are two fields: one for the old value and one for the new value. Continuing the product dimension example, the target dimension has two fields, Old_Product_Desc and New_Product_Desc. The value currently held in New_Product_Desc is moved to Old_Product_Desc, and the new product description from the source table then updates New_Product_Desc, as shown in Fig.17.

Fig.17

To achieve this in DataStage, follow these steps:

Fig.18
1) Connect to the source table by editing the Product_Source_Table stage.
2) Connect to the target as a lookup by editing the Lookup_On_Target stage.

Fig.19.Lookup_Transformation
3) Make the mappings for the Lookup_Transformation stage as shown in Fig.19.
4) Set the constraint (DSLink4.PRODUCT_ID=DSLink5.PRODUCT_ID) for DSLink9 to check whether the Product Id already exists.
5) Set the constraint IsNull(DSLink5.PRODUCT_ID) to check whether the Product Id is new and must be inserted.

Fig.20.Update_Transformation
6) If the Product Id already exists, perform the update and insert operations as shown in Fig.20.


Fig.21.Insert_Transformation
7) Else, if the Product Id is new, insert a new row as shown in Fig.21.
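The Type3 steps above can be illustrated with a short Python sketch. The field names old_product_desc and new_product_desc mirror Old_Product_Desc and New_Product_Desc from the text; ProductType3, apply_type3 and the sample values are hypothetical, and the sketch assumes that a new product starts with an empty old description.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ProductType3:
    product_id: int
    old_product_desc: Optional[str]  # previous description, None for a new product
    new_product_desc: str            # current description


def apply_type3(dimension: Dict[int, ProductType3], product_id: int,
                incoming_desc: str) -> None:
    row = dimension.get(product_id)
    if row is None:
        # New Product Id: insert a row with no previous description.
        dimension[product_id] = ProductType3(product_id, None, incoming_desc)
    elif row.new_product_desc != incoming_desc:
        # Existing Product Id: shift the current value into the old field,
        # then store the incoming description as the current value.
        row.old_product_desc = row.new_product_desc
        row.new_product_desc = incoming_desc


# Hypothetical sample data, not taken from the case study.
type3_dim: Dict[int, ProductType3] = {}
apply_type3(type3_dim, 101, "Original formula")
apply_type3(type3_dim, 101, "New packaging")
apply_type3(type3_dim, 101, "Reduced sugar")
print(type3_dim[101])
```

After the third load the first description is gone: only the current and the immediately preceding values are kept, which is why Type3 gives partial history.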
