SCD Type 2 Implementation in Informatica

One of the most important concept in Data ware housing. SCD Type II means history of data should be saved.In here we can use status flags,dates and versioning to store the data.
Consider the following example

So during the first run on jan 1 2015 we got the data as A1B2 and 12345. And on jan 5 2015 we got the data as A1B2 and 56789. So as part of SCD type II we need to maintain history hence the latest record should be inserted and earlier record should be updated.

In order to identify the latest record we have columns namely status,startdate,enddate
The latest record will have status have ÁCT and End Date will be a high end date (12/31/9999).Job_id is the primary key

In informatica for implementing SCD type II the mapping will look in the following way

So in here we have transformation such as aggregator,lookup,expression,router and update

Aggregator transformation is used to remove any duplicates coming from source. We create lookup on the target table (Employee) .We use both the columns JOB_ID and CONTACT_NO for lookup condition

In expression transformation we compare the lookupdata and source data. In here we generate two columns md5_src and md5_lkp for comparision purpose

The decode condition will be

DECODE(TRUE,
md5_lkp = md5_src, 0,
ISNULL(JOB_ID_lkp),1,
md5_lkp <> md5_src, 2)

0 means no change in data,1 means its a new record, 2 means record needs to be updated

md5 expression will be in the following way for md5_src
md5(
IIF(ISNULL(JOB_ID),'#',TO_CHAR(JOB_ID))||
IIF(ISNULL(CONTACT_NO),'#',TO_CHAR(CONTACT_NO))

)

for mds_lkp we will be using lkp columns

Then in the router transformation we will be using the following condition

Then we will be using two update strategy transformations one for insert and one for update

In update transformation which we are using for updating records. Records are updated based on primary key (i.e. job_id) and start_date.The following columns status and end date will be updated. We connect DATE_Ses_StartTime from router to End_date and Status_del to Status

In update transformation which we are using for inserting, new records will be pulled and the columns will be connected in following way
DATE_Ses_StartTime will be connected to start date,
DATE_High_End_Date to enddate and
status_act to status

Hope this article helps you in understanding SCD type II implementation using informatica

IT Pandora (ETL,DB,UNIX,AWS,NodeJS,Github etc.)

Search This Blog

SCD Type 2 Implementation in Informatica

Comments

Post a Comment

Popular posts from this blog

Comparing Objects in Informatica

Target Load Type - Normal or Bulk in Session Properties

Finding Duplicate records and Deleting Duplicate records in TERADATA