
SCD Type 2 Implementation in Informatica

SCD (Slowly Changing Dimension) Type II is one of the most important concepts in data warehousing. It means that the history of the data must be preserved. To do this we can use status flags, dates and versioning.
Consider the following example.



During the first run on Jan 1 2015 we received the data A1B2 and 12345. On Jan 5 2015 we received the data A1B2 and 56789. Because SCD Type II requires us to maintain history, the latest record is inserted and the earlier record is updated.

To identify the latest record we have the columns status, startdate and enddate.
The latest record will have the status ACT, and its end date will be a high end date (12/31/9999). Job_id is the primary key.
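As a rough illustration of the example above, the sketch below shows what the target table could look like after the second run. The column names, the `DEL` status value for the expired record (inferred from the `Status_del` port used later) and the helper `current_version` are assumptions made for this sketch, not part of the original mapping.

```python
from datetime import date

# High end date that marks the current version of a record.
HIGH_END_DATE = date(9999, 12, 31)

# Assumed state of the target table after the Jan 5 2015 run.
# "DEL" as the status of the expired row is an assumption.
employee = [
    {"JOB_ID": "A1B2", "CONTACT_NO": "12345", "STATUS": "DEL",
     "START_DATE": date(2015, 1, 1), "END_DATE": date(2015, 1, 5)},
    {"JOB_ID": "A1B2", "CONTACT_NO": "56789", "STATUS": "ACT",
     "START_DATE": date(2015, 1, 5), "END_DATE": HIGH_END_DATE},
]


def current_version(rows, job_id):
    """Return the active row for job_id, or None (hypothetical helper)."""
    for row in rows:
        if (row["JOB_ID"] == job_id
                and row["STATUS"] == "ACT"
                and row["END_DATE"] == HIGH_END_DATE):
            return row
    return None


print(current_version(employee, "A1B2")["CONTACT_NO"])  # 56789
```

Both rows share the same JOB_ID, so it is the status and the end date together that pick out the latest version.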

In Informatica, the mapping for implementing SCD Type II looks as follows:



The mapping uses the following transformations: aggregator, lookup, expression, router and update strategy.

The aggregator transformation removes any duplicates coming from the source. We create a lookup on the target table (Employee) and use both columns, JOB_ID and CONTACT_NO, in the lookup condition.
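The duplicate removal done by the aggregator can be pictured as grouping on all source columns and keeping one row per group. The following is only a minimal sketch of that idea in Python; the function name `remove_duplicates` and the dict-based rows are assumptions for illustration.

```python
def remove_duplicates(rows):
    """Keep one row per (JOB_ID, CONTACT_NO) group, in first-seen order."""
    seen = set()
    unique = []
    for row in rows:
        key = (row["JOB_ID"], row["CONTACT_NO"])  # group-by columns
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


source = [
    {"JOB_ID": "A1B2", "CONTACT_NO": "56789"},
    {"JOB_ID": "A1B2", "CONTACT_NO": "56789"},  # exact duplicate
    {"JOB_ID": "C3D4", "CONTACT_NO": "11111"},
]
deduplicated = remove_duplicates(source)
```

Because the duplicates are identical in every column, it does not matter which row of each group is kept.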

In the expression transformation we compare the lookup data with the source data. Here we generate two columns, md5_src and md5_lkp, for comparison purposes.













The decode condition will be 

DECODE(TRUE,
md5_lkp = md5_src, 0,
ISNULL(JOB_ID_lkp),1,
md5_lkp <> md5_src, 2)

0 means no change in the data, 1 means it is a new record, and 2 means the record needs to be updated.
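DECODE(TRUE, ...) returns the value paired with the first condition that evaluates to true, so the order of the conditions matters. A minimal Python sketch of the same decision logic is shown below; the function name `change_flag` is hypothetical, and `None` stands in for a NULL returned by the lookup.

```python
def change_flag(md5_src, md5_lkp, job_id_lkp):
    """Mirror the DECODE above: conditions are checked in order."""
    if md5_lkp == md5_src:
        return 0  # no change in data
    if job_id_lkp is None:
        return 1  # lookup found nothing: new record
    return 2      # lookup found a row with a different checksum: update


unchanged = change_flag("abc", "abc", "A1B2")
new_record = change_flag("abc", "xyz", None)
changed = change_flag("abc", "xyz", "A1B2")
```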

The md5 expression for md5_src will be as follows:
md5(
IIF(ISNULL(JOB_ID),'#',TO_CHAR(JOB_ID))||
IIF(ISNULL(CONTACT_NO),'#',TO_CHAR(CONTACT_NO))
)

For md5_lkp we use the corresponding lookup (lkp) columns.
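To make the checksum logic concrete, here is a small Python sketch of the same idea: replace NULLs with '#', convert to strings, concatenate and hash. The function name `row_md5` and the UTF-8 encoding are assumptions of this sketch, and it is not claimed to reproduce Informatica's output byte for byte.

```python
import hashlib


def row_md5(job_id, contact_no):
    """Checksum over the compared columns, with '#' standing in for NULL."""
    parts = ["#" if value is None else str(value)
             for value in (job_id, contact_no)]
    # Values are concatenated without a separator, as in the expression above.
    return hashlib.md5("".join(parts).encode("utf-8")).hexdigest()


md5_src = row_md5("A1B2", "56789")  # incoming source row
md5_lkp = row_md5("A1B2", "12345")  # row returned by the lookup
data_changed = md5_src != md5_lkp
```

One thing to keep in mind with this approach: because there is no delimiter between the columns, the values ("1", "23") and ("12", "3") would produce the same checksum. Adding a separator character between columns is a common way to avoid that.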

Then in the router transformation we will be using the following condition








We then use two update strategy transformations, one for insert and one for update.

In the update strategy transformation used for updating, records are updated based on the primary key (i.e. job_id) and start_date. The status and end date columns are updated: we connect DATE_Ses_StartTime from the router to End_date and Status_del to Status.

In the update strategy transformation used for inserting, new records are passed through and the columns are connected in the following way:
DATE_Ses_StartTime is connected to start date,
DATE_High_End_Date to enddate, and
status_act to status.
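The combined effect of the update and insert branches can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `apply_scd2`, the `DEL` status value (inferred from the Status_del port) and the dict-based table are hypothetical, and the sketch assumes the existing active row is found by JOB_ID.

```python
from datetime import datetime

HIGH_END_DATE = datetime(9999, 12, 31)


def apply_scd2(target, src, session_start):
    """Expire the active row if the data changed, then insert the new version."""
    # Assumption: the active row is located by JOB_ID and status ACT.
    active = next(
        (row for row in target
         if row["JOB_ID"] == src["JOB_ID"] and row["STATUS"] == "ACT"),
        None,
    )
    if active is not None and active["CONTACT_NO"] == src["CONTACT_NO"]:
        return target  # flag 0: nothing to do
    if active is not None:
        # Update branch: close the earlier record.
        active["STATUS"] = "DEL"             # assumed value of Status_del
        active["END_DATE"] = session_start   # DATE_Ses_StartTime
    # Insert branch: add the latest record as the active version.
    target.append({
        "JOB_ID": src["JOB_ID"],
        "CONTACT_NO": src["CONTACT_NO"],
        "STATUS": "ACT",                     # status_act
        "START_DATE": session_start,         # DATE_Ses_StartTime
        "END_DATE": HIGH_END_DATE,           # DATE_High_End_Date
    })
    return target


employee = []
apply_scd2(employee, {"JOB_ID": "A1B2", "CONTACT_NO": "12345"}, datetime(2015, 1, 1))
apply_scd2(employee, {"JOB_ID": "A1B2", "CONTACT_NO": "56789"}, datetime(2015, 1, 5))
```

After the two calls the table holds two rows for A1B2: the earlier one closed on Jan 5 2015 and the latest one active with the high end date, which matches the example at the start of the article.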

I hope this article helps you understand SCD Type II implementation using Informatica.
