
Showing posts from December, 2013

Identifying if verbose is enabled in a workflow

It's the last day of 2013, so let's wind it up with a small topic. We usually enable verbose mode for a transformation in a workflow to check the actual data values being passed through that transformation. This is useful for debugging, but in the final code it is advisable not to use verbose mode, because the session log grows considerably and consumes a lot of space. To identify or cross-check whether verbose mode is enabled for any transformation, the simplest way is to export the workflow XML from the Repository Manager and search for the keyword "verbose". If it has been used anywhere, we can easily identify in which transformation and in which session it is enabled. Hope this article helps you in understanding the above topic.
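As a quick illustration, a few lines of Python can scan the exported XML for the keyword; the file name workflow_export.xml below is just a placeholder for your own export.

# Print every line of the exported workflow XML that mentions "verbose"
with open("workflow_export.xml", encoding="utf-8") as export:
    for line_no, line in enumerate(export, start=1):
        if "verbose" in line.lower():
            print(line_no, line.strip())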

Comparing Objects in Informatica

We might face a scenario where there is a difference between the PRODUCTION and SIT versions of the code (or any two environments), or between different folders in the same environment. In such cases we go for comparison of objects: we can compare mappings, sessions, and workflows. In the Designer, the "Compare" option is available under the "Mappings" menu. In the Workflow Manager, the "Compare" option is available under the "Tasks" and "Workflows" menus for comparing tasks and workflows respectively. However, the easiest and probably the best practice is to use the Repository Manager: there, the "Compare" option is available under the "Edit" menu. The advantage of using the Repository Manager is that it compares all the objects in one go, i.e. workflow, session, and mapping, which reduces the effort of checking the mapping and session individually. Once we select the folder and the corresponding workflow, we can click Compare to check the differences.

Columns to Rows Conversion - Simple Scenario

Scenario: We have a source with 10 records and 3 columns, and a target with 1 column. We have to design a mapping to load each column of each record into the target.
Example:
Source:
Name | Father_Name | Mother_Name
RAGHU | RAM | RATHNA
Target:
Name
RAGHU
RAM
RATHNA
Solution: Create an Expression transformation exp_<Name> and pull the required ports from the Source Qualifier into the Expression. Create 3 instances of the target and connect port 1 from the Expression to target_1, port 2 from the Expression to target_2, and port 3 from the Expression to target_3.
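To make the columns-to-rows idea concrete outside the mapping, here is a small Python sketch using the sample record from the example above (purely illustrative, not part of the Informatica solution).

source_rows = [("RAGHU", "RAM", "RATHNA")]   # Name, Father_Name, Mother_Name
target_rows = []
for name, father_name, mother_name in source_rows:
    # each column of the record becomes its own row in the single-column target
    target_rows.extend([name, father_name, mother_name])
print(target_rows)   # ['RAGHU', 'RAM', 'RATHNA']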

'Update as Insert' and 'Update Else Insert'

The naming convention itself tells us how the row type (Insert/Update) of the records loaded into the target is handled. These options appear in the session properties under the Mapping tab.
'Update as Insert' - the Integration Service treats records that arrive with the update row type as inserts.
'Update else Insert' - the Integration Service updates the record if it exists in the target, else inserts it into the target.
In simpler form:
Insert --> flags the rows as inserts.
Update as Update --> flags the rows as updates.
Update as Insert --> flags updates as inserts.
Update else Insert --> checks for the row in the target; inserts it if there is no record and updates it if the record exists.
We normally keep the Insert property selected, and we can select only one of the three update properties. If you check one property, the other properties are unchecked automatically.
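A minimal Python sketch of the 'Update else Insert' behaviour, using a dictionary keyed on a hypothetical emp_id column to stand in for the target table:

target = {}   # stands in for the target table, keyed on a hypothetical primary key

def update_else_insert(row):
    key = row["emp_id"]            # hypothetical key column
    if key in target:
        target[key].update(row)    # record exists in the target: update it
    else:
        target[key] = dict(row)    # record does not exist: insert it

update_else_insert({"emp_id": 1001, "name": "RAGHU"})
update_else_insert({"emp_id": 1001, "name": "RAM"})   # second call updates the existing row
print(target)   # {1001: {'emp_id': 1001, 'name': 'RAM'}}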

Target Load Type - Normal or Bulk in Session Properties

We can see the Target load type (Normal or Bulk) property in the session under the Mapping tab, and we go for Bulk to improve session performance when loading a large amount of data. The SQL*Loader utility is used for the bulk load; it does not create database logs (redo and undo) but writes directly to the data file. A transaction cannot be rolled back because there are no database logs. However, bulk loading is very fast compared to normal loading. If the target has a primary key, a primary index, or any other constraints, you can't use Bulk mode. We can see this property in the snapshot below.

Loading records into multiple targets from single source

Scenario: Let's say we have one source with 20 records and 3 targets X, Y, and Z. We have to load the 1st record into X, the 2nd record into Y, the 3rd record into Z, then the 4th record into X, the 5th into Y, the 6th into Z, and so on up to the last record.
Solution: We can implement this using Sequence Generator, Expression, and Router transformations. Pull all the required fields from the Source Qualifier into an Expression transformation (apply business logic if required), create one extra port (let's say REC_NUM), and link the "Next Value" field from the Sequence Generator to REC_NUM in the Expression. From the Expression, pull the required fields, including REC_NUM, into a Router transformation and create 3 groups in the Router, since we have 3 targets (m groups for m targets). Use the conditions below and connect each group to its respective target:
Group 1 -- MOD(REC_NUM,3)=1 -- connect to X
Group 2 -- MOD(REC_NUM,3)=2 -- connect to Y
Group 3 -- MOD(REC_NUM,3)=0 -- connect to Z
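The same round-robin routing can be sketched in Python, with rec_num playing the role of the Sequence Generator's Next Value (the record names are placeholders):

records = ["record_%d" % i for i in range(1, 21)]   # 20 placeholder source records
targets = {"X": [], "Y": [], "Z": []}

for rec_num, record in enumerate(records, start=1):   # rec_num = REC_NUM from the Sequence Generator
    if rec_num % 3 == 1:
        targets["X"].append(record)   # Group 1
    elif rec_num % 3 == 2:
        targets["Y"].append(record)   # Group 2
    else:
        targets["Z"].append(record)   # Group 3, MOD(REC_NUM,3)=0

print({name: len(rows) for name, rows in targets.items()})   # {'X': 7, 'Y': 7, 'Z': 6}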

Performance tuning with Re-cache in Lookup transformation

Suppose you have a requirement where you must look up the same table in more than one mapping and the corresponding sessions are in the same workflow. To improve the performance of that workflow, you can check the re-cache option in the mapping where the lookup is executed first. If you check the re-cache option in the lookup, the cache is built in the mapping that uses the lookup first, and the dependent mappings reuse the cache created by that first mapping. With this reuse of the lookup cache, performance improves. To do this, we should set a dependency between the sessions in the workflow and check re-cache in the first session. I will explain with an example. I have the below mappings:
1) m_PRESTGtoXREF_mbr_xref
2) m_PRESTGtoXREF_grp_contc_xref
3) m_PRESTGtoRARI_sbgrp_contc
I have a lookup on the table HCR3R_GRP_XREF and am using it in more than one mapping. In my workflow there is a dependency that…
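The build-once-and-reuse idea behind the shared lookup cache can be sketched in Python; the fetch function and key name below are purely illustrative stand-ins for a query against HCR3R_GRP_XREF.

lookup_cache = None   # shared cache, built only once

def get_lookup_cache(fetch_rows):
    global lookup_cache
    if lookup_cache is None:
        # the first consumer (the first session/mapping) builds the cache
        lookup_cache = {row["key"]: row for row in fetch_rows()}
    return lookup_cache   # later consumers reuse the already-built cache

def fetch_rows():
    # illustrative stand-in for reading the lookup table
    return [{"key": 1, "grp": "A"}, {"key": 2, "grp": "B"}]

cache_first = get_lookup_cache(fetch_rows)    # built here
cache_second = get_lookup_cache(fetch_rows)   # reused, not rebuilt
print(cache_first is cache_second)            # True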

Debugger in Informatica

We usually go for the Debugger in Informatica when we want to check at which level (i.e., in which transformation) of a mapping the data is getting wrongly populated or functionality is missing. Suppose we expect a date field to be populated in YYYYMMDD format but it is coming through in DDMMYYYY format; using the Debugger we can check in which Expression or Source Qualifier the data goes wrong. While using the Debugger it's better to restrict the data to fewer records (preferably a single record), e.g. by adding a condition at the SQ level such as WHERE emp_id = '1001' so that only record 1001 comes through. In the Designer, under the Mappings menu, we need to click Debugger and then Start Debugger (F9), after which we get the option to choose the Integration Service and then three options: a) use an existing session instance, b) use an existing reusable session instance, c) create a debug session instance. We can pick the existing session instance. After that, it will…

Informatica PowerCenter Client Tools

PowerCenter Client Tools:
PowerCenter Administration Console: The repository is administered using the Repository Manager client tool, the PowerCenter Administration Console, and command line programs.
PowerCenter Repository Manager: To create users, groups, and folders, and to assign privileges.
PowerCenter Designer: To create or import sources and targets, and to create mappings, mapplets, and reusable transformations. The Designer internally has the following components:
Source Analyzer - To create or import sources.
Target Designer - To create or import targets.
Transformation Developer - To create reusable transformations.
Mapplet Designer - To create mapplets. A mapplet is a set of transformations (excluding source and target) which is reusable across mappings.
Mapping Designer - To create mappings. A mapping is a set of transformations (including source and target) which transforms the data as per business requirements.
PowerCenter Workflow Manager: To create, schedule, and run workflows…

Informatica Introduction

Introduction: Informatica is an ETL (Extract, Transform, Load) tool used to extract data from homogeneous or heterogeneous sources, transform the data according to your business logic, and load the transformed data into file and relational targets.
Informatica components:
Informatica domain: The Informatica domain is the primary unit for management and administration within PowerCenter. The Service Manager runs on the Informatica domain and supports the domain and the application services. Application services represent server-based functionality. The domain supports PowerCenter and Informatica application services. PowerCenter application services include the PowerCenter Repository Service, PowerCenter Integration Service, Web Services Hub, and SAP BW Service. Informatica services include the Data Integration Service, Model Repository Service, and the Analyst Service.
PowerCenter repository: The PowerCenter repository resides in a relational database…

Data Profiling Using the Translate Function in Oracle

Suppose you have Oracle as a source, you need to do data profiling, and the data needs to be loaded into a target. Write a Source Qualifier query to eliminate invalid data, because joining tables in the database is more efficient than joining with a Joiner transformation in Informatica. To check whether an incoming field is numeric or non-numeric, we can use the TRANSLATE function in Oracle. I cannot say this is the only way to do data profiling, but it is one of the ways.
Syntax: TRANSLATE (string, string_old, string_new)
This function replaces the string_old characters with the string_new characters. For example, TRANSLATE ('1234', '14', '98') replaces 1 with 9 and 4 with 8, so the result is 9238. With the query below we can check whether an incoming field is numeric or non-numeric:
SELECT LENGTH (TRIM (TRANSLATE ('1234', '0123456789', ' '))) FROM DUAL
If the above returns NULL, then the input value is numeric…
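As an aside, the same delete-the-digits-and-see-what-is-left logic can be written in Python with str.maketrans and str.translate; this is just to illustrate the idea, not part of the Source Qualifier query.

def is_numeric(value):
    # delete the digits 0-9; if nothing (or only spaces) remains, every character was a digit
    leftover = value.translate(str.maketrans("", "", "0123456789"))
    return leftover.strip() == ""

print(is_numeric("1234"))   # True
print(is_numeric("12A4"))   # False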

Target Load Plan

In a mapping we can have more than one data flow, but it is advisable to keep the number of flows small, because performance suffers as the number of flows increases. A data flow is nothing but a pipeline from source to target. In the mapping shown above there are two pipelines, or data flows. Using the target load plan we can decide which flow should run and complete first. There is an option in the Mapping Designer under the Mappings menu named "Target Load Plan"; clicking it opens a dialog in which we can use the upward and downward arrows to set the priority. Hope this article helps you in understanding the target load plan.

Implementing Constraint Based loading