
Data Profiling Using the TRANSLATE Function in Oracle

Suppose you have Oracle as a source and need to profile the data before it is loaded into the target.
You can write a source qualifier query to eliminate invalid data, because joining tables in the database is generally more efficient than joining them with the Joiner transformation in Informatica.
To check whether an incoming field is numeric or non-numeric, we can use the TRANSLATE function in Oracle. This is not the only way to do data profiling, but it is one of them.

Syntax:
TRANSLATE (string, string_old, string_new)

This function replaces each character of string_old found in string with the character at the same position in string_new. For example,
TRANSLATE ('1234','14','98') replaces 1 with 9 and 4 with 8, so the result is 9238.
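The character-by-character mapping can be sketched with two small queries against DUAL. The second one uses a sample value and alias names chosen here for illustration; it shows what happens when string_new is shorter than string_old, which is the behaviour the profiling query relies on.

```sql
-- Each character of the second argument maps to the character at the same
-- position in the third argument: '1' -> '9' and '4' -> '8'.
SELECT TRANSLATE('1234', '14', '98') AS mapped FROM DUAL;
-- mapped: 9238

-- When the third argument is shorter than the second, characters without a
-- counterpart are removed. Here '0' -> ' ' and the digits 1 to 9 are dropped,
-- so only the non-digit character survives.
SELECT TRANSLATE('12A4', '0123456789', ' ') AS leftover FROM DUAL;
-- leftover: A
```

The single space in the third argument is needed because Oracle treats an empty string as NULL, and TRANSLATE returns NULL when any argument is NULL.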

With the query below, we can check whether the incoming field is numeric or non-numeric.

SELECT LENGTH (TRIM (TRANSLATE ('1234','0123456789',' '))) FROM DUAL

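If the query returns NULL, the input value is numeric: TRANSLATE turns 0 into a space and removes the digits 1 to 9, TRIM strips the spaces, and the LENGTH of the resulting empty string is NULL.
If the query returns a non-NULL value, the input value is non-numeric.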

Note: In Oracle, TRIM(' ') (TRIM of a space or of an empty string) returns NULL, because Oracle treats an empty string as NULL.
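In a source qualifier override, the same expression can go in the WHERE clause so that only numeric rows reach the mapping. This is a minimal sketch, assuming a hypothetical source table SRC_CUSTOMER with a VARCHAR2 column CUST_ID_TXT; neither name comes from the article.

```sql
-- SRC_CUSTOMER and CUST_ID_TXT are hypothetical names used for illustration.
SELECT s.CUST_ID_TXT
FROM   SRC_CUSTOMER s
WHERE  TRIM(s.CUST_ID_TXT) IS NOT NULL  -- skip NULL and blank-only values
  -- a NULL length means only digits and spaces were present in the value
  AND  LENGTH(TRIM(TRANSLATE(s.CUST_ID_TXT, '0123456789', ' '))) IS NULL
```

The extra TRIM(...) IS NOT NULL predicate is a design choice: without it, NULL or blank-only input also yields NULL and would pass as numeric. Values with embedded spaces such as '12 34' still pass, while values with a sign or decimal point are reported as non-numeric, so the check suits fields expected to hold plain digit strings.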

Example (for a numeric field):
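A minimal sketch of the check applied to one numeric and one non-numeric literal. The sample values and the column aliases (val, leftover_len, profile) are assumptions made for illustration.

```sql
-- Sample literals are illustrative, not taken from a real source.
SELECT val,
       LENGTH(TRIM(TRANSLATE(val, '0123456789', ' '))) AS leftover_len,
       CASE
         WHEN LENGTH(TRIM(TRANSLATE(val, '0123456789', ' '))) IS NULL
         THEN 'NUMERIC'
         ELSE 'NON-NUMERIC'
       END AS profile
FROM  (SELECT '1234' AS val FROM DUAL
       UNION ALL
       SELECT '12A4' AS val FROM DUAL);
-- '1234' -> leftover_len NULL, profile NUMERIC
-- '12A4' -> leftover_len 1,    profile NON-NUMERIC
```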



This Article was written by K Krishna Reddy (kkrishnareddychp@gmail.com)
