Вы находитесь на странице: 1из 6

Identifying duplicate records by

using expression transformation

Applies to:
Informatica PowerCenter 8.6.0

Summary
This article describes a solution to identify duplicate and distinct records in a source.

Author Bio
Author(s): Avneesh Kumar Rathor
Company: Steria India Ltd.
Created on: Jul 10, 2009

Informatica Technology Network http://technet.informatica.com


© 2009 Informatica Corporation. All Rights Reserved. 1
Identifying duplicate records by using expression transformation

Table of Contents
Table of Contents ............................................................................................................................................... 2
1 Objective ...................................................................................................................................................... 3
2 Source definition .......................................................................................................................................... 3
3 Target definition ........................................................................................................................................... 3
4 Mapping ....................................................................................................................................................... 3
4.1 Transformations used ............................................................................................................................................ 3

Informatica Technology Network http://technet.informatica.com


© 2009 Informatica Corporation. All Rights Reserved. 2
Identifying duplicate records by using expression transformation

1 Objective
To segregate distinct and duplicate records in a source, solution described below uses an expression
transformation. This solution exploits the concept that all input ports are evaluated first, then variable ports
are evaluated and all output ports are evaluated in last.

2 Source definition
Source is EMPLOYEE table as shown in figure 1.

Figure 1: Relational source EMPLOYEE

3 Target definition
There are two targets in sample mapping. All duplicate records found in source table will be inserted into
target table T_DUPLICATE_EMP and distinct records will be inserted into target table T_DISTINCT_EMP.

4 Mapping
Figure 2 shows the mapping designed to identify duplicate and distinct records.

Figure 2: Mapping design

4.1 Transformations used


SRT_SOURCE_RECORDS
Sorter transformation is used after Source Qualifier to sort source records. All ports are selected as key.

Informatica Technology Network http://technet.informatica.com


© 2009 Informatica Corporation. All Rights Reserved. 3
Identifying duplicate records by using expression transformation

Figure 3: Sorter transformation

EXP_FLAG_DUPLICATE_DISTINCT
Expression transformation is used to flag a record as either duplicate or distinct. Current record
(v_CurrentRec) is compared with previous record (v_PrevRec) and if records match v_IsDuplicate variable is
set to ‘Y’ else it is set to ‘N’.

Figure 4: Expression transformation

Table 1 shows expression used for variable and output ports.

Informatica Technology Network http://technet.informatica.com


© 2009 Informatica Corporation. All Rights Reserved. 4
Identifying duplicate records by using expression transformation

Port Name Expression

v_CurrentRec EMPNO || ENAME || JOB || MGR || HIREDATE || SAL || COMM || DEPTNO

v_IsDuplicate IIF(v_CurrentRec = v_PrevRec, 'Y', 'N')

v_PrevRec EMPNO || ENAME || JOB || MGR || HIREDATE || SAL || COMM || DEPTNO

o_IsDuplicate v_IsDuplicate

Table 1: Logic used in Expression transformation

RTR_DUPLICATE_DISTINCT
Router transformation is used to segregate duplicate and distinct records using o_IsDuplicate port.

Figure 5: Router transformation

Informatica Technology Network http://technet.informatica.com


© 2009 Informatica Corporation. All Rights Reserved. 5
Identifying duplicate records by using expression transformation

Disclaimer and Liability notice


Informatica offers no guarantees and assumes no responsibility or liability of any type with respect to the content of this software asset,
including any liability resulting from incompatibility between the content within this asset and the materials and services offered by
Informatica. You agree that you will not hold, or seek to hold, Informatica responsible or liable with respect to the content of this software
asset.

Informatica Technology Network http://technet.informatica.com


© 2009 Informatica Corporation. All Rights Reserved. 6

Вам также может понравиться