
ETL

DWH
ODI

ETL Activity
Extract Transform Load
Data Migration
Data Integration
DWH

70-80

ETL Tools (ETL approach)
Informatica PowerCenter
IBM DataStage
SAP BODS
Talend
Pentaho

ELT Tools (ELT approach)
ODI

[Diagram: the DEPT and EMP files are loaded by a KM into the work tables t_dept and t_emp, which are then joined and loaded into emp_dept]

insert into emp_dept
select * from t_emp, t_dept where t_emp.deptno = t_dept.deptno

Dimensional Modeling
Create the DW database and design tables
Data modeling
the process of designing tables

Normalisation (E-R modeling)
preferred in OLTP systems
in OLTP there are continuous insert/update/select operations

De-normalisation (dimensional modeling)
preferred in DWH
in DWH most of the operations are select operations
Tables in dimensional modeling are created in the form of dimensions and facts

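[Example: exam results. Measures: number of applicants (5 lakh), number passed (30k), percentage (10). Dimensions the measures can be analysed by: year (1999, 2000, ...), gender (B/G), teacher, college, subject (m1, eng), section (sec1).]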

Dimensional Modeling
Dimension table
Stores descriptive information, textual attributes or dimensional information
Fact table
Contains measurable information, known as measures

Star Schema
In a star schema every dimension table is fully de-normalised.
Because of this, when we relate dimensions and facts the diagram looks like a star.

Snowflake Schema
In a snowflake schema one or more dimensions are partially normalised
(or)
In a snowflake schema one or more dimensions are normalised to some extent

Multi star / hybrid / galaxy schema
A group of star and snowflake schemas

Dimension tables Detailed

DWH Types
Passive
data is loaded to the DWH once a day
Real time
up-to-date data is loaded to the DWH
Near real-time
data is loaded to the DWH hourly or at regular short intervals

Dimension table load strategies
three different techniques
Type-1
Type-2
Type-3
(Types 4, 5, 6 also exist)

Type-1
In Type 1 any new record => insert
any changed record => update
because of this no history is maintained in the dimension table.

Type-2
New record => insert
Changed record => also insert
and versionize the old records
maintains history

In Type 2 tables, to understand and represent the data properly, we create additional columns like

Surrogate key
The surrogate key is the primary key of the dimension table.
we create a dummy column in the dimension table and keep inserting unique serial numbers
so that it acts like a primary key
Current flag
Indicates which record is the latest and which are older
We can update this column with Y/N or 1/0
Y, 1 = latest record
N, 0 = older records
Effective from date
date on which the record/information was generated
Effective to date
date on which the record/information expired
for the latest record the effective to date can be null or a far-future calendar date like 31-dec-9999
Natural key
The natural key is the primary key column in the OLTP/transactional system
A column which is naturally a primary key
example: studentid, rollno, customer number
natural keys cannot be primary keys in a Type 2 dimension table because we insert duplicate records

Type-3
Type 3 maintains history in separate columns
Type 3 maintains partial history

city_id  city_name    prev_city_name
ct1      hyd
ct2      new chennai  chennai
Types of Dimensions

Slowly Changing Dimension (SCD)
A dimension in which data changes very slowly

Fast Changing Dimension (rapidly changing / rapidly growing / monster dimension)
A dimension in which changes are very fast.
example: age in weeks, share market, commodity, exchange rates, etc.

Conformed Dimension (shared dimension)
A dimension which is shared across multiple fact tables
[Diagram: Dim joined to both fac1 and fac2; one dimension row (1) relates to many (m) fact rows]

Role Playing Dimension (alias)
A single dimension referred to by a single fact table multiple times
(or)
A single dimension joined with a fact table multiple times using an alias
[Diagram: fac joined to Dim and to Dim (alias)]

student
student_id  name  class_teacher_id  temp_class_teacher_id
s1          john  t1                t2

teachers (joined as the class teacher)
teacher_id  name
t1          reema
t2          jinger

teachers (alias) (joined as the temp class teacher)
teacher_id  name
t1          reema
t2          jinger

Mini Dimension (smaller or subset)
A mini dimension is a subset of a main/big dimension
whenever a few columns change slowly and a few columns change fast,
we can create a separate table (mini table) holding only the fast-changing columns
whenever a few columns are accessed very frequently by reporting and
a few columns are accessed very rarely,
we can create a separate table (mini table) holding only the frequently accessed columns

Bridge Dimension (joins two dimensions)
A dimension table which joins two other dimensions that are in a many-to-many relation

student
student_id  name
s1          xyz
s2          pqr

course
course_id  course_name
1          odi
2          obiee
3          inf

bridge (one-to-many from student, one-to-many from course)
student_id  course_id
s1          1
s1          2
s1          3
s2          1
s2          2

For comparison, DEPT to EMP is a plain one-to-many relation (one department, many employees):

DEPT
deptno  dname       loc
10      ACCOUNTING  NEW YORK
20      RESEARCH    DALLAS
30      SALES       CHICAGO
40      OPERATIONS  BOSTON

EMP
empno  ename   deptno
7369   SMITH   20
7499   ALLEN   30
7521   WARD    30
7566   JONES   20
7654   MARTIN  30
7698   BLAKE   30
7782   CLARK   10
7788   SCOTT   20
7839   KING    10
7844   TURNER  30
7876   ADAMS   20
7900   JAMES   30
7902   FORD    20
7934   MILLER  10

Junk Dimension (consolidated codes)
A dimension table which stores a consolidation of smaller code descriptions
It also stores unknown code descriptions

code_type     code  description
cust_status   A     Active
cust_status   I     Inactive
accnt_status  C     Closed
accnt_status  A     Re-Activated
accnt_status  B     Unknown

Audit Dimension (statistics)
The audit dimension stores statistical information like number of tables, rows, job run time, load time, number of reports, average run time, etc.

De-generate Dimension
A dimension column which doesn't hold any business meaning; it is stored in the fact table and has no dimension table of its own (for example an order number)

Fact tables
A table which contains measures
these measures are quantifiable and provide metrics to understand the business performance
Components of a fact
Measures
Foreign keys
De-generate dimension

Types of measures
Additive (fully additive)
A measure which can be summarised/aggregated across all dimensions
example: quantity, price
Non-additive
A measure which cannot be summarised/aggregated
example: average columns, percentage columns
Semi-additive
A measure which can be summarised/aggregated across a few dimensions and cannot be across a few other dimensions

Types of Facts
Detailed fact / transactional fact
Stores detailed-level information
example:
order_num  day   qty
order1     day1  10 booked
order1     day2  2 cancelled

Normal/regular fact
stores some consolidated and calculated information
example:
order_num  booked  cancelled  net qty
order1     Day1    Day2

Factless fact
A fact table which doesn't contain any measures is called a factless fact
though it does not contain any measure, we can still get business metrics by counting the columns (that is, counting rows per key)

Aggregate
An aggregate table stores pre-calculated, summarised data to get the reports very quickly

Fact Loading
Fact tables are most of the time insert-only;
in some cases we update as well as insert.
Dimensions always have to be loaded first and then facts.
Every dimension table primary key will be a foreign key in the fact table.

To ensure the primary and foreign key relationships from dimension to fact,
we look up the dimensions to get the latest surrogate keys and insert them into the fact table.
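The surrogate-key lookup during a fact load can be sketched as a join between the staged rows and the current dimension rows. This is a minimal sketch using Python's sqlite3 rather than ODI; the tables stg_orders, dim_customer and fact_orders and their columns are hypothetical names for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_customer (cust_key INTEGER PRIMARY KEY,
                               cust_id TEXT, current_flag TEXT);
    CREATE TABLE stg_orders (order_num TEXT, cust_id TEXT, qty INTEGER);
    CREATE TABLE fact_orders (order_num TEXT,
                              cust_key INTEGER REFERENCES dim_customer(cust_key),
                              qty INTEGER);
    -- the dimension is loaded first: c1 has an old and a current version
    INSERT INTO dim_customer VALUES (1, 'c1', 'N'), (2, 'c1', 'Y'), (3, 'c2', 'Y');
    INSERT INTO stg_orders VALUES ('order1', 'c1', 10), ('order2', 'c2', 4);
    """
)

# look up the latest surrogate key by natural key and insert it into the fact
conn.execute(
    """
    INSERT INTO fact_orders (order_num, cust_key, qty)
    SELECT s.order_num, d.cust_key, s.qty
    FROM stg_orders s
    JOIN dim_customer d
      ON d.cust_id = s.cust_id AND d.current_flag = 'Y'
    """
)
fact_rows = conn.execute(
    "SELECT order_num, cust_key, qty FROM fact_orders ORDER BY order_num"
).fetchall()
```

order1 picks up surrogate key 2 (the current version of c1), not the expired key 1, which is why the dimension has to be loaded before the fact.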

ETL Architecture / DWH Architecture

[Diagram: a customer record moving from the OLTP system through staging into dim_customer]
table          cust_id  name  income
oltp customer  c1       xxxx  1289.88889
staging        c1       xxxx  1900
dim_customer   c1       xxxx  1289.889
dim_customer   c1       xxxx  1900
(one copy of the dim_customer table in the diagram shows the income values 12.89 and 1900)

ETL Architecture
Top-down approach: EDW first and then data marts
Bottom-up approach: data marts first and then EDW (optional)

Layers of ETL
OLTP system
Transactional systems or source systems
Staging area
Is a temporary data landing area
data is always extracted from the OLTP tables and inserted into staging
generally we don't apply complex transformations here
we keep the data for a few weeks
EDW
Contains history, dimension and fact data
we apply complex transformations, data cleansing, validation, rejection handling, etc.
Data mart
A data mart is a subset of the DWH. It contains ready-to-report data of a department or a subject area

Data Extraction Strategy
Full load (initial load)
Reading the full data from the OLTP system is called full load or initial load
generally we do this only once, when we run the ETL for the first time
Incremental data extraction (delta data extraction)
On a daily basis we extract only the records changed since the last run.
this is called delta extraction, incremental extraction or change data capture

ODS
Operational Data Store
An ODS holds integrated, volatile, current-valued data.
whenever we need to consolidate all OLTP data to create some operational reports
we can integrate the OLTP data and keep it in another database, which is known as an ODS
difference between ODS and DWH
ods                dwh
integrated         integrated
subject oriented   subject oriented
volatile           non volatile
current (one day)  historical
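The difference between a full load and an incremental (delta) extraction can be sketched as follows. This is an illustration only: it assumes the source table carries an updated_at column to compare against the last run time, which the notes do not specify, and the customer table and extract function are hypothetical names.

```python
import sqlite3


def extract(conn, last_run=None):
    """Return all rows on the first run, otherwise only rows changed after last_run."""
    if last_run is None:
        # full load / initial load
        return conn.execute(
            "SELECT cust_id, income FROM customer ORDER BY cust_id"
        ).fetchall()
    # delta / incremental extraction: only rows changed since the last run
    return conn.execute(
        "SELECT cust_id, income FROM customer "
        "WHERE updated_at > ? ORDER BY cust_id",
        (last_run,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cust_id TEXT, income INTEGER, updated_at TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [("c1", 1289, "2024-01-01"), ("c2", 500, "2024-01-01")],
)
full_rows = extract(conn)  # first run reads everything

# the source system changes c1 on the next day
conn.execute(
    "UPDATE customer SET income = 1900, updated_at = '2024-01-02' "
    "WHERE cust_id = 'c1'"
)
delta_rows = extract(conn, last_run="2024-01-01")  # later runs read only changes
```

A timestamp column is only one way to detect changes; trigger-based or log-based change data capture serves the same purpose when the source has no such column.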

ETL tools by vendor
Informatica: PowerCenter
IBM: InfoSphere DataStage
SAP: BODS
Ab Initio: Ab Initio
SAS: SAS ETL Studio
Talend: Talend Studio
Pentaho: Pentaho ETL
Microsoft: SSIS
Oracle: Oracle Data Integrator
All the above tools except Oracle Data Integrator follow the ETL approach

Extract
Transform
Load
In the ETL approach all the transformations are performed on the ETL engine

ODI
ODI follows the ELT approach
Extract Load Transform
The ODI tool was originally developed by Sunopsis,
later taken over by Oracle and renamed ODI (Oracle Data Integrator)

In the ELT approach all the transformations are performed in the database,
so it is often faster than ETL
smaller servers are required, so server cost is saved
ODI has a lot of built-in code templates known as Knowledge Modules, which can be customised

ODI Architecture
Server components
Client components
Repository

Repository
The repository manages metadata
metadata is data about data:
any programs, code, structures or anything that explains the data
Repository types
Master
Work (Development / Execution)

ODI Client
supervisor
ODI Studio
Designer
Operator
Topology
Security

Server
ODI Agent (Standalone, Colocated, J2EE)
Oracle Enterprise Manager
WebLogic Server

Web

Versions
ODI 10g
11g: integration with Fusion Middleware

ODI 12c Installation
Java version 1.7 or above
download ODI 12c 12.1.3 (latest one)
an older version is ODI 12c 12.1.2
Oracle Database 12c or Oracle 11g R2 patch 3
11.2.1.3

software files
fmw_12.1.3.0.0_infrastructure.jar
fmw_12.1.3.0.0_odi.jar

cd C:\Java\jdk1.7.0_51\bin

java -jar F:\Saraswathi\softwares\CurrentVersionSoftwares\ODI\ODI12.1.3\fmw_12.1.3.0.0_infrastructure_Disk1_1of1\fmw_12.1.3.0.0_infrastructure.jar

java -jar c:\ODI\fmw_12.1.3.0.0_infrastructure.jar

during the installation select "Enterprise installation"

java -jar F:\Saraswathi\softwares\CurrentVersionSoftwares\ODI\ODI12.1.3\fmw_12.1.3.0.0_odi_Disk1_1of1\fmw_12.1.3.0.0_odi.jar

ODI Studio location
E:\ODI12c\Oracle\Middleware\Oracle_Home\odi\studio

Create Repository
two ways to create repositories
using ODI Studio
this used to be the old method before ODI 12c
using the RCU utility
Repository Creation Utility
<<E:\ODI12c\Oracle\Middleware\Oracle_Home>>\oracle_common\bin
rcu

open a command prompt in Admin mode
cd E:\ODI12c\Oracle\Middleware\Oracle_Home\oracle_common\bin
Note: set the environment variable JAVA_HOME to the Java base directory
Go to My Computer
Properties
Advanced settings
Advanced tab
Environment variables
under the System variables section
click New
NOTE: the Java path should be the base path of Java, that is, the directory above the bin directory
to verify the Java path
echo %JAVA_HOME%
now type rcu
click Next
choose Create Repository
System Load and Product Load
click Next
OK
select a prefix name
select Oracle Data Integrator
select "same password for all schemas"
specify any password

Connecting to Repository from Studio
jdbc:oracle:thin:@<host>:<port>:<sid>

[Diagram: a.java (import statements and main() { }) is compiled to a.class and run with: java a]

Read data from EMP_SRC78 (Oracle, localhost, schema scott)
Load data to EMP_T78 (Oracle, localhost, schema scott)

Topology
1) Configure Topology
configure all the database connections from which we need to extract or load data
Physical Architecture
Create Data Server
here we provide the database driver, database name and
user/password to connect to the database
Create Physical Schema
here we select the schema name from which we need to access tables
these schemas can be used for loading or reading tables

Note: any object which we create in the physical architecture is called a physical object
in a physical schema we need to choose two schema names
schema: the actual schema from which we need to extract/load tables
work schema: the place where work tables/temp tables are created.
for now we can use the same name for schema and work schema.

[Diagram: Technology (Oracle) > Data Server (database name, user/password; for example database apple, user system) > Physical Schema (scott, sh, bisample, system) > tables (emp, dept)]

Logical Architecture
Any object created in the logical architecture is called a logical object.
here we need to create logical schemas and map every logical schema to a physical schema
through an intermediate object called context
by default ODI has one context named Global

we can map one logical schema to more than one physical schema using different contexts
example
[Diagram: logical schema Logical1 is mapped through one context (Global) to the physical schema SCOTT and through another context to the physical schema sh, each on a data server]

Model
Create a data model under Models and map it to a logical schema created above
Modeling is the process of designing tables.
Reverse engineering process
importing the table structure from the database into ODI
Note: when we reverse engineer tables only metadata (the table structure) is imported, not data

We can map one model to one logical schema only.
if we need tables from a different schema or database, we need to create another model and
map it to those logical schemas

Development
Create a project (if a project is not already present)
Create mappings

[Diagram: a mapping loads emp_src into emp_tgt; the KM generates]
insert into emp_tgt
select * from emp_src

[Diagram: ELT flow. The emp and dept files are loaded by LKMs into the work tables c$_emp and c$_dept in the Oracle database; the IKM then loads emp_dept with]
insert into emp_dept
select * from c$_emp, c$_dept
where c$_emp.deptno = c$_dept.deptno

[Diagram: the CKM checks emp_dept (primary key empno) and writes rejected rows to the error table e$_emp_dept]

Other KM types
RKM: Reverse Engineer KM
SKM: Services KM
JKM: Journal KM

work tables = temp tables = staging

Mapping
A mapping is the smallest programmable component that performs an ELT activity
we design mappings using sources, targets and graphical components
when we execute a mapping, the Knowledge Modules generate the code in the ELT approach to perform
the actual data movement
In older versions mappings used to be called interfaces.
below are the new features introduced for mappings in 12c
interfaces vs mappings
1 interfaces are not flow based.
mappings can be designed in a flow-based approach with the help of components
these components are newly introduced in 12c.
2 interfaces have no components
mappings have components
3 an interface can have a single target only
mappings can have more than one target
4 in older versions we had yellow interfaces to reuse interface code
in 12c they are called reusable mappings.

mapping components
Every mapping has the sections below
Logical diagram
in the logical diagram we design the actual flow using sources, targets and components
Physical diagram
in the physical diagram we choose the knowledge modules to be used for mapping execution
Components
components are newly introduced in mappings to perform data transformations like
sort
filter
split (multiple filters)
lookup
join
aggregate
expression, etc.

Knowledge Module
A KM is a ready-made code template which is used to generate code while running the mappings
below are the types of KMs

LKM: Loading Knowledge Module
The LKM always extracts data from the source into staging work tables.
we need it when data is extracted from one source and loaded to a different target
between the source and the staging work tables/temp tables no transformations are applied
the LKM creates temp tables and drops them after the load

IKM: Integration Knowledge Module
The IKM extracts data from the work/temp tables and loads it to the final target
when data is extracted from and loaded to the same database, the IKM can read from the source directly and load to the target
(here it is not necessary to use an LKM)
the IKM performs the actual data transformation logic and loads to the final target

CKM: Check Knowledge Module
The CKM performs data validation as per the constraints defined on the target tables.
when we choose flow/static control on the IKM, the CKM validates the data
and inserts all invalid records into an error table,
and then only the valid records are loaded to the target by the IKM

JKM: Journalizing Knowledge Module
The JKM performs the change data capture activity
it supports CDC through other software tools like Oracle GoldenGate
and also supports ODI's built-in trigger-based CDC

SKM: Service Knowledge Module
The SKM processes data through web services

RKM: Reverse Engineering Knowledge Module
The RKM performs a customised reverse engineering process

Filter
The Filter component filters the data based on a given filter condition
we can write any valid ANSI SQL where condition as a filter
since ODI works in the ELT style, it supports SQL where conditions as filters
example
DEPTNO=10
DEPTNO=10 and SAL>0
Additionally we can write sub-query filters
example:
DEPTNO in (select distinct DEPTNO from DEPT)
SAL in (select max(sal) from EMP)
from ODI 12.1.3 a new component was introduced to create a subquery filter

Sort
The Sort component sorts the data based on the order by clause provided in its condition
example
DEPTNO DESC,SAL ASC
DEPTNO DESC,SAL NULLS FIRST

ODI knowledge modules generate order by clauses in the SQL statements wherever a sort component
is used

Split
[Diagram: emp_src -> Split -> emp_t10 (dept = 10), emp_t20 (dept = 20), emp_tothers (deptno not in 10, 20)]

[Diagram: the same load built from three interfaces reading emp_src, run one after another in a package (m1, m2, m3)]
int1: target emp_t10, IKM Oracle Multi Table Insert, define_query=true
int2: target emp_t20, IKM Oracle Multi Table Insert, is_target=true
int3: target emp_tothers, IKM Oracle Multi Table Insert, execute=true
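What the split does to the data can be sketched outside ODI as one filtered insert per target. This is an illustration with Python's sqlite3, not the code ODI generates: the routing dictionary and the sample rows are assumptions, and on Oracle a single multi-table INSERT ALL ... WHEN statement can route all rows in one pass instead of three.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE emp_src     (empno INTEGER, ename TEXT, deptno INTEGER);
    CREATE TABLE emp_t10     (empno INTEGER, ename TEXT, deptno INTEGER);
    CREATE TABLE emp_t20     (empno INTEGER, ename TEXT, deptno INTEGER);
    CREATE TABLE emp_tothers (empno INTEGER, ename TEXT, deptno INTEGER);
    INSERT INTO emp_src VALUES
        (7782, 'CLARK', 10), (7369, 'SMITH', 20),
        (7499, 'ALLEN', 30), (7839, 'KING', 10);
    """
)

# one split condition per target table, as in the diagram
routes = {
    "emp_t10": "deptno = 10",
    "emp_t20": "deptno = 20",
    "emp_tothers": "deptno NOT IN (10, 20)",
}
for target, condition in routes.items():
    # table names and conditions are fixed constants here, not user input
    conn.execute(f"INSERT INTO {target} SELECT * FROM emp_src WHERE {condition}")

counts = {
    target: conn.execute(f"SELECT COUNT(*) FROM {target}").fetchone()[0]
    for target in routes
}
```

Every source row lands in exactly one target because the three conditions do not overlap and together cover all department numbers.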

Distinct
[Sample data: EMP_SRC78 with duplicate rows added for 7369 and 7499; the other EMP columns (job, mgr, hiredate, comm) are omitted here]
empno  ename   sal
7369   APPLE   100
7369   SMITH   800
7369   APPLE1  101
7499   ALLEN   1600
7499   ALLEN   1600
7499   ALLEN   1600
7521   WARD    1250
7566   JONES   2975
7654   MARTIN  1250
7698   BLAKE   2850
7782   CLARK   2450
7788   SCOTT   3000
7839   KING    5000
7844   TURNER  1500
7876   ADAMS   1100
7900   JAMES   950
7902   FORD    3000
7934   MILLER  1300

SELECT * FROM
(
SELECT E.*
,ROW_NUMBER() OVER(PARTITION BY EMPNO ORDER BY SAL DESC) SNO
--,RANK() OVER(ORDER BY SAL DESC) RK
--,DENSE_RANK() OVER(PARTITION BY DEPTNO ORDER BY SAL DESC) DRK
FROM EMP_SRC78 E
)T
WHERE SNO>1

SELECT * FROM EMP_SRC78


WHERE EMP_SRC78.ROWID not IN (SELECT MAX(ROWID) FROM scott.EMP_SRC78 GROUP BY EMPNO)

Types of Components

Projected components
Projected components add the column structure (metadata) coming from the left-side link
example: expression, distinct, set, etc.

Non-projected components
Non-projected components do not get the structure added from the left link
example: filter, sort, split

Distinct
The Distinct component eliminates the row duplicates of its input.
It adds a distinct clause on top of the input query and so eliminates the row duplicates
Note: distinct will not eliminate column duplicates
In general we see two types of duplicates
row duplicates
where all the column values are identical
example
e1  a  100
e1  a  100
e1  a  100
column duplicates
one column value is the same, but the other column values are different
example
e1  a   100
e1  a1  200
e1  a2  300

to eliminate both column and row duplicates we can use the queries below

1)
SELECT * FROM EMP_SRC78
WHERE EMP_SRC78.ROWID IN (SELECT MAX(ROWID) FROM scott.EMP_SRC78 GROUP BY EMPNO)

to use this in ODI, apply a subquery filter like
EMP_SRC78.ROWID IN (SELECT MAX(ROWID) FROM scott.EMP_SRC78 GROUP BY EMPNO)

2)
use analytical functions
SELECT * FROM
(
SELECT E.*
,ROW_NUMBER() OVER(PARTITION BY EMPNO ORDER BY SAL DESC) SNO
--,RANK() OVER(ORDER BY SAL DESC) RK
--,DENSE_RANK() OVER(PARTITION BY DEPTNO ORDER BY SAL DESC) DRK
FROM EMP_SRC78 E
)T
WHERE SNO=1
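The difference between DISTINCT and the ROW_NUMBER approach can be checked on a handful of rows. This is a small sketch with Python's sqlite3 (window functions need SQLite 3.25 or later); the table name emp_src and the reduced column list are assumptions made to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp_src (empno INTEGER, ename TEXT, sal INTEGER)")
conn.executemany(
    "INSERT INTO emp_src VALUES (?, ?, ?)",
    [
        (7369, "APPLE", 100), (7369, "SMITH", 800), (7369, "APPLE1", 101),  # column duplicates
        (7499, "ALLEN", 1600), (7499, "ALLEN", 1600), (7499, "ALLEN", 1600),  # row duplicates
        (7521, "WARD", 1250),
    ],
)

# DISTINCT removes only the fully identical rows
distinct_rows = conn.execute(
    "SELECT DISTINCT empno, ename, sal FROM emp_src ORDER BY empno, sal"
).fetchall()

# ROW_NUMBER keeps exactly one row per empno (the one with the highest sal)
dedup_rows = conn.execute(
    """
    SELECT empno, ename, sal FROM (
        SELECT e.*,
               ROW_NUMBER() OVER (PARTITION BY empno ORDER BY sal DESC) AS sno
        FROM emp_src e
    ) t
    WHERE sno = 1
    ORDER BY empno
    """
).fetchall()
```

DISTINCT still leaves three rows for empno 7369, while the ROW_NUMBER query leaves one row per empno.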

[Sample data: in EMP_SRC78 the three 7369 rows (APPLE, SMITH, APPLE1; deptno 20) are column duplicates on the primary key EMPNO, and the three identical 7499 ALLEN rows (deptno 30) are row duplicates. Distinct reduces the 7499 rows to one but keeps all three 7369 rows, leaving 16 rows. The SNO column of the analytic query numbers the 7369 rows and the 7499 rows 1, 2, 3 and every other row 1.]

Aggregate

Exp

set

Joins/Lookup
Joins
natural join
inner join
left outer
right outer
full outer
cross joins

[Sample data: EMP_SRC78 holds the 14 SCOTT employees plus one extra row (1212 APPLE, sal 1000, deptno 99) that has no matching department, 15 rows in total. DEPT_SRC78 holds 10 ACCOUNTING NEW YORK, 20 RESEARCH DALLAS, 30 SALES CHICAGO and 40 OPERATIONS BOSTON; a second deptno 20 row (20 TRAINING HYD) is inserted later to show multiple matches.]

-- inner join (Oracle syntax)
SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E,DEPT_SRC78 D
WHERE E.DEPTNO=D.DEPTNO

-- inner join (ANSI syntax)
SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E INNER JOIN DEPT_SRC78 D
ON ( E.DEPTNO=D.DEPTNO)

-- left outer join (Oracle syntax)
SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E,DEPT_SRC78 D
WHERE E.DEPTNO=D.DEPTNO(+)

-- left outer join (ANSI syntax)
SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E LEFT OUTER JOIN DEPT_SRC78 D
ON ( E.DEPTNO=D.DEPTNO)

-- the same left outer join with the condition written the other way round
SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E,DEPT_SRC78 D
WHERE D.DEPTNO(+)=E.DEPTNO

-- right outer join (Oracle syntax)
SELECT E.*,D.DEPTNO,DNAME,LOC
FROM EMP_SRC78 E,DEPT_SRC78 D
WHERE E.DEPTNO(+)=D.DEPTNO

-- right outer join (ANSI syntax)
SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E RIGHT OUTER JOIN DEPT_SRC78 D
ON ( E.DEPTNO=D.DEPTNO)

-- full outer join: (+) on both sides is not allowed in Oracle, this statement raises an error
SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E,DEPT_SRC78 D
WHERE E.DEPTNO(+)=D.DEPTNO(+)

-- so with Oracle syntax combine the right and left outer joins with UNION
SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E,DEPT_SRC78 D
WHERE E.DEPTNO(+)=D.DEPTNO
UNION
SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E,DEPT_SRC78 D
WHERE E.DEPTNO=D.DEPTNO(+)

-- full outer join (ANSI syntax)
SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E FULL OUTER JOIN DEPT_SRC78 D
ON ( E.DEPTNO=D.DEPTNO)

-- cross join: no join condition
SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E,DEPT_SRC78 D

-- cross join against a single-record table
CREATE TABLE ORG_NAME(ORN_NAME VARCHAR2(50))

INSERT INTO ORG_NAME VALUES('APPLE')

SELECT * FROM EMP_SRC78,ORG_NAME

SELECT * FROM ORG_NAME

-- cross join
SELECT * FROM EMP,DEPT

-- natural join
SELECT * FROM EMP NATURAL JOIN DEPT

-- add a second deptno 20 row to show one employee row matching multiple department rows
SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E,DEPT_SRC78 D
WHERE E.DEPTNO=D.DEPTNO

INSERT INTO DEPT_SRC78
VALUES(20,'TRAINING','HYD')

SELECT * FROM DEPT_SRC78

-- create the target table from the join
CREATE TABLE EMP_DEPT78 AS
SELECT E.*,D.DEPTNO AS DEPTNO1,DNAME,LOC
FROM EMP_SRC78 E,DEPT_SRC78 D
WHERE E.DEPTNO=D.DEPTNO

SELECT * FROM EMP_DEPT78

TRUNCATE TABLE EMP_DEPT78

Joins
The Join component joins two or more input tables
these tables can be in the same database or in different databases.
below are the types of joins
inner join
matched data of both tables
left outer join
matched data plus unmatched data of the left table
right outer join
matched data plus unmatched data of the right table
full outer join
matched data plus unmatched data of both sides
cross join
joins every record of the first table with every record of the second table
a cross join is performed by joining two tables without any join condition
we can use a cross join when there is a single record in one of the tables.
natural join
joins two tables on their identically named columns (typically the primary and foreign keys) without giving a join condition
supported only using ANSI SQL

We can set the join component to generate ANSI SQL syntax or database-specific syntax
ANSI SQL
ANSI SQL statements are standard SQL certified by ANSI, which can be supported in any database
example: SELECT E.*,DNAME,LOC
FROM EMP_SRC78 E FULL OUTER JOIN DEPT_SRC78 D
ON ( E.DEPTNO=D.DEPTNO)
join order
We can have one or more join components in a single mapping
we can choose a join order for every join component to specify the order of joins
while the SQL is generated
Note:
in all types of joins, if a record of the first table matches multiple records in the second table
then the first table record is matched with all of those records
example
emp
emp_id  ename  job_id
e1      xxx    j1

job
job_id  desc
j1      manager
j1      manag

join result
emp_id  ename  job_id  job_id  desc
e1      xxx    j1      j1      manager
e1      xxx    j1      j1      manag

Lookup
The Lookup component is similar to a join but has more options
A lookup takes two inputs to join
driving table
lookup table

unmatched policy
when no matching record is found in the lookup table for a record of the driving table
we choose one of the options below
drop records: equal to an inner join
take default values: equal to a left outer join

multiple records policy
when multiple records match in the lookup table, below are the options
match with all records
similar to a join
this is deprecated from 12.1.3
return error
fail the mapping if multiple matches are found
this option uses sub-select SQL to join the tables
this is deprecated from 12.1.3
match with any single record
the driving table record matches one single record chosen at random
match with first record
matches the first record
we can choose the columns that define the order
match with last record
matches the last record
we can choose the columns that define the order
match with nth record
matches the nth record
we can choose the columns that define the order

only ANSI syntax is supported
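The join behaviour and the two lookup no-match options described above can be tried on a few rows. This is a sketch with Python's sqlite3, not ODI-generated code; the table names, the reduced columns and the default value 'UNKNOWN' are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE emp_src  (empno INTEGER, ename TEXT, deptno INTEGER);
    CREATE TABLE dept_src (deptno INTEGER, dname TEXT);
    INSERT INTO emp_src VALUES
        (7369, 'SMITH', 20), (7782, 'CLARK', 10), (1212, 'APPLE', 99);
    -- deptno 20 appears twice, deptno 99 has no department
    INSERT INTO dept_src VALUES
        (10, 'ACCOUNTING'), (20, 'RESEARCH'), (20, 'TRAINING');
    """
)

# inner join, or lookup with "drop records": unmatched APPLE disappears
inner_rows = conn.execute(
    """
    SELECT e.ename, d.dname
    FROM emp_src e INNER JOIN dept_src d ON e.deptno = d.deptno
    ORDER BY e.ename, d.dname
    """
).fetchall()

# left outer join, or lookup with "take default values": APPLE is kept
left_rows = conn.execute(
    """
    SELECT e.ename, COALESCE(d.dname, 'UNKNOWN') AS dname_or_default
    FROM emp_src e LEFT OUTER JOIN dept_src d ON e.deptno = d.deptno
    ORDER BY e.ename, dname_or_default
    """
).fetchall()
```

SMITH appears twice in both results because deptno 20 matches two department rows, which is the situation the multiple records policy of the lookup is meant to control.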

Files
Structured
Semi-structured
Unstructured

Structured
Fixed width
the data is kept in fixed column positions
10ACCOUNTINGNEW YORK
20RESEARCH  DALLAS
30SALES     CHICAGO
40OPERATIONSBOSTON

Delimited
10|ACCOUNTING|NEW YORK
20|RESEARCH|DALLAS
30|SALES|CHICAGO
40|OPERATIONS|BOSTON
delimiter character
optional quotes
header
footer
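The two structured layouts can be read with a few lines of code. This sketch assumes the fixed-width layout is 2 characters for the department number, 10 for the name and the rest for the location; those widths are inferred from the sample rows, and the function names are hypothetical.

```python
import csv
import io

FIXED = (
    "10ACCOUNTINGNEW YORK\n"
    "20RESEARCH  DALLAS\n"
    "30SALES     CHICAGO\n"
    "40OPERATIONSBOSTON\n"
)
DELIMITED = (
    "10|ACCOUNTING|NEW YORK\n"
    "20|RESEARCH|DALLAS\n"
    "30|SALES|CHICAGO\n"
    "40|OPERATIONS|BOSTON\n"
)


def parse_fixed(text):
    """Cut each line at fixed column positions (assumed widths 2, 10, rest)."""
    rows = []
    for line in text.splitlines():
        rows.append((line[0:2], line[2:12].strip(), line[12:].strip()))
    return rows


def parse_delimited(text, delimiter="|"):
    """Split each line on the delimiter character."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [tuple(row) for row in reader]


fixed_rows = parse_fixed(FIXED)
delimited_rows = parse_delimited(DELIMITED)
```

Both parsers return the same four department rows; the fixed-width version depends entirely on the column positions, the delimited version on the delimiter character.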

lkm File to SQL


DROP WORK TABLE
drop table SCOTT.C$_0DEPT_FILE79
CREATE TABLE SCOTT.C$_0DEPT_FILE79
insert into SCOTT.C$_0DEPT_FILE79

LKM will extract data from source technology to ODI staging / ODI work schema/target tech
that is the reason , LKM is required only when source and target technologies are different
LKM naming convesion following below names

example

LKM

<source technology>

To

<target technology>

LKM

File

To

Oracle

for every RDBMS technology , we can use two types of LKM 's
one with specific to technology using the Advance features of that technology
second with SQL technology where SQL is ANSI SQL supported environment

SQL technology can be used for any RDBMS , however it will generates only ANSI SQL synta
where technology specific knowledge modules like Oracle /teradata they will use advace fe

LKM File to SQL

this will create a work table with prefix C$ and loads data from file to C$ table using jython
it is not recommended for large volumes
drop work table
C$_0DEPT_FILE79
create table SCOTT.C$_0DEPT_FILE79
insert file data into c$ table using jython programs and ANSI SQL

File

c$_

File

c$_
15
LKM

IKM

LKM File to Oracle (EXTERNAL TABLE)

c$_

File

external table

External table is a concept is database where data of the table is organised externally.
since external table is mapped to a file which is sitting ontop of OS platform so we can dire
load to final target.
LKM File to Oracle (sql loader)

File --> sql loader --> C$_ work table --> target

This KM uses the Oracle SQL*Loader utility to read from the file and load the work table.
It supports only date, decimal and character data types;
for other data types we need to choose other knowledge modules.
The KM creates a control file for SQL*Loader and runs it through a Java program.

sqlldr 'system@apple/apple123' control='D:\ODIFILES/DEPT_FILE79.ctl' log='D:\ODIFILES/DEPT_FILE79.log' > "D:\ODIFILES/DEPT_FILE79.out"

How to read multiple files into a single target

(only when all the files have a similar structure)

Option 1) Concatenate the multiple files into a single file using a Unix command,
then create a mapping with the final file as source and load it to the table.
example unix command
cat *.txt>final.txt

Option 2)
drag all sources into the mapping
combine them using the set component (union all)
and load to target

Option 3)
create a variable to read the file names one after another and load them to the target using a package loop

Sequences
Native

We create the sequence in the database and use the same sequence in ODI.
These sequences are managed in the database.
To reset the sequence we can use ODI package tools or an alter ... restart command in the begin mapping commands.
ALTER SEQUENCE scott.SEQ78 RESTART START WITH 1
Generally we use sequences to generate surrogate keys.

A sequence starts at the number given in START WITH and increments by the number given in INCREMENT BY.
create sequence test minvalue 0 start with 0 increment by 1
(in Oracle, START WITH 0 also needs MINVALUE 0, because the default minimum of an ascending sequence is 1)
To call the sequence in the database we use the NEXTVAL keyword.
In ODI we can use the syntax below:
#ProjectName.seqName_NEXTVAL
:PROJDEV78.seqNative_NEXTVAL
The CURRVAL keyword gives the current value of the sequence.
Standard Sequence

A standard sequence increments its number when it is triggered on the ODI Agent.
When a mapping runs in ELT style, where data is moved from source to target directly, the number is incremented only once.
When it is used in ETL style, a new number is generated for every record.
A standard sequence always starts with 1; we cannot specify a starting number.
Specific Sequence
CREATE TABLE SPEC_SEQ78(PROCESS VARCHAR2(20),SN NUMBER)

insert into spec_seq78 values('HR',1)

A specific sequence uses the number stored in the above table, located with the information given in the sequence definition:
table name: spec_seq78
column name: sn
where condition: process='HR'

While incrementing the number it follows the standard sequence approach (ELT vs ETL).
When a new number is generated, ODI updates the number in the above table.
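
A minimal sketch of the idea behind a specific sequence, using Python and an in-memory SQLite table in place of the Oracle table. The helper name next_value is hypothetical, and whether ODI hands out the stored number or the incremented number first is an assumption here; the point is only that the number lives in a table row found by table name, column name and where condition, and is written back after each increment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SPEC_SEQ78 (PROCESS TEXT, SN INTEGER)")
conn.execute("INSERT INTO SPEC_SEQ78 VALUES ('HR', 1)")

def next_value(conn, table, column, where, increment=1):
    """Read the stored number, add the increment and write it back."""
    current = conn.execute(
        f"SELECT {column} FROM {table} WHERE {where}"
    ).fetchone()[0]
    new_value = current + increment
    conn.execute(f"UPDATE {table} SET {column} = ? WHERE {where}", (new_value,))
    conn.commit()
    return new_value

first = next_value(conn, "SPEC_SEQ78", "SN", "PROCESS = 'HR'")
second = next_value(conn, "SPEC_SEQ78", "SN", "PROCESS = 'HR'")
stored = conn.execute("SELECT SN FROM SPEC_SEQ78 WHERE PROCESS = 'HR'").fetchone()[0]
```

Because the number survives in the table, another row (for example a different PROCESS value) can carry an independent counter in the same table.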

A sequence can be created globally under the global section.


Global sequences can be used by any project mapping.

project sequence
Any sequence created within a project is called a project sequence. These sequences can be used within that project only.


Variable
A variable in ODI holds a value that can be used further in mappings.
To assign a value to a variable, we write a database SQL query in the ODI refresh command.
The output of the SQL is assigned to the variable.
A variable can hold only one value at a time (it cannot hold an array of values).
We can refresh, set, increment and evaluate variables through packages.
We can use variables to:
implement incremental/delta extraction
read multiple files and load them to a single target
supply database names, user names, passwords and any other values

example
filter on src: deptno=#PROJDEV78.varDEPTNO
ODI refresh command for varDEPTNO: SELECT VAR_VAL FROM SCOTT.ODI_VAR78
src --> tgt

IKM
Integration Knowledge Modules
IKM SQL Insert
generates ANSI SQL syntax to insert data into the target
does not load data in bulk mode, so processing is slow
can be used for any ANSI SQL-compliant database

IKM SQL Control Append


similar to IKM SQL Insert, with the additional options below
commit is configurable
ability to perform validation by choosing flow/static control options

IKM Oracle Insert

similar to IKM SQL Insert, but supports hints for insert and select operations
example hints: append, parallel

IKM Oracle Control Append


similar to IKM SQL Control Append, with the additional options below
flow/static controls for data validation
hints for select, insert and for CKM tables
analyze tables
flow table created without logging so that it is faster
Update/Incremental Update
IKM SQL Update
Updates data in the target table based on the key column selected on the target table.
Can be used for any ANSI SQL database.
Note: no support for validation

IKM Oracle Update


Similar to IKM SQL Update, but supports UPDATE and SELECT hints.
example: USE_INDEX(<<INDEX NAME>>)
PARALLEL
Note: no support for validation

IKM SQL Incremental Update

INSERT
UPDATE
COMMIT IS CONFIGURABLE
Supports data validation
An incremental update performs insert and update operations in the same run:
all new records are inserted
and existing changed records are updated.
During the mapping execution
the IKM creates a temporary table named I$ and loads the source data into it,
then updates the flag in the I$ table to mark the new records, changed records and unchanged records,
then inserts the new records into the target
and updates only the changed records on the target.
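
The flag-then-apply flow above can be sketched as follows, using Python with an in-memory SQLite database as a stand-in for Oracle. The table names SRC, TGT and I$_TGT, the reduced column list and the function name are assumptions for illustration; the flag values N, U and I follow the sample sheet.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SRC (EMPNO INTEGER PRIMARY KEY, ENAME TEXT, SAL INTEGER);
CREATE TABLE TGT (EMPNO INTEGER PRIMARY KEY, ENAME TEXT, SAL INTEGER);
INSERT INTO SRC VALUES (7782, 'CLARK', 2450), (7839, 'KING', 9999), (7369, 'SMITH', 800);
INSERT INTO TGT VALUES (7782, 'CLARK', 2450), (7839, 'KING', 5000);
""")

def incremental_update(conn):
    """Load I$, flag each row, then insert new rows and update changed rows."""
    conn.executescript("""
    DROP TABLE IF EXISTS "I$_TGT";
    -- every source row starts as a candidate insert
    CREATE TABLE "I$_TGT" AS SELECT EMPNO, ENAME, SAL, 'I' AS FLAG FROM SRC;
    -- key exists on target with identical values: unchanged
    UPDATE "I$_TGT" SET FLAG = 'N' WHERE EXISTS (
        SELECT 1 FROM TGT t WHERE t.EMPNO = "I$_TGT".EMPNO
           AND t.ENAME = "I$_TGT".ENAME AND t.SAL = "I$_TGT".SAL);
    -- key exists on target but values differ: changed
    UPDATE "I$_TGT" SET FLAG = 'U' WHERE FLAG = 'I' AND EXISTS (
        SELECT 1 FROM TGT t WHERE t.EMPNO = "I$_TGT".EMPNO);
    -- apply only the changed rows
    UPDATE TGT SET
        ENAME = (SELECT i.ENAME FROM "I$_TGT" i WHERE i.EMPNO = TGT.EMPNO),
        SAL   = (SELECT i.SAL   FROM "I$_TGT" i WHERE i.EMPNO = TGT.EMPNO)
     WHERE EMPNO IN (SELECT EMPNO FROM "I$_TGT" WHERE FLAG = 'U');
    -- insert the new rows
    INSERT INTO TGT SELECT EMPNO, ENAME, SAL FROM "I$_TGT" WHERE FLAG = 'I';
    """)
    conn.commit()
    return dict(conn.execute('SELECT EMPNO, FLAG FROM "I$_TGT"').fetchall())

flags = incremental_update(conn)
target_rows = conn.execute("SELECT EMPNO, ENAME, SAL FROM TGT ORDER BY EMPNO").fetchall()
```

Unchanged rows (flag N) are never written to the target, which is the main difference from a plain merge.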
IKM Oracle Incremental Update

It is similar to IKM SQL Incremental Update;
however, it creates indexes and uses Oracle query optimization techniques such as analyze.

IKM SQL Merge

IKM SQL Merge performs incremental update operations (new record insert / existing record update)
using the SQL MERGE command.
The SQL MERGE command inserts new records
and updates all existing records (both changed and unchanged).

It is not recommended when the source contains a large number of unchanged records.

IKM Oracle Merge

Similar to IKM SQL Merge; however, we can add Oracle hints to improve the performance.

It is not recommended when the source contains a large number of unchanged records.

IKM Oracle Incremental Update (MERGE)


Similar to IKM Oracle Merge, with the additional options below:
supports data validation
updates only changed records on the target
performs source minus target to capture the changed records
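
A small sketch of the "source minus target" idea, using Python with SQLite as a stand-in: SQLite's EXCEPT plays the role of Oracle's MINUS, and INSERT OR REPLACE is used here only as a simple substitute for MERGE. Table names and columns are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SRC (EMPNO INTEGER PRIMARY KEY, ENAME TEXT, SAL INTEGER);
CREATE TABLE TGT (EMPNO INTEGER PRIMARY KEY, ENAME TEXT, SAL INTEGER);
INSERT INTO SRC VALUES (7782, 'CLARK', 2450), (7839, 'KING', 9999), (7369, 'SMITH', 800);
INSERT INTO TGT VALUES (7782, 'CLARK', 2450), (7839, 'KING', 5000);
""")

# Rows that are in the source but not (identically) in the target:
# these are exactly the new and the changed records.
changed = conn.execute("""
    SELECT EMPNO, ENAME, SAL FROM SRC
    EXCEPT
    SELECT EMPNO, ENAME, SAL FROM TGT
    ORDER BY EMPNO
""").fetchall()

# Merge only those rows into the target (stand-in for MERGE).
conn.executemany("INSERT OR REPLACE INTO TGT VALUES (?, ?, ?)", changed)
conn.commit()
merged = conn.execute("SELECT EMPNO, ENAME, SAL FROM TGT ORDER BY EMPNO").fetchall()
```

The unchanged CLARK row never reaches the merge step, so the merge touches two rows instead of three.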

IKM Oracle Incremental Update (PL SQL)

It performs the incremental update using a PL/SQL script.
It can be used when there are BLOB/CLOB columns, which do not support
update/insert operations using regular SQL.


EMPNO  ENAME   JOB        MGR   HIREDATE   SAL   COMM  DEPTNO
7369   SMITH   CLERK      7902  17-Dec-80  800         20
7499   ALLEN   SALESMAN   7698  20-Feb-81  1600  300   30
7521   WARD    SALESMAN   7698  22-Feb-81  1250  500   30
7566   JONES   MANAGER    7839  2-Apr-81   2975        20
7654   MARTIN  SALESMAN   7698  28-Sep-81  1250  1400  30
7698   BLAKE   MANAGER    7839  1-May-81   2850        30
7782   CLARK   MANAGER    7839  9-Jun-81   2450        10
7788   SCOTT   ANALYST    7566  19-Apr-87  3000        20
7839   KING    PRESIDENT        17-Nov-81  5000        10
7844   TURNER  SALESMAN   7698  8-Sep-81   1500        30
7876   ADAMS   CLERK      7788  23-May-87  1100        20
7900   JAMES   CLERK      7698  3-Dec-81   950         30
7902   FORD    ANALYST    7566  3-Dec-81   3000        20
7934   MILLER  CLERK      7782  23-Jan-82  1300        10
1212   APPLE                               1000        99
1111                                       100         10
9999                                       100         10

The sheet repeats the same 17 rows a second time with ENAME shown as APPLE for every EMPNO; all other columns are identical.

src
EMPNO  ENAME   JOB        MGR   HIREDATE   SAL   DEPTNO
7782   CLARK   MANAGER    7839  9-Jun-81   2450  10
7839   KING    PRESIDENT        17-Nov-81  9999  10
7934   MILLER  CLERK      7782  23-Jan-82  1300  10
7369   SMITH   CLERK      7902  17-Dec-80  800   20
7566   JONES   MANAGER    7839  2-Apr-81   2975  20
7788   SCOTT   ANALYST    7566  19-Apr-87  3000  20
7876   ADAMS   CLERK      7788  23-May-87  1100  20
7902   FORD    ANALYST    7566  3-Dec-81   3000  20

I$ (N = unchanged, U = changed, I = new)
EMPNO  ENAME   JOB        MGR   HIREDATE   SAL   DEPTNO  FLAG
7782   CLARK   MANAGER    7839  9-Jun-81   2450  10      N
7839   KING    PRESIDENT        17-Nov-81  9999  10      U
7934   MILLER  CLERK      7782  23-Jan-82  1300  10      N
7369   SMITH   CLERK      7902  17-Dec-80  800   20      I
7566   JONES   MANAGER    7839  2-Apr-81   2975  20      I
7788   SCOTT   ANALYST    7566  19-Apr-87  3000  20      I
7876   ADAMS   CLERK      7788  23-May-87  1100  20      I
7902   FORD    ANALYST    7566  3-Dec-81   3000  20      I

tgt
EMPNO  ENAME   JOB        MGR   HIREDATE   SAL   DEPTNO
7782   CLARK   MANAGER    7839  9-Jun-81   2450  10
7839   KING    PRESIDENT        17-Nov-81  5000  10
7934   MILLER  CLERK      7782  23-Jan-82  1300  10

source rows
EMPNO  ENAME   JOB        MGR   HIREDATE   SAL   DEPTNO
7782   APPLE   MANAGER    7839  9-Jun-81   999   10
7839   KING    PRESIDENT        17-Nov-81  5000  10
7934   MILLER  CLERK      7782  23-Jan-82  1300  10
7369   SMITH   CLERK      7902  17-Dec-80  800   20
7566   JONES   MANAGER    7839  2-Apr-81   2975  20
7788   SCOTT   ANALYST    7566  19-Apr-87  3000  20
7876   ADAMS   CLERK      7788  23-May-87  1100  20
7902   FORD    ANALYST    7566  3-Dec-81   3000  20

SCD2 target, first load
SK  EMPNO  ENAME   JOB        MGR   HIREDATE   SAL
1   7782   CLARK   MANAGER    7839  9-Jun-81   2450
2   7839   KING    PRESIDENT        17-Nov-81  5000
3   7934   MILLER  CLERK      7782  23-Jan-82  1300

SCD2 target, after the source rows above are loaded
SK  EMPNO  ENAME   JOB        MGR   HIREDATE   SAL
4   7782   APPLE   MANAGER    7839  9-Jun-81   999
5   7369   SMITH   CLERK      7902  17-Dec-80  800
6   7566   JONES   MANAGER    7839  2-Apr-81   2975
7   7788   SCOTT   ANALYST    7566  19-Apr-87  3000
8   7876   ADAMS   CLERK      7788  23-May-87  1100
9   7902   FORD    ANALYST    7566  3-Dec-81   3000
1   7782   CLARK   MANAGER    7839  9-Jun-81   2450
2   7839   KING    PRESIDENT        17-Nov-81  5000
3   7934   MILLER  CLERK      7782  23-Jan-82  1300

High-level steps to create an SCD2 mapping


Reverse-engineer the SCD2 table
select the OLAP type as Slowly Changing Dimension
configure the columns below
Surrogate key
Natural key
Current record indicator column
Effective from date
Effective to date

for SCD2
select "Add row on change" for all columns
if some columns need to be overwritten on change (no history), then
choose "Override on change" for them
We can choose all columns as "Add row on change":
this is complete SCD2

We can choose some columns as "Add row on change" and some columns as "Override on change":
in this case it is hybrid

Create a mapping with source and target


choose the natural key column as the key column on the target table
Add the derivations below to the following columns
Surrogate key
choose sequence
current flag column
effective from dt
sysdate
effective to dt
sysdate

choose the "execute on target" option for all columns on the target


finally select IKM Slowly Changing Dimension
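
A minimal sketch of the "Add row on change" behaviour in plain Python, not the IKM's actual code. The column names SK, CUR_FLAG, EFF_FROM and EFF_TO, the function name, and the far-future OPEN_END marker for current rows are assumptions made for illustration.

```python
from datetime import date

# Assumed marker for "still current" rows; the real value is KM-specific.
OPEN_END = date(2400, 1, 1)

def apply_scd2(target, source, today, tracked=("ENAME", "SAL")):
    """Expire the current row and add a new row when a tracked column changes."""
    next_sk = max((row["SK"] for row in target), default=0) + 1
    current = {row["EMPNO"]: row for row in target if row["CUR_FLAG"] == 1}
    for src in source:
        old = current.get(src["EMPNO"])
        if old is not None and all(old[col] == src[col] for col in tracked):
            continue  # unchanged record: nothing to do
        if old is not None:
            # close the previous version of this natural key
            old["CUR_FLAG"] = 0
            old["EFF_TO"] = today
        # new natural key or new version: add a row with a new surrogate key
        target.append({"SK": next_sk, **src, "CUR_FLAG": 1,
                       "EFF_FROM": today, "EFF_TO": OPEN_END})
        next_sk += 1
    return target

target = [{"SK": 1, "EMPNO": 7782, "ENAME": "CLARK", "SAL": 2450,
           "CUR_FLAG": 1, "EFF_FROM": date(2015, 4, 17), "EFF_TO": OPEN_END}]
source = [{"EMPNO": 7782, "ENAME": "APPLE", "SAL": 999},
          {"EMPNO": 7369, "ENAME": "SMITH", "SAL": 800}]
result = apply_scd2(target, source, date(2015, 4, 21))
```

A column configured as "Override on change" would simply be left out of the tracked list and overwritten in place on the current row; that branch is omitted here to keep the sketch short.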

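Remaining columns of the SCD2 target rows (deptno, current flag, effective from, effective to)

first load
SK  DEPTNO  CUR_FLAG  EFF_FROM   EFF_TO
1   10      1         17-Apr-15  1-Jan-00
2   10      1         17-Apr-15  1-Jan-00
3   10      1         17-Apr-15  1-Jan-00

after the changed source is loaded
SK  DEPTNO  CUR_FLAG  EFF_FROM   EFF_TO
4   10      1         21-Apr-15  1-Jan-00
5   20      1         21-Apr-15  1-Jan-00
6   20      1         21-Apr-15  1-Jan-00
7   20      1         21-Apr-15  1-Jan-00
8   20      1         21-Apr-15  1-Jan-00
9   20      1         21-Apr-15  1-Jan-00
1   10      0         17-Apr-15  21-Apr-15
2   10      1         17-Apr-15  1-Jan-00
3   10      1         17-Apr-15  1-Jan-00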

Re usable Mapping and writing analytical Functions

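Check KM (flow control, recycle)

insert into SCOTT.I$_EMP_T78
invalid records are copied from I$_EMP_T78 to the error table E$_EMP_T78 and deleted from I$_EMP_T78
check statistics are recorded in SNP_CHECK_TAB
recycle: records rejected earlier into E$ are picked up again with the source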

7839
7934
7788
7782
7782
7369
7369
7782
7902

EMPNO  ENAME   JOB       MGR   HIREDATE   SAL   COMM  DEPTNO
7499   ALLEN   SALESMAN  7698  20-Feb-81  1600  300   30
7521   WARD    SALESMAN  7698  22-Feb-81  1250  500   30
7654   MARTIN  SALESMAN  7698  28-Sep-81  1250  1400  30
7698   BLAKE   MANAGER   7839  1-May-81   2850        30
7844   TURNER  SALESMAN  7698  8-Sep-81   1500  0     30
7900   JAMES   CLERK     7698  3-Dec-81   950         30

dept_78
DEPTNO  DNAME       LOC
10      ACCOUNTING  NEW YORK
20      RESEARCH    DALLAS
30      SALES       CHICAGO
40      OPERATIONS  BOSTON
99

SNP_CHECK_TAB

statistics

4
emp_t

7566 JONES
7876 ADAMS

MANAGER
CLERK

5
6

E$

KING
MILLER
SCOTT
APPLE
APPLE
BIGATA
SMITH
CLARK
FORD

PRESIDENT
CLERK
ANALYST

17-Nov-81
7782 23-Jan-82
7566 19-Apr-87

5000
1300
3000

MANAGER
CLERK
CLERK
MANAGER
ANALYST

7839 9-Jun-81
7902 17-Dec-80
7902 17-Dec-80
7839 9-Jun-81
7566 3-Dec-81

999
800
800
2450
3000

0
0
10
78
99
20
55
40
60

10
10
20
10
10
20
20
10
99

7839
7788

2-Apr-81
###

2975
1100

30
80

20
20

CDC

Change Data Capture
Incremental extraction / Delta

cust_id  name  add  last_upd_dt
c1       xyz   hyd  22nd
c2       abc   hyd  23rd

Out of about 1m source records, only the changed records (about 1k) are passed on to ODI.

1) CDC based on a last-updated-date column using an ODI variable
2) CDC using CDC tools such as Oracle GoldenGate, Informatica PowerExchange, IBM CDC
3) ODI trigger-based CDC
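
A minimal sketch of option 1 (CDC on a last-updated-date column driven by a variable), using Python with SQLite as a stand-in. The control table CDC_CTL, the full dates and the function name are hypothetical; in ODI the "last extract date" would be an ODI variable refreshed from such a table and used in the source filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CUST (CUST_ID TEXT, NAME TEXT, LAST_UPD_DT TEXT);
CREATE TABLE CDC_CTL (LAST_EXTRACT_DT TEXT);
INSERT INTO CUST VALUES ('c1', 'xyz', '2015-04-22'), ('c2', 'abc', '2015-04-23');
INSERT INTO CDC_CTL VALUES ('2015-04-22');
""")

def extract_delta(conn):
    """Return rows changed since the last run and move the watermark forward."""
    # "refresh the variable": read the last extracted date
    last_run = conn.execute("SELECT LAST_EXTRACT_DT FROM CDC_CTL").fetchone()[0]
    # use the variable as the source filter
    rows = conn.execute(
        "SELECT CUST_ID, NAME, LAST_UPD_DT FROM CUST "
        "WHERE LAST_UPD_DT > ? ORDER BY CUST_ID",
        (last_run,),
    ).fetchall()
    if rows:
        # store the newest date seen so the next run starts after it
        conn.execute("UPDATE CDC_CTL SET LAST_EXTRACT_DT = ?",
                     (max(row[2] for row in rows),))
        conn.commit()
    return rows

first_run = extract_delta(conn)
second_run = extract_delta(conn)
```

This approach only works when the source reliably maintains the last-updated column, and it cannot see physical deletes.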

ODI Trigger based CDC

Import the table
Import the Journal KMs
Create a primary key
Go to the model and select the JKM
right-click the table and select "add to CDC"
right-click and select "start journal"
and add one subscriber named "Sub78"

create a mapping with EMP_CDC78 as source
in the logical diagram select the source table and add the condition below
JRN_SUBSCRIBER = 'Sub78'
Go to the physical diagram window
select the source table and enable "journalized data only"
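
The mechanism behind trigger-based journalizing can be sketched as follows, with Python and SQLite standing in for Oracle. This is a simplification under stated assumptions: the trigger names, the reduced J$ column list and the single hard-coded subscriber are illustrative and not what the JKM actually generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMP_CDC78 (EMPNO INTEGER PRIMARY KEY, ENAME TEXT);
-- journal table: one row per captured change
CREATE TABLE "J$EMP_CDC78" (JRN_SUBSCRIBER TEXT, JRN_FLAG TEXT, EMPNO INTEGER);

CREATE TRIGGER T_EMP_INS AFTER INSERT ON EMP_CDC78
BEGIN
    INSERT INTO "J$EMP_CDC78" VALUES ('Sub78', 'I', NEW.EMPNO);
END;
CREATE TRIGGER T_EMP_UPD AFTER UPDATE ON EMP_CDC78
BEGIN
    INSERT INTO "J$EMP_CDC78" VALUES ('Sub78', 'I', NEW.EMPNO);
END;
CREATE TRIGGER T_EMP_DEL AFTER DELETE ON EMP_CDC78
BEGIN
    INSERT INTO "J$EMP_CDC78" VALUES ('Sub78', 'D', OLD.EMPNO);
END;
""")

# Changes on the source table fire the triggers.
conn.execute("INSERT INTO EMP_CDC78 VALUES (7499, 'ALLEN')")
conn.execute("INSERT INTO EMP_CDC78 VALUES (7521, 'WARD')")
conn.execute("UPDATE EMP_CDC78 SET ENAME = 'ALLEN2' WHERE EMPNO = 7499")
conn.commit()

journal_count = conn.execute('SELECT COUNT(*) FROM "J$EMP_CDC78"').fetchone()[0]

# "Journalized data only": read just the rows whose keys are in the journal
# for the chosen subscriber.
journalized = conn.execute("""
    SELECT DISTINCT e.EMPNO, e.ENAME
      FROM EMP_CDC78 e
      JOIN "J$EMP_CDC78" j ON j.EMPNO = e.EMPNO
     WHERE j.JRN_SUBSCRIBER = 'Sub78'
     ORDER BY e.EMPNO
""").fetchall()
```

The primary key matters because the journal stores only the key of each changed row; the current column values are fetched from the source table at read time.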

j$
JRN_SUBSCRIBER  JRN_CONSUMED  JRN_FLAG  JRN_DATE   EMPNO
Sub78           1             I         24-Apr-15  7499
Sub78           1             I         24-Apr-15  7521
Sub78           1             I         24-Apr-15  7654
Sub78           1             I         24-Apr-15  7698
Sub78           1             I         24-Apr-15  7844
Sub78           1             I         24-Apr-15  7900
SUNOPSIS        0             I         24-Apr-15  7499
SUNOPSIS        0             I         24-Apr-15  7521
SUNOPSIS        0             I         24-Apr-15  7654
SUNOPSIS        0             I         24-Apr-15  7698
SUNOPSIS        0             I         24-Apr-15  7844
SUNOPSIS        0             I         24-Apr-15  7900

CDC / journal types
simple: CDC on a single table
consistent set: captures changed records from a combination of master and detail tables

example (master/detail rows joined and loaded to an scd-1/2 target)
emp:     empid=e1, ename=abc, deptno=10
dept:    deptno=10, loc=hyd
target:  empid=e1, ename=xyz, deptno=10, loc=hyd

Procedure

src   tgt   Agent

Proc
task1  insert into emp   5   tran0  nocommit
task2  insert into emp   10  tran0  nocommit
task3  insert into emp   3   tran0  commit
task4  insert into dept  2   tran1  nocommit
task5  insert into dept  4   tran1  commit

Isolation level: defines data consistency
uncommitted   reads uncommitted data
committed     reads only committed data
repeatable    row-level lock
serializable  table-level lock

empno  deptno  sal
e1     10      1000
e2     10      3000
e3     10      50000

select * from emp where sal>=1000

Scenario

A scenario is the executable code of ODI-developed objects such as mappings, procedures, packages, etc.
Analogy: the mapping is like a.java (source) and the scenario is like a.class (executable, run with "java a").

Scenario re-create: creates the scenario again with a new version (001, 002, ...)
Scenario re-generate: keeps a single scenario that always holds the latest code

Scenarios cannot be modified.

In production systems we can create a work repository of type execution
and place only scenarios in it.

In real projects we always generate scenarios and execute that code, so that any accidental changes
to the mapping do not impact the tested/certified code.

If the work repository type is development, then we can create source code (mappings, procedures, packages)
and also keep scenarios.
Versioning
Versioning creates copies of the source in the development repository.
We can compare the code of one version with another
and also restore an older version of the code.

Packages
We can add source code components like
mappings, procedures, variables, sequences
We can add any scenario created for mappings, procedures, variables, sequences, packages
We cannot add a package to another package, but we can create a scenario for the package and then add that package scenario to another package
When we add scenarios, by default they are configured for synchronous mode,
which means they all run serially;
by changing them to asynchronous, we can run them in parallel.

Every package must have a starting step (indicated with a green arrow)
In addition to placing ODI-developed components in a package, we can add ODI built-in tools to perform various operations

handling variables in packages

ODI Tools
OdiOutFile
writes content to a file
echo "empno,ename,loc" > test.txt

OdiFileCopy
copies files within one server
also copies subdirectories

cp a.txt b.txt

OdiFileDelete
deletes files from a directory
rm *.txt
can also delete files that were modified between time x and time y
(the Unix equivalent requires the find command)
OdiFileMove
moves a file from one directory to another
if the file is moved within the same directory, it works like a rename
mv a.txt b.txt
mv /a/a.txt /b/a.txt

OdiMkDir
creates a directory
mkdir test

OdiFileAppend
cat file1.txt file2.txt file3.txt > finalfile.txt
cat f*.txt > finalfile.txt

OdiSqlUnload

File transfer (ODI Agent)
scp: remote unix <--> local unix
ftp: windows <--> unix
ftp: windows <--> windows
ftp: unix <--> windows
ftp commands: get, put, mget, mput

ODI

OdiDeleteScen
deletes a scenario
OdiExportAllScen
exports one or all scenarios
we can use this when we need to take a regular backup of all scenarios
or export all scenarios to migrate code to another repository
OdiExportEnvironmentInformation
exports the ODI environment information to a CSV file

OdiExportLog

Agents
Load plans

Agents
local (no agent)
Agent creation
RCU must be configured
cd E:\ODI12c\Oracle\Middleware\Oracle_Home\oracle_common\common\bin

start Agent
start Node Manager
cd E:\ODI12c\Oracle\Middleware\Oracle_Home\user_projects\domains\odi12_domain\bin
startnodemanager
start WebLogic
cd E:\ODI12c\Oracle\Middleware\Oracle_Home\user_projects\domains\odi12_domain\bin
startWebLogic
verify the link below

http://localhost:7001/console/
create an Agent in Topology

start the agent from the command prompt


agent -NAME=OracleDIAgent1 -PORT=20910

Components: ODI Studio (local agent), RCU repository, Agent1, Agent2, WebLogic (node manager, cluster1, cluster2), Enterprise Manager
WebLogic console: localhost:7001/console
Enterprise Manager: localhost:7001/em

Go to the Enterprise Manager link and start the ODI server


Now go to http://localhost:15101/odiconsole
this link opens the ODI operational console

Load plan
debug
solutions
smart imp/exp
context

deployment grp/physical design


In every mapping, a single logical design can have multiple physical designs.

In each physical design we can configure different knowledge modules and different configurations.
example
for the initial load, select Oracle insert with truncate
for the daily load, select incremental update without truncate

context
A context lets one single logical schema point to multiple physical schemas.
example
One logical schema can point to three physical schemas through different contexts:
1 for the development database
using the dev context
1 for the testing database
using the test context
1 for the production database
using the prod context

While executing the mapping we can choose the appropriate context to run on a specific environment.
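
The context lookup described above can be pictured as a simple table keyed by logical schema and context. This is only a sketch in Python; the logical schema name LS_SALES and the physical schema names are hypothetical.

```python
# Hypothetical topology: one logical schema, three contexts.
PHYSICAL_SCHEMAS = {
    ("LS_SALES", "dev"): "devdb.SCOTT",
    ("LS_SALES", "test"): "testdb.SCOTT",
    ("LS_SALES", "prod"): "proddb.SCOTT",
}

def resolve(logical_schema, context):
    """Return the physical schema that the logical schema maps to in a context."""
    try:
        return PHYSICAL_SCHEMAS[(logical_schema, context)]
    except KeyError:
        raise LookupError(
            f"no physical schema for {logical_schema!r} in context {context!r}"
        ) from None

def qualify(table, logical_schema, context):
    """Build the fully qualified table name used at execution time."""
    return f"{resolve(logical_schema, context)}.{table}"

dev_table = qualify("EMP", "LS_SALES", "dev")
prod_table = qualify("EMP", "LS_SALES", "prod")
```

The mapping itself only ever refers to the logical schema, so the same code runs unchanged against development, test and production.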
mapping --> model --> logical schema --> Dev (context)  --> physical schema (dev)
                                     --> test (context) --> physical schema (test)
                                     --> prod (context) --> physical schema (prod)

Debug
Debug helps troubleshoot a mapping during execution.

When we run the mapping in a debug session, it first shows the blueprint of the complete mapping execution steps.
Here we can add a breakpoint at any step to inspect, check and troubleshoot the code.

import /export
We can perform import and export in two ways

regular import/export: only the selected objects/components are exported and imported.

smart export/import: exports all the selected components and also any dependent objects required to run the mapping/packages.

When we export a mapping for the first time, we can choose the smart option so that all the dependent objects are exported.
If only certain selected components are to be exported, we choose normal export.

We need import/export in the following scenario:

moving the code from the development environment to test and from test to production

in this case we may have to export all the code from the development repository to the testing repository and from the test repository to the production repository

When we export the code, all the components are saved as XML files.

When importing with normal import we can choose the options below:

duplicate
synonym mode insert
synonym mode update
synonym mode insert/update

Solutions

A solution is a group of scenarios that needs to be migrated from one repository to another repository for code deployment purposes.

Loadplans
A load plan is a collection of scenarios with a planned execution path and dependencies.
Load plans help execute scenarios in serial and in parallel.
We can schedule load plans to run automatically at a given time and interval.
Additionally, we can set exceptions and advanced dependency calculation.
Load plans have the following steps to execute scenarios:
root step: starting step of the load plan
run scenario step: executes a scenario
serial: executes scenarios in serial
parallel: executes scenarios in parallel
case: defines a condition on a variable
when: runs a scenario when the condition is met
else: runs a scenario when the condition is not met

Exception steps

run a scenario when there is an exception
example: send an email when there is a failure
exceptions can be handled with raise or ignore
raise executes the exception step and fails the step
ignore executes the exception step and continues further
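
The serial, parallel and exception behaviours above can be sketched in plain Python. This only illustrates the control flow, not how the ODI agent executes load plans; all function names are hypothetical and each "scenario" is just a callable.

```python
from concurrent.futures import ThreadPoolExecutor

def run_serial(steps):
    """Run the steps one after another, in order."""
    return [step() for step in steps]

def run_parallel(steps):
    """Start all steps together and wait for every one to finish."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(step) for step in steps]
        return [future.result() for future in futures]

def run_with_exception(step, exception_step, behavior="raise"):
    """Run a step; on failure run the exception step, then raise or ignore."""
    try:
        return step()
    except Exception as err:
        exception_step(err)      # e.g. send an email about the failure
        if behavior == "raise":
            raise                # the step is marked as failed
        return None              # "ignore": continue with the next step

def failing_scenario():
    raise RuntimeError("scenario failed")

notifications = []
serial_result = run_serial([lambda: "load_dim", lambda: "load_fact"])
parallel_result = run_parallel([lambda: "load_emp", lambda: "load_dept"])
ignored = run_with_exception(failing_scenario, notifications.append, behavior="ignore")
```

With behavior="raise" the same call would run the exception step and then propagate the error, which corresponds to failing the load plan step.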

Restart Type
Sometimes a restart needs to execute all the scenarios again,
including the scenarios that were already successful;
in this case choose the restart type that executes all children.
Sometimes we need to start only from the last failure point;
in this case choose to restart from the failed tasks.
Schedule
Load plans can be scheduled to run on their own at the set timing.

11g
filter
drag a column to the empty space outside the table
join
link two tables and select the join option
lookup
link two tables and select the lookup option
lookup supports only two options when there are multiple matches
a) error on multiple match
b) join with all records
derivations
all derivations must be written directly on the target
all aggregate calculations must also be kept on the target
complex logic
if any logic is required after the target,
we need to place a temp/dummy target in the first mapping (interface)
and call it in another interface (mapping)
An interface that contains a temp target is called a yellow interface
