Data Merging:
Data merging is the process of combining data from multiple source systems.
Data merging is of two types:
1.Join
2.Union
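The two merging styles behave differently: a join widens rows by matching a key, a union stacks rows of the same structure. A minimal sketch in Python (the table and column names here are invented for illustration):

```python
# Two small "source systems"; names are hypothetical.
employees = [
    {"empno": 1, "ename": "SMITH", "deptno": 10},
    {"empno": 2, "ename": "ALLEN", "deptno": 20},
]
departments = [
    {"deptno": 10, "dname": "ACCOUNTING"},
    {"deptno": 20, "dname": "SALES"},
]

# Join: combines columns from two sources on a common key (widens each row).
joined = [
    {**e, "dname": d["dname"]}
    for e in employees
    for d in departments
    if e["deptno"] == d["deptno"]
]

# Union: stacks rows from two sources with the same structure (adds rows).
emp_usa = [{"empno": 3, "ename": "KING"}]
emp_india = [{"empno": 4, "ename": "RAVI"}]
unioned = emp_usa + emp_india
```
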
Session:
A session is a task that executes a mapping.
A session is a set of instructions that tells the ETL server how to move data from source to target.
Work flow:
A workflow is a set of instructions that tells the server how to execute the session tasks.
A workflow is designed with two types of batch processes:
1.Sequential Batch Process
2.Concurrent or Parallel Batch Process
1.Sequential Batch Process:
A sequential batch process is recommended when there is a dependency between
data loads.
Through Put in Power Center:
Throughput is measured from the number of records loaded: it defines the efficiency, or
the rate at which records are extracted per second and records are loaded per second.
Throughput can also be expressed in bytes/sec.
It can be used to evaluate the ETL server's efficiency.
Users can access it from the session log (execution log).
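The throughput arithmetic above is straightforward; a minimal sketch with invented numbers:

```python
# Hypothetical session statistics, invented for illustration.
rows_loaded = 100_000     # records written to the target
elapsed_seconds = 50      # session run time

# Throughput as records/sec.
throughput_rows_per_sec = rows_loaded / elapsed_seconds

# Throughput can also be expressed in bytes/sec, given an average row size.
avg_row_bytes = 120       # assumed average row width
throughput_bytes_per_sec = throughput_rows_per_sec * avg_row_bytes
```
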
Development of ETL Objects:
Step1: Create Source Definition
Step2: Create Target Definition
Step3: Design Mapping (ETL application with or without Business rules)
Step4: Create Session for each Mapping
Step5: Design Work flow
Step6: Run Work flow
Step7: Monitor Workflow
Power Center Repository Manager:
It is a GUI-based administrative client which is used to perform the following tasks:
a) Create, edit, delete folders
b) Back up and restore objects
c) Assign users to access the folders with read, write, execute permissions
Power Center Repository:
A repository is the brain of the ETL system; it stores ETL objects, or metadata.
A relational DB is required to create a repository.
The repository DB consists of system tables that store the ETL objects.
Power Center Repository Service[PCRS]:
A Power Center client component connects to the repository DB using the repository
service.
A repository service is a set of processes that insert, update, delete, and retrieve
metadata from the repository.
Reader: It connects to the source and extracts the data from tables, files, etc.
Data Transformation Manager (DTM): It processes the data according to the
business rules that you configured in the mapping.
Writer: It connects to the target system and loads the data into the tables (or) files.
Note: The log created by the Integration Service is saved in the repository; that log
can be accessed from the Workflow Monitor.
Power Center Domain:
1.Informatica Power Center has the ability to scale services and shared
resources across multiple machines.
2.The Power Center domain is the primary unit for managing and administering
application services (PCRS, PCIS).
3.A Power Center domain is a collection of one or more nodes.
4.The node which hosts the domain is known as the primary node, or master gateway node.
5.If the master gateway node fails, user requests cannot be processed.
6.Hence it is recommended to configure more than one node as a master gateway
node.
7.If a worker node fails, its requests can be distributed to other nodes [High
Availability].
8.Each node is created or configured with application services.
Create User:
SQL>SHOW USER
SQL>Create user BATCH7AM identified by TARGET;
Assign permission to User:
SQL>Grant DBA to BATCH7AM;
ETL Development process:
1.Creation of Source & Target Definitions
1.Creation of Source Definition:
A source definition is created using the Source Analyzer tool.
The Source Analyzer connects to the source DB using an ODBC connection.
4.Creation of Session:
1.A session is a task that runs the mapping.
2.It is created using the Task Developer tool in the Workflow Manager client component.
3.Every session is configured with the following details:
a) Source connection
b) Target connection
c) Load type
Creation of Reader Connection (Oracle):
From the Power Center Workflow Manager client, select the Connections menu, click
Relational, select the type Oracle, click New, and enter the following details.
Select Create Table, click Generate & Execute, and click OK; the SQL is then
stored in a file named MKTABLES.SQL.
Transformations & Types of Transformations:
A transformation is a Power Center object which allows you to develop the business
rules to process the data into the desired business formats.
Transformations are categorized into two types:
1.Active Transformation
2.Passive Transformation
1.Active Transformation:
A transformation that can affect the number of rows (or change the number of rows)
is known as an active transformation.
The following is the list of active transformations used to process the data:
1.Source Qualifier Transformation
2.Filter Transformation
3.Rank Transformation
4.Sorter Transformation
5.TransactionControl Transformation
6.Update Strategy Transformation
7.Normalizer Transformation
8.Aggregator Transformation
9.Joiner Transformation
10.Union Transformation
11.Router Transformation
12.SQL Transformation
13.JAVA Transformation
14.Look Up Transformation (from version 9.0 onwards it acts as an active transformation)
2.Passive Transformation:
A transformation that does not affect the number of rows (or does not change the
number of rows) is known as a passive transformation.
The following is the list of passive transformations used to process the data:
1.Look Up Transformation (up to Informatica 8.6 it acts as a passive transformation)
2.Expression Transformation
3.SQL Transformation (it acts as a dual transformation)
4.Stored Procedure Transformation
5.Sequence Generator Transformation
6.XML Source Qualifier Transformation
Expression Transformation:
1.It is a passive transformation which allows you to calculate an expression for each
row.
2.It performs row-by-row processing.
3.Expressions are developed using functions & arithmetic operations.
4.An expression transformation is created with 3 types of ports:
Input, Output, Variable.
5.Expressions are developed either in output (O) or variable (V) ports.
6.Variable ports are recommended to simplify complex expressions and
reuse expressions.
Scenario1:
Calculate the tax for each employee who belongs to the sales department. If Sal is
greater than 5000, then calculate the tax as Sal*0.17; else calculate the tax as
Sal*0.13.
The sales department is identified by department number 30.
Logic:
Expression transformation
SAL[I]
TAX[O] = IIF(SAL>5000, SAL*0.17, SAL*0.13)
LOAD_DATE[O] = SYSDATE
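Informatica's IIF(condition, a, b) is a simple conditional; the tax expression above can be sketched in Python like this (a sketch of the row-by-row logic, not the Power Center implementation):

```python
import datetime

def tax(sal):
    # Equivalent of TAX[O] = IIF(SAL > 5000, SAL*0.17, SAL*0.13)
    return sal * 0.17 if sal > 5000 else sal * 0.13

def load_date():
    # Equivalent of LOAD_DATE[O] = SYSDATE
    return datetime.date.today()
```
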
Scenario2:
Calculate the total salary for each employee based on Sal and Commission.
Total Sal = Sal + Comm
Comm may have NULLs.
Logic:
Expression transformation
TOTSAL = IIF(ISNULL(COMM), SAL, SAL+COMM)
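The ISNULL guard matters because NULL + anything is NULL in SQL semantics. A minimal Python sketch of the same expression, with None standing in for NULL:

```python
def total_sal(sal, comm):
    # Equivalent of TOTSAL = IIF(ISNULL(COMM), SAL, SAL + COMM)
    return sal if comm is None else sal + comm
```
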
Scenario3:
Implement the LIKE operator using a filter transformation on the JOB column of the
EMP table; SALESMAN is represented in 3 different formats:
SALESMAN
SALES-MAN
PRE-SALES
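The notes do not show the filter condition itself; a common way to express JOB LIKE '%SALES%' in a Filter transformation is INSTR(JOB, 'SALES') > 0. The equivalent substring test sketched in Python:

```python
# Hypothetical JOB values from the EMP table.
jobs = ["SALESMAN", "SALES-MAN", "PRE-SALES", "CLERK", "MANAGER"]

# Equivalent of the filter condition INSTR(JOB, 'SALES') > 0,
# i.e. JOB LIKE '%SALES%': keep rows whose JOB contains 'SALES'.
sales_jobs = [j for j in jobs if "SALES" in j]
```
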
Variable Port:
1.A port which can store the data temporarily is known as a variable port (V).
2.Variable ports are created to simplify complex expressions and reuse expressions
in several output ports.
3.Variable ports are local to the transformation.
4.They increase the efficiency of calculations.
5.The default value for a numerical variable port is 0.
6.The default value for a variable port with data type string is a space.
7.Variable ports are not visible in the normal view of the transformation, only in the edit view.
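The reuse point in item 2 can be sketched as follows: a variable port holds an intermediate value so several output ports can use it without repeating the expression (port names and the 0.15 rate are invented for illustration):

```python
def outputs(sal, comm):
    # V_TOTAL: variable port, local to the transformation, computed once.
    v_total = sal + (comm or 0)
    # Two output ports both reuse V_TOTAL instead of repeating SAL + COMM.
    o_annual = v_total * 12      # O_ANNUAL[O]
    o_tax = v_total * 0.15       # O_TAX[O], hypothetical rate
    return o_annual, o_tax
```
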
Router Transformation:
1.A Router transformation is an active transformation which allows you to
create multiple conditions and pass the data to multiple targets.
2.A router transformation is created with two types of groups:
1.Input Group
2.Output Group
Input Group:
Only the input group can receive the data from the source pipeline.
Output Group:
Multiple output groups, categorized into two types:
1.User-defined Output Group
2.Default Group
1.User-defined Output Group:
1.Each user-defined output group has one condition.
2.All group conditions are evaluated for each row.
3.One row can pass multiple conditions.
2.Default Group:
1.There is always one default group.
2.It captures the rows that fail all group conditions (rejected records).
Performance Considerations:
The router transformation has a performance advantage over multiple filter
transformations because a row is read once into the input group but evaluated
multiple times based on the number of groups, whereas using multiple filter
transformations requires the same data to be duplicated for each filter transformation.
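The routing behavior described above can be sketched as follows: each row is read once, every group condition is evaluated against it, a row may land in several groups, and rows that fail every condition fall into the default group (the group names and conditions are invented for illustration):

```python
# Hypothetical source rows.
rows = [{"deptno": 10, "sal": 800},
        {"deptno": 20, "sal": 1600},
        {"deptno": 40, "sal": 1200}]

# User-defined output groups, each with one condition.
groups = {
    "DEPT10": lambda r: r["deptno"] == 10,
    "DEPT20": lambda r: r["deptno"] == 20,
    "HIGH_SAL": lambda r: r["sal"] > 1500,  # one row may satisfy many groups
}

routed = {name: [] for name in groups}
routed["DEFAULT"] = []

for row in rows:                        # each row is read once...
    matched = False
    for name, cond in groups.items():   # ...but evaluated for every group
        if cond(row):
            routed[name].append(row)
            matched = True
    if not matched:                     # fails all conditions: default group
        routed["DEFAULT"].append(row)
```

Note that the row with deptno 20 and sal 1600 reaches both DEPT20 and HIGH_SAL, while the deptno-40 row fails every condition and is captured by the default group.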