1. While importing a relational source definition from a database, what metadata does the source import include?
Source name, database location, column names, datatypes, and key constraints.
2. How many ways can you update a relational source definition, and what are they?
Two ways:
1. Edit the definition
2. Reimport the definition
3. Where should you place a flat file to import its definition into the Designer?
Place it in a local folder.
4. To provide support for mainframe source data, which files are used as source definitions?
COBOL files.
5. Which transformation do you need when using COBOL sources as source definitions?
The Normalizer transformation, which is used to normalize the data, since COBOL sources often contain denormalized data.
6. How can you create or import a flat file definition into the Warehouse Designer?
You cannot create or import a flat file definition into the Warehouse Designer directly. Instead, you must analyze the file in the Source Analyzer, then drag it into the Warehouse Designer. When you drag the flat file source definition into the Warehouse Designer workspace, the Warehouse Designer creates a relational target definition, not a file definition. If you want to load to a file, configure the session to write to a flat file. When the Informatica Server runs the session, it creates and loads the flat file.
Classification : Confidential
7. What is a mapplet?
A mapplet is a set of transformations that is built in the Mapplet Designer and can be used in multiple mappings.
8. What is a transformation?
It is a repository object that generates, modifies, or passes data.
9. What are the Designer tools for creating transformations?
1. Mapping Designer
2. Transformation Developer
3. Mapplet Designer
10. What are active and passive transformations?
An active transformation can change the number of rows that pass through it. A passive transformation does not change the number of rows that pass through it.
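The distinction is easy to see in a toy sketch (plain Python used purely as an illustration; the function names are invented, not Informatica APIs): a Filter-like transformation can change the row count, while an Expression-like one maps each input row to exactly one output row.

```python
def active_filter(rows, predicate):
    # Active: the output row count can differ from the input row count,
    # the way a Filter transformation drops non-matching rows.
    return [row for row in rows if predicate(row)]

def passive_expression(rows, compute):
    # Passive: exactly one output row per input row, the way an
    # Expression transformation derives values without dropping rows.
    return [compute(row) for row in rows]

rows = [1, 2, 3, 4]
filtered = active_filter(rows, lambda r: r > 2)        # fewer rows out
computed = passive_expression(rows, lambda r: r * 10)  # same row count
```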
11. What are connected and unconnected transformations?
An unconnected transformation is not connected to other transformations in the mapping. A connected transformation is connected to other transformations in the mapping.
12. How many ways can you create ports?
1. Drag a port from another transformation.
2. Click the Add button on the Ports tab.
13. What are reusable transformations?
Reusable transformations can be used in multiple mappings. You can create a reusable transformation and add an instance of it to a mapping. If you later change the definition of the transformation, all instances of it inherit the changes. Since an instance of a reusable transformation is a pointer to that transformation, you can change the transformation in the Transformation Developer and its instances automatically reflect those changes. This feature can save you a great deal of work.
14. What are the methods for creating reusable transformations?
Two methods:
1. Design it in the Transformation Developer.
2. Promote a standard transformation from the Mapping Designer.
The Aggregator stores data in the aggregate cache until it completes the aggregate calculations. When you run a session that uses an Aggregator transformation, the Informatica Server creates index and data caches in memory to process the transformation. If the Informatica Server requires more space, it stores overflow values in cache files.
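As a rough illustration of what the aggregate cache holds (a plain-Python sketch of the idea, not the actual cache file format), the server keeps one running aggregate per group key until all rows have been read:

```python
def aggregate(rows, key, initial, step):
    # The dict plays the role of the in-memory index + data cache:
    # it pairs each group key with its running aggregate value.
    cache = {}
    for row in rows:
        k = row[key]
        cache[k] = step(cache.get(k, initial), row)
    return cache  # emitted only once all input rows are consumed

sales = [{"region": "east", "amt": 10},
         {"region": "west", "amt": 5},
         {"region": "east", "amt": 7}]
totals = aggregate(sales, "region", 0, lambda acc, r: acc + r["amt"])
```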
21. What are the differences between the Joiner transformation and the Source Qualifier transformation?
You can join heterogeneous data sources with a Joiner transformation, which you cannot do with a Source Qualifier transformation. You need matching keys to join two relational sources in a Source Qualifier transformation, whereas you do not need matching keys to join two sources in a Joiner. The two relational sources must come from the same data source in a Source Qualifier; in a Joiner transformation you can also join relational sources that come from different data sources.
22. In which conditions can you not use a Joiner transformation (limitations of the Joiner transformation)?
1. Both pipelines begin with the same original data source.
2. Both input pipelines originate from the same Source Qualifier transformation.
3. Both input pipelines originate from the same Normalizer transformation.
4. Both input pipelines originate from the same Joiner transformation.
5. Either input pipeline contains an Update Strategy transformation.
6. Either input pipeline contains a connected or unconnected Sequence Generator transformation.
23. What are the settings that you use to configure the Joiner transformation?
1. Master and detail source
2. Type of join
3. Condition of the join
24. What are the join types in a Joiner transformation?
Normal (Default)
Master outer
Detail outer
Full outer
25. What are the Joiner caches?
When a Joiner transformation runs in a session, the Informatica Server reads all the records from the master source and builds index and data caches based on the master rows. After building the caches, the Joiner transformation reads records from the detail source and performs the joins.
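This master/detail cache behaviour resembles a classic hash join, sketched here in plain Python (an analogy only, not Informatica internals): all master rows are cached and indexed on the join key first, then detail rows are streamed through and probed against the cache.

```python
def joiner(master, detail, key):
    # Build phase: cache every master row, indexed on the join column
    # (the role of the Joiner index + data caches).
    cache = {}
    for m in master:
        cache.setdefault(m[key], []).append(m)
    # Probe phase: stream detail rows and join against the cache
    # (a "normal" join: unmatched rows are dropped).
    joined = []
    for d in detail:
        for m in cache.get(d[key], []):
            joined.append({**m, **d})
    return joined

customers = [{"cust_id": 1, "name": "Acme"}]                # master (smaller)
orders = [{"cust_id": 1, "amt": 50}, {"cust_id": 2, "amt": 20}]  # detail
result = joiner(customers, orders, "cust_id")
```

Caching the smaller source as the master is also why Informatica recommends designating the source with fewer rows as the master.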
26. What is the Lookup transformation?
Use a Lookup transformation in your mapping to look up data in a relational table, view, or synonym. The Informatica Server queries the lookup table based on the lookup ports in the transformation. It compares the Lookup transformation port values to the lookup table column values based on the lookup condition.
27. Why use the Lookup transformation?
To perform the following tasks:
Get a related value: For example, if your source table includes an employee ID, but you want to include the employee name in your target table to make summary data easier to read.
Perform a calculation: Many normalized tables include values used in a calculation, such as gross sales per invoice or sales tax, but not the calculated value (such as net sales).
Update slowly changing dimension tables: You can use a Lookup transformation to determine whether records already exist in the target.
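The "get a related value" case can be sketched in plain Python (an illustration of the idea, not the Informatica API): index the lookup table on the condition column, then enrich each source row from the index.

```python
def lookup(rows, lookup_table, key, out_col, default=None):
    # Index the lookup table on the lookup condition column
    # (the role of the lookup cache).
    index = {t[key]: t[out_col] for t in lookup_table}
    # For each source row, fetch the related value; unmatched
    # lookups return the default (Informatica would return NULL).
    return [{**r, out_col: index.get(r[key], default)} for r in rows]

employees = [{"emp_id": 7}, {"emp_id": 9}]
names = [{"emp_id": 7, "name": "Lee"}]
result = lookup(employees, names, "emp_id", "name", default="UNKNOWN")
```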
28. What are the types of lookups?
Connected and unconnected.
29. What are the differences between connected and unconnected lookups?
Connected Lookup
Unconnected Lookup
Dynamic cache
33. Which transformation should you use to normalize COBOL and relational sources?
The Normalizer transformation. When you drag a COBOL source into the Mapping Designer workspace, a Normalizer transformation automatically appears, creating input and output ports for every column in the source.
34. How does the Informatica Server sort string values in a Rank transformation?
When the Informatica Server runs in ASCII data movement mode, it sorts session data using a binary sort order. If you configure the session to use a binary sort order, the Informatica Server calculates the binary value of each string and returns the specified number of rows with the highest binary values for the string.
group. For example, if you create a Rank transformation that ranks the top 5 salespersons for each quarter, the rank index numbers the salespeople from 1 to 5.
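Binary sort order simply compares the raw byte values of strings. A plain-Python sketch of a top-N rank on a string port (illustrative only; Python's `bytes` comparison is byte-by-byte, just like a binary sort order):

```python
def rank_top_n(rows, key, n):
    # Sort by the raw encoded bytes of the string port, highest
    # binary value first, then keep the top n and assign rank_index.
    ranked = sorted(rows, key=lambda r: r[key].encode("utf-8"), reverse=True)
    return [dict(r, rank_index=i + 1) for i, r in enumerate(ranked[:n])]

reps = [{"name": "bob"}, {"name": "ann"}, {"name": "zoe"}]
top2 = rank_top_n(reps, "name", 2)
```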
37. What is the Router transformation?
A Router transformation is similar to a Filter transformation because both transformations allow you to use a condition to test data. However, a Filter transformation tests data for one condition and drops the rows of data that do not meet the condition. A Router transformation tests data for one or more conditions and gives you the option to route rows of data that do not meet any of the conditions to a default output group.
If you need to test the same input data based on multiple conditions, use a Router transformation in a mapping instead of creating multiple Filter transformations to perform the same task.
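The routing semantics can be sketched in plain Python (an illustration, not the Informatica engine): each row is tested against every user-defined group condition, and a row matching no condition goes to the default group instead of being dropped.

```python
def router(rows, groups):
    # One output list per user-defined group, plus a default group
    # that catches rows matching no condition.
    out = {name: [] for name in groups}
    out["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in groups.items():
            if condition(row):         # a row can satisfy several groups
                out[name].append(row)
                matched = True
        if not matched:
            out["DEFAULT"].append(row)  # routed, not dropped
    return out

routed = router(
    [{"amt": 5}, {"amt": -3}, {"amt": 0}],
    {"POSITIVE": lambda r: r["amt"] > 0,
     "NEGATIVE": lambda r: r["amt"] < 0},
)
```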
38. What are the types of groups in a Router transformation?
1. Input group
2. Output group
The Designer copies property information from the input ports of the input group to create a set of output ports for each output group.
There are two types of output groups:
1. User-defined groups
2. Default group
You cannot modify or delete the default group.
39. Why do we use the Stored Procedure transformation?
For populating and maintaining databases.
40. What are the types of data that pass between the Informatica Server and a stored procedure?
Three types of data:
1. Input/output parameters
2. Return values
3. Status codes
41. What is the status code?
The status code provides error handling for the Informatica Server during the session. The stored procedure issues a status code that notifies whether or not the stored procedure completed successfully. This value cannot be seen by the user; it is only used by the Informatica Server to determine whether to continue running the session or to stop.
42. What is the Source Qualifier transformation?
When you add a relational or flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier transformation represents the records that the Informatica Server reads when it runs a session.
43. What are the tasks that the Source Qualifier performs?
1. Join data originating from the same source database.
2. Filter records when the Informatica Server reads source data.
3. Specify an outer join rather than the default inner join.
4. Specify sorted records.
5. Select only distinct values from the source.
6. Create a custom query to issue a special SELECT statement for the Informatica Server to read source data.
44. What is the target load order?
You specify the target load order based on the Source Qualifiers in a mapping. If you have multiple Source Qualifiers connected to multiple targets, you can designate the order in which the Informatica Server loads data into the targets.
45. What is the default join that the Source Qualifier provides?
An inner equijoin.
46. What are the basic requirements to join two sources in a Source Qualifier?
The two sources should have a primary key / foreign key relationship.
The two sources should have matching datatypes.
47. What is the Update Strategy transformation?
This transformation is used to maintain history data, or just the most recent changes, in the target table.
51. What are the options in the target session for the Update Strategy transformation?
Insert
Delete
Update
Update as update
Update as insert
Update else insert
Truncate table
52. What are the types of mapping wizards provided in Informatica?
The Designer provides two mapping wizards to help create mappings quickly and easily. Both
wizards are designed to create mappings for loading and maintaining star schemas, a series of
dimensions related to a central fact table.
Getting Started Wizard: Creates mappings to load static fact and dimension tables, as well as
slowly growing dimension tables.
Slowly Changing Dimensions Wizard: Creates mappings to load slowly changing dimension
tables based on the amount of historical dimension data you want to keep and the method you
choose to handle historical dimension data.
53. What are the types of mappings in the Getting Started Wizard?
Simple Pass-Through mapping:
Loads a static fact or dimension table by inserting all rows. Use this mapping when you want to drop all existing data from the table before loading new data.
Slowly Growing Target:
Loads a slowly growing fact or dimension table by inserting new rows. Use this mapping to load new data when the existing data does not require updates.
54. What are the mappings that we use for slowly changing dimension tables?
Type 1: Rows containing changes to existing dimensions are updated in the target by
overwriting the existing dimension. In the Type 1 Dimension mapping, all rows contain current
dimension data.
Use the Type 1 Dimension mapping to update a slowly changing dimension table when you do
not need to keep any previous versions of dimensions in the table.
Type 2: The Type 2 Dimension Data mapping inserts both new and changed dimensions into
the target. Changes are tracked in the target table by versioning the primary key and creating a
version number for each dimension in the table.
Use the Type2 Dimension/Version Data mapping to update a slowly changing dimension table
when you want to keep a full history of dimension data in the table. Version numbers and
versioned primary keys track the order of changes to each dimension.
Type 3: The Type 3 Dimension mapping filters source rows based on user-defined comparisons
and inserts only those found to be new dimensions to the target. Rows containing changes to
existing dimensions are updated in the target. When updating an existing dimension, the
Informatica Server saves existing data in different columns of the same row and replaces the
existing data with the updates.
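The Type 2 version-number pattern can be sketched in plain Python (a toy illustration of the logic the wizard-generated mapping implements, not Informatica code): new dimensions are inserted with version 0, and a changed dimension is inserted as a new row with the version incremented, so full history is kept.

```python
def scd_type2(target, incoming, nat_key, attrs):
    # Type 2 / version data: never overwrite, always insert.
    for row in incoming:
        versions = [t for t in target if t[nat_key] == row[nat_key]]
        if not versions:
            # Brand-new dimension: insert with version 0.
            target.append({**row, "version": 0})
        else:
            latest = max(versions, key=lambda t: t["version"])
            if any(latest[a] != row[a] for a in attrs):
                # Changed dimension: insert a new versioned row.
                target.append({**row, "version": latest["version"] + 1})
    return target

dim = []
scd_type2(dim, [{"cust_id": 1, "city": "NY"}], "cust_id", ["city"])
scd_type2(dim, [{"cust_id": 1, "city": "LA"}], "cust_id", ["city"])
```

A Type 1 variant would overwrite the existing row instead of appending, and a Type 3 variant would move the old value into a "previous" column of the same row.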
56. How can you recognize whether or not the newly added rows in the source get inserted into the target?
In the Type 2 mapping we have three options to recognize newly added rows:
1. Version number
2. Flag value
3. Effective date range
57. What are the two types of processes with which Informatica runs a session?
Load manager Process: Starts the session, creates the DTM process, and sends post-session
email when the session completes.
The DTM process: Creates threads to initialize the session, read, write, and transform data, and
handle pre- and post-session operations.
58. Can you generate reports in Informatica?
Yes. By using the Metadata Reporter we can generate reports in Informatica.
59. What is the Metadata Reporter?
It is a web-based application that enables you to run reports against repository metadata.
With the Metadata Reporter, you can access information about your repository without knowledge of SQL, the transformation language, or the underlying tables in the repository.
60. Define mapping and session.
Mapping: A set of source and target definitions linked by transformation objects that define the rules for data transformation.
Session: A set of instructions that describe how and when to move data from sources to targets.
61. Which tool do you use to create and manage sessions and batches, and to monitor and stop the Informatica Server?
The Informatica Workflow Manager.
62. Why do we use session partitioning in Informatica?
Partitioning improves session performance by reducing the time needed to read the source and load the data into the target.
63. To achieve session partitioning, what tasks must you perform?
Configure the session to partition source data.
Install the Informatica Server on a machine with multiple CPUs.
64. How does the Informatica Server increase session performance through partitioning the source?
For relational sources, the Informatica Server creates multiple connections, one for each partition of a single source, and extracts a separate range of data for each connection. The Informatica Server reads multiple partitions of a single source concurrently. Similarly, for loading, the Informatica Server creates multiple connections to the target and loads partitions of data concurrently.
For XML and file sources, the Informatica Server reads multiple files concurrently. For loading the data, the Informatica Server creates a separate file for each partition of a source file. You can choose to merge the targets.
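The "separate range of data for each connection" idea is key-range partitioning, sketched here in plain Python (illustrative only; the function and its parameters are invented for the example): a source key range is split into contiguous sub-ranges so that one reader connection can extract each sub-range concurrently.

```python
def key_ranges(lo, hi, partitions):
    # Split the inclusive key range [lo, hi] into `partitions`
    # contiguous sub-ranges, one per reader connection.
    step = (hi - lo + partitions) // partitions  # ceiling division
    ranges = []
    start = lo
    while start <= hi:
        ranges.append((start, min(start + step - 1, hi)))
        start += step
    return ranges

# Four readers would each issue e.g. WHERE id BETWEEN lo AND hi
# over one of these sub-ranges, concurrently.
ranges = key_ranges(1, 100, 4)
```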
65. Why do you use repository connectivity?
Each time you edit or schedule a session, the Informatica Server communicates directly with the repository to check whether or not the session and users are valid. All the metadata of sessions and mappings is stored in the repository.
66. What are the tasks that the Load Manager process performs?
Manages session and batch scheduling: When you start the Informatica Server, the Load Manager launches and queries the repository for a list of sessions configured to run on the Informatica Server. When you configure a session, the Load Manager maintains a list of sessions and session start times. When you start a session, the Load Manager fetches the session information from the repository to perform validations and verifications prior to starting the DTM process.
Locks and reads the session: When the Informatica Server starts a session, the Load Manager locks the session in the repository. Locking prevents you from starting the same session again while it is running.
Reads the parameter file: If the session uses a parameter file, the Load Manager reads the parameter file and verifies that the session-level parameters are declared in the file.
Verifies permissions and privileges: When the session starts, the Load Manager checks whether or not the user has the privileges to run the session.
Creates log files: The Load Manager creates the log file, which contains the status of the session.
67. What is the DTM process?
After the Load Manager performs validations for the session, it creates the DTM process. The DTM process creates and manages the threads that carry out the session tasks. It creates the master thread, which creates and manages all the other threads.
68. What are the different threads in the DTM process?
Master thread: Creates and manages all other threads.
Mapping thread: One mapping thread is created for each session. It fetches session and mapping information.
Pre- and post-session threads: Created to perform pre- and post-session operations.
Reader thread: One thread is created for each partition of a source. It reads data from the source.
Post-session email: Post-session email allows you to automatically communicate information about a session run to the designated recipients. You can create two different messages: one if the session completed successfully, the other if the session fails.
Indicator file: If you use a flat file as a target, you can configure the Informatica Server to create an indicator file. For each target row, the indicator file contains a number to indicate whether the row was marked for insert, update, delete, or reject.
Output file: If the session writes to a target file, the Informatica Server creates the target file based on the file properties entered in the session property sheet.
Cache files: When the Informatica Server creates a memory cache, it also creates cache files. The Informatica Server creates index and data cache files for the following transformations:
1. Aggregator transformation
2. Joiner transformation
3. Rank transformation
4. Lookup transformation
71. In which circumstances does the Informatica Server create reject files?
1. When it encounters DD_REJECT in an Update Strategy transformation.
2. When a row violates a database constraint.
3. When a field in a row is truncated or overflows.
72. What is polling?
Polling displays updated information about the session in the monitor window. The monitor window displays the status of each session when you poll the Informatica Server.
73. Can you copy a session to a different folder or repository?
Yes. By using the Copy Session Wizard you can copy a session to a different folder or repository. But the target folder or repository must contain the mapping used by that session. If the target folder or repository does not have the mapping of the session being copied, you must copy that mapping first before you copy the session.
74. What is a batch, and what are the types of batches?
A grouping of sessions is known as a batch. Batches are of two types: sequential and concurrent.
We can start a required session individually only in a sequential batch; in a concurrent batch we cannot do this.
83. How can you stop a batch?
By using the Server Manager or pmcmd.
Heterogeneous: When your mapping contains more than one source type, the Workflow Manager creates a heterogeneous session that displays source options for all types.
87. What is the difference between partitioning of relational targets and partitioning of file targets?
If you partition a session with a relational target, the Informatica Server creates multiple connections to the target database to write target data concurrently. If you partition a session with a file target, the Informatica Server creates one target file for each partition. You can configure session properties to merge these target files.
88. What are the transformations that restrict the partitioning of sessions?
Advanced External Procedure transformation and External Procedure transformation: These transformations contain a check box on the Properties tab to allow partitioning.
Aggregator transformation: If you use sorted ports, you cannot partition the associated source.
Joiner transformation: You cannot partition the master source for a Joiner transformation.
Normalizer transformation.
XML targets.
89. Performance tuning in Informatica?
The goal of performance tuning is to optimize session performance so that sessions run during the available load window for the Informatica Server. You can increase session performance as follows.
Network connections: The performance of the Informatica Server is related to network connections. Data generally moves across a network at less than 1 MB per second, whereas a local disk moves data five to twenty times faster. Thus network connections often affect session performance, so avoid unnecessary network connections.
Flat files: If your flat files are stored on a machine other than the Informatica Server, move those files to the machine that hosts the Informatica Server.
Relational data sources: Minimize the connections to sources, targets, and the Informatica Server to improve session performance. Moving the target database onto the server machine may improve session performance.
Staging areas: If you use staging areas, you force the Informatica Server to perform multiple data passes. Removing staging areas may improve session performance.
You can run multiple Informatica Servers against the same repository. Distributing the session load across multiple Informatica Servers may improve session performance.
Running the Informatica Server in ASCII data movement mode improves session performance, because ASCII data movement mode stores a character value in one byte, whereas Unicode mode takes two bytes to store a character.
If a session joins multiple source tables in one Source Qualifier, optimizing the query may
improve performance. Also, single table select statements with an ORDER BY or GROUP BY
clause may benefit from optimization such as adding indexes.
We can improve session performance by configuring the network packet size, which controls how much data crosses the network at one time. To do this, go to the Workflow Manager and choose Server Configure Database Connections.
If your target has key constraints and indexes, they slow the loading of data. To improve session performance in this case, drop the constraints and indexes before you run the session and rebuild them after the session completes.
Running parallel sessions by using concurrent batches also reduces the time needed to load the data, so concurrent batches may also increase session performance.
Partitioning the session improves session performance by creating multiple connections to sources and targets and loading data in parallel pipelines.
In some cases, if a session contains an Aggregator transformation, you can use incremental aggregation to improve session performance.
Avoid transformation errors to improve session performance.
If the session contains a Lookup transformation, you can improve session performance by enabling the lookup cache.
If your session contains a Filter transformation, create that Filter transformation nearer to the sources, or use a filter condition in the Source Qualifier.
Aggregator, Rank, and Joiner transformations often decrease session performance because they must group data before processing it. To improve session performance in these cases, use the sorted ports option.
flexible
You can export objects to and import objects from the repository. When you export a repository object, the Designer or Workflow Manager creates an XML file describing the repository metadata.
The Designer allows you to use the Router transformation to test data for multiple conditions. The Router transformation allows you to route groups of data to a transformation or target.
You can use XML data as a source or target.
Server Enhancements:
You can use the command line program pmcmd to specify a parameter file to run sessions or batches. This allows you to change the values of session parameters, and mapping parameters and variables, at runtime.
If you run the Informatica Server on a symmetric multiprocessing system, you can use multiple CPUs to process a session concurrently. Configure partitions in the session properties based on source qualifiers. The Informatica Server reads, transforms, and writes partitions of data in parallel for a single session. This is available for PowerCenter only.
The Informatica Server creates two processes, the Load Manager process and the DTM process, to run sessions.
You can copy sessions across folders and repositories using the Copy Session Wizard in the Informatica Server Manager.
With new email variables, you can configure post-session email to include information such as the mapping used during the session.
96. What is incremental aggregation?
When using incremental aggregation, you apply captured changes in the source to aggregate calculations in a session. If the source changes only incrementally and you can capture the changes, you can configure the session to process only those changes. This allows the Informatica Server to update the target incrementally, rather than forcing it to process the entire source and recalculate the same calculations each time you run the session.
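The idea can be sketched in plain Python (a toy illustration of the technique, not Informatica's cache implementation): the previous run's aggregate totals are kept, and each new run folds in only the captured source changes.

```python
def incremental_aggregate(saved_totals, changed_rows, key, val):
    # Apply only the captured changes to the previously saved
    # aggregate values, instead of re-reading the whole source.
    for row in changed_rows:
        saved_totals[row[key]] = saved_totals.get(row[key], 0) + row[val]
    return saved_totals

# Totals saved from the previous session run:
saved = {"east": 17}
# This run processes only the new/changed source rows:
incremental_aggregate(saved, [{"region": "east", "amt": 3},
                              {"region": "north", "amt": 9}],
                      "region", "amt")
```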
You can schedule a session to run at a given time or interval, or you can run the session manually.
The different scheduling options:
Run only on demand: The Informatica Server runs the session only when a user starts the session explicitly.
Run once: The Informatica Server runs the session only once, at a specified date and time.
Run every: The Informatica Server runs the session at regular intervals, as you configure.
Customized repeat: The Informatica Server runs the session at the dates and times specified in the Repeat dialog box.
98. What is tracing level, and what are the types of tracing levels?
The tracing level determines the amount of information that the Informatica Server writes to a log file.
Types of tracing levels:
1. Normal
2. Terse
3. Verbose initialization
4. Verbose data
99. What is the difference between the Stored Procedure transformation and the External Procedure transformation?
With a Stored Procedure transformation, the procedure is compiled and executed in a relational data source; you need a database connection to import the stored procedure into your mapping. With an External Procedure transformation, the procedure or function is executed outside the data source; you need to build it as a DLL to access it in your mapping, and no database connection is needed.
100. Explain recovering sessions.
If you stop a session or if an error causes a session to stop, refer to the session and error logs to determine the cause of failure. Correct the errors, and then complete the session. The method you use to complete the session depends on the properties of the mapping, session, and Informatica Server configuration.
Use one of the following methods to complete the session:
1. Run the session again if the Informatica Server has not issued a commit.
2. Truncate the target tables and run the session again if the session is not recoverable.
3. Consider performing recovery if the Informatica Server has issued at least one commit.
101. If a session fails after loading 10,000 records into the target, how can you load the records starting from the 10,001st record when you run the session next time?
As explained above, the Informatica Server has three methods to recover sessions. Use perform recovery to load the records from the point where the session failed.
102. Explain perform recovery.
When the Informatica Server starts a recovery session, it reads the OPB_SRVR_RECOVERY table and notes the row ID of the last row committed to the target database. The Informatica Server then reads all sources again and starts processing from the next row ID. For example, if the Informatica Server commits 10,000 rows before the session fails, when you run recovery, the Informatica Server bypasses the rows up to 10,000 and starts loading with row 10,001.
By default, Perform Recovery is disabled in the Informatica Server setup. You must enable recovery in the Informatica Server setup before you run a session so that the Informatica Server can create and/or write entries in the OPB_SRVR_RECOVERY table.
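The skip-to-last-commit behaviour can be sketched in plain Python (an illustration of the logic, not the server implementation): the source is re-read in full, but every row up to and including the last committed row ID is bypassed before loading resumes.

```python
def recover_load(source_rows, last_committed_rowid, load):
    # Re-read all source rows, but skip everything the failed
    # run already committed; resume loading with the next row.
    for rowid, row in enumerate(source_rows, start=1):
        if rowid <= last_committed_rowid:
            continue  # already committed before the failure
        load(row)

loaded = []
# The failed run committed rows 1..3; recovery resumes at row 4.
recover_load(["r1", "r2", "r3", "r4", "r5"], 3, loaded.append)
```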
103. How do you recover a standalone session?
A standalone session is a session that is not nested in a batch. If a standalone session fails, you can run recovery using a menu command or pmcmd. These options are not available for batched sessions.
To recover a session using the menu:
1. In the Server Manager, highlight the session you want to recover.
2. Select Server Requests > Stop from the menu.
3. With the failed session highlighted, select Server Requests > Start Session in Recovery Mode from the menu.
If you configure a session in a sequential batch to stop on failure, you can run recovery starting with the failed session. The Informatica Server completes the session and then runs the rest of the batch. Use the Perform Recovery session property.
To recover sessions in sequential batches configured to stop on failure:
1. In the Workflow Manager, open the session property sheet.
2. On the Log Files tab, select Perform Recovery, and click OK.
3. Run the session.
4. After the batch completes, open the session property sheet.
5. Clear Perform Recovery, and click OK.
If you do not clear Perform Recovery, the next time you run the session the Informatica Server attempts to recover the previous session.
If you do not configure a session in a sequential batch to stop on failure, and the remaining sessions in the batch complete, recover the failed session as a standalone session.
105. How do you recover sessions in concurrent batches?
If multiple sessions in a concurrent batch fail, you might want to truncate all targets and run the
batch again. However, if a session in a concurrent batch fails and the rest of the sessions
complete successfully, you can recover the session as a standalone session.
To recover a session in a concurrent batch:
1. Copy the failed session using Operations > Copy Session.
2. Drag the copied session outside the batch to make it a standalone session.
3. Follow the steps to recover a standalone session.
4. Delete the standalone copy.
106. How can you complete unrecoverable sessions?
Under certain circumstances, when a session does not complete, you need to truncate the target
tables and run the session from the beginning. Run the session from the beginning when the
Informatica Server cannot run recovery or when running recovery might result in inconsistent
data.
107. What are the circumstances that lead to an unrecoverable session?
1. The source qualifier transformation does not use sorted ports.
2. You change the partition information after the initial session fails.
3. Perform Recovery is disabled in the Informatica Server configuration.
4. The sources or targets change after the initial session fails.
5. The mapping contains a Sequence Generator or Normalizer transformation.
6. A concurrent batch contains multiple failed sessions.
108. If I make modifications to my table in the back end, do they reflect in the Informatica Warehouse Designer, Mapping Designer, or Source Analyzer?
No. Informatica is not directly concerned with the back-end database; it displays the information that is stored in the repository. If you want back-end changes reflected on the Informatica screens, you have to import from the back end into Informatica again through a valid connection, and replace the existing definitions with the imported ones.
109. After dragging the ports of three sources (SQL Server, Oracle, Informix) to a single Source Qualifier, can you map these three ports directly to the target?
No. Unless and until you join those three ports in the Source Qualifier, you cannot map them directly.
Update Strategy transformation: you can write your own logic, so it is flexible.
Normal insert/update/delete (with the appropriate update option):
configured in the session properties.
Any change in the row causes an update, so it is inflexible.
Source tuning
Target tuning
Repository tuning
Session performance tuning
Incremental change identification on the source side.
Software, hardware (use multiple servers) and network tuning.
Bulk loading.
Use the appropriate transformation.
To monitor this:
Set the performance detail criteria.
Enable performance monitoring.
Monitor the session at runtime and/or check the performance monitor file.
117. What is a suggested method for validating fields and marking them with errors?
One successful method is to create an Expression transformation containing variables: one
variable per port to be checked. Set an error flag for each field, then at the bottom of
the expression trap each of the error fields. From this port you can set flags based on
each individual error that occurred, or feed the error field names out as a concatenated
string to be inserted into the database as an error row in an error-tracking table.
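The pattern above can be sketched in Python; the field names and validation rules here are purely illustrative, not from any real mapping:

```python
def validate_row(row):
    """Return (row, error_string); an empty string means the row is clean.
    One check per port, then the error field names are concatenated."""
    errors = []
    if not row.get("cust_id"):          # illustrative rule: key must be present
        errors.append("cust_id")
    if row.get("amount", 0) < 0:        # illustrative rule: no negative amounts
        errors.append("amount")
    return row, ",".join(errors)

row, err = validate_row({"cust_id": None, "amount": -5})
print(err)  # cust_id,amount
```

The concatenated string can then be written to an error-tracking table as a single error row.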
118. Where is the cache (lookup, index) created, and how can you see it?
The cache is created on the server, with some default memory allocated for it.
Only once that memory is exceeded do the cache files appear in the cache directory on the
server, not before that.
119. When do you use a SQL override in a Lookup transformation?
Use a SQL override when
When the session is run, these two procedures are executed: one before the session and one
after it.
124. How can you utilize COM components in Informatica?
By writing C++, VB, or VC++ code in an External Procedure transformation.
125. What is an indicator file and how can it be used?
An indicator file is used for event-based scheduling when you do not know when the source data
will be available.
A shell command, script, or batch file creates and sends the indicator file to a directory local
to the Informatica Server.
The server waits for the indicator file to appear before running the session.
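The server's wait-for-indicator behavior can be sketched in Python; the path, timeout, and polling interval are illustrative:

```python
import os
import time

def wait_for_indicator(path, timeout=5.0, poll=0.05):
    """Poll until the indicator file appears, mimicking event-based scheduling.
    The file is consumed (removed) once seen, as the server does before
    running the session. Returns False if the timeout elapses first."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        if os.path.exists(path):
            os.remove(path)        # consume the indicator
            return True
        time.sleep(poll)
    return False
```

A shell script on the source side would simply create the file (e.g. `touch ind.done`) once the data is available.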
126. What is a persistent cache? When should it be used?
When the lookup cache is saved by the Lookup transformation, it is called a persistent cache.
The first time the session runs, the cache is saved to disk and is then reused in subsequent runs
of the session.
It is used when the lookup table is static, i.e. does not change frequently.
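A persistent cache can be sketched in Python as a pickled dictionary; the cache file name and the fetch function are illustrative, not Informatica internals:

```python
import os
import pickle

CACHE_FILE = "lookup.cache"  # illustrative file name

def build_or_load_cache(fetch_rows):
    """First run: build the cache from the lookup source and save it to disk.
    Later runs: reuse the saved file instead of re-reading the source,
    which is what a persistent lookup cache does for a static table."""
    if os.path.exists(CACHE_FILE):
        with open(CACHE_FILE, "rb") as f:
            return pickle.load(f)
    cache = dict(fetch_rows())          # e.g. {lookup key: looked-up value}
    with open(CACHE_FILE, "wb") as f:
        pickle.dump(cache, f)
    return cache
```

Because later runs skip `fetch_rows` entirely, this is only safe when the lookup table does not change between runs.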
127. What is incremental aggregation and how should it be used?
If the source changes only incrementally and you can capture those changes, you can configure the
session to process only the changes. This allows the Informatica Server to update your target
incrementally, rather than forcing it to process the entire source and recalculate the same
calculations each time you run the session. Therefore, only use incremental aggregation if you
can capture incremental changes; you might do this by filtering the source data by timestamp.
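The idea can be sketched in Python; the tuple layout and timestamps are invented for the example:

```python
def incremental_aggregate(totals, new_rows, last_run):
    """Apply only source rows newer than last_run to the saved totals,
    instead of recomputing the aggregates over the entire source."""
    for key, amount, ts in new_rows:
        if ts > last_run:              # capture incremental changes by timestamp
            totals[key] = totals.get(key, 0) + amount
    return totals

saved = {"A": 100}                     # aggregates saved from the previous run
rows = [("A", 10, 5), ("B", 7, 6), ("A", 99, 1)]   # (key, amount, timestamp)
print(incremental_aggregate(saved, rows, last_run=2))  # {'A': 110, 'B': 7}
```

The row with timestamp 1 is skipped because it was already captured by the previous run.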
(Diagram: sources S1 and S2 flow through a Source Qualifier and lookups on the dimension
tables before loading the fact table.)
129. The Informatica Server and client are on different machines. When you run a session from the
Server Manager, specifying the source and target databases, it displays an error, although you are
confident that everything is correct. Why is it displaying the error?
The connect strings for the source and target databases are not configured on the workstation
containing the server, though they may be configured on the client machine.
Designer Tool
This tool is used by programmers to develop ETL programs (referred to as mapping programs, or
mappings).
It has the following five components:
1. Source Analyzer:
To define source data objects for mappings. Sources can be RDBMS, semi-RDBMS,
files, ERPs, XML files, COBOL files, etc.
2. Warehouse Designer:
To create or include target data objects. A target can be an RDBMS (most preferable), ERP, or file.
3. Mapping Designer:
To relate source data objects to target data objects using predefined or user-defined
transformations.
4. Transformation Developer:
Source Qualifier (Active):
It is a wrapper on the source data objects through which data flows from the source. Data objects
are not allowed without this transformation. A Source Qualifier represents a complete SELECT statement.
We can have one or more Source Qualifiers for a single source object; multiple
Source Qualifiers may be required when different data objects need different sets of data.
A Source Qualifier can also combine multiple source objects (a join query); such a SQ is referred to
as a common Source Qualifier.
2. MQ Source Qualifier (Active):
Same as the Source Qualifier, but only for the MQ Series product from IBM. It is used for data
integration from different databases.
5. Joiner (Active):
It is used to join two Source Qualifiers (source objects) based on an equi-join, including outer
joins. This is required when the source data objects are heterogeneous, since in such cases a
common Source Qualifier cannot be used.
6. Lookup (Passive):
Same as the Joiner, but it can also be used for non-equi-join conditions.
No outer join is allowed. It can look up data objects from the source, the target, or an
external database, while the Joiner can combine data objects only from the source.
7. Expression (Passive):
It is used to define row- or record-based formulas or expressions, such as:
Netpay (sal + comm + da - tax),
SUBSTR(Name, 1, 3),
TO_CHAR(Join_date, 'yyyy'), etc.
It corresponds to the SELECT clause of a SELECT statement.
For example:
SELECT sal + comm FROM emp;
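The row-level expressions above can be sketched in Python; the column names follow the examples, and the date format is assumed:

```python
def expression(row):
    """Row-level derived ports mirroring the example formulas."""
    return {
        "netpay": row["sal"] + row["comm"] + row["da"] - row["tax"],
        "short_name": row["name"][:3],          # SUBSTR(Name, 1, 3)
        "join_year": row["join_date"][:4],      # TO_CHAR(Join_date, 'yyyy'), assuming ISO dates
    }

print(expression({"sal": 3000, "comm": 200, "da": 50, "tax": 250,
                  "name": "Miller", "join_date": "2001-06-15"}))
# {'netpay': 3000, 'short_name': 'Mil', 'join_year': '2001'}
```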
8. Filter (Active):
To define conditions that restrict records; the WHERE clause of a SELECT statement.
For example:
SELECT * FROM emp WHERE sal > 30000;
9. Aggregator (Active):
To group records, with or without summary-function output; the GROUP BY clause of a
SELECT statement using group functions.
For example:
SELECT job, SUM(sal), AVG(sal) FROM emp
GROUP BY job;
OR
SELECT job FROM emp GROUP BY job;
10. Rank (Active):
Same as the Filter, but it filters records from the top or bottom of the sorted records. For example:
the top three customers based on total sales amount, or
the last three employees based on salary.
In SQL this can be done in the following ways:
-- A simple sub-query using the ROWNUM virtual column.
-- A correlated subquery.
-- A START WITH & CONNECT BY clause.
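A Rank-style top-N/bottom-N selection can be sketched in Python (a toy model of the idea, not Informatica's engine):

```python
import heapq

def rank_top(rows, key, n=3, bottom=False):
    """Return the top (or bottom) n rows by key, like a Rank transformation."""
    fn = heapq.nsmallest if bottom else heapq.nlargest
    return fn(n, rows, key=key)

# Top three customers by total sales amount (illustrative data).
sales = [("C1", 500), ("C2", 900), ("C3", 100), ("C4", 700)]
print(rank_top(sales, key=lambda r: r[1]))
# [('C2', 900), ('C4', 700), ('C1', 500)]
```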
11. Router (Active):
It is used to provide multiple outputs from a single source of data. Each output can have its own
filter. It is basically a combination of the Source Qualifier and Filter transformations; a Router is
like multiple views on a single table in a database.
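The Router's behavior — each named output group has its own filter, and unmatched rows fall to a default group — can be sketched as:

```python
def router(rows, groups):
    """Split one input into named output groups, each with its own filter
    condition. A row goes to every group whose condition it satisfies;
    rows matching no condition land in the 'default' group."""
    out = {name: [] for name in groups}
    out["default"] = []
    for row in rows:
        matched = False
        for name, cond in groups.items():
            if cond(row):
                out[name].append(row)
                matched = True
        if not matched:
            out["default"].append(row)
    return out

rows = [{"sal": 40000}, {"sal": 20000}]
print(router(rows, {"high": lambda r: r["sal"] > 30000}))
# {'high': [{'sal': 40000}], 'default': [{'sal': 20000}]}
```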
12. Update Strategy (Active):
It is used to flag records for insert, delete, update, or reject at the target data objects.
The default flag is insert. Based on the flags attached to the records, the target decides the data
manipulation within the data objects. This is very useful for incremental data loading.
13. Stored Procedure (Passive/Active):
It is used to call or execute back-end programs such as:
procedures, functions, and package members.
14. Sequence Generator (Passive):
It is used to generate unique serial numbers, the same as the Sequence object of the Oracle database.
15. Normalizer (Active):
It is used to convert non-tabular data into tabular format, basically for COBOL data structures.
16. Mapplet Input (virtual source table) (Passive):
To define virtual source data objects, which are nothing but parameters or ports. It is the same as
defining IN or OUT parameters for procedures or functions. When mapplets are used in
mappings, the Mapplet Input ports are attached to the actual source object columns.
17. Mapplet Output (virtual target table) (Passive):
To define virtual target data objects, which are nothing but parameters or ports.
When mapplets are used in mappings, the Mapplet Output ports are attached to the actual target
object columns.
Note:
A reusable program must never be based on actual data objects.
Ports
Ports are channels through which values are passed from one transformation to another.
A port does not hold a value; it is just like a pipe used to pass data, either
directly or after some transformation (formula). There are four basic types of ports:
1. Input port: to receive values from the source.
2. Output port: to pass values to the target.
3. Variable port:
To define a formula for the internal use of the transformation.
Such ports are not visible or exposed to other transformations.
4. Input/Output port:
To receive as well as pass values.
Note:
Output or variable ports require a formula or expression.
Joins
Normal Join:
All matching records between the Master and Detail tables.
Master Outer Join:
All matching records plus the Detail records without matching Master records.
Detail Outer Join:
All matching records plus the Master records without matching Detail records.
Full Outer Join:
All matching records, plus Master records without matching Detail records, plus Detail
records without matching Master records.
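A minimal Python sketch of the four join types, assuming single-column keys and one row per key (per PowerCenter semantics, a master outer join keeps all detail rows, and a detail outer join keeps all master rows):

```python
def joiner(master, detail, join_type="normal"):
    """Equi-join two {key: value} sources under the four join types."""
    mkeys, dkeys = set(master), set(detail)
    keys = mkeys & dkeys                  # normal: matching keys only
    if join_type == "master outer":
        keys = dkeys                      # all detail rows + matching master rows
    elif join_type == "detail outer":
        keys = mkeys                      # all master rows + matching detail rows
    elif join_type == "full outer":
        keys = mkeys | dkeys              # everything from both sources
    return {k: (master.get(k), detail.get(k)) for k in sorted(keys)}

master = {1: "m1", 2: "m2"}
detail = {2: "d2", 3: "d3"}
print(joiner(master, detail, "full outer"))
# {1: ('m1', None), 2: ('m2', 'd2'), 3: (None, 'd3')}
```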
Decide the position of the level within the hierarchy (in the case of starflake and snowflake schemas).
Use the Close button to complete.
Transformations
1. Source Qualifier
2. Expression
3. Aggregator
4. Filter
5. Joiner
6. Look Up
7. Stored Procedure
8. Sequence Generator
9. Update Strategy
10. Rank
11. Router
12. Normalizer
1. Source Qualifier:
You can use the Source Qualifier to perform the following tasks:
Join data originating from the same source database.
Filter records when the Informatica Server reads the source data.
Specify an outer join rather than the default inner join.
Specify sorted ports.
Select only distinct values from the source.
Create a custom query to issue a special SELECT statement for the Informatica Server to read
the source data.
SQL Query:
Defines a custom query that replaces the default query the Informatica Server uses to read from
the sources represented in this Source Qualifier.
A custom query overrides entries for a user-defined join or a source filter.
User-Defined Join:
Specifies the condition used to join data from multiple sources represented in the same Source
Qualifier transformation.
Source Filter:
Specifies the filter condition the Informatica Server applies when querying records.
Number of Sorted Ports:
Indicates the number of columns used when sorting records queried from relational sources.
If you select this option, the Informatica Server adds an ORDER BY clause to the default query
when it reads source records. The ORDER BY includes the number of ports specified, starting
from the top of the Source Qualifier. When selected, the database sort order must match the
session sort order.
Tracing Level:
Sets the amount of detail included in the session log when you run a session containing this
transformation.
SELECT DISTINCT:
Specifies whether to select only unique records. The Informatica Server issues a SELECT
DISTINCT statement.
Do not alter the datatypes in the Source Qualifier. If the datatypes in the source definition and
the Source Qualifier do not match, you cannot save the mapping.
2. Expression:
The Expression transformation allows you to perform calculations on a row-by-row basis.
3. Aggregator:
To perform calculations involving multiple rows, such as sums or averages, use the
Aggregator transformation.
The Aggregator transformation allows you to perform aggregate calculations such as averages
and sums.
There are partitioning restrictions that apply to the Aggregator transformation.
By default, the Informatica Server treats NULL values as NULL in aggregate functions.
4. Filter:
The Source Qualifier transformation provides an alternate way to filter records.
Rather than filtering records within the mapping, the Source Qualifier filters records when
reading from the source. The main difference is that the Source Qualifier limits the record set
extracted from the source, while the Filter transformation limits the record set sent to the target.
Since a Source Qualifier reduces the number of records used throughout the mapping, it
provides better performance.
However, the Source Qualifier only lets you filter records from relational sources, while the
Filter transformation filters records from any type of source. Also note that,
since it runs in the database, you must make sure the Source Qualifier condition uses only
standard SQL. The Filter transformation can define a condition using any statement or
transformation function that returns either a true or false value.
5. Joiner:
Use the Joiner transformation to join two sources with at least one matching port. The Joiner
transformation uses a condition that matches one or more pairs of ports between the two
sources.
The combination of sources can be varied:
1. Two relational tables existing in separate databases.
2. Two flat files in potentially different file systems.
3. Two different ODBC sources.
4. Two instances of the same XML source.
5. A relational table and a flat-file source.
6. A relational table and an XML source.
The Joiner transformation accepts input from any transformation; however, there are some
limitations on the data flows you can connect to the Joiner transformation.
You cannot use a Joiner transformation in the following situations:
1. Both pipelines begin with the same original data source.
2. Both input pipelines originate from the same Source Qualifier transformation.
3. Both input pipelines originate from the same Normalizer.
4. Both input pipelines originate from the same Joiner transformation.
5. Either input pipeline contains an Update Strategy transformation.
6. Either pipeline contains a connected or unconnected Sequence Generator transformation.
Specify one of the sources as the master source and the other as the detail source.
This is specified on the Properties tab of the transformation by clicking the M column. When you
add the ports of a transformation, the ports from the first source are automatically set as the
detail source. Adding the ports from the second transformation automatically sets them as the
master source. The master/detail relation determines how the join treats data from those sources
based on the join type.
Note:
To choose the master and detail, look at the tables: take as the master the table from which
most of the columns are required by the next transformation, or on which the next
transformation's values will depend, and take the other as the detail.
Generally there are four types of joins:
Normal: selects only the matching records from both tables.
Master Outer: selects the matching records and also the unmatched records of the Detail.
Detail Outer: selects the matching records and also the unmatched records of the Master.
Full Outer: selects all the records of both tables.
A Normal or Master Outer join performs faster than a Full Outer or Detail Outer join.
6. Look Up:
You can use the Lookup transformation to perform many tasks, including:
Get a related value:
if your source table includes Employee_ID but you want to include Employee_Name in your
target table to make your summary data easier to read.
Perform a calculation:
many normalized tables include values used in calculations, such as gross sales per invoice or
sales tax, but not the calculated value, such as net sales.
Update slowly changing dimensions:
you can use a Lookup transformation to determine whether records already exist in the target.
7. Stored Procedure:
A stored procedure is a precompiled collection of Transact-SQL statements and optional
flow-control statements, similar to an executable script. Stored procedures are stored and run
within the database.
Unlike standard SQL, however, stored procedures allow user-defined variables, conditional
statements, and other powerful programming features.
You might use stored procedures to:
1. Drop and recreate indexes.
2. Check the status of the target database before moving records into it.
3. Determine if enough space exists in a database.
4. Perform a specialized calculation.
Stored procedures allow greater flexibility than SQL statements. Stored procedures also allow
the error handling and logging necessary for mission-critical tasks.
One of the most useful features of stored procedures is the ability to send data to the stored
procedure and receive data from the stored procedure.
There are three types of data that pass between the stored procedure and the Informatica Server:
1. Input/output parameters.
2. Return values.
3. Status codes.
Status codes:
A status code provides error handling for the Informatica Server during a session. The stored
procedure issues a status code that notifies whether or not the stored procedure completed
successfully. This value cannot be seen by the user; it is only used by the Informatica Server to
determine whether to continue running the session or stop. You configure options in the
Workflow Manager to continue or stop the session in the event of a stored procedure error.
8. Sequence Generator:
The Sequence Generator transformation generates numeric values. You might use the Sequence
Generator to create unique primary key values, replace missing primary keys, or cycle through a
sequential range of numbers.
It contains two output ports that you can connect to one or more transformations. The
Informatica Server generates a value each time a row enters a connected transformation, even if
that value is not used.
When NEXTVAL is connected to the input port of another transformation, the Informatica Server
generates a sequence of numbers. When CURRVAL is connected to a transformation, the
Informatica Server generates the NEXTVAL value plus one.
Some common uses of the Sequence Generator transformation are:
1. Creating keys.
2. Replacing missing values.
You can connect NEXTVAL to multiple transformations and generate unique values for each
row in the transformation.
For example, you might connect NEXTVAL to two target tables in a mapping to generate unique
primary key values. The Informatica Server creates a column of unique primary key values for
each target table.
The Sequence Generator is unique among all transformations because you cannot add or delete
its default ports (NEXTVAL and CURRVAL).
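The NEXTVAL/CURRVAL behavior described above can be sketched as a toy Python model (not the server's implementation; real CURRVAL is NEXTVAL plus the Increment By value, which defaults to 1):

```python
class SequenceGenerator:
    """Minimal sketch: nextval() emits the next number in the sequence;
    currval() is the last emitted value plus the increment."""
    def __init__(self, start=1, increment=1):
        self.value = start - increment
        self.increment = increment

    def nextval(self):
        self.value += self.increment
        return self.value

    def currval(self):
        return self.value + self.increment

seq = SequenceGenerator()
print(seq.nextval(), seq.nextval(), seq.currval())  # 1 2 3
```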
9. Update Strategy:
1. Insert:
Populate the target table for the first time, or maintain a historical data warehouse. In the
latter case, you must set this strategy for the entire data warehouse, not just a select group of
target tables.
2. Delete:
Clear target tables.
3. Update:
Update target tables. You might choose this setting whether the data warehouse contains
historical data or a snapshot.
Later, when you configure how to update individual target tables, you can determine whether to
insert updated records as new records or use the updated information to modify existing records
in the target.
4. Data Driven:
Exert fine control over how to flag records for insert, delete, update, or reject.
Choose this setting if records destined for the same table need to be flagged on occasion for one
operation (for example, update) or for a different operation (for example, reject). In addition,
this setting provides the only way you can flag records for reject.
For the greatest control over your update strategy, add Update Strategy transformations to a
mapping. The most important feature of this transformation is its update strategy expression,
used to flag individual records for insert, delete, update, or reject.
The constants for each database operation and their numeric equivalents are:
Insert: DD_INSERT (0)
Update: DD_UPDATE (1)
Delete: DD_DELETE (2)
Reject: DD_REJECT (3)
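As a hypothetical illustration of data-driven flagging with these constants (the `exists` and `amount` fields are invented for the example, not part of any real mapping):

```python
# Numeric flags matching the update-strategy constants above.
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

def flag_row(row):
    """Illustrative strategy expression: reject rows with bad data,
    update rows whose key already exists in the target, insert the rest."""
    if row.get("amount", 0) < 0:
        return DD_REJECT
    return DD_UPDATE if row.get("exists") else DD_INSERT

print([flag_row(r) for r in [{"amount": 5, "exists": True},
                             {"amount": 5, "exists": False},
                             {"amount": -1}]])
# [1, 0, 3]
```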
Session Wizard
1. Insert: treat all records as inserts. If inserting a record violates a primary or foreign key
constraint in the database, the Informatica Server rejects the record.
2. Delete: treat all records as deletes. For each record, if the Informatica Server finds the record
in the target based on the primary key, the Informatica Server deletes the record. Note that the
primary key constraint must exist in the target definition in the repository.
3. Update: treat all rows as updates. For each record, the Informatica Server looks for a
matching primary key value in the target table. If it exists, the Informatica Server updates the
record. Again, the primary key constraint must exist in the target definition in the repository.
4. Data Driven: the Informatica Server follows the instructions coded into the Update Strategy
transformations within the session mapping to determine how to flag records for insert, delete,
update, or reject. If a mapping for a session contains an Update Strategy transformation, this
field is marked Data Driven by default. If you do not choose the Data Driven setting, the
Informatica Server ignores all Update Strategy transformations in the mapping.
What are shortcuts and what are their advantages?
Shortcuts allow you to use metadata across folders without making copies, ensuring uniform
metadata. A shortcut inherits all properties of the object to which it points. Once you create a
shortcut, you can configure the shortcut name and description.
When the object the shortcut references changes, the shortcut inherits those changes. By using a
shortcut instead of a copy, you ensure each use of the shortcut exactly matches the original object.
For example, if you have a shortcut to a target definition and you add a column to the definition,
the shortcut automatically inherits the additional column.
Shortcuts allow you to reuse an object without creating multiple objects in the repository. For
example, to use a source definition in mappings in ten different folders, instead of creating ten
copies of the same source definition, one in each folder, you can create ten shortcuts to the
original definition.
You can create shortcuts to objects in shared folders. If you try to create a shortcut to a
non-shared folder, the Designer creates a copy of the object instead.
You can create shortcuts to the following repository objects:
1. Source definitions.
2. Reusable transformations.
3. Mapplets.
4. Target definitions.
5. Business components.
You can create two types of shortcuts:
1. Local shortcut: a shortcut created in the same repository as the original object.
2. Global shortcut: a shortcut created in a local repository that references an object in the
global repository.
Advantages: one of the primary advantages of using shortcuts is ease of maintenance.
If you need to change all instances of an object, you can edit the original repository object. All
shortcuts accessing the object automatically inherit the changes.
Shortcuts have the following advantages over copied repository objects:
You can maintain a common repository object in a single location.
If you need to edit the object, all shortcuts immediately inherit the changes you make.
You can restrict repository users to a set of predefined metadata by asking users to incorporate
the shortcuts into their work instead of developing repository objects independently.
You can develop complex mappings, mapplets, or reusable transformations, then reuse them
easily in other folders.
You can save space in your repository by keeping a single repository object instead of creating
copies of the object in multiple repositories.