A Shared Folder is like any other folder, but it can be accessed by all users (access can be changed). It is mainly used to share objects between folders for reusability. For example, you can create a shared folder for keeping all the common mapplets, sources, targets, and transformations that can be used across folders by creating shortcuts to them. By doing this we increase the reusability of the code; also, changes can be made in one place and will be reflected in all the other shortcuts.
1. Quantitative testing
2. Qualitative testing
Steps:
Once the session has succeeded, right-click on the session and go to the statistics tab. There you can see how many source rows were applied, how many rows were loaded into the targets, and how many rows were rejected. This is called quantitative testing.
Once the rows are loaded successfully, we go for qualitative testing.
Steps:
1. Take the DATM (the document where all business rules are mapped to the corresponding source columns) and check whether the data has been loaded into the target table according to the DATM. If any data is not loaded according to the DATM, then go and check the code and rectify it.
It is a session option. When the Informatica server performs incremental aggregation, it passes new source data through the mapping and uses historical cache data to perform the new aggregation calculations incrementally. We use it for performance.
When using incremental aggregation, you apply captured changes in the source to
aggregate calculations in a session. If the source changes incrementally and you
can capture changes, you can configure the session to process those changes. This
allows the Integration Service to update the target incrementally, rather than
forcing it to process the entire source and recalculate the same data each time you
run the session.
For example, you might have a session using a source that receives new data every
day. You can capture those incremental changes because you have added a filter
condition to the mapping that removes pre-existing data from the flow of data. You
then enable incremental aggregation.
When the session runs with incremental aggregation enabled for the first time on
March 1, you use the entire source. This allows the Integration Service to read and
store the necessary aggregate data. On March 2, when you run the session again,
you filter out all the records except those time-stamped March 2. The Integration
Service then processes the new data and updates the target accordingly.
Use incremental aggregation when you can capture new source data each time you run the session; use a Stored Procedure or Filter transformation to pass only the new data through.
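The behaviour described above can be sketched in plain Python (an illustration of the idea only, not Informatica code): a saved "historical cache" of aggregates is updated with only the newly captured rows, instead of re-reading the entire source on every run.

```python
# Sketch of incremental aggregation: the historical cache of aggregates
# is updated with only the new rows captured since the last run.

def incremental_aggregate(cache, new_rows):
    """cache: dict mapping group key -> running SUM(amount).
    new_rows: iterable of (key, amount) captured since the last run."""
    for key, amount in new_rows:
        cache[key] = cache.get(key, 0) + amount
    return cache

# First run (March 1): the entire source is read and cached.
cache = incremental_aggregate({}, [("A", 10), ("B", 5), ("A", 7)])
# Second run (March 2): only the rows time-stamped March 2 pass the filter.
cache = incremental_aggregate(cache, [("A", 3), ("C", 2)])
print(cache)  # {'A': 20, 'B': 5, 'C': 2}
```

The second run touches only two rows, yet the totals end up the same as if the whole source had been re-aggregated.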
How do you delete duplicate rows from a flat file source? Is there any option in Informatica?
Use a Sorter transformation; it has a "distinct" option, make use of it.
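What the Sorter's "distinct" option does can be pictured in a few lines of Python (a sketch of the idea, not the actual engine): sort the rows, then drop exact repeats, which after sorting are always adjacent.

```python
# Sketch of the Sorter transformation's "distinct" option: sort the rows,
# then drop exact duplicates, which end up adjacent after sorting.

def sort_distinct(rows):
    out = []
    for row in sorted(rows):
        if not out or row != out[-1]:   # skip adjacent duplicate
            out.append(row)
    return out

rows = [("1", "a"), ("2", "b"), ("1", "a"), ("1", "a")]
print(sort_distinct(rows))  # [('1', 'a'), ('2', 'b')]
```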
With mapping parameters and variables, the variable value is saved to the repository after the completion of the session; the next time you run the session, the server takes the saved value from the repository and starts assigning from the next value. For example, I ran a session and at the end it stored a value of 50 in the repository. The next time I run the session, I want it to start with a value of 70, not 51.
How do I do this?
Start --------> Session.
Right-click on the session and you will get a menu; in it, go to Persistent Values. There you will find the last value stored in the repository for the mapping variable. Remove it and put in your desired value, then run the session; your task will be done.
You can use an Aggregator after an Update Strategy. The problem is that once you perform the update strategy, say you have flagged some rows to be deleted and you perform the Aggregator transformation over all rows, say using the SUM function, then the rows flagged for deletion will still affect the result of the aggregation.
So all the dimensions maintain historical data and are denormalized. "Duplicate entry" here does not mean an exact duplicate record: another record with the same employee number is maintained in the table.
How do you handle decimal places while importing a flat file into Informatica?
While importing the flat file definition, just specify the scale for the numeric data type. In the mapping, the flat file source supports only the number datatype (no decimal or integer); the Source Qualifier associated with that source will have a decimal data type for that number port.
If your workflow is running slow in Informatica, where do you start troubleshooting and what steps do you follow?
When the workflow is running slowly, you have to find the bottlenecks, in this order:
target
source
mapping
session
system
If you have four lookup tables in the workflow, how do you troubleshoot to improve performance?
There are many ways to improve a mapping that has multiple lookups. One of them: divide the lookup mapping into two. (a) Dedicate one to inserts (source - target); these are the new rows, and since only the new rows come into the mapping, the process will be fast. (b) Dedicate the second one to updates (source = target); these are existing rows, and only the rows which already exist come into the mapping.
Can anyone explain error handling in Informatica with examples, so that it will be easy to explain in an interview?
Go to the session log file; there we will find the information about the errors encountered, and the load summary. By examining the errors encountered during the session run, we can resolve them.
There is a file called the bad file, which generally has the format *.bad, and it contains the records rejected by the Informatica server. There are two kinds of indicators: one for the rows and one for the columns. The row indicator signifies what operation was going to take place (i.e. insertion, deletion, update, etc.). The column indicators contain information about why the column was rejected (such as violation of a not-null constraint, value error, overflow, etc.). If one rectifies the errors in the data present in the bad file and then reloads the data into the target, the table will contain only valid data.
Just right-click on the particular session and go to the recovery option.
or
It is possible to run two sessions (via pre-session and post-session commands) using pmcmd without using a workflow, but not more than two.
If a session fails after loading 10,000 records into the target, how can you load the records from the 10,001st record when you run the session the next time in Informatica 6.1?
Running the session in recovery mode will work, but the target load type should be normal. If it is bulk, then recovery won't work as expected.
What are mapping parameters and variables, and in which situations can we use them?
If we need to change certain attributes of a mapping every time the session is run, it would be very difficult to edit the mapping and change the attribute each time. So we use mapping parameters and variables and define the values in a parameter file; then we only edit the parameter file to change the attribute values. This makes the process simple.
What is a worklet, what is the use of a worklet, and in which situations can we use it?
A worklet is a reusable group of tasks that can be placed inside a workflow. It can contain tasks such as: 1) Timer 2) Decision 3) Command 4) Event Wait 5) Event Raise 6) Email.
What logic will you implement to load the data into one fact table from 'n' dimension tables?
Normally, everyone looks up the surrogate keys in the dimension tables and loads them as foreign keys into the fact table.
In the source, if we also have duplicate records and we have 2 targets, T1- for
unique values and T2- only for duplicate values. How do we pass the unique values
to T1 and duplicate values to T2 from the source to these 2 different targets in a
single mapping?
src ---> sq ---> agg (group by the key, add a COUNT port) ---> router (count = 1 ---> t1, count > 1 ---> t2)
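The routing idea can be sketched in Python (an illustration of the logic, not Informatica code): count how many times each value occurs, then send values that occur once to T1 and all copies of the repeated values to T2.

```python
# Sketch of unique-vs-duplicate routing: count occurrences of each value,
# route count == 1 to T1 and count > 1 to T2 (in Informatica this is an
# Aggregator with a COUNT port followed by a Router).
from collections import Counter

def route(rows):
    counts = Counter(rows)
    t1 = [r for r in rows if counts[r] == 1]   # unique values
    t2 = [r for r in rows if counts[r] > 1]    # duplicate values
    return t1, t2

t1, t2 = route(["a", "b", "a", "c"])
print(t1)  # ['b', 'c']
print(t2)  # ['a', 'a']
```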
Conformed dimension == one dimension that is shared by two fact tables.
Factless means a fact table without measures; it contains only foreign keys. There are two types of factless fact tables: one is event tracking and the other is the coverage table.
Can anybody write a session parameter file which will change the sources and targets for every session, i.e. different sources and targets for each session run?
You are supposed to define a parameter file, and in the parameter file you can define two parameters, one for the source and one for the target:
[folder_name.WF:workflow_name.ST:s_session_name]
$Src_file=c:\program files\informatica\server\bin\abc_source.txt
$tgt_file=c:\targets\abc_targets.txt
If it's a relational DB, you can even give an overridden SQL at the session level as a parameter. Make sure the SQL is on a single line.
If you want to create indexes after the load process, which transformation do you choose?
It's usually not done at the mapping (transformation) level; it's done at the session level. Create a Command task which executes a shell script (if Unix) or any other script which contains the create index command, and use this command task in the workflow after the session. Alternatively, you can do it with a post-session command.
The cache is stored in the Informatica server's memory, and overflow data is stored on disk in file format; it is automatically deleted after the successful completion of the session run. If you want to keep that data, you have to use a persistent cache.
What will happen if you are using Update Strategy Transformation and your
session is configured for "insert"?
If you have a rank index for the top 10, but you pass only 5 records, what will be the output of such a Rank transformation?
Alternatively, you can use the update options at the session level: instead of using an Update Strategy in the mapping, just select "Update" in "Treat source rows as" together with the "Update else Insert" option. This will do the same job as the Update Strategy, but be sure to have a primary key on the target table.
For Teradata: TPump, MultiLoad.
3) If you pass only 5 rows to the Rank transformation, it will rank only those 5 records based on the rank port.
How can you delete duplicate rows without using a Dynamic Lookup? Tell me any other way of deleting duplicate rows using a lookup.
For example, you have a table Emp_Name with two columns, Fname and Lname, and the source table has duplicate rows. In the mapping, create an Aggregator transformation. Edit the Aggregator, select the Ports tab; select Fname, check the GroupBy box and uncheck the output (O) port; select Lname, uncheck the output (O) port and check the GroupBy box. Then create two new ports, uncheck the input (I) port on each, and set an expression on each: in the first new port's expression type Fname, and in the second new port type Lname. Then close the Aggregator transformation and link it to the target table.
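In effect, the Aggregator keeps one row per (Fname, Lname) group. A minimal Python sketch of that effect (illustrative only, not Informatica code):

```python
# Sketch of the Aggregator-based de-duplication: group by (Fname, Lname)
# and emit one row per group, which drops the duplicates.

def dedupe(rows):
    seen = {}
    for fname, lname in rows:
        seen[(fname, lname)] = (fname, lname)  # one row kept per group
    return list(seen.values())

rows = [("John", "Doe"), ("Ann", "Lee"), ("John", "Doe")]
print(dedupe(rows))  # [('John', 'Doe'), ('Ann', 'Lee')]
```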
In real time, only the star schema is implemented because it takes less time; a surrogate key is present in each dimension table in the star schema, and this surrogate key is assigned as a foreign key in the fact table.
How do you check the source for the latest records that are to be loaded into the target? I.e., I loaded some records yesterday; today the file has been populated with some more records; how do I find the records populated today?
a) Create a lookup to the target table from the Source Qualifier based on the primary key.
b) Use an expression to evaluate the primary key from the target lookup (for a new source record, the lookup's primary key port for the target table should return null). Trap this with DECODE and proceed.
The Time dimension is generally loaded manually, using PL/SQL, shell scripts, Pro*C, etc.
What properties should be noted when we connect a flat file source definition to a relational database target definition?
If it can be executed without performance issues, then a normal load will work. If the data is huge (in GB), then N-way partitions can be specified at the source side and the target side.
In hash partitioning, the Informatica Server uses a hash function to group rows of data among partitions. The Informatica Server groups the data based on a partition key. Use hash partitioning when you want the Informatica Server to distribute rows to the partitions by group. For example, you need to sort items by item ID, but you do not know how many items have a particular ID number.
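The grouping behaviour can be sketched in Python (a toy model of the idea, not the actual engine): a hash of the partition key decides which partition each row goes to, so all rows with the same key land together even when you don't know how many rows share each key.

```python
# Sketch of hash partitioning: hash the partition key modulo the number
# of partitions; rows with the same key always land in the same partition.
import zlib

def partition(rows, key_index, n_partitions):
    parts = [[] for _ in range(n_partitions)]
    for row in rows:
        h = zlib.crc32(str(row[key_index]).encode()) % n_partitions
        parts[h].append(row)
    return parts

rows = [(101, "bolt"), (202, "nut"), (101, "bolt"), (303, "washer")]
parts = partition(rows, 0, 2)
# Both rows with item ID 101 are guaranteed to be in the same partition.
idx = zlib.crc32(b"101") % 2
print([r for r in parts[idx] if r[0] == 101])  # [(101, 'bolt'), (101, 'bolt')]
```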
What is meant by EDW?
EDW is the Enterprise Data Warehouse, meaning a centralised DW for the whole organization.
This is the Inmon approach, which relies on having a single centralised warehouse, whereas the Kimball approach says to have separate data marts for each vertical/department.
An advantage is a single source of data for all the users across the organization. The drawback to overcome is the time it takes to develop, and also the management required to build a centralised database.
How can we join tables if the tables have no primary and foreign key relation and no matching ports to join?
Without a common column or common data type, we can join two sources using dummy ports:
1. In each pipeline, add an Expression transformation with a dummy output port set to a constant value.
2. Use a Joiner transformation to join the sources on the dummy ports (in the join condition).
If the workflow has 5 sessions running sequentially and the 3rd session has failed, how can we run again from only the 3rd to the 5th session?
If multiple sessions in a concurrent batch fail, you might want to truncate all targets and run the batch again. However, if a session in a concurrent batch fails and the rest of the sessions complete successfully, you can recover the session as a standalone session. To recover a session in a concurrent batch: 1. Copy the failed session using Operations > Copy Session. 2. Drag the copied session outside the batch to be a standalone session. 3. Follow the steps to recover a standalone session. 4. Delete the standalone copy.
As per the question, all the sessions are serial, so you can start the 3rd session with "Start workflow from task"; from there it will continue to run the rest of the tasks.
What is the difference between STOP and ABORT at the session level in Informatica?
Stop: the server stops reading from the source but finishes processing the data it has already read, so the session can be recovered.
Abort: the session is killed and we can't simply restart it; we should truncate the targets in the pipeline and then start the session again.
What is the difference between a stored procedure (DB level) and the Stored Procedure transformation (Informatica level)? And why should we use the SP transformation?
First of all, stored procedures (at the DB level) are a series of SQL statements, stored and compiled on the server side. In Informatica, the Stored Procedure transformation uses those same stored procedures that are stored in the database. Stored procedures are used to automate time-consuming tasks that are too complicated for standard SQL statements. If you don't want to use a stored procedure, then you have to create an Expression transformation and do all the coding in it.
In my source table there are 1000 records. I want to load records 501 to 1000 into my target table. How can you do this?
One way: in the Source Qualifier's SQL override, use MINUS (select all the rows MINUS the first 500).
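As a plain-Python illustration (not Informatica code) of loading only the second half: number the rows as they arrive and keep rows 501 through 1000. In PowerCenter the same effect can come from a Sequence Generator or variable port plus a Filter; that mapping-level alternative is an assumption here, not something the answer above states.

```python
# Sketch of loading only records 501-1000 of a 1000-row source:
# assign a running row number and filter on its range.

def rows_501_to_1000(rows):
    return [row for i, row in enumerate(rows, start=1) if 501 <= i <= 1000]

src = list(range(1, 1001))          # stand-in for the 1000 source records
out = rows_501_to_1000(src)
print(out[0], out[-1], len(out))    # 501 1000 500
```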
Use a Sorter transformation. When you configure the Sorter to treat output rows as distinct, it configures all ports as part of the sort key, and it therefore discards duplicate rows compared during the sort operation.
Table1:
K1 X N N N N
K2 X N N N N
--------------------------
Table2:
K1 N X N N N
K2 N X N N N
--------------------------
But there can be a situation where either table contains duplicates, like:
K1 X N N N N
K1 Y N N N N
--------------------------
Because of this, we can't use an Aggregator/GROUP BY, as we are not sure which record should be removed.
Do the following in the Aggregator transformation; then take your original ports, with the duplicate key port's records eliminated, into map2.
When do we use a dynamic cache and when do we use a static cache, in connected and unconnected Lookup transformations?
We use a dynamic cache only for a connected lookup. We use a dynamic cache to check whether the record already exists in the target table or not, and depending on that we insert, update, or delete the records using an Update Strategy. The static cache is the default cache for both connected and unconnected lookups. If you select a static cache on the lookup table, the server won't update the cache, and the rows in the cache remain constant. We use this to check results, and also to update slowly changing records by comparing.
How do you merge multiple flat files, for example 100 flat files, without using a Union transformation?
By using a file list we can merge more than one flat file. When we are importing more than one flat file, we should set the source's filetype property to Indirect (the default is Direct) and supply the file which will have the addresses of all the flat files we are going to merge.
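The file-list (Indirect) idea can be sketched in Python: the "source" is a small text file listing the paths of the real flat files, and the reader walks the list and streams every file in turn, merging them without any Union.

```python
# Sketch of the INDIRECT file-list idea: read a list of file paths,
# then read every listed file and merge all their rows.
import os, tempfile

def read_file_list(list_path):
    rows = []
    with open(list_path) as lst:
        for path in lst:
            path = path.strip()
            if path:
                with open(path) as f:
                    rows.extend(line.rstrip("\n") for line in f)
    return rows

# Tiny demo: two temporary flat files and a file list naming them.
d = tempfile.mkdtemp()
for name, data in [("f1.txt", "a\nb\n"), ("f2.txt", "c\n")]:
    with open(os.path.join(d, name), "w") as f:
        f.write(data)
with open(os.path.join(d, "filelist.txt"), "w") as f:
    f.write(os.path.join(d, "f1.txt") + "\n" + os.path.join(d, "f2.txt") + "\n")

print(read_file_list(os.path.join(d, "filelist.txt")))  # ['a', 'b', 'c']
```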
I have a source flat file like: 1 a, 1 b, 1 c, 2 a, 2 b, 2 c.
I want the output as 1 a,b,c and 2 a,b,c.
How can I achieve this?
Use two variables, one as a counter and another for the value. Sort all the records by c1, keep track of c1, and hold its value in v1. When v1 and c1 are equal, concatenate c2 onto v2; otherwise assign c2 to v2. Finally connect to an Aggregator and take the last value, grouping the records by c1.
Take the mapping as follows (suppose the columns are c1, c2):
SD --> SQ --> Exp --> Agg --> Tgt
In exp define as
c1 <-- c1
c2 <-- c2
v_2 <-- IIF(c1 = v_1, TO_CHAR(v_2) || ', ' || TO_CHAR(c2), TO_CHAR(c2))
v_1 <-- c1
o_p1 <-- v_2
In Agg
Group by c1
c2 <-- v_2
c11 <-- LAST(c1)
Tgt is connected as
c1 <-- c11
c2 <-- c2
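The Exp --> Agg mapping above can be sketched in Python (an illustration of the logic, not Informatica code): with the rows sorted by c1 as the answer requires, the running concatenation per group is exactly a group-wise join of the c2 values.

```python
# Sketch of the concatenate-per-group mapping: for rows sorted by c1,
# join all c2 values of each c1 group (the Aggregator's LAST value).
from itertools import groupby

def group_concat(rows):
    # rows are (c1, c2) pairs already sorted by c1
    out = []
    for c1, grp in groupby(rows, key=lambda r: r[0]):
        out.append((c1, ", ".join(c2 for _, c2 in grp)))
    return out

rows = [("1", "a"), ("1", "b"), ("1", "c"), ("2", "a"), ("2", "b"), ("2", "c")]
print(group_concat(rows))  # [('1', 'a, b, c'), ('2', 'a, b, c')]
```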