In a scenario I have col1, col2, col3 with rows 1,x,y and 2,a,b, and I want them in
the form col1, col2 with rows 1,x and 1,y and 2,a and 2,b. What is the procedure?
Use a Normalizer transformation:
set the occurs value to 2 for the repeating column, so two input ports are created
for it. The Normalizer then generates one output row per occurrence;
connect its output ports to the target.
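A minimal Python sketch of what the Normalizer does to the rows in this scenario (the function name and row layout here are illustrative, not Informatica API):

```python
def normalize(rows, occurs=2):
    """One denormalized row with `occurs` repeating values becomes
    `occurs` output rows, each pairing the key with one value."""
    out = []
    for row in rows:
        key, values = row[0], row[1:1 + occurs]
        for value in values:
            out.append((key, value))
    return out

print(normalize([(1, "x", "y"), (2, "a", "b")]))
# -> [(1, 'x'), (1, 'y'), (2, 'a'), (2, 'b')]
```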
On day one I load 10 rows into my target, and the next day I get 10 more rows to be
added to my target, of which 5 are updated rows. How can I send them to the target?
How can I insert and update the records?
We can do this by identifying the granularity of the target table. We can then use a
CRC external procedure to compare the newly generated CRC number with the old
one; if they do not match, update the row.
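The CRC comparison idea can be sketched in Python with the standard library's CRC-32; this is only an illustration of the technique, not Informatica's CRC external procedure:

```python
import zlib

def row_crc(row):
    """CRC-32 over the concatenated column values of a row."""
    return zlib.crc32("|".join(str(v) for v in row).encode())

def needs_update(old_crc, new_row):
    """True when the incoming row's CRC differs from the stored one."""
    return row_crc(new_row) != old_crc

# Hypothetical rows: same key, one changed attribute.
stored = row_crc((101, "Smith", "NY"))
print(needs_update(stored, (101, "Smith", "LA")))  # True: row changed
print(needs_update(stored, (101, "Smith", "NY")))  # False: unchanged
```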
What is the method of loading 5 flat files having the same structure to a single
target, and which transformations can I use?
Use a Union transformation;
otherwise, write the paths of all five files into one list file and use that file in the
session properties as an indirect source.
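The indirect-source idea, one list file naming several same-structure files that are read as a single row stream, can be sketched like this (file names are made up):

```python
def read_indirect(list_file):
    """Yield data lines from every file named in `list_file`, one path
    per line, mimicking an indirect flat-file source."""
    with open(list_file) as lf:
        for path in (p.strip() for p in lf if p.strip()):
            with open(path) as f:
                for line in f:
                    yield line.rstrip("\n")
```

In the real session, Informatica does this merging itself once the source filetype is set to Indirect.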
When we create a target as a flat file and the source as Oracle, how can I specify
the first row as column names in the flat file?
Use a pre-SQL statement, but this is a hardcoded method: if you change the
column names or add extra columns to the flat file, you will have to change the
statement.
You can also achieve this by changing the setting in the Informatica Repository
Manager to display the column headings. The only disadvantage is that it will be
applied to all files generated by that server.
1. Can you explain one critical mapping? 2. Performance-wise, which is better:
a connected or an unconnected Lookup transformation?
It depends on your data and the type of operation you are doing.
If you need to look up a value for all the rows, or for most of the rows coming out
of the source, then go for a connected lookup.
If not, go for an unconnected lookup, especially in conditional cases. For example,
we have to get the value for the field 'customer' from the order table or from the
customer_data table, on the basis of the following rule:
if customer_name is null then customer = customer_data.Customer_Id,
otherwise customer = order.customer_name.
In this case we would go for an unconnected lookup.
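The conditional rule above can be sketched in Python: the lookup is invoked only when the condition requires it, which is exactly why an unconnected lookup pays off here (the table names and data are hypothetical):

```python
# Stand-ins for the order and customer_data tables.
orders = {1001: {"customer_name": "Acme"}, 1002: {"customer_name": None}}
customer_data = {1002: {"Customer_Id": "C-42"}}

def resolve_customer(order_id):
    """Mimic an unconnected lookup: only call out to customer_data
    when customer_name is null, instead of on every row."""
    name = orders[order_id]["customer_name"]
    if name is None:                 # condition triggers the lookup
        return customer_data[order_id]["Customer_Id"]
    return name

print(resolve_customer(1001))  # Acme  (no lookup needed)
print(resolve_customer(1002))  # C-42  (lookup fired)
```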
Dimensions are:
1. Slowly Changing Dimensions (SCD)
2. Rapidly Changing Dimensions
3. Junk Dimensions
4. Large Dimensions
5. Degenerate Dimensions
6. Conformed Dimensions
In an update strategy, which gives more performance, a target table or a flat file? Why?
Pros: loading, sorting, and merging operations are faster with a flat file because
there is no index concept and the data is in ASCII mode.
Cons: there is no concept of updating existing records in a flat file, and since there
are no indexes, lookups are slower.
How can you work with a remote database in Informatica? Did you work directly
using remote connections?
You can work with a remote database, but you have to configure an FTP
connection with the connection details:
IP address
User authentication
How the informatica server increases the session performance through partitioning
the source?
For relational sources, the Informatica server creates multiple connections, one for
each partition of a single source, and extracts a separate range of data on each
connection. The Informatica server reads multiple partitions of a single source
concurrently. Similarly, for loading, it creates multiple connections to the target and
loads the partitions of data concurrently.
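The "separate range of data for each connection" part can be sketched as a key-range splitter; this is only an illustration of the partitioning idea, not the server's actual algorithm:

```python
def key_ranges(min_key, max_key, partitions):
    """Split [min_key, max_key] into contiguous ranges, one per
    partition, so each connection extracts its own slice."""
    span = max_key - min_key + 1
    size, extra = divmod(span, partitions)
    ranges, start = [], min_key
    for i in range(partitions):
        end = start + size - 1 + (1 if i < extra else 0)
        ranges.append((start, end))
        start = end + 1
    return ranges

print(key_ranges(1, 100, 4))
# -> [(1, 25), (26, 50), (51, 75), (76, 100)]
```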
How can you recognise whether or not newly added rows in the source get
inserted into the target?
If it is a Type 2 dimension, the above answer is fine; but if you want information on
all the inserts and updates, use the session log file and configure its tracing level
to verbose. You will get a complete record of which rows were inserted and which
were not.
What is Data Driven?
The Informatica Server follows instructions coded into Update Strategy
transformations within the session mapping to determine how to flag rows for
insert, delete, update, or reject.
If the mapping for the session contains an Update Strategy transformation, this
field is marked Data Driven by default.
Input group
The Designer copies property information from the input ports of the input group
to create a set of output ports for each output group.
Output groups
There are two types of output groups:
user-defined groups
default group
You cannot modify or delete output ports or their properties.
What are the joiner caches?
The cache directory specifies the directory used to cache master records and the
index to these records. By default, the cache files are created in a directory
specified by the server variable $PMCacheDir. If you override the directory, make
sure it exists and contains enough disk space for the cache files. The directory can
be a mapped or mounted drive.
There are two types of cache in the Joiner:
1. Data cache
2. Index cache
Can you use the mapping parameters or variables created in one mapping in
another mapping?
No. You might want to use a workflow parameter/variable if you want it to be
visible to other mappings/sessions.
Start value and current value
Start value = current value when the session starts executing the underlying
mapping.
Start value <> current value while the session is in progress and the variable
value changes on one or more occasions.
The current value at the end of the session becomes the start value for the
subsequent run of the same session.
Which transformation do you need when using COBOL sources as source
definitions?
The Normalizer transformation, which is used to normalize the data, since COBOL
sources often consist of denormalized data.
What is a transformation?
It is a process of converting a given input to the desired output.
In how many ways can you update a relational source definition, and what are they?
Two ways:
1. Edit the definition
2. Reimport the definition
Where should you place the flat file to import its definition into the Designer?
There is no restriction on where to place the source file. From a performance point
of view it is better to place the file in the server's local src folder; if you need the
path, check the server properties available in the Workflow Manager.
It does not mean we cannot place it in any other folder, but if we place it in the
server's src folder it will be selected by default at session creation time.
The PowerCenter Server is a repository client application. It connects to the
Repository
Server and Repository Agent to retrieve workflow and mapping metadata from the
repository
database. When the PowerCenter Server requests a repository connection from the
Repository Server, the Repository Server starts and manages the Repository Agent.
The
Repository Server then re-directs the PowerCenter Server to connect directly to the
Repository Agent.
How do you read rejected (bad) data from the bad file and reload it to the target?
Find the rejected data using the column indicator and row indicator, correct it, and
send it to the target relational tables using the load order utility.
How can you create or import a flat file definition into the Warehouse Designer?
You can create a flat file definition in the Warehouse Designer: create a new
target, select the type as flat file, save it, and enter the various columns for the
created target by editing its properties. Once the target is created and saved, you
can import it from the Mapping Designer.
How do you get the first 100 rows from a flat file into the target?
1. Use the test/download option if you want it only for testing.
2. Put a counter/Sequence Generator in the mapping and filter on its value.
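Option 2 can be sketched in miniature: attach a sequence value to each row and pass only rows whose value is at most 100, which is what the Sequence Generator plus a Filter achieve:

```python
def first_n_rows(rows, n=100):
    """Sequence generator + filter in miniature: number each row and
    keep only those with sequence value <= n."""
    out = []
    for nextval, row in enumerate(rows, start=1):  # sequence generator
        if nextval <= n:                           # filter condition
            out.append(row)
        else:
            break
    return out

print(len(first_n_rows(range(1000), n=100)))  # 100
```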
Discuss which is better among incremental load, normal load, and bulk load.
It depends on the requirement. Otherwise an incremental load can be better, as it
takes only the data which is not already present in the target.
What are the differences between Informatica PowerCenter versions 6.2 and 7.1,
and between versions 6.2 and 5.1?
The main difference between Informatica 5.1 and 6.1 is that 6.1 introduced the
Repository Server, and in place of the Server Manager (5.1) it introduced the
Workflow Manager and Workflow Monitor.
What is the difference between the Informatica PowerCenter Server, Repository
Server, and repository?
The repository is a database in which all Informatica components are stored in the
form of tables. The Repository Server controls the repository and maintains data
integrity and consistency across the repository when multiple users use
Informatica. The PowerCenter Server (Informatica server) is responsible for
execution of the components (sessions) stored in the repository.
So create them using the same layout as your source tables, or use the Generate
SQL option in the Warehouse Designer tab.
In a filter expression we want to compare a date field with the DB2 system field
CURRENT DATE. Our syntax: datefield = CURRENT DATE (we didn't define it by
ports; it's a system field), but this is not valid (PMParser: Missing Operator). Can
someone help us? Thanks.
The DB2 date format is "yyyymmdd", whereas SYSDATE in Oracle gives
"dd-mm-yy", so conversion of the DB2 date format to the local database date
format is compulsory; otherwise you will get that type of error.
Briefly explain the versioning concept in PowerCenter 7.1.
How do we estimate the depth of the session scheduling queue? Where do we set
the maximum number of concurrent sessions that Informatica can run at a given
time?
Please be more specific on the first half of the question.
You set the maximum number of concurrent sessions on the Informatica server;
by default it is 10, and you can set it to any number.
Suppose a session is configured with a commit interval of 10,000 rows and the
source has 50,000 rows. Explain the commit points for source-based commit and
target-based commit. Assume appropriate values wherever required.
Source-based commit commits the data into the target based on the commit
interval, so for every 10,000 source rows it commits into the target.
Target-based commit commits the data into the target based on the buffer size of
the target, i.e. it commits whenever the buffer fills. If we assume the buffer holds
6,000 rows, then for every 6,000 rows it commits the data.
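The source-based commit points for this example can be computed directly; a small sketch of the arithmetic (the final commit flushing any remainder is an assumption about behaviour, stated here for illustration):

```python
def commit_points(total_rows, commit_interval):
    """Row counts at which a source-based commit fires: every multiple
    of the interval, plus a final commit for any remainder."""
    points = list(range(commit_interval, total_rows + 1, commit_interval))
    if not points or points[-1] != total_rows:
        points.append(total_rows)   # final commit flushes the remainder
    return points

print(commit_points(50_000, 10_000))
# -> [10000, 20000, 30000, 40000, 50000]
```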
What is the procedure to write a query to list the three highest salaries of
employees?
The following queries find the top three salaries.
In Oracle (using the emp table):
select * from emp e where 3 > (select count(*) from emp where sal > e.sal)
order by sal desc;
In SQL Server (using the emp table):
select top 3 sal from emp order by sal desc;
Which objects are required by the Debugger to create a valid debug session?
Initially the session should be a valid session; the source, target, lookups, and
expressions should be available; and at least one breakpoint should be set for the
Debugger to debug your session.
What is the limit on the number of sources and targets you can have in a mapping?
As far as I know there is no such restriction on the number of sources or targets
inside a mapping.
The real question is: if you make N tables participate at a time in processing, what
is the load on your database? From an organizational point of view it is never
encouraged to use N tables at a time, as it reduces database and Informatica
server performance.
What are variable ports and list two situations when they can be used?
We have mainly tree ports Inport, Outport, Variable port. Inport representsdat a is
flowing into
transformation. Outport is used when data is mapped to next transformation.
Variable port is used
when we mathematical caluculations are required. If any addition i will be more
than happy if you can
share.
How does the server recognise the source and target databases?
By using an ODBC connection if it is relational, and an FTP connection if it is a flat
file. We specify the connections in the properties of the session for both sources
and targets.
How do you retrieve the records from a rejected file? Explain with syntax or an
example.
During the execution of a workflow, all rejected rows are stored in bad files under
the directory where your Informatica server is installed (e.g. C:\Program
Files\Informatica PowerCenter 7.1\Server). These bad files can be imported as a
flat file source, and then through a direct mapping we can load them in the
desired format.
How do you delete duplicate rows in flat file sources? Is there an option in Informatica?
Use a Sorter transformation; it has a "distinct" option, so make use of it.
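The sort-then-distinct behaviour can be sketched in a few lines of Python; note that, as with a Sorter, the original row order is not preserved:

```python
def sorted_distinct(rows):
    """Sorter with the distinct option in miniature: sort the rows,
    then drop consecutive duplicates."""
    out = []
    for row in sorted(rows):
        if not out or row != out[-1]:
            out.append(row)
    return out

print(sorted_distinct([("2", "a"), ("1", "x"), ("1", "x")]))
# -> [('1', 'x'), ('2', 'a')]
```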
In the concept of mapping parameters and variables, the variable value is saved to
the repository after the completion of the session, and the next time the session
runs, the server takes the saved value from the repository and starts assigning
from the next value. For example, I ran a session and at the end it stored a value
of 50 in the repository. Next time I run the session, it should start with the value
of 70, not 51. How do I do this?
You can do one thing: after running the mapping, in the Workflow Manager go to
Start > session, right-click on the session and choose Persistent Values from the
menu. There you will find the last value stored in the repository for the mapping
variable. Remove it, put in your desired value, and run the session; your task
should be done.
Can we use an Aggregator (or any active transformation) after an Update Strategy
transformation?
You can use an Aggregator after an Update Strategy. The problem is that once you
perform the update strategy, say you have flagged some rows to be deleted, and
you then run an Aggregator transformation over all rows using the SUM function,
the deleted rows will be subtracted in this aggregation.
How do you handle decimal places while importing a flat file into Informatica?
While importing the flat file definition, just specify the scale for the numeric data
type. In the mapping, the flat file source supports only the number datatype (no
decimal or integer). The Source Qualifier associated with that source will have a
decimal datatype for that number port:
source (number datatype port) -> SQ (decimal datatype). Integer is not supported;
hence decimal takes care of it.
If your workflow is running slowly in Informatica, where do you start
troubleshooting and what steps do you follow?
When the workflow is running slowly, you have to find the bottlenecks,
in this order:
target
source
mapping
session
system
If you have four lookup tables in the workflow, how do you troubleshoot to improve
performance?
There are many ways to improve a mapping which has multiple lookups:
1) Create an index on the lookup table if we have permissions (staging area).
2) Divide the lookup mapping into two: (a) dedicate one to inserts (source minus
target, i.e. the new rows); only the new rows come into the mapping, so the
process is fast; (b) dedicate the second to updates (source intersect target, i.e. the
existing rows); only the rows which already exist come into the mapping.
3) Increase the cache size of the lookup.
Can anyone explain error handling in Informatica with examples, so that it is easy
to explain in an interview?
Go to the session log file; there we will find information on the session initiation
process, the errors encountered, and the load summary. By looking at the errors
encountered during the session run, we can resolve them.
There is also the bad file, which generally has the format *.bad and contains the
records rejected by the Informatica server. There are two kinds of indicators, one
for rows and one for columns. The row indicator signifies what operation was to
take place (insert, delete, update, etc.). The column indicators contain information
on why the column was rejected (violation of a not-null constraint, value error,
overflow, etc.). If one corrects the errors in the data present in the bad file and
then reloads the data into the target, the table will contain only valid data.
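A small sketch of reading those indicators from a bad-file line. The indicator codes used here (row: 0 insert, 1 update, 2 delete, 3 reject; column: D valid, O overflow, N null, T truncated) are the commonly documented PowerCenter values, but treat them as an assumption and verify against your server's documentation:

```python
ROW_INDICATORS = {0: "insert", 1: "update", 2: "delete", 3: "reject"}
COL_INDICATORS = {"D": "valid", "O": "overflow", "N": "null", "T": "truncated"}

def describe_bad_row(line):
    """Parse one comma-separated bad-file line of the (simplified) form
    '<row indicator>,<col indicator>,<value>,...'."""
    fields = line.strip().split(",")
    op = ROW_INDICATORS.get(int(fields[0]), "unknown")
    cols = [(COL_INDICATORS.get(fields[i], "?"), fields[i + 1])
            for i in range(1, len(fields) - 1, 2)]
    return op, cols

print(describe_bad_row("3,D,101,N,"))
# -> ('reject', [('valid', '101'), ('null', '')])
```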
How do I import VSAM files from source to target? Do I need a special plugin?
To my knowledge, you use the PowerExchange tool to convert the VSAM file to
Oracle tables, then do the mapping as usual to the target table.
Could anyone please tell me the steps required for a Type 2 dimension/version
data mapping? How can we implement it?
1. Determine whether the incoming row is (1) a new record, (2) an updated record,
or (3) a record that already exists in the table, using two Lookup transformations.
Split the mapping into three separate flows using a Router transformation.
2. For case (1), create a pipeline that inserts all the rows into the table.
3. For case (2), create two pipelines from the same source: one updating the old
record, one inserting the new version.
Hope this makes sense.
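The routing decision in step 1 can be sketched as a classifier; the `current` dictionary stands in for the dimension's current records and is purely illustrative:

```python
def classify(incoming, current):
    """Route an incoming row the way the lookups + Router would:
    'insert' if the key is new, 'version' if the attributes changed,
    'ignore' if the row already exists unchanged."""
    key, attrs = incoming
    if key not in current:
        return "insert"
    if current[key] != attrs:
        return "version"   # close the old record, insert a new version
    return "ignore"

current = {10: ("Smith", "NY")}
print(classify((10, ("Smith", "LA")), current))  # version
print(classify((11, ("Jones", "TX")), current))  # insert
print(classify((10, ("Smith", "NY")), current))  # ignore
```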
Without using an Update Strategy transformation, how can we update our target
table?
In the session properties there are options for how target rows are treated, such as:
insert
update as update
update as insert
update else insert
By using these we can easily solve it.
Two relational tables are connected to a Source Qualifier transformation. What are
the possible errors that can be thrown?
The only two possibilities I know of:
the tables should have a primary key/foreign key relationship;
the tables should be available in the same schema or the same database.
What is the best way to show metadata (number of rows at source, target, and
each transformation level, plus error-related data) in report format?
You can select these details from the repository tables; use the view
REP_SESS_LOG to get this data.
If you had to split the source-level key going into two separate tables, one as a
surrogate key and the other as a primary key, and since Informatica does not
guarantee keys are loaded in order into those tables, what are the different ways
you could handle this type of situation?
Use a foreign key relationship between the two tables.
What are cost-based and rule-based approaches, and what is the difference?
Cost-based and rule-based approaches are optimization techniques used in
databases to optimize a SQL query. Oracle provides two types of optimizer (indeed
three, but we use only these two, because the third has disadvantages). Whenever
you process a SQL query in Oracle, the engine reads the query and decides the
best possible way to execute it; in this process, Oracle follows these optimization
techniques.
1. Cost-based optimizer (CBO): if a SQL query can be executed in two different
ways (say path 1 and path 2 for the same query), the CBO calculates the cost of
each path, analyses which path's cost of execution is lower, and executes that
path, thereby optimizing the query execution.
2. Rule-based optimizer (RBO): this follows the rules which are needed for
executing a query, so depending on the number of rules to be applied, the
optimizer runs the query.
Use:
If the table you are querying has already been analysed, Oracle goes with the CBO.
If the table is not analysed, Oracle follows the RBO.
For the first time, if the table is not analysed, Oracle goes with a full table scan.
What is MicroStrategy? What is it used for? Can anyone explain it in detail?
MicroStrategy is another BI tool, a HOLAP tool: you can create two-dimensional
reports and also cubes in it. It is basically a reporting tool, with a full range of
reporting on the web as well as in Windows.
build a data warehouse. Can you please tell me what those 15 questions to ask a
company, say a telecom company, should be?
First of all, meet your sponsors and make a BRD (business requirements
document) about their expectations of this data warehouse (the main aim comes
from them). For example, they need the customer billing process. Now go to the
business management team; they can ask for metrics out of the billing process for
their use. Management people may want monthly usage, billing metrics, sales
organization, and rate plan data to perform sales rep, channel performance, and
rate plan analysis. So your dimension tables can be: Customer (customer id, name,
city, state, etc.), Sales rep (sales rep number, name, id), Sales org (sales org id),
Bill dimension (bill number, bill date), Rate plan (rate plan code); and the fact
table can be Billing details (bill number, customer id, minutes used, call details,
etc.). You can follow a star or snowflake schema in this case, depending on the
granularity of your data.
What is the difference between a cached lookup and an uncached lookup? Can I
run the mapping without starting the Informatica server?
When you configure the Lookup transformation as a cached lookup, it stores all
the lookup table data in the cache when the first input record enters the
transformation; the SELECT statement executes only once, and the values of each
input record are compared with the values in the cache. In an uncached lookup,
the SELECT statement executes for each input record entering the transformation,
and it has to connect to the database each time a new record enters.
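The difference in database traffic can be sketched with a toy source that counts its queries; the class and names here are made up for illustration:

```python
class CountingSource:
    """Pretend database that counts how many SELECTs are issued."""
    def __init__(self, table):
        self.table, self.queries = dict(table), 0
    def select_all(self):
        self.queries += 1
        return dict(self.table)
    def select_one(self, key):
        self.queries += 1
        return self.table.get(key)

def lookup_cached(source, keys):
    cache = source.select_all()          # one query, then hit the cache
    return [cache.get(k) for k in keys]

def lookup_uncached(source, keys):
    return [source.select_one(k) for k in keys]   # one query per row

table = {1: "a", 2: "b"}
s1, s2 = CountingSource(table), CountingSource(table)
lookup_cached(s1, [1, 2, 1]); lookup_uncached(s2, [1, 2, 1])
print(s1.queries, s2.queries)  # 1 3
```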
If a session fails after loading 10,000 records into the target, how can you load
from the 10,001st record when you run the session next time in Informatica 6.1?
Running the session in recovery mode will work, but the target load type should be
normal. If it is bulk, recovery will not work as expected.
What are mapping parameters and variables, and in which situations can we use
them?
If we need to change certain attributes of a mapping every time the session is run,
it would be very difficult to edit the mapping and change the attribute each time.
So we use mapping parameters and variables and define the values in a
parameter file; then we can edit the parameter file to change the attribute values.
This makes the process simple.
What is a worklet, what is it used for, and in which situations can we use it?
A set of workflow tasks is called a worklet. Workflow tasks include:
1) Timer, 2) Decision, 3) Command, 4) Event Wait, 5) Event Raise, 6) Email, etc.
We use a worklet to reuse such a set of tasks in different situations.
What is the difference between a dimension table and a fact table, and what are
the different types of dimension and fact tables?
A fact table contains measurable data, with fewer columns and many rows; its
primary key is typically a composite of the dimension keys.
Different types of fact tables: additive, non-additive, semi-additive.
A dimension table contains textual descriptions of data, with many columns and
fewer rows; it contains a primary key.
What logic will you implement to load the data into one fact table from 'n' number
of dimension tables?