Data Warehousing Interview Questions

Data Warehouse Architecture


[Figure: Architecture for data warehouse. Labels recoverable from the diagram: data source,
staging area, warehouse (metadata, raw data, summary data), data mart, analysis, reporting;
operational data, external data, load manager, warehouse manager (detailed information,
summary info, meta data), query manager, OLAP tools, data dippers; information, decision.]


2. Data warehouse VS OLTP

Data warehouse: Traditional relational databases.
OLTP: Relational databases.

Data warehouse: Time-variant, subject-oriented, non-volatile and integrated.
OLTP: Designed to maintain atomicity, consistency and integrity.

Data warehouse: Data is not updated.
OLTP: Rows are inserted or updated.

3. Data warehouse VS Data mart

Data warehouse: Used at the enterprise level.

Data mart: Used at the business division or department level. A data mart contains only
the subject data required for local analysis.

4. Data warehouse VS operational Data store (ODS)

Data warehouse: Traditional relational database.
ODS: Integrated database.

Data warehouse: Information contains years of data.
ODS: Information contains 30 to 90 days of data.

5) Mapplet VS Reusable transformation

1. A mapplet consists of a set of transformations that is reusable.
A reusable transformation is a single transformation that can be reused.
2. Variables or parameters created in a mapplet cannot be used in another
mapping or mapplet.
Variables created in a reusable transformation can be used in any other
mapping or mapplet.
3. A source definition cannot be included in a reusable transformation, but a source can be
added to a mapplet.
A COBOL source cannot be used in a mapplet.

6) OLAP VS Data mining

OLAP: Presentation tool; reports on data.
Data mining: Not a presentation tool.

OLAP: Hypothesis driven.
Data mining: Discovers patterns in the data.

OLAP: Designed for faster response.
Data mining: Brings out all the hypotheses fitting the data.

Data mining: Designed to produce generalizations about the data.

7) Power center VS power mart

PowerCenter: Applicable to high-end warehouses.
PowerMart: Applicable to low- to mid-range warehouses.

PowerCenter: No limit on repositories.
PowerMart: No limit on repositories.

PowerCenter: Global repository supported.
PowerMart: Global repository not supported.

PowerCenter: Local repository supported.
PowerMart: Local repository supported.

PowerCenter: ERP support available.
PowerMart: ERP support not available.

1. Can two fact tables share the same dimension tables? How many dimension tables
are associated with one fact table in your project?
Ans: Yes

2. What are ROLAP, MOLAP, and DOLAP?


Ans: ROLAP (Relational OLAP), MOLAP (Multidimensional OLAP), and DOLAP (Desktop
OLAP). In these three OLAP architectures, the interface to the analytic layer is typically the
same; what is quite different is how the data is physically stored.

In MOLAP, the premise is that online analytical processing is best implemented by storing
the data multidimensionally; that is, data must be stored multidimensionally in order to be
viewed in a multidimensional manner.

In ROLAP, the premise is that the data should be stored in the relational model; that is,
OLAP capabilities are best provided directly against the relational database.

DOLAP is a variation that exists to provide portability for the OLAP user. It creates
multidimensional datasets that can be transferred from server to desktop, requiring
only the DOLAP software to exist on the target system. This provides significant advantages
to portable computer users, such as salespeople who are frequently on the road and do not
have direct access to their office server.

3. What is an MDDB, and what is the difference between MDDBs and RDBMSs?
Ans: There are two primary technologies used for storing the data used in OLAP
applications:
1. Multidimensional databases (MDDB)
2. Relational databases (RDBMS)
The major difference between MDDBs and RDBMSs is in how they store data. Relational
databases store their data in a series of tables and columns. Multidimensional databases, on
the other hand, store their data in large multidimensional arrays.

Advantages of MDDB:
1. Data retrieval is very fast because the data corresponding to any combination of
dimension members can be retrieved with a single I/O.
2. Data is clustered compactly in a multidimensional array.
3. Values are calculated ahead of time.
4. The index is small and can therefore usually reside completely in memory.
5. Storage is very efficient because the blocks contain only data.
6. A single index locates the block corresponding to a combination of sparse
dimension numbers.
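
The storage difference can be sketched in a few lines of Python. This is only an illustration
with made-up dimensions (products, regions, months) and made-up sales figures; it does not
model how any particular MDDB product lays out its blocks or indexes.

```python
# Hypothetical dimensions and facts, invented for this illustration.
products = ["tea", "coffee"]
regions = ["east", "west"]
months = ["jan", "feb", "mar"]

# Relational style: one row per fact; a lookup scans rows (or uses an index).
rows = [
    ("tea", "east", "jan", 10),
    ("tea", "west", "feb", 7),
    ("coffee", "east", "mar", 12),
]

def relational_lookup(product, region, month):
    for p, r, m, units in rows:
        if (p, r, m) == (product, region, month):
            return units
    return None

# Multidimensional style: a dense array addressed by dimension member positions.
cube = [[[None for _ in months] for _ in regions] for _ in products]
for p, r, m, units in rows:
    cube[products.index(p)][regions.index(r)][months.index(m)] = units

def cube_lookup(product, region, month):
    # The cell is reached directly from the member positions, without scanning facts.
    return cube[products.index(product)][regions.index(region)][months.index(month)]
```

Both lookups return the same value for a stored combination; the array form also reserves a
cell for every combination that has no data, which is why sparsity matters in MDDB design.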

4. What is MDB modeling and RDB Modeling?


Ans:

5. What is a mapplet and how do you create one?


Ans: A mapplet is a reusable object that represents a set of transformations. It allows you to
reuse transformation logic and can contain as many transformations as you need. Create a
mapplet when you want to use a standardized set of transformation logic in several mappings.
For example, if you have several fact tables that require a series of dimension keys, you can
create a mapplet containing a series of Lookup transformations to find each dimension key.
You can then use the mapplet in each fact table mapping, rather than recreate the same
lookup logic in each mapping.
To create a new mapplet:
1. In the Mapplet Designer, choose Mapplets-Create Mapplet.
2. Enter a descriptive mapplet name.
The recommended naming convention for mapplets is mpltMappletName.
3. Click OK.
The Mapping Designer creates a new mapplet in the Mapplet Designer.
4. Choose Repository-Save.

6. What are transformations used for?


Ans: Transformations are the manipulation of data from how it appears in the source
system(s) into another form in the data warehouse or mart in a way that enhances or
simplifies its meaning. In short, you transform data into information. This includes
data merging, cleansing, and aggregation.

Data merging: The process of standardizing data types and fields. Suppose one source system
stores integer data as smallint whereas another stores similar data as decimal. The data
from the two source systems needs to be rationalized when moved into the Oracle data type
called NUMBER.

Cleansing: This involves identifying and correcting any inconsistencies or inaccuracies:
- Eliminating inconsistencies in the data from multiple sources.
- Converting data from different systems into a single consistent data set suitable for
analysis.
- Meeting a standard for establishing data elements, codes, domains, formats and
naming conventions.
- Correcting data errors and filling in missing data values.

Aggregation: The process whereby multiple detailed values are combined into a single
summary value, typically a summation of numbers representing dollars spent or units
sold. It generates summarized data for use in aggregate fact and dimension tables.
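
The three steps can be sketched in plain Python. The two source extracts, the field names and
the choice to fill a missing unit count with 0 are all assumptions made for this example; a
real ETL tool would apply whatever rules the warehouse design specifies.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical extracts from two source systems with differing types and codes.
source_a = [{"store": "NY ", "units": 3}, {"store": "la", "units": 5}]
source_b = [{"store": "ny", "units": Decimal("2")}, {"store": "LA", "units": None}]

def merge_and_cleanse(*sources):
    """Standardize types and codes, and fill in missing values."""
    cleaned = []
    for source in sources:
        for record in source:
            units = record["units"]
            cleaned.append({
                # Cleansing: one consistent store code format.
                "store": record["store"].strip().upper(),
                # Merging: one numeric type; missing values become 0 (an assumption).
                "units": int(units) if units is not None else 0,
            })
    return cleaned

def aggregate(records):
    """Combine detailed values into one summary value per store."""
    totals = defaultdict(int)
    for record in records:
        totals[record["store"]] += record["units"]
    return dict(totals)

summary = aggregate(merge_and_cleanse(source_a, source_b))
```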

Data transformation is an interesting concept in that some transformation can occur during
the “extract,” some during the “transformation,” or even, in limited cases, during the “load”
portion of the ETL process. The type of transformation function you need will most often
determine where it should be performed. Some transformation functions could even be
performed in more than one place, because many of the transformations you will want to perform
already exist in some form or another in more than one of the three environments (source
database or application, ETL tool, or the target database).

7. What is the difference between OLTP and OLAP?


Ans: OLTP stands for Online Transaction Processing. This is a standard, normalized database
structure. OLTP is designed for Transactions, which means that inserts, updates, and deletes
must be fast. Imagine a call center that takes orders. Call takers are continually taking calls
and entering orders that may contain numerous items. Each order and each item must be
inserted into a database. Since the performance of database is critical, we want to maximize
the speed of inserts (and updates and deletes). To maximize performance, we typically try
to hold as few records in the database as possible.

OLAP stands for Online Analytical Processing. OLAP is a term that means many things to
many people. Here, we will use the term OLAP and Star Schema pretty much
interchangeably. We will assume that a star schema database is an OLAP system. (This is not
the same thing that Microsoft calls OLAP; they extend OLAP to mean the cube structures
built using their product, OLAP Services.) Here, we will assume that any system of read-only,
historical, aggregated data is an OLAP system.

A data warehouse (or mart) is a way of storing data for later retrieval. This retrieval is almost
always used to support decision-making in the organization. That is why many data
warehouses are considered to be DSS (Decision-Support Systems).

Both a data warehouse and a data mart are storage mechanisms for read-only, historical,
aggregated data. By read-only, we mean that the person looking at the data won’t be
changing it. If a user wants to look at yesterday’s sales for a certain product, they should not
have the ability to change that number.

The “historical” part may just be a few minutes old, but usually it is at least a day old. A data
warehouse usually holds data that goes back a certain period in time, such as five years. In
contrast, standard OLTP systems usually only hold data as long as it is “current” or active.
An order table, for example, may move orders to an archive table once they have been
completed, shipped, and received by the customer.

When we say that data warehouses and data marts hold aggregated data, we need to stress
that there are many levels of aggregation in a typical data warehouse.

8. If the data source is an Excel spreadsheet, how do you use it?
Ans: PowerMart and PowerCenter treat a Microsoft Excel source as a relational database,
not a flat file. Like relational sources, the Designer uses ODBC to import a Microsoft Excel
source. You do not need database permissions to import Microsoft Excel sources. To import
an Excel source definition, you need to complete the following tasks:
1. Install the Microsoft Excel ODBC driver on your system.
2. Create a Microsoft Excel ODBC data source for each source file in the ODBC 32-bit
Administrator.
3. Prepare Microsoft Excel spreadsheets by defining ranges and formatting columns of
numeric data.
4. Import the source definitions in the Designer.
Once you define ranges and format cells, you can import the ranges in the
Designer. Ranges display as source definitions when you import the source.

9. Which databases are RDBMSs and which are MDDBs? Can you name them?


Ans: MDDB examples: Oracle Express Server (OES), Essbase by Hyperion Software, PowerPlay by
Cognos.
RDBMS examples: Oracle, SQL Server, DB2…

10. What are the modules/tools in Business Objects? Explain their purpose briefly.
Ans: BO Designer, Business Query for Excel, BO Reporter, InfoView, Explorer, WEBI, BO Publisher,
and Broadcast Agent, BO (ZABO).
InfoView: IT portal entry into WebIntelligence & Business Objects.
Base module required for all options to view and refresh reports.
Reporter: Upgrade to create/modify reports on LAN or Web.
Explorer: Upgrade to perform OLAP processing on LAN or Web.
Designer: Creates semantic layer between user and database.
Supervisor: Administer and control access for group of users.
WebIntelligence: Integrated query, reporting, and OLAP analysis over the Web.
Broadcast Agent: Used to schedule, run, publish, push, and broadcast pre-built reports and
spreadsheets, including event notification and response capabilities, event filtering, and
calendar-based notification, over the LAN, e-mail, pager, fax, Personal Digital Assistant (PDA),
Short Messaging Service (SMS), etc.
Set Analyzer: Applies set-based analysis to perform functions such as exclusion,
intersections, unions, and overlaps visually.
Developer Suite: Build packaged, analytical, or customized apps.

11. What are ad hoc queries and canned queries/reports, and how do you create
them? (Please check this page……C\:BObjects\Quries\Data Warehouse - About
Queries.htm)
Ans: The data warehouse will contain two types of query. There will be fixed queries that are
clearly defined and well understood, such as regular reports, canned queries (standard
reports) and common aggregations. There will also be ad hoc queries that are unpredictable,
both in quantity and frequency.

Ad hoc query: Ad hoc queries are the starting point for any analysis into a database. A
business analyst first wants to know what is inside the database, and then proceeds by
calculating totals, averages, maximum and minimum values for most attributes within the
database. These are the unpredictable element of a data warehouse. It is exactly that ability
to run any query when desired and expect a reasonable response that makes the data warehouse
worthwhile, and makes the design such a significant challenge.
The end-user access tools are capable of automatically generating the database query that
answers any question posed by the user. The user will typically pose questions in terms that
they are familiar with (for example, sales by store last week); this is converted into the
database query by the access tool, which is aware of the structure of information within the
data warehouse.

Canned queries: Canned queries are predefined queries. In most instances, canned queries
contain prompts that allow you to customize the query for your specific needs. For example,
a prompt may ask you for a school, department, term, or section ID. In this instance you
would enter the name of the school, department or term, and the query will retrieve the
specified data from the warehouse. You can measure the resource requirements of these
queries, and the results can be used for capacity planning and for database design.

The main reason for using a canned query or report rather than creating your own is that
your chances of misinterpreting data or getting the wrong answer are reduced. You are
assured of getting the right data and the right answer.
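
A canned query with prompts behaves much like a parameterized SQL statement. The sketch below
uses Python's built-in sqlite3 module with an invented enrollment table and invented values;
the table, column and function names are assumptions for illustration, not part of any
reporting tool.

```python
import sqlite3

# In-memory database with a hypothetical enrollment table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE enrollment (school TEXT, department TEXT, term TEXT, students INTEGER)"
)
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?, ?)",
    [
        ("Engineering", "Civil", "Fall", 120),
        ("Engineering", "Electrical", "Fall", 95),
        ("Arts", "History", "Fall", 60),
    ],
)

# The canned query: the SQL text is fixed, only the prompted values vary.
CANNED_ENROLLMENT_BY_SCHOOL = (
    "SELECT department, students FROM enrollment "
    "WHERE school = ? AND term = ? ORDER BY department"
)

def run_canned_query(school, term):
    """Run the predefined query with the values the user typed at the prompts."""
    return conn.execute(CANNED_ENROLLMENT_BY_SCHOOL, (school, term)).fetchall()
```

Because the query text never changes, its resource use can be measured once and reused for
capacity planning, which is the point made above.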

12. How many fact tables and how many dimension tables did you build? Which table
precedes what?
Ans: http://www.ciobriefings.com/whitepapers/StarSchema.asp

13. What is the difference between STAR SCHEMA & SNOW FLAKE SCHEMA?
Ans: http://www.ciobriefings.com/whitepapers/StarSchema.asp

14. Why did you choose the STAR SCHEMA only? What are the benefits of the STAR SCHEMA?
Ans: Because of its denormalized structure, i.e., the dimension tables are denormalized. As for
why to denormalize, the first (and often only) answer is speed. An OLTP structure is designed for
data inserts, updates, and deletes, but not data retrieval. Therefore, we can often squeeze
some speed out of it by denormalizing some of the tables and having queries go against fewer
tables. These queries are faster because they perform fewer joins to retrieve the same
recordset. Joins are also confusing to many end users. By denormalizing, we can present the
user with a view of the data that is far easier for them to understand.

Benefits of STAR SCHEMA:


1. Far fewer tables.
2. Designed for analysis across time.
3. Simplifies joins.
4. Less database space.
5. Supports “drilling” in reports.
6. Flexibility to meet business and technical needs.
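
As a rough illustration of "simplifies joins", here is a tiny star schema built with Python's
sqlite3 module: one fact table joined directly to two denormalized dimension tables. All table
names, column names and figures are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Denormalized dimensions: category lives on the product row itself.
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT, category TEXT);
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
-- The fact table holds only dimension keys and measures.
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    units_sold INTEGER
);
INSERT INTO dim_product VALUES (1, 'Green tea', 'Tea'), (2, 'Espresso', 'Coffee');
INSERT INTO dim_date VALUES (20240101, 2024, 1), (20240201, 2024, 2);
INSERT INTO fact_sales VALUES (1, 20240101, 10), (1, 20240201, 4), (2, 20240101, 7);
""")

# Every dimension is exactly one join away from the fact table.
sales_by_category_month = conn.execute("""
    SELECT p.category, d.month, SUM(f.units_sold)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    JOIN dim_date d ON d.date_key = f.date_key
    GROUP BY p.category, d.month
    ORDER BY p.category, d.month
""").fetchall()
```

In a snowflaked design the category would sit in its own table, adding one more join to the
same question.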

15. How do you load the data using Informatica?


Ans: Using a session.

16. (i) What is FTP? (ii) How do you connect to a remote machine? (iii) Is there another way to
use FTP without a special utility?
Ans: (i): The FTP (File Transfer Protocol) utility program is commonly used for copying files
to and from other computers. These computers may be at the same site or at different sites
thousands of miles apart. FTP is a general protocol that works on UNIX systems as well as
on non-UNIX systems.

(ii): Remote connect command: ftp machinename

ex: ftp 129.82.45.181 or ftp iesg
If the remote machine has been reached successfully, FTP responds by asking for a
login name and password. When you enter your own login name and password for the remote
machine, it returns a prompt like ftp> and permits you access to your own home
directory on the remote machine. You should be able to move around in your own directory and
to copy files to and from your local machine using the FTP interface commands.
Note: You can set the mode of file transfer to ASCII (the default; it transmits seven bits per
character). Use the ASCII mode with any of the following:
- Raw data (e.g. *.dat or *.txt, codebooks, or other plain text documents)
- SPSS Portable files
- HTML files
You can also set the mode of file transfer to binary. The binary mode transmits all eight bits
per byte, thus provides less chance of a transmission error, and must be used to transmit files
other than ASCII files.
For example, use binary mode for the following types of files:
- SPSS System files
- SAS Dataset
- Graphic files (e.g., *.gif, *.jpg, *.bmp, etc.)
- Microsoft Office documents (*.doc, *.xls, etc.)
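
The choice between the two modes can be scripted. The sketch below uses Python's standard
ftplib module; the extension list is taken from the examples above (with .html and .htm assumed
for "HTML files"), and treating every other extension as binary is an assumption of this
example. The download helper is not exercised here because it needs a live server.

```python
import os
from ftplib import FTP

# Extensions treated as text; anything else is transferred in binary mode.
ASCII_EXTENSIONS = {".dat", ".txt", ".html", ".htm"}

def transfer_mode(filename):
    """Return "ascii" or "binary" based on the file extension."""
    ext = os.path.splitext(filename)[1].lower()
    return "ascii" if ext in ASCII_EXTENSIONS else "binary"

def download(ftp: FTP, filename: str) -> None:
    """Fetch one file from an already logged-in FTP connection in the chosen mode."""
    if transfer_mode(filename) == "ascii":
        with open(filename, "w") as out:
            # retrlines passes each line without its line ending, so add one back.
            ftp.retrlines("RETR " + filename, lambda line: out.write(line + "\n"))
    else:
        with open(filename, "wb") as out:
            ftp.retrbinary("RETR " + filename, out.write)
```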

(iii): Yes. If you are using Windows, you can access a text-based FTP utility from a DOS
prompt.
To do this, perform the following steps:
1. From the Start menu, choose Programs > MS-DOS Prompt.
2. Enter “ftp ftp.geocities.com”. A prompt will appear.
(or)
Enter ftp to get the ftp prompt, then ftp> open hostname, ex. ftp> open ftp.geocities.com (it
connects to the specified host).
3. Enter your Yahoo! GeoCities member name.
4. Enter your Yahoo! GeoCities password.
You can now use standard FTP commands to manage the files in your Yahoo! GeoCities
directory.

17. What command is used to transfer multiple files at a time using FTP?


Ans: mget: To copy multiple files from the remote machine to the local machine. You will be
prompted for a y/n answer before transferring each file. ex: mget * (copies all files in the
current remote directory to your current local directory, using the same file names).

mput: To copy multiple files from the local machine to the remote machine.

18. What is a Filter transformation? Or what options do you have in the Filter
transformation?
Ans: The Filter transformation provides the means for filtering records in a mapping. You
pass all the rows from a source transformation through the Filter transformation, then enter
a filter condition for the transformation. All ports in a Filter transformation are input/output,
and only records that meet the condition pass through the Filter transformation.

Note: Discarded rows do not appear in the session log or reject files. To maximize
session performance, include the Filter transformation as close to the sources in the
mapping as possible. Rather than passing records you plan to discard through the mapping,
you then filter out unwanted data early in the flow of data from sources to targets.

You cannot concatenate ports from more than one transformation into the Filter
transformation; the input ports for the filter must come from a single transformation. Filter
transformations exist within the flow of the mapping and cannot be unconnected. The Filter
transformation does not allow setting output default values.

19. What are the default sources supported by Informatica PowerMart?


Ans :
Relational tables, views, and synonyms.
Fixed-width and delimited flat files that do not contain binary data.
COBOL files.

20. When do you create the source definition? Can I use this source definition with any
transformation?
Ans: When working with a file that contains fixed-width binary data, you must create the
source definition. The Designer displays the source definition as a table, consisting of
names, datatypes, and constraints. To use a source definition in a mapping, connect a
source definition to a Source Qualifier or Normalizer transformation. The Informatica Server
uses these transformations to read the source data.

21. What are active and passive transformations?


Ans: An active transformation can change the number of records passed through it. For
example, the Filter transformation removes rows that do not meet the filter condition defined
in the transformation.
A passive transformation never changes the record count.
Active transformations that might change the record count include the following:
1.Advanced External Procedure
2.Aggregator
3.Filter
4.Joiner
5.Normalizer
6.Rank
7.Source Qualifier
Note: If you use PowerConnect to access ERP sources, the ERP Source Qualifier is also an
active transformation. You can connect only one of these active transformations to the
same transformation or target, since the Informatica Server cannot determine how to
concatenate data from different sets of records with different numbers of rows.

Passive transformations that never change the record count include the following:
1.Lookup
2.Expression
3.External Procedure
4.Sequence Generator
5.Stored Procedure
6.Update Strategy

You can connect any number of these passive transformations, or connect one active
transformation with any number of passive transformations, to the same transformation or
target.
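
The distinction can be pictured with two plain Python functions. This is not Informatica code;
the function names, the sample rows and the derived column are invented purely to show the
row-count behaviour.

```python
# Hypothetical input rows.
rows = [{"id": 1, "amount": 50}, {"id": 2, "amount": -5}, {"id": 3, "amount": 20}]

def filter_rows(rows, condition):
    """Active: drops rows that fail the condition, so the row count can change."""
    return [row for row in rows if condition(row)]

def expression(rows, compute):
    """Passive: adds derived columns to every row, so the row count is unchanged."""
    return [{**row, **compute(row)} for row in rows]

filtered = filter_rows(rows, lambda row: row["amount"] > 0)
doubled = expression(rows, lambda row: {"double_amount": row["amount"] * 2})
```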

22. What is staging Area and Work Area?


Ans: Staging Area : -
- Holding Tables on DW Server.
- Loaded from Extract Process
- Input for Integration/Transformation
- May function as Work Areas
- Output to a work area or Fact Table
Work Area: -
- Temporary Tables
- Memory

23. What is Metadata?


Ans: Defn: “Data About Data”
Metadata contains descriptive data for end users. In a data warehouse the term metadata is
used in a number of different situations.
Metadata is used for:
1.Data transformation and load
2.Data management
3.Query management

Data transformation and load:


Metadata may be used during data transformation and load to describe the source data and
any changes that need to be made. The advantage of storing metadata about the data being
transformed is that as source data changes the changes can be captured in the metadata,
and transformation programs automatically regenerated.
For each source data field the following information is required:
Source Field:
- Unique identifier (to avoid any confusion between two fields of the same name from
different sources).
- Name (local field name).
- Type (storage type of data, such as character, integer, floating point and so on).
- Location
  - system (the system it comes from, e.g. Accounting system).
  - object (the object that contains it, e.g. Account table).
The destination field needs to be described in a similar way to the source:
Destination:
- Unique identifier
- Name
- Type (database data type, such as Char, Varchar, Number and so on).
- Table name (name of the table the field will be part of).

The other information that needs to be stored is the transformation or transformations that
need to be applied to turn the source data into the destination data:
Transformation:
- Transformation(s)
  - Name
  - Language (name of the language that the transformation is written in).
  - Module name
  - Syntax
The Name is the unique identifier that differentiates this from any other similar
transformations. The Language attribute contains the name of the language that the
transformation is written in. The other attributes are module name and syntax. Generally
these will be mutually exclusive, with only one being defined. For simple transformations
such as simple SQL functions the syntax will be stored. For complex transformations the
name of the module that contains the code is stored instead.
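
One possible way to hold this transformation-and-load metadata is as simple record types. The
sketch below uses Python dataclasses; the class names, attribute names and sample values are
assumptions for illustration. The text says module name and syntax are "generally" mutually
exclusive; enforcing exactly one of the two is a stricter choice made only for this example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SourceField:
    identifier: str      # unique across all source systems
    name: str            # local field name
    data_type: str       # storage type, e.g. character, integer
    system: str          # system the field comes from
    source_object: str   # object (table, file) that contains it

@dataclass
class DestinationField:
    identifier: str
    name: str
    data_type: str       # database type, e.g. Char, Varchar, Number
    table_name: str

@dataclass
class Transformation:
    name: str
    language: str
    module_name: Optional[str] = None
    syntax: Optional[str] = None

    def __post_init__(self):
        # Stricter than the text: require exactly one of module_name or syntax.
        if (self.module_name is None) == (self.syntax is None):
            raise ValueError("define exactly one of module_name or syntax")

# Hypothetical example: a simple SQL function stored as syntax.
src = SourceField("ACC.ACCOUNT.cust_nm", "cust_nm", "character",
                  "Accounting system", "Account table")
dst = DestinationField("DW.CUSTOMER.customer_name", "customer_name", "Varchar", "CUSTOMER")
upper_name = Transformation(name="upper_name", language="SQL", syntax="UPPER(cust_nm)")
```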

Data management:
Metadata is required to describe the data as it resides in the data warehouse. This is needed
by the warehouse manager to allow it to track and control all data movements. Every object in
the database needs to be described.
Metadata is needed for all the following:
- Tables
  - Columns
    - name
    - type
- Indexes
  - Columns
    - name
    - type
- Views
  - Columns
    - name
    - type
- Constraints
  - name
  - type
  - table
  - columns
Aggregations and partition information also need to be stored in metadata (for details refer to
page # 30).
Query Generation:
Metadata is also required by the query manager to enable it to generate queries. The same
metadata that the warehouse manager uses to describe the data in the data warehouse
is also required by the query manager.
The query manager will also generate metadata about the queries it has run. This metadata
can be used to build a history of all queries run and generate a query profile for each user,
group of users and the data warehouse as a whole.
The metadata that is required for each query is:
- query
- tables accessed
- columns accessed
- name
- reference identifier
- restrictions applied
- column name
- table name
- reference identifier
- restriction
- join Criteria applied


……
……
- aggregate functions used
……
……
- group by criteria
……
……
- sort criteria
……
……
- syntax
- execution plan
- resources
……

24. Which Unix flavors are you experienced with?


Ans: Solaris 2.5 / SunOS 5.5 (operating system)
Solaris 2.6 / SunOS 5.6 (operating system)
Solaris 2.8 / SunOS 5.8 (operating system)
AIX 4.0.3

SunOS    Solaris   Released   Platforms
5.5.1    2.5.1     May 96     sun4c, sun4m, sun4d, sun4u, x86, ppc
5.6      2.6       Aug. 97    sun4c, sun4m, sun4d, sun4u, x86
5.7      7         Oct. 98    sun4c, sun4m, sun4d, sun4u, x86
5.8      8         2000       sun4m, sun4d, sun4u, x86

25. What are the tasks that are done by Informatica Server?
Ans: The Informatica Server performs the following tasks:
1.Manages the scheduling and execution of sessions and batches
2.Executes sessions and batches
3.Verifies permissions and privileges
4.Interacts with the Server Manager and pmcmd.
The Informatica Server moves data from sources to targets based on metadata stored in a
repository. For instructions on how to move and transform data, the Informatica Server
reads a mapping (a type of metadata that includes transformations and source and target
definitions). Each mapping uses a session to define additional information and to optionally
override mapping-level options. You can group multiple sessions to run as a single unit,
known as a batch.

26. What are the two programs that communicate with the Informatica Server?
Ans: Informatica provides 1.Server Manager and 2.pmcmd programs to communicate with
the Informatica Server:
Server Manager: A client application used to create and manage sessions and batches, and
to monitor and stop the Informatica Server. You can use information provided through the
Server Manager to troubleshoot sessions and improve session performance.
Pmcmd: A command-line program that allows you to start and stop sessions and batches,
stop the Informatica Server, and verify if the Informatica Server is running.

27. When do you reinitialize the aggregate cache?


Ans: Reinitializing the aggregate cache overwrites historical aggregate data with new
aggregate data. When you reinitialize the aggregate cache, instead of using the captured
changes in source tables, you typically need to use the entire source table. For example,
you can reinitialize the aggregate cache if the source for a session changes incrementally
every day and completely changes once a month. When you receive the new monthly
source, you might configure the session to reinitialize the aggregate cache, truncate the
existing target, and use the new source table during the session.
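
The difference between an incremental run and a reinitialized run can be pictured with a small
Python class. This is a conceptual sketch only; the class, its methods and the sample figures
are invented and do not reflect how the Informatica Server stores its cache.

```python
class AggregateCache:
    """Keeps running totals per key between session runs."""

    def __init__(self):
        self.totals = {}

    def apply_increment(self, changed_rows):
        """Incremental run: fold only the captured changes into the cached totals."""
        for key, amount in changed_rows:
            self.totals[key] = self.totals.get(key, 0) + amount

    def reinitialize(self, full_source):
        """Discard the historical aggregates and rebuild from the entire source."""
        self.totals = {}
        self.apply_increment(full_source)

cache = AggregateCache()
cache.apply_increment([("east", 10), ("west", 5)])   # day 1 changes
cache.apply_increment([("east", 3)])                 # day 2 changes
daily_totals = dict(cache.totals)                    # snapshot before the monthly reload
cache.reinitialize([("east", 2), ("north", 8)])      # new monthly source replaces history
```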

/? Note: To be clarified when the Server Manager works for the following ?/


To reinitialize the aggregate cache:
1.In the Server Manager, open the session property sheet.


2.Click the Transformations tab.
3.Check Reinitialize Aggregate Cache.
4.Click OK three times to save your changes.
5.Run the session.
The Informatica Server creates a new aggregate cache, overwriting the existing aggregate
cache.
/? To be checked for step 6 & step 7 after successful run of session… ?/
6.After running the session, open the property sheet again.
7.Click the Data tab.
8.Clear Reinitialize Aggregate Cache.
9.Click OK.

28. (i) What is Target Load Order in Designer?


Ans: Target Load Order: - In the Designer, you can set the order in which the Informatica
Server sends records to various target definitions in a mapping. This feature is crucial if you
want to maintain referential integrity when inserting, deleting, or updating records in tables
that have the primary key and foreign key constraints applied to them. The Informatica
Server writes data to all the targets connected to the same Source Qualifier or Normalizer
simultaneously, to maximize performance.

28. (ii) What is the minimum condition that you need to meet in order to use the Target
Load Order option in Designer?
Ans: You need to have multiple Source Qualifier transformations.
To specify the order in which the Informatica Server sends data to targets, create one Source
Qualifier or Normalizer transformation for each target within a mapping. To set the target load
order, you then determine the order in which each Source Qualifier sends data to connected
targets in the mapping. When a mapping includes a Joiner transformation, the Informatica
Server sends all records to targets connected to that Joiner at the same time, regardless of
the target load order.

28. (iii) How do you set the target load order?


Ans: To set the target load order:
1. Create a mapping that contains multiple Source Qualifier transformations.
2. After you complete the mapping, choose Mappings-Target Load Plan.
A dialog box lists all Source Qualifier transformations in the mapping, as well as the
targets that receive data from each Source Qualifier.
3. Select a Source Qualifier from the list.
4. Click the Up and Down buttons to move the Source Qualifier within the load order.
5. Repeat steps 3 and 4 for any other Source Qualifiers you wish to reorder.
6. Click OK and Choose Repository-Save.
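
Why load order matters for referential integrity can be demonstrated outside Informatica with
Python's sqlite3 module. The customer and orders tables, their columns and the load function
are invented for this example; the only point is that the table holding the primary key has to
be loaded before the table holding the foreign key.

```python
import sqlite3

def load(order):
    """Load two hypothetical targets in the given order and report the outcome."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
    conn.executescript("""
        CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
        );
    """)
    loads = {
        "customer": "INSERT INTO customer VALUES (1, 'Acme')",
        "orders": "INSERT INTO orders VALUES (100, 1)",
    }
    try:
        for target in order:
            conn.execute(loads[target])
        return "loaded"
    except sqlite3.IntegrityError:
        # Raised when the orders row arrives before its customer row exists.
        return "foreign key violation"
    finally:
        conn.close()
```

Calling load(["customer", "orders"]) succeeds, while load(["orders", "customer"]) fails on the
first insert.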

29. What can you do with Repository Manager?


Ans: We can do the following tasks using Repository Manager:
To create usernames, you must have one of the following privileges:
- Administer Repository privilege
- Super User privilege
To create a user group, you must have one of the following privileges:
- Administer Repository privilege
- Super User privilege
To assign or revoke privileges, you must have one of the following privileges:
- Administer Repository privilege
- Super User privilege
Note: You cannot change the privileges of the default user groups or the default
repository users.

30. What can you do with Designer?


Ans: The Designer client application provides five tools to help you create mappings:
1.Source Analyzer. Use to import or create source definitions for flat file, COBOL,
ERP, and relational sources.
2.Warehouse Designer. Use to import or create target definitions.
3.Transformation Developer. Use to create reusable transformations.
4.Mapplet Designer. Use to create mapplets.
5.Mapping Designer. Use to create mappings.

Note: The Designer allows you to work with multiple tools at one time. You can also work
in multiple folders and repositories.

31. What are the different types of tracing levels you have in transformations?


Ans:
Terse: Indicates when the Informatica Server initializes the session and its components.
Summarizes session results, but not at the level of individual records.
Normal: Includes initialization information as well as error messages and notification of
rejected data.
Verbose initialization: Includes all information provided with the Normal setting plus more
extensive information about initializing transformations in the session.
Verbose data: Includes all information provided with the Verbose initialization setting.

Note: By default, the tracing level for every transformation is Normal.

To add a slight performance boost, you can also set the tracing level to Terse, writing the
minimum of detail to the session log when running a session containing the transformation.

31. What is the difference between a database, a data warehouse and a data mart?
Ans:
A database is an organized collection of information.
A data warehouse is a very large database with special sets of tools to extract and cleanse
data from operational systems and to analyze data.
A data mart is a focused subset of a data warehouse that deals with a single area of data
and is organized for quick analysis.

32. What are a data mart, a data warehouse and a decision support system? Explain
briefly.
Ans: Data Mart:
A data mart is a repository of data gathered from operational data and other sources that is
designed to serve a particular community of knowledge workers. In scope, the data may derive
from an enterprise-wide
database or data warehouse or be more specialized. The emphasis of a data mart is on
meeting the specific demands of a particular group of knowledge users in terms of analysis,
content, presentation, and ease-of-use. Users of a data mart can expect to have data
presented in terms that are familiar.
In practice, the terms data mart and data warehouse each tend to imply the presence of the
other in some form. However, most writers using the term seem to agree that the design of
a data mart tends to start from an analysis of user needs and that a data warehouse tends
to start from an analysis of what data already exists and how it can be collected in such a
way that the data can later be used. A data warehouse is a central aggregation of data
(which can be distributed physically); a data mart is a data repository that may derive from
a data warehouse or not and that emphasizes ease of access and usability for a particular
designed purpose. In general, a data warehouse tends to be a strategic but somewhat
unfinished concept; a data mart tends to be tactical and aimed at meeting an immediate
need.

Data Warehouse:
A data warehouse is a central repository for all or significant parts of the data that an
enterprise's various business systems collect. The term was coined by W. H. Inmon. IBM
sometimes uses the term "information warehouse."
Typically, a data warehouse is housed on an enterprise mainframe server. Data from various
online transaction processing (OLTP) applications and other sources is selectively extracted
and organized on the data warehouse database for use by analytical applications and user
queries. Data warehousing emphasizes the capture of data from diverse sources for useful
analysis and access, but does not generally start from the point-of-view of the end user or
knowledge worker who may need access to specialized, sometimes local databases. The
latter idea is known as the data mart. Data mining, Web mining, and a decision support
system (DSS) are three kinds of applications that can make use of a data warehouse.

Decision Support System:


A decision support system (DSS) is a computer program application that analyzes business
data and presents it so that users can make business decisions more easily. It is an
"informational application" (in distinction to an "operational application" that collects the
data in the course of normal business operation).

Typical information that a decision support application might gather and present would be:
1. Comparative sales figures between one week and the next
2. Projected revenue figures based on new product sales assumptions
3. The consequences of different decision alternatives, given past experience in a context
that is described

A decision support system may present information graphically and may include an expert
system or artificial intelligence (AI). It may be aimed at business executives or some other
group of knowledge workers.

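33. What are the differences between Heterogeneous and Homogeneous?

Ans:
---------------------------------------- ----------------------------------------
Heterogeneous                            Homogeneous
---------------------------------------- ----------------------------------------
1. Stored in different schemas           1. Common structure
2. Stored in different file or db types  2. Same database type
3. Spread across several countries       3. Same data center
4. Different platform and H/W config.    4. Same platform and H/W configuration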

34. How do you use DDL commands in a PL/SQL block? For example, accept a table name from
the user and drop it if it exists, else display a message.
Ans: To invoke DDL commands in PL/SQL blocks we have to use Dynamic SQL. The package
used is DBMS_SQL. (From Oracle 8i onward, native dynamic SQL with EXECUTE IMMEDIATE is an
alternative.)
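
The "accept a table name and drop it if available" idea can be sketched outside Oracle. The
following is only a rough Python analogue that uses the standard-library sqlite3 module
instead of DBMS_SQL (an assumption made so the sketch runs anywhere); the function name
drop_table_if_exists and the table emp_tmp are hypothetical. What it shares with the PL/SQL
case is that the DDL text is assembled at run time from a user-supplied name.

```python
import sqlite3


def drop_table_if_exists(conn: sqlite3.Connection, table_name: str) -> str:
    """Drop table_name if it exists, otherwise report that it is missing."""
    # Look the name up in SQLite's catalog (Oracle would query USER_TABLES).
    found = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    ).fetchone()
    if found is None:
        return f"Table {table_name} does not exist"
    # DDL cannot take bind variables, so the statement text is built
    # dynamically; the identifier is quoted to keep the name literal.
    quoted = '"' + table_name.replace('"', '""') + '"'
    conn.execute(f"DROP TABLE {quoted}")
    return f"Table {table_name} dropped"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp_tmp (empno INTEGER)")
first = drop_table_if_exists(conn, "emp_tmp")
second = drop_table_if_exists(conn, "emp_tmp")
print(first)
print(second)
```

The second call finds nothing in the catalog and returns the message instead of raising an
error, which mirrors the "else display msg" branch of the question.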

35. What are the steps to work with Dynamic SQL?

Ans: Open a dynamic cursor, parse the SQL statement, bind input variables (if any), execute
the SQL statement of the dynamic cursor, and close the cursor.
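
The same sequence of steps can be seen in miniature with Python's standard-library sqlite3
module. This is an illustrative sketch, not Oracle code: sqlite3 folds parse, bind and
execute into a single execute() call, and the emp table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, deptno INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("SMITH", 20), ("ALLEN", 30), ("JONES", 20)],
)

# The statement text is assembled at run time (the "dynamic" part).
column = "ename"
stmt = f"SELECT {column} FROM emp WHERE deptno = ? ORDER BY {column}"

cur = conn.cursor()            # 1. open a cursor
cur.execute(stmt, (20,))       # 2-4. parse, bind the input variable, execute
names = [row[0] for row in cur.fetchall()]
cur.close()                    # 5. close the cursor

print(names)
```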

36. Which package and procedure are used to find/check the free space available for db
objects like tables/procedures/views/synonyms, etc.?
Ans: The package is DBMS_SPACE.
The procedure is UNUSED_SPACE.
The table is DBA_OBJECTS.

Note: See the script to find free space @ c:\informatica\tbl_free_space

37. Does Informatica allow the load if EmpId is the primary key in the target table and the
source data has 2 rows with the same EmpId? If you use a lookup for the same situation, does
it allow loading 2 rows or only 1?
Ans: => No, it will not; it generates a primary key constraint violation (it loads 1 row).
=> Even then no, if EmpId is the primary key.

38. If Ename is varchar2(40) in one source (Siebel) and Ename is char(100) in another source
(Oracle), and the target has Name varchar2(50), how does Informatica handle this situation?
How does Informatica handle string and number datatypes in sources?

39. How do you debug mappings? That is, where do you start?

40. How do you query the metadata tables for Informatica?

41(i). When do you use a connected lookup and when do you use an unconnected lookup?

Ans:
Connected Lookups : -
A connected Lookup transformation is part of the mapping data flow. With connected
lookups, you can have multiple return values. That is, you can pass multiple values from the
same row in the lookup table out of the Lookup transformation.
Common uses for connected lookups include:
1. Finding a name based on a number, e.g. finding a Dname based on deptno
2. Finding a value based on a range of dates
3. Finding a value based on multiple conditions

Unconnected Lookups : -
An unconnected Lookup transformation exists separate from the data flow in the mapping.
You write an expression using the :LKP reference qualifier to call the lookup within another
transformation.
Some common uses for unconnected lookups include:
1. Testing the results of a lookup in an expression
2. Filtering records based on the lookup results
3. Marking records for update based on the result of a lookup (for example, updating slowly
changing dimension tables)
4. Calling the same lookup multiple times in one mapping

41. What are the differences between connected lookups and unconnected lookups?

Ans: Although both types of lookups perform the same basic task, there are some important
differences:
-------------------------------------------------- ------------------------------------------------
Connected Lookup                                   Unconnected Lookup
-------------------------------------------------- ------------------------------------------------
1. Part of the mapping data flow.                  Separate from the mapping data flow.
2. Can return multiple values from the same row.   Returns one value from each row. You designate
   You link the lookup/output ports to another     the return value with the Return port (R).
   transformation.
3. Supports default values. If there's no match    Does not support default values. If there's no
   for the lookup condition, the server returns    match for the lookup condition, the server
   the default value for all output ports.         returns NULL.
4. More visible. Shows the data passing in and     Less visible. You write an expression using
   out of the lookup.                              :LKP to tell the server when to perform the
                                                   lookup.
5. Cache includes all lookup columns used in the   Cache includes lookup/output ports in the
   mapping (that is, lookup table columns          lookup condition and lookup/return port.
   included in the lookup condition and lookup
   table columns linked as output ports to other
   transformations).
-----------------------------------------------------------------------------------
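
The contrast in the comparison can be made concrete with a small Python sketch. It is a
conceptual model only, not Informatica code: the DEPT table, the DEFAULTS values and both
function names are hypothetical. The connected variant runs for every row and adds several
output values (falling back to defaults), while the unconnected variant is called on demand
and returns a single value or None (NULL).

```python
from typing import Optional

# Hypothetical lookup table keyed by deptno.
DEPT = {
    10: {"dname": "ACCOUNTING", "loc": "NEW YORK"},
    20: {"dname": "RESEARCH", "loc": "DALLAS"},
}

# Default values returned by the connected lookup when nothing matches.
DEFAULTS = {"dname": "UNKNOWN", "loc": "UNKNOWN"}


def connected_lookup(row: dict) -> dict:
    """Runs for every row in the flow and adds several output ports."""
    match = DEPT.get(row["deptno"], DEFAULTS)
    return {**row, "dname": match["dname"], "loc": match["loc"]}


def unconnected_lookup(deptno: int) -> Optional[str]:
    """Called on demand from an expression; returns one value or None (NULL)."""
    match = DEPT.get(deptno)
    return match["dname"] if match else None


rows = [{"empno": 1, "deptno": 10}, {"empno": 2, "deptno": 99}]
connected = [connected_lookup(r) for r in rows]
# The unconnected lookup is invoked only where an expression asks for it.
flagged = [r["empno"] for r in rows if unconnected_lookup(r["deptno"]) is None]
print(connected)
print(flagged)
```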
42. What do you need to concentrate on after getting an explain plan?
Ans: The 3 most significant columns in the plan table are named OPERATION, OPTIONS, and
OBJECT_NAME. For each step, these tell you which operation is going to be performed and
which object is the target of that operation.

Example:
**************************
TO USE EXPLAIN PLAN FOR A QUERY:
**************************
EXPLAIN PLAN
  SET STATEMENT_ID = 'PKAR02'
  FOR
SELECT JOB, MAX(SAL)
  FROM EMP
 GROUP BY JOB
HAVING MAX(SAL) >= 5000;

Explained.

**************************
TO QUERY THE PLAN TABLE:
**************************
SELECT RTRIM(ID) || ' ' ||
       LPAD(' ', 2*(LEVEL-1)) || OPERATION
       || ' ' || OPTIONS
       || ' ' || OBJECT_NAME STEP_DESCRIPTION
  FROM PLAN_TABLE
 START WITH ID = 0 AND STATEMENT_ID = 'PKAR02'
CONNECT BY PRIOR ID = PARENT_ID
       AND STATEMENT_ID = 'PKAR02'
 ORDER BY ID;

STEP_DESCRIPTION
----------------------------------------------------
0 SELECT STATEMENT
1   FILTER
2     SORT GROUP BY
3       TABLE ACCESS FULL EMP
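
The hierarchy that the CONNECT BY query walks (each step linked to its parent through
PARENT_ID) can be sketched in Python. This is an illustrative model, assuming the plan rows
have already been fetched; the PLAN_ROWS values are hypothetical stand-ins for the contents
of PLAN_TABLE and format_plan is not an Oracle function.

```python
# Hypothetical PLAN_TABLE rows: (id, parent_id, operation, options, object_name)
PLAN_ROWS = [
    (0, None, "SELECT STATEMENT", None, None),
    (1, 0, "FILTER", None, None),
    (2, 1, "SORT", "GROUP BY", None),
    (3, 2, "TABLE ACCESS", "FULL", "EMP"),
]


def format_plan(rows):
    """Return one indented line per step, children nested under parents."""
    children = {}
    for row in rows:
        children.setdefault(row[1], []).append(row)

    lines = []

    def walk(parent_id, depth):
        # Visit the steps under parent_id in ID order, then their children.
        for row in sorted(children.get(parent_id, []), key=lambda r: r[0]):
            step_id, _, operation, options, object_name = row
            text = " ".join(p for p in (operation, options, object_name) if p)
            lines.append(f"{step_id} {'  ' * depth}{text}")
            walk(step_id, depth + 1)

    walk(None, 0)
    return lines


plan_lines = format_plan(PLAN_ROWS)
print("\n".join(plan_lines))
```

Each level of nesting adds two spaces, as LPAD(' ', 2*(LEVEL-1)) does in the query. The most
deeply indented step (the full scan of EMP) is the one whose rows feed the steps above it.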

43. How components are interfaced in Psoft?


Ans:

44. How do you do the analysis of an ETL?


Ans:

45. What are Standard Transformations, Reusable Transformations and Mapplets?

Ans: Mappings contain two types of transformations, standard and reusable.
Standard transformations exist within a single mapping. You cannot reuse a standard
transformation you created in another mapping, nor can you create a shortcut to that
transformation. However, often you want to create transformations that perform common tasks,
such as calculating the average salary in a department. Since a standard transformation
cannot be used by more than one mapping, you have to set up the same transformation each
time you want to calculate the average salary in a department. A reusable transformation,
created in the Transformation Developer, avoids this: you define it once and use it in
multiple mappings.
Mapplet: A mapplet is a reusable object that represents a set of transformations. It allows
you to reuse transformation logic and can contain as many transformations as you need. A
mapplet can contain transformations, reusable transformations, and shortcuts to
transformations.

46. How do you copy Mappings, Repositories and Sessions?

Ans: To copy an object (such as a mapping or reusable transformation) from a shared
folder, press the Ctrl key and drag and drop the mapping into the destination folder.
To copy a mapping from a non-shared folder, drag and drop the mapping into the
destination folder.
In both cases, the destination folder must be open with the related tool active.
For example, to copy a mapping, the Mapping Designer must be active. To copy a Source
Definition, the Source Analyzer must be active.

Copying Mapping:
To copy the mapping, open a workbook.
In the Navigator, click and drag the mapping slightly to the right, not dragging it to the
workbook.
When asked if you want to make a copy, click Yes, then enter a new name and click OK.
Choose Repository-Save.

Repository Copying: You can copy a repository from one database to another. You use this
feature before upgrading, to preserve the original repository. Copying repositories provides
a quick way to copy all metadata you want to use as a basis for a new repository.
If the database into which you plan to copy the repository contains an existing repository,
the Repository Manager deletes the existing repository. If you want to preserve the old
repository, cancel the copy. Then back up the existing repository before copying the new
repository.
To copy a repository, you must have one of the following privileges:
Administer Repository privilege
Super User privilege

To copy a repository:
1. In the Repository Manager, choose Repository-Copy Repository.
2. Select a repository you wish to copy, then enter the following information:

------------------------ ------------------ ---------------------------------------------
Copy Repository Field    Required/Optional  Description
------------------------ ------------------ ---------------------------------------------
Repository               Required           Name for the repository copy. Each repository
                                            name must be unique within the domain and
                                            should be easily distinguished from all other
                                            repositories.
Database Username        Required           Username required to connect to the database.
                                            This login must have the appropriate database
                                            permissions to create the repository.
Database Password        Required           Password associated with the database
                                            username. Must be in US-ASCII.
ODBC Data Source         Required           Data source used to connect to the database.
Native Connect String    Required           Connect string identifying the location of
                                            the database.
Code Page                Required           Character set associated with the repository.
                                            Must be a superset of the code page of the
                                            repository you want to copy.

If you are not connected to the repository you want to copy, the Repository Manager asks
you to log in.
3. Click OK.
4. If asked whether you want to delete existing repository data in the second
repository, click OK to delete it. Click Cancel to preserve the existing repository.

Copying Sessions:
In the Server Manager, you can copy stand-alone sessions within a folder, or copy sessions
in and out of batches.
To copy a session, you must have one of the following:
Create Sessions and Batches privilege with read and write permission
Super User privilege
To copy a session:
1. In the Server Manager, select the session you wish to copy.
2. Click the Copy Session button or choose Operations-Copy Session.
The Server Manager makes a copy of the session. The Informatica Server names the copy
after the original session, appending a number, such as session_name1.

47. What are shortcuts, and what are their advantages?

Ans: Shortcuts allow you to use metadata across folders without making copies, ensuring
uniform metadata. A shortcut inherits all properties of the object to which it points. Once
you create a shortcut, you can configure the shortcut name and description.

When the object the shortcut references changes, the shortcut inherits those changes.
By using a shortcut instead of a copy, you ensure each use of the shortcut exactly matches
the original object. For example, if you have a shortcut to a target definition, and you add
a column to the definition, the shortcut automatically inherits the additional column.

Shortcuts allow you to reuse an object without creating multiple objects in the
repository. For example, you use a source definition in ten mappings in ten different
folders. Instead of creating 10 copies of the same source definition, one in each folder,
you can create 10 shortcuts to the original source definition.
You can create shortcuts to objects in shared folders. If you try to create a shortcut to
an object in a non-shared folder, the Designer creates a copy of the object instead.

You can create shortcuts to the following repository objects:

1. Source definitions
2. Reusable transformations
3. Mapplets
4. Mappings
5. Target definitions
6. Business components

You can create two types of shortcuts:


Local shortcut. A shortcut created in the same repository as the original object.
Global shortcut. A shortcut created in a local repository that references an object in a
global repository.

Advantages: One of the primary advantages of using a shortcut is maintenance. If
you need to change all instances of an object, you can edit the original repository object.
All shortcuts accessing the object automatically inherit the changes.
Shortcuts have the following advantages over copied repository objects:
1. You can maintain a common repository object in a single location. If you need to edit
the object, all shortcuts immediately inherit the changes you make.
2. You can restrict repository users to a set of predefined metadata by asking users to
incorporate the shortcuts into their work instead of developing repository objects
independently.
3. You can develop complex mappings, mapplets, or reusable transformations, then reuse them
easily in other folders.
4. You can save space in your repository by keeping a single repository object and
using shortcuts to that object, instead of creating copies of the object in multiple folders
or multiple repositories.

48. What are Pre-session and Post-session Options?

Ans: The Informatica Server can perform one or more shell commands before or after the
session runs. Shell commands are operating system commands. You can use pre- or post-
session shell commands, for example, to delete a reject file or session log, or to archive
target files before the session begins.
The status of the shell command, whether it completed successfully or failed, appears in the
session log file. To call a pre- or post-session shell command you must:
1. Use any valid UNIX command or shell script for UNIX servers, or any valid DOS command or
batch file for Windows NT servers.
2. Configure the session to execute the pre- or post-session shell commands.

You can configure a session to stop if the Informatica Server encounters an error while
executing pre-session shell commands.

For example, you might use a shell command to copy a file from one directory to another.
For a Windows NT server you would use the following shell command to copy the SALES_ADJ
file from the source directory, L, to the target, H:
copy L:\sales\sales_adj H:\marketing\

For a UNIX server, you would use the following command line to perform a similar
operation:
cp sales/sales_adj marketing/

Tip: Each shell command runs in the same environment (UNIX or Windows NT) as the
Informatica Server. Environment settings in one shell command script do not carry over to
other scripts. To run all shell commands in the same environment, call a single shell script
that in turn invokes other scripts.
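
The point of the Tip, that settings made by one command are not visible to the next, can be
demonstrated with a short Python sketch. It assumes a POSIX shell is available (so it models
the UNIX case only); the variable name STAGE_DIR and its value are hypothetical, and each
subprocess.run call stands in for one separately configured shell command.

```python
import os
import subprocess

# Start from an environment that does not already define STAGE_DIR.
base_env = {k: v for k, v in os.environ.items() if k != "STAGE_DIR"}

# Each call starts its own shell, much like separate pre-/post-session commands.
subprocess.run("export STAGE_DIR=/tmp/stage", shell=True, check=True, env=base_env)
separate = subprocess.run(
    'echo "${STAGE_DIR:-unset}"',
    shell=True, check=True, capture_output=True, text=True, env=base_env,
).stdout.strip()

# One wrapper command keeps the setting visible to everything it invokes.
combined = subprocess.run(
    'export STAGE_DIR=/tmp/stage; echo "${STAGE_DIR:-unset}"',
    shell=True, check=True, capture_output=True, text=True, env=base_env,
).stdout.strip()

print(separate)
print(combined)
```

The first echo reports the variable as unset because the export happened in a different
shell; the wrapper form sees the value, which is why a single calling script is suggested.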

49. What are Folder Versions?


Ans: In the Repository Manager, you can create different versions within a folder to help
you archive work in development.
You can copy versions to other folders as well. When you save a version, you save all
metadata at a particular point in development. Later versions contain new or modified
metadata, reflecting work that you have completed since the last version.
Maintaining different versions lets you revert to earlier work when needed. By
archiving the contents of a folder into a version each time you reach a development
landmark, you can access those versions if later edits prove unsuccessful.
For example, you create a folder version after completing a version of a difficult mapping,
then continue working on the mapping. If you are unhappy with the results of subsequent
work, you can revert to the previous version, then create a new version to continue
development. Thus you keep the landmark version intact, but available for regression.

Note: You can only work within one version of a folder at a time.

50. How do you automate/schedule sessions/batches, and did you use any tool for
automating sessions/batches?
Ans: We scheduled our sessions/batches using Server Manager. You can either schedule a
session to run at a given time or interval, or you can manually start the session. You need
to create sessions and batches with Read and Execute permissions or the Super User
privilege. If you configure a batch to run only on demand, you cannot schedule it.

Note: We did not use any tool for automation process.

51. What are the differences between the 4.7 and 5.1 versions?
Ans: New transformations were added, such as the XML transformation and the MQ Series
transformation, and PowerMart and PowerCenter are the same from version 5.1.

52. What procedures do you need to go through before moving mappings/sessions from
Testing/Development to Production?
Ans:

53. How many values does the Informatica Server return when a row passes through a
connected lookup and an unconnected lookup?
Ans: A connected lookup can return multiple values, whereas an unconnected lookup returns
only one value, the return value.

54. What is the difference between PowerMart and PowerCenter in 4.7.2?

Ans:
If you are using PowerCenter: PowerCenter allows you to register and run multiple
Informatica Servers against the same repository. Because you can run these servers at the
same time, you can distribute the repository session load across available servers to
improve overall performance. With PowerCenter, you receive all product functionality,
including distributed metadata, the ability to organize repositories into a data mart domain
and share metadata across repositories.
A PowerCenter license lets you create a single repository that you can configure as a
global repository, the core component of a data warehouse.
If you are using PowerMart: This version of PowerMart includes all features except
distributed metadata and multiple registered servers. Also, the various options available
with PowerCenter (such as PowerCenter Integration Server for BW, PowerConnect for IBM DB2,
PowerConnect for SAP R/3, and PowerConnect for PeopleSoft) are not available with PowerMart.

55. What kind of modifications can you perform with each transformation?

Ans: Using transformations, you can modify data in the following ways:
----------------------------------------------------- ------------------------
Task                                                  Transformation
----------------------------------------------------- ------------------------
1. Calculate a value                                  Expression
2. Perform aggregate calculations                     Aggregator
3. Modify text                                        Expression
4. Filter records                                     Filter, Source Qualifier
5. Order records queried by the Informatica Server    Source Qualifier
6. Call a stored procedure                            Stored Procedure
7. Call a procedure in a shared library or in the     External Procedure
   COM layer of Windows NT
8. Generate primary keys                              Sequence Generator
9. Limit records to a top or bottom range             Rank
10. Normalize records, including those read from      Normalizer
    COBOL sources
11. Look up values                                    Lookup
12. Determine whether to insert, delete, update,      Update Strategy
    or reject records
13. Join records from different databases or flat     Joiner
    file systems

56. Expressions in transformations: explain briefly how you use them.

Ans: Expressions in Transformations: To transform data passing through a transformation,
you can write an expression. The most obvious examples of these are the Expression and
Aggregator transformations, which perform calculations on either single values or an entire
range of values within a port. Transformations that use expressions include the following:
Expression: Calculates the result of an expression for each row passing through the
transformation, using values from one or more ports.
Aggregator: Calculates the result of an aggregate expression, such as a sum or average,
based on all data passing through a port or on groups within that data.
Filter: Filters records based on a condition you enter using an expression.
Rank: Filters the top or bottom range of records, based on a condition you enter using an
expression.
Update Strategy: Assigns a numeric code to each record based on an expression, indicating
whether the Informatica Server should use the information in the record to insert, delete, or
update the target.
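
The "numeric code per record" idea behind the Update Strategy transformation can be sketched
in Python. The codes below follow Informatica's DD_INSERT, DD_UPDATE, DD_DELETE and DD_REJECT
constants (0 to 3); everything else, including the emp_id and terminated fields and the
decision rules, is a hypothetical example rather than a real mapping.

```python
# Numeric codes matching Informatica's DD_* update strategy constants.
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3


def update_strategy(record: dict, existing_keys: set) -> int:
    """Return the code telling the loader what to do with this record."""
    if record.get("emp_id") is None:
        return DD_REJECT              # unusable key: flag the row as rejected
    if record.get("terminated"):
        return DD_DELETE
    if record["emp_id"] in existing_keys:
        return DD_UPDATE
    return DD_INSERT


existing = {100, 101}
records = [
    {"emp_id": 100, "terminated": False},
    {"emp_id": 102, "terminated": False},
    {"emp_id": 101, "terminated": True},
    {"emp_id": None, "terminated": False},
]
codes = [update_strategy(r, existing) for r in records]
print(codes)
```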

In each transformation, you use the Expression Editor to enter the expression. The
Expression Editor supports the transformation language for building expressions. The
transformation language uses SQL-like functions, operators, and other components to build
the expression. For example, as in SQL, the transformation language includes the functions
COUNT and SUM. However, the PowerMart/PowerCenter transformation language includes
additional functions not found in SQL.

When you enter the expression, you can use values available through ports. For example, if
the transformation has two input ports representing a price and sales tax rate, you can
calculate the final sales tax using these two values. The ports used in the expression can
appear in the same transformation, or you can use output ports in other transformations.

57. If a flat file (which comes through FTP as a source) has not arrived, what happens?
Where do you set this option?
Ans: You get a fatal error, which causes the server to fail/stop the session.
You can set the Event-Based Scheduling option in Session Properties under General tab -->
Advanced options.
--------------------------- ------------------ --------------------------------------------
Event-Based                 Required/Optional  Description
--------------------------- ------------------ --------------------------------------------
Indicator File to Wait For  Optional           Required to use event-based scheduling.
                                               Enter the indicator file (or directory and
                                               file) whose arrival schedules the session.
                                               If you do not enter a directory, the
                                               Informatica Server assumes the file appears
                                               in the server variable directory $PMRootDir.
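
The behaviour of waiting for an indicator file before starting work can be approximated with
a simple polling loop. This is a hedged sketch of the general idea, not how the Informatica
Server implements it: the function name wait_for_indicator, the timeout and polling interval,
and the path incoming/sales.ind are all hypothetical.

```python
import os
import time


def wait_for_indicator(path: str, timeout_s: float, poll_s: float = 0.1) -> bool:
    """Return True once the indicator file exists, False if the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


if wait_for_indicator(os.path.join("incoming", "sales.ind"), timeout_s=0.3):
    print("indicator arrived: start the session")
else:
    print("indicator missing: do not start the session")
```

Checking for the file before the deadline test means a file that is already present is
detected immediately, even with a timeout of zero.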

58. What is the Test Load option and when do you use it in Server Manager?
Ans: When testing sessions in development, you may not need to process the entire source.
If this is true, use the Test Load option (Session Properties --> General tab --> Target
Options): choose Normal as the target load option (option button), check Test Load (check
box), and enter the number of rows to test, e.g. 2000 (text box with scroll arrows). You can
also click the Start button.

59. SCD Type 2 and SGT difference?

60. Differences between 4.7 and 5.1?

61. Tuning Informatica Server for improving performance? Performance Issues?


Ans: See /* C:\pkar\Informatica\Performance Issues.doc */

62. What is Override Option? Which is better?

63. What will happen if you increase buffer size?

64. What will happen if you increase commit intervals? And if you decrease commit
intervals?

65. What kind of complex mappings did you build? And what sort of problems did you face?

66. If you have 10 mappings designed and you need to implement some changes (maybe in an
existing mapping, or a new mapping needs to be designed), how much time does it take, from
easier to complex?

67. Can you refresh the repository in 4.7 and 5.1? Also, can you refresh pieces
(partially) of the repository in 4.7 and 5.1?

68. What is BI?


Ans: http://www.visionnet.com/bi/index.shtml

69. Benefits of BI?


Ans: http://www.visionnet.com/bi/bi-benefits.shtml

70. BI Faq
Ans: http://www.visionnet.com/bi/bi-faq.shtml

71. What is difference between data scrubbing and data cleansing?


Ans:
Data scrubbing is the process of cleaning up the junk in legacy data and making it
accurate and useful for the next generations of automated systems. This is perhaps the
most difficult of all conversion activities. Very often, this is made more difficult when the
customer wants to make good data out of bad data. This is the dog work. It is also the most
important and cannot be done without the active participation of the user.

Data cleansing is a two-step process: detection and then correction of errors in a data
set.

72. What is Metadata and Repository?


Ans: Metadata: “Data about data”.
It contains descriptive data for end users. It contains data that controls the ETL
processing. It contains data about the current state of the data warehouse. ETL updates
metadata to provide the most current state.

Repository: The place where you store the metadata is called a repository. The more
sophisticated your repository, the more complex and detailed metadata you can store in it.
PowerMart and PowerCenter use a relational database as the repository.

73. SQL*Loader?

Ans: http://download-west.oracle.com/otndoc/oracle9i/901_doc/server.901/a90192/ch03.htm#1004678

74. Debugger in Mapping?

75. Exposure to parameter passing in version 5.1?

76. What is the filename which you need to configure in Unix while installing
Informatica?

77. How do you select duplicate rows using Informatica, i.e., how do you use
Max(Rowid)/Min(Rowid) in Informatica?

Business Objects

10. WHAT ARE THE MODULES/TOOLS IN BUSINESS OBJECTS? EXPLAIN THEIR PURPOSE BRIEFLY.

ANS:
BO DESIGNER,
BUSINESS QUERY FOR EXCEL,
BO REPORTER,
INFOVIEW,
EXPLORER,
WEBI,
BO PUBLISHER,
BROADCAST AGENT, AND
BO ZABO.

INFOVIEW: IT PORTAL ENTRY INTO WEBINTELLIGENCE & BUSINESS OBJECTS. BASE MODULE
REQUIRED FOR ALL OPTIONS TO VIEW AND REFRESH REPORTS.
REPORTER: UPGRADE TO CREATE/MODIFY REPORTS ON LAN OR WEB.
EXPLORER: UPGRADE TO PERFORM OLAP PROCESSING ON LAN OR WEB.
DESIGNER: CREATES SEMANTIC LAYER BETWEEN USER AND DATABASE.
SUPERVISOR: ADMINISTER AND CONTROL ACCESS FOR GROUP OF USERS.
WEBINTELLIGENCE: INTEGRATED QUERY, REPORTING, AND OLAP ANALYSIS OVER THE WEB.
BROADCAST AGENT: USED TO SCHEDULE, RUN, PUBLISH, PUSH, AND BROADCAST PRE-BUILT
REPORTS AND SPREADSHEETS, INCLUDING EVENT NOTIFICATION AND RESPONSE
CAPABILITIES, EVENT FILTERING, AND CALENDAR BASED
NOTIFICATION, OVER THE LAN, E-MAIL, PAGER, FAX, PERSONAL DIGITAL ASSISTANT (PDA),
SHORT MESSAGING SERVICE (SMS), ETC.
SET ANALYZER: APPLIES SET-BASED ANALYSIS TO PERFORM FUNCTIONS SUCH AS EXCLUSION,
INTERSECTIONS, UNIONS, AND OVERLAPS VISUALLY.
DEVELOPER SUITE: BUILD PACKAGED, ANALYTICAL, OR CUSTOMIZED APPS.

1) There are five sessions in a batch. The first two should run in parallel and the next
three in sequence. How do you do that?
Ans: We have to go for nested batches.

2) You are using a cache in a Sequence Generator transformation and caching 50 values.
After processing 40 values, the session completes. What happens when you start the next
session (that is, what about the remaining 10 values)?
Ans: I don't know the exact answer, but what I am thinking is that the next session will
use the previous ten values and create another cache after processing them.

3) You are using an Update Strategy transformation in your mapping. What happens to
records rejected by this mapping (that is, do they go to the bad file)?
Ans: Yes.

4) You are using a Filter transformation in your mapping. What happens to records
filtered out by this mapping (that is, do they go to the bad file)?

5) How do you recover sessions?

6) What are the differences between Informatica 4.7, 5.1 and 6.0?
1. Router transformation is available from 5.0 onwards.
2. Debugging in Designer.
3. Partition of session in session manager.
4. In 6.0, complete heterogeneous targets (for example, one in Oracle and one in DB2) as
multiple targets.
5. Data partitioning runs as multiple sessions in Informatica 6.0.
6. Repository Server (new component).
7. Workflow Manager (new component).
8. Workflow Monitor (new component).

7) What is the difference between 5.1 and 6.0?
1) One new transformation is added, called the Sorter transformation. The Sorter
transformation can be used before an Aggregator for fast processing.
2) Repository Manager is the same in both.
3) Server Manager is called Workflow Manager.
4) A batch is called a workflow and sub-batches are called worklets.
5) The workflow is the top batch, and you can have sub-batches, which are called worklets,
and sessions under worklets.
6) Sessions can be run independently by right-clicking on them.
7) The output monitoring window in 6.x is called Workflow Monitor.

Mascot Questions Informatica:

1). What is the difference between Transformation Designer & Mapplet Designer?
2). What are the different types of SCD? Type 1, 2 and 3.
3). How are you getting the data from the client?
Ans. Through VSS (Visual SourceSafe).
4). How are you transferring the mapping objects from the development environment to
production?
Ans. Export and import.
5). What is the size of the source and target database? > 50 GB
6). What is the size of the frequently loaded data? 1000 records
7). How frequently are you loading the data into the target? Weekly
8). How are you maintaining the documentation from the client?
9). What is a dynamic cache in the Lookup transformation?
10). What will be stored in the cache? Source/target metadata
11). How do you test the sessions? Workflow Monitor
12). How do you increase the performance of the sessions? Bottlenecks
13). How do you identify the data rejected while loading to the target? Bad file
14). How do you load the rejected data? Manually, or run the session again
15). How do you find out whether the data loaded into the target is correct or not? Run the
session and test.

DATAWAREHOUSE FAQ :

0 WHAT IS DATAWAREHOUSE ?
1 WHO NEEDS DATAWAREHOUSE ?
2 WHAT ARE TYPES OF DATABASE SYSTEMS ?
3 WHAT ARE IMPORTANT CONCERNS OF OLTP AND DSS SYSTEMS ?
4 WHAT IS ARCHITECTURE OF DATAWAREHOUSE?
5 WHAT IS A DATA MART?
6 WHAT ARE CHARACTERISTICS OF DATA WAREHOUSE?
7 WHAT IS DIFFERENCE BETWEEN DATA MART AND DATAWAREHOUSE?
8 WHAT IS DATA MODELING?
9 WHAT IS AN ENTITY, ATTRIBUTE AND RELATIONSHIP?
10 WHAT ARE DIFFERENT TYPES OF RELATIONSHIPS ?
11 WHAT IS DIFFERENCE BETWEEN CARDINALITY AND NULLABILITY?
12 WHAT ARE DIFFERENT STEPS FOR DATA MODELING?
13 WHAT IS A PHYSICAL DATA MODEL?
14 WHAT IS A LOGICAL DATA MODEL ?
15 WHAT IS FORWARD,REVERSE AND RE ENGINEERING?
16 WHAT IS NORMALIZATION, DENORMALIZATION?
17 WHAT ARE DIFFERENT FORMS OF NORMALIZATION?
18 WHAT IS ETL OR ETT ?
19 WHAT IS A STAR SCHEMA ?
20 WHAT ARE FACT AND DIMENSION TABLES?
21 WHAT IS A STAR-FLAKE OR SNOW-FLAKE SCHEMA ?
22 WHAT IS VERY LARGE DATABASE?
23 WHAT ARE SMP AND MPP?
24 WHAT IS PARALLELISM ?
25 WHAT IS A PARALLEL QUERY ?
26 WHAT IS AN OLAP AND WHAT ARE ITS TYPES?
27 HOW IS OLTP DIFFERENT FROM OLAP ?
28 WHAT IS DATA MINING?
29 WHAT IS DIFFERENCE BETWEEN DATAWAREHOUSE AND OLAP?
30 WHAT ARE FACILITIES PROVIDED BY DW TO ANALYTICAL USERS?
31 WHAT ARE FACILITIES PROVIDED BY OLAP TO ANALYTICAL USERS?


32 WHAT ARE DIFFERENT WAYS OF LOADING DATA TO DATAWAREHOUSE USING
ORACLE?
33 WHAT IS TABLE PARTITIONING? HOW IS IT USEFUL TO A WAREHOUSE DATABASE?
34 WHAT ARE DIFFERENT TYPES OF PARTITIONING IN ORACLE?
35 WHAT IS A MATERIALIZED VIEW? HOW IS IT DIFFERENT FROM NORMAL AND INLINE
VIEWS?
36 WHAT IS INDEXING? WHAT ARE DIFFERENT TYPES OF INDEXES SUPPORTED BY
ORACLE?
37 WHAT ARE DIFFERENT STORAGE OPTIONS SUPPORTED BY ORACLE?
38 EXPLAIN ROLLUP,CUBE,RANK AND DENSE_RANK FUNCTIONS OF ORACLE 8i.

39 WHAT IS COGNOS? WHAT ARE IMPORTANT PRODUCTS OF COGNOS AND THEIR USE?
40 WHAT IS A CATALOG OF COGNOS? WHAT ARE DIFFERENT TYPES OF CATALOG?
41 WHAT IS A DIMENSION,LEVEL,CATEGORY AND MEASURE IN COGNOS TRANSFORMER?
42 WHAT IS NAME OF ADMINISTRATOR USER IN COGNOS?
43 HOW TO CREATE A USER AND MANAGE A USER IN COGNOS?
44 WHAT ARE DIFFERENT TYPES OF REPORTS GENERATED USING COGNOS IMPROMPTU?
45 WHAT IS BUSINESS OBJECTS? WHAT ARE IMPORTANT PRODUCTS OF BUSINESS
OBJECTS?
46 WHAT ARE DIFFERENT USER PROFILES OF BUSINESS OBJECTS?
47 WHAT IS A UNIVERSE OF BUSINESS OBJECTS?
48 EXPLAIN USER HIERARCHY IN BUSINESS OBJECTS?
49 WHAT IS A CLASS,OBJECT,DIMENSION,DETAIL,MEASURE OF BUSINESS OBJECTS?
50 WHAT IS AN ETL TOOL? EXPLAIN EXTRACTION,TRANSFORMATION,LOADING
PROCESS?
51 WHAT IS METADATA REPOSITORY OF OWB?
52 WHAT ARE CODE GENERATOR AND INTEGRATORS OF OWB?

INFORMATICA FAQ :

0 WHAT IS DATA MERGING, CLEANSING AND AGGREGATION ?


1 WHAT ARE DIFFERENT CLIENT PRODUCTS OF INFORMATICA ?
2 * WHAT IS VERSION OF INFORMATICA ? WHAT ARE NEW ENHANCEMENTS IN THAT
VERSION ?
3 WHAT IS DIFFERENCE BETWEEN POWERMART Client AND POWERCENTER CLIENT ?
4 WHAT IS REPOSITORY IN INFORMATICA ? HOW MANY OBJECTS DOES IT CONSIST OF ?
5 WHAT CAN YOU DO WITH THE "REPOSITORY MANAGER" TOOL ?
6 WHAT IS THE "DESIGNER" TOOL FOR ?
7 WHAT IS THE "SERVER MANAGER" TOOL USED FOR ?
8 WHAT ARE DIFFERENT PRIVILEGES OF INFORMATICA ?
9 WHAT ARE SERVICES AND PROCESSES IN INFORMATICA ?
10 HOW MANY REPOSITORIES CAN BE CREATED IN INFORMATICA ?
11 WHAT IS FOLDER IN INFORMATICA ?
12 WHAT ARE DEFAULT USERS AND GROUPS IN INFORMATICA ?
13 CAN YOU CREATE REPORTS IN REPOSITORY MANAGER ? WHAT ARE THOSE REPORTS
FOR ?
14 HOW TO RELEASE LOCKS IN INFORMATICA ?
15 WHAT IS DIFFERENCE BETWEEN CREATING REPOSITORY AND ADDING REPOSITORY ?
16 WHAT IS A TRANSFORMATION ? WHAT ARE DIFFERENT TYPES OF TRANSFORMATIONS
?
17 WHICH TRANSFORMATIONS CAN BE USED AS UNCONNECTED ?
18 WHAT IS FOLDER VERSIONING ?
19 HOW DO YOU TEST YOUR MAPPING ?
20 WHAT ARE MAPPING PARAMETERS AND VARIABLES ?
21 WHAT IS LOOKUP CACHE ?
22 WHAT IS CLEANSING ? WHICH TRANSFORMATION CAN BE USED FOR HANDLING
CLEANSING ?
23 WHAT IS DIFFERENCE BETWEEN REUSABLE TRANSFORMATION AND MAPPLET ?
24 WHAT ARE SLOWLY CHANGING AND SLOWLY GROWING DIMENSIONS ?


25 WHAT ARE TYPE 1, TYPE 2 AND TYPE 3 DIMENSIONS ?
26 WHAT IS A BATCH IN INFORMATICA ? WHAT IS SEQUENTIAL AND CONCURRENT
BATCH ?
27 WHAT IS A CUBE IN INFORMATICA ?
Data Warehouse:
1.Describe Metadata.
2.Snowflake Schema.
3.Oracle 8i features for Data Warehouse.
4.Difference between Data Warehouse and data Mart.
5.Maintained views in Data Warehouse.
6.Difference between OLTP and DSS.
7.Hierarchy of Data Warehouse.
8.Data Mart whether it is Logical or Physical.

Informatica :
1.Performance Enhancement in ETL.
2.Difficulties faced in ETL Job? Did you overcome them?
3.What are Mapplets.
4.What are the OLTP Processes you worked with ______.
5.What is Lookup Transformation.
6.Tell about Cache Directory in Lookup.
7.What is Conformed Dimension.
8.Slowly Changing Dimension and how to overcome them?
9.Can a Mapplet have Target.
10.How many Transformations are there in Informatica.
11.How do you connect to Remote database in Mapping.
12.What is Data Driven in Update Strategy Transformation.
13.What are Mapplets? Repository Objects that are not supported in Mapplets. Why?
14.Difference between Oracle Warehouse Builder and Informatica.
15.How do you identify Fact and Dimension tables.
16.Difference between Connected and Unconnected Lookup.
17.What is Source Qualifier.
18.What are the OLTP Process you worked with ___RDBMS___.

Business Objects :
1.What is Universe.
2.How can we create Universe.
3.What are the parameters that we are using at time of Universe Creation.
4.What is Repository.
5.How can we restrict rows in the report in Business Objects.
6.What is .key file.
7.What are Domains.
8.Can we create a report with Data Providers.
9.What are Locks.
10.What is Broadcast Agent.
11.What is troubleshooting in Broadcast Agent.
12.What is an ad hoc report.
13.What are Loops in Business Objects. How to Use it.
14.Definition of Universe.
15.Explain grouped cross tab.
16.Who launches the Supervisor

Business Objects:

1. What are the different modules of Business Objects?

Supervisor, Designer, Business Objects, WEBI, BCA

2. How do you export the report data into personal files (.txt, .xls)?
Open the report containing the data you want to export in BO.
Click the View Data command on the Data menu.
Click the Export button in the Data Manager box -> select the format you
want to export.

3. How do you automatically refresh a document when you open it in the Business Objects module?

Go to the Tools menu -> click Options -> go to the Save tab ->
check "Refresh document when opening".

4. What is the difference between applying a sort on the report and a sort in the Query Panel?
Sort on report: click the cell, column, row or chart element containing the
data and then click the toolbar button for the sort you want to apply.
Sort in the Query Panel: click an object in the Result Objects box and then
click the sort button on the toolbar.

5. What result set do you get if you drag a measure object (sum) into the report?
(How many rows do you get in the report?)
One row with the sum of the result measure object.

6. What are the different types of connections?

Synchronous
Asynchronous

7. What are synchronous and asynchronous connections?

Asynchronous mode allows a user to regain control once a query is sent to
the server and to cancel queries during both the analysis and fetch phases.

Synchronous mode allows a user to cancel queries only during the fetch
phase. This is the default option.

8. Explain how document filters work.

When you apply a document filter, it displays only the data you specify and
hides all other data.

9. How do you solve the following errors: #ERROR, #COMPUTATION, #SYNTAX, #DIV/0?

ERROR
In the Business Objects module: suppose the formula is (a-b)/b;
then change it to
If b<>0 then (a-b)/b
In the Designer module: use the NVL function if the object is defined in the universe.
Type Max (#Multivalue) or change the object qualification to Measure.

10. What is a subquery?

A subquery is a query within a query (nested query).
The outer query fires ...

11. How can you force a crosstab to show columns without data
(e.g. a crosstab of months displaying 1-12, where data only exists for months 1, 4, 8)?
Create a new query on the same universe with the Month object and link it to the
old query (Data Link).

12. When developing a report, how would you apply a single break with region,
division and department, in that order?
Go to Format Break -> Edit -> and add the variables that you want in a
single break.

13. What is a context within the Reporter module (not a Designer-created context)?
When you extend a formula, you will see contexts like Inbody and Inblock, where
Inbody is the input context and Inblock is the output context.

14. Explain the difference between Inbody, Inblock and Inreport.

Inbody: refers to the input parameter.
Inblock: refers to the output parameter.
Inreport: refers to the whole report in a document.

15. Why would the result set you see on the screen, without applying any
formatting or filtering, be different from what would be exported?

BO has an option under Table Format -> General tab -> Avoid Duplicate Rows
Aggregation.

16. How do you set up a multicolumn report (e.g. labels)?

Go to Table Format -> General tab -> change the column value.

17. What do you have to do to execute a VBA macro when any document is opened?
Create an add-in in the VBA Editor, save it with the .rea extension and call this
add-in from Business Objects.

18. What is the repository?
It provides a centralized storage location for BO applications.
It secures access between the BO deployment and the data warehouse.

19. Can you have more than one repository on one database?

No. You can create a repository on a different database.

20. How do you apply row-level security on a table?

In the universe properties, go to the Rows tab.

21. What is the use of linked universes and linked data providers?
Linking two or more universes allows you to access multiple databases,
which may be deployed over different servers.
Linked universes are universes that share common components such as
parameters, classes, objects, or joins. One universe is said to be the kernel
or master universe, while the others are the derived universes.
Kernel approach, Master approach and Component approach.

Linking data providers enables data from different sources to be computed
in the same table, crosstab, or chart in a report.

22. What are object security access levels?

Private, Confidential, Restricted, Controlled, Public. The default level is
Public.
(Objects are components in BO universes that make data accessible to
users. Their object security access levels are defined by the designers who
create them.)

***How do you qualify the object?

23. What is a strategy? What are the different types?

A strategy is a script that you declare in an RDBMS folder that reads
structural information from a specific database or flat file.

Object strategy, join strategy, and table strategy.

24. How do you resolve loops?

You can resolve loops in two ways:
Using aliases: in SQL, an alias is an alternative name for a table.
Using contexts: a context is a rule that determines which of two
paths is chosen when more than one path is possible
in the database.

25. What are aggregate tables, aggregate awareness and aggregate navigation?

Aggregate awareness is a feature of DESIGNER that makes use of
aggregate tables in a database, that is, tables that contain pre-calculated data.

Aggregate Navigation is the dialog box of DESIGNER that consists of two
panes:
Universe Tables, which lists all the tables of the universe.
Associated Incompatible Objects, which lists all the objects of the universe.

26. What are universe parameters?

Definition: identifies the universe.
Summary: date created, date modified and revision.
Strategies: used to detect the joins and cardinalities.
Controls: settings that limit the size of the result set and the
execution of queries.
SQL: controls the query and SQL generation options for this
universe.
Links: the list of dynamically linked universes.

27. What are chasm traps and fan traps?

They are two types of join paths that return too many rows in a relational database.
Chasm trap: converging many-to-one joins.
Fan trap: serial many-to-one joins.
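
The row inflation caused by a chasm trap can be reproduced with a minimal sketch. The tables `customer`, `orders` and `refunds` and their contents are hypothetical and chosen only for illustration; SQLite stands in for the reporting database.

```python
import sqlite3

# Hypothetical schema: orders and refunds each have a many-to-one join to customer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders   (order_id INTEGER PRIMARY KEY, cust_id INTEGER, amount INTEGER);
    CREATE TABLE refunds  (refund_id INTEGER PRIMARY KEY, cust_id INTEGER, amount INTEGER);
    INSERT INTO customer VALUES (1, 'Shane');
    INSERT INTO orders   VALUES (1, 1, 100), (2, 1, 200);
    INSERT INTO refunds  VALUES (1, 1, 10), (2, 1, 20), (3, 1, 30);
""")

# Chasm trap: both joins converge on customer, so every order row pairs with
# every refund row (2 x 3 = 6 rows) and both sums are inflated.
trapped = conn.execute("""
    SELECT SUM(o.amount), SUM(r.amount)
    FROM customer c
    JOIN orders  o ON o.cust_id = c.cust_id
    JOIN refunds r ON r.cust_id = c.cust_id
""").fetchone()

# Aggregating each branch in its own query gives the true totals.
order_total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
refund_total = conn.execute("SELECT SUM(amount) FROM refunds").fetchone()[0]

print(trapped, order_total, refund_total)
```

Running the two branches as separate queries is the same idea as splitting the measures into separate contexts in the universe.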

*28. Why do you get partial results?

29. What are cardinalities? What are unknown cardinality and "cardinality not valid"?
Cardinality expresses the minimum and maximum number of instances of an
entity B that can be associated with an instance of an entity A.

30. How do you specify an outer join in a complex join?

Double-click the join in the universe; it opens the expression window, where you
specify the outer join.

31. What is the difference between linking and including a universe?

When you link two universes, any changes made to the kernel
universe are reflected in the derived universe.
When you include the components of the kernel universe in the derived
universe, the components become independent of the kernel universe; a
change you make in the kernel universe will not affect the derived universe.

32. How would you use the same LOV with many different objects?
Double-click the object, go to the Properties tab, copy the code under the
ListName box and paste it into the ListName box of the new object.

33. How can you minimize the download time of a universe that has many custom
LOVs that are exported with the universe and refreshed upon usage?
Set the check box "Do not retrieve the data" under
object properties -> Edit -> Options.

34. Illustrate the syntax to call a VBA macro as part of a universe.

@script('var_name','vartype','script_name')

35. What is Safe Recovery?

It allows you to perform a recovery installation. You can do this in the following situations:
The location of the security domain has been changed.
The connection string parameters (user name and password) have been changed.
The key file has been moved, renamed or damaged.

36. What is Shared Installation?

All shareable application files remain on the server and run remotely.

37. What are user profiles?

General Supervisor, Supervisor, Supervisor-Designer, Designer, User, Versatile

38. Can you create multiple users with the same name?

No

39. Which module in BO won't work without the repository?

Supervisor

40. What are repository tables? Name some.

OBJ_M_ACTOR, OBJ_M_CATEG, OBJ_M_UNIVCST
UNV_JOIN, UNV_OBJECT, UNV_TABLE

41. What is a domain? Name the different domains and their use.

Domains ensure security and manage user resources.
Security domain: contains the definition of the other domains as well as the
definition of users.
Universe domain: meta models of related databases, containing a description of the
data to be accessed.
Document domain: contains the structures for storing shared documents.

42. What is the difference between the End User and the Versatile User?

Versatile User: a customized user who may be given access by a
supervisor to any combination of BO products.
End User: uses BO products to query, report and analyze data.

43. When you export a universe to the repository, does it go directly to the
repository?
Yes, it goes directly to the repository and resides in the universe domain.

44. How do you prevent a user from changing his password?

By deselecting Enable Password Modification in the user properties in
Supervisor.

45. How can you skip the user name and password screen?

By deleting the bomain.key and bomain.lsi files from the LocData folder.

46. What are the Pdac.lsi and Sdac.lsi files, and where are they stored?
These are the security and administration files.
Personal Data Account file (Pdac.lsi): stores security information concerning the user's personal
connections to the database. Stored in the LocData folder on the client machine.
Shared Data Account file (Sdac.lsi): stores security information concerning the
shared connections to the database. Stored in the ShData folder on the server.

47. What is the BOMain.key file, and where is it stored?

BOMain.key is used to provide access to the repository. It is stored in the LocData
or ShData folder.

48. What are the Cluster Manager, Cluster Node and load balancing?

Load balancing: a distributed deployment can scale to a greater number of
users by automatically redirecting requests to the machines that are less loaded
in the system.
Cluster Manager: the central machine of the cluster; it houses the HTTP server,
the OSAgent, an OAD (ORB Activation Daemon) and the mandatory BO modules.
Cluster Node: each node runs the OAD required to communicate with the
cluster manager and starts the BO and WEBI processes on the node.

49. What are the differences between a BO add-in and a macro?

A macro is a series of commands and functions that are stored in a VBA
module and can be run whenever you need to perform the task.
Add-ins are programs that add optional commands and features to Business Objects.

50. How can you keep the bomain.key files synchronized between the Cluster Manager and the Cluster
Nodes?
By copying the bomain.key file to the server and the client.

51. What is BCA? Can you mail a document through BCA?

Broadcast Agent is software that processes and distributes BO or WebI
documents automatically and securely at scheduled dates and times.
You can mail documents through BCA.

52. What are Report Bursting and File Watcher in Broadcast Agent?

Report Bursting is the BCA feature that distributes copies of some parts of a
processed document to different users, depending on each user's profile.
File Watcher is the BCA feature that permits the processing of tasks when
and only when a specified file is present in its specified location.

53. How does drill-down in WebIntelligence differ from drill-down in Business Objects?
WebI needs WebIntelligence Explorer to drill the reports. In Business Objects, drill-down is
applicable, but to perform drill-down analysis you must install the BUSINESSOBJECTS EXPLORER
module.

54. What do padding and spacing do when designing a WebIntelligence report?

Padding increases or decreases the space between the title text and the border.
Spacing increases or decreases the space between the title and the border.

55. What does download do when working with a WebIntelligence document?

It saves the document with the .wqy extension.

56. What does download do when working with a Full Client document?

It saves the document with the .rep extension.

57. Explain the major differences between Full Client reports run through InfoView and
reports developed through WebIntelligence.
Full Client reports are compressed in InfoView, whereas WebI reports are
dynamically generated.

58. What does the WIADESERVER component do?

It is a module used to install the Zero Administration deployment of BO 5.1 from
server to clients. This module must be enabled on the server on which the
server products are installed.

59. What is a DMZ?
Configuring the system to use a double firewall between the web server and
the application server is called a DMZ (demilitarized zone).

60. What is ZABO? What are its advantages over Full Client?

ZABO is Zero Administration Business Objects.
Pros:
Maintenance free.
No disk required; it downloads from the InfoView portal as per the
profile (the user should have NT administrator privilege).
It works on almost all operating systems.
You can download it from a simple browser.
You can upgrade the software from the browser whenever new versions come out.

Questions

1. What is a Data Link?
2. What is an optimizer hint? What optimizer does Business Objects use?
3. What is the Business Objects object model?
4. What are conditional hiding and conditional sorting?
5. What are ranking, sorting, breaks, filters, groups and alerters?
6. What is the difference between slice and dice and drill-down?
7. What are global filters?
8. What are value-based breaks?
9. What are combined queries?
10. What are the different types of sorting?
11. What is the difference between Full Client and Thin Client?
12. What does the integrity check do?
13. What are LOVs? Can you export or import them?
14. What is a data provider? Name the different providers in ver 4.x and ver 5.x.
15. What are prompt reports? What is the difference between applying prompts in the Designer
module and in the User module?
16. What is User Response? Why is it used?
17. What are templates?
18. What are hierarchies?
19. What is the difference between hiding and deleting a column in a report?
20. Can a universe developed in ver 5.0 be used in ver 4?
21. How do you view ver 5 reports using ver 4 of BO?

1. What are the Business Objects products?

Reader
Reporter
Explorer
Business Query for Excel
Business Query Server
Business Miner
Designer
Supervisor

2. What is Supervisor?

Supervisor is the product you need in order to set up and maintain a secure environment for Business Objects
products. With Supervisor you create the Business Objects repository, then define the users and user groups and
assign profiles to them.

3. What are the different types of user profiles in BO?

1. General Supervisor (has access to all BO products)
2. Supervisor (has access to all BO products)
3. Designer (has access to all BO products except Supervisor)
4. Designer-Supervisor (has access to all BO products)
5. User (has access to all BO products except Designer-Supervisor)
6. Versatile (configurable; can access the products which you have assigned to him)

4. Who is the general administrative user in BO?

The General Supervisor. He alone creates the repository when he launches Supervisor.

5. What are the tasks of the General Supervisor?

Creates the repository.
Creates any type of user, including General Supervisor.
Creates users and groups.
Imports and exports universes to and from the repository.
Launches a Broadcast Agent from the Broadcast Agent administrator.

6. What are the tasks of the Supervisor?

Creates users of any type except General Supervisor.
Creates user groups.
Imports and exports universes to and from the repository.

7. What are the tasks of the Supervisor-Designer?

He creates user profiles, user groups and universes. This user has all the rights of the
Supervisor and Designer combined, and can access Supervisor, Designer and the Business
Objects end-user products.

8. What are the Business Objects end-user products, and what are they for?

Users use the BO end-user products to query, report and analyze data. They are:
Business Objects
Reporter
Explorer
Business Query for Excel

Reporter and Explorer are used for multidimensional analysis. Business Query for Excel
provides the queried data in Excel sheets for analysis.

9. What is the Business Objects repository?

It is a centralized set of data structures stored in a database. It enables Business Objects users
to share resources in a controlled environment. The repository is made up of three domains:

Security domain
Universe domain
Document domain

10. Describe the domains.

Security domain: contains information on the other domains (the universe and document
domains) and on the identification of Business Objects users. The security domain is created with
the wizard the first time Supervisor is launched.

Universe domain: contains the information on the universes created and exported with
Designer. The universe domain makes it possible to store, distribute and administer
universes.

Document domain: contains the structures for storing shared documents.

... performance of Supervisor?

Regularly delete old or outdated documents from the repository. You can do this by using the
Delete Document command in the Tools menu in SUPERVISOR.

15. What are the factors that you have to consider while choosing the repository database?

A database that supports row-level locking.
A database that supports BLOBs.

The row-level locking mechanism allows for the highest degree of concurrency and minimum
conflicts between multiple users accessing and updating data in the same repository
domain(s).

When users exchange documents via the repository, the documents are stored in slices in the
OBJ_X_documents table of the document domain. Depending on its size and the length of
each slice, a single document might be stored in one or more rows in the table.

16. What is Designer?

It is the BO product used to create, maintain and distribute universes.

17. Define Universe?
Slowly Changing Dimension
The basic assumption while designing a data warehouse is that the data in the
warehouse will never change, but over a period of time certain attributes of a dimension will
change. This is called a slowly changing dimension (SCD).

Ex: customer
Customer address
Phone number changes
Email etc

Classification: three types

• TYPE I
• TYPE II 1. Versioning 2. Flag 3. Effective date
• TYPE III

TYPE I: we overwrite the previous data with the new data.
TYPE II: we add a new record.
TYPE III: we add a new attribute or column.
Slowly Changing Dimensions: Type I

Before the change:

Source
Empid  Name   Email
1001   Shane  Shane@xyz.com

Target
Empid  Name   Email
1001   Shane  Shane@xyz.com

After the email changes in the source (the old value Shane@xyz.com is overwritten):

Source
Empid  Name   Email
1001   Shane  Shane@abc.co.in

Target
Empid  Name   Email
1001   Shane  Shane@abc.co.in
Slowly Changing Dimensions: Type II

Source
Empid  Name   Email
10     Shane  Shane@xyz.com

Target
PM_PRIMARYKEY  Empid  Name   Email          PM_VERSION_NUMBER
1000           10     Shane  Shane@xyz.com  0

Slowly Changing Dimensions: Versioning

Source
Empid  Name   Email
10     Shane  Shane@abc.com

Target
PM_PRIMARYKEY  Empid  Name   Email            PM_VERSION_NUMBER
1000           10     Shane  Shane@xyz.com    0
1001           10     Shane  Shane@abc.co.in  1
1003           10     Shane  Shane@abc.com    2
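
A minimal sketch of the versioning variant follows. The function `apply_type2_version` is hypothetical, and the primary key is passed in by the caller so that the sample values 1000, 1001 and 1003 can be reproduced; a real load would normally draw keys from a sequence.

```python
def apply_type2_version(target, source_row, next_key):
    """Append a new version of the employee row instead of overwriting it."""
    versions = [r for r in target if r["Empid"] == source_row["Empid"]]
    if versions and versions[-1]["Email"] == source_row["Email"]:
        return target  # nothing changed, keep the history as it is
    new_row = dict(source_row)
    new_row["PM_PRIMARYKEY"] = next_key
    new_row["PM_VERSION_NUMBER"] = len(versions)  # 0 for the first version
    target.append(new_row)
    return target

target = []
apply_type2_version(target, {"Empid": 10, "Name": "Shane", "Email": "Shane@xyz.com"}, 1000)
apply_type2_version(target, {"Empid": 10, "Name": "Shane", "Email": "Shane@abc.co.in"}, 1001)
apply_type2_version(target, {"Empid": 10, "Name": "Shane", "Email": "Shane@abc.com"}, 1003)

for row in target:
    print(row["PM_PRIMARYKEY"], row["Email"], row["PM_VERSION_NUMBER"])
```

Every change adds a row, so the full email history of employee 10 stays queryable by version number.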
Slowly Changing Dimensions: Type II - Flag

Source
Empid  Name   Email
10     Shane  Shane@xyz.com

Target
PM_PRIMARYKEY  Empid  Name   Email          PM_CURRENT_FLAG
1000           10     Shane  Shane@xyz.com  1

Slowly Changing Dimensions: Flag Current

Source
Empid  Name   Email
10     Shane  Shane@abc.co.in

Target
PM_PRIMARYKEY  Empid  Name   Email            PM_CURRENT_FLAG
1000           10     Shane  Shane@xyz.com    N
1001           10     Shane  Shane@abc.co.in  Y

Slowly Changing Dimensions: Flag Current

Source
Empid  Name   Email
10     Shane  Shane@abc.com

Target
PM_PRIMARYKEY  Empid  Name   Email            PM_CURRENT_FLAG
1000           10     Shane  Shane@xyz.com    N
1001           10     Shane  Shane@abc.co.in  N
1003           10     Shane  Shane@abc.com    Y
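
The flag variant can be sketched the same way. `apply_type2_flag` is a hypothetical helper; it uses the Y/N values shown in the later tables and takes the primary key from the caller.

```python
def apply_type2_flag(target, source_row, next_key):
    """Flag the existing current row 'N' and append the new row flagged 'Y'."""
    for row in target:
        if row["Empid"] == source_row["Empid"] and row["PM_CURRENT_FLAG"] == "Y":
            if row["Email"] == source_row["Email"]:
                return target  # no change detected
            row["PM_CURRENT_FLAG"] = "N"
    target.append(dict(source_row, PM_PRIMARYKEY=next_key, PM_CURRENT_FLAG="Y"))
    return target

target = []
apply_type2_flag(target, {"Empid": 10, "Name": "Shane", "Email": "Shane@xyz.com"}, 1000)
apply_type2_flag(target, {"Empid": 10, "Name": "Shane", "Email": "Shane@abc.co.in"}, 1001)
apply_type2_flag(target, {"Empid": 10, "Name": "Shane", "Email": "Shane@abc.com"}, 1003)

current = [r for r in target if r["PM_CURRENT_FLAG"] == "Y"]
print(current)
```

Filtering on the flag returns the single current row without comparing versions or dates.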

Slowly Changing Dimensions: Type II - Effective Date

Source
Empid  Name   Email
10     Shane  Shane@xyz.com

Target
PM_PRIMARYKEY  Empid  Name   Email          PM_BEGIN_DATE  PM_END_DATE
1000           10     Shane  Shane@xyz.com  01/01/00

Slowly Changing Dimensions: Effective Date

Source
Empid  Name   Email
10     Shane  Shane@abc.co.in

Target
PM_PRIMARYKEY  Empid  Name   Email            PM_BEGIN_DATE  PM_END_DATE
1000           10     Shane  Shane@xyz.com    01/01/00       03/01/00
1001           10     Shane  Shane@abc.co.in  03/01/00

Slowly Changing Dimensions: Effective Date

Source
Empid  Name   Email
10     Shane  Shane@abc.com

Target
PM_PRIMARYKEY  Empid  Name   Email            PM_BEGIN_DATE  PM_END_DATE
1000           10     Shane  Shane@xyz.com    01/01/00       03/01/00
1001           10     Shane  Shane@abc.co.in  03/01/00       05/02/00
1003           10     Shane  Shane@abc.com    05/02/00
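
For the effective-date variant, a sketch under the same assumptions: `apply_type2_dates` is a hypothetical helper, dates are kept as the plain strings used in the tables, and an open row is marked with `None` in PM_END_DATE.

```python
def apply_type2_dates(target, source_row, next_key, change_date):
    """Close the open row with an end date and append a new open row."""
    for row in target:
        if row["Empid"] == source_row["Empid"] and row["PM_END_DATE"] is None:
            if row["Email"] == source_row["Email"]:
                return target  # no change detected
            row["PM_END_DATE"] = change_date
    target.append(dict(source_row, PM_PRIMARYKEY=next_key,
                       PM_BEGIN_DATE=change_date, PM_END_DATE=None))
    return target

target = []
apply_type2_dates(target, {"Empid": 10, "Name": "Shane", "Email": "Shane@xyz.com"}, 1000, "01/01/00")
apply_type2_dates(target, {"Empid": 10, "Name": "Shane", "Email": "Shane@abc.co.in"}, 1001, "03/01/00")
apply_type2_dates(target, {"Empid": 10, "Name": "Shane", "Email": "Shane@abc.com"}, 1003, "05/02/00")

for row in target:
    print(row["PM_PRIMARYKEY"], row["PM_BEGIN_DATE"], row["PM_END_DATE"])
```

The end date of each closed row equals the begin date of its successor, which matches the sample target table.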

Slowly Changing Dimensions: Type III

Source
Empid  Name   Email
10     Shane  Shane@xyz.com

Target
PM_PRIMARYKEY  Empid  Name   Email          PM_Prev_ColumnName  PM_EFFECT_DATE
1              10     Shane  Shane@xyz.com                      01/01/00

Slowly Changing Dimensions: Type III

Source
Empid  Name   Email
10     Shane  Shane@abc.co.in

Target
PM_PRIMARYKEY  Empid  Name   Email            PM_Prev_ColumnName  PM_EFFECT_DATE
1              10     Shane  Shane@abc.co.in  Shane@xyz.com       01/02/00

Slowly Changing Dimensions: Type III

Source
Empid  Name   Email
10     Shane  Shane@abc.com

Target
PM_PRIMARYKEY  Empid  Name   Email          PM_Prev_ColumnName  PM_EFFECT_DATE
1              10     Shane  Shane@abc.com  Shane@abc.co.in     01/03/00
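
Type III keeps one row and one level of history. In the sketch below, `apply_type3` is a hypothetical helper and the row layout mirrors the target columns shown in the tables.

```python
def apply_type3(row, new_email, effect_date):
    """Move the current value into the 'previous' column, then store the new one."""
    if row["Email"] != new_email:
        row["PM_Prev_ColumnName"] = row["Email"]
        row["Email"] = new_email
        row["PM_EFFECT_DATE"] = effect_date
    return row

row = {"PM_PRIMARYKEY": 1, "Empid": 10, "Name": "Shane",
       "Email": "Shane@xyz.com", "PM_Prev_ColumnName": None,
       "PM_EFFECT_DATE": "01/01/00"}

apply_type3(row, "Shane@abc.co.in", "01/02/00")
apply_type3(row, "Shane@abc.com", "01/03/00")
print(row)
```

After the second change the oldest value (Shane@xyz.com) is gone, which is the trade-off of Type III compared with Type II.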

1) Degenerate Dimension (DD)

A dimension that has a single attribute. We place this attribute in the fact table; this
is called a degenerate dimension (DD).

2) Metadata

• Data about the data.
• Metadata is stored in the data dictionary and the repository.

Why you need metadata

o Share resources
o Document the system

Without metadata

o Not sustainable
o Not able to fully utilize resources
1. Business metadata (for designing)
2. Technical metadata (for Informatica)

3) CTL (Capture Transform Load)

To get back into the main database.

4) Cube
• Multidimensional databases store information in the form of cubes.
• A cube is a collection of facts and related dimensions stored together in arrays.

5) What is OLAP?

• Online analytical processing applications are designed for online ad hoc data access
and analysis.
• Data is organized into multiple dimensions.
• Fast analysis of shared multidimensional information.

6) Data cleansing

Possible techniques for data cleansing include:

• Data normalization
• Data smoothing
• Treatment of missing values
• Data reduction: either the data may be too big for the program, or the expected time
for obtaining the solution might be too long.
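
Three of the techniques listed above can be illustrated with a small sketch. The specific choices here are assumptions, not the only reading of each term: missing values are filled with the mean, normalization is min-max scaling to the range 0 to 1, and smoothing is a simple moving average.

```python
def fill_missing(values):
    """Replace None with the mean of the known values."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]

def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def moving_average(values, window=3):
    """Smooth the series by averaging each run of `window` consecutive values."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

raw = [10, None, 30, 20, 40]          # hypothetical measurements with one gap
filled = fill_missing(raw)
normalized = min_max_normalize(filled)
smoothed = moving_average(filled)
print(filled, normalized, smoothed)
```

The order matters: the gap is filled first so that the later steps never see a missing value.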
7) Transformation

• A transformation is a process of converting a given input to the desired output.

8) Mappings

• A mapping consists of one or more sources, several transformations applied to these sources,
and a target.
• A mapping defines how the source data is channeled to the target.

9) Conformed Dimensions

• Conformed dimensions are those which are consistent across data marts.

10) Causal Dimensions

• Causal dimensions can be used for explaining why a record exists in a fact table.
• Causal dimensions should not change the grain of the fact table.


Helper Tables

• Helper tables are used when there are multi-valued dimensions, that is,
when there is a many-to-many relationship between a fact table and a
dimension table.

• A helper table can be placed between two dimension tables or between
a dimension table and a fact table.

Helper Tables - Example

Example: A customer having more than one bank account
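
The bank-account example can be sketched with a helper (bridge) table placed between two dimension tables. All table and column names below are hypothetical; SQLite is used only so the sketch runs as is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_dim (customer_key INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE account_dim  (account_key INTEGER PRIMARY KEY, account_type TEXT);
    -- Helper table: one row per (customer, account) pair.
    CREATE TABLE customer_account_helper (customer_key INTEGER, account_key INTEGER);
    INSERT INTO customer_dim VALUES (1, 'Shane');
    INSERT INTO account_dim  VALUES (100, 'Savings'), (101, 'Checking');
    INSERT INTO customer_account_helper VALUES (1, 100), (1, 101);
""")

# Resolve the many-to-many relationship by joining through the helper table.
accounts = conn.execute("""
    SELECT c.name, a.account_type
    FROM customer_dim c
    JOIN customer_account_helper h ON h.customer_key = c.customer_key
    JOIN account_dim a ON a.account_key = h.account_key
    ORDER BY a.account_key
""").fetchall()
print(accounts)
```

Neither dimension table needs a repeating column for the other side; the pairs live only in the helper table.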

Surrogate Keys

• Joins between fact and dimension tables should be based on surrogate
keys

• Surrogate keys should not be composed of natural keys glued together

• Users should not obtain any information by looking at these keys

• These keys should be simple integers
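
A minimal sketch of surrogate key assignment, assuming a hypothetical `SurrogateKeyGenerator` class: each distinct natural key receives the next integer, and the integer itself carries no meaning.

```python
from itertools import count

class SurrogateKeyGenerator:
    """Hand out simple sequential integers, one per distinct natural key."""

    def __init__(self, start=1):
        self._next = count(start)
        self._by_natural = {}

    def key_for(self, natural_key):
        # Reuse the existing surrogate key if this natural key was seen before.
        if natural_key not in self._by_natural:
            self._by_natural[natural_key] = next(self._next)
        return self._by_natural[natural_key]

keys = SurrogateKeyGenerator()
k1 = keys.key_for(("EMP", "10"))
k2 = keys.key_for(("EMP", "11"))
k3 = keys.key_for(("EMP", "10"))   # same natural key as the first call
print(k1, k2, k3)
```

The natural key is kept only in the lookup, so nothing about the employee can be read from the surrogate value.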

Factless Fact Tables

The two types of factless fact tables are:

• Coverage tables

• Event tracking tables

Factless Fact Tables - Coverage Tables

Coverage tables are required when a primary fact table is sparse

Example: Tracking products in a store that did not sell
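
The "did not sell" question is answered by subtracting the sales fact rows from the coverage rows. The store and product identifiers below are made up for illustration.

```python
# Coverage table: every (store, product) pair that was on offer.
coverage = {("store1", "A"), ("store1", "B"), ("store1", "C")}

# Sparse sales fact table: rows exist only for products that actually sold.
sales_fact = [("store1", "A", 5), ("store1", "C", 2)]

sold = {(store, product) for store, product, _qty in sales_fact}
not_sold = sorted(coverage - sold)
print(not_sold)
```

The sales fact table alone cannot answer this, because a product with no sales has no row there.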

Factless Fact Tables - Event Tracking

These tables are used for tracking an event:

Example: Tracking student attendance
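
An event tracking table holds only keys, so analysis consists of counting rows. The sketch below uses hypothetical date, student and class keys.

```python
from collections import Counter

# Factless fact rows: (date_key, student_key, class_key). There is no measure column.
attendance_fact = [
    ("2000-01-03", "S1", "Math"),
    ("2000-01-03", "S2", "Math"),
    ("2000-01-04", "S1", "Math"),
]

# Attendance per student is simply the number of rows for that student.
attendance_per_student = Counter(student for _date, student, _cls in attendance_fact)
print(attendance_per_student)
```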
