
ETL TESTING REAL TIME INTERVIEW QUESTIONS

Jul 30, 2015


What is the difference between an ODS and a staging area?
ODS is nothing but the Operational Data Store, which holds the data from the time
the business gets started. That is, it holds the history of data up to yesterday's
data (depending on customer requirements). Some...
The ODS (Operational Data Store) contains near-real-time data (because we have to
apply changes against current data). Real-time data is first dumped into the ODS,
also called the landing area; later the data moves to the staging area, which is
the place where we do all the transformations.
Let's suppose we have some 10,000-odd records in the source system. When we
load them into the target, how do we ensure that all 10,000 records loaded to
the target don't contain any garbage values? How do...
It requires 2 steps:
1. Compare record counts:
Select count(*) from source
Select count(*) from target
2. If the source and target tables have the same attributes and datatypes:
Select * from source
MINUS
Select * from target
Else, we have to go for attribute-wise testing of each attribute according to the design doc.
As other posts have mentioned, I would do some of the following:
Select COLUMN, count(*) from TABLE group by COLUMN order by COLUMN
Select min(COLUMN), max(COLUMN) from TABL...
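Building on those profiling queries, a few attribute-level checks can flag garbage
values directly. A minimal sketch in generic SQL; the target_customer table and its
columns are hypothetical:
-- Mandatory attributes that arrived empty: should return 0
Select count(*) from target_customer
where customer_id is null or customer_name is null;
-- Impossible or out-of-range dates: should return 0
Select count(*) from target_customer
where created_date > current_date or created_date < date '1900-01-01';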
What are the ETL tester responsibilities?
Asked By: phani.nekkkalapudi | Asked On: Nov 28th, 2011
3 answers
Answered by: Uttam22 on: Feb 20th, 2014
An ETL tester primarily tests source data extraction, business transformation logic and
target table loading. There are many tasks involved in doing this, which
are given below - 1. Stage tab...

Answered by: radhakrishna on: Nov 4th, 2013


ETL tester responsibilities include writing SQL queries for various scenarios such as the count
test, primary key test, duplicate test, attribute test, default check, technical data
quality, business data qua...
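For illustration, a hedged sketch of what a few of those scenario queries can look
like; src_orders, trg_orders and their columns are hypothetical names:
-- Count test: the two results should match
Select count(*) from src_orders;
Select count(*) from trg_orders;
-- Default check: a column that must default to 'N' should never stay null
Select count(*) from trg_orders where cancelled_flag is null;
-- Technical data quality: source values too long for the target column
Select count(*) from src_orders where length(customer_name) > 50;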
Find out the number of columns in a flat file
Asked By: kishore.vakalapudi | Asked On: Aug 3rd, 2010
6 answers
In ETL testing, if the source is a flat file, how will you perform data count validation?
Answered by: JeffManson on: Feb 13th, 2014
Not always (in my experience, quite rarely in fact). Most often the flat file is just
that, a flat file. On UNIX, the wc command is great; on Windows, one could open the
file in Notepad and CTRL+...
Answered by: Radhakrishna on: Nov 13th, 2013
To find the count in a flat file, you can import the file into an Excel sheet (use
Data -> Import External Data -> Import Data) and find the count of records using the count()
function [go to Insert -> Function -> COUNT] in Excel....
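If a database is available, another option is to expose the file to SQL and count it
there. A sketch assuming an Oracle-style external table; the directory object
etl_src_dir, the file name and the columns are all hypothetical:
create table src_file_ext (
  customer_id   number,
  customer_name varchar2(100)
)
organization external (
  type oracle_loader
  default directory etl_src_dir
  access parameters (
    records delimited by newline
    fields terminated by ','
  )
  location ('daily_file.txt')
);
-- Record count of the flat file
Select count(*) from src_file_ext;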
What is the latest version of PowerCenter / PowerMart?
Asked By: Interview Candidate | Asked On: Jun 5th, 2005
16 answers
Answered by: Mark Rutherford on: Jun 17th, 2013
9.5.1
Answered by: Kumar on: Nov 17th, 2012
9.5
The test objective is to:
Asked By: sandydil | Asked On: Mar 7th, 2013
1. To find faults in software. 2. To prove that the software has no defects. 3. To find
performance problems. 4. To give confidence in the software. Select the best option.
List of all the etl and reporting tools available in the market
Asked By: perlamohan | Asked On: May 28th, 2008

15 answers
Can anyone let me know where I can find the list of ETL & reporting tools available in
the market? Thanks in advance.
Answered by: Michael on: Feb 27th, 2013
Add InetSoft's Style Report to the list of BI reporting tools.
Answered by: sajid on: Feb 22nd, 2013
Hi, I would like to share my knowledge on this. ETL Tools: Ab Initio, BusinessObjects
Data Integrator, IBM InfoSphere DataStage, Informatica, Oracle Warehouse Builder,
SQL Server In...
Etl testing in Informatica
Asked By: phani.nekkkalapudi | Asked On: Nov 28th, 2011
2 answers
As an ETL tester, what are the things to test in the Informatica tool? What types of
testing have to be done in Informatica, and what are the things to test in each type?
Is there any documentation on ETL testing?
Answered by: lakshmi on: Jan 31st, 2013
ETL Testing in Informatica: 1. First check that the workflow exists in the specified
folder. 2. Run the workflow. If the workflow succeeds, then check that the target
table is loaded with proper data; else we nee...
Answered by: Lokesh M on: Feb 18th, 2012
- Test ETL software. - Test ETL data warehouse components. - Execute backend
data-driven tests. - Create, design and execute test plans, test harnesses and test
cases. - Identify, troubleshoot and pro...
What are the testings performed in etl testing?
Asked By: sindhugowda | Asked On: Jan 23rd, 2013
1 answer
Answered by: jatin on: Feb 7th, 2013
Data-Centric Testing: Data-centric testing revolves around testing the quality of the
data. The objective of data-centric testing is to ensure that valid and correct data is
in the system. Following are th...

If a lookup (LKP) on the target table is used, can we update the rows without an
Update Strategy transformation?
Yes, by using a dynamic lookup.
The Update Strategy transformation determines whether to insert, update, delete or
reject a record for the target. We can bypass the Update Strategy transformation by
creating a router to divide rows based on insert, update, etc., and connecting each
group to one of multiple instances of the target. In the session, for each target
instance we can check the appropriate box to mark records for insert, update or delete.

Even if you use a dynamic lookup, you should make use of an update strategy to mark
the records either for insert or for update in the target using the lookup cache (the
lookup will be cached on the target).
The requirement is not clear here as to whether to use a dynamic lookup or session
properties.
Note: When you create a mapping with a Lookup transformation that uses a dynamic
lookup cache, you must use Update Strategy transformations to flag the rows for
the target tables.
Clean data before loading
Asked By: riyazz.shaik | Asked On: Sep 17th, 2008
3 answers
Why is it necessary to clean data before loading it into the warehouse?
Answered by: Prashant Khare on: Oct 3rd, 2012
Data cleansing is the process of detecting and correcting corrupt and inaccurate
data in a table or database.
The following steps are used:
1) Data auditing
2) Workflow specification
3) Workflow execution
4) Post-processing and controlling
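As a rough illustration of the auditing and correction steps, a sketch in generic SQL;
the stg_customer staging table and its columns are hypothetical:
-- Data auditing: profile the staged data for obviously corrupt values
Select count(*) from stg_customer where email not like '%@%';
Select count(*) from stg_customer where birth_date > current_date;
-- Correction: standardize stray whitespace and casing before the warehouse load
update stg_customer
set customer_name = trim(upper(customer_name));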
Informatica: if flat file name varies day by day ...
Asked By: Mahendra | Asked On: Nov 16th, 2011
5 answers

If we are using a flat file in our load, and the flat file name changes daily, how do
we handle this without changing the file name manually every day? For example, the
file name changes depending on the date; what should I do? Please help.
Answered by: Charu on: Jan 24th, 2012
Use the indirect filelist option at the Informatica session. Say the filelist name is
daily_file.txt; put a shell script daily_filename.sh in the pre-session command. The
content of daily_filename.sh is ...
Answered by: SHIV on: Dec 28th, 2011
You can use the Informatica file list option in the target to get dynamic file names,
along with the Transaction Control transformation, so that you can create dynamic
filenames based on some transaction properties.....
What are the different lookup methods used in Informatica?
Asked By: Interview Candidate | Asked On: Sep 9th, 2005
11 answers
Answered by: SHIV on: Dec 28th, 2011
The main difference between connected and unconnected lookup is that we can call an
unconnected lookup based on some conditions, but not a connected lookup.
Answered by: SreedharLokaray on: Nov 2nd, 2011
A lookup can be used as connected or unconnected. Apart from caching and receiving
input values from the pipeline, there is one more difference: if you want to use the
same lookup more than once in a mapping ...
How do you implement the concept of reusability in MicroStrategy? (In
Informatica it is done by creating mapplets.)
Asked By: arjunsarathy | Asked On: Jan 2nd, 2006
1 answer
Answered by: Doyel Ghosh Dutta on: Sep 16th, 2011
I think in MicroStrategy a view can be a reusable object.
How to check the existence of a lookup file in a graph ?
Asked By: debashish29 | Asked On: Sep 9th, 2011
How to check the existence of a lookup file in a graph? The requirement is: if the
lookup file is present, then some search will be carried out in it; else a default
value will be set. Please note we need to check the existence of the lookup file at
graph level only.

What is a mapping, session, worklet, workflow, mapplet?


Asked By: Interview Candidate | Asked On: Aug 29th, 2005
7 answers
Answered by: koteshchowdary on: Aug 26th, 2011
Simply put:
worklet: a set of sessions
mapplet: a set of transformations that can be called within a mapping
Answered by: Naveen on: Aug 22nd, 2011
Mapping: a set of transformations that moves data from source to target, applying
transformations along the way
Session: a set of instructions in which you specify the source and target addresses
Worklet: a set of sessions
Cache files
Asked By: YugandharS | Asked On: Jul 27th, 2010
3 answers
Which transformations (t/rs) create cache files?
Answered by: deepthi on: Aug 20th, 2011
Aggregator
joiner
lookup
rank
sorter
Answered by: shiva on: Jul 22nd, 2011
Actually, the answer is only partially related to the question: transformations
like Joiner, Lookup, Aggregator and Rank use caches.

Cache files are mainly used in Lookup transformations. In Informatica there are 2
types: static cache and dynamic cache. Both are used by a connected lookup; an
unconnected lookup uses only a static cache. If a Lookup SQL Override is used, the
lookup uses an index cache and a data cache. Index cache - stores key columns (i.e.,
those on which an index is present). Data cache - stores output values. Sufficient
memory allocation for these 2 files is an important aspect of lookup optimization.
If the memory allocation is Auto, make sure that "Maximum Memory Allowed For Auto
Memory Attributes" and "Maximum Percentage of Total Memory Allowed For Auto Memory
Attributes" are defined properly.
How do we call shell scripts from Informatica?


Asked By: Interview Candidate | Asked On: May 27th, 2005
5 answers
Answer posted by staline on 2005-05-27 00:42:44: You can use a Command task to
call the shell scripts, in the following ways: 1. Standalone Command task. You can
use a Command task anywhere in the workflow or worklet to run shell commands.
2. Pre- and post-session shell command. You can call...
Answered by: Hanuma on: Jul 24th, 2011
You can use a Command task to call the shell scripts, in the following ways: 1.
Standalone Command task. You can use a Command task anywhere in the workflow
or worklet to run shell commands. 2. Pr...
Answered by: praveenathota on: Nov 12th, 2006

Hi,There are two ways to do this,they are as follows:1)we can use a command task
anywhere in the workflow or worklet to run the shell commands.2)In the session
task,we can call reusable command ...
1. What is a Data warehouse?
Bill Inmon is known as the father of data warehousing. "A Data warehouse is a
subject oriented, integrated, time variant, non volatile collection of data in support
of management's decision making process."

Subject oriented: means that the data addresses a specific subject, such
as sales, inventory etc.

Integrated: means that the data is obtained from a variety of sources.

Time variant: implies that the data is stored with its time context, so that
changes to the data over time are preserved.

Non volatile: implies that data is never removed, i.e., historical data is
also kept.

2. What is the difference between a database and a data warehouse?
A database is a collection of related data.
A data warehouse is also a collection of information, as well as a supporting system.

A database is a collection of related data, whereas a data warehouse stores
historical data; the business users take their decisions based on historical data only.

In a database we maintain only current data (often not more than 3 years), but in a
data warehouse we maintain history data, from the starting day of the enterprise.
DML commands (insert, update, delete) are routinely run against a database; once
data is loaded into the data warehouse, we do not modify it - it is only queried.

A database is used for insert, update and delete operations, whereas a data
warehouse is used for select, to analyse the data.

Database: The tables and joins are complex since they are normalized.
Data Warehouse: Tables and joins are simple since they are de-normalized.

Database: ER modeling techniques are used for database design.
Data Warehouse: Dimensional modeling techniques are used for database design.

Database: Optimized for write operations.
Data Warehouse: Optimized for read operations.

Database: Performance is slow for analytical queries.
Data Warehouse: High performance for analytical queries.

A database uses the OLTP concept; a data warehouse uses the OLAP concept, meaning
the data warehouse stores historical data.

A database is a collection of related data pertaining to the same domain, whereas a
data warehouse is a collection of data integrated from different sources and stored
in one container for making managerial decisions (getting knowledge).

In a database we use CRUD operations (create, read, update, delete), but in a
data warehouse we mainly use the select operation.

3. What are the benefits of data warehousing?

Historical information for comparative and competitive analysis.

Enhanced data quality and completeness.

Supplementing disaster recovery plans with another data backup source.

4. What are the types of data warehouse?
There are mainly three types of data warehouse:

Enterprise Data Warehouse

Operational data store

Data Mart

5. What is the difference between data mining and data warehousing?
In data mining, the operational data is analyzed using statistical and clustering
techniques to find hidden patterns and trends. So, data mining does a kind of
summarization of the data, which can be used by data warehouses for faster
analytical processing for business intelligence.
A data warehouse may make use of data mining for analytical processing of the
data in a faster way.

Data mining is one of the key concepts for analyzing data in a data warehouse.
Generally, basic testing concepts remain the same across all domains, so the basic
testing questions will also remain the same. The only addition would be some questions
on the domain; e.g., in the case of ETL testing interview questions, this would be
some concepts of ETL, how-tos on some specific types of checks/tests in SQL, and a
set of best practices. Here is a list of some ETL testing interview questions:

Q. 1) What is ETL?
Ans. ETL - extract, transform, and load. Extracting data from outside source
systems. Transforming raw data to make it fit for use by different departments.
Loading transformed data into target systems like a data mart or data warehouse.
Q. 2) Why is ETL testing required?
Ans.
To verify the correctness of data transformation against the signed-off business
requirements and rules.
To verify that the expected data is loaded into the data mart or data warehouse
without loss of any data (a reconciliation sketch follows this list).
To validate the accuracy of reconciliation reports (if any; e.g., comparing the
report of transactions made via a bank ATM - ATM report vs. bank account report).
To make sure the complete process meets performance and scalability requirements.
Data security is also sometimes part of ETL testing.
To evaluate the reporting efficiency.
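As an illustration of the "no loss of data" and reconciliation points above, a
minimal sketch in generic SQL (MINUS as in Oracle); the atm_txn_src and atm_txn_trg
tables are hypothetical:
-- Reconciliation: the two totals should match
Select sum(txn_amount) from atm_txn_src;
Select sum(txn_amount) from atm_txn_trg;
-- Completeness: source transactions missing from the target (should return no rows)
Select txn_id from atm_txn_src
MINUS
Select txn_id from atm_txn_trg;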
Q 3) What is Data warehouse?
Ans. A data warehouse is a database used for reporting and data analysis.

Q4) What are the characteristics of a Data Warehouse?


Ans. Subject Oriented, Integrated, Time-variant and Non-volatile
Q5) What is the difference between Data Mining and Data Warehousing?
Ans. Data mining - analyzing data from different perspectives and summarizing it
into useful decision-making information. It can be used to increase revenue, cut
costs, increase productivity or improve any business process. There are a lot of
tools available in the market for various industries to do data mining. Basically,
it is all about finding correlations or patterns in large relational databases.
Data warehousing comes before data mining. It is the process of compiling and
organizing data into one database from various source systems, whereas data
mining is the process of extracting meaningful data from that database (data
warehouse).
Q6. What are the main stages of Business Intelligence?
Ans. Data Sourcing > Data Analysis > Situation Awareness > Risk Assessment
> Decision Support
Q7. What tools you have used for ETL testing?
Ans.
1. Data access tools e.g., TOAD, WinSQL, AQT etc. (used to analyze content of
tables)
2. ETL Tools e.g. Informatica, DataStage
3. Test management tools, e.g., Test Director, Quality Center etc. (to maintain
requirements, test cases, defects and the traceability matrix)
Below are a few more questions that can be asked:
Q8. What is Data Mart?
Q9. Data Warehouse Testing vs Database Testing
Q10. Who are the participants of data warehouse testing
Q11. How to prepare test cases for ETL / Data Warehousing testing?
Q12. What is OLTP and OLAP
Q13. What is look up table
Q14. What is MDM (Master data management)
Q15. Give some examples of real time data warehousing
Benefits of ETL Testing

Production Reconciliation

IT Developer Productivity

Data Integrity


Types of ETL testing: what is covered in ETL testing

Each organization categorizes testing types in its own way, based on the testing practice or testing strategy built at
the organization level. This holds true for ETL testing also. Sometimes, for larger projects/programs, it varies from
client to client. Generally, below are the main types of testing that are covered under ETL testing:

Reconciliation testing: Sometimes this is also referred to as source-to-target count testing. In this check,
the counts of records on both sides are matched. Although this is not the best way, it helps in case of a time crunch.

Constraint testing: Here the test engineer maps data from source to target and identifies whether the data is
mapped or not. Following are the key checks: UNIQUE, NULL, NOT NULL, Primary Key, Foreign Key, DEFAULT,
CHECK.

Validation testing (source-to-target data): This is generally executed in mission-critical or financial projects.
Here the test engineer validates each data point and matches source to target data.

Testing for duplicate check: This is done to ensure that there are no duplicate values for unique columns.
Duplicate data can arise for reasons such as a missing primary key. A sketch is given after this list.

Testing for attribute check: To check if all attributes of the source system are present in the target table.

Logical or transformation testing: To test for any logical gaps in the transformation logic. Here, depending upon
the scenario, the following methods can be used: boundary value analysis, equivalence partitioning, comparison
testing, error guessing or, sometimes, graph-based testing methods. It also covers testing for look-up conditions.

Incremental and historical data testing: Tests to check the data integrity of old and new data with the addition
of new data. It also covers the validation of purging-policy-related scenarios.

GUI / navigation testing: To check the navigation or GUI aspects of the front end reports.

In case of ETL or data warehouse testing, re-testing or regression testing is also part of this effort.
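As referenced under the duplicate check above, here is a hedged sketch of duplicate
and constraint checks in generic SQL; customer_dim, sales_fact and their columns are
hypothetical:
-- Duplicate check: combinations expected to be unique (should return no rows)
Select first_name, last_name, date_of_birth, count(*)
from customer_dim
group by first_name, last_name, date_of_birth
having count(*) > 1;
-- Foreign key (orphan) check: fact rows with no matching dimension row
Select count(*)
from sales_fact f
left join customer_dim d on f.customer_key = d.customer_key
where d.customer_key is null;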

ETL Testing Tools and Interview Questions with Answers


Business information and data are of key importance to any business and company. Many
companies invest a lot of time and money in the process of analyzing and sorting out this vital
information. Analysis and integration of data has gained a huge potential market, and so, to make
this process organized and simple, ETL testing tools have been introduced by many software
vendors.
Presently, there are many open source ETL tools available, where the vendors allow users to
directly download free versions from their official websites. All the basic functions are available
in this free download, but to upgrade to the next level, the company needs to subscribe with the
vendor on payment.

Each company has a different business structure and model, so it needs to make a clear analysis
before choosing the ETL tool for its business. With the help of these open source ETL tools,
businesses have the opportunity to try out free software tools without any huge investment.
All the software giants have introduced their own BI tools.
Some of the most used ETL tools are as follows:
Talend Open Studio,

Clover ETL,
Elixir,
Pentaho,
Informatica,
IBM Cognos Data Manager,
Oracle Data Integrator,
SAS Data Integration Studio,
AB Initio, SAP Business Objects Data Integrator.

When an ETL tool has been selected, the next logical step is testing using these tools. Here,
the company will get to know if it is on the right path in its selection of the tool.
As these tools help in dealing with huge amounts of data and historic data, it is necessary to carry
out ETL testing. To keep a check on the accuracy of the data, ETL testing is very important.
There are two types of ETL testing available:
Application Testing

Data-Centric Testing

ETL Testing Process:


Although there are many ETL tools, there is a simple testing process which is commonly used in ETL
testing. It is as important as the implementation of the ETL tool in your business.
Having a well-defined ETL testing strategy can make the testing process much easier. Hence, this
process needs to be followed before you start the data integration process with the selected ETL
tool.
In this ETL testing process, a group of experts comprising the programming and development team
will start writing SQL statements. The development team may customize them according to the requirements.
The ETL testing process is:
Analyzing the requirement - understanding the business structure and its particular
requirements.
Validation and test estimation - an estimation of the time and expertise required to carry on with
the procedure.
Test planning and designing the testing environment - based on the inputs from the estimation,
an ETL environment is planned and worked out.
Test data preparation and execution - data for the test is prepared and executed as per the
requirement.

Summary report: upon completion of the test run, a brief summary report is prepared for
improvement and conclusion.

ETL Testing Interview Questions with Answers:


With the huge requirement for ETL testing comes a huge requirement for experts to carry out these
ETL testing processes. Today, there are many jobs available in this area.
But only if you are well aware of the technical features and applications will you have a chance of
getting hired for this profile. You have to be well prepared on the basic concepts of ETL tools, their
techniques and processes to give your best shot.
Below you can find a few questions and answers which are most frequently asked in ETL
testing interviews:
Q #1. What is ETL?
Ans. ETL refers to Extracting, Transforming and Loading of data from any outside system to the
required place. These are the basic 3 steps in the data integration process. Extracting means
locating the data and extracting it from the source file; transforming is the process of converting
the extracted data into the format required by the target; and loading means storing the data in
the target system in the applicable format.

Q #2. Why is ETL testing required?


Ans.

To keep a check on the data being transferred from one system to the other.
To keep track of the efficiency and speed of the process.
To be well acquainted with the ETL process before it gets implemented into your business and
production.

Q #3. What are ETL tester responsibilities?


Ans.

Requires in-depth knowledge of the ETL tools and processes.

Needs to write SQL queries for the various given scenarios during the testing phase.
Should be able to carry out different types of tests, such as primary key and default
checks, and keep a check on the other functionality of the ETL process.
Quality check

Q #4. What are Dimensions?


Ans. Dimensions are the groups or categories through which the summarized data are sorted.
Q #5. What is the staging area referring to?
Ans. The staging area is the place where data is stored temporarily in the process of data
integration. Here, the data is cleansed and checked for any duplication.
Q #6. Explain ETL Mapping Sheets.
Ans. ETL mapping sheets contain all the required information from the source file, including all the
rows and columns. This sheet helps the experts write SQL queries for testing the ETL tools.
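For example, if a mapping sheet (hypothetically) specifies that trg_customer.full_name
is derived by concatenating src_customer.first_name and last_name, a tester could
derive a query like this sketch from it:
-- Any rows returned indicate a transformation defect
Select s.customer_id
from src_customer s
join trg_customer t on t.customer_id = s.customer_id
where t.full_name <> s.first_name || ' ' || s.last_name;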
Q #7. Mention few Test cases and explain them.
Ans.

Mapping doc validation - verifying whether the ETL information is provided in the mapping doc.
Data check - every aspect of the data, such as data checks, number checks and null checks,
is tested in this case.
Correctness issues - misspelled data, inaccurate data and null data are tested.

Q #8. List a few ETL bugs.


Ans. Calculation Bug, User Interface Bug, Source Bugs, Load condition bug, ECP related bug.

ETL interview questions and answers

1. What is the ETL process? How many steps does ETL contain?


2. What is full load & incremental or refresh load? - ETL
3. What is a three-tier data warehouse? - ETL
4. What are snapshots? What are materialized views & where do we use them? What is a materialized view log? - ETL
5. What is partitioning? Explain round-robin and hash partitioning. - ETL
6. What is a mapping, session, worklet, workflow, mapplet? - ETL
7. What is an Operational Data Store? - ETL
8. When do we analyze the tables? How do we do it? - ETL
9. How to fine-tune mappings? - ETL
10. What are the differences between connected and unconnected lookup? - ETL
11. When do we use dynamic cache and static cache in connected and unconnected lookup transformations? Difference
between Stop and Abort - ETL
12. What is tracing level? How many types of transformations support sorted input? - ETL
13. What are the differences between SQL Override and Update Override? - ETL
14. What is a factless fact schema? - ETL
15. In Informatica, I have 10 records in my source system. I need to load 2 records to the target for each run. How do I do this? - ETL
16. Why can't we see the data in a dataset? - ETL
17. How are DTM buffer size and buffer block size related? - ETL

18. State the differences between a shortcut and a reusable transformation - ETL

19. What are the output files that the Informatica server creates during a session run? - ETL

ETL Testing Interview Questions and answers


Question.1 What is a Data Warehouse?
Answer: A Data Warehouse is a collection of data marts representing historical
data from different operational data sources (OLTP). The data from these OLTP
systems are structured and optimized for querying and data analysis in a Data Warehouse.
Question.2 What is a Data mart?
Answer: A Data Mart is a subset of a data warehouse that can provide data for
reporting and analysis for a section, unit or department like the Sales Dept, HR Dept,
etc. Data Marts are sometimes also called HPQS (Higher Performance Query
Structures).
Question.3 What is OLAP?
Answer:
OLAP stands for Online Analytical Processing. It uses database tables
(Fact and Dimension tables) to enable multidimensional viewing, analysis and
querying of large amounts of data.
Question.4 What is OLTP?
Answer: OLTP stands for Online Transaction Processing. Except for data warehouse
databases, most other databases are OLTP systems. OLTP systems use a normalized
schema structure and are designed for recording the daily operations and
transactions of a business.
Question.5 What are Dimensions?
Answer: Dimensions are categories by which summarized data can be viewed.
For example, a profit fact table can be viewed by a time dimension.
Question.6 What are Conformed Dimensions?
Answer: Dimensions which are reusable and fixed in nature. Examples: customer,
time and geography dimensions.
Question.7 What are Fact Tables?
Answer: A Fact Table is a table that contains summarized numerical (facts) and
historical data. A Fact Table has a foreign key-primary key relation with the
dimension tables. The Fact Table maintains the information in 3rd normal form.

Question.8 What are the types of Facts?


Answer: The types of facts are as follows.
1. Additive Facts: a fact which can be summed up for any of the dimensions
available in the fact table (e.g., a sales amount).
2. Semi-Additive Facts: a fact which can be summed up for some dimensions but
not for all dimensions available in the fact table (e.g., an account balance,
which cannot be summed over time).
3. Non-Additive Facts: a fact which cannot be summed up for any of the
dimensions available in the fact table (e.g., a ratio or percentage).
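A short sketch of how additivity plays out in SQL; sales_fact and balance_fact are
hypothetical tables:
-- Additive: sales_amount can be summed across any dimension
Select store_id, sum(sales_amount) from sales_fact group by store_id;
-- Semi-additive: account balances sum across accounts but not across time,
-- so take the latest snapshot instead of summing snapshots
Select sum(balance) from balance_fact
where snapshot_date = (Select max(snapshot_date) from balance_fact);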
Question.9 What are the types of Fact Tables?
Answer: The types of fact tables are:
1. Cumulative Fact Table: this type of fact table generally describes what has
happened over a period of time. It contains additive facts.
2. Snapshot Fact Table: this type of fact table deals with a particular point in
time. It contains non-additive and semi-additive facts.
Question.10 What is Grain of Fact?
Answer: The Grain of Fact is defined as the level at which the fact information is
stored in a fact table. This is also called Fact Granularity or Fact Event Level.
Question.11 What is Factless Fact table?
Answer: A fact table which does not contain facts is called a factless fact table.
Generally, when we need to combine two data marts, one data mart will have a
factless fact table and the other a common fact table.
Question.12 What are Measures?
Answer: Measures are numeric data based on columns in a fact table.
Question.13 What are Cubes?
Answer: Cubes are data processing units composed of fact tables and dimensions
from the data warehouse. They provide multidimensional analysis.
Question.14 What are Virtual Cubes?
Answer: These are combinations of one or more real cubes and require no disk
space to store them. They store only the definition and not the data.
Question.15 What is a Star schema design?
Answer:
A Star schema is defined as a logical database design in which there
will be a centrally located fact table which is surrounded by at least one or more
dimension tables. This design is best suited for Data Warehouse or Data Mart.
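A minimal sketch of such a design in generic SQL (illustrative table and column
names): a central fact table carrying the measures, with foreign keys out to each
dimension table:
create table date_dim (
  date_key      integer primary key,
  calendar_date date,
  month_name    varchar(20)
);
create table product_dim (
  product_key  integer primary key,
  product_name varchar(100)
);
create table sales_fact (
  date_key     integer references date_dim (date_key),
  product_key  integer references product_dim (product_key),
  sales_amount decimal(12,2),
  units_sold   integer
);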
Question.16 What is Snow Flake schema Design?
Answer: In a Snow Flake design, the dimension table (a de-normalized table) is
further divided into one or more dimension tables (normalized tables) to organize
the information in a better structural format. To design a snow flake, we should
first design the star schema.
Question.17 What is an Operational Data Store [ODS]?

Answer: It is a collection of integrated databases designed to support operational
monitoring. Unlike the OLTP databases, the data in the ODS is integrated,
subject-oriented and enterprise-wide.
Question.18 What is Denormalization?
Answer: Denormalization means allowing redundant, duplicated keys in a table to
simplify queries. The dimension table follows the denormalization method with the
technique of the surrogate key.
Question.19 What is Surrogate Key?
Answer: A Surrogate Key is a sequence-generated key which is assigned to be the
primary key in the system (table).
