Вы находитесь на странице: 1из 18

Data warehouse Concepts

What is BIDW?
Introduction to BI
Introduction to DW
Need of DW
What is Data mart?
What is ODS?
Advantages ODS and Differences with DW

Trends in BI
Data Warehousing: Consolidate data from many
sources in one large repository.
Loading, periodic synchronization of replicas.
Semantic integration.

OLAP: Complex SQL queries, business-oriented


queries based on spreadsheet-style operations and
multidimensional view of data.
Data Mining: Exploratory analysis; essentially, fishing
for interesting trends and anomalies.

Data Warehouse
A data warehouse is a
subject-oriented
integrated
time-varying
non-volatile

collection of data that is used primarily in


organizational decision making.
-- Bill Inmon, Building the Data Warehouse 1996

Subject Oriented
Data is Integrated and Loaded by Subject
Cust

Prod

1996
1996
1997

O/P
1998
A/R

D/W
Data

Time Variant
Operational System

Data Warehouse

View of The
Business Today

Designated Time
Frame (3 - 10 Years)

Operational Time
Frame

One Snapshot Per


Cycle

Key Need Not Have Key Includes Date


Date

Integrated
Operational
Order Processing

Systems
Order ID = 10
D/W

Accounts Receivable

Order ID = 12
Order ID = 16

Product Management

Order ID = 8

HR System

Sex = M/F
D/W

Payroll

Sex = 1/2
Sex = M/F

Product Management

Sex = 0/1

Non-Volatile
Operational System

Data Warehouse

CRUD Actions

No Data Update

Insert
Load
Create
Update

Read

Read

Read

Replace

Read

Delete

Read

Summary
A Data Warehouse Is A Structured Repository
of Historic Data.

It Is:

Subject Oriented
Integrated
Time Variant
Non-volatile

It Contains:
Business Specified Data,
To Answer Business Questions

Evolution of Data mining


QUERY

OLAP

DATA MINING

Extraction of
detailed and
summary data

Summaries, trends
and forecasts

Knowledge discovery
of hidden patterns and
insights

Information

Analysis

Insight and
Prediction

Who
purchased
mutual funds in
the last 3
years?

What is the
income
distribution
of mutual
fund buyers?

Who will buy a


mutual fund in the
next 6 months
and why?

Popular uses of DataWarehousing

To build customer centric views by consolidating


islands of information
To enhance financial consolidation and business
consolidation
Identifying and monitoring Key Performance
Indicators and Business Metrics
Leverage on centralized information by coupling
DW with the Internet

Why is it Important?
Predict New Trends - Beat The Competition To
Market
Understand and Better Service the Customer
Helps to understand the business
Operational Systems are focused on
running the business, not understanding It!

Profitability - Increase Productivity Per Employee

Approaches..
Data Warehouse
A Multi-Subject Information Store Designed For DSS Apps
Typically 100s of Gigabytes to Terabytes

Data Mart

A Single Subject Data Warehouse


Often Departmental or Line of Business Oriented
Typically Less Than a 100 Gigabytes

Centralized Data Warehouses vs.


Departmental Data Warehouses
The Data Warehouse Dogma...
Source once feed many

Good

Not so good

Single view
of truth

Difficult
to build

Quick to
build

No single
version of
truth,
Management
issues

The Data Mart Dogma...


Source from many to feed many...

Data Marts vs. Data Warehouse


Payback
ROI
Avg Cost
Risk

Data Mart
Data Warehouse
1.7 Year
~3 Years
532 %
321%
$1.3 Million $2.2 Million
Low
High

(IDC - ROI of Data Warehousing, 62 companies)

70% of DSS/Data Warehouse Projects are


Departmental or Data Marts.
Sentry Market Research study of 700 Major I/T Purchasers