Вы находитесь на странице: 1из 18

DATA WAREHOUSE CONCEPTS

A Definition
A Data Warehouse:
Is a repository for collecting, standardizing, and summarizing snapshots of transactional data contained in an organizations operations or production systems provides a historical perspective of information is most often, but not exclusively, used for decision support applications and business information queries can be more than one database Is not a new concept

Another Definition
Decision Support:
is a set of tools to easily access data is becoming a critical business tool is usually graphically oriented is empowering end users with tools to access vital business information is moving lots of data down to the end user workstation is a rapidly expanding area because of data warehousing efforts and projects

Why a warehouse?
For analysis and decision support, end users

require access to data captured and stored in an organizations operational or production systems

This data is stored in multiple formats, on

multiple platforms, in multiple data structures, with multiple names, and probably created using different business rules

Why do we want a central data store


Lif e LTD Voluntary Benf its Non-Medical

Indiv idual Disability

Financial

Account Management Underwriting

Customer Service Sales/Marketing Financial Analysis

A Good Reason for a Central Data Store

Interesting Statistics
85% of the Fortune 1000 companies have, are

implementing, or are looking at, data warehouses (Meta Group)

90% of all information processing organizations will be

pursuing a data warehouse strategy in the next three years (Meta Group)

The Decision Support industry will be a $1 Billion

industry by 1997 (IDC & Forrester)

Data Warehouse Evolution - Stage 0


Application End User Reports

No end user access to production files


Production Files

What we print is what you get


End User Reports

Application

Production Files

Data Warehouse Evolution - Stage 1


Application End User Reports

End users denied direct access to production files Snapshots or copies of production files are made available instead Solution: Provide end users access to production systems

Production Files

Snapshot File

Application

End User Reports

Production Files

Snapshot Files

No Integration Between Systems


a
System developed in 1979

c
Purchased company system

Purchased Package

New Application development Rebuilt Application

Data Warehouse Evolution - Stage 2


Document Document Document
Desktop computer

Document Document Document

Document

A
Desktop computer

Document Document

Document Document Document

Mainf rame

Document Document Document

Serv er or Midreange

Document Document Document

4GL

Desktop computer

Data Characteristics
Type
Data Use Level of detail Currency Longevity Stability

Production
Operational Detailed Real time, Latest value Relatively brief Dynamic Application wide Capture/update Coded

Warehouse
Mgt Reporting Summary Multiple generations Forever Static Enterprise wide Read only Decoded

Scope of definition
Data Operations Data values

Transforming the Logical Model


Brand Group Brand Shipment

Product

SKU Day Order Item Size

Week

Month customer

Ty pe

Class Calendar Y ear Sales Rep Route

time

marke t
District Region

Key Differences - Part 1


Key differences between data jails

(operational database) & warehouses


Subject orientation - operational systems are applicationsegmented (i.e. banks = auto loan, demand deposit accounting or mortgages). Subject areas for banks would be customer and each financial product Level of integration - warehouses resolve years of application inconsistency in encoding/decoding, data name rationalization, etc Update volatility - record at a time updates in operational database vs bulk loads in data warehouse Time variance norms include: 30-90 days of transactions for operational system, 1-10 years for data warehouses

Key Differences - Part 2


Characteristic Transaction volume Response time Updating Time Period Scope Activities Queries Operational High Very fast High volume Current Period Internal Focused, clerical operational Predictable, periodic Warehouse Low to huge Reasonable Very Low Past to Future External Exploratory, analytical, managerial Can be Unpredictable, Ad hoc

Types of Warehouse Configurations


Enterprise Division Functional
Financial Personnel Engineering/Product

Departmental Special Project

Whats Really Involved?


Data Warehouse Components
DB/2 VSAM

IMS

Mainframe Applications

M anage me nt Re porting Sale s/M arke ting Custome r Re lations Re se rv e Analysis Risk Analysis

DB2/2

PC Applications

???

Extract Programs Data Cleansers/Scrubbers Translators/Transformers Timing Tools Data Loading File Transfer

Reserv es

Customers

Rates

Combine d Data Ware house

Policies

External Sources
Claims Premiums

DB/6000

Midrange

De cision Support Tools

DB/400

Typical Users of a Data Warehouse


Decision Support Analysts, Business Analysts
Marketing, Actuaries, Financial, Sales, Executive

Grocery Store attitudes


Going to the store, not knowing what they want Close proximity says give me everything

Explorers
Dont know what they want Search on a random basis, non-repetitively Frequently finds nothing, but when they do, there are huge rewards

Farmers
Know what they want Non random searches, finds frequent flakes of gold Finds small amounts of data

Advanced Warehouse Topics


Metadata repositories
Information about the data in the warehouse Like a library card catalog Data about when the information was created, what files accessed, how much data Data about changes in business rules, processes Context versus Content
What does it mean?

Data Mining
Drilling down into databases with tools to find specific anomolies

Online Analysis Processing (OLAP)


Really means summary data

Вам также может понравиться