Data Warehouse
Understanding Data Warehouse
Amalia Anjani A.
anjani.arifiyanti@gmail.com
What is a Data Warehouse?
Defined in many different ways, but not rigorously.
"A decision support database that is maintained separately from the
sources and made available to end users in a way that they can understand
and use in a business context." (Barry Devlin)
Data warehousing:
The process of constructing and using data warehouses
DW Subject Oriented
Operational Systems
Sales system
Customer data
Payroll system
Employee data
Purchasing system
Vendor data
DW Subject Oriented
Oriented to the major subject areas of the organization
DW Integrated
Operational Systems
Marketing system
Order system
Billing system
Customer data
DW - Integrated
Constructed by integrating multiple, heterogeneous data sources
(relational databases, flat files, on-line transaction records).
Data cleaning and data integration techniques are applied.
Ensure consistency in naming conventions, encoding structures,
attribute measures, etc. among different data sources.
E.g., hotel price: currency, tax, breakfast covered, etc.
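The hotel price example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two source layouts, the field names, the exchange rate, and the tax rate are all hypothetical. It shows two sources that disagree on naming, currency, tax handling, and breakfast encoding being mapped into one consistent record.

```python
from dataclasses import dataclass

# Hypothetical exchange rate and tax rate, chosen only for illustration.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.1}
TAX_RATE = 0.1


@dataclass(frozen=True)
class HotelPrice:
    """Integrated record: one naming convention, one currency, tax included."""
    hotel: str
    price_usd: float
    breakfast_included: bool


def from_source_a(row: dict) -> HotelPrice:
    """Hypothetical source A: price in EUR, tax excluded, breakfast as 'Y'/'N'."""
    gross = row["price_eur"] * RATES_TO_USD["EUR"] * (1 + TAX_RATE)
    return HotelPrice(row["name"].strip().title(), round(gross, 2), row["bf"] == "Y")


def from_source_b(row: dict) -> HotelPrice:
    """Hypothetical source B: price in USD, tax included, breakfast as a boolean."""
    return HotelPrice(
        row["hotel_name"].strip().title(),
        round(row["total_usd"], 2),
        row["breakfast"],
    )


integrated = [
    from_source_a({"name": "grand plaza ", "price_eur": 100.0, "bf": "Y"}),
    from_source_b({"hotel_name": "Sea View", "total_usd": 95.0, "breakfast": False}),
]
```

After conversion, both rows share the same attribute names, currency, and tax treatment, so they can be compared directly.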
DW Time Variant
Operational Systems
Order system
60-90 days
Customer data
5-10 years
DW Time Variant
The time horizon for the data warehouse is significantly longer than
that of operational systems.
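A minimal sketch of the difference, assuming a hypothetical customer record whose city changes over time: the operational side keeps only the current value, while the warehouse side keeps dated rows so that past states can still be queried. The field names and the `city_as_of` helper are illustrative only.

```python
from datetime import date

# Operational view: one current value per customer, overwritten on change.
operational = {"C001": "Jakarta"}

# Warehouse view: each row carries a date, so history is preserved.
warehouse = [
    {"customer_id": "C001", "city": "Bandung", "valid_from": date(2015, 1, 1)},
    {"customer_id": "C001", "city": "Jakarta", "valid_from": date(2019, 6, 1)},
]


def city_as_of(customer_id: str, when: date):
    """Return the city recorded for the customer on the given date, or None."""
    rows = [
        r for r in warehouse
        if r["customer_id"] == customer_id and r["valid_from"] <= when
    ]
    # The most recent row that was already valid on `when` wins.
    return max(rows, key=lambda r: r["valid_from"])["city"] if rows else None
```

For example, `city_as_of("C001", date(2017, 3, 1))` looks back at a state that the operational record no longer holds.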
DW Non Volatile
Operational Systems
Order system: create, update, insert, delete
Customer data: load, access
DW Non Volatile
A physically separate store of data transformed from the
operational environment.
Operational update of data does not necessarily occur in the data
warehouse environment.
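The load/access pattern can be illustrated with a small append-only structure. This is a sketch under assumptions, not a real warehouse engine: the `WarehouseTable` class and its method names are hypothetical and simply mirror the two operations named on the slide.

```python
class WarehouseTable:
    """Append-only table: supports only loading new rows and read access."""

    def __init__(self):
        self._rows = []

    def load(self, batch):
        # Loading appends copies of the new rows; existing rows are never changed.
        self._rows.extend(dict(row) for row in batch)

    def access(self, predicate=lambda row: True):
        # Return copies so callers cannot modify stored rows in place.
        return [dict(row) for row in self._rows if predicate(row)]


orders = WarehouseTable()
orders.load([{"order_id": 1, "amount": 50}, {"order_id": 2, "amount": 75}])

result = orders.access(lambda r: r["amount"] > 60)
result[0]["amount"] = 0  # changes only the returned copy, not the stored row
```

There is deliberately no update or delete method, which is the point of the non-volatile property.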
Why use a Data Warehouse?
We collect tons of data, but we can't access it.
We need to slice and dice the data every which way.
Business people need to get at data easily.
Just show me what is important.
We spend entire meetings arguing about who has the right numbers.
characteristics:
Massive volume
Dispersed
Difficult to access
Badly integrated
Complex data structures
Not suitable for high level business queries
Data Sources
Production data
Data Warehouse
OLTP vs. OLAP
                    OLTP                          OLAP
users               clerk, IT professional        knowledge worker
function                                          decision support
DB design           application-oriented          subject-oriented
data                current, up-to-date,          historical, summarized,
                    detailed, flat relational,    multidimensional, integrated,
                    isolated                      consolidated
usage               repetitive                    ad-hoc
access              read/write,                   lots of scans
                    index/hash on primary key
unit of work        short, simple transaction     complex query
# records accessed  tens                          millions
# users             thousands                     hundreds
DB size             100MB-GB                      100GB-TB
metric              transaction throughput
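The contrast in access pattern and unit of work can be shown with Python's built-in sqlite3 module. This is only a sketch: the `sales` table, its columns, and the sample rows are hypothetical, and a real warehouse would normally run on a separate system from the transactional database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales ("
    "order_id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (1, "East", 2022, 100.0),
        (2, "West", 2022, 250.0),
        (3, "East", 2023, 175.0),
        (4, "West", 2023, 300.0),
    ],
)

# OLTP style: a short read/write transaction touching one record by primary key.
with conn:
    conn.execute("UPDATE sales SET amount = ? WHERE order_id = ?", (120.0, 1))
oltp_row = conn.execute(
    "SELECT amount FROM sales WHERE order_id = ?", (1,)
).fetchone()

# OLAP style: a read-only query that scans every row and summarizes it.
olap_rows = conn.execute(
    "SELECT region, year, SUM(amount) FROM sales "
    "GROUP BY region, year ORDER BY region, year"
).fetchall()
```

The first query reads and writes a single row through the key; the second reads the whole table to produce a summary per region and year.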
Requirements
The DW system must make information easily accessible.
The DW system must present information consistently.
The DW system must adapt to change.
The DW system must present information in a timely way.
The DW system must be a secure bastion that protects the
information assets.
The DW system must serve as the authoritative and
trustworthy foundation for improved decision making.
The business community must accept the DW system to
deem it successful.
Individual Assignment
Create a report (A4 page) that explains:
Multidimensional data model (cube, fact table, dimension, etc.)
Schema (star, snowflake, etc.)
Data warehouse system architecture