Management (BISM)
A.K. Swain
IIM Kozhikode
Processes in the DW
BA Portal: Data Mart 1, Reports, Score Cards, Dashboards, Analytical Applications. With access to business users.
ETL Processes
Data Warehouse: Data from the source systems are integrated. Dimensions and metadata repository. Access is given to analytics.
Data Quality Firewall with profiling and cleansing tools
ETL Processes
Staging Area (ODSs): The data entry of the data warehouse and repository of ODSs.
ETL Processes
Source Systems (Source 1, 2, 3, 4): ERP, CRM, external data sources, time sheet data, etc.
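The flow from source systems through staging and the data quality firewall into the warehouse can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`extract`, `quality_firewall`, `load`), the field names and the two cleansing rules are hypothetical and do not come from the slides.

```python
def extract(source_systems):
    """Copy raw rows from every source system into one staging list."""
    staging = []
    for name, rows in source_systems.items():
        for row in rows:
            staging.append({"source": name, **row})
    return staging


def quality_firewall(staging):
    """Split staged rows into clean and rejected rows using two simple checks."""
    clean, rejected = [], []
    for row in staging:
        has_key = bool(row.get("membership_no"))
        minutes = row.get("minutes")
        has_duration = isinstance(minutes, (int, float)) and minutes > 0
        if has_key and has_duration:
            clean.append(row)
        else:
            rejected.append(row)
    return clean, rejected


def load(clean_rows, warehouse):
    """Append cleansed rows to the warehouse table."""
    warehouse.extend(clean_rows)
    return warehouse


# Hypothetical source data; field names and values are illustrative only.
sources = {
    "ERP": [{"membership_no": "10201", "minutes": 12}],
    "CRM": [
        {"membership_no": "", "minutes": 9},        # missing key, rejected
        {"membership_no": "10358", "minutes": 12},
    ],
    "time_sheet": [{"membership_no": "10243", "minutes": -7}],  # bad duration
}

staged = extract(sources)
clean, rejected = quality_firewall(staged)
warehouse = load(clean, [])
print(len(staged), "staged,", len(warehouse), "loaded,", len(rejected), "rejected")
```

Keeping the rejected rows in their own list, rather than dropping them silently, is one way to make the profiling step auditable.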
BA Analysts' Access to the DW
Traditional Business Users (Typically Lag Information)
BA Analyst (Typically Lead Information)
OLTP Database Storage
Examples
Date       Membership No.  Start Time  End Time
10/6/2010  10201           9 a.m.      9.12 a.m.
10/6/2010  10358           8.45 a.m.   8.57 a.m.
10/6/2010  10243           7.38 a.m.   7.45 a.m.
11/6/2010  10309           7.18 a.m.   7.29 a.m.
11/6/2010  10379           8.32 a.m.   8.41 a.m.
12/6/2010  10151           8.05 a.m.   8.18 a.m.
12/6/2010  10432           9.25 a.m.   9.41 a.m.
12/6/2010  10312           9.48 a.m.   9.59 a.m.
DW Data Storage
Month  Year  Membership No.  No. of Visits  Total time (hr)
June   2010  10201           10             6.5
June   2010  10358           15             3.4
June   2010  10243           11             5.8
June   2010  10309           8              7.2
June   2010  10379           17             1.9
June   2010  10151           14             3.34
June   2010  10432           15             2.8
June   2010  10312           17             2.26
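The step from transaction-level OLTP rows to the monthly DW summary is an aggregation per member and month. The sketch below, in plain Python, assumes the dates are day/month/year and writes the times as `H:MM` for parsing; the function name `summarize` is hypothetical. Because only the eight sample transactions are used, each member shows a single visit here, so the output will not match the DW figures above, which presumably summarize the whole month.

```python
from collections import defaultdict
from datetime import datetime

# The eight OLTP sample rows: (date, membership no., start, end), all a.m.
oltp_rows = [
    ("10/6/2010", "10201", "9:00", "9:12"),
    ("10/6/2010", "10358", "8:45", "8:57"),
    ("10/6/2010", "10243", "7:38", "7:45"),
    ("11/6/2010", "10309", "7:18", "7:29"),
    ("11/6/2010", "10379", "8:32", "8:41"),
    ("12/6/2010", "10151", "8:05", "8:18"),
    ("12/6/2010", "10432", "9:25", "9:41"),
    ("12/6/2010", "10312", "9:48", "9:59"),
]


def summarize(rows):
    """Aggregate visits into (year, month, member) -> (visits, total hours)."""
    totals = defaultdict(lambda: [0, 0.0])
    for date, member, start, end in rows:
        # Assumption: dates are day/month/year.
        begin = datetime.strptime(f"{date} {start}", "%d/%m/%Y %H:%M")
        finish = datetime.strptime(f"{date} {end}", "%d/%m/%Y %H:%M")
        key = (begin.year, begin.month, member)
        totals[key][0] += 1
        totals[key][1] += (finish - begin).total_seconds() / 3600
    return {key: (visits, round(hours, 2)) for key, (visits, hours) in totals.items()}


summary = summarize(oltp_rows)
for (year, month, member), (visits, hours) in sorted(summary.items()):
    print(year, month, member, visits, hours)
```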
OLTP Database Snapshot
Examples
Date       Membership No.  Start Time  End Time
10/6/2010  10201           9 a.m.      9.12 a.m.
10/6/2010  10358           8.45 a.m.   8.57 a.m.
11/6/2010  10379           8.32 a.m.   8.41 a.m.
12/6/2010  10151           8.05 a.m.   8.18 a.m.
Summary of the Snapshot
Month  Year  Membership No.  No. of Visits  Total time (hr)
June   2010  10201           10             6.5
June   2010  10358           15             3.4
June   2010  10379           17             1.9
Dimensionality
Dimensional aspects of a Spreadsheet?
[Figure: spreadsheet for Jimmys Fine Auto. Rows (Product): Sports, Economy, Family, Wagon, Pickup, Truck, Utility Van, Mini Van, SUV. Columns (Month): Jan, Feb, March, April, May.]
Dimensionality
Multidimensional aspects of a Spreadsheet?
[Figure: stacked spreadsheet planes, one per dealer (Spiffy Motors and Happy Motors are labelled). Each plane has Product rows (ending in Utility Van, Mini Van, SUV) and Month columns (Jan, Feb, March, April, May).]
Three Dimensional Planes
Multidimensional Representation on 2D
Dealer            Product      Jan   Feb   March  April  May
Jimmys Fine Auto  Sports       2     1     3.8    4.2    1.8
                  Economy      3.8   4     9.2    10.1   4.3
                  Family       3     3.8   7.5    8.9    2.9
                  Wagon        2.8   2.9   5.4    5.8    2.7
                  Pickup       3.1   3.5   5.7    5.4    2.9
                  Truck        5.9   6.2   7.2    8.1    4.9
                  Utility Van  5.15  4.90  6.91   7.31   4.36
                  Mini Van     5.10  4.80  7.03   7.38   4.68
                  SUV          6.14  6.42  10.80  10.93  6.10
Happy Motors      Sports       1.8   1.7   3.6    3.98   1.9
                  Economy      3.2   3.4   4.9    7.3    3.6
                  Family       3.7   4.2   8.1    8.3    3.4
                  Wagon        3.1   3.7   6.2    6.21   3.2
                  Pickup       3.1   3.3   5.1    5.4    2.9
                  Truck        2.1   2.5   3.9    4.5    2.3
                  Utility Van  4.10  4.2   5.80   7.31   4.16
                  Mini Van     5.10  6.99  9.20   8.90   4.70
                  SUV          6.70  7.67  12.60  10.49  5.24
Spiffy Motors     Sports       2.02  0.05  3.20   3.80   1.10
                  Economy      2.51  3.10  5.80   6.20   2.30
                  Family       3.01  2.70  4.90   5.60   2.80
                  Wagon        3.50  2.33  5.10   5.32   2.98
                  Pickup       3.99  3.09  4.90   5.12   2.44
                  Truck        4.48  3.85  7.30   7.21   3.31
                  Utility Van  5.15  4.90  6.91   7.31   4.36
                  Mini Van     5.10  4.70  7.03   7.38   4.68
                  SUV          6.14  6.42  10.80  10.93  6.10
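One way to hold the same data as a cube rather than a flattened sheet is to key every cell by its three coordinates (dealer, product, month). The sketch below uses only the Sports and SUV rows from the table to stay short; the helper `slice_cube` is a hypothetical name introduced for illustration.

```python
MONTHS = ["Jan", "Feb", "March", "April", "May"]

# Sports and SUV rows copied from the table, one list of monthly values each.
ROWS = {
    ("Jimmys Fine Auto", "Sports"): [2, 1, 3.8, 4.2, 1.8],
    ("Jimmys Fine Auto", "SUV"): [6.14, 6.42, 10.80, 10.93, 6.10],
    ("Happy Motors", "Sports"): [1.8, 1.7, 3.6, 3.98, 1.9],
    ("Happy Motors", "SUV"): [6.70, 7.67, 12.60, 10.49, 5.24],
    ("Spiffy Motors", "Sports"): [2.02, 0.05, 3.20, 3.80, 1.10],
    ("Spiffy Motors", "SUV"): [6.14, 6.42, 10.80, 10.93, 6.10],
}

# The cube: one cell per (dealer, product, month) coordinate.
cube = {
    (dealer, product, month): value
    for (dealer, product), values in ROWS.items()
    for month, value in zip(MONTHS, values)
}


def slice_cube(cube, dealer=None, product=None, month=None):
    """Return the cells that match every coordinate that was given."""
    wanted = (dealer, product, month)
    return {
        key: value
        for key, value in cube.items()
        if all(w is None or w == k for w, k in zip(wanted, key))
    }


# One dealer's plane, and one (product, month) line across all dealers.
happy_plane = slice_cube(cube, dealer="Happy Motors")
jan_sports = slice_cube(cube, product="Sports", month="Jan")
print(len(happy_plane), "cells in the Happy Motors plane")
print(jan_sports)
```

Fixing one coordinate returns a plane of the cube (one dealer's spreadsheet); fixing two returns a single line across the remaining dimension.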
Analysis: One Dimensional Approach
Sl. No.  Time   Amount in Lakhs
1        Q1     22
2        Q2     22
         Total  44

Sl. No.  Dealers            Amount in Lakhs
1        Jimmy's Fine Auto  11
2        Happy Motors       11
3        Spiffy Motors      11
4        Rincy Motors       11
         Total              44

Sl. No.  Product  Amount in Lakhs
1        Sports   11
2        Economy  11
3        Family   11
4        Wagon    11
         Total    44
Analysis: Multidimensional Approach
Multidimensional View
Time  Product      Jimmy's Fine Auto  Happy Motors  Spiffy Motors  Rincy Motors  Total
Q1    Sports       0                  0             3.5            2             5.5
Q1    Economy      0                  0             3              2.5           5.5
Q1    Family       1                  4.5           0              0             5.5
Q1    Wagon        3                  2.5           0              0             5.5
      Total 1      4                  7             6.5            4.5           22
Q2    Sports       5.5                0             0              0             5.5
Q2    Economy      1.5                4             0              0             5.5
Q2    Family       0                  0             2              3.5           5.5
Q2    Wagon        0                  0             2.5            3             5.5
      Total 2      7                  4             4.5            6.5           22
      Grand Total  11                 11            11             11            44
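Each one-dimensional table (by time, by dealer, by product) is a roll-up of the multidimensional view along a single dimension, which is why the flat totals look identical while the detail differs. A minimal sketch in plain Python, with the cell values copied from the view and a hypothetical helper `roll_up`:

```python
from collections import defaultdict

DEALERS = ["Jimmy's Fine Auto", "Happy Motors", "Spiffy Motors", "Rincy Motors"]

# (quarter, product) -> amounts in lakhs, one per dealer in DEALERS order.
VIEW = {
    ("Q1", "Sports"): [0, 0, 3.5, 2],
    ("Q1", "Economy"): [0, 0, 3, 2.5],
    ("Q1", "Family"): [1, 4.5, 0, 0],
    ("Q1", "Wagon"): [3, 2.5, 0, 0],
    ("Q2", "Sports"): [5.5, 0, 0, 0],
    ("Q2", "Economy"): [1.5, 4, 0, 0],
    ("Q2", "Family"): [0, 0, 2, 3.5],
    ("Q2", "Wagon"): [0, 0, 2.5, 3],
}


def roll_up(view, dimension):
    """Sum all cells along one dimension: 'time', 'product' or 'dealer'."""
    totals = defaultdict(float)
    for (quarter, product), amounts in view.items():
        for dealer, amount in zip(DEALERS, amounts):
            key = {"time": quarter, "product": product, "dealer": dealer}[dimension]
            totals[key] += amount
    return dict(totals)


by_time = roll_up(VIEW, "time")
by_dealer = roll_up(VIEW, "dealer")
by_product = roll_up(VIEW, "product")
grand_total = sum(by_time.values())

print(by_time)
print(by_dealer)
print(by_product)
print(grand_total)
```

The roll-ups reproduce the flat tables: 22 per quarter, 11 per dealer, 11 per product, 44 overall. Only the multidimensional view shows, for example, that Jimmy's Fine Auto sold no Sports or Economy cars in Q1.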
Star Schema
Online Analytical Processing (OLAP)
OLAP Architecture
Types
Desktop online analytical processing (DOLAP)
Relational online analytical processing (ROLAP)
Multidimensional online analytical processing (MOLAP)
Hybrid online analytical processing (HOLAP)
Data Mining Strategies
DM Strategies
Unsupervised Clustering
Supervised Learning
Market Basket Analysis