Академический Документы
Профессиональный Документы
Культура Документы
Decision Support
• Composition of technologies
BI=DW+OLAP+data mining+ (visualization, what-if, CRM,…)
DB DM
Appl. Selected
subjects
DB
Appl.
D-App
DB DM
Trans.
Appl.
DB
D-App
DM
Appl.
DB
Trans.
Appl. D-App
DM
DB
Trans.
Appl.
DB
D-Appl.
Appl.
DM
DB Trans.
Appl.
DB
D-Appl.
Appl. DM
DB
D-Appl.
Appl.
DM
Trans.
DB DW
Appl.
DB D-Appl.
DM
IT University, October 28, 2003 12
Top-down vs. Bottom-up
Appl.
D-Appl.
DB DM
Appl.
D-Appl.
DB
DM
Trans.
Appl. DW
DB
”In-between”:
Appl. 1. Design of DW for D-Appl.
DM1 DM
DB
2. Design of DM2 and Bottom-up:
Appl. integration with DW 1. Design of DMs
Top-down:DB 3. Design of DM3 and 2. Maybe integration
1. Design of DW integration with DW of DMs in DW
2. Design of DMs 4. ... 3. Maybe no DW
IT University, October 28, 2003 13
Data’s Way To The DW
• Extraction
Extract from many heterogeneous systems
• Staging area
Large, sequential bulk operations => flat files best ?
• Cleansing
Data checked for missing parts and erroneous values
Default values provided and out-of-range values marked
• Transformation
Data transformed to decision-oriented format
Data from several sources merged, optimize for querying
• Aggregation?
Are individual business transactions needed in the DW ?
• Loading into DW
Large bulk loads rather than SQL INSERTs
Fast indexing (and pre-aggregation) required
IT University, October 28, 2003 14
Online Analytical Processing
• Millions of clicks
Fast query response
Schema Instance
• Additive
Can be aggregated over all dimensions
Example: sales price
Often occur in event facts
• Semi-additive
Cannot be aggregated over some dimensions - typically time
Example: inventory
Often occur in snapshot facts
• Non-additive
Cannot be aggregated over any dimensions
Example: average sales price
Occur in all types of facts
5
"
Documentation Of Schema
5 #
$
!
%
87
4
6 # 0
1 12/
/
/
/
( -
5
3
6 ,
/
3
3 /
, -
&
' *
! 5
( ) - +, .
36
Kimball Dimension Notation
• The granulariteten is Day
Calendar Year
• There is an implicit ”top”
value which means ”all
days” or ”the whole time
Calendar
Quarter axis”.
This is selected by not
mentioning the dimension
Calendar
in a query
Month
Day
,-+
8 10 '&
23
* %(
) 543
!
64
71 )
1
1+ 8
< =>
)
,
1?
1 @
Star Schema Example
&
BCA
9
D
9
:% * "
)
!
1 @ ;
BCA 9
!
D
J IH
K '(
E
L F
7= :%
*M G
NM
40
"!
% % #$
"! +
3$
% #$ %
4
*
+ )
% , % *
,
-$
.& )
/ 01
*
. < + '
.2
>?=
,
@
,
+
-$
,
%
,
;
-$ !* 5
, % 6
%
% -$
9 78
: 34
5
ED 6
0 -$
Snow-flake Schema Example
5 C
% 6 %
-$
41
Relational OLAP Queries
• Aggregating data, e.g., with SUM
• Starting level: (Quarter, Product)
• Roll Up: less detail, Quarter->Year
• Drill Down: more detail, Quarter->Month
• Slice/Dice: selection, Year=1999
• Drill Across: “join” on common dimensions
• Visualization and exceptions
• Note: only two kinds of queries
Navigation queries examine one dimension
◆ SELECT DISTINCT l FROM d [WHERE p]
Aggregation queries summarize fact data
◆ SELECT d1.l1,d2.l2,SUM(f.m) FROM d1,d2,f
WHERE f.dk1=d1.dk1 AND f.dk2=d2.dk2 [AND p]
GROUP BY d1.l1,d2.l2