
OLAP and Data Warehousing

1. Consider the following situation: the sales department of a supermarket chain wants to
have a system to support the strategic planning and evaluation of promotions. To this end,
they need sales information over the different stores of the supermarket chain. For their
analysis tasks they want to compute average sales and total sales, for different product
types (e.g., food, non-food), brands, for different stores at different levels: province,
country, and for different time periods: per year, month, quarter, semester and also by day
of the week.
a) How would you conceptually model the data needed by the sales department as a
data cube? E.g., what are the measures, the dimensional attributes, the hierarchies,
the aggregations that are needed?
The dimensional attributes are Product, Store, and Date.
The measurement attributes are Total_Sales and Average_Sales.
The hierarchies are as follows:
Product → Brand
Store → Province → Country
Date → Month → Semester → Year
Date → Weekday
b) Given the cube of part a), explain how you would construct the answers to the
following queries with the operations slice-and-dice, pivot, roll-up, and drill-down.
If necessary, indicate in which cell(s) of the constructed cube the answer can be found:
a. Give the total overall sales per store.
Cell (s,all,all) of the original cube. Slice on Product=all and Date=all.
The measure is Total_Sales.
b. Give an overview of the average sales per month per province.
The measure is Average_Sales. Slice on Product=all. Roll-up Store to
Province, and Date to Month. Represent it using a pivot-table on
dimensions Store and Date.
c. Give the sub-cube with only dimensions store at level province and date at
level month for the average and total sales for the period 1999 till 2005.
Slice: Date must be in the range 1999 to 2005, and Product=all.
Roll-up Store to Province, and Date to Month.
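
The slice and roll-up steps of query c can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fact rows, the store identifiers, and the store-to-province mapping are made up for the example, and the function name province_month_cube is hypothetical.

```python
from collections import defaultdict

# Hypothetical toy fact rows: (product, store, day as 'YYYY-MM-DD', amount).
SALES = [
    ("p1", "s1", "1999-01-05", 10.0),
    ("p2", "s1", "1999-01-20", 30.0),
    ("p1", "s2", "1999-02-03", 20.0),
    ("p1", "s3", "2006-01-07", 99.0),
]
# Hypothetical Store -> Province mapping (first step of the Store hierarchy).
PROVINCE = {"s1": "ProvA", "s2": "ProvA", "s3": "ProvB"}


def province_month_cube(rows, first_year, last_year):
    """Slice on the year range, then roll up Store->Province and Day->Month.

    Product does not appear in the key, which corresponds to Product=all.
    """
    groups = defaultdict(list)
    for _product, store, day, amount in rows:
        year = int(day[:4])
        if first_year <= year <= last_year:       # slice on Date
            key = (PROVINCE[store], day[:7])      # (province, 'YYYY-MM')
            groups[key].append(amount)
    return {
        key: {"total": sum(vals), "average": sum(vals) / len(vals)}
        for key, vals in groups.items()
    }


cube = province_month_cube(SALES, 1999, 2005)
for key in sorted(cube):
    print(key, cube[key])
```

Each key of the resulting dictionary is one cell of the sub-cube; the 2006 row is dropped by the slice.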

2. Given the setting of question 1, give a relational schema for this data.
a) Give a SQL:1999 expression that produces the data-cube (i.e., contains all
aggregates of the cube using the null-value in an attribute to represent aggregation
on the corresponding dimension). How do you handle the multiple measures? The
hierarchy?
Relational schema:
Product(pID, brand)
Store(sID, province, country)
Date(Day,Weekday,Month,Semester,Year)
Sales(pID,sID,Day,Amount)
-- Branch 1: Date rolled up along Day -> Month -> Semester -> Year (Weekday = NULL)
SELECT
P.pID, P.Brand,
S.sID, S.Province, S.Country,
D.Day, NULL AS Weekday, D.Month, D.Semester, D.Year,
AVG(A.Amount) AS Average, SUM(A.Amount) AS Total
FROM
Product P, Store S, Date D, Sales A
WHERE
P.pID = A.pID AND S.sID = A.sID AND D.Day = A.Day
GROUP BY ROLLUP(P.Brand, P.pID),
ROLLUP(S.Country, S.Province, S.sID),
ROLLUP(D.Year, D.Semester, D.Month, D.Day)
UNION
-- Branch 2: Date rolled up along Day -> Weekday (Month, Semester, Year = NULL)
SELECT
P.pID, P.Brand,
S.sID, S.Province, S.Country,
D.Day, D.Weekday, NULL AS Month, NULL AS Semester, NULL AS Year,
AVG(A.Amount) AS Average, SUM(A.Amount) AS Total
FROM
Product P, Store S, Date D, Sales A
WHERE
P.pID = A.pID AND S.sID = A.sID AND D.Day = A.Day
GROUP BY ROLLUP(P.Brand, P.pID),
ROLLUP(S.Country, S.Province, S.sID),
ROLLUP(D.Weekday, D.Day)

b) Give SQL:1999 expressions for the queries in 1b).


Let Cube be the result of the query in 2a). Its store and date columns at the
lowest level are sID and Day.

Give the total overall sales per store.
SELECT sID, Total
FROM Cube
WHERE Brand IS NULL AND Year IS NULL AND Weekday IS NULL
AND sID IS NOT NULL

Give an overview of the average sales per month per province.
SELECT Month, Province, Average
FROM Cube
WHERE Brand IS NULL AND Day IS NULL AND Month IS NOT NULL
AND Province IS NOT NULL AND sID IS NULL

Give the sub-cube with only dimensions store at level province and date at
level month for the average and total sales for the period 1999 till 2005.
SELECT Month, Province, Average, Total
FROM Cube
WHERE Brand IS NULL AND Day IS NULL AND Month IS NOT NULL
AND Province IS NOT NULL AND sID IS NULL
AND Year >= 1999 AND Year <= 2005

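3. Is it always possible in SQL:1999 to express a group by cube by a series of group
by rollups? I.e., can you always rewrite a group by cube with a series of group by
rollups? If yes, explain how, if not, explain why not.
Yes; group by cube(A1, ..., An) = group by rollup(A1), rollup(A2), ..., rollup(An).
Each rollup(Ai) produces the two grouping sets (Ai) and (), and a list of grouping
elements is combined by cross product, so every subset of {A1, ..., An} is obtained.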
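
The equivalence can be checked mechanically by enumerating grouping sets. The sketch below is not SQL; it only models grouping sets as Python sets. It assumes the standard rule that several grouping elements in one GROUP BY clause are combined by cross product, and the function names are chosen for this illustration.

```python
from itertools import combinations, product


def rollup_sets(attrs):
    """Grouping sets of ROLLUP(attrs): every prefix, from the full list down to ()."""
    return [tuple(attrs[:i]) for i in range(len(attrs), -1, -1)]


def cube_sets(attrs):
    """Grouping sets of CUBE(attrs): every subset of attrs."""
    return {
        frozenset(subset)
        for r in range(len(attrs) + 1)
        for subset in combinations(attrs, r)
    }


def rollup_product_sets(attrs):
    """Grouping sets of GROUP BY ROLLUP(A1), ..., ROLLUP(An)."""
    per_attr = [rollup_sets([a]) for a in attrs]    # each is [(a,), ()]
    # Cross product of the per-attribute choices, concatenated into one set.
    return {frozenset(sum(choice, ())) for choice in product(*per_attr)}


attrs = ["A1", "A2", "A3"]
same = cube_sets(attrs) == rollup_product_sets(attrs)
print(same, len(cube_sets(attrs)))
```

For three attributes both constructions give the same 8 grouping sets; the same argument holds for any n, since picking (Ai) or () independently for each attribute enumerates all 2^n subsets.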
4. Suppose that we have a relation Sales(Product, Month, Store, Amount). There are five
products: P1, P2, P3, P4, P5, 12 months, and three stores: S1, S2, and S3.
a) (Dense setting) Suppose that every product has been sold in every month in every
store; i.e., for every combination of a product p, a month m, and a store s, there is
a tuple (p,m,s,a) with a a non-zero amount.
a) How many tuples does this relation contain? 5x12x3 = 180
b) How many tuples does a data cube with dimensions Product, Month, Store
and measurement attribute Amount contain? 6x13x4 = 312
b) (Sparse setting) Consider the following (sparse) relation:
Product   Month   Store   Amount
P1        Jan     S1      a1
P1        Jan     S2      a2
P2        Feb     S2      a3
P2        Feb     S3      a4
P3        Jan     S1      a5
P3        Feb     S1      a6
P4        Feb     S1      a7
P5        Jan     S3      a8

How many non-empty cells does the data cube of this relation contain?
Group By   #      Group By   #
PMS        8      P          5
PM         6      M          2
PS         7      S          3
MS         6      ()         1

Hence, in total: 38 non-empty cells.
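
The counts above can be reproduced by brute force: for every subset of the dimensions, project the relation onto that subset and count the distinct combinations. The following Python sketch does this for the sparse relation and, as a cross-check, for the dense setting of part a). The Amount column is left out because only the presence of a tuple matters, and the month labels M1 to M12 in the dense case are placeholders.

```python
from itertools import combinations, product

# The sparse relation, projected on (Product, Month, Store).
SPARSE = [
    ("P1", "Jan", "S1"), ("P1", "Jan", "S2"),
    ("P2", "Feb", "S2"), ("P2", "Feb", "S3"),
    ("P3", "Jan", "S1"), ("P3", "Feb", "S1"),
    ("P4", "Feb", "S1"), ("P5", "Jan", "S3"),
]


def count_cells(rows):
    """Map each group-by (a tuple of column indexes) to its number of non-empty cells."""
    n_dims = len(rows[0])
    counts = {}
    for r in range(n_dims + 1):
        for dims in combinations(range(n_dims), r):
            counts[dims] = len({tuple(row[d] for d in dims) for row in rows})
    return counts


sparse_counts = count_cells(SPARSE)
sparse_total = sum(sparse_counts.values())

# Dense setting: every (product, month, store) combination occurs.
products = ["P1", "P2", "P3", "P4", "P5"]
months = [f"M{i}" for i in range(1, 13)]    # placeholder month labels
stores = ["S1", "S2", "S3"]
DENSE = list(product(products, months, stores))
dense_total = sum(count_cells(DENSE).values())

print(sparse_total, dense_total)
```

Column indexes 0, 1, 2 stand for Product, Month, Store, so sparse_counts[(0, 1)] is the PM entry of the table.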


c) (hierarchies) Suppose now that the following hierarchies are defined over the
different attributes:
Month → Semester → Year
Store → Size (S1 and S2 are large stores, S3 is a small store)
Product → Brand (P1, P2, P3 are of brand B1, P4 and P5 of brand B2)
Product → Color (P1, P3, P5 have color C1, P2 and P4 color C2)
Include the hierarchies in the analysis you made for a) and b); i.e., compute the
number of non-empty cells in the cubes taking the hierarchies into account as
well. An example of a non-empty cell in the cube for the sparse relation would be
the cell containing the total amount over all stores for brand B1 in the first
semester.
For a), the dense relation, the size of the relation is still 180. Each dimension
now contributes one value per member at every level, plus "all", so the size of the
cube becomes:
(5 + 2 + 2 + 1) x (3 + 2 + 1) x (12 + 2 + 1 + 1) = 10 x 6 x 16 = 960
For b), the sparse setting, the base relation still has 8 tuples. The number of
non-empty cells in the cube is: 205 (this number is computed using a Python
script; it is assumed that all dates fall in the same year)
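
The script mentioned above is not part of this text. The following is an independent Python sketch that arrives at the same numbers. It assumes that January to June form the first semester and July to December the second, that all months lie in one year, and that each hierarchy level can be modelled as a function from a base value to its ancestor; the names of the level lists are chosen for this illustration.

```python
from itertools import product

BRAND = {"P1": "B1", "P2": "B1", "P3": "B1", "P4": "B2", "P5": "B2"}
COLOR = {"P1": "C1", "P3": "C1", "P5": "C1", "P2": "C2", "P4": "C2"}
SIZE = {"S1": "large", "S2": "large", "S3": "small"}
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
# Assumption: Jan-Jun are semester 1, Jul-Dec semester 2, all in a single year.
SEMESTER = {m: ("Sem1" if i < 6 else "Sem2") for i, m in enumerate(MONTHS)}

# One function per hierarchy level, mapping a base value to its ancestor.
PRODUCT_LEVELS = [lambda p: p, BRAND.get, COLOR.get, lambda p: "all"]
MONTH_LEVELS = [lambda m: m, SEMESTER.get, lambda m: "year", lambda m: "all"]
STORE_LEVELS = [lambda s: s, SIZE.get, lambda s: "all"]


def count_cells(rows):
    """Sum, over every combination of levels, the number of distinct cells."""
    total = 0
    for fp, fm, fs in product(PRODUCT_LEVELS, MONTH_LEVELS, STORE_LEVELS):
        total += len({(fp(p), fm(m), fs(s)) for p, m, s in rows})
    return total


SPARSE = [
    ("P1", "Jan", "S1"), ("P1", "Jan", "S2"),
    ("P2", "Feb", "S2"), ("P2", "Feb", "S3"),
    ("P3", "Jan", "S1"), ("P3", "Feb", "S1"),
    ("P4", "Feb", "S1"), ("P5", "Jan", "S3"),
]
DENSE = list(product(BRAND, MONTHS, SIZE))   # all 5 x 12 x 3 combinations

sparse_cells = count_cells(SPARSE)
dense_cells = count_cells(DENSE)
print(sparse_cells, dense_cells)
```

In the sparse relation both months fall in the same semester and year, so the Semester, Year and "all" levels of the Month dimension each contribute the same counts; this is why the sparse total is far below the dense one.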
