Jan. 22 (T)
Jan. 24 (TH)
Subqueries, Grouping and Aggregation. Read Sections 6.3-6.4. Project Part 2 due.
Jan. 29 (T)
Winter 2002
A small set of operators that allow us to manipulate relations in limited but useful ways. The operators are:
1. Union, intersection, and difference: the usual set operators.
2. Selection: picking certain rows from a relation.
3. Projection: picking certain columns.
4. Products and joins: composing relations in useful ways.
5. Renaming of relations and their attributes.
Relational Algebra
Finiteness
UNARY
π PROJECT
× CARTESIAN PRODUCT
∪ UNION
∩ SET-INTERSECTION
⋈_θ THETA-JOIN
⋈ NATURAL JOIN
÷ DIVISION or QUOTIENT
(Figure: relation Borrow with tuples T1, T2, T3, T4.)
Selection
Example
Relation Sells:
bar     beer     price
Joe's   Bud      2.50
Joe's   Miller   2.75
Sue's   Bud      2.50
Sue's   Coors    3.00

JoeMenu = σ_{bar=Joe's}(Sells)
bar     beer     price
Joe's   Bud      2.50
Joe's   Miller   2.75
Arthur Keller CS 180
SELECT (σ)
arity(σ_c(R)) = arity(R)
0 ≤ card(σ_c(R)) ≤ card(R)
σ_c(R) ⊆ R
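The selection operator can be sketched in Python as follows. This is a minimal illustration, assuming a relation is modeled as a list of dictionaries; the helper name `select` is hypothetical and not part of the course notes. The data is the Sells relation from the example.

```python
# Sells(bar, beer, price), copied from the example relation.
SELLS = [
    {"bar": "Joe's", "beer": "Bud", "price": 2.50},
    {"bar": "Joe's", "beer": "Miller", "price": 2.75},
    {"bar": "Sue's", "beer": "Bud", "price": 2.50},
    {"bar": "Sue's", "beer": "Coors", "price": 3.00},
]


def select(relation, condition):
    """Return the tuples of `relation` for which `condition` is true."""
    return [t for t in relation if condition(t)]


# JoeMenu: the tuples of Sells whose bar is Joe's.
joe_menu = select(SELLS, lambda t: t["bar"] == "Joe's")
```

The result keeps every attribute of Sells (same arity) and can never have more tuples than Sells, which matches the arity and cardinality bounds stated for selection.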
Projection
Example
π_{beer,price}(Sells)
beer     price
Bud      2.50
Miller   2.75
Coors    3.00
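A small Python sketch of set-style projection, assuming tuples are dictionaries and that duplicates are removed as in the set version of the algebra. The helper name `project` is illustrative only.

```python
# Sells(bar, beer, price), copied from the example relation.
SELLS = [
    {"bar": "Joe's", "beer": "Bud", "price": 2.50},
    {"bar": "Joe's", "beer": "Miller", "price": 2.75},
    {"bar": "Sue's", "beer": "Bud", "price": 2.50},
    {"bar": "Sue's", "beer": "Coors", "price": 3.00},
]


def project(relation, attrs):
    """Keep only `attrs`, in the order given, dropping duplicate rows."""
    rows = []
    for t in relation:
        row = tuple(t[a] for a in attrs)
        if row not in rows:  # set semantics: one copy of each row
            rows.append(row)
    return rows


menu = project(SELLS, ["beer", "price"])
```

The two Bud tuples at 2.50 collapse into one row, which is why the projected relation has three rows rather than four.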
Projection
(π) π_{i_1,...,i_m}(R), where 1 ≤ i_j ≤ k and the i_j are distinct, for j = 1,...,m
arity(R) = k
arity(π_A(R)) = m, where A = i_1,...,i_m
0 ≤ card(π_A(R)) ≤ card(R)
(Figure: example of π_{branch-name, cust-name}, with branches Midtown and Downtown and customers Sally, Fred, and Tom.)
Product
R = R1 × R2 pairs each tuple t1 of R1 with each tuple t2 of R2 and puts in R a tuple t1t2.
(Figure: the product of R(A, B, C, D) and S(D, E, F) has schema R × S(A, B, C, D, D', E, F).)
Theta-Join
Example
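Sells =
bar     beer     price
Joe's   Bud      2.50
Joe's   Miller   2.75
Sue's   Bud      2.50
Sue's   Coors    3.00

Bars =
name    addr
Joe's   Maple St.
Sue's   River Rd.

BarInfo = Sells ⋈_{Sells.bar = Bars.name} Bars
bar     beer     price   name    addr
Joe's   Bud      2.50    Joe's   Maple St.
Joe's   Miller   2.75    Joe's   Maple St.
Sue's   Bud      2.50    Sue's   River Rd.
Sue's   Coors    3.00    Sue's   River Rd.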
Theta-Join: R ⋈_{i θ j} S
arity(R) = r, arity(S) = s
arity(R ⋈_{i θ j} S) = r + s
0 ≤ card(R ⋈_{i θ j} S) ≤ card(R) × card(S)
i ∈ 1...r, j ∈ 1...s
θ can be <, ≤, >, ≥, =, or ≠. If equal (=), then it is an EQUIJOIN.
R ⋈_{i θ j} S = σ_{$i θ $(r+j)}(R × S)

Example: T = R ⋈_{R.A<S.D} S
R(A B C)
1 3 5
2 4 6
3 5 7
4 6 8

S(C D E)
2 1 1
1 2 2
3 3 4
4 4 3

The result has schema T(A B C C' D E):
1 3 5 1 2 2
1 3 5 3 3 4
1 3 5 4 4 3
2 4 6 3 3 4
2 4 6 4 4 3
3 5 7 4 4 3
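The definition of theta-join as a selection over a product can be sketched in Python. This is a minimal illustration, assuming relations are lists of positional tuples; `theta_join` is a hypothetical helper name. The data is the R(A, B, C) and S(C, D, E) example with the condition R.A < S.D.

```python
from itertools import product

R = [(1, 3, 5), (2, 4, 6), (3, 5, 7), (4, 6, 8)]  # R(A, B, C)
S = [(2, 1, 1), (1, 2, 2), (3, 3, 4), (4, 4, 3)]  # S(C, D, E)


def theta_join(r, s, condition):
    """Concatenate each pair (t1, t2) from r x s that satisfies `condition`."""
    return [t1 + t2 for t1, t2 in product(r, s) if condition(t1, t2)]


# R.A is position 0 of an R tuple; S.D is position 1 of an S tuple.
T = theta_join(R, S, lambda t1, t2: t1[0] < t2[1])
```

Each result tuple has arity 3 + 3 = 6, and the number of result tuples lies between 0 and card(R) × card(S) = 16.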
Natural Join
R = R1 ⋈ R2 calls for the theta-join of R1 and R2 with the condition that all attributes of the same name be equated. Then, one column for each pair of equated attributes is projected out.
Example
Suppose the attribute name in relation Bars was changed to bar, to match the bar name in Sells.
BarInfo = Sells ⋈ Bars
bar     beer     price   addr
Joe's   Bud      2.50    Maple St.
Joe's   Miller   2.75    Maple St.
Sue's   Bud      2.50    River Rd.
Sue's   Coors    3.00    River Rd.
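A minimal Python sketch of natural join, assuming tuples are dictionaries keyed by attribute name; `natural_join` is an illustrative helper, not a library function. Bars is given with its name attribute already renamed to bar, as the example supposes.

```python
SELLS = [
    {"bar": "Joe's", "beer": "Bud", "price": 2.50},
    {"bar": "Joe's", "beer": "Miller", "price": 2.75},
    {"bar": "Sue's", "beer": "Bud", "price": 2.50},
    {"bar": "Sue's", "beer": "Coors", "price": 3.00},
]
BARS = [
    {"bar": "Joe's", "addr": "Maple St."},
    {"bar": "Sue's", "addr": "River Rd."},
]


def natural_join(r, s):
    """Pair tuples that agree on every attribute name the relations share."""
    result = []
    for t1 in r:
        for t2 in s:
            common = t1.keys() & t2.keys()
            if all(t1[a] == t2[a] for a in common):
                # Merging the dicts keeps one column per shared attribute.
                result.append({**t1, **t2})
    return result


bar_info = natural_join(SELLS, BARS)
```

Merging the two dictionaries plays the role of projecting out one column from each pair of equated attributes.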
Renaming
ρ_{S(A1,...,An)}(R) produces a relation identical to R but named S and with attributes, in order, named A1,...,An.
Example
Bars =
name    addr
Joe's   Maple St.
Sue's   River Rd.

ρ_{R(bar,addr)}(Bars) =
bar     addr
Joe's   Maple St.
Sue's   River Rd.
Union (R ∪ S)
arity(R) = arity(S) = arity(R ∪ S)
max(card(R), card(S)) ≤ card(R ∪ S) ≤ card(R) + card(S)
R ⊆ R ∪ S
S ⊆ R ∪ S
π_{Cust-Name}(σ_{Branch-Name = "Perryridge"}(DEPOSIT ∪ BORROW))
π_{Cust-Name}(σ_{Branch-Name = "Perryridge"}(DEPOSIT)) ∪ π_{Cust-Name}(σ_{Branch-Name = "Perryridge"}(BORROW))
Does π(σ(D) ∪ σ(B)) work?
Combining Operations
Algebra =
1. Basis arguments +
2. Ways of constructing expressions.
For relational algebra:
1. Arguments = variables standing for relations + finite, constant relations.
2. Expressions constructed by applying one of the operators + parentheses.
Query = expression of relational algebra.
π_{Cust-Name,Cust-City}(σ_{CLIENT.Banker-Name = "Johnson"}(CLIENT × CUSTOMER)) = π_{Cust-Name,Cust-City}(CUSTOMER)

π_{CLIENT.Cust-Name, CUSTOMER.Cust-City}(σ_{CLIENT.Banker-Name = "Johnson" ∧ CLIENT.Cust-Name = CUSTOMER.Cust-Name}(CLIENT × CUSTOMER))

π_{CLIENT.Cust-Name, CUSTOMER.Cust-City}(σ_{CLIENT.Cust-Name = CUSTOMER.Cust-Name}(CUSTOMER × π_{Cust-Name}(σ_{CLIENT.Banker-Name = "Johnson"}(CLIENT))))
Intersection (R ∩ S)
R − (R − S) = R ∩ S
Operator Precedence
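1. Unary operators σ, π, and ρ have highest precedence.
2. Next highest are the "multiplicative" operators: ⋈, ⋈_C, and ×.
3. Lowest are the "additive" operators: ∪, ∩, and −.
But there is no universal agreement, so we always put parentheses around the argument of a unary operator, and it is a good idea to group all binary operators with parentheses enclosing their arguments.
Example
Group R ∪ σS ⋈ T as R ∪ ((σS) ⋈ T).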
If ∪, ∩, or − is applied, the schemas are the same, so use this schema.
Projection: use the attributes listed in the projection.
Selection: no change in schema.
Product R × S: use the attributes of R and S. But if they share an attribute A, prefix it with the relation name, as R.A, S.A.
Theta-join: same as product.
Natural join: use the attributes from each relation; common attributes are merged anyway.
Renaming: whatever it says.
Example
Find the bars that are either on Maple Street or sell Bud for less than $3.
Example
Find the bars that sell two different beers at the same price. Sells(bar, beer, price)
Invent new names for intermediate relations, and assign them values that are algebraic expressions. Renaming of attributes implicit in schema of new relation.
Example
Find the bars that are either on Maple Street or sell Bud for less than $3.
Sells(bar, beer, price)
Bars(name, addr)
R1(name) := π_{name}(σ_{addr = Maple St.}(Bars))
R2(name) := π_{bar}(σ_{beer=Bud AND price<$3}(Sells))
R3(name) := R1 ∪ R2
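The linear notation maps naturally onto a sequence of assignments. The following Python sketch evaluates R1, R2, and R3 over the Sells and Bars instances used in the earlier examples, assuming relations are lists of positional tuples and results are sets.

```python
SELLS = [  # Sells(bar, beer, price)
    ("Joe's", "Bud", 2.50),
    ("Joe's", "Miller", 2.75),
    ("Sue's", "Bud", 2.50),
    ("Sue's", "Coors", 3.00),
]
BARS = [  # Bars(name, addr)
    ("Joe's", "Maple St."),
    ("Sue's", "River Rd."),
]

# R1(name): bars located on Maple St.
R1 = {name for (name, addr) in BARS if addr == "Maple St."}

# R2(name): bars that sell Bud for less than $3.
R2 = {bar for (bar, beer, price) in SELLS if beer == "Bud" and price < 3.00}

# R3(name): the union of the two intermediate relations.
R3 = R1 | R2
```

Each set comprehension combines a selection (the `if` clause) with a projection (the expression before `for`), and the renaming to `name` is implicit in the variable each result is assigned to.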
What does it mean to "work"? Why can't we just tear sets of attributes apart as we like? Answer: the decomposed relations need to represent the same information as the original.
Example
R =
name      addr         beersLiked   manf     favoriteBeer
Janeway   Voyager      Bud          A.B.     WickedAle
Janeway   Voyager      WickedAle    Pete's   WickedAle
Spock     Enterprise   Bud          A.B.     Bud
Reconstruction of Original
Can we figure out the original relation from the decomposed relations? Sometimes, if we natural join the relations.
Example
Drinkers3 ⋈ Drinkers4 =
name      beersLiked   manf
Janeway   Bud          A.B.
Janeway   WickedAle    Pete's
Spock     Bud          A.B.
Theorem
Suppose we decompose a relation with schema XYZ into XY and XZ and project the relation for XYZ onto XY and XZ. Then XY ⋈ XZ is guaranteed to reconstruct XYZ if and only if X →→ Y (or equivalently, X →→ Z).
Usually, the MVD is really an FD, X → Y or X → Z.
BCNF: when we decompose XYZ into XY and XZ, it is because there is an FD X → Y or X → Z that violates BCNF. Thus, we can always reconstruct XYZ from its projections onto XY and XZ.
4NF: when we decompose XYZ into XY and XZ, it is because there is an MVD X →→ Y or X →→ Z that violates 4NF. Again, we can always reconstruct XYZ from its projections onto XY and XZ.
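The theorem can be checked on small instances by projecting and rejoining. The Python sketch below is illustrative only: the two instances `good` and `bad` are made-up data, and the helper names are hypothetical. Tuples are (x, y, z) triples.

```python
def project(relation, positions):
    """Set-style projection onto the given tuple positions."""
    return {tuple(t[i] for i in positions) for t in relation}


def join_on_x(xy, xz):
    """Natural join of XY and XZ on their shared first component X."""
    return {(x, y, z) for (x, y) in xy for (x2, z) in xz if x == x2}


def reconstructs(xyz):
    """True if joining the projections onto XY and XZ gives back xyz."""
    xy = project(xyz, (0, 1))
    xz = project(xyz, (0, 2))
    return join_on_x(xy, xz) == set(xyz)


# Hypothetical instance where X -> Y holds, so the join is lossless.
good = {(1, "a", "p"), (1, "a", "q"), (2, "b", "p")}

# Hypothetical instance where X ->> Y fails: for x = 1 the join would
# also produce (1, "a", "q") and (1, "b", "p"), which are not in the relation.
bad = {(1, "a", "p"), (1, "b", "q")}
```

Here `reconstructs(good)` is true and `reconstructs(bad)` is false, since the rejoined relation for `bad` has four tuples instead of two.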
Bag Semantics
A relation (in SQL, at least) is really a bag or multiset.
It may contain the same tuple more than once, although there is no specified order (unlike a list).
Example: {1,2,1,3} is a bag and not a set.
Select, project, and join work for bags as well as sets. Just work on a tuple-by-tuple basis, and don't eliminate duplicates.
Bag Union
Sum the times an element appears in the two bags. Example: {1,2,1} ∪ {1,2,3,3} = {1,1,1,2,2,3,3}.
Bag Intersection
Take the minimum of the number of occurrences in each bag. Example: {1,2,1} ∩ {1,2,3,3} = {1,2}.
Bag Difference
Proper-subtract the number of occurrences in the two bags. Example: {1,2,1} − {1,2,3,3} = {1}.
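Python's `collections.Counter` happens to implement these three bag operations directly, so the examples can be reproduced in a few lines. Modeling a bag as a `Counter` is a choice made for this sketch.

```python
from collections import Counter

a = Counter([1, 2, 1])      # the bag {1,2,1}
b = Counter([1, 2, 3, 3])   # the bag {1,2,3,3}

bag_union = a + b           # occurrence counts are summed
bag_intersection = a & b    # minimum of the two counts
bag_difference = a - b      # proper subtraction: counts never drop below 0

# Expand each result back into a sorted list of elements.
union_elems = sorted(bag_union.elements())
intersection_elems = sorted(bag_intersection.elements())
difference_elems = sorted(bag_difference.elements())
```

Note that `Counter` also offers `|`, which takes the maximum of the counts; that is not the bag union defined here.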
But other laws that hold for sets do not hold for bags.
Example
R ∩ (S ∪ T) ≡ (R ∩ S) ∪ (R ∩ T) holds for sets.
Let R, S, and T each be the bag {1}.
Left side: S ∪ T = {1,1}; R ∩ (S ∪ T) = {1}.
Right side: R ∩ S = R ∩ T = {1}; (R ∩ S) ∪ (R ∩ T) = {1} ∪ {1} = {1,1} ≠ {1}.
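The counterexample can be replayed in Python, assuming bags are modeled with `collections.Counter` (where `+` is bag union and `&` is bag intersection) and sets with the built-in `set` type.

```python
from collections import Counter

# Bag version: R, S, and T are each the bag {1}.
R = Counter([1])
S = Counter([1])
T = Counter([1])

left = R & (S + T)          # R ∩ (S ∪ T)
right = (R & S) + (R & T)   # (R ∩ S) ∪ (R ∩ T)

# Set version of the same law, for comparison.
r, s, t = {1}, {1}, {1}
set_left = r & (s | t)
set_right = (r & s) | (r & t)
```

The bag sides come out as {1} and {1,1}, so the distributive law fails for bags, while the set sides are equal.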
Adds features needed for SQL, bags.
1. Duplicate-elimination operator δ.
2. Extended projection.
3. Sorting operator τ.
4. Grouping-and-aggregation operator γ.
5. Outerjoin operator ⋈°.
Duplicate Elimination
δ(R) = relation with one copy of each tuple that appears one or more times in R.
Example
R =
A   B
1   2
3   4
1   2

δ(R) =
A   B
1   2
3   4
Sorting
τ_L(R) = list of tuples of R, ordered according to attributes on list L. Note that result type is outside the normal types (set or bag) for relational algebra.
Extended Projection
Allow the columns in the projection to be functions of one or more columns in the argument relation.
Example
R =
A   B
1   2
3   4

π_{A+B,A,A}(R) =
A+B   A1   A2
3     1    1
7     3    3
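A short Python sketch of extended projection, assuming each output column is described by a name and a function of the input tuple. The helper `extended_project` and the column names A1 and A2 (used to tell the two copies of A apart) follow the example; nothing here is a library API.

```python
R = [{"A": 1, "B": 2}, {"A": 3, "B": 4}]


def extended_project(relation, columns):
    """Build one output tuple per input tuple.

    `columns` is a list of (output_name, function_of_tuple) pairs.
    Duplicates are kept, as in the bag version of projection.
    """
    return [{name: f(t) for name, f in columns} for t in relation]


result = extended_project(
    R,
    [
        ("A+B", lambda t: t["A"] + t["B"]),  # computed column
        ("A1", lambda t: t["A"]),            # first copy of A
        ("A2", lambda t: t["A"]),            # second copy of A
    ],
)
```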
Aggregation Operators
These are not relational operators; rather they summarize a column in some way. Five standard operators: Sum, Average, Count, Min, and Max.
Grouping Operator
γ_L(R), where L is a list of elements that are either
a) individual (grouping) attributes or
b) of the form θ(A), where θ is an aggregation operator and A is the attribute to which it is applied,
is computed by:
1. Group R according to all the grouping attributes on list L.
2. Within each group, compute θ(A) for each element θ(A) on list L.
3. The result is a relation with one tuple for each group. The components of that tuple are the values associated with each element of L for that group.
Example
Let R =
bar     beer     price
Joe's   Bud      2.00
Joe's   Miller   2.75
Sue's   Bud      2.50
Sue's   Coors    3.00
Mel's   Miller   3.25

Compute γ_{beer,AVG(price)}(R).
1. Group by the grouping attribute(s), beer in this case:
bar     beer     price
Joe's   Bud      2.00
Sue's   Bud      2.50
Joe's   Miller   2.75
Mel's   Miller   3.25
Sue's   Coors    3.00

2. Compute the average of price within groups:
beer     AVG(price)
Bud      2.25
Miller   3.00
Coors    3.00
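The two steps of the example (group, then aggregate) can be sketched in Python as below. The function name `avg_price_by_beer` is hypothetical, and the sketch hard-codes the grouping attribute beer and the aggregate AVG(price) instead of handling a general list L.

```python
from collections import defaultdict

R = [  # R(bar, beer, price)
    ("Joe's", "Bud", 2.00),
    ("Joe's", "Miller", 2.75),
    ("Sue's", "Bud", 2.50),
    ("Sue's", "Coors", 3.00),
    ("Mel's", "Miller", 3.25),
]


def avg_price_by_beer(relation):
    """Group tuples by beer, then average the prices within each group."""
    groups = defaultdict(list)
    for bar, beer, price in relation:
        groups[beer].append(price)          # step 1: form the groups
    return {beer: sum(prices) / len(prices)  # step 2: aggregate each group
            for beer, prices in groups.items()}


averages = avg_price_by_beer(R)
```

The result has one entry per group, mirroring the one-tuple-per-group rule for the grouping operator.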
Outerjoin
The normal join can lose information, because a tuple that doesn't join with any tuple from the other relation (a dangling tuple) has no vestige in the join result. The null value can be used to pad dangling tuples so they appear in the join. This gives us the outerjoin operator ⋈°. Variations: theta-outerjoin, and left- and right-outerjoin (pad only dangling tuples from the left or right, respectively).
Example
R =
A   B
1   2
3   4

S =
B   C
4   5
6   7

R ⋈° S =
A      B   C
3      4   5      part of natural join
1      2   NULL   part of left-outerjoin
NULL   6   7      part of right-outerjoin
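A minimal Python sketch of the full natural outerjoin for this example, assuming R(A, B) and S(B, C) are lists of pairs joined on B, with `None` standing in for the null value. The function name `full_outerjoin` is illustrative.

```python
R = [(1, 2), (3, 4)]  # R(A, B)
S = [(4, 5), (6, 7)]  # S(B, C)


def full_outerjoin(r, s):
    """Natural join on B, plus dangling tuples from both sides padded with None."""
    # Tuples that join normally.
    result = [(a, b, c) for (a, b) in r for (b2, c) in s if b == b2]
    # Dangling tuples of r, padded on the right (the left-outerjoin part).
    result += [(a, b, None) for (a, b) in r
               if all(b != b2 for (b2, _) in s)]
    # Dangling tuples of s, padded on the left (the right-outerjoin part).
    result += [(None, b, c) for (b, c) in s
               if all(b != b2 for (_, b2) in r)]
    return result


joined = full_outerjoin(R, S)
```

Dropping the third step gives the left-outerjoin, and dropping the second gives the right-outerjoin.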