Department of Computer Science Normalization Goal = BCNF = Boyce-Codd Normal Form = all FDs follow from the fact key everything. Formally, R is in BCNF if for every nontrivial FD for R, say X A, then X is a superkey. Nontrivial = right-side attribute not in left side. Why? 1. Guarantees no redundancy due to FDs. 2. Guarantees no update anomalies = one occurrence of a fact is updated, not all. 3. Guarantees no deletion anomalies = valid fact is lost when tuple is deleted. Example of Problems Drinkers(name, addr, beersLiked, manf, favoriteBeer)
FDs: 1. name addr 2. name favoriteBeer 3. beersLiked manf ???s are redundant, since we can figure them out from the FDs. Update anomalies: If Janeway gets transferred to the Intrepid, will we change addr in each of her tuples? Deletion anomalies: If nobody likes Bud, we lose track of Buds manufacturer. name addr beersLi ked manf favori teBeer Janeway Voyager Bud A.B. Wi ckedAl e Janeway ??? Wi ckedAl e Pete's ??? Spock Enterpri se Bud ??? Bud Each of the given FDs is a BCNF violation: Key = {name, beersLiked} Each of the given FDs has a left side that is a proper subset of the key. Another Example Beers(name, manf, manfAddr). FDs = name manf, manf manfAddr. Only key is name. Manf manfAddr violates BCNF with a left side unrelated to any key. Decomposition to Reach BCNF Setting: relation R, given FDs F. Suppose relation R has BCNF violation X B. We need only look among FDs of F for a BCNF violation, not those that follow from F. Proof: If Y A is a BCNF violation and follows from F, then the computation of Y +
used at least one FD X B from F. X must be a subset of Y. Thus, if Y is not a superkey, X cannot be a superkey either, and X B is also a BCNF violation. 1. Compute X + . Cannot be all attributes why? 2. Decompose R into X + and (RX + ) X.
3. Find the FDs for the decomposed relations. Project the FDs from F = calculate all consequents of F that involve only attributes from X + or only from (RX + ) X. R X +
X Example R = Drinkers(name, addr, beersLiked, manf, favoriteBeer) F = 1. name addr 2. name favoriteBeer 3. beersLiked manf Pick BCNF violation name addr. Close the left side: name + = name addr favoriteBeer. Decomposed relations: Drinkers1(name, addr, favoriteBeer) Drinkers2(name, beersLiked, manf) Projected FDs (skipping a lot of work that leads nowhere interesting): For Drinkers1: name addr and name favoriteBeer. For Drinkers2: beersLiked manf. (Repeating) Decomposed relations: Drinkers1(name, addr, favoriteBeer) Drinkers2(name, beersLiked, manf) Projected FDs: For Drinkers1: name addr and name favoriteBeer. For Drinkers2: beersLiked manf.
BCNF violations? For Drinkers1, name is key and all left sides of FDs are superkeys. For Drinkers2, {name, beersLiked} is the key, and beersLiked manf violates BCNF. Decompose Drinkers2 First set of decomposed relations: Drinkers1(name, addr, favoriteBeer) Drinkers2(name, beersLiked, manf)
Close beersLiked + = beersLiked, manf. Decompose Drinkers2 into: Drinkers3(beersLiked, manf) Drinkers4(name, beersLiked) Resulting relations are all in BCNF: Drinkers1(name, addr, favoriteBeer) Drinkers3(beersLiked, manf) Drinkers4(name, beersLiked) 3NF One FD structure causes problems: If you decompose, you cant check all the FDs only in the decomposed relations. If you dont decompose, you violate BCNF. Abstractly: AB C and C B. Example 1: title city theatre and theatre city. Example 2: street city zip, zip city. Keys: {A, B} and {A, C}, but C B has a left side that is not a superkey. Suggests decomposition into BC and AC. But you cant check the FD AB C in only these relations. Example A = street, B = city, C = zip.
Join: street zi p 545 Tech Sq. 02138 545 Tech Sq. 02139 ci ty zi p Cambridge 02138 Cambridge 02139 ci ty street zi p Cambri dge 545 Tech Sq. 02138 Cambri dge 545 Tech Sq. 02139 zip city street city zip Elegant Workaround Define the problem away. A relation R is in 3NF iff (if and only if) for every nontrivial FD X A, either: 1. X is a superkey, or 2. A is prime = member of at least one key. Thus, the canonical problem goes away: you dont have to decompose because all attributes are prime. What 3NF Gives You There are two important properties of a decomposition: 1. We should be able to recover from the decomposed relations the data of the original. Recovery involves projection and join, which we shall defer until weve discussed relational algebra. 2. We should be able to check that the FDs for the original relation are satisfied by checking the projections of those FDs in the decomposed relations. Without proof, we assert that it is always possible to decompose into BCNF and satisfy (1). Also without proof, we can decompose into 3NF and satisfy both (1) and (2). But it is not possible to decompose into BNCF and get both (1) and (2). Street-city-zip is an example of this point. Multivalued Dependencies The multivalued dependency X Y holds in a relation R if whenever we have two tuples of R that agree in all the attributes of X, then we can swap their Y components and get two new tuples that are also in R.
X Y others Example Drinkers(name, addr, phones, beersLiked) with MVD Name phones. If Drinkers has the two tuples: name addr phones beersLiked sue a p1 b1 sue a p2 b2 it must also have the same tuples with phones components swapped: name addr phones beersLiked sue a p2 b1 sue a p1 b2 Note: we must check this condition for all pairs of tuples that agree on name, not just one pair. MVD Rules 1. Every FD is an MVD. Because if X Y, then swapping Ys between tuples that agree on X doesnt create new tuples. Example, in Drinkers: name addr. 2. Complementation: if X Y, then X Z, where Z is all attributes not in X or Y. Example: since name phones holds in Drinkers, so does name addr beersLiked. Splitting Doesnt Hold Sometimes you need to have several attributes on the right of an MVD. For example: Drinkers(name, areaCode, phones, beersLiked, beerManf) name areaCode phones beersLiked beerManf Sue 831 555-1111 Bud A.B. Sue 831 555-1111 Wicked Ale Petes Sue 408 555-9999 Bud A.B. Sue 408 555-9999 Wicked Ale Petes name areaCode phones holds, but neither name areaCode nor name phones do. 4NF Eliminate redundancy due to multiplicative effect of MVDs. Roughly: treat MVDs as FD's for decomposition, but not for finding keys. Formally: R is in Fourth Normal Form if whenever MVD X Y is nontrivial (Y is not a subset of X, and X Y is not all attributes), then X is a superkey. Remember, X Y implies X Y, so 4NF is more stringent than BCNF. Decompose R, using 4NF violation X Y, into XY and X (RY). R Y X Example Drinkers(name, addr, phones, beersLiked) FD: name addr Nontrivial MVDs: name phones and name beersLiked. Only key: {name, phones, beersLiked} All three dependencies above violate 4NF. Successive decomposition yields 4NF relations: D1(name, addr) D2(name, phones) D3(name, beersLiked) 1NF 2NF 3NF BCNF 4NF 5NF functional dependencies multivalued dependencies join dependencies HIGHER NORMAL FORMS ASSUMPTIONS 24-HOUR FLIGHT-TIMETABLE, ALL FLIGHTS EVERY DAY ALL PLANES TAKE-OFF AND LAND (BUT DO NOT CRASH) NO AIRPORT IS LANDING-ONLY & NO AIRPORT IS TAKE-OFF-ONLY TTAB (ORG) = TTAB (DST) THERE IS AT LEAST ONE TIME DELAY ENTRY FOR EVERY FLIGHT SIMILARLY IN WEATHER REPORT HISTORY IF NO TWO FLIGHTS CAN TAKE OFF AT THE SAME TIME IN THE SAME AIRPORT WES CAN BE POSTED TO FLIGHT AND WEATHER@ORIGIN ELIMINATED UNIVERSITY DISCIPLINE DEGREE Old Town Computing BSc Old Town Mathematics PhD New City Computing PhD Old Town Computing PhD AWARD teaches (UNIVERSITY, DISCIPLINE) is_read_for (DISCIPLINE, DEGREE) awards (UNIVERSITY, DEGREE) teaches (NewCity, Computing) = true awards (NewCity, PhD) = true is_read_for (Computing, BSc) = true FROM (NewCity teaches Computing) and (Computing is_read_for BSc)
IT DOES NOT FOLLOW NewCity awards BSc for_reading Computing Join Dependency JD * (R 1 , R 2 , R 3 , ..., R m ) holds in R iff R = join (R 1 , R 2 , R 3 , ..., R m ), R i - a projection of R R JD* ( , ) then R is in 5NF if JD* ( , ) A relation R is in 5NF iff for all JD * (R 1 , R 2 , R 3 , ..., R m ) in R, every R i is a superkey for R. preventing illogical conjunction of facts Fifth Normal Form JD* ( , ) holds for R then R is not in 5NF if does not contain key UNIVERSITY DISCIPLINE Old Town Computing Old Town Mathematics New City Computing DISCIPLINE DEGREE Computing BSc Mathematics PhD Computing PhD UNIVERSITY DEGREE Old Town BSc Old Town PhD New City PhD AWARD NAME CODE TEACHING RESEARCH Ol d Town P05 21 4 New City C01 23 3 RANKING join dependencies
JD1 * ((NAME, CODE, TEACHING), (NAME, RESEARCH)) JD2 * ((NAME, CODE, RESEARCH), (NAME, TEACHING)) JD3 * ((NAME, CODE, TEACHING), (CODE, RESEARCH)) JD4 * ((NAME, CODE, RESEARCH), (CODE, TEACHING)) JD5 * ((NAME, CODE), (NAME, TEACHING), (CODE,RESEARCH)) ............................................................................................. candidate keys - NAME or CODE all projections in JD1 - to JD5 are superkeys for RANKING 5NF
Influence of Organizational Culture and Compensation On Employee Performance With Motivation Work As A Mediation Variable at Pt. Jasaraharja Branch Riau
International Journal of Innovative Science and Research Technology