P. M. Jat @ DAIICT
Normalization
Normalization is the process of analyzing a given relation schema, based on its FDs and candidate keys, to achieve two desirable properties: minimizing redundancy and minimizing update anomalies. It defines various normal forms as measures of the goodness of a relation.
10/17/2011
Database Systems
Normal forms
Initially, Codd proposed three normal forms, which he called First, Second, and Third Normal Form. A stronger definition of 3NF, called Boyce-Codd Normal Form (BCNF), was proposed later by Boyce and Codd. All of these normal forms are based on the functional dependencies among the attributes of a relation. Later, a Fourth Normal Form (4NF) and a Fifth Normal Form (5NF) were proposed, based on multivalued dependencies and join dependencies respectively.
However, permitting multi-valued attributes (such as arrays) or relation-valued attributes (nested relations) goes against the basic understanding of the relational model, and you may need to perform non-relational operations to retrieve the data.
Nested Relations:
Invoice(InvNo, Date, CustID, InvoiceItems); where
BCNF
A relation R is in Boyce-Codd Normal Form when the determinant of every FD that holds on R is a super key of R. In other words, for every FD A -> B that holds on relation R, A is a super key of R.
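The BCNF condition can be checked mechanically with attribute closure. The following is a minimal sketch, not part of the original slides: the relation R(A, B, C) and its FDs are hypothetical, and the helper names `closure` and `bcnf_violations` are chosen here for illustration. Trivial FDs (right side contained in the left side) are skipped, since they hold on every relation.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Once the whole determinant is known, its dependents follow.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(relation, fds):
    """List the non-trivial FDs whose determinant is not a super key."""
    violations = []
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue  # trivial FD, always acceptable
        if closure(lhs, fds) >= relation:
            continue  # determinant is a super key
        violations.append((lhs, rhs))
    return violations


# Hypothetical relation R(A, B, C) with FDs A -> B and B -> C.
R = frozenset("ABC")
fds = [
    (frozenset("A"), frozenset("B")),
    (frozenset("B"), frozenset("C")),
]

for lhs, rhs in bcnf_violations(R, fds):
    print(sorted(lhs), "->", sorted(rhs), "violates BCNF")
```

Here A is a super key (its closure is the whole relation), while B is not, so only B -> C is reported.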
BCNF
Use of the term key or super key in the definition of BCNF may cause confusion. Consider the definition "for every FD A -> B that holds on relation R, A is a super key of R" and the FDs in the Company database: is the FD {ssn, fname} -> salary acceptable in BCNF?
BCNF
To accept FDs like {ssn, fname} -> salary (which are basically reducible FDs), the definition uses the term super key; if the input set of FDs is in canonical form, using the term key is also fine. Therefore, informally, you can say that a relation is in BCNF if, for every FD that holds on R, the determinant is always a key.
Relations in TGMC
Member(MembID, MembName, MembEmail, TeamID)
Team(TeamID, TeamPWD, MentorID)
Mentor(MentorID, MentorName, Email, InstID)
Institute(InstID, InstName, City, PIN, State)
FDs in TGMC
MembID -> MembName, MembEmail, TeamID, TeamPWD, MentorID, MentorName, Email, InstID, InstName, City, PIN, State
TeamID -> TeamPWD, MentorID, MentorName, Email, InstID, InstName, City, PIN, State
MentorID -> MentorName, Email, InstID
InstID -> InstName, City, PIN, State
PIN -> City, State
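The long right-hand sides for MembID and TeamID can be recovered by transitivity from shorter FDs. The sketch below assumes a reduced FD set (each determinant listed only with the attributes of its own TGMC relation); that reduction is an assumption made here for illustration, not something stated on the slides. It computes attribute closures to confirm that MembID still determines every attribute.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


# Assumed "direct" FDs, one per TGMC relation, plus PIN -> City, State.
fds = [
    ({"MembID"}, {"MembName", "MembEmail", "TeamID"}),
    ({"TeamID"}, {"TeamPWD", "MentorID"}),
    ({"MentorID"}, {"MentorName", "Email", "InstID"}),
    ({"InstID"}, {"InstName", "City", "PIN", "State"}),
    ({"PIN"}, {"City", "State"}),
]

# Every attribute mentioned anywhere in the FD set.
all_attributes = set().union(*(lhs | rhs for lhs, rhs in fds))

membid_closure = closure({"MembID"}, fds)
teamid_closure = closure({"TeamID"}, fds)

print("MembID determines everything:", membid_closure == all_attributes)
print("TeamID determines:", sorted(teamid_closure - {"TeamID"}))
```

The closure of TeamID should match the right-hand side listed for TeamID, which suggests the longer FDs are inferred rather than independent.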
Relations in TGMC
Sometimes, for some reason, InstID is placed in the Team relation too:
Member(MembID, MembName, MembEmail, TeamID)
Team(TeamID, TeamPWD, MentorID, InstID)
Mentor(MentorID, MentorName, Email, InstID)
Institute(InstID, InstName, City, PIN, State)
3NF
3NF is less restrictive than BCNF: it relaxes the BCNF condition for prime attributes (attributes that are part of some candidate key). A relation R is in 3NF if, for every FD A -> B that holds on R, either A is a super key of R or B is a prime attribute.
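The 3NF test needs the prime attributes, so the candidate keys must be found first. A minimal sketch under stated assumptions: the address relation R(street, city, zip) is a common textbook example that does not appear on these slides, and the brute-force key search is only practical for small relations.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(relation, fds):
    """Brute-force search for minimal attribute sets that determine the relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) >= relation:
                keys.append(candidate)
    return keys


def is_3nf(relation, fds):
    """True if every FD has a super-key determinant or only prime dependents."""
    prime = set().union(*candidate_keys(relation, fds))
    for lhs, rhs in fds:
        if closure(lhs, fds) >= relation:
            continue  # determinant is a super key
        if not (rhs - lhs) <= prime:
            return False  # a non-prime attribute depends on a non-super-key
    return True


# Hypothetical example: {street, city} -> zip and zip -> city.
R = {"street", "city", "zip"}
fds = [({"street", "city"}, {"zip"}), ({"zip"}, {"city"})]

print("Candidate keys:", [sorted(k) for k in candidate_keys(R, fds)])
print("In 3NF:", is_3nf(R, fds))
```

In this example zip is not a super key, so zip -> city fails the BCNF condition, but city is prime, so the relation is still accepted by 3NF.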
Keys: {ESSN, PNO} and {ESSN, PName}
*Sometimes students find their own reasons and allow redundancies like this.
2NF
Consider our EMP-DEP relation (and the FDs in the Company database), which had a lot of redundancy:
EMP_DEP(ssn, fname, salary, superssn, dno, dname, mgrssn, mgrstartdate)
Concerned FDs:
ssn -> fname, salary, superssn, dno, dname, mgrssn, mgrstartdate
dno -> dname, mgrssn, mgrstartdate
2NF
And the culprit FDs are the following:
dno -> dname, mgrssn, mgrstartdate
In 2NF, we permit an FD X -> B when we also have an FD A -> X and A is a super key. That means the FD A -> B is transitively inferred from the FDs A -> X and X -> B, so B is still determined by A (a super key). Example: dno -> mgrssn is an acceptable FD in 2NF, because you also have the FD ssn -> dno and ssn is a key.
Decomposition of EMP-DEP
Decomposition strategy based on transitive FDs:
Identify X (the transitivity pivot) in R and create another relation R2 that has X and all attributes determined by X; X becomes the key of R2.
The remaining attributes of R are put in R1, and X becomes a foreign key (FK) in R1 referring to R2.
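The pivot-based split can be sketched in a few lines. This is an illustrative sketch only: the function name `decompose_on_pivot` is hypothetical, and it uses the EMP_DEP attributes and FDs from the earlier slide with dno as the pivot X.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def decompose_on_pivot(relation, fds, pivot):
    """Split `relation` into (R1, R2) around the transitivity pivot X."""
    # R2: the pivot X plus everything X determines; X is the key of R2.
    r2 = closure(pivot, fds) & relation
    # R1: the remaining attributes, keeping X as a foreign key to R2.
    r1 = (relation - r2) | pivot
    return r1, r2


EMP_DEP = {"ssn", "fname", "salary", "superssn",
           "dno", "dname", "mgrssn", "mgrstartdate"}
fds = [
    ({"ssn"}, {"fname", "salary", "superssn", "dno",
               "dname", "mgrssn", "mgrstartdate"}),
    ({"dno"}, {"dname", "mgrssn", "mgrstartdate"}),
]

r1, r2 = decompose_on_pivot(EMP_DEP, fds, {"dno"})
print("R1:", sorted(r1))
print("R2:", sorted(r2))
```

The only attribute shared by R1 and R2 is the pivot dno, which is the key of R2, so the two relations can be joined back on it.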
Definition of 2NF
A relation R is in 2NF if every non-prime attribute is irreducibly dependent on (that is, determined by) a key (we do not say super key), whether the dependency is direct or inferred through transitivity.
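One way to read "irreducibly dependent on a key" is that no proper subset of a candidate key may determine a non-prime attribute. The sketch below checks that reading. The helper names are hypothetical, EMP_DEP comes from the earlier slide, and the relation R(A, B, C, D) is an invented counter-example.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(relation, fds):
    """Brute-force search for minimal attribute sets that determine the relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # not minimal
            if closure(candidate, fds) >= relation:
                keys.append(candidate)
    return keys


def is_2nf(relation, fds):
    """True if no proper subset of a key determines a non-prime attribute."""
    keys = candidate_keys(relation, fds)
    non_prime = relation - set().union(*keys)
    for key in keys:
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                if closure(set(part), fds) & non_prime:
                    return False  # partial dependency on the key
    return True


EMP_DEP = {"ssn", "fname", "salary", "superssn",
           "dno", "dname", "mgrssn", "mgrstartdate"}
emp_dep_fds = [
    ({"ssn"}, {"fname", "salary", "superssn", "dno",
               "dname", "mgrssn", "mgrstartdate"}),
    ({"dno"}, {"dname", "mgrssn", "mgrstartdate"}),
]
print("EMP_DEP in 2NF:", is_2nf(EMP_DEP, emp_dep_fds))

# Invented counter-example: key {A, B}, but A alone determines C.
R = {"A", "B", "C", "D"}
r_fds = [({"A", "B"}, {"D"}), ({"A"}, {"C"})]
print("R in 2NF:", is_2nf(R, r_fds))
```

EMP_DEP passes because its only key, ssn, is a single attribute and so has no proper non-empty subset, even though dno -> dname still causes redundancy.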
Summary
1NF: attributes have atomic values and there are no repeating groups.
BCNF: the only acceptable FDs are those whose determinant is a super key.
3NF: relaxes BCNF. An FD X -> Y is acceptable if either X is a super key or Y is a prime attribute.
2NF: non-prime attributes are irreducibly dependent on a key.
Which normal form is the following relation W in?
W(ssn, pno, pname, hours)
Concerned FDs:
pno -> pname
{ssn, pno} -> hours
Start with BCNF and work down until you reach a form that accepts the given FDs.
Exercises
In every decomposed relation, are all non-prime attributes fully functionally dependent on the PK?
Example-2
Consider the relation (S#, SNAME, P#, QTY), where SNAME is unique. The candidate keys are {S#, P#} and {SNAME, P#}. The following FDs exist:
1. {S#, P#} -> QTY
2. {SNAME, P#} -> QTY
3. S# -> SNAME
4. SNAME -> S#
In which normal form is the relation?
2NF: Yes
3NF: Yes
BCNF: No, because of FDs 3 and 4
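The answers above can be cross-checked by classifying each FD: is its determinant a super key, and is its right side prime? A minimal sketch, with helper names chosen here for illustration and a brute-force key search that only suits small relations:

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(relation, fds):
    """Brute-force search for minimal attribute sets that determine the relation."""
    keys = []
    for size in range(1, len(relation) + 1):
        for combo in combinations(sorted(relation), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # not minimal
            if closure(candidate, fds) >= relation:
                keys.append(candidate)
    return keys


R = {"S#", "SNAME", "P#", "QTY"}
fds = [
    ({"S#", "P#"}, {"QTY"}),     # FD 1
    ({"SNAME", "P#"}, {"QTY"}),  # FD 2
    ({"S#"}, {"SNAME"}),         # FD 3
    ({"SNAME"}, {"S#"}),         # FD 4
]

keys = candidate_keys(R, fds)
prime = set().union(*keys)

# For each FD: (number, determinant is a super key, right side is prime).
report = []
for number, (lhs, rhs) in enumerate(fds, start=1):
    report.append((number, closure(lhs, fds) >= R, rhs <= prime))

bcnf = all(superkey for _, superkey, _ in report)
third_nf = all(superkey or rhs_prime for _, superkey, rhs_prime in report)

print("Candidate keys:", [sorted(k) for k in keys])
print("BCNF:", bcnf, "| 3NF:", third_nf)
```

FDs 3 and 4 have determinants that are not super keys, which breaks BCNF, but their right sides are prime attributes, which is exactly the case 3NF tolerates.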
Example-2 contd..
Obviously there are redundancies and anomalies in the relation (S#, SNAME, P#, QTY). To bring it into BCNF, it can be decomposed into
1. (S#, SNAME), where both are candidate keys
2. (S#, P#, QTY), where {S#, P#} is the candidate key
OR
1. (S#, SNAME), where both are candidate keys
2. (SNAME, P#, QTY), where {SNAME, P#} is the candidate key