
Week 6

Normalisation
Database Normalization

• Proposed by Codd (1972)
• Introduced three normal forms: the first, second and
third normal forms (1NF, 2NF, 3NF)
• A stronger definition of 3NF, called Boyce-Codd
normal form (BCNF), was proposed later
• Later, 4NF and 5NF were proposed
The minimum, and most common, goal is to achieve 3NF.

Database Normalization
Normalization is the process of analyzing a given
relational schema, based on its functional dependencies
and keys, to achieve the desirable properties of:
• Minimizing redundancy
• Minimizing insertion, deletion, and update
anomalies
• Minimizing data storage
• Unsatisfactory relation schemas that do not meet a
given normal form test are decomposed into
smaller relation schemas that meet the test and
hence possess the desired properties.
• Key concepts in normalization are functional
dependency and keys
Example
Sales (Order#, Date, CustID, Name, Address, City,
State, Zip, {Product#, ProductDesc, Price,
QuantityOrdered}, Subtotal, Tax, S&H, Total)

• What are the problems with using a single table
for all order information?
– Insert Anomaly
– Update Anomaly
– Delete Anomaly
Problems
• Implementing Repeating Groups
• Duplication of Data (customer name & address)
• Unnecessary Data (subtotal, total, tax)
• Others, including anomalies:
– If we insert a new customer who has no invoices, we have to
insert null values for all attributes relating to the invoice (insert
anomaly)
– If we insert a new invoice for a customer, we have to enter the
customer details (name, address, etc.) correctly so that they
are consistent with the existing values (insert anomaly)
– If we delete an invoice for a customer and that customer happens
to have only one invoice, the information concerning
this customer will be lost from the database (delete anomaly)
– If we update the address of a customer, we have to update all
invoices for that customer as well (update anomaly)
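
The update and delete anomalies can be illustrated with a small sketch. It assumes a flattened list of invoice rows; the order numbers, customer IDs, names and addresses are invented for illustration and do not come from the slides.

```python
# Hypothetical flattened Sales rows: customer details repeat on every invoice.
sales = [
    {"order_no": 1, "cust_id": "C1", "name": "Ann", "address": "1 High St"},
    {"order_no": 2, "cust_id": "C1", "name": "Ann", "address": "1 High St"},
    {"order_no": 3, "cust_id": "C2", "name": "Bob", "address": "9 Low Rd"},
]

def update_address_first_match(rows, cust_id, new_address):
    """Careless update: changes only the first matching row."""
    for row in rows:
        if row["cust_id"] == cust_id:
            row["address"] = new_address
            return

def addresses_for(rows, cust_id):
    """Return the set of distinct addresses stored for one customer."""
    return {row["address"] for row in rows if row["cust_id"] == cust_id}

def delete_order(rows, order_no):
    """Return the rows that remain after deleting one invoice."""
    return [row for row in rows if row["order_no"] != order_no]

# Update anomaly: one customer now has two different addresses.
update_address_first_match(sales, "C1", "5 New Ave")
inconsistent = addresses_for(sales, "C1")

# Delete anomaly: removing C2's only invoice removes C2 altogether.
remaining = delete_order(sales, 3)
customers_left = {row["cust_id"] for row in remaining}
```

With customer details held once in their own table, neither problem could occur, because there would be one address to update and the customer row would survive the deletion of an invoice.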


Database Normalization
• Functional dependency (FD): X → Y means that if
there is only one possible value of Y for every value of X, then
Y is functionally dependent on X.

Do the following FDs hold?

X → Y    Y → Z    Y → X    Z → Y

X   Y   Z
10  B1  C1
10  B2  C2
11  B4  C1
12  B3  C4
13  B1  C1
14  B3  C4
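
The question above can be checked mechanically. The following is a minimal sketch, using the six sample rows from the slide; the function name `fd_holds` is a hypothetical helper, not something defined in the course material.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if each lhs value maps to exactly one rhs value in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key;
        # any later, different value refutes the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True

rows = [
    {"X": 10, "Y": "B1", "Z": "C1"},
    {"X": 10, "Y": "B2", "Z": "C2"},
    {"X": 11, "Y": "B4", "Z": "C1"},
    {"X": 12, "Y": "B3", "Z": "C4"},
    {"X": 13, "Y": "B1", "Z": "C1"},
    {"X": 14, "Y": "B3", "Z": "C4"},
]

results = {
    "X -> Y": fd_holds(rows, ["X"], ["Y"]),  # False: X=10 has B1 and B2
    "Y -> Z": fd_holds(rows, ["Y"], ["Z"]),  # True
    "Y -> X": fd_holds(rows, ["Y"], ["X"]),  # False: B1 has 10 and 13
    "Z -> Y": fd_holds(rows, ["Z"], ["Y"]),  # False: C1 has B1 and B4
}
```

Note that sample data can only refute an FD. An FD that holds on these six rows is merely consistent with the data; whether it holds in general is a property of the schema's meaning.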
Database Normalization
• Functional Dependency is “good”. With functional
dependency the primary key (Attribute A) determines the
value of all the other non-key attributes (Attributes
B,C,D,etc.)
• Transitive dependency is “bad”. Transitive dependency
exists if the primary/candidate key (Attribute A)
determines non-key Attribute B, and Attribute B
determines non-key Attribute C.
• If a relation schema has more than one key, each is called a
candidate key
• An attribute in a relation schema R is called prime if it is a
member of some candidate key of R

First Normal Form (1NF)

Each attribute must be atomic (single value)
• No repeating columns within a row (composite attributes)
• No multi-valued columns.

1NF simplifies attributes
• Queries become easier.
1NF
Original relation (Location is multi-valued):

Deptno  Dname      Location
10      IT         Leeds, Bradford, Kent
20      Research   Hundredfold
30      Marketing  Leeds

Decomposed into 1NF:

Deptno  Dname
10      IT
20      Research
30      Marketing

Deptno  Location
10      Leeds
10      Bradford
10      Kent
20      Hundredfold
30      Leeds
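
The 1NF decomposition above can be sketched in a few lines. This assumes the multi-valued Location is stored as a comma-separated string, which is one possible representation; `to_1nf` is a hypothetical helper name.

```python
dept = [
    {"Deptno": 10, "Dname": "IT", "Location": "Leeds, Bradford, Kent"},
    {"Deptno": 20, "Dname": "Research", "Location": "Hundredfold"},
    {"Deptno": 30, "Dname": "Marketing", "Location": "Leeds"},
]

def to_1nf(rows):
    """Split the multi-valued Location column into its own relation."""
    departments = [{"Deptno": r["Deptno"], "Dname": r["Dname"]} for r in rows]
    locations = [
        {"Deptno": r["Deptno"], "Location": loc.strip()}
        for r in rows
        for loc in r["Location"].split(",")  # one row per atomic value
    ]
    return departments, locations

departments, locations = to_1nf(dept)
```

After the split, every column holds a single value, and a department's locations can be queried by equality on Deptno rather than by string matching.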
Second Normal Form (2NF)
Each non-key attribute must be fully functionally dependent on
the whole primary key.
• If the primary key is a single attribute (and the relation is in 1NF), then the relation is in 2NF
• The test for 2NF involves testing for FDs whose left-hand-side
attributes are part of the primary key
• Disallow partial dependencies, where non-key attributes depend on
part of a composite primary key
• In short, remove partial dependencies

2NF improves data integrity.
• Helps prevent update, insert, and delete anomalies.
2NF
PNo PName PLoc EmpNo EName Salary Address HoursNo

Given the following FDs:

PNo, EmpNo → HoursNo
PNo → PName, PLoc
EmpNo → EName, Salary, Address

Assuming all attributes are atomic, is the above relation in
1NF? In 2NF?
Relation X1: PNo PName PLoc
Relation X2: EmpNo EName Salary Address
Relation X3: PNo EmpNo HoursNo
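
The decomposition into X1, X2 and X3 can be sketched as three projections, followed by a natural join to confirm that no information is lost. The sample rows, and the helper names `project` and `natural_join`, are assumptions made for illustration only.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    seen = []
    for row in rows:
        t = {a: row[a] for a in attrs}
        if t not in seen:
            seen.append(t)
    return seen

def natural_join(left, right):
    """Join two relations on all attributes they have in common."""
    out = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in set(l) & set(r)):
                out.append({**l, **r})
    return out

# Hypothetical rows of the unnormalised relation.
works = [
    {"PNo": 1, "PName": "Alpha", "PLoc": "Leeds", "EmpNo": 100,
     "EName": "Ann", "Salary": 3000, "Address": "1 High St", "HoursNo": 10},
    {"PNo": 1, "PName": "Alpha", "PLoc": "Leeds", "EmpNo": 101,
     "EName": "Bob", "Salary": 2500, "Address": "9 Low Rd", "HoursNo": 20},
    {"PNo": 2, "PName": "Beta", "PLoc": "Kent", "EmpNo": 100,
     "EName": "Ann", "Salary": 3000, "Address": "1 High St", "HoursNo": 5},
]

x1 = project(works, ["PNo", "PName", "PLoc"])
x2 = project(works, ["EmpNo", "EName", "Salary", "Address"])
x3 = project(works, ["PNo", "EmpNo", "HoursNo"])

# Joining the three projections recovers the original rows.
rejoined = natural_join(natural_join(x3, x1), x2)
key = lambda r: (r["PNo"], r["EmpNo"])
lossless = sorted(rejoined, key=key) == sorted(works, key=key)
```

In this sample, project and employee details are each stored once in X1 and X2 instead of once per assignment, which is the redundancy the partial dependencies were causing.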


Third Normal Form (3NF)
Remove transitive dependencies.
Transitive dependency
• A non-prime attribute is dependent on another non-prime
attribute or attributes
• Attribute is the result of a calculation

Examples:
• Area code attribute based on City attribute of a customer
• Total price attribute of order entry based on quantity
attribute and unit price attribute (calculated value)

Solution:
• Any transitive dependencies are moved into a smaller table.
Transitive Dependence
Given a relation R (EmpNo, EName, Salary, Address)
Assume the following FD holds:
EName → Address
Note: Both EName and Address are non-key attributes in R. Since
Address depends on the non-prime attribute EName, which in turn depends on the primary
key (EmpNo), a transitive dependency exists:

EmpNo → EName, EName → Address, hence EmpNo → Address

R1: EmpNo EName Salary
R2: EName Address

Note: If Address is a prime attribute, then R is in 3NF.
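
The transitive dependency in R can be detected with an attribute-closure computation. This is a minimal sketch: `closure` and `violates_3nf` are hypothetical helper names, and the FD EmpNo → Salary is an assumption that follows from EmpNo being the primary key.

```python
def closure(attrs, fds):
    """Return all attributes determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def violates_3nf(all_attrs, prime, fds):
    """List FDs whose left side is not a superkey and whose right side
    contains a non-prime attribute."""
    return [
        (lhs, rhs) for lhs, rhs in fds
        if closure(lhs, fds) != all_attrs and not (rhs - lhs) <= prime
    ]

all_attrs = {"EmpNo", "EName", "Salary", "Address"}
prime = {"EmpNo"}  # EmpNo is the only candidate key
fds = [
    ({"EmpNo"}, {"EName", "Salary"}),  # assumed: EmpNo is the primary key
    ({"EName"}, {"Address"}),          # FD given on the slide
]

emp_closure = closure({"EmpNo"}, fds)      # reaches Address via EName
bad = violates_3nf(all_attrs, prime, fds)  # the EName -> Address FD
```

If Address were added to `prime`, `violates_3nf` would return an empty list, matching the note that R is then in 3NF.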


Database Normalization
• Boyce-Codd Normal Form (BCNF)
– A relation is in Boyce-Codd normal form
(BCNF) if every determinant in the table is a
candidate key.
(A determinant is any attribute whose value
determines other values within a row.)
– If a table contains only one candidate key,
3NF and BCNF are equivalent.
– BCNF is a special case of 3NF.
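
The BCNF test can be sketched with the same closure idea. The relation below (Student, Course, Instructor) is a hypothetical textbook-style example chosen for illustration; it is not the table from the figures that follow, and the helper names are assumptions.

```python
def closure(attrs, fds):
    """Return all attributes determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(all_attrs, fds):
    """List non-trivial FDs whose determinant is not a superkey."""
    return [
        (lhs, rhs) for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != all_attrs
    ]

# Hypothetical relation: each instructor teaches exactly one course.
all_attrs = {"Student", "Course", "Instructor"}
fds = [
    ({"Student", "Course"}, {"Instructor"}),
    ({"Instructor"}, {"Course"}),
]

violations = bcnf_violations(all_attrs, fds)
```

Here Course is a prime attribute (it belongs to the candidate key {Student, Course}), so the relation satisfies 3NF, yet Instructor is a determinant that is not a candidate key, so BCNF is violated. This is the situation in which the two forms differ: more than one overlapping candidate key.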
Figure 5.7: A Table That Is in 3NF but Not in BCNF

Figure 5.8: The Decomposition of a Table Structure to Meet BCNF Requirements

Table 5.2: Sample Data for a BCNF Conversion

Figure 5.9: Decomposition into BCNF
