Normalization

Normalizing a logical database design involves organizing the data into more than
one table. Normalization improves performance by reducing redundancy. Redundancy can
lead to:

* Inconsistencies – Errors are more likely to occur when facts are repeated.
* Update anomalies – Inserting, modifying and deleting data may cause
inconsistencies.

There is a high likelihood of data in one table being updated or deleted, while
corresponding changes in other relations are omitted.

Normalization has numerous benefits. These include faster sorting and index
creation, fewer indexes per table, fewer NULLs and a more compact database.
However, the number and complexity of joins increase as normalization increases.
If the number of joins between tables increases, the performance of the database
may deteriorate. Normalization helps to simplify the structure of tables. The
performance of an application is directly linked to the database design. A poor
design hinders the performance of the system. The logical design of the database
lays the foundation for an optimal database.

Some rules that should be followed to achieve a good database design are:

* Each table should have an identifier.
* Each table should store data for a single type of entity.
* Nullable columns in tables should be avoided.
* The repetition of column values within a table should be avoided.

Normal Forms

Normalization results in the formation of tables that satisfy certain specified
constraints, and represent certain normal forms. The normal forms are used to
ensure that various types of anomalies and inconsistencies are not introduced in the
database. Normal forms are table structures with minimum redundancy. Several
normal forms have been identified. The most important and widely used of these
are:

* First Normal Form (1 NF)
* Second Normal Form (2 NF)
* Third Normal Form (3 NF)
* Boyce-Codd Normal Form (BCNF)

First Normal Form (1 NF)

A table is said to be in 1 NF when each cell of the table contains precisely one
value.

Consider the following table Project.

All rights reserved to www.2classnotes.com
Project

Ecode  Dept     ProjCode  Hours

E101   Systems  P27       90
                P51       101
                P20       60
E305   Sales    P27       109
                P22       98
E508   Admin    P51       NULL
                P27       72

The data in the table is not normalized because some cells in ProjCode and Hours have
more than one value.

By applying the 1 NF definition to the Project table, you arrive at the following table.

Project

Ecode Dept ProjCode Hours


E101 Systems P27 90
E101 Systems P51 101
E101 Systems P20 60
E305 Sales P27 109
E305 Sales P22 98
E508 Admin P51 NULL
E508 Admin P27 72
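
The flattening step above can be sketched in Python. This is a minimal illustration and not part of the original notes: the nested list `unnormalized` and the function name `to_first_normal_form` are assumptions chosen for the example, with each employee carrying a list of (ProjCode, Hours) pairs.

```python
# Hypothetical nested form of the unnormalized Project table:
# each employee row carries a list of (ProjCode, Hours) pairs.
unnormalized = [
    ("E101", "Systems", [("P27", 90), ("P51", 101), ("P20", 60)]),
    ("E305", "Sales", [("P27", 109), ("P22", 98)]),
    ("E508", "Admin", [("P51", None), ("P27", 72)]),  # None stands for NULL
]


def to_first_normal_form(rows):
    """Repeat Ecode and Dept so that every cell holds exactly one value."""
    flat = []
    for ecode, dept, assignments in rows:
        for proj_code, hours in assignments:
            flat.append((ecode, dept, proj_code, hours))
    return flat


project_1nf = to_first_normal_form(unnormalized)
for row in project_1nf:
    print(row)
```

Each of the seven output rows has a single value per column, matching the 1 NF table shown above.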

Functional Dependency

Normalization theory is based on the fundamental notion of functional
dependency. First, let us examine the concept of functional dependency.

Given a relation R (you may recall that a table is also called a relation), attribute A
is functionally dependent on attribute B if each value of B in R is associated with
precisely one value of A.

In other words, attribute A is functionally dependent on B if and only if, for each
value of B, there is exactly one value of A. Attribute B is called the determinant.

Consider the following table Employee.

Employee

Code Name City


E1 Mac Delhi
E2 Sandra CA
E3 Henry France

Given a particular value of Code, there is precisely one corresponding value of
Name. For example, for Code E1 there is exactly one value of Name, Mac. Hence,
Name is functionally dependent on Code. Similarly, there is exactly one value of City
for each value of Code. Hence, the attribute City is functionally dependent on the
attribute Code. The attribute Code is the determinant. You can also say that Code
determines City and Name.
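
The definition can be tested against sample rows with a short Python sketch. The helper name `determines` and the dictionary layout of the rows are assumptions made for this example. Note that such a check can only refute a dependency on the data at hand; whether a dependency really holds is a statement about the meaning of the data, not about one set of rows.

```python
def determines(rows, determinant, dependent):
    """Return True if, in these rows, every combination of values of the
    `determinant` columns is paired with exactly one combination of values
    of the `dependent` columns."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        value = tuple(row[col] for col in dependent)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True


employee = [
    {"Code": "E1", "Name": "Mac", "City": "Delhi"},
    {"Code": "E2", "Name": "Sandra", "City": "CA"},
    {"Code": "E3", "Name": "Henry", "City": "France"},
]

print(determines(employee, ["Code"], ["Name"]))  # True
print(determines(employee, ["Code"], ["City"]))  # True
```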

Second Normal Form (2 NF)

A table is said to be in 2 NF when it is in 1 NF and every non-key attribute in the
row is functionally dependent upon the whole key and not just part of the key.

Consider the Project table:

Project (ECode, ProjCode, Dept, Hours)

The table has the following rows:

Ecode ProjCode Dept Hours


E101 P27 Systems 90
E305 P27 Finance 10
E508 P51 Admin NULL
E101 P51 Systems 101
E101 P20 Systems 60
E508 P27 Admin 72

This situation could lead to the following problems:

* Insertion

The department of a particular employee cannot be recorded until the employee is
assigned a project.

* Updating

For a given employee, the employee code and department are repeated several times.
Hence, if an employee is transferred to another department, this change will have to
be recorded in every row of the Project table for that employee. Any omission will
lead to inconsistencies.

* Deletion

If an employee completes work on a project, the employee’s record will be deleted. The
information regarding the department to which the employee belongs will also be
lost.

The primary key here is composite (ECode + ProjCode).

The table satisfies the definition of 1 NF. You need to now check if it satisfies 2NF.

In the table, a single value of ECode can correspond to more than one value of Hours.
For example, for ECode E101, there are three values of Hours: 90, 101 and 60. Hence,
Hours is not functionally dependent on ECode. Similarly, a single value of
ProjCode can correspond to more than one value of Hours. For example, for ProjCode
P27, there are three values of Hours: 90, 10 and 72. However, for a combination of the
ECode and ProjCode values, there is exactly one value of Hours. Hence, Hours is
functionally dependent on the whole key, ECode+ProjCode.

Now you must check whether Dept is functionally dependent on the whole key,
ECode+ProjCode. For each value of ECode, there is exactly one value of Dept. For
example, for ECode E101, there is exactly one value of Dept, Systems. Hence,
Dept is functionally dependent on ECode. However, for a value of ProjCode,
there can be more than one value of Dept. For example, for ProjCode P27, there are
three values of Dept: Systems, Finance and Admin. Hence, Dept is not functionally
dependent on ProjCode. Dept is, therefore, functionally dependent on part of the key
(which is ECode) and not fully functionally dependent on the whole key
(ECode+ProjCode). Therefore, the table Project is not in 2NF. For the table to be in
2NF, the non-key attributes must be fully functionally dependent on the whole key
and not on part of the key.
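
The reasoning above can be reproduced mechanically. The sketch below is an illustration under assumptions: the helper names `determines` and `partial_dependencies` are invented for this example, and the check is run only on the six sample rows of the Project table, so it shows which dependencies the sample permits rather than proving them.

```python
from itertools import combinations


def determines(rows, determinant, dependent):
    """True if each determinant value maps to one dependent value in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        value = tuple(row[col] for col in dependent)
        if seen.setdefault(key, value) != value:
            return False
    return True


project = [
    {"ECode": "E101", "ProjCode": "P27", "Dept": "Systems", "Hours": 90},
    {"ECode": "E305", "ProjCode": "P27", "Dept": "Finance", "Hours": 10},
    {"ECode": "E508", "ProjCode": "P51", "Dept": "Admin", "Hours": None},
    {"ECode": "E101", "ProjCode": "P51", "Dept": "Systems", "Hours": 101},
    {"ECode": "E101", "ProjCode": "P20", "Dept": "Systems", "Hours": 60},
    {"ECode": "E508", "ProjCode": "P27", "Dept": "Admin", "Hours": 72},
]


def partial_dependencies(rows, key, non_key):
    """List (part_of_key, attribute) pairs where a proper subset of the
    composite key already determines a non-key attribute."""
    found = []
    for size in range(1, len(key)):
        for part in combinations(key, size):
            for attr in non_key:
                if determines(rows, list(part), [attr]):
                    found.append((part, attr))
    return found


violations = partial_dependencies(project, ["ECode", "ProjCode"], ["Dept", "Hours"])
print(violations)  # [(('ECode',), 'Dept')]
```

The single violation reported, ECode determining Dept, is the partial dependency that keeps the table out of 2NF.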

Guidelines for Converting a Table to 2 NF

* Find and remove attributes that are functionally dependent on only a part of the
key and not on the whole key. Place them in a different table.

* Group the remaining attributes.

To convert the table Project into 2NF, you must remove the attributes that are not
fully functionally dependent on the whole key and place them in a different table
along with the attribute that they are functionally dependent on. In the above example,
since Dept is not fully functionally dependent on the whole key ECode+ProjCode, you
place Dept along with ECode in a separate table called EmployeeDept.

Now the table Project will contain ECode, ProjCode and Hours.

EmployeeDept

ECode Dept
E101 Systems
E305 Finance
E508 Admin

Project

ECode ProjCode Hours


E101 P27 90
E101 P51 101
E101 P20 60
E305 P27 10
E508 P51 NULL
E508 P27 72
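
A minimal Python sketch of this decomposition follows. It assumes the rows of the original four-column Project table from this section; the helper name `project_columns` is hypothetical. The last step joins the two new tables on ECode to confirm that no information was lost by splitting.

```python
def project_columns(rows, columns):
    """Keep only `columns`, dropping duplicate rows in first-seen order."""
    result = []
    for row in rows:
        projected = {col: row[col] for col in columns}
        if projected not in result:
            result.append(projected)
    return result


project = [
    {"ECode": "E101", "ProjCode": "P27", "Dept": "Systems", "Hours": 90},
    {"ECode": "E305", "ProjCode": "P27", "Dept": "Finance", "Hours": 10},
    {"ECode": "E508", "ProjCode": "P51", "Dept": "Admin", "Hours": None},
    {"ECode": "E101", "ProjCode": "P51", "Dept": "Systems", "Hours": 101},
    {"ECode": "E101", "ProjCode": "P20", "Dept": "Systems", "Hours": 60},
    {"ECode": "E508", "ProjCode": "P27", "Dept": "Admin", "Hours": 72},
]

# Dept depends only on ECode, so it moves out together with ECode.
employee_dept = project_columns(project, ["ECode", "Dept"])
# The remaining attributes depend on the whole key ECode + ProjCode.
project_2nf = project_columns(project, ["ECode", "ProjCode", "Hours"])

# Joining on ECode rebuilds the original rows.
dept_of = {row["ECode"]: row["Dept"] for row in employee_dept}
rejoined = [{**row, "Dept": dept_of[row["ECode"]]} for row in project_2nf]

print(employee_dept)
print(rejoined == project)  # True
```

The department of each employee is now stored once, so a transfer is a single-row change in EmployeeDept.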

Third Normal Form (3 NF)

A relation is said to be in 3 NF when it is in 2 NF and every non-key attribute is
functionally dependent only on the primary key.

Consider the table Employee.

ECode Dept DeptHead


E101 Systems E901
E305 Finance E909
E402 Sales E906
E508 Admin E908
E607 Finance E909
E608 Finance E909

The problems with dependencies of this kind are:

* Insertion

The department head of a new department that does not have any employees at
present cannot be entered in the DeptHead column. This is because the primary key,
ECode, is unknown.

* Updating

For a given department, the code for a particular department head (DeptHead) is
repeated several times. Hence if a department head moves to another department,
the change will have to be made consistently across the table.

* Deletion

If the record of an employee is deleted, the information regarding the head of the
department will also be deleted. Hence there will be a loss of information.

You must check if the table is in 3NF. Since each cell in the table has a single
value, the table is in 1NF.

The primary key in the Employee table is ECode. For each value of ECode, there is
exactly one value of Dept. Hence, the attribute Dept is functionally dependent on the
primary key, ECode. Similarly, for each value of ECode, there is exactly one value of
DeptHead. Hence, DeptHead is functionally dependent on the primary key, ECode.
All the attributes are therefore functionally dependent on the whole key, ECode, and
the table is in 2NF.

However, the attribute DeptHead is also dependent on the attribute Dept. As per
3NF, all non-key attributes have to be functionally dependent only on the primary
key. This table is not in 3NF since DeptHead is functionally dependent on Dept, which
is not the primary key.
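
The chain ECode to Dept to DeptHead can be checked on sample rows with the sketch below. It is an illustration only: the helper `determines` is a hypothetical name, and the rows assume that every Finance employee has department head E909, in line with the Department table given later in this section.

```python
def determines(rows, determinant, dependent):
    """True if each determinant value maps to one dependent value in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        value = tuple(row[col] for col in dependent)
        if seen.setdefault(key, value) != value:
            return False
    return True


employee = [
    {"ECode": "E101", "Dept": "Systems", "DeptHead": "E901"},
    {"ECode": "E305", "Dept": "Finance", "DeptHead": "E909"},  # assumed E909
    {"ECode": "E402", "Dept": "Sales", "DeptHead": "E906"},
    {"ECode": "E508", "Dept": "Admin", "DeptHead": "E908"},
    {"ECode": "E607", "Dept": "Finance", "DeptHead": "E909"},
    {"ECode": "E608", "Dept": "Finance", "DeptHead": "E909"},
]

key_gives_dept = determines(employee, ["ECode"], ["Dept"])
dept_gives_head = determines(employee, ["Dept"], ["DeptHead"])
dept_is_key = determines(employee, ["Dept"], ["ECode"])

# A non-key attribute (Dept) that determines another non-key attribute
# (DeptHead) is the transitive dependency that 3NF rules out.
has_transitive_dependency = key_gives_dept and dept_gives_head and not dept_is_key
print(has_transitive_dependency)  # True
```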

Guidelines for Converting a Table to 3NF

* Find and remove non-key attributes that are functionally dependent on the
attributes that are not the primary key. Place them in a different table.
* Group the remaining attributes.

To convert the table Employee into 3NF, you must remove the column DeptHead,
since it is not functionally dependent only on the primary key ECode, and place it in
another table called Department along with the attribute it depends on, Dept.

Employee

Ecode Dept
E101 Systems
E305 Finance
E402 Sales
E508 Admin
E607 Finance
E608 Finance

Department

Dept DeptHead
Systems E901
Sales E906
Admin E908
Finance E909
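
The two tables could be declared in SQL as sketched below, using Python's built-in `sqlite3` module with an in-memory database. The column types, the foreign key from Employee.Dept to Department.Dept and the replacement head code E910 are assumptions made for the example; the notes themselves do not give a schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE Department (
        Dept     TEXT PRIMARY KEY,
        DeptHead TEXT NOT NULL
    );
    CREATE TABLE Employee (
        ECode TEXT PRIMARY KEY,
        Dept  TEXT NOT NULL REFERENCES Department(Dept)
    );
""")
conn.executemany("INSERT INTO Department VALUES (?, ?)", [
    ("Systems", "E901"), ("Sales", "E906"), ("Admin", "E908"), ("Finance", "E909"),
])
conn.executemany("INSERT INTO Employee VALUES (?, ?)", [
    ("E101", "Systems"), ("E305", "Finance"), ("E402", "Sales"),
    ("E508", "Admin"), ("E607", "Finance"), ("E608", "Finance"),
])

# Changing the head of Finance (E910 is a made-up code) touches one row only.
cursor = conn.execute(
    "UPDATE Department SET DeptHead = ? WHERE Dept = ?", ("E910", "Finance")
)
rows_changed = cursor.rowcount

# A join rebuilds the original three-column view when it is needed.
joined = conn.execute("""
    SELECT e.ECode, e.Dept, d.DeptHead
    FROM Employee AS e JOIN Department AS d ON d.Dept = e.Dept
    ORDER BY e.ECode
""").fetchall()
print(rows_changed, joined)
```

With this layout a department can also be inserted before it has any employees, which addresses the insertion problem described earlier.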

Boyce-Codd Normal Form

The original definition of 3NF was inadequate in some situations. It was not
satisfactory for tables:

* That had multiple candidate keys.
* Where the multiple candidate keys were composite.
* Where the multiple candidate keys overlapped (had at least one attribute in
common).

Hence, a new normal form, the Boyce-Codd normal form, was introduced. You must
understand that in tables where the above three conditions do not apply, you can stop
at the third normal form. In such cases, the third NF is the same as the Boyce-Codd
normal form.

A relation is in the Boyce-Codd normal form (BCNF) if and only if every determinant
is a candidate key.

Consider the table Project given below.

Project

ECode Name ProjCode Hours


E1 Veronica P2 48
E2 Anthony P5 100
E3 Mac P6 15
E4 Susan P3 250
E4 Susan P5 75
E1 Veronica P5 40

This table has redundancies. If the name of an employee is changed, the change
will have to be made in every row for that employee; otherwise there will be
inconsistencies.

ECode+ProjCode is the primary key. You will notice that Name+ProjCode could also
be chosen as the primary key and hence is a candidate key.

* Hours is functionally dependent on ECode+ProjCode.
* Hours is also functionally dependent on Name+ProjCode.
* Name is functionally dependent on ECode.
* ECode is functionally dependent on Name.

You will notice that this table has:

* Multiple candidate keys, that is, ECode+ProjCode and Name+ProjCode.
* Candidate keys that are composite.
* Candidate keys that overlap, since the attribute ProjCode is common.

This is a case for the Boyce-Codd normal form. The table is in third NF: the only
non-key item is Hours, which is dependent on the whole key, that is, ECode+ProjCode or
Name+ProjCode.

ECode and Name are determinants since they are functionally dependent on each
other. However, they are not candidate keys by themselves. As per BCNF, the
determinants have to be candidate keys. Hence, the table is not in BCNF.
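
The BCNF condition can be tested on the sample rows as sketched below. The helper names `determines` and `is_superkey` are assumptions for this example, and the check treats a determinant as acceptable when it identifies a whole row (a superkey), which is the usual way the rule "every determinant is a candidate key" is tested in practice.

```python
def determines(rows, determinant, dependent):
    """True if each determinant value maps to one dependent value in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        value = tuple(row[col] for col in dependent)
        if seen.setdefault(key, value) != value:
            return False
    return True


columns = ["ECode", "Name", "ProjCode", "Hours"]
project = [
    {"ECode": "E1", "Name": "Veronica", "ProjCode": "P2", "Hours": 48},
    {"ECode": "E2", "Name": "Anthony", "ProjCode": "P5", "Hours": 100},
    {"ECode": "E3", "Name": "Mac", "ProjCode": "P6", "Hours": 15},
    {"ECode": "E4", "Name": "Susan", "ProjCode": "P3", "Hours": 250},
    {"ECode": "E4", "Name": "Susan", "ProjCode": "P5", "Hours": 75},
    {"ECode": "E1", "Name": "Veronica", "ProjCode": "P5", "Hours": 40},
]


def is_superkey(rows, attrs):
    """attrs identifies a whole row if it determines every column."""
    return determines(rows, attrs, columns)


# The four dependencies listed in the text, as (determinant, dependent).
dependencies = [
    (["ECode", "ProjCode"], "Hours"),
    (["Name", "ProjCode"], "Hours"),
    (["ECode"], "Name"),
    (["Name"], "ECode"),
]

bcnf_violations = [
    (lhs, rhs)
    for lhs, rhs in dependencies
    if determines(project, lhs, [rhs]) and not is_superkey(project, lhs)
]
print(bcnf_violations)  # [(['ECode'], 'Name'), (['Name'], 'ECode')]
```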

Guidelines for Converting a Table to BCNF

* Find and remove the overlapping candidate keys. Place the part of the candidate
key and the attribute it is functionally dependent on in a different table.
* Group the remaining items into a table.

Hence, remove Name and ECode and place them in a different table. You will arrive
at the following tables.

Employee

ECode Name
E1 Veronica
E2 Anthony
E3 Mac
E4 Susan

Project

ECode ProjCode Hours


E1 P2 48
E2 P5 100
E3 P6 15
E4 P3 250
E4 P5 75
E1 P5 40
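
The split into Employee and Project can be sketched in Python as follows. The tuple layout and the replacement name "Suzanne" are assumptions made for the illustration; the rows are those of the four-column Project table at the start of this section.

```python
# Rows of the original table: (ECode, Name, ProjCode, Hours).
project = [
    ("E1", "Veronica", "P2", 48),
    ("E2", "Anthony", "P5", 100),
    ("E3", "Mac", "P6", 15),
    ("E4", "Susan", "P3", 250),
    ("E4", "Susan", "P5", 75),
    ("E1", "Veronica", "P5", 40),
]

# Employee: one row per distinct (ECode, Name) pair, so repeats collapse.
employee = sorted({(ecode, name) for ecode, name, _, _ in project})

# Project: what remains once Name is removed, keyed by ECode + ProjCode.
project_bcnf = [(ecode, proj, hours) for ecode, _, proj, hours in project]

# A name change is now a single update (hypothetical new name) ...
names = dict(employee)
names["E4"] = "Suzanne"
# ... and it is reflected in every row of the rebuilt four-column view.
view = [(ecode, names[ecode], proj, hours) for ecode, proj, hours in project_bcnf]

print(employee)
print(view)
```

Because each employee appears once in Employee, the redundancy described at the start of this section no longer arises.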
