
Module II

Normalization
Intro & 1NF
What is Normalization?
A database design may have some amount of
• Inconsistency
• Uncertainty
• Redundancy

To eliminate these drawbacks, some refinement has to be done on the database. This refinement process is called Normalization.

Normalization
• It is defined as a step-by-step process of decomposing a complex relation into simple and stable data structures.
• It is a formal process that can be followed to achieve a good database design.
• It is also used to check that an existing design is of good quality.
• The different stages of normalization are known as "Normal Forms".
• To accomplish normalization we need to understand the concept of Functional Dependencies.

Functional dependency
• In a given relation R, let X and Y be attributes. Attribute Y is functionally dependent on attribute X if each value of X determines EXACTLY ONE value of Y. This is written X -> Y (X can be composite in nature).
• We say "X determines Y" or "Y is functionally dependent on X".
• X -> Y does not imply Y -> X.
• If the value of the attribute "Marks" is known, then the value of the attribute "Grade" is determined, since Marks -> Grade.
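The definition above can be tested against sample data. The sketch below is only an illustration: the function name `holds` and the sample rows are assumptions, not part of the slides. Note that a sample can only refute a dependency. Whether X -> Y really holds is a rule about the data's meaning, not about any one table instance.

```python
def holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by the given rows.

    rows: list of dicts mapping attribute name -> value
    lhs, rhs: lists of attribute names (lhs may be composite)
    """
    seen = {}  # value of X -> the single value of Y seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # setdefault stores y the first time x appears, then returns the stored value
        if seen.setdefault(x, y) != y:
            return False  # the same X value appeared with two different Y values
    return True


# Hypothetical sample data
rows = [
    {"Marks": 85, "Grade": "A"},
    {"Marks": 70, "Grade": "B"},
    {"Marks": 72, "Grade": "B"},
    {"Marks": 85, "Grade": "A"},
]

print(holds(rows, ["Marks"], ["Grade"]))  # True: each Marks value has one Grade
print(holds(rows, ["Grade"], ["Marks"]))  # False: Grade B appears with 70 and 72
```

The second call shows why X -> Y does not imply Y -> X.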

Functional dependency
Types of functional dependencies:

• Full functional dependency
• Partial functional dependency
• Transitive dependency

Functional Dependencies
Consider the following relation:

REPORT (STUDENT#, COURSE#, StudentName, CourseName, Marks, Grade)

Description of the attributes:

• STUDENT# - Student Number
• COURSE# - Course Number
• StudentName - Student Name
• CourseName - Course Name
• Marks - Marks scored in Course COURSE# by Student STUDENT#
• Grade - Grade obtained by Student STUDENT# in Course COURSE#
Functional Dependencies - From the previous example
• For each value of (STUDENT#, COURSE#), the Marks obtained will be exactly one value. So we observe the following functional dependency:

STUDENT#, COURSE# -> Marks

• For each value of COURSE#, the name of the course will be exactly one value. So we observe the following functional dependency:

COURSE# -> CourseName

• For each value of Marks, the Grade will be exactly one value. So we observe the following functional dependency:

Marks -> Grade

Functional Dependencies - From the previous example

• For each value of STUDENT#, the name of the student will be exactly one value. So we observe the following functional dependency:

STUDENT# -> StudentName

Full dependencies
X and Y are attributes.
X functionally determines Y.
Note: no proper subset of X should functionally determine Y.

(STUDENT#, COURSE#) -> Marks

In the above example, Marks is fully functionally dependent on (STUDENT#, COURSE#) and not on any proper subset of it. This means Marks cannot be determined by either STUDENT# or COURSE# alone. It can be determined only by using STUDENT# and COURSE# together.
Full Functional Dependency
A functional dependency X -> Y is a full functional dependency if removing any attribute A from X means that the dependency no longer holds,
i.e. for every attribute A in X, (X − {A}) does not functionally determine Y.

Equivalently, X -> Y is a full FD if there is no functional dependency A -> Y such that A ⊂ X (i.e. A is a proper subset of X). Y is then said to be fully functionally dependent on X.
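As a minimal sketch of this definition, the check below tries every proper subset of X and reports a full dependency only if none of them determines Y. The helper names (`holds`, `is_full`) and the sample REPORT rows are assumptions made for illustration.

```python
from itertools import combinations


def holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by the given rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True


def is_full(rows, lhs, rhs):
    """Return True if lhs -> rhs holds and no proper subset of lhs determines rhs."""
    if not holds(rows, lhs, rhs):
        return False
    # Proper subsets have sizes 0 .. len(lhs) - 1 (size 0 is the empty set)
    for size in range(len(lhs)):
        for subset in combinations(lhs, size):
            if holds(rows, list(subset), rhs):
                return False
    return True


# Hypothetical rows of REPORT (only the attributes needed here)
report = [
    {"STUDENT#": "S1", "COURSE#": "C1", "CourseName": "DBMS", "Marks": 85},
    {"STUDENT#": "S1", "COURSE#": "C2", "CourseName": "OS", "Marks": 70},
    {"STUDENT#": "S2", "COURSE#": "C1", "CourseName": "DBMS", "Marks": 70},
    {"STUDENT#": "S2", "COURSE#": "C2", "CourseName": "OS", "Marks": 85},
]

key = ["STUDENT#", "COURSE#"]
print(is_full(report, key, ["Marks"]))       # True
print(is_full(report, key, ["CourseName"]))  # False: COURSE# alone is enough
```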
Partial dependencies
X and Y are attributes.
Attribute Y is partially dependent on attribute X if it is dependent on a proper subset of X.

Both of the following functional dependencies are valid in our example:

Student#, Course# -> CourseName
Course# -> CourseName

So we can say that CourseName is partially dependent on (Student#, Course#).

Partial Dependency
A functional dependency X -> Y is a partial functional dependency if some attribute A ∈ X can be removed and the dependency still holds,
i.e. for some attribute A ∈ X, (X − {A}) -> Y also holds.

Equivalently, X -> Y is a partial FD if there exists some functional dependency A -> Y such that A ⊂ X (i.e. A is a proper subset of X). Y is then said to be partially dependent on X.
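One way to make a partial dependency visible is to list the proper subsets of X that already determine Y. The sketch below does this on sample data. The function name `determining_subsets` and the rows are hypothetical, chosen to match the REPORT example.

```python
from itertools import combinations


def holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by the given rows."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False
    return True


def determining_subsets(rows, lhs, rhs):
    """List every proper subset A of lhs for which A -> rhs holds in rows.

    A non-empty result means lhs -> rhs is a partial dependency.
    """
    found = []
    for size in range(len(lhs)):  # sizes 0 .. len(lhs) - 1: proper subsets only
        for subset in combinations(lhs, size):
            if holds(rows, list(subset), rhs):
                found.append(subset)
    return found


# Hypothetical rows of REPORT
report = [
    {"STUDENT#": "S1", "COURSE#": "C1", "StudentName": "Ann", "CourseName": "DBMS", "Marks": 85},
    {"STUDENT#": "S1", "COURSE#": "C2", "StudentName": "Ann", "CourseName": "OS", "Marks": 70},
    {"STUDENT#": "S2", "COURSE#": "C1", "StudentName": "Bob", "CourseName": "DBMS", "Marks": 70},
    {"STUDENT#": "S2", "COURSE#": "C2", "StudentName": "Bob", "CourseName": "OS", "Marks": 85},
]

key = ["STUDENT#", "COURSE#"]
print(determining_subsets(report, key, ["CourseName"]))   # [('COURSE#',)]
print(determining_subsets(report, key, ["StudentName"]))  # [('STUDENT#',)]
print(determining_subsets(report, key, ["Marks"]))        # []
```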
Transitive dependencies
X, Y and Z are three attributes.
X -> Y
Y -> Z
=> X -> Z
Z is then said to be transitively dependent on X (through Y).
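The rule can be applied mechanically to a list of dependencies. The sketch below performs a single transitive step on the dependencies from the REPORT example. The representation (pairs of attribute tuples) and the function name `one_step_transitive` are assumptions made for illustration, and the function is not a full closure algorithm.

```python
def one_step_transitive(fds):
    """Return dependencies X -> Z obtained from pairs X -> Y and Y -> Z in fds.

    fds: list of (lhs, rhs) pairs, each side a tuple of attribute names.
    Only one application of the rule is made, and right-hand sides must
    match the next left-hand side exactly.
    """
    derived = []
    for x, y in fds:
        for y2, z in fds:
            if y == y2 and z != x:
                candidate = (x, z)
                if candidate not in fds and candidate not in derived:
                    derived.append(candidate)
    return derived


fds = [
    (("STUDENT#", "COURSE#"), ("Marks",)),
    (("Marks",), ("Grade",)),
]

print(one_step_transitive(fds))
# [(('STUDENT#', 'COURSE#'), ('Grade',))]
```

So Grade is transitively dependent on (STUDENT#, COURSE#) through Marks.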

Need for Normalization
Let's observe the Online Retail Application table.

Each row of the table represents the information of a customer who has purchased an item.

Need for Normalization
In this scenario:

• Can we insert the record of an item which has not been purchased by any customer?
  No. The table does not maintain a record of items; it keeps a record of purchases of items by customers.

• Can we delete the record of an item which has been purchased by only one customer?
  There will be information loss for that item.

• How many rows do we need to update if there is a change in the description of an item?
  That depends on the number of times the item has been purchased.

• How many times do we need to store the description of an item if the same item is purchased many times?
  That depends on the number of times the item has been purchased.

So we observe the following in the unnormalized table:
• Insert, Delete and Update anomalies
• Data duplication
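The questions above can be made concrete with a small sketch. The slides do not list the table's columns, so the column names (`customer`, `item`, `description`) and the rows below are purely hypothetical.

```python
# Hypothetical unnormalized purchase table: one row per purchase
purchases = [
    {"customer": "C1", "item": "I1", "description": "Pen"},
    {"customer": "C2", "item": "I1", "description": "Pen"},
    {"customer": "C3", "item": "I2", "description": "Notebook"},
]


def rows_to_update(rows, item):
    """Number of rows that must change if the item's description changes."""
    return sum(1 for r in rows if r["item"] == item)


def delete_purchase(rows, customer, item):
    """Return the table without the given customer's purchase of the item."""
    return [r for r in rows if not (r["customer"] == customer and r["item"] == item)]


def known_items(rows):
    """Items the table still has any information about."""
    return {r["item"] for r in rows}


# Update anomaly and duplication: the description of I1 is stored once per purchase
print(rows_to_update(purchases, "I1"))  # 2

# Delete anomaly: removing the only purchase of I2 loses the item itself
remaining = delete_purchase(purchases, "C3", "I2")
print("I2" in known_items(remaining))  # False
```

The insert anomaly follows from the same layout: a row needs a customer, so an item nobody has bought has nowhere to go.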
First Normal Form: 1NF
• A relation schema R is in 1NF if and only if all the attributes of R are atomic in nature.
• Atomic: the smallest level to which data may be broken down and remain meaningful.

Online Retail Application Tables – 1NF Normalized
Observation on the unnormalized Retail Application table:
Customerdetails, Itemdetails and PurchaseDetails are composite attributes.

This violates the 1NF definition.

To bring the table to 1NF, we need to make the columns atomic.

First Normal Form (1NF)
• Disallow multivalued attributes, composite attributes and their combinations.
• Every attribute holds a single atomic (or indivisible) value.

Consider the following relation:
Student(SSN, sName, address)

SSN   sName   address
109   Steve   NY, LA, SJ
232   Larry   LA, SJ
171   Bill    NY

The data in this table shows that Steve, with SSN 109, has his parental house in New York (NY), currently works in Los Angeles (LA), and also has a house in San Jose (SJ).
To represent all this information in the relation, address has to be a multivalued attribute.
First Normal Form (1NF)
To deal with these situations we have three solutions.
Solution 1: create a column for each value
Consider the following relation:
Student(SSN, sName, address)

SSN   sName   address
109   Steve   NY, LA, SJ
232   Larry   LA, SJ
171   Bill    NY

Suppose we are interested in storing only some of the information rather than all the values, e.g. only the permanent address and the correspondence address. Then our relation in 1NF becomes:
Student(SSN, sName, permanentAddress, correspondenceAddress)

SSN   sName   PermanentAddress   CorrespondenceAddress
109   Steve   NY                 SJ
232   Larry   LA                 SJ
171   Bill    NY                 NY

This is a good option if we are interested in only some values of a multivalued attribute.
Let's see in the next slide what we can do to keep all the information.
First Normal Form (1NF)
Solution 1 (continued)
Consider the following relation:
CandidateInfo(SSN, sName, Qualification)
Suppose we are interested in storing all the values. For example, we are storing the information of all candidates applying for a job interview, so we need all the information about their degrees, such as High School (HS), Intermediate (Inter), B.Tech, M.Tech and Ph.D.

SSN   sName    Qualification
178   Mark     HS, Inter, B.Tech
127   Sergey   HS, Inter
171   Bill     HS, Inter, B.Tech, M.Tech
786   Zuhaib   HS, Inter, B.Tech, M.Tech, Ph.D
First Normal Form (1NF)
Solution 1 (continued)
Consider the following relation:
CandidateInfo(SSN, sName, Qualification)

SSN   sName    Qualification
178   Mark     HS, Inter, B.Tech
127   Sergey   HS, Inter
171   Bill     HS, Inter, B.Tech, M.Tech
786   Zuhaib   HS, Inter, B.Tech, M.Tech, Ph.D

We will keep all the information in separate attributes. The relation in 1NF:

SSN   sName    HighSchool   Inter   Grad     PostGrad   Doctorate
178   Mark     HS           Inter   B.Tech   NULL       NULL
127   Sergey   HS           Inter   NULL     NULL       NULL
171   Bill     HS           Inter   B.Tech   M.Tech     NULL
786   Zuhaib   HS           Inter   B.Tech   M.Tech     Ph.D
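The conversion to one column per value can be sketched as follows. This illustration assumes that degrees are always listed from the lowest level upwards and that the five column names are fixed in advance. The function name `to_columns` is hypothetical.

```python
# Fixed list of columns, decided before any data is seen (an assumption of this sketch)
LEVELS = ["HighSchool", "Inter", "Grad", "PostGrad", "Doctorate"]


def to_columns(ssn, name, qualification):
    """Split a comma-separated Qualification value into one column per level.

    Missing levels are stored as None (NULL). A value with more entries
    than there are columns cannot be stored and raises ValueError.
    """
    degrees = [d.strip() for d in qualification.split(",")]
    if len(degrees) > len(LEVELS):
        raise ValueError("no column available for: " + ", ".join(degrees[len(LEVELS):]))
    row = {"SSN": ssn, "sName": name}
    for i, column in enumerate(LEVELS):
        row[column] = degrees[i] if i < len(degrees) else None
    return row


row = to_columns(127, "Sergey", "HS, Inter")
print(row["Inter"])      # Inter
print(row["Grad"])       # None
print(row["Doctorate"])  # None
```

The `ValueError` branch is where a sixth qualification would end up, since the column list cannot grow without changing the schema.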
First Normal Form (1NF)
Solution 1 (continued)
Consider the following relation:
CandidateInfo(SSN, sName, Qualification)
Issues (Disadvantages)
This solution has certain issues:
• Storage is wasted because NULL values are introduced.
• It requires a prior estimate of the number of attributes to introduce.
  For example, if a candidate had a PostDoc, there would be no attribute to store that information.
First Normal Form (1NF)
To deal with these situations we have three solutions.
Solution 2: create a row for each value
Consider the following relation:
Student(SSN, sName, skill)
Suppose we are storing students' information along with their language expertise.

SSN   sName   skill
109   Steve   C, C++, Java
232   Larry   Java, .NET
171   Bill    HTML5, PHP
143   Mark    Java

The relation in 1NF:

SSN   sName   skill
109   Steve   C
109   Steve   C++
109   Steve   Java
232   Larry   Java
232   Larry   .NET
171   Bill    HTML5
171   Bill    PHP
143   Mark    Java

Issues (Disadvantages)
This solution has certain issues:
• It adds redundancy to the relation.
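A minimal sketch of the row-per-value conversion, using the table from this slide. The function name `to_rows` is an assumption. Tuples stand in for rows of the relation.

```python
def to_rows(students):
    """Expand each (SSN, sName, skills) tuple into one tuple per skill."""
    rows = []
    for ssn, name, skills in students:
        for skill in skills.split(","):
            rows.append((ssn, name, skill.strip()))
    return rows


students = [
    (109, "Steve", "C, C++,Java"),
    (232, "Larry", "Java, .NET"),
    (171, "Bill", "HTML5, PHP"),
    (143, "Mark", "Java"),
]

rows = to_rows(students)
print(len(rows))  # 8
print(rows[:3])   # [(109, 'Steve', 'C'), (109, 'Steve', 'C++'), (109, 'Steve', 'Java')]
```

The redundancy shows up directly: the pair (109, 'Steve') is now stored three times, once per skill.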
First Normal Form (1NF)
To deal with these situations we have three solutions.
Solution 3: create a separate relation for the multivalued attribute
Consider the following relation:
Student(SSN, sName, email, city, skill)
Suppose we are storing students' information along with their language expertise.

SSN   sName   email        city     skill
109   Steve   steve@a.c    Boston   C, C++, Java
232   Larry   Larry@g.c    NY       Java, .NET
171   Bill    Bill@msn.c   LA       HTML5, PHP
143   Mark    Mark@fb.c    PA       Java
First Normal Form (1NF)
Solution 3: create a separate relation for the multivalued attribute
Consider the following relation:

SSN   sName   email        city     skill
109   Steve   steve@a.c    Boston   C, C++, Java
232   Larry   Larry@g.c    NY       Java, .NET
171   Bill    Bill@msn.c   LA       HTML5, PHP
143   Mark    Mark@fb.c    PA       Java

Decompose it into two relations:

SSN   sName   email        city
109   Steve   steve@a.c    Boston
232   Larry   Larry@g.c    NY
171   Bill    Bill@msn.c   LA
143   Mark    Mark@fb.c    PA

SSN   skill
109   C
109   C++
109   Java
232   Java
232   .NET
171   HTML5
171   PHP
143   Java
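The decomposition can be sketched as below, using the data from this slide. The function names `decompose` and `join` are hypothetical. The join is included only to show that the original information can be recovered through the shared SSN attribute.

```python
def decompose(students):
    """Split (SSN, sName, email, city, skills) tuples into two relations.

    Returns (student_rel, skill_rel): student_rel holds the single-valued
    attributes, skill_rel holds one (SSN, skill) pair per skill.
    """
    student_rel = []
    skill_rel = []
    for ssn, name, email, city, skills in students:
        student_rel.append((ssn, name, email, city))
        for skill in skills.split(","):
            skill_rel.append((ssn, skill.strip()))
    return student_rel, skill_rel


def join(student_rel, skill_rel):
    """Recombine the two relations on SSN, one output tuple per skill."""
    by_ssn = {student[0]: student for student in student_rel}
    return [by_ssn[ssn] + (skill,) for ssn, skill in skill_rel]


students = [
    (109, "Steve", "steve@a.c", "Boston", "C, C++,Java"),
    (232, "Larry", "Larry@g.c", "NY", "Java, .NET"),
    (171, "Bill", "Bill@msn.c", "LA", "HTML5, PHP"),
    (143, "Mark", "Mark@fb.c", "PA", "Java"),
]

student_rel, skill_rel = decompose(students)
print(len(student_rel), len(skill_rel))     # 4 8
print(join(student_rel, skill_rel)[0])      # (109, 'Steve', 'steve@a.c', 'Boston', 'C')
```

Unlike Solution 2, each student's name, email and city are stored only once. Only the SSN is repeated per skill.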
