
Database Design

•Bottom-up Perspective
•Normalization
•1NF, 2NF, and 3NF

Copyright, 1996 © Dale Carnegie &


Database Design

❚ Top Down
❙ Identifying entities
❙ Identifying relationships
❙ Normalizing cardinalities
❘ E.g., M:N relationships are broken down into 1:M
and M:1 via a bridge entity
❚ Bottom Up
❘ Start off with existing forms and reports and
normalize to 3rd normal form (3NF)
Top Down Vs. Bottom Up
• Top Down
❙ Faster
❙ Futuristic

• Bottom-up
❙ Slower
❙ More accurate process
Which is better?
Top Down or Bottom-Up?

❚ Top down
❙ Can be faster than bottom up.
❙ Can be futuristic. Can project new
entities (and hence tables).
❚ Bottom-up
❙ Slower
❙ Entities (and hence tables) identified
from bottom up are based on existing
forms and reports.
❙ More accurate process.
Bottom Up Design
Normalization

❚ The following concepts are discussed
in this section:
❙ Why normalize a database
❙ How to normalize a database
❙ Denormalization
Normalization Defined
❚ It is the process of organizing data to
minimize duplication.
❚ Normalization usually involves
dividing a database into two or more
tables and defining relationships
between the tables.
Why Be Normal?

❚ A properly designed database can:
❙ Increase data integrity
❙ Simplify data maintenance
❙ Take less disk space
Why Normalize a
Database?

❚ Reduce Redundant data


❙ A normalized database reduces the amount of
redundant data stored in a database.
❚ Remove Inconsistent data
❙ Reduce the likelihood of inconsistent data.
❙ Reduce anomalies
❙ This leads to improved data integrity.
Data Update Anomalies

❚ With a normalized design you reduce the risk
of data modification anomalies.
❚ Three kinds of anomalies exist
❚ (1) Update Anomaly
❙ An update anomaly occurs when the same fact is stored
in more than one row, so every copy must be modified.
❙ Since you must modify the data more than once,
you run the risk of the data not being properly
modified throughout the system.
Insert and Delete
Anomalies

❚ (2) Insert Anomaly


❙ You can’t insert data due to missing
information (especially a missing key!)
❚ (3) Delete Anomaly
❙ You can't delete data without deleting
other essential data (i.e., data that you
don't want to delete).
Example: Insert Anomaly

Student#   StudentName   Course#   CourseName
U100       joe           cs100     C++
U100       joe           cs101     Java

❚ Primary Key is Student#, Course# (jointly)


❚ You cannot create/insert a new course unless at least one
student is enrolled in it, because Student# is part of the key.
❙ This is not desirable in any college or corporate training
environment.
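
The insert anomaly can be reproduced with a small sketch using Python's built-in sqlite3 module. The table name `enrollment`, the column names, and the new course `cs102` / `Python` are assumptions made for illustration; only the two U100 rows come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL is spelled out because SQLite does not imply it for the
# columns of a composite PRIMARY KEY.
conn.execute("""
    CREATE TABLE enrollment (
        student_no   TEXT NOT NULL,
        student_name TEXT,
        course_no    TEXT NOT NULL,
        course_name  TEXT,
        PRIMARY KEY (student_no, course_no)
    )
""")
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?, ?, ?)",
    [("U100", "joe", "cs100", "C++"), ("U100", "joe", "cs101", "Java")],
)

# Hypothetical new course with no student yet: half of the key is missing.
try:
    conn.execute(
        "INSERT INTO enrollment (course_no, course_name) VALUES (?, ?)",
        ("cs102", "Python"),
    )
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
print(insert_rejected, row_count)
```

The course cannot be recorded until a student enrolls, which is exactly the missing-key situation described above.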
Example: Delete Anomaly

Student#   StudentName   Course#   CourseName
U100       joe           cs100     C++
U100       joe           cs101     Java

❚ Primary Key is Student#, Course# (jointly)


❚ If a course has only one student, deleting that
student will also delete the course.
❙ This is not desirable in any college or corporate training
environment.
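
A minimal sketch of the delete anomaly, using plain Python tuples in place of a database table. The helper name `known_courses` is an assumption for illustration; the rows are the ones shown on the slide.

```python
# Each row is (Student#, StudentName, Course#, CourseName).
rows = [
    ("U100", "joe", "cs100", "C++"),
    ("U100", "joe", "cs101", "Java"),
]

def known_courses(table):
    """Return every distinct (Course#, CourseName) the table still records."""
    return {(course_no, course_name) for _, _, course_no, course_name in table}

before = known_courses(rows)

# Student U100 leaves: remove every row that belongs to that student.
rows = [row for row in rows if row[0] != "U100"]

after = known_courses(rows)
print(before)
print(after)
```

Because U100 was the only student in both courses, removing the student also removes all record that the courses exist.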
Example: Update Anomaly

Student#   StudentName   Course#   CourseName
U100       joe           cs100     C++
U100       joe           cs101     Java

❚ Primary Key is Student#, Course# (jointly)


❚ Assume “joe” needs to be updated to “Joe” (lower
case “j” to proper case “J”)
❚ We will have to locate and update the value “joe”
more than once.
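
The update anomaly can be sketched the same way. The dictionary keys below are assumed names for the four columns; the point is only that a partial update leaves two spellings of the same student's name.

```python
rows = [
    {"student_no": "U100", "student_name": "joe", "course_no": "cs100", "course_name": "C++"},
    {"student_no": "U100", "student_name": "joe", "course_no": "cs101", "course_name": "Java"},
]

def names_for(table, student_no):
    """Collect every spelling of the name stored for one student."""
    return {row["student_name"] for row in table if row["student_no"] == student_no}

# Incomplete update: only the first copy of the name is changed.
rows[0]["student_name"] = "Joe"
names_after_partial_update = names_for(rows, "U100")

# Complete update: every row for the student must be located and changed.
for row in rows:
    if row["student_no"] == "U100":
        row["student_name"] = "Joe"
names_after_full_update = names_for(rows, "U100")

print(names_after_partial_update)
print(names_after_full_update)
```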
How Do We Normalize?
❚ Goal:
❚ Create new tables.
❚ In each table all non-key
attributes should be dependent on
the primary key and nothing but
the primary key.
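
One way to see which attributes depend on which is to test candidate dependencies against sample rows. The sketch below is an illustration only: the function name `holds` and the column names are assumptions, and sample data can refute a dependency but never prove that it holds in general.

```python
def holds(rows, determinant, dependent):
    """Return True if each value of `determinant` maps to one value of `dependent`."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in determinant)
        value = tuple(row[col] for col in dependent)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True

rows = [
    {"student_no": "U100", "student_name": "joe", "course_no": "cs100", "course_name": "C++"},
    {"student_no": "U100", "student_name": "joe", "course_no": "cs101", "course_name": "Java"},
]

# StudentName depends on Student# alone, i.e. on only part of the key.
name_on_student = holds(rows, ["student_no"], ["student_name"])
# CourseName depends on Course# alone, again only part of the key.
course_name_on_course = holds(rows, ["course_no"], ["course_name"])
# CourseName does not depend on Student#: U100 maps to both C++ and Java.
course_name_on_student = holds(rows, ["student_no"], ["course_name"])

print(name_on_student, course_name_on_course, course_name_on_student)
```

Attributes that depend on only part of the key, as both names do here, are the ones that move to new tables.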
How Do We Normalize?

Normalization

[Figure: attributes A B C D E F, annotated with full, partial, and transitive dependencies]
How Do We Normalize?

3NF

[Figure: attributes A B C D E F, annotated with full, partial, and transitive dependencies]
How Do We Normalize?

3NF

[Figure: attributes A B D E, with the full dependency only]

No transitive dependency or partial dependency!


How Do We Normalize?

Normalization Steps
❚ Your table is in unnormalized form (UNF) if it has repeating
groups of data. Remove them!
❙ The table is now in 1NF.
❚ Remove partial dependencies.
❙ E.g., move B, C to a new table (duplicate B, which is its key).
❙ What remains is ABDEF.
❙ The table is now in 2NF.
❚ Remove transitive dependencies.
❙ E.g., move E, F to a new table (duplicate E, which becomes its key).
❙ The tables are now in 3NF.
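
The two decomposition steps can be traced on the attribute names alone. This is a sketch under the assumption that C depends only on B and F depends only on E, as the examples in the steps suggest; the helper name `split_off` is hypothetical.

```python
def split_off(attributes, determinant, dependents):
    """Move `dependents` to a new table keyed by `determinant`.

    The determinant is duplicated: it stays in the original table as the
    link and becomes the key of the new table.
    """
    new_table = [determinant] + dependents
    remaining = [attr for attr in attributes if attr not in dependents]
    return remaining, new_table

relation = ["A", "B", "C", "D", "E", "F"]

# 1NF to 2NF: remove the partial dependency of C on B.
after_2nf, table_bc = split_off(relation, "B", ["C"])

# 2NF to 3NF: remove the transitive dependency of F on E.
after_3nf, table_ef = split_off(after_2nf, "E", ["F"])

print(after_2nf, table_bc)
print(after_3nf, table_ef)
```

The intermediate result is ABDEF and the final main table is ABDE, with BC and EF as the two tables split off along the way.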
Case Study: Normalization
STUDENT TRANSCRIPT FORM                               11/2/98
Student #: 999-99-9999    Name: Richie Rich
Address: 6 Chicory Rd, Andover, MA 01886

Course#   Course Description   Instructor#   Name    Grade
CS 601    Database Systems     101           King    A
PM 700    Project Mgmt.        210           Smith   B

GPA 3.5
Case Study

Case Study: Normalization

❚ List all attributes first
❚ Student#, Name, Address, (Course#,
Course_Description, Instructor#, Name,
Grade)
❚ There are repeating groups!
❚ This table is in UNF

Case Study: Normalization

Student#, Name, Address, (Course#, Course_Description, Instructor#, Name, Grade)

Split into two tables:

1. Student#, Name, Address

2. Course#, Course_Description, Instructor#, Name, Grade

You must introduce a Student# in table 2,
as the grade belongs to a student.

Case Study: Normalization

Student#, Name, Address, (Course#, Course_Description, Instructor#, Name, Grade)

Student#, Name, Address

Course#, Student#, Course_Description, Instructor#, Name, Grade
(revised table)

Case Study: Normalization

Course#, Student#, Course_Description, Instructor#, Name, Grade

Split again (tables 3 and 4):

Course#, Student#, Instructor#, Name, Grade

Course#, Course_Description

Case Study: Normalization

Course#, Student#, Instructor#, InstructorName, Grade

InstructorName depends primarily on Instructor#,
so it moves to its own table.

Course#, Student#, Grade

Instructor#, InstructorName

Case Study: Normalization


Are all the tables linked via a common FK?

Course#, Student#, Grade

Instructor#, Name

Student#, Name, Address

Course#, Course_Description



Case Study: Normalization

Course#, Student#, Grade

Instructor#, Name (not linked)

Student#, Name, Address

Course#, Course_Description



Question
❚ How do you link Instructor?
❚ Hint:
❙ The link should be on the side of the Many.
❙ That is, the Course table should have
Instructor# as the foreign key.
❚ Why not the other way around?
❙ That is, why not attach Course# to the
Instructor table?
❙ Think about the drawback of this option.

Case Study: Normalization

Course#, Student#, Grade

Instructor#, Name

Student#, Name, Address

Course#, Course_Description, Instructor#
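
The four tables can be expressed as a schema and joined back into the original transcript. This is a sketch using Python's sqlite3 module, not a prescribed implementation: the table names (`student`, `instructor`, `course`, `enrollment`) and the snake_case column names are assumptions, while the data comes from the transcript form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
    CREATE TABLE student (
        student_no TEXT PRIMARY KEY,
        name       TEXT,
        address    TEXT
    );
    CREATE TABLE instructor (
        instructor_no INTEGER PRIMARY KEY,
        name          TEXT
    );
    -- The foreign key sits on the "many" side: one instructor, many courses.
    CREATE TABLE course (
        course_no          TEXT PRIMARY KEY,
        course_description TEXT,
        instructor_no      INTEGER REFERENCES instructor (instructor_no)
    );
    CREATE TABLE enrollment (
        course_no  TEXT NOT NULL REFERENCES course (course_no),
        student_no TEXT NOT NULL REFERENCES student (student_no),
        grade      TEXT,
        PRIMARY KEY (course_no, student_no)
    );
""")

conn.execute("INSERT INTO student VALUES (?, ?, ?)",
             ("999-99-9999", "Richie Rich", "6 Chicory Rd, Andover, MA 01886"))
conn.executemany("INSERT INTO instructor VALUES (?, ?)",
                 [(101, "King"), (210, "Smith")])
conn.executemany("INSERT INTO course VALUES (?, ?, ?)",
                 [("CS 601", "Database Systems", 101),
                  ("PM 700", "Project Mgmt.", 210)])
conn.executemany("INSERT INTO enrollment VALUES (?, ?, ?)",
                 [("CS 601", "999-99-9999", "A"),
                  ("PM 700", "999-99-9999", "B")])

# Rebuild the transcript lines for one student by following the foreign keys.
transcript = conn.execute("""
    SELECT c.course_no, c.course_description, i.instructor_no, i.name, e.grade
    FROM enrollment AS e
    JOIN course AS c ON c.course_no = e.course_no
    JOIN instructor AS i ON i.instructor_no = c.instructor_no
    WHERE e.student_no = ?
    ORDER BY c.course_no
""", ("999-99-9999",)).fetchall()

for line in transcript:
    print(line)
```

Every table is reachable through a foreign key, so the form can be reproduced without storing any fact twice.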


Other Issues: Ignore Date
& GPA

❚ Ignore date and GPA when
normalizing.
❚ GPA is a calculated field.
❙ No reason to store GPA in the database
❚ Date is the report date.
❙ Not significant information.
❙ No need to store date in the database.
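
Since GPA is derived from the stored grades, it can be computed on demand. The sketch below assumes a 4-point scale (A = 4, B = 3, and so on) with every course weighted equally; neither assumption is stated in the source, though together they reproduce the 3.5 shown on the transcript form.

```python
# Assumed 4-point grading scale; not given in the source.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}

def gpa(grades):
    """Average the grade points, treating every course as equally weighted."""
    points = [GRADE_POINTS[grade] for grade in grades]
    return sum(points) / len(points)

transcript_grades = ["A", "B"]  # CS 601 and PM 700 from the transcript form
computed_gpa = gpa(transcript_grades)
print(computed_gpa)
```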
Other Issues: Integrate

❚ Student (Student#, Name)


❚ Student (Student#, Major)
❚ Should be integrated into one table
(assuming the college only allows
one declared major)
❚ Student(Student#, Name, Major)
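
Integration amounts to merging the two tables on their shared key. In this sketch the dictionaries stand in for the two Student tables, the function name `integrate` is an assumption, and the major value is hypothetical because the source does not give one.

```python
# Student(Student#, Name) and Student(Student#, Major), keyed by Student#.
student_names = {"999-99-9999": "Richie Rich"}
student_majors = {"999-99-9999": "Computer Science"}  # hypothetical value

def integrate(names, majors):
    """Merge two tables that share the same primary key into one."""
    merged = {}
    for student_no in names.keys() | majors.keys():
        merged[student_no] = {
            "name": names.get(student_no),    # None if only a major is known
            "major": majors.get(student_no),  # None if no major is declared
        }
    return merged

students = integrate(student_names, student_majors)
print(students)
```

This works as a single table only under the stated assumption of one declared major per student.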
Other Issues: 2NF may be OK
❚ 2NF is good enough
❙ E.g., address
❙ Most organizations have a similar table

A B C D E F

a = ID (PK)
b = Employee Name
c = street
d = city
e = state
f = zip

Is this in 3NF?
Exercise B: Normalize This Invoice

E-COMMERCE INC.                                       11/2/98
Customer ID: 1011    Name: Bart Simpson
Address: 101 United Way, Babson Park, MA 01586

ITEM#   Description    Qty   UnitPrice   Total
A-100   Pentium 1000   2     1000        2000
A-200   Pentium 2000   1     1500        1500

Total 3500
Summary
❚ Un-normalized: Contains repeating
groups
❙ Resolution: Remove them!

❚ First Normal Form: Contains no
repeating groups but has Update,
Delete, Insert anomalies.
❙ Resolution: Remove partial
dependencies
Summary

❚ Second Normal Form: Partial functional
dependencies are removed, but it still has
Update, Delete, Insert anomalies.
❙ Resolution: Remove transitive
dependencies to bring it to 3NF
Summary

❚ Third Normal Form:


❙ Partial functional dependencies are
removed.
❙ Transitive dependencies removed.
❙ All attributes are dependent on PK
and nothing but PK!
❙ Tables are small and well-formed.
Synchronize the Top Down
and Bottom Up Models
