Goals of Normalization
Let R be a relation scheme with a set F of functional dependencies.
Decide whether a relation scheme R is in good form.
If a relation scheme R is not in good form, decompose it into a set of relation schemes {R1, R2, ..., Rn} such that:
  Each relation scheme is in good form
  There is no loss of information
Preferably, the decomposition should be dependency preserving.
Eliminate redundancies caused by:
  Fields repeated within a file
  Fields not directly describing the key entity
  Fields derived from other fields
Avoid anomalies in updating (adding, editing, deleting)
Represent accurately the items being modeled
Simplify maintenance and retrieval of information
Functional-Dependency Theory
A Functional Dependency is a relationship between or among
attributes such that the values of one attribute depend on or are determined by the values of the other attribute(s).
A Partial Dependency is a relationship between attributes such that the
values of one attribute are dependent on, or determined by, the values of another attribute that is only part of the composite key.
Partial dependencies are undesirable because they cause duplication of data and
update anomalies.
Example
If we know an ISBN, then we know the Book Title and the author(s)
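To make the ISBN example concrete, the following minimal sketch tests whether a functional dependency holds in a small set of rows. The helper name fd_holds, the sample rows, and the copy_id attribute are all hypothetical and exist only for illustration; each book is also assumed to have a single author string.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if equal values on lhs always imply equal values on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first occurrence of a key records its value; any later
        # occurrence must agree, otherwise the dependency is violated.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data (not taken from the slides).
books = [
    {"isbn": "111", "title": "Databases", "author": "Lee", "copy_id": 1},
    {"isbn": "111", "title": "Databases", "author": "Lee", "copy_id": 2},
    {"isbn": "222", "title": "Networks", "author": "Kim", "copy_id": 1},
]

# ISBN -> (title, author) holds in this sample.
isbn_determines_book = fd_holds(books, ["isbn"], ["title", "author"])
# copy_id -> ISBN does not: copy 1 appears under two different ISBNs.
copy_determines_isbn = fd_holds(books, ["copy_id"], ["isbn"])
```

A check like this can only show that a dependency is violated by the data at hand; whether the dependency should hold in general is a design decision about the real world being modeled.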
Transitive Dependency
A transitive dependency is a relationship between attributes such that the values of one
attribute are dependent on, or determined by, the values of another attribute that is not a part of the key.
It exists when a nonkey attribute value is functionally dependent upon another nonkey attribute.
An employee data table that includes the hourly pay rate would
require searching every employee record to properly update an hourly rate for a particular job category.
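The hourly-rate example can be sketched with Python's built-in sqlite3 module. The table and column names below (employee_flat, employee, job, job_category, hourly_rate) and the sample rows are assumptions made for illustration, not part of the original slides. The sketch compares how many rows an update must touch before and after the rate is moved into its own table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Unnormalized: hourly_rate depends on job_category, a nonkey attribute.
    CREATE TABLE employee_flat (
        emp_id INTEGER PRIMARY KEY, job_category TEXT, hourly_rate REAL);
    INSERT INTO employee_flat VALUES
        (1, 'welder', 20.0), (2, 'welder', 20.0), (3, 'clerk', 15.0);

    -- Decomposed: the rate is stored once per job category.
    CREATE TABLE job (job_category TEXT PRIMARY KEY, hourly_rate REAL);
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        job_category TEXT REFERENCES job(job_category));
    INSERT INTO job VALUES ('welder', 20.0), ('clerk', 15.0);
    INSERT INTO employee VALUES (1, 'welder'), (2, 'welder'), (3, 'clerk');
""")

# Changing the welder rate in the flat table rewrites every welder row.
flat_rows_touched = conn.execute(
    "UPDATE employee_flat SET hourly_rate = 22.0 WHERE job_category = 'welder'"
).rowcount

# In the decomposed design the same change touches exactly one row.
split_rows_touched = conn.execute(
    "UPDATE job SET hourly_rate = 22.0 WHERE job_category = 'welder'"
).rowcount

# The per-employee rate is still available through a join.
rates = conn.execute(
    "SELECT e.emp_id, j.hourly_rate FROM employee e "
    "JOIN job j USING (job_category) ORDER BY e.emp_id"
).fetchall()
```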
Set of names, composite attributes
Identification numbers like CS101 that can be broken up into parts
Example: Set of accounts stored with each customer, and set of owners stored with each account
arrays)
Entries in a column (attribute, field) are of the same kind
Goal is the elimination of repeated groups of data by creating separate
Instructor Table
ID | LastName | FirstName | DeptName | Building | Salary
10101 | Sutton | Ronald | Computer Science | Johnson Hall | 65000
12121 | Banks | Myles | Finance | Lewis Building | 90000
15151 | Hill | Christopher | Physics | | 75000
22222 | Young | Aaron | Music | | 95000
32343 | Jones | Jonathan | Health | | 85000
45565 | Allen | Andrew | Computer Science | | 72000
58554 | Miller | Ayanna | Music | | 80000
In a table containing a list of three things (college courses, the lecturer in charge of each course, and the recommended book for each course), these three elements (course, lecturer and book) are independent of one another. Changing a course's recommended book, for instance, has no effect on the course itself.
Multivalued dependency: one attribute determines a set of values of another attribute, independently of the other attributes in the table. In this example, the course determines both the lecturer and the book, and those two facts are independent of each other.
4NF states that a table should not contain more than one such independent multivalued dependency. 4NF is rarely used outside of academic circles.
Under fourth normal form, a record type should not contain two or more independent multivalued facts about an entity. In addition, the record must satisfy third normal form.
Consider employees, skills, and languages, where an employee may have several skills and several languages. We have here two many-to-many relationships, one between employees and skills, and one between employees and languages. Under fourth normal form, these two relationships should not be represented in a single record such as
EMPLOYEE | SKILL | LANGUAGE
Instead, they should be represented in the two records
EMPLOYEE | SKILL
EMPLOYEE | LANGUAGE
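The employee, skill, and language case can be sketched in a few lines of Python. The names and sample values below are hypothetical. The sketch builds the single combined record type, projects it onto the two smaller record types, and checks that joining them on the employee gives back the original rows, so nothing is lost by the split.

```python
# Hypothetical data: each employee has independent sets of skills and languages.
skills = {"Ana": {"welding", "typing"}, "Bo": {"cooking"}}
languages = {"Ana": {"French", "German"}, "Bo": {"Greek"}}

# Single record type EMPLOYEE | SKILL | LANGUAGE: because the two facts are
# independent, every skill must be paired with every language of the employee.
combined = {
    (emp, skill, lang)
    for emp in skills
    for skill in skills[emp]
    for lang in languages[emp]
}

# Decomposition into EMPLOYEE | SKILL and EMPLOYEE | LANGUAGE (projections).
employee_skill = {(emp, skill) for emp, skill, _ in combined}
employee_language = {(emp, lang) for emp, _, lang in combined}

# Natural join of the two projections on the employee attribute.
rejoined = {
    (emp, skill, lang)
    for emp, skill in employee_skill
    for emp2, lang in employee_language
    if emp == emp2
}

lossless = rejoined == combined
```

With this sample, the combined form needs five rows (four for Ana alone), while the two projections hold three rows each and grow additively rather than multiplicatively as skills or languages are added.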
Design Goals
Enter the minimum data necessary
Avoid duplicate entry of information
Minimize risks to data integrity
lead to project-join normal form (PJNF) (also called fifth normal form)
R could have been generated when converting E-R diagram to a set of tables. R could have been a single relation containing all attributes that are of interest (called universal relation). Normalization breaks R into smaller relations. R could have been the result of some ad hoc design of relations, which we then test/convert to normal form.
correctly, the tables generated from the E-R diagram should not need further normalization.
However, in a real (imperfect) design, there can be functional
Example: an employee entity with attributes department_name and building, and a functional dependency department_name → building
Good design would have made department an entity
environment.
The conflicts among design efficiency, information requirements, and processing speed are often resolved through compromises that include denormalization.
May want to use a non-normalized schema for performance.
For example, displaying prereqs along with course_id and title requires a join of course with prereq.
Alternative 1: Use a denormalized relation containing the attributes of course as well as prereq, with all of the above attributes
  Faster lookup
  Extra space and extra execution time for updates
  Extra coding work for the programmer and the possibility of error in the extra code
Alternative 2: Use a materialized view defined as course ⋈ prereq
  Benefits and drawbacks are the same as above, except that there is no extra coding work for the programmer and possible errors are avoided
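The trade-off between the two alternatives can be sketched with Python's built-in sqlite3 module. The column names, sample rows, and the names course_prereq and course_prereq_view are assumptions for illustration. SQLite does not support materialized views, so an ordinary (non-materialized) view stands in for Alternative 2; it shows the "no extra code" property but not the faster lookup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE course (course_id TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE prereq (
        course_id TEXT, prereq_id TEXT, PRIMARY KEY (course_id, prereq_id));
    INSERT INTO course VALUES ('CS101', 'Intro'), ('CS201', 'Data Structures');
    INSERT INTO prereq VALUES ('CS201', 'CS101');

    -- Alternative 1 (sketch): a denormalized copy of the join.
    -- The application is responsible for keeping it in sync.
    CREATE TABLE course_prereq AS
        SELECT c.course_id, c.title, p.prereq_id
        FROM course c JOIN prereq p ON p.course_id = c.course_id;

    -- Alternative 2 (stand-in): an ordinary view over the same join.
    CREATE VIEW course_prereq_view AS
        SELECT c.course_id, c.title, p.prereq_id
        FROM course c JOIN prereq p ON p.course_id = c.course_id;
""")

# Update only the base table, as application code that forgot the copy would.
conn.execute(
    "UPDATE course SET title = 'Algorithms and Data Structures' "
    "WHERE course_id = 'CS201'"
)

denormalized = conn.execute("SELECT title FROM course_prereq").fetchall()
via_view = conn.execute("SELECT title FROM course_prereq_view").fetchall()
```

After the update, the denormalized copy still holds the old title while the view reflects the new one, which is the "possibility of error in extra code" that Alternative 1 carries.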
Above are in BCNF, but make querying across years difficult and need a new table each year.
Also in BCNF, but also makes querying across years difficult and requires a new attribute each year.
Is an example of a crosstab, where values for one attribute become column names.
Used in spreadsheets and in data analysis tools.
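As a rough illustration of a crosstab, the sketch below pivots rows of the form (company, year, amount) so that each year value becomes a column name. The data and attribute names are hypothetical and are not taken from the slides.

```python
# Hypothetical long-form rows: one row per (company, year).
rows = [("A", 2004, 10), ("A", 2005, 12), ("B", 2004, 7)]

# Every distinct year value becomes a column of the crosstab.
years = sorted({year for _, year, _ in rows})

crosstab = {}
for company, year, amount in rows:
    # Start each company with an empty cell per year, then fill in the amount.
    crosstab.setdefault(company, {y: None for y in years})[year] = amount

# Querying across years is a simple aggregate over the long form ...
total_long = {}
for company, _, amount in rows:
    total_long[company] = total_long.get(company, 0) + amount

# ... whereas the crosstab needs to know every year column by name.
total_wide = {
    company: sum(v for v in cells.values() if v is not None)
    for company, cells in crosstab.items()
}
```

Adding a row for a new year changes only the data in the long form, but adds a new column to the crosstab.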
End of Chapter
Figures 8.02, 8.03, 8.04, 8.05, 8.06, 8.14, 8.15, 8.17