
NORMALIZATION is the process of designing a database so that redundancies are removed and the organization of the data becomes clearer.

In plain English, it means taking similar items out of a collection of data and placing them into tables of their own. Keep doing this recursively for each new table and you will have a normalized database. From the resulting database you should be able to recreate the data in its original state if the need arises. The important thing here is to know when to normalize and when to be practical. That will come with experience. For now, read on...

Normalization makes a design easier to modify later and leaves you better prepared when the database design has to change. It improves the efficiency of the database in terms of management, data storage and scalability.

Normalization of a database is achieved by following a set of rules called 'forms' when creating the database. There are five of these rules (with one extra inserted between the third and the fourth), and they are:
Database Normalization Rule 1: Eliminate Repeating Groups. Make a separate table for each set of related attributes, and give each table a primary key.

Unnormalized Data Items for Puppies
puppy number
puppy name
kennel code
kennel name
kennel location
trick ID
trick name
trick where learned
skill level

In the original list of data, each puppy description is followed by a list of tricks the puppy has learned. Some might know 10 tricks, some might not know any. To answer the question "Can Fifi roll over?" we first need to find Fifi's puppy record, then scan the list of tricks associated with the record. This is awkward, inefficient, and extremely untidy.

Moving the tricks into a separate table helps considerably. Separating the repeating groups of tricks from the puppy information results in first normal form. The puppy number in the trick table matches the primary key in the puppy table, providing a foreign key for relating the two tables with a join operation. Now we can answer our question with a direct retrieval: look to see if Fifi's puppy number and the trick ID for "roll over" appear together in the trick table.

First Normal Form
Puppy Table: puppy number, puppy name, kennel code, kennel name, kennel location
Trick Table: puppy number, trick ID, trick name, trick where learned, skill level

Database Normalization Rule 2: Eliminate Redundant Data. If an attribute depends on only part of a multi-valued (composite) key, remove it to a separate table.

The trick name (e.g. "roll over") appears redundantly for every puppy that knows it. Just the trick ID would do.

TRICK TABLE (primary key: Puppy Number + Trick ID)

Puppy Number   Trick ID   Trick Name   Where Learned   Skill Level
52             27         roll over    16              9
53             16         Nose Stand   9               5
54             27         roll over    9               9

*Note that the trick name depends on only a part (the trick ID) of the multi-valued, i.e. composite, key.

In the trick table, the primary key is made up of the puppy number and the trick ID. This makes sense for the where learned and skill level attributes, since they will be different for every puppy-trick combination. But the trick name depends only on the trick ID. The same name will appear redundantly every time its associated ID appears in the trick table.

Second Normal Form
Puppy Table: puppy number, puppy name, kennel code, kennel name, kennel location
Tricks Table: trick ID, trick name
Puppy-Tricks: puppy number, trick ID, trick where learned, skill level

Suppose you want to reclassify a trick, i.e. to give it a different trick ID. The change has to be made for every puppy that knows the trick. If you miss some of the changes, you will have several puppies with the same trick under different IDs. This is an update anomaly.

Database Normalization Rule 3: Eliminate Columns Not Dependent on Key. If attributes do not contribute to a description of the key, remove them to a separate table.

Puppy Table: puppy number, puppy name, kennel code, kennel name, kennel location

The puppy table satisfies the first normal form, since it contains no repeating groups. It satisfies the second normal form, since it does not have a multi-valued (composite) key. But the key is puppy number, and the kennel name and the kennel location describe only a kennel, not a puppy. To achieve the third normal form, they must be moved into a separate table. Since they describe a kennel, kennel code becomes the key of the new kennels table.

Third Normal Form
Puppies: puppy number, puppy name, kennel code
Kennels: kennel code, kennel name, kennel location
Tricks: trick ID, trick name
Puppy-Tricks: puppy number, trick ID, trick where learned, skill level

The motivation for this is the same as for the second normal form: we want to avoid update and delete anomalies. For example, suppose no puppies from the Puppy Farm were currently stored in the database. With the previous design, there would be no record of the kennel's existence.
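The third-normal-form design can be sketched as a runnable schema. What follows is only a minimal illustration using Python's built-in sqlite3 module, not a definitive implementation. The snake_case table and column names, the kennel names and locations other than the Puppy Farm, the puppy names, and the assignment of puppy number 52 to Fifi are all assumptions made for the example.

```python
import sqlite3

def build_puppy_db():
    """Create an in-memory database with the four third-normal-form tables."""
    conn = sqlite3.connect(":memory:")
    # REFERENCES documents the foreign keys; SQLite enforces them only
    # when PRAGMA foreign_keys = ON, which this sketch does not need.
    conn.executescript("""
        CREATE TABLE kennels (
            kennel_code     INTEGER PRIMARY KEY,
            kennel_name     TEXT NOT NULL,
            kennel_location TEXT
        );
        CREATE TABLE puppies (
            puppy_number INTEGER PRIMARY KEY,
            puppy_name   TEXT NOT NULL,
            kennel_code  INTEGER REFERENCES kennels (kennel_code)
        );
        CREATE TABLE tricks (
            trick_id   INTEGER PRIMARY KEY,
            trick_name TEXT NOT NULL
        );
        CREATE TABLE puppy_tricks (
            puppy_number  INTEGER REFERENCES puppies (puppy_number),
            trick_id      INTEGER REFERENCES tricks (trick_id),
            where_learned INTEGER,
            skill_level   INTEGER,
            PRIMARY KEY (puppy_number, trick_id)
        );
    """)
    # Hypothetical sample data. The Puppy Farm has no puppies yet,
    # but it still has a row of its own in the kennels table.
    conn.executemany("INSERT INTO kennels VALUES (?, ?, ?)",
                     [(1, "Happy Tails", "Springfield"),
                      (2, "Puppy Farm", "Shelbyville")])
    conn.executemany("INSERT INTO puppies VALUES (?, ?, ?)",
                     [(52, "Fifi", 1), (53, "Rex", 1), (54, "Spot", 1)])
    conn.executemany("INSERT INTO tricks VALUES (?, ?)",
                     [(27, "roll over"), (16, "nose stand")])
    conn.executemany("INSERT INTO puppy_tricks VALUES (?, ?, ?, ?)",
                     [(52, 27, 16, 9), (53, 16, 9, 5), (54, 27, 9, 9)])
    return conn

def knows_trick(conn, puppy_name, trick_name):
    """Answer questions such as 'Can Fifi roll over?' with a join."""
    row = conn.execute(
        """
        SELECT 1
        FROM puppies AS p
        JOIN puppy_tricks AS pt ON pt.puppy_number = p.puppy_number
        JOIN tricks AS t ON t.trick_id = pt.trick_id
        WHERE p.puppy_name = ? AND t.trick_name = ?
        LIMIT 1
        """,
        (puppy_name, trick_name),
    ).fetchone()
    return row is not None

conn = build_puppy_db()
print(knows_trick(conn, "Fifi", "roll over"))   # a (52, 27) row exists
print(knows_trick(conn, "Fifi", "nose stand"))  # no (52, 16) row exists
```

Because each trick name is stored once in the tricks table, renaming a trick in this sketch is a single-row update, and the Puppy Farm stays on record even though no puppy refers to it.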

Boyce-Codd Normal Form


The previous normal forms are considered elementary and should be applied to every table during the design process. Boyce-Codd normal form and the forms that follow it, however, are needed only for particular kinds of tables.

A table is considered to be in BCNF (Boyce-Codd Normal Form) if it is already in 3NF and every nontrivial functional dependency in it has a candidate key on its left side. That is, it doesn't contain any field or combination of fields, other than a candidate key, that can determine the value of another field. Let's take the following table:

Student   Subject   Teacher
Smith     Math      Dr. White
Smith     English   Dr. Brown
Jones     Math      Dr. White
Jones     English   Dr. Brown
Doe       Math      Dr. Green

Take into consideration the following conditions:

For each subject, every student is taught by one teacher.
Every teacher teaches one subject only.
Each subject can be taught by more than one teacher.

The primary key is the combination of Student and Subject. It's clear we have the following functional dependency:

Teacher -> Subject

and the left side of this dependency is not a candidate key of the table. So, to convert the table from 3NF to BCNF, we do these steps:

1. Identify, in the table, the field other than the primary key that forms the left side of the functional dependency (here, Teacher).
2. Delete the field on the right side of the functional dependency (here, Subject) from the main table.
3. Make a table for this dependency, with its key being the left side of the dependency, as follows:

Student   Teacher
Smith     Dr. White
Smith     Dr. Brown
Jones     Dr. White
Jones     Dr. Brown
Doe       Dr. Green

And

Teacher     Subject
Dr. White   Math
Dr. Brown   English
Dr. Green   Math
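The decomposition can be checked with a few lines of Python. This is a minimal sketch over the five sample rows only. A functional dependency is a rule about all valid data, so a check against sample rows can refute it but never prove it. The helper names fd_holds, decompose and rejoin are hypothetical, introduced for illustration.

```python
# Rows of the original (Student, Subject, Teacher) table from the text.
ROWS = [
    ("Smith", "Math", "Dr. White"),
    ("Smith", "English", "Dr. Brown"),
    ("Jones", "Math", "Dr. White"),
    ("Jones", "English", "Dr. Brown"),
    ("Doe", "Math", "Dr. Green"),
]
STUDENT, SUBJECT, TEACHER = 0, 1, 2  # column positions

def fd_holds(rows, lhs, rhs):
    """Return True if column `lhs` determines column `rhs` in these rows."""
    seen = {}
    for row in rows:
        # setdefault records the first rhs value seen for each lhs value
        # and returns the recorded value on every later encounter.
        if seen.setdefault(row[lhs], row[rhs]) != row[rhs]:
            return False
    return True

def decompose(rows):
    """Split the table into (Student, Teacher) and (Teacher, Subject)."""
    student_teacher = {(r[STUDENT], r[TEACHER]) for r in rows}
    teacher_subject = {(r[TEACHER], r[SUBJECT]) for r in rows}
    return student_teacher, teacher_subject

def rejoin(student_teacher, teacher_subject):
    """Natural join on Teacher, giving (Student, Subject, Teacher) rows."""
    return {
        (student, subject, teacher)
        for student, teacher in student_teacher
        for other_teacher, subject in teacher_subject
        if teacher == other_teacher
    }

student_teacher, teacher_subject = decompose(ROWS)
print(fd_holds(ROWS, TEACHER, SUBJECT))   # Teacher -> Subject on the sample
print(fd_holds(ROWS, SUBJECT, TEACHER))   # Subject -> Teacher on the sample
print(rejoin(student_teacher, teacher_subject) == set(ROWS))
```

Joining the two smaller tables on Teacher gives back exactly the original rows, so nothing is lost by the split, while each teacher's subject is now stored only once.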

Database Normalization Rule 4: Isolate Independent Multiple Relationships. No table may contain two or more 1:n (one-to-many) or n:m (many-to-many) relationships that are not directly related.

This applies only to designs that include one-to-many and many-to-many relationships. An example of a one-to-many relationship is that one kennel can hold many puppies. An example of a many-to-many relationship is that a puppy can know many tricks and several puppies can know the same trick.

Puppy Tricks and Costumes: puppy number, trick ID, trick where learned, skill level, costume

Suppose we want to add a new attribute, costume, to the puppy-tricks table. This way we could look for puppies that can both sit up and beg and wear a Groucho Marx mask, for example. The fourth normal form dictates against this (i.e. against using the puppy-tricks table, not against begging while wearing a Groucho mask). The two attributes do not share a meaningful relationship. A puppy may be able to wear a wet suit. This does not mean it can simultaneously sit up and beg. How will you represent this if you store both attributes in the same table?

Fourth Normal Form
Puppies: puppy number, puppy name, kennel code
Kennels: kennel code, kennel name, kennel location
Tricks: trick ID, trick name
Puppy-Tricks: puppy number, trick ID, trick where learned, skill level
Costumes: costume number, costume name
Puppy-Costumes: puppy number, costume number
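The cost of keeping tricks and costumes in one table can be illustrated with a short Python sketch. It assumes one particular way of using the combined table: every trick is paired with every costume, so that no row suggests a link between a specific trick and a specific costume. The puppy number, the function names, and the simplification of using names instead of trick IDs and costume numbers are assumptions made for the example.

```python
from itertools import product

PUPPY = 52  # hypothetical puppy number
TRICKS = ["roll over", "nose stand", "sit up and beg"]
COSTUMES = ["Groucho Marx mask", "wet suit"]

def combined_rows(puppy, tricks, costumes):
    """Single table: each trick is paired with each costume."""
    return {(puppy, trick, costume)
            for trick, costume in product(tricks, costumes)}

def separate_rows(puppy, tricks, costumes):
    """Fourth normal form: one table per independent relationship."""
    puppy_tricks = {(puppy, trick) for trick in tricks}
    puppy_costumes = {(puppy, costume) for costume in costumes}
    return puppy_tricks, puppy_costumes

def rejoin(puppy_tricks, puppy_costumes):
    """Join the two tables on puppy number to recover every pairing."""
    return {(puppy, trick, costume)
            for puppy, trick in puppy_tricks
            for other_puppy, costume in puppy_costumes
            if puppy == other_puppy}

combined = combined_rows(PUPPY, TRICKS, COSTUMES)
puppy_tricks, puppy_costumes = separate_rows(PUPPY, TRICKS, COSTUMES)

# 3 tricks x 2 costumes = 6 combined rows, versus 3 + 2 = 5 separate rows.
print(len(combined), len(puppy_tricks) + len(puppy_costumes))
print(rejoin(puppy_tricks, puppy_costumes) == combined)
```

In the separate tables, teaching the puppy a new trick adds one row. In the combined table under this assumption it adds one row per costume, and the gap widens as either list grows.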

THE REASON FOR DENORMALIZATION


Only one valid reason exists for denormalizing a relational design: to enhance performance. However, there are several indicators that help to identify systems and tables that are potential denormalization candidates. These are:

- Many critical queries and reports exist which rely upon data from more than one table. Often times these requests need to be processed in an online environment.
- Repeating groups exist which need to be processed in a group instead of individually.
- Many calculations need to be applied to one or many columns before queries can be successfully answered.
- Tables need to be accessed in different ways by different users during the same timeframe.
- Many large primary keys exist which are clumsy to query and consume a large amount of DASD when carried as foreign key columns in related tables.
- Certain columns are queried a large percentage of the time. Consider 60% or greater to be a cautionary number flagging denormalization as an option.

Be aware that each new RDBMS release usually brings enhanced performance and improved access options that may reduce the need for denormalization. However, most of the popular RDBMS products on occasion will require denormalized data structures. There are many different types of denormalized tables which can resolve the performance problems caused when accessing fully normalized data. The following topics will detail the different types and give advice on when to implement each of the denormalization types.
