Вы находитесь на странице: 1из 11

Method for determining candidate keys and highest normal form of a re...

1 of 11

http://www.rlvision.com/blog/method-for-determining-candidate-keys...

RLV Blog
Thoughts on technology and other
musings by an independent software
developer.

Method for determining candidate keys and highest normal form of a relation based on
functional dependencies
Posted on October 19, 2013 by Dan

For my advanced database systems course I needed to learn how to take a given relation and functional
dependencies, tell the highest normal form and then normalize it up to BCNF. Its actually not that hard, but there
are a lot of pitfalls to watch out for. Here Im going to show the methods I learned to solve the exam questions. I
need to give credit to the mentors and StinaQ for their help and tutorials!

Visualizing the dependencies


You start with a given relation and its dependencies. For example:

R(A,B,C)
A B
B {A,C}

The trick is to draw boxes in order to visualize the dependencies (ignore the right hand side CK/NP box for now):

Example 1

Note that B->{A,C} was broken down to B->A and B->C thanks to the decomposition rule.
Lets try a more tricky one:

R(A,B,C,D,E,F)
A B
B A
{B,C} D
C E

15-09-2015 17:15

Method for determining candidate keys and highest normal form of a re...

2 of 11

http://www.rlvision.com/blog/method-for-determining-candidate-keys...

Example 2

Note that B and C in this relation is a composite key that together determine D. Thus we draw a box around both
of them to illustrate this dependency. And dont forget F, even though nothing is pointing to it!

Candidate Keys & Non-Primes


Its time to determine the candidate keys (CK:s). Recall that a CK is one or more attributes that with which you
can determine all other attributes in the relation. (In the end, one of the CK:s will be chosen to be the primary
key.)
To find the candidate keys:
Begin by selecting a start box (this can be a composite key too). If there is a box at which no arrow is pointing,
this is usually a good place start. Follow the outwards arrows to the next box, and from this box to the next and
so forth. The goal is to find find a way that covers all boxes.
If you cant cover all boxes, try another starting box. If no single attribute (i.e. box) is able to cover all boxes by
itself, the CK needs to include one or more additional attributes.
There can be many candidate keys, so repeat this until you are sure you havent missed anything.
If no arrow points at a box, the rule is that it must be part of all CK:s. This includes stand-alone boxes like F in
example 2.
A candidate key is a minimal super key. Minimal means that you cant remove any attribute from the key and
still determine all other attributes. Note that you can still have keys of different sizes. For example, {A} and
{B,C} in Example 3 below can both be candidate keys. But {A,B,C} would not be minimal since you could
remove either {A} or {B,C} from it and still determine the relation.
Now that all CK:s are determined, also write down the non-primary attributes (NP). The Non-Primes are simply
all attributes that are not part of any CK.

Finding the highest normal form


Its time to determine the highest normal form of the relation. I will not use the exact formal definitions here, but
you should learn them too in order to understand whats happening here, especially if you need to motivate your
answer on an exam!
Is the relation in 1NF? Yes, always!
Is the relation in 2NF? Is there a composite CK? If not, 2NF is automatically achieved. Otherwise, check

15-09-2015 17:15

Method for determining candidate keys and highest normal form of a re...

3 of 11

http://www.rlvision.com/blog/method-for-determining-candidate-keys...

your boxes. In any of your composite CKs: does a part of it (i.e. a box inside the composite box) by itself
determine (point to) a non-prime attribute? If yes, NF2 does not pass (and the relation is thus in 1NF).
Is the relation in 3NF? Does the relation have any non-primes? If no, the relation is automatically in 3NF!
Otherwise: Is there a non-prime that determines (point to) another non-prime (resulting in a transitive
dependency because the last non-prime is determined indirectly)? If yes, 3NF does not pass (and the relation
is thus in 2NF).
Is the relation in BCNF? Are all the determinants also candidate keys? If yes, BCNF pass. This is easiest to
check by looking at the dependencies you were given (e.g. A->B and B->{A,C} in example 1) and check that
each key (left hand side of the arrow) is in your list of CKs.
Lets try this! Here is another example:

R(A,B,C,D,E)
A {B,C}
{B,C} A,D
D E

Example 3

First find the CK:s. We can easily see that {B,C} together determines A and D. Since D also determines E, starting
with {B,C} can determine all other attributes and is thus a CK. But wait! Since A determines {B,C} it must be a CK
on its own!
Now that we have the CK:s, we see that D and E are not part of any CK, so they are non-primary attributes.
All relations are already in 1NF, so we note this. How about 2NF? There is a composite key {B,C} so we need to
check if part of it can determine a non-prime on its own? Not a problem; this relation is in 2NF! 3NF say that a
non-prime must not determine another non-prime. Here is a problem, since D determines E. Thus 3NF does not
pass, and the relation only achieves 2NF!
Note the circle I draw to indicate the highest NF, e.g. 2NF in the above example. Since 3NF did not pass, the
highest NF is 2NF. If you dont do this, it is easy to make a mistake and pick the last one as the highest NF.

Tips
When you have finished drawing the boxes, do a final count of all attributes in R() and make sure they are all
present as boxes. Its very easy to forget the ones without any dependencies otherwise!
Remember that for example a 3NF relation is also in 2NF. If an exam question is is the relation in 2NF? and
you determine that the highest NF is 3NF, then the answer to the question is true. Dont mix up highest vs is

15-09-2015 17:15

Method for determining candidate keys and highest normal form of a re...

4 of 11

http://www.rlvision.com/blog/method-for-determining-candidate-keys...

in.
This page allows you to check that you have correctly determined the CKs and normal forms. Its a bit quirky to
use, but very convenient!
In the next post I will continue this tutorial and show how to normalize the relations to make sure they are in
BCNF!
Like

21

Tweet

12

This entry was posted in Tutorials and tagged databases, dependencies, education, normalization, rdbms. Bookmark the permalink.

45 Responses to Method for determining candidate keys and highest normal form of a relation
based on functional dependencies
Pingback: Method for normalizing a relation to BCNF based on functional dependencies | RLV Blog

Akshay says:
November 6, 2013 at 04:43

That was really helpfulThanks


Reply

Joe says:
November 23, 2013 at 18:57

If CK is minimal, why did you include {B, C} as a CK for example 3?


Reply

Dan says:
November 25, 2013 at 16:15

Both {A} and {B,C} are candidate keys in this example. Even if {A} looks smaller than {B,C}, {B,C} is still a minimal
CK, because if you were to remove either B or C from {B,C} it would no longer be able to determine the relation. Thus
it is minimal. {A,B,C} would not be a minimal key, because you could remove either A or B,C from it and it would still
determine the relation. Remember that we can have many CKs (but only one is chosen in the end as the primary key,
PK)
Reply

ram says:
January 9, 2014 at 06:26

very helpful..Thank you very much.


Reply

ram says:
January 9, 2014 at 06:41

when Highest Normal form is asked,do we need to check normal form with all keys and select the normal form (or) select the
Highest normal form formed by one of its keys???
For example, if
A -> BCDEF
BC ->ADEF
B ->F

15-09-2015 17:15

Method for determining candidate keys and highest normal form of a re...

5 of 11

http://www.rlvision.com/blog/method-for-determining-candidate-keys...

D ->E is given.
then keys =A,{B,C}.
NF achieved by A is 2NF
NF achieved by {B,C} is 1NF. so what is Highest Normal form??? 1NF or 2NF is my doubt..
and some where i saw asking what is BEST normal form .can you clarify this also .
Thanks in advance.
Reply

Dan says:
January 9, 2014 at 14:15

This page gives then answer:


Rule: 2NF: Each non-prime-attributes has to be fully functionally dependent from each candidate key.
But: The non-prime-attribute F is not fully functionally dependent from the candidate-key {B, C} because e.g. {B}
also identifies the attribute.
Thus it is in 1NF.
The key here is to divide into primes and non-primes.
Reply

ram says:
January 10, 2014 at 09:51

Thanks a lotnow I got it right.what about BEST normal form.are best and highest normal forms,both
same..i got doubt because finally,we select one of keys as primary keyso if BEST normal form is asked,we
need to choose a key that provides us highest normal form
Thanks in advance
Reply

Dan says:
January 10, 2014 at 13:42

Good question. Best is subjective, so if that is the question, I suggest to ask your teacher to clarify.
Theoretically highest is probably always best,but in some practical situations it is better to keep a
lower normal form because for example performance issues.
Reply

ram says:
January 10, 2014 at 19:04

okayThank you for spending time on my questions.i am learning a lot from your
blogsplease keep posting as many as possible.
Thank you,

Aniruddha says:
January 16, 2014 at 11:05

Very nice artcile and helpfull, keep writting


Reply

Shareena says:
January 26, 2014 at 17:40

Thankyou so much. This is very helpfull.


Can you please explain about redundancy in these examples.

15-09-2015 17:15

Method for determining candidate keys and highest normal form of a re...

6 of 11

http://www.rlvision.com/blog/method-for-determining-candidate-keys...

Reply

Dan says:
January 27, 2014 at 18:45

Sorry, but my understanding of redundancy in this context is limited. Perhaps some other reader can help?
Reply

Ksenia says:
February 13, 2014 at 11:06

Dan, thank you very much for such a great explanation! It is really hard to understand how to find CK in FD, but you made it
clear! super!
Reply

chiranjiv says:
February 26, 2014 at 13:48

Thanks Its very helpful


Reply

Harshit Mehta says:


March 7, 2014 at 17:31

Can you please give me a C code for this procedure!


Reply

Dan says:
March 7, 2014 at 17:48

Sorry, but I have no such code


Reply

Duc Cuong says:


April 6, 2014 at 20:57

Thank you very much for your detail article. Finally did I find something that is really useful and helps me understands
clearly about how to determine CK and NF from FDs.
Reply

prem says:
September 29, 2014 at 12:20

can you help me in finding the normal for relation :


R(a,b,c,d,e)
Functional Dependencies : A->B
BC->E
ED->A
Reply

amit says:
October 11, 2014 at 04:24

hi prem,

15-09-2015 17:15

Method for determining candidate keys and highest normal form of a re...

7 of 11

http://www.rlvision.com/blog/method-for-determining-candidate-keys...

Diagrammatic representation of the FDs is:


A -> B |
C |-> E |
D | -> A
How to find the CKs:
A uniquely cant determine all the attributes. It needs C and D. Therefore CK is { A, C, D }
Similarly, if we have B and C, we need D to determine all other attributes, so CK is { B, C, D }
Finally if we have E and D, we need C to determine all other attributes, so CK is { C, E, D }
So we have 3 CKs:
1. { A, C, D}
2. { B, C, D }
3. {C, E , D}
Non prime attributes are ones that are not part of any CK. Therefore they are C and D
Now to check all NFs:
i) Any given relation by definition of a relation is already in 1NF
ii) To find out 2NF:
Is there any composite Key? Ans: Yes
Then does any part of our composite keys independently determine any of the non-prime attributes?
Ans: C and D cannot be independently determined by any part of any CK
So 2NF condition is passed
iii. To find out 3NF:
Does the relation have any non-primes? Ans: Yes
Then, is there a non-prime that determines (point to) another non-prime?
Ans: Neither C determines (points to) to D nor vice versa.
So 3NF condition is passed
iv. To find out BCNF
In all given FDs, are all the determinants also candidate keys?
No, neither A nor BC nor ED are candidate keys. Therefore BCNF condition does not pass.
Highest NF is 3NF
Reply

smita says:
October 21, 2014 at 16:56

A->BC
B->D
CD->E
E->A
Its highest normal fom is 3NF or BCNF
Reply

Hareish says:
January 7, 2015 at 06:20

CK:-A,BC,CD,E
All attributes in a relation is a key or part of key..then blindly say it is in 3NF.
its time for check BCNF,
in BCNF every determinent is CK then check one by one.
A determinent is CK.
B determinent is not a CK is a part of a key.

15-09-2015 17:15

Method for determining candidate keys and highest normal form of a re...

8 of 11

http://www.rlvision.com/blog/method-for-determining-candidate-keys...

conclusion:-highest NF is 3NF
Reply

Shaurya says:
May 8, 2015 at 21:22

Dear Sir
You said that non-primes do not exist in the CKs but both C and D do exist in the CKs sir could you
please explain this querry to me.
Thanking You
Reply

Shaurya says:
May 8, 2015 at 21:23

sorry wrong comment , pardon me


Reply

Shaurya says:
May 8, 2015 at 21:24

Hey
You said that non-primes do not exist in the CKs but both C and D do exist in the CKs sir could you
please explain this querry to me.
Thanking You
Reply

Marvin P says:
November 2, 2014 at 08:39

Super.really helpfulthanks a lot


Reply

snopcoff says:
November 10, 2014 at 13:11

Its really helpful, thank you very much.


Reply

Diamond says:
November 14, 2014 at 04:34

What if you have a very complex set of FDs which you cannot put in diagram form? What do you do then? Example if you
have up to 8 FDs and only one of that 8 has one attribute on the LHS. All others have 2 or 3 attributes. Its going to be really
complex drawing the box diagram for such a situation. So what do you do?
Reply

Dan says:
November 15, 2014 at 08:44

If the drawings become too complex, then I think you need to really dig in to the theory behind functional
dependencies. There is probably some deep mathematical thinking behind it that will help you to a solution, but Im
not the right guy to ask.
Reply

15-09-2015 17:15

Method for determining candidate keys and highest normal form of a re...

9 of 11

http://www.rlvision.com/blog/method-for-determining-candidate-keys...

Pravin says:
December 24, 2014 at 18:45

Really its helpful


Reply

Deepak gautam says:


December 30, 2014 at 05:42

great and very helpful . thanks for this ..


Reply

Manasi says:
December 30, 2014 at 09:27

Its very helpful.


Thanks a lot !!
Reply

Hareish says:
January 7, 2015 at 06:05

my searching ends here..thanks a lot!!


if D->E in a relation R then E->D also possible?
Reply

Harish says:
January 7, 2015 at 06:28

F = {ABC, CB, BCD}


Find highest normal form?
i think it is in 3NF
Reply

Jay Sadhwani says:


January 20, 2015 at 11:59

Thanks for this article DAN Keep on posting


Reply

suhaib says:
January 24, 2015 at 17:43

what if in a f.d,collection of prime and non prime attribute determines non prime attribute.Which normal form is that?
Reply

rashmi says:
March 16, 2015 at 05:28

firstly i would like to thank this blog and Dan sir for a simpler explanation to identify CNs.
i have a dought in ur example 1:, if a->b and b->{a,c} then according to formal definition of 3NF a->b,b->c so a->c. hence
this relation should not be in 3NF . please help me out from this confusion. hoping a solution from this blog. thanking you in
advance.

15-09-2015 17:15

Method for determining candidate keys and highest normal form of a re...

10 of 11

http://www.rlvision.com/blog/method-for-determining-candidate-keys...

Reply

Vaughan says:
April 24, 2015 at 22:56

The definition strictly says that no non-prime attribute should determine another non-prime attribute.
In this case, we can see that is true.
Reply

arpit says:
March 20, 2015 at 12:25

gr8 work buddy (Y)


Reply

Vaughan says:
April 24, 2015 at 22:49

I think we have a bit of an error here. In Example 3, we list two different candidate keys. However, they are of different sizes.
The candidate key should be the minimally sized one.
Is the candidate key then just A?
Reply

Romit Nagariya says:


May 6, 2015 at 21:16

Thank you very much! Your post helped me to get through the exams.
Reply

Anonymous says:
May 8, 2015 at 20:59

R(a,b,c,d,e)
Functional Dependencies : A->B
BC->E
ED->A
How to find a non-prime
Reply

Ilja I says:
June 2, 2015 at 00:18

Hi, thanks for the article


but why in the second example {A,C,F} is a candidate key ?
can D be determined by {A,C,F} ?
Reply

SN says:
June 17, 2015 at 18:14

Firstly Thanks a lot for the really helpful article.

15-09-2015 17:15

Method for determining candidate keys and highest normal form of a re...

11 of 11

http://www.rlvision.com/blog/method-for-determining-candidate-keys...

Ive got 2 questions though:


1) In example 2, why is {ACF} a candidate key? A leads to B and C leads to E (and F is included in the Key), but between the 3
of them none independently leads to D and neither do any of them together. The only path to D was through {B,C}
collectively, so Im assuming C alone cant lead to D. Then either:
(i) {B,C,F} is the sole candidate and {ACF} is not valid as they cant lead to D or
(ii) The {ACF} CK should include B or D to solve the issue: as either {A,B,C,F} or {A,C,D,F}
Which leads to my second question:
2) When talking about minimal CKs, you only mentioned situations where ex: CKs are: A and {B,C} then {A,B,C} is not a CK
since you can drop either half and the CK still holds true.
But what about when {B,C,F} is a CK and A is not. Can {A,B,C,F} be a CK or is it not valid since you can drop A and still have
a valid CK even though vice versa isnt true?
[I really hope my questions make sense to you]
Thanks again!
Reply

Marvin P says:
July 6, 2015 at 08:01

Very Nice article.:)


Reply

RLV Blog
Proudly powered by WordPress.

15-09-2015 17:15

Вам также может понравиться