
1) a) Define the “Apriori principle” and briefly discuss why it is useful in association rule mining.

Apriori Principle:

If an item set is frequent, then all of its subsets must also be frequent.
Equivalently, if an item set is infrequent, then all of its supersets must also be infrequent.

The Apriori principle reduces the number of candidate item sets in association rule mining: any candidate that contains an infrequent subset can be discarded without counting its support, so only candidates built from frequent item sets are checked against the database.
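The principle can be checked numerically. The sketch below borrows the five transactions from question 2 as sample data; the helper names `support_count` and `subsets_are_at_least_as_frequent` are illustrative assumptions, not part of any library.

```python
from itertools import combinations

# Sample data: the five market basket transactions from question 2.
transactions = [
    {"A", "B", "C"},
    {"A", "B", "C", "D", "E"},
    {"A", "C", "D"},
    {"A", "C", "D", "E"},
    {"A", "B", "C", "D"},
]

def support_count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if set(itemset) <= t)

def subsets_are_at_least_as_frequent(itemset, transactions):
    """True if every non-empty proper subset has a support count
    greater than or equal to that of the item set itself."""
    base = support_count(itemset, transactions)
    items = sorted(itemset)
    return all(
        support_count(sub, transactions) >= base
        for k in range(1, len(items))
        for sub in combinations(items, k)
    )

print(support_count({"A", "B", "C", "D"}, transactions))
print(subsets_are_at_least_as_frequent({"A", "B", "C", "D"}, transactions))
# An infrequent pair caps the support of all of its supersets.
print(support_count({"B", "E"}, transactions),
      support_count({"A", "B", "E"}, transactions))
```

Because adding an item to a set can only shrink the group of transactions that contain it, the support of a superset never exceeds the support of any of its subsets.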

b) Compare and contrast FP-Growth algorithm with Apriori algorithm.

Apriori Algorithm | FP-Growth Algorithm
Uses the Apriori property together with join and prune steps. | Builds conditional pattern bases and conditional FP-trees from the items in the database that satisfy minimum support.
Generates a large number of candidates, so it requires a large amount of memory. | Uses a compact tree structure and generates no candidates, so it requires less memory.
Scans the database multiple times to generate candidate sets. | Scans the database only twice.
Execution time is higher than FP-Growth, because time is spent producing candidates on every pass. | Execution time is lower than Apriori.

2) Consider the market basket transactions given in the following table. Let min_sup = 40% and min_conf = 40%.

a) Find all the frequent item sets using Apriori algorithm.

Minimum Support = 40%

Minimum Confidence = 40%

Transaction ID | Items Bought
T1             | A, B, C
T2             | A, B, C, D, E
T3             | A, C, D
T4             | A, C, D, E
T5             | A, B, C, D

C1 (candidate 1-item sets)

Item | Number of Transactions | Support
A    | 5                      | 5/5 = 100%
B    | 3                      | 3/5 = 60%
C    | 5                      | 5/5 = 100%
D    | 4                      | 4/5 = 80%
E    | 2                      | 2/5 = 40%

L1 (frequent 1-item sets; all five items meet min_sup = 40%)

Item | Number of Transactions
A    | 5
B    | 3
C    | 5
D    | 4
E    | 2
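The counting pass that produces C1 and L1 can be sketched as follows. This is a minimal illustration, assuming the transactions are held in memory as Python sets; `MIN_SUP_PERCENT` is a name chosen here, and the comparison uses integer arithmetic to avoid floating-point rounding at the 40% boundary.

```python
from collections import Counter

transactions = [
    {"A", "B", "C"},
    {"A", "B", "C", "D", "E"},
    {"A", "C", "D"},
    {"A", "C", "D", "E"},
    {"A", "B", "C", "D"},
]
MIN_SUP_PERCENT = 40  # min_sup from the question

# C1: count how many transactions contain each single item.
item_counts = Counter(item for t in transactions for item in t)
n = len(transactions)

# L1: keep the items whose support reaches the threshold.
L1 = {
    item: count
    for item, count in sorted(item_counts.items())
    if count * 100 >= MIN_SUP_PERCENT * n
}

for item, count in L1.items():
    print(f"{item}: {count}/{n} = {count * 100 // n}%")
```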

C2 (candidate 2-item sets)

Item Pair | Number of Transactions | Support
A,B       | 3                      | 3/5 = 60%
A,C       | 5                      | 5/5 = 100%
A,D       | 4                      | 4/5 = 80%
A,E       | 2                      | 2/5 = 40%
B,C       | 3                      | 3/5 = 60%
B,D       | 2                      | 2/5 = 40%
B,E       | 1                      | 1/5 = 20%
C,D       | 4                      | 4/5 = 80%
C,E       | 2                      | 2/5 = 40%
D,E       | 2                      | 2/5 = 40%

L2 (frequent 2-item sets; B,E is dropped because 20% < min_sup)

Item Pair | Number of Transactions
A,B       | 3
A,C       | 5
A,D       | 4
A,E       | 2
B,C       | 3
B,D       | 2
C,D       | 4
C,E       | 2
D,E       | 2

Join step (pairs in L2 that share the same first item are merged):

AB & AC => ABC
AB & AD => ABD
AB & AE => ABE
AC & AD => ACD
AC & AE => ACE
AD & AE => ADE
BC & BD => BCD
CD & CE => CDE
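The join listed above, followed by the prune step that the Apriori principle allows, can be sketched like this. The function names `join` and `prune` are illustrative assumptions, and item sets are represented as sorted tuples.

```python
from itertools import combinations

# Frequent 2-item sets, written as sorted tuples.
L2 = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("B", "C"),
      ("B", "D"), ("C", "D"), ("C", "E"), ("D", "E")]

def join(frequent_k):
    """Merge pairs of k-item sets that agree on their first k-1 items."""
    frequent_k = sorted(frequent_k)
    candidates = []
    for i, a in enumerate(frequent_k):
        for b in frequent_k[i + 1:]:
            if a[:-1] == b[:-1]:
                candidates.append(a + (b[-1],))
    return candidates

def prune(candidates, frequent_k):
    """Drop candidates that have at least one infrequent k-item subset."""
    frequent = set(frequent_k)
    return [
        c for c in candidates
        if all(sub in frequent for sub in combinations(c, len(c) - 1))
    ]

C3_joined = join(L2)
C3_pruned = prune(C3_joined, L2)
print(C3_joined)
print(C3_pruned)
```

With this data the join yields the eight candidates listed above, and the prune step removes A,B,E because its subset B,E is not in L2.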

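C3 (candidate 3-item sets)

Item Set | Number of Transactions | Support
A,B,C    | 3                      | 3/5 = 60%
A,B,D    | 2                      | 2/5 = 40%
A,B,E    | 1                      | 1/5 = 20%
A,C,D    | 4                      | 4/5 = 80%
A,C,E    | 2                      | 2/5 = 40%
A,D,E    | 2                      | 2/5 = 40%
B,C,D    | 2                      | 2/5 = 40%
C,D,E    | 2                      | 2/5 = 40%

Note: A,B,E contains the infrequent pair B,E, so the prune step can discard it before counting; its count is shown here for completeness.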

L3 (frequent 3-item sets)

Item Set | Number of Transactions
A,B,C    | 3
A,B,D    | 2
A,C,D    | 4
A,C,E    | 2
A,D,E    | 2
B,C,D    | 2
C,D,E    | 2
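
Join step (triples in L3 that share the same first two items are merged):

ABC & ABD => ABCD
ACD & ACE => ACDE

L4 (frequent 4-item sets)

Item Set | Number of Transactions | Support
A,B,C,D  | 2                      | 2/5 = 40%
A,C,D,E  | 2                      | 2/5 = 40%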

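{A, B, C, D} and {A, C, D, E} are the largest frequent item sets, each with a support of 2/5 = 40%. They do not share their first three items, so no 5-item candidate can be formed and the algorithm stops here. The complete answer is the union of L1, L2, L3, and these two 4-item sets.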
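The whole level-wise procedure can be condensed into a short function. This is a minimal sketch for small in-memory data, not an optimized implementation; the function name `apriori` and the `min_count` parameter (2 transactions, which is 40% of 5) are assumptions made for this example.

```python
from itertools import combinations

transactions = [
    {"A", "B", "C"},
    {"A", "B", "C", "D", "E"},
    {"A", "C", "D"},
    {"A", "C", "D", "E"},
    {"A", "B", "C", "D"},
]

def apriori(transactions, min_count):
    """Return {item set (sorted tuple): support count} for all frequent item sets."""
    def count(itemset):
        return sum(1 for t in transactions if set(itemset) <= t)

    items = sorted({item for t in transactions for item in t})
    level = [(i,) for i in items if count((i,)) >= min_count]
    frequent = {}
    while level:
        for itemset in level:
            frequent[itemset] = count(itemset)
        previous = set(level)
        # Join: merge sets that agree on everything but the last item.
        candidates = [
            a + (b[-1],)
            for i, a in enumerate(level)
            for b in level[i + 1:]
            if a[:-1] == b[:-1]
        ]
        # Prune: every subset one item smaller must itself be frequent.
        candidates = [
            c for c in candidates
            if all(s in previous for s in combinations(c, len(c) - 1))
        ]
        level = [c for c in candidates if count(c) >= min_count]
    return frequent

frequent = apriori(transactions, min_count=2)
for itemset, n in frequent.items():
    print(",".join(itemset), n)
```

On the table above this returns 23 frequent item sets: 5 single items, 9 pairs, 7 triples, and the two 4-item sets.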


b) Obtain significant decision rules.

Non-empty proper subsets of {A, B, C, D} (each one is the left-hand side of a candidate rule):
{A}, {B}, {C}, {D}
{A, B}, {A, C}, {A, D}, {B, C}, {B, D}, {C, D}
{A, B, C}, {A, B, D}, {A, C, D}, {B, C, D}

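The confidence of a rule X => Y is C = σ(X ∪ Y) / σ(X), where σ denotes the support count (the number of transactions that contain the item set).

{A} => {B, C, D}

C= σ{A, B, C, D}/ σ{A}
=2/5 = 40% Confidence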

{B} => {A, C, D}


C= σ{A, B, C, D}/ σ{B}
=2/3 = 66.67% Confidence

{C} => {A, B, D}


C= σ{A, B, C, D}/σ {C}
=2/5=40% Confidence

{D} => {A, B, C}


C=σ {A, B, C, D}/ σ{D}
=2/4=50% Confidence

{A, B} => {C, D}


C= σ{A, B, C, D}/ σ{A, B}
=2/3=66.67% Confidence

{A, C} => {B, D}


C= σ{A, B, C, D}/σ{A, C}
=2/5=40% Confidence

{A, D} => {B, C}


C= σ{A, B, C, D}/σ{A, D}
=2/4=50% Confidence

{B, C} => {A, D}


C=σ {A, B, C, D}/σ{B, C}
=2/3=66.67% Confidence

{B, D} => {A, C}

C= σ{A, B, C, D}/σ{B, D}
=2/2=100% Confidence

{C, D} => {A, B}


C= σ{A, B, C, D}/σ{C, D}
=2/4=50% Confidence

{A, B, C} => {D}


C= σ{A, B, C, D}/σ{A, B, C}
=2/3=66.67% Confidence

{A, C, D} => {B}


C= σ{A, B, C, D}/σ{A, C, D}
=2/4=50% Confidence

{A, B, D} => {C}


C= σ{A, B, C, D}/σ{A, B, D}
=2/2=100% Confidence

{B, C, D} => {A}


C=σ {A, B, C, D}/σ{B, C, D}
=2/2=100% Confidence

Non-empty proper subsets of {A, C, D, E} (each one is the left-hand side of a candidate rule):
{A}, {C}, {D}, {E}
{A, C}, {A, D}, {A, E}, {C, D}, {C, E}, {D, E}
{A, C, D}, {A, C, E}, {A, D, E}, {C, D, E}

{A} => {C, D, E}


C=σ{A, C , D, E}/σ{A}
=2/5=40% Confidence

{C} => {A, D, E}


C=σ{A, C, D, E}/σ{C}
=2/5=40% Confidence

{D} => {A, C, E}


C=σ{A, C, D, E}/σ{D}
=2/4=50% Confidence

{E} => {A, C, D}


C=σ{A, C, D, E}/σ{E}
=2/2=100% Confidence

{A, C} => {D, E}


C= σ{A, C, D, E}/σ{A, C}
=2/5=40% Confidence

{A, D} => {C, E}


C=σ{A, C, D, E}/σ{A, D}
=2/4=50% Confidence

{A, E} => {C, D}


C=σ{A, C, D, E}/σ{A, E}
=2/2=100% Confidence

{C, D} => {A, E}


C= σ{A, C, D, E}/ σ {C, D}
=2/4=50% Confidence

{C, E} => {A, D}


C= σ {A, C, D, E}/ σ {C, E}
=2/2=100% Confidence

{D, E} => {A, C}

C= σ {A, C, D, E}/ σ {D, E}
=2/2=100% Confidence

{A, C, D} => {E}


C= σ {A, C, D, E}/ σ {A, C, D}
=2/4=50% Confidence

{A, D, E} => {C}


C= σ {A, C, D, E}/ σ {A, D, E}
=2/2=100% Confidence

{A, C, E} => {D}


C= σ {A, C, D, E}/ σ {A, C, E}
=2/2=100% Confidence

{C, D, E} => {A}


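C= σ {A, C, D, E}/ σ {C, D, E}
=2/2=100% Confidence

All 28 rules derived from {A, B, C, D} and {A, C, D, E} have a confidence of at least 40%, so every one of them satisfies min_conf = 40%. The strongest rules are those with 100% confidence, such as {B, D} => {A, C} and {E} => {A, C, D}.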
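The rule enumeration above can be reproduced mechanically. The sketch below is one possible way to do it, assuming the transactions are stored as Python sets; `sigma` and `rules` are names invented for this example.

```python
from itertools import combinations

transactions = [
    {"A", "B", "C"},
    {"A", "B", "C", "D", "E"},
    {"A", "C", "D"},
    {"A", "C", "D", "E"},
    {"A", "B", "C", "D"},
]

def sigma(itemset):
    """Support count: number of transactions containing the item set."""
    return sum(1 for t in transactions if set(itemset) <= t)

def rules(itemset, min_conf):
    """List (lhs, rhs, confidence) for every rule lhs => rhs drawn from
    `itemset` whose confidence is at least `min_conf`."""
    items = sorted(itemset)
    total = sigma(items)
    found = []
    for k in range(1, len(items)):
        for lhs in combinations(items, k):
            rhs = tuple(i for i in items if i not in lhs)
            confidence = total / sigma(lhs)
            if confidence >= min_conf:
                found.append((lhs, rhs, confidence))
    return found

for lhs, rhs, confidence in rules("ABCD", 0.40) + rules("ACDE", 0.40):
    print(f"{set(lhs)} => {set(rhs)}: {confidence:.2%}")
```

Raising `min_conf` to 1.0 keeps only the 100% rules: three from {A, B, C, D} and seven from {A, C, D, E}.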

c) Derive the FP-Tree for the above transaction table.

Step 01

Support for each item:

A = 5/5 = 100%
B = 3/5 = 60%
C = 5/5 = 100%
D = 4/5 = 80%
E = 2/5 = 40%

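Items in each transaction, reordered by descending support (A, C, D, B, E; the tie between A and C is broken alphabetically):

Transaction ID | Items Bought (ordered)
T1             | A, C, B
T2             | A, C, D, B, E
T3             | A, C, D
T4             | A, C, D, E
T5             | A, C, D, B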
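The reordering in the table above can be sketched in a few lines. The alphabetical tie-break between A and C is an assumption that matches the order used in this answer; other implementations may break ties differently.

```python
from collections import Counter

transactions = {
    "T1": ["A", "B", "C"],
    "T2": ["A", "B", "C", "D", "E"],
    "T3": ["A", "C", "D"],
    "T4": ["A", "C", "D", "E"],
    "T5": ["A", "B", "C", "D"],
}

# Support count of each item across all transactions.
counts = Counter(item for items in transactions.values() for item in items)

# Descending support; ties are broken alphabetically (assumption).
order = sorted(counts, key=lambda item: (-counts[item], item))

# Rewrite every transaction in that global order.
ordered = {
    tid: sorted(items, key=order.index)
    for tid, items in transactions.items()
}

print(order)
for tid, items in ordered.items():
    print(tid, ",".join(items))
```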

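TID:1 =>

NULL
└── A:1
    └── C:1
        └── B:1

TID:2 =>

NULL
└── A:2
    └── C:2
        ├── B:1
        └── D:1
            └── B:1
                └── E:1

TID:3 =>

NULL
└── A:3
    └── C:3
        ├── B:1
        └── D:2
            └── B:1
                └── E:1

TID:4 =>

NULL
└── A:4
    └── C:4
        ├── B:1
        └── D:3
            ├── B:1
            │   └── E:1
            └── E:1

TID:5 =>

NULL
└── A:5
    └── C:5
        ├── B:1
        └── D:4
            ├── B:2
            │   └── E:1
            └── E:1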
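The insertion of the five ordered transactions can be sketched as a small prefix tree. This is a simplified illustration that omits the header table and node links a full FP-Growth implementation would maintain; the `Node` class and the `build_fp_tree` and `render` helpers are names assumed for this example.

```python
class Node:
    """One FP-tree node: an item, its count, and its children by item."""
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_fp_tree(ordered_transactions):
    """Insert each support-ordered transaction as a path from the root,
    incrementing counts along shared prefixes."""
    root = Node(None)
    for items in ordered_transactions:
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root

def render(node, depth=0):
    """Return the tree as indented text lines, one node per line."""
    label = "NULL" if node.item is None else f"{node.item}:{node.count}"
    lines = ["  " * depth + label]
    for child in node.children.values():
        lines.extend(render(child, depth + 1))
    return lines

ordered_transactions = [
    ["A", "C", "B"],
    ["A", "C", "D", "B", "E"],
    ["A", "C", "D"],
    ["A", "C", "D", "E"],
    ["A", "C", "D", "B"],
]
tree = build_fp_tree(ordered_transactions)
print("\n".join(render(tree)))
```

Printing the tree after all five insertions gives the same node counts as the final diagram for TID:5.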
