
Association Rule Mining Using Apriori and FP-Growth

By
P.Kalki Prasad – PGDBT201914
T.Lakshmi Prasanna – PGDBT2901918
Association Rule Mining
 Has roots in analysis of point-of-sale transactions
◦ Determine what products are purchased together or likely to be
purchased by the same person
 Common applications
◦ Cross-sell: make the purchasers of one product the targets for another
◦ Up-sell: target customers likely to upgrade their product or service
 In general, whenever customers do several things in close proximity, there is a potential application
Our Motivation and Objective

 Generate association rules from the product combinations given to us
 This objective constitutes our Business Understanding
CRISP-DM Reference Model
Data Understanding & Preparation
 Data Understanding: We need to generate rules from the available data set so that products can be marketed together, so this is an association rule mining task.
 Data Pre-Processing
◦ Removing redundancy
◦ Merging
◦ Converting the dataset into the required form
Data Used for Prediction

Type                          Description
Data Set Characteristics      Multivariate
Attribute Characteristics     Integer, Real, String
Number of Instances           30000
Number of Attributes          3
Associated Tasks              Association Rule Mining
Data Description
 Customer ID
 Account Opening Date
 Different_Products
Data Pre-processing
 The dataset contains some redundant (duplicate) rows, which are removed.
 The dataset also contains rows with the same Customer ID and the same Account Opening Date but different products. Such rows are merged into one transaction.
 In our dataset a transaction contains at most three products. These products are split into three separate columns.
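The three steps above can be sketched in plain Python as follows. This is only an illustration under assumptions: the sample rows, the product names and the helper name `build_transactions` are hypothetical and are not taken from the actual dataset.

```python
from collections import defaultdict

# Hypothetical sample rows: (customer_id, account_opening_date, product).
rows = [
    (101, "2019-01-05", "Savings"),
    (101, "2019-01-05", "Savings"),      # exact duplicate, will be dropped
    (101, "2019-01-05", "Credit Card"),
    (102, "2019-02-10", "Loan"),
    (103, "2019-03-15", "Savings"),
    (103, "2019-03-15", "Loan"),
    (103, "2019-03-15", "Insurance"),
]

def build_transactions(rows, max_items=3):
    """Deduplicate rows, merge them per (customer, date), split into columns."""
    # Step 1: remove redundant rows while keeping the original order.
    unique_rows = list(dict.fromkeys(rows))

    # Step 2: merge rows sharing customer ID and opening date.
    grouped = defaultdict(list)
    for customer_id, opened_on, product in unique_rows:
        grouped[(customer_id, opened_on)].append(product)

    # Step 3: one column per product, padded with None when fewer than
    # max_items products are present (assumes at most max_items per group).
    table = []
    for (customer_id, opened_on), products in grouped.items():
        padded = products[:max_items] + [None] * (max_items - len(products))
        table.append((customer_id, opened_on, *padded))
    return table

transactions = build_transactions(rows)
for row in transactions:
    print(row)
```

The padding value `None` plays the role of the missing entries that have to be handled before mining.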
Data Pre-Processing
 NaN values are removed from the dataset to obtain effective rules in association rule mining.
 They are removed because the algorithm would otherwise treat NaN as a product and generate spurious rules.
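A minimal sketch of this clean-up, assuming each transaction is a row of up to three product columns in which missing entries are either `None` or a float NaN. The helper names `is_missing` and `to_itemsets` and the sample rows are hypothetical.

```python
import math

def is_missing(value):
    """True for None or a float NaN, the two forms of 'no product' assumed here."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def to_itemsets(table):
    """Turn each row of product columns into a set of real products only."""
    return [frozenset(v for v in row if not is_missing(v)) for row in table]

# Hypothetical rows with up to three product columns.
table = [
    ("Savings", "Credit Card", float("nan")),
    ("Loan", None, None),
    ("Savings", "Loan", "Insurance"),
]

itemsets = to_itemsets(table)
print(itemsets)
```

With the missing entries filtered out, no rule can contain NaN as an antecedent or consequent.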
Association Rules
Tool Used : Python
Apriori Algorithm
 The Apriori principle: any subset of a frequent item set must also be frequent.
 The Apriori algorithm adopts a candidate generation-and-test methodology to produce the frequent item sets. It uses an iterative, layer-by-layer search in which the frequent k-item sets are used to find the frequent (k+1)-item sets. First the frequent 1-item sets, written L1, are found. L1 is then used to find the frequent 2-item sets L2, and so on until no further frequent item sets can be found. Each search requires a database scan.
 Min_supp: the minimum support threshold; only patterns that satisfy it count as frequent.
 Min_conf: the minimum confidence threshold used to find the strong association rules.
 Frequent item sets: denoted by Li, where i is the number of items in each set; these are the item sets that satisfy the minimum support (min_supp) threshold.
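The layer-by-layer search and the two thresholds can be illustrated with a short plain-Python sketch. It is not the code used in this project; the function names `apriori` and `strong_rules`, the toy transactions and the threshold values are assumptions chosen for the example.

```python
from itertools import combinations

def apriori(transactions, min_supp):
    """Return {frequent itemset: support} using a level-wise search."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # One pass over the database per candidate.
        return sum(itemset <= t for t in transactions) / n

    # L1: frequent 1-item sets.
    items = {item for t in transactions for item in t}
    level = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_supp:
            level[frozenset([item])] = s

    frequent, k = {}, 1
    while level:
        frequent.update(level)
        prev = list(level)
        # Join step: combine frequent k-item sets into (k+1)-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k + 1}
        # Prune step (Apriori principle): every k-subset must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level for s in combinations(c, k))}
        level = {}
        for c in candidates:
            s = support(c)
            if s >= min_supp:
                level[c] = s
        k += 1
    return frequent

def strong_rules(frequent, min_conf):
    """Return (antecedent, consequent, confidence) for rules meeting min_conf."""
    rules = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                conf = supp / frequent[antecedent]
                if conf >= min_conf:
                    rules.append((antecedent, itemset - antecedent, conf))
    return rules

# Hypothetical toy transactions.
toy = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "C"}, {"A", "B", "C"}]
frequent = apriori(toy, min_supp=0.5)
rules = strong_rules(frequent, min_conf=0.7)
```

In this toy data every single item and every pair is frequent, while {A, B, C} appears in only two of five transactions and is rejected at the third level.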
FP Growth Algorithm
 The FP-growth approach mines frequent item sets without candidate generation and has been proposed as an alternative to the Apriori-based approach. FP-growth adopts a divide-and-conquer methodology, decomposing the mining task into smaller ones in order to produce the frequent item sets.
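A compact plain-Python sketch of the idea: build an FP-tree, then mine each item's conditional pattern base recursively, which is the divide-and-conquer step. It is an illustrative simplification, not the implementation used in this project. The names `Node`, `build_tree`, `fp_growth` and `mine`, and the toy data, are assumptions, and the threshold is given as an absolute count rather than a fraction.

```python
from collections import defaultdict

class Node:
    """One FP-tree node: an item, its count, and a link to its parent."""
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}

def build_tree(weighted, min_count):
    """Build an FP-tree and return its header table {item: [nodes]}."""
    counts = defaultdict(int)
    for items, weight in weighted:
        for item in items:
            counts[item] += weight
    keep = {i for i, c in counts.items() if c >= min_count}
    root, header = Node(None, None), defaultdict(list)
    for items, weight in weighted:
        # Most frequent items first, so common prefixes share tree nodes.
        ordered = sorted((i for i in items if i in keep),
                         key=lambda i: (-counts[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += weight
            node = child
    return header

def fp_growth(weighted, min_count, suffix=frozenset()):
    """Return {frequent itemset: count} without generating candidates."""
    result = {}
    for item, nodes in build_tree(weighted, min_count).items():
        pattern = suffix | {item}
        result[pattern] = sum(n.count for n in nodes)
        # Conditional pattern base: the prefix path above each node of `item`.
        conditional = []
        for n in nodes:
            path, parent = [], n.parent
            while parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, n.count))
        # Divide and conquer: mine the smaller, item-specific database.
        result.update(fp_growth(conditional, min_count, pattern))
    return result

def mine(transactions, min_count):
    return fp_growth([(set(t), 1) for t in transactions], min_count)

# Hypothetical toy transactions.
toy = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "C"}, {"A", "B", "C"}]
patterns = mine(toy, min_count=3)
```

The database is read only to build trees; no candidate item sets are enumerated and tested against it.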
Evaluation Process
 Consider two products X and Y.
 Support: an indication of how frequently the items appear in the database. It is the percentage of transactions that contain both the antecedent X and the consequent Y.
 Confidence: the likelihood of item Y being purchased when item X is purchased.
 Lift: the likelihood of item Y being purchased when item X is purchased, taking into account the popularity of Y.
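These three measures can be computed directly from a list of transactions. The sketch below uses the standard definitions; the function names and the sample baskets are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)

def confidence(transactions, x, y):
    """Estimate of P(Y | X): support of X and Y together over support of X."""
    return support(transactions, set(x) | set(y)) / support(transactions, x)

def lift(transactions, x, y):
    """Confidence of X -> Y divided by the support (popularity) of Y."""
    return confidence(transactions, x, y) / support(transactions, y)

# Hypothetical baskets.
baskets = [
    {"Savings", "Card"},
    {"Savings", "Card"},
    {"Savings"},
    {"Loan"},
    {"Card", "Loan"},
]

supp_xy = support(baskets, {"Savings", "Card"})
conf_xy = confidence(baskets, {"Savings"}, {"Card"})
lift_xy = lift(baskets, {"Savings"}, {"Card"})
print(supp_xy, conf_xy, lift_xy)
```

A lift above 1 suggests X and Y are bought together more often than their individual popularity would explain; a lift near 1 suggests no association.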
Evaluation Process
 Conviction: compares the probability that X would appear without Y if the two were independent with the actual frequency of X appearing without Y.
 Interestingness: a measure for identifying rare rules. Although the items in these rules have low individual support counts, the rules are interesting because the items have a high collective support count.

 Leverage: measures how much more often X and Y occur together than would be expected if they were independent.
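Conviction and leverage can be computed from three support values. The sketch below uses the commonly cited formulas, conviction = (1 - supp(Y)) / (1 - conf(X -> Y)) and leverage = supp(X and Y) - supp(X) * supp(Y); the function names and the example numbers are hypothetical.

```python
import math

def conviction(supp_x, supp_y, supp_xy):
    """(1 - supp(Y)) / (1 - conf(X -> Y)); infinite when the rule never fails."""
    conf = supp_xy / supp_x
    if conf == 1.0:
        return math.inf
    return (1 - supp_y) / (1 - conf)

def leverage(supp_x, supp_y, supp_xy):
    """Observed joint support minus the support expected under independence."""
    return supp_xy - supp_x * supp_y

# Hypothetical supports: X and Y each occur in 60% of transactions,
# and together in 40%.
conv = conviction(0.6, 0.6, 0.4)
lev = leverage(0.6, 0.6, 0.4)
print(conv, lev)
```

A conviction above 1 and a leverage above 0 both indicate that X and Y co-occur more often than independence would predict.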


Tables
Results
Conclusion

