
Market Basket Analysis and Association Rules

What can be inferred?


I purchase diapers
I purchase a new car
I purchase OTC cough medicine
I purchase a prescription medication
I don't show up for class

What are Association Rules?

Study of what goes with what


Customers who bought X also bought Y
What symptoms go with what diagnosis

Transaction-based or event-based
Also called market basket analysis and affinity analysis
Originated with the study of customer transaction databases to determine associations among items purchased

What is Market Basket Analysis?


Understanding the behavior of shoppers
What items are bought together

What's in each shopping cart/basket?

Basket data consist of a collection of transactions, each recording the transaction date and the items bought in that transaction

Itemset Pivoting

How does this data differ from a transaction database?

Retail organizations are interested in making well-founded decisions and strategy based on analysis of transaction data

What to put on sale, how to place merchandise on shelves to maximize profit, customer segmentation based on buying patterns

Examples

Rule form: LHS => RHS

IF a customer buys diapers, THEN they also buy beer

diapers => beer

Transactions that purchase bread and butter also purchase milk

bread and butter => milk

Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners

Evaluation

Support: a measure of how often the collection of items in an association occur together, as a percentage of all transactions

In 2% of the purchases at a hardware store, both a pick and a shovel were bought
support = #tuples(LHS, RHS) / N

Confidence: the confidence of the rule "B given A" is a measure of how likely it is that B occurs when A has occurred

100% means that B always occurs if A has occurred
confidence = #tuples(LHS, RHS) / #tuples(LHS)
Example: bread and butter => milk [90%, 1%]

Rules originating from the same itemset have identical support but can have different confidence

The association rules mining problem


Generate all association rules from the given dataset that have

support greater than a specified minimum and confidence greater than a specified minimum

Examples

Rule form: LHS => RHS [confidence, support]

diapers => beer [60%, 0.5%]

90% of transactions that purchase bread and butter also purchase milk

bread and butter => milk [90%, 1%]

Example
Large Itemsets with minsup = 30%

Tr#   Items
T1    Beer, Milk
T2    Bread, Butter
T3    Bread, Butter, Jelly
T4    Bread, Butter, Milk
T5    Beer, Bread

Itemset          Support (%)
Bread            80
Butter           60
Milk             40
Beer             40
Bread, Butter    60

Consider the itemset {Bread, Butter} and the two possible rules: Bread => Butter and Butter => Bread

Support({Bread, Butter}) / Support({Bread}) = 60/80 = 0.75, i.e., Confidence(Bread => Butter) = 75%

Support({Bread, Butter}) / Support({Butter}) = 60/60 = 1, i.e., Confidence(Butter => Bread) = 100%
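The worked example above can be reproduced with a short Python sketch. The helper names (`count`, `support`, `confidence`, `large_itemsets`) are illustrative assumptions rather than functions from any library, and the brute-force enumeration is only practical for a toy dataset like this one.

```python
from itertools import combinations

# Transactions T1..T5 from the example above.
transactions = [
    {"Beer", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Butter", "Jelly"},
    {"Bread", "Butter", "Milk"},
    {"Beer", "Bread"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions)


def support(itemset, transactions):
    """Fraction of all transactions that contain `itemset`."""
    return count(itemset, transactions) / len(transactions)


def confidence(lhs, rhs, transactions):
    """#tuples(LHS, RHS) / #tuples(LHS)."""
    both = set(lhs) | set(rhs)
    return count(both, transactions) / count(lhs, transactions)


def large_itemsets(transactions, minsup):
    """Brute force: every itemset whose support is at least `minsup`."""
    items = sorted(set().union(*transactions))
    found = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            s = support(combo, transactions)
            if s >= minsup:
                found[combo] = s
    return found


large = large_itemsets(transactions, minsup=0.30)
conf_bread_butter = confidence({"Bread"}, {"Butter"}, transactions)  # 3 of 4
conf_butter_bread = confidence({"Butter"}, {"Bread"}, transactions)  # 3 of 3
```

With minsup = 30%, Jelly (one transaction out of five) drops out, and {Bread, Butter} is the only multi-item set that survives.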

How Good is an Association Rule?


Are support and confidence enough?
Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place

Lift = P(LHS ∧ RHS) / (P(LHS) * P(RHS))

When lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule (LHS => not RHS) produces a better rule than guessing
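As a rough illustration of the lift formula, the sketch below applies it to the five bread/butter transactions from the earlier large-itemset example. The function names are assumptions made for this example; lift is computed from raw counts, which is algebraically the same as dividing the probabilities.

```python
# Same five transactions (T1..T5) as in the large-itemset example.
transactions = [
    {"Beer", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Butter", "Jelly"},
    {"Bread", "Butter", "Milk"},
    {"Beer", "Bread"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions)


def lift(lhs, rhs, transactions):
    """P(LHS and RHS) / (P(LHS) * P(RHS)), computed from counts."""
    n = len(transactions)
    n_lhs = count(lhs, transactions)
    n_rhs = count(rhs, transactions)
    n_both = count(set(lhs) | set(rhs), transactions)
    return (n_both * n) / (n_lhs * n_rhs)


lift_bread_butter = lift({"Bread"}, {"Butter"}, transactions)  # (3 * 5) / (4 * 3)
lift_beer_bread = lift({"Beer"}, {"Bread"}, transactions)      # (1 * 5) / (2 * 4)
```

Here Bread => Butter has lift above 1 (the rule beats guessing), while Beer => Bread has lift below 1.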


The Problem of Lots of Data

Fast food restaurant: could have 100 items on its menu

How many combinations are there with 3 different menu items? 161,700!

Supermarket: 10,000 or more unique items

About 50 million 2-item combinations
Over 100 billion 3-item combinations

Use of product hierarchies (groupings) helps address this common issue
Also, the number of transactions in a given time period could be huge (hence expensive to analyze)
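The combination counts quoted here are binomial coefficients ("n choose k"). A quick check with Python's standard library, using the item counts from the slide (the variable names are only illustrative):

```python
from math import comb

menu_items = 100            # fast food restaurant
supermarket_items = 10_000  # supermarket

fast_food_triples = comb(menu_items, 3)           # 3-item combinations
supermarket_pairs = comb(supermarket_items, 2)    # 2-item combinations
supermarket_triples = comb(supermarket_items, 3)  # 3-item combinations

print(f"{fast_food_triples:,}")
print(f"{supermarket_pairs:,}")
print(f"{supermarket_triples:,}")
```

The exact supermarket figures are 49,995,000 pairs and 166,616,670,000 triples, which is why the slide rounds them to tens of millions and hundreds of billions.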

Preparing Data for MBA


Determining scope of dataset (one or many stores, what period, etc.)
Converting transaction data to itemsets
Generalizing items to appropriate level

Depends on objective of model
Rolling up rare items to get adequate support


Search Approach
Two sub-problems in discovering all association rules:

1. Find all sets of items (itemsets) that have transaction support above the minimum support
Itemsets that qualify are called large itemsets, and all others small itemsets.

2. Generate, from each large itemset, rules that use items from that itemset.

Given a large itemset Y and a subset X of Y:
Take the support of Y and divide it by the support of X
If the ratio c is at least minconf, then the rule X => (Y - X) is satisfied with confidence factor c
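The second sub-problem can be sketched as follows. This is a minimal illustration, assuming the supports of Y and of all its subsets are already known from the first sub-problem; `rules_from_itemset` is a hypothetical helper name, the support counts are taken from the bread/butter example, and minconf = 0.8 is an arbitrary threshold chosen for the demonstration.

```python
from itertools import combinations


def rules_from_itemset(Y, support, minconf):
    """Return (X, Y - X, c) for each non-empty proper subset X of Y with c >= minconf."""
    Y = frozenset(Y)
    rules = []
    for size in range(1, len(Y)):
        for subset in combinations(sorted(Y), size):
            X = frozenset(subset)
            c = support[Y] / support[X]  # confidence of X => (Y - X)
            if c >= minconf:
                rules.append((X, Y - X, c))
    return rules


# Support counts from the bread/butter example (out of 5 transactions).
support = {
    frozenset({"Bread"}): 4,
    frozenset({"Butter"}): 3,
    frozenset({"Bread", "Butter"}): 3,
}

rules = rules_from_itemset({"Bread", "Butter"}, support, minconf=0.8)
for X, rest, c in rules:
    print(sorted(X), "=>", sorted(rest), f"confidence={c:.0%}")
```

Only Butter => Bread (confidence 100%) passes the 80% threshold; Bread => Butter has confidence 75% and is dropped.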

Reducing Number of Candidates


Apriori principle:

If an itemset is large, then all of its subsets must also be large

Support of an itemset never exceeds the support of its subsets

The Apriori Algorithm

Progressively identifies large itemsets of different sizes
Exploits the property that any subset of a large itemset is also a large itemset
Also, any superset of a small itemset is also small

[Figure: itemset lattice over items A, B, C, D. 2-itemsets: AB, AC, AD, BC, BD, CD. 3-itemsets: ABC, ABD, ACD, BCD. 4-itemset: ABCD]
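The pruning idea can be sketched on the A, B, C, D lattice. The set of large 2-itemsets below is a hypothetical assumption chosen for illustration (the slide does not say which pairs are large), and `generate_candidates` is an illustrative name, not a library function.

```python
from itertools import combinations


def generate_candidates(large_k, k):
    """Join large k-itemsets into (k+1)-itemsets and keep only those
    whose every k-item subset is itself large."""
    large_k = {frozenset(s) for s in large_k}
    candidates = set()
    for a in large_k:
        for b in large_k:
            union = a | b
            if len(union) != k + 1:
                continue
            if all(frozenset(sub) in large_k for sub in combinations(union, k)):
                candidates.add(union)
    return candidates


# Hypothetical: AB, AC, BC and BD are large; AD and CD are small.
large_2 = [{"A", "B"}, {"A", "C"}, {"B", "C"}, {"B", "D"}]
candidates_3 = generate_candidates(large_2, 2)
```

Under this assumption only ABC needs to be counted against the data: ABD and ACD contain the small itemset AD, and BCD contains the small itemset CD, so they are discarded without a database scan.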

Used in many recommender systems

Generating Rules

Terms
IF part = antecedent
THEN part = consequent
Item set = the items (e.g., products) comprising the antecedent or consequent

Antecedent and consequent are disjoint (i.e., have no items in common)

Tiny Example: Phone Faceplates


Many Rules are Possible


For example: Transaction 1 supports several rules, such as
If red, then white (If a red faceplate is purchased, then so is a white one)
If white, then red
If red and white, then green
+ several more

Frequent Item Sets

Ideally, we want to create all possible combinations of items
Problem: computation time grows exponentially as the number of items increases

Solution: consider only frequent item sets
Criterion for frequent: support

Support
Support = # (or percent) of transactions that include both the antecedent and the consequent
Example: support for the item set {red, white} is 4 out of 10 transactions, or 40%

Apriori Algorithm

Generating Frequent Item Sets


For k products:
1. User sets a minimum support criterion
2. Next, generate list of one-item sets that meet the support criterion
3. Use the list of one-item sets to generate list of two-item sets that meet the support criterion
4. Use list of two-item sets to generate list of three-item sets
5. Continue up through k-item sets
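The level-wise procedure in these steps can be sketched in Python as below. This is a simplified illustration, not the XLMiner implementation. Because the faceplate transactions are not reproduced in the text, the sketch reuses the five bread/butter transactions from the earlier example, with a minimum support of 2 transactions (the smallest count that exceeds 30% of 5).

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {itemset: count} for every itemset found in at least
    `min_count` transactions, built level by level."""
    transactions = [frozenset(t) for t in transactions]

    def count(itemset):
        return sum(itemset <= t for t in transactions)

    def keep_frequent(candidates):
        counted = {c: count(c) for c in candidates}
        return {c: n for c, n in counted.items() if n >= min_count}

    # Step 2: one-item sets that meet the support criterion.
    all_items = set().union(*transactions)
    current = keep_frequent(frozenset([item]) for item in all_items)
    frequent = dict(current)

    # Steps 3-5: build (k+1)-item sets from the frequent k-item sets.
    k = 1
    while current:
        candidates = {a | b for a in current for b in current if len(a | b) == k + 1}
        # Drop candidates containing any infrequent k-item subset.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in current for sub in combinations(c, k))
        }
        current = keep_frequent(candidates)
        frequent.update(current)
        k += 1
    return frequent


transactions = [
    {"Beer", "Milk"},
    {"Bread", "Butter"},
    {"Bread", "Butter", "Jelly"},
    {"Bread", "Butter", "Milk"},
    {"Beer", "Bread"},
]
frequent = apriori(transactions, min_count=2)
```

The loop stops as soon as a level produces no frequent item sets, so in practice it rarely runs all the way up to k-item sets.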

Measures of Performance
Confidence: the % of antecedent transactions that also have the consequent item set

Lift = confidence / (benchmark confidence)

Benchmark confidence = transactions with consequent as % of all transactions

Lift > 1 indicates a rule that is useful in finding consequent item sets (i.e., more useful than just selecting transactions randomly)

Alternate Data Format: Binary Matrix

Process of Rule Selection


Generate all rules that meet specified support & confidence
Find frequent item sets (those with sufficient support; see above)
From these item sets, generate rules with sufficient confidence

Example: Rules from {red, white, green}


{red, white} => {green} with confidence = 2/4 = 50% [(support {red, white, green}) / (support {red, white})]
{red, green} => {white} with confidence = 2/2 = 100% [(support {red, white, green}) / (support {red, green})]

Plus 4 more with confidence of 100%, 33%, 29% & 100%

If confidence criterion is 70%, report only rules 2, 3 and 6

All Rules (XLMiner Output)


Rule #  Conf. %  Antecedent (a)    Consequent (c)  Support(a)  Support(c)  Support(a U c)  Lift Ratio
1       100      Green =>          Red, White      2           4           2               2.5
2       100      Green =>          Red             2           6           2               1.666667
3       100      Green, White =>   Red             2           6           2               1.666667
4       100      Green =>          White           2           7           2               1.428571
5       100      Green, Red =>     White           2           7           2               1.428571
6       100      Orange =>         White           2           7           2               1.428571
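The lift ratios in this output can be re-derived from the support counts using lift = confidence / benchmark confidence. The sketch below assumes the data set has 10 transactions, as implied by the earlier "4 out of 10 transactions" support example; the tuple layout and names are illustrative only.

```python
N = 10  # assumption: total transactions, per the "4 out of 10" support example

# (rule #, support(a), support(c), support(a U c)) copied from the table.
rules = [
    (1, 2, 4, 2),
    (2, 2, 6, 2),
    (3, 2, 6, 2),
    (4, 2, 7, 2),
    (5, 2, 7, 2),
    (6, 2, 7, 2),
]


def lift_ratio(support_a, support_c, support_ac, n):
    confidence = support_ac / support_a  # share of antecedent transactions with the consequent
    benchmark = support_c / n            # share of all transactions with the consequent
    return confidence / benchmark


lifts = {rule_no: round(lift_ratio(a, c, ac, N), 6) for rule_no, a, c, ac in rules}
```

Every rule in the table has confidence 100%, so the lift ratio depends only on how common the consequent is: the rarer the consequent, the higher the lift.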


Interpretation

Lift ratio shows how effective the rule is in finding consequents (useful if finding particular consequents is important)

Confidence shows the rate at which consequents will be found (useful in learning costs of promotion)

Support measures overall impact

Caution: The Role of Chance


Random data can generate apparently interesting association rules
The more rules you produce, the greater this danger
Rules based on large numbers of records are less subject to this danger

Market Basket Analysis


MBA is a set of techniques, Association Rules being the most common, that focus on point-of-sale (p-o-s) transaction data
3 types of market basket data (p-o-s data):

Customers
Orders (basic purchase data)
Items (merchandise/services purchased)

Market Basket Analysis


Retail: each customer purchases a different set of products, in different quantities, at different times
MBA uses this information to:

Identify who customers are (not by name)
Understand why they make certain purchases
Gain insight about its merchandise (products):

Fast and slow movers
Products which are purchased together
Products which might benefit from promotion

Take action:

Store layouts
Which products to put on specials, promote, coupons

Combining all of this with a customer loyalty card, it becomes even more valuable

Association Rules
DM technique most closely allied with Market Basket Analysis
AR can be automatically generated

AR represent patterns in the data without a specified target variable
Good example of undirected data mining

Market Basket Analysis : Measures


Consider the association rule Y => Z, where Y and Z are two products. Y is the antecedent and Z is called the consequent.

Support of the rule: the percentage of all baskets that contain both product Y and Z
support = P(Y ∧ Z).

Confidence of the rule: the percentage of all the baskets containing Y that also contain Z. Hence, confidence is a conditional probability, i.e. P(Z|Y)
confidence = P(Y ∧ Z) / P(Y).

Interest of the rule: measures the statistical dependence of the rule, by relating the observed frequency of co-occurrence (P(Y ∧ Z)) to the expected frequency of co-occurrence under the assumption of independence of Y and Z (P(Y) * P(Z))
interest = P(Y ∧ Z) / (P(Y) * P(Z)).

Association-rule discovery is the process of finding strong product associations with a minimum support and/or confidence and an interest of at least one.

Association Rules Apply Elsewhere


Besides retail (supermarkets, etc.):
Purchases made using credit/debit cards
Optional Telco Service purchases
Banking services
Unusual combinations of insurance claims can be a warning of fraud
Medical patient histories

A certainty measure for association rules of the form A => B, where A and B are sets of items, is confidence. Given a set of task


Typical Data Structure (Relational Database)

Transaction Data

Lots of questions can be answered

Avg # of orders/customer
Avg # unique items/order
Avg # of items/order
For a product:

What % of customers have purchased it
Avg # orders/customer that include it
Avg quantity of it purchased/order

Etc.

Visualization is extremely helpful

Sales Order Characteristics


Did the order use gift wrap?
Billing address same as Shipping address?
Did purchaser accept/decline a cross-sell?
What is the most common item found on a one-item order?
What is the most common item found on a multi-item order?
What is the most common item for repeat customer purchases?
How has ordering of an item changed over time?
How does the ordering of an item vary geographically?

Association Rules
Wal-Mart customers who purchase Barbie dolls have a 60% likelihood of also purchasing one of three types of candy bars
Customers who purchase maintenance agreements are very likely to purchase large appliances
When a new hardware store opens, one of the most commonly sold items is toilet bowl cleaners

Association Rules

Association rule types:

Actionable Rules: contain high-quality, actionable information
Trivial Rules: information already well-known by those familiar with the business
Inexplicable Rules: no explanation and do not suggest action

Trivial and Inexplicable Rules occur most often

How Good is an Association Rule?


POS Transactions

Customer   Items Purchased
1          Coke, soda
2          Milk, Coke, window cleaner
3          Coke, detergent
4          Coke, detergent, soda
5          Window cleaner, soda

Co-occurrence of Products

                 Coke   Window cleaner   Milk   Soda   Detergent
Coke             4      1                1      2      2
Window cleaner   1      2                1      1      0
Milk             1      1                1      0      0
Soda             2      1                0      3      1
Detergent        2      0                0      1      2

How Good is an Association Rule?


                 Coke   Window cleaner   Milk   Soda   Detergent
Coke             4      1                1      2      2
Window cleaner   1      2                1      1      0
Milk             1      1                1      0      0
Soda             2      1                0      3      1
Detergent        2      0                0      1      2

Simple patterns:
1. Coke and soda are purchased together more often than most other pairs of items (2 baskets, tied with Coke and detergent)
2. Detergent is never purchased with milk or window cleaner
3. Milk is never purchased with soda or detergent
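The co-occurrence matrix can be rebuilt directly from the five POS transactions. The sketch below is one simple way to do it; the function name and the nested-dictionary layout are choices made for this illustration.

```python
# The five POS transactions from the slide.
transactions = [
    {"Coke", "Soda"},
    {"Milk", "Coke", "Window cleaner"},
    {"Coke", "Detergent"},
    {"Coke", "Detergent", "Soda"},
    {"Window cleaner", "Soda"},
]
items = ["Coke", "Window cleaner", "Milk", "Soda", "Detergent"]


def cooccurrence(transactions, items):
    """matrix[a][b] = number of baskets containing both a and b.
    The diagonal matrix[a][a] is the number of baskets containing a."""
    matrix = {a: {b: 0 for b in items} for a in items}
    for basket in transactions:
        for a in basket:
            for b in basket:
                matrix[a][b] += 1
    return matrix


matrix = cooccurrence(transactions, items)
rows = [[matrix[a][b] for b in items] for a in items]

for name, row in zip(items, rows):
    print(f"{name:<15}", row)
```

Zeros in the matrix correspond to the "never purchased with" patterns, and the matrix is symmetric because co-occurrence has no direction.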

How Good is an Association Rule?


POS Transactions

Customer   Items Purchased
1          Coke, soda
2          Milk, Coke, window cleaner
3          Coke, detergent
4          Coke, detergent, soda
5          Window cleaner, soda

What is the confidence for this rule: If a customer purchases soda, then the customer also purchases Coke?
2 out of 3 soda purchases also include Coke, so 67%

What about the confidence of this rule reversed?
2 out of 4 Coke purchases also include soda, so 50%

Confidence = ratio of the number of transactions with all the items to the number of transactions with just the "if" items

How Good is an Association Rule?

How much better than chance is a rule?

Lift (improvement) tells us how much better a rule is at predicting the result than just assuming the result in the first place
Lift is the ratio of the number of records that support the entire rule to the number that would be expected, assuming there was no relationship between the products
Calculating lift: lift = P(LHS ∧ RHS) / (P(LHS) * P(RHS))
When lift > 1, the rule is better at predicting the result than guessing
When lift < 1, the rule is doing worse than informed guessing, and using the Negative Rule produces a better rule than guessing
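The Negative Rule idea can be illustrated on the five Coke/soda transactions from the preceding slides. In this sketch the rule sides are passed as predicates so that "does not contain Coke" can serve as a result; that design, and the function name, are assumptions made for the example.

```python
# The five POS transactions from the preceding slides.
transactions = [
    {"Coke", "Soda"},
    {"Milk", "Coke", "Window cleaner"},
    {"Coke", "Detergent"},
    {"Coke", "Detergent", "Soda"},
    {"Window cleaner", "Soda"},
]


def lift(has_lhs, has_rhs, transactions):
    """Observed count of baskets matching both sides, divided by the count
    expected if the two sides were unrelated."""
    n = len(transactions)
    n_lhs = sum(1 for t in transactions if has_lhs(t))
    n_rhs = sum(1 for t in transactions if has_rhs(t))
    n_both = sum(1 for t in transactions if has_lhs(t) and has_rhs(t))
    expected = n_lhs * n_rhs / n
    return n_both / expected


# Rule: if soda then Coke.
lift_soda_coke = lift(lambda t: "Soda" in t, lambda t: "Coke" in t, transactions)

# Negative rule: if soda then NOT Coke.
lift_soda_not_coke = lift(lambda t: "Soda" in t, lambda t: "Coke" not in t, transactions)

print(round(lift_soda_coke, 3), round(lift_soda_not_coke, 3))
```

Although "if soda then Coke" has 67% confidence, Coke appears in 80% of all baskets, so the rule's lift is below 1; the negative rule "if soda then not Coke" has lift above 1.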


Creating Association Rules


1. Choosing the right set of items
2. Generating rules by deciphering the counts in the co-occurrence matrix
3. Overcoming the practical limits imposed by thousands or tens of thousands of unique items

Overcoming Practical Limits for Association Rules


1. Generate co-occurrence matrix for single items: "if Coke then soda"
2. Generate co-occurrence matrix for two items: "if Coke and Milk then soda"
3. Generate co-occurrence matrix for three items: "if Coke and Milk and Window Cleaner then soda"
4. Etc.

Final Thought on Association Rules: The Problem of Lots of Data

Fast food restaurant: could have 100 items on its menu

How many combinations are there with 3 different menu items? 161,700!

Supermarket: 10,000 or more unique items

About 50 million 2-item combinations
Over 100 billion 3-item combinations

Use of product hierarchies (groupings) helps address this common issue
Finally, know that the number of transactions in a given time period could also be huge (hence expensive to analyze)

Business and other cases


General Observations

The banking case seems to provide well-defined and intelligible information of the form: account_1 and account_2, etc., or activity_1 and activity_2, etc., possibly indexed by time. As such, the rules found provide a guide to action: "offer" a product or service (cross-sell).

In the retailing case of items purchased together, "guidance" is not so clear-cut due to the extensive number of rules.

Challenges

A major difficulty is that a large number of the rules found may be trivial for anyone familiar with the business

The computational complexity involved in calculating the results of market basket analysis is at least the square of the number of transaction item-lines (records of every item purchased). With data warehouses storing billions of transaction lines, this yields extremely high computational requirements

Solutions

Differential market basket analysis can find interesting results and can also eliminate the problem of a potentially high volume of trivial results
Special techniques involving filtering or aggregation of the transaction database are commonly used in analysis algorithms to increase performance and allow some level of interactivity, such as in business intelligence applications.

Thank You!
