Non-focused mining
A manager may be interested only in patterns involving the items (s)he manages
A user is often interested in patterns satisfying some constraints
Jian Pei: Data Mining -- Maximal and Closed Patterns 2
Itemset Lattice
ABCD
ABC  ABD  ACD  BCD
AB  AC  AD  BC  BD  CD
A  B  C  D
{}
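The lattice over the items A, B, C, D can be enumerated level by level, where level k holds every k-itemset. The following is a minimal sketch; the function name `lattice_levels` is a hypothetical helper introduced only for illustration, and the empty itemset {} is represented as an empty string.

```python
from itertools import combinations

def lattice_levels(items):
    """Return the itemset lattice as a list of levels.

    Level k contains every itemset of size k, written as a string
    of its items in sorted order. Level 0 is the empty itemset.
    """
    items = sorted(items)
    return [
        ["".join(combo) for combo in combinations(items, k)]
        for k in range(len(items) + 1)
    ]

levels = lattice_levels("ABCD")
for k, level in enumerate(levels):
    print(k, level)
```

With 4 items the lattice has 2^4 = 16 nodes, which is why exhaustive enumeration does not scale and pruning matters.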
Length  Frequent itemsets
1       A, B, C, D
2       AB, AC, AD, BC, BD, CD
3       ABC, ABD, ACD
Length  Frequent itemsets
1       A, B, C, D
2       AB, AC, AD, BC, BD, CD
3       ABC, ABD
Min_sup=2
Potential max-patterns
Since BCDE is a max-pattern, there is no need to check BCD, BDE, or CDE in later scans (Bayardo, SIGMOD'98)
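The pruning step can be illustrated with a subset check: any candidate contained in an already known max-pattern is frequent and need not be counted again. This is only a sketch of that one check, not the full algorithm from the cited paper; the function name `prune_by_max_patterns` and the extra candidate ABC are hypothetical and used purely for illustration.

```python
def prune_by_max_patterns(candidates, max_patterns):
    """Drop every candidate that is a subset of a known max-pattern.

    Such candidates are guaranteed frequent (every subset of a frequent
    itemset is frequent), so counting their support again is unnecessary.
    """
    max_sets = [frozenset(m) for m in max_patterns]
    return [
        c for c in candidates
        if not any(frozenset(c) <= m for m in max_sets)
    ]

# BCD, BDE, CDE are subsets of the max-pattern BCDE and are skipped.
# ABC is a hypothetical candidate that still has to be checked.
remaining = prune_by_max_patterns(["BCD", "BDE", "CDE", "ABC"], ["BCDE"])
print(remaining)
```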
Len  Frequent itemsets
1    A:4, B:4, C:3, D:4
2    AB:3, AC:2, AD:3, BC:3, BD:2, CD:2
3    ABC:2, ABD:2
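Given the supports listed here, the max-patterns and closed patterns can be derived directly: a frequent itemset is maximal if no proper superset is frequent, and closed if no proper superset has the same support. The sketch below assumes the listed itemsets are the complete set of frequent itemsets; the function names are hypothetical.

```python
# Supports copied from the table of frequent itemsets.
supports = {
    "A": 4, "B": 4, "C": 3, "D": 4,
    "AB": 3, "AC": 2, "AD": 3, "BC": 3, "BD": 2, "CD": 2,
    "ABC": 2, "ABD": 2,
}

def maximal_patterns(supports):
    """Frequent itemsets with no frequent proper superset."""
    sets = {frozenset(k): v for k, v in supports.items()}
    return sorted(
        "".join(sorted(s)) for s in sets
        if not any(s < t for t in sets)
    )

def closed_patterns(supports):
    """Frequent itemsets with no proper superset of equal support."""
    sets = {frozenset(k): v for k, v in supports.items()}
    return sorted(
        "".join(sorted(s)) for s, sup in sets.items()
        if not any(s < t and sets[t] == sup for t in sets)
    )

print(maximal_patterns(supports))  # ['ABC', 'ABD', 'CD']
print(closed_patterns(supports))   # ['A', 'AB', 'ABC', 'ABD', 'AD', 'B', 'BC', 'CD', 'D']
```

Here 12 frequent itemsets reduce to 9 closed patterns and 3 max-patterns. C, AC, and BD are not closed because BC, ABC, and ABD respectively have the same support.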
Items
a, c, d, e, f
a, b, e
c, e, f
a, c, d, f
c, e, f
Every transaction having d also has c, f, and a
cfad is a frequent closed pattern
PHM00
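The claim about d can be checked on the five transactions by computing a closure: intersect all transactions that contain the itemset. This is a minimal sketch, not the algorithm of the cited work; `support`, `closure`, and `is_closed` are hypothetical helper names.

```python
# The five transactions from the example database.
transactions = [
    {"a", "c", "d", "e", "f"},
    {"a", "b", "e"},
    {"c", "e", "f"},
    {"a", "c", "d", "f"},
    {"c", "e", "f"},
]

def support(itemset, db):
    """Number of transactions that contain every item of itemset."""
    items = set(itemset)
    return sum(1 for t in db if items <= t)

def closure(itemset, db):
    """Items shared by all transactions that contain itemset."""
    items = set(itemset)
    covering = [t for t in db if items <= t]
    if not covering:
        return items
    return set.intersection(*covering)

def is_closed(itemset, db):
    """An itemset is closed when it equals its own closure."""
    return closure(itemset, db) == set(itemset)

print(sorted(closure("d", transactions)))  # ['a', 'c', 'd', 'f']
print(support("acdf", transactions))       # 2
print(is_closed("acdf", transactions))     # True
```

Two transactions contain d, and both also contain a, c, and f, so the closure of d is cfad with support 2.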
An Example
Support threshold: min_sup = 1
Error bound: k = 2
Another Base
Support threshold: min_sup = 1
Error bound: k = 2
Approximation Functions
NOT unique
Different condensed FP-bases have different approximation functions
Summary
Effectiveness is a critical issue in frequent pattern mining
Mining non-redundant patterns
Max-patterns and closed patterns