Clustering System)
Binning
Correlation Analysis
χ² = Σ (observed − expected)² / expected
= (4000 − 4500)² / 4500 + … = 555.6 (summed over all cells of the contingency table)
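The chi-square computation can be sketched in a few lines of Python. The 2x2 table below is an assumption: the slides give only one observed/expected pair (4000 and 4500) and the total of 555.6, so the remaining counts are illustrative values chosen to be consistent with those figures. The helper name chi_square is hypothetical.

```python
# Assumed 2x2 contingency table (not given in the slides).
# Rows: A present / A absent. Columns: B present / B absent.
observed = [
    [4000, 2000],
    [3500, 500],
]

def chi_square(table):
    """Sum (observed - expected)^2 / expected over every cell of the table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    return chi2

print(round(chi_square(observed), 1))
```

Under these assumed counts the first cell has an expected value of 6000 × 7500 / 10000 = 4500, matching the term shown on the slide. That term alone contributes about 55.6, and the other three cells supply the rest of the total.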
Correlation Analysis
All Confidence
All_conf(X) = sup(X) / max_item_sup(X)
max_item_sup(X) is the maximum support of any single item in X
all_conf(X) is the minimal confidence among the rules i_j ⇒ X − i_j, where i_j ∈ X
Cosine Measure
Cosine(A, B) = P(A ∪ B) / sqrt(P(A) × P(B))
Similar to lift, but with the square root of P(A) × P(B) in the denominator
Influenced only by the supports of A, B and A ∪ B, not by the total number of transactions
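A minimal sketch of the two measures, next to lift for comparison. Here P(A ∪ B) is read as the probability that a transaction contains both A and B, which is the usual itemset convention. The function names and the counts are illustrative assumptions, not values from the slides.

```python
from math import sqrt

def all_confidence(count_x, item_counts):
    """all_conf(X) = sup(X) / max_item_sup(X)."""
    return count_x / max(item_counts)

def cosine(count_a, count_b, count_ab, n):
    """cosine(A, B) = P(A and B) / sqrt(P(A) * P(B)); n cancels out."""
    p_a, p_b, p_ab = count_a / n, count_b / n, count_ab / n
    return p_ab / sqrt(p_a * p_b)

def lift(count_a, count_b, count_ab, n):
    """lift(A, B) = P(A and B) / (P(A) * P(B)); depends on n."""
    p_a, p_b, p_ab = count_a / n, count_b / n, count_ab / n
    return p_ab / (p_a * p_b)

# Hypothetical support counts: A in 100 transactions, B in 400, both in 80.
for total in (1_000, 10_000):
    print(total,
          round(cosine(100, 400, 80, total), 3),
          round(lift(100, 400, 80, total), 3),
          round(all_confidence(80, [100, 400]), 3))
```

Padding the database with transactions that contain neither A nor B changes lift but leaves cosine and all-confidence unchanged, which is the property the slide points to.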
min_confidence 60%
Meta Rule guided Mining
Makes Mining process more effective and efficient
Users can specify the syntactic form of the rules
Example:
P1(X, Y) ∧ P2(X, W) ⇒ buys(X, "Educational Software")
Rule form
P1 ∧ P2 ∧ … ∧ Pl ⇒ Q1 ∧ Q2 ∧ … ∧ Qr
p = l + r
Find all frequent p-predicate sets, together with the counts of their l-predicate subsets (needed to compute confidence)
Cube Search
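One way to picture metarule instantiation is a brute-force search that tries every pair of attributes as P1 and P2 and keeps the instantiations that meet the support and confidence thresholds. This is a simplified sketch, not the cube-based search named above. The records, attribute names and the function instantiate_metarule are all hypothetical.

```python
from itertools import combinations

# Hypothetical customer records; "buys_edu_sw" plays the role of the rule head.
customers = [
    {"age": "youth",  "income": "high", "student": "yes", "buys_edu_sw": True},
    {"age": "youth",  "income": "high", "student": "no",  "buys_edu_sw": True},
    {"age": "youth",  "income": "low",  "student": "yes", "buys_edu_sw": False},
    {"age": "senior", "income": "high", "student": "no",  "buys_edu_sw": False},
    {"age": "senior", "income": "low",  "student": "no",  "buys_edu_sw": False},
]

def instantiate_metarule(records, target, min_sup, min_conf):
    """Instantiate P1(X, Y) and P2(X, W) => target(X) over all attribute pairs."""
    attrs = [a for a in records[0] if a != target]
    n = len(records)
    rules = []
    for a1, a2 in combinations(attrs, 2):
        # Only value pairs that occur in the data, so the rule body is never empty.
        for v1, v2 in sorted({(r[a1], r[a2]) for r in records}):
            body = [r for r in records if r[a1] == v1 and r[a2] == v2]
            both = [r for r in body if r[target]]
            support = len(both) / n            # count of the full p-predicate set
            confidence = len(both) / len(body) # divided by the count of the body
            if support >= min_sup and confidence >= min_conf:
                rules.append(((a1, v1), (a2, v2), support, confidence))
    return rules

rules = instantiate_metarule(customers, "buys_edu_sw", min_sup=0.4, min_conf=0.6)
for rule in rules:
    print(rule)
```

The confidence line shows why the counts of the body predicate sets are needed in addition to the counts of the full predicate sets.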
Rule Constraints
Find the sales of which cheap items (where the sum
of prices is less than $100) may promote the sales of
which expensive items (where the minimum price is
$500) of the same group for Chicago Customers in
2004
mine associations as
lives_in(C, _, "Chicago") ∧ sales+(C, ?{I}, {S}) ⇒ sales+(C, ?{J}, {T})
from sales
where S.year = 2004 and T.year = 2004 and I.group =
J.group
group by C, I.group
having sum(I.price) < 100 and min(J.price) >= 500
with support threshold = 1%
with confidence threshold = 1%
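The having clause of the query can be read as a filter on candidate rules: the antecedent item set I must total less than $100 and every item in the consequent set J must cost at least $500. A minimal sketch follows; the price table, item names and the function satisfies_having are hypothetical.

```python
# Hypothetical price table; item names and prices are illustrative only.
prices = {
    "pen": 5, "notebook": 20, "mouse": 60,
    "keyboard": 120, "monitor": 550, "laptop": 900,
}

def satisfies_having(cheap_items, expensive_items, price_of):
    """Check: sum(I.price) < 100 and min(J.price) >= 500."""
    cheap_total = sum(price_of[i] for i in cheap_items)
    min_expensive = min(price_of[j] for j in expensive_items)
    return cheap_total < 100 and min_expensive >= 500

# Candidate (antecedent items I, consequent items J) pairs.
candidates = [
    ({"pen", "notebook"}, {"laptop"}),
    ({"pen", "notebook", "mouse"}, {"monitor", "laptop"}),
    ({"mouse", "keyboard"}, {"laptop"}),    # sum of I is 180, rejected
    ({"pen"}, {"keyboard", "laptop"}),      # min of J is 120, rejected
]
kept = [(i, j) for i, j in candidates if satisfies_having(i, j, prices)]
print(len(kept), "of", len(candidates), "candidates satisfy the constraints")
```

In practice such constraints are pushed into the mining process rather than applied as a filter afterwards, which is where the constraint types become relevant.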
Rule Constraints
Looks for rules of the form:
lives_in(C, _, "Chicago") ∧ sales(C, ?I1, S1) ∧ … ∧ sales(C, ?Ik, Sk) ∧ I = {I1, …, Ik} ∧ S = {S1, …, Sk}
⇒ sales(C, ?J1, T1) ∧ … ∧ sales(C, ?Jm, Tm) ∧ J = {J1, …, Jm} ∧ T = {T1, …, Tm}
Mines rules like
… ∧ sales(C, "MS/Office", _) ⇒ sales(C, "MS/SQLServer", _) [1.5%, 68%]
Types of Constraints
Anti-monotone: if an itemset does not satisfy the constraint, none of its supersets can satisfy it
min(J.price) >= 500
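The anti-monotone property can be checked by brute force on a small item set: for every itemset that violates the constraint, no superset may satisfy it. The sketch below uses a hypothetical price table and helper names. The average-price constraint is included only as a contrast that fails the check.

```python
from itertools import combinations

# Hypothetical item prices, for illustration only.
prices = {"a": 100, "b": 500, "c": 650, "d": 900}

def min_price_ok(itemset):
    """min(J.price) >= 500"""
    return min(prices[i] for i in itemset) >= 500

def avg_price_ok(itemset):
    """avg(J.price) >= 500, used here only as a contrast."""
    return sum(prices[i] for i in itemset) / len(itemset) >= 500

def is_anti_monotone(constraint, items):
    """True if no violating itemset has a superset that satisfies the constraint."""
    all_sets = [frozenset(c)
                for k in range(1, len(items) + 1)
                for c in combinations(items, k)]
    for s in all_sets:
        if not constraint(s):
            # s < t tests for a proper superset t of s.
            if any(constraint(t) for t in all_sets if s < t):
                return False
    return True

print(is_anti_monotone(min_price_ok, list(prices)))
print(is_anti_monotone(avg_price_ok, list(prices)))
```

With these prices, {a} fails the average constraint but {a, d} satisfies it, so a level-wise miner could not safely prune {a} on that constraint. Pruning on min(J.price) >= 500 is safe, because adding items can never raise the minimum.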
Convertible
Inconvertible
Sum(S) v where