Joint Entropy
Conditional Entropy
Mutual Information
Relative Entropy
JOINT ENTROPY

H(X,Y) = -E[\log p(X,Y)] = -\sum_{x,y} p(x,y) \log p(x,y)

Examples: find H(X,Y) for each table.

p(X,Y)   Y=0   Y=1
X=0      1/2   1/4
X=1       0    1/4

p(X,Y)   Y=0   Y=1
X=0      1/4   1/4
X=1      1/4   1/4
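A quick numeric check (a Python sketch; uses base-2 logs, and treats the blank X=1, Y=0 cell of the first table as 0 so the entries sum to 1):

```python
import math

def joint_entropy(p):
    """H(X,Y) = -sum_{x,y} p(x,y) log2 p(x,y), skipping zero cells."""
    return -sum(v * math.log2(v) for row in p for v in row if v > 0)

p1 = [[1/2, 1/4],
      [0.0, 1/4]]   # first table (blank cell taken as 0)
p2 = [[1/4, 1/4],
      [1/4, 1/4]]   # second (uniform) table

print(joint_entropy(p1))  # 1.5 bits
print(joint_entropy(p2))  # 2.0 bits
```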
Example: let Y = X^2, where X \in \{-1, +1\} with p_X = [0.1, 0.9]. Find p(X,Y).

p(X,Y)   Y=1
X=-1     0.1
X=1      0.9

(Y = 1 with probability 1, so the table has a single Y column.)
CONDITIONAL ENTROPY

H(Y|X) = -E[\log p(Y|X)] = -\sum_{x,y} p(x,y) \log p(y|x)

p(Y|X)   Y=0   Y=1
X=0      2/3   1/3
X=1       0     1
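The same quantity can be checked numerically (a sketch; it computes H(Y|X) = -Σ p(x,y) log2 p(y|x) directly from the joint table, with the X=1 row taken as (0, 1/4)):

```python
import math

def conditional_entropy(p_xy):
    """H(Y|X) = -sum_{x,y} p(x,y) log2 p(y|x)."""
    h = 0.0
    for row in p_xy:
        px = sum(row)            # marginal p(x)
        for v in row:
            if v > 0:
                h -= v * math.log2(v / px)
    return h

p_xy = [[1/2, 1/4],
        [0.0, 1/4]]
print(conditional_entropy(p_xy))  # ≈ 0.689 bits
```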
Example: as before, Y = X^2 with X \in \{-1, +1\}, p_X = [0.1, 0.9]. Find p(Y|X) and p(X|Y).

p(Y|X)   Y=1
X=-1      1
X=1       1

p(X|Y)   Y=1
X=-1     0.1
X=1      0.9
INTERPRETATIONS OF CONDITIONAL ENTROPY H(Y|X)

H(Y|X): the average uncertainty/information remaining in Y when you know X.
H(Y|X): the weighted average of row entropies,
H(Y|X) = \sum_x p(x) H(Y|X=x)

p(X,Y)   Y=0   Y=1   H(Y|X=x)   p(x)
X=0      1/2   1/4    H(1/3)    3/4
X=1       0    1/4     H(1)     1/4

H(Y|X) = (3/4) H(1/3) + (1/4) H(1)
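The weighted-average reading can be sketched directly (Python; H here is the binary entropy function, and the X=1 row of the table is taken as (0, 1/4)):

```python
import math

def binary_entropy(p):
    """H(p) for a Bernoulli(p) variable, in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

rows = [(1/2, 1/4),   # X=0 row of p(X,Y)
        (0.0, 1/4)]   # X=1 row

h = 0.0
for row in rows:
    px = sum(row)                          # p(x): 3/4, then 1/4
    h += px * binary_entropy(row[1] / px)  # p(x) * H(Y|X=x)

print(h)  # (3/4)H(1/3) + (1/4)H(1) ≈ 0.689
```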
CHAIN RULES

H(X,Y) = H(X) + H(Y|X) = H(Y) + H(X|Y)

Proof:

In general,
H(X_1, X_2, \ldots, X_n) = \sum_{i=1}^{n} H(X_i \mid X_{i-1}, X_{i-2}, \ldots, X_1)
Proof:
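One way to fill in the two-variable case, sketched via the factorization p(x,y) = p(x) p(y|x):

```latex
\begin{aligned}
H(X,Y) &= -\sum_{x,y} p(x,y)\log p(x,y)
        = -\sum_{x,y} p(x,y)\log\bigl[p(x)\,p(y\mid x)\bigr] \\
       &= -\sum_{x,y} p(x,y)\log p(x) \;-\; \sum_{x,y} p(x,y)\log p(y\mid x) \\
       &= H(X) + H(Y\mid X).
\end{aligned}
```

The general case follows by induction, factoring p(x_1, \ldots, x_n) one variable at a time.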
(Venn diagram: H(X) and H(Y) drawn as overlapping circles, with H(X|Y) and H(Y|X) as the non-overlapping parts.)

H(X,Y) = H(X) + H(Y|X) = H(Y) + H(X|Y)
MUTUAL INFORMATION

I(X;Y) = H(X) - H(X|Y) = H(X) + H(Y) - H(X,Y)

(Venn diagram: H(X,Y) is the total area of the two circles H(X) and H(Y); I(X;Y) is their overlap, H(X|Y) and H(Y|X) the non-overlapping parts.)
MUTUAL INFORMATION
I(X;Y) = I(Y;X)
Proof:
p(X,Y)   Y=0   Y=1
X=0      1/2   1/4
X=1       0    1/4

H(X) = 0.811      H(X|Y) = 0.5
H(Y) = 1          H(Y|X) = 0.689
H(X,Y) = 1.5      I(X;Y) = 0.311
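All six numbers on this slide can be reproduced from the joint table (a Python sketch; the blank X=1, Y=0 cell is taken as 0):

```python
import math

p = {(0, 0): 1/2, (0, 1): 1/4,
     (1, 0): 0.0, (1, 1): 1/4}

def H(dist):
    """Entropy in bits of a dict mapping outcomes to probabilities."""
    return -sum(v * math.log2(v) for v in dist.values() if v > 0)

px = {x: p[(x, 0)] + p[(x, 1)] for x in (0, 1)}   # marginal of X
py = {y: p[(0, y)] + p[(1, y)] for y in (0, 1)}   # marginal of Y

h_xy = H(p)                 # 1.5
h_x, h_y = H(px), H(py)     # 0.811..., 1.0
h_x_given_y = h_xy - h_y    # chain rule: 0.5
h_y_given_x = h_xy - h_x    # 0.689...
mi = h_x + h_y - h_xy       # I(X;Y) = 0.311...
```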
CHAIN RULE FOR MUTUAL INFORMATION

I(X_1, X_2, \ldots, X_n; Y) = \sum_{i=1}^{n} I(X_i; Y \mid X_{i-1}, X_{i-2}, \ldots, X_1)

Proof:
Example: let Y = X + Z, where X and Z are distributed as:

p(X,Z)   Z=0   Z=1
X=0      1/4   1/4
X=1      1/4   1/4

Find I(X,Z;Y).
JENSEN'S INEQUALITY

A function f is convex if, for all u, v \in (a, b) and 0 \le \lambda \le 1,
f(\lambda u + (1-\lambda)v) \le \lambda f(u) + (1-\lambda) f(v).
f is strictly convex if the inequality is strict whenever u \ne v and 0 < \lambda < 1.

Examples: x^2, e^x, x \log x (for x \ge 0).

If f is convex: E[f(X)] \ge f(E[X]).
If f is strictly convex: E[f(X)] = f(E[X]) only when X = E[X] with probability 1.
Proof:
Mnemonic example: f(x) = x^2 is convex, so E[X^2] \ge (E[X])^2, i.e. Var(X) = E[X^2] - (E[X])^2 \ge 0.
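A numeric sketch of the mnemonic (reusing the p_X = [0.1, 0.9] distribution on {-1, 1} from the earlier example):

```python
xs = [-1, 1]
ps = [0.1, 0.9]

e_x  = sum(p * x for p, x in zip(ps, xs))       # E[X]   = 0.8
e_x2 = sum(p * x * x for p, x in zip(ps, xs))   # E[X^2] = 1.0

# Jensen with the convex f(x) = x^2: E[X^2] >= (E[X])^2,
# and the gap is exactly Var(X) >= 0.
print(e_x2 - e_x ** 2)  # ≈ 0.36
```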
RELATIVE ENTROPY

The relative entropy, or Kullback-Leibler divergence, between two probability mass functions (vectors) p and q is defined as:

D(p \| q) = \sum_{x \in A} p(x) \log \frac{p(x)}{q(x)} = E_p\left[\log \frac{p(X)}{q(X)}\right] = -E_p[\log q(X)] - H(X)

Property of D(p \| q):
                              Cloudy   Sunny   ...
Weather at Seattle, p(x):      1/4      1/2    1/4
Weather at Corvallis, q(x):    1/3      1/3    1/3

Find D(p \| q) and D(q \| p).
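A sketch computing both divergences (base-2 logs; assumes the probabilities map to outcomes in the order shown):

```python
import math

def kl(p, q):
    """D(p||q) = sum p(x) log2(p(x)/q(x)); assumes q > 0 wherever p > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [1/4, 1/2, 1/4]   # Seattle
q = [1/3, 1/3, 1/3]   # Corvallis

print(kl(p, q))   # ≈ 0.085 bits
print(kl(q, p))   # ≈ 0.082 bits: D(p||q) != D(q||p) in general
```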
INFORMATION INEQUALITY
D(p \| q) \ge 0, with equality if and only if p = q.
Proof:
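One standard way to fill in the proof uses Jensen's inequality with the concave logarithm (a sketch; the sum runs over the support of p):

```latex
-D(p\|q) = \sum_x p(x)\log\frac{q(x)}{p(x)}
\;\le\; \log\sum_x p(x)\,\frac{q(x)}{p(x)}
\;=\; \log\sum_x q(x) \;\le\; \log 1 = 0,
```

so D(p \| q) \ge 0; equality requires q(x)/p(x) to be constant on the support of p, i.e. p = q.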
INFORMATION INEQUALITY
Corollary: I(X;Y) \ge 0, with equality if and only if X and Y are independent.
Proof:
INFORMATION INEQUALITY
Conditioning reduces entropy: H(X|Y) \le H(X)
Proof:

Independence bound: H(X_1, X_2, \ldots, X_n) \le \sum_{i=1}^{n} H(X_i)
Proof:
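The independence bound follows by combining the chain rule with "conditioning reduces entropy" (a sketch):

```latex
H(X_1,\dots,X_n)
= \sum_{i=1}^{n} H(X_i \mid X_{i-1},\dots,X_1)
\le \sum_{i=1}^{n} H(X_i).
```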
INFORMATION INEQUALITY
Conditional independence bound:
H(X_1, X_2, \ldots, X_n \mid Y_1, Y_2, \ldots, Y_n) \le \sum_{i=1}^{n} H(X_i \mid Y_i)
Proof:
SUMMARY