Decision Tree Induction: Training Dataset

This follows an example from Quinlan's ID3.

age       buys_computer
<=30      no
<=30      no
31..40    yes
>40       yes
>40       yes
>40       no
31..40    yes
<=30      no
<=30      yes
>40       yes
<=30      yes
31..40    yes
31..40    yes
>40       no
age?
  <=30: student?
    no: no
    yes: yes
  31..40: yes
  >40: credit rating?
    excellent: no
    fair: yes
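The tree above can be read as a chain of nested tests. The following is a minimal sketch of that reading, not code from the slides: the function name `buys_computer` and the string labels used for each attribute value are assumptions chosen for illustration.

```python
def buys_computer(age, student, credit_rating):
    """Classify one sample by walking the tree from the root.

    age:           "<=30", "31..40" or ">40"   (labels are assumptions)
    student:       "yes" or "no"
    credit_rating: "excellent" or "fair"
    """
    if age == "<=30":
        # Left branch: the outcome depends on student status.
        return "yes" if student == "yes" else "no"
    if age == "31..40":
        # Middle branch is a pure leaf.
        return "yes"
    if age == ">40":
        # Right branch: the outcome depends on credit rating.
        return "yes" if credit_rating == "fair" else "no"
    raise ValueError(f"unknown age bucket: {age!r}")


print(buys_computer("<=30", "yes", "fair"))       # a young student
print(buys_computer(">40", "no", "excellent"))    # older, excellent credit
```

Only the attributes on the path from the root to a leaf are consulted, so `credit_rating` is ignored for the two younger age buckets and `student` is ignored for the oldest one.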
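$$E(A) = \sum_{j=1}^{v} \frac{s_{1j} + \cdots + s_{mj}}{s} \, I(s_{1j}, \ldots, s_{mj})$$

Here attribute A has v distinct values, s_{ij} is the number of samples of class i in the j-th subset, and s is the total number of samples.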
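As a rough sketch of how E(A) might be computed, the snippet below takes one list of class counts per attribute value and returns the size-weighted average of their entropies. The names `info` and `expected_info` are hypothetical, and the toy counts are made up purely to exercise the function.

```python
from math import log2


def info(counts):
    """I(s_1, ..., s_m): entropy in bits of a list of class counts."""
    total = sum(counts)
    # Written as p * log2(1/p); empty classes are skipped (0 * log 0 := 0).
    return sum(c / total * log2(total / c) for c in counts if c > 0)


def expected_info(partitions):
    """E(A): info() of each subset, weighted by the subset's share of samples.

    partitions: one list of class counts per distinct value of attribute A.
    """
    s = sum(sum(counts) for counts in partitions)
    return sum(sum(counts) / s * info(counts) for counts in partitions)


# Toy split (made-up counts): one evenly mixed subset and one pure subset.
print(expected_info([[1, 1], [2, 0]]))  # 0.5
```

The mixed subset contributes 2/4 of one bit and the pure subset contributes nothing, which is why the toy split evaluates to 0.5.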
Attribute Selection by Information Gain Computation

age       p_i   n_i   I(p_i, n_i)
<=30      2     3     0.971
31..40    4     0     0
>40       3     2     0.971
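For reference, the I(p_i, n_i) column is the two-class entropy of each age bucket. A worked instance for the first row, using the usual convention that 0 log 0 is taken as 0, might look like this:

$$I(p_i, n_i) = -\frac{p_i}{p_i + n_i}\log_2\frac{p_i}{p_i + n_i} - \frac{n_i}{p_i + n_i}\log_2\frac{n_i}{p_i + n_i}$$

$$I(2, 3) = -\frac{2}{5}\log_2\frac{2}{5} - \frac{3}{5}\log_2\frac{3}{5} \approx 0.529 + 0.442 = 0.971$$

$$I(4, 0) = -\frac{4}{4}\log_2\frac{4}{4} - 0 = 0$$

Since the formula is symmetric in its two arguments, I(3, 2) equals I(2, 3).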
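$$E(\text{age}) = \frac{5}{14} I(2,3) + \frac{4}{14} I(4,0) + \frac{5}{14} I(3,2) = 0.694$$

The term (5/14) I(2,3) means that age <=30 covers 5 of the 14 samples, with 2 yes and 3 no. The full training set has 9 yes and 5 no, so I(9,5) = 0.940 and

$$\mathrm{Gain}(\text{age}) = I(9,5) - E(\text{age}) = 0.940 - 0.694 = 0.246$$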
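The E(age) computation can be checked directly against the two columns of the training table. This is a minimal sketch under the assumption that the age buckets are encoded as the strings shown; the helper name `info` is hypothetical.

```python
from collections import Counter
from math import log2

# The age and buys_computer columns, row by row.
age = ["<=30", "<=30", "31..40", ">40", ">40", ">40", "31..40",
       "<=30", "<=30", ">40", "<=30", "31..40", "31..40", ">40"]
buys = ["no", "no", "yes", "yes", "yes", "no", "yes",
        "no", "yes", "yes", "yes", "yes", "yes", "no"]


def info(labels):
    """Entropy in bits of a list of class labels, as sum of p * log2(1/p)."""
    n = len(labels)
    return sum(c / n * log2(n / c) for c in Counter(labels).values())


# Group the class labels by age bucket.
groups = {}
for a, b in zip(age, buys):
    groups.setdefault(a, []).append(b)

# E(age): entropy of each bucket, weighted by the bucket's share of samples.
e_age = sum(len(g) / len(buys) * info(g) for g in groups.values())
gain_age = info(buys) - e_age

for a, g in groups.items():
    print(a, g.count("yes"), g.count("no"), round(info(g), 3))
print("E(age) =", round(e_age, 3))
print("Gain(age) =", round(gain_age, 3))
```

At full precision the gain rounds to 0.247, whereas subtracting the already rounded figures 0.940 and 0.694 gives 0.246; the difference is only a rounding effect.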
Data Mining: Concepts and Techniques
Similarly,

$$\mathrm{Gain}(\text{income}) = 0.029, \quad \mathrm{Gain}(\text{student}) = 0.151, \quad \mathrm{Gain}(\text{credit\_rating}) = 0.048$$
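The sketch below shows how the gains for all attributes could be compared to choose the root split. One loud assumption: only the age and buys_computer columns appear in the text here, so the `income`, `student` and `credit_rating` columns are filled in from the commonly reproduced version of this textbook dataset and should be treated as assumed values. The helper names `info` and `gain` are hypothetical.

```python
from collections import Counter
from math import log2

data = {
    "age": ["<=30", "<=30", "31..40", ">40", ">40", ">40", "31..40",
            "<=30", "<=30", ">40", "<=30", "31..40", "31..40", ">40"],
    # ASSUMED columns (not listed in the text above):
    "income": ["high", "high", "high", "medium", "low", "low", "low",
               "medium", "low", "medium", "medium", "medium", "high", "medium"],
    "student": ["no", "no", "no", "no", "yes", "yes", "yes",
                "no", "yes", "yes", "yes", "no", "yes", "no"],
    "credit_rating": ["fair", "excellent", "fair", "fair", "fair", "excellent",
                      "excellent", "fair", "fair", "fair", "excellent",
                      "excellent", "fair", "excellent"],
}
buys = ["no", "no", "yes", "yes", "yes", "no", "yes",
        "no", "yes", "yes", "yes", "yes", "yes", "no"]


def info(labels):
    """Entropy in bits of a list of class labels."""
    n = len(labels)
    return sum(c / n * log2(n / c) for c in Counter(labels).values())


def gain(values, labels):
    """Information gain from splitting labels on an attribute's values."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    expected = sum(len(g) / len(labels) * info(g) for g in groups.values())
    return info(labels) - expected


gains = {name: gain(column, buys) for name, column in data.items()}
for name, g in sorted(gains.items(), key=lambda kv: -kv[1]):
    print(f"Gain({name}) = {g:.3f}")
print("split on:", max(gains, key=gains.get))
```

Under these assumed columns, age has the largest gain, which matches age being the attribute tested at the root of the tree. The printed values may differ from the quoted ones in the last digit because of rounding.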