Tutorial 10 Exercises
Q3 A classification problem involves four attributes plus a class. The attributes have 3, 2, 2,
and 3 possible values each. The class has 2 possible values. How many different
examples are possible?
A. 10
B. 24
C. 36
D. 72
E. 96
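The count of distinct examples is the product of the sizes of each attribute's value set and of the class's value set; a minimal Python sanity check:

```python
from math import prod

# Value counts for the four attributes, plus 2 possible class values
attribute_value_counts = [3, 2, 2, 3]
n_examples = prod(attribute_value_counts) * 2
print(n_examples)  # 72
```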
A. 4
B. 5
C. 2
D. 3
Use the three-class confusion matrix below to answer questions (5) through (9).

                 Computed Decision
  Actual     Class 1   Class 2   Class 3
  Class 1      10         5         3
  Class 2       5        15         5
  Class 3       2         4        11
Q10 Which class instances were classified with the least error rate?
A. Class 1
B. Class 2
C. Class 3
D. Class 1 and Class 2
Answers to MCQs
Q1 – B
Q2 – C
Q3 – D
Q4 - C
Q5 – C
Q6 - D
Q7 - C
Q8 - B
Q9 - A
Q10 - C
Explanations to Q6-Q10.
Q6 Answer = 5 + 15 + 5 = 25 (the total number of actual Class 2 instances, i.e. the Class 2 row sum)
Q10 Which class instances were classified with the least error rate?
Error Rate for Class 1 = (5 + 3) / 18 = 8 / 18 = 0.444 = 44.4%
Error Rate for Class 2 = (5 + 5) / 25 = 10 / 25 = 0.400 = 40.0%
Error Rate for Class 3 = (2 + 4) / 17 = 6 / 17 = 0.353 = 35.3%
Therefore, Class 3 instances were classified with the least error rate.
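The per-class error rates above can be reproduced directly from the confusion matrix; a minimal Python sketch, taking rows as actual classes and columns as computed decisions as in the matrix above:

```python
# Confusion matrix: rows = actual class, columns = computed decision
confusion = [
    [10, 5, 3],   # actual Class 1
    [5, 15, 5],   # actual Class 2
    [2, 4, 11],   # actual Class 3
]

rates = []
for i, row in enumerate(confusion):
    errors = sum(row) - row[i]        # off-diagonal entries of the row
    rates.append(errors / sum(row))
    print(f"Class {i + 1} error rate: {rates[-1]:.1%}")
# Class 1: 44.4%, Class 2: 40.0%, Class 3: 35.3%
```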
a)  b)  b)  b)  b)
EX2: Determine the root node of the decision tree for the following
data set. Show your calculations.
Day   Hour    Weather   Accident   Stall   Commute Time
D1    8 AM    Sunny     No         No      Long
D3    10 AM   Sunny     No         No      Short
D6    10 AM   Sunny     No         No      Short
D7    10 AM   Cloudy    No         No      Short
D8    9 AM    Rainy     No         No      Medium
H(T) = - (7/13) log2 (7/13) - (4/13) log2 (4/13) - (2/13) log2 (2/13) = 1.4196
(over all 13 examples: 7 Long, 4 Short, 2 Medium)
Given our commute time sample set, we can calculate the entropy of each attribute at the root
node
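The entropy of a class distribution can be checked with a small helper; a minimal Python sketch, where the 7 Long / 4 Short / 2 Medium totals follow from the per-attribute counts listed below:

```python
from math import log2

def entropy(counts):
    """Entropy (in bits) of a class distribution given as raw counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

# Class totals over all 13 examples: 7 Long, 4 Short, 2 Medium
h_t = entropy([7, 4, 2])
print(round(h_t, 4))  # 1.4196
```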
• Gain(T, Hour)= ?
– 8am=3 (3 Long, 0 Short, 0 Medium)
– 9am=5 (3 Long, 0 Short, 2 Medium)
– 10am=5 (1 Long, 4 Short, 0 Medium)
– Entropy(T8am) = - (3/3) log2 (3/3) = 0
– Entropy(T9am) = - (3/5) log2 (3/5) - (2/5) log2 (2/5)
– Entropy(T10am ) = - (1/5) log2 (1/5) - (4/5) log2 (4/5)
– Gain(T, Hour) = Entropy(T) - (P(8am) Entropy(T8am) +
P(9am) Entropy(T9am) + P(10am) Entropy(T10am))
= Entropy(T) - ((3/13) Entropy(T8am) + (5/13) Entropy(T9am) +
(5/13) Entropy(T10am))
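The Hour gain above can be evaluated numerically; a minimal Python sketch (the entropy helper is defined inline so the block is self-contained):

```python
from math import log2

def entropy(counts):
    """Entropy (in bits) of a class distribution given as raw counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

# (Long, Short, Medium) counts in each Hour branch, from the lists above
branches = [[3, 0, 0], [3, 0, 2], [1, 4, 0]]   # 8am, 9am, 10am
n = sum(sum(b) for b in branches)              # 13 examples in total

remainder = sum(sum(b) / n * entropy(b) for b in branches)
gain_hour = entropy([7, 4, 2]) - remainder
print(round(gain_hour, 4))  # 0.7684
```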
• Gain(T, Weather)= ?
– Sunny=6 (3 Long, 2 Short, 1 Medium)
– Cloudy=4 (3 Long, 1 Short, 0 Medium)
– Rainy=3 (1 Long, 1 Short, 1 Medium)
– Entropy(Tsunny) = - (3/6) log2 (3/6) - (2/6) log2 (2/6) - (1/6) log2 (1/6)
– Entropy(Tcloudy) = - (3/4) log2 (3/4) - (1/4) log2 (1/4)
– Entropy(Trainy ) = - (1/3) log2 (1/3) - (1/3) log2 (1/3) - (1/3) log2 (1/3)
– Gain(T, Weather) = Entropy(T) - (P(Sunny) Entropy(Tsunny) +
P(Cloudy) Entropy(Tcloudy) + P(Rainy) Entropy(Trainy))
= Entropy(T) - ((6/13) Entropy(Tsunny) + (4/13) Entropy(Tcloudy) +
(3/13) Entropy(Trainy))
• Gain(T, accident)= ?
– Yes=5 (5 Long, 0 Short, 0 Medium)
– No=8 (2 Long, 4 Short, 2 Medium)
– Entropy(Tyes) = - (5/5) log2 (5/5) = 0
– Entropy(Tno) = - (2/8) log2 (2/8) - (4/8) log2 (4/8) - (2/8) log2 (2/8)
– Gain(T, accident) = Entropy(T) - ((5/13) Entropy(Tyes) + (8/13) Entropy(Tno))
• Gain(T, stall)= ?
– Yes=3 (3 Long, 0 Short, 0 Medium)
– No=10 (4 Long, 4 Short, 2 Medium)
– Entropy(Tyes) = - (3/3) log2 (3/3) = 0
– Entropy(Tno) = - (4/10) log2 (4/10) - (4/10) log2 (4/10) - (2/10) log2 (2/10)
– Gain(T, stall) = Entropy(T) - (P(yes) Entropy(Tyes) + P(no) Entropy(Tno))
= Entropy(T) - ((3/13) Entropy(Tyes) + (10/13) Entropy(Tno))
Attribute   Gain
Hour        0.768449
Weather     0.130719
Accident    0.496479
Stall       0.248842

Hour has the highest information gain, so Hour becomes the root node of the decision tree.
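All four gains can be recomputed from the value counts above; a minimal Python sketch. The Accident and Stall "No" branch counts are taken as (2 Long, 4 Short, 2 Medium) and (4 Long, 4 Short, 2 Medium) respectively, so that each branch sums to its stated size and the class totals match 7 Long, 4 Short, 2 Medium:

```python
from math import log2

def entropy(counts):
    """Entropy (in bits) of a class distribution given as raw counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

def gain(total_counts, branches):
    """Information gain of a split into the given branch distributions."""
    n = sum(total_counts)
    remainder = sum(sum(b) / n * entropy(b) for b in branches)
    return entropy(total_counts) - remainder

total = [7, 4, 2]   # Long, Short, Medium over all 13 examples
branches_by_attribute = {
    "Hour":     [[3, 0, 0], [3, 0, 2], [1, 4, 0]],   # 8am, 9am, 10am
    "Weather":  [[3, 2, 1], [3, 1, 0], [1, 1, 1]],   # Sunny, Cloudy, Rainy
    "Accident": [[5, 0, 0], [2, 4, 2]],              # Yes, No
    "Stall":    [[3, 0, 0], [4, 4, 2]],              # Yes, No
}
gains = {a: gain(total, b) for a, b in branches_by_attribute.items()}
root = max(gains, key=gains.get)
print(root)  # Hour
```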
Calculate the accuracy of the above decision tree model using the test data
given below.
TEST DATA
OUTLOOK   TEMP   HUMIDITY   WINDY   PLAY
Sunny     Hot    High       False   No
Sunny     Hot    High       True    No
Sunny     Mild   High       False   Yes
Sunny     Cool   Normal     True    Yes
Sunny     Mild   Normal     True    No
Rainy     Hot    High       False   No
Rainy     Hot    High       True    No
Rainy     Mild   High       False   Yes
Rainy     Cool   Normal     True    Yes
Rainy     Mild   Normal     True    No
Total No. of tests = 10
No. of correct classifications = 6
Accuracy = 6 / 10 = 60%