
Classification Modelling

Comparison and Intuition


Logistic Regression vs Decision Tree
Splitting in Decision Trees
Splitting Criteria

• Information Gain
• Gain Ratio
• Gini Index
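
For reference, the standard definitions of the three criteria above, using textbook notation: E(S) is the entropy of an example set S (defined on a later slide), S_v is the subset of S for which attribute A takes the value v, and p_i is the proportion of class i in S.

Gain(S, A) = E(S) - Σ_v (|S_v| / |S|) E(S_v)

GainRatio(S, A) = Gain(S, A) / SplitInfo(S, A),
where SplitInfo(S, A) = -Σ_v (|S_v| / |S|) log2(|S_v| / |S|)

Gini(S) = 1 - Σ_i p_i^2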
Selection Criteria
• Selection of an attribute to test at each node: choosing the attribute that is most useful for classifying the examples.

• Information gain measures how well a given attribute separates the training examples according to their target classification.
• This measure is used to select among the candidate attributes at each step while growing the tree.
Entropy

• A measure of the homogeneity of a set of examples.

• Given a set S of positive and negative examples of some target concept (a 2-class problem), the entropy of S relative to this binary classification is

E(S) = -p(P) log2 p(P) - p(N) log2 p(N)

where p(P) and p(N) are the proportions of positive and negative examples in S.
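
A minimal Python sketch of this formula (the function name and structure are mine, not from the slides); a class with zero probability contributes zero to the sum:

```python
import math

def entropy(pos, neg):
    # E(S) = -p(P) log2 p(P) - p(N) log2 p(N); a class with zero
    # examples contributes 0 (the limit of p * log2 p as p -> 0).
    total = pos + neg
    return -sum(c / total * math.log2(c / total) for c in (pos, neg) if c)

print(entropy(9, 5))  # ~0.940 for the 14-example weather data below
```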


Example
Outlook   Temperature  Humidity  Windy  Play
overcast  hot          high      FALSE  yes
overcast  cool         normal    TRUE   yes
overcast  mild         high      TRUE   yes
overcast  hot          normal    FALSE  yes
rainy     mild         high      FALSE  yes
rainy     cool         normal    FALSE  yes
rainy     cool         normal    TRUE   no
rainy     mild         normal    FALSE  yes
rainy     mild         high      TRUE   no
sunny     hot          high      FALSE  no
sunny     hot          high      TRUE   no
sunny     mild         high      FALSE  no
sunny     cool         normal    FALSE  yes
sunny     mild         normal    TRUE   yes
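
A worked sketch (my own code, not from the slides) that computes the information gain of the Outlook attribute, with the class counts read directly from the 14 rows above:

```python
import math

def entropy(pos, neg):
    total = pos + neg
    return -sum(c / total * math.log2(c / total) for c in (pos, neg) if c)

# Class counts per Outlook value, read from the table:
# overcast: 4 yes / 0 no, rainy: 3 yes / 2 no, sunny: 2 yes / 3 no.
splits = {"overcast": (4, 0), "rainy": (3, 2), "sunny": (2, 3)}

gain = entropy(9, 5)  # entropy of the full set, ~0.940
for pos, neg in splits.values():
    gain -= (pos + neg) / 14 * entropy(pos, neg)

print(round(gain, 3))  # ~0.247: Outlook separates the classes well
```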
KNN: Purpose
Boundary for Each Observation
KNN: Decision Boundary
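
A minimal sketch of the KNN principle (hypothetical plain-Python code, not from the slides): each query point takes the majority label of its k nearest training points, and the decision boundary is the set of points where that vote flips.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    # train: list of ((x, y), label) pairs; classify `query` by majority
    # vote among its k nearest neighbours (Euclidean distance).
    nearest = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [((1, 1), "no"), ((1, 2), "no"),
         ((4, 4), "yes"), ((5, 4), "yes"), ((4, 5), "yes")]
print(knn_predict(train, (3, 3), k=3))  # "yes": 2 of the 3 nearest are "yes"
```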
SVM: Intuition for the Decision Boundary
SVM: Principle
SVM: Optimization
Non-linearly Separable Decision Boundary
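
A minimal sketch of these SVM ideas, assuming scikit-learn is available (the toy data is mine): kernel="linear" fits the maximum-margin hyperplane, while kernel="rbf" uses the kernel trick to handle the non-linearly separable case from the last slide.

```python
from sklearn import svm

# Toy 2-D data: class 0 clustered near the origin, class 1 in a ring
# around it, so no straight line separates the two classes.
X = [(0, 0), (0.3, 0.2), (-0.2, 0.1),
     (2, 0), (0, 2), (-2, 0), (0, -2), (1.5, 1.5)]
y = [0, 0, 0, 1, 1, 1, 1, 1]

linear = svm.SVC(kernel="linear").fit(X, y)  # maximum-margin hyperplane
rbf = svm.SVC(kernel="rbf").fit(X, y)        # kernel trick: curved boundary

print(rbf.predict([(0.1, -0.1), (1.8, 0.5)]))  # expected: [0 1]
```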
