
1

Information Gain
Which test is more informative?
[Figure: two candidate splits of the same applicant pool. One splits over whether Balance exceeds 50K (branches: over 50K / 50K or less); the other splits over whether the applicant is Employed (branches: employed / unemployed).]
2
Information Gain
Impurity/Entropy (informal)
Measures the level of impurity in a group of examples
3
Impurity
[Figure: three example groups of decreasing impurity: a very impure group, a less impure group, and a group with minimum impurity (all one class).]
4
Entropy: a common way to measure impurity
Entropy = -Σ_i p_i log2(p_i)

p_i is the probability of class i. Compute it as the proportion of class i in the set.

Entropy comes from information theory. The higher the entropy, the more the information content.
What does that mean for learning from examples?
16/30 are green circles; 14/30 are pink crosses.
log2(16/30) = -0.9; log2(14/30) = -1.1
Entropy = -(16/30)(-0.9) - (14/30)(-1.1) = 0.99
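
The entropy of a group can be computed directly from the class proportions. Below is a minimal Python sketch (not from the slides; the function name entropy and the label strings are illustrative) that reproduces the 0.99 value for the 16-circle / 14-cross group:

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy of a group: -sum_i p_i * log2(p_i), where p_i is the
    proportion of class i among the labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

# The 30-example group from the slide: 16 green circles, 14 pink crosses.
group = ["circle"] * 16 + ["cross"] * 14
print(round(entropy(group), 3))   # 0.997 (the slide shows 0.99)
```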
5
2-Class Cases:
What is the entropy of a group in which all examples belong to the same class?
entropy = -1 log2(1) = 0
Minimum impurity: not a good training set for learning.

What is the entropy of a group with 50% in either class?
entropy = -0.5 log2(0.5) - 0.5 log2(0.5) = 1
Maximum impurity: good training set for learning.
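
The two boundary cases can be checked with one line of arithmetic each; a quick Python sketch (plain math, no helpers, assuming the two-class setting above):

```python
import math

# Pure group: one class with probability 1 -> zero entropy (minimum impurity).
pure = -(1.0 * math.log2(1.0))                          # 0.0 (may print as -0.0)

# Evenly mixed group: 50% / 50% -> one bit of entropy (maximum impurity).
mixed = -0.5 * math.log2(0.5) - 0.5 * math.log2(0.5)    # 1.0

print(pure, mixed)
```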
6
Information Gain
We want to determine which attribute in a given set of training feature vectors is most useful for discriminating between the classes to be learned.
Information gain tells us how important a given attribute of the feature vectors is.
We will use it to decide the ordering of attributes in the nodes of a decision tree.
7
Calculating Information Gain
Information Gain = entropy(parent) - [weighted average entropy(children)]

Entire population (30 instances), split into two children of 17 and 13 instances.

Parent entropy (16 vs. 14 instances):
-(14/30) log2(14/30) - (16/30) log2(16/30) = 0.996

Child 1 entropy (17 instances, 13 vs. 4):
-(13/17) log2(13/17) - (4/17) log2(4/17) = 0.787

Child 2 entropy (13 instances, 12 vs. 1):
-(1/13) log2(1/13) - (12/13) log2(12/13) = 0.391

(Weighted) Average Entropy of Children = (17/30)(0.787) + (13/30)(0.391) = 0.615

Information Gain = 0.996 - 0.615 = 0.38 for this split
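
The same arithmetic can be reproduced in a few lines of Python. This is a cross-check sketch (the helper name entropy2 is illustrative, not from the slides), assuming the 16/14 parent and the 13/4 and 12/1 children shown above:

```python
import math

def entropy2(p):
    """Entropy of a 2-class group given the proportion p of one class."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

parent = entropy2(16 / 30)                              # 0.9968 (slide shows 0.996)
child1 = entropy2(13 / 17)                              # 0.787
child2 = entropy2(12 / 13)                              # 0.391

# Weight each child's entropy by its share of the parent's instances.
avg_children = (17 / 30) * child1 + (13 / 30) * child2  # 0.6156 (slide shows 0.615)

gain = parent - avg_children
print(round(gain, 2))                                   # 0.38
```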
8
Entropy-Based Automatic
Decision Tree Construction
[Figure: Node 1, the root of the tree under construction, annotated with the questions: what feature should be used? what values?]

Training Set S:
x1 = (f11, f12, ..., f1m)
x2 = (f21, f22, ..., f2m)
...
xn = (fn1, fn2, ..., fnm)
Quinlan suggested information gain in his ID3 system and later the gain ratio, both based on entropy.
9
Using Information Gain to Construct a
Decision Tree
[Figure: the root node holds the full training set S and is split on attribute A, with one branch per value v1, v2, ..., vk; the child for value v1 receives the subset S' = {s ∈ S | value(A) = v1}.]

1. Choose the attribute A with the highest information gain for the full training set at the root of the tree.
2. Construct child nodes for each value of A. Each has an associated subset of vectors in which A has a particular value.
3. Repeat recursively on each child node (till when?).
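
A hedged sketch of this recursive procedure in Python follows. It is an ID3-style outline, not code from the slides: the dict-based tree representation and the stopping rules (stop when a node is pure, or when no attributes remain and the majority class is taken) are assumptions made to answer the "till when?" question concretely.

```python
import math
from collections import Counter

def entropy(labels):
    """-sum_i p_i log2(p_i) over the class proportions in labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    """Information gain of splitting the examples on attribute attr."""
    n = len(labels)
    weighted = 0.0
    for v in set(r[attr] for r in rows):
        subset = [lab for r, lab in zip(rows, labels) if r[attr] == v]
        weighted += (len(subset) / n) * entropy(subset)
    return entropy(labels) - weighted

def build_tree(rows, labels, attrs):
    """ID3-style construction: split on the highest-gain attribute, recurse."""
    if len(set(labels)) == 1:
        return labels[0]                                # pure node -> leaf
    if not attrs:
        return Counter(labels).most_common(1)[0][0]     # no attributes left -> majority leaf
    best = max(attrs, key=lambda a: info_gain(rows, labels, a))
    tree = {"attribute": best, "children": {}}
    for v in set(r[best] for r in rows):
        sub_rows = [r for r in rows if r[best] == v]
        sub_labels = [lab for r, lab in zip(rows, labels) if r[best] == v]
        remaining = [a for a in attrs if a != best]
        tree["children"][v] = build_tree(sub_rows, sub_labels, remaining)
    return tree
```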
10
Simple Example
Training Set: 3 features (X, Y, Z) and 2 classes (I, II)

X Y Z C
1 1 1 I
1 1 0 I
0 0 1 II
1 0 0 II

How would you distinguish class I from class II?
11
X Y Z C
1 1 1 I
1 1 0 I
0 0 1 II
1 0 0 II

Split on attribute X

E_parent = 1 (classes: I, I, II, II)

X=1 branch: {I, I, II}
E_child1 = -(1/3) log2(1/3) - (2/3) log2(2/3) = 0.5284 + 0.39 = 0.9184

X=0 branch: {II}
E_child2 = 0

GAIN = 1 - (3/4)(0.9184) - (1/4)(0) = 0.3112

If X is the best attribute, this node would be further split.
12
X Y Z C
1 1 1 I
1 1 0 I
0 0 1 II
1 0 0 II

Split on attribute Y

E_parent = 1 (classes: I, I, II, II)

Y=1 branch: {I, I}
E_child1 = 0

Y=0 branch: {II, II}
E_child2 = 0

GAIN = 1 - (1/2)(0) - (1/2)(0) = 1; BEST ONE
13
X Y Z C
1 1 1 I
1 1 0 I
0 0 1 II
1 0 0 II

Split on attribute Z

E_parent = 1 (classes: I, I, II, II)

Z=1 branch: {I, II}
E_child1 = 1

Z=0 branch: {I, II}
E_child2 = 1

GAIN = 1 - (1/2)(1) - (1/2)(1) = 0, i.e. NO GAIN; WORST
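
To tie the three splits together, here is a short Python check (the helper names entropy and gain are illustrative, not from the slides) that recomputes all three gains on the 4-example training set and confirms the ranking Y > X > Z:

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain(rows, labels, attr):
    n = len(labels)
    weighted = 0.0
    for v in set(r[attr] for r in rows):
        subset = [lab for r, lab in zip(rows, labels) if r[attr] == v]
        weighted += (len(subset) / n) * entropy(subset)
    return entropy(labels) - weighted

# The training set from slide 10: features X, Y, Z and classes I, II.
rows = [
    {"X": 1, "Y": 1, "Z": 1},
    {"X": 1, "Y": 1, "Z": 0},
    {"X": 0, "Y": 0, "Z": 1},
    {"X": 1, "Y": 0, "Z": 0},
]
labels = ["I", "I", "II", "II"]

for attr in ("X", "Y", "Z"):
    print(attr, round(gain(rows, labels, attr), 4))
# X 0.3113 (slide rounds to 0.3112), Y 1.0 (best), Z 0.0 (no gain)
```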
