
CS 4104

APPLIED MACHINE LEARNING

Dr. Hashim Yasin


National University of Computer
and Emerging Sciences,
Faisalabad, Pakistan.
LINEAR REGRESSION
Linear Regression with one Variable

Housing Prices (Portland, OR)

[Figure: scatter plot of Price (in $1000s) vs. Size (feet²)]

Supervised learning, regression problem: given the "right answer" for each example in the data, predict a real-valued output.
Dr. Hashim Yasin Applied Machine Learning (CS4104)
Regression Example

Training set of housing prices:

  Size in feet² (x) | Price ($) in 1000's (y)
  2104              | 460
  1416              | 232
  1534              | 315
  852               | 178
  …                 | …

Notation:
  m = number of training examples
  x = "input" variable / features
  y = "output" variable / "target" variable
  (x, y) — one training example
  (x^(i), y^(i)) — the i-th training example



Regression Example

Training set of housing prices (same table as above).

Hypothesis: h_θ(x) = θ₀ + θ₁x
The θ's are the parameters.
How do we choose the θ's?
Linear Regression with one Variable

Simplified model (θ₀ = 0):

  Hypothesis: h_θ(x) = θ₁x
  Parameter: θ₁
  Cost function: J(θ₁) = (1/2m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i))²
  Goal: minimize J(θ₁) over θ₁


Regression

One Parameter (θ₁):

With h_θ(x) = θ₁x, set the derivative of the cost to zero (the constant factor 1/2m is dropped, since it does not affect where the derivative vanishes):

  ∂J(θ₁)/∂θ₁ = ∂/∂θ₁ Σ_{i=1}^m (h_θ(x^(i)) − y^(i))²
             = ∂/∂θ₁ Σ_{i=1}^m (θ₁x^(i) − y^(i))²
             = 2 Σ_{i=1}^m (θ₁x^(i) − y^(i)) · ∂/∂θ₁ (θ₁x^(i) − y^(i))
             = 2 Σ_{i=1}^m (θ₁x^(i) − y^(i)) x^(i) = 0

Hence

  θ₁ Σ_{i=1}^m (x^(i))² − Σ_{i=1}^m x^(i)y^(i) = 0
  θ₁ Σ_{i=1}^m (x^(i))² = Σ_{i=1}^m x^(i)y^(i)

  θ₁ = Σ_{i=1}^m x^(i)y^(i) / Σ_{i=1}^m (x^(i))²

If mean(X) = mean(Y) = 0, this is θ₁ = covar(X, Y) / var(X).

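The closed form θ₁ = Σ x^(i)y^(i) / Σ (x^(i))² can be checked numerically; a minimal sketch (NumPy; the data points are illustrative, not from the slides):

```python
import numpy as np

# Illustrative data for the no-intercept model h(x) = theta1 * x
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])

# Closed form from the derivation: theta1 = sum(x*y) / sum(x^2)
theta1 = np.sum(x * y) / np.sum(x ** 2)
print(theta1)   # ≈ 1.99
```

Because the model has no intercept, a single division recovers the slope.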


LINEAR REGRESSION WITH
MULTIPLE VARIABLES

Multivariate Regression

  Hypothesis: h_θ(x) = θᵀx = θ₀x₀ + θ₁x₁ + ⋯ + θₙxₙ, with x₀ = 1
  Parameters: θ = (θ₀, θ₁, …, θₙ)ᵀ
  Cost function: J(θ) = (1/2m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i))²

  Gradient descent:
    Repeat {
      θⱼ := θⱼ − α ∂/∂θⱼ J(θ)
    } (simultaneously update for every j = 0, …, n)


Gradient Descent

Previously (n = 1):
  Repeat {
    θ₀ := θ₀ − α (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i))
    θ₁ := θ₁ − α (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i)) x^(i)
  } (simultaneously update θ₀, θ₁)

New algorithm (n ≥ 1):
  Repeat {
    θⱼ := θⱼ − α (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i)) xⱼ^(i)
  } (simultaneously update θⱼ for every j = 0, …, n)
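The multivariate update rule can be written in vectorized form; a minimal sketch (NumPy; the synthetic data, learning rate, and iteration count are my own assumptions, not from the slides):

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, iters=2000):
    """Batch gradient descent for linear regression.
    X is the m x (n+1) design matrix (first column all ones)."""
    m, n1 = X.shape
    theta = np.zeros(n1)
    for _ in range(iters):
        # One gradient for all theta_j at once: this IS the simultaneous update
        grad = (X.T @ (X @ theta - y)) / m
        theta = theta - alpha * grad
    return theta

# Synthetic check: data generated by y = 1 + 2*x1
x1 = np.linspace(0.0, 1.0, 50)
X = np.column_stack([np.ones_like(x1), x1])
y = 1.0 + 2.0 * x1
theta = gradient_descent(X, y)
print(theta)   # approaches [1., 2.]
```

The vectorized gradient `X.T @ (X @ theta - y) / m` computes every partial derivative from the current θ before any component is overwritten.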


Multivariate Regression

Two Parameters (θ₀, θ₁):

  ∂J(θ₀)/∂θ₀ = ∂/∂θ₀ (1/2m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i))²
             = ∂/∂θ₀ (1/2m) Σ_{i=1}^m (θ₀ + θ₁x^(i) − y^(i))²
             = 2 · (1/2m) Σ_{i=1}^m (θ₀ + θ₁x^(i) − y^(i)) · ∂/∂θ₀ (θ₀ + θ₁x^(i) − y^(i))
             = (1/m) Σ_{i=1}^m (θ₀ + θ₁x^(i) − y^(i)) = 0

so

  (1/m) Σ_{i=1}^m θ₀ + (1/m) Σ_{i=1}^m θ₁x^(i) = (1/m) Σ_{i=1}^m y^(i)

With x̄ = (1/m) Σ_{i=1}^m x^(i) and ȳ = (1/m) Σ_{i=1}^m y^(i):

  θ₀ = ȳ − θ₁x̄



Multivariate Regression

Two Parameters (θ₀, θ₁):

  ∂J(θ₁)/∂θ₁ = ∂/∂θ₁ (1/2m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i))²
             = ∂/∂θ₁ (1/2m) Σ_{i=1}^m (θ₀ + θ₁x^(i) − y^(i))²
             = 2 · (1/2m) Σ_{i=1}^m (θ₀ + θ₁x^(i) − y^(i)) · ∂/∂θ₁ (θ₀ + θ₁x^(i) − y^(i))
             = (1/m) Σ_{i=1}^m (θ₀ + θ₁x^(i) − y^(i)) x^(i) = 0



Multivariate Regression

Two Parameters (θ₀, θ₁):

Start from the condition on θ₁ and substitute θ₀ = ȳ − θ₁x̄:

  (1/m) Σ_{i=1}^m (θ₀ + θ₁x^(i) − y^(i)) x^(i) = 0

  (1/m) Σ_{i=1}^m θ₀x^(i) + (1/m) Σ_{i=1}^m θ₁(x^(i))² = (1/m) Σ_{i=1}^m x^(i)y^(i)

  Σ_{i=1}^m θ₁(x^(i))² + Σ_{i=1}^m (ȳ − θ₁x̄) x^(i) = Σ_{i=1}^m x^(i)y^(i)

  Σ_{i=1}^m θ₁(x^(i))² + Σ_{i=1}^m ȳx^(i) − Σ_{i=1}^m θ₁x̄x^(i) = Σ_{i=1}^m x^(i)y^(i)

  θ₁ Σ_{i=1}^m x^(i)(x^(i) − x̄) = Σ_{i=1}^m x^(i)(y^(i) − ȳ)
Multivariate Regression

Two Parameters (θ₀, θ₁):

  θ₁ Σ_{i=1}^m x^(i)(x^(i) − x̄) = Σ_{i=1}^m x^(i)(y^(i) − ȳ)

  θ₁ = Σ_{i=1}^m x^(i)(y^(i) − ȳ) / Σ_{i=1}^m x^(i)(x^(i) − x̄)

Since Σ_{i=1}^m x^(i)ȳ = m·x̄ȳ and Σ_{i=1}^m x^(i)x̄ = m·x̄²:

  θ₁ = (Σ_{i=1}^m x^(i)y^(i) − m·x̄ȳ) / (Σ_{i=1}^m (x^(i))² − m·x̄²)

where x̄ = (1/m) Σ_{i=1}^m x^(i) and ȳ = (1/m) Σ_{i=1}^m y^(i).
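A small numerical check of the two-parameter closed form, compared against a standard least-squares fit (NumPy; the data points are illustrative, my own):

```python
import numpy as np

# Illustrative data (not from the slides)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])
m = len(x)
xbar, ybar = x.mean(), y.mean()

# theta1 = (sum x*y - m*xbar*ybar) / (sum x^2 - m*xbar^2), theta0 = ybar - theta1*xbar
theta1 = (np.sum(x * y) - m * xbar * ybar) / (np.sum(x ** 2) - m * xbar ** 2)
theta0 = ybar - theta1 * xbar

# Cross-check: np.polyfit returns [slope, intercept] for degree 1
ref = np.polyfit(x, y, 1)
print(theta0, theta1)
```

The closed form and the library fit agree to numerical precision, since both minimize the same squared error.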
Multivariate Regression

Two Parameters (θ₀, θ₁):

  θ₀ = ȳ − θ₁x̄
  θ₁ = (Σ_{i=1}^m x^(i)y^(i) − m·x̄ȳ) / (Σ_{i=1}^m (x^(i))² − m·x̄²)

These equations can be summarized by the following matrix equation (also known as the normal equation):

  [ m         Σ x^(i)    ] [θ₀]   [ Σ y^(i)      ]
  [ Σ x^(i)   Σ (x^(i))² ] [θ₁] = [ Σ x^(i)y^(i) ]



Multivariate Regression

Two Parameters (θ₀, θ₁):

  [ m         Σ x^(i)    ] [θ₀]   [ Σ y^(i)      ]
  [ Σ x^(i)   Σ (x^(i))² ] [θ₁] = [ Σ x^(i)y^(i) ]

This equation can be written in more compact form. Suppose X = (𝟏 x), where 𝟏 = (1, 1, …, 1)ᵀ and x = (x^(1), x^(2), …, x^(m))ᵀ. Then

  XᵀX = [ 𝟏ᵀ𝟏  𝟏ᵀx ]  =  [ m         Σ x^(i)    ]
        [ xᵀ𝟏  xᵀx ]     [ Σ x^(i)   Σ (x^(i))² ]

  Xᵀy = (𝟏 x)ᵀ y = [ 𝟏ᵀy ]  =  [ Σ y^(i)      ]
                   [ xᵀy ]     [ Σ x^(i)y^(i) ]
Multivariate Regression

  XᵀXθ = Xᵀy, where θ = (θ₀, θ₁)ᵀ

  θ = (XᵀX)⁻¹Xᵀy

 This equation extends directly from univariate linear regression to multivariate regression.
 If the attribute set consists of n attributes (x₁, x₂, x₃, …, xₙ), X becomes an m × (n+1) design matrix (n attribute columns plus the column of ones).

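θ = (XᵀX)⁻¹Xᵀy can be computed directly; a sketch (NumPy; the data is my own illustration, and in practice one solves the linear system rather than forming the explicit inverse):

```python
import numpy as np

def normal_equation(X, y):
    """Solve X^T X theta = X^T y for theta.
    Uses solve() rather than forming (X^T X)^{-1} explicitly,
    which is cheaper and numerically better behaved."""
    return np.linalg.solve(X.T @ X, X.T @ y)

# Illustrative data (my own): two features plus the x0 = 1 column
X = np.array([[1.0, 2.0, 1.0],
              [1.0, 4.0, 3.0],
              [1.0, 6.0, 2.0],
              [1.0, 8.0, 5.0]])
y = np.array([3.0, 7.0, 8.0, 13.0])
theta = normal_equation(X, y)
print(theta)   # here the data fits exactly: theta ≈ [0., 1., 1.]
```

For ill-conditioned or rank-deficient XᵀX, `np.linalg.lstsq` is the safer choice.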


Multivariate Regression

The m × (n+1) design matrix:

  X = [ 1  x₁₁  x₁₂  ⋯  x₁ₙ ]
      [ 1  x₂₁  x₂₂  ⋯  x₂ₙ ]
      [ ⋯  ⋯   ⋯   ⋯  ⋯  ]
      [ 1  xₘ₁  xₘ₂  ⋯  xₘₙ ]

  θ = (θ₀, θ₁, θ₂, ⋯, θₙ)ᵀ

Here θ is the (n+1)-dimensional parameter vector.



EXAMPLES

Example 1

  x₀ | Size (feet²) | Number of bedrooms | Number of floors | Age of home (years) | Price ($1000)
  1  | 2104         | 5                  | 1                | 45                  | 460
  1  | 1416         | 3                  | 2                | 40                  | 232
  1  | 1534         | 3                  | 2                | 30                  | 315
  1  | 852          | 2                  | 1                | 36                  | 178

Gradient descent simultaneously updates every θⱼ, where θ = (θ₀, θ₁, θ₂, θ₃, θ₄)ᵀ.
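The table above maps directly onto a design matrix. Note that with only 4 examples and 5 parameters, XᵀX is singular, so the sketch below uses `lstsq`, which returns a minimum-norm least-squares solution instead of inverting XᵀX:

```python
import numpy as np

# Design matrix from the table: columns are x0 = 1, size, bedrooms, floors, age
X = np.array([[1, 2104, 5, 1, 45],
              [1, 1416, 3, 2, 40],
              [1, 1534, 3, 2, 30],
              [1,  852, 2, 1, 36]], dtype=float)
y = np.array([460, 232, 315, 178], dtype=float)

# m = 4 rows < 5 parameters: the system is underdetermined, so X^T X
# cannot be inverted; lstsq still returns a (minimum-norm) theta.
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
pred = X @ theta   # reproduces y on these training rows
```

This is why the normal equation needs m ≥ n+1 linearly independent examples (or regularization) before (XᵀX)⁻¹ exists.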
Example 2

 A chemical process expects the yield to be affected by two factors, x₁ and x₂.
 Observations recorded for these two factors are shown in the given table.



Example 2

 The first-order regression model is

  h_θ(x) = θ₀ + θ₁x₁ + θ₂x₂

and the parameters follow from the normal equation:

  θ = (XᵀX)⁻¹Xᵀy



Example 2

  θ = (XᵀX)⁻¹Xᵀy = (θ₀, θ₁, θ₂)ᵀ, evaluated numerically on the table data.



Example 2

  h_θ(x) = −153.51 + 1.24x₁ + 12.08x₂



Example 2

  h_θ(x) = −153.51 + 1.24x₁ + 12.08x₂

  θ = (θ₀, θ₁, θ₂)ᵀ = (−153.51, 1.24, 12.08)ᵀ

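A quick sketch of using the fitted model to predict the yield (the operating point below is made up for illustration, not taken from the slides):

```python
# Fitted first-order model from the example
def h(x1, x2):
    return -153.51 + 1.24 * x1 + 12.08 * x2

# Hypothetical operating point (my own values)
yield_pred = h(100.0, 25.0)
print(yield_pred)   # ≈ 272.49
```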


Gradient Descent

Have some function J(θ₀, θ₁).
Want to minimize J(θ₀, θ₁) over θ₀, θ₁.

Outline:
• Start with some θ₀, θ₁.
• Keep changing θ₀, θ₁ to reduce J(θ₀, θ₁) until we hopefully end up at a minimum.
Gradient Descent

Gradient descent algorithm:

  Repeat until convergence {
    θⱼ := θⱼ − α · ∂/∂θⱼ J(θ₀, θ₁)   (for j = 0 and j = 1)
  }

Notice: α is the learning rate.

Correct (simultaneous update):
  temp0 := θ₀ − α ∂/∂θ₀ J(θ₀, θ₁)
  temp1 := θ₁ − α ∂/∂θ₁ J(θ₀, θ₁)
  θ₀ := temp0
  θ₁ := temp1

Incorrect:
  temp0 := θ₀ − α ∂/∂θ₀ J(θ₀, θ₁)
  θ₀ := temp0
  temp1 := θ₁ − α ∂/∂θ₁ J(θ₀, θ₁)   ← already uses the updated θ₀
  θ₁ := temp1



Gradient Descent

Gradient descent algorithm applied to the linear regression model:

  h_θ(x) = θ₀ + θ₁x
  J(θ₀, θ₁) = (1/2m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i))²



Gradient Descent

Gradient descent algorithm for linear regression:

  Repeat until convergence {
    θ₀ := θ₀ − α · (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i))           [= θ₀ − α · ∂/∂θ₀ J(θ₀, θ₁)]
    θ₁ := θ₁ − α · (1/m) Σ_{i=1}^m (h_θ(x^(i)) − y^(i)) · x^(i)   [= θ₁ − α · ∂/∂θ₁ J(θ₀, θ₁)]
  } (update θ₀ and θ₁ simultaneously)
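The simultaneous-update rule can be made concrete; a minimal sketch (the data, learning rate, and iteration count are assumed for illustration):

```python
import numpy as np

# Illustrative data generated exactly by y = 1 + 2x
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])
m = len(x)
theta0, theta1, alpha = 0.0, 0.0, 0.1

for _ in range(5000):
    h = theta0 + theta1 * x
    # Compute BOTH gradients from the current theta values first,
    # then assign: this is the simultaneous update from the slide.
    temp0 = theta0 - alpha * np.sum(h - y) / m
    temp1 = theta1 - alpha * np.sum((h - y) * x) / m
    theta0, theta1 = temp0, temp1

print(theta0, theta1)   # converges toward 1.0, 2.0
```

Swapping the assignment of `theta0` above the computation of `temp1` would reproduce the "incorrect" variant, where the θ₁ gradient is evaluated at an already-updated θ₀.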


Acknowledgement

 Tom Mitchell, Russell & Norvig, Andrew Ng, Alpaydin & Ch. Eick.


