[Figure: Housing Prices (Portland, OR); x-axis: Size (feet²), y-axis: Price (in 1000s of dollars)]
Supervised Learning: the "right answer" is given for each example in the data.
Regression Problem: predict a real-valued output.
Dr. Hashim Yasin Applied Machine Learning (CS4104)
Regression Example
Hypothesis: $h_\theta(x) = \theta_0 + \theta_1 x$
$\theta_i$'s: parameters
How to choose the $\theta_i$'s?
Linear Regression with one Variable
Simplified:
Hypothesis: $h_\theta(x) = \theta_1 x$
Parameters: $\theta_1$
Cost Function: $J(\theta_1) = \dfrac{1}{2m}\displaystyle\sum_{i=1}^{m}\left(h_\theta(x^i) - y^i\right)^2$
Goal: $\displaystyle\min_{\theta_1} J(\theta_1)$
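As a quick numeric illustration of this cost function (a sketch; the data points below are invented, not from the lecture):

```python
# Cost J(theta1) for the simplified hypothesis h(x) = theta1 * x.
# The data points are illustrative, not from the slides.
x = [1.0, 2.0, 3.0]
y = [1.0, 2.0, 3.0]          # lies exactly on the line y = 1 * x
m = len(x)

def cost(theta1):
    # J(theta1) = (1 / 2m) * sum over i of (theta1 * x_i - y_i)^2
    return sum((theta1 * xi - yi) ** 2 for xi, yi in zip(x, y)) / (2 * m)

print(cost(1.0))  # perfect fit, so the cost is 0.0
print(cost(0.5))  # a worse fit gives a strictly larger cost
```

Since the data lie exactly on $y = x$, the cost is zero at $\theta_1 = 1$ and grows as $\theta_1$ moves away from it.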
One parameter ($\theta_1$), setting the derivative of the cost to zero:

$$\frac{\partial J(\theta_1)}{\partial \theta_1} = \frac{\partial}{\partial \theta_1}\,\frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^i) - y^i\right)^2$$

$$\frac{\partial J(\theta_1)}{\partial \theta_1} = \frac{\partial}{\partial \theta_1}\,\frac{1}{2m}\sum_{i=1}^{m}\left(\theta_1 x^i - y^i\right)^2$$

$$\frac{\partial J(\theta_1)}{\partial \theta_1} = \frac{1}{2m}\sum_{i=1}^{m} 2\left(\theta_1 x^i - y^i\right)\frac{\partial}{\partial \theta_1}\left(\theta_1 x^i - y^i\right)$$

$$\frac{\partial J(\theta_1)}{\partial \theta_1} = \frac{1}{m}\sum_{i=1}^{m}\left(\theta_1 x^i - y^i\right)x^i = 0$$

Hence

$$\sum_{i=1}^{m}\theta_1 (x^i)^2 - \sum_{i=1}^{m} x^i y^i = 0 \quad\Longrightarrow\quad \theta_1 \sum_{i=1}^{m} (x^i)^2 = \sum_{i=1}^{m} x^i y^i$$

$$\theta_1 = \frac{\sum_{i=1}^{m} x^i y^i}{\sum_{i=1}^{m} (x^i)^2} = \frac{\mathrm{covar}(X, Y)}{\mathrm{var}(X)} \quad \text{if } \mathrm{mean}(X) = \mathrm{mean}(Y) = 0$$
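The no-intercept formula can be checked directly in a few lines (the data values below are made up for illustration):

```python
# theta1 = sum(x_i * y_i) / sum(x_i^2) for h(x) = theta1 * x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]     # roughly y = 2x; values are made up
theta1 = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
print(theta1)  # close to the underlying slope of 2
```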
Hypothesis: $h_\theta(x) = \theta_0 + \theta_1 x$, with $x_0 = 1$
Parameters: $\theta_0, \theta_1$
Cost function: $J(\theta_0, \theta_1) = \dfrac{1}{2m}\displaystyle\sum_{i=1}^{m}\left(h_\theta(x^i) - y^i\right)^2$
Gradient descent:
Repeat { $\theta_j := \theta_j - \alpha \dfrac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$ }
(simultaneously update $\theta_0$ and $\theta_1$)
Two Parameters ($\theta_0, \theta_1$), with $\bar{x} = \dfrac{1}{m}\displaystyle\sum_{i=1}^{m} x^i$ and $\bar{y} = \dfrac{1}{m}\displaystyle\sum_{i=1}^{m} y^i$:

$$\frac{\partial J(\theta_0)}{\partial \theta_0} = \frac{\partial}{\partial \theta_0}\,\frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^i) - y^i\right)^2$$

$$\frac{\partial J(\theta_0)}{\partial \theta_0} = \frac{\partial}{\partial \theta_0}\,\frac{1}{2m}\sum_{i=1}^{m}\left(\theta_0 + \theta_1 x^i - y^i\right)^2$$

$$\frac{\partial J(\theta_0)}{\partial \theta_0} = \frac{1}{2m}\sum_{i=1}^{m} 2\left(\theta_0 + \theta_1 x^i - y^i\right)\frac{\partial}{\partial \theta_0}\left(\theta_0 + \theta_1 x^i - y^i\right)$$

$$\frac{\partial J(\theta_0)}{\partial \theta_0} = \frac{1}{m}\sum_{i=1}^{m}\left(\theta_0 + \theta_1 x^i - y^i\right) = 0$$

Setting the derivative to zero gives

$$\frac{1}{m}\sum_{i=1}^{m}\theta_0 + \theta_1\,\frac{1}{m}\sum_{i=1}^{m} x^i = \frac{1}{m}\sum_{i=1}^{m} y^i$$

$$\theta_0 = \bar{y} - \theta_1\bar{x}$$
Two Parameters ($\theta_0, \theta_1$):

$$\frac{\partial J(\theta_1)}{\partial \theta_1} = \frac{\partial}{\partial \theta_1}\,\frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^i) - y^i\right)^2$$

$$\frac{\partial J(\theta_1)}{\partial \theta_1} = \frac{\partial}{\partial \theta_1}\,\frac{1}{2m}\sum_{i=1}^{m}\left(\theta_0 + \theta_1 x^i - y^i\right)^2$$

$$\frac{\partial J(\theta_1)}{\partial \theta_1} = \frac{1}{2m}\sum_{i=1}^{m} 2\left(\theta_0 + \theta_1 x^i - y^i\right)\frac{\partial}{\partial \theta_1}\left(\theta_0 + \theta_1 x^i - y^i\right)$$

$$\frac{\partial J(\theta_1)}{\partial \theta_1} = \frac{1}{m}\sum_{i=1}^{m}\left(\theta_0 + \theta_1 x^i - y^i\right)x^i = 0$$
Two Parameters ($\theta_0, \theta_1$): substituting $\theta_0 = \bar{y} - \theta_1\bar{x}$ into

$$\frac{\partial J(\theta_1)}{\partial \theta_1} = \frac{1}{m}\sum_{i=1}^{m}\left(\theta_0 + \theta_1 x^i - y^i\right)x^i = 0$$

gives

$$\theta_0\,\frac{1}{m}\sum_{i=1}^{m} x^i + \theta_1\,\frac{1}{m}\sum_{i=1}^{m} (x^i)^2 = \frac{1}{m}\sum_{i=1}^{m} x^i y^i$$

$$\theta_1 \sum_{i=1}^{m} (x^i)^2 + \bar{y}\sum_{i=1}^{m} x^i - \theta_1\bar{x}\sum_{i=1}^{m} x^i = \sum_{i=1}^{m} x^i y^i$$

$$\theta_1 \sum_{i=1}^{m} x^i\left(x^i - \bar{x}\right) = \sum_{i=1}^{m} x^i\left(y^i - \bar{y}\right)$$
Multivariate Regression
Two Parameters ($\theta_0, \theta_1$), with $\bar{x} = \dfrac{1}{m}\displaystyle\sum_{i=1}^{m} x^i$ and $\bar{y} = \dfrac{1}{m}\displaystyle\sum_{i=1}^{m} y^i$:

$$\theta_1 \sum_{i=1}^{m} x^i\left(x^i - \bar{x}\right) = \sum_{i=1}^{m} x^i\left(y^i - \bar{y}\right)$$

$$\theta_1 = \frac{\sum_{i=1}^{m} x^i\left(y^i - \bar{y}\right)}{\sum_{i=1}^{m} x^i\left(x^i - \bar{x}\right)} = \frac{\sum_{i=1}^{m} x^i y^i - \bar{y}\sum_{i=1}^{m} x^i}{\sum_{i=1}^{m} (x^i)^2 - \bar{x}\sum_{i=1}^{m} x^i}$$

Since $\sum_{i=1}^{m} x^i = m\bar{x}$:

$$\theta_1 = \frac{\sum_{i=1}^{m} x^i y^i - m\,\bar{x}\bar{y}}{\sum_{i=1}^{m} (x^i)^2 - m\,\bar{x}^2}$$
Two Parameters ($\theta_0, \theta_1$), with $\bar{x} = \dfrac{1}{m}\displaystyle\sum_{i=1}^{m} x^i$ and $\bar{y} = \dfrac{1}{m}\displaystyle\sum_{i=1}^{m} y^i$:

$$\theta_0 = \bar{y} - \theta_1\bar{x}, \qquad \theta_1 = \frac{\sum_{i=1}^{m} x^i y^i - m\,\bar{x}\bar{y}}{\sum_{i=1}^{m} (x^i)^2 - m\,\bar{x}^2}$$

These equations can be summarized by the following matrix equation (also known as the normal equation):

$$\begin{pmatrix} m & \sum_{i=1}^{m} x^i \\ \sum_{i=1}^{m} x^i & \sum_{i=1}^{m} (x^i)^2 \end{pmatrix} \begin{pmatrix} \theta_0 \\ \theta_1 \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{m} y^i \\ \sum_{i=1}^{m} x^i y^i \end{pmatrix}$$
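The 2×2 system can be assembled and solved numerically. A sketch with invented data (the same known line as before, so the answer is easy to verify):

```python
import numpy as np

# Solve the 2x2 normal-equation system:
# [[m, sum x], [sum x, sum x^2]] @ (theta0, theta1) = (sum y, sum x*y)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])   # exactly y = 1 + 2x (made-up data)
m = len(x)
A = np.array([[m, x.sum()],
              [x.sum(), (x ** 2).sum()]])
b = np.array([y.sum(), (x * y).sum()])
theta0, theta1 = np.linalg.solve(A, b)
print(theta0, theta1)  # recovers the intercept 1.0 and slope 2.0
```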
Two Parameters ($\theta_0, \theta_1$):

This equation can be written in more compact form. Suppose $\mathbf{X} = (\mathbf{1}\ \mathbf{x})$, where $\mathbf{1} = (1, 1, 1, \ldots)^T$ and $\mathbf{x} = (x^1, x^2, x^3, \ldots, x^m)^T$. Then

$$\mathbf{X}^T\mathbf{X} = \begin{pmatrix} \mathbf{1}^T\mathbf{1} & \mathbf{1}^T\mathbf{x} \\ \mathbf{x}^T\mathbf{1} & \mathbf{x}^T\mathbf{x} \end{pmatrix} = \begin{pmatrix} m & \sum_{i=1}^{m} x^i \\ \sum_{i=1}^{m} x^i & \sum_{i=1}^{m} (x^i)^2 \end{pmatrix}$$

$$\mathbf{X}^T\mathbf{y} = (\mathbf{1}\ \mathbf{x})^T\mathbf{y} = \begin{pmatrix} \mathbf{1}^T\mathbf{y} \\ \mathbf{x}^T\mathbf{y} \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{m} y^i \\ \sum_{i=1}^{m} x^i y^i \end{pmatrix}$$
$$\mathbf{X}^T\mathbf{X}\,\boldsymbol{\theta} = \mathbf{X}^T\mathbf{y}, \quad \text{where } \boldsymbol{\theta} = (\theta_0, \theta_1)^T$$

$$\boldsymbol{\theta} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$$

In general, with an $m \times (n+1)$ design matrix (one intercept column plus $n$ feature columns):

$$\mathbf{X} = \begin{pmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1n} \\ 1 & x_{21} & x_{22} & \cdots & x_{2n} \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ 1 & x_{m1} & x_{m2} & \cdots & x_{mn} \end{pmatrix}, \qquad \boldsymbol{\theta} = (\theta_0, \theta_1, \theta_2, \cdots, \theta_n)^T$$
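The general form is a one-liner in NumPy. A sketch with an invented two-feature design matrix whose targets lie exactly on a known plane:

```python
import numpy as np

# theta = (X^T X)^{-1} X^T y, where the first column of X is the
# intercept term x0 = 1. Data are made up for illustration.
X = np.array([[1.0, 1.0, 2.0],
              [1.0, 2.0, 1.0],
              [1.0, 3.0, 4.0],
              [1.0, 4.0, 3.0]])
y = np.array([6.0, 5.0, 12.0, 11.0])  # exactly y = 1 + x1 + 2*x2
# Solving the linear system is numerically safer than forming the inverse.
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)  # recovers (theta0, theta1, theta2) = (1, 1, 2)
```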
Examples:

x0   Size (feet²)   Number of bedrooms   Number of floors   Age of home (years)   Price ($1000)
1    2104           5                    1                  45                    460
1    1416           3                    2                  40                    232
1    1534           3                    2                  30                    315
1    852            2                    1                  36                    178
(simultaneously update all parameters), where $\boldsymbol{\theta} = (\theta_0, \theta_1, \theta_2, \theta_3, \theta_4)^T$
Example 2
$$\boldsymbol{\theta} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}$$

$$\boldsymbol{\theta} = \begin{pmatrix} \theta_0 \\ \theta_1 \\ \theta_2 \end{pmatrix} = \begin{pmatrix} -153.51 \\ 1.24 \\ 12.08 \end{pmatrix}$$
Outline:
• Start with some initial $\theta_0, \theta_1$.
• Keep changing $\theta_0, \theta_1$ to reduce $J(\theta_0, \theta_1)$ until we hopefully end up at a minimum.
Gradient Descent
Update $\theta_0$ and $\theta_1$ simultaneously:

$$\theta_0 := \theta_0 - \alpha \cdot \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1), \qquad \theta_1 := \theta_1 - \alpha \cdot \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1)$$
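The simultaneous-update rule can be sketched in a few lines (the data and learning rate $\alpha = 0.1$ are invented for illustration; the data lie exactly on $y = 1 + 2x$ so the iterates should approach $\theta_0 = 1$, $\theta_1 = 2$):

```python
# Batch gradient descent for h(x) = theta0 + theta1 * x with
# simultaneous updates (alpha is the learning rate; data are made up).
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]   # exactly y = 1 + 2x
m = len(x)
theta0, theta1 = 0.0, 0.0
alpha = 0.1
for _ in range(5000):
    # compute BOTH partial derivatives before updating either parameter
    grad0 = sum(theta0 + theta1 * xi - yi for xi, yi in zip(x, y)) / m
    grad1 = sum((theta0 + theta1 * xi - yi) * xi for xi, yi in zip(x, y)) / m
    theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
print(theta0, theta1)  # converges toward the exact solution (1.0, 2.0)
```

Updating `theta0` and `theta1` in a single tuple assignment is what makes the update simultaneous; updating them one after the other would use the already-modified `theta0` inside `grad1`.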