Академический Документы
Профессиональный Документы
Культура Документы
Lecture 6
! ! Y
!! !" !
!" !! !
. . .
. . .
. . .
!! !! !
Here, ! is location, ! is size of house, Y is cost of the house (output or target variable) and n is
the number of data points (also referred to as number of training examples). ! and ! are known
as input variables or features and together with corresponding output variable, they are called
training example.
Now, the general approach to solve the problem would be to try and find out a function, Y =
f(! ,! ) keeping in mind, that the data is prone to error. We can fit a model, maybe a straight
line passing through this data such that for every ! and ! , we have a unique Y. This process is
an example of a regression analysis, a type of supervised learning algorithm. Regression refers to
the fact that we are predicting a real-valued output namely the price. We are taking location and
size as input variables and trying to map the price (output) such as to get a continuous result
function.
1
The dataset of housing prices is called training dataset which can be used to predict prices of
the houses.
Curve fitting vs ability to model
The high degree of fit of a curve on the points does not always imply a higher utility for
understanding the underlying trend. Care should be taken while extrapolating the curve obtained
from regression.
Hypothesis function tells that y is a function of x. In the above picture, we are predicting y (on
vertical axis) which is some linear function of x (on horizontal axis). This is called fitting and is
an example of linear regression with one variable. The idea is to choose parameters such that we
get right y for the training example.
where
= Feature value,
= Learning rate, > 0 (it determines the step size of change in value on each iteration)
Irrespective of the value of initial starting point (guess) chosen, the algorithm will certainly
converge to minimum.
Note: problem of optimum value of learning rate will not occur in successive iterations of
Gradient Descent since the algorithm takes care of smaller step size.
() = !
!
1
() = ( ! ! ! )!
!!!
() = ||()||!
X = Y
Advantage of the analytical approach over the iterative approach: Using the analytical approach
we can hope to get to the best result as against iterative solution where depending on the initial
value and step size, the algorithm can converge to a local minima and not the global minima.