
Untitled - 10/06/16 - 1:03 AM

When constructing GLMs, a few things should be kept in mind.


1. There must always be a linear relationship between theta^T x and the natural
parameter. This is my working definition, and also where the term Linear in GLMs comes from.
2. Then we must choose an appropriate hypothesis function. The hypothesis function
basically takes in the input parameters and churns out the target variable. The choice is
very obvious in regression, where the target variable is a continuous variable, and hence
the hypothesis function can simply be taken to be equal to Y. How it relates to the parameters
is a different story though. In linear regression, this relation is linear, i.e. Y can be
expressed as a linear combination of the corresponding values of X.
3. In logistic regression, the choice may not be very clear. In logistic regression, the target
variable is a discrete one. How do you go about describing a discrete variable with a
function? Though a piecewise-defined function is one way, we generally don't approach the
problem this way. Instead, the probability of the target variable is modelled. All
probabilities sum to one.
4. Once a hypothesis function has been decided, one must choose the probability
distribution for the target variable. It's important to differentiate between the value, the
probability, and the PD function for a variable. In linear regression, we assume a
Gaussian distribution, as it seems natural and is easy to formulate mathematically. It's
even easier to come up with a PDF for a Bernoulli distribution, as the hypothesis already
describes the probability. This wasn't the case with linear regression, where the hypothesis described the
value, and the probability of the value being correct was modelled on the assumption that
the errors in the value are normally distributed. The assumption in the Bernoulli case is the
choice of the sigmoid function, which is taken to be a fair representation of the
probability of the target variable as theta changes.
5. T(y) is equal to y in many cases, as in both the Linear and Bernoulli models, but that isn't always the
case. In order to check whether T(y) = y is a sound choice, one must calculate
the expected value of y: does the expected value fulfil the purpose of the hypothesis? For the
linear model, we desire the value of y, and the expected value of y is the mean of the normal
distribution, which in fact is equal to y according to our hypothesis function. In the Bernoulli
model, the expected value of y is 1*p(y=1) + 0*p(y=0), which is again our hypothesis
function. But does it give us the probability of y being equal to 1? Yes. Naturally, the
probability of y = 0 can be deduced as 1 - P(y = 1).


6. However, the above is not true in the case of Softmax Regression, where the variable can
take more than 2 values. If we take the expected value of y, we get a linear
combination of the probabilities of y being equal to non-zero values, which is hardly of any use
to us. We must instead take T(y) to be an (N-1)-dimensional vector to capture N-1 probabilities
simultaneously. The Nth probability can then be deduced by subtracting their sum from 1,
since all probabilities sum to one.
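The points above can be sketched in Python. This is a minimal illustration, not a full GLM implementation; the toy values of theta and x, and the function names, are my own:

```python
import numpy as np

def sigmoid(z):
    """Logistic-regression hypothesis: maps theta^T x to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax_probs(logits):
    """Probabilities for N classes from N-1 logits, treating the Nth class as a
    reference with an implicit logit of 0. The Nth probability is recovered by
    subtraction from 1, since all probabilities sum to one."""
    exp = np.exp(logits)
    first = exp / (1.0 + exp.sum())              # probabilities of the first N-1 classes
    return np.append(first, 1.0 - first.sum())   # Nth probability by subtraction

# Toy example: the hypothesis outputs p(y = 1) directly.
theta = np.array([0.5, -1.0])
x = np.array([2.0, 1.0])
p = sigmoid(theta @ x)

# Bernoulli expectation: E[y] = 1*p(y=1) + 0*p(y=0) = p, i.e. the hypothesis itself.
assert np.isclose(1 * p + 0 * (1 - p), p)

# 3-class softmax from a 2-dimensional T(y): the third probability is deduced.
probs = softmax_probs(np.array([1.0, 2.0]))
assert np.isclose(probs.sum(), 1.0)
```

The softmax function here fixes the last class's logit at 0, which is one common way to make the N-1 parameterisation concrete.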
