
Econometrics

The Multiple Linear Regression Model


Yuyi LI
University of Manchester
2012
Outline
The Multiple Linear Regression (MLR) Model
MLR in Matrix Form
OLS Estimation and Interpretation
Statistical Properties of OLS Estimators
Unbiasedness of OLS Estimators
Sample Variance-Covariance Matrix of OLS Estimators
Gauss-Markov Theorem (OLS is BLUE)
Reading: Wooldridge Chapter 3, Appendices D, E
Mathematics/Statistics: vectors, matrices, the identity
matrix, full-rank matrices, covariance matrices,
basic matrix algebra, vector differentiation . . .
Multiple Linear Regression Models Motivation
Motivation for MLR Models
Consider an SLR model: $wage = \beta_0 + \beta_1 educ + u$
Does work experience (exper) affect wage?
exper is not present in the model (i.e. it is captured by u).
Assumption SLR.4: $E[u \mid educ] = 0$. Does it hold?
What happens if SLR.4 fails?
In an MLR model, exper can be included:
$$wage = \beta_0 + \beta_1 educ + \beta_2 exper + u$$
It is necessary to have MLR models to:
measure ceteris paribus effects of more than one variable
allow for more flexible (nonlinear) relationships
between the independent and dependent variables
correct for omitted variable bias
Multiple Linear Regression Models MLR Models
MLR model with k Regressors
The MLR model in the population is
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_k x_k + u \qquad (1)$$
$y$: dependent variable; $x_1, x_2, \ldots, x_k$: regressors;
$\beta_0, \beta_1, \beta_2, \ldots, \beta_k$: parameters; $u$: error term
A key assumption is (Zero Conditional Mean)
$$E[u \mid x_1, x_2, \ldots, x_k] = 0$$
Note, the above assumption comes from assuming
$E[u] = 0$ and $E[u \mid x_1, x_2, \ldots, x_k] = E[u]$.
Multiple Linear Regression Models MLR Model in Matrix Form
MLR Model for Each Individual
Let $\{(y_i, x_{i1}, x_{i2}, \ldots, x_{ik}) : i = 1, 2, \ldots, n\}$ be a
random sample from the population.
The regression equations for each $i = 1, 2, \ldots, n$ are
$$\begin{aligned}
y_1 &= \beta_0 + \beta_1 x_{11} + \beta_2 x_{12} + \ldots + \beta_k x_{1k} + u_1 \\
y_2 &= \beta_0 + \beta_1 x_{21} + \beta_2 x_{22} + \ldots + \beta_k x_{2k} + u_2 \\
&\;\;\vdots \\
y_n &= \beta_0 + \beta_1 x_{n1} + \beta_2 x_{n2} + \ldots + \beta_k x_{nk} + u_n
\end{aligned} \qquad (2)$$
Note, each line corresponds to an individual
Multiple Linear Regression Models MLR Model in Matrix Form
MLR Model in Vector Form
For each $i = 1, 2, \ldots, n$ define a $1 \times (k+1)$ vector
$$x_i = [1, x_{i1}, x_{i2}, \ldots, x_{ik}]$$
Define a $(k+1) \times 1$ parameter vector
$$\beta = [\beta_0, \beta_1, \beta_2, \ldots, \beta_k]'$$
Then, the regression model can be written in terms
of these newly defined vectors:
$$y_i = x_i \beta + u_i, \quad i = 1, 2, \ldots, n \qquad (3)$$
Note, (3) is equivalent to (2)
Verify: (3) is identical to the i-th line of (2) for all i
Multiple Linear Regression Models MLR Model in Matrix Form
MLR Model in Matrix Form 1/2
Write (3) out for every observation:
$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}
= \begin{bmatrix} x_1 \beta \\ x_2 \beta \\ \vdots \\ x_n \beta \end{bmatrix}
+ \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}
= \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \beta
+ \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} \qquad (4)$$
Note, (2)-(4) are equivalent; Make sure you follow!
Multiple Linear Regression Models MLR Model in Matrix Form
MLR Model in Matrix Form 2/2
Define two $n \times 1$ vectors, and an $n \times (k+1)$ matrix
$$y = [y_1, y_2, \ldots, y_n]' \qquad u = [u_1, u_2, \ldots, u_n]'$$
$$X = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
= \begin{bmatrix}
1 & x_{11} & x_{12} & \cdots & x_{1k} \\
1 & x_{21} & x_{22} & \cdots & x_{2k} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_{n1} & x_{n2} & \cdots & x_{nk}
\end{bmatrix}$$
Then, (4) yields the MLR model in matrix form:
$$y = X\beta + u \qquad (5)$$
Note, (2)-(5) are equivalent; Verify!
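As a quick illustration (my own sketch, not from the original slides; the simulated data and all variable names are assumptions), a sample can be stacked into y and X as in (5) with numpy:

```python
import numpy as np

rng = np.random.default_rng(0)              # hypothetical simulated sample
n, k = 6, 2                                 # n observations, k regressors

# Design matrix X: a leading column of ones, then the k regressors,
# so row i is x_i = [1, x_i1, ..., x_ik] as defined above.
x_cols = rng.normal(size=(n, k))
X = np.column_stack([np.ones(n), x_cols])

beta = np.array([1.0, 2.0, -0.5])           # [beta_0, beta_1, beta_2] (assumed)
u = rng.normal(size=n)                      # error term
y = X @ beta + u                            # y = X beta + u, equation (5)
print(X.shape, y.shape)                     # (6, 3) (6,)
```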
OLS Estimation of MLR Models Residuals and OLS Estimation
Residuals
The residual for the i-th individual is
$$\hat{u}_i = y_i - \hat{y}_i = y_i - (b_0 + b_1 x_{i1} + b_2 x_{i2} + \ldots + b_k x_{ik}) = y_i - x_i b$$
(verify this) for any estimators $b = [b_0, b_1, \ldots, b_k]'$, where
$\hat{y}_i$ denotes the fitted value and $x_i$ is defined earlier.
The residual $\hat{u}_i$ measures the vertical distance
between the observed $y_i$ and the fitted value $\hat{y}_i$.
Note, the last two expressions in the residual
formula are equivalent (one in scalar form and the
other in vector form)
OLS Estimation of MLR Models Residuals and OLS Estimation
SSR
The Sum of Squared Residuals (SSR) is
$$\mathrm{SSR}(b) = \sum_{i=1}^n \hat{u}_i^2
= \sum_{i=1}^n \big[y_i - (b_0 + b_1 x_{i1} + b_2 x_{i2} + \ldots + b_k x_{ik})\big]^2
= \sum_{i=1}^n (y_i - x_i b)^2$$
SSR is the total of the squared distances between $y_i$ and
$\hat{y}_i$ over all $i = 1, 2, \ldots, n$
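A minimal sketch (my addition) of SSR(b) as a function of a candidate coefficient vector b, reusing the y and X layout from the earlier matrix-form sketch:

```python
import numpy as np

def ssr(b, y, X):
    """Sum of squared residuals SSR(b) = sum_i (y_i - x_i b)^2."""
    resid = y - X @ b           # u_hat_i = y_i - x_i b for every i at once
    return resid @ resid        # equals the sum of squared residuals
```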
OLS Estimation of MLR Models OLS Estimation
OLS Estimation Principle
The OLS estimators $\hat{\beta} = [\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k]'$ minimise SSR:
$$\hat{\beta} := \arg\min_b \mathrm{SSR}(b) = \arg\min_b \sum_{i=1}^n (y_i - x_i b)^2 \qquad (6)$$
To solve (6), we can still utilise the FOCs and
SOCs, noting that SSR is a single function with
$(k+1)$ arguments (i.e. $(k+1)$ FOCs are needed).
Note, in scalar notation, (6) becomes
$$\min_{b_0, b_1, \ldots, b_k} \sum_{i=1}^n \big[y_i - (b_0 + b_1 x_{i1} + b_2 x_{i2} + \ldots + b_k x_{ik})\big]^2$$
OLS Estimation of MLR Models OLS Estimation
FOCs for the OLS Estimators
The $(k+1)$ FOCs are:
$$\frac{\partial\, \mathrm{SSR}(b)}{\partial b_j} = 0, \quad j = 0, 1, 2, \ldots, k$$
This yields
$$\begin{aligned}
-2 \sum_{i=1}^n 1 \cdot (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \ldots - \hat{\beta}_k x_{ik}) &= 0 \\
-2 \sum_{i=1}^n x_{i1} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \ldots - \hat{\beta}_k x_{ik}) &= 0 \\
&\;\;\vdots \\
-2 \sum_{i=1}^n x_{ik} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \ldots - \hat{\beta}_k x_{ik}) &= 0
\end{aligned} \qquad (7)$$
OLS Estimation of MLR Models OLS Estimation
FOCs in Vector Notation
The FOCs in (7) (ignoring $-2$) can be written as
$$\sum_{i=1}^n x_i'(y_i - x_i \hat{\beta}) = \sum_{i=1}^n x_i' y_i - \sum_{i=1}^n x_i' x_i \hat{\beta} = 0 \qquad (8)$$
where $x_i = [1, x_{i1}, x_{i2}, \ldots, x_{ik}]$ and
$\hat{\beta} = [\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \ldots, \hat{\beta}_k]'$
Note, (8) can be directly obtained by
vector-differentiating SSR in (6) w.r.t. $b$
Note, try to convert expression (8) back to (7)!
FOCs (8) will be written in matrix form on the next slide
OLS Estimation of MLR Models OLS Estimation
FOCs in Matrix Form
Recall the definitions of y and X used in (5):
$$y = [y_1\; y_2\; \cdots\; y_n]' \qquad X = [x_1'\; x_2'\; \cdots\; x_n']'$$
It follows that
$$X'y = \sum_{i=1}^n x_i' y_i \quad \text{and} \quad X'X = \sum_{i=1}^n x_i' x_i \qquad (9)$$
FOCs (8) can be written as
$$X'y - X'X\hat{\beta} = 0 \qquad (10)$$
Note, expressions (7), (8) and (10) are equivalent.
OLS Estimation of MLR Models OLS Estimation
OLS Estimator in Matrix Form
If X has full column rank, then $(X'X)^{-1}$ exists
Solving (10) gives the OLS estimators
$$\hat{\beta} = (X'X)^{-1} X'y \qquad (11)$$
Note, using (9),
$$\hat{\beta} = \left( \sum_{i=1}^n x_i' x_i \right)^{-1} \sum_{i=1}^n x_i' y_i$$
Note, $\mathrm{rank}(X) = k+1$ implies $\mathrm{rank}(X'X) = k+1$ and
$(X'X)^{-1} X'X = I_{k+1}$.
Note, $I_{k+1}$ is a $(k+1) \times (k+1)$ identity matrix.
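A minimal numpy sketch (my addition) of (11); solving the normal equations (10) directly is numerically preferable to forming $(X'X)^{-1}$ explicitly:

```python
import numpy as np

def ols(y, X):
    """OLS estimator from (11): solves X'X b = X'y for b = beta_hat."""
    return np.linalg.solve(X.T @ X, X.T @ y)

# Checks (with y, X as in the earlier simulated sketch):
#   beta_hat = ols(y, X)
#   np.allclose(X.T @ (y - X @ beta_hat), 0)                     # FOCs (10) hold
#   np.allclose(beta_hat, np.linalg.lstsq(X, y, rcond=None)[0])  # matches lstsq
```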
OLS Estimation of MLR Models OLS Interpretation and Goodness of Fit
Interpreting the OLS Estimates
The OLS estimates $\hat{\beta}$ represent the partial effects
A partial effect refers to the effect of one regressor
when the rest of the regressors are kept constant, e.g.
$$\hat{\beta}_1 = \frac{\Delta \hat{y}}{\Delta x_1} \quad \text{when } \Delta x_2 = \Delta x_3 = \ldots = \Delta x_k = 0$$
Note, nothing is said about the error term $u$
The true parameters are the ceteris paribus effects:
the effect of one regressor while holding all other
variables (including the error term) constant, e.g.
$$\beta_1 = \frac{\Delta y}{\Delta x_1} \quad \text{when } \Delta x_2 = \Delta x_3 = \ldots = \Delta x_k = \Delta u = 0$$
OLS Estimation of MLR Models OLS Interpretation and Goodness of Fit
An Example: Wage Equation
Define lwage = log(wage), ed = education,
ex = work experience, te = time spent with the
current employer.
What is the relationship between wage and the other variables?
Using an MLR model (n = 526):
$$\widehat{lwage} = 0.284 + 0.092\,ed + 0.004\,ex + 0.022\,te$$
The estimate 0.022 means that one additional year
spent with the current employer is expected to
increase wage by 2.2%, if education and work
experience remain constant. [log-level]
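As an aside (my addition, not on the original slide): the 2.2% reading uses the usual log-level approximation, and the exact percentage change implied by the estimate is only slightly larger:

```latex
% Log-level approximation used on the slide:
\%\Delta wage \approx 100 \times 0.022 = 2.2\%
% Exact implied effect of one additional year with the current employer:
\%\Delta wage = 100\,\big(e^{0.022} - 1\big) \approx 2.22\%
```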
OLS Estimation of MLR Models OLS Interpretation and Goodness of Fit
Goodness of Fit
Sample regression function:
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \ldots + \hat{\beta}_k x_k$$
Population regression function:
$$E[y \mid x_1, \ldots, x_k] = \beta_0 + \beta_1 x_1 + \ldots + \beta_k x_k$$
Fitted values:
$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{i1} + \ldots + \hat{\beta}_k x_{ik} = x_i \hat{\beta}
\;\Rightarrow\; \hat{y} = X\hat{\beta}$$
Residuals:
$$\hat{u}_i = y_i - \hat{y}_i = y_i - x_i \hat{\beta}
\;\Rightarrow\; \hat{u} = y - \hat{y} = y - X\hat{\beta}$$
OLS Estimation of MLR Models OLS Interpretation and Goodness of Fit
Sum of Squares
Total Sum of Squares (SST):
$$\mathrm{SST} = \sum_{i=1}^n (y_i - \bar{y})^2 = y'y - n\bar{y}^2$$
Explained Sum of Squares (SSE):
$$\mathrm{SSE} = \sum_{i=1}^n (\hat{y}_i - \bar{y})^2 = \hat{y}'\hat{y} - n\bar{y}^2$$
Residual Sum of Squares (SSR):
$$\mathrm{SSR} = \sum_{i=1}^n \hat{u}_i^2 = \hat{u}'\hat{u}$$
OLS Estimation of MLR Models OLS Interpretation and Goodness of Fit
Coefficient of Determination: R²
Relationship among the sums of squares:
$$\mathrm{SST} = \mathrm{SSE} + \mathrm{SSR}$$
R-squared:
$$R^2 = \frac{\mathrm{SSE}}{\mathrm{SST}} = 1 - \frac{\mathrm{SSR}}{\mathrm{SST}}$$
$R^2$ has the same interpretation and features as
in the SLR model.
$R^2$ will increase or stay the same if more
regressors are added
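A minimal sketch (my addition) computing $R^2$ from the residuals, given a beta_hat from the earlier OLS sketch:

```python
import numpy as np

def r_squared(y, X, beta_hat):
    """R^2 = 1 - SSR/SST for a fitted MLR model."""
    u_hat = y - X @ beta_hat             # residuals
    ssr = u_hat @ u_hat                  # residual sum of squares
    sst = np.sum((y - y.mean())**2)      # total sum of squares
    return 1.0 - ssr / sst
```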
Statistical Properties of OLS Estimators Gauss-Markov Assumptions
Gauss-Markov Assumptions
Assumption MLR.1 (Linear in Parameters):
$$y = X\beta + u \quad \text{or} \quad y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \ldots + \beta_k x_k + u$$
Assumption MLR.2 (Random Sampling):
$$\{(y_i, x_{i1}, x_{i2}, \ldots, x_{ik}) : i = 1, 2, \ldots, n\}$$
Assumption MLR.3 (No Perfect Collinearity, or
X has Full Column Rank)
Assumption MLR.4 (Zero Conditional Mean):
$$E[u \mid X] = 0, \quad \text{or} \quad E[u_i \mid x_{i1}, x_{i2}, \ldots, x_{ik}] = 0 \;\; \forall i$$
Assumption MLR.5 (Homoskedasticity):
$$\mathrm{Var}(u \mid X) = \sigma^2 I_n \quad \text{or} \quad \mathrm{Var}(u_i \mid x_{i1}, x_{i2}, \ldots, x_{ik}) = \sigma^2$$
$I_n$ denotes an $(n \times n)$ identity matrix.
Statistical Properties of OLS Estimators Conditional Mean of OLS Estimators
Unbiasedness of OLS Estimators
Theorem (Unbiasedness of OLS Estimators)
Under Assumptions MLR.1-MLR.4,
$$E[\hat{\beta} \mid X] = \beta \quad \text{or} \quad E[\hat{\beta}_j \mid x_{i1}, x_{i2}, \ldots, x_{ik}] = \beta_j$$
for $j = 0, 1, \ldots, k$, $i = 1, \ldots, n$ and any $\beta$; i.e. the OLS
estimators are unbiased for the population parameters.
Proof: see Appendix
Note, MLR.4 is again crucial
Note, MLR.4 requires all the regressors to be
uncorrelated with the error term
Statistical Properties of OLS Estimators Conditional Mean of OLS Estimators
Misspecification and Bias
Model misspecification can arise due to
inclusion of irrelevant variables
exclusion of relevant variables
Including an irrelevant variable:
zero parameter value on this variable in the population
model $\Rightarrow$ OLS estimators unbiased, but variances
may be affected.
Omitting a relevant variable:
generally biased estimators
(omitted variable bias; details to follow)
Statistical Properties of OLS Estimators Conditional Mean of OLS Estimators
Omitted Variable Bias 1/4
Suppose the true model is
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$$
with fitted regression
$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2$$
but due to ignorance or lack of data, we actually
estimated this model
$$\tilde{y} = \tilde{\beta}_0 + \tilde{\beta}_1 x_1$$
i.e. a relevant variable is omitted.
Statistical Properties of OLS Estimators Conditional Mean of OLS Estimators
Omitted Variable Bias 2/4
There exists a simple relation
$$\tilde{\beta}_1 = \hat{\beta}_1 + \hat{\beta}_2 \tilde{\delta}_1$$
where $\tilde{\delta}_1$ is from an (auxiliary) regression of $x_2$ on
$x_1$: $\tilde{x}_2 = \tilde{\delta}_0 + \tilde{\delta}_1 x_1$.
Therefore, it follows that
$$E[\tilde{\beta}_1 \mid x_1, x_2] = E[\hat{\beta}_1 + \hat{\beta}_2 \tilde{\delta}_1 \mid x_1, x_2] = \beta_1 + \tilde{\delta}_1 \beta_2$$
The bias of the estimator $\tilde{\beta}_1$ is
$$\mathrm{Bias}(\tilde{\beta}_1 \mid x_1, x_2) = E[\tilde{\beta}_1 \mid x_1, x_2] - \beta_1 = \tilde{\delta}_1 \beta_2$$
Note, $\tilde{\beta}_1$ can be unbiased; Why?
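A small Monte Carlo sketch (my addition; all parameter values are assumptions) illustrating $E[\tilde{\beta}_1 \mid x_1, x_2] = \beta_1 + \tilde{\delta}_1 \beta_2$: with $\delta_1 > 0$ and $\beta_2 > 0$, omitting $x_2$ biases $\tilde{\beta}_1$ upward:

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 200, 2000
beta0, beta1, beta2 = 1.0, 0.5, 0.8    # true parameters (assumed)
delta1 = 0.6                           # population slope of x2 on x1

est = np.empty(reps)
for r in range(reps):
    x1 = rng.normal(size=n)
    x2 = delta1 * x1 + rng.normal(size=n)               # Corr(x1, x2) > 0
    y = beta0 + beta1 * x1 + beta2 * x2 + rng.normal(size=n)
    X_short = np.column_stack([np.ones(n), x1])         # x2 omitted
    b = np.linalg.solve(X_short.T @ X_short, X_short.T @ y)
    est[r] = b[1]                                       # beta_1-tilde

# Mean is near beta1 + delta1*beta2 = 0.5 + 0.6*0.8 = 0.98, not beta1 = 0.5
print(est.mean())
```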
Statistical Properties of OLS Estimators Conditional Mean of OLS Estimators
Omitted Variable Bias 3/4
The direction of the bias, $\mathrm{Bias}(\tilde{\beta}_1 \mid x_1, x_2) = \tilde{\delta}_1 \beta_2$,
depends on the signs of $\tilde{\delta}_1$ and $\beta_2$
But $x_2$ may be unobservable (thus $\tilde{\delta}_1$ cannot be
obtained) and the true parameter $\beta_2$ is unknown.
In practice, note that $\tilde{\delta}_1$ and $\beta_2$ reflect the
correlations between ($x_1$ and $x_2$) and between ($x_2$
and $y$), respectively.
The direction of omitted variable ($x_2$) bias:
                          Corr(x_1, x_2) > 0        Corr(x_1, x_2) < 0
$\beta_2 > 0$             Positive/Upward bias      Negative/Downward bias
$\beta_2 < 0$             Negative/Downward bias    Positive/Upward bias
Statistical Properties of OLS Estimators Conditional Mean of OLS Estimators
Omitted Variable Bias 4/4
The direction of omitted variable ($x_2$) bias:
                          Corr(x_1, x_2) > 0        Corr(x_1, x_2) < 0
$\beta_2 > 0$             Positive/Upward bias      Negative/Downward bias
$\beta_2 < 0$             Negative/Downward bias    Positive/Upward bias
Example: $wage = \beta_0 + \beta_1 educ + \beta_2 ability + u$, but
ability is unobserved
(i) the omitted variable ability is likely to be
positively correlated with wage $\Rightarrow \beta_2 > 0$
(ii) ability is also likely to be positively correlated with
educ, i.e. Corr(ability, educ) > 0
(i)+(ii) $\Rightarrow \tilde{\beta}_1$ is biased upward (on average, $\tilde{\beta}_1$ is
larger than it would be if ability were included).
Statistical Properties of OLS Estimators Covariance Matrix of OLS Estimators
Covariance Matrix of OLS Estimators
Theorem (Covariance Matrix of OLS Estimators)
Under Assumptions MLR.1-MLR.5 (Gauss-Markov),
$$\mathrm{Var}(\hat{\beta} \mid X) = \sigma^2 (X'X)^{-1}$$
where $\sigma^2 I_n = \mathrm{Var}(u \mid X)$
Proof: see Appendix
Note, a different & interesting version is in
Wooldridge (p.95, 4th edition)
Note, $\mathrm{Var}(\hat{\beta} \mid X)$ is unknown, since $\sigma^2$ is unknown
Note, we therefore estimate $\mathrm{Var}(\hat{\beta} \mid X) = \sigma^2 (X'X)^{-1}$,
which is used to obtain the OLS estimators' standard errors.
Statistical Properties of OLS Estimators Covariance Matrix of OLS Estimators
Covariance Matrix Estimation 1/2
An unbiased estimator for $\sigma^2$ is
$$\hat{\sigma}^2 = \frac{\hat{u}'\hat{u}}{n-1-k} = \frac{\sum_{i=1}^n \hat{u}_i^2}{n-1-k} = \frac{\mathrm{SSR}}{n-1-k}$$
Then, estimate $\mathrm{Var}(\hat{\beta} \mid X)$ by
$$\widehat{\mathrm{Var}}(\hat{\beta} \mid X) = \hat{\sigma}^2 (X'X)^{-1}$$
The standard error for $\hat{\beta}_j$ is
$$\mathrm{se}(\hat{\beta}_j) = \sqrt{ \big[ \widehat{\mathrm{Var}}(\hat{\beta} \mid X) \big]_{jj} }$$
where $[\widehat{\mathrm{Var}}(\hat{\beta} \mid X)]_{jj}$ denotes the $(j+1)$-th leading
diagonal element of the matrix $\widehat{\mathrm{Var}}(\hat{\beta} \mid X)$ and
$j = 0, 1, 2, \ldots, k$
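A minimal sketch (my addition) putting this slide together: $\hat{\sigma}^2$, the estimated covariance matrix, and the standard errors from its leading diagonal:

```python
import numpy as np

def ols_with_standard_errors(y, X):
    """Returns beta_hat and se(beta_hat_j) using sigma2_hat = SSR/(n-1-k)."""
    n, cols = X.shape                              # cols = k + 1 (with intercept)
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    u_hat = y - X @ beta_hat
    sigma2_hat = (u_hat @ u_hat) / (n - cols)      # SSR / (n - 1 - k)
    var_hat = sigma2_hat * XtX_inv                 # Var-hat(beta_hat | X)
    se = np.sqrt(np.diag(var_hat))                 # leading diagonal elements
    return beta_hat, se
```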
Statistical Properties of OLS Estimators Covariance Matrix of OLS Estimators
Covariance Matrix Estimation 2/2
Wooldridge Theorem 3.2 [MLR.1-MLR.5]:
$$\mathrm{Var}(\hat{\beta}_j \mid X) = \frac{\sigma^2}{\mathrm{SST}_j (1 - R_j^2)}, \quad j = 1, 2, \ldots, k, \qquad (12)$$
where $\mathrm{SST}_j = \sum_{i=1}^n (x_{ij} - \bar{x}_j)^2$ is the total
sample variation in $x_j$,
and $R_j^2$ is the $R^2$ from regressing $x_j$ on all the other
regressors (including an intercept).
If $\sigma^2$ is larger/smaller, $\mathrm{Var}(\hat{\beta}_j \mid X)$ is larger/smaller
If $\mathrm{SST}_j$ is larger/smaller, $\mathrm{Var}(\hat{\beta}_j \mid X)$ is smaller/larger
If $R_j^2$ is larger/smaller, $\mathrm{Var}(\hat{\beta}_j \mid X)$ is larger/smaller
Finally, $\mathrm{se}(\hat{\beta}_j) = \sqrt{\hat{\sigma}^2 \big/ \big(\mathrm{SST}_j (1 - R_j^2)\big)}$
Statistical Properties of OLS Estimators Covariance Matrix of OLS Estimators
Multicollinearity
Multicollinearity: high, but not perfect (cf. MLR.3),
correlation among the regressors
Possible consequence:
imprecise estimates (high variance); see (12)
(difficult to identify the effect of an individual regressor $x_j$)
Multicollinearity $\Rightarrow$ high $R_j^2$ in (12) $\Rightarrow$ high $\mathrm{Var}(\hat{\beta}_j \mid X)$
Other things being equal, low correlations among
the regressors (no multicollinearity) are preferred
The Variance Inflation Factor (VIF > 10?) could be
used to gauge the seriousness of multicollinearity; see the sketch below
Note, multicollinearity does not violate any of the
Gauss-Markov Assumptions (MLR.1-5)
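A minimal sketch (my addition) of the VIF for regressor j, computed from the auxiliary-regression $R_j^2$ in (12); X is assumed to contain an intercept column, and j should index a regressor, not the intercept:

```python
import numpy as np

def vif(X, j):
    """Variance inflation factor VIF_j = 1 / (1 - R_j^2)."""
    xj = X[:, j]
    X_others = np.delete(X, j, axis=1)                  # all other columns
    b = np.linalg.lstsq(X_others, xj, rcond=None)[0]    # auxiliary regression
    resid = xj - X_others @ b
    r2_j = 1.0 - (resid @ resid) / np.sum((xj - xj.mean())**2)
    return 1.0 / (1.0 - r2_j)
```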
Statistical Properties of OLS Estimators Gauss-Markov Theorem
Some Terminologies
Consider an estimator $\hat{\beta}_j$ for $\beta_j$ ($j = 0, 1, 2, \ldots, k$)
An estimator is a rule applied to any sample to
produce an estimate
Linear, if $\hat{\beta}_j = \sum_{i=1}^n w_{ij} y_i$, where the weights $w_{ij}$
can be functions of the regressors
Unbiased, if $E[\hat{\beta}_j] = \beta_j$
Best, if $\mathrm{Var}(\hat{\beta}_j) \le \mathrm{Var}(\tilde{\beta}_j)$ for the unbiased
estimator $\hat{\beta}_j$ and all other unbiased estimators $\tilde{\beta}_j$
Statistical Properties of OLS Estimators Gauss-Markov Theorem
Gauss-Markov Theorem
Theorem (Gauss-Markov Theorem)
Under Assumptions MLR.1-MLR.5, the OLS estimators
$\hat{\beta}$ are the Best Linear Unbiased Estimators (BLUE) of
the population parameters $\beta$.
Proof: see Wooldridge Appendix 3A
Note, MLR.5 is crucial for the OLS estimators to be Best
Note, MLR.4 is crucial for the OLS estimators to be Unbiased
Note, if MLR.3 fails, the OLS estimator does not exist
(since $(X'X)^{-1}$ does not)
Note, this theorem justifies the wide use of OLS
Appendix Proofs
Proof. Unbiasedness of OLS Estimators.
$$\begin{aligned}
\hat{\beta} &= (X'X)^{-1} X'y \\
&= (X'X)^{-1} X'(X\beta + u) \\
&= (X'X)^{-1} X'X\beta + (X'X)^{-1} X'u \\
&= I_{k+1}\,\beta + (X'X)^{-1} X'u \\
&= \beta + (X'X)^{-1} X'u \qquad (13)
\end{aligned}$$
$$\begin{aligned}
E[\hat{\beta} \mid X] &= E[\beta + (X'X)^{-1} X'u \mid X] \\
&= \beta + E[(X'X)^{-1} X'u \mid X] \\
&= \beta + (X'X)^{-1} X'\,E[u \mid X] \quad \text{(use MLR.4)} \\
&= \beta
\end{aligned}$$
Appendix Proofs
Proof. Covariance Matrix of OLS Estimators.
Recall expression (13), $\hat{\beta} = \beta + (X'X)^{-1} X'u$.
It follows that
$$\begin{aligned}
\mathrm{Var}(\hat{\beta} \mid X) &= \mathrm{Var}\big((X'X)^{-1} X'u \mid X\big) \quad \text{(the constant } \beta \text{ drops out)} \\
&= (X'X)^{-1} X'\,\mathrm{Var}(u \mid X)\,X (X'X)^{-1} \\
&= (X'X)^{-1} X'(\sigma^2 I_n) X (X'X)^{-1} \quad \text{(using MLR.5)} \\
&= \sigma^2 (X'X)^{-1} X' I_n X (X'X)^{-1} \\
&= \sigma^2 (X'X)^{-1} X'X (X'X)^{-1} \\
&= \sigma^2 (X'X)^{-1}
\end{aligned}$$