
Review for Test 1: Ch 4, 5, 6, 7, 8

Please do review all the material we have covered.
We can only go over a fraction of it in one review
class, so please make sure to review all of our
lecture material and textbook readings, not just
the slides here.

The Population Multiple Regression Model
(SW Section 6.2)

Consider the case of two regressors:

Yi = β0 + β1X1i + β2X2i + ui, i = 1,…,n


Y is the dependent variable
X1, X2 are the two independent variables (regressors)
(Yi, X1i, X2i) denote the ith observation on Y, X1, and X2.
β0 = unknown population intercept
β1 = effect on Y of a change in X1, holding X2 constant
β2 = effect on Y of a change in X2, holding X1 constant
ui = the regression error (omitted factors)

Copyright 2015 Pearson, Inc. All rights reserved.

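As a minimal sketch of what estimating this two-regressor model looks like in code, here is an OLS fit on simulated data. All names and the population coefficients (10, 2, -3) are illustrative, not from the slides.

```python
import numpy as np

# Simulated data from a known population model (illustrative numbers):
# Y = 10 + 2*X1 - 3*X2 + u
rng = np.random.default_rng(0)
n = 200
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
u = rng.normal(size=n)
Y = 10 + 2 * X1 - 3 * X2 + u

# Stack a constant and both regressors; solve for (b0, b1, b2) by OLS.
X = np.column_stack([np.ones(n), X1, X2])
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(beta_hat)  # roughly [10, 2, -3]
```

Because u is uncorrelated with both regressors here, the estimates land close to the population values.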

Omitted variable bias

Yi = β0 + β1Xi + ui
For omitted variable bias to occur, the omitted
variable Z must satisfy two conditions:
1. Z is a determinant of Y (i.e. Z is part of u); and
2. Z is correlated with the regressor X (i.e. corr(Z, X) ≠ 0)
Both conditions must hold for the omission of Z to
result in omitted variable bias.
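The two conditions can be seen in a simulation. In this illustrative sketch (all coefficients invented), Z both determines Y and is correlated with X, so the "short" regression that omits Z gives a biased slope, while the "long" regression recovers it.

```python
import numpy as np

# Z affects Y (condition 1) and is correlated with X (condition 2),
# so omitting Z biases the OLS coefficient on X.
rng = np.random.default_rng(1)
n = 5000
Z = rng.normal(size=n)
X = 0.8 * Z + rng.normal(size=n)                   # corr(Z, X) != 0
Y = 1.0 + 2.0 * X + 3.0 * Z + rng.normal(size=n)   # Z is a determinant of Y

def ols(X_mat, y):
    b, *_ = np.linalg.lstsq(X_mat, y, rcond=None)
    return b

short = ols(np.column_stack([np.ones(n), X]), Y)     # omits Z: biased
long = ols(np.column_stack([np.ones(n), X, Z]), Y)   # includes Z: unbiased
print(short[1], long[1])  # short slope well above 2; long slope near 2
```

Here the bias is positive because corr(Z, X) > 0 and Z enters Y with a positive coefficient.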

Multiple Regression: Coefficient Interpretation

Yi = β0 + β1X1i + β2X2i + ui

β1 = ΔY/ΔX1, holding X2 constant
β2 = ΔY/ΔX2, holding X1 constant
β0 = predicted value of Y when X1 = X2 = 0.


Multiple regression in STATA

reg testscr str pctel, robust;

Regression with robust standard errors            Number of obs =     420
                                                  F(2, 417)     =  223.82
                                                  Prob > F      =  0.0000
                                                  R-squared     =  0.4264
                                                  Root MSE      =  14.464

------------------------------------------------------------------------------
             |               Robust
     testscr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         str |  -1.101296   .4328472    -2.54   0.011     -1.95213   -.2504616
       pctel |  -.6497768   .0310318   -20.94   0.000     -.710775   -.5887786
       _cons |   686.0322   8.728224    78.60   0.000     668.8754     703.189
------------------------------------------------------------------------------

TestScore-hat = 686.0 - 1.10×STR - 0.65×PctEL


When are the estimates of a Multiple Linear
Regression unbiased?

Yi = β0 + β1X1i + β2X2i + … + βkXki + ui, i = 1,…,n
1. The conditional distribution of u given the X's has
mean zero, that is, E(ui|X1i = x1,…, Xki = xk) = 0.
2. (X1i,…,Xki,Yi), i = 1,…,n, are i.i.d.
3. Large outliers are unlikely: X1,…, Xk, and Y have
finite fourth moments: E(X1i^4) < ∞,…, E(Xki^4) < ∞,
E(Yi^4) < ∞.
4. There is no perfect multicollinearity.

The OLS Estimator in Multiple Regression

With two regressors, the OLS estimator solves:

min over b0, b1, b2 of  Σ(i=1..n) [Yi - (b0 + b1X1i + b2X2i)]²

The OLS estimator minimizes the average squared
difference between the actual values of Yi and the
prediction (predicted value) based on the
estimated line.
This minimization problem is solved using calculus.
This yields the OLS estimators of β0, β1, and β2.

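The minimization property above is easy to verify numerically: the OLS solution has a smaller sum of squared residuals than any perturbed coefficient vector. A short sketch on simulated (illustrative) data:

```python
import numpy as np

# The OLS coefficients minimize the sum of squared residuals (SSR):
# perturbing the fitted coefficients can only raise the SSR.
rng = np.random.default_rng(2)
n = 100
X1, X2 = rng.normal(size=n), rng.normal(size=n)
Y = 1 + 0.5 * X1 - 0.5 * X2 + rng.normal(size=n)
X = np.column_stack([np.ones(n), X1, X2])

def ssr(b):
    resid = Y - X @ b
    return resid @ resid

b_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(ssr(b_ols) < ssr(b_ols + np.array([0.1, 0.0, 0.0])))  # True
print(ssr(b_ols) < ssr(b_ols + np.array([0.0, -0.1, 0.1])))  # True
```

This is exactly the first-order condition the calculus solves in closed form.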

How do we test for the significance of each coefficient/estimate?

To test H0: β1 = 0 vs. H1: β1 ≠ 0

Construct the t-statistic:

t = (β̂1 - β1,0) / SE(β̂1)

Reject at the 5% significance level if |t| > 1.96

This procedure relies on the large-n
approximation that β̂1 is normally
distributed; typically n = 50 is large
enough for the approximation to be
excellent.
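Plugging in the str row of the Stata output shown earlier in these slides, the t-test is a one-line computation:

```python
# t = (beta_hat - beta_{1,0}) / SE(beta_hat), using the str coefficient
# and robust standard error from the regression output in these slides.
beta_hat = -1.101296
se = 0.4328472
beta_null = 0.0
t = (beta_hat - beta_null) / se
print(round(t, 2))    # -2.54, matching the Stata output
print(abs(t) > 1.96)  # True: reject H0 at the 5% level
```

The computed t matches the value Stata reports in the str row.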

How do we test for the significance of each coefficient/estimate?

Confidence Intervals for β1

Recall that a 95% confidence interval is, equivalently:

The set of points that cannot be rejected at
the 5% significance level;
A set-valued function of the data (an interval)
that contains the true β1 in 95% of repeated samples.

95% confidence interval for β1 = { β̂1 ± 1.96×SE(β̂1) }
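Applying the formula to the str coefficient from the earlier output gives the interval directly. (Stata uses the exact t critical value for 417 degrees of freedom rather than 1.96, so its endpoints differ very slightly.)

```python
# 95% CI: beta_hat +/- 1.96 * SE(beta_hat), using the str row of the
# regression output shown in these slides.
beta_hat = -1.101296
se = 0.4328472
lo, hi = beta_hat - 1.96 * se, beta_hat + 1.96 * se
print(round(lo, 3), round(hi, 3))  # approximately -1.95 and -0.253
```

Zero is not inside the interval, which is the same information as |t| > 1.96.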

Group Activity: Let's test for the significance
of each coefficient (fill in the missing test)

reg testscr str pctel, robust;

Regression with robust standard errors            Number of obs =     420
                                                  F(2, 417)     =  223.82
                                                  Prob > F      =  0.0000
                                                  R-squared     =  0.4264
                                                  Root MSE      =  14.464

------------------------------------------------------------------------------
             |               Robust
     testscr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         str |  -1.101296   .4328472   _______  0.011     -1.95213   -.2504616
       pctel |  -.6497768   .0310318   -20.94   0.000     _____________________
       _cons |   686.0322   8.728224    78.60   0.000     668.8754     703.189
------------------------------------------------------------------------------

TestScore-hat = 686.0 - 1.10×STR - 0.65×PctEL

Measures of Fit
Two regression statistics provide complementary
measures of how well the regression line fits
or explains the data:
The regression (adjusted) R² measures the fraction
of the variance of Y that is explained by the
regressors; it is unitless with a maximum value of
one (perfect fit).
The standard error of the regression (SER
or RMSE) measures the magnitude of a
typical regression residual in the units of Y.


R² and R̄²: Measures of Fit for Multiple Regression

The R̄² (the adjusted R²) is different from the
regular R² Stata calculates by default, as it
penalizes you for including another regressor:
R̄² does not necessarily increase when you add
another regressor.

Adjusted R²: R̄² = 1 - [(n - 1)/(n - k - 1)] × (SSR/TSS)

Note that R̄² < R²; however, if n is large the two will
be very close.

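The R² and R̄² formulas can be checked numerically. In this illustrative sketch, X2 is a deliberately irrelevant regressor, and the adjusted measure applies the (n-1)/(n-k-1) penalty:

```python
import numpy as np

# Compute R^2 and adjusted R^2 from SSR and TSS on simulated data.
rng = np.random.default_rng(3)
n, k = 50, 2
X1, X2 = rng.normal(size=n), rng.normal(size=n)
Y = 1 + X1 + rng.normal(size=n)  # X2 is an irrelevant regressor
X = np.column_stack([np.ones(n), X1, X2])
b, *_ = np.linalg.lstsq(X, Y, rcond=None)
resid = Y - X @ b
SSR = resid @ resid
TSS = ((Y - Y.mean()) ** 2).sum()
r2 = 1 - SSR / TSS
adj_r2 = 1 - (n - 1) / (n - k - 1) * SSR / TSS
print(r2, adj_r2)  # adj_r2 is strictly smaller than r2
```

The penalty factor (n-1)/(n-k-1) exceeds 1 whenever k ≥ 1, which is why R̄² < R².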

Group Activity: What do the items in green
tell us in words?

reg testscr str pctel, robust;

Regression with robust standard errors            Number of obs =     420
                                                  F(2, 417)     =  223.82
                                                  Prob > F      =  0.0000
                                                  R-squared     =  0.4264    <- green
                                                  Root MSE      =  14.464    <- green

------------------------------------------------------------------------------
             |               Robust
     testscr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         str |  -1.101296   .4328472   _______  0.011     -1.95213   -.2504616
       pctel |  -.6497768   .0310318   -20.94   0.000     _____________________
       _cons |   686.0322   8.728224    78.60   0.000     668.8754     703.189
------------------------------------------------------------------------------

TestScore-hat = 686.0 - 1.10×STR - 0.65×PctEL

Tests of Joint Hypotheses

Let Expn = expenditures per pupil and consider the
population regression model:
TestScorei = β0 + β1STRi + β2Expni + β3PctELi + ui
The null hypothesis that school resources don't
matter, and the alternative that they do,
corresponds to:
H0: β1 = 0 and β2 = 0
vs. H1: either β1 ≠ 0 or β2 ≠ 0 or both

Tests of joint hypotheses

H0: β1 = 0 and β2 = 0
In general, a joint hypothesis involves q restrictions.
Here, q = 2, and the two restrictions are β1 = 0 and
β2 = 0.
The F-statistic tests all parts of a joint hypothesis
at once.
F is distributed as χ²(q)/q.

q   5% critical value
1   3.84
2   3.00 (the case q = 2 above)
3   2.60
4   2.37
5   2.21

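The critical values in the table are just the 5% chi-square critical values divided by q. A one-loop check (requires scipy):

```python
from scipy.stats import chi2

# Large-sample F critical values: 5% chi-square critical value over q.
for q in range(1, 6):
    crit = chi2.ppf(0.95, df=q) / q
    print(q, round(crit, 2))  # 3.84, 3.0, 2.6, 2.37, 2.21
```

This reproduces the table row by row.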

Group Activity: Is cubic better than linear?

TestScorei = β0 + β1STRi + β2Expni + β3PctELi + ui

H0: β1 = 0 and β2 = 0 vs. H1: either β1 ≠ 0 or β2 ≠ 0 or both


Group Activity: Is cubic better than linear?

TestScorei = β0 + β1STRi + β2Expni + β3PctELi + ui

Test the two coefficients being equal to zero at the same time:
β1 = 0 and β2 = 0
We use an F-test for that with _____ restrictions and use the table:

No. of restrictions


Group Activity: What do the items in blue tell
us in words?

reg testscr str pctel, robust;

Regression with robust standard errors            Number of obs =     420
                                                  F(2, 417)     =  223.82    <- blue
                                                  Prob > F      =  0.0000    <- blue
                                                  R-squared     =  0.4264
                                                  Root MSE      =  14.464

------------------------------------------------------------------------------
             |               Robust
     testscr |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         str |  -1.101296   .4328472    -2.54   0.011     -1.95213   -.2504616
       pctel |  -.6497768   .0310318   -20.94   0.000     -.710775   -.5887786
       _cons |   686.0322   8.728224    78.60   0.000     668.8754     703.189
------------------------------------------------------------------------------

TestScore-hat = 686.0 - 1.10×STR - 0.65×PctEL


Problems with multiple regression analysis

1. OVB (omitted variable bias)
2. MULTICOLLINEARITY
3. HETEROSKEDASTICITY


MULTICOLLINEARITY

Perfect Multicollinearity
The word perfect in this context implies that the
variation in one explanatory variable can be
completely explained by movements in another
explanatory variable.
An example would be (notice: no error term!):

X1i = α0 + α1X2i    (8.1)

where the α's are constants and the X's are
independent variables in:

Yi = β0 + β1X1i + β2X2i + εi    (8.2)

Easy way to solve this: drop one of the multicollinear
variables.
2011 Pearson Addison-Wesley. All rights reserved.
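Why OLS breaks down under perfect multicollinearity can be seen directly: with X1 an exact linear function of X2, the regressor matrix loses full column rank, so X'X is singular and no unique solution exists. An illustrative numpy check:

```python
import numpy as np

# X1 is an exact linear function of X2 (no error term), so the
# regressor matrix [1, X1, X2] has rank 2 instead of 3.
rng = np.random.default_rng(4)
n = 30
X2 = rng.normal(size=n)
X1 = 2.0 + 3.0 * X2          # perfect collinearity, as in Eq. (8.1)
X = np.column_stack([np.ones(n), X1, X2])
print(np.linalg.matrix_rank(X))  # 2, not 3: X'X is singular
```

Dropping either X1 or X2 restores full rank, which is the fix the slide suggests.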

Imperfect Multicollinearity
Imperfect multicollinearity occurs when
two (or more) explanatory variables are
imperfectly linearly related, as in:

X1i = α0 + α1X2i + ui    (8.7)

Compare Equation 8.7 to Equation 8.1.
Notice that Equation 8.7 includes ui, a
stochastic error term.

Not as easy to deal with.



Identifying Multicollinearity
A. High simple correlation
coefficients (.8 or higher)
B. High Variance Inflation Factors
(VIFs) (5 or higher)
Why do we need to do both?
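Both diagnostics are cheap to compute. In this illustrative sketch, VIF for X1 is built from the auxiliary regression of X1 on the other regressors, VIF = 1/(1 - R²_aux):

```python
import numpy as np

# Diagnostic A: pairwise correlation. Diagnostic B: VIF for X1.
rng = np.random.default_rng(5)
n = 500
X2 = rng.normal(size=n)
X1 = X2 + 0.3 * rng.normal(size=n)   # strongly (but imperfectly) related

corr = np.corrcoef(X1, X2)[0, 1]

# Auxiliary regression of X1 on the other regressors (here just X2).
Z = np.column_stack([np.ones(n), X2])
g, *_ = np.linalg.lstsq(Z, X1, rcond=None)
resid = X1 - Z @ g
r2_aux = 1 - (resid @ resid) / ((X1 - X1.mean()) ** 2).sum()
vif = 1 / (1 - r2_aux)
print(round(corr, 2), round(vif, 1))  # both flags fire here
```

With more than two regressors the VIF can be large even when no single pairwise correlation is, which is why the slide asks for both checks.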

Group Activity: Correcting Multicollinearity

Why do we need to correct it?
Name 2 ways to correct it.


Correcting Multicollinearity
Why do we need to correct it?
Standard errors biased upwards
t-scores biased downwards

Name 2 ways to correct it:
Drop a multicollinear variable
Increase the sample size

HETEROSKEDASTICITY


Heteroskedasticity
Perhaps the most frequently specified model of
pure heteroskedasticity relates the variance of the
error term to an exogenous variable Zi as follows:

VAR(εi) = σi²    (10.3)
σi² = σ²Zi²    (10.4)

where Z, the proportionality factor, may or may
not be in the equation.
Impure heteroskedasticity is heteroskedasticity that is
caused by a specification error.
Impure heteroskedasticity almost always originates from
an omitted variable.

Pure Heteroskedasticity
1. If var(u|X = x) is constant, that is,
if the variance of the conditional
distribution of u given X does not
depend on X, then u is said to be
homoskedastic.
2. Otherwise, u is heteroskedastic.


Pure Homoskedastic Error Term with Respect to Xi


Pure Heteroskedastic Error Term with Respect to Xi


Identifying Heteroskedasticity
Park Test
White Test
Breusch-Pagan / Cook-Weisberg
Breusch-Pagan LM test

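As a minimal sketch of the idea behind the Breusch-Pagan family of tests (not Stata's exact command), regress the squared OLS residuals on the regressors; under homoskedasticity the LM statistic n·R² of that auxiliary regression is approximately chi-square with k degrees of freedom. All data below are simulated and deliberately heteroskedastic:

```python
import numpy as np
from scipy.stats import chi2

# Simulated heteroskedastic data: error standard deviation grows with x.
rng = np.random.default_rng(6)
n = 1000
x = rng.uniform(1, 5, size=n)
u = rng.normal(size=n) * x
y = 1 + 2 * x + u

# Step 1: OLS residuals from the main regression.
X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
e2 = (y - X @ b) ** 2

# Step 2: auxiliary regression of squared residuals on the regressors.
g, *_ = np.linalg.lstsq(X, e2, rcond=None)
resid = e2 - X @ g
r2 = 1 - (resid @ resid) / ((e2 - e2.mean()) ** 2).sum()

# Step 3: LM = n * R^2, compared to chi-square(1) here (one regressor).
lm = n * r2
p_value = chi2.sf(lm, df=1)
print(lm, p_value)  # large LM, tiny p-value: reject homoskedasticity
```

The Park and White tests follow the same template with different auxiliary regressors (log squared residuals on log Z; squares and cross products of the X's, respectively).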

Group Activity: Correcting Heteroskedasticity

Why do we need to correct it?
Name 2 ways to correct it.


Correcting Heteroskedasticity
Why do we need to correct it?
It underestimates the standard errors
It overestimates the t-scores

Name 2 ways to correct it:
Robust standard errors: reg y x1 x2, robust
Transforming the data (double-log is the most
popular, or a per capita transformation of the data)

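A sketch of what the ", robust" option computes: the heteroskedasticity-robust (HC1) sandwich variance estimator, built directly from the formula on simulated data. The data-generating process here is illustrative.

```python
import numpy as np

# Simulated data with heteroskedastic errors.
rng = np.random.default_rng(7)
n = 1000
x = rng.uniform(1, 5, size=n)
y = 1 + 2 * x + rng.normal(size=n) * x

X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ b
k = X.shape[1]

# Sandwich estimator: (X'X)^-1 [sum e_i^2 x_i x_i'] (X'X)^-1,
# with the HC1 small-sample correction n/(n-k).
XtX_inv = np.linalg.inv(X.T @ X)
meat = X.T @ (X * (e ** 2)[:, None])
V_robust = n / (n - k) * XtX_inv @ meat @ XtX_inv
se_robust = np.sqrt(np.diag(V_robust))
print(b[1], se_robust)  # slope near 2, robust SEs for intercept and slope
```

These standard errors stay valid whether or not the errors are homoskedastic, which is why robust SEs are the default fix.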

NONLINEAR REGRESSION


1. Polynomials in X
Approximate the population regression function by a
polynomial:

Yi = β0 + β1Xi + β2Xi² + … + βrXi^r + ui

This is just the linear multiple regression model
except that the regressors are powers of X!
Estimation, hypothesis testing, etc. proceed as in
the multiple regression model using OLS.
The coefficients are difficult to interpret, but the
regression function itself is interpretable.
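Since a polynomial in X is just OLS on powers of X, the fit is a one-liner once the regressor matrix is built. Illustrative sketch (the true model here is quadratic, so the cubic term should come out near zero):

```python
import numpy as np

# Fit a cubic in x by OLS on the regressors x, x^2, x^3.
rng = np.random.default_rng(8)
n = 400
x = rng.uniform(-2, 2, size=n)
y = 1 + x - 0.5 * x ** 2 + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), x, x ** 2, x ** 3])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)  # roughly [1, 1, -0.5, 0]
```

Testing whether the cubic term is needed is then the usual t-test on its coefficient.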

Group Activity:
Is quadratic better than linear?
Based on what we have learned together in this class,
how would we test if

TestScore-hat = 607.3 + 3.85×Income - 0.0423×(Income)²
                 (2.9)    (0.27)      (0.0048)

should be chosen and not the linear form?

How about the cubic compared to the quadratic?

TestScorei = β0 + β1Incomei + β2(Incomei)² + β3(Incomei)³ + ui

Group Activity Answers:
Is quadratic better than linear?
Based on what we have learned together in this class,
how would we test if

TestScore-hat = 607.3 + 3.85×Income - 0.0423×(Income)²
                 (2.9)    (0.27)      (0.0048)

should be chosen and not the linear form?

See if the coefficient on the quadratic term is
significant. If so, then we should keep the
quadratic form.


Interacting two dummy variables:

Yi = β0 + β1D1i + β2D2i + β3(D1i×D2i) + ui, where D1i, D2i are binary
Let

HiSTR = 1 if STR ≥ 20, 0 if STR < 20
HiEL = 1 if PctEL ≥ 10, 0 if PctEL < 10

TestScore-hat = 664.1 - 18.2×HiEL - 1.9×HiSTR - 3.5×(HiSTR×HiEL)
                 (1.4)   (2.3)      (1.9)       (3.1)

Effect of HiSTR when HiEL = 0 is -1.9
Effect of HiSTR when HiEL = 1 is -1.9 - 3.5 = -5.4
Class size reduction is estimated to have a bigger effect
when the percent of English learners is large.
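The two effects above can be read off mechanically by differencing the fitted equation at HiSTR = 1 vs. HiSTR = 0:

```python
# Fitted equation from the slide:
# TestScore = 664.1 - 18.2*HiEL - 1.9*HiSTR - 3.5*(HiSTR*HiEL)
def predicted(hi_str, hi_el):
    return 664.1 - 18.2 * hi_el - 1.9 * hi_str - 3.5 * (hi_str * hi_el)

effect_when_el0 = predicted(1, 0) - predicted(0, 0)
effect_when_el1 = predicted(1, 1) - predicted(0, 1)
print(effect_when_el0, effect_when_el1)  # -1.9 and -5.4
```

The gap between the two effects (-3.5) is exactly the interaction coefficient.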


Binary-continuous interactions

Example: TestScore, STR, HiEL (= 1 if PctEL ≥ 10)

TestScore-hat = 682.2 - 0.97×STR + 5.6×HiEL - 1.28×(STR×HiEL)
                (11.9)  (0.59)     (19.5)     (0.97)

When HiEL = 0:
TestScore-hat = 682.2 - 0.97×STR
When HiEL = 1:
TestScore-hat = 682.2 - 0.97×STR + 5.6 - 1.28×STR
              = 687.8 - 2.25×STR
Two regression lines: one for each HiEL group.
Class size reduction is estimated to have a larger effect when
the percent of English learners is large.
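The slope for the HiEL = 1 group can be verified by differencing the fitted equation at adjacent STR values:

```python
# Fitted equation from the slide:
# TestScore = 682.2 - 0.97*STR + 5.6*HiEL - 1.28*(STR*HiEL)
def predicted(str_, hi_el):
    return 682.2 - 0.97 * str_ + 5.6 * hi_el - 1.28 * (str_ * hi_el)

# HiEL = 0 line: intercept 682.2, slope -0.97.
# HiEL = 1 line: intercept 682.2 + 5.6 = 687.8, slope -0.97 - 1.28 = -2.25.
slope_el0 = predicted(21, 0) - predicted(20, 0)
slope_el1 = predicted(21, 1) - predicted(20, 1)
print(round(slope_el0, 2), round(slope_el1, 2))  # -0.97 and -2.25
```

The difference in slopes (-1.28) is the coefficient on the STR×HiEL interaction.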
