Chapters 6, 7, 8
β1 = ΔY/ΔX1, holding X2 constant
β2 = ΔY/ΔX2, holding X1 constant
Regression with robust standard errors          Number of obs = 420
                                                F(2, 417)     = 223.82
                                                Prob > F      = 0.0000
                                                R-squared     = 0.4264
                                                Root MSE      = 14.464

        |             Robust
testscr |     Coef.  Std. Err.       t   P>|t|   [95% Conf. Interval]
--------+------------------------------------------------------------
    str | -1.101296   .4328472   -2.54   0.011   -1.95213   -.2504616
  pctel | -.6497768   .0310318  -20.94   0.000   -.710775   -.5887786
  _cons |  686.0322   8.728224   78.60   0.000   668.8754    703.189

TestScore-hat = 686.0 − 1.10×STR − 0.650×PctEL
min over b0, b1, b2 of:  Σ(i=1..n) [Yi − (b0 + b1X1i + b2X2i)]²
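The OLS estimates are the b's that minimize this sum of squared residuals. A minimal numerical sketch, using simulated data (the names and true coefficients loosely echo the test-score example but are illustrative, not the California dataset):

```python
import numpy as np

# Simulated data (illustrative only): true coefficients b0=690, b1=-1.1, b2=-0.65.
rng = np.random.default_rng(0)
n = 100
X1 = rng.normal(20, 2, n)
X2 = rng.normal(15, 10, n)
Y = 690 - 1.1 * X1 - 0.65 * X2 + rng.normal(0, 5, n)

# Stack a column of ones for the intercept; lstsq then solves exactly
# min_{b0,b1,b2} sum_i [Y_i - (b0 + b1*X1i + b2*X2i)]^2.
X = np.column_stack([np.ones(n), X1, X2])
b, _, _, _ = np.linalg.lstsq(X, Y, rcond=None)
b0, b1, b2 = b
```

Because OLS minimizes the sum of squared residuals over all candidate coefficients, the fitted residual sum is no larger than the one computed at the true coefficients.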
t = (β̂1 − β1,0) / SE(β̂1)

95% confidence interval for β1: {β̂1 ± 1.96·SE(β̂1)}
Regression with robust standard errors          Number of obs = 420
                                                F(2, 417)     = 223.82
                                                Prob > F      = 0.0000
                                                R-squared     = 0.4264
                                                Root MSE      = 14.464

        |             Robust
testscr |     Coef.  Std. Err.       t   P>|t|   [95% Conf. Interval]
--------+------------------------------------------------------------
    str | -1.101296   .4328472  ______   0.011   -1.95213   -.2504616
  pctel | -.6497768   .0310318  -20.94   0.000   [______ , ______]
  _cons |  686.0322   8.728224   78.60   0.000   668.8754    703.189
Measures of Fit
Two regression statistics provide complementary
measures of how well the regression line fits,
or "explains," the data:
The regression R² (and its adjusted version, R̄²)
measures the fraction of the variance of Y that is
explained by the regressors; it is unitless, with
a maximum value of one (a perfect fit).
The standard error of the regression (SER,
or RMSE) measures the magnitude of a
typical regression residual, in the units of Y.
R² and R̄²: Measures of Fit for Multiple Regression

R² = 1 − SSR/TSS

R̄² = 1 − [(n − 1)/(n − k − 1)] × (SSR/TSS)

where k is the number of regressors.
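A minimal sketch of the three fit measures computed from OLS residuals, on simulated data (all names illustrative):

```python
import numpy as np

# Simulated data: n observations, k = 2 regressors plus an intercept.
rng = np.random.default_rng(1)
n, k = 200, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
Y = X @ np.array([5.0, 2.0, -1.0]) + rng.normal(size=n)

b, *_ = np.linalg.lstsq(X, Y, rcond=None)
resid = Y - X @ b
SSR = np.sum(resid ** 2)            # sum of squared residuals
TSS = np.sum((Y - Y.mean()) ** 2)   # total sum of squares

R2 = 1 - SSR / TSS
adj_R2 = 1 - ((n - 1) / (n - k - 1)) * (SSR / TSS)
SER = np.sqrt(SSR / (n - k - 1))    # standard error of the regression
```

Note that R̄² applies a degrees-of-freedom penalty, so it is always below R² when k ≥ 1.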
TestScore-hat = 686.0 − 1.10×STR − 0.650×PctEL
Copyright 2015 Pearson, Inc. All rights reserved.
Test whether the two coefficients are zero at the same time:
H0: β1 = 0 and β2 = 0
We use an F-test for that, with _____ restrictions, and use the F table:
q = number of restrictions
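A sketch of the homoskedasticity-only F-statistic for this joint test, computed from the restricted and unrestricted sums of squared residuals (simulated data, illustrative names; the robust F that Stata reports uses a different formula):

```python
import numpy as np

# Simulated data with q = 2 restrictions: H0: b1 = 0 and b2 = 0.
rng = np.random.default_rng(2)
n = 420
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
Y = 3.0 + 1.0 * X1 - 0.5 * X2 + rng.normal(size=n)

def ssr(Xmat, y):
    """Sum of squared residuals from an OLS fit of y on Xmat."""
    b, *_ = np.linalg.lstsq(Xmat, y, rcond=None)
    return np.sum((y - Xmat @ b) ** 2)

X_u = np.column_stack([np.ones(n), X1, X2])   # unrestricted: intercept, X1, X2
X_r = np.ones((n, 1))                          # restricted: intercept only
q, k = 2, 2
F = ((ssr(X_r, Y) - ssr(X_u, Y)) / q) / (ssr(X_u, Y) / (n - k - 1))
```

The statistic is compared to an F(q, n − k − 1) critical value; at the 5% level with q = 2 and a large sample, that is about 3.00.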
Regression with robust standard errors          Number of obs = 420
                                                F(2, 417)     = 223.82
                                                Prob > F      = 0.0000
                                                R-squared     = 0.4264
                                                Root MSE      = 14.464

        |             Robust
testscr |     Coef.  Std. Err.       t   P>|t|   [95% Conf. Interval]
--------+------------------------------------------------------------
    str | -1.101296   .4328472   -2.54   0.011   -1.95213   -.2504616
  pctel | -.6497768   .0310318  -20.94   0.000   -.710775   -.5887786
  _cons |  686.0322   8.728224   78.60   0.000   668.8754    703.189

TestScore-hat = 686.0 − 1.10×STR − 0.650×PctEL
1. OVB (omitted variable bias)
2. MULTICOLLINEARITY
3. HETEROSKEDASTICITY
MULTICOLLINEARITY
Perfect Multicollinearity
The word perfect in this context implies that the
variation in one explanatory variable can be
completely explained by movements in another
explanatory variable
An example would be (Notice: no error term!):
X1i = α0 + α1X2i   (8.1)
Imperfect Multicollinearity
Imperfect multicollinearity occurs when
two (or more) explanatory variables are
imperfectly linearly related, as in:
X1i = α0 + α1X2i + ui   (8.7)
Compare Equation 8.7 to Equation 8.1
Notice that Equation 8.7 includes ui, a
stochastic error term
Identifying Multicollinearity
A. High simple correlation
coefficients (.8 or higher)
B. High variance inflation factors
(VIFs) (5 or higher)
Why do we need to do both?
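A minimal sketch of check B, the variance inflation factor VIF_j = 1/(1 − R_j²), where R_j² comes from regressing regressor j on the other regressors (simulated data with two highly correlated regressors; names illustrative):

```python
import numpy as np

# Simulate X1 and X2 with correlation near 0.95.
rng = np.random.default_rng(3)
n = 500
X2 = rng.normal(size=n)
X1 = 0.95 * X2 + np.sqrt(1 - 0.95 ** 2) * rng.normal(size=n)

def vif(target, others):
    """VIF of `target` given the list of other regressors `others`."""
    Z = np.column_stack([np.ones(len(target))] + others)
    b, *_ = np.linalg.lstsq(Z, target, rcond=None)
    resid = target - Z @ b
    r2 = 1 - resid @ resid / np.sum((target - target.mean()) ** 2)
    return 1.0 / (1.0 - r2)

vif1 = vif(X1, [X2])   # large, since X1 is nearly a linear function of X2
```

With a correlation around 0.95, R_1² ≈ 0.90, so the VIF is around 10 and crosses the rule-of-thumb threshold of 5.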
Correcting Multicollinearity
Why do we need to correct it?
Standard errors are biased upward (inflated)
t-scores are biased downward (toward zero)
HETEROSKEDASTICITY
Heteroskedasticity
Perhaps the most frequently specified model of
pure heteroskedasticity relates the variance of the
error term to an exogenous variable Zi as follows:

Yi = β0 + β1X1i + β2X2i + εi   (10.3)

VAR(εi) = σ²Zi²   (10.4)

where Z, the proportionality factor, may or may
not be a variable in the equation
Impure heteroskedasticity is heteroskedasticity that is
caused by a specification error
Impure heteroskedasticity almost always originates from
an omitted variable
2011 Pearson Addison-Wesley. All rights reserved.
Pure Heteroskedasticity
Identifying Heteroskedasticity
Park Test
White Test
Breusch-Pagan / Cook-Weisberg
Breusch-Pagan LM test
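A minimal sketch of the Breusch-Pagan LM test: regress the squared OLS residuals on the regressors, then compute LM = n·R² from that auxiliary regression, which is approximately chi-square with df equal to the number of regressors under the null of homoskedasticity (simulated data; names illustrative):

```python
import numpy as np

# Simulate pure heteroskedasticity: error standard deviation grows with X1.
rng = np.random.default_rng(4)
n = 1000
X1 = rng.uniform(1, 5, n)
u = rng.normal(size=n) * X1          # VAR(u_i) proportional to X1_i^2
Y = 2.0 + 1.5 * X1 + u

# Step 1: OLS residuals from the main regression.
X = np.column_stack([np.ones(n), X1])
b, *_ = np.linalg.lstsq(X, Y, rcond=None)
e2 = (Y - X @ b) ** 2                # squared residuals

# Step 2: auxiliary regression of e^2 on the regressors; LM = n * R^2.
g, *_ = np.linalg.lstsq(X, e2, rcond=None)
aux_resid = e2 - X @ g
R2_aux = 1 - aux_resid @ aux_resid / np.sum((e2 - e2.mean()) ** 2)
LM = n * R2_aux                      # compare to chi-square(1) critical value
```

Here the 5% chi-square(1) critical value is 3.84; with variance rising this sharply in X1, the test rejects homoskedasticity easily.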
Correcting Heteroskedasticity
Why do we need to correct it?
OLS underestimates the standard errors,
and therefore overestimates the t-scores
1. Polynomials in X
Approximate the population regression function by a
polynomial:
Yi = β0 + β1Xi + β2Xi² + … + βrXiʳ + ui
This is just the linear multiple regression model
except that the regressors are powers of X!
Estimation, hypothesis testing, etc. proceeds as in
the multiple regression model using OLS
The coefficients are difficult to interpret, but the
regression function itself is interpretable
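Because the polynomial is linear in the coefficients, plain OLS estimates it once the powers of X are built as regressors. A minimal sketch on simulated data (the income-style variable and true quadratic are illustrative, not the textbook dataset):

```python
import numpy as np

# Simulated data: the true regression function is quadratic in x.
rng = np.random.default_rng(5)
n = 300
x = rng.uniform(5, 55, n)                        # e.g. a district-income-style variable
y = 600 + 5.0 * x - 0.05 * x ** 2 + rng.normal(0, 5, n)

# Fit a degree-r polynomial: regressors are 1, x, x^2, ..., x^r.
r = 3
X = np.column_stack([x ** p for p in range(r + 1)])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

def predict(x0):
    """Evaluate the estimated polynomial regression function at x0."""
    return sum(b[p] * x0 ** p for p in range(r + 1))
```

As the slide notes, individual coefficients are hard to interpret, but the fitted function predict(x0) is directly interpretable.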
Group Activity:
Is quadratic better than linear?
Based on what we have learned together in this class,
how would we test whether the quadratic term belongs in:

TestScore-hat = 607.3 + 3.85·Income − 0.0423·Income²
                (2.9)    (0.27)        (0.0048)
TestScore-hat = 607.3 + 3.85·Income − 0.0423·Income²
                (2.9)    (0.27)        (0.0048)
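One way to carry out the group activity: t-test the coefficient on Income² against zero, using the estimate and standard error printed above. The arithmetic as a sketch:

```python
# Coefficient and standard error on Income^2, read off the equation above.
coef_income2 = -0.0423
se_income2 = 0.0048

# t-statistic for H0: coefficient on Income^2 equals zero.
t = coef_income2 / se_income2        # roughly -8.8

# Reject H0 at the 5% level if |t| exceeds the two-sided critical value 1.96.
reject_at_5pct = abs(t) > 1.96
```

Since |t| is far above 1.96, the quadratic term is statistically significant: the quadratic specification beats the linear one here.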
HiSTR = 1 if STR ≥ 20;  0 if STR < 20
HiEL  = 1 if PctEL ≥ 10;  0 if PctEL < 10
TestScore-hat = 664.1 − 18.2·HiEL − 1.9·HiSTR − 3.5·(HiSTR×HiEL)
                (1.4)   (2.3)       (1.9)       (3.1)
Binary-continuous interactions

TestScore-hat = 682.2 − 0.97·STR + 5.6·HiEL − 1.28·(STR×HiEL)
                (11.9)  (0.59)     (19.5)     (0.97)

When HiEL = 0:
TestScore-hat = 682.2 − 0.97·STR

When HiEL = 1:
TestScore-hat = 682.2 − 0.97·STR + 5.6 − 1.28·STR
              = 687.8 − 2.25·STR
Two regression lines: one for each HiEL group.
Class size reduction is estimated to have a larger effect when
the percent of English learners is large.
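The two lines above follow directly from the estimated coefficients; the arithmetic as a sketch:

```python
# Coefficients from TestScore-hat = 682.2 - 0.97*STR + 5.6*HiEL - 1.28*(STR*HiEL).
b0, b_str, b_hiel, b_inter = 682.2, -0.97, 5.6, -1.28

# HiEL = 0 line: the HiEL and interaction terms drop out.
intercept0, slope0 = b0, b_str                       # 682.2 and -0.97

# HiEL = 1 line: the binary shifts both the intercept and the slope.
intercept1, slope1 = b0 + b_hiel, b_str + b_inter    # 687.8 and -2.25
```

The interaction coefficient (−1.28) is exactly the difference in the STR slope between the two groups, which is why the class-size effect is estimated to be larger where the share of English learners is high.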