Академический Документы
Профессиональный Документы
Культура Документы
Regression with
a Single
Regressor:
Hypothesis Tests
and Confidence
Intervals
Outline
1. The standard error of
2. Hypothesis tests concerning 1
3. Confidence intervals for 1
4. Regression when X is binary
5. Heteroskedasticity and homoskedasticity
6. Efficiency of OLS and the Student t
distribution
5-2
2.
3.
4.
5.
5-3
Object of interest: 1 in
Yi = 0 + 1Xi + ui, i = 1,, n
1 = Y/X, for an autonomous change in X (causal effect)
Estimator: the OLS estimator
1. E(u|X = x) = 0.
2. (Xi,Yi), i =1,,n, are i.i.d.
3. Large outliers are rare (E(X4) < , E(Y4) < .
Copyright 2015 Pearson, Inc. All rights reserved.
5-4
, ctd.
2
, where v = (X )u
v
~N ,
i
i
X
i
1
n( 2X )2
5-5
5-6
General approach: construct t-statistic, and compute pvalue (or compare to the N(0,1) critical value)
In general:
t=
t = 1 1,0 ,
SE ( )
t=
Y Y ,0
sY / n
5-7
1-8
1-9
1 1,0
2
5-10
Review from Ch 4
1-11
DIST_CODE:
DISTRICT CODE;
READ_SCR:
MATH_SCR:
COUNTY :
COUNTY;
DISTRICT:
DISTRICT;
GR_SPAN:
ENRL_TOT :
TOTAL ENROLLMENT;
TEACHERS:
NUMBER OF TEACHERS;
COMPUTER:
NUMBER OF COMPUTERS;
TESTSCR:
COMP_STU:
(=COMPUTER/ENRL_TOT);
EXPN_STU:
STR:
EL_PCT:
MEAL_PCT:
CALW_PCT:
AVGINC:
1-12
1-13
1-14
Variable |
Obs
Mean
Std. Dev.
Min
Max
-------------+--------------------------------------------------------math_scr |
420
653.3426
18.7542
605.4
709.5
Variable |
Obs
Mean
Std. Dev.
Min
Max
-------------+--------------------------------------------------------read_scr |
420
654.9705
20.10798
604.5
704
Variable |
Obs
Mean
Std. Dev.
Min
Max
-------------+--------------------------------------------------------str |
420
19.64043
1.891812
14
25.8
Copyright 2015 Pearson, Inc. All rights reserved.
4-15
|Robust
testscr|Coef.Std.Err.tP>|t|[95%Conf.Interval]
+
str|2.279808.51948924.390.0003.3009451.258671
_cons|698.93310.3643667.440.000678.5602719.3057
TestScore
= 698.9 2.28STR
4-16
SE(
) = 10.4
SE(
) = 0.52
1 1,0
=0=
= 2.28 0 = 4.38
SE (1 )
0.52
5-17
) = 10.4
SE(
) = 0.52
1 1,0
=0=
= 2.28 0 = 4.38
SE ( )
0.52
1
5-18
5-19
1.96SE(
)}
5-20
) = 10.4
SE(
1.96SE(
) = 0.52
:
)} = {2.28 1.960.52}
= (3.30, 1.26)
5-21
TestScore
= 698.9 2.28STR, R2 = .05, SER = 18.6
(10.4) (0.52)
This expression gives a lot of information
The estimated regression line is
= 698.9 2.28STR
TestScore
The standard error of
is 10.4
is 0.52
5-22
RegressionwithrobuststandarderrorsNumberofobs=420
F(1,418)=19.26
Prob>F=0.0000
Rsquared=0.0512
RootMSE=18.581
|Robust
testscr|Coef.Std.Err.tP>|t|[95%Conf.Interval]
+
str|2.279808.51948924.380.0003.3009451.258671
_cons|698.93310.3643667.440.000678.5602719.3057
2
=
698.9
2.28STR,
,
R
= .05, SER = 18.6
TestScore
(10.4)
t (1 = 0) = 4.38,
(0.52)
5-23
and
have approximately normal sampling distributions in
large samples
Testing:
H0: 1 = 1,0 v. 1 1,0 (1,0 is the value of 1 under H0)
t = ( 1,0)/SE( )
p-value = area under standard normal outside tact (large n)
Confidence Intervals:
95% confidence interval for 1 is { 1.96SE( )}
5-24
5-25
5-26
Yi = 0 + 1Xi + ui
0 = mean of Y when X = 0
0 + 1 = mean of Y when X = 1
SE(
) has the usual interpretation
t-statistics, confidence intervals constructed as usual
This is another way (an easy way) to do difference-in-means
analysis
The regression formulation is especially useful when we have
additional regressors (as we will very soon)
5-27
5-28
Heteroskedasticity in a picture:
5-29
1030
1031
1032
1033
Heteroskedastic or homoskedastic?
Copyright 2015 Pearson, Inc. All rights reserved.
5-34
Heteroskedastic or homoskedastic?
Copyright 2015 Pearson, Inc. All rights reserved.
5-35
5-36
Practical implications
The homoskedasticity-only formula for the standard error of
and the heteroskedasticity-robust formula differ so in
general, you get different standard errors using the different
formulas.
Homoskedasticity-only standard errors are the default
setting in regression software sometimes the only setting
(e.g. Excel). To get the general heteroskedasticity-robust
standard errors you must override the default.
If you dont override the default and there is in fact
heteroskedasticity, your standard errors (and tstatistics and confidence intervals) will be wrong
typically, homoskedasticity-only SEs are too small.
5-37
Heteroskedasticity-robust standard
errors in STATA
regress testscr str, robust
Regression with robust standard errors Number of obs =
420
F( 1,
418) =
19.26
Prob > F
= 0.0000
R-squared
= 0.0512
Root MSE
= 18.581
------------------------------------------------------------------------Robust
testscr |
Coef.
Std. Err.
t
P>|t|
[95% Conf. Interval]
------------------------------------------------------------------------qqqqstr | -2.279808
.5194892
-4.39
0.000
-3.300945
-1.258671
qq_cons |
698.933
10.36436
67.44
0.000
678.5602
719.3057
------------------------------------------------------------------------
5-38
5-39