PGP

Session 2
2017
Understanding Econometrics
A case of Multiple Regression

08/06/2017 Understanding Econometrics 1


PRECISION OF THE REGRESSION COEFFICIENTS


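$$\sigma_{\hat b_1}^2 = \frac{\sigma_u^2}{\sum_i (X_i - \bar X)^2}, \qquad \sigma_{\hat b_0}^2 = \sigma_u^2 \left[ \frac{1}{n} + \frac{\bar X^2}{\sum_i (X_i - \bar X)^2} \right]$$

$$E[\mathrm{MSD}(e)] = \frac{n-2}{n}\,\sigma_u^2, \qquad \mathrm{MSD}(e) = \frac{1}{n}\sum_{i=1}^{n} e_i^2$$

Hence, $$s_u^2 = \frac{1}{n-2}\sum_{i=1}^{n} e_i^2$$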
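. reg API FLE

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  1,    18) =   33.81
       Model |  81632.8614     1  81632.8614           Prob > F      =  0.0000
    Residual |  43466.3386    18  2414.79659           R-squared     =  0.6525
-------------+------------------------------           Adj R-squared =  0.6332
       Total |    125099.2    19  6584.16842           Root MSE      =  49.141

------------------------------------------------------------------------------
         API |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         FLE |  -2.114271   .3636374    -5.81   0.000    -2.878245   -1.350298
       _cons |   951.8735   22.78791    41.77   0.000     903.9979    999.7491
------------------------------------------------------------------------------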
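
As a rough check of the precision formulas against this output, the sketch below recomputes Root MSE and both standard errors in Python. Two inputs are assumptions taken from other outputs in this deck rather than from this table: the sum of squared deviations of FLE (the Total SS in `reg FLE PE`) and the mean of API (the `_cons` in `reg API e1`, which equals the sample mean because OLS residuals average to zero). The variable names are illustrative.

```python
import math

# Values copied from the Stata output for `reg API FLE`.
n = 20
rss = 43466.3386        # Residual SS
coef_fle = -2.114271    # slope estimate
cons = 951.8735         # intercept estimate

# Assumptions: taken from other outputs in this deck, not from this table.
sxx = 18261.8           # sum of (FLE - mean)^2, the Total SS in `reg FLE PE`
y_bar = 835.8           # mean of API, the _cons in `reg API e1`

# s_u^2 = RSS / (n - 2); its square root is Stata's "Root MSE".
s_u = math.sqrt(rss / (n - 2))

# The intercept satisfies cons = y_bar - coef * x_bar, so recover x_bar.
x_bar = (y_bar - cons) / coef_fle

se_b1 = s_u / math.sqrt(sxx)
se_b0 = s_u * math.sqrt(1.0 / n + x_bar ** 2 / sxx)

print(f"Root MSE = {s_u:.3f}")
print(f"se(b1)   = {se_b1:.7f}")
print(f"se(b0)   = {se_b0:.5f}")
```

The three printed values should agree with the Root MSE and Std. Err. columns above up to rounding of the inputs.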

A flowchart illustrating the process to move from statistics to econometrics

Start
1. Formulate the problem: choose a set of variables.
2. Choose the form of the model and specify assumptions.
3. Fit the model, using a method of fitting: LS.
4. Validate assumptions with diagnostic checks (residual plots, outlier detection). OK? If no, go back to an earlier step.
5. Evaluate the fitted model with goodness-of-fit tests. OK? If no, go back to an earlier step.
6. Use the model for the intended purpose.
Stop
TYPES OF REGRESSION MODEL AND ASSUMPTIONS

Assumptions

A.1: The model is linear in parameters and correctly specified.


Y = b1 + b2 X + u

Examples of models that are not linear in parameters:

Y = b1 X^(b2) + u
Y = b1 + b2X2 + b3X3 + b2b3X4 + u

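Assumptions

A.2: The disturbance term has zero expectation.

E(ui) = 0 for all i

Suppose instead that Yi = b0 + b1Xi + ui with $E(u_i) = \mu_u \neq 0$. Define $v_i = u_i - \mu_u$. Then

$$Y_i = b_0 + b_1 X_i + v_i + \mu_u = b_0^* + b_1 X_i + v_i, \qquad \text{where } b_0^* = b_0 + \mu_u$$

$$E(v_i) = E(u_i - \mu_u) = \mu_u - \mu_u = 0$$

A.2*: cov(X, u) = 0, which gives unbiased estimates.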
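Results of the Assumptions: UNBIASEDNESS OF THE REGRESSION COEFFICIENTS

Simple regression model: Y = b0 + b1X + u

LS estimate:

$$\hat b_1 = \frac{\sum_i (X_i - \bar X)(Y_i - \bar Y)}{\sum_i (X_i - \bar X)^2} = b_1 + \sum_i a_i u_i, \qquad a_i = \frac{X_i - \bar X}{\sum_j (X_j - \bar X)^2}$$

$$E(\hat b_1) = E(b_1) + E\Big(\sum_i a_i u_i\Big) = b_1 + \sum_i E(a_i u_i) = b_1 + \sum_i E(a_i)\,E(u_i) = b_1$$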
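
The unbiasedness result can be illustrated with a small Monte Carlo sketch. Everything in it is hypothetical and not from the slides: the true parameters, the fixed X values, the normal errors with unit variance, and the number of replications. It also checks, for one draw, the decomposition of the slope estimate into the true slope plus the weighted sum of the errors.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical setup (not from the slides).
b0_true, b1_true = 2.0, 0.5
xs = [float(i) for i in range(1, 21)]   # fixed regressor values
x_bar = sum(xs) / len(xs)
sxx = sum((x - x_bar) ** 2 for x in xs)

# Weights a_i = (X_i - X_bar) / sum_j (X_j - X_bar)^2 from the derivation.
a = [(x - x_bar) / sxx for x in xs]

def ols_slope(ys):
    """Least-squares slope of ys on the fixed xs."""
    y_bar = sum(ys) / len(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx

reps = 2000
estimates = []
for _ in range(reps):
    us = [random.gauss(0.0, 1.0) for _ in xs]             # E(u_i) = 0
    ys = [b0_true + b1_true * x + u for x, u in zip(xs, us)]
    estimates.append(ols_slope(ys))

mean_estimate = sum(estimates) / reps

# Decomposition check on the last draw: slope = b1 + sum_i a_i * u_i.
decomposed = b1_true + sum(ai * u for ai, u in zip(a, us))

print(f"mean of {reps} slope estimates: {mean_estimate:.4f} (true value {b1_true})")
print(f"last draw: slope {estimates[-1]:.6f}, b1 + sum(a*u) {decomposed:.6f}")
```

Individual estimates scatter around the true slope; it is their average over many samples that settles near it, which is what unbiasedness claims.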



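Assumptions
A.3: There should be some variation in X.

$$\hat b_1 = \frac{\sum_i (X_i - \bar X)(Y_i - \bar Y)}{\sum_i (X_i - \bar X)^2}$$

If $X_i = \bar X$ for all i, then $\hat b_1 = \frac{0}{0}$, which is undefined.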



A.4: The variance of the random error u is constant across observations:

$$\operatorname{var}(u) = E[u - E(u)]^2 = \sigma^2$$

If Assumption A.4 is not satisfied, the OLS regression coefficients will be inefficient, and you should be able to obtain more reliable results by using a modification of the OLS regression technique.


A.5: The covariance between any pair of random errors, ui and uj (i ≠ j), is zero:

$$\operatorname{cov}(u_i, u_j) = E[(u_i - E(u))(u_j - E(u))] = 0$$

For example, just because the disturbance term is large and positive in one observation,
there should be no tendency for it to be large and positive in the next (or large and
negative, for that matter, or small and positive, or small and negative).

If this assumption is not satisfied, OLS will again give inefficient estimates.

Gauss-Markov Theorem: Under these assumptions of the linear regression model, the estimators $\hat b_0$ and $\hat b_1$ have the smallest variance of all linear and unbiased estimators of $b_0$ and $b_1$. They are called the Best Linear Unbiased Estimators (BLUE).

Assumptions
For inference purposes,
A.6: we assume the errors follow a normal distribution.
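      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  1,    18) =   33.81
       Model |  81632.8614     1  81632.8614           Prob > F      =  0.0000
    Residual |  43466.3386    18  2414.79659           R-squared     =  0.6525
-------------+------------------------------           Adj R-squared =  0.6332
       Total |    125099.2    19  6584.16842           Root MSE      =  49.141

------------------------------------------------------------------------------
         API |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         FLE |  -2.114271   .3636374    -5.81   0.000    -2.878245   -1.350298
       _cons |   951.8735   22.78791    41.77   0.000     903.9979    999.7491
------------------------------------------------------------------------------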



Statistical analysis: usefulness of X as a predictor of Y

An appropriate test for the null hypothesis H0: b1 = 0 against the alternative H1: b1 ≠ 0 is the t-test.



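Sampling Distribution of b0 (intercept) and b1 (slope)

$$t_1 = \frac{\hat b_1}{s_u \Big/ \sqrt{\sum_i (x_i - \bar x)^2}}, \qquad t_0 = \frac{\hat b_0}{s_u \left[ \frac{1}{n} + \frac{\bar x^2}{\sum_i (x_i - \bar x)^2} \right]^{1/2}}$$

$$s_u^2 = \frac{\sum_i e_i^2}{n-2} = \frac{RSS}{n-2}$$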
t-Test

The statistic t1 is distributed as Student's t with n-2 degrees of freedom. The test is carried out by comparing this observed value with the appropriate critical value obtained from the t table.
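
To make the procedure concrete, here is a minimal sketch that applies it to the FLE slope from the `reg API FLE` output in this deck. The critical value 2.101 is the two-sided 5% value for 18 degrees of freedom read from a standard t table; it is hard-coded as an assumption rather than computed, and the variable names are illustrative.

```python
# Values copied from the `reg API FLE` output.
coef = -2.114271     # estimated slope on FLE
se = 0.3636374       # its standard error
n = 20

df = n - 2           # degrees of freedom in simple regression
t_crit = 2.101       # assumption: 5% two-sided critical value for 18 df, from a t table

t_stat = coef / se                   # test of H0: b1 = 0
reject_h0 = abs(t_stat) > t_crit     # compare the observed value with the critical value

# The matching 95% confidence interval uses the same critical value.
ci_low = coef - t_crit * se
ci_high = coef + t_crit * se

print(f"t = {t_stat:.2f} with {df} df, reject H0: {reject_h0}")
print(f"95% CI: [{ci_low:.4f}, {ci_high:.4f}]")
```

The interval differs from Stata's in the fifth decimal place only because the table value is rounded to three decimals.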



Evaluate the model: GOODNESS OF FIT

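$$e_i = Y_i - \hat Y_i \;\Rightarrow\; Y_i = \hat Y_i + e_i$$

$$TSS = \sum_i (Y_i - \bar Y)^2 = \sum_i \big( [\hat Y_i + e_i] - [\bar{\hat Y} + \bar e] \big)^2 = \sum_i \big( [\hat Y_i - \bar Y] + e_i \big)^2, \qquad \text{since } \bar{\hat Y} = \bar Y,\ \bar e = 0$$

$$\sum_i (Y_i - \bar Y)^2 = \sum_i (\hat Y_i - \bar Y)^2 + \sum_i e_i^2 + 2 \sum_i (\hat Y_i - \bar Y) e_i$$

$$= \sum_i (\hat Y_i - \bar Y)^2 + \sum_i e_i^2 + 2 \sum_i \hat Y_i e_i - 2 \bar Y \sum_i e_i$$

Because $\sum_i \hat Y_i e_i = 0$ and $\sum_i e_i = 0$,

$$\sum_i (Y_i - \bar Y)^2 = \sum_i (\hat Y_i - \bar Y)^2 + \sum_i e_i^2, \qquad TSS = ESS + RSS$$

$$R^2 = \frac{ESS}{TSS} = \frac{\sum_i (\hat Y_i - \bar Y)^2}{\sum_i (Y_i - \bar Y)^2}$$

$$R^2 = \frac{TSS - RSS}{TSS} = 1 - \frac{\sum_i e_i^2}{\sum_i (Y_i - \bar Y)^2}$$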
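
As a quick numerical check of the decomposition, the sketch below takes the Model, Residual and Total sums of squares from the `reg API FLE` output in this deck and computes R-squared both ways. Only the variable names are introduced here.

```python
# Sums of squares copied from the `reg API FLE` output.
ess = 81632.8614    # Model SS (explained sum of squares)
rss = 43466.3386    # Residual SS
tss = 125099.2      # Total SS

# TSS = ESS + RSS, up to rounding in the printed output.
gap = tss - (ess + rss)

r2_from_ess = ess / tss          # R^2 = ESS / TSS
r2_from_rss = 1.0 - rss / tss    # R^2 = 1 - RSS / TSS

print(f"TSS - (ESS + RSS) = {gap:.6f}")
print(f"R^2 via ESS: {r2_from_ess:.4f}, via RSS: {r2_from_rss:.4f}")
```

Both routes give the R-squared that Stata reports for this regression, about 0.65.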



Measure of Variation



Multiple Regression: can I improve goodness of fit?



Is it only poverty or something else?
Parents' education: is it because FLE explains API, or because FLE is correlated with a whole set of other variables (including parents' schooling and school funding)? We do not know.

The goal is to isolate how poverty affects API, but the omitted variable might in fact be the true cause.

So we add PE: the percentage of children in a school whose parents have a college degree.

Read data



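Now 3 normal equations

$$\hat b_0:\quad \sum_{i=1}^{n} (Y_i - \hat b_0 - \hat b_1 X_{1i} - \hat b_2 X_{2i})(-1) = 0$$

$$\hat b_1:\quad \sum_{i=1}^{n} (Y_i - \hat b_0 - \hat b_1 X_{1i} - \hat b_2 X_{2i})(-X_{1i}) = 0$$

$$\hat b_2:\quad \sum_{i=1}^{n} (Y_i - \hat b_0 - \hat b_1 X_{1i} - \hat b_2 X_{2i})(-X_{2i}) = 0$$

$$\hat b_0 = \bar Y - \hat b_1 \bar X_1 - \hat b_2 \bar X_2$$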



Multiple Regression
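With lowercase letters denoting deviations from sample means:

$$\hat b_1 = \frac{\sum_{i=1}^{n} x_{1i} y_i \sum_{i=1}^{n} x_{2i}^2 - \sum_{i=1}^{n} x_{1i} x_{2i} \sum_{i=1}^{n} x_{2i} y_i}{\sum_{i=1}^{n} x_{1i}^2 \sum_{i=1}^{n} x_{2i}^2 - \left( \sum_{i=1}^{n} x_{1i} x_{2i} \right)^2}$$

$$\hat b_2 = \frac{\sum_{i=1}^{n} x_{2i} y_i \sum_{i=1}^{n} x_{1i}^2 - \sum_{i=1}^{n} x_{1i} x_{2i} \sum_{i=1}^{n} x_{1i} y_i}{\sum_{i=1}^{n} x_{1i}^2 \sum_{i=1}^{n} x_{2i}^2 - \left( \sum_{i=1}^{n} x_{1i} x_{2i} \right)^2}$$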
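
The two slope formulas can be tried on the API example without the raw data. The sketch below assumes that x1 is FLE, x2 is PE and y is API, and recovers the sums of squares and cross-products in deviation form from the Stata outputs reported in this deck: each Total SS is a sum of squared deviations, and each simple-regression slope times the regressor's sum of squares gives the cross-product. This reconstruction is an illustration, not the way Stata computes the estimates.

```python
# Sums of squared deviations (Total SS of each variable in the deck's outputs).
s11 = 18261.8             # sum x1^2, FLE: Total SS in `reg FLE PE`
s22 = 13355.8             # sum x2^2, PE:  Total SS in `reg PE FLE`

# Cross-products, recovered as (simple slope) * (sum of squares of the regressor).
s12 = -0.6865041 * s11    # sum x1*x2: slope of PE on FLE
s1y = -2.114271 * s11     # sum x1*y:  slope of API on FLE
s2y = 2.815286 * s22      # sum x2*y:  slope of API on PE

den = s11 * s22 - s12 ** 2    # common denominator of both formulas

b1 = (s1y * s22 - s12 * s2y) / den
b2 = (s2y * s11 - s12 * s1y) / den

print(f"b1 (FLE) = {b1:.5f}")
print(f"b2 (PE)  = {b2:.5f}")
```

Up to rounding of the inputs, these match the FLE and PE coefficients in the `reg API FLE PE` output.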



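Multiple = Simple

$$\hat b_1 = \frac{\sum_{i=1}^{n} x_{1i} y_i \sum_{i=1}^{n} x_{2i}^2 - \sum_{i=1}^{n} x_{1i} x_{2i} \sum_{i=1}^{n} x_{2i} y_i}{\sum_{i=1}^{n} x_{1i}^2 \sum_{i=1}^{n} x_{2i}^2 - \left( \sum_{i=1}^{n} x_{1i} x_{2i} \right)^2}$$

If $\sum_i x_{1i} x_{2i} = 0$, this reduces to the simple regression slope $\hat b_1 = \sum_i x_{1i} y_i \big/ \sum_i x_{1i}^2$.

Leaving out a relevant variable: the difference may be quite large, depending on the covariance between x1 and x2 and between y and x2.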
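
The size of that difference can be sketched with a standard identity: the simple-regression slope of y on x1 equals the multiple-regression slope on x1 plus the slope on x2 times the slope from regressing x2 on x1. The numbers below are copied from the Stata outputs in this deck, assuming x1 is FLE, x2 is PE and y is API; the variable names are illustrative.

```python
# Coefficients copied from the deck's Stata outputs.
b1_multiple = -0.5105995    # FLE in `reg API FLE PE`
b2_multiple = 2.335998      # PE  in `reg API FLE PE`
delta = -0.6865041          # slope of PE on FLE in `reg PE FLE`
b1_simple_reported = -2.114271   # FLE in `reg API FLE`

# Slope on FLE when PE is left out: own effect plus the part picked up from PE.
picked_up_from_pe = b2_multiple * delta
b1_simple_implied = b1_multiple + picked_up_from_pe

print(f"part picked up from PE: {picked_up_from_pe:.4f}")
print(f"implied simple slope:   {b1_simple_implied:.5f}")
print(f"reported simple slope:  {b1_simple_reported:.5f}")
```

Here most of the simple-regression slope on FLE comes from the omitted PE term, because FLE and PE are strongly negatively correlated and PE has a large coefficient.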



Simple Regression



Multiple Regression
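. reg API FLE PE

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  2,    17) =   52.09
       Model |  107548.893     2  53774.4467           Prob > F      =  0.0000
    Residual |  17550.3065    17  1032.37097           R-squared     =  0.8597
-------------+------------------------------           Adj R-squared =  0.8432
       Total |    125099.2    19  6584.16842           Root MSE      =  32.131

------------------------------------------------------------------------------
         API |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         FLE |  -.5105995   .3987211    -1.28   0.218    -1.351827    .3306285
          PE |   2.335998   .4662362     5.01   0.000     1.352325     3.31967
       _cons |   777.1664   37.91938    20.50   0.000     697.1635    857.1693
------------------------------------------------------------------------------

. corr FLE PE
(obs=20)

             |      FLE       PE
-------------+------------------
         FLE |   1.0000
          PE |  -0.8027   1.0000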



Advantage of Multiple Regression
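. reg FLE PE

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  1,    18) =   32.62
       Model |  11768.0225     1  11768.0225           Prob > F      =  0.0000
    Residual |  6493.77755    18  360.765419           R-squared     =  0.6444
-------------+------------------------------           Adj R-squared =  0.6247
       Total |     18261.8    19  961.147368           Root MSE      =  18.994

------------------------------------------------------------------------------
         FLE |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          PE |  -.9386783   .1643529    -5.71   0.000    -1.283971   -.5933856
       _cons |   89.72497   7.430862    12.07   0.000      74.1133    105.3366
------------------------------------------------------------------------------

. predict e1,resid
(2 missing values generated)

. reg API e1

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  1,    18) =    0.25
       Model |  1693.00479     1  1693.00479           Prob > F      =  0.6253
    Residual |  123406.195    18  6855.89973           R-squared     =  0.0135
-------------+------------------------------           Adj R-squared = -0.0413
       Total |    125099.2    19  6584.16842           Root MSE      =    82.8

------------------------------------------------------------------------------
         API |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          e1 |  -.5105995   1.027504    -0.50   0.625    -2.669305    1.648106
       _cons |      835.8   18.51472    45.14   0.000      796.902     874.698
------------------------------------------------------------------------------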



Advantage of Multiple Regression
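. reg PE FLE

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  1,    18) =   32.62
       Model |  8606.56421     1  8606.56421           Prob > F      =  0.0000
    Residual |  4749.23579    18  263.846433           R-squared     =  0.6444
-------------+------------------------------           Adj R-squared =  0.6247
       Total |     13355.8    19  702.936842           Root MSE      =  16.243

------------------------------------------------------------------------------
          PE |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         FLE |  -.6865041   .1201998    -5.71   0.000    -.9390345   -.4339736
       _cons |   74.78907   7.532511     9.93   0.000     58.96385    90.61429
------------------------------------------------------------------------------

. predict e2,resid
(2 missing values generated)

. reg API e2

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  1,    18) =    4.70
       Model |  25916.0318     1  25916.0318           Prob > F      =  0.0437
    Residual |  99183.1682    18  5510.17601           R-squared     =  0.2072
-------------+------------------------------           Adj R-squared =  0.1631
       Total |    125099.2    19  6584.16842           Root MSE      =  74.231

------------------------------------------------------------------------------
         API |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          e2 |   2.335998   1.077137     2.17   0.044     .0730171    4.598978
       _cons |      835.8   16.59846    50.35   0.000     800.9279    870.6721
------------------------------------------------------------------------------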
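
The residual regressions on these two slides reproduce the multiple-regression coefficients: regressing API on the part of FLE not explained by PE (e1) gives the FLE coefficient, and likewise e2 gives the PE coefficient. The sketch below checks the arithmetic using only sums recovered from the Stata outputs in this deck. The recovery of the cross-products from simple-regression slopes is an assumption of this illustration, not something Stata reports directly.

```python
# Sums of squared deviations and cross-products with API, recovered from the outputs.
s11 = 18261.8            # FLE: Total SS in `reg FLE PE`
s22 = 13355.8            # PE:  Total SS in `reg PE FLE`
s1y = -2.114271 * s11    # sum FLE*API (deviations): slope in `reg API FLE` times s11
s2y = 2.815286 * s22     # sum PE*API  (deviations): slope in `reg API PE` times s22

d_fle_on_pe = -0.9386783   # slope in `reg FLE PE`
d_pe_on_fle = -0.6865041   # slope in `reg PE FLE`
sum_e1_sq = 6493.77755     # Residual SS in `reg FLE PE`, i.e. sum of e1^2
sum_e2_sq = 4749.23579     # Residual SS in `reg PE FLE`, i.e. sum of e2^2

# e1 = FLE - d*PE (in deviations), so sum e1*API = s1y - d*s2y; same idea for e2.
sum_e1_y = s1y - d_fle_on_pe * s2y
sum_e2_y = s2y - d_pe_on_fle * s1y

slope_on_e1 = sum_e1_y / sum_e1_sq   # compare with e1 in `reg API e1`
slope_on_e2 = sum_e2_y / sum_e2_sq   # compare with e2 in `reg API e2`

print(f"slope of API on e1: {slope_on_e1:.5f}")
print(f"slope of API on e2: {slope_on_e2:.5f}")
```

The slopes match the multiple regression, but the standard errors in `reg API e1` and `reg API e2` are larger than in `reg API FLE PE`, because the variation explained by the other regressor is left in the residual.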



Perfect Multicollinearity
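With $x_{2i} = \lambda x_{1i}$ for all i:

$$\hat b_1 = \frac{\sum_{i=1}^{n} x_{1i} y_i \sum_{i=1}^{n} (\lambda x_{1i})^2 - \sum_{i=1}^{n} x_{1i} (\lambda x_{1i}) \sum_{i=1}^{n} (\lambda x_{1i}) y_i}{\sum_{i=1}^{n} x_{1i}^2 \sum_{i=1}^{n} (\lambda x_{1i})^2 - \left( \sum_{i=1}^{n} x_{1i} (\lambda x_{1i}) \right)^2}$$

$$= \frac{\lambda^2 \left( \sum_{i=1}^{n} x_{1i} y_i \sum_{i=1}^{n} x_{1i}^2 - \sum_{i=1}^{n} x_{1i}^2 \sum_{i=1}^{n} x_{1i} y_i \right)}{\lambda^2 \left( \sum_{i=1}^{n} x_{1i}^2 \sum_{i=1}^{n} x_{1i}^2 - \left( \sum_{i=1}^{n} x_{1i}^2 \right)^2 \right)}$$

$$= \frac{0}{0}$$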
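
A tiny numerical illustration of the 0/0 result follows. The data are made up for this sketch (they are not the API data), with the second regressor set to exactly twice the first, so both the numerator and the denominator of the slope formula collapse to zero.

```python
# Hypothetical data, chosen only to illustrate exact collinearity.
x1_raw = [1, 2, 3, 4, 5]
lam = 2
x2_raw = [lam * v for v in x1_raw]   # x2 is an exact multiple of x1
y_raw = [3, 5, 4, 8, 10]

def deviations(values):
    """Return the values measured from their sample mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

x1, x2, y = deviations(x1_raw), deviations(x2_raw), deviations(y_raw)

s11 = sum(a * a for a in x1)
s22 = sum(b * b for b in x2)
s12 = sum(a * b for a, b in zip(x1, x2))
s1y = sum(a * c for a, c in zip(x1, y))
s2y = sum(b * c for b, c in zip(x2, y))

numerator_b1 = s1y * s22 - s12 * s2y
denominator = s11 * s22 - s12 ** 2

if denominator == 0:
    print(f"numerator = {numerator_b1}, denominator = {denominator}: b1 is not defined")
else:
    print(f"b1 = {numerator_b1 / denominator}")
```

With real data the collinearity is rarely exact, so the denominator is small rather than zero and the estimates become very imprecise instead of undefined.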



Which model should we choose?
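. reg API PE

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  1,    18) =   99.02
       Model |  105855.889     1  105855.889           Prob > F      =  0.0000
    Residual |  19243.3112    18  1069.07284           R-squared     =  0.8462
-------------+------------------------------           Adj R-squared =  0.8376
       Total |    125099.2    19  6584.16842           Root MSE      =  32.697

------------------------------------------------------------------------------
         API |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
          PE |   2.815286   .2829233     9.95   0.000     2.220886    3.409686
       _cons |   731.3529   12.79176    57.17   0.000     704.4784    758.2274
------------------------------------------------------------------------------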
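
One way to start on this question is to put the three fitted models side by side. The sketch below recomputes adjusted R-squared from the Residual SS reported in this deck; using adjusted R-squared as the yardstick is an assumption of this illustration, one possible criterion among several, not a rule stated on the slides.

```python
n = 20
tss = 125099.2   # Total SS of API, the same in every model

# model label -> (Residual SS, number of regressors), copied from the outputs.
models = {
    "API ~ FLE": (43466.3386, 1),
    "API ~ PE": (19243.3112, 1),
    "API ~ FLE + PE": (17550.3065, 2),
}

def adjusted_r2(rss, k):
    """Adjusted R^2 = 1 - [RSS / (n - k - 1)] / [TSS / (n - 1)]."""
    return 1.0 - (rss / (n - k - 1)) / (tss / (n - 1))

adj = {name: adjusted_r2(rss, k) for name, (rss, k) in models.items()}

for name, value in adj.items():
    print(f"{name:15s} adjusted R^2 = {value:.4f}")
```

On this yardstick the two models that include PE are close to each other and both well ahead of the FLE-only model. Note also that the FLE coefficient is not significant once PE is included (t = -1.28), so the choice between the last two is a judgment call.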
