Ch. 4: Simple Linear Regression


Econ 141 Spring 2014
Lecture: February 03 and 05, 2014
Bart Hobijn
The views expressed in these lecture notes are solely those of the instructor and do not necessarily
reflect those of UC Berkeley or other institutions with which he is affiliated.
2/03&05/2014

Econ 141, Spring 2014

Example: Estimate MPC


MPC: Marginal Propensity to Consume
Suppose households' pre-tax income increases by a dollar. What fraction of this
dollar would they end up spending, versus paying in taxes or saving?

Basic equation

$$Y_i = \beta_0 + \beta_1 X_i + u_i$$

$i$: MSA (unit of observation)
$Y_i$: Average consumption expenditures per household
$X_i$: Average pre-tax income per household
$\beta_0$: Average consumption level at zero income
$\beta_1$: Marginal propensity to consume (MPC)
$u_i$: MSA-specific deviation from the average linear relationship between income and spending

How can we estimate the value of the MPC, i.e. $\beta_1$?


Income and spending by MSA

MSA ($i$)               Spending ($Y_i$)   Income ($X_i$)
Chicago                 57.7               74.4
Detroit                 50.5               79.8
Minneapolis-St. Paul    56.7               66.8
Cleveland               48.0               65.9
New York                58.7               80.2
Philadelphia            53.5               71.7
Boston                  65.0               79.8
Washington, D.C.        77.9               111.9
Baltimore               62.3               96.9
Atlanta                 51.9               71.2
Miami                   40.6               58.9
Dallas-Fort Worth       57.1               71.0
Houston                 58.2               73.5
Los Angeles             55.3               69.6
San Francisco           73.6               98.2
San Diego               56.2               76.4
Seattle                 60.7               74.1
Phoenix                 53.7               63.2

Note: Spending and income are annual averages across households, in thousands of dollars
Source: Consumer Expenditure Survey

Data scatterplot

[Figure: Average Income and Expenditures by major MSA. Scatterplot of expenditures (vertical axis, 30 to 90) against income (horizontal axis, 50 to 120), one labeled point per MSA. Annual income and expenditures by household; thousands of dollars; 2012. Source: Consumer Expenditure Survey by MSA.]

Estimate of MPC ($\beta_1$)?

What is the best estimate of the line defined by $\beta_0$ and $\beta_1$?

Ordinary Least Squares

Simple linear regression model

$$Y_i = \beta_0 + \beta_1 X_i + u_i$$

$i$: observation number
$Y_i$: dependent variable (regressand)
$X_i$: independent (explanatory) variable (regressor)
$\beta_0$: intercept / constant
$\beta_1$: slope coefficient
$u_i$: error term / residual

Population regression line / population regression function
$\beta_0 + \beta_1 X_i$: average linear relationship between the dependent and the independent variable.

Error term / residual
$u_i$: observation-specific deviation from the average linear relationship between the dependent and the independent variable.

Why we need to estimate
Observed: Sample $(X_i, Y_i)$ for $i = 1, \ldots, n$.
Unobserved: Parameters $\beta_0$ and $\beta_1$, as well as the error terms $u_i$ for $i = 1, \ldots, n$.

Ordinary Least Squares (OLS)

OLS estimates:
Choose $\hat\beta_0$ and $\hat\beta_1$ to minimize the sum of squared residuals (SSR)

$$(\hat\beta_0, \hat\beta_1) = \arg\min_{b_0, b_1} \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2$$

Properties:
What is the solution for $\hat\beta_0$, $\hat\beta_1$?
Are $\hat\beta_0$, $\hat\beta_1$ consistent estimates of the true $\beta_0$ and $\beta_1$?
Are $\hat\beta_0$, $\hat\beta_1$ unbiased estimates of the true $\beta_0$ and $\beta_1$?
What is their asymptotic distribution?

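Solution for $\hat\beta_0$, $\hat\beta_1$

First order necessary condition for $\hat\beta_0$

$$0 = \frac{\partial}{\partial \hat\beta_0} \sum_{i=1}^{n} (Y_i - \hat\beta_0 - \hat\beta_1 X_i)^2 = \sum_{i=1}^{n} \frac{\partial}{\partial \hat\beta_0} (Y_i - \hat\beta_0 - \hat\beta_1 X_i)^2 = -2 \sum_{i=1}^{n} (Y_i - \hat\beta_0 - \hat\beta_1 X_i)$$

Solving for $\hat\beta_0$

$$0 = \frac{1}{n} \sum_{i=1}^{n} (Y_i - \hat\beta_0 - \hat\beta_1 X_i) = \bar Y - \hat\beta_0 - \hat\beta_1 \bar X$$

Such that

$$\hat\beta_0 = \bar Y - \hat\beta_1 \bar X$$

First order necessary condition for $\hat\beta_1$

$$0 = \frac{\partial}{\partial \hat\beta_1} \sum_{i=1}^{n} (Y_i - \hat\beta_0 - \hat\beta_1 X_i)^2 = \sum_{i=1}^{n} \frac{\partial}{\partial \hat\beta_1} (Y_i - \hat\beta_0 - \hat\beta_1 X_i)^2 = -2 \sum_{i=1}^{n} X_i (Y_i - \hat\beta_0 - \hat\beta_1 X_i)$$

so that

$$0 = \frac{1}{n} \sum_{i=1}^{n} X_i (Y_i - \hat\beta_0 - \hat\beta_1 X_i)$$

Subtracting $\bar X$ times the first order condition for $\hat\beta_0$, this implies that

$$0 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)(Y_i - \hat\beta_0 - \hat\beta_1 X_i) = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)\hat u_i$$

Substituting $\hat\beta_0 = \bar Y - \hat\beta_1 \bar X$ gives

$$0 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)(Y_i - \bar Y) - \hat\beta_1 \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2.$$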

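Simple linear regression with OLS

OLS estimators of $\beta_0$ and $\beta_1$:

$$\hat\beta_1 = \frac{\sum_{i=1}^{n} (X_i - \bar X)(Y_i - \bar Y)}{\sum_{i=1}^{n} (X_i - \bar X)^2}$$

$$\hat\beta_0 = \bar Y - \hat\beta_1 \bar X$$

Derived estimates for each $i = 1, \ldots, n$:

$\hat Y_i = \hat\beta_0 + \hat\beta_1 X_i$, predicted/fitted value of $Y_i$
$\hat u_i = Y_i - \hat Y_i$, residual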
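A minimal sketch of the closed-form OLS estimators, applied to the rounded MSA figures from the income and spending table. The function name `ols` and the variable names are illustrative, and pairing the two unlabeled table rows with Los Angeles and San Francisco is an assumption. Because the inputs are rounded to one decimal, the results should only roughly match estimates computed from unrounded data.

```python
# Income (X) and spending (Y) per household by MSA, thousands of dollars.
# Rounded values from the table; the Los Angeles and San Francisco pairing
# of the two unlabeled rows is an assumption.
msa = {
    "Chicago": (74.4, 57.7),
    "Detroit": (79.8, 50.5),
    "Minneapolis-St. Paul": (66.8, 56.7),
    "Cleveland": (65.9, 48.0),
    "New York": (80.2, 58.7),
    "Philadelphia": (71.7, 53.5),
    "Boston": (79.8, 65.0),
    "Washington, D.C.": (111.9, 77.9),
    "Baltimore": (96.9, 62.3),
    "Atlanta": (71.2, 51.9),
    "Miami": (58.9, 40.6),
    "Dallas-Fort Worth": (71.0, 57.1),
    "Houston": (73.5, 58.2),
    "Los Angeles": (69.6, 55.3),
    "San Francisco": (98.2, 73.6),
    "San Diego": (76.4, 56.2),
    "Seattle": (74.1, 60.7),
    "Phoenix": (63.2, 53.7),
}


def ols(x, y):
    """Return (b0, b1) from the closed-form simple-regression formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx            # slope: the estimated MPC
    b0 = y_bar - b1 * x_bar     # intercept
    return b0, b1


income = [pair[0] for pair in msa.values()]
spending = [pair[1] for pair in msa.values()]

b0_hat, b1_hat = ols(income, spending)
fitted = [b0_hat + b1_hat * xi for xi in income]
residuals = [yi - fi for yi, fi in zip(spending, fitted)]

print(f"intercept = {b0_hat:.2f}, slope (MPC) = {b1_hat:.3f}")
```

The residuals sum to zero up to floating-point error, which is the first order condition for the intercept.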

Estimating the MPC

[Figure: Average Income and Expenditures by major MSA. Scatterplot of expenditures (vertical axis, 30 to 90) against income (horizontal axis, 50 to 120) with the fitted OLS regression line; San Francisco is labeled. Annual income and expenditures by household; thousands of dollars; 2012. Source: Consumer Expenditure Survey by MSA.]

Estimated MPC out of pre-tax income is 56 cents on the dollar.

SF predicted value and residual

[Figure: Average Income and Expenditures by major MSA. Same scatterplot and regression line, with San Francisco's actual expenditures (73.6) and predicted value (69.6) marked. Source: Consumer Expenditure Survey by MSA.]

Expenditures in SF are higher than predicted by the regression.

Residual: $\hat u_i = Y_i - \hat Y_i = 73.6 - 69.6 = 4.0$
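As a worked check of the San Francisco numbers, the predicted value can be recomputed from the coefficients reported in the Excel output later in these notes (intercept 14.70, slope 0.559) and San Francisco's income of 98.2 from the data table. The subscript SF is shorthand introduced here.

$$\hat Y_{SF} = \hat\beta_0 + \hat\beta_1 X_{SF} \approx 14.70 + 0.559 \times 98.2 \approx 69.6$$

$$\hat u_{SF} = Y_{SF} - \hat Y_{SF} \approx 73.6 - 69.6 = 4.0$$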

Goodness of fit

Main question
What fraction of the variance of the dependent variable, $Y_i$, is explained by the
regression line rather than unexplained?
(Unexplained means part of the residuals.)

Variance accounting

$$\underbrace{\sum_{i=1}^{n} (Y_i - \bar Y)^2}_{TSS} = \underbrace{\sum_{i=1}^{n} (\hat Y_i - \bar Y)^2}_{ESS} + \underbrace{\sum_{i=1}^{n} \hat u_i^2}_{SSR}$$

TSS: total sum of squares
ESS: explained sum of squares, a.k.a. model sum of squares
SSR: sum of squared residuals, a.k.a. residual sum of squares

TSS = ESS + SSR decomposition

$$TSS = \sum_{i=1}^{n} (Y_i - \bar Y)^2 = \sum_{i=1}^{n} (\hat Y_i + \hat u_i - \bar Y)^2 = \sum_{i=1}^{n} (\hat\beta_0 + \hat\beta_1 X_i + \hat u_i - \bar Y)^2$$

Using $\hat\beta_0 = \bar Y - \hat\beta_1 \bar X$:

$$TSS = \sum_{i=1}^{n} \left(\hat\beta_1 (X_i - \bar X) + \hat u_i\right)^2 = \hat\beta_1^2 \sum_{i=1}^{n} (X_i - \bar X)^2 + 2\hat\beta_1 \sum_{i=1}^{n} (X_i - \bar X)\hat u_i + \sum_{i=1}^{n} \hat u_i^2$$

The cross term vanishes: $\sum_{i=1}^{n} (X_i - \bar X)\hat u_i = 0$ according to the first-order necessary condition for $\hat\beta_1$.

Since $\hat\beta_1 (X_i - \bar X) = \hat Y_i - \bar Y$:

$$TSS = \sum_{i=1}^{n} (\hat Y_i - \bar Y)^2 + \sum_{i=1}^{n} \hat u_i^2 = ESS + SSR.$$
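The decomposition can be checked numerically. The sketch below refits the regression on the rounded MSA figures (values from the data table, variable names illustrative, and the Los Angeles and San Francisco pairing of the two unlabeled rows an assumption) and confirms that the cross term is zero and that TSS equals ESS plus SSR up to floating-point error.

```python
# Income (X) and spending (Y) by MSA, thousands of dollars (rounded table values).
income = [74.4, 79.8, 66.8, 65.9, 80.2, 71.7, 79.8, 111.9, 96.9,
          71.2, 58.9, 71.0, 73.5, 69.6, 98.2, 76.4, 74.1, 63.2]
spending = [57.7, 50.5, 56.7, 48.0, 58.7, 53.5, 65.0, 77.9, 62.3,
            51.9, 40.6, 57.1, 58.2, 55.3, 73.6, 56.2, 60.7, 53.7]

n = len(income)
x_bar = sum(income) / n
y_bar = sum(spending) / n

# Closed-form OLS estimates.
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(income, spending))
      / sum((x - x_bar) ** 2 for x in income))
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * x for x in income]
resid = [y - f for y, f in zip(spending, fitted)]

tss = sum((y - y_bar) ** 2 for y in spending)   # total sum of squares
ess = sum((f - y_bar) ** 2 for f in fitted)     # explained sum of squares
ssr = sum(r ** 2 for r in resid)                # sum of squared residuals

# Cross term from the derivation; zero by the first order condition for b1.
cross = sum((x - x_bar) * r for x, r in zip(income, resid))

r_squared = ess / tss
print(f"TSS = {tss:.1f}, ESS = {ess:.1f}, SSR = {ssr:.1f}, R^2 = {r_squared:.3f}")
```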

Goodness of fit for MPC regression

[Figure: Average Income and Expenditures by major MSA. Scatterplot of expenditures against income with the fitted OLS regression line; San Francisco is labeled. Annual income and expenditures by household; thousands of dollars; 2012. Source: Consumer Expenditure Survey by MSA.]

$R^2$: measure of goodness of fit

Measure   Equation                                   Value     Share (percentage)
ESS       $\sum_{i=1}^{n} (\hat Y_i - \bar Y)^2$     944.0     74.9
SSR       $\sum_{i=1}^{n} \hat u_i^2$                316.4     25.1
TSS       $\sum_{i=1}^{n} (Y_i - \bar Y)^2$          1260.3    100.0

$R^2$: fraction of the variation in the dependent variable, i.e. of $Y_i$, explained by the
regression line. $R^2 = ESS/TSS = 0.749$.
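Worked out with the values in the table, the two equivalent ways of computing $R^2$ agree:

$$R^2 = \frac{ESS}{TSS} = \frac{944.0}{1260.3} \approx 0.749, \qquad R^2 = 1 - \frac{SSR}{TSS} = 1 - \frac{316.4}{1260.3} \approx 0.749$$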

Standard error of the regression

$SER = s_{\hat u}$, where

$$s_{\hat u}^2 = \frac{1}{n-2} \sum_{i=1}^{n} \hat u_i^2 = \frac{SSR}{n-2}$$

is an unbiased estimate of the variance of the residuals.

$n - 2$: Degrees of freedom correction
If we had only two observations we would be able to perfectly fit a straight line and
the residuals would be zero.

Measure of spread around regression line
Estimate of the standard deviation of the deviation from the regression line.
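With the MPC regression's values (SSR of about 316.4 and $n = 18$ observations), the standard error of the regression works out as follows. Since spending is measured in thousands of dollars, the typical deviation from the regression line is roughly 4,400 dollars.

$$s_{\hat u}^2 = \frac{SSR}{n-2} \approx \frac{316.4}{16} \approx 19.77, \qquad SER = \sqrt{19.77} \approx 4.45$$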

MPC regression in Excel

[Figure: Average Income and Expenditures by major MSA. Scatterplot of expenditures against income with the fitted OLS regression line; San Francisco is labeled. Annual income and expenditures by household; thousands of dollars; 2012. Source: Consumer Expenditure Survey by MSA.]

Example Excel regression output

SUMMARY OUTPUT
Dependent variable: Expenditures

Regression Statistics
Multiple R           0.865434015
R Square             0.748976034
Adjusted R Square    0.733287037
Standard Error       4.446724217
Observations         18

ANOVA
             df    SS             MS             F              Significance F
Regression   1     943.9589518    943.9589518    47.73893411    3.51406E-06
Residual     16    316.3737002    19.77335626
Total        17    1260.332652

             Coefficients    Standard Error    t Stat         P-value        Lower 95%      Upper 95%
Intercept    14.69690817     6.304502449       2.331176535    0.033146025    1.331960021    28.06185632
Income       0.558872878     0.080886618       6.909336734    3.51406E-06    0.387400909    0.730344847

How to read the output:
Intercept coefficient: estimated intercept, $\hat\beta_0$
Income coefficient: estimated slope, $\hat\beta_1$
Observations: sample size, $n$
SS, Regression: explained sum of squares (ESS)
SS, Residual: sum of squared residuals (SSR)
SS, Total: total sum of squares (TSS)
R Square: $R^2 = ESS/TSS$
MS, Residual: $s_{\hat u}^2 = \frac{1}{n-2} \sum_{i=1}^{n} \hat u_i^2$
Standard Error (Regression Statistics): $SER = s_{\hat u}$
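Several entries of the Excel output can be approximately reproduced by hand. The sketch below uses the rounded MSA figures from the data table, so its results should match the output only to about two digits. It also assumes the homoskedasticity-only formula $SE(\hat\beta_1) = SER / \sqrt{\sum_{i}(X_i - \bar X)^2}$ for the slope's standard error, which is not derived in these notes. Variable names are illustrative, and the Los Angeles and San Francisco pairing of the two unlabeled table rows is an assumption.

```python
import math

# Income (X) and spending (Y) by MSA, thousands of dollars (rounded table values).
income = [74.4, 79.8, 66.8, 65.9, 80.2, 71.7, 79.8, 111.9, 96.9,
          71.2, 58.9, 71.0, 73.5, 69.6, 98.2, 76.4, 74.1, 63.2]
spending = [57.7, 50.5, 56.7, 48.0, 58.7, 53.5, 65.0, 77.9, 62.3,
            51.9, 40.6, 57.1, 58.2, 55.3, 73.6, 56.2, 60.7, 53.7]

n = len(income)
x_bar = sum(income) / n
y_bar = sum(spending) / n
s_xx = sum((x - x_bar) ** 2 for x in income)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(income, spending))

b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
resid = [y - (b0 + b1 * x) for x, y in zip(income, spending)]

tss = sum((y - y_bar) ** 2 for y in spending)
ssr = sum(r ** 2 for r in resid)
ess = tss - ssr

ser = math.sqrt(ssr / (n - 2))                    # "Standard Error" in Regression Statistics
se_b1 = ser / math.sqrt(s_xx)                     # homoskedasticity-only SE (assumed formula)
t_b1 = b1 / se_b1                                 # "t Stat" for Income
f_stat = ess / (ssr / (n - 2))                    # "F" in the ANOVA table
adj_r2 = 1 - (ssr / (n - 2)) / (tss / (n - 1))    # "Adjusted R Square"

print(f"SER = {ser:.3f}, SE(b1) = {se_b1:.4f}, t = {t_b1:.2f}")
print(f"F = {f_stat:.2f}, adjusted R^2 = {adj_r2:.3f}")
```

With a single regressor the F statistic equals the square of the slope's t statistic, which is why both carry the same p-value in the output.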

Why use OLS?

Most common estimation method
Implemented in many different applications
Most common methodology. Thus important to understand.

OLS has very desirable properties
Under relatively general conditions OLS estimates
Are consistent
Are unbiased
Have a tractable asymptotic distribution

Conditions listed in book

$Y_i = \beta_0 + \beta_1 X_i + u_i$, where $i = 1, \ldots, n$

1. No information in $X_i$ about $u_i$

$$E(u_i \mid X_i) = E(u_i) = 0$$

Suppose not and instead $E(u_i \mid X_i) = \gamma X_i$, then we can write

$$Y_i = \beta_0 + \beta_1 X_i + u_i = \beta_0 + \beta_1 X_i + \gamma X_i + (u_i - \gamma X_i) = \beta_0 + (\beta_1 + \gamma) X_i + (u_i - \gamma X_i) = \beta_0 + \tilde\beta_1 X_i + \tilde u_i$$

where

$$E(\tilde u_i \mid X_i) = E(u_i - \gamma X_i \mid X_i) = E(u_i \mid X_i) - \gamma X_i = 0$$

So, in this case there is an alternative representation of the linear regression line with a
different slope parameter, $\tilde\beta_1 = \beta_1 + \gamma$, that satisfies this assumption.

Note, this implies that

$$E(X_i u_i) = E\left(X_i E(u_i \mid X_i)\right) = 0$$

Which, given $E(u_i) = 0$, implies that

$$\mathrm{cov}(X_i, u_i) = E(X_i u_i) = 0$$

$X_i$, $Y_i$ well-behaved random variables

2. $(X_i, Y_i)$, $i = 1, \ldots, n$, are independently drawn from an identical joint distribution.
3. Large outliers are unlikely.

The last two assumptions are made such that we can apply the LLN and CLT to derive properties
of $\hat\beta_0$ and $\hat\beta_1$.
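The "suppose not" case can be illustrated with a small simulation. This is a sketch under made-up assumptions: the parameter values (`beta0`, `beta1`, `gamma`), the normal distributions, and the sample size are all hypothetical. When $E(u_i \mid X_i) = \gamma X_i$, the OLS slope should land near $\beta_1 + \gamma$ rather than $\beta_1$.

```python
import random


def ols_slope(x, y):
    """Closed-form OLS slope for a simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


rng = random.Random(0)                  # fixed seed for reproducibility
beta0, beta1, gamma = 1.0, 0.5, 0.3     # hypothetical parameter values
n = 20_000

x = [rng.gauss(0.0, 1.0) for _ in range(n)]
# Error term that violates the first condition: E(u | X) = gamma * X.
u = [gamma * xi + rng.gauss(0.0, 1.0) for xi in x]
y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]

slope = ols_slope(x, y)
print(f"OLS slope = {slope:.3f}; beta1 = {beta1}, beta1 + gamma = {beta1 + gamma}")
```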

Properties of $\hat\beta_0$ and $\hat\beta_1$

Properties of $\hat\beta_0$ and $\hat\beta_1$ are derived by manipulating the first-order
necessary conditions for $\hat\beta_0$ and $\hat\beta_1$:

$$0 = \frac{1}{n} \sum_{i=1}^{n} (Y_i - \hat\beta_0 - \hat\beta_1 X_i) = \frac{1}{n} \sum_{i=1}^{n} \hat u_i$$

and

$$0 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)(Y_i - \hat\beta_0 - \hat\beta_1 X_i) = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)\hat u_i$$

These are sample approximations of the condition

$$E(u_i) = \mathrm{cov}(X_i, u_i) = 0$$

Consistency of $\hat\beta_1$

$$0 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)(Y_i - \hat\beta_0 - \hat\beta_1 X_i) = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)(\beta_0 + \beta_1 X_i + u_i - \hat\beta_0 - \hat\beta_1 X_i)$$

Because $\sum_{i=1}^{n} (X_i - \bar X) = 0$, the constants drop out and $\sum_{i=1}^{n} (X_i - \bar X) X_i = \sum_{i=1}^{n} (X_i - \bar X)^2$, so

$$0 = (\beta_1 - \hat\beta_1) \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2 + \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) u_i.$$

As $n \to \infty$:

$$\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2 \xrightarrow{p} \mathrm{var}(X_i) > 0, \qquad \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) u_i \xrightarrow{p} \mathrm{cov}(X_i, u_i) = 0$$

In the limit, $0 = (\beta_1 - \mathrm{plim}\, \hat\beta_1)\, \mathrm{var}(X_i)$.

Such that $(\beta_1 - \hat\beta_1) \xrightarrow{p} 0$, that is $\hat\beta_1 \xrightarrow{p} \beta_1$.

As the sample size gets arbitrarily large, i.e. $n \to \infty$, our estimate of the slope
coefficient, $\hat\beta_1$, gets arbitrarily close to the true parameter value $\beta_1$
from the population regression line.
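Consistency can be illustrated by simulation. In this sketch the data-generating process (standard normal $X_i$ and $u_i$, `beta0 = 1.0`, `beta1 = 0.5`) is hypothetical and chosen to satisfy the conditions listed in the book; the estimation error of the slope should typically shrink as $n$ grows, though any single draw can deviate.

```python
import random


def ols_slope(x, y):
    """Closed-form OLS slope for a simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


def simulate_slope(n, rng, beta0=1.0, beta1=0.5):
    """Draw one sample of size n with E(u | X) = 0 and return the OLS slope."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [beta0 + beta1 * xi + rng.gauss(0.0, 1.0) for xi in x]
    return ols_slope(x, y)


rng = random.Random(141)     # fixed seed for reproducibility
true_beta1 = 0.5             # hypothetical true slope

errors = {}
for n in (100, 10_000, 100_000):
    errors[n] = abs(simulate_slope(n, rng, beta1=true_beta1) - true_beta1)
    print(f"n = {n:>7}: |estimate - true slope| = {errors[n]:.4f}")
```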

But, in real life, $n$ is finite

Small sample properties of OLS estimators

Unbiasedness
On average the OLS estimate equals the true parameter value of interest.

Asymptotic distribution
OLS assumptions imply we can use the CLT to derive the asymptotic normal distribution of
the OLS estimates, which can be used as an approximation when $n$ is big.

Unbiasedness of $\hat\beta_1$

$$0 = (\beta_1 - \hat\beta_1) \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2 + \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) u_i.$$

Such that

$$\hat\beta_1 = \beta_1 + \frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) u_i}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2} = \beta_1 + \frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) u_i}{s_X^2}$$

where $s_X^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2$.

Taking expectations yields

$$E(\hat\beta_1) = E(\beta_1) + E\left(\frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) u_i}{s_X^2}\right) = \beta_1 + E\left(\frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)\, E(u_i \mid X_1, \ldots, X_n)}{s_X^2}\right)$$

$$= \beta_1 + E\left(\frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)\, E(u_i \mid X_i)}{s_X^2}\right) = \beta_1 + E\left(\frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) \cdot 0}{s_X^2}\right) = \beta_1$$

Replacing $E(u_i \mid X_1, \ldots, X_n)$ by $E(u_i \mid X_i)$ is where we apply the second condition (independent draws).
$E(u_i \mid X_i) = 0$ is implied by the first condition.

Even when the sample size is not large, on average our estimate of the slope coefficient,
$\hat\beta_1$, will equal the true parameter value $\beta_1$ from the population regression line.
$E(\hat\beta_1) = \beta_1$. Thus, unbiased.

Note that the first condition, that $E(u_i \mid X_i) = 0$, is crucial for OLS to be unbiased.
If this condition is not true, the average OLS estimate will deviate from the true parameter value.
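Unbiasedness is a statement about the average over repeated samples, which a Monte Carlo sketch can make concrete. Everything about the design below is hypothetical (standard normal $X_i$ and $u_i$, `beta1 = 0.5`, samples of only 10 observations, 20,000 replications): individual estimates vary a lot, but their average should sit close to the true slope.

```python
import random


def ols_slope(x, y):
    """Closed-form OLS slope for a simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


def simulate_slope(n, rng, beta0=1.0, beta1=0.5):
    """Draw one sample of size n with E(u | X) = 0 and return the OLS slope."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [beta0 + beta1 * xi + rng.gauss(0.0, 1.0) for xi in x]
    return ols_slope(x, y)


rng = random.Random(2014)    # fixed seed for reproducibility
true_beta1 = 0.5             # hypothetical true slope
n, reps = 10, 20_000         # deliberately small samples, many replications

estimates = [simulate_slope(n, rng, beta1=true_beta1) for _ in range(reps)]
mean_estimate = sum(estimates) / reps
spread = max(estimates) - min(estimates)

print(f"mean of estimates = {mean_estimate:.3f} (true slope {true_beta1})")
print(f"range of individual estimates = {spread:.2f}")
```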

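Asymptotic distribution of $\hat\beta_1$

$$\hat\beta_1 - \beta_1 = \frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) u_i}{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2} = \frac{\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) u_i}{s_X^2}$$

See the derivation of the unbiasedness of $\hat\beta_1$.

3 steps to deriving the asymptotic distribution
1. Apply CLT to numerator
2. Apply LLN to denominator
3. Combine using Slutsky's theorem (S&W page 676)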

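Apply CLT to $\frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X) u_i$

Define random variable

$$v_i = (X_i - \mu_X) u_i$$

(replacing $\bar X$ by $\mu_X = E(X_i)$, which is innocuous in large samples)

Condition 1 (no information in $X_i$ about $u_i$): $E(v_i) = 0$
Conditions 2&3 (independent draws, large outliers unlikely) imply that $\mathrm{var}(v_i)$ exists and is finite.
CLT applies to the sample mean of $v_i$,

$$\bar v = \frac{1}{n} \sum_{i=1}^{n} v_i$$

Note that: $\hat\beta_1 - \beta_1 = \bar v / s_X^2$

Apply the Central Limit Theorem

$$\frac{\bar v - E(v_i)}{\sqrt{\mathrm{var}(v_i)/n}} = \frac{\sqrt{n}\, \bar v}{\sqrt{\mathrm{var}(v_i)}} \xrightarrow{d} N(0, 1)$$

Such that

$$\sqrt{n}\, \bar v \xrightarrow{d} N\left(0, \mathrm{var}(v_i)\right)$$

Thus, the numerator of $\hat\beta_1 - \beta_1 = \bar v / s_X^2$ has an asymptotic distribution
that is normal with a mean equal to zero.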

Apply LLN to $s_X^2$

Conditions 2&3 (independent draws, large outliers unlikely) imply that we can apply the Law of
Large Numbers to $s_X^2$ and that $s_X^2$ converges in probability to the variance of $X_i$,
i.e. to $\mathrm{var}(X_i)$.
Thus

$$s_X^2 \xrightarrow{p} \mathrm{var}(X_i)$$

Now that we know the asymptotic behavior of the numerator and denominator of
$\hat\beta_1 - \beta_1 = \bar v / s_X^2$, the only thing left is to combine them.

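Apply Slutsky's theorem (S&W page 676)

Slutsky's theorem implies that we can combine the asymptotic properties of

$$\sqrt{n}\, \bar v \xrightarrow{d} N\left(0, \mathrm{var}(v_i)\right) \quad \text{and} \quad s_X^2 \xrightarrow{p} \mathrm{var}(X_i)$$

such that

$$\sqrt{n}\,(\hat\beta_1 - \beta_1) = \frac{\sqrt{n}\, \bar v}{s_X^2} \xrightarrow{d} N\left(0, \frac{\mathrm{var}(v_i)}{[\mathrm{var}(X_i)]^2}\right)$$

and thus

$$\hat\beta_1 = \beta_1 + \frac{\bar v}{s_X^2} \overset{a}{\sim} N\left(\beta_1, \frac{1}{n} \frac{\mathrm{var}(v_i)}{[\mathrm{var}(X_i)]^2}\right)$$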

$\hat\beta_1$ has tractable asymptotic distribution

Thus, as the sample size gets large, the estimated slope coefficient has approximately a normal
distribution, such that

$$\hat\beta_1 \sim N\left(\beta_1, \sigma_{\hat\beta_1}^2\right)$$

where

$$\sigma_{\hat\beta_1} = \frac{1}{\sqrt{n}} \frac{\sqrt{\mathrm{var}(v_i)}}{\mathrm{var}(X_i)}, \qquad \sigma_{\hat\beta_1}^2 = \frac{1}{n} \frac{\mathrm{var}(v_i)}{[\mathrm{var}(X_i)]^2}$$

Remember: $E(\hat\beta_1) = \beta_1$. Thus, unbiased.
Remember: $\sigma_{\hat\beta_1}^2 \to 0$ as $n \to \infty$. Thus, consistent.

The term $\mathrm{var}(v_i) / [\mathrm{var}(X_i)]^2$ is like a noise to signal ratio.
The numerator is related to the variance of the residual.
The denominator is related to the variance of the explanatory variable.
The larger the variation in the explanatory variable relative to the residual, the more
accurate the OLS estimate.
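The large-sample variance formula can be compared with the spread of simulated estimates. This is a sketch under a hypothetical design: $X_i$ and $u_i$ are independent standard normal, so $\mathrm{var}(v_i) = \mathrm{var}(X_i)\,\mathrm{var}(u_i) = 1$, $\mathrm{var}(X_i) = 1$, and the formula gives $\sigma_{\hat\beta_1}^2 = 1/n$. The sample size and number of replications are arbitrary choices.

```python
import random


def ols_slope(x, y):
    """Closed-form OLS slope for a simple regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


def simulate_slope(n, rng, beta0=1.0, beta1=0.5):
    """Draw one sample of size n with E(u | X) = 0 and return the OLS slope."""
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [beta0 + beta1 * xi + rng.gauss(0.0, 1.0) for xi in x]
    return ols_slope(x, y)


rng = random.Random(5)       # fixed seed for reproducibility
n, reps = 200, 2_000         # hypothetical sample size and replication count

estimates = [simulate_slope(n, rng) for _ in range(reps)]
mean_b1 = sum(estimates) / reps
mc_var = sum((b - mean_b1) ** 2 for b in estimates) / (reps - 1)

# Asymptotic variance (1/n) * var(v) / var(X)^2 under this design:
# var(v) = var(X) * var(u) = 1 and var(X) = 1.
var_v, var_x = 1.0, 1.0
asym_var = (1.0 / n) * var_v / var_x ** 2

ratio = mc_var / asym_var
print(f"simulated variance = {mc_var:.5f}, asymptotic formula = {asym_var:.5f}")
print(f"ratio = {ratio:.2f}")
```

The ratio should be close to, but not exactly, one: the formula is a large-sample approximation and the simulated variance is itself an estimate.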

Virtues of asymptotic normality

Asymptotic normality of OLS coefficients allows us to
Do hypothesis tests
Calculate confidence intervals

Simple generalizations of the same techniques applied to the population mean.

Chapter 5! Next week.

Summary

Linear regression model
OLS estimators
$R^2$ and SER
OLS conditions
Consistency of $\hat\beta_1$
Unbiasedness of $\hat\beta_1$
Asymptotic distribution of $\hat\beta_1$

Even-numbered problems: 4.2, 4.4, 4.6, 4.10, 4.12, 4.14
Study STATA tutorial
Empirical exercises next week.