Dynamic Panel Data Methods Lecture II
Microeconometrics Lectures
Richard Blundell UCL and IFS March 2005
Dynamic Panel Data Methods
Background
The standard panel data model is

$$y_{it} = \beta_0 + \beta_1 x_{it1} + \beta_2 x_{it2} + \dots + \beta_k x_{itk} + \eta_i + v_{it} = x_{it}'\beta + \eta_i + v_{it},$$

where $i = 1, \dots, N$; $t = 1, \dots, T$, with $N$ large and $T$ small. The $\eta_i$ are the unobserved constant individual effects.

Often lagged values of $y$ are included in $x$.
An Example: Company Investment Rates
The panel data model is

$$\frac{I_{it}}{K_{it}} = \beta \frac{I_{it-1}}{K_{it-1}} + \eta_i + \lambda_t + v_{it}.$$

Unbalanced panel of company-level data, $T = 4$ to $10$, $N = 700$.
Example
| | OLS Levels | Within Groups | DIF 2SLS |
|---|---|---|---|
| $(I/K)_{it-1}$ | 0.2669 | 0.0094 | 0.1626 |
| | (.0185) | (.0181) | (.0362) |
| Instruments | | | $(I/K)_{it-2}$ |
STATA command for GMM: xtabond2

f.windmeijer@ifs.org.uk

On the CeMMAP website http://cemmap.ifs.org.uk/ (resources page), the Windmeijer course is available together with the computer exercises and some of the data sets.
Three common specifications to deal with $\eta_i$:
1. Random effects
2. Fixed effects
3. First Differences
In the model

$$y_{it} = x_{it}'\beta + u_{it}, \qquad u_{it} = \eta_i + v_{it},$$

we assume that

$$E(v_{it}) = 0; \quad E(x_{it} v_{it}) = 0.$$
The Random Effects specification further assumes that

$$E(\eta_i) = 0; \quad E(x_{it}\eta_i) = 0,$$

i.e. it assumes that the individual effect $\eta_i$ is uncorrelated with the regressors $x_{it}$. Therefore

$$E(y_{it} \mid x_{it}) = x_{it}'\beta + E(\eta_i \mid x_{it}) + E(v_{it} \mid x_{it}) = x_{it}'\beta$$

and therefore the simple OLS estimator on the pooled data is unbiased. However, it is not efficient, and the estimated standard errors are wrong, as they do not take account of the dependence of the error term within individuals over time.
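Let $u_{it} = \eta_i + v_{it}$ and assume independence of $v_{is}$ and $v_{it}$, $s \neq t$, and of $\eta_i$ and the $v_{it}$, then

$$E(u_{is} u_{it}) = E(\eta_i^2) = \sigma_\eta^2$$

and therefore the $u_{is}$ and $u_{it}$ are correlated. The within individual variance-covariance matrix is given by, with $u_i' = (u_{i1}, u_{i2}, \dots, u_{iT})$,

$$\Omega = E(u_i u_i') = \begin{bmatrix} \sigma_\eta^2 + \sigma_v^2 & \sigma_\eta^2 & \cdots & \sigma_\eta^2 \\ \sigma_\eta^2 & \sigma_\eta^2 + \sigma_v^2 & \cdots & \sigma_\eta^2 \\ \vdots & & \ddots & \vdots \\ \sigma_\eta^2 & \sigma_\eta^2 & \cdots & \sigma_\eta^2 + \sigma_v^2 \end{bmatrix}$$

The Random Effects estimator is

$$\hat\beta_{RE} = \left( \sum_{i=1}^{N} X_i' \Omega^{-1} X_i \right)^{-1} \sum_{i=1}^{N} X_i' \Omega^{-1} y_i.$$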
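As a rough illustration of the Random Effects formulas, the sketch below builds $\Omega$ for a balanced panel and applies the GLS expression for $\hat\beta_{RE}$. It assumes the variance components are known; in practice they have to be estimated first. The function names `re_covariance` and `re_gls` are hypothetical and not part of the lecture.

```python
import numpy as np

def re_covariance(T, sigma_eta2, sigma_v2):
    """T x T covariance of u_i with u_it = eta_i + v_it (balanced panel)."""
    # sigma_eta^2 everywhere, plus sigma_v^2 added on the diagonal
    return sigma_eta2 * np.ones((T, T)) + sigma_v2 * np.eye(T)

def re_gls(X_list, y_list, omega):
    """(sum_i X_i' Omega^-1 X_i)^-1 sum_i X_i' Omega^-1 y_i."""
    omega_inv = np.linalg.inv(omega)
    xox = sum(X.T @ omega_inv @ X for X in X_list)
    xoy = sum(X.T @ omega_inv @ y for X, y in zip(X_list, y_list))
    return np.linalg.solve(xox, xoy)

# Illustrative values only (not taken from the lecture)
omega = re_covariance(T=3, sigma_eta2=0.5, sigma_v2=1.0)
```

Each `X_list[i]` is the $T \times k$ regressor matrix of individual $i$ and each `y_list[i]` the matching length-$T$ outcome vector.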
Fixed Effects
The more likely and interesting case is when the unobserved individual effects are correlated with the regressors:
$$E(\eta_i \mid x_{it}) \neq 0.$$

Clearly, in this case OLS and the Random Effects estimator are biased and inconsistent as

$$E(y_{it} \mid x_{it}) = x_{it}'\beta + E(\eta_i \mid x_{it}) + E(v_{it} \mid x_{it}) = x_{it}'\beta + E(\eta_i \mid x_{it}) \neq x_{it}'\beta.$$
A solution is to estimate the model with a separate intercept for every individual by OLS. As

$$\eta_i = \bar y_i - \bar x_i'\beta - \bar v_i,$$

this is equivalent, for the $\beta$ parameters, to estimating the transformed, within group model by OLS:

$$y_{it} - \bar y_i = (x_{it} - \bar x_i)'\beta + (v_{it} - \bar v_i).$$

Therefore, for the fixed effects, or within group estimator, only the effects of variables that change over time can be estimated.

(OLS standard errors in this model are again wrong, as they ignore the fact that $N$ intercepts have been estimated.)
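The equivalence between OLS with individual intercepts and the within group transformation can be checked numerically. The sketch below uses simulated data with purely illustrative parameter values (none are from the lecture) and compares the two slope estimates.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, beta = 50, 4, 1.5                      # illustrative values only

eta = rng.normal(size=N)                     # individual effects
x = 0.8 * eta[:, None] + rng.normal(size=(N, T))   # x correlated with eta
y = beta * x + eta[:, None] + rng.normal(size=(N, T))

# (a) OLS with a separate intercept (dummy) for every individual
dummies = np.kron(np.eye(N), np.ones((T, 1)))       # (N*T) x N
X_lsdv = np.column_stack([x.reshape(-1), dummies])
beta_lsdv = np.linalg.lstsq(X_lsdv, y.reshape(-1), rcond=None)[0][0]

# (b) OLS on data in deviations from individual means (within groups)
x_w = x - x.mean(axis=1, keepdims=True)
y_w = y - y.mean(axis=1, keepdims=True)
beta_wg = (x_w * y_w).sum() / (x_w ** 2).sum()

print(beta_lsdv, beta_wg)   # the two estimates coincide up to rounding
```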
For the fixed effects estimator to be unbiased, one needs that the $x_{it}$ in all periods are uncorrelated with the $v_{is}$ in all periods:

$$E(x_{it} v_{is}) = 0; \quad s = 1, \dots, T, \; t = 1, \dots, T.$$

When $x_{it}$ satisfies this condition, it is called strictly exogenous.
Assuming strict exogeneity, the Hausman test can be used to test whether the unobserved heterogeneity is correlated with the regressors. When they are not correlated, the RE estimator is efficient. If they are correlated, the FE estimator is consistent, but the RE estimator is not.

$$H = (\hat\beta_{FE} - \hat\beta_{RE})' \left[ \mathrm{Var}(\hat\beta_{FE} - \hat\beta_{RE}) \right]^{-1} (\hat\beta_{FE} - \hat\beta_{RE})$$

If $H$ is large, RE is rejected in favour of FE. For large samples $H \sim \chi^2_k$, with $k$ the number of elements in $\beta$.
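A minimal sketch of the Hausman statistic follows. It assumes the standard result that, under the null where RE is efficient, $\mathrm{Var}(\hat\beta_{FE} - \hat\beta_{RE}) = \mathrm{Var}(\hat\beta_{FE}) - \mathrm{Var}(\hat\beta_{RE})$. The function name `hausman` and the input numbers are hypothetical.

```python
import numpy as np

def hausman(beta_fe, beta_re, var_fe, var_re):
    """Return (H, k): the Hausman statistic and its degrees of freedom."""
    d = np.asarray(beta_fe, float) - np.asarray(beta_re, float)
    # Under the null, Var(b_FE - b_RE) = Var(b_FE) - Var(b_RE)
    v = np.asarray(var_fe, float) - np.asarray(var_re, float)
    h = float(d @ np.linalg.solve(v, d))
    return h, d.size

# Made-up numbers, for illustration only
H, k = hausman([1.0], [0.5], [[0.05]], [[0.01]])
```

In finite samples the estimated variance difference need not be positive definite, in which case the statistic computed this way is not reliable.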
First Differencing
Again consider the model

$$y_{it} = x_{it}'\beta + u_{it}, \qquad u_{it} = \eta_i + v_{it},$$

where the unobserved individual effects $\eta_i$ are correlated with $x_{it}$. Taking first differences eliminates $\eta_i$:

$$y_{it} - y_{it-1} = (x_{it} - x_{it-1})'\beta + (u_{it} - u_{it-1}) = (x_{it} - x_{it-1})'\beta + (v_{it} - v_{it-1}),$$

and therefore OLS is unbiased if $(v_{it} - v_{it-1})$ and $(x_{it} - x_{it-1})$ are uncorrelated. This is a weaker assumption than the strict exogeneity assumption of the fixed effects estimator. Again the OLS estimated standard errors are wrong, as they do not take account of the correlation between $(v_{it} - v_{it-1})$ and $(v_{it-1} - v_{it-2})$:

$$E[(v_{it} - v_{it-1})(v_{it-1} - v_{it-2})] = -\sigma_v^2,$$

$$E(\Delta v_i \Delta v_i') = \sigma_v^2 \begin{bmatrix} 2 & -1 & & 0 \\ -1 & 2 & \ddots & \\ & \ddots & \ddots & -1 \\ 0 & & -1 & 2 \end{bmatrix}$$

(when the $v_{it}$ themselves are not correlated over time).
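The tridiagonal covariance structure of the differenced errors follows from writing $\Delta v_i = D v_i$, with $D$ the first-difference operator, so that $E(\Delta v_i \Delta v_i') = \sigma_v^2 D D'$ when the $v_{it}$ are uncorrelated with common variance. A short sketch, with the helper name `first_difference_matrix` and the values of `T` and `sigma_v2` chosen purely for illustration:

```python
import numpy as np

def first_difference_matrix(T):
    """(T-1) x T matrix D such that D @ v = (v_2 - v_1, ..., v_T - v_{T-1})."""
    return np.eye(T)[1:] - np.eye(T)[:-1]

T, sigma_v2 = 5, 1.0          # illustrative values
D = first_difference_matrix(T)
cov_dv = sigma_v2 * (D @ D.T)  # 2 on the diagonal, -1 next to it, 0 elsewhere
print(cov_dv)
```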
Endogenous Variables
Consider again the model in first differences

$$y_{it} - y_{it-1} = (x_{it} - x_{it-1})'\beta + (v_{it} - v_{it-1}).$$

$x_{it}$ is endogenous if it is correlated with $v_{it}$.

There can also be feedback from $v_{it-1}$ to $x_{it}$ such that $E(x_{it} v_{it-1}) \neq 0$. In this case we call $x_{it}$ predetermined or weakly exogenous.

In both cases $E[(x_{it} - x_{it-1})(v_{it} - v_{it-1})] \neq 0$ and OLS is biased.

If the $v_{it}$ are not correlated over time, lagged values of $x_{it}$ can be used as instruments for the endogenous differences, and the model can be estimated by the Instrumental Variables estimator.
If $x_{it}$ is endogenous, $E(x_{it} v_{it}) \neq 0$ and $E(x_{it-1} v_{it-1}) \neq 0$. Valid instruments are $x_{is}$, with $s = 1, \dots, t-2$, as $E(x_{it-2}\,\Delta v_{it}) = 0$.

If $x_{it}$ is predetermined, $E(x_{it} v_{it-1}) \neq 0$ but $E(x_{it-1} v_{it-1}) = 0$. Valid instruments therefore are $x_{is}$, with $s = 1, \dots, t-1$.
Treatment Effects in Panels
Suppose the model is:

$$y_{it} = \alpha d_{it} + x_{it}'\beta + \lambda_t + \eta_i + v_{it}$$

where $d_{it} = 1$ if the program impacts on group $i$ in period $t$.

Typically once the program is in place this dummy is set to unity for all remaining time periods.

If the time effects, the group effects and the $x$ are sufficient to render $d_{it}$ exogenous, then within groups (fixed effects) will be consistent for the ATT impact of the treatment.

In this case, if the treatment occurs at the same time for all groups that are treated, then diff-in-diff and within groups are identical estimators.
Dynamic Panel Data Models
A dynamic panel data model is specified as

$$y_{it} = \alpha y_{it-1} + x_{it}'\beta + \eta_i + v_{it}.$$

Consider a model without other explanatory variables:

$$y_{it} = \alpha y_{it-1} + \eta_i + v_{it}.$$

Clearly, $y_{it-1}$ $(= \alpha y_{it-2} + \eta_i + v_{it-1})$ is correlated with $\eta_i$.

The OLS estimator is biased upwards.

The Fixed Effects estimator is biased downwards (this bias gets smaller for larger $T$).
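The direction of the two biases can be seen in a small simulation. The sketch below is only an illustration under assumed values ($\alpha = 0.5$, unit variances, a stationary start), none of which come from the lecture.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, alpha = 2000, 6, 0.5            # illustrative values only

eta = rng.normal(size=N)
y = np.empty((N, T + 1))
# Assumed stationary start: y_i0 = eta_i/(1-alpha) + noise
y[:, 0] = eta / (1 - alpha) + rng.normal(size=N) / np.sqrt(1 - alpha**2)
for t in range(1, T + 1):
    y[:, t] = alpha * y[:, t - 1] + eta + rng.normal(size=N)

y_lag, y_cur = y[:, :-1], y[:, 1:]

# Pooled OLS in levels (with an intercept)
xl = y_lag.ravel() - y_lag.mean()
yl = y_cur.ravel() - y_cur.mean()
alpha_ols = (xl * yl).sum() / (xl ** 2).sum()

# Within groups: deviations from individual means
xw = y_lag - y_lag.mean(axis=1, keepdims=True)
yw = y_cur - y_cur.mean(axis=1, keepdims=True)
alpha_wg = (xw * yw).sum() / (xw ** 2).sum()

print(alpha_ols, alpha, alpha_wg)     # OLS above alpha, within groups below
```

Increasing `T` in this sketch moves the within groups estimate towards $\alpha$, while the OLS estimate stays too high.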
For the first differenced model

$$y_{it} - y_{it-1} = \alpha (y_{it-1} - y_{it-2}) + (v_{it} - v_{it-1}),$$

$y_{it-1}$ is of course correlated with $v_{it-1}$ ($y$ is predetermined), and the OLS estimator in the differenced model is severely downward biased.

Valid instruments for $(y_{it-1} - y_{it-2})$ are the lagged levels $y_{it-2}, y_{it-3}, \dots, y_{i1}$, as

$$E[y_{it-2}(v_{it} - v_{it-1})] = 0.$$

An Instrumental Variables estimator that uses this information optimally is the Generalised Method of Moments (GMM) estimator.
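Let $\Delta v_i$ be the vector of errors for individual $i$ in the first differenced equation:

$$\Delta v_i = \begin{bmatrix} v_{i3} - v_{i2} \\ v_{i4} - v_{i3} \\ \vdots \\ v_{iT} - v_{iT-1} \end{bmatrix} = \begin{bmatrix} \Delta y_{i3} - \alpha \Delta y_{i2} \\ \Delta y_{i4} - \alpha \Delta y_{i3} \\ \vdots \\ \Delta y_{iT} - \alpha \Delta y_{iT-1} \end{bmatrix}$$

and let $Z_i$ be the matrix of instruments for individual $i$

$$Z_i = \begin{bmatrix} y_{i1} & 0 & 0 & \cdots & 0 & \cdots & 0 \\ 0 & y_{i1} & y_{i2} & \cdots & 0 & \cdots & 0 \\ \vdots & & & \ddots & & & \vdots \\ 0 & 0 & 0 & \cdots & y_{i1} & \cdots & y_{iT-2} \end{bmatrix}$$

Then

$$E(Z_i' \Delta v_i) = 0,$$

a total of $(T-1)(T-2)/2$ moment conditions.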
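The block structure of $Z_i$ is easy to get wrong by hand, so a small constructor may help. The sketch below builds $Z_i$ from the levels $(y_{i1}, \dots, y_{iT})$; the function name `instrument_matrix` is hypothetical.

```python
import numpy as np

def instrument_matrix(y_i):
    """Z_i for the differenced AR(1) model, given y_i = (y_i1, ..., y_iT).

    Row r is the equation for period t = r + 3 and holds the
    instruments y_i1, ..., y_i,t-2 in its own block of columns.
    """
    y_i = np.asarray(y_i, dtype=float)
    T = y_i.size
    Z = np.zeros((T - 2, (T - 1) * (T - 2) // 2))
    col = 0
    for row in range(T - 2):
        n_inst = row + 1                   # number of lagged levels available
        Z[row, col:col + n_inst] = y_i[:n_inst]
        col += n_inst
    return Z

Z_example = instrument_matrix([1.0, 2.0, 3.0, 4.0])   # T = 4
print(Z_example)
```

For $T = 4$ the matrix has two rows (the equations for $t = 3$ and $t = 4$) and $(T-1)(T-2)/2 = 3$ columns.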
The GMM estimator uses these moment conditions to estimate the parameters consistently and efficiently in two steps. The one-step estimator minimises

$$J_N = \left( \frac{1}{N} \sum_{i=1}^{N} Z_i' \Delta v_i \right)' W_N \left( \frac{1}{N} \sum_{i=1}^{N} Z_i' \Delta v_i \right)$$

where $W_N$ is a weight matrix.

Choosing

$$W_N = \left( \frac{1}{N} \sum_{i=1}^{N} Z_i' Z_i \right)^{-1}$$

results in the Two-Stage Least Squares estimator.
The one-step GMM estimator uses as the weight matrix

$$W_{N1} = \left( \frac{1}{N} \sum_{i=1}^{N} Z_i' A_N Z_i \right)^{-1}, \qquad A_N = \begin{bmatrix} 2 & -1 & 0 & \cdots & 0 \\ -1 & 2 & -1 & & \vdots \\ 0 & \ddots & \ddots & \ddots & 0 \\ \vdots & & -1 & 2 & -1 \\ 0 & \cdots & 0 & -1 & 2 \end{bmatrix}$$

and is efficient when the errors are homoscedastic and not correlated over time. This is often too restrictive. However, the one-step results are consistent, and robust standard errors that adjust for heteroscedasticity and autocorrelation are easily obtained.
The two-step estimator is efficient under more general conditions, like heteroscedasticity. The efficient weight matrix is computed as

$$W_{N2} = \left( \frac{1}{N} \sum_{i=1}^{N} Z_i' \Delta\hat v_i \Delta\hat v_i' Z_i \right)^{-1}, \qquad \Delta\hat v_i = \Delta y_i - \hat\alpha_1 \Delta y_{i,-1},$$

where $\hat\alpha_1$ is the one-step GMM estimator. A problem is that in small samples (small number of individuals) the estimated standard errors of the two-step GMM estimator tend to be too small.
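The one-step and two-step estimators for the AR(1) model without other regressors can be sketched in a few lines. Because the moment vector is linear in $\alpha$, minimising $J_N$ gives the closed form $\hat\alpha = (g_x' W g_y)/(g_x' W g_x)$, with $g_x$ and $g_y$ the sample averages of $Z_i'\Delta y_{i,-1}$ and $Z_i'\Delta y_i$. Everything below is an illustrative sketch on simulated data with assumed parameter values. The function names are hypothetical, and no standard errors or finite-sample corrections are computed.

```python
import numpy as np

def instrument_matrix(y_i):
    """Row r (equation for t = r + 3) holds the levels y_i1, ..., y_i,t-2."""
    T = y_i.size
    Z = np.zeros((T - 2, (T - 1) * (T - 2) // 2))
    col = 0
    for row in range(T - 2):
        Z[row, col:col + row + 1] = y_i[:row + 1]
        col += row + 1
    return Z

def difference_gmm(y):
    """One- and two-step GMM estimates of alpha from an N x T array of levels."""
    N, T = y.shape
    dy = np.diff(y, axis=1)                  # columns: dy_i2, ..., dy_iT
    dy_dep, dy_lag = dy[:, 1:], dy[:, :-1]   # dy_it and dy_it-1 for t = 3..T
    Z = [instrument_matrix(y[i]) for i in range(N)]
    g_x = sum(Z[i].T @ dy_lag[i] for i in range(N)) / N
    g_y = sum(Z[i].T @ dy_dep[i] for i in range(N)) / N

    def estimate(W):                         # minimiser of J_N for a given W
        return float(g_x @ W @ g_y) / float(g_x @ W @ g_x)

    # One-step weight matrix based on the tridiagonal matrix A_N
    A = 2 * np.eye(T - 2) - np.eye(T - 2, k=1) - np.eye(T - 2, k=-1)
    W1 = np.linalg.inv(sum(Zi.T @ A @ Zi for Zi in Z) / N)
    alpha_1 = estimate(W1)

    # Two-step weight matrix from the one-step residuals
    resid = dy_dep - alpha_1 * dy_lag
    m = np.array([Z[i].T @ resid[i] for i in range(N)])   # N x q moments
    W2 = np.linalg.inv(m.T @ m / N)
    return alpha_1, estimate(W2)

# Simulated data; all values are assumptions for illustration
rng = np.random.default_rng(2)
N, T, alpha = 1000, 6, 0.5
eta = rng.normal(size=N)
y = np.empty((N, T))
y[:, 0] = eta / (1 - alpha) + rng.normal(size=N) / np.sqrt(1 - alpha**2)
for t in range(1, T):
    y[:, t] = alpha * y[:, t - 1] + eta + rng.normal(size=N)

alpha_1, alpha_2 = difference_gmm(y)
print(alpha_1, alpha_2)
```

For applied work a tested implementation such as the Stata command xtabond2 is preferable to a hand-rolled version like this one.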
Sargan test for overidentifying restrictions:

The null hypothesis for this test is that the instruments are valid in the sense that they are not correlated with the errors in the first-differenced equation. It is computed as

$$S = N J_N(\hat\alpha_2) = N \left( \frac{1}{N} \sum_{i=1}^{N} Z_i' \Delta\hat v_{i2} \right)' W_{N2} \left( \frac{1}{N} \sum_{i=1}^{N} Z_i' \Delta\hat v_{i2} \right),$$

where $\Delta\hat v_{i2}$ are the two-step residuals. Under the null, this test statistic has a $\chi^2_q$ distribution, with $q$ equal to the total number of instruments minus the number of parameters in the model.

Only use the two-step result for the Sargan test.

Note: also test for serial correlation in the errors.
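Given the instrument matrices, the two-step residuals and the two-step weight matrix, the Sargan statistic is a single quadratic form. The sketch below assumes these inputs have already been computed elsewhere; the function names and the toy numbers are hypothetical.

```python
import numpy as np

def sargan_statistic(Z_list, resid_list, W):
    """S = N * g' W g, with g the sample average of Z_i' (two-step residuals)."""
    N = len(Z_list)
    g = sum(Z.T @ r for Z, r in zip(Z_list, resid_list)) / N
    return float(N * (g @ W @ g))

def sargan_df(n_instruments, n_parameters):
    """Degrees of freedom q of the chi-squared reference distribution."""
    return n_instruments - n_parameters

# Toy inputs, only to show the shapes (one equation, one instrument)
Z_toy = [np.array([[1.0]]), np.array([[1.0]])]
r_toy = [np.array([1.0]), np.array([3.0])]
S_toy = sargan_statistic(Z_toy, r_toy, np.array([[0.2]]))

# AR(1) model with T = 4: (T-1)(T-2)/2 = 3 instruments, 1 parameter
q = sargan_df(3, 1)
```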
An Example: Investment Rates across Firms
The estimated model is

$$\frac{I_{it}}{K_{it}} = \lambda_t + \alpha \frac{I_{it-1}}{K_{it-1}} + \eta_i + v_{it}$$

and results are presented in Table 1 for OLS, within groups, just-identified Two-Stage Least Squares for a differenced model, with $(I/K)_{it-2}$ as an instrument for $\Delta (I/K)_{it-1}$, and two GMM estimates for $\alpha$ in the differenced model, one using $(I/K)_{it-2}$ and $(I/K)_{it-3}$, the other using $(I/K)_{it-2}, \dots, (I/K)_{i1}$ as instruments.
| | OLS Levels | Within Groups | 2SLS DIF | GMM1 DIF | GMM1 DIF |
|---|---|---|---|---|---|
| $(I/K)_{it-1}$ | 0.2669 | 0.0094 | 0.1626 | 0.1593 | 0.1560 |
| | (.0185) | (.0181) | (.0362) | (.0327) | (.0318) |
| m1 | 4.71 | 11.36 | 10.56 | 10.91 | 11.12 |
| m2 | 2.52 | 2.02 | 0.61 | 0.52 | 0.46 |
| Sargan (p) | | | | 0.36 | 0.43 |
| Instruments | | | $(I/K)_{it-2}$ | $(I/K)_{it-2}$, $(I/K)_{it-3}$ | $(I/K)_{it-2}, \dots, (I/K)_{i1}$ |
Exogeneity/Endogeneity of additional regressors and instrument set
Consider again the dynamic model with one other explanatory variable:

$$y_{it} = \alpha y_{it-1} + \beta x_{it} + \eta_i + v_{it}$$

and the model in first differences:

$$\Delta y_{it} = \alpha \Delta y_{it-1} + \beta \Delta x_{it} + \Delta v_{it}.$$

Consider the case with $T = 4$. When $x$ is strictly exogenous w.r.t. $v$, the instruments are

$$Z_i = \begin{bmatrix} y_{i1}, x_{i1}, \dots, x_{i4} & 0 \\ 0 & y_{i1}, y_{i2}, x_{i1}, \dots, x_{i4} \end{bmatrix}.$$

When $x$ is predetermined

$$Z_i = \begin{bmatrix} y_{i1}, x_{i1}, x_{i2} & 0 \\ 0 & y_{i1}, y_{i2}, x_{i1}, x_{i2}, x_{i3} \end{bmatrix}.$$

And when $x$ is endogenous

$$Z_i = \begin{bmatrix} y_{i1}, x_{i1} & 0 \\ 0 & y_{i1}, y_{i2}, x_{i1}, x_{i2} \end{bmatrix}.$$
An example and finite sample inference
Arellano and Bond (1991) estimate dynamic employment equations using a sample of 140 UK quoted firms over the years 1976-1984. One model was specified as

$$n_{it} = \alpha_1 n_{it-1} + \alpha_2 n_{it-2} + \beta w_{it} + \beta_1 w_{it-1} + \gamma k_{it} + \delta\, ys_{it} + \delta_1\, ys_{it-1} + \lambda_t + \eta_i + u_{it}$$

where $n_{it}$ is the logarithm of UK employment in company $i$ at the end of period $t$, $w_{it}$ is the log of the real product wage, $k_{it}$ is the log of gross capital and $ys_{it}$ is the log of industry output.

The table presents estimation results for the one- and two-step GMM estimators.
| | One-Step coeff | One-Step std err | Two-Step coeff | Two-Step std err | Two-Step std errc |
|---|---|---|---|---|---|
| $n_{it-1}$ | .535 | .166 | .474 | .085 | .185 |
| $n_{it-2}$ | .075 | .068 | .052 | .027 | .052 |
| $w_{it}$ | .592 | .168 | .513 | .049 | .146 |
| $w_{it-1}$ | .292 | .142 | .225 | .080 | .142 |
| $k_{it}$ | .359 | .054 | .293 | .040 | .063 |
| $ys_{it}$ | .597 | .172 | .610 | .109 | .156 |
| $ys_{it-1}$ | .612 | .212 | .446 | .125 | .217 |
| $m_1$ | | 2.493 | | 2.826 | 1.999 |
| $m_2$ | | 0.359 | | 0.327 | 0.316 |
| Wald | | 219.6 | | 372.0 | 142.0 |
Another test statistic with reasonable finite sample properties is the difference between the Sargan test statistics in the models with and without the restriction imposed. Imposing $\alpha_2 = 0$ (keeping time periods and instruments the same) results in a Sargan test of 30.58. The difference between the Sargan tests is therefore 0.47, which is much smaller than the 5% critical value (3.84) of the chi-squared distribution with one degree of freedom. $H_0: \alpha_2 = 0$ is therefore not rejected.
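The arithmetic of the Sargan difference test can be reproduced with the standard library alone, using the identity $P(\chi^2_1 > x) = \mathrm{erfc}(\sqrt{x/2})$. The numbers 30.58 and 0.47 are taken from the text; the unrestricted statistic is only implied by them, and the function name is hypothetical.

```python
import math

def chi2_1df_pvalue(x):
    """Upper tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

sargan_restricted = 30.58      # with alpha_2 = 0 imposed (from the text)
difference = 0.47              # difference between the two Sargan statistics
sargan_unrestricted = sargan_restricted - difference   # implied value

p_value = chi2_1df_pvalue(difference)
critical_5pct = 3.841          # 5% critical value of chi-squared(1)

reject = difference > critical_5pct
print(sargan_unrestricted, p_value, reject)
```

The p-value is far above 0.05, in line with the conclusion that $H_0: \alpha_2 = 0$ is not rejected.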
Weak Instruments and Dynamic Panels
Remember that instruments have to satisfy that
1. They are not correlated with the error term in the equation of interest.
2. They are correlated with the endogenous explanatory variable.

Whether the instruments are correlated with the error term is tested by means of the Sargan test. If the Sargan test rejects the null of no correlation, the IV estimator is biased and inconsistent.

However, even if the instruments are not correlated with the error term, a serious small sample bias can occur if they are only weakly correlated with the endogenous explanatory variable.
For the dynamic panel data model in first differences

$$\Delta y_{it} = \alpha \Delta y_{it-1} + \Delta v_{it},$$

the lagged levels $y_{it-2}, \dots, y_{i1}$ become less informative as instruments for $\Delta y_{it-1}$ as $\alpha$ increases. (For the extreme unit root case, $y_{it} = y_{it-1} + v_{it}$, $\alpha$ is not identified in the first differenced GMM model.)

The weak instrument bias tends to go in the direction of the within groups bias (i.e. downward).

This occurs for any highly persistent endogenous r.h.s. variable (capital etc.).
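How quickly the lagged levels lose explanatory power can be made concrete. Under the added assumptions of mean and covariance stationarity and homoscedastic $v_{it}$, the population coefficient from regressing $\Delta y_{it-1}$ on $y_{it-2}$ is $\pi = (\alpha - 1)k / (\sigma_\eta^2/\sigma_v^2 + k)$ with $k = (1-\alpha)/(1+\alpha)$, the expression discussed in Blundell and Bond (1998). The sketch below evaluates it; the function name and the default variances are illustrative assumptions.

```python
def first_stage_coefficient(alpha, sigma_eta2=1.0, sigma_v2=1.0):
    """Population slope of dy_{it-1} on y_{it-2} under stationarity (assumed)."""
    k = (1.0 - alpha) / (1.0 + alpha)
    return (alpha - 1.0) * k / (sigma_eta2 / sigma_v2 + k)

for a in (0.0, 0.5, 0.8, 0.95, 1.0):
    # The coefficient shrinks towards zero as alpha approaches one
    print(a, first_stage_coefficient(a))
```

At $\alpha = 0.5$ with equal variances the coefficient is $-0.125$; at $\alpha = 0.95$ it is roughly a hundred times smaller in absolute value, and it is exactly zero in the unit root case.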
For the model in levels there are $T - 2$ additional moment conditions (additional to the moment conditions for the model in first differences):

$$E(u_{it}\, \Delta y_{it-1}) = E[(\eta_i + v_{it})\, \Delta y_{it-1}] = E[(y_{it} - \alpha y_{it-1})\, \Delta y_{it-1}] = 0.$$

These additional moment conditions are available if the initial conditions satisfy

$$E(\eta_i\, \Delta y_{i2}) = 0,$$

which holds when the process is mean stationary:

$$y_{i1} = \frac{\eta_i}{1 - \alpha} + \varepsilon_{i1}, \qquad E(\varepsilon_{i1}) = E(\eta_i \varepsilon_{i1}) = 0.$$
The GMM estimator that combines the moment conditions for the differenced model with those for the levels model is called the SYSTEM estimator (Blundell and Bond (1998)) and has been shown to perform much better (less bias and more precision), especially when $\alpha$ is large, i.e. when the series are persistent. This is due to the fact that $\Delta y_{it-1}$ is a good instrument for $y_{it-1}$: it explains $y_{it-1}$ well, irrespective of the value of $\alpha$. Whether the additional moment conditions are valid has of course to be tested, using the Sargan test.
The model is

$$y_{it} = \alpha y_{it-1} + \beta x_{it} + \eta_i + v_{it}$$

$$x_{it} = \rho x_{it-1} + \tau \eta_i + \theta v_{it} + e_{it}$$

$T = 8$, $N = 500$, $\beta = 1$, $\tau = 0.25$, $\theta = -0.1$, $\sigma_\eta^2 = 1$, $\sigma_v^2 = 1$, $\sigma_e^2 = 0.16$ (Normal), 10,000 replications
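$\rho = 0.5$

| | | OLS Mean | OLS St D | WG Mean | WG St D | DIF Mean | DIF St D | SYS Mean | SYS St D |
|---|---|---|---|---|---|---|---|---|---|
| | $\rho$ | 0.762 | .012 | 0.265 | .018 | 0.494 | .034 | 0.501 | .024 |
| $\alpha = 0.5$ | $\alpha$ | 0.820 | .007 | 0.311 | .017 | 0.480 | .040 | 0.511 | .027 |
| | $\beta$ | 0.775 | .034 | 0.490 | .045 | 0.930 | .136 | 0.997 | .124 |
| $\alpha = 0.95$ | $\alpha$ | 0.990 | .001 | 0.662 | .016 | 0.548 | .177 | 0.979 | .011 |
| | $\beta$ | 0.581 | .035 | 0.388 | .044 | 0.226 | .356 | 0.983 | .101 |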
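$\rho = 0.95$

| | | OLS Mean | OLS St D | WG Mean | WG St D | DIF Mean | DIF St D | SYS Mean | SYS St D |
|---|---|---|---|---|---|---|---|---|---|
| | $\rho$ | 0.997 | .001 | 0.591 | .017 | 0.676 | .222 | 0.958 | .031 |
| $\alpha = 0.5$ | $\alpha$ | 0.650 | .009 | 0.396 | .015 | 0.480 | .033 | 0.518 | .021 |
| | $\beta$ | 0.830 | .022 | 0.796 | .040 | 0.800 | .290 | 1.075 | .059 |
| $\alpha = 0.95$ | $\alpha$ | 0.962 | .001 | 0.882 | .009 | 0.927 | .025 | 0.957 | .003 |
| | $\beta$ | 0.902 | .017 | 0.745 | .040 | 0.615 | .400 | 1.019 | .031 |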
An Example: Company Capital Stock
The estimated model is

$$k_{it} = \lambda_t + \alpha k_{it-1} + \eta_i + v_{it}$$

| | OLS Levels | Within Groups | GMM1 DIF (t − 3) | GMM1 SYS (t − 3) |
|---|---|---|---|---|
| $k_{it-1}$ | 0.987 | 0.733 | 0.768 | 0.925 |
| | (.002) | (.027) | (.070) | (.021) |
| m1 | 7.72 | 6.82 | 5.80 | 6.51 |
| m2 | 2.29 | 1.73 | 1.73 | 1.81 |
| Sargan (p) | | | .563 | 0.627 |
| Dif-Sar | | | | 0.562 |
Count Data Models
Often the dependent variable is an integer valued nonnegative count variable, like the number of visits to the doctor, the number of patents granted or the average daily number of cigarettes smoked. A standard model for analysing such data is the Poisson regression model. The Poisson density for a count variable $y_i$ given $x_i$ is

$$f(y_i \mid x_i) = \frac{e^{-\mu_i} \mu_i^{y_i}}{y_i!}$$

where

$$\mu_i = E(y_i \mid x_i) = \exp(x_i'\beta)$$

is the conditional mean of $y_i$ given $x_i$, which is positive.
As $\ln(\mu_i) = x_i'\beta$, the model is often called a log-linear model. The Poisson distribution has the property that the conditional variance is equal to the conditional mean (equidispersion):

$$\mathrm{Var}(y_i \mid x_i) = E(y_i \mid x_i) = \exp(x_i'\beta).$$

Consider the regression model

$$y_i = \exp(x_i'\beta) + u_i$$

with $E(u_i \mid x_i) = 0$, from which it follows that $E(x_i u_i) = 0$. As long as these conditions are valid in the population, the Poisson estimator for $\beta$ is consistent, even if the true distribution is not Poisson.
Parameter Interpretation
The partial effects are given by

$$\frac{\partial E(y \mid x)}{\partial x_j} = \beta_j \exp(x'\beta).$$

Further,

$$\beta_j = \frac{\partial E(y \mid x)}{\partial x_j} \frac{1}{E(y \mid x)}$$

and so $\beta_j$ is a semi-elasticity: it equals the proportionate change in the conditional mean if the $j$th regressor changes by one unit. If $x_j$ is replaced by $\ln(x_j)$, $\beta_j$ is the elasticity of $E(y \mid x)$ with respect to $x_j$.
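A quick numerical check of the partial effect and the semi-elasticity interpretation may be useful. The coefficient and regressor values below are made up for illustration. The derivative is approximated by a finite difference, and for a discrete one-unit change the exact proportionate effect on the mean is $\exp(\beta_j) - 1$, which is close to $\beta_j$ when $\beta_j$ is small.

```python
import math

def conditional_mean(x, beta):
    """E(y | x) = exp(x'beta) for the log-linear model."""
    return math.exp(sum(xj * bj for xj, bj in zip(x, beta)))

beta = [0.2, -0.5]            # hypothetical coefficients
x = [1.0, 2.0]                # hypothetical regressor values

# Partial effect of the first regressor: finite difference vs. formula
h = 1e-6
numeric = (conditional_mean([x[0] + h, x[1]], beta) - conditional_mean(x, beta)) / h
analytic = beta[0] * conditional_mean(x, beta)

# A one-unit change multiplies the conditional mean by exp(beta_j)
ratio = conditional_mean([x[0] + 1.0, x[1]], beta) / conditional_mean(x, beta)

print(numeric, analytic, ratio - 1.0, beta[0])
```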
Overdispersion
In many applications there is overdispersion, i.e. the conditional variance is larger than the conditional mean (and sometimes there is underdispersion). The Poisson maximum likelihood estimated standard errors are then wrong, but they can easily be corrected by using robust standard errors that allow for general heteroskedasticity.
Overdispersion can be introduced directly into the model by introducing an unobserved heterogeneity term, $\eta_i$. Conditional on $x_i$ and $\eta_i$, the $y_i$ are Poisson distributed with

$$E(y_i \mid x_i, \eta_i) = \exp(x_i'\beta + \eta_i) = \exp(x_i'\beta)\, \varepsilon_i.$$

If $\varepsilon_i$ is independent of $x_i$ and has a gamma distribution with $E(\varepsilon_i) = 1$ and $\mathrm{Var}(\varepsilon_i) = \delta^2$, then the conditional distribution of $y_i$ given $x_i$ is negative binomial with

$E(y_i \mid x_i)$ …