
MGTECON 640: Section 8

Marco Giacoletti
11/22/2013
Simulate Data
We simulate $(x_i, e_i)$ with $i = 1, \ldots, N$ and $N = 5000$, so that $x_i$ is normally distributed with mean 1 and $e_i$ is normally distributed with mean 0. Their variance-covariance matrix is:

$$\Sigma = \begin{pmatrix} 1 & 0 \\ 0 & 5 \end{pmatrix}$$

We then define:

$$y_i = x_i + x_i^2 + e_i$$
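For illustration, a minimal MATLAB sketch of this step (it mirrors the Code section at the end; note that normrnd takes a standard deviation, not a variance, as its second argument):

% simulate N = 5000 draws of (x_i, e_i) and construct y_i
N = 5000;
X = normrnd(1,1,N,1);   % x_i, mean 1, standard deviation 1
e = normrnd(0,5,N,1);   % e_i, mean 0, scale 5 as in the Code section
Y = X + X.^2 + e;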

Problem 1
Estimate the conditional expectation of $y_i$ given $x_i$ using the Epanechnikov kernel. First plot the cross-validation criterion and select the optimal bandwidth.
Solutions:
The conditional expectation at a point $x$ can be estimated as:

$$\hat{g}_h(x) = \frac{\sum_{i=1}^N Y_i \, K\left(\frac{X_i - x}{h}\right)}{\sum_{j=1}^N K\left(\frac{X_j - x}{h}\right)}$$
The kernel function is the Epanechnikov kernel:

$$K(z) = \frac{3}{4\sqrt{5}}\left(1 - \frac{z^2}{5}\right)\mathbf{1}\{|z| < \sqrt{5}\}$$
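As a sketch, the estimator at a single point can be written with anonymous functions (hypothetical names; the course implementation is CrossVal_kern in the Code section):

% Epanechnikov kernel rescaled to have support [-sqrt(5), sqrt(5)]
K = @(z) (3/(4*sqrt(5)))*(1 - z.^2/5).*(abs(z) < sqrt(5));
% Nadaraya-Watson estimator of E[Y|X = x0] with bandwidth h
nw = @(x0,X,Y,h) sum(Y.*K((X - x0)/h)) / sum(K((X - x0)/h));
% example: g_hat_at_1 = nw(1, X, Y, 0.17);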
In order to evaluate the Cross-Validation criterion we first calculate the kernel estimator based on leaving out the $i$-th observation:

$$\hat{g}_{h,(i)}(x) = \frac{\sum_{j=1, j \neq i}^N Y_j \, K\left(\frac{X_j - x}{h}\right)}{\sum_{k=1, k \neq i}^N K\left(\frac{X_k - x}{h}\right)}$$
We can then evaluate the Cross-Validation criterion as:

$$CV(h) = \sum_{i=1}^N \left(\hat{g}_{h,(i)}(X_i) - Y_i\right)^2$$
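A minimal sketch of the criterion as a standalone function (hypothetical cv_kern_sketch.m, assuming the rescaled Epanechnikov kernel above; the course version is CrossVal_kern in the Code section):

function Q = cv_kern_sketch(X,Y,h)
% leave-one-out cross-validation criterion for the kernel estimator
K = @(z) (3/(4*sqrt(5)))*(1 - z.^2/5).*(abs(z) < sqrt(5));
N = size(X,1);
g_loo = NaN(N,1);
for i = 1:N
    idx = [1:i-1, i+1:N];              % drop the i-th observation
    w = K((X(idx) - X(i))/h);          % kernel weights at X(i)
    g_loo(i) = sum(Y(idx).*w)/sum(w);  % leave-one-out estimate at X(i)
end
Q = sum((g_loo - Y).^2);
end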

The following figure reports the criterion. We focus on values of the bandwidth in
the interval between 0.01 and 1.01 and build an equally spaced grid where the distance
between two consecutive points is 0.01.
[Figure: cross-validation criterion $CV(h)$ (vertical scale $\times 10^5$) plotted against the bandwidth $h$.]

We can then select the optimal bandwidth as:

$$h_{opt} = \arg\min_h CV(h)$$

which in this case is $h_{opt} = 0.17$.
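With a helper like the hypothetical cv_kern_sketch above, the grid search reduces to a few lines (the grid matches the one used in the Code section):

h_grid = 0.01:0.01:1.01;
CV_grid = arrayfun(@(h) cv_kern_sketch(X,Y,h), h_grid);
[~, j_opt] = min(CV_grid);
h_opt = h_grid(j_opt);   % 0.17 for this simulated sample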

Problem 2
Plot the regression function with 95% confidence bands.
Solutions:
We can now calculate the conditional expectation function using the optimal bandwidth:

$$\hat{g}_{h_{opt}}(x) = \frac{\sum_{i=1}^N Y_i \, K\left(\frac{X_i - x}{h_{opt}}\right)}{\sum_{j=1}^N K\left(\frac{X_j - x}{h_{opt}}\right)}$$
With asymptotic distribution:

$$\sqrt{N h_{opt}} \left( \hat{g}_{h_{opt}}(x) - g(x) \right) \xrightarrow{d} N\left( 0, \frac{\sigma^2(x)}{f(x)} \int K(u)^2 \, du \right)$$

where we are ignoring the bias term. There are several elements composing the asymptotic variance. The first one is the scaling factor:

$$\int K(u)^2 \, du = \int \left( \frac{3}{4\sqrt{5}} \right)^2 \left( 1 - \frac{u^2}{5} \right)^2 \mathbf{1}\{|u| < \sqrt{5}\} \, du = \frac{9}{80} \int_{-\sqrt{5}}^{\sqrt{5}} \left( 1 - \frac{2}{5} u^2 + \frac{1}{25} u^4 \right) du =$$

$$= \frac{9}{80} \left[ u - \frac{2}{15} u^3 + \frac{1}{125} u^5 \right]_{-\sqrt{5}}^{\sqrt{5}} = 2 \cdot \frac{9}{80} \left( \sqrt{5} - \frac{2}{3} \sqrt{5} + \frac{1}{5} \sqrt{5} \right) = \frac{9}{40} \sqrt{5} \cdot \frac{8}{15} = \frac{3}{5\sqrt{5}}$$
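As a sanity check, the closed-form value $3/(5\sqrt{5}) \approx 0.2683$ can be verified numerically with MATLAB's integral:

% numerical check of the integral of the squared Epanechnikov kernel
K = @(u) (3/(4*sqrt(5)))*(1 - u.^2/5).*(abs(u) < sqrt(5));
K2 = integral(@(u) K(u).^2, -sqrt(5), sqrt(5));   % approx 0.2683 = 3/(5*sqrt(5))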

We can then get estimates for:

$$\hat{\sigma}^2(x) = \frac{\sum_{i=1}^N \left( Y_i - \hat{g}_{h_{opt}}(x) \right)^2 K\left( \frac{X_i - x}{h_{opt}} \right)}{\sum_{j=1}^N K\left( \frac{X_j - x}{h_{opt}} \right)}$$

$$\hat{f}(x) = \frac{1}{N h_{opt}} \sum_{i=1}^N K\left( \frac{X_i - x}{h_{opt}} \right)$$

We can then calculate the estimated asymptotic variance:

$$\hat{V}(x) = \frac{\hat{\sigma}^2(x)}{\hat{f}(x)} \cdot \frac{3}{5\sqrt{5}}$$

Finally, we can get $(1-\alpha) \cdot 100\%$ confidence intervals as:

$$\left[ \hat{g}(X_i) + z_{\alpha/2} \sqrt{\frac{\hat{V}(X_i)}{N h_{opt}}}, \; \hat{g}(X_i) + z_{1-\alpha/2} \sqrt{\frac{\hat{V}(X_i)}{N h_{opt}}} \right]$$

where $z_p$ denotes the $p$-quantile of the standard normal distribution.
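A minimal sketch of these components at a single evaluation point x0, assuming the kernel K, the simulated data, and h_opt from above:

x0 = 1;                                 % example evaluation point
N  = size(X,1);
w  = K((X - x0)/h_opt);                 % kernel weights at x0
g0 = sum(Y.*w)/sum(w);                  % g_hat(x0)
s2 = sum(((Y - g0).^2).*w)/sum(w);      % sigma^2_hat(x0)
f0 = sum(w)/(N*h_opt);                  % f_hat(x0)
V0 = (s2/f0)*3/(5*sqrt(5));             % V_hat(x0)
se = sqrt(V0/(N*h_opt));                % standard error of g_hat(x0)
CI = [g0 + norminv(0.025)*se, g0 + norminv(0.975)*se];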
In the following figure we plot the conditional expectation function for each value
of Xi along with the 95% confidence bands.
[Figure: kernel estimate of the conditional expectation ("Expectation") for each $X_i$, with 95% confidence bands ("95% CI").]

Problem 3
Estimate the regression function using polynomial series. Calculate the cross-validation criterion for $K = 0, 1, \ldots, 10$, and select the optimal number of terms in the series.
Solutions:
We now turn to polynomial series; we define the conditional expectation as a regression function:

$$g(x; K) = \sum_{k=0}^K \theta_k x^k$$

We again use cross-validation to select the optimal value of $K$; the criterion is:

$$CV(K) = \sum_{i=1}^N \left( \hat{g}_{(i)}(X_i) - Y_i \right)^2$$

with:

$$\hat{g}_{(i)}(x) = \sum_{k=0}^K \hat{\theta}_{k,(i)} x^k$$

where the coefficients $\hat{\theta}_{k,(i)}$ are estimated by OLS leaving out the $i$-th observation.
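A minimal sketch (hypothetical cv_poly_sketch.m; the design-matrix line uses implicit expansion, R2016b or later, and the course version is CrossVal_pol in the Code section):

function Q = cv_poly_sketch(X,Y,K)
% leave-one-out cross-validation criterion for a degree-K polynomial fit
N  = size(X,1);
XX = X.^(0:K);                 % N x (K+1) design matrix: 1, x, ..., x^K
g_loo = NaN(N,1);
for i = 1:N
    idx = [1:i-1, i+1:N];      % drop the i-th observation
    th  = XX(idx,:)\Y(idx);    % OLS coefficients without observation i
    g_loo(i) = XX(i,:)*th;     % leave-one-out prediction at X(i)
end
Q = sum((g_loo - Y).^2);
end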

We evaluate the criterion for values of $K$ ranging between 0 and 10; as we can see in the following figure, the criterion is minimized at $K = 3$, which corresponds to a quadratic function (as expected).
[Figure: cross-validation criterion $CV(K)$ (vertical scale $\times 10^5$) plotted against $K$.]

Problem 4
Plot the regression function with 95% confidence bands.
Solutions:
The model selected in the previous step is:

$$Y_i = g(X_i) + e_i = \theta_0 + \theta_1 X_i + \theta_2 X_i^2 + e_i = \bar{X}_i \theta + e_i$$

where $\bar{X}_i = \begin{pmatrix} 1 & X_i & X_i^2 \end{pmatrix}$ and $\theta = \begin{pmatrix} \theta_0 & \theta_1 & \theta_2 \end{pmatrix}'$. Under the homoskedasticity assumption (which holds for the simulated data), we can calculate the regression standard error as:

$$\hat{\sigma}_e = \sqrt{\frac{\hat{e}'\hat{e}}{N - 2}}$$

and the variance-covariance matrix of the coefficients as:

$$\widehat{Var}\left( \hat{\theta} \right) = \hat{\sigma}_e^2 \left( \bar{X}'\bar{X} \right)^{-1}$$

While the variance of the conditional expectation of $Y_i$ given $\bar{X}_i$ is equal to:

$$Var\left( \hat{g}(X_i) \right) = Var\left( \bar{X}_i \hat{\theta} \right) = \bar{X}_i \, \widehat{Var}\left( \hat{\theta} \right) \bar{X}_i'$$
We can then get $(1-\alpha) \cdot 100\%$ confidence intervals as:

$$\left[ \hat{g}(X_i) + z_{\alpha/2} \sqrt{Var\left( \hat{g}(X_i) \right)}, \; \hat{g}(X_i) + z_{1-\alpha/2} \sqrt{Var\left( \hat{g}(X_i) \right)} \right]$$

where $z_p$ is the $p$-quantile of the standard normal distribution.
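A minimal sketch of these bands for the quadratic fit, assuming the simulated (X, Y) from above and the $N-2$ denominator used in the text:

N   = size(X,1);
XX  = [ones(N,1), X, X.^2];            % design matrix (1, X_i, X_i^2)
th  = XX\Y;                            % OLS coefficients theta_hat
res = Y - XX*th;
s2_e = (res'*res)/(N - 2);             % squared regression standard error
V_th = s2_e*((XX'*XX)\eye(3));         % Var(theta_hat)
se_g = sqrt(sum((XX*V_th).*XX, 2));    % sqrt of X_i Var(theta_hat) X_i'
CI = [XX*th + norminv(0.025)*se_g, XX*th + norminv(0.975)*se_g];

We now plot the regression function (or conditional expectation function) for each value of $X_i$ along with the 95% confidence bands.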
[Figure: polynomial-series estimate of the regression function ("Expectation") for each $X_i$, with 95% confidence bands ("95% CI").]

Code

% MGTECON 640 : Section 8


% Professor: Guido Imbens
% TA: Marco Giacoletti
%%
clear all; clc; close all;
% Setting common random seed
rng(12345,'v5uniform');
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Simulate Data
N = 5000;
X = normrnd(1,1,N,1);
e = normrnd(0,5,N,1);
Y = X + X.^2 + e;
%% Problem 1
% set up grid of values of h for cross-validation
h = 0.01:0.01:1.01;
H = size(h,2);
Q_h = NaN(1,H);
% evaluate cross-validation objective at each point in the grid
for ii=1:H
    Q_h(ii) = CrossVal_kern(X,Y,h(ii));
end
% plot the cross validation objective
figure(3);
plot(Q_h,'LineWidth',2,'Color','b');
grid on;
legend('Cross-Validation Criterion')
xlim([1 H])
set(gca,'fontname','garamond','fontsize',14);
set(gca,'xtick',[0 20 40 60 80 100]);
set(gca,'xticklabel','0|0.2|0.4|0.6|0.8|1');
% select the optimal bandwidth as the one minimizing the CV criterion
ind = find(Q_h==min(Q_h));
[Q_h_opt g_hat_opt CI_025_opt CI_975_opt] = CrossVal_kern(X,Y,h(ind));
out_h = [X g_hat_opt CI_025_opt CI_975_opt];
%% Problem 2
% plot conditional means along with confidence bands
figure(4);

scatter(X,out_h(:,2),20,[0 0 1],'filled'); hold on;
scatter(X,out_h(:,3),15,[1 0 0]); hold on;
scatter(X,out_h(:,4),15,[1 0 0]); hold off; grid on;
legend('Expectation','95% CI')
%% Problem 3
% set up the grid for the degrees of the approximating polynomial
K = 0:1:10;
KK = size(K,2);
Q_K = NaN(1,KK);
% evaluate cross-validation objective at each point in the grid
for kk=1:KK
    Q_K(kk) = CrossVal_pol(X,Y,K(kk));
end
% plot the cross validation objective
figure(5);
plot(Q_K,'LineWidth',2,'Color','b');
grid on;
legend('Cross-Validation Criterion')
xlim([1 KK+1])
set(gca,'fontname','garamond','fontsize',14);
set(gca,'xtick',[0 1 2 3 4 5 6 7 8 9 10]);
set(gca,'xticklabel','|0|1|2|3|4|5|6|7|8|9|10');
% find the optimal degree of the polynomial as the one minimizing CV
% objective
ind_K = find(Q_K==min(Q_K));
[Q_K_opt g_K_hat_opt CI_K_025_opt CI_K_975_opt] = ...
    CrossVal_pol(X,Y,K(ind_K));
%% Problem 4
figure(6);
scatter(X,g_K_hat_opt,20,[0 0 1],'filled'); hold on;
scatter(X,CI_K_025_opt,15,[1 0 0]); hold on;
scatter(X,CI_K_975_opt,15,[1 0 0]); hold off; grid on;
legend('Expectation','95% CI')
set(gca,'fontname','garamond','fontsize',14);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [Q g_hat CI_025 CI_975]= CrossVal_kern(X,Y,h)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% inputs:
% X = covariates
% Y = dependent variable
% h = bandwidth
% outputs:

% Q     = CV criterion
% g_hat = conditional expectation
% CI_025, CI_975 = 95% confidence bands
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Epanechnikov kernel function (normalization 3/(4*sqrt(5)))
epa_kernel = @(z) (3/(4*sqrt(5)))*(1-(1/5)*z.^2).*(abs(z)<sqrt(5));
% integral of the squared kernel
K2 = 3/(5*sqrt(5));
% allocate memory (all N x 1 column vectors)
N       = size(X,1);
g_hat   = NaN(N,1);
f_hat   = NaN(N,1);
g_hat_i = NaN(N,1);
v_hat   = NaN(N,1);
CI_025  = NaN(N,1);
CI_975  = NaN(N,1);
% building blocks of CV criterion
for nn=1:N
    % exclude the current observation
    Xn = X([1:nn-1, nn+1:N]);
    Yn = Y([1:nn-1, nn+1:N]);
    % nn-th element of the CV criterion
    g_hat_i(nn) = sum(Yn.*epa_kernel((Xn-X(nn))/h))...
        /sum(epa_kernel((Xn-X(nn))/h));
    % conditional expectation
    g_hat(nn) = sum(Y.*epa_kernel((X-X(nn))/h))...
        /sum(epa_kernel((X-X(nn))/h));
    % elements for the variance of the conditional expectation
    f_hat(nn) = (1/(N*h))*sum(epa_kernel((X-X(nn))/h));
    v_hat(nn) = sum(((Y-g_hat(nn)).^2).*epa_kernel((X-X(nn))/h))...
        /sum(epa_kernel((X-X(nn))/h));
    % 95% confidence bands
    CI_025(nn) = g_hat(nn) + ...
        norminv(0.025,0,1)*sqrt(K2*v_hat(nn)/f_hat(nn))/sqrt(N*h);
    CI_975(nn) = g_hat(nn) + ...
        norminv(0.975,0,1)*sqrt(K2*v_hat(nn)/f_hat(nn))/sqrt(N*h);
end
% CV criterion
Q = sum((g_hat_i-Y).^2);
end
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
function [Q g_hat CI_025 CI_975]= CrossVal_pol(X,Y,K)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% input:
% X = covariates
% Y = dependent variable
% K = degree of the polynomial approximation
% output:
% Q     = CV criterion
% g_hat = conditional expectations of Yi
% CI_025, CI_975 = 95% confidence bands
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% set up covariates matrix
N  = size(X,1);
XX = NaN(N,K+1);
for kk = 1:K+1
    XX(:,kk) = X.^(kk-1);
end
% allocate memory (N x 1 column vector)
g_hat_i = NaN(N,1);
% get building blocks of CV criterion
for nn=1:N
    % exclude the current observation
    XXn = XX([1:nn-1, nn+1:N],:);
    Yn  = Y([1:nn-1, nn+1:N]);
    % get the nn-th prediction for the CV criterion
    theta_n = pinv(XXn'*XXn)*(XXn'*Yn);
    g_hat_i(nn) = XX(nn,:)*theta_n;
end
% get conditional expectations
theta = pinv(XX'*XX)*(XX'*Y);
g_hat = XX*theta;
% standard errors of the conditional expectations
s2_e = ((Y-XX*theta)'*(Y-XX*theta))/(N-K);
s2_X = s2_e*pinv(XX'*XX);
s2_f = NaN(N,1);
for nn = 1:N
    s2_f(nn) = XX(nn,:)*s2_X*XX(nn,:)';
end
% 95% confidence intervals
CI_025 = g_hat + norminv(0.025,0,1)*sqrt(s2_f);
CI_975 = g_hat + norminv(0.975,0,1)*sqrt(s2_f);
% CV criterion
Q = sum((g_hat_i-Y).^2);
end

