Class 0: Dynamic Programming

Advanced Macroeconomics II
Lecture 1
Stochastic Dynamic Programming
Isaac Baley
UPF & Barcelona GSE
January 7, 2016
Introduction
Stochastic dynamic programming is a very useful tool for thinking about and solving many economic problems.
We review the theory of dynamic programming and extend it to include uncertainty.
Straightforward extension of what you did in Advanced Macro I.
References:
- Acemoglu (2009), Ch. 16.
- Adda and Cooper (2003), Ch. 2-3.
- Ljungqvist and Sargent (2014), Ch. 3-4.
- Stokey, Lucas, and Prescott (1989), Ch. 3-4 (deterministic), Ch. 7-9 (stochastic, with measure theory).
Roadmap
1 Sequence Problem
2 Recursive Formulation
3 Role of Uncertainty
4 Contraction Mapping Theorem
5 Characterization: Euler + Transversality
6 Solution Methods
Sequence Problem (1): Setup
Time is infinite and discrete.
An agent solves the following problem:
$$\max_{\{y_t\}_{t=0}^{\infty}} E_0\left[\sum_{t=0}^{\infty} \beta^t U(x_t, y_t, z_t)\right]$$
Subject to:
- (control) $y_t \in \tilde{G}(x_t, z_t)$
- (state) $x_{t+1} = \tilde{f}(x_t, y_t, z_t)$
- (initial conditions) $x_0, z_0$ given
where:
- $y_t \in Y \subset \mathbb{R}^{K_y}$: control variables (choice variables, e.g. investment)
- $x_t \in X \subset \mathbb{R}^{K_x}$: endogenous state variables (predetermined, e.g. capital)
- $z_t \in Z$: exogenous state variables (stochastic shocks, e.g. productivity)
- $U: X \times Y \times Z \to \mathbb{R}$: instantaneous payoff
- $\beta \in (0, 1]$: discount factor
Sequence Problem (2): Exogenous Shocks
$z_t$ is a stationary shock.
For simplicity, we assume that $z_t$ follows a first-order Markov chain.
- $N$ possible realizations: $Z \equiv \{z_1, z_2, \dots, z_N\}$
- Transition probability from $i$ to $j$, denoted $q_{ji}$:
$$\Pr[z_t = z_j \mid z_{t-1} = z_i] = q_{ji}$$
such that:
$$q_{ji} \geq 0 \quad \text{and} \quad \sum_{j=1}^{N} q_{ji} = 1 \quad \text{for any } i = 1, \dots, N$$
- The transition probabilities form an $N \times N$ matrix with $i$ in rows and $j$ in columns, so each row sums to 1.
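A minimal Python sketch of such a Markov chain may help fix ideas. The two-state grid, the probabilities, and the helper names `is_transition_matrix` and `simulate_chain` are illustrative assumptions, not part of the lecture. The array entry `Q[i, j]` stores the probability of moving from state $i$ to state $j$ (the text's $q_{ji}$), so rows sum to one.

```python
import numpy as np

# Hypothetical two-state productivity chain (values chosen only for illustration).
z_grid = np.array([0.9, 1.1])
# Q[i, j] = Pr(z_t = z_j | z_{t-1} = z_i); each row is a probability distribution.
Q = np.array([[0.8, 0.2],
              [0.3, 0.7]])

def is_transition_matrix(Q, tol=1e-12):
    """Check that Q is square, has nonnegative entries, and that every row sums to one."""
    Q = np.asarray(Q, dtype=float)
    return bool(Q.ndim == 2 and Q.shape[0] == Q.shape[1]
                and np.all(Q >= 0)
                and np.allclose(Q.sum(axis=1), 1.0, atol=tol))

def simulate_chain(Q, i0, T, seed=0):
    """Simulate T state indices of the chain, starting from index i0."""
    rng = np.random.default_rng(seed)
    path = np.empty(T, dtype=int)
    path[0] = i0
    for t in range(1, T):
        # Draw next state using the row of Q that belongs to the current state.
        path[t] = rng.choice(Q.shape[0], p=Q[path[t - 1]])
    return path

path = simulate_chain(Q, i0=0, T=10)
shocks = z_grid[path]   # realized shock values along the simulated path
```

Storing the chain as state indices rather than shock values keeps the lookup into `Q` trivial, which is convenient when the same indices later address a value function on a grid.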
Sequence Problem (3): Constraint Sets
$$\max_{\{y_t\}_{t=0}^{\infty}} E_0\left[\sum_{t=0}^{\infty} \beta^t U(x_t, y_t, z_t)\right]$$
- (control) $y_t \in \tilde{G}(x_t, z_t)$
- (state) $x_{t+1} = \tilde{f}(x_t, y_t, z_t)$
- (initial conditions) $x_0, z_0$ given
Constraint sets:
- $\tilde{G}(x_t, z_t)$: constraint on admissible controls, for given states.
- $\tilde{f}(x_t, y_t, z_t)$: law of motion of the state.
Using $\tilde{f}$ we can substitute out $y_t$ as a function of $x_{t+1}$, $x_t$ and $z_t$.
Example: $k_{t+1} = (1 - \delta) k_t + i_t$. The control variable can be either $i_t$ or $k_{t+1}$.
Sequence Problem (4): Solution
Substitute out $y_t$ using $\tilde{f}$ and obtain the Sequence Problem (P1):
$$V(x_0, z_0) = \max_{\{x_{t+1}\}_{t=0}^{\infty}} E_0\left[\sum_{t=0}^{\infty} \beta^t U(x_t, z_t, x_{t+1})\right]$$
$$\text{s.t.} \quad x_{t+1} \in G(x_t, z_t), \quad x_0, z_0 \text{ given}$$
$V: X \times Z \to \mathbb{R}$ is the value function.
The problem is stationary in that $U$ and $G$ do not depend on time.
The solution is an infinite sequence $\{x^*_{t+1}\}_{t=0}^{\infty}$.
Idea of dynamic programming: transform the problem into one of finding a time-invariant function $\pi(x_t, z_t)$ rather than an infinite sequence.
- Example: instead of the infinite sequence of optimal capital $\{k^*_{t+1}\}_{t=0}^{\infty}$, you find $k_{t+1} = \pi(k_t, z_t)$, the time-invariant capital policy function.
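The relation between a time-invariant policy function and the sequence it generates can be sketched in a few lines of Python. The helper `simulate_policy`, the parameter values, and the particular capital policy used here are hypothetical choices for illustration only.

```python
import numpy as np

def simulate_policy(policy, x0, z_path):
    """Generate x_1, ..., x_T by applying x_{t+1} = policy(x_t, z_t) repeatedly."""
    x = np.empty(len(z_path) + 1)
    x[0] = x0
    for t, z in enumerate(z_path):
        x[t + 1] = policy(x[t], z)
    return x

# Hypothetical capital policy k' = alpha * beta * z * k**alpha (an assumption for this sketch).
alpha, beta = 0.3, 0.95
policy = lambda k, z: alpha * beta * z * k ** alpha

z_path = np.ones(200)                       # shocks held constant at 1 for simplicity
k_path = simulate_policy(policy, x0=0.1, z_path=z_path)
k_steady = (alpha * beta) ** (1.0 / (1.0 - alpha))   # fixed point of the policy when z = 1
```

One function of the current state replaces the whole infinite plan: any path, for any shock history, is recovered by iterating the policy forward.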
Recursive Formulation (1): Principle of Optimality
The Principle of Optimality allows us to express the problem in recursive form.
- In an optimal policy, whatever the initial state and decision are, the remaining decisions must be an optimal policy with regard to the state resulting from the first decision.
Consider the original sequence problem (P1):
$$V(x_0, z_0) = \max_{\{x_{t+1}\}_{t=0}^{\infty}} E_0\left[U(x_0, z_0, x_1) + \beta U(x_1, z_1, x_2) + \dots\right]$$
$$\text{s.t.} \quad x_{t+1} \in G(x_t, z_t), \quad x_0, z_0 \text{ given}$$
Suppose $\{x^*_{t+1}\}_{t=0}^{\infty}$ is a solution to the problem and $V(x_0, z_0)$ is finite.
Recursive Formulation (2): Bellman Equation

In period 1, the state variables are $x^*_1$ and $z_1$, and the sequence $\{x^*_{t+1}\}_{t=0}^{\infty}$ is also optimal from period 1 onwards:
$$V(x_0, z_0) = U(x_0, z_0, x^*_1) + \beta E_0\left[U(x^*_1, z_1, x^*_2) + \dots\right]$$
$$= U(x_0, z_0, x^*_1) + \beta E_0\left[V(x^*_1, z_1)\right]$$
Therefore P1 must be equivalent to the two-period maximization problem:
$$V(x_0, z_0) = \max_{x_1 = \pi(x_0, z_0)} \left\{U(x_0, z_0, x_1) + \beta E_0\left[V(x_1, z_1)\right]\right\}, \quad x_0, z_0 \text{ given}$$
For any $t$, we obtain the Recursive Problem (P2), a Bellman equation:
$$V(x_t, z_t) = \max_{x_{t+1} = \pi(x_t, z_t)} \left\{U(x_t, z_t, x_{t+1}) + \beta E_t\left[V(x_{t+1}, z_{t+1})\right]\right\} \quad \forall x_t \in X$$
- $x_t$ and $z_t$ are states and $x_{t+1}$ is the vector of controls (tomorrow's state).
- We usually assume that $z$ is an exogenous first-order stochastic process: $E_t(z_{t+1})$ depends only on $z_t$.
Recursive Formulation (3): Important Notes
1 The infinite-horizon plan is reduced to a two-period problem (today + continuation value).
- Often gives better economic intuition.
2 Once we have $V(\cdot)$, the policy function $x_{t+1} = \pi(x_t, z_t)$ can be found from:
$$V(x_t, z_t) = U(x_t, z_t, \pi(x_t, z_t)) + \beta E_t\left[V(\pi(x_t, z_t), z_{t+1})\right] \quad \forall x_t \in X$$
3 P2 is a functional equation, i.e. an equation whose unknown is a function.
- Under certain assumptions, we can guarantee existence and properties of the solutions.
- Contraction Mapping Theorem.
4 P2 is recursive in that $V(\cdot)$ appears on both the LHS and the RHS.
- Powerful numerical tools to find the solution.
- Contraction Mapping Theorem.
Recursive Formulation (4): Additional Assumptions
We will make the following assumptions:
(i) $\lim_{n \to \infty} E_0 \sum_{t=0}^{n} \beta^t U\left(x_t[z^{t-1}], z_t, x_{t+1}[z^t]\right)$ exists and is finite.
(ii) $X$ is a compact subset of $\mathbb{R}^K$.
(iii) $G(x, z)$ is nonempty, compact-valued, and continuous. Moreover, it is convex in $x$ for any $z \in Z$.
(iv) $U$ is continuous, concave, differentiable and increasing in the state $x$ for any $z \in Z$.
With the previous assumptions, we can establish the following results:
1 Equivalence of P1 (sequence problem) and P2 (recursive problem).
2 $V: X \times Z \to \mathbb{R}$ exists, is unique, bounded, continuous, concave, increasing and differentiable.
3 There exists a unique optimal plan with
$$x^*_{t+1} = \pi(x^*_t, z_t)$$
Role of Uncertainty
What are the practical complications introduced by uncertainty?
Once we make sure that the assumptions needed to solve the recursive problem hold, not many.
The expectation is just a weighted average of outcomes in different states:
$$V(x_t, z_t) = \max_{x_{t+1} = \pi(x_t, z_t)} \left\{U(x_t, z_t, x_{t+1}) + \beta \sum_{j=1}^{N} \Pr[z_{t+1} = z_j \mid z_t] \, V(x_{t+1}, z_j)\right\}$$
The key fact is that the process is exogenous, and hence $\Pr[z_{t+1} = z_j \mid z_t]$ is not affected by the control variable $x_{t+1}$. This implies that:
$$\frac{\partial E_t[V(x_{t+1}, z_{t+1})]}{\partial x_{t+1}} = \sum_{j=1}^{N} \Pr[z_{t+1} = z_j \mid z_t] \, \frac{\partial V(x_{t+1}, z_j)}{\partial x_{t+1}} = E_t\left[\frac{\partial V(x_{t+1}, z_{t+1})}{\partial x_{t+1}}\right]$$
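On a grid, the conditional expectation of the continuation value reduces to a matrix product. The sketch below uses hypothetical grids, a hypothetical transition matrix, and an arbitrary value function; the last line checks numerically that differencing in the endogenous state commutes with taking expectations, which is the discrete analogue of exchanging the derivative and the expectation.

```python
import numpy as np

# Hypothetical grids and value function, for illustration only.
x_grid = np.linspace(0.1, 2.0, 5)           # candidate values of x_{t+1}
z_grid = np.array([0.9, 1.1])
Q = np.array([[0.8, 0.2],                   # Q[i, j] = Pr(z_{t+1} = z_j | z_t = z_i)
              [0.3, 0.7]])
V = np.log(x_grid)[:, None] + np.log(z_grid)[None, :]   # V[m, j] = V(x_m, z_j)

# EV[m, i] = sum_j Q[i, j] * V[m, j] = E[V(x_m, z') | z = z_i]
EV = V @ Q.T

# Because Q does not depend on x, differences in x pass through the expectation.
commutes = np.allclose(np.diff(EV, axis=0), np.diff(V, axis=0) @ Q.T)
```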
Contraction Mapping Theorem (1): Motivation
A map is a function that transforms functions into functions (rather than numbers into numbers):
$$T(f(x)) = g(x)$$
The Bellman equation can be written as a map on value functions and policy rules.
For any function $W$, define the map $T$ as:
$$T(W) = \max_{x_{t+1} \in G(x_t, z_t)} \left\{U(x_t, z_t, x_{t+1}) + \beta E_t\left[W(x_{t+1}, z_{t+1})\right]\right\}$$
The solution is a fixed point of the mapping:
$$V = T(V)$$
The Contraction Mapping Theorem ensures that you can find the fixed point with an iterative procedure.
Contraction Mapping Theorem (2): Example
Example: a household that maximizes intertemporal utility from consumption:
$$V(a_t) = \max_{c_t} \; u(c_t) + \beta E_t\left[V(a_{t+1})\right]$$
$$\text{s.t.} \quad a_{t+1} = R a_t - c_t + y_t$$
Substitute the constraint into the value function:
$$V(a_t) = \max_{c_t} \; u(c_t) + \beta E_t\Big[V\big(\underbrace{R a_t - c_t + y_t}_{a_{t+1}}\big)\Big] = T(V(a_t))$$
where the mapping is defined as:
$$T(W(a)) \equiv \max_{c} \; u(c) + \beta E\left[W(R a - c + y)\right]$$
For every value of $a_t$, finding optimal consumption $c_t$ is equivalent to finding the fixed point of the mapping, i.e. the function that maps into itself:
$$T(V(a)) = V(a)$$
Contraction Mapping Theorem (3): Definition
Definition: Let $(F, \|\cdot\|)$ be a metric space. A map $T: F \to F$ (from $F$ into itself) is a contraction map iff there exists a number $\beta \in [0, 1)$ such that
$$\|T f_1(x) - T f_2(x)\| \leq \beta \|f_1(x) - f_2(x)\|, \quad \forall f_1, f_2 \in F$$
i.e. the functions $T f_1(x)$ and $T f_2(x)$ are closer to each other than $f_1(x)$ and $f_2(x)$.
Why is this useful? Consider a sequence of functions $\{f_n(x)\}_{n=0}^{\infty}$ given as:
$$f_n(x) = T f_{n-1}(x)$$
If $T$ is a contraction map, then
$$\|f_n(x) - f_{n-1}(x)\| = \|T f_{n-1}(x) - T f_{n-2}(x)\| \leq \beta \|f_{n-1}(x) - f_{n-2}(x)\| \leq \|f_{n-1}(x) - f_{n-2}(x)\|$$
Functions in the sequence become closer and closer.
Contraction Mapping Theorem (4): Theorem
Theorem 1
Let $(F, \|\cdot\|)$ be a complete metric space and $T: F \to F$ a contraction mapping. Then $T$ has a unique fixed point $f^*$, with $T f^* = f^*$.
Moreover, for any initial guess $f_0$, the sequence $f_n = T f_{n-1}$ converges to $f^*$:
$$f_n(x) \xrightarrow[n \to \infty]{} f^*(x)$$
Why is it unique? Assume not: then there exists $\hat{f} \neq f^*$ such that $T\hat{f} = \hat{f}$ and $T f^* = f^*$. But then
$$0 < \|\hat{f} - f^*\| = \|T\hat{f} - T f^*\| \leq \beta \|\hat{f} - f^*\| \implies \beta \geq 1 \quad \text{(contradiction)}$$
Very useful in practice: take any arbitrary initial guess for $V$ and iterate the Bellman equation until convergence.
The key is to check that our Bellman equation is a contraction map!
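The iterative procedure in the theorem can be illustrated with a deliberately simple contraction. In this sketch the function `fixed_point`, the affine map `T`, and the three-point "function" `g` are assumptions made for illustration; functions are represented as vectors of values, and distance is measured with the sup norm.

```python
import numpy as np

def fixed_point(T, f0, tol=1e-10, max_iter=10_000):
    """Iterate f_n = T(f_{n-1}) until the sup-norm distance between iterates is below tol."""
    f = np.asarray(f0, dtype=float)
    for n in range(1, max_iter + 1):
        f_new = T(f)
        if np.max(np.abs(f_new - f)) < tol:
            return f_new, n
        f = f_new
    raise RuntimeError("no convergence within max_iter iterations")

# Hypothetical contraction with modulus beta = 0.9: (Tf)(x) = g(x) + beta * f(x).
beta = 0.9
g = np.array([1.0, 2.0, 3.0])
T = lambda f: g + beta * f

# Two very different initial guesses lead to the same fixed point g / (1 - beta).
f_a, n_a = fixed_point(T, np.zeros(3))
f_b, n_b = fixed_point(T, 100.0 * np.ones(3))
```

The number of iterations depends on the initial guess and on the modulus, but the limit does not, which is the practical content of uniqueness.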
Contraction Mapping Theorem (5): Blackwell Conditions
In general, it is hard to prove that a map is a contraction.
But we have the following useful sufficient conditions (Blackwell):
(B1) Monotonicity:
$$f_1(x) \leq f_2(x) \text{ for all } x \implies T f_1(x) \leq T f_2(x) \text{ for all } x$$
(B2) Discounting: There exists a $\beta \in [0, 1)$ such that, for any constant $k \geq 0$ and any function $f$, we have:
$$T(f + k) \leq T f + \beta k$$
If a map $T$ satisfies (B1) and (B2), then $T$ is a contraction.
Let us check whether a map $T$ given by a Bellman equation is a contraction.
Contraction Mapping Theorem (6): Bellman & Blackwell
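Bellman and Monotonicity:
Suppose $V(a) \geq W(a)$ for all $a$, and let $c_W$ denote the maximizer in $TW(a)$. Then it must be that $TV(a) \geq TW(a)$:
$$\begin{aligned}
TV(a) &= \max_{c} \left\{u(c) + \beta E\left[V(R a - c + y)\right]\right\} \\
&\geq u(c_W) + \beta E\left[V(R a - c_W + y)\right] \\
&\geq u(c_W) + \beta E\left[W(R a - c_W + y)\right] \\
&= \max_{c} \left\{u(c) + \beta E\left[W(R a - c + y)\right]\right\} = TW(a)
\end{aligned}$$
Bellman and Discounting:
$$\begin{aligned}
T(V(a) + k) &= \max_{c} \left\{u(c) + \beta E\left[V(R a - c + y) + k\right]\right\} \\
&= \max_{c} \left\{u(c) + \beta E\left[V(R a - c + y)\right]\right\} + \beta k \\
&= TV(a) + \beta k
\end{aligned}$$
Conclusion: if $\beta < 1$, then $T$ (the Bellman equation) is a contraction map.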
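Blackwell's two conditions can also be checked numerically for a discretized version of the household example. The sketch below makes several simplifying assumptions that are not in the lecture: log utility, constant income $y$ (so the expectation drops out), the choice variable restated as next-period assets $a'$ on a grid, and arbitrary parameter values.

```python
import numpy as np

beta, R, y = 0.95, 1.02, 1.0            # hypothetical parameter values
a_grid = np.linspace(0.0, 10.0, 50)     # grid for assets a and for the choice a'

def bellman(V):
    """(TV)(a) = max_{a'} log(R a + y - a') + beta V(a'), evaluated on the grid."""
    c = R * a_grid[:, None] + y - a_grid[None, :]       # c[i, j]: consumption at (a_i, a'_j)
    # Infeasible choices (c <= 0) get utility -inf so they are never selected.
    u = np.where(c > 0, np.log(np.where(c > 0, c, 1.0)), -np.inf)
    return np.max(u + beta * V[None, :], axis=1)

rng = np.random.default_rng(0)
W = rng.normal(size=a_grid.size)
V = W + rng.uniform(0.0, 1.0, size=a_grid.size)         # V >= W pointwise
k = 2.0

monotone = bool(np.all(bellman(V) >= bellman(W)))                     # (B1)
discount = bool(np.allclose(bellman(V + k), bellman(V) + beta * k))   # (B2), holds with equality
contraction = bool(np.max(np.abs(bellman(V) - bellman(W)))
                   <= beta * np.max(np.abs(V - W)) + 1e-12)
```

A numerical check on two random functions is of course not a proof; it only illustrates what the conditions say about the operator.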
Characterization (1): First Order Conditions
Bellman equation:
$$V(x, z) = \max_{x' = \pi(x, z)} \left\{U(x, z, x') + \beta E\left[V(x', z')\right]\right\} \quad \forall x \in X$$
- By the above assumptions, the maximization problem is strictly concave and differentiable.
- For interior solutions, the first-order conditions (FOC) are necessary.
The FOC with respect to the control $x'$:
$$D_{x'} U(x, z, x') + \beta D E\left[V(x', z')\right] = 0$$
where $D$ denotes the gradient and $D_{x'}$ the gradient with respect to the vector $x'$.
How do we evaluate $D E[V(x', z')]$?
Characterization (2): Envelope Conditions
As we saw earlier, since the stochastic process is exogenous, we can exchange the derivative and the expectation:
$$D E\left[V(x', z')\right] = E\left[D V(x', z')\right]$$
Now use $V(x, z) = U(x, z, x') + \beta E[V(x', z')]$ to compute $DV(x, z)$:
$$\begin{aligned}
DV(x, z) &= D_x U(x, z, x') + D_{x'} U(x, z, x') \frac{dx'}{dx} + \beta E\left[DV(x', z')\right] \frac{dx'}{dx} \\
&= D_x U(x, z, x') + \underbrace{\left\{D_{x'} U(x, z, x') + \beta E\left[DV(x', z')\right]\right\}}_{= 0 \text{ by FOC}} \frac{dx'}{dx} \\
&= D_x U(x, z, x')
\end{aligned}$$
Intuition: $V$ is already maximized with respect to $x'$, so small changes in $x'$ do not affect it (envelope theorem).
Characterization (3): Euler Equation
Back to the FOC:
$$D_{x'} U(x, z, x') + \beta E\left[DV(x', z')\right] = 0$$
The second term is the derivative we just computed with the envelope condition, but one period forward:
$$E\left[DV(x', z')\right] = E\left[D_{x'} U(x', z', x'')\right]$$
Substituting back:
$$D_{x'} U(x, z, x') + \beta E\left[D_{x'} U(x', z', x'')\right] = 0$$
This Euler equation implicitly characterizes the (unknown) optimal policy $\pi(x, z)$:
$$D_{x'} U(x, z, \pi(x, z)) + \beta E\left[D_{x'} U\big(\pi(x, z), z', \pi(\pi(x, z), z')\big)\right] = 0$$
Characterization (4): Transversality Condition
The Euler equation establishes the optimality of the solution between two contiguous periods (one-period deviations from the optimal policy are not profitable).
What about an infinite deviation?
The solution must also satisfy the transversality condition:
$$\lim_{t \to \infty} \beta^t E_t\left[D_x U(x^*_t, z_t, x^*_{t+1}) \cdot x^*_t\right] = 0$$
- The discounted value of $x_t$ must approach zero at infinity.
- Infinite-horizon equivalent of the terminal condition in the finite-horizon case, which requires that all wealth be consumed by the final period.
Characterization (5): One-dimensional case
Suppose $x$, $z$ and $x'$ are real numbers.
The First Order Condition is:
$$-\frac{\partial U(x, z, x')}{\partial x'} = \beta E\left[\frac{\partial V(x', z')}{\partial x'}\right]$$
- Indifference condition: today's cost of increasing $x'$ (e.g. capital tomorrow) has to be equal to its expected discounted marginal gain in future utility (e.g. profits).
The Envelope Condition:
$$\frac{\partial V(x, z)}{\partial x} = \frac{\partial U(x, z, x')}{\partial x}$$
Characterization (5): One-dimensional case (cont...)
Forward the envelope condition by one period:
$$\frac{\partial V(x', z')}{\partial x'} = \frac{\partial U(x', z', x'')}{\partial x'}$$
The Euler Equation (substitute the forwarded envelope condition into the FOC):
$$-\frac{\partial U(x, z, x')}{\partial x'} = \beta E\left[\frac{\partial U(x', z', x'')}{\partial x'}\right]$$
- New indifference condition involving only today and tomorrow (the effect on the future continuation value is second order, because of the envelope theorem).
Characterization (5): One-dimensional case (cont...)
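Finally, the Transversality Condition.
Suppose the last period is $t = T$, so that we choose $x_{T+1}$ to maximize $\beta^T U(x_T, z_T, x_{T+1})$.
Because of a potential corner solution, the FOC reads:
$$\beta^T \frac{\partial U(x_T, z_T, x_{T+1})}{\partial x_{T+1}} \, x_{T+1} = 0$$
- Either an interior solution is optimal, $\frac{\partial U(x_T, z_T, x_{T+1})}{\partial x_{T+1}} = 0$, or we go to a corner solution, $x_{T+1} = 0$.
When $T \to \infty$, we take the limit:
$$\lim_{T \to \infty} \beta^T \frac{\partial U(x_T, z_T, x_{T+1})}{\partial x_{T+1}} \, x_{T+1} = \underbrace{-\lim_{T \to \infty} \beta^{T+1} E_T\left[\frac{\partial U(x_{T+1}, z_{T+1}, x_{T+2})}{\partial x_{T+1}}\right] x_{T+1}}_{\text{use Euler}} = 0$$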
Characterization (6): Our previous example
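Bellman:
$$V(a, y) = \max_{a'} \; u(R a + y - a') + \beta E\left[V(a', y')\right]$$
FOC:
$$\frac{\partial u(c)}{\partial c} = \beta E\left[\frac{\partial V(a', y')}{\partial a'}\right]$$
Envelope:
$$\frac{\partial V(a, y)}{\partial a} = R \frac{\partial u(c)}{\partial c} \;\overset{\text{forward}}{\implies}\; \frac{\partial V(a', y')}{\partial a'} = R \frac{\partial u(c')}{\partial c'}$$
Euler = FOC + Forward Envelope:
$$\frac{\partial u(c)}{\partial c} = \beta R \, E\left[\frac{\partial u(c')}{\partial c'}\right]$$
Transversality:
$$\lim_{t \to \infty} \beta^t \frac{\partial u(c_t)}{\partial c_t} \, a_t = 0$$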
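To see what the Euler equation of this example looks like under a concrete utility function, the short derivation below assumes CRRA utility with curvature parameter $\gamma > 0$. This functional form is an illustrative assumption and does not appear in the lecture.

```latex
% Assumption (not in the slides): CRRA utility u(c) = c^{1-\gamma}/(1-\gamma), \gamma > 0.
\begin{align*}
u'(c) &= c^{-\gamma} \\
\text{Euler:}\quad c_t^{-\gamma} &= \beta R \, E_t\!\left[ c_{t+1}^{-\gamma} \right] \\
\text{No uncertainty:}\quad \frac{c_{t+1}}{c_t} &= (\beta R)^{1/\gamma}
\end{align*}
```

In the deterministic special case consumption grows, stays flat, or falls over time depending on whether $\beta R$ is above, equal to, or below one.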
Solution Methods
The Bellman equation is a functional equation.
How do we solve functional equations? There is no general method, but there are several approaches.
Closed form: guess a functional form (often the same form as $U$) with undetermined coefficients and verify.
Numerical dynamic programming:
a) Value function iteration.
b) Policy function iteration (Howard improvement algorithm).
c) Projection methods (approximate the policy function with polynomials).
Guess and Verify (Undetermined Coefficients)
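In some special cases, one can obtain closed-form solutions.
Example: stochastic growth with log utility and Cobb-Douglas production.
- Consider the problem:
$$\max_{\{c_t, k_{t+1}\}_{t=0}^{\infty}} E_0\left(\sum_{t=0}^{\infty} \beta^t \ln c_t\right)$$
$$\text{s.t.} \quad k_{t+1} = \theta_t k_t^{\alpha} - c_t, \quad k_0, \theta_0 \text{ given}$$
$$\log(\theta_t) \sim iid(0, \sigma^2)$$
- In this case $k_t$ is the state, $k_{t+1}$ the control, and $c_t = \theta_t k_t^{\alpha} - k_{t+1}$.
Bellman equation:
$$V(k_t, \theta_t) = \max_{k_{t+1}} \; \ln\left(\theta_t k_t^{\alpha} - k_{t+1}\right) + \beta E_t\left[V(k_{t+1}, \theta_{t+1})\right]$$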
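First Order Condition:
$$\frac{1}{\theta_t k_t^{\alpha} - k_{t+1}} = \beta E_t\left[\frac{\partial V(k_{t+1}, \theta_{t+1})}{\partial k_{t+1}}\right]$$
Envelope condition:
$$\frac{\partial V(k_t, \theta_t)}{\partial k_t} = \frac{\alpha \theta_t k_t^{\alpha - 1}}{\theta_t k_t^{\alpha} - k_{t+1}}$$
Forwarding by one period:
$$E_t\left[\frac{\partial V(k_{t+1}, \theta_{t+1})}{\partial k_{t+1}}\right] = E_t\left[\frac{\alpha \theta_{t+1} k_{t+1}^{\alpha - 1}}{\theta_{t+1} k_{t+1}^{\alpha} - k_{t+2}}\right]$$
Substituting back in the FOC:
$$\frac{1}{\theta_t k_t^{\alpha} - k_{t+1}} = \beta E_t\left[\frac{\alpha \theta_{t+1} k_{t+1}^{\alpha - 1}}{\theta_{t+1} k_{t+1}^{\alpha} - k_{t+2}}\right]$$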
We guess that the value function is a log-linear function of the states:
$$V(k_t, \theta_t) = v_1 + v_2 \log k_t + v_3 \log \theta_t$$
Since $\log(\theta_t) \sim iid(0, \sigma^2)$, the guess implies that:
$$E_t\left[V(k_{t+1}, \theta_{t+1})\right] = v_1 + v_2 \log k_{t+1}$$
Substituting the guess in the Bellman equation:
$$V(k_t, \theta_t) = \max_{k_{t+1}} \; \ln\left(\theta_t k_t^{\alpha} - k_{t+1}\right) + \beta v_1 + \beta v_2 \log k_{t+1}$$
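FOC of the Bellman equation with the guess substituted in:
$$-\frac{1}{\theta_t k_t^{\alpha} - k_{t+1}} + \frac{\beta v_2}{k_{t+1}} = 0 \implies k_{t+1} = \frac{\beta v_2}{1 + \beta v_2} \theta_t k_t^{\alpha}$$
Substitute the solution into the value function:
$$V(k_t, \theta_t) = \ln\left(\theta_t k_t^{\alpha} - \frac{\beta v_2}{1 + \beta v_2} \theta_t k_t^{\alpha}\right) + \beta v_1 + \beta v_2 \log\left(\frac{\beta v_2}{1 + \beta v_2} \theta_t k_t^{\alpha}\right)$$
Rearrange as follows (for homework, verify this claim):
$$V(k_t, \theta_t) = \text{constant} + (1 + \beta v_2) \ln\left(\theta_t k_t^{\alpha}\right)$$
and conclude that:
$$V(k_t, \theta_t) = \text{constant} + \underbrace{\alpha (1 + \beta v_2)}_{v_2} \ln k_t + \underbrace{(1 + \beta v_2)}_{v_3} \ln \theta_t$$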
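If the guess is right, then the following equations must have a solution:
$$\alpha (1 + \beta v_2) = v_2$$
$$1 + \beta v_2 = v_3$$
Solving for $v_2$ and $v_3$ we obtain:
$$v_2 = \frac{\alpha}{1 - \alpha \beta}, \quad v_3 = \frac{1}{1 - \alpha \beta}$$
Hence the policy function:
$$k_{t+1} = \left(1 + \frac{1}{\beta v_2}\right)^{-1} \theta_t k_t^{\alpha} = \alpha \beta \, \theta_t k_t^{\alpha}$$
And the value function:
$$V(k_t, \theta_t) = \text{constant} + \frac{\alpha}{1 - \alpha \beta} \ln k_t + \frac{1}{1 - \alpha \beta} \ln \theta_t$$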
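The closed-form policy gives a convenient benchmark for value function iteration. The sketch below is one possible discretized implementation under assumptions that are not in the lecture: a two-point iid shock with mean-zero logs, arbitrary values for $\alpha$ and $\beta$, and a capital grid chosen to contain the optimal choices. It iterates the Bellman equation from $V = 0$ and compares the resulting grid policy with $\alpha \beta \theta_t k_t^{\alpha}$.

```python
import numpy as np

alpha, beta = 0.3, 0.95                         # hypothetical parameter values
theta = np.exp(np.array([-0.1, 0.1]))           # two-point iid shock, E[log theta] = 0 (assumption)
prob = np.array([0.5, 0.5])
k_grid = np.linspace(0.05, 0.5, 300)            # grid for k and for the choice k'

def vfi(tol=1e-8, max_iter=2000):
    """Value function iteration on a grid; returns V[i, s] and the policy k'(k_i, theta_s)."""
    V = np.zeros((k_grid.size, theta.size))
    # c[i, s, j]: consumption in state (k_i, theta_s) when choosing k' = k_j
    c = theta[None, :, None] * k_grid[:, None, None] ** alpha - k_grid[None, None, :]
    u = np.where(c > 0, np.log(np.where(c > 0, c, 1.0)), -np.inf)
    for _ in range(max_iter):
        EV = V @ prob                           # iid shocks: E[V(k', theta')] depends on k' only
        obj = u + beta * EV[None, None, :]
        V_new = obj.max(axis=2)
        if np.max(np.abs(V_new - V)) < tol:
            break
        V = V_new
    return V_new, k_grid[obj.argmax(axis=2)]

V, k_next = vfi()
closed_form = alpha * beta * theta[None, :] * k_grid[:, None] ** alpha
max_gap = np.max(np.abs(k_next - closed_form))  # discretization error of the grid policy
```

The gap between the two policies reflects only the coarseness of the grid, so refining `k_grid` should shrink it; the trade-off is a three-dimensional utility array that grows quadratically in the number of grid points.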
Homework: Autocorrelated shocks
Assume now that the productivity shocks $\theta_t$ are autocorrelated:
$$\log \theta_t = \rho \log \theta_{t-1} + \epsilon_t$$
where $\epsilon_t$ is $iid(0, \sigma^2)$ and $\rho < 1$.
Verify that the guess $V(k_t, \theta_t) = v_1 + v_2 \log k_t + v_3 \log \theta_t$ is still correct.
Compute the optimal policy and the value function in this case.
How does the semi-elasticity of the value function with respect to $\theta$ change with the persistence parameter $\rho$?
