Thomas J. Sargent
March 8, 2015
Abstract
A representative consumer expresses distrust of a baseline probability model by using
a convex set of martingales as likelihood ratios to represent other probability models.
The consumer constructs the set to include martingales that represent particular
parametric alternatives to the baseline model as well as others representing only
vaguely specified models statistically close to the baseline model. The representative
consumer's max-min expected utility over that set gives rise to equilibrium prices
of model uncertainty expressed as worst-case distortions to the drifts in his baseline
model. We calibrate a quantitative example to aggregate US consumption data.
Key words: Risk, uncertainty, uncertainty prices, Chernoff entropy, robustness, shock
price elasticities, affine stochastic discount factor
We thank Scott Lee, Botao Wu, and especially Lloyd Han and Paul Ho for carrying out the computations.
Introduction
Specifying a set of probability distributions is an essential part of applying the Gilboa and
Schmeidler (1989) max-min expected utility model. This paper proposes a new way to
imagine that a decision maker forms that set and provides an application to asset pricing.
When a representative investor describes risks with a set of probability models, uncertainty premia augment prices of exposures to those risks. We describe how our method for
specifying that set affects prices of model uncertainty.
Our experiences as applied econometricians attract us to robust control theory. We
always regard our own quantitative models as approximations to better models that we had
not formulated. This is also the attitude of the robust decision maker modeled in Hansen
and Sargent (2001) and Hansen et al. (2006). The decision maker has a single baseline
probability model with a finite number of parameters. He wants to evaluate outcomes
under alternative models that are statistically difficult to distinguish from his baseline
model. He expresses distrust of his baseline model by surrounding it with an uncountable
number of alternative models, many of which have uncountable numbers of parameters. He
represents these alternative models by multiplying the baseline probabilities with likelihood
ratios whose entropies relative to the baseline model are less than a bound that expresses
the idea that alternative models are statistically close to the baseline model.
The decision theory presented in this paper retains the starting point of a single baseline
model but differs from Hansen et al. (2006) in how it forms the set surrounding the baseline
model. A new object appears: a quadratic function of a Markov state that defines alternative parametric models to be included within a set of models surrounding the baseline
model. The decision maker wants valuations that are robust to these models in addition
to other vaguely specified models expressed as before by multiplying the baseline model
by likelihood ratios. The quadratic function can be specified to include alternatives to the
baseline model including ones with fixed parameters, time varying parameters, and other
less structured forms of model uncertainty.
For asset pricing, a key object that emerges from the analysis in Hansen and Sargent
(2010) is a vector of worst-case drift distortions to the baseline model. The negative of
the drift distortion vector equals the vector of market prices of model uncertainty that
compensate the representative investor for bearing model uncertainty. The effects that our
new object, the quadratic function indexing particular alternative models, has on market prices of uncertainty are all intermediated through this drift distortion. We show how
the quadratic function can produce drift distortions that imply stochastic discount factors
resembling ones attained by earlier authors under different assumptions about the sources
of risks. For example, models that posit that a representative consumer's consumption
process has innovations with stochastic volatility introduce new risk exposures in the form
of the shocks to volatilities. Their presence induces time variation in equilibrium compensations for exposures to shocks that include both the stochastic volatility shock as well as
the original shocks whose volatilities now move. By way of contrast, we introduce no
stochastic volatility and no new risks. Instead, we amplify the prices of exposures to the
original shocks. We induce fluctuations in those prices by modeling how the representative consumer struggles to confront his doubts about the baseline model. We extend these
insights to the analysis of uncertainty prices over alternative investment horizons.
Section 2 describes a representative consumer's baseline probability model and martingale perturbations to it. Section 3 describes two convex sets of martingales that perturb
the baseline model. Section 4 uses one of these sets to form a robust planning problem that
generates a worst-case model that we use to calibrate key parameters measuring the size
of a convex set of models. Section 5 constructs a recursive representation of a competitive
equilibrium. Then it links the worst-case model that emerges from the robust planning problem
to competitive equilibrium compensations that the representative consumer earns for bearing model uncertainty. This section also describes a term structure of these market prices
of uncertainty. By borrowing from Hansen and Sargent (2010), section 6 describes a quantitative version of a baseline model as well as a class of models that particularly concern the
robust consumer and the robust planner. Section 7 uses the quantitative model to compare
the set of models that concern both our robust planner and our representative consumer
with two other sets featured in Anderson et al. (2003) and Hansen and Sargent (2010),
one based on Chernoff entropy, the other on relative entropy. Section 8 offers concluding
remarks. Six appendices provide technical details.
The model
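2.1 Mathematical framework

A representative consumer cares about a stochastic process for $Y = \{Y_t : t \ge 0\}$ described by the baseline model¹

$$
\begin{aligned}
d\log Y_t &= (.01)\,(\hat\alpha + X_t)\,dt + (.01)\,\sigma\cdot dW_t \\
dX_t &= \hat\phi\,dt - \hat\kappa X_t\,dt + \bar\sigma\cdot dW_t,
\end{aligned}
\tag{1}
$$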
model.2
Because he doesn't trust the baseline model, the consumer also cares about $Y$ under
probability models obtained by multiplying probabilities associated with the baseline model
(1) by likelihood ratios. We represent a likelihood ratio by a stochastic process $Z^h$ that is
a positive martingale with respect to the baseline model and that satisfies³

$$dZ_t^h = Z_t^h\, h_t\cdot dW_t, \tag{2}$$

or

$$d\log Z_t^h = h_t\cdot dW_t - \frac{1}{2}|h_t|^2\,dt, \tag{3}$$
where $h$ is adapted to the filtration $F = \{F_t : t \ge 0\}$ associated with the Brownian motion
$W$ and satisfies

$$\int_0^t |h_u|^2\,du < \infty \tag{4}$$

with probability one. Imposing the initial condition $Z_0^h = 1$, we express the solution of
stochastic differential equation (2) as:

$$Z_t^h = \exp\left(\int_0^t h_u\cdot dW_u - \frac{1}{2}\int_0^t |h_u|^2\,du\right). \tag{5}$$
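As a quick numerical illustration of (5), the sketch below takes the special case of a constant scalar drift distortion, an assumption made here only for simplicity. In that case $Z_t^h = \exp(hW_t - \frac{1}{2}h^2 t)$ can be sampled exactly. The function name `simulate_Z` and the values `h = 0.3`, `t = 1` are hypothetical. The sample mean of $Z_t^h$ should be close to one, and weighting $W_t$ by $Z_t^h$ should shift its mean to roughly $ht$.

```python
import numpy as np

def simulate_Z(h, t, n_paths, seed=0):
    """Sample Z_t = exp(h W_t - 0.5 h^2 t) for a constant scalar h (illustrative case)."""
    rng = np.random.default_rng(seed)
    # Under the baseline model, W_t is normal with mean 0 and variance t.
    W_t = rng.normal(0.0, np.sqrt(t), size=n_paths)
    Z_t = np.exp(h * W_t - 0.5 * h**2 * t)
    return Z_t, W_t

h, t = 0.3, 1.0  # hypothetical values
Z, W = simulate_Z(h, t, n_paths=200_000)

mean_Z = Z.mean()               # martingale property with Z_0 = 1: close to 1
tilted_mean_W = (Z * W).mean()  # mean of W_t under the alternative model: close to h * t
print(mean_Z, tilted_mean_W)
```

Because the draws are exact for constant $h$, no time discretization is involved; a state-dependent $h$ would require simulating the path of the state as well.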
We let X denote the stochastic process, Xt the process at date t, and x a realized value of the state.
In earlier papers, we sometimes referred to what we now call the baseline model as the decision maker's
approximating model or benchmark model.
3 James (1992), Chen and Epstein (2002), and Hansen et al. (2006) used this representation.
for any $t \ge 0$ and any bounded $F_t$-measurable random variable $B_t$. Similarly, we write

$$E^g B_0 = E\left[g(X_0)B_0\right]$$

for any bounded random variable $B_0$ in the date zero information set $F_0$.
Here the positive random variable $Z_t^h$ acts as a Radon-Nikodym derivative for the
date $t$ conditional expectation operator $E^h[\,\cdot\,|X_0]$. The martingale property of the process
$Z^h$ ensures that the conditional expectations operators for different $t$'s are compatible in
the sense that they satisfy a Law of Iterated Expectations. The random variable $g(X_0)$
acts as a Radon-Nikodym derivative for the date zero unconditional distribution vis-à-vis
a baseline probability distribution $\hat Q$ over the date zero state vector $X_0$.
While under the baseline model $W$ is a standard Brownian motion, under the alternative
$h$ model distribution this process has increments

$$dW_t = h_t\,dt + dW_t^h, \tag{6}$$

where $W^h$ is a standard Brownian motion. While (3) expresses the evolution of $\log Z^h$ in
(7)
2.2
Discounted relative entropy quantifies how a $(g, h)$ pair distorts baseline model probabilities.
We construct discounted relative entropy in two steps. First, we condition on X0 = 0 and
focus solely on $h$; second, we focus on misspecifications of $\hat Q$.
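i) Our first step is to compute

$$
\begin{aligned}
\Delta(Z^h; x) &= \delta\int_0^\infty \exp(-\delta t)\,E\left[Z_t^h \log Z_t^h \,\middle|\, X_0 = x\right]dt \\
&= \frac{1}{2}\int_0^\infty \exp(-\delta t)\,E\left[Z_t^h |h_t|^2 \,\middle|\, X_0 = x\right]dt
\end{aligned}
\tag{8}
$$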
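The equality of the two lines of (8) can be checked numerically in a simple special case. The sketch below assumes a constant drift distortion, so that $E[Z_t^h \log Z_t^h] = \frac{1}{2}|h|^2 t$ and $E[Z_t^h|h_t|^2] = |h|^2$, and it assumes that the discount rate (called `delta` here) multiplies the first integral. The numerical values and function names are hypothetical. Both lines should then equal $|h|^2/(2\delta)$.

```python
import numpy as np

def trapezoid(y, x):
    """Plain trapezoid rule, written out to avoid depending on a particular NumPy version."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def entropy_first_line(h_sq, delta, t_max=2000.0, n=400_001):
    """delta * int exp(-delta t) E[Z_t log Z_t] dt with E[Z_t log Z_t] = 0.5 |h|^2 t."""
    t = np.linspace(0.0, t_max, n)
    return trapezoid(delta * np.exp(-delta * t) * 0.5 * h_sq * t, t)

def entropy_second_line(h_sq, delta, t_max=2000.0, n=400_001):
    """0.5 * int exp(-delta t) E[Z_t |h|^2] dt with E[Z_t] = 1."""
    t = np.linspace(0.0, t_max, n)
    return trapezoid(0.5 * np.exp(-delta * t) * h_sq, t)

h_sq, delta = 0.05, 0.01            # hypothetical |h|^2 and discount rate
first = entropy_first_line(h_sq, delta)
second = entropy_second_line(h_sq, delta)
closed_form = h_sq / (2.0 * delta)  # value implied by either line for constant h
print(first, second, closed_form)
```

The integrals are truncated at `t_max`, which is harmless here because the discount factor makes the tail negligible.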
stationary probability distribution for $X$ under the baseline model and that $g$ is the
density used to alter $\hat Q$. We average over the initial state via
h
(g; Z )
b
(Z ; x)g(x)Q(dx)
+
h
b
g(x) log g(x)dQ
(9)
The growth rate includes a multiplication by 100 that offsets one of the .01s.
Hansen et al. (2006) used the representation of discounted relative entropy that appears on the right
side of the first line of (8).
Two convex sets that surround the baseline model are designed to include parametric
probability models that a decision maker cares about. One set can readily be used for
robust control problems, but the other cannot. Nevertheless, the second set is useful
because it generalizes Chernoff (1952) entropy to a Markov environment and thereby has
an explicit statistical interpretation. We are interested in how these two convex sets are
related.
3.1
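The following parametric model nests baseline model (1) within a bigger class:

$$
\begin{aligned}
d\log C_t &= .01\,(\alpha + X_t)\,dt + .01\,\sigma\cdot dW_t^h \\
dX_t &= \phi\,dt - \kappa X_t\,dt + \bar\sigma\cdot dW_t^h,
\end{aligned}
\tag{10}
$$

where $W^h$ is a Brownian motion and (6) continues to describe the relationship between the
processes $W$ and $W^h$. Here $(\hat\alpha, \hat\phi, \hat\kappa)$ are parameters of the baseline model (1), $(\alpha, \phi, \kappa)$ are
parameters of model (10), and $(\sigma, \bar\sigma)$ are parameters common to both models. We want to
use drift distortions $h$ for $W$ to represent models in a parametric class defined by (10). We
can express model (10) in terms of our section 2.1 structure by setting

$$h_t = \eta(X_t) \equiv \eta_0 + \eta_1 X_t$$

and using (1), (6), and (10) to deduce the following restrictions on $\eta_0$ and $\eta_1$:

$$
\begin{bmatrix} \sigma' \\ \bar\sigma' \end{bmatrix}\eta_0 = \begin{bmatrix} \alpha - \hat\alpha \\ \phi - \hat\phi \end{bmatrix},
\qquad
\begin{bmatrix} \sigma' \\ \bar\sigma' \end{bmatrix}\eta_1 = \begin{bmatrix} 0 \\ \hat\kappa - \kappa \end{bmatrix}.
\tag{11}
$$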
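The restrictions in (11) are two linear systems, one for the constant part of the drift distortion and one for the part that loads on the state. The sketch below solves them under stated assumptions: the symbol names (`alpha`, `phi`, `kappa` for the constant in the consumption equation, the constant in the state equation, and the mean-reversion rate) are labels chosen here, the baseline values and the two exposure vectors are the estimates the paper reports in (33), and the alternative parameter values are taken from one row of the paper's entropy-ball table purely as an example.

```python
import numpy as np

# Baseline estimates reported in the paper; variable names are assumptions of this sketch.
alpha_hat, phi_hat, kappa_hat = 0.465, 0.0, 0.185
sigma = np.array([0.468, 0.0])      # shock exposure of log consumption
sigma_bar = np.array([0.0, 0.149])  # shock exposure of the growth state

def drift_distortion(alpha, phi, kappa):
    """Solve for (eta0, eta1) in h_t = eta0 + eta1 * X_t given alternative parameters."""
    S = np.vstack([sigma, sigma_bar])  # rows are sigma' and sigma_bar'
    eta0 = np.linalg.solve(S, np.array([alpha - alpha_hat, phi - phi_hat]))
    eta1 = np.linalg.solve(S, np.array([0.0, kappa_hat - kappa]))
    return eta0, eta1

# Example alternative model: lower constants, unchanged persistence.
eta0, eta1 = drift_distortion(alpha=0.3982, phi=-0.0362, kappa=0.185)
print(eta0, eta1)
```

With persistence unchanged, `eta1` is zero and the distortion is a constant vector; changing `kappa` makes the second entry of `eta1` nonzero, so the distortion varies with the state.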
ary distribution implied by the baseline model, in which case we can construct $g$ so that
$g(x)\,d\hat Q(x)$
is the stationary distribution under the alternative model implied by $(\eta_0, \eta_1)$.
(12)
)2 x2 .
(x)
0 + 21x + 2 x2 =
||2
This choice of makes both =
and = 2
quadratic function .
Definition 3.2.
n
o
t)
Z o = Z h Z + : |ht |2 (X
Next we construct a larger set of martingales that contains Z o but allows departures
from the parametric structure (10).
Definition 3.3.
n
o
h
2
e
Z = Z Z : |ht | (Xt )
(13)
3.2
3.3
We construct our first convex set of martingales Z h by starting with a drift distortion h
that represents a particular alternative parametric model created along lines described in
section 3.1. We use the following functional of a process Z h :
Z
2
Z ; |h| , x =
exp(u)E Zuh log Zuh du|X0 = x
h
i
u | du|X0 = x
exp(u)E Zuh | h
Z0
h
i
1
h
2
2
exp(u)E
|hu | |hu | |X0 = x .
=
2 0
(14)
Let
2 = (x)
|h|
and introduce a positive number .
Definition 3.4.
Z h
Z
i
h
h
b
b
Z = g(X0 )Z Z : Z ; (X), x g(x)Q(dx) + g(x) log g(x)Q(x) 0 .
(15)
Z includes martingales in Ze that are associated with the parametric probability models
of Section 3.1. In light of feature i), the set Z is convex in $g(X_0)Z^h$ and necessarily
contains $Z^h$ and $Z^h = 1$. Feature ii) makes it tractable to use Z to pose a recursive robust
decision problem. Feature iii) provides a convenient way to use {} to include parametric
3.4
Zb
entropy's connection to a specific statistical decision problem makes it attractive, it has the
disadvantage that it is less tractable than relative entropy for the types of robust decision
Consider a statistical model selection rule based on a data history of length t that takes the
form log Zth log , where Zth is the likelihood ratio associated with the alternative model
for a sample size t. To construct a bound on the probability that this model selection rule
incorrectly chooses the alternative model when the baseline model governs the data, we use
an argument from large deviations theory that starts from the inequality
$$\mathbf{1}\{\log Z_t^h \ge \tau\} = \mathbf{1}\{-r\tau + r\log Z_t^h \ge 0\} = \mathbf{1}\{\exp(-r\tau)(Z_t^h)^r \ge 1\} \le \exp(-r\tau)(Z_t^h)^r,$$

which holds for $0 \le r \le 1$. The expectation of the term on the left side equals the
probability of mistakenly selecting the alternative model when the data are a sample of
size t generated by the baseline model. We bound this mistake probability for large t by
following Donsker and Varadhan (1976) and Newman and Stuck (1979) and studying
$$\limsup_{t\to\infty}\frac{1}{t}\log E\left[\exp(-r\tau)\left(Z_t^h\right)^r\right] = \limsup_{t\to\infty}\frac{1}{t}\log E\left[\left(Z_t^h\right)^r\right]$$
for alternative choices of r. The threshold does not affect this limit. Furthermore, the
limit is often independent of the initial state X0 = x. To get the best bound, we compute
$$\inf_{0\le r\le 1}\,\limsup_{t\to\infty}\frac{1}{t}\log E\left[\left(Z_t^h\right)^r\right],$$
a limit that is typically negative because mistake probabilities decay with sample size. A
measure of Chernoff entropy is then
$$\chi(Z^h, x) = -\inf_{0\le r\le 1}\,\limsup_{t\to\infty}\frac{1}{t}\log E\left[\left(Z_t^h\right)^r\right]. \tag{16}$$
mistake probability2
exp(T2 )
.
=
mistake probability1
exp(T1 )
10
log .5
.
(17)
The preceding back-of-the-envelope calculation justifies the detection error bound computed by Anderson et al. (2003). The bound on the decay rate should be interpreted
cautiously because, while it is constant, the actual decay rate is not. Furthermore, the pairwise comparison oversimplifies the challenge truly facing a robust decision maker, which is
statistically to discriminate among multiple models.
We could conduct a symmetrical calculation that reverses the roles of the two models,
so that the h model with martingale Z h becomes the model on which we condition. It is
straightforward to show that the limiting rate remains the same. Thus, when we select a
model by comparing a log likelihood ratio to a constant threshold, the two types of mistakes
share the same asymptotic decay rate.
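For intuition about the decay rate and its half-life, consider the special case of a constant drift distortion, which is an assumption of this sketch rather than the paper's calibrated case. Then $E[(Z_t^h)^r] = \exp(-\frac{1}{2}r(1-r)|h|^2 t)$, the infimum over $r$ is attained at $r = 1/2$, and the Chernoff rate is $|h|^2/8$. The half-life is taken here to be the increase in sample size that halves the bound on the mistake probability, that is, $\log 2$ divided by the rate. Function names and the example vector are hypothetical.

```python
import numpy as np

def chernoff_rate_constant_h(h, n_grid=1001):
    """Chernoff rate for a constant drift distortion h, minimizing over r on a grid in [0, 1]."""
    h_sq = float(np.dot(h, h))
    r = np.linspace(0.0, 1.0, n_grid)
    # (1/t) log E[(Z_t)^r] for constant h; it is nonpositive on [0, 1].
    log_moment_rate = -0.5 * r * (1.0 - r) * h_sq
    return -float(log_moment_rate.min())

def half_life(rate):
    """Sample-size increase that halves exp(-rate * T)."""
    return np.log(2.0) / rate

h = np.array([-0.1, -0.2])  # hypothetical constant distortion
rate = chernoff_rate_constant_h(h)
print(rate, half_life(rate))
```

A larger distortion is easier to detect, so the rate rises and the half-life falls. With a state-dependent distortion the moment $E[(Z_t^h)^r]$ no longer has this simple form, and the limit has to be computed from the dynamics of the state.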
Our second convex set is a ball formed using Chernoff entropy (16).
Definition 3.5.
Zb = Z h Z : (Z h ; x) .
(18)
Calibrating and
In subsections 4.1 and 4.2, we formulate a robust planning problem for an economy with
a representative consumer having an instantaneous utility function that is logarithmic in
consumption. Associated with the worst-case probability from the robust planning problem
is a greatest lower bound of expected discounted utility over the family Z of alternative
probability distributions. In subsections 4.3, 4.4, and 4.5, we represent the worst-case
probability as a drift distortion to the multivariate Brownian motion in the baseline model
(1). We then use that drift distortion to guide the calibration of the parameters and
that pin down the size of the set Z . In section 5, we show how that same worst-case drift
distortion appears in a recursive representation of competitive equilibrium prices for an
economy with a representative investor. We deduce uncertainty prices and connect them
to the worst-case drift distortion from our robust planning problem.
4.1
(x)
where for the moment is an arbitrary parameter and is pre-specified. Eventually, we will
allow to be specified a priori up to a scale determined by a scalar that we'll calibrate
by imposing a model detection probability half-life defined in terms of Chernoff entropy.
Let be a multiplier on the constraint:
Z
Z
h
i
h
b
b
Z ; (X), x g(x)Q(dx) + g(x) log g(x)Q(x)
0
(19)
4.2
(20)
1
2 (, )x2 + 21 (, )x + 0 (, ) ,
2
1
h (x, , ) = [.01 2 (, )x 1 (, )] .
4.3
(21)
Determining
To set , we must decide how to weight the initial state. Previous research by Petersen et al.
(2000) and Hansen et al. (2006) set $\hat Q$ to be a mass point over a single value of x and thereby
b
(x, , ) g(x)Q(dx)
+
Z
b
log g(x)g(x)Q(dx)
.
(22)
1
g(x, , ) exp (x, , )
(23)
1
2
2 (, )
2
||
"
#
1 2
1
(, ) =
+ 1 (, ) .
||2
Z
1
b
exp (x, , ) Q(dx) .
Specifying
4.4
, ), ] log g[x, (
, ), ]Q(dx)
b
g[x, (
= .
(24)
(g; Z )
where
b
g(x)(Z ; x)Q(dx)
+
h
1
(Z ; x) =
2
h
b
g(x) log g(x)Q(dx),
exp(u)E h |hu |2 |X0 = x .
The relative entropy measure includes an adjustment for distorting the initial distribution.
By imposing (24), we have set exactly to offset the term $\int g(x)\log g(x)\,\hat Q(dx)$ when
evaluating the constraint (19) at the minimizing choice of g and Z.
4.5
We refine a suggestion of Anderson et al. (1998). Compute h [x, (), ] and evaluate the
associated Chernoff entropy. Then adjust to match a target half-life.8 A larger value of
should lead to a smaller half-life. Call the resulting , and let
g (x) = g [x, ( ), ]
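The calibration step just described, adjusting a scalar until the implied Chernoff half-life hits a target, is a one-dimensional root-finding problem. The toy sketch below illustrates only that search. It does not solve the planner's problem: it assumes, hypothetically, that the worst-case distortion is a constant vector equal to a scale `s` times a fixed direction, so that the half-life is $8\log 2/(s^2|d|^2)$ and falls as `s` grows. All names and numbers are illustrative.

```python
import numpy as np

def half_life_for_scale(s, direction):
    """Half-life implied by the constant distortion h = s * direction (toy mapping)."""
    h_sq = (s ** 2) * float(np.dot(direction, direction))
    return np.log(2.0) / (h_sq / 8.0)

def calibrate_scale(target_half_life, direction, lo=1e-6, hi=10.0, tol=1e-12):
    """Bisection on s; relies on the half-life being decreasing in s."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if half_life_for_scale(mid, direction) > target_half_life:
            lo = mid  # distortion too small: half-life too long
        else:
            hi = mid  # distortion too large: half-life too short
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

direction = np.array([1.0, 0.0])  # hypothetical unit direction
s_star = calibrate_scale(80.0, direction)
print(s_star, half_life_for_scale(s_star, direction))
```

In the paper's setting the inner function would instead solve for the worst-case distortion at each trial value and evaluate its Chernoff entropy; the outer bisection would have the same shape.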
5.1
(25)
where At = a is a vector of chosen risk exposures, (x) is the instantaneous risk free rate
expressed as a percent, and (x) is the vector of risk prices evaluated at state Xt = x.
Initial wealth is K0 . The investor has discounted logarithmic preferences but distrusts his
probability model.
Key inputs to a representative consumer's robust portfolio problem are the baseline
model (1), the wealth evolution equation (25), the vector of risk prices (x), and the
quadratic function in (12) that defines the alternative explicit models that concern the
representative consumer. As in the robust planner's problem analyzed in section 4.1, let
be a penalty parameter or Lagrange multiplier on the constraint (19). For the recursive
competitive equilibrium, we take (, ) as given. We described how we calibrate these
parameters in section 4.
Under the guess that the value function takes the form (x, , ) + log k + log , the
HJB equation for the robust portfolio allocation problem is
0 = max min (x, , ) log k log + log c
a,c
c
+ (.01)(x)
k
|a|2
+ (.01)(x) a + a h
+
x (x, , ) + h (x, , )
22
|h|
2
||2
2 x + 21 x + 0 .
(x, , ) +
+
2
2
2
(26)
1
= ,
c
k
which implies that c = k, an implication that flows partly from the representative consumer's unitary elasticity of intertemporal substitution. First-order conditions for a and h
are
(.01)(x) + h (x, , ) a (x, , ) = 0
(27a)
(27b)
Here we appeal to arguments like those in Hansen and Sargent (2008, ch. 7) to justify
stacking first-order conditions and not worrying about who goes first in the two-person
zero-sum game.9
5.2
We show here that the drift distortion h that emerges from the robust planner's problem of
subsection 4.1 determines prices that a competitive equilibrium awards for bearing model
uncertainty. To compute a vector (x) of competitive equilibrium risk prices, we
find a robust planner's marginal valuation of exposure to the W shocks. We decompose
that price vector into separate compensations for bearing risk and for accepting model
uncertainty.
Noting from the robust planner's problem that the shock exposure vectors for log K
and log Y must coincide implies
a = (.01).
Thus, from (27a), = , where
(x) = 100h(x, ).
(28)
.0001
= (.01)(
+ x),
2
so that = , where
(x) = 100 + (
+ x) + h (x, )
9
.01
.
2
(29)
If we were to use a timing protocol that allows the maximizing player to take account of the impact
of its decisions on the minimizing agent, we would obtain the same equilibrium decision rules as those
described in the text.
We can use these formulas for equilibrium prices to construct a solution to the HJB equation
of a representative consumer in a competitive equilibrium by letting = and g = (.01).
5.3
[
x +
1 x] e (x)
|a|2
c
+ (.01)(x) + (.01)(x) a + a (0 + 1 x)
k
2
||2 e
+
(x).
2
1
=0
c
k
17
(.01)(x) + 0 + 1 x a = 0,
which lead to decision rules c = k and
a = (.01)(x) + 0 + 1 x.
Because the exposure and drift for log K and log Y should coincide in equilibrium, it follows
that
+ (.01)(x) + .01 (x) + .0001
.0001
= (.01) [
+ x + h (x)] .
2
Thus, the ordinary decision rules that solve the ex post portfolio problem imply the same
equilibrium prices as the robust portfolio problem, so that = and = , as given by
(28) and (29), respectively.
5.4
We now study how competitive equilibrium uncertainty prices vary over an investment horizon by computing a pricing counterpart to an impulse response function. Our continuous-time formulation means that the pertinent shock that occurs during the next instant is
an incremental change that will have incremental effects on prices across all future time
periods. An asset exposed to these shocks earns compensations that depend on the horizon.
Shock-price elasticities are state dependent because they vary with the growth state. In
this section, we compute the elasticities and produce what we regard as a dynamic value
decomposition. We present a quantitative example in section 7.3.10
5.4.1
The equilibrium stochastic discount factor process for our robust representative consumer
economy is

$$d\log S_t = -\delta\,dt - .01\,(\hat\alpha + X_t)\,dt - .01\,\sigma\cdot dW_t + h_t\cdot dW_t - \frac{1}{2}|h_t|^2\,dt. \tag{30}$$
The stochastic discount factor has a linear local mean and a quadratic local variance.
The exponential-quadratic formulation has been used extensively in empirical asset pricing
applications. Duffie and Kan (1994) described term structures of interest rates implied by
models with affine stochastic discount factors. Ang and Piazzesi (2003) estimated a term
structure model with an affine stochastic discount factor process driven by macroeconomic
variables.
The entries of the vector given by (28), evaluated at $X_t$, which equal minus the local exposures to
the Brownian shocks, are usually interpreted as local risk prices, but we shall reinterpret
them. Motivated by the decomposition

$$\text{minus stochastic discount factor exposure} = \underbrace{.01\,\sigma}_{\text{risk price}}\;\underbrace{-\,h_t}_{\text{uncertainty price}},$$

we prefer to think of $.01\,\sigma$ as risk prices induced by the curvature of log utility and $-h_t$
as uncertainty prices induced by a representative investor's doubts about the baseline
model. Here $h_t = \eta_0 + \eta_1 X_t$, as described in equation (21). When $\eta_1 = 0$, $h_t$ is constant;
but when $\eta_1$ differs from zero, the uncertainty prices $-h_t$ are time varying and
depend linearly on the growth state $X_t$. When the dependence of h on x is positive, these
uncertainty prices are higher in bad times than in good times. Countercyclical uncertainty
prices emerge endogenously from a baseline model that excludes stochastic volatility in the
underlying consumption risk as an exogenous input. Stochastic volatility models introduce new risks to be priced while
also inducing fluctuations in the prices of the original risks. The mechanism in this
paper simultaneously enhances and induces fluctuations in the uncertainty prices, but it
introduces no new sources of risk. Instead, it focuses on the impact of uncertainty about
the implications of those risks.
Following Borovicka et al. (2011), we assign horizon-dependent uncertainty prices to risk
exposures. To represent shock price elasticities, we study the dependence of logarithms of
expected returns on an investment horizon. The logarithm of the expected return from a
consumption payoff at date t is
$$\log E\left[\frac{C_t}{C_0}\,\middle|\,X_0 = x\right] - \log E\left[S_t\frac{C_t}{C_0}\,\middle|\,X_0 = x\right]. \tag{31}$$
The first term captures the expected payoff and the second the cost of the payoff. A shock
in the next instant affects the consumption and the stochastic discount factor processes.
In continuous time, this leads formally to what is called a Malliavin derivative. There is
one such derivative for each Brownian increment. The date zero shock influences both the
expected asset payoff at date t (aggregate consumption in this case) and the cost of an asset
with this payoff. Its impact on the logarithm of the expected return is the price elasticity
and its impact on the logarithm of the expected payoff is the exposure elasticity.
Consider initially the expected payoff term on the left side of (31). Let D0 Ct denote
the derivative vector of Ct with respect to dW0 . The familiar formula for a derivative of a
logarithm applies so that
D0 Ct = Ct D0 log Ct .
A contribution to the elasticity vector for horizon t is

$$\frac{E\left[\frac{C_t}{C_0}\,D_0\log C_t\,\middle|\,X_0 = x\right]}{E\left[\frac{C_t}{C_0}\,\middle|\,X_0 = x\right]}.$$
There is a distinct elasticity for each shock. Since log C has linear dynamics, D0 log Ct can
be shown to be the same as the vector of impulse responses of log Ct to shocks at date zero,
which does not depend on the Markov state. We call this an exposure elasticity.
Consider next the cost term on the right-hand side of (31). For the product M = SC, a
calculation analogous to the preceding one confirms that the contribution of the cost term
to the elasticity is

$$\frac{E\left[\frac{M_t}{M_0}\,D_0\log M_t\,\middle|\,X_0 = x\right]}{E\left[\frac{M_t}{M_0}\,\middle|\,X_0 = x\right]}.$$
The dynamic evolution of the stochastic discount factor is not linear in the state variable,
and as a result
D0 log Mt = D0 log St + D0 log Ct
is no longer a deterministic function of time.
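The shock price elasticity combines these calculations:

$$
\frac{E\left[\frac{C_t}{C_0}\,D_0\log C_t\,\middle|\,X_0 = x\right]}{E\left[\frac{C_t}{C_0}\,\middle|\,X_0 = x\right]}
- \frac{E\left[\frac{M_t}{M_0}\,D_0\log M_t\,\middle|\,X_0 = x\right]}{E\left[\frac{M_t}{M_0}\,\middle|\,X_0 = x\right]}
= -\frac{E\left[\frac{M_t}{M_0}\,D_0\log S_t\,\middle|\,X_0 = x\right]}{E\left[\frac{M_t}{M_0}\,\middle|\,X_0 = x\right]}.
\tag{32}
$$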
This is a valuation analogue of the impulse response function routinely estimated by em-
A quantitative example
For a laboratory, we use our baseline model (1) evaluated at the following maximum likelihood estimates computed by Hansen and Sargent (2010):¹¹

$$
\hat\alpha = .465, \qquad \hat\phi = 0, \qquad \hat\kappa = .185, \qquad
\sigma = \begin{bmatrix} .468 \\ 0 \end{bmatrix}, \qquad
\bar\sigma = \begin{bmatrix} 0 \\ .149 \end{bmatrix}.
\tag{33}
$$

11 The estimates are for Y being consumption of nondurables and services for aggregate U.S. data over
the period 1948II to 2009IV.
constant terms.
Consistent with findings of Anderson et al. (2003) and Hansen and Sargent (2010), when
= 1, there is no change in the persistence parameter . The worst-case model targets
(x)
the constant terms in the consumption evolution and the state evolution. The worst-case
analysis reduces to a determination of how much to distort the respective constant terms
only.
Half-Life
+ (/)
0.4650
0.1850
0.4650
120
0.2562
80
0.2024
40
0.0579
Half-Life
+ (/)
0.4650
0.1850
0.4650
120
0.2648
80
0.2198
40
0.1182
"
# !" #
0 1
t
= + exp(t).
0
Integrating this growth rate over an interval [0, t] gives a worst-case trend for log consumption:

$$\left(\alpha + \frac{\phi}{\kappa}\right)t + \frac{\phi}{\kappa^2}\exp(-\kappa t) - \frac{\phi}{\kappa^2}. \tag{34}$$

Notice that the initial growth rate is $\alpha$ and that the eventual growth rate is
$\alpha + \phi/\kappa$. In this calculation, we impose the distorted model starting at date zero and consider
its implications going forward. The shift in the constant term for the evolution of X has
no immediate impact on the growth of log C. Its eventual impact is determined in part by
the persistence parameter $\kappa$. We applied formula (34) to alternative models including the
worst-case models to compute the long-run drifts reported in Figures 1 and 2.
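The trend formula can be sketched in a few lines. The code below assumes the state starts at zero, so that the expected growth rate at horizon $t$ is the constant in the consumption equation plus the mean of the state, and it uses the labels `alpha`, `phi`, `kappa` for the two constant terms and the mean-reversion rate (names chosen for this sketch). The example parameter values come from one row of the paper's entropy-ball table.

```python
import numpy as np

def growth_rate(t, alpha, phi, kappa):
    """Expected growth rate (in percent) at horizon t when X_0 = 0."""
    return alpha + (phi / kappa) * (1.0 - np.exp(-kappa * t))

def trend(t, alpha, phi, kappa):
    """Integral of growth_rate over [0, t]: the trend for 100 * log consumption."""
    return (alpha + phi / kappa) * t + (phi / kappa**2) * (np.exp(-kappa * t) - 1.0)

alpha, phi, kappa = 0.3791, -0.0466, 0.1850  # example row from the paper's table

initial_rate = growth_rate(0.0, alpha, phi, kappa)  # equals alpha
long_run_rate = alpha + phi / kappa                 # limit as t grows
print(initial_rate, long_run_rate, trend(40.0, alpha, phi, kappa))
```

The long-run rate computed this way reproduces, up to rounding, the last column of the table row from which the example values are taken.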
Figure 1: Long-run drift of log Ct for the three target half-lives when = x2 .
Figure 2: Long-run drift of log Ct for the three target half-lives when = 1.
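Next we consider the distributional impacts. The new information about log Ct − log C0
(scaled by 100) is:

$$
\begin{aligned}
\int_0^t\!\!\int_0^u \exp[-\kappa(u - r)]\,\bar\sigma\cdot dB_r\,du + \int_0^t \sigma\cdot dB_u
&= \int_0^t\left[\int_r^t \exp[-\kappa(u - r)]\,du\right]\bar\sigma\cdot dB_r + \int_0^t \sigma\cdot dB_u \\
&= \frac{1}{\kappa}\int_0^t \left[1 - \exp[-\kappa(t - r)]\right]\bar\sigma\cdot dB_r + \int_0^t \sigma\cdot dB_u.
\end{aligned}
$$

The variance is .0001 times the following object

$$\frac{1}{\kappa^2}|\bar\sigma|^2 t + |\sigma|^2 t - \frac{2}{\kappa^3}\left[1 - \exp(-\kappa t)\right]|\bar\sigma|^2 + \frac{1}{2\kappa^3}\left[1 - \exp(-2\kappa t)\right]|\bar\sigma|^2,$$

where we have used the fact that $\sigma\cdot\bar\sigma = 0$. Using this calculation, figures 3 and 4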
display the interdecile ranges of the distribution for consumption growth over alternative
horizons. These figures depict deciles for both the baseline model and the worst-case models
associated with a half-life of 80. The region between the deciles illustrates a component
of risk in the consumption distribution. The variation across the baseline and worst-case
models reflects a broader notion of uncertainty driven by skepticism about the baseline
model. The upper decile of the worst-case model overlaps the lower decile of the baseline
model in both figures. The interdecile range is somewhat larger when is quadratic in x
than when is constant.
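The variance calculation behind the interdecile ranges can be cross-checked by integrating the squared shock loading directly. The sketch below assumes that the loading of scaled log consumption growth on the shock at date $r$ is $\frac{1}{\kappa}[1 - \exp(-\kappa(t-r))]$ times the state exposure plus the direct consumption exposure, and it uses the exposure vectors and baseline persistence that the paper reports in (33). Variable and function names are assumptions of this sketch.

```python
import numpy as np

sigma = np.array([0.468, 0.0])      # direct exposure of log consumption
sigma_bar = np.array([0.0, 0.149])  # exposure of the growth state
kappa = 0.185

def variance_closed_form(t):
    """Closed-form variance of the scaled innovation, using sigma . sigma_bar = 0."""
    s2, sb2 = sigma @ sigma, sigma_bar @ sigma_bar
    return (sb2 * t / kappa**2 + s2 * t
            - 2.0 * sb2 * (1.0 - np.exp(-kappa * t)) / kappa**3
            + sb2 * (1.0 - np.exp(-2.0 * kappa * t)) / (2.0 * kappa**3))

def variance_quadrature(t, n=200_001):
    """Integrate the squared norm of the shock loading over [0, t] by the trapezoid rule."""
    r = np.linspace(0.0, t, n)
    weight = (1.0 - np.exp(-kappa * (t - r))) / kappa
    loading = weight[:, None] * sigma_bar + sigma  # shape (n, 2)
    integrand = np.sum(loading**2, axis=1)
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r)))

t = 40.0
v_closed, v_numeric = variance_closed_form(t), variance_quadrature(t)
# Half-width of a .1 to .9 band for a normal variable with this variance.
half_width = 1.2816 * np.sqrt(v_closed)
print(v_closed, v_numeric, half_width)
```

Because the innovation is Gaussian, the decile band around the mean follows from the standard deviation alone, which is why only the mean differs between bands drawn for models that share the same exposures and persistence.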
Figure 3: Expected values of log Ct scaled by 100 for the baseline model and for a half-life
of 80 when = x2 . The shaded black and red areas show the .1 and .9 interdecile ranges
under the baseline model and the worst-case model for a half-life of 80. The black line is
the mean growth for the baseline model, and the red circle line is the mean growth for the
worst-case model.
Figure 4: Expected values of log Ct scaled by 100 for the baseline model and for a half-life
of 80 when = 1. The shaded black and red areas show the .1 and .9 interdecile ranges
under the baseline and the worst-case model for a half-life of 80. The black line is the
mean growth for the baseline model, and the red circle line is the mean growth for the
worst-case model.
h
h
b
Z = Z : Z ; , x g (x)Q(dx)
0
7.1
Entropy ball
Anderson et al. (2003) and Hansen and Sargent (2010) focused primarily on entropy balls.
Here we are interested in constructing a new set Z that we define as the smallest entropy
ball that contains Z . An entropy ball is a family of Z h s that satisfy12
h
(Z ) =
Z Z
exp(t)E
Zth
1
b
= x g (x)Q(dx)
2
(35)
for some constant > 0. By constructing an entropy ball that contains Z , we compute
how large relative entropy can be for martingales in the set Z .
To determine this magnitude, we take the quadratic function (x) and pose a maximum
problem that starts from the observation that a martingale Z h that is biggest in terms of
its relative entropy satisfies the constraint in definition 3.4 at equality:
Z
h
i
t ), x q (x)Q(dx)
b
Z h ; (X
= 0.
t )dt|X0
exp(t)Zth (X
b
= x g (x)Q(dx)
x g (x)Q(dx)
b
Z h ; ,
0.
(36)
We use the term ball loosely because typically a ball in mathematics is defined using a metric. Although
relative entropy quantifies statistical discrepancy, it is not a metric because it depends on which of two
models is taken as the baseline model.
Let
.
Z=
By construction Z Z.
7.2
Z :
1
b
Z ; x g (x)Q(dx)
2
h
Comparing sets
We compare intersections of $Z^o$ with each of the three sets Z , Z, and $\hat Z$. While it is
tractable to use the sets Z and Z to formulate robust decision problems, these sets are
not directly linked to statistical discrimination problems. The set $\hat Z$ is closely linked to
statistical discrimination, but for forming robust decision problems it is not as tractable
as the other two. It would be comforting if Z closely approximated $\hat Z$, at least in
regions near the worst-case model that emerges from the robust planner's HJB equation
(20).
We compute and report the projection Zb Z o of the Chernoff ball on Z o for three
half-lives in figure 6. We represent these projections using the three parameter values that
characterize Z o . For comparison, we also report (Z Z o ). The sets are distinct, but
the big differences occur for larger values of at which the Chernoff ball contains points
not included in Z . Such large values of turn out not to be the ones that the robust
planner most fears. Overall, the regions are closer for longer specifications of the half-life
of Chernoff entropy.
Figure 5: Projections of Sets I and II onto three-parameter axis. From Top to Bottom:
Target half-lives 120, 80, and 40, respectively. Left: = x2 . Right: = 1. (Z Z o ) shown
in blue mesh. (Zb Z o ) shown in yellow. The solution to the robust planner's problem is
shown as the red point.
Half-Life   α        φ         κ        α + (φ/κ)
∞           0.4650   0         0.1850   0.4650
69.78       0.3982   -0.0362   0.1850   0.2024
42.19       0.3791   -0.0466   0.1850   0.1273
16.65       0.3282   -0.0742   0.1850   -0.0726
would have to be to contain the set used in the robust planning problem affiliated with
HJB equation (20). The right side compares the Chernoff ball to this entropy ball. As is
evident from this figure, the resulting entropy ball is much larger. When we solve the robust
planner's problem with this constructed ball, the implied half-lives, as reported
in Table 7.2, decline from 120, 80, and 40 to 70, 42, and 17, respectively. The constant terms
for both the consumption equation (the first equation of (10)) and the consumption growth
equation (the second equation of (10)) are reduced while the autoregressive parameter is
not altered in comparison to Table 6.
Figure 6: Comparing entropy balls to sets I and II when = x2 . From Top to bottom:
half-lives 120, 80, and 40, respectively. (Z Z o ) shown in blue, (Z Z o ) shown in black
mesh. (Zb Z o ) shown in yellow. The solution to the robust planner's problem is the red
point.
7.3
Figure 7: Shock-price elasticity to a shock to X for the three target half-lives when = x2 .
From Top to Bottom: the half-lives 120, 80, and 40, respectively. The shaded regions show
interquartile ranges of the shock-price elasticities under the stationary distribution for X.
The uncertainty price elasticities depend on the initial state x. In figure 7, we display
the shock elasticities evaluated at the median and the two quartiles of the stationary
distribution for X, and we shade the interquartile ranges. Figure 7 shows shock-price elasticity
trajectories for the growth rate shock. They are nearly constant across horizons. Increasing
the concern for robustness, as reflected by the sizes of the associated Chernoff half-lives,
makes the elasticities larger and increases their variation across horizons.
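
The evaluation points used in figure 7 can be sketched under a simplifying assumption: if X follows a scalar Ornstein-Uhlenbeck process with mean reversion `kappa` and shock loading of norm `sigma_norm`, its stationary distribution is normal with variance `sigma_norm**2 / (2 * kappa)`. The value 0.185 below is the autoregressive parameter from the table; the value of `sigma_norm` is purely hypothetical, and the function name is ours.

```python
import math
from statistics import NormalDist


def stationary_quartiles(kappa, sigma_norm, mean=0.0):
    """Quartiles of the stationary law of dX = -kappa (X - mean) dt + sigma . dW.

    Assumes a scalar Ornstein-Uhlenbeck state, so the stationary law is
    normal with standard deviation sigma_norm / sqrt(2 kappa).
    """
    if kappa <= 0:
        raise ValueError("kappa must be positive for a stationary distribution")
    sd = sigma_norm / math.sqrt(2.0 * kappa)
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.inv_cdf(0.25), dist.inv_cdf(0.50), dist.inv_cdf(0.75)


# kappa taken from the table; sigma_norm is a hypothetical placeholder.
q25, q50, q75 = stationary_quartiles(kappa=0.185, sigma_norm=0.3)
print(q25, q50, q75)
```

Elasticities would then be evaluated at `q25`, `q50`, and `q75`, with the band between the first and third values shaded.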
Concluding remarks
We have applied our proposal for constructing a set of models surrounding a decision
maker's baseline probability model to an asset pricing model in which a representative
consumer's responses to model uncertainty make so-called prices of risk countercyclical.
We say "so-called" because they are actually compensations for model uncertainty, not risk.
And their countercyclical components are entirely due to fears of model misspecification.
We have produced an affine model (30) of the log stochastic discount factor whose so-called
risk prices reflect a robust planner's worst-case drift distortions $h_t$. We describe how these
drift distortions should be interpreted as prices of model uncertainty. The dependency of
these uncertainty prices $h_t$ on the growth state x, and thus whether they are procyclical or
specifying a general equilibrium model. A third approach introduces stochastic volatility into the macroeconomy by positing that the volatilities of shocks driving consumption
growth are themselves stochastic processes. A stochastic volatility model induces time
variation in risk prices via exogenous movements in the conditional volatilities impinging
on macroeconomic variables.
What drives countercyclical risk prices in Hansen and Sargent (2010) is a particular
kind of robust model averaging occurring inside the head of the representative consumer.
The consumer carries along two difficult-to-distinguish models of consumption growth, one
asserting i.i.d. log consumption growth, the other asserting that the growth in log
consumption is a process with a slowly moving conditional mean (see footnote 13). The consumer
uses observations on consumption growth to update a Bayesian prior over these two models,
starting from an initial prior probability of 0.5. The prior wanders over the post-WWII sample
for US data, but ends up about where it started. Each period, the Hansen and Sargent
representative consumer expresses his specification distrust by exponentially twisting a
posterior over the two baseline models in a pessimistic direction. That leads the consumer to
interpret good news as temporary and bad news as persistent, causing him to put countercyclical
uncertainty components into equilibrium risk prices.
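
The pessimistic twisting of a two-model posterior can be illustrated with a minimal sketch. It assumes the standard exponential-tilting form, in which twisted probabilities are proportional to the posterior times exp(-V/theta) for model-specific continuation values V and a penalty parameter theta. The function name and all numbers are hypothetical and are not taken from Hansen and Sargent (2010).

```python
import math


def twist_posterior(probs, values, theta):
    """Exponentially tilt model probabilities toward low-value models.

    Twisted weights are proportional to p_i * exp(-V_i / theta).
    Subtracting min(values) only rescales the weights and avoids overflow.
    """
    if theta <= 0:
        raise ValueError("theta must be positive")
    v_min = min(values)
    weights = [p * math.exp(-(v - v_min) / theta) for p, v in zip(probs, values)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical example: equal posterior weights, second model has lower value.
posterior = [0.5, 0.5]
continuation_values = [1.0, 0.0]
twisted = twist_posterior(posterior, continuation_values, theta=1.0)
print(twisted)
```

A smaller `theta` shifts more probability toward the model with the lower continuation value, while a very large `theta` leaves the posterior essentially unchanged.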
In this paper, we propose a different way to induce variation in risk prices. We abstract
from learning and instead consider alternative models with parameters whose future
variations are not discernible from the past. These time-varying parameter models differ
from the decision maker's baseline model, a fixed-parameter model whose parameters can
be well estimated from historical data. We ensure that the class of alternative models
includes ones that allow parameters to deviate persistently from those of the baseline
model in statistically subtle and time-varying ways. In addition to this class of alternative
models, the decision maker also includes other statistical specifications in the set of models
that concern him. The robust planner's worst-case model responds to these forms of model
ambiguity partly by having more persistence than the baseline model. Our approach gains
tractability because the worst-case model turns out to be a time-invariant model in which
projections for long-term growth are more cautious and stochastic growth is more persistent
than in the baseline model. Worst-case shock distributions are shifted in an adverse
fashion and with additional persistence that gives rise to enduring effects on uncertainty
prices. Adverse shifts in the shock distribution that drive up the absolute magnitudes of
uncertainty prices were also present in some of our earlier work (for example, see
Hansen et al. (1999) and Anderson et al. (2003)). In this paper, we induce state dependence
in uncertainty prices in a different way, namely, by specifying the set of alternative models
to capture concerns about the baseline model's specification of persistence in consumption
growth.

13 Bansal and Yaron (2004) and Hansen and Sargent (2010) both start from the observation that two such
models are difficult to distinguish empirically, but they draw different conclusions from that observation.
Bansal and Yaron use the observation to justify a representative consumer who with complete confidence
embraces one of the models (the long-run risk model with persistent log consumption growth), while Hansen
and Sargent use the observation to justify a representative consumer who initially puts prior probability
0.5 on each model and who continues to carry along both models when evaluating prospective outcomes.

Models of robustness and ambiguity aversion bring new parameters. In this paper,
we extend our earlier work in Anderson et al. (2003) on restricting these parameters by
exploiting connections between models of statistical model discrimination and our way of
formulating robustness. We build on mathematical formulations of Newman and Stuck
(1979), Petersen et al. (2000), and Hansen et al. (2006). We pose an ex ante robustness
problem that pins down a robustness penalty parameter by linking it to an asymptotic
measure of statistical discrimination between models. This asymptotic measure allows us to
quantify a half-life for reducing the mistakes in selecting between competing models based
on historical evidence. A large statistical discrimination rate implies a short half-life for
reducing discrimination mistake probabilities. Anderson et al. (2003) and Hansen (2007)
studied the connection between conditional discrimination rates and uncertainty prices
that clear security markets. By following Newman and Stuck (1979) and studying asymptotic
rates, we link statistical discrimination half-lives to calibrated equilibrium uncertainty
prices.
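
The inverse relation between a discrimination rate and a half-life can be made concrete with a small sketch. It assumes that mistake probabilities decay asymptotically like exp(-rate * t), so that the half-life equals log(2) / rate. The function names are ours, and the time unit is whatever unit the rate is expressed in.

```python
import math


def half_life(rate):
    """Time for exp(-rate * t) to fall by half, assuming exponential decay."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return math.log(2.0) / rate


def rate_from_half_life(periods):
    """Inverse map: the decay rate implied by a target half-life."""
    if periods <= 0:
        raise ValueError("half-life must be positive")
    return math.log(2.0) / periods


# Target half-lives of the kind used in the calibration above.
for target in (120, 80, 40):
    print(target, rate_from_half_life(target))
```

Doubling the rate halves the half-life, which is the sense in which a large discrimination rate implies a short half-life.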
Z Reconsidered
Let h
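Here $E^h$ denotes an expectation under the $h$ model and $E^{h-\bar{h}}$ denotes an expectation under
the $h - \bar{h}$ model. By using $Z_t^h / Z_t^{h-\bar{h}}$ as a Radon-Nikodym derivative at time $t$, we can
represent the $h$ model in terms of the $h - \bar{h}$ model:
$$
\begin{aligned}
E^h(B_t \mid X_0) &= E\left[ Z_t^h B_t \mid X_0 \right] \\
&= E\left[ Z_t^{h-\bar{h}} \left( \frac{Z_t^h}{Z_t^{h-\bar{h}}} \right) B_t \mid X_0 \right] \\
&= E^{h-\bar{h}}\left[ \left( \frac{Z_t^h}{Z_t^{h-\bar{h}}} \right) B_t \mid X_0 \right].
\end{aligned}
$$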
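Recall that under the $h$ probability distribution, $W^h$ is a multivariate standard Brownian
motion where from (6), $dW_t = h_t\,dt + dW_t^h$. Thus,
$$
\begin{aligned}
d \log Z_t^{h-\bar{h}} &= (h_t - \bar{h}_t) \cdot dW_t - \frac{1}{2} |h_t - \bar{h}_t|^2 dt \\
&= (h_t - \bar{h}_t) \cdot dW_t^h - \frac{1}{2} |h_t - \bar{h}_t|^2 dt + (h_t - \bar{h}_t) \cdot h_t \, dt \\
&= (h_t - \bar{h}_t) \cdot dW_t^h + \frac{1}{2} |h_t|^2 dt - \frac{1}{2} |\bar{h}_t|^2 dt.
\end{aligned}
\tag{37}
$$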
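
The drift algebra behind (37) can be checked numerically. The sketch below compares the drift obtained after substituting $dW_t = h_t\,dt + dW_t^h$ with the simplified drift in the last line of (37), for randomly drawn vectors. The second drift distortion is written `h_bar` in the code, which is an assumed name, and the dimension and sample size are arbitrary.

```python
import random


def dot(a, b):
    """Euclidean inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def drift_before(h, h_bar):
    """Drift after substituting dW = h dt + dW^h: -|h - h_bar|^2 / 2 + (h - h_bar) . h."""
    d = [x - y for x, y in zip(h, h_bar)]
    return -0.5 * dot(d, d) + dot(d, h)


def drift_after(h, h_bar):
    """Simplified drift: |h|^2 / 2 - |h_bar|^2 / 2."""
    return 0.5 * dot(h, h) - 0.5 * dot(h_bar, h_bar)


rng = random.Random(0)
max_gap = 0.0
for _ in range(1000):
    h = [rng.gauss(0.0, 1.0) for _ in range(2)]
    h_bar = [rng.gauss(0.0, 1.0) for _ in range(2)]
    max_gap = max(max_gap, abs(drift_before(h, h_bar) - drift_after(h, h_bar)))

print(max_gap)
```

The two expressions agree up to floating-point error because the cross terms in $|h_t - \bar{h}_t|^2$ cancel against $(h_t - \bar{h}_t) \cdot h_t$.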
Conditioned on date zero information, the discounted relative entropy of the $h$ model
with respect to the $h - \bar{h}$ model is:
Z
h
i
1 h
h
h
hh
2
where we have used integration by parts and the evolution in (37). We are interested in the
discrepancy between: (i) the relative entropy (8) of $h$ with respect to the baseline model,
and (ii) the relative entropy (38) of the $h$ model with respect to the $h - \bar{h}$ model:
Z
h 2
Z ; |h| , x =
exp(t)E Zth log Zth |X0 = x dt
h
i
(39)
For a given multiplier, write the value function that solves HJB equation (36) in the form:
(x, ) =
which gives us
1
2 ()x2 + 2 1 ()x + 0 ()
2
1
[ 2 ()x + 1 ()] .
We can solve for 2 , 1 , and 0 by comparing the coefficients for x2 , x and the constant
h (x, ) =
)2 4 ||2 2 (+1)
+ 2
( + 2
.
2 () =
2
2 ||
1
+ ||2 2 () 1 () = 0.
1 () + ( + 1)1 1 ()
Thus
Finally,
1 () = 2( + 1)
1
2
( + 2
) 4 ||
2 (+1)
1
( + 1)
1
1
1
2
2
2
( + 1)0 + 2 ()|| + 1 () || .
0 () =
for X0 = x.
= 0
"
# " #
0
= 1
for 1 and 0 .
,
and
ii) For a given r, construct 0 , 1 , 2 ,
, ,
from:
(r + r2 )|h(x)|2 = (r + r2 )|0 + 1 x|2 = 0 + 21 x + 2 x2
= (1 r)
+ r
= (1 r) + r
= r
= (1 r)
+ r
iii) Solve
=
1
0 + 21 x + 2 x2 + ( x
)(log e) (x)
2
2 =
q
(
)2 + 2 ||2
||2
Given 2 , 1 solves
2=0
1
1 + 1 2 ||2 +
or
1 =
Finally,
2
2
1
1
q
.
=
2 ||2
(
)2 + 2 ||2
1
1
1
1.
= 0 ||2 2 ||2 (1 )2
2
2
2
D.1
Solving for
Consider
1
0 = min (x) + (.01)(
+ x) + (x)( +
x) + ||2 (x, )
h
2
2 x2 + 21 x + 0
+ (.01) h + (x, ) h + |h|2
2
2
1
2 ()x2 + 21 ()x + 0 ()
2
1
h(x, ) = [.01 2 ()x 1 ()] .
We can solve for 2 , 1 , and 0 by matching the coefficients for x2 , x and the constant
terms, respectively. Solving first for 2 :
q
2
2
) 4 || 2
+ 2
( + 2
2 () =
2
2 ||
=
2 ,
.01 .01 ( ) 2 +
2 + 1
q
1 () =2
+ ( + 2
)2 4 ||2 2
1
1
2
2
1 () + || 2 + |.01 1 ()| + 0
+
0 () = .02
D.2
Solving for
We want to solve
||2
2
2 () ||2
1
max
1 ()2 +
1
log 2
2 () ||2 log (2
) 0 () .
2
2
satisfies
=
2
||2 1,0
+ (2
2 ||2 ) 0,1
2
||2 1,1
+ (2
2 ||2 ) [ log (2
2 ||2 ) + log (2
) + 0,1 + 2]
Here is how to compute Chernoff entropies for parametric models of the form (10). Because
the $h$'s associated with them take the form
$$
h_t = \eta(X_t),
$$
these alternative models are Markovian. This allows us to compute Chernoff entropy by
using an eigenvalue approach of Donsker and Varadhan (1976) and Newman and Stuck
(1979). We start by computing the drift of $(Z_t^h)^r f(X_t)$ for $0 \le r \le 1$ at $t = 0$:
$$
[Gf](x) \doteq \frac{(-r + r^2)}{2} |\eta(x)|^2 f(x) + r f'(x) \, \sigma \cdot \eta(x)
- f'(x) \kappa x + \frac{f''(x)}{2} |\sigma|^2,
$$
where $[Gf](x)$ is the drift given that $X_0 = x$. Next we solve the eigenvalue problem
$$
[G(r)] e(x, r) = -\rho(r) e(x, r),
$$
whose eigenfunction $e(x, r)$ is the exponential of a quadratic function of $x$. We compute
Chernoff entropy numerically by solving:
$$
\chi(Z^h, x) = \max_{r \in [0,1]} \rho(r).
$$
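
A special case gives a useful sanity check on the maximization over r. The sketch below assumes a constant drift distortion h with no state dependence, so that a constant function is an eigenfunction and the eigenvalue reduces to (r - r**2) / 2 times |h|**2. This is a simplification of the Markov case treated in the text, and the function name and the value of h are hypothetical.

```python
import math


def chernoff_constant_drift(h, grid_size=1001):
    """Maximize rho(r) = (r - r^2) / 2 * |h|^2 over an even grid on [0, 1].

    Valid only for a constant drift distortion h (an assumed special case).
    Returns the maximizing r and the maximized rate.
    """
    h_sq = sum(c * c for c in h)
    best_r, best_rho = 0.0, 0.0
    for i in range(grid_size):
        r = i / (grid_size - 1)
        rho = 0.5 * (r - r * r) * h_sq
        if rho > best_rho:
            best_r, best_rho = r, rho
    return best_r, best_rho


# Hypothetical constant distortion.
r_star, rate = chernoff_constant_drift([0.3, 0.4])
print(r_star, rate, math.log(2.0) / rate)
```

In this special case the maximum is attained at r = 1/2 and equals |h|**2 / 8. The last printed number converts the rate into a half-life under the assumption that mistake probabilities decay exponentially at that rate.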
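$$
(\log e)'(x) = \frac{e'(x)}{e(x)}, \qquad
(\log e)''(x) = \frac{e''(x)}{e(x)} - \left[ \frac{e'(x)}{e(x)} \right]^2.
$$
For a positive $f$,
$$
\frac{[Gf](x)}{f(x)} \doteq \frac{(-r + r^2)}{2} |h(x)|^2 + r (\log f)'(x) \, \sigma \cdot h(x)
- (\log f)'(x) \kappa x + \frac{(\log f)''(x)}{2} |\sigma|^2 + \frac{[(\log f)'(x)]^2}{2} |\sigma|^2.
\tag{40}
$$
$$
\mathbb{G}(\log f)(x) \doteq \frac{[Gf](x)}{f(x)}.
$$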
G(log e) (x) =
.01
[1 exp (
t)] u + (.01) u
iii) Compute
where Mt = St
Ct
C0
. Note that
1
d log Mt = dt + ht dWt |ht |2 dt.
2
Let dWt have drift ht and compute expectations conditioned on X0 = x recursively:
ft ) ht h1t dt
d log St1 = .01Xt1 dt + h1t (ht dt + dW
ft
= .01X 1 dt + h1 dW
t
dXt1
Xt1 dt
ht = 0 + 1 Xt
h1t = 1 Xt1 ,
log Ct1 |X0 = x
.01
=
[1 exp(
t)] u + .01 u,
E CC0t |X0 = x
Ct
C0
log Ct1 |X0 = x
E (Mt log St1 |X0 = x)
E [Mt (log St1 + log Ct1 ) |X0 = x]
=
E (Mt |X0 = x)
E (Mt |X0 = x)
E CC0t |X0 = x
Ct
C0
.01
[1 exp(
t)] u + .01 u h (x) u.
References
Anderson, Evan W., Lars Peter Hansen, and Thomas J. Sargent. 1998. Risk and Robustness
in Equilibrium. Available on webpages.
Anderson, Evan W., Lars Peter Hansen, and Thomas J. Sargent. 2003. A Quartet of Semigroups
for Model Specification, Robustness, Prices of Risk, and Model Detection. Journal of the
European Economic Association 1 (1):68-123.
Ang, Andrew and Monika Piazzesi. 2003. A No-Arbitrage Vector Autoregression of the
Term Structure Dynamics with Macroeconomic and Latent Variables. Journal of Monetary
Economics 50:745-787.
Bansal, Ravi and Amir Yaron. 2004. Risks for the Long Run: A Potential Resolution of
Asset Pricing Puzzles. Journal of Finance 59 (4):1481-1509.
Borovicka, Jaroslav, Lars Peter Hansen, Mark Hendricks, and Jose A. Scheinkman. 2011.
Risk-Price Dynamics. Journal of Financial Econometrics 9 (1):3-65.
Campbell, John Y. and John Cochrane. 1999. Force of Habit: A Consumption-Based
Explanation of Aggregate Stock Market Behavior. Journal of Political Economy 107 (2):205-251.
Chen, Zengjing and Larry Epstein. 2002. Ambiguity, Risk, and Asset Returns in Continuous
Time. Econometrica 70:1403-1443.
Chernoff, Herman. 1952. A Measure of Asymptotic Efficiency for Tests of a Hypothesis
Based on the Sum of Observations. Annals of Mathematical Statistics 23 (4):493-507.
Donsker, Monroe E. and S. R. Srinivasa Varadhan. 1976. On the Principal Eigenvalue
of Second-Order Elliptic Differential Equations. Communications on Pure and Applied
Mathematics 29:595-621.
Duffie, Darrell and Rui Kan. 1994. Multi-Factor Term Structure Models. Philosophical
Transactions: Physical Sciences and Engineering 347 (1684):577-586.
Gilboa, Itzhak and David Schmeidler. 1989. Maxmin expected utility with non-unique
prior. Journal of Mathematical Economics 18 (2):141-153.
Hansen, Lars Peter. 2007. Beliefs, Doubts and Learning: Valuing Macroeconomic Risk.
American Economic Review 97 (2):1-30.
Hansen, Lars Peter. 2011. Dynamic Valuation Decomposition within Stochastic Economies.
Econometrica 80 (3):911-967. Fisher-Schultz Lecture at the European Meetings of the
Econometric Society.
Hansen, Lars Peter and Thomas Sargent. 2010. Fragile beliefs and the price of uncertainty.
Quantitative Economics 1 (1):129-162.
Hansen, Lars Peter and Thomas J. Sargent. 2001. Robust Control and Model Uncertainty.
American Economic Review 91 (2):60-66.
Hansen, Lars Peter and Thomas J. Sargent. 2008. Robustness. Princeton, New Jersey:
Princeton University Press.
Hansen, Lars Peter, Thomas J. Sargent, and Thomas D. Tallarini, Jr. 1999. Robust
Permanent Income and Pricing. The Review of Economic Studies 66 (4):873-907.
Hansen, Lars Peter, Thomas J. Sargent, Gauhar A. Turmuhambetova, and Noah Williams.
2006. Robust Control and Model Misspecification. Journal of Economic Theory
128 (1):45-90.
James, Matthew R. 1992. Asymptotic analysis of nonlinear stochastic risk-sensitive control
and differential games. Mathematics of Control, Signals and Systems 5 (4):401-417.
Newman, C. M. and B. W. Stuck. 1979. Chernoff Bounds for Discriminating between Two
Markov Processes. Stochastics 2 (1-4):139-153.
Petersen, I.R., M.R. James, and P. Dupuis. 2000. Minimax optimal control of stochastic
uncertain systems with relative entropy constraints. Automatic Control, IEEE Transactions
on 45 (3):398-412.