
Abstract

A Weak Existence Result with Application to the Financial Engineer’s


Calibration Problem

Gerard Brunick
Advisor: Steven E. Shreve

Given an initial Itô process, Krylov and Gyöngy have shown that it is often
possible to construct a diffusion process with the same one-dimensional
marginal distributions. As the one-dimensional marginal distributions of a
price process under a pricing measure essentially determine the prices of
European options written on that price process, this result has found wide
application in Mathematical Finance. In this dissertation, we extend the
result of Krylov and Gyöngy in two directions: we relax the technical
conditions that must be imposed on the initial Itô process, and we clarify
the relationship between the stochastic differential equation that is solved
by the mimicking process and the properties of the initial process that are
preserved.
A Weak Existence Result with
Application to the Financial
Engineer’s Calibration Problem

Gerard Brunick

Advisor: Steven E. Shreve


Defense Date: July 29, 2008

A dissertation in the Department of Mathematical Sciences submitted in
partial fulfillment of the requirements for the degree of Doctor of
Philosophy at Carnegie Mellon University.
Copyright © 2008 by Gerard Brunick
All rights reserved.
Acknowledgments

I would like to express my gratitude to my advisor, Steven Shreve, for his
guidance and support as I worked on this dissertation. His comments and
insight have been invaluable. I would also like to thank Dmitry Kramkov
and Kasper Larsen for many useful conversations on a wide range of topics.
Finally, I would like to acknowledge Peter Carr, who made me aware of the
previous work of Krylov and Gyöngy, and Silviu Predoiu, who produced a
very nice counterexample that allowed me to abandon a fallacious conjecture.
I would also like to take this opportunity to thank my family for their
love and encouragement during my time at Carnegie Mellon University. In
particular, I would never have had this opportunity without my parents'
constant love, patience, and support. Finally, I would like to express my
gratitude to Jessica, whose love and kindness have been a constant source of
inspiration.
During my time at Carnegie Mellon University I was supported by an
NSF VIGRE fellowship and grant DMS-0404682.

Contents

1 Introduction
  1.1 Introduction
  1.2 Definitions and Notation

2 Statement of Results
  2.1 Updating Functions
  2.2 Main Result
  2.3 Applications to Mixture Models

3 A Cross Product Construction
  3.1 The Binary Construction
  3.2 Properties Preserved by the Binary Construction
  3.3 The General Construction

4 Main Theorem
  4.1 Conditional Expectation Lemmas
  4.2 Approximation Lemmas
  4.3 Main Theorem

A Galmarino's Test
B Metric Space-Valued Random Variables
C FV and AC Processes
D Semimartingale Characteristics
E Rebolledo's Criterion
F Convergence of Characteristics

References
Chapter 1

Introduction

1.1 Introduction
Emanuel Derman [Der01] neatly summarizes the way in which many market
participants make use of financial models as follows:

Trading desks in many product areas at investment banks often have
substantial positions in long-term or exotic over-the-counter derivative
securities that have been designed to satisfy the risk preferences of their
customers . . . Because liquid market prices are unavailable, these positions
are marked and hedged by means of sophisticated and complex financial
models . . . These models derive prices from market parameters (volatilities,
correlations, prepayment rates or default probabilities, for example) that
are forward-looking and should ideally be implied from market prices of
traded securities.

In this application, the market participant identifies a set of primary and
derivative securities which she believes characterize the state of the market.
In particular, she selects a set of securities which are actively traded so that
price information is available, and she then attempts to construct a financial
model in such a way that the prices computed within the model are consistent
with the prices quoted in the market for the securities in this set. The process
of constructing such a model is known as model calibration. Once a model
has been calibrated, the market participant then uses the model to draw
inferences about the market.


One approach to this calibration problem is to suppose that the financial
model takes a particular parametric form. The calibration problem then
reduces to a nonlinear optimization over the parameter space to minimize an
objective function that measures the difference between the prices computed
in the model and the prices quoted in the market. While the absence of
arbitrage does serve to reduce the range of possible market price
configurations, the space of market price configurations is generally of much
higher dimension than the set of parameters for a given model. This raises
the possibility that a particular parametric model simply cannot be calibrated
to market prices. From the market participant's perspective, this is in fact
the main indictment of the Black-Scholes model, which provides a single
volatility parameter for calibration. As the map from parameters to prices is
often rather complicated, it can be difficult to determine, a priori, whether a
particular parametric form will allow for a good fit to a given set of market
prices. Instead, this question is empirical, and a great deal of research has
been devoted to determining which classes of models allow for good fits to
market data.
In this dissertation, we develop a result that provides the market
participant with a tool to approach the calibration problem from the opposite
direction. We impose a specific structure on the derivative securities, and
we then use this structure to draw conclusions about the models that are
consistent with a given set of prices. To illustrate the structure that we
must impose on the derivative securities to apply our result, we introduce an
auxiliary process that we use to record information about the history of the
primary security, and we require that this auxiliary process can be updated
using only the changes in the price of the primary security. This is actually
a technical notion whose definition we postpone until Chapter 2; however,
we do provide the following heuristic version of that definition. Suppose
that, given the value of the auxiliary process today and a full description of
the changes in the price of the primary security over the next week, we can
give a full description of the changes in the auxiliary process over the next
week. Then we say that the auxiliary process may be updated using only the
changes in the price of the primary security.
As an example of an auxiliary process that satisfies this condition, we
might choose to track the current value of the primary security as well as its
running maximum. In this case, the auxiliary process takes values in R^2. We
could also choose to track the current value of the primary security and its
historical average. We could even let the auxiliary process take values in R^4

and track the current value of the primary security, the running maximum,
the running minimum, and the historical average. An example of an auxiliary
process that may not be updated using only the changes in the price of the
primary security would be the current price of the primary security and the
price one week prior. Of course, if we instead decided to track the price of the
primary security over the entire previous week, this auxiliary process could
be updated using only the changes in the price of the primary security.
We now assume that we have fixed some auxiliary process that satisfies the
updating condition sketched above, and that we have prices for a collection of
European-style derivative securities. In this context, European-style means
that the holder of the derivative security receives a single payment at a fixed
maturity and makes no decisions prior to that date. The main result of
this dissertation essentially asserts that when the payoff of each derivative
security can be expressed as a function of the auxiliary process evaluated
at the derivative’s maturity, then it is possible to construct a model with a
price process that satisfies a “simple” stochastic differential equation (SDE)
in such a way that the model prices for all derivative securities agree with
the given prices. Moreover, the structure of this SDE is determined by the
structure of the auxiliary process. As an example of such a situation, we
note that the payoff of most barrier options can be expressed as a function of
the current value, the running maximum, and the running minimum of the
underlying security’s price at the maturity of the option.
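The barrier-option observation above can be made concrete in a few lines. The sketch below is our own illustration (the function and its parameter names are not notation from the text): an up-and-out call payoff is a function of the pair consisting of the terminal price and the running maximum alone.

```python
def up_and_out_call_payoff(s_T, running_max, strike, barrier):
    """Payoff of an up-and-out call, written as a function of the
    terminal price and the running maximum only (names are ours,
    purely for illustration)."""
    if running_max >= barrier:
        return 0.0                      # barrier was breached: knocked out
    return max(s_T - strike, 0.0)       # otherwise a vanilla call payoff

# The whole path enters only through the pair (S_T, max_t S_t):
alive = up_and_out_call_payoff(105.0, 110.0, strike=100.0, barrier=120.0)
dead = up_and_out_call_payoff(105.0, 125.0, strike=100.0, barrier=120.0)
```

A down-and-out or double-barrier payoff would similarly use the running minimum, so an auxiliary process tracking (current value, running maximum, running minimum) covers all three.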
To see how such a result might be useful, consider the simplest case where
the auxiliary process is simply the underlying security's price. In this case,
the payoffs of European puts and calls are functions of the auxiliary process
at each option's maturity. This case corresponds to the notion of "local
volatility models" and has been studied extensively, starting with the work
of Dupire [Dup94] and Derman and Kani [DK94]. (Rubinstein [Rub94] has
also given similar results in a binomial tree model.) The forward equation
of Fokker [Fok13], Planck [Pla17], and Kolmogorov [Kol31] expresses the
relationship between the drift and volatility coefficients of a diffusion process
and the one-dimensional marginal distributions of that process. Breeden and
Litzenberger [BL78] have shown the equivalence between European option
prices and the one-dimensional marginal distributions of the underlying price
process under any pricing measure. Connecting these two results and
observing that the drift of a price process under any pricing measure is
determined by no-arbitrage, Dupire argued that the volatility coefficient in a
diffusion model for a price process may be implied directly from the set of European


option prices. In particular, assuming zero interest rates for simplicity of
exposition, it is possible to choose a (deterministic) diffusion coefficient σ̂ in
such a way that the model prices for European options written on a price
process Ŝ that solves the SDE

(1.1)  dŜ_t = σ̂(Ŝ_t, t) Ŝ_t dW_t

will agree with the market prices. It was clear to Dupire from the start that
(1.1) may not be a very good model for the price process, and he advocated a
hedging strategy that is robust to violations of the dynamics given in (1.1).
Indeed, empirical work such as [DFW98] indicates that σ̂ must often be
modified to refit market prices. As the model assumes that σ̂ should be a
fixed function, this is inconsistent.
Nevertheless, (1.1) is still useful because it can be used to characterize the
models that are consistent with a given set of European option prices. We
will say that a model is an Itô model if the price processes for the primary
securities are modeled as Itô processes. Consider a general Itô model where
the price process solves the SDE

(1.2)  dS_t = σ_t S_t dW_t

for some adapted process σ under some pricing measure. Given such a model,
we could compute the European option prices, take these prices as inputs to
Dupire's approach, and then choose σ̂ such that the prices for European
options written on price processes which solve (1.1) and (1.2) agree. It turns
out that the process σ and the function σ̂ that we imply are related by the
rather intuitive formula

(1.3)  σ̂²(x, t) = E[σ_t² | S_t = x].

Derman and Kani give such a formula in [DK98]. As the local volatility
function σ̂ essentially characterizes the European option prices, (1.3)
essentially characterizes the Itô models that are consistent with a given
collection of European option prices. Gatheral [Gat06] argues that one should
understand local volatilities as an "effective theory," and (1.3) connects a
local volatility model with a stochastic volatility model in a way that is
consistent, at least with respect to pricing European options.
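Relationship (1.3) can be illustrated with a small Monte Carlo sketch. The model below is a toy of our own making, not a construction from the text: each path draws one of two equally likely constant volatility scenarios at time 0 (a crude stand-in for the adapted process σ in (1.2)), an Euler scheme generates the paths, and a naive binning estimator approximates the conditional expectation in (1.3).

```python
import math
import random

def simulate_paths(n_paths=20000, n_steps=50, t_max=1.0, s0=100.0, seed=0):
    """Toy stand-in for the Ito model (1.2): each path draws a constant
    volatility scenario at time 0, then follows an Euler scheme for
    dS = sigma * S * dW.  All numerical choices are illustrative."""
    rng = random.Random(seed)
    dt = t_max / n_steps
    paths = []
    for _ in range(n_paths):
        sigma = rng.choice([0.1, 0.3])      # two volatility scenarios
        s = s0
        for _ in range(n_steps):
            s += sigma * s * rng.gauss(0.0, math.sqrt(dt))
        paths.append((s, sigma))
    return paths

def local_vol_squared(paths, x, width=2.0):
    """Naive binning estimate of (1.3): E[sigma_t^2 | S_t = x]."""
    samples = [sig * sig for s, sig in paths if abs(s - x) < width]
    return sum(samples) / len(samples)

paths = simulate_paths()
v_atm = local_vol_squared(paths, 100.0)    # both scenarios contribute
v_wing = local_vol_squared(paths, 130.0)   # dominated by high-vol paths
```

Near the initial price both scenarios contribute to the conditional average, while far in the wings the conditioning favors the high-volatility scenario, so the implied local variance interpolates between the two scenario variances.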
The relationship given in (1.3) has found a wide range of applications.
Brigo and Mercurio [BM01] [BM02] use (1.3) to produce local volatility models
where the prices for European options are given as simple mixtures of Black-
Scholes prices. To do this, they fix a finite number of deterministic volatility
scenarios and build a stochastic volatility model by randomly choosing a
volatility scenario at the initial time. It is clear that the price for any option
in such a model is given as a mixture of the prices that are computed in
each scenario. They then compute σ̂ explicitly using (1.3) and conclude that
the corresponding local volatility model has European option prices that are
given as mixtures of the prices computed in each scenario. We note that
the prices for non-European options in the initial mixture model and the
mimicking local volatility model may differ. Piterbarg stresses this point
in the working paper [Pit03a] and argues against the use of such mixture
models. We will briefly revisit this point at the end of Chapter 2.
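The mixture construction can be sketched in a few lines. The Black-Scholes pricer below assumes zero interest rates, matching the zero-rate convention used earlier in this chapter; the scenario weights and volatilities are illustrative choices of ours, not values from the cited papers.

```python
import math

def bs_call(s0, k, sigma, t):
    """Black-Scholes call price with zero interest rates."""
    if sigma <= 0.0 or t <= 0.0:
        return max(s0 - k, 0.0)
    N = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    d1 = (math.log(s0 / k) + 0.5 * sigma * sigma * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * N(d1) - k * N(d2)

def mixture_call(s0, k, t, scenarios):
    """European call price when a deterministic volatility scenario is
    drawn once at time 0: a probability-weighted mixture of BS prices."""
    return sum(p * bs_call(s0, k, sig, t) for p, sig in scenarios)

scenarios = [(0.5, 0.1), (0.5, 0.3)]   # illustrative weights and vols
price = mixture_call(100.0, 100.0, 1.0, scenarios)
```

Because the scenario is drawn once at time 0, every European price in this model is the corresponding mixture of Black-Scholes prices; as noted above, this identification does not extend to non-European payoffs.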
Avellaneda et al. [ABOBF02] use (1.3) for pricing European options on
baskets of securities. In this case, the volatility of the basket is given as
the sum of the volatilities of the underlying securities, and ideas from
Varadhan [Var67] are used to compute the most likely configuration of the
basket and approximate σ̂. Combining (1.3) with parameter averaging
techniques, Antonov, Misirpashaev, and Piterbarg [Pit03b] [Pit05] [Pit06]
[AM06] [Pit07] [AMP07] have developed pricing approximations for a range
of markets.
Inspired by the success of the HJM methodology, some authors have
attempted to use σ̂ as the state of a process. Unlike parameterized models,
this approach has the advantage that essentially any set of European option
prices may be matched with an appropriate choice of state. Derman and
Kani [DK98] initiate such an approach for trinomial trees, but comment
that the no-arbitrage drift conditions for such a model in continuous time
are rather involved. More recently, Carmona and Nadtochiy [Car07] [CN07]
have provided a rigorous development of this approach in continuous time.
They use results of Kunita [Kun90] on stochastic flows to ensure that σ̂_t(·, ·),
which is now a random field, remains regular enough that the pricing PDE
may be solved, and they derive the drift restrictions, correcting a mistake
in [DK98]. Formula (1.3) actually provides some clue as to why the drift
restrictions in such a model are so difficult to deal with. To see how a
perturbation to σ̂ affects P[S_t ∈ dx], one must re-solve the forward equation
using the perturbed value of σ̂, so the drift restrictions that enforce (1.3)
are not available in closed form.
Formula (1.3) is also useful because it suggests a way to adjust an initial
model to fit a set of European option prices. In particular, given a model
of the form (1.2), equation (1.3) suggests that we might attempt to choose a
deterministic function f such that the solution to the SDE

(1.4)  dS_t = f(S_t, t) σ_t S_t dW_t

has the required European option prices. In [BJN00], Britten-Jones and
Neuberger develop this approach for a discrete-time model, and Madan, Qian,
and Ren [MQR07] propose a numerical method to compute such an f in a
continuous-time model by solving the associated forward equation.
In this dissertation, we extend the relationship among the formulas (1.1),
(1.2), and (1.3) beyond the diffusion case. In particular, we show that by
conditioning on the value of an auxiliary process in (1.3), we may construct
a path-dependent function σ̂ and then find a process that solves a
generalization of (1.1) and preserves some path-dependent properties of the
initial process given in (1.2). In particular, the one-dimensional marginal
distributions of the auxiliary process are preserved, so the prices for
European-style options with payoffs that can be written as functions of the
auxiliary process at maturity are also preserved. We hope that with such a
result, it will be possible to adapt some of the local volatility-based
approaches mentioned above to handle common path-dependent options.
Moreover, even in the diffusion case, we believe that our approach offers
a technical advantage over previous PDE-based approaches to local volatility
models. Dupire's derivation of the local volatility SDE using the forward
equation is essentially formal; however, in earlier work only recently
rediscovered by the finance community, Krylov [Kry84] and Gyöngy [Gyö86]
develop a result that may be considered a rigorous proof of the existence of
a local volatility model. They have provided the following result.

1.5 Theorem ([Gyö86] Thm. 4.6). Let W be an R^d-valued Wiener process,
and let X be an R^d-valued process that solves the SDE

dX_t = μ_t dt + σ_t dW_t,

where μ_t ∈ R^d and σ_t ∈ R^d⊗R^d are bounded, adapted processes and σ_t σ_t^T
is uniformly positive definite. Then there exist (deterministic) functions
μ̂ : R^d×R_+ → R^d and σ̂ : R^d×R_+ → R^d⊗R^d and a Lebesgue-null set N ⊂ R_+
such that μ̂(X_t, t) = E[μ_t | X_t] a.s. and σ̂²(X_t, t) = E[σ_t σ_t^T | X_t] a.s. when
t ∉ N, and there exists a weak solution to the SDE

(1.6)  dX̂_t = μ̂(X̂_t, t) dt + σ̂(X̂_t, t) dŴ_t

with the same one-dimensional marginal distributions as X, where Ŵ is
another R^d-valued Wiener process.
The requirements on the covariance process σσ^T in this theorem are rather
strong. For instance, in Heston's model [Hes93] the volatility process is
neither bounded, nor uniformly bounded away from zero. As a result, one may
not apply Thm. 1.5 to conclude that there exists a local volatility model
with the same one-dimensional marginal distributions as a given
parameterization of Heston's model. Using the main result of this
dissertation, we may replace the requirements of boundedness and uniform
positive definiteness in Thm. 1.5 with a weaker integrability condition
(compare Thm. 1.5 with Cor. 2.16) that is satisfied in Heston's model. So,
even in the diffusion case, we provide a stronger result by avoiding the use
of PDE-based arguments.

1.2 Definitions and Notation

We have N = {1, 2, . . .} and N̄ ≔ N ∪ {∞}. We also have R ≔ (−∞, ∞),
R_+ ≔ [0, ∞), and R̄_+ ≔ [0, ∞]. We define R̄^d ≔ R^d ∪ {∞} to be the one
point compactification of the locally compact space R^d. We treat R^d as a
Hilbert space with inner product (x, y) ≔ Σ_{1≤i≤d} x^i y^i and ‖x‖ = √(x, x).
We denote the set of n×d matrices by R^n⊗R^d. We treat R^n⊗R^d as a Hilbert
space as well, with inner product (A, B) ≔ Σ_{1≤i≤n} (AB^T)^{ii}, where B^T denotes
the transpose of B (this is the Frobenius or Hilbert-Schmidt norm). If z ∈ R^n
and w ∈ R^d, then z⊗w denotes the matrix in R^n⊗R^d with (z⊗w)^{ij} = z^i w^j and
‖z⊗w‖ = ‖z‖‖w‖. We also identify A ∈ R^n⊗R^d with the linear operator
from R^d to R^n that acts by matrix multiplication. The Frobenius norm
is stronger than the operator norm, so ‖Ax‖ ≤ ‖A‖‖x‖. We denote by
S_+^d ⊂ R^d⊗R^d the set of symmetric nonnegative definite matrices. We let λ
denote Lebesgue's measure on R, and we let λ_B(A) ≔ λ(A ∩ B) denote the
restriction of λ to B. For easy reference, we tag the following remark.
1.7 Remark. We will always use the "extended" version of the integral

∫ f dμ ≔ { ∫ f dμ   if ∫ ‖f‖ dμ < ∞,
         { ∞        otherwise,

so the integral takes values in the set R̄^d when f takes values in R^d. In
particular, the integral is always defined. With this definition, one should
interpret the symbol ∞ to mean that the integral is not finite. In particular,
according to this definition, we have ∫_0^1 −1/x dx = ∞ rather than
∫_0^1 −1/x dx = −∞. This may seem to be a strange choice, but it will prove
to be very convenient, as we will be using this convention mainly with
R^d-valued integrands. Rather than trying to keep track of which coordinates
are finite or infinite or undefined, we simply define the value of the whole
integral to be "∞" whenever any coordinate is not finite. If we fix any
measurable f : R_+ → R^d and define F(t) ≔ ∫_0^t f(s) ds, then F is left
continuous. In fact, the only way that F may jump is if it jumps to ∞ and
then stays there, so F is continuous if it is finitely-valued for all t ∈ R_+.
For instance, if we fix any nonzero x ∈ R^d and take f(t) = x/(t − 1) 1_{t>1},
then F(t) = ∞ 1_{t>1}.
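The divergence in the last example can be seen numerically. The sketch below (our own illustration, not part of the text) approximates F(t) with midpoint Riemann sums: before the singularity at t = 1 the sums are exactly zero, while past it they keep growing as the grid is refined, which is precisely the situation that the convention above records as F(t) = ∞.

```python
def riemann(f, t, n):
    """Midpoint Riemann sum approximating the integral of f over [0, t]."""
    ds = t / n
    return sum(f((i + 0.5) * ds) for i in range(n)) * ds

# The remark's example with x = 1: f(s) = 1/(s - 1) on {s > 1}.
f = lambda s: 1.0 / (s - 1.0) if s > 1.0 else 0.0

# Refining the grid past the singularity reveals the (logarithmic)
# divergence: the approximations do not settle down.
coarse = riemann(f, 1.5, 1_000)
fine = riemann(f, 1.5, 100_000)
```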

1.8 Definition. If (E, E) is a measurable space and X is an E-valued random
variable, then we let L(X | P) ≔ P ∘ X⁻¹ denote the measure induced by X
on (E, E), and we say that L(X | P) is the law of X under P. When P is
clear from the context, we abbreviate L(X | P) to L(X), and we simply say
that L(X) is the law of X.
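Returning for a moment to the Hilbert-space conventions fixed in the notation paragraph above, the identities ‖z⊗w‖ = ‖z‖‖w‖ and ‖Ax‖ ≤ ‖A‖‖x‖ are easy to sanity-check numerically. This is a small sketch of ours with arbitrary sample vectors, not part of the text.

```python
def frobenius_inner(A, B):
    """(A, B) = sum_i (A B^T)^{ii}, computed entrywise."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def frobenius_norm(A):
    return frobenius_inner(A, A) ** 0.5

def vec_norm(v):
    return sum(c * c for c in v) ** 0.5

def outer(z, w):
    """z ⊗ w: the matrix with entries (z ⊗ w)^{ij} = z^i w^j."""
    return [[zi * wj for wj in w] for zi in z]

def mat_vec(A, x):
    """Apply A to x by matrix multiplication."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

z, w, x = [1.0, 2.0], [3.0, -1.0, 2.0], [0.5, 1.5, -2.0]
A = outer(z, w)
```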

1.9 Definition. Let E be a topological space, and let {X^n}_{n≤∞} be a sequence
of E-valued random variables, possibly defined on different probability spaces.
We say that X^n converges in distribution to X^∞, written X^n ⇒ X^∞, if

lim_{n→∞} E^n[f(X^n)] = E^∞[f(X^∞)]

for each bounded, continuous function f : E → R.

Filtrations and stochastic bases


1.10 Definition. Given a probability space (Ω, F, P), we say that a
collection of σ-fields F^0 = {F^0_t}_{t∈R_+} is a filtration if F^0_s ⊂ F^0_t ⊂ F when
s < t. We say that F^0 is right continuous if F^0_s = ∩_{t>s} F^0_t. We say that the
σ-field F is complete if A ⊂ N ∈ F and P[N] = 0 implies that A ∈ F.
We say that the filtration F^0 is complete if A ⊂ N ∈ F and P[N] = 0
implies A ∈ F^0_0, and we say that F^0 satisfies the usual conditions if F^0 is
right continuous and complete.

In particular, we do not require a filtration to be right continuous as in
[JS87] Def. I.1.2. We often superscript a filtration which may not satisfy the
usual conditions with a zero to warn the reader. Some authors differentiate


between the "completion" and the "augmentation" of a filtration; we make
no such distinction.
1.11 Definition. We say that B = (Ω, F, F^0 = {F^0_t}_{t∈R_+}, P) is a stochastic
basis if (Ω, F, P) is a probability space and F^0 is a filtration. We say
that B is complete if F and F^0 are both complete, and we say that B
satisfies the usual conditions if F is complete and F^0 satisfies the usual
conditions. If B̂ = (Ω̂, F̂, {F̂^0_t}_{t∈R_+}, P̂) is another stochastic basis, then we
define B⊗B̂ ≔ (Ω×Ω̂, F⊗F̂, {F^0_t ⊗ F̂^0_t}_{t∈R_+}, P×P̂), and we say that B⊗B̂
is an extension of B.

Spaces of functions

If E_1 is a topological space, then the Borel σ-field on E_1 is the σ-field
generated by the open subsets of E_1. If E_2 is another topological space, then
C(E_1; E_2) denotes the set of continuous maps from E_1 to E_2. If E_2 has a
metric d_2, then we will always endow C(R_+; E_2) with the locally uniform
topology and the compatible distance

d(x, y) ≔ Σ_{n=1}^∞ 2^{−n} ( 1 ∧ sup_{s≤n} d_2(x(s), y(s)) ).

If E_2 is a Polish space, then C(R_+; E_2) is a Polish space as well. If E_2 is a
vector space, then C(R_+; E_2) is also a vector space.
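This distance is easy to sketch numerically for real-valued paths. In the illustration below (our own simplification) the supremum over [0, n] is approximated on a finite grid and the series is truncated; note that the distance is always at most 1, even for paths that drift arbitrarily far apart.

```python
import math

def d2(a, b):
    """The metric on E_2 = R."""
    return abs(a - b)

def locally_uniform_distance(x, y, n_max=25, mesh=200):
    """Sketch of d(x, y) = sum_{n>=1} 2^-n (1 ∧ sup_{s<=n} d2(x(s), y(s))),
    with the sup taken over a grid and the series truncated at n_max."""
    total = 0.0
    for n in range(1, n_max + 1):
        sup = max(d2(x(n * i / mesh), y(n * i / mesh)) for i in range(mesh + 1))
        total += 2.0 ** (-n) * min(1.0, sup)
    return total

x = lambda s: math.sin(s)
y = lambda s: math.sin(s) + 0.1    # uniformly close to x
z = lambda s: s                    # wanders away from x, yet d(x, z) <= 1
```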
We now assume that E is a Polish space. We will be slicing up paths and
patching them back together, so we fix some useful notation. We define the
shift operator

Θ : C(R_+; E)×R → C(R_+; E)  by  Θ(x, t) ≔ x((t + ·)_+),

where (·)_+ denotes the positive part. Notice that this is a slight extension of
the standard shift operator of Markov process theory, as we allow for negative
shifts and the value of the path at 0 is used to fill the "gap" that is created
when the path is shifted to the right. In particular, if 0 ≤ s ≤ t, then
Θ(x, −t)(s) = x(0). We also define the stopping operator

∇ : C(R_+; E)×R_+ → C(R_+; E)  by  ∇(x, t) ≔ x(· ∧ t).


If E has a vector space structure, then C(R_+; E) has a vector space
structure, so E has a zero element, and we may define the space of paths that
start at zero

C_0(R_+; E) ≔ { x ∈ C(R_+; E) : x(0) = 0 }.

This is a closed subset of C(R_+; E), so C_0(R_+; E) is a Polish space in the
relative topology. In this situation, we can also define the difference operator

∆ : C(R_+; E)×R_+ → C_0(R_+; E)  by  ∆(x, t) ≔ x(t + ·) − x(t).

We will slightly abuse the notation and use the same symbols for these
operators as we vary the space E. The domain and range of the operators
should be clear from the context. The maps Θ, ∇, and ∆ are continuous, and
they are linear in x, for fixed t, when E is a vector space.
If X : Ω → C(R_+; E), then we use the notation X_t : Ω → E to denote the
map ω ↦ X(ω)(t), and we use the standard "stopped process" notation. In
particular, if T is an R_+-valued random variable, then X^T : Ω → C(R_+; E)
denotes the map ω ↦ ∇(X(ω), T(ω)). Notice that if t and u are nonnegative,
then we have

Θ(X^{t+u}, t) = Θ(∇(X, t + u), t) = ∇(Θ(X, t), u) = Θ(X, t)^u,

and a similar chain of equalities holds for ∆(X^{t+u}, t) when E is a vector space.
If X is a random variable, then σ(X) denotes the σ-field generated by
X. If G and H are σ-fields, and X and Y are random variables, then
σ(G, H, X, Y) = G ∨ H ∨ σ(X) ∨ σ(Y).
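The operators Θ, ∇, and ∆ and the identity Θ(X^{t+u}, t) = ∇(Θ(X, t), u) can be checked numerically on deterministic paths. The sketch below (our own illustration) represents a path as a plain function of time.

```python
import math

def shift(x, t):
    """Theta(x, t): s -> x((t + s)_+); negative shifts fill with x(0)."""
    return lambda s: x(max(t + s, 0.0))

def stop(x, t):
    """Nabla(x, t): s -> x(s ∧ t)."""
    return lambda s: x(min(s, t))

def diff(x, t):
    """Delta(x, t): s -> x(t + s) - x(t), a path starting at zero."""
    return lambda s: x(t + s) - x(t)

x = lambda s: math.sin(s) + s * s    # an arbitrary continuous path
t, u = 0.7, 1.3

lhs = shift(stop(x, t + u), t)       # Theta(X^{t+u}, t)
rhs = stop(shift(x, t), u)           # Nabla(Theta(X, t), u)
```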

Processes

We denote by BV([a, b]; R^d) the class of R^d-valued functions that are of
bounded variation when restricted to the interval [a, b]. Similarly,
AC([a, b]; R^d) denotes the class of R^d-valued functions that are absolutely
continuous when restricted to the interval [a, b]. One may consult Appendix C
for further details.

1.12 Definition. If X is an R^d-valued process, then we say that X is a
finite variation process if X ∈ BV([0, t]; R^d) for all t ∈ R_+, and we say
that X is an absolutely continuous process if X ∈ AC([0, t]; R^d) for all
t ∈ R_+.


1.13 Definition. If X is an R^d-valued process, then we define

Var_t(X) ≔ sup_π Σ_i ‖X_{s_i} − X_{s_{i−1}}‖,

where the sup is taken over all partitions of the form

π = { 0 = s_0 < s_1 < · · · < s_n = t }.

If X is a càdlàg process, then we need only consider partitions containing
rational points in the definition above, and Var_t(X) is a (measurable) random
variable.
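For continuous paths, the supremum in this definition can be approached by refining partitions. A small sketch of ours (the refinement levels are arbitrary choices): a monotone path has Var_t(X) = ‖X_t − X_0‖, and a path that goes up and then down accumulates both moves.

```python
def variation(path, partition):
    """Sum of increment sizes over one partition 0 = s_0 < ... < s_n = t."""
    return sum(abs(path(b) - path(a)) for a, b in zip(partition, partition[1:]))

def total_variation(path, t, refinements=(10, 100, 1000)):
    """Var_t is a supremum over all partitions; here we simply take the
    best value over a few uniform refinements, which suffices to
    illustrate the definition on smooth paths."""
    return max(variation(path, [t * i / n for i in range(n + 1)])
               for n in refinements)

monotone = lambda s: s * s            # increasing: Var_2 = |X_2 - X_0| = 4
tent = lambda s: min(s, 2.0 - s)      # up by 1, then down by 1: Var_2 = 2
```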

1.14 Definition. Let B = (Ω, F, F^0, P) be a stochastic basis supporting a
continuous, R^d-valued process X. We say that X is a continuous
semimartingale if we can decompose X as

(1.15)  X_t = X_0 + M_t + B_t,

where M is a continuous local martingale with M_0 = 0, and B is a continuous
process with B_0 = 0 that is P-a.s. of finite variation. In this case, we say
that X has the characteristics (B, ⟨M⟩). If B and ⟨M⟩ are both absolutely
continuous, P-a.s., then we say that X is an Itô process.

Chapter 2

Statement of Results

In this chapter, we present the main result of the dissertation, and we give a
few corollaries to illustrate potential applications. To state the main result,
we need to first define the notion of an updating function. We give this
definition and present a few examples of updating functions in Section 2.1.
In Section 2.2, we state the main result of the dissertation and give
corollaries. In Section 2.3, we show how these results can be used to give an
answer to a question raised by Piterbarg about the prices for barrier options
in mixture models.

2.1 Updating Functions


The following definition is fundamental for all that follows.

2.1 Definition. Let E be a Polish space, and let Φ : E×C_0(R_+; R^d) →
C(R_+; E) be a function. We say that Φ is an updating function if

(a) Φ^t(e, x) = Φ^t(e, ∇(x, t)) ∀ t ∈ R_+, and

(b) Θ(Φ(e, x), t) = Φ(Φ_t(e, x), ∆(x, t)) ∀ t ∈ R_+.

If Φ is also continuous as a map from E×C_0(R_+; R^d) to C(R_+; E), then we
say that Φ is a continuous updating function.

Property (a) of Def. 2.1 is an adaptedness condition. Property (b)
restricts the way in which Φ may depend upon the history of X. We state this
precisely as a lemma.


2.2 Lemma. Let E be a Polish space, let X be a continuous, R^d-valued
process, let Z be a continuous, E-valued process, and let Φ be an updating
function. If Z = Φ(Z_0, ∆(X, 0)), then Θ(Z, t) = Φ(Z_t, ∆(X, t)) for all
t ∈ R_+.

Proof. We simply write

Θ(Z, t) = Θ(Φ(Z_0, ∆(X, 0)), t)
        = Φ(Φ_t(Z_0, ∆(X, 0)), ∆(X, t))
        = Φ(Z_t, ∆(X, t)),

where we have used property (b) of Def. 2.1 and the fact that
∆(∆(X, 0), t) = ∆(X, t).

This lemma suggests that processes of the form Z = Φ(Z_0, ∆(X, 0)) have
some desirable properties, and we give this relationship an intuitive name.

2.3 Definition. Let E be a Polish space, let X be a continuous, R^d-valued
process, and let Z be a continuous, E-valued process. If we can write
Z = Φ(Z_0, ∆(X, 0)) for some updating function Φ, then we say that Z may be
updated using only the changes in X.
We now present a number of examples to illustrate the relationship be-
tween the processes X and Z of Def. 2.3. Unfortunately, the notational
burden increases with each example; however, the point of these examples is
simply to show that the notion of updating function is quite general, and the
reader will not lose much by skimming the details.
2.4 Example. Let X be a continuous, R^d-valued process, and set Z ≔ X.
Then Z may be updated using only the changes in X. To see this, we set
E ≔ R^d, and we define Φ(e, x) ≔ e 1_{[0,∞)} + x, where e 1_{[0,∞)} denotes the
constant path in C(R_+; R^d) that is equal to e at all times.

We have Z = Φ(Z_0, ∆(X, 0)), and we now check that Φ is an updating
function. For t ∈ R_+ and x ∈ C_0(R_+; R^d), we have

Φ^t(e, x) = e 1_{[0,∞)} + ∇(x, t) = Φ^t(e, ∇(x, t)),

so property (a) of Def. 2.1 holds. If we know the value of X at time t, and
we know how the process X changes after time t, then we may reconstruct
the path of X after time t. In particular, we have

Θ(Φ(e, x), t) = e 1_{[0,∞)} + Θ(x, t)
             = (e + x(t)) 1_{[0,∞)} + ∆(x, t)
             = Φ(Φ_t(e, x), ∆(x, t)),

so property (b) of Def. 2.1 holds, and Φ is an updating function.
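Both properties can be checked numerically for this Φ. The sketch below is our own illustration: paths are encoded as functions of time, the operators Θ, ∇, and ∆ act pointwise, and the stopped output path plays the role of Φ^t.

```python
import math

def stop(x, t):                      # the operator ∇
    return lambda s: x(min(s, t))

def shift(x, t):                     # the operator Θ (nonnegative t here)
    return lambda s: x(t + s)

def diff(x, t):                      # the operator ∆
    return lambda s: x(t + s) - x(t)

def phi(e, x):
    """The updating function of Example 2.4: Phi(e, x) = e 1_[0,inf) + x."""
    return lambda s: e + x(s)

x0 = lambda s: math.cos(s) - 1.0     # a path in C_0 (it starts at zero)
e, t = 5.0, 0.8

# Property (a): stopping the output at t uses only the stopped input.
lhs_a, rhs_a = stop(phi(e, x0), t), stop(phi(e, stop(x0, t)), t)

# Property (b): shifting the output restarts Phi from its time-t value.
lhs_b, rhs_b = shift(phi(e, x0), t), phi(phi(e, x0)(t), diff(x0, t))
```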

In Example 2.4, the only information that we decided to record about the
process X was the current location. In the next example, we choose to track
both the current location and the running maximum. We restrict ourselves
to the one-dimensional case for simplicity.

2.5 Example. Let X be a continuous process and set Z ≜ (X, M), where
Mt ≜ max{Xs : s ∈ [0, t]}. Then Z may be updated using only the changes
in X. This time we take E = R2, and we write a typical point in E as
e = (e1, e2). We let ψ : C0(R+; R) → C(R+; R+) denote the map such that
ψt(x) = max{x(s) : s ∈ [0, t]}, and we let Φ : E×C0(R+; R) → C(R+; R2)
denote the map such that

    Φt(e, x) ≜ ( e1 + x(t), max{e2, e1 + ψt(x)} ).

Notice that

    Φt(Z0, ∆(X, 0)) = Φt((X0, X0), ∆(X, 0))
                    = ( X0 + ∆t(X, 0), X0 + ψt(∆(X, 0)) )
                    = Zt,

where we have used the fact that X0 + ψt(∆(X, 0)) ≥ X0.


We now check that Φ is an updating function. Fixing any s ≤ t, we have

    Φ^t_s(e, x) = Φs(e, x) = ( e1 + x(s), max{e2, e1 + ψs(x)} )
                = ( e1 + ∇s(x, t), max{e2, e1 + ψs(∇(x, t))} )
                = Φs(e, ∇(x, t)) = Φ^t_s(e, ∇(x, t)).
As this is true for all s ≤ t, we conclude that property (a) of Def. 2.1 holds.


Finally, we check that, for all t, u ∈ R+, we have

    Θu(Φ(e, x), t) = Φt+u(e, x)
     = ( e1 + x(t + u), max{e2, e1 + ψt+u(x)} )
     = ( e1 + x(t) + ∆u(x, t), max{e2, e1 + ψt(x), e1 + x(t) + ψu(∆(x, t))} )
     = Φu( (e1 + x(t), max{e2, e1 + ψt(x)}), ∆(x, t) )
     = Φu(Φt(e, x), ∆(x, t)),
so property (b) of Def. 2.1 holds, and Φ is an updating function.

When dealing with a time-dependent Markov process, a standard technique
is to append the time to the state of the process and form a time-homogeneous
"space-time" process. In the next example, we see that we
may employ a similar technique with an updating function. We also see how
we might construct updating functions that record information about the
joint distributions of the X process.

2.6 Example. Let X be a continuous, real-valued process. Fix some T > 0,
and set Zt ≜ (Xt, X^T_t, t). Then Z may be updated using only the changes
in X. This time we take E ≜ R2×R+, and we write a typical point in E as
e = (e1, e2, e3). Let Φ : E×C0(R+; R) → C(R+; E) denote the map such that

    Φt(e, x) = ( e1 + x(t), e2 + ∇t(x, (T − e3)+), e3 + t ).

We check that

    Φt(Z0, ∆(X, 0)) = Φt((X0, X0, 0), ∆(X, 0))
                    = ( X0 + ∆t(X, 0), X0 + ∆^T_t(X, 0), t ) = Zt.


Fixing any s ≤ t, we have

    Φ^t_s(e, x) = Φs(e, x) = ( e1 + x(s), e2 + ∇s(x, (T − e3)+), e3 + s )
                = ( e1 + ∇s(x, t), e2 + ∇s(∇(x, t), (T − e3)+), e3 + s )
                = Φs(e, ∇(x, t)) = Φ^t_s(e, ∇(x, t)).

As this is true for any s ≤ t, we again conclude that property (a) of Def. 2.1
holds.
To see that property (b) holds, we first note that for any path y and times
s, t ∈ R+, we have

(2.7)    ∇(Θ(y, t), s) = Θ(∇(y, (s − t)+), t).

Taking any t, u ∈ R+ and letting y = ∆(x, t), we write

    ∇t+u(x, (T − e3)+) = ∇t+u( ∇(x, t) + Θ(∆(x, t), t), (T − e3)+ )
     = ∇t+u(∇(x, t), (T − e3)+) + ∇t+u(Θ(y, t), (T − e3)+)
(2.8)  = x(t ∧ (T − e3)+) + Θt+u( ∇(y, (T − e3 − t)+), t )
     = ∇t(x, (T − e3)+) + ∇u(∆(x, t), (T − e3 − t)+),


where (2.8) follows from (2.7) with s = (T − e3)+. To conclude, we write

    Θu(Φ(e, x), t) = Φt+u(e, x)
     = ( e1 + x(t + u), e2 + ∇t+u(x, (T − e3)+), e3 + t + u )
     = ( e1 + x(t) + ∆u(x, t),
         e2 + ∇t(x, (T − e3)+) + ∇u(∆(x, t), (T − e3 − t)+), e3 + t + u )
     = Φu( (e1 + x(t), e2 + ∇t(x, (T − e3)+), e3 + t), ∆(x, t) )
     = Φu(Φt(e, x), ∆(x, t)).

This shows that property (b) of Def. 2.1 holds, so Φ is an updating function.
As is quickly becoming clear, the hardest part of checking that a function
is an updating function is working through the notation. This is particularly
true of the last example that we present. In this last example, we use Z to
record the entire trajectory of X up until the current time. The updating
function removes an initial segment of path from the front end of ∆(X, t)
and appends it to the end of the initial path segment stored in Z. As we now
have a path-valued process, this situation is somewhat unpleasant to deal
with notationally, and the reader will not lose much by omitting the details.
2.9 Example. Let X be a continuous, Rd-valued process, and set Zt =
(X^t, t), so Zt records the entire trajectory of X up until the time t. Then Z
may be updated using only the changes in X. This example is extremal in
the sense that we choose to record the most information about the path of
X that is possible without violating property (a) of Def. 2.1.
We take E ≜ C(R+; Rd)×R+, and we write a typical point in E as e = (e1, e2).
We map a segment of path to a point in E, using the second coordinate to
record the length of the segment. Let Ψ : E → C(R+; E) denote the map
such that

    Ψt(e) = ( ∇(e1, e2 + t), e2 + t ).
We might describe Ψ as a map that reveals more and more of the path
e1 over time. In particular, we give Ψ a path e1 and an initial time e2 ,
and Ψt shows us the piece of e1 that lives on the interval [0, e2 + t]. Let


Φ : E×C0(R+; Rd) → C(R+; E) denote the map such that

    Φt(e, x) = Ψt( ∇(e1, e2) + Θ(x, −e2), e2 )
             = ( ∇(∇(e1, e2) + Θ(x, −e2), e2 + t), e2 + t ).

Recall that to compute Θ(x, −e2 ), we slide the path x to the right by the
amount e2 , and we have Θt (x, −e2 ) = x(0) = 0 for t ∈ [0, e2 ]. Φ appends
the path x to the initial path segment e and then hands the newly con-
structed path over to Ψ, which slowly
 ∇(X,0)  reveals information about the path in
an adapted way. As Z0 = 0
, we have
 
Φt Z0 , ∆(X, 0) = Ψt ∇(X, 0) + ∆(X, 0), 0 = Ψt (X, 0) = Zt .

We now check that Φ is an updating function. To check the first property
of Def. 2.1, we first notice that if x ∈ C(R+; Rd) and 0 ≤ s ≤ t, then

    ∇(Θ(x, −e2), e2 + s) = Θ(∇(x, s), −e2)
                         = Θ(∇(∇(x, t), s), −e2)
                         = ∇(Θ(∇(x, t), −e2), e2 + s).

The first and last equality state that sliding the path x or ∇(x, t) to the right
by e2 and then stopping it at time e2 +s is the same as first stopping the path
at time s and then sliding it to the right by e2 . The second equality follows
from the fact that stopping a path at two deterministic times is equivalent
to stopping the path once at the earlier time. Using this observation and the
fact that ∇(x + y, t) = ∇(x, t) + ∇(y, t), we may write

    ∇(∇(e1, e2) + Θ(x, −e2), e2 + s)
     = ∇(∇(e1, e2), e2 + s) + ∇(Θ(x, −e2), e2 + s)
     = ∇(∇(e1, e2), e2 + s) + ∇(Θ(∇(x, t), −e2), e2 + s)
     = ∇(∇(e1, e2) + Θ(∇(x, t), −e2), e2 + s).


Fixing any 0 ≤ s ≤ t, we then have

    Φ^t_s(e, x) = Φs(e, x) = ( ∇(∇(e1, e2) + Θ(x, −e2), e2 + s), e2 + s )
                = ( ∇(∇(e1, e2) + Θ(∇(x, t), −e2), e2 + s), e2 + s )
                = Φs(e, ∇(x, t)) = Φ^t_s(e, ∇(x, t)).

We have now shown that property (a) of Def. 2.1 holds.


To check the second property of Def. 2.1, we first observe that
x = ∇(x, t) + Θ(∆(x, t), −t), which implies that

    Θ(x, −e2) = Θ(∇(x, t), −e2) + Θ(∆(x, t), −e2 − t)
              = ∇(Θ(x, −e2), e2 + t) + Θ(∆(x, t), −e2 − t).

Using these observations, we may write

    ê1 ≜ ∇(e1, e2) + Θ(x, −e2)
       = ∇(∇(e1, e2), e2 + t) + ∇(Θ(x, −e2), e2 + t) + Θ(∆(x, t), −e2 − t)
       = ∇(∇(e1, e2) + Θ(x, −e2), e2 + t) + Θ(∆(x, t), −e2 − t)
       = ∇(ê1, e2 + t) + Θ(∆(x, t), −e2 − t).


To conclude, we write

    Θu(Φ(e, x), t) = Φt+u(e, x)
     = Ψt+u(ê1, e2)
     = ( ∇(ê1, e2 + t + u), e2 + t + u )
     = ( ∇(∇(ê1, e2 + t) + Θ(∆(x, t), −e2 − t), e2 + t + u), e2 + t + u )
     = Ψu( ∇(ê1, e2 + t) + Θ(∆(x, t), −e2 − t), e2 + t )
     = Φu( (∇(ê1, e2 + t), e2 + t), ∆(x, t) )
     = Φu( Ψt(ê1, e2), ∆(x, t) )
     = Φu( Φt(e, x), ∆(x, t) ).

We have now shown that property (b) of Def. 2.1 holds, so Φ is an updating
function.

2.2 Main Result


Before we present the main result of the dissertation, we pause to give the
following simple example.
2.10 Example. Let (Ω̃, F̃, {F̃t}t∈R+, P̃) be a stochastic basis that supports a
Wiener process W̃ and an F̃0-measurable random variable Ũ that is uniformly
distributed over the interval [0, 1]. Fix constants 0 < c1 < c2, and set
σ̃t ≜ √c1 1{Ũ<1/2} + √c2 1{Ũ≥1/2}, where we take the nonnegative square root.
The process σ̃ is bounded and adapted, so we may define Ỹt ≜ ∫₀ᵗ σ̃s dW̃s and
C̃t ≜ ∫₀ᵗ σ̃s² ds. Notice that C̃t = tc1 1{Ũ<1/2} + tc2 1{Ũ≥1/2} and that Ỹ is an Itô
process with the characteristics (0, C̃). Letting η(x, v) ≜ (2πv)^{−1/2} e^{−x²/(2v)}
denote the density of the normal distribution with mean 0 and variance v,
we see that the density of the random variable Ỹt is given by

    P̃[Ỹt ∈ dx] = η(x, tc1)/2 + η(x, tc2)/2.

This example is essentially the simplest possible stochastic volatility model,
and we will use this example to illustrate most of the results that follow.
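The mixture density above is easy to check numerically. The following sketch assumes illustrative values for c1, c2, and t; it tabulates η(x, tc1)/2 + η(x, tc2)/2 on a grid and confirms that the mixture integrates to one and has second moment t(c1 + c2)/2, which is the variance E[C̃t] that the coin-flip construction dictates.

```python
import math

def eta(x, v):
    # density of the normal distribution with mean 0 and variance v
    return math.exp(-x * x / (2 * v)) / math.sqrt(2 * math.pi * v)

c1, c2, t = 0.04, 0.16, 2.0   # hypothetical variance levels and time

def density(x):
    # P̃[Ỹ_t ∈ dx] = η(x, t c1)/2 + η(x, t c2)/2
    return 0.5 * eta(x, t * c1) + 0.5 * eta(x, t * c2)

h = 0.001
grid = [i * h for i in range(-8000, 8001)]
mass = sum(density(x) * h for x in grid)
var = sum(x * x * density(x) * h for x in grid)
assert abs(mass - 1.0) < 1e-4               # the mixture is a probability density
assert abs(var - t * (c1 + c2) / 2) < 1e-3  # Var(Ỹ_t) = E[C̃_t] = t(c1 + c2)/2
```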


We now present the main result of this dissertation.

2.11 Theorem. Let W be an Rr1-valued Wiener process, let µ be an adapted,
Rd-valued process, let σ be an adapted, Rd⊗Rr1-valued process, assume that

(2.12)    E ∫₀ᵗ ( ‖µs‖ + ‖σsσsᵀ‖ ) ds < ∞ for all t ∈ R+,

and set

(2.13)    Yt ≜ ∫₀ᵗ µs ds + ∫₀ᵗ σs dWs.

Let E be a Polish space, and let Z be a continuous, E-valued process with
Z = Φ(Z0, Y) for some continuous updating function Φ. Finally, suppose
that N ⊂ R+ is a Lebesgue-null set and that we have (deterministic) functions
µ̂ : E×R+ → Rd and σ̂ : E×R+ → Rd⊗Rr2 such that µ̂(Zt, t) = E[µt | Zt]
a.s. and σ̂σ̂ᵀ(Zt, t) = E[σtσtᵀ | Zt] a.s. when t ∉ N.

Then there exists a stochastic basis (Ω̂, F̂, F̂, P̂) supporting processes Ŵ,
Ŷ, and Ẑ such that

(a) Ŵ is an Rr2-valued Wiener process,

(b) Ŷt = ∫₀ᵗ µ̂(Ẑs, s) ds + ∫₀ᵗ σ̂(Ẑs, s) dŴs,

(c) Ẑ is a continuous, E-valued process with Ẑ = Φ(Ẑ0, Ŷ), and

(d) Ẑ has the same one-dimensional marginal distributions as Z.

2.14 Remark. Cor. 4.5 asserts that we may find deterministic functions
µ̂ : E×R+ → Rd and ν̂ : E×R+ → S+d and a Lebesgue-null set N ⊂ R+ such
that µ̂(Zt, t) = E[µt | Zt] a.s. and ν̂(Zt, t) = E[σtσtᵀ | Zt] a.s. when t ∉ N. If
we take r2 = d, then Lem. D.6 asserts that we may take the positive square
root of ν̂ to get a function σ̂ taking values in S+d and satisfying σ̂² = ν̂.
As a result, we can always find functions satisfying the requirements of the
previous theorem; however, in applications we can often compute versions of
µ̂ and σ̂ explicitly, so we formulate the theorem to take these functions as
inputs.


2.15 Remark. In this formulation, we always have Y0 = 0, so Z0 is the only
"initial condition." In the corollaries that follow, we will see that this does
not restrict generality.

To appreciate this result, it is probably helpful to first consider two
essentially extremal corollaries. The first corollary reads as follows.

2.16 Corollary. Let W be an Rr1-valued Wiener process, and let X be an
Rd-valued Itô process with stochastic differential

    dXt = µt dt + σt dWt,

where µt ∈ Rd and σt ∈ Rd⊗Rr1 are adapted processes satisfying (2.12). Let
N ⊂ R+ be a Lebesgue-null set, and let µ̂ : Rd×R+ → Rd and σ̂ : Rd×R+ →
Rd⊗Rr2 be (deterministic) functions such that µ̂(Xt, t) = E[µt | Xt] a.s. and
σ̂σ̂ᵀ(Xt, t) = E[σtσtᵀ | Xt] a.s. when t ∉ N. Then there exists a weak solution
to the SDE

(2.17)    dX̂t = µ̂(X̂t, t) dt + σ̂(X̂t, t) dŴt,

where Ŵ is an Rr2-valued Wiener process, and X̂ has the same one-dimensional
marginal distributions as X.
Proof. Set E ≜ Rd, Φ(e, x) ≜ e 1[0,∞) + x, Y ≜ ∆(X, 0), and Z ≜ X, so
Z = Φ(Z0, Y). It is clear that Φ is continuous, and we have already shown
that Φ is an updating function in Example 2.4, so we may apply Thm. 2.11
to conclude that there exists a stochastic basis (Ω̂, F̂, F̂, P̂) that supports
processes Ŷ, Ẑ, and Ŵ such that Ŵ is an Rr2-valued Wiener process, Ŷ
satisfies (b) of Thm. 2.11, Ẑ is a continuous, Rd-valued process with Ẑ =
Φ(Ẑ0, Ŷ), and Ẑ has the same one-dimensional marginal distributions as Z.
We set X̂ ≜ Ẑ, so X̂t = Φt(Ẑ0, Ŷ) = Ẑ0 + Ŷt. As Ŷ satisfies (b) of Thm. 2.11,
we conclude that X̂ solves (2.17), and we are done.

As noted in the introduction, Krylov [Kry84] and Gyöngy [Gyö86] have
proved this result under the additional hypotheses that µ and σ are both
bounded and σσᵀ is uniformly positive definite.
Let M denote the class of financial models in which the interest rate
is deterministic and the price processes are modeled as Itô processes whose
coefficients satisfy the integrability requirement (2.12). Let D ⊂ M denote


the class of financial models in which the price processes solve an SDE of the
form (2.17). Given the equivalence between the one-dimensional marginal
distributions of a price process under a pricing measure and the prices of
European options, Cor. 2.16 admits the following financial interpretation: If
there exists any model in M which is consistent with a given set of market
prices for European options, then there also exists a model in D which is
consistent with that set of market prices.

2.18 Example. Let Ỹ and η be defined as in Example 2.10, and take X = Ỹ
in Cor. 2.16. In this case, we can compute µ̂ and σ̂ explicitly. For t > 0, we
have µ̂(x, t) = 0 and

    σ̂²(x, t) = Ẽ[ σ̃t² | Ỹt = x ]
     = ( c1 P̃[σ̃t² = c1, Ỹt ∈ dx] + c2 P̃[σ̃t² = c2, Ỹt ∈ dx] )
       / ( P̃[σ̃t² = c1, Ỹt ∈ dx] + P̃[σ̃t² = c2, Ỹt ∈ dx] )
     = ( c1 η(x, tc1) + c2 η(x, tc2) ) / ( η(x, tc1) + η(x, tc2) ).

So, taking µ̂ = 0 and

(2.19)    σ̂(x, t) = √( (c1 η(x, tc1) + c2 η(x, tc2)) / (η(x, tc1) + η(x, tc2)) ),

Cor. 2.16 asserts the existence of a solution to the SDE (2.17) with the same
one-dimensional marginal distributions as Ỹ.
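Formula (2.19) is explicit enough to explore directly. The sketch below uses hypothetical variance levels c1 and c2 and verifies three consequences of the formula: σ̂² is always a convex combination of c1 and c2; σ̂²(0, t) = √(c1 c2) for every t > 0, because the weights at x = 0 are proportional to 1/√(tci); and for fixed x ≠ 0 the high-variance component dominates as t → 0. The last two facts together exhibit the discontinuity of σ̂ at t = 0.

```python
import math

def eta(x, v):
    # density of the normal distribution with mean 0 and variance v
    return math.exp(-x * x / (2 * v)) / math.sqrt(2 * math.pi * v)

def sigma_hat(x, t, c1, c2):
    # the mimicking volatility of (2.19)
    num = c1 * eta(x, t * c1) + c2 * eta(x, t * c2)
    den = eta(x, t * c1) + eta(x, t * c2)
    return math.sqrt(num / den)

c1, c2 = 0.04, 0.16   # hypothetical variance levels
# σ̂² is a convex combination of c1 and c2, so it stays between them
for x in [-1.0, -0.1, 0.0, 0.3, 2.0]:
    assert c1 <= sigma_hat(x, 1.0, c1, c2) ** 2 <= c2
# at x = 0 the weights are proportional to 1/√(t c_i), so σ̂²(0, t) = √(c1 c2)
for t in [0.1, 1.0, 5.0]:
    assert abs(sigma_hat(0.0, t, c1, c2) ** 2 - math.sqrt(c1 * c2)) < 1e-12
# for fixed x ≠ 0 and small t, the high-variance component dominates: σ̂² ≈ c2
assert abs(sigma_hat(0.5, 0.01, c1, c2) ** 2 - c2) < 1e-9
```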

If we were to start with a mixture of geometric Brownian motions with differing
volatilities rather than (arithmetic) Brownian motions, then we would
recover the results of Brigo and Mercurio [BM01] [BM02], who show how to
construct models in which European option prices are given as mixtures of
Black-Scholes prices. In fact, our results are slightly stronger, as Brigo and
Mercurio require the existence of a strong solution to (2.17). In Example 2.18,
we already see a situation where the solution to (2.17) may not be strong.
In particular, looking at (2.19), we see that we cannot define σ̂ in such a
way that it is continuous at t = 0. Brigo and Mercurio avoid this problem
by choosing volatility scenarios that are deterministic functions of time and
requiring that all volatility scenarios agree on some arbitrarily small initial
time interval.
The first corollary that we presented corresponds to the diffusion case
where the only information that we choose to track about the process X is
the current location. At the other extreme, we might choose to remember
the entire history of X.
2.20 Corollary. Let W be an Rr1-valued Wiener process, and let X be an
Rd-valued Itô process with stochastic differential

    dXt = µt dt + σt dWt,

where µt ∈ Rd and σt ∈ Rd⊗Rr1 are adapted processes satisfying (2.12).
Let N ⊂ R+ be a Lebesgue-null set, and let µ̂ : C(R+; Rd)×R+ → Rd and
σ̂ : C(R+; Rd)×R+ → Rd⊗Rr2 be functions such that µ̂(X^t, t) = E[µt | X^t]
a.s. and σ̂σ̂ᵀ(X^t, t) = E[σtσtᵀ | X^t] a.s. when t ∉ N. Then there exists a
weak solution to the SDE

(2.21)    dX̂t = µ̂(X̂^t, t) dt + σ̂(X̂^t, t) dŴt,

with the same law as X, where Ŵ is some Rr2-valued Wiener process.
Proof. Take E = C(R+; Rd)×R+ and let e = (e1, e2) denote a typical point in E.
Set Y ≜ ∆(X, 0), set Zt ≜ (X^t, t), and let Φ : E×C0(R+; Rd) → C(R+; E)
denote the map such that

    Φt(e, x) ≜ ( ∇(∇(e1, e2) + Θ(x, −e2), e2 + t), e2 + t ).

Z is a continuous, E-valued process with Z = Φ(Z0, Y). We showed that Φ
is an updating function in Example 2.9. ∇ and Θ are continuous functions,
and the addition of paths in C(R+; Rd) is a continuous operation, so Φ is
a continuous function, and we may apply Thm. 2.11 to conclude that there
exists a stochastic basis (Ω̂, F̂, F̂, P̂) that supports processes Ŷ, Ẑ, and Ŵ
such that Ŵ is an Rr2-valued Wiener process, Ŷ satisfies (b) of Thm. 2.11,
Ẑ is a continuous, E-valued process with Ẑ = Φ(Ẑ0, Ŷ), and Ẑ has the
same one-dimensional marginal distributions as Z. There is a slight abuse
of notation here, as µ̂ has domain C(R+; Rd)×R+, but Thm. 2.11 expects µ̂
to have domain E×R+ = C(R+; Rd)×R+×R+. This happens because the
process Z already includes the time, and we implicitly identify µ̂(x, t, t) with
µ̂(x, t) and σ̂(x, t, t) with σ̂(x, t).

Define X̂ ≜ (Ẑ^1_0)0 + Ŷ, where Ẑ^i denotes the i-th component of Ẑ. This
looks a little awkward, but notice that Ẑ^1_0 ∈ C(R+; Rd), so (Ẑ^1_0)0 ∈ Rd. Given
the definition of X̂ and the fact that Ŷ satisfies (b) of Thm. 2.11, it then
follows that X̂ solves (2.21). As L(Ẑ0) = L(Z0) = L((X^0, 0)), Ẑ^2_0 is
P̂-a.s. equal to 0. This means that the E-valued processes Ẑt = Φt(Ẑ0, Ŷ)
and (X̂^t, t) = Φt((Ẑ^1_0, 0), Ŷ) are P̂-indistinguishable. In particular, L(X̂^t) =
L(Ẑ^1_t) = L(Z^1_t) = L(X^t) for each t, so L(X̂) = L(X).

Liptser and Shiryaev refer to a process that solves an SDE of the form
(2.21) as a process of diffusion-type, and they give Cor. 2.20 under the additional
assumptions that d = r1 = 1 and σ = 1 (see [LS01] Thm. 7.12),
although it is not clear that these assumptions are necessary for their approach
to work. Liptser and Shiryaev provide an explicit formula for the
Radon-Nikodym derivative of the law of a process of diffusion-type with respect
to the law of a Wiener process. To apply this result to a general Itô
process like X, they must show that X solves some SDE of the form (2.21).
Their approach is to filter the drift from the path of the process X. They
subtract the filtered drift from X, and they show that what remains is a
Wiener process, although it may differ from the Wiener process that was
initially used to define X.

2.22 Example. Assume that we are in the setting of Example 2.10. For
each fixed n, define the sequence of functions ξn_m : C(R+; R) → R+ by

    ξn_m(x) ≜ Σ_{i=1}^{m} ( x(i/(nm)) − x((i−1)/(nm)) )².

For each fixed n, n ξn_m(Ỹ) converges to σ̃0² in probability as m → ∞. Moving
to a subsequence {a(n, m)}_m along which the convergence holds a.s., we define
ξn ≜ lim inf_{m→∞} n ξn_{a(n,m)}, so ξn(Ỹ) = ξn(Ỹ^{1/n}) = σ̃0², P̃-a.s., where Ỹ^{1/n}
denotes the process stopped at time 1/n (not the n-th root). Define
σ̂ : C(R+; R)×R+ → R+ by

    σ̂(x, t) ≜ Σ_{n=1}^{∞} √(ξn(x)) 1_{[1/n, 1/(n−1))}(t),

where we take the positive root, and we take 1/0 to be ∞. Then σ̂(Ỹ, t) =
σ̂(Ỹ^t, t) = σ̃0, P̃-a.s., for t > 0. In this simple case, it is clear without even
applying Cor. 2.20 that Ỹ solves dỸt = σ̂(Ỹ^t, t) dW̃t.

It seems that the results that fall between Cor. 2.16 and Cor. 2.20 are
new. For example, we have the following corollary.

2.23 Corollary. Let W be a real-valued Wiener process, let X have stochastic
differential

    dXt = µt dt + σt dWt,

where µ and σ are real-valued, adapted processes that satisfy (2.12), and set
Mt ≜ max{Xs : s ∈ [0, t]}. Let N ⊂ R+ be a Lebesgue-null set, and let
µ̂ : R2×R+ → R and σ̂ : R2×R+ → R be functions with µ̂(Xt, Mt, t) =
E[µt | Xt, Mt] a.s. and σ̂²(Xt, Mt, t) = E[σt² | Xt, Mt] a.s. when t ∉ N.

Then there exists a stochastic basis (Ω̂, F̂, F̂, P̂) that supports processes
Ŵ, X̂, and M̂ such that Ŵ is a Wiener process, X̂ solves the SDE

(2.24)    dX̂t = µ̂(X̂t, M̂t, t) dt + σ̂(X̂t, M̂t, t) dŴt,

M̂t = max{X̂s : s ∈ [0, t]}, and L((X̂t, M̂t)) = L((Xt, Mt)) for all t ∈ R+.
bt , M

Proof. Take E ≜ R2 and let e = (e1, e2) denote a typical point in E. Set
Y ≜ ∆(X, 0), set Z ≜ (X, M), and let Φ : E×C0(R+; R) → C(R+; E) denote
the map such that

    Φt(e, x) = ( e1 + x(t), max{e2, e1 + x(s) : s ∈ [0, t]} ),

so Z = Φ(Z0, Y). It is clear that Φ is a continuous map, and we have shown
that Φ is an updating function in Example 2.5, so we may apply Thm. 2.11
to conclude that there exists a stochastic basis (Ω̂, F̂, F̂, P̂) that supports
processes Ŵ, Ŷ, and Ẑ such that Ŵ is a Wiener process, Ŷ satisfies (b)
of Thm. 2.11, Ẑ = Φ(Ẑ0, Ŷ), and L(Ẑt) = L(Zt) for all t ∈ R+. Set
(X̂, N̂) ≜ Ẑ, and set M̂t ≜ max{X̂s : s ∈ [0, t]}. Then X̂t = Ẑ^1_0 + Ŷt, where Ẑ^i_t
denotes the i-th component of Ẑt. As Ŷ satisfies (b) of Thm. 2.11, X̂ solves

(2.25)    dX̂t = µ̂(X̂t, N̂t, t) dt + σ̂(X̂t, N̂t, t) dŴt.

We also see that

    N̂t = max{ Ẑ^2_0, Ẑ^1_0 + Ŷs : s ∈ [0, t] } = max{ Ẑ^2_0, X̂s : s ∈ [0, t] }.

Now Z^2_0 = M0 = X0 = Z^1_0 and L(Ẑ0) = L(Z0), so Ẑ^2_0 = Ẑ^1_0, P̂-a.s. In
particular, P̂[ N̂t = M̂t for all t ] = 1. As X̂ solves (2.25), it also solves (2.24).
Finally, we notice that

    L((X̂t, M̂t)) = L((X̂t, N̂t)) = L(Ẑt) = L(Zt) = L((Xt, Mt)) for all t ∈ R+,

so we are done.

2.26 Example. Assume that we are in the setting of Example 2.10, set
Ñt ≜ max{W̃s : s ∈ [0, t]}, and define

    p(x, m; t) ≜ ( 2(2m − x)/√(2πt³) ) exp( −(2m − x)²/(2t) ) 1{x ≤ m, m ≥ 0}.

According to [KS91] Prop. 2.8.1, the R2-valued random variable (W̃t, Ñt)
admits the density P̃[W̃t ∈ dx, Ñt ∈ dm] = p(x, m; t).

Setting M̃t ≜ max{Ỹs : s ∈ [0, t]} and taking X = Ỹ and M = M̃
in Cor. 2.23, we may use this density to compute µ̂ and σ̂ explicitly. We
set Ã ≜ (Ỹ, M̃), B̃ ≜ (W̃, Ñ), and we will write a typical point in R2 as
a = (x, m). From the scaling properties of Brownian motion (e.g., [RY99]
Prop. 1.1.10 (iii)), it follows that L(√ci B̃t) = L(B̃tci). For t > 0, we have

    σ̂²(x, m, t) = Ẽ[ σ̃t² | Ỹt = x, M̃t = m ]
     = ( c1 P̃[σ̃t² = c1, Ãt ∈ da] + c2 P̃[σ̃t² = c2, Ãt ∈ da] )
       / ( P̃[σ̃t² = c1, Ãt ∈ da] + P̃[σ̃t² = c2, Ãt ∈ da] )
     = ( c1 P̃[√c1 B̃t ∈ da] + c2 P̃[√c2 B̃t ∈ da] )
       / ( P̃[√c1 B̃t ∈ da] + P̃[√c2 B̃t ∈ da] )
     = ( c1 P̃[B̃tc1 ∈ da] + c2 P̃[B̃tc2 ∈ da] ) / ( P̃[B̃tc1 ∈ da] + P̃[B̃tc2 ∈ da] )
     = ( c1 p(x, m; tc1) + c2 p(x, m; tc2) ) / ( p(x, m; tc1) + p(x, m; tc2) ).

So taking µ̂(x, m, t) = 0 and

    σ̂(x, m, t) = √( (c1 p(x, m; tc1) + c2 p(x, m; tc2))
                     / (p(x, m; tc1) + p(x, m; tc2)) ),

Cor. 2.23 asserts the existence of a solution to (2.24) such that the one-dimensional
marginal distributions of the process (X̂, M̂) agree with the one-dimensional
marginal distributions of (Ỹ, M̃), where M̂t = max{X̂s : s ∈ [0, t]}.

2.27 Corollary. Let W be a Wiener process, fix some time T ∈ R+, and let
X have stochastic differential

    dXt = µt dt + σt dWt,

where µ and σ are adapted processes that satisfy (2.12). Further assume that
N ⊂ R+ is a Lebesgue-null set and that µ̂ : R2×R+ → R and σ̂ : R2×R+ → R
are functions such that µ̂(Xt, X^T_t; t) = E[µt | Xt, X^T_t] a.s. and σ̂²(Xt, X^T_t; t) =
E[σt² | Xt, X^T_t] a.s. when t ∉ N.

Then there exists a stochastic basis (Ω̂, F̂, F̂, P̂) that supports processes
Ŵ and X̂ such that Ŵ is a Wiener process, X̂ solves the SDE

(2.28)    dX̂t = µ̂(X̂t, X̂^T_t; t) dt + σ̂(X̂t, X̂^T_t; t) dŴt,

and L((X̂t, X̂^T_t)) = L((Xt, X^T_t)) for all t ∈ R+.
Proof. Take E = R2×R+, and write a typical point in E as e = (e1, e2, e3).
Define Y ≜ ∆(X, 0), Zt ≜ (Xt, X^T_t, t), and Φ : E×C0(R+; R) → C(R+; E) by

    Φt(e, x) ≜ ( e1 + x(t), e2 + ∇t(x, (T − e3)+), e3 + t ).

It is clear that Φ is a continuous map, and we have shown that Φ is an
updating function in Example 2.6, so we may apply Thm. 2.11 to conclude
that there exists a stochastic basis (Ω̂, F̂, F̂, P̂) that supports processes Ŵ,
Ŷ, and Ẑ such that Ŵ is a Wiener process, Ŷ satisfies (b) of Thm. 2.11, Ẑ =
Φ(Ẑ0, Ŷ), and L(Zt) = L(Ẑt) for all t ∈ R+. As in Cor. 2.20, there is a slight
abuse of notation here, as µ̂ has domain R2×R+, but Thm. 2.11 expects µ̂ to
have domain E×R+ = R2×R+×R+. We are implicitly identifying µ̂(e, t, t)
with µ̂(e, t) and σ̂(e, t, t) with σ̂(e, t).

Set X̂ ≜ Ẑ^1, where Ẑ^i denotes the i-th component of Ẑ, so X̂t = Ẑ^1_0 + Ŷt.
As Ŷ satisfies (b) of Thm. 2.11, X̂ solves

(2.29)    dX̂t = µ̂(X̂t, Ẑ^2_t; t) dt + σ̂(X̂t, Ẑ^2_t; t) dŴt.

Z^2_0 = X0 = Z^1_0 and L(Ẑ0) = L(Z0), so Ẑ^2_0 = Ẑ^1_0, P̂-a.s. Notice that
Φ^1_{t∧T}(e, x) = Φ^2_t(e, x) for all t ∈ R+ when e1 = e2 and e3 = 0. As Ẑ =
Φ(Ẑ0, Ŷ), we have

(2.30)    P̂[ Ẑ^2_t = X̂^T_t for all t ] ≥ P̂[ Ẑ^2_0 = Ẑ^1_0 ] = 1.

It then follows from (2.29) and (2.30) that X̂ solves (2.28). It also follows
from (2.30) that we have

    L((X̂t, X̂^T_t)) = L((Ẑ^1_t, Ẑ^2_t)) = L((Z^1_t, Z^2_t)) = L((Xt, X^T_t)) for all t ∈ R+,

so we are done.

2.31 Example. Assume that we are in the setting of Example 2.10. Taking
X = Ỹ in Cor. 2.27, we may compute µ̂ and σ̂ explicitly. It is clear that
µ̂ = 0. When t ≤ T, σ̂(e, t) is a.s. only evaluated at points with e1 = e2,
and we may use the formula given in (2.19). We now assume that t > T.
Write a typical point in R2 as x = (x1, x2). Set Ãt ≜ (Ỹt, Ỹ^T_t) and define

    η′(x1, x2; v, t) ≜ η(x2, T v) η(x1 − x2, (t − T)v)

for t > T, so η′(x1, x2; c, t) is the density of the R2-valued random variable
(√c W̃t, √c W̃T). Recall that η(x, v) was defined as the density of the normal
distribution with mean 0 and variance v, and W̃ was defined as a Wiener
process in Example 2.10. We then have

    σ̂²(x, t) = Ẽ[ σ̃t² | Ỹt = x1, Ỹ^T_t = x2 ]
     = ( c1 P̃[σ̃t² = c1, Ãt ∈ dx] + c2 P̃[σ̃t² = c2, Ãt ∈ dx] )
       / ( P̃[σ̃t² = c1, Ãt ∈ dx] + P̃[σ̃t² = c2, Ãt ∈ dx] )
     = ( c1 η′(x1, x2; c1, t) + c2 η′(x1, x2; c2, t) )
       / ( η′(x1, x2; c1, t) + η′(x1, x2; c2, t) ).

So taking µ̂(x1, x2, t) = 0 and

    σ̂(x1, x2, t) = √( (c1 η(x1, tc1) + c2 η(x1, tc2))
                       / (η(x1, tc1) + η(x1, tc2)) )          if t ≤ T, and

    σ̂(x1, x2, t) = √( (c1 η′(x1, x2; c1, t) + c2 η′(x1, x2; c2, t))
                       / (η′(x1, x2; c1, t) + η′(x1, x2; c2, t)) )   if t > T,

Cor. 2.27 asserts the existence of a solution to (2.28) such that the one-dimensional
marginal distributions of the process (X̂, X̂^T) agree with the one-dimensional
marginal distributions of the process (Ỹ, Ỹ^T).
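The piecewise formula above is straightforward to implement. In the sketch below the values of c1, c2, and T are hypothetical; σ̂² is checked to lie between the two variance levels, and the role of the second coordinate is visible: before T the formula ignores x2, while after T it does not.

```python
import math

def eta(x, v):
    # density of the normal distribution with mean 0 and variance v
    return math.exp(-x * x / (2 * v)) / math.sqrt(2 * math.pi * v)

def eta_prime(x1, x2, v, t, T):
    # η′(x1, x2; v, t) = η(x2, Tv) η(x1 − x2, (t − T)v) for t > T
    return eta(x2, T * v) * eta(x1 - x2, (t - T) * v)

def sigma_hat(x1, x2, t, T, c1, c2):
    # the piecewise mimicking volatility of Example 2.31
    if t <= T:
        num = c1 * eta(x1, t * c1) + c2 * eta(x1, t * c2)
        den = eta(x1, t * c1) + eta(x1, t * c2)
    else:
        num = (c1 * eta_prime(x1, x2, c1, t, T)
               + c2 * eta_prime(x1, x2, c2, t, T))
        den = eta_prime(x1, x2, c1, t, T) + eta_prime(x1, x2, c2, t, T)
    return math.sqrt(num / den)

c1, c2, T = 0.04, 0.16, 1.0    # hypothetical variance levels and horizon
# σ̂² always lies between the two variance levels
for (x1, x2, t) in [(0.1, 0.1, 0.5), (0.3, -0.2, 2.0), (-1.0, 0.4, 3.0)]:
    s2 = sigma_hat(x1, x2, t, T, c1, c2) ** 2
    assert c1 <= s2 <= c2
# before T the second coordinate is not used; after T it matters
assert sigma_hat(0.3, 0.0, 0.5, T, c1, c2) == sigma_hat(0.3, 9.9, 0.5, T, c1, c2)
assert sigma_hat(0.3, 0.0, 3.0, T, c1, c2) != sigma_hat(0.3, 0.3, 3.0, T, c1, c2)
```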

2.3 Applications to Mixture Models


To conclude this chapter, we will show how our results may be used to give
an answer to a question raised by Piterbarg in the working paper [Pit03a].
Let S denote the price process for some primary security. We will assume
throughout this section that S0 = s0, where s0 is a constant. We will also
abuse notation and use S to denote the price process in different models,
which may be defined on different spaces. Recall that an up-and-out call
option with maturity T, strike K, and barrier L is an option that pays the
amount (ST − K)+ at time T if S remains below the barrier L until time T. If
S exceeds the barrier L, then the option knocks out and loses all value. Fix
two constant volatility levels 0 ≤ σ^1 < σ^2 and two probabilities p1, p2 with
p1 + p2 = 1. For i ∈ {1, 2}, let B^i(T, K, L) denote the price at time 0 of an
up-and-out call option with maturity T, strike K, and barrier L in a
Black-Scholes model with constant interest rate r and volatility σ^i.
No-arbitrage arguments imply


that the price of such an option is given as

    B^i(T, K, L) = E^i[ e^{−rT} 1{MT ≤ L} (ST − K)+ ],

where Mt ≜ max{Su : u ∈ [0, t]}, P^i[S0 = s0] = 1, and S satisfies the SDE
dSt = σ^i St dWt under P^i for some Wiener process W.
We will let P̃ denote a probability measure under which the volatility is
a random variable that takes the value σ^i with probability pi at the initial
time, just as in Example 2.10. As

(2.32)    Ẽ[ e^{−rT} 1{MT ≤ L} (ST − K)+ ] = p1 B^1(T, K, L) + p2 B^2(T, K, L),

we conclude that

(2.33)    B̃(T, K, L) ≜ p1 B^1(T, K, L) + p2 B^2(T, K, L)

gives an arbitrage-free pricing rule for up-and-out call options in the model
P̃, where B̃(T, K, L) is the price for the option with maturity T, strike K,
and barrier L. Note that we are not making any effort to justify the pricing
formula (2.32); instead, we are simply observing that the existence of a
martingale measure is sufficient to ensure the absence of arbitrage. Piterbarg
conjectures that this "coin-flip" model is essentially the only model in which
the pricing rule (2.33) is arbitrage-free. In particular, Piterbarg writes:
    Does there exist a "real" and "reasonable" dynamic model, in
    which uncertainty is revealed over time, and not in an instant
    explosion of information as in [the coin-flip model], such that all
    European options and all barriers are priced using (2.33)? The
    answer is most likely no, but we do not have a formal proof.
To produce another model in which the pricing rule (2.33) is arbitrage-free,
we apply Cor. 2.23 to the process (S, M) under the measure P̃ to produce
a measure P̂ and processes Ŝ and M̂ such that L((Ŝt, M̂t) | P̂) = L((St, Mt) | P̃)
for all t. This is the geometric version of Example 2.26. It then follows that

    Ê[ e^{−rT} 1{M̂T ≤ L} (ŜT − K)+ ] = Ẽ[ e^{−rT} 1{MT ≤ L} (ST − K)+ ]
                                      = B̃(T, K, L),

so the pricing rule (2.33) is also arbitrage-free for the model P̂. We should
b We should


also observe that the prices computed by discounting cash flows under the
pricing measure P̂ will no longer be given as mixtures of Black-Scholes prices
after the initial time. By including the running minimum in the auxiliary
process, it is possible to construct an arbitrage-free model which is distinct
from the coin-flip model, and in which all options with both upper and lower
barriers may be priced as simple mixtures of Black-Scholes prices without
introducing arbitrage. While we make no claims about the extent to which
the model P̂ is "real" or "reasonable", it is fully-specified and dynamically-consistent.
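The identity (2.32) behind the mixture rule (2.33) can be illustrated by Monte Carlo. In the sketch below all market parameters are hypothetical and the barrier is monitored only on a discrete grid, so the numbers approximate the continuously monitored prices. The same Gaussian draws drive both constant-volatility scenarios, and the coin-flip price draws the volatility once per path at time 0, exactly as in the model P̃.

```python
import math, random

def simulate_payoff(z, sigma):
    # up-and-out call payoff along one GBM path built from Gaussian draws z
    s, smax = s0, s0
    dt = T / len(z)
    for g in z:
        s *= math.exp((r - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * g)
        smax = max(smax, s)
    return 0.0 if smax > L else max(s - K, 0.0)

random.seed(2)
s0, r, K, L, T = 100.0, 0.02, 100.0, 130.0, 1.0   # hypothetical market data
sig1, sig2, p1, p2 = 0.10, 0.30, 0.5, 0.5
n, steps, disc = 10_000, 50, math.exp(-0.02 * 1.0)
b1 = b2 = coin = 0.0
for _ in range(n):
    z = [random.gauss(0, 1) for _ in range(steps)]
    pay1, pay2 = simulate_payoff(z, sig1), simulate_payoff(z, sig2)
    b1 += pay1
    b2 += pay2
    # coin-flip model: the volatility is drawn once, at time 0, per scenario
    coin += pay1 if random.random() < p1 else pay2
b1, b2, coin = disc * b1 / n, disc * b2 / n, disc * coin / n
# the coin-flip price matches the mixture p1·B^1 + p2·B^2 up to MC error
assert abs(coin - (p1 * b1 + p2 * b2)) < 0.5
```

Conditioning on the coin flip reduces (2.32) to exactly this mixture, which is why the paired estimator above agrees with p1·B^1 + p2·B^2 up to the Monte Carlo noise of the flips alone.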

Chapter 3

A Cross Product Construction

In this section, we develop a cross product construction for probability
measures that preserves certain properties of the composed measures. We first
develop a binary product, and then we show that the construction is asso-
ciative, so we may repeat the construction iteratively. This section is very
much in the spirit of Chapter 6 of Stroock and Varadhan [SV79]; however,
the goal is slightly different. Stroock and Varadhan begin with a collection
of measures that each solve a martingale problem locally, and they show that
these measures can be patched together to produce a global solution to that
martingale problem. We start with a single initial measure which we break
into a measure on the events prior to some stopping time T , and a condi-
tional probability measure on the events after T . We then patch these two
objects back together to form a new measure, and we do this in such a way
that we preserve the unconditional distribution of the events before T and
the unconditional distribution of the events after T .
In the next chapter, we will need to keep track of an auxiliary process
that takes values in a Polish space, so the natural initial condition will be a
measure on that Polish space. These considerations lead us to work in the
following setting.

3.1 Setting. Let (E, E) be a Polish space with its Borel σ-field, and set
Ω ≜ E×C0(R+; Rd) with typical point ω = (e, x). Define the random variable
E(e, x) ≜ e, the process X(e, x) ≜ x, and the filtration F0 ≜ {F0t}t∈R+, where
F0t ≜ E⊗σ(X^t) = E⊗σ(Xs : s ≤ t). Then Ω is a Polish space under the
standard product topology on E×C0(R+; Rd) with Borel σ-field

    F ≜ E⊗σ(X) = E⊗X⁻¹(C0) = ⋁t F0t,

where C0 denotes the Borel σ-field on C0(R+; Rd).

In this chapter, we will always assume that we are in Setting 3.1. Notice
that by taking E = Rd and defining Zt (e, x) , e + x(t), we recover the
standard Wiener space with canonical process Z.
One might be concerned that F0 does not satisfy the usual conditions;
however, Lem. F.8 asserts that every right-continuous F0 -martingale remains
a martingale when we move to the smallest filtration generated by F0 that
satisfies the usual conditions, so we can move to a filtration that satisfies
the usual conditions if we need to invoke results from the general theory of
processes. Moreover, F0 -stopping times have a number of useful properties
that are lost when we move to the right-continuous filtration generated by F0 .
In particular, if T is an F0 -stopping time, then the events in the σ-field FT0
have a nice characterization (e.g., Lem. A.1), and FT0 is countably generated.
These results are developed in Appendix A.
The following notion will be fundamental.

3.2 Definition. Let {0 = T0 ≤ T1 ≤ · · · ≤ Tn < ∞} be an increasing
sequence of finite F0 -stopping times, and let {Gi }0≤i≤n be a collection of
σ-fields. Set H0 , σ(E), Tn+1 , ∞, and

(3.3)    Hi , σ(Gi−1 , ∆(X Ti , Ti−1 )) for 1 ≤ i ≤ n + 1.

We say that Π , {(Ti , Gi )}0≤i≤n is an extended partition if both of the
following properties hold:

(a) Ti − Ti−1 ∈ σ(Gi−1 , ∆(X, Ti−1 )) for 1 ≤ i ≤ n, and

(b) Gi ⊂ Hi for 0 ≤ i ≤ n.

One possible way to interpret this structure is to think of an extended partition as a filtration-like object in which information is lost at
each time Ti−1 , and Gi−1 denotes the information that we keep. If we watch
the process ∆(X, Ti−1 ) over the stochastic interval [Ti−1 , Ti ], and we combine
what we learn with the information that we kept at Ti−1 , then the amount of


information that we have at time Ti is Hi . As Ti is an F0 -stopping time by


assumption, Lem. A.5 asserts that (a) is actually equivalent to the seemingly
stronger Ti − Ti−1 ∈ Hi . The σ-field Hi represents the information that we
have at time Ti , and property (b) states that this is the only information
that we may include in the next Gi . In essence, once you choose to forget
something by leaving it out of some Gi , that information is gone forever.

3.4 Example. We specialize to the case Ω = {0}×C0 (R+ ; R2 ), so E contains


a single point and there is no initial condition. We assume that the canonical
process is divided into (Y, C) = X, where Y and C are real-valued processes.
We fix some n, and we fix a deterministic partition

(3.5) π = {0 = t0 < t1 < . . . < tn < tn+1 = ∞}.

We take Ti = ti and Gi = σ(Yti ) in Def. 3.2. For i ∈ {1, . . . , n + 1}, we have

    Hi = σ(Gi−1 , ∆(X ti , ti−1 ))
       = σ(Yti−1 ; Ys − Yti−1 , Cs − Cti−1 : s ∈ [ti−1 , ti ] ∩ R+ )
       = σ(Θ(Y ti , ti−1 ), ∆(C ti , ti−1 )).

As Yti = Θti −ti−1 (Y ti , ti−1 ) ∈ Hi , Π = {(ti , Gi )}0≤i≤n is an extended
partition. Notice that in this example, we do not have Cti ∈ Hi when i > 0.

The goal of this section is to prove the following theorem.

3.6 Theorem. Let P be a probability measure on Ω and let Π = {(Ti , Gi )}0≤i≤n
be an extended partition. Then there exists a unique measure, denoted P⊗Π ,
such that

(a) P⊗Π [A] = P[A] for A ∈ ∪i Hi , and

(b) any version of P[B | Gi ] is a version of P⊗Π [B | FT0i ] for B ∈ Hi+1 and
0 ≤ i ≤ n,

where Hi , σ(Gi−1 , ∆(X Ti , Ti−1 )) for 1 ≤ i ≤ n + 1.




3.7 Remark. Two versions of P[B | Gi ] may differ on a P-null set N ∈ Gi ⊂
Hi , but P⊗Π [N ] = P[N ] = 0 by (a) of Thm. 3.6, so N is also a P⊗Π -null set
and the statement of the theorem is at least plausible.


Property (a) says that we do not change the unconditional distributions


of events that are Hi -measurable for some i; however, if the random variable
A is FT0i -measurable and the random variable B is Hi+1 -measurable, then we
may change the joint distribution of (A, B). In particular, (b) implies that A
and B are conditionally independent given Gi under P⊗Π , regardless of their
joint distribution under P.

3.8 Example. Let (Ω, F , {Ft0 }t∈R+ , P) be a stochastic basis that supports
a Wiener process W and a collection of independent, F00 -measurable random
variables {Ui }0≤i≤n , each of which is uniformly distributed over the interval
[0, 1]. Let c1 , c2 , and η be defined as in Example 2.10, and let π be defined
as in (3.5) of Example 3.4. Define σ : Ω×C0 (R+ ; R)×R+ → R by

    σt (y) ,  √c1  if t ∈ [0, t1 ) and U0 < 1/2,
              √c2  if t ∈ [0, t1 ) and U0 ≥ 1/2,
              √c1  if t ∈ [ti , ti+1 ) for some i > 0 and
                       Ui < η(y(ti ), ti c1 ) / (η(y(ti ), ti c1 ) + η(y(ti ), ti c2 )), and
              √c2  if t ∈ [ti , ti+1 ) for some i > 0 and
                       Ui ≥ η(y(ti ), ti c1 ) / (η(y(ti ), ti c1 ) + η(y(ti ), ti c2 )).

Let Y solve dYt = σt (Y ) dWt with Y0 = 0, and set Ct , ∫_0^t σs2 (Y ) ds. In
prose, we flip a coin to choose a volatility at the initial time, and we use
this volatility over the time interval [0, t1 ). At each time ti , we flip again to
reset the volatility level, but the odds are adjusted so that the conditional
distribution of the volatility chosen at time ti given Yti = y is the same as
the conditional distribution of σ̃0 = σ̃ti in Example 2.10 given Ỹti = y. Let Ω
and Π be defined as in Example 3.4, and set P , L(Ỹ , C̃), where Ỹ and C̃
are defined as in Example 2.10. In particular, P is a measure on Ω. In this
case, we have P⊗Π = L(Y, C).
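To make the coin-flip mechanism concrete, the following Monte Carlo sketch simulates one path of (Y, C) in the special case n = 1 on a time grid. This is our own illustration under a stated assumption: Example 2.10 is not reproduced in this chapter, so we take η(y, v) to be the N(0, v) density evaluated at y, which is the natural reading of the mixture weights; all function and parameter names below are ours.

```python
import numpy as np

def eta(y, v):
    # Assumed form of eta from Example 2.10: the N(0, v) density at y.
    return np.exp(-y * y / (2.0 * v)) / np.sqrt(2.0 * np.pi * v)

def simulate_path(rng, c1=0.04, c2=0.16, t1=1.0, horizon=2.0, n_steps=400):
    """Simulate one path of (Y, C) for the case n = 1 on a uniform grid."""
    dt = horizon / n_steps
    t = np.linspace(0.0, horizon, n_steps + 1)
    dW = rng.normal(0.0, np.sqrt(dt), n_steps)
    u0, u1 = rng.uniform(size=2)
    var = c1 if u0 < 0.5 else c2        # fair coin flip at time 0
    y = np.zeros(n_steps + 1)
    c = np.zeros(n_steps + 1)           # C_t = integral of sigma^2 up to t
    flipped = False
    for k in range(n_steps):
        if not flipped and t[k] >= t1:
            # Re-flip at t1 with odds matched to the conditional law of
            # the variance given Y_{t1}, as the example prescribes.
            p = eta(y[k], t1 * c1) / (eta(y[k], t1 * c1) + eta(y[k], t1 * c2))
            var = c1 if u1 < p else c2
            flipped = True
        y[k + 1] = y[k] + np.sqrt(var) * dW[k]
        c[k + 1] = c[k] + var * dt
    return t, y, c

t, y, c = simulate_path(np.random.default_rng(0))
```

On [0, t1 ) the path of C is linear with slope c1 or c2 , and the slope can change exactly once, at t1 ; this is the structure exploited in Example 3.18 below.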

3.1 The Binary Construction


We work up to this result in steps. In the first lemma, we take an initial
point ω = (e, x) ∈ Ω, and we cut the path x at time t, keeping the initial
segment from 0 to t and discarding the rest. We then randomly draw a path
from C0 (R+ ; Rd ) according to some measure Q and append this path to the
initial segment of x. Recall that C0 (R+ ; Rd ) denotes the set of continuous
functions from R+ to Rd that start at 0.


3.9 Lemma. Fix some ω ′ = (e′ , x′ ) ∈ Ω and t ≥ 0, and let Q be a probability
measure on C0 (R+ ; Rd ). Then there exists a unique measure on Ω, denoted
δω′ ⊕t Q, such that

    δω′ ⊕t Q[A ∩ {∆(X, t) ∈ B}] = 1A (ω ′ ) Q[B] for all A ∈ Ft0 and B ∈ C0 ,

where C0 denotes the Borel σ-field on C0 (R+ ; Rd ).


Proof. Let φ : Ω → C0 (R+ ; Rd ) denote the map ω = (e, x) 7→ ∆(x, t). As

    F = σ(Ft0 , φ−1 (C0 )) = σ(Ft0 , ∆(X, t))

(e.g., Lem. A.3), uniqueness follows from the standard π-system argument.
If we let ψ : C0 (R+ ; Rd ) → Ω denote the map y 7→ (e′ , ∇(x′ , t) + Θ(y, −t)),
then

    ψ −1 [A ∩ {∆(X, t) ∈ B}] = ψ −1 (A) ∩ {y ∈ C0 (R+ ; Rd ) : ∆(∇(x′ , t) + Θ(y, −t), t) ∈ B}
                             = B if ω ′ ∈ A, and ∅ otherwise,

as ∆(∇(x′ , t) + Θ(y, −t), t) = ∆(∇(x′ , t), t) + ∆(Θ(y, −t), t) = 0 + y. This
means that Q ◦ ψ −1 is the required measure.
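The map ψ in this proof is easy to visualize on a discrete time grid. The following sketch (our own illustration; the function and variable names are not the dissertation's) cuts a path x′ at time t and grafts an increment path y onto its endpoint, which is exactly z(s) = x′(s ∧ t) + y((s − t) ∨ 0).

```python
import numpy as np

def cut_and_append(x, y, grid, t):
    """Discrete version of psi: z(s) = x(s ∧ t) + y((s − t) ∨ 0) on `grid`."""
    xt = np.interp(t, grid, x)                        # value kept at the cut
    shifted = np.interp(np.clip(grid - t, 0.0, None), grid, y)
    return np.where(grid <= t, x, xt + shifted)

grid = np.linspace(0.0, 2.0, 201)
x = grid ** 2                 # initial path x'
y = np.sin(grid)              # appended path; y(0) = 0, as paths in C0 start at 0
z = cut_and_append(x, y, grid, 1.0)
```

Before the cut, z agrees with x′ ; after it, the increments of z are the increments of y. Because y(0) = 0, the glued path is continuous at t, which is why the construction works on C0 (R+ ; Rd ).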

3.10 Lemma. Let T be an F0 -stopping time, let Q be a probability measure
on C0 (R+ ; Rd ), and let C0 denote the Borel σ-field on C0 (R+ ; Rd ). Fix some
ω ′ = (e′ , x′ ) ∈ Ω and set P , δω′ ⊕T (ω′ ) Q. Then

(a) P[T = T (ω ′ )] = 1,

(b) P[A ∩ {∆(X, T ) ∈ C}] = 1A (ω ′ ) Q[C] for all A ∈ FT0 and C ∈ C0 , and

(c) P[A ∩ F ] = 1A (ω ′ ) P[F ] for all A ∈ FT0 and F ∈ F .

Proof. Set t , T (ω ′ ) and B , {E = e′ , X t = ∇(x′ , t)}. Lem. A.1 asserts
that B ∈ Ft0 , so we may apply the previous lemma to conclude that P[B] =
1B (ω ′ ) = 1. Applying Lem. A.1 again, we see that T (ω) = T (ω ′ ) for all
ω ∈ B, so we have (a).

If A ∈ FT0 and C ∈ C0 , then

    P[A ∩ {∆(X, T ) ∈ C}] = P[A ∩ {∆(X, T ) ∈ C} ∩ {T = T (ω ′ )}]
                          = P[A ∩ {∆(X, T (ω ′ )) ∈ C} ∩ {T = T (ω ′ )}]
                          = P[A ∩ {∆(X, T (ω ′ )) ∈ C}]
                          = 1A (ω ′ ) Q[C],

so (b) follows.

Finally, take A ∈ FT0 and let F = B ∩ {∆(X, T ) ∈ C} with B ∈ FT0 and
C ∈ C0 . In this case, we have

    P[A ∩ F ] = 1A∩B (ω ′ ) Q[C] = 1A (ω ′ ) P[B ∩ {∆(X, T ) ∈ C}] = 1A (ω ′ ) P[F ].

As F = σ(FT0 , ∆(X, T )) (e.g., Lem. A.3), (c) then follows from the standard
π-system argument.
We can now patch together a fixed initial point in Ω and a single prob-
ability measure on C0 (R+ ; Rd ). We use this construction to glue together a
probability measure P on Ω and a probability kernel Q on C0 (R+ ; Rd ).
3.11 Definition. Let (Ω′ , F ′ ) and (Ω′′ , F ′′ ) be measurable spaces and
fix some G ′ ⊂ F ′ . We say that Q : Ω′ ×F ′′ → [0, 1] is a G ′ -measurable
probability kernel from (Ω′ , F ′ ) to (Ω′′ , F ′′ ) if

(a) Q[A] is a G ′ -measurable random variable for fixed A ∈ F ′′ , and

(b) Qω′ is a probability measure on (Ω′′ , F ′′ ) for fixed ω ′ ∈ Ω′ .
3.12 Theorem. Let P be a probability measure on (Ω, F ), let T be an F0 -
stopping time, and let Q be an FT0 -measurable probability kernel from (Ω, F )
to (C0 (R+ ; Rd ), C0 ). Then there exists a unique probability measure on
(Ω, F ), denoted P⊕T Q, such that

(a) P⊕T Q[A] = P[A] for all A ∈ FT0 , and

(b) the map ω 7→ δω ⊕T (ω) Qω [B] is a version of P⊕T Q[B | FT0 ] for each
B ∈ F.

Proof. Let Q̂ : Ω×F → [0, 1] denote the map (ω, A) 7→ δω ⊕T (ω) Qω [A].
We first show that Q̂ is an FT0 -measurable probability kernel from (Ω, F ) to
(Ω, F ).

Let A , {A ∈ F : Q̂[A] is FT0 -measurable}. If A ∈ F , then Q̂ω [Ac ] =
1 − Q̂ω [A] because Q̂ω is a probability measure for each fixed ω. In particular,
if A ∈ A , then Q̂[Ac ] = 1 − Q̂[A], so Ac ∈ A . Similarly, if An ∈ A and the
An are disjoint, then Q̂[∪n An ] = Σn Q̂[An ], so ∪n An ∈ A . We have now
shown that A is a λ-system.

Set

    B , {F ∈ F : F = A ∩ B for some A ∈ FT0 and B = {∆(X, T ) ∈ C}},

and take F ∈ B. Then Q̂[F ] = Q̂[A ∩ B] = 1A Q[C] ∈ FT0 by the previous
lemma, so B ⊂ A . B is closed with respect to finite intersections and
σ(B) ⊃ FT0 ∨ σ(∆(X, T )) = F (e.g., Lem. A.3), so F ⊂ A by the π-λ
theorem, and Q̂[A] is FT0 -measurable for all A ∈ F . As Q̂ω is a probability
measure for fixed ω by construction, Q̂ is an FT0 -measurable probability
kernel on (Ω, F ).

Now define the measure Q[F ] , EP [Q̂[F ]] for F ∈ F . If A ∈ FT0 , then

    Q[A] = EP [Q̂[A]] = EP [1A ] = P[A],

where we use the second property in Lem. 3.10. Therefore Q has property
(a). If A ∈ FT0 and B ∈ F , then

    Q[A ∩ B] = EP [Q̂[A ∩ B]] = EP [1A Q̂[B]] = EQ [1A Q̂[B]],

where we have used the last property in Lem. 3.10, the fact that Q̂[B] ∈ FT0 ,
and the fact that Q and P agree on FT0 . As A ∈ FT0 was arbitrary, we have
now shown that Q has property (b) of Thm. 3.12.

The uniqueness is evident, as any other measure R with these properties
must assign measure

    R[B] = ER [R[B | FT0 ]] = EP [Q̂[B]]

to any set B ∈ F .
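Thm. 3.12 has a direct sampling interpretation: draw the history from P, evaluate T , then draw the continuation from the kernel and graft it on at T . The sketch below does this on a time grid with concrete stand-ins (a Gaussian random walk for P, a first-passage time for T , and a kernel whose step size depends on the position at T ); all of these choices and names are our own illustration, not data from the dissertation.

```python
import numpy as np

rng = np.random.default_rng(1)
n, dt = 500, 0.01
grid = np.arange(n + 1) * dt

def draw_P():
    """One path from the initial law P (a Gaussian random walk from 0)."""
    return np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(dt), n))))

def T_index(x):
    """Grid index of the stopping time: first time |x| >= 1, else the horizon."""
    hits = np.nonzero(np.abs(x) >= 1.0)[0]
    return int(hits[0]) if hits.size else n

def draw_Q(x, k):
    """Continuation from the kernel Q_w; here its scale depends on x at T."""
    scale = 2.0 if x[k] > 0.0 else 0.5
    return np.concatenate(([0.0], np.cumsum(rng.normal(0.0, scale * np.sqrt(dt), n))))

def sample_patched():
    """One draw from P ⊕_T Q: history from P, continuation from Q, glued at T."""
    x = draw_P()
    k = T_index(x)
    y = draw_Q(x, k)
    z = x.copy()
    z[k:] = x[k] + y[: n + 1 - k]
    return x, k, z

x, k, z = sample_patched()
```

Property (a) of the theorem is visible pathwise: each sample agrees with its P-draw up to and including T , while the law of what happens after T is dictated by the kernel.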

The previous construction connects an initial law with a collection of


probability kernels. In our application, the collection of probability kernels
will be generated by conditioning a probability measure, so we will need the
following definition.


3.13 Definition. Let P be a probability measure on (Ω′ , F ′ ) and fix some
G ′ ⊂ F ′ . We say that Q is a conditional probability distribution for P
given G ′ if

(a) Q is a G ′ -measurable probability kernel from (Ω′ , F ′ ) into (Ω′ , F ′ ),
and

(b) Q[A] is a version of P[A | G ′ ] for all A ∈ F ′ .

In addition, we say that Q is regular if there exists a P-null set N such that
Qω′ [G] = 1G (ω ′ ) for all G ∈ G ′ and ω ′ ∉ N .

We recall the following result.

3.14 Theorem. Let (Ω′ , F ′ ) be a Polish space with its Borel σ-field, and
let P be a probability measure on (Ω′ , F ′ ). If we fix some G ′ ⊂ F ′ , then a
conditional probability distribution for P given G ′ exists. Moreover, if G ′ is
countably generated, then we may choose a regular version.

For proof, one may consult [SV79] Thm 1.1.6 and Thm 1.1.8.

3.15 Corollary. Let P1 and P2 be probability measures on Ω, let T be an
F0 -stopping time, and let G ⊂ FT0 with P1 |G ≪ P2 |G . Then there exists a
unique measure, denoted P1 ⊗T,G P2 , such that

(a) P1 ⊗T,G P2 [A] = P1 [A] for any A ∈ FT0 , and

(b) any version of P2 [B | G ] is a version of P1 ⊗T,G P2 [B | FT0 ] for all
B ∈ σ(G , ∆(X, T )).

In particular, if P1 and P2 agree when restricted to G , then P1 ⊗T,G P2 and P2
agree when restricted to σ(G , ∆(X, T )).

3.16 Remark. If G = σ(XT ), then σ(G , ∆(X, T )) = σ(Θ(X, T )), so we can
read (b) as saying that X has a strong Markov-like property at the stopping
time T under the measure P1 ⊗T,G P2 .

Proof. Let Q̃ be a regular conditional probability distribution of P2 conditioned
on G (which exists as the conditions of Thm. 3.14 are satisfied), and
let Q(ω, C) , Q̃[ω, {∆(X, T ) ∈ C}] for ω ∈ Ω and C ∈ C0 . Notice that
Q is a G -measurable probability kernel from (Ω, F ) to (C0 (R+ ; Rd ), C0 ), so
we may define P̂ , P1 ⊕T Q as our candidate for P1 ⊗T,G P2 . Property (a) is
simply (a) of Thm. 3.12.

To show that P̂ has property (b), consider the classes of sets

    A , {A ∈ F : there exists a version of P2 [A | G ] which
                  is also a version of P̂[A | FT0 ]}, and
    B , {B ∈ F : B = G ∩ {∆(X, T ) ∈ C} for some G ∈ G and C ∈ C0 }.

Now fix some B = G ∩ {∆(X, T ) ∈ C} ∈ B. By (b) of Thm. 3.12, the map
ω 7→ δω ⊕T (ω) Qω [B] is a version of P̂[B | FT0 ]. But

    δω ⊕T (ω) Qω [B] = δω ⊕T (ω) Qω [G ∩ {∆(X, T ) ∈ C}]
                    = 1G (ω) Q(ω, C)
                    = 1G (ω) Q̃[ω, {∆(X, T ) ∈ C}],

where these equalities hold for all ω and we have used (b) of Lem. 3.10. We
can then conclude that 1G Q̃[∆(X, T ) ∈ C] is a version of P̂[B | FT0 ], and
we already know that 1G Q̃[∆(X, T ) ∈ C] is a version of P2 [G ∩ {∆(X, T ) ∈
C} | G ] = P2 [B | G ], so B ⊂ A . But A is a σ-field and B is closed with
respect to intersection, so σ(B) = σ(G , ∆(X, T )) ⊂ A .

We now have the existence of a common version, but we still need to show
that every version of P2 [A | G ] actually works. Fix A ∈ σ(G , ∆(X, T )), let Y
be any version of P2 [A | G ], and let Z be a version of P2 [A | G ] which is also
a version of P̂[A | FT0 ]. Then P2 [Y ≠ Z] = 0 implies P1 [Y ≠ Z] = 0 as
P1 |G ≪ P2 |G , but then P̂[Y ≠ Z] = 0 by (a), so Y is a version of P̂[A | FT0 ]
and we are done.

3.17 Remark. If we did not have P1 |G ≪ P2 |G , we could still attempt to
define some measure “P1 ⊗T,G P2 ”; however, we could not hope for uniqueness.
In this case, P1 charges events in G that do not happen under P2 , so
a conditional probability distribution of P2 conditioned on G can be defined
arbitrarily (up to measurability requirements) on these events.

3.18 Example. We specialize Example 3.8 to the case n = 1. In this case,
we have T = t1 , G = σ(Yt1 ), and Y and C admit the relatively explicit
representations

    Yt =  √c1 Wt                   for t ≤ t1 and U0 < 1/2,
          √c2 Wt                   for t ≤ t1 and U0 ≥ 1/2,
          Yt1 + √c1 (Wt − Wt1 )    for t > t1 and
                 U1 < η(Yt1 , t1 c1 ) / (η(Yt1 , t1 c1 ) + η(Yt1 , t1 c2 )), and
          Yt1 + √c2 (Wt − Wt1 )    for t > t1 and
                 U1 ≥ η(Yt1 , t1 c1 ) / (η(Yt1 , t1 c1 ) + η(Yt1 , t1 c2 )),

and

    Ct =  tc1                      for t ≤ t1 and U0 < 1/2,
          tc2                      for t ≤ t1 and U0 ≥ 1/2,
          Ct1 + (t − t1 )c1        for t > t1 and
                 U1 < η(Yt1 , t1 c1 ) / (η(Yt1 , t1 c1 ) + η(Yt1 , t1 c2 )), and
          Ct1 + (t − t1 )c2        for t > t1 and
                 U1 ≥ η(Yt1 , t1 c1 ) / (η(Yt1 , t1 c1 ) + η(Yt1 , t1 c2 )).

Recall that in Example 3.8 we set P , L(Ỹ , C̃), where Ỹ and C̃ were defined
as in Example 2.10. Then Q , L(Y, C) = P ⊗t1 ,σ(Yt1 ) P. It is clear that Q and P
agree on Ft01 , and this is property (a) of Cor. 3.15. Using the independence
of the increments of W , the fact that U1 is independent of U0 , and the
representation of Y and C above, we see that ∆(X, t1 ), where X = (Y, C),
only depends upon Ft01 through the value of Yt1 . This is property (b) of
Cor. 3.15.

3.2 Properties Preserved by the Binary Construction
The results given in this section share the following assumptions.
3.19 Assumption. Let P1 and P2 be probability measures on Ω, let T be an
F0 -stopping time, let G ⊂ FT0 with P1 |G ≪ P2 |G , and set P12 , P1 ⊗T,G P2 .
3.20 Lemma. In addition to the assumptions of 3.19, let A be an Rd -valued,
continuous process such that ∆(A, T ) is σ(G , ∆(X, T ))-measurable. Then the
following two implications hold.

(a) If AT is P1 -a.s. of finite variation and ∆(A, T ) is P2 -a.s. of finite
variation, then A is P12 -a.s. of finite variation.

(b) If AT is P1 -a.s. absolutely continuous and ∆(A, T ) is P2 -a.s. absolutely
continuous, then A is P12 -a.s. absolutely continuous.


Proof. Set

    F V d , {x ∈ C0 (R+ ; Rd ) : x ∈ BV([0, t]; Rd ) for all t}, and
    AC d , {x ∈ C0 (R+ ; Rd ) : x ∈ AC([0, t]; Rd ) for all t}.

These are both Borel measurable subsets of C0 (R+ ; Rd ) (e.g., Cor. C.9 and
Cor. C.11). Fixing any ω = (e, x) ∈ Ω, we see that ∇(x, T (ω)) ∈ BV([0, t]; Rd )
and ∆(x, T (ω)) ∈ BV([0, (t − T (ω)) ∨ 0]; Rd ) imply that x ∈ BV([0, t]; Rd ),
so

    {AT ∈ F V d } ∩ {∆(A, T ) ∈ F V d } ⊂ {A ∈ F V d }.

As P2 [∆(A, T ) ∈ F V d ] = 1, 1 is a version of P2 [∆(A, T ) ∈ F V d | G ], and we
may apply the properties of P12 listed in Cor. 3.15 to conclude that

    P12 [A ∈ F V d ] ≥ E12 [1{AT ∈F V d } 1{∆(A,T )∈F V d } ]
                   = E12 [1{AT ∈F V d } E12 [1{∆(A,T )∈F V d } | FT0 ]]
                   = E1 [1{AT ∈F V d } E2 [1{∆(A,T )∈F V d } | G ]]
                   = 1.

This is (a).

Similarly, if ω = (e, x) ∈ Ω, ∇(x, T (ω)) ∈ AC([0, t]; Rd ), and ∆(x, T (ω)) ∈
AC([0, (t − T (ω)) ∨ 0]; Rd ), then x ∈ AC([0, t]; Rd ), so we may replace F V d
with AC d in the previous argument to get (b).
The following corollary is often easier to use.
3.21 Corollary. In addition to the assumptions of 3.19, let A be an Rd -
valued, continuous process such that ∆(A, T ) is σ(G , ∆(X, T ))-measurable.
Then the following two implications hold.
(a) If A is Pi -a.s. of finite variation for i ∈ {1, 2}, then A is P12 -a.s. of
finite variation.

(b) If A is Pi -a.s. absolutely continuous for i ∈ {1, 2}, then A is P12 -a.s.
absolutely continuous.
Proof. We have {A ∈ F V d } ⊂ {AT ∈ F V d }, so P1 [A ∈ F V d ] = 1 implies
P1 [AT ∈ F V d ] = 1. We also have {A ∈ F V d } ⊂ {∆(A, T ) ∈ F V d }, so
P2 [A ∈ F V d ] = 1 implies P2 [∆(A, T ) ∈ F V d ] = 1. Assertion (a) then


follows from (a) of Lem. 3.20, and essentially the same argument shows that
(b) follows from (b) of Lem. 3.20.

3.22 Lemma. In addition to the assumptions of 3.19, let A be an F0 -
adapted, Rd -valued, continuous process such that ∆(A, T ) is σ(G , ∆(X, T ))-
measurable, and let a be an Rd -valued, measurable process such that the set

    B(ω) , {t ∈ R+ : (∂/∂t)At (ω) exists and at (ω) ≠ (∂/∂t)At (ω)}

has Lebesgue measure 0 for all ω. Further assume that S is an R+ -valued,
F0 -stopping time such that (S − T )+ is σ(G , ∆(X, T ))-measurable. If we
have Pi [At = ∫_0^t au du ∀t ∈ R+ ] = 1 for i ∈ {1, 2} and P1 |G = P2 |G , then
P12 [At = ∫_0^t au du ∀t ∈ R+ ] = 1 and, for any nonnegative, Borel measurable
function f ,

(3.23)    E12 [∫_0^S f (au ) du] = E1 [∫_0^{T ∧S} f (au ) du] + E2 [∫_T^{T ∨S} f (au ) du].

3.24 Remark. Each B(ω) is automatically Lebesgue measurable as it is a


null set. We do not require these sets to be Borel measurable.

3.25 Remark. S = ∞ always satisfies the requirements of this lemma.

Proof. It follows from the previous corollary that A is P12 -a.s. absolutely
continuous. As a(ω) is a version of the derivative for each ω, we must have
P12 [At = ∫_0^t au du for all t] = 1.

By taking divided differences of the process AT , we may find a σ(AT )⊗R+ -
measurable process a1 such that a1t (ω) = (∂/∂t)ATt (ω) whenever this derivative
exists. Similarly, there exists a σ(∆(A, T ))⊗R+ -measurable process a2
such that a2t (ω) = (∂/∂t)∆t (A, T ) whenever this derivative exists (e.g., take
Ft0 = σ(AT ) or Ft0 = σ(∆(A, T )) for all t in Lem. C.10).

Now define the sets

    B 1 (ω) , {t ∈ R+ : (∂/∂t)ATt (ω) does not exist}, and
    B 2 (ω) , {t ∈ R+ : (∂/∂t)∆t (A(ω), T (ω)) does not exist}.

If 0 < t < T (ω) and t ∉ B 1 (ω), then (∂/∂t)ATt (ω) exists and AT (ω) and A(ω)
agree in some neighborhood of t, so (∂/∂t)At (ω) exists and agrees with
(∂/∂t)ATt (ω). In particular, if 0 < t < T (ω) and t ∉ B(ω) ∪ B 1 (ω), then
at (ω) = a1t (ω). If ω ∈ {AT ∈ F V d }, then B(ω) ∪ B 1 (ω) has Lebesgue
measure zero, so at (ω) and a1t (ω) agree for Lebesgue-a.e. t ∈ (0, T (ω)).
This means that

(3.26)    1{AT ∈F V d } ∫_0^{T ∧S} f (au ) du = 1{AT ∈F V d } ∫_0^{T ∧S} f (a1u ) du

for all ω, where we use the extended integral of Rem. 1.7. The process
a1 1[0,T ∧S] is FT0 ⊗R+ -measurable, so ∫_0^{T ∧S} f (a1u ) du is FT0 -measurable by
Fubini's Theorem, and the same holds for the left-hand side of (3.26).
Similar pathwise arguments show that aT (ω)+t (ω) = a2t (ω) if t > 0, T (ω) +
t ∉ B(ω), and t ∉ B 2 (ω). If ω ∈ {∆(A, T ) ∈ F V d }, then aT (ω)+t (ω) and
a2t (ω) agree for Lebesgue-a.e. t, so

(3.27)    1{∆(A,T )∈F V d } ∫_T^{T ∨S} f (au ) du = 1{∆(A,T )∈F V d } ∫_0^{(S−T )+} f (a2u ) du

for all ω. The process a2 1[0,(S−T )+ ] is σ(G , ∆(X, T ))⊗R+ -measurable, so the
right-hand side of (3.27) is σ(G , ∆(X, T ))-measurable by Fubini's Theorem
and the assumptions that ∆(A, T ) and (S − T )+ are σ(G , ∆(X, T ))-measurable.

We have assumed that P2 [At = ∫_0^t au du for all t] = 1, so A is absolutely
continuous P2 -a.s. and 1{∆(A,T )∈F V d } ∫_T^{T ∨S} f (au ) du = ∫_T^{T ∨S} f (au ) du P2 -a.s.
This means that any version of P2 [1{∆(A,T )∈F V d } ∫_T^{T ∨S} f (au ) du | G ] is also a
version of P2 [∫_T^{T ∨S} f (au ) du | G ].

We have assumed that Pi [At = ∫_0^t au du for all t] = 1 for i ∈ {1, 2}, so
we may apply the previous corollary to conclude that A is P12 -a.s. absolutely
continuous. As {A ∈ AC d } ⊂ {AT ∈ AC d } and {A ∈ AC d } ⊂ {∆(A, T ) ∈
AC d }, P12 [AT ∈ AC d ] = P12 [∆(A, T ) ∈ AC d ] = 1.

We now use (3.26), (3.27), the properties stated in Cor. 3.15, and the
assumption that P1 |G = P2 |G to write

    E12 [∫_0^S f (au ) du] = E12 [1{AT ∈F V d } ∫_0^{T ∧S} f (au ) du]
                              + E12 [1{∆(A,T )∈F V d } ∫_T^{T ∨S} f (au ) du]
                         = E1 [1{AT ∈F V d } ∫_0^{T ∧S} f (au ) du]
                              + E1 [E2 [1{∆(A,T )∈F V d } ∫_T^{T ∨S} f (au ) du | G ]]
                         = E1 [∫_0^{T ∧S} f (au ) du] + E2 [∫_T^{T ∨S} f (au ) du].

3.28 Example. Resume the setting of Example 3.18 and take A = C in
Lem. 3.22. Recall that in this case we have T = t1 and

    σ(G , ∆(X, t1 )) = σ(Yt1 , ∆(X, t1 )) = σ(Θ(Y, t1 ), ∆(C, t1 )),

so ∆(C, T ) is σ(G , ∆(X, T ))-measurable. Define the constant-in-time process
ct , C1 . Then under P , L(Ỹ , C̃), we have P[Ct = ∫_0^t cs ds ∀t] = 1, but under
Q , P ⊗t1 ,σ(Yt1 ) P = L(Y, C), we have Q[Ct = ∫_0^t cs ds ∀t] < 1. In particular,
{Ct = ∫_0^t cs ds ∀t} is the event where we choose the same volatility over the
interval [0, t1 ) as we choose over the interval (t1 , ∞). This shows that Q is
not absolutely continuous with respect to P. This example also shows why
we must require a to agree with the derivative of A at all ω, rather than just
P-a.e. ω.
3.29 Lemma. In addition to the assumptions of 3.19, let F̂0 , {F̂t0 } where
F̂t0 , FT0 +t , and let M be a continuous, real-valued process with M0 = 0. Set
M̂ , ∆(M, T ) and assume that M̂ is σ(G , ∆(X, T ))-measurable. If M T is
an (F0 , P1 )-local martingale and M̂ is an (F̂0 , P2 )-local martingale, then M
is an (F0 , P12 )-local martingale.
Proof. First we note that if S is an F0 -stopping time, then S ∨ T − T is an
F̂0 -stopping time and FS0 ⊂ F̂0_{S∨T −T} . To see this, notice that T + t is an
F0 -stopping time, so the first claim follows from the equalities

    {S ∨ T − T ≤ t} = {S ≤ T + t} ∈ FT0 +t = F̂t0 .

If A ∈ FS0 , then the same chain of equalities gives

    A ∩ {S ∨ T − T ≤ t} = A ∩ {S ≤ T + t} ∈ FT0 +t = F̂t0 .

From now on, if S is an F0 -stopping time, then Ŝ , S ∨ T − T will denote
the corresponding F̂0 -stopping time.
We first show that M̂ is an (F̂0 , P12 )-local martingale. Let

    T2n , inf{t ≥ T : |Mt − MT | ≥ n}.

As this is the hitting time of a closed set, T2n is an F0 -stopping time. Notice
that in this case, T̂2n = inf{t ≥ 0 : |∆t (M, T )| ≥ n}, so T̂2n is σ(∆(M, T ))-
measurable, as is M̂^{T̂2n} = ∇(∆(M, T ), T̂2n ). Also notice that M̂ n , M̂^{T̂2n}
is bounded by n, so M̂ n is an (F̂0 , P2 )-martingale. We write Z ∈ bF to mean
that Z is a bounded F -measurable random variable. For 0 ≤ s ≤ t < ∞, let

    A , {Z ∈ bF : E12 [Z(M̂^n_t − M̂^n_s )] = 0}, and
    B , {Z ∈ bF : Z = Z1 Z2 with Z1 ∈ bF̂00 and Z2 ∈ bσ(∆s (X, T ))}.

If we fix some Z = Z1 Z2 in B, then

    E12 [Z(M̂^n_t − M̂^n_s )] = E12 [Z1 E12 [Z2 (M̂^n_t − M̂^n_s ) | F̂00 ]]
                            = E1 [Z1 E2 [Z2 (M̂^n_t − M̂^n_s ) | G ]]
                            = E1 [Z1 E2 [Z2 E2 [M̂^n_t − M̂^n_s | F̂s0 ] | G ]] = 0,

where the second equality follows from (b) of Cor. 3.15 and the innermost
conditional expectation vanishes because M̂ n is an (F̂0 , P2 )-martingale. This
means that B ⊂ A , but A is a monotone class (by bounded convergence)
and B is closed with respect to forming finite products, so σ(B) ⊂ A by a
monotone class argument. As

    F̂s0 = FT0 +s = σ(FT0 , ∆(X T +s , T )) = σ(FT0 , ∆s (X, T )) ⊂ σ(B)

by Lem. A.3, we conclude that M̂ n is an (F̂0 , P12 )-martingale. As M̂ is
continuous, we have T̂2n → ∞ everywhere, so T̂2n is a localizing sequence and
M̂ is an (F̂0 , P12 )-local martingale.
Now we show that M is an (F0 , P12 )-local martingale. Retaining the
previous notation, also define

    T1n , inf{0 ≤ t ≤ T : |Mt | ≥ n}, and
    T n , T1n ∧ T2n .

Notice that {T1n = ∞} = {sup_{s≤T} |Ms | < n} and T n is an F0 -stopping time.
M n , M^{T n} is bounded by 2n, as it is bounded by n on the interval [0, T ] and
can potentially make a move of size n after T before getting stopped. We
now show that M n is an (F0 , P12 )-martingale. To this end, fix some s < t
and bounded Z ∈ Fs0 . Then

    E12 [Z(M^n_t − M^n_s )] = E12 [Z 1{s≤T } (M^n_t − M^n_{t∧T} )] + E12 [Z 1{s≤T } (M^n_{t∧T} − M^n_s )]
                              + E12 [Z 1{T <s} (M^n_t − M^n_s )]
                            , A + B + C.

To see that A = 0, first notice that

    M^n_t − M^n_{t∧T} = { M^{T2n}_t − M^{T2n}_T   if T1n = ∞ and T < t, and
                        { 0                        if T1n ≤ T or t ≤ T,

so that

    M^n_t − M^n_{t∧T} = 1{T <T1n } (M^{T2n}_{t∨T} − M^{T2n}_T ) = 1{T <T1n } (M̂^n_{t̂} − M̂^n_0 ),

where t̂ , t ∨ T − T is an F̂0 -stopping time. As Z 1{s≤T <T1n } ∈ FT0 = F̂00 ,

    A = E12 [Z 1{s≤T <T1n } (M̂^n_{t̂} − M̂^n_0 )] = 0

follows from the fact that M̂ n is an (F̂0 , P12 )-martingale.

To see that B = 0, notice that Z 1{s≤T } (M^n_{t∧T} − M^n_s ) is FT0 -measurable,
so we may apply (a) of Cor. 3.15 and write

    B = E1 [Z 1{s≤T } (M^n_{t∧T} − M^n_s )] = 0,

where we have also used the fact that M^{T ∧T n} is a bounded (F0 , P1 )-local
martingale, so it is in fact a martingale.

Finally, we show that C = 0. Let ŝ , s ∨ T − T and t̂ , t ∨ T − T . Notice
that Z 1{T <s} ∈ Fs0 ⊂ F̂ŝ0 . Then

    Z 1{T <s} (M^n_t − M^n_s ) = Z 1{T <s∧T1n } (M^{T2n}_{t∨T} − M^{T2n}_{s∨T} )
                               = Z 1{T <s∧T1n } (M̂^n_{t̂} − M̂^n_{ŝ} ).

Finally, notice that {T < T1n } ∈ FT0 = F̂00 , so Z 1{T <s∧T1n } ∈ F̂ŝ0 . We then
write

    C = E12 [Z 1{T <s∧T1n } (M̂^n_{t̂} − M̂^n_{ŝ} )] = 0,

using the fact that M̂ n is an (F̂0 , P12 )-martingale, as shown above.

To conclude the proof, we again note that T n → ∞ everywhere as M
is continuous, so T n is a localizing sequence, and M is an (F0 , P12 )-local
martingale.

To apply the previous lemma, one must know that M̂ is an (F̂0 , P2 )-local
martingale. While this looks like an unpleasant property to check, the
following corollary shows that this condition is automatically satisfied when M
is an (F0 , P2 )-local martingale.

3.30 Corollary. Let M be a continuous, real-valued process, and set F̂0 ,
{F̂t0 }, where F̂t0 , FT0 +t . If M is an (F0 , P)-local martingale, then M̂ ,
∆(M, T ) is an (F̂0 , P)-local martingale.

Proof. Take F̂0 , T1n , T2n , T̂2n , and T n as in the proof of the previous lemma.
Let M̂ n , M̂^{T̂2n} , M n , M^{T2n} , and M n,m , M^{T2n ∧T m} . M̂ n is bounded by n,
and M n,m is bounded by m + (n ∧ m). In particular, M n,m is an (F0 , P)-
martingale. Notice that if m ≥ n, then {T m ≥ T2n } = {T1m > T } ∈ FT0 . In
prose, when m ≥ n, the only way for T m to happen strictly before T2n is for
the process to make a move of at least size m before time T . Also notice
that if m ≥ n, then M n = M n,m on the set {T m ≥ T2n }.

We are now ready to show that M̂ n is an (F̂0 , P)-martingale. Fix s < t and
bounded Z ∈ F̂s0 = FT0 +s . Notice that Z 1{T m ≥T2n } ∈ F̂s0 = FT0 +s , and write

    E[Z(M̂^n_t − M̂^n_s )] = E[Z(M^n_{T +t} − M^n_{T +s} )]
                          = lim_m E[Z 1{T m ≥T2n } (M^{n,m}_{T +t} − M^{n,m}_{T +s} )]
                          = 0,

where we have used bounded convergence and the fact that M n,m is an (F0 , P)-
martingale. This means that T̂2n is a localizing sequence, and M̂ is an (F̂0 , P)-
local martingale.

Combining Cor. 3.30 and Lem. 3.29 yields the following corollary.

3.31 Corollary. In addition to the assumptions of 3.19, let M be a continuous,
real-valued process. Suppose that M is a local martingale with respect
to both (F, P1 ) and (F, P2 ) and that ∆(M, T ) is σ(G , ∆(X, T ))-measurable.
Then M is an (F, P12 )-local martingale.

Before we present the corresponding result for quadratic variation, we


give an easy lemma.

3.32 Lemma. Let M be a uniformly integrable (F0 , P)-martingale, and let


S, T , and U be F0 -stopping times with T ≤ U . If Z is an FT0 -measurable
random variable, then

E[(MU − MT ) Z | FS0 ] = (MU ∧S − MT ∧S ) Z.

3.33 Remark. We often apply this result with Z = YT for some process
Y . In this situation, we have (MU ∧S − MT ∧S ) YT = (MU ∧S − MT ∧S ) YT ∧S as
MU ∧S − MT ∧S is only nonzero if T < S.


Proof. We write

    E[(MU − MT ) Z | FS0 ]
      = 1{S≤T } E[(MU − MT ) Z | FS0 ] + 1{T <S≤U } E[(MU − MT ) Z | FS0 ]
          + 1{U <S} E[(MU − MT ) Z | FS0 ]
      = 1{S≤T } E[E[MU − MT | FT0 ] Z | FS0 ] + 1{T <S≤U } (E[MU | FS0 ] − MT ) Z
          + 1{U <S} (MU − MT ) Z
      = 1{T <S≤U } (MS − MT ) Z + 1{U <S} (MU − MT ) Z
      = (MU ∧S − MT ∧S ) Z,

where the first term vanishes because E[MU − MT | FT0 ] = 0.
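Because Lem. 3.32 only uses optional sampling, it can be checked exactly on a toy example. The sketch below (our own check, not part of the text) enumerates all paths of a four-step ±1 random walk, which is a bounded and hence uniformly integrable martingale, and verifies the identity on every atom of FS0 for the deterministic times T = 1, S = 2, U = 3 and Z = MT .

```python
from itertools import product

T, S, U = 1, 2, 3                              # deterministic times, T <= U
STEPS = 4
paths = list(product((-1, 1), repeat=STEPS))   # all 16 equally likely paths

def M(p, k):
    """The walk M_k = sum of the first k steps (a martingale)."""
    return sum(p[:k])

# F_S-atoms are determined by the first S steps; on each atom the conditional
# expectation E[(M_U - M_T) Z | F_S] is the plain average over the atom.
for prefix in product((-1, 1), repeat=S):
    atom = [p for p in paths if p[:S] == prefix]
    Z = M(atom[0], T)                          # Z is F_T- (hence F_S-) measurable
    lhs = sum((M(p, U) - M(p, T)) * Z for p in atom) / len(atom)
    rhs = (M(atom[0], min(U, S)) - M(atom[0], min(T, S))) * Z
    assert lhs == rhs
```

The identity holds exactly on each atom because the post-S steps average to zero, which is the same cancellation that optional sampling provides in the proof above.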

3.34 Corollary. In addition to the assumptions of 3.19, let M 1 , M 2 , and C
be continuous, real-valued processes, and assume that ∆(M 1 , T ), ∆(M 2 , T ),
and ∆(C, T ) are all σ(G , ∆(X, T ))-measurable. If M i is a local martingale
under both P1 and P2 for i ∈ {1, 2}, and M 3 , M 1 M 2 − C is a local
martingale under both P1 and P2 , then M 3 is a local martingale under P12 .
Proof. We cannot apply Lem. 3.29 directly to M 3 , as we do not assume that
∆(M 3 , T ) is σ(G , ∆(X, T ))-measurable. Instead, we define the process

    Yt , M^1_{t∧T} M^2_{t∧T} + (M^1_t − M^1_{t∧T} )(M^2_t − M^2_{t∧T} ) − Ct .

Notice that ∆(Y, T ) = ∆(M 1 , T )∆(M 2 , T ) − ∆(C, T ) ∈ σ(G , ∆(X, T )). Now
let

    T n , inf{t ≥ 0 : |M^i_t | ≥ n for some i ∈ {1, 2, 3} or |Ct | ≥ n},

and define M i,n , (M i )^{T n} for i ∈ {1, 2, 3}, C n , C^{T n} , and Y n , Y^{T n} . Notice
that M i,n is bounded by n for i ∈ {1, 2, 3} and Y n is bounded by 5n2 + n.

We now show that Y n is a martingale under P1 and P2 . For 0 ≤ s ≤ t,
we write

(3.35)    Ei [(M^{1,n}_t − M^{1,n}_{t∧T} ) M^{2,n}_{t∧T} | Fs0 ] = (M^{1,n}_s − M^{1,n}_{s∧T} ) M^{2,n}_{s∧T} ,

where we have applied the previous lemma with M = M 1,n , Z = M^{2,n}_{t∧T} ,
S = s, T = t ∧ T , and U = t, and used the fact that M^{1,n}_s − M^{1,n}_{s∧T} is zero
if T ≥ s. Clearly the same equality holds if we reverse the roles of M 1,n and
M 2,n . Now we use the fact that M 3,n is a martingale to write

    Ei [Y^n_t | Fs0 ] = Ei [M^{3,n}_t | Fs0 ] − Ei [(M^{1,n}_t − M^{1,n}_{t∧T} ) M^{2,n}_{t∧T} | Fs0 ]
                          − Ei [(M^{2,n}_t − M^{2,n}_{t∧T} ) M^{1,n}_{t∧T} | Fs0 ]
                     = M^{3,n}_s − (M^{1,n}_s − M^{1,n}_{s∧T} ) M^{2,n}_{s∧T} − (M^{2,n}_s − M^{2,n}_{s∧T} ) M^{1,n}_{s∧T}
                     = Y^n_s ,

so Y is a local martingale under P1 and P2 .

By Cor. 3.31, M 1 , M 2 , and Y are all P12 -local martingales, so the stopped
versions are P12 -martingales. This means that (3.35) holds for P12 as well,
and we can run the above argument in the opposite direction. In particular,

    E12 [M^{3,n}_t | Fs0 ] = E12 [Y^n_t | Fs0 ] + E12 [(M^{1,n}_t − M^{1,n}_{t∧T} ) M^{2,n}_{t∧T} | Fs0 ]
                               + E12 [(M^{2,n}_t − M^{2,n}_{t∧T} ) M^{1,n}_{t∧T} | Fs0 ]
                          = Y^n_s + (M^{1,n}_s − M^{1,n}_{s∧T} ) M^{2,n}_{s∧T} + (M^{2,n}_s − M^{2,n}_{s∧T} ) M^{1,n}_{s∧T}
                          = M^{3,n}_s .

We conclude that M 3 is a P12 -local martingale.


Putting all of this together, we see that this construction preserves the
characteristics of continuous semimartingales. Recall that a continuous, Rd -
valued semimartingale X is said to admit the characteristics (B, C) with
respect to (F, P) if we can write X = X0 + M + B, where M is a continuous
(F, P)-local martingale with M0 = 0, B is a continuous process with B0 = 0
that is P-a.s. of finite variation, and C = ⟨M ⟩.
3.36 Corollary. In addition to the assumptions of 3.19, let Y be a continuous,
Rd -valued process that is a semimartingale admitting the characteristics
(B, C) with respect to both (F, P1 ) and (F, P2 ). If ∆(Y, T ), ∆(B, T ), and
∆(C, T ) are all σ(G , ∆(X, T ))-measurable, then Y admits the characteristics
(B, C) with respect to (F, P12 ).
Proof. As B and C are both P1 - and P2 -a.s. of finite variation, we may apply
(a) of Cor. 3.21 to conclude that B and C are both P12 -a.s. of finite variation.
We then notice that ∆(Y − B, T ) = ∆(Y, T ) − ∆(B, T ) ∈ σ(G , ∆(X, T )), so
we may apply Cor. 3.31 to each component of M , Y − Y0 − B to conclude that
M is an (F, P12 )-local martingale. We then apply Cor. 3.34 to each component
of M ⊗M − C to conclude that M ⊗M − C is an (F, P12 )-local martingale. As
C is P12 -a.s. of finite variation, we conclude that ⟨M ⟩ = C.

3.37 Example. Resume the setting of Example 3.18. As

  σ(G, ∆(X, t_1)) = σ(Y_{t_1}, ∆(X, t_1)) = σ(Θ(Y, t_1), ∆(C, t_1)),

∆(Y, t_1) and ∆(C, t_1) are both σ(G, ∆(X, t_1))-measurable. Notice that under
both P := L(Ỹ, C̃) and Q := P ⊗_{t_1, σ(Y_{t_1})} P = L(Y, C), Y is a continuous
semimartingale with the characteristics (0, C).

3.3 The General Construction.


We get the general construction announced at the beginning of this chapter
by repeated application of the binary construction. Fortunately, the binary
construction is associative.

3.38 Theorem. Let P^1, P^2, and P^3 be measures on Ω, and let S ≤ T be
finite F^0-stopping times with T − S ∈ σ(G, ∆(X, S)). Fix σ-fields G ⊂ F^0_S
and H ⊂ σ(G, ∆(X^T, S)). If P^1|_G ≪ P^2|_G and P^2|_H ≪ P^3|_H, then

(a) P^1|_G ≪ (P^2 ⊗_{T,H} P^3)|_G,

(b) (P^1 ⊗_{S,G} P^2)|_H ≪ P^3|_H, and

(c) (P^1 ⊗_{S,G} P^2) ⊗_{T,H} P^3 = P^1 ⊗_{S,G} (P^2 ⊗_{T,H} P^3).

Proof. To reduce the now rather burdensome notation, we set

  P^{12} := P^1 ⊗_{S,G} P^2,
  P^{23} := P^2 ⊗_{T,H} P^3,
  P^{12,3} := (P^1 ⊗_{S,G} P^2) ⊗_{T,H} P^3, and
  P^{1,23} := P^1 ⊗_{S,G} (P^2 ⊗_{T,H} P^3).

(a) Fix A ∈ G with P^1[A] > 0. A ∈ F^0_T and (a) of Cor. 3.15 imply
P^{23}[A] = P^2[A], and P^2[A] > 0 as P^1|_G ≪ P^2|_G.


(b) Fix A ∈ H with P^3[A] = 0, so P^2[A] = 0 as P^2|_H ≪ P^3|_H. This
means that 0 is a version of P^2[A | G], but

  A ∈ σ(G, ∆(X^T, S)) ⊂ σ(G, ∆(X, S)),

so 0 is also a version of P^{12}[A | F^0_S] by (b) of Cor. 3.15, and P^{12}[A] = 0.

(c) Fix G ∈ G, B ∈ σ(∆(X^T, S)), and C ∈ σ(∆(X, T)). Let Z be any
version of E^3[1_C | H] and Y be any version of E^2[1_B Z | G]. Two
applications of Cor. 3.15 give

  E^{23}[1_G Y] = E^2[1_G Y] = E^2[1_{G∩B} Z] = E^{23}[1_{G∩B} Z] = P^{23}[G ∩ B ∩ C].

This means that any version of E^2[1_B Z | G] is a version of E^{23}[1_{B∩C} | G].
We will use this fact in the next chain of equalities.


Now fix A ∈ F^0_S as well. Again using the properties listed in Cor. 3.15,
we have

  P^{12,3}[A ∩ B ∩ C] = E^{12}[1_{A∩B} E^3[1_C | H]]
                      = E^1[1_A E^2[1_B E^3[1_C | H] | G]]
                      = E^{1,23}[1_A E^2[1_B E^3[1_C | H] | G]]
                      = E^{1,23}[1_A E^{23}[1_{B∩C} | G]]
                      = P^{1,23}[A ∩ B ∩ C].

As we have

  σ(F^0_S, ∆(X^T, S), ∆(X, T)) = σ(F^0_T, ∆(X, T)) = F

(e.g., Lem. A.3), the measures agree on a π-system that generates F,
which is enough to conclude that the measures agree everywhere.

We now have everything that we need for the proof of Thm. 3.6. We
restate the result below for the reader’s convenience.


3.6 Theorem. Let P be a probability measure on Ω and Π = {(T_i, G_i)}_{0≤i≤n}
be an extended partition. Then there exists a unique measure, denoted P^{⊗Π},
such that

(a) P^{⊗Π}[A] = P[A] for A ∈ ∪_i H_i, and

(b) any version of P[B | G_i] is a version of P^{⊗Π}[B | F^0_{T_i}] for B ∈ H_{i+1} and
    0 ≤ i ≤ n,

where H_i := σ(G_{i−1}, ∆(X^{T_i}, T_{i−1})) for 1 ≤ i ≤ n + 1.

Proof. First notice that if 0 ≤ i ≤ j ≤ k ≤ l ≤ n, then repeated applications
of Cor. A.7 give σ(G_j, ∆(X^{T_{k+1}}, T_j)) ⊂ σ(G_i, ∆(X^{T_{l+1}}, T_i)). We will make use
of this fact repeatedly, and without further mention.


We will argue inductively, making the inductive assumption that there
exists a measure P^m such that

(a) P^m[A] = P[A] if A ∈ H_i and 0 ≤ i ≤ n + 1, and

(b) any version of P[B | G_i] is a version of P^m[B | F^0_{T_i}] if B ∈ H_{i+1} and
    0 ≤ i ≤ m.

Setting P^{−1} := P, it is clear that (a) holds trivially and (b) holds vacuously
for the base case m = −1.

We now assume that some P^m exists which satisfies (a) and (b). As
G_{m+1} ⊂ H_{m+1}, we have P^m|_{G_{m+1}} = P|_{G_{m+1}} by assumption (a), so we may
define P^{m+1} := P^m ⊗_{T_{m+1},G_{m+1}} P. If A ∈ H_i for some i ≤ m + 1, then A ∈ F^0_{T_{m+1}}
and P^{m+1}[A] = P^m[A] = P[A] by (a) of Cor. 3.15 and the first inductive
assumption. If A ∈ H_i for some i > m + 1, then H_i ⊂ σ(G_{m+1}, ∆(X, T_{m+1}))
and P^{m+1}[A] = P[A] by (b) of Cor. 3.15. Either way, (a) is satisfied by P^{m+1}.

If A ∈ F^0_{T_i} and B ∈ H_{i+1} for some i ≤ m, and Z is any version of P[B | G_i],
then we note that A ∩ B and 1_A Z are both F^0_{T_{m+1}}-measurable, so applying
(a) of Cor. 3.15, followed by our inductive assumption (b), and then (a)
of Cor. 3.15 again, gives

  E^{m+1}[1_A Z] = E^m[1_A Z] = P^m[A ∩ B] = P^{m+1}[A ∩ B],

and Z is a version of E^{m+1}[1_B | F^0_{T_i}]. If B ∈ H_{m+2} ⊂ σ(G_{m+1}, ∆(X, T_{m+1})),
then (b) of Cor. 3.15 says that any version of P[B | G_{m+1}] is a version
of P^{m+1}[B | F^0_{T_{m+1}}], so P^{m+1} satisfies the inductive assumption (b) for all
0 ≤ i ≤ m + 1.


It is then clear that P^n satisfies properties (a) and (b). To see that this
measure is unique, fix any A_0 ∈ σ(E) = F^0_0 and A_i ∈ σ(∆(X^{T_i}, T_{i−1})) for
1 ≤ i ≤ n + 1. Then

  P^n[∩_{i=0}^{n+1} A_i] = E^n[1_{A_0} E^n[1_{A_1} · · · E^n[1_{A_{n+1}} | F^0_{T_n}] · · · | F^0_0]]
                         = E^n[1_{A_0} E^n[1_{A_1} · · · E[1_{A_{n+1}} | G_n] · · · | F^0_0]]
                         = E[1_{A_0} E[1_{A_1} · · · E[1_{A_{n+1}} | G_n] · · · | G_0]],

so the probability assigned to the event ∩_{i=0}^{n+1} A_i is fully determined by P and
the properties (a) and (b). As

  σ(E, ∆(X^{T_1}, 0), · · · , ∆(X^{T_{n+1}}, T_n)) = F

(e.g., Lem. A.3), any two measures which agree on sets of the form ∩_{i=0}^{n+1} A_i
must agree on all of F by the π-λ theorem.
Now we quickly check that the properties of the original measure which
were preserved by the binary construction are also preserved by the gen-
eral construction. All of these proofs are essentially the same, and we use
induction to reduce to the binary case.
3.39 Lemma. Let P be a measure on Ω and Π = {(T_i, G_i)}_{0≤i≤n} be an
extended partition. Let A be a continuous, R^d-valued process, and assume
that ∆(A, T_i) is σ(G_i, ∆(X, T_i))-measurable for each i ∈ {1, . . . , n}. Then
the following two implications hold.

(a) If A is P-a.s. of finite variation, then A is P^{⊗Π}-a.s. of finite variation.

(b) If A is P-a.s. absolutely continuous, then A is P^{⊗Π}-a.s. absolutely con-
tinuous.

Proof. Assume that A is P-a.s. of finite variation, and set P^0 := P ⊗_{T_0,G_0} P.
It then follows from Cor. 3.21 that A is P^0-a.s. of finite variation. We now
proceed by induction, so assuming that A is P^i-a.s. of finite variation and
setting P^{i+1} := P^i ⊗_{T_{i+1},G_{i+1}} P, it again follows from Cor. 3.21 that A is P^{i+1}-
a.s. of finite variation. As P^n = P^{⊗Π}, we have (a). Assertion (b) follows in
the same way.


3.40 Lemma. Let P be a measure on Ω and Π = {(T_i, G_i)}_{0≤i≤n} be an ex-
tended partition. Let A be a continuous, R^d-valued process such that ∆(A, T_i)
is σ(G_i, ∆(X, T_i))-measurable for each i ∈ {1, . . . , n}. Let a be a measurable,
R^d-valued process such that the set

  B(ω) := { t ∈ R_+ : (∂/∂t)A_t(ω) exists and a_t(ω) ≠ (∂/∂t)A_t(ω) }

has Lebesgue measure 0 for all ω. Finally, let S be an R_+-valued F^0-stopping
time such that (S − T_i)^+ is σ(G_i, ∆(X, T_i))-measurable for each i ∈ {1, . . . , n}.
If P[A_t = ∫_0^t a_u du ∀t] = 1, then P^{⊗Π}[A_t = ∫_0^t a_u du ∀t] = 1 and

(3.41)  E^{⊗Π}[∫_0^S f(a_u) du] = E[∫_0^S f(a_u) du]

for every nonnegative, measurable function f.

Proof. Set P^0 := P ⊗_{T_0,G_0} P. It then follows from Lem. 3.22 that

  E^0[∫_0^S f(a_u) du] = E[∫_0^{T_0∧S} f(a_u) du] + E[∫_{T_0}^{T_0∨S} f(a_u) du]
                       = E[∫_0^S f(a_u) du].

We then proceed by induction, setting P^{i+1} := P^i ⊗_{T_{i+1},G_{i+1}} P, and applying
Lem. 3.22 to conclude that

  E^{i+1}[∫_0^S f(a_u) du] = E^i[∫_0^S f(a_u) du].

As P^n = P^{⊗Π}, we are done.

3.42 Corollary. Let P be a measure on Ω, let A be a continuous, R^d-valued,
P-a.s. absolutely continuous process, let a be a measurable, R^d-valued process,
and let

  Π(n) = {(T^n_i, G^n_i)}_{0≤i≤N(n)}

be an extended partition for each n. Set T^n_{N(n)+1} := ∞ and P^n := P^{⊗Π(n)},
and assume that a_t(ω) = (∂/∂t)A_t(ω) whenever this derivative exists. Further
assume that T^n_i and ∆(A, T^n_i) are σ(G^n_i, ∆(X, T^n_i))-measurable for all n and
i ∈ {0, . . . , N(n)}. If

(3.43)  E^P[∫_0^t ‖a_u‖ du] < ∞ ∀t ∈ R_+,

then a is uniformly integrable with respect to {P^n × λ_{[0,t]}}_n for each t ∈ R_+.


Proof. Fix any t and ε > 0 and choose M so large that

  E^P[∫_0^t (‖a_u‖ − M)^+ du] < ε,

using the integrability assumption (3.43). Applying the previous lemma with
f(x) = (‖x‖ − M)^+ and S = t, we have

  E^n[∫_0^t (‖a_u‖ − M)^+ du] = E^P[∫_0^t (‖a_u‖ − M)^+ du] < ε.

This shows that a is uniformly integrable with respect to {P^n × λ_{[0,t]}}_n.

3.44 Corollary. Let P be a measure on Ω, let A be a continuous, R^d-valued,
P-a.s. absolutely continuous process, and let

  Π(n) = {(T^n_i, G^n_i)}_{0≤i≤N(n)}

be an extended partition for each n. Set T^n_{N(n)+1} := ∞ and P^n := P^{⊗Π(n)}.
Further suppose that T^n_i and ∆(A, T^n_i) are both σ(G^n_i, ∆(X, T^n_i))-measurable
for all n and i ∈ {0, . . . , N(n)} and that

(3.45)  E^P[Var_t(A)] < ∞ ∀t ∈ R_+.

Then the collection {L(A | P^n)}_n is tight.




Proof. By taking the limit of divided differences on the left (e.g., Lem. C.10),
we may find an F^0-predictable process a such that a_t(ω) = (∂/∂t)A_t(ω) when-
ever this derivative exists. As A is P-a.s. absolutely continuous, pathwise
arguments show that we have

  P[A_t = ∫_0^t a_u du ∀t] = 1, and

(3.46)  P[Var_t(A) = ∫_0^t ‖a_u‖ du ∀t] = 1.

Combining (3.45) and (3.46), we see that we may apply the previous corollary
to conclude that a is uniformly integrable with respect to {P^n × λ_{[0,t]}}_n for
each t.
As A_0 = 0, we only need to check that we can control the modulus of
continuity on each compact interval [0, t] to conclude that {L(A | P^n)}_n is
tight. Fix some T > 0 and ε > 0. Using the uniform integrability, we may
choose M so large that

  sup_n E^n[∫_0^T (‖a_u‖ − M)^+ du] < ε²/2.

Setting δ = ε²/(2M) and letting D(δ, T) := {(s, t) ∈ R²_+ : s ≤ t ≤ T ∧ (s + δ)},
we have

  P^n[ sup{‖A_t − A_s‖ : (s, t) ∈ D(δ, T)} ≥ ε ]
    ≤ P^n[ sup{∫_s^t ‖a_u‖ du : (s, t) ∈ D(δ, T)} ≥ ε ]
    ≤ (1/ε) E^n[ sup{∫_s^t ‖a_u‖ du : (s, t) ∈ D(δ, T)} ]
    ≤ (1/ε) E^n[ δM + ∫_0^T (‖a_u‖ − M)^+ du ]
    ≤ ε,

so {L(A | P^n)}_n is tight.

3.47 Lemma. Let P be a measure on Ω, and let Π = {(T_i, G_i)}_{0≤i≤n} be
an extended partition. Let Y be a continuous, R^d-valued process, and sup-
pose that Y is a semimartingale with the characteristics (B, C) under P. If
∆(Y, T_i), ∆(B, T_i), and ∆(C, T_i) are all σ(G_i, ∆(X, T_i))-measurable for each
i ∈ {1, . . . , n}, then Y is a semimartingale with the characteristics (B, C)
under P^{⊗Π}.

Proof. Setting P^0 := P ⊗_{T_0,G_0} P, we may apply Cor. 3.36 to conclude that Y has
characteristics (B, C) under P^0. We then proceed inductively, setting P^{i+1} :=
P^i ⊗_{T_{i+1},G_{i+1}} P and applying Cor. 3.36 to conclude that Y has characteristics
(B, C) under P^{i+1} for each i < n. As P^n = P^{⊗Π}, we are done.

Chapter 4

Main Theorem

In this chapter, we develop the main theorem of this dissertation. In Sec-
tion 4.1, we establish some results related to conditional expectations. In
Section 4.2, we prove the approximation lemmas that we will need for the
proof of the main result, and in Section 4.3 we prove Thm. 2.11.

4.1 Conditional Expectation Lemmas


The results of this section are implicit in [Gyö86].
4.1 Lemma. Let (E, E) be a measurable space with a countably generated
σ-field, let Y be an E-valued process, let z be an R^d-valued process with

(4.2)  E[∫_0^t ‖z_u‖ du] < ∞ ∀t ∈ R_+,

and let ẑ : E×R_+ → R^d be a (deterministic) E⊗R_+-measurable function.
Then ẑ(Y_t, t) is a version of E[z_t | Y_t] for Lebesgue-a.e. t if and only if

(4.3)  E[∫_0^t ẑ(Y_u, u) f(Y_u, u) du] = E[∫_0^t z_u f(Y_u, u) du]

for all t ∈ R_+ and all bounded f : E×R_+ → R^d that are E⊗R_+-measurable.
Moreover, in this case we have

(4.4)  E[∫_0^t ‖ẑ(Y_u, u)‖ du] < ∞ ∀t ∈ R_+.


Proof. Suppose that ẑ(Y_t, t) is a version of E[z_t | Y_t] when t ∉ N, where
N ⊂ R_+ is a Lebesgue-null set. It then follows that E[‖ẑ(Y_t, t)‖] ≤ E[‖z_t‖]
when t ∉ N. We may apply Tonelli's Theorem and (4.2) to conclude that
(4.4) holds, and (4.3) then follows by Fubini's Theorem.

Now assume that (4.3) holds. Taking f = 1 in (4.3), we see that we have
E[∫_0^t ‖ẑ(Y_u, u)‖ du] < ∞ for all t (recall Rem. 1.7). Set

  N_1 := { t ∈ R_+ : E[‖ẑ(Y_t, t)‖ + ‖z_t‖] = ∞ }.

By Tonelli's Theorem, N_1 is an R_+-measurable Lebesgue-null set. Let C =
{C_n}_{n∈N} denote a countable collection of sets that generates E. Without
loss of generality, we may assume that C is closed with respect to finite
intersections. Now define

  g_n(t) := E[(ẑ(Y_t, t) − z_t) 1_{{Y_t ∈ C_n}}],

so g_n is R_+/R^d-measurable by Fubini's Theorem. Suppose that A ⊂ [0, t] is
R_+-measurable. Then (4.3) implies that

  ∫_A g_n(u) du = E[∫_0^t (ẑ(Y_u, u) − z_u) 1_{{(Y_u, u) ∈ C_n×A}} du] = 0.

As this is true for all such A, we have g_n = 0 Lebesgue-a.e. when restricted to
[0, t] (e.g., [Roy88] Lem. 5.3.8). Letting t → ∞, we see that g_n = 0 Lebesgue-
a.e. on R_+. Setting N_2 := ∪_n { t ∈ R_+ : g_n(t) ≠ 0 }, Fubini's Theorem implies
that N_2 is an R_+-measurable Lebesgue-null set. Now consider the class of
sets

  B := { B ∈ E : E[ẑ(Y_t, t) 1_{{Y_t ∈ B}}] = E[z_t 1_{{Y_t ∈ B}}] for all t ∉ N_1 ∪ N_2 }.

We have C ⊂ B by construction and B is a monotone class by dominated
convergence, so B = E. In particular, ẑ(Y_t, t) is a version of E[z_t | Y_t] for all
t ∉ N_1 ∪ N_2.

4.5 Corollary. Let (E, E) be a measurable space with a countably generated
σ-field, let Y be an E-valued process, and let z be a K-valued process, where
K is a closed convex subset of R^d or R^d⊗R^r. If

  E[∫_0^t ‖z_s‖ ds] < ∞ ∀t ∈ R_+,

then there exists a (deterministic) E⊗R_+-measurable function ẑ : E×R_+ →
K such that ẑ(Y_t, t) is a version of E[z_t | Y_t] for Lebesgue-a.e. t.

4.6 Remark. S^d_+ is a closed convex subset of R^d⊗R^d, so if z takes values in
S^d_+, then the corollary asserts that we may choose ẑ : E×R_+ → S^d_+.
Proof. We consider the case K ⊂ R^d with z = (z^1, . . . , z^d). Define the
following σ-finite, signed measures on E×R_+:

  µ(A) := E[∫_0^∞ 1_A(Y_u, u) du], and
  ν^i(A) := E[∫_0^∞ z^i_u 1_A(Y_u, u) du] for i ∈ {1, . . . , d}.

It is clear from these definitions that ν^i ≪ µ for each i. As µ is σ-finite,
the Radon-Nikodym derivatives dν^i/dµ are well-defined for each i. Let z̃ =
(z̃^1, . . . , z̃^d) denote the function with z̃^i = dν^i/dµ for each i. Fixing any
bounded, E⊗R_+/R^d-measurable g = (g^1, . . . , g^d) : E×R_+ → R^d and letting
{e_1, . . . , e_d} denote the canonical basis of R^d, we have

  E[∫_0^t z̃(Y_u, u) g(Y_u, u) du] = Σ_{i=1}^d e_i ∫_{E×[0,t]} z̃^i(y, u) g^i(y, u) µ(dy, du)
                                  = Σ_{i=1}^d e_i ∫_{E×[0,t]} g^i(y, u) ν^i(dy, du)
                                  = E[∫_0^t z_u g(Y_u, u) du].

We now show that z̃ takes values in K, µ-a.e. We argue by contradiction,
so assume that

(4.7)  ∫_{E×R_+} 1_{{z̃(e,t) ∉ K}} µ(de, dt) > 0.

As R^d is separable, we may write K as an intersection of a countable collection
of closed half-spaces. In particular, we have K = ∩_n H_n where H_n = {x ∈
R^d : (x, y′_n) ≤ α_n} with y′_n ∈ R^d and α_n ∈ R. This means that K^c = ∪_n H^c_n.
As we have assumed (4.7), there must exist some N and T with

  ∫_{E×[0,T]} 1_{{z̃(e,t) ∈ H^c_N}} µ(de, dt) > 0.

But then

  α_N E[∫_0^T 1_{{z̃(Y_s,s) ∈ H^c_N}} ds] < E[∫_0^T (z̃(Y_s, s), y′_N) 1_{{z̃(Y_s,s) ∈ H^c_N}} ds]
                                         = E[∫_0^T (z_s, y′_N) 1_{{z̃(Y_s,s) ∈ H^c_N}} ds]
                                         ≤ α_N E[∫_0^T 1_{{z̃(Y_s,s) ∈ H^c_N}} ds],

which is a contradiction. We conclude that z̃ takes values in K, µ-a.e. We
then pick any k ∈ K and define

  ẑ(e, t) := z̃(e, t) 1_{{z̃(e,t) ∈ K}} + k 1_{{z̃(e,t) ∉ K}}.

We have

  E[∫_0^t ẑ(Y_u, u) g(Y_u, u) du] = E[∫_0^t z̃(Y_u, u) g(Y_u, u) du]
                                  = E[∫_0^t z_u g(Y_u, u) du],

so we may apply the previous lemma to conclude that ẑ(Y_t, t) is a version of
E[z_t | Y_t] for Lebesgue-a.e. t. The case where z takes values in R^d⊗R^r follows
in the same way.
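The measure-theoretic construction above has a concrete empirical counterpart: for simulated data, the Radon-Nikodym derivative dν^i/dµ reduces to a ratio of bin averages. The following sketch is our own illustration, not part of the dissertation; it assumes numpy is available, and all names in it are hypothetical. It estimates ẑ(e) = E[z_t | Y_t = e] at a fixed time by binning the values of Y, and checks the projection identity obtained by taking f = 1 in (4.3).

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate pairs (Y_t, z_t) at a fixed time t: here z_t = Y_t**2 + noise,
# so the true conditional expectation is E[z_t | Y_t] = Y_t**2.
n = 200_000
y = rng.standard_normal(n)
z = y**2 + rng.standard_normal(n)

# Estimate zhat(e) = E[z | Y = e] by averaging z over bins of Y-values.
edges = np.linspace(-3.0, 3.0, 61)
idx = np.clip(np.digitize(y, edges) - 1, 0, len(edges) - 2)
sums = np.bincount(idx, weights=z, minlength=len(edges) - 1)
counts = np.bincount(idx, minlength=len(edges) - 1)
zhat = np.where(counts > 0, sums / np.maximum(counts, 1), 0.0)

# Projection property: E[zhat(Y)] matches E[z] exactly (f = 1 in (4.3)),
# up to floating-point error, because zhat is a bin-wise average of z.
proj_gap = abs(zhat[idx].mean() - z.mean())

# Compare the binned estimate to the true regression function at bin centers,
# restricted to well-populated bins where the average is reliable.
centers = 0.5 * (edges[:-1] + edges[1:])
mask = counts > 1000
max_err = np.max(np.abs(zhat[mask] - centers[mask]**2))
```

The binning plays the role of the countably generated σ-field in the corollary: refining the bins corresponds to conditioning on finer and finer partitions of E.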
4.8 Definition. Let (Ω, F, F, P) be a stochastic basis which supports pro-
cesses {X^i}_i and a random variable T. We say that T is strongly indepen-
dent of the processes {X^i}_i if there exists a σ-field G ⊂ F such that X^i is
G⊗R_+-measurable for each i and σ(T) is independent of G.

4.9 Remark. The statement that "X is independent of T" means that
σ(X) := σ(X_t : t ∈ R_+) and σ(T) are independent. Unfortunately, for
general measurable processes X, we cannot immediately conclude that X ∈
σ(X)⊗R_+. Indeed, if this were true, it would imply that every adapted
process is progressive without modification. If the sample paths of X have
enough regularity that we may write X as the pointwise limit of simple
functions, then it is of course true that X ∈ σ(X)⊗R_+. We give the previous
definition so that we may handle the situation where we do not assume any
sample path regularity.

4.10 Lemma. Let (Ω, F, F, P) be a stochastic basis which supports an R^d-
valued process z and an R_+-valued random time T with law µ := L(T). If T
is strongly independent of z, then

(4.11)  ∫_{R_+} E[z_t] µ(dt) = E[z_T].

Proof. Let G ⊂ F be a σ-field such that T is independent of G and z
is G⊗R_+-measurable. We will show that ∫_{R_+} E[X_t] µ(dt) = E[X_T] for all
bounded, G⊗R_+-measurable processes X. Letting X ∈ bF⊗R_+ mean that
X is a bounded, F⊗R_+-measurable process, we set

  C := { X ∈ bF⊗R_+ : ∫_{R_+} E[X_t] µ(dt) = E[X_T] }.

If X_t = Σ_{i=1}^n A_i 1_{B_i}(t) for some bounded random variables A_i ∈ G and sets
B_i ∈ R_+, then

  ∫_{R_+} E[X_t] µ(dt) = Σ_i E[A_i] P[T ∈ B_i] = Σ_i E[A_i 1_{{T ∈ B_i}}] = E[X_T].

As C is a monotone class that contains all X of this form, we conclude that
every bounded, G⊗R_+-measurable process belongs to C.

We then use monotone convergence to write

  ∫_{R_+} E[‖z_t‖] µ(dt) = lim_n ∫_{R_+} E[‖z_t‖ ∧ n] µ(dt)
                         = lim_n E[‖z_T‖ ∧ n] = E[‖z_T‖].

If this expression is finite, then (4.11) follows in the same way by dominated
convergence in each coordinate. If this expression is infinite, then both sides
of (4.11) are defined to be ∞ (recall Rem. 1.7). Either way, (4.11) holds.
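Identity (4.11) is easy to check by simulation when E[z_t] is known in closed form. The sketch below is our own illustration under stated assumptions (numpy available; z_t = A·t for a random slope A, and T ~ Exp(1) drawn independently of A, so that T is strongly independent of z). It compares a direct Monte Carlo estimate of E[z_T] with the integral of t ↦ E[z_t] against µ.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

# z_t(omega) = A(omega) * t for a random slope A, so E[z_t] = E[A] * t.
a = rng.exponential(2.0, size=n)        # slopes A with E[A] = 2
t_rand = rng.exponential(1.0, size=n)   # T ~ Exp(1), independent of A

# Right-hand side of (4.11): E[z_T], sampled directly.
rhs = np.mean(a * t_rand)

# Left-hand side: integrate t -> E[z_t] = E[A] * t against mu = Exp(1).
# By independence this factors as E[A] * E[T], estimated from the same draws.
lhs = np.mean(a) * np.mean(t_rand)
gap = abs(lhs - rhs)
```

Both sides estimate E[A]·E[T] = 2; without the independence of T and z the two sides would generally differ, which is the content of the lemma.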


4.12 Corollary. Let (E, E) be a measurable space, and let (Ω, F, F, P) be a
stochastic basis which supports an E-valued process Y, an R^d-valued process
z, and an R_+-valued random time T. Let ẑ be a (deterministic) E⊗R_+/R^d-
measurable function such that ẑ(Y_t, t) is a version of E[z_t | Y_t] for Lebesgue-
a.e. t. If the law of T is absolutely continuous with respect to Lebesgue
measure, and T is strongly independent of Y and z, then ẑ(Y_T, T) is a version
of E[z_T | Y_T, T].

Proof. Let G ⊂ F be a σ-field such that T is independent of G and both Y
and z are G⊗R_+-measurable. Set µ := L(T) and write ẑ = (ẑ^1, . . . , ẑ^d) and
z = (z^1, . . . , z^d). We now check each component. Fix any bounded deterministic
f : E×R_+ → R which is E⊗R_+/R-measurable. A standard monotone class
argument shows that the maps (ω, t) ↦ f(Y_t(ω), t) and (ω, t) ↦ ẑ^i(Y_t(ω), t)
are both G⊗R_+/R-measurable. We have

(4.13)  E[f(Y_t, t) ẑ^i(Y_t, t)] = E[f(Y_t, t) z^i_t]

for Lebesgue-a.e. t by assumption. As µ is absolutely continuous with respect
to Lebesgue measure, (4.13) holds for µ-a.e. t as well. We then write

  E[f(Y_T, T) ẑ^i(Y_T, T)] = ∫_{R_+} E[f(Y_t, t) ẑ^i(Y_t, t)] µ(dt)
                           = ∫_{R_+} E[f(Y_t, t) z^i_t] µ(dt)
                           = E[f(Y_T, T) z^i_T],

where the first and last equalities follow from Lem. 4.10.

4.2 Approximation Lemmas


To appreciate our first approximation result, consider the following lemma
from Revuz and Yor.
4.14 Lemma (Lem. 0.5.7 from [RY99]). Let (X_n, Y_n) be a sequence of ran-
dom variables with values in separable metric spaces E and F and such that

(a) (X_n, Y_n) converges in distribution to (X, Y), and

(b) the law of Y_n does not depend on n.

Then for every Borel function f : F → G, where G is a separable metric
space, the sequence (X_n, f(Y_n)) converges in distribution to (X, f(Y)).
space, the sequence (Xn , f (Yn )) converges in distribution to (X, f (Y )).

While we do not repeat the proof, it essentially results from the fact that
we can approximate f arbitrarily well in L^1(L(Y)) with bounded, continu-
ous functions. The point is that we get a stronger kind of convergence from
the fact that the Y_n share a common law. This is related to the notion of
weak-strong convergence as developed by Jacod and Mémin in [JM81b] and
[JM81a]. In the following theorem, it is the assumption of common one-
dimensional marginal distributions that allows us to conclude that we have
weak convergence even though f is only assumed to be measurable.

4.15 Lemma. Let E be a Polish space and let {Y^n}_{n≤∞} be a sequence of
continuous E-valued processes, possibly defined on different spaces. Let
f : E×R_+ → R^d be a measurable function and define F^n_t := ∫_0^t f(Y^n_s, s) ds. If

(a) L(Y^n_t) = L(Y^1_t) ∀t ∈ R_+ and ∀n ∈ N,

(b) Y^n ⇒ Y^∞, and

(c) E^1[∫_0^t ‖f(Y^1_u, u)‖ du] < ∞ ∀t ∈ R_+,

then

(d) {(f(Y^n_·, ·), P^n×λ_{[0,t]})}_{n∈N} is uniformly integrable ∀t ∈ R_+,

(e) P^n[F^n ∈ C(R_+; R^d)] = 1 for each n ∈ N, and

(f) (Y^n, F^n) ⇒ (Y^∞, F^∞).

4.16 Remark. There may very well be paths of Y^n for which f(Y^n_·, ·) is
not integrable. Recall that in Rem. 1.7 we adopted the convention that the
integral is R̄^d-valued, where R̄^d := R^d ∪ {∞} is the one-point compactification
of R^d, so F^n_t may take the value ∞. It is a conclusion of this lemma that
F^n is a continuous, finitely-valued process P^n-a.s. for each n. As a result, we
may treat F^n as a C(R_+; R^d)-valued random variable and (f) makes sense.


Proof. Define the σ-finite measure µ on E×R_+ by

  µ(A) = E^n[∫_0^∞ 1_A(Y^n_u, u) du].

If we fix some t ∈ R_+, then x ↦ x(t) is a continuous map from C(R_+; E) to
E. This means that (a) actually holds for n = ∞ as well, and it does not
matter which n ∈ N we use in this definition of µ. In particular, we have

  sup_{n≤∞} E^n[∫_0^t ‖f(Y^n_u, u)‖ 1_{{‖f(Y^n_u,u)‖ > M}} du]
    = ∫_{E×[0,t]} ‖f(e, u)‖ 1_{{‖f(e,u)‖ > M}} µ(de, du)
    = E^1[∫_0^t ‖f(Y^1_u, u)‖ 1_{{‖f(Y^1_u,u)‖ > M}} du].

Using the integrability assumption (c), we may make this last expression
arbitrarily small by choosing M sufficiently large. This implies (d), and also
implies that

  P^n[∫_0^t ‖f(Y^n_u, u)‖ du < ∞ ∀t ∈ R_+] = 1 ∀n ∈ N,

which then implies (e).


For the final claim, we approximate f in L1 (µ) with bounded continu-
ous functions. Cor. B.9 asserts that we may choose a sequence of bounded
functions f m ∈ C(E×[0, m]; Rd ) such that
Z

lim f (e, t) − f m (e, t) µ(de, dt) = 0.
m→∞ E×[0,m]

Define the processes Z t∧m


Ztn,m , f m (Ysn , s) ds
0


for n ≤ ∞ and m ∈ N. If y_i → y_∞ in C(R_+; E), then

  lim sup_{i→∞} sup_{t≤m} ‖∫_0^t f^m(y_∞(s), s) ds − ∫_0^t f^m(y_i(s), s) ds‖
    ≤ lim sup_{i→∞} ∫_0^m ‖f^m(y_∞(s), s) − f^m(y_i(s), s)‖ ds = 0

by bounded convergence. In particular, the map y ↦ ∫_0^{·∧m} f^m(y(s), s) ds is
continuous from C(R_+; E) to C(R_+; R^d), so we have

(4.17)  (Y^n, Z^{n,m}) ⇒ (Y^∞, Z^{∞,m}) for each fixed m.

For each fixed T, we also have

  lim sup_{m→∞} sup_{n≤∞} P^n[ sup_{t≤T} ‖F^n_t − Z^{n,m}_t‖ > ε ]
    ≤ lim sup_{m→∞} sup_{n≤∞} ε^{-1} E^n[ sup_{t≤T} ‖F^n_t − Z^{n,m}_t‖ ]
    ≤ lim sup_{m→∞} ε^{-1} ∫_{E×[0,m]} ‖f(e, s) − f^m(e, s)‖ µ(de, ds) = 0.

In particular,

  inf_{m∈N} sup_{n≤∞} P^n[ d(F^n, Z^{n,m}) > δ ] = 0

for each δ > 0, so (4.17) implies (f) (e.g., Lem. B.3).

4.18 Remark. If we add an additional sequence of random variables, {Z^n},
which take values in some metric space E^2, to the statement of this theorem,
and we assume that (Y^n, Z^n) ⇒ (Y^∞, Z^∞), then we may conclude that
(Y^n, Z^n, F^n) ⇒ (Y^∞, Z^∞, F^∞).

The next result will show that we may approximate an integrable process
in L^1(P×λ_{[0,t]}) using step functions if we randomize the partition that we use
to generate the step functions. First we will need to present a lemma from
analysis.

If f : R → R is a function which is integrable over [0, T], and we set
φ_n(t) := n 1_{[0,1/n]}(t), then φ_n converges to the Dirac mass at 0 in some sense,
so we might expect f ∗ φ_n to converge to f in L^1([0, T], λ_{[0,T]}). This observation
motivates, but is not quite equivalent to, the following lemma.


4.19 Lemma. Let f : R_+ → R^d be a function with ∫_0^t ‖f(s)‖ ds < ∞ for all
t ∈ R_+. Define the sets

  I^n_i := { (t, u) ∈ R_+×[0, 1] : (u + i − 1)/n ≤ t < (u + i)/n },

and define the sequence of approximating functions f_n : R_+×[0, 1] → R^d by
f_n(t, u) := Σ_{i=1}^∞ f((u + i − 1)/n) 1_{I^n_i}(t, u). Then

  lim_{n→∞} ∫_0^1 ∫_0^t ‖f(s) − f_n(s, u)‖ ds du = 0 ∀t ∈ R_+.

Proof. Fix any t and ε > 0. Then choose g ∈ C([0, t + 1]; R^d) with

  ∫_0^{t+1} ‖f(s) − g(s)‖ ds < ε/4.

Set m := ⌈t⌉ ∈ [t, t + 1) ∩ N and set g_n(s, u) := Σ_{i=1}^{mn} g((u + i − 1)/n) 1_{I^n_i}(s, u).
We have

  ∫_0^1 ∫_0^t ‖f_n(s, u) − g_n(s, u)‖ ds du
    ≤ Σ_{i=1}^{mn} ∫_0^1 ∫_{(u+i−1)/n}^{(u+i)/n} ‖f((u + i − 1)/n) − g((u + i − 1)/n)‖ ds du
    = Σ_{i=1}^{mn} ∫_0^1 ‖f((u + i − 1)/n) − g((u + i − 1)/n)‖ du/n
    = Σ_{i=1}^{mn} ∫_{(i−1)/n}^{i/n} ‖f(v) − g(v)‖ dv ≤ ε/4.

As g is uniformly continuous on the interval [0, t + 1], we may choose N so
large that |s_2 − s_1| < 1/N implies ‖g(s_2) − g(s_1)‖ < ε/(4t). By enlarging N
if necessary, we may also assume that ∫_0^{1/N} ‖g(s)‖ ds ≤ ε/4. Putting all of
these estimates together gives

  ∫_0^1 ∫_0^t ‖f(s) − f_n(s, u)‖ ds du
    ≤ ∫_0^t ‖f(s) − g(s)‖ ds + ∫_0^1 ∫_{u/n}^t ‖g(s) − g_n(s, u)‖ ds du
      + ∫_0^{1/n} ‖g(s)‖ ds + ∫_0^1 ∫_0^t ‖g_n(s, u) − f_n(s, u)‖ ds du
    ≤ ε/4 + ε/4 + ε/4 + ε/4 = ε

when n ≥ N.
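The lemma is deterministic, and its content can be checked numerically for a discontinuous integrand. The following sketch is our own illustration (numpy assumed, all names hypothetical): it implements the randomized step approximation f_n on the cells I_i^n and estimates the double integral on a grid; the error decays roughly like 1/n, driven by the jumps of f.

```python
import numpy as np

def f(s):
    # an integrable but discontinuous test function
    return np.where((s >= 0.3) & (s < 0.8), 1.0, 0.0)

def fn(s, u, n):
    # randomized step approximation: sample f at the left endpoint
    # (u + i - 1)/n of the randomly shifted cell I_i^n containing s
    i = np.floor(n * s - u) + 1
    return f((u + i - 1) / n)

# approximate the double integral over u in [0, 1] and s in [0, t] on a grid
t = 1.0
s_grid = np.linspace(0.0, t, 2001)[:-1] + t / 4000   # cell midpoints in s
u_grid = np.linspace(0.0, 1.0, 101)[:-1] + 1 / 200   # cell midpoints in u

def err(n):
    S, U = np.meshgrid(s_grid, u_grid)
    # grid average approximates the double integral since t = 1
    return np.mean(np.abs(f(S) - fn(S, U, n)))

e4, e32 = err(4), err(32)
```

The mismatch between f and f_n is concentrated near the two jumps of f; averaging over the random shift u, each jump contributes an expected error of about half a cell width, which is why the error scales like 1/n.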

Using this lemma, we see that we may approximate an arbitrary inte-


grable process using step functions if we first extend the space to add an
independent uniform random variable, and we use this variable to randomize
the placement of the partition points that we use to do the sampling.

4.20 Lemma. Let (Ω′, G, Q) be a probability space which supports an R^d-
valued process a′ with

(4.21)  E^Q[∫_0^t ‖a′_s‖ ds] < ∞ ∀t ∈ R_+.

Set Ω := [0, 1]×Ω′ with typical point ω = (u, ω′), and define U(u, ω′) := u.
Letting R_{[0,1]} denote the Borel σ-field on [0, 1], set F := R_{[0,1]}⊗G, P :=
λ_{[0,1]}×Q, and a(u, ω′) := a′(ω′). Finally, define the random times T^n_0 := 0,
T^n_i := (U + i − 1)/n for i ∈ {1, . . . , n²}, and T^n_{n²+1} := ∞, and the sampled
processes a^n_t := Σ_{i=1}^{n²} a_{T^n_i} 1_{[T^n_i, T^n_{i+1})}(t). Then

(4.22)  lim_{n→∞} E^P[∫_0^t ‖a_s − a^n_s‖ ds] = 0 ∀t ∈ R_+.

Proof. We will first show that the collection of processes {a^n}_n is uniformly
integrable with respect to P×λ_{[0,t]} for each t ∈ R_+. To see this, fix some
t ∈ R_+ and set m := ⌈t⌉ ∈ [t, t + 1) ∩ N, so t ≤ T^n_{mn+1}. We then write

  E^P[∫_0^t ‖a^n_s‖ 1_{{‖a^n_s‖ > M}} ds]
    ≤ E^P[∫_0^{T^n_{mn+1}} ‖a^n_s‖ 1_{{‖a^n_s‖ > M}} ds]
    = (1/n) Σ_{i=1}^{mn} E^P[‖a_{T^n_i}‖ 1_{{‖a_{T^n_i}‖ > M}}]
    = Σ_{i=1}^{mn} ∫_0^1 E^Q[‖a′_{(u+i−1)/n}‖ 1_{{‖a′_{(u+i−1)/n}‖ > M}}] du/n
    = Σ_{i=1}^{mn} ∫_{(i−1)/n}^{i/n} E^Q[‖a′_s‖ 1_{{‖a′_s‖ > M}}] ds
    ≤ E^Q[∫_0^m ‖a′_s‖ 1_{{‖a′_s‖ > M}} ds],

where the third relation follows from Fubini's Theorem applied to the product
measure P = λ_{[0,1]}×Q. As a′ is integrable over the interval [0, m] under Q,
we may make this last expression arbitrarily small by choosing M large. We
have now shown that {a^n}_n is uniformly integrable with respect to P×λ_{[0,t]}.
As a result, {‖a − a^n‖}_n is also uniformly integrable with respect to P×λ_{[0,t]}.

Define

(4.23)  A′_{t,n}(ω′) := ∫_0^1 ∫_0^t ‖a′_s(ω′) − a^n_s(u, ω′)‖ ds du,

which is a random variable on Ω′. We then write

  E^Q[(A′_{t,n} − M)^+] = E^Q[(∫_0^1 ∫_0^t ‖a′_s(ω′) − a^n_s(u, ω′)‖ ds du − M)^+]
    ≤ E^Q[∫_0^1 (∫_0^t ‖a′_s(ω′) − a^n_s(u, ω′)‖ ds − M)^+ du]
    = E^P[(∫_0^t ‖a_s − a^n_s‖ ds − M)^+]
    ≤ E^P[∫_0^t (‖a_s − a^n_s‖ − M/t)^+ ds],

where both inequalities follow from Jensen's inequality. The uniform integra-
bility of {A′_{t,n}}_n with respect to Q then follows from the uniform integrability
of {‖a − a^n‖}_n with respect to P×λ_{[0,t]}.

If we fix an ω′ such that ∫_0^t ‖a′_s(ω′)‖ ds < ∞ for all t ∈ R_+, then we may
apply the previous lemma to the right-hand side of (4.23) to conclude that
lim_n A′_{t,n}(ω′) = 0. (4.21) implies that Q[∫_0^t ‖a′_s‖ ds < ∞ ∀t ∈ R_+] = 1, so we
may conclude that lim_n A′_{t,n} = 0 Q-a.e. Combining this with the uniform
integrability of {A′_{t,n}}_n, we conclude that lim_n E^Q[A′_{t,n}] = 0. As t is arbitrary,
we have shown (4.22).

To motivate the final approximation result, we recall that a local martin-


gale of finite variation is constant. The next result is essentially a prelimiting
version of that result. In this lemma, we have a sequence of absolutely contin-
uous processes that are only martingales with respect to a discrete partition.
We will show that we can control such a sequence by controlling the width
of the partition and the integrability of the derivatives. To state the lemma,
we need the following definition.

4.24 Definition. If π = {0 = T_0 ≤ T_1 ≤ . . . ≤ T_n} is a linearly ordered
sequence of random times, then we call π a random partition and we
set |π|(ω) := sup_{1≤i≤n} |T_i(ω) − T_{i−1}(ω)|. If {π^m}_m is a sequence of random
partitions, possibly defined on different spaces {Ω^m}_m, with

  π^m = {0 = T^m_0 ≤ T^m_1 ≤ . . . ≤ T^m_{N(m)}},

then we say that {π^m}_m converges uniformly to the identity if

  lim_{m→∞} sup_{ω∈Ω^m} |π^m|(ω) = 0, and
  lim_{m→∞} inf_{ω∈Ω^m} T^m_{N(m)}(ω) = ∞.

4.25 Lemma. Let {(Ω^n, F^n, P^n)}_n be a sequence of probability spaces. As-
sume that on each space there is defined a process x^n and a random partition

  π(n) = {0 = T^n_0 ≤ T^n_1 ≤ . . . ≤ T^n_{N(n)}}.

Further assume that the collection of processes and measures {(x^n, P^n×λ_{[0,t]})}_n
is uniformly integrable for each t ∈ R_+ and that the sequence of partitions
{π(n)}_n converges uniformly to the identity. Finally, define

  Y^n_k := ∫_0^{T^n_k} x^n_u du

and F^n_k := σ(Y^n_j, T^n_j : j ≤ k) for k ∈ {0, . . . , N(n)}, and assume that
{Y^n_k, F^n_k}_{0≤k≤N(n)} is a martingale for each n. Then

(4.26)  lim_{n→∞} E^n[ sup_{s∈[0,t]} ‖∫_0^s x^n_u du‖ ] = 0 ∀t ∈ R_+.

Proof. First we derive an estimate for a single process. Let x : Ω×R_+ → R^d
be a process and suppose that

  π = {0 = T_0 ≤ T_1 ≤ . . . ≤ T_N}

is a random partition with T_N > t and |π| ≤ 1. Set X_t := ∫_0^t x_s ds, Y_k := X_{T_k},
and F_k := σ(Y_j, T_j : j ≤ k). We show below that if {Y_k, F_k}_{0≤k≤N} is a
martingale, then

(4.27)  E[ sup_{s∈[0,t]} ‖X_s‖ ] ≤ M E[|π|] + E[∫_0^{t+1} (‖x_s‖ − M)^+ ds]
          + d C_1 ( M E[|π|] + E[∫_0^{t+1} (‖x_u‖ − M)^+ du] )^{1/2}
            × ( E[∫_0^{t+1} ‖x_u‖ du] )^{1/2},

where C_1 is a constant that does not depend on x and M is arbitrary.

To see this, let S := inf{ k ∈ {0, . . . , N} : T_k ≥ t }, so S is an F-stopping
time and Y stopped at S is still an F-martingale. Also notice that |π| ≤ 1
implies that T_S ≤ t + 1. Letting Y^i and X^i denote the i-th components of Y


and X, we write

  E[ max_{1≤n≤N∧S} ‖Y_n‖ ]
    ≤ Σ_{1≤i≤d} E[ max_{0≤n≤N∧S} |Y^i_n| ]
    ≤ C_1 Σ_{1≤i≤d} E[ ( Σ_{0≤n≤N∧S} (Y^i_n − Y^i_{n−1})² )^{1/2} ]
    ≤ d C_1 E[ ( Σ_{0≤n≤N∧S} ‖X_{T_n} − X_{T_{n−1}}‖² )^{1/2} ]
    ≤ d C_1 E[ ( max_{0≤n≤N∧S} ‖X_{T_n} − X_{T_{n−1}}‖ ∫_0^{t+1} ‖x_u‖ du )^{1/2} ]
    ≤ d C_1 ( E[ max_{1≤n≤N∧S} ‖X_{T_n} − X_{T_{n−1}}‖ ] )^{1/2} ( E[∫_0^{t+1} ‖x_u‖ du] )^{1/2}
(4.28)
    ≤ d C_1 ( M E[|π|] + E[∫_0^{t+1} (‖x_u‖ − M)^+ du] )^{1/2} × ( E[∫_0^{t+1} ‖x_u‖ du] )^{1/2},

where C_1 is the "universal" constant from the discrete-time BDG inequality
with p = 1 (e.g., [Gar73] II.1.1) and the fifth inequality is Hölder's. Now fix
some s ≤ t and temporarily set ⌊s⌋ := max{ i ∈ {0, 1, . . . , N − 1} : T_i ≤ s },
so T_{⌊s⌋} is the largest random time before s. With this notation, we write

  ‖X_s‖ ≤ ‖X_s − X_{T_{⌊s⌋}}‖ + ‖X_{T_{⌊s⌋}}‖
(4.29)   ≤ ∫_{T_{⌊s⌋}}^{T_{⌊s⌋+1}} ‖x_u‖ du + ‖X_{T_{⌊s⌋}}‖
         ≤ M|π| + ∫_0^{t+1} (‖x_u‖ − M)^+ du + max_{1≤n≤N∧S} ‖Y_n‖.

We now take the supremum over s ∈ [0, t] on the left-hand side of (4.29) and
take expectations to give

  E[ sup_{s∈[0,t]} ‖X_s‖ ] ≤ M E[|π|] + E[∫_0^{t+1} (‖x_u‖ − M)^+ du]
                             + E[ max_{1≤n≤N∧S} ‖Y_n‖ ].

(4.27) then follows from this inequality and (4.28).


We now use (4.27) to show that (4.26) holds. Fix some t ∈ R_+ and ε > 0
and set

  C_2 := sup_n E^n[∫_0^{t+1} ‖x^n_u‖ du].

The uniform integrability of {(x^n, P^n×λ_{[0,t+1]})}_n ensures that C_2 < ∞, and
we also use the uniform integrability to choose M_1(ε, C_2) so large that

  sup_n E^n[∫_0^{t+1} (‖x^n_u‖ − M_1)^+ du] ≤ ε/4 ∧ ε²/(8C_1²C_2d²).

Set

  δ = δ(ε, C_2, M_1) := ε/(4M_1) ∧ ε²/(8M_1C_1²C_2d²),

and choose M_2(δ) so large that |π(n)| ≤ δ ∧ 1 and T^n_{N(n)} > t for all
n > M_2, using the fact that {π(n)}_n converges uniformly to the identity.
Putting this all together and applying the estimate (4.27) to X^n then gives

    E^n[ sup_{s∈[0,t]} ‖X_s^n‖ ]
        ≤ M_1 E^n[|π(n)|] + E^n[ ∫_0^{t+1} (‖x_s^n‖ − M_1)⁺ ds ]
            + d C_1 ( M_1 E^n[|π(n)|] + E^n[ ∫_0^{t+1} (‖x_s^n‖ − M_1)⁺ ds ] )^{1/2}
                × ( E^n[ ∫_0^{t+1} ‖x_s^n‖ ds ] )^{1/2}
        ≤ M_1 δ + ε/4 + d C_1 ( M_1 δ + ε²/(8 C_1² C_2 d²) )^{1/2} √C_2
        ≤ ε/2 + d C_1 · ε/(2 C_1 √C_2 d) · √C_2
        ≤ ε

for all n > M_2. As t and ε were arbitrary, we have shown that (4.26) holds.

4.3 Main Theorem


We recall the following definition for the reader’s convenience.

2.1 Definition. Let E be a Polish space, and let Φ : E×C_0(R_+; R^d) →
C(R_+; E) be a function. We say that Φ is an updating function if

(a) Φ_t(e, x) = Φ_t(e, ∇(x, t)) ∀t ∈ R_+, and

(b) Θ(Φ(e, x), t) = Φ(Φ_t(e, x), ∆(x, t)) ∀t ∈ R_+.

If Φ is also continuous as a map from E×C_0(R_+; R^d) to C(R_+; E), then we
say that Φ is a continuous updating function.
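The two properties of Def. 2.1 can be checked mechanically on discrete paths. The sketch below is a hypothetical example in the spirit of a running-maximum functional (the names phi, nabla, and delta are our own stand-ins for Φ, the stopping map ∇, and the shift map ∆): here E = R² and the state Z_t is the pair (current value, running maximum).

```python
# Sketch of Def. 2.1 on discrete paths (hypothetical example; phi, nabla,
# and delta stand for Phi, the stopping map, and the shift map Delta).
# Here E = R^2 and Z_t = (Y_t, max_{s<=t} Y_s), started from e = (y0, m0).

def phi(e, x):
    """Phi(e, x)_t = (y0 + x_t, max(m0, y0 + max_{s<=t} x_s))."""
    y0, m0 = e
    out, run = [], 0.0            # running max of the raw path x (x_0 = 0)
    for v in x:
        run = max(run, v)
        out.append((y0 + v, max(m0, y0 + run)))
    return out

def nabla(x, t):
    """The path x stopped at index t."""
    return [x[min(i, t)] for i in range(len(x))]

def delta(x, t):
    """Increments of x after index t, restarted at 0."""
    return [x[i] - x[t] for i in range(t, len(x))]

x = [0.0, 2.0, 1.0, 3.0, 0.0, 5.0]
e, t = (1.0, 1.5), 2

# Property (a): the time-t value of Z only sees the stopped path x^t.
assert phi(e, x)[t] == phi(e, nabla(x, t))[t]

# Property (b): the path of Z after t is Phi restarted from Z_t on Delta(x, t).
assert phi(e, x)[t:] == phi(phi(e, x)[t], delta(x, t))
```

The second assertion is the discrete analogue of property (b): the future of Z is recomputable from its current value and the future increments of the driving path alone.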

We first give a version of the main theorem stated in terms of the char-
acteristics of an Itô process.

4.30 Theorem. Let (Ω, F, F, P) be a stochastic basis that supports an R^d-
valued Itô process Y with Y_0 = 0 and characteristics B_t = ∫_0^t b_s ds and
C_t = ∫_0^t c_s ds, where b_t ∈ R^d and c_t ∈ S_+^d are F-adapted processes with

(4.31)    E[ ∫_0^t (‖b_s‖ + ‖c_s‖) ds ] < ∞  ∀t ∈ R_+.

Let E be a Polish space, and let Z be a continuous, E-valued process with
Z = Φ(Z_0, Y) for some continuous updating function Φ. Let N ⊂ R_+ be a
Lebesgue-null set, and let b̂ : E×R_+ → R^d and ĉ : E×R_+ → S_+^d be (deter-
ministic) functions such that b̂(Z_t, t) = E[b_t | Z_t] a.s. and ĉ(Z_t, t) = E[c_t | Z_t]
a.s. when t ∉ N.
    Then there exists a stochastic basis (Ω̂, F̂, F̂, P̂) that supports continuous,
F̂-adapted processes Ŷ and Ẑ such that

(a) Ŷ is an Itô process with characteristics (B̂, Ĉ), where B̂_t := ∫_0^t b̂(Ẑ_s, s) ds
    and Ĉ_t := ∫_0^t ĉ(Ẑ_s, s) ds,

(b) Ẑ = Φ(Ẑ_0, Ŷ), and

(c) Ẑ has the same one-dimensional marginal distributions as Z.

4.32 Remark. As the proof is somewhat involved, we first give a heuristic
explanation of the main steps in the context of an example. Let Ỹ, C̃, and
σ̃ be defined as in Example 2.10, and set c̃ := σ̃², so we have ⟨Ỹ⟩_t = C̃_t =
∫_0^t c̃_s ds. In Example 2.4, we produced an updating function Φ such that
Y = Φ(0, Y) when Y_0 = 0, so we may take Y = Z = Ỹ in the statement of
the theorem. We take ĉ := σ̂², where σ̂ is defined as in (2.19) of Example 2.18.
In particular, we showed in Example 2.18 that ĉ(x, t) = E[c̃_t | Ỹ_t = x] for
t > 0. Defining Ĉ_t(Y) := ∫_0^t ĉ(Y_s, s) ds (which is slightly at odds with the
definition given in the statement of the theorem, but convenient for the
purposes of this remark), the theorem asserts that we may find a process
Ŷ such that ⟨Ŷ⟩_t = Ĉ_t(Ŷ) and such that Ŷ has the same one-dimensional
marginal distributions as Ỹ. We will construct a sequence of processes in
such a way that Ŷ is given as the weak limit of this sequence.
    Set Ω = {0}×C_0(R_+; R²) with canonical process X = (Y, C) as in Exam-
ple 3.4, so P := L(Ỹ, C̃) is a measure on Ω. Define a sequence of deterministic
partitions

    π(n) := {0 = t_0^n < . . . < t_n^n < t_{n+1}^n = ∞},


and set G_i^n := σ(Y_{t_i^n}). In Example 3.4, we showed that Π(n) := {(G_i^n, t_i^n)}_{0≤i≤n}
is an extended partition, so we may define the sequence of measures P^n :=
P^{⊗Π(n)}. Recall that we interpreted these extended partitions as filtration-like
objects in which we choose to forget everything about the process X at time
t_i^n except the current location of Y. We also showed in Example 3.4 that, in
this case, we have

    H_i^n := σ( G_{i−1}^n, ∆(X^{t_i^n}, t_{i−1}^n) ) = σ( Θ(Y^{t_i^n}, t_{i−1}^n), ∆(C^{t_i^n}, t_{i−1}^n) ).

In particular, if we choose some s ∈ R_+, then we may choose some i such
that s ∈ [t_{i−1}^n, t_i^n], and we then have Y_s = Θ_{s−t_{i−1}^n}(Y^{t_i^n}, t_{i−1}^n) ∈ H_i^n. Using (a)
of Thm. 3.6, we see that P^n[Y_s ≤ y] = P[Y_s ≤ y] for any y ∈ R. In particular,
Y has the same one-dimensional marginal distributions under each P^n as Ỹ.
    To show that this sequence of measures is tight, we use a result of Re-
bolledo [Reb79] which asserts that the sequence {L(Y | P^n)}_n is tight when
the sequence {L(C | P^n)}_n is tight. But the tightness of {L(C | P^n)}_n
follows from the integrability condition (4.31) and Cor. 3.44, so by passing
to a subsequence, we may assume that L(Y | P^n) ⇒ L(Ŷ) for some limiting
process Ŷ. As Y has the same one-dimensional marginal distributions under
each P^n as Ỹ, Ŷ also has this property.
    We still need to show that ⟨Ŷ⟩ = Ĉ(Ŷ). As ⟨Y⟩ = C under each P^n (e.g.,
Lem. 3.47), the main result of Appendix F asserts that if L(Y, C | P^n) ⇒
L(Ŷ, Ĉ(Ŷ)), then ⟨Ŷ⟩ = Ĉ(Ŷ). Lem. 4.15 essentially asserts that we have
L(Y, Ĉ(Y) | P^n) ⇒ L(Ŷ, Ĉ(Ŷ)), so all we need to show is that L(C −
Ĉ(Y) | P^n) ⇒ 0. This is probably the most technical part of the proof, and
we only give a plausibility argument now.
    Let Y^n and C^n denote the processes Y and C of Example 3.8, when
we take π = π(n) in that example, and let c^n denote the right derivative
of C^n. As noted in Example 3.8, we have L(Y^n, C^n) = P^n. Notice that
P and P^n are not equivalent. In particular, P^n charges paths of C which
change slope, and P does not. Comparing the definition of σ in Example 3.8
with the definition of σ̂ given in (2.19) of Example 2.18, we see that, for
each n and i > 0, we have E[c^n_{t_i^n} | Y^n_{t_i^n} = x] = ĉ(x, t_i^n), so ĉ is the expected
variance accumulation rate, conditional on the location of Y at a reset time.
As each reset is conditionally independent given the value of Y at the time
of the reset, we might hope that when we have enough partition points, a
law of large numbers will kick in, causing C − Ĉ to converge to zero. To


make this work without imposing continuity assumptions on ĉ, we need to
slightly randomize the placement of the partition points and use Lem. 4.20.
We introduce the uniform random variable in the proof that follows for just
this purpose.
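The heuristic of this remark can be tested numerically. The sketch below uses an assumed toy model — a volatility σ̃ ∈ {1, 2} chosen by a fair coin at time 0, in the spirit of, but not necessarily identical to, Example 2.10 — in which ĉ(x, t) = E[c̃_t | Ỹ_t = x] is available in closed form by Bayes' rule, and a simple Euler scheme for dŶ = √(ĉ(Ŷ, t)) dŴ approximately reproduces the one-dimensional marginals (checked here through the variance E[Ỹ_1²] = 5/2). All names and parameters are our own illustration.

```python
import numpy as np

# Toy check of Remark 4.32 (hypothetical model, not the text's Example 2.10
# verbatim): Y~ = sigma~ * W with sigma~ in {1, 2} a fair coin flip at time 0,
# so c~ = sigma~^2 and <Y~>_t = c~ * t.

def hat_c(x, t):
    """Mimicking rate c^(x, t) = E[c~ | Y~_t = x] by Bayes' rule on the
    two Gaussian densities N(0, t) and N(0, 4t)."""
    w1 = 0.5 * np.exp(-x**2 / (2.0 * t)) / np.sqrt(t)          # sigma = 1
    w2 = 0.5 * np.exp(-x**2 / (8.0 * t)) / (2.0 * np.sqrt(t))  # sigma = 2
    return (1.0 * w1 + 4.0 * w2) / (w1 + w2)

def simulate_mimicking(n_paths=40_000, n_steps=200, T=1.0, seed=0):
    """Euler scheme for dY^ = sqrt(c^(Y^, t)) dW^, started at 0."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    y = np.zeros(n_paths)
    for k in range(n_steps):
        t = max(k * dt, dt)          # avoid t = 0 in hat_c
        dw = rng.standard_normal(n_paths) * np.sqrt(dt)
        y += np.sqrt(hat_c(y, t)) * dw
    return y

# One-dimensional marginals should match: Var(Y~_1) = (1 + 4)/2 = 2.5.
y_hat = simulate_mimicking()
print(y_hat.var())
```

The printed sample variance should be close to 5/2, up to Monte Carlo and Euler discretization error, which is the one-dimensional-marginal matching that the theorem guarantees in continuous time.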

Proof. To free up some notational space, we add a tilde to every symbol in
the statement of the theorem which does not have a hat, and we set Φ := Φ̃.
The first thing that we do is transport the problem from Ω̃ to a space with
more structure where it is easier to work. Set Ω′ := E×C_0(R_+; R^{d+d+d²}) and
set Ω := [0, 1]×Ω′. We will write a typical point of Ω as ω = (u, ω′) = (u, e, x),
where u ∈ [0, 1], ω′ ∈ Ω′, e ∈ E, and x ∈ C_0(R_+; R^{d+d+d²}). We define the
random variables U(u, ω′) := u, E(u, e, x) := e, and X(u, e, x) := x, and we
subdivide X as (Y, B, C) = X, where Y ∈ C_0(R_+; R^d), B ∈ C_0(R_+; R^d),
and C ∈ C_0(R_+; R^d⊗R^d). We also define the continuous, E-valued process
Z := Φ(E, Y). Let E′, X′, Y′, Z′, B′, and C′ denote the corresponding
random variables defined on Ω′.
    Letting E denote the Borel σ-field on E, we see that Ω′ is a Polish space with
Borel σ-field G := E⊗σ(X). The filtration on Ω′ is given by G⁰ := {G_t⁰}_{t∈R_+},
where G_t⁰ = E⊗σ(X^t). Letting R_{[0,1]} denote the Borel σ-field on [0, 1], we see
that Ω is also a Polish space with Borel σ-field F := R_{[0,1]}⊗G. The filtration
on Ω is F⁰ := {F_t⁰}_{t∈R_+}, where F_t⁰ = R_{[0,1]}⊗G_t⁰. Because [0, 1]×E is a Polish
space, we are in Setting 3.1, and we may apply the results from Chapter 3
to measures on Ω.
    We define the measure

(4.33)    Q := L(Z̃_0, Ỹ, B̃, C̃)

on Ω′ and the measure P := λ_{[0,1]}×Q on Ω. In particular, we have

(4.34)    L(E′, Y′, Z′, B′, C′ | Q) = L(Z̃_0, Ỹ, Z̃, B̃, C̃).

By taking divided differences on the left (e.g., Lem. C.10), we may find G⁰-
predictable processes b′ and c′ such that, for each ω′ ∈ Ω′, we have b′_t(ω′) =
(∂/∂t)B′_t(ω′) and c′_t(ω′) = (∂/∂t)C′_t(ω′) whenever these derivatives exist. As B̃ and
C̃ are P̃-a.s. absolutely continuous, (4.34) implies that B′ and C′ are Q-a.s.


absolutely continuous. Pathwise arguments show that we have

(4.35)    Q[ B′_t = ∫_0^t b′_s ds ∀t ] = 1, and

(4.36)    Q[ C′_t = ∫_0^t c′_s ds ∀t ] = 1.

Setting b(u, ω′) := b′(ω′) and c(u, ω′) := c′(ω′), we see that the corresponding
properties also hold under P. It is then clear from the product structure that
U is strongly independent of (Y, Z, B, C, b, c) under P.
    We have P̃[B̃_t = ∫_0^t b̃_s ds ∀t] = 1, P[B_t = ∫_0^t b_s ds ∀t] = 1, and L(B̃) =
L(B | P). As B̃ and B are both a.s. absolutely continuous, all of the
information about their derivatives is essentially encoded in their common
law. In particular, the random variables ∫_0^T ‖b̃_s‖ ds and ∫_0^T ‖b_s‖ ds (under P)
agree in law. The details of this argument are given in Cor. C.18. This means
that we have

(4.37)    E^P[ ∫_0^t ‖b_s‖ ds ] = Ẽ[ ∫_0^t ‖b̃_s‖ ds ] < ∞.

Repeating the argument for c gives

(4.38)    E^P[ ∫_0^t ‖c_s‖ ds ] = Ẽ[ ∫_0^t ‖c̃_s‖ ds ] < ∞.

    We will now show that b̂(Z_t, t) is still a version of E^P[b_t | Z_t] for Lebesgue-
a.e. t. Fixing any t ∈ R_+ and any bounded, E⊗R_+-measurable f : E×R_+ → R,
we write

    E^P[ ∫_0^t b_s f(Z_s, s) ds ] = Ẽ[ ∫_0^t b̃_s f(Z̃_s, s) ds ]
                               = Ẽ[ ∫_0^t b̂(Z̃_s, s) f(Z̃_s, s) ds ]
                               = E^P[ ∫_0^t b̂(Z_s, s) f(Z_s, s) ds ].

The first and last equalities follow from the fact that L(B, Z | P) = L(B̃, Z̃)
(e.g., Cor. C.18). The middle equality follows from our assumption that


b̂(Z̃_t, t) is a version of Ẽ[b̃_t | Z̃_t] for Lebesgue-a.e. t and Lem. 4.1. We then
apply Lem. 4.1 again to conclude that b̂(Z_t, t) is a version of E^P[b_t | Z_t] for
Lebesgue-a.e. t. It follows in the same way that ĉ(Z_t, t) is a version of
E^P[c_t | Z_t] for Lebesgue-a.e. t. In particular, (4.4) of Lem. 4.1 asserts that

(4.39)    E^P[ ∫_0^t ‖b̂(Z_s, s)‖ ds ] < ∞  ∀t ∈ R_+, and

(4.40)    E^P[ ∫_0^t ‖ĉ(Z_s, s)‖ ds ] < ∞  ∀t ∈ R_+.

    Ỹ has the characteristics (B̃, C̃) with respect to (F̃, P̃). It follows from
(4.33) that Y′ has the characteristics (B′, C′) with respect to (G⁰, Q). As
the only difference between (F⁰, P) and (G⁰, Q) is the addition of an F_0⁰-
measurable random variable that is independent of σ(E, X), we may conclude
that Y still has the characteristics (B, C) with respect to (F⁰, P).
    We define the random times T_0^n := 0, T_i^n := (U+i−1)/n for i ∈ {1, . . . , n²},
and T_{n²+1}^n := ∞. We collect these random times into the random partitions

    π(n) := {0 = T_0^n ≤ . . . ≤ T_{n²}^n ≤ n}.

Notice that each T_i^n is trivially an F⁰-stopping time as T_i^n is F_0⁰-measurable,
and notice that the sequence of partitions {π(n)}_n converges uniformly to
the identity.
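The randomized partitions above are easy to instantiate. A small sketch (the helper names are our own):

```python
import random

# Sketch of the randomized partitions pi(n): T_0 = 0 and T_i = (U + i - 1)/n
# for 1 <= i <= n^2, so every reset time is known once the F_0-measurable
# draw U is known, and the mesh |pi(n)| shrinks like 1/n.

def random_partition(n, u):
    """The finite part [T_0, ..., T_{n^2}] of pi(n) for U = u in [0, 1]."""
    return [0.0] + [(u + i - 1) / n for i in range(1, n * n + 1)]

def mesh(partition):
    return max(b - a for a, b in zip(partition, partition[1:]))

random.seed(1)
u = random.random()
for n in (2, 5, 50):
    ts = random_partition(n, u)
    assert all(a <= b for a, b in zip(ts, ts[1:]))    # nondecreasing
    assert ts[-1] <= n                                # T_{n^2} <= n
    assert abs(mesh(ts) - 1.0 / n) < 1e-12            # |pi(n)| -> 0
```

Only the first gap T_1^n − T_0^n = U/n is random; all later gaps are exactly 1/n, which is why each T_i^n is measurable at time 0.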
    We now define the additional objects that we need to specify a generalized
partition. For each n ∈ N, let

    G_0^n := H_0^n := F_0⁰ = σ(U, E),
    G_i^n := σ(U, Z_{T_i^n}) for 1 ≤ i ≤ n², and
    H_i^n := σ( G_{i−1}^n, ∆(X^{T_i^n}, T_{i−1}^n) ) for 1 ≤ i ≤ n²+1.

Intuitively, this structure means that the only historical information that we
keep at the reset time T_i^n is the value of U and the current location of Z.
Notice that T_1^n − T_0^n = U/n and T_i^n − T_{i−1}^n = 1/n for i > 1, so T_i^n − T_{i−1}^n is
always G_{i−1}^n-measurable. As Z may be updated using only the changes in
Y, we have

(4.41)    Θ(Z^{T_i^n}, T_{i−1}^n) ∈ H_i^n  ∀i ∈ {1, . . . , n²+1}.


To show this rigorously, we write

         Θ(Z^{T_i^n}, T_{i−1}^n) = Θ^{T_i^n−T_{i−1}^n}(Z, T_{i−1}^n)
                              = Θ^{T_i^n−T_{i−1}^n}( Φ(E, Y), T_{i−1}^n )
(4.42)                        = Φ^{T_i^n−T_{i−1}^n}( Φ_{T_{i−1}^n}(E, Y), ∆(Y, T_{i−1}^n) )
                              = Φ^{T_i^n−T_{i−1}^n}( Z_{T_{i−1}^n}, ∆(Y, T_{i−1}^n) )
(4.43)                        = Φ^{T_i^n−T_{i−1}^n}( Z_{T_{i−1}^n}, ∆^{T_i^n−T_{i−1}^n}(Y, T_{i−1}^n) )
                              = Φ^{T_i^n−T_{i−1}^n}( Z_{T_{i−1}^n}, ∆(Y^{T_i^n}, T_{i−1}^n) ),

where we use property (b) of Def. 2.1 at (4.42) and property (a) of Def. 2.1 at
(4.43). Because T_i^n − T_{i−1}^n, Z_{T_{i−1}^n}, and ∆(Y^{T_i^n}, T_{i−1}^n) are all H_i^n-measurable,
we have now written Θ(Z^{T_i^n}, T_{i−1}^n) as a function of H_i^n-measurable random
variables.
    We now set Π(n) := {(T_i^n, G_i^n)}_{0≤i≤n²} for n ≥ 1, and we show that each
Π(n) is an extended partition. Specifically, we need to check that

(a) each T_i^n is a finite F⁰-stopping time,

(b) T_i^n − T_{i−1}^n ∈ σ( G_{i−1}^n, ∆(X, T_{i−1}^n) ), and

(c) G_i^n ⊂ H_i^n.

Claim (a) holds as π(n) is uniformly bounded by n and T_i^n ∈ F_0⁰ for all i,
and we have already shown (b). Writing Z_{T_i^n} = Θ_{T_i^n−T_{i−1}^n}(Z^{T_i^n}, T_{i−1}^n), we see
that (4.41) and T_i^n − T_{i−1}^n ∈ G_{i−1}^n imply that Z_{T_i^n} ∈ H_i^n for each i, so (c)
holds as well. We now use this sequence of extended partitions to define a
sequence of measures on Ω, setting P^n := P^{⊗Π(n)} for each n ∈ N.
We will now show that the collection of laws, {L (E, Y | Pn )}n , is tight.
{L (E, Y | Pn )}n is tight if and only if the collections {L (E | Pn )}n and
{L (Y | Pn )}n are both tight (e.g., Lem. B.5), so we may check each collection
individually. {L (E | Pn )}n contains a single element, so it is clearly tight.
Because Y has the characteristics (B, C) under P, and ∆(Y, Tin ), ∆(B, Tin ),
and ∆(C, Tin ) are all trivially ∆(X, Tin )-measurable for each i ∈ {0, 1, . . . , n2 },
we may apply Lem. 3.47 to conclude that Y has the characteristics (B, C)
with respect to any measure in the set {Pn }n . As we have (4.37) and (4.38),
we may apply Cor. 3.44 to conclude that the collection {L (B, C | Pn )}n
is tight. We then use the results of Rebolledo (e.g., Cor. E.12 or [JS87]
Thm. VI.4.18) to conclude that the collection {L (Y | Pn )}n is tight.


    As the collection of laws {L(E, Y | P^n)}_n is tight, we may assume (by
passing to a subsequence if necessary) that L(E, Y | P^n) ⇒ P̂ for some
measure P̂ on the Polish space Ω̂ := E×C_0(R_+; R^d) with Ê(e, x) := e and
Ŷ(e, x) := x. We also set Ẑ := Φ(Ê, Ŷ), and we define the filtration F̂⁰ :=
{F̂_t⁰}_{t∈R_+}, where F̂_t⁰ := σ(Ê, Ŷ^t). As we assumed that Φ is continuous, we
have

(4.44)    L(E, Y, Z | P^n) ⇒ L(Ê, Ŷ, Ẑ).

    We now show that L(Z_t | P^n) = L(Z_t | P) for all n ∈ N and t ∈ R_+.
Fix any A ∈ E and t ∈ R_+. As U ∈ H_i^n for all i, the event {t ∈ [T_{i−1}^n, T_i^n)}
and the random variable S_i^n := (t − T_{i−1}^n)⁺ are both H_i^n-measurable. Notice
that Z_t = Θ_{S_i^n}(Z^{T_i^n}, T_{i−1}^n) when t ∈ [T_{i−1}^n, T_i^n). Combining this observation
with (4.41) gives

    P^n[Z_t ∈ A] = Σ_{i=1}^{n²+1} P^n[ Θ_{S_i^n}(Z^{T_i^n}, T_{i−1}^n) ∈ A and t ∈ [T_{i−1}^n, T_i^n) ]
                = Σ_{i=1}^{n²+1} P[ Θ_{S_i^n}(Z^{T_i^n}, T_{i−1}^n) ∈ A and t ∈ [T_{i−1}^n, T_i^n) ]
                = P[Z_t ∈ A],

where we have used the fact that P^n agrees with P on each H_i^n (e.g., (a) of
Thm. 3.6). It then follows from (4.44) that L(Ẑ_t) = L(Z_t | P) = L(Z̃_t) for
all t ∈ R_+.
    To complete the proof, we need to characterize the limit. We will show
that Ŷ has the characteristics (B̂, Ĉ) with respect to (F̂, P̂) by showing that

(4.45)    L(Y, Z, B, C | P^n) ⇒ L(Ŷ, Ẑ, B̂, Ĉ),

and applying Thm. F.1. As a first step, define the processes

    b̄_t := b̂(Z_t, t),        B̄_t := ∫_0^t b̄_s ds,
    c̄_t := ĉ(Z_t, t), and    C̄_t := ∫_0^t c̄_s ds.


As we have (4.39), (4.40), and (4.44), and we have shown that Z has the
same one-dimensional marginal distributions under each P^n, we may apply
Lem. 4.15 and Rem. 4.18 to conclude that

(4.46)    L(E, Y, Z, B̄, C̄ | P^n) ⇒ L(Ê, Ŷ, Ẑ, B̂, Ĉ),

(4.47)    {(b̄, P^n×λ_{[0,t]})}_n is u.i. ∀t ∈ R_+, and

(4.48)    {(c̄, P^n×λ_{[0,t]})}_n is u.i. ∀t ∈ R_+.

If we show that lim_n P^n[d(B, B̄) > ε] = 0 and lim_n P^n[d(C, C̄) > ε] = 0 for
each ε, then (4.45) follows from (4.46) (e.g., Lem. B.2). We will actually do
slightly more. We will show that

(4.49)    lim_{n→∞} E^n[ sup_{s≤t} ‖B_s − B̄_s‖ ] = 0  ∀t ∈ R_+, and

(4.50)    lim_{n→∞} E^n[ sup_{s≤t} ‖C_s − C̄_s‖ ] = 0  ∀t ∈ R_+.

    We now show that (4.49) holds by approximating B and B̄ with step func-
tions. As a first step, we show that there exist random variables {ξ_i^n}_{1≤i≤n²}
such that P[ξ_i^n = b_{T_i^n}] = 1 and ξ_i^n is H_{i+1}^n-measurable. Recall that T_i^n :=
(U + i − 1)/n for i ∈ {1, . . . , n²}, and define the R^d-valued random variables

    ξ_i^n := lim inf_{m→∞} m ( B_{T_i^n+1/m} − B_{T_i^n} )

for i ∈ {1, . . . , n²}, where the lim inf is taken in each coordinate. As T_{i+1}^n −
T_i^n ≥ 1/n when i ≥ 1, it is clear that ξ_i^n ∈ σ( ∆(X^{T_{i+1}^n}, T_i^n) ) ⊂ H_{i+1}^n. In
prose, ξ_i^n is the right derivative of B at the time T_i^n (when it exists), so ξ_i^n
is fully determined by the changes in B just after T_i^n. For each ω′ ∈ Ω′, we
define the sets

    A_{ω′}^{n,i} := { u ∈ [0, 1] : ξ_i^n(u, ω′) = b′_{(u+i−1)/n}(ω′) }, and
    B_{ω′}^{n,i} := { u ∈ [0, 1] : (∂/∂t)B′_t(ω′) exists at t = (u+i−1)/n }.

Recall that b′_t(ω′) = (∂/∂t)B′_t(ω′) whenever this derivative exists. It is clear
from the construction of ξ_i^n that ξ_i^n(u, ω′) = (∂/∂t)B′_t(ω′) at t = (u+i−1)/n
whenever this derivative exists. Combining these two observations, we see


that B_{ω′}^{n,i} ⊂ A_{ω′}^{n,i} for all ω′ ∈ Ω′. Using Fubini's Theorem, we may write

    P[ξ_i^n = b_{T_i^n}] = ∫_{Ω′} λ_{[0,1]}(A_{ω′}^{n,i}) Q(dω′).

If we choose an ω′ ∈ Ω′ such that B′(ω′) is absolutely continuous, then we
have λ_{[0,1]}(B_{ω′}^{n,i}) = 1, so λ_{[0,1]}(A_{ω′}^{n,i}) = 1 as well. As B′ is Q-a.s. absolutely
continuous, this equality holds for Q-a.e. ω′, and we conclude that

(4.51)    P[ξ_i^n = b_{T_i^n}] = 1.

    We pause for a brief comment on this argument. We never assumed that
B_{ω′}^{n,i} is a Borel measurable subset of [0, 1]. If we choose ω′ ∈ Ω′ such that
B′(ω′) is absolutely continuous, then B_{ω′}^{n,i} is necessarily Lebesgue measurable
as it is the complement of a null set. On the other hand, A_{ω′}^{n,i} is a cross section
of the R_{[0,1]}⊗G-measurable set {ξ_i^n = b_{T_i^n}}, so it is Borel measurable. We
only apply Fubini's Theorem to the set {ξ_i^n = b_{T_i^n}}, so the potential lack of
Borel measurability of the sets B_{ω′}^{n,i} is not a problem.
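The construction of ξ_i^n is just a scaled forward difference. A numerical sketch with a hypothetical smooth density (our own choice b_s = cos s, so the right derivative exists everywhere):

```python
import math

# Numerical sketch of the xi_i^n construction: for an absolutely continuous
# B_t = int_0^t b_s ds, the scaled forward differences m (B_{T + 1/m} - B_T)
# recover the density b at T wherever b is continuous. We use the
# hypothetical choice b_s = cos(s), so that B_t = sin(t).

def B(t):
    return math.sin(t)

def xi(T, m):
    """Forward difference quotient approximating the right derivative at T."""
    return m * (B(T + 1.0 / m) - B(T))

T = 0.7                          # a "reset time"
errors = [abs(xi(T, m) - math.cos(T)) for m in (10, 100, 10_000)]
assert errors[0] > errors[1] > errors[2]     # finer steps, better estimate
assert errors[-1] < 1e-3
```

In the proof the lim inf over m plays the role of this limit, which exists P-a.s. at T_i^n precisely because U is independent of B and B is absolutely continuous.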
    We now define some sequences of step functions which we will use to
approximate b and b̄. Let

    b^n_t := Σ_{i=1}^{n²} ξ_i^n 1_{[T_i^n, T_{i+1}^n)}(t),        B^n_t := ∫_0^t b^n_s ds,

    b̄^n_t := Σ_{i=1}^{n²} b̄_{T_i^n} 1_{[T_i^n, T_{i+1}^n)}(t),      B̄^n_t := ∫_0^t b̄^n_s ds, and

    b^{Π(n)}_t := Σ_{i=1}^{n²} b_{T_i^n} 1_{[T_i^n, T_{i+1}^n)}(t).

As P[ξ_i^n = b_{T_i^n}] = 1 for each i, b^n and b^{Π(n)} are P-indistinguishable. Each B^n
is piecewise affine, so, for each ω, we have (∂/∂t)B^n_t(ω) = b^n_t(ω) except at a finite
number of points. It is also clear that ∆(B^n, T_i^n) is σ(ξ_j^n : j ≥ i)-measurable.
As each ξ_j^n is H_{j+1}^n-measurable and H_{j+1}^n ⊂ σ( G_i^n, ∆(X, T_i^n) ) for j ≥ i, we
conclude that ∆(B^n, T_i^n) is σ( G_i^n, ∆(X, T_i^n) )-measurable. This means that
we may apply Lem. 3.40 to the R^d×R^d-valued process (B, B^n). In particular,
taking f : R^d×R^d → R to be the function f(x, y) = ‖x − y‖, we may apply


Lem. 3.40 to conclude that, for each n, we have

    E^n[ ∫_0^t ‖b_s − b^n_s‖ ds ] = E^P[ ∫_0^t ‖b_s − b^n_s‖ ds ]  ∀t ∈ R_+.

Fixing any t ∈ R_+, we then write

    lim sup_{n→∞} E^n[ sup_{s≤t} ‖B_s − B^n_s‖ ] ≤ lim sup_{n→∞} E^n[ ∫_0^t ‖b_s − b^n_s‖ ds ]
                                           = lim sup_{n→∞} E^P[ ∫_0^t ‖b_s − b^n_s‖ ds ]
                                           = lim sup_{n→∞} E^P[ ∫_0^t ‖b_s − b^{Π(n)}_s‖ ds ].

U is strongly independent of b under P, so we may apply Lem. 4.20 to con-
clude that

    lim_{n→∞} E^P[ ∫_0^t ‖b_s − b^{Π(n)}_s‖ ds ] = 0.

In particular, we have now shown that

(4.52)    lim_{n→∞} E^n[ sup_{s≤t} ‖B_s − B^n_s‖ ] = 0  ∀t ∈ R_+.

    To estimate the distance between b̄ and b̄^n, first notice that

(4.53)    ∫_{T_{i−1}^n∧t}^{T_i^n∧t} ( b̄_s − b̄^n_s ) ds
              = ∫_{T_{i−1}^n∧t}^{T_i^n∧t} ( b̂(Z_s, s) − b̂(Z_{T_{i−1}^n}, T_{i−1}^n) 1_{i>1} ) ds
              = ∫_0^{1/n∧(t−T_{i−1}^n)⁺} ( b̂( Θ_s(Z^{T_i^n}, T_{i−1}^n), T_{i−1}^n + s )
                    − b̂( Θ_0(Z^{T_i^n}, T_{i−1}^n), T_{i−1}^n ) 1_{i>1} ) ds.

Θ(Z^{T_i^n}, T_{i−1}^n), T_{i−1}^n, and 1/n ∧ (t−T_{i−1}^n)⁺ are all H_i^n-measurable, so (4.53) is
H_i^n-measurable as well. Then, fixing any t and using the fact that P^n and


P agree on each H_i^n (e.g., (a) of Thm. 3.6), we write

    lim sup_{n→∞} E^n[ ∫_0^t ‖b̄_s − b̄^n_s‖ ds ] = lim sup_{n→∞} Σ_{i=1}^{n²} E^n[ ∫_{T_{i−1}^n∧t}^{T_i^n∧t} ‖b̄_s − b̄^n_s‖ ds ]
                                          = lim sup_{n→∞} Σ_{i=1}^{n²} E^P[ ∫_{T_{i−1}^n∧t}^{T_i^n∧t} ‖b̄_s − b̄^n_s‖ ds ]
                                          = lim sup_{n→∞} E^P[ ∫_0^t ‖b̄_s − b̄^n_s‖ ds ].

Once we have reduced the estimate to a calculation under P, we may use the
fact that U is strongly independent of Z, so U is strongly independent of b̄,
and we may again apply Lem. 4.20 to conclude that

    lim_{n→∞} E^P[ ∫_0^t ‖b̄_s − b̄^n_s‖ ds ] = 0.

In particular, we have now shown that

(4.54)    lim_{n→∞} E^n[ ∫_0^t ‖b̄_s − b̄^n_s‖ ds ] = 0  ∀t ∈ R_+.

As E^n[ sup_{s≤t} ‖B̄_s − B̄^n_s‖ ] ≤ E^n[ ∫_0^t ‖b̄_s − b̄^n_s‖ ds ], (4.54) implies

(4.55)    lim_{n→∞} E^n[ sup_{s≤t} ‖B̄_s − B̄^n_s‖ ] = 0  ∀t ∈ R_+.

    We are now almost done. We only need to estimate the difference between
B^n and B̄^n. To do this, we define

    Ψ^n_t := B^n_t − B̄^n_t = ∫_0^t ( b^n_s − b̄^n_s ) ds.

We will now show that

(4.56)    {(b^n − b̄^n, P^n×λ_{[0,t]})}_n is u.i. ∀t ∈ R_+.

Combining (4.47) and (4.54) shows that {(b̄^n, P^n×λ_{[0,t]})}_n is uniformly inte-
grable for each t, so we only need to show that {(b^n, P^n×λ_{[0,t]})}_n is uniformly


integrable for each t. As ξ_i^n and T_{i+1}^n ∧ t − T_i^n ∧ t are both H_{i+1}^n-measurable,
we may break the integral into pieces again to show that

          sup_n E^n[ ∫_0^t ‖b^n_s‖ 1_{‖b^n_s‖>M} ds ]
              = sup_n E^n[ Σ_{i=1}^{n²} ( T_{i+1}^n ∧ t − T_i^n ∧ t ) ‖ξ_i^n‖ 1_{‖ξ_i^n‖>M} ]
              = sup_n E^P[ Σ_{i=1}^{n²} ( T_{i+1}^n ∧ t − T_i^n ∧ t ) ‖ξ_i^n‖ 1_{‖ξ_i^n‖>M} ]
(4.57)        = sup_n E^P[ ∫_0^t ‖b^n_s‖ 1_{‖b^n_s‖>M} ds ].

We have already shown that b^n converges to b in L¹(P×λ_{[0,t]}), which implies
that {(b^n, P×λ_{[0,t]})}_n is uniformly integrable. In particular, we may make
(4.57) arbitrarily small by choosing sufficiently large M. But this means
that {(b^n, P^n×λ_{[0,t]})}_n is uniformly integrable, so we have (4.56).
n
    Set δ_i^n := T_i^n − T_{i−1}^n = (U/n) 1_{i=1} + (1/n) 1_{i>1}. We then write

    E^n[ Ψ^n_{T_i^n} − Ψ^n_{T_{i−1}^n} | F⁰_{T_{i−1}^n} ] = δ_i^n E^n[ ξ_{i−1}^n − b̂(Z_{T_{i−1}^n}, T_{i−1}^n) | F⁰_{T_{i−1}^n} ]
                                         = δ_i^n E^P[ ξ_{i−1}^n − b̂(Z_{T_{i−1}^n}, T_{i−1}^n) | G_{i−1}^n ]
                                         = δ_i^n E^P[ b_{T_{i−1}^n} | Z_{T_{i−1}^n}, T_{i−1}^n ] − δ_i^n b̂(Z_{T_{i−1}^n}, T_{i−1}^n)
                                         = 0.

The first equality follows from the F_0⁰-measurability of δ_i^n. The second equal-
ity follows from the H_i^n-measurability of ξ_{i−1}^n − b̂(Z_{T_{i−1}^n}, T_{i−1}^n) and property
(b) of Thm. 3.6. The third equality follows from the P-equivalence of ξ_{i−1}^n
and b_{T_{i−1}^n} and the definition of G_{i−1}^n. The final equality follows from Cor. 4.12
and the fact that U is strongly independent of b and Z under P. This means
that {Ψ^n_{T_i^n}}_{0≤i≤n²} is a discrete-time martingale under P^n, and we may apply
Lem. 4.25 to conclude that

(4.58)    lim_{n→∞} E^n[ sup_{s≤t} ‖B^n_s − B̄^n_s‖ ] = lim_{n→∞} E^n[ sup_{s≤t} ‖Ψ^n_s‖ ] = 0  ∀t ∈ R_+.


    Combining (4.52), (4.55), and (4.58) gives (4.49). We make essentially
the same argument to get (4.50). Combining (4.46) with (4.49) and (4.50)
then gives (4.45), so we may apply Thm. F.1 to conclude that Ŷ has the
characteristics (B̂, Ĉ) with respect to (F̂, P̂), completing the proof.

4.59 Remark. In this proof, we constructed ξ_i^n such that P[ξ_i^n = b_{T_i^n}] = 1 for
each i by taking the right derivative of B at T_i^n. In this remark, we want to
emphasize that this does not imply that ξ_i^n and b_{T_i^n} are P^n-indistinguishable.
In general, the measures in the sequence {P^n}_n are not equivalent to P. The
reason that ξ_i^n and b_{T_i^n} agree under P is that U is (strongly) independent of
B under P, and B is absolutely continuous. As a result, T_i^n is P-a.s. a point
at which B is differentiable, and the left and right derivatives agree at such
a point. Once we start constructing new measures, U and B are no longer
independent. In fact, we would expect that the characteristics quite often
have “kinks” at reset times, as we reset the dynamics of the process at these
times, so we should not expect the left and right derivatives to agree at these
points.
    In particular, if we resume the setting of Remark 4.32, we see that C
is P-a.s. linear, so C is P-a.s. differentiable for all t > 0; however, C has a
“kink” at each reset time under each P^n whenever we “reflip” and change
the variance accumulation rate. Also notice that the right derivative of C at
the reset time t_i^n is equal to the derivative of C over the interval (t_i^n, t_{i+1}^n),
while the left derivative of C at the reset time t_i^n is equal to the derivative
of C over the previous interval (t_{i−1}^n, t_i^n).

    To get the theorem announced in Section 2.2, we must show that we can
add a Wiener process to the stochastic basis produced in Thm. 4.30. This
involves moving to an extension, so we make the following definition.

4.60 Definition. Let X denote the canonical process on the space C(R_+; R^r),
let C denote the Borel σ-field on C(R_+; R^r), let C⁰ = {σ(X^t)}_{t∈R_+} de-
note the filtration generated by X, and let W denote Wiener's measure on
C(R_+; R^r). We refer to W := ( C(R_+; R^r), C, C⁰, W ) as Wiener's basis on
C(R_+; R^r).

    According to this definition, Wiener's basis on C(R_+; R^r) does not satisfy
the usual conditions, but this will not matter in what follows. We restate
the result presented in Section 2.2 for the reader's convenience.


2.11 Theorem. Let W be an R^{r_1}-valued Wiener process, let μ be an adapted,
R^d-valued process, let σ be an adapted, R^d⊗R^{r_1}-valued process, assume that

(2.12)    E[ ∫_0^t (‖μ_s‖ + ‖σ_s σ_s^T‖) ds ] < ∞  ∀t ∈ R_+,

and set

(2.13)    Y_t := ∫_0^t μ_s ds + ∫_0^t σ_s dW_s.

Let E be a Polish space, and let Z be a continuous, E-valued process with
Z = Φ(Z_0, Y) for some continuous updating function Φ. Finally, suppose
that N ⊂ R_+ is a Lebesgue-null set and that we have (deterministic) functions
μ̂ : E×R_+ → R^d and σ̂ : E×R_+ → R^d⊗R^{r_2} such that μ̂(Z_t, t) = E[μ_t | Z_t]
a.s. and σ̂σ̂^T(Z_t, t) = E[σ_t σ_t^T | Z_t] a.s. when t ∉ N.
    Then there exists a stochastic basis (Ω̂, F̂, F̂, P̂) supporting processes Ŵ,
Ŷ, and Ẑ such that

(a) Ŵ is an R^{r_2}-valued Wiener process,

(b) Ŷ_t = ∫_0^t μ̂(Ẑ_s, s) ds + ∫_0^t σ̂(Ẑ_s, s) dŴ_s,

(c) Ẑ is a continuous, E-valued process with Ẑ = Φ(Ẑ_0, Ŷ), and

(d) Ẑ has the same one-dimensional marginal distributions as Z.


Proof. Set b := μ, b̂ := μ̂, c := σσ^T, and ĉ := σ̂σ̂^T. It is clear from (2.13) that
Y is an Itô process with the characteristics B_t := ∫_0^t b_s ds and C_t := ∫_0^t c_s ds
(e.g., Lem. D.2), so Thm. 4.30 asserts the existence of a stochastic basis
B̃ = (Ω̃, F̃, F̃, P̃) that supports adapted, continuous processes Ỹ and Z̃ such
that Ỹ is an Itô process with characteristics (B̃, C̃), where B̃_t := ∫_0^t b̂(Z̃_s, s) ds
and C̃_t := ∫_0^t ĉ(Z̃_s, s) ds, Z̃ = Φ(Z̃_0, Ỹ), and Z̃ has the same one-dimensional
marginal distributions as Z. Set M̃ := Ỹ − B̃, so M̃ is a local martingale
with

    ⟨M̃⟩_t = C̃_t = ∫_0^t ĉ(Z̃_s, s) ds = ∫_0^t σ̂σ̂^T(Z̃_s, s) ds.

Let W denote Wiener's basis on C(R_+; R^{r_2}), and set B̂ = (Ω̂, F̂, F̂, P̂) :=
B̃⊗W (see Def. 1.11). Let B̂, Ĉ, M̂, Ŷ, Ẑ denote the extensions of B̃, C̃,
M̃, Ỹ, and Z̃ from Ω̃ to Ω̂ := Ω̃×C(R_+; R^{r_2}). Moving to the extension, we
still have B̂_t = ∫_0^t b̂(Ẑ_s, s) ds and Ẑ = Φ(Ẑ_0, Ŷ), and Ẑ still has the same
one-dimensional marginal distributions as Z. Thm. D.9 asserts the existence
of an F̂-adapted, continuous, R^{r_2}-valued Wiener process Ŵ defined on Ω̂ such
that

    M̂_t = ∫_0^t σ̂(Ẑ_s, s) dŴ_s.

As Ŷ = M̂ + B̂, we see that Ŷ satisfies (b), and we are done.

Appendix A

Galmarino’s Test

In this section, we assume that we are in Setting 3.1.

A.1 Lemma (Galmarino's test). Let T be an F-measurable, R_+-valued ran-
dom variable. The following are equivalent:

(a) T is an F⁰-stopping time, and

(b) if E(ω) = E(ω′) and X_u(ω) = X_u(ω′) for 0 ≤ u ≤ T(ω′), then T(ω) =
    T(ω′).

Moreover, if T is an F⁰-stopping time and Z is an F-measurable random
variable, then

(c) Z is F_T⁰-measurable if and only if Z = Z(E, X^T).

In particular, F_T⁰ = σ(E, X^T).

A.2 Remark. If T is the last time that X leaves an open set G, then
X_T ∉ G. This means that T is also the last time that the process stopped
at T leaves the set G. In particular, T = T(E, X^T), but T is clearly not a
stopping time as you must look into the future to determine if you will enter
the set G again later. In particular, the property which must be checked in
(b) is strictly stronger than the property which must be checked in (c).
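The distinction drawn in this remark can be seen on finite discrete paths (a hypothetical stand-in for C_0(R_+; R^d); the helper names are our own): a first-hitting time is determined by the path stopped at itself and satisfies criterion (b), while a last-exit time satisfies T = T(X^T) yet fails (b).

```python
# Discrete-path sketch of Galmarino's test: first-hitting vs last-exit times.

def first_hit(path, level):
    """First index i with path[i] >= level (len(path) if never)."""
    for i, x in enumerate(path):
        if x >= level:
            return i
    return len(path)

def last_exit(path, level):
    """Last index i with path[i] >= level (0 if never)."""
    hits = [i for i, x in enumerate(path) if x >= level]
    return hits[-1] if hits else 0

omega1 = [0, 1, 2, 1, 0, 0]
omega2 = [0, 1, 2, 1, 0, 2]      # agrees with omega1 up to index 4

t = first_hit(omega2, 2)         # = 2; the paths agree up to time t
assert omega1[: t + 1] == omega2[: t + 1]
assert first_hit(omega1, 2) == first_hit(omega2, 2)   # criterion (b) holds

s = last_exit(omega1, 2)         # = 2; the paths also agree up to time s
assert omega1[: s + 1] == omega2[: s + 1]
assert last_exit(omega1, 2) != last_exit(omega2, 2)   # criterion (b) fails
```

The last two assertions exhibit exactly the failure described above: the two paths agree up to the candidate time, yet the last-exit times differ because the future of the path matters.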

Proof. First we show that Z is F_t⁰-measurable if and only if Z is F-measurable
and Z = Z(E, X^t).
⇒ The class of bounded random variables such that Z = Z(E, X^t) is a mono-
tone class that contains finite products of the form f(E)g_1(X_{t_1}) · · · g_n(X_{t_n})
for bounded measurable f and g_i and 0 ≤ t_i ≤ t. The property then holds
for all bounded Z ∈ F_t⁰ by a monotone class argument.
⇐ E and X^t are both F_t⁰-measurable, and Z is F-measurable, so Z(E, X^t)
is also F_t⁰-measurable.

    Now we show the first equivalence.
⇒ Assume T is an F⁰-stopping time, and fix ω and ω′ with E(ω) = E(ω′) and
X_u(ω) = X_u(ω′) for 0 ≤ u ≤ t := T(ω′). T is an F⁰-stopping time, so
{T = t} ∈ F_t⁰. By the previous case,

    1_{T=t}(ω) = 1_{T=t}( E(ω), X^t(ω) )
              = 1_{T=t}( E(ω′), X^t(ω′) ) = 1_{T=t}(ω′) = 1.

⇐ Assume that property (b) holds. We need to show that this implies
{T ≤ t} ∈ F_t⁰. By the previous case, it is sufficient to show that

    ω ∈ {T ≤ t} ⇒ ( E(ω), X^t(ω) ) ∈ {T ≤ t}.

Fix ω ∈ {T ≤ t} and set ω^t := ( E(ω), X^t(ω) ). Then E(ω^t) = E(ω) and
X_u(ω^t) = X_u(ω) for 0 ≤ u ≤ T(ω) ≤ t. Using the assumption, we see that
T(ω^t) = T(ω), so T(ω^t) ≤ t and ω^t ∈ {T ≤ t}.

    Finally, we show (c).
⇒ Assume that Z ∈ F_T⁰. Fix any ω and set t := T(ω) and z := Z(ω). By
assumption, we have

    A := { T = t and Z = z } ∈ F_t⁰.

So ω ∈ A ⇒ ( E(ω), X^t(ω) ) ∈ A, but then Z( E(ω), X^t(ω) ) = z. As ω was
arbitrary, we conclude that Z = Z(E, X^T).
⇐ Suppose that Z is F-measurable, and that Z = Z(E, X^T). Fixing an
arbitrary constant z, we need to show that A := { Z ≤ z and T ≤ t } ∈ F_t⁰.
Fix some ω′ ∈ A and set ω := ( E(ω′), X^t(ω′) ). Then E(ω) = E(ω′) and
X_u(ω) = X_u(ω′) for 0 ≤ u ≤ T(ω′) ≤ t, so T(ω) = T(ω′) by the previous
equivalence. Then

    X^{T(ω)}(ω) = X^{T(ω′)}(ω) = X^{T(ω′)}(ω′).

Using the assumption, we see that

    Z(ω) = Z( E(ω), X^{T(ω)}(ω) ) = Z( E(ω′), X^{T(ω′)}(ω′) ) = Z(ω′),

so ω ∈ A. This implies that A ∈ F_t⁰, so we are done.

A.3 Lemma. Let S ≤ T be F⁰-stopping times with S ∈ R_+ and T ∈ R_+.
Then

    F_T⁰ = σ( F_S⁰, ∆(X^T, S) ).

Proof. By the previous lemma, we have F_T⁰ = σ(E, X^T). Writing

    X^T = X^S + Θ( ∆(X^T, S), −S ),

and observing that E, X^S, and S are all F_S⁰-measurable, we conclude that
F_T⁰ ⊂ σ( F_S⁰, ∆(X^T, S) ). On the other hand, F_S⁰, X^T, and S are all F_T⁰-
measurable. This means that we also have σ( F_S⁰, ∆(X^T, S) ) ⊂ F_T⁰, complet-
ing the proof.

A.4 Lemma. Suppose that S is a finite F⁰-stopping time and that T ≥ S
is an R_+-valued random time. Further suppose that G ⊂ F_S⁰ and that Z
is a random variable. Let I := (E, X) denote the identity operator on Ω,
set C_0 := C_0(R_+; R^d), and let C_0 denote the Borel σ-field on C_0. Then the
following are equivalent:

(a) Z is σ( G, ∆(X^T, S) )-measurable,

(b) Z = f( I; ∆(X^T, S) ) for some f : Ω×C_0 → R which is G⊗C_0-measurable,
    and

(c) Z = g( E, X^S; ∆(X^T, S) ) for some g : Ω×C_0 → R which is G⊗C_0-
    measurable.

Proof. Z ∈ bF means that Z is a bounded F-measurable random variable.
Let

    B := { Z ∈ bF : Z = f( I; ∆(X^T, S) ) with f ∈ bG⊗C_0 },

    C := { Z ∈ bF : Z = g( E, X^S; ∆(X^T, S) ) with g ∈ bG⊗C_0 }, and

    D := { Z ∈ bF : Z = Z′ h( ∆(X^T, S) ) with Z′ ∈ bG and h ∈ bC_0 }.

It is clear that D ⊂ B. As Z′ = Z′(E, X^S) by (c) of Lem. A.1, we also have
D ⊂ C. As B and C are closed with respect to uniformly bounded, pointwise
limits, they are monotone classes, and we have σ(D) = σ( G, ∆(X^T, S) ) ⊂
B ∩ C. In particular, (a) implies (b) and (c).
    Now assume Z = f( I; ∆(X^T, S) ) for some G⊗C_0/R-measurable f. By
checking the preimages of rectangles, we see that the map which sends
ω ↦ ( ω, ∆(X^T(ω), S(ω)) ) is σ( G, ∆(X^T, S) )/G⊗C_0-measurable. As Z is
the composition of this map with f, it follows that Z is σ( G, ∆(X^T, S) )/R-
measurable. In particular, (b) implies (a).
    Let φ denote the map ω ↦ ( E(ω), X^S(ω) ). If G ∈ G, then Lem. A.1
asserts that ω ∈ G ⇔ φ(ω) ∈ G. As a result, φ⁻¹(G) = G and φ is G/G-
measurable. Now assume that Z = g( φ; ∆(X^T, S) ) for some G⊗C_0/R-
measurable g. By checking the preimages of rectangles, we see that the
map which sends ω ↦ ( φ(ω), ∆(X^T(ω), S(ω)) ) is σ( G, ∆(X^T, S) )/G⊗C_0-
measurable. As Z is the composition of this map with g, it follows that Z
is σ( G, ∆(X^T, S) )/R-measurable. In particular, (c) implies (a), and we are
done.

A.5 Lemma. Let S ≤ T be finite F0-stopping times, and let U ≥ T be an R+-valued random time. If G ⊂ F0_S, then

σ(G, ∆(X^U, S)) ∩ F0_T = σ(G, ∆(X^T, S)) ∩ F0_T.

Proof. Fix any R+-valued random times U1 ≥ T and U2 ≥ T. We will show that

(A.6)  σ(G, ∆(X^{U1}, S)) ∩ F0_T ⊂ σ(G, ∆(X^{U2}, S)) ∩ F0_T.

To do this, choose any bounded random variable Z which is measurable with respect to σ(G, ∆(X^{U1}, S)) ∩ F0_T. Using Lem. A.4, we may choose some g ∈ G⊗C0 such that Z = g(E, X^S; ∆(X^{U1}, S)). Now fix any ω ∈ Ω, let t = T(ω), and set ω^t , (E(ω), X^t(ω)) ∈ Ω. As ω^t agrees with ω up until time t = T(ω) ≥ S(ω), we have T(ω^t) = T(ω) and S(ω^t) = S(ω) by (b) of Lem. A.1. This means that

X^S(ω^t) = ∇(X(ω^t), S(ω^t)) = ∇(X(ω^t), S(ω)) = ∇(X(ω), S(ω)) = X^S(ω).


We also have Ui(ω^t) ≥ T(ω^t) = t for i ∈ {1, 2}. This means that

X^{U1}(ω^t) = ∇(X(ω^t), U1(ω^t)) = X(ω^t) = ∇(X(ω^t), U2(ω^t)) = X^{U2}(ω^t).

As Z ∈ F0_T, an application of (b) of Lem. A.1 followed by the use of the characterization in terms of g gives

Z(ω) = Z(ω^t)
     = g(E(ω^t), X^S(ω^t); ∆(X^{U1}(ω^t), S(ω^t)))
     = g(E(ω), X^S(ω); ∆(X^{U2}(ω), S(ω))).

As ω is arbitrary, we have Z = g(E, X^S; ∆(X^{U2}, S)), and the characterization given in the preceding lemma implies that Z ∈ σ(G, ∆(X^{U2}, S)). We have now shown that (A.6) holds.


The result follows by first taking U1 = U and U2 = T and applying (A.6) to conclude that σ(G, ∆(X^U, S)) ∩ F0_T ⊂ σ(G, ∆(X^T, S)) ∩ F0_T, and then taking U1 = T and U2 = U and applying (A.6) to conclude that σ(G, ∆(X^T, S)) ∩ F0_T ⊂ σ(G, ∆(X^U, S)) ∩ F0_T.


A.7 Corollary. Let S ≤ T be finite F0-stopping times, and let U1 ≥ T and U2 ≥ T be R+-valued random times. If G ⊂ F0_S and T − S ∈ σ(G, ∆(X^{U1}, S)), then

(A.8)  σ(G, ∆(X^T, S)) = σ(G, ∆(X^{U2}, S)) ∩ F0_T, and

(A.9)  σ(G, ∆(X^T, S), ∆(X^{U2}, T)) = σ(G, ∆(X^{U2}, S)).

In particular, if T − S ∈ σ(G, ∆(X, S)), then T − S ∈ σ(G, ∆(X^T, S)).

Proof. S is an F0-stopping time and S ≤ T, so S is F0_T-measurable. This means that we have σ(G, ∆(X^T, S)) ⊂ F0_T and that T − S ∈ F0_T. As T − S ∈ σ(G, ∆(X^{U1}, S)) by assumption, we have

T − S ∈ σ(G, ∆(X^{U1}, S)) ∩ F0_T = σ(G, ∆(X^T, S)) ∩ F0_T = σ(G, ∆(X^{U2}, S)) ∩ F0_T

by the previous lemma. In particular, we know that T − S ∈ σ(G, ∆(X^{U2}, S)), so if we then write ∆(X^T, S) = ∇(∆(X^{U2}, S), T − S), then it is clear that


σ(G, ∆(X^T, S)) ⊂ σ(G, ∆(X^{U2}, S)), and we have one of the inclusions needed for (A.8). The opposite inclusion follows immediately from the previous lemma, as

σ(G, ∆(X^{U2}, S)) ∩ F0_T = σ(G, ∆(X^T, S)) ∩ F0_T ⊂ σ(G, ∆(X^T, S)).

To show that σ(G, ∆(X^T, S), ∆(X^{U2}, T)) ⊂ σ(G, ∆(X^{U2}, S)), we write

∆(X^T, S) = ∇(∆(X^{U2}, S), T − S),

∆(X^{U2}, T) = ∆(∆(X^{U2}, S), T − S),

and use the fact that T − S ∈ σ(G, ∆(X^{U2}, S)). To show that σ(G, ∆(X^{U2}, S)) ⊂ σ(G, ∆(X^T, S), ∆(X^{U2}, T)), we write

∆(X^{U2}, S) = ∆(X^T, S) + Θ(∆(X^{U2}, T), −(T − S)),

and use the fact that T − S ∈ σ(G, ∆(X^T, S)). We have now shown (A.9).


Appendix B

Metric Space-Valued Random Variables.

Here we collect a number of results on metric space-valued random variables.


We recall the following definition from Section 1.2.

1.9 Definition. Let E be a topological space, and let {X^n}_{n≤∞} be a sequence of E-valued random variables, possibly defined on different probability spaces. We say that X^n converges in distribution to X^∞, written X^n ⇒ X^∞, if

lim_{n→∞} E^n[f(X^n)] = E^∞[f(X^∞)]

for each bounded, continuous function f : E → R.

B.1 Theorem (Portmanteau). When E is a metric space, the following are equivalent:

(a) X^n ⇒ X^∞,

(b) E^n[f(X^n)] → E^∞[f(X^∞)] for all bounded, uniformly continuous f,

(c) lim sup_n P^n[X^n ∈ F] ≤ P^∞[X^∞ ∈ F] for all closed F ⊂ E,

(d) lim inf_n P^n[X^n ∈ G] ≥ P^∞[X^∞ ∈ G] for all open G ⊂ E, and

(e) P^n[X^n ∈ A] → P^∞[X^∞ ∈ A] for all A ⊂ E with P^∞[X^∞ ∈ ∂A] = 0.

Proof. See [Bil68] Thm. 2.1.


B.2 Lemma. Let (E, d) be a metric space and let {X n }n∈N and {Y n }n∈N be
collections of E-valued random variables. If X n ⇒ X ∞ and d(X n , Y n ) ⇒ 0
then Y n ⇒ X ∞ .
Proof. Fix any bounded, uniformly continuous f : E → R, and write

|E^∞[f(X^∞)] − E^n[f(Y^n)]| ≤ |E^∞[f(X^∞)] − E^n[f(X^n)]| + E^n[|f(X^n) − f(Y^n)|].

As d(X^n, Y^n) ⇒ 0 and f is uniformly continuous, we conclude that f(X^n) − f(Y^n) ⇒ 0, and the result follows.

B.3 Lemma. Let (E, d) be a metric space, let {S^n}_{n≤∞} be a sequence of probability spaces where S^n = (Ω^n, F^n, P^n), and suppose that on each space S^n there is defined an E-valued random variable Y^n and a collection of approximating random variables {X^{n,a}}_{a∈A}. If

(B.4)  inf_{a∈A} sup_{n≤∞} P^n[d(X^{n,a}, Y^n) > δ] = 0

for each δ > 0, and X^{n,a} ⇒ X^{∞,a} as n → ∞ for each a ∈ A, then Y^n ⇒ Y^∞.


Proof. Fix any bounded, uniformly continuous f : E → R, choose C such that |f| ≤ C, and then choose any ε > 0. Using the uniform continuity of f, choose δ = δ(ε) so small that |f(e_2) − f(e_1)| ≤ ε/6 when d(e_1, e_2) ≤ δ. Using (B.4), choose a ∈ A such that

sup_{n≤∞} P^n[d(X^{n,a}, Y^n) > δ] < ε/(12C).

Finally, choose N = N(ε, a) so large that |E^∞[f(X^{∞,a})] − E^n[f(X^{n,a})]| < ε/3


when n ≥ N. Putting this all together gives

|E^∞[f(Y^∞)] − E^n[f(Y^n)]|
  ≤ E^∞[|f(Y^∞) − f(X^{∞,a})|] + |E^∞[f(X^{∞,a})] − E^n[f(X^{n,a})]| + E^n[|f(X^{n,a}) − f(Y^n)|]
  ≤ (2C P^∞[d(Y^∞, X^{∞,a}) > δ] + ε/6) + ε/3 + (2C P^n[d(X^{n,a}, Y^n) > δ] + ε/6)
  ≤ ε

when n ≥ N. As f and ε are arbitrary, we conclude that Y^n ⇒ Y^∞.


B.5 Lemma. Let E_1 and E_2 be topological spaces, and let {X^i_n} be a collection of E_i-valued random variables for i ∈ {1, 2}. Then the collection of E_1×E_2-valued random variables {(X^1_n, X^2_n)} is tight if and only if the collection {X^1_n} is tight and the collection {X^2_n} is tight.

Proof. We let π_i : E_1×E_2 → E_i denote projection onto the ith component. We now check both implications:

(⇒) Fix ε and choose compact K ⊂ E_1×E_2 with P_n[(X^1_n, X^2_n) ∈ K] ≥ 1 − ε. Without loss of generality, we may assume that K = K_1×K_2 for compact sets K_i ⊂ E_i (otherwise, replace K with π_1(K)×π_2(K), and note that the continuous forward image of a compact set is compact). Then we have

P_n[X^i_n ∈ K_i] ≥ P_n[(X^1_n, X^2_n) ∈ K] ≥ 1 − ε.

(⇐) Fix ε and choose K_i with P_n[X^i_n ∉ K_i] ≤ ε/2. Then K_1×K_2 is compact, and

P_n[(X^1_n, X^2_n) ∉ K_1×K_2] ≤ P_n[X^1_n ∉ K_1] + P_n[X^2_n ∉ K_2] ≤ ε.

B.6 Lemma. Let E be a Polish space. If f ∈ C(R^d×R+; E), and F : C(R+; R^d) → C(R+; E) denotes the map such that F_t(x) = f(x(t), t) for x ∈ C(R+; R^d) and t ∈ R+, then F is a continuous map.

Proof. Fix a path x ∈ C(R+; R^d). If t_n → t_∞, then

F_{t_n}(x) = f(x(t_n), t_n) → f(x(t_∞), t_∞) = F_{t_∞}(x),

so F(x) is a continuous process.

Set x*(t) , sup_{s≤t} ‖x(s)‖ < ∞ for all t, so f is uniformly continuous when restricted to the compact set B(x*(t) + 1)×[0, t], where B(r) , {a ∈ R^d : ‖a‖ ≤ r}. Now let x_n → x, and fix t and ε > 0. Choose δ > 0 so small that ‖f(b, s) − f(a, s)‖ ≤ ε if a, b ∈ B(x*(t) + 1), s ≤ t, and ‖b − a‖ ≤ δ. Then choose N so large that sup_{s≤t} ‖x(s) − x_n(s)‖ ≤ δ ∧ 1 for all n ≥ N. For such n, we have

sup_{s≤t} ‖F_s(x) − F_s(x_n)‖ = sup_{s≤t} ‖f(x(s), s) − f(x_n(s), s)‖ ≤ ε,

as x(s) and x_n(s) are both in B(x*(t) + 1). In particular, F(x_n) → F(x).
In the previous proof, we used the fact that closed, bounded subsets of R^d are compact.
B.7 Theorem (Lusin's Theorem). Let E be a metric space, let µ be a finite measure on E, and let f be a real-valued, measurable function on E. Given any ε > 0, there exists a continuous function g such that µ({x : f(x) ≠ g(x)}) < ε.

Proof. See [Kec95] Thm. 17.12.
B.8 Lemma. Let (E, d) be a metric space, and let µ be a finite measure on that space. Then the collection of bounded, Lipschitz continuous functions on E is dense in L^p(E, µ) for any p ≥ 1.

Proof. Let f : E → R with 0 ≤ f ≤ M for some finite constant M. Fix any ε > 0, and use the last theorem to choose continuous g with µ({x : f(x) ≠ g(x)}) < ε 2^{−p−1} M^{−p}. Without loss of generality, we may assume that 0 ≤ g ≤ M; otherwise, replace g with (0 ∨ g) ∧ M. Let g_n(x) = inf_{y∈E} (g(y) + n d(y, x)), so we have 0 ≤ g_n ≤ g, g_n(x) → g(x) as n → ∞, and each g_n is Lipschitz continuous with constant n. Using bounded convergence, we may choose N so large that ∫_E |g − g_N|^p dµ < ε/2, and then

∫_E |f − g_N|^p dµ ≤ ∫_E |f − g|^p dµ + ∫_E |g_N − g|^p dµ ≤ ε.

The result follows for arbitrary f ∈ L^p(E, µ) by first truncating, and then approximating the positive and negative parts.
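The approximation g_n(x) = inf_{y∈E}(g(y) + n d(y, x)) used in this proof is easy to visualize numerically. The following sketch is not part of the dissertation; the grid, the particular g, and the helper names are illustrative choices. It evaluates g_n on a grid in E = [0, 1] and checks that 0 ≤ g_n ≤ g and that g_n increases toward g:

```python
def lipschitz_approx(g, xs, n):
    # g_n(x) = min over grid points y of g(y) + n*|y - x|;
    # each g_n is n-Lipschitz, and 0 <= g_n <= g when g >= 0
    return [min(g(y) + n * abs(y - x) for y in xs) for x in xs]

xs = [i / 200.0 for i in range(201)]
g = lambda x: min(1.0, 5.0 * abs(x - 0.5))  # continuous, 5-Lipschitz, 0 <= g <= 1

g4 = lipschitz_approx(g, xs, 4)
g64 = lipschitz_approx(g, xs, 64)
# g_4 <= g_64 <= g on the grid, and g_64 already agrees with g
# because 64 exceeds the Lipschitz constant of g
print(all(0.0 <= a <= b <= g(x) for a, b, x in zip(g4, g64, xs)))  # True
```

Once n exceeds the Lipschitz constant of a Lipschitz g, the infimum is attained at y = x, so g_n = g; for discontinuous g, the g_n provide the monotone Lipschitz regularization that the proof feeds into bounded convergence.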
B.9 Corollary. Let E be a metric space, let µ be a finite measure on E, and let f : E → R^d be a measurable function. Then there exists a sequence of bounded, Lipschitz continuous functions {f_n} such that ∫ ‖f − f_n‖ dµ → 0.


Proof. Letting f^i denote the ith component of f, we choose d sequences of R-valued functions, {f^i_n}_n, with f^i_n → f^i in L^1(E, µ), and we let f_n be the R^d-valued function with these components. Then

∫ ‖f − f_n‖ dµ ≤ Σ_{i=1}^d ∫ |f^i − f^i_n| dµ → 0.

Appendix C

FV and AC Processes

C.1 Definition. If f : D → R^d and [a, b] ⊂ D, then we define

Var_{[a,b]}(f) , sup_π Σ_{i=1}^n ‖f(t_i) − f(t_{i−1})‖,

where the supremum is taken over all partitions of the form

π = {a = t_0 < t_1 < ... < t_n = b}.

We say that f is of bounded variation on the interval [a, b] if Var_{[a,b]}(f) < ∞, and we let BV([a, b]; R^d) denote the collection of all such functions. We abbreviate Var_{[0,t]}(f) to Var_t(f).
Definition C.1 extends Def. 1.13.
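As a concrete illustration of Definition C.1 (not part of the dissertation; the curve, the grid sizes, and the helper name are illustrative choices), the supremum can be approached from below along refining partitions; for a smooth curve, the increment sums increase toward the arc length:

```python
import math

def partition_sum(f, a, b, n):
    # increment-norm sum of f over the uniform n-interval partition of [a, b];
    # refining the partition can only increase this sum, and its supremum
    # over all partitions is Var_[a,b](f)
    ts = [a + (b - a) * k / n for k in range(n + 1)]
    total = 0.0
    for s, t in zip(ts, ts[1:]):
        (x0, y0), (x1, y1) = f(s), f(t)
        total += math.hypot(x1 - x0, y1 - y0)
    return total

circle = lambda t: (math.cos(t), math.sin(t))
vals = [partition_sum(circle, 0.0, 2 * math.pi, n) for n in (4, 16, 256)]
print(vals)  # increasing toward Var = 2*pi, the circumference
```

Each chord sum underestimates the variation, and monotonicity under refinement is exactly why the supremum in the definition is well behaved.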
C.2 Definition. Let f : D → R^d. We say that f is absolutely continuous on the interval [a, b] if [a, b] ⊂ D and there exists a function δ : (0, ∞) → (0, ∞) such that Σ_{i=1}^n ‖f(t_i) − f(s_i)‖ < ε whenever s_i, t_i ∈ [a, b] with s_i < t_i, Σ_{i=1}^n |t_i − s_i| < δ(ε), and the intervals {(s_i, t_i)}_i are disjoint. We let AC([a, b]; R^d) denote the collection of all such functions.

It is clear that AC([a, b]; R^d) ⊂ BV([a, b]; R^d).
C.3 Theorem. If f ∈ BV([a, b]; R^d), then f′ exists for Lebesgue-a.e. t ∈ [a, b], and we have ∫_a^b ‖f′(u)‖ du < ∞. If f ∈ AC([a, b]; R^d), then

(C.4)  f(t) = f(a) + ∫_a^t f′(u) du  ∀t ∈ [a, b].


Proof. Write f = (f_i)_{1≤i≤d}. Recall that f′ exists at t if and only if f′_i exists at t for all i ∈ {1, ..., d}. It is clear that f_i ∈ BV([a, b]; R) for each i ∈ {1, ..., d}, so we may apply the scalar result to each component to conclude that there exist Lebesgue-null sets {N_i}_{1≤i≤d} such that f′_i exists when t ∉ N_i. Setting N = ∪_{1≤i≤d} N_i, we see that N is a Lebesgue-null set and f′ exists when t ∉ N.

If f ∈ AC([a, b]; R^d), then f_i ∈ AC([a, b]; R) for each i ∈ {1, ..., d}. Equation (C.4) then follows by applying the scalar result componentwise.

C.5 Theorem. If g : [a, b] → R^d is integrable on [a, b], and

f(t) = f(a) + ∫_a^t g(u) du  ∀t ∈ [a, b],

then f ∈ AC([a, b]; R^d), f′ exists for Lebesgue-a.e. t ∈ [a, b], and f′ = g for Lebesgue-a.e. t ∈ [a, b].


Proof. Given ε, we can choose δ so small that ∫_A ‖g(t)‖ dt < ε when A ⊂ [a, b] and λ(A) < δ. As ‖f(t) − f(s)‖ ≤ ∫_s^t ‖g(u)‖ du, the absolute continuity of f follows just as in the scalar-valued case. The previous theorem asserts that f′ exists for Lebesgue-a.e. t ∈ [a, b], and ∫_a^t (g(u) − f′(u)) du = 0 for all t ∈ [a, b]. Fixing any x ∈ R^d, we have

∫_a^t (g(u) − f′(u), x) du = (∫_a^t (g(u) − f′(u)) du, x) = 0  ∀t ∈ [a, b],

so we conclude that (g(t) − f′(t), x) = 0 for Lebesgue-a.e. t ∈ [a, b]. Letting {x_n} denote a countable dense subset of R^d, we can choose a single Lebesgue-null set N ⊂ [a, b] such that (g(t) − f′(t), x_n) = 0 for all n ∈ N when t ∉ N. This implies that g(t) − f′(t) = 0 when t ∉ N, so g = f′ Lebesgue-a.e.

If the function f : [a, b] → R is nondecreasing, then f ∈ BV([a, b]; R). The next result generalizes this observation.

C.6 Lemma. Let f : [a, b] → R^d⊗R^d. If f(t) − f(s) ∈ S^d_+ for all s, t ∈ [a, b] with s ≤ t, then f ∈ BV([a, b]; R^d⊗R^d).

Proof. It is clear that f^{ii} is nondecreasing and, therefore, of finite variation for each i ∈ {1, ..., d}. Letting {e_i} denote the canonical basis on R^d, and


fixing any s, t ∈ [a, b] with s < t, we see that

(f(s)(e_i ± e_j), e_i ± e_j) ≤ (f(t)(e_i ± e_j), e_i ± e_j).

This implies that

2|f^{ij}(t) − f^{ij}(s)| ≤ (f^{ii}(t) − f^{ii}(s)) + (f^{jj}(t) − f^{jj}(s)).

In particular, if we fix a partition π = {a = t_0 < t_1 < ... < t_n = b}, then

Σ_{k=1}^n |f^{ij}(t_k) − f^{ij}(t_{k−1})|
  ≤ Σ_{k=1}^n ((f^{ii}(t_k) − f^{ii}(t_{k−1})) + (f^{jj}(t_k) − f^{jj}(t_{k−1})))/2
  = ((f^{ii}(b) − f^{ii}(a)) + (f^{jj}(b) − f^{jj}(a)))/2.

Taking the supremum over all such partitions, we see that f^{ij} ∈ BV([a, b]; R). We have now shown that each component of f is of bounded variation on the interval [a, b], so f must be of bounded variation on the interval [a, b].
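The entrywise bound at the heart of this proof, 2|A^{ij}| ≤ A^{ii} + A^{jj} for A ∈ S^d_+, can be confirmed numerically. The sketch below is not part of the dissertation; the dimension, sample count, and helper names are illustrative choices. It checks the bound on random positive semidefinite matrices of the form BB^T:

```python
import random

def offdiag_bound_holds(A, tol=1e-9):
    # for positive semidefinite A, expanding (A(e_i +/- e_j), e_i +/- e_j) >= 0
    # gives 2|A_ij| <= A_ii + A_jj, the increment bound used in Lemma C.6
    d = len(A)
    return all(2 * abs(A[i][j]) <= A[i][i] + A[j][j] + tol
               for i in range(d) for j in range(d))

def random_psd(d, rng):
    B = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(d)]
    # A = B B^T is positive semidefinite
    return [[sum(B[i][k] * B[j][k] for k in range(d)) for j in range(d)]
            for i in range(d)]

rng = random.Random(0)
print(all(offdiag_bound_holds(random_psd(4, rng)) for _ in range(100)))  # True
```

Applied with A = f(t) − f(s), this is exactly the statement that the off-diagonal increments of f are controlled by the nondecreasing diagonal entries.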

We recall the following definition from Section 1.2.

1.12 Definition. If X is an R^d-valued process, then we say that X is a finite variation process if X ∈ BV([0, t]; R^d) for all t ∈ R+, and we say that X is an absolutely continuous process if X ∈ AC([0, t]; R^d) for all t ∈ R+.

C.7 Lemma. The map Var_t : C(R+; R^d) → [0, ∞] is lower semicontinuous for each fixed t.

Proof. Take x_n → x_∞, fix ε > 0, and choose a partition {0 = s_0 < s_1 < ... < s_m = t} such that

Var_t(x_∞) ≤ Σ_{i=1}^m ‖x_∞(s_i) − x_∞(s_{i−1})‖ + ε/2.

Then choose N = N(ε, m) so large that

sup_{u∈[0,t]} ‖x_∞(u) − x_n(u)‖ ≤ ε/4m

for all n ≥ N. So when n ≥ N, we have

Var_t(x_n) ≥ Σ_{i=1}^m ‖x_n(s_i) − x_n(s_{i−1})‖
          ≥ Σ_{i=1}^m (‖x_∞(s_i) − x_∞(s_{i−1})‖ − ε/2m)
          ≥ Var_t(x_∞) − ε.

Letting ε → 0, we are done.


To see that we cannot hope for more than lower semicontinuity, consider the 1-periodic function f given on [0, 1] by

f(t) , t for 0 ≤ t ≤ 1/4,  1/2 − t for 1/4 ≤ t ≤ 3/4,  and  t − 1 for 3/4 ≤ t ≤ 1,

extended periodically (note f(0) = f(1) = 0), and set f_n(t) , f(nt)/n, so f_n → 0 uniformly, but Var_t(f_n) = t for all n.
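This counterexample is easy to reproduce numerically. The sketch below is not part of the dissertation; the grid resolution and helper names are illustrative choices. It confirms that the sup-norm of f_n shrinks like 1/(4n) while the increment sums over [0, 1] stay at 1:

```python
def f(t):
    # 1-periodic tent function with f(0) = f(1) = 0 and unit variation per period
    s = t - int(t)  # fractional part for t >= 0
    if s <= 0.25:
        return s
    if s <= 0.75:
        return 0.5 - s
    return s - 1.0

def var_on_grid(g, n_points=4000):
    # increment-sum approximation of Var_1(g); exact for the piecewise-linear
    # functions below because the grid contains every kink
    ts = [k / n_points for k in range(n_points + 1)]
    return sum(abs(g(t) - g(s)) for s, t in zip(ts, ts[1:]))

sups, variations = [], []
for n in (1, 10, 100):
    fn = lambda t, n=n: f(n * t) / n
    sups.append(max(abs(fn(k / 4000)) for k in range(4001)))
    variations.append(var_on_grid(fn))
print(sups)        # shrinking toward 0 like 1/(4n)
print(variations)  # each approximately 1, while Var_1 of the limit is 0
```

The variation of the uniform limit is 0, strictly below the limit of the variations, so Var_t is lower semicontinuous but not continuous.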
C.8 Corollary. If X is a continuous process, then Vart (X) is a (measurable)
random variable.
Proof. The composition of a measurable map and a lower semicontinuous
map is measurable.
C.9 Corollary. The set

FV^d , {x ∈ C(R+; R^d) : x ∈ BV([0, t]; R^d) ∀t ∈ R+}

is a Borel measurable subset of C(R+; R^d).

Proof. Treating Var_t as a map from C(R+; R^d) to [0, ∞], we write

FV^d = ∩_{n∈N} Var_n^{-1}(R+).

C.10 Lemma. If X is an Rd -valued, continuous process which is adapted


to some filtration F0 = {Ft0 }t∈R+ , then there exists an F0 -predictable process

x such that, for each ω, we have xt (ω) = ∂t Xt (ω) whenever this derivative
exists.


Proof. Define x^n_t , n(X_t − X_{t−1/n}) 1_{{t>1/n}}. Each x^n is left-continuous and F0-adapted, so each x^n is F0-predictable. By taking the lim sup or lim inf at each coordinate, we get an F0-predictable process x such that x_t(ω) = lim_n x^n_t(ω) whenever the limit exists. In particular, if ∂_t X_t(ω) exists, then lim_n x^n_t(ω) exists, and x_t(ω) = lim_n x^n_t(ω) = ∂_t X_t(ω).

C.11 Corollary. The set

AC^d , {y ∈ C(R+; R^d) : y ∈ AC([0, t]; R^d) ∀t ∈ R+}

is a Borel measurable subset of C(R+; R^d).

Proof. Let X denote the canonical process on C(R+; R^d), let C0 = {C0_t}_{t∈R+}, where C0_t = σ(X^t), denote the filtration generated by X, and let C denote the Borel σ-field on C(R+; R^d). Lem. C.10 asserts the existence of a C0-predictable process x such that x_t(y) = ∂_t X_t(y) = ∂_t y(t) whenever the derivative exists. Set

A(t) , {y ∈ C(R+; R^d) : y(t) = ∫_0^t x_u(y) du}.

As x is C0-predictable, it is certainly C⊗R+-measurable, so ∫_0^t x_u du is C-measurable by Fubini's theorem (recall convention Rem. 1.7), and A(t) is C-measurable as well. Setting B = ∩_{q∈Q+} A(q), we will show that B = AC^d.

First assume that y ∈ AC^d. Thm. C.3 asserts that y′(t) exists for Lebesgue-a.e. t, y′ is integrable on each interval [0, t], and y(t) = ∫_0^t y′(u) du for all t. As x_t(y) agrees with y′(t) whenever the latter exists, x_t(y) = y′(t) for Lebesgue-a.e. t, and y(t) = ∫_0^t x_u(y) du for all t. In particular, y ∈ B.

Now assume that y ∈ B. As y is continuous and the map t ↦ ∫_0^t x_u(y) du is left-continuous (e.g., Rem. 1.7), we have y(t) = ∫_0^t x_u(y) du for all t, and we may apply Thm. C.5 to conclude that y is absolutely continuous on each compact interval.

If X and Y are two absolutely continuous processes which share the same law, then the derivatives of X and Y should have the same law in some sense. To make this precise, one must address the fact that the derivatives are only specified up to equivalence with respect to Lebesgue measure, and the following lemma gives one possible approach.


C.12 Lemma. Let (E, E) be a metric space with its Borel σ-field, and let S^1 and S^2 be probability spaces with S^i = (Ω^i, F^i, P^i). Let each S^i support a continuous, R^d-valued process X^i, a measurable, R^d-valued process x^i, and a continuous, E-valued process Y^i. Let f : R^d×E → R̄+ be an R^d⊗E-measurable function, and define the R̄+-valued random variables

(C.13)  Z^{f,i} , ∫_0^∞ f(x^i_s, Y^i_s) ds  for i ∈ {1, 2}.

If

P^i[X^i_t = ∫_0^t x^i_s ds ∀t ∈ R+] = 1  for i ∈ {1, 2},

and L(X^1, Y^1) = L(X^2, Y^2), then

(C.14)  L(X^1, Y^1, Z^{f,1}) = L(X^2, Y^2, Z^{f,2}).

Proof. Set

(C.15)  A^i , {X^i_t = ∫_0^t x^i_s ds ∀t ∈ R+}.

As X^i is continuous and ∫_0^· x^i_u du is left-continuous (e.g., Rem. 1.7), we may replace R+ with Q+ in (C.15) to see that A^i is measurable.
We first show that the lemma holds when f is of the form f(a, b) = e^{−t} g(a, b) for some bounded, R^d⊗E-measurable g, and we then show that the lemma holds as stated using monotone convergence.

Assume that f(a, b) = e^{−t} g(a, b) for some bounded, continuous g. Define φ^n : C(R+; R^d)×R+ → R^d by

φ^n_t(y) , n(y(t) − y(t − 1/n)) 1_{{t>1/n}},

and set Z^{f,i}_n , ∫_0^∞ f(φ^n_s ◦ X^i, Y^i_s) ds. As L(X^1, Y^1) = L(X^2, Y^2), we have

(C.16)  L(X^1, Y^1, Z^{f,1}_n) = L(X^2, Y^2, Z^{f,2}_n)  ∀n ∈ N.

Set

B^i(ω^i) , {t ∈ R+ : lim_n φ^n_t ◦ X^i(ω^i) ≠ x^i_t(ω^i)}  for ω^i ∈ Ω^i,


where lim_n z_n ≠ z_∞ means that either the limit doesn't exist, or that the limit exists and differs from z_∞. If ∂_t X^i_t(ω^i) exists and agrees with x^i_t(ω^i), then the difference quotients used to define φ^n_t ◦ X^i(ω^i) must converge to this value. In particular,

B^i(ω^i) ⊂ {t ∈ R+ : ∂_t X^i_t(ω^i) ≠ x^i_t(ω^i)}.

If ω^i ∈ A^i, then Thm. C.5 asserts that ∂_t X^i_t(ω^i) exists and agrees with x^i_t(ω^i) for Lebesgue-a.e. t. In particular, λ(B^i(ω^i)) = 0 and

lim_n φ^n_t ◦ X^i(ω^i) = x^i_t(ω^i)  for Lebesgue-a.e. t.

Using the continuity of f and dominated convergence, we conclude that lim_n Z^{f,i}_n(ω^i) = Z^{f,i}(ω^i) when ω^i ∈ A^i. As P^i[A^i] = 1, we have lim_n Z^{f,i}_n = Z^{f,i}, P^i-a.s., and this implies that (X^i, Y^i, Z^{f,i}_n) ⇒ (X^i, Y^i, Z^{f,i}). Combining this with (C.16), we conclude that (C.14) holds for this case.
We will now extend the result to functions f of the form f(a, b) = e^{−t} g(a, b) for some bounded, measurable g using a monotone class argument. Recall that g ∈ bR^d⊗E means that g is a bounded, R^d⊗E-measurable function. Let

C , {g ∈ bR^d⊗E : (C.14) holds with f(a, b) = e^{−t} g(a, b)}.

We now show that C is a monotone class. Assume that {g_n}_{n∈N} is a uniformly bounded sequence of functions in C that converge pointwise on R^d×E to some limiting function g. Setting f_n(a, b) , e^{−t} g_n(a, b) for n ∈ N and f , e^{−t} g(a, b), we have lim_n f_n(x^i_t, Y^i_t) = f(x^i_t, Y^i_t) for each t ∈ R+, so we may apply dominated convergence to conclude that

lim_{n→∞} Z^{f_n,i} = lim_{n→∞} ∫_0^∞ f_n(x^i_t, Y^i_t) dt = ∫_0^∞ f(x^i_t, Y^i_t) dt = Z^{f,i}

pointwise on Ω^i. This implies that (X^i, Y^i, Z^{f_n,i}) ⇒ (X^i, Y^i, Z^{f,i}), and we have

(C.17)  L(X^1, Y^1, Z^{f_n,1}) = L(X^2, Y^2, Z^{f_n,2})  ∀n ∈ N

from the definition of C, so we may conclude that (C.14) holds for this case.
Finally, we show that the result holds for nonnegative f. Setting f_n = f ∧ (n e^{−t}) and applying the monotone convergence theorem, we see that lim_n Z^{f_n,i} = Z^{f,i} pointwise on Ω^i as R̄+-valued random variables. Applying the previous case to each f_n, we see that (C.17) holds, so (C.14) holds as well.

The following corollary is often more convenient for applications than


Lem. C.12.

C.18 Corollary. Let (E, E) be a metric space with its Borel σ-field, and let S^1 and S^2 be probability spaces with S^i = (Ω^i, F^i, P^i). Let each S^i support a continuous, R^d-valued process X^i, a measurable, R^d-valued process x^i, and a continuous, E-valued process Y^i. Let f : R^d×E → R^r be an R^d⊗E/R^r-measurable function, and define the R^r-valued random variables

(C.19)  Z^{f,i} , ∫_0^∞ f(x^i_s, Y^i_s) ds  for i ∈ {1, 2}.

If

P^i[X^i_t = ∫_0^t x^i_s ds ∀t ∈ R+] = 1  for i ∈ {1, 2},

and L(X^1, Y^1) = L(X^2, Y^2), then

(C.20)  L(X^1, Y^1, Z^{f,1}) = L(X^2, Y^2, Z^{f,2}).

C.21 Remark. According to the conventions of Rem. 1.7, the integral in (C.19) is always defined, and it takes the value ∞ when any component is infinite or undefined.

Proof. Write f = (f_i)_{1≤i≤r}, and let f_i^+ and f_i^− denote the positive and negative parts of f_i. We may apply Lem. C.12 to conclude that

L(X^1, Y^1, Z^{f_1^+,1}) = L(X^2, Y^2, Z^{f_1^+,2}),

and then that

L(X^1, Y^1, Z^{f_1^+,1}, Z^{f_1^−,1}) = L(X^2, Y^2, Z^{f_1^+,2}, Z^{f_1^−,2}).


Repeating this argument a finite number of times, we see that

(C.22)  L(X^1, Y^1, Z^{f_1^+,1}, Z^{f_1^−,1}, ..., Z^{f_r^+,1}, Z^{f_r^−,1}) = L(X^2, Y^2, Z^{f_1^+,2}, Z^{f_1^−,2}, ..., Z^{f_r^+,2}, Z^{f_r^−,2}).

Define the R̄+-valued random variables

φ^i , Z^{f_1^+,i} + Z^{f_1^−,i} + ··· + Z^{f_r^+,i} + Z^{f_r^−,i}  for i ∈ {1, 2}.

According to the conventions of Rem. 1.7, we have

Z^{f,i} = (Z^{f_1^+,i} − Z^{f_1^−,i}, ..., Z^{f_r^+,i} − Z^{f_r^−,i}) if φ^i < ∞, and Z^{f,i} = ∞ otherwise,

so (C.22) implies (C.20).

Appendix D

Semimartingale Characteristics

In Section 1.2, we presented the following definitions.

1.14 Definition. Let B = (Ω, F, F0, P) be a stochastic basis supporting a continuous, R^d-valued process X. We say that X is a continuous semimartingale if we can decompose X as

(1.15)  X_t = X_0 + M_t + B_t,

where M is a continuous local martingale with M_0 = 0, and B is a continuous process with B_0 = 0 that is P-a.s. of finite variation. In this case, we say that X has the characteristics (B, ⟨M⟩). If B and ⟨M⟩ are both absolutely continuous, P-a.s., then we say that X is an Itô process.

Our definition of Itô process is technically convenient; however, it differs from the standard definition, where an Itô process is defined as a process of the form

(D.1)  X_t = ∫_0^t µ_s ds + ∫_0^t σ_s dW_s.

In this section, we show that our definition is essentially equivalent to the standard definition. One direction is trivial.

D.2 Lemma. Suppose that W is an Rr -valued Wiener process and that X is


a continuous, Rd -valued process which satisfies (D.1), where µ is an adapted,
Rd -valued process, and σ is an adapted, Rd ⊗Rr -valued process. Then X is
an Itô process.


Proof. Set B_t , ∫_0^t µ_s ds and M_t , ∫_0^t σ_s dW_s. It is then clear that X has the canonical decomposition X = X_0 + M + B. As ⟨M⟩_t = ∫_0^t σ_s σ_s^T ds, it is clear that B and ⟨M⟩ are both a.s. absolutely continuous, so X is an Itô process.

Going in the other direction, we will show that we can construct a Wiener
process W such that (D.1) holds. The first step is to find good versions of
the characteristics.

D.3 Lemma. Let B = (Ω, F, F, P) be a stochastic basis which satisfies the usual conditions and supports an R^d-valued Itô process X. Then there exist an F-predictable, R^d-valued process b and an F-predictable, S^d_+-valued process c such that X has the characteristics (B, C), where B_t , ∫_0^t b_s ds and C_t , ∫_0^t c_s ds.
Proof. As X is an Itô process, we may write X = X_0 + M + B for some continuous local martingale M with M_0 = 0 and some continuous process B with B_0 = 0 which is P-a.s. absolutely continuous. Moreover, there also exists a process C which is a version of ⟨M⟩ and is P-a.s. absolutely continuous. As B satisfies the usual conditions, we may assume that C is continuous (otherwise, we redefine C on a null set). Lem. C.10 asserts that, by taking divided differences from the left, we may construct F-predictable processes b and c such that, for each ω ∈ Ω, b_t(ω) = ∂_t B_t(ω) and c_t(ω) = ∂_t C_t(ω) whenever either derivative exists. Using the a.s. absolute continuity of B and C, we conclude that

P[B_t = ∫_0^t b_s ds ∀t] = 1, and

P[C_t = ∫_0^t c_s ds ∀t] = 1.

We now need to modify c so that it only takes values in S^d_+, and we follow


[JS87] II.2.9. For q ∈ Q^d, we define

a^q_t , 1_{{(c_t q, q) < 0}},  M^q_t , (M_t, q),

Y^q_t , ∫_0^t a^q_s d⟨M^q⟩_s,  Z^q_t , ∫_0^t a^q_s (c_s q, q) ds, and

Ẑ^q_t , ∫_0^t a^q_s ds,

where (x, y) denotes the inner product on R^d. Y^q and Z^q are P-indistinguishable, but Y^q is P-a.s. nonnegative for all t and Z^q is P-a.s. nonpositive for all t, so we conclude that Y^q and Z^q are both P-indistinguishable from the zero process. This implies that Ẑ^q is also P-indistinguishable from the zero process. Letting {q_n}_n be an enumeration of Q^d, we define b^n_t , max_{i≤n} a^{q_i}_t, b_t , max_{n∈N} b^n_t, and Ẑ_t , ∫_0^t b_s ds. We may find a single P-null set N such that Ẑ^q_t(ω) = 0 for all q ∈ Q^d and all t ∈ R+ when ω ∉ N.

As b(ω) = lim_n b^n(ω) pointwise on R+, and the sequence b^n(ω) is nondecreasing, we may apply the monotone convergence theorem to conclude that Ẑ_t(ω) = 0 for all t ∈ R+ when ω ∉ N. Setting a_t , 1 − b_t, this implies that

P[∫_0^t c_s ds = ∫_0^t a_s c_s ds ∀t] = 1.

As a_t(ω) c_t(ω) ∈ S^d_+ for all t and ω, we are done.


D.4 Remark. We only use the usual conditions to ensure that we may choose a version of ⟨M⟩ which is continuous, rather than merely a.s. continuous. If we know, a priori, that such a version of ⟨M⟩ exists, then the usual conditions are not necessary in this lemma.
Once we have good versions of the characteristics, the rest of the work is
essentially linear algebra. We will need the following definitions and results.
D.5 Definition. Given a matrix A ∈ S+d , we say that the matrix A1/2 ∈ S+d
is the positive square root of A if A1/2 A1/2 = A.
D.6 Lemma. Given any A ∈ S+d , the positive square root of A exists and is
unique, and the map A 7→ A1/2 is a measurable map from S+d to S+d .
Proof. It is a classical result that a bounded, positive, self-adjoint linear operator on a Hilbert space has a unique positive, self-adjoint square root. Moreover, if we normalize so that ‖A‖ ≤ 1 and define B_1 = (I − A)/2 and B_{n+1} = (I − A + B_n^2)/2 for n ≥ 1, where I denotes the identity matrix in S^d_+, then the sequence {B_n} converges in operator norm to I − A^{1/2}. This implies that the map A ↦ A^{1/2} is measurable. One may consult [RSN90] VII.104 for the details of this argument.
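The iteration in this proof can be run directly on a small matrix. The sketch below is not part of the dissertation; the matrix, iteration count, and helper names are illustrative choices. It applies the recursion to a 2×2 matrix with spectrum in [0, 1] and recovers A^{1/2} as I minus the limit:

```python
def matmul(X, Y):
    rows, inner, cols = len(X), len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def sqrt_psd(A, iters=200):
    # iteration from the proof: B_1 = (I - A)/2, B_{n+1} = (I - A + B_n^2)/2;
    # assumes the spectrum of A lies in [0, 1] and returns I - lim B_n = A^{1/2}
    d = len(A)
    I = [[float(i == j) for j in range(d)] for i in range(d)]
    B = [[(I[i][j] - A[i][j]) / 2.0 for j in range(d)] for i in range(d)]
    for _ in range(iters):
        B2 = matmul(B, B)
        B = [[(I[i][j] - A[i][j] + B2[i][j]) / 2.0 for j in range(d)]
             for i in range(d)]
    return [[I[i][j] - B[i][j] for j in range(d)] for i in range(d)]

A = [[0.5, 0.25], [0.25, 0.5]]  # symmetric with eigenvalues 0.25 and 0.75
R = sqrt_psd(A)
RR = matmul(R, R)
err = max(abs(RR[i][j] - A[i][j]) for i in range(2) for j in range(2))
print(err < 1e-10)  # True: R is the positive square root of A
```

Because each B_n is a polynomial in A, the limit is a measurable function of the entries of A, which is the only property the lemma actually needs.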
D.7 Definition. Given a matrix A ∈ R^n⊗R^m, we say that a matrix A^+ ∈ R^m⊗R^n is the Moore-Penrose generalized inverse of A if AA^+A = A, A^+AA^+ = A^+, (AA^+)^T = AA^+, and (A^+A)^T = A^+A.

D.8 Lemma. The Moore-Penrose generalized inverse exists and is unique, and the map A ↦ A^+ is measurable.

Proof. The existence and uniqueness of the Moore-Penrose generalized inverse is shown in [Pen55]; [BIG03] provides a textbook treatment. Moreover, if we define B_1 , A^T/‖AA^T‖ and B_{n+1} , B_n(2I − AB_n), then it is shown in [BI66] that the sequence {B_n} converges to A^+ at each coordinate. This implies that the map A ↦ A^+ is measurable.

While the map A ↦ A^+ is measurable, it is not continuous. To see this, consider A_n = (1 0; 0 1/n) → A_∞ = (1 0; 0 0); then A_n^+ = (1 0; 0 n), but A_∞^+ = A_∞. [Con98] shows that the map A ↦ A^+ is in fact analytic when restricted to matrices of a common rank. Notice that AA^+ and A^+A are idempotent and self-adjoint, so they are orthogonal projections.
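Both the iteration from [BI66] and the discontinuity above can be checked numerically. The sketch below is not part of the dissertation; the matrices, iteration count, helper names, and the choice of the Frobenius norm for the initial scaling are illustrative assumptions. It runs B_{n+1} = B_n(2I − AB_n) on a nearly singular matrix and on its singular limit:

```python
def matmul(X, Y):
    rows, inner, cols = len(X), len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def pinv(A, iters=80):
    # iteration from the proof of Lem. D.8: B_1 = A^T / ||A A^T||,
    # B_{n+1} = B_n (2I - A B_n), converging entrywise to A^+
    # (the norm here is taken to be the Frobenius norm)
    n, m = len(A), len(A[0])
    At = [[A[i][j] for i in range(n)] for j in range(m)]
    AAt = matmul(A, At)
    scale = sum(v * v for row in AAt for v in row) ** 0.5
    B = [[v / scale for v in row] for row in At]
    I = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(iters):
        AB = matmul(A, B)
        B = matmul(B, [[2.0 * I[i][j] - AB[i][j] for j in range(n)]
                       for i in range(n)])
    return B

near = pinv([[1.0, 0.0], [0.0, 1e-3]])  # ~ (1 0; 0 1000)
sing = pinv([[1.0, 0.0], [0.0, 0.0]])   # = (1 0; 0 0)
print(round(near[1][1]), sing[1][1])    # a tiny perturbation moves A+ far away
```

The two outputs make the discontinuity vivid: shrinking one singular value to zero changes the corresponding entry of A^+ from a huge number to 0.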
We recall the following definition from Section 4.3.

4.60 Definition. Let X denote the canonical process on the space C(R+; R^r), let C denote the Borel σ-field on C(R+; R^r), let C0 = {σ(X^t)}_{t∈R+} denote the filtration generated by X, and let W denote Wiener's measure on C(R+; R^r). We refer to W , (C(R+; R^r), C, C0, W) as Wiener's basis on C(R+; R^r).
D.9 Theorem. Let B = (Ω, F, F0, P) be a stochastic basis which supports an R^d-valued, P-a.s. continuous local martingale M and an adapted, R^d⊗R^r-valued process σ with

⟨M⟩_t = ∫_0^t σ_s σ_s^T ds.

Let W denote Wiener's basis on C(R+; R^r), set B̂ , B⊗W (see Def. 1.11), and let M̂ and σ̂ denote the extensions of M and σ to B̂. Then B̂ supports


an R^r-valued Wiener process Ŵ such that

M̂_t = ∫_0^t σ̂_s dŴ_s.

Proof. We will write B̂ = (Ω̂, F̂, F̂0, P̂). Let X denote the canonical process on C(R+; R^r), and let X̂ denote the extension of X to Ω̂. X̂ is a continuous martingale with ⟨X̂⟩_t = tI under P̂, where I denotes the identity matrix in R^r⊗R^r, so we may apply Lévy's characterization to conclude that X̂ is a Wiener process.
Applying Lem. D.8, we see that σ̂^+ is an adapted, R^r⊗R^d-valued process. As (σ̂_s^+ σ̂_s)(σ̂_s^+ σ̂_s)^T = σ̂_s^+ σ̂_s and (I − σ̂_s^+ σ̂_s)(I − σ̂_s^+ σ̂_s)^T = I − σ̂_s^+ σ̂_s, we have

∫_0^t σ̂_s^+ ⊗ σ̂_s^+ d⟨M̂⟩_s = ∫_0^t (σ̂_s^+ σ̂_s)(σ̂_s^+ σ̂_s)^T ds = ∫_0^t σ̂_s^+ σ̂_s ds, and

∫_0^t (I − σ̂_s^+ σ̂_s) ⊗ (I − σ̂_s^+ σ̂_s) d⟨X̂⟩_s = ∫_0^t (I − σ̂_s^+ σ̂_s)(I − σ̂_s^+ σ̂_s)^T ds = ∫_0^t (I − σ̂_s^+ σ̂_s) ds.

As σ̂_s^+ σ̂_s is an orthogonal projection, we have ‖σ̂_s^+ σ̂_s‖ ≤ r and ‖I − σ̂_s^+ σ̂_s‖ ≤ r. (Recall that we use the Frobenius norm on R^r⊗R^r rather than the operator norm, so ‖I‖ = √r ≤ r.) This means that

‖∫_0^t σ̂_s^+ ⊗ σ̂_s^+ d⟨M̂⟩_s‖ ≤ ∫_0^t ‖σ̂_s^+ σ̂_s‖ ds ≤ tr, and

‖∫_0^t (I − σ̂_s^+ σ̂_s) ⊗ (I − σ̂_s^+ σ̂_s) d⟨X̂⟩_s‖ ≤ ∫_0^t ‖I − σ̂_s^+ σ̂_s‖ ds ≤ tr,

so

Ŵ_t , ∫_0^t σ̂_s^+ dM̂_s + ∫_0^t (I − σ̂_s^+ σ̂_s) dX̂_s


is well-defined. We have

⟨Ŵ⟩_t = ∫_0^t σ̂_s^+ ⊗ σ̂_s^+ d⟨M̂⟩_s + ∫_0^t (I − σ̂_s^+ σ̂_s) ⊗ (I − σ̂_s^+ σ̂_s) d⟨X̂⟩_s
      = ∫_0^t (σ̂_s^+ σ̂_s + I − σ̂_s^+ σ̂_s) ds
      = tI,

where ⟨M̂, X̂⟩ = 0 because M̂ and X̂ are orthogonal as a result of the product construction. As Ŵ is a continuous martingale, we conclude that Ŵ is an R^r-valued Wiener process. Finally, let I^d denote the identity matrix in R^d⊗R^d, and set L̂_t , ∫_0^t σ̂_s dŴ_s, so L̂ − M̂ is a local martingale and

⟨L̂ − M̂⟩_t = ⟨L̂⟩_t + ⟨M̂⟩_t − 2⟨L̂, M̂⟩_t
           = 2 ∫_0^t σ̂_s σ̂_s^T ds − 2 ∫_0^t σ̂_s ⊗ I^d d⟨Ŵ, M̂⟩_s
           = 2 ∫_0^t σ̂_s σ̂_s^T ds − 2 ∫_0^t σ̂_s σ̂_s^+ ⊗ I^d d⟨M̂, M̂⟩_s
           = 2 ∫_0^t σ̂_s σ̂_s^T ds − 2 ∫_0^t σ̂_s σ̂_s^+ σ̂_s σ̂_s^T ds
           = 0.

In particular, M̂ = L̂, so we are done.

D.10 Corollary. Let B = (Ω, F, F, P) be a stochastic basis which satisfies the usual conditions and supports an adapted, R^d-valued Itô process X. Let W denote Wiener's basis on C(R+; R^d), set B̂ , B⊗W, and let X̂ denote the extension of X to B̂. Then there exist adapted, R^d-valued processes µ̂ and Ŵ and an adapted, S^d_+-valued process σ̂, all defined on B̂, such that Ŵ is a Wiener process and

(D.11)  X̂_t = ∫_0^t µ̂_s ds + ∫_0^t σ̂_s dŴ_s.

Proof. Write B̂ = (Ω̂, F̂, F̂0, P̂). If a is a process defined on B, then â will denote the extension of a to B̂. Lem. D.3 asserts the existence of an F-predictable, R^d-valued process b and an F-predictable, S^d_+-valued process c


such that $X$ has the characteristics $(B, C)$, where $B_t \triangleq \int_0^t b_s\, ds$ and $C_t \triangleq \int_0^t c_s\, ds$. Define $\sigma_t \triangleq c_t^{1/2}$. Lem. D.6 asserts that the map $A \mapsto A^{1/2}$ is measurable, so $\sigma$ is also $\mathbb F$-predictable. Defining $M \triangleq X - X_0 - B$, we see that $\langle M \rangle_t = \int_0^t \sigma_s \sigma_s^T\, ds$. The previous theorem asserts the existence of an $\hat{\mathbb F}$-adapted, $\mathbb R^d$-valued, continuous Wiener process $\hat W$ such that $\hat M_t = \int_0^t \hat\sigma_s\, d\hat W_s$. Setting $\hat\mu \triangleq \hat b$, we see that $\hat X$ solves (D.11).
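The step $\sigma_t \triangleq c_t^{1/2}$ takes the positive semidefinite square root of $c_t$ pointwise. For symmetric PSD $2\times 2$ matrices this root has a well-known closed form, which the following sketch checks numerically (the example matrix and the closed-form helper are illustrative, not from the text):

```python
# Closed-form PSD square root for a nonzero symmetric 2x2 PSD matrix C:
# with s = sqrt(det C), C^{1/2} = (C + s*I) / sqrt(trace C + 2*s).
import math

def sqrt_psd_2x2(C):
    (a, b), (b2, c) = C
    assert b == b2, "matrix must be symmetric"
    s = math.sqrt(a * c - b * b)     # requires det C >= 0
    t = math.sqrt(a + c + 2 * s)     # positive unless C is the zero matrix
    return [[(a + s) / t, b / t], [b / t, (c + s) / t]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

C = [[2.0, 1.0], [1.0, 2.0]]         # symmetric positive definite
R = sqrt_psd_2x2(C)
RR = matmul(R, R)
assert all(abs(RR[i][j] - C[i][j]) < 1e-9 for i in range(2) for j in range(2))
```

The measurability of $A \mapsto A^{1/2}$ asserted by Lem. D.6 is what lets this pointwise construction produce a predictable process $\sigma$.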

Appendix E

Rebolledo’s Criterion

If a collection of probability measures on a Polish space is tight, then Prokhorov’s


theorem tells us that we may select a weakly convergent sequence from that
collection. We will also say that a collection of processes is tight if the collection of laws induced by those processes is tight.
Given a collection of $\mathbb R^d$-valued continuous processes, $\{ X^\alpha \}$, each defined on a stochastic basis $(\Omega^\alpha, P^\alpha, \mathcal F^\alpha, \mathbb F^\alpha)$, we list five potential conditions.

[C1] The collection of random variables {X0α } is tight.

[C2] For each $t$ and $\varepsilon > 0$ there exists a $\delta > 0$ such that
\[
P^\alpha\Big[ \sup_{s_1, s_2 \in A_{t,\delta}} \| X^\alpha_{s_2} - X^\alpha_{s_1} \| \ge \varepsilon \Big] \le \varepsilon
\]
for any $\alpha$, where $A_{t,\delta} \triangleq \{ (s_1, s_2) \in \mathbb R_+^2 : s_1 \le s_2 \le t \text{ and } s_2 - s_1 \le \delta \}$.

[C3] Each X α is a continuous semimartingale with characteristics Aα ,


(B α , C α ), and the collection of continuous Rd ×(Rd ⊗ Rd )-valued pro-
cesses, {Aα }, is tight.

[C4] For each $t$ and $\varepsilon > 0$ there exists a $\delta > 0$ such that
\[
P^\alpha\big[ \| X^\alpha_{T^\alpha + u} - X^\alpha_{T^\alpha} \| \ge \varepsilon \big] \le \varepsilon
\]
for each $\alpha$, $\mathbb F^\alpha$-stopping time $T^\alpha \le t$, and $u \in [0, \delta]$.


[C5] For each $t$ and $\varepsilon > 0$ there exists a $\delta > 0$ such that
\[
P^\alpha\big[ \| X^\alpha_{T^\alpha} - X^\alpha_{S^\alpha} \| \ge \varepsilon \big] \le \varepsilon
\]
for any $\alpha$ and any two $\mathbb F^\alpha$-stopping times $S^\alpha \le T^\alpha \le t$ with $T^\alpha - S^\alpha \le \delta$.

Arzelà and Ascoli's characterization of the compact subsets of spaces of continuous functions implies that a collection of processes is tight if and only if [C1] and [C2] hold (e.g., [Bil68] or [Par67]); however, it is often difficult to directly verify condition [C2] for a collection of Itô processes if the drift and diffusion processes are not uniformly bounded. Fortunately, Rebolledo [Reb79] has shown that [C3] is actually sufficient to ensure that [C2] holds.
This result is quite useful because it is often easier to compute with the
characteristics of a semimartingale than with the semimartingale itself. The
goal of this section is to provide a relatively brief and self-contained derivation
of this result.
To prove Rebolledo’s result, we will first show that the conditions [C2],
[C4], and [C5] are all equivalent for continuous processes. This result is
essentially given in [Ald78], and we borrow heavily from the presentation
given in [JM86]. To facilitate the proof, we give two lemmas.
First we note that if a function does not oscillate too wildly within each interval of a partition, then the function also cannot oscillate wildly between points in adjacent intervals. This is the content of the rather obvious

E.1 Lemma. Let x ∈ C(R+ ; Rd ) and suppose that we have a (deterministic)


partition {0 = t0 < t1 < t2 < . . . < tn } such that |ti − ti−1 | ≥ δ for
all i ∈ {2, . . . , n − 1} and |x(v) − x(u)| < ε if u, v ∈ [ti−1 , ti ] for some
i ∈ {1, . . . , n}. Then |v − u| ≤ δ and u, v ∈ [0, tn ] implies |x(v) − x(u)| < 2ε.

E.2 Remark. We do not need to control the size of the first or the last
interval.

Proof. If $u, v \in [t_{i-1}, t_i]$ for some $i$ the result is immediate. The only other possibility is that $t_{i-1} \le u < t_i \le v < t_{i+1}$ for some $i$, but then $|x(v) - x(u)| \le |x(v) - x(t_i)| + |x(t_i) - x(u)| < 2\varepsilon$.
We also observe that if [C5] holds, then we can bound the probability that the process makes a large number of large moves in a given time interval. This is the content of the following


 
E.3 Lemma. Suppose that $P[ |X_T - X_S| \ge \varepsilon ] \le \varepsilon$ for all stopping times $S \le T \le t$ with $T - S \le \delta$. If we define the stopping times $T_0 \triangleq 0$ and $T_i \triangleq \inf\{ t > T_{i-1} : |X_t - X_{T_{i-1}}| \ge \varepsilon \}$, then
\[
\tag{E.4} \big( 1 - t/(\delta n) \big)\, P[ T_n \le t ] \le \varepsilon.
\]

Proof. First we notice that
\[
\sum_{i=1}^n P[ T_i - T_{i-1} \le \delta ] = \sum_{i=1}^n P\big[ |X_{T_i \wedge (T_{i-1}+\delta)} - X_{T_{i-1}}| \ge \varepsilon \big] \le n \varepsilon,
\]
so we have
\begin{align*}
n\, P[ T_n \le t ] &\le \sum_{i=1}^n P[ T_n \le t \text{ and } T_i - T_{i-1} > \delta ] + n\varepsilon \\
&\le \sum_{i=1}^n E\big[ \mathbf 1_{\{T_n \le t\}}\, (T_i - T_{i-1})/\delta \big] + n\varepsilon \\
&= E\big[ \mathbf 1_{\{T_n \le t\}}\, T_n/\delta \big] + n\varepsilon \\
&\le t\, P[ T_n \le t ]/\delta + n\varepsilon.
\end{align*}

We now have everything that we need to show the equivalence of [C2],


[C4], and [C5]. Notice that [C4] looks much weaker than [C2] or [C5]. In
particular, one must choose a single deterministic offset u in condition [C4]
which cannot vary from path to path.

E.5 Theorem. If {X α } is a collection of continuous processes, then [C2],


[C4], and [C5] are all equivalent.
Proof. [C2] clearly implies [C4], so we now assume that [C4] holds and show
that this implies [C5]. Fix t and ε > 0 and then choose δ as in condition
[C4], so that
\[
P^\alpha\big[ | X^\alpha_{T^\alpha + u} - X^\alpha_{T^\alpha} | \ge \varepsilon/2 \big] \le \varepsilon/3
\]
for every α, Fα -stopping time T α , and u ∈ [0, 2δ].
Pick any α and any two Fα -stopping times S α ≤ T α ≤ t with T α −S α ≤ δ.
In particular, we have [T α , T α + δ] ⊂ [S α , S α + 2δ]. Also notice that for any


$s$, we have
\[
\big\{ |X^\alpha_{T^\alpha} - X^\alpha_{S^\alpha}| \ge \varepsilon \big\} \subseteq \big\{ |X^\alpha_{T^\alpha} - X^\alpha_s| \ge \varepsilon/2 \big\} \cup \big\{ |X^\alpha_s - X^\alpha_{S^\alpha}| \ge \varepsilon/2 \big\}.
\]
Combining these two observations with Fubini's Theorem, we write
\begin{align*}
\delta\, P^\alpha\big[ |X^\alpha_{T^\alpha} - X^\alpha_{S^\alpha}| \ge \varepsilon \big]
&= E^\alpha\Big[ \mathbf 1_{\{ |X^\alpha_{T^\alpha} - X^\alpha_{S^\alpha}| \ge \varepsilon \}} \int_{T^\alpha}^{T^\alpha + \delta} ds \Big] \\
&= \int_0^\infty P^\alpha\big[ |X^\alpha_{T^\alpha} - X^\alpha_{S^\alpha}| \ge \varepsilon \text{ and } s \in [T^\alpha, T^\alpha + \delta] \big]\, ds \\
&\le \int_0^\infty P^\alpha\big[ |X^\alpha_{T^\alpha} - X^\alpha_s| \ge \varepsilon/2 \text{ and } s \in [T^\alpha, T^\alpha + \delta] \big] \\
&\qquad\qquad + P^\alpha\big[ |X^\alpha_s - X^\alpha_{S^\alpha}| \ge \varepsilon/2 \text{ and } s \in [S^\alpha, S^\alpha + 2\delta] \big]\, ds \\
&= \int_0^\delta P^\alpha\big[ |X^\alpha_{T^\alpha + u} - X^\alpha_{T^\alpha}| \ge \varepsilon/2 \big]\, du + \int_0^{2\delta} P^\alpha\big[ |X^\alpha_{S^\alpha + u} - X^\alpha_{S^\alpha}| \ge \varepsilon/2 \big]\, du \\
&\le \delta \varepsilon,
\end{align*}

so [C5] holds.
Finally, we will show that [C5] implies [C2], so assume [C5], fix some t
and $\varepsilon > 0$, and define the stopping times $T^\alpha_0 \triangleq 0$ and
\[
T^\alpha_i \triangleq \inf\big\{ s > T^\alpha_{i-1} : |X^\alpha_s - X^\alpha_{T^\alpha_{i-1}}| \ge \varepsilon/2 \big\}.
\]




Choose $\delta_1$, as in [C5], such that
\[
P^\alpha\big[ |X^\alpha_{S^\alpha} - X^\alpha_{T^\alpha}| \ge \varepsilon/4 \big] \le \varepsilon/4
\]
for each $\alpha$ and all $\mathbb F^\alpha$-stopping times $S^\alpha \le T^\alpha \le t$ with $T^\alpha - S^\alpha \le \delta_1$. Then


choose $n$ so large that $1 - t/(n\delta_1) \ge 1/2$, which means that $P^\alpha[T^\alpha_n \le t] \le \varepsilon/2$ by (E.4). Finally, choose another $\delta_2$ such that

\[
P^\alpha\big[ |X^\alpha_{S^\alpha} - X^\alpha_{T^\alpha}| \ge \varepsilon/2 \big] \le \varepsilon/(2n)
\]
for each $\alpha$ and all $\mathbb F^\alpha$-stopping times $S^\alpha \le T^\alpha \le t$ with $T^\alpha - S^\alpha \le \delta_2$.


Now notice that if we fix a point $\omega^\alpha \in B^\alpha \subset \Omega^\alpha$, where
\[
B^\alpha \triangleq \big\{ T^\alpha_i - T^\alpha_{i-1} \ge \delta_2/2 \text{ for all } i \ge 1 \text{ with } T^\alpha_i \le t \big\},
\]



then we may apply Lemma E.1 to conclude that $|X^\alpha_{s_2}(\omega^\alpha) - X^\alpha_{s_1}(\omega^\alpha)| \le \varepsilon$ for all $s_1 \le s_2 \le t$ with $s_2 - s_1 \le \delta_2$. In particular, we are done if $P^\alpha[B^\alpha] \ge 1 - \varepsilon$.
Define the sets
\[
C^\alpha_i \triangleq \big\{ T^\alpha_i - T^\alpha_{i-1} < \delta_2/2 \text{ and } T^\alpha_i \le t \big\} = \big\{ |X^\alpha_{T^\alpha_i \wedge (T^\alpha_{i-1} + \delta_2/2) \wedge t} - X^\alpha_{T^\alpha_{i-1} \wedge t}| \ge \varepsilon/2 \big\},
\]

so $P^\alpha[C^\alpha_i] \le \varepsilon/(2n)$ for all $i \ge 1$. As
\[
(B^\alpha)^c \subset \cup_{i=1}^\infty C^\alpha_i \subset \cup_{i=1}^n C^\alpha_i \cup \{ T^\alpha_n \le t \},
\]
we have
\[
P^\alpha[(B^\alpha)^c] \le \sum_{i=1}^n P^\alpha[C^\alpha_i] + P^\alpha[T^\alpha_n \le t] \le \varepsilon,
\]

and we are done.


We now recall the following
E.6 Definition. We say that the process A dominates the process X in
the sense of Lenglart [Len77], if
   
\[
E[ X_T ] \le E[ A_T ]
\]

for all bounded stopping times T .


The domination property is useful because it implies the following
E.7 Lemma. Let $X$ be a right-continuous process and let $A$ be a continuous increasing process which dominates $X$ in the sense of Lenglart. If $T$ is an $\overline{\mathbb R}_+$-valued stopping time and $a$ and $x$ are strictly positive constants, then
\[
P[ X^*_T \ge x ] \le \frac{a}{x} + P[ A_T \ge a ],
\]
where $X^*_t \triangleq \sup_{s \le t} |X_s|$ and $Y_\infty \triangleq \lim_{t \to \infty} Y_t$ for any increasing process $Y$.
Proof. If $A$ dominates $X$, then $A^T$ dominates $X^T$, so we may assume without loss of generality that $T = \infty$ by redefining $X$ to be $X^T$ and $A$ to be $A^T$. Fix $x$ and $a$ and let $S \triangleq \inf\{ t : X_t \ge x \}$ and $U \triangleq \inf\{ t : A_t \ge a \}$, so we have
\[
\{ X^*_t \ge x \text{ and } A_t < a \} = \{ S \le t < U \} \subset \{ S = S \wedge U \wedge t \} \subset \{ X_{S \wedge U \wedge t} \ge x \}.
\]


Using Chebyshev's inequality, the domination property, and the fact that $A$ is increasing, we see that
\begin{align*}
P[ X^*_t \ge x ] &\le P[ X_{S \wedge U \wedge t} \ge x ] + P[ A_t \ge a ] \\
&\le \frac{1}{x}\, E[ A_{S \wedge U \wedge t} ] + P[ A_\infty \ge a ] \\
&\le \frac{a}{x} + P[ A_\infty \ge a ]
\end{align*}
holds for all $t$. Letting $t \to \infty$ through some sequence and noting that $\{ X^*_\infty > x \} \subset \{ X^*_t \ge x \text{ for some } t \}$, we have
\[
P[ X^*_\infty > x ] \le \frac{a}{x} + P[ A_\infty \ge a ].
\]
But the right-hand side is continuous in $x$, so we really have
\[
P[ X^*_\infty \ge x ] = \lim_n P[ X^*_\infty > x - 1/n ] \le \lim_n \frac{a}{x - 1/n} + P[ A_\infty \ge a ] = \frac{a}{x} + P[ A_\infty \ge a ].
\]
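The inequality of Lem. E.7 can be sanity-checked by simulation. A seeded Monte Carlo sketch (the walk length, thresholds, and sample size are illustrative choices, not from the text): take $M$ a simple $\pm 1$ random walk, so $\langle M \rangle_k = k$ and $X = M^2$ is dominated by $A = \langle M \rangle$ in the sense of Lenglart, and compare the running maximum of $M^2$ against the bound $a/x + P[A \ge a]$.

```python
# Seeded Monte Carlo check of P[(M^2)* >= x] <= a/x + P[A >= a] for a
# +-1 random walk M with <M>_k = k (so A is deterministic here).
import random

random.seed(0)

n_steps, n_paths = 100, 5000
x = 400.0        # threshold for (M^2)*, i.e. max |M_k| >= 20
a = 101.0        # level for A = <M>; A_n = n_steps < a, so P[A >= a] = 0

hits = 0
for _ in range(n_paths):
    m, peak = 0, 0
    for _ in range(n_steps):
        m += random.choice((-1, 1))
        peak = max(peak, m * m)
    hits += peak >= x

estimate = hits / n_paths          # estimates P[(M^2)* >= x]
bound = a / x                      # a/x + P[A >= a], second term zero here
assert estimate <= bound
```

The estimated probability is well under the Lenglart bound, as the lemma predicts; the bound is crude (it controls the running maximum using only expectations of $A$), which is exactly why it is robust enough for the tightness arguments below.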

E.8 Lemma. If M is a continuous local martingale with M0 = 0, then hM i


dominates M 2 in the sense of Lenglart.
Proof. Define the stopping times $T_n \triangleq \inf\{ t : |M_t| \ge n \text{ or } \langle M \rangle_t \ge n \}$, fix some bounded stopping time $T$, and set $N^n \triangleq (M^{T_n \wedge T})^2$. Applying Doob's maximal inequality to the positive submartingale $(M^{T_n \wedge T})^2$ gives
\[
E\big[ \sup\nolimits_{s \le t} N^n_s \big] \le 4\, E[ N^n_t ] = 4\, E\big[ \langle M \rangle_{T_n \wedge T \wedge t} \big].
\]
Letting $t \to \infty$ and then $n \to \infty$, we apply the monotone convergence theorem to conclude that $E[ \sup_{s \le T} M^2_s ] \le 4\, E[ \langle M \rangle_T ]$. In particular, if $E[\langle M \rangle_T]$ is finite, then the sequence $\{ M^2_{T_n \wedge T} \}$ is dominated by the integrable random variable $\sup_{s \le T} M^2_s$, so
\[
E[ M^2_T ] = \lim_n E[ M^2_{T_n \wedge T} ] = \lim_n E[ \langle M \rangle_{T_n \wedge T} ] \le E[ \langle M \rangle_T ],
\]
and this inequality is trivial when $E[\langle M \rangle_T]$ is infinite.

E.9 Lemma. If M is an Rd -valued continuous local martingale and T is an


extended real-valued stopping time, then N , M − M T is a continuous local
martingale and hN i = hM i − hM iT .
Proof. By stopping, we may assume without loss of generality that $M$ is a bounded martingale and $\langle M \rangle$ is bounded. Let $N^i$ denote the $i$th component of $N$ and


let $M^i$ denote the $i$th component of $M$. If $s < t$, then
\[
E[ N^i_t \mid \mathcal F_s ] = E[ M^i_t - M^i_{t \wedge T} \mid \mathcal F_s ] = M^i_s - M^i_{s \wedge T} = N^i_s
\]
and
\begin{align*}
&E\big[ N^i_t N^j_t - \big( \langle M^i, M^j \rangle_t - \langle M^i, M^j \rangle_{t \wedge T} \big) \,\big|\, \mathcal F_s \big] \\
&\quad = E[ M^i_t M^j_t - \langle M^i, M^j \rangle_t \mid \mathcal F_s ] - E[ M^i_{t \wedge T} M^j_{t \wedge T} - \langle M^i, M^j \rangle_{t \wedge T} \mid \mathcal F_s ] \\
&\qquad - E[ ( M^i_t - M^i_{t \wedge T} )\, M^j_{t \wedge T} \mid \mathcal F_s ] - E[ ( M^j_t - M^j_{t \wedge T} )\, M^i_{t \wedge T} \mid \mathcal F_s ] \\
&\quad = M^i_s M^j_s - \langle M^i, M^j \rangle_s - \big( M^i_{s \wedge T} M^j_{s \wedge T} - \langle M^i, M^j \rangle_{s \wedge T} \big) \\
&\qquad - ( M^i_s - M^i_{s \wedge T} )\, M^j_{s \wedge T} - ( M^j_s - M^j_{s \wedge T} )\, M^i_{s \wedge T} \\
&\quad = N^i_s N^j_s - \big( \langle M^i, M^j \rangle_s - \langle M^i, M^j \rangle_{s \wedge T} \big),
\end{align*}

where we have applied Lem. 3.32.

E.10 Lemma. If $\mathcal M \triangleq \{ M^\alpha \}$ is a collection of $\mathbb R^d$-valued continuous local martingales and $\{ \langle M^\alpha \rangle \}$ satisfies condition [C4], then $\mathcal M$ also satisfies condition [C4].
Proof. Fix $t$ and $\varepsilon > 0$. Then choose $a \le \varepsilon^3/(2d^3)$ and use the fact that $\{ \langle M^\alpha \rangle \}$ satisfies condition [C4] to choose $\delta$ so small that
\[
P^\alpha\big[ \langle M^\alpha \rangle_{T^\alpha + \delta} - \langle M^\alpha \rangle_{T^\alpha} \ge a \big] \le \varepsilon/(2d)
\]
for all $\alpha$ and all $\mathbb F^\alpha$-stopping times $T^\alpha \le t$.


We will now show that condition [C4] holds for $\mathcal M$. Fix some $\alpha$ and $\mathbb F^\alpha$-stopping time $T^\alpha \le t$, and then set $\overline M \triangleq M^\alpha - (M^\alpha)^{T^\alpha}$. Lem. E.9 asserts that $\overline M$ is a local martingale with quadratic variation $\overline C \triangleq \langle M^\alpha \rangle - \langle M^\alpha \rangle^{T^\alpha}$. As $\overline M_0 = 0$, Lem. E.8 asserts that $\overline C^{ii}$ dominates $(\overline M^i)^2$ in the sense of Lenglart for each $i \in \{1, \ldots, d\}$. Now fix any $u \in [0, \delta]$. As $T^\alpha + u$ is a stopping time,


we may apply Lem. E.7 to conclude that
\begin{align*}
P^\alpha\big[ \| M^\alpha_{T^\alpha + u} - M^\alpha_{T^\alpha} \| \ge \varepsilon \big]
&\le \sum_{i=1}^d P^\alpha\big[ | M^{\alpha,i}_{T^\alpha + u} - M^{\alpha,i}_{T^\alpha} | \ge \varepsilon/d \big] \\
&= \sum_{i=1}^d P^\alpha\big[ ( \overline M^i_{T^\alpha + u} )^2 \ge \varepsilon^2/d^2 \big] \\
&\le a d^3/\varepsilon^2 + \sum_{i=1}^d P^\alpha\big[ \overline C^{ii}_{T^\alpha + u} \ge a \big] \\
&\le \varepsilon/2 + \sum_{i=1}^d P^\alpha\big[ \langle M^{\alpha,i} \rangle_{T^\alpha + u} - \langle M^{\alpha,i} \rangle_{T^\alpha} \ge a \big] \\
&\le \varepsilon.
\end{align*}

As this holds for all $u \in [0, \delta]$, we conclude that $\mathcal M$ satisfies condition [C4].

E.11 Lemma. If {X α } is a collection of Rd -valued continuous semimartin-


gales, then [C3] implies condition [C2].
Proof. $M^\alpha \triangleq X^\alpha - B^\alpha$ is a local martingale with $\langle M^\alpha \rangle = C^\alpha$. As the collection $\{ A^\alpha \}$ is tight, it satisfies [C2], which implies that it also satisfies [C4]. The previous lemma then asserts that the collection $\{ M^\alpha \}$ satisfies [C4], which implies that it also satisfies [C2]. As $\{ B^\alpha \}$ satisfies [C2] by assumption and $X^\alpha = M^\alpha + B^\alpha$, it is clear that $\{ X^\alpha \}$ satisfies [C2] as well.

E.12 Corollary. If $\{ X^\alpha \}$ is a collection of $\mathbb R^d$-valued continuous semimartingales that satisfies conditions [C1] and [C3], then $\{ X^\alpha \}$ is tight.
Proof. As {X α } satisfies [C3], the previous lemma asserts that {X α } also
satisfies [C2]. But conditions [C1] and [C2] are sufficient to ensure that the
collection {X α } is tight.

Appendix F

Convergence of Characteristics

The goal of this appendix is to provide a self-contained development of the following theorem.

F.1 Theorem. Let X, B, and C be continuous processes where X and


B take values in Rd , C takes values in Rd ⊗Rd , and B is a.s. of finite
variation. Let $\{ X^n \}$ be a sequence of continuous, $\mathbb R^d$-valued processes, and
suppose that X n is a semimartingale with the characteristics (B n , C n ). If
(X n , B n , C n ) ⇒ (X, B, C), then X is a semimartingale which has the char-
acteristics (B, C) with respect to the filtration generated by X, B, and C
(i.e., F0 , {σ(X t , B t , C t )}t∈R+ ).

This theorem is a continuous version of [JS87] Thm. IX.2.4, and we take


advantage of the assumption of continuity to streamline the presentation.
To prove this theorem, we will need to show that the weak limit of a local martingale is still a local martingale; however, this is a little delicate, as the map which stops a path when it reaches a given level is not continuous.
Consider the following example.

F.2 Example. Let
\begin{align*}
x_n(t) &\triangleq t\, \mathbf 1_{[0,1)}(t) + (2 - t)\, \mathbf 1_{[1,\infty)}(t) - 1/n, \\
y(t) &\triangleq t\, \mathbf 1_{[0,1)}(t) + (2 - t)\, \mathbf 1_{[1,\infty)}(t), \text{ and} \\
z(t) &\triangleq t\, \mathbf 1_{[0,1)}(t) + \mathbf 1_{[1,\infty)}(t).
\end{align*}

Define the stopping time T : C(R+ ; R) → R+ by T (x) = inf{t : x(t) ≥ 1}.


Then xn → y uniformly, but


 
\[
\lim_{n \to \infty} \nabla\big( x_n, T(x_n) \big) = \lim_{n \to \infty} x_n = y \ne \nabla\big( y, T(y) \big) = z.
\]
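The example can be checked numerically on a grid. This sketch (the grid size and the value of $n$ are illustrative choices, not from the text) confirms that $x_n$ never reaches level $1$, so stopping does nothing to it, and the stopped paths stay uniformly close to $y$ but far from $z$:

```python
# Grid-based check of Example F.2: stopping is not a continuous operation.

def y(t):
    """The tent path: y(t) = t on [0,1) and 2 - t on [1, infinity)."""
    return t if t < 1 else 2 - t

def hit_time(f, grid, level=1.0):
    """First grid time with f(t) >= level, or None if the level is never hit."""
    return next((t for t in grid if f(t) >= level), None)

def stopped(f, grid, T):
    """The path f stopped at time T, sampled on the grid."""
    return [f(t if T is None else min(t, T)) for t in grid]

grid = [k / 1000 for k in range(2001)]       # [0, 2] with step 1/1000
n = 100
x_n = lambda t: y(t) - 1 / n                 # x_n = y - 1/n, maximum 1 - 1/n < 1

T_y, T_n = hit_time(y, grid), hit_time(x_n, grid)
z = stopped(y, grid, T_y)                    # y stopped at 1 is constant afterwards
x_n_stopped = stopped(x_n, grid, T_n)        # stopping does nothing: T_n is None

dist_to_y = max(abs(a - y(t)) for a, t in zip(x_n_stopped, grid))
dist_to_z = max(abs(a - b) for a, b in zip(x_n_stopped, z))

assert T_y == 1.0 and T_n is None
assert dist_to_y < 2 / n      # the stopped x_n stays uniformly close to y ...
assert dist_to_z > 0.9        # ... but remains far from z = stopped y
```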

Fortunately, we can avoid the situation just described a.s. by choosing


the levels at which we stop a process in a clever way that depends upon the
law of that process. This is the content of Lem. F.5. First we will need to
give a lemma about counting the jumps of a nondecreasing function.
F.3 Notation. If $f$ is a function that admits left and right limits, then we set $f(x+) \triangleq \lim_{y \downarrow x} f(y)$ and $f(x-) \triangleq \lim_{y \uparrow x} f(y)$.
F.4 Lemma. Let f : R → R+ be a nondecreasing function, fix some ε > 0,
and define

\begin{align*}
A &\triangleq \{ a \in \mathbb R : f(a+) - f(a-) \ge \varepsilon \}, \\
B^m &\triangleq \{ q \in \mathbb Q : f(q + 1/m) - f(q) \ge \varepsilon \}, \\
x_0 &\triangleq -\infty, \text{ and} \\
x_i &\triangleq \lim_{m \to \infty} \inf\big( B^m \cap (x_{i-1}, \infty) \big) \text{ for } i \ge 1.
\end{align*}

Then inf {xi }i∈N > −∞, xi > xi−1 when xi−1 < ∞, and A = {xi : xi < ∞}.
Proof. We first show that if y ∈ R with f (y+) − f (y−) < ε, then we may
choose δ = δ(y) > 0 and M = M (y) ∈ N such that B m ∩ (y − δ, y + δ) = ∅
for all m ≥ M . Choose η so small that f (y+) − f (y−) < ε − η, and then
choose δ so small that f (y−) − f (y − δ) < η/2 and f (y + 2δ) − f (y+) < η/2.
Finally, choose M so large that 1/M < δ. If m ≥ M and q ∈ (y − δ, y + δ),
then {q, q + 1/m} ⊂ (y − δ, y + 2δ), so

f (q + 1/m) − f (q) ≤ f (y + 2δ) − f (y − δ) < ε.

In particular, B m ∩ (y − δ, y + δ) = ∅.
As f is bounded from below, the set A ∩ (−∞, n] contains a finite number
of points for each n. In particular, A contains a least element, and we may
linearly order the jumps of size at least ε as {yn }n<N = A with yi−1 < yi for
some N ∈ N. We now show that the inductive assumption xi−1 = yi−1 < ∞
implies that xi = yi for i < N .
We will argue by contradiction that xi ≥ yi , so assume that xi < yi , and
then choose qm ∈ B m ∩ (xi−1 , ∞) with qm → xi . If xi = xi−1 , then we may


choose δ > xi−1 so close to xi−1 that we have f (z) − f (xi−1 +) < ε/2 when
z ∈ (xi−1 , δ). For sufficiently large m, we have {qm , qm + 1/m} ⊂ (xi−1 , δ),
but this contradicts the fact that f (qm + 1/m) − f (qm ) ≥ ε. Recall that
f is bounded below, so this argument is valid when xi−1 = −∞. On the
other hand, if xi ∈ (xi−1 , yi ), then we have f (xi +) − f (xi −) < ε, so we
may choose δ > 0 and M with B m ∩ (xi − δ, xi + δ) = ∅ for all m ≥ M .
This is again a contradiction. We have now shown that xi ≥ yi . Choosing
qm ∈ Q ∩ (xi−1 , ∞) with yi ∈ (qm , qm + 1/m), we see that

$f(q_m) + \varepsilon \le f(y_i-) + \varepsilon \le f(y_i+) \le f(q_m + 1/m)$,

so xi ≤ yi , and we conclude that xi = yi .


Using induction, we conclude that xi = yi for all i < N . As the yi are
strictly increasing and finite, so are the xi . If N = ∞, we are done, so
assume that $N < \infty$, and then further assume that $x_N < \infty$ for the sake of generating a contradiction. We have $f(x_N+) - f(x_N-) < \varepsilon$, so we may choose $\delta > 0$ and $M$ with $B^m \cap (x_N - \delta, x_N + \delta) = \emptyset$ for all $m \ge M$. This contradicts the definition of $x_N$, so we conclude that $x_i = \infty$ for all $i \ge N$.
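Lemma F.4 is, in effect, an algorithm: the jumps of size at least $\varepsilon$ of a nondecreasing function can be located using only countably many rational probes $f(q + 1/m) - f(q)$. A toy sketch (the step function and search grid are illustrative; for simplicity one probe width matched to the grid replaces the limit in $m$):

```python
# Locating the eps-jumps of a nondecreasing function via rational probes.
from fractions import Fraction

def f(x):
    # Nondecreasing, with jumps of size 1 at x = 1 and x = 2 and a small
    # jump of size 1/10 at x = 3 that should NOT be reported for eps = 1/2.
    return (x >= 1) + (x >= 2) + Fraction(1, 10) * (x >= 3)

EPS = Fraction(1, 2)
STEP = Fraction(1, 100)      # search grid; probe width 1/m equals the step

def next_jump(prev, hi=Fraction(4)):
    """Smallest grid point q > prev with f(q + STEP) - f(q) >= EPS."""
    q = prev + STEP
    while q < hi:
        if f(q + STEP) - f(q) >= EPS:
            return q
        q += STEP
    return None

x1 = next_jump(Fraction(0))
x2 = next_jump(x1)
x3 = next_jump(x2)

assert abs(x1 - 1) <= STEP      # locates the jump at x = 1
assert abs(x2 - 2) <= STEP      # locates the jump at x = 2
assert x3 is None               # the small jump at x = 3 is below eps
```

Exact rational arithmetic (`fractions.Fraction`) keeps the probe comparisons free of floating-point artifacts, mirroring the lemma's use of $\mathbb Q$.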

F.5 Lemma. Let $Z$ be a continuous, real-valued process, and define the stopping times $T^a : C(\mathbb R_+; \mathbb R) \to \overline{\mathbb R}_+$ by $T^a(z) \triangleq \inf\{ t : z(t) \ge a \}$. Then there exists a countable set $A \subset \mathbb R$ such that $T^a$ is $\mathcal L(Z)$-a.s. continuous when $a \notin A$.
Proof. We first show that the map a 7→ T a (z0 ) is left-continuous as a function
of a for fixed z0 ∈ C(R+ ; R). Fix some nondecreasing sequence {an } with
an → a∞ and an < a∞ for all n. Set tn = T an (z0 ) and t∞ , supn tn . If
$t_\infty = \infty$, then we must have $T^{a_\infty}(z_0) = \infty = \lim_n T^{a_n}(z_0)$, as $T^{a_\infty}(z_0) \ge T^{a_n}(z_0)$ for each $n$. Now assume that $t_\infty < \infty$. This implies that $t_n \to t_\infty$, as every bounded nondecreasing sequence converges. As $z_0$ is continuous, we have
\[
z_0(t_\infty) = \lim_{n \to \infty} z_0(t_n) = \lim_{n \to \infty} a_n = a_\infty,
\]
so $T^{a_\infty}(z_0) \le t_\infty$. On the other hand, if $s < t_\infty$, then there exists some $n$ such that $t_n \in (s, t_\infty)$, and this implies that $z_0(s) \le a_n < a_\infty$ and $T^{a_\infty}(z_0) > s$. In particular, $T^{a_\infty}(z_0) = t_\infty$. We have now shown that the map $a \mapsto T^a(z_0)$ is
left continuous.
We now assume that the map a 7→ T a (z0 ) is continuous at the point
a = c, and we show that this implies that the map z 7→ T c (z) is continuous


at the point z = z0 . Notice that this assumption implies that z0 does not
have a local max at t = T c (z0 ) and prevents the situation in Example F.2.
Let zn → z0 , fix ε > 0, and choose b < c < d with T c (z0 ) − T b (z0 ) <
ε and T d (z0 ) − T c (z0 ) < ε using the continuity of the map a 7→ T a (z0 ).
Set δ , min{c − b, d − c}/2, set t = T d (z0 ), and choose N so large that
sups≤t |z0 (s) − zn (s)| ≤ δ for all n ≥ N . Notice that this implies that zn (s) ≤
z0 (s) + δ ≤ b + δ < c for s ∈ [0, T b (z0 )] and zn (t) ≥ z0 (t) − δ = d − δ > c. In
particular, $T^c(z_n) \in \big( T^b(z_0), T^d(z_0) \big]$, so $|T^c(z_0) - T^c(z_n)| \le \varepsilon$.
Recursively define a sequence of functions $\xi^n_i : C(\mathbb R_+; \mathbb R) \to \overline{\mathbb R}$ by setting $\xi^n_0(x) \triangleq -\infty$, and then defining
\[
\xi^n_i(x) \triangleq \lim_{m \to \infty} \inf\big\{ a \in \mathbb Q : \xi^n_{i-1}(x) < a \text{ and } T^{a+1/m}(x) - T^a(x) \ge 1/n \big\}
\]

for each i > 0. For fixed x, the map a 7→ T a (x) is left-continuous, nonde-
creasing, and nonnegative, so we may apply Lem. F.4 to conclude that

{a ∈ R : T a+ (x) − T a (x) ≥ 1/n} = {ξin (x) : ξin (x) < ∞}.

The map $a \mapsto P[\xi^n_i(Z) \le a]$ is nondecreasing and right-continuous, so it has at most countably many jumps. This implies that the set $\{ a : P[\xi^n_i(Z) = a] > 0 \}$ is countable for each $n$ and $i$.
Putting everything together, we have

\begin{align*}
\big\{ \text{the map } z \mapsto T^a(z) \text{ is not continuous at } z = Z \big\}
&\subset \big\{ \text{the map } b \mapsto T^b(Z) \text{ is not continuous at } b = a \big\} \\
&= \big\{ T^{a+}(Z) \ge T^a(Z) + 1/n \text{ for some } n \big\} \\
&= \big\{ a = \xi^n_i(Z) \text{ for some } n \text{ and } i \big\}.
\end{align*}
Defining
\[
A \triangleq \cup_{i,n} \big\{ a : P[\xi^n_i(Z) = a] > 0 \big\},
\]
we see that $A$ is countable, and $T^a$ is $\mathcal L(Z)$-a.s. continuous when $a \notin A$. In particular, for each $a \notin A$ there exists a set $\Omega^a \subset C(\mathbb R_+; \mathbb R)$ such that $P[Z \in \Omega^a] = 1$ and the map $z \mapsto T^a(z)$ is continuous at each $z \in \Omega^a$.
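The left-continuity established in the proof, and the possibility of a right jump at a bad level, can both be seen on the tent path $y$ from Example F.2, for which $T^a(y) = a$ when $a \le 1$ and $T^a(y) = \infty$ when $a > 1$. A grid sketch (the grid and levels are illustrative choices, not from the text):

```python
# The map a -> T^a(y) for the tent path: left-continuous everywhere, but
# discontinuous from the right at the path's maximum level a = 1.

INF = float("inf")

def y(t):
    return t if t < 1 else 2 - t

def T(a, grid):
    """Grid approximation of the hitting time inf{t : y(t) >= a}."""
    return next((t for t in grid if y(t) >= a), INF)

grid = [k / 1000 for k in range(4001)]        # [0, 4] with step 1/1000

assert all(abs(T(a, grid) - a) < 1e-9 for a in (0.25, 0.5, 0.75, 0.999))
assert T(1.0, grid) == 1.0                    # T^a(y) -> 1 = T^1(y) as a -> 1-
assert T(1.001, grid) == INF                  # but T^a(y) = infinity for a > 1
```

The level $a = 1$ is exactly the kind of point that the countable exceptional set $A$ of the lemma is designed to exclude.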

F.6 Corollary. Let $E$ be a Polish space, and let $\{ X^n \}_{n \in \mathbb N}$ be a collection of continuous, $E$-valued processes. Suppose that $X^n \Rightarrow X^\infty$, fix some point


$e \in E$, define the stopping times
\[
S^a : C(\mathbb R_+; E) \to \overline{\mathbb R}_+ \quad\text{by}\quad S^a(y) \triangleq \inf\{ t : d(y(t), e) \ge a \},
\]
and set $X^{n,a} \triangleq (X^n)^{S^a(X^n)}$. Then there exists an increasing sequence $\{ a_m \}$ with $\lim_m a_m = \infty$ such that $X^{n,a_m} \Rightarrow X^{\infty,a_m}$ for each $m$.
Proof. Let B ∞ = (Ω∞ , P∞ , F∞ , F ∞ ) denote the stochastic basis on which
X ∞ is defined. Let φ : C(R+ ; E) → C(R+ ; R+ ) denote the map such that
φt (y) = d(y(t), e), and set Z n = φ ◦ X n . The map f 7→ d(f, e) is uniformly
continuous, so φ is continuous, and (X n , Z n ) ⇒ (X ∞ , Z ∞ ).
Define the stopping times $T^a$ as in the previous lemma, so $T^a(Z^n) = S^a(X^n)$, and then choose $A$ such that $P^\infty[ T^a \text{ is discontinuous at } Z^\infty ] = 0$ when $a \notin A$. Let $\psi^a : C(\mathbb R_+; E \times \mathbb R) \to C(\mathbb R_+; E)$ denote the map $(x, z) \mapsto \nabla(x, T^a(z))$, and notice that $\psi^a(X^n, Z^n) = X^{n,a}$. The map $(x, t) \mapsto \nabla(x, t)$ is continuous, so $\psi^a$ is continuous at the point $(x, z)$ when $T^a$ is continuous at the point $z$. In particular, $\psi^a$ is $\mathcal L(X^\infty, Z^\infty)$-a.s. continuous when $T^a$ is $\mathcal L(Z^\infty)$-a.s. continuous.
To conclude, choose an increasing sequence $\{ a_m \}$ with $\lim_m a_m = \infty$ and $a_m \notin A$ for all $m$. Then $\psi^{a_m}$ is $\mathcal L(X^\infty, Z^\infty)$-a.s. continuous for each $m$, so we have
\[
X^{n,a_m} = \psi^{a_m}(X^n, Z^n) \Rightarrow \psi^{a_m}(X^\infty, Z^\infty) = X^{\infty,a_m}.
\]

F.7 Remark. Let $E_2$ be a Polish space, and let $\{ Y^n \}_{n \in \mathbb N}$ be a collection of $E_2$-valued random variables. If we make the stronger assumption that $(X^n, Y^n) \Rightarrow (X^\infty, Y^\infty)$ in the previous corollary, then we may conclude that $(X^{n,a_m}, Y^n) \Rightarrow (X^{\infty,a_m}, Y^\infty)$ for each $m$.

F.8 Lemma. Suppose that $M$ is a $P$-a.s. right-continuous process which is adapted to some filtration $\mathbb F^0 = \{ \mathcal F^0_t \}_{t \in \mathbb R_+}$. If $D$ is a dense subset of $\mathbb R_+$ and $(\{ M_t \}_{t \in D}, \{ \mathcal F^0_t \}_{t \in D}, P)$ is a martingale, then $(\{ M_t \}_{t \in \mathbb R_+}, \mathbb F^P, P)$ is a martingale, where $\mathbb F^P = \{ \mathcal F^P_t \}_{t \in \mathbb R_+}$ is the smallest filtration which contains $\mathbb F^0$ and satisfies the usual conditions with respect to $P$.
Proof. Let $\mathbb F = \{ \mathcal F_t \}_{t \in \mathbb R_+}$ denote the smallest right-continuous filtration that contains $\mathbb F^0$, so $\mathcal F_t = \mathcal F^0_{t+}$. Fix $s < t$ and let $Z$ be a bounded, $\mathcal F_s$-measurable random variable. Choose strictly decreasing sequences $\{ s_n \}$ and $\{ t_n \}$ in $D$ with $\lim_n s_n = s$, $\lim_n t_n = t$, and $s_n < t_n$ for all $n$. As $M_u = E[ M_{t_0} \mid \mathcal F^0_u ]$ for $u \in D \cap [0, t_0]$, the collection $\{ M_u \}_{u \in D \cap [0, t_0]}$ is uniformly integrable, $\lim_n M_{s_n} = M_s$, $P$-a.s., and $\lim_n M_{t_n} = M_t$, $P$-a.s., so
\[
E[ M_t Z ] = \lim_{n \to \infty} E[ M_{t_n} Z ] = \lim_{n \to \infty} E[ M_{s_n} Z ] = E[ M_s Z ].
\]

As this is true for all bounded, Fs -measurable Z, we conclude that Ms is a


version of E[Mt | Fs ], and it is then clear that Ms must also be a version of
E[Mt | FsP ].
F.9 Theorem. Let $E$ be a Polish space, let $\{ Y^n \}_{n \in \mathbb N}$ be a collection of continuous, $E$-valued processes, and let $\{ M^n \}_{n \in \mathbb N}$ be a collection of continuous, real-valued processes. If $M^n$ is a local martingale with respect to some filtration to
which Y n and M n are adapted for each n < ∞ and (Y n , M n ) ⇒ (Y ∞ , M ∞ ),
then M ∞ is a local martingale with respect to the filtration generated by Y ∞
and M ∞ .
Proof. First we move the proof onto the canonical space. Let X = (Y, M )
denote the canonical process on Ω , C(R+ ; E×R), and set Pn = L (Y n , M n ),
so Pn ⇒ P∞ by assumption. As we now have everything defined on the
canonical space, we throw away the original sequence (Y n , M n ), and we will
reuse the notation $M^m$ to denote stopped versions of $M$ below. Let $\mathbb C^0 = \{ \mathcal C^0_t \}$ denote the filtration on $\Omega$ with $\mathcal C^0_t \triangleq \sigma(X^t)$, and let $\mathcal C = \sigma(X) = \mathcal C^0_\infty$ denote the Borel $\sigma$-field on $\Omega$. Finally, let $\mathbb F^n = \{ \mathcal F^n_t \}$ denote the smallest right-continuous, $P^n$-augmented filtration which contains $\mathbb C^0$.
Now define the stopping time $T^a \triangleq \inf\{ t : |M_t| \ge a \}$. Using Cor. F.6 and Rem. F.7, we may choose a sequence $\{ a_m \}$ with $\lim_m a_m = \infty$ such that $\mathcal L(X, M^m \mid P^n) \Rightarrow \mathcal L(X, M^m \mid P^\infty)$ for each fixed $m$, where $M^m \triangleq M^{T^{a_m}}$. If

we fix any $s < t$ and any bounded, continuous $f : C(\mathbb R_+; E \times \mathbb R) \to \mathbb R$, then the map $x = (y, z) \mapsto f(\nabla(x, s))\, (z(t) - z(s))$ is continuous from $C(\mathbb R_+; E \times \mathbb R)$ to $\mathbb R$. This means that $\mathcal L\big( f(X^s)(M^m_t - M^m_s) \mid P^n \big) \Rightarrow \mathcal L\big( f(X^s)(M^m_t - M^m_s) \mid P^\infty \big)$. But everything here is bounded, so we also have

\[
\tag{F.10} E^\infty\big[ f(X^s)\, (M^m_t - M^m_s) \big] = \lim_n E^n\big[ f(X^s)\, (M^m_t - M^m_s) \big] = 0,
\]

as $M^m$ is a martingale under $P^n$ for each $n < \infty$. Now the class of functions $f : C(\mathbb R_+; E \times \mathbb R) \to \mathbb R$ such that (F.10) holds is a monotone class that contains all the bounded continuous functions, so it must actually contain all bounded $\mathcal C/\mathbb R$-measurable functions. In particular, $E^\infty[ M^m_t - M^m_s \mid \mathcal C^0_s ] = 0$ and $M^m$ is a $(\mathbb C^0, P^\infty)$-martingale. We may then apply Lem. F.8 to conclude that


$M^m$ is actually an $(\mathbb F^\infty, P^\infty)$-martingale. As $T^{a_m} \to \infty$ everywhere, we have exhibited a localizing sequence for $M$, and we see that $M$ is an $(\mathbb F^\infty, P^\infty)$-local martingale.

F.11 Lemma. Let D1 ⊂ R+ and D2 ⊂ Rd be dense subsets, and let f ∈


C(R+ ; Rd ⊗Rd ). Suppose that f (s) ∈ S d when s ∈ D1 and that (f (s)x, x) ≤
(f (t)x, x) when s, t ∈ D1 , x ∈ D2 and s ≤ t. Then f (t) − f (s) ∈ S+d for all
s, t ∈ R+ with s ≤ t.
Proof. For the sake of generating a contradiction, first assume that there exists $s \in \mathbb R_+$ such that $f^{ij}(s) \ne f^{ji}(s)$. But then we may choose $s_n \in D_1$ with $\lim_n s_n = s$, so
\[
f^{ij}(s) = \lim_n f^{ij}(s_n) = \lim_n f^{ji}(s_n) = f^{ji}(s).
\]
As this is a contradiction, we have $f(s) \in \mathbb S^d$ for all $s \in \mathbb R_+$.


Now assume that there exist $s, t \in \mathbb R_+$ with $s < t$ such that $f(t) - f(s) \notin \mathbb S^d_+$. This means that there exists $x \in \mathbb R^d$ with $(f(s)x, x) > (f(t)x, x)$. Take $s_n, t_n \in D_1$ and $x_n \in D_2$ with $\lim_n s_n = s$, $\lim_n t_n = t$, $\lim_n x_n = x$, and $s_n < t_n$ for each $n$. The map $(A, x) \mapsto (Ax, x)$ is continuous from $\mathbb S^d \times \mathbb R^d$ to $\mathbb R$, so we have
\[
(f(s)x, x) = \lim_{n \to \infty} (f(s_n)x_n, x_n) \le \lim_{n \to \infty} (f(t_n)x_n, x_n) = (f(t)x, x).
\]

This is again a contradiction, so we conclude that f (t) − f (s) ∈ S+d for all
s, t ∈ R+ with s ≤ t.

We now have everything that we need to prove the main theorem of this
subsection.

F.1 Theorem. Let X, B, and C be continuous processes where X and


B take values in Rd , C takes values in Rd ⊗Rd , and B is a.s. of finite
variation. Let $\{ X^n \}$ be a sequence of continuous, $\mathbb R^d$-valued processes, and
suppose that X n is a semimartingale with the characteristics (B n , C n ). If
(X n , B n , C n ) ⇒ (X, B, C), then X is a semimartingale which has the char-
acteristics (B, C) with respect to the filtration generated by X, B, and C
(i.e., F0 , {σ(X t , B t , C t )}t∈R+ ).


Proof. Without loss of generality, we assume that everything is defined on


the same space. As the set {0} is closed and C n = hX n i, we may apply
Portmanteau’s Theorem (Thm. B.1) to conclude that

\[
P[ C^{ij}_t - C^{ji}_t = 0 ] \ge \limsup_n P[ C^{n,ij}_t - C^{n,ji}_t = 0 ] = 1.
\]

Letting D1 denote a countable dense subset of R+ , we may then conclude


that $P[ C_t \in \mathbb S^d \ \forall t \in D_1 ] = 1$. Similarly, for each $s < t$ and $x \in \mathbb R^d$, we have
\[
P\big[ ( (C_t - C_s)x, x ) \ge 0 \big] \ge \limsup_n P\big[ ( (C^n_t - C^n_s)x, x ) \ge 0 \big] = 1.
\]

Letting $D_2$ denote a countable dense subset of $\mathbb R^d$, we may then conclude that $P[ ( (C_t - C_s)x, x ) \ge 0 \ \forall s < t \in D_1,\ x \in D_2 ] = 1$. We may then apply Lem. F.11 to conclude that $P[ C_t - C_s \in \mathbb S^d_+ \ \forall s < t \in \mathbb R_+ ] = 1$, and then
Lem. C.6 implies that C is a.s. of finite variation.
If $a$ and $b$ are points in $\mathbb R^d$ and $c$ is a point in $\mathbb R^d \otimes \mathbb R^d$, then the maps $(a, b) \mapsto a - b$ and $(a, b, c) \mapsto (a - b) \otimes (a - b) - c$ are continuous, so we may apply Lem. B.6 to conclude that the functional versions of these maps are continuous. This implies that
\[
\big( X^n, B^n, C^n, X^n - B^n, (X^n - B^n) \otimes (X^n - B^n) - C^n \big) \Rightarrow \big( X, B, C, X - B, (X - B) \otimes (X - B) - C \big).
\]
⇒ X, B, C, X − B, (X − B)⊗(X − B) − C

As each coordinate of $X^n - B^n$ and $(X^n - B^n) \otimes (X^n - B^n) - C^n$ is a local martingale, we may apply Thm. F.9 to conclude that each coordinate of $X - B$ and $(X - B) \otimes (X - B) - C$ is a local martingale with respect to the filtration generated by $(X, B, C)$. We have shown above that $C$ is a.s. of finite variation, so $\langle X - B \rangle = C$. As $B$ is a.s. of finite variation by assumption, we conclude that $X$ has the characteristics $(B, C)$.

Bibliography

[ABOBF02] M. Avellaneda, D. Boyer-Olson, J. Busca, and P. Friz. Recon-


struction of volatility: Pricing index options using the steepest-
descent approximation. Risk, pages 87–91, 2002.

[Ald78] D. Aldous. Stopping times and tightness. The Annals of Prob-


ability, 6(2):335–340, 1978.

[AM06] A. Antonov and T. Misirpashaev. Markovian projection onto a


displaced diffusion: Generic formulas with applications. Available at SSRN: http://ssrn.com/abstract=937860, 2006.

[AMP07] A. Antonov, T. Misirpashaev, and V. Piterbarg. Markovian


projection onto a Heston model. Working Paper, 2007.

[BI66] A. Ben-Israel. A note on an iterative method for generalized


inversion of matrices. Math. Comp, 20:439–440, 1966.

[BIG03] A. Ben-Israel and T.N.E. Greville. Generalized Inverses: Theory


and Applications. Springer, 2003.

[Bil68] P. Billingsley. Convergence of Probability Measures. John Wiley


& Sons, 1968.

[BJN00] M. Britten-Jones and A. Neuberger. Option prices, implied price


processes, and stochastic volatility. The Journal of Finance,
55(2):839–866, 2000.

[BL78] D.T. Breeden and R.H. Litzenberger. Prices of state-contingent


claims implicit in option prices. The Journal of Business,
51(4):621–651, 1978.


[BM01] D. Brigo and F. Mercurio. Displaced and mixture diffusions


for analytically-tractable smile models. Mathematical Finance-
Bachelier Congress 2000, pages 151–174, 2001.

[BM02] D. Brigo and F. Mercurio. Lognormal-mixture dynamics and


calibration to market volatility smiles. International Journal of
Theoretical and Applied Finance, 5(4):427–446, 2002.

[Car07] R. Carmona. HJM: A unified approach to dynamic models for


fixed income, credit and equity markets. Lecture Notes in Math-
ematics, 1919:1, 2007.

[CN07] R. Carmona and S. Nadtochiy. Local volatility dynamic models.


Preprint, Princeton University, 2007.

[Con98] D. Constales. A closed formula for the Moore-Penrose general-


ized inverse of a complex matrix of given rank. Acta Mathemat-
ica Hungarica, 80:83–88, 1998.

[Der01] Emanuel Derman. Models and markets. Risk, 14(2):48–50, 2001.

[DFW98] B. Dumas, J. Fleming, and R.E. Whaley. Implied volatility


functions: Empirical tests. The Journal of Finance, 53(6):2059–
2106, 1998.

[DK94] E. Derman and I. Kani. Riding on a smile. Risk, 7(2):32–39,


1994.

[DK98] E. Derman and I. Kani. Stochastic implied trees: Arbitrage pric-


ing with stochastic term and strike structure of volatility. Inter-
national Journal of Theoretical and Applied Finance, 1(1):61–
110, 1998.

[Dup94] B. Dupire. Pricing with a smile. Risk, 7(1):18–20, 1994.

[Fok13] A.D. Fokker. Die mittlere Energie rotierender elektrischer Dipole im Strahlungsfeld. Annalen der Physik, 348(5):810–820, 1913.

[Gar73] A. Garsia. Martingale Inequalities: Seminar Notes on Recent


Progress. W. A. Benjamin, 1973.


[Gat06] J. Gatheral. The Volatility Surface: A Practitioner’s Guide.


Wiley, 2006.
[Gyö86] I. Gyöngy. Mimicking the one-dimensional marginal distribu-
tions of processes having an Itô differential. Probability Theory
and Related Fields, 71(4):501–516, 1986.
[Hes93] S.L. Heston. A closed-form solution for options with stochas-
tic volatility with applications to bond and currency options.
Review of Financial Studies, 6(2):327–43, 1993.
[JM81a] J. Jacod and J. Memin. Existence of weak solutions for stochas-
tic differential equations with driving semimartingales. Stochas-
tics An International Journal of Probability and Stochastic Pro-
cesses, 4(4):317–337, 1981.
[JM81b] J. Jacod and J. Memin. Weak and strong solutions of stochas-
tic differential equations: existence and stability. Proc. LMS
Symp., Lect. Notes in Math, 851:169–212, 1981.
[JM86] A. Joffe and M. Metivier. Weak convergence of sequences of
semimartingales with applications to multitype branching pro-
cesses. Advances in Applied Probability, 18(1):20–65, 1986.
[JS87] J. Jacod and A.N. Shiryaev. Limit theorems for stochastic pro-
cesses, 2nd Edition. Springer New York, 1987.
[Kec95] A.S. Kechris. Classical Descriptive Set Theory. Springer, 1995.
[Kol31] A. Kolmogoroff. Über die analytischen Methoden in der Wahrscheinlichkeitsrechnung. Mathematische Annalen, 104(1):415–458, 1931.
[Kry84] N. V. Krylov. Once more about the connection between elliptic
operators and Itô's stochastic equations. Statistics and Control
of Stochastic Processes, Steklov Seminar, pages 214–229, 1984.
[KS91] I. Karatzas and S.E. Shreve. Brownian Motion and Stochastic Calculus, volume 113 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1991.
[Kun90] H. Kunita. Stochastic Flows and Stochastic Differential Equa-
tions. Cambridge University Press, 1990.


[Len77] E. Lenglart. Relation de domination entre deux processus. Ann.


Inst. Henri Poincaré, 13:171–179, 1977.

[LS01] R.S. Liptser and A.N. Shiryaev. Statistics of Random Processes I: General Theory, 2nd Edition. Applications of Mathematics, Springer, Berlin-New York, 2001.

[MQR07] Dilip Madan, Michael Qian Qian, and Yong Ren. Calibrat-
ing and pricing with embedded local volatility models. Risk,
20(9):138–143, 2007.

[Par67] K. R. Parthasarathy. Probability Measures on Metric Spaces.


Academic Press, 1967.

[Pen55] R. Penrose. A generalized inverse for matrices. Proc. Cambridge


Philos. Soc., 51:406–413, 1955.

[Pit03a] V. Piterbarg. Mixture of models: A simple recipe for a... hang-


over? Working Paper, 2003.

[Pit03b] VV Piterbarg. A stochastic volatility forward libor model with


a term structure of volatility smiles. Technical report, Working
paper, Bank of America, 2003.

[Pit05] V. Piterbarg. Time to smile. Risk, pages 71–75, 2005.

[Pit06] V. Piterbarg. Smiling hybrids. Risk, May, pages 65–71, 2006.

[Pit07] V. Piterbarg. Markovian projection for volatility calibration.


Risk, 4, 2007.

[Pla17] M. Planck. Über einen Satz der statistischen Dynamik und eine Erweiterung in der Quantentheorie. Sitzungsberichte der Preussischen Akademie der Wissenschaften, pages 324–341, 1917.

[Reb79] R. Rebolledo. La méthode des martingales appliquée à l’étude


de la convergence en loi de processus. Mémoires de la Société
Mathématique de France, 62:1–125, 1979.

[Roy88] H.L. Royden. Real analysis, 3rd Edition. Macmillan New York,
1988.


[RSN90] F. Riesz and B. Szőkefalvi-Nagy. Functional Analysis. Dover


Publications, 1990.

[Rub94] M. Rubinstein. Implied binomial trees. Journal of Finance,


49(3):771–818, 1994.

[RY99] D. Revuz and M. Yor. Continuous martingales and Brownian


motion, 3rd Edition. Springer-Verlag New York, 1999.

[SV79] D.W. Stroock and S.R.S. Varadhan. Multidimensional Diffusion


Processes. Springer-Verlag, 1979.

[Var67] S.R.S. Varadhan. On the behavior of the fundamental solution


of the heat equation with variable coefficients. Comm. Pure
Appl. Math, 20(2), 1967.
