X_0, X_Δ, . . . , X_{NΔ} becomes a genuine Markov chain, and inference
about the underlying coefficients of the diffusion process must be sought via the
identification of the law of the observation X^{(N,Δ)}. In the time-homogeneous
case the mathematical properties of the random vector X^{(N,Δ)} are embodied in
the transition operator

    P_Δ f(x) := E[f(X_Δ) | X_0 = x],

defined on appropriate test functions f. Under suitable assumptions, the operator P_Δ is linked to the infinitesimal generator

    Lf(x) = (σ²(x)/2) f''(x) + b(x) f'(x).
NONPARAMETRIC ESTIMATION OF SCALAR DIFFUSIONS 2225
The second-order term σ²(·) is the diffusion coefficient, and the first-order term b(·)
is the drift coefficient. Postulating the existence of an invariant density μ(·) =
μ_{σ,b}(·), the operator L is unbounded, but self-adjoint negative on L²(μ) :=
{f | ∫ |f|² dμ < ∞}, and the functional calculus gives the correspondence

    P_Δ = exp(ΔL)                                                        (1.1)

in the operator sense. Therefore, a consistent statistical program can be presumed
to start from the observed Markov chain X^{(N,Δ)}, estimate its transition operator P_Δ
and infer about the pair (b(·), σ(·)), via the correspondence (1.1), in other words
via the spectral properties of the operator P_Δ:

    X^{(N,Δ)}  --(E)-->  P_Δ  --(I)-->  L  -->  (b(·), σ(·)) = parameter.    (1.2)

The efficiency of a given statistical estimation procedure will be measured by the
proficiency in combining the estimation part (E) and the identification part (I) of
the model.
The works of Hansen, Scheinkman and Touzi (1998) and Chen, Hansen and
Scheinkman (1997) paved the way: they formulated a precise and thorough program,
proposing and discussing several methods for identifying scalar diffusions
via their spectral properties. Simultaneously, the Danish school, given an impulse
by the works of Kessler and Sørensen (1999), systematically studied the parametric
efficiency of spectral methods in the fixed-Δ setting described above. By
constructing estimating functions based on eigenfunctions of the operator L, they
could construct √N-consistent estimators and obtained precise asymptotic properties.

However, a quantitative study of nonparametric estimation in the fixed-Δ
context remained out of reach for some time, for both technical and conceptual
reasons. The purpose of the present paper is to fill in this gap, by trying to
understand and explain why the nonparametric case significantly differs from
its parametric analogue, as well as from the high frequency data framework in
nonparametrics.
We are going to establish minimax rates of convergence over various smoothness
classes, characterizing upper and lower bounds for estimating b(·) and σ(·)
based on the observation of X_0, X_Δ, . . . , X_{NΔ}, with asymptotics taken as N → ∞.
The minimax rate of convergence is an index of both accuracy of estimation and
complexity of the model. We will show that in the nonparametric case the complexity
of the problems of estimating b(·) and σ(·) is related to ill-posed inverse
problems. Although we mainly focus on the theoretical aspects of the statistical
model, the estimators we propose are based on feasible nonparametric smoothing
methods: they can be implemented in practice, allowing for adaptivity and finite
sample optimization. Some simulation results were performed by Reiß (2003).
The estimation problem is formulated exactly in Section 2, where the
main theoretical results are also stated. The spectral estimation method we adopt is
2226 E. GOBET, M. HOFFMANN AND M. REIß
explained in Section 3, which includes a discussion of related problems and
possible extensions. The proofs of the upper bound for our estimator and its
optimality in a minimax sense are given in Sections 4 and 5, respectively. Results
of rather technical nature are deferred to Section 6.
1.2. Statistical estimation for diffusions: an outlook. We give a brief and
selective summary of the evolution of the area over the last two decades. The
nonparametric identification of diffusion processes from continuous data was
probably first addressed in the reference paper of Banon (1978). More precise
estimation results can be listed as follows:
1.2.1. Continuous or high frequency data: the parametric case. Estimation of
a finite-dimensional parameter θ from X^T = (X_t, 0 ≤ t ≤ T) with asymptotics as
T → ∞ when X is a diffusion of the form

    dX_t = b_θ(X_t) dt + σ(X_t) dW_t                                      (1.3)

is classical [Brown and Hewitt (1975) and Kutoyants (1975)]. Here (W_t, t ≥ 0) is
a standard Wiener process. The diffusion coefficient is perfectly identified from
the data by means of the quadratic variation of X. By assuming the process X
to be ergodic (positively recurrent), a sufficiently regular parametrization θ →
b_θ(·) implies the local asymptotic normality (LAN) property for the underlying
statistical model, therefore ensuring the √T-consistency and efficiency of the
ML-estimator [see Liptser and Shiryaev (2001)].
In the case of discrete data X_{nΔ_N}, n = 0, 1, . . . , N, with high frequency
sampling Δ_N → 0, but long range observation NΔ_N → ∞ as N → ∞, various
discretization schemes and estimating procedures had been proposed [Yoshida
(1992) and Kessler (1997)] until Gobet (2002) eventually proved the LAN property
for ergodic diffusions of the form

    dX_t = b_{θ₁}(X_t) dt + σ_{θ₂}(X_t) dW_t                              (1.4)

in a general setting, by means of the Malliavin calculus: under suitable regularity
conditions, the finite-dimensional parameter θ₁ in the drift term can be estimated
with optimal rate √(NΔ_N), whereas the finite-dimensional parameter θ₂ in the
diffusion coefficient is estimated with the optimal rate √N.
1.2.2. Continuous or high frequency data: the nonparametric case. A similar
program was progressively obtained in nonparametrics: If the drift function b(·)
is globally unknown in the model given by (1.3), but belongs to a Sobolev ball
S(s, L) (of smoothness order s > 0 and radius L) over a given compact interval I,
a certain kernel estimator b̂_T(·) achieves the following upper bound in L²(I) and
in a root-mean-squared sense:

    sup_{b ∈ S(s,L)} E[ ‖b̂_T − b‖²_{L²(I)} ]^{1/2} ≲ T^{−s/(2s+1)}.
This already indicates a formal analogy with the model of nonparametric
regression or "signal + white noise" where the same rate holds. (Here and in
the sequel, the symbol ≲ means "up to constants," possibly depending on the
parameters of the problem, but that are continuous in their arguments.) See
Kutoyants (1984) for precise mathematical results.
Similar extensions to the discrete case with high frequency data sampling
for the model driven by (1.4) were given in Hoffmann (1999), where the rates
(NΔ_N)^{−s/(2s+1)} for the drift function b(·) and N^{−s/(2s+1)} for the diffusion
coefficient σ(·) have been obtained and proved to be optimal. See also the
pioneering paper of Pham (1981). Methods of this kind have been successfully
applied to financial data [Aït-Sahalia (1996), Stanton (1997), Chapman and
Pearson (2000) and Fan and Zhang (2003)]. In particular, it is investigated whether
the usual parametric model assumptions are compatible with the data, and the use
of nonparametric methods is advocated.
1.2.3. From high to low frequency data. As soon as the sampling frequency
Δ_N^{−1} = Δ^{−1} is not large anymore, the problem of estimating a parameter in the
drift or diffusion coefficient becomes significantly more difficult: the trajectory
properties that can be recovered from the data when Δ_N is small are lost. In
particular, there is no evident approximating scheme that can efficiently compute
or mimic the continuous ML-estimator in parametric estimation.

Likewise, the usual nonparametric kernel estimators, based on differencing, do
not provide consistent estimation of the drift b(·) or the diffusion coefficient σ(·).
As a concrete example, consider the standard Nadaraya–Watson estimator b̂(x) of
the drift b(x) in the point x ∈ R:

    b̂(x) := [ (NΔ)^{−1} Σ_{n=0}^{N−1} K_h(x − X_{nΔ})(X_{(n+1)Δ} − X_{nΔ}) ] / [ N^{−1} Σ_{n=0}^{N−1} K_h(x − X_{nΔ}) ]

with a kernel function K(·) and K_h(x) := h^{−1}K(h^{−1}x) for h > 0. If we let
N → ∞ and h → 0, then by the ratio ergodic theorem and by kernel properties
we obtain almost surely the limit

    E[ Δ^{−1}(X_Δ − X_0) | X_0 = x ] = Δ^{−1} ∫_0^Δ P_t b(x) dt.

Hence, this estimator is not consistent. It merely yields a highly blurred version
of b(x), which of course tends to b(x) in the high frequency limit Δ → 0. Note that
the transition operators P_t involved depend on the unknown functions b(·) and σ(·)
as a whole. The situation for estimators of σ(·) based on the approximation of the
quadratic variation is even worse, because the drift b(·) enters directly into the
limit expression.
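The blurring effect is easy to check numerically. The sketch below is an illustration only, not part of the paper: it uses an Ornstein–Uhlenbeck process dX_t = −X_t dt + dW_t, for which P_t b(x) = −x e^{−t}, so the blurred target above equals −x(1 − e^{−Δ})/Δ, and evaluates the Nadaraya–Watson estimator at x = 1 with Δ = 1, where the true drift is −1 but the blurred target is about −0.63.

```python
import numpy as np

rng = np.random.default_rng(0)

# Ornstein-Uhlenbeck process dX_t = -X_t dt + dW_t, sampled exactly at the
# coarse step Delta = 1 via its Gaussian transition kernel.
Delta, N = 1.0, 5000
a = np.exp(-Delta)                  # AR(1) coefficient e^{-Delta}
s = np.sqrt((1 - a**2) / 2)         # conditional standard deviation
X = np.empty(N + 1)
X[0] = 0.0
for n in range(N):
    X[n + 1] = a * X[n] + s * rng.standard_normal()

def nw_drift(x0, h=0.2):
    """Nadaraya-Watson drift estimator based on first differences."""
    K = np.exp(-0.5 * ((x0 - X[:-1]) / h) ** 2)   # Gaussian kernel weights
    return (K @ (X[1:] - X[:-1])) / (Delta * K.sum())

# True drift b(1) = -1, while the blurred target is
# Delta^{-1} * int_0^Delta P_t b(1) dt = -(1 - e^{-1}), roughly -0.63.
print(nw_drift(1.0))
```

Increasing N or shrinking h does not repair this: the estimator converges to the blurred quantity, not to b(x).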
1.2.4. Spectral methods for parametric estimation. Kessler and Sørensen
(1999) suggested the use of eigenvalues λ_θ and eigenvectors φ_θ(·) of the
parametrized infinitesimal generator

    L_θ f(x) = (σ²_θ(x)/2) f''(x) + b_θ(x) f'(x),

that is, such that L_θ φ_θ(x) = −λ_θ φ_θ(x). Since the pair (λ_θ, φ_θ) also satisfies

    P_Δ φ_θ(X_{nΔ}) = E[ φ_θ(X_{(n+1)Δ}) | X_{nΔ} ] = exp(−λ_θ Δ) φ_θ(X_{nΔ}),

whenever it is easy to compute, the knowledge of a pair (λ_θ, φ_θ) can be translated
into a set of conditional moment conditions to be used in estimating functions.
With their method, Kessler and Sørensen can construct √N-consistent estimators
that are nearly efficient. See also the paper of Hansen, Scheinkman and Touzi
(1998) that we already mentioned.
In a sense, in this idea also lies the essence of our method. However, the
strategy of Kessler and Sørensen is not easily extendable to nonparametrics: there
is no straightforward way to pass from a finite-dimensional parametrization of the
generator L_θ to the nonparametric class

    Θ_s := { (σ, b) | ‖σ‖_{H^s} ≤ C, ‖b‖_{H^{s−1}} ≤ C, inf_x σ(x) ≥ c },

where H^s denotes the L²-Sobolev space of order s.
Note that all (σ(·), b(·)) ∈ Θ_s satisfy Assumption 2.1.
2.2. Minimax rates of convergence. We are now in position to state the main
theorems. By (3.12) and (3.13) in the next section we define estimators σ̂² and b̂
using a spectral estimation method based on the observation (X_0, X_Δ, . . . , X_{NΔ}).
These estimators, which of course depend on the number N of observations, satisfy
the following uniform asymptotic upper bounds.

THEOREM 2.4. For all s > 1, C ≥ c > 0 and 0 < a < b < 1 we have

    sup_{(σ,b) ∈ Θ_s} E_{σ,b}[ ‖σ̂² − σ²‖²_{L²([a,b])} ]^{1/2} ≲ N^{−s/(2s+3)},

    sup_{(σ,b) ∈ Θ_s} E_{σ,b}[ ‖b̂ − b‖²_{L²([a,b])} ]^{1/2} ≲ N^{−(s−1)/(2s+3)}.

Recall that A ≲ B means that A can be bounded by a constant multiple of B,
where the constant depends continuously on other parameters involved. Similarly,
A ≳ B is equivalent to B ≲ A, and A ∼ B holds if both relations A ≲ B and A ≳ B
are true.
As the following lower bounds prove, the rates of convergence of our estimators
are optimal in a minimax sense over Θ_s.

THEOREM 2.5. Let E_N denote the set of all estimators according to
Definition 2.2. Then for all 0 ≤ a < b ≤ 1 and s > 1 the following hold:

    inf_{σ̂² ∈ E_N} sup_{(σ,b) ∈ Θ_s} E_{σ,b}[ ‖σ̂² − σ²‖²_{L²([a,b])} ]^{1/2} ≳ N^{−s/(2s+3)},      (2.2)

    inf_{b̂ ∈ E_N} sup_{(σ,b) ∈ Θ_s} E_{σ,b}[ ‖b̂ − b‖²_{L²([a,b])} ]^{1/2} ≳ N^{−(s−1)/(2s+3)}.    (2.3)
If we set s₁ = s − 1 and s₂ = s, then the drift b(·) ∈ Θ_s with regularity s₁ can be
estimated with the minimax rate of convergence u_N = N^{−s₁/(2s₁+5)}, whereas the
diffusion coefficient σ(·) ∈ Θ_s has regularity s₂ and the corresponding minimax
rate of convergence is v_N = N^{−s₂/(2s₂+3)}. Hence, Table 1 in Section 1.2.5 can be
filled with two rather unexpected rates of convergence u_N and v_N. In Section 3.3
the rates are explained in the terminology of ill-posed problems and reasons are
given why the tight connection between the regularity assumptions on b(·) and σ(·)
is needed.
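As a quick sanity check on the algebra (a simple illustration, not from the paper), the two ways of writing the drift exponent, (s − 1)/(2s + 3) in terms of s and s₁/(2s₁ + 5) in terms of s₁ = s − 1, indeed coincide:

```python
# Rate exponents from Theorems 2.4/2.5: sigma^2 is estimated at rate
# N^{-s/(2s+3)} and b at rate N^{-(s-1)/(2s+3)}.
def sigma_exponent(s):
    return s / (2 * s + 3)

def drift_exponent(s):
    return (s - 1) / (2 * s + 3)

for s in (1.5, 2.0, 3.0):
    s1 = s - 1
    # same exponent, rewritten in terms of the drift's own regularity s1
    assert abs(drift_exponent(s) - s1 / (2 * s1 + 5)) < 1e-12
    print(s, sigma_exponent(s), drift_exponent(s))
```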
3. Spectral estimation method.
3.1. The basic idea. We shall base our estimator of the diffusion coefficient
σ(·) and of the drift coefficient b(·) on spectral methods for passing from
the transition operator P_Δ to the generator L. The invariant density is given by

    μ(x) = C_0 σ^{−2}(x) exp( ∫_0^x 2b(y)/σ²(y) dy ),                     (3.1)

and we introduce the function S(·) := σ²(·)μ(·)/2, the reciprocal of the scale
density up to a constant, for which

    S(x) = (C_0/2) exp( ∫_0^x 2b(y)/σ²(y) dy ),                           (3.2)

with the normalizing constant C_0 > 0 depending on σ(·) and b(·). The action of
the generator in divergence form is given by

    Lf(x) = L_{σ,b} f(x) = (1/2)σ²(x)f''(x) + b(x)f'(x) = μ(x)^{−1} ( S(x) f'(x) )',    (3.3)

where the domain of this unbounded operator on L²(μ) is given by the subspace
of the L²-Sobolev space H² with Neumann boundary conditions

    dom(L) = { f ∈ H²([0, 1]) | f'(0) = f'(1) = 0 }.
The generator L is a self-adjoint elliptic operator on L²(μ) with compact resolvent
so that it has nonpositive point spectrum only. If ν₁ denotes the largest negative
eigenvalue of L with eigenfunction u₁, then due to the reflecting boundary of [0, 1]
the Neumann boundary conditions u₁'(0) = u₁'(1) = 0 hold and thus

    Lu₁ = ν₁u₁  ⟺  (Su₁')' = ν₁μu₁  ⟹  S(x)u₁'(x) = ν₁ ∫_0^x u₁(y)μ(y) dy.    (3.4)

From (3.2) we can derive an explicit expression for the diffusion coefficient:

    σ²(x) = 2ν₁ ∫_0^x u₁(y)μ(y) dy / ( u₁'(x)μ(x) ).                      (3.5)

The corresponding expression for the drift coefficient is

    b(x) = ν₁ ( u₁'(x)u₁(x)μ(x) − u₁''(x) ∫_0^x u₁(y)μ(y) dy ) / ( u₁'(x)² μ(x) ).    (3.6)

Hence, if we knew the invariant measure μ, the eigenvalue ν₁ and the eigenfunction
u₁ (including its first two derivatives), we could exactly determine the drift and
diffusion coefficient. Of course, these identities are valid for any eigenfunction u_k
with eigenvalue ν_k, but for better numerical stability we shall use only the largest
nondegenerate eigenvalue ν₁. Moreover, it is known that only the eigenfunction u₁
does not have a vanishing derivative in the interior of the interval (cf. Proposition
6.5) so that by this choice indeterminacy at interior points is avoided.
Using semigroup theory [Engel and Nagel (2000), Theorem IV.3.7] we know
that u₁ is also an eigenfunction of P_Δ with eigenvalue κ₁ = e^{Δν₁}. Our procedure
consists of determining estimators μ̂ of μ and P̂_Δ of P_Δ, to calculate the
corresponding eigenpair (κ̂₁, û₁) and to use (3.5) and (3.6) to build a plug-in
estimator of σ(·) and b(·).
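For reflected Brownian motion (b = 0, σ ≡ 1, so μ ≡ 1) the eigenpair is known in closed form, ν₁ = −π²/2 and u₁(x) = √2 cos(πx), and identities (3.5) and (3.6) can be verified directly. The following sketch (an illustration only, with the integral ∫_0^x u₁μ computed in closed form) does so on a grid:

```python
import numpy as np

# Reflected Brownian motion on [0,1]: generator Lf = f''/2 with Neumann
# boundary conditions, invariant density mu = 1, and eigenpair
#   nu_1 = -pi^2/2,  u_1(x) = sqrt(2) cos(pi x).
x = np.linspace(0.01, 0.99, 99)
nu1 = -np.pi**2 / 2
u1 = np.sqrt(2) * np.cos(np.pi * x)
u1_prime = -np.sqrt(2) * np.pi * np.sin(np.pi * x)
u1_second = -np.sqrt(2) * np.pi**2 * np.cos(np.pi * x)
mu = 1.0
integral = np.sqrt(2) * np.sin(np.pi * x) / np.pi   # int_0^x u_1(y) mu dy

sigma2 = 2 * nu1 * integral / (u1_prime * mu)                               # (3.5)
b = nu1 * (u1_prime * u1 * mu - u1_second * integral) / (u1_prime**2 * mu)  # (3.6)
print(sigma2[:3], b[:3])   # recovers sigma^2 = 1 and b = 0 everywhere
```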
3.2. Construction of the estimators. We use projection methods, taking
advantage of approximating properties of abstract operators by finite-dimensional
matrices, for which the spectrum is easy to calculate numerically. A similar
approach was already suggested by Chen, Hansen and Scheinkman (1997). More
specifically, we make use of wavelets on the interval [0, 1]. For the construction of
wavelet bases and their properties we refer to Cohen (2000).

DEFINITION 3.1. Let (ψ_λ)_λ be an L²([0, 1])-orthonormal wavelet basis and
denote by V_J the multiresolution space spanned by the wavelets ψ_λ with |λ| ≤ J.

In the sequel we shall regularly use the Jackson and Bernstein inequalities with
respect to the L²-Sobolev spaces H^s([0, 1]) of regularity s:

    ‖(Id − π_J)f‖_{H^t} ≲ 2^{−J(s−t)} ‖f‖_{H^s},    0 ≤ t ≤ s,

    ‖v_J‖_{H^s} ≲ 2^{J(s−t)} ‖v_J‖_{H^t},    v_J ∈ V_J,  0 ≤ t ≤ s,

where π_J denotes the L²-orthogonal projection onto V_J.
The canonical projection estimate of μ based on (X_{nΔ})_{0≤n≤N} is given by

    μ̂ := Σ_{|λ|≤J} μ̂_λ ψ_λ    with    μ̂_λ := (N + 1)^{−1} Σ_{n=0}^{N} ψ_λ(X_{nΔ}).    (3.7)
By the ergodicity of X it follows that μ̂_λ is a consistent estimate of μ_λ = ⟨μ, ψ_λ⟩
for N → ∞. To estimate the action of the transition operator on the wavelet basis,
that is, the matrix P_Δ^J := (⟨P_Δ ψ_{λ'}, ψ_λ⟩_μ)_{λ,λ'}, we use the symmetrized estimator
with entries

    (P̂_Δ)_{λ,λ'} := (2N)^{−1} Σ_{n=1}^{N} [ ψ_λ(X_{(n−1)Δ})ψ_{λ'}(X_{nΔ}) + ψ_{λ'}(X_{(n−1)Δ})ψ_λ(X_{nΔ}) ].    (3.8)

This yields an approximation of P_Δ^J. The empirical Gram matrix Ĝ approximates
G := (⟨ψ_λ, ψ_{λ'}⟩_μ)_{λ,λ'} and is given by

    Ĝ_{λ,λ'} := N^{−1} [ (1/2)ψ_λ(X_0)ψ_{λ'}(X_0) + (1/2)ψ_λ(X_{NΔ})ψ_{λ'}(X_{NΔ}) + Σ_{n=1}^{N−1} ψ_λ(X_{nΔ})ψ_{λ'}(X_{nΔ}) ].    (3.9)

The particular treatment of the boundary terms will be explained later. If we
put Ψ = (w_n^{1/2} ψ_λ(X_{nΔ}))_{|λ|≤J, n≤N} with w_0 = w_N = 1/2 and w_n = 1 otherwise, we
have Ĝ = N^{−1}ΨΨ^T, Ψ^T being the transpose of Ψ. Our construction can thus be
regarded as a least squares type estimator, as in a usual regression setting; see the
argument developed in Section 3.3.1.
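As a toy illustration of (3.8) and (3.9) (not the paper's wavelet construction: it substitutes a piecewise-constant histogram basis and i.i.d. uniform placeholder data), the matrices Ĝ and P̂_Δ can be assembled as follows; both are symmetric by construction and Ĝ is positive semi-definite:

```python
import numpy as np

def basis_matrix(X, J=8):
    """Evaluate a piecewise-constant orthonormal basis of L^2([0,1]) at the
    points X: psi_k = sqrt(J) * 1_{[k/J,(k+1)/J)}. A stand-in for the wavelet
    basis of the text; only Jackson/Bernstein-type properties matter."""
    idx = np.minimum((X * J).astype(int), J - 1)
    Psi = np.zeros((J, len(X)))
    Psi[idx, np.arange(len(X))] = np.sqrt(J)
    return Psi

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, 501)          # placeholder for X_0, ..., X_{N Delta}
N = len(X) - 1
Psi = basis_matrix(X)

w = np.ones(N + 1); w[0] = w[-1] = 0.5             # boundary weights of (3.9)
G_hat = (Psi * w) @ Psi.T / N                      # empirical Gram matrix
P_hat = (Psi[:, :-1] @ Psi[:, 1:].T
         + Psi[:, 1:] @ Psi[:, :-1].T) / (2 * N)   # symmetrized as in (3.8)

assert np.allclose(G_hat, G_hat.T) and np.allclose(P_hat, P_hat.T)
```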
We combine the last two estimators in order to determine estimates for the
eigenvalue κ₁ and the eigenfunction u₁ of P_Δ. The eigenfunctions of the projected
operator π_J^μ P_Δ lie in V_J, the range of π_J^μ, where π_J^μ denotes the
L²(μ)-orthogonal projection onto V_J. The eigenfunction u₁^J corresponding to the
second largest eigenvalue κ₁^J of π_J^μ P_Δ is characterized by

    ⟨P_Δ u₁^J, ψ_λ⟩_μ = κ₁^J ⟨u₁^J, ψ_λ⟩_μ,    |λ| ≤ J.                   (3.10)

We pass to vector-matrix notation and use from now on bold letters to define for
a function v ∈ V_J the corresponding coefficient column vector v = (⟨v, ψ_λ⟩)_{|λ|≤J}.
Observe carefully the different L²-scalar products used; here they are with respect
to the Lebesgue measure. Thus, we can rewrite (3.10) as

    P_Δ^J u₁^J = κ₁^J G u₁^J.                                             (3.11)
As v^T G w = ⟨v, w⟩_μ holds for v, w ∈ V_J, the bilinear form ⟨v, w⟩_G := v^T G w
defines a scalar product, and the matrix G^{−1}P_Δ^J is self-adjoint with respect to it:

    ⟨G^{−1}P_Δ^J v, w⟩_G = (G^{−1}P_Δ^J v)^T G w = v^T P_Δ^J w = ⟨v, G^{−1}P_Δ^J w⟩_G.

Similarly, v^T Ĝ v = N^{−1}(Ψ^T v)^T Ψ^T v ≥ 0 holds and the matrix Ĝ can be shown
to be even strictly positive definite with high probability (see Lemma 4.12). In
this case, we similarly infer that Ĝ^{−1}P̂_Δ is self-adjoint with respect to ⟨·,·⟩_Ĝ,
and by the Cauchy–Schwarz inequality

    ⟨Ĝ^{−1}P̂_Δ v, v⟩_Ĝ = N^{−1} Σ_{n=1}^N v(X_{(n−1)Δ}) v(X_{nΔ})

      ≤ N^{−1} ( Σ_{n=0}^{N−1} v(X_{nΔ})² )^{1/2} ( Σ_{n=1}^N v(X_{nΔ})² )^{1/2}

      ≤ N^{−1} ( (1/2)v(X_0)² + (1/2)v(X_{NΔ})² + Σ_{n=1}^{N−1} v(X_{nΔ})² ) = ⟨v, v⟩_Ĝ.

We infer that all eigenvalues of Ĝ^{−1}P̂_Δ are real and not larger than one. The
second largest eigenvalue κ̂₁ and its corresponding eigenvector û₁ yield estimators
of κ₁^J and u₁^J.
Plugging in the estimator μ̂ as well as κ̂₁ and û₁ into (3.5) and (3.6), we obtain our
estimators of σ²(·) and b(·):

    σ̂²(x) := 2Δ^{−1} log(κ̂₁) ∫_0^x û₁(y) μ̂(y) dy / ( û₁'(x) μ̂(x) ),     (3.12)

    b̂(x) := Δ^{−1} log(κ̂₁) ( û₁'(x)û₁(x)μ̂(x) − û₁''(x) ∫_0^x û₁(y)μ̂(y) dy ) / ( û₁'(x)² μ̂(x) ).    (3.13)

To avoid indeterminacy, the denominators of the estimators are forced to remain
above a certain minimal level, which depends on the subinterval [a, b] ⊂ [0, 1] for
which the loss function is taken. See (4.5) for the exact formulation in the case
of σ̂²(·) and proceed analogously for b̂(·).
3.3. Discussion.

3.3.1. Least squares approach. The estimator matrix Ĝ^{−1}P̂_Δ is built as in
the least squares approach for projection methods in classical regression. To
estimate P_Δψ_{λ₀}(x) = E_{σ,b}[ψ_{λ₀}(X_Δ) | X_0 = x], the least squares method consists
of minimizing

    Σ_{n=1}^N ( ψ_{λ₀}(X_{nΔ}) − Σ_{|λ|≤J} a_λ ψ_λ(X_{(n−1)Δ}) )²  →  min!    (3.14)

over all real coefficients (a_λ)_{|λ|≤J}. The minimizer satisfies the normal equations

    Σ_{n=1}^N Σ_{|λ|≤J} a_λ ψ_λ(X_{(n−1)Δ}) ψ_{λ'}(X_{(n−1)Δ}) = Σ_{n=1}^N ψ_{λ'}(X_{(n−1)Δ}) ψ_{λ₀}(X_{nΔ})

for all |λ'| ≤ J.
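That these normal equations are just an ordinary least-squares system can be checked directly (a generic illustration with a random hypothetical design matrix, not diffusion data):

```python
import numpy as np

rng = np.random.default_rng(3)
N, J = 200, 4
Psi_prev = rng.standard_normal((J, N))   # stands in for psi_lambda(X_{(n-1)Delta})
y = rng.standard_normal(N)               # stands in for psi_{lambda_0}(X_{n Delta})

# Normal equations of (3.14): (Psi_prev Psi_prev^T) a = Psi_prev y.
a_normal = np.linalg.solve(Psi_prev @ Psi_prev.T, Psi_prev @ y)
a_lstsq, *_ = np.linalg.lstsq(Psi_prev.T, y, rcond=None)
assert np.allclose(a_normal, a_lstsq)    # same minimizer of (3.14)
```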
3.3.2. Other than wavelet methods. For our projection estimates to work,
we merely need approximation spaces satisfying the Jackson and Bernstein
inequalities. Hence, other finite element bases could serve as well.

The invariant density and the transition density could also be estimated using
kernel methods, but the numerical calculation of the eigenpair (κ̂₁, û₁) would then
involve an additional discretization step.
3.3.3. Diffusions over the real line. Using a wavelet basis of L²(R), it is still
possible to estimate μ and P_Δ. [...]

[...] yielding an ill-posedness degree of 1 (rate N^{−s/(2s+3)}). Observe that the regularity
conditions σ ∈ H^s and b ∈ H^{s−1} are translated into μ ∈ H^s, S ∈ H^s. The
transformation of (μ, S) to σ²(·) = 2S(·)/μ(·) is stable [L²-continuous for S(·)
bounded and μ ≥ s₀ > 0], whereas b(·) = S'(·)/μ(·) involves one differentiation. [...]

Estimation of μ ∈ H^s, s > 1, in H¹-norm can be achieved with rate N^{−(s−1)/(2s+1)}
and this rate is thus also valid for estimating b(·) in L²-norm. Given the drift
coefficient b(·), we find

    σ²(x) = ( 2 ∫_0^x b(y)μ(y) dy + C ) / μ(x),    x ∈ [0, 1],

where C is a suitable constant. If we knew σ²(0), we would obtain the rate
N^{−s/(2s+1)} for σ ∈ H^s.

Using a preliminary nonparametric estimate σ̂²_C depending on the parameter C
and then fitting a parametric model for C, we are likely to find the same rate. In
any case, the assumption of knowing one parameter seems rather artificial and no
further investigations have been performed.
3.3.8. Estimation at the boundary. Our plug-in estimators can only be defined
on the open subinterval (0, 1). Estimation at the boundary points leads to a
risk increase: by de l'Hôpital's rule applied to (3.4), the value at zero involves
ν₁u₁(0)μ(0)/u₁''(0). Thus, estimating σ(0) and b(0) involves taking the second and third
derivative, respectively, when using plug-in estimators. A pointwise lower bound
result, along the lines of the L²-lower bound proof, shows that this deterioration
cannot be avoided.
4. Proof of the upper bound.

4.1. Convergence of μ̂. First, we recall the proof for the risk bound in
estimating the invariant measure:

PROPOSITION 4.1. With the choice 2^J ∼ N^{1/(2s+1)} the following uniform risk
estimate holds for μ̂ based on N observations:

    sup_{(σ,b)∈Θ_s} E_{σ,b}[ ‖μ̂ − μ‖²_{L²} ]^{1/2} ≲ N^{−s/(2s+1)}.

PROOF. The explicit formula (3.1) for μ shows that ‖μ‖_{H^s} is uniformly
bounded over Θ_s. This implies that the bias term satisfies

    ‖μ − π_J μ‖_{L²} ≲ 2^{−Js} ‖μ‖_{H^s} ≲ N^{−s/(2s+1)},

uniformly over Θ_s. Since μ̂_λ is an unbiased estimator of ⟨μ, ψ_λ⟩, the variance
term satisfies

    E_{σ,b}[ ‖μ̂ − π_J μ‖²_{L²} ] = Σ_{|λ|≤J} Var_{σ,b}[μ̂_λ] ≲ 2^J N^{−1},

which, in combination with the uniformity of the constants involved, gives the
announced upper bound.
4.2. Spectral approximation. We shall rely on the spectral approximation
results given in Chatelin (1983); compare also Kato (1995). Since for σ(·) we
have to estimate not only the eigenfunction u₁, but also its derivative u₁', we will
be working in the L²-Sobolev space H¹. The general idea is that the error in the
eigenvalue and in the eigenfunction can be controlled by the error of the operator
on the eigenspace, once the overall error measured in the operator norm is small.
Let R(T, z) = (T − z Id)^{−1} denote the resolvent map of the operator T, σ(T) its
spectrum and B(κ, r) the closed ball of radius r around κ.
PROPOSITION 4.2. Suppose a bounded linear operator T on a Hilbert space
has a simple eigenvalue κ such that σ(T) ∩ B(κ, δ) = {κ} holds for some
δ > 0. Let ‖T' − T‖ < R/2, where R := (sup_{z∈B(κ,δ)} ‖R(T, z)‖)^{−1}. Then the
operator T' has exactly one eigenvalue κ' in B(κ, δ) and there are eigenvectors
u and u' with Tu = κu, T'u' = κ'u' satisfying

    ‖u − u'‖ ≤ 8R^{−1} ‖(T' − T)u‖.                                       (4.1)
PROOF. We use the resolvent identity and the Cauchy integral representation
of the spectral projection P' on the eigenspace of T':

    ‖u − P'u‖ = (2π)^{−1} ‖ ∮_{∂B(κ,δ)} [ R(T', z)/(κ − z) ] dz (T' − T)u ‖

      ≤ (2π)^{−1} 2πδ sup_{z∈∂B(κ,δ)} ‖R(T', z)‖ δ^{−1} ‖(T' − T)u‖

      ≤ sup_{z∈∂B(κ,δ)} ‖R(T, z)‖ (1 − ‖R(T, z)‖‖T' − T‖)^{−1} ‖(T' − T)u‖

      = (R − ‖T' − T‖)^{−1} ‖(T' − T)u‖.

Hence, for ‖T' − T‖ < R/2 this calculation is a posteriori justified and yields
‖u − P'u‖ < 2R^{−1}‖(T' − T)u‖. Applying ‖(T' − T)u‖ < (R/2)‖u‖ once again, we
see that the projection P'u does not vanish.

It remains to find eigenvectors that are close, too. Observe that, for arbitrary
Hilbert space elements g, h with ‖g‖ = ‖h‖ = 1 and ⟨g, h⟩ ≥ 0,

    ‖g − h‖² = 2 − 2⟨g, h⟩ ≤ 2(1 + ⟨g, h⟩)(1 − ⟨g, h⟩) = 2‖g − ⟨g, h⟩h‖²

holds. We substitute for g and h the normalized eigenvectors u and u' with
⟨u, u'⟩ ≥ 0; note that oblique projections only enlarge the right-hand side and thus
infer (4.1).
COROLLARY 4.3. Under the conditions of Proposition 4.2 there is a constant
C = C(R, ‖T‖) such that |κ − κ'| ≤ C‖(T' − T)u‖.

PROOF. The inverse triangle inequality yields

    |κ' − κ| ≤ ‖T'u' − Tu‖ ≤ ‖T'(u' − u)‖ + ‖(T' − T)u‖

      ≤ (‖T‖ + ‖T' − T‖)‖u' − u‖ + ‖(T' − T)u‖

      ≤ ( (‖T‖ + R/2) 8R^{−1} + 1 ) ‖(T' − T)u‖,

where the last line follows from Proposition 4.2.
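Proposition 4.2 and Corollary 4.3 can be probed numerically for symmetric matrices: once the eigenvalue is well separated, the eigenpair error is governed by the perturbation acting on the eigenvector, ‖(T' − T)u‖ (a generic random-matrix illustration, not tied to the diffusion setting):

```python
import numpy as np

rng = np.random.default_rng(5)

# A symmetric operator T with a simple top eigenvalue, and a small symmetric
# perturbation; the eigenpair error is of the order ||(T' - T) u||.
A = rng.standard_normal((6, 6)); T = (A + A.T) / 2
vals, vecs = np.linalg.eigh(T)
lam, u = vals[-1], vecs[:, -1]
gap = vals[-1] - vals[-2]               # separation of the top eigenvalue

E = rng.standard_normal((6, 6)); E = 1e-3 * (E + E.T) / 2
vals_p, vecs_p = np.linalg.eigh(T + E)
lam_p, u_p = vals_p[-1], vecs_p[:, -1]
if u_p @ u < 0:                         # fix the sign ambiguity
    u_p = -u_p

err_val = abs(lam_p - lam)
err_vec = np.linalg.norm(u_p - u)
bound = np.linalg.norm(E @ u)           # the quantity ||(T' - T) u||
print(err_val, err_vec, bound, gap)
```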
4.3. Bias estimates. In a first estimation step, we bound the deterministic error
due to the finite-dimensional projection π_J^μ P_Δ of P_Δ. We first show that the
projections π_J^μ and π_J have similar approximation properties.
LEMMA 4.4. Let m: [0, 1] → [m₀, m₁] be a measurable function with m₁ ≥
m₀ > 0. Denote by π_J^m the L²(m)-orthogonal projection onto the multiresolution
space V_J. Then there is a constant C = C(m₀, m₁) such that

    ‖(Id − π_J^m)f‖_{H¹} ≤ C ‖(Id − π_J)f‖_{H¹}    for all f ∈ H¹([0, 1]).
PROOF. The norm equivalence m₀‖g‖²_{L²} ≤ ‖g‖²_{L²(m)} ≤ m₁‖g‖²_{L²} implies

    ‖π_J^m‖_{L²→L²} ≤ m₁ m₀^{−1} ‖π_J^m‖_{L²(m)→L²(m)} = m₁ m₀^{−1}.

On the other hand, the Bernstein inequality in V_J and the Jackson inequality for
Id − π_J in H¹ and L² yield, for f ∈ H¹,

    ‖(Id − π_J^m)f‖_{H¹} = ‖(Id − π_J^m)(Id − π_J)f‖_{H¹}

      ≤ ‖(Id − π_J)f‖_{H¹} + ‖π_J^m(Id − π_J)f‖_{H¹}

      ≲ ‖(Id − π_J)f‖_{H¹} + 2^J ‖π_J^m(Id − π_J)f‖_{L²}

      ≲ ‖(Id − π_J)f‖_{H¹} + ‖π_J^m‖_{L²→L²} ‖(Id − π_J)f‖_{H¹},

where the constants depend only on the approximation spaces.
PROPOSITION 4.5. Uniformly over Θ_s we have

    ‖π_J^μ P_Δ − P_Δ‖_{H¹→H¹} ≲ 2^{−Js}.

PROOF. The transition density p_Δ is sufficiently regular. Hence,
from Lemma 6.7 it follows that P_Δ: H¹ → H^{s+1} is continuous with a uniform
norm bound over Θ_s. Lemma 4.4 yields

    ‖(P_Δ − π_J^μ P_Δ)f‖_{H¹} ≲ ‖Id − π_J‖_{H^{s+1}→H¹} ‖f‖_{H¹}.

The Jackson inequality in H¹ gives the result.
COROLLARY 4.6. Let κ₁^J be the largest eigenvalue smaller than 1 of π_J^μ P_Δ with
eigenfunction u₁^J. Then uniformly over Θ_s the following estimate holds:

    |κ₁^J − κ₁| + ‖u₁^J − u₁‖_{H¹} ≲ 2^{−Js}.
PROOF. We are going to apply Proposition 4.2 on the space H¹ and its
Corollary 4.3. In view of Proposition 4.5, it remains to establish the existence
of uniformly strictly positive values for δ and R over Θ_s. The uniform separation
of κ₁ from the rest of the spectrum is the content of Proposition 6.5.

For the choice of δ in Proposition 6.5 we search a uniform bound R. If
we regard P_Δ on L²(μ), then P_Δ is self-adjoint and ‖R(P_Δ, z)‖_{L²(μ)} =
dist(z, σ(P_Δ))^{−1} [see Chatelin (1983), Proposition 2.32].

By Lemma 6.3 and the commutativity between P_Δ and L we conclude

    ‖R(P_Δ, z)f‖_{H¹} ≲ ‖(Id − L)^{1/2} R(P_Δ, z)f‖_{L²(μ)} = ‖R(P_Δ, z)(Id − L)^{1/2} f‖_{L²(μ)}

      ≤ dist( z, σ(P_Δ) )^{−1} ‖f‖_{H¹}.

Hence, ‖R(P_Δ, z)‖_{H¹→H¹} ≲ δ^{−1} holds uniformly over z ∈ B(κ₁, δ) and
(σ, b) ∈ Θ_s.
REMARK 4.7. The Kato–Temple inequality [Chatelin (1983), Theorem 6.21]
on L²(μ) even establishes the so-called superconvergence |κ₁^J − κ₁| ≲ 2^{−2Js}.
4.4. Variance estimates. To bound the stochastic error on the finite-dimensional
space V_J, we return to vector-matrix notation and look for a bound
on the error involved in the estimators Ĝ and P̂_Δ.

LEMMA 4.8. For any vector v we have, uniformly over Θ_s,

    E_{σ,b}[ ‖(Ĝ − G)v‖²_{l²} ] ≲ ‖v‖²_{l²} N^{−1} 2^J.

PROOF. Writing the entries as empirical means, we obtain by Lemma 6.2

    Σ_{|λ|≤J} E_{σ,b}[ ( E_{σ,b}[ψ_λ(X_0)v(X_0)] − N^{−1}{ (1/2)ψ_λ(X_0)v(X_0)
        + (1/2)ψ_λ(X_{NΔ})v(X_{NΔ}) + Σ_{n=1}^{N−1} ψ_λ(X_{nΔ})v(X_{nΔ}) } )² ]

      ≲ Σ_{|λ|≤J} N^{−1} E_{σ,b}[ ( ψ_λ(X_0)v(X_0) )² ]

      ≲ N^{−1} ‖ Σ_{|λ|≤J} ψ_λ² ‖_∞ ‖v‖²_{L²} ≲ N^{−1} 2^J ‖v‖²_{l²},

as asserted.
LEMMA 4.9. For any vector v we have, uniformly over Θ_s,

    E_{σ,b}[ ‖(P̂_Δ − P_Δ^J)v‖²_{l²} ] ≲ ‖v‖²_{l²} N^{−1} 2^J.

PROOF. We obtain, by (6.2) in Lemma 6.2,

    E_{σ,b}[ ‖(P̂_Δ − P_Δ^J)v‖²_{l²} ]

      = Σ_{|λ|≤J} E_{σ,b}[ ( N^{−1} Σ_{n=1}^N ψ_λ(X_{(n−1)Δ})v(X_{nΔ}) − E_{σ,b}[ψ_λ(X_0)v(X_Δ)] )² ]

      ≲ Σ_{|λ|≤J} N^{−1} E_{σ,b}[ ( ψ_λ(X_0)v(X_Δ) )² ]

      ≲ N^{−1} Σ_{|λ|≤J} ‖ψ_λ‖²_{L²} ‖v‖²_{L²} ‖p_Δ‖_∞ ≲ N^{−1} 2^J ‖v‖²_{l²},

as asserted.
DEFINITION 4.10. We introduce the random set

    R = R_{J,N} := { ‖Ĝ − G‖_{l²→l²} ≤ (2‖G^{−1}‖_{l²→l²})^{−1} }.

REMARK 4.11. Since G is invertible, so is Ĝ on R with ‖Ĝ^{−1}‖ ≤ 2‖G^{−1}‖.

LEMMA 4.12. Uniformly over Θ_s we have P_{σ,b}(Ω \ R) ≲ N^{−1} 2^{2J}.

PROOF. The Hilbert–Schmidt norm estimate

    E ‖Ĝ − G‖²_{l²→l²} ≤ Σ_{|λ|≤J} E ‖(Ĝ − G)e_λ‖²_{l²} ≲ N^{−1} 2^{2J}

holds with unit vectors (e_λ) by Lemma 4.8. Since the spaces L²([0, 1]) and L²(μ)
are isomorphic with uniform isomorphism constants, ‖G^{−1}‖ ≲ 1 holds uniformly
over Θ_s and the assertion follows from Chebyshev's inequality.
PROPOSITION 4.13. For any ε > 0 we have, uniformly over Θ_s,

    P_{σ,b}( R ∩ { ‖Ĝ^{−1}P̂_Δ − G^{−1}P_Δ^J‖ > ε } ) ≲ N^{−1} 2^{2J} ε^{−2}.
PROOF. First, we separate the different error terms:

    Ĝ^{−1}P̂_Δ − G^{−1}P_Δ^J = Ĝ^{−1}(P̂_Δ − P_Δ^J) + (Ĝ^{−1} − G^{−1})P_Δ^J

      = Ĝ^{−1} [ (P̂_Δ − P_Δ^J) + (G − Ĝ)G^{−1}P_Δ^J ].

On the set R we obtain, by Remark 4.11,

    ‖Ĝ^{−1}P̂_Δ − G^{−1}P_Δ^J‖ ≤ ‖Ĝ^{−1}‖ ( ‖P̂_Δ − P_Δ^J‖ + ‖Ĝ − G‖ ‖G^{−1}P_Δ^J‖ )

      ≤ 2‖G^{−1}‖ ( ‖P̂_Δ − P_Δ^J‖ + ‖Ĝ − G‖ ‖G^{−1}P_Δ^J‖ ) ≲ ‖P̂_Δ − P_Δ^J‖ + ‖Ĝ − G‖.

By Lemmas 4.8 and 4.9 and the Hilbert–Schmidt norm estimate (cf. the proof of
Lemma 4.12) we obtain the uniform norm bound over Θ_s,

    E_{σ,b}[ ‖Ĝ^{−1}P̂_Δ − G^{−1}P_Δ^J‖² 1_R ] ≲ N^{−1} 2^{2J}.

It remains to apply Chebyshev's inequality.

Having established the weak consistency of the estimators in matrix norm, we
now bound the error on the eigenspace.
PROPOSITION 4.14. Let u₁^J be the vector associated with the normalized
eigenfunction u₁^J of π_J^μ P_Δ with eigenvalue κ₁^J. Then uniformly over Θ_s the
following risk bound holds:

    E_{σ,b}[ ‖(Ĝ^{−1}P̂_Δ − G^{−1}P_Δ^J) u₁^J‖²_{l²} 1_R ] ≲ N^{−1} 2^J.
PROOF. By the same separation of the error terms on R as in the preceding
proof and by Lemmas 4.8 and 4.9 we find

    E_{σ,b}[ ‖(Ĝ^{−1}P̂_Δ − G^{−1}P_Δ^J)u₁^J‖²_{l²} 1_R ]

      ≤ 8‖G^{−1}‖² ( E[ ‖(P̂_Δ − P_Δ^J)u₁^J‖²_{l²} ] + E[ ‖(G − Ĝ)κ₁^J u₁^J‖²_{l²} ] ) ≲ N^{−1} 2^J.

The uniformity over Θ_s follows from the respective statements in the lemmas.
COROLLARY 4.15. Let κ̂₁ be the second largest eigenvalue of the matrix Ĝ^{−1}P̂_Δ
with eigenvector û₁. If Ĝ is not invertible or if ‖û₁‖_{l²} ≥ 2 sup_{Θ_s} ‖u₁‖_{L²}
holds, put û₁ := 0, κ̂₁ := 0. If N^{−1}2^{2J} → 0 holds, then uniformly over Θ_s the
following bounds hold for N, J → ∞:

    E_{σ,b}[ ( |κ̂₁ − κ₁^J|² + ‖û₁ − u₁^J‖²_{l²} ) 1_R ] ≲ N^{−1} 2^J,    (4.2)

    E_{σ,b}[ ‖û₁ − u₁^J‖²_{H¹} ] ≲ N^{−1} 2^{3J}.                        (4.3)
PROOF. For the proof of (4.2) we apply Proposition 4.2, using the Euclidean
Hilbert space identified with V_J, and Corollary 4.3. Then Proposition 4.14 in connection with
Proposition 4.13 (using ε < R/2 and N^{−1}2^{2J} → 0) yields the correct asymptotic
rate on the event R. For the uniform choice of δ and R for G^{−1}P_Δ^J in
Proposition 4.2, just use the corresponding result for π_J^μ P_Δ and the asymptotics
N^{−1}2^{2J} → 0.

The precaution taken for undefined or too large û₁ is necessary for the event
Ω \ R. Since the estimators κ̂₁ and û₁ are now kept artificially bounded, the rate
P_{σ,b}(Ω \ R) ≲ N^{−1}2^{2J} established in Lemma 4.12 suffices to bound the risk on
Ω \ R. Hence, the second estimate (4.3) is a consequence of (4.2) and the Bernstein
inequality ‖û₁ − u₁^J‖_{H¹} ≲ 2^J ‖û₁ − u₁^J‖_{L²}.
REMARK 4.16. The main result of this section, namely (4.3), can be extended
to pth moments for all p ∈ (1, ∞):

    E_{σ,b}[ ‖û₁ − u₁^J‖^p_{H¹} ]^{1/p} ≲ N^{−1/2} 2^{3J/2}.              (4.4)

Indeed, tracing back the steps, it suffices to obtain bounds on the moments of
order p in Lemmas 4.8 and 4.9, which on their part rely on the mixing statement
in Lemma 6.2. By arguments based on the Riesz convexity theorem this last
lemma generalizes to the corresponding bounds for pth moments, as derived in
Section VII.4 of Rosenblatt (1971). For the sake of clarity we have restricted
ourselves to the case p = 2 here.
4.5. Upper bound for σ(·). By Corollary 4.15 and our choice of J, 2^J ∼ N^{1/(2s+3)},

    sup_{(σ,b)∈Θ_s} E_{σ,b}[ |κ̂₁ − κ₁^J|² + ‖û₁ − u₁^J‖²_{H¹} ] ≲ N^{−1} 2^{3J} ∼ N^{−2s/(2s+3)}

holds. Using this estimate and the estimate for μ̂ in Proposition 4.1, the risk of
the plug-in estimator σ̂²(·) in (3.12) is bounded as asserted in the theorem. We
only have to ensure that the stochastic error does not increase from the plug-in and
that the denominator is uniformly bounded away from zero. Using the Cauchy–Schwarz
inequality and Remark 4.16 on the higher moments of our estimators, we
encounter no problem in the first case. The second issue is dealt with by using the
lower bound c_{a,b} > 0 in Proposition 6.5 so that an improvement of the estimate for
the denominator by using

    ũ₁' := max( û₁', c_{a,b} )                                            (4.5)

instead of û₁' guarantees the uniform lower bound away from zero.
4.6. Upper bound for b(·). Since b(·) = S'(·)/μ(·) involves the second derivative
of the estimated eigenfunction, we now work in H² and obtain, as in Proposition 4.5,

    ‖π_J^μ P_Δ − P_Δ‖_{H²→H²} ≲ 2^{−J(s−1)},

because ‖Id − π_J‖_{H^{s+1}→H²} is of this order. As in Corollary 4.6 this is also the
rate for the bias estimate. The only fine point is the uniform norm equivalence
‖f‖_{H²} ∼ ‖(Id − L)f‖_{L²(μ)}. For the variance part the Bernstein inequality
gives, as in (4.3),

    E_{σ,b}[ ‖û₁ − u₁^J‖²_{H²} ] ≲ N^{−1} 2^{5J}.

Therefore balancing the bias and variance part of the risk by the choice 2^J ∼
N^{1/(2s+3)}, as before, yields the asserted rate of convergence N^{−(s−1)/(2s+3)}.
5. Proof of the lower bounds. First, the usual Bayes prior technique is
applied for the reduction to a problem of bounding certain likelihood ratios. Then
the problem is reduced to that of bounding the L²-distance between the transition
probabilities, which is finally accomplished using Hilbert–Schmidt norm estimates
and the explicit form of the inverse of the generator.
5.1. The least favorable prior. The idea is to perturb the drift and diffusion
coefficients of the reflected Brownian motion in such a way that the invariant
measure remains unchanged. Let us assume that ψ is a compactly supported
wavelet in H^s with one vanishing moment. Without loss of generality we suppose
C > 1 > c > 0 such that (1, 0) ∈ Θ_s holds. We put ψ_{jk} = 2^{j/2}ψ(2^j · − k) and
denote by K_j ⊂ Z a maximal set of indices k such that supp(ψ_{jk}) ⊂ [a, b] and
supp(ψ_{jk}) ∩ supp(ψ_{jk'}) = ∅ holds for all k, k' ∈ K_j, k ≠ k'. Furthermore, we set
η ∼ 2^{−j(s+1/2)} such that, for all ε = (ε_k) ∈ {−1, +1}^{|K_j|},

    ( √(2S_ε), S_ε' ) ∈ Θ_s    with    S_ε(x) := S_ε(j, x) := ( 2 + η Σ_{k∈K_j} ε_k ψ_{jk}(x) )^{−1}.

We consider the corresponding diffusions with generator

    L_{S_ε} f(x) := (S_ε f')'(x) = S_ε(x)f''(x) + S_ε'(x)f'(x),    f ∈ dom(L).

Hence, the invariant measure is the Lebesgue measure on [0, 1].
The usual Assouad cube techniques [e.g., see Korostelev and Tsybakov (1993)]
give, for any estimator and for N ∈ N, γ > 0, the lower bounds

    sup_{(σ,b)∈Θ_s} E_{σ,b}[ ‖σ̂² − σ²‖²_{L²([a,b])} ] ≳ 2^j δ_σ² e^{−γ} p₀,    (5.1)

    sup_{(σ,b)∈Θ_s} E_{σ,b}[ ‖b̂ − b‖²_{L²([a,b])} ] ≳ 2^j δ_b² e^{−γ} p₀,    (5.2)

where, for all ε, ε' with ‖ε − ε'‖_{l¹} = 2 and with P_ε := P_{√(2S_ε), S_ε'},

    δ_σ ≤ ‖2S_ε − 2S_{ε'}‖_{L²},    δ_b ≤ ‖S_ε' − S_{ε'}'‖_{L²},    p₀ ≤ P_ε( dP_{ε'}/dP_ε |_{F_N} > e^{−γ} ).

We choose ε, ε' that differ only in the kth coordinate ε_k. Then

    S_ε(x) − S_{ε'}(x) = ±2ηψ_{jk}(x) S_ε(x) S_{ε'}(x)

and S_ε, S_{ε'} ∼ 1/2 holds uniformly so that the L²-norm is indeed of order η.
Equivalently, we find δ_b ∼ η2^j. Due to η ∼ 2^{−j(s+1/2)}, the proof of the theorem
is accomplished, once we have shown that a strictly positive p₀ can be chosen for
fixed γ > 0 and the asymptotics 2^j ∼ N^{1/(2s+3)}.
5.2. Reduction to the convergence of transition densities. If we denote the
transition probability densities P
(X
dy|X
0
= x) by p
p
BM
=0
due to S
1
2
C
1 2
3j/2
0 for s > 1. We are now going to use the
estimate log(1 + x) x
2
x, which is valid for all x
1
2
. For j so large
that 1
p
1
2
holds, the KullbackLeibler distance can be bounded from
above (note that the invariant measure is Lebesgue measure):
    E_\varepsilon\Big[ \log\Big( \frac{dP_\varepsilon}{dP_{\varepsilon'}}\Big|_{\mathcal{F}_N} \Big) \Big]
    = \sum_{k=1}^{N} E_\varepsilon\Big[ \log\Big( \frac{p_\varepsilon(X_{(k-1)\Delta}, X_{k\Delta})}{p_{\varepsilon'}(X_{(k-1)\Delta}, X_{k\Delta})} \Big) \Big]
    = N \int_0^1 \int_0^1 \log\Big( \frac{p_\varepsilon(x, y)}{p_{\varepsilon'}(x, y)} \Big) p_\varepsilon(x, y) \, dy \, dx

    \leq N \int_0^1 \int_0^1 \Big( \frac{2 (p_\varepsilon(x, y) - p_{\varepsilon'}(x, y))^2}{p_\varepsilon(x, y)}
    + \big( p_\varepsilon(x, y) - p_{\varepsilon'}(x, y) \big) \Big) \, dy \, dx

    = 2N \int_0^1 \int_0^1 \frac{(p_\varepsilon(x, y) - p_{\varepsilon'}(x, y))^2}{p_\varepsilon(x, y)} \, dy \, dx
    \lesssim N \|p_\varepsilon - p_{\varepsilon'}\|^2_{L^2([0,1]^2)}.
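The elementary inequality used in this bound can be verified numerically (our own sanity check, not part of the proof): \log(1+x) \geq x - 2x^2 for all x \geq -1/2.

```python
import numpy as np

# Verify log(1+x) >= x - 2*x**2 on a fine grid of x >= -1/2.
x = np.linspace(-0.5, 10.0, 200001)
gap = np.log1p(x) - (x - 2.0 * x ** 2)
min_gap = float(gap.min())     # nonnegative up to floating-point rounding
```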
2246 E. GOBET, M. HOFFMANN AND M. REISS

The square root of the Kullback–Leibler distance bounds the total variation
distance in order, which by the Chebyshev inequality yields

    P_\varepsilon\Big( \frac{dP_{\varepsilon'}}{dP_\varepsilon}\Big|_{\mathcal{F}_N} > e^{-\lambda} \Big)
    = 1 - P_\varepsilon\Big( \frac{dP_{\varepsilon'}}{dP_\varepsilon}\Big|_{\mathcal{F}_N} \leq e^{-\lambda} \Big)
    \geq 1 - P_\varepsilon\Big( \Big| \frac{dP_{\varepsilon'}}{dP_\varepsilon}\Big|_{\mathcal{F}_N} - 1 \Big| \geq 1 - e^{-\lambda} \Big)

    \geq 1 - E_\varepsilon\Big[ \Big| \frac{dP_{\varepsilon'}}{dP_\varepsilon}\Big|_{\mathcal{F}_N} - 1 \Big| \Big] (1 - e^{-\lambda})^{-1}
    = 1 - (1 - e^{-\lambda})^{-1} \big\| (P_{\varepsilon'} - P_\varepsilon)|_{\mathcal{F}_N} \big\|_{TV}
    \geq 1 - C N^{1/2} \|p_\varepsilon - p_{\varepsilon'}\|_{L^2([0,1]^2)},
where C > 0 is some constant independent of \varepsilon, \varepsilon', N and j. Summarizing, we
need the estimate

    \limsup_{N, j \to \infty} N^{1/2} \|p_\varepsilon - p_{\varepsilon'}\|_{L^2([0,1]^2)} < C^{-1}
    \qquad \text{for } 2^j \sim N^{1/(2s+3)}.    (5.3)
5.3. Convergence of the transition densities. Observe first that
\|p_\varepsilon - p_{\varepsilon'}\|_{L^2([0,1]^2)} is exactly the Hilbert–Schmidt norm distance \|P_\varepsilon - P_{\varepsilon'}\|_{HS} between
the transition operators derived from L_{S_\varepsilon} and L_{S_{\varepsilon'}}. Since both operators leave
the constants invariant and with

    V := \Big\{ f \in L^2([0, 1]) \,\Big|\, \int_0^1 f = 0 \Big\} = \{ f \in L^2([0, 1]) \mid f \perp \text{constants} \},

the transition operators coincide on V^\perp, whence
\|P_\varepsilon - P_{\varepsilon'}\|_{HS} = \|(P_\varepsilon - P_{\varepsilon'})|_V\|_{HS}.

We take advantage of the key result that for Lipschitz functions f with Lipschitz
constant \Lambda on the union of the spectra of two self-adjoint bounded operators
T_1 and T_2 the continuous functional calculus satisfies

    \|f(T_1) - f(T_2)\|_{HS} \leq \Lambda \|T_1 - T_2\|_{HS};    (5.4)
see Kittaneh (1985). We proceed by bounding the Hilbert–Schmidt norm of the
difference of the inverses of the generators and by then transferring this bound to
the transition operators via (5.4). By the functional calculus for operators on V, the
function f(z) = \exp(\Delta z^{-1}) sends (L|_V)^{-1} to P^\Delta|_V. Moreover, f is uniformly
Lipschitz continuous on (-\infty, 0) due to \Lambda := \sup_{z < 0} |f'(z)| = 4 \Delta^{-1} e^{-2} < \infty.
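The value of this Lipschitz constant can be confirmed numerically (an illustration with an arbitrary \Delta, not part of the proof): |f'(z)| = (\Delta/z^2) e^{\Delta/z} is maximized over z < 0 at z = -\Delta/2, where it equals 4\Delta^{-1} e^{-2}.

```python
import numpy as np

# sup_{z<0} |f'(z)| for f(z) = exp(Delta/z), f'(z) = -(Delta/z**2) * exp(Delta/z).
Delta = 0.7
z = -np.logspace(-4.0, 4.0, 400001)            # negative z, spanning many scales
fprime_abs = (Delta / z ** 2) * np.exp(Delta / z)
sup_num = float(fprime_abs.max())
sup_theory = 4.0 / (Delta * np.e ** 2)
```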
Thus, we arrive at

    \|p_\varepsilon - p_{\varepsilon'}\|_{L^2([0,1]^2)}
    = \big\| (P_\varepsilon - P_{\varepsilon'})|_V \big\|_{HS}
    \leq 4 \Delta^{-1} e^{-2} \big\| (L_{S_\varepsilon}|_V)^{-1} - (L_{S_{\varepsilon'}}|_V)^{-1} \big\|_{HS}.
The inverse of the generator L_S restricted to V is given by

    (L_S|_V)^{-1} g(x) = \int_0^1 \Big( \int_y^1 S^{-1}(v) \big( v - 1_{[x,1]}(v) \big) \, dv \Big) g(y) \, dy.    (5.5)
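Formula (5.5) can be probed numerically: writing K(x, y) = \int_y^1 S^{-1}(v) v \, dv - \int_{x \vee y}^1 S^{-1}(v) \, dv, the function h = (L_S|_V)^{-1} g satisfies S(x) h'(x) = \int_0^x g(y) \, dy, hence (S h')' = g with h'(0) = 0. The sketch below (with an arbitrary S bounded away from zero and a mean-zero g, both our own choices) checks this identity on a grid.

```python
import numpy as np

n = 401
x = np.linspace(0.0, 1.0, n)
dx = x[1] - x[0]
S = 1.0 / (2.0 + 0.3 * np.sin(2.0 * np.pi * x))   # bounded away from 0 and infinity
g = np.cos(2.0 * np.pi * x)                        # mean-zero test function
Sinv = 1.0 / S

def cumtrapz0(f):
    """Trapezoidal antiderivative with value 0 at x = 0."""
    return np.concatenate(([0.0], np.cumsum((f[1:] + f[:-1]) * 0.5 * dx)))

F1 = cumtrapz0(Sinv * x)
A = F1[-1] - F1                      # A(t) = int_t^1 S^{-1}(v) v dv
F2 = cumtrapz0(Sinv)
B = F2[-1] - F2                      # B(t) = int_t^1 S^{-1}(v) dv
idx = np.arange(n)
K = A[None, :] - B[np.maximum(idx[:, None], idx[None, :])]   # kernel of (5.5)

Kg = K * g[None, :]
h = dx * (Kg.sum(axis=1) - 0.5 * (Kg[:, 0] + Kg[:, -1]))     # h(x) = int K(x,y) g(y) dy
lhs = S * np.gradient(h, dx)         # S h'
rhs = cumtrapz0(g)                   # int_0^x g(y) dy
err = float(np.max(np.abs(lhs - rhs)[1:-1]))
```

The two sides agree up to discretization error, so the kernel indeed inverts the generator on mean-zero functions (up to an additive constant, which is invisible to L_S).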
Using |S_\varepsilon^{-1} - S_{\varepsilon'}^{-1}| = 2\alpha |\psi_{jk}| for some k \in K_j and denoting by \Psi the primitive
of \psi with compact support, we obtain

    \big\| (L_{S_\varepsilon}|_V)^{-1} - (L_{S_{\varepsilon'}}|_V)^{-1} \big\|^2_{HS}
    = \int_0^1 \int_0^1 \Big( \int_y^1 2\alpha \psi_{jk}(v) \big( v - 1_{[x,1]}(v) \big) \, dv \Big)^2 dx \, dy

    = 4 \alpha^2 2^{-j} \int_0^1 \int_0^1 \Big( \Psi(2^j y - k) y
    + \int_y^1 \Psi(2^j v - k) \, dv - \Psi\big( 2^j (x \vee y) - k \big) \Big)^2 dx \, dy

    \leq 4 \alpha^2 2^{-j} \cdot C_\Psi 2^{-j} \lesssim \alpha^2 2^{-2j},

with a constant C_\Psi depending only on \Psi.
Consequently, \|p_\varepsilon - p_{\varepsilon'}\|^2_{L^2} \lesssim \delta^2 2^{-j(2s+3)} holds with an arbitrarily small constant
if \delta in \alpha = \delta 2^{-j(s+1/2)} is chosen sufficiently small. Hence, the estimate (5.3) is valid for this
choice and the asymptotics N 2^{-j(2s+3)} \sim 1, which remained to be proved.
6. Technical results. We shall need several technical results, mainly to
describe the dependence of certain quantities on the underlying diffusion parameters.
The following result is in close analogy with Section IV.5 in Bass (1998).
LEMMA 6.1. The second largest eigenvalue \lambda_1 of the infinitesimal generator
L_{\sigma,b} can be bounded away from zero:

    |\lambda_1| \geq \inf_{x \in [0,1]} S(x) =: s_0.

This eigenvalue is simple and the corresponding eigenfunction f_1 is monotone.
PROOF. The variational characterization of \lambda_1 [Davies (1995), Section 4.5]
and partial integration yield

    \lambda_1 = \sup_{\|f\|_\mu = 1,\, \langle f, 1 \rangle_\mu = 0} \langle Lf, f \rangle_\mu
    = - \inf_{\|f\|_\mu = 1,\, \langle f, 1 \rangle_\mu = 0} \int_0^1 S(x) f'(x)^2 \, dx.

Given the derivative f', the function f with \langle f, 1 \rangle_\mu = 0 is uniquely
determined. Setting M(x) := \mu([0, x]), this function f satisfies

    0 = \int_0^1 \Big( f(0) + \int_0^x f'(y) \, dy \Big) \mu(x) \, dx
    = f(0) + \int_0^1 f'(y) \big( 1 - M(y) \big) \, dy.
For f, g \in L^2(\mu) with \langle f, 1 \rangle_\mu = \langle g, 1 \rangle_\mu = 0 we find

    \langle f, g \rangle_\mu
    = \int_0^1 \Big( f(0) + \int_0^x f'(y) \, dy \Big) \Big( g(0) + \int_0^x g'(z) \, dz \Big) \mu(x) \, dx

    = f(0) g(0) + f(0) \int_0^1 g'(z) \big( 1 - M(z) \big) \, dz
    + g(0) \int_0^1 f'(y) \big( 1 - M(y) \big) \, dy
    + \int_0^1 \int_0^1 f'(y) g'(z) \big( 1 - M(y \vee z) \big) \, dy \, dz

    = \int_0^1 \int_0^1 \big( M(y \wedge z) - M(y) M(z) \big) f'(y) g'(z) \, dy \, dz
    =: \int_0^1 \int_0^1 m(y, z) f'(y) g'(z) \, dy \, dz.

The kernel m(y, z) is positive on (0, 1)^2 and bounded by 1, whence we obtain, by
regarding u = f',

    |\lambda_1| = \inf_u \frac{\int_0^1 S(x) u(x)^2 \, dx}{\int_0^1 \int_0^1 m(y, z) u(y) u(z) \, dy \, dz}
    \geq s_0 \frac{\|u\|^2_{L^2}}{\|u\|^2_{L^1}} \geq s_0.
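The kernel identity above can be tested numerically in the simplest case \mu = Lebesgue measure, where M(x) = x and m(y, z) = y \wedge z - yz is the Brownian-bridge covariance. The centered test functions below are our own choices for illustration.

```python
import numpy as np

# Check <f,g>_mu = int int (M(y^z) - M(y)M(z)) f'(y) g'(z) dy dz for
# mu = Lebesgue on [0,1] (so M(x) = x), f(x) = x - 1/2, g(x) = x^2 - 1/3.
n = 1001
x = np.linspace(0.0, 1.0, n)
dx = x[1] - x[0]

def trap(f, axis=-1):
    """Trapezoidal rule along one axis of a uniform grid."""
    first = np.take(f, 0, axis=axis)
    last = np.take(f, -1, axis=axis)
    return dx * (f.sum(axis=axis) - 0.5 * (first + last))

f = x - 0.5                      # centered: int f dx = 0
g = x ** 2 - 1.0 / 3.0           # centered: int g dx = 0
lhs = float(trap(f * g))         # <f, g>_mu; analytically 1/12

Y, Z = np.meshgrid(x, x, indexing="ij")
m = np.minimum(Y, Z) - Y * Z     # m(y,z) = M(y ^ z) - M(y) M(z)
fp = np.ones_like(x)             # f'
gp = 2.0 * x                     # g'
rhs = float(trap(trap(m * fp[:, None] * gp[None, :], axis=1)))
```

Both quadratures return 1/12, as the identity predicts.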
If the derivative of an eigenfunction f_1 changed sign, we could write f_1' = u_+ - u_-
with nonzero u_+, u_- \geq 0 of disjoint support. The function f_0 determined by
f_0' := u_+ + u_- and \langle f_0, 1 \rangle_\mu = 0 satisfies
\langle L f_0, f_0 \rangle_\mu = \langle L f_1, f_1 \rangle_\mu, while

    \langle f_0, f_0 \rangle_\mu = \int_0^1 \int_0^1 m(y, z) f_0'(y) f_0'(z) \, dy \, dz
    > \int_0^1 \int_0^1 m(y, z) f_1'(y) f_1'(z) \, dy \, dz = \langle f_1, f_1 \rangle_\mu,

which contradicts the variational characterization; hence f_1 is monotone. For two
monotone eigenfunctions f_1, g_1 the integrand in

    \langle f_1, g_1 \rangle_\mu = \int_0^1 \int_0^1 m(y, z) f_1'(y) g_1'(z) \, dy \, dz

does not change sign and the whole integral does not vanish. We infer that
the eigenspace of \lambda_1 cannot contain two orthogonal functions and is thus one-dimensional.
LEMMA 6.2. For H_1, H_2 \in L^2([0, 1]) we have the following two uniform
variance estimates over \Sigma_s:

    Var_{\sigma,b}\Big( \frac{1}{N} \sum_{n=1}^{N} H_1(X_{n\Delta}) \Big)
    \lesssim N^{-1} E_{\sigma,b}[H_1(X_0)^2],    (6.1)

    Var_{\sigma,b}\Big( \frac{1}{N} \sum_{n=1}^{N} H_1\big( X_{(n-1)\Delta} \big) H_2(X_{n\Delta}) \Big)
    \lesssim N^{-1} E_{\sigma,b}[H_1(X_0)^2 H_2(X_\Delta)^2].    (6.2)
PROOF. Due to the uniform spectral gap s_0 over \Sigma_s (Lemma 6.1),
P^\Delta satisfies \|P^\Delta f\|_\mu \leq \rho \|f\|_\mu with \rho := e^{-\Delta s_0} < 1 for all f \in L^2(\mu) with
\langle f, 1 \rangle_\mu = 0.

We obtain the first estimate by considering the centered random variables
f_1(X_{k\Delta}) := H_1(X_{k\Delta}) - E_{\sigma,b}[H_1(X_{k\Delta})], k \in \mathbb{N}:

    Var_{\sigma,b}\Big( \sum_{n=1}^{N} H_1(X_{n\Delta}) \Big)
    = \sum_{m,n=1}^{N} E_{\sigma,b}[f_1(X_{m\Delta}) f_1(X_{n\Delta})]
    = \sum_{m,n=1}^{N} \big\langle f_1, P^{|m-n|\Delta} f_1 \big\rangle_\mu

    \leq \sum_{m,n=1}^{N} \rho^{|m-n|} \|f_1\|^2_\mu
    \lesssim N E_{\sigma,b}[H_1(X_0)^2].
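The only step needing care in this chain is the geometric bound on the double sum: \sum_{m,n=1}^{N} \rho^{|m-n|} \leq N (1+\rho)/(1-\rho). A quick numerical confirmation (the values of N and \rho below are illustrative, ours):

```python
import numpy as np

# sum_{m,n=1}^N rho^{|m-n|} = N + 2 * sum_{d=1}^{N-1} (N-d) rho^d <= N (1+rho)/(1-rho)
N = 500
rho = float(np.exp(-0.3))                    # a stand-in for rho = e^{-Delta s_0} < 1
d = np.abs(np.arange(N)[:, None] - np.arange(N)[None, :])
total = float((rho ** d).sum())
bound = N * (1.0 + rho) / (1.0 - rho)
```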
The second estimate follows along the same lines. Merely observe that for
m > n, by the projection property of conditional expectations,

    E_{\sigma,b}\big[ H_1\big( X_{(n-1)\Delta} \big) H_2(X_{n\Delta}) H_1\big( X_{(m-1)\Delta} \big) H_2(X_{m\Delta}) \big]
    = \big\langle H_1 (P^\Delta H_2), P^{(m-n-1)\Delta} \big( H_1 (P^\Delta H_2) \big) \big\rangle_\mu.

LEMMA 6.3. With constants uniform over \Sigma_s,

    \|f\|_{H^1} \lesssim \|(\mathrm{Id} - L)^{1/2} f\|_\mu
    \qquad \text{for all } f \in H^1.
PROOF. The invariant measure \mu and the function S are uniformly bounded
away from zero and infinity so that we obtain, with uniform constants, for
f \in dom(L),

    \|f\|^2_{H^1} = \langle f, f \rangle + \langle f', f' \rangle
    \lesssim \langle f, f \rangle_\mu + \langle S f', f' \rangle
    = \langle (\mathrm{Id} - L) f, f \rangle_\mu
    = \|(\mathrm{Id} - L)^{1/2} f\|^2_\mu.

By an approximation argument this extends to all f \in H^1 = dom((\mathrm{Id} - L)^{1/2}).
PROPOSITION 6.4. Suppose (\sigma_n(\cdot), b_n(\cdot)) \in \Sigma_s, n \geq 0, and

    \lim_{n \to \infty} \|\sigma_n - \sigma_0\|_\infty = 0,
    \qquad
    \lim_{n \to \infty} \|b_n - b_0\|_\infty = 0.

Then the corresponding transition probabilities (p_t^{(n)}) converge uniformly:

    \lim_{n \to \infty} \big\| p_t^{(n)}(\cdot, \cdot) - p_t^{(0)}(\cdot, \cdot) \big\|_\infty = 0.
PROOF. An application of the results by Stroock and Varadhan (1971) yields
that the corresponding diffusion processes X^{(n)} converge weakly to X^{(0)} for any
fixed initial value X^{(n)}(0) = x. This implies in particular

    \lim_{n \to \infty} \int_0^1 p_t^{(n)}(x, y) \varphi(y) \, dy
    = \int_0^1 p_t^{(0)}(x, y) \varphi(y) \, dy

for all test functions \varphi \in L^2([0, 1]).

PROPOSITION 6.5. The second largest eigenvalue \kappa_1 of P^\Delta is uniformly
separated from the rest of the spectrum: there is a uniform \rho > 0 such that,
for all (\sigma, b) \in \Sigma_s,

    \sigma(P^\Delta) \cap B(\kappa_1, 2\rho) = \{\kappa_1\}.

Furthermore, for all 0 < a < b < 1 there is a uniform constant c_{a,b} > 0 such
that the associated first eigenfunction u_1 = u_1(\sigma, b) satisfies, for all (\sigma, b) \in \Sigma_s,

    \min_{x \in [a,b]} |u_1'(x)| \geq c_{a,b}.
PROOF. Proceeding indirectly, assume that there is a sequence (\sigma_n, b_n) \in \Sigma_s
such that the corresponding eigenvalues satisfy \kappa_1^{(n)} \to 1 (or \kappa_1^{(n)} - \kappa_2^{(n)} \to 0,
resp.). By the compactness of the Sobolev embedding of H^s into C^0 we can
pass to a uniformly converging subsequence. Hence, Proposition 6.4 yields
that the corresponding transition densities converge uniformly, which implies
that the transition operators P^{(n)} converge in operator norm. The fact that \kappa_1
is always simple (Lemma 6.1) gives the contradiction.

By the same indirect arguments, we construct transition operators P^{(n)} on
the space C([0, 1]) and infer that the eigenfunctions u_1^{(n)} [Chatelin (1983),
Theorem 5.10], the invariant measures \mu^{(n)} [see (3.1)] and the inverses of the
functions S^{(n)} [see (3.2)] converge in supremum norm. Therefore

    (u_1^{(n)})'(x) = \lambda_1^{(n)} (S^{(n)})^{-1}(x) \int_0^x u_1^{(n)}(u) \mu^{(n)}(u) \, du

also converges in supremum norm. Due to u_1'|_{[a,b]} \neq 0
(Lemma 6.1) this implies that \min_{x \in [a,b]} |(u_1^{(n)})'(x)| stays bounded away
from zero along the subsequence, contradicting the assumption that no uniform
constant c_{a,b} exists.
LEMMA 6.6. The eigenfunctions u_k of L_{\sigma,b} belong to H^{s+1} and satisfy,
uniformly over \Sigma_s,

    \|u_k\|_{H^{s+1}} \leq C(s, s_0, \|S^{-1}\|_s, \|\mu\|_{s-1}) |\lambda_k|^{(s+1)/2},

where C is a continuous function of its arguments.
PROOF. We know that \mu^{-1} (S u_k')' = \lambda_k u_k and u_k'(0) = 0 holds, which imply

    u_k'(x) = \lambda_k S^{-1}(x) \int_0^x u_k(u) \mu(u) \, du.

Suppose u_k \in H^{r+1} with r \in [0, s]. Then the function u_k \mu is in H^{r \wedge (s-1)} due
to u_k \in C^r and \mu \in H^{s-1} (Sobolev embeddings). Hence, the antiderivative is in H^{(r+1) \wedge s}.
As S^{-1} \in H^s holds, the right-hand side is an element of H^{s \wedge (r+1)}. We conclude
that the regularity of u_k' increases by 1 at each step, which implies that u_k is in H^{s+1}.
In quantitative terms we obtain for r \in [1, s], where we use the seminorm
|f|_s := \|f^{(s)}\|_{L^2},

    |u_k'|_r \leq |\lambda_k| C(r) \Big( |S^{-1}|_r \Big\| \int_0^{\cdot} u_k \mu \Big\|_\infty
    + \|S^{-1}\|_\infty \Big| \int_0^{\cdot} u_k \mu \Big|_r \Big)

    \leq |\lambda_k| C(r) \|S^{-1}\|_r \big( \|u_k\|_{L^1(\mu)} + |u_k \mu|_{r-1} \big)

    \leq |\lambda_k| C(r) \|S^{-1}\|_s \|\mu\|_{s-1} \big( 1 + |u_k|_{r-1} + \|u_k\|_{r-1} \big)

    \leq |\lambda_k| C(r) \|S^{-1}\|_s \|\mu\|_{s-1} \big( 1 + 2 \|u_k\|_{r-1} \big).

By applying the resulting estimate \|u_k\|_{r+1} \lesssim |\lambda_k| \|S^{-1}\|_s \|\mu\|_{s-1} (1 + \|u_k\|_{r-1})
successively for r = 1, 2, \ldots, s - 1 and finally for r = s, the estimate follows.
PROPOSITION 6.7. For (\sigma, b) \in \Sigma_s the corresponding transition probability
density p_\Delta = p_{\Delta,\sigma,b} satisfies

    \sup_{(\sigma,b) \in \Sigma_s} \|p_\Delta\|_{H^{s+1} \otimes H^s} < \infty.

PROOF. The spectral decomposition of P^\Delta : L^2(\mu) \to L^2(\mu) yields

    p_\Delta(x, y) = \mu(y) \sum_{k=0}^{\infty} e^{\Delta \lambda_k} u_k(x) u_k(y),
    \qquad x, y \in [0, 1].
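For reflected Brownian motion (S \equiv 1/2, \mu \equiv Lebesgue measure) this decomposition is explicit: u_0 = 1, u_k(x) = \sqrt{2} \cos(k\pi x) and \lambda_k = -k^2\pi^2/2. The sketch below (our own illustration; \Delta, truncation and grid sizes are arbitrary) sums the series and checks that each p_\Delta(x, \cdot) integrates to one, that p_\Delta is symmetric and that it is positive.

```python
import numpy as np

# p_Delta(x,y) = 1 + sum_{k>=1} 2 exp(-Delta k^2 pi^2 / 2) cos(k pi x) cos(k pi y)
Delta = 0.1
n = 201
x = np.linspace(0.0, 1.0, n)
dx = x[1] - x[0]
k = np.arange(1, 80)[:, None, None]                 # truncation; terms decay fast
terms = (2.0 * np.exp(-Delta * (k * np.pi) ** 2 / 2.0)
         * np.cos(k * np.pi * x[None, :, None])
         * np.cos(k * np.pi * x[None, None, :]))
p = 1.0 + terms.sum(axis=0)

row_sums = dx * (p.sum(axis=1) - 0.5 * (p[:, 0] + p[:, -1]))  # trapezoid over y
int_err = float(np.max(np.abs(row_sums - 1.0)))     # each row is a density
sym_err = float(np.max(np.abs(p - p.T)))            # reversibility: p symmetric
min_p = float(p.min())                              # positivity
```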
Due to the uniform ellipticity and uniform boundedness of the coefficients, we
have |\lambda_k| \in [C_1 k^2, C_2 k^2] with uniform constants C_1, C_2 > 0 on \Sigma_s [see Davies
(1995), adapting Example 4.6.1, page 93, to our situation]. From the preceding
Lemma 6.6 and the Sobolev embedding H^{s+1} \subset C^s we infer

    \|p_\Delta\|_{H^{s+1} \otimes H^s}
    \leq \sum_{k=0}^{\infty} e^{-\Delta C_1 k^2} \|u_k\|_{s+1} \|u_k \mu\|_s

    \leq \sum_{k=0}^{\infty} C(s, s_0, \|S^{-1}\|_s, \|\mu\|_{s-1})^2 e^{-\Delta C_1 k^2} (C_2 k^2)^{s+1} \|\mu\|_s
    < \infty,

which gives the desired uniform estimate.
Acknowledgment. The helpful comments of two anonymous referees are
gratefully acknowledged.
REFERENCES
AÏT-SAHALIA, Y. (1996). Nonparametric pricing of interest rate derivative securities. Econometrica
64 527–560.
BANON, G. (1978). Nonparametric identification for diffusion processes. SIAM J. Control Optim. 16
380–395.
BASS, R. F. (1998). Diffusions and Elliptic Operators. Springer, New York.
BROWN, B. M. and HEWITT, J. I. (1975). Asymptotic likelihood theory for diffusion processes.
J. Appl. Probability 12 228–238.
CHAPMAN, D. A. and PEARSON, N. D. (2000). Is the short rate drift actually nonlinear? J. Finance
55 355–388.
CHATELIN, F. (1983). Spectral Approximation of Linear Operators. Academic Press, New York.
CHEN, X., HANSEN, L. P. and SCHEINKMAN, J. A. (1997). Shape preserving spectral approximation
of diffusions. Working paper. (Last version, November 2000.)
COHEN, A. (2000). Wavelet methods in numerical analysis. In Handbook of Numerical Analysis 7
(P. G. Ciarlet, ed.) 417–711. North-Holland, Amsterdam.
DAVIES, E. B. (1995). Spectral Theory and Differential Operators. Cambridge Univ. Press.
ENGEL, K.-J. and NAGEL, R. (2000). One-Parameter Semigroups for Linear Evolution Equations.
Springer, Berlin.
FAN, J. and YAO, Q. (1998). Efficient estimation of conditional variance functions in stochastic
regression. Biometrika 85 645–660.
FAN, J. and ZHANG, C. (2003). A re-examination of diffusion estimators with applications to
financial model validation. J. Amer. Statist. Assoc. 98 118–134.
GOBET, E. (2002). LAN property for ergodic diffusions with discrete observations. Ann. Inst.
H. Poincaré Probab. Statist. 38 711–737.
HANSEN, L. P., SCHEINKMAN, J. A. and TOUZI, N. (1998). Spectral methods for identifying scalar
diffusions. J. Econometrics 86 1–32.
HOFFMANN, M. (1999). Adaptive estimation in diffusion processes. Stochastic Process. Appl. 79
135–163.
KATO, T. (1995). Perturbation Theory for Linear Operators. Springer, Berlin. [Reprint of the
corrected printing of the 2nd ed. (1980).]
KESSLER, M. (1997). Estimation of an ergodic diffusion from discrete observations. Scand. J. Statist.
24 211–229.
KESSLER, M. and SØRENSEN, M. (1999). Estimating equations based on eigenfunctions for a
discretely observed diffusion process. Bernoulli 5 299–314.
KITTANEH, F. (1985). On Lipschitz functions of normal operators. Proc. Amer. Math. Soc. 94
416–418.
KOROSTELEV, A. P. and TSYBAKOV, A. B. (1993). Minimax Theory of Image Reconstruction.
Lecture Notes in Statist. 82. Springer, Berlin.
KUTOYANTS, Y. A. (1975). Local asymptotic normality for processes of diffusion type. Izv. Akad.
Nauk Armyan. SSR Ser. Mat. 10 103–112.
KUTOYANTS, Y. A. (1984). On nonparametric estimation of trend coefficient in a diffusion process.
In Statistics and Control of Stochastic Processes 230–250. Nauka, Moscow.
LIPTSER, R. S. and SHIRYAEV, A. N. (2001). Statistics of Random Processes 1. General Theory,
2nd ed. Springer, Berlin.
MÜLLER, H.-G. and STADTMÜLLER, U. (1987). Estimation of heteroscedasticity in regression
analysis. Ann. Statist. 15 610–625.
PHAM, D. T. (1981). Nonparametric estimation of the drift coefficient in the diffusion equation.
Math. Operationsforsch. Statist. Ser. Statist. 12 61–73.
REISS, M. (2003). Simulation results for estimating the diffusion coefficient from discrete time
observation. Available at www.mathematik.hu-berlin.de/reiss/sim-diff-est.pdf.
ROSENBLATT, M. (1971). Markov Processes. Structure and Asymptotic Behavior. Springer, Berlin.
STANTON, R. (1997). A nonparametric model of term structure dynamics and the market price of
interest rate risk. J. Finance 52 1973–2002.
STROOCK, D. W. and VARADHAN, S. R. S. (1971). Diffusion processes with boundary conditions.
Comm. Pure Appl. Math. 24 147–225.
STROOCK, D. W. and VARADHAN, S. R. S. (1997). Multidimensional Diffusion Processes. Springer,
Berlin.
TRIBOULEY, K. and VIENNET, G. (1998). L^p adaptive density estimation in a β-mixing framework.
Ann. Inst. H. Poincaré Probab. Statist. 34 179–208.
YOSHIDA, N. (1992). Estimation for diffusion processes from discrete observations. J. Multivariate
Anal. 41 220–242.
E. GOBET
CENTRE DE MATHÉMATIQUES APPLIQUÉES
ÉCOLE POLYTECHNIQUE
91128 PALAISEAU CEDEX
FRANCE

M. HOFFMANN
LABORATOIRE DE PROBABILITÉS
ET MODÈLES ALÉATOIRES
UNIVERSITÉ PARIS 6 ET 7
UFR MATHÉMATIQUES, CASE 7012
2 PLACE JUSSIEU
75251 PARIS CEDEX 05
FRANCE
E-MAIL: hoffmann@math.jussieu.fr

M. REISS
INSTITUT FÜR MATHEMATIK
HUMBOLDT-UNIVERSITÄT ZU BERLIN
UNTER DEN LINDEN 6
10099 BERLIN
GERMANY