
Lecture Notes on the
THEORY OF DISTRIBUTIONS
by
Günther Hörmann & Roland Steinbauer
Fakultät für Mathematik, Universität Wien

Summer Term 2009
Contents
0 PRELUDE
1 TEST FUNCTIONS AND DISTRIBUTIONS
1.1 Smooth functions, support, and test functions
1.2 Distributions
1.3 Convergence of distributions
1.4 Localization and Support
1.5 Distributions with compact support
2 DIFFERENTIATION, DIFFERENTIAL OPERATORS
2.1 Differentiation in $\mathcal{D}'$
2.2 Multiplication by $C^\infty$-functions
2.3 A first glimpse of differential equations
2.4 On duality tricks
3 BASIC CONSTRUCTIONS
3.1 Test functions depending on parameters
3.2 Tensor product of distributions
3.3 Change of coordinates and pullback
4 CONVOLUTION
4.1 Convolution of distributions
4.2 Regularization
4.3 The case of non-compact supports
4.4 The local structure of distributions
5 FOURIER TRANSFORM AND TEMPERATE DISTRIBUTIONS
5.1 Classical Fourier transform
5.2 The space of rapidly decreasing functions
5.3 Temperate distributions
5.4 Fourier transform on $\mathcal{S}'$
5.5 Fourier transform on $\mathcal{E}'$ and the convolution theorem
5.6 Fourier transform on $L^2$
6 REGULARITY
6.1 Sobolev Spaces
6.2 The singular support of a distribution
6.3 The theorem of Paley-Wiener-Schwartz
6.4 Regularity and partial differential operators
7 FUNDAMENTAL SOLUTIONS
7.1 Basic notions
7.2 The Malgrange-Ehrenpreis theorem
7.3 Hypoellipticity of partial differential operators with constant coefficients
7.4 Fundamental solutions of some prominent operators
Bibliography
Chapter 0
PRELUDE
to be done later
0.1. General Intro

0.2. Some historical remarks
0.3. Motivating examples from PDE
(i) A transport equation
(ii) The wave equation
0.4. A motivating example from Physics: Electrostatics

0.5. Preview
Chapter 1
TEST FUNCTIONS AND DISTRIBUTIONS

1.1. Intro In this chapter we start to make precise the basic elements of the theory of distributions announced in 0.5.
We start by introducing and studying the space of test functions $\mathcal{D}$, i.e., of smooth functions which have compact support. We are going to construct non-trivial test functions, discuss convergence in $\mathcal{D}$ and regularizations by convolution. We also prove sequential completeness of $\mathcal{D}$.
We are then prepared to give the definition of distributions as continuous linear functionals on $\mathcal{D}$ and prove a semi-norm estimate characterizing continuity. We also give a number of examples and study distributions of finite order. Then we head on to discuss convergence in the space $\mathcal{D}'$ of distributions and to prove sequential completeness of $\mathcal{D}'$.
Next we define the support of a distribution and introduce the localization of a distribution to an open set. We invoke partitions of unity to show that a distribution is uniquely determined by its localizations.
Finally we discuss distributions with compact support and identify them with continuous linear forms on $C^\infty$. Moreover, we completely describe distributions which have their support concentrated in a single point.
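1.2. Notation and conventions

(i) $\mathbb{N} := \{1, 2, \dots\}$, $\mathbb{N}_0 := \{0\} \cup \mathbb{N}$
(ii) $\subseteq$ means subset; $\subset$ will not be used
(iii) $\Omega \subseteq \mathbb{R}^n$ will always be an open and non-empty subset; $\Omega^c = \mathbb{R}^n \setminus \Omega$ denotes the complement of $\Omega$ in $\mathbb{R}^n$
(iv) $K \Subset \Omega$ (also $K \subset\subset \Omega$) $:\Leftrightarrow$ $K \subseteq \Omega$ and $K$ compact
(v) Let $A \subseteq \Omega$. We denote by $A^\circ$ the interior of $A$ and by $\overline{A}^{\Omega}$ the closure of $A$ in $\Omega$; if $\Omega$ is clear from the context, we will only write $\overline{A}$
(vi) For $A \subseteq \Omega$ we denote by $1_A$ the characteristic function of $A$, i.e.
\[ 1_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \notin A \end{cases} \]
(vii) For $x = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n$ or $\mathbb{C}^n$ we denote by $|x|$ the Euclidean norm of $x$, i.e.
\[ |x| := \sqrt{\sum_{j=1}^{n} |x_j|^2} \]
(viii) For $R > 0$ and $x_0 \in \mathbb{R}^n$ ($\mathbb{C}^n$) we denote by $B_R(x_0)$ the open Euclidean ball around $x_0$ with radius $R$, i.e.
\[ B_R(x_0) := \{x : |x - x_0| < R\} \]
(ix) Multi-index notation: $\alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{N}_0^n$ will be called a multi-index of length (order) $|\alpha| := \sum_{j=1}^{n} \alpha_j$. For multi-indices $\alpha$, $\beta$ we define
\[ \alpha + \beta := (\alpha_1 + \beta_1, \dots, \alpha_n + \beta_n) \]
\[ \beta \le \alpha \ :\Leftrightarrow\ \beta_j \le \alpha_j \quad \forall\, 1 \le j \le n \]
\[ \alpha - \beta := (\alpha_1 - \beta_1, \dots, \alpha_n - \beta_n), \quad \text{if } \beta \le \alpha \]
For $x \in \mathbb{R}^n$ we write $x^\alpha := x_1^{\alpha_1} \cdots x_n^{\alpha_n}$, and
\[ \alpha! := \alpha_1! \cdots \alpha_n! \quad \text{and} \quad \binom{\alpha}{\beta} := \frac{\alpha!}{(\alpha - \beta)!\, \beta!} \]
For $f : \Omega \to \mathbb{C}$ we write
\[ \partial^\alpha f := \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}} \]
If $\alpha = e_j = (0, 0, \dots, 0, 1, 0, \dots, 0)$ (i.e. $\alpha_i = \delta_{ij}$ with $\delta_{ij}$ the Kronecker-delta, $\delta_{ij} = 1$ if $i = j$ and $\delta_{ij} = 0$ otherwise) we write $\partial^\alpha f = \partial_j f$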
DRAFT VERSION (July 10, 2009)

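In multi-index notation the Taylor series takes the form
\[ f(x_0 + h) = \sum_{|\alpha| \ge 0} \frac{h^\alpha}{\alpha!}\, \partial^\alpha f(x_0) \]
(x) $L^p$-norms: Let $A \subseteq \Omega$ be open or closed, $f : A \to \mathbb{C}$ continuous (resp. $A$ measurable, $f$ measurable)
\[ \|f\|_{L^p(A)} := \Big( \int_A |f(x)|^p \, dx \Big)^{1/p} \quad (1 \le p < \infty) \]
\[ \|f\|_{L^\infty(A)} := \sup_{x \in A} |f(x)| \quad (\text{resp. ess sup}) \]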
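The multi-index conventions in (ix) are easy to mistype, so a small computational sketch may help. The helper names below (`mi_order`, `mi_factorial`, `mi_binomial`, `mi_power`) are hypothetical and not part of the notes; multi-indices are modelled as tuples of non-negative integers.

```python
from math import factorial, prod

def mi_order(alpha):
    """Length (order) |alpha| = alpha_1 + ... + alpha_n."""
    return sum(alpha)

def mi_factorial(alpha):
    """alpha! = alpha_1! * ... * alpha_n!."""
    return prod(factorial(a) for a in alpha)

def mi_binomial(alpha, beta):
    """Binomial coefficient alpha! / ((alpha - beta)! beta!), defined for beta <= alpha."""
    if any(b > a for a, b in zip(alpha, beta)):
        raise ValueError("requires beta <= alpha componentwise")
    diff = tuple(a - b for a, b in zip(alpha, beta))
    return mi_factorial(alpha) // (mi_factorial(diff) * mi_factorial(beta))

def mi_power(x, alpha):
    """x^alpha = x_1^alpha_1 * ... * x_n^alpha_n."""
    return prod(xj ** aj for xj, aj in zip(x, alpha))
```

Note that the multi-index binomial coefficient factors into a product of ordinary binomial coefficients, one per component.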

1.1. SMOOTH FUNCTIONS, SUPPORT, AND TEST FUNCTIONS
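1.3. DEF ($C^k$-functions)

(i) $C(\Omega) = C^0(\Omega) := \{f : \Omega \to \mathbb{C} \mid f \text{ is continuous}\}$, $C = C(\mathbb{R}^n)$
(ii) $k \in \mathbb{N}$: $C^k(\Omega) := \{f : \Omega \to \mathbb{C} \mid f \text{ is } k\text{-times continuously differentiable}\}$, $C^k = C^k(\mathbb{R}^n)$
(iii) $\mathcal{E}(\Omega) = C^\infty(\Omega) := \bigcap_{k \in \mathbb{N}} C^k(\Omega) = \{f : \Omega \to \mathbb{C} \mid f \text{ is (continuously) differentiable of every order}\}$
is the space of smooth functions (also: infinitely differentiable or $C^\infty$-functions); $\mathcal{E} = C^\infty = C^\infty(\mathbb{R}^n)$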
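1.4. DEF (Convergence in $C^k$)

(i) Let $(f_l)_{l \in \mathbb{N}}$ be a sequence and $f$ in $C^k(\Omega)$ (in $\mathcal{E}(\Omega)$, resp.).
\[ f_l \xrightarrow{C^k} f \ (l \to \infty) \ :\Leftrightarrow\ \forall K \Subset \Omega\ \ \forall \alpha \in \mathbb{N}_0^n,\ |\alpha| \le k:\ \ \partial^\alpha f_l \to \partial^\alpha f \text{ uniformly on } K \]
[i.e. $\|\partial^\alpha f_l - \partial^\alpha f\|_{L^\infty(K)} \to 0$]
\[ (\text{resp. } f_l \xrightarrow{\mathcal{E}} f \ (l \to \infty) \ :\Leftrightarrow\ \forall K \Subset \Omega\ \ \forall \alpha \in \mathbb{N}_0^n:\ \ \partial^\alpha f_l \to \partial^\alpha f \text{ uniformly on } K) \]
This notion is called uniform convergence on compact sets in all derivatives.

(ii) We will occasionally consider nets of $C^k$- or $C^\infty$-functions of the type $(f_\varepsilon)_{0 < \varepsilon \le 1}$ or $(f_t)_{1 < t < \infty}$; for this kind of net, convergence (as $\varepsilon \to 0$ or $t \to \infty$) is defined analogously to (i)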
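1.5. Example (Convergence in $\mathcal{E}$) Let $f \in \mathcal{E}(\mathbb{R}^n)$ be arbitrary; for $\varepsilon \in\, ]0, 1]$ define $f_\varepsilon \in \mathcal{E}$ by $f_\varepsilon(x) := f(\varepsilon x)$ ($x \in \mathbb{R}^n$).
Claim: $f_\varepsilon \xrightarrow{\mathcal{E}} f(0)$ [constant function] as $\varepsilon \to 0$

Proof: Let $K \Subset \mathbb{R}^n$; choose $R > 0$ such that $K \subseteq B_R(0)$.
Then we have $\forall x \in K$: $\{\varepsilon x \mid \varepsilon \in\, ]0, 1]\} \subseteq B_R(0)$. We distinguish two cases with respect to the derivative order $|\alpha|$:
$|\alpha| = 0$: $|f_\varepsilon(x) - f(0)| = |f(\varepsilon x) - f(0)| \to 0$ (as $\varepsilon \to 0$) uniformly on $K$
[by uniform continuity of $f$ on the compact set $\overline{B_R(0)}$]
$|\alpha| \ge 1$: clearly $\partial^\alpha (f(0)) = 0$ and $\partial^\alpha f_\varepsilon(x) = \varepsilon^{|\alpha|}\, (\partial^\alpha f)(\varepsilon x)$; since $\partial^\alpha f$ is bounded on compact sets we thus obtain $\forall x \in K$
\[ |\partial^\alpha f_\varepsilon(x) - \partial^\alpha (f(0))| = \varepsilon^{|\alpha|}\, |\partial^\alpha f(\varepsilon x)| \le \varepsilon^{|\alpha|}\, \|\partial^\alpha f\|_{L^\infty(B_R(0))}, \]
hence $\|\partial^\alpha f_\varepsilon - \partial^\alpha (f(0))\|_{L^\infty(K)} \le \varepsilon\, \|\partial^\alpha f\|_{L^\infty(B_R(0))} \to 0$ ($\varepsilon \to 0$).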
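The case $|\alpha| = 0$ of Example 1.5 can be checked numerically in one dimension. The following is only a sketch: it takes $f = \exp$ and $K = [-2, 2]$ (both our choices, not from the notes) and approximates the supremum on a uniform grid; `sup_error` is a hypothetical helper name.

```python
import math

def sup_error(f, eps, radius, samples=2001):
    """Grid approximation of sup over |x| <= radius of |f(eps*x) - f(0)| (1D)."""
    grid = [-radius + 2.0 * radius * i / (samples - 1) for i in range(samples)]
    return max(abs(f(eps * x) - f(0.0)) for x in grid)

# sup-errors on K = [-2, 2] for a decreasing sequence of eps
ERRORS = [sup_error(math.exp, eps, radius=2.0) for eps in (1.0, 0.1, 0.01)]
```

For this particular $f$ the supremum is attained at $x = 2$ and equals $e^{2\varepsilon} - 1$, which tends to 0 with $\varepsilon$.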

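1.6. DEF (Support) Let $f \in C(\Omega)$. The set
\[ \operatorname{supp}(f) := \overline{\{x \in \Omega \mid f(x) \ne 0\}}^{\Omega} = \Omega \setminus \{y \in \Omega \mid \exists \text{ neighborhood } U \ni y : f|_U = 0\} \]
is called the support of $f$.
1.7. Observation (Properties of the support) Let $f, g \in C(\Omega)$.

(i) $f = 0 \Leftrightarrow \operatorname{supp}(f) = \emptyset$
(ii) $\operatorname{supp}(f)$ is closed in $\Omega$
(iii) $\operatorname{supp}(f \cdot g) \subseteq \operatorname{supp}(f) \cap \operatorname{supp}(g)$
(iv) $\operatorname{supp}(f)$ is the complement of the largest open subset of $\Omega$ on which $f$ vanishes.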
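1.8. DEF (Test functions)

(i) $k \in \mathbb{N}_0$: $C_c^k(\Omega) := \{f \in C^k(\Omega) \mid \operatorname{supp}(f) \text{ is compact}\}$
(ii) $C_c^\infty(\Omega) = \mathcal{D}(\Omega) := \{f \in C^\infty(\Omega) \mid \operatorname{supp}(f) \text{ is compact}\}$
is the space of test functions on $\Omega$.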
1.9. Question (Nontriviality of the spaces $C_c^k(\Omega)$ and $\mathcal{D}(\Omega)$)

Clearly, for all $k \in \mathbb{N}_0$ we have the inclusions $\mathcal{D}(\Omega) \subseteq C_c^{k+1}(\Omega) \subseteq C_c^k(\Omega)$ and the zero function belongs to $\mathcal{D}(\Omega)$. But can we be sure that there exist any non-zero test functions in $\mathcal{D}(\Omega)$? In fact, many classes of smooth functions that may come to your mind at first, e.g. polynomials, sin, cos, and the exponential function, do not have compact support. So we have to deal with the question: Does the vector space $\mathcal{D}(\Omega)$ contain sufficiently many "interesting functions" to provide a good basis for a rich theory?
[For finite $k$ the nontriviality of $C_c^k(\Omega)$ can be shown by an elementary exercise: (i) For an open interval $I \subseteq \mathbb{R}$ it is very easy to construct plenty of nonzero functions $h \in C_c^0(I)$. (E.g., choose $a_1, a_2, a_3, a_4 \in I$ with $a_1 < a_2 < a_3 < a_4$; put $h = 0$ on $I \setminus\, ]a_1, a_4[$, $h = 1$ on $[a_2, a_3]$; then define $h$ on the remaining two subintervals by the unique (affine) linear interpolation such that $h$ is continuous on $I$.) If $I_1, \dots, I_n$ are open intervals such that $J := I_1 \times \dots \times I_n \subseteq \Omega$ we may choose $0 \ne f_j \in C_c^0(I_j)$ for each $j = 1, \dots, n$. Then put $f(x_1, \dots, x_n) := f_1(x_1) \cdot f_2(x_2) \cdots f_n(x_n)$, when $(x_1, \dots, x_n) \in J$, and $f(x) := 0$, when $x \in \Omega \setminus J$. This yields an element $0 \ne f \in C_c^0(\Omega)$ (with $f = 1$ on some compact cube inside $J$).
(ii) To obtain an element $0 \ne f \in C_c^k(\Omega)$ with (finite) $k \ge 1$ one simply has to replace the edges in the graph of the function $h$ in (i) (at the points $(a_1, 0)$, $(a_2, 1)$, $(a_3, 1)$, $(a_4, 0)$) by a higher-order spline-type interpolation with matching left and right derivatives at the respective connecting points up to order $k$.]
We now begin our constructions in the smooth case.
1.10. LEMMA (Smoothly joining zero) Let $h : \mathbb{R} \to \mathbb{C}$ be defined by
\[ h(t) := \begin{cases} e^{-1/t} & t > 0 \\ 0 & t \le 0. \end{cases} \]
[insert graph]
Then $h$ belongs to $C^\infty(\mathbb{R})$, $0 \le h \le 1$, and $h(t) > 0$ if and only if $t > 0$.

[The proof of smoothness can be done by induction and is analogous to a similar one in [For06, Beispiel (22.2)] (see also [Hor09, 10.12]); the remaining properties are immediate from the definition.]
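A quick numerical look at the function $h$ of Lemma 1.10 may be reassuring. The sketch below is not a proof of smoothness; `difference_quotient` is a hypothetical helper that evaluates a crude $k$-th forward difference quotient at $0$, which should be tiny if all derivatives of $h$ vanish there.

```python
import math

def h(t):
    """h(t) = exp(-1/t) for t > 0 and 0 for t <= 0 (Lemma 1.10)."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def difference_quotient(k, step):
    """Crude k-th forward difference quotient of h at 0 (illustration only)."""
    total = sum((-1) ** (k - j) * math.comb(k, j) * h(j * step)
                for j in range(k + 1))
    return total / step ** k
```

The exponential decay of $e^{-1/t}$ as $t \downarrow 0$ beats every power of $t$, which is the mechanism behind the induction proof cited above.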
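1.11. Constructions (Bump functions on $\mathbb{R}^n$ and nontriviality of $\mathcal{D}(\Omega) = C_c^\infty(\Omega)$)

(i) Basic bump function: An explicit example of a function $\psi \in C_c^\infty(\mathbb{R}^n)$ with $0 \le \psi \le 1$, $\operatorname{supp}(\psi) = \overline{B_1(0)}$, and $\psi(x) > 0$ when $|x| < 1$, is given by
\[ \psi(x) := \begin{cases} e^{-1/(1 - |x|^2)} & \text{if } |x| < 1, \\ 0 & \text{otherwise.} \end{cases} \]
[insert graph]
[Smoothness of $\psi$ follows from the chain rule upon noticing that $\psi(x) = h(1 - |x|^2)$ with $h$ as in Lemma 1.10]

(ii) Normalization: Since $\psi \in C_c^\infty(\mathbb{R}^n)$ (from (i)) satisfies $\int_{\mathbb{R}^n} \psi(y)\, dy > 0$ we may set
\[ \rho(x) := \frac{\psi(x)}{\int_{\mathbb{R}^n} \psi(y)\, dy} \]
and obtain a function $\rho \in C_c^\infty(\mathbb{R}^n)$ with $\operatorname{supp}(\rho) = \overline{B_1(0)}$, $\rho \ge 0$, $\rho(x) > 0$ when $|x| < 1$, and such that in addition $\int_{\mathbb{R}^n} \rho(x)\, dx = 1$.

(iii) Scaling: For $\varepsilon \in\, ]0, 1]$ we set
\[ \rho_\varepsilon(x) := \frac{1}{\varepsilon^n}\, \rho\Big(\frac{x}{\varepsilon}\Big) \quad (x \in \mathbb{R}^n). \]
Then we have for every $\varepsilon \in\, ]0, 1]$: $\rho_\varepsilon \in C_c^\infty(\mathbb{R}^n)$, $\rho_\varepsilon \ge 0$, $\operatorname{supp}(\rho_\varepsilon) = \overline{B_\varepsilon(0)}$, and $\int_{\mathbb{R}^n} \rho_\varepsilon = 1$.

(iv) Translation: Let $x_0 \in \mathbb{R}^n$ be arbitrary. Defining the functions $\tilde\rho_\varepsilon$ ($0 < \varepsilon \le 1$) on $\mathbb{R}^n$ by
\[ \tilde\rho_\varepsilon(x) := \rho_\varepsilon(x - x_0) = \frac{1}{\varepsilon^n}\, \rho\Big(\frac{x - x_0}{\varepsilon}\Big) \]
we obtain $\forall \varepsilon \in\, ]0, 1]$: $\tilde\rho_\varepsilon \in C_c^\infty(\mathbb{R}^n)$, $\tilde\rho_\varepsilon \ge 0$, $\operatorname{supp}(\tilde\rho_\varepsilon) = \overline{B_\varepsilon(x_0)}$, and $\int_{\mathbb{R}^n} \tilde\rho_\varepsilon = 1$.
Thus, we constructed smooth normalized non-negative bump functions in $\mathcal{D}(\Omega)$ with supports concentrated near any given point.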
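The four steps of 1.11 can be traced numerically in dimension $n = 1$. This is a sketch under our own assumptions: the normalizing integral is approximated by a midpoint rule (the hypothetical helper `integrate`), and the names `psi`, `rho`, `rho_eps` merely mirror the roles of the functions above.

```python
import math

def psi(x):
    """Basic bump on R: exp(-1/(1 - x^2)) for |x| < 1, else 0."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def integrate(g, a, b, n=20000):
    """Composite midpoint rule on [a, b] with n cells."""
    step = (b - a) / n
    return step * sum(g(a + (i + 0.5) * step) for i in range(n))

MASS = integrate(psi, -1.0, 1.0)   # numerical value of the integral of psi

def rho(x):
    """Normalized bump: integral approximately 1."""
    return psi(x) / MASS

def rho_eps(x, eps, x0=0.0):
    """Scaled and translated bump (n = 1): (1/eps) * rho((x - x0)/eps)."""
    return rho((x - x0) / eps) / eps
```

Scaling by $\varepsilon$ shrinks the support to radius $\varepsilon$ while the prefactor $1/\varepsilon^n$ keeps the integral equal to 1; the translation moves the support to a ball around $x_0$.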
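1.12. DEF (Mollifier) A function $\rho \in \mathcal{D}(\mathbb{R}^n)$ is called a mollifier if

(i) $\operatorname{supp}(\rho) \subseteq \overline{B_1(0)}$
(ii) $\int_{\mathbb{R}^n} \rho(x)\, dx = 1$.
By 1.11 existence of mollifiers is guaranteed.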
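1.13. THM (Approximation by convolution) Let $f \in C_c^k(\mathbb{R}^n)$ ($0 \le k \le \infty$) and let $\rho$ be a mollifier. For $\varepsilon \in\, ]0, 1]$ we define
\[ f_\varepsilon(x) := \int_{\mathbb{R}^n} f(y)\, \rho_\varepsilon(x - y)\, dy = \frac{1}{\varepsilon^n} \int_{\mathbb{R}^n} f(y)\, \rho\Big(\frac{x - y}{\varepsilon}\Big)\, dy. \]
[$f_\varepsilon = f * \rho_\varepsilon$, where '$*$' is called convolution]

Then we have
(i) $f_\varepsilon \in \mathcal{D}(\mathbb{R}^n)$ with $\operatorname{supp}(f_\varepsilon) \subseteq \{x \in \mathbb{R}^n \mid d(x, \operatorname{supp}(f)) \le \varepsilon\}$;
[where $d(x_0, A) := \inf_{x \in A} |x - x_0|$ for any $x_0 \in \mathbb{R}^n$ and $A \subseteq \mathbb{R}^n$]
(ii) if $k < \infty$ then $f_\varepsilon \to f$ in $C^k(\mathbb{R}^n)$ (as $\varepsilon \to 0$);
(iii) if $k = \infty$ then $f_\varepsilon \to f$ in $\mathcal{E}(\mathbb{R}^n)$ (as $\varepsilon \to 0$).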

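Proof: (i) Let $K = \operatorname{supp}(f)$.
Since $\rho$ is smooth (and for $x$ in a bounded open subset $U \subseteq \mathbb{R}^n$ we have $f(y)\, \rho(\frac{x - y}{\varepsilon}) = 0$ when $y \notin K \cap (U - \operatorname{supp}(\rho_\varepsilon))$) we may apply standard theorems about integrals depending on parameters (see footnote 1) and obtain smoothness of $f_\varepsilon$.
Furthermore, noting that $\operatorname{supp}(\rho_\varepsilon) \subseteq \overline{B_\varepsilon(0)}$ and changing integration variables from $y$ to $y' = x - y$ we may write
\[ f_\varepsilon(x) = \int_{\mathbb{R}^n} f(y)\, \rho_\varepsilon(x - y)\, dy = \int_{B_\varepsilon(0)} f(x - y')\, \rho_\varepsilon(y')\, dy'. \]
If $x \in \mathbb{R}^n$ with $d(x, K) > \varepsilon$ then $f(x - y') = 0$ for all $y'$ in the integration domain, thus $f_\varepsilon(x) = 0$. Therefore $\operatorname{supp}(f_\varepsilon) \subseteq \{x \in \mathbb{R}^n \mid d(x, K) \le \varepsilon\}$.
(ii) and (iii): We first prove uniform convergence (on all of $\mathbb{R}^n$) $f_\varepsilon \to f$ ($\varepsilon \to 0$).
The change of variables $y \mapsto z = (x - y)/\varepsilon$ (hence $dz = dy/\varepsilon^n$) yields
\[ f_\varepsilon(x) = \frac{1}{\varepsilon^n} \int_{\mathbb{R}^n} f(y)\, \rho\Big(\frac{x - y}{\varepsilon}\Big)\, dy = \int_{\mathbb{R}^n} f(x - \varepsilon z)\, \rho(z)\, dz. \]
Therefore, appealing to uniform continuity of $f$ (due to its compact support!), we may deduce
\[ |f_\varepsilon(x) - f(x)| = \Big| \int_{\mathbb{R}^n} f(x - \varepsilon z)\, \rho(z)\, dz - \int_{\mathbb{R}^n} f(x)\, \rho(z)\, dz \Big| \qquad [\textstyle\int \rho = 1] \]
\[ \le \int_{B_1(0)} |f(x - \varepsilon z) - f(x)|\, |\rho(z)|\, dz \qquad [\operatorname{supp}(\rho) \subseteq \overline{B_1(0)}] \]
\[ \le \underbrace{\int |\rho(z)|\, dz}_{\text{constant}} \cdot \sup_{|y| \le \varepsilon} |f(x - y) - f(x)| \to 0 \quad (\varepsilon \to 0). \]
If $|\alpha| \le k$ then the same game can be played with $\partial^\alpha f_\varepsilon(x) = \int \partial^\alpha f(x - \varepsilon z)\, \rho(z)\, dz$ to show uniform convergence $\partial^\alpha f_\varepsilon \to \partial^\alpha f$. Thus we obtain, in particular, uniform convergence of all derivatives up to order $k$ on compact subsets.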
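The estimate in the proof can be observed numerically. The sketch below works in one dimension with the hat function $f(x) = \max(0, 1 - |x|) \in C_c^0(\mathbb{R})$ (our choice) and replaces the integral $\int f(x - \varepsilon z)\rho(z)\,dz$ by a discrete sum with non-negative weights of total mass 1; all names (`bump`, `hat`, `mollify`, `sup_error`) are hypothetical.

```python
import math

def bump(z):
    """Unnormalized bump supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - z * z)) if abs(z) < 1.0 else 0.0

N = 400                                             # quadrature cells on [-1, 1]
NODES = [-1.0 + (i + 0.5) * 2.0 / N for i in range(N)]
TOTAL = sum(bump(z) for z in NODES)
WEIGHTS = [bump(z) / TOTAL for z in NODES]          # discrete analogue of "integral of rho = 1"

def hat(x):
    """f(x) = max(0, 1 - |x|), continuous with compact support [-1, 1]."""
    return max(0.0, 1.0 - abs(x))

def mollify(f, eps):
    """x -> sum_i f(x - eps*z_i) * w_i, approximating the integral of f(x - eps*z) rho(z) dz."""
    return lambda x: sum(w * f(x - eps * z) for z, w in zip(NODES, WEIGHTS))

def sup_error(f, eps, grid):
    """Grid approximation of the sup-norm of f_eps - f."""
    f_eps = mollify(f, eps)
    return max(abs(f_eps(x) - f(x)) for x in grid)

GRID = [-2.0 + 4.0 * i / 400 for i in range(401)]
ERRORS = [sup_error(hat, eps, GRID) for eps in (0.4, 0.2, 0.1)]
```

Since the hat function is 1-Lipschitz, the bound from the proof gives $\sup |f_\varepsilon - f| \le \varepsilon$, and the support of $f_\varepsilon$ stays within distance $\varepsilon$ of $[-1, 1]$, in line with (i).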
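1.14. REM (Approximation on $\Omega$)

(i) Any $f \in C_c^k(\Omega)$ can be extended to a function $\tilde f \in C_c^k(\mathbb{R}^n)$ simply by setting $\tilde f(x) = 0$ when $x \notin \Omega$, and $\tilde f(x) = f(x)$ when $x \in \Omega$. In fact, the map $f \mapsto \tilde f$ yields an embedding $C_c^k(\Omega) \hookrightarrow C_c^k(\mathbb{R}^n)$ and we will follow a common abuse of notation in writing $f$ instead of $\tilde f$.
Constructing $f_\varepsilon$ as in the above theorem we obtain for $\varepsilon < d(\operatorname{supp}(f), \Omega^c)$ that $\operatorname{supp}(f_\varepsilon) \subseteq \Omega$, hence $f_\varepsilon \in C_c^k(\Omega)$ and $f_\varepsilon \to f$ in $C_c^k(\Omega)$.
[Here, $d(A, B) = \inf_{x \in A, y \in B} |x - y|$ for subsets $A, B \subseteq \mathbb{R}^n$.]
[insert drawing]
(ii) As a special case of the result in (i) we may state that $\mathcal{D}(\Omega)$ is dense in $C_c(\Omega)$.
(iii) Let $f \in C^\infty(\Omega)$ and $K \Subset \Omega$. Then we have: For any $\varepsilon > 0$ there is a function $\varphi \in \mathcal{D}(\Omega)$ such that $f|_K = \varphi|_K$ and $\operatorname{supp}(\varphi) \subseteq K + B_\varepsilon(0)$.
[insert drawing]
[For a proof recall the following result (cf. e.g. [Hor09, 25.5], or [For84, 3, Satz 1]) on smooth bump functions, or cut-offs: Let $\Omega' \subseteq \mathbb{R}^n$ be open. For every $K \Subset \Omega'$ we can find a function $\chi \in C_c^\infty(\Omega')$ with $0 \le \chi \le 1$ such that $\forall x \in K$: $\chi(x) = 1$.
Now consider $\varphi = \chi f$ and choose $\Omega'$ to be some open neighborhood of $K + B_\varepsilon(0)$.]

(Footnote 1) e.g. [For05, 10, Satz 2, Bemerkung] or [Hor09, 18.19]; see also [Fol99, Theorem 2.27]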
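1.15. DEF (The spaces $\mathcal{D}^k(K)$) Let $K \Subset \Omega$, $0 \le k \le \infty$. We define

(i) $\mathcal{D}^k(K) := \{f \in C_c^k(\Omega) \mid \operatorname{supp}(f) \subseteq K\}$; if $k = \infty$ we also write $\mathcal{D}(K)$ instead of $\mathcal{D}^\infty(K)$; note that $\mathcal{D}(K) = \bigcap_{k=1}^{\infty} \mathcal{D}^k(K)$

(ii) Let $f_n$ ($n \in \mathbb{N}$) and $f$ be in $\mathcal{D}(K)$ [or $\mathcal{D}^k(K)$ with $k < \infty$, resp.]. We say that the sequence $(f_n)$ converges to $f$ in $\mathcal{D}(K)$ [$\mathcal{D}^k(K)$, $k < \infty$, resp.], and we write $f_n \to f$, if
\[ \partial^\alpha f_n \to \partial^\alpha f \text{ uniformly on } K, \quad \forall \alpha \in \mathbb{N}_0^n \ [|\alpha| \le k, \text{ resp.}]. \]
(iii) When we consider nets of the type $(f_\varepsilon)_{0 < \varepsilon \le 1}$ or $(f_t)_{1 < t < \infty}$ convergence (as $\varepsilon \to 0$ or $t \to \infty$) is defined similarly.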
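1.16. REM (Topology I)

(i) For $k \in \mathbb{N}_0$ (thus $k < \infty$) the space $\mathcal{D}^k(K)$ is a Banach space (i.e. normed and complete) with the norm
\[ \|f\|_{\mathcal{D}^k(K)} := \sum_{|\alpha| \le k} \|\partial^\alpha f\|_{L^\infty(K)} \quad [\text{or } \max \text{ instead of } \textstyle\sum]. \]
(ii) $\mathcal{D}(K)$ is a Fréchet space (i.e., a complete, metrizable, locally convex vector space; [Hor66, Ch. 2, Section 9, Def. 4], [Ste09, 2.51]) with semi-norms
\[ q_\alpha(f) := \|\partial^\alpha f\|_{L^\infty(K)} \quad (\alpha \in \mathbb{N}_0^n). \]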
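1.17. DEF (Convergence of test functions)

(i) Let $\varphi_n$ ($n \in \mathbb{N}$) and $\varphi$ be in $\mathcal{D}(\Omega)$. We define $\varphi_n \to \varphi$ in $\mathcal{D}(\Omega)$ (as $n \to \infty$) to mean
(1) $\exists K \Subset \Omega$: $\operatorname{supp}(\varphi) \subseteq K$ and $\operatorname{supp}(\varphi_n) \subseteq K$ $\forall n \in \mathbb{N}$, and
(2) $\partial^\alpha \varphi_n \to \partial^\alpha \varphi$ uniformly on $K$ for all $\alpha \in \mathbb{N}_0^n$.
Thus, $\varphi_n \to \varphi$ in $\mathcal{D}(\Omega)$ if and only if
\[ \exists K \Subset \Omega:\ \forall n \in \mathbb{N},\ \varphi_n, \varphi \in \mathcal{D}(K) \text{ and } \varphi_n \to \varphi \text{ in } \mathcal{D}(K). \]
(ii) Again, for nets of the form $(\varphi_\varepsilon)_{0 < \varepsilon \le 1}$ or $(\varphi_t)_{1 < t < \infty}$ convergence in $\mathcal{D}(\Omega)$ (as $\varepsilon \to 0$ or $t \to \infty$) is defined in the same way.
(iii) We also have analogous concepts of convergence in the spaces $C_c^k(\Omega)$ when $k < \infty$. In these cases, we require (2) to hold for $\alpha \in \mathbb{N}_0^n$ with $|\alpha| \le k$.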
1.18. REM (Topology II)

(i) $\mathcal{D}(\Omega)$ is a strict inductive limit of the Fréchet spaces $\mathcal{D}(K)$ (as $K$ runs through an exhaustive sequence of compact subsets of $\Omega$), a so-called (LF)-space (see [Sch66, Ch. II, 6.3] or [Ste09, Def. 3.23]). As a locally convex vector space $\mathcal{D}(\Omega)$ is complete, barreled, bornological, and a Montel space (thus it is reflexive and the Heine-Borel theorem is valid) (cf. [Hor66, Chapters 2-3] or [Sch66, Chapter III, 1-2]); however, it can be shown that $\mathcal{D}(\Omega)$ is not metrizable (follows from [Hor66, Chapter 2, 12, Exercise 6(a)]).
(ii) $C_c^k(\Omega)$ with $k < \infty$ is a strict inductive limit of the Banach spaces $\mathcal{D}^k(K)$ (as $K$ runs through an exhaustive sequence of compact subsets of $\Omega$), a so-called (LB)-space.
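1.19. Example ($\mathcal{D}$-convergence vs. $\mathcal{E}$-convergence) Convergence in $\mathcal{D}(\mathbb{R}^n)$ differs from convergence in $\mathcal{E}(\mathbb{R}^n)$: Let $\varphi \in \mathcal{D}(\mathbb{R}^n)$ with $\varphi(0) \ne 0$ and set $\varphi_\varepsilon(x) := \varphi(\varepsilon x)$ ($\varepsilon > 0$). We know from Example 1.5 that $\varphi_\varepsilon \to \varphi(0)$ in $\mathcal{E}(\mathbb{R}^n)$ (as $\varepsilon \to 0$). However, (the nonzero constant function) $\varphi(0) \notin \mathcal{D}(\mathbb{R}^n)$. Moreover, the net $(\varphi_\varepsilon)$ cannot be convergent in $\mathcal{D}(\mathbb{R}^n)$: choose $x_0 \ne 0$ with $\varphi(x_0) \ne 0$; then $\varphi_\varepsilon(x_0/\varepsilon) = \varphi(x_0) \ne 0$, whereas $|x_0|/\varepsilon \to \infty$; hence there is no compact subset $K \Subset \mathbb{R}^n$ such that $\operatorname{supp}(\varphi_\varepsilon) \subseteq K$ for all small $\varepsilon > 0$.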

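1.20. REM (Approximation by convolution in $C_c^k(\Omega)$ and $\mathcal{D}(\Omega)$) As can be seen from Theorem 1.13 and its proof we may adapt the construction of $f_\varepsilon$ used there to obtain convergence in $C_c^k(\Omega)$ and $\mathcal{D}(\Omega)$. Indeed, choose $0 < \varepsilon_0 < 1$ such that $K_0 := \{x \in \mathbb{R}^n \mid d(x, \operatorname{supp}(f)) \le \varepsilon_0\} \Subset \Omega$. Then we have $\operatorname{supp}(f) \subseteq K_0$ and $\operatorname{supp}(f_\varepsilon) \subseteq K_0$ for all $\varepsilon \in\, ]0, \varepsilon_0]$, and by the very same proof we obtain $f_\varepsilon \to f$ (as $\varepsilon_0 > \varepsilon \to 0$) in $C_c^k(\Omega)$, or in $\mathcal{D}(\Omega)$ respectively.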
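1.21. DEF (Cauchy sequences in $\mathcal{D}(\Omega)$)

A sequence $(\varphi_l)_l$ in $\mathcal{D}(\Omega)$ is called a Cauchy sequence if
(1) $\exists K \Subset \Omega$: $\operatorname{supp}(\varphi_l) \subseteq K$ $\forall l \in \mathbb{N}$, and
(2) $\forall \alpha \in \mathbb{N}_0^n$ $\forall \varepsilon > 0$ $\exists N(\alpha, \varepsilon)$: $\|\partial^\alpha \varphi_k - \partial^\alpha \varphi_l\|_{L^\infty(K)} < \varepsilon$ $\forall k, l \ge N(\alpha, \varepsilon)$.
Thus $(\varphi_l)_l$ is a Cauchy sequence in $\mathcal{D}(\Omega)$ if and only if there is some $K \Subset \Omega$ such that $(\varphi_l)$ is a Cauchy sequence in $\mathcal{D}(K)$.
Analogously we define the notion of a Cauchy net for $(\varphi_\varepsilon)_{0 < \varepsilon \le 1}$ and $(\varphi_t)_{1 < t < \infty}$ and the notions of Cauchy sequences and Cauchy nets in $\mathcal{D}^k(\Omega)$.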
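1.22. THM $\mathcal{D}(\Omega)$ is sequentially complete.
Proof: Let $(\varphi_l)$ be a Cauchy sequence in $\mathcal{D}(\Omega)$ and let $K$ be as in 1.21(1). Then $(\partial^\alpha \varphi_l)$ is a Cauchy sequence in $\mathcal{D}^0(K)$ for all $\alpha \in \mathbb{N}_0^n$. By 1.16(i) this space is complete, hence there exist limits for all $(\partial^\alpha \varphi_l)$. More precisely, we have
\[ \forall \alpha \in \mathbb{N}_0^n\ \ \exists \psi_\alpha \in \mathcal{D}^0(K) \subseteq \mathcal{D}^0(\Omega) \text{ with } \partial^\alpha \varphi_l \to \psi_\alpha \text{ uniformly on } K. \]
We now claim that $\varphi_l \to \psi_0$ in $\mathcal{D}(\Omega)$. Indeed (1) in 1.17(i) is clear and we are left with proving (2). To this end let $\alpha \in \mathbb{N}_0^n$, $1 \le j \le n$, $\beta = \alpha + e_j$. Then we have
\[ \psi_\alpha = \lim_l \partial^\alpha \varphi_l = \lim_l \int_{-\infty}^{x_j} \partial^\beta \varphi_l(x_1, \dots, s, \dots, x_n)\, ds = \int_{-\infty}^{x_j} \psi_\beta(x_1, \dots, s, \dots, x_n)\, ds. \]
Hence $\partial_j \psi_\alpha = \psi_\beta$ and since $\alpha$, $\beta$, and $j$ were arbitrary we have $\psi_\alpha = \partial^\alpha \psi_0$ for all $\alpha \in \mathbb{N}_0^n$. But this implies that $\partial^\alpha \varphi_l \to \partial^\alpha \psi_0$ uniformly on $K$ for all $\alpha \in \mathbb{N}_0^n$, and we are done.
1.23. COR $\mathcal{D}^k(\Omega)$ is sequentially complete for all $0 \le k < \infty$.
Proof: Just slim down the above proof to the case $|\alpha| \le k$.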

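1.2. DISTRIBUTIONS
1.24. DEF (Distributions)

(i) A distribution $u$ on $\Omega$ is a linear functional on $\mathcal{D}(\Omega)$, i.e. $u : \mathcal{D}(\Omega) \to \mathbb{C}$ linear, with the following continuity property:
\[ \varphi_n \to \varphi \text{ in } \mathcal{D}(\Omega) \implies u(\varphi_n) \to u(\varphi) \text{ in } \mathbb{C} \]
(equivalently we may require: $\varphi_n \to 0$ in $\mathcal{D}(\Omega)$ $\implies$ $u(\varphi_n) \to 0$ in $\mathbb{C}$).
(ii) The complex vector space of distributions on $\Omega$ is denoted by $\mathcal{D}'(\Omega)$. We often write $\langle u, \varphi \rangle$ instead of $u(\varphi)$.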
1.25. REM (On continuity issues)

(i) The continuity condition stated in the above definition is, in fact, that of sequential continuity.
(ii) Continuity of a linear functional (with respect to the locally convex topology) implies sequential continuity. However, on general locally convex spaces sequential continuity is strictly weaker than continuity. (Explicit examples can be obtained from constructions in [Obe86, Beispiele 1.3.(3-4)] or [KR86, Exercise 7.6.9].)
(iii) On AA1-spaces, also called first countable (see footnote 2), thus in particular on metric spaces, sequential continuity is equivalent to continuity.
But recall from Remark 1.18(i) that $\mathcal{D}(\Omega)$ is not metrizable; hence it cannot be AA1, since due to [Sch66, Theorem 6.1] any Hausdorff topological vector space that is first countable is necessarily metrizable.
(iv) Nevertheless, sequential continuity of linear functionals on $\mathcal{D}(\Omega)$ does imply continuity on $\mathcal{D}(\Omega)$, since it is an (LF)-space (see also [Ste09, 3.26]): Let $u : \mathcal{D}(\Omega) \to \mathbb{C}$ be linear and sequentially continuous. Then (LF)-theory implies that $u|_{\mathcal{D}(K)}$ is sequentially continuous for every compact subset $K$. Metrizability of $\mathcal{D}(K)$ then yields continuity of $u|_{\mathcal{D}(K)}$ for every $K$, which in turn by (LF)-theory yields continuity of $u$ on $\mathcal{D}(\Omega)$.

(Footnote 2) i.e. each point possesses a countable basis of neighborhoods.
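More explicitly we have the following result.
1.26. THM (Continuity criterion: seminorm estimates) Let $u : \mathcal{D}(\Omega) \to \mathbb{C}$ be linear.
Then we have: $u \in \mathcal{D}'(\Omega)$ $\Leftrightarrow$ $\forall K \Subset \Omega$ $\exists C > 0$ $\exists m \in \mathbb{N}_0$:
\[ \text{(SN)} \qquad |\langle u, \varphi \rangle| \le C \sum_{|\alpha| \le m} \|\partial^\alpha \varphi\|_{L^\infty(K)} \quad \forall \varphi \in \mathcal{D}(K). \]
[Using 1.16(ii) we may rewrite (SN) as $|\langle u, \varphi \rangle| \le C \sum_{|\alpha| \le m} q_\alpha(\varphi)$. Observe that this is precisely the semi-norm estimate characterizing continuity of the linear map $u : \mathcal{D}(K) \to \mathbb{C}$; cf. [Sch66, Ch. III, 1.1], [Ste09, 2.24].]

Proof: ($\Leftarrow$) Let $\varphi_n \to 0$ in $\mathcal{D}(\Omega)$ with $K \supseteq \operatorname{supp}(\varphi_n)$ for all $n$. Choose $C > 0$ and $m \in \mathbb{N}_0$ according to (SN). Then we have
\[ |\langle u, \varphi_n \rangle| \le C \sum_{|\alpha| \le m} \|\partial^\alpha \varphi_n\|_{L^\infty(K)} \to 0 \quad (n \to \infty), \]
hence $u \in \mathcal{D}'(\Omega)$.

($\Rightarrow$) By contradiction: Assume that we have $u \in \mathcal{D}'(\Omega)$ but $\exists K \Subset \Omega$ such that $\forall m \in \mathbb{N}_0$ there is some $\varphi_m \in \mathcal{D}(K)$ such that
\[ |\langle u, \varphi_m \rangle| > m \sum_{|\alpha| \le m} \|\partial^\alpha \varphi_m\|_{L^\infty(K)}. \]
(Note that necessarily $\varphi_m \ne 0$ and thus $0 < \|\varphi_m\|_{L^\infty(K)} \le \sum_{|\alpha| \le m} \|\partial^\alpha \varphi_m\|_{L^\infty(K)}$.)
Now define, for $m \ge 1$,
\[ \psi_m(x) := \frac{\varphi_m(x)}{m \sum_{|\alpha| \le m} \|\partial^\alpha \varphi_m\|_{L^\infty(K)}} \quad \text{for } x \in \Omega. \]
Then $\psi_m \in \mathcal{D}(K)$ and for any $\beta \in \mathbb{N}_0^n$ and $m \ge |\beta|$ we clearly have
\[ \|\partial^\beta \psi_m\|_{L^\infty(K)} \le \sum_{|\alpha| \le m} \|\partial^\alpha \psi_m\|_{L^\infty(K)} = \frac{\sum_{|\alpha| \le m} \|\partial^\alpha \varphi_m\|_{L^\infty(K)}}{m \sum_{|\alpha| \le m} \|\partial^\alpha \varphi_m\|_{L^\infty(K)}} = \frac{1}{m}, \]
hence $\psi_m \to 0$ in $\mathcal{D}(\Omega)$ (as $m \to \infty$).
On the other hand, we obtain by construction
\[ |\langle u, \psi_m \rangle| = \frac{|\langle u, \varphi_m \rangle|}{m \sum_{|\alpha| \le m} \|\partial^\alpha \varphi_m\|_{L^\infty(K)}} > 1, \]
and therefore $\langle u, \psi_m \rangle \not\to 0$ in $\mathbb{C}$, a contradiction.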
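1.27. Examples (Some important distributions)

(i) Continuous functions as distributions: Let $f \in C(\Omega)$ and define $u_f : \mathcal{D}(\Omega) \to \mathbb{C}$ by
\[ \text{(RD)} \qquad \langle u_f, \varphi \rangle := \int_\Omega \underbrace{f(x)\, \varphi(x)}_{\in C_c(\Omega)!}\, dx. \]
Clearly $u_f$ is linear; we show that it also satisfies (SN): Let $K \Subset \Omega$, $\varphi \in \mathcal{D}(K)$, then
\[ |\langle u_f, \varphi \rangle| \le \int_\Omega |f(x)|\, |\varphi(x)|\, dx = \int_K |f(x)|\, |\varphi(x)|\, dx \le \|\varphi\|_{L^\infty(K)} \int_K |f(x)|\, dx = \|f\|_{L^1(K)}\, \|\varphi\|_{L^\infty(K)}. \]
Thus we obtain (SN) with $m = 0$ and $C = \|f\|_{L^1(K)}$.

(ii) The Heaviside function: Let $H$ denote (the $L^\infty$-class of) a function on $\mathbb{R}$ with $H(x) = 0$ when $x < 0$ and $H(x) = 1$ when $x > 0$. We define a linear functional on $\mathcal{D}(\mathbb{R})$, which we also denote by $H$, by setting
\[ \langle H, \varphi \rangle := \int_{-\infty}^{\infty} H(x)\, \varphi(x)\, dx = \int_0^\infty \varphi(x)\, dx \quad (\varphi \in \mathcal{D}(\mathbb{R})). \]
If $K \Subset \mathbb{R}$ and $\varphi \in \mathcal{D}(K)$ we have $|\langle H, \varphi \rangle| \le \operatorname{diam}(K)\, \|\varphi\|_{L^\infty(K)}$, which proves (SN) with $m = 0$ and $C = \operatorname{diam}(K)$. Hence $H \in \mathcal{D}'(\mathbb{R})$.

(iii) The Dirac distribution ("$\delta$-function") at a point $x_0 \in \Omega$:
\[ \langle \delta_{x_0}, \varphi \rangle := \varphi(x_0) \quad (\varphi \in \mathcal{D}(\Omega)). \]
Linearity of $\delta_{x_0}$ is clear and
\[ |\langle \delta_{x_0}, \varphi \rangle| = |\varphi(x_0)| \le \|\varphi\|_{L^\infty} \]
shows that $\delta_{x_0} \in \mathcal{D}'(\Omega)$ (the estimate (SN) holds with $m = 0$ and $C = 1$).
If $\Omega = \mathbb{R}^n$ and $x_0 = 0$ it is common to write $\delta$ to mean $\delta_0$.

(iv) Exercise: Which of the following maps $\mathcal{D}(\mathbb{R}) \to \mathbb{C}$ define distributions?
(a) $\langle w_1, \varphi \rangle := \sum_{k=0}^{\infty} \varphi(k)$ [yes; $k$ leaves any compact subset]
(b) $\langle w_2, \varphi \rangle := \sum_{k=0}^{\infty} \varphi^{(k)}(\sqrt{2})$ [no; not defined for all test functions (see footnote 3)]
[furthermore, would require all derivatives on $K = \{\sqrt{2}\}$]
(c) $\langle w_3, \varphi \rangle := \sum_{k=1}^{\infty} \frac{1}{k}\, \varphi^{(k)}(k)$ [yes; $k$ leaves any compact subset]
(d) $\langle w_4, \varphi \rangle := \sum_{k=0}^{\infty} \varphi^{(k)}(k)$ [yes; $k$ leaves any compact subset]
(e) $\langle w_5, \varphi \rangle := \int_{\mathbb{R}} \varphi^2(x)\, dx$ [no; not a linear form]
(f) $\langle w_6, \varphi \rangle := \int_0^\infty \frac{\varphi(x) - \varphi(-x)}{x}\, dx$ [yes; but requires a little rewriting: the fundamental theorem of calculus gives
\[ \varphi(x) - \varphi(-x) = \int_{-x}^{x} \varphi'(s)\, ds = x \underbrace{\int_{-1}^{1} \varphi'(sx)\, ds}_{\psi(x) \text{ smooth}}; \]
for any $K \Subset \mathbb{R}$ we may choose $d > 0$ with $K \subseteq [-d, d]$, then for $\varphi \in \mathcal{D}(K)$ we have
\[ |\langle w_6, \varphi \rangle| = \Big| \int_0^d \frac{\varphi(x) - \varphi(-x)}{x}\, dx \Big| \le \int_0^d |\psi(x)|\, dx \le 2d\, \|\varphi'\|_{L^\infty([-d,d])}, \]
which yields the estimate (SN) with $m = 1$ and $C = 2d$.]
This distribution is called the principal value of $\frac{1}{x}$ and is denoted by $\operatorname{vp}(\frac{1}{x})$.

(Footnote 3) E.g. if $\varphi(x) = e^{x - \sqrt{2}}$ near $x = \sqrt{2}$.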
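1.28. REM (Regular distributions)

(i) As Example 1.27(i) shows, any continuous function defines a distribution. Moreover the assignment
\[ C(\Omega) \ni f \mapsto u_f \in \mathcal{D}'(\Omega) \]
is clearly linear. We claim that it is also injective: $u_f = 0$ in $\mathcal{D}'(\Omega)$ implies $0 = \langle u_f, \varphi \rangle = \int f \varphi$ for all $\varphi \in \mathcal{D}(\Omega)$; if $f \ne 0$ there is $x_0 \in \Omega$ such that $f(x_0) \ne 0$. Upon division by $f(x_0)$ we may assume that $f(x_0) = 1$ and, taking real parts, also that $f$ is real-valued. Then we may choose $\varepsilon > 0$ such that $B_\varepsilon(x_0) \subseteq \Omega$ and $f(x) > 1/2$ when $|x - x_0| < \varepsilon$. Let $\rho \in \mathcal{D}(\mathbb{R}^n)$ be a mollifier [Def. 1.12] satisfying $\rho \ge 0$, then $x \mapsto \rho((x - x_0)/\varepsilon)$ can be considered as a test function in $\mathcal{D}(\Omega)$ and we have
\[ 0 = \Big\langle u_f, \rho\Big(\frac{\cdot - x_0}{\varepsilon}\Big) \Big\rangle = \int_{B_\varepsilon(x_0)} f(x)\, \rho\Big(\frac{x - x_0}{\varepsilon}\Big)\, dx \ge \frac{1}{2} \int_{B_\varepsilon(x_0)} \rho\Big(\frac{x - x_0}{\varepsilon}\Big)\, dx = \frac{\varepsilon^n}{2} \int_{\mathbb{R}^n} \rho(x)\, dx = \frac{\varepsilon^n}{2} > 0, \]
a contradiction. Thus we obtain an embedding $C(\Omega) \hookrightarrow \mathcal{D}'(\Omega)$, which we henceforth understand when writing simply $C(\Omega) \subseteq \mathcal{D}'(\Omega)$.

(ii) Similar to (i) we even obtain an embedding of the space $L^1_{\mathrm{loc}}(\Omega)$, i.e. (classes of) Lebesgue measurable functions that are Lebesgue integrable on every compact subset of $\Omega$, into $\mathcal{D}'(\Omega)$. Hence we actually have $C(\Omega) \subseteq L^1_{\mathrm{loc}}(\Omega) \subseteq \mathcal{D}'(\Omega)$.
[To prove that $\int f \varphi = 0$ $\forall \varphi \in \mathcal{D}(\Omega)$ implies $f = 0$ in $L^1_{\mathrm{loc}}(\Omega)$ one can again use a regularization technique as in Theorem 1.13 above (for details cf. [LL01, Theorems 2.16 and 6.5]): define $f_\varepsilon := f * \rho_\varepsilon \in L^1_{\mathrm{loc}}$, then $\int_K |f_\varepsilon - f| \to 0$ ($\varepsilon \to 0$) on any compact subset $K$; by assumption $f_\varepsilon(x) = \int f(y)\, \rho_\varepsilon(x - y)\, dy = 0$, hence $\int_K |f| = \lim_{\varepsilon \to 0} \int_K |f - f_\varepsilon| = 0$ on every compact subset $K$, which implies $f = 0$ almost everywhere.]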
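1.29. DEF (Regular distribution) A distribution $u \in \mathcal{D}'(\Omega)$ is called regular, if $\exists f \in L^1_{\mathrm{loc}}(\Omega)$ such that $u = u_f$, i.e.
\[ \langle u, \varphi \rangle = \int_\Omega f(x)\, \varphi(x)\, dx = \langle u_f, \varphi \rangle \quad \forall \varphi \in \mathcal{D}(\Omega). \]
In this case we will often abuse notation and simply write $f$ instead of $u_f$ (thus $\langle f, \varphi \rangle$ instead of $\langle u_f, \varphi \rangle$).
1.30. Example ($\delta$ is not regular)

The Dirac distribution $\delta_{x_0}$ is not a regular distribution: Suppose the contrary, that is $\exists f \in L^1_{\mathrm{loc}}(\Omega)$ with $\delta_{x_0} = u_f$. Choose $\varphi \in \mathcal{D}(\mathbb{R}^n)$ with $\operatorname{supp}(\varphi) \subseteq B_1(0)$, $\varphi(0) = 1$, and define $\varphi_l(x) := \varphi(l(x - x_0))$ ($l \in \mathbb{N}$). Then $\operatorname{supp}(\varphi_l) \subseteq B_{1/l}(x_0)$, $\varphi_l(x_0) = 1$, and we have
\[ 1 = |\langle \delta_{x_0}, \varphi_l \rangle| \le \int_{B_{1/l}(x_0)} |f(x)|\, |\varphi(l(x - x_0))|\, dx \le \|\varphi\|_{L^\infty} \int_{B_{1/l}(x_0)} |f(x)|\, dx \to 0 \quad (l \to \infty) \]
(for the last step see footnote 4), a contradiction.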
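1.31. DEF (Distributions of finite order)
(i) A distribution $u \in \mathcal{D}'(\Omega)$ is said to be of finite order, if in the estimate (SN) the integer $m$ may be chosen uniformly for all $K$, i.e.
\[ \exists m \in \mathbb{N}_0\ \ \forall K \Subset \Omega\ \ \exists C > 0:\quad |\langle u, \varphi \rangle| \le C \sum_{|\alpha| \le m} \|\partial^\alpha \varphi\|_{L^\infty(K)} \quad (\forall \varphi \in \mathcal{D}(K)). \]
The minimal $m \in \mathbb{N}_0$ satisfying the above is then called the order of the distribution. The space of all distributions of order less than or equal to $m$ is denoted by $\mathcal{D}'^m(\Omega)$.
(ii) The subspace of all distributions of finite order is denoted by $\mathcal{D}'_F(\Omega)$. We have
\[ \mathcal{D}'_F(\Omega) = \bigcup_{m \in \mathbb{N}_0} \mathcal{D}'^m(\Omega). \]

(Footnote 4) Apply the dominated convergence theorem, which we will recall in 1.40 below.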
1.32. Examples (Some (non-)finite order distributions)

(i) A regular distribution is of order 0 (cf. 1.27(i)).
(ii) $\delta_{x_0}$ is of order 0 (cf. 1.27(iii)).
(iii) Let $|\alpha| = m$ and define $u \in \mathcal{D}'(\mathbb{R}^n)$ by $\langle u, \varphi \rangle := \partial^\alpha \varphi(0)$. Then $u$ is of order $m$.
(iv) There exist distributions that are not of finite order. E.g. consider $u \in \mathcal{D}'(\mathbb{R})$ defined by
\[ \langle u, \varphi \rangle := \sum_{k=0}^{\infty} \varphi^{(k)}(k). \]
If $K \Subset \mathbb{R}$ then we have to choose $m \ge \sup\{k \in \mathbb{N}_0 \mid k \in K\}$ to ensure (SN). There can be no $m$ such that (SN) holds with this fixed $m$ and for all compact subsets $K$. The farther outward $\operatorname{supp}(\varphi)$ reaches, the higher the derivatives that have to be taken into account.
1.33. Motivation: (Finite order distributions as functionals on $\mathcal{D}^m$) For a distribution of order $m$ the continuity condition (SN) involves only derivatives up to order $m$ of the test functions. Thus we expect $\mathcal{D}'^m(\Omega)$ to be the dual of $\mathcal{D}^m(\Omega)$, where the continuity conditions on the linear functionals are analogous to 1.24(i) and (SN). The technical problem with this identification is that given $u \in \mathcal{D}'^m(\Omega)$ we have to extend it to a linear functional on $\mathcal{D}^m(\Omega)$, which is strictly larger than $\mathcal{D}(\Omega)$. The precise result is Proposition 1.34 below.
We remark that, in particular, distributions of order 0 define continuous linear forms on $C_c(\Omega)$. Therefore they can be identified with complex Radon measures on $\Omega$ (cf. [Fol99, Sections 7.1-3]).
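1.34. PROP ($\mathcal{D}'^m$ is the dual of $\mathcal{D}^m$)

(i) Every $u \in \mathcal{D}'^m(\Omega)$ can be uniquely extended to a continuous linear form on $\mathcal{D}^m(\Omega)$.
(ii) Conversely, if $u$ is a continuous linear form on $\mathcal{D}^m(\Omega)$ then $u|_{\mathcal{D}(\Omega)} \in \mathcal{D}'^m(\Omega)$.

Proof: (i) Let $u \in \mathcal{D}'^m(\Omega)$, then we have: $\forall K \Subset \Omega$ $\exists C > 0$ (depending on $K$)
\[ (*) \qquad |\langle u, \varphi \rangle| \le C \sum_{|\alpha| \le m} \|\partial^\alpha \varphi\|_{L^\infty(K)} \quad \forall \varphi \in \mathcal{D}(K). \]
Let $\varphi \in \mathcal{D}^m(\Omega)$. By Remarks 1.14(i) and 1.20 there is a sequence $(\varphi_l)$ in $\mathcal{D}(\Omega)$ such that $\varphi_l \to \varphi$ in $\mathcal{D}^m(\Omega)$ (as $l \to \infty$). That is, there exists $K_0 \Subset \Omega$ with $\operatorname{supp}(\varphi) \subseteq K_0$ and $\operatorname{supp}(\varphi_l) \subseteq K_0$ for all $l$ such that for all $\alpha$ with $|\alpha| \le m$ we have $\partial^\alpha \varphi_l \to \partial^\alpha \varphi$ uniformly on $K_0$.
In particular, for $|\alpha| \le m$ we obtain a Cauchy sequence $(\partial^\alpha \varphi_l)$ with respect to the $L^\infty$-norm on $K_0$.
Choosing $C_0 > 0$ according to $(*)$ we obtain
\[ |\langle u, \varphi_k - \varphi_l \rangle| \le C_0 \sum_{|\alpha| \le m} \|\partial^\alpha \varphi_k - \partial^\alpha \varphi_l\|_{L^\infty(K_0)}, \]
which implies that $(\langle u, \varphi_l \rangle)$ is a Cauchy sequence in $\mathbb{C}$, hence possesses a limit $\bar u(\varphi) := \lim \langle u, \varphi_l \rangle$.
By a standard sequence mixing argument we see that the value $\bar u(\varphi)$ is independent of the approximating sequence $(\varphi_l)$. Linearity with respect to $\varphi$ is clear, hence we obtain a linear form $\bar u$ on $\mathcal{D}^m(\Omega)$. Moreover
\[ |\langle \bar u, \varphi \rangle| = \lim |\langle u, \varphi_l \rangle| \le \lim C_0 \sum_{|\alpha| \le m} \|\partial^\alpha \varphi_l\|_{L^\infty(K_0)} = C_0 \sum_{|\alpha| \le m} \|\partial^\alpha \varphi\|_{L^\infty(K_0)} \]
shows (sequential) continuity of $\bar u$. That $\bar u|_{\mathcal{D}(\Omega)} = u$ follows from sequential continuity of $u$, and uniqueness of $\bar u$ as an extension of $u$ follows from the density of $\mathcal{D}(\Omega)$ in $\mathcal{D}^m(\Omega)$ (observed in 1.14(i) and 1.20 already).
(ii) Is clear since sequential continuity of $u$ implies the same for $u|_{\mathcal{D}(\Omega)}$.
[Convergent sequences in $\mathcal{D}(\Omega)$ converge also in $\mathcal{D}^m(\Omega)$ and have the same limit.]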

1.3. CONVERGENCE OF DISTRIBUTIONS
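1.35. DEF (Sequential convergence in $\mathcal{D}'$) Let $(u_l)$ be a sequence in $\mathcal{D}'(\Omega)$ and $u \in \mathcal{D}'(\Omega)$. We say that
(i) $(u_l)$ converges to $u$ in $\mathcal{D}'(\Omega)$, $u_l \to u$ ($l \to \infty$), if
\[ \lim_{l \to \infty} \langle u_l, \varphi \rangle = \langle u, \varphi \rangle \quad \forall \varphi \in \mathcal{D}(\Omega); \]
(ii) $(u_l)$ is a Cauchy sequence in $\mathcal{D}'(\Omega)$, if $(\langle u_l, \varphi \rangle)_{l \in \mathbb{N}}$ is a Cauchy sequence in $\mathbb{C}$ for all $\varphi \in \mathcal{D}(\Omega)$.
(iii) When considering nets of the type $(u_\varepsilon)_{0 < \varepsilon \le 1}$ or $(u_t)_{1 < t < \infty}$ convergence (as $\varepsilon \to 0$ or $t \to \infty$, respectively) and the concept of a Cauchy net (see footnote 5) are defined similarly.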
1.36. REM (On $\mathcal{D}'$-convergence) The notion of convergence introduced above is often called pointwise convergence on $\mathcal{D}(\Omega)$ (since we require that $u_l(\varphi) \to u(\varphi)$) or weak convergence of distributions.
In fact, it is weak*-convergence in the sense of locally convex vector space theory for the dual pairing $(\mathcal{D}', \mathcal{D})$. This notion of convergence stems from the weak topology on $\mathcal{D}'$, denoted by $\sigma(\mathcal{D}', \mathcal{D})$, which is generated by the following family of seminorms
\[ p_\varphi(u) := |\langle u, \varphi \rangle| \quad (\varphi \in \mathcal{D}(\Omega)). \]
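1.37. Example (Approximating $\delta$ and $\operatorname{vp}(1/x)$)

(i) Delta nets: Let $\rho \in C_c(\mathbb{R}^n)$ with $\operatorname{supp}(\rho) \subseteq \overline{B_1(0)}$ and $\int \rho = 1$ (in particular, any mollifier would be admissible). We set
\[ \rho_\varepsilon(x) := \frac{1}{\varepsilon^n}\, \rho\Big(\frac{x}{\varepsilon}\Big). \]
Claim: $\rho_\varepsilon \to \delta$ in $\mathcal{D}'(\mathbb{R}^n)$ ($\varepsilon \to 0$).
Proof: (Compare with 1.13.) Let $\varphi \in \mathcal{D}(\mathbb{R}^n)$, then with the substitution $y = x/\varepsilon$, $dy = dx/\varepsilon^n$,
\[ \langle \rho_\varepsilon, \varphi \rangle = \int \rho\Big(\frac{x}{\varepsilon}\Big)\, \varphi(x)\, \frac{dx}{\varepsilon^n} = \int \rho(y)\, \varphi(\varepsilon y)\, dy \to \int \rho(y)\, \varphi(0)\, dy = \varphi(0) = \langle \delta, \varphi \rangle \quad (\varepsilon \to 0) \]
[since $\varphi(\varepsilon y) \to \varphi(0)$ uniformly for $y \in \operatorname{supp}(\rho)$].
We stress the point that the above result (as well as its proof) does not refer to the explicit shape of $\rho$. This "shapelessness of $\delta$" is of high practical value in applications of distribution theory, in particular, to linear partial differential equations.

(ii) Cauchy principal value: We define the net $(v_\varepsilon)_{0 < \varepsilon \le 1}$ of distributions on $\mathbb{R}$ by
\[ \langle v_\varepsilon, \varphi \rangle := \int_{|x| > \varepsilon} \frac{\varphi(x)}{x}\, dx \quad \forall \varphi \in \mathcal{D}(\mathbb{R}),\ \varepsilon \in\, ]0, 1]. \]
To see that $v_\varepsilon \in \mathcal{D}'(\mathbb{R})$ we could use the fact that the function $f_\varepsilon$, defined by $f_\varepsilon(x) = 1/x$ when $|x| > \varepsilon$, and $f_\varepsilon(x) = 0$ when $|x| \le \varepsilon$, belongs to $L^1_{\mathrm{loc}}(\mathbb{R})$. Alternatively, given $K \Subset \mathbb{R}$ and $\varphi \in \mathcal{D}(K)$, we can directly establish the seminorm estimate (SN) as follows: choose $R > 0$ with $K \subseteq [-R, R]$, then
\[ |\langle v_\varepsilon, \varphi \rangle| \le \int_{\varepsilon < |x| \le R} \frac{|\varphi(x)|}{|x|}\, dx \le \|\varphi\|_{L^\infty(K)} \cdot 2 \int_\varepsilon^R \frac{dx}{x} = 2 \log\Big(\frac{R}{\varepsilon}\Big)\, \|\varphi\|_{L^\infty(K)}, \]
hence (SN) holds (with $m = 0$ and $C = 2 \log(R/\varepsilon)$).

$(v_\varepsilon)$ converges in $\mathcal{D}'(\mathbb{R})$ (as $\varepsilon \to 0$): Let $\varphi \in \mathcal{D}(\mathbb{R})$ with $\operatorname{supp}(\varphi) \subseteq [-R, R]$. Then
\[ \langle v_\varepsilon, \varphi \rangle = \int_{\varepsilon < |x| \le R} \frac{\varphi(x)}{x}\, dx = \int_{-R}^{-\varepsilon} \frac{\varphi(x)}{x}\, dx + \int_\varepsilon^R \frac{\varphi(x)}{x}\, dx = \int_\varepsilon^R \frac{\varphi(x) - \varphi(-x)}{x}\, dx =: (*). \]
As in Example 1.27(iv)(f) we appeal to the fundamental theorem of calculus and write $\varphi(x) - \varphi(-x) = \int_{-x}^{x} \varphi'(s)\, ds = x \int_{-1}^{1} \varphi'(sx)\, ds = x\, \psi(x)$, where $\psi$ is smooth. Hence we have
\[ (*) = \int_\varepsilon^R \frac{x\, \psi(x)}{x}\, dx = \int_\varepsilon^R \psi(x)\, dx \to \int_0^R \psi(x)\, dx \quad (\varepsilon \to 0), \]
which shows that $\mathcal{D}'$-$\lim v_\varepsilon$ agrees with the distribution from Example 1.27(iv)(f). This justifies the following definition.

(Footnote 5) E.g. in case of $(u_\varepsilon)_{0 < \varepsilon \le 1}$: $\forall \varphi \in \mathcal{D}(\Omega)$ and $\forall \eta > 0$ we can find $0 < \varepsilon_0 \le 1$ such that $|\langle u_{\varepsilon_1} - u_{\varepsilon_2}, \varphi \rangle| < \eta$ whenever $0 < \varepsilon_1, \varepsilon_2 < \varepsilon_0$.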
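For the delta nets of Example 1.37(i), the convergence $\langle \rho_\varepsilon, \varphi \rangle \to \varphi(0)$ can be watched numerically in dimension one. This is a sketch with our own choices: $\rho$ is a normalized bump represented by discrete quadrature weights, and the test function $\varphi(x) = (1 + x)\,\mathrm{bump}(x/2)$ is deliberately not even; the names `bump`, `phi`, `pair_rho_eps` are hypothetical.

```python
import math

def bump(s):
    """Unnormalized bump supported in [-1, 1]."""
    return math.exp(-1.0 / (1.0 - s * s)) if abs(s) < 1.0 else 0.0

N = 2000
NODES = [-1.0 + (i + 0.5) * 2.0 / N for i in range(N)]
TOTAL = sum(bump(z) for z in NODES)
WEIGHTS = [bump(z) / TOTAL for z in NODES]   # discrete stand-in for rho(y) dy

def phi(x):
    """A test function with phi(0) = exp(-1) and support [-2, 2]."""
    return (1.0 + x) * bump(x / 2.0)

def pair_rho_eps(eps):
    """<rho_eps, phi> after the substitution y = x/eps: sum of rho(y) phi(eps*y) weights."""
    return sum(w * phi(eps * y) for y, w in zip(NODES, WEIGHTS))

GAPS = [abs(pair_rho_eps(eps) - phi(0.0)) for eps in (0.5, 0.25, 0.125)]
```

Replacing `bump` in the weights by any other non-negative profile of mass 1 supported in $[-1, 1]$ should give the same limit, which is the "shapelessness" stressed above.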

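1.38. DEF We define the (Cauchy) principal value of $\frac{1}{x}$, denoted by $\operatorname{vp}(\frac{1}{x})$, by its action on any $\varphi \in \mathcal{D}(\mathbb{R})$ in the form
\[ \Big\langle \operatorname{vp}\Big(\frac{1}{x}\Big), \varphi \Big\rangle := \lim_{\varepsilon \to 0} \int_{|x| > \varepsilon} \frac{\varphi(x)}{x}\, dx = \int_0^\infty \frac{\varphi(x) - \varphi(-x)}{x}\, dx. \]
Warning: A function $f(x) = \frac{1}{x}$ ($x \ne 0$), $f(0)$ arbitrary, cannot define an $L^1_{\mathrm{loc}}$-class on $\mathbb{R}$, hence $\frac{1}{x}$ is not a regular distribution. The closest we can get to interpreting $\frac{1}{x}$ as a distribution on $\mathbb{R}$ is $\operatorname{vp}(\frac{1}{x})$.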
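The two expressions in Definition 1.38 can be compared numerically. The following sketch uses a bump centred at $0.3$ as test function (our choice, so that $\varphi$ is not even and the principal value is non-zero) and a plain midpoint rule; `truncated` and `symmetric` are hypothetical helper names.

```python
import math

def phi(x):
    """Test function: a bump centred at 0.3 with support [-0.7, 1.3]."""
    s = x - 0.3
    return math.exp(-1.0 / (1.0 - s * s)) if abs(s) < 1.0 else 0.0

def integrate(g, a, b, n=20000):
    """Composite midpoint rule on [a, b]; the nodes never hit x = 0 below."""
    step = (b - a) / n
    return step * sum(g(a + (i + 0.5) * step) for i in range(n))

R = 1.5  # supp(phi) is contained in [-R, R]

def truncated(eps):
    """<v_eps, phi>: integral of phi(x)/x over eps < |x| <= R."""
    g = lambda x: phi(x) / x
    return integrate(g, -R, -eps) + integrate(g, eps, R)

def symmetric():
    """Integral over [0, R] of (phi(x) - phi(-x)) / x."""
    return integrate(lambda x: (phi(x) - phi(-x)) / x, 0.0, R)

PV = symmetric()
GAPS = [abs(truncated(eps) - PV) for eps in (0.2, 0.1, 0.05)]
```

The gap between the truncated and the symmetric form equals $|\int_0^\varepsilon \psi(x)\,dx| \le 2\varepsilon \|\varphi'\|_{L^\infty}$, so it shrinks linearly in $\varepsilon$; for this particular $\varphi$ one has $\|\varphi'\|_{L^\infty} \le 1$.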
1.39. REM (Convergence of regular distributions) $\mathcal{D}'$-convergence of a sequence $(u_k)$
of regular distributions, e.g. continuous functions, does not imply pointwise convergence
almost everywhere. For example, consider $u_k \in C(\mathbb{R})$ given by
\[ u_k(x) = e^{ikx} \qquad (x \in \mathbb{R};\ k \in \mathbb{N}). \]
Sending $k \to \infty$ we have
(a) $(u_k(x))_{k\in\mathbb{N}}$ converges iff $x \in 2\pi\mathbb{Z}$, but
(b) $(u_k)$ converges to $0$ in $\mathcal{D}'(\mathbb{R})$: let $\varphi \in \mathcal{D}(\mathbb{R})$, then integration by parts gives
\[ \langle u_k, \varphi \rangle = \int e^{ikx} \varphi(x)\,dx = -\frac{1}{ik} \int e^{ikx} \varphi'(x)\,dx \to 0, \]
since the last integral is bounded (by $\|\varphi'\|_{L^1}$) uniformly in $k$.
(This is of course just a disguised form of Riemann's lemma on Fourier coefficients,
or, more generally speaking, a special case of the Riemann-Lebesgue lemma on the
Fourier transform; cf. [For06, §19, Satz 6] and [For84, §12, Corollar 2] or Lemma
6.20, below.)
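The integration-by-parts bound $|\langle u_k, \varphi \rangle| \le \|\varphi'\|_{L^1}/k$ from Remark 1.39 can be observed numerically. This is a hedged sketch: the test function `bump`, its hand-computed derivative `bump_prime`, and the midpoint quadrature are our own illustrative assumptions.

```python
import cmath
import math

def bump(x):
    """Smooth test function supported in [-1, 1] (our choice)."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def bump_prime(x):
    """Classical derivative of bump, computed by the chain rule."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def midpoint(f, a, b, n=20_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (j + 0.5) * h) for j in range(n))

def pairing(k):
    """<u_k, phi> for u_k(x) = exp(i k x)."""
    return midpoint(lambda x: cmath.exp(1j * k * x) * bump(x), -1.0, 1.0)

l1_norm_of_derivative = midpoint(lambda x: abs(bump_prime(x)), -1.0, 1.0)
ks = (1, 4, 16, 64)
values = {k: abs(pairing(k)) for k in ks}
bounds = {k: l1_norm_of_derivative / k for k in ks}
for k in ks:
    print(k, values[k], bounds[k])
```

For a smooth test function the pairing decays much faster than the $1/k$ bound suggests, because one may integrate by parts repeatedly.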
We recall an important classical result by Lebesgue which allows us to deduce distributional
convergence of a sequence of functions from convergence almost everywhere.
1.40. THM (Dominated convergence) Let $(f_k)$ be a sequence in $L^1(\Omega)$ and $f: \Omega \to \mathbb{C}$
such that
(i) for almost all $x \in \Omega$: $f(x) = \lim_{k\to\infty} f_k(x)$,
(ii) $\exists g \in L^1(\Omega)$ $\forall k \in \mathbb{N}$: $|f_k| \le g$ (almost everywhere).
Then $f \in L^1(\Omega)$ and
\[ \int_\Omega f(x)\,dx = \lim_{k\to\infty} \int_\Omega f_k(x)\,dx \qquad \Big[\text{i.e., } \int \lim f_k = \lim \int f_k\Big]. \]
(For a proof see [For84, §9, Satz 2] or [Fol99, Theorem 2.24] or [LL01, Theorem 1.8].)
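1.41. THM (Weak via dominated convergence) Let $(f_k)$ be a sequence in $L^1_{loc}(\Omega)$ and
$f: \Omega \to \mathbb{C}$ such that
(i) for almost all $x \in \Omega$: $f(x) = \lim_{k\to\infty} f_k(x)$,
(ii) $\forall K \Subset \Omega$ $\exists g \in L^1(\Omega)$ $\forall k \in \mathbb{N}$: $|f_k| \le g$ (almost everywhere on $K$).
Then $f \in L^1_{loc}(\Omega)$ and $f_k \to f$ in $\mathcal{D}'(\Omega)$ (as $k \to \infty$).
Proof: Let $\varphi \in \mathcal{D}(\Omega)$ and set $K := \mathrm{supp}(\varphi)$. Choose $g$ according to (ii), then we clearly
have $f_k(x)\varphi(x) \to f(x)\varphi(x)$ ($k \to \infty$) for almost all $x$ and $|f_k \varphi| \le g\,|\varphi| \in L^1(\Omega)$ almost
everywhere on $K$. Thus, Lebesgue's dominated convergence theorem gives (as $k \to \infty$)
\[ \langle f_k, \varphi \rangle = \int f_k(x)\varphi(x)\,dx \to \int f(x)\varphi(x)\,dx = \langle f, \varphi \rangle. \]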

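1.42. COR and Example (Uniform and $\mathcal{D}'$-convergence)

(i) Let $f_k$ ($k \in \mathbb{N}$) and $f$ be functions in $C(\Omega)$ such that
$f_k \to f$ ($k \to \infty$) uniformly on compact sets.
Then $f_k \to f$ in $\mathcal{D}'(\Omega)$.

(ii) As a special case of (i) consider a power series $\sum_{k=0}^{\infty} a_k x^k$ on $\mathbb{C}$ with radius of
convergence $R > 0$. Let $\Omega := B_R(0) \subseteq \mathbb{R}^2$ (upon identification $(x_1, x_2) = (\mathrm{Re}(x), \mathrm{Im}(x))$) and
denote by $f: \Omega \to \mathbb{C}$ the (analytic) function defined by the power series, i.e. as limit of
the polynomial functions $p_N(x) := \sum_{k=0}^{N} a_k x^k$ ($x \in \Omega$). Then we have in the sense of
convergence in $\mathcal{D}'(\Omega)$ that $f = \lim p_N$, or with a convenient abuse of notation
\[ \langle f, \varphi \rangle = \Big\langle \sum_{k=0}^{\infty} a_k x^k, \varphi \Big\rangle = \lim_{N\to\infty} \sum_{k=0}^{N} \langle a_k x^k, \varphi \rangle \qquad \forall \varphi \in \mathcal{D}(\Omega). \]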

1.43. REM (The issue of sequential completeness of $\mathcal{D}'$) The concepts of convergence
and Cauchy sequence in $\mathcal{D}'$ are defined via the respective notions in $\mathbb{C}$ (by considering
evaluations on test functions). One might be led to expect sequential completeness of
$\mathcal{D}'$ to be an easy consequence of that of $\mathbb{C}$. However, the case is not as simple as a purely
finite-dimensional intuition would suggest: If $(u_n)$ is a Cauchy sequence in $\mathcal{D}'$, then we know
that for every test function $\varphi$ we have a limit
\[ u(\varphi) := \lim_{n\to\infty} \langle u_n, \varphi \rangle, \]
which in turn defines a linear functional $u$. But continuity of $u$ cannot be shown
without a deeper investigation of convergence in $\mathcal{D}$ and its interplay with the seminorm
estimate (SN), or, alternatively, appealing to the uniform boundedness principle for
Fréchet spaces.
Technically, continuity of the distributional limit $u$ requires interchanging the limit
$u_n \to u$ with $\varphi_m \to \varphi$, i.e. we have to show that
\[ \lim_{m} \langle u, \varphi_m \rangle = \lim_{m} \lim_{n} \langle u_n, \varphi_m \rangle \overset{!}{=} \lim_{n} \lim_{m} \langle u_n, \varphi_m \rangle = \lim_{n} \langle u_n, \varphi \rangle = \langle u, \varphi \rangle. \]
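1.44. THM $\mathcal{D}'(\Omega)$ is sequentially complete.
Proof: Let $(u_n)$ be a Cauchy sequence in $\mathcal{D}'(\Omega)$, i.e. $\forall \varphi \in \mathcal{D}(\Omega)$ we obtain a Cauchy
sequence $(\langle u_n, \varphi \rangle)_{n\in\mathbb{N}}$. We define a candidate for the $\mathcal{D}'$-limit $u$ by
\[ \langle u, \varphi \rangle := \lim_{n\to\infty} \langle u_n, \varphi \rangle \qquad \forall \varphi \in \mathcal{D}(\Omega). \]
It remains to prove that $u$ is continuous. To achieve this we show the seminorm estimate
(SN) using the uniform boundedness principle for the Fréchet spaces$^6$ $\mathcal{D}(K)$ (with $K \Subset \Omega$).
Let $K \Subset \Omega$, then (SN) applied to $u_n$ tells us that there are $C_n > 0$ and $m_n \in \mathbb{N}$ such
that
\[ (*) \qquad |\langle u_n, \varphi \rangle| \le C_n \sum_{|\alpha|\le m_n} \|\partial^\alpha \varphi\|_{L^\infty(K)} \qquad \forall \varphi \in \mathcal{D}(K), \]
which means that $\{u_n \mid n \in \mathbb{N}\}$ is pointwise bounded on $\mathcal{D}(K)$. By the principle of uniform
boundedness we obtain that the same set is also strongly bounded, which implies that
the constants $C_n$ and the orders $m_n$ are uniformly bounded (with respect to $n$), say by
$C$ and $m$, respectively. Hence we obtain
\[ |\langle u, \varphi \rangle| = \lim_{n\to\infty} |\langle u_n, \varphi \rangle| \le C \sum_{|\alpha|\le m} \|\partial^\alpha \varphi\|_{L^\infty(K)} \qquad \forall \varphi \in \mathcal{D}(K). \]
Therefore we have shown that (SN) holds for $u$ as well.

$^6$ Cf. [Sch66, Chapter III, §4, 4.2 and Chapter II, §7, Corollary to 7.1] or [Hor66, Chapter 3, §6,
Proposition 2 and Corollary to Proposition 3].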
1.45. REM (More on the completeness of $\mathcal{D}'$)

(i) The above proof can be easily adapted to show $\mathcal{D}'$-convergence of Cauchy nets
$(u_\varepsilon)_{\varepsilon \in\, ]0,1]}$ or $(u_t)_{t \in\, ]1,\infty[}$.
(ii) In the above proof we obtained the seminorm estimate (SN) for $K \Subset \Omega$ uniformly for
all $u_n$ and $\varphi \in \mathcal{D}(K)$. If $\varphi_n \to \varphi$ in $\mathcal{D}(\Omega)$ and $K$ is a compact set containing $\mathrm{supp}(\varphi_n)$
(for all $n$) and $\mathrm{supp}(\varphi)$, then (SN) implies that $\langle u_n, \varphi_n - \varphi \rangle \to 0$ ($n \to \infty$). Hence
$\langle u_n, \varphi_n \rangle = \langle u_n, \varphi_n - \varphi \rangle + \langle u_n, \varphi \rangle \to 0 + \langle u, \varphi \rangle$. Therefore we have shown:
If $u_n \to u$ in $\mathcal{D}'(\Omega)$ and $\varphi_n \to \varphi$ in $\mathcal{D}(\Omega)$, then $\langle u_n, \varphi_n \rangle \to \langle u, \varphi \rangle$ ($n \to \infty$),
i.e., that the map $\langle\,\cdot\,,\,\cdot\,\rangle : \mathcal{D}'(\Omega) \times \mathcal{D}(\Omega) \to \mathbb{C}$ is jointly sequentially continuous.
(Similar results hold for nets with an interval as index set; cf. [Hor66, Chapter 4, §1, Prop. 2].)
(iii) In terms of the topology $\sigma(\mathcal{D}', \mathcal{D})$ the above theorem states that $\mathcal{D}'(\Omega)$ is weak-$*$-sequentially
complete. However, $\mathcal{D}'(\Omega)$ is not weak-$*$-complete: The $\sigma(\mathcal{D}', \mathcal{D})$-completion
of $\mathcal{D}'(\Omega)$ is $\mathcal{D}^*(\Omega) = (\mathcal{D}(\Omega))^*$ (the algebraic dual of $\mathcal{D}(\Omega)$; this follows from the general
theory of dual systems, as mentioned in [Sch66, Chapter IV, §6, page 148]). Note that
$\mathcal{D}^*(\Omega)$ is strictly larger than $\mathcal{D}'(\Omega)$; for an example of a discontinuous functional on
$\mathcal{D}(\Omega)$ see [Obe0x, Example 10].
(iv) On the other hand, $\mathcal{D}'(\Omega)$ is complete with respect to the strong topology $\beta(\mathcal{D}', \mathcal{D})$,
and weakly convergent sequences in $\mathcal{D}'$ are strongly convergent (cf. [Hor66, Chapter 4,
§1, Proposition 2]).
(v) We mention the result that $\mathcal{D}(\Omega)$ is sequentially dense in $\mathcal{D}'(\Omega)$, which we shall prove
in Theorem 4.10 below. While our proof will be based on regularization by convolution
(see 4.9 or [Hor90, Theorem 4.1.5]) there is also a functional analytic proof (see [Hor66,
Chapter 4, §1, Proposition 3]).
(vi) There exist more elementary proofs of the weak-$*$-sequential completeness of $\mathcal{D}'(\Omega)$,
i.e. without recourse to Baire's theorem (from topology) in terms of the uniform boundedness
principle. E.g., one variant can be found in [FL74, Kapitel IV, Satz 8.1]. We
present yet another variant of an elementary proof below, which is inspired by classical
proofs of the Banach-Steinhaus theorem for $L^2$ (based on the "Methode der gleitenden
Buckel", [RN82, §31]; cf. [Wer05, Section IV.9, p.190] for notes on the history).

insert RO notes in small print

1.4. LOCALIZATION AND SUPPORT
1.46. Motivation By its very definition $u \in \mathcal{D}'(\Omega)$ is a map $\mathcal{D}(\Omega) \to \mathbb{C}$, hence
it cannot be localized to a point $x \in \Omega$ as is naturally possible for a function $f: \Omega \to \mathbb{C}$.
Nevertheless, a distribution may be localized to open subsets $\Omega' \subseteq \Omega$ and the support
of a distribution can be defined. The main result of this paragraph is that distributions
actually are characterized by their localizations.
1.47. DEF (Localization to open subsets) Let $\Omega' \subseteq \Omega$ be open subsets of $\mathbb{R}^n$:

(i) Let $u \in \mathcal{D}'(\Omega)$. Considering $\mathcal{D}(\Omega') \subseteq \mathcal{D}(\Omega)$ (by trivial extension of test functions on $\Omega'$ to $\Omega$)
we obtain the restriction $u|_{\mathcal{D}(\Omega')}$ which maps
\[ \mathcal{D}(\Omega') \to \mathbb{C}, \qquad \varphi \mapsto \langle u, \varphi \rangle, \]
and clearly belongs to $\mathcal{D}'(\Omega')$. We call this the restriction or localization of $u$ to
$\Omega'$ and denote it by $u|_{\Omega'}$.
(ii) $u, v \in \mathcal{D}'(\Omega)$ are said to be equal on $\Omega'$ if $u|_{\Omega'} = v|_{\Omega'}$.

partitions of unity: to be done later
1.48. THM+DEF
1.49. COR
1.50. REM
1.51. COR
1.52. REM
1.53. DEF (Support of a distribution) Let $u \in \mathcal{D}'(\Omega)$ and define
\[ Z(u) := \{x \in \Omega \mid u = 0 \text{ in some open neighborhood of } x\}, \qquad \mathrm{supp}(u) := \Omega \setminus Z(u). \]
The subset $\mathrm{supp}(u)$ is called the support of $u$.

1.54. Observation
Note that $Z(u)$ is the largest open subset of $\Omega$ where $u$ vanishes; in particular, $\mathrm{supp}(u)$
is a closed subset of $\Omega$ (in the topology of $\Omega$).
Also $x_0 \in \mathrm{supp}(u)$ if and only if for all open neighborhoods $V$ of $x_0$ there exists some
$\varphi \in \mathcal{D}(V)$ with $\langle u, \varphi \rangle \ne 0$.
Finally, observe that $u = 0$ if and only if $\mathrm{supp}(u) = \emptyset$.
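1.55. Examples (Support of some distributions)

(i) Clearly $\delta = 0$ on every open subset $U \subseteq \mathbb{R}^n \setminus \{0\}$. On the other hand, if $U \subseteq \mathbb{R}^n$ is
open with $0 \in U$ then $\langle \delta, \varphi \rangle = \varphi(0) \ne 0$ for appropriately chosen $\varphi \in \mathcal{D}(U)$. Therefore
$\mathrm{supp}(\delta) = \{0\}$.
(ii) Let $f \in C(\Omega)$. By 1.28(i) we have that $Z(u_f) = \Omega \setminus \mathrm{supp}(f)$, where $\mathrm{supp}(f)$ is as
defined in 1.6. Hence we obtain
\[ \mathrm{supp}(u_f) = \mathrm{supp}(f) \]
and the two notions of support agree in case of a continuous function.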
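1.56. PROP (Disjoint supports yield value 0)

Let $u \in \mathcal{D}'(\Omega)$ and $\varphi \in \mathcal{D}(\Omega)$ with $\mathrm{supp}(u) \cap \mathrm{supp}(\varphi) = \emptyset$. Then $\langle u, \varphi \rangle = 0$.
Proof: Let $K := \mathrm{supp}(\varphi)$. Since $\mathrm{supp}(u) \cap K = \emptyset$ we have:
$\forall x \in K$ $\exists$ a neighborhood $U_x$ of $x$ such that $u|_{U_x} = 0$.
Choose a finite subcovering $(U_{x_i})_{i=1}^m$ and a subordinate partition of unity $(\chi_i)_{i=1}^m$ as
in Corollary 1.51, that is, with $\chi_j \in \mathcal{D}(U_{x_j})$ and $\sum_{i=1}^m \chi_i = 1$ on a neighborhood of $K$.
Then, since $\sum_{i=1}^m \chi_i \varphi = \varphi$ and $\chi_i \varphi \in \mathcal{D}(U_{x_i})$,
\[ \langle u, \varphi \rangle = \Big\langle u, \sum_{i=1}^m \chi_i \varphi \Big\rangle = \sum_{i=1}^m \langle u, \chi_i \varphi \rangle = 0. \]
1.57. COR (Vanishing locally everywhere implies 0) Let $u \in \mathcal{D}'(\Omega)$.

If $\forall x \in \Omega$ $\exists$ a neighborhood $U_x$ of $x$ such that $u|_{U_x} = 0$, then $u = 0$ in $\mathcal{D}'(\Omega)$.
Proof: We have $Z(u) = \Omega$, hence $\mathrm{supp}(u) = \emptyset$. Proposition 1.56 then implies $\langle u, \varphi \rangle = 0$
for all $\varphi \in \mathcal{D}(\Omega)$. Thus $u = 0$.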

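1.58. THM (Localizations determine a distribution) Let $I$ be a set and $(\Omega_i)_{i\in I}$ be an
open covering of $\Omega$. For every $i \in I$ let $u_i \in \mathcal{D}'(\Omega_i)$ such that the following holds:
\[ (*) \qquad u_i|_{\Omega_i \cap \Omega_j} = u_j|_{\Omega_i \cap \Omega_j} \qquad \forall i, j \in I \text{ with } \Omega_i \cap \Omega_j \ne \emptyset. \]
Then $\exists!\, u \in \mathcal{D}'(\Omega)$ with $u|_{\Omega_j} = u_j$ for all $j \in I$.
(A collection of distributions $u_i \in \mathcal{D}'(\Omega_i)$ satisfying $(*)$ is called a coherent family.)
Proof: Uniqueness: Let $u, v \in \mathcal{D}'(\Omega)$ such that $\forall i \in I$: $u|_{\Omega_i} = u_i = v|_{\Omega_i}$.
Set $w := u - v$, then $w|_{\Omega_i} = 0$ for all $i \in I$ and 1.57 implies $w = 0$, thus $u = v$.
Existence: Let $K \Subset \Omega$. Since $K \subseteq \bigcup_{i\in I} \Omega_i$ we may pick a finite subcovering: $\exists\, i_1, \ldots, i_m \in I$
such that $K \subseteq \bigcup_{l=1}^m \Omega_{i_l}$. According to 1.51 we can find a subordinate partition of unity,
i.e. $\chi_l \in \mathcal{D}(\Omega_{i_l})$ ($l = 1, \ldots, m$) with $\sum_{l=1}^m \chi_l = 1$ in a neighborhood of $K$. For every
compact subset $K$ of $\Omega$ we choose a corresponding partition of unity.
Now we define the action of $u$ on $\varphi \in \mathcal{D}(\Omega)$ as follows: Let $K := \mathrm{supp}(\varphi)$ and $\chi_1, \ldots, \chi_m$
be the partition of unity chosen above. Then we set
\[ (**) \qquad \langle u, \varphi \rangle := \sum_{l=1}^m \langle u_{i_l}, \chi_l \varphi \rangle. \]
We have to show that

(a) the value of $\langle u, \varphi \rangle$ is well-defined by $(**)$ (i.e. depends only on $u$ and $\varphi$),
(b) $u \in \mathcal{D}'(\Omega)$, and
(c) $u|_{\Omega_i} = u_i$ for all $i \in I$.

(a) Let $K' \subseteq \Omega$ be a compact subset with $K' \supseteq \mathrm{supp}(\varphi)$ and suppose that $\Omega_{r_1}, \ldots, \Omega_{r_p}$ is
a corresponding subcovering with subordinate partition of unity $\chi_1', \ldots, \chi_p'$. Then, since
$\chi_k' \chi_l \varphi \in \mathcal{D}(\Omega_{i_l} \cap \Omega_{r_k})$, we have
\[ \sum_{k=1}^p \langle u_{r_k}, \chi_k' \varphi \rangle = \sum_{k=1}^p \sum_{l=1}^m \langle u_{r_k}, \chi_k' \chi_l \varphi \rangle \overset{(*)}{=} \sum_{l=1}^m \sum_{k=1}^p \langle u_{i_l}, \chi_l \chi_k' \varphi \rangle = \sum_{l=1}^m \langle u_{i_l}, \chi_l \varphi \rangle. \]
(b) We prove the seminorm estimate (SN). If $K \Subset \Omega$ we have (with subcovering and
partition of unity as chosen above) the action on any $\varphi \in \mathcal{D}(K)$ given by $(**)$. Using (SN)
for every $u_{i_1}, \ldots, u_{i_m}$ (with $C > 0$ and order $N$ uniformly over $l = 1, \ldots, m$) we obtain
the estimate
\[ |\langle u, \varphi \rangle| \le \sum_{l=1}^m |\langle u_{i_l}, \chi_l \varphi \rangle| \le C \sum_{l=1}^m \sum_{|\alpha|\le N} \|\partial^\alpha(\chi_l \varphi)\|_{L^\infty(K)} \le C' \sum_{|\alpha|\le N} \|\partial^\alpha \varphi\|_{L^\infty(K)}, \]
where the last step uses the Leibniz rule $\partial^\alpha(\chi_l \varphi) = \sum_{\beta\le\alpha} \binom{\alpha}{\beta}\, \partial^\beta \chi_l\, \partial^{\alpha-\beta} \varphi$,
and where $C'$ depends only on $K$ (via $\chi_l$, $l = 1, \ldots, m$).

(c) Let $\varphi \in \mathcal{D}(\Omega_i)$ and $K := \mathrm{supp}(\varphi)$. Choose a cut-off $\chi \in \mathcal{D}(\Omega_i)$ over $K$, i.e. $\chi = 1$ in
a neighborhood of $K$. Thus $\Omega_i$ provides a finite covering of $K$ and $\chi$ a partition of unity
subordinate to it. Then $\varphi = \chi\varphi$ and $(**)$ yields
\[ \langle u, \varphi \rangle = \langle u_i, \chi\varphi \rangle = \langle u_i, \varphi \rangle. \]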

1.5. DISTRIBUTIONS WITH COMPACT SUPPORT
1.59. Intro In this last paragraph of chapter 1 we study one of the most important
subspaces of $\mathcal{D}'$: the space of compactly supported distributions. We will see that this
space actually is the space of continuous linear forms on $\mathcal{E} = C^\infty$.
Moreover we shall be concerned with distributions which have their support concentrated
in one single point, a property which is not possible for continuous or $L^1_{loc}$-functions.
We will completely describe such distributions.
1.60. DEF ($\mathcal{E}'$-distributions) We denote the space of sequentially continuous linear
functionals on $\mathcal{E}(\Omega) = C^\infty(\Omega)$ by $\mathcal{E}'(\Omega)$.
[For any $u: \mathcal{E}(\Omega) \to \mathbb{C}$ linear we have: $u \in \mathcal{E}'(\Omega)$ $\Longleftrightarrow$ $\forall \varphi_n \to \varphi$ in $\mathcal{E}(\Omega)$:
$\langle u, \varphi_n \rangle \to \langle u, \varphi \rangle$ in $\mathbb{C}$.]
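1.61. REM (On continuity issues)

(i) Since $\mathcal{E}(\Omega)$ is a Fréchet space, sequential continuity on $\mathcal{E}(\Omega)$ is continuity (cf. 1.25(iii);
compare also with the discussion in the other items of 1.25).
(ii) Analogous to the $\mathcal{D}'$-case we may characterize continuity in more analytic terms, i.e.,
via seminorm estimates as illustrated in the following statement.
1.62. THM (Continuity criterion: seminorm estimates) Let $u: \mathcal{E}(\Omega) \to \mathbb{C}$ be linear.
Then we have: $u \in \mathcal{E}'(\Omega)$ $\Longleftrightarrow$ $\exists K \Subset \Omega$ $\exists C > 0$ $\exists m \in \mathbb{N}_0$:
\[ (\mathrm{SN}') \qquad |\langle u, \varphi \rangle| \le C \sum_{|\alpha|\le m} \|\partial^\alpha \varphi\|_{L^\infty(K)} \qquad \forall \varphi \in \mathcal{E}(\Omega)\ (= C^\infty(\Omega)). \]
[Compare with 1.26: '$\forall K$' is replaced by '$\exists K$'; recall Definitions 1.4 and 1.17 to compare
convergence in $\mathcal{D}$ and $\mathcal{E}$.]
Proof: (Very similar to that of Theorem 1.26.)

($\Leftarrow$) Let $\varphi_n \to 0$ in $\mathcal{E}(\Omega)$. By assumption $\exists K, C, m$ as in (SN$'$), hence
\[ |\langle u, \varphi_n \rangle| \le C \sum_{|\alpha|\le m} \|\partial^\alpha \varphi_n\|_{L^\infty(K)} \to 0 \qquad (n \to \infty). \]
($\Rightarrow$) By contradiction: Suppose $u \in \mathcal{E}'(\Omega)$ but (SN$'$) fails for all $m \in \mathbb{N}_0$ with $C = m$
and $K = B_m(0)$, and for a certain $\varphi_m \in \mathcal{E}(\Omega)$. Thus we have
\[ |\langle u, \varphi_m \rangle| > m \sum_{|\alpha|\le m} \|\partial^\alpha \varphi_m\|_{L^\infty(B_m(0))}. \]
(Necessarily $\varphi_m \ne 0$ and so $0 < \|\varphi_m\|_{L^\infty(B_m(0))} \le \sum_{|\alpha|\le m} \|\partial^\alpha \varphi_m\|_{L^\infty(B_m(0))}$.)
As in 1.26 we define
\[ \psi_m(x) := \frac{\varphi_m(x)}{m \sum_{|\alpha|\le m} \|\partial^\alpha \varphi_m\|_{L^\infty(B_m(0))}} \qquad \text{for } x \in \Omega. \]
Then $\psi_m \in \mathcal{E}(\Omega)$.
For any $K \Subset \Omega$ and $\alpha \in \mathbb{N}_0^n$ let $m \in \mathbb{N}_0$ be such that $m \ge |\alpha|$ and $K \subseteq B_m(0)$. Then we
have
\[ \|\partial^\alpha \psi_m\|_{L^\infty(K)} \le \sum_{|\beta|\le m} \|\partial^\beta \psi_m\|_{L^\infty(B_m(0))} = \frac{\sum_{|\beta|\le m} \|\partial^\beta \varphi_m\|_{L^\infty(B_m(0))}}{m \sum_{|\gamma|\le m} \|\partial^\gamma \varphi_m\|_{L^\infty(B_m(0))}} = \frac{1}{m}, \]
hence $\psi_m \to 0$ in $\mathcal{E}(\Omega)$ (as $m \to \infty$).
On the other hand, we obtain by construction
\[ |\langle u, \psi_m \rangle| = \frac{|\langle u, \varphi_m \rangle|}{m \sum_{|\alpha|\le m} \|\partial^\alpha \varphi_m\|_{L^\infty(B_m(0))}} > 1, \]
and therefore $\langle u, \psi_m \rangle \not\to 0$ in $\mathbb{C}$, a contradiction.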
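1.63. Motivation ($\mathcal{D}$ and $\mathcal{E}$) To clarify the interrelation between $\mathcal{D}'$ and $\mathcal{E}'$ we
first address the same question regarding the test function spaces $\mathcal{D}$ and $\mathcal{E}$. Clearly
$\mathcal{D}(\Omega) \subseteq \mathcal{E}(\Omega)$; moreover, the embedding $\mathcal{D}(\Omega) \hookrightarrow \mathcal{E}(\Omega)$ is continuous, since $\varphi_n \to 0$ in
the $\mathcal{D}$-sense implies the same in $\mathcal{E}$. Furthermore, we have the following result.
1.64. THM $\mathcal{D}(\Omega)$ is (sequentially) dense in $\mathcal{E}(\Omega)$.
Proof: to be done later; make a remark on topology (1.52A in RO notes)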
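1.65. REM ($\mathcal{E}'$-distributions as $\mathcal{D}'$-distributions)

(i) If $u \in \mathcal{E}'(\Omega)$ then $u|_{\mathcal{D}(\Omega)} \in \mathcal{D}'(\Omega)$. Indeed, we have
\[ \varphi_n \to 0 \text{ in } \mathcal{D}(\Omega) \overset{[1.63]}{\Longrightarrow} \varphi_n \to 0 \text{ in } \mathcal{E}(\Omega) \overset{[1.60]}{\Longrightarrow} \langle u, \varphi_n \rangle \to 0. \]
(ii) If $u \in \mathcal{E}'(\Omega)$ then $u|_{\mathcal{D}(\Omega)}$ has compact support.
Indeed, let $K \Subset \Omega$ be as in (SN$'$). For any $\varphi \in \mathcal{D}(\Omega)$: $\mathrm{supp}(\varphi) \cap K = \emptyset$ $\overset{[(\mathrm{SN}')]}{\Longrightarrow}$ $\langle u, \varphi \rangle = 0$.
Therefore $\mathrm{supp}(u|_{\mathcal{D}(\Omega)}) \subseteq K$.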
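1.66. THM (Compactly supported $\mathcal{D}'$-distributions are $\mathcal{E}'$-distributions)

Let $u \in \mathcal{D}'(\Omega)$ and $\mathrm{supp}(u)$ compact. Then $\exists!\, \tilde{u} \in \mathcal{E}'(\Omega)$ with $\tilde{u}|_{\mathcal{D}(\Omega)} = u$.

Proof: Uniqueness: Follows from the density of $\mathcal{D}(\Omega)$ in $\mathcal{E}(\Omega)$ (cf. 1.64).

Existence: Let $\rho \in \mathcal{D}(\Omega)$ with $\rho = 1$ on a neighborhood of $\mathrm{supp}(u)$ and define $\tilde{u}$ by
\[ \langle \tilde{u}, \varphi \rangle := \langle u, \rho\varphi \rangle \qquad (\varphi \in \mathcal{E}(\Omega)). \]
Then $\tilde{u}: \mathcal{E}(\Omega) \to \mathbb{C}$ is linear and for any $\varphi \in \mathcal{D}(\Omega)$ we have
\[ \langle \tilde{u}, \varphi \rangle = \langle u, \rho\varphi \rangle = \langle u, \varphi \rangle + \langle u, (\rho - 1)\varphi \rangle = \langle u, \varphi \rangle, \]
where $\langle u, (\rho - 1)\varphi \rangle = 0$ by 1.56, since $\mathrm{supp}((\rho - 1)\varphi) \cap \mathrm{supp}(u) = \emptyset$;
thus $\tilde{u}|_{\mathcal{D}(\Omega)} = u$.
It remains to show that $\tilde{u} \in \mathcal{E}'(\Omega)$. Let $K := \mathrm{supp}(\rho)$. Then for every $\varphi \in \mathcal{E}(\Omega)$ we
have $\mathrm{supp}(\rho\varphi) \subseteq K$ and therefore $\rho\varphi \in \mathcal{D}(K)$. Thanks to (SN) we can find $C > 0$ and
$m \in \mathbb{N}_0$ such that
\[ |\langle \tilde{u}, \varphi \rangle| = |\langle u, \rho\varphi \rangle| \le C \sum_{|\alpha|\le m} \|\partial^\alpha(\rho\varphi)\|_{L^\infty(K)} \qquad \forall \varphi \in \mathcal{E}(\Omega). \]
Applying the Leibniz rule to the terms $\partial^\alpha(\rho\varphi)$ we obtain the estimate (SN$'$) (as in the
proof of Theorem 1.58, part (b)).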
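1.67. REM
(i) We observe that the definition of $\tilde{u}$ in the proof of the above theorem does not
depend on the choice of $\rho$. In fact, let $\tilde{\rho} \in \mathcal{D}(\Omega)$ also be a cut-off over (a neighborhood
of) $\mathrm{supp}(u)$, then $\mathrm{supp}((\rho - \tilde{\rho})\varphi) \cap \mathrm{supp}(u) = \emptyset$ and Proposition 1.56 yields
\[ \langle u, \rho\varphi \rangle - \langle u, \tilde{\rho}\varphi \rangle = \langle u, (\rho - \tilde{\rho})\varphi \rangle = 0. \]
(ii) In view of 1.65 and 1.66 we may (and henceforth will) identify $\mathcal{E}'(\Omega)$ with the
subspace of compactly supported distributions in $\mathcal{D}'(\Omega)$. Thus we write $\mathcal{E}'(\Omega) \subseteq \mathcal{D}'(\Omega)$
and $u$ instead of $\tilde{u}$ (and also $\langle u, \varphi \rangle$ to mean $\langle u, \rho\varphi \rangle$) in this context.
(iii) The condition (SN$'$) for distributions in $\mathcal{E}'(\Omega)$ implies (SN), where the order $m$
can be chosen independently of the compact sets. Hence $\mathcal{E}'(\Omega) \subseteq \mathcal{D}'_F(\Omega)$ (compactly
supported distributions are of finite order).
1.68. THM $\mathcal{E}'(\Omega)$ is sequentially dense in $\mathcal{D}'(\Omega)$.

to be done later: either use proof analogous to the one of Thm. 1.64 or new
Rem. 1.45(iv)
1.69. Examples ($\mathcal{D}'$- and $\mathcal{E}'$-distributions)

(i) Let $x_0 \in \Omega$, then $\delta_{x_0} \in \mathcal{E}'(\Omega)$. [$\mathrm{supp}(\delta_{x_0}) = \{x_0\}$]

(ii) $\mathrm{vp}(\frac{1}{x}) \notin \mathcal{E}'(\mathbb{R})$. [$\mathrm{supp}(\mathrm{vp}(\frac{1}{x})) = \mathbb{R}$]
(iii) Every test function $\varphi \in \mathcal{D}(\Omega)$ and, more generally, every compactly supported
continuous function $f \in C_c(\Omega)$ is, by 1.55(ii), a regular distribution with compact support.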
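1.70. THM (Distributions supported in a single point)

Let $x_0 \in \Omega$ and $u \in \mathcal{D}'(\Omega)$ with $\mathrm{supp}(u) = \{x_0\}$. Then $\exists m \in \mathbb{N}_0$ and $c_\alpha \in \mathbb{C}$ ($|\alpha| \le m$),
such that
\[ \langle u, \varphi \rangle = \sum_{|\alpha|\le m} c_\alpha\, \partial^\alpha \varphi(x_0) \qquad \forall \varphi \in \mathcal{D}(\Omega). \]
[This result completely describes the distributions with point-support. With the concepts
introduced below we may state: All of these are linear combinations of derivatives of the
Dirac-delta at $x_0$.
Observe that due to 1.54 at least one of the constants $c_\alpha$ is nonzero.]
The proof will be based on the following result.
1.71. LEMMA Let $x_0 \in \Omega$ and $u \in \mathcal{D}'(\Omega)$ with $\mathrm{supp}(u) = \{x_0\}$. Then $\exists m \in \mathbb{N}_0$
such that
\[ \langle u, \varphi \rangle = 0 \qquad \forall \varphi \in \mathcal{D}(\Omega) \text{ with } \partial^\alpha \varphi(x_0) = 0\ \ \forall |\alpha| \le m. \]
1.72. REM Lemma 1.71 is a special case of the following result (cf. [FJ98, Theorem
3.2.2]): Let $u \in \mathcal{E}'(\Omega)$ and $m$ be its order. Then we have
\[ \langle u, \varphi \rangle = 0 \qquad \forall \varphi \in \mathcal{D}(\Omega) \text{ with } (\partial^\alpha \varphi)|_{\mathrm{supp}(u)} = 0\ \ (|\alpha| \le m). \]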
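Proof of Lemma 1.71: W.l.o.g. $x_0 = 0$, $\Omega = \mathbb{R}^n$.

Let $\rho \in \mathcal{D}(\mathbb{R}^n)$ with $\rho = 1$ on $B_{1/2}(0)$ and $\rho = 0$ on $\mathbb{R}^n \setminus B_1(0)$. If $\varepsilon \in\, ]0, 1]$ then we
have for any $\varphi \in \mathcal{D}(\mathbb{R}^n)$ that
\[ \varphi(x) - \varphi(x)\,\rho\big(\tfrac{x}{\varepsilon}\big) = 0 \qquad \text{when } x \in B_{\varepsilon/2}(0). \]
Therefore $\mathrm{supp}(u) \cap \mathrm{supp}(\varphi - \varphi\,\rho(\tfrac{\cdot}{\varepsilon})) = \emptyset$ and 1.56 implies
\[ (*) \qquad \langle u, \varphi \rangle = \langle u, \varphi\,\rho(\tfrac{\cdot}{\varepsilon}) \rangle. \]
We have $\forall \varepsilon \in\, ]0, 1]$: $\mathrm{supp}(\varphi\,\rho(\tfrac{\cdot}{\varepsilon})) \subseteq \mathrm{supp}(\rho) \subseteq \overline{B_1(0)} =: K$. Hence (SN) together with
$(*)$ implies that $\exists m \in \mathbb{N}_0$ $\exists C > 0$ such that
\[ (**) \qquad |\langle u, \varphi \rangle| \le C \sum_{|\alpha|\le m} \big\|\partial^\alpha\big(\varphi\,\rho(\tfrac{\cdot}{\varepsilon})\big)\big\|_{L^\infty(K)} \qquad \forall \varphi \in \mathcal{D}(\Omega). \]
Suppose that $\partial^\alpha \varphi(0) = 0$ if $|\alpha| \le m$. Let $|\alpha| \le m$, then we have by Taylor's theorem
\[ \partial^\alpha \varphi(x) = \sum_{|\beta|\le m-|\alpha|} \frac{x^\beta}{\beta!}\, \partial^{\alpha+\beta}\varphi(0) + (m - |\alpha| + 1) \sum_{|\beta| = m-|\alpha|+1} \frac{x^\beta}{\beta!} \int_0^1 (1-t)^{m-|\alpha|}\, (\partial^{\alpha+\beta}\varphi)(tx)\,dt, \]
where the first sum vanishes by hypothesis. Hence by the compactness of $\mathrm{supp}(\varphi)$, and since
$(m - |\alpha| + 1) \int_0^1 (1-t)^{m-|\alpha|}\,dt = 1$, we obtain the estimate
\[ |\partial^\alpha \varphi(x)| \le \sum_{|\beta| = m-|\alpha|+1} \frac{|x|^{|\beta|}}{\beta!}\, \|\partial^{\alpha+\beta}\varphi\|_{L^\infty(\mathrm{supp}(\varphi))} = |x|^{m-|\alpha|+1} \underbrace{\sum_{|\beta| = m-|\alpha|+1} \frac{\|\partial^{\alpha+\beta}\varphi\|_{L^\infty(\mathrm{supp}(\varphi))}}{\beta!}}_{=:\, C(m,\alpha,\varphi)}, \]
which in turn gives
\[ (***) \qquad |\partial^\alpha \varphi(x)| \le C(m, \alpha, \varphi)\, \varepsilon^{m-|\alpha|+1} \qquad \text{for } |x| \le \varepsilon. \]
To prepare for the application of $(***)$ to the estimate $(**)$ we apply the Leibniz rule
and use the fact that $\mathrm{supp}(\rho(\tfrac{\cdot}{\varepsilon})) \subseteq \overline{B_\varepsilon(0)}$ to deduce
\[ \big\|\partial^\alpha\big(\varphi\,\rho(\tfrac{\cdot}{\varepsilon})\big)\big\|_{L^\infty(K)} \le \sum_{\beta\le\alpha} \binom{\alpha}{\beta} \big\|\partial^\beta \varphi(\cdot)\, (\partial^{\alpha-\beta}\rho)(\tfrac{\cdot}{\varepsilon})\, \varepsilon^{-|\alpha-\beta|}\big\|_{L^\infty(B_\varepsilon(0))} \overset{(***)}{\le} \sum_{\beta\le\alpha} \binom{\alpha}{\beta}\, C(m, \beta, \varphi)\, \varepsilon^{m-|\beta|+1}\, C_\rho\, \varepsilon^{-(|\alpha|-|\beta|)} = O(\varepsilon^{m+1-|\alpha|}), \]
where $C_\rho$ bounds the derivatives of $\rho$ up to order $m$.
Inserting these upper bounds into $(**)$ yields
\[ |\langle u, \varphi \rangle| = O\Big(\sum_{|\alpha|\le m} \varepsilon^{m+1-|\alpha|}\Big) = O(\varepsilon). \]
Since $0 < \varepsilon \le 1$ was arbitrary we obtain that $\langle u, \varphi \rangle = 0$.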
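Proof of Theorem 1.70: Again w.l.o.g. $x_0 = 0$, $\Omega = \mathbb{R}^n$.

Let $\rho$, $K$, and $m$ be as in the proof of Lemma 1.71. Let $\varphi \in \mathcal{D}(\Omega)$ be arbitrary.
We have by Taylor's theorem
\[ \varphi(x) = \sum_{|\alpha|\le m} \frac{x^\alpha}{\alpha!}\, \partial^\alpha \varphi(0) + R_\varphi(x), \qquad R_\varphi(x) := (m+1) \sum_{|\alpha| = m+1} \frac{x^\alpha}{\alpha!} \int_0^1 (1-t)^m\, (\partial^\alpha \varphi)(tx)\,dt, \]
which we rewrite as
\[ \varphi(x) = \sum_{|\alpha|\le m} \frac{x^\alpha \rho(x)}{\alpha!}\, \partial^\alpha \varphi(0) + \underbrace{\big(1 - \rho(x)\big) \sum_{|\alpha|\le m} \frac{x^\alpha}{\alpha!}\, \partial^\alpha \varphi(0) + R_\varphi(x)}_{=:\, \tilde{\varphi}(x)}, \]
where $\tilde{\varphi} \in \mathcal{D}(\mathbb{R}^n)$ satisfies $\partial^\alpha \tilde{\varphi}(0) = 0$ when $|\alpha| \le m$ (due to the polynomial factors in
$R_\varphi$ and the fact that $\rho = 1$ on a neighborhood of $0$). Thus Lemma 1.71 gives $\langle u, \tilde{\varphi} \rangle = 0$
and therefore
\[ \langle u, \varphi \rangle = \sum_{|\alpha|\le m} \langle u, \rho\, x^\alpha \rangle\, \frac{1}{\alpha!}\, \partial^\alpha \varphi(0). \]
Setting $c_\alpha = \langle u, \rho\, x^\alpha \rangle / \alpha!$ yields the claim.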

Chapter 2
DIFFERENTIATION, DIFFERENTIAL OPERATORS
2.1. Intro Having introduced the space of distributions $\mathcal{D}'$ in detail in the previous
chapter we now proceed to define and study operations on distributions.
In Section 2.1 we discuss differentiation in $\mathcal{D}'$. Distributions turn out to have partial
derivatives of all orders and taking derivatives commutes with taking $\mathcal{D}'$-limits, which
is a very remarkable fact. We present some examples, in particular those announced in
0.5.
In Section 2.2 we introduce the product of distributions with smooth functions, prove
the Leibniz rule and give some more examples.
Combining these notions we give a first account of partial differential operators on $\mathcal{D}'$
in Section 2.3. We discuss some examples from ODEs and prove the existence of primitive
functions in $\mathcal{D}'$.
Finally, Section 2.4 provides an answer to the question why the operations on $\mathcal{D}'$ defined
in this chapter work so smoothly: We put the constructions in the functional analytic
context of duality.
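2.1. DIFFERENTIATION IN $\mathcal{D}'$
2.2. Motivation We want to define a notion of differentiation in $\mathcal{D}'$ that is compatible
with the classical derivative of, say, a $C^1$-function. More precisely, let $f \in C^1(\Omega) \subseteq
L^1_{loc}(\Omega) \subseteq \mathcal{D}'(\Omega)$. We wish to achieve that $\partial_j^{\mathrm{new}}(u_f) = u_{\partial_j f}$ holds, which requires the
following diagram to be commutative:
\[ \begin{array}{ccc} C^1 & \hookrightarrow & \mathcal{D}' \\ \partial_j \downarrow & & \downarrow \partial_j^{\mathrm{new}} \\ C^0 & \hookrightarrow & \mathcal{D}' \end{array} \]
To see what this actually means we calculate the action on a test function $\varphi$ (integrating by parts in the third step):
\[ \langle \partial_j^{\mathrm{new}}(u_f), \varphi \rangle \overset{!}{=} \langle u_{\partial_j f}, \varphi \rangle = \int \partial_j f(x)\, \varphi(x)\,dx = -\int f(x)\, \partial_j \varphi(x)\,dx = -\langle u_f, \partial_j \varphi \rangle. \]
This motivates the following definition:
2.3. DEF (Derivative of a distribution) Let $u \in \mathcal{D}'(\Omega)$ and $1 \le j \le n$. We define the
partial derivative $\partial_j u$ of $u$ by
\[ (2.1) \qquad \langle \partial_j u, \varphi \rangle := -\langle u, \partial_j \varphi \rangle \qquad \forall \varphi \in \mathcal{D}(\Omega). \]
If $n = 1$ we denote the derivative by $u'$ instead of $\partial_1 u$.
2.4. REM (DEF 2.3 really works) $\partial_j u : \mathcal{D}(\Omega) \to \mathbb{C}$ is obviously linear. It is also
(sequentially) continuous, since $\varphi_k \to \varphi$ ($k \to \infty$) in $\mathcal{D}(\Omega)$ implies $\partial_j \varphi_k \to \partial_j \varphi$ in
$\mathcal{D}(\Omega)$ and hence
\[ \langle \partial_j u, \varphi_k \rangle = -\langle u, \partial_j \varphi_k \rangle \to -\langle u, \partial_j \varphi \rangle = \langle \partial_j u, \varphi \rangle \qquad (k \to \infty). \]
Therefore we obtain a map $\partial_j : \mathcal{D}'(\Omega) \to \mathcal{D}'(\Omega)$.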
2.5. Example (Derivatives of some important distributions)

(i) Derivative of the Heaviside function: Let $H$ be the Heaviside function on $\mathbb{R}$ (as in
1.27(ii)), then
\[ \langle H', \varphi \rangle = -\langle H, \varphi' \rangle = -\int_0^\infty \varphi'(x)\,dx = -0 + \varphi(0) = \varphi(0) = \langle \delta, \varphi \rangle. \]
So we have in $\mathcal{D}'(\mathbb{R})$
\[ (2.2) \qquad H' = \delta. \]
(ii) Derivative of regular distributions: If $f \in C^1(\Omega)$, then the calculation in 2.2 shows
that $\partial_j u_f = u_{\partial_j f}$.
Beware of the following pitfall: For arbitrary differentiable functions the distributional
derivative does not always agree with the classical (pointwise) derivative! (Examples$^1$ are
provided by differentiable functions on an open interval with derivative not in $L^1_{loc}$.)
(iii) Derivative of a jump: Let $f \in C^\infty(\mathbb{R})$. Then $fH \in L^1_{loc}(\mathbb{R}) \subseteq \mathcal{D}'(\mathbb{R})$ and we have (integrating by parts in the third step)
\[ \langle (u_{fH})', \varphi \rangle = -\langle u_{fH}, \varphi' \rangle = -\int_0^\infty f(x)\, \varphi'(x)\,dx = f(0)\varphi(0) + \int_0^\infty f'(x)\, \varphi(x)\,dx \]
\[ = f(0) \langle \delta, \varphi \rangle + \int_{-\infty}^\infty H(x) f'(x)\, \varphi(x)\,dx = \langle f(0)\delta + u_{f'H}, \varphi \rangle. \]
Thus we obtain the formula
\[ (fH)' := (u_{fH})' = f(0)\,\delta + f'H. \]
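The jump formula $(fH)' = f(0)\delta + f'H$ can be tested on a concrete pair. The following sketch assumes $f = \cos$ and a bump test function of our own choosing; `bump`, `bump_prime` and `midpoint` are illustrative names, not part of the notes.

```python
import math

def bump(x, center=0.3):
    """Smooth test function supported in [center - 1, center + 1] (our choice)."""
    t = x - center
    if abs(t) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - t * t))

def bump_prime(x, center=0.3):
    """Classical derivative of bump, computed by the chain rule."""
    t = x - center
    if abs(t) >= 1.0:
        return 0.0
    return bump(x, center) * (-2.0 * t) / (1.0 - t * t) ** 2

def midpoint(f, a, b, n=100_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

upper = 1.3  # right end of supp(bump); beyond it the integrands vanish

# Left-hand side: <(fH)', phi> = -<fH, phi'> with f = cos.
lhs = -midpoint(lambda x: math.cos(x) * bump_prime(x), 0.0, upper)

# Right-hand side: f(0) * phi(0) + integral over [0, inf) of f'(x) phi(x), with f' = -sin.
rhs = math.cos(0.0) * bump(0.0) + midpoint(lambda x: -math.sin(x) * bump(x), 0.0, upper)

print(lhs, rhs)
```

The term `math.cos(0.0) * bump(0.0)` is exactly the contribution $f(0)\langle\delta,\varphi\rangle$ that a purely pointwise differentiation of $fH$ away from $0$ would miss.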
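2.6. REM (Distributions are infinitely differentiable) In contrast to the functions from
classical analysis, distributions always possess partial derivatives of arbitrary orders.
Iterating the definition of the distributional derivative (and applying 2.4 successively)
we obtain for any $\alpha \in \mathbb{N}_0^n$ and $\varphi \in \mathcal{D}(\Omega)$
\[ (2.3) \qquad \langle \partial^\alpha u, \varphi \rangle = (-1)^{|\alpha|} \langle u, \partial^\alpha \varphi \rangle. \]
The map $\partial^\alpha : \mathcal{D}'(\Omega) \to \mathcal{D}'(\Omega)$ is obviously linear and, as we shall prove shortly, also
(sequentially) continuous.
2.7. THM (Properties of the distributional derivative)

Let $u, v \in \mathcal{D}'(\Omega)$ and $(u_k)$, $(v_k)$ be sequences in $\mathcal{D}'(\Omega)$. Then we have
(i) $\partial^\alpha(u + v) = \partial^\alpha u + \partial^\alpha v$
(ii) $\forall a \in \mathbb{C}$: $\partial^\alpha(au) = a\, \partial^\alpha u$
(iii) $u_k \to u$ ($k \to \infty$) in $\mathcal{D}'(\Omega)$ $\Longrightarrow$ $\partial^\alpha u_k \to \partial^\alpha u$ in $\mathcal{D}'(\Omega)$
(iv) $v = \sum_{k=1}^\infty v_k$ in $\mathcal{D}'(\Omega)$ $\Longrightarrow$ $\partial^\alpha v = \sum_{k=1}^\infty \partial^\alpha v_k$ in $\mathcal{D}'(\Omega)$.

$^1$ Such as $f: \,]-1, 1[\, \to \mathbb{R}$, given by $f(x) = x^2 \sin(1/x^2)$ ($x \ne 0$), and $f(0) = 0$.

Proof: (i) and (ii) are direct consequences of the definition.

(iii): $\langle \partial^\alpha u_k, \varphi \rangle = (-1)^{|\alpha|} \langle u_k, \partial^\alpha \varphi \rangle \to (-1)^{|\alpha|} \langle u, \partial^\alpha \varphi \rangle = \langle \partial^\alpha u, \varphi \rangle$ ($k \to \infty$).
(iv): apply (i) and (iii) to the partial sums.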
2.8. REM (Distributional derivatives and limits) Note that properties (iii) and (iv)
in the above statement display remarkable features:
- The classical analogues of these statements are often wrong.
- Since $\mathcal{D}'$-theory has to stay consistent with classical analysis, the validity of (iii)
above is due to a different notion of convergence and the extension of the notion
of derivative.
- The proofs are very simple [we will see why this is so in Section 2.4 below].
2.9. Example (Convergence of derivatives) Consider $u_n \in C^\infty(\mathbb{R}) \subseteq \mathcal{D}'(\mathbb{R})$ ($n \in \mathbb{N}$)
given by
\[ u_n(x) = \frac{1}{\sqrt{n}} \sin(nx). \]
Since $\|u_n\|_{L^\infty(\mathbb{R})} = 1/\sqrt{n}$ we have $u_n \to 0$ uniformly, hence also $u_n \to 0$ in $\mathcal{D}'(\mathbb{R})$.
The derivatives are $u_n'(x) = \sqrt{n} \cos(nx)$, thus $(u_n')$ does not even converge pointwise
to $0$ [if $\cos(kx) \to 0$ ($k \to \infty$), then $1 = \lim(1 + \cos(2nx)) = \lim 2\cos^2(nx) = 0$, a contradiction].
Nevertheless we know that $u_n' \to 0$ in $\mathcal{D}'(\mathbb{R})$ by the continuity of the distributional
derivative! [The latter can also be seen directly upon integrating by parts in $\langle u_n', \varphi \rangle$.]
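The contrast in Example 2.9 between growing sup norms and vanishing pairings can be made visible numerically. This is a sketch only; the centered bump test function, its derivative `bump_prime`, and the quadrature are our own assumptions.

```python
import math

def bump(x):
    """Smooth test function supported in [-1, 1] (our choice)."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def bump_prime(x):
    """Classical derivative of bump, computed by the chain rule."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def midpoint(f, a, b, n=20_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def pairing_of_derivative(n):
    """<u_n', phi> computed from the definition as -<u_n, phi'>."""
    u_n = lambda x: math.sin(n * x) / math.sqrt(n)
    return -midpoint(lambda x: u_n(x) * bump_prime(x), -1.0, 1.0)

ns = (1, 4, 16, 64)
pairings = {n: pairing_of_derivative(n) for n in ns}
sup_norms = {n: math.sqrt(n) for n in ns}  # sup |u_n'| = sqrt(n)
for n in ns:
    print(n, sup_norms[n], pairings[n])
```

Only the pairing is moved onto $\varphi'$; the derivative $u_n'$ itself is never evaluated, which mirrors the definition (2.1).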
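2.10. Example (Derivatives of $\delta$) By (2.3) we have
\[ \langle \partial^\alpha \delta_{x_0}, \varphi \rangle = (-1)^{|\alpha|}\, \partial^\alpha \varphi(x_0). \]
Using this relation we now see that the representation obtained in Theorem 1.70 is indeed
a linear combination of derivatives of $\delta_{x_0}$.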
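2.11. Example ($\mathrm{vp}(\frac{1}{x})$ revisited) We consider the Cauchy principal value $\mathrm{vp}(\frac{1}{x})$ (see
1.37(ii) and 1.38) and will now show that
\[ (2.4) \qquad (\log|x|)' = \mathrm{vp}\big(\tfrac{1}{x}\big). \]
Here $\log|x|$ is to mean the regular distribution $\varphi \mapsto \int \log|x|\, \varphi(x)\,dx$ and the derivative
is in the $\mathcal{D}'$-sense. [Note that $\log|\cdot| \in L^1_{loc}(\mathbb{R})$, since $\int_0^1 |\log(x)|\,dx = -(x\log(x) - x)\big|_0^1 = 1$.]
We can compare the above distributional formula to the classical statement that for $x > 0$
we have $\log'(x) = 1/x$, hence, when $x < 0$, also $(\log(|x|))' = (\log(-x))' = -1/(-x) = 1/x$.

To prove (2.4) we evaluate on test functions:
\[ \langle (\log|x|)', \varphi \rangle = -\langle \log|x|, \varphi' \rangle = -\int \log|x|\, \varphi'(x)\,dx = -\lim_{\varepsilon\to 0+} \int_{|x|>\varepsilon} \log|x|\, \varphi'(x)\,dx. \]
Integrating by parts on $]-\infty, -\varepsilon[$ and $]\varepsilon, \infty[$ gives
\[ = \lim_{\varepsilon\to 0+} \Big( \int_{|x|>\varepsilon} \frac{\varphi(x)}{x}\,dx + \log(\varepsilon)\, \big(\varphi(\varepsilon) - \varphi(-\varepsilon)\big) \Big). \]
Since $\varphi(\varepsilon) - \varphi(-\varepsilon) = 0 + 2\varepsilon\varphi'(0) + O(\varepsilon^2)$, we have $\lim_{\varepsilon\to 0+} \log(\varepsilon)\,(\varphi(\varepsilon) - \varphi(-\varepsilon)) = 0$, and therefore
\[ \langle (\log|x|)', \varphi \rangle = \lim_{\varepsilon\to 0+} \int_{|x|>\varepsilon} \frac{\varphi(x)}{x}\,dx = \langle \mathrm{vp}\big(\tfrac{1}{x}\big), \varphi \rangle. \]
[Observe that the derivative of the regular distribution $\log|x|$ is not regular.]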
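Formula (2.4) can be checked numerically by comparing $-\int \log|x|\,\varphi'(x)\,dx$ with the symmetric expression for the principal value from Definition 1.38. The sketch below rests on assumptions of our own: a bump test function centered at 0.3 and a midpoint rule that is split at the singularity $x = 0$.

```python
import math

def bump(x, center=0.3):
    """Smooth test function supported in [center - 1, center + 1] (our choice)."""
    t = x - center
    if abs(t) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - t * t))

def bump_prime(x, center=0.3):
    """Classical derivative of bump, computed by the chain rule."""
    t = x - center
    if abs(t) >= 1.0:
        return 0.0
    return bump(x, center) * (-2.0 * t) / (1.0 - t * t) ** 2

def midpoint(f, a, b, n=100_000):
    """Composite midpoint rule; never evaluates f at the endpoints a, b."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# <(log|x|)', phi> = -<log|x|, phi'>, integrated over supp(phi) = [-0.7, 1.3]
g = lambda x: math.log(abs(x)) * bump_prime(x)
lhs = -(midpoint(g, -0.7, 0.0) + midpoint(g, 0.0, 1.3))

# <vp(1/x), phi> = integral over [0, inf) of (phi(x) - phi(-x)) / x
rhs = midpoint(lambda x: (bump(x) - bump(-x)) / x, 0.0, 1.3)

print(lhs, rhs)
```

Splitting the first integral at $0$ keeps the logarithmic singularity on a cell boundary, so the midpoint rule still converges, only at a slower rate than for smooth integrands.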
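2.12. Example (Distributions as boundary values of holomorphic functions: an embryonic
example) Consider the branch of the logarithm in $\mathbb{C}^- := \mathbb{C} \setminus \{z \in \mathbb{C} \mid \mathrm{Re}(z) \le
0, \mathrm{Im}(z) = 0\}$, i.e. the function $\mathrm{Log}: \mathbb{C}^- \to \mathbb{C}$ given by $z \mapsto \log|z| + i \arg(z)$. We have
$\mathrm{Log}'(z) = 1/z$ on $\mathbb{C}^-$.
By the standard identification $(x, y) \mapsto z = x + iy$ we may consider $f_y : x \mapsto \mathrm{Log}(x + iy)$
as a distribution on $\mathbb{R}$, depending on the parameter $y \in\, ]0, 1]$.
For every $x \ne 0$ we have the pointwise limit
\[ (*) \qquad \lim_{y\to 0+} f_y(x) = \lim_{y\to 0+} \mathrm{Log}(x + iy) = \log|x| + i\pi(1 - H(x)). \]
Furthermore, we have a uniformly dominating $L^1_{loc}$-upper bound when $0 < y \le 1$, since
\[ |f_y(x)| \le \big|\log|x + iy|\big| + |i \arg(x + iy)| \le \big|\log|x|\big| + \log\sqrt{1 + x^2} + \pi \qquad (x \ne 0,\ 0 < y \le 1). \]
Hence Lebesgue's dominated convergence theorem (in the form of Theorem 1.41) yields the validity of $(*)$ also in
$\mathcal{D}'(\mathbb{R})$. Consequently we obtain from Theorem 2.7(iii) that $f_y' \to (\log|\cdot| + i\pi(1 - H))'$
in $\mathcal{D}'(\mathbb{R})$ (as $y \to 0+$), in other words
\[ \frac{1}{x + i0} := \mathcal{D}'\text{-}\lim_{y\to 0+} \frac{1}{x + iy} = \mathrm{vp}\big(\tfrac{1}{x}\big) - i\pi\delta. \]
Analogously we have
\[ \frac{1}{x - i0} := \mathcal{D}'\text{-}\lim_{y\to 0-} \frac{1}{x + iy} = \mathrm{vp}\big(\tfrac{1}{x}\big) + i\pi\delta. \]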
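The boundary value formula $\frac{1}{x+i0} = \mathrm{vp}(\frac{1}{x}) - i\pi\delta$ can be probed numerically by pairing $\frac{1}{x+iy}$ with a test function for small $y > 0$. This is a rough sketch under our own assumptions (the bump test function, the midpoint rule, and the sample values of $y$); it illustrates the limit and proves nothing.

```python
import math

def bump(x, center=0.3):
    """Smooth test function supported in [center - 1, center + 1] (our choice)."""
    t = x - center
    if abs(t) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - t * t))

def midpoint(f, a, b, n=200_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def regularized_pairing(y):
    """<1/(x + iy), phi>, integrated over supp(phi) = [-0.7, 1.3]."""
    return midpoint(lambda x: bump(x) / complex(x, y), -0.7, 1.3)

# Claimed limit: <vp(1/x), phi> - i * pi * phi(0)
vp = midpoint(lambda x: (bump(x) - bump(-x)) / x, 0.0, 1.3)
limit = complex(vp, -math.pi * bump(0.0))

errors = [abs(regularized_pairing(y) - limit) for y in (0.1, 0.01)]
print(limit, errors)
```

The imaginary part arises because $-y/(x^2+y^2)$ concentrates at $x = 0$ with total mass tending to $-\pi$, which is where the $\delta$ term comes from.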

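2.2. MULTIPLICATION BY $C^\infty$-FUNCTIONS
2.13. Motivation As in the case of differentiation we want to define a concept of
multiplication of distributions with functions which extends the classical notion of pointwise
products of functions. Let $u, f \in C(\Omega)$, then we have
\[ \langle fu, \varphi \rangle = \int (fu)\varphi = \int u(f\varphi) = \text{"}\langle u, f\varphi \rangle\text{"}. \]
Note that $f\varphi \in \mathcal{D}(\Omega)$ for all $\varphi \in \mathcal{D}(\Omega)$ only if $f \in C^\infty(\Omega)$. This leads us to the following
definition.
2.14. DEF (Multiplication by $C^\infty$-functions)

Let $u \in \mathcal{D}'(\Omega)$ and $f \in C^\infty(\Omega)$. We define the product $f \cdot u$ by
\[ \langle fu, \varphi \rangle := \langle u, f\varphi \rangle \qquad \forall \varphi \in \mathcal{D}(\Omega). \]
2.15. REM (DEF 2.14 really works) We have to check that $fu \in \mathcal{D}'(\Omega)$: If $\varphi_n \to 0$ in
$\mathcal{D}(\Omega)$, then $f\varphi_n \to 0$ in $\mathcal{D}(\Omega)$. [Prove the details as an exercise: use $\mathrm{supp}(f\varphi_n) \subseteq \mathrm{supp}(\varphi_n)$
and apply the Leibniz rule to estimate $\partial^\alpha(f\varphi_n)$.] Therefore we obtain, since $u \in \mathcal{D}'(\Omega)$,
\[ \langle fu, \varphi_n \rangle = \langle u, f\varphi_n \rangle \to 0 \qquad (n \to \infty). \]
2.16. PROP (Multiplication is sequentially continuous)

Let $f \in C^\infty(\Omega)$. The map
\[ \mathcal{D}'(\Omega) \ni u \mapsto fu \in \mathcal{D}'(\Omega) \]
is sequentially continuous.
[Do not confuse this continuity property with the one in 2.15 above.]
Proof: Let $u_n \to u$ in $\mathcal{D}'(\Omega)$. Then
\[ \langle fu_n, \varphi \rangle = \langle u_n, f\varphi \rangle \to \langle u, f\varphi \rangle = \langle fu, \varphi \rangle. \]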
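2.17. THM (Leibniz rule) Let $u \in \mathcal{D}'(\Omega)$ and $f \in C^\infty(\Omega)$. Then for any $\alpha \in \mathbb{N}_0^n$ the
following formula holds in $\mathcal{D}'(\Omega)$:
\[ (2.5) \qquad \partial^\alpha(fu) = \sum_{\beta\le\alpha} \binom{\alpha}{\beta}\, \partial^\beta f\, \partial^{\alpha-\beta} u. \]
Proof: It suffices to prove (2.5) for the case $\partial^\alpha = \partial_i$. The result follows then by
induction.
Let $1 \le i \le n$. Then we have for any $\varphi \in \mathcal{D}(\Omega)$
\[ \langle \partial_i(fu), \varphi \rangle = -\langle fu, \partial_i \varphi \rangle = -\langle u, f\, \partial_i \varphi \rangle = -\langle u, \partial_i(f\varphi) - (\partial_i f)\varphi \rangle \]
\[ = \langle \partial_i u, f\varphi \rangle + \langle (\partial_i f)u, \varphi \rangle = \langle f\, \partial_i u + (\partial_i f)\, u, \varphi \rangle. \]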
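2.18. Examples (Some prominent products)

(i) $\delta$ again: Let $f \in C^\infty(\mathbb{R}^n)$, then we have $\forall \varphi \in \mathcal{D}(\mathbb{R}^n)$
\[ \langle f\delta, \varphi \rangle = \langle \delta, f\varphi \rangle = f(0)\varphi(0) = \langle f(0)\delta, \varphi \rangle. \]
Hence
\[ (2.6) \qquad f\delta = f(0)\,\delta \]
and, in particular,
\[ (2.7) \qquad x\,\delta = 0. \]
Moreover, by (2.5) and (2.6) we have
\[ f(0)\, \partial_i \delta = \partial_i(f\delta) = (\partial_i f)\,\delta + f\, \partial_i \delta = (\partial_i f)(0)\,\delta + f\, \partial_i \delta, \]
which implies
\[ (2.8) \qquad f\, \partial_i \delta = f(0)\, \partial_i \delta - (\partial_i f)(0)\,\delta. \]
(ii) $\mathrm{vp}(\frac{1}{x})$ again: We claim that
\[ (2.9) \qquad x \cdot \mathrm{vp}\big(\tfrac{1}{x}\big) = 1. \]
Indeed, we obtain
\[ \langle x \cdot \mathrm{vp}(\tfrac{1}{x}), \varphi \rangle = \langle \mathrm{vp}(\tfrac{1}{x}), x\varphi \rangle = \int_0^\infty \frac{x\varphi(x) + x\varphi(-x)}{x}\,dx = \int_0^\infty \varphi(x)\,dx + \int_0^\infty \varphi(-x)\,dx = \int_{-\infty}^\infty \varphi(x)\,dx = \langle 1, \varphi \rangle. \]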
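2.19. REM (A more general product?) The product $C^\infty \times \mathcal{D}' \to \mathcal{D}'$ defined above
cannot be extended to a general multiplication $\mathcal{D}' \times \mathcal{D}' \to \mathcal{D}'$ in a "reasonable" way.
For example, such a product can never be associative, since this would imply
\[ 0 \overset{(2.7)}{=} (\delta \cdot x) \cdot \mathrm{vp}\big(\tfrac{1}{x}\big) = \delta \cdot \big(x \cdot \mathrm{vp}\big(\tfrac{1}{x}\big)\big) \overset{(2.9)}{=} \delta \cdot 1 = \delta. \]
For a more in-depth discussion of this topic see e.g. [GKOS01, Section 1.1].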

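2.3. DIFFERENTIAL OPERATORS & A FIRST GLIMPSE OF DIFFERENTIAL EQUATIONS IN $\mathcal{D}'$
2.20. DEF (Linear partial differential operators)

(i) A linear partial differential operator [PDO] with smooth coefficients on $\Omega$ is given
by (a sum of differentiations followed by multiplications)
\[ (2.10) \qquad P(x, \partial) = \sum_{|\alpha|\le m} a_\alpha(x)\, \partial^\alpha, \]
where $a_\alpha \in C^\infty(\Omega)$ ($\alpha \in \mathbb{N}_0^n$, $|\alpha| \le m$).
$P(x, \partial)$ defines linear maps $\mathcal{D}(\Omega) \to \mathcal{D}(\Omega)$, $C^\infty(\Omega) \to C^\infty(\Omega)$, and $\mathcal{D}'(\Omega) \to \mathcal{D}'(\Omega)$
by the assignment $f \mapsto \sum_{|\alpha|\le m} a_\alpha\, \partial^\alpha f$.
$P(x, \partial)$ is said to be of order $m$ if $\exists \alpha$ with $|\alpha| = m$ such that $a_\alpha \ne 0$.
(ii) The function $P: \Omega \times \mathbb{R}^n \to \mathbb{C}$ given by
\[ P(x, \xi) = \sum_{|\alpha|\le m} a_\alpha(x)\, \xi^\alpha \]
is called the symbol of the operator $P(x, \partial)$.

(iii) The function $P_m: \Omega \times \mathbb{R}^n \to \mathbb{C}$ given by
\[ P_m(x, \xi) = \sum_{|\alpha| = m} a_\alpha(x)\, \xi^\alpha \]
is called the principal symbol of the operator $P(x, \partial)$.

(iv) If $n = 1$ [i.e., $\Omega \subseteq \mathbb{R}$] we call
\[ P(x, \partial) = P\big(x, \tfrac{d}{dx}\big) = \sum_{k=0}^m a_k(x) \Big(\frac{d}{dx}\Big)^k \]
an ordinary differential operator.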
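2.21. REM (PDOs on $\mathcal{D}'$; adjoint operator) PDOs should always be considered as
maps on appropriate function or distribution spaces.

If $P(x, \partial)$ is a linear PDO with $C^\infty$-coefficients as in (2.10), then building up from the
special cases of partial differentiation and multiplication by a smooth function we obtain
the action of $P(x, \partial)$ as a map $\mathcal{D}'(\Omega) \to \mathcal{D}'(\Omega)$ in the form
\[ \langle P(x, \partial)u, \varphi \rangle = \langle u, P^t(x, \partial)\varphi \rangle, \]
where $P^t(x, \partial)$ denotes the adjoint operator
\[ (2.11) \qquad P^t(x, \partial)\varphi := \sum_{|\alpha|\le m} (-1)^{|\alpha|}\, \partial^\alpha(a_\alpha \varphi). \]
[Exercise: Show that the adjoint is again a PDO with representation $P^t(x, \partial) = \sum_{|\beta|\le m} b_\beta(x)\, \partial^\beta$,
where $b_\beta = \sum_{|\alpha|\le m,\ \alpha\ge\beta} (-1)^{|\alpha|} \binom{\alpha}{\beta}\, \partial^{\alpha-\beta} a_\alpha$.]

We shall often use the brief notation $P$ (resp. $P^t$) instead of $P(x, \partial)$ (resp. $P^t(x, \partial)$).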
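The duality relation $\langle Pu, \varphi \rangle = \langle u, P^t\varphi \rangle$ can be illustrated for a regular distribution. In this sketch the operator $P = x^2 \frac{d}{dx}$, the function $u = \sin$, and the bump test function are assumptions of our own; by (2.11) the adjoint is $P^t\varphi = -(x^2\varphi)' = -(2x\varphi + x^2\varphi')$.

```python
import math

def bump(x, center=0.3):
    """Smooth test function supported in [center - 1, center + 1] (our choice)."""
    t = x - center
    if abs(t) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - t * t))

def bump_prime(x, center=0.3):
    """Classical derivative of bump, computed by the chain rule."""
    t = x - center
    if abs(t) >= 1.0:
        return 0.0
    return bump(x, center) * (-2.0 * t) / (1.0 - t * t) ** 2

def midpoint(f, a, b, n=100_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

a, b = -0.7, 1.3  # supp(bump)

# <P u, phi> with P = x^2 d/dx and u = sin, so P u = x^2 cos(x)
lhs = midpoint(lambda x: x * x * math.cos(x) * bump(x), a, b)

# <u, P^t phi> with P^t phi = -(2 x phi + x^2 phi')
adjoint_phi = lambda x: -(2.0 * x * bump(x) + x * x * bump_prime(x))
rhs = midpoint(lambda x: math.sin(x) * adjoint_phi(x), a, b)

print(lhs, rhs)
```

For a genuinely singular $u$ only the right-hand side is available as an integral, which is why the adjoint formula serves as the definition of $Pu$.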
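2.22. PROP (Basic properties of PDOs) Let $P = P(x, \partial)$ be a PDO with $C^\infty$-coefficients.
Then we have:
(i) $P: \mathcal{D}'(\Omega) \to \mathcal{D}'(\Omega)$ is linear and sequentially continuous.
(ii) $\forall u \in \mathcal{D}'(\Omega)$: $\mathrm{supp}(Pu) \subseteq \mathrm{supp}(u)$, in particular $P(\mathcal{E}'(\Omega)) \subseteq \mathcal{E}'(\Omega)$.
[$P$ is a local operator.]
Proof: (i) follows immediately from Theorem 2.7(iii) and Proposition 2.16.
(ii) We first note that for any $\varphi \in C^\infty(\Omega)$ the relation $\mathrm{supp}(P^t\varphi) \subseteq \mathrm{supp}(\varphi)$ follows
directly from the definition (2.11) (upon using the Leibniz rule and $\mathrm{supp}(f\varphi) \subseteq \mathrm{supp}(\varphi)$).
Suppose $x \in \Omega \setminus \mathrm{supp}(u)$. Then there is a neighborhood $U$ of $x$ such that $u|_U = 0$, i.e.
$\forall \varphi \in \mathcal{D}(U)$: $\langle u, \varphi \rangle = 0$.
Thus we obtain for every $\varphi \in \mathcal{D}(U)$, since $\mathrm{supp}(P^t\varphi) \subseteq \mathrm{supp}(\varphi) \subseteq U$, that
\[ \langle Pu, \varphi \rangle = \langle u, P^t\varphi \rangle = 0. \]
In other words, $(Pu)|_U = 0$ and therefore $x \in \Omega \setminus \mathrm{supp}(Pu)$.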
2.23. Motivation (ODEs in $\mathcal{D}'$) We will now study the simplest class of differential
equations in $\mathcal{D}'$, namely linear$^2$ ordinary differential equations (ODEs) with smooth
coefficients.
One question arises immediately: Does a given classical ODE possess a larger set of
solutions in $\mathcal{D}'$ than in the classical $C^\infty$- or $C^1$-setting?
The answer is simple if the coefficient of the highest-order derivative (corresponding
to the principal part) has no zeroes. In this case no new solutions occur. Otherwise
additional distributional solutions may or may not exist. We will give explicit examples
for both situations below.

$^2$ Nonlinear differential equations are in general not well-defined in the context of all of $\mathcal{D}'$; this is
related to the problem of distributional products briefly touched upon in 2.19.
2.24. THM (Classical ODEs in $\mathcal{D}'$) Let $I \subseteq \mathbb{R}$ be an interval, $a \in C^\infty(I)$, $f \in C(I)$,
and $u \in \mathcal{D}'(I)$. If
\[ (2.12) \qquad u' + au = f \quad \text{holds in } \mathcal{D}'(I), \]
then $u \in C^1(I)$ and Equation (2.12) holds also in the classical sense.
[Here, $P(x, \partial) = P(x, \frac{d}{dx}) = \frac{d}{dx} + a(x)$.]
The proof of Theorem 2.24 will be based on a result about distributional antiderivatives
(or primitive functions), which is of interest in its own right.
2.25. LEMMA (Primitive functions in $\mathcal{D}'$) Let $I \subseteq \mathbb{R}$ be an interval, $v \in \mathcal{D}'(I)$, and
suppose that
\[ v' = g \in C(I). \]
Then $v \in C^1(I)$ and $v$ is a classical antiderivative of $g$, that is, we have with some $x_0 \in I$
and $c \in \mathbb{C}$: $v(x) = \int_{x_0}^x g(s)\,ds + c$.
Proof: Put $G(x) := \int_{x_0}^x g(s)\,ds$ and $w := v - G \in \mathcal{D}'(I)$. Then $w' = v' - g = 0$; it
remains to show that $w$ has to be a constant (function).
We have $\forall \varphi \in \mathcal{D}(I)$
\[ (*) \qquad 0 = \langle w', \varphi \rangle = -\langle w, \varphi' \rangle. \]
So, $w$ gives $0$ when applied to a derivative of a test function. But not all test functions
can be represented as the derivative of a test function. However, as we will now prove, the
set of derivatives of test functions forms a hyperplane in $\mathcal{D}(I)$.
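2.26. Sublemma Let $\varphi \in \mathcal{D}(I)$. Then we have
$\exists \psi \in \mathcal{D}(I): \psi' = \varphi \iff \int_I \varphi(x)\,dx = 0.$
Proof: ($\Rightarrow$) $\int_I \varphi = \int_I \psi' = \psi|_{\partial I} = 0$, since $\mathrm{supp}(\psi) \Subset I$.
($\Leftarrow$) Choose $r < s$ in $I$ such that $\mathrm{supp}(\varphi) \subseteq [r,s]$ and let $x_0 \in I$ with $x_0 \le r$ (a different base point only changes the constant below). Every $\psi \in C^\infty(I)$ with $\psi' = \varphi$ is of the form $\psi(x) = \int_{x_0}^x \varphi(y)\,dy + c$ for some $c \in \mathbb{C}$.
Then $\psi(x) = c$ if $x < r$ and $\psi(x) = 0 + c = c$ if $x > s$. Thus we have $\psi \in \mathcal{D}(I) \iff c = 0$.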

2.27. REM While $\frac{d}{dx}: C^\infty(I) \to C^\infty(I)$ is surjective and not injective, we conclude from the above Sublemma that $\frac{d}{dx}: \mathcal{D}(I) \to \mathcal{D}(I)$ is not surjective, but injective.
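The criterion of Sublemma 2.26 can be illustrated numerically. The following Python sketch is not part of the notes: the bump function $\exp(-1/(1-x^2))$, the helper names and the midpoint rule are assumptions made for the illustration. It shows that the bump has nonzero integral (so it cannot be the derivative of a test function), whereas its derivative integrates to zero.

```python
import math

def bump(x):
    """Standard test function supported in [-1, 1] (illustrative choice)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def bump_prime(x):
    """Derivative of bump; again a test function supported in [-1, 1]."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2

def integral(g, a=-1.0, b=1.0, n=20000):
    """Composite midpoint rule on [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Nonzero integral: bump does not lie in the hyperplane of derivatives.
total_bump = integral(bump)
# Zero integral (up to rounding): consistent with bump_prime = (bump)'.
total_bump_prime = integral(bump_prime)
```

In the language of Remark 2.27, the first number measures how far the bump is from the range of $\frac{d}{dx}$ on $\mathcal{D}(I)$.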
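Continuation of the proof of Lemma 2.25: Let $H := \{\varphi \in \mathcal{D}(I) \mid \int_I \varphi = 0\}$, which is the hyperplane in $\mathcal{D}(I)$ given by the kernel of the continuous linear functional $\lambda: \mathcal{D}(I) \to \mathbb{C}$, $\lambda(\varphi) = \int_I \varphi = \langle 1, \varphi\rangle$. By (*) and the Sublemma we have $w|_H = 0$.
Choose $\rho \in \mathcal{D}(I)$ with $\int_I \rho = 1$. Then we may decompose any $\varphi \in \mathcal{D}(I)$ in the form
$\varphi = (\varphi - \lambda(\varphi)\rho) + \lambda(\varphi)\rho,$
where the first summand lies in $H$ and the second in the linear span of $\{\rho\}$.
Therefore we obtain
$\langle w, \varphi\rangle = \langle w, \varphi - \lambda(\varphi)\rho\rangle + \langle w, \lambda(\varphi)\rho\rangle = \lambda(\varphi)\langle w, \rho\rangle = c \int_I \varphi(x)\,dx,$
where the first term vanishes since $\varphi - \lambda(\varphi)\rho \in H$ and $c := \langle w, \rho\rangle \in \mathbb{C}$; hence $w = c$.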
Proof of Theorem 2.24: Let $x_0 \in I$ and $E(x) := \exp(\int_{x_0}^x a(y)\,dy)$. Then $E \in C^\infty(I)$, $E > 0$ and $E' = aE$. Putting $v = Eu$ we may calculate, using (2.12) in the last step,
$\frac{d}{dx}v = \frac{d}{dx}(Eu) = E'u + Eu' = E(au + u') = Ef \in C(I).$
By Lemma 2.25 we deduce $v \in C^1(I)$ and therefore $u = v/E \in C^1(I)$.

Finally, writing (2.12) in terms of action on test functions and applying integration by
parts we deduce that the continuous functions f and u 0 + au agree as distributions.
Therefore they agree as continuous functions as well.
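The integrating-factor argument of this proof can be sketched numerically. The following Python snippet is only an illustration under assumptions that are not in the notes: the function name `solve_linear_ode`, the trapezoidal rule, and the sample data $a \equiv 1$, $f(x) = x$, $u(0) = 2$ are chosen for the example. It computes $u = v/E$ with $v' = Ef$ and compares with the closed-form solution $u(x) = x - 1 + 3e^{-x}$ of $u' + u = x$, $u(0) = 2$.

```python
import math

def solve_linear_ode(a, f, x0, u0, x, steps=2000):
    """Evaluate u(x) for u' + a u = f, u(x0) = u0, via u = v / E,
    where E = exp(int_{x0}^x a) and v' = E f (trapezoidal rule)."""
    h = (x - x0) / steps
    A = 0.0   # running value of int_{x0}^{s} a(y) dy
    v = u0    # v(x0) = E(x0) * u(x0) = u0, since E(x0) = 1
    for i in range(steps):
        s0, s1 = x0 + i * h, x0 + (i + 1) * h
        A_next = A + 0.5 * h * (a(s0) + a(s1))
        v += 0.5 * h * (math.exp(A) * f(s0) + math.exp(A_next) * f(s1))
        A = A_next
    return v / math.exp(A)

# Hypothetical sample data: a = 1, f(x) = x, u(0) = 2.
u_numeric = solve_linear_ode(lambda x: 1.0, lambda x: x, 0.0, 2.0, 1.5)
u_exact = 1.5 - 1.0 + 3.0 * math.exp(-1.5)
```

The two values agree up to the quadrature error, which reflects that for continuous $f$ the distributional solution is the classical one.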
As a special case of the above results we obtain the following statement.
2.28. COR (Vanishing derivative means constant) Let $I \subseteq \mathbb{R}$ be an interval and $u \in \mathcal{D}'(I)$. Then we have
$u' = 0$ in $\mathcal{D}'(I) \iff \exists c \in \mathbb{C}: u = c.$
2.29. REM (Regularity of solutions to ODEs) Based on a reasoning similar to the above for linear systems of ODEs and upon the standard reduction of higher-order equations to first-order systems one can prove the following generalization of Theorem 2.24 (cf. [Hor90, Corollary 3.1.6]):
Let $u \in \mathcal{D}'(I)$ and $a_0, \ldots, a_{m-1} \in C^\infty(I)$. Suppose that
$u^{(m)} + a_{m-1}u^{(m-1)} + \ldots + a_1u' + a_0u = f \in C(I),$
then $u \in C^m(I)$ and the above ODE holds also in the classical sense.
Defining the differential operator $P$ of order $m$ by $P = P(x, \frac{d}{dx}) = (\frac{d}{dx})^m + \sum_{j=0}^{m-1} a_j(x)(\frac{d}{dx})^j$ we may rephrase this result as follows:
$Pu \in C^0 \implies u \in C^m.$
This means that $C^m$-regularity of any distributional solution $u$ to $Pu = f$ is implied by mere continuity of the right-hand side $f$.
2.30. Two warning examples

(i) $x^3u' + 2u = 0$ has no nontrivial (i.e. $u \neq 0$) solution in $\mathcal{D}'(\mathbb{R})$.
We start by considering the ODE off the zeroes of the highest-order coefficient:
$x > 0$: $u' = -\frac{2}{x^3}u$ $\implies$ solution $u_+(x) = c_+e^{1/x^2}$ in $C^1(]0,\infty[)$ and in $\mathcal{D}'(]0,\infty[)$ [by Thm. 2.24];
$x < 0$: $u' = -\frac{2}{x^3}u$ $\implies$ solution $u_-(x) = c_-e^{1/x^2}$ in $C^1(]-\infty,0[)$ and in $\mathcal{D}'(]-\infty,0[)$ [by Thm. 2.24].
Claim: There exists no $0 \neq u \in \mathcal{D}'(\mathbb{R})$ with $u|_{]-\infty,0[} = u_-$, $u|_{]0,\infty[} = u_+$ (for certain constants $c_-$ and $c_+$).
Proof (by contradiction): Suppose there exist $c_-, c_+ \in \mathbb{C}$ and $u \in \mathcal{D}'(\mathbb{R})$, $u \neq 0$, such that $u|_{]-\infty,0[} = u_-$ and $u|_{]0,\infty[} = u_+$ hold.
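Then the seminorm condition (SN) on $K := [-1,1]$ implies the following: $\exists C > 0$ $\exists m \in \mathbb{N}_0$ such that
(*)  $|\langle u, \varphi\rangle| \le C \sum_{j=0}^m \|\varphi^{(j)}\|_{L^\infty(K)} \quad \forall \varphi \in \mathcal{D}(K).$
Case 1, $c_+ \neq 0$: Choose $\varphi \in \mathcal{D}(]0,1[)$ with $\varphi \ge 0$ and $\varphi(1/2) = 1$, and for $\varepsilon > 0$ set $\varphi_\varepsilon(x) := \varphi(x/\varepsilon)$. Then $\varphi_\varepsilon \in \mathcal{D}(K)$ for all $\varepsilon \in\, ]0,1]$ and (*) implies
$|\langle u, \varphi_\varepsilon\rangle| \le C \sum_{j=0}^m \varepsilon^{-j}\|\varphi^{(j)}(./\varepsilon)\|_{L^\infty(K)} = O(\varepsilon^{-m}) \quad (\varepsilon \to 0).$
On the other hand, since $\mathrm{supp}(\varphi_\varepsilon) \subseteq\, ]0,1[$, the left-hand side equals
$|\int_0^1 c_+e^{1/x^2}\varphi(x/\varepsilon)\,dx| = |c_+|\,\varepsilon \int_0^{1/\varepsilon} \varphi(y)e^{1/(\varepsilon^2y^2)}\,dy \ge |c_+|\,\varepsilon\,e^{1/\varepsilon^2} \int_{1/2}^1 \varphi(y)\,dy,$
where the last integral is $> 0$.
Hence $\exists C_1 > 0$: $e^{1/\varepsilon^2} \le C_1\varepsilon^{-m-1}$, which is a contradiction as $\varepsilon \to 0$.
Case 2, $c_- \neq 0$: analogous to case 1.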
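Case 3, $c_+ = c_- = 0$: Since $u|_{\mathbb{R}\setminus\{0\}} = 0$ we necessarily have $\mathrm{supp}(u) = \{0\}$. By Theorem 1.70 there are constants $\alpha_0, \ldots, \alpha_m \in \mathbb{C}$ such that $u = \sum_{l=0}^m \alpha_l\delta^{(l)}$. Hence $u' = \sum_{l=0}^m \alpha_l\delta^{(l+1)}$. Inserting this into the differential equation $x^3u' = -2u$ gives
(**)  $\sum_{l=0}^m \alpha_l\,x^3\delta^{(l+1)} = -2\sum_{l=0}^m \alpha_l\delta^{(l)}.$
We need to calculate the products $x^3\delta^{(l+1)}$, so let us derive a more general formula for $x^k\delta^{(j)}$ by directly calculating
$\langle x^k\delta^{(j)}, \varphi\rangle = (-1)^j\langle\delta, (\frac{d}{dx})^j(x^k\varphi)\rangle = (-1)^j(\frac{d}{dx})^j(x^k\varphi(x))|_{x=0}$
$= (-1)^j\sum_{q=0}^j \binom{j}{q}(\frac{d}{dx})^q(x^k)|_{x=0}\,(\frac{d}{dx})^{j-q}\varphi(0)$  [the factor $(\frac{d}{dx})^q(x^k)|_{x=0}$ is $0$ if $q < k$ or $q > k$]
$= \begin{cases} 0 & j < k, \\ (-1)^j\binom{j}{k}k!\,\varphi^{(j-k)}(0) & j \ge k \end{cases}$
$= \begin{cases} \langle 0, \varphi\rangle & j < k, \\ (-1)^j\frac{j!}{(j-k)!}(-1)^{j-k}\langle\delta^{(j-k)}, \varphi\rangle & j \ge k. \end{cases}$
Therefore
(2.13)  $x^k\delta^{(j)} = 0$ $(j < k)$,  $x^k\delta^{(j)} = (-1)^k\frac{j!}{(j-k)!}\delta^{(j-k)}$ $(j \ge k)$.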
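In particular, $x^3\delta^{(l+1)} = -(l+1)l(l-1)\delta^{(l-2)} = -l(l^2-1)\delta^{(l-2)}$ when $l \ge 2$, and $0$ when $l < 2$. Inserting into (**) yields, with $c_l := l(l^2-1) \neq 0$ for $l \ge 2$,
$\sum_{l=2}^m \alpha_l\,c_l\,\delta^{(l-2)} = 2\sum_{l=0}^m \alpha_l\delta^{(l)},$
which is equivalent to
$0 = \sum_{j=0}^{m-2}(2\alpha_j - c_{j+2}\alpha_{j+2})\delta^{(j)} + 2\alpha_{m-1}\delta^{(m-1)} + 2\alpha_m\delta^{(m)}.$
Since the set $\{\delta^{(j)} \mid j \in \mathbb{N}_0\}$ is linearly independent in $\mathcal{D}'(\mathbb{R})$ [exercise!³] we deduce that $\alpha_m = \alpha_{m-1} = 0$ and $\alpha_j = \alpha_{j+2}c_{j+2}/2$ $(j = 0, \ldots, m-2)$. Successively, we obtain from this also $\alpha_{m-2} = 0, \ldots, \alpha_0 = 0$. Hence $u = 0$, which is a contradiction.

³ Hint: consider test functions of the form $\varphi(x) = x^k$ near $x = 0$.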


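(ii) $xu' = 0$ has a 2-parameter family of solutions in $\mathcal{D}'(\mathbb{R})$.
More precisely, all solutions are of the form $u = \alpha H + \beta$, where $\alpha, \beta \in \mathbb{C}$ are arbitrary.
Observe that the only $C^1$-solutions are the constants $u = \beta$ (i.e. $\alpha = 0$), since the equation implies $u'(x) = 0$ for all $x \neq 0$ and $u' \in C(\mathbb{R})$ forces $u' = 0$ on $\mathbb{R}$.
Clearly, $u = \alpha H + \beta$ is a solution, since $x(\alpha H + \beta)' = \alpha x\delta = 0$. Conversely, let $u$ be any solution. Since $u'|_{\mathbb{R}\setminus\{0\}} = 0$, Theorem 1.70 implies $u' = \sum_{j=0}^m \alpha_j\delta^{(j)}$ with certain constants $\alpha_j \in \mathbb{C}$. The equation $xu' = 0$ then forces, by (2.13),
$0 = \sum_{j=0}^m \alpha_j\,x\delta^{(j)} = -\sum_{j=1}^m \alpha_j\,j\,\delta^{(j-1)}$
and again by the linear independence of $\{\delta^{(j)} \mid j \in \mathbb{N}_0\}$ we obtain $\alpha_1 = \alpha_2 = \ldots = \alpha_m = 0$.
Hence $u' = \alpha_0\delta$, or equivalently $(u - \alpha_0H)' = 0$, and Corollary 2.28 implies that $u - \alpha_0H$ is constant.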
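Formula (2.13), which drives both warning examples, can be checked with exact integer arithmetic. The sketch below is an illustration only; it assumes that a test function is represented near $0$ by finitely many Taylor coefficients (enough to evaluate pairings with $\delta^{(j)}$), and all function names are hypothetical.

```python
from math import factorial

def pair_delta_derivative(j, coeffs):
    """<delta^(j), phi> = (-1)^j phi^(j)(0) for phi(x) = sum coeffs[n] x^n near 0."""
    a_j = coeffs[j] if j < len(coeffs) else 0
    return (-1) ** j * factorial(j) * a_j

def times_x_power(k, coeffs):
    """Taylor coefficients of x^k * phi."""
    return [0] * k + list(coeffs)

def lhs(k, j, coeffs):
    # <x^k delta^(j), phi> := <delta^(j), x^k phi>
    return pair_delta_derivative(j, times_x_power(k, coeffs))

def rhs(k, j, coeffs):
    # right-hand side of (2.13) applied to phi
    if j < k:
        return 0
    c = (-1) ** k * factorial(j) // factorial(j - k)
    return c * pair_delta_derivative(j - k, coeffs)

# Arbitrary sample Taylor coefficients of a test function at 0.
coeffs = [3, -1, 4, 1, -5, 9, 2, -6]
checks = [lhs(k, j, coeffs) == rhs(k, j, coeffs)
          for k in range(6) for j in range(8)]
```

For instance, `lhs(3, 4, coeffs)` evaluates $x^3\delta^{(4)} = -24\,\delta'$ on the sample function.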

2.4. ON DUALITY TRICKS
2.31. Motivation Why was the extension of operations like differentiation and multiplication by smooth functions defined in this chapter so easy? The key to the answer is the general notion of the transpose or adjoint of a linear map, which allowed us to "push all difficulties to the side of the test functions". Before investigating adjoints we mention a result that characterizes (sequentially) continuous linear maps on $\mathcal{D}$.
2.32. Notation In this section we denote by $\Omega_i$ $(i = 1, 2)$ open subsets of $\mathbb{R}^{n_i}$.
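2.33. PROP (Sequentially continuous maps on $\mathcal{D}$) Let $L: \mathcal{D}(\Omega_2) \to \mathcal{D}(\Omega_1)$ be a linear map. Then the following statements are equivalent:
(i) $L$ is sequentially continuous.
(ii) $\forall K_2 \Subset \Omega_2$ $\exists K_1 \Subset \Omega_1$ such that
a) $\mathrm{supp}(L\varphi) \subseteq K_1$ $\forall \varphi \in \mathcal{D}(K_2)$, and
b) $\forall N \in \mathbb{N}_0$ $\exists m \in \mathbb{N}_0$ $\exists C > 0$:
$\sum_{|\alpha| \le N} \|\partial^\alpha(L\varphi)\|_{L^\infty(K_1)} \le C \sum_{|\beta| \le m} \|\partial^\beta\varphi\|_{L^\infty(K_2)} \quad \forall \varphi \in \mathcal{D}(K_2).$
Proof: Similar to that of Theorem 1.26; to be inserted in small print later on!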
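2.34. DEF (Transpose/Adjoint) Let $L: \mathcal{D}(\Omega_2) \to \mathcal{D}(\Omega_1)$ be linear and sequentially continuous. The adjoint or transpose $L^t$ of $L$ is defined as the map $L^t: \mathcal{D}'(\Omega_1) \to \mathcal{D}'(\Omega_2)$, $u \mapsto L^tu$, where $L^tu \in \mathcal{D}'(\Omega_2)$ is given by
$\langle L^tu, \varphi\rangle := \langle u, L\varphi\rangle \quad \forall \varphi \in \mathcal{D}(\Omega_2).$
2.35. REM (DEF 2.34 really works)

(i) We show that $L^tu \in \mathcal{D}'(\Omega_2)$. Linearity is clear and for continuity we appeal to the sequential continuity of $L$: $\varphi_l \to 0$ in $\mathcal{D}(\Omega_2)$ implies $L\varphi_l \to 0$ in $\mathcal{D}(\Omega_1)$, hence
$\langle L^tu, \varphi_l\rangle = \langle u, L\varphi_l\rangle \to 0 \quad (l \to \infty),$
since $u \in \mathcal{D}'(\Omega_1)$.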

(ii) The definition of $L^tu$ is illustrated by the following commutative diagram, which makes it a more obvious special case of the general linear algebraic notion of an adjoint of a map between arbitrary vector spaces (instead of $\mathcal{D}(\Omega_j)$, $j = 1, 2$):
$\mathcal{D}(\Omega_2) \xrightarrow{\ L\ } \mathcal{D}(\Omega_1) \xrightarrow{\ u\ } \mathbb{C}, \qquad L^tu = u \circ L: \mathcal{D}(\Omega_2) \to \mathbb{C}.$
The additional feature of our context is the sequential continuity of all maps involved.
We have even more to say about continuity.
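2.36. THM (Continuity of the adjoint) Let $L: \mathcal{D}(\Omega_2) \to \mathcal{D}(\Omega_1)$ be linear and sequentially continuous. Then the transpose $L^t: \mathcal{D}'(\Omega_1) \to \mathcal{D}'(\Omega_2)$ is linear and sequentially continuous.
Proof: Linearity is immediate from the definition.
As for continuity let $u_k \to u$ in $\mathcal{D}'(\Omega_1)$. Then for any $\varphi \in \mathcal{D}(\Omega_2)$ we have, as $k \to \infty$,
$\langle L^tu_k, \varphi\rangle = \langle u_k, L\varphi\rangle \to \langle u, L\varphi\rangle = \langle L^tu, \varphi\rangle,$
hence $L^tu_k \to L^tu$ in $\mathcal{D}'(\Omega_2)$.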
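2.37. REM (Extension via adjoints) Since $\mathcal{D}(\Omega) \subseteq \mathcal{D}'(\Omega)$ (and the inclusion is dense, see 1.45(iv)) we obtain a simple means to extend any sequentially continuous map $S: \mathcal{D}(\Omega) \to \mathcal{D}(\Omega)$ to a sequentially continuous map $\widetilde{S}: \mathcal{D}'(\Omega) \to \mathcal{D}'(\Omega)$ (and the extension is unique): First, we determine the transpose $S^\sharp$ in the sense of the bilinear form $\mathcal{D}(\Omega) \times \mathcal{D}(\Omega) \to \mathbb{C}$, $(\varphi, \psi) \mapsto \int \varphi\psi$, i.e.
$\int (S\varphi)(x)\psi(x)\,dx = \int \varphi(x)(S^\sharp\psi)(x)\,dx \quad \forall \varphi, \psi \in \mathcal{D}(\Omega).$
Second, we interpret the above relation in $\mathcal{D}'(\Omega)$ as
$\langle S\varphi, \psi\rangle = \langle\varphi, S^\sharp\psi\rangle \quad \forall \varphi, \psi \in \mathcal{D}(\Omega).$
Third, we define $\widetilde{S}u$ for any $u \in \mathcal{D}'(\Omega)$ by
$\langle\widetilde{S}u, \psi\rangle := \langle u, S^\sharp\psi\rangle = \langle (S^\sharp)^tu, \psi\rangle \quad \forall \psi \in \mathcal{D}(\Omega).$
Thus $\widetilde{S}$ is (actually, has to be) the $\mathcal{D}'$-adjoint of $S^\sharp$, i.e., $\widetilde{S} = (S^\sharp)^t$.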
2.38. Examples (A review of the operations in $\mathcal{D}'$)

(i) Consider the partial derivative $S = \partial_j$. We see from the calculation in 2.2 that $S^\sharp = -\partial_j$, hence the extension of $\partial_j$ to $\mathcal{D}'$ was done via $(-\partial_j)^t$.
(ii) If $S$ is multiplication by $f$, i.e. $S\varphi = f\varphi$ with $f \in C^\infty$, then $S^\sharp = S$. Hence the extension of multiplication by $f$ is given by $S^t$ in the form $\langle fu, \varphi\rangle = \langle u, f\varphi\rangle$, precisely what we encountered in 2.14 above.
(iii) Combining (i) and (ii) we come to an extension of a linear PDO $P(x, \partial)$ by observing that $P^\sharp$ agrees exactly with the operator (abusively) denoted by $P^t$ in (2.11) (Remark 2.21) above. Thus the extension of $P$ to $\mathcal{D}'$ we gave earlier follows the same systematic duality trick.
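The identity behind Example 2.38(i), namely $\int (\partial\varphi)\psi = \int \varphi(-\partial\psi)$ for test functions, can be checked numerically in one dimension. The sketch below is an illustration only; the concrete test functions (a bump and a shifted copy) and the midpoint rule are assumptions made for the example.

```python
import math

def bump(x):
    """Test function supported in [-1, 1] (illustrative choice)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def bump_prime(x):
    """Classical derivative of bump."""
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2 if abs(x) < 1.0 else 0.0

def integrate(g, a=-2.0, c=2.0, n=40000):
    """Composite midpoint rule on [a, c]."""
    h = (c - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

phi, dphi = bump, bump_prime
psi = lambda x: bump(x - 0.3)          # shifted copy, supported in [-0.7, 1.3]
dpsi = lambda x: bump_prime(x - 0.3)

lhs = integrate(lambda x: dphi(x) * psi(x))      # int (S phi) psi,  S  = d/dx
rhs = integrate(lambda x: phi(x) * (-dpsi(x)))   # int phi (S# psi), S# = -d/dx
```

Both numbers agree up to rounding, which is the integration-by-parts relation that fixes $S^\sharp = -\partial_j$.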

Chapter 3
BASIC CONSTRUCTIONS
3.1. Intro In this chapter we discuss the extension of two basic constructions with smooth functions to the case of distributions: the tensor product and composition.
Recall: If $\Omega_1 \subseteq \mathbb{R}^{n_1}$ and $\Omega_2 \subseteq \mathbb{R}^{n_2}$ are open subsets and $f \in C^\infty(\Omega_1)$, $g \in C^\infty(\Omega_2)$, then the function $f \otimes g \in C^\infty(\Omega_1 \times \Omega_2)$, the tensor product of $f$ and $g$, is defined by
$f \otimes g(x, y) := f(x)g(y) \quad ((x, y) \in \Omega_1 \times \Omega_2).$
We will extend this to the case where $f$ as well as $g$ is a distribution in §3.2 below.
If $h: \Omega_1 \to \Omega_2$ is a smooth map, then the pullback of $g$ by $h$ is the function $h^*g \in C^\infty(\Omega_1)$ defined by composition
$h^*g := g \circ h: \Omega_1 \xrightarrow{\ h\ } \Omega_2 \xrightarrow{\ g\ } \mathbb{C}.$
We will extend this to the case where $g$ is a distribution for certain classes of smooth maps $h$ in §3.3 below. In particular we will thus define translations, scalings, and change of coordinates for distributions.
As a technical preparation for an elegant and elementary¹ definition of the tensor product of distributions we will first consider test functions depending smoothly on additional parameters. This is the subject of §3.1, which is also of interest in its own right.

¹ i.e. without recourse to the abstract theory of tensor products of infinite-dimensional locally convex vector spaces

3.1. TEST FUNCTIONS DEPENDING ON PARAMETERS
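3.2. PROP ($\mathcal{D}'$-action on test functions depending on parameters) Let $n_1, n_2 \in \mathbb{N}$ and $\Omega_j \subseteq \mathbb{R}^{n_j}$ $(j = 1, 2)$ be open subsets. Assume that $\varphi \in C^\infty(\Omega_1 \times \Omega_2)$ satisfies the following:
$\forall y' \in \Omega_2$ $\exists$ neighborhood $U(y')$ of $y'$ in $\Omega_2$ $\exists K(y') \Subset \Omega_1$:
$\mathrm{supp}\,\varphi(., y) \subseteq K(y') \quad \forall y \in U(y').$
[The support of the map $x \mapsto \varphi(x, y)$ is contained in $K(y')$.]
Then we have $\forall u \in \mathcal{D}'(\Omega_1)$ that
$y \mapsto \langle u(x), \varphi(x, y)\rangle := \langle u, \varphi(., y)\rangle \in C^\infty(\Omega_2)$
and for all $\alpha \in \mathbb{N}_0^{n_2}$
(3.1)  $\partial_y^\alpha\langle u, \varphi(., y)\rangle = \langle u, \partial_y^\alpha\varphi(., y)\rangle.$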
3.3. REM
(i) Note that for a regular distribution $u \in L^1_{loc}(\Omega_1)$ Equation (3.1) reads
$\partial_y^\alpha \int_{\Omega_1} u(x)\varphi(x, y)\,dx = \int_{\Omega_1} u(x)\,\partial_y^\alpha\varphi(x, y)\,dx,$
hence it includes a variant of the classical theorem on "differentiation under the integral".
(ii) To be prepared for the proof we recall a basic estimate, which is a consequence of the mean value theorem (cf. [Hor09, 18.18, equation (18.13)], or [For05, §6, Corollar zu Satz 5]): If $f \in C^1(\Omega)$ and the line segment $\overline{xy}$ joining $x, y \in \Omega$ lies entirely in $\Omega$, then we have
(3.2)  $|f(x) - f(y)| \le \|Df\|_{L^\infty(\overline{xy})}\,|x - y|.$
[Here, $\|Df\|_{L^\infty(A)} = \sup_{z \in A}|Df(z)|$ and $|.|$ is the Euclidean norm.]
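Proof of Proposition 3.2: By hypothesis we have for any $y \in \Omega_2$ that $x \mapsto \varphi(x, y)$ belongs to $\mathcal{D}(\Omega_1)$. Thus we may define
(*)  $\Phi(y) := \langle u(x), \varphi(x, y)\rangle \quad (y \in \Omega_2).$
Let $y' \in \Omega_2$ and choose $U(y')$ and $K(y')$ as in the hypothesis. Let $\delta > 0$ be such that $\overline{B_\delta(y')} \subseteq U(y')$.

$\Phi$ is continuous: For any $h \in \mathbb{R}^{n_2}$ with $|h| < \delta$ set
$\varphi_h(x, y') := \varphi(x, y' + h) - \varphi(x, y').$
Then we may write
$\Phi(y' + h) - \Phi(y') = \langle u(x), \varphi_h(x, y')\rangle$
and conclude that it suffices to show $\varphi_h(., y') \to 0$ in $\mathcal{D}(\Omega_1)$ as $h \to 0$.
From the hypothesis we have $\mathrm{supp}(\varphi_h(., y')) \subseteq K(y') \Subset \Omega_1$ for all $h$ with $|h| < \delta$. Furthermore, if $\alpha \in \mathbb{N}_0^{n_1}$ we may apply (3.2) to the function $y \mapsto \partial_x^\alpha\varphi(x, y)$ and obtain
$|\partial_x^\alpha\varphi_h(x, y')| = |\partial_x^\alpha\varphi(x, y' + h) - \partial_x^\alpha\varphi(x, y')| \le \|D_y\partial_x^\alpha\varphi\|_{L^\infty(K(y') \times \overline{B_\delta(y')})}\,|h| \to 0 \quad (h \to 0).$

$\Phi$ is continuously differentiable: Let $e_j$ denote the $j$th standard basis vector in $\mathbb{R}^{n_2}$ $(1 \le j \le n_2)$ and define for $0 < \varepsilon < \delta$
$\psi_\varepsilon(x, y') := \frac{\varphi(x, y' + \varepsilon e_j) - \varphi(x, y')}{\varepsilon} - \partial_{y_j}\varphi(x, y') \quad (x \in \Omega_1).$
By (*) we obtain
$\frac{\Phi(y' + \varepsilon e_j) - \Phi(y')}{\varepsilon} - \langle u(x), \partial_{y_j}\varphi(x, y')\rangle = \langle u(x), \psi_\varepsilon(x, y')\rangle$
and thus recognize that it suffices to prove $\psi_\varepsilon(., y') \to 0$ in $\mathcal{D}(\Omega_1)$ as $\varepsilon \to 0$, since we know that $y' \mapsto \langle u(x), \partial_{y_j}\varphi(x, y')\rangle$ is continuous (by an application of the first part of this proof to $\partial_{y_j}\varphi$ in place of $\varphi$).
From the hypothesis we get $\mathrm{supp}(\psi_\varepsilon(., y')) \subseteq K(y') \Subset \Omega_1$ for all $\varepsilon \in\, ]0, \delta[$. Furthermore, if $\alpha \in \mathbb{N}_0^{n_1}$ we may apply the mean value theorem to the function $\varepsilon \mapsto \partial_x^\alpha\varphi(x, y' + \varepsilon e_j)$ and obtain with some $\varepsilon_1 \in [0, \varepsilon]$
$\partial_x^\alpha\psi_\varepsilon(x, y') = \partial_{y_j}\partial_x^\alpha\varphi(x, y' + \varepsilon_1e_j) - \partial_{y_j}\partial_x^\alpha\varphi(x, y').$
Hence another application of (3.2) to the function $y \mapsto \partial_{y_j}\partial_x^\alpha\varphi(x, y)$ now gives
$|\partial_x^\alpha\psi_\varepsilon(x, y')| \le C(\alpha, \varphi)\,\varepsilon_1 \to 0 \quad (0 \le \varepsilon_1 \le \varepsilon \to 0),$
where the constant $C(\alpha, \varphi)$ equals the maximum of $|D_y(\partial_{y_j}\partial_x^\alpha\varphi)|$ on $K(y') \times \overline{B_\delta(y')}$.
In particular, we obtain the special case of (3.1) when $\alpha = e_j$, i.e.,
$\partial_j\Phi(y') = \langle u(x), \partial_{y_j}\varphi(x, y')\rangle.$

$\Phi \in C^\infty(\Omega_2)$: Proceeding inductively we obtain that $\Phi$ is continuously differentiable to every order and that (3.1) holds for all $\alpha \in \mathbb{N}_0^{n_2}$.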
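3.4. COR
(i) If $u \in \mathcal{D}'(\Omega_1)$ and $\varphi \in \mathcal{D}(\Omega_1 \times \Omega_2)$, then the function $y \mapsto \langle u, \varphi(., y)\rangle$ belongs to $\mathcal{D}(\Omega_2)$ and (3.1) holds.
(ii) If $u \in \mathcal{E}'(\Omega_1)$ and $\varphi \in C^\infty(\Omega_1 \times \Omega_2)$, then the function $y \mapsto \langle u, \varphi(., y)\rangle$ belongs to $C^\infty(\Omega_2)$ and (3.1) holds.
Proof: (i) The hypothesis of Proposition 3.2 is satisfied with $U(y') = \Omega_2$ and $K(y') = \pi_1(\mathrm{supp}(\varphi))$, where $\pi_1$ denotes the projection $\Omega_1 \times \Omega_2 \to \Omega_1$, $(x, y) \mapsto x$. Moreover, $\varphi(., y) = 0$ whenever $y$ lies outside the compact projection of $\mathrm{supp}(\varphi)$ onto $\Omega_2$, so the function has compact support.
(ii) According to the proof of Theorem 1.66 we may choose a suitable cut-off $\rho$ over a neighborhood of $\mathrm{supp}(u)$. Then the function $\Phi$ defined by the action of $u$ on $\varphi(., y)$ is given by
$\Phi(y) = \langle u, \rho(.)\varphi(., y)\rangle.$
Hence we may copy the proof of Proposition 3.2 upon taking $U(y') = \Omega_2$ and $K(y') = \mathrm{supp}(\rho)$.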
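3.5. Example Let $\varphi \in \mathcal{D}(\mathbb{R}^n)$ and define the function $\psi: \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{C}$ by $\psi(x, y) := \varphi(x + y)$. Then $\psi \in C^\infty(\mathbb{R}^n \times \mathbb{R}^n)$ and $\psi$ satisfies the hypothesis of Proposition 3.2, e.g. with $U(y') := B_1(y')$ and $K(y') := \mathrm{supp}(\varphi) - \overline{B_1(y')} = \{z - y \mid z \in \mathrm{supp}(\varphi), y \in \overline{B_1(y')}\}$.
Thus, for any $u \in \mathcal{D}'(\mathbb{R}^n)$ the map $y \mapsto \langle u(x), \varphi(x + y)\rangle$ is smooth $\mathbb{R}^n \to \mathbb{C}$ and, using (3.1) in the third step,
$\langle\partial_ju, \varphi\rangle = -\langle u, \partial_j\varphi\rangle = -\langle u, \partial_{y_j}\psi(., 0)\rangle = -\partial_j\langle u, \psi(., y)\rangle|_{y=0}$
$= -\lim_{\varepsilon \to 0} \frac{\langle u(x), \varphi(x + \varepsilon e_j)\rangle - \langle u(x), \varphi(x)\rangle}{\varepsilon} = \lim_{\varepsilon \to 0} \frac{\langle u(x), \varphi(x - \varepsilon e_j)\rangle - \langle u(x), \varphi(x)\rangle}{\varepsilon}$
$= \lim_{\varepsilon \to 0} \langle\frac{u(x + \varepsilon e_j) - u(x)}{\varepsilon}, \varphi(x)\rangle$  [if, in addition, $u$ is regular].
(Compare this with Example 3.15(ii), formula (3.8) below.)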
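The last formula can be tried out for the regular distribution $u = H$ (Heaviside function) on $\mathbb{R}$, where $(H(x + \varepsilon) - H(x))/\varepsilon$ equals $1/\varepsilon$ on $[-\varepsilon, 0[$ and the limit should be $\langle\delta, \varphi\rangle = \varphi(0)$. The following Python sketch is an illustration only; the shifted bump used as test function and the midpoint rule are assumptions made for the example.

```python
import math

def bump(x):
    """Test function supported in [-1, 1] (illustrative choice)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def phi(x):
    """Test function with phi'(0) != 0, supported in [-0.8, 1.2]."""
    return bump(x - 0.2)

def difference_quotient_pairing(eps, n=2000):
    """<(H(. + eps) - H) / eps, phi> = (1/eps) * int_{-eps}^{0} phi(x) dx."""
    h = eps / n
    return sum(phi(-eps + (i + 0.5) * h) for i in range(n)) * h / eps

target = phi(0.0)   # <H', phi> = <delta, phi> = phi(0)
errors = [abs(difference_quotient_pairing(eps) - target)
          for eps in (1e-1, 1e-2, 1e-3)]
```

The errors shrink roughly linearly in $\varepsilon$, in line with convergence of the difference quotients to $H' = \delta$ in $\mathcal{D}'$.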

3.2. TENSOR PRODUCT OF DISTRIBUTIONS
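3.6. Motivation Let $n_1, n_2 \in \mathbb{N}$ and $\Omega_j \subseteq \mathbb{R}^{n_j}$ $(j = 1, 2)$ be open subsets. For functions $f \in C^\infty(\Omega_1)$ and $g \in C^\infty(\Omega_2)$ we define the tensor product $f \otimes g \in C^\infty(\Omega_1 \times \Omega_2)$ by
$f \otimes g\,(x, y) := f(x)\,g(y) \quad (x \in \Omega_1, y \in \Omega_2).$
We may consider $f \otimes g$ as a regular distribution on $\Omega_1 \times \Omega_2$ with action on a test function $\varphi \in \mathcal{D}(\Omega_1 \times \Omega_2)$ according to
$\langle f \otimes g, \varphi\rangle = \int_{\Omega_1 \times \Omega_2} f(x)g(y)\varphi(x, y)\,d(x, y) = \int_{\Omega_2} g(y) \int_{\Omega_1} f(x)\varphi(x, y)\,dx\,dy = \langle g(y), \langle f(x), \varphi(x, y)\rangle\rangle.$
In particular, if $\varphi(x, y) = \psi(x)\chi(y)$, i.e. $\varphi = \psi \otimes \chi$, with $\psi \in \mathcal{D}(\Omega_1)$ and $\chi \in \mathcal{D}(\Omega_2)$, then we obtain
(*)  $\langle f \otimes g, \psi \otimes \chi\rangle = \langle f, \psi\rangle\langle g, \chi\rangle.$
Our aim is to extend the tensor product to distributions in such a way that the analogue of equation (*) holds for all test functions $\psi$, $\chi$ and determines the distributional tensor product uniquely.
The first step will be to show that the linear combinations of all elements of the form $\psi \otimes \chi$ are dense in $\mathcal{D}(\Omega_1 \times \Omega_2)$.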
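3.7. LEMMA (Tensor products are dense in $\mathcal{D}(\Omega_1 \times \Omega_2)$) Let $M$ denote the subspace of $\mathcal{D}(\Omega_1 \times \Omega_2)$ defined by the linear span of the set
$M_0 := \{\psi \otimes \chi \mid \psi \in \mathcal{D}(\Omega_1), \chi \in \mathcal{D}(\Omega_2)\}.$
Then $M$ is dense in $\mathcal{D}(\Omega_1 \times \Omega_2)$.
(Note that it suffices to consider sums of elements in $M_0$ to generate all of $M$, since scalar factors can always be subsumed into one of the functions.)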

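Proof: Let $\varphi \in \mathcal{D}(\Omega_1 \times \Omega_2)$. We have to show that there exist sequences $(\psi_j)$ in $\mathcal{D}(\Omega_1)$ and $(\chi_j)$ in $\mathcal{D}(\Omega_2)$ such that
$\sum_{j=0}^m \psi_j \otimes \chi_j \to \varphi$ in $\mathcal{D}(\Omega_1 \times \Omega_2)$ as $m \to \infty$.
By a partition of unity and after appropriate translation we may w.l.o.g. assume that $\mathrm{supp}(\varphi) \subseteq\, ]0,1[^{n_1+n_2}$ and that $\Omega_l =\, ]0,1[^{n_l}$ $(l = 1, 2)$. Setting $n = n_1 + n_2$ and $I :=\, ]0,1[^n$ we will prove the following
Claim: For any $\varphi \in \mathcal{D}(I)$ we can find $n$ sequences $(\psi_{j,1})_{j \in \mathbb{N}_0}, \ldots, (\psi_{j,n})_{j \in \mathbb{N}_0}$ in $\mathcal{D}(]0,1[)$ such that putting
$\Phi_m(x_1, \ldots, x_n) := \sum_{j=0}^m \psi_{j,1}(x_1) \cdots \psi_{j,n}(x_n) \quad ((x_1, \ldots, x_n) \in I)$
we obtain
(**)  $\Phi_m \to \varphi$ in $\mathcal{D}(I)$.
Assuming the claim to be true for a moment, we first show how it implies the statement of the lemma: we simply set
$\psi_j(x_1, \ldots, x_{n_1}) := \psi_{j,1}(x_1) \cdots \psi_{j,n_1}(x_{n_1}),$
$\chi_j(y_1, \ldots, y_{n_2}) := \psi_{j,n_1+1}(y_1) \cdots \psi_{j,n_1+n_2}(y_{n_2}),$
then $\sum_{j=0}^m \psi_j \otimes \chi_j = \Phi_m$ and (**) completes the proof of the lemma.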
Proof of the claim: Considering $\varphi$ as a periodic function² on $\mathbb{R}^n$, i.e. $\varphi(x + k) = \varphi(x)$ for all $k \in \mathbb{Z}^n$, we obtain the Fourier series expansion
$\varphi(x) = \sum_{k \in \mathbb{Z}^n} c_ke^{2\pi i\langle k|x\rangle} \quad (x \in I),$
where the Fourier coefficients are given by $c_k = \int_I \varphi(x)e^{-2\pi i\langle k|x\rangle}\,dx$ $(k \in \mathbb{Z}^n)$.
By smoothness of $\varphi$ we have convergence of the (partial sums of the) Fourier series to $\varphi$ in $C^\infty(\mathbb{R}^n)$, that is, uniformly on compact sets in all derivatives.
[Via several integrations by parts it is routine to deduce the following: $\forall l \in \mathbb{N}$ $\exists \gamma_l > 0$ such that $|c_k| \le \gamma_l(1 + |k|^2)^{-l}$; thus we obtain uniform and absolute convergence of every derivative of the Fourier series, hence convergence to some function in $C^\infty(I)$; since by abstract Hilbert space theory the Fourier series converges to $\varphi$ in $L^2(I)$, the $C^\infty$-limit of the series must also be $\varphi$. For details on multiple Fourier series see, e.g., [SD80, Kapitel I, insbesondere Satz 8.1 und Bemerkung nach dessen Beweis].]

² Note that the extension is in $C^\infty(\mathbb{R}^n)$, since $\mathrm{supp}(\varphi)$ has positive distance from the boundary $\partial I$.

Since $\mathrm{supp}(\varphi)$ is compact in $I$ we can find $\varepsilon > 0$ such that $\mathrm{supp}(\varphi) \subseteq [2\varepsilon, 1 - 2\varepsilon]^n$.
Let $\theta \in \mathcal{D}(]0,1[)$ with $\theta = 1$ on $]\varepsilon, 1 - \varepsilon[$ and define $\Psi_N \in \mathcal{D}(I)$ for $N \in \mathbb{N}_0$ by
$\Psi_N(x_1, \ldots, x_n) := \sum_{(k_1, \ldots, k_n) \in \mathbb{Z}^n,\ |k_1|, \ldots, |k_n| \le N} c_k \prod_{l=1}^n \theta(x_l)e^{2\pi ik_lx_l}.$
Clearly $\mathrm{supp}(\Psi_N) \subseteq \mathrm{supp}(\theta)^n$ for all $N$, $\mathrm{supp}(\varphi) \subseteq \mathrm{supp}(\theta)^n$, and $\Psi_N|_{\mathrm{supp}(\varphi)}$ agrees with the corresponding partial sum of the Fourier series. Hence by Leibniz' rule we obtain that $\Psi_N \to \varphi$ in $\mathcal{D}(I)$. Finally, since the series $(\Psi_N)_{N \in \mathbb{N}_0}$ converges uniformly absolutely (for all derivatives) it may be brought into the form as claimed by a standard relabeling procedure.
[A few details on the relabeling: First, for any $k = (k_1, \ldots, k_n) \in \mathbb{Z}^n$ we put $\widetilde{\psi}^k_1(x_1) := c_k\theta(x_1)e^{2\pi ik_1x_1}$ and $\widetilde{\psi}^k_l(x_l) := \theta(x_l)e^{2\pi ik_lx_l}$ $(l = 2, \ldots, n)$, so that $\Psi_N(x_1, \ldots, x_n) = \sum_{k \in \mathbb{Z}^n, \|k\|_\infty \le N} \widetilde{\psi}^k_1(x_1) \cdots \widetilde{\psi}^k_n(x_n)$; second, choose a bijection $\beta: \mathbb{N}_0 \to \mathbb{Z}^n$ and define $\psi_{j,l} := \widetilde{\psi}^{\beta(j)}_l$; then the partial sums $\Phi_m(x_1, \ldots, x_n) := \sum_{j=0}^m \psi_{j,1}(x_1) \cdots \psi_{j,n}(x_n)$ are rearrangements of the original series; by uniform absolute convergence of the original series (for every derivative) we obtain also $\Phi_m \to \varphi$ in $\mathcal{D}(I)$.]
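3.8. THM (Tensor product of distributions) Let $u \in \mathcal{D}'(\Omega_1)$ and $v \in \mathcal{D}'(\Omega_2)$. There exists a unique distribution $u \otimes v \in \mathcal{D}'(\Omega_1 \times \Omega_2)$, called the tensor product of $u$ and $v$, such that
(3.3)  $\langle u \otimes v, \psi \otimes \chi\rangle = \langle u, \psi\rangle\langle v, \chi\rangle \quad \forall \psi \in \mathcal{D}(\Omega_1), \forall \chi \in \mathcal{D}(\Omega_2).$
Proof: Uniqueness: By (3.3) the linear form $u \otimes v$ is determined on the subspace $M \subseteq \mathcal{D}(\Omega_1 \times \Omega_2)$ generated by splitting tensors of the form $\psi \otimes \chi$ ($M$ as in Lemma 3.7).
Indeed, if $\varphi = \sum_{j=1}^m \psi_j \otimes \chi_j$ (with $\psi_j \in \mathcal{D}(\Omega_1)$, $\chi_j \in \mathcal{D}(\Omega_2)$), then by linearity and (3.3)
(*)  $\langle u \otimes v, \varphi\rangle = \sum_{j=1}^m \langle u \otimes v, \psi_j \otimes \chi_j\rangle = \sum_{j=1}^m \langle u, \psi_j\rangle\langle v, \chi_j\rangle.$
By assumption $u \otimes v$ is continuous on $\mathcal{D}(\Omega_1 \times \Omega_2)$. Therefore uniqueness of $u \otimes v$ follows, since $M$ is dense due to Lemma 3.7.
Existence: Note that the right-hand side of (*) can be rewritten in the form
$\sum_{j=1}^m \langle v, \langle u, \psi_j\rangle\chi_j\rangle = \sum_{j=1}^m \langle v(y), \langle u(x), \psi_j(x)\chi_j(y)\rangle\rangle = \langle v(y), \langle u(x), \varphi(x, y)\rangle\rangle.$
Let now $\varphi \in \mathcal{D}(\Omega_1 \times \Omega_2)$ be arbitrary. By Corollary 3.4(i) the function $y \mapsto \langle u(x), \varphi(x, y)\rangle$ belongs to $\mathcal{D}(\Omega_2)$, hence we may define a linear form $u \otimes v$ on $\mathcal{D}(\Omega_1 \times \Omega_2)$ by
(3.4)  $\langle u \otimes v, \varphi\rangle := \langle v(y), \langle u(x), \varphi(x, y)\rangle\rangle \quad \forall \varphi \in \mathcal{D}(\Omega_1 \times \Omega_2).$
On the subspace $M$ this definition reproduces (*), in particular, (3.3) holds. It remains to show that $u \otimes v$ is continuous.
Let $K \Subset \Omega_1 \times \Omega_2$ and denote by $K_i \Subset \Omega_i$ the projection of $K$ onto $\Omega_i$ $(i = 1, 2)$. Let $\varphi \in \mathcal{D}(K)$ and define $g \in \mathcal{D}(K_2)$ by
$g(y) := \langle u, \varphi(., y)\rangle \quad (y \in \Omega_2).$
Recall that (3.1) gives $\partial^\beta g(y) = \langle u, \partial_y^\beta\varphi(., y)\rangle$.
The continuity condition (SN) applied to $v$ provides $m$ and $C$ (depending on $K_2$ only, not on $g$ or $\varphi$) such that
(**)  $|\langle v, g\rangle| \le C \sum_{|\beta| \le m} \|\partial^\beta g\|_{L^\infty(K_2)}.$
Since $\mathrm{supp}(\varphi(., y)) \subseteq K_1$ we may employ (SN) for $u$ to provide $N$ and $C'$ (depending on $K_1$ only, not on $\varphi$) such that
(***)  $|\partial^\beta g(y)| = |\langle u, \partial_y^\beta\varphi(., y)\rangle| \le C' \sum_{|\alpha| \le N} \|\partial_x^\alpha\partial_y^\beta\varphi(., y)\|_{L^\infty(K_1)}.$
Combining (**) and (***) yields an estimate of the form (SN) for $u \otimes v$ (with constant $CC'$ and derivative order $m + N$).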

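3.9. THM (Properties of the distributional tensor product) Let $u \in \mathcal{D}'(\Omega_1)$ and $v \in \mathcal{D}'(\Omega_2)$. The tensor product $u \otimes v \in \mathcal{D}'(\Omega_1 \times \Omega_2)$ satisfies the following "Fubini-like" relation for all $\varphi \in \mathcal{D}(\Omega_1 \times \Omega_2)$
(3.5)  $\langle u \otimes v, \varphi\rangle = \langle v(y), \langle u(x), \varphi(x, y)\rangle\rangle = \langle u(x), \langle v(y), \varphi(x, y)\rangle\rangle.$
Moreover, we have the following properties:
(i) $\mathrm{supp}(u \otimes v) = \mathrm{supp}(u) \times \mathrm{supp}(v)$.
(ii) $\partial_x^\alpha\partial_y^\beta(u \otimes v) = \partial_x^\alpha u \otimes \partial_y^\beta v$.
(iii) $\otimes: \mathcal{D}'(\Omega_1) \times \mathcal{D}'(\Omega_2) \to \mathcal{D}'(\Omega_1 \times \Omega_2)$ is bilinear and (jointly) sequentially continuous.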

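Proof: We have used the equation $\langle u \otimes v, \varphi\rangle = \langle v(y), \langle u(x), \varphi(x, y)\rangle\rangle$ already to prove existence of the tensor product.
If we consider the functional $w: \varphi \mapsto \langle u(x), \langle v(y), \varphi(x, y)\rangle\rangle$, then it is easy to see that it satisfies (3.3), and continuity of $w$ follows similarly as in the proof of Theorem 3.8. Hence by uniqueness we necessarily have $w = u \otimes v$ and, consequently, Equation (3.5) holds.
(i): We first show $\mathrm{supp}(u) \times \mathrm{supp}(v) \subseteq \mathrm{supp}(u \otimes v)$.
Let $(x, y) \in \mathrm{supp}(u) \times \mathrm{supp}(v)$ and let $W$ be a neighborhood of $(x, y)$ in $\Omega_1 \times \Omega_2$. We may find a neighborhood $U(x)$ of $x$ (in $\Omega_1$) and a neighborhood $V(y)$ of $y$ in $\Omega_2$ with $U(x) \times V(y) \subseteq W$. Then
$x \in \mathrm{supp}(u) \implies \exists \psi \in \mathcal{D}(U(x)): \langle u, \psi\rangle \neq 0,$
$y \in \mathrm{supp}(v) \implies \exists \chi \in \mathcal{D}(V(y)): \langle v, \chi\rangle \neq 0,$
so that $\mathrm{supp}(\psi \otimes \chi) \subseteq U(x) \times V(y) \subseteq W$ and $\langle u \otimes v, \psi \otimes \chi\rangle = \langle u, \psi\rangle\langle v, \chi\rangle \neq 0$, thus $(x, y) \in \mathrm{supp}(u \otimes v)$.
To show the reverse inclusion relation suppose $(x, y) \in (\Omega_1 \times \Omega_2) \setminus (\mathrm{supp}(u) \times \mathrm{supp}(v))$.
We may assume w.l.o.g. that $x \notin \mathrm{supp}(u)$. Then there exists some neighborhood $U(x)$ of $x$ (in $\Omega_1$) such that $U(x) \cap \mathrm{supp}(u) = \emptyset$.
Let $\varphi \in \mathcal{D}(\Omega_1 \times \Omega_2)$ with $\mathrm{supp}(\varphi) \subseteq U(x) \times \Omega_2$ be arbitrary. Then we obtain $\forall y' \in \Omega_2$:
$\{x' \in \Omega_1 \mid \varphi(x', y') \neq 0\} \subseteq \pi_1(\mathrm{supp}(\varphi)) \subseteq U(x)$, i.e., $\mathrm{supp}(\varphi(., y')) \subseteq U(x)$.
Hence Proposition 1.56 implies that $\langle u(.), \varphi(., y')\rangle = 0$ for all $y' \in \Omega_2$. Therefore
$\langle u \otimes v, \varphi\rangle = \langle v(y'), \langle u(x'), \varphi(x', y')\rangle\rangle = 0.$
Since $\varphi$ was an arbitrary element of $\mathcal{D}(U(x) \times \Omega_2)$ we conclude that $(x, y) \notin \mathrm{supp}(u \otimes v)$.
(ii): By a direct calculation of the action on any $\varphi \in \mathcal{D}(\Omega_1 \times \Omega_2)$, using (3.1) in the fourth step,
$\langle\partial_x^\alpha\partial_y^\beta(u \otimes v), \varphi\rangle = (-1)^{|\alpha|+|\beta|}\langle u \otimes v, \partial_x^\alpha\partial_y^\beta\varphi\rangle = (-1)^{|\alpha|+|\beta|}\langle u(x), \langle v(y), \partial_y^\beta\partial_x^\alpha\varphi(x, y)\rangle\rangle$
$= (-1)^{|\alpha|}\langle u(x), \langle\partial_y^\beta v(y), \partial_x^\alpha\varphi(x, y)\rangle\rangle = (-1)^{|\alpha|}\langle u(x), \partial_x^\alpha\langle\partial_y^\beta v(y), \varphi(x, y)\rangle\rangle$
$= \langle\partial_x^\alpha u(x), \langle\partial_y^\beta v(y), \varphi(x, y)\rangle\rangle = \langle\partial_x^\alpha u \otimes \partial_y^\beta v, \varphi\rangle.$
(iii): Separate sequential continuity follows from (3.5) using the sequential continuity of $u$ and $v$ respectively. Joint sequential continuity is due to Remark 1.45(ii).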
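The "Fubini-like" relation (3.5) can be checked numerically in a simple case. The sketch below is an illustration only and makes several assumptions not found in the notes: $\Omega_1 = \Omega_2 = \mathbb{R}$, $u = \delta'$, $v = H$, a radial bump centred at $(0.2, 0.1)$ as test function, the midpoint rule, and a central difference for the outer derivative.

```python
import math

def b(s):
    """Smooth profile: exp(-1/(1-s)) for s < 1, zero otherwise."""
    return math.exp(-1.0 / (1.0 - s)) if s < 1.0 else 0.0

def b_prime(s):
    """Derivative of b with respect to s."""
    return -b(s) / (1.0 - s) ** 2 if s < 1.0 else 0.0

def phi(x, y):
    """Test function on R^2 that is not a product psi(x) * chi(y)."""
    return b((x - 0.2) ** 2 + (y - 0.1) ** 2)

def dx_phi(x, y):
    """Partial derivative of phi with respect to x (chain rule)."""
    return b_prime((x - 0.2) ** 2 + (y - 0.1) ** 2) * 2.0 * (x - 0.2)

def integrate(g, a, c, n=4000):
    """Composite midpoint rule on [a, c]."""
    h = (c - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

Y_MAX = 1.2  # phi(x, .) vanishes for y > 1.2 when x is near 0

# <v(y), <u(x), phi(x, y)>> with u = delta', v = H:
# the inner pairing is -d/dx phi(0, y), then integrate over y > 0.
inner_first_u = integrate(lambda y: -dx_phi(0.0, y), 0.0, Y_MAX)

# <u(x), <v(y), phi(x, y)>>: minus the x-derivative at 0 of G(x) = int_0^inf phi(x, y) dy.
def G(x):
    return integrate(lambda y: phi(x, y), 0.0, Y_MAX)

eps = 1e-4
inner_first_v = -(G(eps) - G(-eps)) / (2.0 * eps)
```

Both orders of evaluation give the same number up to discretization error; in this special case the agreement is just differentiation under the integral sign, cf. Remark 3.3(i).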
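3.10. Example Let $\Omega_1 = \Omega_2 = \mathbb{R}$ and $u = v = \delta$. We have
$\langle\delta \otimes \delta, \varphi\rangle = \langle\delta(x), \langle\delta(y), \varphi(x, y)\rangle\rangle = \langle\delta(x), \varphi(x, 0)\rangle = \varphi(0, 0) = \langle\delta(x, y), \varphi(x, y)\rangle,$
i.e. $\delta(x) \otimes \delta(y) = \delta(x, y)$.
Moreover, we have
$\partial_x\partial_y(H(x) \otimes H(y)) = \delta(x) \otimes \delta(y) = \delta(x, y).$
3.11. Remark Tensor products of any finite number of distributional factors are constructed in a similar way and the properties are analogous. For example, we obtain on $\mathbb{R}^n = \mathbb{R} \times \cdots \times \mathbb{R}$ ($n$ times) by a calculation as above
$\delta(x) = \delta(x_1, \ldots, x_n) = \delta(x_1) \otimes \cdots \otimes \delta(x_n)$
and
$\partial_1 \cdots \partial_n(H(x_1) \otimes \cdots \otimes H(x_n)) = \delta(x_1, \ldots, x_n).$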
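3.12. THM (Distributions that are constant in one direction) Let $u \in \mathcal{D}'(\mathbb{R}^n)$. Then we have:
$\partial_nu = 0 \iff \exists v \in \mathcal{D}'(\mathbb{R}^{n-1}): u(x) = v(x') \otimes 1(x_n),$
with the notation $x' = (x_1, \ldots, x_{n-1})$ and $1(x_n)$ for the constant function $x_n \mapsto 1$.
Note that in this case the action of $u$ on a test function $\varphi \in \mathcal{D}(\mathbb{R}^n)$ is thus given by
(3.6)  $\langle u, \varphi\rangle = \langle 1(x_n), \langle v(x'), \varphi(x', x_n)\rangle\rangle = \int_{\mathbb{R}} \langle v, \varphi(., t)\rangle\,dt.$
Proof: ($\Leftarrow$) is immediate from Theorem 3.9(ii).
($\Rightarrow$)³ Let $\rho \in \mathcal{D}(\mathbb{R})$ with $\int\rho = 1$ and define the linear functional $v: \mathcal{D}(\mathbb{R}^{n-1}) \to \mathbb{C}$ by
(*)  $\langle v, \psi\rangle := \langle u(x', x_n), \psi(x')\rho(x_n)\rangle \quad (\psi \in \mathcal{D}(\mathbb{R}^{n-1})).$
Continuity of $v$ follows from the observation that $\psi_k \to 0$ in $\mathcal{D}(\mathbb{R}^{n-1})$ $(k \to \infty)$ implies $\psi_k \otimes \rho \to 0$ in $\mathcal{D}(\mathbb{R}^n)$, hence $v \in \mathcal{D}'(\mathbb{R}^{n-1})$.
Now let $\varphi \in \mathcal{D}(\mathbb{R}^n)$, then we calculate, using (*) in the last step,
$\langle v \otimes 1, \varphi\rangle = \langle v(x'), \langle 1(t), \varphi(x', t)\rangle\rangle = \langle v(x'), \int\varphi(x', t)\,dt\rangle = \langle u(x', x_n), (\int\varphi(x', t)\,dt)\,\rho(x_n)\rangle.$
Hence we may write
$\langle u - v \otimes 1, \varphi\rangle = \langle u(x', x_n), \eta(x', x_n)\rangle$ with $\eta(x', x_n) := \varphi(x', x_n) - (\int\varphi(x', t)\,dt)\,\rho(x_n).$
Observe that for every $x' \in \mathbb{R}^{n-1}$ we have, since $\int\rho(x_n)\,dx_n = 1$,
$\int\eta(x', x_n)\,dx_n = \int\varphi(x', x_n)\,dx_n - (\int\varphi(x', t)\,dt) \int\rho(x_n)\,dx_n = 0.$
Therefore $\mu(x', x_n) := \int_{-\infty}^{x_n} \eta(x', s)\,ds$ defines a function $\mu \in \mathcal{D}(\mathbb{R}^n)$ with the property $\partial_n\mu = \eta$ (note that this is a parametrized variant of Sublemma 2.26).
Therefore we obtain finally, since $\partial_nu = 0$,
$\langle u - v \otimes 1, \varphi\rangle = \langle u, \eta\rangle = \langle u, \partial_n\mu\rangle = -\langle\partial_nu, \mu\rangle = 0,$
thus $u = v \otimes 1$.

³ It is not difficult to guess $v$ by making the ansatz $u = v \otimes 1$ and considering the action of $u$ on a tensor product: $\langle u, \psi \otimes \rho\rangle = \langle v(x'), \langle 1(x_n), \psi(x')\rho(x_n)\rangle\rangle = \langle v(x'), \langle 1, \rho\rangle\psi(x')\rangle = \langle 1, \rho\rangle\langle v, \psi\rangle = (\int\rho)\,\langle v, \psi\rangle$.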

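3.3. CHANGE OF COORDINATES AND PULLBACK
3.13. Intro
(o) If $X, Y, Z$ are sets and $w: Y \to Z$, $h: X \to Y$ are maps, then we denote by $h^*w := w \circ h: X \to Z$ the pullback of the map $w$ to $X$.
(i) Change of coordinates: Let $\Omega_1, \Omega_2 \subseteq \mathbb{R}^n$ be open subsets and $F: \Omega_1 \to \Omega_2$ be a diffeomorphism, i.e., $F$ is bijective and $F$ as well as $F^{-1}$ are $C^\infty$.
If $u \in C(\Omega_2)$ then
$F^*u = u \circ F: \Omega_1 \to \mathbb{C}$
belongs to $C(\Omega_1)$ and can be considered an element of $\mathcal{D}'(\Omega_1)$. We calculate its action on a test function $\varphi \in \mathcal{D}(\Omega_1)$ by a change of coordinates in the integral as follows
$\langle F^*u, \varphi\rangle = \int_{\Omega_1} u(F(x))\varphi(x)\,dx = \int_{\Omega_2} u(y)\varphi(F^{-1}(y))\,|\det D(F^{-1})(y)|\,dy$
$= \langle u, (\varphi \circ F^{-1}) \cdot |\det D(F^{-1})|\rangle = \langle u, (F^{-1})^*\varphi \cdot |\det D(F^{-1})|\rangle.$
(ii) Pullback by a real function: Let $f: \Omega \to \mathbb{R}$ be smooth with
$Df(x) \neq 0 \quad \forall x \in \Omega$
and $u \in C^1_c(\mathbb{R})$. We have the pullback $f^*u = u \circ f \in C^1(\Omega)$. Considering $f^*u$ as a distribution on $\Omega$ we obtain the following for its action on test functions: writing $u(f(x)) = -\int_{f(x)}^\infty u'(t)\,dt$ we have for any $\varphi \in \mathcal{D}(\Omega)$, with $S_t := \{x \in \Omega \mid f(x) < t\}$,
$\langle f^*u, \varphi\rangle = \int_\Omega u(f(x))\varphi(x)\,dx = -\int_\Omega \int_{f(x)}^\infty u'(t)\,dt\,\varphi(x)\,dx$
$= -\int_{-\infty}^\infty u'(t) \int_{S_t} \varphi(x)\,dx\,dt = \int_{-\infty}^\infty u(t)\,\frac{d}{dt}\int_{S_t} \varphi(x)\,dx\,dt = \langle u, \varphi_f\rangle,$
where the third equality is an integration by parts and $\varphi_f(t) := \frac{d}{dt}\int_{S_t} \varphi(x)\,dx$.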
(As will be seen below the condition $Df \neq 0$ ensures that $\varphi_f$ is a test function. Otherwise it may fail to be $C^\infty$, as the example $f: \mathbb{R} \to \mathbb{R}$, $f(x) = x^2$ and $\varphi = 1$ near $x = 0$ shows: direct calculation gives $\int_{S_t} \varphi(x)\,dx = 2\sqrt{t}$ when $t > 0$ is sufficiently small.)
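3.14. THM (Pullback by a diffeomorphism) Let $\Omega_1, \Omega_2 \subseteq \mathbb{R}^n$ be open and $F: \Omega_1 \to \Omega_2$ be a diffeomorphism. For any $u \in \mathcal{D}'(\Omega_2)$ we define the pullback of $u$ under $F$ by
(3.7)  $\langle F^*u, \varphi\rangle := \langle u, (F^{-1})^*\varphi \cdot |\det D(F^{-1})|\rangle \quad \forall \varphi \in \mathcal{D}(\Omega_1).$
Then $F^*u \in \mathcal{D}'(\Omega_1)$. Moreover, the map $\mathcal{D}'(\Omega_2) \to \mathcal{D}'(\Omega_1)$, $u \mapsto F^*u$, is linear and sequentially continuous.
Proof: We first note that $(F^{-1})^*\varphi \cdot |\det D(F^{-1})|: y \mapsto \varphi(F^{-1}(y))\,|\det D(F^{-1})(y)|$ belongs to $\mathcal{D}(\Omega_2)$: $\mathrm{supp}(\varphi(F^{-1}(.))) = F(\mathrm{supp}(\varphi))$ is compact and $|\det D(F^{-1})|$ is just a $C^\infty$ factor, since $\det D(F^{-1})$ has no zero in $\Omega_2$ (thus preserving smoothness of the absolute value).
Second, by the chain rule and the fact that $\det D(F^{-1}) \neq 0$ it follows that the linear map $\varphi \mapsto (F^{-1})^*\varphi \cdot |\det D(F^{-1})|$ is sequentially continuous $\mathcal{D}(\Omega_1) \to \mathcal{D}(\Omega_2)$. Since $u \mapsto F^*u$ is just the adjoint of this map, the results in §2.4 complete the proof.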
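3.15. Examples
(i) $\delta$ in new coordinates: Let $y_0 \in \Omega_2$, $u = \delta_{y_0} \in \mathcal{D}'(\Omega_2)$, and $F: \Omega_1 \to \Omega_2$ be a diffeomorphism. Then we have
$\langle F^*\delta_{y_0}, \varphi\rangle = \langle\delta_{y_0}, (F^{-1})^*\varphi \cdot |\det D(F^{-1})|\rangle = \varphi(F^{-1}(y_0))\,|\det D(F^{-1})(y_0)|,$
or, upon writing $x_0 := F^{-1}(y_0)$ and noting that $\det D(F^{-1})(y_0) = 1/\det DF(x_0)$,
$F^*\delta_{y_0} = \frac{1}{|\det DF(x_0)|}\,\delta_{x_0} \quad (y_0 = F(x_0)).$
(ii) Translations: For any $h \in \mathbb{R}^n$ we have the translation $\tau_h: \mathbb{R}^n \to \mathbb{R}^n$, $x \mapsto x - h$. This clearly defines a diffeomorphism and $\tau_h^{-1} = \tau_{-h}$. Since $D\tau_h(x) = \mathrm{id}_{\mathbb{R}^n}$ we obtain for every $u \in \mathcal{D}'(\mathbb{R}^n)$
$\langle\tau_h^*u, \varphi\rangle = \langle u, \tau_{-h}^*\varphi\rangle,$
that is, $\tau_h^*$ coincides with the adjoint of the pullback of test functions under $\tau_{-h}$. (Note that there is a slight notational mismatch which is common abuse. In fact, also notations like $\tau_hu$ or $u(. - h)$ are widely used to mean the same.)
For example, $\tau_h^*\delta_0 = \delta_h$ since
$\langle\tau_h^*\delta_0, \varphi\rangle = \langle\delta_0, \varphi(. + h)\rangle = \varphi(h) = \langle\delta_h, \varphi\rangle.$
We observe that the result in Example 3.5 can now be read as
(3.8)  $\partial_ju = \mathcal{D}'\text{-}\lim_{\varepsilon \to 0} \frac{\tau_{-\varepsilon e_j}^*u - u}{\varepsilon}.$
Applying this relation to $\tau_h^*u$ in place of $u$ and noting that $\tau_{-\varepsilon e_j}^*(\tau_h^*u) = \tau_{h-\varepsilon e_j}^*u$ we obtain the generalization
$\partial_j(\tau_h^*u) = -\partial_{h_j}(\tau_h^*u).$
On the other hand, applying $\tau_h^*$ to both sides of (3.8)⁴ then yields
(3.9)  $\tau_h^*(\partial_ju) = \partial_j(\tau_h^*u),$
i.e., translation and differentiation commute on $\mathcal{D}'(\mathbb{R}^n)$.
(iii) Linear transformations: Let $A: \mathbb{R}^n \to \mathbb{R}^n$ be linear and invertible. Then we obtain $A^*: \mathcal{D}'(\mathbb{R}^n) \to \mathcal{D}'(\mathbb{R}^n)$, $u \mapsto A^*u$, given by
$\langle A^*u, \varphi\rangle = |\det(A^{-1})|\,\langle u, (A^{-1})^*\varphi\rangle = \frac{1}{|\det A|}\langle u(x), \varphi(A^{-1}x)\rangle \quad \forall \varphi \in \mathcal{D}(\mathbb{R}^n).$
Two notable special cases are the following:
Reflection: If $A = -\mathrm{id}_{\mathbb{R}^n}$, then we have for test functions $A^*\varphi(x) = \varphi(-x) =: \check{\varphi}(x)$ and for a distribution $u \in \mathcal{D}'(\mathbb{R}^n)$
$\langle\check{u}, \varphi\rangle := \langle A^*u, \varphi\rangle = \langle u, \check{\varphi}\rangle.$
A distribution $u$ is called even (resp. odd), if $\check{u} = u$ (resp. $\check{u} = -u$).
For example, $\delta_0$ is even and $\partial_j\delta_0$ is odd $(1 \le j \le n)$.
Dilation: If $t > 0$ and $A = t \cdot \mathrm{id}_{\mathbb{R}^n}$, then we have $|\det A| = t^n$ and $A^*\varphi(x) = \varphi(tx)$ for any test function $\varphi$ on $\mathbb{R}^n$. Thus we obtain for a distribution $u$ on $\mathbb{R}^n$
$\langle u_t, \varphi\rangle := \langle A^*u, \varphi\rangle = \frac{1}{t^n}\langle u(x), \varphi(\frac{1}{t}x)\rangle.$
Let $\lambda \in \mathbb{C}$. A distribution $u$ is said to be (positively) homogeneous of degree $\lambda$, if
$u_t = t^\lambda u \quad \forall t > 0.$
For example, $\delta_0$ on $\mathbb{R}^n$ is homogeneous of degree $-n$:
$\langle(\delta_0)_t, \varphi\rangle = t^{-n}\langle\delta_0(x), \varphi(x/t)\rangle = t^{-n}\varphi(0) = \langle t^{-n}\delta_0, \varphi\rangle.$

⁴ On the right-hand side use continuity and linearity of $\tau_h^*$ besides $\tau_h^*(\tau_{-\varepsilon e_j}^*u) = \tau_{h-\varepsilon e_j}^*u$.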
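The formula $F^*\delta_{y_0} = \delta_{x_0}/|\det DF(x_0)|$ of Example 3.15(i) can be made plausible numerically in one dimension by replacing $\delta_{y_0}$ with a narrow Gaussian, for which the pullback is the classical composition. The sketch below is an illustration only; the diffeomorphism $F(x) = x^3 + x$, the test function, the Gaussian width and the midpoint rule are all assumptions chosen for the example.

```python
import math

def bump(x):
    """Test function supported in [-1, 1] (illustrative choice)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def phi(x):
    """Test function supported in [-1.5, 2.5]."""
    return bump((x - 0.5) / 2.0)

def F(x):
    """Diffeomorphism of R onto R, since F'(x) = 3x^2 + 1 > 0."""
    return x ** 3 + x

def gaussian(t, eps):
    """Smooth approximation of delta_0 with standard deviation eps."""
    return math.exp(-0.5 * (t / eps) ** 2) / (eps * math.sqrt(2.0 * math.pi))

def pair_pullback(eps, y0, a=0.0, c=2.0, n=100000):
    """Midpoint rule for int rho_eps(F(x) - y0) phi(x) dx.
    Outside [a, c] the Gaussian factor is negligible for the data below."""
    h = (c - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        total += gaussian(F(x) - y0, eps) * phi(x)
    return h * total

x0, y0 = 1.0, 2.0                      # y0 = F(x0)
approx = pair_pullback(0.01, y0)       # <F^* rho_eps(. - y0), phi>
limit = phi(x0) / abs(3.0 * x0 ** 2 + 1.0)   # phi(x0) / |F'(x0)|
```

As the Gaussian width shrinks, `approx` tends to `limit`, consistent with the sequential continuity of $u \mapsto F^*u$ stated in Theorem 3.14.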

3.16. THM (Pullback by a real-valued function) Let $f: \Omega \to \mathbb{R}$ be smooth and satisfy
(D)  $Df(x) \neq 0 \quad \forall x \in \Omega.$
For any $u \in \mathcal{D}'(\mathbb{R})$ we define the pullback of $u$ under $f$ by
(3.10)  $\langle f^*u, \varphi\rangle := \langle u, \varphi_f\rangle \quad \forall \varphi \in \mathcal{D}(\Omega),$
where
$\varphi_f(t) := \frac{d}{dt}\int_{\{x \mid f(x) < t\}} \varphi(x)\,dx.$
Then $f^*u \in \mathcal{D}'(\Omega)$ and the map $\mathcal{D}'(\mathbb{R}) \to \mathcal{D}'(\Omega)$, $u \mapsto f^*u$, is linear and sequentially continuous.
Proof: Condition (D) guarantees that for each $x_0 \in \Omega$ we have $\partial_jf(x_0) \neq 0$ for some $j \in \{1, \ldots, n\}$. W.l.o.g. we may assume that $j = 1$ (otherwise permute coordinates). By the implicit function theorem there is a neighborhood $U(x_0)$ of $x_0$ such that
$(x_1, \ldots, x_n) \mapsto (f(x), x_2, \ldots, x_n) =: (y_1, y_2, \ldots, y_n)$
is a diffeomorphism $F^{-1}$ of $U(x_0)$ onto some open subset $V_{x_0} \subseteq \mathbb{R}^n$.
Let $\varphi \in \mathcal{D}(\Omega)$ be arbitrary. A standard "partition-of-unity-argument" allows us to assume that $\mathrm{supp}(\varphi) \subseteq U(x_0) =: U$ for some $x_0 \in \Omega$. Changing coordinates according to $F: V := V_{x_0} \to U$ in the integral defining $\varphi_f$ we find that
$\varphi_f(t) = \frac{d}{dt}\int_{\{y \in V \mid y_1 < t\}} \varphi(F(y))\,|\det DF(y)|\,dy$  [note that $\mathrm{supp}(\varphi \circ F) \Subset V$]
$= \frac{d}{dt}\int_{-\infty}^t \int_{\mathbb{R}^{n-1}} \varphi(F(y_1, y'))\,|\det DF(y_1, y')|\,dy'\,dy_1 = \int_{\mathbb{R}^{n-1}} \varphi(F(t, y'))\,|\det DF(t, y')|\,dy'.$
This expression directly shows smoothness [use theorems on parameter integrals] and boundedness of the support of $\varphi_f$ [if $|t|$ is large, then $(t, y_2, \ldots, y_n)$ cannot be in the support of $\varphi \circ F$].
Thus (3.10) is a valid definition of a linear form $f^*u$ on $\mathcal{D}(\Omega)$.
The continuity of $f^*u$ follows from the observation that $\varphi_k \to 0$ in $\mathcal{D}(\Omega)$ implies $(\varphi_k)_f \to 0$ in $\mathcal{D}(\mathbb{R})$ [use differentiation of parameter integrals, chain rule, and dominated convergence (or uniform convergence of the integrands) in the explicit expression above].
Since $u \mapsto f^*u$ is just the adjoint of $\varphi \mapsto \varphi_f$, the sequential continuity of the pullback map follows from §2.4.
3.17. Example
(i) Delta on a hypersurface: If $f: \Omega \to \mathbb{R}$ is $C^\infty$ and such that $Df(x) \neq 0$ for all $x \in \Omega$, then
$M := \{x \in \Omega \mid f(x) = 0\}$
is a hypersurface in $\Omega$ (i.e. an $(n-1)$-dimensional $C^\infty$ submanifold). If $\delta$ denotes the Dirac-distribution on $\mathbb{R}$ concentrated in $0$, then intuitively speaking a distribution "$\delta(f(x))$" should "evaluate" (or take the mean value of) a test function on the surface $M$. We will clarify a rigorous mathematical aspect of this concept from physics by appealing to Theorem 3.16 in defining $f^*\delta$.
Equation (3.10) gives for any $\varphi \in \mathcal{D}(\Omega)$
$\langle f^*\delta, \varphi\rangle = \langle\delta, \varphi_f\rangle = \varphi_f(0).$
As in the proof of Theorem 3.16 we may reduce the explicit determination of $\varphi_f$ to the case where $\mathrm{supp}(\varphi)$ is contained in the domain of new coordinates of the form $(y_1, y_2, \ldots, y_n) = (f(x), x_2, \ldots, x_n) = F^{-1}(x)$ and write (using that $M$ is given by $y_1 = 0$)
$\varphi_f(0) = \int_{\mathbb{R}^{n-1}} \varphi(F(0, y'))\,|\det DF(0, y')|\,dy'.$
Noting that $|\det DF| = 1/|\det D(F^{-1}) \circ F|$ and $\det D(F^{-1}) = \partial_1f$ we further obtain
(*)  $\varphi_f(0) = \int_{\mathbb{R}^{n-1}} \varphi(F(0, y'))\,\frac{1}{|\partial_1f(F(0, y'))|}\,dy'.$
The map $y' \mapsto F(0, y')$ provides a (local) parametrization of $M$ and we have $F(0, y') = (g(y'), y')$, where $g$ is determined by the implicit function theorem from the equation $f(g(y'), y') = 0$.
Differentiation yields for $j = 2, \ldots, n$: $\partial_1f(g(y'), y')\,\partial_jg(y') + \partial_jf(g(y'), y') = 0$. Therefore (with arguments dropped in $\partial_lf(g(y'), y')$ and $\partial_jg(y')$ for brevity)
$|Df|^2 = |\partial_1f|^2 + \sum_{j=2}^n |\partial_jf|^2 = |\partial_1f|^2 + \sum_{j=2}^n |\partial_1f|^2|\partial_jg|^2 = |\partial_1f|^2(1 + |Dg|^2),$
which in turn yields
(**)  $\frac{1}{|\partial_1f(F(0, y'))|} = \frac{\sqrt{1 + |Dg(y')|^2}}{|Df(g(y'), y')|}.$
Note that $\sqrt{1 + |Dg(y')|^2}\,dy'$ is the surface measure $dS$ of $M$ corresponding to the parametrization $y' \mapsto (g(y'), y')$ of $M$ (which describes $M$ as the graph of $g$ upon permutation of coordinates $1$ and $n$; cf. [For84, §14, Beispiel (14.7)]). Thus inserting (**) into (*) we arrive at an explicit formula for $f^*\delta$ in terms of a surface integral
$\langle f^*\delta, \varphi\rangle = \varphi_f(0) = \int_M \frac{\varphi}{|Df|}\,dS.$
(ii) Delta on the lightcone: As a special case of (i) consider $\Omega = \,]0, \infty[\, \times \mathbb{R}^3$ and $f\colon \Omega \to \mathbb{R}$ with $f(x_1, x') = x_1^2 - |x'|^2$. Then $M = \{(x_1, x') \in \mathbb{R}^4 \mid x_1 > 0, |x'| = x_1\}$ describes the forward lightcone and $\langle f^*\delta, \varphi\rangle = \varphi_f(0)$ can be determined by direct calculation: first, observe that for any $(x_1, x') \in \,]0, \infty[\, \times \mathbb{R}^3$ we have $f(x_1, x') < t \iff (t > -|x'|^2$ and $0 < x_1 < \sqrt{|x'|^2 + t})$; hence for any $\varphi \in \mathcal{D}(]0, \infty[\, \times \mathbb{R}^3)$
$$\varphi_f(0) := \frac{d}{dt} \int_{\{(x_1, x') \mid f(x_1, x') < t\}} \varphi(x_1, x')\, d(x_1, x')\, \Big|_{t=0}
= \frac{d}{dt} \int_{\mathbb{R}^3} \int_0^{\sqrt{|x'|^2 + t}} \varphi(x_1, x')\, dx_1\, dx'\, \Big|_{t=0}
= \int_{\mathbb{R}^3} \frac{\varphi(|x'|, x')}{2|x'|}\, dx'.$$
(To compare this formula with the surface integral in (i) observe that here we have: $Df(x_1, x') = 2(x_1, -x')$, thus $|Df(|x'|, x')| = 2\sqrt{2|x'|^2} = 2\sqrt{2}\,|x'|$, and $g(x') = |x'|$ [which is smooth on $\mathbb{R}^3 \setminus \{0\}$; $0 < x_1 = |x'|$ on $M$!], thus $Dg(x') = x'/|x'|$, gives the surface element $\sqrt{1 + |Dg|^2} = \sqrt{2}$; hence $dS/|Df| = dx'/(2|x'|)$ on the forward lightcone.)
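The surface-integral formula of (i) can be checked numerically in a simple planar case. The following is only a sketch under assumptions that are not part of the notes: we take $f(x, y) = x^2 + y^2 - 1$ on $\mathbb{R}^2 \setminus \{0\}$ (so $M$ is the unit circle and $|Df| = 2$ on $M$), replace $\delta$ by a narrow Gaussian `delta_eps`, and use the smooth weight $\varphi(x, y) = (1 + x)^2$, which is not compactly supported but only enters through its values near $M$. The helper names are hypothetical.

```python
import numpy as np

def delta_eps(t, eps):
    # Gaussian approximation of the Dirac delta on the real line
    return np.exp(-t**2 / (2 * eps**2)) / (eps * np.sqrt(2 * np.pi))

def pullback_delta(phi, f, eps=0.02, half_width=2.0, n=1001):
    # Riemann sum of delta_eps(f(x, y)) * phi(x, y) over a square grid in R^2
    grid = np.linspace(-half_width, half_width, n)
    h = grid[1] - grid[0]
    x, y = np.meshgrid(grid, grid, indexing="ij")
    return np.sum(delta_eps(f(x, y), eps) * phi(x, y)) * h * h

f = lambda x, y: x**2 + y**2 - 1.0   # M = unit circle, |Df| = 2 on M
phi = lambda x, y: (1.0 + x)**2      # smooth weight standing in for a test function

lhs = pullback_delta(phi, f)
# surface integral: (1/2) * int_0^{2 pi} (1 + cos t)^2 dt = 3 pi / 2
rhs = 1.5 * np.pi
print(lhs, rhs)
```

The two numbers should agree up to a small quadrature error, illustrating that "$\delta(f(x))$" weights the surface measure by $1/|Df|$.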
3.18. REM
(i) Note that in both cases (diffeomorphism and real-valued function) we have constructed the formula for the distributional pullback to fit the action of a classical composition of continuous functions as distributions. Therefore the extended pullback map $u \mapsto f^*u$ is compatible with the functional pullback in these cases. Moreover, by density of the considered classes of functions the extension of the pullback as a sequentially continuous map on $\mathcal{D}'$ is uniquely determined by this compatibility requirement.
(ii) A pullback map on $\mathcal{D}'$ can be defined more generally for submersions$^5$, which include the two cases we have considered above (cf. [FJ98, Theorem 7.2.2] or [Hor90, Theorem 6.1.2]), again as the unique sequentially continuous extension from the classical case. (An even more general extension of the pullback is possible under so-called microlocal conditions; see [Hor90, Theorem 8.2.4].)
5 $F \in C^\infty(\Omega_1, \Omega_2)$ is a submersion, if $DF(x)$ is surjective at each point $x \in \Omega_1$.

(iii) Theorem 3.14 opens the door to an invariant definition of distributions on smooth manifolds, thus introduces distribution theoretic objects also into differential geometry and mathematical relativity theory (cf. [Hor90, Section 6.3], [Kun98, Kapitel 10], and [GKOS01, Section 3.1 and Chapter 5]).
(iv) Chain rule and pullback of products: Based on the chain rule for compositions of smooth functions, the density of smooth functions in $\mathcal{D}'$, and the sequential continuity of pullback and multiplication by smooth factors one easily proves chain rules for distributional derivatives of pullbacks. In the same way an equation of the form $(a\,u) \circ f = (a \circ f)\,(u \circ f)$ for smooth functions $u$ and $a$ is extended to the case of smooth $a$ and distributional $u$ (cf. [FJ98, Corollaries 7.1.1 and 7.2.1] and [Hor90, Equations (6.1.2) and (6.1.3)]):
(a) If $F = (F_1, \ldots, F_n)\colon \Omega_1 \to \Omega_2$ is a diffeomorphism, then for every $u \in \mathcal{D}'(\Omega_2)$
$$\partial_j (F^*u) = \sum_{k=1}^{n} (\partial_j F_k)\, F^*(\partial_k u).$$
If $a \in C^\infty(\Omega_2)$ then $F^*(au) = (a \circ F)\,(F^*u)$.

(b) If $f\colon \Omega \to \mathbb{R}$ is smooth with $Df(x) \neq 0$ for all $x \in \Omega$, then for every $u \in \mathcal{D}'(\mathbb{R})$
$$\partial_j (f^*u) = (\partial_j f)\, f^*(u').$$
If $a \in C^\infty(\mathbb{R})$ then $f^*(au) = (a \circ f)\,(f^*u)$.

As an application of the rules in (b) we suggest the following
Exercise: Let $u = f^*\delta \in \mathcal{D}'(\Omega)$ be the Dirac-Delta on the forward lightcone as in Example 3.17(ii) (i.e. $f(x_1, x') = x_1^2 - |x'|^2$ and $\Omega = \,]0, \infty[\, \times \mathbb{R}^3$). Prove that $u$ satisfies the wave equation
$$\Box u := \partial_1^2 u - \sum_{j=2}^{4} \partial_j^2 u = 0 \quad (\text{in } \mathcal{D}'(\Omega)).$$
(Calculate $\partial_1^2 (f^*\delta) = \partial_1 (2x_1 f^*\delta') = 2 f^*\delta' + 4x_1^2 f^*\delta''$ and for $j = 2, 3, 4$ similarly $\partial_j^2 (f^*\delta) = -2 f^*\delta' + 4x_j^2 f^*\delta''$. Thus $\Box u = 8 f^*\delta' + 4(x_1^2 - |x'|^2) f^*\delta''$. Interpreting $x_1^2 - |x'|^2$ as $f^*(\mathrm{id}_{\mathbb{R}})$ we obtain $(x_1^2 - |x'|^2) f^*\delta'' = f^*(\mathrm{id}_{\mathbb{R}}\, \delta'')$. Formula (2.13) gives $t\,\delta''(t) = -2\delta'(t)$, hence $f^*(\mathrm{id}_{\mathbb{R}}\, \delta'') = f^*(-2\delta') = -2 f^*\delta'$. Inserting this into the expression for $\Box u$ obtained above we arrive at $\Box u = 8 f^*\delta' + 4(-2 f^*\delta') = 0$.)

Chapter 4
CONVOLUTION
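4.1. Intro For functions $f \in C_c(\mathbb{R}^n)$ and $g \in C(\mathbb{R}^n)$ the convolution $f * g \in C(\mathbb{R}^n)$ is defined by
$$f * g(x) := \int f(y)\, g(x - y)\, dy = \int f(x - y)\, g(y)\, dy \qquad (x \in \mathbb{R}^n).$$
We may consider $f * g$ as a regular distribution on $\mathbb{R}^n$ and calculate its action on a test function $\varphi$ as follows:
$$\langle f * g, \varphi\rangle = \int f * g(z)\, \varphi(z)\, dz = \iint f(z - y)\, g(y)\, \varphi(z)\, dy\, dz$$
$$= \iint f(z - y)\, g(y)\, \varphi(z)\, dz\, dy \qquad [\text{Fubini}]$$
$$= \iint f(x)\, g(y)\, \varphi(x + y)\, dx\, dy \qquad [\text{inner integral: } x = z - y]$$
$$= \int f(x)\, g(y)\, \varphi(x + y)\, d(x, y). \qquad [\text{Fubini}]$$
This suggests generalizing the convolution to distributions $u \in \mathcal{E}'(\mathbb{R}^n)$ and $v \in \mathcal{D}'(\mathbb{R}^n)$ by a formula like
$$\langle u * v, \varphi\rangle := \langle u(x) \otimes v(y), \varphi(x + y)\rangle.$$
However, the status of the right-hand side of this equation has to be clarified, which is achieved in §4.1 by an appropriate cut-off to adjust the support properties of the function $(x, y) \mapsto \varphi(x + y)$. (Note that this function will not be compactly supported, unless $\varphi = 0$: if $\varphi(z_0) \neq 0$, then for every $x \in \mathbb{R}^n$ and $y := z_0 - x$ we have $\varphi(x + y) \neq 0$.)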
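The identity $\langle f * g, \varphi\rangle = \iint f(x) g(y) \varphi(x + y)\, dx\, dy$ has an exact analogue for finitely supported sequences, which can be verified directly. The following sketch is only an illustration of that discrete analogue (sums in place of integrals, random arrays in place of functions); the array names are chosen for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.normal(size=5)      # finitely supported sequence on {0,...,4}
g = rng.normal(size=7)      # finitely supported sequence on {0,...,6}
phi = rng.normal(size=11)   # "test sequence" on {0,...,10}

# (f*g)[z] = sum_y f[z-y] g[y], supported on {0,...,10}
conv = np.convolve(f, g)

# action of f*g on phi
lhs = np.dot(conv, phi)

# action of the "tensor product" f(x) g(y) on (x, y) -> phi(x + y)
rhs = sum(f[x] * g[y] * phi[x + y] for x in range(5) for y in range(7))
print(lhs, rhs)
```

Both values coincide up to rounding, which is the discrete counterpart of the substitution $x = z - y$ used above.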
In §4.2 we will turn convolution into a useful tool for regularization by showing that $\mathcal{E}' * C^\infty \subseteq C^\infty$ and $\mathcal{D}' * \mathcal{D} \subseteq C^\infty$. This will provide us with a systematic approximation technique of distributions by smooth functions and yield a proof that $\mathcal{D}$ is dense in $\mathcal{D}'$.
In §4.3 we will present a condition that allows us to drop the assumption that at least one of the convolution factors has to be compactly supported. As an application we obtain a prominent example of a convolution algebra and an alternative description of antiderivatives. The latter idea is at the basis of a further application of convolution to reveal the local structure of distributions in §4.4. As it turns out, locally every distribution is the derivative of a continuous function.

4.1. CONVOLUTION OF DISTRIBUTIONS
4.2. Preliminaries
Operations with subsets of $\mathbb{R}^n$: Let $A, B \subseteq \mathbb{R}^n$. We will occasionally make use of notations like
$-A := \{-x \mid x \in A\}$,
$A + B := \{x + y \mid x \in A, y \in B\}$, and
$A - B := \{x - y \mid x \in A, y \in B\}$.
Recall (or prove as an exercise) that
(i) $A$ compact and $B$ closed $\Longrightarrow$ $A \pm B$ is closed
[Also: $A$, $B$ compact $\Longrightarrow$ $A \pm B$ compact.]
(ii) $A$ compact $\Longrightarrow$ $\overline{A \pm B} = A \pm \overline{B}$.
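Supports: If $\varphi \in \mathcal{D}(\mathbb{R}^n)$ and $\rho \in C^\infty(\mathbb{R}^n)$, then
$$(4.1) \qquad \operatorname{supp}\big(\rho(x)\varphi(x + y)\big) \subseteq \operatorname{supp}(\rho) \times \big(\operatorname{supp}(\varphi) - \operatorname{supp}(\rho)\big).$$
(Proof: $\rho(x)\varphi(x + y) \neq 0 \Longrightarrow x \in \operatorname{supp}(\rho)$ and $x + y \in \operatorname{supp}(\varphi)$.)

If in addition $\operatorname{supp}(\rho)$ is compact, then $\operatorname{supp}(\rho(x)\varphi(x + y))$ is compact in $\mathbb{R}^n \times \mathbb{R}^n$.
Furthermore, for any $\varphi \in \mathcal{D}(\mathbb{R}^n)$ we set $\tilde\varphi(x, y) := \varphi(x + y)$, then we have
$$(4.2) \qquad (x, y) \in \operatorname{supp}(\tilde\varphi) \iff x + y \in \operatorname{supp}(\varphi).$$
(Since $\varphi(x + y) \neq 0 \iff \tilde\varphi(x, y) \neq 0$.)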
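4.3. THM (The convolution $\mathcal{E}' * \mathcal{D}'$) Let $u \in \mathcal{E}'(\mathbb{R}^n)$, $v \in \mathcal{D}'(\mathbb{R}^n)$. Choose a cut-off function $\rho \in \mathcal{D}(\mathbb{R}^n)$ with $\rho = 1$ on a neighborhood of $\operatorname{supp}(u)$. We define the convolution $u * v$ of $u$ and $v$ by setting
$$(4.3) \qquad \langle u * v, \varphi\rangle := \langle u(x) \otimes v(y), \rho(x)\varphi(x + y)\rangle \qquad \forall \varphi \in \mathcal{D}(\mathbb{R}^n).$$
Then
(i) the value of $\langle u * v, \varphi\rangle$ is independent of the choice of the cut-off $\rho$, i.e. $u * v$ is well-defined;
(ii) Equation (4.3) defines a distribution on $\mathbb{R}^n$, i.e. $u * v \in \mathcal{D}'(\mathbb{R}^n)$.

Proof: (i) If $\chi \in \mathcal{D}(\mathbb{R}^n)$ is also a cut-off over $\operatorname{supp}(u)$, then there is a neighborhood $U$ of $\operatorname{supp}(u)$ such that $(\rho(x) - \chi(x))\varphi(x + y) = 0$ when $(x, y) \in U \times \mathbb{R}^n$, which in turn happens to be a neighborhood of $\operatorname{supp}(u \otimes v) = \operatorname{supp}(u) \times \operatorname{supp}(v)$. Hence Proposition 1.56 implies
$$\langle u(x) \otimes v(y), (\rho(x) - \chi(x))\varphi(x + y)\rangle = 0.$$
(ii) Linearity of $u * v$ is obvious. To show the continuity condition (SN) let $K \Subset \mathbb{R}^n$ and $\varphi \in \mathcal{D}(K)$ arbitrary. By (4.1) we have
$$\operatorname{supp}\big(\rho(x)\varphi(x + y)\big) \subseteq \operatorname{supp}(\rho) \times \big(K - \operatorname{supp}(\rho)\big) =: K',$$
and $K'$ is compact in $\mathbb{R}^n \times \mathbb{R}^n$. The corresponding seminorm estimate (SN) for $u \otimes v$ on $K'$ implies then an estimate of the form (SN) on $K$ for $u * v$.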
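4.4. COR (A formula for the convolution $\mathcal{E}' * \mathcal{D}'$) Let $u \in \mathcal{E}'(\mathbb{R}^n)$, $v \in \mathcal{D}'(\mathbb{R}^n)$. Then we have for each $\varphi \in \mathcal{D}(\mathbb{R}^n)$
$$(4.4) \qquad \langle u * v, \varphi\rangle = \langle v(y), \langle u(x), \varphi(x + y)\rangle\rangle = \langle u(x), \langle v(y), \varphi(x + y)\rangle\rangle.$$
Moreover, we see that the roles of $u$ and $v$ may be interchanged in this formula. In this sense we have commutativity $u * v = v * u$.
Proof: Choosing a cut-off $\rho$ as above we have by Equations (4.3) and (3.5)
$$\langle u * v, \varphi\rangle = \langle u(x) \otimes v(y), \rho(x)\varphi(x + y)\rangle = \langle u(x), \rho(x)\langle v(y), \varphi(x + y)\rangle\rangle = \langle v(y), \langle u(x), \rho(x)\varphi(x + y)\rangle\rangle.$$
As noted in Remark 1.67(i) we may drop reference to the cut-off in the action of an $\mathcal{E}'$-distribution, so we obtain (4.4).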
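4.5. PROP (Properties of the convolution $\mathcal{E}' * \mathcal{D}'$) Let $u, v \in \mathcal{D}'(\mathbb{R}^n)$, at least one of the two with compact support. Then
(i) $\operatorname{supp}(u * v) \subseteq \operatorname{supp}(u) + \operatorname{supp}(v)$
(ii) $\forall j = 1, \ldots, n$: $\partial_j (u * v) = (\partial_j u) * v = u * (\partial_j v)$
(iii) $\forall h \in \mathbb{R}^n$: $\tau_h (u * v) = (\tau_h u) * v = u * \tau_h v$.
Furthermore, $\delta = \delta_0$ plays the role of a neutral element for convolution
(iv) $\forall w \in \mathcal{D}'(\mathbb{R}^n)$: $w * \delta = \delta * w = w$.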

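Proof: (i) W.l.o.g. we may assume that $\operatorname{supp}(u)$ is compact and $\rho$ is a suitable cut-off over $\operatorname{supp}(u)$.
Let $z \in \mathbb{R}^n \setminus (\operatorname{supp}(u) + \operatorname{supp}(v))$. As noted in 4.2 $\operatorname{supp}(u) + \operatorname{supp}(v)$ is closed, hence there is an open neighborhood $U$ of $z$ such that $U \cap (\operatorname{supp}(u) + \operatorname{supp}(v)) = \emptyset$. Let $\varphi \in \mathcal{D}(U)$, then (4.2) shows that $(x, y) \in \operatorname{supp}((x', y') \mapsto \varphi(x' + y'))$ implies $x + y \in \operatorname{supp}(\varphi) \Subset U$. Hence
$$\underbrace{\operatorname{supp}(\varphi(x + y))}_{= \operatorname{supp}(\tilde\varphi) \text{ in (4.2)}} \cap \underbrace{(\operatorname{supp}(u) \times \operatorname{supp}(v))}_{= \operatorname{supp}(u \otimes v)} = \emptyset$$
and therefore
$$\langle u * v, \varphi\rangle = \langle u(x) \otimes v(y), \rho(x)\varphi(x + y)\rangle = 0.$$
We conclude that $z \notin \operatorname{supp}(u * v)$.
(ii) We calculate the action on a test function applying (4.4)
$$\langle (\partial_j u) * v, \varphi\rangle = \langle v(y), \langle \partial_j u(x), \varphi(x + y)\rangle\rangle = -\langle v(y), \langle u(x), \partial_j \varphi(x + y)\rangle\rangle = -\langle u * v, \partial_j \varphi\rangle = \langle \partial_j (u * v), \varphi\rangle$$
and by commutativity also $\partial_j (u * v) = \partial_j (v * u) = (\partial_j v) * u = u * \partial_j v$.

(iii) As in (ii) by use of (4.4)
$$\langle (\tau_h u) * v, \varphi\rangle = \langle v(y), \langle \tau_h u(x), \varphi(x + y)\rangle\rangle = \langle v(y), \langle u(x), \tau_{-h}\varphi(x + y)\rangle\rangle = \langle u * v, \tau_{-h}\varphi\rangle = \langle \tau_h (u * v), \varphi\rangle$$
and again by commutativity also $\tau_h (u * v) = \tau_h (v * u) = (\tau_h v) * u = u * \tau_h v$.

(iv) The action on a test function gives
$$\langle \delta * w, \varphi\rangle = \langle w(y), \langle \delta(x), \varphi(x + y)\rangle\rangle = \langle w(y), \varphi(0 + y)\rangle = \langle w, \varphi\rangle$$
and again by commutativity also $w * \delta = \delta * w = w$.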
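4.6. THM (Sequential continuity of convolution) Suppose that either

(i) $u \in \mathcal{E}'(\mathbb{R}^n)$ and $v, v_m \in \mathcal{D}'(\mathbb{R}^n)$ ($m \in \mathbb{N}$) with $v_m \to v$ in $\mathcal{D}'(\mathbb{R}^n)$ ($m \to \infty$)
or
(ii) $u \in \mathcal{D}'(\mathbb{R}^n)$ and $v, v_m \in \mathcal{E}'(\mathbb{R}^n)$ ($m \in \mathbb{N}$) satisfy the following: $\exists K \Subset \mathbb{R}^n$ such that $\operatorname{supp}(v) \subseteq K$, $\operatorname{supp}(v_m) \subseteq K$ holds $\forall m \in \mathbb{N}$ and $v_m \to v$ in $\mathcal{D}'(\mathbb{R}^n)$ ($m \to \infty$).$^1$
Then $u * v_m \to u * v$ in $\mathcal{D}'(\mathbb{R}^n)$ ($m \to \infty$).
1 It would suffice to assume $v \in \mathcal{D}'(\mathbb{R}^n)$ without support condition, since then $\operatorname{supp}(v) \subseteq K$ follows from the convergence $v_m \to v$.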

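Proof: (i) If $\varphi \in \mathcal{D}(\mathbb{R}^n)$, then by Corollary 3.4(ii) the function $\psi_u\colon y \mapsto \langle u(x), \varphi(x + y)\rangle$ is smooth. Moreover, $\psi_u$ vanishes when $y \notin \operatorname{supp}(\varphi) - \operatorname{supp}(u)$, since this implies $\operatorname{supp}(u) \cap \operatorname{supp}(\varphi(\,.\, + y)) = \emptyset$ and Proposition 1.56 yields $\psi_u(y) = \langle u(x), \varphi(x + y)\rangle = 0$. Thus $\psi_u$ is a test function on $\mathbb{R}^n$ and we obtain by (4.4)
$$\langle u * v_m, \varphi\rangle = \langle v_m(y), \langle u(x), \varphi(x + y)\rangle\rangle \to \langle v(y), \langle u(x), \varphi(x + y)\rangle\rangle = \langle u * v, \varphi\rangle \qquad (m \to \infty).$$
(ii) Let $\rho \in \mathcal{D}(\mathbb{R}^n)$ be a cut-off over some neighborhood of $K$. Recall that the action of any $w \in \mathcal{E}'(\mathbb{R}^n)$ with $\operatorname{supp}(w) \subseteq K$ on a function $\psi \in C^\infty(\mathbb{R}^n)$ was obtained by $\langle w, \rho\psi\rangle$. Furthermore, the function $y \mapsto \langle u(x), \varphi(x + y)\rangle$ is smooth by Proposition 3.2, so by (4.4) again we have as $m \to \infty$
$$\langle u * v_m, \varphi\rangle = \langle v_m(y), \langle u(x), \varphi(x + y)\rangle\rangle = \langle v_m(y), \rho(y)\langle u(x), \varphi(x + y)\rangle\rangle$$
$$\to \langle v(y), \rho(y)\langle u(x), \varphi(x + y)\rangle\rangle = \langle v(y), \langle u(x), \varphi(x + y)\rangle\rangle = \langle u * v, \varphi\rangle.$$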

4.2. REGULARIZATION
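4.7. Heuristics: In the preceding section we have developed a theory of convolution as a map $\mathcal{E}' \times \mathcal{D}' \to \mathcal{D}'$. Now we will change the point of view by restricting the $\mathcal{E}'$-factor to $C^\infty$-functions, that is we consider the convolution $\mathcal{D} * \mathcal{D}'$. As we will see this provides a process of regularizing (smoothing) a given distribution. More precisely, if $\rho \in \mathcal{D}$ is a mollifier, then for any $u \in \mathcal{D}'$ we obtain a net of smooth functions $u * \rho_\varepsilon$ ($\varepsilon > 0$) with the property $u * \rho_\varepsilon \to u$ in $\mathcal{D}'$ as $\varepsilon \to 0$. Recall that we already used this technique in Theorem 1.13 to approximate $C^k$-functions by $C^\infty$-functions.
To get some intuitive idea why convolution has a smoothing effect, we consider $f \in C(\mathbb{R}) \subseteq \mathcal{D}'(\mathbb{R})$. Let $\rho \in \mathcal{D}(\mathbb{R})$ with $\rho \geq 0$, $\operatorname{supp}(\rho) \subseteq [-1, 1]$, and $\rho(x) = 1$ when $|x| \leq 1/2$. If $\rho_\varepsilon(z) := \rho(z/\varepsilon)/\varepsilon$, then $\operatorname{supp}(\rho_\varepsilon(x - \,.\,)) \subseteq [x - \varepsilon, x + \varepsilon]$, $\varepsilon\rho_\varepsilon(x - y) = 1$ when $|x - y| \leq \varepsilon/2$, and we obtain (up to a normalizing constant)
$$f * \rho_\varepsilon(x) = \int f(y)\, \rho_\varepsilon(x - y)\, dy \approx \frac{1}{2\varepsilon} \int_{x - \varepsilon}^{x + \varepsilon} f(y)\, dy =: M_\varepsilon(f)(x).$$
Here, $M_\varepsilon(f)(x)$ is the "mean value of $f$ near $x$" and we easily deduce that $M_\varepsilon(f)(x) \to f(x)$ ($\varepsilon \to 0$). Moreover, as noted in the proof of 1.13 the functions $f * \rho_\varepsilon$ ($\varepsilon > 0$) are smooth.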
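The smoothing heuristics can be observed numerically. The sketch below is an illustration under assumptions of our own: it uses the standard bump $\exp(-1/(1 - z^2))$ normalized to unit mass (rather than the plateau function of the heuristics), the non-differentiable function $f(x) = |x|$, and a simple quadrature; `bump` and `mollify` are hypothetical helper names.

```python
import numpy as np

def bump(z):
    # standard bump: exp(-1/(1 - z^2)) on ]-1, 1[, zero elsewhere
    out = np.zeros_like(z)
    inside = np.abs(z) < 1
    out[inside] = np.exp(-1.0 / (1.0 - z[inside]**2))
    return out

def mollify(f, x, eps, n=2001):
    # (f * rho_eps)(x) = int f(x - eps*s) rho(s) ds, with rho normalised numerically
    s = np.linspace(-1.0, 1.0, n)
    w = bump(s)
    w /= np.sum(w)                      # discrete normalisation: weights sum to 1
    return np.array([np.sum(f(xi - eps * s) * w) for xi in x])

x = np.linspace(-1.0, 1.0, 201)
eps_values = (0.4, 0.2, 0.1, 0.05)
# sup-norm distance between the mollified function and f(x) = |x|
errors = [np.max(np.abs(mollify(np.abs, x, eps) - np.abs(x))) for eps in eps_values]
print(errors)
```

Since $|x|$ is Lipschitz with constant 1 and the weights have unit mass, each error is bounded by the corresponding $\varepsilon$, so the printed values shrink with $\varepsilon$.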
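4.8. THM (Smoothing via convolution) Let $\rho \in C^\infty(\mathbb{R}^n)$ and $u \in \mathcal{D}'(\mathbb{R}^n)$ such that
(i) $\operatorname{supp}(\rho)$ is compact or (ii) $\operatorname{supp}(u)$ is compact.
Then
$$(4.5) \qquad u * \rho(x) = \langle u(y), \rho(x - y)\rangle \qquad (x \in \mathbb{R}^n)$$
and $u * \rho \in C^\infty(\mathbb{R}^n)$.
Proof: Suppose that (i) holds. Let $\varphi \in \mathcal{D}(\mathbb{R}^n)$. We may regard $\varphi$ as a regular element in $\mathcal{E}'(\mathbb{R}^n)$ and choose a cut-off $\chi \in \mathcal{D}(\mathbb{R}^n)$ over a neighborhood of $\operatorname{supp}(\varphi)$. Then
$(x, y) \mapsto \chi(x)\rho(x - y)$ is in $\mathcal{D}(\mathbb{R}^n \times \mathbb{R}^n)$.

Recall that the function $x \mapsto \langle u(y), \rho(x - y)\rangle$ is smooth due to Proposition 3.2. We may thus calculate the action of $\varphi$ on this function by
$$\int \langle u(y), \rho(x - y)\rangle\, \varphi(x)\, dx = \int \varphi(x)\chi(x)\langle u(y), \rho(x - y)\rangle\, dx$$
$$= \langle \varphi(x), \langle u(y), \chi(x)\rho(x - y)\rangle\rangle = \langle \varphi(x) \otimes u(y), \chi(x)\rho(x - y)\rangle = \langle u(y), \langle \varphi(x), \chi(x)\rho(x - y)\rangle\rangle$$
$$= \langle u(y), \langle \varphi(x), \rho(x - y)\rangle\rangle = \langle u(y), \langle \varphi, \tau_y \rho\rangle\rangle = \langle u(y), \langle \tau_{-y}\varphi, \rho\rangle\rangle = \langle u(y), \langle \rho, \tau_{-y}\varphi\rangle\rangle$$
$$= \langle u(y), \langle \rho(x), \varphi(x + y)\rangle\rangle = \langle u * \rho, \varphi\rangle.$$
Since $\varphi$ was arbitrary we obtain that $u * \rho$ is a regular distribution and is given by (4.5).
Now suppose that (ii) holds. Then we pick a cut-off $\chi \in \mathcal{D}(\mathbb{R}^n)$ over a neighborhood of $\operatorname{supp}(u)$ and note that the right-hand side in (4.5) actually means $\langle u(y), \chi(y)\rho(x - y)\rangle$. A calculation similar to the above then shows
$$\int \langle u(y), \chi(y)\rho(x - y)\rangle\, \varphi(x)\, dx = \langle u * \rho, \varphi\rangle.$$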
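4.9. THM $\mathcal{D}(\mathbb{R}^n)$ is sequentially dense in $\mathcal{D}'(\mathbb{R}^n)$.
Proof: Let $u \in \mathcal{D}'(\mathbb{R}^n)$. We have to show that there exists a sequence $(u_m)$ in $\mathcal{D}(\mathbb{R}^n)$ with $u_m \to u$ in $\mathcal{D}'(\mathbb{R}^n)$ as $m \to \infty$.
Let $\rho \in \mathcal{D}(\mathbb{R}^n)$ be a mollifier, i.e. $\operatorname{supp}(\rho) \subseteq B_1(0)$ and $\int \rho = 1$, and set
$$\rho_m(x) = m^n \rho(mx) \qquad (x \in \mathbb{R}^n, m \in \mathbb{N}).$$
(This corresponds to $\rho_\varepsilon$ when $\varepsilon = 1/m$ as used in the proof of Theorem 1.13.)

By Example 1.37(i) we have $\rho_m \to \delta$ in $\mathcal{D}'(\mathbb{R}^n)$. Since $\operatorname{supp}(\rho_m) \subseteq \operatorname{supp}(\rho) \subseteq B_1(0)$ holds for all $m$, Theorem 4.6, case (ii), implies
$$\tilde u_m := u * \rho_m \to u * \delta = u \qquad (m \to \infty).$$
Case (i) of Theorem 4.8 ensures that $\tilde u_m$ belongs to $C^\infty(\mathbb{R}^n)$. Thus, it remains to adjust the supports for our approximating sequence. To achieve this we take $\chi \in \mathcal{D}(\mathbb{R}^n)$ with $\chi = 1$ on $B_1(0)$ and set
$$u_m(x) := \chi\big(\tfrac{x}{m}\big)\, \tilde u_m(x) = \chi\big(\tfrac{x}{m}\big)\, (\rho_m * u)(x) \qquad (x \in \mathbb{R}^n, m \in \mathbb{N}).$$
(Note that $u_m = \tilde u_m$ on $B_m(0)$ and $\operatorname{supp}(u_m) \subseteq m \cdot \operatorname{supp}(\chi)$.)

Let $\varphi \in \mathcal{D}(\mathbb{R}^n)$ be arbitrary. Then we have for $m$ sufficiently large that
$$\langle u_m, \varphi\rangle = \int u_m(x)\varphi(x)\, dx = \int \tilde u_m(x)\varphi(x)\, dx = \langle \tilde u_m, \varphi\rangle.$$
Hence also $u_m \to u$ in $\mathcal{D}'(\mathbb{R}^n)$.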
4.10. THM $\mathcal{D}(\Omega)$ is sequentially dense in $\mathcal{D}'(\Omega)$.

Proof strategy: Choose a sequence of compact subsets exhausting $\Omega$ and combine this with the technique of the above proof; cf. [FJ98, Theorem 5.3.2]. Alternatively, for a functional analytic argument see [Hor66, Chapter 4, §1, Proposition 3].
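4.11. THM (Translation invariant operators are convolutions)

Let $L\colon \mathcal{D}(\mathbb{R}^n) \to \mathcal{E}(\mathbb{R}^n)$ be a linear map. Then the following statements are equivalent:
(i) $L$ is continuous and $\forall h \in \mathbb{R}^n$: $\tau_h \circ L = L \circ \tau_h$, i.e. $L$ commutes with translations.
(ii) $\exists!\, u \in \mathcal{D}'(\mathbb{R}^n)$: $L\varphi = u * \varphi$ holds for all $\varphi \in \mathcal{D}(\mathbb{R}^n)$.
Proof: (ii) $\Rightarrow$ (i): Linearity of $L$ is clear and that convolution commutes with translations follows from Proposition 4.5(iii). Smoothness of $u * \varphi$ is ensured by case (i) in Theorem 4.8.
It remains to prove that $\varphi_j \to 0$ in $\mathcal{D}(\mathbb{R}^n)$ implies $u * \varphi_j \to 0$ in $\mathcal{E}(\mathbb{R}^n)$. Since $\partial^\alpha (u * \varphi_j) = u * (\partial^\alpha \varphi_j)$ [by 4.5(ii)] it suffices to show the following: $\forall K \Subset \mathbb{R}^n$ we have $u * \varphi_j \to 0$ uniformly on $K$.
Let $K \Subset \mathbb{R}^n$ be arbitrary and let $K_0 \Subset \mathbb{R}^n$ be such that $\operatorname{supp}(\varphi_j) \subseteq K_0$ for all $j$. Then by (4.5) and (SN) applied to $u$ on the compact set $K - K_0$ we can find $m \in \mathbb{N}_0$ and $C > 0$ such that for all $x \in K$
$$|u * \varphi_j(x)| = |\langle u, \varphi_j(x - \,.\,)\rangle| \leq C \sum_{|\alpha| \leq m} \|\partial^\alpha \varphi_j(x - \,.\,)\|_{L^\infty(K - K_0)},$$
hence
$$\|u * \varphi_j\|_{L^\infty(K)} \leq C \sum_{|\alpha| \leq m} \|\partial^\alpha \varphi_j\|_{L^\infty(\mathbb{R}^n)} \to 0 \qquad (j \to \infty).$$
(i) $\Rightarrow$ (ii): Uniqueness of $u$ follows by considering $\forall \varphi \in \mathcal{D}(\mathbb{R}^n)$

$$\langle u(y), \varphi(-y)\rangle = \langle u, \varphi(0 - \,.\,)\rangle = u * \varphi(0) = L\varphi(0),$$
since this equation determines the action of $u$.

As for existence we now make the ansatz

$$\langle u, \varphi\rangle := L(R\varphi)(0) = L\check\varphi(0) \qquad \forall \varphi \in \mathcal{D}(\mathbb{R}^n),$$

where we have put $R\varphi(y) := \check\varphi(y) = \varphi(-y)$. (Note that $RR\varphi = \varphi$.) By continuity and linearity of $L$ the corresponding properties for $u$ follow, that is, $u \in \mathcal{D}'(\mathbb{R}^n)$. [We have $u = (\text{evaluation at } 0) \circ L \circ R$, which is a composition of linear continuous maps.]
Finally, if $\varphi \in \mathcal{D}(\mathbb{R}^n)$ and $x \in \mathbb{R}^n$ are arbitrary, then commutativity with translations implies
$$L\varphi(x) = \tau_{-x}(L\varphi)(0) = L(\tau_{-x}\varphi)(0) = \langle u, R(\tau_{-x}\varphi)\rangle = \langle u, R(\varphi(\,.\, + x))\rangle = \langle u(y), \varphi(-y + x)\rangle = u * \varphi(x).$$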
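The existence part of the proof has a transparent finite-dimensional analogue: on periodic sequences of length $N$, a linear map that commutes with cyclic shifts is a circular convolution, and its kernel is recovered by the same ansatz $\langle u, \varphi\rangle := L(R\varphi)(0)$. The following is only a sketch of this discrete analogue; the operator `L` is an arbitrary example of our own choosing and all function names are hypothetical.

```python
import numpy as np

N = 16

def shift(phi, h):
    # (tau_h phi)[k] = phi[k - h], indices taken mod N
    return np.roll(phi, h)

def reflect(phi):
    # (R phi)[k] = phi[-k], indices taken mod N
    return phi[(-np.arange(N)) % N]

def L(phi):
    # sample translation-invariant operator: (L phi)[x] = 2 phi[x] - phi[x-1] + 0.5 phi[x-3]
    return 2.0 * phi - np.roll(phi, 1) + 0.5 * np.roll(phi, 3)

def kernel_of(op):
    # ansatz from the proof: u[k] = <u, e_k> := op(R e_k)[0]
    u = np.empty(N)
    for k in range(N):
        e = np.zeros(N)
        e[k] = 1.0
        u[k] = op(reflect(e))[0]
    return u

def circ_conv(u, phi):
    # (u * phi)[x] = sum_y u[y] phi[x - y], indices taken mod N
    return np.array([sum(u[y] * phi[(x - y) % N] for y in range(N)) for x in range(N)])

phi = np.random.default_rng(1).normal(size=N)
u = kernel_of(L)
print(np.allclose(L(phi), circ_conv(u, phi)))
```

Here the recovered kernel is $u = 2e_0 - e_1 + 0.5\,e_3$, and `L(phi)` agrees with `circ_conv(u, phi)` for every input.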
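4.12. Examples
(i) Let $h \in \mathbb{R}^n$ and $\varphi \in \mathcal{D}(\mathbb{R}^n)$, then
$$\delta_h * \varphi(x) = \langle \delta_h, \varphi(x - \,.\,)\rangle = \varphi(x - h) = \tau_h \varphi(x),$$
thus translation $\tau_h$ corresponds to convolution with $\delta_h$.

(ii) Let $P(\partial) = \sum_{|\alpha| \leq m} a_\alpha \partial^\alpha$ be a PDO with constant coefficients $a_\alpha \in \mathbb{C}$. Clearly, $P(\partial)$ defines a translation invariant map $\mathcal{D}(\mathbb{R}^n) \to \mathcal{D}(\mathbb{R}^n) \subseteq \mathcal{E}(\mathbb{R}^n)$. We have for any $\varphi \in \mathcal{D}(\mathbb{R}^n)$
$$P(\partial)\varphi = \sum_{|\alpha| \leq m} a_\alpha \partial^\alpha \varphi = \sum_{|\alpha| \leq m} a_\alpha \partial^\alpha (\delta * \varphi) = \sum_{|\alpha| \leq m} a_\alpha (\partial^\alpha \delta) * \varphi = \Big(\sum_{|\alpha| \leq m} a_\alpha \partial^\alpha \delta\Big) * \varphi = u * \varphi,$$
where $u := \sum_{|\alpha| \leq m} a_\alpha \partial^\alpha \delta = P(\partial)\delta$.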

4.3. THE CASE OF NON-COMPACT SUPPORTS
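4.13. Motivation We have defined the convolution for $u \in \mathcal{E}'$ and $v \in \mathcal{D}'$ by its action on a test function $\varphi$ as
$$\langle u * v, \varphi\rangle = \langle u(x) \otimes v(y), \varphi(x + y)\rangle.$$
Inspecting the right-hand side of this equation, we realize that all that is required for the above formula to work is to have the function
$$\operatorname{supp}(u) \times \operatorname{supp}(v) \to \mathbb{C}, \qquad (x, y) \mapsto \varphi(x + y)$$
compactly supported. This in turn would be guaranteed (for all $\varphi$), if the map $\operatorname{supp}(u) \times \operatorname{supp}(v) \to \mathbb{R}^n$, $(x, y) \mapsto x + y$ has the property that inverse images of compact subsets of $\mathbb{R}^n$ are compact in $\operatorname{supp}(u) \times \operatorname{supp}(v)$.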
4.14. DEF Let $X$ and $Y$ be locally compact$^2$ topological spaces and $f\colon X \to Y$ be continuous. Then $f$ is said to be proper, if for every compact subset $K \subseteq Y$ the inverse image $f^{-1}(K) \subseteq X$ is compact.
4.15. LEMMA Let $A \subseteq \mathbb{R}^n$ be closed and $f\colon A \to \mathbb{R}^m$ be continuous. Then $A$ is locally compact$^3$. Furthermore, $f$ is proper if and only if the following holds:
$$\forall \delta > 0\ \exists \gamma > 0\ \forall x \in A:\quad |f(x)| \leq \delta \Longrightarrow |x| \leq \gamma.$$
Proof: Let $x \in A$ and $K(x)$ be a compact neighborhood of $x$ in $\mathbb{R}^n$. Then $U(x) = A \cap K(x)$ is a compact neighborhood of $x$ in $A$. Hence $A$ is locally compact (since the Hausdorff property of $A$ is clear).
If $f$ is proper, then $f^{-1}(\overline{B_\delta(0)})$ is compact in $A$, hence (also compact and) bounded in $\mathbb{R}^n$. Therefore we can find $\gamma > 0$ such that $f^{-1}(\overline{B_\delta(0)}) \subseteq \overline{B_\gamma(0)}$, which means that $|f(x)| \leq \delta$ implies $|x| \leq \gamma$.
2 Proper maps between arbitrary topological spaces have to be defined differently (cf. [Bou66, Chapter I, Section 10]), but on locally compact spaces our definition is equivalent (due to [Bou66, Proposition 7 in Chapter I, Section 10, Number 3]).
3 In general (topological) subspaces of a locally compact space may fail to be locally compact. (E.g. $\mathbb{Q}$ with the inherited euclidean topology of $\mathbb{R}$ is not locally compact; see also examples with sine curves in $\mathbb{R}^2$ as in [SJ95, No. 118,1]).

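Conversely, suppose that for any $\delta > 0$ we can find $\gamma > 0$ with the above property, i.e., $f^{-1}(\overline{B_\delta(0)}) \subseteq \overline{B_\gamma(0)}$. Let $K \Subset \mathbb{R}^m$. By continuity the set $f^{-1}(K)$ is closed in $A$, thus also closed in $\mathbb{R}^n$ (since $A$ is closed in $\mathbb{R}^n$). Choose $\delta > 0$ so that $K \subseteq \overline{B_\delta(0)}$. There is $\gamma > 0$ such that $f^{-1}(K) \subseteq f^{-1}(\overline{B_\delta(0)}) \subseteq \overline{B_\gamma(0)}$. Hence $f^{-1}(K)$ is also bounded.
In summary, $f^{-1}(K)$ is compact.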
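4.16. Preparatory observations: Suppose $u, v \in \mathcal{D}'(\mathbb{R}^n)$ are such that the map $\operatorname{supp}(u) \times \operatorname{supp}(v) \to \mathbb{R}^n$, $(x, y) \mapsto x + y$ is proper. If $\delta > 0$, then there exists $\gamma > 0$ such that the following holds for every $(x, y) \in \operatorname{supp}(u) \times \operatorname{supp}(v)$:
$$(*) \qquad |x + y| \leq \delta \Longrightarrow \max(|x|, |y|) \leq \gamma.$$
Let $\rho, \chi \in \mathcal{D}(\mathbb{R}^n)$ with $\rho = 1$, $\chi = 1$ on a neighborhood of $\overline{B_\gamma(0)}$.
We claim that the restriction of the distribution $(\rho u) * (\chi v)$ to $B_\delta(0)$ is independent of the choice of $\rho$ and $\chi$:
Let $\rho_1 \in \mathcal{D}(\mathbb{R}^n)$ also have the property that $\rho_1 = 1$ on a neighborhood of $\overline{B_\gamma(0)}$. Then $\operatorname{supp}((\rho_1 - \rho)u) \cap \overline{B_\gamma(0)} = \emptyset$ and we will show that also
$$(**) \qquad \operatorname{supp}\big(((\rho_1 - \rho)u) * (\chi v)\big) \cap \overline{B_\delta(0)} = \emptyset$$
holds. Indeed, let $z \in \mathbb{R}^n$ satisfy $|z| \leq \delta$ and
$$z \in \operatorname{supp}\big(((\rho_1 - \rho)u) * (\chi v)\big) \subseteq \operatorname{supp}((\rho_1 - \rho)u) + \operatorname{supp}(\chi v) \subseteq \operatorname{supp}(u) + \operatorname{supp}(v).$$
Then $z = x + y$ with $x \in \operatorname{supp}((\rho_1 - \rho)u)$ and $y \in \operatorname{supp}(\chi v)$, and $(*)$ implies that $x, y \in \overline{B_\gamma(0)}$. In particular $x \in \operatorname{supp}((\rho_1 - \rho)u) \cap \overline{B_\gamma(0)} = \emptyset$, a contradiction.
By $(**)$ we have $((\rho_1 - \rho)u) * (\chi v)\,|_{B_\delta(0)} = 0$ and therefore
$$(\rho_1 u) * (\chi v) = (\rho u) * (\chi v) + ((\rho_1 - \rho)u) * (\chi v) = (\rho u) * (\chi v) \quad \text{on } B_\delta(0).$$
The same argument applies to a change of the cut-off $\chi$. Thus we are led to the following way of defining the convolution $u * v$ when neither $u$ nor $v$ need to be compactly supported.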
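4.17. DEF (Convolution) Let $u, v \in \mathcal{D}'(\mathbb{R}^n)$ be such that the map

$\operatorname{supp}(u) \times \operatorname{supp}(v) \to \mathbb{R}^n$, $(x, y) \mapsto x + y$ is proper.
We define the convolution $u * v \in \mathcal{D}'(\mathbb{R}^n)$ as follows: For any $\delta > 0$ and $\varphi \in \mathcal{D}(\mathbb{R}^n)$ with $\operatorname{supp}(\varphi) \subseteq B_\delta(0)$ we define
$$(4.6) \qquad \langle u * v, \varphi\rangle := \langle (\rho u) * (\chi v), \varphi\rangle,$$
where the cut-off functions $\rho$ and $\chi$ are as in 4.16.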
4.18. REM (Basic properties)

(i) If $u \in \mathcal{E}'(\mathbb{R}^n)$ and $v \in \mathcal{D}'(\mathbb{R}^n)$, then the convolution according to Definition 4.17 coincides with $u * v$ as constructed in Theorem 4.3.
Proof: By compactness $\operatorname{supp}(u)$ is bounded, say, $\operatorname{supp}(u) \subseteq B_R(0)$. Hence properness of the map $\operatorname{supp}(u) \times \operatorname{supp}(v) \to \mathbb{R}^n$, $(x, y) \mapsto x + y$ follows, since $x \in \operatorname{supp}(u)$, $y \in \operatorname{supp}(v)$ and $|x + y| \leq \delta$ implies $|y| \leq \delta + R$. Thus we may put $\gamma := \delta + R$ to satisfy $(*)$ in 4.16. With cut-off functions $\rho$ and $\chi$ according to 4.16 we now obtain $\rho u = u$. Furthermore, if $\varphi \in \mathcal{D}(B_\delta(0))$ then $\chi v = v$ on the set $\{y \mid \exists x \in \operatorname{supp}(u) : x + y \in \operatorname{supp}(\varphi)\}$, hence
$$\langle u * (\chi v), \varphi\rangle = \langle u(x), \langle \chi(y)v(y), \varphi(x + y)\rangle\rangle = \langle u(x), \langle v(y), \varphi(x + y)\rangle\rangle = \langle u * v, \varphi\rangle.$$
(ii) Relations analogous to those stated in §4.1 hold for the convolution defined in 4.17. In particular, we have again the formulae
$$\langle u * v, \varphi\rangle = \langle u(x), \langle v(y), \varphi(x + y)\rangle\rangle = \langle v(y), \langle u(x), \varphi(x + y)\rangle\rangle,$$
$$\operatorname{supp}(u * v) \subseteq \operatorname{supp}(u) + \operatorname{supp}(v),$$
$$\partial_j (u * v) = (\partial_j u) * v = u * (\partial_j v) \quad \text{and} \quad \tau_h (u * v) = (\tau_h u) * v = u * (\tau_h v),$$
and separate sequential continuity of $(u, v) \mapsto u * v$.

(The proofs are easy adaptations of those in §4.1 based on (4.6). Alternatively, cf. [Hor66, Chapter 4, §9] for an equivalent approach and more detailed proofs. [Equivalence of the approaches follows from Exercise 2 in the same Section of that book.])
(iii) Similarly, convolution of finitely many distributions $u_1, \ldots, u_m \in \mathcal{D}'(\mathbb{R}^n)$ can be defined under the condition that
$$\operatorname{supp}(u_1) \times \cdots \times \operatorname{supp}(u_m) \to \mathbb{R}^n, \quad (x^{(1)}, \ldots, x^{(m)}) \mapsto x^{(1)} + \ldots + x^{(m)} \quad \text{is proper.}$$
In this case, we have also associativity of the convolution, in particular, if $u_1, u_2, u_3$ satisfy the above properness condition, then
$$u_1 * u_2 * u_3 = (u_1 * u_2) * u_3 = u_1 * (u_2 * u_3).$$
(Cf. [Hor66, Chapter 4, §9] or [FJ98, Section 5.3].)

Warning: Associativity may fail if the properness condition is violated, even in cases where both convolutions $(u_1 * u_2) * u_3$ and $u_1 * (u_2 * u_3)$ exist. For example, on $\mathbb{R}$ we have
$$(1 * \delta') * H = (1' * \delta) * H = 0 * H = 0, \quad \text{whereas}$$
$$1 * (\delta' * H) = 1 * (\delta * H') = 1 * (\delta * \delta) = 1 * \delta = 1.$$
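4.19. Examples and applications

(i) $\mathcal{D}'_+(\mathbb{R}) := \{u \in \mathcal{D}'(\mathbb{R}) \mid \exists a \in \mathbb{R} : \operatorname{supp}(u) \subseteq [a, \infty[\,\}$ is a convolution algebra, that is, $\mathcal{D}'_+(\mathbb{R})$ is a vector subspace such that convolution is a bilinear map $*\colon \mathcal{D}'_+(\mathbb{R}) \times \mathcal{D}'_+(\mathbb{R}) \to \mathcal{D}'_+(\mathbb{R})$ and $(\mathcal{D}'_+(\mathbb{R}), +, *)$ forms a ring.
(In addition, we have $\delta \in \mathcal{D}'_+(\mathbb{R})$ as an identity with respect to convolution and commutativity of $*$.)
If $u_1, \ldots, u_m \in \mathcal{D}'_+(\mathbb{R})$, then the properness condition in 4.18(iii) holds, since we may first choose a common lower bound for the supports and then boundedness of the sum forces boundedness of each summand. Thus we obtain convolvability and associativity. That $u_1 * u_2$ belongs again to $\mathcal{D}'_+(\mathbb{R})$ follows from the relation $\operatorname{supp}(u_1 * u_2) \subseteq \operatorname{supp}(u_1) + \operatorname{supp}(u_2)$. Finally bilinearity is immediate from the definition.
Similarly, one can show that $\mathcal{D}'_-(\mathbb{R}) := \{u \in \mathcal{D}'(\mathbb{R}) \mid \exists a \in \mathbb{R} : \operatorname{supp}(u) \subseteq \,]-\infty, a]\}$ is a convolution algebra.
(ii) Alternative description of primitives (or antiderivatives): Let $a, b \in \mathbb{R}$ with $a < b$ and $\chi \in C^\infty(\mathbb{R})$ be such that $\chi = 0$ when $x < a$ and $\chi = 1$ when $x > b$.
For any $v \in \mathcal{D}'(\mathbb{R})$ put
$$v_- := (1 - \chi)v \quad \text{and} \quad v_+ := \chi v.$$
Then $v = v_- + v_+$ with $v_- \in \mathcal{D}'_-(\mathbb{R})$ and $v_+ \in \mathcal{D}'_+(\mathbb{R})$. Since also $(H - 1) \in \mathcal{D}'_-(\mathbb{R})$ and $H \in \mathcal{D}'_+(\mathbb{R})$, we may define
$$u := (H - 1) * v_- + H * v_+ \in \mathcal{D}'(\mathbb{R})$$
and obtain
$$u' = ((H - 1) * v_-)' + (H * v_+)' = (H - 1)' * v_- + H' * v_+ = \delta * v_- + \delta * v_+ = v_- + v_+ = v.$$
In particular, for any $w \in \mathcal{D}'_+(\mathbb{R})$ the distribution $H * w \in \mathcal{D}'_+(\mathbb{R})$ is an antiderivative.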
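The statement that $H * w$ is an antiderivative of $w \in \mathcal{D}'_+(\mathbb{R})$ has a simple discrete counterpart: convolving a sequence supported on $\{0, 1, \ldots\}$ with the discrete Heaviside sequence gives its running sum, and the backward difference of the running sum returns the sequence. The sketch below only illustrates this analogue; sequence length and support are arbitrary choices for the example.

```python
import numpy as np

n = 50
w = np.zeros(n)
w[5:20] = np.random.default_rng(2).normal(size=15)   # support bounded below

H = np.ones(n)                  # discrete Heaviside sequence on {0,...,n-1}
u = np.convolve(H, w)[:n]       # u[k] = sum_{j <= k} w[j], the discrete "H * w"
du = np.diff(u, prepend=0.0)    # backward difference u[k] - u[k-1]

print(np.allclose(u, np.cumsum(w)), np.allclose(du, w))
```

The truncation `[:n]` is harmless because the entries of the full convolution with index below `n` only involve the values of `H` and `w` that are actually stored.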

4.4. THE LOCAL STRUCTURE OF DISTRIBUTIONS
4.20. Motivation At the very beginning of this course (0.3, 0.5) we had illustrated the need to differentiate functions that are classically non-differentiable. Now we are in a position to show that, in a sense, distribution theory is exactly a theory of derivatives (of arbitrary order) of continuous functions. More precisely, we will show that locally any distribution is represented as a derivative of a continuous function. To begin with, we first look at the special case of the Dirac-Delta as (higher-order) derivative of certain continuous functions.
4.21. Examples
(i) The kink function: As one of the first examples of differentiation we had $H' = \delta$ in $\mathcal{D}'(\mathbb{R})$. We see that in a sense the "primitive function" of $\delta$ is thus more regular than $\delta$ itself. In fact, $H$ is a regular distribution, since $H \in L^\infty_{\mathrm{loc}}(\mathbb{R}) \subseteq L^1_{\mathrm{loc}}(\mathbb{R})$. Let us look at a "primitive function" of $H$, namely the kink function
$$x_+ := xH(x) \qquad (x \in \mathbb{R}).$$
Indeed we have by the Leibniz rule 2.17 and from (2.7) $x_+' = (xH(x))' = H(x) + x\,\delta(x) = H$. We observe that $x_+$ is even continuous$^4$ and that $x_+'' = \delta$.
Successively defining primitive functions with value 0 at $x = 0$ we obtain with the functions $x_+^{k-1} \in C^{k-2}(\mathbb{R})$ the relations
$$\Big(\frac{x_+^{k-1}}{(k - 1)!}\Big)^{(k)} = \delta \qquad (k = 2, 3, \ldots).$$
[Proof by induction.]
(ii) The multidimensional case: We use coordinates $x = (x_1, \ldots, x_n)$ in $\mathbb{R}^n$ and put
$$E_k(x) := \frac{(x_1)_+^{k-1} (x_2)_+^{k-1} \cdots (x_n)_+^{k-1}}{((k - 1)!)^n}.$$
Then $E_k \in C^{k-2}(\mathbb{R}^n)$ and we have

$$(4.7) \qquad (\partial_1 \partial_2 \cdots \partial_n)^k E_k = \delta \qquad (k = 2, 3, \ldots).$$
In the terminology to be introduced in chapter 7, we may restate Equation (4.7) as follows: $E_k$ is a fundamental solution for the partial differential operator $(\partial_1 \cdots \partial_n)^k$. Furthermore, $H$ is a fundamental solution for $\frac{d}{dx}$, $x_+$ is a fundamental solution for $(\frac{d}{dx})^2$ etc.
4 Since any other antiderivative differs from $x_+$ by a constant, we deduce continuity of any primitive function of $H$.
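The relation $x_+'' = \delta$ means $\langle x_+, \varphi''\rangle = \varphi(0)$, which can be checked by quadrature. In the following sketch we assume, for convenience only, the rapidly decaying function $\varphi(x) = e^{-(x - 0.3)^2}$ as a stand-in for a compactly supported test function and use a plain trapezoidal rule on $[0, 12]$.

```python
import numpy as np

phi = lambda x: np.exp(-(x - 0.3)**2)                    # stand-in for a test function
d2phi = lambda x: (4.0 * (x - 0.3)**2 - 2.0) * phi(x)    # second derivative of phi

x = np.linspace(0.0, 12.0, 120001)
integrand = x * d2phi(x)                                 # x_+ * phi'' on the half-line x >= 0
h = x[1] - x[0]
# trapezoidal rule for <x_+, phi''> = int_0^infty x phi''(x) dx
pairing = h * (np.sum(integrand) - 0.5 * (integrand[0] + integrand[-1]))

print(pairing, phi(0.0))
```

Integration by parts gives $\int_0^\infty x\varphi''(x)\,dx = -\int_0^\infty \varphi'(x)\,dx = \varphi(0)$, so the two printed numbers should agree up to quadrature error.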
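4.22. THM (Local structure theorem for distributions on $\mathbb{R}^n$) Let $u \in \mathcal{D}'(\mathbb{R}^n)$ and $\Omega \subseteq \mathbb{R}^n$ be open and bounded. Then there exist $f \in C(\mathbb{R}^n)$ and $\alpha \in \mathbb{N}_0^n$ such that
$$u\,|_\Omega = \partial^\alpha (f\,|_\Omega).$$
Thus, locally every distribution is the (distributional) derivative of a continuous function.
Proof: The boundedness of $\Omega$ allows us to choose $\chi \in \mathcal{D}(\mathbb{R}^n)$ with $\chi = 1$ on $\Omega$. We set $\tilde u = \chi u$, then $u\,|_\Omega = \tilde u\,|_\Omega$ and $\tilde u \in \mathcal{E}'(\mathbb{R}^n)$. By 1.67(iii) $\tilde u$ is of finite order $N$, say. We have by (4.7)
$$\tilde u = \delta * \tilde u = (\partial_1 \cdots \partial_n)^{N+2} E_{N+2} * \tilde u,$$
hence it suffices to show that $E_{N+2} * \tilde u$ is continuous.

Let $\rho \in \mathcal{D}(\mathbb{R}^n)$ be a mollifier and $\rho_\varepsilon(x) = \rho(x/\varepsilon)/\varepsilon^n$ ($x \in \mathbb{R}^n$, $0 < \varepsilon \leq 1$). Consider
$$f_\varepsilon := E_{N+2} * \tilde u * \rho_\varepsilon \in C^\infty(\mathbb{R}^n) \qquad [\text{Thm 4.8}].$$
Since $\tilde u$ and $\rho_\varepsilon$ both have compact support, we may use associativity and commutativity of the convolution and obtain
$$f_\varepsilon(x) = \tilde u * (E_{N+2} * \rho_\varepsilon)(x) = \langle \tilde u(y), (E_{N+2} * \rho_\varepsilon)(x - y)\rangle,$$
where $E_{N+2} * \rho_\varepsilon \in C^\infty$ by Theorem 4.8 and the second equality is (4.5).
Recall from Theorem 1.13(ii) that we have $E_{N+2} * \rho_\varepsilon \to E_{N+2}$ in $C^N(\mathbb{R}^n)$ as $\varepsilon \to 0$. In particular $(E_{N+2} * \rho_\varepsilon)_\varepsilon$ is a Cauchy net in $C^N(\mathbb{R}^n)$.
Since $\tilde u \in \mathcal{E}'(\mathbb{R}^n)$ and is of order $N$ we have the seminorm estimate (SN$'$) with derivative order $N$ and some $C > 0$ and a fixed compact set $K \Subset \mathbb{R}^n$. Applying this to $f_\varepsilon - f_\eta$ ($0 < \varepsilon, \eta \leq 1$) we obtain for any compact subset $L \Subset \mathbb{R}^n$ and arbitrary $x \in L$
$$|f_\varepsilon(x) - f_\eta(x)| = |\langle \tilde u(y), (E_{N+2} * \rho_\varepsilon - E_{N+2} * \rho_\eta)(x - y)\rangle|$$
$$\leq C \sum_{|\alpha| \leq N} \|\partial^\alpha (E_{N+2} * \rho_\varepsilon - E_{N+2} * \rho_\eta)(x - \,.\,)\|_{L^\infty(K)}$$
$$\leq C \sum_{|\alpha| \leq N} \|\partial^\alpha (E_{N+2} * \rho_\varepsilon - E_{N+2} * \rho_\eta)\|_{L^\infty(L - K)}.$$
Upon taking the supremum over $x \in L$ we deduce that $(f_\varepsilon)$ is a Cauchy net in $C(\mathbb{R}^n)$, thus converges uniformly on compact sets to some function $f \in C(\mathbb{R}^n)$.
On the other hand, by separate sequential continuity of convolution we also obtain the convergence
$$f_\varepsilon = E_{N+2} * (\tilde u * \rho_\varepsilon) \to E_{N+2} * (\tilde u * \delta) = E_{N+2} * \tilde u \qquad (\varepsilon \to 0).$$
Therefore the equality $E_{N+2} * \tilde u = f \in C(\mathbb{R}^n)$ must hold and the proof is complete.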
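4.23. Corollary (Global structure of $\mathcal{E}'$-distributions) Let $u \in \mathcal{E}'(\mathbb{R}^n)$ and $U$ be an open neighborhood of $\operatorname{supp}(u)$. Then we can find $m \in \mathbb{N}$ and functions $f_\beta \in C(\mathbb{R}^n)$ ($|\beta| \leq m$) with $\operatorname{supp}(f_\beta) \Subset U$ such that
$$u = \sum_{|\beta| \leq m} \partial^\beta f_\beta.$$
In other words, every compactly supported distribution is (globally) represented by a finite sum of (distributional) derivatives of continuous functions.
Proof: Choose $\Omega \subseteq \mathbb{R}^n$ open and bounded such that $\operatorname{supp}(u) \Subset \Omega \subseteq U$. The local structure theorem provides us with a function $f \in C(\mathbb{R}^n)$ and $\alpha \in \mathbb{N}_0^n$ such that $u\,|_\Omega = \partial^\alpha (f\,|_\Omega)$.
Let $\chi \in \mathcal{D}(\Omega)$ with $\chi = 1$ on a neighborhood of $\operatorname{supp}(u)$. Then we have for any $\varphi \in C^\infty(\mathbb{R}^n)$
$$\langle u, \varphi\rangle = \langle u, \chi\varphi\rangle = \langle \partial^\alpha f, \chi\varphi\rangle = (-1)^{|\alpha|} \langle f, \partial^\alpha(\chi\varphi)\rangle$$
$$= (-1)^{|\alpha|} \sum_{\beta \leq \alpha} \binom{\alpha}{\beta} \underbrace{\langle f, \partial^{\alpha - \beta}\chi\, \partial^\beta \varphi\rangle}_{= (-1)^{|\beta|} \langle \partial^\beta(\partial^{\alpha - \beta}\chi\, f), \varphi\rangle} \qquad [\text{Leibniz' rule}]$$
$$= \sum_{\beta \leq \alpha} (-1)^{|\alpha| + |\beta|} \binom{\alpha}{\beta} \langle \partial^\beta(\partial^{\alpha - \beta}\chi\, f), \varphi\rangle
= \Big\langle \sum_{\beta \leq \alpha} \partial^\beta \underbrace{\Big((-1)^{|\alpha| + |\beta|} \binom{\alpha}{\beta} \partial^{\alpha - \beta}\chi\, f\Big)}_{=: f_\beta}, \varphi\Big\rangle = \Big\langle \sum_{\beta \leq \alpha} \partial^\beta f_\beta, \varphi\Big\rangle.$$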

Chapter 5
FOURIER TRANSFORM AND TEMPERATE DISTRIBUTIONS
5.1. Intro This chapter presents the basics of Fourier transform in a distribution theoretic framework. Fourier transform techniques had already been a prominent tool in analysis prior to Laurent Schwartz' new theory, which provided a more complete and consistent treatment. In particular, its impact on the study of partial differential equations turned out to be enormous and can still be felt in present-day research.
Our point of departure (§5.1) will be the classical formula of Fourier transform as an integral operator acting on functions $f\colon \mathbb{R}^n \to \mathbb{C}$ by
$$\mathcal{F}f(\xi) = \hat f(\xi) := \int_{\mathbb{R}^n} f(x)\, e^{-ix\xi}\, dx.$$
As in earlier chapters our strategy of extending the Fourier transform will be by the adjoint of its action on test functions, i.e. for a test function $\varphi$ and a distribution $u$ we wish to define
$$\langle \hat u, \varphi\rangle := \langle u, \hat\varphi\rangle.$$
Apparently we thus need a space of test functions which is invariant under application of the Fourier transform. However, neither $\mathcal{D}$ nor $\mathcal{E}$ have this property$^1$, hence we are led to introducing the Schwartz space $\mathcal{S}$ of rapidly decreasing functions in §5.2.
1 In case $f \in \mathcal{E}$ the integral defining $\hat f$ may be divergent; if $f \in \mathcal{D}$ we will see below that $\operatorname{supp}(\hat f)$ cannot be compact unless $f = 0$.
The dual space $\mathcal{S}'$ of temperate distributions (alternatively called tempered in the literature$^2$) then provides an appropriate arena for the distributional Fourier transform (§5.3-4).
Among the many remarkable properties of the Fourier transform is its relation with respect to convolution and multiplication: In §5.5 we will prove that for any $u \in \mathcal{S}'$ and $v \in \mathcal{E}'$ one has
$$\widehat{(u * v)} = \hat u \cdot \hat v,$$
where $\hat v$ is smooth (and polynomially bounded in every derivative).
Finally, in §5.6 we study the Fourier transform on $L^2 \subseteq \mathcal{S}'$ and recover basic facts, which are obtained classically by an extension of $\mathcal{F}$ (originally defined on $L^1$) considered on $L^1 \cap L^2$ and then extended to $L^2$ (cf. [For84, §12]).
2 Compare, e.g. [Hor90, Chapter 7] vs. [FJ98, Chapter 8]

5.1. CLASSICAL FOURIER TRANSFORM
5.2. Preliminaries and the classical definition:

(i) Recall that $L^1(\mathbb{R}^n)$ is the vector space of classes of Lebesgue integrable functions $f$ on $\mathbb{R}^n$, i.e. $\int_{\mathbb{R}^n} |f(x)|\, dx < \infty$ as Lebesgue integral, modulo the relation of 'being equal (Lebesgue) almost everywhere'. Following traditional abuse of notion and notation we typically work with elements of $L^1$ as if they were functions, thus, strictly speaking, mixing up a representative with its class.
(ii) The Fourier transform of a function $f \in L^1(\mathbb{R}^n)$ is defined to be the function $\mathcal{F}(f)\colon \mathbb{R}^n \to \mathbb{C}$, given by ($x\xi$ denoting the standard inner product of $x$ and $\xi$ on $\mathbb{R}^n$)
$$(5.1) \qquad \mathcal{F}(f)(\xi) = \hat f(\xi) := \int_{\mathbb{R}^n} f(x)\, e^{-ix\xi}\, dx \qquad (\xi \in \mathbb{R}^n).$$
(For every $\xi$ the value of the integral is finite, since $|f(x)e^{-ix\xi}| = |f(x)|$ is L-integrable; furthermore, $\mathcal{F}(f)(\xi)$ does not depend on the $L^1$-representative, since altering $f$ on a set of Lebesgue measure zero does not change the value of the integral.)
(iii) In the sequel we will occasionally apply theorems by Fubini and Tonelli ([Fol99, Theorem 2.37]) without further mentioning. We recall special forms of the general statements, which are sufficient for our purposes: Let $X \subseteq \mathbb{R}^m$ and $Y \subseteq \mathbb{R}^n$ be L-measurable subsets.
(a) Suppose $f \in L^1(X \times Y)$. Then the functions
$$x \mapsto \int_Y f(x, y)\, dy \quad \text{and} \quad y \mapsto \int_X f(x, y)\, dx$$
(are finite almost everywhere and) define integrable functions on $X$ and $Y$, respectively, and we have the equalities
$$\int_X \Big(\int_Y f(x, y)\, dy\Big) dx = \int_{X \times Y} f(x, y)\, d(x, y) = \int_Y \Big(\int_X f(x, y)\, dx\Big) dy.$$
(b) If $f$ is L-measurable on $X \times Y$, then the functions
$$x \mapsto \int_Y |f(x, y)|\, dy \quad \text{and} \quad y \mapsto \int_X |f(x, y)|\, dx$$
are L-measurable (and non-negative) on $X$ and $Y$, respectively, and we have the equalities
$$\int_X \Big(\int_Y |f(x, y)|\, dy\Big) dx = \int_{X \times Y} |f(x, y)|\, d(x, y) = \int_Y \Big(\int_X |f(x, y)|\, dx\Big) dy.$$
In particular, if any of the three members in the above equalities is finite, then (the class of) $f \in L^1(X \times Y)$.

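5.3. THM
(i) For every $f \in L^1(\mathbb{R}^n)$ the Fourier transform $\hat f\colon \mathbb{R}^n \to \mathbb{C}$ is continuous and satisfies
$$(5.2) \qquad |\hat f(\xi)| \leq \|f\|_{L^1} \qquad \forall \xi \in \mathbb{R}^n.$$
(ii) If $f, g \in L^1(\mathbb{R}^n)$, then
$$(5.3) \qquad \int f(x)\, \hat g(x)\, dx = \int \hat f(\xi)\, g(\xi)\, d\xi.$$
(iii) If $f, g \in L^1(\mathbb{R}^n)$, then $x \mapsto f * g(x) = \int f(y)g(x - y)\, dy$ defines an element $f * g \in L^1(\mathbb{R}^n)$ and we have
$$(5.4) \qquad \widehat{(f * g)} = \hat f \cdot \hat g.$$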
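Proof: (i) As remarked immediately after the definition, $\hat f(\xi)$ is well-defined and finite. Moreover, the triangle inequality for integrals yields
$$|\hat f(\xi)| \le \int |f(x)\,e^{-ix\xi}|\,dx = \int |f(x)|\,dx = \|f\|_{L^1}.$$
If $\xi_k \to \xi$ as $k \to \infty$, then $f(x)e^{-ix\xi_k} \to f(x)e^{-ix\xi}$ pointwise and $|f(x)e^{-ix\xi_k}| \le |f(x)|$ provides an $L^1$-bound uniformly for all $k$. Thus dominated convergence implies $\hat f(\xi_k) \to \hat f(\xi)$ $(k \to \infty)$, hence continuity of $\hat f$.
(ii) By (i) we have that $|\hat g| \le \|g\|_{L^1}$, hence $\hat g$ is bounded and $f \hat g \in L^1(\mathbb{R}^n)$. Furthermore, by Fubini's theorem,
$$\int f(x)\,\hat g(x)\,dx = \int f(x) \int g(\xi)\,e^{-ix\xi}\,d\xi\,dx = \int g(\xi) \int f(x)\,e^{-ix\xi}\,dx\,d\xi = \int g(\xi)\,\hat f(\xi)\,d\xi.$$
(iii) Observe that $(x, y) \mapsto f(y)\,g(x-y)$ is L-measurable and
$$\int_{\mathbb{R}^n \times \mathbb{R}^n} |f(y)\,g(x-y)|\,d(x,y) = \int_{\mathbb{R}^n} |f(y)| \int_{\mathbb{R}^n} |g(x-y)|\,dx\,dy = \int |f(y)|\,\|g\|_{L^1}\,dy = \|g\|_{L^1} \|f\|_{L^1} < \infty.$$
Hence $x \mapsto \int f(y)\,g(x-y)\,dy = f * g(x)$ defines an integrable function on $\mathbb{R}^n$. We determine its Fourier transform as follows:
$$\widehat{(f * g)}(\xi) = \int e^{-ix\xi} (f * g)(x)\,dx = \int e^{-ix\xi} \int f(y)\,g(x-y)\,dy\,dx = \int f(y) \int g(x-y)\,e^{-ix\xi}\,dx\,dy$$
$$\underset{[z = x-y]}{=} \int f(y) \int g(z)\,e^{-i(z+y)\xi}\,dz\,dy = \int f(y)\,e^{-iy\xi} \underbrace{\int g(z)\,e^{-iz\xi}\,dz}_{= \hat g(\xi)}\,dy = \hat f(\xi)\,\hat g(\xi).$$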
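As a numerical sanity check of (5.2) and (5.4), the following sketch compares both sides of the convolution formula for a hand-picked example in dimension $n = 1$. The choice $f = g = $ indicator function of $[0,1]$, the helper names `fourier`, `box`, `tri`, and the midpoint quadrature are assumptions made for illustration only; they do not appear in the notes.

```python
import cmath

def fourier(func, a, b, xi, n=20000):
    """Midpoint-rule approximation of the integral of func(x) * exp(-i x xi) over [a, b]."""
    h = (b - a) / n
    total = 0j
    for k in range(n):
        x = a + (k + 0.5) * h
        total += func(x) * cmath.exp(-1j * x * xi)
    return total * h

def box(x):
    # indicator function of [0, 1]; its L^1 norm is 1
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def tri(x):
    # (box * box)(x), worked out by hand: a triangle supported on [0, 2]
    if 0.0 <= x <= 1.0:
        return x
    if 1.0 < x <= 2.0:
        return 2.0 - x
    return 0.0

xi = 1.7
lhs = fourier(tri, 0.0, 2.0, xi)        # Fourier transform of the convolution
rhs = fourier(box, 0.0, 1.0, xi) ** 2   # product of the Fourier transforms
print(abs(lhs - rhs))                   # small, up to quadrature error
```

Since $\|f * g\|_{L^1} = 1$ in this example, the estimate (5.2) additionally predicts $|\widehat{(f*g)}(\xi)| \le 1$.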

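5.4. What are test functions for a distributional Fourier transform?

If $f \in L^1(\mathbb{R}^n) \subseteq \mathcal{D}'(\mathbb{R}^n)$ and $\varphi \in \mathcal{D}(\mathbb{R}^n) \subseteq L^1(\mathbb{R}^n)$, then Equation (5.3) gives
$$\langle \hat f, \varphi \rangle = \langle f, \hat\varphi \rangle.$$
Thus it is tempting to try defining the Fourier transform of any $u \in \mathcal{D}'(\mathbb{R}^n)$ by our standard duality trick in the form
$$\langle \hat u, \varphi \rangle := \langle u, \hat\varphi \rangle.$$
However, if $\varphi \in \mathcal{D}(\mathbb{R}^n)$ then $\hat\varphi$ cannot have compact support unless $\varphi = 0$. For simplicity, let us give details in the one-dimensional case: Let $R > 0$ such that $\operatorname{supp}(\varphi) \subseteq [-R, R]$. Using the power series expansion of the exponential function and its uniform convergence on the compact set $[-R, R]$ we may write
$$\hat\varphi(\xi) = \int_{-R}^{R} \varphi(x) \sum_{k=0}^{\infty} \frac{(-ix\xi)^k}{k!}\,dx = \sum_{k=0}^{\infty} \underbrace{\frac{(-i)^k}{k!} \int_{-R}^{R} \varphi(x)\,x^k\,dx}_{=: a_k}\ \xi^k = \sum_{k=0}^{\infty} a_k\,\xi^k,$$
where $|a_k| \le 2R\,\|\varphi\|_{L^\infty} R^k / k!$. This shows that $\hat\varphi(\xi)$ is represented by a power series with infinite radius of convergence, thus is a real analytic function (that can be extended to a holomorphic function on all of $\mathbb{C}$). If $\operatorname{supp}(\hat\varphi)$ is compact, then $\hat\varphi$ vanishes on a set with accumulation points, hence $\hat\varphi = 0$ (everywhere). (As we will see below, this implies $\varphi = 0$.)
We conclude that $\mathcal{F}(\mathcal{D}) \not\subseteq \mathcal{D}$ and ask whether there is an explicit function space $Y$ on $\mathbb{R}^n$ with $\mathcal{D} \subseteq Y \subseteq L^1 \cap \mathcal{E}$ such that $\mathcal{F}(Y) \subseteq Y$.
A further natural requirement will be that $Y$ should be invariant under differentiation. Observe that then we further obtain that for $\varphi \in Y$
$$\widehat{\partial_j \varphi}(\xi) = \int e^{-ix\xi}\,\partial_j\varphi(x)\,dx = -\int (-i\xi_j)\,e^{-ix\xi}\,\varphi(x)\,dx = i\xi_j\,\hat\varphi(\xi)$$
should belong to $Y$. By induction we deduce that also multiplication by polynomials should leave $Y$ invariant. Furthermore, a calculation similar to the above shows $(x_j\varphi)\hat{\ } = i\,\partial_j\hat\varphi$ etc. Thus, we are led to the additional condition that also
$$x^\alpha \partial^\beta Y \subseteq Y \qquad \forall \alpha, \beta \in \mathbb{N}_0^n$$
should hold. As we shall see in the following section, an appropriate function space is given by considering smooth functions $\varphi$ such that $x^\alpha \partial^\beta \varphi(x)$ is bounded (for all $\alpha$, $\beta$).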

5.2. THE SPACE OF RAPIDLY DECREASING FUNCTIONS
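5.5. Notation To avoid extra factors of the form $(-i)^{|\alpha|}$ from popping up in many calculations, it is very common to introduce the operator
$$D_j := \frac{1}{i}\,\partial_j \qquad (j = 1, \ldots, n)$$
and $D = (D_1, \ldots, D_n)$. Note that we then have $D_j(e^{ix\xi}) = \xi_j\,e^{ix\xi}$ and furthermore, $p(D)(e^{ix\xi}) = p(\xi)\,e^{ix\xi}$, if $p$ is any polynomial function on $\mathbb{R}^n$.
5.6. DEF Let $\varphi \in C^\infty(\mathbb{R}^n)$.
(i) The function $\varphi$ is said to be rapidly decreasing, if it satisfies the following semi-norm condition
$$\forall \alpha, \beta \in \mathbb{N}_0^n : \quad q_{\alpha,\beta}(\varphi) := \sup_{x \in \mathbb{R}^n} |x^\alpha D^\beta \varphi(x)| < \infty. \tag{5.5}$$
(ii) The vector space of all rapidly decreasing functions on $\mathbb{R}^n$ is denoted by $\mathcal{S}(\mathbb{R}^n)$.
(iii) Let $(\varphi_m)$ be a sequence in $\mathcal{S}(\mathbb{R}^n)$. We define convergence of $(\varphi_m)$ to $\varphi$ in $\mathcal{S}(\mathbb{R}^n)$ (as $m \to \infty$), denoted also by $\varphi_m \xrightarrow{\mathcal{S}} \varphi$, by the property
$$\forall \alpha, \beta \in \mathbb{N}_0^n : \quad q_{\alpha,\beta}(\varphi_m - \varphi) \to 0 \quad (m \to \infty).$$
(Similarly for nets like $(\varphi_\varepsilon)_{0 < \varepsilon \le 1}$.)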
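To make the seminorms in (5.5) concrete, here is a small sketch that approximates $q_{\alpha,0}$ for the one-dimensional Gaussian $e^{-x^2/2}$ on a grid and compares with the value $(\alpha/e)^{\alpha/2}$ obtained by maximizing $x^\alpha e^{-x^2/2}$ by hand. The function names, the grid parameters and the restriction to $n = 1$, $\beta = 0$ are illustrative assumptions.

```python
import math

def gauss(x):
    return math.exp(-x * x / 2.0)

def q_alpha_0(alpha, radius=10.0, n=100001):
    """Grid approximation of sup_x |x^alpha * gauss(x)| in dimension 1."""
    best = 0.0
    for k in range(n):
        x = -radius + 2.0 * radius * k / (n - 1)
        best = max(best, abs(x ** alpha * gauss(x)))
    return best

def q_exact(alpha):
    # for alpha > 0 the supremum is attained at |x| = sqrt(alpha); for alpha = 0 at x = 0
    return (alpha / math.e) ** (alpha / 2.0) if alpha > 0 else 1.0

values = {alpha: (q_alpha_0(alpha), q_exact(alpha)) for alpha in range(6)}
for alpha, (numeric, exact) in values.items():
    print(alpha, numeric, exact)
```

All values are finite, in line with the Gaussian being rapidly decreasing; the seminorms grow with $\alpha$ but never blow up.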
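5.7. REM
(i) Let $\varphi \in C^\infty(\mathbb{R}^n)$, then the condition (5.5) is equivalent to the following statement
$$\forall \beta \in \mathbb{N}_0^n\ \ \forall l \in \mathbb{N}_0\ \ \exists C > 0 : \quad |D^\beta \varphi(x)| \le \frac{C}{(1 + |x|)^l} \qquad \forall x \in \mathbb{R}^n. \tag{5.6}$$
It is obvious that (5.5) is a consequence of (5.6). On the other hand, (5.5) implies³
$$\forall \beta \in \mathbb{N}_0^n\ \ \forall k \in \mathbb{N}_0\ \ \exists C > 0 : \quad \underbrace{\sup_{x \in \mathbb{R}^n} |D^\beta \varphi(x)| + \sup_{x \in \mathbb{R}^n} \big|\,|x|^{2k} D^\beta \varphi(x)\big|}_{\ge\ \sup_{x \in \mathbb{R}^n} |(1 + |x|^{2k})\,D^\beta \varphi(x)|} \le C,$$
which in turn gives (5.6) upon noting that $1/(1 + |x|^{2k}) \le C_{k,l}/(1 + |x|)^l$ when $2k \ge l$ with an appropriate constant $C_{k,l}$.
³ (putting $\beta = \beta$ and $\alpha = 0, 2e_1, \ldots, 2e_n, 4e_1, \ldots, 4e_n, \ldots, 2ke_1, \ldots, 2ke_n$ successively)
(ii) We clearly have $\mathcal{D}(\mathbb{R}^n) \subseteq \mathcal{S}(\mathbb{R}^n) \subseteq \mathcal{E}(\mathbb{R}^n)$.
(iii) An explicit example of a function $\varphi \in \mathcal{S}(\mathbb{R}^n) \setminus \mathcal{D}(\mathbb{R}^n)$ is $\varphi(x) = e^{-c|x|^2}$ with $\operatorname{Re}(c) > 0$.
(iv) Convergence in $\mathcal{S}(\mathbb{R}^n)$ (and, in fact, also the topology of $\mathcal{S}(\mathbb{R}^n)$) is equivalently described by the increasing sequence of semi-norms
$$Q_k(\varphi) := \sum_{|\alpha|, |\beta| \le k} q_{\alpha,\beta}(\varphi) \qquad (k \in \mathbb{N}_0).$$
(`Increasing' since $k \le k'$ implies $Q_k(\varphi) \le Q_{k'}(\varphi)$.)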

Moreover, we claim that convergence (and also the topology) in $\mathcal{S}(\mathbb{R}^n)$ can also be described by the metric $d \colon \mathcal{S}(\mathbb{R}^n) \times \mathcal{S}(\mathbb{R}^n) \to \mathbb{R}$, defined by
$$d(\varphi, \psi) := \sum_{k=0}^{\infty} 2^{-k}\,\frac{Q_k(\varphi - \psi)}{1 + Q_k(\varphi - \psi)}.$$
(Regarding the abstract theory in the background, this stems from the general fact that a locally convex vector space is metrizable if and only if its topology is generated by a countable number of semi-norms. The construction of the metric is as in [Hor66, Chapter 2, §6, Proposition 2].)
We comment on the proof of the above claim:
In showing that $d$ indeed defines a metric the only nontrivial part is the triangle inequality $d(\varphi, \psi) \le d(\varphi, \rho) + d(\rho, \psi)$. Use that the function $f \colon [0, \infty[ \to [0, \infty[$, $f(x) = x/(1 + x)$, is increasing and that $Q_k(\varphi - \psi) \le Q_k(\varphi - \rho) + Q_k(\rho - \psi)$. Finally, in every summand (as $k = 0, 1, 2, \ldots$) use the following simple estimate valid for any $a, b \ge 0$:
$$\frac{a + b}{1 + a + b} = \frac{a}{1 + a + b} + \frac{b}{1 + a + b} \le \frac{a}{1 + a} + \frac{b}{1 + b}.$$
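That convergence with respect to the metric $d$ implies $\mathcal{S}$-convergence as defined above is clear. Conversely, assume that $(\varphi_m)$ is a sequence converging to $\varphi$ in $\mathcal{S}$. We have to show that $d(\varphi_m, \varphi) \to 0$ as $m \to \infty$.
Let $\varepsilon > 0$. Choose $N \in \mathbb{N}$ so that $2^{-N} \le \varepsilon/2$. There exists $m_0 \in \mathbb{N}$ such that $Q_N(\varphi_m - \varphi) < \varepsilon/4$ holds for all $m \ge m_0$. Thus we obtain for any $m \ge m_0$
$$d(\varphi_m, \varphi) = \sum_{k=0}^{N} 2^{-k} \underbrace{\frac{Q_k(\varphi_m - \varphi)}{1 + Q_k(\varphi_m - \varphi)}}_{\le\, Q_N(\varphi_m - \varphi)} + \sum_{k=N+1}^{\infty} 2^{-k} \underbrace{\frac{Q_k(\varphi_m - \varphi)}{1 + Q_k(\varphi_m - \varphi)}}_{\le\, 1}$$
$$\le Q_N(\varphi_m - \varphi) \sum_{k=0}^{N} 2^{-k} + \sum_{k=N+1}^{\infty} 2^{-k} = Q_N(\varphi_m - \varphi)\,2\,(1 - 2^{-N-1}) + 2 \cdot \frac{1}{2^{N+1}} < \frac{\varepsilon}{4}\,2\,(1 - 0) + \frac{\varepsilon}{2} = \varepsilon.$$

In particular, since $\mathcal{S}(\mathbb{R}^n)$ is a metric space, we need not distinguish between continuity and sequential continuity for maps defined on $\mathcal{S}(\mathbb{R}^n)$.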
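The two ingredients of the metric argument, subadditivity of $x \mapsto x/(1+x)$ and the bound of $d$ by $2\,Q_N$ plus a geometric tail, can be tried out on numbers. In the sketch below the list `Q` holds made-up seminorm values standing in for $Q_k(\varphi - \psi)$; the names `f`, `partial_metric` and the sample values are hypothetical.

```python
def f(x):
    # increasing and subadditive on [0, infinity)
    return x / (1.0 + x)

def partial_metric(Q):
    """Sum over k < len(Q) of 2^{-k} f(Q[k]); Q[k] stands for Q_k(phi - psi).
    The neglected tail of the series is at most 2^{-(len(Q) - 1)}."""
    return sum(2.0 ** (-k) * f(q) for k, q in enumerate(Q))

# Hypothetical seminorm values, increasing in k as required of Q_k.
Q = [0.001 * (k + 1) ** 2 for k in range(12)]
N = len(Q) - 1
bound = 2.0 * Q[N] + 2.0 ** (-N)   # estimate from the proof: d <= 2 Q_N + 2^{-N}
print(partial_metric(Q), bound)
```

The printed partial sum stays below the bound, which is the mechanism that turns smallness of a single $Q_N$ into smallness of $d$.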
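5.8. THM $\mathcal{S}(\mathbb{R}^n)$ is complete (as a metric space).

(Thus, being a complete metrizable locally convex vector space, it is a Fréchet space. As shown, e.g., in [Hor66, Chapters 2-3] $\mathcal{S}$ is also bornological, barreled, and a Montel space.)
Proof. Suppose $(\varphi_j)$ is a Cauchy sequence in $\mathcal{S}(\mathbb{R}^n)$. Recall that $C_b(\mathbb{R}^n) := C(\mathbb{R}^n) \cap L^\infty(\mathbb{R}^n)$ equipped with the norm $\|\cdot\|_{L^\infty}$ is a Banach space. For every $\alpha, \beta \in \mathbb{N}_0^n$ we obtain that $(x^\alpha D^\beta \varphi_j)$ is a Cauchy sequence in $C_b$, thus converges to some $\varphi_{\alpha,\beta} \in C_b$.
Put $\varphi := \varphi_{0,0}$. As in the proof of Theorem 1.22 it follows that $\varphi \in C^\infty$ and that $D^\beta \varphi = \varphi_{0,\beta}$ for all $\beta \in \mathbb{N}_0^n$.
Moreover, since $\varphi_{\alpha,\beta} = C_b\text{-}\lim x^\alpha D^\beta \varphi_j = \text{pointwise-}\lim x^\alpha D^\beta \varphi_j = x^\alpha \varphi_{0,\beta}$, we deduce that also $x^\alpha D^\beta \varphi = x^\alpha \varphi_{0,\beta} = \varphi_{\alpha,\beta}$.
If $N \in \mathbb{N}_0$ is arbitrary, but fixed, then we have for any $\alpha, \beta \in \mathbb{N}_0^n$ that $\|x^\alpha D^\beta \varphi\|_{L^\infty} \le \|\varphi_{\alpha,\beta} - x^\alpha D^\beta \varphi_N\|_{L^\infty} + \|x^\alpha D^\beta \varphi_N\|_{L^\infty} < \infty$, hence $\varphi \in \mathcal{S}(\mathbb{R}^n)$. Finally, the Cauchy sequence property of $(x^\alpha D^\beta \varphi_j)$ provides for any $\varepsilon > 0$ an index $m_0$ such that
$$\|x^\alpha D^\beta \varphi - x^\alpha D^\beta \varphi_l\|_{L^\infty} = \lim_{j \to \infty} \|x^\alpha D^\beta \varphi_j - x^\alpha D^\beta \varphi_l\|_{L^\infty} \le \varepsilon \qquad (l \ge m_0).$$
Therefore $\varphi_l \to \varphi$ in $\mathcal{S}(\mathbb{R}^n)$ as $l \to \infty$.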
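5.9. DEF (Moderate functions) The space of slowly increasing smooth functions is defined by
$$\mathcal{O}_M(\mathbb{R}^n) := \{ f \in C^\infty(\mathbb{R}^n) \mid \forall \alpha \in \mathbb{N}_0^n\ \exists N \in \mathbb{N}_0\ \exists C > 0\ \forall x \in \mathbb{R}^n : |\partial^\alpha f(x)| \le C (1 + |x|)^N \}.$$
Clearly, polynomials belong to $\mathcal{O}_M(\mathbb{R}^n)$.
5.10. THM
(i) Let $P(x, D)$ be a partial differential operator with coefficients in $\mathcal{O}_M(\mathbb{R}^n)$, i.e.,
$$P(x, D) = \sum_{|\gamma| \le m} a_\gamma(x)\,D^\gamma \qquad (a_\gamma \in \mathcal{O}_M(\mathbb{R}^n)).$$
Then $P(x, D) \colon \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$ is linear and continuous.
(ii) $\mathcal{D}(\mathbb{R}^n) \subseteq \mathcal{S}(\mathbb{R}^n)$ with continuous embedding.
(iii) $\mathcal{D}(\mathbb{R}^n)$ is dense in $\mathcal{S}(\mathbb{R}^n)$.
(iv) $\mathcal{S}(\mathbb{R}^n) \subseteq L^1(\mathbb{R}^n)$ with continuous embedding.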
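Proof: (i): Linearity is clear. To show continuity we prove that $\varphi_j \to 0$ in $\mathcal{S}$ implies $P(x, D)\varphi_j \to 0$ in $\mathcal{S}$ (as $j \to \infty$). For any $\alpha, \beta \in \mathbb{N}_0^n$ we have $q_{\alpha,\beta}(P(x, D)\varphi_j) \le \sum_{|\gamma| \le m} q_{\alpha,\beta}(a_\gamma D^\gamma \varphi_j)$. Upon application of the Leibniz rule we simply have to recall the definition of $\mathcal{O}_M$ and the seminorm $Q_k$ and estimate a (finite) linear combination of terms of the form ($\beta' \le \beta$)
$$|x^\alpha|\,|D^{\beta'} a_\gamma(x)|\,|D^{\beta - \beta' + \gamma} \varphi_j(x)| \le |x^\alpha|\,C (1 + |x|^2)^N\,|D^{\beta - \beta' + \gamma} \varphi_j(x)| \le C\,Q_{|\alpha| + |\beta| + |\gamma| + 2N}(\varphi_j),$$
where the upper bound tends to $0$ as $j \to \infty$.

(ii): Clearly $\varphi_j \to 0$ in $\mathcal{D}$ implies $\varphi_j \to 0$ in $\mathcal{S}$.
(iii): Choose a cut-off function $\rho \in \mathcal{D}(\mathbb{R}^n)$ with $\rho(x) = 1$ when $|x| \le 1$. Let $\varphi \in \mathcal{S}(\mathbb{R}^n)$. Define $\varphi_j(x) := \varphi(x)\,\rho(x/j)$ $(j \in \mathbb{N})$, then $\varphi_j \in \mathcal{D}(\mathbb{R}^n)$ and $\varphi(x) - \varphi_j(x) = 0$ when $|x| \le j$. We show that $\varphi_j \to \varphi$ in $\mathcal{S}$.
For arbitrary $\alpha, \beta \in \mathbb{N}_0^n$ and $j \in \mathbb{N}$ we obtain $q_{\alpha,\beta}(\varphi - \varphi_j) = \sup_{|x| \ge j} |x^\alpha D^\beta(\varphi(x)(1 - \rho(x/j)))|$ and by the Leibniz rule we are left to estimate a (finite) linear combination of terms ($\beta' \le \beta$)
$$t_{\beta'}(x) := \Big| x^\alpha D^{\beta - \beta'} \varphi(x)\,D^{\beta'}\Big(1 - \rho\big(\tfrac{x}{j}\big)\Big) \Big| \qquad \text{when } |x| \ge j.$$
If $|\beta'| > 0$, then
$$t_{\beta'}(x) = \Big| x^\alpha D^{\beta - \beta'} \varphi(x)\,\frac{1}{j^{|\beta'|}}\,(D^{\beta'} \rho)\big(\tfrac{x}{j}\big) \Big| \le \|D^{\beta'} \rho\|_{L^\infty}\,q_{\alpha, \beta - \beta'}(\varphi) / j^{|\beta'|} \to 0 \quad (j \to \infty).$$
If $\beta' = 0$, then recalling from (5.6) that $|D^\beta \varphi(x)| \le C_l / (1 + |x|)^l$ for every $l$ with appropriate $C_l$ we may conclude, choosing $l > |\alpha|$, that
$$t_0(x) = \Big| x^\alpha D^\beta \varphi(x) \Big(1 - \rho\big(\tfrac{x}{j}\big)\Big) \Big| \le \|1 + |\rho|\|_{L^\infty} \sup_{|x| \ge j} |x|^{|\alpha|}\,|D^\beta \varphi(x)|$$
$$\le \|1 + |\rho|\|_{L^\infty} \sup_{|x| \ge j} (1 + |x|)^{|\alpha|}\,C_l\,(1 + |x|)^{-l} \le C_l\,\|1 + |\rho|\|_{L^\infty} \sup_{|x| \ge j} (1 + |x|)^{|\alpha| - l} = \frac{C_l\,\|1 + |\rho|\|_{L^\infty}}{(1 + j)^{l - |\alpha|}} \to 0 \quad (j \to \infty).$$
(iv): Choosing $\beta = 0$ and $l = n + 1$ in (5.6) and noting that the constant there can be chosen to be $Q_{n+1}(\varphi)$ we obtain
$$\int_{\mathbb{R}^n} |\varphi(x)|\,dx \le Q_{n+1}(\varphi) \underbrace{\int_{\mathbb{R}^n} \frac{dx}{(1 + |x|)^{n+1}}}_{= C_n < \infty} = C_n\,Q_{n+1}(\varphi),$$
since $1/(1 + |x|)^{n+1}$ is L-integrable on $\mathbb{R}^n$ (e.g., use polar coordinates). Thus $\varphi \in L^1(\mathbb{R}^n)$ and we may deduce that, for any sequence $(\varphi_j)$ in $\mathcal{S}$, $\varphi_j \to 0$ in $\mathcal{S}$ implies $\|\varphi_j\|_{L^1} \to 0$.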

Thanks to 5.10(iv) the Fourier transform is defined on $\mathcal{S} \subseteq L^1$ and Theorem 5.3(i) gives $\mathcal{F}(\mathcal{S}) \subseteq C_b$. We will show that, in fact, the Fourier transform is an isomorphism of $\mathcal{S}$. We split this task into several steps.

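5.11. LEMMA (Exchange formulae) For any $\varphi \in \mathcal{S}(\mathbb{R}^n)$ and $\alpha \in \mathbb{N}_0^n$ we have
$$(D^\alpha \varphi)\hat{\ }(\xi) = \xi^\alpha\,\hat\varphi(\xi) \tag{5.7}$$
and
$$(x^\alpha \varphi)\hat{\ }(\xi) = (-1)^{|\alpha|} D^\alpha \hat\varphi(\xi), \tag{5.8}$$
in particular, $\hat\varphi \in C^\infty(\mathbb{R}^n)$.
Proof: By Theorem 5.10 the functions $D^\alpha \varphi$ and $x^\alpha \varphi$ belong to $\mathcal{S}(\mathbb{R}^n) \subseteq L^1(\mathbb{R}^n)$, hence we may take their Fourier transforms and relieve the calculations in 5.4 of their informal status: First, we apply integration by parts and find
$$(D_j \varphi)\hat{\ }(\xi) = \int D_j \varphi(x)\,e^{-ix\xi}\,dx = -\int \varphi(x)\,(-\xi_j)\,e^{-ix\xi}\,dx = \xi_j\,\hat\varphi(\xi)$$
and (5.7) follows by induction. Second, standard theorems on differentiation of the parameter in the integral imply that $\hat\varphi$ is continuously differentiable and
$$D_j \hat\varphi(\xi) = D_{\xi_j} \int \varphi(x)\,e^{-ix\xi}\,dx = -\int \varphi(x)\,x_j\,e^{-ix\xi}\,dx = -(x_j \varphi)\hat{\ }(\xi).$$
Equation (5.8) and smoothness of $\hat\varphi$ then follow by induction.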
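5.12. LEMMA $\mathcal{F}(\mathcal{S}(\mathbb{R}^n)) \subseteq \mathcal{S}(\mathbb{R}^n)$ and $\varphi \mapsto \hat\varphi$ is continuous $\mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$.
Proof: Let $\varphi \in \mathcal{S}(\mathbb{R}^n)$. We know from the previous lemma that $\hat\varphi \in C^\infty(\mathbb{R}^n)$ and that the exchange formulae hold. To show that $\hat\varphi$ belongs to $\mathcal{S}(\mathbb{R}^n)$ we have to establish an upper bound for $q_{\alpha,\beta}(\hat\varphi) = \sup_{\xi \in \mathbb{R}^n} |\xi^\alpha D^\beta \hat\varphi(\xi)|$, where $\alpha, \beta \in \mathbb{N}_0^n$ are arbitrary.
Repeated application of the exchange formulae and the basic $L^\infty$-$L^1$-estimate 5.3(i) give
$$|\xi^\alpha D^\beta \hat\varphi(\xi)| = |\xi^\alpha\,\mathcal{F}(x^\beta \varphi)(\xi)| = |\mathcal{F}(D^\alpha(x^\beta \varphi))(\xi)| \le \int |D^\alpha(x^\beta \varphi(x))|\,dx.$$
We emphasize again that $x \mapsto D^\alpha(x^\beta \varphi(x))$ is in $\mathcal{S}(\mathbb{R}^n)$. Thus, as in the proof of Theorem 5.10(iv) we obtain from the alternative $\mathcal{S}$-condition (5.6) the following estimate: $\forall l \in \mathbb{N}_0$
$$|D^\alpha(x^\beta \varphi(x))| \le Q_l(D^\alpha(x^\beta \varphi))\,(1 + |x|)^{-l} \le Q_{l + |\alpha| + |\beta|}(\varphi)\,(1 + |x|)^{-l}.$$
Choosing $l = n + 1$ then yields in summary
$$q_{\alpha,\beta}(\hat\varphi) \le Q_{n+1+|\alpha|+|\beta|}(\varphi) \underbrace{\int_{\mathbb{R}^n} \frac{dx}{(1 + |x|)^{n+1}}}_{=: C_n} = C_n\,Q_{n+1+|\alpha|+|\beta|}(\varphi),$$
which proves that $\hat\varphi \in \mathcal{S}(\mathbb{R}^n)$ and also shows continuity of the Fourier transform as a linear operator on $\mathcal{S}(\mathbb{R}^n)$.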
5.13. LEMMA (Fourier transform of the Gaussian function) The Fourier transform of the Gaussian function $x \mapsto \exp(-|x|^2/2)$ in $\mathcal{S}(\mathbb{R}^n)$ is given by
$$\big(e^{-|x|^2/2}\big)\hat{\ }(\xi) = (2\pi)^{n/2}\,e^{-|\xi|^2/2}.$$
Proof: In dimension $n = 1$ we note that $g(x) = e^{-x^2/2}$ satisfies the following linear first-order ordinary differential equation
$$(*) \qquad g'(x) + x\,g(x) = 0$$
with initial value $g(0) = 1$. (Recall that any solution to $(*)$ is of the form $f(x) = c\,e^{-x^2/2}$, where $c = f(0)$.) Applying the Fourier transform to equation $(*)$ and using the exchange formulae yields
$$i\xi\,\hat g(\xi) + i\,\hat g'(\xi) = 0,$$
thus $\hat g$ also solves the differential equation $(*)$. Therefore we must have $\hat g(\xi) = c \exp(-\xi^2/2)$ and it remains to determine $c = \hat g(0) = \int \exp(-x^2/2)\,dx = \sqrt{2\pi}$ ([For06, §20, Beispiel (20.8)] or [For84, §9, Beispiel (9.4)]).
In dimension $n > 1$ we then calculate directly using Fubini's theorem
$$\big(e^{-|x|^2/2}\big)\hat{\ }(\xi) = \prod_{k=1}^{n} \int e^{-ix_k \xi_k}\,e^{-x_k^2/2}\,dx_k = \prod_{k=1}^{n} \hat g(\xi_k) = \prod_{k=1}^{n} (2\pi)^{1/2}\,e^{-\xi_k^2/2} = (2\pi)^{n/2}\,e^{-|\xi|^2/2}.$$
(Compare with an alternative proof based on complex analysis as in [SS03, Chapter 2, Section 3, Example 1].)
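The one-dimensional statement of Lemma 5.13 is easy to confirm numerically. The sketch below approximates $\int e^{-x^2/2} e^{-ix\xi}\,dx$ by the trapezoidal rule on a truncated interval; the truncation radius, the number of nodes and the function names are assumptions chosen for illustration.

```python
import cmath
import math

def ft_gauss(xi, radius=12.0, n=4000):
    """Trapezoidal approximation of the integral of exp(-x^2/2) exp(-i x xi) over [-radius, radius]."""
    h = 2.0 * radius / n
    total = 0j
    for k in range(n + 1):
        x = -radius + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.exp(-x * x / 2.0) * cmath.exp(-1j * x * xi)
    return total * h

def exact(xi):
    # value claimed by Lemma 5.13 for n = 1
    return math.sqrt(2.0 * math.pi) * math.exp(-xi * xi / 2.0)

for xi in (0.0, 0.5, 1.0, 2.5):
    print(xi, abs(ft_gauss(xi) - exact(xi)))
```

The imaginary part of the numerical result vanishes up to rounding, as it must for an even real function.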
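5.14. LEMMA (Fourier inversion formula) For any $\varphi \in \mathcal{S}(\mathbb{R}^n)$ the inversion formula
$$\varphi(x) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \hat\varphi(\xi)\,e^{ix\xi}\,d\xi \qquad (x \in \mathbb{R}^n) \tag{5.9}$$
holds.

Proof: We start with a simple observation for any real number $a \ne 0$ and $f \in \mathcal{S}(\mathbb{R}^n)$:
$$(x \mapsto f(ax))\hat{\ }(\xi) = \int f(ax)\,e^{-ix\xi}\,dx = \int f(y)\,e^{-iy\xi/a}\,\frac{dy}{|a|^n} = \frac{1}{|a|^n}\,\hat f\Big(\frac{\xi}{a}\Big). \tag{5.10}$$
Recall that Equation (5.3) gives for any $\varphi, \psi \in \mathcal{S}(\mathbb{R}^n)$
$$(*) \qquad \int \hat\varphi\,\psi = \int \varphi\,\hat\psi.$$
Now let $\varphi \in \mathcal{S}(\mathbb{R}^n)$ be arbitrary and put $\psi(\xi) = e^{-\varepsilon^2 |\xi|^2/2}$, where $\varepsilon > 0$. Then $(*)$ together with (5.10) and Lemma 5.13 imply
$$\underbrace{\int \hat\varphi(\xi)\,e^{-\varepsilon^2 |\xi|^2/2}\,d\xi}_{l_\varepsilon\,:=} = \frac{(2\pi)^{n/2}}{\varepsilon^n} \int \varphi(x)\,e^{-|x|^2/(2\varepsilon^2)}\,dx \underset{[z = x/\varepsilon]}{=} \underbrace{(2\pi)^{n/2} \int \varphi(\varepsilon z)\,e^{-|z|^2/2}\,dz}_{=:\,r_\varepsilon}.$$
As $\varepsilon \to 0$ we have by dominated convergence
$$l_\varepsilon \to \int \hat\varphi(\xi)\,d\xi$$
and that
$$r_\varepsilon \to (2\pi)^{n/2}\,\varphi(0) \underbrace{\int e^{-|z|^2/2}\,dz}_{= (2\pi)^{n/2}\ \text{[by 5.13]}} = (2\pi)^n\,\varphi(0),$$
hence
$$(**) \qquad \varphi(0) = (2\pi)^{-n} \int \hat\varphi(\xi)\,d\xi.$$
From this we will obtain the result by translation. We have again a simple observation: For any $h \in \mathbb{R}^n$ and $f \in \mathcal{S}(\mathbb{R}^n)$
$$(\tau_h f)\hat{\ }(\xi) = \int f(x + h)\,e^{-ix\xi}\,dx = \int f(y)\,e^{-i(y - h)\xi}\,dy = e^{ih\xi}\,\hat f(\xi). \tag{5.11}$$
(Translation of $f$ corresponds to modulation of $\hat f$.)

Thus we finally arrive at
$$\varphi(x) = (\tau_x \varphi)(0) \underset{[(**)]}{=} (2\pi)^{-n} \int \widehat{(\tau_x \varphi)}(\xi)\,d\xi = (2\pi)^{-n} \int e^{ix\xi}\,\hat\varphi(\xi)\,d\xi.$$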

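5.15. THM The Fourier transform $\mathcal{F} \colon \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$ is linear continuous with continuous inverse $\mathcal{F}^{-1}$ given by
$$\mathcal{F}^{-1} \psi(x) = (2\pi)^{-n} \int_{\mathbb{R}^n} \psi(\xi)\,e^{ix\xi}\,d\xi \qquad (\psi \in \mathcal{S}(\mathbb{R}^n)).$$
Hence we have for all $\varphi \in \mathcal{S}(\mathbb{R}^n)$
$$\hat{\hat\varphi} = (2\pi)^n\,\check\varphi \tag{5.12}$$
(recall that $\check\varphi(x) = \varphi(-x)$ and $\check{\check\varphi} = \varphi$).
Proof: By Lemma 5.14 the formula for $\mathcal{F}^{-1}$ gives a left-inverse of $\mathcal{F}$ on $\mathcal{S}(\mathbb{R}^n)$, i.e., $\mathcal{F}^{-1} \circ \mathcal{F} = \mathrm{id}_{\mathcal{S}}$. Hence $\mathcal{F}$ is injective.
To prove surjectivity of $\mathcal{F}$ let $\varphi \in \mathcal{S}(\mathbb{R}^n)$ be arbitrary. Then (5.9) yields
$$\varphi(x) = \frac{1}{(2\pi)^n} \int \hat\varphi(\xi)\,e^{ix\xi}\,d\xi = (2\pi)^{-n} \int \check{\hat\varphi}(\xi)\,e^{-ix\xi}\,d\xi = \mathcal{F}\big((2\pi)^{-n}\,\check{\hat\varphi}\big)(x).$$
Moreover, noting that (5.10) with $a = -1$ implies $\mathcal{F}(\check\psi) = (\mathcal{F}\psi)\check{\ }$ for any $\psi \in \mathcal{S}(\mathbb{R}^n)$, the above equation means
$$\varphi = \mathcal{F}\big((2\pi)^{-n}\,\check{\hat\varphi}\big) = (2\pi)^{-n}\,\big(\hat{\hat\varphi}\big)\check{\ }, \qquad \text{i.e.,} \quad \hat{\hat\varphi} = (2\pi)^n\,\check\varphi.$$
Continuity of $\mathcal{F}$ has been shown in Lemma 5.12 above and that of $\mathcal{F}^{-1}$ follows by the similarity of the integral formula.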
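The inversion formula (5.9) and the identity (5.12) can be tested numerically in dimension $n = 1$. The following sketch uses the (arbitrarily chosen, non-even) test function $\varphi(x) = (1 + x)e^{-x^2/2}$ and a trapezoidal rule on a fixed grid; the grid parameters and all names are assumptions for illustration.

```python
import cmath
import math

def phi(x):
    return (1.0 + x) * math.exp(-x * x / 2.0)

R, N = 10.0, 400
H = 2.0 * R / N
GRID = [-R + k * H for k in range(N + 1)]

def trapz(values):
    """Trapezoidal rule for samples on GRID."""
    return H * (sum(values) - 0.5 * (values[0] + values[-1]))

def fourier(samples, xi):
    """(F f)(xi), approximated from samples of f on GRID."""
    return trapz([s * cmath.exp(-1j * x * xi) for s, x in zip(samples, GRID)])

def inverse(samples, x):
    """(F^{-1} g)(x) = (2 pi)^{-1} * integral of g(xi) exp(i x xi), from samples of g on GRID."""
    return trapz([s * cmath.exp(1j * x * xi) for s, xi in zip(samples, GRID)]) / (2.0 * math.pi)

phi_samples = [phi(x) for x in GRID]
phi_hat = [fourier(phi_samples, xi) for xi in GRID]

recovered = {x: inverse(phi_hat, x) for x in (-1.0, 0.0, 0.5, 2.0)}   # should match phi(x)
twice = fourier(phi_hat, 0.5)                                         # should match 2*pi*phi(-0.5)
print(recovered, twice)
```

Applying the transform twice returns the reflected function scaled by $2\pi$, which is (5.12) for $n = 1$.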

5.3. TEMPERATE DISTRIBUTIONS
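5.16. DEF A temperate distribution (also: tempered distribution) on $\mathbb{R}^n$ is a continuous linear functional $u \colon \mathcal{S}(\mathbb{R}^n) \to \mathbb{C}$, i.e., $\varphi_k \to 0$ in $\mathcal{S}$ $\Longrightarrow$ $\langle u, \varphi_k \rangle \to 0$ in $\mathbb{C}$. The space of temperate distributions on $\mathbb{R}^n$ is denoted by $\mathcal{S}'(\mathbb{R}^n)$.
(As noted in 5.7(iv), for maps defined on the metrizable space $\mathcal{S}(\mathbb{R}^n)$ continuity is equivalent to sequential continuity.)
As in the cases of $\mathcal{D}'$ and $\mathcal{E}'$ there is an "analytic" characterization of continuity of linear functionals on $\mathcal{S}$ in terms of seminorm estimates.
5.17. THM Let $u \colon \mathcal{S}(\mathbb{R}^n) \to \mathbb{C}$ be linear. Then $u \in \mathcal{S}'(\mathbb{R}^n)$ if and only if the following holds: $\exists C > 0$ $\exists N \in \mathbb{N}_0$ such that $\forall \varphi \in \mathcal{S}(\mathbb{R}^n)$
$$(SN'') \qquad |\langle u, \varphi \rangle| \le C\,Q_N(\varphi) = C \sum_{|\alpha|, |\beta| \le N} q_{\alpha,\beta}(\varphi) = C \sum_{|\alpha|, |\beta| \le N} \|x^\alpha D^\beta \varphi\|_{L^\infty(\mathbb{R}^n)}.$$
Proof: Clearly, $(SN'')$ and $\varphi_k \to 0$ in $\mathcal{S}$ imply $\langle u, \varphi_k \rangle \to 0$. Conversely, suppose $u$ is continuous but $(SN'')$ does not hold (compare with the analogous proofs of $(SN)$ and $(SN')$ earlier): $\forall N \in \mathbb{N}$ $\exists \varphi_N \in \mathcal{S}(\mathbb{R}^n)$ such that
$$|\langle u, \varphi_N \rangle| > N\,Q_N(\varphi_N).$$
Then $\varphi_N \ne 0$ and $\psi_N := \varphi_N / (N\,Q_N(\varphi_N))$ $(N \in \mathbb{N})$ defines a sequence in $\mathcal{S}$ with $q_{\alpha,\beta}(\psi_N) \le 1/N$ when $N \ge \max(|\alpha|, |\beta|)$, but $|\langle u, \psi_N \rangle| > 1$, which is a contradiction.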
5.18. REM ($\mathcal{S}'$ and $\mathcal{D}'$) Since $\mathcal{D}(\mathbb{R}^n) \subseteq \mathcal{S}(\mathbb{R}^n)$ with continuous dense embedding (cf. Theorem 5.10(ii),(iii)) we have for any $u \in \mathcal{S}'(\mathbb{R}^n)$ that
$$u|_{\mathcal{D}(\mathbb{R}^n)} \in \mathcal{D}'(\mathbb{R}^n)$$
and that the map $u \mapsto u|_{\mathcal{D}(\mathbb{R}^n)}$ is injective $\mathcal{S}'(\mathbb{R}^n) \to \mathcal{D}'(\mathbb{R}^n)$. Thus we may consider $\mathcal{S}'(\mathbb{R}^n)$ as a subspace of $\mathcal{D}'(\mathbb{R}^n)$. The latter point of view can serve in alternatively defining $\mathcal{S}'$ to consist of those distributions in $\mathcal{D}'$ which can be extended to continuous linear forms on $\mathcal{S} \supseteq \mathcal{D}$ (e.g., see [FJ98, Definition 8.3.1]).
In particular, we may consider operations defined originally on $\mathcal{D}'$ (differentiation, multiplication etc.) and study under what conditions these leave $\mathcal{S}'$ invariant, thus defining corresponding operations on $\mathcal{S}'$.

5.19. PROP Let $u \in \mathcal{S}'(\mathbb{R}^n)$. Then we have
(i) $\forall \alpha \in \mathbb{N}_0^n$: $\partial^\alpha u \in \mathcal{S}'(\mathbb{R}^n)$
(ii) $\forall f \in \mathcal{O}_M(\mathbb{R}^n)$: $f u \in \mathcal{S}'(\mathbb{R}^n)$
(iii) Let $P(x, D)$ be a partial differential operator with coefficients in $\mathcal{O}_M(\mathbb{R}^n)$, then $P(x, D) \colon \mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$ is linear and weak*-sequentially continuous.
Proof: (i) and (ii) are immediate from Theorem 5.10(i).
(iii) follows by direct inspection from (i) and (ii); alternatively, one may use the general property that adjoints of (sequentially) continuous linear maps are weak*-sequentially continuous; cf. also Thm. 2.36, where this property has been shown explicitly for $\mathcal{D}'$ and a transfer to $\mathcal{S}'$ is easy.
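5.20. REM
(i) We have $\mathcal{E}'(\mathbb{R}^n) \subseteq \mathcal{S}'(\mathbb{R}^n) \subseteq \mathcal{D}'_F(\mathbb{R}^n) \subseteq \mathcal{D}'(\mathbb{R}^n)$.
That $\mathcal{E}' \subseteq \mathcal{S}'$ follows from the discussion in Remark 5.18, since $\mathcal{E}' \subseteq \mathcal{D}'$ and $\varphi_k \to 0$ in $\mathcal{S}$ implies $\varphi_k \to 0$ in $\mathcal{E}$.
The inclusion $\mathcal{S}' \subseteq \mathcal{D}'_F$ follows from the fact that $N$ occurring in the estimate $(SN'')$ is valid with global $L^\infty$-norms.
Each of the above inclusions is strict: For example, $1 \in \mathcal{S}'(\mathbb{R}^n) \setminus \mathcal{E}'(\mathbb{R}^n)$, since $|\langle 1, \varphi \rangle| \le \int |\varphi| \le \int_{\mathbb{R}^n} (1 + |x|)^{-(n+1)}\,dx\ Q_{n+1}(\varphi)$, but $\operatorname{supp}(1) = \mathbb{R}^n$ is not compact.
Second, the function $u(x) = e^{x^2}$ defines a regular distribution $u \in \mathcal{D}'^{\,0}(\mathbb{R})$, but $u$ is not defined on all of $\mathcal{S}(\mathbb{R})$, since $\varphi(x) = e^{-x^2}$ yields $\langle u, \varphi \rangle = \int 1\,dx = \infty$ (alternatively, any approximating sequence $\mathcal{D}(\mathbb{R}) \ni \varphi_j \to \varphi$ in $\mathcal{S}(\mathbb{R})$ yields $(\langle u, \varphi_j \rangle)$ unbounded).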

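(ii) Recall that for any $1 \le p < \infty$ the vector space $L^p(\mathbb{R}^n)$ is defined analogously to $L^1$, only changing the integrability condition to $\|f\|_{L^p} := \big(\int_{\mathbb{R}^n} |f(x)|^p\,dx\big)^{1/p} < \infty$. Furthermore, $L^\infty(\mathbb{R}^n)$ consists of (classes of) essentially bounded L-measurable functions $f$ on $\mathbb{R}^n$ with norm $\|f\|_{L^\infty} = \operatorname{ess\,sup}_{x \in \mathbb{R}^n} |f(x)|$ $(= \inf\{M \in [0, \infty[ \ \mid\ |f| \le M \text{ almost everywhere}\})$.
$(L^p(\mathbb{R}^n), \|\cdot\|_{L^p})$ is a Banach space for every $1 \le p \le \infty$.
We have $L^p(\mathbb{R}^n) \subseteq \mathcal{S}'(\mathbb{R}^n)$.
Proof: By Hölder's inequality ([Fol99, 6.2 and Theorem 6.8.a]), if $f \in L^p(\mathbb{R}^n)$ and $\varphi \in \mathcal{S}(\mathbb{R}^n)$, then
$$|\langle f, \varphi \rangle| \le \int |f \varphi| \le \|f\|_{L^p}\,\|\varphi\|_{L^q} \qquad \Big(\frac{1}{p} + \frac{1}{q} = 1\Big).$$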
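The standard $\mathcal{S}$-estimate $|\varphi(x)| \le Q_l(\varphi)/(1 + |x|)^l$, valid for every $l \in \mathbb{N}_0$, gives
$$\|\varphi\|_{L^q}^q = \int |\varphi(x)|^q\,dx \le Q_l(\varphi)^q \int \frac{dx}{(1 + |x|)^{lq}}$$
and thus shows continuity of $\varphi \mapsto \langle f, \varphi \rangle$ upon choosing $l$ sufficiently large to ensure $lq > n$.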
(iii) Let $f \in C(\mathbb{R}^n)$ be of polynomial growth, i.e., $\exists C, M > 0$:
$$|f(x)| \le C (1 + |x|)^M \qquad \forall x \in \mathbb{R}^n.$$
Then $f \in \mathcal{S}'(\mathbb{R}^n)$, since we have for any $\varphi \in \mathcal{S}(\mathbb{R}^n)$
$$|\langle f, \varphi \rangle| \le \int |f(x)|\,|\varphi(x)|\,dx \le \int C (1 + |x|)^M\,\frac{Q_l(\varphi)}{(1 + |x|)^l}\,dx = C\,Q_l(\varphi) \underbrace{\int \frac{dx}{(1 + |x|)^{l - M}}}_{=: C_l},$$
where $C_l$ is finite, if $l > M + n$.

Note that we automatically obtain that also $\partial^\alpha f \in \mathcal{S}'(\mathbb{R}^n)$ due to Proposition 5.19(i).
As a matter of fact, there is an $\mathcal{S}'$-variant of the structure theorem which states that every temperate distribution is a finite order derivative of a (single) continuous function of polynomial growth on $\mathbb{R}^n$ (cf. [FJ98, Theorem 8.3.1]). This is the reason why distributions in $\mathcal{S}'$ are called temperate (or tempered).
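5.21. DEF (Convergence in $\mathcal{S}'$) Let $(u_j)$ be a sequence in $\mathcal{S}'(\mathbb{R}^n)$ and $u \in \mathcal{S}'(\mathbb{R}^n)$. We say that $(u_j)$ converges to $u$ in $\mathcal{S}'(\mathbb{R}^n)$, denoted $u_j \to u$ $(j \to \infty)$, if $\forall \varphi \in \mathcal{S}(\mathbb{R}^n)$: $\langle u_j, \varphi \rangle \to \langle u, \varphi \rangle$ $(j \to \infty)$. (Similarly for nets $(u_\varepsilon)_{\varepsilon \in\, ]0,1]}$ etc.)
5.22. REM
(i) Similarly as in the proof of Theorem 1.22 one can show that $\mathcal{S}'(\mathbb{R}^n)$ is sequentially complete (cf. [Die79, 22.17.8] and use the fact that a Cauchy sequence $(u_j)$ in $\mathcal{S}'$ yields a Cauchy sequence $(\langle u_j, \varphi \rangle)$, hence convergent sequence, in $\mathbb{C}$ for every $\varphi \in \mathcal{S}$).
(ii) $\mathcal{D}(\mathbb{R}^n)$ is dense in $\mathcal{S}'(\mathbb{R}^n)$ (cf. [Die79, 22.17.3(iii)]).
(iii) Moreover, since obviously $\mathcal{D}(\mathbb{R}^n) \subseteq L^p(\mathbb{R}^n)$ $(1 \le p \le \infty)$ and by 5.20(ii) $L^p(\mathbb{R}^n) \subseteq \mathcal{S}'(\mathbb{R}^n)$, we obtain that $L^p(\mathbb{R}^n)$ is dense in $\mathcal{S}'(\mathbb{R}^n)$ for all $1 \le p \le \infty$.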

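5.4. FOURIER TRANSFORM ON $\mathcal{S}'$
5.23. Intro: We have collected sufficient information to construct an appropriate extension of the Fourier transform to distribution theory. If $u \in L^1(\mathbb{R}^n)$ we may consider it as an element of $\mathcal{S}'(\mathbb{R}^n)$ (by 5.20(ii)) and 5.3(i) gives $\hat u \in C_b(\mathbb{R}^n) \subseteq L^\infty(\mathbb{R}^n)$, thus $\hat u \in \mathcal{S}'(\mathbb{R}^n)$ by 5.20(ii) again.
Thus, since for any $\varphi \in \mathcal{S}(\mathbb{R}^n) \subseteq L^1(\mathbb{R}^n)$ also $\hat\varphi \in \mathcal{S}(\mathbb{R}^n) \subseteq L^1(\mathbb{R}^n)$, we obtain
$$\langle \hat u, \varphi \rangle = \int \hat u(\xi)\,\varphi(\xi)\,d\xi \underset{[5.3(ii)]}{=} \int u(x)\,\hat\varphi(x)\,dx = \langle u, \hat\varphi \rangle. \tag{5.13}$$
Observe that the right-most term can be extended to the general case $u \in \mathcal{S}'(\mathbb{R}^n)$: Since $\varphi \mapsto \hat\varphi$ is a (continuous) isomorphism on $\mathcal{S}(\mathbb{R}^n)$, the map $\varphi \mapsto \langle u, \hat\varphi \rangle$ defines an element in $\mathcal{S}'(\mathbb{R}^n)$.
5.24. DEF If $u \in \mathcal{S}'(\mathbb{R}^n)$, then the Fourier transform $\hat u$, or $\mathcal{F}u$, is defined by
$$\langle \hat u, \varphi \rangle := \langle u, \hat\varphi \rangle \qquad \forall \varphi \in \mathcal{S}(\mathbb{R}^n). \tag{5.14}$$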
5.25. THM The Fourier transform $\mathcal{F} \colon \mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$ is linear and bijective, $\mathcal{F}$ as well as $\mathcal{F}^{-1}$ are sequentially continuous. We have again the formulae
$$\hat{\hat u} = (2\pi)^n\,\check u \tag{5.15}$$
as well as $\mathcal{F}^{-1} u = (2\pi)^{-n}\,(\hat u)\check{\ }$.
Moreover, if $u \in L^1(\mathbb{R}^n)$, then $\hat u$ according to (5.14) coincides with its classical Fourier transform (as $L^1$-function).
Proof: Compatibility of the distributional with the classical Fourier transform on functions $u \in L^1(\mathbb{R}^n)$ follows from Equation (5.13).
Speaking in abstract terms, it follows that $\mathcal{F}$ is an isomorphism on $\mathcal{S}'(\mathbb{R}^n)$ since it is the adjoint of an isomorphism on $\mathcal{S}(\mathbb{R}^n)$. The latter statement includes also the weak*-continuity of $\mathcal{F}$ and $\mathcal{F}^{-1}$. However, we give an independent proof for our case here.
Linearity of $\mathcal{F}$ is clear, as is the sequential continuity of $\mathcal{F}$ from (5.14).

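To prove injectivity assume that $\mathcal{F}u = 0$. Then we have for every $\varphi \in \mathcal{S}(\mathbb{R}^n)$ that $0 = \langle \mathcal{F}u, \mathcal{F}^{-1}\varphi \rangle = \langle u, \varphi \rangle$, hence $u = 0$.
To show surjectivity we will first derive (5.15). Let $u \in \mathcal{S}'(\mathbb{R}^n)$ and $\varphi \in \mathcal{S}(\mathbb{R}^n)$, then
$$\langle \hat{\hat u}, \varphi \rangle = \langle \hat u, \hat\varphi \rangle = \langle u, \hat{\hat\varphi} \rangle \underset{[(5.12)]}{=} (2\pi)^n \langle u, \check\varphi \rangle = (2\pi)^n \langle \check u, \varphi \rangle, \tag{5.16}$$
which implies $u = \mathcal{F}\big((2\pi)^{-n}\,(\hat u)\check{\ }\big)$ (since $\check{\ }$ and $\mathcal{F}$ commute on $\mathcal{S}$, hence also on $\mathcal{S}'$) and, in particular, yields surjectivity of $\mathcal{F}$ and the stated formula for the inverse.
We state a list of properties of the Fourier transform on $\mathcal{S}'$, which follow directly from the corresponding formulae on $\mathcal{S}$ and the definition as adjoint of the Fourier transform on $\mathcal{S}$.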
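5.26. PROP For any $u \in \mathcal{S}'(\mathbb{R}^n)$ we have
(i) $\forall \alpha \in \mathbb{N}_0^n$: $(D^\alpha u)\hat{\ } = \xi^\alpha\,\hat u$,
(ii) $\forall \alpha \in \mathbb{N}_0^n$: $(x^\alpha u)\hat{\ } = (-1)^{|\alpha|} D^\alpha \hat u$,
(iii) $\forall h \in \mathbb{R}^n$: $(\tau_h u)\hat{\ } = e^{ih\xi}\,\hat u$,
(iv) $\forall h \in \mathbb{R}^n$: $(e^{ixh} u)\hat{\ } = \tau_{-h}\,\hat u$,
(v) $(\check u)\hat{\ } = (\hat u)\check{\ }$.
Proof: Applying the definition of the action of $u$ on any $\varphi \in \mathcal{S}(\mathbb{R}^n)$ we obtain (i) and (ii) from Lemma 5.11, (iii) and (iv) from (5.11) and a similar direct calculation showing $(e^{ixh} \varphi)\hat{\ } = \tau_{-h}\,\hat\varphi$ (alternatively, use (5.12)), and (v) from (5.10) with $a = -1$.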
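5.27. Examples
(i) We directly calculate for arbitrary $\varphi \in \mathcal{S}(\mathbb{R}^n)$
$$\langle \hat\delta, \varphi \rangle = \langle \delta, \hat\varphi \rangle = \hat\varphi(0) = \int \underbrace{e^{-i0x}}_{= 1}\,\varphi(x)\,dx = \langle 1, \varphi \rangle,$$
therefore we obtain
$$\hat\delta = 1. \tag{5.17}$$
Moreover, since $\check\delta = \delta$ and $\hat{\hat\delta} = (2\pi)^n\,\check\delta$ we may further deduce that
$$\langle \delta, \varphi \rangle = \langle \check\delta, \varphi \rangle = (2\pi)^{-n} \langle \hat{\hat\delta}, \varphi \rangle = (2\pi)^{-n} \langle \hat 1, \varphi \rangle = \langle (2\pi)^{-n}\,\hat 1, \varphi \rangle,$$
hence
$$\hat 1 = (2\pi)^n\,\delta. \tag{5.18}$$
(ii) Let $e_h(x) := e^{ixh}$ $(h \in \mathbb{R}^n)$, then $e_h \in \mathcal{O}_M(\mathbb{R}^n) \subseteq \mathcal{S}'(\mathbb{R}^n)$ and
$$\mathcal{F} e_h = \mathcal{F}(e^{ihx} \cdot 1) = \tau_{-h}\,\hat 1 = (2\pi)^n\,\tau_{-h}\,\delta = (2\pi)^n\,\delta_h.$$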
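(iii) We determine the Fourier transform of the Heaviside function $H \in L^\infty(\mathbb{R}) \subseteq \mathcal{S}'(\mathbb{R})$:
Variant (A): Since $H' = \delta$ we have $i\xi\,\hat H = \hat\delta = 1$ and hence $\hat H(\xi) = -i/\xi$ when $\xi \ne 0$. Note that on $\mathbb{R} \setminus \{0\}$ the function $\xi \mapsto 1/\xi$ coincides, as a distribution, with the principal value $\mathrm{vp}(1/\xi)$. Furthermore, we have the globally (i.e. in $\mathcal{D}'(\mathbb{R})$) valid equation
$$\xi \big( \hat H(\xi) + i\,\mathrm{vp}(1/\xi) \big) = 0.$$
[Recall from (2.9) that $\xi\,\mathrm{vp}(1/\xi) = 1$.]
It is an exercise⁴ to show that $\xi\,u(\xi) = 0$ implies $u = c\,\delta$ with a complex constant $c$. Hence it remains to determine the constant $c$ in the equation $\hat H + i\,\mathrm{vp}(1/\xi) = c\,\delta$.
Note that $\check\delta = \delta$, $\check H = 1 - H$, and $\mathrm{vp}(1/\xi)\check{\ } = -\mathrm{vp}(1/\xi)$ and recall that the Fourier transform commutes with reflection. Thus we calculate
$$c\,\delta = c\,\check\delta = (\check H)\hat{\ } - i\,\mathrm{vp}(1/\xi) = (1 - H)\hat{\ } - i\,\mathrm{vp}(1/\xi) = \hat 1 - \underbrace{\big( \hat H + i\,\mathrm{vp}(1/\xi) \big)}_{= c\,\delta} = 2\pi\,\delta - c\,\delta = (2\pi - c)\,\delta,$$
hence $c = \pi$ and we arrive at
$$\hat H = \pi\,\delta - i\,\mathrm{vp}\Big(\frac{1}{\xi}\Big). \tag{5.19}$$
Variant (B): Let $f_\varepsilon(x) := H(x)\,e^{-\varepsilon x}$ $(\varepsilon > 0)$, then $f_\varepsilon \in L^1(\mathbb{R})$. We directly calculate
$$\hat f_\varepsilon(\xi) = \int_0^\infty e^{-ix\xi}\,e^{-\varepsilon x}\,dx = \int_0^\infty e^{-x(\varepsilon + i\xi)}\,dx = \frac{1}{\varepsilon + i\xi} = \frac{1}{i}\,\frac{1}{\xi - i\varepsilon}.$$
Now $\mathcal{S}'\text{-}\lim_{\varepsilon \to 0} f_\varepsilon = H$ (by dominated convergence) and $\mathcal{F}$ is continuous on $\mathcal{S}'$. Therefore we obtain
$$\hat H = \lim_{\varepsilon \to 0} \frac{1}{i}\,\frac{1}{\xi - i\varepsilon} = \frac{1}{i}\,\frac{1}{\xi - i0} \underset{[2.12]}{=} \frac{1}{i} \big( \mathrm{vp}(1/\xi) + i\pi\,\delta \big) = \pi\,\delta - i\,\mathrm{vp}(1/\xi).$$
⁴ This is similar to the arguments in Ex. 2.30; first show that $\operatorname{supp}(u) \subseteq \{0\}$, thus $u = \sum_{j=0}^{N} c_j\,\delta^{(j)}$; then show that $c_j = 0$ when $j = 1, \ldots, N$.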

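(iv) Since $\hat{\hat H} = 2\pi\,\check H = 2\pi\,(1 - H(x))$ (this is (5.15) with $n = 1$) we may use the result of (iii) to deduce a formula for $\mathcal{F}(\mathrm{vp}(1/x))$ as follows
$$-i\,\widehat{\mathrm{vp}} = \hat{\hat H} - \pi\,\hat\delta = 2\pi\,(1 - H) - \pi = \pi\,(1 - 2H) = -\pi \underbrace{(2H - 1)}_{= \operatorname{sgn}(\xi)}.$$
Therefore we obtain $\widehat{\mathrm{vp}} = -i\pi\,\operatorname{sgn}$.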
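The explicit transform $\hat f_\varepsilon(\xi) = 1/(\varepsilon + i\xi)$ used in Variant (B) of (iii) can be checked by quadrature. The sketch below truncates the half-line integral at a finite length; that cutoff, the midpoint rule and the function name are assumptions for illustration.

```python
import cmath

def f_eps_hat(xi, eps, length=60.0, n=200000):
    """Midpoint rule for the integral of exp(-eps x) exp(-i x xi) over [0, length].
    The tail beyond `length` is dropped; it is of size exp(-eps * length) / eps."""
    h = length / n
    total = 0j
    for k in range(n):
        x = (k + 0.5) * h
        total += cmath.exp(-(eps + 1j * xi) * x)
    return total * h

eps, xi = 1.0, 2.0
numeric = f_eps_hat(xi, eps)
closed_form = 1.0 / (eps + 1j * xi)
print(numeric, closed_form)
```

For small $\varepsilon$ the cutoff `length` would have to grow like $1/\varepsilon$, which reflects that the limit $\varepsilon \to 0$ only exists in the distributional sense.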
5.28. REM (FT and PDO with constant coefficients) Let $P(D)$ be a linear partial differential operator with constant coefficients, i.e.
$$P(D) = \sum_{|\alpha| \le m} c_\alpha D^\alpha \qquad (c_\alpha \in \mathbb{C}).$$
If $u, f \in \mathcal{S}'(\mathbb{R}^n)$, then the exchange formulae from Proposition 5.26(i),(ii) applied to each term in $P(D)$ give
$$P(D) u = f \iff P(\xi)\,\hat u = \hat f.$$
Thus, the action of $P(D)$ is translated into multiplication with the polynomial $P(\xi)$. In certain cases, this trick allows (a more or less) explicit representation of solutions. Moreover, the above equivalence provides important additional information in theoretical investigations regarding regularity and solvability questions.
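A concrete instance of this equivalence, in dimension $n = 1$: for the (hypothetical example) operator $P(D) = 1 + D^2 = 1 - \frac{d^2}{dx^2}$ and $\varphi(x) = e^{-x^2/2}$ one computes by hand $P(D)\varphi = (2 - x^2)e^{-x^2/2}$, and the Fourier transform of this function should equal $(1 + \xi^2)\hat\varphi(\xi)$. The sketch below compares both sides numerically; operator, test function and quadrature parameters are illustrative assumptions.

```python
import cmath
import math

def fourier(func, xi, radius=12.0, n=4000):
    """Trapezoidal approximation of the integral of func(x) exp(-i x xi) over [-radius, radius]."""
    h = 2.0 * radius / n
    total = 0j
    for k in range(n + 1):
        x = -radius + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * func(x) * cmath.exp(-1j * x * xi)
    return total * h

def phi(x):
    return math.exp(-x * x / 2.0)

def p_of_d_phi(x):
    # (1 + D^2) phi = phi - phi'' with phi''(x) = (x^2 - 1) exp(-x^2/2), computed by hand
    return (2.0 - x * x) * math.exp(-x * x / 2.0)

def symbol(xi):
    return 1.0 + xi * xi

checks = {xi: (fourier(p_of_d_phi, xi), symbol(xi) * fourier(phi, xi)) for xi in (0.0, 0.8, 2.0)}
for xi, (lhs, rhs) in checks.items():
    print(xi, abs(lhs - rhs))
```

Because the symbol $1 + \xi^2$ has no real zeros, division by it is harmless here, which is the situation in which the trick yields explicit solutions.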

5.5. FOURIER TRANSFORM ON $\mathcal{E}'$ AND THE CONVOLUTION THEOREM
5.29. Intro Recall from Theorem 5.3(iii), Equation (5.4), that we have for any $f, g \in L^1(\mathbb{R}^n)$
$$\widehat{(f * g)} = \hat f \cdot \hat g,$$
where the product on the right-hand side means the usual (pointwise) multiplication of continuous functions. In the current section we will prove the analogous result for the convolution product, if $f \in \mathcal{S}'$ with $g \in \mathcal{E}'$. Then $\hat f \in \mathcal{S}'$ and we have to clarify the meaning of $\hat f \cdot \hat g$ in a preparatory result on the Fourier transforms of distributions in $\mathcal{E}'$.
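5.30. THM If $v \in \mathcal{E}'(\mathbb{R}^n)$, then $\hat v \in \mathcal{O}_M(\mathbb{R}^n) \subseteq C^\infty(\mathbb{R}^n)$ and we have
$$\hat v(\xi) = \langle v(x), e^{-ix\xi} \rangle \qquad \forall \xi \in \mathbb{R}^n. \tag{5.20}$$
Moreover, $\hat v$ can be extended to an entire (holomorphic) function on $\mathbb{C}^n$.
Proof: Smoothness of the function $h \colon \xi \mapsto \langle v(x), e^{-ix\xi} \rangle$ follows from Corollary 3.4(ii), in particular also, that every derivative $D^\beta h$ is given by $D^\beta h(\xi) = \langle v(x), D_\xi^\beta(e^{-ix\xi}) \rangle = \langle v(x), (-x)^\beta e^{-ix\xi} \rangle$.
The $\mathcal{O}_M$-estimates for $h$ follow directly from the seminorm estimate $(SN')$ for $v$, which provides a compact neighborhood $K$ of $\operatorname{supp}(v)$, a constant $C > 0$, and a derivative order $N \in \mathbb{N}_0$ such that
$$|D^\beta h(\xi)| = |\langle v(x), (-x)^\beta e^{-ix\xi} \rangle| \le C \sum_{|\alpha| \le N} \underbrace{\sup_{x \in K} |\partial_x^\alpha (x^\beta e^{-ix\xi})|}_{\le\, C_{\alpha,K} (1 + |\xi|)^{|\alpha|}} \le C' (1 + |\xi|)^N,$$
where $C_{\alpha,K}$ and $C'$ denote appropriate constants.

Let $\varphi \in \mathcal{D}(\mathbb{R}^n)$ be arbitrary, then $v \otimes \varphi \in \mathcal{E}'(\mathbb{R}^n \times \mathbb{R}^n)$ and
$$\langle v \otimes \varphi(x, \xi), e^{-ix\xi} \rangle = \langle v(x), \underbrace{\langle \varphi(\xi), e^{-ix\xi} \rangle}_{= \hat\varphi(x)} \rangle = \langle v, \hat\varphi \rangle = \langle \hat v, \varphi \rangle.$$
On the other hand,
$$\langle v \otimes \varphi(x, \xi), e^{-ix\xi} \rangle = \langle \varphi(\xi), \underbrace{\langle v(x), e^{-ix\xi} \rangle}_{= h(\xi)} \rangle = \int_{\mathbb{R}^n} h(\xi)\,\varphi(\xi)\,d\xi,$$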

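hence $\hat v = h$ and therefore (5.20) holds.

Finally, again by Corollary 3.4(ii) we obtain smoothness of the function
$$\mathbb{R}^n + i\,\mathbb{R}^n = \mathbb{C}^n \ni \zeta = \xi + i\eta \mapsto \langle v(x), e^{-ix\zeta} \rangle = \langle v(x), e^{-ix\xi + x\eta} \rangle \in \mathbb{C}$$
and, since $2\,\partial_{\bar\zeta_j}(e^{-ix\xi + x\eta}) := (\partial_{\xi_j} + i\,\partial_{\eta_j})(e^{-ix\xi + x\eta}) = 0$, we also obtain
$$\partial_{\bar\zeta_j} \hat v(\zeta) = \langle v(x), \partial_{\bar\zeta_j}(e^{-ix\xi + x\eta}) \rangle = 0.$$
Thus the Cauchy-Riemann equations are satisfied in each complex variable, which means holomorphicity of $\hat v$ as a function on $\mathbb{C}^n$ (cf. [For84, §21, S. 261]).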
5.31. Applications: (i) Let $f \in \mathcal{E}'(\mathbb{R}^n)$ and $P(D)$ be a linear partial differential operator with constant coefficients. We claim that
$$P(D) v = f$$
has a (unique) solution $v \in \mathcal{E}'(\mathbb{R}^n)$ if and only if $\zeta \mapsto \hat f(\zeta)/P(\zeta)$ is an entire function.
Indeed, the above equation is equivalent to $P(\zeta)\,\hat v(\zeta) = \hat f(\zeta)$ on $\mathbb{C}^n$, where $\hat v$ and $\hat f$ denote the corresponding holomorphic extensions. Thus, solvability of the above PDE with $v \in \mathcal{E}'(\mathbb{R}^n)$ implies that $\hat f(\zeta)/P(\zeta)$ is entire (and $\hat v$, hence also $v$, is then uniquely determined). On the other hand, if $\hat f(\zeta)/P(\zeta)$ is entire, then it can be shown (with some effort) that $\mathcal{F}^{-1}(\hat f / P)$ gives a solution in $\mathcal{E}'(\mathbb{R}^n)$ (cf. [Hor90, Theorem 7.3.2]).
(ii) Let $\zeta \mapsto P(\zeta)$ be a nonzero polynomial function $\mathbb{C}^n \to \mathbb{C}$ and let $P(D)$ denote the corresponding linear partial differential operator with constant coefficients. We have $P(D) \ne 0$ (zero operator) and investigate the question of injectivity of $P(D)$ as operator on various spaces of functions and distributions on $\mathbb{R}^n$:
1) $P_1 \colon \mathcal{S} \to \mathcal{S}$, $\varphi \mapsto P(D)\varphi$, is injective: Suppose we had $0 \ne \varphi \in \mathcal{S}$ with $P(D)\varphi = 0$. Then $P(\xi)\,\hat\varphi(\xi) = 0$, where $\hat\varphi$ is continuous and nonzero. Hence $P(\xi) = 0$ on some open ball in $\mathbb{R}^n$, which is a contradiction, since $P$ is not the zero polynomial.
2) $P_2 \colon \mathcal{E}' \to \mathcal{E}'$, $v \mapsto P(D)v$, is injective: Suppose $v \in \mathcal{E}'$ with $P(D)v = 0$. Then $P(\zeta)\,\hat v(\zeta) = 0$ holds for the extension of $\hat v$ as an entire (holomorphic) function on $\mathbb{C}^n$. This implies that $\hat v = 0$ on the nonempty open set $\mathbb{C}^n \setminus P^{-1}(0)$. Hence $\hat v = 0$, which in turn forces $v = 0$.
3) $P_3 \colon \mathcal{S}' \to \mathcal{S}'$, $u \mapsto P(D)u$, is injective if and only if $P^{-1}(0) \cap \mathbb{R}^n = \emptyset$.
For any $u \in \mathcal{S}'$, the equation $P(D)u = 0$ implies $P(\xi)\,\hat u(\xi) = 0$ (on $\mathbb{R}^n$) and hence
$$\operatorname{supp}(\hat u) \subseteq P^{-1}(0) \cap \mathbb{R}^n.$$
If $P^{-1}(0) \cap \mathbb{R}^n = \emptyset$, then $\operatorname{supp}(\hat u) = \emptyset$, which yields $\hat u = 0$, hence $u = 0$.
If there exists $\xi_0 \in P^{-1}(0) \cap \mathbb{R}^n$, then we consider the function $u(x) := e^{ix\xi_0}$. We have $0 \ne u \in C^\infty \cap L^\infty \subseteq \mathcal{S}'$ and $P(D)u = P(\xi_0)\,u = 0$, since $P(\xi_0) = 0$.
4) $P_4 \colon \mathcal{D}' \to \mathcal{D}'$, $u \mapsto P(D)u$, is injective if and only if $P$ is constant.
If $P$ is constant (and nonzero), then $P_4$ is a complex multiple of the identity operator, hence injective.
If $P$ is not constant, then $P^{-1}(0) \ne \emptyset$ and any function of the form $x \mapsto e^{ix\zeta}$ on $\mathbb{R}^n$ with $\zeta \in P^{-1}(0)$ provides a nonzero element in the kernel of $P_4$.
(Note that here $x \mapsto e^{ix\zeta}$ is unbounded, if $\zeta \in \mathbb{C}^n \setminus \mathbb{R}^n$.)
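Items 3) and 4) can be illustrated with a one-dimensional example that is not taken from the notes: the symbol $P(\zeta) = \zeta^2 + 1$, i.e. $P(D) = D^2 + 1 = -\frac{d^2}{dx^2} + 1$, has no real zeros (so $P_3$ is injective) but vanishes at $\zeta = i$, and $u(x) = e^{ix\zeta} = e^{-x}$ is an unbounded smooth function in the kernel of $P_4$. The sketch below checks $P(D)u = 0$ by a central difference; the step size and names are assumptions.

```python
import cmath

def P(z):
    # symbol of the hypothetical example operator P(D) = D^2 + 1 = -d^2/dx^2 + 1
    return z * z + 1.0

zeta = 1j   # complex zero of P; P has no real zeros

def u(x):
    # exp(i x zeta) = exp(-x), unbounded as x -> -infinity
    return cmath.exp(1j * x * zeta)

def apply_P(func, x, h=1e-4):
    """Central-difference approximation of (-func'' + func)(x)."""
    second = (func(x + h) - 2.0 * func(x) + func(x - h)) / (h * h)
    return -second + func(x)

residuals = [abs(apply_P(u, x)) for x in (-1.0, 0.0, 2.0)]
print(P(zeta), residuals)
```

The function $e^{-x}$ grows exponentially for $x \to -\infty$ and is therefore not temperate, consistent with $P_3$ being injective on $\mathcal{S}'$ in this example.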
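5.32. Lemma If $u, v \in \mathcal{E}'(\mathbb{R}^n)$, then $\widehat{(u * v)} = \hat u\,\hat v$.
Proof: Since $\operatorname{supp}(u * v) \subseteq \operatorname{supp}(u) + \operatorname{supp}(v)$ we have $u * v \in \mathcal{E}'(\mathbb{R}^n)$. Therefore (5.20) gives
$$\widehat{(u * v)}(\xi) = \langle u * v(z), e^{-iz\xi} \rangle = \langle u \otimes v(x, y), e^{-i(x + y)\xi} \rangle = \langle u(x), e^{-ix\xi} \langle v(y), e^{-iy\xi} \rangle \rangle = \hat u(\xi)\,\hat v(\xi).$$
5.33. THM (Convolution theorem) Let $u \in \mathcal{S}'(\mathbb{R}^n)$ and $v \in \mathcal{E}'(\mathbb{R}^n)$. Then $u * v$ belongs to $\mathcal{S}'(\mathbb{R}^n)$ and we have
$$\widehat{(u * v)} = \hat u\,\hat v. \tag{5.21}$$
Proof: Theorem 5.30 implies that $\hat v \in \mathcal{O}_M(\mathbb{R}^n)$, hence $\hat v\,\hat u \in \mathcal{S}'(\mathbb{R}^n)$ by Proposition 5.19(ii). Since the Fourier transform is an isomorphism on $\mathcal{S}'(\mathbb{R}^n)$,
$$\exists!\ w \in \mathcal{S}'(\mathbb{R}^n) : \quad \hat w = \hat v\,\hat u.$$
We determine $w$ by its action on any test function $\varphi \in \mathcal{D}(\mathbb{R}^n)$, upon noting that $\varphi = (2\pi)^{-n}\,\hat{\hat{\check\varphi}}$, as follows:
$$\langle w, \varphi \rangle = (2\pi)^{-n} \langle w, \hat{\hat{\check\varphi}} \rangle = (2\pi)^{-n} \langle \hat w, \hat{\check\varphi} \rangle = (2\pi)^{-n} \langle \hat v\,\hat u, \hat{\check\varphi} \rangle = (2\pi)^{-n} \langle \hat u, \hat v\,\hat{\check\varphi} \rangle$$
$$\underset{[5.32]}{=} (2\pi)^{-n} \langle \hat u, \widehat{(v * \check\varphi)} \rangle = (2\pi)^{-n} \langle \hat{\hat u}, v * \check\varphi \rangle = \langle \check u, v * \check\varphi \rangle \underset{[(4.5)]}{=} u * (v * \check\varphi)(0) = (u * v) * \check\varphi\,(0) \underset{[(4.5)]}{=} \langle u * v, \varphi \rangle.$$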

Since $\mathcal{D} \subseteq \mathcal{S}$ is dense we obtain $w = u * v$.
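5.34. REM A result analogous to Theorem 5.33 holds also in the case $u \in \mathcal{S}'$ and $\varphi \in \mathcal{S}$. Then $u * \varphi \in \mathcal{O}_M(\mathbb{R}^n)$ and the formula $\widehat{(u * \varphi)} = \hat u\,\hat\varphi$ also holds (cf. [Hor66, Chapter 4, §11, Proposition 7 and Theorem 3]).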

5.6. FOURIER TRANSFORM ON $L^2$
5.35. REM A standard result (from analysis or measure theory) tells that $C_c(\mathbb{R}^n)$ is dense in $L^p(\mathbb{R}^n)$ when $1 \le p < \infty$ (see, e.g., [Fol99, Proposition 7.9]). Using the regularization techniques already seen in Theorem 1.13 we may in turn approximate $C_c$-functions uniformly by test functions in $\mathcal{D}$ and thus conclude in summary that
$$\mathcal{D}(\mathbb{R}^n) \text{ (as well as } \mathcal{S}(\mathbb{R}^n)\text{) is dense in } L^p(\mathbb{R}^n) \quad (1 \le p < \infty).$$
(Cf. also [Fol99, Proposition 8.17])
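5.36. THM (Plancherel) If $f \in L^2(\mathbb{R}^n)$ then the ($\mathcal{S}'$-)Fourier transform $\hat f$ is also in $L^2(\mathbb{R}^n)$. Moreover, Parseval's formula (5.3), i.e.,
$$\int f(x)\,\hat g(x)\,dx = \int \hat f(\xi)\,g(\xi)\,d\xi$$
is valid for all $f, g \in L^2(\mathbb{R}^n)$ and we have
$$\|\hat f\|_{L^2} = (2\pi)^{n/2}\,\|f\|_{L^2}. \tag{5.22}$$
Proof: Step 1: Let $f, g \in \mathcal{S}(\mathbb{R}^n)$.
Then $\hat f, \hat g \in \mathcal{S}(\mathbb{R}^n)$ and (5.3) holds. If we set $g = \overline{\hat f}$ then
$$g(x) = \overline{\hat f(x)} = \int \bar f(y)\,e^{ixy}\,dy = \hat{\bar f}(-x) = (\hat{\bar f})\check{\ }(x),$$
hence $\hat g = (2\pi)^n\,\bar f$ and (5.3) implies (5.22) in this case.
Step 2: Let $f \in L^2(\mathbb{R}^n)$ and $g \in \mathcal{S}(\mathbb{R}^n)$.
We have $\langle \hat f, g \rangle = \langle f, \hat g \rangle = \int f(x)\,\hat g(x)\,dx$ and the Cauchy-Schwarz inequality gives
$$|\langle \hat f, g \rangle| = |\langle f, \hat g \rangle| \le \|f\|_{L^2}\,\|\hat g\|_{L^2} \underset{[\text{Step 1}]}{=} (2\pi)^{n/2}\,\|f\|_{L^2}\,\|g\|_{L^2}.$$
Since $\mathcal{S}$ is dense in $L^2$ the above inequality shows that the linear functional $g \mapsto \langle \hat f, g \rangle$ on $\mathcal{S}(\mathbb{R}^n)$ has a unique continuous extension to $L^2(\mathbb{R}^n)$, which we denote again by $\hat f$.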

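In view of the Fréchet-Riesz theorem ([Wer05, Theorem V.3.6]) there exists a unique $v \in L^2(\mathbb{R}^n)$ such that we have
$$\forall \psi \in L^2 : \quad \langle \hat f, \psi \rangle = \langle \psi | \bar v \rangle_{L^2} = \int \psi(x)\,v(x)\,dx.$$
If $\psi \in \mathcal{S}$ we obtain $\langle \hat f, \psi \rangle = \langle v, \psi \rangle$, thus $v = \hat f$ holds in $\mathcal{S}'$ and therefore $\hat f \in L^2$ and Parseval's formula $\langle \hat f, g \rangle = \langle f, \hat g \rangle$ is valid with $f, g \in L^2$. Now (5.22) follows exactly as in Step 1.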
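5.37. COR The linear map $(2\pi)^{-n/2}\,\mathcal{F}|_{L^2(\mathbb{R}^n)}$ defines a unitary operator on $L^2(\mathbb{R}^n)$.
Proof: Applying Parseval's formula with $\overline{\hat g}$ in place of $g$ yields
$$\langle \hat f | \hat g \rangle_{L^2} = \int \hat f(\xi)\,\overline{\hat g(\xi)}\,d\xi = \int \hat f(\xi)\,\overline{\hat g}(\xi)\,d\xi \underset{[\text{Parseval}]}{=} \int f(x)\,\hat{\overline{\hat g}}(x)\,dx = (2\pi)^n \int f(x)\,\bar g(x)\,dx = (2\pi)^n \langle f | g \rangle_{L^2},$$
thus the linear map $f \mapsto (2\pi)^{-n/2} \hat f$ is an isometry on $L^2$. Since $f = (2\pi)^{-n/2}\,\mathcal{F}(h)$, where $h := (2\pi)^{-n/2}\,(\hat f)\check{\ } \in L^2$, the map $(2\pi)^{-n/2}\,\mathcal{F}|_{L^2(\mathbb{R}^n)}$ is also surjective (as operator on $L^2$), hence it is unitary.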
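Formula (5.22) can be observed numerically in dimension $n = 1$. The sketch below takes the (arbitrarily chosen) function $f(x) = (1 + x)e^{-x^2/2}$, for which $\|f\|_{L^2}^2 = \tfrac{3}{2}\sqrt{\pi}$ by a direct computation, approximates $\hat f$ on a grid and compares $\|\hat f\|_{L^2}^2$ with $2\pi \|f\|_{L^2}^2$. Grid, quadrature rule and names are assumptions for illustration.

```python
import cmath
import math

R, N = 10.0, 400
H = 2.0 * R / N
GRID = [-R + k * H for k in range(N + 1)]

def trapz(values):
    """Trapezoidal rule for samples on GRID."""
    return H * (sum(values) - 0.5 * (values[0] + values[-1]))

def f(x):
    return (1.0 + x) * math.exp(-x * x / 2.0)

def f_hat(xi):
    """Numerical Fourier transform of f at xi."""
    return trapz([f(x) * cmath.exp(-1j * x * xi) for x in GRID])

norm_f_sq = trapz([f(x) ** 2 for x in GRID])
norm_f_hat_sq = trapz([abs(f_hat(xi)) ** 2 for xi in GRID])
print(norm_f_sq, norm_f_hat_sq / (2.0 * math.pi))   # the two numbers should agree
```

Dividing $\mathcal{F}$ by $(2\pi)^{1/2}$ thus preserves the $L^2$-norm of this particular $f$, as Corollary 5.37 asserts in general.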
5.38. Application For every t R let Mt denote the unitary operator on L2 (Rn ) de-
ned by multiplication with the function mt () = eit|| , that is, Mt g() = mt ()g() =
2
eit|| g() for all g L2 (Rn ). Let F2 := (2)n/2 F |L2 (Rn ) , then F2 as well as F21 is
2
also unitary on L2 . Therefore
Ut f := (F21 Mt F2 ) f = F1 mt fb f L2 (Rn )

denes a family (Ut )tR of unitary operators on L2 . For any t1 , t2 R we clearly have
the relation mt1 mt2 = mt1 +t2 , which implies Ut1 Ut2 = Ut1 +t2 . In particular, U0 is the
identity operator. Hence t 7 Ut is a group homomorphism of (R, +) into the group of
unitary operators on L2 (thus, a unitary representation of (R, +)).
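Now we change the point of view slightly and consider $U_t$ as a linear operator on $\mathcal{S}'$ (since the same formula above makes sense with $f \in \mathcal{S}'$).
Noting that $\lim_{h\to 0} \frac{m_{t+h}(\xi) - m_t(\xi)}{h} = -i\,|\xi|^2 m_t(\xi)$ we obtain by dominated convergence that for any $\varphi \in \mathcal{S}$ and $f \in L^2$
\[ \lim_{h\to 0} \Big\langle f, \frac{m_{t+h} - m_t}{h}\,\varphi \Big\rangle = -i\,\langle f, |\xi|^2 m_t \varphi\rangle. \]
Recall that for $v \in \mathcal{S}'$ we have $\langle m_t v, \varphi\rangle = \langle v, m_t \varphi\rangle$. Therefore we further deduce that
\[ \frac{d}{dt}(m_t f) := \mathcal{S}'\text{-}\lim_{h\to 0} \frac{m_{t+h} f - m_t f}{h} = -i\,|\xi|^2 m_t f \qquad \forall f \in L^2. \]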
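By continuity of $\mathcal{F}^{-1}$ we may interchange it with the limit and calculate as follows
\begin{align*}
\frac{d}{dt} U_t f &= \mathcal{F}^{-1}\Big( \mathcal{S}'\text{-}\lim_{h\to 0} \frac{m_{t+h}\hat f - m_t \hat f}{h} \Big) = \mathcal{F}^{-1}\big( -i\,|\xi|^2 m_t \hat f \big) = -i \sum_{k=1}^n \mathcal{F}^{-1}(\xi_k^2\, m_t \hat f) \\
&= -i \sum_{k=1}^n D_k^2\, \mathcal{F}^{-1}(m_t \hat f) = i \sum_{k=1}^n \partial_k^2\, \mathcal{F}^{-1}(m_t \hat f) = i\,\Delta\, \mathcal{F}^{-1}(m_t \hat f) = i\,\Delta U_t f.
\end{align*}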
This result is interpreted as operator differential equation
\[ \frac{d}{dt} U_t = i\,\Delta U_t. \tag{5.23} \]
Since we also have $U_0 = I$ (identity operator) as the initial value at $t = 0$, we denote the solution by
\[ U_t = e^{it\Delta} \qquad (t \in \mathbb{R}). \]
Note that, if we set $u(x, t) := U_t f(x)$, then $u(x, 0) = f(x)$ and (5.23) reads
\[ \partial_t u = i\,\Delta u, \]
which corresponds to Schrödinger's equation, valid in the sense of distributions.
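The group law $U_{t_1}U_{t_2} = U_{t_1+t_2}$ and the unitarity of $U_t$ can be illustrated numerically. The following is a minimal sketch and not part of the original notes: it replaces $\mathcal{F}$ on $L^2(\mathbb{R})$ by a naive discrete Fourier transform on a periodic grid (an assumption made purely for illustration), and the names `dft`, `frequencies`, `propagate`, `N` and `L` are hypothetical.

```python
import cmath
import math


def dft(values, sign):
    """Naive DFT: sign=-1 is the forward transform, sign=+1 the unnormalized inverse."""
    n = len(values)
    return [
        sum(values[m] * cmath.exp(sign * 2j * math.pi * m * k / n) for m in range(n))
        for k in range(n)
    ]


def frequencies(n, length):
    """Angular frequencies xi_k of an n-point periodic grid with period `length`."""
    return [2 * math.pi * (k if k < n // 2 else k - n) / length for k in range(n)]


def propagate(samples, t, length):
    """Discrete analogue of U_t f = F^{-1}(m_t F f) with m_t(xi) = exp(-i t xi^2)."""
    n = len(samples)
    spectrum = dft(samples, -1)
    rotated = [cmath.exp(-1j * t * xi * xi) * c
               for xi, c in zip(frequencies(n, length), spectrum)]
    return [value / n for value in dft(rotated, +1)]


def l2_norm(samples):
    """Euclidean norm of the sample vector (the grid spacing cancels in comparisons)."""
    return math.sqrt(sum(abs(v) ** 2 for v in samples))


# Assumed test data: a Gaussian sampled on [-L/2, L/2).
N, L = 64, 20.0
f = [complex(math.exp(-(-L / 2 + L * j / N) ** 2)) for j in range(N)]

once = propagate(f, 0.7, L)
twice = propagate(propagate(f, 0.2, L), 0.5, L)
print(abs(l2_norm(once) - l2_norm(f)))               # unitarity defect (rounding level)
print(max(abs(a - b) for a, b in zip(once, twice)))  # group-law defect (rounding level)
```

Both printed defects stay at the level of floating-point rounding because the multiplier $m_t$ has modulus one and $m_{0.2}\,m_{0.5} = m_{0.7}$ holds pointwise, exactly as in the continuous argument.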
5.39. LEMMA (Convolution of $L^2$-functions) If $f, g \in L^2(\mathbb{R}^n)$, then
\[ x \mapsto f * g(x) := \int_{\mathbb{R}^n} f(y)\, g(x - y)\, dy \]
defines a continuous bounded function on $\mathbb{R}^n$.
Proof: For every $x \in \mathbb{R}^n$ the function $g(x - .)$ belongs to $L^2(\mathbb{R}^n)$, hence the product function $y \mapsto f(y) g(x - y)$ belongs to $L^1(\mathbb{R}^n)$. Therefore $f * g(x)$ is defined. The Cauchy-Schwarz inequality implies
\[ |f * g(x)| \le \int |f(y)|\, |g(x - y)|\, dy \le \|f\|_{L^2} \Big( \int |g(x - y)|^2\, dy \Big)^{1/2} = \|f\|_{L^2} \|g\|_{L^2}, \]
hence boundedness of $f * g$.
To prove continuity, let $(g_k)$ be a sequence of functions in $C_c(\mathbb{R}^n)$ that converges to $g$ in $L^2(\mathbb{R}^n)$. By standard theorems on parameter dependent integrals we have $f * g_k \in C(\mathbb{R}^n)$ for every $k \in \mathbb{N}$. The above inequality implies
\[ |f * g(x) - f * g_k(x)| = |f * (g - g_k)(x)| \le \|f\|_{L^2} \|g - g_k\|_{L^2} \to 0 \qquad (k \to \infty) \]
uniformly with respect to $x$. Thus $f * g$ is continuous.
5.40. REM The proof of boundedness of $f * g$ (with $f, g \in L^2$) above shows that
\[ \|f * g\|_{L^\infty} \le \|f\|_{L^2} \|g\|_{L^2}. \]
This is a special case of Young's inequality, which states the following:
Let $1 \le p, q, r \le \infty$ and $\frac1p + \frac1q = \frac1r + 1$. If $f \in L^p$ and $g \in L^q$, then $f * g \in L^r$ and the inequality
\[ \|f * g\|_{L^r} \le \|f\|_{L^p} \|g\|_{L^q} \]
holds ([Fol99, Proposition 8.9]).
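Young's inequality has an exact counterpart for finite sequences with counting measure, which makes it easy to test by hand. The sketch below is an illustration under that assumption (sequences instead of functions on $\mathbb{R}^n$); the helper names `convolve` and `lp_norm` and the sample data are hypothetical.

```python
import math


def convolve(f, g):
    """Full discrete convolution (f*g)[k] = sum_j f[j] * g[k-j] of two finite sequences."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


def lp_norm(seq, p):
    """l^p norm of a finite sequence; p = math.inf gives the sup norm."""
    if p == math.inf:
        return max(abs(v) for v in seq)
    return sum(abs(v) ** p for v in seq) ** (1.0 / p)


# Arbitrary sample sequences (assumed data, chosen only for the demonstration).
f = [math.sin(0.7 * k) / (1 + k) for k in range(40)]
g = [math.cos(1.3 * k) * math.exp(-0.1 * k) for k in range(50)]
h = convolve(f, g)

# Case p = q = 2, r = infinity (the estimate of Remark 5.40).
print(lp_norm(h, math.inf) <= lp_norm(f, 2) * lp_norm(g, 2))
# Case p = 1, q = 2, r = 2.
print(lp_norm(h, 2) <= lp_norm(f, 1) * lp_norm(g, 2))
```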
5.41. THM Let $f, g \in L^2(\mathbb{R}^n)$. Then we have the analogue of the formula (5.21), namely
\[ \widehat{(f * g)} = \hat f\, \hat g. \tag{5.24} \]
(Note that on the left-hand side we have the Fourier transform of a continuous bounded function, whereas the right-hand side displays a product of $L^2$-functions.)
Proof: If $g \in C_c(\mathbb{R}^n) \subseteq \mathcal{E}'(\mathbb{R}^n)$, then the statement follows from Theorem 5.33.
For general $g \in L^2$ we choose again a sequence $(g_k)$ in $C_c$ approximating $g$ in $L^2$. Then $f * g_k \to f * g$ uniformly (see the proof of the above Lemma), hence in $\mathcal{S}'$.
Moreover, by the Cauchy-Schwarz inequality
\[ \|\hat f\, \hat g_k - \hat f\, \hat g\|_{L^1} \le \|\hat f\|_{L^2}\, \|\widehat{g_k - g}\|_{L^2} = (2\pi)^{n/2}\, \|\hat f\|_{L^2}\, \|g - g_k\|_{L^2} \]
and therefore $\hat f\, \hat g_k \to \hat f\, \hat g$ in $L^1$, hence also in $\mathcal{S}'$. In summary, by continuity of $\mathcal{F}$ we have in terms of limits in $\mathcal{S}'$ (as $k \to \infty$)
\[ \widehat{(f * g)} = \lim \widehat{(f * g_k)} = \lim \hat f\, \hat g_k = \hat f\, \hat g. \]

Chapter 6
REGULARITY
6.1. Intro In this chapter we focus on the issue of regularity of distributions, i.e., we ask under which conditions a general distribution $u \in \mathcal{D}'(\Omega)$ resp. $\mathcal{S}'(\mathbb{R}^n)$ actually has higher regularity, that is, when it belongs to some smaller space of "nicer" functions.
To begin with, in section 6.1 we deal with this question on a global basis. We introduce a scale of Hilbert spaces of distributions on $\mathbb{R}^n$, the so-called $L^2$-based Sobolev spaces $H^s(\mathbb{R}^n)$ ($s \in \mathbb{R}$), which single out tempered distributions whose Fourier transform has a certain growth at infinity as measured in the $L^2$-norm. After studying their basic properties we will prove that if the Sobolev order $s$ is large enough, any $u \in H^s(\mathbb{R}^n)$ actually is continuous (and vanishes at infinity).
In section 6.2 we take a local approach and look at the set of points in $\Omega$ where a given $u \in \mathcal{D}'(\Omega)$ is actually a smooth function. Its complement is called the singular support of $u$ and is one fundamental notion in regularity theory. We will also link the singular support of a distribution to fall-off conditions of its Fourier transform. This idea is more thoroughly studied in section 6.3 where we prove the Paley-Wiener-Schwartz theorem, which gives a precise characterization of the support of a compactly supported distribution in terms of its Fourier(-Laplace) transform.
Finally, in section 6.4 we discuss applications in the theory of PDE (with constant coefficients). We introduce the central notion of hypoellipticity, which singles out a class of PDOs whose solutions are at least as regular as the right hand side, and prove the elliptic regularity theorem. The latter states that all (constant coefficient) elliptic PDOs are hypoelliptic.

6.1. SOBOLEV SPACES
6.2. Intro In this section we are going to study a family of Hilbert spaces of distributions. Our point of departure is the remarkable fact from Plancherel's Theorem 5.36 that for $u \in \mathcal{S}'(\mathbb{R}^n)$ we have
\[ u \in L^2(\mathbb{R}^n) \iff \hat u \in L^2(\mathbb{R}^n). \]
Moreover, by the exchange formulas (Proposition 5.26(i),(ii)) we know that differentiation of $u$ amounts to multiplication of $\hat u$ with polynomials and vice versa. In this way derivatives of $u$ are linked to growth at infinity of $\hat u$. The definition of Sobolev spaces is based on this observation and allows us to measure smoothness of $u$ in terms of $L^2$-estimates of its Fourier transform. We start by introducing some notation.
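6.3. Notation (Polynomial weights) Let $s \in \mathbb{R}$ and $\xi \in \mathbb{R}^n$. In the following we shall write
\[ \lambda(\xi) := (1 + |\xi|^2)^{\frac12} \quad\text{and hence}\quad \lambda^s = \lambda^s(\xi) = (1 + |\xi|^2)^{\frac s2}. \]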
6.4. DEF (Sobolev Spaces) Let $s \in \mathbb{R}$. We define the Sobolev space $H^s(\mathbb{R}^n)$ (sometimes also called Bessel potential space) by
\[ H^s(\mathbb{R}^n) := \{ u \in \mathcal{S}'(\mathbb{R}^n) : \lambda^s \hat u \in L^2(\mathbb{R}^n) \}. \]
6.5. Observation (on $H^s$)

(i) Note that if $u \in H^s(\mathbb{R}^n)$ then by definition $\hat u$ is a function.
(ii) From Plancherel's theorem 5.36 it follows that $H^0(\mathbb{R}^n) = L^2(\mathbb{R}^n)$.
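6.6. PROP (Basic Properties of $H^s$)

(i) The spaces $H^s(\mathbb{R}^n)$ are Hilbert spaces with scalar product
\[ \langle u | v \rangle_s := (2\pi)^{-n} \int \lambda^{2s}(\xi)\, \hat u(\xi)\, \overline{\hat v(\xi)}\, d\xi \tag{6.1} \]
and (associated) norm
\[ \|u\|_{H^s}^2 = (2\pi)^{-n} \int \lambda^{2s}(\xi)\, |\hat u(\xi)|^2\, d\xi. \tag{6.2} \]

(ii) For all $s \in \mathbb{R}$ we have that $\mathcal{S}(\mathbb{R}^n) \subseteq H^s(\mathbb{R}^n)$ is dense.
6.7. Observation (Notation and norms) The factor $(2\pi)^{-n}$ in (6.1) was introduced to have
\[ \|u\|_{L^2} = \|u\|_{H^0}, \]
cf. (5.22) in Plancherel's theorem. Moreover we have
\[ \|u\|_{H^s} = (2\pi)^{-\frac n2}\, \|\lambda^s \hat u\|_{L^2}. \]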
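Proof: [Proposition 6.6]

(i) The scalar product exists by the Cauchy-Schwarz inequality and the definition of $H^s$; indeed we have
\[ \int |\lambda^s \hat u\; \lambda^s \overline{\hat v}| \le \|\lambda^s \hat u\|_{L^2}\, \|\lambda^s \hat v\|_{L^2}. \]
Moreover, sesquilinearity and non-negativity are clear. To show positive definiteness assume that $\langle u | u \rangle_s = 0$. Then we have
\[ \int \lambda^{2s} |\hat u|^2 = 0 \implies \hat u(\xi) = 0 \text{ a.e.} \implies \hat u = 0 \in L^2(\mathbb{R}^n) \overset{5.20(ii)}{\implies} \hat u = 0 \in \mathcal{S}'(\mathbb{R}^n) \implies u = 0 \in H^s(\mathbb{R}^n). \]
Finally completeness of $H^s$ follows from completeness of $L^2$: the map $u \mapsto (2\pi)^{-n/2} \lambda^s \hat u$ is a linear isometry of $H^s$ onto $L^2$ (it is onto because for $g \in L^2$ the function $\lambda^{-s} g$ is a tempered distribution and $u := (2\pi)^{n/2} \mathcal{F}^{-1}(\lambda^{-s} g)$ lies in $H^s$).

(ii) Clearly, $\mathcal{S}(\mathbb{R}^n) \subseteq H^s(\mathbb{R}^n)$ for all $s$ [$u \in \mathcal{S}(\mathbb{R}^n) \implies \lambda^s \hat u \in \mathcal{S}(\mathbb{R}^n) \subseteq L^2(\mathbb{R}^n)$ by 5.10(i), 5.15 and 5.35].
To show denseness let $u \in H^s(\mathbb{R}^n)$. By 5.35 there exists $(\varphi_j)_j \subseteq \mathcal{D}(\mathbb{R}^n)$ with
\[ \varphi_j \to \lambda^s \hat u \quad\text{in } L^2(\mathbb{R}^n). \tag{6.3} \]
Now set $\psi_j := \mathcal{F}^{-1}(\lambda^{-s} \varphi_j) \in \mathcal{S}(\mathbb{R}^n)$ (note that $\lambda^{-s} \varphi_j \in \mathcal{D} \subseteq \mathcal{S}$). Then we obtain
\begin{align*}
\|u - \psi_j\|_{H^s} &= (2\pi)^{-\frac n2} \Big( \int \lambda^{2s}(\xi)\, |\hat u(\xi) - \lambda^{-s}(\xi) \varphi_j(\xi)|^2\, d\xi \Big)^{\frac12} \\
&= (2\pi)^{-\frac n2} \Big( \int |\lambda^s(\xi) \hat u(\xi) - \varphi_j(\xi)|^2\, d\xi \Big)^{\frac12} \to 0,
\end{align*}
where convergence is due to (6.3).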

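6.8. Example (Once again $\delta$) We have
\[ \delta \in H^s(\mathbb{R}^n) \iff -s > \frac n2. \]
Indeed, since $\hat\delta = 1$,
\[ \delta \in H^s(\mathbb{R}^n) \iff \lambda^s \hat\delta = \lambda^s \in L^2(\mathbb{R}^n) \iff \lambda^s(\xi) \le C (1 + |\xi|)^s \in L^2(\mathbb{R}^n) \iff \int_{\mathbb{R}^n} \frac{d\xi}{(1 + |\xi|)^{-2s}} < \infty \iff -2s > n. \]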
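For $n = 1$ the criterion reads $s < -\frac12$, and the norm in (6.2) can be evaluated by quadrature. The sketch below is an illustration only: `delta_sobolev_norm_sq` is a hypothetical helper that truncates the integral $(2\pi)^{-1}\int (1+\xi^2)^s\,d\xi$ to $[-R, R]$ and uses the midpoint rule. For $s = -1$ the truncated value equals $\arctan(R)/\pi$ and tends to $\frac12$, whereas for the borderline case $s = -\frac12$ it equals $\operatorname{arsinh}(R)/\pi$ and grows without bound.

```python
import math


def delta_sobolev_norm_sq(s, radius, steps=200_000):
    """Truncated (2 pi)^{-1} * int_{-R}^{R} (1 + xi^2)^s d xi, computed by the midpoint rule."""
    h = 2.0 * radius / steps
    total = 0.0
    for i in range(steps):
        xi = -radius + (i + 0.5) * h
        total += (1.0 + xi * xi) ** s
    return total * h / (2.0 * math.pi)


for radius in (10.0, 1000.0):
    # s = -1 stabilizes near 1/2, s = -1/2 keeps growing like log(R).
    print(radius, delta_sobolev_norm_sq(-1.0, radius), delta_sobolev_norm_sq(-0.5, radius))
```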
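6.9. PROP (More basic properties of $H^s$)

(i) For $s \ge t$ we have $H^s(\mathbb{R}^n) \hookrightarrow H^t(\mathbb{R}^n)$ continuously.
(ii) Let $P(D)$ be a linear PDO with constant coefficients of order $m$, then
\[ P(D) : H^s(\mathbb{R}^n) \to H^{s-m}(\mathbb{R}^n) \]
is continuous.
Proof: (i) Let $s \ge t$, then we find
\begin{align*}
\|u\|_{H^t} &= (2\pi)^{-\frac n2} \Big( \int |\lambda^t(\xi) \hat u(\xi)|^2\, d\xi \Big)^{\frac12} \\
&= (2\pi)^{-\frac n2} \Big( \int \big| \underbrace{(1 + |\xi|^2)^{\frac{t-s}{2}}}_{\le 1}\, \lambda^s(\xi)\, \hat u(\xi) \big|^2\, d\xi \Big)^{\frac12} \le \|u\|_{H^s}.
\end{align*}
(ii) We prove the statement for $P(D) = D^\alpha$ with $|\alpha| = m$, the general case follows analogously. Let $u \in H^s(\mathbb{R}^n)$, then by the exchange formula (Prop. 5.26(i)) we have
\begin{align*}
\lambda^{s-m}(\xi)\, |\widehat{D^\alpha u}(\xi)| &= (1 + |\xi|^2)^{\frac{s-m}{2}}\, |\xi^\alpha|\, |\hat u(\xi)| \le (1 + |\xi|^2)^{\frac{s-m}{2}}\, |\xi|^m\, |\hat u(\xi)| \\
&\le (1 + |\xi|^2)^{\frac{s-m}{2}} (1 + |\xi|^2)^{\frac m2}\, |\hat u(\xi)| = (1 + |\xi|^2)^{\frac s2}\, |\hat u(\xi)| = \lambda^s(\xi)\, |\hat u(\xi)|,
\end{align*}
so $D^\alpha u \in H^{s-m}(\mathbb{R}^n)$ and $\|D^\alpha u\|_{H^{s-m}} \le \|u\|_{H^s}$.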
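6.10. REM (The spaces $H^\infty$ and $H^{-\infty}$) Due to Proposition 6.9(i) it makes sense to introduce the spaces
\[ H^\infty(\mathbb{R}^n) := \bigcap_{s \in \mathbb{R}} H^s(\mathbb{R}^n) \quad\text{and}\quad H^{-\infty}(\mathbb{R}^n) := \bigcup_{s \in \mathbb{R}} H^s(\mathbb{R}^n). \]
We immediately see that we have the inclusions¹
\[ \mathcal{S}(\mathbb{R}^n) \subseteq H^\infty(\mathbb{R}^n) \subseteq H^{-\infty}(\mathbb{R}^n) \subseteq \mathcal{S}'(\mathbb{R}^n). \]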
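6.11. REM (Towards the duality $H^s$, $H^{-s}$) Let $\varphi, \psi \in \mathcal{S}(\mathbb{R}^n)$ and regard $\varphi$ as a regular $\mathcal{S}'$-distribution. Then we have by Corollary 5.37
\begin{align*}
\langle \varphi, \psi \rangle &= \int \varphi(x) \psi(x)\, dx = \langle \varphi | \bar\psi \rangle_{L^2} = (2\pi)^{-n} \langle \hat\varphi | \hat{\bar\psi} \rangle_{L^2} \\
&= (2\pi)^{-n} \int \hat\varphi(\xi)\, \hat\psi(-\xi)\, d\xi = (2\pi)^{-n} \int \lambda^{-s}(\xi) \hat\varphi(\xi)\; \lambda^{s}(\xi) \hat\psi(-\xi)\, d\xi,
\end{align*}
where we have used $\overline{\hat{\bar\psi}(\xi)} = \hat\psi(-\xi)$; hence by the Cauchy-Schwarz inequality
\[ |\langle \varphi, \psi \rangle| \le (2\pi)^{-n} \int |\lambda^{-s}(\xi) \hat\varphi(\xi)|\, |\lambda^{s}(\xi) \hat\psi(-\xi)|\, d\xi \le \|\varphi\|_{H^{-s}}\, \|\psi\|_{H^s}. \tag{6.4} \]
Since $\mathcal{S}(\mathbb{R}^n)$ is dense in $H^s(\mathbb{R}^n)$ for all $s$ (Proposition 6.6(ii)) we may extend the map
\[ (\varphi, \psi) \mapsto \langle \varphi, \psi \rangle \]
uniquely to a continuous bilinear map $H^{-s}(\mathbb{R}^n) \times H^s(\mathbb{R}^n) \to \mathbb{C}$ which we write as
\[ (u, v) \mapsto \langle u, v \rangle_{H^{-s}, H^s} := (2\pi)^{-n} \int \hat u(\xi)\, \hat v(-\xi)\, d\xi. \tag{6.5} \]
Note that (6.4) gives
\[ |\langle u, v \rangle_{H^{-s}, H^s}| \le \|u\|_{H^{-s}}\, \|v\|_{H^s}. \tag{6.6} \]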
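We may now prove
6.12. THM (The $H^s$, $H^{-s}$ duality) The bilinear form $\langle\ ,\ \rangle_{H^{-s}, H^s}$ of (6.5) induces an isometric isomorphism
\[ H^{-s}(\mathbb{R}^n) \cong \big(H^s(\mathbb{R}^n)\big)' \quad\text{(the topological dual of } H^s\text{).} \]
In other words, $H^{-s}(\mathbb{R}^n)$-distributions are precisely the continuous linear forms on $H^s(\mathbb{R}^n)$.
¹ In fact all these inclusions are strict: $(1 + |x|^2)^{-n} \in H^\infty(\mathbb{R}^n)$ by 6.16 below but not in $\mathcal{S}(\mathbb{R}^n)$, and $1 \in \mathcal{S}'(\mathbb{R}^n) \setminus H^{-\infty}(\mathbb{R}^n)$.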

Proof: insert write proof in small print
6.13. Rem (on the dual of $H^s$) Do not be confused by the fact that (as is the case for any Hilbert space) $(H^s)'$ is also isometrically isomorphic to $H^s$ itself. This isomorphism is induced by the mapping $\langle\ |\ \rangle_s$ rather than $\langle\ ,\ \rangle_{H^{-s}, H^s}$.
Composing these two mappings we obtain an isometric isomorphism from $H^s(\mathbb{R}^n)$ to $H^{-s}(\mathbb{R}^n)$ which is essentially given by the (pseudo-differential) operator $\lambda^{2s}(D)$ defined via $\mathcal{F}(\lambda^{2s}(D) u) := \lambda^{2s} \hat u$.
6.14. Motivation (Measuring smoothness via Sobolev norms) Our next task is to make the announcement of Intro 6.2 more precise. We will show that Sobolev spaces consist of functions whose derivatives belong to $L^2$. An overall understanding of this statement is best reached via the use of pseudo-differential operators. Since this is beyond the focus of the present course we will restrict ourselves to the case of the spaces $H^m(\mathbb{R}^n)$ with $m \in \mathbb{N}_0$. We start with a little technical Lemma, which, however, is easily proved also in the general case $s \in \mathbb{R}$.
6.15. LEM (Sobolev norms of derivatives) For all $s \in \mathbb{R}$ we have
\[ u \in H^{s+1}(\mathbb{R}^n) \iff u, D_1 u, \dots, D_n u \in H^s(\mathbb{R}^n) \]
and in this case the norms satisfy the equality
\[ \|u\|_{H^{s+1}}^2 = \|u\|_{H^s}^2 + \sum_{j=1}^n \|D_j u\|_{H^s}^2. \]
Proof: We have $\lambda^2(\xi) = 1 + |\xi|^2 = 1 + \sum_j \xi_j^2$ and so again by the exchange formula Proposition 5.26(i)
\[ |\lambda^{s+1} \hat u|^2 = \lambda^2\, |\lambda^s \hat u|^2 = |\lambda^s \hat u|^2 + \sum_{j=1}^n |\lambda^s \xi_j \hat u|^2 = |\lambda^s \hat u|^2 + \sum_{j=1}^n |\lambda^s \widehat{D_j u}|^2. \]
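For $s = 0$ and $n = 1$ the lemma says $\|u\|_{H^1}^2 = \|u\|_{L^2}^2 + \|u'\|_{L^2}^2$. The sketch below checks this for the Gaussian $u(x) = e^{-x^2/2}$, whose Fourier transform under the convention $\hat f(\xi) = \int e^{-ix\xi} f(x)\,dx$ is $\sqrt{2\pi}\,e^{-\xi^2/2}$. It is an illustration only; the helper names and the truncation to $[-12, 12]$ are assumptions. Both sides should be close to $\frac32\sqrt{\pi}$.

```python
import math


def integrate(func, lo, hi, steps=20_000):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))


def gaussian(x):
    return math.exp(-0.5 * x * x)


def gaussian_derivative(x):
    return -x * math.exp(-0.5 * x * x)


def gaussian_hat(xi):
    """Fourier transform of exp(-x^2/2) for the convention f^(xi) = int e^{-i x xi} f(x) dx."""
    return math.sqrt(2.0 * math.pi) * math.exp(-0.5 * xi * xi)


def sobolev_norm_sq(u_hat, s, cutoff=12.0):
    """(2 pi)^{-1} * int (1 + xi^2)^s |u^(xi)|^2 d xi in dimension one, truncated to [-cutoff, cutoff]."""
    weighted = lambda xi: (1.0 + xi * xi) ** s * abs(u_hat(xi)) ** 2
    return integrate(weighted, -cutoff, cutoff) / (2.0 * math.pi)


h1_fourier = sobolev_norm_sq(gaussian_hat, 1.0)
h1_direct = (integrate(lambda x: gaussian(x) ** 2, -12.0, 12.0)
             + integrate(lambda x: gaussian_derivative(x) ** 2, -12.0, 12.0))
print(h1_fourier, h1_direct)
```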
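6.16. THM (Characterization of $H^m$) Let $m \in \mathbb{N}_0$, then we have
\[ H^m(\mathbb{R}^n) = \{ u \in \mathcal{S}'(\mathbb{R}^n) : D^\alpha u \in L^2(\mathbb{R}^n) \text{ for all } |\alpha| \le m \}. \]
Furthermore $H^m(\mathbb{R}^n)$ is the completion of $\mathcal{D}(\mathbb{R}^n)$ w.r.t. the norm
\[ \|\varphi\|_{(m)} := \Big( \int \sum_{|\alpha| \le m} |D^\alpha \varphi(x)|^2\, dx \Big)^{\frac12}. \]
Proof: We prove the first assertion by induction. The case $m = 0$ is clear from 6.5(ii) (resp. Plancherel's Theorem). The inductive step is due to Lemma 6.15, since
\[ u \in H^{m+1}(\mathbb{R}^n) \overset{6.15}{\iff} u, D_j u \in H^m(\mathbb{R}^n)\ \ \forall\, 1 \le j \le n \overset{\text{Ind. hyp.}}{\iff} D^\alpha u \in L^2(\mathbb{R}^n)\ \ \forall\, |\alpha| \le m + 1. \]
We now show that the completion of $(\mathcal{D}(\mathbb{R}^n), \|\ \|_{(m)})$ is $H^m(\mathbb{R}^n)$.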

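$\subseteq$: Let $(\varphi_j)_j$ be a Cauchy sequence in $\mathcal{D}(\mathbb{R}^n)$ w.r.t. $\|\ \|_{(m)}$. Then $(D^\alpha \varphi_j)_j$ is a Cauchy sequence in $L^2(\mathbb{R}^n)$ for all $|\alpha| \le m$ and so there exist $u_\alpha \in L^2(\mathbb{R}^n)$ with
\[ D^\alpha \varphi_j \to u_\alpha \quad\text{in } L^2 \]
and we claim $\varphi_j \to u_0$ w.r.t. $\|\ \|_{(m)}$. Hence we have to show that $\|D^\alpha \varphi_j - D^\alpha u_0\|_{L^2} \to 0$ for all $|\alpha| \le m$. To do so it suffices to show $D^\alpha u_0 = u_\alpha$ for all $|\alpha| \le m$, since then $u_0 \in H^m(\mathbb{R}^n)$ by (i). We have for all $\psi \in \mathcal{D}(\mathbb{R}^n)$ (where convergence in both cases is due to the Cauchy-Schwarz inequality)
\[ \int D^\alpha \varphi_j\, \psi \to \int u_\alpha\, \psi \]
and
\[ \int D^\alpha \varphi_j\, \psi = (-1)^{|\alpha|} \int \varphi_j\, D^\alpha \psi \to (-1)^{|\alpha|} \int u_0\, D^\alpha \psi = \int D^\alpha u_0\, \psi. \]
So we obtain $\int (D^\alpha u_0 - u_\alpha)\, \psi = 0$ for all $\psi \in \mathcal{D}(\mathbb{R}^n)$, which establishes the claim due to denseness of $\mathcal{D}(\mathbb{R}^n)$ in $L^2(\mathbb{R}^n)$ (cf. Remark 5.35).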
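$\supseteq$: Let $u \in H^m(\mathbb{R}^n)$. Then by (i) $D^\alpha u \in L^2(\mathbb{R}^n)$ for all $|\alpha| \le m$. We proceed by smoothing $u$: Let $\rho$ be a mollifier and set $u_\varepsilon := u * \rho_\varepsilon$. Then by the standard results on smoothing (see 4.5(ii) and e.g. [Fol99, Theorem 8.14]) we have
\[ D^\alpha u_\varepsilon = D^\alpha (u * \rho_\varepsilon) = (D^\alpha u) * \rho_\varepsilon \quad\text{and}\quad \|D^\alpha u_\varepsilon - D^\alpha u\|_{L^2} \to 0 \quad \forall\, |\alpha| \le m. \tag{6.7} \]
Let now $\chi \in \mathcal{D}(\mathbb{R}^n)$, $0 \le \chi \le 1$ and $\chi \equiv 1$ on $B_1(0)$ and set $g_\varepsilon(x) := \chi(\varepsilon x)\, u_\varepsilon(x)$. Then $g_\varepsilon \in \mathcal{D}(\mathbb{R}^n)$ and we have
\[ D^\alpha (g_\varepsilon - u) = \underbrace{\chi(\varepsilon .) \big( D^\alpha u_\varepsilon - D^\alpha u \big) + \big( \chi(\varepsilon .) - 1 \big) D^\alpha u}_{\to 0 \text{ in } L^2} + \underbrace{D^\alpha \big( \chi(\varepsilon .)\, u_\varepsilon \big) - \chi(\varepsilon .)\, D^\alpha u_\varepsilon}_{= \sum_{0 < \beta \le \alpha} \binom{\alpha}{\beta}\, \varepsilon^{|\beta|}\, (D^\beta \chi)(\varepsilon .)\, D^{\alpha - \beta} u_\varepsilon}. \]
In the last sum $\varepsilon^{|\beta|} \to 0$, while $\|D^\beta \chi\|_{L^\infty} < \infty$ and $\|D^{\alpha-\beta} u_\varepsilon\|_{L^2}$ stays bounded. So by (6.7) $D^\alpha g_\varepsilon \to D^\alpha u$ in $L^2(\mathbb{R}^n)$ for all $|\alpha| \le m$, hence $\|g_\varepsilon - u\|_{(m)} \to 0$.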
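6.17. COR (Further Properties of $H^{(-)m}$) Let $m \in \mathbb{N}_0$.

(i) The norms $\|\ \|_{(m)}$ and $\|\ \|_{H^m}$ are equivalent on $H^m(\mathbb{R}^n)$.
(ii) Let $u \in H^{-m}(\mathbb{R}^n)$. Then there exist $f_\alpha \in L^2(\mathbb{R}^n)$ ($|\alpha| \le m$) such that
\[ u = \sum_{|\alpha| \le m} D^\alpha f_\alpha. \]
Proof: (i) Let $(u_j)_j \subseteq H^m(\mathbb{R}^n)$. Then we have
\begin{align*}
u_j \to 0 \text{ w.r.t. } \|\ \|_{H^m} &\iff \xi^\alpha \hat u_j \to 0 \text{ in } L^2 \text{ for all } |\alpha| \le m \\
&\iff \mathcal{F}(D^\alpha u_j) \to 0 \text{ in } L^2 \text{ for all } |\alpha| \le m \\
&\iff D^\alpha u_j \to 0 \text{ in } L^2 \text{ for all } |\alpha| \le m \iff u_j \to 0 \text{ w.r.t. } \|\ \|_{(m)}.
\end{align*}
(ii) Let $u \in H^{-m}(\mathbb{R}^n)$. Then $(1 + |\xi|^2)^{-\frac m2}\, \hat u \in L^2(\mathbb{R}^n)$ and so $\hat g(\xi) := \hat u(\xi) \big( 1 + \sum_{j=1}^n |\xi_j|^m \big)^{-1} \in L^2(\mathbb{R}^n)$ and finally
\[ \hat u(\xi) = \hat g(\xi) + \sum_{j=1}^n |\xi_j|^m\, \hat g(\xi) = \hat g(\xi) + \sum_{j=1}^n \xi_j^m\, \underbrace{\frac{|\xi_j|^m}{\xi_j^m}\, \hat g(\xi)}_{\in L^2}. \]
Hence the assertion follows by applying $\mathcal{F}^{-1}$.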
6.18. Motivation (The Sobolev embedding Theorems) One of the most useful features of Sobolev spaces is also connected with the fact that Sobolev norms measure smoothness. Indeed if we suppose the Sobolev order, i.e., $s$ in $H^s(\mathbb{R}^n)$, to be high enough as compared to the dimension $n$ of the space, then the functions are actually continuous and vanish at infinity. This means in the context of the $H^m$-spaces: if one can prove the $L^2$-property of enough orders of derivatives one in fact gains regularity. To prove this statement we need two results from the classical theory of the Fourier transform, the Lemma of Riemann-Lebesgue and the Fourier inversion formula for $L^1$-functions.
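6.19. Lemma (Classical Fourier inversion formula) Let $g \in L^1(\mathbb{R}^n)$, then we have
\[ (\mathcal{F}^{-1} g)(x) = (2\pi)^{-n} \int e^{ix\xi}\, g(\xi)\, d\xi, \]
which is a continuous and bounded function.

Proof: Since $g \in L^1(\mathbb{R}^n) \subseteq \mathcal{S}'(\mathbb{R}^n)$ (Remark 5.20(ii)) we conclude from Theorem 5.25 that $\mathcal{F}^{-1} g =: f \in \mathcal{S}'(\mathbb{R}^n)$. So for $\varphi \in \mathcal{S}(\mathbb{R}^n)$ we obtain
\begin{align*}
\langle f, \varphi \rangle &\overset{5.16}{=} (2\pi)^{-n} \langle \hat f, \hat{\check\varphi} \rangle \overset{\hat f = g \in L^1}{=} (2\pi)^{-n} \int g(\xi)\, \hat{\check\varphi}(\xi)\, d\xi \\
&\overset{\varphi \in \mathcal{S}}{=} (2\pi)^{-n} \int g(\xi) \int \varphi(-x)\, e^{-ix\xi}\, dx\, d\xi \overset{\text{Fubini},\ x \mapsto -x}{=} (2\pi)^{-n} \iint g(\xi)\, e^{ix\xi}\, d\xi\ \varphi(x)\, dx,
\end{align*}
which tells us that $(\mathcal{F}^{-1} g)(x) = f(x) = (2\pi)^{-n} \int g(\xi)\, e^{ix\xi}\, d\xi$.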
6.20. Lemma (Riemann-Lebesgue) If $f \in L^1(\mathbb{R}^n)$ then $\hat f \in C^0_0(\mathbb{R}^n)$ (where $C^0_0 = \{ f \in C^0(\mathbb{R}^n) \mid \lim_{|x| \to \infty} f(x) = 0 \}$ is the space of continuous functions vanishing at infinity).
Proof: In view of Theorem 5.3(i) we only have to show that $\hat f$ vanishes at infinity. This is elementary for $f$ being the characteristic function of a rectangle. Indeed for $n = 1$ we have
\[ \hat f(\xi) = \int_a^b e^{-ix\xi}\, dx = \frac{e^{-ia\xi} - e^{-ib\xi}}{i\xi} \to 0 \qquad (|\xi| \to \infty) \]
and the general case is analogous. The case of a general $f \in L^1(\mathbb{R}^n)$ follows since $\|\hat f\|_{L^\infty} \le \|f\|_{L^1}$ and the characteristic functions of rectangles are a total set in $L^1(\mathbb{R}^n)$ (i.e., their finite linear combinations are dense).
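The one-dimensional formula in the proof gives the explicit decay rate $|\hat f(\xi)| \le 2/|\xi|$ for the indicator of $[a, b]$. The sketch below is an illustration with hypothetical helper names: it evaluates the closed form, compares it with an independent midpoint-rule approximation of $\int_a^b e^{-ix\xi}\,dx$, and prints the values for growing $\xi$.

```python
import cmath


def indicator_hat(xi, a, b):
    """Fourier transform of the indicator of [a, b]: int_a^b e^{-i x xi} dx in closed form."""
    if xi == 0.0:
        return complex(b - a)
    return (cmath.exp(-1j * a * xi) - cmath.exp(-1j * b * xi)) / (1j * xi)


def indicator_hat_quadrature(xi, a, b, steps=20_000):
    """Midpoint-rule approximation of the same integral, used as an independent check."""
    h = (b - a) / steps
    return h * sum(cmath.exp(-1j * (a + (m + 0.5) * h) * xi) for m in range(steps))


for xi in (1.0, 10.0, 100.0, 1000.0):
    # The modulus is bounded by 2 / xi and therefore tends to zero.
    print(xi, abs(indicator_hat(xi, -1.0, 2.0)), 2.0 / xi)
```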
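6.21. THM (Sobolev embedding)

(i) If $s > \frac n2$ then $H^s(\mathbb{R}^n) \subseteq C^0_0(\mathbb{R}^n)$.
(ii) If $s > k + \frac n2$ then $H^s(\mathbb{R}^n) \subseteq C^k_0(\mathbb{R}^n)$ ($= \{ f \in C^k(\mathbb{R}^n) \mid \lim_{|x| \to \infty} \partial^\alpha f(x) = 0\ \ \forall\, |\alpha| \le k \}$).
Proof: (i) Let $u \in H^s(\mathbb{R}^n)$ with $s > n/2$. Note that $\xi \mapsto (1 + |\xi|^2)^{-s} \in L^1(\mathbb{R}^n)$ and we set $f = \lambda^s \hat u$, which by definition is in $L^2(\mathbb{R}^n)$ with $\|f\|_{L^2} = (2\pi)^{n/2} \|u\|_{H^s}$. So we obtain using the Cauchy-Schwarz inequality
\[ \|\hat u\|_{L^1} \le \|f\|_{L^2} \Big( \int \underbrace{(1 + |\xi|^2)^{-s}}_{\in L^1}\, d\xi \Big)^{\frac12} \le C \|f\|_{L^2} \le C \|u\|_{H^s}, \]
hence $\hat u \in L^1(\mathbb{R}^n)$. So Lemma 6.19 tells us that $u = \mathcal{F}^{-1} \hat u$, where $\mathcal{F}^{-1}$ is the classical inverse Fourier transform. Finally by Lemma 6.20 (applied to $\mathcal{F}^{-1}$) $u$ is in $C^0_0$.

(ii) Let $u \in H^s(\mathbb{R}^n)$ with $s > k + n/2$. Then by Proposition 6.9(i),(ii) $D^\alpha u \in H^{s-k}(\mathbb{R}^n)$ for all $|\alpha| \le k$ and by (i) $D^\alpha u \in C^0_0(\mathbb{R}^n)$ for these $\alpha$. Finally by Lemma 2.25 $D^\alpha u$ is the classical derivative of the $C^k$-function $u$ for all $|\alpha| \le k$.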
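Tracking the constants in part (i) gives $\sup|u| \le (2\pi)^{-n}\|\hat u\|_{L^1} \le C(s)\,\|u\|_{H^s}$ with $C(s) = \big((2\pi)^{-n}\int (1+|\xi|^2)^{-s}\,d\xi\big)^{1/2}$. The sketch below evaluates this constant for $n = 1$; it is an illustration only, and the helper `embedding_constant` as well as the substitution $\xi = \tan\theta$ used for the quadrature are choices made here, not taken from the notes. For $s = 1$ one gets $C = 1/\sqrt2$, and $u(x) = e^{-|x|}$ (with $\sup|u| = 1$ and $\|u\|_{H^1}^2 = 2$) attains equality.

```python
import math


def embedding_constant(s, steps=100_000):
    """C(s) = ((2 pi)^{-1} * int_R (1 + xi^2)^{-s} d xi)^{1/2} for n = 1 and s > 1/2.

    The substitution xi = tan(theta) turns the integral into
    int_{-pi/2}^{pi/2} cos(theta)^(2s - 2) d theta, which is evaluated by the midpoint rule
    (accurate for s >= 1; convergence is slow when s is close to 1/2).
    """
    if s <= 0.5:
        raise ValueError("the integral diverges for s <= 1/2")
    h = math.pi / steps
    total = sum(math.cos(-math.pi / 2 + (m + 0.5) * h) ** (2.0 * s - 2.0) for m in range(steps))
    return math.sqrt(total * h / (2.0 * math.pi))


h1_norm_exp = math.sqrt(2.0)                             # u(x) = exp(-|x|), sup|u| = 1
h1_norm_gauss = math.sqrt(1.5 * math.sqrt(math.pi))      # u(x) = exp(-x^2/2), sup|u| = 1
print(embedding_constant(1.0) * h1_norm_exp)    # bound for exp(-|x|): the sharp case
print(embedding_constant(1.0) * h1_norm_gauss)  # bound for the Gaussian: strictly above 1
```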
6.22. COR ($H^\infty$ functions are smooth) If $u \in H^\infty(\mathbb{R}^n)$ then $u \in C^\infty_0(\mathbb{R}^n)$.

6.23. REM ($H^s$ is closed under multiplication by test functions) One may show that
\[ u \in H^s(\mathbb{R}^n),\ \varphi \in \mathcal{S}(\mathbb{R}^n) \implies \varphi u \in H^s(\mathbb{R}^n) \]
with the map $u \mapsto \varphi u$ being continuous on $H^s(\mathbb{R}^n)$. This result tells us that PDOs with $\mathcal{S}$-coefficients operate continuously on the scale of Sobolev spaces. A proof involves Young's inequality (cf. Remark 5.40) for $p = 1$ and $q = 2$ (hence $r = 2$) and Peetre's inequality, which can be proven by elementary means and reads
\[ \Big( \frac{1 + |\xi|^2}{1 + |\eta|^2} \Big)^t \le 2^{|t|}\, (1 + |\xi - \eta|^2)^{|t|} \]
for $t \in \mathbb{R}$ and $\xi, \eta \in \mathbb{R}^n$. For details see [FJ98, p. 125.].
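Peetre's inequality is easy to probe numerically. The sketch below is an illustration only (the helper `peetre_ratio`, the sampling box and the seed are arbitrary choices): it draws random $\xi, \eta \in \mathbb{R}^3$ and exponents $t \in [-4, 4]$ and records the largest quotient of the left-hand side by the right-hand side, which should never exceed one.

```python
import random


def peetre_ratio(xi, eta, t):
    """Left-hand side of Peetre's inequality divided by its right-hand side."""
    def norm_sq(v):
        return sum(c * c for c in v)

    diff = [a - b for a, b in zip(xi, eta)]
    lhs = ((1.0 + norm_sq(xi)) / (1.0 + norm_sq(eta))) ** t
    rhs = 2.0 ** abs(t) * (1.0 + norm_sq(diff)) ** abs(t)
    return lhs / rhs


random.seed(0)
worst = 0.0
for _ in range(2000):
    xi = [random.uniform(-50.0, 50.0) for _ in range(3)]
    eta = [random.uniform(-50.0, 50.0) for _ in range(3)]
    t = random.uniform(-4.0, 4.0)
    worst = max(worst, peetre_ratio(xi, eta, t))
print(worst <= 1.0)
```

A random search of this kind can only fail to find a counterexample; the proof itself rests on the elementary estimate $1 + |\xi|^2 \le 2\,(1 + |\xi - \eta|^2)(1 + |\eta|^2)$.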
6.24. Outlook (More on Sobolev Spaces) The theory of Sobolev spaces is vast and has a lot of applications in the theory of PDE, see e.g. [Fol95, Chapter 6] for a start. One striking feature is Rellich's theorem, which states that under certain conditions the embedding $H^s \hookrightarrow H^t$ ($s > t$) is compact, hence from any bounded sequence in $H^s$ one may extract an $H^t$-converging subsequence, an argument which is frequently used in existence proofs in PDE.
A standard reference on Sobolev spaces, however, with emphasis put on the $L^p$-based spaces of integer order, i.e.,
\[ W^{m,p}(\Omega) := \{ u \in L^p(\Omega) : D^\alpha u \in L^p(\Omega) \text{ for all } |\alpha| \le m \} \]
for $m \in \mathbb{N}_0$ and $1 \le p \le \infty$, is the book [Ada75] resp. [AF03].

6.2. THE SINGULAR SUPPORT OF A DISTRIBUTION
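6.25. Motivation During our study we have seen several examples of distributions that were regular distributions off some "small set". E.g. we know that
\[ \mathrm{vp}(\tfrac1x) \in \mathcal{D}'(\mathbb{R}), \qquad \tfrac1x \notin L^1_{\mathrm{loc}}(\mathbb{R}) \qquad\text{but}\qquad \mathrm{vp}(\tfrac1x)\big|_{\mathbb{R} \setminus \{0\}} \overset{\text{Notation 1.28}}{=} u_{1/x}, \]
hence $\mathrm{vp}(\frac1x)$ is a regular distribution on $\mathbb{R} \setminus \{0\}$. Moreover, $\mathrm{vp}(\frac1x)|_{\mathbb{R} \setminus \{0\}} = u_{1/x} = \frac1x \in C^\infty(\mathbb{R} \setminus \{0\})$ is even smooth off the origin.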
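6.26. DEF (Singular support) Let $u \in \mathcal{D}'(\Omega)$. We call the set
\[ \mathrm{singsupp}(u) := \Omega \setminus \{ x_0 \in \Omega : \exists\, U \text{ open neighbourhood of } x_0 \text{ such that } u|_U \in C^\infty(U) \} \]
the singular support of $u$. (Obviously it is the complement of the largest open set where $u$ is smooth, hence, in particular, closed.)
6.27. Examples (Singular support)

(i) $\mathrm{singsupp}(\delta) = \mathrm{singsupp}(H) = \mathrm{singsupp}(\mathrm{vp}(1/x)) = \{0\}$
(ii) $\mathrm{singsupp}(H(1 - x^2 - y^2)) = S^1$
6.28. PROP (Basic properties of singsupp)

(i) For $u \in \mathcal{D}'(\mathbb{R}^n)$ and $v \in \mathcal{E}'(\mathbb{R}^n)$ we have
\[ \mathrm{singsupp}(u * v) \subseteq \mathrm{singsupp}(u) + \mathrm{singsupp}(v). \]
(ii) Let $P(x, D)$ be a linear PDO with smooth coefficients on $\Omega$, then we have for all $u \in \mathcal{D}'(\Omega)$
\[ \mathrm{singsupp}(P(x, D) u) \subseteq \mathrm{singsupp}(u). \]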

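Proof: (i) Let $K := \mathrm{singsupp}(v) \subseteq \mathrm{supp}(v)$, which is a compact set, and let $\chi \in \mathcal{D}(\mathbb{R}^n)$ be a cut-off function with $\chi \equiv 1$ on a neighbourhood of $K$. Furthermore let $\psi \in C^\infty(\mathbb{R}^n)$ with $\psi(x) = 1$ for all $x$ in a neighbourhood of $\mathrm{singsupp}(u)$. We then have
\[ (1 - \chi) v \in \mathcal{D}(\mathbb{R}^n) \quad\text{and}\quad (1 - \psi) u \in C^\infty(\mathbb{R}^n). \]
Hence we find
\begin{align*}
u * v &= (\psi u + (1 - \psi) u) * (\chi v + (1 - \chi) v) \\
&= \psi u * \chi v + \underbrace{\psi u}_{\in \mathcal{D}'} * \underbrace{(1 - \chi) v}_{\in \mathcal{D}} + \underbrace{(1 - \psi) u}_{\in C^\infty} * \underbrace{\chi v}_{\in \mathcal{E}'} + \underbrace{(1 - \psi) u}_{\in C^\infty} * \underbrace{(1 - \chi) v}_{\in \mathcal{D}} \\
&= \psi u * \chi v + f \quad\text{with } f \in C^\infty(\mathbb{R}^n) \text{ by Theorem 4.8.}
\end{align*}
So
\[ \mathrm{singsupp}(u * v) \subseteq \mathrm{singsupp}(\psi u * \chi v) \subseteq \mathrm{supp}(\psi u * \chi v) \overset{4.5(i)}{\subseteq} \mathrm{supp}(\psi u) + \mathrm{supp}(\chi v) \subseteq \mathrm{supp}(\psi) + \mathrm{supp}(\chi), \]
and the claim follows from taking the intersection over the supports of all such $\psi$ and $\chi$.
(ii) Let $x_0 \notin \mathrm{singsupp}(u)$. Then there exists $U$, a neighbourhood of $x_0$, with $u|_U \in C^\infty(U)$. This implies $(Pu)|_U = P|_{C^\infty(U)}\, u|_U \in C^\infty(U)$, hence $x_0 \notin \mathrm{singsupp}(Pu)$.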
6.29. Motivation In the following we are going to analyze the interrelation between smoothness of a distribution and the fall-off of its Fourier transform. The guiding example in this context is:
\[ \hat\delta = 1: \quad\text{no smoothness, no fall-off.} \]
This idea actually carries very far and is the starting point of microlocal analysis, which allows us to describe not only the singular support of a distribution but also its "bad" frequency directions. This theory, however, lies beyond the goals of this course. An introductory text is [FJ98, Chapter 11] and the ultimate text, of course, is Chapter VIII of [Hor90].
The basic observation is the following:
6.30. LEM (Smoothness vs. fall-off of the Fourier transform) Let $u \in \mathcal{E}'(\mathbb{R}^n)$. Then the following two conditions are equivalent:
(i) $u$ is smooth (i.e., $u \in C^\infty \cap \mathcal{E}' = \mathcal{D}$).
(ii) $\hat u$ is rapidly decreasing, that is
\[ \forall\, l \in \mathbb{N}_0\ \ \exists\, C > 0 : \quad |\hat u(\xi)| \le \frac{C}{(1 + |\xi|)^l} \qquad \forall\, \xi \in \mathbb{R}^n. \]

Proof: (i) $\Rightarrow$ (ii): Since $u \in \mathcal{D} \subseteq \mathcal{S}$ also $\hat u \in \mathcal{S}$ and hence the estimates hold.
(ii) $\Rightarrow$ (i): By assumption $u \in \mathcal{E}'$ and so $\hat u$ is smooth (Theorem 5.30). Moreover, from the exchange formula (Proposition 5.26(i)) $\mathcal{F}(\partial^\alpha u)(\xi) = i^{|\alpha|} \xi^\alpha \hat u(\xi)$ we see that also $\mathcal{F}(\partial^\alpha u)$ is rapidly decreasing. So $\mathcal{F}(\partial^\alpha u) \in L^1(\mathbb{R}^n)$ for all $\alpha \in \mathbb{N}_0^n$, hence by Lemma 6.19 and Theorem 5.3(i) $\partial^\alpha u \in C^0(\mathbb{R}^n)$ for all $\alpha$ and so $u \in C^\infty(\mathbb{R}^n)$.
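6.31. THM (Singular support and fall-off of the Fourier transform) Let $u \in \mathcal{D}'(\Omega)$ and $x_0 \in \Omega$. Then we have:
\[ x_0 \notin \mathrm{singsupp}(u) \iff \exists\, \varphi \in \mathcal{D}(\Omega),\ \varphi(x_0) \ne 0 \text{ with } \mathcal{F}(\varphi u) \text{ rapidly decreasing.} \]
Proof: $\Rightarrow$: Choose $\mathrm{supp}(\varphi)$ so small that $\varphi u \in \mathcal{D}(\Omega) \subseteq \mathcal{S}(\mathbb{R}^n)$, then $\mathcal{F}(\varphi u) \in \mathcal{S}(\mathbb{R}^n)$, hence it is rapidly decreasing.
$\Leftarrow$: Since $\varphi u \in \mathcal{E}'(\mathbb{R}^n)$ and $\mathcal{F}(\varphi u)$ is rapidly decreasing, Lemma 6.30 gives $\varphi u \in \mathcal{D}(\mathbb{R}^n)$. As $\varphi(x_0) \ne 0$ there is a neighbourhood $U$ of $x_0$ on which $\varphi$ does not vanish, so $u|_U = (\varphi u)|_U / \varphi|_U$ is smooth.

6.32. Observation and Outlook

(i) Theorem 6.31 says that points in the singular support of a distribution are characterized by the presence of high frequency parts of its Fourier transform.
(ii) Lemma 6.30, which actually is the key to Theorem 6.31, can be strengthened by using the Fourier-Laplace transform and a bit of complex analysis. This is the task of the next section.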

6.3. THE THEOREM OF PALEY-WIENER-SCHWARTZ
6.33. Motivation (The Laplace and the Fourier-Laplace transform) For a function $f$ on $\mathbb{R}^n$ the Laplace transform is defined by
\[ p \mapsto \int e^{-px} f(x)\, dx \qquad (p \in \mathbb{C}^n). \]
Setting $p = i\zeta$ we obtain
\[ \zeta \mapsto \int e^{-ix\zeta} f(x)\, dx, \tag{6.8} \]
which formally equals the Fourier transform but with a complex "dual" variable $\zeta$ and reduces to the Fourier transform if $\zeta = \xi \in \mathbb{R}^n$. If $f$ is a bounded measurable function with compact support then (6.8) defines an analytic function on $\mathbb{C}^n$. We are going to extend these notions to tempered distributions and bring in some complex variable techniques.
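6.34. Facts (Analytic functions of several complex variables) Let $f \in C^1(X)$ with $X \subseteq \mathbb{C}^n$ open, then
\[ df = \sum_{j=1}^n \frac{\partial f}{\partial z_j}\, dz_j + \sum_{j=1}^n \frac{\partial f}{\partial \bar z_j}\, d\bar z_j, \]
where we have used the notation
\begin{align*}
z &= (z_1, \dots, z_n) = (x_1 + i y_1, \dots, x_n + i y_n), \\
\bar z &= (\bar z_1, \dots, \bar z_n) = (x_1 - i y_1, \dots, x_n - i y_n), \\
\frac{\partial}{\partial z_j} &= \frac12 \big( \partial_{x_j} - i\, \partial_{y_j} \big), \qquad \frac{\partial}{\partial \bar z_j} = \frac12 \big( \partial_{x_j} + i\, \partial_{y_j} \big).
\end{align*}
The function $f$ is called analytic if the Cauchy-Riemann differential equations, i.e., $\partial f / \partial \bar z_j = 0$, hold for all $1 \le j \le n$.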
For $\zeta \in \mathbb{C}^n$ and $r = (r_1, \dots, r_n) \in (\mathbb{R}^+)^n$ we call the set
\[ D(\zeta, r) := \{ z : |z_j - \zeta_j| < r_j\ (1 \le j \le n) \} \]
a polydisc of radius $r$ around $\zeta$ and we clearly have $D(\zeta, r) = D(\zeta_1, r_1) \times \dots \times D(\zeta_n, r_n)$. A repeated application of the one-dimensional Cauchy formula gives for any analytic function $f$ on $X$ and any $z \in D(\zeta, r)$, a polydisc with closure in $X$,
\[ f(z) = \Big( \frac{1}{2\pi i} \Big)^n \int_{\partial D_1} \dots \int_{\partial D_n} \frac{f(\omega_1, \dots, \omega_n)}{(\omega_1 - z_1) \cdots (\omega_n - z_n)}\, d\omega_1 \dots d\omega_n, \tag{6.9} \]
where $D_j := D(\zeta_j, r_j)$. Since differentiation under the integral is permitted we see that $f \in C^\infty(D(\zeta, r))$. So the derivatives $\partial^\alpha f$ of $f$ satisfy the Cauchy-Riemann equations, hence are analytic in $D(\zeta, r)$ as well. By the fact that $X$ can be covered by polydiscs we have that any analytic $f$ on $X$ is actually smooth on $X$ with all its derivatives again analytic on $X$.
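We may now proceed as in the one-dimensional case to see that any $f$ that is analytic on $X$ has a power series expansion around any point $\zeta \in X$. To make this more explicit note that the series
\[ \sum_{|\alpha| \ge 0} \frac{(z - \zeta)^\alpha}{(\omega_1 - \zeta_1) \cdots (\omega_n - \zeta_n)\, (\omega - \zeta)^\alpha} = \frac{1}{(\omega_1 - z_1) \cdots (\omega_n - z_n)} \]
converges uniformly and absolutely on any compact subset of $D(\zeta, r)$. Hence we may integrate in (6.9) term by term to obtain
\begin{align*}
f(z) &= \Big( \frac{1}{2\pi i} \Big)^n \sum_{|\alpha| \ge 0} (z - \zeta)^\alpha \int_{\partial D_1} \dots \int_{\partial D_n} \frac{f(\omega_1, \dots, \omega_n)}{(\omega_1 - \zeta_1) \cdots (\omega_n - \zeta_n)\, (\omega - \zeta)^\alpha}\, d\omega_1 \dots d\omega_n \\
&= \sum_{|\alpha| \ge 0} \frac{(z - \zeta)^\alpha}{\alpha!}\, \partial^\alpha f(\zeta), \tag{6.10}
\end{align*}
with the convergence being absolute and uniform on compact subsets of $D(\zeta, r)$. For the last equality we have used
\[ \partial^\alpha f(\zeta) = \alpha!\, \Big( \frac{1}{2\pi i} \Big)^n \int_{\partial D_1} \dots \int_{\partial D_n} \frac{f(\omega)}{(\omega_1 - \zeta_1) \cdots (\omega_n - \zeta_n)\, (\omega - \zeta)^\alpha}\, d\omega_1 \dots d\omega_n, \]
which again is a consequence of (6.9).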

The only fact about these basic properties of analytic functions we are going to use in the sequel is uniqueness of the analytic extension, i.e., the following statement.
Let $X \subseteq \mathbb{C}^n$ be open and connected. If $f$ is analytic on $X$ and there is a point $\zeta \in X$ with $\partial^\alpha f(\zeta) = 0$ for all $\alpha$, then $f = 0$ on $X$.
To prove this assertion set $Y := \{ z \in X : \partial^\alpha f(z) = 0\ \ \forall\, \alpha \}$, which is closed, being the intersection of a family of closed sets. But by (6.10) each point in $Y$ has a polydisc-shaped neighbourhood contained in $Y$, so $Y$ is also open. By connectedness of $X$ we have that $Y = X$ or $Y = \emptyset$. The latter is impossible since $\zeta \in Y$ and we are done.
Finally recall from Theorem 5.30 that for any $v \in \mathcal{E}'(\mathbb{R}^n)$ the Fourier transform, which is given by
\[ \hat v(\xi) = \langle v(x), e^{-ix\xi} \rangle \qquad (\xi \in \mathbb{R}^n), \]
can be extended to a holomorphic function on $\mathbb{C}^n$. This leads the way to the following definition.

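6.35. DEF (Fourier-Laplace transform) Let $v \in \mathcal{E}'(\mathbb{R}^n)$. Then we call the function
\[ \hat v(\zeta) := \langle v(x), e^{-ix\zeta} \rangle \qquad (\zeta \in \mathbb{C}^n) \tag{6.11} \]
the Fourier-Laplace transform of $v$.
6.36. Observation
(i) As already indicated above, Theorem 5.30 tells us that the Fourier-Laplace transform of any $v \in \mathcal{E}'(\mathbb{R}^n)$ is an entire function.
(ii) In case $v \in \mathcal{D}(\mathbb{R}^n)$ ($= \mathcal{E}'(\mathbb{R}^n) \cap C^\infty(\mathbb{R}^n)$) equation (6.11) obviously reduces to
\[ \hat v(\zeta) = \int_{\mathbb{R}^n} v(x)\, e^{-ix\zeta}\, dx \tag{6.12} \]
and in case $\zeta = \xi \in \mathbb{R}^n$ (6.11) gives back the Fourier transform.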
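6.37. PROP (Support of $v$ vs. fall-off of the Fourier-Laplace transform)

(i) Let $v \in \mathcal{E}'(\mathbb{R}^n)$ with $\mathrm{supp}(v) \subseteq \overline{B_a(0)}$. Then there exist $C, N > 0$ such that
\[ |\hat v(\zeta)| \le C (1 + |\zeta|)^N e^{a |\mathrm{Im}\, \zeta|} \qquad (\zeta \in \mathbb{C}^n). \tag{6.13} \]
(ii) Let $v \in \mathcal{D}(\mathbb{R}^n)$ with $\mathrm{supp}(v) \subseteq \overline{B_a(0)}$. Then for all $m > 0$ there exist $C_m > 0$ such that
\[ |\hat v(\zeta)| \le C_m (1 + |\zeta|)^{-m} e^{a |\mathrm{Im}\, \zeta|} \qquad (\zeta \in \mathbb{C}^n). \tag{6.14} \]
Proof: (i) Let $\chi \in C^\infty(\mathbb{R})$ such that $\chi(t) \equiv 0$ for $t \le -1$ and $\chi(t) \equiv 1$ for $t \ge -1/2$ and set
\[ \chi_\zeta(x) := \chi\big( |\zeta| (a - |x|) \big) \qquad (\zeta \in \mathbb{C}^n). \]
We then have $\chi_\zeta \equiv 1$ for $\zeta = 0$ while for $\zeta \ne 0$ we find
\[ \chi_\zeta(x) \equiv 0 \text{ for } |x| \ge a + |\zeta|^{-1} \quad\text{and}\quad \chi_\zeta(x) \equiv 1 \text{ for } |x| \le a + \tfrac12 |\zeta|^{-1}, \]
hence, in particular, $\chi_\zeta \in \mathcal{D}(\mathbb{R}^n)$ for $\zeta \ne 0$. By the support condition on $v$ we may write
\[ \hat v(\zeta) = \langle v(x), \chi_\zeta(x)\, e^{-ix\zeta} \rangle. \]
Now since $\hat v$ is smooth, it is bounded for, say, $|\zeta| \le 1$. To obtain estimate (6.13) for large $\zeta$ we observe that for $|\zeta| \ge 1$ we have $\mathrm{supp}(\chi_\zeta) \subseteq \{ |x| \le a + 1 \} =: K$. So we may use (SN) to obtain the existence of $N, C$ with
\[ |\hat v(\zeta)| \le C \sum_{|\alpha| \le N} \| D_x^\alpha ( \chi_\zeta(x)\, e^{-ix\zeta} ) \|_{L^\infty(K)}, \]
hence the claim follows from the Leibniz rule since for $|x| \le a + |\zeta|^{-1}$
\[ |D_x^\beta \chi_\zeta(x)| \le C_\beta |\zeta|^{|\beta|} \quad\text{and}\quad |D_x^\gamma e^{-ix\zeta}| = |\zeta^\gamma|\, e^{x\, \mathrm{Im}\, \zeta} \le |\zeta|^{|\gamma|} e^{|\mathrm{Im}\, \zeta| (a + |\zeta|^{-1})} \le e\, |\zeta|^{|\gamma|} e^{a |\mathrm{Im}\, \zeta|}. \]
(ii) Let now $v \in \mathcal{D}(\mathbb{R}^n)$, then by (6.12) $\hat v(\zeta) = \int e^{-ix\zeta} v(x)\, dx$ and we may use integration by parts to obtain
\[ \zeta^\alpha\, \hat v(\zeta) = \int e^{-ix\zeta}\, D^\alpha v(x)\, dx \qquad \forall\, \alpha \in \mathbb{N}_0^n. \]
So for all $\alpha$
\[ |\zeta^\alpha\, \hat v(\zeta)| \le \| D^\alpha v \|_{L^\infty(\mathbb{R}^n)} \sup_{x \in \mathrm{supp}(v)} e^{x\, \mathrm{Im}\, \zeta} \int_{|x| \le a} dx \le C \| D^\alpha v \|_{L^\infty(\mathbb{R}^n)}\, e^{a |\mathrm{Im}\, \zeta|}, \]
which gives the claim.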
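The following Theorem shows that the above estimates are actually characterizing.
6.38. THM (Paley-Wiener-Schwartz) Let $a > 0$ and let $f : \mathbb{C}^n \to \mathbb{C}$ be analytic.
(i) $f$ is the Fourier-Laplace transform of some $u \in \mathcal{E}'(\mathbb{R}^n)$ with $\mathrm{supp}(u) \subseteq \overline{B_a(0)}$ if and only if
\[ \exists\, C, N > 0 : \quad |f(\zeta)| \le C (1 + |\zeta|)^N e^{a |\mathrm{Im}\, \zeta|} \qquad (\zeta \in \mathbb{C}^n). \tag{6.15} \]
(ii) $f$ is the Fourier-Laplace transform of some $u \in \mathcal{D}(\mathbb{R}^n)$ with $\mathrm{supp}(u) \subseteq \overline{B_a(0)}$ if and only if
\[ \forall\, m > 0\ \ \exists\, C_m > 0 : \quad |f(\zeta)| \le C_m (1 + |\zeta|)^{-m} e^{a |\mathrm{Im}\, \zeta|} \qquad (\zeta \in \mathbb{C}^n). \tag{6.16} \]
Proof: (i),(ii) $\Rightarrow$: In both cases this is just Proposition 6.37.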

(ii) $\Leftarrow$: To begin with we set $m = n + 1$ and $\zeta = \xi \in \mathbb{R}^n$ in (6.16). Then $\xi \mapsto f(\xi)$ is in $L^1(\mathbb{R}^n)$ and by the same reasoning as in 5.3(i) we find that
\[ u(x) := (2\pi)^{-n} \int e^{ix\xi} f(\xi)\, d\xi \tag{6.17} \]
is continuous.
Now setting $m = |\alpha| + n + 1$ in (6.16) we find that also $\xi \mapsto \xi^\alpha f(\xi)$ is in $L^1(\mathbb{R}^n)$ for all $\alpha$ and we may differentiate under the integral in (6.17). So we have that $u \in C^\infty(\mathbb{R}^n)$.

We now claim that
\[ \mathrm{supp}(u) \subseteq \{ |x| \le a \}. \tag{6.18} \]
Since each of the functions $\zeta_j \mapsto f(\zeta)$ is analytic we may use Cauchy's theorem in each of the variables $\zeta_j$ ($1 \le j \le n$) to shift the integration in (6.17) into the complex domain. By (6.16) the integrals parallel to the imaginary axis vanish and we may replace (6.17) by
\[ u(x) = (2\pi)^{-n} \int_{\mathrm{Im}\, \zeta = \eta} e^{ix\zeta} f(\zeta)\, d\zeta = (2\pi)^{-n} \int_{\mathbb{R}^n} e^{ix(\xi + i\eta)} f(\xi + i\eta)\, d\xi, \]
where $\eta \in \mathbb{R}^n$ is arbitrary. Now again setting $m = n + 1$ in (6.16) we obtain
\[ |u(x)| \le (2\pi)^{-n}\, C_{n+1}\, e^{a|\eta| - x\eta} \int_{\mathbb{R}^n} \underbrace{(1 + |\xi + i\eta|)^{-n-1}}_{\text{integrable}}\, d\xi \le C e^{a|\eta| - x\eta}. \]
We now set $\eta = t x / |x|$ ($t > 0$) and obtain $|u(x)| \le C e^{(a - |x|) t}$. By taking the limit $t \to \infty$ we see that $u(x) = 0$ if $|x| > a$, which establishes (6.18).
Knowing that $u \in \mathcal{D}(\mathbb{R}^n)$ we may apply the Fourier inversion formula in $\mathcal{S}$ (Lemma 5.14) to obtain $\hat u(\xi) = f(\xi)$ for all $\xi \in \mathbb{R}^n$. Moreover we know from Theorem 5.30 that $\hat u$ extends to an analytic function on $\mathbb{C}^n$, so by uniqueness of analytic continuation (cf. 6.34) we obtain $\hat u = f$ on $\mathbb{C}^n$.
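(i) $\Leftarrow$: Since $f$ is analytic and polynomially bounded on $\mathbb{R}^n$ by (6.15), we have $f|_{\mathbb{R}^n} \in \mathcal{S}'(\mathbb{R}^n)$. So by Theorem 5.25 $f|_{\mathbb{R}^n}$ is the Fourier transform of some $u \in \mathcal{S}'(\mathbb{R}^n)$, i.e., $\hat u = f|_{\mathbb{R}^n}$.
Let now $\rho$ be a mollifier and set $u_\varepsilon := u * \rho_\varepsilon$. Then by the convolution Theorem 5.33 and formula (5.10) we find
\[ \widehat{u_\varepsilon}(\xi) = \widehat{\rho_\varepsilon}(\xi)\, \hat u(\xi) = \hat\rho(\varepsilon \xi)\, f(\xi). \]
Next we combine 6.37(ii) for $\hat\rho(\varepsilon .)$ with (6.15) to obtain that $\widehat{u_\varepsilon}$ extends to an analytic function on $\mathbb{C}^n$ such that for all $m \in \mathbb{N}_0$ the estimate
\[ |\widehat{u_\varepsilon}(\zeta)| \le C_m (1 + \varepsilon |\zeta|)^{-m}\, C (1 + |\zeta|)^N e^{(a + \varepsilon) |\mathrm{Im}\, \zeta|} \]
holds. Upon replacing $m$ by $m + N$ and noting that for $\varepsilon \le 1$
\[ \frac{(1 + |\zeta|)^N}{(1 + \varepsilon |\zeta|)^{m+N}} \le \varepsilon^{-(m+N)} (1 + |\zeta|)^{-m} \]
we see that $\widehat{u_\varepsilon}$ satisfies (6.16) with $a + \varepsilon$ replacing $a$. So by (ii) we find some $v \in \mathcal{D}(\mathbb{R}^n)$ with $\hat v = \widehat{u_\varepsilon}$ and $\mathrm{supp}(v) \subseteq \{ |x| \le a + \varepsilon \}$. Hence $v = u_\varepsilon$ and we obtain $\mathrm{supp}(u_\varepsilon) \subseteq \{ |x| \le a + \varepsilon \}$.

Next we show that actually $\mathrm{supp}(u) \subseteq \{ |x| \le a \}$. If $x_0 \notin \overline{B_a(0)}$ then there exist $\varepsilon_0 > 0$ and a neighbourhood $V$ of $x_0$ such that $|y| > a + \varepsilon_0$ for all $y \in V$. So by the above $u_\varepsilon = 0$ on $V$ for all $\varepsilon < \varepsilon_0$ and we have for all $\varphi \in \mathcal{D}(V)$
\[ \langle u, \varphi \rangle = \lim_{\varepsilon \to 0} \langle u_\varepsilon, \varphi \rangle = 0. \]
So $u|_V = 0$, hence $x_0 \notin \mathrm{supp}(u)$.

Finally we proceed as in (ii): Again Theorem 5.30 says that $\hat u$ extends analytically to $\mathbb{C}^n$ and by the fact that $\hat u|_{\mathbb{R}^n} = f|_{\mathbb{R}^n}$ and by uniqueness of analytic continuation (6.34) we have $\hat u = f$ on $\mathbb{C}^n$ and we are done.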
n


6.4. REGULARITY AND PARTIAL DIFFERENTIAL OPERATORS
6.39. Intro In this final section of chapter 6 we apply the notions of section 6.2 to the study of PDOs. Recall that we denote PDOs with smooth coefficients on $\Omega$ of order $m$ by
\[ P(x, D) = \sum_{|\alpha| \le m} a_\alpha(x)\, D^\alpha, \]
where $a_\alpha$ are smooth functions on $\Omega$.

We are going to introduce the notion of hypoellipticity, which is fundamental for linear PDOs and defined by the requirement that the singular support of $Pu$ equals the singular support of $u$. We prove the elliptic regularity theorem in the constant-coefficients case; it asserts that all elliptic PDOs are hypoelliptic.
6.40. DEF (Hypoellipticity) Let $P(x, D)$ be a linear PDO with smooth coefficients on $\Omega$. We say that $P(x, D)$ is hypoelliptic if for all $u \in \mathcal{D}'(\Omega)$
\[ \mathrm{singsupp}\, P(x, D) u = \mathrm{singsupp}(u). \tag{6.19} \]

6.41. Observation (The meaning of hypoellipticity) In view of 6.28(ii) (i.e., "$\subseteq$" in (6.19) is always true) hypoellipticity means that $\mathrm{singsupp}(u) \subseteq \mathrm{singsupp}(Pu)$. In particular, we have
\[ P(x, D) u \in C^\infty(\Omega) \implies u \in C^\infty(\Omega) \]
and in the context of the PDE $P(x, D) u = f$ we have that smoothness of the right hand side implies smoothness of the solution.
Hence for general PDOs with smooth coefficients the following question arises:
\[ P(x, D) u = f \in \mathcal{D}' \implies \mathrm{singsupp}(u) \subseteq \mathrm{singsupp}(f) \cup\ ? \]
This question as well as a refined version of it are answered using microlocal analysis by giving a bound on the wave front set of $u$ (see e.g. [Hor90, Thm. 8.3.1]).
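6.42. Example ((Non-)hypoelliptic operators)

(i) The wave operator
\[ \Box = \partial_t^2 - \partial_x^2 = -D_t^2 + D_x^2 \tag{6.20} \]
on $\mathbb{R}^2$ is not hypoelliptic since $u(x, t) = H(x - x_0 - (t - t_0))$ is a solution of $\Box u = 0$ (cf. 0.3(ii)) but $\mathrm{singsupp}(u) = \{ (x, t) \in \mathbb{R}^2 : x - x_0 = t - t_0 \}$ ($\ne \emptyset$).
(ii) Let $I \subseteq \mathbb{R}$ be an interval and let $a \in C^\infty(I)$. Then the ordinary differential operator
\[ P\big(x, \tfrac{d}{dx}\big) = \frac{d}{dx} + a(x) \]
is hypoelliptic. Indeed, if $Pu = f \in C^\infty$ on some subinterval then Theorem 2.24 implies that $u \in C^1$. Furthermore we obtain from the ODE that $u'' = f' - a u' - a' u \in C^0$ and another appeal to Theorem 2.24 gives $u \in C^2$. Now going on inductively we obtain $u \in C^\infty$.
6.43. DEF (Ellipticity) Let $P(D)$ be a linear PDO with constant coefficients. We call $P(D)$ elliptic if its principal symbol $\sigma_P$ satisfies
\[ \sigma_P(\xi) \ne 0 \qquad \forall\, \xi \ne 0 \text{ in } \mathbb{R}^n. \]
6.44. Example ((Non-)elliptic operators)

(i) The Laplace operator
\[ P(D) = \Delta = \sum_{j=1}^n \frac{\partial^2}{\partial x_j^2}, \qquad x = (x_1, \dots, x_n) \in \mathbb{R}^n, \]
is elliptic since $\sigma_P(\xi) = -|\xi|^2$ is nonvanishing off the origin.

(ii) The Cauchy-Riemann operator
\[ P(D) = \partial_{\bar z} = \frac12 \Big( \frac{\partial}{\partial x} + i \frac{\partial}{\partial y} \Big), \qquad z = (x, y) \in \mathbb{R}^2, \]
is elliptic since $\sigma_P(\xi, \eta) = \frac12\, i (\xi + i\eta) = 0$ if and only if $\xi = 0 = \eta$.
(iii) The wave operator $P(D) = \Box$ on $\mathbb{R}^{n+1}$ (cf. (6.20)) is not elliptic since $\sigma_P(\xi, \tau) = -\tau^2 + |\xi|^2$, which vanishes for $|\xi| = |\tau|$.
(iv) The heat operator
\[ P(D) = \partial_t - \Delta_x, \qquad (x, t) = (x_1, \dots, x_n, t) \in \mathbb{R}^{n+1}, \]
is not elliptic since its principal symbol $\sigma_P(\xi, \tau) = |\xi|^2$ vanishes for all $(0, \tau)$.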
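By homogeneity of the principal symbol, ellipticity amounts to the symbol having no zero on the unit sphere. The sketch below illustrates this for the four examples in two real variables; it is an illustration only, the function names are hypothetical, and sampling the unit circle can of course only suggest (not prove) that a symbol is bounded away from zero.

```python
import math


def laplace_symbol(xi, eta):
    """Principal symbol of the Laplacian on R^2: -(xi^2 + eta^2)."""
    return -(xi * xi + eta * eta)


def cauchy_riemann_symbol(xi, eta):
    """Principal symbol of (1/2)(d/dx + i d/dy): (i/2)(xi + i*eta)."""
    return 0.5j * (xi + 1j * eta)


def wave_symbol(xi, tau):
    """Principal symbol of d_t^2 - d_x^2 on R^2: -tau^2 + xi^2."""
    return -tau * tau + xi * xi


def heat_symbol(xi, tau):
    """Principal symbol of d_t - d_x^2 on R^2: xi^2 (the first-order term drops out)."""
    return xi * xi


def min_modulus_on_circle(symbol, samples=720):
    """Smallest |symbol| over equally spaced points of the unit circle."""
    return min(
        abs(symbol(math.cos(2 * math.pi * k / samples), math.sin(2 * math.pi * k / samples)))
        for k in range(samples)
    )


for name, symbol in [("Laplace", laplace_symbol), ("Cauchy-Riemann", cauchy_riemann_symbol),
                     ("wave", wave_symbol), ("heat", heat_symbol)]:
    print(name, min_modulus_on_circle(symbol))
```

The first two minima stay away from zero (elliptic), while the wave and heat symbols vanish on the sampled circle up to rounding (not elliptic).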

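6.45. THM (Elliptic regularity) Let $P(D)$ be a linear PDO with constant coefficients. If $P(D)$ is elliptic then it is hypoelliptic (on any open set $\Omega \subseteq \mathbb{R}^n$).
Proof: We first observe that the principal symbol of any linear constant coefficient PDO $P(D)$ of order $m$ is homogeneous of degree $m$. Indeed we have for all $t > 0$
\[ \sigma_P(t\xi) = \sum_{|\alpha| = m} a_\alpha (t\xi)^\alpha = t^{|\alpha|} \sum_{|\alpha| = m} a_\alpha \xi^\alpha = t^m \sigma_P(\xi). \]
Step 1 (Construction of a parametrix)²:

By assumption we have $\gamma := \min_{\xi \in S^{n-1}} |\sigma_P(\xi)| > 0$ and by the above observation we obtain
\[ |\sigma_P(\xi)| = |\xi|^m\, \Big| \sigma_P\Big( \frac{\xi}{|\xi|} \Big) \Big| \ge \gamma |\xi|^m \qquad \forall\, \xi \in \mathbb{R}^n. \]
So
\[ |P(\xi)| = |\sigma_P(\xi) + O(|\xi|^{m-1})| \ge |\sigma_P(\xi)| - C |\xi|^{m-1} \ge \gamma |\xi|^m - C |\xi|^{m-1} \ge \frac{\gamma}{2} |\xi|^m \qquad \forall\, |\xi| \ge R > 0, \text{ with } R \text{ suitable.} \]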
2
Now we chose a cut-o function D(Rn ), 0 6 6 1, and () 1 for || 6 R. Then
we obtain
1 () 2
P() 6
Rn ,
and so (1 )/P is smooth and bounded, hence in S 0 (Rn ). So by Theorem 5.25 there
exists E S 0 (Rn ) with
^ = 1 ,
E
P
and we call a parametrix for P(D).
Using the exchange formulas 5.26(i),(ii) we now have
^ = P() 1 ()
F(P(D)E) = P()E = 1 ()
P(
and since D(Rn ) S (Rn ) we obtain using Theorems 5.15 and 5.25 as well as
formula (5.18) that
(6.21) P(D)E = F1 = , with S (Rn ).
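
To see the construction at work, consider a concrete operator. The example below is hypothetical and not taken from the notes: $P(D) = -\Delta - 1$, again with the assumed convention $D_j = -i\,\partial_j$, and the choice $R = 2$ is one admissible value among many.

```latex
% Hypothetical example: P(D) = -\Delta - 1 with D_j = -i\partial_j.
\begin{align*}
P(\xi) &= |\xi|^2 - 1, \qquad P_2(\xi) = |\xi|^2, \qquad
  \mu = \min_{|\xi| = 1} |P_2(\xi)| = 1, \\
|P(\xi)| &\geq |\xi|^2 - 1 \geq 3 \geq \frac{\mu}{2}
  \qquad \text{for } |\xi| \geq R := 2, \\
\hat E(\xi) &= \frac{1 - \chi(\xi)}{|\xi|^2 - 1}, \qquad
  \bigl| \hat E(\xi) \bigr| \leq \frac{1}{3} \leq \frac{2}{\mu}, \\
\mathcal{F}\bigl(P(D)E\bigr) &= (|\xi|^2 - 1)\,\hat E(\xi) = 1 - \chi(\xi)
  \quad\Longrightarrow\quad P(D)E = \delta - \mathcal{F}^{-1}\chi .
\end{align*}
% P vanishes on the sphere |\xi| = 1, which lies inside the region where \chi = 1,
% so the cut-off removes exactly the frequencies at which 1/P is singular.
```

The price for cutting away the zeros of $P$ is the smooth error term $\mathcal{F}^{-1}\chi$, which is why $E$ is only an approximate inverse.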
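Step 2 (Regularity of the parametrix): We claim that
$$E|_{\mathbb{R}^n \setminus \{0\}} \in C^\infty, \quad \text{i.e.,} \quad \operatorname{singsupp}(E) \subseteq \{0\}.$$

² A distribution $E$ with $P(D)E = \delta + C^\infty$ is called a parametrix of $P(D)$. It plays the role of an
approximate inverse for $P(D)$ and will be employed to establish the asserted regularity statement.

To see this note that by the exchange formulas
$$\mathcal{F}(x^\beta D^\alpha E) = (-1)^{|\beta|} D^\beta (\xi^\alpha \hat E) = (-1)^{|\beta|} D^\beta \left( \xi^\alpha\, \frac{1 - \chi(\xi)}{P(\xi)} \right)$$
for any pair of multi-indices $\alpha$, $\beta$. Moreover, we have (by induction)
$$D^\beta \left( \xi^\alpha\, \frac{1 - \chi(\xi)}{P(\xi)} \right) = O(|\xi|^{|\alpha| - m - |\beta|}),$$
and so by setting $|\beta| = |\alpha| - m + n + 1$ we obtain
$$\mathcal{F}(x^\beta D^\alpha E) = O(|\xi|^{-n-1}).$$
So $x^\beta D^\alpha E \in \mathcal{F}^{-1}(L^1(\mathbb{R}^n))$ and by 6.19 $x^\beta D^\alpha E \in C^0(\mathbb{R}^n)$. Finally for any $x \neq 0$ there is
at least one such $\beta$ for which $x^\beta \neq 0$, hence $D^\alpha E \in C^0(\mathbb{R}^n \setminus \{0\})$. Since $\alpha$ was arbitrary
we have established the claim.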
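
The exponent $-n-1$ is chosen because it is just enough for integrability at infinity. A short check in polar coordinates, added here as an illustration (the symbol $\omega_{n-1}$ for the surface area of the unit sphere is notation introduced for this sketch):

```latex
% \omega_{n-1} denotes the surface area of the unit sphere S^{n-1}.
\begin{align*}
\int_{|\xi| \geq R} |\xi|^{-n-1}\, d\xi
  = \omega_{n-1} \int_R^\infty r^{-n-1}\, r^{n-1}\, dr
  = \omega_{n-1} \int_R^\infty r^{-2}\, dr
  = \frac{\omega_{n-1}}{R} < \infty .
\end{align*}
% Example: n = 3, m = 2, \alpha = 0 gives |\beta| = 0 - 2 + 3 + 1 = 2,
% i.e. \mathcal{F}(x^\beta E) = O(|\xi|^{-4}) for every \beta with |\beta| = 2.
```

Since the function in question is smooth and vanishes for $|\xi| \leq R$ (where $1 - \chi = 0$), decay of order $|\xi|^{-n-1}$ at infinity already places it in $L^1(\mathbb{R}^n)$.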
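Step 3 (Smoothness of $u$): We first claim that for any $\Omega' \subseteq \Omega$, open and relatively
compact,
$$P(D)u|_{\Omega'} \in C^\infty(\Omega') \implies u|_{\Omega'} \in C^\infty(\Omega').$$
To see this let $\varphi \in \mathcal{D}(\Omega)$, $\varphi \equiv 1$ on a neighbourhood of $\overline{\Omega'}$. We then have
$$\varphi u \overset{\text{4.5(iv)}}{=} \delta * (\varphi u) \overset{\text{(6.21)}}{=} \bigl(P(D)E + \psi\bigr) * (\varphi u) \overset{\text{4.5(ii)}}{=} E * P(D)(\varphi u) + \underbrace{\psi * (\varphi u)}_{\in\, \mathcal{S} * \mathcal{E}' \subseteq C^\infty \text{ by 4.8}}.$$
So
$$\operatorname{singsupp}(\varphi u) = \operatorname{singsupp}\bigl(E * P(D)(\varphi u)\bigr) \overset{\text{6.28(i)}}{\subseteq} \underbrace{\operatorname{singsupp}(E)}_{\subseteq \{0\} \text{ by step 2}} + \operatorname{singsupp}\bigl(P(D)(\varphi u)\bigr) = \operatorname{singsupp}\bigl(P(D)(\varphi u)\bigr).$$
Since $(\varphi u)|_{\Omega'} = u|_{\Omega'}$, and hence also $P(D)(\varphi u)|_{\Omega'} = P(D)u|_{\Omega'}$, we have in $\Omega'$ that
(6.22) $\operatorname{singsupp}(u) \subseteq \operatorname{singsupp}(P(D)u)$
establishing the claim.

Since smoothness is a local property we actually have (6.22) on all of $\Omega$. Finally we
note that $\supseteq$ in (6.22) always holds (by 6.28(ii)), and so we are done.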
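
Two classical consequences illustrate what the theorem delivers for the elliptic operators of Example 6.44(i),(ii). They are standard facts added here for illustration and are not part of the proof.

```latex
% Consequences of (6.22) for the Laplace and Cauchy-Riemann operators:
\begin{align*}
\Delta u = 0 \ \text{in } \mathcal{D}'(\Omega)
  &\;\Longrightarrow\; \operatorname{singsupp}(u) \subseteq \operatorname{singsupp}(0) = \emptyset
  \;\Longrightarrow\; u \in C^\infty(\Omega), \\
\partial_{\bar z} u = 0 \ \text{in } \mathcal{D}'(\Omega),\ \Omega \subseteq \mathbb{R}^2
  &\;\Longrightarrow\; u \in C^\infty(\Omega) .
\end{align*}
```

In other words, every distributional solution of the Laplace equation is a smooth function (a statement often referred to as Weyl's lemma), and every distributional solution of the Cauchy-Riemann equation is smooth as well.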
6.46. Outlook (Elliptic regularity)

(i) The elliptic regularity theorem also holds true for non-constant coefficient operators. Thereby
a linear PDO with smooth coefficients on $\Omega$ is called elliptic if the principal symbol $P_m(x, \xi) \neq 0$
on $\Omega \times (\mathbb{R}^n \setminus \{0\})$. Then again ellipticity implies hypoellipticity, a result which is most conveniently
proved using the machinery of pseudodifferential operators, see e.g. [Ray91, Cor. 3.8].
(ii) There is also a Sobolev space version of the elliptic regularity theorem: If $P(x, D)$ is elliptic
of order $m$ then we have for all $u \in H^{-\infty}$
$$P(x, D)u \in H^s(\mathbb{R}^n) \implies u \in H^{s+m}(\mathbb{R}^n),$$
see e.g. [Fol95, Thm. 6.33].
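
The gain of $m$ derivatives can be seen exactly in the simplest constant coefficient case. The sketch below uses the operator $1 - \Delta$ (order $m = 2$), chosen here as an illustration, and assumes the conventions $\mathcal{F}(D_j u) = \xi_j \hat u$ and $\|u\|_{H^s}^2 = \int (1 + |\xi|^2)^s |\hat u(\xi)|^2\, d\xi$; the normalisation used in the notes may differ by constants.

```latex
% Assumptions: \mathcal{F}(D_j u) = \xi_j \hat u and
% \|u\|_{H^s}^2 = \int (1+|\xi|^2)^s |\hat u(\xi)|^2 \, d\xi .
\begin{align*}
\mathcal{F}\bigl((1-\Delta)u\bigr)(\xi) &= (1+|\xi|^2)\,\hat u(\xi), \\
\|(1-\Delta)u\|_{H^s}^2
  &= \int_{\mathbb{R}^n} (1+|\xi|^2)^{s}\,(1+|\xi|^2)^{2}\,|\hat u(\xi)|^2\, d\xi
   = \|u\|_{H^{s+2}}^2 .
\end{align*}
% Hence (1-\Delta)u \in H^s(\mathbb{R}^n) holds exactly when u \in H^{s+2}(\mathbb{R}^n):
% a gain of m = 2 derivatives.
```

For general elliptic operators the identity becomes an estimate, but the mechanism is the same: the symbol grows like $|\xi|^m$ at infinity, so dividing by it improves decay of $\hat u$ by $m$ orders.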

Chapter 7
FUNDAMENTAL SOLUTIONS

7.1. BASIC NOTIONS

7.2. THE MALGRANGE-EHRENPREIS THEOREM

7.3. HYPOELLIPTICITY OF PARTIAL DIFFERENTIAL OPERATORS WITH CONSTANT COEFFICIENTS

7.4. FUNDAMENTAL SOLUTIONS OF SOME PROMINENT OPERATORS
Bibliography
[Ada75] R. Adams. Sobolev Spaces. Academic Press, New York, 1975.
[AF03] R. Adams and J.J.F. Fournier. Sobolev Spaces. Elsevier, Oxford, second edition, 2003.
[Bou66] N. Bourbaki. Elements of mathematics. General topology. Part 1. Hermann, Paris, 1966.
[Die79] J. Dieudonné. Grundzüge der modernen Analysis, Band 5/6. Vieweg, Braunschweig, 1979.
[FJ98] G. Friedlander and M. Joshi. Introduction to the theory of distributions. Cambridge University Press, New York, second edition, 1998.
[FL74] B. Fuchssteiner and D. Laugwitz. Funktionalanalysis. BI Wissenschaftsverlag, Zürich, 1974.
[Fol95] G. B. Folland. Introduction to partial differential equations. Princeton University Press, Princeton, New Jersey, second edition, 1995.
[Fol99] G. B. Folland. Real Analysis. John Wiley and Sons, New York, 1999.
[For84] O. Forster. Analysis 3. Vieweg Verlag, Wiesbaden, third edition, 1984.
[GKOS01] M. Grosser, M. Kunzinger, M. Oberguggenberger, and R. Steinbauer. Geometric theory of generalized functions. Kluwer, Dordrecht, 2001.
[Hor66] J. Horváth. Topological vector spaces and distributions. Addison-Wesley, Reading, MA, 1966.
[Hor90] L. Hörmander. The analysis of linear partial differential operators, volume I. Springer-Verlag, second edition, 1990.
[Hor09] G. Hörmann. Analysis (Lecture notes, University of Vienna). available electronically at http://www.mat.univie.ac.at/gue/material.html, 2008-09.
[KR86] R. V. Kadison and J. R. Ringrose. Fundamentals of the Theory of Operator Algebras, Volume II: Advanced theory. Academic Press, New York, 1986.
[Kun98] M. Kunzinger. Distributionentheorie II (Lecture notes, University of Vienna, spring term 1998). available from the author, 1998.
[LL01] E. H. Lieb and M. Loss. Analysis, volume 14 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, second edition, 2001.
[Obe86] M. Oberguggenberger. Über Folgenkonvergenz in lokalkonvexen Räumen. Math. Nachr., 129:219-234, 1986.
[Obe0x] M. Oberguggenberger. On the algebraic dual of $\mathcal{D}(\Omega)$. Unpublished Notes, 200x.
[Ray91] X. Saint Raymond. Elementary introduction to the theory of pseudodifferential operators. CRC Press, Boca Raton, 1991.
[RN82] F. Riesz and B. Sz. Nagy. Vorlesungen über Funktionalanalysis. VEB Deutscher Verlag der Wissenschaften, Berlin, 1982.
[Sch66] H. H. Schaefer. Topological Vector Spaces. Springer-Verlag, New York, 1966.
[SD80] W. Schempp and B. Dreseler. Einführung in die harmonische Analyse. B. G. Teubner, Stuttgart, 1980.
[SJ95] L. A. Steen and J. A. Seebach Jr. Counterexamples in topology. Dover Publications Inc., Mineola, NY, 1995. Reprint of the second (1978) edition.
[SS03] E. Stein and R. Shakarchi. Complex Analysis. Princeton Lectures in Analysis II. Princeton University Press, Princeton, 2003.
[Ste09] R. Steinbauer. Locally convex vector spaces (Lecture notes, University of Vienna, fall term 2008). available from the author, 2009.
[Wer05] D. Werner. Funktionalanalysis. Springer-Verlag, Berlin, fifth edition, 2005.
