FALL 2001
Gabriel Nagy
© Gabriel Nagy
Chapter I
Topology Preliminaries
Lecture 1
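belong to $D_n$. Apply (n′) to the pair of disjoint closed sets $\overline{V_{t^-}}$ and $X \setminus V_{t^+}$ to
find two open sets $U, W \subset X$ such that
$\overline{V_{t^-}} \subset U \subset \overline{U} \subset W$ and $W \cap (X \setminus V_{t^+}) = \emptyset$.
Notice that the equality $W \cap (X \setminus V_{t^+}) = \emptyset$, coupled with the inclusion $\overline{U} \subset W$,
gives $\overline{U} \cap (X \setminus V_{t^+}) = \emptyset$, so we get $\overline{U} \subset V_{t^+}$. We can then define $V_t = U$, and we will
obviously have the inclusions
(3) $\overline{V_{t^-}} \subset V_t \subset \overline{V_t} \subset V_{t^+}$.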
Now the extended family $(V_t)_{t \in D_{n+1}}$ will also satisfy property (ii), since for
$t, s \in D_{n+1}$ with $t < s$, one of the following will hold:
• either $t, s \in D_n$, or
• $t \in D_n$, $s \in D_{n+1} \setminus D_n$, and $t \le s^-$, or
• $t \in D_{n+1} \setminus D_n$, $s \in D_n$, and $t^+ \le s$, or
• $t, s \in D_{n+1} \setminus D_n$, and $t^+ \le s^-$.
(In each case, one uses (3) combined with the inductive hypothesis.)
Having constructed the family $(V_t)_{t \in D}$, with properties (i) and (ii), we define
the function $f : X \to [0, 1]$ by
$f(x) = \begin{cases} \inf\{t \in D : x \in V_t\}, & \text{if } x \in V_1 \\ 1, & \text{if } x \notin V_1 \end{cases}$
Claim 1: The function f is equivalently defined by
(4) $f(x) = \begin{cases} 0, & \text{if } x \in \overline{V_0} \\ \sup\{t \in D : x \notin \overline{V_t}\}, & \text{if } x \notin \overline{V_0} \end{cases}$
Let us denote by $g : X \to [0, 1]$ the function defined by formula (4). Fix
some point $x \in X$. We break the proof into several cases.
Case I: $x \in \overline{V_0}$.
In particular, using (ii) we get $x \in V_t$ for all $t \in D$ with $t > 0$, and since
$x \in V_1$, we have
$f(x) = \inf\{t \in D : x \in V_t\} = \inf\{t \in D : t > 0\} = 0 = g(x)$.
Case II: $x \notin V_1$.
Using (ii) we have $x \notin \overline{V_t}$ for all $t \in D$ with $t < 1$, and since $x \notin \overline{V_0}$, we have
$g(x) = \sup\{t \in D : x \notin \overline{V_t}\} = \sup\{t \in D : t < 1\} = 1 = f(x)$.
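Case III: $x \in V_1 \setminus \overline{V_0}$.
By the definition of f(x) we know:
(5) $x \notin V_t$, $\forall\, t \in D$ with $t < f(x)$;
(6) $\forall\, \varepsilon > 0$, $\exists\, s_\varepsilon \in D$ with $f(x) \le s_\varepsilon < f(x) + \varepsilon$, such that $x \in V_{s_\varepsilon}$.
By the definition of g(x) we know:
(7) $x \in \overline{V_t}$, $\forall\, t \in D$ with $t > g(x)$;
(8) $\forall\, \varepsilon > 0$, $\exists\, r_\varepsilon \in D$ with $g(x) \ge r_\varepsilon > g(x) - \varepsilon$, such that $x \notin \overline{V_{r_\varepsilon}}$.
Using (6) and (8) we see that we must have
(9) $s_\varepsilon \ge r_\varepsilon$, $\forall\, \varepsilon > 0$.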
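Indeed, if there exists some $\varepsilon > 0$ for which we have $s_\varepsilon < r_\varepsilon$, then using (6) we
would have
$x \in V_{s_\varepsilon} \subset \overline{V_{s_\varepsilon}} \subset V_{r_\varepsilon} \subset \overline{V_{r_\varepsilon}}$,
which contradicts (8).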
Now the inequality (9) gives
$f(x) + \varepsilon > g(x) - \varepsilon$, $\forall\, \varepsilon > 0$,
so we have in fact the inequality
$f(x) \ge g(x)$.
Suppose now this inequality is strict. Using (5) and (7) we will get
(10) $x \in \overline{V_t}$ and $x \notin V_t$, for all $t \in D$ with $f(x) > t > g(x)$.
Using the fact that D is dense in [0, 1], we could then find at least two elements
$t_1, t_2 \in D$ such that
$f(x) > t_1 > t_2 > g(x)$.
In this case (10) immediately creates a contradiction, since
$x \in \overline{V_{t_2}} \subset V_{t_1}$.
Claim 2: The function f is continuous.
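Since any open set in $\mathbb{R}$ is a union of open intervals, it suffices to prove the
following two properties¹:
(usc): $f^{-1}\big((-\infty, t)\big)$ is open for all $t \in \mathbb{R}$;
(lsc): $f^{-1}\big((t, \infty)\big)$ is open for all $t \in \mathbb{R}$.
In order to prove property (usc) it suffices to prove the equality
(11) $f^{-1}\big((-\infty, t)\big) = \bigcup_{s \in D,\ s < t} V_s$.
Start with a point $x \in f^{-1}\big((-\infty, t)\big)$, which means that $f(x) < t$. Using (6), there
exists some $s \in D$ with $f(x) \le s < t$, such that $x \in V_s$, so x indeed belongs to
the right hand side of (11). Conversely, if x belongs to the right hand side of (11),
there exists some $s < t$ such that $x \in V_s$. By the definition of f(x), it follows that
$f(x) \le s < t$, so $x \in f^{-1}\big((-\infty, t)\big)$.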
In order to prove property (lsc) it suffices to prove the equality
(12) $f^{-1}\big((t, \infty)\big) = \bigcup_{r \in D,\ r > t} (X \setminus \overline{V_r})$.
Start with a point $x \in f^{-1}\big((t, \infty)\big)$, which means that $f(x) > t$. Using (8) and Claim 1, there
exists some $r \in D$ with $f(x) \ge r > t$, such that $x \notin \overline{V_r}$, that is, $x \in X \setminus \overline{V_r}$, so x
indeed belongs to the right hand side of (12). Conversely, if x belongs to the right
hand side of (12), there exists some $r > t$ such that $x \in X \setminus \overline{V_r}$, i.e. $x \notin \overline{V_r}$. By
the equivalent definition of f(x) given by Claim 1, it follows that $f(x) \ge r > t$, so
$x \in f^{-1}\big((t, \infty)\big)$.
1 The condition (usc) means that f is upper semi-continuous, while the condition (lsc)
means that f is lower semi-continuous.
2. Ultrafilters
In this lecture we discuss a set theoretical concept, which turns out to be
technically useful in topology.
Definition. Suppose X is a fixed (non-empty) set. A filter in X is a (non-empty)
family $\mathcal{F}$ of non-empty subsets of X which has the property²:
(f) Whenever F and G belong to $\mathcal{F}$, it follows that $F \cap G$ also belongs to $\mathcal{F}$.
What is important here is that all the sets in the filter are assumed to be non-empty.
The set of all filters in X can be ordered by inclusion. A simple application
of Zorn's Lemma yields:
• For each filter $\mathcal{F}$ there exists at least one maximal filter $\mathcal{U}$ with $\mathcal{U} \supset \mathcal{F}$.
Maximal filters will be called ultrafilters.
An interesting feature of ultrafilters is given by the following:
Lemma 2.1. Let X be a non-empty set, and let $\mathcal{U}$ be a filter on X. The
following are equivalent:
(i) $\mathcal{U}$ is an ultrafilter.
(ii) For any subset $A \subset X$, either A or $X \setminus A$ belongs to $\mathcal{U}$,
but not both.
Proof. (i) ⇒ (ii). Assume $\mathcal{U}$ is an ultrafilter. First remark that X always
belongs to $\mathcal{U}$. (Otherwise, if X does not belong to $\mathcal{U}$, the family $\mathcal{U} \cup \{X\}$ would
obviously be a new filter, which would contradict the maximality of $\mathcal{U}$.)
Let us assume that A is non-empty and does not belong to $\mathcal{U}$. This means
that the family
$\mathcal{M} = \mathcal{U} \cup \{A \cap U \mid U \in \mathcal{U}\}$
is no longer a filter (otherwise, the maximality of $\mathcal{U}$ would be contradicted). Note that
if F and G belong to $\mathcal{M}$, then automatically $F \cap G$ belongs to $\mathcal{M}$. This means that
the only thing that can prevent $\mathcal{M}$ from being a filter is the fact that one
of the sets in $\mathcal{M}$ is empty. That is, there is some set $V \in \mathcal{U}$ such that $A \cap V = \emptyset$.
In other words, $V \subset X \setminus A$. But then, it follows that for any $U \in \mathcal{U}$ we have
$U \cap (X \setminus A) \supset U \cap V \ne \emptyset$, and then the set
$\mathcal{N} = \mathcal{U} \cup \{U \cap (X \setminus A) \mid U \in \mathcal{U}\}$
will be a filter. By maximality, it follows that $\mathcal{N} = \mathcal{U}$; in particular, $X \setminus A$ belongs
to $\mathcal{U}$. It is obvious that A and $X \setminus A$ cannot simultaneously belong to $\mathcal{U}$, because
this would force $\emptyset = A \cap (X \setminus A)$ to belong to $\mathcal{U}$.
2 Some textbooks may use a slightly different definition.
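(ii) ⇒ (i). Assume property (ii) holds, but $\mathcal{U}$ is not maximal, which means
that there exists some ultrafilter $\mathcal{V}$ with $\mathcal{V} \supsetneq \mathcal{U}$. Pick then some set $A \in \mathcal{V} \setminus \mathcal{U}$.
Since $A \notin \mathcal{U}$, by (ii) we must have $X \setminus A \in \mathcal{U}$. This would force both A and $X \setminus A$
to belong to $\mathcal{V}$, which is impossible.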
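On a finite set, the statement of Lemma 2.1 can be checked by brute force. The Python sketch below is only an illustration and not part of the original notes: the helper names (`nonempty_subsets`, `is_filter`, `all_filters`, `ultrafilters`, `satisfies_dichotomy`) are hypothetical, and the notion of filter is the one used above (a non-empty family of non-empty sets, closed under intersection).

```python
from itertools import combinations


def nonempty_subsets(X):
    """All non-empty subsets of the finite set X, as frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def is_filter(family):
    """Non-empty family of non-empty sets, closed under pairwise intersection."""
    return (len(family) > 0
            and all(len(F) > 0 for F in family)
            and all((F & G) in family for F in family for G in family))


def all_filters(X):
    """Enumerate every filter on X by testing each family of non-empty subsets."""
    subsets = nonempty_subsets(X)
    filters = []
    for mask in range(1, 2 ** len(subsets)):
        family = frozenset(s for i, s in enumerate(subsets) if (mask >> i) & 1)
        if is_filter(family):
            filters.append(family)
    return filters


def ultrafilters(X):
    """Filters that are maximal with respect to inclusion."""
    filters = all_filters(X)
    return [F for F in filters if not any(F < G for G in filters)]


def satisfies_dichotomy(U, X):
    """Condition (ii): for every A, exactly one of A and X minus A lies in U."""
    X = frozenset(X)
    every_subset = [frozenset()] + nonempty_subsets(X)
    return all((A in U) != ((X - A) in U) for A in every_subset)
```

For X = {0, 1, 2} the search should return three maximal filters, one for each point p (the family of all subsets containing p), and each of them passes the test in (ii), whereas a non-maximal filter such as {X} fails it.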
Exercise 1. Let $\mathcal{U}$ be an ultrafilter on X, and let $A \in \mathcal{U}$. Prove that the
collection
$\mathcal{U}_A = \{U \cap A : U \in \mathcal{U}\}$
is an ultrafilter on A.
Remark 2.1. If $\mathcal{U}$ is an ultrafilter on X, and $A \in \mathcal{U}$, then $\mathcal{U}$ contains all sets
B with $A \subset B \subset X$. Indeed, if we start with such a B, then by the above result,
either $B \in \mathcal{U}$ or $X \setminus B \in \mathcal{U}$. Notice however that in the case $X \setminus B \in \mathcal{U}$ we would
get
$\mathcal{U} \ni (X \setminus B) \cap A = \emptyset$,
which is impossible. Therefore B must belong to $\mathcal{U}$.
We are now in a position to define the notion of convergence for ultrafilters, by
means of the following.
Proposition 2.1. Let $(X, \mathcal{T})$ be a topological space, let $\mathcal{U}$ be an ultrafilter in
X, and let x be a point in X. The following are equivalent:
(i) Every neighborhood of x belongs to $\mathcal{U}$.
(ii) There exists a basic system $\mathcal{N}$ of neighborhoods of x, with $\mathcal{N} \subset \mathcal{U}$.
(iii) There exists a fundamental system $\mathcal{V}$ of neighborhoods of x, with $\mathcal{V} \subset \mathcal{U}$.
If the ultrafilter $\mathcal{U}$ satisfies one of the equivalent conditions above, we say that
$\mathcal{U}$ is convergent to x, and we write $\mathcal{U} \to x$.
Proof. The implications (i) ⇒ (ii) ⇒ (iii) are obvious.
(iii) ⇒ (i). Let $\mathcal{V}$ be a fundamental system of neighborhoods of x, with $\mathcal{V} \subset \mathcal{U}$.
Start with an arbitrary neighborhood M of x. By the properties of $\mathcal{V}$, there exists
a finite sequence $V_1, \dots, V_n \in \mathcal{V}$, with
$x \in V_1 \cap \dots \cap V_n \subset M$.
Since $\mathcal{V} \subset \mathcal{U}$, and $\mathcal{U}$ is a filter, it follows that the intersection $W = V_1 \cap \dots \cap V_n$
belongs to $\mathcal{U}$. By Remark 2.1 it follows that M itself belongs to $\mathcal{U}$. Since M was
arbitrary, it follows that $\mathcal{U}$ indeed satisfies condition (i).
The Hausdorff property has a nice ultrafilter characterization:
Proposition 2.2. For a topological space $(X, \mathcal{T})$, the following are equivalent:
(i) The topology $\mathcal{T}$ is Hausdorff.
(ii) Every convergent ultrafilter in X has a unique limit.
Proof. (i) ⇒ (ii). Assume the topology is Hausdorff. Let $\mathcal{U}$ be an ultrafilter in
X which is convergent to both x and y. If $x \ne y$, then by the Hausdorff property,
there exist two open sets $U, V \subset X$, with $x \in U$, $y \in V$, and $U \cap V = \emptyset$. Since U
is a neighborhood of x, we must have $U \in \mathcal{U}$. Likewise, we must have $V \in \mathcal{U}$. But
this is impossible, since it would force $\mathcal{U} \ni U \cap V = \emptyset$.
(ii) ⇒ (i). Assume X satisfies condition (ii), but the topology is not Hausdorff.
This means that there exist two points $x, y \in X$, with $x \ne y$, such that
(∗) for any open sets $U, V \subset X$, with $U \ni x$ and $V \ni y$, we have $U \cap V \ne \emptyset$.
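Let $\mathcal{N}_x$ denote the collection of all neighborhoods of x, and $\mathcal{N}_y$ denote the collection
of all neighborhoods of y. By condition (∗) we have
$M \cap N \ne \emptyset$, $\forall\, M \in \mathcal{N}_x$, $N \in \mathcal{N}_y$.
This proves that the collection
$\mathcal{F} = \{M \cap N : M \in \mathcal{N}_x,\ N \in \mathcal{N}_y\}$
is a filter in X. Notice that, since X is a neighborhood for both x and y, we have
the inclusion $\mathcal{F} \supset \mathcal{N}_x \cup \mathcal{N}_y$. So if we take $\mathcal{U}$ to be an ultrafilter, with $\mathcal{U} \supset \mathcal{F}$, it
follows that $\mathcal{U} \supset \mathcal{N}_x$, hence $\mathcal{U}$ converges to x, but also $\mathcal{U} \supset \mathcal{N}_y$, hence $\mathcal{U}$ is also
convergent to y. By condition (ii) this is impossible.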
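Proposition 2.2 can be observed concretely on a two-point space. The sketch below is a hedged illustration, not taken from the notes: it assumes the usual meaning of neighborhood (a set containing an open set that contains the point), uses only principal ultrafilters (all subsets containing a fixed point p), and the function names are hypothetical.

```python
from itertools import combinations


def subsets(X):
    """All subsets of the finite set X, as frozensets."""
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def neighborhoods(x, X, opens):
    """Sets N for which some open set D satisfies x in D and D subset of N."""
    return [N for N in subsets(X)
            if any(x in D and D <= N for D in opens)]


def principal_ultrafilter(p, X):
    """The ultrafilter of all subsets of X that contain the point p."""
    return frozenset(A for A in subsets(X) if p in A)


def limits(U, X, opens):
    """Points x such that every neighborhood of x belongs to U."""
    return sorted(x for x in X
                  if all(N in U for N in neighborhoods(x, X, opens)))


X = {0, 1}
# Sierpinski topology: the only proper non-empty open set is {1}.
sierpinski = [frozenset(), frozenset({1}), frozenset({0, 1})]
# Discrete topology: every subset is open.
discrete = subsets(X)
```

In the Sierpinski topology, which is not Hausdorff, the principal ultrafilter at 1 converges to both 0 and 1, since the only neighborhood of 0 is X itself. In the discrete topology the same ultrafilter has the single limit 1.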
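One can prove this property by contradiction. Assume f(U) does not belong to
$f_*(\mathcal{U})$, for some $U \in \mathcal{U}$. Then $Y \setminus f(U)$ belongs to $f_*(\mathcal{U})$, which means that the set
$M = f^{-1}\big(Y \setminus f(U)\big) = X \setminus f^{-1}\big(f(U)\big)$
belongs to $\mathcal{U}$. Since $U \subset f^{-1}\big(f(U)\big)$, this gives $\mathcal{U} \ni U \cap M = \emptyset$, which is impossible.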
Continuity can be nicely characterized using ultrafilters:
Proposition 2.3. Let $(X, \mathcal{T})$ and $(Y, \mathcal{S})$ be topological spaces, and let x be
a point in X. For a function $f : X \to Y$, the following are equivalent:
(i) f is continuous at x.
(ii) Whenever $\mathcal{U}$ is an ultrafilter on X convergent to x, it follows that the
ultrafilter $f_*(\mathcal{U})$ in Y is convergent to f(x).
Proof. (i) ⇒ (ii). Assume that f is continuous at x. Start with an ultrafilter
$\mathcal{U}$ on X, with $\mathcal{U} \to x$. Let N be an arbitrary neighborhood of f(x). Since f is
continuous at x, it follows that $f^{-1}(N)$ is a neighborhood of x. In particular we
get $f^{-1}(N) \in \mathcal{U}$, which proves that $N \in f_*(\mathcal{U})$. Since the ultrafilter $f_*(\mathcal{U})$ contains
all neighborhoods of f(x), it means that indeed $f_*(\mathcal{U})$ is convergent to f(x).
(ii) ⇒ (i). Assume f satisfies condition (ii), but f is not continuous at x. This
means that there exists some neighborhood V of f(x) such that $f^{-1}(V)$ is not a
neighborhood of x. Consider the collection
$\mathcal{F} = \{N \setminus f^{-1}(V) : N \text{ neighborhood of } x\}$.
Our assumption on V shows that all the sets in $\mathcal{F}$ are non-empty. (Otherwise
$f^{-1}(V)$ would contain some neighborhood of x, which would force $f^{-1}(V)$ itself to
be a neighborhood of x.) It is also clear that $\mathcal{F}$ is a filter. Let $\mathcal{U}$ be an ultrafilter
with $\mathcal{U} \supset \mathcal{F}$.
Claim: The ultrafilter $\mathcal{U}$ is convergent to x.
To prove this, start with some arbitrary neighborhood N of x. If N does not
belong to $\mathcal{U}$, then $X \setminus N$ belongs to $\mathcal{U}$. But then $(X \setminus N) \cap (N \setminus f^{-1}(V)) = \emptyset$
belongs to $\mathcal{U}$, which is impossible. So $\mathcal{U}$ contains all neighborhoods of x, which
means that indeed $\mathcal{U}$ is convergent to x.
Using the Claim and condition (ii), the ultrafilter $f_*(\mathcal{U})$ is convergent to f(x). Since V is
a neighborhood of f(x), it follows that $V \in f_*(\mathcal{U})$,
which means that $f^{-1}(V) \in \mathcal{U}$. But this leads to a contradiction, since $X \setminus f^{-1}(V)$
clearly belongs to $\mathcal{F} \subset \mathcal{U}$.
Lecture 3
3. Constructing topologies
In this section we discuss several methods for constructing topologies on a given
set.
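Definition. If $\mathcal{T}$ and $\mathcal{T}'$ are two topologies on the same space X, such that
$\mathcal{T}' \subset \mathcal{T}$ (as sets), then $\mathcal{T}$ is said to be stronger than $\mathcal{T}'$. Equivalently, we will say
that $\mathcal{T}'$ is weaker than $\mathcal{T}$.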
The implication “⇐” is pretty obvious. Since top(S) is a topology, and every set
in S is open with respect to top(S), it follows that every finite intersection of sets
in S is again in top(S), which means that every set in V(S) is again open with
respect to top(S). But then arbitrary unions of sets in V(S) are again open with
respect to top(S).
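To prove the implication "⇒" we define
$\mathcal{T}_0 = \big\{A \subset X : \exists\, \mathcal{V}_A \subset \mathcal{V}(\mathcal{S}) \text{ such that } A = \bigcup_{B \in \mathcal{V}_A} B\big\}$,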
There are instances when sub-bases have a particular feature, which enables
one to describe all open sets in an easier fashion.
Proposition 3.3. Let $(X, \mathcal{T})$ be a topological space. Suppose $\mathcal{V}$ is a collection
of subsets of X. The following are equivalent:
(i) $\mathcal{V}$ is a sub-base for $\mathcal{T}$, and
(2) $\forall\, U, V \in \mathcal{V}$ and $x \in U \cap V$, $\exists\, W \in \mathcal{V}$ with $x \in W \subset U \cap V$.
(ii) Every open set $A \subsetneq X$ is a union of sets in $\mathcal{V}$.
Proof. (i) ⇒ (ii). From property (i), it follows that every finite intersection
of sets in $\mathcal{V}$ is a union of sets in $\mathcal{V}$. Then the desired implication is immediate
from the previous result.
(ii) ⇒ (i). Assume (ii) and start with two sets $U, V \in \mathcal{V}$, and an element
$x \in U \cap V$. Since $U \cap V$ is open, by (ii) either we have $U \cap V = X$, in which case
we get $U = V = X$, and we take $W = X$, or $U \cap V \subsetneq X$, in which case $U \cap V$ is a
union of sets in $\mathcal{V}$, so in particular there exists $W \in \mathcal{V}$ with $x \in W \subset U \cap V$.
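The passage from a sub-base to the topology it generates (finite intersections first, then arbitrary unions) can be carried out mechanically on a finite set. The following Python sketch is an illustration under stated assumptions: the function names are hypothetical, X itself is included as the empty intersection (a convention that may differ from the one in the notes), and the enumeration of unions is exponential, so it is only meant for very small examples.

```python
from itertools import combinations


def finite_intersections(S, X):
    """Intersections of finitely many members of S; X is the empty intersection."""
    result = {frozenset(X)}
    for r in range(1, len(S) + 1):
        for combo in combinations(S, r):
            result.add(frozenset.intersection(*combo))
    return result


def unions(V):
    """All unions of subfamilies of V; the empty union is the empty set."""
    V = list(V)
    result = {frozenset()}
    for r in range(1, len(V) + 1):
        for combo in combinations(V, r):
            result.add(frozenset.union(*combo))
    return result


def top(S, X):
    """Topology on the finite set X generated by the sub-base S."""
    S = [frozenset(s) for s in S]
    return unions(finite_intersections(S, X))


X = {1, 2, 3}
T = top([{1, 2}, {2, 3}], X)
```

Here the sub-base {1, 2}, {2, 3} produces the five open sets ∅, {2}, {1, 2}, {2, 3} and X. The set {2} appears only as an intersection, which is why a sub-base without a property like (2) need not consist of "enough" sets to write every open set as a union of its members.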
and $f_i \circ g : Z \to Y_i$ is continuous.
To prove the uniqueness, let $\mathcal{T}$ be another topology on X with properties (i)
and (ii). Consider the map $h = \mathrm{Id} : (X, \mathcal{T}) \to (X, \mathcal{T}^\Phi)$. Using property (i) for $\mathcal{T}$,
combined with property (ii) for $\mathcal{T}^\Phi$, it follows that h is continuous, which means
that $\mathcal{T}^\Phi \subset \mathcal{T}$. Reversing the roles, and arguing exactly the same way, we also get
the other inclusion $\mathcal{T} \subset \mathcal{T}^\Phi$.
Remark 3.4. Using the above setting, assume that for each $i \in I$ a sub-base
$\mathcal{S}_i$ for the topology of $Y_i$ is given. Consider the sets $f_i^* \mathcal{S}_i = \{f_i^{-1}(S) : S \in \mathcal{S}_i\}$.
as a sub-base, so if we define
$\mathcal{S}_x = \{S \in \mathcal{S} : S \ni x\}$,
we clearly have $\mathcal{U} \supset \mathcal{S}_x$. Then the fact that $\mathcal{U}$ converges to x follows from
Proposition 3.2.
Example 3.1. (The product topology) Suppose we have a family $(X_i, \mathcal{T}_i)$, $i \in I$,
of topological spaces. Consider the Cartesian product $X = \prod_{i \in I} X_i$. For each $j \in I$
To prove property (ii), start with some topological space $(Z, \mathcal{S})$ and a map
$g : X \to Z$ such that $g \circ f_i : Y_i \to Z$ is continuous, for all $i \in I$. Start with
some open set $D \subset Z$, and let us prove that the set $A = g^{-1}(D)$ is open in X, i.e.
$A \in \mathcal{T}_\Phi$. Notice that, for each $i \in I$, one has
$f_i^{-1}(A) = f_i^{-1}\big(g^{-1}(D)\big) = (g \circ f_i)^{-1}(D)$,
so using the continuity of $g \circ f_i$ we get the fact that $f_i^{-1}(A)$ is open in $Y_i$, which
means that $A \in f_{i*}(\mathcal{T}_i)$. Since this is true for all $i \in I$, we then get $A \in \mathcal{T}_\Phi$.
To prove uniqueness, let $\mathcal{T}$ be another topology on X with properties (i) and
(ii). Consider the map $h = \mathrm{Id} : (X, \mathcal{T}) \to (X, \mathcal{T}_\Phi)$. Using property (i) for $\mathcal{T}_\Phi$,
combined with property (ii) for $\mathcal{T}$, it follows that h is continuous, which means
that $\mathcal{T}_\Phi \subset \mathcal{T}$. Reversing the roles, and arguing exactly the same way, we also get
the other inclusion $\mathcal{T} \subset \mathcal{T}_\Phi$.
Comment. Using the notations above, it is immediate that the topology TΦ
can also be described as the strongest topology on X, with respect to which all the
maps fi : Yi → X, i ∈ I, are continuous. In the light of this remark, we will call
the topology TΦ the strong topology defined by Φ.
Example 3.2. (The disjoint union topology) Suppose we have a family $(X_i, \mathcal{T}_i)$,
$i \in I$, of topological spaces. Consider the disjoint union $X = \bigsqcup_{i \in I} X_i$. For each $i \in I$
we consider the inclusion $\iota_i : X_i \to X$. The strong topology on X, defined by
the family $\Phi = \{\iota_i\}_{i \in I}$, is called the disjoint union topology.
If we think of each $X_i$ as a subset of X, then $X_i$ is open in X, for all $i \in I$.
Moreover, a set $D \subset X$ is open if and only if $D \cap X_i$ is open (in $X_i$), for all $i \in I$.
For a point $x \in X$, there exists a unique $i(x) \in I$, with $x \in X_{i(x)}$. With this
notation, an ultrafilter $\mathcal{U}$ on X is convergent to x if and only if $X_{i(x)} \in \mathcal{U}$, and the
collection
$\mathcal{U}|_{X_{i(x)}} = \{U \cap X_{i(x)} : U \in \mathcal{U}\}$
4. Compactness
Definition. Let X be a topological space. A subset $K \subset X$ is said to be
a compact set in X, if it has the finite open cover property:
(f.o.c) Whenever $\{D_i\}_{i \in I}$ is a collection of open sets such that $K \subset \bigcup_{i \in I} D_i$,
there exists a finite sub-collection $D_{i_1}, \dots, D_{i_n}$ such that
$K \subset D_{i_1} \cup \dots \cup D_{i_n}$.
An equivalent description is the finite intersection property:
(f.i.p.) If $\{F_i\}_{i \in I}$ is a collection of closed sets such that for any finite sub-collection
$F_{i_1}, \dots, F_{i_n}$ we have $K \cap F_{i_1} \cap \dots \cap F_{i_n} \ne \emptyset$, it follows that
$K \cap \bigcap_{i \in I} F_i \ne \emptyset$.
Besides the two equivalent conditions (f.o.c) and (f.i.p.), there are some other
useful characterizations of compactness, listed in the following.
Theorem 4.1. Let $(X, \mathcal{T})$ be a topological space. The following are equivalent:
(i) X is compact.
(ii) (Alexander sub-base Theorem) There exists a sub-base $\mathcal{S}$ with the finite
open cover property:
(s) For any collection $\{S_i \mid i \in I\} \subset \mathcal{S}$ with $X = \bigcup_{i \in I} S_i$, there exists a
finite sub-collection $\{S_{i_1}, S_{i_2}, \dots, S_{i_n}\}$ (for some finite sequence of
indices $i_1, i_2, \dots, i_n \in I$) such that $X = S_{i_1} \cup S_{i_2} \cup \dots \cup S_{i_n}$.
(iii) Every ultrafilter in X is convergent.
Proof. (i) ⇒ (ii). This is obvious. (In fact any sub-base has the finite open cover
property.)
(ii) ⇒ (iii). Let $\mathcal{U}$ be an ultrafilter on X. Assume $\mathcal{U}$ is not convergent to any
point $x \in X$. By Proposition 3.2 it follows that, for each $x \in X$, one can find a
set $S_x \in \mathcal{S}$ with $S_x \ni x$, but such that $S_x \notin \mathcal{U}$. Using property (s), one can find a
finite collection of points $x_1, \dots, x_n \in X$, such that
(1) $S_{x_1} \cup \dots \cup S_{x_n} = X$.
Since $S_{x_p} \notin \mathcal{U}$, it means that $X \setminus S_{x_p}$ belongs to $\mathcal{U}$, for every $p = 1, \dots, n$. Then,
using (1), we get
$\mathcal{U} \ni (X \setminus S_{x_1}) \cap \dots \cap (X \setminus S_{x_n}) = \emptyset$,
which is impossible.
(iii) ⇒ (i). Assuming property (iii), we will show that X has the finite intersection
property. Start with a family of closed sets $\{F_i\}_{i \in I}$, with the property
that
(2) $\bigcap_{i \in J} F_i \ne \emptyset$, for every finite subset $J \subset I$.
We want to prove that $\bigcap_{i \in I} F_i \ne \emptyset$. For every finite subset $J \subset I$ we define the
non-empty closed set $F_J = \bigcap_{i \in J} F_i$. It is clear that
$\mathcal{F} = \{F_J : J \text{ finite subset of } I\}$
Equip the closure $\overline{\theta(X)}$ with the topology induced from T. Then the pair $(\theta, \overline{\theta(X)})$
is a compactification of X.
Proof. For every $f \in F$, let us denote by $\pi_f : T \to [0, 1]$ the coordinate map.
Remark that $\theta : X \to T$ is continuous. This is immediate from the definition of
the product topology, since the continuity of θ is equivalent to the continuity of all
compositions $\pi_f \circ \theta$, $f \in F$. The fact that these compositions are continuous is
however trivial, since we have $\pi_f \circ \theta = f$, $\forall\, f \in F$.
Denote for simplicity $\overline{\theta(X)}$ by B. By Tihonov's Theorem, the space T is compact
(and obviously Hausdorff), so the set B is compact as well, being a closed
subset of T. By construction, θ(X) is dense in B, and θ is continuous.
At this point, it is interesting to point out the following property.
Claim 1: For every $f \in F$, there exists a unique continuous map $\tilde{f} : B \to [0, 1]$,
such that $\tilde{f} \circ \theta = f$.
The uniqueness is trivial, since θ(X) is dense in B. The existence is also trivial,
because we can take $\tilde{f} = \pi_f|_B$.
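$(\alpha_f)_{f \in F} \in B \setminus S$, and let us prove that $\alpha \in \theta(X)$. Since $\alpha \notin S$, there exists some
$f \in F_c$, such that $\pi_f(\alpha) > 0$. Since $f \in F_c$, there exists some compact subset
$K \subset X$, such that $f|_{X \setminus K} = 0$. Using Claim 2, we know that $\tilde{f}|_{B \setminus \theta(K)} = 0$. Since
$\tilde{f}(\alpha) = \pi_f(\alpha) \ne 0$, this forces $\alpha \in \theta(K) \subset \theta(X)$.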
To finish the proof of the Theorem, all we need to prove now is the fact that
θ : X → θ(X) is a homeomorphism, which amounts to proving that, whenever
D ⊂ X is open, it follows that θ(D) is open in B. Fix an open subset D ⊂ X. In
order to show that θ(D) is open in B, we need to show that θ(D) is a neighborhood
for each of its points. Fix some point α ∈ θ(D), i.e. α = θ(x), for some x ∈ D.
Choose some compact subset $K \subset D$, such that $x \in \mathrm{Int}(K)$, and apply Urysohn
Lemma to find a function $f \in F_K$, with $f(x) = 1$. Consider the continuous function
$\tilde{f} : B \to [0, 1]$ given by Claim 1, and apply Claim 2 to conclude that $\tilde{f}|_{B \setminus \theta(K)} = 0$.
6. Metric spaces
In this section we review the basic facts about metric spaces.
Definitions. A metric on a non-empty set X is a map
d : X × X → [0, ∞)
with the following properties:
(i) If x, y ∈ X are points with d(x, y) = 0, then x = y;
(ii) d(x, y) = d(y, x), for all x, y ∈ X;
(iii) d(x, y) ≤ d(x, z) + d(y, z), for all x, y, z ∈ X.
A metric space is a pair (X, d), where X is a set, and d is a metric on X.
Notations. If (X, d) is a metric space, then for any point $x \in X$ and any
$r > 0$, we define the open and closed balls:
$B_r(x) = \{y \in X : d(x, y) < r\}$,
$\overline{B}_r(x) = \{y \in X : d(x, y) \le r\}$.
Definition. Suppose (X, d) is a metric space. Then X carries a natural
topology constructed as follows. We say that a set $D \subset X$ is open, if it has the
property:
• for every $x \in D$, there exists some $r_x > 0$, such that $B_{r_x}(x) \subset D$.
One can prove that the collection
$\mathcal{T}_d = \{D \subset X : D \text{ open}\}$
is indeed a topology, i.e. we have
• ∅ and X are open;
• if $(D_i)_{i \in I}$ is a family of open sets, then $\bigcup_{i \in I} D_i$ is again open;
• if $D_1$ and $D_2$ are open, then $D_1 \cap D_2$ is again open.
The topology thus constructed is called the metric topology.
Remark 6.1. Let (X, d) be a metric space. Then for every $p \in X$, and for
every $r > 0$, the set $B_r(p)$ is open, and the set $\overline{B}_r(p)$ is closed.
If we start with some $x \in B_r(p)$, and if we define $r_x = r - d(x, p)$, then for every
$y \in B_{r_x}(x)$ we will have
$d(y, p) \le d(y, x) + d(x, p) < r_x + d(x, p) = r$,
so y belongs to $B_r(p)$. This means that $B_{r_x}(x) \subset B_r(p)$. Since this is true for all
$x \in B_r(p)$, it follows that $B_r(p)$ is indeed open.
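The choice of radius $r_x = r - d(x, p)$ in the argument above can be tested exhaustively on a small sample. The sketch below is only an illustration under assumptions that are not in the notes: the metric is the taxicab metric on pairs of rationals (chosen so that all arithmetic is exact), the sample is a finite grid, and the names `ball`, `check_remark` and `grid` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def d(a, b):
    """Taxicab metric on pairs of rationals (exact arithmetic)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def ball(center, radius, points):
    """Members of the sample that lie in the open ball B_radius(center)."""
    return [y for y in points if d(center, y) < radius]


def check_remark(p, r, points):
    """For each sampled x in B_r(p), test that B_{r_x}(x) is contained in B_r(p)."""
    for x in ball(p, r, points):
        r_x = r - d(x, p)  # the radius chosen in the remark
        if not all(d(y, p) < r for y in ball(x, r_x, points)):
            return False
    return True


# Sample grid with step 1/4 inside the square [-2, 2] x [-2, 2].
grid = [(Fraction(i, 4), Fraction(j, 4))
        for i, j in product(range(-8, 9), repeat=2)]
origin = (Fraction(0), Fraction(0))
```

A call such as `check_remark(origin, Fraction(1), grid)` walks through every sampled point of the unit ball and confirms the inclusion, which is exactly the triangle inequality estimate displayed above.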
To prove that $\overline{B}_r(p)$ is closed, we need to show that its complement
$X \setminus \overline{B}_r(p) = \{x \in X : d(x, p) > r\}$
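open set $D \subset X$, with $x \in D$. Let $\varepsilon > 0$ be chosen such that $B_\varepsilon(x) \subset D$. Since
$\lim_{n \to \infty} d(x_n, x) = 0$, there exists some $n_\varepsilon$ such that $d(x_{n_\varepsilon}, x) < \varepsilon$. It is now clear
that
$x_{n_\varepsilon} \in B_\varepsilon(x) \cap A \subset D \cap A$,
so the intersection $D \cap A$ is indeed non-empty.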
Continuity can be characterized using convergence, as follows.
Proposition 6.2. Let X and Y be metric spaces, and let $f : X \to Y$ be a
function. For a point $p \in X$, the following are equivalent:
(i) f is continuous at p;
(ii) for every $\varepsilon > 0$, there exists some $\delta_\varepsilon > 0$ such that
$d\big(f(x), f(p)\big) < \varepsilon$, for all $x \in X$ with $d(x, p) < \delta_\varepsilon$;
(iii) if $(x_n)_{n \ge 1} \subset X$ is a sequence with $\lim_{n \to \infty} x_n = p$, then $\lim_{n \to \infty} f(x_n) = f(p)$.
Proof. (i) ⇒ (ii). The condition that f is continuous at p means
(∗) for every open set $D \subset Y$, with $D \ni f(p)$, there exists some open set
$E \subset X$, with $p \in E \subset f^{-1}(D)$.
Assume f is continuous at p. For every $\varepsilon > 0$, we consider the open ball $B^Y_\varepsilon\big(f(p)\big)$.
Using (∗), there exists some open set $E \subset X$, with $E \ni p$, and $f(E) \subset B^Y_\varepsilon\big(f(p)\big)$.
In particular, there exists $\delta > 0$, such that $B^X_\delta(p) \subset E$, so now we have
$f\big(B^X_\delta(p)\big) \subset B^Y_\varepsilon\big(f(p)\big)$,
This means that, for every integer $n \ge 1$, we can find a point $x_n \in X$ such that
$d(x_n, p) < \frac{1}{n}$ and $d\big(f(x_n), f(p)\big) \ge r$.
It is then clear that the sequence $(x_n)_{n \ge 1} \subset X$ is convergent to p, but the sequence
$\big(f(x_n)\big)_{n \ge 1} \subset Y$ is not convergent to f(p). This will contradict (iii).
(This follows from the fact that the $T_n$'s form a decreasing sequence of sets.) By
compactness, it follows that
$\bigcap_{n \ge 1} T_n \ne \emptyset$.
Take a point $x \in \bigcap_{n \ge 1} T_n$. The key feature of x is given by the following:
Claim 1: For every $\varepsilon > 0$ and every integer $\ell \ge 1$, there exists some integer
$N(\varepsilon, \ell) > \ell$ such that $d(x_{N(\varepsilon, \ell)}, x) < \varepsilon$.
This is a consequence of the fact that, for every $\ell \ge 1$, the point x belongs to the
closure $\overline{\{x_N : N > \ell\}}$, so for every $\varepsilon > 0$ we have
$B_\varepsilon(x) \cap \{x_N : N > \ell\} \ne \emptyset$.
Using Claim 1, we define a sequence $(k_n)_{n \ge 0}$ of integers, recursively by
$k_n = N\big(\tfrac{1}{n}, k_{n-1}\big)$, $\forall\, n \ge 1$.
(The initial term $k_0$ is chosen arbitrarily.) We have, by construction, $k_0 < k_1 < k_2 < \dots$, and
$d(x_{k_n}, x) < \frac{1}{n}$, $\forall\, n \ge 1$,
so $(x_{k_n})_{n \ge 1}$ is indeed a subsequence of $(x_k)_{k \ge 1}$, which is convergent (to x).
(ii) ⇒ (i). Assume (ii). Before we start proving that X is compact, we shall
need some preparations.
Claim 2: For every $r > 0$ there exists a finite set $F \subset X$, such that
$X = \bigcup_{x \in F} B_r(x)$.
We prove this by contradiction. Assume there exists some $r > 0$, such that
$\bigcup_{x \in F} B_r(x) \subsetneq X$,
for every finite set $F \subset X$. In particular, there exists a sequence $(x_n)_{n \ge 1}$ such that
$x_{n+1} \in X \setminus \big[B_r(x_1) \cup \dots \cup B_r(x_n)\big]$, $\forall\, n \ge 1$.
metric topology.
What we need to show is that every open set is a union of sets in $\mathcal{W}$. Fix an open
set D and a point $p \in D$. Choose $r > 0$, such that $B_r(p) \subset D$. Choose then
some integer $n \ge 1$, such that $\frac{1}{n} < \frac{r}{2}$, and choose some point $x \in F_n$, such that
$p \in B_{1/n}(x)$. Notice that, for every $y \in B_{1/n}(x)$, we have
$d(y, p) \le d(y, x) + d(x, p) < \frac{1}{n} + \frac{1}{n} \le r$,
which proves that $y \in B_r(p)$. Therefore we have
$p \in B_{1/n}(x) \subset B_r(p) \subset D$.
Since $p \in D$ is arbitrary, this proves that D is a union of sets in $\mathcal{W}$.
We now begin proving that X is compact. Start with a collection $(D_i)_{i \in I}$ of
open sets, with $\bigcup_{i \in I} D_i = X$. We need to find a finite set of indices $I_0 \subset I$, such
that $\bigcup_{i \in I_0} D_i = X$. First we show that:
Claim 4: There exists a countable set of indices $I_1 \subset I$, such that
$\bigcup_{i \in I_1} D_i = X$.
The key fact is that the base W is countable. Let us enumerate the base W as a
sequence
W = {Wm : m ∈ N}.
For each i ∈ I, we define the set
Mi = {m ≥ 1 : Wm ⊂ Di }.
By Claim 3, we know that for every $x \in D_i$ there exists some $m \in M_i$ such that
$x \in W_m \subset D_i$. In particular this proves the equality
$D_i = \bigcup_{m \in M_i} W_m$, $\forall\, i \in I$.
Consider then the union $M = \bigcup_{i \in I} M_i$, which is countable, being a subset of the
integers. We clearly have
$\bigcup_{m \in M} W_m = \bigcup_{i \in I} \bigcup_{m \in M_i} W_m = \bigcup_{i \in I} D_i = X$.
Proof. Use (i) and the steps in the proof of (i) ⇒ (ii), up to the proof of
Claim 3.
Corollary 6.2. Let (X, d) be a metric space. For a subset $K \subset X$ the following
are equivalent:
(i) every sequence in K has a subsequence which is convergent to some point
in K;
(ii) K is compact in X.
Proof. (i) ⇒ (ii). By the above Theorem, we know that when we equip K
with the metric $d|_{K \times K}$, then K is compact. This means that K is compact in the
induced topology, which means exactly that K is compact in X.
(ii) ⇒ (i). Argue as above. If K is compact in X, then K is compact when
equipped with the induced topology, which means that $(K, d|_{K \times K})$ is compact.
Proposition 6.4. Let (X, d) be a metric space. The following are equivalent.
(i) (X, d) is complete.
(ii) Every sequence $(x_n)_{n \ge 1} \subset X$, with
(4) $\sum_{n=1}^{\infty} d(x_{n+1}, x_n) < \infty$,
is convergent.
(iii) Every Cauchy sequence has a convergent subsequence.
Proof. (i) ⇒ (ii). Assume X is complete. Let $(x_n)_{n \ge 1} \subset X$ be a sequence
with property (4). To prove (ii) it suffices to show that $(x_n)_{n \ge 1}$ is Cauchy. For
every $N \ge 1$ we define
$R_N = \sum_{n=N}^{\infty} d(x_{n+1}, x_n)$.
Using (4) we get $\lim_{N \to \infty} R_N = 0$, so for every $\varepsilon > 0$ there exists some $N(\varepsilon)$ with
$R_{N(\varepsilon)} < \varepsilon$. Notice also that the sequence $(R_N)_{N \ge 1}$ is decreasing. If $m > n \ge N(\varepsilon)$,
then
$d(x_m, x_n) \le \sum_{k=n}^{m-1} d(x_{k+1}, x_k) \le \sum_{k=n}^{\infty} d(x_{k+1}, x_k) = R_n \le R_{N(\varepsilon)} < \varepsilon$,
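The tail estimate $d(x_m, x_n) \le R_n$ can be watched on a concrete sequence of rationals. The example below is a hypothetical illustration, not taken from the notes: it assumes the sequence $x_1 = 0$, $x_{n+1} = x_n + (-1)^n / 2^n$ in $\mathbb{Q}$ with the usual distance, for which $d(x_{n+1}, x_n) = 2^{-n}$ and the tail $R_N$ has the closed form $2^{1-N}$.

```python
from fractions import Fraction


def x(n):
    """x_1 = 0 and x_{n+1} = x_n + (-1)^n / 2^n, computed exactly."""
    return sum(Fraction((-1) ** k, 2 ** k) for k in range(1, n))


def d(a, b):
    """Usual distance on the rationals."""
    return abs(a - b)


def R(N):
    """Tail R_N = sum over n >= N of d(x_{n+1}, x_n) = sum of 2^{-n} = 2^{1-N}."""
    return Fraction(2, 2 ** N)


def tail_estimate_holds(max_index):
    """Check d(x_m, x_n) <= R_n for all 1 <= n < m <= max_index."""
    return all(d(x(m), x(n)) <= R(n)
               for n in range(1, max_index)
               for m in range(n + 1, max_index + 1))
```

Since $R_n \to 0$, the check `tail_estimate_holds(12)` reflects, on finitely many indices, why a sequence satisfying (4) is Cauchy.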
Using the assumption, we can find a subsequence $(x_{k_n})_{n \ge 1}$ (defined by an increasing
sequence of integers $1 \le k_1 < k_2 < \dots$) which is convergent to some point x. We
are going to prove that the entire sequence $(x_n)_{n \ge 1}$ is convergent to x. Fix for the
moment $n \ge 1$. For every $m \ge n$, we have $k_m \ge m \ge n$, so we have
(7) $S_n \ge d(x_n, x_{k_m})$, $\forall\, m \ge n$.
By Remark 3.4, we also know that
$\lim_{m \to \infty} d(x_n, x_{k_m}) = d(x_n, x)$,
Proof. (i) ⇒ (ii). Assume Y is complete, and let us prove that Y is closed.
Start with a point $x \in \overline{Y}$. Then there exists a sequence $(y_n)_{n \ge 1} \subset Y$ with
$\lim_{n \to \infty} y_n = x$. Notice that $(y_n)_{n \ge 1}$ is Cauchy in Y, so by assumption, $(y_n)_{n \ge 1}$ is
convergent to some point in Y. This will then clearly force $x \in Y$.
(ii) ⇒ (i). Assume Y is closed, and let us prove that Y is complete. Start
with a Cauchy sequence $(y_n)_{n \ge 1} \subset Y$. Since X is complete, the sequence $(y_n)_{n \ge 1}$
is convergent to some point $x \in X$. Since Y is closed, this forces $x \in Y$.
Remark 6.6. Using Theorem 6.1, we immediately see that a metric space,
which is compact in the metric topology, is automatically complete.
The next result identifies those complete metric spaces that are compact. In
order to formulate it, we need the following:
Definition. Let (X, d) be a metric space, and let $\varepsilon > 0$. A subset $A \subset X$ is
said to be ε-rare, if
$d(a, b) \ge \varepsilon$, for all $a, b \in A$ with $a \ne b$.
Proposition 6.6. Let (X, d) be a complete metric space. The following are
equivalent:
(i) X is compact in the metric topology;
(ii) for each ε > 0, all ε-rare subsets of X are finite;
(iii) for any ε > 0, there exist finitely many points p1 , p2 , . . . , pn ∈ X, such
that
X = Bε (p1 ) ∪ Bε (p2 ) ∪ · · · ∪ Bε (pn ).
It is clear that no subsequence of $(a_n)_{n \ge 1}$ is Cauchy, which means that $(a_n)_{n \ge 1}$
does not have any convergent subsequence, thus contradicting the fact that X is
compact.
(ii) ⇒ (iii). Assume property (ii) and let us prove (iii) by contradiction.
Assume there exists some $\varepsilon > 0$, such that, for every finite set $F \subset X$, one has a
strict inclusion
$\bigcup_{x \in F} B_\varepsilon(x) \subsetneq X$.
Start with some arbitrary point $a_1 \in X$, and construct recursively a sequence
$(a_n)_{n \ge 1} \subset X$, by choosing
$a_{n+1} \in X \setminus \big[B_\varepsilon(a_1) \cup \dots \cup B_\varepsilon(a_n)\big]$, $\forall\, n \ge 1$.
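On a finite sample of points, the recursive choice above necessarily stops, and what it leaves behind is at the same time an ε-rare set and a set of centers whose ε-balls cover the sample. The following Python sketch illustrates this under stated assumptions: the function names are hypothetical, the points are scanned in the given order rather than chosen arbitrarily, and the sample (ten integers with the usual distance) is invented for the example.

```python
def greedy_rare_set(points, eps, dist):
    """Keep a point only if it lies outside every ball B_eps(a), a already kept."""
    chosen = []
    for x in points:
        if all(dist(x, a) >= eps for a in chosen):
            chosen.append(x)
    return chosen


def is_eps_rare(A, eps, dist):
    """d(a, b) >= eps for all distinct a, b in the list A."""
    return all(dist(a, b) >= eps
               for i, a in enumerate(A) for b in A[i + 1:])


def covers(centers, points, eps, dist):
    """Every sampled point lies in some open ball B_eps(a), a in centers."""
    return all(any(dist(x, a) < eps for a in centers) for x in points)


def distance(a, b):
    """Usual distance on the real line."""
    return abs(a - b)


sample = list(range(10))
centers = greedy_rare_set(sample, 3, distance)
```

With ε = 3 the scan keeps the points 0, 3, 6, 9. In the proof, the assumption that no finite family of ε-balls covers X means that the selection never stops, which produces an infinite ε-rare set.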
It is clear that
$M_{m-1} = \bigcup_{p \in F_m} S_m(p)$,
and since $M_{m-1}$ is infinite, it follows that one of the sets $S_m(p)$, $p \in F_m$, is infinite.
We then choose $p_m \in F_m$ to be one point for which $S_m(p_m)$ is infinite.
Having proven the Claim, let us construct a sequence of integers $1 \le n_1 < n_2 < \dots$
as follows. Start with some arbitrary $n_1 \in M_1$. Once $n_1 < n_2 < \dots < n_k$
have been constructed, we choose the integer $n_{k+1} \in M_{k+1}$, such that $n_{k+1} > n_k$.
(It is here that we use the fact that $M_{k+1}$ is infinite.) By construction, we have
$n_k \in M_k$, $\forall\, k \ge 1$.
Suppose $k \ge \ell \ge 1$. Then by construction we have $n_k \in M_k \subset M_\ell$ and $n_\ell \in M_\ell$.
In particular we get
$d(x_{n_k}, x_{n_\ell}) \le d(x_{n_k}, p_\ell) + d(x_{n_\ell}, p_\ell) < \frac{2}{\ell}$.
The above estimate clearly proves that the subsequence $(x_{n_k})_{k \ge 1}$ is Cauchy. Since
X is complete, it follows that $(x_{n_k})_{k \ge 1}$ is convergent.
Corollary 6.4. Let (X, d) be a complete metric space, and let A be a subset
of X. The following are equivalent:
(i) the closure $\overline{A}$ is compact in X;
(ii) for each $\varepsilon > 0$, all ε-rare subsets of A are finite.
Proof. (i) ⇒ (ii). This is trivial from the above result.
(ii) ⇒ (i). Assume (ii), and let us prove that $\overline{A}$ is compact. Since $\overline{A}$ is complete,
it suffices to prove that, for each $\varepsilon > 0$, all ε-rare subsets of $\overline{A}$ are finite. Fix $\varepsilon > 0$,
and let B be an ε-rare subset of $\overline{A}$. For each $x \in B$, let us choose a point $a_x \in A$,
such that $x \in B_{\varepsilon/3}(a_x)$. Suppose $x, y \in B$ are such that $x \ne y$. Then
$d(a_x, a_y) \ge d(x, y) - d(a_x, x) - d(a_y, y) > \varepsilon - \frac{\varepsilon}{3} - \frac{\varepsilon}{3} = \frac{\varepsilon}{3}$.
In particular, this shows that the map
$f : B \ni x \longmapsto a_x \in A$
is injective, and the set f(B) is an (ε/3)-rare subset of A. By condition (ii) this
forces B to be finite.
We continue with an important construction.
Definitions. Let (X, d) be a metric space. We define
$cs(X, d) = \{x = (x_n)_{n \ge 1} : x \text{ Cauchy sequence in } X\}$.
We say that two Cauchy sequences $x = (x_n)_{n \ge 1}$ and $y = (y_n)_{n \ge 1}$ in X are equivalent,
if
$\lim_{n \to \infty} d(x_n, y_n) = 0$.
In this case we write $x \sim y$. (It is fairly obvious that ∼ is indeed an equivalence
relation.) We define the quotient space
$\tilde{X} = cs(X, d)/\sim$.
For an element $x \in cs(X, d)$, we denote its equivalence class by $\tilde{x}$.
Finally, for a point $x \in X$, we define $\langle x \rangle \in \tilde{X}$ to be the equivalence class of
the constant sequence x (which is obviously Cauchy).
Remark 6.7. Let (X, d) be a metric space. If $x = (x_n)_{n \ge 1}$ and $y = (y_n)_{n \ge 1}$
are Cauchy sequences in X, then the sequence of real numbers $\big(d(x_n, y_n)\big)_{n \ge 1}$ is
convergent. Indeed, for any m, n we have
$|d(x_m, y_m) - d(x_n, y_n)| \le |d(x_m, y_m) - d(x_n, y_m)| + |d(x_n, y_m) - d(x_n, y_n)| \le d(x_m, x_n) + d(y_m, y_n)$,
so this sequence of real numbers is Cauchy. We can then define
$\delta(x, y) = \lim_{n \to \infty} d(x_n, y_n)$.
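For a concrete picture of δ, one can take $X = \mathbb{Q}$ with the usual distance and two different Cauchy sequences of rationals approaching $\sqrt{2}$. The sketch below is an illustration only, with hypothetical names: `newton` is the Newton iteration for $t^2 = 2$ started at 2, and `truncation` is the decimal truncation of $\sqrt{2}$ to n places, computed exactly with integer square roots.

```python
from fractions import Fraction
from math import isqrt


def newton(n):
    """x_1 = 2, x_{k+1} = (x_k + 2 / x_k) / 2, computed in exact rationals."""
    x = Fraction(2)
    for _ in range(n - 1):
        x = (x + 2 / x) / 2
    return x


def truncation(n):
    """Largest multiple of 10^(-n) whose square is at most 2."""
    return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)


def gaps(x, y, count):
    """The numbers d(x_n, y_n) for n = 1, ..., count."""
    return [abs(x(n) - y(n)) for n in range(1, count + 1)]


first_gaps = gaps(newton, truncation, 6)
```

The first two gaps are 3/5 and 9/100, and the later ones keep shrinking, which is consistent with δ being 0 for this pair. In the terminology above the two sequences are then equivalent, so they represent the same element of the quotient $cs(\mathbb{Q}, d)/\sim$, even though neither converges in $\mathbb{Q}$.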
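Proposition 6.7. Let (X, d) be a metric space.
A. The map $\delta : cs(X, d) \times cs(X, d) \to [0, \infty)$ has the following properties:
(i) $\delta(x, y) = \delta(y, x)$, $\forall\, x, y \in cs(X, d)$;
(ii) $\delta(x, y) \le \delta(x, z) + \delta(z, y)$, $\forall\, x, y, z \in cs(X, d)$;
(iii) $\delta(x, y) = 0 \Rightarrow x \sim y$;
(iv) If $x, x', y, y' \in cs(X, d)$ are such that $x \sim x'$ and $y \sim y'$, then
$\delta(x, y) = \delta(x', y')$.
B. The map $\tilde{d} : \tilde{X} \times \tilde{X} \to [0, \infty)$, correctly defined by
$\tilde{d}(\tilde{x}, \tilde{y}) = \delta(x, y)$, $\forall\, x, y \in cs(X, d)$,
is a metric on $\tilde{X}$.
C. The map $X \ni x \longmapsto \langle x \rangle \in \tilde{X}$ is isometric, in the sense that
$\tilde{d}(\langle x \rangle, \langle y \rangle) = d(x, y)$, $\forall\, x, y \in X$.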
Proof. A. Properties (i), (ii) and (iii) are obvious. To prove property (iv) let
$x = (x_n)_{n \ge 1}$, $x' = (x'_n)_{n \ge 1}$, $y = (y_n)_{n \ge 1}$, and $y' = (y'_n)_{n \ge 1}$. The inequality
$d(x'_n, y'_n) \le d(x'_n, x_n) + d(x_n, y_n) + d(y_n, y'_n)$,
combined with $\lim_{n \to \infty} d(x'_n, x_n) = \lim_{n \to \infty} d(y_n, y'_n) = 0$ immediately gives
$\delta(x', y') = \lim_{n \to \infty} d(x'_n, y'_n) \le \lim_{n \to \infty} d(x_n, y_n) = \delta(x, y)$.
so we indeed have
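\[ \lim_{n\to\infty} \tilde d\bigl(\langle x_n\rangle, \tilde x\bigr) = 0. \]
\[ d(x_k, x_\ell) = \tilde d\bigl(\langle x_k\rangle, \langle x_\ell\rangle\bigr) \le \tilde d\bigl(\langle x_k\rangle, p_k\bigr) + \tilde d(p_k, p_\ell) + \tilde d\bigl(p_\ell, \langle x_\ell\rangle\bigr) \le \tilde d(p_k, p_\ell) + \frac{1}{2^k} + \frac{1}{2^\ell}. \]
This clearly gives
\[ \lim_{N\to\infty} \sup_{k,\ell\ge N} d(x_k, x_\ell) \le \lim_{N\to\infty} \sup_{k,\ell\ge N} \tilde d(p_k, p_\ell) = 0, \]
\[ \tilde d(p_k, \tilde x) = \lim_{\ell\to\infty} \tilde d\bigl(p_k, \langle x_\ell\rangle\bigr) \le \frac{1}{2^k} + \varepsilon, \quad \forall\, k \ge N_\varepsilon. \]
The above estimate clearly proves that
\[ \lim_{k\to\infty} \tilde d(p_k, \tilde x) = 0, \]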
Proof. Start with some Cauchy sequence $x = (x_n)_{n\ge 1}$ in $X$. Using the inequality
\[ \rho\bigl(f(x_m), f(x_n)\bigr) \le C \cdot d(x_m, x_n), \quad \forall\, m, n \ge 1, \]
it is obvious that $\bigl(f(x_n)\bigr)_{n\ge 1}$ is a Cauchy sequence in $Y$. Since $Y$ is complete, this sequence is convergent. Define
\[ \phi(x) = \lim_{n\to\infty} f(x_n). \]
and using the fact that $\lim_{n\to\infty} d(x_n, x'_n) = 0$, we get $\lim_{n\to\infty} \rho\bigl(f(x_n), f(x'_n)\bigr) = 0$.
Finally, let us show that $\tilde f$ is unique. Let $F : \tilde X \to Y$ be another continuous function with $F(\langle x\rangle) = f(x)$, for all $x \in X$. Start with an arbitrary point $p \in \tilde X$, represented as $p = \tilde x$, for some Cauchy sequence $x = (x_n)_{n\ge 1}$ in $X$. Since $\lim_{n\to\infty} \langle x_n\rangle = p$ in $\tilde X$, by continuity we have
Corollary 6.5. Let $(X, d)$ be a metric space, let $(Y, \rho)$ be a complete metric space, and let $f : X \to Y$ be an isometric map, that is
\[ \rho\bigl(f(x), f(x')\bigr) = d(x, x'), \quad \forall\, x, x' \in X. \]
Then the map $\tilde f : \tilde X \to Y$, given by the above result, is isometric and $\tilde f(\tilde X) = \overline{f(X)}$, the closure of $f(X)$ in $Y$.
Proof. To show that $\tilde f(\tilde X) = \overline{f(X)}$, start with some arbitrary point $y \in \overline{f(X)}$. Then there exists a sequence $(x_n)_{n\ge 1} \subset X$, with $\lim_{n\to\infty} f(x_n) = y$. Since $\bigl(f(x_n)\bigr)_{n\ge 1}$ is Cauchy in $Y$, and
\[ d(x_m, x_n) = \rho\bigl(f(x_m), f(x_n)\bigr), \quad \forall\, m, n \ge 1, \]
it follows that the sequence $x = (x_n)_{n\ge 1}$ is Cauchy in $X$. We then have
\[ y = \lim_{n\to\infty} f(x_n) = \tilde f(\tilde x). \]
Finally, we show that $\tilde f$ is isometric. Start with two points $p, q \in \tilde X$, represented as $p = \tilde x$ and $q = \tilde z$, for some Cauchy sequences $x = (x_n)_{n\ge 1}$ and $z = (z_n)_{n\ge 1}$ in $X$. Then by construction we have
\[ \rho\bigl(\tilde f(p), \tilde f(q)\bigr) = \lim_{n\to\infty} \rho\bigl(\tilde f(\langle x_n\rangle), \tilde f(\langle z_n\rangle)\bigr) = \lim_{n\to\infty} \rho\bigl(f(x_n), f(z_n)\bigr) = \lim_{n\to\infty} d(x_n, z_n) = \tilde d(\tilde x, \tilde z) = \tilde d(p, q). \]
In the remainder of this section we will address the following question: Given
a topological Hausdorff space X, when does there exist a metric d on X, such that
the given topology coincides with the metric topology defined by d? A topological
Hausdorff space with the above property is said to be metrizable. It is difficult to
give non-trivial necessary and sufficient conditions for metrizability. One instance in
which this is possible is the compact case (see the Urysohn Metrizability Theorem
later in these notes). Here is a useful result, which is an example of a sufficient
condition for metrizability.
Proposition 6.10 (Metrizability of Countable Products). Let $(X_i, d_i)_{i\in I}$ be a countable family of metric spaces. Then the product space $X = \prod_{i\in I} X_i$, equipped with the product topology, is metrizable.
The continuity of the map Id : (X, d) → (X, T) is equivalent to the fact that all
maps
πi : (X, d) → (Xi , di ), i ∈ I
are continuous. This is obvious, because by construction we have
\[ d_i\bigl(\pi_i(x), \pi_i(y)\bigr) \le d(x, y), \quad \forall\, x, y \in X. \]
Conversely, to prove the continuity of $\mathrm{Id} : (X, \mathcal{T}) \to (X, d)$, we are going to prove that every $d$-open set is open in the product topology. It suffices to prove this only for open balls. Fix then $x = (x_i)_{i\in I} \in \prod_{i\in I} X_i$ and $r > 0$, and consider the open ball $B_r(x)$. If we define, for each $i \in I$, the open ball $B_r^{X_i}(x_i)$, then it is obvious that
\[ B_r(x) = \bigcap_{i\in I} \pi_i^{-1}\bigl(B_r^{X_i}(x_i)\bigr), \]
and since the $\pi_i$ are all continuous, this proves that $B_r(x)$ is indeed open in the product topology.
Case II: Assume I is infinite. In this case we identify I = N. For every n ∈ N
we define a new metric δn on Xn , as follows. If
\[ \sup_{p,q\in X_n} d_n(p, q) \le 1, \]
As before, in order to prove the continuity of the other map $\mathrm{Id} : (X, \mathcal{T}) \to (X, d)$, we start with some $d$-open set $D$, and we show that $D$ is open in the product topology. Since $D$ is a union of open balls, we need to prove that for any $x \in X$ and any $r > 0$, the open ball $B_r(x)$, in $(X, d)$, is a neighborhood of $x$ in the product topology.
Fix $x = (x_n)_{n\in\mathbb{N}} \in \prod_{n\in\mathbb{N}} X_n$, as well as $r > 0$. Choose some integer $N \ge 1$, such that
\[ \sum_{n=N+1}^{\infty} \frac{1}{2^n} < \frac{r}{2}, \]
and define, for each $k \in \{1, 2, \dots, N\}$ the set
\[ D_k = \Bigl\{ y = (y_n)_{n\in\mathbb{N}} \in \prod_{n\in\mathbb{N}} X_n : \delta_k(x_k, y_k) < \frac{r}{2} \Bigr\}. \]
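The choice of $N$ above can be made concrete. The sketch below is illustrative only and rests on assumptions that the surviving text does not confirm: it takes every factor to be $\mathbb{R}$, uses the bounded metric $\min\{|s - t|, 1\}$ on each factor, and uses the weighted sum $\sum_n 2^{-n}\delta_n(x_n, y_n)$ as the product metric. The helper names are hypothetical. It uses $\sum_{n>N} 2^{-n} = 2^{-N}$ to find the smallest admissible $N$, and checks that two points which agree in their first $N$ coordinates are closer than $r/2$.

```python
def smallest_tail_index(r):
    """Smallest N >= 1 with sum_{n>N} 2**-n = 2**-N < r/2."""
    N = 1
    while 2.0 ** -N >= r / 2:
        N += 1
    return N


def product_distance(x, y, terms=60):
    """Partial sum of sum_{n>=1} 2**-n * min(|x(n) - y(n)|, 1).

    x and y are functions n -> real coordinate; the neglected tail
    is at most 2**-terms.
    """
    return sum(2.0 ** -n * min(abs(x(n) - y(n)), 1.0)
               for n in range(1, terms + 1))


if __name__ == "__main__":
    r = 0.1
    N = smallest_tail_index(r)
    x = lambda n: 0.0
    # y agrees with x on the first N coordinates and is far away afterwards.
    y = lambda n: 0.0 if n <= N else 100.0
    print(N, product_distance(x, y), r / 2)
```

Only the first $N$ coordinates need to be controlled; the remaining ones contribute less than $r/2$ no matter how far apart they are.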
7. Baire theorem(s)
In this section we discuss a topological phenomenon that occurs in certain
topological spaces. It deals with interiors of closed sets.
Exercise 1. Let X be a topological space, and let A and B be closed sets with
the property that Int(A ∪ B) ≠ ∅. Prove that either Int(A) ≠ ∅, or Int(B) ≠ ∅.
Exercise 2. Give an example of a topological space X and of two (non-closed)
sets A and B such that Int(A ∪ B) ≠ ∅, but Int(A) = Int(B) = ∅.
Theorem 7.1 (Baire's Theorem). Let $(X, \mathcal{T})$ be a topological Hausdorff space, which satisfies one (or both) of the following properties:
(a) There exists a metric $d$ on $X$, which makes $(X, d)$ a complete metric space, and $\mathcal{T}$ is the metric topology.
(b) $X$ is locally compact.
Suppose one has a sequence $(F_n)_{n\ge 1}$ of closed subsets of $X$, such that $X = \bigcup_{n=1}^{\infty} F_n$. Then there exists some integer $n \ge 1$, such that $\mathrm{Int}(F_n) \ne \emptyset$.
Proof. For every $n \ge 1$ we define the closed set $G_n = \bigcup_{k=1}^{n} F_k$, so that we still have $X = \bigcup_{n=1}^{\infty} G_n$, but we also have $G_1 \subset G_2 \subset \dots$. According to Exercise 1 (use an inductive argument) it suffices to show that there exists some $n \ge 1$, with $\mathrm{Int}(G_n) \ne \emptyset$. We are going to prove this property by contradiction.
(∗) Assume $\mathrm{Int}(G_n) = \emptyset$, for all $n \ge 1$.
Claim: Under the assumption (∗) there exists a sequence $(D_n)_{n\ge 1}$ of non-empty open sets, such that for all $n \ge 1$ we have:
(i) $D_n \cap G_n = \emptyset$;
(ii) $\overline{D_{n+1}} \subset D_n$;
(iii) In case (a) we have $\mathrm{diam}(D_n) \le 2^{-n}$; in case (b) $\overline{D_n}$ is compact.
The sequence is constructed recursively. To construct $D_1$ we use the fact that $\mathrm{Int}(G_1) = \emptyset$ forces $X \setminus G_1 \ne \emptyset$. We then choose a point $x \in X \setminus G_1$. In case (a) we know that there exists $r > 0$ such that $B_r(x) \subset X \setminus G_1$. We put $\rho = \min\{r, \frac14\}$ and we set $D_1 = B_\rho(x)$. In case (b) we apply Lemma 5.1 to find $D_1$ open with $\overline{D_1}$ compact, such that $x \in D_1 \subset \overline{D_1} \subset X \setminus G_1$.
Let us assume now that we have constructed $D_1, D_2, \dots, D_k$, such that (i) and (iii) hold for all $n \in \{1, \dots, k\}$, and such that (ii) holds for all $n \in \{1, \dots, k - 1\}$, and let us indicate how the next set $D_{k+1}$ is constructed. Using the assumption that $\mathrm{Int}(G_{k+1}) = \emptyset$, it follows that the open set $D_k \setminus G_{k+1}$ is non-empty. Choose then a point $x \in D_k \setminus G_{k+1}$. In case (a) there exists some $r > 0$ such that $B_r(x) \subset D_k \setminus G_{k+1}$. We then put $\rho = \min\{\frac{r}{2}, \frac{1}{2^{k+2}}\}$, and we define $D_{k+1} = B_\rho(x)$.
In case (b) we apply Lemma 5.1 and find an open set $D_{k+1}$ with $\overline{D_{k+1}}$ compact, and $x \in D_{k+1} \subset \overline{D_{k+1}} \subset D_k \setminus G_{k+1}$. All properties (i)-(iii) are easily verified.
Having proven the Claim, let us see now that the assumption (∗) produces a
contradiction.
Case (a): In this case we choose, for each $n \ge 1$ a point $x_n \in D_n$. Notice that, for every $m \ge n \ge 1$ we have
\[ x_m, x_n \in D_n \quad\text{and}\quad d(x_m, x_n) \le \mathrm{diam}(D_n) \le \frac{1}{2^n}. \]
In particular, this proves that the sequence $(x_n)_{n\ge 1}$ is Cauchy, hence convergent to some point $x$. Since $x_m \in D_n$, $\forall\, m \ge n \ge 1$, we see that $x \in \overline{D_n}$, for all $n \ge 1$. In other words we get
\[ \bigcap_{n=1}^{\infty} \overline{D_n} \ne \emptyset. \tag{1} \]
Case (b): In this case we also get (1), this time as a consequence of the compactness of the sets $\overline{D_n}$ (and the finite intersection property).
Let us notice now that (1) combined with (ii) will also give $\bigcap_{n=1}^{\infty} D_n \ne \emptyset$. But this is impossible, since by (i) we have
\[ \bigcap_{n=1}^{\infty} D_n \subset \bigcap_{n=1}^{\infty} (X \setminus G_n) = X \setminus \bigcup_{n=1}^{\infty} G_n = \emptyset. \]
Chapter II
Elements of Functional Analysis
Lecture 8
1. Hahn-Banach Theorems
The result we are going to discuss is one of the most fundamental theorems in
the whole field of Functional Analysis. Its statement is simple but quite technical.
Definitions. Let K be either of the fields R or C. Suppose X is a K-vector
space.
A. A map q : X → R is said to be a quasi-seminorm, if
(i) q(x + y) ≤ q(x) + q(y), for all x, y ∈ X ;
(ii) q(tx) = tq(x), for all x ∈ X and all t ∈ R with t ≥ 0.
B. A map q : X → R is said to be a seminorm if, in addition to the above
two properties, it satisfies:
(ii’) q(λx) = |λ|q(x), for all x ∈ X and all λ ∈ K.
Remark that if q : X → R is a seminorm, then q(x) ≥ 0, for all x ∈ X . (Use
2q(x) = q(x) + q(−x) ≥ q(0) = 0.)
There are several versions of the Hahn-Banach Theorem.
Theorem 1.1 (Hahn-Banach, R-version). Let X be an R-vector space. Suppose
q : X → R is a quasi-seminorm. Suppose also we are given a linear subspace Y ⊂ X
and a linear map φ : Y → R, such that
φ(y) ≤ q(y), for all y ∈ Y.
Then there exists a linear map ψ : X → R such that
(i) $\psi|_Y = \phi$;
(ii) ψ(x) ≤ q(x) for all x ∈ X .
Proof. We first prove the Theorem in the following:
Particular Case: Assume dim X /Y = 1.
This means there exists some vector x0 ∈ X such that
X = {y + sx0 : y ∈ Y, s ∈ R}.
What we need is to prescribe the value ψ(x0 ). In other words, we need a number
α ∈ R such that, if we define ψ : X → R by ψ(y + sx0 ) = φ(y) + sα, ∀ y ∈ Y, s ∈ R,
then this map satisfies condition (ii). For s > 0, condition (ii) reads:
φ(y) + sα ≤ q(y + sx0 ), ∀ y ∈ Y, s > 0,
and, upon dividing by $s$ (set $z = s^{-1}y$), is equivalent to:
(1) α ≤ q(z + x0 ) − φ(z), ∀ z ∈ Y.
For s < 0, condition (ii) reads (use t = −s):
φ(y) − tα ≤ q(y − tx0 ), ∀ y ∈ Y, t > 0,
Using Zorn's Lemma, $\Xi$ possesses a maximal element $(Z, \psi)$. The proof of the Theorem is finished once we prove that $Z = X$. Assume $Z \subsetneq X$ and choose a vector $x_0 \in X \setminus Z$. Form the subspace $V = \{z + tx_0 : z \in Z,\ t \in \mathbb{R}\}$ and apply the particular case of the Theorem for the inclusion $Z \subset V$, for $\psi : Z \to \mathbb{R}$ and for the quasi-seminorm $q|_V : V \to \mathbb{R}$. It follows that there exists some linear functional $\eta : V \to \mathbb{R}$ such that
(i) $\eta|_Z = \psi$ (in particular we will also have $\eta|_Y = \phi$);
(ii) $\eta(v) \le q(v)$, for all $v \in V$.
But then the element $(V, \eta) \in \Xi$ will contradict the maximality of $(Z, \psi)$.
Theorem 1.2 (Hahn-Banach, C-version). Let $X$ be a $\mathbb{C}$-vector space. Suppose $q : X \to \mathbb{R}$ is a quasi-seminorm. Suppose also we are given a linear subspace $Y \subset X$ and a linear map $\phi : Y \to \mathbb{C}$, such that
\[ \mathrm{Re}\,\phi(y) \le q(y), \quad\text{for all } y \in Y. \]
Then there exists a linear map $\psi : X \to \mathbb{C}$ such that
(i) $\psi|_Y = \phi$;
(ii) $\mathrm{Re}\,\psi(x) \le q(x)$ for all $x \in X$.
Proof. Regard for the moment both X and Y as R-vector spaces. Define the
R-linear map φ1 : Y → R by φ1 (y) = Re φ(y), for all y ∈ Y, so that we have
φ1 (y) ≤ q(y), ∀ y ∈ Y.
Use Theorem 1 to find an R-linear map ψ1 : X → R such that
(i) $\psi_1|_Y = \phi_1$;
(ii) ψ1 (x) ≤ q(x), for all x ∈ X .
Define the map ψ : X → C by
ψ(x) = ψ1 (x) − iψ1 (ix), for all x ∈ X .
Claim 1: ψ is C-linear.
It is obvious that ψ is R-linear, so the only thing to prove is that ψ(ix) = iψ(x),
for all x ∈ X . But this is quite obvious:
\[ \psi(ix) = \psi_1(ix) - i\psi_1(i^2 x) = \psi_1(ix) - i\psi_1(-x) = -i^2\psi_1(ix) + i\psi_1(x) = i\bigl[\psi_1(x) - i\psi_1(ix)\bigr] = i\psi(x), \quad \forall\, x \in X. \]
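The passage from the $\mathbb{R}$-linear map $\psi_1$ to the $\mathbb{C}$-linear map $\psi$ can be checked numerically. The following is a minimal sketch, assuming $X = \mathbb{C}^2$ and one arbitrarily chosen $\mathbb{R}$-linear functional; the names `complexify` and `psi1` and the coefficients are hypothetical. It verifies that $\psi(\lambda x) = \lambda\psi(x)$ for a complex scalar $\lambda$ and that $\mathrm{Re}\,\psi = \psi_1$.

```python
def complexify(psi1):
    """Return psi(x) = psi1(x) - i * psi1(i x) for an R-linear functional psi1."""
    return lambda x: psi1(x) - 1j * psi1([1j * xk for xk in x])


def psi1(x):
    """An R-linear (not C-linear) functional on C^2; coefficients are arbitrary."""
    a = [2.0, -1.0]   # weights of the real parts
    b = [0.5, 3.0]    # weights of the imaginary parts
    return sum(ak * xk.real + bk * xk.imag for ak, bk, xk in zip(a, b, x))


psi = complexify(psi1)

if __name__ == "__main__":
    x = [1 + 2j, -3 + 0.5j]
    lam = 0.7 - 1.3j
    print(psi([lam * xk for xk in x]), lam * psi(x))  # should agree
    print(psi(x).real, psi1(x))                        # should agree
```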
Proof. We are going to apply Theorems 1 and 2, using the fact that q is also
a quasi-seminorm.
The case K = R. Remark that
φ(y) ≤ |φ(y)| ≤ q(y), ∀ y ∈ Y.
So we can apply Theorem 1 and find ψ : X → R with
(i) $\psi|_Y = \phi$;
(ii) ψ(x) ≤ q(x), for all x ∈ X .
In the remainder of this section we will discuss the geometric form of the
Hahn-Banach theorems. We begin by describing a method of constructing quasi-
seminorms.
Proposition 1.1. Let X be a real vector space. Suppose C ⊂ X is a convex
subset, which contains 0, and has the property
\[ \bigcup_{t>0} tC = X. \tag{6} \]
Indeed, if $t \in T_C(\lambda x)$, we have $\lambda x \in tC$, which means that $\lambda^{-1}tx \in C$, i.e. $\lambda^{-1}t \in T_C(x)$. Consequently we have
\[ t = \lambda(\lambda^{-1}t) \in \lambda T_C(x), \]
which proves the inclusion
\[ T_C(\lambda x) \subset \lambda T_C(x). \]
To prove the other inclusion, we start with some $s \in \lambda T_C(x)$, which means that there exists some $t \in T_C(x)$ with $\lambda t = s$. The fact that $t = \lambda^{-1}s$ belongs to $T_C(x)$ means that $x \in \lambda^{-1}sC$, so we get $\lambda x \in sC$, so $s$ indeed belongs to $T_C(\lambda x)$.
Claim 2: For every $x, y \in X$, one has the inclusion (see footnote 4)
\[ T_C(x + y) \supset T_C(x) + T_C(y). \]
Start with some $t \in T_C(x)$ and some $s \in T_C(y)$. Define the elements $u = t^{-1}x$ and $v = s^{-1}y$. Since $u, v \in C$, and $C$ is convex, it follows that $C$ contains the element
\[ \frac{t}{t+s}\,u + \frac{s}{t+s}\,v = \frac{1}{t+s}(x + y), \]
which means that $x + y \in (t + s)C$, so $t + s$ indeed belongs to $T_C(x + y)$.
We can now conclude the proof. If $x \in X$ and $\lambda > 0$, then the equality
\[ Q_C(\lambda x) = \lambda Q_C(x) \]
is an immediate consequence of Claim 1. If $x, y \in X$, then the inequality
\[ Q_C(x + y) \le Q_C(x) + Q_C(y) \]
is an immediate consequence of Claim 2.
Definition. Under the hypothesis of the above proposition, the quasi-semi-
norm QC is called the Minkowski functional associated with the set C.
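The Minkowski functional can be approximated numerically when membership in $C$ is easy to test. The sketch below is illustrative only: the function `minkowski`, the predicate `in_C` and the particular set $C = \{(u, v) \in \mathbb{R}^2 : -2 \le u \le 1,\ |v| \le 1\}$ are assumptions made for the example. Since $C$ is convex and contains $0$, the set $T_C(x) = \{t > 0 : x \in tC\}$ is upward closed, so $Q_C(x) = \inf T_C(x)$ can be located by bisection. For this $C$ one has $Q_C(u, v) = \max\{u, -u/2, |v|\}$, which is not symmetric, so $Q_C$ is a quasi-seminorm but not a seminorm.

```python
def minkowski(in_C, x, iters=60):
    """Approximate Q_C(x) = inf{t > 0 : x in tC} by bisection.

    Assumes C is convex, contains 0 and absorbs every vector, so that
    {t > 0 : x/t in C} is an upward-closed half-line.
    """
    hi = 1.0
    while not in_C([xi / hi for xi in x]):   # find some t with x in tC
        hi *= 2.0
    lo = 0.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if in_C([xi / mid for xi in x]):
            hi = mid
        else:
            lo = mid
    return hi


def in_C(p):
    """Membership test for the example set C (an asymmetric rectangle)."""
    return -2.0 <= p[0] <= 1.0 and abs(p[1]) <= 1.0


if __name__ == "__main__":
    print(minkowski(in_C, [3.0, 1.0]))    # close to 3
    print(minkowski(in_C, [-3.0, 1.0]))   # close to 1.5
```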
Remark 1.1. Let X be a real vector space. Suppose C ⊂ X is a convex subset,
which contains 0, and has the property (6). Then one has the inclusions
{x ∈ X : QC (x) < 1} ⊂ C ⊂ {x ∈ X : QC (x) ≤ 1}.
The second inclusion is pretty obvious, since if we start with some x ∈ C, using the
notations from the proof of Proposition 2.1, we have 1 ∈ TC (x), so
QC (x) = inf TC (x) ≤ 1.
To prove the first inclusion, start with some $x \in X$ with $Q_C(x) < 1$. In particular this means that there exists some $t \in (0, 1)$ such that $x \in tC$. Define the vector $y = t^{-1}x \in C$ and notice now that, since $C$ is convex, it will contain the convex combination $ty + (1 - t)0 = x$.
Exercise 1. Let X be a real vector space, and let q : X → R be a quasi-seminorm.
Define the sets
C0 = {x ∈ X : q(x) < 1},
C1 = {x ∈ X : q(x) ≤ 1}.
(i) Prove that $C_0$ and $C_1$ are both convex, they contain 0, and they both have property (6).
Footnote 4: For subsets $T, S \subset \mathbb{R}$ we define $T + S = \{t + s : t \in T,\ s \in S\}$.
Proof. Regard X as a real topological vector space, and apply the real version
to produce an R-linear continuous map φ1 : X → R, and a real number α, such that
φ1 (a) < α ≤ φ1 (b), ∀ a ∈ A, b ∈ B.
Then the function φ : X → C defined by
φ(x) = φ1 (x) − iφ1 (ix), x ∈ X
will clearly satisfy the desired properties.
Proof. Start with some point $p \in \overline{C + D}$, and let us prove that $p \in C + D$. For every neighborhood $U$ of 0, the set $p + U$ is a neighborhood of $p$, so by assumption, we have
\[ (p + U) \cap (C + D) \ne \emptyset. \tag{11} \]
Define, for each neighborhood $U$ of 0, the set
\[ A_U = (p + U - D) \cap C. \]
Using (11), it is clear that $A_U$ is non-empty. It is also clear that, if $U_1 \subset U_2$, then $A_{U_1} \subset A_{U_2}$. Using the compactness of $C$, it follows that
\[ \bigcap_{U \text{ neighborhood of } 0} \overline{A_U} \ne \emptyset. \]
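When $K = \mathbb{C}$, the space $c_0^{\mathbb{C}}(I)$ is simply denoted by $c_0(I)$. When $I = \mathbb{N}$, the set of natural numbers, the space $c_0^K(\mathbb{N})$ can be equivalently described as
\[ c_0^K(\mathbb{N}) = \bigl\{ \alpha = (\alpha_n)_{n\ge 1} \subset K : \lim_{n\to\infty} \alpha_n = 0 \bigr\}. \]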
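Lemma 2.1. Let $K$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, let $I$ be a non-empty set, and let $\alpha : I \to [0, \infty)$. The following are equivalent:
(i) $\alpha$ is summable;
(ii) $\sup\bigl\{ \sum_{i\in F} \alpha(i) : F \subset I, \text{ finite} \bigr\} < \infty$.
Moreover, in this case we have
\[ \sup\Bigl\{ \sum_{i\in F} \alpha(i) : F \subset I, \text{ finite} \Bigr\} = \sum_{i\in I} \alpha(i). \]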
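Proof. We denote the quantity $\sup\bigl\{ \sum_{i\in F} \alpha(i) : F \subset I, \text{ finite} \bigr\}$ simply by $t$.
(i) ⇒ (ii). Assume $\alpha$ is summable, and denote $\sum_{i\in I} \alpha(i)$ simply by $s$. Choose, for each $\varepsilon > 0$ a finite set $F_\varepsilon \subset I$ such that
\[ \Bigl| s - \sum_{i\in F} \alpha(i) \Bigr| < \varepsilon, \quad\text{for all finite subsets } F \subset I \text{ with } F \supset F_\varepsilon. \]
Claim: For any finite set $G \subset I$, and any $\varepsilon > 0$, one has the inequality
\[ \sum_{i\in G} \alpha(i) < s + \varepsilon. \]
Indeed, if we take the finite set $G \cup F_\varepsilon$, then using the fact that all $\alpha$'s are non-negative, we get
\[ \sum_{i\in G} \alpha(i) \le \sum_{i\in G\cup F_\varepsilon} \alpha(i) < s + \varepsilon. \]
Using the Claim, which holds for any $\varepsilon > 0$, we immediately get
\[ \sum_{i\in G} \alpha(i) \le s, \quad\text{for all finite subsets } G \subset I, \]
so we immediately get
\[ \Bigl| t - \sum_{i\in F} \alpha(i) \Bigr| < \varepsilon. \]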
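such that
\[ \Bigl| u - \sum_{j\in E} \mathrm{Re}\,\alpha(j) \Bigr| < \frac{\varepsilon}{2}, \quad\text{for all finite sets } E \subset I \text{ with } E \supset E_\varepsilon, \]
\[ \Bigl| v - \sum_{j\in G} \mathrm{Im}\,\alpha(j) \Bigr| < \frac{\varepsilon}{2}, \quad\text{for all finite sets } G \subset I \text{ with } G \supset G_\varepsilon. \]
so we get
\[ \Bigl| [u + iv] - \sum_{j\in F} \alpha(j) \Bigr| = \Bigl| \Bigl[u - \sum_{j\in F} \mathrm{Re}\,\alpha(j)\Bigr] + i\Bigl[v - \sum_{j\in F} \mathrm{Im}\,\alpha(j)\Bigr] \Bigr| \le \Bigl| u - \sum_{j\in F} \mathrm{Re}\,\alpha(j) \Bigr| + \Bigl| v - \sum_{j\in F} \mathrm{Im}\,\alpha(j) \Bigr| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]
This proves that $\alpha$ is indeed summable, and $\sum_{j\in I} \alpha(j) = u + iv$.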
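Exercise 8. Let $K$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $I$ be a non-empty set. Suppose one has two non-empty sets $I_1, I_2$ with
\[ I = I_1 \cup I_2 \quad\text{and}\quad I_1 \cap I_2 = \emptyset. \]
Suppose $\alpha : I \to K$ has the property that both $\alpha|_{I_1} : I_1 \to K$ and $\alpha|_{I_2} : I_2 \to K$ are summable. Prove that $\alpha$ is summable, and
\[ \sum_{j\in I} \alpha(j) = \sum_{j\in I_1} \alpha(j) + \sum_{j\in I_2} \alpha(j). \]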
Proof. (i) ⇒ (ii). Assume $\alpha$ is summable. We divide the proof in two cases:
Case $K = \mathbb{R}$. Define the sets
\[ I^+ = \{j \in I : \alpha(j) > 0\}, \quad I^- = \{j \in I : \alpha(j) < 0\}, \quad I^0 = \{j \in I : \alpha(j) = 0\}. \]
More generally, for any subset $F \subset I$ we define $F^\pm = F \cap I^\pm$ and $F^0 = F \cap I^0$.
Claim: Both maps $\alpha|_{I^+} : I^+ \to \mathbb{R}$ and $\alpha|_{I^-} : I^- \to \mathbb{R}$ are summable. Moreover, one has the equality
\[ \sum_{j\in I} \alpha(j) = \sum_{j\in I^+} \alpha(j) + \sum_{j\in I^-} \alpha(j). \tag{2} \]
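Denote the sum $\sum_{j\in I} \alpha(j)$ simply by $s$. Start by choosing some finite set $F \subset I$ such that
\[ \Bigl| s - \sum_{j\in G} \alpha(j) \Bigr| < 1, \quad\text{for all finite sets } G \subset I \text{ with } G \supset F. \]
so we get
\[ \sum_{j\in E} \alpha(j) \le \sum_{j\in E\cup F^+} \alpha(j) = \Bigl[ \sum_{j\in E\cup F^+} \alpha(j) + \sum_{j\in F^0\cup F^-} \alpha(j) \Bigr] - \sum_{j\in F^0\cup F^-} \alpha(j) = \sum_{j\in \tilde E} \alpha(j) - \sum_{j\in F^-} \alpha(j) < s + 1 - \sum_{j\in F^-} \alpha(j). \]
In particular, we get
\[ |s| \le \varepsilon + \Bigl| \sum_{j\in F_\varepsilon} \alpha(j) \Bigr| \le \varepsilon + \sum_{j\in F_\varepsilon} |\alpha(j)| \le \varepsilon + \sum_{j\in I} |\alpha(j)|. \]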
The following result shows that summability is essentially the same as the
summability of series.
Proposition 2.2. Suppose $\alpha : I \to K$ is summable. Then the support set
\[ [[\alpha]] = \{j \in I : \alpha(j) \ne 0\} \]
is at most countable.
Since $|\alpha|$ is summable, the sets $J_n$, $n \ge 1$ are all finite. The desired result then follows from the obvious equality $[[\alpha]] = \bigcup_{n=1}^{\infty} J_n$.
Hint: Analyze the derivative: $f'(t) = u - v\Bigl(\dfrac{t^q}{1 - t^q}\Bigr)^{1/p}$, $t \in (0, 1)$.
Lemma 2.3 (Hölder's inequality). Let $a_1, a_2, \dots, a_n, b_1, b_2, \dots, b_n$ be non-negative numbers. Let $p, q > 1$ be real numbers with the property $\frac1p + \frac1q = 1$. Then:
\[ \sum_{j=1}^{n} a_j b_j \le \Bigl( \sum_{j=1}^{n} a_j^p \Bigr)^{1/p} \cdot \Bigl( \sum_{j=1}^{n} b_j^q \Bigr)^{1/q}. \tag{3} \]
Moreover, one has equality only when the sequences $(a_1^p, \dots, a_n^p)$ and $(b_1^q, \dots, b_n^q)$ are proportional.
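As a sanity check, both sides of (3) can be evaluated numerically. The following is a minimal sketch; the function name `holder_sides` and the sample data are hypothetical. The first example is a generic pair of sequences, for which the inequality is strict; in the second, $b_j = a_j^{p/q}$, so $(a_j^p)$ and $(b_j^q)$ coincide and the two sides agree up to rounding.

```python
def holder_sides(a, b, p):
    """Return (lhs, rhs) of Hoelder's inequality for non-negative a, b and p > 1."""
    q = p / (p - 1)                      # Hoelder conjugate: 1/p + 1/q = 1
    lhs = sum(x * y for x, y in zip(a, b))
    rhs = sum(x ** p for x in a) ** (1 / p) * sum(y ** q for y in b) ** (1 / q)
    return lhs, rhs


if __name__ == "__main__":
    # Generic case: strict inequality.
    print(holder_sides([1.0, 2.0, 3.0], [3.0, 1.0, 2.0], p=3))
    # Proportional case: with p = 3, q = 3/2 we take b_j = a_j ** (p / q) = a_j ** 2.
    print(holder_sides([1.0, 2.0, 3.0], [1.0, 4.0, 9.0], p=3))
```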
Proof. The proof will be carried on by induction on n. The case n = 1 is
trivial.
Case n = 2.
Assume $(b_1, b_2) \ne (0, 0)$. (Otherwise everything is trivial.) Define the number
\[ r = \frac{b_1}{(b_1^q + b_2^q)^{1/q}}. \]
Notice that $r \in [0, 1]$, and we have
\[ \frac{b_2}{(b_1^q + b_2^q)^{1/q}} = (1 - r^q)^{1/q}. \]
Notice also that, upon dividing by $(b_1^q + b_2^q)^{1/q}$, the desired inequality
\[ a_1 b_1 + a_2 b_2 \le (a_1^p + a_2^p)^{1/p} (b_1^q + b_2^q)^{1/q} \tag{4} \]
reads
\[ a_1 r + a_2 (1 - r^q)^{1/q} \le (a_1^p + a_2^p)^{1/p}, \]
and it follows immediately from the exercise, applied to the function
\[ f(t) = a_1 t + a_2 (1 - t^q)^{1/q}, \quad t \in [0, 1]. \]
Let us examine when equality holds. If $a_1 = a_2 = 0$, the equality obviously holds, and in this case $(a_1, a_2)$ is clearly proportional to $(b_1, b_2)$. Assume $(a_1, a_2) \ne (0, 0)$. Put
\[ s = \frac{a_1^{p/q}}{(a_1^p + a_2^p)^{1/q}}, \]
and notice that
\[ (1 - s^q)^{1/q} = \Bigl( 1 - \frac{a_1^p}{a_1^p + a_2^p} \Bigr)^{1/q} = \frac{a_2^{p/q}}{(a_1^p + a_2^p)^{1/q}}, \]
so we have
\[ f(s) = \frac{a_1^{1 + p/q} + a_2^{1 + p/q}}{(a_1^p + a_2^p)^{1/q}} = \frac{a_1^p + a_2^p}{(a_1^p + a_2^p)^{1/q}} = (a_1^p + a_2^p)^{1 - 1/q} = (a_1^p + a_2^p)^{1/p} = \max_{t\in[0,1]} f(t). \]
By the exercise, it follows that we have equality in (4) precisely when $r = s$, i.e.
\[ \frac{b_1}{(b_1^q + b_2^q)^{1/q}} = \frac{a_1^{p/q}}{(a_1^p + a_2^p)^{1/q}}, \]
or equivalently
\[ \frac{b_1^q}{b_1^q + b_2^q} = \frac{a_1^p}{a_1^p + a_2^p}. \]
Obviously this forces
\[ \frac{b_2^q}{b_1^q + b_2^q} = \frac{a_2^p}{a_1^p + a_2^p}, \]
so indeed $(a_1^p, a_2^p)$ and $(b_1^q, b_2^q)$ are proportional.
Having proven the case $n = 2$, we now proceed with the proof of:
The implication: Case $n = k$ ⇒ Case $n = k + 1$.
Start with two sequences $(a_1, a_2, \dots, a_k, a_{k+1})$ and $(b_1, b_2, \dots, b_k, b_{k+1})$. Define the numbers
\[ a = \Bigl( \sum_{j=1}^{k} a_j^p \Bigr)^{1/p} \quad\text{and}\quad b = \Bigl( \sum_{j=1}^{k} b_j^q \Bigr)^{1/q}. \]
Using the assumption that the case $n = k$ holds, we have
\[ \sum_{j=1}^{k+1} a_j b_j \le \Bigl( \sum_{j=1}^{k} a_j^p \Bigr)^{1/p} \cdot \Bigl( \sum_{j=1}^{k} b_j^q \Bigr)^{1/q} + a_{k+1} b_{k+1} = ab + a_{k+1} b_{k+1}. \tag{5} \]
so combining with (5) we see that the desired inequality (3) holds for $n = k + 1$.
Assume now we have equality. Then we must have equality in both (5) and in (6). On the one hand, the equality in (5) forces $(a_1^p, a_2^p, \dots, a_k^p)$ and $(b_1^q, b_2^q, \dots, b_k^q)$ to be proportional (since we assume the case $n = k$). On the other hand, the equality in (6) forces $(a^p, a_{k+1}^p)$ and $(b^q, b_{k+1}^q)$ to be proportional (by the case $n = 2$). Since
\[ a^p = \sum_{j=1}^{k} a_j^p \quad\text{and}\quad b^q = \sum_{j=1}^{k} b_j^q, \]
it is clear that $(a_1^p, a_2^p, \dots, a_k^p, a_{k+1}^p)$ and $(b_1^q, b_2^q, \dots, b_k^q, b_{k+1}^q)$ are proportional.
Definition. Two numbers $p, q \in [1, \infty]$ are said to be Hölder conjugate, if $\frac1p + \frac1q = 1$. Here we use the convention $\frac{1}{\infty} = 0$.
Proposition 2.3. Let $K$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, let $I$ be a non-empty set, and let $p, q \in [1, \infty]$ be two Hölder conjugate numbers. If $\alpha \in \ell^p_K(I)$ and $\beta \in \ell^q_K(I)$, then $\alpha\beta \in \ell^1_K(I)$, and
\[ \|\alpha\beta\|_1 \le \|\alpha\|_p \cdot \|\beta\|_q. \]
j∈F j∈I
j∈F j∈I
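so we get
\[ \Bigl( \sum_{j\in F} |\alpha(j)|^p \Bigr)^{1/p} \le \|\alpha\|_p \quad\text{and}\quad \Bigl( \sum_{j\in F} |\beta(j)|^q \Bigr)^{1/q} \le \|\beta\|_q, \]
so when we go back to (8) we immediately get the desired inequality (7).
In the case when $p = 1$, we immediately have
\[ \sum_{j\in F} |\alpha(j)\beta(j)| \le \Bigl( \sum_{j\in F} |\alpha(j)| \Bigr) \cdot \max_{j\in F} |\beta(j)| \le \Bigl( \sum_{j\in I} |\alpha(j)| \Bigr) \cdot \sup_{j\in I} |\beta(j)| = \|\alpha\|_1 \cdot \|\beta\|_\infty. \]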
(Remark that $\mathrm{fin}_K(I) \subset \ell^q_K(I)$, for all $q \in [1, \infty]$.)
Theorem 2.1 (Dual definition of $\ell^p$ spaces). Let $p, q \in (1, \infty)$ be Hölder conjugate numbers, let $K$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $I$ be a non-empty set. For a function $\alpha : I \to K$, the following are equivalent:
(i) $\alpha \in \ell^p_K(I)$;
(ii) $\sup_{\beta\in B^q_K(I)} |\langle\alpha, \beta\rangle| < \infty$.
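So in any case we have $\beta_\alpha^F \in B^q_K(I)$. Notice also that, unless $\beta_\alpha^F$ is identically zero, we have
\[ \langle\alpha, \beta_\alpha^F\rangle = \sum_{i\in F} \alpha(i)\beta_\alpha^F(i) = \frac{\sum_{i\in F} |\alpha(i)|^{1 + p/q}}{\bigl( \sum_{j\in F} |\alpha(j)|^p \bigr)^{1/q}} = \frac{\sum_{i\in F} |\alpha(i)|^p}{\bigl( \sum_{j\in F} |\alpha(j)|^p \bigr)^{1/q}} = \Bigl( \sum_{i\in F} |\alpha(i)|^p \Bigr)^{1 - 1/q} = \Bigl( \sum_{i\in F} |\alpha(i)|^p \Bigr)^{1/p}. \tag{10} \]
It is clear that the equality (10) actually holds even when $\beta_\alpha^F$ is identically zero.
To make the exposition a bit clearer, we denote the quantity $\sup_{\beta\in B^q_K(I)} |\langle\alpha, \beta\rangle|$ simply by $|||\alpha|||$.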
We now proceed with the proof of the Theorem.
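(i) ⇒ (ii). Assume $\alpha \in \ell^p_K(I)$. In order to prove (ii) it suffices to prove the inequality
\[ |||\alpha||| \le \|\alpha\|_p. \tag{11} \]
Start with some arbitrary $\beta \in B^q_K(I)$. Using Hölder's inequality we have
\[ |\langle\alpha, \beta\rangle| = \Bigl| \sum_{j\in[[\beta]]} \alpha(j)\beta(j) \Bigr| \le \sum_{j\in[[\beta]]} |\alpha(j)| \cdot |\beta(j)| \le \Bigl( \sum_{j\in[[\beta]]} |\alpha(j)|^p \Bigr)^{1/p} \cdot \Bigl( \sum_{j\in[[\beta]]} |\beta(j)|^q \Bigr)^{1/q} \le \sup_{F\subset I \text{ finite}} \Bigl( \sum_{i\in F} |\alpha(i)|^p \Bigr)^{1/p} = \|\alpha\|_p. \]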
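In particular we get the fact that $\langle\alpha, \beta_\alpha^F\rangle = |\langle\alpha, \beta_\alpha^F\rangle|$, and the fact that $\beta_\alpha^F$ belongs to $B^q_K(I)$, combined with (13) will give
\[ \sum_{i\in F} |\alpha(i)|^p = |\langle\alpha, \beta_\alpha^F\rangle|^p \le \Bigl( \sup_{\beta\in B^q_K(I)} |\langle\alpha, \beta\rangle| \Bigr)^p = |||\alpha|||^p. \]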
Having proven the equivalence (i) ⇔ (ii), let us now observe that (9) is an
immediate consequence of (11) and (12).
Exercise 10. Prove that Theorem 9.1 holds also in the cases (p, q) = (1, ∞) and
(p, q) = (∞, 1).
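Corollary 2.1. Let $K$ be either $\mathbb{R}$ or $\mathbb{C}$, let $I$ be a non-empty set, and let $p \ge 1$.
(i) When equipped with point-wise addition and scalar multiplication, the set $\ell^p_K(I)$ is a $K$-vector space.
(ii) The map
\[ \ell^p_K(I) \ni \alpha \longmapsto \|\alpha\|_p \in [0, \infty) \]
is a norm.
Proof. Let $q$ be the Hölder conjugate of $p$. If $\alpha \in \ell^p_K(I)$, and $\lambda \in K$, then
\[ \langle\lambda\alpha, \beta\rangle = \lambda\langle\alpha, \beta\rangle, \quad \forall\, \beta \in \mathrm{fin}_K(I), \]
so we get
\[ \sup_{\beta\in B^q_K(I)} |\langle\lambda\alpha, \beta\rangle| = |\lambda| \cdot \sup_{\beta\in B^q_K(I)} |\langle\alpha, \beta\rangle|, \]
which gives the fact that $\lambda\alpha \in \ell^p_K(I)$, as well as the equality $\|\lambda\alpha\|_p = |\lambda| \cdot \|\alpha\|_p$.
If $\alpha_1, \alpha_2 \in \ell^p_K(I)$, then
\[ \langle\alpha_1 + \alpha_2, \beta\rangle = \langle\alpha_1, \beta\rangle + \langle\alpha_2, \beta\rangle, \quad \forall\, \beta \in \mathrm{fin}_K(I), \]
so we get
\[ \sup_{\beta\in B^q_K(I)} |\langle\alpha_1 + \alpha_2, \beta\rangle| = \sup_{\beta\in B^q_K(I)} \bigl| \langle\alpha_1, \beta\rangle + \langle\alpha_2, \beta\rangle \bigr| \le \sup_{\beta\in B^q_K(I)} \bigl( |\langle\alpha_1, \beta\rangle| + |\langle\alpha_2, \beta\rangle| \bigr) \le \sup_{\beta\in B^q_K(I)} |\langle\alpha_1, \beta\rangle| + \sup_{\beta\in B^q_K(I)} |\langle\alpha_2, \beta\rangle|, \]
which gives the fact that $\alpha_1 + \alpha_2 \in \ell^p_K(I)$, as well as the inequality
\[ \|\alpha_1 + \alpha_2\|_p \le \|\alpha_1\|_p + \|\alpha_2\|_p. \]
The implication $\|\alpha\|_p = 0 \Rightarrow \alpha = 0$ is obvious.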
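Exercise 11. Let $p \ge 1$ be a real number, let $K$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $I$ be a non-empty set. Prove that $\mathrm{fin}_K(I)$ is a dense linear subspace in $\ell^p_K(I)$.
Remark 2.3. Let $p, q \in [1, \infty]$ be Hölder conjugate. Then the map
\[ \ell^p_K(I) \times \ell^q_K(I) \ni (\alpha, \beta) \longmapsto \langle\alpha, \beta\rangle \in K \]
is bilinear, in the sense that for any $\gamma \in \ell^p_K(I)$ and any $\eta \in \ell^q_K(I)$, the maps
\[ \ell^p_K(I) \ni \alpha \longmapsto \langle\alpha, \eta\rangle \in K, \qquad \ell^q_K(I) \ni \beta \longmapsto \langle\gamma, \beta\rangle \in K \]
are linear. These facts follow immediately from Exercise ??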
We now examine linear continuous maps between normed spaces.
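which means there exists some sequence $(x_n)_{n\ge 1} \subset X$ such that
(a) $\|x_n\| \le 1$, $\forall\, n \ge 1$;
(b) $\lim_{n\to\infty} \|Tx_n\| = \infty$.
Put
\[ z_n = \|Tx_n\|^{-1} x_n, \quad \forall\, n \ge 1. \]
On the one hand, we have
\[ \|z_n\| = \frac{\|x_n\|}{\|Tx_n\|} \le \frac{1}{\|Tx_n\|}, \quad \forall\, n \ge 1, \]
which gives $\lim_{n\to\infty} \|z_n\| = 0$, i.e. $\lim_{n\to\infty} z_n = 0$. Since $T$ is assumed to be continuous, we will get
\[ \lim_{n\to\infty} Tz_n = T0 = 0. \tag{14} \]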
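\[ u_n = \begin{cases} \|x_n\|^{-1} x_n, & \text{if } x_n \ne 0 \\ \text{any vector of norm } 1, & \text{if } x_n = 0 \end{cases} \]
so that we have
\[ \|u_n\| = 1 \quad\text{and}\quad x_n = \|x_n\| u_n, \quad \forall\, n \ge 1. \]
Since $T$ is linear, we have
\[ Tx_n = \|x_n\| Tu_n, \quad \forall\, n \ge 1. \tag{15} \]
If we define $M = \sup\bigl\{ \|Tx\| : x \in X,\ \|x\| = 1 \bigr\}$, then $\|Tu_n\| \le M$, $\forall\, n \ge 1$, so (15) will give
\[ \|Tx_n\| \le M \cdot \|x_n\|, \quad \forall\, n \ge 1, \]
and the condition $\lim_{n\to\infty} x_n = 0$ will force $\lim_{n\to\infty} Tx_n = 0$.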
(iv) ⇒ (i). Assume T is continuous at 0, and let us prove that T is continuous at
any point. Start with some arbitrary x ∈ X and an arbitrary sequence (xn )n≥1 ⊂ X
with limn→∞ xn = x. Put zn = xn − x, so that limn→∞ zn = 0. Then we will have
limn→∞ T zn = 0, which (use the linearity of T ) means that
\[ 0 = \lim_{n\to\infty} \|Tz_n\| = \lim_{n\to\infty} \|Tx_n - Tx\|, \]
\[ M_2 = \sup\bigl\{ \|Tx\| : x \in X,\ \|x\| = 1 \bigr\}, \]
When $Y = K$ (equipped with the absolute value as the norm), the space $L(X, K)$ will be denoted simply by $X^*$, and will be called the topological dual of $X$.
Proposition 2.5. Let $K$ be either $\mathbb{R}$ or $\mathbb{C}$, and let $X$ and $Y$ be normed $K$-vector spaces.
(i) The space $L(X, Y)$ is a $K$-vector space.
(ii) For $T \in L(X, Y)$ we have
\[ \|T\| = \min\bigl\{ C \ge 0 : \|Tx\| \le C\|x\|,\ \forall\, x \in X \bigr\}. \tag{16} \]
In particular one has
\[ \|Tx\| \le \|T\| \cdot \|x\|, \quad \forall\, x \in X. \tag{17} \]
(iii) The map $L(X, Y) \ni T \longmapsto \|T\| \in [0, \infty)$ is a norm.
Proof. The fact that $L(X, Y)$ is a vector space is clear.
(ii). Assume $T \in L(X, Y)$. We begin by proving (17). Start with some arbitrary $x \in X$, and write it as $x = \|x\|u$, for some $u \in X$ with $\|u\| = 1$. Then by definition we have $\|Tu\| \le \|T\|$, and by linearity we have
\[ \|Tx\| = \|x\| \cdot \|Tu\| \le \|x\| \cdot \|T\|. \]
To prove the equality (16) let us define the set
\[ C_T = \bigl\{ C \ge 0 : \|Tx\| \le C\|x\|,\ \forall\, x \in X \bigr\}. \]
On the one hand, by (17) we know that $\|T\| \in C_T$. On the other hand, if we take an arbitrary $C \in C_T$, then for every $u \in X$ with $\|u\| = 1$, we will have
\[ \|Tu\| \le C\|u\| = C, \]
so taking supremum, over all $u$ with $\|u\| = 1$, will immediately give $\|T\| \le C$. Since we now have
\[ \|T\| \le C, \quad \forall\, C \in C_T, \]
we clearly get $\|T\| = \min C_T$.
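A finite-dimensional example may help fix ideas. The sketch below is illustrative only and is not taken from the notes: it assumes $X = Y = \mathbb{R}^2$ with the norm $\|\cdot\|_\infty$ and a map $T$ given by a matrix, in which case the operator norm is the largest absolute row sum (a standard fact, used here without proof). The helper names are hypothetical. The code checks inequality (17) on random vectors and shows a unit vector at which the constant is attained, so no smaller constant belongs to $C_T$.

```python
import random


def sup_norm(x):
    """The norm max |x_k| on R^n."""
    return max(abs(t) for t in x)


def apply(T, x):
    """Matrix-vector product, with T given as a list of rows."""
    return [sum(t * xj for t, xj in zip(row, x)) for row in T]


def operator_norm_sup(T):
    """Operator norm of T with respect to sup-norms: largest absolute row sum."""
    return max(sum(abs(t) for t in row) for row in T)


T = [[1.0, -2.0], [3.0, 0.5]]
norm_T = operator_norm_sup(T)

if __name__ == "__main__":
    random.seed(0)
    for _ in range(1000):
        x = [random.uniform(-1, 1) for _ in range(2)]
        assert sup_norm(apply(T, x)) <= norm_T * sup_norm(x) + 1e-12
    # The unit vector (1, 1) matches the signs of the heaviest row.
    print(norm_T, sup_norm(apply(T, [1.0, 1.0])))
```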
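It is clear that $\delta^i \in \ell^q_K(I)$, for all $i \in I$. (In fact $\delta^i \in \mathrm{fin}_K(I)$.) We define $\alpha : I \to K$ by
\[ \alpha(i) = \phi(\delta^i), \quad \forall\, i \in I. \]
Notice that, for every $\beta \in \mathrm{fin}_K(I)$, we have
\[ \sum_{i\in I} \alpha(i)\beta(i) = \sum_{i\in\lfloor\beta\rfloor} \beta(i)\phi(\delta^i) = \phi\Bigl( \sum_{i\in\lfloor\beta\rfloor} \beta(i)\delta^i \Bigr) = \phi(\beta), \tag{21} \]
where $\lfloor\beta\rfloor = \{i \in I : \beta(i) \ne 0\}$. (Since $\beta \in \mathrm{fin}_K(I)$, the set $\lfloor\beta\rfloor$ is finite.) Using Hölder's inequality, the above computation shows that
\[ |\langle\alpha, \beta\rangle| \le \|\phi\| \cdot \|\beta\|_q, \quad \forall\, \beta \in \mathrm{fin}_K(I). \]
By Theorem 9.1 and Exercise 7, this proves that $\alpha \in \ell^p_K(I)$. Going back to (21) we now have
\[ \theta_\alpha(\beta) = \phi(\beta), \quad \forall\, \beta \in \mathrm{fin}_K(I). \]
Since both $\theta_\alpha$ and $\phi$ are continuous, and $\mathrm{fin}_K(I)$ is dense in $\ell^q_K(I)$ (by Exercise 10), it follows that $\phi = \theta_\alpha$.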
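Remark 2.5. In the case $p = 1$, the map
\[ \Theta : \ell^1_K(I) \ni \alpha \longmapsto \theta_\alpha \in \bigl(\ell^\infty_K(I)\bigr)^* \]
is still isometric, but it is no longer surjective, unless $I$ is finite. The explanation is the fact that when $I$ is infinite, the subspace $\mathrm{fin}_K(I)$ is not dense in $\ell^\infty_K(I)$. For example, if we take $\mathbf{1} \in \ell^\infty_K(I)$ to be the constant function 1, then it is pretty obvious that
\[ \|\mathbf{1} - \beta\|_\infty \ge 1, \quad \forall\, \beta \in \mathrm{fin}_K(I). \]
The above inequality can be immediately extended to
\[ \|\lambda\mathbf{1} + \beta\|_\infty \ge |\lambda|, \quad \forall\, \lambda \in K,\ \beta \in \mathrm{fin}_K(I). \tag{22} \]
If we then consider the subspace
\[ \widetilde{\mathrm{fin}}_K(I) = \{\lambda\mathbf{1} + \beta : \beta \in \mathrm{fin}_K(I),\ \lambda \in K\}, \]
we see that the map
\[ \phi_0 : \widetilde{\mathrm{fin}}_K(I) \ni \lambda\mathbf{1} + \beta \longmapsto \lambda \in K \]
is linear, continuous, and has the property that
\[ \phi_0\big|_{\mathrm{fin}_K(I)} = 0, \quad \phi_0(\mathbf{1}) = 1, \tag{23} \]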
3. Banach spaces
Definition. Let $K$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. A Banach space over $K$ is a normed $K$-vector space $(X, \|\cdot\|)$, which is complete with respect to the metric
\[ d(x, y) = \|x - y\|, \quad x, y \in X. \]
Example 3.1. The field $K$, equipped with the absolute value norm, is a Banach space. More generally, the vector space $K^n$, equipped with any of the norms
\[ \|(\lambda_1, \dots, \lambda_n)\|_\infty = \max\{|\lambda_1|, \dots, |\lambda_n|\}, \]
\[ \|(\lambda_1, \dots, \lambda_n)\|_p = \bigl( |\lambda_1|^p + \dots + |\lambda_n|^p \bigr)^{1/p}, \quad p \ge 1, \]
is a Banach space.
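The norms of Example 3.1 are easy to compare numerically. The following is a minimal sketch with hypothetical helper names; it evaluates $\|x\|_p$ for a few values of $p$ and checks the elementary bounds $\|x\|_\infty \le \|x\|_p \le n^{1/p}\|x\|_\infty$, which follow from comparing each $|\lambda_k|$ with the largest one.

```python
def p_norm(x, p):
    """The norm (|x_1|^p + ... + |x_n|^p)^(1/p), for p >= 1."""
    return sum(abs(t) ** p for t in x) ** (1 / p)


def sup_norm(x):
    """The norm max{|x_1|, ..., |x_n|}."""
    return max(abs(t) for t in x)


if __name__ == "__main__":
    x = [3.0, -4.0, 1.0]
    n = len(x)
    for p in (1, 2, 4, 16, 64):
        value = p_norm(x, p)
        # Comparison with the sup-norm, with a small tolerance for rounding.
        assert sup_norm(x) <= value + 1e-12
        assert value <= n ** (1 / p) * sup_norm(x) + 1e-12
        print(p, value)
    print("sup", sup_norm(x))
```

As $p$ grows the printed values decrease towards $\|x\|_\infty$.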
Remark 3.1. Using the facts from the general theory of metric spaces, we know that for a normed vector space $(X, \|\cdot\|)$, the following are equivalent:
(i) $X$ is a Banach space;
(ii) given any sequence $(x_n)_{n\ge 1} \subset X$ with $\sum_{n=1}^{\infty} \|x_n\| < \infty$, the sequence $(y_n)_{n\ge 1}$ of partial sums, defined by $y_n = \sum_{k=1}^{n} x_k$, is convergent;
(iii) every Cauchy sequence in $X$ has a convergent subsequence.
This is pretty obvious, since the sequence of partial sums has the property that
\[ d(y_{n+1}, y_n) = \|y_{n+1} - y_n\| = \|x_{n+1}\|, \quad \forall\, n \ge 1. \]
Exercise 1*. Let $X$ be a finite dimensional normed vector space. Prove that $X$ is a Banach space.
Hints: Use induction on $\dim X$. The case $\dim X = 1$ is trivial. Assume the statement is true for all normed vector spaces of dimension $d$, and let us prove it for a normed vector space of dimension $d + 1$. Fix such an $X$, and a linear basis $\{e_1, e_2, \dots, e_d, e_{d+1}\}$ for $X$. Start with a Cauchy sequence $(x_n)_{n\ge 1} \subset X$. Write each term as
\[ x_n = \sum_{k=1}^{d+1} \alpha_n(k) e_k. \]
Prove first that $\bigl(\alpha_n(d+1)\bigr)_{n\ge 1} \subset K$ is bounded. Then extract a subsequence $(x_{n_p})_{p\ge 1}$ such that $\bigl(\alpha_{n_p}(d+1)\bigr)_{p\ge 1}$ is convergent. If we take $\alpha(d+1) = \lim_{p\to\infty} \alpha_{n_p}(d+1)$, then prove that the sequence $\bigl(x_{n_p} - \alpha_{n_p}(d+1)e_{d+1}\bigr)_{p\ge 1}$ is Cauchy in the space $\mathrm{Span}\{e_1, \dots, e_d\}$. Using the inductive hypothesis, conclude that $(x_{n_p})_{p\ge 1}$ is convergent in $X$. Thus, every Cauchy sequence in $X$ has a convergent subsequence, hence $X$ is Banach.
Exercise 2*. Let $n \ge 1$ be an integer, and let $\|\cdot\|$ be a norm on $K^n$. Prove that there exist constants $C, D > 0$, such that
\[ C\|x\|_\infty \le \|x\| \le D\|x\|_\infty, \quad \forall\, x \in K^n. \]
Argue by contradiction (see also the hint from the preceding exercise).
Exercise 3. Let X and Y be normed vector spaces. Consider the product X × Y,
equipped with the natural vector space structure.
(i) Prove that k(x, y)k = kxk + kyk, (x, y) ∈ X × Y defines a norm on X × Y.
(ii) Prove that, when equipped with the above norm, X × Y is a Banach space,
if and only if both X and Y are Banach spaces.
There are two key constructions which enable one to construct new Banach
spaces out of old ones.
Proposition 3.1. Let X be a normed vector space, and let Y be a Banach
space. Then L(X, Y) is a Banach space, when equipped with the operator norm.
Proof. Start with a Cauchy sequence $(T_n)_{n\ge 1} \subset L(X, Y)$. This means that for every $\varepsilon > 0$, there exists some $N_\varepsilon$ such that
\[ \|T_m - T_n\| < \varepsilon, \quad \forall\, m, n \ge N_\varepsilon. \tag{1} \]
Notice that, if we take for example $\varepsilon = 1$, and we define
\[ C = 1 + \max\{\|T_1\|, \|T_2\|, \dots, \|T_{N_1}\|\}, \]
then we clearly have
\[ \|T_n\| \le C, \quad \forall\, n \ge 1. \tag{2} \]
Notice that, using (1), we have
\[ \|T_m x - T_n x\| \le \varepsilon\|x\|, \quad \forall\, m, n \ge N_\varepsilon,\ x \in X, \tag{3} \]
which proves that
• for every $x \in X$, the sequence $(T_n x)_{n\ge 1} \subset Y$ is Cauchy.
Since $Y$ is a Banach space, for each $x \in X$, the sequence $(T_n x)_{n\ge 1}$ will be convergent. We define the map $T : X \to Y$ by
\[ Tx = \lim_{n\to\infty} T_n x, \quad x \in X. \]
Proof. For $p = 1$ we know that $\ell^1 \simeq (c_0)^*$. For $p \in (1, \infty]$, we know that $\ell^p \simeq (\ell^q)^*$, where $q$ is Hölder conjugate to $p$.
Proof. This is a particular case of a general result from the theory of complete
metric spaces.
Corollary 3.3. Let $I$ be a non-empty set, and let $K$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. Then $c_0^K(I)$ is a Banach space.
Proof. Use the fact that $c_0^K(I)$ is closed in $\ell^\infty_K(I)$.
where $F_n = \mathrm{Span}\{b_1, b_2, \dots, b_n\}$. Since the $F_n$'s are finite dimensional linear subspaces, they will be closed. Use Baire's Theorem to get a contradiction.
Comments. A third method of constructing Banach spaces is the completion. If we start with a normed $K$-vector space $X$, when we regard $X$ as a metric space, its completion $\tilde X$ is constructed as follows. One defines
\[ \mathrm{cs}(X) = \bigl\{ x = (x_n)_{n\ge 1} : (x_n)_{n\ge 1} \text{ Cauchy sequence in } X \bigr\}. \]
Two Cauchy sequences $x = (x_n)_{n\ge 1}$ and $x' = (x'_n)_{n\ge 1}$ are said to be equivalent, if $\lim_{n\to\infty} \|x_n - x'_n\| = 0$. In this case one writes $x \sim x'$. The completion $\tilde X$ is then defined as the space
\[ \tilde X = \mathrm{cs}(X)/\sim \]
of equivalence classes. For $x \in \mathrm{cs}(X)$, one denotes by $\tilde x$ its equivalence class in $\tilde X$. Finally for an element $x \in X$ one denotes by $\langle x\rangle \in \tilde X$ the equivalence class of the constant sequence $x$.
We know from general theory that $\tilde X$ is a complete metric space, with the distance $\tilde d$ (correctly) defined by
\[ \tilde d(\tilde x, \tilde x') = \lim_{n\to\infty} \|x_n - x'_n\|, \]
for any two Cauchy sequences $x = (x_n)_{n\ge 1}$ and $x' = (x'_n)_{n\ge 1}$.
It turns out that, in our situation, the space cs(X) carries a natural vector
space structure, defined by pointwise addition and scalar multiplication. Moreover,
the space X̃ is identified as a quotient vector space
X̃ = cs(X)/ns(X),
where
ns(X) = {x = (xn)n≥1 : (xn)n≥1 sequence in X with lim_{n→∞} xn = 0}
is the linear subspace of null sequences. It then follows that X̃ carries a natural
vector space structure. More explicitly, if we start with a scalar λ ∈ K, and with
two elements p, q ∈ X̃, which are represented as p = x̃ and q = ỹ, for two Cauchy
sequences x = (xn )n≥1 and y = (yn )n≥1 in X, then the sequence
w = (λxn + yn )n≥1
is Cauchy in X, and the element λp + q ∈ X̃ is then defined as λp + q = w̃.
Finally, there is a natural norm on X̃, (correctly) defined by
‖x̃‖ = d̃(x̃, ⟨0⟩) = lim_{n→∞} ‖xn‖,
for all Cauchy sequences x = (xn)n≥1. These considerations then prove that X̃ is
a Banach space, and the map
X ∋ x ↦ ⟨x⟩ ∈ X̃
is linear and isometric, in the sense that
‖⟨x⟩‖ = ‖x‖, ∀ x ∈ X.
In the context of normed vector spaces, the universality property of the com-
pletion is stated as follows:
Proposition 3.3. Let X be a normed vector space, let X̃ denote its completion,
and let Y be a Banach space. For every linear continuous map T : X → Y, there
exists a unique linear continuous map T̃ : X̃ → Y, such that
T̃⟨x⟩ = T x, ∀ x ∈ X.
Moreover the map
L(X, Y) ∋ T ↦ T̃ ∈ L(X̃, Y)
is an isometric linear isomorphism.
Proof. If T : X → Y is linear and continuous, then T is a Lipschitz map with
Lipschitz constant ‖T‖, because
‖T x − T x′‖ ≤ ‖T‖ · ‖x − x′‖, ∀ x, x′ ∈ X.
We know, from the theory of metric spaces, that there exists a unique continuous
map T̃ : X̃ → Y, such that
T̃⟨x⟩ = T x, ∀ x ∈ X.
We also know that T̃ is Lipschitz, with Lipschitz constant ‖T‖. The only thing we
need to prove is the fact that T̃ is linear. Start with two points p, q ∈ X̃, represented
as p = x̃ and q = z̃, for some Cauchy sequences x = (xn)n≥1 and z = (zn)n≥1 in
X. If λ ∈ K, then λp + q = w̃, where w = (λxn + zn)n≥1. We then have
T̃(λp + q) = lim_{n→∞} T(λxn + zn) = λ · lim_{n→∞} T xn + lim_{n→∞} T zn = λT̃p + T̃q.
Let us prove now that ‖T̃‖ = ‖T‖. Since T̃ is Lipschitz, with Lipschitz constant
‖T‖, we will have ‖T̃‖ ≤ ‖T‖. To prove the other inequality, let us consider the
sets
B0 = {p ∈ X̃ : ‖p‖ ≤ 1}, B1 = {⟨x⟩ : x ∈ X, ‖x‖ ≤ 1}.
By definition, we have
‖T̃‖ = sup_{p∈B0} ‖T̃p‖.
Since we clearly have B0 ⊃ B1, we get
‖T̃‖ = sup_{p∈B0} ‖T̃p‖ ≥ sup_{p∈B1} ‖T̃p‖ = sup{‖T̃⟨x⟩‖ : x ∈ X, ‖x‖ ≤ 1} =
= sup{‖T x‖ : x ∈ X, ‖x‖ ≤ 1} = ‖T‖.
The fact that the map L(X, Y) ∋ T ↦ T̃ ∈ L(X̃, Y) is linear is obvious.
To prove the surjectivity, start with some S ∈ L(X̃, Y). Consider the map
ι : X ∋ x ↦ ⟨x⟩ ∈ X̃.
Since ι is linear and isometric, in particular it is continuous, so the composition
T = S ◦ ι is linear and continuous. Notice that
S⟨x⟩ = S(ι(x)) = (S ◦ ι)x = T x, ∀ x ∈ X,
so by uniqueness we have S = T̃.
Corollary 3.4. Let X be a normed space, let Y be a Banach space, and let
T : X → Y be an isometric linear map.
(i) Let T̃ : X̃ → Y be the linear continuous map defined in the previous result.
Then T̃ is linear, isometric, and T̃(X̃) is the closure of T(X) in Y.
(ii) X is complete, if and only if T(X) is closed in Y.
Proof. (i). The fact that T̃ is isometric, and has the range equal to the closure of T(X), is
true in general (i.e. for X metric space, and Y complete metric space). The linearity
follows from the previous result.
(ii). This is obvious.
Example 3.2. Let X be a normed vector space. For every x ∈ X define the
map ϵx : X∗ → K by
ϵx(φ) = φ(x), ∀ φ ∈ X∗.
Then ϵx is linear and continuous. This is an immediate consequence of the
inequality
|ϵx(φ)| = |φ(x)| ≤ ‖x‖ · ‖φ‖, ∀ φ ∈ X∗.
Notice that this also proves
‖ϵx‖ ≤ ‖x‖, ∀ x ∈ X.
Interestingly enough, we actually have
(4) ‖ϵx‖ = ‖x‖, ∀ x ∈ X.
To prove this fact, we start with an arbitrary x ∈ X, and we consider the linear
subspace
Y = Kx = {λx : λ ∈ K}.
If we define φ0 : Y → K, by
φ0(λx) = λ‖x‖, ∀ λ ∈ K,
then it is clear that φ0(x) = ‖x‖, and
|φ0(y)| ≤ ‖y‖, ∀ y ∈ Y.
Use then the Hahn-Banach Theorem to find φ : X → K such that φ|Y = φ0, and
|φ(z)| ≤ ‖z‖, ∀ z ∈ X.
This will clearly imply ‖φ‖ ≤ 1, while the first condition will give φ(x) = φ0(x) =
‖x‖. In particular, we will have
‖x‖ = |φ(x)| = |ϵx(φ)| ≤ ‖ϵx‖ · ‖φ‖ ≤ ‖ϵx‖.
Having proven (4), we now have a linear isometric map
E : X ∋ x ↦ ϵx ∈ X∗∗.
Since X∗∗ is a Banach space, we now see that Ẽ is an isometric linear
isomorphism of X̃ onto the closure of E(X) in X∗∗. In particular, X is Banach, if and only if E(X) is closed in X∗∗.
We conclude with a series of results, which are often regarded as the “principles
of Banach space theory.” These results are consequences of Baire’s Theorem.
Theorem 3.1 (Uniform Boundedness Principle). Let X be a Banach space, let
Y be a normed vector space, and let M ⊂ L(X, Y). The following are equivalent:
(i) sup{‖T‖ : T ∈ M} < ∞;
(ii) ⇒ (i). Assume M satisfies condition (ii). For each integer n ≥ 1, let us
define the set
Fn = {x ∈ X : ‖T x‖ ≤ n, ∀ T ∈ M}.
and we use again (6) to find xp+1 ∈ A, such that ‖z − T xp+1‖ ≤ ε/2. We then clearly
have
‖y − T(Σ_{k=1}^{p+1} 2^{−k} xk)‖ = ‖z − T xp+1‖ / 2^{p+1} ≤ ε / 2^{p+2},
Consider now the series Σ_{k=1}^{∞} 2^{−k} xk. Since ‖xk‖ < 1, ∀ k ≥ 1, and X is a Banach
space, by Remark 3.1, the sequence (wn)∞n=1 ⊂ X of partial sums
wn = Σ_{k=1}^{n} 2^{−k} xk, n ≥ 1,
Fix for the moment k ∈ {1, . . . , n}. The fact that φ ∈ ϵ_{xk}^{−1}(Dk) means that φ(xk) ∈
Dk. Since Dk is open in K, there exists some εk > 0, such that
Dk ⊃ B_{εk}(φ(xk)).
Notice that, if one takes ε = min{ε1, . . . , εn}, then we clearly have the inclusions
W(φ; ε, xk) ⊂ W(φ; εk, xk) ⊂ ϵ_{xk}^{−1}(Dk).
We then immediately get
⋂_{k=1}^{n} W(φ; ε, xk) ⊂ ⋂_{k=1}^{n} ϵ_{xk}^{−1}(Dk) ⊂ E ⊂ N,
and we are done.
Corollary 4.1. Let X be a normed vector space. Then the w∗ topology on X∗
is locally convex, i.e.
• for every φ ∈ X∗ and every w∗ -neighborhood N of φ, there exists a convex
w∗ -open set D such that φ ∈ D ⊂ N .
Proof. Apply the second part of the proposition, together with the obvious
fact that each of the sets W (φ; ε, x) is convex and w∗ -open.
Proposition 4.2. Let X be a normed vector space. When equipped with the
w∗ topology, the space X∗ is a topological vector space. This means that the maps
X∗ × X∗ ∋ (φ, ψ) ↦ φ + ψ ∈ X∗
K × X∗ ∋ (λ, φ) ↦ λφ ∈ X∗
are continuous with respect to the w∗ topology on the target space, and the w∗
product topology on the domain.
Proof. According to the definition of the w∗ topology, it suffices to prove
that, for every x ∈ X, the maps
σx : X∗ × X∗ ∋ (φ, ψ) ↦ ϵx(φ + ψ) ∈ K
γx : K × X∗ ∋ (λ, φ) ↦ ϵx(λφ) ∈ K
are continuous. But the continuity of σx and γx is obvious, since we have
σx(φ, ψ) = φ(x) + ψ(x) = ϵx(φ) + ϵx(ψ), ∀ (φ, ψ) ∈ X∗ × X∗;
γx(λ, φ) = λφ(x) = λϵx(φ), ∀ (λ, φ) ∈ K × X∗.
Our next goal will be to describe the linear maps X∗ → K, which are continuous
in the w∗ topology.
Proposition 4.3. Let X be a normed vector space over K. For a linear map
ω : X∗ → K, the following are equivalent:
(i) ω is continuous with respect to the w∗ topology;
(ii) there exists some x ∈ X, such that
ω(φ) = φ(x), ∀ φ ∈ X∗ .
Proof. The implication (ii) ⇒ (i) is trivial, since condition (ii) gives ω = ϵx.
(i) ⇒ (ii). Suppose ω is continuous. In particular, ω is continuous at 0, so if
we take the set
D = {λ ∈ K : |λ| < 1},
the set
ω −1 (D) = {φ ∈ X∗ : |ω(φ)| < 1}
is an open neighborhood of 0 in the w∗ topology. By Proposition ?? there exist
x1 , . . . , xn ∈ X, and ε > 0, such that
(1) W(0; ε, x1) ∩ · · · ∩ W(0; ε, xn) ⊂ ω⁻¹(D).
Claim 1: One has the inequality
|ω(φ)| ≤ ε⁻¹ · max{|φ(x1)|, . . . , |φ(xn)|}, ∀ φ ∈ X∗.
Fix an arbitrary φ ∈ X∗, and put M = max{|φ(x1)|, . . . , |φ(xn)|}. For every integer
k ≥ 1, define
φk = ε(M + 1/k)⁻¹ φ,
so that
|φk(xj)| = ε(M + 1/k)⁻¹ |φ(xj)| ≤ εM(M + 1/k)⁻¹ < ε, ∀ k ≥ 1, j ∈ {1, . . . , n}.
This proves that φk ∈ W(0; ε, xj), for all k ≥ 1, and all j ∈ {1, . . . , n}. By (1) this
will give
|ω(φk)| < 1, ∀ k ≥ 1,
which reads
ε(M + 1/k)⁻¹ |ω(φ)| < 1, ∀ k ≥ 1.
This gives
|ω(φ)| ≤ ε⁻¹(M + 1/k), ∀ k ≥ 1,
there exists a linear isomorphism T̂ : X/Ker T −∼→ Ran T, such that T̂ ◦ π = T.
We then define
σ0 = ω̂ ◦ T̂⁻¹ : Ran T → K,
and we will have
σ0 ◦ T = (ω̂ ◦ T̂⁻¹) ◦ (T̂ ◦ π) = ω̂ ◦ π = ω.
We finally extend σ0 : Ran T → K to a linear map σ : Kⁿ → K.
Having proven Claim 2, we choose scalars α1, . . . , αn ∈ K, such that
σ(λ1, . . . , λn) = α1λ1 + · · · + αnλn, ∀ (λ1, . . . , λn) ∈ Kⁿ.
We now have
ω(φ) = σ(T φ) = σ(φ(x1), . . . , φ(xn)) = α1φ(x1) + · · · + αnφ(xn), ∀ φ ∈ X∗,
Fix f ∈ L. Start with some x ∈ X and some λ ∈ K. We have ‖λx‖ = |λ| · ‖x‖, so
we get
ψf(λx) = 0, if either x = 0, or λ = 0;
ψf(λx) = |λ| · ‖x‖ · f((λ/|λ|) · (x/‖x‖)), if λ ≠ 0 and x ≠ 0.
If λ ≠ 0 and x ≠ 0, we put
µ = λ/|λ| and y = x/‖x‖,
and the fact that µ ∈ B, y ∈ (X)1, and f ∈ Bµ,y, will give
f((λ/|λ|) · (x/‖x‖)) = f(µy) = µf(y) = (λ/|λ|) · f(x/‖x‖) = (λ/(|λ| · ‖x‖)) · ψf(x),
so in this case we get
ψf(λx) = |λ| · ‖x‖ · f((λ/|λ|) · (x/‖x‖)) = |λ| · ‖x‖ · (λ/(|λ| · ‖x‖)) · ψf(x) = λψf(x).
In the case when either λ = 0 or x = 0, we also get the equality
ψf(λx) = 0 = λψf(x).
This way we have proven the homogeneity of ψf:
(2) ψf(λx) = λψf(x), ∀ λ ∈ K, x ∈ X.
Let us prove now that ψf|(X)1 = f. If x = 0, then using the property
1
f (x) ∈ B, ∀ x ∈ (X)1 ,
which means that f ∈ P . Using the fact that φ is linear, it is obvious that f ∈ L.
Using Claim 1, we have
ψf (x) = f (x) = φ(x), ∀ x ∈ (X)1 .
Now, since ψf|(X)1 = φ|(X)1, and both ψf and φ are linear, we immediately get
ψf = φ.
Remarks 4.2. Using the notations from the above proof, the continuous map
Ψ : L → (X∗)1 is in fact bijective. The only thing we need to prove is the injectivity.
Suppose ψf = ψg, for some f, g ∈ L. Then
f = ψf|(X)1 = ψg|(X)1 = g.
Since Ψ : L → (X∗)1 is bijective, continuous, and the spaces L and (X∗)1 are
compact Hausdorff, it follows that Ψ is in fact a homeomorphism. The inverse map
Ψ⁻¹ : (X∗)1 → L is simply defined by
Ψ⁻¹(φ) = φ|(X)1, ∀ φ ∈ (X∗)1.
Since (M)1 is dense in (X)1, this will force φ|(X)1 = ψ|(X)1, which finally forces
φ = ψ.
Using the above Claim, we see that if we define Q = (Υ ◦ κ)((X∗)1), then Q ⊂ ∏_{x∈(M)1} B
is compact, and Υ ◦ κ : (X∗)1 → Q is a homeomorphism. Notice that
∏_{x∈(M)1} B is a countable product of metric spaces, so it is metrizable. Therefore
Q is also metrizable, and so will be (X∗)1.
This map will still be injective and continuous, and one can show that
κ̃ : X∗ → κ̃(X∗)
is a homeomorphism, when κ̃(X∗) is equipped with the induced topology from the
product space ∏_{x∈(X)1} K. In general however, the set κ̃(X∗) is not closed in the
product space ∏_{x∈(X)1} K.
If X is separable, and if one takes a countable dense set M ⊂ X, then as before,
one also still has a continuous map
Υ̃ : ∏_{x∈(X)1} K ∋ f ↦ f|(M)1 ∈ ∏_{x∈(M)1} K,
so that f(Ω) ⊂ [α, β], and ‖f‖ = max{|α|, |β|}. Define the sets
A = f⁻¹([α, (2α + β)/3]) and B = f⁻¹([(α + 2β)/3, β]),
so that both A and B are closed, and A ∩ B = ∅. Use the hypothesis, to find a
function h ∈ C, such that h|A = 0, h|B = 1, and h(p) ∈ [0, 1], for all p ∈ Ω. Define
the function g ∈ C by
g = (1/3)[α1 + (β − α)h].
Let us examine the difference f − g. Start with some arbitrary point p ∈ Ω. There
are three cases to examine:
Case I: p ∈ A. In this case we have h(p) = 0, so we get g(p) = α/3. By the
construction of A we also have α ≤ f(p) ≤ (2α + β)/3, so we get
2α/3 ≤ f(p) − g(p) ≤ (α + β)/3.
Case II: p ∈ B. In this case we have h(p) = 1, so we get g(p) = β/3. We also
have (2β + α)/3 ≤ f(p) ≤ β, so we get
(α + β)/3 ≤ f(p) − g(p) ≤ 2β/3.
Case III: p ∈ Ω ∖ (A ∪ B). In this case we have 0 ≤ h(p) ≤ 1, so we get
α/3 ≤ g(p) ≤ β/3, and (2α + β)/3 < f(p) < (α + 2β)/3. In particular we get
f(p) − g(p) > (2α + β)/3 − β/3 = 2α/3;
f(p) − g(p) < (α + 2β)/3 − α/3 = 2β/3.
Since 2α/3 ≤ (α + β)/3 ≤ 2β/3, we see that in all three cases we have
2α/3 ≤ f(p) − g(p) ≤ 2β/3,
so we get
2α/3 ≤ inf_{p∈Ω} [f(p) − g(p)] ≤ sup_{p∈Ω} [f(p) − g(p)] ≤ 2β/3,
so we indeed get the desired inequality
‖g − f‖ ≤ (2/3)‖f‖.
Having proven the Claim, we now prove the density of C in CbR(Ω). Start with
some f ∈ CbR(Ω), and we construct recursively two sequences (gn)n≥1 ⊂ C and
(fn)n≥1 ⊂ CbR(Ω), as follows. Set f1 = f. Apply the Claim to find g1 ∈ C such that
‖g1 − f1‖ ≤ (2/3)‖f1‖.
Once f1, f2, . . . , fn and g1, g2, . . . , gn have been constructed, we set
fn+1 = fn − gn,
and we choose gn+1 ∈ C such that
‖gn+1 − fn+1‖ ≤ (2/3)‖fn+1‖.
It is clear, by construction, that
‖fn‖ ≤ (2/3)^(n−1) ‖f‖, ∀ n ≥ 1.
Consider the sequence (sn)n≥1 ⊂ C of partial sums, defined by
sn = g1 + g2 + · · · + gn, ∀ n ≥ 1.
Using the equalities
gn = fn − fn+1, ∀ n ≥ 1,
we get
sn − f = g1 + g2 + · · · + gn − f1 = −fn+1,
so we have
‖sn − f‖ ≤ (2/3)^n ‖f‖, ∀ n ≥ 1,
which clearly gives f = limn→∞ sn, so f indeed belongs to the closure of C.
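The recursive construction can be checked numerically on a finite grid. The sketch below is only an illustration under stated assumptions: the grid, the sample function f, and the helper names two_thirds_step and sup_norm are hypothetical, and the separating function h is built directly from f by clipping (which is legitimate only when C contains all continuous functions); it is not the function h supplied by the hypothesis in the text.

```python
import math

def sup_norm(values):
    """Maximum absolute value over the grid (stand-in for the sup norm)."""
    return max(abs(v) for v in values)

def two_thirds_step(f):
    """Return g with max|f - g| <= (2/3) max|f|, following the three-case Claim."""
    alpha, beta = min(f), max(f)
    a = (2 * alpha + beta) / 3      # f <= a on the set A
    b = (alpha + 2 * beta) / 3      # f >= b on the set B
    g = []
    for v in f:
        if b > a:
            # hypothetical choice of h: equals 0 on A, 1 on B, values in [0, 1]
            h = min(1.0, max(0.0, (v - a) / (b - a)))
        else:
            h = 0.0                 # f is constant on the grid
        g.append((alpha + (beta - alpha) * h) / 3)
    return g

grid = [k / 200 for k in range(201)]
f = [math.sin(7 * t) + 0.5 * t for t in grid]
norm_f = sup_norm(f)

fn = f[:]                           # f_1 = f
s = [0.0] * len(f)                  # partial sums s_n = g_1 + ... + g_n
errors = []
for n in range(1, 11):
    g = two_thirds_step(fn)
    s = [x + y for x, y in zip(s, g)]
    fn = [x - y for x, y in zip(fn, g)]          # f_{n+1} = f_n - g_n
    errors.append(sup_norm([x - y for x, y in zip(s, f)]))
```

Each entry errors[n-1] is the grid version of ‖sn − f‖, so it should stay below (2/3)^n ‖f‖, which is the geometric decay used in the proof.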
Proof. Let us introduce the Banach space setting that will make the proof
clearer. We consider the Banach spaces CbR(Ω) and CbR(T). To avoid any confusion,
the norms on these Banach spaces will be denoted by ‖ · ‖Ω and ‖ · ‖T. If we define
the restriction map
R : CbR(Ω) ∋ g ↦ g|T ∈ CbR(T),
then R is obviously linear and continuous.
We define the subspace C = R(CbR(Ω)) ⊂ CbR(T).
Claim: For every f ∈ C, there exists some g ∈ CbR(Ω) such that f = Rg, and
inf_{q∈T} f(q) ≤ g(p) ≤ sup_{q∈T} f(q), ∀ p ∈ Ω.
To prove this fact, we start first with some arbitrary g0 ∈ CbR(Ω), such that f =
Rg0 = g0|T. Put
α = inf_{q∈T} f(q) and β = sup_{q∈T} f(q),
so that ‖f‖T = max{|α|, |β|}. Define the function θ : R → [α, β] by
θ(t) = α, if t < α;
θ(t) = t, if α ≤ t ≤ β;
θ(t) = β, if t > β.
Theorem 5.2 (Dini). Let K be a compact Hausdorff space, let (fn )n≥1 ⊂
C R (K) be a monotone sequence. Assume there is some f ∈ C R (K), such that
lim fn (p) = f (p), ∀ p ∈ K.
n→∞
Then limn→∞ fn = f , in the norm topology.
Proof. Replacing fn with fn − f , we can assume that limn→∞ fn (p) = 0,
∀ p ∈ K. Replacing (if necessary) fn with −fn , we can also assume that the
sequence (fn )n≥1 is decreasing. In particular, each fn is non-negative.
We need to prove that limn→∞ kfn k = 0. Assume this is not true, so there
exists some ε > 0, such that the set
M = {m ∈ N : kfm k ≥ ε}
is infinite. For each integer n ≥ 1, let us define the set
Fn = {p ∈ K : fn (p) ≥ ε}.
Then by the definition of M , we have
Fm ≠ ∅, ∀ m ∈ M.
Claim: One has the inclusion Fn ⊃ Fn+1, ∀ n ≥ 1.
Indeed, if p ∈ Fn+1, then
ε ≤ fn+1(p) ≤ fn(p),
which proves that p ∈ Fn.
Using the claim, plus the fact that the set M is infinite, it follows that Fn ≠ ∅,
∀ n ≥ 1. (Indeed, if we start with some arbitrary n, then since M is infinite, we
can find m ∈ M, with m ≥ n, and then using the Claim we have ∅ ≠ Fm ⊂ Fn.)
Since K is compact, and the sets F1 ⊃ F2 ⊃ . . . are closed and non-empty, by
the finite intersection property, it follows that
⋂_{n=1}^{∞} Fn ≠ ∅.
But this leads to a contradiction, because if we pick an element p ∈ ⋂_{n=1}^{∞} Fn,
then we will have fn(p) ≥ ε, ∀ n ≥ 1, and then the equality limn→∞ fn(p) = 0 is
impossible.
Exercise 1. Define the sequence (Pn)n≥1 of polynomials, by P1(t) = 0, and
Pn+1(t) = (1/2)[t − Pn(t)²] + Pn(t), ∀ n ≥ 1.
Prove that
lim_{n→∞} max_{t∈[0,1]} |Pn(t) − √t| = 0.
Hint: Define the functions fn, f : [0, 1] → R by fn(t) = Pn(t) and f(t) = √t. Prove that, for
every t ∈ [0, 1], the sequence (fn(t))n≥1 is increasing, bounded, and limn→∞ fn(t) = f(t). Then
apply Dini’s Theorem.
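A quick numerical experiment makes the claimed uniform convergence plausible (it does not replace the proof asked for in the exercise). The sketch below evaluates the recursion pointwise on a grid of [0, 1]; the grid size, the number of iterations, and the names p_next and sup_errors are illustrative choices, not part of the notes.

```python
import math

def p_next(p, t):
    """One step of the recursion P_{n+1}(t) = P_n(t) + (t - P_n(t)^2) / 2."""
    return p + 0.5 * (t - p * p)

grid = [k / 1000 for k in range(1001)]
values = [0.0] * len(grid)          # P_1 = 0 on the whole grid

sup_errors = []                     # sup_errors[n-1] approximates max |P_n(t) - sqrt(t)|
for n in range(1, 201):
    sup_errors.append(max(abs(math.sqrt(t) - p) for t, p in zip(grid, values)))
    values = [p_next(p, t) for p, t in zip(values, grid)]
```

On this grid the recorded errors start at 1 (for P1 = 0) and decrease monotonically, consistent with an increasing sequence of continuous functions converging pointwise to a continuous limit on a compact interval.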
Theorem 5.3 (Stone-Weierstrass). Let K be a compact Hausdorff space. Let
A ⊂ C R (K) be a unital subalgebra, i.e.
• A ∋ 1, the constant function 1;
• A is a linear subspace;
• if f, g ∈ A, then f g ∈ A.
Assume A separates the points of K, i.e. for any p, q ∈ K, with p ≠ q, there exists
f ∈ A such that f(p) ≠ f(q).
Then A is dense in C R (K), in the norm topology.
Proof. Let C denote the closure of A. Remark that C is again a unital sub-
algebra and it still separates the points.
The proof will eventually use the Urysohn density Lemma. Before we get to
that point, we need several preparations.
Step 1: If f ∈ C, then |f| ∈ C.
The case f = 0 being trivial, we assume f ≠ 0. We define g = f² ∈ C, and we set h = ‖g‖⁻¹g, so that h ∈ C,
and h(p) ∈ [0, 1], for all p ∈ K. Let Pn(t), n ≥ 1 be the polynomials defined in
the above exercise. The functions hn = Pn ◦ h, n ≥ 1 are clearly all in C. By the
above Exercise, we clearly get
lim_{n→∞} max_{p∈K} |hn(p) − √(h(p))| = 0,
which means that limn→∞ hn = √h, in the norm topology. In particular, √h
belongs to C. Obviously we have
√h = ‖f‖⁻¹ · |f|,
so |f| indeed belongs to C.
Step 2: Given two functions f, g ∈ C, the continuous functions max{f, g} and
min{f, g} both belong to C.
This follows immediately from Step 1, and the equalities
max{f, g} = (1/2)[f + g + |f − g|] and min{f, g} = (1/2)[f + g − |f − g|].
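The two lattice identities are pointwise statements about real numbers, so they can be sanity-checked on sampled values. In the sketch below the sample lists f and g and the helper names are purely illustrative.

```python
def pointwise_max(f, g):
    """max{f, g} computed as (f + g + |f - g|) / 2 at each sample point."""
    return [0.5 * (a + b + abs(a - b)) for a, b in zip(f, g)]

def pointwise_min(f, g):
    """min{f, g} computed as (f + g - |f - g|) / 2 at each sample point."""
    return [0.5 * (a + b - abs(a - b)) for a, b in zip(f, g)]

# hypothetical sample values of two functions at four points
f = [-1.0, 0.25, 2.0, 3.5]
g = [0.5, 0.25, -4.0, 7.0]

upper = pointwise_max(f, g)
lower = pointwise_min(f, g)
```

Since the right hand sides only use sums, scalar multiples and absolute values, any subspace closed under f ↦ |f| is automatically closed under max and min, which is exactly how Step 2 follows from Step 1.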
Step 3: For any two points p, q ∈ K, p ≠ q, there exists h ∈ C, such that
h(p) = 0, h(q) = 1, and h(s) ∈ [0, 1], ∀ s ∈ K.
Use the assumption on A, to find first a function f ∈ A, such that f(p) ≠ f(q).
Put α = f(p) and β = f(q), and define
g = (1/(β − α))[f − α1].
The function g still belongs to A, but now we have g(p) = 0 and g(q) = 1. Define
the function h = min{g², 1}. By Step 2, h ∈ C, and it clearly satisfies the required
properties.
Step 4: Given a closed subset A ⊂ K, and a point p ∈ K ∖ A, there exists a
function h ∈ C, such that h(p) = 0, h|A = 1, and h(q) ∈ [0, 1], ∀ q ∈ K.
For every q ∈ A, we use Step 3 to find a function hq ∈ C, such that hq(p) = 0,
hq(q) = 1, and hq(s) ∈ [0, 1], ∀ s ∈ K, and we define the open set
Dq = {s ∈ K : hq(s) > 0}.
Using the compactness of A, we find points q1, . . . , qn ∈ A, such that
A ⊂ Dq1 ∪ · · · ∪ Dqn.
Define the function f = hq1 + · · · + hqn ∈ C, so that f(p) = 0, f(q) > 0, for all
q ∈ A, and f(s) ≥ 0, ∀ s ∈ K. If we define
m = min_{q∈A} f(q),
then the function g = m⁻¹f again belongs to C, and it satisfies g(p) = 0, g(q) ≥ 1,
∀ q ∈ A, and g(s) ≥ 0, ∀ s ∈ K. Finally, the function
h = min{g, 1}
will satisfy the required properties.
Step 5: Given closed sets A, B ⊂ K with A ∩ B = ∅, there exists h ∈ C, such
that h|A = 1, h|B = 0, and h(q) ∈ [0, 1], ∀ q ∈ K.
Use Step 4, to find for every p ∈ B, a function hp ∈ C, such that hp B = 1,
In fact, one way to see that this property fails is by inspecting the closure of A in
C(D̄). This closure is denoted by A(D) and is called the disk algebra. The main
feature of A(D) is the following:
Exercise 2*. Prove that
A(D) = {f : D̄ → C : f continuous, and f|D holomorphic}.
We now examine the topological dual of C(K).
Notations. Let K be a compact Hausdorff space, and let K be one of the
fields R or C. We define the space
MK (K) = C K (K)∗ = {φ : C K (K) → K : φ K-linear continuous}.
The unit ball will be denoted by MK (K)1 . When K = C, the superscript C will be
omitted from the notation.
Remarks 5.1. Let K be a compact Hausdorff space. The space M(K) =
C(K)∗ carries a natural involution, defined as follows. For φ ∈ M(K), we define
the map φ⋆ : C(K) → C by
φ⋆(f) = \overline{φ(\overline{f})}, ∀ f ∈ C(K).
For every φ ∈ M(K), the map φ⋆ : C(K) → C is again linear, continuous, and has
‖φ⋆‖ = ‖φ‖.
The map φ⋆ will be called the adjoint of φ. We used the term involution, because
the map
M(K) ∋ φ ↦ φ⋆ ∈ M(K)
has the following properties:
• (φ⋆)⋆ = φ, ∀ φ ∈ M(K);
• (φ + ψ)⋆ = φ⋆ + ψ⋆, ∀ φ, ψ ∈ M(K);
• (λφ)⋆ = \overline{λ}φ⋆, ∀ φ ∈ M(K), λ ∈ C.
If we define the space of self-adjoint maps
Msa(K) = {φ ∈ M(K) : φ⋆ = φ},
then it is clear that, for any φ ∈ Msa(K), the restriction φ|CR(K) is real-valued. In
is isometric. Moreover, when the two spaces are equipped with the w∗ topology, this
map is a homeomorphism.
‖φ|CR(K)‖ ≤ ‖φ‖. To prove the other inequality, fix for the moment ε > 0, and
choose f ∈ C(K) such that ‖f‖ ≤ 1, and
|φ(f)| ≥ ‖φ‖ − ε.
Choose a complex number λ with |λ| = 1, such that
|φ(f)| = λφ(f) = φ(λf).
If we write λf = g + ih, with g, h ∈ CR(K), then using the fact that φ is self-adjoint,
we will have
|φ(f)| = φ(g).
Since ‖g‖ ≤ ‖λf‖ = ‖f‖ ≤ 1, we will get
|φ(f)| ≤ ‖φ|CR(K)‖,
are injective and continuous, when the target spaces M(K)1 and MR (K)1
are equipped with the w∗ topology.
Proof. (i)-(ii). The fact that γp is C-linear is obvious. This will also give the
R-linearity of γpR . The continuity follows from the obvious inequality
|γp(f)| = |f(p)| ≤ max_{q∈K} |f(q)| = ‖f‖, ∀ f ∈ C(K).
Notice that P is still countable, it also separates the points of K, but also has the
property:
f, g ∈ P ⇒ f g ∈ P.
If we define
A = Span({1} ∪ P),
then A ⊂ C R (K) satisfies the hypothesis of the Stone-Weierstrass Theorem, hence
A is dense in C R (K). Notice that if we define
AQ = SpanQ ({1} ∪ P),
i.e. the set of linear combinations of elements in {1} ∪ P with rational coefficients,
then clearly AQ is dense in A, and so AQ is dense in C R (K). But now we are done,
since AQ is obviously countable.
(iiiR ) ⇒ (iiiC ). Assume C R (K) is separable. Let S ⊂ C R (K) be a countable
dense set. Then the set
S + iS = {f + ig : f, g ∈ S}
is clearly countable, and dense in C(K).
(iiiC ) ⇒ (i). Assume C(K) is separable. By the results from the previous
section, it follows that, when equipped with the w∗ topology, the compact space
M(K)1 is metrizable. Then the compact subset Γ(K) ⊂ M(K)1 is also metrizable.
Since K is homeomorphic to Γ(K), it follows that K itself is metrizable.
Definition. Let K be a compact Hausdorff space, and let K be one of the
fields R or C. A K-linear map φ : C K (K) → K is said to be positive, if it has the
property
f ∈ C R (K), f ≥ 0 =⇒ φ(f ) ≥ 0.
Proposition 5.4 (Automatic continuity for positive linear maps). Let K be
a compact Hausdorff space, and let K be one of the fields R or C. Any positive
K-linear map φ : C K (K) → K is continuous. Moreover, one has the equality
‖φ‖ = φ(1).
Proof. In the case when K = C, it suffices to prove that φ|CR(K) is continuous.
Therefore, it suffices to prove the statement for K = R. Start with some arbitrary
f ∈ CR(K), and define the functions f± ∈ CR(K) by
f+ = max{f, 0} and f− = max{−f, 0},
so that f± ≥ 0, f = f+ − f−, and ‖f‖ = max{‖f+‖, ‖f−‖}. On the one hand, by
positivity, we have the inequalities φ(f±) ≥ 0, so we get
−φ(f−) ≤ φ(f+) − φ(f−) ≤ φ(f+),
which gives
(2) |φ(f)| = |φ(f+) − φ(f−)| ≤ max{φ(f+), φ(f−)}.
On the other hand, we have
‖f±‖ · 1 − f± ≥ 0,
so by positivity we get
‖f±‖ · φ(1) ≥ φ(f±).
Using this in (2) gives
|φ(f)| ≤ φ(1) · max{‖f+‖, ‖f−‖} = φ(1) · ‖f‖.
Since this holds for all f ∈ CR(K), the continuity of φ follows, together with the
estimate
‖φ‖ ≤ φ(1).
Since φ(1) ≤ ‖φ‖ · ‖1‖ = ‖φ‖, the desired norm equality follows.
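For a finite set K, the real-valued functions on K are just lists of numbers, and a positive linear map is a weighted sum with non-negative weights. The sketch below checks the estimate |φ(f)| ≤ φ(1) · ‖f‖ on random samples in that special case; the weights, the sample size and the helper names are hypothetical, chosen only for illustration.

```python
import random

# hypothetical non-negative weights defining a positive functional on a 4-point space
weights = [0.5, 0.0, 2.0, 1.25]

def phi(f):
    """Positive linear functional: weighted sum of the values of f."""
    return sum(w * v for w, v in zip(weights, f))

def sup_norm(f):
    return max(abs(v) for v in f)

one = [1.0] * len(weights)          # the constant function 1

random.seed(0)
samples = [[random.uniform(-3, 3) for _ in weights] for _ in range(1000)]
bound_holds = all(abs(phi(f)) <= phi(one) * sup_norm(f) + 1e-12 for f in samples)
```

The bound is attained at f = 1, which is the finite-dimensional shadow of the equality ‖φ‖ = φ(1).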
MK+(K)1 = {φ ∈ MK+(K) : ‖φ‖ ≤ 1} = MK+(K) ∩ MK(K)1.
M(K). This follows from the fact that, for each f ∈ CR(K), the set
AKf = {φ ∈ MK(K) : φ(f) ≥ 0} = ϵf⁻¹([0, ∞))
is w∗-closed, being the preimage of a closed set, under a w∗-continuous map. Then
everything is a consequence of the equality
MK+(K) = ⋂ {AKf : f ∈ CR(K), f ≥ 0}.
Using the identification MR(K) ≃ Msa(K), we have the following hierarchies:
MR+(K) ≃ M+(K)          MR+(K)1 ≃ M+(K)1
   ∩         ∩               ∩          ∩
MR(K)  ≃ Msa(K)         MR(K)1  ≃ Msa(K)1
             ∩                          ∩
           M(K)                       M(K)1
with ≃ isometric and w∗-homeomorphism.
Proposition 5.5. Let K be a compact Hausdorff space. Then one has the
equality
Msa(K)1 = conv(M+(K)1 ∪ −M+(K)1).
Denote the set on the right hand side of (3) simply by D. The inclusion C ⊃ D is
clear. To prove the inclusion C ⊂ D, we only need to prove that D is convex and it
contains M+ (K)1 ∪ −M+ (K)1 . The second property is clear. The convexity of D
is also clear, being a consequence of the convexity of ±M+ (K)1 .
The w∗-compactness of C is then a consequence of the compactness of the product
space
M+(K)1 × M+(K)1 × [0, 1],
and of the fact that C is the range of the continuous map
M+(K)1 × M+(K)1 × [0, 1] ∋ (φ, ψ, t) ↦ tφ − (1 − t)ψ ∈ Msa(K).
Having proven the Claim, we now proceed with the equality
Msa (K)1 = C.
The inclusion ⊃ is clear, since Msa (K)1 is convex, and it contains both M+ (K)1
and −M+ (K)1 .
We prove the other inclusion by contradiction. Assume there is some φ ∈
Msa(K)1 ∖ C. Apply Corollary II.4.2 to find some f ∈ C(K) and a real number α,
such that
Re φ(f) < α ≤ Re σ(f), ∀ σ ∈ C.
If we take g = Re f, then this gives
φ(g) < α ≤ σ(g), ∀ σ ∈ C.
Notice that 0 ∈ C, so we get α ≤ 0. If we define β = −α (≥ 0), and h = −g, the
above inequality gives
φ(h) > β ≥ σ(h), ∀ σ ∈ C.
Using the obvious inclusions ±Γ(K) ⊂ C, we get
β ≥ ±γp(h) = ±h(p), ∀ p ∈ K.
Since h is real-valued, this will force ‖h‖ ≤ β. But then we get a contradiction,
because we also have
β < φ(h) ≤ ‖φ‖ · ‖h‖ ≤ ‖h‖.
Corollary 5.3. Let K be a compact Hausdorff space, and let φ ∈ Msa (K).
Then there exist φ1 , φ2 ∈ M+ (K), such that φ = φ1 − φ2 , and kφk = kφ1 k + kφ2 k.
Proof. If φ ∈ M+(K) ∪ −M+(K), there is nothing to prove. Assume φ ∉
M+(K) ∪ −M+(K), in particular φ ≠ 0. We define ψ = φ/‖φ‖, so that ψ ∈ Msa(K)1.
Find ψ1, ψ2 ∈ M+(K)1 and t ∈ [0, 1], such that
ψ = tψ1 − (1 − t)ψ2.
Since ψ ∉ M+(K) ∪ −M+(K), it follows that 0 < t < 1. Notice that
1 = ‖ψ‖ = ‖tψ1 − (1 − t)ψ2‖ ≤ t‖ψ1‖ + (1 − t)‖ψ2‖.
If ‖ψ1‖ < 1, or ‖ψ2‖ < 1, then this would imply t‖ψ1‖ + (1 − t)‖ψ2‖ < 1, which
is impossible by the above estimate. This argument proves that we must have
‖ψ1‖ = ‖ψ2‖ = 1. If we define
φ1 = t‖φ‖ψ1 and φ2 = (1 − t)‖φ‖ψ2,
then ‖φ1‖ = t‖φ‖ and ‖φ2‖ = (1 − t)‖φ‖, so we indeed have ‖φ1‖ + ‖φ2‖ = ‖φ‖.
Obviously φ1 and φ2 are positive, and
φ1 − φ2 = ‖φ‖ · [tψ1 − (1 − t)ψ2] = ‖φ‖ · ψ = φ.
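When K is a finite set, a self-adjoint functional is a real weight vector, its norm is the sum of the absolute values of the weights, and the decomposition of the corollary is simply the splitting into positive and negative parts. The sketch below illustrates only this special case (it does not follow the separation argument of the proof); the weight vector and helper names are hypothetical.

```python
# hypothetical real weights representing a self-adjoint functional on a 5-point space
weights = [1.5, -0.5, 0.0, -2.0, 0.25]

def functional_norm(w):
    """Norm of f -> sum(w_i * f_i) with respect to the sup norm on functions."""
    return sum(abs(wi) for wi in w)

positive_part = [max(wi, 0.0) for wi in weights]     # weights of phi_1
negative_part = [max(-wi, 0.0) for wi in weights]    # weights of phi_2

difference = [p - n for p, n in zip(positive_part, negative_part)]
norm_total = functional_norm(weights)
norm_sum = functional_norm(positive_part) + functional_norm(negative_part)
```

Here difference reproduces the original weights and norm_sum equals norm_total, matching φ = φ1 − φ2 and ‖φ‖ = ‖φ1‖ + ‖φ2‖.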
Proposition 5.6. Let K be a compact Hausdorff space. The set
conv(Γ(K) ∪ {0})
is w∗-dense in M+(K)1.
Proof. Let C be the w∗-closure of conv(Γ(K) ∪ {0}). It is obvious that C ⊂
Define the map T : CbK(Ω) ∋ h ↦ hβ ∈ CK(Ωβ), and let us show that T is an
inverse for R. The equality R ◦ T = Id is trivial, by construction. To prove the
equality T ◦ R = Id, we start with some f ∈ CK(Ωβ), and we consider h = Rf.
Then T h = hβ, and since hβ|Ω = h = f|Ω, the density of Ω in Ωβ clearly forces
f = hβ = T h = T(Rf).
The fact that R is isometric is now clear, because on the one hand we clearly
have ‖Rf‖ ≤ ‖f‖, ∀ f ∈ CK(Ωβ), and on the other hand, by (5), we also have
‖T h‖ ≤ ‖h‖, ∀ h ∈ CbK(Ω).
If Ω is a locally compact space, the above result suggests that the space CbK (Ω)
is quite “large.” It is then natural to look at smaller spaces.
Definitions. Let Ω be a locally compact space. If K is one of the fields R or
C, and f : Ω → K is a continuous function, we define the support of f by
supp f = cl{ω ∈ Ω : f(ω) ≠ 0}, the closure in Ω of the set where f does not vanish.
We define the space
CcK(Ω) = {f : Ω → K : f continuous, with compact support}.
When K = C, this space will be denoted simply by Cc(Ω). Remark that, when
equipped with pointwise addition and multiplication, the space CcK(Ω) becomes a
K-algebra. One has obviously the inclusion CcK(Ω) ⊂ CbK(Ω).
We define C0K(Ω) to be the closure of CcK(Ω) in CbK(Ω). (When K = C, we
will denote this space simply by C0(Ω).) The Banach space C0K(Ω) can be regarded
as the completion of CcK(Ω). Of course, when Ω is compact, we have the equality
C0K(Ω) = CK(Ω).
The following result characterizes the Banach space C0K (Ω).
Proposition 5.7. Let Ω be a locally compact space. For a function f ∈ CbK (Ω),
the following are equivalent:
(i) f ∈ C0K (Ω);
(ii) for every ε > 0, there exists some compact subset Kε ⊂ Ω, such that
sup_{ω∈Ω∖Kε} |f(ω)| ≤ ε.
Proof. (i) ⇒ (ii). Suppose f ∈ C0K(Ω), which means that there exists some
sequence (fn)∞n=1 ⊂ CcK(Ω), such that limn→∞ fn = f, in the norm topology in
Using the fact that fn|Kn = f|Kn, the above equality proves that ‖f − fn‖ ≤ 1/n. This
way we have constructed a sequence (fn)∞n=1 ⊂ CcK(Ω), such that limn→∞ fn = f,
Proof. (i). We know that Ω is open in Ωα , which immediately gives the fact
that f α is continuous at every point ω ∈ Ω. So all we need to show is the continuity
of f α at ∞. This amounts to showing that for every neighborhood N of f α (∞) = 0
in K, there exists a neighborhood V of ∞ in Ωα , such that f α (V ) ⊂ N . Start with
a neighborhood N of 0, and choose ε > 0, such that the set Bε = {z ∈ K : |z| ≤ ε}
is contained in N . Choose some compact set Kε ⊂ Ω, such that
sup_{ω∈Ω∖Kε} |f(ω)| ≤ ε.
Indeed, if we start with some function g ∈ C K (Ωα ) and we take λ = g(∞) and
f = g − λ1, then f (∞) = 0. Note that this argument proves that in fact every
g ∈ C K (Ωα ), can be uniquely represented as g = λ1+f , with λ ∈ K, and f ∈ C0K (Ω).
Proof. Let us denote the right hand side of (7) by M. First we show that
M < ∞. If M = ∞, there exists a sequence (fn)∞n=1 ⊂ C0R(Ω), such that
0 ≤ fn ≤ 1 and φ(fn) ≥ 4ⁿ, ∀ n ≥ 1.
Consider then the function f = Σ_{n=1}^{∞} 2⁻ⁿ fn. Since Σ_{n=1}^{∞} ‖2⁻ⁿ fn‖ ≤ Σ_{n=1}^{∞} 2⁻ⁿ = 1,
it follows that f ∈ C0R(Ω). Notice however that, since we obviously have 2⁻ⁿ fn ≤ f,
by the positivity of φ, we get
φ(f) ≥ φ(2⁻ⁿ fn) = 2⁻ⁿ φ(fn) ≥ 2ⁿ, ∀ n ≥ 1,
which is clearly impossible. Let us show now that φ is continuous, by proving the
inequality
(8) |φ(f)| ≤ M, ∀ f ∈ C0R(Ω), with ‖f‖ ≤ 1.
Start with some arbitrary function f ∈ C0R(Ω) with ‖f‖ ≤ 1. The functions g± = |f| ± f ∈ C0R(Ω)
clearly satisfy g± ≥ 0, so we get φ(|f| ± f) ≥ 0, that is, φ(|f|) ≥ ±φ(f). This gives
|φ(f)| ≤ φ(|f|), and since 0 ≤ |f| ≤ 1, we immediately get (8).
The inequality (8) proves the inequality ‖φ‖ ≤ M. Since we obviously have
M ≤ ‖φ‖, we get in fact the equality (7).
6. Hilbert spaces
In this section we examine a special type of Banach spaces.
Definition. Let K be one of the fields R or C, and let X be a vector space
over K. An inner product on X is a map
X × X ∋ (ξ, η) ↦ (ξ | η) ∈ K,
λξ1 + ξ2 η = λ ξ1 η + ξ2 η , ∀ ξ1 , ξ2 , η ∈ X, λ ∈ K.
= |z|2 η η + z ξ η + z ξ η + ξ ξ , ∀ z ∈ R.
Proof. (i). This is obvious, since (see the computations from the proof of
Corollary ??)
‖ξ ± η‖² = ‖ξ‖² + ‖η‖² ± 2Re(ξ | η).
so we immediately get
‖ξ + η‖² − ‖ξ − η‖² = 4(ξ | η).
Since
Σ_{k=0}^{3} i^(−k) = Σ_{k=0}^{3} i^(−2k) = 0,
the above computation proves that we indeed have
Σ_{k=0}^{3} i^(−k) ‖ξ + i^k η‖² = 4(ξ | η).
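The complex polarization identity can be verified numerically for vectors in a small coordinate space. The sketch below assumes, as an explicit convention not confirmed by the surviving text, that the inner product is conjugate-linear in the first variable and linear in the second; with the opposite convention the same sum yields the complex conjugate of the right hand side. The sample vectors and function names are illustrative.

```python
def inner(x, y):
    """Inner product on C^n, assumed conjugate-linear in x and linear in y."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))

def norm_sq(x):
    """Squared norm induced by the inner product."""
    return inner(x, x).real

# hypothetical sample vectors in C^3
xi = [1 + 2j, -0.5j, 3.0]
eta = [2 - 1j, 1 + 1j, -1j]

# left hand side: sum over k = 0..3 of i^(-k) * ||xi + i^k eta||^2
lhs = sum(
    (1j) ** (-k) * norm_sq([a + (1j) ** k * b for a, b in zip(xi, eta)])
    for k in range(4)
)
rhs = 4 * inner(xi, eta)
```

Up to rounding, lhs and rhs agree, so the inner product is recovered from the norm alone.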
Corollary 6.2. Let X be a K-vector space equipped with an inner product
(· | ·). Then the map
X × X ∋ (ξ, η) ↦ (ξ | η) ∈ K
Hint: Define the inner product by the Polarization Identity, and then prove that it is indeed an
inner product.
Let us prove now the uniqueness. Assume ξ″ ∈ C is another point such that
‖ξ − ξ″‖ = δ. Using the Parallelogram Law, we have
4δ² = 2‖ξ − ξ′‖² + 2‖ξ − ξ″‖² = ‖2ξ − ξ′ − ξ″‖² + ‖ξ′ − ξ″‖².
If ξ′ ≠ ξ″, then we will have
4δ² > ‖2ξ − ξ′ − ξ″‖² = 4‖ξ − ½(ξ′ + ξ″)‖²,
so we have a new vector η = ½(ξ′ + ξ″) ∈ C, such that
‖ξ − η‖ < δ,
thus contradicting the definition of δ.
Definition. Let H be a Hilbert space, and let X ⊂ H be a closed linear
subspace. For every ξ ∈ H, using the above result, we let PX ξ ∈ X denote the
unique vector in X with the property
kξ − PX ξk = dist(ξ, X).
This way we have constructed a map PX : H → H, which is called the orthogonal
projection onto X.
The properties of the orthogonal projection are summarized in the following
result.
Proposition 6.4. Let H be a Hilbert space, and let X ⊂ H be a closed linear
subspace.
(i) For vectors ξ ∈ H and ζ ∈ X one has the equivalence
ζ = PX ξ ⇐⇒ (ξ − ζ) ⊥ X.
(ii) PX|X = IdX.
(iii) The map PX : H → X is linear and continuous. If X ≠ {0}, then ‖PX‖ = 1.
(iv) Ran PX = X and Ker PX = X⊥ .
Proof. (i). “⇒.” Assume ζ = P_X ξ. Fix an arbitrary vector η ∈ X ∖ {0}, and choose a number λ ∈ K, with |λ| = 1, such that
\[ \lambda\,(\xi - \zeta\,|\,\eta) = |(\xi - \zeta\,|\,\eta)|. \]
In particular, we have
\[ |(\xi - \zeta\,|\,\eta)| = \mathrm{Re}\,(\xi - \zeta\,|\,\lambda\eta). \]
Define the map F : R → R by
\[ F(t) = \|\xi - \zeta - t\lambda\eta\|^2 - \|\xi - \zeta\|^2. \]
By the definition of ζ = P_X ξ, we have
\[ F(t) > 0, \quad \forall\, t \in \mathbb{R} \smallsetminus \{0\}. \tag{5} \]
Notice that F(t) = at² + bt, ∀ t ∈ R, where a = (λη | λη) = ‖η‖², and b = −2 Re (ξ − ζ | λη) = −2|(ξ − ζ | η)|. Of course, the property
at² + bt > 0, ∀ t ∈ R ∖ {0}
forces b = 0 (otherwise t = −b/(2a) would give at² + bt = −b²/(4a) < 0), so we indeed get (ξ − ζ | η) = 0.
“⇐.” Assume (ξ − ζ) ⊥ X. For any η ∈ X, we have (ξ − ζ) ⊥ (ζ − η), so using the Pythagorean Theorem, we get
\[ \|\xi - \eta\|^2 = \|\xi - \zeta\|^2 + \|\zeta - \eta\|^2, \]
CHAPTER II: ELEMENTS OF FUNCTIONAL ANALYSIS 123
which forces
\[ \|\xi - \eta\| \ge \|\xi - \zeta\|, \quad \forall\, \eta \in X. \]
This proves that
\[ \|\xi - \zeta\| = \mathrm{dist}(\xi, X), \]
i.e. ζ = P_X ξ.
(ii). This property is pretty clear. If ξ ∈ X, then 0 = ξ − ξ is orthogonal to X, so by (i) we get ξ = P_X ξ.
(iii). We prove the linearity of P_X. Start with vectors ξ_1, ξ_2 ∈ H and a scalar λ ∈ K. Take ζ_1 = P_X ξ_1 and ζ_2 = P_X ξ_2. Consider the vector ζ = λζ_1 + ζ_2. By (i) we have (ξ_1 − ζ_1) ⊥ X and (ξ_2 − ζ_2) ⊥ X, so for any η ∈ X, we have
\[ (\lambda\xi_1 + \xi_2 - \zeta\,|\,\eta) = \bigl((\lambda\xi_1 - \lambda\zeta_1) + (\xi_2 - \zeta_2)\,\big|\,\eta\bigr) = (\lambda\xi_1 - \lambda\zeta_1\,|\,\eta) + (\xi_2 - \zeta_2\,|\,\eta) = \bar{\lambda}\,(\xi_1 - \zeta_1\,|\,\eta) + (\xi_2 - \zeta_2\,|\,\eta) = 0. \]
This computation proves that
(λξ_1 + ξ_2 − ζ) ⊥ X,
so using (i) we get
P_X(λξ_1 + ξ_2) = ζ = λζ_1 + ζ_2 = λP_X ξ_1 + P_X ξ_2,
so P_X is indeed linear.
To prove the continuity, we start with an arbitrary vector ξ ∈ H and we use the fact that (ξ − P_X ξ) ⊥ P_X ξ. By the Pythagorean Theorem we then have
\[ \|\xi\|^2 = \|(\xi - P_X\xi) + P_X\xi\|^2 = \|\xi - P_X\xi\|^2 + \|P_X\xi\|^2 \ge \|P_X\xi\|^2. \]
In other words, we have
‖P_X ξ‖ ≤ ‖ξ‖, ∀ ξ ∈ H,
so P_X is indeed continuous, and we have ‖P_X‖ ≤ 1. Using (ii) we immediately get that, when X ≠ {0}, we have ‖P_X‖ = 1.
(iv). The equality Ran P_X = X is trivial by the construction of P_X and by (ii). If ξ ∈ Ker P_X, then by (i), we have ξ ∈ X⊥. Conversely, if ξ ⊥ X, then ζ = 0 satisfies the condition in (i), i.e. P_X ξ = 0.
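The characterization in (i) can be illustrated numerically in a finite dimensional real Hilbert space. The sketch below works in R^3 with the usual dot product, takes X to be a plane with a hand-picked orthonormal basis, and computes the projection by the formula of Exercise 5 (sum of the coefficients against an orthonormal basis). The vectors, the basis, and the helper names `dot`, `norm` and `project` are assumptions made for the illustration.

```python
import math
import random

def dot(x, y):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

def project(xi, orthonormal_basis):
    """Orthogonal projection of xi onto the span of an orthonormal family."""
    out = [0.0] * len(xi)
    for e in orthonormal_basis:
        c = dot(e, xi)
        out = [o + c * ej for o, ej in zip(out, e)]
    return out

s = 1 / math.sqrt(2)
basis = [[s, s, 0.0], [0.0, 0.0, 1.0]]   # orthonormal basis of a plane X in R^3
xi = [3.0, -1.0, 2.0]
zeta = project(xi, basis)                 # candidate for P_X xi
residual = [a - b for a, b in zip(xi, zeta)]

# (i): xi - zeta is orthogonal to X (enough to test against the basis)
orthogonality_defects = [abs(dot(residual, e)) for e in basis]

# zeta minimizes the distance to xi among randomly sampled points of X
random.seed(1)
closest = True
for _ in range(1000):
    c1, c2 = random.uniform(-5, 5), random.uniform(-5, 5)
    eta = [c1 * a + c2 * b for a, b in zip(*basis)]
    if norm([a - b for a, b in zip(xi, eta)]) < norm(residual) - 1e-12:
        closest = False

print(zeta, orthogonality_defects, closest)
```

The same run also shows the norm estimate from (iii): the length of `zeta` does not exceed the length of `xi`.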
Corollary 6.4. If H is a Hilbert space, and X ⊂ H is a closed linear subspace, then
X + X⊥ = H and X ∩ X⊥ = {0}.
In other words the map
(6) X × X⊥ ∋ (η, ζ) ⟼ η + ζ ∈ H
is a linear isomorphism.
Proof. If ξ ∈ H then P_X ξ ∈ X, and ξ − P_X ξ ∈ X⊥, and then the equality
ξ = P_X ξ + (ξ − P_X ξ)
proves that ξ ∈ X + X⊥. The equality X ∩ X⊥ = {0} is trivial, since for ζ ∈ X ∩ X⊥, we have ζ ⊥ ζ, which forces ζ = 0.
(ii) Prove that, for two closed subspaces X, Y ⊂ H, the following are equivalent:
– X ⊥ Y;
– P_X P_Y = 0;
– P_Y P_X = 0.
(iii) Prove that, for two closed subspaces X, Y ⊂ H, the following are equivalent:
– X ⊂ Y;
– P_X P_Y = P_X;
– P_Y P_X = P_X.
(iv) Prove that, if X, Y ⊂ H are closed subspaces such that X ⊥ Y, then
– X + Y is a closed linear subspace of H;
– P_{X+Y} = P_X + P_Y.
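Corollary 6.5. Let H be a Hilbert space, and let X ⊂ H be a linear (not necessarily closed) subspace. Then one has the equality
\[ \overline{X} = (X^\perp)^\perp. \]
Proof. Denote the closed subspace (X⊥)⊥ by Z. Since $X^\perp = (\overline{X})^\perp$, by the previous exercise we have
\[ P_Z = I - P_{X^\perp} = I - P_{(\overline{X})^\perp} = I - (I - P_{\overline{X}}) = P_{\overline{X}}, \]
which forces
\[ Z = \mathrm{Ran}\,P_Z = \mathrm{Ran}\,P_{\overline{X}} = \overline{X}. \]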
Theorem 6.1 (Riesz’ Representation Theorem). Let H be a Hilbert space over
K, and let φ : H → K be a linear continuous map. Then there exists a unique
vector ξ ∈ H, such that
\[ \varphi(\eta) = (\xi\,|\,\eta), \quad \forall\, \eta \in H. \]
that
X⊥ = Kξ.
Start now with some arbitrary vector η ∈ H. On the one hand, using the equality
Kξ0 + X = H, there exists λ ∈ K and ζ ∈ X, such that
η = λξ0 + ζ,
and since ζ ∈ X = Ker φ, we get
φ(η) = φ(λξ0 ) = λφ(ξ0 ).
On the other hand, we have
\[ (\xi_0\,|\,\eta) = (\xi_0\,|\,\lambda\xi_0) + (\xi_0\,|\,\zeta) = \lambda\|\xi_0\|^2, \]
In particular, we have
\[ \|\xi - \xi'\|^2 = (\xi - \xi'\,|\,\xi - \xi') = (\xi\,|\,\xi - \xi') - (\xi'\,|\,\xi - \xi') = \varphi(\xi - \xi') - \varphi(\xi - \xi') = 0, \]
which forces ξ = ξ 0 .
Finally, to prove the norm equality, we first observe that when ξ = 0, the
equality is trivial. If ξ ≠ 0, then on the one hand, using the C-B-S inequality we have
\[ |\varphi(\eta)| = |(\xi\,|\,\eta)| \le \|\xi\| \cdot \|\eta\|, \quad \forall\, \eta \in H, \]
so we immediately get ‖φ‖ ≤ ‖ξ‖. If we take the vector ζ = ‖ξ‖⁻¹ξ, then ‖ζ‖ = 1, and
\[ \varphi(\zeta) = \bigl(\xi\,\big|\,\|\xi\|^{-1}\xi\bigr) = \|\xi\|, \]
is orthonormal.
Proposition 6.5. Let X be a K-vector space equipped with an inner product.
Any orthogonal set F ⊂ X is linearly independent.
Lemma 6.2. Let X be a K-vector space equipped with an inner product, and let
F ⊂ X be an orthogonal set. Then there exists a maximal (with respect to inclusion)
orthogonal set G ⊂ X with F ⊂ G.
A_F = {G ∈ A : G ⊃ F},
ordered with the inclusion. We are going to apply Zorn’s Lemma to AF . Let
T ⊂ AF be a subcollection, which is totally ordered, i.e. for any G1 , G2 ∈ T one has
G_1 ⊂ G_2 or G_1 ⊃ G_2. Define the set
\[ M = \bigcup_{G \in T} G. \]
Remark 6.2. Using the notations from the proof above, given an orthonormal
set M ⊂ X, the following are equivalent:
(i) M is maximal in A;
(ii) M is maximal in
A^{(1)} = {G : G orthonormal subset of X}.
Proof. (i) ⇒ (ii). Assume F is maximal. We are going to show that Span F is dense in H, by contradiction. Denote the closure $\overline{\mathrm{Span}\,F}$ simply by X, and assume X ⊊ H. Since
\[ X = (X^\perp)^\perp, \]
we see that the strict inclusion X ⊊ H forces X⊥ ≠ {0}. But now if we take a non-zero vector ξ ∈ X⊥, we immediately see that the set F ∪ {ξ} is still orthogonal, thus contradicting the maximality of F.
(ii) ⇒ (i). Assume Span F is dense in H, and let us prove that F is maximal. We do this by contradiction. If F is not maximal, then there exists ξ ∈ H ∖ F, such that F ∪ {ξ} is still orthogonal. This would force ξ ⊥ F, so we will also have
ξ ⊥ Span F.
But since Span F is dense in H, this will give ξ ⊥ H. In particular we have ξ ⊥ ξ, which would force ξ = 0, thus contradicting the fact that F ∪ {ξ} is orthogonal. (Recall that all elements of an orthogonal set are non-zero.)
Definition. Let H be a Hilbert space. An orthonormal set B ⊂ H, which is maximal among all orthogonal (or orthonormal) subsets of H, is called an orthonormal basis for H.
By Lemma ??, we know that given any orthonormal set F ⊂ H, there exists an
orthonormal basis B ⊃ F.
By the above result, an orthonormal set B ⊂ H is an orthonormal basis for H,
if and only if Span B is dense in H.
Example 6.2. Let I be a non-empty set. Consider the Hilbert space ℓ²_K(I). Consider (see section II.2) the set
B = {δ_i : i ∈ I}.
Then
Span B = fin_K(I),
which is dense in ℓ²_K(I). The above result then says that B is an orthonormal basis for ℓ²_K(I).
The following exercise will be useful in the discussion of another interesting
example.
Exercise 4. Equip the space C([0, 1]) with the inner product
\[ (f\,|\,g) = \int_0^1 \overline{f(t)}\,g(t)\,dt, \quad f, g \in C([0,1]). \]
The norm defined by this inner product is
\[ \|f\|_2 = \Bigl( \int_0^1 |f(t)|^2\,dt \Bigr)^{1/2}, \quad f \in C([0,1]). \]
Define the maps e_n : [0, 1] ∋ t ⟼ exp(2nπit) ∈ T, n ∈ Z. (Here T denotes the unit circle in C.) Prove that the set
B = {e_n : n ∈ Z}
is orthonormal in C([0, 1]), and Span B is dense in C([0, 1]) in the topology defined by the norm ‖ · ‖₂.
Hints: Define the space
P = {f ∈ C([0, 1]) : f(0) = f(1)}.
Prove that P is dense in C([0, 1]) in the topology defined by the norm ‖ · ‖₂. Prove that the map
Φ : C(T) ∋ F ⟼ F ∘ e_1 ∈ P
is a linear isomorphism, which is isometric with respect to the uniform norms.
In order to prove that Span B is dense in C([0, 1]) with respect to ‖ · ‖₂, it suffices to show that Span B is dense in P in the uniform norm. Equivalently, it suffices to show that
Φ⁻¹(Span B)
is dense in C(T), with respect to the uniform norm. To get this density use the Stone-Weierstrass Theorem, plus the fact that the functions ζ_n = Φ⁻¹(e_n) ∈ C(T) are defined by
ζ_n(z) = zⁿ, ∀ z ∈ T, n ∈ Z.
Example 6.3. We define L²([0, 1]) to be the completion of C([0, 1]) with respect to the norm ‖ · ‖₂. Regard C([0, 1]) as a dense linear subspace in L²([0, 1]), so we also regard
B = {e_n : n ∈ Z}
as a subset in L²([0, 1]). Then Span B is dense in L²([0, 1]), so B is an orthonormal basis for L²([0, 1]).
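Having proven the Claim, let us observe that, since the terms in the sum that defines η_F are all orthogonal, we get
\[ \|\eta_F\|^2 = \sum_{j \in F} \bigl\| (\xi_j\,|\,\eta)\,\xi_j \bigr\|^2 = \sum_{j \in F} |(\xi_j\,|\,\eta)|^2 \cdot \|\xi_j\|^2 = \sum_{j \in F} |\alpha_\eta(j)|^2. \]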
(ii). The linearity of T is obvious. The above inequality actually proves that
‖Tη‖ ≤ ‖η‖, ∀ η ∈ H.
We now prove that in fact T is isometric. Since T is linear and continuous, it suffices to prove that T|_{Span B} is isometric. Start with some vector η ∈ Span B, which means that there exists some finite set F ⊂ I, and scalars (λ_k)_{k∈F} ⊂ K, such that $\eta = \sum_{k \in F} \lambda_k \xi_k$. Remark that
\[ (\xi_j\,|\,\eta) = \sum_{k \in F} \lambda_k\,(\xi_j\,|\,\xi_k) = \begin{cases} \lambda_j & \text{if } j \in F \\ 0 & \text{if } j \notin F \end{cases} \]
so we indeed get
‖η‖ = ‖Tη‖, ∀ η ∈ Span B.
Let us prove that T is surjective. Notice that the above computation, applied to singleton sets F = {k}, k ∈ I, proves that
T ξ_k = δ_k, ∀ k ∈ I.
In particular, we have
Ran T ⊃ T Span B = Span T (B) =
Notation. Let H be a Hilbert space, let B = {ξ_j : j ∈ I} be an orthonormal basis for H, and let T : H → ℓ²_K(I) be the isometric linear isomorphism defined in the previous theorem. Given an element α ∈ ℓ²_K(I), we denote the vector T⁻¹α ∈ H by
\[ \sum_{j \in I} \alpha(j)\,\xi_j. \]
The summation notation is justified by the following fact.
Proposition 6.6. With the above notations, for every ε > 0, there exists some finite subset F_ε ⊂ I, such that
\[ \Bigl\| \sum_{j \in I} \alpha(j)\,\xi_j - \sum_{k \in F} \alpha(k)\,\xi_k \Bigr\|^2 < \varepsilon, \quad \text{for all finite sets } F \subset I \text{ with } F \supset F_\varepsilon. \]
Proof. Define the vector $\eta = \sum_{j \in I} \alpha(j)\,\xi_j$. By construction we have Tη = α. Likewise, if we define, for each finite set F ⊂ I, the element α_F ∈ ℓ²_K(I) by
\[ \alpha_F(k) = \begin{cases} \alpha(k) & \text{if } k \in F \\ 0 & \text{if } k \in I \smallsetminus F \end{cases} \]
then $T^{-1}\alpha_F = \sum_{k \in F} \alpha(k)\,\xi_k$. Using the fact that T is an isometry, we have
‖η − T⁻¹α_F‖ = ‖Tη − α_F‖ = ‖α − α_F‖,
and the desired property follows from the well-known properties of ℓ²_K(I).
Exercise 5. Let H be a Hilbert space, let F = {ξ_j : j ∈ J} be an orthonormal set. Define the closed linear subspace $H_F = \overline{\mathrm{Span}\,F}$. Prove that the orthogonal projection P_{H_F} is defined by
\[ P_{H_F}\eta = \sum_{j \in J} (\xi_j\,|\,\eta)\,\xi_j, \quad \forall\, \eta \in H. \]
Hints: Extend F to an orthonormal basis B. Let B be labelled as {ξ_i : i ∈ I} for some set I ⊃ J. First prove that for any η ∈ H, the map β_η = (Tη)|_J belongs to ℓ²_K(J). In particular, the sum
\[ \eta_F = \sum_{j \in J} (\xi_j\,|\,\eta)\,\xi_j \]
is “legitimate” and defines an element in H_F (use the fact that F is an orthonormal basis for H_F). Finally, prove that (η − η_F) ⊥ F, using Parseval Identity.
Example 6.4. Let us analyze the space L²([0, 1]). Use the orthonormal basis {e_n : n ∈ Z} defined by
e_n(t) = exp(2nπit), ∀ t ∈ [0, 1], n ∈ Z.
For any f ∈ C([0, 1]) we define
\[ \hat{f}(n) = \int_0^1 \exp(-2n\pi i t)\,f(t)\,dt = (e_n\,|\,f). \]
With the summation notation introduced above, this gives
\[ f = \sum_{n \in \mathbb{Z}} \hat{f}(n)\,e_n. \]
One can think of the right hand side as a series, but the reader should be aware of the fact that this series is convergent only in the norm ‖ · ‖₂. One can define for example, for any N ≥ 1, a partial sum f_N : [0, 1] → C by
\[ f_N(t) = \sum_{n=-N}^{N} \hat{f}(n)\exp(2n\pi i t), \quad t \in [0,1]. \]
We will have
\[ \lim_{N \to \infty} \|f - f_N\|_2 = 0, \]
but in general there are (many) values of t ∈ [0, 1] for which the limit $\lim_{N\to\infty} f_N(t)$ does not exist. One can consider a formal infinite series
\[ \sum_{n=-\infty}^{\infty} \hat{f}(n)\exp(2n\pi i t). \tag{7} \]
Although this series is not convergent (pointwise) for all t ∈ [0, 1], it plays an important role in analysis. The series (7) is called the complex Fourier series of f. Note that Parseval’s Identity gives
\[ \int_0^1 \overline{f(t)}\,g(t)\,dt = \sum_{n=-\infty}^{\infty} \overline{\hat{f}(n)}\,\hat{g}(n). \]
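To see Parseval's Identity at work on a concrete function, the following Python sketch approximates the Fourier coefficients of the test function f(t) = t by a midpoint rule and compares the partial sums of Σ|f̂(n)|² with ‖f‖₂² = 1/3. The choice of f, the quadrature rule, the truncation N = 50 and the helper name `fourier_coefficient` are assumptions made for the illustration; the closed form f̂(n) = i/(2πn) for n ≠ 0 comes from an integration by parts.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=2000):
    """Midpoint-rule approximation of the integral of exp(-2*pi*i*n*t) * f(t) over [0, 1]."""
    h = 1.0 / samples
    total = 0j
    for k in range(samples):
        t = (k + 0.5) * h
        total += cmath.exp(-2j * math.pi * n * t) * f(t)
    return total * h

def f(t):
    """Test function f(t) = t (an arbitrary choice for this illustration)."""
    return t

N = 50
coeffs = {n: fourier_coefficient(f, n) for n in range(-N, N + 1)}

# Partial sum of sum |f^(n)|^2, to be compared with ||f||_2^2 = 1/3
bessel_sum = sum(abs(c) ** 2 for c in coeffs.values())
norm_sq = 1.0 / 3.0

print(coeffs[0], coeffs[1])
print(norm_sq - bessel_sum)  # small and positive; it shrinks as N grows
```

The gap between the two numbers is the tail of the series, which for this f behaves like 1/(2π²N); this is a slow rate, in line with the remark that the Fourier series converges only in the norm ‖ · ‖₂.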
One can construct another orthonormal basis for L²([0, 1]), by taking real and imaginary parts of e_n. More explicitly, we define the sequences of functions (g_n)_{n=0}^∞ and (h_n)_{n=1}^∞ by
Then B′ = {g_n : n ≥ 0} ∪ {h_n : n ≥ 1} is again an orthonormal basis for L²([0, 1]). (It is clear that B′ is orthonormal, and Span B′ ∋ e_n, ∀ n ∈ Z, so Span B′ is dense in L²([0, 1]).) For f ∈ C([0, 1]) one can then define its real Fourier series
\[ \hat{f}(0) + \sum_{n=1}^{\infty} \bigl[ a_n \cos(2n\pi t) + b_n \sin(2n\pi t) \bigr], \]
where
\[ a_n = \sqrt{2} \int_0^1 f(t)\cos(2n\pi t)\,dt \quad \text{and} \quad b_n = \sqrt{2} \int_0^1 f(t)\sin(2n\pi t)\,dt, \quad \forall\, n \ge 1. \]
Note that
\[ a_n = \frac{\sqrt{2}}{2}\bigl[\hat{f}(-n) + \hat{f}(n)\bigr] \quad \text{and} \quad b_n = \frac{\sqrt{2}}{2i}\bigl[\hat{f}(-n) - \hat{f}(n)\bigr], \quad \forall\, n \ge 1. \]
The next result discusses the appropriate notion of dimension for Hilbert spaces.
Theorem 6.4. Let H be a Hilbert space. Then any two orthonormal bases of
H have the same cardinality.
Proof. Fix two orthonormal bases B and B′. There are two possible cases.
Case I: One of the sets B or B′ is finite.
In this case H is finite dimensional, since the linear span of a finite set is automatically closed. Since both B and B′ are linearly independent, it follows that both B and B′ are finite, hence their linear spans are both closed. It follows that
Span B = Span B′ = H,
so B and B′ are in fact linear bases for H, and then we get
Card B = Card B′ = dim H.
Case II: Both B and B′ are infinite.
The key step we need in this case is the following.
Claim 1: There exists a dense subset Z ⊂ H, with
Card Z = Card B′.
To prove this fact, we define the set
X = Span_Q B′.
It is clear that
Card X = Card B′.
Notice that X is dense in Span_R B′. If we work over K = R, then we are done. If we work over K = C, we define
Z = X + iX,
and we will still have
Card Z = Card X = Card B′.
Now we are done, since clearly Z is dense in Span_C B′.
Choose Z as in Claim 1. For every ξ ∈ B we choose a vector ζ_ξ ∈ Z, such that
\[ \|\xi - \zeta_\xi\| \le \frac{\sqrt{2} - 1}{2}. \]
Claim 2: The map B ∋ ξ ⟼ ζ_ξ ∈ Z is injective.
138 LECTURE 18
We define
Elem_K(X) = {φ : X → K : φ elementary}.
Given a collection M ⊂ P(X), a function φ : X → K is said to be M-elementary, if φ is elementary, and moreover,
φ⁻¹({λ}) ∈ M, ∀ λ ∈ K ∖ {0}.
We define
M-Elem_K(X) = {φ : X → K : φ M-elementary}.
Exercise 2. With the above notations, prove that Elem_K(X) is a unital K-algebra.
Proposition 1.1. Given a non-empty set X, the collection P(X) is a unital ring, with the operations
A + B = A △ B and A · B = A ∩ B, A, B ∈ P(X).
Proof. First of all, it is clear that △ is commutative.
To prove the associativity of △, we simply observe that
\[ \begin{aligned} \kappa_{(A \triangle B) \triangle C} &= \kappa_{A \triangle B} + \kappa_C - 2\kappa_{A \triangle B}\,\kappa_C \\ &= \kappa_A + \kappa_B - 2\kappa_A\kappa_B + \kappa_C - 2(\kappa_A + \kappa_B - 2\kappa_A\kappa_B)\cdot\kappa_C \\ &= \kappa_A + \kappa_B + \kappa_C - 2(\kappa_A\kappa_B + \kappa_A\kappa_C + \kappa_B\kappa_C) + 4\kappa_A\kappa_B\kappa_C. \end{aligned} \]
Since the final result is symmetric in A, B, C, we see that we get
κ_{A△(B△C)} = κ_{(A△B)△C},
so we indeed get
(A △ B) △ C = A △ (B △ C).
The neutral element for △ is the empty set ∅. Since we obviously have A △ A = ∅, it follows that (P(X), △) is indeed an abelian group.
The operation ∩ is clearly commutative, associative, and has the total set X as the unit.
To check distributivity, we again use characteristic functions:
\[ \begin{aligned} \kappa_{(A \cap C) \triangle (B \cap C)} &= \kappa_{A \cap C} + \kappa_{B \cap C} - 2\kappa_{A \cap C}\,\kappa_{B \cap C} \\ &= \kappa_A\kappa_C + \kappa_B\kappa_C - 2\kappa_A\kappa_B\kappa_C = (\kappa_A + \kappa_B - 2\kappa_A\kappa_B)\,\kappa_C \\ &= \kappa_{A \triangle B}\,\kappa_C = \kappa_{(A \triangle B) \cap C}, \end{aligned} \]
so we indeed have the equality
(A ∩ C) △ (B ∩ C) = (A △ B) ∩ C.
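For a small set X the ring axioms can also be confirmed by brute force. The following Python sketch enumerates all subsets of X = {1, 2, 3} and checks associativity of the symmetric difference, distributivity of intersection over it, the neutral and unit elements, and the formula for the characteristic function of a symmetric difference. The choice of X and the helper names `powerset` and `kappa` are assumptions made for the illustration.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def kappa(a, x):
    """Characteristic function of the set a, evaluated at the point x."""
    return 1 if x in a else 0

X = frozenset({1, 2, 3})
subsets = powerset(X)
empty = frozenset()

# For frozensets, ^ is the symmetric difference and & is the intersection.
assoc_ok = all((a ^ b) ^ c == a ^ (b ^ c)
               for a in subsets for b in subsets for c in subsets)
distrib_ok = all((a & c) ^ (b & c) == (a ^ b) & c
                 for a in subsets for b in subsets for c in subsets)
neutral_ok = all(a ^ empty == a and a ^ a == empty and a & X == a for a in subsets)
kappa_ok = all(kappa(a ^ b, x) == kappa(a, x) + kappa(b, x) - 2 * kappa(a, x) * kappa(b, x)
               for a in subsets for b in subsets for x in X)

print(assoc_ok, distrib_ok, neutral_ok, kappa_ok)
```

Since every element is its own additive inverse, this ring has characteristic 2, which is why no separate check of inverses is needed.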
7 K can be any field.
CHAPTER III: MEASURE THEORY 139
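combined with the fact that the B’s are pairwise disjoint, give
\[ \kappa_{C_j} = \kappa_{B_{2j-1}} + \kappa_{B_{2j}}, \quad \forall\, j \in \{1, \dots, p\}, \]
\[ \kappa_{A_1} = \kappa_{B_1} + \kappa_{B_3} + \dots + \kappa_{B_{2p+1}}, \]
which give
\[ \varphi = \sum_{j=1}^{p} \eta_j\,\kappa_{B_{2j}} + \sum_{j=1}^{p} (\eta_j + \lambda_1)\,\kappa_{B_{2j-1}} + \lambda_1\,\kappa_{B_{2p+1}}, \]
which proves that φ indeed belongs to E.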
(iii) ⇒ (i). Assume there exists a finite pairwise disjoint system (B_j)_{j=1}^m ⊂ R, and numbers µ_1, …, µ_m ∈ K, such that
φ = µ_1 κ_{B_1} + · · · + µ_m κ_{B_m},
and let us prove that φ is R-elementary.
If all the µ’s are zero, there is nothing to prove, since φ = 0.
Assume the µ’s are not all equal to zero. Since the µ’s that are equal to zero do not have any contribution, we can in fact assume that all the µ’s are non-zero. Notice that
φ(X) ∖ {0} = {µ_j : 1 ≤ j ≤ m}.
In particular φ is elementary.
If we start with an arbitrary λ ∈ K ∖ {0}, then either λ ∉ φ(X), or λ ∈ φ(X) ∖ {0}. In the first case we clearly have φ⁻¹({λ}) = ∅ ∈ R. In the second case, we have the equality
\[ \varphi^{-1}(\{\lambda\}) = \bigcup_{j \in M_\lambda} B_j, \]
where
M_λ = {j : 1 ≤ j ≤ m and µ_j = λ}.
Since all B’s belong to R, it follows that φ⁻¹({λ}) again belongs to R. Having shown that φ is elementary, and φ⁻¹({λ}) ∈ R, for all λ ∈ K ∖ {0}, it follows that φ is indeed R-elementary.
Proposition 1.3. Let X be a non-empty set, and let K be one of the fields Q,
R, or C.
A. For a non-empty collection R ⊂ P(X), the following are equivalent:
(i) R is a ring on X;
(ii) R-ElemK (X) is a K-subalgebra of ElemK (X).
B. For a non-empty collection A ⊂ P(X), the following are equivalent:
(i) A is an algebra on X;
(ii) A-ElemK (X) is a K-subalgebra of ElemK (X), which contains the constant
function 1.
Proof. A. (i) ⇒ (ii). Assume R is a ring on X. Using Lemma 1.1 we see that we have the equality:
R-Elem_K(X) = Span{κ_A : A ∈ R}.
In particular, this shows that R-Elem_K(X) is a K-linear subspace of Elem_K(X). Moreover, in order to prove that R-Elem_K(X) is a K-subalgebra, it suffices to prove the implication
A, B ∈ R ⟹ κ_A · κ_B ∈ R-Elem_K(X).
is invertible. Take $(\alpha_{ij})_{i,j=1}^{n}$ to be the inverse of T. The obvious equalities
\[ \varphi_k = \sum_{j=1}^{n} \lambda_{kj}\,\kappa_{A_j}, \quad \forall\, k = 1, \dots, n \]
Notice that the family F_Θ(E, X) is non-empty, since it contains at least the collection P(X). The collection
\[ \Theta_X(E) = \bigcap_{C \in F_\Theta(E, X)} C \]
is of type Θ on X, and is called the type Θ class generated by E. When there is no danger of confusion, the ambient set X will be omitted.
Comment. In the above setting, the class Θ(E) is the smallest collection of
type Θ on X, which contains E. In other words, if C is a collection of type Θ on X,
with C ⊃ E, then C ⊃ Θ(E). This follows immediately from the fact that C belongs
to FΘ (E, X).
Examples 2.2. Let X be a (non-empty) set, and let E be an arbitrary collection
of subsets of X. According to the previous list of consistent types R, A, S, Σ, and
M, one can construct the following collections.
(i) R(E), the ring generated by E; this is the smallest ring that contains E.
146 LECTURE 19
(ii) A(E), the algebra generated by E; this is the smallest algebra that contains
E.
(iii) S(E), the σ-ring generated by E; this is the smallest σ-ring that contains
E.
(iv) Σ(E), the σ-algebra generated by E; this is the smallest σ-algebra that
contains E.
(v) M(E), the monotone class generated by E; this is the smallest monotone
class that contains E.
Comment. Assume Θ is a consistent type. Suppose E is an arbitrary collection
of subsets of some fixed non-empty set X. There are instances when we would like
to decide whether a class C ⊃ E coincides with Θ(E). The following is a useful test:
(i) check that C is of type Θ;
(ii) check the inclusion C ⊂ Θ(E).
By (i) we must have C ⊃ Θ(E), so by (ii) we will indeed have equality.
A simple illustration of the above technique allows one to describe the ring and
the algebra generated by a collection of sets.
Proposition 2.1. Let X be a non-empty set, and let E be an arbitrary collec-
tion of subsets of X.
A. For a set A ⊂ X, the following are equivalent:
(i) A ∈ R(E);
(ii) There exist sets A_1, A_2, …, A_n such that A = A_1 △ A_2 △ … △ A_n, and each A_k, k = 1, …, n, is a finite intersection of sets in E.
B. The algebra generated by E is
A(E) = R(E) ∪ {X ∖ A : A ∈ R(E)} = R(E ∪ {X}).
Since we clearly have E ⊂ A ⊂ A(E), all we need to prove is the fact that A is an
algebra. It is clear that, whenever A ∈ A, it follows that X r A ∈ A. Therefore
(see Section III.1), we only need to show that
A, B ∈ A ⇒ A ∪ B ∈ A.
There are four cases to examine: (i) A, B ∈ R(E); (ii) A ∈ R(E) and X ∖ B ∈ R(E); (iii) X ∖ A ∈ R(E) and B ∈ R(E); (iv) X ∖ A ∈ R(E) and X ∖ B ∈ R(E).
Case (i) is clear, since it will force A ∪ B ∈ R(E).
In case (ii), we use
X ∖ (A ∪ B) = (X ∖ A) ∩ (X ∖ B) = (X ∖ B) ∖ A,
which proves that X ∖ (A ∪ B) ∈ R(E).
Case (iii) is proven exactly as case (ii).
In case (iv) we use
X ∖ (A ∪ B) = (X ∖ A) ∩ (X ∖ B),
which proves that X ∖ (A ∪ B) ∈ R(E).
The equality A(E) = R(E ∪ {X}) is trivial.
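On a finite set the generated ring can be computed by closing the collection under symmetric differences and intersections, which makes part B easy to test. The sketch below does this for a hypothetical collection E on X = {1, …, 5}; the function name `generated_ring` and the sample data are assumptions made for the illustration, and a ring is understood here as a collection closed under △ and ∩ (equivalently, under finite unions and differences).

```python
from itertools import combinations

def generated_ring(collection):
    """Smallest family containing `collection` and the empty set that is
    closed under symmetric difference and intersection (finite sets only)."""
    ring = {frozenset()} | {frozenset(s) for s in collection}
    changed = True
    while changed:
        changed = False
        for a, b in list(combinations(ring, 2)):
            for c in (a ^ b, a & b):
                if c not in ring:
                    ring.add(c)
                    changed = True
    return ring

X = frozenset(range(1, 6))
E = [{1, 2}, {2, 3}]                  # sample collection, chosen arbitrarily

R = generated_ring(E)                                   # R(E)
algebra_via_complements = R | {X - a for a in R}        # R(E) together with complements
algebra_via_X = generated_ring(E + [X])                 # R(E ∪ {X})

print(len(R), len(algebra_via_complements), algebra_via_complements == algebra_via_X)
```

In this example R(E) consists of the 8 subsets of {1, 2, 3}, so X itself is not in R(E), and adding the complements doubles the family to an algebra with atoms {1}, {2}, {3}, {4, 5}.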
Comment. Unfortunately, for σ-rings and σ-algebras, no easy constructive description is available. There is an analogue of Proposition 2.1, which uses transfinite induction. In order to formulate such a statement, we introduce the following notations. For every collection C of subsets of X, we define
\[ C^* = \Bigl\{ \bigcup_{n=1}^{\infty} (A_n \smallsetminus B_n) : A_n, B_n \in C \cup \{\varnothing\},\ \forall\, n \ge 1 \Bigr\}. \]
Notice that
\[ C \cup \{\varnothing\} \subset C^* \subset S(C). \tag{2} \]
Theorem 2.1. Let X be a non-empty set, and let E be an arbitrary collection
of subsets of X. For every ordinal number η define the set
Pη = {α : α ordinal number with α < η}.
Let Ω denote the smallest uncountable ordinal number, and define the classes E_α, α ∈ P_Ω, recursively by E_0 = E, and
\[ E_\alpha = \Bigl( \bigcup_{\beta \in P_\alpha} E_\beta \Bigr)^*, \quad \forall\, \alpha \in P_\Omega \smallsetminus \{0\}. \]
such that A_n ∈ E_{α_n} and B_n ∈ E_{β_n}. Form then the countable set
Z = {α_n : n ∈ N} ∪ {β_n : n ∈ N} ⊂ P_Ω.
Then we clearly have
\[ U \in \Bigl( \bigcup_{\nu \in Z} E_\nu \Bigr)^*. \]
Since Z is countable, there is a strict upper bound for Z in P_Ω, i.e. there exists γ ∈ P_Ω, such that α_n < γ and β_n < γ, ∀ n ≥ 1. In other words we have Z ⊂ P_γ, so
\[ U \in \Bigl( \bigcup_{\nu \in P_\gamma} E_\nu \Bigr)^* = E_\gamma, \]
so U indeed belongs to U.
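Proof. Using the notations from the proof of the above theorem, we will first prove, by transfinite induction, that
\[ \mathrm{card}\,E_\alpha \le (\mathrm{card}\,E)^{\aleph_0}, \quad \forall\, \alpha \in P_\Omega. \tag{3} \]
The case α = 0 is clear. Assume now we have α ∈ P_Ω ∖ {0}, such that
\[ \mathrm{card}\,E_\beta \le (\mathrm{card}\,E)^{\aleph_0}, \quad \forall\, \beta \in P_\alpha, \]
and let us prove that we also have the inequality $\mathrm{card}\,E_\alpha \le (\mathrm{card}\,E)^{\aleph_0}$. If we take $C = \bigcup_{\beta \in P_\alpha} E_\beta$, we know that C is a countable union of sets, each having cardinality $\le (\mathrm{card}\,E)^{\aleph_0}$, so we immediately get
\[ \mathrm{card}\,C \le \aleph_0 \cdot (\mathrm{card}\,E)^{\aleph_0} = (\mathrm{card}\,E)^{\aleph_0}. \]
Then the collection
D(C) = {A ∖ B : A, B ∈ C}
has cardinality at most (card C)², so we also have
\[ \mathrm{card}\,D(C) \le (\mathrm{card}\,E)^{\aleph_0}. \]
Finally, the collection E_α = C* has cardinality at most $(\mathrm{card}\,D(C))^{\aleph_0}$, so we get
\[ \mathrm{card}\,E_\alpha \le \bigl( (\mathrm{card}\,E)^{\aleph_0} \bigr)^{\aleph_0} = (\mathrm{card}\,E)^{\aleph_0}. \]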
S(R(E)) ⊃ S(E).
Conversely, since S(E) is a ring, and contains E, we get the inclusion
S(E) ⊃ R(E),
and since S(E) is a σ-ring, we will now get
S(E) ⊃ S(R(E)),
so we get
S(R(E)) = S(E).
Using (5), the desired equality follows.
(ii). This follows from Proposition 2.1 and part (i) applied to E ∪ {X}, combined with the obvious equality Σ(E) = S(E ∪ {X}).
The σ-ring and the σ-algebra, generated by an arbitrary collection of sets, are
related by means of the following result.
(i) P_σ^E(X) is a σ-ring on X;
(ii) the σ-ring S(E) and the σ-algebra Σ(E), generated by E, satisfy the equality
S(E) = Σ(E) ∩ P_σ^E(X).
The key ingredient in proving the inclusion “⊃” is contained in the following.
Claim: Given a set E ∈ E, the collection
A_E(X) = {A ⊂ X : A ∩ E ∈ S(E)}
is a σ-algebra on X.
To prove this we need to check:
(a) if A belongs to A_E(X), then X ∖ A also belongs to A_E(X);
(b) whenever (A_n)_{n=1}^∞ is a sequence of sets in A_E(X), it follows that the union $\bigcup_{n=1}^{\infty} A_n$ also belongs to A_E(X).
To check (a) we simply remark that, since both E and A ∩ E belong to S(E), it follows immediately that (X ∖ A) ∩ E = E ∖ (A ∩ E) also belongs to S(E), which means that X ∖ A indeed belongs to A_E(X).
Property (b) is clear, since the fact that A_n ∩ E belongs to S(E), for all n, immediately gives the fact that $\bigl(\bigcup_{n=1}^{\infty} A_n\bigr) \cap E = \bigcup_{n=1}^{\infty} (A_n \cap E)$ belongs to S(E), which means precisely that $\bigcup_{n=1}^{\infty} A_n$ belongs to A_E(X).
Having proven the Claim, we now proceed with the proof of the inclusion S(E) ⊃ Σ(E) ∩ P_σ^E(X). Start with some set A ∈ Σ(E) ∩ P_σ^E(X), and we will show that A belongs to S(E). First of all, there exists a sequence (E_n)_{n=1}^∞ ⊂ E, such that
\[ A \subset \bigcup_{n=1}^{\infty} E_n. \tag{6} \]
Using the Claim, we know that for each n ∈ N, the collection A_{E_n}(X) is a σ-algebra. This σ-algebra clearly contains E, so we have
Σ(E) ⊂ A_{E_n}(X), ∀ n ∈ N.
In particular, we get the fact that A ∈ A_{E_n}(X), which means that A ∩ E_n belongs to S(E), for all n ∈ N. But then the inclusion (6) forces the equality
\[ A = \bigcup_{n=1}^{\infty} (A \cap E_n), \]
f^*G = {f⁻¹(G) : G ∈ G} ⊂ P(X).
Example 2.4. Let Θ be a consistent class type, which is both covariant and contravariant. Let Y be some set, and let C be a collection of type Θ on Y. Given a subset X ⊂ Y, we consider the inclusion map ι : X ↪ Y. The collection ι^*C is then of type Θ on X. It will be denoted by C|_X. Since ι⁻¹(A) = A ∩ X, ∀ A ∈ P(Y), we have
C|_X = {A ∩ X : A ∈ C}.
If E ⊂ P(Y) is a collection with C = Θ(E), then by the Generating Theorem we have the equality
(7) Θ(E)|_X = Θ({E ∩ X : E ∈ E}).
Comment. The exercise below shows that a “forward” version of the Generating Theorem does not hold in general. In other words, an equality of the type f_*Θ(G) = Θ(f_*G) may fail. The reason is the fact that the collection f_*G may be relatively “small.”
Exercise 2. Consider the sets X = {1, 2, 3}, Y = {1, 2}, the function f : X → Y, defined by f(1) = f(2) = 1, f(3) = 2, and the collection C = {{1}, {2}, ∅}. Describe the collection f_*C, the algebra A(C) generated by C (on X), and the algebra A(f_*C) generated by f_*C (on Y). Prove that one has a strict inclusion
A(f_*C) ⊊ f_*A(C).
Exercise 3. Let Θ be a consistent natural type, let f : X → Y be a surjective map, and let G be a collection of subsets of X. Assume one has the inclusion
(8) G ⊂ f^*Θ(f_*G).
Prove that one has the equality
f_*Θ(G) = Θ(f_*G).
(One instance when (8) holds is for example when f⁻¹(f(G)) = G, ∀ G ∈ G.)
i∈I i∈I
Indeed, if we define Ei = Θ(Gi ), the inclusion ⊃ follows from the obvious inclusions
X Ei ⊃ πi∗ Ei ⊃ πi∗ Gi .
i∈I
which combined with the fact that the right hand side is of type Θ, and the Generating Theorem, give the inclusions
\[ \pi_i^* E_i = \pi_i^* \Theta(G_i) = \Theta(\pi_i^* G_i) \subset \Theta\Bigl( \bigcup_{i \in I} \pi_i^* G_i \Bigr). \]
Natural consistent types also allow one to define disjoint union structures.
Definitions. Let (X_i)_{i∈I} be a collection of non-empty sets. Assume that, for each i ∈ I, a collection C_i ⊂ P(X_i) is given. On the disjoint union $X = \bigsqcup_{i \in I} X_i$ one defines the collection
\[ \bigvee_{i \in I} C_i = \bigl\{ C \subset X : C \cap X_i \in C_i,\ \forall\, i \in I \bigr\}. \]
and we define
Borσc (X) = Bor(X) ∩ Pσc (X).
(Here the notation Σ_{K_n} indicates that the σ-algebra is taken on K_n.) In particular, we get
\[ B_n = B \cap K_n \in \mathrm{Bor}(X)\big|_{K_n} = \mathrm{Bor}(K_n), \quad \forall\, n \in \mathbb{N}. \tag{10} \]
Now (9) immediately follows from the above inclusions, combined with (10).
Remark 2.2. For a topological Hausdorff space, we always have the inclusions
Borσc (X) ⊂ Borc (X) ⊂ Bor(X).
The following are equivalent
(i) Borσc (X) = Bor(X);
(ii) X is σ-compact.
The following result explains when a minimal set of generators can be chosen for the Borel sets.
Exercise 6. A. Prove that, if X is second countable, and S is a sub-base for its topology (countable or not), with $\bigcup_{V \in S} V = X$, then we have in fact the equality
Bor(X) = S(S).
B. Prove that, if X is Hausdorff, second countable, with card X ≥ 2, then for any sub-base S (countable or not), we have the equality
Bor(X) = S(S).
Hints: Follow the proof above. Remark that every open set D ⊂ X, which is a countable union of sets in V, belongs in fact to the σ-ring S(V) = S(S). So in either case, we only have to show that X is a countable union of sets in V.
In case A, we trace the proof of the Claim, and we notice that the only property that we used was the fact that, for every x ∈ D, there exists V ∈ V with x ∈ V ⊂ D, i.e. D is a (possibly uncountable) union of sets in V. Since X itself satisfies this property, it follows that X is also a countable union of sets in V.
In case B, we use the Hausdorff property to write X = D_1 ∪ D_2, with D_1, D_2 ⊊ X open.
Corollary 2.4. If X is a topological Hausdorff space, which is second count-
able, and X is infinite (as a set), then card Bor(X) = c.
Proof. First of all, since X is infinite, one can choose an infinite countable subset A ⊂ X. Then A, and all its subsets, are Borel, i.e. we have the inclusion P(A) ⊂ Bor(X), thus proving the inequality
\[ \mathrm{card}\,\mathrm{Bor}(X) \ge \mathrm{card}\,P(A) = 2^{\aleph_0} = \mathfrak{c}. \]
Secondly, one can choose a base V for the topology, which is countable. We now have Bor(X) = S(V ∪ {X}), so by Corollary 2.1 we get
\[ \mathrm{card}\,\mathrm{Bor}(X) \le \bigl( \mathrm{card}(V \cup \{X\}) \bigr)^{\aleph_0} \le \aleph_0^{\aleph_0} = \mathfrak{c}, \]
and the desired equality follows.
Examples 2.5. A. Consider the extended real line [−∞, ∞] = R ∪ {−∞, ∞}, thought of as a compact space, homeomorphic to the interval [−π/2, π/2], via the map f : [−π/2, π/2] → [−∞, ∞], defined by
\[ f(t) = \begin{cases} -\infty & \text{if } t = -\pi/2 \\ \tan t & \text{if } -\pi/2 < t < \pi/2 \\ \infty & \text{if } t = \pi/2 \end{cases} \]
E_3 = {[−∞, a) : a ∈ A}; E_4 = {[−∞, a] : a ∈ A}.
Second, we observe that E1 ∪ E3 is a sub-base for the topology, and since [−∞, ∞]
is obviously second countable, we will have the equality
Bor([−∞, ∞]) = Σ(E1 ∪ E3 ).
So, in order to finish the proof we only need to show the inclusions
(13) E1 ∪ E3 ⊂ Σ(Ek ), ∀ k ∈ {1, 2, 3, 4}.
Since every set in E2 has its complement in E3 , and viceversa, we have the inclusions
E2 ⊂ Σ(E3 ) and E3 ⊂ Σ(E2 ),
which prove the equality
(14) Σ(E2 ) = Σ(E3 ).
Likewise, we have the equality
(15) Σ(E1 ) = Σ(E4 ).
This means that we only have to prove (13) for k = 2 and k = 4. The case k = 2
amounts to proving that E_1 ⊂ Σ(E_2). Fix some a ∈ A. For every integer n ≥ 1 we choose a_n ∈ (a, a + 1/n) ∩ A. Then the equality
\[ (a, \infty] = \bigcup_{n=1}^{\infty} [a_n, \infty] \]
is clearly a base for the metric topology. Since X is separable, one can choose both
A and R to be countable, which proves that X is automatically second countable.
Then for any choice of A and R, we will have the equality
\[ \mathrm{Bor}(X) = \Sigma\bigl( \{ B_r(a) : r \in R,\ a \in A \} \bigr) = S\bigl( \{ B_r(a) : r \in R,\ a \in A \} \bigr). \tag{16} \]
(The equality between the generated σ-algebra and σ-ring follows from Exercise
1.A.) As particular cases when the equality (16) holds, one has the metric spaces
which are σ-compact.
Exercise 7*. Let I be an uncountable set, and let (X_i)_{i∈I} be a collection of topological spaces. Assume that for each i ∈ I, there exists at least one non-empty closed subset F_i ⊊ X_i. (This is the case for example when X_i is Hausdorff, and card X_i ≥ 2.) Prove that one has a strict inclusion
\[ \mathrm{Bor}\Bigl( \prod_{i \in I} X_i \Bigr) \supsetneq \Sigma\text{-}\mathop{\times}_{i \in I} \mathrm{Bor}(X_i). \]
Hint: For every subset J ⊂ I, define the projection map $\pi_J : \prod_{i \in I} X_i \to \prod_{i \in J} X_i$. Consider the collection
\[ A = \Bigl\{ A \subset \prod_{i \in I} X_i : \text{there exists } J \subset I \text{ countable, such that } A = \pi_J^{-1}\bigl(\pi_J(A)\bigr) \Bigr\}. \]
Prove that A ∪ {∅} is a σ-algebra, which contains $\bigcup_{i \in I} \pi_i^* \mathrm{Bor}(X_i)$. Prove that one has a strict
T_*A = {B ⊂ Y : T⁻¹(B) ∈ A}.
measurable.
(iii) If (Y, B) and (Z, C) are measurable spaces, and if $(X, A) \xrightarrow{\;T\;} (Y, B) \xrightarrow{\;S\;} (Z, C)$ are measurable maps, then the composition S ∘ T : (X, A) → (Z, C) is again a measurable map.
Often, one would like to check the measurability condition (1) on a small col-
lection of B’s. Such a criterion is the following.
162 LECTURE 20
Lemma 3.1. Let (X, A) and (Y, B) be measurable spaces. Assume B = Σ(E), for some collection of sets E ⊂ P(Y). For a map T : X → Y, the following are equivalent:
(i) T : (X, A) → (Y, B) is measurable;
(ii) T⁻¹(E) ∈ A, ∀ E ∈ E.
(ii) T⁻¹(S) ∈ A, ∀ S ∈ S.
Proof. Immediate from the above Lemma, and Proposition 2.2, which states
that Bor(Y ) = Σ(S).
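The criterion of Lemma 3.1 (it is enough to test preimages of generators) can be observed directly on finite sets, where a σ-algebra is the same thing as an algebra. The Python sketch below builds the σ-algebras generated by small hypothetical collections and compares the test on generators with the test on the full σ-algebra. All sets, maps and helper names (`generated_sigma_algebra`, `preimage`, `T`, `T_bad`) are assumptions made for the illustration.

```python
from itertools import combinations

def generated_sigma_algebra(universe, generators):
    """On a finite universe: close the generators under complement and union."""
    universe = frozenset(universe)
    family = {frozenset(), universe} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for a in current:
            if universe - a not in family:
                family.add(universe - a)
                changed = True
        for a, b in combinations(current, 2):
            if a | b not in family:
                family.add(a | b)
                changed = True
    return family

def preimage(T, domain, target_set):
    """Preimage of target_set under the map T, given as a dict on `domain`."""
    return frozenset(x for x in domain if T[x] in target_set)

X = {1, 2, 3, 4}
Y = {"a", "b", "c"}
A = generated_sigma_algebra(X, [{1, 2}, {3}])   # atoms {1,2}, {3}, {4}
gens = [{"a"}, {"b"}]                            # generators of B
B = generated_sigma_algebra(Y, gens)

T = {1: "a", 2: "a", 3: "b", 4: "c"}             # constant on the atoms of A
T_bad = {1: "a", 2: "b", 3: "b", 4: "c"}         # splits the atom {1, 2}

measurable_on_generators = all(preimage(T, X, E) in A for E in gens)
measurable_on_sigma_algebra = all(preimage(T, X, S) in A for S in B)
bad_on_generators = all(preimage(T_bad, X, E) in A for E in gens)
bad_on_sigma_algebra = all(preimage(T_bad, X, S) in A for S in B)

print(measurable_on_generators, measurable_on_sigma_algebra)
print(bad_on_generators, bad_on_sigma_algebra)
```

In both cases the answer on the generators agrees with the answer on the whole σ-algebra, which is exactly what the lemma predicts.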
We know (see Section 19) that the type Σ is consistent and natural. In par-
ticular, measurability behaves nicely with respect to products and disjoint unions.
More explicitly one has the following.
Proposition 3.2. Let (X_i, A_i)_{i∈I} be a collection of measurable spaces. Consider the sets $X = \prod_{i \in I} X_i$ and $Y = \bigsqcup_{i \in I} X_i$, and the σ-algebras
\[ A = \Sigma\text{-}\mathop{\times}_{i \in I} A_i \quad \text{and} \quad B = \bigvee_{i \in I} A_i. \]
Conversely, assume all the compositions π_i ∘ f are measurable, and let us show that f : (Z, G) → (X, A) is measurable. By Lemma 3.1 and (2), all we need to prove is the fact that
\[ f^*\Bigl( \bigcup_{i \in I} \pi_i^* A_i \Bigr) \subset G, \]
which is equivalent to
f^*π_i^*A_i ⊂ G, ∀ i ∈ I.
all i ∈ I.
(ii). By the definition of the σ-algebra sum, we know that
\
(3) B= i∗ Ai .
i∈I
• f⁻¹([a, ∞]) ∈ A, ∀ a ∈ A;
• f⁻¹([−∞, a)) ∈ A, ∀ a ∈ A;
• f⁻¹([−∞, a]) ∈ A, ∀ a ∈ A.
Definition. If X and Y are topological Hausdorff spaces, a map T : X → Y is said to be Borel measurable, if T is measurable as a map
T : (X, Bor(X)) → (Y, Bor(Y)).
In the cases when Y = R, C, [−∞, ∞], a Borel measurable map will be simply
called a Borel measurable function.
For K = R, C, we define
B_K(X) = {f : X → K : f Borel measurable function}.
Remark 3.3. If X and Y are topological Hausdorff spaces, then any continuous map T : X → Y is Borel measurable. This follows from Lemma 3.1, from the fact that
Bor(Y) = Σ({D ⊂ Y : D open}),
and the fact that T⁻¹(D) is open, hence in Bor(X), for every open set D ⊂ Y.
Measurable maps behave nicely with respect to “measurable countable opera-
tions,” as suggested by the following result.
Proposition 3.3. Let (X, A) and (Z, B) be measurable spaces, let I be a set which is at most countable, and let (Y_i)_{i∈I} be a family of topological Hausdorff spaces, each of which is second countable. Suppose a measurable map T_i : (X, A) → (Y_i, Bor(Y_i)) is given, for each i ∈ I. Define the map $T : X \to \prod_{i \in I} Y_i$ by
T(x) = (T_i(x))_{i∈I}, ∀ x ∈ X.
Equip the product space $Y = \prod_{i \in I} Y_i$ with the product topology.
For any measurable map g : (Y, Bor(Y)) → (Z, B), the composition g ∘ T : (X, A) → (Z, B) is measurable.
But this is quite obvious, since a point $x = (x_i)_{i\in I}$ belongs to $m^{-1}([-\infty, a))$ if and only if there exists some $j \in I$ with $x_j < a$. In other words, if we define the projections $\pi_j : \prod_{i\in I} [-\infty,\infty] \to [-\infty,\infty]$, then we have
\[ m^{-1}([-\infty, a)) = \bigcup_{j\in I} \pi_j^{-1}([-\infty, a)). \]
This shows that in fact $m^{-1}([-\infty, a))$ is open, hence clearly Borel.
To prove the measurability of M, we are going to show that
\[ M^{-1}((a, \infty]) \in \mathrm{Bor}\Big(\prod_{i\in I} [-\infty,\infty]\Big), \quad \forall\, a \in \mathbb{R}. \]
be a sequence of measurable maps. Assume that, for every $x \in X$, the sequence $(T_n(x))_{n=1}^\infty \subset Y$ is convergent. Define the map $T : X \to Y$ by
\[ T(x) = \lim_{n\to\infty} T_n(x), \quad \forall\, x \in X. \]
Denote the set in the right hand side simply by A. Start first with some x ∈ A.
There exist some $m, n \in \mathbb{N}$ such that
\[ x \in \bigcap_{k=m}^\infty T_k^{-1}\big(B_{r-\frac{1}{n}}(y)\big), \]
Using the fact that A is closed under countable intersections, it follows that
\[ \bigcap_{k=m}^\infty T_k^{-1}\big(B_{r-\frac{1}{n}}(y)\big) \in A, \quad \forall\, m, n \in \mathbb{N},\ r > 0. \]
Finally, using the fact that A is closed under countable unions, the desired property
(5) follows.
Exercise 3. Let (X, A) be a measurable space, and let $(X_n)_{n=1}^\infty$ be a sequence of sets in A, with $X = \bigcup_{n=1}^\infty X_n$. Suppose (Y, B) is a measurable space, and $F : X \to Y$ is a map, such that
\[ F\big|_{X_n} : \big(X_n, A\big|_{X_n}\big) \to (Y, B) \]
Hint: Use the preceding exercise, applied to the set $\Omega_1 = \{z \in \mathbb{C} : P'(z) \ne 0\}$.
The preceding exercise can be generalized:
Exercise 6*. Let Ω1 ⊂ C be a connected open set, and let f : Ω1 → C be a
non-constant holomorphic function. By the Open Mapping Theorem we know that
the set $\Omega_2 = f(\Omega_1)$ is open. Prove that there exists a Borel measurable function $\phi : \Omega_2 \to \Omega_1$, such that $f \circ \phi = \mathrm{Id}_{\Omega_2}$.
Hint: Use Exercise 4, applied to the set $\Omega_0 = \{z \in \Omega_1 : f'(z) \ne 0\}$. Since f is non-constant, the set $\Omega_1 \setminus \Omega_0$ is countable.
We continue with a discussion on the role of elementary functions.
Proposition 3.4. Let (X, A) be a measurable space, and let K be one of the
fields R or C. For an elementary function f ∈ ElemK (X), the following are equiv-
alent:
(i) f ∈ A-ElemK (X);
(ii) f : (X, A) → K is measurable.
Remark that
(8) 0 ≤ gn (s) < 1 and 0 < hn (s) ≤ 1, ∀ s ∈ [0, 1].
Note that, for every $n \in \mathbb{N}$, we have
\[ g_n(0) = 0; \qquad g_n(1) = (2^n - 1)/2^n; \tag{9} \]
\[ h_n(0) = 1/2^n; \qquad h_n(1) = 1. \tag{10} \]
Claim 1: The sequence $(g_n)_{n=1}^\infty$ is non-decreasing, and the sequence $(h_n)_{n=1}^\infty$ is non-increasing.
Using (9) and (10), we only need to examine the restrictions to the open interval
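(0, 1). Fix some point $s \in (0, 1)$. For every integer $n \ge 1$, define
\[ p_n^s = \max\Big\{ k \in \mathbb{Z} : 0 \le \frac{k}{2^n} < s \Big\}. \]
We clearly have $p_n^s < 2^n$ and
\[ \frac{p_n^s}{2^n} < s \le \frac{p_n^s + 1}{2^n}. \tag{11} \]
We then have
\[ g_n(s) = \begin{cases} p_n^s/2^n & \text{if } s \ne (p_n^s + 1)/2^n \\ (p_n^s + 1)/2^n & \text{if } s = (p_n^s + 1)/2^n \end{cases} \quad\text{and}\quad h_n(s) = \frac{p_n^s + 1}{2^n}. \tag{12} \]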
We now estimate $g_{n+1}(s)$ and $h_{n+1}(s)$. First of all, using (11), we have
\[ \frac{2p_n^s}{2^{n+1}} < s \le \frac{2p_n^s + 2}{2^{n+1}}, \]
which means that either $p_{n+1}^s = 2p_n^s$, or $p_{n+1}^s = 2p_n^s + 1$. This immediately gives
\[ h_{n+1}(s) = \frac{p_{n+1}^s + 1}{2^{n+1}} \le \frac{2p_n^s + 2}{2^{n+1}} = \frac{p_n^s + 1}{2^n} = h_n(s). \]
Note that, if $s = (p_n^s + 1)/2^n$, we will have $p_{n+1}^s = 2p_n^s + 1$ and $s = (p_{n+1}^s + 1)/2^{n+1}$, so we get
\[ g_{n+1}(s) = (p_{n+1}^s + 1)/2^{n+1} = (2p_n^s + 2)/2^{n+1} = (p_n^s + 1)/2^n = g_n(s). \]
If $s \ne (p_n^s + 1)/2^n$, then
\[ g_n(s) = \frac{p_n^s}{2^n} = \frac{2p_n^s}{2^{n+1}} \le \frac{p_{n+1}^s}{2^{n+1}} \le g_{n+1}(s). \]
Claim 2: One has
\[ \lim_{n\to\infty} \Big[ \sup_{s\in[0,1]} |g_n(s) - s| \Big] = \lim_{n\to\infty} \Big[ \sup_{s\in[0,1]} |h_n(s) - s| \Big] = 0. \]
To prove this fact we are going to estimate the differences |gn (s)−s| and |hn (s)−s|.
If s = 0 or s = 1, then the equalities (9) and (10) immediately show that
\[ |g_n(s) - s| \le \frac{1}{2^n} \quad\text{and}\quad |h_n(s) - s| \le \frac{1}{2^n}, \quad \forall\, n \in \mathbb{N}. \tag{13} \]
If $s \in (0, 1)$, then the definitions of $g_n(s)$ and $h_n(s)$ clearly show that
\[ s,\ g_n(s),\ h_n(s) \in \big[\, p_n^s/2^n,\ (p_n^s + 1)/2^n \,\big], \]
and then we see that we again have the inequalities (13). Since (13) now holds for
all s ∈ [0, 1], the Claim immediately follows.
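The two Claims can be checked numerically on sample points. The following is a minimal Python sketch with exact rational arithmetic; the names `p`, `g`, `h` and `claims_hold` are ours, the interior values follow (12), and the endpoint values are taken from (9) and (10). It is an illustration, not a proof.

```python
from fractions import Fraction
from math import ceil

def p(n, s):
    """p_n^s = max{k in Z : 0 <= k/2^n < s}, for 0 < s < 1."""
    return ceil(s * 2**n) - 1

def g(n, s):
    if s == 0:
        return Fraction(0)                    # (9)
    if s == 1:
        return Fraction(2**n - 1, 2**n)       # (9)
    upper = Fraction(p(n, s) + 1, 2**n)
    return upper if s == upper else Fraction(p(n, s), 2**n)   # (12)

def h(n, s):
    if s == 0:
        return Fraction(1, 2**n)              # (10)
    if s == 1:
        return Fraction(1)                    # (10)
    return Fraction(p(n, s) + 1, 2**n)        # (12)

def claims_hold(samples, max_n):
    """Check Claim 1 (monotonicity) and estimate (13) on the given sample points."""
    for s in samples:
        for n in range(1, max_n):
            if not (g(n, s) <= g(n + 1, s) and h(n + 1, s) <= h(n, s)):
                return False
            bound = Fraction(1, 2**n)
            if abs(g(n, s) - s) > bound or abs(h(n, s) - s) > bound:
                return False
    return True

print(claims_hold([Fraction(k, 60) for k in range(61)], 12))
```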
We proceed now with the proof of the theorem. Define
\[ \alpha = \inf\{ f(x) : x \in X \} \quad\text{and}\quad \beta = \sup\{ f(x) : x \in X \}. \]
If α = β, there is nothing to prove. Assume α < β. Depending on the finiteness of α and β, we define a homeomorphism Φ : [α, β] → [0, 1], as follows.
(a) If α > −∞ and β < ∞, we define
\[ \Phi(s) = \frac{s - \alpha}{\beta - \alpha}, \quad \forall\, s \in [\alpha, \beta]. \]
CHAPTER III: MEASURE THEORY 171
In case (a), we have $f'(x) = f(x)$ and $f''(x) = 0$, so $\lim_{n\to\infty} f_n'(x) = f(x)$ and $\lim_{n\to\infty} f_n''(x) = 0$.
In case (b), we have $f'(x) = 0$ and $f''(x) = f(x)$, so $\lim_{n\to\infty} f_n'(x) = 0$ and $\lim_{n\to\infty} f_n''(x) = f(x)$.
In either case, the equality (14) follows.
We call T the space of infinite coin flippings, having in mind that an element of T is the same as the outcome of an infinite sequence of coin flips (think of 0 as corresponding to tails, and 1 as corresponding to heads). Equip T with the product topology. By Tihonov’s Theorem, T is compact. The product topology on T is in fact given by a metric d defined by
\[ d(a, b) = \sum_{n=1}^\infty \frac{|\alpha_n - \beta_n|}{2^n}, \quad \forall\, a = (\alpha_n)_{n=1}^\infty,\ b = (\beta_n)_{n=1}^\infty \in T. \]
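The metric d can be approximated on finite prefixes. Below is a minimal Python sketch, assuming (as the coin-flip description suggests) that elements of T are sequences of 0s and 1s; the helper name `d_prefix` is ours. Since each term satisfies $|\alpha_n - \beta_n| \le 1$, cutting the series after N terms changes the value by at most $2^{-N}$.

```python
from fractions import Fraction

def d_prefix(a, b):
    """Partial sum of d(a, b) over the common prefix of two 0-1 sequences."""
    return sum(Fraction(abs(x - y), 2**n)
               for n, (x, y) in enumerate(zip(a, b), start=1))

a = [0, 1, 1, 0, 1]
b = [0, 1, 0, 0, 1]      # differs from a only in the third flip
c = [1, 1, 0, 0, 1]

print(d_prefix(a, b))    # contribution 1/2^3 from n = 3
print(d_prefix(a, c) <= d_prefix(a, b) + d_prefix(b, c))   # triangle inequality on prefixes
```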
Proof. Throughout the proof the number r will be fixed. The map φr will be
denoted by φ, and the compact set Kr will be denoted by K.
Since φ : T → K is continuous, it is measurable, i.e. we have the implication
(15) B ∈ Bor(K) ⇒ φ−1 (B) ∈ Bor(T ).
Before we proceed with the actual proof, we need some preparations. Remark that,
since φ : T → K is surjective, we have the equality
\[ \phi\big(\phi^{-1}(C)\big) = C, \quad \forall\, C \subset K. \tag{16} \]
Claim 1: A subset $C \subset K$ is at most countable, if and only if the set $\phi^{-1}(C) \subset T$ is at most countable.
Suppose C is at most countable. If we take $A_0 = \phi^{-1}(C) \cap T_0$, and $A_1 = \phi^{-1}(C) \setminus T_0$, then obviously $\phi^{-1}(C) = A_0 \cup A_1$. Since $A_1 \subset T \setminus T_0$, and $T \setminus T_0$ is countable, it follows that $A_1$ is at most countable, so we only need to prove that $A_0$ is at most countable. But since $\phi|_{T_0}$ is injective, and $A_0 \subset T_0$, it follows that $\phi|_{A_0} : A_0 \to C$ is injective, and then the fact that C is at most countable forces $A_0$ to be at most countable.
Conversely, if $\phi^{-1}(C)$ is at most countable, then so is $\phi(\phi^{-1}(C))$. By (16) we are done.
For each subset A ⊂ T , we define
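\[ \langle A \rangle = \phi^{-1}\big(\phi(A)\big). \]
Remark that $A \subset \langle A \rangle$, ∀ A ⊂ T. Note also that, for any family $(A_i)_{i\in I}$ of subsets of T, one has the equality
\[ \Big\langle \bigcup_{i\in I} A_i \Big\rangle = \phi^{-1}\Big(\phi\Big(\bigcup_{i\in I} A_i\Big)\Big) = \phi^{-1}\Big(\bigcup_{i\in I} \phi(A_i)\Big) = \bigcup_{i\in I} \phi^{-1}\big(\phi(A_i)\big) = \bigcup_{i\in I} \langle A_i \rangle. \tag{17} \]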
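Equality (17) uses nothing about φ beyond its being a map, so it can be illustrated on a finite toy example. The sketch below is in Python; the map `phi`, the domain and the family of sets are hypothetical stand-ins chosen only for illustration, not the map φ of the text.

```python
def saturate(phi, domain, A):
    """<A> = phi^{-1}(phi(A)), for a map phi on a finite domain."""
    image = {phi(x) for x in A}
    return {x for x in domain if phi(x) in image}

domain = range(8)
phi = lambda x: x // 2               # hypothetical non-injective map
family = [{0}, {3, 4}, {7}]

union_then_saturate = saturate(phi, domain, set().union(*family))
saturate_then_union = set().union(*(saturate(phi, domain, A) for A in family))

print(union_then_saturate == saturate_then_union)
```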
Corollary 3.7. Use the above notations. For a number r ≥ 2 and a subset
B ⊂ Kr , the following are equivalent:
(i) B ∈ Bor(Kr );
(ii) $\phi_r^{-1}(B) \in \mathrm{Bor}(T)$.
then
• $(\Phi_r \circ \Psi_r)(B) = B$, for all $B \subset K_r$;
• $(\Psi_r \circ \Phi_r)(A) \supset A$, and $(\Psi_r \circ \Phi_r)(A) \setminus A$ is at most countable, for all $A \subset T$;
• $B \in \mathrm{Bor}(K_r) \Leftrightarrow \Psi_r(B) \in \mathrm{Bor}(T)$;
• $A \in \mathrm{Bor}(T) \Leftrightarrow \Phi_r(A) \in \mathrm{Bor}(K_r)$.
In the particular case r = 2, we know that $K_2 = [0, 1]$, so we can think of the measurable space $([0, 1], \mathrm{Bor}([0, 1]))$ as “approximately the same” as the measurable space $(T, \mathrm{Bor}(T))$.
The case r = 3 will be an interesting one, especially for constructing various
counter-examples. The compact set K3 ⊂ [0, 1] is called the ternary Cantor set.
It turns out that there exists another useful description of the ternary Cantor
set K3 , which yields some interesting properties.
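Notations. We keep the notations above. An element $a = (\alpha_n)_{n=1}^\infty \in T$ will be called finite, if there exists some $N \in \mathbb{N}$, such that $\alpha_n = 0$, ∀ n ≥ N. We define
\[ T_{\mathrm{fin}} = \{ a \in T : a \text{ finite} \}. \]
Remark that $T_{\mathrm{fin}} \subset T_0$. In particular the map $\phi_3|_{T_{\mathrm{fin}}} : T_{\mathrm{fin}} \to K_3$ is injective.
For $a \in T_{\mathrm{fin}}$ we define its length as
\[ \ell(a) = \min\{ N \in \mathbb{N} : \alpha_n = 0,\ \forall\, n \ge N \} - 1. \]
With this definition, for every $a = (\alpha_n)_{n=1}^\infty \in T_{\mathrm{fin}}$, we have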
In case (b), based on the fact that we have proven case (a), we can assume, without
any loss of generality, that a = b and j < k. In this case we have
\[ \phi(b) + \frac{2}{3^{k+1}} = \phi(a) + \frac{2}{3^{k+1}} < \phi(a) + \frac{1}{3^k} \le \phi(a) + \frac{1}{3^{j+1}}, \]
which means that the right end-point of Iµ is not greater than the left end-point of
Iλ , so again we get Iλ ∩ Iµ = ∅.
For the proof of (iii) we are going to use the space
Exactly as is the case with T, the product space P is compact with respect to the product topology, which is given by the metric
\[ d(a, b) = \sum_{n=1}^\infty \frac{|\alpha_n - \beta_n|}{2^n}, \quad \forall\, a = (\alpha_n)_{n=1}^\infty,\ b = (\beta_n)_{n=1}^\infty \in P. \]
satisfies
\[ |\psi(a) - \psi(b)| \le d(a, b), \quad \forall\, a, b \in P, \]
Prove that P0 is dense in P , and prove that ψ(P ) ⊂ [0, 1] r K. (Use the arguments employed in
the proof of part (iii).)
Remarks 3.5. If we set $\Lambda_n = \Lambda \cap (\{n\} \times P)$, then we can write the complement of the ternary Cantor set as
\[ [0, 1] \setminus K_3 = \bigcup_{n=0}^\infty D_n, \]
where
\[ D_n = \bigcup_{\lambda \in \Lambda_n} I_\lambda. \]
Then the system of open sets $(D_n)_{n \ge 0}$ is pair-wise disjoint. Moreover, each $D_n$ is a union of $2^n$ disjoint intervals of length $1/3^{n+1}$.
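The count and the lengths stated above can be reproduced with the usual middle-thirds construction. The Python sketch below assumes that the intervals $I_\lambda$, $\lambda \in \Lambda_n$, are exactly the open middle thirds removed at stage n; the function name `removed_intervals` is ours.

```python
from fractions import Fraction

def removed_intervals(n):
    """Open intervals removed at stage n (n = 0, 1, 2, ...) of the middle-thirds construction."""
    kept = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for a, b in kept:
            third = (b - a) / 3
            refined += [(a, a + third), (b - third, b)]   # keep the two outer thirds
        kept = refined
    # the middle third of every interval still kept is removed at this stage
    return [(a + (b - a) / 3, b - (b - a) / 3) for a, b in kept]

for n in range(4):
    D_n = removed_intervals(n)
    print(n, len(D_n), D_n[0][1] - D_n[0][0])
```

Summing the lengths over the stages n = 0, ..., N gives $1 - (2/3)^{N+1}$, which tends to 1, the length of [0, 1].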
Since card $T_0$ = c, and the map $\phi_3|_{T_0} : T_0 \to K_3$ is injective, we get card $K_3$ ≥ c.
Since we also have card K3 ≤ card R = c, we get in fact the equality
card K3 = c.
Lecture 21
is a measure on f∗ E1 .
B. If f is surjective, prove that the map f ∗ µ2 : f ∗ E2 → [0, ∞], defined by
\[ (f^*\mu_2)(B) = \mu_2\big(f(B)\big), \quad \forall\, B \in f^* E_2, \]
is a measure on f ∗ E2 .
We now concentrate on the most rudimentary types of collections E on which measures can be defined fairly easily. Actually, what we have in mind is a set of easy conditions on a map µ : E → [0, ∞] which would guarantee that µ is a measure.
Definition. Let X be a non-empty set. A collection J ⊂ P(X) is called a
semiring, if it satisfies the following properties:
• ∅ ∈ J;
• if A, B ∈ J, then A ∩ B ∈ J;
• if A, B ∈ J and A ⊂ B, then there exists an integer n ≥ 1, and sets $D_0, D_1, \dots, D_n \in J$, such that $A = D_0 \subset D_1 \subset \cdots \subset D_n = B$, and $D_k \setminus D_{k-1} \in J$, ∀ k ∈ {1, . . . , n}.
Remark that every ring is a semiring.
Exercise 2. Prove that the semiring type is not consistent. Give an example of
two semirings J1 , J2 ⊂ P(X), such that J1 ∩ J2 is not a semiring.
Hint: Use the set X = {1, 2, 3}.
Exercise 3. Let X1 , . . . , Xn be non-empty sets, and let Jk ⊂ P(Xk ), k =
1, . . . , n, be semirings. Prove that
\[ J = \{ A_1 \times \cdots \times A_n : A_1 \in J_1, \dots, A_n \in J_n \} \subset P(X_1 \times \cdots \times X_n) \]
is a semiring.
Hint: First prove the case n = 2, and then use induction.
Example 4.2. Take X = R. The collection
\[ J = \{\emptyset\} \cup \{ [a, b) : a, b \in \mathbb{R},\ a < b \} \subset P(\mathbb{R}) \]
is a semiring.
Indeed, the first two axioms are pretty clear. To prove the third axiom, we
start with two intervals A = [a, b) and B = [c, d) with A ⊂ B. This means that
a ≥ c and b ≤ d. If a = c or b = d, we set D0 = A and D1 = B. If a > c and b < d,
we set D0 = A, D1 = [a, d) and D2 = B.
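On a finite set the three semiring axioms can be tested by brute force. The following Python sketch does this for a finite analogue of Example 4.2, namely integer "half-open intervals" inside {0, 1, 2, 3}; the function names and the finite model are ours and serve only as an illustration of the definition above.

```python
def chain_exists(a, b, J):
    """Search for A = D_0 ⊂ D_1 ⊂ ... ⊂ D_n = B in J with all D_k minus D_{k-1} in J."""
    stack, seen = [a], {a}
    while stack:
        cur = stack.pop()
        if (b - cur) in J:            # last step D_n = B (covers cur == b, since ∅ is in J)
            return True
        for d in J:
            if cur < d < b and (d - cur) in J and d not in seen:
                seen.add(d)
                stack.append(d)
    return False

def is_semiring(collection):
    J = {frozenset(s) for s in collection}
    if frozenset() not in J:                              # first axiom
        return False
    if any((a & b) not in J for a in J for b in J):       # second axiom
        return False
    return all(chain_exists(a, b, J)                      # third axiom
               for a in J for b in J if a <= b)

# finite model of {∅} ∪ {[a, b) : a < b}: the sets {a, ..., b-1} with 0 <= a < b <= 4
intervals = [set()] + [set(range(a, b)) for a in range(4) for b in range(a + 1, 5)]
print(is_semiring(intervals))
print(is_semiring([set(), {1}, {1, 2, 3}]))   # {1, 2, 3} minus {1} is missing
```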
is a semiring.
Exercise 4. Let $J_n \subset P(\mathbb{R}^n)$ be the semiring defined above. Prove that the σ-ring $S(J_n)$ generated by $J_n$ coincides with $\mathrm{Bor}(\mathbb{R}^n)$.
The ring generated by a semiring has a particularly nice description (compare
to Proposition 2.1):
Proposition 4.1. Let J be a semiring on X. For a subset A ⊂ X, the following
are equivalent:
(i) A belongs to R(J), the ring generated by J;
(ii) There exists an integer n ≥ 1, and a pair-wise disjoint system $(A_j)_{j=1}^n \subset J$,
such that A = A1 ∪ · · · ∪ An .
Proof. Denote by R the collection of all subsets A ⊂ X that satisfy condition
(ii). It is obvious that
J ⊂ R ⊂ R(J),
so (see Section III.2) we only need to prove that R is a ring.
Let us first remark that we obviously have the property:
(i) if A, B ∈ R, and A ∩ B = ∅, then A ∪ B ∈ R.
Secondly, we remark that we have the implication:
(ii) A, B ∈ J ⇒ A ∖ B ∈ R.
Indeed, since A ∩ B ∈ J, by the definition of a semiring, there exist $D_0, D_1, \dots, D_n \in J$ with $A \cap B = D_0 \subset D_1 \subset \cdots \subset D_n = A$, and $D_k \setminus D_{k-1} \in J$, ∀ k ∈ {1, . . . , n}. Then the equality
\[ A \setminus B = \bigcup_{k=1}^n (D_k \setminus D_{k-1}) \]
shows that A ∖ B indeed belongs to R.
Thirdly, we prove the implication:
(iii) A, B ∈ R ⇒ A ∩ B ∈ R.
Write $A = A_1 \cup \cdots \cup A_m$ and $B = B_1 \cup \cdots \cup B_n$, with $(A_i)_{i=1}^m, (B_k)_{k=1}^n \subset J$
with $(A_i \setminus B)_{i=1}^m$ a pair-wise disjoint system, so by (i) it suffices to show that $A_i \setminus B \in R$, ∀ i ∈ {1, . . . , m}. To prove this, we fix i and we write $B = B_1 \cup \cdots \cup B_n$, with $(B_k)_{k=1}^n \subset J$ a pair-wise disjoint system. Then
\[ A_i \setminus B = (A_i \setminus B_1) \cap \cdots \cap (A_i \setminus B_n), \]
and the fact that $A_i \setminus B$ belongs to R follows from (ii) and (iii).
Having proven (i)-(iv), we now prove that R is a ring. By (iii), we only need to prove the implication
(∗) A, B ∈ R ⇒ A △ B ∈ R.
On the one hand, using (iv), it follows that the sets A ∖ B = A ∖ (A ∩ B) and B ∖ A = B ∖ (A ∩ B) both belong to R. Since A △ B = (A ∖ B) ∪ (B ∖ A), and (A ∖ B) ∩ (B ∖ A) = ∅, by (i) it follows that A △ B indeed belongs to R.
Theorem 4.1 (Semiring-to-ring extension). Let J be a semiring on X, and let
µ : J → [0, ∞] be an additive map with µ(∅) = 0.
(i) There exists a unique additive map $\bar\mu : R(J) \to [0, \infty]$, such that $\bar\mu|_J = \mu$.
(ii) If µ is σ-additive, then so is µ̄.
Proof. The key step is contained in the following
Claim: If $(A_i)_{i=1}^m \subset J$ and $(B_j)_{j=1}^n \subset J$ are pair-wise disjoint systems, with
\[ A_1 \cup \cdots \cup A_m = B_1 \cup \cdots \cup B_n, \]
then $\mu(A_1) + \cdots + \mu(A_m) = \mu(B_1) + \cdots + \mu(B_n)$.
To prove this fact, we define the pair-wise disjoint system $(D_{ij})_{1\le i\le m,\ 1\le j\le n}$ by $D_{ij} = A_i \cap B_j$, ∀ (i, j) ∈ {1, . . . , m} × {1, . . . , n}. Since
\[ \bigcup_{j=1}^n D_{ij} = A_i, \quad \forall\, i \in \{1, \dots, m\}, \]
\[ \bigcup_{i=1}^m D_{ij} = B_j, \quad \forall\, j \in \{1, \dots, n\}, \]
using additivity, we have the equalities
\[ \sum_{j=1}^n \mu(D_{ij}) = \mu(A_i), \quad \forall\, i \in \{1, \dots, m\}, \]
\[ \sum_{i=1}^m \mu(D_{ij}) = \mu(B_j), \quad \forall\, j \in \{1, \dots, n\}, \]
and then we get
\[ \sum_{i=1}^m \mu(A_i) = \sum_{i=1}^m \sum_{j=1}^n \mu(D_{ij}) = \sum_{j=1}^n \sum_{i=1}^m \mu(D_{ij}) = \sum_{j=1}^n \mu(B_j). \]
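The common-refinement argument in the Claim can be traced on a concrete case. The Python sketch below uses half-open intervals [a, b) with µ([a, b)) = b − a, which is an assumption made only for illustration (the Claim itself is about an arbitrary additive µ on a semiring); the helper names are ours.

```python
def meet(iv1, iv2):
    """Intersection of two half-open intervals (lo, hi), or None when it is empty."""
    lo, hi = max(iv1[0], iv2[0]), min(iv1[1], iv2[1])
    return (lo, hi) if lo < hi else None

def mu(iv):
    """Length of a half-open interval; the empty set (None) has measure 0."""
    return 0 if iv is None else iv[1] - iv[0]

A = [(0, 2), (2, 3), (3, 6)]      # one disjoint decomposition of [0, 6)
B = [(0, 1), (1, 5), (5, 6)]      # another disjoint decomposition of [0, 6)

# D[i, j] = A_i ∩ B_j, the common refinement
D = {(i, j): meet(a, b) for i, a in enumerate(A) for j, b in enumerate(B)}

row_sums = [sum(mu(D[i, j]) for j in range(len(B))) for i in range(len(A))]
col_sums = [sum(mu(D[i, j]) for i in range(len(A))) for j in range(len(B))]

print(row_sums == [mu(a) for a in A], col_sums == [mu(b) for b in B])
print(sum(row_sums) == sum(col_sums))
```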
To prove (i), for any set A ∈ R(J) we choose (use Proposition 4.1) a finite pair-wise disjoint system $(A_i)_{i=1}^n \subset J$, with $A = A_1 \cup \cdots \cup A_n$, and we define
\[ \bar\mu(A) = \mu(A_1) + \cdots + \mu(A_n). \tag{1} \]
By the above Claim, the number $\bar\mu(A)$ is independent of the particular choice of the pair-wise disjoint system $(A_i)_{i=1}^n$. Also, it is clear that $\bar\mu|_J = \mu$, and $\bar\mu$ is additive. The uniqueness is also clear, because the equality $\bar\mu|_J = \mu$ and additivity of $\bar\mu$ force (1).
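(ii). Assume now that µ is σ-additive, and let us prove that $\bar\mu$ is again σ-additive. Start with a pair-wise disjoint sequence $(A_n)_{n=1}^\infty \subset R(J)$, with $\bigcup_{n=1}^\infty A_n \in R(J)$, and let us prove the equality
\[ \bar\mu\Big(\bigcup_{n=1}^\infty A_n\Big) = \sum_{n=1}^\infty \bar\mu(A_n). \tag{2} \]
Since $\bigcup_{n=1}^\infty A_n \in R(J)$, there exists a finite pair-wise disjoint system $(B_i)_{i=1}^p \subset J$, such that $\bigcup_{n=1}^\infty A_n = B_1 \cup \cdots \cup B_p$. With this choice we have
\[ \bar\mu\Big(\bigcup_{n=1}^\infty A_n\Big) = \sum_{i=1}^p \mu(B_i). \tag{3} \]
For each i ∈ {1, . . . , p}, we have $B_i = \bigcup_{n=1}^\infty (B_i \cap A_n)$. Fix for the moment a pair (n, i) ∈ N × {1, . . . , p}. Since $B_i \cap A_n \in R(J)$, it follows that there exist an integer $N_{ni} \ge 1$ and a finite pair-wise disjoint system $(C_k^{ni})_{k=1}^{N_{ni}} \subset J$, such that $B_i \cap A_n = \bigcup_{k=1}^{N_{ni}} C_k^{ni}$.
Since, for each i ∈ {1, . . . , p}, the countable system $(C_k^{ni})_{n\in\mathbb{N},\ 1\le k\le N_{ni}} \subset J$ is pair-wise disjoint, and we have the equality
\[ \bigcup_{n=1}^\infty \bigcup_{k=1}^{N_{ni}} C_k^{ni} = \bigcup_{n=1}^\infty (B_i \cap A_n) = B_i \in J, \]
by the σ-additivity of µ, we have
\[ \mu(B_i) = \sum_{n=1}^\infty \sum_{k=1}^{N_{ni}} \mu(C_k^{ni}), \quad \forall\, i \in \{1, \dots, p\}. \]
Since, for each n ∈ N, the finite system $(C_k^{ni})_{1\le i\le p,\ 1\le k\le N_{ni}} \subset J$ is pair-wise disjoint, and we have the equality
\[ \bigcup_{i=1}^p \bigcup_{k=1}^{N_{ni}} C_k^{ni} = \bigcup_{i=1}^p (B_i \cap A_n) = A_n \in R(J), \]
by the definition of $\bar\mu$, we have
\[ \bar\mu(A_n) = \sum_{i=1}^p \sum_{k=1}^{N_{ni}} \mu(C_k^{ni}), \quad \forall\, n \in \mathbb{N}. \]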
Note that we do not require the Ak ’s to be pair-wise disjoint. With this terminology,
Theorem 4.1 has the following.
Corollary 4.1. Let X be a non-empty set, and let J ⊂ P(X) be a semiring.
Then any additive map µ : J → [0, ∞] is sub-additive.
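Proof. Let $\bar\mu : R(J) \to [0, \infty]$ be the additive extension of µ to the ring generated by J. It suffices to prove that $\bar\mu$ is sub-additive. Start with sets $A, A_1, \dots, A_n \in R(J)$ such that $A \subset A_1 \cup \cdots \cup A_n$. Define the sets $B_1 = A_1$, and
\[ B_k = A_k \setminus (A_1 \cup \cdots \cup A_{k-1}), \quad \text{for all } k \in \{1, \dots, n\},\ k \ge 2. \]
Since we work in a ring, the sets $B_k$, $B_k \cap A$, $B_k \setminus A$, and $A_k \setminus B_k$, k ∈ {1, . . . , n}, all belong to R(J). Moreover, the sequence $(B_k)_{k=1}^n$ is pair-wise disjoint and it satisfies
• $B_k \subset A_k$, ∀ k ∈ {1, . . . , n},
• $\bigcup_{k=1}^n B_k = \bigcup_{k=1}^n A_k \supset A$,
so by the additivity of $\bar\mu$, we get
\[ \begin{aligned} \sum_{k=1}^n \bar\mu(A_k) &= \sum_{k=1}^n \bar\mu\big((A_k \setminus B_k) \cup B_k\big) = \sum_{k=1}^n \big[\bar\mu(A_k \setminus B_k) + \bar\mu(B_k)\big] \\ &\ge \sum_{k=1}^n \bar\mu(B_k) = \sum_{k=1}^n \bar\mu\big((B_k \setminus A) \cup (B_k \cap A)\big) = \sum_{k=1}^n \big[\bar\mu(B_k \setminus A) + \bar\mu(B_k \cap A)\big] \\ &\ge \sum_{k=1}^n \bar\mu(B_k \cap A) = \bar\mu\Big(\bigcup_{k=1}^n [B_k \cap A]\Big) = \bar\mu(A). \end{aligned} \]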
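The disjointification step used in this proof is easy to trace on finite sets. The Python sketch below uses the counting measure as a stand-in for $\bar\mu$ (an assumption made only for illustration); the function name `disjointify` and the sample sets are ours.

```python
def disjointify(sets):
    """B_1 = A_1 and B_k = A_k minus (A_1 ∪ ... ∪ A_{k-1}) for k >= 2."""
    seen, out = set(), []
    for a in sets:
        out.append(set(a) - seen)
        seen |= set(a)
    return out

A_list = [{1, 2, 3}, {3, 4}, {2, 4, 5}]
A = {2, 4, 5}                       # A is contained in the union of the A_k
B_list = disjointify(A_list)
mu = len                            # counting measure, additive on finite sets

chain = (sum(mu(a) for a in A_list),        # sum of mu(A_k)
         sum(mu(b) for b in B_list),        # sum of mu(B_k)
         sum(mu(b & A) for b in B_list),    # sum of mu(B_k ∩ A)
         mu(A))
print(B_list, chain)
```

The tuple `chain` lists the quantities appearing in the displayed estimate, in decreasing order, with the last two equal.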
µ̄(A) = µ(A), ∀ A ∈ J.
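Since we work in a ring, the sets $B_n$, $B_n \cap A$, $B_n \setminus A$, and $A_n \setminus B_n$, n ∈ N, all belong to R(J). Moreover, the sequence $(B_n)_{n=1}^\infty$ is pair-wise disjoint and it satisfies
• $B_n \subset A_n$, ∀ n ∈ N,
• $\bigcup_{n=1}^\infty B_n = \bigcup_{n=1}^\infty A_n \supset A$,
so by σ-additivity of $\bar\mu$, we get
\[ \begin{aligned} \sum_{n=1}^\infty \bar\mu(A_n) &= \sum_{n=1}^\infty \bar\mu\big((A_n \setminus B_n) \cup B_n\big) = \sum_{n=1}^\infty \big[\bar\mu(A_n \setminus B_n) + \bar\mu(B_n)\big] \\ &\ge \sum_{n=1}^\infty \bar\mu(B_n) = \sum_{n=1}^\infty \bar\mu\big((B_n \setminus A) \cup (B_n \cap A)\big) = \sum_{n=1}^\infty \big[\bar\mu(B_n \setminus A) + \bar\mu(B_n \cap A)\big] \\ &\ge \sum_{n=1}^\infty \bar\mu(B_n \cap A) = \bar\mu\Big(\bigcup_{n=1}^\infty [B_n \cap A]\Big) = \bar\mu(A). \end{aligned} \]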
Proof. Using Theorem 4.1, we can assume that J is already a ring. (Otherwise
we replace J by R(J), and µ by its extension µ̄.)
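(i). Consider the sets $D_1 = A_1$, and $D_k = A_k \setminus A_{k-1}$, ∀ k ≥ 2. It is clear that $(D_k)_{k=1}^\infty$ is a pairwise disjoint sequence in J, and we have the equality
\[ \bigcup_{k=1}^n D_k = A_n, \quad \forall\, n \ge 1. \tag{6} \]
Using (6), which also gives $\bigcup_{k=1}^\infty D_k = \bigcup_{n=1}^\infty A_n$, combined with the (σ-)additivity of µ, we get
\[ \mu\Big(\bigcup_{n=1}^\infty A_n\Big) = \sum_{k=1}^\infty \mu(D_k) = \lim_{n\to\infty} \sum_{k=1}^n \mu(D_k) = \lim_{n\to\infty} \mu\Big(\bigcup_{k=1}^n D_k\Big) = \lim_{n\to\infty} \mu(A_n). \]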
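(ii). Consider the sets $B = \bigcap_{n=1}^\infty B_n$, and $A_n = B_1 \setminus B_n$, ∀ n ≥ 1. It is clear that $(A_n)_{n=1}^\infty \subset J$, and we have $A_1 \subset A_2 \subset \dots$. Moreover, we have $\bigcup_{n=1}^\infty A_n = B_1 \setminus B$, so by part (i), we get
\[ \mu(B_1 \setminus B) = \lim_{n\to\infty} \mu(B_1 \setminus B_n). \tag{7} \]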
The above result has a (minor) generalization, which we record for future use.
To formulate it we introduce the following.
Notation. Let R be a ring, and let µ be a measure on R. For two sets A, B ∈ R, we write $A \subset_\mu B$, if µ(A ∖ B) = 0.
Using this notation, we have the following generalization of Lemma 4.1.
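Proposition 4.3. Let R be a ring, and let µ be a measure on R.
(i) If $(A_n)_{n=1}^\infty \subset R$ is a sequence of sets, with $A_1 \subset_\mu A_2 \subset_\mu \dots$, and $\bigcup_{n=1}^\infty A_n \in R$, then
\[ \mu\Big(\bigcup_{n=1}^\infty A_n\Big) = \lim_{n\to\infty} \mu(A_n). \]
(ii) If $(B_n)_{n=1}^\infty \subset R$ is a sequence of sets, with $B_1 \supset_\mu B_2 \supset_\mu \dots$, and $\bigcap_{n=1}^\infty B_n \in R$, and $\mu(B_1) < \infty$, then
\[ \mu\Big(\bigcap_{n=1}^\infty B_n\Big) = \lim_{n\to\infty} \mu(B_n). \]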
Proof. (i). Define the sequence of sets $(E_n)_{n=1}^\infty \subset R$, by $E_n = \bigcup_{k=1}^n A_k$, ∀ n ≥ 1. Notice that $A_1 = E_1$, and for each n ≥ 2, we have $A_n \subset E_n$, as well as the equality
\[ E_n \setminus A_n = \bigcup_{k=1}^{n-1} [A_k \setminus A_n]. \]
Using sub-additivity, it follows that
\[ \mu(E_n \setminus A_n) \le \sum_{k=1}^{n-1} \mu(A_k \setminus A_n), \]
We can prove this using induction on p. The case p = 1 is trivial. Assuming that the
above fact holds for p = N , let us prove it for p = N + 1. Pick k1 ∈ {1, . . . , N + 1}
such that $a_{k_1} = a$. Then we clearly have
\[ \bigcup_{\substack{1\le k\le N+1 \\ k \ne k_1}} [a_k, b_k) = [b_{k_1}, b), \]
so we get
\[ \sum_{k=1}^{N+1} (b_k - a_k) = (b_{k_1} - a_{k_1}) + (b - b_{k_1}) = b - a_{k_1} = b - a, \]
It will be helpful to introduce the following notations. For every half-open box
\[ B = [x_1, y_1) \times \cdots \times [x_n, y_n), \]
and every δ > 0, we define the boxes
\[ B^\delta = [x_1 - \delta, y_1) \times \cdots \times [x_n - \delta, y_n) \quad\text{and}\quad B_\delta = [x_1, y_1 - \delta) \times \cdots \times [x_n, y_n - \delta). \]
5. Outer measures
Although measures can be defined on arbitrary collections of sets, the most
natural domain of a measure is a σ-ring. In the previous section we dealt however
only with (semi)rings. Therefore it is natural to ask the following
Question 1: Given a measure µ on a (semi)ring J, is it possible to extend it
to a measure on the σ-ring S(J) generated by J?
As a particular case of the above question, we can specifically ask if there exists a
measure on Bor(Rn ), which agrees with voln on “half-open boxes.”
As a consequence of a remarkably clever construction, due to Caratheodory, we
will be able to answer the above general question in the affirmative. Caratheodory’s
approach is based on the following concept.
Definition. Given a non-empty set X, an outer measure on X is simply a
map ν : P(X) → [0, ∞] with the following properties.
(0) ν(∅) = 0.
(m) If A, B ∈ P(X) are such that A ⊂ B, then ν(A) ≤ ν(B).
(add$^-_\sigma$) ν is σ-sub-additive, i.e. whenever A ∈ P(X), and $(A_n)_{n=1}^\infty$ is a sequence in P(X) with $A \subset \bigcup_{n=1}^\infty A_n$, it follows that $\nu(A) \le \sum_{n=1}^\infty \nu(A_n)$.
The property (m) is called monotonicity.
Remark that ν is automatically sub-additive, in the sense that, whenever
A, A1 , . . . , An ∈ P(X) are such that A ⊂ A1 ∪ · · · ∪ An , it follows that ν(A) ≤
ν(A1 ) + · · · + ν(An ).
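On a finite set X the three properties can be verified by brute force. The Python sketch below relies on the observation that, when P(X) is finite, σ-sub-additivity follows from monotonicity together with sub-additivity for two sets (a countable cover only involves finitely many distinct sets). The function names and the sample map ν(A) = min(card A, 2) are ours, chosen only for illustration.

```python
from itertools import chain, combinations

def powerset(X):
    xs = list(X)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(xs, r) for r in range(len(xs) + 1))]

def is_outer_measure(nu, X):
    """Check (0), (m) and two-set sub-additivity on all subsets of a finite set X."""
    P = powerset(X)
    if nu(frozenset()) != 0:                      # property (0)
        return False
    for A in P:
        for B in P:
            if A <= B and nu(A) > nu(B):          # property (m)
                return False
            if nu(A | B) > nu(A) + nu(B):         # sub-additivity for two sets
                return False
    return True

X = {1, 2, 3}
print(is_outer_measure(lambda A: min(len(A), 2), X))   # an outer measure that is not additive
print(is_outer_measure(lambda A: len(A) ** 2, X))      # fails sub-additivity
```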
The following result explains how a measure on a semiring can be naturally
extended to an outer measure on the ambient space.
Proposition 5.1. Let X be a non-empty set, let J be a semiring on X, and
let µ : J → [0, ∞] be a measure on J. Consider the collection
\[ P^J_\sigma(X) = \Big\{ A \subset X : \text{there exists } (B_n)_{n=1}^\infty \subset J, \text{ with } A \subset \bigcup_{n=1}^\infty B_n \Big\}. \]
moment some ε > 0. For every n ∈ N choose a sequence $(B_k^n)_{k=1}^\infty \subset J$, such that
\[ \sum_{k=1}^\infty \mu(B_k^n) < \frac{\varepsilon}{2^n} + \bar\mu(A_n). \]
Since the above inequality holds for all ε > 0, we conclude that
\[ \mu^*(A) = \bar\mu(A) \le \sum_{n=1}^\infty \bar\mu(A_n) = \sum_{n=1}^\infty \mu^*(A_n), \]
so µ∗ is indeed σ-sub-additive.
Finally, we must show that $\mu^*|_J = \mu$. Start with some A ∈ J. On the one
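is an algebra, all the $B_n$’s belong to $m_\nu(X)$. We have $\bigcup_{n=1}^\infty B_n = \bigcup_{n=1}^\infty A_n = A$, which, using Step 4, gives
\[ \nu(S \cap A) = \nu\Big(S \cap \bigcup_{n=1}^\infty A_n\Big) = \nu\Big(S \cap \bigcup_{n=1}^\infty B_n\Big) = \sum_{n=1}^\infty \nu(S \cap B_n). \tag{1} \]
Using Step 3, combined with the equality $\bigcup_{n=1}^N B_n = A_N$, we also have
\[ \sum_{n=1}^N \nu(S \cap B_n) = \nu\Big(S \cap \bigcup_{n=1}^N B_n\Big) = \nu(S \cap A_N), \quad \forall\, N \in \mathbb{N}, \]
so by (1) we have
\[ \nu(S \cap A) = \sum_{n=1}^\infty \nu(S \cap B_n) = \lim_{N\to\infty} \nu(S \cap A_N). \tag{2} \]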
Notice now that, using the fact that AN “sharply cuts S,” combined with the
monotonicity of ν and the obvious inclusion S r A ⊂ S r AN , we have
ν(S ∩ AN ) + ν(S r A) ≤ ν(S ∩ AN ) + ν(S r AN ) = ν(S), ∀ N ∈ N,
so using (2), we immediately get
ν(S ∩ A) + ν(S r A) ≤ ν(S).
Since the above inequality holds for all S ⊂ X, by Remark 5.1.A it follows that A
indeed belongs to mν (X).
By the results from Section 1, we know that the fact that $m_\nu(X)$ is simultaneously an algebra, and a monotone class, implies the fact that $m_\nu(X)$ is a σ-algebra.
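For an outer measure on a small finite set, the collection $m_\nu(X)$ can be listed by brute force. The Python sketch below assumes the Carathéodory criterion ν(S) = ν(S ∩ A) + ν(S ∖ A) for all S ⊂ X as the definition of ν-measurability (the condition that A "sharply cuts" every S); the function names and the sample outer measure ν(A) = min(card A, 2) are ours.

```python
from itertools import chain, combinations

def powerset(X):
    xs = list(X)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(xs, r) for r in range(len(xs) + 1))]

def measurable_sets(nu, X):
    """All A ⊂ X with nu(S) == nu(S ∩ A) + nu(S minus A) for every S ⊂ X."""
    P = powerset(X)
    return {A for A in P if all(nu(S) == nu(S & A) + nu(S - A) for S in P)}

X = frozenset({1, 2, 3})
nu = lambda A: min(len(A), 2)        # an outer measure that is not additive on P(X)

M = measurable_sets(nu, X)
print(sorted(sorted(A) for A in M))

# closure under complements and unions, as expected of an algebra
print(all((X - A) in M and (A | B) in M for A in M for B in M))
```

With the counting measure in place of `nu`, every subset is measurable.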
We now show that $\nu|_{m_\nu(X)}$ is a measure. If we start with a pair-wise disjoint sequence $(A_n)_{n=1}^\infty \subset m_\nu(X)$, then the equality
\[ \nu\Big(\bigcup_{n=1}^\infty A_n\Big) = \sum_{n=1}^\infty \nu(A_n) \]
is an immediate consequence of Step 4, applied to the set $S = \bigcup_{n=1}^\infty A_n$, which clearly satisfies $S \cap A_n = A_n$, ∀ n ∈ N.
Proof. What we need to prove is the fact that every set A ∈ J is µ∗ -measurable.
Start with an arbitrary set S ⊂ X. As noticed before (Remark 5.1.A), we only need
to prove the inequality
(3) µ∗ (S ∩ A) + µ∗ (S r A) ≤ µ∗ (S).
If µ∗ (S) = ∞, there is nothing to prove, so we can assume that µ∗ (S) < ∞. In
particular this means that S ∈ PJσ (X). Fix for the moment ε > 0. By the definition
198 LECTURE 22
of $\mu^*(S) = \bar\mu(S)$, there exists a sequence $(B_n)_{n=1}^\infty \subset J$, such that $S \subset \bigcup_{n=1}^\infty B_n$, and
\[ \sum_{n=1}^\infty \mu(B_n) \le \mu^*(S) + \varepsilon. \tag{4} \]
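\[ C_m = D^n_{m-k_{n-1}} \setminus D^n_{m-1-k_{n-1}}, \quad \text{if } k_{n-1} < m \le k_n,\ n \in \mathbb{N}. \]
By construction, for each n ∈ N, we have
\[ \bigcup_{m=k_{n-1}+1}^{k_n} C_m = \bigcup_{j=1}^{p_n} (D_j^n \setminus D_{j-1}^n) = B_n \setminus A_n. \]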
Combining (6) and (7) with (5) immediately gives the desired inequality (3).
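The construction
\[ \underbrace{\mu}_{\text{measure on } J} \ \xrightarrow{\ \text{maximal outer extension}\ }\ \underbrace{\mu^*}_{\text{outer measure on } X} \ \xrightarrow{\ \text{restriction}\ }\ \underbrace{\mu^*\big|_{\Sigma(J)}}_{\text{measure on } \Sigma(J)} \]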
gives the fact that A belongs to PJσ (X), so by Proposition 2.3, the set A belongs to
the intersection Σ(J) ∩ PJσ (X) = S(J).
Using the above terminology, we have the following uniqueness result.
Theorem 5.3. Let J be a semiring on X, let µ be a measure on J, let µ∗ be the
maximal outer extension of µ, and let ν be a measure on the σ-ring S(J) generated
by J, with $\nu|_J = \mu$. Then one has $\nu(A) = \mu^*(A)$, for all J-µ-σ-finite sets A ∈ S(J).
In the absence of the σ-finiteness condition the uniqueness of the σ-ring extension fails, as illustrated by the following.
Example 5.2. Consider the set X = Q, and the semiring of rational half-open
intervals
\[ J_1 = \{\emptyset\} \cup \{ [a, b) \cap \mathbb{Q} : a, b \in \mathbb{R},\ a < b \}. \]
\[ \nu^*(A) = \inf\Big\{ \sum_{n=1}^\infty \nu(B_n) : (B_n)_{n=1}^\infty \subset I,\ A \subset \bigcup_{n=1}^\infty B_n \Big\}. \]
Proof. (i). This is pretty clear. In fact, if one takes E = {B ∈ S : ν(B) = 0},
then one has the equality N(S, ν) = PEσ (X).
(ii). (a) ⇒ (b). Assume A = B r N with B ∈ S and N ∈ N(S, ν). Choose
D ∈ S with ν(D) = 0 and N ⊂ D. We now have
B r D ⊂ B r N = A,
so if we put F = B r D, we have the equality A = F ∪ M , where
M = A r F = (B r N ) r (B r D) ⊂ D.
Notice that F ∈ S, while the inclusion M ⊂ D shows that M ∈ N(S, ν).
(b) ⇒ (a). Assume A = F ∪ M with F ∈ S and M ∈ N(S, ν). Choose D ∈ S
with M ⊂ D and ν(D) = 0. Define B = F ∪ D. It is clear that B ∈ S, and A ⊂ B.
Define N = B ∖ A, so we clearly have A = B ∖ N. We have
N = (F ∪ D) ∖ (F ∪ M) ⊂ D ∖ M ⊂ D,
so N clearly belongs to N(S, ν).
(iii). We need to prove the following properties:
(∗) whenever A1 , A2 are sets in S̄, it follows that the difference A1 r A2 also
belongs to S̄;
(∗∗) whenever $(A_n)_{n=1}^\infty$ is a sequence of sets in $\overline{S}$, it follows that the union $\bigcup_{n=1}^\infty A_n$ also belongs to $\overline{S}$.
To prove (∗), we write A1 = B r N and A2 = F ∪ M , with B, F ∈ S and M, N ∈
N(S, ν). Then we have
A1 r A2 = (B r N ) r (F ∪ M ) = B r (F ∪ M ∪ N ) = (B r F ) r (M ∪ N ).
The difference B r F belongs to S, and, using (i), the union N ∪ M belongs to
N(S, ν). By (ii) it follows that A1 r A2 belongs to S̄.
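To prove (∗∗), we write, for each n ∈ N, the set $A_n$ as $A_n = F_n \cup M_n$ with $F_n \in S$ and $M_n \in N(S, \nu)$. Then
\[ \bigcup_{n=1}^\infty A_n = \Big(\bigcup_{n=1}^\infty F_n\Big) \cup \Big(\bigcup_{n=1}^\infty M_n\Big). \]
The union $\bigcup_{n=1}^\infty F_n$ belongs to S, and, using (i), the union $\bigcup_{n=1}^\infty M_n$ belongs to N(S, ν). By (ii), the union $\bigcup_{n=1}^\infty A_n$ belongs to $\overline{S}$.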
definition of S̄.
(iv). To prove the existence, we consider the maximal outer extension $\nu^*$. When restricted to the σ-algebra $m_{\nu^*}(X)$ of all $\nu^*$-measurable sets, $\nu^*$ becomes a measure. Notice that $\nu^*(N) = 0$, ∀ N ∈ N(S, ν), which gives the inclusion $N(S, \nu) \subset m_{\nu^*}(X)$. In particular, since $m_{\nu^*}(X)$ is a σ-algebra, which contains both N(S, ν) and S, it follows that
\[ m_{\nu^*}(X) \supset S\big(N(S, \nu) \cup S\big) = \overline{S}. \]
properties.
To prove uniqueness, let µ be another measure on $\overline{S}$, such that $\mu|_{N(S,\nu)} = 0$ and
with F ∈ S and M ∈ N(S, ν), then using the fact that A ∖ F ⊂ M, we see that A ∖ F belongs to N(S, ν), so we have
\[ \mu(A) = \mu(F) + \mu(A \setminus F) = \mu(F) = \nu(F) = \bar\nu(F) = \bar\nu(F) + \bar\nu(A \setminus F) = \bar\nu(A). \]
Finally, we prove that the measure ν̄ is complete. Let A ∈ S̄ be a set with
ν̄(A) = 0, and let U be an arbitrary subset of A. Using (ii) we write A = F ∪ M ,
with F ∈ S and M ∈ N(S, ν). Notice that we have
0 ≤ ν(F ) = ν̄(F ) ≤ ν̄(F ∪ M ) = ν̄(A) = 0,
which forces F ∈ N(S, ν), so using (i), we see that A itself belongs to N(S, ν). By
(i), it follows that U ∈ N(S, ν) ⊂ S̄.
(v) Let E and λ be as indicated. In order to prove the inclusion $E \supset \overline{S}$, it suffices to prove the inclusion N(S, ν) ⊂ E. But this inclusion is pretty obvious. If we start with some N ∈ N(S, ν), then there exists A ∈ S with N ⊂ A and ν(A) = 0. In particular, we have A ∈ E and λ(A) = 0, and then the completeness of λ forces N ∈ E. Notice that this also forces $\lambda(N) = \bar\nu(N) = 0$. Using (iv) it then follows that $\lambda|_{\overline{S}} = \bar\nu$.
Definition. Using the notations above, the σ-ring S̄ is called the completion
of S with respect to ν. The correspondence (S, ν) 7−→ (S̄, ν̄) is referred to as the
measure completion. Remark that, if ν is already complete, then S̄ = S and ν̄ = ν.
Exercise 4. Using the notations from Theorem 5.4, prove that for a set A ⊂ X,
the condition A ∈ S̄ is equivalent to any of the following:
(a′) there exist B ∈ S and N ∈ N(S, ν), with A = B ∖ N, and N ⊂ B;
(b′) there exist F ∈ S and M ∈ N(S, ν), such that A = F ∪ M and F ∩ M = ∅;
(c) there exist E ∈ S and Z ∈ N(S, ν), such that A = E △ Z;
(d) there exist B, F ∈ S such that F ⊂ A ⊂ B, and ν(B ∖ F) = 0.
The $\mu^*$-measurable sets of a special type can be completely characterized using $\mu^*$-negligible ones.
Theorem 5.5. Suppose J is a semiring on X, and µ is a measure on J. Let $\mu^*$ be the maximal outer extension of µ. For a J-µ-σ-finite subset A ⊂ X, the following are equivalent:
(i) A is $\mu^*$-measurable;
(ii) there exists B in the σ-ring S(J) generated by J, and a $\mu^*$-negligible set N ⊂ X, such that A = B ∖ N.
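So, if we denote by $\overline{S(J)}$ the completion of S(J) with respect to $\mu^*|_{S(J)}$, condition (ii) from Theorem 5.5 reads: $A \in \overline{S(J)}$. Similarly, if we denote by $\overline{\Sigma(J)}$ the completion of Σ(J) with respect to the measure $\mu^*|_{\Sigma(J)}$, condition (ii′) above reads: $A \in \overline{\Sigma(J)}$.
With these notations, we have the inclusions
\[ \overline{S(J)} \subset \overline{\Sigma(J)} \subset m_{\mu^*}(X). \tag{12} \]
With these notations, Theorem 5.5 states that
\[ \overline{S(J)} \cap \{ A \subset X : A\ J\text{-}\mu\text{-}\sigma\text{-finite} \} = m_{\mu^*}(X) \cap \{ A \subset X : A\ J\text{-}\mu\text{-}\sigma\text{-finite} \}. \tag{13} \]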
Proof. Indeed, under the given assumptions on J and µ, it follows that every
set A ⊂ X is J-µ-σ-finite.
Examples 5.3. A. The implication (i) ⇒ (ii) from Theorem 5.5 may fail, if
A is not σ-finite. Start with an arbitrary set X, consider the semiring J = {∅, X}
and the measure µ on J defined by µ(∅) = 0 and µ(X) = ∞. Notice that J is a
σ-algebra, so it is trivial that J is σ-total in X. The maximal outer extension µ∗
of µ is defined by
\[ \mu^*(A) = \begin{cases} 0 & \text{if } A = \emptyset \\ \infty & \text{if } A \ne \emptyset \end{cases} \]
It is clear that, since $\mu^*$ is a measure on P(X), we have the equality $m_{\mu^*}(X) = P(X)$, but the only $\mu^*$-negligible set is the empty set ∅. This means that the sets satisfying condition (ii) in Theorem 5.5 are only the sets ∅ and X, so, if ∅ ≠ A ⊊ X, the implication (i) ⇒ (ii) fails, although J is σ-total in X. What occurs here is the total lack of J-µ-σ-finite sets.
B. Let X be an uncountable set, and let J be the semiring of all finite subsets of X. We have
\[ S(J) = \{ A \subset X : \operatorname{card} A \le \aleph_0 \}, \]
\[ \Sigma(J) = \{ A \subset X : \text{either } \operatorname{card} A \le \aleph_0, \text{ or } \operatorname{card}(X \setminus A) \le \aleph_0 \}. \]
Equip J with the trivial measure µ(A) = 0, ∀ A ∈ J. The maximal outer extension $\mu^*$ is then defined by
\[ \mu^*(A) = \begin{cases} 0 & \text{if } \operatorname{card} A \le \aleph_0 \\ \infty & \text{if } A \text{ is uncountable} \end{cases} \]
It is clear that $\mu^*$ is a measure on P(X), so we have $m_{\mu^*}(X) = P(X) \supsetneq J$. Notice that both measures $\mu^*|_{S(J)}$ and $\mu^*|_{\Sigma(J)}$ are complete, so using the notations from Remark 5.4.B, we have the equalities
\[ \overline{S(J)} = S(J) \quad\text{and}\quad \overline{\Sigma(J)} = \Sigma(J). \]
It is clear however that both inclusions in (12) are strict, although µ is finite. What happens here is the fact that J is not σ-total in X.
C. In the same setting as in Example B, if we take I = Σ(J), and ν = µ∗ I ,
of the fact that there are “new” measurable sets which are not necessarily of the form B ∖ N with B ∈ B and N negligible. The existence of such sets is suggested by the following.
Remark 5.5. Suppose ν is an outer measure on X. For a set A ⊂ X, the
following are equivalent:
(i) A is ν-measurable;
(ii) ν(S) ≥ ν(S ∩ A) + ν(S r A), for all S ⊂ X with ν(S) < ∞;
The implication (i) ⇒ (ii) is trivial. To prove the converse, by Remark 5.1.A, we
need to show that
ν(S) ≥ ν(S ∩ A) + ν(S r A), ∀ S ⊂ X.
But this is trivial, when ν(S) = ∞. If ν(S) < ∞, then this is exactly condition (ii).
The “new” sets, that were mentioned above, are of a type covered by the following.
Definition. Let ν be an outer measure on X. A subset N ⊂ X is said to be locally ν-negligible, if
ν(N ∩ A) = 0, for all A ⊂ X with ν(A) < ∞.
It is clear that every subset of a locally ν-negligible set N is also locally ν-negligible.
The above observation shows that every locally ν-negligible set is ν-measurable.
The term “local” will be used in connection with properties that hold when the
subject set is cut down by sets of finite measure. For example, one can formulate
the following.
Definitions. Let B be a σ-algebra on X, and µ be a measure on B. We say
that a set N ∈ B is locally µ-null, if
(16) µ(F ∩ N ) = 0, for all F ∈ B, with µ(F ) < ∞.
Remark that locally µ-null sets do not necessarily have zero measure (see Example
5.3.C)
We say that µ is locally complete, if it satisfies the condition
(lc) whenever N ∈ B is a locally µ-null set, it follows that B contains all
subsets of N .
Remarks 5.6. Use the notations above.
A. If the measure µ is σ-finite, the local completeness of µ is equivalent to
completeness. The reason is the fact that, in the σ-finite case, condition (16) is
equivalent to µ(N ) = 0.
B. Given an outer measure ν on X, the measure $\nu|_{m_\nu(X)}$ is locally complete.
Comment. If we look at Example 5.3.C, we now see that although the measure ν on I is complete, it is not locally complete, thus giving another explanation for the strict inclusion $I \subsetneq m_{\nu^*}(X)$.
We are now in position to analyze Question 3, in the simplified given setting.
The following fact will be helpful.
Lemma 5.1. Let B be a σ-algebra on X, let µ be a measure on B, and let µ∗
be the maximal outer extension of µ. Then, for every subset S ⊂ X, one has the
equality
\[ \mu^*(S) = \inf\{ \mu(B) : B \in \mathcal{B},\ B \supset S \}, \tag{17} \]
where $\mathcal{B}$ denotes the given σ-algebra.
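Formula (17) can be compared with the covering definition of the maximal outer extension on a small finite example. The Python sketch below builds a σ-algebra from three atoms with hypothetical weights, computes the infimum over covers (on a finite σ-algebra it is enough to consider finite families of distinct sets), and compares it with the infimum over single sets containing S. All names and numbers are ours and serve only as an illustration.

```python
from itertools import chain, combinations

atoms = [frozenset({1, 2}), frozenset({3}), frozenset({4})]
weight = {atoms[0]: 5, atoms[1]: 1, atoms[2]: 2}    # hypothetical values of mu on the atoms

def subfamilies(items):
    items = list(items)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

# the sigma-algebra generated by the atoms, as a dict: set -> mu(set)
sigma_algebra = {frozenset().union(*fam): sum(weight[a] for a in fam)
                 for fam in subfamilies(atoms)}

def outer_by_covers(S):
    """inf of the sums of mu(B_n) over families of sets of the sigma-algebra covering S."""
    totals = [sum(sigma_algebra[B] for B in cover)
              for cover in subfamilies(sigma_algebra)
              if S <= frozenset().union(*cover)]
    return min(totals)

def outer_by_single_set(S):
    """Right-hand side of (17): inf of mu(B) over single sets B containing S."""
    return min(m for B, m in sigma_algebra.items() if S <= B)

subsets = [frozenset(c) for c in subfamilies({1, 2, 3, 4})]
print(all(outer_by_covers(S) == outer_by_single_set(S) for S in subsets))
```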
A. To prove the implication (i) ⇒ (ii), start with a µ∗-measurable set A, and
with some F ∈ Bfin. Since F is µ∗-measurable, the intersection A ∩ F is µ∗-measurable.
Since µ∗(A ∩ F) ≤ µ∗(F) = µ(F) < ∞, by Theorem 5.5 there exist
B0 ∈ B and N0 ⊂ X with µ∗(N0) = 0 and A ∩ F = B0 ∖ N0. If we then define
B = B0 ∩ F and N = N0 ∩ F, then we clearly have B ∈ BF, N ∈ NF, and
A ∩ F = B ∖ N, so A ∩ F indeed belongs to MF.
The implication (ii) ⇒ (iii) is trivial, since every set in MF is clearly µ∗-measurable.
To prove the implication (iii) ⇒ (i), assume A has property (iii), and let us
show that A is µ∗ -measurable. We are going to use Remark 5.5, which means that
it suffices to prove the inequality
(19) µ∗(S) ≥ µ∗(S ∩ A) + µ∗(S ∖ A),
8 Here µ_F denotes the restriction of µ to the σ-algebra B_F.
210 LECTURE 22
only for those subsets S ⊂ X with µ∗ (S) < ∞. Fix such a subset S. Since
µ∗ (S) < ∞, Lemma 5.1 gives
(20) µ∗(S) = inf{µ(F) : F ∈ Bfin, F ⊃ S}.
Start with some arbitrary ε > 0, and choose some F ∈ Bfin with F ⊃ S and
µ(F ) ≤ µ∗ (S) + ε. By (iii) the set A ∩ F is µ∗ -measurable, so we have
µ∗(F) = µ∗(F ∩ [A ∩ F]) + µ∗(F ∖ [A ∩ F]) = µ∗(F ∩ A) + µ∗(F ∖ A).
Condition (iii) uses the summation convention from II.2. (The sum is defined as
the supremum of all finite partial sums.)
Since µ(A ∩ F) > 0, ∀ F ∈ S_F^µ(A), property (a) follows from Proposition II.2.2. If
we denote the union ⋃_{F ∈ S_F^µ(A)} (A ∩ F) by A′, then by the σ-additivity of µ (it is
here where we use (a) in an essential way) the equality (22) gives
\[ \mu(A') = \sum_{F \in S_{\mathcal{F}}^{\mu}(A)} \mu(A \cap F) = \mu(A), \]
Proof. It will be useful to introduce the following notations (use also the
notations from Proposition 5.5). For every B ∈ Bfin, we define
\[ B' = \bigcup_{F \in S_{\mathcal{F}}^{\mu}(B)} (B \cap F). \]
and let us show that A ∩ B is µ∗-measurable. Using the above notation, and the
monotonicity of µ∗ we have
µ∗(A ∩ [B ∖ B′]) ≤ µ∗(B ∖ B′) = µ(B ∖ B′) = 0,
and since the indexing set S_F^µ(B) is at most countable, it then suffices to show
that A ∩ F ∩ B is µ∗-measurable, for each F. But this is obvious, since A ∩ F is
µ∗-measurable, by condition (ii), and B ∈ B.
C. Let A ⊂ X be a subset with µ∗(A) < ∞. Using Lemma 5.1, we can find, for
every ε > 0, some set Bε ∈ Bfin, such that Bε ⊃ A, and µ(Bε) ≤ µ∗(A) + ε. Fix
for the moment ε. Since the family (µ(Bε ∩ F))_{F∈F} is summable, and µ∗(A ∩ F) ≤
µ(Bε ∩ F), ∀ F ∈ F, it follows that the family (µ∗(A ∩ F))_{F∈F} is summable, and
Since S_F^µ(B1) is at most countable, the set G belongs to B. With the above notation,
we have the equality B1′ = B1 ∩ G, and by Remark 5.7.A, we have µ(B1 ∖ G) =
µ(B1 ∖ B1′) = 0. Since A ∖ G ⊂ B1 ∖ G, it follows that µ∗(A ∖ G) = 0. Since G is
µ∗-measurable, we get
µ∗(A) = µ∗(A ∩ G) + µ∗(A ∖ G) = µ∗(A ∩ G).
Since G is a countable union of F’s, by the σ-subadditivity of µ∗, we have
\[ \mu^*(A) = \mu^*(A \cap G) = \mu^*\Bigl(\bigcup_{F \in S_{\mathcal{F}}^{\mu}(B_1)} [A \cap F]\Bigr) \le \sum_{F \in S_{\mathcal{F}}^{\mu}(B_1)} \mu^*(A \cap F) \le \sum_{F \in \mathcal{F}} \mu^*(A \cap F). \]
B. The implication (i) ⇒ (ii) is trivial. To prove the implication (ii) ⇒ (i), we
must show that condition (ii) implies
µ∗ (N ∩ B) = 0, ∀ B ∈ Bfin .
But if we fix some B ∈ Bfin, then of course we have µ∗(N ∩ B) ≤ µ∗(B) = µ(B) < ∞,
so using part C, we have
\[ \mu^*(N \cap B) = \sum_{F \in \mathcal{F}} \mu^*(N \cap B \cap F) \le \sum_{F \in \mathcal{F}} \mu^*(N \cap F) = 0, \]
N ⊂ X, such that A = B ∖ N.
Proof. A. This is exactly property A from Proposition 5.6.
B. (i) ⇒ (ii). Assume A ∈ Mµ, i.e. A is µ∗-measurable. For every F ∈ F, the
set A ∩ F is µ∗-measurable. Since µ∗(A ∩ F) < ∞, by Theorem 5.5, it follows that
A ∩ F = B_F ∖ N_F, with B_F ∈ B and µ∗(N_F) = 0. Replacing B_F with B_F ∩ F,
and N_F with N_F ∩ F, we can assume that B_F, N_F ⊂ F. Form then the sets
B = ⋃_{F∈F} B_F and N = ⋃_{F∈F} N_F. On the one hand, we have B ∩ F = B_F ∈ B,
∀ F ∈ F, which means precisely that B ∈ ⋁_{F∈F} B_F. On the other hand, we also
have N ∩ F = N_F, so we get µ∗(N ∩ F) = 0, ∀ F ∈ F. By Proposition 5.6.B, it
follows that N is locally µ∗-negligible. We clearly have A = B ∖ N.
The implication (ii) ⇒ (i) is obvious.
There is yet another nicer consequence of Proposition 5.6, for which we are
going to use the following terminology.
Definition. Let A be a σ-algebra on X, and let µ be a measure on A. A
family F is called a µ-finite decomposition for A, if
(i) F is a sufficient µ-finite A-partition of X, and
(ii) one has the equality ⋁_{F∈F} A_F = A.
(Given a collection F ⊂ A, one always has the inclusion ⋁_{F∈F} A_F ⊃ A.)
(ii) there exist B ∈ B, and some locally µ∗-negligible set N, such that
A = B ∖ N.
B. For a subset N ⊂ X, the following are equivalent
(i) N is locally µ∗-negligible;
(ii) there exists a locally µ-null set D ∈ B with N ⊂ D.
Proof. A. This is clear, by Corollary 5.3.
B. The implication (ii) ⇒ (i) is trivial, because any locally µ-null set D is
locally µ∗-negligible, and so is every subset of D.
To prove the implication (i) ⇒ (ii), start with a locally µ∗-negligible set N,
and fix a µ-finite decomposition F of B. We know that µ∗(N ∩ F) = 0, ∀ F ∈ F.
In particular, using Remark 5.4.B, for each F ∈ F, there exists some set E_F ∈ B,
with N ∩ F ⊂ E_F, and µ(E_F) = 0. Consider now the set D = ⋃_{F∈F} (E_F ∩ F).
By construction, we have D ∩ F = E_F ∩ F ∈ B, ∀ F ∈ F, which means that
D ∈ ⋁_{F∈F} B_F. It is here where we use condition (ii) in the definition of µ-finite
decompositions, to conclude that D belongs to B. Of course, we have
µ(D ∩ F) = µ(E_F ∩ F) ≤ µ(E_F) = 0, ∀ F ∈ F,
which by Proposition 5.6 means that D is locally µ∗-negligible. This means that
µ(D ∩ B) = µ∗(D ∩ B) = 0, ∀ B ∈ Bfin,
which means that D is locally µ-null. Since N ∩ F ⊂ E_F ∩ F ⊂ D, ∀ F ∈ F, and F
is a partition of X, we get N ⊂ D.
Lectures 23-25
negligible means that λ∗n(N) = 0, and is equivalent to the existence of a Borel set
C ⊃ N with λn(C) = 0.)
Exercise 1. Let A = [a1, b1) × · · · × [an, bn) be a half-open box in R^n. Assume
A ≠ ∅ (which means that a1 < b1, . . . , an < bn). Consider the open box Int(A)
and the closed box Ā, which are given by
Int(A) = (a1, b1) × · · · × (an, bn) and Ā = [a1, b1] × · · · × [an, bn].
Prove the equalities
λn(Int(A)) = λn(Ā) = voln(A).
Remarks 6.1. If D ⊂ Rn is a non-empty open set, then λn (D) > 0. This is a
consequence of the above exercise, combined with the fact that D contains at least
one non-empty open box.
The Lebesgue measure of a countable subset C ⊂ Rn is zero. Using σ-additivity,
it suffices to prove this only in the case of singletons C = {x}. If we write x in
coordinates x = (x1 , . . . , xn ), and if we consider half-open boxes of the form
Jε = [x1 , x1 + ε) × · · · × [xn , xn + ε),
then the obvious inclusion {x} ⊂ Jε will force
0 ≤ λn({x}) ≤ λn(Jε) = ε^n,
so taking the limit as ε → 0, we indeed get λn({x}) = 0.
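The same conclusion can also be reached directly from the definition of the outer measure, without reducing to singletons. The sketch below illustrates this alternative route in dimension n = 1, assuming the points of the countable set are listed as a sequence and the k-th point is covered by an interval of length ε/2^k; the function name `cover_length` is hypothetical.

```python
from fractions import Fraction

def cover_length(num_points: int, eps: Fraction) -> Fraction:
    """Total length of a cover in which the k-th point (k = 1..num_points)
    sits inside an interval of length eps / 2**k."""
    return sum((eps / 2**k for k in range(1, num_points + 1)), Fraction(0))

eps = Fraction(1, 100)
# No matter how many points are covered, the total length stays below eps.
totals = [cover_length(k, eps) for k in (1, 5, 20)]
```

Since the total length never exceeds ε, and ε is arbitrary, the outer measure of the countable set must vanish.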
The (outer) Lebesgue measure is completely determined by its values on open
sets. More explicitly, one has the following result.
Proposition 6.1. Let n ≥ 1 be an integer. For every subset A ⊂ Rn one has:
(2) λ∗n (A) = inf{λn (D) : D open subset of Rn , with D ⊃ A}.
Proof. Throughout the proof the set A will be fixed. Let us denote, for
simplicity, the right hand side of (2) by ν(A). First of all, since every open set is
Lebesgue measurable (being Borel), we have λn (D) = λ∗n (D), for all open sets D,
so by the monotonicity of λ∗n , we get the inequality
λ∗n (A) ≤ ν(A).
We now prove the inequality λ∗n(A) ≥ ν(A). Fix for the moment some ε > 0, and
use (1) to get the existence of a sequence (B_k)_{k=1}^∞ ⊂ J_n, such that ⋃_{k=1}^∞ B_k ⊃ A,
and
\[ \sum_{k=1}^{\infty} \mathrm{vol}_n(B_k) < \lambda_n^*(A) + \varepsilon. \]
For every k ≥ 1, we write
\[ B_k = [a_1^{(k)}, b_1^{(k)}) \times \cdots \times [a_n^{(k)}, b_n^{(k)}), \]
so that voln(B_k) = ∏_{j=1}^n (b_j^{(k)} − a_j^{(k)}). Using the obvious continuity of the map
\[ \mathbb{R} \ni t \longmapsto \prod_{j=1}^{n} \bigl(b_j^{(k)} - a_j^{(k)} - t\bigr) \in \mathbb{R}, \]
we can find, for each k ≥ 1, some numbers c_1^{(k)} < a_1^{(k)}, . . . , c_n^{(k)} < a_n^{(k)}, with
\[ \prod_{j=1}^{n} \bigl(b_j^{(k)} - c_j^{(k)}\bigr) < \frac{\varepsilon}{2^k} + \prod_{j=1}^{n} \bigl(b_j^{(k)} - a_j^{(k)}\bigr). \tag{3} \]
The Lebesgue measure can also be recovered from its values on compact sets.
Proposition 6.2. Let n ≥ 1 be an integer. For every Lebesgue measurable
subset A ⊂ Rn one has:
(6) λn (A) = sup{λn (K) : K compact subset of Rn , with K ⊂ A}.
Proof. Let us denote, for simplicity, the right hand side of (6) by µ(A). First
of all, by monotonicity we clearly have the inequality
λn (A) ≥ µ(A).
To prove the inequality λn (A) ≤ µ(A), we shall first use a reduction to the bounded
case. For each integer k ≥ 1, we define the compact box
Bk = [−k, k] × · · · × [−k, k].
Notice that we have B1 ⊂ B2 ⊂ . . . , with ⋃_{k=1}^∞ B_k = R^n. We then have
B1 ∩ A ⊂ B2 ∩ A ⊂ . . . ,
with ⋃_{k=1}^∞ (B_k ∩ A) = A, so using the Continuity Lemma 4.1, we have
(7) λn(A) = lim_{k→∞} λn(Bk ∩ A) = sup{λn(Bk ∩ A) : k ≥ 1}.
Fix for the moment some ε > 0, and use (7) to find some k ≥ 1, such that
λn(A) ≤ λn(Bk ∩ A) + ε. Apply Proposition 6.1 to the set Bk ∖ A, to find an open
set D, with D ⊃ Bk ∖ A, and λn(Bk ∖ A) ≥ λn(D) − ε. On the one hand, we have
(8) λn(Bk) = λn(Bk ∩ A) + λn(Bk ∖ A) ≥ λn(Bk ∩ A) + λn(D) − ε ≥ λn(Bk ∩ A) + λn(Bk ∩ D) − ε.
On the other hand, we have
λn(Bk) = λn(Bk ∖ D) + λn(Bk ∩ D),
so using (8) we get the inequality
λn(Bk ∖ D) + λn(Bk ∩ D) ≥ λn(Bk ∩ A) + λn(Bk ∩ D) − ε,
and since all numbers involved in the above inequality are finite, we conclude that
λn(Bk ∖ D) ≥ λn(Bk ∩ A) − ε ≥ λn(A) − 2ε.
Obviously the set K = Bk ∖ D is compact, with K ⊂ Bk ∩ A ⊂ A, so we have
µ(A) ≥ λn(K), hence we get the inequality
µ(A) ≥ λn(A) − 2ε.
Since this is true for all ε > 0, the desired inequality µ(A) ≥ λn(A) follows.
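the inequalities (9) force λn(Nk) = 0, ∀ k ≥ 1. Now if we define the set
N = A ∖ ⋃_{j=1}^∞ K_j, we have
\[ N = \bigcup_{k=1}^{\infty} \Bigl[(B_k \cap A) \setminus \bigcup_{j=1}^{\infty} K_j\Bigr] = \bigcup_{k=1}^{\infty} \Bigl[(B_k \cap A) \setminus \bigcup_{p=1}^{\infty} E_p\Bigr] \subset \bigcup_{k=1}^{\infty} \bigl[(B_k \cap A) \setminus E_k\bigr] = \bigcup_{k=1}^{\infty} N_k, \]
which proves that λn(N) = 0.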
The implication (ii) ⇒ (i) is trivial.
Proposition 6.2 does not hold if A ⊂ R^n is non-measurable. In fact the equality
(6), with λn replaced by λ∗n, essentially forces A to be measurable, as shown by the
following.
Exercise 2. Let A ⊂ R^n be an arbitrary subset, with λ∗n(A) < ∞. Prove that
the following are equivalent:
(i) A is Lebesgue measurable;
(ii) λ∗n (A) = sup{λn (K) : K compact subset of Rn , with K ⊂ A}.
Propositions 6.1 and 6.2 are regularity properties. The following terminology is
useful:
Definitions. Suppose A is a σ-algebra on X, and µ is a measure on A. Suppose
we have a sub-collection F ⊂ A.
(i) We say that µ is regular from below, with respect to F, if
µ(A) = sup{µ(F) : F ⊂ A, F ∈ F}.
With this terminology, Proposition 6.1 gives the fact that the Lebesgue measure is
regular from above with respect to open sets, while Proposition 6.2 gives the fact
that the Lebesgue measure is regular from below with respect to compact sets.
Exercise 3. For a subset A ⊂ Rn , prove that the following are equivalent:
(i) A is Lebesgue measurable;
(ii) There exist a sequence of compact sets (K_j)_{j=1}^∞, and a sequence of open
sets (D_j)_{j=1}^∞, such that ⋃_{j=1}^∞ K_j ⊂ A ⊂ ⋂_{j=1}^∞ D_j, and the difference
⋂_{j=1}^∞ D_j ∖ ⋃_{j=1}^∞ K_j is negligible.
Hint: For the implication (i) ⇒ (ii) analyze first the case when λ∗ (A) < ∞. Then write A as a
countable union of sets of finite outer measure.
In the one-dimensional case n = 1, the Lebesgue measure of open sets can be
computed with the aid of the following result.
Proposition 6.3. For every open set D ⊂ R, there exists a countable (or
finite) pair-wise disjoint collection {J_i}_{i∈I} of open intervals with D = ⋃_{i∈I} J_i.
Proof. For every point x ∈ D, we define
ax = inf{a < x : (a, x) ⊂ D} and bx = sup{b > x : (x, b) ⊂ D}.
(The fact that D is open guarantees the fact that both sets above are non-empty.)
It is clear that, for every x ∈ D, the open interval Jx = (ax, bx) is contained in D, so
we have the equality D = ⋃_{x∈D} Jx. The problem at this point is the fact that the
collection {Jx}_{x∈D} is not pair-wise disjoint. What we need to find is a countable
(or finite) subset X ⊂ D, such that the sub-collection {Jx}_{x∈X} is pair-wise disjoint,
and we still have D = ⋃_{x∈X} Jx. One way to do this is based on the following
Claim: For two points x, y ∈ D, the following are equivalent:
(i) x ∈ Jy;
(ii) Jx ⊃ Jy;
(iii) Jx ∩ Jy ≠ ∅;
(iv) Jx = Jy.
To prove the implication (i) ⇒ (ii) we observe that if x ∈ Jy, then ay < x < by,
so we have (ay, x) ⊂ D and (x, by) ⊂ D, which means that ax ≤ ay and bx ≥ by,
therefore we have the inclusion Jx = (ax, bx) ⊃ (ay, by) = Jy. The implication
(ii) ⇒ (iii) is trivial. To prove (iii) ⇒ (iv), assume Jx ∩ Jy ≠ ∅, and pick a point
z ∈ Jx ∩ Jy. Using the implication (i) ⇒ (ii) we have the inclusions Jz ⊃ Jx and
Jz ⊃ Jy. In particular we have x ∈ Jz, so again using the implication (i) ⇒ (ii) we
get Jx ⊃ Jz, which means that we have in fact the equality Jx = Jz. Likewise we
have the equality Jy = Jz, so (iv) follows. The implication (iv) ⇒ (i) is trivial.
Going back to the proof of the Proposition, we now see that, using the fact
that any open interval contains a rational number, if we put X0 = D ∩ Q, then
for any y ∈ D, there exists x ∈ X0, such that Jx = Jy. This gives the equality
D = ⋃_{x∈X0} Jx, this time with the indexing set X0 countable. Finally, we equip
the set X0 with the equivalence relation
x ∼ y ⇐⇒ Jx = Jy,
and we choose X ⊂ X0 to be a complete set of representatives for the equivalence
classes. This means that, for every y ∈ X0, there exists a unique x ∈ X with
Jx = Jy. It is clear now that we still have D = ⋃_{x∈X} Jx, but now if x, x′ ∈ X
are such that x ≠ x′, then x ≁ x′, so
we have Jx ≠ Jx′, which by the Claim gives Jx ∩ Jx′ = ∅.
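For an open set given as a finite union of open intervals, the components Jx from the proof can be computed by sorting and merging. The following is a minimal sketch under that finiteness assumption (the Proposition itself covers arbitrary open sets); the function name `components` is hypothetical.

```python
def components(intervals):
    """Disjoint open components of a finite union of non-empty open
    intervals, each given as a pair (lo, hi) with lo < hi."""
    merged = []
    for lo, hi in sorted(intervals):
        # Strict inequality: (0, 1) and (1, 2) do not merge, because the
        # point 1 belongs to neither interval, hence not to the union.
        if merged and lo < merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged

parts = components([(0, 2), (1, 3), (3, 4), (5, 6), (5.5, 7)])
```

Each returned pair plays the role of one interval Jx; two input intervals end up in the same pair exactly when they are linked by a chain of overlapping intervals.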
Comments. When we want to compute the Lebesgue measure of an open set
D ⊂ R, we should first try to write D = ⋃_{i∈I} J_i with (J_i)_{i∈I} a countable (or finite)
pair-wise disjoint collection of open intervals. If we succeed, then we would have
\[ \lambda(D) = \sum_{i \in I} \lambda(J_i). \]
For intervals (open or not) the Lebesgue measure is the same as the length.
There are instances when we can manage only to write a given open set D as a
union D = ⋃_{k=1}^∞ J_k, with the J’s not necessarily disjoint. In that case we can only
get the estimate
\[ \lambda(D) \le \sum_{k=1}^{\infty} \lambda(J_k). \]
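The estimate for non-disjoint unions can be arbitrarily far from the true value. As an illustration (an example chosen here, not taken from the text), take the nested intervals J_k = (0, 1 − 2^{−k}): their union is (0, 1), of measure 1, while the sum of the lengths grows without bound. The function name `sum_of_lengths` is hypothetical.

```python
from fractions import Fraction

def sum_of_lengths(k_max: int) -> Fraction:
    """Sum of the lengths of J_k = (0, 1 - 2**-k) for k = 1..k_max."""
    return sum((1 - Fraction(1, 2**k) for k in range(1, k_max + 1)), Fraction(0))

# The union of all J_k is (0, 1), so the measure of the union is 1,
# but the partial sums of the lengths exceed any prescribed bound.
partial = sum_of_lengths(10)
```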
Example 6.1. Consider the ternary Cantor set K3 ⊂ [0, 1], discussed in III.3.
We know (see Remarks 3.5) that one can find a pair-wise disjoint sequence (D_n)_{n=0}^∞ of open
subsets of (0, 1) such that K3 = [0, 1] ∖ ⋃_{n=0}^∞ D_n, and such that, for each n ≥ 0,
the open set D_n is a disjoint union of 2^n intervals of length 1/3^{n+1}. In particular,
this means that λ(D_n) = 2^n/3^{n+1}, so
\[ \lambda(K_3) = \lambda\bigl([0,1]\bigr) - \lambda\Bigl(\bigcup_{n=0}^{\infty} D_n\Bigr) = 1 - \sum_{n=0}^{\infty} \lambda(D_n) = 1 - \sum_{n=0}^{\infty} \frac{2^n}{3^{n+1}} = 0. \]
What is interesting here (see Remarks 3.5) is the fact that card K3 = c.
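The geometric series in Example 6.1 can be checked with exact rational arithmetic. The sketch below assumes only the formula λ(D_n) = 2^n/3^{n+1} from the example; the function names are hypothetical.

```python
from fractions import Fraction

def removed_length(n: int) -> Fraction:
    """Measure of D_n: 2**n intervals, each of length 1 / 3**(n + 1)."""
    return Fraction(2**n, 3**(n + 1))

def remaining_after(last_stage: int) -> Fraction:
    """Measure left in [0, 1] after removing D_0, ..., D_last_stage."""
    return 1 - sum(removed_length(n) for n in range(last_stage + 1))

# The remainder equals (2/3)**(last_stage + 1), which tends to 0.
remainders = [remaining_after(N) for N in range(5)]
```

The remainders form the sequence 2/3, 4/9, 8/27, ..., so the removed lengths sum to 1 and the Cantor set has measure zero.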
Remark 6.2. An interesting consequence of the above computation is the
fact that all subsets of K3 are Lebesgue measurable, i.e. one has the inclusion
P(K3 ) ⊂ m(R). This gives the inequality
card m(R) ≥ card P(K3) = 2^{card K3} = 2^c.
Since we also have m(R) ⊂ P(R), we get
card m(R) ≤ card P(R) = 2^{card R} = 2^c,
so using the Cantor-Bernstein Theorem we get the equality
card m(R) = 2^c.
We also know (see Corollary 2.5) that card Bor(R) = c.
As a consequence of this difference in cardinalities, one gets the fact that we
have a strict inclusion
(10) Bor(R) ⊊ m(R).
Later on we shall construct (more or less) explicitly a Lebesgue measurable set
which is not Borel.
Exercise 4. The strict inclusion (10) holds also if R is replaced with R^n,
with n ≥ 2. In this case, instead of using Cantor sets, one can proceed as follows.
Consider the set S = R^{n−1} × {0}. Prove that λn(S) = 0. Conclude that
card m(R^n) = 2^c.
One key feature of the Lebesgue (outer) measure is the translation invariance
property, described in the following result. To formulate it we introduce the follow-
ing notation. For an integer n ≥ 1, a point x ∈ Rn , and a subset A ⊂ Rn , we define
the set
A + x = {a + x : a ∈ A}.
Remark that the map Θx : R^n ∋ a ↦ a + x ∈ R^n is a homeomorphism. In
particular, both Θx and Θx^{−1} = Θ_{−x} are Borel measurable, which means that, for
a set A ⊂ R^n, one has the equivalence
A ∈ Bor(R^n) ⇐⇒ A + x ∈ Bor(R^n).
Proposition 6.4. Let n ≥ 1 be an integer. For any set A ⊂ Rn one has the
equality
λ∗n (A + x) = λ∗n (A).
Proof. Fix A and x. First remark that, for every half-open box B ∈ J_n, its
translation B + x is again a half-open box, and we have the equality
voln(B + x) = voln(B).
Fix for the moment ε > 0, and choose a sequence (B_k)_{k=1}^∞ ⊂ J_n, such that
A ⊂ ⋃_{k=1}^∞ B_k, and
\[ \sum_{k=1}^{\infty} \mathrm{vol}_n(B_k) \le \lambda_n^*(A) + \varepsilon. \]
Then, using the obvious inclusion A + x ⊂ ⋃_{k=1}^∞ (B_k + x), by the remark made at
the beginning of the proof, combined with the monotonicity and σ-subadditivity of
the outer Lebesgue measure, we have
\[ \lambda_n^*(A + x) \le \lambda_n^*\Bigl(\bigcup_{k=1}^{\infty} (B_k + x)\Bigr) \le \sum_{k=1}^{\infty} \lambda_n^*(B_k + x) = \sum_{k=1}^{\infty} \mathrm{vol}_n(B_k + x) = \sum_{k=1}^{\infty} \mathrm{vol}_n(B_k) \le \lambda_n^*(A) + \varepsilon. \]
Since the inequality λ∗n(A + x) ≤ λ∗n(A) + ε holds for all ε > 0, we get
λ∗n(A + x) ≤ λ∗n(A).
The other inequality follows from the above one applied to the set A + x and the
translation by −x.
Corollary 6.2. For a subset A ⊂ Rn , one has the equivalence
A ∈ m(Rn ) ⇐⇒ A + x ∈ m(Rn ).
Proof. Write A = B ∪ N, with B Borel, and N negligible. Then we have
A + x = (B + x) ∪ (N + x). The set B + x is Borel. By the above result we have
λ∗n(N + x) = λ∗n(N) = 0, i.e. N + x is negligible. Therefore A + x is Lebesgue
measurable.
As we have seen, the fact that there exist Lebesgue measurable sets that are
not Borel is explained by the difference in cardinalities. Since card m(Rn ) = 2c =
card P(Rn ), it is legitimate to ask whether the inclusion m(Rn ) ⊂ P(Rn ) is strict.
In other words, do there exist sets that are not Lebesgue measurable? The answer
is affirmative, as discussed in the following.
Example 6.2. Equip R with the equivalence relation
x ∼ y ⇐⇒ x − y ∈ Q.
Denote by R/Q the quotient space (this is in fact the quotient group of (R, +) with
respect to the subgroup Q), and denote by π : R → R/Q the quotient map. Since
for every x ∈ R, one can find some y ∼ x, with y ∈ [0, 1), it follows that the map
π|_{[0,1)} : [0, 1) → R/Q is surjective. Choose then a map φ : R/Q → [0, 1), such that
π ◦ φ = Id, and put E = φ(R/Q). The set E is a complete set of representatives for
the equivalence relation ∼. In other words, E ⊂ [0, 1) has the property that, for
every x ∈ R, there exists exactly one element y ∈ E, with x ∼ y. In particular, the
collection of sets (E + q)_{q∈Q} is pair-wise disjoint, and satisfies ⋃_{q∈Q} (E + q) = R.
Using σ-sub-additivity, we get
\[ \infty = \lambda(\mathbb{R}) \le \sum_{q \in \mathbb{Q}} \lambda^*(E + q). \]
Since (by Proposition 6.4) we have λ∗(E + q) = λ∗(E), the above inequality forces
λ∗(E) > 0.
Claim: The set E is not Lebesgue measurable.
Assume E is Lebesgue measurable. If we define the set X = Q ∩ [0, 1), then the
sets E + q, q ∈ X are pair-wise disjoint. On the one hand, the measurability
of E, combined with Corollary 6.2, would imply the measurability of the set
S = ⋃_{q∈X} (E + q). On the other hand, since X is infinite, the equalities
λ(E + q) = λ(E) > 0 will force, by σ-additivity, λ(S) = ∞. But this is impossible,
since we obviously have S ⊂ [0, 2), which forces λ(S) ≤ 2.
Exercise 5. Let E ∈ m(Rn ). Prove that the map
R^n ∋ x ↦ λ(E ∪ (E + x)) ∈ [0, ∞]
is continuous.
Hint: Analyze first the case when E is compact. In this particular case, show that for every
x0 ∈ Rn and every open set D ⊃ E ∪ (E + x0 ), there exists some neighborhood V of x0 , such that
D ⊃ E ∪ (E + x), ∀ x ∈ V.
Use then regularity from above, combined with the inequality⁹
|λ(A) − λ(B)| ≤ λ(A △ B), for all A, B ∈ m(R^n), with λ(A), λ(B) < ∞.
In the general case, use regularity from below. (The case λ(E) = ∞ is trivial.)
Exercise 6. Let E ∈ m(Rn ), be such that λn (E) > 0. Prove that the set
E − E = {x − y : x, y ∈ E}
is a neighborhood of 0.
Hint: Assume the contrary, which means that there exists a sequence (x_p)_{p=1}^∞ ⊂ R^n ∖ (E − E),
with lim_{p→∞} x_p = 0. This will force E ∩ (E + x_p) = ∅, ∀ p ≥ 1. Use the preceding Exercise to
get a contradiction.
We are now in position to construct a Lebesgue measurable set which is not
Borel.
Example 6.3. In Section 3 we discussed the compact space T = {0, 1}^{ℵ0} and
the maps
\[ \phi_r : T \ni (\alpha_n)_{n=1}^{\infty} \longmapsto (r - 1) \sum_{n=1}^{\infty} \frac{\alpha_n}{r^n} \in [0, 1]. \]
For each r ≥ 2 the map φr : T → [0, 1] is continuous so the set Kr = φr (T ) is
compact. We have K2 = [0, 1], and K3 is the ternary Cantor set. We also know
(see Theorem 3.5) that, for a set A ⊂ T , one has the equivalence
(11) A ∈ Bor(T ) ⇐⇒ φr (A) ∈ Bor(Kr ).
Choose now a set E ⊂ [0, 1] which is not Lebesgue measurable. In particular, E is
not Borel, so E ∉ Bor([0, 1]). Since φ2 : T → [0, 1] is surjective, by (11) it follows
that the set A = φ2^{−1}(E) is not in Bor(T). Again, by (11) it follows that the set
S = φ3(A) is not in Bor(K3). Since
Bor(K3) = Bor(R)|_{K3}
9 This inequality holds for any additive map defined on a ring.
Exercise 7. Start with the interval [0, 1], and list all rational numbers
in [0, 1] as a sequence Q ∩ [0, 1] = {x_n}_{n=1}^∞. Fix some ε > 0, and consider the open
set
\[ D = \bigcup_{n=1}^{\infty} \Bigl(x_n - \frac{\varepsilon}{2^{n+1}},\; x_n + \frac{\varepsilon}{2^{n+1}}\Bigr). \]
Consider the compact set K = [0, 1] ∖ D.
(i) Prove that λ(D) ≤ ε.
(ii) Prove that λ(K) ≥ 1 − ε.
(iii) Prove that Int(K) = ∅.
Hint: For (iii) use the fact that K ∩ Q = ∅.
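The size of the set D in Exercise 7 can be explored numerically for finite partial unions. The sketch below makes two assumptions that are not in the text: the rationals are enumerated by increasing denominator (any enumeration would do), and only the first 50 intervals are used. The helper names are hypothetical.

```python
from fractions import Fraction
from math import gcd

def rationals_in_unit_interval(count):
    """First `count` rationals of [0, 1], listed by increasing denominator."""
    out = [Fraction(0), Fraction(1)]
    q = 2
    while len(out) < count:
        out.extend(Fraction(p, q) for p in range(1, q) if gcd(p, q) == 1)
        q += 1
    return out[:count]

def union_length(intervals):
    """Lebesgue measure of a finite union of open intervals (lo, hi)."""
    total, reach = Fraction(0), None
    for lo, hi in sorted(intervals):
        if reach is None or lo >= reach:   # starts a new disjoint piece
            total += hi - lo
            reach = hi
        elif hi > reach:                   # extends the current piece
            total += hi - reach
            reach = hi
    return total

eps = Fraction(1, 10)
xs = rationals_in_unit_interval(50)
pieces = [(x - eps / 2**(n + 1), x + eps / 2**(n + 1))
          for n, x in enumerate(xs, start=1)]
measure_partial = union_length(pieces)          # measure of the partial union
length_sum = sum(hi - lo for lo, hi in pieces)  # subadditive upper bound
```

The measure of every partial union is bounded by the sum of the lengths, which stays below ε; this is the finite shadow of the estimate requested in part (i).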
Exercise 8*. Prove that, for every non-empty open set D ⊂ R, and any two
positive numbers α, β with α + β < λ(D), there exist compact sets A, B ⊂ D, with
λ(A) > α, λ(B) > β, such that A ∩ B = ∅ and (A ∪ B) ∩ Q = ∅.
Hint: Write D as a union of a pair-wise disjoint sequence (J_n)_{n=1}^∞ of open intervals, so that
λ(D) = Σ_{n=1}^∞ λ(J_n). Find then two sequences (α_n)_{n=1}^∞ and (β_n)_{n=1}^∞ of positive numbers, such
that Σ_{n=1}^∞ α_n > α, Σ_{n=1}^∞ β_n > β, and α_n + β_n < λ(J_n), for all n ≥ 1. This reduces essentially
the problem to the case when D is an open interval, for which one can use the construction
outlined in Exercise 7.
Exercise 9*. Construct a Borel set A ⊂ R, such that, for every open interval
I ⊂ R one has λ(I ∩ A) > 0 and λ(I ∖ A) > 0.
Hints: List all open intervals with rational endpoints as a sequence (I_n)_{n=1}^∞. Start off (use Exercise
8) by choosing two compact sets A1, B1 ⊂ I1, with A1 ∩ B1 = ∅, (A1 ∪ B1) ∩ Q = ∅,
and λ(A1), λ(B1) > 0. Use Exercise 5 to construct two sequences (A_n)_{n=1}^∞ and (B_n)_{n=1}^∞ of
compact sets, such that, for all n ≥ 1 we have: (i) An ∩ Bn = ∅; (ii) (An ∪ Bn) ∩ Q = ∅;
(iii) λ(An), λ(Bn) > 0; (iv) A_{n+1} ∪ B_{n+1} ⊂ I_{n+1} ∖ ⋃_{k=1}^n (A_k ∪ B_k). Put A = ⋃_{n=1}^∞ A_n and
B = ⋃_{n=1}^∞ B_n. Notice that A ∩ B = ∅, λ(A), λ(B) > 0, and λ(A ∩ In), λ(B ∩ In) > 0, ∀ n ≥ 1.
In the remainder of this section we discuss some applications of the Lebesgue
measure to the theory of Riemann integration. The following technical result will
be very useful.
Lemma 6.1. Let f : [a, b] → R be a non-negative Riemann integrable function,
let A, B ⊂ [a, b] be two disjoint sets, with A ∪ B = [a, b]. Then one has the estimates
\[ \lambda^*(A) \cdot \inf_{z \in A} f(z) \le \int_a^b f(t)\,dt \le (b - a) \cdot \sup_{x \in A} f(x) + \lambda^*(B) \cdot \sup_{y \in B} f(y). \]
Recall first that, for each partition ∆ = (a = x0 < x1 < · · · < xn = b) of [a, b],
we define the lower and the upper Darboux sums of f with respect to ∆:
\[ L(\Delta, f) = \sum_{k=1}^{n} (x_k - x_{k-1}) \cdot \inf_{t \in [x_{k-1}, x_k]} f(t), \]
\[ U(\Delta, f) = \sum_{k=1}^{n} (x_k - x_{k-1}) \cdot \sup_{t \in [x_{k-1}, x_k]} f(t), \]
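The Darboux sums just recalled are easy to evaluate when the infimum and supremum on each sub-interval are known. The sketch below assumes a non-decreasing function, so that they are attained at the left and right endpoints respectively; this assumption, the test function t ↦ t², and the helper names are illustrative choices, not part of the text.

```python
from fractions import Fraction

def uniform_partition(a, b, n):
    """Partition a = x_0 < x_1 < ... < x_n = b with equal steps."""
    return [a + (b - a) * Fraction(k, n) for k in range(n + 1)]

def darboux_sums_monotone(f, partition):
    """Lower and upper Darboux sums for a NON-DECREASING f, for which
    inf f = f(left endpoint) and sup f = f(right endpoint) on each piece."""
    pieces = list(zip(partition, partition[1:]))
    lower = sum((right - left) * f(left) for left, right in pieces)
    upper = sum((right - left) * f(right) for left, right in pieces)
    return lower, upper

def square(t):
    return t * t

lower, upper = darboux_sums_monotone(
    square, uniform_partition(Fraction(0), Fraction(1), 100))
```

For this example the gap between the upper and the lower sum telescopes to (f(1) − f(0))/100, and both sums bracket the integral 1/3.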
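so we get
\[ L(\Delta, f) \le \sum_{k \in S} (x_k - x_{k-1}) \cdot \alpha + \sum_{k \notin S} (x_k - x_{k-1}) \cdot \beta \tag{13} \]
\[ U(\Delta, f) \ge \sum_{k \in S} (x_k - x_{k-1}) \cdot \gamma \tag{14} \]
Since the intervals involved in both M and N have at most singleton overlaps, it
follows that we have the equalities
\[ \sum_{k \in S} (x_k - x_{k-1}) = \lambda(M) \quad\text{and}\quad \sum_{k \notin S} (x_k - x_{k-1}) = \lambda(N), \]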
Proof. Since f is bounded, there exists some constant C > 0, such that the
Riemann integrable functions C + f and C − f are both non-negative. Apply Lemma
6.1 to these two functions with A = [a, b] ∖ N and B = N. Since f|_{[a,b]∖N} = 0, we
get (C ± f)|_{[a,b]∖N} = C, so we get
\[ \int_a^b [C \pm f(x)]\,dx \le (b - a) \cdot C, \]
which yields
\[ \pm \int_a^b f(x)\,dx = \int_a^b \bigl([C \pm f(x)] - C\bigr)\,dx = \int_a^b [C \pm f(x)]\,dx - (b - a) \cdot C \le 0, \]
f1 r f2, a.e.
if the set
A = {x ∈ [a, b] : f1(x) r f2(x)}
has negligible complement in [a, b], i.e. λ∗([a, b] ∖ A) = 0. The abbreviation “a.e.”
Proof. From the definition of Riemann integrability, we know that (i) is equiv-
alent to any of the following two conditions
(ii’) inf{U(∆, f) − L(∆, f) : ∆ partition of [a, b]} = 0;
(iii’) there exists a sequence (∆_p)_{p=1}^∞ of partitions of [a, b], with ∆1 ⊂ ∆2 ⊂ . . . ,
and lim_{p→∞} [U(∆p, f) − L(∆p, f)] = 0.
Then the Proposition follows immediately from the fact that, for every partition ∆
one has the equalities
\[ \int_a^b f_{\Delta}(x)\,dx = L(\Delta, f) \quad\text{and}\quad \int_a^b f^{\Delta}(x)\,dx = U(\Delta, f). \]
Fix for the moment ε > 0. Since f is continuous at y, there exists some δε > 0,
such that
(21) |f (z) − f (y)| < ε, ∀ z ∈ (y − δε , y + δε ) ∩ [a, b].
Choose now q ≥ 1, such that |∆q | < δε . Write ∆q = (a = x0 < x1 < · · · < xn = b).
Using the fact that y ∉ ∆q, we can find k ∈ {1, . . . , n} such that y ∈ (x_{k−1}, x_k).
Since x_k − x_{k−1} < δε, we have the inclusion [x_{k−1}, x_k] ⊂ (y − δε, y + δε), so by (21)
we immediately get
\[ f(y) \le f^{\Delta_q}(y) = \sup_{z \in [x_{k-1}, x_k]} f(z) \le f(y) + \varepsilon; \]
such that E ⊃ Df ∪ S, and λ(E) < ε. Define the compact set A = [a, b] r E, and
put B = [a, b] ∩ E. We clearly have
(22) λ(B) ≤ λ(E) < ε.
Define the sequence (h_p)_{p=1}^∞ by h_p = f^{∆p} − f_{∆p}. Since A ∩ ∆p = ∅, it follows that
h_p|_A is continuous, for each p ≥ 1. Since A ∩ (Df ∪ S) = ∅, by Claim 3, we know
that lim_{p→∞} h_p(y) = 0, ∀ y ∈ A. Since (h_p)_{p=1}^∞ is monotone, by Dini’s Theorem
(see ??) it follows that
\[ \lim_{p \to \infty} \max_{y \in A} h_p(y) = 0. \]
Using Lemma 6.1 for h_{pε} and the sets A and B, combined with (22), we have
\[ \int_a^b h_{p_\varepsilon}(x)\,dx \le (b - a) \cdot \sup_{y \in A} h_{p_\varepsilon}(y) + \lambda^*(B) \cdot \sup_{z \in B} h_{p_\varepsilon}(z) \le \varepsilon (b - a) + \lambda^*(B)(M - m) \le \varepsilon (b - a + M - m). \]
Since h_{pε} ≥ h_p ≥ 0, for all p ≥ pε, we get the inequalities
\[ 0 \le \int_a^b h_p(x)\,dx \le \varepsilon (b - a + M - m), \quad \forall\, p \ge p_\varepsilon. \]
The above argument proves that lim_{p→∞} ∫_a^b h_p(x) dx = 0, i.e.
\[ \lim_{p \to \infty} \int_a^b \bigl[f^{\Delta_p}(x) - f_{\Delta_p}(x)\bigr]\,dx = 0. \]
Hint: Consider the function g : [a, b] → R defined by g(x) = max{f (x), 1}. Then f ≥ g ≥ κ M ,
and g is still Riemann integrable. Apply Lemma 6.1 (the first inequality) to the function 1 − g.
Exercise 17*. Let f : [a, b] → R be a bounded function. Prove that the following
are equivalent:
(i) f is Riemann integrable;
(ii) for every ε > 0, there exist continuous functions g, h : [a, b] → R with
g ≥ f ≥ h, and ∫_a^b [g(x) − h(x)] dx < ε;
(iii) for every ε > 0, there exist Riemann integrable functions g, h : [a, b] → R
with g ≥ f ≥ h, and ∫_a^b [g(x) − h(x)] dx < ε.
Hints: For the implication (i) ⇒ (ii) analyze first the particular case when f = κ_J, with J
a sub-interval of [a, b]. Then analyze the functions of the type f^∆ and f_∆. For the implication
(iii) ⇒ (i), analyze the relationship among lower/upper Darboux sums of f, g and h.
Comment. The statement of Theorem 6.1 shows that, apart from trivial
cases, the problem of checking that a function f : [a, b] → R is Riemann integrable,
is a rather difficult one. The main difficulty arises from the fact that, if N ⊂ [a, b]
is a negligible set, and f|_{[a,b]∖N} is continuous, then f need not be continuous
at all points in [a, b] ∖ N. For instance, if we consider the characteristic function
f = κ_{Q∩[a,b]} of the rationals in [a, b], and N = Q ∩ [a, b], then clearly N is negligible,
f|_{[a,b]∖N} is continuous (because it is constant zero), but Df = [a, b].
As earlier suggested, in the hope that such an anomaly can be eliminated, it is
reasonable to consider the slightly weaker notion of almost Riemann integrability.
In the remainder of this section, we take a closer look at this notion, and we will
eventually show (see Theorem 6.2) that this indeed removes the above anomaly.
We begin with an “almost” version of Exercise 17.
Lemma 6.2. For a function f : [a, b] → R, the following are equivalent:
(i) f is almost Riemann integrable;
(ii) for every ε > 0, there exist continuous functions g, h : [a, b] → R with
g ≥ f ≥ h a.e., and ∫_a^b [g(x) − h(x)] dx < ε;
(iii) for every ε > 0, there exist Riemann integrable functions g, h : [a, b] → R
with g ≥ f ≥ h a.e., and ∫_a^b [g(x) − h(x)] dx < ε.
Since f = g, a.e., the set M is negligible, and so is the set N = M ∪ Dg. On
the one hand, since Dg ⊂ N, the restriction g|_{[a,b]∖N} is continuous. On the other
hand, since M ⊂ N, we have f|_{[a,b]∖N} = g|_{[a,b]∖N}, so (ii) follows.
(ii) ⇒ (i). We are going to imitate the proof of Theorem 6.1, with some minor
modifications. Fix N ⊂ [a, b] negligible, such that f|_{[a,b]∖N} is continuous. Fix
also a sequence (∆_p)_{p=1}^∞ of partitions, with ∆1 ⊂ ∆2 ⊂ . . . , and lim_{p→∞} |∆p| = 0.
Put S = ⋃_{p=1}^∞ ∆_p. Since S is countable, the set N ∪ S is still negligible. We put
T = [a, b] ∖ (N ∪ S), and we define the analogues of the functions f^{∆p} and f_{∆p} as
follows. Write each partition as ∆p = (a = x_0^p < x_1^p < · · · < x_{n_p}^p = b), and define,
for each k ∈ {1, . . . , n_p}, the numbers
M_k^p = sup{f(t) : t ∈ [x_{k−1}^p, x_k^p] ∩ T} and m_k^p = inf{f(t) : t ∈ [x_{k−1}^p, x_k^p] ∩ T}.
Fix some ε > 0, and use regularity from above, to find an open set D with D ⊃ N ∪ S
and λ(D) < ε. Take the compact set A = [a, b] ∖ D. Note that f|_A is continuous,
since A ⊂ [a, b] ∖ N. Note also that, since A ⊂ [a, b] ∖ S, the functions g^p|_A and
g_p|_A are also continuous, and so will be h_p|_A, for every p ≥ 1. Since (g^p(x))_{p=1}^∞
is non-increasing, and (g_p(x))_{p=1}^∞ is non-decreasing, for all x, it follows that the
sequence (h_p)_{p=1}^∞ is monotone, so by Dini’s Theorem, (25) gives
\[ \lim_{p \to \infty} \max_{x \in A} h_p(x) = 0. \]
The following exercise shows how the lack of regularity can always be repaired.
Exercise 2. Let X be a locally compact space, and let ω be a content on X.
Define ω̆ : CX → [0, ∞), by
ω̆(K) = inf{ω(L) : L ∈ CX, Int(L) ⊃ K}, ∀ K ∈ CX.
Prove that:
(i) ω̆ is a regular content on X;
(ii) ω̆(K) ≥ ω(K), ∀ K ∈ CX ;
(iii) if η is a regular content on X, with η(K) ≥ ω(K), ∀ K ∈ CX , then
η(K) ≥ ω̆(K), ∀ K ∈ CX ;
(iv) ω is regular, if and only if ω̆ = ω.
Definition. With the notations from Exercise 2, the regular content ω̆ is called
the regularization of ω.
Theorem 7.1. Let X be a locally compact space, and let ω be a content on X.
Denote by TX the collection of all open subsets of X. Define the map ω̂ : TX →
[0, ∞] by
ω̂(D) = sup{ω(K) : K ∈ CX, K ⊂ D}, ∀ D ∈ TX,
Since we have
\[ \omega(K) \le \sum_{n=1}^{\infty} \hat{\omega}(D_n), \quad\text{for all } K \in \mathcal{C}_X \text{ with } K \subset D, \]
Proof. Using Remark 7.1.C, we can assume that ω is regular, and in this case
we need to prove that ω∗|_{CX} = ω. Start with some compact set K ⊂ X. By the
definition of ω∗, using the notations from Theorem 7.1, we know that
(2) ω∗(K) = inf{ω̂(D) : D ∈ TX, D ⊃ K}.
It is clear that, for every open set D ⊃ K, we have the inequality
ω̂(D) ≥ ω(K),
so taking infimum in the left hand side, and using (2), immediately gives the in-
equality
ω ∗ (K) ≥ ω(K).
To prove the reverse inequality, we start by fixing ε > 0, and we use regularity to
find some compact set L with K ⊂ Int(L), and ω(L) ≤ ω(K) + ε. Consider the
open set D = Int(L). On the one hand, for every compact set F ⊂ D, we have
the obvious inclusion F ⊂ L, which gives ω(F) ≤ ω(L). Taking supremum over all
compact sets F ⊂ D then gives ω̂(D) ≤ ω(L). By the choice of L, by the definition
of ω ∗ , and using the inclusion D ⊃ K, we then get
ω ∗ (K) ≤ ω̂(D) ≤ ω(L) ≤ ω(K) + ε.
Since the inequality
ω ∗ (K) ≤ ω(K) + ε,
holds for all ε > 0, we then must have ω ∗ (K) ≤ ω(K).
The above result gives a nice characterization for the regularity of a content,
in terms of the induced outer measure.
Corollary 7.1. Let X be a locally compact space. A content ω is regular, if
and only if ω∗|_{CX} = ω.
We now proceed with the proof of (3) for arbitrary A’s. Fix A, and consider
an arbitrary open set E ⊃ A. By Claim 2, we have
ω∗(E) ≥ ω∗(E ∩ D) + ω∗(E ∖ D).
Using the obvious inclusions E ∩ D ⊃ A ∩ D and E ∖ D ⊃ A ∖ D, we then get
ω∗(E) ≥ ω∗(A ∩ D) + ω∗(A ∖ D).
The desired inequality (3) follows now by taking infimum in the left hand side, and
using Remark 7.1.B.
The most important consequence of Theorem 7.2 is the following.
Corollary 7.2. Let X be a locally compact space, and let ω be a regular
content on X. Then ω can be extended uniquely to a measure µω on Bor(X), with
the following properties.
(i) µω is regular from above, with respect to the collection TX of all open sets,
that is
µω (B) = inf{µω (D) : D ∈ TX , D ⊃ B}, ∀ B ∈ Bor(X);
Conversely, if µ is a measure on
Bor(X) with properties (i) and (ii), and such that
µ(K) < ∞, ∀ K ∈ CX , then µ|_{CX} is a regular content.
To prove the converse, we use property (i), to find, for each ε > 0, an open set
Dε ⊃ K, such that µ(Dε ) ≤ µ(K) + ε. If we choose, for each ε > 0, a compact set Lε ,
such that
K ⊂ Int(Lε ) ⊂ Lε ⊂ Dε ,
then we obviously have
µ(K) ≤ µ(Lε ) ≤ µ(Dε ) ≤ µ(K) + ε,
so we get the inequality
inf{ω(L) : L ∈ CX , K ⊂ Int(L)} ≤ ω(K) + ε.
Since this holds for all ε > 0, we get in fact the inequality
inf{ω(L) : L ∈ CX , K ⊂ Int(L)} ≤ ω(K),
and since µ(K) ≥ max{µ1 (K), µ2 (K)}, ∀ K ∈ CX , the equality (5) immediately
follows. Suppose now µ(D) < ∞, which is equivalent to the fact that
µ1 (D), µ2 (D) < ∞. Denote the right hand side of (5) by ν(D). For every ε > 0,
using the fact that µ1 and µ2 are Radon measures, we can find two compact sets
K1ε , K2ε ⊂ D, such that µ1 (K1ε ) ≥ µ1 (D) − ε/2 and µ2 (K2ε ) ≥ µ2 (D) − ε/2. Of course,
the compact set Kε = K1ε ∪ K2ε is still a subset of D, and satisfies
µ1 (Kε ) ≥ µ1 (K1ε ) ≥ µ1 (D) − ε/2,
µ2 (Kε ) ≥ µ2 (K2ε ) ≥ µ2 (D) − ε/2,
so we get µ(Kε ) = µ1 (Kε ) + µ2 (Kε ) ≥ µ1 (D) + µ2 (D) − ε = µ(D) − ε. This
proves that ν(D) ≥ µ(D) − ε, and since this inequality is true for all ε > 0, we get
ν(D) ≥ µ(D). The inequality ν(D) ≤ µ(D) is trivial.
We now show that µ satisfies condition (iii). Fix some set A ∈ Bor(X), and
let us prove that
(6) µ(A) = inf{µ(D) : D ∈ TX , D ⊃ A}.
If µ(A) = ∞, there is nothing to prove. Suppose now µ(A) < ∞, which is equivalent
to the fact that µ1 (A), µ2 (A) < ∞. Denote the right hand side of (6) by λ(A). For
every ε > 0, using the fact that µ1 and µ2 are Radon measures, we can find two
open sets D1ε , D2ε ⊃ A, such that µ1 (D1ε ) ≤ µ1 (A) + ε/2 and µ2 (D2ε ) ≤ µ2 (A) + ε/2.
Then the open set Dε = D1ε ∩ D2ε still contains A, and satisfies
µ1 (Dε ) ≤ µ1 (D1ε ) ≤ µ1 (A) + ε/2,
µ2 (Dε ) ≤ µ2 (D2ε ) ≤ µ2 (A) + ε/2,
so we get µ(Dε ) = µ1 (Dε ) + µ2 (Dε ) ≤ µ1 (A) + µ2 (A) + ε = µ(A) + ε. This
proves that λ(A) ≤ µ(A) + ε, and since this inequality is true for all ε > 0, we get
λ(A) ≤ µ(A). The inequality λ(A) ≥ µ(A) is trivial.
Radon measures are also functorial with respect to proper maps, in the following
sense.
Proposition 7.3. Let X and Y be locally compact spaces, let Φ : X → Y
be a proper continuous map, and let µ be a Radon measure on X. Then the map
ν : Bor(Y ) → [0, ∞], defined by
ν(B) = µ(Φ−1 (B)), ∀ B ∈ Bor(Y ),
is a Radon measure on Y .
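The formula ν(B) = µ(Φ−1 (B)) can be tried out on a toy case. The following is only a sketch, under the assumption that µ is a finitely supported measure on the integers (discrete topology), stored as a dictionary of point masses, and that Φ(x) = x², which is proper there because preimages of finite sets are finite. The names `pushforward` and `measure` are illustrative and do not come from the text.

```python
from collections import defaultdict


def pushforward(mu, phi):
    """Return nu with nu({y}) = mu(phi^{-1}({y})) for a finitely supported mu."""
    nu = defaultdict(float)
    for x, mass in mu.items():
        nu[phi(x)] += mass  # every point x contributes its mass to phi(x)
    return dict(nu)


def measure(weights, subset):
    """Total mass that the point masses in `weights` give to `subset`."""
    return sum(m for p, m in weights.items() if p in subset)


# Hypothetical data: point masses on five integers.
mu = {-2: 0.5, -1: 1.0, 0: 2.0, 1: 1.5, 2: 0.25}
phi = lambda x: x * x
nu = pushforward(mu, phi)

# Check nu(B) = mu(phi^{-1}(B)) for one Borel set B.
B = {1, 4}
preimage_B = {x for x in mu if phi(x) in B}
assert measure(nu, B) == measure(mu, preimage_B)
```

In this finite setting every set is compact, so the regularity conditions hold trivially; the sketch illustrates only the defining formula.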
Proof. First of all, remark that since Φ is continuous, it is Borel measurable,
which means that
Φ−1 (B) ∈ Bor(X), ∀ B ∈ Bor(Y ).
Secondly, by the well known properties of measures, the map ν is a measure.
We now check that ν is a Radon measure. First of all, if K ⊂ Y is compact,
then using the fact that Φ is proper, it means that Φ−1 (K) is compact in X, so we
clearly get
ν(K) = µ(Φ−1 (K)) < ∞.
To prove that ν satisfies condition (ii), start with some open set D ⊂ Y ,
and let us find a sequence (Ln )_{n=1}^∞ of compact subsets of D, such that ν(D) =
lim_{n→∞} ν(Ln ). The set Φ−1 (D) is open, so there exists a sequence (Kn )_{n=1}^∞ of
compact subsets of Φ−1 (D), with
(7) ν(D) = µ(Φ−1 (D)) = lim_{n→∞} µ(Kn ).
If we define the subsets Ln = Φ(Kn ), then (Ln )n≥1 is a sequence of compact subsets
of D, and the inclusion Kn ⊂ Φ−1 (Ln ) immediately gives ν(D) ≥ ν(Ln ) ≥ µ(Kn ),
so by (7) we also get ν(D) = lim_{n→∞} ν(Ln ).
To prove condition (iii) start with some arbitrary subset B ∈ Bor(Y ), and let
us find a sequence (En )_{n=1}^∞ of open subsets of Y , such that ν(B) = lim_{n→∞} ν(En ), and
En ⊃ B, ∀ n ≥ 1. Use the fact that µ is a Radon measure, to find a sequence
(Dn )_{n=1}^∞ of open subsets of X, such that
is a Radon measure on T .
Proof. Recall that the fact that (θ, T ) is a compactification of X means that
• T is a compact Hausdorff space;
• θ : X → T is continuous;
• θ(X) is open and dense in T ;
• θ : X → θ(X) is a homeomorphism.
Without any loss of generality, we can assume that X is a dense open subset of T ,
and θ is the inclusion map. With this convention, the map ν is defined by
(9) ν(B) = µ(B ∩ X), ∀ B ∈ Bor(T ).
(i) ⇒ (ii). Assume µ(X) < ∞. It is clear that ν is a finite measure on Bor(T ),
and in fact we have ν(T ∖ X) = 0.
The fact that ν(K) < ∞, for every compact subset K ⊂ T is of course trivial.
We now check the second condition in the definition. Fix some open subset
D ⊂ T , and let us show that
ν(D) = sup{ν(K) : K compact, K ⊂ D}.
All we need is a sequence (Kn )_{n=1}^∞ of compact subsets of D, with lim_{n→∞} ν(Kn ) =
ν(D). To get this sequence we simply use the fact that D ∩ X is open (in X), so we
can find a sequence (Kn )_{n=1}^∞ of compact subsets of D ∩ X, with lim_{n→∞} µ(Kn ) =
µ(D ∩ X) = ν(D). Now we are done, because the fact that Kn ⊂ X, gives µ(Kn ) =
ν(Kn ), ∀ n ≥ 1.
We now check the third condition in the definition. Fix some set B ∈ Bor(T ),
and let us show that
ν(B) = inf{ν(D) : D ⊂ T open, D ⊃ B}.
All we need is a sequence (Dn )_{n=1}^∞ of open subsets of T , with Dn ⊃ B, ∀ n ≥ 1,
and lim_{n→∞} ν(Dn ) = ν(B). Start off by choosing a sequence (Kn )_{n=1}^∞ of compact
subsets of X, such that lim_{n→∞} µ(Kn ) = µ(X). We then get lim_{n→∞} µ(X ∖ Kn ) = 0
(the condition that µ(X) < ∞ is essential here). If we then define the open sets
An = T ∖ Kn , then we will have ν(An ) = µ(An ∩ X) = µ(X ∖ Kn ), ∀ n ≥ 1, so we
have
(10) lim_{n→∞} ν(An ) = 0.
where one uses the summation conventions discussed in II.2. (The sum in
the right hand side is defined as the supremum of all finite sums.)
(v) If B ∈ Bor(X) has uncountable support SB , then µ(B) = ∞.
For Radon measures, the lack of regularity from below, with respect to compact
sets, is somehow compensated by the following result (compare with Exercise 2 from
Section 6).
Lemma 7.1. Let X be a locally compact space, let µ be a Radon measure on
X, and let µ∗ be the maximal outer extension of µ. For a subset A ⊂ X, with
µ∗ (A) < ∞, the following are equivalent
(i) A is µ∗ -measurable;
(ii) µ∗ (A) = sup{µ(K) : K ∈ CX , K ⊂ A};
(iii) there exists a sequence (Kn )_{n=1}^∞ of compact subsets of A, such that
µ∗ (A ∖ ⋃_{n=1}^∞ Kn ) = 0.
Proof. (i) ⇒ (ii). Suppose A is µ∗ -measurable, and let us prove the equality
(ii). Denote the right hand side of (ii) simply by ν(A). It is obvious, by the
monotonicity of µ∗ , and the fact that µ∗|_{Bor(X)} = µ, that we have the inequality
µ∗ (A) ≥ ν(A). To prove the other inequality we fix for the moment some ε > 0.
Using (13), there exists an open set D ⊃ A, such that µ(D) ≤ µ∗ (A) + ε. Use
property (ii) in the definition of Radon measures, to find some compact set L ⊂ D
such that
µ(D) ≤ µ(L) + ε.
Since µ(D) = µ(D ∖ L) + µ(L), and µ(L) ≤ µ(D) < ∞, this inequality gives
µ(D ∖ L) ≤ ε,
which, combined with the obvious inclusion A ∖ L ⊂ D ∖ L, yields
(15) µ∗ (A ∖ L) ≤ µ∗ (D ∖ L) = µ(D ∖ L) ≤ ε.
Using (13) we can also find an open set E ⊃ L ∖ A, such that
(16) µ(E) ≤ µ∗ (L ∖ A) + ε.
Since L ∖ A is µ∗ -measurable, we have µ(E) = µ∗ (E) = µ∗ (E ∖ (L ∖ A)) + µ∗ (L ∖ A).
Since K ⊂ A, we get
µ∗ (A) ≤ µ∗ (A ∖ K) + µ∗ (K) ≤ 2ε + µ(K) ≤ 2ε + ν(A).
Since the inequality µ∗ (A) ≤ 2ε + ν(A) holds for all ε > 0, we get µ∗ (A) ≤ ν(A),
so (ii) follows.
(ii) ⇒ (iii). Assume A satisfies (ii), and let us show that A has property (iii).
For every integer n ≥ 1, we use (ii) to find a compact set Kn ⊂ A, such that
(18) µ∗ (A) ≤ µ(Kn ) + 1/n.
On the one hand, we have the inclusions A ∖ ⋃_{n=1}^∞ Kn ⊂ A ∖ Kp , which give
(19) µ∗ (A ∖ ⋃_{n=1}^∞ Kn ) ≤ µ∗ (A ∖ Kp ), ∀ p ≥ 1.
⋃_{n=1}^∞ Kn ⊂ A ⊂ ⋂_{n=1}^∞ Dn and µ(⋂_{n=1}^∞ Dn ∖ ⋃_{n=1}^∞ Kn ) = 0.
(The condition that A is µ∗ -σ-finite means that there exists a sequence (An )_{n=1}^∞ of
subsets of X, with A = ⋃_{n=1}^∞ An , and µ∗ (An ) < ∞, for all n ≥ 1.)
Proof. Follow the first part of the proof of (i) ⇒ (ii) to find a sequence
(Kn )_{n=1}^∞ of compact subsets of A, such that
µ∗ (A ∖ ⋃_{n=1}^∞ Kn ) = 0.
Since ⋃_{n=1}^∞ Kn is µ∗ -measurable, this forces the equality
µ∗ (A) = µ∗ (⋃_{n=1}^∞ Kn ) = lim_{n→∞} µ∗ (K1 ∪ · · · ∪ Kn ).
Exercise 5*. Let X be a locally compact space, and let µ be a Radon measure on
X. Suppose ν : Bor(X) → [0, ∞] is a measure satisfying the following conditions:
(a) ν(B) ≤ µ(B), ∀ B ∈ Bor(X);
(b) for every B ∈ Bor(X), one has the implication ν(B) < ∞ ⇒ µ(B) < ∞.
Prove that ν is a Radon measure on X. (Notice that, in the case when µ is finite,
the condition (b) is superfluous.)
Hints: To prove condition (ii) in the definition of Radon measures, start with some open set
D ⊂ X, and choose a sequence K1 ⊂ K2 ⊂ · · · ⊂ D of compact subsets, such that
lim_{n→∞} µ(Kn ) = µ(D),
and define the Borel set B = ⋃_{n=1}^∞ Kn ⊂ D. Notice that we have the equalities µ(B) =
lim_{n→∞} µ(Kn ) and ν(B) = lim_{n→∞} ν(Kn ). Argue that, when ν(D) = ∞, we must have
ν(B) = ∞. When ν(D) < ∞, show that µ(D ∖ B) = 0. In either case we get ν(B) = ν(D).
The next result explains somehow the anomaly illustrated by Exercise 3.
Proposition 7.5. Let µ be a Radon measure on X, and let µ∗ denote its maximal
outer extension. For a subset N ⊂ X, the following are equivalent:
(i) N is µ∗ -measurable, and for every compact subset K ⊂ N , one has the
equality µ(K) = 0;
(ii) µ∗ (D ∩ N ) = 0, for all open subsets D ⊂ X with µ(D) < ∞;
(iii) N is locally µ∗ -negligible, i.e.
µ∗ (A ∩ N ) = 0, for all subsets A ⊂ X with µ∗ (A) < ∞.
Proof. (i) ⇒ (ii). Assume N satisfies condition (i). Fix some open set D ⊂
X, with µ(D) < ∞. Then the set D∩N is measurable, and µ∗ (D∩N ) ≤ µ(D) < ∞.
The equality µ∗ (D ∩ N ) = 0 then follows from (i), combined with Corollary 7.3.
(ii) ⇒ (iii). Assume N satisfies condition (ii). Fix some arbitrary subset
A ⊂ X, with µ∗ (A) < ∞. Using (13), there exists some open set D ⊃ A with
µ(D) < ∞. Then we have the inequality µ∗ (A ∩ N ) ≤ µ∗ (D ∩ N ), so condition (ii)
will force µ∗ (A ∩ N ) = 0.
(iii) ⇒ (i). Let N be locally µ∗ -negligible. We know that local µ∗ -negligibility
implies µ∗ -measurability (see Section 5). The fact that µ(K) = 0, for all compact
subsets K ⊂ N is also trivial.
Notation. Let µ be a Radon measure on the locally compact space X, and
let µ∗ be the maximal outer extension of µ. We denote the σ-algebra mµ∗ (X),
of all µ∗ -measurable subsets of X, simply by Mµ (X), and we define the measure
µ̃ = µ∗|_{mµ∗ (X)} . Using the terminology introduced in Section 5, the pair (Mµ (X), µ̃)
is the quasi-completion of Bor(X) with respect to µ.
Our next goal is to examine the inclusion Bor(X) ⊂ Mµ (X) along the same
lines used in the final part of Section 5. In preparation for the results that follow,
it is helpful to introduce the following terminology.
Definition. Let µ be a Radon measure on the locally compact space X. A
non-empty compact subset K ⊂ X, is said to be µ-tight, if it has the property
• there is no compact non-empty proper subset L ⊊ K, with µ(K) = µ(L).
Remark 7.3. Singleton sets are always µ-tight. If K is µ-tight, and µ(K) = 0
then K must be a singleton.
For a non-empty compact set K with µ(K) > 0, the µ-tightness is equivalent
to the following condition¹¹:
(22) D ⊂ X open, D ∩ K ≠ ∅ =⇒ µ(D ∩ K) > 0.
Indeed, if K is µ-tight, and D ⊂ X is an open set, such that D ∩ K ≠ ∅, then the
compact set L = K ∖ D is either empty, or a proper subset of K. In either case,
we get µ(L) < µ(K), and then the equality D ∩ K = K ∖ L gives µ(D ∩ K) =
µ(K) − µ(L) > 0. Conversely, if K satisfies (22) and if L is a non-empty proper
compact subset of K, then the set D = X ∖ L is open, and satisfies D ∩ K ≠ ∅.
By (22) this forces µ(D ∩ K) > 0, and since we have L = K ∖ (D ∩ K), we get
µ(L) = µ(K) − µ(D ∩ K) < µ(K).
A µ-tight compact set K, with µ(K) > 0, will be called non-degenerate.
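The equivalence between µ-tightness and condition (22) can be tested by brute force in a very small setting. The sketch below assumes a finite set with the discrete topology, where every subset is compact and open, and a measure given by non-negative integer point masses; in that setting condition (22) says exactly that every point of K has positive mass. The weights and the function names are hypothetical.

```python
from itertools import combinations


def mu(weights, subset):
    """Measure of a finite set: the sum of its point masses."""
    return sum(weights[p] for p in subset)


def is_tight(weights, K):
    """Definition check: no non-empty proper subset L of K has mu(L) == mu(K)."""
    K = list(K)
    total = mu(weights, K)
    for r in range(1, len(K)):  # sizes of non-empty proper subsets
        for L in combinations(K, r):
            if mu(weights, L) == total:
                return False
    return True


def satisfies_22(weights, K):
    """Condition (22) on a discrete space: every point of K has positive mass."""
    return all(weights[p] > 0 for p in K)


# Hypothetical integer point masses (integers avoid rounding issues in ==).
weights = {"a": 2, "b": 0, "c": 3, "d": 1, "e": 0}
points = list(weights)

# For every K with mu(K) > 0 the two notions agree.
for r in range(1, len(points) + 1):
    for K in combinations(points, r):
        if mu(weights, K) > 0:
            assert is_tight(weights, K) == satisfies_22(weights, K)
```

The same helper also reproduces Remark 7.3: `is_tight(weights, ("b",))` holds for the singleton, while the two-point null set `("b", "e")` is not tight.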
Lemma 7.2. Let X be a locally compact space, let µ be a Radon measure on
X. Every non-empty compact set K ⊂ X has a µ-tight compact subset K0 ⊂ K,
with µ(K0 ) = µ(K).
Proof. If K is already tight, there is nothing to prove. Also, if µ(K) = 0,
then we can pick K0 to be of the form {x}, with x any point in K.
For the remainder of the proof, we are going to assume that K is not µ-tight,
and µ(K) > 0. Consider the collection
L = {L ∈ CX : ∅ ≠ L ⊊ K and µ(L) = µ(K)}.
Since K is not µ-tight, the collection L is non-empty. One key property of the
collection L is the following.
Claim 1: If L1 , . . . , Ln ∈ L, then L1 ∩ · · · ∩ Ln ∈ L.
Indeed, if we define the sets Aj = K ∖ Lj , j = 1, . . . , n, then µ(A1 ) = · · · = µ(An ) =
0, and then the equality
K ∖ [L1 ∩ · · · ∩ Ln ] = A1 ∪ · · · ∪ An
will force µ(K ∖ [L1 ∩ · · · ∩ Ln ]) = 0, thus giving µ(L1 ∩ · · · ∩ Ln ) = µ(K) > 0.
(The last inequality forces of course L1 ∩ · · · ∩ Ln ≠ ∅.)
Using the finite intersection property, it follows that the intersection K0 =
⋂_{L∈L} L is non-empty.
Claim 2: K0 ∈ L.
Obviously K0 is a compact non-empty proper subset of K, so the only thing we need
to prove is the equality µ(K0 ) = µ(K). Consider the Borel subset
B = K ∖ K0 ⊂ K.
¹¹ Notice that using D = X, condition (22) actually forces µ(K) > 0.
is at most countable.
Proof. Since µ∗ (A) < ∞, by (13), there exists some open set D ⊃ A with
µ(D) < ∞. It is obvious that SG (A) ⊂ SG (D), so it suffices to prove that SG (D) is
at most countable.
On the one hand, we notice that, for every finite subset F ⊂ SG (D), one has
∑_{G∈F} µ(G ∩ D) = µ(⋃_{G∈F} [G ∩ D]) ≤ µ(D) < ∞.
This means that the family (µ(G ∩ D))_{G∈SG (D)} is summable, and we have
∑_{G∈SG (D)} µ(G ∩ D) ≤ µ(D) < ∞.
On the other hand, by Remark 7.3, we know that all the terms µ(G ∩ D), G ∈
SG (D) are strictly positive. Using Proposition II.2.2, this forces SG (D) to be
at most countable.
The main application of the above result is the following.
Theorem 7.4. Let X be a locally compact space, and let µ be a Radon measure
on X. Then there exists a partition F of X into µ-tight compact sets, with the
property that the set
NF = ⋃_{F ∈F , µ(F )=0} F
is locally µ∗ -negligible.
Proof. Define the set
Ω = {F : F pairwise disjoint collection of non-degenerate µ-tight compact sets}.
which is obviously a partition of X into µ-tight compact sets. For this partition,
we obviously have the equality NF = X ∖ T . By Claim 3, we have
µ∗ (NF ∩ D) = 0, for all open sets D ⊂ X with µ(D) < ∞.
By Proposition 7.2, it follows that NF is indeed locally µ∗ -negligible.
Definition. Let X be a locally compact space, and let µ be a Radon measure
on X. A partition F of X into µ-tight compact sets, with the property stated in
Theorem 7.3, will be called non-degenerate.
The existence of such partitions is significant, as indicated below.
Theorem 7.5. Let X be a locally compact space, let µ be a Radon measure on
X, and let F be a non-degenerate partition of X into µ-tight compact sets. Then¹²
F is a sufficient µ-finite Bor(X)-partition of X.
¹² See Section 5 for the terminology.
In the remainder of this section we discuss two basic examples of methods for
constructing (regular) contents.
To introduce the first construction, let us recall some notations and terminology
introduced in II.5. For a locally compact space X, and K one of the fields R or C, we
denote by CcK (X) the space of all continuous functions f : X → K, with compact
support. An R-linear map φ : CcR (X) → R is said to be positive, if it has the
property:
f ∈ CcR (X), f ≥ 0 ⇒ φ(f ) ≥ 0.
With these notations, we have the following result.
13 Here we use the summation convention from II.2
Proposition 7.6. Let X be a locally compact space, and let φ : CcR (X) → R
be a positive R-linear map. For every compact subset K ⊂ X, define the number
ωφ (K) = inf{φ(f ) : f ∈ CcR (X), f ≥ κ K }.
Then the map CX ∋ K ↦ ωφ (K) ∈ [0, ∞) is a regular content on X.
Proof. The inequality f ≥ κ K forces f ≥ 0, so we indeed have ωφ (K) ≥ 0,
∀ K ∈ CX . We now check conditions (i)-(iv) in the definition of a content.
The constant function 0 satisfies 0 ≥ κ ∅ , which immediately gives the equality
ωφ (∅) = 0, so condition (i) is satisfied.
By the definition of ωφ , it is clear that one has the implication
K, L ∈ CX , K ⊂ L =⇒ ωφ (K) ≤ ωφ (L),
thus giving condition (ii).
To check condition (iii), suppose K, L ∈ CX , and let us prove the inequality
(26) ωφ (K ∪ L) ≤ ωφ (K) + ωφ (L).
Start with some ε > 0, and choose functions f, g ∈ CcR (X), such that f ≥ κ K ,
g ≥ κ L , φ(f ) ≤ ωφ (K) + ε, and φ(g) ≤ ωφ (L) + ε. If we consider the function
h = f + g ∈ CcR (X), then we clearly have h ≥ κ K∪L , so we will have
ωφ (K ∪ L) ≤ φ(h) = φ(f + g) = φ(f ) + φ(g) ≤ ωφ (K) + ωφ (L) + 2ε.
Since the inequality ωφ (K ∪ L) ≤ ωφ (K) + ωφ (L) + 2ε holds for arbitrary ε > 0, it
will clearly force (26).
Finally, to check condition (iv) we start with two disjoint sets K, L ∈ CX ,
and we prove the equality
(27) ωφ (K ∪ L) = ωφ (K) + ωφ (L).
By (26) it only suffices to show the inequality
(28) ωφ (K ∪ L) ≥ ωφ (K) + ωφ (L).
Start with some arbitrary ε > 0, and choose a function f ∈ CcR (X), with f ≥
κ K∪L and φ(f ) ≤ ωφ (K ∪ L) + ε. Use Urysohn's Lemma for locally compact spaces
(Theorem I.5.1) to find a continuous map θ : X → [0, 1], such that θ|_K = 1 and
θ|_L = 0. The functions g = f θ and h = f (1 − θ) are obviously continuous, and
have compact supports. Moreover, one has the inequalities g ≥ κ K and h ≥ κ L .
Since g + h = f , we get
ωφ (K ∪ L) + ε ≥ φ(f ) = φ(g + h) = φ(g) + φ(h) ≥ ωφ (K) + ωφ (L).
Since the inequality ωφ (K ∪ L) + ε ≥ ωφ (K) + ωφ (L) holds for all ε > 0, it will
clearly force the inequality (28).
So far, we have shown that ωφ is a content. We now prove that ωφ is regular,
which means that, for every K ∈ CX , one has the equality
(29) ωφ (K) = inf{ωφ (L) : L ∈ CX , K ⊂ Int(L)}.
Start with some arbitrary ε > 0, and choose a function f ∈ CcR (X) with f ≥ κ K ,
and φ(f ) ≤ ωφ (K) + ε. Consider the function g = (1 + ε)f , and the set
D = {x ∈ X : g(x) > 1}.
Obviously D is an open set, and since f (x) ≥ 1, ∀ x ∈ K, we get g(x) ≥ 1 + ε > 1,
∀ x ∈ K. In particular, this gives the inclusion K ⊂ D. Apply then Lemma I.5.1
to find some compact set L ⊂ D, with K ⊂ Int(L). Since g(x) > 1, ∀ x ∈ L, we
clearly have
ωφ (L) ≤ φ(g) = (1 + ε)φ(f ) ≤ (1 + ε)(ωφ (K) + ε).
This argument shows that, if we denote the right hand side of (29) by ν(K), then
we have the inequality
ν(K) ≤ (1 + ε)(ωφ (K) + ε).
Since this inequality holds for all ε > 0, it will force the inequality ν(K) ≤ ωφ (K),
thus proving (29).
Definition. Let X be a locally compact space, and let φ : CcR (X) → R be a
positive R-linear map. We apply Corollary 7.2 to the regular content ωφ , and we
will denote the Radon measure extension of ωφ simply by µφ . The measure µφ on
Bor(X) is called the Riesz measure associated with φ.
An interesting property, which will later be generalized, is the following.
Lemma 7.4 (Mean Value Property). Let X be a locally compact space, let
φ : CcR (X) → R be a positive R-linear map, and let µφ be the Riesz measure
associated with φ. For any function f ∈ CcR (X), and any compact subset K ⊂ X,
with K ⊃ supp f , one has the inequality
(30) min_{x∈K} f (x) · µφ (K) ≤ φ(f ) ≤ max_{x∈K} f (x) · µφ (K).
Proof. Since min_{x∈K} f (x) = − max_{x∈K} (−f )(x), it suffices to prove only the
inequality
(31) φ(f ) ≤ max_{x∈K} f (x) · µφ (K).
Fix f ∈ CcR (X), as well as the compact set K ⊃ supp f . Denote the number
max_{x∈K} f (x) simply by M .
If M < 0 the inequality is pretty clear, because the function g = f /M satisfies g ≥
κ K , which gives φ(g) ≥ ωφ (K) = µφ (K), and then multiplying by M immediately
gives (31).
The case M = 0 is also trivial, since this forces f ≤ 0, so we get φ(f ) ≤ 0.
Assume M > 0. Fix for the moment some ε > 0, and choose some function
h ∈ CcR (X), with h ≥ κ K , and φ(h) ≤ µφ (K) + ε.
Let us observe that M h − f ≥ 0. Indeed, if we start with some arbitrary point
x ∈ X, then either x ∈ K, in which case we have M h(x) ≥ M ≥ f (x), or we have
x ∈ X r K, in which case M h(x) ≥ 0 = f (x).
Using the positivity of φ we then get φ(M h − f ) ≥ 0, which by the choice of h
gives
φ(f ) ≤ φ(M h) = M φ(h) ≤ M (µφ (K) + ε).
Since the inequality φ(f ) ≤ M (µφ (K) + ε) holds for arbitrary ε > 0, it will clearly
force φ(f ) ≤ M µφ (K).
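The Mean Value Property (30) is easy to check by hand in a finite model. The sketch below assumes a finite discrete space, on which every function is continuous with compact support, and a positive functional of the form φ(f ) = ∑ w(x) f (x) with non-negative weights w. For such φ the infimum defining ωφ (K) is attained at the indicator of K, so µφ (K) is the sum of the weights over K. The weights, the function f and the set K are made-up data.

```python
def phi(weights, f):
    """Positive linear functional phi(f) = sum over x of w(x) * f(x)."""
    return sum(w * f.get(x, 0) for x, w in weights.items())


def mu_phi(weights, K):
    """Riesz measure of K in this model: the sum of the weights over K."""
    return sum(weights[x] for x in K)


# Hypothetical non-negative weights on the four-point space {0, 1, 2, 3}.
weights = {0: 1, 1: 2, 2: 4, 3: 3}

# f is given by its non-zero values, so supp f = {1, 2}.
f = {1: -1, 2: 5}
K = {1, 2, 3}  # a compact set containing supp f

values = [f.get(x, 0) for x in K]
lower = min(values) * mu_phi(weights, K)
upper = max(values) * mu_phi(weights, K)

# Inequality (30): min f * mu_phi(K) <= phi(f) <= max f * mu_phi(K).
assert lower <= phi(weights, f) <= upper
```

Shrinking K to supp f itself gives the sharpest bounds that (30) allows for this f.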
The Riesz measure can be implicitly characterized by the following result.
Proposition 7.7. With the notations above, the Riesz measure µφ is the
unique Radon measure which has the interpolation property:
(iφ ) whenever F ⊂ X is compact, D ⊂ X is open, and f ∈ CcR (X) satisfies
κ F ≤ f ≤ κ D , it follows that one has the inequality
µφ (F ) ≤ φ(f ) ≤ µφ (D).
Proof. Let us first show that µφ has property (iφ ). Start with F , D and f
as in (iφ ). Since µφ (F ) = ωφ (F ), by the definition of ωφ , we immediately get the
inequality µφ (F ) ≤ φ(f ).
To prove the inequality φ(f ) ≤ µφ (D), we need some preparations. For every
integer n ≥ 1 we define the sets
An = {x ∈ X : f (x) > 1/n} and Bn = {x ∈ X : f (x) ≥ 1/n}.
Define also the set E = {x ∈ X : f (x) > 0}, so that \overline{E} = supp f . (Here we use
the obvious fact that f ≥ 0.) The sets An , n ≥ 1 are open. The sets Bn , n ≥ 1
are closed sets contained in E ⊂ \overline{E}, hence they are compact (because \overline{E} = supp f
is compact). Notice also that we have the inclusions
A1 ⊂ B1 ⊂ A2 ⊂ B2 ⊂ · · · ⊂ E ⊂ D.
For every n ≥ 1, we use Urysohn's Lemma to find a continuous function hn : X →
[0, 1], with hn|_{Bn} = 1 and hn|_{X ∖ A_{n+1}} = 0. On the one hand, we notice that the
On the other hand, for each n ≥ 1, the function f hn has support contained in
Bn+1 , and (f hn )(x) ≤ 1, ∀ x ∈ Bn+1 , so again by Lemma 7.4 combined with the
inclusion Bn+1 ⊂ D, we get
φ(f hn ) ≤ µφ (Bn+1 ) ≤ µφ (D).
Using (32) we immediately get φ(f ) ≤ µφ (D).
We now prove the uniqueness. Let µ be a Radon measure with property (iφ ).
Claim 1: For any compact set K ⊂ X and any open set D ⊂ X, with K ⊂ D,
one has the inequality
µφ (K) ≤ µ(D).
Choose a compact set L ⊂ X, with K ⊂ Int(L) ⊂ L ⊂ D, and use Urysohn's Lemma
to find a continuous function f : X → [0, 1] such that f|_K = 1 and f|_{X ∖ Int(L)} = 0.
In particular, f has compact support, and satisfies κ K ≤ f ≤ κ D . Using (iφ ) for
µφ and for µ, we then get µφ (K) ≤ φ(f ) ≤ µ(D), and we are done.
Claim 2: for every compact set K ⊂ X, one has the equality µφ (K) = µ(K).
will be studied in Chapter IV, where we will eventually prove the fact that it is
bijective. At this point we simply regard it as a method of constructing Radon
measures.
Proposition 7.8. Let X be a locally compact space. Then the Riesz corre-
spondence is “linear” in the following sense.
(i) If φ : CcR (X) → R is a positive R-linear map, and t ∈ [0, ∞), then tφ is
also a positive R-linear map, and one has the equality µtφ = tµφ .
(ii) If φ1 , φ2 : CcR (X) → R are positive R-linear maps, then φ1 + φ2 is also a
positive R-linear map, and one has the equality µφ1 +φ2 = µφ1 + µφ2 .
Proof. (i). Assume φ is positive and t ∈ [0, ∞). The fact that tφ is positive
is trivial. We know, by Proposition 7.2, that tµφ is a Radon measure. Then the
equality µtφ = tµφ follows from Proposition 7.5, combined with the obvious fact
that tµφ has the interpolation property (itφ ).
(ii). If φ1 and φ2 are positive, then so is φ1 + φ2 . Define ψ = φ1 + φ2 , and
ν = µφ1 + µφ2 . By Proposition 7.2, we again know that ν is a Radon measure. The
equality µψ = ν follows from Proposition 7.5, combined with the obvious fact that
ν has the interpolation property (iψ ).
We can then consider two Riesz measures µφ on X, and µψ on X α . One has the
equality
(35) µψ (B) = µφ (B ∩ X), ∀ B ∈ Bor(X α ).
First of all, remark that
(36) µψ (K) = µφ (K), ∀ K ∈ CX .
This is a consequence of the fact that for every g ∈ C R (X α ) with g ≥ κ K , there
exists some f ∈ CcR (X), with g ≥ f ≥ κ K . (Simply take f = gh, for some continuous
function h : X → [0, 1] with compact support, with h|_K = 1.) Using (36), we
immediately get the equality
(37) µψ (B ∩ X) = µφ (B ∩ X), ∀ B ∈ Bor(X α ).
Using this with B = X, we get
µψ (X) = µφ (X) = ‖φ‖ = ‖ψ‖ = µψ (X α ),
which forces µψ ({∞}) = 0, and then (35) is immediate from (37).
Exercise 6. Consider the case when X = Rn . For every continuous function
f : Rn → R, with compact support, we define
φ(f ) = ∫_{a1}^{b1} ∫_{a2}^{b2} · · · ∫_{an}^{bn} f (x1 , x2 , . . . , xn ) dx1 dx2 · · · dxn ,
where the numbers a1 < b1 , . . . , an < bn are chosen (arbitrarily) such that
supp f ⊂ [a1 , b1 ] × [a2 , b2 ] × · · · × [an , bn ].
(One can show that the multiple integral is independent of the choice of the a’s and
the b’s.) It is obvious that this way we have constructed a positive R-linear map
φ : CcR (Rn ) → R. The Riesz measure µφ , defined by φ, is precisely the Lebesgue
measure λn .
Hint: Compute the values of µφ on compact boxes.
We conclude this section with an important result from harmonic analysis. The
main object of study is explained in the following.
Definition. A topological group is a group G, which comes also equipped with
a topology, which is compatible with the group structure in the sense that the map
G × G ∋ (g, h) ↦ gh−1 ∈ G
is continuous. Remark that this is equivalent to the fact that both maps G × G ∋
(g, h) ↦ gh ∈ G and G ∋ g ↦ g −1 ∈ G are continuous. To avoid any complications,
all topological groups are assumed to be Hausdorff.
Examples 7.2. A. Any group becomes a topological group, when equipped
with the discrete topology. (This is the topology in which every subset is open.)
B. The group (Rn , +) is a topological group, when equipped with the norm
topology.
C. The unit circle T = {z ∈ C : |z| = 1} is a topological group, when equipped
with the usual multiplication, and the topology induced from C. More generally,
for an integer n ≥ 1, the n-dimensional torus Tn , equipped with coordinate-wise
multiplication, and the product topology, is a topological group.
D. Given an integer n ≥ 1, the group GLn (R), of all invertible n × n matrices
(with matrix multiplication as the group operation), is a topological group, when
equipped with the topology coming from the identification of GLn (R) as an open
subset in R^{n²}.
Notations. Let G be a group. For a subset A ⊂ G and an element g ∈ G, we
define the left and right translations of A by g, as the sets
gA = {gh : h ∈ A} and Ag = {hg : h ∈ A}.
For two subsets A, B ⊂ G, we define
A · B = {hk : h ∈ A, k ∈ B}.
Finally, for a subset A ⊂ G, we define A−1 = {h−1 : h ∈ A}.
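These set operations can be made concrete on a small non-commutative group. The sketch below assumes the symmetric group on three letters, with permutations stored as tuples and composition as the group law; the helper names (`compose`, `left_translate`, and so on) are illustrative only. It shows in particular that gA and Ag may differ.

```python
from itertools import permutations


def compose(p, q):
    """Group law: (p * q)(i) = p(q(i)), for permutations stored as tuples."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def left_translate(g, A):
    """gA = {gh : h in A}."""
    return {compose(g, h) for h in A}


def right_translate(A, g):
    """Ag = {hg : h in A}."""
    return {compose(h, g) for h in A}


def product(A, B):
    """A . B = {hk : h in A, k in B}."""
    return {compose(h, k) for h in A for k in B}


def inverse_set(A):
    """A^{-1} = {h^{-1} : h in A}."""
    return {inverse(h) for h in A}


G = set(permutations(range(3)))  # the six elements of the group
e = (0, 1, 2)                    # identity
s = (1, 0, 2)                    # transposition swapping 0 and 1
c = (1, 2, 0)                    # a 3-cycle
A = {e, s}                       # a two-element subgroup

# Left and right translates of the same set by the same element differ here.
assert left_translate(c, A) != right_translate(A, c)
```

Since A is a subgroup in this example, one also finds A · A = A and A−1 = A.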
Remark 7.5. There is some similarity between topological groups and metric
spaces. The subsets that play the role of open balls are the open neighborhoods of the
identity. More explicitly, if G is a topological group, with identity element e, then
one has the equalities
{N ⊂ G : N open neighborhood of g} =
= {gV ⊂ G : V open neighborhood of e} =
= {W g ⊂ G : W open neighborhood of e}.
For example, given a metric space (X, d), a map f : G → X is continuous at some
point g ∈ G, if and only if, for every ε > 0, there exists some neighborhood Vε of
e, such that
d(f (gh), f (g)) < ε, ∀ h ∈ Vε .
The following two results will be used several times.
Lemma 7.5. Suppose G is a topological group, with identity element e. For
any open neighborhood U of e, there exists an open neighborhood V of e, such that
V = V −1 and V · V ⊂ U .
Proof. Fix the open neighborhood U . Use the continuity of the map G × G ∋
(g, h) ↦ gh ∈ G, at (e, e), to find an open neighborhood D of (e, e) in G × G, such
that
gh ∈ U, ∀ (g, h) ∈ D.
Since D is open in the product topology, there exist open neighborhoods U1 and
U2 , of e, such that U1 × U2 ⊂ D. Then we obviously have
U1 · U2 ⊂ U.
Consider the open neighborhood W = U1 ∩ U2 . We still have W · W ⊂ U . Finally,
using the continuity of the map G ∋ g ↦ g −1 ∈ G, it follows that W −1 is also a
neighborhood of e. Then we are done, if we take V = W ∩ W −1 .
Proposition 7.10. Let G be a topological group, and let K, L ⊂ G be two
compact disjoint sets. Then there exists an open neighborhood V of the identity
element e, such that V = V −1 and (K · V ) ∩ (L · V ) = (V · K) ∩ (V · L) = ∅.
Proof. Consider the continuous map φ : G × G ∋ (g, h) ↦ gh−1 ∈ G, and
the compact set C = (K × L) ∪ (L × K) ⊂ G × G. Since φ is continuous, it follows
that φ(C) is a compact subset of G. The condition K ∩ L = ∅ obviously gives the
fact that e ∉ φ(C). Since φ(C) is closed, there exists some open neighborhood U
of e, such that φ(C) ∩ U = ∅. Use Lemma 7.5 to find some open neighborhood V
of e, such that V = V −1 and V · V ⊂ U .
Comment. Later on, in Chapter IV, we are going to prove that the left invari-
ance property of φ is also a necessary condition for µφ to be a Haar measure.
Examples 7.3. Let us examine the examples 7.2.A-D and let us construct
Haar measures on these groups.
A. On a discrete group G, one has the counting measure µ(A) = Card A,
∀ A ⊂ G, which is obviously a Haar measure.
B. On (Rn , +), the Lebesgue measure is a Haar measure.
It is not hard to see that Λ ◦ Lg = Λ, ∀ g ∈ Tn . One easy way is to check directly the
equality (Λ ◦ Lg )(P ) = Λ(P ), for functions of the form P (z1 , . . . , zn ) = z1^{m1} · · · zn^{mn} ,
with m1 , . . . , mn ∈ Z, and then use continuity and the Stone-Weierstrass Theorem
which gives the fact that the linear span of all these P ’s is dense in C R (Tn ). Using
Proposition 7.6 it follows that µΛ is a Haar measure on Tn .
D. The construction of a Haar measure on GLn (R) is outlined in the following.
Exercise 7*. Identify GLn (R) as an open subset in R^{n²}. For every continuous
function F : GLn (R) → R, with compact support, we define F̆ : R^{n²} → R by
F̆(x) = F (x) · | det x|^{−n} if x ∈ GLn (R), and F̆(x) = 0 if x ∈ R^{n²} ∖ GLn (R),
and we define
ψ(F ) = ∫_{a1}^{b1} ∫_{a2}^{b2} · · · ∫_{a_{n²}}^{b_{n²}} F̆(x1 , x2 , . . . , x_{n²}) dx1 dx2 · · · dx_{n²},
where the numbers a1 < b1 , . . . , a_{n²} < b_{n²} are chosen (arbitrarily) such that
supp F̆ ⊂ [a1 , b1 ] × [a2 , b2 ] × · · · × [a_{n²} , b_{n²} ].
(One has the equality supp F̆ = supp F , and the multiple integral is independent of
the choice of the a’s and the b’s.) Prove that ψ ◦ Ls = ψ, ∀ s ∈ GLn (R). Conclude
that the Riesz measure µψ associated with ψ is a Haar measure on GLn (R).
Hints: Fix s ∈ GLn (R). The map ℓ_{s^{−1}} : GLn (R) → GLn (R) has an obvious linear extension
Φs : R^{n²} → R^{n²}, defined by
Φs (x) = s^{−1} x, ∀ x ∈ R^{n²},
where the vector space R^{n²} is identified with Mat_{n×n}(R). Fix now F ∈ CcR (GLn (R)) and consider
Use this equality, combined with the above formula for H̆, to get the equality ψ(H) = ψ(F ),
as a result of the change of variable theorem. (Use the fact that in the definition of ψ, instead
of integrating over rectangles one can integrate over arbitrary compact sets Ω ⊂ GLn (R), with
Jordan negligible boundary, and Int(Ω) ⊃ supp F .)
Comments. The Haar measures defined in Examples 7.3.A-D are peculiar in
the sense that they also have the right invariance property:
µ(Ag) = µ(A), ∀ g ∈ G, A ∈ Bor(G).
In general such a property does not hold. At this point, we can only speculate on
this matter, by examining the following example.
Exercise 8*. Consider the group G of all orientation preserving affine
transformations of R, i.e. the collection
G = {Tab : a, b ∈ R, a > 0},
where Tab : R ∋ x ↦ ax + b ∈ R. (Some people call this the “ax + b” group.) It
is not hard to see that compositions and inverses of such transformations are again
of this form. In fact one can identify G as the subgroup of GL2 (R) given by
G = { \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} : a, b ∈ R, a > 0 }.
The topology on G is the one induced from this inclusion. Equivalently, G can be
identified with the right half-plane (0, ∞) × R. We use this identification to define a
positive R-linear map Λ : CcR (G) → R as follows. For every F ∈ CcR (G), we choose
0 < c1 < d1 and c2 < d2 , such that supp F ⊂ [c1 , d1 ] × [c2 , d2 ], and we define
Λ(F ) = ∫_{c1}^{d1} ∫_{c2}^{d2} (F (a, b)/a²) da db.
The integral does not depend on the particular choice of the rectangle. Prove that
Λ◦Lg = Λ, ∀ g ∈ G, so that the Riesz measure µΛ is a Haar measure. In general the
equality Λ ◦ Rg = Λ fails. As indicated in the comment that followed Proposition
7.6, the fact that Λ ◦ Rg ≠ Λ would prevent the Riesz measure µΛ from having the
right invariance property.
Hints: Use similar arguments to the ones in Exercise 8. If g = Tab ∈ G, then the map
$\ell_{g^{-1}} : G \to G$ extends to a linear map $\Phi_g : \mathbb{R}^2 \to \mathbb{R}^2$, defined by
Φg (x, y) = (ax + by, y), ∀ (x, y) ∈ R2 .
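Numerical illustration. The following Python sketch is not part of the original notes; it only checks the two claims of the exercise numerically for one test function. It assumes the identification of G with (0, ∞) × R under which the composition T_ab ∘ T_xy corresponds to (ax, ay + b), and it takes the translate of a function to be its composition with the group translation (a convention chosen here; the conclusion does not depend on it). The names bump, Lambda, left_shift and right_shift are hypothetical, and the integral is approximated by a midpoint rule.

```python
def bump(a, b):
    """Continuous test function supported in [1, 2] x [-1, 1] (a hypothetical choice)."""
    return max(0.0, 1.0 - abs(2.0 * a - 3.0)) * max(0.0, 1.0 - abs(b))

def Lambda(F, box=((0.25, 4.0), (-6.0, 6.0)), n=300):
    """Midpoint-rule approximation of the integral of F(a, b) / a^2 over the box."""
    (c1, d1), (c2, d2) = box
    ha, hb = (d1 - c1) / n, (d2 - c2) / n
    total = 0.0
    for i in range(n):
        a = c1 + (i + 0.5) * ha
        for j in range(n):
            b = c2 + (j + 0.5) * hb
            total += F(a, b) / (a * a)
    return total * ha * hb

def left_shift(F, g):
    """(x, y) -> F(g * (x, y)), where g * (x, y) = (a x, a y + b)."""
    a, b = g
    return lambda x, y: F(a * x, a * y + b)

def right_shift(F, g):
    """(x, y) -> F((x, y) * g), where (x, y) * g = (x a, x b + y)."""
    a, b = g
    return lambda x, y: F(x * a, x * b + y)
```

With g = (2, 1), one expects Lambda(left_shift(bump, g)) to agree with Lambda(bump) up to discretization error, while Lambda(right_shift(bump, g)) should be close to 2 · Lambda(bump); the factor a = 2 comes from the change of variables u = xa, v = xb + y.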
The other inequality ω̂(D) ≤ ω̂(gD), follows from the one above if we replace g
with g −1 and D with gD.
We are now in position to prove that ω ∗ has the left invariance property. Fix
for the moment A ⊂ G and g ∈ G. For every open set D ⊃ gA, one has g −1 D ⊃ A,
so by the Claim we get
ω̂(D) = ω̂(g −1 D) ≥ ω ∗ (A).
Since we have $\hat\omega(D) \geq \omega^*(A)$, for all open sets D ⊃ gA, by the definition of ω ∗ , we
get
$$\omega^*(gA) = \inf\{\hat\omega(D) : D \in \mathcal{T}_G,\ D \supset gA\} \geq \omega^*(A).$$
The other inequality ω ∗ (A) ≥ ω ∗ (gA), follows from the one above if we replace g
with g −1 and A with gA.
In order to prove that µ is a Haar measure, all we need to prove is the fact that
µ(G) > 0. Start with some compact subset K ⊂ G, with ω(K) > 0. We have
µ(G) ≥ µ(K) = ω ∗ (K) = ω̆(K) ≥ ω(K) > 0,
and we are done.
Before we prove the existence of Haar measures, we need more preparations.
Notations. Let G be a group. For two non-empty subsets A, B ⊂ G, we write
A ≺ B, if there exist elements g1 , . . . , gn ∈ G, such that A ⊂ g1 B ∪ · · · ∪ gn B. In
this case we define the number
$$[A : B] = \min\{n \in \mathbb{N} : \text{there exist } g_1, \dots, g_n \in G \text{ with } A \subset g_1 B \cup \dots \cup g_n B\}.$$
The following result will be useful.
Lemma 7.7. Let G be a group.
(i) If A, B ⊂ G are non-empty sets with A ⊂ B, then A ≺ B, and [A : B] = 1.
(ii) The relation ≺ is transitive, i.e. whenever A, B, C ⊂ G are non-empty
subsets satisfying A ≺ B and B ≺ C, it follows that A ≺ C. Moreover, in
this case one has the inequality
[A : C] ≤ [A : B] · [B : C].
(iii) The relation ≺ is compatible with left translations. This means that for
any two elements g, h ∈ G, and any two non-empty subsets A, B ⊂ G,
one has the equivalence A ≺ B ⇔ gA ≺ hB. Moreover, in this case one
has
[gA : hB] = [A : B].
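To get a feeling for the numbers [A : B], one can compute them by brute force in a small finite group. The sketch below is only an illustration and is not taken from the notes: it works in the cyclic group Z_n written additively (our choice), so that a translate gB becomes g + B, and the function name covering_number is hypothetical.

```python
from itertools import combinations

def covering_number(A, B, n):
    """Smallest k such that A is covered by k translates g + B in Z_n.

    A and B are assumed to be non-empty subsets of {0, ..., n-1}.
    """
    A, B = frozenset(A), frozenset(B)
    translates = [frozenset((g + b) % n for b in B) for g in range(n)]
    for k in range(1, n + 1):
        for choice in combinations(translates, k):
            if A <= frozenset().union(*choice):
                return k
    raise ValueError("A and B must be non-empty subsets of Z_n")
```

For instance, in Z_12 with A = {0, ..., 5}, B = {0, 1} and C = {0}, the function returns [A : B] = 3, [B : C] = 2 and [A : C] = 6, in agreement with the inequality in (ii).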
$$\omega(K) = \frac{[K : V^{-1}]}{[A : V^{-1}]},$$
for all non-empty compact subsets K ⊂ G. The fact that ω has properties (i)-(vi)
is immediate from Lemma 7.7.
Let us regard the sets Ω(V ), V ∈ V as subsets of the product space
$$P = \prod_{K \in \mathcal{C}_G} [0, m(K)].$$
Notice that, when we equip P with the product topology, it becomes a compact
space, by Tihonov’s Theorem.
Claim 2: For every V ∈ V, the set Ω(V ) is closed in P.
Define, for any K ∈ CG , the map
$$\pi_K : P \ni \omega \longmapsto \omega(K) \in \mathbb{R}.$$
FKL (ω) = ω(K) − ω(L) and TKL (ω) = ω(K ∪ L) − ω(K) − ω(L), ∀ ω ∈ P.
then Ω6V is also closed, and so will then be the intersection Ω5 ∩ Ω6V = Ω(V ).
Claim 3: The intersection $\bigcap_{V \in \mathcal{V}} \Omega(V)$ is non-empty.
Remark that, if V1 , V2 ∈ V are such that V1 ⊂ V2 , then we have the inclusion
Ω(V1 ) ⊂ Ω(V2 ). Indeed, if ω belongs to Ω(V1 ), then properties (i)-(v) are clear. To
check property (vi) for V2 we need to show that whenever K, L ⊂ G are compact
sets, with (K · V2 ) ∩ (L · V2 ) = ∅, it follows that ω(K ∪ L) = ω(K) + ω(L). This
is however trivial, since the inclusion V1 ⊂ V2 forces (K · V1 ) ∩ (L · V1 ) = ∅, and
then the desired equality follows from the property (vi) for V1 . We now see that,
for any finite number of sets V1 , . . . , Vn ∈ V, we have the inclusion
Ω(V1 ∩ · · · ∩ Vn ) ⊂ Ω(V1 ) ∩ · · · ∩ Ω(Vn ),
which by Claim 1, proves that Ω(V1 ) ∩ · · · ∩ Ω(Vn ) ≠ ∅. Using Claim 2, and the
compactness of P, the Claim immediately follows.
Pick now an element $\omega \in \bigcap_{V \in \mathcal{V}} \Omega(V)$.
Claim 4: The map ω : CG → [0, ∞) is a content on G with the left invariance
property
ω(gK) = ω(K), ∀ g ∈ G, K ∈ CG .
Moreover, one has the equality ω(A) = 1.
The fact that ω(A) = 1 is clear, from condition (ii) in the definition of Ω(V ). The
left invariance property follows from condition (v). In order to prove that ω is a
content, we need to prove
(a) ω(∅) = 0;
(b) K, L ∈ CG , K ⊂ L ⇒ ω(K) ≤ ω(L);
(c) ω(K ∪ L) ≤ ω(K) + ω(L), ∀ K, L ∈ CG ;
(d) K, L ∈ CG , K ∩ L = ∅ ⇒ ω(K ∪ L) = ω(K) + ω(L).
CHAPTER III: MEASURE THEORY 269
Properties (a), (b), and (c) are clear, because every element in Ω(V ), V ∈ V satisfies
them. (Property (a) is a consequence of condition (i), property (b) is a consequence
of (iii), and property (c) is a consequence of (iv).) To prove property (d), we start
with two disjoint compact sets K and L, and we use Proposition 7.5 to find some
V ∈ V such that (K · V ) ∩ (L · V ) = ∅. Then we use the fact that ω belongs to
Ω(V ), and by condition (vi) we indeed get ω(K ∪ L) = ω(K) + ω(L).
Having proven Claim 4, we now define the measure $\mu_0 = \omega^*\big|_{\mathrm{Bor}(G)}$. By Lemma
7.7, µ0 is a Haar measure on G. Notice that $\mu_0(A) = \breve\omega(A) \geq \omega(A) = 1$, so if we
define µ : Bor(G) → [0, ∞] by (use the convention ∞/µ0 (A) = ∞)
$$\mu(B) = \frac{\mu_0(B)}{\mu_0(A)}, \quad \forall\, B \in \mathrm{Bor}(G),$$
then µ is a Haar measure on G, and satisfies µ(A) = 1.
Comment. Eventually (see Chapter IV) we are going to improve on the above
result by proving the uniqueness of µ.
In concrete examples, it is possible to prove uniqueness.
Exercise 10*. Let $S = [0, 1]^n$ be the unit cube in $\mathbb{R}^n$, and let µ be a Haar
measure on (Rn , +), with µ(S) = 1. Prove that µ coincides with the n-dimensional
Lebesgue measure λn .
Hint: Consider first the half open box S0 = [0, 1)n , and its measure β = µ(S0 ). Prove that for
a half open box of the form
B = [a1 , b1 ) × · · · × [an , bn )
with a1 , . . . , an , b1 , . . . , bn ∈ Q, one has µ(B) = βλn (B). Conclude that if a subset A ⊂ Rn is
contained in a hyperplane of the form
Πk (a) = {(x1 , . . . , xn ) ∈ Rn : xk = a},
then µ(A) = 0. Use this to get β = 1, so
µ(B) = λn (B),
for every “rational” half-open box. Prove that this equality holds for all half-open boxes. Use
Corollary 5.1 to conclude that µ = λn .
The following two exercises show how a Haar measure can be used to get some
topological information.
Exercise 11. Let G be a locally compact group, and let µ be a Haar measure
on G. Prove that µ(D) > 0, for every open subset D ⊂ G.
Hint: Use the inequality µ(K) ≤ [K : D] · µ(D), for all compact K ⊂ G.
Exercise 12*. Let G be a locally compact group, and let µ be a Haar measure
on G. Prove that the following are equivalent:
(i) G is compact;
(ii) µ(G) < ∞.
Hint: For the implication (ii) ⇒ (i), start with some compact neighborhood V of the identity,
and choose a maximal subset A ⊂ G, such that the sets gV , g ∈ A are disjoint. Prove that A is
finite. Conclude that $G = \bigcup_{g \in A} (gV \cdot V^{-1})$, so G is a finite union of compact sets.
Lectures 30-31
Here we adopt the convention that if one term in the right hand side of (1) is equal
to ±∞, then the entire sum is equal to ±∞. It is important to use condition (i),
which avoids situations when one term is ∞ and another term is −∞.
Examples 8.1. Let us agree, in this section only, to use the term “honest”
measure, for a measure in the usual sense.
A. Any “honest” measure is of course a signed measure.
B. If µ is a signed measure, then −µ is again a signed measure.
C. If µ1 and µ2 are “honest” measures, one of which is finite, then µ1 − µ2 is
a signed measure. Eventually (see Theorem 8.2) we are going to show that any
signed measure can be written in this form.
One key technical result about signed measures is the following.
Theorem 8.1. Let A be a σ-algebra on a non-empty set X, and let µ be a
signed measure on A. Then there exist sets L, M ∈ A, such that
$$\mu(L) = \inf\{\mu(A) : A \in \mathcal{A}\}; \tag{2}$$
$$\mu(M) = \sup\{\mu(A) : A \in \mathcal{A}\}. \tag{3}$$
Proof. Since −µ is also a signed measure, it suffices to prove only the exis-
tence of M satisfying (3). Denote the right hand side of (3) by α, and choose a
sequence (αn )n≥1 ⊂ R, such that limn→∞ αn = α, and αn < α, ∀ n ≥ 1. The key
construction we need is contained in the following.
Claim 1: There exists a family of sets {Bkn : k, n ∈ N, 1 ≤ k ≤ n} ⊂ A,
with the following properties:
Choose now an arbitrary set D ∈ A, with µ(D) ≥ αm+1 , and define, for each
j ∈ {1, . . . , m}, the set
$$G_j = \begin{cases} E_j & \text{if } \mu(E_j \setminus D) > 0 \\ E_j \cap D & \text{if } \mu(E_j \setminus D) \leq 0 \end{cases}$$
Notice that we have $E_j \supset G_j$, and using the equality $\mu(E_j) = \mu(E_j \cap D) + \mu(E_j \setminus D)$,
we also have
$$\mu(E_j \setminus G_j) \leq 0 \ \text{ and } \ \mu(G_j) \geq \mu(E_j \cap D), \quad \forall\, j = 1, \dots, m. \tag{4}$$
Define also the set $G_{m+1} = D \setminus B_m^m$. It is clear that the sets G1 , G2 , . . . , Gm+1 are
pairwise disjoint. Construct now the (m + 1)-th row by taking
$$B_k^{m+1} = \bigcup_{j=1}^{k} G_j, \quad \forall\, k = 1, 2, \dots, m + 1.$$
Using property (iii) from Claim 1, we then get µ(Ak ) ≥ αk . The fact that we have
the inclusions A1 ⊂ A2 ⊂ . . . is clear, from property (i) in Claim 1 (the horizontal
inclusions).
Fix now the sequence $(A_k)_{k=1}^\infty \subset \mathcal{A}$ as in Claim 2, and let us consider the set
$M = \bigcup_{k=1}^\infty A_k$. If we define the sets
$$M_1 = A_1 \ \text{ and } \ M_k = A_k \setminus A_{k-1}, \quad \forall\, k \geq 2,$$
then we have $M = \bigcup_{k=1}^\infty M_k$, and the sets M1 , M2 , M3 , . . . are pairwise disjoint. In
particular, this gives
$$\mu(M) = \sum_{k=1}^\infty \mu(M_k) = \lim_{k\to\infty} \sum_{j=1}^k \mu(M_j) = \lim_{k\to\infty} \mu\Big(\bigcup_{j=1}^k M_j\Big).$$
Since we obviously have $\bigcup_{j=1}^k M_j = A_k$, ∀ k ≥ 1, the above equality proves that
$$\mu(M) = \lim_{k\to\infty} \mu(A_k). \tag{5}$$
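On the one hand, using the inclusion $A_\varepsilon \subset D_n^\varepsilon$, ∀ n ≥ 1, we get $\mu(A_\varepsilon) \leq \varepsilon/2^n$,
∀ n ≥ 1, which clearly forces
$$\mu(A_\varepsilon) = 0. \tag{10}$$
On the other hand, using σ-subadditivity, we have
$$\nu(B_\varepsilon) = \nu\Big(\bigcup_{n=1}^\infty E_n^\varepsilon\Big) \leq \sum_{n=1}^\infty \nu(E_n^\varepsilon) < \sum_{n=1}^\infty \frac{\varepsilon}{2^n} = \varepsilon. \tag{11}$$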
which gives
$$A_\varepsilon \supset X \setminus B_\varepsilon. \tag{12}$$
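Define now the sets $N = \bigcup_{n=1}^\infty A_{1/n}$ and $M = X \setminus N$. On the one hand, using
σ-subadditivity, combined with (10), we get µ(N ) = 0. On the other hand, using
(12), we have
$$M = X \setminus N = X \setminus \bigcup_{n=1}^\infty A_{1/n} = \bigcap_{n=1}^\infty (X \setminus A_{1/n}) \subset \bigcap_{n=1}^\infty B_{1/n} \subset B_{1/k}, \quad \forall\, k \geq 1,$$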
Although the next technical result seems a bit out of context at this point, we
prove it here, and record it for future use.
Lemma 8.2. Let A be a σ-algebra on some non-empty set X, and let µ, η be
signed measures on A. Assume there is an “honest” finite measure ν on A, with
µ + ν = η.
(i) If µ = µ+ − µ− and η = η + − η − are the Hahn-Jordan decompositions of
µ and η respectively, then one has the inequalities
(13) µ+ ≤ η + ≤ µ+ + ν
(14) η − ≤ µ− ≤ η − + ν.
Proof. Since the statement of the Theorem is “symmetric,” without any loss
of generality we can assume that µ is finite.
Consider the signed measure η = µ − ν, and its Hahn-Jordan decomposition
η = η + − η − . Let (X + , X − ) be a Hahn-Jordan set decomposition of X relative to
η. This means that, for every A ∈ A, one has
(17) 0 ≤ η + (A) = η(A ∩ X + ) = µ(A ∩ X + ) − ν(A ∩ X + );
(18) 0 ≤ η − (A) = −η(A ∩ X − ) = ν(A ∩ X − ) − µ(A ∩ X − ).
In particular we get
(19) µ(A ∩ X + ) ≥ ν(A ∩ X + ) and µ(A ∩ X − ) ≤ ν(A ∩ X − ), ∀ A ∈ A.
(i). Define the measure µ ∨ ν = µ + η − . Using (18) we have
(20) (µ ∨ ν)(A) = µ(A ∩ X + ) + ν(A ∩ X − ), ∀ A ∈ A.
Notice that, using (19), it follows that, for every A ∈ A, one has the inequalities
(µ ∨ ν)(A ∩ X + ) = µ(A ∩ X + ) ≥ ν(A ∩ X + ),
(µ ∨ ν)(A ∩ X − ) = ν(A ∩ X − ) ≥ µ(A ∩ X − ),
In particular, this gives
(µ ∨ ν)(A) = (µ ∨ ν)(A ∩ X + ) + (µ ∨ ν)(A ∩ X − ) ≥ µ(A ∩ X + ) + µ(A ∩ X − ) = µ(A),
(µ ∨ ν)(A) = (µ ∨ ν)(A ∩ X + ) + (µ ∨ ν)(A ∩ X − ) ≥ ν(A ∩ X + ) + ν(A ∩ X − ) = ν(A),
for every A ∈ A, so µ ∨ ν indeed has property (a).
To prove property (b), start with some “honest” measure ω on A, with µ, ν ≤ ω,
and let us show that µ ∨ ν ≤ ω. This is quite clear, since for any A ∈ A, using (20)
we have
ω(A) = ω(A ∩ X + ) + ω(A ∩ X − ) ≥ µ(A ∩ X + ) + ν(A ∩ X − ) = (µ ∨ ν)(A).
The uniqueness of µ ∨ ν is now clear from (a) and (b).
(ii). Remark that, using the Minimality Theorem 8.3, for the measure η = µ−ν,
it follows that η + ≤ µ. In particular, η + is a finite “honest” measure, and so is the
difference µ − η + . Put µ ∧ ν = µ − η + . Using (17) we have
(21) (µ ∧ ν)(A) = µ(A ∩ X − ) + ν(A ∩ X + ), ∀ A ∈ A.
Notice that, using (19), it follows that, for every A ∈ A, one has the inequalities
(µ ∧ ν)(A ∩ X + ) = ν(A ∩ X + ) ≤ µ(A ∩ X + ),
(µ ∧ ν)(A ∩ X − ) = µ(A ∩ X − ) ≤ ν(A ∩ X − ),
In particular, this gives
(µ ∧ ν)(A) = (µ ∧ ν)(A ∩ X + ) + (µ ∧ ν)(A ∩ X − ) ≤ µ(A ∩ X + ) + µ(A ∩ X − ) = µ(A),
(µ ∧ ν)(A) = (µ ∧ ν)(A ∩ X + ) + (µ ∧ ν)(A ∩ X − ) ≤ ν(A ∩ X + ) + ν(A ∩ X − ) = ν(A),
for every A ∈ A, so µ ∧ ν indeed has property (a).
To prove property (b), start with some “honest” measure λ on A, with λ ≤ µ, ν,
and let us show that µ ∧ ν ≥ λ. This is quite clear, since for any A ∈ A, using (21)
we have
λ(A) = λ(A ∩ X + ) + λ(A ∩ X − ) ≤ ν(A ∩ X + ) + µ(A ∩ X − ) = (µ ∧ ν)(A).
The uniqueness of µ ∧ ν is now clear from (a) and (b).
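On a finite set, where a measure is determined by its point masses, formulas (20) and (21) can be checked directly. The following Python sketch is an illustration only and is not part of the notes: the point masses are hypothetical, the set X_plus plays the role of X⁺ for η = µ − ν, and the names measure, join and meet are ours.

```python
X = range(5)
mu = {0: 3, 1: 0, 2: 5, 3: 1, 4: 2}   # hypothetical point masses of mu
nu = {0: 1, 1: 4, 2: 5, 3: 0, 4: 7}   # hypothetical point masses of nu

def measure(weights, A):
    """Measure of a subset A for a measure given by its point masses."""
    return sum(weights[x] for x in A)

# A Hahn-Jordan set decomposition of X relative to eta = mu - nu.
X_plus = {x for x in X if mu[x] - nu[x] >= 0}
X_minus = set(X) - X_plus

def join(A):
    """(mu v nu)(A), computed as in formula (20)."""
    A = set(A)
    return measure(mu, A & X_plus) + measure(nu, A & X_minus)

def meet(A):
    """(mu ^ nu)(A), computed as in formula (21)."""
    A = set(A)
    return measure(mu, A & X_minus) + measure(nu, A & X_plus)
```

In this finite setting join(A) is the sum of the pointwise maxima of the masses over A and meet(A) is the sum of the pointwise minima, so that join(A) + meet(A) = µ(A) + ν(A) for every subset A.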
We conclude with a series of results that make a connection with the theory of
Radon measures discussed in Section 7.
Definition. Suppose X is a locally compact space, and µ is a signed measure
on Bor(X). We call µ a signed Radon measure on X, if there exist “honest” Radon
measures ν and η on X, one of which is finite, such that µ = ν − η.
Exercise 2*. Let X be a locally compact space, and let µ be a signed measure
on Bor(X). Prove that the following are equivalent:
(i) µ is a signed Radon measure on X;
(ii) if µ = µ+ − µ− denotes the Hahn-Jordan decomposition of µ, then both
µ+ and µ− are Radon measures on X.
Hint: To prove the implication (i) ⇒ (ii) use the fact that µ+ ≤ ν and µ− ≤ η. Moreover,
show that, for any B ∈ Bor(X), one has the implications µ+ (B) < ∞ ⇒ ν(B) < ∞ and
µ− (B) < ∞ ⇒ η(B) < ∞. Then use Exercise 5 from Section 7.
Remark 8.2. Suppose X is a locally compact space. In Section 7 we discussed
the Riesz correspondence, which associates to each linear positive map φ : CcR (X) →
R, a Radon measure µφ on X. As already suggested, this correspondence is in fact
a bijection, although the proof of this fact will come later in Chapter IV. At this
point we would like to analyze the Riesz correspondence in a simpler situation,
namely the case when X is compact. In this case it is interesting to point out that
Riesz correspondence can be extended beyond the positive case. The key fact (see
Corollary II.5.3) is that every linear continuous map φ : C R (X) → R can be written
as a difference φ = φ1 − φ2 , with φ1 , φ2 : C R (X) → R positive linear maps. (In fact
φ1 and φ2 can be chosen such that $\|\phi\| = \|\phi_1\| + \|\phi_2\|$. This fact will be heavily
exploited a little later.) We would like then to define a finite signed Radon measure
µφ by the formula µφ = µφ1 − µφ2 . There is a minor problem here: What if we
find another pair of continuous positive linear maps ψ1 , ψ2 : C R (X) → R, such that
φ = ψ1 − ψ2 ? Is it true that µψ1 − µψ2 = µφ1 − µφ2 ? The answer is affirmative,
and this is an easy consequence of Proposition 7.6, which gives the equalities
$$\mu_{\phi_1} + \mu_{\psi_2} = \mu_{\phi_1 + \psi_2} = \mu_{\psi_1 + \phi_2} = \mu_{\psi_1} + \mu_{\phi_2}.$$
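For the record, the passage from these equalities to the desired identity can be spelled out as follows. This is only a sketch: it uses the additivity of the Riesz correspondence on positive maps, which is how Proposition 7.6 is invoked above, together with the fact that all four measures are finite (X being compact), so that the subtraction in the last step makes sense.

```latex
\begin{align*}
\phi_1 - \phi_2 = \psi_1 - \psi_2
  &\;\Longrightarrow\; \phi_1 + \psi_2 = \psi_1 + \phi_2 \\
  &\;\Longrightarrow\; \mu_{\phi_1} + \mu_{\psi_2} = \mu_{\psi_1} + \mu_{\phi_2} \\
  &\;\Longrightarrow\; \mu_{\phi_1}(B) - \mu_{\phi_2}(B) = \mu_{\psi_1}(B) - \mu_{\psi_2}(B),
     \quad \forall\, B \in \mathrm{Bor}(X).
\end{align*}
```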
Notations. Suppose X is a compact Hausdorff space. We define
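$$M^{\mathbb{R}}(X) = \{\phi : C^{\mathbb{R}}(X) \to \mathbb{R} : \phi\ \ \mathbb{R}\text{-linear continuous}\},$$
$$\mathfrak{R}^{\mathbb{R}}(X) = \{\mu \text{ signed Radon measure on } X\}.$$
The correspondence
$$M^{\mathbb{R}}(X) \ni \phi \longmapsto \mu_\phi \in \mathfrak{R}^{\mathbb{R}}(X) \tag{22}$$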
defined above, will still be referred to as the extended Riesz correspondence.
Remark 8.3. If X is a compact Hausdorff space, then the extended Riesz
correspondence (22) is a linear map. This is a consequence of Proposition 7.6.
Given φ ∈ MR (X), the existence of a decomposition of φ, of the particular type
described in Corollary II.5.3, is extremely significant, as suggested by the following
result.
Theorem 8.4. Let X be a compact Hausdorff space, let φ1 , φ2 : C R (X) → R
be positive linear maps, and let µφ1 and µφ2 be the corresponding Riesz measures.
Consider the linear continuous map φ = φ1 − φ2 , and the finite signed measure
(23) µφ = µφ1 − µφ2 .
If $\|\phi\| = \|\phi_1\| + \|\phi_2\|$, then µφ1 ⊥ µφ2 , so (23) represents the Hahn-Jordan decomposition of µφ .
Proof. We are going to show that the decomposition (23) satisfies condition
(ii) in Lemma 8.1. The key step in proving this fact is contained in the following.
Claim: For every ε > 0, there exist functions f1 , f2 ∈ C R (X), with f1 , f2 ≥
0, f1 + f2 ≥ 1, and such that φ1 (f2 ) < ε and φ2 (f1 ) < ε.
To prove this we fix ε > 0, and we use the definition of the norm, to find some
function $g \in C^{\mathbb{R}}(X)$, with $\|g\| \leq 1$, and $|\phi(g)| \geq \|\phi\| - \varepsilon$. Replacing g with −g, if
necessary, we can assume that
$$\phi(g) \geq \|\phi\| - \varepsilon. \tag{24}$$
Consider the functions $g^+ = \max\{g, 0\}$ and $g^- = \max\{-g, 0\}$, so that $g = g^+ - g^-$,
and we clearly have $0 \leq g^\pm \leq 1$. On the one hand, since $\|\phi_k\| = \phi_k(1)$ (see
Proposition II.5.4), we have $\phi_k(g^\pm) \leq \|\phi_k\|$, k = 1, 2. On the other hand, by (24),
and the positivity of φ1 and φ2 , we know that
$$\|\phi\| - \varepsilon \leq \phi(g) = \phi_1(g) - \phi_2(g) = \phi_1(g^+) + \phi_2(g^-) - \phi_1(g^-) - \phi_2(g^+) \leq \phi_1(g^+) + \phi_2(g^-) \leq \|\phi_1\| + \|\phi_2\| = \|\phi\|,$$
so we get
$$\varepsilon \geq \|\phi\| - \phi_1(g^+) - \phi_2(g^-) = \|\phi_1\| + \|\phi_2\| - \phi_1(g^+) - \phi_2(g^-) = \phi_1(1) + \phi_2(1) - \phi_1(g^+) - \phi_2(g^-) = \phi_1(1 - g^+) + \phi_2(1 - g^-).$$
If we define $f_1 = 1 - g^-$ and $f_2 = 1 - g^+$, then it is clear that f1 , f2 ≥ 0. Using
the fact that $g^+ + g^- = |g| \leq 1$, we get $f_1 + f_2 = 2 - |g| \geq 1$. Finally, the above
estimate gives $\phi_1(f_2) + \phi_2(f_1) \leq \varepsilon$, and so the Claim immediately follows.
Having proven the Claim, we are now in position to prove that the two measures
µφ1 and µφ2 satisfy condition (ii) in Lemma 8.1. Start with some arbitrary ε > 0,
and use the Claim to find two functions f1 , f2 ∈ C R (X) with f1 , f2 ≥ 0, f1 +f2 ≥ 1,
such that φ1 (f2 ) ≤ ε/2 and φ2 (f1 ) ≤ ε/2. Consider the compact subsets
$$K_1 = \{x \in X : f_1(x) \geq \tfrac{1}{2}\} \ \text{ and } \ K_2 = \{x \in X : f_2(x) \geq \tfrac{1}{2}\}.$$
Since f1 + f2 ≥ 1, it follows immediately that we have K1 ∪ K2 = X. By construction,
we have $2f_1 \geq \kappa_{K_1}$ and $2f_2 \geq \kappa_{K_2}$, so using the interpolation property (see
Proposition 7.5), we get
$$\mu_{\phi_1}(K_2) \leq \phi_1(2f_2) = 2\phi_1(f_2) \leq \varepsilon;$$
$$\mu_{\phi_2}(K_1) \leq \phi_2(2f_1) = 2\phi_2(f_1) \leq \varepsilon.$$
The above result has several interesting consequences.
Corollary 8.2. Suppose X is a compact Hausdorff space. Then the extended
Riesz correspondence (22) is injective.
Proof. Since the correspondence (22) is linear, it suffices to prove the implication
µφ = 0 ⇒ φ = 0. Start with some linear continuous map φ : C R (X) → R,
such that µφ = 0. Use Corollary II.5.3 to find two positive linear maps φ1 , φ2 :
C R (X) → R, such that φ = φ1 − φ2 , and $\|\phi\| = \|\phi_1\| + \|\phi_2\|$. By Theorem 8.4
the difference µφ1 − µφ2 = µφ = 0 is the Hahn-Jordan decomposition of the zero
measure. By the uniqueness (see Corollary 8.1) it follows that µφ1 = µφ2 = 0. By
establishes an isometric linear isomorphism between $M_0^{\mathbb{R}}(X)$ and the space of all
continuous linear maps $C_c^{\mathbb{R}}(X) \to \mathbb{R}$. For every positive $\phi \in M_0^{\mathbb{R}}(X)$, we denote by
µφ the Riesz measure associated with the restriction $\phi_c = \phi\big|_{C_c^{\mathbb{R}}(X)}$. Since $\|\phi_c\| = \|\phi\|$, we have the equality $\mu_\phi(X) = \|\phi\|$.
We know (see Proposition II.5.10) that for every linear continuous map φ :
C0R (X) → R, there exist linear positive continuous maps φ1 , φ2 : C0R (X) → R, with
φ = φ1 − φ2 . (In fact φ1 and φ2 can be chosen such that $\|\phi_1\| + \|\phi_2\| = \|\phi\|$.) We
use this fact to define the finite signed Radon measure µφ = µφ1 − µφ2 . Exactly as
which we will call the extended finite Riesz correspondence. Of course, if X is already
compact, we have $C_0^{\mathbb{R}}(X) = C^{\mathbb{R}}(X)$, $M_0^{\mathbb{R}}(X) = M^{\mathbb{R}}(X)$, and $\mathfrak{R}_0^{\mathbb{R}}(X) = \mathfrak{R}^{\mathbb{R}}(X)$, so (26) is the extended Riesz correspondence previously defined.
The following result generalizes the statements of Remark 8.3, Theorem 8.4,
and Corollaries 8.2 and 8.3.
Theorem 8.5. Let X be a locally compact space.
A. The extended finite Riesz correspondence (26) is an injective linear map.
B. For every $\phi \in M_0^{\mathbb{R}}(X)$, there exist unique positive maps $\phi^+, \phi^- \in M_0^{\mathbb{R}}(X)$,
such that $\phi = \phi^+ - \phi^-$, and $\|\phi\| = \|\phi^+\| + \|\phi^-\|$. Moreover, in this case
$$\mu_\phi = \mu_{\phi^+} - \mu_{\phi^-}$$
is precisely the Hahn-Jordan decomposition of µφ .
Proof. First of all, the correspondence (26) is clearly linear, again as a con-
sequence of Proposition 7.6.
Second, we remark that the existence part in B is already known, from Propo-
sition II.5.10. We are going to use the following version of Theorem 8.4.
Claim: Suppose $\phi \in M_0^{\mathbb{R}}(X)$ is written as a difference φ = φ1 − φ2 , with
$\phi_1, \phi_2 \in M_0^{\mathbb{R}}(X)$ positive, and $\|\phi\| = \|\phi_1\| + \|\phi_2\|$. Then
$$\mu_\phi = \mu_{\phi_1} - \mu_{\phi_2}$$
is the Hahn-Jordan decomposition of µφ .
One way to prove this is by employing the Alexandrov compactification $X^\alpha = X \sqcup \{\infty\}$. We use the identification
$$C_0^{\mathbb{R}}(X) = \{f \in C^{\mathbb{R}}(X^\alpha) : f(\infty) = 0\}.$$
We know that there exist positive linear maps $\psi_1, \psi_2 : C^{\mathbb{R}}(X^\alpha) \to \mathbb{R}$, such that
$\psi_k\big|_{C_0^{\mathbb{R}}(X)} = \phi_k$, and $\|\psi_k\| = \|\phi_k\|$, k = 1, 2. If we define $\psi : C^{\mathbb{R}}(X^\alpha) \to \mathbb{R}$ by
ψ = ψ1 − ψ2 , it is not hard to see that $\|\psi\| = \|\psi_1\| + \|\psi_2\|$, so if we consider the
Radon measures µψ , µψ1 and µψ2 on the compact space X α , then using Theorem
8.4, we get the fact that
µψ = µψ1 − µψ2
is precisely the Hahn-Jordan decomposition of µψ . This means that there are sets
B1 , B2 ∈ Bor(X α ), with B1 ∪ B2 = X α , B1 ∩ B2 = ∅, and µψ1 (B2 ) = µψ2 (B1 ) = 0.
We know (see Remarks 7.4) that
µψk (B) = µφk (B ∩ X), ∀ B ∈ Bor(X α ), k = 1, 2,
so if we define Ak = Bk ∩ X, we immediately get A1 ∪ A2 = X, A1 ∩ A2 = ∅, and
µφ1 (A2 ) = µφ2 (A1 ) = 0, thus proving that µφ1 ⊥ µφ2 .
Having proven the above Claim, the proof follows line by line the proofs of
Corollaries 8.2 and 8.3.
The notion of a finite signed measure can be generalized to the complex case.
Definition. Suppose A is a σ-algebra on a non-empty set X. A function
µ : A → C is called a complex measure on A, if it is σ-additive in the sense that
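disjoint, so we have
$$|\mu(D_k)| = \Big|\sum_{n=1}^\infty \mu(D_k \cap A_n)\Big| \leq \sum_{n=1}^\infty |\mu(D_k \cap A_n)|, \quad \forall\, k \geq 1.$$
Summing up then yields
$$\sum_{k=1}^\infty |\mu(D_k)| \leq \sum_{k=1}^\infty \sum_{n=1}^\infty |\mu(D_k \cap A_n)| = \sum_{n=1}^\infty \Big[\sum_{k=1}^\infty |\mu(D_k \cap A_n)|\Big]. \tag{30}$$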
Definition. With the notations above, and under the hypothesis of Theorem
8.6, the “honest” measure ν, defined by (28), is called the variation measure of µ,
and will be denoted by |µ|. By construction, we have the inequality
|µ(A)| ≤ |µ|(A), ∀ A ∈ A.
Remark 8.5. Let µ be either a signed measure, or a complex measure on
the σ-algebra A. Exactly as with numbers (or functions), the measure |µ| has a
minimality property, which can be stated as follows. Whenever ν is an “honest”
measure on A with
|µ(A)| ≤ ν(A), ∀ A ∈ A,
it follows that we have
|µ|(A) ≤ ν(A), ∀ A ∈ A.
This is quite clear, because for any pairwise disjoint sequence $(A_n)_{n=1}^\infty \subset \mathcal{A}$, with
$\bigcup_{n=1}^\infty A_n = A$, one has the inequality
$$\sum_{n=1}^\infty |\mu(A_n)| \leq \sum_{n=1}^\infty \nu(A_n) = \nu(A),$$
and then the desired inequality follows by taking the supremum in the left hand
side.
In the case of signed measures, the variation measure is also given by the
following.
Proposition 8.2. Let µ be a signed measure on the σ-algebra A. Then one
has the equality
|µ| = µ+ + µ− ,
where µ = µ+ − µ− is the Hahn-Jordan decomposition of µ.
Proof. Denote the measure µ+ + µ− simply by ν. Remark that we obviously
have
−ν(A) = −µ+ (A)−µ− (A) ≤ µ+ (A)−µ− (A) ≤ µ+ (A)+µ− (A) = ν(A), ∀ A ∈ A,
which gives
|µ(A)| ≤ ν(A), ∀ A ∈ A.
By Remark 8.5, this forces the inequality |µ| ≤ ν.
To prove the other inequality, we start by fixing sets X + , X − ∈ A as in Theorem
8.2. We decompose each set A ∈ A as A = A+ ∪ A− , where A± = A ∩ X ± , so that
we have
ν(A) = ν(A+ )+ν(A− ) = µ+ (A+ )+µ+ (A− )+µ− (A+ )+µ− (A− ) = µ+ (A+ )+µ− (A− ).
Notice now that µ(A+ ) = µ+ (A+ ) ≥ 0, and −µ(A− ) = µ− (A− ) ≥ 0, which means
that we have the equalities µ+ (A+ ) = |µ(A+ )| and µ− (A− ) = |µ(A− )|, so the above
equality reads
ν(A) = |µ(A+ )| + |µ(A− )|,
and by the definition of |µ| we then immediately get ν(A) ≤ |µ|(A).
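For a signed measure on a finite set, the equality |µ| = µ⁺ + µ⁻ can be verified by brute force. The sketch below is an illustration under assumptions of ours: the point masses w are hypothetical, and the variation of A is computed as the maximum of Σ |µ(block)| over all partitions of A into non-empty blocks, which on a finite set plays the role of the supremum in the definition of |µ|.

```python
def partitions(items):
    """Yield every partition of the list `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Put `first` into each existing block in turn, or into a new block.
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part

w = {0: 3, 1: -2, 2: 0, 3: -4, 4: 1}   # hypothetical point masses of a signed measure

def mu(A):
    return sum(w[x] for x in A)

def variation(A):
    """Maximum of sum |mu(block)| over all partitions of A."""
    return max(sum(abs(mu(block)) for block in part) for part in partitions(list(A)))

def mu_plus(A):
    return sum(max(w[x], 0) for x in A)

def mu_minus(A):
    return sum(max(-w[x], 0) for x in A)
```

Here the maximum is attained, for example, by the two-block partition A = A⁺ ∪ A⁻ used in the proof, with A⁺ the points of non-negative mass and A⁻ the points of negative mass.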
An interesting consequence is the following.
Corollary 8.4. Let µ be either a finite signed measure, or a complex measure
on the σ-algebra A. Then the variation measure |µ| is finite.
Proof. The signed measure case is clear from the above result.
In the complex case, we write µ = ν + iη, with ν and η finite signed measures
on A. We apply the signed case, to get the fact that both |ν| and |η| are finite.
Notice that we have
|µ(A)| = |ν(A) + iη(A)| ≤ |ν(A)| + |η(A)| ≤ |ν|(A) + |η|(A), ∀ A ∈ A,
so by Remark 8.5 we get |µ| ≤ |ν|+|η|, and then the finiteness of |µ| is a consequence
of the finiteness of |ν| and |η|.
Exercise 3. Let A be a σ-algebra, and let K be one of the fields R or C. For
the purpose of this exercise, let us agree to use the term K-measure for designating
either a finite signed measure (when K = R), or a complex measure (when K = C).
Prove the following.
(i) The collection of all K-measures on A is a vector space.
(ii) For any two K-measures µ and ν, one has the inequality
|µ + ν| ≤ |µ| + |ν|.
(iii) For any K-measure µ and any α ∈ K, one has the equality
|αµ| = |α| · |µ|.
Proposition 8.2 has another interesting consequence, which is relevant for the
study of the extended finite Riesz correspondence.
Corollary 8.5. Let X be a locally compact space. Then the extended finite
Riesz correspondence (26) has the property
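$$|\mu_\phi|(X) = \|\phi\|, \quad \forall\, \phi \in M_0^{\mathbb{R}}(X). \tag{31}$$
Proof. From Proposition 8.2 and Theorem 8.5, we know that $|\mu_\phi| = \mu_{\phi^+} + \mu_{\phi^-}$. Using Remark 8.4, and Theorem 8.5 again, we have
$$|\mu_\phi|(X) = \mu_{\phi^+}(X) + \mu_{\phi^-}(X) = \|\phi^+\| + \|\phi^-\| = \|\phi\|.$$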
Comments. Given a locally compact space X, we can define a complex Radon
measure on X as being a complex measure on X, whose real and imaginary part
are both (finite) signed Radon measures. The extended finite Riesz correspondence
can be then defined also over the complex numbers, as a map
$$M_0(X) \ni \phi \longmapsto \mu_\phi \in \mathfrak{R}_0(X),$$
where
$$M_0(X) = \{\phi : C_0(X) \to \mathbb{C} : \phi \text{ linear continuous}\},$$
$$\mathfrak{R}_0(X) = \{\mu \text{ complex Radon measure on } X\}.$$
This correspondence is again linear. One will still have the equality (31), but the
proof of this fact will appear later in Chapter IV.
Chapter IV
Integration Theory
Lectures 32-33
as
f = α1 κ A1 + · · · + αn κ An ,
with αk ∈ K, Ak ∈ A and µ(Ak ) < ∞, ∀ k = 1, . . . , n. Using the notations from
III.1, we have the inclusion
L1K,elem (X, A, µ) ⊂ A-ElemK (X).
B. If we consider the collection R = {A ∈ A : µ(A) < ∞}, then R is a ring,
and, we have the equality
L1K,elem (X, A, µ) = R-ElemK (X).
In particular, it follows that L1K,elem (X, A, µ) is a K-vector space.
The following result is the first step in the construction of the integral.
Theorem 1.1. Let (X, A, µ) be a measure space, and let K be one of the fields
R or C. Then there exists a unique K-linear map $I^\mu_{\mathrm{elem}} : L^1_{K,\mathrm{elem}}(X, \mathcal{A}, \mu) \to K$,
such that
$$I^\mu_{\mathrm{elem}}(\kappa_A) = \mu(A), \tag{1}$$
for all A ∈ A, with µ(A) < ∞.
with the convention that, when f (X) = {0} (which is the same as f = 0), we define
$I^\mu_{\mathrm{elem}}(f) = 0$. It is obvious that $I^\mu_{\mathrm{elem}}$ satisfies the equality (1) for all A ∈ A with
µ(A) < ∞.
One key feature we are going to use is the following.
Claim 1: Whenever we have a finite pairwise disjoint sequence $(A_k)_{k=1}^n \subset \mathcal{A}$,
with µ(Ak ) < ∞, ∀ k = 1, . . . , n, one has the equality
$$I^\mu_{\mathrm{elem}}(\alpha_1 \kappa_{A_1} + \dots + \alpha_n \kappa_{A_n}) = \alpha_1 \mu(A_1) + \dots + \alpha_n \mu(A_n), \quad \forall\, \alpha_1, \dots, \alpha_n \in K.$$
It is obvious that we can assume $\alpha_j \neq 0$, ∀ j = 1, . . . , n. To prove the above equality,
we consider the elementary µ-integrable function $f = \alpha_1 \kappa_{A_1} + \dots + \alpha_n \kappa_{A_n}$, and we
observe that $f(X) \setminus \{0\} = \{\alpha_1\} \cup \dots \cup \{\alpha_n\}$. It may be the case that some of the α’s
are equal. We list $f(X) \setminus \{0\} = \{\beta_1, \dots, \beta_p\}$, with $\beta_j \neq \beta_k$, for all j, k ∈ {1, . . . , p}
with j ≠ k. For each k ∈ {1, . . . , p}, we define the set
$$J_k = \{j \in \{1, \dots, n\} : \alpha_j = \beta_k\}.$$
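It is obvious that the sets $(J_k)_{k=1}^p$ are pairwise disjoint, and we have J1 ∪ · · · ∪ Jp =
{1, . . . , n}. Moreover, for each k ∈ {1, . . . , p}, one has the equality
$$f^{-1}(\{\beta_k\}) = \bigcup_{j \in J_k} A_j,$$
so we get
$$\beta_k\, \mu\big(f^{-1}(\{\beta_k\})\big) = \beta_k \sum_{j \in J_k} \mu(A_j) = \sum_{j \in J_k} \alpha_j \mu(A_j), \quad \forall\, k \in \{1, \dots, p\}.$$
By the definition of $I^\mu_{\mathrm{elem}}$ we then get
$$I^\mu_{\mathrm{elem}}(f) = \sum_{k=1}^p \beta_k\, \mu\big(f^{-1}(\{\beta_k\})\big) = \sum_{k=1}^p \sum_{j \in J_k} \alpha_j \mu(A_j) = \sum_{j=1}^n \alpha_j \mu(A_j).$$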
Claim 2: For every $f \in L^1_{K,\mathrm{elem}}(X, \mathcal{A}, \mu)$, and every A ∈ A with µ(A) < ∞,
one has the equality
$$I^\mu_{\mathrm{elem}}(f + \alpha \kappa_A) = I^\mu_{\mathrm{elem}}(f) + \alpha \mu(A), \quad \forall\, \alpha \in K. \tag{2}$$
Write $f = \alpha_1 \kappa_{A_1} + \dots + \alpha_n \kappa_{A_n}$, with $(A_j)_{j=1}^n \subset \mathcal{A}$ pairwise disjoint, and $\mu(A_j) < \infty$,
∀ j = 1, . . . , n. In order to prove (2), we are going to write the function $f + \alpha \kappa_A$
in a similar way, and we are going to apply Claim 1. Consider the sets
$B_1, B_2, \dots, B_{2n}, B_{2n+1} \in \mathcal{A}$ defined by $B_{2n+1} = A \setminus (A_1 \cup \dots \cup A_n)$, and
$B_{2k-1} = A_k \cap A$, $B_{2k} = A_k \setminus A$, ∀ k = 1, . . . , n. It is obvious that the sets $(B_p)_{p=1}^{2n+1}$ are
pairwise disjoint. Moreover, one has the equalities
$$B_{2k-1} \cup B_{2k} = A_k, \quad \forall\, k \in \{1, \dots, n\}, \tag{3}$$
as well as the equality
$$A = \bigcup_{k=1}^{n+1} B_{2k-1}. \tag{4}$$
Using these equalities, now we have $f + \alpha \kappa_A = \sum_{p=1}^{2n+1} \beta_p \kappa_{B_p}$, where $\beta_{2n+1} = \alpha$,
and $\beta_{2k} = \alpha_k$ and $\beta_{2k-1} = \alpha_k + \alpha$, ∀ k ∈ {1, . . . , n}. Using these equalities,
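which proves that $\mu\big(h_0^{-1}(\{\alpha\})\big) < \infty$. Likewise, if α < 0, then, using the inequality
h0 ≥ f0 , we get
$$h_0^{-1}(\{\alpha\}) \subset f_0^{-1}\big((-\infty, 0)\big) \subset \bigcup_{\lambda \in f_0(X) \setminus \{0\}} f_0^{-1}(\{\lambda\}),$$
which proves again that $\mu\big(h_0^{-1}(\{\alpha\})\big) < \infty$.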
Having shown that h0 is elementary integrable, we now compare the numbers
$I^\mu_{\mathrm{elem}}(f)$, $I^\mu_{\mathrm{elem}}(h_0)$, and $I^\mu_{\mathrm{elem}}(g)$. Define the functions f1 = h0 − f0 , and g1 = g0 − h0 .
By Theorem 1.1, we know that $f_1, g_1 \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$. Since f1 , g1 ≥ 0, we
have f1 (X), g1 (X) ⊂ [0, ∞), so it follows immediately that $I^\mu_{\mathrm{elem}}(f_1) \geq 0$ and
$I^\mu_{\mathrm{elem}}(g_1) \geq 0$. Now, again using Theorem 1.1, and (6), we get
$$I^\mu_{\mathrm{elem}}(h_0) = I^\mu_{\mathrm{elem}}(f_0 + f_1) = I^\mu_{\mathrm{elem}}(f_0) + I^\mu_{\mathrm{elem}}(f_1) \geq I^\mu_{\mathrm{elem}}(f_0) = I^\mu_{\mathrm{elem}}(f);$$
$$I^\mu_{\mathrm{elem}}(h_0) = I^\mu_{\mathrm{elem}}(g_0 - g_1) = I^\mu_{\mathrm{elem}}(g_0) - I^\mu_{\mathrm{elem}}(g_1) \leq I^\mu_{\mathrm{elem}}(g_0) = I^\mu_{\mathrm{elem}}(g).$$
Since h = h0 , µ-a.e., by the above Remark it follows that $h \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, and
$I^\mu_{\mathrm{elem}}(h) = I^\mu_{\mathrm{elem}}(h_0)$, so the desired inequality (5) follows immediately.
We now define another type of integral.
Indeed, if we define, for each t ∈ (0, ∞), the set $A_t = f^{-1}([t, \infty]) \in \mathcal{A}$, then we
have $0 \leq t\kappa_{A_t} \leq f$. This forces the functions $t\kappa_{A_t}$, t ∈ (0, ∞) to be elementary
integrable, and
$$\mu(A_t) \leq \frac{I^\mu_+(f)}{t}, \quad \forall\, t \in (0, \infty).$$
This forces $\lim_{t\to\infty} \mu(A_t) = 0$.
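The estimate on µ(A_t) is easy to test on a finite measure space, where the positive integral is just a weighted sum. The following Python sketch is an illustration only: the weights, the function f and the names integral and mu_superlevel are hypothetical.

```python
from fractions import Fraction

weight = {x: Fraction(1, x + 1) for x in range(6)}   # hypothetical measure on {0, ..., 5}

def f(x):
    """A non-negative function on the finite space (hypothetical choice)."""
    return x * x

def integral(f):
    """Integral of a non-negative function with respect to the weights."""
    return sum(f(x) * weight[x] for x in weight)

def mu_superlevel(f, t):
    """mu(A_t), where A_t is the set of points x with f(x) >= t."""
    return sum(weight[x] for x in weight if f(x) >= t)
```

For every t > 0 the value mu_superlevel(f, t) is bounded by integral(f) / t, and it vanishes as soon as t exceeds the largest value of f, in line with the limit stated above.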
The next result explains the fact that positive integrability is a “decomposable”
property.
Proposition 1.4. Let (X, A, µ) be a measure space. Suppose $(A_k)_{k=1}^n \subset \mathcal{A}$
is a pairwise disjoint finite sequence, with A1 ∪ · · · ∪ An = X. For a measurable
function f : X → [0, ∞], the following are equivalent.
(i) $f \in L^1_+(X, \mathcal{A}, \mu)$;
(ii) $f\kappa_{A_k} \in L^1_+(X, \mathcal{A}, \mu)$, ∀ k = 1, . . . , n.
Moreover, if f satisfies these equivalent conditions, one has
$$I^\mu_+(f) = \sum_{k=1}^n I^\mu_+(f\kappa_{A_k}).$$
We obviously have
$$h = \sum_{k=1}^n h_k \leq \sum_{k=1}^n f\kappa_{A_k} = f,$$
so we get $I^\mu_{\mathrm{elem}}(h) \leq I^\mu_+(f)$, thus the inequality (8) gives
$$I^\mu_+(f) \geq \sum_{k=1}^n I^\mu_+(f\kappa_{A_k}) - \varepsilon.$$
Since this inequality holds for all ε > 0, we get $I^\mu_+(f) \geq \sum_{k=1}^n I^\mu_+(f\kappa_{A_k})$, and we
are done.
Remark 1.4. Let (X, A, µ) be a measure space, and let S ∈ A. We can consider the σ-algebra
$$\mathcal{A}_S = \{A \cap S : A \in \mathcal{A}\} = \{A \in \mathcal{A} : A \subset S\}$$
on S, and the restriction of µ to $\mathcal{A}_S$, which we denote
by µ|S . With these notations, (S, AS , µ|S ) is a measure space. It is not hard to
see that for a measurable function f : X → [0, ∞], the conditions
• $f\kappa_S \in L^1_+(X, \mathcal{A}, \mu)$,
• $f|_S \in L^1_+(S, \mathcal{A}_S, \mu|_S)$
are equivalent. Moreover, in this case one has the equality
$$I^\mu_+(f\kappa_S) = I^{\mu|_S}_+(f|_S).$$
This is a consequence of the fact that these two conditions are equivalent if f is
elementary, combined with the fact that the restriction map $h \longmapsto h|_S$ establishes
a bijection between the sets
$$\{h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X) : 0 \leq h \leq f\kappa_S\},$$
$$\{k \in \mathcal{A}_S\text{-}\mathrm{Elem}_{\mathbb{R}}(S) : 0 \leq k \leq f|_S\}.$$
The next result gives an alternative definition of the positive integral, for func-
tions that are dominated by elementary integrable ones.
Proposition 1.5. Let (X, A, µ) be a measure space, let f : X → [0, ∞] be
a measurable function. Assume there exists $h_0 \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, with h0 ≥ f .
Then $f \in L^1_+(X, \mathcal{A}, \mu)$, and one has the equality
$$I^\mu_+(f) = \inf\{I^\mu_{\mathrm{elem}}(h) : h \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu),\ h \geq f\}. \tag{9}$$
To prove the other inequality, we use the definition of the positive integral, which
gives
$$I^\mu_+(g - f) = \sup\{I^\mu_{\mathrm{elem}}(h) : h \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu),\ 0 \leq h \leq g - f\}. \tag{13}$$
Remark that, whenever $h \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$ is such that 0 ≤ h ≤ g − f , it follows
that 0 ≤ h + f ≤ g, so using part (i) combined with Proposition 1.3, we see that
$h + f \in L^1_+(X, \mathcal{A}, \mu)$, and
$$I^\mu_{\mathrm{elem}}(g) = I^\mu_+(g) \geq I^\mu_+(h + f) = I^\mu_{\mathrm{elem}}(h) + I^\mu_+(f).$$
This means that we have
$$I^\mu_{\mathrm{elem}}(h) \leq I^\mu_{\mathrm{elem}}(g) - I^\mu_+(f),$$
for all $h \in L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, with 0 ≤ h ≤ g − f , and then by (13), we immediately
get $I^\mu_+(g - f) \leq I^\mu_{\mathrm{elem}}(g) - I^\mu_+(f)$.
We are now in position to prove the following result (compare with Theorem
1.1).
Theorem 1.2. Let (X, A, µ) be a measure space.
(i) If f1 , f2 ∈ L1+ (X, A, µ), then f1 + f2 ∈ L1+ (X, A, µ), and one has the
equality $I_+^\mu(f_1 + f_2) = I_+^\mu(f_1) + I_+^\mu(f_2)$.
(ii) If f ∈ L1+ (X, A, µ), and α ∈ [0, ∞), then¹⁴ αf ∈ L1+ (X, A, µ), and one
has the equality $I_+^\mu(\alpha f) = \alpha I_+^\mu(f)$.
The next result collects the basic properties of L1R̄ . Among other things, it
states that it is an “almost” vector space.
Theorem 1.3. Let (X, A, µ) be a measure space.
(i) For a measurable function f : X → R̄, the following are equivalent:
(a) f ∈ L1R̄ (X, A, µ);
(b) f ∈ L1+ (X, A, µ).
(ii) If f, g ∈ L1R̄ (X, A, µ), and if h : X → R̄ is a measurable function, such
that
\[ h(x) = f(x) + g(x), \quad \forall\, x \in X \setminus \bigl[ f^{-1}(\{-\infty, \infty\}) \cup g^{-1}(\{-\infty, \infty\}) \bigr], \]
Indeed, if we put $N = f_1^{-1}(\{\infty\}) \cup f_2^{-1}(\{\infty\})$, then µ(N ) = 0, and if we start with
some $x \in X \setminus N$, we either have f1 (x) ≥ f2 (x) ≥ 0, in which case we get
\[ f^+(x) = f(x) = f_1(x) - f_2(x) \le f_1(x), \qquad f^-(x) = 0 \le f_2(x), \]
or we have f1 (x) ≤ f2 (x), in which case we get
\[ f^+(x) = 0 \le f_1(x), \qquad f^-(x) = -f(x) = f_2(x) - f_1(x) \le f_2(x). \]
In other words, we have
\[ f^+(x) \le f_1(x) \ \text{ and } \ f^-(x) \le f_2(x), \quad \forall\, x \in X \setminus N, \]
so we indeed get (17) and (18). Using these inequalities, and Proposition 1.3, it
follows that f ± ∈ L1+ (X, A, µ), so by Theorem 1.2, it follows that f + + f − = |f |
also belongs to L1+ (X, A, µ).
To prove the implication (b) ⇒ (a), start by assuming that |f | ∈ L1+ (X, A, µ).
Then, since we obviously have the inequalities 0 ≤ f ± ≤ |f |, again by Proposition
1.3, it follows that f ± ∈ L1+ (X, A, µ). Since we obviously have
\[ f(x) = f^+(x) - f^-(x), \quad \forall\, x \in X \setminus f^{-1}(\{-\infty, \infty\}), \]
it follows that f indeed belongs to L1R̄ (X, A, µ).
(ii). Assume f , g, and h are as in (ii). By (i), both functions |f | and |g| are in
L1+ (X, A, µ). By Theorem 1.2, it follows that the function k = |f | + |g| also belongs
to L1+ (X, A, µ). Notice that we have the equality
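\[ f^{-1}(\{-\infty, \infty\}) \cup g^{-1}(\{-\infty, \infty\}) = k^{-1}(\{\infty\}), \]
so the hypothesis on h reads
\[ h(x) = f(x) + g(x), \quad \forall\, x \in X \setminus k^{-1}(\{\infty\}), \]
which then gives
\[ |h(x)| = |f(x) + g(x)| \le |f(x)| + |g(x)|, \quad \forall\, x \in X \setminus k^{-1}(\{\infty\}). \]
Of course, since $\mu\bigl(k^{-1}(\{\infty\})\bigr) = 0$, this gives
\[ |h| \le k, \quad \mu\text{-a.e.}, \]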
and using (i) it follows that h indeed belongs to L1R̄ (X, A, µ).
(iii). Assume f , α, and g are as in (iii). Exactly as above, we have |g| = |α|·|f |,
µ-a.e., and then by Theorem 1.2 it follows that |g| ∈ L1+ (X, A, µ).
(iv). The inclusion L1+ (X, A, µ) ⊂ L1R̄ (X, A, µ) is trivial. To prove the inclusion
L1R,elem (X, A, µ) ⊂ L1R̄ (X, A, µ), we use parts (ii) and (iii) to reduce this to the fact
that κ A ∈ L1R̄ (X, A, µ), for all A ∈ A, with µ(A) < ∞. But this fact is now obvious,
because any such function belongs to L1+ (X, A, µ) ⊂ L1R̄ (X, A, µ).
Corollary 1.1. Let (X, A, µ) be a measure space, and let K be one of the
fields R or C.
(i) For a K-valued measurable function f : X → K, the following are equiva-
lent:
(a) f ∈ L1K (X, A, µ);
(b) |f | ∈ L1+ (X, A, µ).
(ii) When equipped with the pointwise addition and scalar multiplication, the
space L1K (X, A, µ) becomes a K-vector space.
Proof. (i). The case K = R is immediate from Theorem 1.3.
In the case when K = C, we use the obvious inequalities
\[ \max\bigl\{ |\mathrm{Re}\, f|, |\mathrm{Im}\, f| \bigr\} \le |f| \le |\mathrm{Re}\, f| + |\mathrm{Im}\, f|. \tag{19} \]
If f ∈ L1C (X, A, µ), then both Re f and Im f belong to L1R (X, A, µ), so by
Theorem 1.3, both |Re f | and |Im f | belong to L1+ (X, A, µ). By Theorem 1.2, the
function g = |Re f | + |Im f | belongs to L1+ (X, A, µ), and then using the second
inequality in (19), it follows that |f | belongs to L1+ (X, A, µ).
Conversely, if |f | belongs to L1+ (X, A, µ), then using the first inequality in (19),
it follows that both |Re f | and |Im f | belong to L1+ (X, A, µ), so by Theorem 1.3,
both Re f and Im f belong to L1R (X, A, µ), i.e. f belongs to L1C (X, A, µ).
(ii). This part is pretty clear. If f, g ∈ L1K (X, A, µ), then by (i) both |f |
and |g| belong to L1+ (X, A, µ), and by Theorem 1.2, the function |f | + |g| will
302 LECTURES 32-33
also belong to L1+ (X, A, µ). Since |f + g| ≤ |f | + |g|, it follows that |f + g| itself
belongs to L1+ (X, A, µ), so using (i) again, it follows that f + g indeed belongs to
L1K (X, A, µ). If f ∈ L1K (X, A, µ) and α ∈ K, then |f | belongs to L1+ (X, A, µ), so
|αf | = |α| · |f | again belongs to L1+ (X, A, µ), which by (i) gives the fact that αf
belongs to L1K (X, A, µ).
Remark 1.5. Let (X, A, µ) be a measure space. Then one has the equalities
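\[ L^1_+(X, \mathcal{A}, \mu) = \bigl\{ f \in L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu) : f(X) \subset [0, \infty] \bigr\}; \tag{20} \]
\[ L^1_{\mathbb{K},\mathrm{elem}}(X, \mathcal{A}, \mu) = L^1_{\mathbb{K}}(X, \mathcal{A}, \mu) \cap \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X). \tag{21} \]
Indeed, by Theorem 1.3 we have the inclusion
\[ L^1_+(X, \mathcal{A}, \mu) \subset \bigl\{ f \in L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu) : f(X) \subset [0, \infty] \bigr\}. \]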
The inclusion in the other direction follows again from Theorem 1.3, since any
function that belongs to the right hand side of (20) satisfies f = |f |. The inclusion
L1K,elem (X, A, µ) ⊂ L1K (X, A, µ) ∩ A-ElemK (X)
is again contained in Theorem 1.3. To prove the inclusion in the other direction,
it suffices to consider the case K = R. Start with h ∈ L1R (X, A, µ) ∩ A-ElemR (X),
which gives |h| ∈ L1+ (X, A, µ). The function |h| is obviously in A-ElemR (X), so
we get |h| ∈ L1R,elem (X, A, µ). Since L1R,elem (X, A, µ) is a vector space, it will also
contain the function −|h|. The fact that h itself belongs to L1R,elem (X, A, µ) then
follows from Proposition 1.1, combined with the obvious inequalities
−|h| ≤ h ≤ |h|.
The following result deals with the construction of the integral.
Theorem 1.4. Let (X, A, µ) be a measure space. There exists a unique map
$I^\mu_{\bar{\mathbb{R}}} : L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu) \to \mathbb{R}$, with the following properties:
(i) Whenever f, g, h ∈ L1R̄ (X, A, µ) are such that
Proof. Let us first show the existence. Start with some f ∈ L1R̄ (X, A, µ), and
define the functions f ± : X → [0, ∞] by f + = max{f, 0} and f − = max{−f, 0} so
that f = f + − f − , and f + , f − ∈ L1+ (X, A, µ). We then define
\[ I^\mu_{\bar{\mathbb{R}}}(f) = I_+^\mu(f^+) - I_+^\mu(f^-). \]
Claim: Whenever f ∈ L1R̄ (X, A, µ), and f1 , f2 ∈ L1+ (X, A, µ) are such that
\[ f(x) = f^+(x) - f^-(x), \quad \forall\, x \in X \setminus \bigl[ f_1^{-1}(\{\infty\}) \cup f_2^{-1}(\{\infty\}) \bigr], \]
which gives
f2 + f + = f1 + f − , µ-a.e.
By Theorem 1.2, this immediately gives
\[ I_+^\mu(f_2) + I_+^\mu(f^+) = I_+^\mu(f_1) + I_+^\mu(f^-), \]
which then gives
\[ I_+^\mu(f_1) - I_+^\mu(f_2) = I_+^\mu(f^+) - I_+^\mu(f^-) = I^\mu_{\bar{\mathbb{R}}}(f). \]
Having proved the above Claim, let us show now that $I^\mu_{\bar{\mathbb{R}}}$ has properties (i) and
(ii). Assume f , g and h are as in (i). Notice that if we define $h_1 = f^+ + g^+$ and
$h_2 = f^- + g^-$, then we clearly have 0 ≤ h1 ≤ |f | + |g| and 0 ≤ h2 ≤ |f | + |g|, so h1
and h2 both belong to L1+ (X, A, µ). By Theorem 1.2, we then have
\[ I_+^\mu(h_1) = I_+^\mu(f^+) + I_+^\mu(g^+) \ \text{ and } \ I_+^\mu(h_2) = I_+^\mu(f^-) + I_+^\mu(g^-). \tag{22} \]
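Notice also that, because of the equalities
\[ h_1^{-1}(\{\infty\}) = f^{-1}(\{\infty\}) \cup g^{-1}(\{\infty\}) \ \text{ and } \ h_2^{-1}(\{\infty\}) = f^{-1}(\{-\infty\}) \cup g^{-1}(\{-\infty\}), \]
we have
\[ h(x) = h_1(x) - h_2(x), \quad \forall\, x \in X \setminus \bigl[ h_1^{-1}(\{\infty\}) \cup h_2^{-1}(\{\infty\}) \bigr], \]
so by the above Claim, combined with (22), we get
\[ I^\mu_{\bar{\mathbb{R}}}(h) = I_+^\mu(h_1) - I_+^\mu(h_2) = I_+^\mu(f^+) + I_+^\mu(g^+) - I_+^\mu(f^-) - I_+^\mu(g^-) = I^\mu_{\bar{\mathbb{R}}}(f) + I^\mu_{\bar{\mathbb{R}}}(g). \]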
Property (ii) is pretty obvious.
The uniqueness is also obvious. If we start with a map J : L1R̄ (X, A, µ) → R
with properties (i)-(iii), then for every f ∈ L1R̄ (X, A, µ), we must have
\[ J(f) = J(f^+) - J(f^-) = I_+^\mu(f^+) - I_+^\mu(f^-). \]
(For the second equality we use condition (iii), combined with the fact that both
f + and f − belong to L1+ (X, A, µ).)
Corollary 1.2. Let (X, A, µ) be a measure space, and let K be either R or
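C. There exists a unique linear map $I^\mu_{\mathbb{K}} : L^1_{\mathbb{K}}(X, \mathcal{A}, \mu) \to \mathbb{K}$, such that
\[ I^\mu_{\mathbb{K}}(f) = I_+^\mu(f), \quad \forall\, f \in L^1_+(X, \mathcal{A}, \mu) \cap L^1_{\mathbb{K}}(X, \mathcal{A}, \mu). \]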
Proof. Let us start with the case K = R. In this case, we have the inclusion
L1R (X, A, µ) ⊂ L1R̄ (X, A, µ),
so we can define $I^\mu_{\mathbb{R}}$ as the restriction of $I^\mu_{\bar{\mathbb{R}}}$ to L1R (X, A, µ). The uniqueness is again
clear, because of the equalities
\[ I^\mu_{\mathbb{R}}(f) = I^\mu_{\mathbb{R}}(f^+) - I^\mu_{\mathbb{R}}(f^-) = I_+^\mu(f^+) - I_+^\mu(f^-). \]
Proof. Let us first examine the case when K = R̄, R. In this case we define
f + = max{f, 0} and f − = max{−f, 0}, so we have f = f + − f − , as well as
$|f| = f^+ + f^-$. Using the inequalities $I_+^\mu(f^\pm) \ge 0$, we have
\[ \int_X f \, d\mu = I_+^\mu(f^+) - I_+^\mu(f^-) \le I_+^\mu(f^+) + I_+^\mu(f^-) = \int_X |f| \, d\mu; \]
\[ -\int_X f \, d\mu = -I_+^\mu(f^+) + I_+^\mu(f^-) \le I_+^\mu(f^+) + I_+^\mu(f^-) = \int_X |f| \, d\mu. \]
In other words, we have
\[ \pm \int_X f \, d\mu \le \int_X |f| \, d\mu, \]
and the desired inequality immediately follows.
Let us consider now the case K = C. Consider the number $\lambda = \int_X f \, d\mu$, and
let us choose some complex number α ∈ C, with |α| = 1, and αλ = |λ|. (If λ ≠ 0,
we take $\alpha = \lambda^{-1}|\lambda|$; otherwise we take α = 1.) Consider the measurable function
g = αf . Notice now that
\[ \int_X \mathrm{Re}\, g \, d\mu + i \int_X \mathrm{Im}\, g \, d\mu = \int_X g \, d\mu = \alpha \int_X f \, d\mu = \alpha\lambda = |\lambda| \ge 0, \]
so in particular we get
\[ |\lambda| = \int_X \mathrm{Re}\, g \, d\mu. \]
If we apply the real case, we then get
\[ |\lambda| \le \int_X |\mathrm{Re}\, g| \, d\mu. \tag{23} \]
Notice now that we have the inequality |Re g| ≤ |g| = |f |, which gives
\[ I_+^\mu\bigl(|\mathrm{Re}\, g|\bigr) \le I_+^\mu\bigl(|f|\bigr) = \int_X |f| \, d\mu, \]
so the inequality (23) immediately gives
\[ \Bigl| \int_X f \, d\mu \Bigr| = |\lambda| \le \int_X |f| \, d\mu. \]
Corollary 1.3. Let (X, A, µ) be a measure space, and let K be one of the
symbols R̄, R, or C. If a measurable function f : X → K satisfies f = 0, µ-a.e.,
then f ∈ L1K (X, A, µ), and $\int_X f \, d\mu = 0$.
Comment. The introduction of the space L1R̄ (X, A, µ), of extended real-valued
µ-integrable functions, is useful mostly for technical reasons. In effect, everything
can be reduced to the case when only “honest” real-valued functions are involved.
The following result clarifies this matter.
Lemma 1.2. Let (X, A, µ) be a measure space, and let f : X → R̄ be a measurable
function. The following are equivalent:
(i) f ∈ L1R̄ (X, A, µ);
(ii) there exists g ∈ L1R (X, A, µ), such that g = f , µ-a.e.
Moreover, if f satisfies these equivalent conditions, then any function g, satisfying
(ii), also has the property
\[ \int_X f \, d\mu = \int_X g \, d\mu. \]
Proof. Consider the set F = {x ∈ X : −∞ < f (x) < ∞}, which belongs to
A. We obviously have the equality $X \setminus F = |f|^{-1}(\{\infty\})$.
(i) ⇒ (ii). Assume f ∈ L1R̄ (X, A, µ), which means that |f | ∈ L1+ (X, A, µ). In
particular, we get $\mu(X \setminus F) = 0$. Define the measurable function $g = f\kappa_F$. On the
one hand, it is clear, by construction, that we have −∞ < g(x) < ∞, ∀ x ∈ X. On
the other hand, it is clear that $g|_F = f|_F$, so using $\mu(X \setminus F) = 0$, we get the fact
that f = g, µ-a.e. Finally, the inequality 0 ≤ |g| ≤ |f |, combined with Proposition
1.3, gives |g| ∈ L1+ (X, A, µ), so g indeed belongs to L1R (X, A, µ).
(ii) ⇒ (i). Suppose there exists g ∈ L1R (X, A, µ), with f = g, µ-a.e., and let us
prove that
(a) f ∈ L1R̄ (X, A, µ);
(b) $\int_X f \, d\mu = \int_X g \, d\mu$.
The first assertion is clear, because by Proposition 1.3, the equality |f | = |g|, µ-a.e.,
combined with |g| ∈ L1+ (X, A, µ), forces |f | ∈ L1+ (X, A, µ), i.e. f ∈ L1R̄ (X, A, µ). To
prove (b), we consider the difference h = f − g, which is a measurable function
h : X → R̄, and satisfies h = 0, µ-a.e. By Corollary 1.3, we know that h ∈ L1R̄ (X, A, µ),
and $\int_X h \, d\mu = 0$. By Theorem 1.3, we get
\[ \int_X f \, d\mu = \int_X g \, d\mu + \int_X h \, d\mu = \int_X g \, d\mu. \]
The following result is an analogue of Proposition 1.1 (see also Proposition 1.3).
Proposition 1.7. Let (X, A, µ) be a measure space, and let f1 , f2 ∈ L1R̄ (X, A, µ). Suppose
f : X → R̄ is a measurable function, such that f1 ≤ f ≤ f2 , µ-a.e. Then f ∈
L1R̄ (X, A, µ), and one has the inequality
\[ \int_X f_1 \, d\mu \le \int_X f \, d\mu \le \int_X f_2 \, d\mu. \]
Proof. First of all, since f1 and f2 belong to L1R̄ (X, A, µ), it follows that
|f1 | and |f2 |, hence also |f1 | + |f2 |, belong to L1+ (X, A, µ). Second, since we have
f2 ≤ |f2 | ≤ |f1 | + |f2 | and f1 ≥ −|f1 | ≥ −|f1 | − |f2 | (everywhere!), the inequalities
f1 ≤ f ≤ f2 , µ-a.e., give
−|f1 | − |f2 | ≤ f ≤ |f1 | + |f2 |, µ-a.e.,
which reads
|f | ≤ |f1 | + |f2 |, µ-a.e.
Since |f1 | + |f2 | ∈ L1+ (X, A, µ), by Proposition 1.3, we get |f | ∈ L1+ (X, A, µ), so f
indeed belongs to L1R̄ (X, A, µ).
To prove the inequality for integrals, we use Lemma 1.2, to find functions
g1 , g2 , g ∈ L1R (X, A, µ), such that f1 = g1 , µ-a.e., f2 = g2 , µ-a.e., and f = g, µ-a.e.
Lemma 1.2 also gives the equalities $\int_X f_1 \, d\mu = \int_X g_1 \, d\mu$, $\int_X f_2 \, d\mu = \int_X g_2 \, d\mu$, and
$\int_X f \, d\mu = \int_X g \, d\mu$, so what we need to prove are the inequalities
\[ \int_X g_1 \, d\mu \le \int_X g \, d\mu \le \int_X g_2 \, d\mu. \tag{24} \]
Of course, we have
g1 ≤ g ≤ g2 , µ-a.e.
To prove the first inequality in (24), we consider the function h = g − g1 ∈
L1R (X, A, µ), and we prove that $\int_X h \, d\mu \ge 0$. But this is quite clear, because
we have h ≥ 0, µ-a.e., which means that h = |h|, µ-a.e., so by Lemma 1.2, we get
\[ \int_X h \, d\mu = \int_X |h| \, d\mu = I_+^\mu(|h|) \ge 0. \]
The second inequality in (24) is proved in exactly the same way.
The next result is an analogue of Proposition 1.4.
Proposition 1.8. Let (X, A, µ) be a measure space, and let K be one of the
symbols R̄, R, or C. Suppose (Ak )nk=1 ⊂ A is a pairwise disjoint finite sequence,
with A1 ∪ · · · ∪ An = X. For a measurable function f : X → K, the following are
equivalent.
(i) f ∈ L1K (X, A, µ);
(ii) f κ Ak ∈ L1K (X, A, µ), ∀ k = 1, . . . , n.
Moreover, if f satisfies these equivalent conditions, one has
\[ \int_X f \, d\mu = \sum_{k=1}^n \int_X f\kappa_{A_k} \, d\mu. \tag{25} \]
µ-a.e., for all k = 1, . . . , n, and the equality (25) follows from the corresponding
equality that holds for g.
Remark 1.8. The equality (25) also holds for arbitrary measurable functions
f : X → [0, ∞], if we use the convention that preceded Remarks 1.7. This is an
immediate consequence of Proposition 1.4, because the left hand side is infinite, if
and only if one of the terms in the right hand side is infinite.
The following is an obvious extension of Remark 1.4.
Remark 1.9. Let K be one of the symbols R̄, R, or C, let (X, A, µ) be a
measure space. For a set S ∈ A, and a measurable function f : X → K, one has
the equivalence
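\[ f\kappa_S \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu) \iff f|_S \in L^1_{\mathbb{K}}\bigl(S, \mathcal{A}|_S, \mu|_S\bigr). \]
If this is the case, one has the equality
\[ \int_X f\kappa_S \, d\mu = \int_S f|_S \, d\mu|_S. \tag{26} \]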
The above equality also holds for arbitrary measurable functions f : X → [0, ∞],
again using the convention that preceded Remarks 1.7.
Notation. The above remark states that, whenever the quantities in (26) are
defined, they are equal. (This only requires the fact that $f|_S$ is measurable, and
either $f|_S \in L^1_{\mathbb{K}}\bigl(S, \mathcal{A}|_S, \mu|_S\bigr)$, or f (S) ⊂ [0, ∞].) In this case, the equal quantities
2. Convergence theorems
In this section we analyze the dynamics of integrability in the case when
sequences of measurable functions are considered. Roughly speaking, a “convergence
theorem” states that integrability is preserved under taking limits. In other words,
if one has a sequence $(f_n)_{n=1}^\infty$ of integrable functions, and if f is some kind of a
limit of the fn ’s, then we would like to conclude that f itself is integrable, as well
as the equality $\int f = \lim_{n\to\infty} \int f_n$.
Such results are often employed in two instances:
A. When we want to prove that some function f is integrable. In this case
we would look for a sequence $(f_n)_{n=1}^\infty$ of integrable approximants for f .
B. When we want to construct an integrable function. In this case, we will
produce first the approximants, and then we will examine the existence
of the limit.
The first convergence result, which is somewhat primitive, but very useful, is the
following.
Lemma 2.1. Let (X, A, µ) be a finite measure space, let a ∈ (0, ∞) and let
fn : X → [0, a], n ≥ 1, be a sequence of measurable functions satisfying
(a) f1 ≥ f2 ≥ · · · ≥ 0;
(b) limn→∞ fn (x) = 0, ∀ x ∈ X.
Then one has the equality
\[ \lim_{n\to\infty} \int_X f_n \, d\mu = 0. \tag{1} \]
Proof. Let us define, for each ε > 0, and each integer n ≥ 1, the set
Obviously, we have $A_n^\varepsilon \in \mathcal{A}$, ∀ ε > 0, n ≥ 1. One key fact we are going to use is the
following.
Claim 1: For every ε > 0, one has the equality
\[ \lim_{n\to\infty} \mu(A_n^\varepsilon) = 0. \]
Fix ε > 0. Let us first observe that, using (a), we have the inclusions
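Second, using (b), we clearly have the equality $\bigcap_{k=1}^\infty A_k^\varepsilon = \emptyset$. Since µ is finite,
using the Continuity Property (Lemma III.4.1), we have
\[ \lim_{n\to\infty} \mu(A_n^\varepsilon) = \mu\Bigl( \bigcap_{n=1}^\infty A_n^\varepsilon \Bigr) = \mu(\emptyset) = 0. \]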
Claim 2: For every ε > 0 and every integer n ≥ 1, one has the inequality
\[ 0 \le \int_X f_n \, d\mu \le a\mu(A_n^\varepsilon) + \varepsilon\mu(X). \]
Since the last inequality holds for arbitrary ε > 0, the desired equality (1) immedi-
ately follows.
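As a quick illustration of Lemma 2.1, here is a worked example. The choice X = [0, 1) with Lebesgue measure λ, a = 1, and f_n(x) = x^n is ours, not the text's. The set A_n^ε is taken to be {x : f_n(x) ≥ ε}, which is an assumption, since its definition does not survive in the passage above.

```latex
% Assumed example: X = [0,1), \mu = \lambda (Lebesgue measure), a = 1, f_n(x) = x^n.
% Hypothesis (a) holds because 0 \le x < 1 gives x^{n+1} \le x^n;
% hypothesis (b) holds because x^n \to 0 for every x in [0,1).
\[
  \int_{[0,1)} f_n \, d\lambda = \int_0^1 x^n \, dx = \frac{1}{n+1}
  \xrightarrow[n\to\infty]{} 0 .
\]
% With the assumed definition A_n^\varepsilon = \{ x : f_n(x) \ge \varepsilon \}, 0 < \varepsilon < 1:
\[
  A_n^\varepsilon = [\varepsilon^{1/n}, 1), \qquad
  \lambda(A_n^\varepsilon) = 1 - \varepsilon^{1/n} \xrightarrow[n\to\infty]{} 0 ,
\]
% so the bound of Claim 2 reads
\[
  0 \le \frac{1}{n+1} \le \bigl( 1 - \varepsilon^{1/n} \bigr) + \varepsilon .
\]
```

Letting n → ∞ and then ε → 0 in the last line recovers equality (1) for this particular sequence.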
We now turn our attention to a weaker notion of limit, for sequences of mea-
surable functions.
Definition. Let (X, A, µ) be a measure space, let K be one of the symbols
R̄, R, or C. Suppose fn : X → K, n ≥ 1, are measurable functions. Given
a measurable function f : X → K, we say that the sequence $(f_n)_{n=1}^\infty$ converges
µ-almost everywhere to f , if there exists some set N ∈ A, with µ(N ) = 0, such that
\[ \lim_{n\to\infty} f_n(x) = f(x), \quad \forall\, x \in X \setminus N. \]
Exercise 2. Let (X, A, µ), K, and $(f_n)_{n=1}^\infty$ be as in Exercise 1. Assume
f : X → K is an arbitrary function, for which
there exists some set N ∈ A with µ(N ) = 0, and
\[ \lim_{n\to\infty} f_n(x) = f(x), \quad \forall\, x \in X \setminus N. \]
Prove that, when µ is a complete measure on A (see III.5), the function f is
automatically measurable.
Hint: Use the results from Exercise 1. We have $X \setminus N \subset L$, and $f(x) = \ell(x)$, $\forall\, x \in X \setminus N$.
Prove that, for a Borel set B ⊂ K, one has the equality $f^{-1}(B) = \ell^{-1}(B) \,\triangle\, M$, for some M ⊂ N .
By completeness, we have M ∈ A, so $f^{-1}(B) \in \mathcal{A}$.
The following fundamental result is a generalization of Lemma 2.1.
Theorem 2.1 (Lebesgue Monotone Convergence Theorem). Let (X, A, µ) be
a measure space, and let $(f_n)_{n=1}^\infty \subset L^1_+(X, \mathcal{A}, \mu)$ be a sequence with:
• fn ≤ fn+1 , µ-a.e., ∀ n ≥ 1;
• $\sup\bigl\{ \int_X f_n \, d\mu : n \ge 1 \bigr\} < \infty$.
Assume f : X → [0, ∞] is a measurable function, with f = µ-a.e.- limn→∞ fn . Then
f ∈ L1+ (X, A, µ), and $\int_X f \, d\mu = \lim_{n\to\infty} \int_X f_n \, d\mu$.
Proof. Define $\alpha_n = \int_X f_n \, d\mu$, n ≥ 1. First of all, we clearly have
0 ≤ α1 ≤ α2 ≤ . . . ,
312 LECTURE 34
tions show that it suffices to prove the theorem with g’s in place of the f ’s. The
advantage is now the fact that we have the slightly stronger properties (a) and (b)
above. The first step in the proof is the following.
Claim 1: For every t ∈ (0, ∞), one has the inequality $\mu\bigl(g^{-1}((t, \infty])\bigr) \le \alpha/t$.
Denote the set $g^{-1}((t, \infty])$ simply by $A_t$. For each n ≥ 1, we also define the set
$A_t^n = g_n^{-1}((t, \infty])$. Using property (a) above, it is clear that we have the inclusions
\[ A_t^1 \subset A_t^2 \subset \cdots \subset A_t. \tag{4} \]
Using property (b) above, we also have the equality $A_t = \bigcup_{n=1}^\infty A_t^n$. Using the
continuity Lemma 4.1, we then have
\[ \mu(A_t) = \lim_{n\to\infty} \mu(A_t^n), \]
so in order to prove the Claim, it suffices to prove the inequalities
\[ \mu(A_t^n) \le \frac{\alpha_n}{t}, \quad \forall\, n \ge 1. \tag{5} \]
But the above inequality is pretty obvious, since we clearly have $0 \le t\kappa_{A_t^n} \le g_n$,
which gives
\[ t\mu(A_t^n) = \int_X t\kappa_{A_t^n} \, d\mu \le \int_X g_n \, d\mu = \alpha_n. \]
Claim 2: For any elementary function h ∈ A-ElemR (X), with 0 ≤ h ≤ g,
one has
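Proof. As before, the sequence $(\alpha_n)_{n=1}^\infty \subset [0, \infty]$, defined by $\alpha_n = \int_X f_n \, d\mu$,
∀ n ≥ 1, is non-decreasing, and it has a limit
\[ \alpha = \lim_{n\to\infty} \alpha_n = \sup\Bigl\{ \int_X f_n \, d\mu : n \ge 1 \Bigr\} \in [0, \infty]. \]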
There are two cases to analyze.
Case I : α = ∞.
In this case the inequalities f ≥ fn ≥ 0, µ-a.e. will force
\[ \int_X f \, d\mu \ge \int_X f_n \, d\mu = \alpha_n, \quad \forall\, n \ge 1, \]
which will force $\int_X f \, d\mu \ge \alpha$, so we indeed get
\[ \int_X f \, d\mu = \infty = \alpha. \]
Case II : α < ∞.
In this case we apply directly Theorem 2.1.
The following result provides an equivalent definition of integrability for non-
negative functions (compare to the construction in Section 1).
Corollary 2.1. Let (X, A, µ) be a measure space, and let f : X → [0, ∞] be
a measurable function. The following are equivalent:
(i) f ∈ L1+ (X, A, µ);
(ii) there exists a sequence $(h_n)_{n=1}^\infty \subset L^1_{\mathbb{R},\mathrm{elem}}(X, \mathcal{A}, \mu)$, with
• 0 ≤ h1 ≤ h2 ≤ . . . ;
• limn→∞ hn (x) = f (x), ∀ x ∈ X;
• $\sup\bigl\{ \int_X h_n \, d\mu : n \ge 1 \bigr\} < \infty$.
Moreover, if $(h_n)_{n=1}^\infty$ is as in (ii), then one has the equality
\[ \int_X f \, d\mu = \lim_{n\to\infty} \int_X h_n \, d\mu. \tag{10} \]
Proof. (i) ⇒ (ii). Assume f ∈ L1+ (X, A, µ). Using Theorem III.3.2, we know
there exists a sequence $(h_n)_{n=1}^\infty \subset \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, with
(a) 0 ≤ h1 ≤ h2 ≤ · · · ≤ f ;
(b) limn→∞ hn (x) = f (x), ∀ x ∈ X.
Note that (a) forces hn ∈ L1R,elem (X, A, µ), as well as the inequalities $\int_X h_n \, d\mu \le$
Corollary 2.2 (Fatou Lemma). Let (X, A, µ) be a measure space, and let
fn : X → [0, ∞], n ≥ 1, be a sequence of measurable functions. Define the function
f : X → [0, ∞] by
\[ f(x) = \liminf_{n\to\infty} f_n(x), \quad \forall\, x \in X. \]
Then f is measurable, and one has the inequality
\[ \int_X f \, d\mu \le \liminf_{n\to\infty} \int_X f_n \, d\mu. \]
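The inequality in the Fatou Lemma can be strict. The following standard example is not taken from the text; it assumes X = ℝ with Lebesgue measure λ and uses the "escaping bump" sequence of characteristic functions.

```latex
% Assumed example: X = \mathbb{R}, \mu = \lambda, f_n = \kappa_{[n, n+1]}.
% For a fixed x, f_n(x) = 0 as soon as n > x, hence
\[
  f(x) = \liminf_{n\to\infty} f_n(x) = 0, \quad \forall\, x \in \mathbb{R},
  \qquad\text{so}\qquad \int_{\mathbb{R}} f \, d\lambda = 0 .
\]
% On the other hand every f_n has integral \lambda([n, n+1]) = 1, so
\[
  \int_{\mathbb{R}} f \, d\lambda = 0 \;<\; 1 = \liminf_{n\to\infty} \int_{\mathbb{R}} f_n \, d\lambda .
\]
```

The sequence is not monotone, which is why the Monotone Convergence Theorem does not apply to it directly.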
Proof. The fact that f is measurable is already known (see Corollary III.3.5).
Define the sequence $(\alpha_n)_{n=1}^\infty \subset [0, \infty]$ by $\alpha_n = \int_X f_n \, d\mu$, ∀ n ≥ 1.
Define, for each integer n ≥ 1, the function gn : X → [0, ∞] by
\[ g_n(x) = \inf\bigl\{ f_k(x) : k \ge n \bigr\}, \quad \forall\, x \in X. \]
By Corollary III.3.4, we know that gn , n ≥ 1 are all measurable. Moreover, it is
clear that
• 0 ≤ g1 ≤ g2 ≤ . . . ;
• f (x) = limn→∞ gn (x), ∀ x ∈ X.
By the General Lebesgue Monotone Convergence Theorem 2.2, it follows that
\[ \int_X f \, d\mu = \lim_{n\to\infty} \int_X g_n \, d\mu. \tag{11} \]
σ-algebra of all Lebesgue measurable subsets of [a, b]. Then every Riemann inte-
grable function f : [a, b] → R belongs to L1R ([a, b], Mλ ([a, b]), λ), and one has the
equality
\[ \int_{[a,b]} f \, d\lambda = \int_a^b f(x) \, dx. \tag{12} \]
Proof. We are going to use the results from III.6. First of all, the fact that f
is Lebesgue integrable, i.e. f belongs to $L^1_{\mathbb{R}}\bigl([a, b], \mathcal{M}_\lambda([a, b]), \lambda\bigr)$, is clear since f is
Lebesgue measurable, and bounded. (Here we use the fact that the measure space
$\bigl([a, b], \mathcal{M}_\lambda([a, b]), \lambda\bigr)$ is finite.)
Next we prove the equality between the Riemann integral and the Lebesgue
integral. Adding a constant, if necessary, we can assume that f ≥ 0. For every
partition ∆ = (a = x0 < x1 < · · · < xn = b) of [a, b], we define the numbers
\[ m_k = \inf_{t \in [x_{k-1}, x_k]} f(t), \quad \forall\, k = 1, \dots, n, \]
where L(f, ∆p ) denotes the lower Darboux sum. Combining this with (13), and with
the well known properties of Riemann integration, we immediately get (12).
The Lebesgue Convergence Theorems 2.2 and 2.3 have many applications. They
are among the most important results in Measure Theory. In many instances, these
theorems are employed during proofs, at key steps. The next two results are good
illustrations.
Proposition 2.1. Let (X, A, µ) be a measure space, and let f : X → [0, ∞] be
a measurable function. Then the map
\[ \nu : \mathcal{A} \ni A \longmapsto \int_A f \, d\mu \in [0, \infty] \]
defines a measure on A.
Proof. It is clear that ν(∅) = 0. To prove σ-additivity, start with a pairwise
disjoint sequence $(A_n)_{n=1}^\infty \subset \mathcal{A}$, and put $A = \bigcup_{n=1}^\infty A_n$. For each integer n ≥ 1,
define the set $B_n = \bigcup_{k=1}^n A_k$, and the measurable function $g_n = f\kappa_{B_n}$. Define also
the function $g = f\kappa_A$. It is obvious that
• 0 ≤ g1 ≤ g2 ≤ · · · ≤ g (everywhere),
\[ \sum_{n=1}^\infty \int_{A_n} |f| \, d\mu < \infty. \]
Proof. (i) ⇒ (ii). Assume f ∈ L1K (X, A, µ). Applying Proposition 2.1, to
|f |, we immediately get
\[ \sum_{n=1}^\infty \int_{A_n} |f| \, d\mu = \int_X |f| \, d\mu < \infty, \]
we get
\[ \int_X |f_n| \, d\mu = \sum_{k=1}^n \int_{A_k} |f| \, d\mu \le S < \infty, \quad \forall\, n \ge 1, \]
which proves that |f | ∈ L1+ (X, A, µ), so in particular f belongs to L1K (X, A, µ). On
the other hand, since we have |fn | ≤ |f |, by the Lebesgue Dominated Convergence
Theorem, we get
\[ \int_X f \, d\mu = \lim_{n\to\infty} \int_X f_n \, d\mu = \lim_{n\to\infty} \sum_{k=1}^n \int_{A_k} f \, d\mu = \sum_{n=1}^\infty \int_{A_n} f \, d\mu. \]
Corollary 2.4. Let (X, A, µ) be a measure space, let K be one of the symbols
R̄, R, or C, and let $(X_n)_{n=1}^\infty \subset \mathcal{A}$ be a sequence with $\bigcup_{n=1}^\infty X_n = X$, and X1 ⊂
X2 ⊂ . . . . For a measurable function f : X → K, the following are
equivalent.
(i) f ∈ L1K (X, A, µ);
(ii) $f|_{X_n} \in L^1_{\mathbb{K}}\bigl(X_n, \mathcal{A}|_{X_n}, \mu|_{X_n}\bigr)$, ∀ n ≥ 1, and
\[ \sup\Bigl\{ \int_{X_n} |f| \, d\mu : n \ge 1 \Bigr\} < \infty. \]
which by the results from III.8 gives the inequality |ν| ≤ ω. (Here |ν| denotes the
variation measure of ν.) Later on (see Section 4) we are going to see that in fact
we have the equality |ν| = ω.
(Here we use the summability convention which defines the sum as the supremum
of all finite sums.) In general, f is not always measurable. But if it is, one still
cannot conclude that
\[ \int_X f \, d\mu = \sum_{j \in J} \int_X f_j \, d\mu. \]
is equal to $\kappa_J$. If J is non-measurable, this already gives an example when
$f = \sum_{j \in J} f_j$ is non-measurable. But even if J were measurable, it would be impossible
to have the equality
\[ \int_X f \, d\lambda = \sum_{j \in J} \int_X f_j \, d\lambda, \]
simply because the right hand side is zero, while the left hand side is equal to λ(J).
The next two exercises illustrate straightforward (but nevertheless interesting)
applications of the convergence theorems to quite simple situations.
Exercise 4. Let A be a σ-algebra on a (non-empty) set X, and let $(\mu_n)_{n=1}^\infty$ be
a sequence of signed measures on A. Assume that, for each A ∈ A, the sequence
$\bigl(\mu_n(A)\bigr)_{n=1}^\infty$ has a limit denoted µ(A) ∈ [−∞, ∞]. Prove that the map µ : A →
[0, ∞] defines a measure on A, if the sequence $(\mu_n)_{n=1}^\infty$ satisfies one of the following
hypotheses:
A. 0 ≤ µ1 (A) ≤ µ2 (A) ≤ . . . , ∀ A ∈ A;
B. there exists a finite measure ω on A, such that |µn (A)| ≤ ω(A), ∀ n ≥ 1,
A ∈ A.
Hint: To prove σ-additivity, fix a pairwise disjoint sequence $(A_k)_{k=1}^\infty \subset \mathcal{A}$, and put $A = \bigcup_{k=1}^\infty A_k$.
Treat the problem of proving the equality $\mu(A) = \sum_{k=1}^\infty \mu(A_k)$ as a convergence problem on
the measure space (N, P(N), ν) - with ν the counting measure - for the sequence of functions
fn : N → [0, ∞] defined by fn (k) = µn (Ak ), ∀ k ∈ N.
Exercise 5*. Let A be a σ-algebra on a (non-empty) set X, and let (µj )j∈J be
a family of signed measures on A. Assume either of the following is true:
A. µj (A) ≥ 0, ∀ j ∈ J, A ∈ A.
B. There exists a finite measure ω on A, such that $\sum_{j \in J} |\mu_j(A)| \le \omega(A)$,
∀ A ∈ A.
Define the map µ : A → [0, ∞] by $\mu(A) = \sum_{j \in J} \mu_j(A)$, ∀ A ∈ A. (In Case A, the
sum is defined as the supremum over finite sums. In Case B, it follows that the
family $\bigl(\mu_j(A)\bigr)_{j \in J}$ is summable.) Prove that µ is a measure on A.
Hint: To prove σ-additivity, fix a pairwise disjoint sequence $(A_k)_{k=1}^\infty \subset \mathcal{A}$, and put $A = \bigcup_{k=1}^\infty A_k$.
To prove the equality $\mu(A) = \sum_{k=1}^\infty \mu(A_k)$, analyze the following cases: (i) There is some k ≥ 1,
such that µ(Ak ) = ∞; (ii) µ(Ak ) < ∞, ∀ k ≥ 1. The first case is quite trivial. In the second
case reduce the problem to the previous exercise, by observing that, for each k ≥ 1, the set
J(Ak ) = {j ∈ J : µj (Ak ) > 0} must be countable. Then the set J(A) = {j ∈ J : µj (A) > 0} is
also countable.
Comment. One of the major drawbacks of the theory of Riemann integration
is illustrated by the approach to improper integration. Recall that for a function
h : [a, b) → R the improper Riemann integral is defined as
\[ \int_a^{b-} h(t) \, dt = \lim_{x \to b-} \int_a^x h(t) \, dt, \]
provided that
(a) $h|_{[a,x]}$ is Riemann integrable, ∀ x ∈ (a, b), and
(b) the above limit exists.
The problem is that although the improper integral may exist, and the function is
actually defined on [a, b], it may fail to be Riemann integrable, for example when
it is unbounded.
In contrast to this situation, by Corollary 2.4, we see that if for example h ≥ 0,
then the Lebesgue integrability of h on [a, b] is equivalent to the fact that
(i) $h|_{[a,x]}$ is Lebesgue integrable, ∀ x ∈ (a, b), and
(ii) $\lim_{x \to b-} \int_{[a,x]} h \, d\lambda$ exists.
Going back to the discussion on improper Riemann integral, we can see that
a sufficient condition for h : [a, b) → R to be Riemann integrable in the improper
sense, is the fact that h has property (a) above, and h is Lebesgue integrable on
[a, b). In fact, if h ≥ 0, then by Corollary 2.4, this is also necessary.
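To make the comparison concrete, here is a worked example. The function h(t) = (1 − t)^{-1/2} on [0, 1) is our choice for illustration and does not appear in the text.

```latex
% Assumed example: a = 0, b = 1, h(t) = (1 - t)^{-1/2} on [0, 1).
% h is continuous on each [0, x], x < 1, so property (a) holds, and
\[
  \int_0^x \frac{dt}{\sqrt{1 - t}} = \Bigl[ -2\sqrt{1 - t} \Bigr]_0^x = 2 - 2\sqrt{1 - x}
  \xrightarrow[x \to 1-]{} 2 ,
\]
% so the improper Riemann integral exists and equals 2.
% h is unbounded near t = 1, so no extension of h to [0, 1] is Riemann integrable.
% Since h \ge 0 and the integrals over [0, x] stay bounded by 2,
% Corollary 2.4 (with an increasing sequence X_n = [0, x_n], x_n \to 1-) gives
\[
  h \in L^1_+\bigl([0,1), \lambda\bigr)
  \qquad\text{and}\qquad
  \int_{[0,1)} h \, d\lambda = 2 .
\]
```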
Notation. Let −∞ ≤ a < b ≤ ∞, and let f be a Lebesgue integrable function,
defined on some interval J which is one of (a, b), [a, b), (a, b], or [a, b]. Then the
Lebesgue integral $\int_J f \, d\lambda$ will be denoted simply by $\int_a^b f \, d\lambda$.
Exercise 6*. Let (X, A, µ) be a finite measure space. Prove that for every
f ∈ L1+ (X, A, µ), one has the equality
\[ \int_X f \, d\mu = \int_0^\infty \mu\bigl(f^{-1}([t, \infty])\bigr) \, dt, \]
where the second term is defined as an improper Riemann integral.
Hint: The function ϕ : [0, ∞) → [0, ∞) defined by $\varphi(t) = \mu\bigl(f^{-1}([t, \infty])\bigr)$, ∀ t ≥ 0, is
non-increasing, so it is Riemann integrable on every interval [0, a], a > 0. Prove the inequalities
\[ \int_{X_a} f \, d\mu \le \int_0^a \varphi(t) \, dt \le \int_X f \, d\mu, \quad \forall\, a > 0, \]
where $X_a = f^{-1}([0, a))$, by analyzing lower and upper Darboux sums of $\varphi|_{[0,a]}$. Use Corollary
2.4 to get $\lim_{a\to\infty} \int_{X_a} f \, d\mu = \int_X f \, d\mu$.
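Before attempting the general case, it may help to check the formula on the simplest non-trivial function. The sketch below assumes f = cκ_A with a constant c ∈ (0, ∞) and a set A ∈ 𝒜; both names are introduced here for illustration only.

```latex
% Assumed test case: f = c\,\kappa_A, with 0 < c < \infty and A \in \mathcal{A}.
% The superlevel sets are
\[
  f^{-1}([t, \infty]) =
  \begin{cases}
    X & \text{if } t = 0, \\
    A & \text{if } 0 < t \le c, \\
    \emptyset & \text{if } t > c,
  \end{cases}
\]
% so \varphi(t) = \mu(A) on (0, c] and \varphi(t) = 0 on (c, \infty).
% The single value \varphi(0) = \mu(X) does not affect the Riemann integral, hence
\[
  \int_0^\infty \mu\bigl(f^{-1}([t, \infty])\bigr) \, dt
  = \int_0^c \mu(A) \, dt = c\,\mu(A) = \int_X f \, d\mu .
\]
```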
Lecture 35
Remark 3.1. The space L1K (X, A, µ) was studied earlier (see Section 1). It
has the following features:
(i) L1K (X, A, µ) is a K-vector space.
(ii) The map Q1 : L1K (X, A, µ) → [0, ∞) is a seminorm, i.e.
(a) Q1 (f + g) ≤ Q1 (f ) + Q1 (g), ∀ f, g ∈ L1K (X, A, µ);
(b) Q1 (αf ) = |α| · Q1 (f ), ∀ f ∈ L1K (X, A, µ), α ∈ K.
(iii) $\bigl| \int_X f \, d\mu \bigr| \le Q_1(f)$, ∀ f ∈ L1K (X, A, µ).
Property (b) is clear. Property (a) immediately follows from the inequality |f +g| ≤
|f | + |g|, which after integration gives
\[ \int_X |f + g| \, d\mu \le \int_X \bigl( |f| + |g| \bigr) \, d\mu = \int_X |f| \, d\mu + \int_X |g| \, d\mu. \]
In what follows, we aim at proving similar features for the spaces LpK (X, A, µ)
and Qp , 1 < p < ∞.
The following will help us prove that Lp is a vector space.
Exercise 1 ♦ . Let p ∈ (1, ∞). Then one has the inequality
\[ (s + t)^p \le 2^{p-1}(s^p + t^p), \quad \forall\, s, t \in [0, \infty). \]
Hint: The inequality is trivial, when s = t = 0. If s + t > 0, reduce the problem to the case
t + s = 1, and prove, using elementary calculus techniques that
\[ \min_{t \in [0,1]} \bigl[ t^p + (1 - t)^p \bigr] = 2^{1-p}. \]
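For readers who want to see the hint carried out, the computation can be sketched as follows. The auxiliary names φ and u are introduced here and are not part of the exercise.

```latex
% Sketch of the calculus step; \phi and u are names introduced for this sketch.
\[
  \phi(t) = t^p + (1 - t)^p, \qquad
  \phi'(t) = p \bigl[ t^{p-1} - (1 - t)^{p-1} \bigr], \qquad t \in (0, 1).
\]
% Since p > 1, the map u \mapsto u^{p-1} is strictly increasing on [0, \infty), so
% \phi'(t) < 0 for t < 1/2, \phi'(1/2) = 0, and \phi'(t) > 0 for t > 1/2. Hence
\[
  \min_{t \in [0,1]} \phi(t) = \phi\bigl(\tfrac{1}{2}\bigr) = 2 \cdot 2^{-p} = 2^{1-p}.
\]
% For s + t > 0 put u = s/(s + t) \in [0, 1]; homogeneity of degree p gives
\[
  s^p + t^p = (s + t)^p \, \phi(u) \ge 2^{1-p} (s + t)^p
  \quad\Longleftrightarrow\quad
  (s + t)^p \le 2^{p-1} (s^p + t^p).
\]
```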
Proposition 3.1. Let (X, A, µ) be a measure space, let K be one of the fields
R or C, and let p ∈ (1, ∞). When equipped with pointwise addition and scalar
multiplication, LpK (X, A, µ) is a K-vector space.
Proof. If f, g ∈ LpK (X, A, µ), then by Exercise 1 we have
\[ \int_X |f + g|^p \, d\mu \le \int_X \bigl( |f| + |g| \bigr)^p \, d\mu \le 2^{p-1} \Bigl( \int_X |f|^p \, d\mu + \int_X |g|^p \, d\mu \Bigr) < \infty, \]
so f + g indeed belongs to LpK (X, A, µ).
If f ∈ LpK (X, A, µ), and α ∈ K, then the equalities
\[ \int_X |\alpha f|^p \, d\mu = \int_X |\alpha|^p \cdot |f|^p \, d\mu = |\alpha|^p \cdot \int_X |f|^p \, d\mu \]
clearly prove that αf also belongs to LpK (X, A, µ).
Our next task will be to prove that Qp is a seminorm, for all p > 1. In this
direction, the following is a key result. (The above mentioned convention will be
used throughout this entire section.)
Theorem 3.1 (Hölder’s Inequality for integrals). Let (X, A, µ) be a measure
space, let f, g : X → [0, ∞] be measurable functions, and let p, q ∈ (1, ∞) be such
that $\frac{1}{p} + \frac{1}{q} = 1$. Then one has the inequality¹⁵
\[ \int_X f g \, d\mu \le \Bigl( \int_X f^p \, d\mu \Bigr)^{1/p} \cdot \Bigl( \int_X g^q \, d\mu \Bigr)^{1/q}. \tag{1} \]
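As a sanity check, the special case p = q = 2 of the inequality is the Cauchy–Schwarz inequality. The concrete functions below (on [0, 1] with Lebesgue measure λ) are chosen by us purely for illustration.

```latex
% Special case p = q = 2 (Cauchy-Schwarz), for measurable f, g : X \to [0, \infty]:
\[
  \int_X f g \, d\mu \le \Bigl( \int_X f^2 \, d\mu \Bigr)^{1/2} \cdot \Bigl( \int_X g^2 \, d\mu \Bigr)^{1/2}.
\]
% Assumed example: X = [0, 1], \mu = \lambda, f(x) = x, g(x) = 1.
\[
  \int_{[0,1]} f g \, d\lambda = \frac{1}{2}, \qquad
  \Bigl( \int_{[0,1]} f^2 \, d\lambda \Bigr)^{1/2} \cdot \Bigl( \int_{[0,1]} g^2 \, d\lambda \Bigr)^{1/2}
  = \frac{1}{\sqrt{3}} \cdot 1 \approx 0.577 ,
\]
% and indeed 1/2 \le 1/\sqrt{3}.
```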
such that
• 0 ≤ ϕ1 ≤ ϕ2 ≤ . . . and 0 ≤ ψ1 ≤ ψ2 ≤ . . . ;
• $\lim_{n\to\infty} \varphi_n(x) = f(x)^p$ and $\lim_{n\to\infty} \psi_n(x) = g(x)^q$, ∀ x ∈ X.
By the General Lebesgue Monotone Convergence Theorem, we will also get the equalities
\[ \int_X f^p \, d\mu = \lim_{n\to\infty} \int_X \varphi_n \, d\mu \ \text{ and } \ \int_X g^q \, d\mu = \lim_{n\to\infty} \int_X \psi_n \, d\mu. \tag{2} \]
Remark that the functions $f_n = \varphi_n^{1/p}$, $g_n = \psi_n^{1/q}$, n ≥ 1 are also elementary (because
they obviously have finite range). It is obvious that we have
• 0 ≤ f1 ≤ f2 ≤ . . . , and 0 ≤ g1 ≤ g2 ≤ . . . ;
• limn→∞ fn (x) = f (x), and limn→∞ gn (x) = g(x), ∀ x ∈ X.
With these notations, the equalities (2) read
\[ \int_X f^p \, d\mu = \lim_{n\to\infty} \int_X (f_n)^p \, d\mu \ \text{ and } \ \int_X g^q \, d\mu = \lim_{n\to\infty} \int_X (g_n)^q \, d\mu. \tag{3} \]
Of course, the products fn gn , n ≥ 1 are again elementary, and satisfy
15 Here we use the convention $\infty^{1/p} = \infty^{1/q} = \infty$.
• 0 ≤ f1 g1 ≤ f2 g2 ≤ . . . ;
• limn→∞ [fn (x)gn (x)] = f (x)g(x), ∀ x ∈ X.
Using the General Lebesgue Monotone Convergence Theorem, we then get
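\[ \int_X f g \, d\mu = \lim_{n\to\infty} \int_X f_n g_n \, d\mu. \]
Using (3) we now see that, in order to prove (1), it suffices to prove the inequalities
\[ \int_X f_n g_n \, d\mu \le \Bigl( \int_X (f_n)^p \, d\mu \Bigr)^{1/p} \cdot \Bigl( \int_X (g_n)^q \, d\mu \Bigr)^{1/q}, \quad \forall\, n \ge 1. \]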
In other words, it suffices to prove (1), under the extra assumption that both f and
g are elementary integrable.
Suppose f and g are elementary integrable. Then (see III.1) there exist pairwise
disjoint sets $(D_j)_{j=1}^m \subset \mathcal{A}$, with µ(Dj ) < ∞, ∀ j = 1, . . . , m, and numbers
α1 , β1 , . . . , αm , βm ∈ [0, ∞), such that
\[ f = \alpha_1 \kappa_{D_1} + \cdots + \alpha_m \kappa_{D_m}, \qquad g = \beta_1 \kappa_{D_1} + \cdots + \beta_m \kappa_{D_m}. \]
Notice that we have
\[ f g = \alpha_1 \beta_1 \kappa_{D_1} + \cdots + \alpha_m \beta_m \kappa_{D_m}, \]
so the left hand side of (1) is then given by
\[ \int_X f g \, d\mu = \sum_{j=1}^m \alpha_j \beta_j \mu(D_j). \]
At this point we are going to use the Hölder inequality for finite sequences (Lemma
II.2.3), which gives
\[ \sum_{j=1}^m (x_j y_j) \le \left( \sum_{j=1}^m (x_j)^p \right)^{1/p} \cdot \left( \sum_{j=1}^m (y_j)^q \right)^{1/q}, \]
Corollary 3.1. Let (X, A, µ) be a measure space, let K be one of the fields
R or C, and let p, q ∈ (1, ∞) be such that 1/p + 1/q = 1. For any two functions
f ∈ 𝔏^p_K(X, A, µ) and g ∈ 𝔏^q_K(X, A, µ), the product fg belongs to 𝔏^1_K(X, A, µ) and
one has the inequality
\[ \left| \int_X fg \, d\mu \right| \le Q_p(f) \cdot Q_q(g). \]
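As a quick numerical sanity check of inequality (1) in the elementary case, one can model a measure space with finitely many atoms. The sketch below is only an illustration: the weights `mu` (playing the role of µ(D_j)), the coefficient lists `alpha` and `beta`, and the helper names are hypothetical choices, not part of the notes.

```python
# Hypothetical finite model: four pairwise disjoint sets D_j with mu[j] = mu(D_j).
def integral(values, mu):
    """Integral of the elementary function sum_j values[j] * kappa_{D_j}."""
    return sum(v * m for v, m in zip(values, mu))

def holder_sides(alpha, beta, mu, p):
    """Return (left side, right side) of Hoelder's inequality (1)."""
    q = p / (p - 1.0)  # conjugate exponent, so that 1/p + 1/q = 1
    lhs = integral([a * b for a, b in zip(alpha, beta)], mu)
    rhs = (integral([a ** p for a in alpha], mu) ** (1.0 / p)
           * integral([b ** q for b in beta], mu) ** (1.0 / q))
    return lhs, rhs

mu = [0.5, 2.0, 1.0, 0.25]      # assumed values of mu(D_j)
alpha = [1.0, 0.0, 3.0, 2.0]    # coefficients of f
beta = [2.0, 5.0, 0.5, 4.0]     # coefficients of g
lhs, rhs = holder_sides(alpha, beta, mu, p=3.0)
print(lhs, rhs)
```

In this model the left hand side is the finite sum of α_j β_j µ(D_j), exactly as in the reduction to elementary integrable functions used in the proof of Theorem 3.1.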
The following result gives an alternative description of the maps Qp , p ∈ (1, ∞).
Proposition 3.2. Let (X, A, µ) be a measure space, let K be one of the fields
R or C, let p, q ∈ (1, ∞) be such that 1/p + 1/q = 1, and let f ∈ 𝔏^p_K(X, A, µ). Then
one has the equality
\[ Q_p(f) = \sup \bigl\{ |\langle f, g \rangle| \,:\, g \in \mathfrak{L}^q_{\mathbb{K}}(X, \mathcal{A}, \mu),\ Q_q(g) \le 1 \bigr\}. \tag{5} \]
Proof. Let us denote the right hand side of (5) simply by P (f ). By Corollary
3.1, we clearly have the inequality
P (f ) ≤ Qp (f ).
To prove the other inequality, let us first observe that in the case when Qp (f ) = 0,
there is nothing to prove, because the above inequality already forces P (f ) = 0.
Assume then Q_p(f) > 0, and define the function h : X → K by
\[ h(x) = \begin{cases} \dfrac{|f(x)|^p}{f(x)} & \text{if } f(x) \ne 0 \\ 0 & \text{if } f(x) = 0 \end{cases} \]
It is obvious that h is measurable. Moreover, one has the equality |h| = |f|^{p−1},
which using the equality qp = p + q gives |h|^q = |f|^{qp−q} = |f|^p. This proves that
h ∈ 𝔏^q_K(X, A, µ), as well as the equality
\[ Q_q(h) = \left( \int_X |h|^q \, d\mu \right)^{1/q} = \left( \int_X |f|^p \, d\mu \right)^{1/q} = Q_p(f)^{p/q}. \]
If we define the number α = Q_p(f)^{−p/q}, then the function g = αh has Q_q(g) = 1,
so we get
\[ P(f) \ge \int_X fg \, d\mu = \frac{1}{Q_p(f)^{p/q}} \int_X fh \, d\mu. \]
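The construction of the extremal function g = αh can be checked numerically on a finite model. This is a minimal sketch for a real-valued f on three atoms; the weights `mu`, the values of `f`, and the helper names `q_norm` and `extremal_g` are assumptions made for illustration only.

```python
def q_norm(values, mu, r):
    """Q_r of the function taking values[j] on an atom of mass mu[j]."""
    return sum(abs(v) ** r * m for v, m in zip(values, mu)) ** (1.0 / r)

def extremal_g(f, mu, p):
    """Build g = alpha * h as in the proof; requires Q_p(f) > 0."""
    q = p / (p - 1.0)  # conjugate exponent
    # h = |f|^p / f where f != 0, and 0 elsewhere, so that f * h = |f|^p
    h = [abs(v) ** p / v if v != 0 else 0.0 for v in f]
    alpha = q_norm(f, mu, p) ** (-p / q)
    return [alpha * v for v in h], q

mu = [1.0, 0.5, 2.0]     # assumed masses of the atoms
f = [3.0, -4.0, 0.0]     # a real-valued function with Q_p(f) > 0
p = 4.0
g, q = extremal_g(f, mu, p)
pairing = sum(a * b * m for a, b, m in zip(f, g, mu))  # the pairing <f, g>
print(q_norm(g, mu, q), pairing, q_norm(f, mu, p))
```

Under these assumptions g has Q_q(g) = 1 and the pairing of f with g reproduces Q_p(f), which is the inequality P(f) ≥ Q_p(f) in this finite setting.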
Since the above inequality holds for all g ∈ LqK (X, A, µ), with Qq (g) ≤ 1, again by
Proposition 3.2, we get
Qp (f1 + f2 ) ≤ Qp (f1 ) + Qp (f2 ).
Property (b) is obvious.
Remarks 3.2. Let (X, A, µ) be a measure space, let K be one of the fields R
or C, and let p ∈ [1, ∞).
A. If f ∈ 𝔏^p_K(X, A, µ) and if g : X → K is a measurable function, with g = f,
µ-a.e., then g ∈ 𝔏^p_K(X, A, µ), and Q_p(g) = Q_p(f).
B. If we define the space
N_K(X, A, µ) = { f : X → K : f measurable, f = 0, µ-a.e. },
then N_K(X, A, µ) is a linear subspace of 𝔏^p_K(X, A, µ). In fact one has the equality
N_K(X, A, µ) = { f ∈ 𝔏^p_K(X, A, µ) : Q_p(f) = 0 }.
The inclusion “⊂” is trivial. Conversely, if f ∈ 𝔏^p_K(X, A, µ) has Q_p(f) = 0, then the
measurable function g : X → [0, ∞) defined by g = |f|^p will have ∫_X g dµ = 0. By
Exercise 2.3 this forces g = 0, µ-a.e., which clearly gives f = 0, µ-a.e.
Definition. Let (X, A, µ) be a measure space, let K be one of the fields R or
C, and let p ∈ [1, ∞). We define
L^p_K(X, A, µ) = 𝔏^p_K(X, A, µ)/N_K(X, A, µ).
In other words, L^p_K(X, A, µ) is the collection of equivalence classes associated with
the relation “=, µ-a.e.” For a function f ∈ 𝔏^p_K(X, A, µ) we denote by [f] its
equivalence class in L^p_K(X, A, µ). So the equality [f] = [g] is equivalent to f = g,
µ-a.e. By the above Remark, there exists a (unique) map ‖ . ‖_p : L^p_K(X, A, µ) →
[0, ∞), such that
‖[f]‖_p = Q_p(f), ∀ f ∈ 𝔏^p_K(X, A, µ).
By the above Remark, it follows that ‖ . ‖_p is a norm on L^p_K(X, A, µ). When K = C
the subscript C will be omitted.
Conventions. Let (X, A, µ), K, and p be as above. We are going to abuse the
notation a bit, by writing
f ∈ L^p_K(X, A, µ),
if f belongs to 𝔏^p_K(X, A, µ). (We will always have in mind the fact that this notation
signifies that f is almost uniquely determined.) Likewise, we are going to replace
Q_p(f) with ‖f‖_p.
Given p, q ∈ (1, ∞), with 1/p + 1/q = 1, we use the same notation for the (correctly
defined) map
⟨ . , . ⟩ : L^p_K(X, A, µ) × L^q_K(X, A, µ) → K.
Remark 3.3. Let (X, A, µ) be a measure space, let K be either R or C, and
let p, q ∈ (1, ∞) be such that 1/p + 1/q = 1. Given f ∈ L^p_K(X, A, µ), we define the map
\[ \Lambda_f : L^q_{\mathbb{K}}(X, \mathcal{A}, \mu) \ni g \longmapsto \langle f, g \rangle \in \mathbb{K}. \]
According to Proposition 3.2, the map Λ_f is linear, continuous, and has norm
‖Λ_f‖ = ‖f‖_p. If we denote by L^q_K(X, A, µ)^* the Banach space of all linear continuous
maps L^q_K(X, A, µ) → K, then we have a correspondence
\[ L^p_{\mathbb{K}}(X, \mathcal{A}, \mu) \ni f \longmapsto \Lambda_f \in L^q_{\mathbb{K}}(X, \mathcal{A}, \mu)^* \tag{6} \]
which is linear and isometric. This correspondence will be analyzed later in Section
5.
Notation. Given a sequence (f_n)_{n=1}^∞, and a function f, in L^p_K(X, A, µ), we
are going to write
\[ f = L^p\text{-}\lim_{n\to\infty} f_n, \]
if (f_n)_{n=1}^∞ converges to f in the norm topology, i.e. lim_{n→∞} ‖f_n − f‖_p = 0.
The following technical result is very useful in the study of Lp spaces.
Theorem 3.2 (L^p Dominated Convergence Theorem). Let (X, A, µ) be a measure
space, let K be one of the fields R or C, let p ∈ [1, ∞) and let (f_n)_{n=1}^∞ be a
sequence in L^p_K(X, A, µ). Assume f : X → K is a measurable function, such that
(i) f = µ-a.e.-lim_{n→∞} f_n;
(ii) there exists some function g ∈ L^p_K(X, A, µ), such that
|f_n| ≤ |g|, µ-a.e., ∀ n ≥ 1.
Then f ∈ L^p_K(X, A, µ), and one has the equality
\[ f = L^p\text{-}\lim_{n\to\infty} f_n. \]
• |ηn | ≤ η, µ-a.e., ∀ n ≥ 1;
• η ∈ L^1_+(X, A, µ).
Again using the Lebesgue Dominated Convergence Theorem, we get
\[ \lim_{n\to\infty} \int_X \eta_n \, d\mu = 0, \]
Our main goal is to prove that the Lp spaces are Banach spaces. The key result
which gives this, but also has some other interesting consequences, is the following.
Theorem 3.3. Let (X, A, µ) be a measure space, let K be one of the fields R
or C, let p ∈ [1, ∞) and let (f_k)_{k=1}^∞ be a sequence in L^p_K(X, A, µ), such that
\[ \sum_{k=1}^\infty \|f_k\|_p < \infty. \]
Consider the sequence (g_n)_{n=1}^∞ ⊂ L^p_K(X, A, µ) of partial sums:
\[ g_n = \sum_{k=1}^n f_k, \quad n \ge 1. \]
Since we have
\[ |g_n| = \Bigl| \sum_{k=1}^n f_k \Bigr| \le \sum_{k=1}^n |f_k| = h_n \le h, \quad \forall\, n \ge 1, \]
using the Claim, and Theorem 3.2, it follows that g indeed belongs to L^p_K(X, A, µ)
and we also have the equality g = L^p-lim_{n→∞} g_n.
Corollary 3.3. Let (X, A, µ) be a measure space, and let K be one of the
fields R or C. Then LpK (X, A, µ) is a Banach space, for each p ∈ [1, ∞).
Proof. This is immediate from the above result, combined with the complete-
ness criterion given by Remark II.3.1.
Another interesting consequence of Theorem 3.3 is the following.
Corollary 3.4. Let (X, A, µ) be a measure space, let K be one of the fields
R or C, let p ∈ [1, ∞), and let f ∈ L^p_K(X, A, µ). Any sequence (f_n)_{n=1}^∞ ⊂
L^p_K(X, A, µ), with f = L^p-lim_{n→∞} f_n, has a subsequence (f_{n_k})_{k=1}^∞ such that f =
µ-a.e.-lim_{k→∞} f_{n_k}.
Proof. Without any loss of generality, we can assume that f = 0, so that we
have
\[ \lim_{n\to\infty} \|f_n\|_p = 0. \]
Choose then integers 1 ≤ n_1 < n_2 < . . . , such that
\[ \|f_{n_k}\|_p \le \frac{1}{2^k}, \quad \forall\, k \ge 1. \]
If we define the functions
\[ g_m = \sum_{k=1}^m f_{n_k}, \]
then by Theorem 3.3, it follows that there exists some g ∈ L^p_K(X, A, µ), such that
\[ g = \mu\text{-a.e.-}\lim_{m\to\infty} g_m. \]
This means that there exists some N ∈ A, with µ(N) = 0, such that
\[ \lim_{m\to\infty} g_m(x) = g(x), \quad \forall\, x \in X \smallsetminus N. \]
In other words, for each x ∈ X ∖ N, the series Σ_{k=1}^∞ f_{n_k}(x) is convergent (to some
number g(x) ∈ K). In particular, it follows that
\[ \lim_{k\to\infty} f_{n_k}(x) = 0, \quad \forall\, x \in X \smallsetminus N, \]
then we obviously have g = h^r, so we get the fact that h belongs to L^r_K(X, A, µ).
Using part (i), we get the fact that 1 ∈ L^s_K(X, A, µ), so by Corollary 3.1, it follows
that h = 1 · h belongs to L^1_K(X, A, µ), and moreover, one has the inequality
\[ \int_X |f|^p \, d\mu = \int_X h \, d\mu \le \|1\|_s \cdot \|h\|_r = \left( \int_X 1 \, d\mu \right)^{1/s} \cdot \left( \int_X h^r \, d\mu \right)^{1/r} = \mu(X)^{1/s} \cdot \left( \int_X |f|^q \, d\mu \right)^{1/r} = \mu(X)^{1/s} \cdot \|f\|_q^{q/r}. \]
On the one hand, this inequality proves that f ∈ L^p_K(X, A, µ). On the other hand,
this also gives the inequality
\[ \|f\|_p^p \le \mu(X)^{1/s} \cdot \|f\|_q^{q/r} = \mu(X)^{1 - \frac{p}{q}} \cdot \|f\|_q^p, \]
which yields
\[ \|f\|_p \le \mu(X)^{\frac{1}{p} - \frac{1}{q}} \cdot \|f\|_q. \]
This proves that the linear map (8) is continuous (and has norm no greater than
µ(X)^{1/p − 1/q}).
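The norm estimate just obtained can be illustrated on a finite measure space with finitely many atoms. The following sketch assumes hypothetical weights `mu` with finite total mass and exponents p < q; the helper names `p_norm` and `inclusion_bound` are introduced only for this example.

```python
def p_norm(f, mu, p):
    """The L^p norm of the function taking f[j] on an atom of mass mu[j]."""
    return sum(abs(v) ** p * m for v, m in zip(f, mu)) ** (1.0 / p)

def inclusion_bound(f, mu, p, q):
    """Right hand side mu(X)^(1/p - 1/q) * ||f||_q, for p < q and mu(X) finite."""
    total = sum(mu)  # mu(X)
    return total ** (1.0 / p - 1.0 / q) * p_norm(f, mu, q)

mu = [0.2, 0.3, 1.5]     # assumed atom masses, mu(X) = 2
f = [5.0, -1.0, 2.0]
lhs = p_norm(f, mu, 2.0)
rhs = inclusion_bound(f, mu, 2.0, 3.0)

# For a constant function the estimate becomes an equality.
ones = [1.0, 1.0, 1.0]
const_lhs = p_norm(ones, mu, 2.0)
const_rhs = inclusion_bound(ones, mu, 2.0, 3.0)
print(lhs, rhs, const_lhs, const_rhs)
```

The constant function shows that, in this model, the constant µ(X)^{1/p − 1/q} cannot be improved.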
inclusion
𝔏^1_{K,elem}(X, A, µ) ⊂ 𝔏^p_K(X, A, µ), ∀ p ∈ [1, ∞),
so if we consider the quotient map
Π_p : 𝔏^p_K(X, A, µ) → L^p_K(X, A, µ),
we can also define the subspace
L^p_{K,elem}(X, A, µ) = Π_p(𝔏^1_{K,elem}(X, A, µ)), ∀ p ∈ [1, ∞).
Remark that, as vector spaces, the spaces L^p_{K,elem}(X, A, µ) are identical, since
Ker Π_p = N_K(X, A, µ), ∀ p ∈ [1, ∞).
With these notations we have the following fact.
Proposition 3.4. LpK,elem (X, A, µ) is dense in LpK (X, A, µ), for each p ∈
[1, ∞).
Proof. Fix p ∈ [1, ∞), and start with some f ∈ L^p_K(X, A, µ). What we
need to prove is the existence of a sequence (f_n)_{n=1}^∞ ⊂ L^1_{K,elem}(X, A, µ), such that
f = L^p-lim_{n→∞} f_n. Taking real and imaginary parts (in the case K = C), it
suffices to consider the case when f is real valued. Since |f| also belongs to L^p,
it follows that f^+ = max{f, 0} = ½(|f| + f), and f^− = max{−f, 0} = ½(|f| − f)
both belong to L^p, so in fact we can assume that f is non-negative. Consider the
function g = f^p ∈ L^1_+(X, A, µ). Use the definition of the integral, to find a sequence
(g_n)_{n=1}^∞ ⊂ L^1_{R,elem}(X, A, µ), such that
• 0 ≤ g_n ≤ g, ∀ n ≥ 1;
• lim_{n→∞} ∫_X g_n dµ = ∫_X g dµ.
Since ‖g − g_n‖_1 = ∫_X g dµ − ∫_X g_n dµ, this gives the fact that g = L^1-lim_{n→∞} g_n.
Using Corollary 3.4, after replacing (g_n)_{n=1}^∞ with a subsequence, we can also assume
that g = µ-a.e.-lim_{n→∞} g_n. If we put f_n = (g_n)^{1/p}, ∀ n ≥ 1, we now have
• 0 ≤ f_n ≤ f, ∀ n ≥ 1;
• f = µ-a.e.-lim_{n→∞} f_n.
Obviously, the fn ’s are still elementary integrable, and by the Lp Dominated Con-
vergence Theorem, we indeed get f = Lp - limn→∞ fn .
Comments. A. The above result gives us the fact that LpK (X, A, µ) is the com-
pletion of LpK,elem (X, A, µ). This allows for the following alternative construction
of the Lp spaces.
B. For a measurable function f : X → K, by the (proof of the) above result,
it follows that the condition f ∈ L^p_K(X, A, µ) is equivalent to the equality f =
µ-a.e.-lim_{n→∞} f_n, for some sequence (f_n)_{n=1}^∞ of elementary integrable functions,
which is Cauchy in the L^p norm, i.e.
(c) for every ε > 0, there exists N_ε, such that
‖f_m − f_n‖_p < ε, ∀ m, n ≥ N_ε.
One key feature, which will be heavily exploited in the next section, deals with
the Banach space p = 2, for which we have the following.
Proposition 3.5. Let (X, A, µ) be a measure space, and let K be one of the
fields R or C.
(i) The map ( . | . ) : L^2_K(X, A, µ) × L^2_K(X, A, µ) → K, given by
\[ ( f \,|\, g ) = \langle \bar{f}, g \rangle = \int_X \bar{f} g \, d\mu, \quad \forall\, f, g \in L^2_{\mathbb{K}}(X, \mathcal{A}, \mu), \]
(i) Whenever g ∈ L^1_K(X, A, µ), it follows that the function fg also belongs to
L^1_K(X, A, µ), and one has the inequality
‖fg‖_1 ≤ M · ‖g‖_1.
(ii) The map
\[ \Lambda_f : L^1_{\mathbb{K}}(X, \mathcal{A}, \mu) \ni g \longmapsto \int_X fg \, d\mu \in \mathbb{K} \]
is linear and continuous. Moreover, one has the inequality ‖Λ_f‖ ≤ M.
Remark 3.5. If we apply the above Exercise to the constant function f = 1,
we get the (already known) fact that the integration map
\[ \Lambda_1 : L^1_{\mathbb{K}}(X, \mathcal{A}, \mu) \ni g \longmapsto \int_X g \, d\mu \in \mathbb{K} \tag{9} \]
is linear and continuous, and has norm ‖Λ_1‖ ≤ 1. The following exercise gives the
exact value of the norm.
Exercise 5. With the notations above, prove that the following are equivalent:
(i) the measure space (X, A, µ) is non-degenerate, i.e. there exists A ∈ A
with 0 < µ(A) < ∞;
(ii) L^1_K(X, A, µ) ≠ {0};
(iii) the integration map (9) has norm ‖Λ_1‖ = 1.
Lectures 36-37
4. Radon-Nikodym Theorems
In this section we discuss a very important property which has many important
applications.
Definition. Let X be a non-empty set, and let A be a σ-algebra on X. Given
two measures µ and ν on A, we say that ν has the Radon-Nikodym property relative
to µ, if there exists a measurable function f : X → [0, ∞], such that
\[ \nu(A) = \int_A f \, d\mu, \quad \forall\, A \in \mathcal{A}. \tag{1} \]
Here we use the convention which defines the integral in the right hand side by
\[ \int_A f \, d\mu = \begin{cases} \int_X f \kappa_A \, d\mu & \text{if } f\kappa_A \in L^1_+(X, \mathcal{A}, \mu) \\ \infty & \text{if } f\kappa_A \notin L^1_+(X, \mathcal{A}, \mu) \end{cases} \]
In this case, we say that f is a density for ν relative to µ.
The Radon-Nikodym property has an equivalent useful formulation.
Proposition 4.1 (Change of Variables). Let X be a non-empty set, and let
A be a σ-algebra on X, let µ and ν be measures on A, and let f : X → [0, ∞] be a
measurable function.
A. The following are equivalent
(i) ν has the Radon-Nikodym property relative to µ, and f is a density for ν
relative to µ;
(ii) for every measurable function h : X → [0, ∞], one has the equality¹⁶
\[ \int_X h \, d\nu = \int_X hf \, d\mu. \tag{2} \]
B. If ν and f are as above, and K is either R or C, then the equality (2)
also holds for those measurable functions h : X → K with h ∈ L1K (X, A, ν) and
hf ∈ L1K (X, A, µ).
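On a finite set with every subset measurable, both the Radon-Nikodym property (1) and the change of variables formula (2) reduce to finite sums, which makes them easy to check by hand or by machine. The sketch below assumes a hypothetical three-point space; the dictionaries `mu`, `f`, `h` and the function names are illustrative choices only.

```python
mu = {"a": 0.5, "b": 2.0, "c": 1.0}   # point masses of mu (assumed values)
f = {"a": 4.0, "b": 0.0, "c": 1.5}    # a density of nu relative to mu

def nu(A):
    """nu(A) = integral over A of f d(mu), as in formula (1)."""
    return sum(f[x] * mu[x] for x in A)

def integral_dnu(h):
    """Integral of h with respect to nu, computed from the point masses of nu."""
    return sum(h[x] * nu([x]) for x in mu)

def integral_hf_dmu(h):
    """Integral of h*f with respect to mu, the right hand side of (2)."""
    return sum(h[x] * f[x] * mu[x] for x in mu)

h = {"a": 1.0, "b": 7.0, "c": 2.0}    # a non-negative test function
print(integral_dnu(h), integral_hf_dmu(h))
```

Note that the point "b" has positive µ-mass but zero ν-mass, so the value of h there does not affect either side.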
Proof. A. (i) ⇒ (ii). Assume property (i) holds, which means that we have
(1). Fix a measurable function h : X → [0, ∞], and use Theorem III.3.2, to find a
sequence (h_n)_{n=1}^∞ ⊂ A-Elem_R(X), with
(a) 0 ≤ h_1 ≤ h_2 ≤ · · · ≤ h;
(b) lim_{n→∞} h_n(x) = h(x), ∀ x ∈ X.
Of course, we also have
(a′) 0 ≤ h_1 f ≤ h_2 f ≤ · · · ≤ hf;
16 For the product hf we use the conventions 0 · ∞ = ∞ · 0 = 0, and t · ∞ = ∞ · t = ∞,
∀ t ∈ (0, ∞].
We need to prove that B is locally µ-null, i.e. one has µ(B ∩ F) = 0, for all F ∈ A
with µ(F) < ∞. Fix F ∈ A with µ(F) < ∞, and let us write B ∩ F = D ∪ E,
where
D = {x ∈ B ∩ F : f(x) < g(x)} and E = {x ∈ B ∩ F : f(x) > g(x)}.
If we define, for each integer n ≥ 1, the sets
D_n = {x ∈ B ∩ F : f(x) + 1/n ≤ g(x)} and E_n = {x ∈ B ∩ F : f(x) ≥ g(x) + 1/n},
the measure µ on A, by µ(∅) = 0 and µ(X) = ∞. It is clear that µ has the Radon-
Nikodym property relative to itself, but as densities one can choose for instance the
constant functions f = 1 and g = 2. Clearly, the equality f = g, µ-a.e. is not true.
Remark 4.1. The local almost uniqueness result, given in Proposition 4.2,
holds under slightly weaker assumptions. Namely, if (X, A, µ) is a measure space,
and if f, g : X → [0, ∞] are measurable functions for which we have the equality
\[ \int_A f \, d\mu = \int_A g \, d\mu, \]
for all A ∈ A with µ(A) < ∞, then we still have the equality f = g, µ-l.a.e.
This follows actually from Proposition 4.2, applied to functions of the form f|_A and g|_A.
Let us introduce the following.
Notations. For a measure space (X, A, µ) we define
A^µ_0 = {N ∈ A : µ(N) = 0};
A^µ_fin = {F ∈ A : µ(F) < ∞};
A^µ_{0,loc} = {A ∈ A : µ(A ∩ F) = 0, ∀ F ∈ A^µ_fin}.
With these notations, we have the inclusions
A^µ_0 = A^µ_{0,loc} ∩ A^µ_fin ⊂ A^µ_{0,loc} ⊂ A,
and A^µ_0 and A^µ_{0,loc} are in fact σ-rings.
Comment. The “locally-almost everywhere” terminology is actually designed
to “hide some pathologies under the rug.” For instance, if (X, A, µ) is a degenerate
measure space, i.e. µ(A) ∈ {0, ∞}, ∀ A ∈ A, then “anything happens locally
almost-everywhere,” which means that we have the equality A^µ_{0,loc} = A.
At the other end, there is a particular type of measure spaces on which, even in
the absence of σ-finiteness, the notions of “locally-almost everywhere” and “almost
everywhere” coincide, i.e. we have the equality A^µ_{0,loc} = A^µ_0. Such spaces are
described by the following.
Definition. A measure space (X, A, µ) is said to be nowhere degenerate, or
with finite subset property, if
(f) for every set A ∈ A with µ(A) > 0, there exists some set F ∈ A, with
F ⊂ A, and 0 < µ(F ) < ∞.
With this terminology, one has the following result.
Proposition 4.3. For a measure space (X, A, µ), the following are equivalent:
(i) A^µ_{0,loc} = A^µ_0;
(ii) (X, A, µ) has the finite subset property.
Proof. (i) ⇒ (ii). Assume A^µ_{0,loc} = A^µ_0, and let us prove that (X, A, µ) has
the finite subset property. We argue by contradiction, so let us assume there exists
some set A ∈ A, with µ(A) = ∞, such that µ(B) ∈ {0, ∞}, for every B ∈ A, with
B ⊂ A. In particular, if we start with some arbitrary F ∈ A^µ_fin, using the fact
that µ(A ∩ F) ≤ µ(F) < ∞, we see that we must have µ(A ∩ F) = 0. This proves
precisely that A ∈ A^µ_{0,loc}. By assumption, it follows that A ∈ A^µ_0, i.e. µ(A) = 0,
which is impossible.
(ii) ⇒ (i). Assume that (X, A, µ) has the finite subset property, and let us
prove the equality (i). Since one inclusion is always true, all we need to prove is
the inclusion A^µ_{0,loc} ⊂ A^µ_0, which is equivalent to the inclusion A^µ_{0,loc} ⊂ A^µ_fin. Start
with some set A ∈ A^µ_{0,loc}, but assume µ(A) = ∞. On the one hand, using the finite
subset property, there exists some set F ∈ A with F ⊂ A and 0 < µ(F) < ∞. On the
other hand, since A ∈ A^µ_{0,loc} and F ∈ A^µ_fin, we have µ(F) = µ(A ∩ F) = 0, which is impossible.
Example 4.2. Let X be an uncountable set, let A = P(X), and let µ be the
counting measure, i.e.
\[ \mu(A) = \begin{cases} \operatorname{Card} A & \text{if } A \text{ is finite} \\ \infty & \text{if } A \text{ is infinite} \end{cases} \]
Then (X, P(X), µ) has the finite subset property, but is not σ-finite.
When we restrict to integrable functions, the two notions µ-l.a.e, and µ-a.e.
coincide. More precisely, we have the following.
Proposition 4.4. Let (X, A, µ) be a measure space, let K be one of the fields
R or C, and let p ∈ [1, ∞). For a function f ∈ LpK (X, A, µ), the following are
equivalent:
(i) f = 0, µ-l.a.e.
(ii) f = 0, µ-a.e.
Proof. Of course, we only need to prove the implication (i) ⇒ (ii). Assume
f = 0, µ-l.a.e. Using the function g = |f|^p, we can assume that p = 1 and f(x) ≥ 0,
∀ x ∈ X. Consider then the set N = {x ∈ X : f(x) > 0}, and write it as a union
N = ⋃_{n=1}^∞ N_n, where
N_n = {x ∈ X : f(x) ≥ 1/n}, ∀ n ≥ 1.
Of course, all we need is the fact that µ(N_n) = 0, ∀ n ≥ 1. Fix n ≥ 1. On the one
hand, by the assumption on f, it follows that N_n ∈ A^µ_{0,loc}. On the other hand, the
inequality (1/n)κ_{N_n} ≤ f forces the elementary function (1/n)κ_{N_n} to be µ-integrable, i.e.
µ(N_n) < ∞. Consequently we have
N_n ∈ A^µ_{0,loc} ∩ A^µ_fin = A^µ_0.
Comment. In what follows we will discuss several results, which all have as
conclusion the fact that one measure has the Radon-Nikodym property with respect
to another one. All such results will be called “Radon-Nikodym Theorems.”
The first result is in fact quite general, in the sense that it works for finite
signed or complex measures.
Theorem 4.1 (“Easy” Radon-Nikodym Theorem). Let (X, A, µ) be a finite
measure space, let K denote one of the fields R or C, and let C > 0 be some
constant. Suppose ν is a K-valued measure on A, such that
|ν(A)| ≤ Cµ(A), ∀ A ∈ A.
Then there exists some function f ∈ L^1_K(X, A, µ), such that
\[ \nu(A) = \int_A f \, d\mu, \quad \forall\, A \in \mathcal{A}. \tag{5} \]
Moreover:
(i) Any function f ∈ L^1_K(X, A, µ), satisfying (5) has the property |f| ≤ C,
µ-a.e. If ν is an “honest” measure, then one also has the inequality f ≥ 0,
µ-a.e.
(ii) A function satisfying (5) is essentially unique, in the sense that, whenever
f_1, f_2 ∈ L^1_K(X, A, µ) satisfy (5), it follows that f_1 = f_2, µ-a.e.
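For intuition, the statement can be tested on a finite set where every point has positive µ-mass: there the density is forced to be the ratio of point masses. The sketch below is an illustration under these assumptions; the lists `mu` and `nu`, the constant `C`, and the helper names are hypothetical.

```python
from itertools import combinations

mu = [0.5, 2.0, 1.0]     # point masses of a finite measure mu (assumed values)
nu = [1.0, -3.0, 0.0]    # point values of a real-valued measure nu
C = 2.0

def measure(weights, A):
    """Value of the (signed) measure with the given point weights on the set A."""
    return sum(weights[x] for x in A)

points = range(len(mu))
subsets = [A for r in range(len(mu) + 1) for A in combinations(points, r)]

# Hypothesis of the theorem: |nu(A)| <= C * mu(A) for every subset A.
hypothesis_holds = all(abs(measure(nu, A)) <= C * measure(mu, A) for A in subsets)

# Candidate density: ratio of point masses (possible since each mu[x] > 0).
f = [n / m for n, m in zip(nu, mu)]

def integral_over(A):
    """Integral of f over A with respect to mu."""
    return sum(f[x] * mu[x] for x in A)

density_ok = all(abs(measure(nu, A) - integral_over(A)) < 1e-12 for A in subsets)
bounded = all(abs(v) <= C for v in f)
print(hypothesis_holds, density_ok, bounded)
```

Here the bound |f| ≤ C is attained at the first point, so the constant in property (i) is sharp for this model.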
Proof. The idea is to somehow make sense of ∫_X h dν, for suitable measurable
functions h, and to examine the properties of such a number relative to the integral
∫_X h dµ. The second integral is of course defined, for instance for h ∈ L^1_K(X, A, µ),
but the first integral is not, because ν is not an “honest” measure. The proof will
be carried out in several steps.
Step 1: There exist four “honest” finite measures ν_k, k = 1, 2, 3, 4, and numbers
α_k, k = 1, 2, 3, 4, such that ν = α_1 ν_1 + α_2 ν_2 + α_3 ν_3 + α_4 ν_4, and
(6) ν_k ≤ Cµ, ∀ k = 1, 2, 3, 4.
In the case K = R we use the Hahn-Jordan decomposition ν = ν^+ − ν^−. We also
know that ν^± ≤ |ν|, the variation measure of ν. In this case we take α_1 = 1,
ν_1 = ν^+, α_2 = −1, ν_2 = ν^−, and we set ν_3 = ν_4 = 0, α_3 = α_4 = 0.
In the case K = C, we write ν = η + iλ, with η and λ finite signed measures,
and we use the Hahn-Jordan decompositions η = η^+ − η^− and λ = λ^+ − λ^−. We
also know that the variation measures of η and λ satisfy |η| ≤ |ν| and |λ| ≤ |ν|, so
we also have η^± ≤ |ν| and λ^± ≤ |ν|. In this case we can then take α_1 = 1, ν_1 = η^+,
α_2 = −1, ν_2 = η^−, α_3 = i, ν_3 = λ^+, α_4 = −i, ν_4 = λ^−.
Notice that in either case we have
ν_k ≤ |ν|, ∀ k = 1, 2, 3, 4.
By Remark III.8.5 it follows that we have |ν| ≤ Cµ, so we immediately get the
inequalities (6).
Step 2: For any measurable function h : X → [0, ∞], one has the inequality
\[ \int_X h \, d\nu_k \le C \int_X h \, d\mu, \quad \forall\, k = 1, 2, 3, 4. \tag{7} \]
(Here we use the abusive notation that identifies an element in L^1 with a function
in 𝔏^1, which is defined almost uniquely.) Moreover, one has the inequality
\[ \int_X |h| \, d\nu_k \le C \int_X |h| \, d\mu, \quad \forall\, h \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu),\ k = 1, 2, 3, 4, \]
in other words, the linear maps (8) are all continuous. For every k = 1, 2, 3, 4, let
φ_k denote the integration map
\[ \varphi_k : L^1_{\mathbb{K}}(X, \mathcal{A}, \nu_k) \ni h \longmapsto \int_X h \, d\nu_k \in \mathbb{K}. \]
We know (see Remark 3.5) that the φ_k’s are continuous. In particular, the compositions
ψ_k = φ_k ◦ Φ_k : L^1_K(X, A, µ) → K, which are defined by
\[ \psi_k : L^1_{\mathbb{K}}(X, \mathcal{A}, \mu) \ni h \longmapsto \int_X h \, d\nu_k, \quad k = 1, 2, 3, 4, \]
are linear and continuous.
We now use Proposition 3.3 which states that one has an inclusion
\[ \Theta : L^2_{\mathbb{K}}(X, \mathcal{A}, \mu) \hookrightarrow L^1_{\mathbb{K}}(X, \mathcal{A}, \mu), \tag{9} \]
which is in fact a linear continuous map. So if we consider the compositions θ_k =
ψ_k ◦ Θ, which are defined by
\[ \theta_k : L^2_{\mathbb{K}}(X, \mathcal{A}, \mu) \ni h \longmapsto \int_X h \, d\nu_k, \quad k = 1, 2, 3, 4, \]
then these compositions are linear and continuous. Apply then Riesz Theorem (in
the form given in Remark 3.4), to find functions f_1, f_2, f_3, f_4 ∈ L^2_K(X, A, µ), such
that
θ_k(h) = ⟨f_k, h⟩, ∀ h ∈ L^2_K(X, A, µ), k = 1, 2, 3, 4.
In particular, using functions of the form h = κ_A, A ∈ A (which all belong to
L^2_K(X, A, µ), due to the finiteness of µ), we get
\[ \nu_k(A) = \int_X \kappa_A \, d\nu_k = \int_X f_k \kappa_A \, d\mu, \quad \forall\, A \in \mathcal{A},\ k = 1, 2, 3, 4. \]
so we immediately have the equality A = ⋃_{α ∈ S^1_Q} A_α, where
A_α = {x ∈ X : Re[αf(x)] > C}.
Since S^1_Q is countable, in order to prove that µ(A) = 0, it then suffices to show that
µ(A_α) = 0, ∀ α ∈ S^1_Q. Fix then α ∈ S^1_Q, and consider the K-valued measure η = αν.
It is clear that we still have
(11) |η(A)| = |ν(A)| ≤ Cµ(A), ∀ A ∈ A,
as well as the equality
\[ \eta(A) = \int_A \alpha f \, d\mu, \quad \forall\, A \in \mathcal{A}. \tag{12} \]
For each integer n ≥ 1, let us define the set
A^n_α = {x ∈ X : Re[αf(x)] ≥ C + 1/n},
so that we obviously have the equality A_α = ⋃_{n=1}^∞ A^n_α. In particular, in order to
prove µ(A_α) = 0, it suffices to prove that µ(A^n_α) = 0, ∀ n ≥ 1. Fix for the moment
n ≥ 1. Using (12), it follows that
\[ \operatorname{Re} \eta(A^n_\alpha) = \operatorname{Re} \int_{A^n_\alpha} \alpha f \, d\mu = \int_{A^n_\alpha} \operatorname{Re}[\alpha f] \, d\mu = \int_X \operatorname{Re}[\alpha f] \, \kappa_{A^n_\alpha} \, d\mu. \]
Since we have Re[αf]κ_{A^n_α} ≥ (C + 1/n)κ_{A^n_α}, the above equality can be continued
with
\[ \operatorname{Re} \eta(A^n_\alpha) \ge \int_X \bigl(C + \tfrac{1}{n}\bigr) \kappa_{A^n_\alpha} \, d\mu = \bigl(C + \tfrac{1}{n}\bigr) \mu(A^n_\alpha). \]
Of course, this will give
|η(A^n_α)| ≥ Re η(A^n_α) ≥ (C + 1/n)µ(A^n_α).
Note now that, using (11), this will finally give
Cµ(A^n_α) ≥ (C + 1/n)µ(A^n_α),
which clearly forces µ(A^n_α) = 0.
Having proven that |f| ≤ C, µ-a.e., let us turn our attention now to the uniqueness
property (ii). Suppose f_1, f_2 ∈ L^1_K(X, A, µ) are such that
\[ \nu(A) = \int_A f_1 \, d\mu = \int_A f_2 \, d\mu, \quad \forall\, A \in \mathcal{A}. \]
Consider then the difference f = f_1 − f_2 and the trivial measure ν_0 = 0. Obviously
we have
|ν_0(A)| ≤ (1/n)µ(A), ∀ A ∈ A,
for every integer n ≥ 1, as well as
\[ \nu_0(A) = \int_A f \, d\mu, \quad \forall\, A \in \mathcal{A}. \]
By the first assertion in (i), it follows that
|f_1 − f_2| = |f| ≤ 1/n, µ-a.e.,
for every n ≥ 1. So if we take the sets (N_n)_{n=1}^∞ ⊂ A defined by
N_n = {x ∈ X : |f_1(x) − f_2(x)| > 1/n},
then µ(N_n) = 0, ∀ n ≥ 1. Of course, if we put N = ⋃_{n=1}^∞ N_n, then on the one hand
we have µ(N) = 0, and on the other hand, we have
f_1(x) − f_2(x) = 0, ∀ x ∈ X ∖ N,
which means that we indeed have f_1 = f_2, µ-a.e.
Finally, let us prove the second assertion in (i), which starts with the assumption
that ν is an “honest” measure. Let f ∈ L1K (X, A, µ) satisfy (5). By the uniqueness
property (ii), it follows immediately that
f = Re f, µ-a.e.,
so we can assume that f is already real-valued. Consider the “honest” measure
ω = Cµ − ν, and notice that the function g : X → R defined by
g(x) = C − f (x), ∀ x ∈ X,
clearly has the property
Z
ω(A) = g dµ, ∀ A ∈ A.
A
Since we obviously have
0 ≤ ω(A) ≤ Cµ(A), ∀ A ∈ A,
by the first assertion of (i), applied to the measure ω and the function g, it follows
that |g| ≤ C, µ-a.e. In other words, we have now a combined inequality:
max{|f|, |C − f|} ≤ C, µ-a.e.
Of course, since f is real valued, this forces f ≥ 0, µ-a.e.
(Recall that the notation D ⊂_µ E stands for µ(D ∖ E) = 0.) Since ν ≪ µ, we also
have the relations
A ∩ X^+_1 ⊂_ν A ∩ X^+_2 ⊂_ν . . . ,
so using Proposition III.4.3, one gets the equality
\[ \nu(A \cap X^+_\infty) = \lim_{n\to\infty} \nu(A \cap X^+_n). \]
Combining this with the inequalities (15) and (17) then yields the inequality
\[ \nu(A) \ge \limsup_{n\to\infty} \nu_n(A) \ge \liminf_{n\to\infty} \nu_n(A) \ge \nu(A \cap X^+_\infty) + \lim_{n\to\infty} \bigl[ n \mu(A \cap X^-_\infty) \bigr]. \tag{18} \]
n→∞ n→∞ n→∞
Having shown that f satisfies (19), let us observe that the uniqueness property
stated in part A is a consequence of Proposition 4.2.
B. Let λ be a K-valued measure on A, with λ ≪ µ. In particular, the variation measure |λ| is finite, so by
the Polar Decomposition (Proposition 4.3) there exists some measurable function
h : X → K, such that
\[ \lambda(A) = \int_A h \, d|\lambda|, \quad \forall\, A \in \mathcal{A}, \tag{21} \]
and such that |h| = 1, |λ|-a.e. Replacing h with the measurable function h′ : X →
K, defined by
\[ h'(x) = \begin{cases} h(x) & \text{if } |h(x)| = 1 \\ 1 & \text{if } |h(x)| \ne 1 \end{cases} \]
we can assume that in fact we have
|h(x)| = 1, ∀ x ∈ X.
Apply then part A, to the measure |λ|, which is again absolutely continuous with
respect to µ, to find some measurable function g : X → [0, ∞], such that
\[ |\lambda|(A) = \int_A g \, d\mu, \quad \forall\, A \in \mathcal{A}. \]
Remark that, since
\[ \int_X g \, d\mu = |\lambda|(X) < \infty, \]
it follows that g ∈ L^1_+(X, A, µ). Fix for the moment some set A ∈ A. On the one
hand, since
(22) |hκ_A| ≤ 1,
and |λ| is finite, it follows that hκ_A ∈ L^1_K(X, A, |λ|). On the other hand, since
g ∈ L^1_+(X, A, µ), using (22) we get the fact that hκ_A g ∈ L^1_K(X, A, µ). Using the
Change of Variables formula (Proposition 4.1) we then get the equality
\[ \int_X h \kappa_A \, d|\lambda| = \int_X h \kappa_A g \, d\mu, \]
which by (21) reads:
\[ \lambda(A) = \int_A hg \, d\mu. \]
Now the function f_0 = hg (which has |f_0| = g) belongs to L^1_K(X, A, µ), and clearly
satisfies (20).
To prove the uniqueness property (i), we start with two functions f_1, f_2 ∈
L^1_K(X, A, µ) which satisfy
\[ \int_A f_1 \, d\mu = \int_A f_2 \, d\mu = \lambda(A), \quad \forall\, A \in \mathcal{A}. \]
At this point we would like to go further, beyond the finite case. The following
generalization of Theorem 4.2 is pretty straightforward.
Corollary 4.1 (Radon-Nikodym Theorem: the σ-finite case). Let (X, A, µ)
be a σ-finite measure space.
A. If ν is an “honest” measure on A, with ν ≪ µ, then there exists a measurable
function f : X → [0, ∞], such that
\[ \nu(A) = \int_A f \, d\mu, \quad \forall\, A \in \mathcal{A}. \tag{23} \]
is the fact that, if K denotes one of the sets [0, ∞], R or C, then for a function
f : X → K the fact that f is measurable, is equivalent to the fact that f|_{X_n} is
measurable for each n ≥ 1. Moreover, given two functions f_1, f_2 : X → K, the
condition f_1 = f_2, µ-a.e. is equivalent to the fact that f_1|_{X_n} = f_2|_{X_n}, µ-a.e., ∀ n ≥ 1.
Finally, for f : X → K (= R, C), the condition f ∈ L^1_K(X, A, µ), is equivalent to the
fact that f|_{X_n} ∈ L^1_K(X_n, A|_{X_n}, µ|_{X_n}), ∀ n ≥ 1, and
\[ \sum_{n=1}^\infty \int_{X_n} \bigl| f|_{X_n} \bigr| \, d\mu|_{X_n} < \infty. \]
(p) Given a measurable space (Y, B), a function f : (X, A) → (Y, B) is measurable,
if and only if all restrictions f|_F : (F, A|_F) → (Y, B), F ∈ F, are
measurable.
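Theorem 4.3 (Radon-Nikodym Theorem: the decomposable case). Let (X, A, µ)
be a decomposable measure space. Let A^µ_{σ-fin} be the collection of all µ-σ-finite sets
in A, that is,
\[ \mathcal{A}^\mu_{\sigma\text{-fin}} = \Bigl\{ A \in \mathcal{A} : \text{there exists } (A_n)_{n=1}^\infty \subset \mathcal{A}^\mu_{\text{fin}}, \text{ with } A = \bigcup_{n=1}^\infty A_n \Bigr\}. \]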
Using the patching property, there exists a measurable function f : X → [0, ∞],
such that f|_F = f_F, ∀ F ∈ F. The key feature we are going to prove is a particular
case of (25).
Claim 1: ν(A) = ∫_A f dµ, ∀ A ∈ A^µ_fin.
is at most countable. We then form the set Ã = ⋃_{F ∈ F(A)} [A ∩ F], which is clearly a
subset of A. The difference D = A ∖ Ã has again µ(D) < ∞, so its measure is also
given as
\[ \mu(D) = \sum_{F \in \mathcal{F}} \mu(D \cap F). \]
Notice however that we have µ(D ∩ F) = 0, ∀ F ∈ F. (If F ∈ F(A), we already have
D ∩ F = ∅, whereas if F ∈ F ∖ F(A), we have D ∩ F ⊂ A ∩ F, with µ(A ∩ F) = 0.)
Using then the above equality, we get µ(D) = 0. By absolute continuity we also
get ν(D) = 0. Using the equality A = Ã ∪ D, and σ-additivity (it is essential here
that F(A) is countable), it follows that
\[ \nu(A) = \nu(\tilde{A}) = \sum_{F \in \mathcal{F}(A)} \nu(A \cap F). \]
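where
\[ G_n = \bigcup_{k=1}^n [A \cap F_k], \quad \forall\, n \ge 1. \]
It is clear that we have
• fκ_{G_1} ≤ fκ_{G_2} ≤ . . . ,
• lim_{n→∞} (fκ_{G_n})(x) = (fκ_Ã)(x), ∀ x ∈ X,
so using the Monotone Convergence Theorem, it follows that
\[ \lim_{n\to\infty} \int_X f \kappa_{G_n} \, d\mu = \int_X f \kappa_{\tilde{A}} \, d\mu = \int_{\tilde{A}} f \, d\mu. \]
Using (27) we then get
\[ \nu(A) = \lim_{n\to\infty} \int_X f \kappa_{G_n} \, d\mu = \int_{\tilde{A}} f \, d\mu. \]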
so that we still have B_n ∈ A^µ_fin, ∀ n ≥ 1, as well as A = ⋃_{n=1}^∞ B_n, but moreover we
have B_1 ⊂ B_2 ⊂ . . . . For each n ≥ 1, using Claim 1, we have the equality
\[ \nu(B_n) = \int_{B_n} f \, d\mu. \]
Using part A, there exists some measurable function g_0 : X → [0, ∞], such that
\[ |\lambda|(A) = \int_A g_0 \, d\mu, \quad \forall\, A \in \mathcal{A}^\mu_{\sigma\text{-fin}}. \tag{28} \]
At this point, g_0 may not be integrable, but we have the freedom to perturb it
(µ-l.a.e.) to try to make it integrable. This is done as follows. Consider the collection
F_0 = {F ∈ F : |λ|(F) > 0}.
Since |λ| is finite, it follows that F_0 is at most countable. Define then the set
X_0 = ⋃_{F ∈ F_0} F ∈ A^µ_{σ-fin}. Since X_0 is µ-σ-finite, every set A ∈ A with A ⊂ X_0, is
µ-σ-finite, so we have
\[ |\lambda|(A) = \int_A g_0 \, d\mu, \quad \forall\, A \in \mathcal{A}|_{X_0}. \]
Applying the σ-finite version of the Radon-Nikodym Theorem to the σ-finite measure
space (X_0, A|_{X_0}, µ|_{X_0}) and the finite measure λ|_{X_0}, it follows that the density
belongs to L^1_+(X, A, µ). With this choice of g, let us prove now that the equality
(28) still holds, with g in place of g_0. Exactly as in the proof of part A, it suffices
to prove only the equality
\[ |\lambda|(A) = \int_A g \, d\mu, \quad \forall\, A \in \mathcal{A}^\mu_{\text{fin}}. \tag{29} \]
|λ|(A ∖ X_0) = 0, ∀ A ∈ A^µ_{σ-fin},
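it suffices to prove it only for A ∈ A^µ_fin. If A ∈ A^µ_fin, using the properties of the
decomposition F, we have
\[ |\lambda|(A) = \sum_{F \in \mathcal{F}} |\lambda|(A \cap F) = \sum_{F \in \mathcal{F}_0} |\lambda|(A \cap F) + \sum_{F \in \mathcal{F} \smallsetminus \mathcal{F}_0} |\lambda|(A \cap F) = |\lambda|\Bigl( \bigcup_{F \in \mathcal{F}_0} [A \cap F] \Bigr) + \sum_{F \in \mathcal{F} \smallsetminus \mathcal{F}_0} |\lambda|(A \cap F) = |\lambda|(A \cap X_0) + \sum_{F \in \mathcal{F}'(A)} |\lambda|(A \cap F). \]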
Define now the function f_0 = hg. Since |f_0| = g ∈ L^1_+(X, A, µ), it follows that
f_0 ∈ L^1_K(X, A, µ). Let us prove that f_0 satisfies the equality (26). Start with some
A ∈ A^µ_{σ-fin}. On the one hand, using Claim 2, we have
|λ(A ∖ X_0)| ≤ |λ|(A ∖ X_0) = 0,
so we get λ(A) = λ(A ∩ X_0). Using the σ-finite version of the Radon-Nikodym
Theorem for (X_0, A|_{X_0}, µ|_{X_0}) and λ|_{X_0}, we then have
\[ \lambda(A) = \lambda(A \cap X_0) = \int_{A \cap X_0} h g_0 \, d\mu = \int_X h g_0 \kappa_{A \cap X_0} \, d\mu = \int_X h g_0 \kappa_{X_0} \kappa_A \, d\mu = \int_X h g \kappa_A \, d\mu = \int_A hg \, d\mu = \int_A f_0 \, d\mu. \]
We now prove the uniqueness property (i) of f (µ-a.e.!). Assume f ∈ L^1_K(X, A, µ)
is another function, such that
\[ \lambda(A) = \int_A f \, d\mu, \quad \forall\, A \in \mathcal{A}^\mu_{\sigma\text{-fin}}. \]
Claim 3: f = f_0, µ-l.a.e.
What we need to show here is the fact that
fκ_B = f_0 κ_B, µ-a.e., ∀ B ∈ A^µ_fin.
But this follows immediately from the uniqueness from part B of Theorem 4.2,
applied to the finite measure space (B, A|_B, µ|_B) and the measure λ|_B, which has
both f|_B and f_0|_B as densities.
Using Claim 3, we now have f − f0 ∈ L1K (X, A, µ), with f − f0 = 0, µ-l.a.e.,
so we can apply Proposition 4.4, which forces f − f0 = 0, µ-a.e., so we indeed get
f = f0 , µ-a.e.
Property (ii) is obvious, since by (i), any function f ∈ L1K (X, A, µ), that satisfies
(26), automatically satisfies |f | = |f0 | = g, µ-a.e.
precisely that ν(A) = 0 for all countable subsets A ⊂ X. In this case the equality
(25) says practically nothing, since it is restricted solely to countable sets A ⊂ X,
when both sides are zero.
In this example, it is also instructive to analyze the case when ν is finite (see part
B in Theorem 4.3). If we follow the proof of the Theorem, we see that at some point
we have constructed a certain set X_0 = ⋃_{F ∈ F_0} F, where F_0 = {F ∈ F : ν(F) > 0}.
In our situation however it turns out that X_0 = ∅. This example brings up a very
interesting question, which turns out to sit at the very foundation of set theory.
Question: Does there exist an uncountable set X, and a finite measure ν
on P(X), such that ν(X) > 0, but ν(A) = 0, for every countable subset
A ⊂ X?
(The above vanishing condition is of course equivalent to the fact that ν({x}) = 0,
∀ x ∈ X.) It turns out that not only is the answer to this question unknown, but
in fact several mathematicians are seriously thinking of proposing it as an axiom
to be added to the current system of axioms used in set theory!
The limitations of Theorem 4.3 also force limitations in the Change of Variables
property (see Proposition 4.1), which in this case has the following statement.
Proposition 4.6 (Local Change of Variables). Let (X, A, µ) be a measure
space, and let ν be a measure on A, and let f : X → [0, ∞] be a measurable
function.
A. The following are equivalent:
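(i) one has
$$\nu(A) = \int_A f\,d\mu, \quad \forall\, A \in \mathcal{A}^{\mu}_{\sigma\text{-fin}};$$
(ii) for every measurable function h : X → [0, ∞], with the property that the
set $E_h = \{x \in X : h(x) \neq 0\}$ belongs to $\mathcal{A}^{\mu}_{\sigma\text{-fin}}$, one has the equality
(30) $$\int_X h\,d\nu = \int_X hf\,d\mu.$$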
Proof. A. (i) ⇒ (ii). Assume (i) holds. Start with some measurable function
h : X → [0, ∞], such that the set $E_h = \{x \in X : h(x) \neq 0\}$ belongs to $\mathcal{A}^{\mu}_{\sigma\text{-fin}}$. The
equality (30) is then immediate from Proposition 4.1, applied to the measure space
$(E_h, \mathcal{A}|_{E_h}, \mu|_{E_h})$, and the measure $\nu|_{E_h}$, which has density $f|_{E_h}$.
(ii) ⇒ (i). Assume (ii) holds. If we start with some $A \in \mathcal{A}^{\mu}_{\sigma\text{-fin}}$, then obviously
the measurable function $h = \kappa_A$ will have $E_h = A$, so by (ii) we immediately get
$$\nu(A) = \int_X \kappa_A\,d\nu = \int_X \kappa_A f\,d\mu = \int_A f\,d\mu.$$
B. Assume now ν and f satisfy the equivalent conditions (i) and (ii). Suppose
$h : X \to \mathbb{K}$ is measurable, with $E_h \in \mathcal{A}^{\mu}_{\sigma\text{-fin}}$, such that $h \in L^1_{\mathbb{K}}(X, \mathcal{A}, \nu)$ and
$hf \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$. Then the equality (30) follows again from Proposition 4.1,
applied to the measure space $(E_h, \mathcal{A}|_{E_h}, \mu|_{E_h})$, and the measure $\nu|_{E_h}$, which has
density $f|_{E_h}$.
Appendix A
Zorn Lemma
In this Appendix we review basic set theoretical results, which are consequences
of the following postulate:
Axiom of Choice. Given any non-empty collection¹⁷ $\{X_i : i \in I\}$ of non-empty
sets, the cartesian product
$$\prod_{i \in I} X_i$$
is non-empty.
Recall that the cartesian product is defined as
$$\prod_{i \in I} X_i = \Big\{ f : I \to \bigcup_{i \in I} X_i \;:\; f(i) \in X_i,\ \forall\, i \in I \Big\}.$$
In order to formulate several consequences of the Axiom of Choice, we need
several concepts.
Definitions. Given a set X, by a relation on X one means simply a subset
R ⊂ X × X. The standard notation for relations is:
xRy ⇐⇒ (x, y) ∈ R.
An order relation on X is a relation ≺ with the following properties:
• x ≺ x, ∀ x ∈ X;
• if x, y, z ∈ X satisfy x ≺ y and y ≺ z, then x ≺ z;
• if x, y ∈ X satisfy x ≺ y and y ≺ x, then x = y.
In this case the pair (X, ≺) is called an ordered set.
An ordered set (X, ≺) is said to be totally ordered, if
• for any elements x, y ∈ X one has either x ≺ y or y ≺ x.
More generally, given an (arbitrary) ordered set (X, ≺), by a totally ordered subset
of (X, ≺), one means a subset T ⊂ X, which becomes totally ordered with respect
to the order relation $\prec|_T$.
Example A.1. Fix a set M , and take X to be the collection of all subsets of
M . Then X carries a natural order relation defined by inclusion:
A ≺ B ⇐⇒ A ⊂ B.
A totally ordered subset C of (X, ⊂) is called a chain of subsets of M . Two subsets
A, B ⊂ M will be said to be comparable, if either A ⊂ B, or B ⊂ A, i.e. the
collection {A, B} is a chain of subsets of M .
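As a small illustration of Example A.1, the following Python sketch checks whether a finite collection of subsets is a chain. The helper name `is_chain` and the sample collections are ours, not the text's, and the check only makes sense for finite collections.

```python
def is_chain(collection):
    """Return True if every two sets in the (finite) collection are comparable."""
    sets = [frozenset(s) for s in collection]
    # A and B are comparable when A is a subset of B or B is a subset of A;
    # `<=` is the subset-or-equal test on frozensets.
    return all(a <= b or b <= a for a in sets for b in sets)


nested = [{1}, {1, 2}, {1, 2, 3}]   # a chain of subsets of M = {1, 2, 3}
crossing = [{1}, {2}]               # {1} and {2} are not comparable
```

Here `is_chain(nested)` holds while `is_chain(crossing)` fails, matching the definition of comparability above.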
Definition. Let M be a set. A collection F of subsets of M is said to have
the chain property, if
17 By a “collection of sets” one simply means a set whose elements are sets themselves.
(c) whenever $\mathcal{C} \subset \mathcal{F}$ is a chain, it follows that the union $\bigcup_{C \in \mathcal{C}} C$ also belongs
to $\mathcal{F}$.
Lemma A.1. Let M be a set, let F be a collection of subsets of M with the
chain property. For every set A ∈ F, the collection
comp(A; F) = {B ∈ F : B comparable to A}
has the chain property.
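Proof. Let $\mathcal{C} \subset \operatorname{comp}(A; \mathcal{F})$ be a chain, and put $T = \bigcup_{C \in \mathcal{C}} C$. Since $\mathcal{F}$ has
the chain property, we have $T \in \mathcal{F}$. To show that T is comparable with A, we
consider the two possibilities (which exhaust all cases, since every $C \in \mathcal{C}$ is comparable with A):
Case 1: A ⊃ C, for all $C \in \mathcal{C}$. In this case we have $A \supset \bigcup_{C \in \mathcal{C}} C = T$.
Case 2: There exists $C_0 \in \mathcal{C}$, such that $A \subset C_0$. In this case we have
$A \subset C_0 \subset T$.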
Lemma A.2. Let M be some non-empty set, and let F be a non-empty collection
of subsets of M , with the chain property. Suppose one has a map
F ∋ A ↦ xA ∈ M,
with the property that
A ∪ {xA } ∈ F, ∀ A ∈ F.
Then there exists A ∈ F such that xA ∈ A.
Proof. For each A ∈ F we define A+ = A ∪ {xA }. Call a subset G ⊂ F
inductive, if it has the chain property, and
(+) A ∈ G ⇒ A+ ∈ G.
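It is quite clear that if $\mathcal{G}_i$, i ∈ I is a collection of inductive subsets of $\mathcal{F}$, then the
intersection $\bigcap_{i \in I} \mathcal{G}_i$ is again an inductive subset of $\mathcal{F}$.
Fix now some subset $A_0 \in \mathcal{F}$, and define
$$\mathcal{G}_0 = \bigcap_{\substack{\mathcal{G}\ \text{inductive} \\ A_0 \in \mathcal{G}}} \mathcal{G}.$$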
and we try to prove that T = G0 . By Lemma A.1 it is clear that T has the chain
property. Using (1), it is clear that A0 ∈ T. Finally, we need to prove property
(+). We prove this indirectly as follows. Fix T ∈ T, consider the collection
VT = comp(T + ; G0 ) = {A ∈ G0 : A comparable with T + },
and let us prove that VT = G0 , by showing that VT is an inductive set, and contains
A0 . First of all, by Lemma A.1, it follows that VT has the chain property. Secondly,
using (1) we have A0 ⊂ T ⊂ T + , so A0 ∈ VT . Finally, to check property (+), we
Theorem A.1 (Zorn Lemma). Assume the Axiom of Choice is true. Let (X, ≺)
be a non-empty ordered set, with the following property
(z) every totally ordered subset A ⊂ X has an upper bound.
Then X has at least one maximal element.
Proof. Define the collection
F = {A ⊂ X : A totally ordered subset}.
Clearly F is non-empty (it contains, for instance, all singletons).
It is quite clear that F satisfies the hypothesis of Lemma A.3. So (F, ⊂) has a
maximal element A. Take now x to be an upper bound for A, i.e. a ≺ x, ∀ a ∈ A.
Now we prove that x is maximal for (X, ≺). Suppose y ∈ X satisfies x ≺ y.
Then clearly A ∪ {y} will still be a totally ordered subset of X, i.e. A ∪ {y} ∈ F.
The maximality of A in (F, ⊂) will force A ∪ {y} = A, so we get y ∈ A, hence y ≺ x.
Since we also have x ≺ y, this forces y = x.
Appendix B
Cardinal Arithmetic
In this Appendix we discuss cardinal arithmetic. We assume the Axiom of
Choice is true.
Definitions. Two sets A and B are said to have the same cardinality, if there
exists a bijective map A → B. It is clear that this defines an equivalence relation
on the class18 of all sets.
A cardinal number is thought of as an equivalence class of sets. In other words,
if we write a cardinal number as a, it is understood that a consists of all sets of a
given cardinality. So when we write card A = a we understand that A belongs to
this class, and for another set B we write card B = a, exactly when B has the same
cardinality as A. In this case we write card B = card A.
Notations. The cardinality of the empty set ∅ is zero. More generally the
cardinality of a finite set is equal to its number of elements. The cardinality of the
set N, of all natural numbers, is denoted by ℵ0 .
Definition. Let a and b be cardinal numbers. We write a ≤ b if there exist
sets A ⊂ B with card A = a and card B = b.
This is equivalent to the fact that, for any sets A and B, with card A = a and
card B = b, one of the following equivalent conditions holds:
• there exists an injective function f : A → B;
• there exists a surjective function g : B → A.
For two cardinal numbers a and b, we use the notation a < b to indicate that
a ≤ b and a ≠ b.
Theorem B.1 (Cantor-Bernstein). Suppose two cardinal numbers a and b sat-
isfy a ≤ b and b ≤ a. Then a = b.
Proof. Fix two sets A and B with card A = a and card B = b, so there exist
injective functions f : A → B and g : B → A. We shall construct a bijective
function h : A → B. Define the sets
A0 = A ∖ g(B) and B0 = B ∖ f (A).
Then define recursively the sequences (An )n≥0 and (Bn )n≥0 by
An = g(Bn−1 ) and Bn = f (An−1 ), ∀ n ≥ 1.
Claim 1: One has Am ∩ An = Bm ∩ Bn = ∅, ∀ m > n ≥ 0.
Let us first observe that the case when n = 0 is trivial, since we have the inclusions
Am = g(Bm−1 ) ⊂ g(B) = A ∖ A0 and Bm = f (Am−1 ) ⊂ f (A) = B ∖ B0 . Next we
prove the desired property by induction on m. The case m = 1 is clear (this forces
18 The term class is used, because there is no such thing as the “set of all sets.”
n = 0). Suppose the statement is true for m = k, and let us prove it for m = k + 1.
Start with some n < k + 1. If n = 0, we are done, by the above discussion. Assume
now n ≥ 1. Since f and g are injective, the inductive hypothesis gives
Ak+1 ∩ An = g(Bk ) ∩ g(Bn−1 ) = g(Bk ∩ Bn−1 ) = ∅,
Bk+1 ∩ Bn = f (Ak ) ∩ f (An−1 ) = f (Ak ∩ An−1 ) = ∅,
and we are done.
Put $C = A \smallsetminus \bigcup_{n \geq 0} A_n$ and $D = B \smallsetminus \bigcup_{n \geq 0} B_n$.
Claim 2: One has the equality f (C) = D.
First we prove the inclusion f (C) ⊂ D. Start with some point c ∈ C, but assume
f (c) ∉ D. This means that there exists some n ≥ 0 such that f (c) ∈ Bn . Since
f (c) ∈ f (A) = B ∖ B0 , we must have n ≥ 1. But then we get f (c) ∈ Bn = f (An−1 ),
and the injectivity of f will force c ∈ An−1 , which is impossible.
Second, we prove that D ⊂ f (C). Start with some d ∈ D. First of all, since
D ⊂ B ∖ B0 = f (A), there exists some c ∈ A with d = f (c). If c ∉ C, then there
exists some n ≥ 0, such that c ∈ An , and then we would get d = f (c) ∈ f (An ) =
Bn+1 , which is impossible.
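We now begin constructing the desired bijection. First we define $\varphi : \bigcup_{n \geq 0} B_n \to B$ by
$$\varphi(b) = \begin{cases} b & \text{if } b \in B_n \text{ and } n \text{ is odd} \\ (f \circ g)(b) & \text{if } b \in B_n \text{ and } n \text{ is even} \end{cases}$$
Claim 3: The map φ defines a bijection
$$\varphi : \bigcup_{n \geq 0} B_n \to \bigcup_{n \geq 1} B_n.$$
It is clear that, since $\varphi|_{B_n}$ is injective, the map φ is injective. Notice also that, if
n ≥ 0 is even, then $\varphi(B_n) = f(g(B_n)) = f(A_{n+1}) = B_{n+2}$. When n ≥ 0 is odd we
have $\varphi(B_n) = B_n$, so we have indeed the equality
$$\varphi\Big(\bigcup_{n \geq 0} B_n\Big) = \bigcup_{n \geq 1} B_n.$$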
so ψ defines a bijection
$$\psi : \bigcup_{n \geq 0} A_n \to \bigcup_{n \geq 0} B_n.$$
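We then combine ψ with the bijection f : C → D, i.e. we define the map h : A → B
by
$$h(x) = \begin{cases} \psi(x) & \text{if } x \in \bigcup_{n \geq 0} A_n \\ f(x) & \text{if } x \in A \smallsetminus \bigcup_{n \geq 0} A_n = C. \end{cases}$$
Clearly h is injective, and
$$h(A) = \psi\Big(\bigcup_{n \geq 0} A_n\Big) \cup f(C) = \bigcup_{n \geq 0} B_n \cup D = B,$$
so h is indeed bijective.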
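For concrete injections the bijection can actually be computed. The Python sketch below uses the common back-tracing variant of the argument rather than the specific maps φ and ψ of the proof: a point of A is sent by f if its chain of preimages stops in A, and by the inverse of g if the chain stops in B. The function names, the partial inverses `f_inv` and `g_inv`, and the example f(n) = g(n) = n + 1 on {0, 1, 2, ...} are assumptions made for illustration; the loop also assumes that every chain of preimages terminates, which holds in this example.

```python
def cantor_bernstein(f, g, f_inv, g_inv):
    """Build h : A -> B from injections f : A -> B and g : B -> A.

    f_inv and g_inv are partial inverses returning None outside the range.
    Assumes every chain of preimages is finite (true in the example below).
    """
    def h(a):
        x, in_A = a, True
        while True:
            # Step back along the chain: from A via g^{-1}, from B via f^{-1}.
            prev = g_inv(x) if in_A else f_inv(x)
            if prev is None:
                break
            x, in_A = prev, not in_A
        # Chain stopped in A: use f.  Chain stopped in B: a lies in g(B), use g^{-1}.
        return f(a) if in_A else g_inv(a)
    return h


def shift(n):
    return n + 1


def shift_inv(m):
    return m - 1 if m >= 1 else None


h = cantor_bernstein(shift, shift, shift_inv, shift_inv)
images = [h(a) for a in range(100)]
```

In this example h swaps 2k and 2k + 1, so it is a bijection of {0, 1, 2, ...} even though neither f nor g is surjective.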
Theorem B.2 (Total ordering for cardinal numbers). Let a and b be cardinal
numbers. Then one has either a ≤ b, or b ≤ a.
Proof. Choose two sets A and B with card A = a and card B = b. In order to
prove the theorem, it suffices to construct either an injective function f : A → B,
or an injective function f : B → A.
We define the set
X = {(C, D, g) : C ⊂ A, D ⊂ B, g : C → D bijection}.
We equip X with the following order relation:
$$(C, D, g) \prec (C', D', g') \iff \begin{cases} C \subset C' \\ D \subset D' \\ g = g'|_C \end{cases}$$
We now check that (X, ≺) satisfies the hypothesis of Zorn Lemma. Let $\mathcal{A} \subset X$
be a totally ordered subset, say $\mathcal{A} = \{(C_i, D_i, g_i) : i \in I\}$. Define $C = \bigcup_{i \in I} C_i$,
$D = \bigcup_{i \in I} D_i$, and g : C → D to be the unique function with the property that
$g|_{C_i} = g_i$, ∀ i ∈ I. (We use here the fact that for i, j ∈ I we either have $C_i \subset C_j$
and $g_j|_{C_i} = g_i$, or $C_j \subset C_i$ and $g_i|_{C_j} = g_j$. In either case, this proves that
$g_i|_{C_i \cap C_j} = g_j|_{C_i \cap C_j}$, ∀ i, j ∈ I, so such a g exists.) It is then pretty clear that
$(C, D, g) \in X$ and $(C_i, D_i, g_i) \prec (C, D, g)$, ∀ i ∈ I, i.e. (C, D, g) is an upper bound
for $\mathcal{A}$. Use now Zorn Lemma, to find a maximal element (A0 , B0 , f ) in X.
Claim: Either A0 = A or B0 = B.
We prove this by contradiction. If we have strict inclusions A0 ⊊ A and B0 ⊊ B,
then if we choose a ∈ A ∖ A0 and b ∈ B ∖ B0 , we can define a bijection g :
A0 ∪ {a} → B0 ∪ {b} by g(a) = b and $g|_{A_0} = f$. This would then produce a
new element (A0 ∪ {a}, B0 ∪ {b}, g) ∈ X, which would contradict the maximality of
(A0 , B0 , f ).
The theorem now follows immediately from the Claim. If A0 = A, then f :
A → B is injective, and if B0 = B, then f −1 : B → A is injective.
then $a^b = \operatorname{card}(A^B)$.
It is pretty easy to show that these definitions are correct, in the sense that they
do not depend on the particular choices of the sets involved. Moreover, these
operations are consistent with the usual operations with natural numbers.
Remark B.1. The operations with cardinal numbers, defined above, satisfy:
• a + b = b + a,
• (a + b) + d = a + (b + d),
• a + 0 = a,
• a · b = b · a,
• (a · b) · d = a · (b · d),
• a · 1 = a,
• a · (b + d) = (a · b) + (a · d),
• $(a \cdot b)^d = (a^d) \cdot (b^d)$,
• $a^{b+d} = (a^b) \cdot (a^d)$,
• $(a^b)^d = a^{b \cdot d}$,
for all cardinal numbers a, b, d ≥ 1.
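For finite sets these laws can be checked by brute-force enumeration, reading a^b as the number of functions from a set of cardinality b to a set of cardinality a. The Python sketch below is only a sanity check for small finite cardinals; the helper name `count_functions` and the sample sets are illustrative assumptions.

```python
from itertools import product


def count_functions(domain, codomain):
    """Count functions domain -> codomain by listing all tuples of values."""
    # A function is a choice of one value in the codomain per domain element.
    return sum(1 for _ in product(list(codomain), repeat=len(list(domain))))


A = ["x", "y", "z"]                      # card A = 3
B = [("B", 0), ("B", 1)]                 # card B = 2
D = [("D", 0), ("D", 1), ("D", 2)]       # card D = 3, disjoint from B

lhs = count_functions(B + D, A)                       # a^(b+d)
rhs = count_functions(B, A) * count_functions(D, A)   # a^b * a^d
```

Both sides count the same functions: a map defined on the disjoint union of B and D is the same thing as a pair of maps, one on B and one on D.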
Remark B.2. The order relation ≤ is compatible with all the operations, in
the sense that, if a1 , a2 , b1 , and b2 are cardinal numbers with a1 ≤ a2 and b1 ≤ b2 ,
then
• $a_1 + b_1 \leq a_2 + b_2$,
• $a_1 \cdot b_1 \leq a_2 \cdot b_2$,
• $a_1^{b_1} \leq a_2^{b_2}$.
Proposition B.1. Let a ≥ 1 be a cardinal number.
(i) If A is a set with card A = a, and if we define
P(A) = {B : B subset of A},
then $2^a = \operatorname{card} \mathcal{P}(A)$.
(ii) $a < 2^a$.
Proof. (i). Put
$$P = \{0, 1\}^A = \{f : f \text{ function from } A \text{ to } \{0, 1\}\},$$
(ii) ℵ0 + ℵ0 = ℵ0 ;
(iii) ℵ0 · ℵ0 = ℵ0 ;
Proof. (i). Let a be an infinite cardinal number, and let A be a
set with card A = a. Since for every finite subset F ⊂ A, there exists some
x ∈ A ∖ F , one can construct a sequence (xn )n∈N ⊂ A, with xm ≠ xn , ∀ m > n ≥ 1.
Then the subset B = {xn : n ∈ N} has card B = ℵ0 , so the inclusion B ⊂ A gives
the desired inequality.
(ii). Consider the sets
A0 = {n ∈ N : n even} and A1 = {n ∈ N : n odd}.
Then clearly card A0 = card A1 = ℵ0 , and since A0 and A1 are disjoint, the equality A0 ∪ A1 = N gives
ℵ0 + ℵ0 = card A0 + card A1 = card(A0 ∪ A1 ) = card N = ℵ0 .
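(iii). Take the set P = N × N, so that ℵ0 · ℵ0 = card P . It is obvious that
card P ≥ ℵ0 . To prove the other inequality, we define a surjection φ : N → P as
follows. For each n ≥ 1 we take $s_n = n(n-1)/2$, we set
$$B_n = \{m \in \mathbb{N} : s_n < m \leq s_{n+1}\},$$
and we define $\varphi_n : B_n \to P$ by
$$\varphi_n(m) = (n + 1 + s_n - m,\ m - s_n), \quad \forall\, m \in B_n.$$
(For $m \in B_n$ the difference $m - s_n$ runs through $1, \dots, n$, so both coordinates lie in $\mathbb{N}$ and add up to $n + 1$.)
Notice that
(1) $$\varphi_n(B_n) = \{(p, q) \in \mathbb{N} \times \mathbb{N} : p + q = n + 1\}.$$
Notice also that $\bigcup_{n \geq 1} B_n = \mathbb{N}$, and $B_j \cap B_k = \emptyset$, ∀ j > k ≥ 1, so there exists a
(unique) function φ : N → P , such that $\varphi|_{B_n} = \varphi_n$, for all n ≥ 1. By (1) it is clear
that φ is surjective.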
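The surjection can be made explicit in code. The Python sketch below walks along the anti-diagonals p + q = n + 1 of N × N, with N = {1, 2, ...}, using the indexing (n + 1 + s_n − m, m − s_n) on the block B_n. This particular indexing is our choice, made so that both coordinates stay in N, and the function name `phi` is illustrative.

```python
def phi(m):
    """Map m in N = {1, 2, ...} to a pair (p, q) in N x N."""
    n = 1
    # Find the block B_n containing m, i.e. s_n < m <= s_{n+1},
    # where s_n = n(n-1)/2 and s_{n+1} = n(n+1)/2.
    while n * (n + 1) // 2 < m:
        n += 1
    s_n = n * (n - 1) // 2
    # Position m - s_n in {1, ..., n} along the anti-diagonal p + q = n + 1.
    return (n + 1 + s_n - m, m - s_n)


# The first 55 = s_11 values cover the anti-diagonals n = 1, ..., 10.
pairs = [phi(m) for m in range(1, 56)]
```

Every pair (p, q) is reached at the index m = s_n + q with n = p + q − 1, which is the surjectivity used in the proof; in this sketch the map is in fact injective as well.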
Theorem B.3. Let a and b be cardinal numbers, with 1 ≤ b ≤ a, and a infinite.
Then:
(i) a + b = a;
(ii) a · b = a.
Proof. It is clear that
a ≤ a + b ≤ a + a,
a ≤ a · b ≤ a · a,
so in order to prove the theorem, we can assume that a = b.
(i). Fix some set A with card A = a. Use Zorn Lemma, to find a maximal
non-empty family {Ai : i ∈ I} of subsets of A with
(a) card Ai = ℵ0 , for all i ∈ I;
(b) Ai ∩ Aj = ∅, for all i, j ∈ I with i ≠ j.
If we put $B = A \smallsetminus \bigcup_{i \in I} A_i$, then by maximality it follows that B is finite. In
particular, if we take $i_0 \in I$ then obviously $\operatorname{card}(A_{i_0} \cup B) = \aleph_0$, so if we replace
$A_{i_0}$ with $A_{i_0} \cup B$, we will still have the above properties (a) and (b), but also
$A = \bigcup_{i \in I} A_i$. This proves that a = card A = ℵ0 · d, where d = card I. In other
words, we have a = card(N × I). Consider then the sets
C0 = {n ∈ N : n even} and C1 = {n ∈ N : n odd},
Proof. We have
$$2^a \leq b^a \leq (2^a)^a = 2^{a \cdot a} = 2^a,$$
and the desired equality follows from the Cantor-Bernstein Theorem.
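Proof. First of all, the map $A \ni a \mapsto \{a\} \in P_{\text{fin}}(A)$ is injective, so
$a \leq \operatorname{card} P_{\text{fin}}(A)$.
We now prove the other inequality. For every integer n ≥ 1, let $A^n$ denote the
n-fold cartesian product. We treat the sequence $A^1, A^2, \dots$ as pairwise disjoint.
For every n ≥ 1 we define the map
$$\varphi_n : A^n \to P_{\text{fin}}(A),$$
by
$$\varphi_n(a_1, \dots, a_n) = \{a_1, \dots, a_n\},$$
and we define the map $\varphi : \bigcup_{n=1}^{\infty} A^n \to P_{\text{fin}}(A)$ as the unique map such that
$\varphi|_{A^n} = \varphi_n$, ∀ n ≥ 1. Notice now that, since
$$\operatorname{card} A^n = a^n = a, \quad \forall\, n \geq 1,$$
it follows that
$$\operatorname{card} \bigcup_{n=1}^{\infty} A^n = \aleph_0 \cdot a = a,$$
which gives
$$\operatorname{card}(\operatorname{Range} \varphi) \leq a.$$
But it is clear that
$$\{\emptyset\} \cup \operatorname{Range} \varphi = P_{\text{fin}}(A),$$
and the fact that $P_{\text{fin}}(A)$ is infinite, proves that
$$\operatorname{card} P_{\text{fin}}(A) = \operatorname{card}(\operatorname{Range} \varphi) \leq a.$$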
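so $2^{\aleph_0} = \operatorname{card} P$. For any real number r ≥ 2, we define the map $\varphi_r : T \to [0, 1]$ by
$$\varphi_r(a) = (r - 1) \sum_{n=1}^{\infty} \frac{\alpha_n}{r^n}, \quad \forall\, a = (\alpha_n)_{n \in \mathbb{N}} \in T.$$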
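The maps $\varphi_r$, r ≥ 2 are “almost” injective. To clarify this, we define the set
$$T_0 = \{a = (\alpha_n)_{n \in \mathbb{N}} \in T : \text{the set } \{n \in \mathbb{N} : \alpha_n = 0\} \text{ is infinite}\}.$$
Note that
$$T \smallsetminus T_0 = \{(\alpha_n)_{n \in \mathbb{N}} \in T : \text{there exists } N \in \mathbb{N}, \text{ such that } \alpha_n = 1,\ \forall\, n \geq N\}.$$
Clearly $\varphi_2$ is surjective. In fact $\varphi_2$ is “almost” bijective.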
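To see what “almost” injective means in practice, the following Python sketch evaluates the series exactly for 0/1 sequences that are eventually constant, using the closed form of the geometric tail. The function name `phi_r` and the encoding of a sequence as a finite prefix plus a repeated tail digit are assumptions made for illustration.

```python
from fractions import Fraction


def phi_r(r, prefix, tail):
    """Exact value of (r - 1) * sum(alpha_n / r**n) for the 0/1 sequence that
    starts with `prefix` and then repeats the digit `tail` forever."""
    head = sum(Fraction(alpha, r ** n) for n, alpha in enumerate(prefix, start=1))
    # Geometric tail: (r - 1) * sum_{n > N} tail / r**n = tail / r**N.
    return (r - 1) * head + Fraction(tail, r ** len(prefix))


a = ([1], 0)   # 1, 0, 0, 0, ...  (infinitely many zeros, so a lies in T0)
b = ([0], 1)   # 0, 1, 1, 1, ...  (eventually constant 1, so b lies outside T0)
```

For r = 2 the two sequences have the same image 1/2, so the map is not injective on all of T, while for r = 3 the images 2/3 and 1/3 differ. Since b lies outside T0, the collision for r = 2 is consistent with injectivity on T0.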
Claim 1: Fix r ≥ 2. For elements $a = (\alpha_n)_{n \in \mathbb{N}}$, $b = (\beta_n)_{n \in \mathbb{N}} \in T_0$, the
following are equivalent
(∗) $\varphi_r(a) > \varphi_r(b)$;
(∗∗) there exists k ∈ N, such that $\alpha_k > \beta_k$, and $\alpha_j = \beta_j$, for all j ∈ N
with j < k.
We first prove the implication (∗∗) ⇒ (∗). If a, b ∈ T0 satisfy (∗∗), then
(4) $$\varphi_r(a) - \varphi_r(b) = \frac{r-1}{r^k} + (r-1) \sum_{n=k+1}^{\infty} \frac{\alpha_n - \beta_n}{r^n} \geq \frac{r-1}{r^k} - (r-1) \sum_{n=k+1}^{\infty} \frac{\beta_n}{r^n}.$$
Notice now that there are infinitely many indices n ≥ k + 1 such that βn = 0. This
gives the fact that
$$\sum_{n=k+1}^{\infty} \frac{\beta_n}{r^n} < \sum_{n=k+1}^{\infty} \frac{1}{r^n} = \frac{1}{(r-1) r^k},$$
so if we go back to (4) we get
$$\varphi_r(a) - \varphi_r(b) \geq \frac{r-1}{r^k} - (r-1) \sum_{n=k+1}^{\infty} \frac{\beta_n}{r^n} > \frac{r-1}{r^k} - \frac{1}{r^k} = \frac{r-2}{r^k} \geq 0,$$
Using the implication (∗∗) ⇒ (∗) we see that we cannot have βk > αk , because this
would force $\varphi_r(b) > \varphi_r(a)$. Therefore we must have αk > βk , and we are done.
Using Claim 1, we now see that $\varphi_r|_{T_0} : T_0 \to [0, 1]$ is injective.
Claim 2: card(T r T0 ) = ℵ0 .
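This is pretty clear, since we can write
$$T \smallsetminus T_0 = \bigcup_{k=1}^{\infty} R_k,$$
where
$$R_k = \{a = (\alpha_n)_{n \in \mathbb{N}} \in T : \alpha_n = 1,\ \forall\, n \geq k\}.$$
Since each $R_k$ is finite, and $T \smallsetminus T_0$ is infinite, the desired result follows.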
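Using Claim 2, we have
$$2^{\aleph_0} = \operatorname{card} T = \operatorname{card}(T \smallsetminus T_0) + \operatorname{card} T_0 = \aleph_0 + \operatorname{card} T_0.$$
Since $\aleph_0 < 2^{\aleph_0}$, the above equality forces
$$2^{\aleph_0} = \operatorname{card} T_0.$$
For every r ≥ 2, we also have $\operatorname{card} \varphi_r(T \smallsetminus T_0) \leq \aleph_0$, which then gives
$\operatorname{card}\big(\varphi_r(T) \smallsetminus \varphi_r(T_0)\big) \leq \aleph_0$. Using the injectivity of $\varphi_r|_{T_0}$, we have
$\operatorname{card} \varphi_r(T_0) = \operatorname{card} T_0 = 2^{\aleph_0}$, so we get
$$2^{\aleph_0} = \operatorname{card} \varphi_r(T_0) \leq \operatorname{card} \varphi_r(T) = \operatorname{card} \varphi_r(T_0) + \operatorname{card}\big(\varphi_r(T) \smallsetminus \varphi_r(T_0)\big) \leq \operatorname{card} \varphi_r(T_0) + \aleph_0 = 2^{\aleph_0} + \aleph_0 = 2^{\aleph_0}.$$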
Ordinal numbers
In this Appendix we discuss ordinal number arithmetic. The Axiom of Choice
is assumed to be true.
Definition. Let X be a non-empty set. A well ordering on X is a total order
relation ≺ on X with the following property:
(w) every non-empty subset A ⊂ X has a smallest element, i.e. there exists
a ∈ A, such that a ≺ x, ∀ x ∈ A.
In this case the pair (X, ≺) is called a well ordered set.
Notations. Let (W, ≺) be a well-ordered set. For any a ∈ W , we define
W (a) = {x ∈ W : x ≺ a and x ≠ a}.
Remark that (W (a), ≺) is well-ordered.
Lemma C.1. Let (W, ≺) be a well ordered set. For a subset S ⊂ W , the
following are equivalent:
(i) for every s ∈ S, one has the inclusion W (s) ⊂ S;
(ii) either S = W , or there exists some a ∈ W , such that S = W (a).
Proof. (i) ⇒ (ii). Assume S ⊊ W . Take a to be the smallest element of the
set W ∖ S. If s ∈ S, then a ≠ s, and by (i) we cannot have a ≺ s, since this would
force a ∈ W (s) ⊂ S. Therefore we must have s ≺ a, i.e. s ∈ W (a). This proves the
inclusion S ⊂ W (a). Conversely, if s ∈ W (a), then s must belong to S. Otherwise
s ∈ W ∖ S would contradict the minimality of a.
(ii) ⇒ (i). This is trivial.
Definition. A subset S, as above, is called a full subset.
The key feature of well-ordered sets is the following.
Lemma C.2 (Transfinite Induction Principle). Let (W, ≺) be a well-ordered
set. Let w1 ∈ W be the smallest element of W . Assume A ⊂ W is a set with the
property
(i) If w ∈ W has the property that, W (w) ⊂ A, then w ∈ A.
Then A = W .
Proof. Consider the set
S = {s ∈ A : W (s) ⊂ A}.
It is obvious that S is full, and S ⊂ A. By Lemma C.1, either S = W , in which
case we clearly get A = W , or there exists w ∈ W , such that S = W (w). In this
case we have W (w) ⊂ A. By (i) this forces w ∈ A, so we get w ∈ S, which is
impossible.
Proof. For every a ∈ W let us denote the set W (a) ∪ {a} simply by $W_a$, and
let us define the set
$$F_a = \big\{ g : W_a \to X \;:\; g(w_1) = x_1 \text{ and } g(b) = \Phi_b\big(g|_{W(b)}\big),\ \forall\, b \in W_a \smallsetminus \{w_1\} \big\}.$$
Remark that, for any a, b ∈ W , with a ≺ b, one has
(2) $$f|_{W_a} \in F_a, \quad \forall\, f \in F_b.$$
Claim: For every a ∈ W , the set $F_a$ is a singleton.
We prove this statement using transfinite induction. Define
$$A = \{a \in W : F_a \text{ is a singleton}\}.$$
This follows immediately from the fact that $f_c|_{W_b}$ belongs to $F_b$. Using the obvious
equality
$$W(a) = \bigcup_{b \in W(a)} W_b,$$
we define g : W (a) → X as the unique function with the property that $g|_{W_b} = f_b$,
∀ b ∈ W (a). Finally, we define $f_a : W_a \to X$ by $f_a|_{W(a)} = g$, and $f_a(a) = \Phi_a(g)$. It
is clear that $f_a \in F_a$, so $F_a$ has at least one element. If $h \in F_a$ is another function,
then for every b ∈ W (a) we have $h|_{W_b} \in F_b$, which forces $h|_{W_b} = f_b$, in particular
giving $h|_{W(a)} = g = f_a|_{W(a)}$. Then $h(a) = \Phi_a(g)$, which means that we also have
$h(a) = f_a(a)$, so we must have $h = f_a$.
Having proven the Claim, we now have a family of functions $f_a : W_a \to X$,
a ∈ W , with $f_b|_{W_a} = f_a$, for all a, b ∈ W with a ≺ b. Using the equality
$$W = \bigcup_{a \in W} W_a,$$
we then define f : W → X to be the unique function such that $f|_{W_a} = f_a$, ∀ a ∈ W .
Notice that, for each $a \in W \smallsetminus \{w_1\}$, we have $f(a) = f_a(a)$, and since $f_a \in F_a$,
we immediately get (1). The uniqueness of f with property (1) is also clear, since
any such f will automatically satisfy $f|_{W_a} \in F_a$, for all a ∈ W .
Comment. The system of maps $\Phi_a : \prod_{W(a)} X \to X$, a ∈ W is to be thought of
as a “recurrence relation,” in the sense that it is used to define the value f (a) in
terms of all “preceding” values f (w), w ≺ a, w ≠ a.
Definitions. Given two well ordered sets (W1 , ≺1 ) and (W2 , ≺2 ), a map f :
(W1 , ≺1 ) → (W2 , ≺2 ) is called a full embedding, if
• f is injective.
• For any two elements x, y ∈ W1 , one has
x ≺1 y ⇒ f (x) ≺2 f (y).
• f (W1 ) is a full subset of W2 .
If f is a full embedding, with f (W1 ) = W2 , then f is called an order isomorphism.
The properties of these types of maps are contained in the following
Proposition C.1. A. Suppose (W1 , ≺1 ) and (W2 , ≺2 ), are well-ordered sets.
(i) If f : (W1 , ≺1 ) → (W2 , ≺2 ) is a full embedding, then
$$f\big(W_1(a)\big) = W_2\big(f(a)\big), \quad \forall\, a \in W_1.$$
In particular, if w1 is the smallest element in W1 , and w2 is the smallest
element in W2 , then f (w1 ) = w2 .
(ii) If f : (W1 , ≺1 ) → (W2 , ≺2 ) is an order isomorphism, then f −1 : (W2 , ≺2 ) →
(W1 , ≺1 ) is again an order isomorphism.
(iii) There exists at most one full embedding f : (W1 , ≺1 ) → (W2 , ≺2 ).
B. Suppose (W1 , ≺1 ), (W2 , ≺2 ), (W3 , ≺3 ) are well-ordered sets, and
$$(W_1, \prec_1) \xrightarrow{\ f\ } (W_2, \prec_2) \xrightarrow{\ g\ } (W_3, \prec_3)$$
are full embeddings.
(i) The composition g ◦ f : (W1 , ≺1 ) → (W3 , ≺3 ) is again a full embedding.
(ii) The composition g ◦ f is an order isomorphism, if and only if both f and
g are order isomorphisms.
Proof. A. (i). Start first with some element x ∈ W1 (a). Since x ≺1 a, we have
f (x) ≺2 f (a). Since f is injective, and x ≠ a, we must have f (x) ≠ f (a), hence
f (x) ∈ W2 (f (a)). Conversely, if y ∈ W2 (f (a)), then using the fact that f (W1 ) is full
in W2 , it follows that y ∈ f (W1 ), so there exists some x ∈ W1 , with y = f (x). If
a ≺1 x, then we would get f (a) ≺2 f (x), which is impossible. Therefore we must
have x ≺1 a and x ≠ a, i.e. x ∈ W1 (a), so y indeed belongs to f (W1 (a)). The
second assertion is now clear since we have
W2 (f (w1 )) = f (W1 (w1 )) = f (∅) = ∅,
which clearly forces f (w1 ) = w2 .
(ii). This is obvious.
(iii). Suppose f, g : (W1 , ≺1 ) → (W2 , ≺2 ) are full embeddings, and let us show
that we must have f = g. We use transfinite induction. Define the set
A = {w ∈ W1 : f (w) = g(w)}.
Let w ∈ W1 be some element such that W1 (w) ⊂ A, and let us prove that w ∈ A,
i.e. f (w) = g(w). Denote f (w) by a, and g(w) by b. Using the fact that f W (w) =
1
B. (i). It is clear that g ◦ f is injective, and satisfies the second condition in the
definition, so the only thing we need to prove is the fact that (g ◦ f )(W1 ) is full. If
f (W1 ) = W2 , there is nothing to prove, since we would get (g ◦ f )(W1 ) = g(W2 ),
which is full.
Assume f (W1 ) = W2 (a), for some a ∈ W2 . Then by part A (i) we have
(g ◦ f )(W1 ) = g(f (W1 )) = g(W2 (a)) = W3 (g(a)),
so again (g ◦ f )(W1 ) is full.
(ii). Assume first that both f and g are order isomorphisms. Then g ◦ f :
(W1 , ≺1 ) → (W3 , ≺3 ) is a full embedding, by (i), and it is clearly surjective, hence
g ◦ f is indeed an order isomorphism.
Conversely, assume g ◦ f : (W1 , ≺1 ) → (W3 , ≺3 ) is an order isomorphism. This
clearly forces g to be surjective, hence an order isomorphism. But then g −1 is an
order isomorphism, and so will be g −1 ◦ (g ◦ f ) = f .
Corollary C.1. If (W, ≺) is a well-ordered set, and a ∈ W , then there is no
full embedding (W, ≺) → (W (a), ≺).
Proof. Suppose there exists a full embedding f : (W, ≺) → (W (a), ≺). Since
the inclusion ι : (W (a), ≺) ↪ (W, ≺) is obviously a full embedding, the composition
ι ◦ f : (W, ≺) → (W, ≺) is a full embedding. Since we also have IdW : (W, ≺) →
(W, ≺) as a full embedding, this would force ι ◦ f = IdW , which would force ι to be
surjective. But this is obviously impossible.
Definitions. Two well-ordered sets (W1 , ≺1 ) and (W2 , ≺2 ) are said to have
the same order type, if there exists an order isomorphism (W1 , ≺1 ) → (W2 , ≺2 ).
By the above considerations, this defines an equivalence relation on the class of all
well-ordered sets.
An ordinal number is thought of as an equivalence class of well-ordered sets. In
other words, if we write an ordinal number as α, it is understood that α consists
of all well-ordered sets of a given order type. So when we write ord(W, ≺) = α
we understand that (W, ≺) belongs to this class, and for another well-ordered set
(W′, ≺′) we write ord(W′, ≺′) = α, exactly when (W′, ≺′) has the same order type
as (W, ≺). In this case we write ord(W′, ≺′) = ord(W, ≺).
We regard the empty set ∅ as a well-ordered set, with the empty relation. We
write ord(∅) = 0.
Comments. If (W1 , ≺1 ) and (W2 , ≺2 ) are well-ordered sets, then one has the
obvious implication
ord(W1 , ≺1 ) = ord(W2 , ≺2 ) =⇒ card W1 = card W2 .
Conversely, if the well-ordered sets (W1 , ≺1 ) and (W2 , ≺2 ) are finite, and card W1 =
card W2 , then ord(W1 , ≺1 ) = ord(W2 , ≺2 ). Indeed, if we take n = card W1 , then
one can define recursively a finite sequence $(w_k)_{k=1}^{n} \subset W_1$, by taking w1 to be the
smallest element of W1 , and defining, for each k ∈ {2, 3, . . . , n} the element wk to
be the smallest element of the set W1 ∖ {w1 , w2 , . . . , wk−1 }. The obvious bijection
{1, 2, . . . , n} ∋ k ↦ wk ∈ W1
will then define an order isomorphism
({1, . . . , n}, ≤) → (W1 , ≺1 ).
Likewise (W2 , ≺2 ) has the same order type as ({1, . . . , n}, ≤).
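The recursive choice of w1 , w2 , . . . , wn described above can be sketched in Python as follows. The function name `enumerate_in_order` and the sample order on strings are hypothetical, chosen only to illustrate the construction on a finite totally ordered set.

```python
def enumerate_in_order(elements, precedes):
    """List a finite totally ordered set as w_1, w_2, ..., w_n, each w_k being
    the smallest element not yet listed."""
    remaining = set(elements)
    sequence = []
    while remaining:
        # The smallest element precedes every element still remaining.
        smallest = next(a for a in remaining
                        if all(precedes(a, x) for x in remaining))
        sequence.append(smallest)
        remaining.remove(smallest)
    return sequence


def shorter_first(x, y):
    # Hypothetical order on strings: by length, then alphabetically.
    return (len(x), x) <= (len(y), y)


w = enumerate_in_order({"b", "aa", "a", "c"}, shorter_first)
```

The position k of an element in the returned list is its image under the order isomorphism with ({1, . . . , n}, ≤), up to the shift from 0-based list indices.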
Using the above notations, we can then regard all non-negative integers as
ordinal numbers, by identifying ord(W, ≺) = card(W ), for all finite well-ordered
sets (W, ≺).
Notation. If α is an ordinal number, say α = ord(W, ≺), for some well-ordered
set (W, ≺), then the cardinal number card W does not depend on the particular
choice of (W, ≺). We will denote it by card α. As discussed above, if
card α = card β = finite cardinal,
then α = β. As we shall see later, this implication holds only for finite ordinal
numbers.
Definitions. Let α1 and α2 be ordinal numbers, say α1 = ord(W1 , ≺1 ) and
α2 = ord(W2 , ≺2 ), where (W1 , ≺1 ) and (W2 , ≺2 ) are two well-ordered sets. We write
α1 ≤ α2 , if there exists a full embedding f : (W1 , ≺1 ) → (W2 , ≺2 ). By Proposition
C.1, this definition is independent of the choices of (W1 , ≺1 ) and (W2 , ≺2 ).
We write α1 < α2 if α1 ≤ α2 and α1 ≠ α2 .
Remark C.1. If α1 and α2 are ordinal numbers, with α1 ≤ α2 , then card α1 ≤
card α2 .
Proposition C.2. The relation ≤ is an order relation, on any set of ordinal
numbers.
Proof. It is obvious that α ≤ α, for any ordinal number α.
Assume α1 and α2 are ordinal numbers with α1 ≤ α2 and α2 ≤ α1 , and let
us show that this forces α1 = α2 . Let (W1 , ≺1 ) and (W2 , ≺2 ) be well-ordered sets
with α1 = ord(W1 , ≺1 ) and α2 = ord(W2 , ≺2 ). Since α1 ≤ α2 , there exists a full
embedding f : (W1 , ≺1 ) → (W2 , ≺2 ). Since α2 ≤ α1 , there exists a full
embedding g : (W2 , ≺2 ) → (W1 , ≺1 ). By Proposition C.1.B, the composition g ◦ f :
(W1 , ≺1 ) → (W1 , ≺1 ) is a full embedding. Since we already have a full embedding
IdW1 : (W1 , ≺1 ) → (W1 , ≺1 ), by Proposition C.1.A, we must have g ◦ f = IdW1 .
Using Proposition C.1.B this forces f (and g) to be order isomorphisms, so we
indeed have α1 = α2 .
Finally, suppose α1 , α2 and α3 are ordinal numbers such that α1 ≤ α2 and
α2 ≤ α3 . The fact that α1 ≤ α3 follows immediately from Proposition C.1.B.
Theorem C.1 (Ordinal Comparability Theorem). Let α1 and α2 be ordinal
numbers. Then either α1 ≤ α2 , or α2 ≤ α1 .
Proof. Let (W1 , ≺1 ) and (W2 , ≺2 ) be well-ordered sets with α1 = ord(W1 , ≺1 )
and α2 = ord(W2 , ≺2 ). For every a ∈ W1 we denote the set W1 (a) ∪ {a} simply by
$W_1^a$. It is clear that $(W_1^a, \prec_1)$ is well-ordered. Consider the set
$$A = \{a \in W_1 : \text{there exists a full embedding } (W_1^a, \prec_1) \to (W_2, \prec_2)\}.$$
By Proposition C.1.A, we know that for any a ∈ A, there exists a unique full
embedding $(W_1^a, \prec_1) \to (W_2, \prec_2)$. We denote this full embedding by $f_a$.
Claim 1: The set A is full. Moreover, for any a, b ∈ A, with b ≺ a, we have
$f_b = f_a|_{W_1^b}$.
Start with some a ∈ A, and let us prove that W1 (a) ⊂ A. Fix some arbitrary b ∈
W1 (a). Then the inclusion $\iota : (W_1^b, \prec_1) \hookrightarrow (W_1^a, \prec_1)$ is obviously a full embedding,
since we can write
W1b = W1 (c),
we also have
$$\varphi(A) = \varphi\Big(\bigcup_{a \in A} W_1^a\Big) = \bigcup_{a \in A} \varphi(W_1^a),$$
so there exists some a ∈ A, such that $y \in \varphi(W_1^a) = f_a(W_1^a)$. On the other hand,
since $f_a : (W_1^a, \prec_1) \to (W_2, \prec_2)$ is a full embedding, it follows that $f_a(W_1^a)$ is full,
so we get $W_2(y) \subset f_a(W_1^a) = \varphi(W_1^a) \subset \varphi(A)$.
We now finish the proof. Since both A and φ(A) are full, there are three cases
to examine
Case 1: A = W1 . In this case φ : (W1 , ≺1 ) → (W2 , ≺2 ) is a full embedding, so
we get α1 ≤ α2 .
Case 2: φ(A) = W2 . In this case φ : (A, ≺1 ) → (W2 , ≺2 ) is an order
isomorphism, so φ−1 : (W2 , ≺2 ) → (W1 , ≺1 ) is a full embedding, and we get α2 ≤
α1 .
Case 3: A ⊊ W1 and φ(A) ⊊ W2 . This means there exist a1 ∈ W1 and
a2 ∈ W2 such that A = W1 (a1 ) and φ(A) = W2 (a2 ). This case turns out to be
impossible. To see this, we define $\psi : W_1^{a_1} \to W_2$ by $\psi|_{W_1(a_1)} = \varphi$ and $\psi(a_1) = a_2$;
then $\psi : (W_1^{a_1}, \prec_1) \to (W_2, \prec_2)$ will be a full embedding. Indeed, the first
two conditions in the definition are clear, while the equality
$$\psi(W_1^{a_1}) = W_2^{a_2} = \{y \in W_2 : y \prec_2 a_2\},$$
proves that $\psi(W_1^{a_1})$ is full. The existence of ψ then forces a1 ∈ A, which contradicts
the equality A = W1 (a1 ).
Theorem C.2. Let α be an ordinal number. Then the class Pα of all ordinal
numbers β with β < α is a set. More explicitly, if (W, ≺) is a well-ordered set with
ord(W, ≺) = α, then the map
φ : W ∋ a ↦ ord(W (a), ≺) ∈ Pα
is a bijection. Moreover, (Pα , ≤) is well-ordered, and φ : (W, ≺) → (Pα , ≤) is an
order isomorphism.
Proof. Let β be an ordinal number with β < α. Then there exists a well-ordered
set (W1 , ≺1 ), and a full embedding φ : (W1 , ≺1 ) → (W, ≺), such that
• β = ord(W1 , ≺1 ),
• φ(W1 ) = W (a1 ),
for some a1 ∈ W . This fact already proves that Pα is a set.
Claim: The element a1 ∈ W does not depend on the particular choice of
(W1 , ≺1 ).
Indeed, if (W2 , ≺2 ) is another well-ordered set, and ψ : (W2 , ≺2 ) → (W, ≺) is
another full embedding with
• β = ord(W2 , ≺2 ),
• ψ(W2 ) = W (a2 ),
for some a2 ∈ W , then we would get the existence of an order isomorphism γ :
(W (a1 ), ≺) → (W (a2 ), ≺). We can assume (otherwise we replace γ with γ −1 ) that
a1 ≺ a2 . If a1 ≠ a2 , we would have a1 ∈ W (a2 ), so if we work with the well-ordered
set Z = W (a2 ) we would have an order isomorphism (Z, ≺) → (Z(a1 ), ≺). By
Corollary C.1 this is impossible. Therefore, we must have a1 = a2 .
Using the Claim, we then define aβ as the unique element in W , such that
ord(W (aβ ), ≺) = β. Define the map ψ : Pα ∋ β ↦ aβ ∈ W . It is clear that
φ ◦ ψ = IdPα .
Let us prove now that ψ ◦ φ = IdW . Start with some arbitrary a ∈ W , and put
β = φ(a) = ord(W (a), ≺). Since ord(W (a), ≺) = β, by the Claim, we must have
aβ = a, i.e. ψ(β) = a, which means that (ψ ◦ φ)(a) = a.
Finally, we note that, if a, b ∈ W are elements with a ≺ b, then the obvious full
embedding (W (a), ≺) ↪ (W (b), ≺) proves that ord(W (a), ≺) ≤ ord(W (b), ≺), i.e.
φ(a) ≤ φ(b).
Since φ is bijective, it is clear that, for a, b ∈ W , we have in fact the equivalence
a ≺ b ⇐⇒ φ(a) ≤ φ(b).
This proves that (Pα , ≤) is well-ordered, and φ : (W, ≺) → (Pα , ≤) is an order
isomorphism.
Proof. By Theorem C.1, (S, ≤) is totally ordered. Fix some non-empty subset
A ⊂ S, and let us show that A has a smallest element. Start with some arbitrary
α ∈ A. If α ≤ β, ∀ β ∈ A, we are done. Otherwise, the intersection A ∩ Pα is
non-empty. We then use the fact that (Pα , ≤) is well-ordered, to choose α1 to be
the smallest element of A ∩ Pα . If we start with some arbitrary β ∈ A, then either α ≤ β, in
which case we immediately get α1 < β, or β < α, in which case β ∈ A ∩ Pα , and
we again get α1 ≤ β. So α1 is in fact the smallest element of A.
Theorem C.3 (Well ordering Theorem). Every non-empty set has a well or-
dering.
Proof. Let
𝒲 = { (W, ≺) : (W, ≺) well-ordered, and W ⊂ X }.
For two elements (W1 , ≺1 ) and (W2 , ≺2 ) of 𝒲, we define (W1 , ≺1 ) ⊏ (W2 , ≺2 ), if and
only if W1 ⊂ W2 , and the inclusion map (W1 , ≺1 ) ↪ (W2 , ≺2 ) is a full embedding.
(This is equivalent to the fact that W1 is a full subset of (W2 , ≺2 ), and ≺1 = ≺2|W1 ,
the restriction of ≺2 to W1 .)
It is obvious that (𝒲, ⊏) is an ordered set. We want to apply Zorn's Lemma to this
set, so we need to check its hypothesis. Start with a totally ordered subset
T = { (Wi , ≺i ) : i ∈ I } ⊂ 𝒲, and let us show that T has an upper bound in 𝒲.
Define W = ⋃_{i∈I} Wi . For a, b ∈ W , we define a ≺ b, if and only if there exists
i ∈ I, such that a, b ∈ Wi , and a ≺i b. Let us check that (W, ≺) is a well-ordered
set. First of all, we need to show that ≺ is an order relation on W . It is clear
that a ≺ a, ∀ a ∈ W . Suppose a, b ∈ W satisfy a ≺ b and b ≺ a, and let us show
that a = b. We know there exist i, j ∈ I such that a, b ∈ Wi and a ≺i b, and
a, b ∈ Wj and b ≺j a. Now there are two possibilities: either (Wi , ≺i ) ⊏ (Wj , ≺j ),
or (Wj , ≺j ) ⊏ (Wi , ≺i ). In the first case, since ≺i is the restriction of ≺j to Wi ,
we get a ≺i b and b ≺i a, so we would
get a = b. In the other case, by symmetry, we again get a = b. Let us show now
transitivity. Suppose a, b, c ∈ W satisfy a ≺ b and b ≺ c, and let us show that
a ≺ c. We know there exist i, j ∈ I, such that a, b ∈ Wi and a ≺i b, and b, c ∈ Wj
and b ≺j c. As above, we have two possibilities: either (Wi , ≺i ) ⊏ (Wj , ≺j ), or
(Wj , ≺j ) ⊏ (Wi , ≺i ). In the first case we get a, b, c ∈ Wj and a ≺j b ≺j c, so we get
a ≺j c. In the second case, we get a, b, c ∈ Wi and a ≺i b ≺i c, so we get a ≺i c. In
either case we get a ≺ c.
Next we show that (W, ≺) is totally ordered. Start with arbitrary a, b ∈ W , and
let us prove that either a ≺ b or b ≺ a. If we choose i, j ∈ I such that a ∈ Wi and
b ∈ Wj , then using the two possibilities (Wi , ≺i ) ⊏ (Wj , ≺j ) or (Wj , ≺j ) ⊏ (Wi , ≺i )
we immediately see that we can find k ∈ I (k is either i or j), such that a, b ∈ Wk .
Then using the fact that (Wk , ≺k ) is totally ordered, we either have a ≺k b, or
b ≺k a. This gives either a ≺ b, or b ≺ a.
In order to prove that (W, ≺) is well-ordered, and (Wi , ≺i ) ⊏ (W, ≺), ∀ i ∈ I,
we shall use the following
Claim: For any i ∈ I, one has the implication:
a ∈ Wi =⇒ W (a) ⊂ Wi .
Indeed, if there exists some b ∈ W (a), but b ∉ Wi , this would mean that there
exists some j ∈ I, with a, b ∈ Wj , b ≺j a, and b ≠ a. Since b ∈ Wj but b ∉ Wi , we
cannot have (Wj , ≺j ) ⊏ (Wi , ≺i ), so this would force
(Wi , ≺i ) ⊏ (Wj , ≺j ), and b ∈ Wj (a). But this is impossible, since the fact that Wi
is full in (Wj , ≺j ) would force b ∈ Wj (a) ⊂ Wi .
Let us show now that (W, ≺) is well-ordered. Start with some arbitrary non-empty
subset A ⊂ W . Choose i ∈ I, such that A ∩ Wi ≠ ∅, and take a to be the
smallest element in A ∩ Wi , in the well-ordered set (Wi , ≺i ), i.e.
(5) a ∈ A ∩ Wi , and a ≺i x, ∀ x ∈ A ∩ Wi .
Let us prove that a is in fact the smallest element of A, in (W, ≺). Start with some
arbitrary element b ∈ A, and let us prove that a ≺ b. Assume the opposite. Since
(W, ≺) is totally ordered, this means that b ≺ a, and b ≠ a,
i.e. b ∈ W (a). By the Claim, however, this will force b ∈ Wi , so we would get
b ∈ A ∩ Wi , and the choice of a would give a ≺i b, which would then give a ≺ b,
thus contradicting the assumption on b.
We now prove (Wi , ≺i ) ⊏ (W, ≺), ∀ i ∈ I. It is clear that the inclusion map
ι : (Wi , ≺i ) ↪ (W, ≺) satisfies the first two conditions in the definition of full
embeddings, so the only thing we need is the fact that Wi is full in (W, ≺). But
this is precisely the content of the above Claim.
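The role of fullness in this argument can be seen by comparing two chains of finite well-ordered subsets. The following is a sketch under stated assumptions: the ambient set X = ℤ with its usual order, and the families V_n and U_n, are hypothetical examples chosen for illustration; "full" is used in the sense of the Claim (a ∈ W_i implies W(a) ⊂ W_i).

```latex
% Hypothetical chain with full inclusions: V_n = \{0, 1, \dots, n\} \subset \mathbb{Z}.
\[
  V_n(a) = \{0, \dots, a-1\} \subset V_n \ \text{ for all } a \in V_n ,
  \qquad
  \bigcup_{n \ge 0} V_n = \{0, 1, 2, \dots\} \ \text{ is well-ordered.}
\]
% Hypothetical chain with non-full inclusions: U_n = \{-n, \dots, -1, 0\} \subset \mathbb{Z}.
\[
  -(n+1) \in U_{n+1}(-n), \quad -(n+1) \notin U_n ,
  \qquad
  \bigcup_{n \ge 0} U_n = \{\dots, -2, -1, 0\} \ \text{ has no smallest element.}
\]
```

In the second chain each inclusion U_n ⊂ U_{n+1} preserves the order, but U_n is not full in U_{n+1}, so the Claim fails and the union is totally ordered without being well-ordered.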
Having shown that every totally ordered subset T ⊂ 𝒲 has an upper bound, we
now invoke Zorn's Lemma, to get the existence of a maximal element (W, ≺) ∈ 𝒲.
The proof of the Theorem will be finished once we prove that W = X. We prove
this equality by contradiction. Assume W ⊊ X. Pick an element x ∈ X ∖ W , and
define the set W1 = W ∪ {x}. Equip W1 with the order relation ≺1 defined by
a ≺1 b ⟺ (a, b ∈ W and a ≺ b) or b = x.
It is pretty obvious that W = W1 (x) and ≺ = ≺1|W , so (W1 , ≺1 ) is well-ordered
and (W, ≺) ⊏ (W1 , ≺1 ). Since W ⊊ W1 , this would contradict the maximality of (W, ≺).
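The well-ordering of the one-point extension can be spelled out as follows. This is a sketch of the omitted verification, not a quotation from the text; the symbol min denotes the smallest element with respect to the indicated order, and the `cases` environment assumes the amsmath package.

```latex
% For a non-empty subset A \subset W_1 = W \cup \{x\}, where x sits above all of W:
\[
  \min\nolimits_{\prec_1} A =
  \begin{cases}
    \min\nolimits_{\prec} (A \cap W), & \text{if } A \cap W \neq \varnothing, \\
    x, & \text{if } A = \{x\}.
  \end{cases}
\]
% The strict initial segment of x in W_1 is all of W:
\[
  W_1(x) = \{ a \in W_1 : a \prec_1 x,\ a \neq x \} = W .
\]
```

In the first case the smallest element of A ∩ W also lies below x, because every a ∈ W_1 satisfies a ≺1 x. The second display shows that W is a full subset of (W_1, ≺1).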
Comment. An interesting consequence of the Well-Ordering Theorem is the
following: For any cardinal number a, there exists an ordinal number α, such that
card α = a.
Another interesting application is the following:
Corollary C.3. If C is a set of cardinal numbers, then (C, ≤) is well-ordered.
Proof. For any a ∈ C we choose a well-ordered set (Wa , ≺a ) with card Wa = a.
Choose any set X with
a < card X, ∀ a ∈ C.
(For example, we can take Y = ⋃_{a∈C} Wa , so that a ≤ card Y , ∀ a ∈ C, and then we
The inclusion A ⊂ PΩ is clear. To prove the other inclusion, we start with some
ordinal number α < Ω and we see that this forces α ∈ Pγ2 , so it will be impossible to
have ℵ0 < card α, because this would give α ∈ Pγ2 ∖ A, contradicting the minimality
of Ω.
The ordinal number Ω is called the smallest uncountable ordinal number.
Fact 1: The set PΩ is uncountable.
This follows from the fact that
Ω = ord(PΩ , ≤),
which gives ℵ0 < card Ω = card PΩ .
Fact 2: The cardinal number ℵ1 = card Ω = card PΩ is the smallest uncountable
cardinal number.
Indeed, start with some cardinal number a < ℵ1 , and choose a well-ordered
set (W, ≺) with card W = a. Since card W < card Ω, we must
have ord(W, ≺) < Ω, which, by the minimality of Ω, forces a ≤ ℵ0 .
Fact 3: Any countable subset A ⊂ PΩ has a strict upper bound in PΩ , that
is, there exists β ∈ PΩ , such that α < β, ∀ α ∈ A.
We prove this by contradiction. Assume A has no strict upper bound in PΩ , which
means that for every β ∈ PΩ , there exists some α ∈ A such that β ≤ α. This gives
(6) PΩ = ⋃_{α∈A} (Pα ∪ {α}).
But for every α ∈ PΩ we have ord Pα = α, which forces card Pα ≤ ℵ0 . Then the
fact that A is countable, combined with (6) will force PΩ to be countable, which is
impossible.
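The counting step behind this contradiction can be written as a single chain of estimates. This is a sketch that assumes the standard fact that a countable union of countable sets is countable (ℵ0 · ℵ0 = ℵ0), which is not proved in this passage.

```latex
% Each \beta \in P_\Omega satisfies \beta \le \alpha for some \alpha \in A,
% i.e. \beta \in P_\alpha \cup \{\alpha\}; this is equality (6).
\[
  P_\Omega = \bigcup_{\alpha \in A} \bigl( P_\alpha \cup \{\alpha\} \bigr),
  \qquad
  \mathrm{card}\bigl( P_\alpha \cup \{\alpha\} \bigr) \le \aleph_0
  \ \ \forall\, \alpha \in A,
  \qquad
  \mathrm{card}\, A \le \aleph_0 ,
\]
\[
  \mathrm{card}\, P_\Omega \;\le\; \aleph_0 \cdot \aleph_0 \;=\; \aleph_0 ,
\]
% which contradicts Fact 1: \aleph_0 < \mathrm{card}\, P_\Omega .
```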
The above construction can be generalized to arbitrary cardinal numbers, giving
the following
Fact 4: Given any cardinal number a, there exists a smallest ordinal number
Ωa with a < card Ωa , and the cardinal number a′ = card Ωa is the smallest
cardinal number with a < a′ . Any set A ⊂ PΩa , with card A ≤ a, has a
strict upper bound in PΩa .