
REAL ANALYSIS

FALL 2001

Gabriel Nagy

Kansas State University

© Gabriel Nagy
Chapter I
Topology Preliminaries
Lecture 1

1. Review of basic topology concepts


In this lecture we review some basic notions from topology, the main goal being
to set up the language. Except for one result (Urysohn's Lemma) there will be no
proofs.
Definitions. A topology on a (non-empty) set X is a family T of subsets of
X, which are called open sets, with the following properties:
(top1 ): both the empty set ∅ and the total set X are open;
(top2 ): an arbitrary union of open sets is open;
(top3 ): a finite intersection of open sets is open.
In this case the system (X, T ) is called a topological space.
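For a finite set the three axioms can be checked mechanically. The following is only an illustrative sketch, not part of the original notes; the function name `is_topology` is hypothetical, and the code assumes that X and the family T are finite.

```python
from itertools import combinations

def is_topology(X, T):
    """Check (top1)-(top3) for a finite family T of subsets of a finite set X."""
    X = frozenset(X)
    T = {frozenset(D) for D in T}
    # (top1): the empty set and the total set must be open
    if frozenset() not in T or X not in T:
        return False
    for D, E in combinations(T, 2):
        # (top2): for a finite family, closure under pairwise unions suffices
        if D | E not in T:
            return False
        # (top3): closure under pairwise (hence finite) intersections
        if D & E not in T:
            return False
    return True

print(is_topology({1, 2, 3}, [set(), {1}, {1, 2}, {1, 2, 3}]))
print(is_topology({1, 2, 3}, [set(), {1}, {2}, {1, 2, 3}]))
```

The second family fails (top2), since {1} ∪ {2} is missing.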
If (X, T ) is a topological space and x ∈ X is an element in X, a subset N ⊂ X
is called a neighborhood of x if there exists some open set D such that x ∈ D ⊂ N .
A collection N of neighborhoods of x is called a basic system of neighborhoods
of x, if for any neighborhood M of x, there exists some neighborhood N in N such
that x ∈ N ⊂ M .
A collection V of neighborhoods of x is called a fundamental system of neighbor-
hoods of x if for any neighborhood M of x there exists a finite sequence V1 , V2 , . . . , Vn
of neighborhoods in V such that x ∈ V1 ∩ V2 ∩ · · · ∩ Vn ⊂ M .
A topology is said to have the Hausdorff property if:
(h) for any x, y ∈ X with x ≠ y, there exist open sets U ∋ x and V ∋ y such
that U ∩ V = ∅.
If (X, T ) is a topological space, a subset F ⊂ X will be called closed, if its
complement X ∖ F is open. The following properties are easily derived from the
definition:
(c1 ) both the empty set ∅ and the total set X are closed;
(c2 ) an arbitrary intersection of closed sets is closed;
(c3 ) a finite union of closed sets is closed.
Using the above properties of open/closed sets, one can perform the following
constructions. Let (X, T ) be a topological space and A ⊂ X be an arbitrary subset.
Consider the set Int(A) to be the union of all open sets D with D ⊂ A and consider
the set $\overline{A}$ to be the intersection of all closed sets F with F ⊃ A. The set Int(A)
(sometimes denoted simply by $\mathring{A}$) is called the interior of A, while the set $\overline{A}$ is
called the closure of A. The properties of these constructions are summarized in
the following:
Proposition 1.1. Let (X, T ) be a topological space, and let A be an arbitrary
subset of X.
A. (Properties of the interior)


(i) The set Int(A) is open and Int(A) ⊂ A.
(ii) If D is an open set such that D ⊂ A, then D ⊂ Int(A).
(iii) x belongs to Int(A) if and only if A is a neighborhood of x.
(iv) A is open if and only if A = Int(A).
B. (Properties of the closure)
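(i) The set $\overline{A}$ is closed and $\overline{A} \supset A$.
(ii) If F is a closed set with F ⊃ A, then $F \supset \overline{A}$.
(iii) A point x belongs to $\overline{A}$ if and only if A ∩ N ≠ ∅ for any neighborhood
N of x.
(iv) A is closed if and only if $A = \overline{A}$.
C. (Relationship between interior and closure) $\operatorname{Int}(X \setminus A) = X \setminus \overline{A}$ and
$\overline{X \setminus A} = X \setminus \operatorname{Int}(A)$.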
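On a finite space the interior and the closure can be computed directly from their definitions, which gives a quick sanity check of part C. This is a sketch under the assumption that X and T are finite; the names `interior` and `closure` are illustrative and do not come from the notes.

```python
def interior(A, T):
    """Union of all open sets D (members of T) with D contained in A."""
    A = frozenset(A)
    result = frozenset()
    for D in T:
        if frozenset(D) <= A:
            result |= frozenset(D)
    return result

def closure(A, X, T):
    """Intersection of all closed sets F with F containing A."""
    X, A = frozenset(X), frozenset(A)
    result = X
    for D in T:
        F = X - frozenset(D)  # closed sets are complements of open sets
        if A <= F:
            result &= F
    return result

X = {1, 2, 3}
T = [set(), {1}, {1, 2}, {1, 2, 3}]
A = {2}
print(interior(A, T), closure(A, X, T))
# Part C for this particular A
print(interior(X - A, T) == frozenset(X) - closure(A, X, T))
print(closure(X - A, X, T) == frozenset(X) - interior(A, T))
```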
Definition. Suppose (X, T ) is a topological space. Assume A ⊂ X is a subset
of X. On A we can introduce a natural topology, sometimes denoted by T |A which
consists of all subsets of A of the form A ∩ U with U open set in X. This topology
is called the relative (or induced ) topology.
Remark 1.1. If A is already open in the topology T , then a subset V ⊂ A
is open in the induced topology if and only if V is open in the topology T (this
follows from the fact that the intersection of any two open sets in T is again an
open set in T ).
Definition. Suppose (X, T ) and (Y, S) are topological spaces and x is an
element in X. A map f : X −→ Y is said to be continuous at x, if for any
neighborhood N of f (x) in the topology S (on Y ), the set
f⁻¹(N) = {z ∈ X | f(z) ∈ N}
is a neighborhood of x in the topology T (on X).
If f is continuous at every point in X, then f is said to be continuous.
Continuity is “well behaved” with respect to compositions:
Proposition 1.2. Suppose (X, T ), (Y, S), and (Z, Z) are topological spaces,
and $X \xrightarrow{f} Y \xrightarrow{g} Z$ are two functions.
(i) If f is continuous at a point x ∈ X, and if g is continuous at f (x), then
g ◦ f is continuous at x.
(ii) If f and g are (globally) continuous, then so is g ◦ f .
The identity map on a topological space is always continuous.
In terms of open/closed sets, the characterization of continuity is given by the
following.
Proposition 1.3. If (X, T ), (Y, S) are topological spaces and f : (X, T ) →
(Y, S) is a map, then the following are equivalent:
(i) f is continuous.
(ii) Whenever U ⊂ Y is an open set, it follows that f⁻¹(U) is also an open
set (in X).
(iii) Whenever F ⊂ Y is a closed set, it follows that f⁻¹(F) is also a closed
set (in X).
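Condition (ii) gives a finite test for continuity when both spaces are finite. The sketch below assumes that a map is represented as a Python dict from points of X to points of Y; the helper names `preimage` and `is_continuous` are hypothetical.

```python
def preimage(f, V):
    """f^{-1}(V) for a map f given as a dict {point of X: point of Y}."""
    return frozenset(x for x in f if f[x] in V)

def is_continuous(f, TX, TY):
    """Condition (ii) of Proposition 1.3: preimages of open sets are open."""
    opens_X = {frozenset(D) for D in TX}
    return all(preimage(f, U) in opens_X for U in TY)

TX = [set(), {1}, {1, 2}, {1, 2, 3}]
TY = [set(), {'a'}, {'a', 'b'}]
print(is_continuous({1: 'a', 2: 'a', 3: 'b'}, TX, TY))
print(is_continuous({1: 'b', 2: 'a', 3: 'a'}, TX, TY))
```

For the second map the preimage of {'a'} is {2, 3}, which is not open in X.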
We conclude this section with a useful technical result.
Theorem 1.1 (Urysohn’s Lemma). Let (X, T ) be a topological Hausdorff space
with the following property:
(n) For any two disjoint closed sets A, B ⊂ X, there exist two disjoint open
sets U, V ⊂ X, such that U ⊃ A and V ⊃ B.
Then for any two disjoint closed sets A, B ⊂ X, there exists a continuous function
f : X → [0, 1] such that $f|_A = 0$ and $f|_B = 1$.

Proof. We begin with a refinement of property (n):
(n′) For any disjoint closed sets A, B ⊂ X, there exist two open sets U, W ⊂ X,
such that $A \subset U$, $\overline{U} \subset W$, and $\overline{W} \cap B = \emptyset$.
To prove (n′), we first apply (n) to find two disjoint open sets W, Z ⊂ X such that
(1) W ⊃ A and Z ⊃ B.
Next we apply (n) again to the pair of closed sets A and X ∖ W (which are disjoint,
since A ⊂ W ), and find two disjoint open sets U, V ⊂ X such that
(2) U ⊃ A and V ⊃ X ∖ W.
On the one hand, using the fact that U ∩ V = ∅ and the fact that V is open, we
get the inclusion $\overline{U} \subset X \setminus V$. Using (2) this gives
$$\overline{U} \subset X \setminus V \subset W.$$
On the other hand, using the fact that W ∩ Z = ∅ and the fact that Z is open, we
get $\overline{W} \subset X \setminus Z$. But using (1) this will give
$$\overline{W} \subset X \setminus Z \subset X \setminus B,$$
and we are done.
To prove the Theorem, start with two disjoint closed sets A, B ⊂ X. For every
integer n ≥ 0 we define the set $D_n = \{ k/2^n : k \in \mathbb{Z},\ 0 \le k \le 2^n \}$, and we consider
$$D = \bigcup_{n=0}^{\infty} D_n.$$
(Notice that $D_n \subset D_{n+1}$, for all n ≥ 0.)


We are going to construct a family $(V_t)_{t \in D}$ of open sets in X with the following
properties:
(i) $V_0 \supset A$ and $\overline{V_1} \cap B = \emptyset$;
(ii) $\overline{V_t} \subset V_s$, for all t, s ∈ D with t < s.
Let us start by constructing $V_0$ and $V_1$. We use property (n′) to find open sets
U, W ⊂ X, with
$$A \subset U \subset \overline{U} \subset W \quad\text{and}\quad \overline{W} \cap B = \emptyset,$$
and we simply take $V_0 = U$ and $V_1 = W$.
The construction of the family $(V_t)_{t \in D}$ is carried out recursively. Assume, for
some integer n ≥ 0, we have constructed the sets $(V_t)_{t \in D_n}$ with properties (i) and (ii)
(satisfied for $t, s \in D_n$), and let us construct the next block of sets $(V_t)_{t \in D_{n+1} \setminus D_n}$.
We start off by observing that for every $t \in D_{n+1} \setminus D_n$, the numbers
$$t_\pm = t \pm \frac{1}{2^{n+1}}$$
belong to $D_n$. Apply (n′) to the pair of disjoint closed sets $\overline{V_{t_-}}$ and $X \setminus V_{t_+}$ to
find two open sets U, W ⊂ X such that
$$\overline{V_{t_-}} \subset U \subset \overline{U} \subset W \quad\text{and}\quad \overline{W} \cap (X \setminus V_{t_+}) = \emptyset.$$
Notice that the equality $\overline{W} \cap (X \setminus V_{t_+}) = \emptyset$, coupled with the inclusion $\overline{U} \subset W$,
gives $\overline{U} \cap (X \setminus V_{t_+}) = \emptyset$, so we get $\overline{U} \subset V_{t_+}$. We can then define $V_t = U$, and we will
obviously have the inclusions
(3) $\overline{V_{t_-}} \subset V_t \subset \overline{V_t} \subset V_{t_+}$.
Now the extended family $(V_t)_{t \in D_{n+1}}$ will also satisfy property (ii), since for
$t, s \in D_{n+1}$ with t < s, one of the following will hold:
• either $t, s \in D_n$, or
• $t \in D_n$, $s \in D_{n+1} \setminus D_n$, and $t \le s_-$, or
• $t \in D_{n+1} \setminus D_n$, $s \in D_n$, and $t_+ \le s$, or
• $t, s \in D_{n+1} \setminus D_n$, and $t_+ \le s_-$.
(In each case, one uses (3) combined with the inductive hypothesis.)
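The arithmetic fact used in the recursion, namely that both neighbours t ± 1/2^(n+1) of a new point of level n+1 already belong to level n, can be checked with exact rational arithmetic. This is only an illustration; the function names `dyadic_level` and `neighbours` are hypothetical.

```python
from fractions import Fraction

def dyadic_level(n):
    """The set D_n = { k / 2^n : 0 <= k <= 2^n } as exact fractions."""
    return {Fraction(k, 2**n) for k in range(2**n + 1)}

def neighbours(t, n):
    """For t in D_{n+1} minus D_n, return (t_minus, t_plus) = t -/+ 1/2^(n+1)."""
    step = Fraction(1, 2**(n + 1))
    return t - step, t + step

for n in range(4):
    old, new = dyadic_level(n), dyadic_level(n + 1)
    for t in sorted(new - old):
        t_minus, t_plus = neighbours(t, n)
        # both neighbours were already constructed at level n
        assert t_minus in old and t_plus in old
print("checked levels 0 through 4")
```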
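Having constructed the family $(V_t)_{t \in D}$, with properties (i) and (ii), we define
the function f : X → [0, 1] by
$$f(x) = \begin{cases} \inf\{t \in D : x \in V_t\}, & \text{if } x \in V_1, \\ 1, & \text{if } x \notin V_1. \end{cases}$$
Claim 1: The function f is equivalently defined by
$$(4)\qquad f(x) = \begin{cases} 0, & \text{if } x \in \overline{V_0}, \\ \sup\{t \in D : x \notin \overline{V_t}\}, & \text{if } x \notin \overline{V_0}. \end{cases}$$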
Let g : X → [0, 1] denote the function defined by formula (4). Fix
some point x ∈ X. We break the proof into several cases.
Case I: $x \in \overline{V_0}$.
In particular, using (ii) we get $x \in V_t$, for all t ∈ D, with t > 0, and since
$x \in V_1$, we have
$$f(x) = \inf\{t \in D : x \in V_t\} = \inf\{t \in D : t > 0\} = 0 = g(x).$$
Case II: $x \notin V_1$.
Using (ii) we have $x \notin \overline{V_t}$, for all t ∈ D, with t < 1, and since $x \notin \overline{V_0}$, we have
$$g(x) = \sup\{t \in D : x \notin \overline{V_t}\} = \sup\{t \in D : t < 1\} = 1 = f(x).$$
Case III: $x \in V_1 \setminus \overline{V_0}$.
By the definition of f(x) we know:
(5) $x \notin V_t$, ∀ t ∈ D, with t < f(x);
(6) ∀ ε > 0, ∃ $s_\varepsilon \in D$, with $f(x) \le s_\varepsilon < f(x) + \varepsilon$, such that $x \in V_{s_\varepsilon}$.
By the definition of g(x) we know:
(7) $x \in \overline{V_t}$, ∀ t ∈ D, with t > g(x);
(8) ∀ ε > 0, ∃ $r_\varepsilon \in D$, with $g(x) \ge r_\varepsilon > g(x) - \varepsilon$, such that $x \notin \overline{V_{r_\varepsilon}}$.
Using (6) and (8) we see that we must have
(9) $s_\varepsilon \ge r_\varepsilon$, ∀ ε > 0.
Indeed, if there exists some ε > 0 for which we have $s_\varepsilon < r_\varepsilon$, then using (6) and (ii) we
would have
$$x \in V_{s_\varepsilon} \subset \overline{V_{s_\varepsilon}} \subset V_{r_\varepsilon} \subset \overline{V_{r_\varepsilon}},$$
which contradicts (8).
Now the inequality (9) gives
$$f(x) + \varepsilon > g(x) - \varepsilon, \quad \forall\, \varepsilon > 0,$$
so we have in fact the inequality
$$f(x) \ge g(x).$$
Suppose now this inequality is strict. Using (5) and (7) we will get
(10) $x \in \overline{V_t}$ and $x \notin V_t$, for all t ∈ D, with f(x) > t > g(x).
Using the fact that D is dense in [0, 1], we could then find at least two elements
$t_1, t_2 \in D$ such that
$$f(x) > t_1 > t_2 > g(x).$$
In this case (10) immediately creates a contradiction, since by (ii)
$$x \in \overline{V_{t_2}} \subset V_{t_1}.$$
Claim 2: The function f is continuous.
Since any open set in R is a union of open intervals, it suffices to prove the
following two properties¹:
(usc): $f^{-1}\big((-\infty, t)\big)$ is open for all t ∈ R;
(lsc): $f^{-1}\big((t, \infty)\big)$ is open for all t ∈ R.
In order to prove property (usc) it suffices to prove the equality
$$(11)\qquad f^{-1}\big((-\infty, t)\big) = \bigcup_{s \in D,\ s < t} V_s.$$
Start with a point $x \in f^{-1}\big((-\infty, t)\big)$, which means that f(x) < t. Using (6), there
exists some s ∈ D with f(x) ≤ s < t, such that $x \in V_s$, so x indeed belongs to
the right hand side of (11). Conversely, if x belongs to the right hand side of (11),
there exists some s < t such that $x \in V_s$. By the definition of f(x), it follows that
f(x) ≤ s < t, so $x \in f^{-1}\big((-\infty, t)\big)$.
In order to prove property (lsc) it suffices to prove the equality
$$(12)\qquad f^{-1}\big((t, \infty)\big) = \bigcup_{r \in D,\ r > t} \big(X \setminus \overline{V_r}\big).$$
Start with a point $x \in f^{-1}\big((t, \infty)\big)$, which means that f(x) > t. Using (8), combined
with Claim 1, there exists some r ∈ D with f(x) ≥ r > t, such that $x \notin \overline{V_r}$, that is,
$x \in X \setminus \overline{V_r}$, so x indeed belongs to the right hand side of (12). Conversely, if x
belongs to the right hand side of (12), there exists some r > t such that
$x \in X \setminus \overline{V_r}$, i.e. $x \notin \overline{V_r}$. By the equivalent definition of f(x) given by Claim 1, it
follows that f(x) ≥ r > t, so $x \in f^{-1}\big((t, \infty)\big)$.


¹ The condition (usc) means that f is upper semi-continuous, while the condition (lsc)
means that f is lower semi-continuous.

Having proven that f is continuous, let us finish the proof. Since $A \subset V_0$, by
the definition of f , we get $f|_A = 0$. Since $B \subset X \setminus V_1$, again by the definition of
f , we get $f|_B = 1$. □
Definition. A Hausdorff space (X, T ) with property (n) is called normal.
Lecture 2

2. Ultrafilters
In this lecture we discuss a set theoretical concept, which turns out to be
technically useful in topology.
Definition. Suppose X is a fixed (non-empty) set. A filter in X is a (non-empty)
family F of non-empty subsets of X which has the property²:
(f) Whenever F and G belong to F, it follows that F ∩ G also belongs to F.
What is important here is that all the sets in the filter are assumed to be non-
empty. The set of all filters in X can be ordered by inclusion. A simple application
of Zorn’s Lemma yields:
• For each filter F there exists at least one maximal filter U with U ⊃ F.
Maximal filters will be called ultrafilters.
An interesting feature of ultrafilters is given by the following:
Lemma 2.1. Let X be a non-empty set, and let U be a filter on X. The
following are equivalent:
(i) U is an ultrafilter.
(ii) For any subset A ⊂ X, it follows that either A or X ∖ A belongs to U,
but not both!
Proof. (i) ⇒ (ii). Assume U is an ultrafilter. First remark that X always
belongs to U. (Otherwise, if X does not belong to U, the family U ∪ {X} will be
obviously a new filter which will contradict the maximality of U).
Let us assume that A is non-empty and it does not belong to U. This means
that the family
M = U ∪ {A ∩ U | U ∈ U}
is no longer a filter (otherwise, the maximality of U will be contradicted). Note that
if F and G belong to M, then automatically F ∩ G belongs to M. This means that
the only thing that can prevent M from being a filter, must be the fact that one
of the sets in M is empty. That is, there is some set V ∈ U such that A ∩ V = ∅.
In other words, V ⊂ X ∖ A. But then, it follows that for any U ∈ U we have
U ∩ (X ∖ A) ⊃ U ∩ V ≠ ∅ and then the set
N = U ∪ {U ∩ (X ∖ A) | U ∈ U }
will be a filter. By maximality, it follows that N = U, in particular, X ∖ A belongs
to U. It is obvious that A and X ∖ A cannot simultaneously belong to U, because
this will force ∅ = A ∩ (X ∖ A) to belong to U.
² Some textbooks may use a slightly different definition.


(ii) ⇒ (i). Assume property (ii) holds, but U is not maximal, which means
that there exists some ultrafilter V with V ⊋ U. Pick then some set A ∈ V ∖ U.
Since A ∉ U, by (ii) we must have X ∖ A ∈ U. This would force both A and X ∖ A
to belong to V, which is impossible. □
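On a small finite set, Lemma 2.1 can be explored by brute force. The sketch below uses the definition of filter adopted in these notes (a non-empty family of non-empty sets closed under intersection); all function names are hypothetical, and the enumeration is only feasible for very small X.

```python
from itertools import chain, combinations

def powerset(X):
    """All subsets of a finite collection X, as frozensets."""
    items = list(X)
    subsets = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(c) for c in subsets]

def is_filter(F):
    """Non-empty family of non-empty sets, closed under pairwise intersection."""
    F = set(F)
    return bool(F) and all(F) and all(A & B in F for A in F for B in F)

def satisfies_ii(U, X):
    """Condition (ii) of Lemma 2.1: exactly one of A and its complement is in U."""
    X = frozenset(X)
    return all((A in U) != ((X - A) in U) for A in powerset(X))

def principal(x, X):
    """The family of all subsets of X that contain the point x."""
    return {A for A in powerset(X) if x in A}

X = frozenset({1, 2, 3})
nonempty = [A for A in powerset(X) if A]
filters = [F for F in powerset(nonempty) if is_filter(F)]
# maximal filters with respect to inclusion, found by exhaustive search
maximal = [F for F in filters if not any(F < G for G in filters)]
print(len(maximal), all(satisfies_ii(F, X) for F in maximal))
```

For this three-point set the search finds exactly the three families `principal(x, X)`, in line with the lemma.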
Exercise 1. Let U be an ultrafilter on X, and let A ∈ U. Prove that the
collection
$U|_A = \{U \cap A : U \in U\}$
is an ultrafilter on A.
Remark 2.1. If U is an ultrafilter on X, and A ∈ U, then U contains all sets
B with A ⊂ B ⊂ X. Indeed, if we start with such a B, then by the above result,
either B ∈ U or X ∖ B ∈ U. Notice however that in the case X ∖ B ∈ U we would
get
U ∋ (X ∖ B) ∩ A = ∅,
which is impossible. Therefore B must belong to U.
We are in position now to define the notion of convergence for ultrafilters, by
means of the following.
Proposition 2.1. Let (X, T ) be a topological space, let U be an ultrafilter in
X, and let x be a point in X. The following are equivalent:
(i) Every neighborhood of x belongs to U.
(ii) There exists N a basic system of neighborhoods of x, with N ⊂ U.
(iii) There exists V a fundamental system of neighborhoods of x, with V ⊂ U.
If the ultrafilter U satisfies one of the equivalent conditions above, we say that
U is convergent to x, and we write U → x.
Proof. The implications (i) ⇒ (ii) ⇒ (iii) are obvious.
(iii) ⇒ (i). Let V be a fundamental system of neighborhoods of x, with V ⊂ U.
Start with an arbitrary neighborhood M of x. By the properties of V, there exists
a finite sequence V1 , . . . , Vn ∈ V, with
x ∈ V1 ∩ · · · ∩ Vn ⊂ M.
Since V ⊂ U, and U is a filter, it follows that the intersection W = V1 ∩ · · · ∩ Vn
belongs to U. By Remark 2.1 it follows that M itself belongs to U. Since M was
arbitrary, it follows that U indeed satisfies condition (i). □
The Hausdorff property has a nice ultrafilter characterization:
Proposition 2.2. For a topological space (X, T ), the following are equivalent:
(i) The topology T is Hausdorff.
(ii) Every convergent ultrafilter in X has a unique limit.
Proof. (i) ⇒ (ii). Assume the topology is Hausdorff. Let U be an ultrafilter in
X which is convergent to both x and y. If x ≠ y, then by the Hausdorff property,
there exist two open sets U, V ⊂ X, with x ∈ U , y ∈ V , and U ∩ V = ∅. Since U
is a neighborhood of x, we must have U ∈ U. Likewise, we must have V ∈ U. But
this is impossible, since it will force U ∋ U ∩ V = ∅.
(ii) ⇒ (i). Assume X satisfies condition (ii), but the topology is not Hausdorff.
This means that there exist two points x, y ∈ X, with x ≠ y, such that
(∗) for any open sets U, V ⊂ X, with U ∋ x and V ∋ y, we have U ∩ V ≠ ∅.
Let Nx denote the collection of all neighborhoods of x, and Ny denote the collection
of all neighborhoods of y. By condition (∗) we have
M ∩ N ≠ ∅, ∀ M ∈ Nx , N ∈ Ny .
This proves that the collection
F = {M ∩ N : M ∈ Nx , N ∈ Ny }
is a filter in X. Notice that, since X is a neighborhood for both x and y, we have
the inclusion F ⊃ Nx ∪ Ny . So if we take U to be an ultrafilter, with U ⊃ F, it
follows that U ⊃ Nx , hence U converges to x, but also U ⊃ Ny , hence U is also
convergent to y. By condition (ii) this is impossible. □

Examples 2.1. A. Let x be a point in X. We can consider the collection
Ux = {U ⊂ X | U ∋ x}. Clearly Ux is an ultrafilter in X. This is called a constant
ultrafilter at x. If (X, T ) is a topological space, then it is obvious that Ux is
convergent to x.
B. (Example of a convergent non-constant ultrafilter.) Suppose (X, T ) is a
topological space and x is a point in X such that for any neighborhood N of x, we
have N ∖ {x} ≠ ∅. Consider the collection
F = {N ∖ {x} | N neighborhood of x}.
Then F is a filter. If we take U any ultrafilter which contains F, we get a
non-constant (sometimes called free) ultrafilter. Since every neighborhood N of x
contains the set N ∖ {x}, which belongs to U, Remark 2.1 shows that U is convergent
to x.
C. (Example of a non-convergent ultrafilter.) Let N be the set of non-negative
integers. Equip N with the discrete topology (in which every subset is open).
Consider the collection F consisting of all subsets F ⊂ N which have finite complement
N ∖ F . It is easy to check that F is a filter. Pick then U to be any ultrafilter with
U ⊃ F. Since on N we use the discrete topology, it follows that the only convergent
ultrafilters are the constant ones. Note however, that if n ∈ N, then the set N ∖ {n}
belongs to F, hence to U. This means that the singleton set {n} cannot belong to
U. Therefore U cannot be constant.
Remark 2.2. Maps between sets can be put to act on ultrafilters. More
explicitly one has the following construction. Suppose f : X → Y is a map and U
is an ultrafilter in X. Consider the collection
f∗(U) = {V ⊂ Y | f⁻¹(V) ∈ U}.
Then f∗(U) is an ultrafilter on Y . Indeed, it is easy to show that f∗(U) is a filter.
To prove that it is maximal, let us take F a filter on Y with F ⊃ f∗(U) and let us
consider an arbitrary set F which belongs to F. Since U is an ultrafilter on X it
follows that either f⁻¹(F) or X ∖ f⁻¹(F) belongs to U. If X ∖ f⁻¹(F) belongs to
U, using the equality X ∖ f⁻¹(F) = f⁻¹(Y ∖ F) it follows that Y ∖ F belongs to
f∗(U), hence to F. But this is impossible, since F also belongs to F and this will
force the empty set F ∩ (Y ∖ F) to belong to the filter F. This contradiction shows
that the set f⁻¹(F) belongs to U, which means precisely that F belongs to f∗(U).
This argument proves the inclusion F ⊂ f∗(U), so f∗(U) is indeed a maximal filter.
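For finite sets the collection f∗(U) can be listed explicitly. The sketch below assumes that f is given as a dict and that U is a constant ultrafilter; the names `powerset`, `principal` and `pushforward` are illustrative only.

```python
from itertools import chain, combinations

def powerset(X):
    """All subsets of a finite set X, as frozensets."""
    items = list(X)
    subsets = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(c) for c in subsets]

def principal(x, X):
    """The constant ultrafilter at x: all subsets of X containing x."""
    return {A for A in powerset(X) if x in A}

def pushforward(f, U, Y):
    """f_*(U): all V in Y whose preimage under f belongs to U."""
    result = set()
    for V in powerset(Y):
        preimage = frozenset(x for x in f if f[x] in V)
        if preimage in U:
            result.add(V)
    return result

X, Y = {1, 2, 3}, {'a', 'b'}
f = {1: 'a', 2: 'a', 3: 'b'}
# the constant ultrafilter at 3 is pushed to the constant ultrafilter at f(3)
print(pushforward(f, principal(3, X), Y) == principal('b', Y))
```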
Remark 2.3. With the above notations, one has
f(U) ∈ f∗(U), ∀ U ∈ U.
One can prove this property by contradiction. Assume f(U) does not belong to
f∗(U), for some U ∈ U. Then Y ∖ f(U) belongs to f∗(U), which means that the set
M = f⁻¹(Y ∖ f(U)) = X ∖ f⁻¹(f(U))
belongs to U. But using the obvious inclusion U ⊂ f⁻¹(f(U)), this gives M ∩ U =
∅, which is impossible.
Continuity can be nicely characterized using ultrafilters:
Proposition 2.3. Let (X, T ) and (Y, S) be topological spaces, and let x be
an element in X. For a function f : X → Y , the following are equivalent:
(i) f is continuous at x.
(ii) Whenever U is an ultrafilter on X convergent to x, it follows that the
ultrafilter f∗(U) in Y is convergent to f(x).
Proof. (i) ⇒ (ii). Assume that f is continuous at x. Start with an ultrafilter
U on X, with U → x. Let N be an arbitrary neighborhood of f(x). Since f is
continuous at x, it follows that f⁻¹(N) is a neighborhood of x. In particular we
get f⁻¹(N) ∈ U, which proves that N ∈ f∗(U). Since the ultrafilter f∗(U) contains
all neighborhoods of f(x), it means that indeed f∗(U) is convergent to f(x).
(ii) ⇒ (i). Assume f satisfies condition (ii), but f is not continuous at x. This
means that there exists some neighborhood V of f(x) such that f⁻¹(V) is not a
neighborhood of x. Consider the collection
F = {N ∖ f⁻¹(V) : N neighborhood of x}.
Our assumption on V shows that all the sets in F are non-empty. (Otherwise
f⁻¹(V) would contain some neighborhood of x, which would force f⁻¹(V) itself to
be a neighborhood of x.) It is also clear that F is a filter. Let U be an ultrafilter
with U ⊃ F.
Claim: The ultrafilter U is convergent to x.
To prove this, start with some arbitrary neighborhood N of x. If N does not
belong to U, then X ∖ N belongs to U. But then (X ∖ N) ∩ (N ∖ f⁻¹(V)) = ∅
belongs to U, which is impossible. So U contains all neighborhoods of x, which
means that indeed U is convergent to x.
Using our assumption on V , plus condition (ii), it follows that V ∈ f∗(U),
which means that f⁻¹(V) ∈ U. But this leads to a contradiction, since X ∖ f⁻¹(V)
clearly belongs to F ⊂ U. □
Lecture 3

3. Constructing topologies
In this section we discuss several methods for constructing topologies on a given
set.
Definition. If T and T′ are two topologies on the same space X, such that
T′ ⊂ T (as sets), then T is said to be stronger than T′. Equivalently, we will say
that T′ is weaker than T .
Remark that this condition is equivalent to the continuity of the map
Id : (X, T ) → (X, T′).
Comment. Given a (non-empty) set X, and a collection S of subsets of X,
one can ask the following:
Question 1: Is there a topology on X with respect to which all the sets in S
are open?
Of course, this question has an affirmative answer, since we can take as the topology
the collection of all subsets of X. Therefore the above question is more meaningful
if stated as:
Question 2: Is there a weakest topology on X with respect to which all the
sets in S are open?
The answer to this question is again affirmative, and it is based on the following:
Remark 3.1. If X is a non-empty set, and $(T_i)_{i \in I}$ is a family of topologies on
X, then the intersection
$$\bigcap_{i \in I} T_i$$
is again a topology on X.
In particular, if one starts with an arbitrary family S of subsets of X, and if
we take
$$\Theta(S) = \{ T : T \text{ topology on } X \text{ with } T \supset S \},$$
then the intersection
$$\operatorname{top}(S) = \bigcap_{T \in \Theta(S)} T$$
is the weakest (i.e. smallest) among all topologies with respect to which all sets in
S are open.
The topology top(S) defined above can also be described constructively as
follows.
Proposition 3.1. Let S be a collection of subsets of X. Then the sets in
top(S), which are proper subsets of X, are those which can be written as (arbitrary)
unions of finite intersections of sets in S.


Proof. It is useful to introduce the following notations. First we define V(S)
to be the collection of all sets which are finite intersections of sets in S. In other
words,
B ∈ V(S) ⇐⇒ ∃ D1 , . . . , Dn ∈ S such that D1 ∩ · · · ∩ Dn = B.
With the above notation, what we need to prove is that for a set A ⊊ X, we have
$$A \in \operatorname{top}(S) \iff \exists\, V_A \subset V(S) \text{ such that } A = \bigcup_{B \in V_A} B.$$
The implication “⇐” is pretty obvious. Since top(S) is a topology, and every set
in S is open with respect to top(S), it follows that every finite intersection of sets
in S is again in top(S), which means that every set in V(S) is again open with
respect to top(S). But then arbitrary unions of sets in V(S) are again open with
respect to top(S).
To prove the implication “⇒” we define
$$T_0 = \Big\{ A \subset X : \exists\, V_A \subset V(S) \text{ such that } A = \bigcup_{B \in V_A} B \Big\},$$
and we will show that
(1) $\operatorname{top}(S) \subset \{X\} \cup T_0$.
By the definition of top(S) it suffices to prove the following
Claim: The collection T1 = {X} ∪ T0 is a topology on X, which contains all
the sets in S.
The fact that T1 ⊃ S is trivial.
The fact that ∅, X ∈ T1 is also clear.
The fact that arbitrary unions of sets in T1 again belong to T1 is again clear,
by construction.
Finally, we need to show that if A1 , A2 ∈ T1 , then A1 ∩ A2 ∈ T1 . If either
A1 = X or A2 = X, there is nothing to prove. Assume that both A1 and A2 are
proper subsets of X, so there are subsets V1 , V2 ⊂ V(S), such that
$$A_1 = \bigcup_{B \in V_1} B \quad\text{and}\quad A_2 = \bigcup_{E \in V_2} E.$$
Then it is clear that
$$A_1 \cap A_2 = \bigcup_{B \in V_1,\ E \in V_2} (B \cap E),$$
with all the sets B ∩ E in V(S), so A1 ∩ A2 indeed belongs to T1 . □
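The constructive description of top(S) translates directly into a brute-force computation on a finite set: form all finite intersections, then all unions, and add X. This is a sketch only, exponential in the size of S; the names `subfamilies` and `top` are hypothetical.

```python
from itertools import chain, combinations

def subfamilies(items):
    """All finite subfamilies (as tuples) of a finite collection."""
    items = list(items)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

def top(S, X):
    """The topology generated by S on a finite set X, following Proposition 3.1."""
    X = frozenset(X)
    S = {frozenset(D) for D in S}
    # V(S): intersections of one or more sets in S
    V = {frozenset.intersection(*combo) for combo in subfamilies(S) if combo}
    # unions of arbitrary subfamilies of V(S); the empty family gives the empty set
    T = {frozenset().union(*family) for family in subfamilies(V)}
    T.add(X)  # the total set is always open
    return T

T = top([{1, 2}, {2, 3}], {1, 2, 3, 4})
print(sorted(sorted(D) for D in T))
```

Here V(S) consists of {1, 2}, {2, 3} and {2}, and the generated topology has six open sets.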

Definition. Let X be a (non-empty) set, let T be a topology on X. A
collection S of subsets of X, with the property that
T = top(S),
is called a sub-base for T . According to Proposition 3.1, the above condition is
equivalent to the fact that every open set D ⊊ X can be written as a union of finite
intersections of sets in S.
Convergence of ultrafilters is characterized using sub-bases as follows:
Proposition 3.2. Let (X, T ) be a topological space, let S be a sub-base for
T , and let x be some point in X. For an ultrafilter U on X, the following are
equivalent:
(i) U is convergent to x;
(ii) U contains all the sets S ∈ S with S ∋ x.

Proof. The implication (i) ⇒ (ii) is trivial.


To prove the implication (ii) ⇒ (i), we assume U has property (ii), we consider
some neighborhood N of x, and let us prove that N belongs to U. Since N is a
neighborhood of x, there exists some open set D, with x ∈ D ⊂ N . Furthermore,
by Proposition 3.1, either
(a) D = X, or
(b) there exist sets S1 , S2 , . . . , Sn ∈ S with
x ∈ S1 ∩ S2 ∩ · · · ∩ Sn ⊂ D ⊂ N.
In case (a) we immediately have N = X, and we obviously get N ∈ U. In case (b)
it follows that S1 , . . . , Sn ∈ U, so the intersection S1 ∩ S2 ∩ · · · ∩ Sn also belongs to
U. By Remark 2.1 it then follows that N itself belongs to U. □

There are instances when sub-bases have a particular feature, which enables
one to describe all open sets in an easier fashion.
Proposition 3.3. Let (X, T ) be a topological space. Suppose V is a collection
of subsets of X. The following are equivalent:
(i) V is a sub-base for T , and
(2) ∀ U, V ∈ V and x ∈ U ∩ V, ∃ W ∈ V with x ∈ W ⊂ U ∩ V.
(ii) Every open set A ⊊ X is a union of sets in V.

Proof. (i) ⇒ (ii). From property (i), it follows that every finite intersection
of sets in V is a union of sets in V. Then the desired implication is immediate
from the previous result.
(ii) ⇒ (i). Assume (ii) and start with two sets U, V ∈ V, and an element
x ∈ U ∩ V . Since U ∩ V is open, by (ii) either we have U ∩ V = X, in which case
we get U = V = X, and we take W = X, or U ∩ V ⊊ X, in which case U ∩ V is a
union of sets in V, so in particular there exists W ∈ V with x ∈ W ⊂ U ∩ V . □

Definition. If (X, T ) is a topological space, a collection V which satisfies the
above equivalent conditions, is called a base for T .
The following is a useful technical result.
Lemma 3.1. Let (Y, T ) be a topological space, let X be some (non-empty) set,
and let f : X → Y be a function. Then the collection
$$f^*(T) = \{ f^{-1}(D) : D \in T \}$$
is a topology on X. Moreover, $f^*(T)$ is the weakest topology on X, with respect to
which the map f is continuous.

Proof. Clearly $\emptyset = f^{-1}(\emptyset)$ and $X = f^{-1}(Y)$ both belong to $f^*(T)$. If
$(A_i)_{i \in I}$ is a family of sets in $f^*(T)$, say $A_i = f^{-1}(D_i)$, for some $D_i \in T$, for all
i ∈ I, then the equality
$$\bigcup_{i \in I} A_i = \bigcup_{i \in I} f^{-1}(D_i) = f^{-1}\Big( \bigcup_{i \in I} D_i \Big)$$
clearly shows that $\bigcup_{i \in I} A_i$ again belongs to $f^*(T)$. Likewise, if $A_1, A_2 \in f^*(T)$,
say $A_1 = f^{-1}(D_1)$ and $A_2 = f^{-1}(D_2)$ for some $D_1, D_2 \in T$, then the equality
$$A_1 \cap A_2 = f^{-1}(D_1) \cap f^{-1}(D_2) = f^{-1}(D_1 \cap D_2)$$
proves that $A_1 \cap A_2$ again belongs to $f^*(T)$.
Having proven that $f^*(T)$ is a topology on X, let us prove now the second
statement. The fact that f is continuous with respect to $f^*(T)$ is clear by
construction. If T′ is another topology which still makes f continuous, then this will
force all the sets of the form $f^{-1}(D)$, D ∈ T , to belong to T′, which means that
$f^*(T) \subset T'$. □
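As a small finite illustration of the lemma, the topology f*(T) can be listed by taking preimages of the open sets of Y one at a time. The representation of f as a dict and the names `preimage` and `pullback_topology` are assumptions made only for this sketch.

```python
def preimage(f, D):
    """f^{-1}(D) for a map f given as a dict {point of X: point of Y}."""
    return frozenset(x for x in f if f[x] in D)

def pullback_topology(f, T):
    """f^*(T): the preimages of all open sets of Y."""
    return {preimage(f, D) for D in T}

f = {1: 'a', 2: 'a', 3: 'b'}
T = [set(), {'a'}, {'a', 'b'}]
# the points 1 and 2 cannot be separated by f, so no open set separates them
print(sorted(sorted(D) for D in pullback_topology(f, T)))
```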
Remark 3.2. Using the above notations, if V is a (sub)base for T , then
$$f^*(V) = \{ f^{-1}(V) : V \in V \}$$
is a (sub)base for $f^*(T)$. This is pretty obvious since the correspondence
$$\{ \text{subsets of } Y \} \ni D \longmapsto f^{-1}(D)$$
is compatible with the operations of intersection and union (of arbitrary families).
Remark 3.3. As a consequence of the above remark, we see that (sub)bases
can be useful for verifying continuity. More specifically, if (X, T ) and (X′, T′) are
topological spaces, and V is a sub-base for T′, then a function f : X → X′ is
continuous, if and only if f⁻¹(V) is open, for all V ∈ V.
The construction outlined in Lemma 3.1 can be generalized as follows.
Proposition 3.4. Let X be a set, and let $\Phi = (f_i, Y_i)_{i \in I}$ be a family consisting
of maps $f_i : X \to Y_i$, where $Y_i$ is a topological space, for all i ∈ I. Then there is a
unique topology $T^\Phi$ on X, with the following properties:
(i) Each of the maps $f_i : X \to Y_i$, i ∈ I, is continuous with respect to $T^\Phi$.
(ii) Given a topological space (Z, S), and a map g : Z → X, such that the
composition $f_i \circ g : Z \to Y_i$ is continuous, for every i ∈ I, it follows that
g is continuous as a map $(Z, S) \to (X, T^\Phi)$.
Proof. For every i ∈ I we define
$$D_i = \{ f_i^{-1}(D) : D \text{ open subset of } Y_i \},$$
and we form the collection
$$D = \bigcup_{i \in I} D_i.$$
Take $T^\Phi = \operatorname{top}(D)$. Property (i) follows from the simple observation that, by
construction, every set in D is open.
To prove property (ii) start with a topological space (Z, S), and a map g : Z →
X, such that the composition $f_i \circ g : Z \to Y_i$ is continuous, for every i ∈ I, and
let us prove that g is continuous. By Remark 3.3 it suffices to prove that $g^{-1}(A)$ is
open (in Z) for every A ∈ D. By the definition of D this is equivalent to proving
the fact that, for each i ∈ I, and each open set $D \subset Y_i$, the set $g^{-1}\big(f_i^{-1}(D)\big)$ is
open. But this is obvious, since we have
$$g^{-1}\big(f_i^{-1}(D)\big) = (f_i \circ g)^{-1}(D),$$
and $f_i \circ g : Z \to Y_i$ is continuous.
To prove the uniqueness, let T be another topology on X with properties (i)
and (ii). Consider the map $h = \mathrm{Id} : (X, T) \to (X, T^\Phi)$. Using property (i) for T ,
combined with property (ii) for $T^\Phi$, it follows that h is continuous, which means
that $T^\Phi \subset T$. Reversing the roles, and arguing exactly the same way, we also get
the other inclusion $T \subset T^\Phi$. □
Remark 3.4. Using the above setting, assume that for each i ∈ I a sub-base
$S_i$ for the topology of $Y_i$ is given. Consider the sets $f_i^* S_i = \{ f_i^{-1}(S) : S \in S_i \}$.
Then the collection
$$S = \bigcup_{i \in I} f_i^* S_i$$
is a sub-base for the topology $T^\Phi$.
To prove this, we take T = top(S), so that we obviously have the inclusion
$T \subset T^\Phi$. In order to prove the equality $T = T^\Phi$, all we have to prove are (using the
notations from the proof of the above Proposition) the inclusions
$$D_i \subset T, \quad \forall\, i \in I.$$
By construction however, we have $D_i = f_i^* T_i$, and since $S_i$ is a sub-base for $T_i$, it
follows (by Remark 3.2) that $f_i^* S_i$ is a sub-base for $D_i$, which means that we have
$$D_i = \operatorname{top}(f_i^* S_i) \subset \operatorname{top}(S) = T.$$
Comment. Using the notations above, it is immediate that the topology $T^\Phi$
can also be described as the weakest topology on X, with respect to which all the
maps $f_i : X \to Y_i$, i ∈ I, are continuous. In the light of this remark, we will call
the topology $T^\Phi$ the weak topology defined by Φ.
Convergence for ultrafilters can be nicely characterized:
Proposition 3.5. Let X be a set, and let $\Phi = (f_i, Y_i)_{i \in I}$ be a family consisting
of maps $f_i : X \to Y_i$, where $Y_i$ is a topological space, for all i ∈ I. For an ultrafilter
U on X and a point x ∈ X, the following are equivalent:
(i) U is convergent to x, with respect to the topology $T^\Phi$;
(ii) for every i ∈ I, the ultrafilter $f_{i*}(U)$ is convergent to $f_i(x)$.
Proof. (i) ⇒ (ii). This implication is clear, since all maps $f_i : (X, T^\Phi) \to (Y_i, T_i)$,
i ∈ I, are continuous.
(ii) ⇒ (i). Suppose U satisfies (ii). Then for every i ∈ I, the ultrafilter $f_{i*}(U)$
contains all the open sets $D \subset Y_i$ with $D \ni f_i(x)$. This means that $f_i^{-1}(D) \in U$.
But by construction, the topology $T^\Phi$ has
$$S = \bigcup_{i \in I} \{ f_i^{-1}(D) : D \subset Y_i \text{ open} \}$$
as a sub-base, so if we define
$$S_x = \{ S \in S : S \ni x \},$$
we clearly have $U \supset S_x$. Then the fact that U converges to x follows from
Proposition 3.2. □
Example 3.1. (The product topology) Supoose we have Ya family (Xi , Ti ), i ∈ I
of topological spaces. Consider the Cartesian product X = Xi . For each j ∈ I
i∈I
18 LECTURE 3

we consider the projection πj : X → Xj . The weakest topology on X, defined by


the family Φ = {πj }j∈I , is called the product topology.
A sub-base for the product topology can be defined as follows. For each i ∈ I,
we choose a sub-base Si for Ti (for instance we can take Si = Ti ), and we take
S = ⋃_{i∈I} πi∗ Si = ⋃_{i∈I} { π_i^{-1}(D) : D ∈ Si }.
Then S is a sub-base for the product topology.
For a point x = (xi )i∈I ∈ X, and an ultrafilter U on X, the condition U → x
is equivalent to the fact that πi∗ (U) → xi , ∀ i ∈ I.
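For finite spaces, the sub-base description of the product topology can be checked by brute force. The following is a minimal sketch in Python and is not part of the original notes: the helper names `generate_topology` and `projection_preimage`, as well as the two toy spaces, are hypothetical and chosen only for illustration.

```python
def generate_topology(X, subbase):
    """Return the smallest topology on the finite set X containing `subbase`."""
    X = frozenset(X)
    # Step 1: all finite intersections of sub-base sets (empty intersection = X).
    base = {X}
    for S in subbase:
        base |= {B & S for B in base}
    # Step 2: all unions of base sets (empty union = the empty set).
    topology = {frozenset()}
    for B in base:
        topology |= {U | B for U in topology}
    return topology


def projection_preimage(X, coordinate, D):
    """pi_j^{-1}(D) for the coordinate projection pi_j on a set X of tuples."""
    return frozenset(p for p in X if p[coordinate] in D)


# Toy data (assumed for illustration): X1 is the two-point Sierpinski space,
# X2 is a two-point discrete space.
X1 = {0, 1}
T1 = {frozenset(), frozenset({0}), frozenset({0, 1})}
X2 = {"a", "b"}
T2 = {frozenset(), frozenset({"a"}), frozenset({"b"}), frozenset({"a", "b"})}

X = frozenset((x1, x2) for x1 in X1 for x2 in X2)
subbase = ({projection_preimage(X, 0, D) for D in T1}
           | {projection_preimage(X, 1, D) for D in T2})
product_topology = generate_topology(X, subbase)
```

In this toy case a set W ⊂ X is open exactly when each slice {x1 : (x1, y) ∈ W} is open in X1, so the point (0, "a") is open while (1, "a") is not.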
Another method of constructing topologies is based on the following “dual”
version of Lemma 3.1.
Lemma 3.2. Let (Y, T ) be a topological space, let X be some (non-empty) set,
and let f : Y → X be a function. Then the collection
f∗ (T ) = { D ⊂ X : f^{-1}(D) ∈ T }
is a topology on X. Moreover, f∗ (T ) is the strongest topology on X, with respect
to which the map f is continuous.
Proof. Since f −1 (∅) = ∅ and f −1 (X) = Y , it follows that ∅ and X both
belong to f∗ (T ). If (Ai )i∈I is a family of sets in f∗ (T ), then the sets f −1 (Ai ), i ∈ I
belong to T . In particular the set
f^{-1}( ⋃_{i∈I} Ai ) = ⋃_{i∈I} f^{-1}(Ai )
will again belong to T , which means that ⋃_{i∈I} Ai belongs to f∗ (T ). Likewise,
if A1 , A2 ∈ f∗ (T ), then the sets f −1 (A1 ) and f −1 (A2 ) both belong to T . The
intersection
f −1 (A1 ∩ A2 ) = f −1 (A1 ) ∩ f −1 (A2 )
will then belong to T , which means that A1 ∩ A2 again belongs to f∗ (T ).
Having proven that f∗ (T ) is a topology on X, let us prove now the second
statement. The fact that f is continuous with respect to f∗ (T ) is clear by
construction. If T′ is another topology which still makes f continuous, then this will
force all the sets of the form f^{-1}(A), A ∈ T′, to belong to T , which means that
every such A will in fact belong to f∗ (T ). In other words, we have the inclusion
T′ ⊂ f∗ (T ). □
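On a finite set the topology f∗ (T ) of Lemma 3.2 can be computed directly from its definition. The sketch below is an illustration only and is not taken from the notes: the function names and the toy map `f` are assumptions made for the example.

```python
from itertools import combinations


def powerset(points):
    """All subsets of a finite collection, as frozensets."""
    items = list(points)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def preimage(f, domain, subset):
    """f^{-1}(subset) for a map f given as a dict on `domain`."""
    return frozenset(y for y in domain if f[y] in subset)


def pushforward_topology(f, Y, T, X):
    """f_*(T) = { D subset of X : f^{-1}(D) belongs to T }."""
    return {D for D in powerset(X) if preimage(f, Y, D) in T}


def is_topology(X, T):
    """Check the axioms on a finite set (pairwise unions/intersections suffice)."""
    X = frozenset(X)
    if frozenset() not in T or X not in T:
        return False
    return all((A | B) in T and (A & B) in T for A in T for B in T)


# Toy data (assumed): Y = {0, 1, 2} with a chain of open sets, X = {"a", "b"}.
Y = {0, 1, 2}
T = {frozenset(), frozenset({0}), frozenset({0, 1}), frozenset({0, 1, 2})}
X = {"a", "b"}
f = {0: "a", 1: "b", 2: "b"}

pushed = pushforward_topology(f, Y, T, X)
```

Here {"a"} is open in f∗ (T ) because its preimage {0} is open in Y, while {"b"} is not, since its preimage {1, 2} does not belong to T.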
A generalization of the above construction is given in the following.
Proposition 3.6. Let X be a set, and let Φ = (fi , Yi )i∈I be a family consisting
of maps fi : Yi → X, where Yi is a topological space, for all i ∈ I. Then there is a
unique topology TΦ on X, with the following properties:
(i) Each of the maps fi : Yi → X, i ∈ I is continuous with respect to TΦ .
(ii) Given a topological space (Z, S), and a map g : X → Z, such that the
composition g ◦ fi : Yi → Z is continuous, for every i ∈ I, it follows that
g is continuous as a map (X, TΦ ) → (Z, S).
Proof. For each i ∈ I, let Ti denote the topology on Yi . We define
TΦ = ⋂_{i∈I} fi∗ (Ti ).
Property (i) is obvious by construction.
To prove property (ii), start with some topological space (Z, S) and a map
g : X → Z such that g ◦ fi : Yi → Z is continuous, for all i ∈ I. Start with
some open set D ⊂ Z, and let us prove that the set A = g −1 (D) is open in X, i.e.
A ∈ TΦ . Notice that, for each i ∈ I, one has
f_i^{-1}(A) = f_i^{-1}( g^{-1}(D) ) = (g ◦ fi )^{-1}(D),
so using the continuity of g ◦ fi we get the fact that fi−1 (A) is open in Yi , which
means that A ∈ fi∗ (Ti ). Since this is true for all i ∈ I, we then get A ∈ TΦ .
To prove uniqueness, let T be another topology on X with properties (i) and
(ii). Consider the map h = Id : (X, T ) → (X, TΦ ). Using property (i) for TΦ ,
combined with property (ii) for T , it follows that h is continuous, which means
that TΦ ⊂ T . Reversing the roles, and arguing exactly the same way, we also get
the other inclusion T ⊂ TΦ . □
Comment. Using the notations above, it is immediate that the topology TΦ
can also be described as the strongest topology on X, with respect to which all the
maps fi : Yi → X, i ∈ I, are continuous. In the light of this remark, we will call
the topology TΦ the strong topology defined by Φ.
Example 3.2. (The disjoint union topology) Suppose we have a family (Xi , Ti ),
i ∈ I, of topological spaces. Consider the disjoint union³ X = ⨆_{i∈I} Xi . For each i ∈ I
we consider the inclusion ιi : Xi → X. The strongest topology on X, defined by
the family Φ = {ιi }i∈I , is called the disjoint union topology.
If we think of each Xi as a subset of X, then Xi is open in X, for all i ∈ I.
Moreover, a set D ⊂ X is open, if and only if D ∩ Xi is open (in Xi ), for all i ∈ I.
For a point x ∈ X, there exists a unique i(x) ∈ I, with x ∈ Xi(x) . With this
notation, an ultrafilter U on X is convergent to x, if and only if Xi(x) ∈ U, and the
collection
U|Xi(x) = {U ∩ Xi(x) : U ∈ U}
is an ultrafilter on Xi(x) , which converges to x.
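The openness criterion for the disjoint union (a set D is open if and only if each D ∩ Xi is open in Xi ) is easy to test on finite examples. The following sketch is only an illustration under stated assumptions: the disjoint union is modelled as the set of pairs (i, x) with x ∈ Xi , and the function name `disjoint_union` and the toy spaces are hypothetical.

```python
def disjoint_union(spaces):
    """Build the disjoint union of finite spaces.

    `spaces` maps an index i to a pair (points of X_i, set of open frozensets).
    Returns the underlying set X of pairs (i, x) and an openness test on X.
    """
    X = frozenset((i, x) for i, (points, _) in spaces.items() for x in points)

    def is_open(D):
        D = frozenset(D)
        if not D <= X:
            return False
        # D is open iff its trace on every X_i is open in X_i.
        return all(frozenset(x for (j, x) in D if j == i) in opens
                   for i, (_, opens) in spaces.items())

    return X, is_open


# Toy data (assumed): a Sierpinski space and a one-point space.
spaces = {
    1: ({0, 1}, {frozenset(), frozenset({0}), frozenset({0, 1})}),
    2: ({0}, {frozenset(), frozenset({0})}),
}
X, is_open = disjoint_union(spaces)
```

For instance, the copy {(1, 0), (1, 1)} of X1 passes the test, in agreement with the statement that each Xi is open in X.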

³ Formally one uses the sets Z = ⋃_{i∈I} Xi , and Y = I × Z, and one realizes the disjoint
union as X = ⋃_{i∈I} {i} × Xi .
Lecture 4

4. Compactness
Definition. Let X be a topological space. A subset K ⊂ X is said to be a
compact set in X, if it has the finite open cover property:
(f.o.c) Whenever {Di }i∈I is a collection of open sets such that K ⊂ ⋃_{i∈I} Di ,
there exists a finite sub-collection Di1 , . . . , Din such that
K ⊂ Di1 ∪ · · · ∪ Din .
An equivalent description is the finite intersection property:
(f.i.p.) If {Fi }i∈I is a collection of closed sets such that for any finite
sub-collection Fi1 , . . . , Fin we have K ∩ Fi1 ∩ · · · ∩ Fin ≠ ∅, it follows that
K ∩ ( ⋂_{i∈I} Fi ) ≠ ∅.

A topological space (X, T ) is called compact if X itself is a compact set.


Remark 4.1. Suppose (X, T ) is a topological space, and K is a subset of
X. Equip K with the induced topology T|K . Then it is straightforward from the
definition that the following are equivalent:
• K is compact, as a subset in (X, T );
• (K, T|K ) is a compact space, that is, K is compact as a subset in (K, T|K ).
The following three results give methods of constructing compact sets.
Proposition 4.1. A finite union of compact sets is compact.

Proof. Immediate from the definition. □
Proposition 4.2. Suppose (X, T ) is a topological space and K ⊂ X is a
compact set. Then for every closed set F ⊂ X, the intersection F ∩ K is again
compact.
Proof. Immediate, using the finite intersection property. □
Proposition 4.3. Suppose (X, T ) and (Y, S) are topological spaces, f : X → Y
is a continuous map, and K ⊂ X is a compact set. Then f (K) is compact.
Proof. Immediate from the definition. □

Besides the two equivalent conditions (f.o.c) and (f.i.p.), there are some other
useful characterizations of compactness, listed in the following.
Theorem 4.1. Let (X, T ) be a topological space. The following are equivalent:
(i) X is compact.
(ii) (Alexander sub-base Theorem) There exists a sub-base S with the finite
open cover property:
(s) For any collection {Si | i ∈ I} ⊂ S with X = ⋃_{i∈I} Si , there exists a
finite sub-collection {Si1 , Si2 , . . . , Sin } (for some finite sequence of
indices i1 , i2 , . . . , in ∈ I) such that X = Si1 ∪ Si2 ∪ · · · ∪ Sin .
(iii) Every ultrafilter in X is convergent.
Proof. (i) ⇒ (ii). This is obvious. (In fact any sub-base has the finite open
cover property.)
(ii) ⇒ (iii). Let U be an ultrafilter on X. Assume U is not convergent to any
point x ∈ X. By Proposition 3.2 it follows that, for each x ∈ X, one can find a
set Sx ∈ S with Sx ∋ x, but such that Sx ∉ U. Using property (s), one can find a
finite collection of points x1 , . . . , xn ∈ X, such that
(1) Sx1 ∪ · · · ∪ Sxn = X.
Since Sxp ∉ U, it means that X ∖ Sxp belongs to U, for every p = 1, . . . , n. Then,
using (1), we get
U ∋ (X ∖ Sx1 ) ∩ · · · ∩ (X ∖ Sxn ) = ∅,
which is impossible.
(iii) ⇒ (i). Assuming property (iii), we will show that X has the finite
intersection property. Start with a family of closed sets {Fi }i∈I , with the property
that
(2) ⋂_{i∈J} Fi ≠ ∅, for every finite subset J ⊂ I.
We want to prove that ⋂_{i∈I} Fi ≠ ∅. For every finite subset J ⊂ I we define the
non-empty closed set FJ = ⋂_{i∈J} Fi . It is clear that
F = { FJ : J finite subset of I }
is a filter. Let then U be an ultrafilter with U ⊃ F. By (iii) there exists some x ∈ X
such that U → x, which means that U contains all neighborhoods of x. Start now
with some arbitrary index i ∈ I. Since we clearly have Fi ∈ F ⊂ U, it follows that
X ∖ Fi cannot belong to U, which means that X ∖ Fi is not a neighborhood of x.
Since X ∖ Fi is already open, this forces x ∉ X ∖ Fi , which means that x ∈ Fi .
Since this is true for all i ∈ I, it proves that the intersection ⋂_{i∈I} Fi contains x,
so it is non-empty. □
An interesting application of the above result is the following:
Theorem 4.2 (Tihonov). Suppose one has a family (Xi , Ti )i∈I of compact
topological spaces. Then the product space ∏_{i∈I} Xi is compact in the product topology.
Proof. We are going to use the ultrafilter characterization (iii) from the
preceding Theorem. Let U be an ultrafilter on X = ∏_{i∈I} Xi . Denote by πi : X → Xi ,
i ∈ I the coordinate maps. Since each Xi is compact, it follows that, for every
i ∈ I, the ultrafilter πi∗ (U) (in Xi ) is convergent to some point xi ∈ Xi . If we form
the element x = (xi )i∈I ∈ X, this means that πi∗ (U) is convergent to πi (x), for
every i ∈ I. Then, by the ultrafilter characterization of the product topology (see
section 3) it follows that U is convergent to x. □
Comment. Another interesting application of Theorem 4.1 is the following
construction. Suppose (X, T ) is a compact Hausdorff space, and (xi )i∈I ⊂ X is an
arbitrary family of elements. (Here I is an arbitrary set.) Suppose U is an ultrafilter
on I. If we regard the family (xi )i∈I simply as a function f : I → X, then we can
construct the ultrafilter f∗ (U) on X. More explicitly
f∗ (U) = { W ⊂ X : the set {i ∈ I : xi ∈ W } belongs to U }.
Since X is compact Hausdorff, the ultrafilter f∗ (U) is convergent to some unique
point x ∈ X. This point is denoted by lim_U xi .
We conclude this section with some results on compactness in Hausdorff spaces.
Proposition 4.4. Suppose (X, T ) is a topological Hausdorff space.
(i) Any compact set K ⊂ X is closed.
(ii) If K is a compact set, then a subset F ⊂ K is compact, if and only if F
is closed (in X).
Proof. (i) The key step is contained in the following
Claim: For every x ∈ X ∖ K, there exists some open set Dx with x ∈ Dx ⊂
X ∖ K.
Fix x ∈ X ∖ K. For every y ∈ K, using the Hausdorff property, we can find
two open sets Uy and Vy with Uy ∋ x, Vy ∋ y, and Uy ∩ Vy = ∅. Since we
obviously have K ⊂ ⋃_{y∈K} Vy , by compactness, there exist points y1 , . . . , yn ∈ K,
such that K ⊂ Vy1 ∪ · · · ∪ Vyn . The claim immediately follows if we then define
Dx = Uy1 ∩ · · · ∩ Uyn .
Using the Claim we now see that we can write the complement of K as a union
of open sets:
X ∖ K = ⋃_{x∈X∖K} Dx ,
so X ∖ K is open, which means that K is indeed closed.
(ii) If F is closed, then F is compact by Proposition 4.2. Conversely, if F is
compact, then by (i) F is closed. □
Proposition 4.5. Every compact Hausdorff space is normal.
Proof. Let X be a compact Hausdorff space. Let A, B ⊂ X be two closed
sets with A∩B = ∅. We need to find two open sets U, V ⊂ X, with A ⊂ U , B ⊂ V ,
and U ∩ V = ∅. We start with the following
Particular case: Assume B is a singleton, B = {b}.
The proof follows line by line the first part of the proof of part (i) from Proposition
4.4. For every a ∈ A we find open sets Ua and Va , such that Ua ∋ a, Va ∋ b,
and Ua ∩ Va = ∅. Using Proposition 4.4 we know that A is compact, and since we
clearly have A ⊂ ⋃_{a∈A} Ua , there exist a1 , . . . , an ∈ A, such that Ua1 ∪ · · · ∪ Uan ⊃ A.
Then we are done by taking U = Ua1 ∪ · · · ∪ Uan and V = Va1 ∩ · · · ∩ Van .
Having proven the above particular case, we proceed now with the general case.
For every b ∈ B, we use the particular case to find two open sets Ub and Vb , with
Ub ⊃ A, Vb ∋ b, and Ub ∩ Vb = ∅. Arguing as above, the set B is compact, and
we have B ⊂ ⋃_{b∈B} Vb , so there exist b1 , . . . , bn ∈ B, such that Vb1 ∪ · · · ∪ Vbn ⊃ B.
Then we are done by taking U = Ub1 ∩ · · · ∩ Ubn and V = Vb1 ∪ · · · ∪ Vbn . □
Lecture 5

5. Topology preliminaries V: Locally compact spaces


Definition. A locally compact space is a Hausdorff topological space with the
property
(lc) Every point has a compact neighborhood.
One key feature of locally compact spaces is contained in the following:
Lemma 5.1. Let X be a locally compact space, let K be a compact set in X,
and let D be an open subset, with K ⊂ D. Then there exists an open set E with:
(i) Ē compact;
(ii) K ⊂ E ⊂ Ē ⊂ D.
Proof. Let us start with the following
Particular case: Assume K is a singleton K = {x}.
Start off by choosing a compact neighborhood N of x. Using the results from section
4, when equipped with the induced topology, the set N is normal. In particular, if
we consider the closed sets A = {x} and B = N ∖ D (which are also closed in the
induced topology), it follows that there exist sets U, V ⊂ N , such that
• U ⊃ {x}, V ⊃ B, U ∩ V = ∅;
• U and V are open in the induced topology on N .
The second property means that there exist open sets U0 , V0 ⊂ X, such that U =
N ∩ U0 and V = N ∩ V0 . Let E = Int(U ). By construction E is open, and E ∋ x.
Also, since E ⊂ U ⊂ N , it follows that
(1) Ē ⊂ N̄ = N.
In particular this gives the compactness of Ē. Finally, since we obviously have
E ∩ V0 ⊂ U ∩ V0 = N ∩ U0 ∩ V0 = U ∩ V = ∅,
we get E ⊂ X ∖ V0 , so using the fact that X ∖ V0 is closed, we also get the
inclusion Ē ⊂ X ∖ V0 . Finally, combining this with (1) and with the inclusion
N ∖ D ⊂ V ⊂ V0 , we will get
Ē ⊂ N ∩ (X ∖ V0 ) ⊂ N ∖ (N ∖ D) ⊂ D,
and we are done.
Having proven the particular case, we proceed now with the general case. For
every x ∈ K we use the particular case to find an open set E(x), with compact
closure \overline{E(x)}, and such that x ∈ E(x) ⊂ \overline{E(x)} ⊂ D. Since we clearly
have K ⊂ ⋃_{x∈K} E(x), by compactness, there exist x1 , . . . , xn ∈ K, such that
K ⊂ E(x1 ) ∪ · · · ∪ E(xn ). Notice
that if we take E = E(x1 ) ∪ · · · ∪ E(xn ), then we clearly have
K ⊂ E ⊂ Ē ⊂ \overline{E(x1 )} ∪ · · · ∪ \overline{E(xn )} ⊂ D,
and we are done. □


One of the most useful results in the analysis on locally compact spaces is the
following.
Theorem 5.1 (Urysohn’s Lemma for locally compact spaces). Let X be a
locally compact space, and let K, F ⊂ X be two disjoint sets, with K compact and
F closed. Then there exists a continuous function f : X → [0, 1]
such that f|K = 1 and f|F = 0.
Proof. Apply Lemma 5.1 for the pair K ⊂ X ∖ F and find an open set E, with
Ē compact, such that K ⊂ E ⊂ Ē ⊂ X ∖ F . Apply again Lemma 5.1 for the pair
K ⊂ E and find another open set G with Ḡ compact, such that K ⊂ G ⊂ Ḡ ⊂ E.
Let us work for the moment in the space Ē (equipped with the induced topology).
This is a compact Hausdorff space, hence it is normal. In particular, using
Urysohn Lemma (see section 1) there exists a continuous function g : Ē → [0, 1] such
that g|K = 1 and g|Ē∖G = 0. Let us now define the function f : X → [0, 1] by
f (x) = g(x) if x ∈ Ē, and f (x) = 0 if x ∈ X ∖ Ē.
Notice that f|E = g|E , so f|E is continuous. If we take the open set A = X ∖ Ḡ,
then it is also clear that f|A = 0. So now we have two open sets E and A, with
A ∪ E = X, and f|A and f|E both continuous. Then it is clear that f is continuous.
The other two properties f|K = 1 and f|F = 0 are obvious. □
We now discuss an important notion which makes the link between locally
compact spaces and compact spaces.
Definition. Let X be a locally compact space. By a compactification of X one
means a pair (θ, T ) consisting of a compact Hausdorff space T , and of a continuous
map θ : X → T , with the following properties
(i) θ(X) is a dense open subset of T ;
(ii) when we equip θ(X) with the induced topology, the map θ : X → θ(X)
is a homeomorphism.
Notice that, when X is already compact, any compactification (θ, T ) of X is nec-
essarily made up of a compact space T , and a homeomorphism θ : X → T .
Example 5.1 (Alexandrov compactification). Suppose X is a locally compact
space, which is not compact. We form a disjoint union with a singleton X^α =
X ⊔ {∞}, and we equip the space X^α with the topology in which a subset D ⊂ X^α
is declared to be open, if either D is an open subset of X, or there exists some
compact subset K ⊂ X, such that D = (X ∖ K) ⊔ {∞}. Define the inclusion
map ι : X ↪ X^α . Then (ι, X^α ) is a compactification of X, which is called the
Alexandrov compactification. The fact that ι(X) is open in X^α , and ι : X → ι(X)
is a homeomorphism, is clear. The density of ι(X) in X^α is also clear, since every
open set D ⊂ X^α , with D ∋ ∞, is of the form (X ∖ K) ⊔ {∞}, for some compact
set K ⊂ X, and then we have D ∩ ι(X) = ι(X ∖ K), which is non-empty, because
X is not compact.
Remark that, if X is already compact, we can still define the topological space
X^α = X ⊔ {∞}, but this time the singleton set {∞} will also be open. Although
ι(X) will still be open in X^α , it will not be dense in X^α .
One should regard the Alexandrov compactification as a minimal one. It turns
out that there exists another compactification which is described below, which can
be regarded as the largest.
Theorem 5.2 (Stone-Čech). Let X be a locally compact space. Consider the
set
F = {f : X → [0, 1] : f continuous },
and consider the product space
T = ∏_{f∈F} [0, 1],
equipped with the product topology, and define the map θ : X → T by
θ(x) = (f (x))_{f∈F} , ∀ x ∈ X.
Equip the closure \overline{θ(X)} with the topology induced from T . Then the pair
(θ, \overline{θ(X)}) is a compactification of X.
Proof. For every f ∈ F , let us denote by πf : T → [0, 1] the coordinate map.
Remark that θ : X → T is continuous. This is immediate from the definition of
the product topology, since the continuity of θ is equivalent to the continuity of all
compositions πf ◦ θ, f ∈ F . The fact that these compositions are continuous is
however trivial, since we have πf ◦ θ = f , ∀ f ∈ F .
Denote for simplicity the closure \overline{θ(X)} by B. By Tihonov’s Theorem, the space T is
compact (and obviously Hausdorff), so the set B is compact as well, being a closed
subset of T . By construction, θ(X) is dense in B, and θ is continuous.
At this point, it is interesting to point out the following property.
Claim 1: For every f ∈ F , there exists a unique continuous map f̃ : B →
[0, 1], such that f̃ ◦ θ = f .
The uniqueness is trivial, since θ(X) is dense in B. The existence is also trivial,
because we can take f̃ = πf|B .

We can show now that θ is injective. If x, y ∈ X are such that x ≠ y, then
using Urysohn Lemma we can find f ∈ F , such that f (x) ≠ f (y). The function f̃
given by Claim 1, clearly satisfies
f̃(θ(x)) = f (x) ≠ f (y) = f̃(θ(y)),
which forces θ(x) ≠ θ(y).


In order to show that θ(X) is open in B, we need some preparations. For every
compact subset K ⊂ X, we define
FK = { f : X → [0, 1] : f continuous, f|X∖K = 0 }.
One key observation is the following.
Claim 2: If K ⊂ X is compact, and if f ∈ FK , then the continuous function
f̃ : B → [0, 1], given by Claim 1, has the property f̃|B∖θ(K) = 0.
We start with some α ∈ B ∖ θ(K), and we use Urysohn Lemma to find some
continuous function φ : B → [0, 1] such that φ(α) = 1 and φ|θ(K) = 0. Consider
the function ψ = φ · f̃. Notice that (φ ◦ θ)|K = 0, which combined with the fact
that f|X∖K = 0, gives
ψ ◦ θ = (φ ◦ θ) · (f̃ ◦ θ) = (φ ◦ θ) · f = 0,
so using Claim 1 (the uniqueness part), we have ψ = 0. In particular, since φ(α) =
1, this forces f̃(α) = 0, thus proving the Claim.
We define now the collection
Fc = ⋃ { FK : K ⊂ X, K compact }.
Define the set
S = ⋂_{f∈Fc} πf^{-1}({0}).
By the definition of the product topology, it follows that S is closed in T . The fact
that θ(X) is open in B, is then a consequence of the following fact.
Claim 3: One has the equality θ(X) = B r S.
Start first with some point x ∈ X, and let us show that θ(x) ∉ S. Choose some
open set D ⊂ X, with D̄ compact, such that D ∋ x, and apply Urysohn Lemma
to find some continuous map f : X → [0, 1] such that f (x) = 1 and f|X∖D = 0.
It is clear that f ∈ FD̄ ⊂ Fc , but πf (θ(x)) = f (x) = 1 ≠ 0, which means that
θ(x) ∉ πf^{-1}({0}), hence θ(x) ∉ S. Conversely, let us start with some point α =
(αf )f∈F ∈ B ∖ S, and let us prove that α ∈ θ(X). Since α ∉ S, there exists some
f ∈ Fc , such that πf (α) > 0. Since f ∈ Fc , there exists some compact subset
K ⊂ X, such that f|X∖K = 0. Using Claim 2, we know that f̃|B∖θ(K) = 0. Since
f̃(α) = πf (α) ≠ 0, this forces α ∈ θ(K) ⊂ θ(X).
To finish the proof of the Theorem, all we need to prove now is the fact that
θ : X → θ(X) is a homeomorphism, which amounts to proving that, whenever
D ⊂ X is open, it follows that θ(D) is open in B. Fix an open subset D ⊂ X. In
order to show that θ(D) is open in B, we need to show that θ(D) is a neighborhood
for each of its points. Fix some point α ∈ θ(D), i.e. α = θ(x), for some x ∈ D.
Choose some compact subset K ⊂ D, such that x ∈ Int(K), and apply Urysohn
Lemma to find a function f ∈ FK , with f (x) = 1. Consider the continuous function
f̃ : B → [0, 1] given by Claim 1, and apply Claim 2 to conclude that f̃|B∖θ(K) = 0.
In particular the open set
N = f̃^{-1}((1/2, ∞)) ⊂ B
is contained in θ(K) ⊂ θ(D). Since f̃(α) = f (x) = 1, we clearly have α ∈ N . □

Definition. The compactification (θ, \overline{θ(X)}), constructed in the above
Theorem, is called the Stone-Čech compactification of X. The space \overline{θ(X)} will be denoted
by X^β . Using the map θ, we shall identify from now on X with a dense open subset
of X^β . Remark that if X is compact, then X^β = X.
Comment. The Stone-Čech compactification is an inherent “Zorn Lemma
type” construction. For example, if X is a non-compact locally compact space,
and if U is an ultrafilter on X, then either U is convergent to a point in X (this
happens when U contains at least one compact subset of X), or U produces a point
in X^β ∖ X. If θ : X → X^β denotes the inclusion map, then one considers the
ultrafilter θ∗ U on X^β , and by compactness this ultrafilter converges to some (unique)
point in X^β . This way one gets a correspondence
lim_X : { U ⊂ P(X) : U ultrafilter on X } → X^β .
This correspondence is surjective. The injectivity obstruction is characterized as
follows. For two ultrafilters U1 , U2 , the condition lim_X (U1 ) ≠ lim_X (U2 ) is
equivalent to the existence of two disjoint open sets D1 ∈ U1 and D2 ∈ U2 .
Exercise 1. Suppose a set X is equipped with the discrete topology. Prove that
the correspondence limX is bijective.
The Stone-Čech compactification is functorial, in the following sense.
Proposition 5.1. If X and Y are locally compact spaces, and if Φ : X → Y
is a continuous map, then there exists a unique continuous map Φ^β : X^β → Y^β ,
such that Φ^β|X = Φ.
Proof. We use the notations from Theorem 5.2. Define
F = {f : X → [0, 1] : f continuous } and G = {g : Y → [0, 1] : g continuous },
the product spaces
TX = ∏_{f∈F} [0, 1] and TY = ∏_{g∈G} [0, 1],
as well as the maps θX : X → TX and θY : Y → TY , defined by
θX (x) = (f (x))_{f∈F} , ∀ x ∈ X;
θY (y) = (g(y))_{g∈G} , ∀ y ∈ Y.
With these notations, we have X^β = \overline{θX (X)} ⊂ TX and Y^β = \overline{θY (Y )} ⊂ TY . Using
the fact that we have a correspondence G ∋ g ⟼ g ◦ Φ ∈ F , we define the map
Ψ : TX ∋ (αf )f∈F ⟼ (αg◦Φ )g∈G ∈ TY .
Remark that Ψ is continuous. This fact is pretty obvious, because when we compose
with coordinate projections πg : TY → [0, 1], g ∈ G, we have πg ◦ Ψ = πg◦Φ where
πg◦Φ : TX → [0, 1] is the coordinate projection, which is automatically continuous.
Remark that if we start with some point x ∈ X, then
(2) Ψ(θX (x)) = ((g ◦ Φ)(x))_{g∈G} = θY (Φ(x)),
which means that we have the equality Ψ ◦ θX = θY ◦ Φ. Remark first that, since Y^β
is closed, it follows that Ψ^{-1}(Y^β ) is closed in TX . Second, using (2), we clearly have
the inclusion θX (X) ⊂ Ψ^{-1}(θY (Y )) ⊂ Ψ^{-1}(Y^β ), so using the fact that Ψ^{-1}(Y^β ) is
closed, we get the inclusion
X^β = \overline{θX (X)} ⊂ Ψ^{-1}(Y^β ).
In other words, we get now a continuous map Φ^β = Ψ|X^β : X^β → Y^β , which clearly
satisfies Φ^β ◦ θX = θY ◦ Φ, which using our conventions means that Φ^β|X = Φ. The
uniqueness is obvious, by the density of X in X^β . □
Exercise 2. The Alexandrov compactification is not functorial. In other words,
given locally compact spaces X and Y , and a continuous map f : X → Y , in general
there does not exist a continuous map f^α : X^α → Y^α , with f^α|X = f . Give an
example of such a situation.
Hint: Consider X = Y = N, equipped with the discrete topology, and define f : N → N by
f (n) = 1 if n is odd, and f (n) = 2 if n is even.
It turns out that one can define a certain type of continuous maps, with respect
to which the Alexandrov compactification is functorial.
Definition. Let X, Y be locally compact spaces, and let Φ : X → Y be a
continuous map. We say that Φ is proper, if it satisfies the condition
K ⊂ Y , compact ⇒ Φ^{-1}(K) compact in X.
The following interesting property of proper maps will be exploited later.
Proposition 5.2. Let X, Y be locally compact spaces, let Φ : X → Y be a
proper continuous map, and let T ⊂ X be a closed subset. Then the set Φ(T ) is
closed in Y .
Proof. Start with some point y ∈ \overline{Φ(T )}. This means that
(3) D ∩ Φ(T ) ≠ ∅, for every open set D ⊂ Y , with D ∋ y.
Denote by V the collection of all compact neighborhoods of y. In other words,
V ∈ V, if and only if V ⊂ Y is compact, and y ∈ Int(V ). For each V ∈ V we define
the set Ṽ = Φ^{-1}(V ) ∩ T . Since Φ is proper, all sets Ṽ , V ∈ V, are compact. Notice
also that, for every finite number of sets V1 , . . . , Vn ∈ V, if we form the intersection
V = V1 ∩ · · · ∩ Vn , then V ∈ V, and Ṽ ⊂ Ṽj , ∀ j = 1, . . . , n. Remark now that, by
(3), we have Ṽ ≠ ∅, ∀ V ∈ V. Indeed, if we start with some V ∈ V, then applying (3)
to the open set Int(V ) ∋ y we can choose some point x ∈ T , such that
Φ(x) ∈ Int(V ) ⊂ V , so x ∈ Ṽ . Use now the finite intersection
property, to get the fact that ⋂_{V∈V} Ṽ ≠ ∅. Pick now a point x ∈ ⋂_{V∈V} Ṽ . This
means that x ∈ T , and
(4) Φ(x) ∈ V, ∀ V ∈ V.
But now we are done, because this forces Φ(x) = y. Indeed, if Φ(x) ≠ y, using the
Hausdorff property, one could find some V ∈ V with Φ(x) ∉ V , thus contradicting
(4). □
Exercise 3. Let X be a locally compact space, which is non-compact, let Y be
another locally compact space, and let Φ : X → Y be a proper continuous map.
(i) If Y is non-compact, prove that there exists a unique continuous map
Φ^α : X^α → Y^α , with Φ^α|X = Φ.
(ii) If Y is compact, prove that there exists a unique continuous map Ψ : X^α →
Y , with Ψ|X = Φ.
Hint: In case (i) define Φ^α (∞) = ∞. In case (ii) consider the collection
W = { Φ(T ) : T ⊂ X closed, with X ∖ T compact }.
Use the above result, combined with the finite intersection property, to pick a point y ∈ ⋂_{W∈W} W.
Define Ψ(∞) = y.
Lecture 6

6. Metric spaces
In this section we review the basic facts about metric spaces.
Definitions. A metric on a non-empty set X is a map
d : X × X → [0, ∞)
with the following properties:
(i) For points x, y ∈ X, one has d(x, y) = 0 if and only if x = y;
(ii) d(x, y) = d(y, x), for all x, y ∈ X;
(iii) d(x, y) ≤ d(x, z) + d(y, z), for all x, y, z ∈ X.
A metric space is a pair (X, d), where X is a set, and d is a metric on X.
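The axioms can be tested mechanically on a finite sample of points. The sketch below is an illustration only, not part of the notes: the function name `is_metric`, the tolerance `tol`, and the two candidate distance functions are assumptions. The check also includes the usual requirement d(x, x) = 0. It shows that the Euclidean distance in the plane passes on the sample, while the squared Euclidean distance fails the triangle inequality (iii).

```python
from itertools import product


def is_metric(points, d, tol=1e-12):
    """Check the metric axioms for d on a finite sample of points."""
    pts = list(points)
    for x, y in product(pts, repeat=2):
        dxy = d(x, y)
        if dxy < 0:
            return False
        # (i): d(x, y) = 0 exactly when x = y
        if (dxy <= tol) != (x == y):
            return False
        # (ii): symmetry
        if abs(dxy - d(y, x)) > tol:
            return False
    # (iii): d(x, y) <= d(x, z) + d(y, z)
    for x, y, z in product(pts, repeat=3):
        if d(x, y) > d(x, z) + d(y, z) + tol:
            return False
    return True


def euclidean(p, q):
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5


def squared_euclidean(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


# Sample points in the plane (assumed for illustration).
sample = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 3.0)]
```

For the squared distance, the points (0, 0), (2, 0) and (1, 0) give 4 > 1 + 1, so condition (iii) is violated. A passing check on a sample is of course only evidence, not a proof.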
Notations. If (X, d) is a metric space, then for any point x ∈ X and any
r > 0, we define the open and closed balls:
Br (x) = {y ∈ X : d(x, y) < r},
B̄r (x) = {y ∈ X : d(x, y) ≤ r}.
Definition. Suppose (X, d) is a metric space. Then X carries a natural
topology constructed as follows. We say that a set D ⊂ X is open, if it has the
property:
• for every x ∈ D, there exists some rx > 0, such that Brx (x) ⊂ D.
One can prove that the collection
Td = {D ⊂ X : D open }
is indeed a topology, i.e. we have
• ∅ and X are open;
• if (Di )i∈I is a family of open sets, then ⋃_{i∈I} Di is again open;
• if D1 and D2 are open, then D1 ∩ D2 is again open.
The topology thus constructed is called the metric topology.
Remark 6.1. Let (X, d) be a metric space. Then for every p ∈ X, and for
every r > 0, the set Br (p) is open, and the set B̄r (p) is closed.
If we start with some x ∈ Br (p), and if we define rx = r − d(x, p), then for every
y ∈ Brx (x) we will have
d(y, p) ≤ d(y, x) + d(x, p) < rx + d(x, p) = r,
so y belongs to Br (p). This means that Brx (x) ⊂ Br (p). Since this is true for all
x ∈ Br (p), it follows that Br (p) is indeed open.
To prove that B̄r (p) is closed, we need to show that its complement
X ∖ B̄r (p) = {x ∈ X : d(x, p) > r}
is open. If we start with some x ∈ X ∖ B̄r (p), and if we define ρx = d(p, x) − r, then
for every y ∈ Bρx (x) we will have
d(y, p) ≥ d(p, x) − d(y, x) > d(p, x) − ρx = r,
so y belongs to X ∖ B̄r (p). This means that Bρx (x) ⊂ X ∖ B̄r (p). Since this is true
for all x ∈ X ∖ B̄r (p), it follows that X ∖ B̄r (p) is indeed open.
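The two inclusions used in Remark 6.1 can be verified exhaustively on a small finite metric space. The following is a sketch under stated assumptions, not part of the notes: the space is a 13 × 13 integer grid with the taxicab distance (chosen so that all arithmetic is exact), and the names `GRID`, `taxicab`, `open_ball`, `closed_ball` and `remark_6_1_holds` are hypothetical.

```python
from itertools import product


def taxicab(p, q):
    """Taxicab distance on integer points; exact integer arithmetic."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


# The whole metric space X of this example: a finite grid of integer points.
GRID = list(product(range(-6, 7), repeat=2))


def open_ball(center, radius, d=taxicab):
    return {y for y in GRID if d(center, y) < radius}


def closed_ball(center, radius, d=taxicab):
    return {y for y in GRID if d(center, y) <= radius}


def remark_6_1_holds(p, r, d=taxicab):
    """Check both ball inclusions from the remark for every admissible x."""
    ball = open_ball(p, r, d)
    for x in ball:
        r_x = r - d(x, p)                  # radius chosen as in the remark
        if not open_ball(x, r_x, d) <= ball:
            return False
    complement = set(GRID) - closed_ball(p, r, d)
    for x in complement:
        rho_x = d(p, x) - r                # radius chosen as in the remark
        if not open_ball(x, rho_x, d) <= complement:
            return False
    return True
```

Calling `remark_6_1_holds((0, 0), 3)` runs through every point of the grid; the same check can be repeated with another center, radius, or distance function passed as `d`.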
Remark 6.2. The metric topology on a metric space (X, d) is Hausdorff. Indeed,
if we start with two points x, y ∈ X, with x ≠ y, then if we choose r to be a real
number, with
0 < r < d(x, y)/2,
then we have Br (x) ∩ Br (y) = ∅. (Otherwise, if we have a point z ∈ Br (x) ∩ Br (y),
we would have 2r < d(x, y) ≤ d(x, z) + d(y, z) < 2r, which is impossible.)
Remark 6.3. Let (X, d) be a metric space, and let M be a subset of X. Then
d|M×M is a metric on M , and the metric topology on M defined by this metric is
precisely the induced topology from X. This means that a set A ⊂ M is open in M
if and only if there exists some open set D ⊂ X with A = M ∩ D.
The metric space framework is particularly convenient because one can use
convergence.
Definition. Let (X, d) be a metric space. For a point x ∈ X, we say that a
sequence (xn )n≥1 ⊂ X is convergent to x, if lim_{n→∞} d(xn , x) = 0.
Remark 6.4. If (X, d) is a metric space, and if the sequence (xn )n≥1 ⊂ X
is convergent to some point x ∈ X, then
(1) lim_{n→∞} d(xn , y) = d(x, y), ∀ y ∈ X.
This is an immediate consequence of the inequalities
d(x, y) − d(xn , x) ≤ d(xn , y) ≤ d(x, y) + d(xn , x).
Among other things, the equality (1) gives the fact that (xn )n≥1 cannot be
convergent to any other point y ≠ x. Therefore, if (xn )n≥1 is convergent to some
x, then x is uniquely determined, and will be denoted by limn→∞ xn .
Convergence is useful for characterizing closure.
Proposition 6.1. Let (X, d) be a metric space, and let A ⊂ X be a non-empty
subset. For a point x ∈ X, the following are equivalent:
(i) x belongs to the closure Ā of A;
(ii) there exists some sequence (xn )n≥1 ⊂ A, with lim_{n→∞} xn = x.
Proof. (i) ⇒ (ii). Assume x ∈ Ā. This means that
(∗) For every open set D ⊂ X with D ∋ x, the intersection D ∩ A is non-empty.
We use this property for the open sets B1/n (x), n = 1, 2, . . . . So, for every integer
n ≥ 1, we can find a point xn ∈ B1/n (x) ∩ A. This way we have built a sequence
(xn )n≥1 ⊂ A, such that
d(xn , x) < 1/n, ∀ n ≥ 1.
It is clear that this gives x = lim_{n→∞} xn .
(ii) ⇒ (i). Assume x satisfies property (ii). Fix (xn )n≥1 ⊂ A to be a sequence
with lim_{n→∞} xn = x. We need to prove property (∗). Start with some arbitrary
open set D ⊂ X, with x ∈ D. Let ε > 0 be chosen such that Bε (x) ⊂ D. Since
lim_{n→∞} d(xn , x) = 0, there exists some nε such that d(xnε , x) < ε. It is now clear
that
xnε ∈ Bε (x) ∩ A ⊂ D ∩ A,
so the intersection D ∩ A is indeed non-empty. □
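As a concrete instance of the construction in (i) ⇒ (ii), take X = R with the usual distance, A = (0, 1) and x = 0, which lies in the closure of A but not in A. The sketch below is illustrative only: the choice xn = 1/(n + 1) is just one of many admissible selections, the names are hypothetical, and exact rational arithmetic is used to avoid rounding issues.

```python
from fractions import Fraction


def in_A(t):
    """Membership in A = (0, 1), the open unit interval."""
    return 0 < t < 1


def approximating_point(n):
    """One possible choice of a point x_n in B_{1/n}(0) ∩ A."""
    return Fraction(1, n + 1)


x = Fraction(0)
points = [approximating_point(n) for n in range(1, 1001)]

# Every x_n lies in A and satisfies d(x_n, x) < 1/n, as in the proof.
construction_ok = all(
    in_A(x_n) and abs(x_n - x) < Fraction(1, n)
    for n, x_n in enumerate(points, start=1)
)
```

Since d(xn , x) < 1/n for every n, the sequence converges to x = 0, even though 0 itself does not belong to A.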
Continuity can be characterized using convergence, as follows.
Proposition 6.2. Let X and Y be metric spaces, and let f : X → Y be a
function. For a point p ∈ X, the following are equivalent:
(i) f is continuous at p;
(ii) for every ε > 0, there exists some δε > 0 such that
d(f (x), f (p)) < ε, for all x ∈ X with d(x, p) < δε .
(iii) if (xn )n≥1 ⊂ X is a sequence with limn→∞ xn = p, then limn→∞ f (xn ) =
f (p).
Proof. (i) ⇒ (ii). The condition that f is continuous at p means
(∗) for every open set D ⊂ Y , with D ∋ f (p), there exists some open set
E ⊂ X, with p ∈ E ⊂ f^{-1}(D).
Assume f is continuous at p. For every ε > 0, we consider the open ball B^Y_ε (f (p)).
Using (∗), there exists some open set E ⊂ X, with E ∋ p, and f (E) ⊂ B^Y_ε (f (p)).
In particular, there exists δ > 0, such that B^X_δ (p) ⊂ E, so now we have
f (B^X_δ (p)) ⊂ B^Y_ε (f (p)),
which clearly gives (ii).
(ii) ⇒ (iii). Assume f satisfies (ii), and start with some sequence (xn )n≥1 ⊂ X,
which converges to p. For every ε > 0, we choose δε > 0 as in (ii), and using the
fact that lim_{n→∞} xn = p, we can also choose some Nε such that
d(xn , p) < δε , ∀ n ≥ Nε .
Using (ii) this will give
d(f (xn ), f (p)) < ε, ∀ n ≥ Nε .
In other words, we get the fact that
lim_{n→∞} d(f (xn ), f (p)) = 0,
which means that we indeed have lim_{n→∞} f (xn ) = f (p).


(iii) ⇒ (i). Assume $f$ satisfies (iii), but $f$ is not continuous at $p$. By (∗) this means that there exists some open set $D_0 \subset Y$ with $D_0 \ni f(p)$, such that
($\ast_0$) for every open set $E \subset X$ with $E \ni p$, we have $f(E) \not\subset D_0$.
It is clear that any other open set $D$, with $f(p) \in D \subset D_0$, will again satisfy property ($\ast_0$). Fix then some $r > 0$, such that $B^Y_r\bigl(f(p)\bigr) \subset D_0$. Using condition ($\ast_0$) it follows that for every integer $n \ge 1$, we have
\[ f\bigl(B^X_{1/n}(p)\bigr) \not\subset B^Y_r\bigl(f(p)\bigr). \]
This means that, for every integer $n \ge 1$, we can find a point $x_n \in X$ such that
\[ d(x_n, p) < \frac{1}{n} \quad \text{and} \quad d\bigl(f(x_n), f(p)\bigr) \ge r. \]
It is then clear that the sequence $(x_n)_{n\ge 1} \subset X$ is convergent to $p$, but the sequence $\bigl(f(x_n)\bigr)_{n\ge 1} \subset Y$ is not convergent to $f(p)$. This will contradict (iii).
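The sequential criterion in Proposition 6.2 can be observed numerically. The sketch below is only an illustration under our own assumptions: the functions `step` and `square` and the sequence $x_n = 1/n$ are hypothetical choices, not taken from the notes. For the discontinuous function the distances $d(f(x_n), f(p))$ stay bounded away from $0$, exactly as in the proof of (iii) ⇒ (i).

```python
from fractions import Fraction

def step(x):
    """Hypothetical test function: 0 for x <= 0 and 1 for x > 0 (discontinuous at p = 0)."""
    return 1 if x > 0 else 0

def square(x):
    """Continuous comparison function."""
    return x * x

p = Fraction(0)
xs = [Fraction(1, n) for n in range(1, 1001)]   # x_n = 1/n converges to p = 0

# Distances d(f(x_n), f(p)) along the sequence, computed exactly.
gap_step = [abs(step(x) - step(p)) for x in xs]
gap_square = [abs(square(x) - square(p)) for x in xs]

print(min(gap_step))     # never drops below r = 1, so f(x_n) does not approach f(p)
print(gap_square[-1])    # shrinks with n for the continuous function
```

A finite computation of this kind proves nothing about limits; it only displays the quantities that appear in the proof.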

Convergence can also be used for characterizing compactness.


Theorem 6.1. Let (X, d) be a metric space. The following are equivalent:
(i) X is compact in the metric topology;
(ii) every sequence has a convergent subsequence.
Proof. (i) ⇒ (ii). Assume X is compact. Start with an arbitrary sequence
$(x_n)_{n\ge 1} \subset X$. For every $n \ge 1$, we define the closed set
\[ T_n = \overline{\{x_k : k > n\}}. \]
It is obvious that the family of closed sets $(T_n)_{n\ge 1}$ has the finite intersection property, i.e. for every non-empty finite set $F$ of indices, we have
\[ \bigcap_{n\in F} T_n \ne \emptyset. \]
(This follows from the fact that the $T_n$'s form a decreasing sequence of non-empty sets.) By compactness, it follows that
\[ \bigcap_{n\ge 1} T_n \ne \emptyset. \]
Take a point $x \in \bigcap_{n\ge 1} T_n$. The key feature of $x$ is given by the following:
Claim 1: For every $\varepsilon > 0$ and every integer $\ell \ge 1$, there exists some integer $N(\varepsilon, \ell) > \ell$ such that $d(x_{N(\varepsilon,\ell)}, x) < \varepsilon$.
This is a consequence of the fact that, for every $\ell \ge 1$, the point $x$ belongs to the closure $\overline{\{x_N : N > \ell\}}$, so for every $\varepsilon > 0$ we have
\[ B_\varepsilon(x) \cap \{x_N : N > \ell\} \ne \emptyset. \]
Using Claim 1, we define a sequence $(k_n)_{n\ge 0}$ of integers, recursively by
\[ k_n = N\bigl(\tfrac{1}{n}, k_{n-1}\bigr), \quad \forall\, n \ge 1. \]
(The initial term $k_0$ is chosen arbitrarily.) We have, by construction, $k_0 < k_1 < k_2 < \dots$, and
\[ d(x_{k_n}, x) < \frac{1}{n}, \quad \forall\, n \ge 1, \]
so $(x_{k_n})_{n\ge 1}$ is indeed a subsequence of $(x_k)_{k\ge 1}$, which is convergent (to $x$).
(ii) ⇒ (i). Assume (ii). Before we start proving that $X$ is compact, we shall need some preparations.
Claim 2: For every $r > 0$ there exists a finite set $F \subset X$, such that
\[ X = \bigcup_{x\in F} B_r(x). \]

We prove this by contradiction. Assume there exists some $r > 0$, such that
\[ \bigcup_{x\in F} B_r(x) \subsetneq X, \]
for every finite set $F \subset X$. In particular, there exists a sequence $(x_n)_{n\ge 1}$ such that
\[ x_{n+1} \in X \setminus \bigl[B_r(x_1) \cup \dots \cup B_r(x_n)\bigr], \quad \forall\, n \ge 1. \]
This will force
\[ d(x_m, x_n) \ge r, \quad \forall\, m > n \ge 1. \]
Notice that every subsequence (xkn )n≥1 will satisfy the same property
d(xkm , xkn ) ≥ r, ∀ m > n ≥ 1.
This proves that no subsequence of (xn )n≥1 is Cauchy, so no subsequence of (xn )n≥1
can be convergent, thus contradicting (ii).
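The recursive choice of $x_{n+1}$ outside $B_r(x_1) \cup \dots \cup B_r(x_n)$ is a greedy procedure, and on a finite sample it can be run directly. The following is a minimal sketch under our own assumptions: the helper name `greedy_net`, the sample of $[0,1]$, and the radius $r = 1/4$ are hypothetical and serve only as an illustration. The chosen centers are pairwise at distance at least $r$, and the procedure stops exactly when the balls cover the sample.

```python
from fractions import Fraction

def greedy_net(points, r, dist):
    """Pick centers one at a time, each lying outside every ball B_r(c) chosen so far."""
    centers = []
    for p in points:
        if all(dist(p, c) >= r for c in centers):
            centers.append(p)
    return centers

def dist(a, b):
    return abs(a - b)

r = Fraction(1, 4)
points = [Fraction(k, 100) for k in range(101)]     # a finite sample of [0, 1]
centers = greedy_net(points, r, dist)

# Every sample point lies in some ball B_r(c), and the centers are r-separated.
covered = all(any(dist(p, c) < r for c in centers) for p in points)
separated = all(dist(a, b) >= r for a in centers for b in centers if a != b)
print(centers, covered, separated)
```

In a space where every sequence has a convergent subsequence, the proof of Claim 2 shows that this procedure cannot continue indefinitely.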
Having proven Claim 2, we choose, for every integer $n \ge 1$, a finite set $F_n \subset X$ such that
\[ X = \bigcup_{x\in F_n} B_{1/n}(x). \]
Claim 3: The collection $\mathcal{W} = \bigl\{B_{1/n}(x) : n \in \mathbb{N},\ x \in F_n\bigr\}$ is a base for the metric topology.
What we need to show is that every open set is a union of sets in $\mathcal{W}$. Fix an open set $D$ and a point $p \in D$. Choose $r > 0$, such that $B_r(p) \subset D$. Choose then some integer $n \ge 1$, such that $\frac{1}{n} < \frac{r}{2}$, and choose some point $x \in F_n$, such that $p \in B_{1/n}(x)$. Notice that, for every $y \in B_{1/n}(x)$, we have
\[ d(y, p) \le d(y, x) + d(x, p) < \frac{1}{n} + \frac{1}{n} \le r, \]
which proves that $y \in B_r(p)$. Therefore we have
\[ p \in B_{1/n}(x) \subset B_r(p) \subset D. \]
Since $p \in D$ is arbitrary, this proves that $D$ is a union of sets in $\mathcal{W}$.
We now begin proving that $X$ is compact. Start with a collection $(D_i)_{i\in I}$ of open sets, with $\bigcup_{i\in I} D_i = X$. We need to find a finite set of indices $I_0 \subset I$, such that $\bigcup_{i\in I_0} D_i = X$. First we show that:
Claim 4: There exists a countable set of indices $I_1 \subset I$, such that
\[ \bigcup_{i\in I_1} D_i = X. \]

The key fact is that the base $\mathcal{W}$ is countable. Let us enumerate the base $\mathcal{W}$ as a sequence
\[ \mathcal{W} = \{W_m : m \in \mathbb{N}\}. \]
For each $i \in I$, we define the set
\[ M_i = \{m \ge 1 : W_m \subset D_i\}. \]
By Claim 3, we know that for every $x \in D_i$ there exists some $m \in M_i$ such that $x \in W_m \subset D_i$. In particular this proves the equality
\[ D_i = \bigcup_{m\in M_i} W_m, \quad \forall\, i \in I. \]
Consider then the union $M = \bigcup_{i\in I} M_i$, which is countable, being a subset of the integers. We clearly have
\[ \bigcup_{m\in M} W_m = \bigcup_{i\in I} \Bigl( \bigcup_{m\in M_i} W_m \Bigr) = \bigcup_{i\in I} D_i = X. \]

For every $m \in M$ we choose an $i_m \in I$, such that $m \in M_{i_m}$. If we take
\[ I_1 = \{i_m : m \in M\}, \]
then $I_1$ is obviously countable, and since we clearly have $W_m \subset D_{i_m}$, we get
\[ X = \bigcup_{m\in M} W_m \subset \bigcup_{m\in M} D_{i_m} = \bigcup_{i\in I_1} D_i, \]
so the Claim is proven.


Let us list the countable set $I_1$ as
\[ I_1 = \{i_k : k \ge 1\}. \]
(Of course, if $I_1$ is already finite, there is nothing to prove. So we will assume that $I_1$ is infinite.) In order to finish the proof, we must find some $k$, such that $D_{i_1} \cup D_{i_2} \cup \dots \cup D_{i_k} = X$. Assume no such $k$ can be found, which means that
\[ D_{i_1} \cup D_{i_2} \cup \dots \cup D_{i_k} \subsetneq X, \quad \forall\, k \ge 1. \]
In other words, if we define, for each $k \ge 1$, the closed set
\[ A_k = X \setminus (D_{i_1} \cup D_{i_2} \cup \dots \cup D_{i_k}), \]
we have
\[ A_k \ne \emptyset, \quad \forall\, k \ge 1. \]
For each $k \ge 1$ we choose a point $x_k \in A_k$. This way we have constructed a sequence $(x_k)_{k\ge 1} \subset X$, so using property (ii) we can find a convergent subsequence.
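This means that we have a sequence of integers
\[ 1 \le k_1 < k_2 < \dots \]
and a point $x \in X$, such that $\lim_{n\to\infty} x_{k_n} = x$. Notice that, since
\[ k_n \ge n, \quad \forall\, n \ge 1, \]
and since the sequence $(A_k)_{k\ge 1}$ is decreasing, we get the fact that, for each $m \ge 1$, we have
\[ x_{k_n} \in A_m, \quad \forall\, n \ge m. \]
Since $A_m$ is closed, this forces $x \in A_m$, for all $m \ge 1$. But this is clearly impossible, since
\[ \bigcap_{m\ge 1} A_m = \bigcap_{m\ge 1} \bigl[ X \setminus (D_{i_1} \cup \dots \cup D_{i_m}) \bigr] = X \setminus \Bigl( \bigcup_{i\in I_1} D_i \Bigr) = \emptyset. \]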

Corollary 6.1 (of the proof). Every compact metric space is second countable, which means that there exists a sequence $(W_m)_{m\ge 1}$ of open sets, with the property
(b) for every open set $D$, there exists a subset $M \subset \mathbb{N}$ such that
\[ D = \bigcup_{m\in M} W_m. \]
Proof. By the implication (i) ⇒ (ii) of Theorem 6.1, every sequence has a convergent subsequence. Now follow the steps in the proof of (ii) ⇒ (i), up to the proof of Claim 3.
Corollary 6.2. Let (X, d) be a metric space. For a subset K ⊂ X the fol-
lowing are equivalent:
(i) every sequence in K has a subsequence which is convergent to some point
in K;
(ii) K is compact in X.
Proof. (i) ⇒ (ii). By the above Theorem, we know that when we equip $K$ with the metric $d|_{K\times K}$, then $K$ is compact. This means that $K$ is compact in the induced topology, which means exactly that $K$ is compact in $X$.
(ii) ⇒ (i). Argue as above. If $K$ is compact in $X$, then $K$ is compact when equipped with the induced topology, which means that $(K, d|_{K\times K})$ is compact, so the above Theorem applies.

Corollary 6.3. Let $X$ and $Y$ be metric spaces, and let $f : X \to Y$ be a continuous map. If $X$ is compact, then $f$ is uniformly continuous, that is,
• for every $\varepsilon > 0$, there exists some $\delta_\varepsilon > 0$, such that
\[ d\bigl(f(x), f(x')\bigr) < \varepsilon, \quad \text{for all } x, x' \in X \text{ with } d(x, x') < \delta_\varepsilon. \]


Proof. Suppose $f$ is not uniformly continuous, so there exists some $\varepsilon_0 > 0$, with the property that for any $\delta > 0$ there exist $x, x' \in X$, with $d(x, x') < \delta$, but $d\bigl(f(x), f(x')\bigr) \ge \varepsilon_0$. In particular, one can construct two sequences $(x_n)_{n\ge 1}$ and $(x'_n)_{n\ge 1}$ with
\[ d(x_n, x'_n) < \frac{1}{n} \quad \text{and} \quad d\bigl(f(x_n), f(x'_n)\bigr) \ge \varepsilon_0, \quad \forall\, n \ge 1. \tag{2} \]
Using compactness, we can find a subsequence (xnk )k≥1 of (xn )n≥1 which converges
to some point p. On the one hand, we have
\[ d(p, x'_{n_k}) \le d(p, x_{n_k}) + d(x_{n_k}, x'_{n_k}) < d(p, x_{n_k}) + \frac{1}{n_k}, \quad \forall\, k \ge 1, \]
which proves that
\[ \lim_{k\to\infty} x'_{n_k} = p. \tag{3} \]

On the other hand, using (2) we also have
\[ \varepsilon_0 \le d\bigl(f(x_{n_k}), f(x'_{n_k})\bigr) \le d\bigl(f(p), f(x_{n_k})\bigr) + d\bigl(f(p), f(x'_{n_k})\bigr), \]
which leads to a contradiction, because the equalities
\[ \lim_{k\to\infty} x_{n_k} = \lim_{k\to\infty} x'_{n_k} = p, \]
together with the continuity of $f$, will force
\[ \lim_{k\to\infty} d\bigl(f(p), f(x_{n_k})\bigr) = \lim_{k\to\infty} d\bigl(f(p), f(x'_{n_k})\bigr) = 0. \]
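Compactness cannot be dropped in Corollary 6.3. As a hedged illustration (the function $f(x) = 1/x$ on the non-compact space $(0, 1]$ and the helper name `witness` are our own choices, not part of the notes), the sketch below produces pairs $x_n, x'_n$ as in (2): their distance is below $1/n$, while the images stay at distance $\varepsilon_0 = 1$.

```python
from fractions import Fraction

def f(x):
    """f(x) = 1/x, continuous on (0, 1] but not uniformly continuous there."""
    return 1 / x

def witness(n):
    """Return (d(x_n, x'_n), d(f(x_n), f(x'_n))) for x_n = 1/n and x'_n = 1/(n + 1)."""
    x, x_prime = Fraction(1, n), Fraction(1, n + 1)
    return abs(x - x_prime), abs(f(x) - f(x_prime))

for n in (1, 10, 100, 1000):
    gap, image_gap = witness(n)
    print(n, gap, image_gap)    # gap = 1/(n(n+1)) < 1/n, while image_gap is always 1
```

Here neither sequence has a subsequence converging in $(0, 1]$, which is exactly the step of the proof that uses compactness.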

Remark 6.5. Let $X$ be a metric space. Then any compact subset $K \subset X$ is closed (this is a consequence of the fact that $X$ is Hausdorff) and bounded, in the sense that for every $p \in X$ we have
\[ \sup_{x\in K} d(x, p) < \infty. \]
This is a consequence of the continuity (see ??) of the map
\[ K \ni x \longmapsto d(x, p) \in [0, \infty). \]
In general however the converse is not true, i.e. there are metric spaces in which
closed bounded sets may fail to be compact.
Exercise 1. Equip $\mathbb{R}$ with the metric
\[ d(x, y) = \frac{|x - y|}{1 + |x - y|}, \quad \forall\, x, y \in \mathbb{R}. \]
Prove that d is indeed a metric on R, and the metric topology on R defined by d is
the usual topology. Prove that R is bounded with respect to this metric.
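Before attempting the proof, the two claims about this metric can be sanity-checked on a finite sample. The sketch below is only a numerical check under our own assumptions (random integer sample points, exact rational arithmetic); it is not a substitute for the argument asked for in the exercise.

```python
import itertools
import random
from fractions import Fraction

def d(x, y):
    """The metric of Exercise 1, evaluated exactly on rational inputs."""
    return abs(x - y) / (1 + abs(x - y))

random.seed(0)
sample = [Fraction(random.randint(-1000, 1000)) for _ in range(30)]

# Largest violation of the triangle inequality d(x, z) <= d(x, y) + d(y, z) on the sample.
worst = max(d(x, z) - d(x, y) - d(y, z)
            for x, y, z in itertools.product(sample, repeat=3))
# Largest distance observed; it stays strictly below 1.
largest = max(d(x, y) for x in sample for y in sample)
print(worst, largest)
```

The value `worst` is never positive, and `largest` is below $1$, in agreement with the boundedness statement.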
Exercise 2. Start with a metric space X, and let (xn )n≥1 ⊂ X be a sequence
which is convergent to some point x. Prove that the set
K = {x} ∪ {xn : n ≥ 1}
is compact in X.
Definition. Let $(X, d)$ be a metric space. For a point $x \in X$ and a non-empty subset $A \subset X$, one defines the distance from $x$ to $A$ as the number
\[ d(x, A) = \inf\bigl\{ d(x, a) : a \in A \bigr\}. \]
Exercise 3. Let (X, d) be a metric space, and let A be a non-empty subset of
X.
(i) For a point $x \in X$, prove that the equality $d(x, A) = 0$ is equivalent to the fact that $x \in \overline{A}$.
(ii) Prove the inequality
\[ \bigl| d(x, A) - d(y, A) \bigr| \le d(x, y), \quad \forall\, x, y \in X. \]
Using (ii) conclude that the map
\[ X \ni x \longmapsto d(x, A) \in [0, \infty) \]
is continuous.
Proposition 6.3. Let (X, d) be a metric space. When equipped with the metric
topology, X is normal.
Proof. Let $A$ and $B$ be closed subsets of $X$ with $A \cap B = \emptyset$. We need to find open sets $U, V \subset X$, with $U \supset A$, $V \supset B$, and $U \cap V = \emptyset$. We are going to use a converse of Urysohn Lemma. More explicitly, let us define the function $f : X \to [0, 1]$ by
\[ f(x) = \frac{d(x, A)}{d(x, A) + d(x, B)}, \quad x \in X. \]
Notice that by Exercise 3, both the numerator and denominator are continuous, and the denominator never vanishes (the equality $d(x, A) + d(x, B) = 0$ would force $x \in \overline{A} \cap \overline{B} = A \cap B = \emptyset$). So $f$ is indeed continuous. It is obvious that $f|_A = 0$ and $f|_B = 1$, so if we take the open sets $U = f^{-1}\bigl((-\infty, \tfrac{1}{2})\bigr)$ and $V = f^{-1}\bigl((\tfrac{1}{2}, \infty)\bigr)$, we clearly get the desired result.
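The separating function used in this proof is easy to evaluate when $A$ and $B$ are finite subsets of $\mathbb{R}$. The sketch below is an illustration under our own assumptions: the sets `A` and `B`, and the helper names `dist_to_set` and `urysohn`, are hypothetical. It shows that $f$ vanishes on $A$, equals $1$ on $B$, and that the value $\tfrac12$ separates the two open sets $U$ and $V$.

```python
def dist_to_set(x, A):
    """d(x, A) = inf{ |x - a| : a in A } for a finite non-empty set A of reals."""
    return min(abs(x - a) for a in A)

def urysohn(x, A, B):
    """f(x) = d(x, A) / (d(x, A) + d(x, B)); the denominator is non-zero when A and B are disjoint."""
    dA, dB = dist_to_set(x, A), dist_to_set(x, B)
    return dA / (dA + dB)

A = [0.0, 1.0]      # two disjoint closed (here: finite) subsets of R
B = [3.0, 4.0]

for x in (0.0, 1.0, 1.5, 2.0, 3.0, 4.0):
    value = urysohn(x, A, B)
    side = "U" if value < 0.5 else ("V" if value > 0.5 else "neither")
    print(x, value, side)
```

The midpoint $x = 2$ lies in neither $U$ nor $V$, which is consistent with $U \cap V = \emptyset$.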




We continue now with a discussion on completeness.
Definitions. Let (X, d) be a metric space. A sequence (xn )n≥1 ⊂ X is said
to be a Cauchy sequence, if it has the following property.
(C) For every ε > 0, there exists some integer Nε ≥ 1 such that
d(xm , xn ) < ε, ∀ m, n ≥ Nε .
The metric space (X, d) is said to be complete, if every Cauchy sequence is
convergent.
The following result summarizes some equivalent characterizations of complete-
ness.

Proposition 6.4. Let (X, d) be a metric space. The following are equivalent.
(i) (X, d) is complete.
(ii) Every sequence $(x_n)_{n\ge 1} \subset X$, with
\[ \sum_{n=1}^{\infty} d(x_{n+1}, x_n) < \infty, \tag{4} \]
is convergent.
(iii) Every Cauchy sequence has a convergent subsequence.
Proof. (i) ⇒ (ii). Assume X is complete. Let (xn )n≥1 ⊂ X be a sequence
with property (4). To prove (ii) it suffices to show that (xn )n≥1 is Cauchy. For
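every $N \ge 1$ we define
\[ R_N = \sum_{n=N}^{\infty} d(x_{n+1}, x_n). \]
Using (4) we get $\lim_{N\to\infty} R_N = 0$, so for every $\varepsilon > 0$ there exists some $N(\varepsilon)$ with $R_{N(\varepsilon)} < \varepsilon$. Notice also that the sequence $(R_N)_{N\ge 1}$ is decreasing. If $m > n \ge N(\varepsilon)$, then
\[ d(x_m, x_n) \le \sum_{k=n}^{m-1} d(x_{k+1}, x_k) \le \sum_{k=n}^{\infty} d(x_{k+1}, x_k) = R_n \le R_{N(\varepsilon)} < \varepsilon, \]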

so (xn )n≥1 is indeed Cauchy.


(ii) ⇒ (iii). Start with some Cauchy sequence $(y_k)_{k\ge 1}$. For every $n \ge 1$ choose an integer $N(n) \ge 1$ such that
\[ d(y_k, y_\ell) < \frac{1}{2^n}, \quad \forall\, k, \ell \ge N(n). \tag{5} \]
Start with some arbitrary $k_1 \ge N(1)$ and define recursively an entire sequence $(k_n)_{n\ge 1}$ of integers, by
\[ k_{n+1} = \max\{k_n + 1, N(n + 1)\}, \quad n \ge 1. \]
Clearly we have $k_1 < k_2 < \dots$, and since we have
\[ k_{n+1} > k_n \ge N(n), \quad \forall\, n \ge 1, \]
using (5), we get
\[ d(y_{k_{n+1}}, y_{k_n}) < \frac{1}{2^n}, \quad \forall\, n \ge 1. \]
So if we define the subsequence $x_n = y_{k_n}$, $n \ge 1$, we will have
\[ \sum_{n=1}^{\infty} d(x_{n+1}, x_n) \le \sum_{n=1}^{\infty} \frac{1}{2^n} = 1, \]
so the subsequence $(x_n)_{n\ge 1}$ satisfies condition (4). By (ii) the subsequence $(x_n)_{n\ge 1}$ is convergent.
(iii) ⇒ (i). Assume condition (iii) holds. Start with some Cauchy sequence
(xn )n≥1 . For every integer n ≥ 1 we put
\[ S_n = \sup_{\ell, m \ge n} d(x_\ell, x_m). \]
Since $(x_n)_{n\ge 1}$ is Cauchy, we have
\[ \lim_{n\to\infty} S_n = 0. \tag{6} \]

Using the assumption, we can find a subsequence (xkn )n≥1 (defined by an increasing
sequence of integers 1 ≤ k1 < k2 < . . . ) which is convergent to some point x. We
are going to prove that the entire sequence (xn )n≥1 is convergent to x. Fix for the
moment n ≥ 1. For every m ≥ n, we have km ≥ m ≥ n, so we have
\[ S_n \ge d(x_n, x_{k_m}), \quad \forall\, m \ge n. \tag{7} \]
By Remark 3.4, we also know that
\[ \lim_{m\to\infty} d(x_n, x_{k_m}) = d(x_n, x), \]
so if we take $\lim_{m\to\infty}$ in (7) we will get
\[ d(x_n, x) \le S_n. \]
Since this estimate holds for arbitrary n ≥ 1, using (6) we immediately get the fact
that (xn )n≥1 is indeed convergent to x. 
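The index selection $k_{n+1} = \max\{k_n + 1, N(n+1)\}$ from the step (ii) ⇒ (iii) can be carried out explicitly for a concrete Cauchy sequence. The sketch below assumes the hypothetical example $y_k = 1/k$ in $\mathbb{R}$, for which $N(n) = 2^n$ is an admissible choice in (5); both choices are ours and serve only as an illustration.

```python
from fractions import Fraction

def y(k):
    """Hypothetical Cauchy sequence y_k = 1/k."""
    return Fraction(1, k)

def N(n):
    """For k, l >= 2**n we have |y_k - y_l| < 2**-n, so N(n) = 2**n works in (5)."""
    return 2 ** n

def fast_indices(count):
    """Return [k_1, ..., k_count] with k_1 = N(1) and k_{n+1} = max(k_n + 1, N(n + 1))."""
    ks = [N(1)]
    for n in range(1, count):
        ks.append(max(ks[-1] + 1, N(n + 1)))
    return ks

ks = fast_indices(10)
# steps[n - 1] = d(y_{k_{n+1}}, y_{k_n}) for n = 1, ..., 9
steps = [abs(y(ks[n]) - y(ks[n - 1])) for n in range(1, len(ks))]
print(ks)
print(sum(steps))    # partial sum of the series in (4); it stays below 1
```

Each step is smaller than $2^{-n}$, so the partial sums of the series in (4) are bounded by $1$, as in the proof.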

Proposition 6.5. Suppose $(X, d)$ is a complete metric space, and $Y$ is a subset of $X$. The following are equivalent:
(i) Y is complete, when equipped with the metric from X;
(ii) Y is closed in X, in the metric topology.

Proof. (i) ⇒ (ii). Assume $Y$ is complete, and let us prove that $Y$ is closed. Start with a point $x \in \overline{Y}$. Then there exists a sequence $(y_n)_{n\ge 1} \subset Y$ with $\lim_{n\to\infty} y_n = x$. Notice that $(y_n)_{n\ge 1}$ is Cauchy in $Y$, so by assumption, $(y_n)_{n\ge 1}$ is convergent to some point in $Y$. By the uniqueness of the limit, this will then clearly force $x \in Y$.
(ii) ⇒ (i). Assume Y is closed, and let us prove that Y is complete. Start
with a Cauchy sequence (yn )n≥1 ⊂ Y . Since X is complete, the sequence (yn )n≥1
is convergent to some point x ∈ X. Since Y is closed, this forces x ∈ Y . 

Remark 6.6. Using Theorem 6.1, we immediately see that a metric space,
which is compact in the metric topology, is automatically complete.
The next result identifies those complete metric spaces that are compact. In
order to formulate it, we need the following:
Definition. Let (X, d) be a metric space, and let ε > 0. A subset A ⊂ X is
said to be ε-rare, if
\[ d(a, b) \ge \varepsilon, \quad \text{for all } a, b \in A \text{ with } a \ne b. \]
Proposition 6.6. Let (X, d) be a complete metric space. The following are
equivalent:
(i) X is compact in the metric topology;
(ii) for each ε > 0, all ε-rare subsets of X are finite;
(iii) for any ε > 0, there exist finitely many points p1 , p2 , . . . , pn ∈ X, such
that
X = Bε (p1 ) ∪ Bε (p2 ) ∪ · · · ∪ Bε (pn ).

Proof. (i) ⇒ (ii). Assume $X$ is compact. We prove (ii) by contradiction. Assume there exists some $\varepsilon > 0$ and an infinite $\varepsilon$-rare set $A \subset X$. It then follows that there exists a sequence $(a_n)_{n\ge 1} \subset A$, such that
\[ d(a_m, a_n) \ge \varepsilon, \quad \forall\, m > n \ge 1. \]
It is clear that no subsequence of $(a_n)_{n\ge 1}$ is Cauchy, which means that $(a_n)_{n\ge 1}$ does not have any convergent subsequence, thus contradicting the fact that $X$ is compact.
(ii) ⇒ (iii). Assume property (ii) and let us prove (iii) by contradiction. Assume there exists some $\varepsilon > 0$, such that, for every finite set $F \subset X$, one has a strict inclusion
\[ \bigcup_{x\in F} B_\varepsilon(x) \subsetneq X. \]
Start with some arbitrary point $a_1 \in X$, and construct recursively a sequence $(a_n)_{n\ge 1} \subset X$, by choosing
\[ a_{n+1} \in X \setminus \bigl[ B_\varepsilon(a_1) \cup \dots \cup B_\varepsilon(a_n) \bigr], \quad \forall\, n \ge 1. \]
This will then force
\[ d(a_m, a_n) \ge \varepsilon, \quad \forall\, m > n \ge 1, \]
so $A = \{a_n : n \in \mathbb{N}\}$ will be an infinite $\varepsilon$-rare set, thus contradicting (ii).
(iii) ⇒ (i). Assume property (iii), and let us prove that X is compact. We are
going to use Theorem 6.1. Start with an arbitrary sequence (xn )n≥1 ⊂ X, and let
us construct a convergent subsequence.
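Claim: There exists a sequence $(p_n)_{n\ge 1} \subset X$, such that for every integer $k \ge 1$, the set
\[ M_k = \Bigl\{ n \in \mathbb{N} : x_n \in \bigcap_{\ell=1}^{k} B_{1/\ell}(p_\ell) \Bigr\} \]
is infinite.
The sequence $(p_n)_{n\ge 1}$ is constructed recursively. To start, we use (iii) to find a finite set $F_1 \subset X$, such that
\[ X = \bigcup_{p\in F_1} B_1(p). \]
If we define, for each $p \in F_1$, the set
\[ S_1(p) = \{n \in \mathbb{N} : x_n \in B_1(p)\}, \]
then we clearly have
\[ \bigcup_{p\in F_1} S_1(p) = \mathbb{N}, \]
so in particular one of the sets $S_1(p)$, $p \in F_1$, is infinite. We choose $p_1 \in F_1$ to be one point for which $S_1(p_1) = M_1$ is infinite.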
Suppose now we have constructed points $p_1, p_2, \dots, p_{m-1}$, such that, for every $k \in \{1, \dots, m-1\}$, the set
\[ M_k = \Bigl\{ n \in \mathbb{N} : x_n \in \bigcap_{\ell=1}^{k} B_{1/\ell}(p_\ell) \Bigr\} \]
is infinite, and let us indicate how the next term $p_m$ is to be constructed. Start with a finite set $F_m \subset X$, such that
\[ X = \bigcup_{p\in F_m} B_{1/m}(p), \]
and define, for each $p \in F_m$, the set
\[ S_m(p) = \bigl\{ n \in M_{m-1} : x_n \in B_{1/m}(p) \bigr\}. \]
It is clear that
\[ M_{m-1} = \bigcup_{p\in F_m} S_m(p), \]
and since $M_{m-1}$ is infinite, it follows that one of the sets $S_m(p)$, $p \in F_m$, is infinite. We then choose $p_m \in F_m$ to be one point for which $S_m(p_m)$ is infinite. (Notice that $M_m = S_m(p_m)$.)
Having proven the Claim, let us construct a sequence of integers $1 \le n_1 < n_2 < \dots$ as follows. Start with some arbitrary $n_1 \in M_1$. Once $n_1 < n_2 < \dots < n_k$ have been constructed, we choose the integer $n_{k+1} \in M_{k+1}$, such that $n_{k+1} > n_k$. (It is here that we use the fact that $M_{k+1}$ is infinite.) By construction, we have
\[ n_k \in M_k, \quad \forall\, k \ge 1. \]
Suppose $k \ge \ell \ge 1$. Then by construction we have $n_k \in M_k \subset M_\ell$ and $n_\ell \in M_\ell$, so both $x_{n_k}$ and $x_{n_\ell}$ belong to $B_{1/\ell}(p_\ell)$. In particular we get
\[ d(x_{n_k}, x_{n_\ell}) \le d(x_{n_k}, p_\ell) + d(x_{n_\ell}, p_\ell) < \frac{2}{\ell}. \]
The above estimate clearly proves that the subsequence $(x_{n_k})_{k\ge 1}$ is Cauchy. Since $X$ is complete, it follows that $(x_{n_k})_{k\ge 1}$ is convergent.
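The pigeonhole refinement behind the Claim can be imitated on a finite list of points. The sketch below is a finite analogue under our own assumptions: the points $x_n = n\sqrt{2} \bmod 1$ in $[0, 1)$, the grid centers $j/\ell$, and the helper name `pigeonhole_indices` are hypothetical, and "infinite" is replaced by "largest", since a finite list has no infinite index sets. At stage $\ell$ we keep the indices falling in the fullest ball of radius $1/\ell$.

```python
import math

def pigeonhole_indices(xs, stages):
    """At stage l keep the indices n (among those kept so far) in the fullest ball B_{1/l}(p)."""
    kept = list(range(len(xs)))
    for l in range(1, stages + 1):
        centers = [j / l for j in range(l + 1)]        # the balls B_{1/l}(j/l) cover [0, 1]
        buckets = [[n for n in kept if abs(xs[n] - p) < 1 / l] for p in centers]
        kept = max(buckets, key=len)                   # finite stand-in for "an infinite S_l(p)"
    return kept

xs = [(n * math.sqrt(2)) % 1.0 for n in range(1, 2001)]
kept = pigeonhole_indices(xs, 5)
spread = max(xs[n] for n in kept) - min(xs[n] for n in kept)
print(len(kept), spread)     # all surviving points lie in one ball of radius 1/5
```

The surviving points are pairwise closer than $2/\ell$ with $\ell = 5$, which mirrors the estimate $d(x_{n_k}, x_{n_\ell}) < 2/\ell$ in the proof.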
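Corollary 6.4. Let $(X, d)$ be a complete metric space, and let $A$ be a subset of $X$. The following are equivalent:
(i) the closure $\overline{A}$ is compact in $X$;
(ii) for each $\varepsilon > 0$, all $\varepsilon$-rare subsets of $A$ are finite.
Proof. (i) ⇒ (ii). This is trivial from the above result, since $\overline{A}$ is complete and every $\varepsilon$-rare subset of $A$ is an $\varepsilon$-rare subset of $\overline{A}$.
(ii) ⇒ (i). Assume (ii), and let us prove that $\overline{A}$ is compact. Since $\overline{A}$ is complete, it suffices to prove that, for each $\varepsilon > 0$, all $\varepsilon$-rare subsets of $\overline{A}$ are finite. Fix $\varepsilon > 0$, and let $B$ be an $\varepsilon$-rare subset of $\overline{A}$. For each $x \in B$, let us choose a point $a_x \in A$, such that $x \in B_{\varepsilon/3}(a_x)$. Suppose $x, y \in B$ are such that $x \ne y$. Then
\[ d(a_x, a_y) \ge d(x, y) - d(a_x, x) - d(a_y, y) > \varepsilon - \frac{\varepsilon}{3} - \frac{\varepsilon}{3} = \frac{\varepsilon}{3}. \]
In particular, this shows that the map
\[ f : B \ni x \longmapsto a_x \in A \]
is injective, and the set $f(B)$ is an $(\varepsilon/3)$-rare subset of $A$. By condition (ii) the set $f(B)$ is finite, and this forces $B$ to be finite.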
We continue with an important construction.
Definitions. Let $(X, d)$ be a metric space. We define
\[ cs(X, d) = \bigl\{ x = (x_n)_{n\ge 1} : x \text{ Cauchy sequence in } X \bigr\}. \]
We say that two Cauchy sequences $x = (x_n)_{n\ge 1}$ and $y = (y_n)_{n\ge 1}$ in $X$ are equivalent, if
\[ \lim_{n\to\infty} d(x_n, y_n) = 0. \]
In this case we write $x \sim y$. (It is fairly obvious that $\sim$ is indeed an equivalence relation.) We define the quotient space
\[ \tilde{X} = cs(X, d)/\sim. \]
For an element $x \in cs(X, d)$, we denote its equivalence class by $\tilde{x}$.
Finally, for a point $x \in X$, we define $\langle x \rangle \in \tilde{X}$ to be the equivalence class of the constant sequence $x$ (which is obviously Cauchy).

Remark 6.7. Let $(X, d)$ be a metric space. If $x = (x_n)_{n\ge 1}$ and $y = (y_n)_{n\ge 1}$ are Cauchy sequences in $X$, then the sequence of real numbers $\bigl(d(x_n, y_n)\bigr)_{n\ge 1}$ is convergent. Indeed, for any $m, n$ we have
\[ \bigl| d(x_m, y_m) - d(x_n, y_n) \bigr| \le \bigl| d(x_m, y_m) - d(x_n, y_m) \bigr| + \bigl| d(x_n, y_m) - d(x_n, y_n) \bigr| \le d(x_m, x_n) + d(y_m, y_n), \]
so this sequence is Cauchy in $\mathbb{R}$. We can then define
\[ \delta(x, y) = \lim_{n\to\infty} d(x_n, y_n). \]
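The quantity $\delta(x, y)$ can be watched term by term for explicit rational Cauchy sequences. In the sketch below the sequences `x`, `y`, `z` and the helper `delta_term` are hypothetical choices made for illustration: the terms $d(x_n, y_n)$ tend to $0$, so $x$ and $y$ are equivalent, while $d(x_n, z_n)$ tends to $1$.

```python
from fractions import Fraction

def delta_term(x, y, n):
    """The n-th term d(x_n, y_n) of the sequence whose limit is delta(x, y)."""
    return abs(x(n) - y(n))

def x(n):
    return 1 + Fraction(1, n)    # Cauchy in Q, tends to 1 from above

def y(n):
    return 1 - Fraction(1, n)    # Cauchy in Q, tends to 1 from below

def z(n):
    return Fraction(2)           # the constant sequence defining <2>

for n in (1, 10, 100, 1000):
    print(n, delta_term(x, y, n), delta_term(x, z, n))
# d(x_n, y_n) = 2/n tends to 0, while d(x_n, z_n) = 1 - 1/n tends to 1
```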
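Proposition 6.7. Let $(X, d)$ be a metric space.
A. The map $\delta : cs(X, d) \times cs(X, d) \to [0, \infty)$ has the following properties:
(i) $\delta(x, y) = \delta(y, x)$, $\forall\, x, y \in cs(X, d)$;
(ii) $\delta(x, y) \le \delta(x, z) + \delta(z, y)$, $\forall\, x, y, z \in cs(X, d)$;
(iii) $\delta(x, y) = 0 \Rightarrow x \sim y$;
(iv) If $x, x', y, y' \in cs(X, d)$ are such that $x \sim x'$ and $y \sim y'$, then $\delta(x, y) = \delta(x', y')$.
B. The map $\tilde{d} : \tilde{X} \times \tilde{X} \to [0, \infty)$, correctly defined by
\[ \tilde{d}(\tilde{x}, \tilde{y}) = \delta(x, y), \quad \forall\, x, y \in cs(X, d), \]
is a metric on $\tilde{X}$.
C. The map $X \ni x \longmapsto \langle x \rangle \in \tilde{X}$ is isometric, in the sense that
\[ \tilde{d}(\langle x \rangle, \langle y \rangle) = d(x, y), \quad \forall\, x, y \in X. \]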
Proof. A. Properties (i), (ii) and (iii) are obvious. To prove property (iv) let $x = (x_n)_{n\ge 1}$, $x' = (x'_n)_{n\ge 1}$, $y = (y_n)_{n\ge 1}$, and $y' = (y'_n)_{n\ge 1}$. The inequality
\[ d(x'_n, y'_n) \le d(x'_n, x_n) + d(x_n, y_n) + d(y_n, y'_n), \]
combined with $\lim_{n\to\infty} d(x'_n, x_n) = \lim_{n\to\infty} d(y_n, y'_n) = 0$ immediately gives
\[ \delta(x', y') = \lim_{n\to\infty} d(x'_n, y'_n) \le \lim_{n\to\infty} d(x_n, y_n) = \delta(x, y). \]
By symmetry we also have $\delta(x, y) \le \delta(x', y')$, and we are done.
B. This is immediate from A.
C. Obvious, from the definition.
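Proposition 6.8. Let $(X, d)$ be a metric space.
(i) For any Cauchy sequence $x = (x_n)_{n\ge 1}$ in $X$, one has
\[ \lim_{n\to\infty} \langle x_n \rangle = \tilde{x}, \quad \text{in } \tilde{X}. \]
(ii) The metric space $(\tilde{X}, \tilde{d})$ is complete.
Proof. (i). For every $n \ge 1$, we have
\[ \tilde{d}\bigl(\langle x_n \rangle, \tilde{x}\bigr) = \lim_{m\to\infty} d(x_n, x_m). \tag{8} \]
Now if we start with some $\varepsilon > 0$, and we choose $N_\varepsilon$ such that
\[ d(x_n, x_m) < \varepsilon, \quad \forall\, m, n \ge N_\varepsilon, \]
then (8) shows that
\[ \tilde{d}\bigl(\langle x_n \rangle, \tilde{x}\bigr) \le \varepsilon, \quad \forall\, n \ge N_\varepsilon, \]
so we indeed have
\[ \lim_{n\to\infty} \tilde{d}\bigl(\langle x_n \rangle, \tilde{x}\bigr) = 0. \]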

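(ii). Let $(p_k)_{k\ge 1}$ be a Cauchy sequence in $\tilde{X}$. Using (i), we can choose, for each $k \ge 1$, an element $x_k \in X$, such that
\[ \tilde{d}\bigl(\langle x_k \rangle, p_k\bigr) \le \frac{1}{2^k}. \]
Claim 1: The sequence $x = (x_k)_{k\ge 1}$ is Cauchy in $X$.
Indeed, for $k \ge \ell \ge 1$ we have
\[ d(x_k, x_\ell) = \tilde{d}\bigl(\langle x_k \rangle, \langle x_\ell \rangle\bigr) \le \tilde{d}\bigl(\langle x_k \rangle, p_k\bigr) + \tilde{d}(p_k, p_\ell) + \tilde{d}\bigl(p_\ell, \langle x_\ell \rangle\bigr) \le \tilde{d}(p_k, p_\ell) + \frac{1}{2^{\ell-1}}. \]
This clearly gives
\[ \lim_{N\to\infty} \Bigl[ \sup_{k,\ell \ge N} d(x_k, x_\ell) \Bigr] \le \lim_{N\to\infty} \Bigl[ \sup_{k,\ell \ge N} \tilde{d}(p_k, p_\ell) \Bigr] = 0, \]
so $x = (x_k)_{k\ge 1}$ is indeed Cauchy.
The proof of (ii) will then be finished, once we prove:
Claim 2: We have $\lim_{k\to\infty} p_k = \tilde{x}$ in $\tilde{X}$.
To see this, we observe that, for $\ell \ge k \ge 1$ we have the inequality
\[ \tilde{d}\bigl(p_k, \langle x_\ell \rangle\bigr) \le \tilde{d}\bigl(p_k, \langle x_k \rangle\bigr) + \tilde{d}\bigl(\langle x_k \rangle, \langle x_\ell \rangle\bigr) \le \frac{1}{2^k} + d(x_k, x_\ell). \tag{9} \]
If we now start with some $\varepsilon > 0$, and we choose $N_\varepsilon$ such that
\[ d(x_k, x_\ell) < \varepsilon, \quad \forall\, k, \ell \ge N_\varepsilon, \]
then (9) gives
\[ \tilde{d}\bigl(p_k, \langle x_\ell \rangle\bigr) \le \frac{1}{2^k} + \varepsilon, \quad \forall\, \ell \ge k \ge N_\varepsilon. \]
If we keep $k \ge N_\varepsilon$ fixed and take $\lim_{\ell\to\infty}$, using (i) we get
\[ \tilde{d}(p_k, \tilde{x}) = \lim_{\ell\to\infty} \tilde{d}\bigl(p_k, \langle x_\ell \rangle\bigr) \le \frac{1}{2^k} + \varepsilon, \quad \forall\, k \ge N_\varepsilon. \]
The above estimate clearly proves that
\[ \lim_{k\to\infty} \tilde{d}(p_k, \tilde{x}) = 0, \]
so the sequence $(p_k)_{k\ge 1}$ is convergent (to $\tilde{x}$).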

Definition. The metric space $(\tilde{X}, \tilde{d})$ is called the completion of $(X, d)$.
The completion has a certain universality property. In order to formulate this
property we need the following
Definition. Let $(X, d)$ and $(Y, \rho)$ be metric spaces. A map $f : X \to Y$ is said to be a Lipschitz function, if there exists some constant $C \ge 0$, such that
\[ \rho\bigl(f(x), f(x')\bigr) \le C \cdot d(x, x'), \quad \forall\, x, x' \in X. \]
Such a constant $C$ is then called a Lipschitz constant for $f$.

Proposition 6.9. Let $(X, d)$ be a metric space, and let $(\tilde{X}, \tilde{d})$ be its completion. If $(Y, \rho)$ is a complete metric space, and $f : X \to Y$ is a Lipschitz function with Lipschitz constant $C \ge 0$, then there exists a unique continuous function $\tilde{f} : \tilde{X} \to Y$, such that
\[ \tilde{f}(\langle x \rangle) = f(x), \quad \forall\, x \in X. \]
Moreover, $\tilde{f}$ is Lipschitz, with Lipschitz constant $C$.

Proof. Start with some Cauchy sequence $x = (x_n)_{n\ge 1}$ in $X$. Using the inequality
\[ \rho\bigl(f(x_m), f(x_n)\bigr) \le C \cdot d(x_m, x_n), \quad \forall\, m, n \ge 1, \]
it is obvious that $\bigl(f(x_n)\bigr)_{n\ge 1}$ is a Cauchy sequence in $Y$. Since $Y$ is complete, this sequence is convergent. Define
\[ \phi(x) = \lim_{n\to\infty} f(x_n). \]
This way we have constructed a map $\phi : cs(X, d) \to Y$.
Claim: If $x \sim x'$, then $\phi(x) = \phi(x')$.
Indeed, if $x = (x_n)_{n\ge 1}$ and $x' = (x'_n)_{n\ge 1}$, then the Lipschitz property will give
\[ \rho\bigl(f(x_n), f(x'_n)\bigr) \le C \cdot d(x_n, x'_n), \quad \forall\, n \ge 1, \]
and using the fact that $\lim_{n\to\infty} d(x_n, x'_n) = 0$, we get $\lim_{n\to\infty} \rho\bigl(f(x_n), f(x'_n)\bigr) = 0$. This clearly forces
\[ \lim_{n\to\infty} f(x_n) = \lim_{n\to\infty} f(x'_n). \]
n→∞ n→∞
Having proven the claim, we now see that we have a correctly defined map $\tilde{f} : \tilde{X} \to Y$, with the property that
\[ \tilde{f}(\tilde{x}) = \phi(x), \quad \forall\, x \in cs(X, d). \]
The equality
\[ \tilde{f}(\langle x \rangle) = f(x), \quad \forall\, x \in X \]
is trivially satisfied.
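Let us check now that $\tilde{f}$ is Lipschitz, with Lipschitz constant $C$. Start with two points $p, p' \in \tilde{X}$, represented as $p = \tilde{x}$ and $p' = \tilde{x}'$, for two Cauchy sequences $x = (x_n)_{n\ge 1}$ and $x' = (x'_n)_{n\ge 1}$ in $X$. Using the definition, we have
\[ \tilde{f}(p) = \lim_{n\to\infty} f(x_n) \quad \text{and} \quad \tilde{f}(p') = \lim_{n\to\infty} f(x'_n). \]
This will give
\[ \rho\bigl(\tilde{f}(p), \tilde{f}(p')\bigr) = \lim_{n\to\infty} \rho\bigl(f(x_n), f(x'_n)\bigr). \]
Notice however that
\[ \rho\bigl(f(x_n), f(x'_n)\bigr) \le C \cdot d(x_n, x'_n), \quad \forall\, n \ge 1, \]
so taking the limit yields
\[ \rho\bigl(\tilde{f}(p), \tilde{f}(p')\bigr) = \lim_{n\to\infty} \rho\bigl(f(x_n), f(x'_n)\bigr) \le C \cdot \lim_{n\to\infty} d(x_n, x'_n) = C \cdot \tilde{d}(p, p'). \]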

Finally, let us show that $\tilde{f}$ is unique. Let $F : \tilde{X} \to Y$ be another continuous function with $F(\langle x \rangle) = f(x)$, for all $x \in X$. Start with an arbitrary point $p \in \tilde{X}$, represented as $p = \tilde{x}$, for some Cauchy sequence $x = (x_n)_{n\ge 1}$ in $X$. Since $\lim_{n\to\infty} \langle x_n \rangle = p$ in $\tilde{X}$, by continuity we have
\[ F(p) = \lim_{n\to\infty} F(\langle x_n \rangle) = \lim_{n\to\infty} f(x_n) = \phi(x) = \tilde{f}(p). \]

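Corollary 6.5. Let $(X, d)$ be a metric space, let $(Y, \rho)$ be a complete metric space, and let $f : X \to Y$ be an isometric map, that is
\[ \rho\bigl(f(x), f(x')\bigr) = d(x, x'), \quad \forall\, x, x' \in X. \]
Then the map $\tilde{f} : \tilde{X} \to Y$, given by the above result, is isometric and $\tilde{f}(\tilde{X}) = \overline{f(X)}$, the closure of $f(X)$ in $Y$.
Proof. To show that $\tilde{f}(\tilde{X}) = \overline{f(X)}$, notice first that the inclusion $\tilde{f}(\tilde{X}) \subset \overline{f(X)}$ is clear, since every value of $\tilde{f}$ is a limit of points in $f(X)$. For the other inclusion, start with some arbitrary point $y \in \overline{f(X)}$. Then there exists a sequence $(x_n)_{n\ge 1} \subset X$, with $\lim_{n\to\infty} f(x_n) = y$. Since $\bigl(f(x_n)\bigr)_{n\ge 1}$ is Cauchy in $Y$, and
\[ d(x_m, x_n) = \rho\bigl(f(x_m), f(x_n)\bigr), \quad \forall\, m, n \ge 1, \]
it follows that the sequence $x = (x_n)_{n\ge 1}$ is Cauchy in $X$. We then have
\[ y = \lim_{n\to\infty} f(x_n) = \tilde{f}(\tilde{x}). \]
Finally, we show that $\tilde{f}$ is isometric. Start with two points $p, q \in \tilde{X}$, represented as $p = \tilde{x}$ and $q = \tilde{z}$, for some Cauchy sequences $x = (x_n)_{n\ge 1}$ and $z = (z_n)_{n\ge 1}$ in $X$. Then by construction we have
\[ \rho\bigl(\tilde{f}(p), \tilde{f}(q)\bigr) = \lim_{n\to\infty} \rho\bigl(\tilde{f}(\langle x_n \rangle), \tilde{f}(\langle z_n \rangle)\bigr) = \lim_{n\to\infty} \rho\bigl(f(x_n), f(z_n)\bigr) = \lim_{n\to\infty} d(x_n, z_n) = \tilde{d}(\tilde{x}, \tilde{z}) = \tilde{d}(p, q). \]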

Corollary 6.6. If $(X, d)$ is a complete metric space, and $\tilde{X}$ is its completion, then the map $\iota : X \ni x \longmapsto \langle x \rangle \in \tilde{X}$ is bijective.
Proof. Apply the previous result to the map $\mathrm{Id} : X \to X$, to get a bijective (isometric) map $\widetilde{\mathrm{Id}} : \tilde{X} \to X$. Since the map $\widetilde{\mathrm{Id}}$ is obviously a left inverse for $\iota$, it follows that $\iota$ itself is bijective.

In the remainder of this section we will address the following question: Given a topological Hausdorff space $X$, when does there exist a metric $d$ on $X$, such that the given topology coincides with the metric topology defined by $d$? A topological Hausdorff space with the above property is said to be metrizable. It is difficult to give non-trivial necessary and sufficient conditions for metrizability. One instance in which this is possible is the compact case (see the Urysohn Metrizability Theorem later in these notes). Here is a useful result, which is an example of a sufficient condition for metrizability.
Proposition 6.10 (Metrizability of Countable Products). Let $(X_i, d_i)_{i\in I}$ be a countable family of metric spaces. Then the product space $X = \prod_{i\in I} X_i$, equipped with the product topology, is metrizable.
Proof. Denote by T the product topology on $X$. What we need is a metric $d$ on $X$, such that the maps
Id : (X, d) → (X, T) and Id : (X, T) → (X, d)
are continuous. (Here the notation (X, d) signifies that X is equipped with the
metric topology defined by d.) For each i ∈ I, let πi : X → Xi denote the
projection onto the ith coordinate.
Case I: Assume $I$ is finite. In this case we define the metric $d$ on $X$ as follows. If $x = (x_i)_{i\in I}$ and $y = (y_i)_{i\in I}$ are elements in $X$, we put
\[ d(x, y) = \max_{i\in I} d_i(x_i, y_i). \]
The continuity of the map Id : (X, d) → (X, T) is equivalent to the fact that all maps
\[ \pi_i : (X, d) \to (X_i, d_i), \quad i \in I \]
are continuous. This is obvious, because by construction we have
\[ d_i\bigl(\pi_i(x), \pi_i(y)\bigr) \le d(x, y), \quad \forall\, x, y \in X. \]
Conversely, to prove the continuity of Id : (X, T) → (X, d), we are going to prove that every $d$-open set is open in the product topology. It suffices to prove this only for open balls. Fix then $x = (x_i)_{i\in I} \in \prod_{i\in I} X_i$ and $r > 0$, and consider the open ball $B_r(x)$. If we define, for each $i \in I$, the open ball $B^{X_i}_r(x_i)$, then it is obvious that
\[ B_r(x) = \bigcap_{i\in I} \pi_i^{-1}\bigl( B^{X_i}_r(x_i) \bigr), \]
and since the maps $\pi_i$ are all continuous and $I$ is finite, this proves that $B_r(x)$ is indeed open in the product topology.
Case II: Assume $I$ is infinite. In this case we identify $I = \mathbb{N}$. For every $n \in \mathbb{N}$ we define a new metric $\delta_n$ on $X_n$, as follows. If
\[ \sup_{p,q\in X_n} d_n(p, q) \le 1, \]
we put $\delta_n = d_n$. Otherwise, we define
\[ \delta_n(p, q) = \frac{d_n(p, q)}{1 + d_n(p, q)}, \quad \forall\, p, q \in X_n. \]
It is not hard to see that the metric topology defined by $\delta_n$ coincides with the one defined by $d_n$. The advantage is that $\delta_n$ takes values in $[0, 1]$. We define the metric $d : X \times X \to [0, \infty)$, as follows. If $x = (x_n)_{n\in\mathbb{N}}$ and $y = (y_n)_{n\in\mathbb{N}}$ are elements in $\prod_{n\in\mathbb{N}} X_n$, we define
\[ d(x, y) = \sum_{n=1}^{\infty} \frac{\delta_n(x_n, y_n)}{2^n}. \]
Due to the fact that $\delta_n$ takes values in $[0, 1]$, the above series is convergent, and it obviously defines a metric on $X$.
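The metric on a countable product can be approximated by truncating the series, with an error of at most $2^{-N}$ after $N$ terms, since every $\delta_n$ is bounded by $1$. The sketch below is an illustration under our own assumptions: all factors are taken to be $\mathbb{R}$ with the usual distance (so $\delta_n$ is given by the quotient formula), points of the product are modeled as Python functions $n \mapsto x_n$, and the names `delta`, `product_metric`, `zero`, and `differs_from` are hypothetical.

```python
def delta(p, q):
    """Bounded replacement for |p - q| on R; it takes values in [0, 1)."""
    return abs(p - q) / (1 + abs(p - q))

def product_metric(x, y, terms=50):
    """Partial sum of sum_{n >= 1} delta(x_n, y_n) / 2**n; the neglected tail is at most 2**-terms."""
    return sum(delta(x(n), y(n)) / 2 ** n for n in range(1, terms + 1))

def zero(n):
    return 0.0

def differs_from(N):
    """The sequence equal to 0 in coordinates 1..N and equal to 1 afterwards."""
    return lambda n: 0.0 if n <= N else 1.0

for N in (1, 5, 10, 20):
    print(N, product_metric(zero, differs_from(N)))   # roughly 2**-(N+1)
```

Two points that agree in the first $N$ coordinates are at distance at most $2^{-N}$, which is the mechanism behind the comparison with the product topology in the rest of the proof.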
As above, the continuity of the map Id : (X, d) → (X, T) is equivalent to the continuity of all the maps $\pi_n : (X, d) \to (X_n, d_n)$, or equivalently of $\pi_n : (X, d) \to (X_n, \delta_n)$, $n \in \mathbb{N}$. But this is an immediate consequence of the (obvious) inequalities
\[ \delta_n\bigl(\pi_n(x), \pi_n(y)\bigr) \le 2^n \cdot d(x, y), \quad \forall\, x, y \in X. \]
As before, in order to prove the continuity of the other map Id : (X, T) → (X, d), we start with some $d$-open set $D$, and we show that $D$ is open in the product topology. Since $D$ is a union of open balls, we need to prove that for any $x \in X$ and any $r > 0$, the open ball $B_r(x)$, in $(X, d)$, is a neighborhood of $x$ in the product topology. Fix $x = (x_n)_{n\in\mathbb{N}} \in \prod_{n\in\mathbb{N}} X_n$, as well as $r > 0$. Choose some integer $N \ge 1$, such that
\[ \sum_{n=N+1}^{\infty} \frac{1}{2^n} < \frac{r}{2}, \]
and define, for each $k \in \{1, 2, \dots, N\}$ the set
\[ D_k = \Bigl\{ y = (y_n)_{n\in\mathbb{N}} \in \prod_{n\in\mathbb{N}} X_n : \delta_k(x_k, y_k) < \frac{r}{2} \Bigr\}. \]
It is clear that $D_k$ is open in the product topology, for each $k = 1, 2, \dots, N$. (This is a consequence of the fact that $D_k = \pi_k^{-1}\bigl(B_{r/2}(x_k)\bigr)$, where $B_{r/2}(x_k)$ is the $\delta_k$-open ball in $X_k$ of radius $r/2$, centered at $x_k$.) Then the set $D = D_1 \cap D_2 \cap \dots \cap D_N$ is also open in the product topology. Obviously we have $x \in D$. We now prove that $D \subset B_r(x)$. Start with some arbitrary $y \in D$, say $y = (y_n)_{n\in\mathbb{N}}$. On the one hand, we have
\[ \delta_k(x_k, y_k) < \frac{r}{2}, \quad \forall\, k \in \{1, 2, \dots, N\}, \]
so we get
\[ \sum_{n=1}^{N} \frac{1}{2^n}\, \delta_n(x_n, y_n) < \frac{r}{2} \sum_{n=1}^{N} \frac{1}{2^n} < \frac{r}{2}. \]
On the other hand, since $\delta_n$ takes values in $[0, 1]$, we also have
\[ \sum_{n=N+1}^{\infty} \frac{1}{2^n}\, \delta_n(x_n, y_n) \le \sum_{n=N+1}^{\infty} \frac{1}{2^n} < \frac{r}{2}, \]
so we get
\[ d(x, y) = \sum_{n=1}^{\infty} \frac{1}{2^n}\, \delta_n(x_n, y_n) < r, \]
thus proving that $y$ indeed belongs to $B_r(x)$.
Lecture 7

7. Baire theorem(s)
In this section we discuss a topological phenomenon that occurs in certain topological spaces. It deals with interiors of closed sets.
Exercise 1. Let $X$ be a topological space, and let $A$ and $B$ be closed sets with the property that $\mathrm{Int}(A \cup B) \ne \emptyset$. Prove that either $\mathrm{Int}(A) \ne \emptyset$, or $\mathrm{Int}(B) \ne \emptyset$.
Exercise 2. Give an example of a topological space $X$ and of two (non-closed) sets $A$ and $B$ such that $\mathrm{Int}(A \cup B) \ne \emptyset$, but $\mathrm{Int}(A) = \mathrm{Int}(B) = \emptyset$.
Theorem 7.1 (Baire's Theorem). Let $(X, \mathcal{T})$ be a topological Hausdorff space, which satisfies one (or both) of the following properties:
(a) There exists a metric $d$ on $X$, which makes $(X, d)$ a complete metric space, and $\mathcal{T}$ is the metric topology.
(b) $X$ is locally compact.
Suppose one has a sequence $(F_n)_{n\ge 1}$ of closed subsets of $X$, such that $X = \bigcup_{n=1}^{\infty} F_n$. Then there exists some integer $n \ge 1$, such that $\mathrm{Int}(F_n) \ne \emptyset$.
Sn
S∞every n ≥ 1 we define the closed set Gn = k=1 Fk , so that we
Proof. For
still have X = n=1 Gn , but we also have G1 ⊂ G2 ⊂ . . . . According to Exercise 1
(use an inductive argument) it suffices to show that there exists some n ≥ 1, with
Int(Gn ) 6= ∅. We are going to prove this property by contradiction.
(∗) Assume Int(Gn ) = ∅, for all n ≥ 1.
Claim: Under the assumption (∗) there exists a sequence (Dn )n≥1 of non-
empty open sets, such that for all n ≥ 1 we have:
(i) Dn ∩ Gn = ∅;
(ii) Dn+1 ⊂ Dn ;
(iii) In case (a) we have diam(Dn ) ≤ 2−n ; in case (b) Dn is compact.
The sequence is constructed recursivley. To construct D1 we use the fact that
Int(G1 ) = ∅ forces X r G1 6= ∅. We then choose a point x ∈ X r G1 . In case (a)
we know that there exists r > 0 such that Br (x) ⊂ X r G1 . We put ρ = min{r, 41 }
and we set D1 = Bρ (x). In the case (b) we apply Lemma 5.1 to find D1 open with
D1 compact, such that x ∈ D1 ⊂ D1 ⊂ X r G1 .
Let us assume now that we have constructed D1 , D2 , . . . , Dk , such that (i) and
(iii) hold for all n ∈ {1, . . . , k}, and such that (ii) hold for all n ∈ {1, . . . , k − 1},
and let us indicate how the next set Dk+1 is constructed. Using the assumption
that Int(Gk+1 ) = ∅, it follows that the open set Dk r Gk+1 is non-empty. Choose
then a point x ∈ Dk r Gk+1 . In case (a) there exists some r > 0 such that
Br (x) ⊂ Dk r Gk+1 . We then put ρ = min{ 2r , 2k+2 1
}, and we define Dk+1 = Bρ (x).
49
50 LECTURE 7

In case (b) we apply Lemma 5.1 an find an open set Dk+1 with Dk+1 compact,
and x ∈ Dk+1 ⊂ Dk+1 ⊂ Dk r Gk+1 . All properties (i)-(iii) are easily verified.
Having proven the Claim, let us see now that the assumption (∗) produces a
contradiction.
Case (a): In this case we choose, for each $n \ge 1$, a point $x_n \in D_n$. Notice that, for every $m \ge n \ge 1$ we have
$$x_m, x_n \in D_n \quad\text{and}\quad d(x_m, x_n) \le \mathrm{diam}(D_n) \le \frac{1}{2^n}.$$
In particular, this proves that the sequence $(x_n)_{n \ge 1}$ is Cauchy, hence convergent to some point $x$. Since $x_m \in D_n$ for all $m \ge n \ge 1$, we see that $x \in \overline{D_n}$, for all $n \ge 1$. In other words we get
(1)   $\bigcap_{n=1}^{\infty} \overline{D_n} \neq \emptyset.$
Case (b): In this case we also get (1), this time as a consequence of the compactness of the sets $\overline{D_n}$ (and the finite intersection property).
Let us notice now that (1) combined with (ii) will also give $\bigcap_{n=1}^{\infty} D_n \neq \emptyset$. But this is impossible, since by (i) we have
$$\bigcap_{n=1}^{\infty} D_n \subset \bigcap_{n=1}^{\infty} (X \setminus G_n) = X \setminus \Big(\bigcup_{n=1}^{\infty} G_n\Big) = \emptyset.$$

Chapter II
Elements of Functional Analysis
Lecture 8

1. Hahn-Banach Theorems
The result we are going to discuss is one of the most fundamental theorems in
the whole field of Functional Analysis. Its statement is simple but quite technical.
Definitions. Let K be either of the fields R or C. Suppose X is a K-vector
space.
A. A map q : X → R is said to be a quasi-seminorm, if
(i) q(x + y) ≤ q(x) + q(y), for all x, y ∈ X ;
(ii) q(tx) = tq(x), for all x ∈ X and all t ∈ R with t ≥ 0.
B. A map q : X → R is said to be a seminorm if, in addition to the above
two properties, it satisfies:
(ii’) q(λx) = |λ|q(x), for all x ∈ X and all λ ∈ K.
Remark that if q : X → R is a seminorm, then q(x) ≥ 0, for all x ∈ X . (Use
2q(x) = q(x) + q(−x) ≥ q(0) = 0.)
There are several versions of the Hahn-Banach Theorem.
Theorem 1.1 (Hahn-Banach, R-version). Let X be an R-vector space. Suppose
q : X → R is a quasi-seminorm. Suppose also we are given a linear subspace Y ⊂ X
and a linear map φ : Y → R, such that
φ(y) ≤ q(y), for all y ∈ Y.
Then there exists a linear map ψ : X → R such that

(i) $\psi|_Y = \phi$;
(ii) ψ(x) ≤ q(x) for all x ∈ X .
Proof. We first prove the Theorem in the following:
Particular Case: Assume dim X /Y = 1.
This means there exists some vector x0 ∈ X such that
X = {y + sx0 : y ∈ Y, s ∈ R}.
What we need is to prescribe the value ψ(x0 ). In other words, we need a number
α ∈ R such that, if we define ψ : X → R by ψ(y + sx0 ) = φ(y) + sα, ∀ y ∈ Y, s ∈ R,
then this map satisfies condition (ii). For s > 0, condition (ii) reads:
φ(y) + sα ≤ q(y + sx0 ), ∀ y ∈ Y, s > 0,
and, upon dividing by $s$ (set $z = s^{-1}y$), is equivalent to:
(1) α ≤ q(z + x0 ) − φ(z), ∀ z ∈ Y.
For s < 0, condition (ii) reads (use t = −s):
φ(y) − tα ≤ q(y − tx0 ), ∀ y ∈ Y, t > 0,
and, upon dividing by $t$ (set $w = t^{-1}y$), is equivalent to:


(2) α ≥ φ(w) − q(w − x0 ), ∀ w ∈ Y.
Consider the sets
$$Z = \{q(z + x_0) - \phi(z) : z \in Y\} \subset \mathbb{R},$$
$$W = \{\phi(w) - q(w - x_0) : w \in Y\} \subset \mathbb{R}.$$
The conditions (1) and (2) are equivalent to the inequalities
(3)   $\sup W \le \alpha \le \inf Z.$
This means that, in order to find a real number $\alpha$ with the desired property, it suffices to prove that $\sup W \le \inf Z$, which in turn is equivalent to
(4)   $\phi(w) - q(w - x_0) \le q(z + x_0) - \phi(z), \quad \forall\, z, w \in Y.$
But the condition (4) is equivalent to
$$\phi(z + w) \le q(z + x_0) + q(w - x_0),$$
which is obviously satisfied because
$$\phi(z + w) \le q(z + w) = q\big((z + x_0) + (w - x_0)\big) \le q(z + x_0) + q(w - x_0).$$
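The one-dimensional extension step can be checked numerically on a small example. The sketch below is only an illustration under assumed data that do not come from the notes: $X = \mathbb{R}^2$, $Y = \{(s, 0)\}$, $x_0 = (0, 1)$, $\phi(s, 0) = s$, and the quasi-seminorm $q(x_1, x_2) = 2\max(x_1, 0) + |x_2|$. It samples $Y$ on a finite grid, so `sup_W` and `inf_Z` are grid approximations of $\sup W$ and $\inf Z$.

```python
# Hypothetical example data (assumptions, not taken from the notes):
# X = R^2, Y = {(s, 0)}, x0 = (0, 1), phi(s, 0) = s,
# q(x1, x2) = 2*max(x1, 0) + |x2|, a quasi-seminorm that is not a seminorm.

def q(x1, x2):
    return 2.0 * max(x1, 0.0) + abs(x2)

def phi(s):
    # phi lives on Y only; the point (s, 0) of Y is identified with s.
    return s

# Finite grid of points of Y; it contains s = 0 exactly.
grid = [k / 100.0 for k in range(-1000, 1001)]

# Grid versions of inf Z and sup W from conditions (1) and (2).
inf_Z = min(q(s, 1.0) - phi(s) for s in grid)   # q(z + x0) - phi(z)
sup_W = max(phi(s) - q(s, -1.0) for s in grid)  # phi(w) - q(w - x0)

# Any alpha between sup W and inf Z defines an admissible extension.
alpha = 0.5 * (sup_W + inf_Z)

def psi(x1, x2, alpha):
    # psi(y + s*x0) = phi(y) + s*alpha, with y = (x1, 0) and s = x2.
    return phi(x1) + x2 * alpha
```

For this particular choice the admissible interval is $[-1, 1]$, and one can spot-check $\psi(x) \le q(x)$ at a few points of $\mathbb{R}^2$ for any $\alpha$ in that interval.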
Having proved the Theorem in this particular case, let us proceed now with
the general case. Let us consider the set Ξ of all pairs (Z, ν) with
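• $Z$ is a subspace of $X$ such that $Z \supset Y$;
• $\nu : Z \to \mathbb{R}$ is a linear functional such that
(i) $\nu|_Y = \phi$;
(ii) $\nu(z) \le q(z)$, for all $z \in Z$.
Put an order relation $\succeq$ on $\Xi$ as follows:
$$(Z_1, \nu_1) \succeq (Z_2, \nu_2) \iff Z_1 \supset Z_2 \ \text{ and } \ \nu_1|_{Z_2} = \nu_2.$$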

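Using Zorn's Lemma, $\Xi$ possesses a maximal element $(Z, \psi)$. (Every chain in $\Xi$ has an upper bound, obtained by taking the union of its subspaces and the common extension of its functionals.) The proof of the Theorem is finished once we prove that $Z = X$. Assume $Z \subsetneq X$ and choose a vector $x_0 \in X \setminus Z$. Form the subspace $V = \{z + tx_0 : z \in Z, t \in \mathbb{R}\}$ and apply the particular case of the Theorem for the inclusion $Z \subset V$, for $\psi : Z \to \mathbb{R}$ and for the quasi-seminorm $q|_V : V \to \mathbb{R}$. It follows that there exists some linear functional $\eta : V \to \mathbb{R}$ such that
(i) $\eta|_Z = \psi$ (in particular we will also have $\eta|_Y = \phi$);
(ii) $\eta(v) \le q(v)$, for all $v \in V$.
But then the element $(V, \eta) \in \Xi$ will contradict the maximality of $(Z, \psi)$. □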
Theorem 1.2 (Hahn-Banach, C-version). Let $X$ be a C-vector space. Suppose $q : X \to \mathbb{R}$ is a quasi-seminorm. Suppose also we are given a linear subspace $Y \subset X$ and a linear map $\phi : Y \to \mathbb{C}$, such that
$$\mathrm{Re}\,\phi(y) \le q(y), \quad \text{for all } y \in Y.$$
Then there exists a linear map $\psi : X \to \mathbb{C}$ such that
(i) $\psi|_Y = \phi$;
(ii) $\mathrm{Re}\,\psi(x) \le q(x)$ for all $x \in X$.

Proof. Regard for the moment both $X$ and $Y$ as R-vector spaces. Define the R-linear map $\phi_1 : Y \to \mathbb{R}$ by $\phi_1(y) = \mathrm{Re}\,\phi(y)$, for all $y \in Y$, so that we have
$$\phi_1(y) \le q(y), \quad \forall\, y \in Y.$$
Use Theorem 1.1 to find an R-linear map $\psi_1 : X \to \mathbb{R}$ such that
(i) $\psi_1|_Y = \phi_1$;
(ii) $\psi_1(x) \le q(x)$, for all $x \in X$.
Define the map ψ : X → C by
ψ(x) = ψ1 (x) − iψ1 (ix), for all x ∈ X .
Claim 1: ψ is C-linear.
It is obvious that ψ is R-linear, so the only thing to prove is that ψ(ix) = iψ(x),
for all x ∈ X . But this is quite obvious:
$$\psi(ix) = \psi_1(ix) - i\psi_1(i^2 x) = \psi_1(ix) - i\psi_1(-x) = -i^2\psi_1(ix) + i\psi_1(x) = i\big(\psi_1(x) - i\psi_1(ix)\big) = i\psi(x), \quad \forall\, x \in X.$$
Because of the way $\psi$ is defined, and because $\psi_1$ is real-valued, condition (ii) in the Theorem follows immediately:
$$\mathrm{Re}\,\psi(x) = \psi_1(x) \le q(x), \quad \forall\, x \in X,$$
so in order to finish the proof, we need to prove condition (i) in the Theorem (i.e. $\psi|_Y = \phi$). This follows from the fact that $\phi_1 = \psi_1|_Y$, and from:
Claim 2: For every $y \in Y$, we have $\phi(y) = \phi_1(y) - i\phi_1(iy)$.
But this is quite obvious, because
$$\mathrm{Im}\,\phi(y) = -\mathrm{Re}\,(i\phi(y)) = -\mathrm{Re}\,\phi(iy) = -\phi_1(iy), \quad \forall\, y \in Y.$$
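The passage from the real part back to a C-linear functional can be illustrated on $\mathbb{C}^2$. The sketch below is a small sanity check under assumed data: the coefficient vector `c` is hypothetical, `psi1` plays the role of the real-valued R-linear map $\psi_1$, and `psi` is rebuilt from it by the formula $\psi(x) = \psi_1(x) - i\psi_1(ix)$.

```python
# Hypothetical coefficients defining a C-linear functional L(x) = c[0]*x[0] + c[1]*x[1].
c = [2 + 1j, -1 + 3j]

def psi1(x):
    # Real part of L(x): R-linear and real-valued, but not C-linear.
    return sum(ck * xk for ck, xk in zip(c, x)).real

def psi(x):
    # Rebuild a complex-valued map from psi1 via psi(x) = psi1(x) - i*psi1(i*x).
    ix = [1j * xk for xk in x]
    return psi1(x) - 1j * psi1(ix)

x = [0.5 - 2j, 3 + 1j]
value = psi(x)                       # agrees with L(x) up to rounding
rotated = psi([1j * xk for xk in x])  # agrees with 1j * psi(x) up to rounding
```

In this example `psi` recovers the original functional, which mirrors Claim 2, and `rotated` matches `1j * value`, which mirrors Claim 1.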


Theorem 1.3 (Hahn-Banach, for seminorms). Let $X$ be a K-vector space (K is either R or C). Suppose $q$ is a seminorm on $X$. Suppose also we are given a linear subspace $Y \subset X$ and a linear map $\phi : Y \to K$, such that
$$|\phi(y)| \le q(y), \quad \text{for all } y \in Y.$$
Then there exists a linear map $\psi : X \to K$ such that
(i) $\psi|_Y = \phi$;
(ii) $|\psi(x)| \le q(x)$ for all $x \in X$.

Proof. We are going to apply Theorems 1.1 and 1.2, using the fact that $q$ is also a quasi-seminorm.
The case K = R. Remark that
$$\phi(y) \le |\phi(y)| \le q(y), \quad \forall\, y \in Y.$$
So we can apply Theorem 1.1 and find a linear map $\psi : X \to \mathbb{R}$ with
(i) $\psi|_Y = \phi$;
(ii) $\psi(x) \le q(x)$, for all $x \in X$.

Using condition (ii) we also get


−ψ(x) = ψ(−x) ≤ q(−x) = q(x), for all x ∈ X .
In other words we get
±ψ(x) ≤ q(x), for all x ∈ X ,
which of course gives the desired property (ii) in the Theorem.
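The case K = C. Remark that
$$\mathrm{Re}\,\phi(y) \le |\phi(y)| \le q(y), \quad \forall\, y \in Y.$$
So we can apply Theorem 1.2 and find a linear map $\psi : X \to \mathbb{C}$ with
(i) $\psi|_Y = \phi$;
(ii) $\mathrm{Re}\,\psi(x) \le q(x)$, for all $x \in X$.
Using condition (ii) we also get
(5)   $\mathrm{Re}\,\big(\lambda\psi(x)\big) = \mathrm{Re}\,\psi(\lambda x) \le q(\lambda x) = q(x)$, for all $x \in X$ and all $\lambda \in \mathbb{T}$.
(Here $\mathbb{T} = \{\lambda \in \mathbb{C} : |\lambda| = 1\}$.) Fix for the moment $x \in X$. There exists some $\lambda \in \mathbb{T}$ such that $|\psi(x)| = \lambda\psi(x)$. For this particular $\lambda$ we will have $\mathrm{Re}\,\big(\lambda\psi(x)\big) = |\psi(x)|$, so the inequality (5) will give
$$|\psi(x)| \le q(x).$$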


In the remainder of this section we will discuss the geometric form of the
Hahn-Banach theorems. We begin by describing a method of constructing quasi-
seminorms.
Proposition 1.1. Let $X$ be a real vector space. Suppose $C \subset X$ is a convex subset, which contains 0, and has the property
(6)   $\bigcup_{t > 0} tC = X.$
For every $x \in X$ we define
$$Q_C(x) = \inf\{t > 0 : x \in tC\}.$$
(By (6) the set in the right hand side is non-empty.) Then the map $Q_C : X \to \mathbb{R}$ is a quasi-seminorm.

Proof. For every $x \in X$, let us define the set
$$T_C(x) = \{t > 0 : x \in tC\}.$$
It is pretty clear that, since $0 \in C$, we have
$$T_C(0) = (0, \infty),$$
so we get
$$Q_C(0) = \inf T_C(0) = 0.$$
Claim 1: For every $x \in X$ and every $\lambda > 0$, one has the equality
$$T_C(\lambda x) = \lambda T_C(x).$$
Indeed, if $t \in T_C(\lambda x)$, we have $\lambda x \in tC$, which means that $x \in \lambda^{-1}tC$, i.e. $\lambda^{-1}t \in T_C(x)$. Consequently we have
$$t = \lambda(\lambda^{-1}t) \in \lambda T_C(x),$$
which proves the inclusion
$$T_C(\lambda x) \subset \lambda T_C(x).$$
To prove the other inclusion, we start with some $s \in \lambda T_C(x)$, which means that there exists some $t \in T_C(x)$ with $\lambda t = s$. The fact that $t = \lambda^{-1}s$ belongs to $T_C(x)$ means that $x \in \lambda^{-1}sC$, so we get $\lambda x \in sC$, so $s$ indeed belongs to $T_C(\lambda x)$.
Claim 2: For every $x, y \in X$, one has the inclusion (see footnote 4)
$$T_C(x + y) \supset T_C(x) + T_C(y).$$
Start with some $t \in T_C(x)$ and some $s \in T_C(y)$. Define the elements $u = t^{-1}x$ and $v = s^{-1}y$. Since $u, v \in C$, and $C$ is convex, it follows that $C$ contains the element
$$\frac{t}{t+s}\,u + \frac{s}{t+s}\,v = \frac{1}{t+s}\,(x + y),$$
which means that $x + y \in (t + s)C$, so $t + s$ indeed belongs to $T_C(x + y)$.
We can now conclude the proof. If $x \in X$ and $\lambda > 0$, then the equality
$$Q_C(\lambda x) = \lambda Q_C(x)$$
is an immediate consequence of Claim 1. If $x, y \in X$, then the inequality
$$Q_C(x + y) \le Q_C(x) + Q_C(y)$$
is an immediate consequence of Claim 2. □
Definition. Under the hypothesis of the above proposition, the quasi-seminorm $Q_C$ is called the Minkowski functional associated with the set $C$.
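The Minkowski functional can be approximated numerically when the set $C$ is only available through a membership test. The sketch below is an illustration under stated assumptions: the helper name `minkowski`, the oracle `in_C`, and the example set are all hypothetical. It relies on the fact that, for a convex $C$ containing 0, the set $\{t > 0 : x \in tC\}$ is upward closed, so bisection applies; the initial doubling loop terminates only when property (6) holds.

```python
def minkowski(in_C, x, iters=60):
    """Approximate Q_C(x) = inf{t > 0 : x in tC} by bisection.

    Assumes C is convex, contains 0, and x belongs to tC for some t > 0.
    `in_C` is a membership test taking a list of coordinates.
    """
    def scaled_in_C(t):
        return in_C([xi / t for xi in x])

    hi = 1.0
    while not scaled_in_C(hi):  # find some t with x in tC
        hi *= 2.0
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if scaled_in_C(mid):
            hi = mid            # mid is admissible, so the infimum is <= mid
        else:
            lo = mid            # mid is not admissible, so the infimum is >= mid
    return hi

# Hypothetical non-symmetric convex open set in R^2: -1 < x1 < 2 and |x2| < 1.
def in_C(y):
    return -1.0 < y[0] < 2.0 and abs(y[1]) < 1.0

def exact(x):
    # Closed form of Q_C for this particular set, used only for comparison.
    return max(x[0] / 2.0, -x[0], abs(x[1]))
```

Because this set is not symmetric about 0, the values at $x$ and $-x$ differ (for instance at $(1, 0)$ and $(-1, 0)$), which shows a quasi-seminorm that is not a seminorm.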
Remark 1.1. Let $X$ be a real vector space. Suppose $C \subset X$ is a convex subset, which contains 0, and has the property (6). Then one has the inclusions
$$\{x \in X : Q_C(x) < 1\} \subset C \subset \{x \in X : Q_C(x) \le 1\}.$$
The second inclusion is pretty obvious, since if we start with some $x \in C$, using the notations from the proof of Proposition 1.1, we have $1 \in T_C(x)$, so
$$Q_C(x) = \inf T_C(x) \le 1.$$
To prove the first inclusion, start with some $x \in X$ with $Q_C(x) < 1$. In particular this means that there exists some $t \in (0, 1)$ such that $x \in tC$. Define the vector $y = t^{-1}x \in C$ and notice now that, since $C$ is convex, it will contain the convex combination $ty + (1 - t)0 = x$.
Exercise 1. Let $X$ be a real vector space, and let $q : X \to \mathbb{R}$ be a quasi-seminorm. Define the sets
$$C_0 = \{x \in X : q(x) < 1\},$$
$$C_1 = \{x \in X : q(x) \le 1\}.$$
(i) Prove that $C_0$ and $C_1$ are both convex, they contain 0, and they both have property (6).
(ii) Let $C$ be any convex set with
$$C_0 \subset C \subset C_1.$$
Analyze the relationship between $Q_C$ and $q$.
(Footnote 4: For subsets $T, S \subset \mathbb{R}$ we define $T + S = \{t + s : t \in T, s \in S\}$.)
Definition. A topological vector space is a vector space $X$ over K (which is either R or C), which is also a topological space, such that the maps
$$X \times X \ni (x, y) \longmapsto x + y \in X$$
$$K \times X \ni (\lambda, x) \longmapsto \lambda x \in X$$
are continuous.
Remark 1.2. Let $X$ be a real topological vector space. Suppose $C \subset X$ is a convex open subset, which contains 0. Then $C$ has the property (6). Moreover (compare with Remark 1.1), one has the equality
(7)   $\{x \in X : Q_C(x) < 1\} = C.$
To prove this remark, we define for each $x \in X$, the function
$$F_x : \mathbb{R} \ni t \longmapsto tx \in X.$$
Since $X$ is a topological vector space, the maps $F_x$, $x \in X$, are continuous. To prove the property (6) we start with an arbitrary $x \in X$, and we use the continuity of the map $F_x$ at 0. Since $C$ is a neighborhood of 0, there exists some $\rho > 0$ such that
$$F_x(t) \in C, \quad \forall\, t \in [-\rho, \rho].$$
In particular we get $\rho x \in C$, which means that $x \in \rho^{-1}C$.
To prove the equality (7) we only need to prove the inclusion "⊃" (since the inclusion "⊂" holds in general, by Remark 1.1). Start with some element $x \in C$. Using the continuity of the map $F_x$ at 1, plus the fact that $F_x(1) = x \in C$, there exists some $\varepsilon > 0$, such that
$$F_x(t) \in C, \quad \forall\, t \in [1 - \varepsilon, 1 + \varepsilon].$$
In particular, we have $F_x(1 + \varepsilon) \in C$, which means precisely that
$$x \in (1 + \varepsilon)^{-1}C.$$
This gives the inequality
$$Q_C(x) \le (1 + \varepsilon)^{-1},$$
so we indeed get $Q_C(x) < 1$.
The first geometric version of the Hahn-Banach Theorem is:
Lemma 1.1. Let X be a real topological vector space, and let C ⊂ X be a convex
open set which contains 0. If x0 ∈ X is some point which does not belong to C, then
there exists a linear continuous map φ : X → R, such that
• φ(x0 ) = 1;
• φ(v) < 1, ∀ v ∈ C.
Proof. Consider the linear subspace
$$Y = \mathbb{R}x_0 = \{tx_0 : t \in \mathbb{R}\},$$
and define $\psi : Y \to \mathbb{R}$ by
$$\psi(tx_0) = t, \quad \forall\, t \in \mathbb{R}.$$
It is obvious that $\psi$ is linear, and $\psi(x_0) = 1$.
Claim: One has the inequality
$$\psi(y) \le Q_C(y), \quad \forall\, y \in Y.$$
Let $y$ be represented as $y = tx_0$ for some $t \in \mathbb{R}$. If $t \le 0$, the inequality is clear, because $\psi(y) = t \le 0$ and the right hand side $Q_C(y)$ is always non-negative. Assume $t > 0$. Since $Q_C$ is a quasi-seminorm, we have
(8)   $Q_C(y) = Q_C(tx_0) = tQ_C(x_0),$
and the fact that $x_0 \notin C$ will give (by Remark 1.2) the inequality $Q_C(x_0) \ge 1$. Since $t > 0$, the computation (8) can be continued with
$$Q_C(y) = tQ_C(x_0) \ge t = \psi(y),$$
so the Claim follows also in this case.
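Use now the Hahn-Banach Theorem (Theorem 1.1), to find a linear map $\phi : X \to \mathbb{R}$ such that
(i) $\phi|_Y = \psi$;
(ii) $\phi(x) \le Q_C(x)$, $\forall\, x \in X$.
It is obvious that (i) gives $\phi(x_0) = \psi(x_0) = 1$. If $v \in C$, then by Remark 1.2 we have $Q_C(v) < 1$, so by (ii) we also get $\phi(v) < 1$. This means that the only thing that remains to be proven is the continuity of $\phi$. Since $\phi$ is linear, we only need to prove that $\phi$ is continuous at 0. Start with some $\varepsilon > 0$. We must find some open set $U_\varepsilon \subset X$, with $U_\varepsilon \ni 0$, such that
$$|\phi(u)| < \varepsilon, \quad \forall\, u \in U_\varepsilon.$$
We take $U_\varepsilon = (\varepsilon C) \cap (-\varepsilon C)$. Notice that, for every $u \in U_\varepsilon$, we have $\pm u \in \varepsilon C$, which gives $\varepsilon^{-1}(\pm u) \in C$. By Remark 1.2 this gives $Q_C\big(\varepsilon^{-1}(\pm u)\big) < 1$, which gives
$$Q_C(\pm u) < \varepsilon.$$
Then using property (ii) we immediately get
$$\phi(\pm u) < \varepsilon,$$
that is, $|\phi(u)| < \varepsilon$, and we are done. □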
It turns out that the above result is a particular case of a more general result:
Theorem 1.4 (Hahn-Banach Separation Theorem - real case). Let X be a real
topological vector space, let A, B ⊂ X be non-empty convex sets with A open, and
A ∩ B = ∅. Then there exists a linear continuous map φ : X → R, and a real
number α, such that
φ(a) < α ≤ φ(b), ∀ a ∈ A, b ∈ B.
Proof. Fix some points $a_0 \in A$, $b_0 \in B$, and define the set
$$C = A - B + b_0 - a_0 = \{a - b + b_0 - a_0 : a \in A, b \in B\}.$$
It is straightforward that $C$ is convex and contains 0. The equality
$$C = \bigcup_{b \in B} (A - b + b_0 - a_0)$$
shows that $C$ is also open. Define the vector $x_0 = b_0 - a_0$. Since $A \cap B = \emptyset$, it is clear that $x_0 \notin C$.
Use Lemma 1.1 to produce a linear continuous map $\phi : X \to \mathbb{R}$ such that
(i) $\phi(x_0) = 1$;
(ii) $\phi(v) < 1$, $\forall\, v \in C$.
By the definition of $x_0$ and $C$, we have $\phi(b_0) = \phi(a_0) + 1$, and
$$\phi(a) < \phi(b) + \phi(a_0) - \phi(b_0) + 1, \quad \forall\, a \in A, b \in B,$$
which gives
(9)   $\phi(a) < \phi(b), \quad \forall\, a \in A, b \in B.$
Put
$$\alpha = \inf_{b \in B} \phi(b).$$
The inequalities (9) give
(10)   $\phi(a) \le \alpha \le \phi(b), \quad \forall\, a \in A, b \in B.$
The proof will be complete once we prove the following
Claim: One has the inequality
φ(a) < α, ∀ a ∈ A.
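Suppose the contrary, i.e. there exists some $a_1 \in A$ with $\phi(a_1) = \alpha$. Using the continuity of the map
$$\mathbb{R} \ni t \longmapsto a_1 + tx_0 \in X$$
and the fact that $A$ is open, there exists some $\varepsilon > 0$ such that
$$a_1 + tx_0 \in A, \quad \forall\, t \in [-\varepsilon, \varepsilon].$$
In particular, by (10) one has
$$\phi(a_1 + \varepsilon x_0) \le \alpha,$$
which means that
$$\alpha + \varepsilon \le \alpha,$$
which is clearly impossible. □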

Theorem 1.5 (Hahn-Banach Separation Theorem - complex case). Let $X$ be a complex topological vector space, let $A, B \subset X$ be non-empty convex sets with $A$ open, and $A \cap B = \emptyset$. Then there exists a linear continuous map $\phi : X \to \mathbb{C}$, and a real number $\alpha$, such that
$$\mathrm{Re}\,\phi(a) < \alpha \le \mathrm{Re}\,\phi(b), \quad \forall\, a \in A, b \in B.$$

Proof. Regard X as a real topological vector space, and apply the real version
to produce an R-linear continuous map φ1 : X → R, and a real number α, such that
φ1 (a) < α ≤ φ1 (b), ∀ a ∈ A, b ∈ B.
Then the function φ : X → C defined by
φ(x) = φ1 (x) − iφ1 (ix), x ∈ X
will clearly satisfy the desired properties. 

There is another version of the Hahn-Banach Separation Theorem, which holds for a special type of topological vector spaces. Before we discuss these, we shall need a technical result.
Lemma 1.2. Let $X$ be a topological vector space, let $C \subset X$ be a compact set, and let $D \subset X$ be a closed set. Then the set
$$C + D = \{x + y : x \in C,\ y \in D\}$$
is closed.

Proof. Start with some point $p \in \overline{C + D}$, and let us prove that $p \in C + D$. For every neighborhood $U$ of 0, the set $p + U$ is a neighborhood of $p$, so by assumption, we have
(11)   $(p + U) \cap (C + D) \neq \emptyset.$
Define, for each neighborhood $U$ of 0, the set
$$A_U = (p + U - D) \cap C.$$
Using (11), it is clear that $A_U$ is non-empty. It is also clear that, if $U_1 \subset U_2$, then $A_{U_1} \subset A_{U_2}$. Using the compactness of $C$, it follows that
$$\bigcap_{U \text{ neighborhood of } 0} \overline{A_U} \neq \emptyset.$$
Choose then a point $q$ in the above intersection. It follows that
$$(q + V) \cap A_U \neq \emptyset,$$
for any two neighborhoods $U$ and $V$ of 0. In other words, for any two such neighborhoods of 0, we have
(12)   $(q + V - U) \cap (p - D) \neq \emptyset.$
Fix now an arbitrary neighborhood $W$ of 0. Using the continuity of the map
$$X \times X \ni (x_1, x_2) \longmapsto x_1 - x_2 \in X,$$
there exist neighborhoods $U$ and $V$ of 0, such that $U - V \subset W$. Then $q + V - U \subset q - W$, so (12) gives
$$(q - W) \cap (p - D) \neq \emptyset,$$
which yields
$$(p - q + W) \cap D \neq \emptyset.$$
Since this is true for all neighborhoods $W$ of 0, we get $p - q \in \overline{D}$, and since $D$ is closed, we finally get $p - q \in D$. Since, by construction we have $q \in C$, it follows that the point $p = q + (p - q)$ indeed belongs to $C + D$. □

Definition. A topological vector space $X$ is said to be locally convex, if every point has a fundamental system of convex open neighborhoods. This means that for every $x \in X$ and every neighborhood $N$ of $x$, there exists a convex open set $D$, with $x \in D \subset N$.
Theorem 1.6 (Hahn-Banach Separation Theorem for Locally Convex Spaces).
Let K be one of the fields R or C, and let X be a locally convex K-vector space.
Suppose C, D ⊂ X are convex sets, with C compact, D closed, and C ∩ D = ∅. Then
there exists a linear continuous map φ : X → K, and two numbers α, β ∈ R, such
that
Re φ(x) ≤ α < β ≤ Re φ(y), ∀ x ∈ C, y ∈ D.
Proof. Consider the convex set $B = D - C$. By Lemma ??, $B$ is closed. Since $C \cap D = \emptyset$, we have $0 \notin B$. Since $B$ is closed, its complement $X \setminus B$ will then be a neighborhood of 0. Since $X$ is locally convex, there exists a convex open set $A$, with $0 \in A \subset X \setminus B$. In particular we have $A \cap B = \emptyset$. Applying the suitable version of the Hahn-Banach Theorem (real or complex case), we find a linear continuous map $\phi : X \to K$, and a real number $\rho$, such that
$$\mathrm{Re}\,\phi(a) < \rho \le \mathrm{Re}\,\phi(b), \quad \forall\, a \in A, b \in B.$$
Notice that, since $A \ni 0$, we get $\rho > 0$. Then the inequality
$$\rho \le \mathrm{Re}\,\phi(b), \quad b \in B$$
gives
$$\mathrm{Re}\,\phi(y) - \mathrm{Re}\,\phi(x) \ge \rho > 0, \quad \forall\, x \in C, y \in D.$$
Then if we define
$$\beta = \inf_{y \in D} \mathrm{Re}\,\phi(y) \quad\text{and}\quad \alpha = \sup_{x \in C} \mathrm{Re}\,\phi(x),$$
we get $\beta \ge \alpha + \rho$, and we are done. □
Lectures 9-11

2. Normed vector spaces


Definition. Let K be one of the fields R or C, and let $X$ be a K-vector space. A norm on $X$ is a map
$$X \ni x \longmapsto \|x\| \in [0, \infty)$$
with the following properties:
(i) $\|x + y\| \le \|x\| + \|y\|$, $\forall\, x, y \in X$;
(ii) $\|\lambda x\| = |\lambda| \cdot \|x\|$, $\forall\, x \in X, \lambda \in K$;
(iii) $\|x\| = 0 \Longrightarrow x = 0$.
(Note that conditions (i) and (ii) state that $\|\cdot\|$ is a seminorm.)
Example 2.1. Let K be either R or C. Fix some non-empty set $I$, and define
$$c_0^K(I) = \Big\{\alpha : I \to K \ :\ \inf_{F \subset I \text{ finite}}\ \sup_{i \in I \setminus F} |\alpha(i)| = 0\Big\}.$$
Remark that for a function $\alpha : I \to K$, the fact that $\alpha$ belongs to $c_0^K(I)$ is equivalent to the following condition:
• For every $\varepsilon > 0$, there exists some finite set $F \subset I$, such that $|\alpha(i)| < \varepsilon$, $\forall\, i \in I \setminus F$.
We equip the space $c_0^K(I)$ with the K-vector space structure defined by point-wise addition and point-wise scalar multiplication. We also define the norm $\|\cdot\|_\infty$ by
$$\|\alpha\|_\infty = \sup_{i \in I} |\alpha(i)|, \quad \alpha \in c_0^K(I).$$
When K = C, the space $c_0^{\mathbb{C}}(I)$ is simply denoted by $c_0(I)$. When $I = \mathbb{N}$ - the set of natural numbers - the space $c_0^K(\mathbb{N})$ can be equivalently described as
$$c_0^K(\mathbb{N}) = \big\{\alpha = (\alpha_n)_{n \ge 1} \subset K \ :\ \lim_{n \to \infty} \alpha_n = 0\big\}.$$
In this case instead of $c_0^{\mathbb{R}}(\mathbb{N})$ we simply write $c_0^{\mathbb{R}}$, and instead of $c_0(\mathbb{N})$ we simply write $c_0$.
Exercise 1. Prove that $\|\cdot\|_\infty$ is indeed a norm on $c_0^K(I)$.
Example 2.2. Let K be either R or C, and let $I$ be a non-empty set. We define the space
$$\mathrm{fin}_K(I) = \big\{\alpha : I \to K \ :\ \text{the set } \{i \in I : \alpha(i) \neq 0\} \text{ is finite}\big\}.$$
Then $\mathrm{fin}_K(I)$ is a linear subspace in $c_0^K(I)$.

Definition. Suppose $X$ is a normed vector space, with norm $\|\cdot\|$. Then there is a natural metric $d$ on $X$, defined by
$$d(x, y) = \|x - y\|, \quad x, y \in X.$$
The topology on $X$, defined by this metric, is called the norm topology.
Exercise 2. Let $X$ be a normed vector space, over K (= R, C). Prove that, when equipped with the norm topology, $X$ becomes a topological vector space. That is, the maps
$$X \times X \ni (x, y) \longmapsto x + y \in X$$
$$K \times X \ni (\lambda, x) \longmapsto \lambda x \in X$$
are continuous.
Exercise 3. Let K be one of the fields R or C, and let $I$ be a non-empty set. Prove that $\mathrm{fin}_K(I)$ is dense in $c_0^K(I)$ in the norm topology.
Example 2.3. Let K be one of the fields R or C, and let $I$ be a non-empty set. Define
$$\ell^\infty_K(I) = \Big\{\alpha : I \to K \ :\ \sup_{i \in I} |\alpha(i)| < \infty\Big\}.$$
We equip the space $\ell^\infty_K(I)$ with the K-vector space structure defined by point-wise addition and point-wise scalar multiplication. We also define the norm $\|\cdot\|_\infty$ by
$$\|\alpha\|_\infty = \sup_{i \in I} |\alpha(i)|, \quad \alpha \in \ell^\infty_K(I).$$
When K = C, the space $\ell^\infty_{\mathbb{C}}(I)$ is simply denoted by $\ell^\infty(I)$. When $I = \mathbb{N}$ - the set of natural numbers - instead of $\ell^\infty_{\mathbb{R}}(\mathbb{N})$ we simply write $\ell^\infty_{\mathbb{R}}$, and instead of $\ell^\infty(\mathbb{N})$ we simply write $\ell^\infty$.
Exercise 4. Prove that $\|\cdot\|_\infty$ is indeed a norm on $\ell^\infty_K(I)$.
Exercise 5. Let K be one of the fields R or C, and let $I$ be a non-empty set. Prove that $c_0^K(I)$ is a linear subspace in $\ell^\infty_K(I)$, which is closed in the norm topology.
In preparation for the next class of examples, we introduce the following:
Definition. A map $\alpha : I \to K$ is said to be summable, if there exists some number $s \in K$ such that
(s) for every $\varepsilon > 0$ there exists some finite set $F_\varepsilon \subset I$ such that
$$\Big|s - \sum_{i \in F} \alpha(i)\Big| < \varepsilon, \quad \text{for all finite sets } F \text{ with } F_\varepsilon \subset F \subset I.$$
If such an $s$ exists, then it is unique, and it is denoted by $\sum_{i \in I} \alpha(i)$. In the case when $I$ is finite, every map $\alpha : I \to K$ is summable, and the above notation agrees with the usual notation for the sum.
Exercise 6. Assume $\alpha : I \to K$ is summable. Prove that, for every $\lambda \in K$, the map $\lambda\alpha : I \to K$ is summable, and
$$\sum_{i \in I} \lambda\alpha(i) = \lambda \sum_{i \in I} \alpha(i).$$
If $\beta : I \to K$ is another summable map, prove that $\alpha + \beta : I \to K$ is summable, and
$$\sum_{i \in I} [\alpha(i) + \beta(i)] = \Big(\sum_{i \in I} \alpha(i)\Big) + \Big(\sum_{i \in I} \beta(i)\Big).$$
The following result characterizes summability for non-negative terms.
Lemma 2.1. Let K be one of the fields R or C, let $I$ be a non-empty set, and let $\alpha : I \to [0, \infty)$. The following are equivalent:
(i) $\alpha$ is summable;
(ii) $\sup\Big\{\sum_{i \in F} \alpha(i) : F \subset I, \text{ finite}\Big\} < \infty$.
Moreover, in this case we have
$$\sup\Big\{\sum_{i \in F} \alpha(i) : F \subset I, \text{ finite}\Big\} = \sum_{i \in I} \alpha(i).$$
Proof. We denote the quantity $\sup\big\{\sum_{i \in F} \alpha(i) : F \subset I, \text{ finite}\big\}$ simply by $t$.
(i) ⇒ (ii). Assume $\alpha$ is summable, and denote $\sum_{i \in I} \alpha(i)$ simply by $s$. Choose, for each $\varepsilon > 0$ a finite set $F_\varepsilon \subset I$ such that
$$\Big|s - \sum_{i \in F} \alpha(i)\Big| < \varepsilon, \quad \text{for all finite subsets } F \subset I \text{ with } F \supset F_\varepsilon.$$
Claim: For any finite set $G \subset I$, and any $\varepsilon > 0$, one has the inequality
$$\sum_{i \in G} \alpha(i) < s + \varepsilon.$$
Indeed, if we take the finite set $G \cup F_\varepsilon$, then using the fact that all $\alpha$'s are non-negative, we get
$$\sum_{i \in G} \alpha(i) \le \sum_{i \in G \cup F_\varepsilon} \alpha(i) < s + \varepsilon.$$
Using the Claim, which holds for any $\varepsilon > 0$, we immediately get
$$\sum_{i \in G} \alpha(i) \le s, \quad \text{for all finite subsets } G \subset I,$$
so taking supremum yields $t \le s$, in particular $t < \infty$.
(ii) ⇒ (i). Assume condition (ii) is true. We are going to show that $\alpha$ is summable, by proving that the number $t$ satisfies the definition of summability. Consider the set
$$S = \Big\{\sum_{i \in F} \alpha(i) : F \text{ finite subset of } I\Big\},$$
so that $\sup S = t < \infty$. Start with some $\varepsilon > 0$. Since $t - \varepsilon$ is no longer an upper bound for $S$, there exists some finite set $F_\varepsilon \subset I$, such that $\sum_{i \in F_\varepsilon} \alpha(i) > t - \varepsilon$. Notice that, for any finite set $F \subset I$ with $F \supset F_\varepsilon$, we have
$$t - \varepsilon < \sum_{i \in F_\varepsilon} \alpha(i) \le \sum_{i \in F} \alpha(i) \le t,$$
so we immediately get
$$\Big|t - \sum_{i \in F} \alpha(i)\Big| < \varepsilon.$$
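The characterization in Lemma 2.1 can be tried out on a concrete non-negative family. The sketch below is an illustration with assumed data: the family $\alpha(n) = 2^{-n}$ on $I = \{1, 2, 3, \dots\}$ and the helper names are hypothetical. Exact rational arithmetic is used so that the comparisons involve no rounding. Every finite partial sum stays below 1, and the finite set `F_eps(eps)` plays the role of $F_\varepsilon$ in the definition of summability, with sum $s = 1$.

```python
from fractions import Fraction
import random

def alpha(n):
    # Hypothetical non-negative family on I = {1, 2, 3, ...}.
    return Fraction(1, 2 ** n)

def finite_sum(F):
    # Sum of alpha over a finite set F of indices.
    return sum((alpha(n) for n in F), Fraction(0))

def F_eps(eps):
    # Smallest initial segment {1, ..., N} with 2^(-N) < eps.
    N = 1
    while alpha(N) >= eps:
        N += 1
    return set(range(1, N + 1))

rng = random.Random(0)
# A few random finite subsets of {1, ..., 59}; each finite sum is below 1.
samples = [set(rng.sample(range(1, 60), rng.randint(1, 20))) for _ in range(50)]
partial_sums = [finite_sum(F) for F in samples]

eps = Fraction(1, 1000)
base = F_eps(eps)                 # here {1, ..., 10}
bigger = base | {15, 40}          # any finite superset of F_eps
gap = 1 - finite_sum(bigger)      # distance to the sum s = 1
```

The quantity `gap` is positive and smaller than `eps` for every finite superset of `base`, which is exactly the condition (s) with $s = 1$.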



Exercise 7. Let $\alpha : I \to [0, \infty)$ be summable. Prove that every map $\beta : I \to [0, \infty)$ with
$$\beta(j) \le \alpha(j), \quad \forall\, j \in I,$$
is summable, and $\sum_{j \in I} \beta(j) \le \sum_{j \in I} \alpha(j)$.
Remark 2.1. It is obvious that the above result has a version for non-positive maps as well. More explicitly, for a map $\alpha : I \to (-\infty, 0]$ the following are equivalent:
(i) $\alpha$ is summable;
(ii) $\inf\Big\{\sum_{i \in F} \alpha(i) : F \subset I, \text{ finite}\Big\} > -\infty$.
Moreover, in this case we have
$$\inf\Big\{\sum_{i \in F} \alpha(i) : F \subset I, \text{ finite}\Big\} = \sum_{i \in I} \alpha(i).$$

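Lemma 2.2. Let $I$ be a non-empty set. For a function $\alpha : I \to \mathbb{C}$, the following are equivalent:
(i) $\alpha$ is summable;
(ii) both functions $\mathrm{Re}\,\alpha, \mathrm{Im}\,\alpha : I \to \mathbb{R}$ are summable.
Moreover, in this case we have the equality
$$\sum_{j \in I} \alpha(j) = \sum_{j \in I} \mathrm{Re}\,\alpha(j) + i \sum_{j \in I} \mathrm{Im}\,\alpha(j).$$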
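Proof. (i) ⇒ (ii). Assume $\alpha$ is summable. Denote the sum $\sum_{j \in I} \alpha(j)$ simply by $s$. For every $\varepsilon > 0$ choose a finite set $F_\varepsilon \subset I$ such that
$$\Big|s - \sum_{j \in F} \alpha(j)\Big| < \varepsilon, \quad \text{for all finite sets } F \subset I \text{ with } F \supset F_\varepsilon.$$
Using the inequality
$$\max\{|\mathrm{Re}\,z|, |\mathrm{Im}\,z|\} \le |z|, \quad \forall\, z \in \mathbb{C},$$
we immediately get the inequalities
$$\Big|\mathrm{Re}\,s - \sum_{j \in F} \mathrm{Re}\,\alpha(j)\Big| = \Big|\mathrm{Re}\,\Big(s - \sum_{j \in F} \alpha(j)\Big)\Big| \le \Big|s - \sum_{j \in F} \alpha(j)\Big| < \varepsilon,$$
$$\Big|\mathrm{Im}\,s - \sum_{j \in F} \mathrm{Im}\,\alpha(j)\Big| = \Big|\mathrm{Im}\,\Big(s - \sum_{j \in F} \alpha(j)\Big)\Big| \le \Big|s - \sum_{j \in F} \alpha(j)\Big| < \varepsilon,$$
for all finite sets $F \subset I$ with $F \supset F_\varepsilon$, so $\mathrm{Re}\,\alpha$ and $\mathrm{Im}\,\alpha$ are indeed summable and moreover, we have
$$\sum_{j \in I} \mathrm{Re}\,\alpha(j) = \mathrm{Re}\,s \quad\text{and}\quad \sum_{j \in I} \mathrm{Im}\,\alpha(j) = \mathrm{Im}\,s.$$
(ii) ⇒ (i). Assume $\mathrm{Re}\,\alpha$ and $\mathrm{Im}\,\alpha$ are both summable. Denote $\sum_{j \in I} \mathrm{Re}\,\alpha(j)$ by $u$ and denote $\sum_{j \in I} \mathrm{Im}\,\alpha(j)$ by $v$. Fix some $\varepsilon > 0$. Choose finite sets $E_\varepsilon, G_\varepsilon \subset I$ such that
$$\Big|u - \sum_{j \in E} \mathrm{Re}\,\alpha(j)\Big| < \frac{\varepsilon}{2}, \quad \text{for all finite sets } E \subset I \text{ with } E \supset E_\varepsilon,$$
$$\Big|v - \sum_{j \in G} \mathrm{Im}\,\alpha(j)\Big| < \frac{\varepsilon}{2}, \quad \text{for all finite sets } G \subset I \text{ with } G \supset G_\varepsilon.$$
Put $F_\varepsilon = E_\varepsilon \cup G_\varepsilon$. Suppose $F \subset I$ is a finite set with $F \supset F_\varepsilon$. Using the inclusions $F \supset E_\varepsilon$ and $F \supset G_\varepsilon$, we then get
$$\Big|u - \sum_{j \in F} \mathrm{Re}\,\alpha(j)\Big| < \frac{\varepsilon}{2} \quad\text{and}\quad \Big|v - \sum_{j \in F} \mathrm{Im}\,\alpha(j)\Big| < \frac{\varepsilon}{2},$$
so we get
$$\Big|[u + iv] - \sum_{j \in F} \alpha(j)\Big| = \Big|\Big(u - \sum_{j \in F} \mathrm{Re}\,\alpha(j)\Big) + i\Big(v - \sum_{j \in F} \mathrm{Im}\,\alpha(j)\Big)\Big| \le \Big|u - \sum_{j \in F} \mathrm{Re}\,\alpha(j)\Big| + \Big|v - \sum_{j \in F} \mathrm{Im}\,\alpha(j)\Big| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$
This proves that $\alpha$ is indeed summable, and $\sum_{j \in I} \alpha(j) = u + iv$. □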
Exercise 8. Let K be one of the fields R or C, and let $I$ be a non-empty set. Suppose one has two non-empty sets $I_1, I_2$ with
$$I = I_1 \cup I_2 \quad\text{and}\quad I_1 \cap I_2 = \emptyset.$$
Suppose $\alpha : I \to K$ has the property that both $\alpha|_{I_1} : I_1 \to K$ and $\alpha|_{I_2} : I_2 \to K$ are summable. Prove that $\alpha$ is summable, and
$$\sum_{j \in I} \alpha(j) = \sum_{j \in I_1} \alpha(j) + \sum_{j \in I_2} \alpha(j).$$

Proposition 2.1. Let $I$ be a non-empty set, let K be one of the fields R or C. For a map $\alpha : I \to K$, the following are equivalent:
(i) $\alpha$ is summable;
(ii) $|\alpha|$ is summable.
Moreover, in this case one has the inequality
(1)   $\Big|\sum_{j \in I} \alpha(j)\Big| \le \sum_{j \in I} |\alpha(j)|.$

Proof. (i) ⇒ (ii). Assume $\alpha$ is summable. We divide the proof in two cases:
Case K = R. Define the sets
$$I^+ = \{j \in I : \alpha(j) > 0\},$$
$$I^- = \{j \in I : \alpha(j) < 0\},$$
$$I^0 = \{j \in I : \alpha(j) = 0\}.$$
More generally, for any subset $F \subset I$ we define $F^\pm = F \cap I^\pm$ and $F^0 = F \cap I^0$.
Claim: Both maps $\alpha|_{I^+} : I^+ \to \mathbb{R}$ and $\alpha|_{I^-} : I^- \to \mathbb{R}$ are summable. Moreover, one has the equality
(2)   $\sum_{j \in I} \alpha(j) = \sum_{j \in I^+} \alpha(j) + \sum_{j \in I^-} \alpha(j).$
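Denote the sum $\sum_{j \in I} \alpha(j)$ simply by $s$. Start by choosing some finite set $F \subset I$ such that
$$\Big|s - \sum_{j \in G} \alpha(j)\Big| < 1, \quad \text{for all finite sets } G \subset I \text{ with } G \supset F.$$
Let $E \subset I^+$ be a finite subset. Then the set $\tilde{E} = E \cup F$ will be a finite subset of $I$ with $\tilde{E} \supset F$, so we will have
$$\Big|s - \sum_{j \in \tilde{E}} \alpha(j)\Big| < 1,$$
so we get
$$\sum_{j \in E} \alpha(j) \le \sum_{j \in E \cup F^+} \alpha(j) = \Big(\sum_{j \in E \cup F^+} \alpha(j) + \sum_{j \in F^0 \cup F^-} \alpha(j)\Big) - \Big(\sum_{j \in F^0 \cup F^-} \alpha(j)\Big) = \Big(\sum_{j \in \tilde{E}} \alpha(j)\Big) - \Big(\sum_{j \in F^-} \alpha(j)\Big) < s + 1 - \Big(\sum_{j \in F^-} \alpha(j)\Big).$$
In particular this gives
$$\sup\Big\{\sum_{j \in E} \alpha(j) : E \subset I^+, \text{ finite}\Big\} \le s + 1 - \Big(\sum_{j \in F^-} \alpha(j)\Big),$$
so by Lemma ??, the map $\alpha|_{I^+} : I^+ \to [0, \infty)$ is indeed summable. The fact that the map $\alpha|_{I^-} : I^- \to (-\infty, 0]$ is summable is proven the exact same way. The equality (2) follows from Exercise ??.
Having proven the Claim, we notice now that the map $-\alpha|_{I^-} : I^- \to [0, \infty)$ is also summable. Using Exercise ??, it is clear then that the map $|\alpha| : I \to [0, \infty)$ is summable, simply because the three maps $|\alpha|\,|_{I^+} = \alpha|_{I^+}$, $|\alpha|\,|_{I^-} = -\alpha|_{I^-}$, and $|\alpha|\,|_{I^0} = 0$ are all summable.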
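Case K = C. By Lemma ?? we know that the maps $\mathrm{Re}\,\alpha, \mathrm{Im}\,\alpha : I \to \mathbb{R}$ are summable. In particular, using the real case, we get the fact that the maps $|\mathrm{Re}\,\alpha|, |\mathrm{Im}\,\alpha| : I \to [0, \infty)$ are summable. Using the obvious inequality
$$|z| \le |\mathrm{Re}\,z| + |\mathrm{Im}\,z|, \quad \forall\, z \in \mathbb{C},$$
we get
$$\sum_{j \in F} |\alpha(j)| \le \sum_{j \in F} |\mathrm{Re}\,\alpha(j)| + \sum_{j \in F} |\mathrm{Im}\,\alpha(j)| \le \sum_{j \in I} |\mathrm{Re}\,\alpha(j)| + \sum_{j \in I} |\mathrm{Im}\,\alpha(j)|,$$
for every finite subset $F \subset I$. Then we get
$$\sup\Big\{\sum_{j \in F} |\alpha(j)| : F \subset I, \text{ finite}\Big\} \le \sum_{j \in I} |\mathrm{Re}\,\alpha(j)| + \sum_{j \in I} |\mathrm{Im}\,\alpha(j)| < \infty,$$
so $|\alpha| : I \to [0, \infty)$ is indeed summable.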


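Having proven the implication (i) ⇒ (ii), let us prove the inequality (1). If $s$ denotes the sum $\sum_{j \in I} \alpha(j)$, then for every $\varepsilon > 0$ there exists $F_\varepsilon \subset I$ finite such that
$$\Big|s - \sum_{j \in F} \alpha(j)\Big| < \varepsilon, \quad \text{for all finite sets } F \subset I \text{ with } F \supset F_\varepsilon.$$
In particular, we get
$$|s| \le \varepsilon + \Big|\sum_{j \in F_\varepsilon} \alpha(j)\Big| \le \varepsilon + \sum_{j \in F_\varepsilon} |\alpha(j)| \le \varepsilon + \sum_{j \in I} |\alpha(j)|.$$
Since this inequality holds for all $\varepsilon > 0$, we then get
$$|s| \le \sum_{j \in I} |\alpha(j)|.$$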

(ii) ⇒ (i). Assume now $|\alpha| : I \to [0, \infty)$ is summable.
Case K = R. It is obvious that $|\alpha|\,|_J : J \to [0, \infty)$ is summable, for any subset $J \subset I$. In particular, using the notations from the proof of (i) ⇒ (ii), it follows that $\alpha|_{I^+} = |\alpha|\,|_{I^+}$, $\alpha|_{I^-} = -|\alpha|\,|_{I^-}$, and $\alpha|_{I^0} = 0$ are all summable. Then the summability of $\alpha$ follows from Exercise ??.
Case K = C. Using the inequality
$$\max\{|\mathrm{Re}\,z|, |\mathrm{Im}\,z|\} \le |z|, \quad \forall\, z \in \mathbb{C},$$
combined with Exercise ??, it follows that both maps $|\mathrm{Re}\,\alpha|, |\mathrm{Im}\,\alpha| : I \to [0, \infty)$ are summable. Using the real case it then follows that both maps $\mathrm{Re}\,\alpha, \mathrm{Im}\,\alpha : I \to \mathbb{R}$ are summable. Then the summability of $\alpha$ follows from Lemma ??. □

The following result shows that summability is essentially the same as the
summability of series.
Proposition 2.2. Suppose $\alpha : I \to K$ is summable. Then the support set
$$[[\alpha]] = \{j \in I : \alpha(j) \neq 0\}$$
is at most countable.
Proof. For every integer $n \ge 1$, we define the set $J_n = \{j \in I : |\alpha(j)| \ge \tfrac{1}{n}\}$. Since $|\alpha|$ is summable, the sets $J_n$, $n \ge 1$, are all finite. The desired result then follows from the obvious equality $[[\alpha]] = \bigcup_{n=1}^{\infty} J_n$. □

We are now ready to discuss our next class of examples.


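Example 2.4. Let K be either R or C, let $I$ be a non-empty set, and let $p \in [1, \infty)$ be a real number. We define
$$\ell^p_K(I) = \big\{\alpha : I \to K \ :\ |\alpha|^p : I \to [0, \infty) \text{ summable}\big\}.$$
For $\alpha \in \ell^p_K(I)$ we define
$$\|\alpha\|_p = \Big(\sum_{j \in I} |\alpha(j)|^p\Big)^{1/p}.$$
When K = C, the space $\ell^p_{\mathbb{C}}(I)$ is simply denoted by $\ell^p(I)$. When $I = \mathbb{N}$ - the set of natural numbers - instead of $\ell^p_{\mathbb{R}}(\mathbb{N})$ we simply write $\ell^p_{\mathbb{R}}$, and instead of $\ell^p(\mathbb{N})$ we simply write $\ell^p$.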
In order to show that the `p spaces (1 ≤ p < ∞) are normed vector spaces, we
will need several preliminary results. The first result we are going to need is the
(classical) Hölder inequality.
Exercise 9. Let $q > 1$ and let $u, v \ge 0$. Define the function $f : [0, 1] \to \mathbb{R}$ by
$$f(t) = ut + v(1 - t^q)^{1/q}, \quad t \in [0, 1].$$
Prove that
$$\max_{t \in [0,1]} f(t) = (u^p + v^p)^{1/p},$$
where $p = \frac{q}{q-1}$. Prove that, unless $u = v = 0$, there exists a unique $s \in [0, 1]$ such that
$$f(s) = \max_{t \in [0,1]} f(t).$$
Hint: Analyze the derivative: $f'(t) = u - v\Big(\dfrac{t^q}{1 - t^q}\Big)^{1/p}$, $t \in (0, 1)$.
Lemma 2.3 (Hölder's inequality). Let $a_1, a_2, \dots, a_n, b_1, b_2, \dots, b_n$ be non-negative numbers. Let $p, q > 1$ be real numbers with the property $\frac{1}{p} + \frac{1}{q} = 1$. Then:
(3)   $\sum_{j=1}^{n} a_j b_j \le \Big(\sum_{j=1}^{n} a_j^p\Big)^{1/p} \cdot \Big(\sum_{j=1}^{n} b_j^q\Big)^{1/q}.$
Moreover, one has equality only when the sequences $(a_1^p, \dots, a_n^p)$ and $(b_1^q, \dots, b_n^q)$ are proportional.
Proof. The proof will be carried out by induction on $n$. The case $n = 1$ is trivial.
Case n = 2.
Assume $(b_1, b_2) \neq (0, 0)$. (Otherwise everything is trivial.) Define the number
$$r = \frac{b_1}{(b_1^q + b_2^q)^{1/q}}.$$
Notice that $r \in [0, 1]$, and we have
$$\frac{b_2}{(b_1^q + b_2^q)^{1/q}} = (1 - r^q)^{1/q}.$$
Notice also that, upon dividing by $(b_1^q + b_2^q)^{1/q}$, the desired inequality
(4)   $a_1 b_1 + a_2 b_2 \le (a_1^p + a_2^p)^{1/p} (b_1^q + b_2^q)^{1/q}$
reads
$$a_1 r + a_2 (1 - r^q)^{1/q} \le (a_1^p + a_2^p)^{1/p},$$
and it follows immediately from Exercise 9, applied to the function
$$f(t) = a_1 t + a_2 (1 - t^q)^{1/q}, \quad t \in [0, 1].$$
Let us examine when equality holds. If $a_1 = a_2 = 0$, the equality obviously holds, and in this case $(a_1^p, a_2^p)$ is clearly proportional to $(b_1^q, b_2^q)$. Assume $(a_1, a_2) \neq (0, 0)$. Put
$$s = \frac{a_1^{p/q}}{(a_1^p + a_2^p)^{1/q}},$$
and notice that
$$(1 - s^q)^{1/q} = \Big(1 - \frac{a_1^p}{a_1^p + a_2^p}\Big)^{1/q} = \frac{a_2^{p/q}}{(a_1^p + a_2^p)^{1/q}},$$
so we have
$$f(s) = \frac{a_1^{1 + p/q} + a_2^{1 + p/q}}{(a_1^p + a_2^p)^{1/q}} = \frac{a_1^p + a_2^p}{(a_1^p + a_2^p)^{1/q}} = (a_1^p + a_2^p)^{1 - 1/q} = (a_1^p + a_2^p)^{1/p} = \max_{t \in [0,1]} f(t).$$
By Exercise 9, it follows that we have equality in (4) precisely when $r = s$, i.e.
$$\frac{b_1}{(b_1^q + b_2^q)^{1/q}} = \frac{a_1^{p/q}}{(a_1^p + a_2^p)^{1/q}},$$
or equivalently
$$\frac{b_1^q}{b_1^q + b_2^q} = \frac{a_1^p}{a_1^p + a_2^p}.$$
Obviously this forces
$$\frac{b_2^q}{b_1^q + b_2^q} = \frac{a_2^p}{a_1^p + a_2^p},$$
so indeed $(a_1^p, a_2^p)$ and $(b_1^q, b_2^q)$ are proportional.
Having proven the case $n = 2$, we now proceed with the proof of:
The implication: Case $n = k$ ⇒ Case $n = k + 1$.
Start with two sequences $(a_1, a_2, \dots, a_k, a_{k+1})$ and $(b_1, b_2, \dots, b_k, b_{k+1})$. Define the numbers
$$a = \Big(\sum_{j=1}^k a_j^p\Big)^{1/p} \quad\text{and}\quad b = \Big(\sum_{j=1}^k b_j^q\Big)^{1/q}.$$
Using the assumption that the case $n = k$ holds, we have
$$\sum_{j=1}^{k+1} a_j b_j \le \Big(\sum_{j=1}^k a_j^p\Big)^{1/p} \cdot \Big(\sum_{j=1}^k b_j^q\Big)^{1/q} + a_{k+1} b_{k+1} = ab + a_{k+1} b_{k+1}. \tag{5}$$
Using the case $n = 2$ we also have
$$ab + a_{k+1} b_{k+1} \le (a^p + a_{k+1}^p)^{1/p} \cdot (b^q + b_{k+1}^q)^{1/q} = \Big(\sum_{j=1}^{k+1} a_j^p\Big)^{1/p} \cdot \Big(\sum_{j=1}^{k+1} b_j^q\Big)^{1/q}, \tag{6}$$
so combining with (5) we see that the desired inequality (3) holds for $n = k + 1$.
Assume now we have equality. Then we must have equality in both (5) and (6). On the one hand, the equality in (5) forces $(a_1^p, a_2^p, \dots, a_k^p)$ and $(b_1^q, b_2^q, \dots, b_k^q)$ to be proportional (since we assume the case $n = k$). On the other hand, the equality in (6) forces $(a^p, a_{k+1}^p)$ and $(b^q, b_{k+1}^q)$ to be proportional (by the case $n = 2$). Since
$$a^p = \sum_{j=1}^k a_j^p \quad\text{and}\quad b^q = \sum_{j=1}^k b_j^q,$$
it is clear that $(a_1^p, a_2^p, \dots, a_k^p, a_{k+1}^p)$ and $(b_1^q, b_2^q, \dots, b_k^q, b_{k+1}^q)$ are proportional. □
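As a sanity check of inequality (3) and of the equality case, here is a small numerical sketch. The exponent $p = 2.5$, the random test vectors, and the helper names `p_norm` and `holder_gap` are all assumptions made for illustration. The equality case is produced by taking $b_j = c \cdot a_j^{p/q}$, so that $(b_j^q)$ is proportional to $(a_j^p)$.

```python
import random


def p_norm(v, p):
    # (sum |v_j|^p)^(1/p) for a finite sequence v.
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def holder_gap(a, b, p):
    # Right-hand side of (3) minus left-hand side; should be >= 0.
    q = p / (p - 1.0)
    lhs = sum(x * y for x, y in zip(a, b))
    rhs = p_norm(a, p) * p_norm(b, q)
    return rhs - lhs


random.seed(0)
p = 2.5
q = p / (p - 1.0)

# Random non-negative sequences: the gap is never negative (up to rounding).
gaps = [
    holder_gap(
        [random.random() for _ in range(6)],
        [random.random() for _ in range(6)],
        p,
    )
    for _ in range(200)
]

# Equality case: b_j^q proportional to a_j^p, i.e. b_j = c * a_j^(p/q).
a = [0.5, 1.0, 2.0, 3.0]
b = [0.7 * x ** (p / q) for x in a]
equality_gap = holder_gap(a, b, p)
```

The list `gaps` stays non-negative, while `equality_gap` vanishes up to floating-point error.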
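Definition. Two numbers $p, q \in [1, \infty]$ are said to be Hölder conjugate, if $\frac1p + \frac1q = 1$. Here we use the convention $\frac1\infty = 0$.
Proposition 2.3. Let $\mathbb K$ be one of the fields $\mathbb R$ or $\mathbb C$, let $I$ be a non-empty set, and let $p, q \in [1, \infty]$ be two Hölder conjugate numbers. If $\alpha \in \ell^p_{\mathbb K}(I)$ and $\beta \in \ell^q_{\mathbb K}(I)$, then $\alpha\beta \in \ell^1_{\mathbb K}(I)$, and
$$\|\alpha\beta\|_1 \le \|\alpha\|_p \cdot \|\beta\|_q.$$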

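Proof. Using Lemma ??, it suffices to prove the inequality
$$\sum_{j\in F} |\alpha(j)\beta(j)| \le \|\alpha\|_p \cdot \|\beta\|_q, \tag{7}$$
for every finite set $F \subset I$.
Fix for the moment a finite subset $F \subset I$. Assume first that $p, q \in (1, \infty)$. Using Hölder's inequality we have
$$\sum_{j\in F} |\alpha(j)\beta(j)| = \sum_{j\in F} |\alpha(j)| \cdot |\beta(j)| \le \Big(\sum_{j\in F} |\alpha(j)|^p\Big)^{1/p} \cdot \Big(\sum_{j\in F} |\beta(j)|^q\Big)^{1/q}. \tag{8}$$
Notice however that
$$\sum_{j\in F} |\alpha(j)|^p \le \sum_{j\in I} |\alpha(j)|^p = \|\alpha\|_p^p, \qquad \sum_{j\in F} |\beta(j)|^q \le \sum_{j\in I} |\beta(j)|^q = \|\beta\|_q^q,$$
so we get
$$\Big(\sum_{j\in F} |\alpha(j)|^p\Big)^{1/p} \le \|\alpha\|_p \quad\text{and}\quad \Big(\sum_{j\in F} |\beta(j)|^q\Big)^{1/q} \le \|\beta\|_q,$$
so when we go back to (8) we immediately get the desired inequality (7).
In the case when $p = 1$ (so that $q = \infty$), we immediately have
$$\sum_{j\in F} |\alpha(j)\beta(j)| \le \Big(\sum_{j\in F} |\alpha(j)|\Big) \cdot \max_{j\in F} |\beta(j)| \le \Big(\sum_{j\in I} |\alpha(j)|\Big) \cdot \sup_{j\in I} |\beta(j)| = \|\alpha\|_1 \cdot \|\beta\|_\infty.$$
The case $p = \infty$ is proven in the exact same way. □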


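Remark 2.2. Suppose $p, q \in [1, \infty]$ are Hölder conjugate numbers. For any $\alpha \in \ell^p_{\mathbb K}(I)$ and $\beta \in \ell^q_{\mathbb K}(I)$, the map $\alpha\beta$ is summable (by Proposition ??). In particular, one can define the number
$$\langle \alpha, \beta \rangle = \sum_{j\in I} \alpha(j)\beta(j) \in \mathbb K.$$
As a consequence we get the inequality
$$|\langle \alpha, \beta \rangle| \le \|\alpha\|_p \cdot \|\beta\|_q, \quad \forall\, \alpha \in \ell^p_{\mathbb K}(I),\ \beta \in \ell^q_{\mathbb K}(I).$$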
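Notations. Let $\mathbb K$ be either $\mathbb R$ or $\mathbb C$, let $I$ be a non-empty set, and let $q \in [1, \infty]$. We define
$$B^q_{\mathbb K}(I) = \{\alpha \in \mathrm{fin}_{\mathbb K}(I) : \|\alpha\|_q \le 1\}.$$
(Remark that $\mathrm{fin}_{\mathbb K}(I) \subset \ell^q_{\mathbb K}(I)$, for all $q \in [1, \infty]$.)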
Theorem 2.1 (Dual definition of $\ell^p$ spaces). Let $p, q \in (1, \infty)$ be Hölder conjugate numbers, let $\mathbb K$ be one of the fields $\mathbb R$ or $\mathbb C$, and let $I$ be a non-empty set. For a function $\alpha : I \to \mathbb K$, the following are equivalent:
(i) $\alpha \in \ell^p_{\mathbb K}(I)$;
(ii) $\sup_{\beta \in B^q_{\mathbb K}(I)} |\langle \alpha, \beta \rangle| < \infty$.
Moreover, one has the equality
$$\sup_{\beta \in B^q_{\mathbb K}(I)} |\langle \alpha, \beta \rangle| = \|\alpha\|_p, \quad \forall\, \alpha \in \ell^p_{\mathbb K}(I). \tag{9}$$

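Proof. It will be convenient to introduce several notations. Given a function $\alpha : I \to \mathbb K$, and a finite set $F \subset I$, we define the function $\beta_\alpha^F : I \to \mathbb K$, as follows:
$$\beta_\alpha^F(i) = \begin{cases} \dfrac{|\alpha(i)|^{1+\frac pq}}{\alpha(i) \cdot \big(\sum_{j\in F} |\alpha(j)|^p\big)^{1/q}} & \text{if } i \in F \text{ and } \alpha(i) \ne 0 \\[2ex] 0 & \text{if } i \notin F \text{ or } \alpha(i) = 0 \end{cases}$$
Notice that $[\![\beta_\alpha^F]\!] \subset F$, and unless $\beta_\alpha^F$ is identically zero, we have
$$\sum_{i \in [\![\beta_\alpha^F]\!]} |\beta_\alpha^F(i)|^q = 1.$$
So in any case we have $\beta_\alpha^F \in B^q_{\mathbb K}(I)$. Notice also that, unless $\beta_\alpha^F$ is identically zero, we have
$$\langle \alpha, \beta_\alpha^F \rangle = \sum_{i\in F} \alpha(i)\beta_\alpha^F(i) = \frac{\sum_{i\in F} |\alpha(i)|^{1+\frac pq}}{\big(\sum_{j\in F} |\alpha(j)|^p\big)^{1/q}} = \frac{\sum_{i\in F} |\alpha(i)|^p}{\big(\sum_{j\in F} |\alpha(j)|^p\big)^{1/q}} = \Big(\sum_{i\in F} |\alpha(i)|^p\Big)^{1-\frac1q} = \Big(\sum_{i\in F} |\alpha(i)|^p\Big)^{1/p}. \tag{10}$$
It is clear that the equality (10) actually holds even when $\beta_\alpha^F$ is identically zero.
To make the exposition a bit clearer, we denote the quantity $\sup_{\beta \in B^q_{\mathbb K}(I)} |\langle \alpha, \beta \rangle|$ simply by $|||\alpha|||$.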
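We now proceed with the proof of the Theorem.
(i) ⇒ (ii). Assume $\alpha \in \ell^p_{\mathbb K}(I)$. In order to prove (ii) it suffices to prove the inequality
$$|||\alpha||| \le \|\alpha\|_p. \tag{11}$$
Start with some arbitrary $\beta \in B^q_{\mathbb K}(I)$. Using Hölder's inequality we have
$$|\langle \alpha, \beta \rangle| = \Big|\sum_{j \in [\![\beta]\!]} \alpha(j)\beta(j)\Big| \le \sum_{j \in [\![\beta]\!]} |\alpha(j)| \cdot |\beta(j)| \le \Big(\sum_{j \in [\![\beta]\!]} |\alpha(j)|^p\Big)^{1/p} \cdot \Big(\sum_{j \in [\![\beta]\!]} |\beta(j)|^q\Big)^{1/q} \le \sup_{\substack{F \subset I \\ \text{finite}}} \Big(\sum_{i\in F} |\alpha(i)|^p\Big)^{1/p} = \|\alpha\|_p.$$
Since this inequality holds for all $\beta \in B^q_{\mathbb K}(I)$, the inequality (11) follows.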
(ii) ⇒ (i). Assume now $|||\alpha||| < \infty$. In order to prove condition (i) it suffices to prove that
$$\sum_{i\in F} |\alpha(i)|^p \le |||\alpha|||^p, \quad \text{for every finite subset } F \subset I. \tag{12}$$
By (10) we know that for every finite subset $F \subset I$ we have
$$\sum_{i\in F} |\alpha(i)|^p = \langle \alpha, \beta_\alpha^F \rangle^p. \tag{13}$$
In particular we get the fact that $\langle \alpha, \beta_\alpha^F \rangle = |\langle \alpha, \beta_\alpha^F \rangle|$, and the fact that $\beta_\alpha^F$ belongs to $B^q_{\mathbb K}(I)$, combined with (13), will give
$$\sum_{i\in F} |\alpha(i)|^p = |\langle \alpha, \beta_\alpha^F \rangle|^p \le \Big(\sup_{\beta \in B^q_{\mathbb K}(I)} |\langle \alpha, \beta \rangle|\Big)^p = |||\alpha|||^p.$$
Having proven the equivalence (i) ⇔ (ii), let us now observe that (9) is an immediate consequence of (11) and (12). □
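The norming element $\beta_\alpha^F$ used in the proof can be computed explicitly in the real, finite case. Below is a minimal sketch, assuming real scalars, $F$ equal to the whole (finite) index set, and the sample value $p = 4$; the function names are hypothetical. It checks that $\|\beta_\alpha^F\|_q = 1$ and that $\langle \alpha, \beta_\alpha^F \rangle = \|\alpha\|_p$, which is the content of (10).

```python
def p_norm(v, p):
    # (sum |v_j|^p)^(1/p) for a finite sequence v.
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def norming_vector(alpha, p):
    # beta(i) = |alpha(i)|^(1 + p/q) / (alpha(i) * (sum_j |alpha(j)|^p)^(1/q)),
    # and beta(i) = 0 where alpha(i) = 0, as in the proof of Theorem 2.1.
    q = p / (p - 1.0)
    total = sum(abs(x) ** p for x in alpha)
    if total == 0.0:
        return [0.0] * len(alpha)
    scale = total ** (1.0 / q)
    return [
        0.0 if x == 0 else abs(x) ** (1.0 + p / q) / (x * scale)
        for x in alpha
    ]


# Illustrative data (assumptions, not taken from the notes).
p = 4.0
q = p / (p - 1.0)
alpha = [3.0, -1.0, 0.0, 2.0]

beta = norming_vector(alpha, p)
pairing = sum(x * y for x, y in zip(alpha, beta))
```

Here `pairing` reproduces `p_norm(alpha, p)` and `p_norm(beta, q)` equals 1, both up to rounding.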
Exercise 10. Prove that Theorem 2.1 holds also in the cases $(p, q) = (1, \infty)$ and $(p, q) = (\infty, 1)$.
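Corollary 2.1. Let $\mathbb K$ be either $\mathbb R$ or $\mathbb C$, let $I$ be a non-empty set, and let $p \ge 1$.
(i) When equipped with point-wise addition and scalar multiplication, the set $\ell^p_{\mathbb K}(I)$ is a $\mathbb K$-vector space.
(ii) The map
$$\ell^p_{\mathbb K}(I) \ni \alpha \longmapsto \|\alpha\|_p \in [0, \infty)$$
is a norm.
Proof. Let $q$ be the Hölder conjugate of $p$. If $\alpha \in \ell^p_{\mathbb K}(I)$, and $\lambda \in \mathbb K$, then
$$\langle \lambda\alpha, \beta \rangle = \lambda \langle \alpha, \beta \rangle, \quad \forall\, \beta \in \mathrm{fin}_{\mathbb K}(I),$$
so we get
$$\sup_{\beta \in B^q_{\mathbb K}(I)} |\langle \lambda\alpha, \beta \rangle| = |\lambda| \cdot \sup_{\beta \in B^q_{\mathbb K}(I)} |\langle \alpha, \beta \rangle|,$$
which gives the fact that $\lambda\alpha \in \ell^p_{\mathbb K}(I)$, as well as the equality $\|\lambda\alpha\|_p = |\lambda| \cdot \|\alpha\|_p$.
If $\alpha_1, \alpha_2 \in \ell^p_{\mathbb K}(I)$, then
$$\langle \alpha_1 + \alpha_2, \beta \rangle = \langle \alpha_1, \beta \rangle + \langle \alpha_2, \beta \rangle, \quad \forall\, \beta \in \mathrm{fin}_{\mathbb K}(I),$$
so we get
$$\sup_{\beta \in B^q_{\mathbb K}(I)} |\langle \alpha_1 + \alpha_2, \beta \rangle| = \sup_{\beta \in B^q_{\mathbb K}(I)} \big|\langle \alpha_1, \beta \rangle + \langle \alpha_2, \beta \rangle\big| \le \sup_{\beta \in B^q_{\mathbb K}(I)} \big(|\langle \alpha_1, \beta \rangle| + |\langle \alpha_2, \beta \rangle|\big) \le \sup_{\beta \in B^q_{\mathbb K}(I)} |\langle \alpha_1, \beta \rangle| + \sup_{\beta \in B^q_{\mathbb K}(I)} |\langle \alpha_2, \beta \rangle|,$$
which gives the fact that $\alpha_1 + \alpha_2 \in \ell^p_{\mathbb K}(I)$, as well as the inequality
$$\|\alpha_1 + \alpha_2\|_p \le \|\alpha_1\|_p + \|\alpha_2\|_p.$$
The implication $\|\alpha\|_p = 0 \Rightarrow \alpha = 0$ is obvious. □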
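Exercise 11. Let $p \ge 1$ be a real number, let $\mathbb K$ be one of the fields $\mathbb R$ or $\mathbb C$, and let $I$ be a non-empty set. Prove that $\mathrm{fin}_{\mathbb K}(I)$ is a dense linear subspace in $\ell^p_{\mathbb K}(I)$.
Remark 2.3. Let $p, q \in [1, \infty]$ be Hölder conjugate. Then the map
$$\ell^p_{\mathbb K}(I) \times \ell^q_{\mathbb K}(I) \ni (\alpha, \beta) \longmapsto \langle \alpha, \beta \rangle \in \mathbb K$$
is bilinear, in the sense that for any $\gamma \in \ell^p_{\mathbb K}(I)$ and any $\eta \in \ell^q_{\mathbb K}(I)$, the maps
$$\ell^p_{\mathbb K}(I) \ni \alpha \longmapsto \langle \alpha, \eta \rangle \in \mathbb K,$$
$$\ell^q_{\mathbb K}(I) \ni \beta \longmapsto \langle \gamma, \beta \rangle \in \mathbb K$$
are linear. These facts follow immediately from Exercise ??.
We now examine linear continuous maps between normed spaces.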

Proposition 2.4. Let $\mathbb K$ be either $\mathbb R$ or $\mathbb C$, let $X$ and $Y$ be normed $\mathbb K$-vector spaces, and let $T : X \to Y$ be a $\mathbb K$-linear map. The following are equivalent:
(i) $T$ is continuous;
(ii) $\sup\{\|Tx\| : x \in X,\ \|x\| \le 1\} < \infty$;
(iii) $\sup\{\|Tx\| : x \in X,\ \|x\| = 1\} < \infty$;
(iv) $T$ is continuous at $0$.
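Proof. (i) ⇒ (ii). Assume $T$ is continuous, but
$$\sup\{\|Tx\| : x \in X,\ \|x\| \le 1\} = \infty,$$
which means there exists some sequence $(x_n)_{n\ge1} \subset X$ such that
(a) $\|x_n\| \le 1$, $\forall\, n \ge 1$;
(b) $\lim_{n\to\infty} \|Tx_n\| = \infty$.
Passing to a subsequence if necessary, we may assume $Tx_n \ne 0$ for all $n$. Put
$$z_n = \|Tx_n\|^{-1} x_n, \quad \forall\, n \ge 1.$$
On the one hand, we have
$$\|z_n\| = \frac{\|x_n\|}{\|Tx_n\|} \le \frac{1}{\|Tx_n\|}, \quad \forall\, n \ge 1,$$
which gives $\lim_{n\to\infty} \|z_n\| = 0$, i.e. $\lim_{n\to\infty} z_n = 0$. Since $T$ is assumed to be continuous, we will get
$$\lim_{n\to\infty} Tz_n = T0 = 0. \tag{14}$$
On the other hand, since $T$ is linear, we have $Tz_n = \|Tx_n\|^{-1}\, Tx_n$, so in particular we get
$$\|Tz_n\| = 1, \quad \forall\, n \ge 1,$$
which clearly contradicts (14).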
(ii) ⇒ (iii). This is obvious, since the supremum in (iii) is taken over a subset of the set used in (ii).
(iii) ⇒ (iv). Let $(x_n)_{n\ge1} \subset X$ be a sequence with $\lim_{n\to\infty} x_n = 0$. For each $n \ge 1$, define
$$u_n = \begin{cases} \|x_n\|^{-1} x_n, & \text{if } x_n \ne 0 \\ \text{any vector of norm } 1, & \text{if } x_n = 0 \end{cases}$$
so that we have
$$\|u_n\| = 1 \quad\text{and}\quad x_n = \|x_n\|\, u_n, \quad \forall\, n \ge 1.$$
Since $T$ is linear, we have
$$Tx_n = \|x_n\|\, Tu_n, \quad \forall\, n \ge 1. \tag{15}$$
If we define $M = \sup\{\|Tx\| : x \in X,\ \|x\| = 1\}$, then $\|Tu_n\| \le M$, $\forall\, n \ge 1$, so (15) will give
$$\|Tx_n\| \le M \cdot \|x_n\|, \quad \forall\, n \ge 1,$$
and the condition $\lim_{n\to\infty} x_n = 0$ will force $\lim_{n\to\infty} Tx_n = 0$.
(iv) ⇒ (i). Assume $T$ is continuous at $0$, and let us prove that $T$ is continuous at any point. Start with some arbitrary $x \in X$ and an arbitrary sequence $(x_n)_{n\ge1} \subset X$ with $\lim_{n\to\infty} x_n = x$. Put $z_n = x_n - x$, so that $\lim_{n\to\infty} z_n = 0$. Then we will have $\lim_{n\to\infty} Tz_n = 0$, which (use the linearity of $T$) means that
$$0 = \lim_{n\to\infty} \|Tz_n\| = \lim_{n\to\infty} \|Tx_n - Tx\|,$$
thus proving that $\lim_{n\to\infty} Tx_n = Tx$. □
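The rescaling used in the step (i) ⇒ (ii) can be seen on a concrete unbounded linear map. The sketch below is an illustration that is not part of the notes: it assumes the space of finitely supported sequences on $\{1, 2, \dots\}$ with the sup norm, stored as dictionaries, and the hypothetical functional $T(\beta) = \sum_i i\,\beta(i)$. The unit vectors $x_n = \delta^n$ have $\|x_n\| = 1$ and $Tx_n = n$, so the rescaled vectors $z_n = x_n / |Tx_n|$ tend to $0$ while $Tz_n = 1$.

```python
def sup_norm(x):
    # Sup norm of a finitely supported sequence stored as {index: value}.
    return max((abs(v) for v in x.values()), default=0.0)


def T(x):
    # Hypothetical linear functional T(beta) = sum_i i * beta(i).
    return sum(i * v for i, v in x.items())


def scale(c, x):
    # Scalar multiple of a finitely supported sequence.
    return {i: c * v for i, v in x.items()}


# x_n = delta^n: norm 1, but T(x_n) = n is unbounded.
xs = [{n: 1.0} for n in range(1, 50)]

# z_n = x_n / |T x_n|: the norms go to 0, yet T(z_n) stays equal to 1.
zs = [scale(1.0 / abs(T(x)), x) for x in xs]
```

So this $T$ violates (ii), and the sequence `zs` witnesses that it is not continuous at $0$.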


Remark 2.4. Using the notations above, the quantities in (ii) and (iii) are in fact equal. Indeed, if we define
$$M_1 = \sup\{\|Tx\| : x \in X,\ \|x\| \le 1\},$$
$$M_2 = \sup\{\|Tx\| : x \in X,\ \|x\| = 1\},$$
then as observed during the proof, we have $M_2 \le M_1$. Conversely, if we start with some arbitrary $x \in X$ with $\|x\| \le 1$, then we can always write $x = \|x\|\, u$, for some $u \in X$ with $\|u\| = 1$. In particular we will get
$$\|Tx\| = \|x\| \cdot \|Tu\| \le \|x\| \cdot M_2 \le M_2.$$
Taking supremum in the above inequality, over all $x \in X$ with $\|x\| \le 1$, will then give the inequality $M_1 \le M_2$.
Notations. Let $\mathbb K$ be either $\mathbb R$ or $\mathbb C$, and let $X$ and $Y$ be normed $\mathbb K$-vector spaces. We define
$$L(X, Y) = \{T : X \to Y : T \text{ is } \mathbb K\text{-linear and continuous}\}.$$
For $T \in L(X, Y)$ we define (see the above remark)
$$\|T\| = \sup\{\|Tx\| : x \in X,\ \|x\| \le 1\} = \sup\{\|Tx\| : x \in X,\ \|x\| = 1\}.$$
When $Y = \mathbb K$ (equipped with the absolute value as the norm), the space $L(X, \mathbb K)$ will be denoted simply by $X^*$, and will be called the topological dual of $X$.
Proposition 2.5. Let $\mathbb K$ be either $\mathbb R$ or $\mathbb C$, and let $X$ and $Y$ be normed $\mathbb K$-vector spaces.
(i) The space $L(X, Y)$ is a $\mathbb K$-vector space.
(ii) For $T \in L(X, Y)$ we have
$$\|T\| = \min\{C \ge 0 : \|Tx\| \le C\|x\|,\ \forall\, x \in X\}. \tag{16}$$
In particular one has
$$\|Tx\| \le \|T\| \cdot \|x\|, \quad \forall\, x \in X. \tag{17}$$
(iii) The map $L(X, Y) \ni T \longmapsto \|T\| \in [0, \infty)$ is a norm.
Proof. The fact that $L(X, Y)$ is a vector space is clear.
(ii). Assume $T \in L(X, Y)$. We begin by proving (17). Start with some arbitrary $x \in X$, and write it as $x = \|x\|\, u$, for some $u \in X$ with $\|u\| = 1$. Then by definition we have $\|Tu\| \le \|T\|$, and by linearity we have
$$\|Tx\| = \|x\| \cdot \|Tu\| \le \|x\| \cdot \|T\|.$$
To prove the equality (16) let us define the set
$$C_T = \{C \ge 0 : \|Tx\| \le C\|x\|,\ \forall\, x \in X\}.$$
On the one hand, by (17) we know that $\|T\| \in C_T$. On the other hand, if we take an arbitrary $C \in C_T$, then for every $u \in X$ with $\|u\| = 1$, we will have
$$\|Tu\| \le C\|u\| = C,$$
so taking supremum, over all $u$ with $\|u\| = 1$, will immediately give $\|T\| \le C$. Since we now have
$$\|T\| \le C, \quad \forall\, C \in C_T,$$
we clearly get $\|T\| = \min C_T$.
(iii). Let $T, S \in L(X, Y)$. Using (17), we have
$$\|(T + S)x\| = \|Tx + Sx\| \le \|Tx\| + \|Sx\| \le (\|T\| + \|S\|) \cdot \|x\|, \quad \forall\, x \in X.$$
Then using (16) we get
$$\|T + S\| \le \|T\| + \|S\|.$$
If $T \in L(X, Y)$ and $\lambda \in \mathbb K$, then the equality
$$\|(\lambda T)x\| = |\lambda| \cdot \|Tx\|, \quad x \in X,$$
will immediately give $\|\lambda T\| = |\lambda| \cdot \|T\|$.
Finally if $T \in L(X, Y)$ has $\|T\| = 0$, then using (17) one immediately gets $T = 0$. □
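For a concrete feel of the operator norm and of inequality (17), consider a linear map between finite dimensional spaces equipped with the sup norm. The sketch below relies on a standard fact that is not proved in these notes: for a real matrix acting on $(\mathbb R^n, \|\cdot\|_\infty)$, the operator norm equals the maximum absolute row sum. The matrix and the function names are assumptions made for illustration.

```python
import itertools
import random


def sup_norm(v):
    return max(abs(x) for x in v)


def apply(matrix, v):
    # Matrix-vector product, the matrix being given as a list of rows.
    return [sum(a * x for a, x in zip(row, v)) for row in matrix]


def operator_norm_sup(matrix):
    # Standard fact (assumed here): the norm induced by the sup norm
    # is the maximum absolute row sum.
    return max(sum(abs(a) for a in row) for row in matrix)


# Illustrative matrix (an assumption, not taken from the notes).
T = [[1.0, -2.0, 0.5], [0.0, 3.0, -1.0]]
norm_T = operator_norm_sup(T)

# The supremum over the unit sphere is attained at a vector of signs.
attained = max(
    sup_norm(apply(T, list(signs)))
    for signs in itertools.product([-1.0, 1.0], repeat=3)
)

# Inequality (17): ||Tx|| <= ||T|| * ||x|| on random vectors.
random.seed(1)
samples = [[random.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(500)]
bound_holds = all(
    sup_norm(apply(T, x)) <= norm_T * sup_norm(x) + 1e-12 for x in samples
)
```

In this example both `norm_T` and `attained` equal 4, so the supremum over $\|x\| = 1$ is in fact a maximum, and the constant $\|T\|$ in (17) cannot be lowered.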
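Notation. Let $I$ be a non-empty set, let $\mathbb K$ be one of the fields $\mathbb R$ or $\mathbb C$, and let $p \in [1, \infty]$. Let $q$ be the Hölder conjugate of $p$. For every element $\alpha \in \ell^p_{\mathbb K}(I)$ we define the map $\theta_\alpha : \ell^q_{\mathbb K}(I) \to \mathbb K$ by
$$\theta_\alpha(\beta) = \langle \alpha, \beta \rangle = \sum_{i\in I} \alpha(i)\beta(i), \quad \beta \in \ell^q_{\mathbb K}(I).$$
We know that $\theta_\alpha$ is linear, and by Remark 2.2, we have
$$|\theta_\alpha(\beta)| \le \|\alpha\|_p \cdot \|\beta\|_q, \quad \forall\, \beta \in \ell^q_{\mathbb K}(I),$$
so $\theta_\alpha$ is continuous, and we have the inequality
$$\|\theta_\alpha\| \le \|\alpha\|_p. \tag{18}$$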
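Proposition 2.6. Using the above notations, but assuming $p \in (1, \infty]$, the map
$$\Theta : \ell^p_{\mathbb K}(I) \ni \alpha \longmapsto \theta_\alpha \in \big(\ell^q_{\mathbb K}(I)\big)^*$$
is a linear isomorphism of $\mathbb K$-vector spaces. Moreover, $\Theta$ is isometric, in the sense that
$$\|\Theta\alpha\| = \|\alpha\|_p, \quad \forall\, \alpha \in \ell^p_{\mathbb K}(I). \tag{19}$$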
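Proof. We begin by proving (19). Since we have the inclusion
$$\{\beta \in \ell^q_{\mathbb K}(I) : \|\beta\|_q \le 1\} \supset B^q_{\mathbb K}(I),$$
it follows that
$$\|\theta_\alpha\| = \sup\{|\theta_\alpha(\beta)| : \beta \in \ell^q_{\mathbb K}(I),\ \|\beta\|_q \le 1\} \ge \sup\{|\theta_\alpha(\beta)| : \beta \in B^q_{\mathbb K}(I)\}. \tag{20}$$
We know however (see Theorem 2.1 and Exercise 7) that
$$\sup\{|\theta_\alpha(\beta)| : \beta \in B^q_{\mathbb K}(I)\} = \|\alpha\|_p,$$
so using (20) we get
$$\|\theta_\alpha\| \ge \|\alpha\|_p.$$
Combining this with (18) yields the desired equality.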
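The fact that $\Theta$ is linear is pretty obvious. Notice now that since $\Theta$ is isometric, it is clear that $\Theta$ is injective, so the only thing we need to prove is the fact that $\Theta$ is surjective. Start with an arbitrary linear continuous map $\phi : \ell^q_{\mathbb K}(I) \to \mathbb K$. For every $i \in I$ we define the function $\delta^i : I \to \mathbb K$ by
$$\delta^i(j) = \begin{cases} 1 & \text{if } j = i \\ 0 & \text{if } j \ne i \end{cases}$$
It is clear that $\delta^i \in \ell^q_{\mathbb K}(I)$, for all $i \in I$. (In fact $\delta^i \in \mathrm{fin}_{\mathbb K}(I)$.) We define $\alpha : I \to \mathbb K$ by
$$\alpha(i) = \phi(\delta^i), \quad \forall\, i \in I.$$
Notice that, for every $\beta \in \mathrm{fin}_{\mathbb K}(I)$, we have
$$\sum_{i\in I} \alpha(i)\beta(i) = \sum_{i \in \lfloor\beta\rfloor} \beta(i)\phi(\delta^i) = \phi\Big(\sum_{i \in \lfloor\beta\rfloor} \beta(i)\delta^i\Big) = \phi(\beta), \tag{21}$$
where $\lfloor\beta\rfloor = \{i \in I : \beta(i) \ne 0\}$. (Since $\beta \in \mathrm{fin}_{\mathbb K}(I)$, the set $\lfloor\beta\rfloor$ is finite.) Using the inequality (17) for $\phi$, the above computation shows that
$$|\langle \alpha, \beta \rangle| \le \|\phi\| \cdot \|\beta\|_q, \quad \forall\, \beta \in \mathrm{fin}_{\mathbb K}(I).$$
By Theorem 2.1 and Exercise 7, this proves that $\alpha \in \ell^p_{\mathbb K}(I)$. Going back to (21) we now have
$$\theta_\alpha(\beta) = \phi(\beta), \quad \forall\, \beta \in \mathrm{fin}_{\mathbb K}(I).$$
Since both $\theta_\alpha$ and $\phi$ are continuous, and $\mathrm{fin}_{\mathbb K}(I)$ is dense in $\ell^q_{\mathbb K}(I)$ (by Exercise 10), it follows that $\phi = \theta_\alpha$. □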
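The surjectivity argument is constructive: the representing element is read off from the values $\alpha(i) = \phi(\delta^i)$. Here is a minimal sketch for a finite index set $I = \{0, 1, 2, 3\}$ with real scalars; the functional `phi` and all function names are hypothetical, chosen only to illustrate the recipe.

```python
def delta(i, n):
    # The function delta^i on the index set {0, ..., n-1}.
    return [1.0 if j == i else 0.0 for j in range(n)]


def recover_alpha(phi, n):
    # alpha(i) = phi(delta^i), as in the proof of Proposition 2.6.
    return [phi(delta(i, n)) for i in range(n)]


def theta(alpha):
    # theta_alpha(beta) = <alpha, beta>.
    return lambda beta: sum(a * b for a, b in zip(alpha, beta))


# A hypothetical linear functional on R^4 (an assumption for illustration).
def phi(beta):
    return 2.0 * beta[0] - beta[1] + 0.5 * beta[3]


n = 4
alpha = recover_alpha(phi, n)
theta_alpha = theta(alpha)

test_vectors = [[1.0, 2.0, 3.0, 4.0], [-1.0, 0.0, 5.0, 2.0], [0.0, 0.0, 1.0, 0.0]]
agreement = all(abs(theta_alpha(b) - phi(b)) < 1e-12 for b in test_vectors)
```

In finite dimensions the identity $\theta_\alpha = \phi$ is just linearity, as in (21); the density argument is what is needed to pass from $\mathrm{fin}_{\mathbb K}(I)$ to all of $\ell^q_{\mathbb K}(I)$ when $I$ is infinite.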
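Remark 2.5. In the case $p = 1$, the map
$$\Theta : \ell^1_{\mathbb K}(I) \ni \alpha \longmapsto \theta_\alpha \in \big(\ell^\infty_{\mathbb K}(I)\big)^*$$
is still isometric, but it is no longer surjective, unless $I$ is finite. The explanation is the fact that when $I$ is infinite, the subspace $\mathrm{fin}_{\mathbb K}(I)$ is not dense in $\ell^\infty_{\mathbb K}(I)$. For example, if we take $\mathbf 1 \in \ell^\infty_{\mathbb K}(I)$ to be the constant function $1$, then it is pretty obvious that
$$\|\mathbf 1 - \beta\|_\infty \ge 1, \quad \forall\, \beta \in \mathrm{fin}_{\mathbb K}(I).$$
The above inequality can be immediately extended to
$$\|\lambda \mathbf 1 + \beta\|_\infty \ge |\lambda|, \quad \forall\, \lambda \in \mathbb K,\ \beta \in \mathrm{fin}_{\mathbb K}(I). \tag{22}$$
If we then consider the subspace
$$\widetilde{\mathrm{fin}}_{\mathbb K}(I) = \{\lambda \mathbf 1 + \beta : \beta \in \mathrm{fin}_{\mathbb K}(I),\ \lambda \in \mathbb K\},$$
we see that the map
$$\phi_0 : \widetilde{\mathrm{fin}}_{\mathbb K}(I) \ni \lambda \mathbf 1 + \beta \longmapsto \lambda \in \mathbb K$$
is linear, continuous, and has the property that
$$\phi_0\big|_{\mathrm{fin}_{\mathbb K}(I)} = 0, \quad \phi_0(\mathbf 1) = 1, \tag{23}$$
$$|\phi_0(\gamma)| \le \|\gamma\|_\infty, \quad \forall\, \gamma \in \widetilde{\mathrm{fin}}_{\mathbb K}(I). \tag{24}$$
Using the Hahn-Banach Theorem, we can then extend $\phi_0$ to a linear map $\phi : \ell^\infty_{\mathbb K}(I) \to \mathbb K$ which will still satisfy (23) and (24); in particular we have $\phi \in \big(\ell^\infty_{\mathbb K}(I)\big)^*$. Notice however that if we had $\phi = \theta_\alpha$, for some $\alpha \in \ell^1_{\mathbb K}(I)$, then we must have $\alpha(i) = \phi(\delta^i) = 0$, for all $i \in I$, so this would force $\phi = 0$, which is impossible, since $\phi(\mathbf 1) = 1$.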
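The reason behind inequality (22) is that a finitely supported $\beta$ leaves infinitely many coordinates of $\lambda \mathbf 1$ untouched. The sketch below makes this explicit under the assumption $I = \{0, 1, 2, \dots\}$ with real scalars, storing $\beta$ as a dictionary of its non-zero values; the function name is hypothetical.

```python
import random


def sup_norm_const_plus_fin(lam, beta):
    # Sup norm of lam*1 + beta on an infinite index set, where beta is
    # finitely supported and stored as {index: value}.  Off the support the
    # function equals lam, which contributes |lam| to the supremum.
    candidates = [abs(lam)] + [abs(lam + v) for v in beta.values()]
    return max(candidates)


# Distance from the constant function 1 to the indicator of {0, ..., n-1}:
# ||1 - beta_n|| is computed as ||1*1 + (-beta_n)||.
distances = [
    sup_norm_const_plus_fin(1.0, {i: -1.0 for i in range(n)})
    for n in (0, 1, 10, 100)
]

# Inequality (22) on random data.
random.seed(3)
checks = []
for _ in range(200):
    lam = random.uniform(-3.0, 3.0)
    beta = {i: random.uniform(-3.0, 3.0) for i in random.sample(range(50), 5)}
    checks.append(sup_norm_const_plus_fin(lam, beta) >= abs(lam))
```

However long the truncation, the distance to $\mathbf 1$ stays equal to $1$, which is why $\mathbf 1$ is not in the closure of $\mathrm{fin}_{\mathbb K}(I)$.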
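Exercise 12. Use the notations above. For every $\alpha \in \ell^1_{\mathbb K}(I)$, define
$$\sigma_\alpha = \theta_\alpha\big|_{c_0^{\mathbb K}(I)} : c_0^{\mathbb K}(I) \to \mathbb K.$$
Prove that $\sigma_\alpha$ is linear and continuous. Prove that the map
$$\Sigma : \ell^1_{\mathbb K}(I) \ni \alpha \longmapsto \sigma_\alpha \in \big(c_0^{\mathbb K}(I)\big)^*$$
is an isometric linear isomorphism of $\mathbb K$-vector spaces.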
Lecture 12

3. Banach spaces
Definition. Let $\mathbb K$ be one of the fields $\mathbb R$ or $\mathbb C$. A Banach space over $\mathbb K$ is a normed $\mathbb K$-vector space $(X, \|\cdot\|)$, which is complete with respect to the metric
$$d(x, y) = \|x - y\|, \quad x, y \in X.$$
Example 3.1. The field $\mathbb K$, equipped with the absolute value norm, is a Banach space. More generally, the vector space $\mathbb K^n$, equipped with any of the norms
$$\|(\lambda_1, \dots, \lambda_n)\|_\infty = \max\{|\lambda_1|, \dots, |\lambda_n|\},$$
$$\|(\lambda_1, \dots, \lambda_n)\|_p = \big(|\lambda_1|^p + \cdots + |\lambda_n|^p\big)^{1/p}, \quad p \ge 1,$$
is a Banach space.
Remark 3.1. Using the facts from the general theory of metric spaces, we know that for a normed vector space $(X, \|\cdot\|)$, the following are equivalent:
(i) $X$ is a Banach space;
(ii) given any sequence $(x_n)_{n\ge1} \subset X$ with $\sum_{n=1}^\infty \|x_n\| < \infty$, the sequence $(y_n)_{n\ge1}$ of partial sums, defined by $y_n = \sum_{k=1}^n x_k$, is convergent;
(iii) every Cauchy sequence in $X$ has a convergent subsequence.
This is pretty obvious, since the sequence of partial sums has the property that
$$d(y_{n+1}, y_n) = \|y_{n+1} - y_n\| = \|x_{n+1}\|, \quad \forall\, n \ge 1.$$
Exercise 1*. Let $X$ be a finite dimensional normed vector space. Prove that $X$ is a Banach space.
Hints: Use induction on $\dim X$. The case $\dim X = 1$ is trivial. Assume the statement is true for all normed vector spaces of dimension $d$, and let us prove it for a normed vector space of dimension $d + 1$. Fix such an $X$, and a linear basis $\{e_1, e_2, \dots, e_d, e_{d+1}\}$ for $X$. Start with a Cauchy sequence $(x_n)_{n\ge1} \subset X$. Write each term as
$$x_n = \sum_{k=1}^{d+1} \alpha_n(k)\, e_k.$$
Prove first that $\big(\alpha_n(d+1)\big)_{n\ge1} \subset \mathbb K$ is bounded. Then extract a subsequence $(x_{n_p})_{p\ge1}$ such that $\big(\alpha_{n_p}(d+1)\big)_{p\ge1}$ is convergent. If we take $\alpha(d+1) = \lim_{p\to\infty} \alpha_{n_p}(d+1)$, then prove that the sequence $\big(x_{n_p} - \alpha_{n_p}(d+1)\, e_{d+1}\big)_{p\ge1}$ is Cauchy in the space $\operatorname{Span}\{e_1, \dots, e_d\}$. Using the inductive hypothesis, conclude that $(x_{n_p})_{p\ge1}$ is convergent in $X$. Thus, every Cauchy sequence in $X$ has a convergent subsequence, hence $X$ is Banach.
Exercise 2*. Let $n \ge 1$ be an integer, and let $\|\cdot\|$ be a norm on $\mathbb K^n$. Prove that there exist constants $C, D > 0$, such that
$$C\|x\|_\infty \le \|x\| \le D\|x\|_\infty, \quad \forall\, x \in \mathbb K^n.$$
Hint: Let $e_1, \dots, e_n$ be the standard basis vectors for $\mathbb K^n$, so that
$$\alpha_1 e_1 + \cdots + \alpha_n e_n = (\alpha_1, \dots, \alpha_n), \quad \forall\, (\alpha_1, \dots, \alpha_n) \in \mathbb K^n.$$
Define $D = \|e_1\| + \cdots + \|e_n\|$. The existence of $C$ is equivalent to the existence of some $C' > 0$ such that
$$\|x\|_\infty \le C'\|x\|, \quad \forall\, x \in \mathbb K^n.$$
(If such a $C'$ exists, then we take $C = 1/C'$.) To prove the existence of $C'$ as above, we consider the set $T = \{x \in \mathbb K^n : \|x\| \le 1\}$, and we need to prove that
$$\sup_{x\in T} \|x\|_\infty < \infty.$$
Argue by contradiction (see also the hint from the preceding exercise).
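The constants in Exercise 2 can be made explicit when the norm is one of the $p$-norms from Example 3.1. The sketch below is only an illustration under that assumption ($n = 5$, $p = 3$, real scalars, hypothetical function names): it computes $D = \|e_1\| + \cdots + \|e_n\|$ as in the hint, uses $C = 1$ (valid for $p$-norms because $\|x\|_\infty \le \|x\|_p$), and tests both inequalities on random vectors.

```python
import random


def p_norm(v, p):
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def sup_norm(v):
    return max(abs(x) for x in v)


# Illustrative choice of dimension and norm (assumptions).
n, p = 5, 3.0

# D = ||e_1|| + ... + ||e_n||, as suggested in the hint.
basis = [[1.0 if j == k else 0.0 for j in range(n)] for k in range(n)]
D = sum(p_norm(e, p) for e in basis)

# For p-norms the lower constant can be taken to be C = 1.
C = 1.0

random.seed(2)
samples = [[random.uniform(-10.0, 10.0) for _ in range(n)] for _ in range(500)]
both_bounds_hold = all(
    C * sup_norm(x) <= p_norm(x, p) + 1e-12
    and p_norm(x, p) <= D * sup_norm(x) + 1e-12
    for x in samples
)
```

For a general norm no such closed form for $C$ is available, which is why the hint resorts to an argument by contradiction.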
Exercise 3. Let X and Y be normed vector spaces. Consider the product X × Y,
equipped with the natural vector space structure.
(i) Prove that k(x, y)k = kxk + kyk, (x, y) ∈ X × Y defines a norm on X × Y.
(ii) Prove that, when equipped with the above norm, X × Y is a Banach space,
if and only if both X and Y are Banach spaces.
There are two key constructions which enable one to construct new Banach spaces out of old ones.
Proposition 3.1. Let $X$ be a normed vector space, and let $Y$ be a Banach space. Then $L(X, Y)$ is a Banach space, when equipped with the operator norm.
Proof. Start with a Cauchy sequence $(T_n)_{n\ge1} \subset L(X, Y)$. This means that for every $\varepsilon > 0$, there exists some $N_\varepsilon$ such that
$$\|T_m - T_n\| < \varepsilon, \quad \forall\, m, n \ge N_\varepsilon. \tag{1}$$
Notice that, if one takes for example $\varepsilon = 1$, and we define
$$C = 1 + \max\{\|T_1\|, \|T_2\|, \dots, \|T_{N_1}\|\},$$
then we clearly have
$$\|T_n\| \le C, \quad \forall\, n \ge 1. \tag{2}$$
Notice that, using (1), we have
$$\|T_m x - T_n x\| \le \varepsilon\|x\|, \quad \forall\, m, n \ge N_\varepsilon,\ x \in X, \tag{3}$$
which proves that
• for every $x \in X$, the sequence $(T_n x)_{n\ge1} \subset Y$ is Cauchy.
Since $Y$ is a Banach space, for each $x \in X$, the sequence $(T_n x)_{n\ge1}$ will be convergent. We define the map $T : X \to Y$ by
$$Tx = \lim_{n\to\infty} T_n x, \quad x \in X.$$
Using (2) we immediately get
$$\|Tx\| \le C\|x\|, \quad \forall\, x \in X.$$
Since $T$ is obviously linear, this proves that $T$ is continuous. Finally, if we fix $n \ge N_\varepsilon$ and we take $\lim_{m\to\infty}$ in (3), we get
$$\|T_n x - Tx\| \le \varepsilon\|x\|, \quad \forall\, n \ge N_\varepsilon,\ x \in X,$$
which proves precisely that we have the inequality
$$\|T_n - T\| \le \varepsilon, \quad \forall\, n \ge N_\varepsilon,$$
hence $(T_n)_{n\ge1}$ is convergent to $T$ in the norm topology. □

Corollary 3.1. If $X$ is a normed vector space, then its topological dual $X^* = L(X, \mathbb K)$ is a Banach space.
Proof. Immediate from the fact that $\mathbb K$ is a Banach space. □
As a direct application of the above result we get
Corollary 3.2. If $I$ is a non-empty set, and $p \in [1, \infty]$, then $\ell^p_{\mathbb K}(I)$ is a Banach space.
Proof. For $p = 1$ we know that $\ell^1 \simeq (c_0)^*$. For $p \in (1, \infty]$, we know that $\ell^p \simeq (\ell^q)^*$, where $q$ is Hölder conjugate to $p$. □


Proposition 3.2. Let $X$ be a Banach space, and let $Z \subset X$ be a linear subspace. The following are equivalent:
(i) $Z$ is a Banach space, when equipped with the norm from $X$;
(ii) $Z$ is closed in $X$, in the norm topology.
Proof. This is a particular case of a general result from the theory of complete metric spaces. □
Corollary 3.3. Let $I$ be a non-empty set, and let $\mathbb K$ be one of the fields $\mathbb R$ or $\mathbb C$. Then $c_0^{\mathbb K}(I)$ is a Banach space.
Proof. Use the fact that $c_0^{\mathbb K}(I)$ is closed in $\ell^\infty_{\mathbb K}(I)$. □

Exercise 4*. Let $X$ be an infinite dimensional Banach space, and let $B$ be a linear basis for $X$. Prove that $B$ is uncountable.
Hint: If $B$ is countable, say $B = \{b_n : n \in \mathbb N\}$, then
$$X = \bigcup_{n=1}^\infty F_n,$$
where $F_n = \operatorname{Span}\{b_1, b_2, \dots, b_n\}$. Since the $F_n$'s are finite dimensional linear subspaces, they will be closed. Use Baire's Theorem to get a contradiction.
Comments. A third method of constructing Banach spaces is the completion. If we start with a normed $\mathbb K$-vector space $X$, when we regard $X$ as a metric space, its completion $\tilde X$ is constructed as follows. One defines
$$\mathrm{cs}(X) = \{\mathbf x = (x_n)_{n\ge1} : (x_n)_{n\ge1} \text{ Cauchy sequence in } X\}.$$
Two Cauchy sequences $\mathbf x = (x_n)_{n\ge1}$ and $\mathbf x' = (x'_n)_{n\ge1}$ are said to be equivalent, if $\lim_{n\to\infty} \|x_n - x'_n\| = 0$. In this case one writes $\mathbf x \sim \mathbf x'$. The completion $\tilde X$ is then defined as the space
$$\tilde X = \mathrm{cs}(X)/\!\sim$$
of equivalence classes. For $\mathbf x \in \mathrm{cs}(X)$, one denotes by $\tilde{\mathbf x}$ its equivalence class in $\tilde X$. Finally for an element $x \in X$ one denotes by $\langle x \rangle \in \tilde X$ the equivalence class of the constant sequence $x$.
We know from general theory that $\tilde X$ is a complete metric space, with the distance $\tilde d$ (correctly) defined by
$$\tilde d(\tilde{\mathbf x}, \tilde{\mathbf x}') = \lim_{n\to\infty} \|x_n - x'_n\|,$$
for any two Cauchy sequences $\mathbf x = (x_n)_{n\ge1}$ and $\mathbf x' = (x'_n)_{n\ge1}$.

It turns out that, in our situation, the space $\mathrm{cs}(X)$ carries a natural vector space structure, defined by pointwise addition and scalar multiplication. Moreover, the space $\tilde X$ is identified as a quotient vector space
$$\tilde X = \mathrm{cs}(X)/\mathrm{ns}(X),$$
where
$$\mathrm{ns}(X) = \{\mathbf x = (x_n)_{n\ge1} : (x_n)_{n\ge1} \text{ sequence in } X \text{ with } \lim_{n\to\infty} x_n = 0\}$$
is the linear subspace of null sequences. It then follows that $\tilde X$ carries a natural vector space structure. More explicitly, if we start with a scalar $\lambda \in \mathbb K$, and with two elements $p, q \in \tilde X$, which are represented as $p = \tilde{\mathbf x}$ and $q = \tilde{\mathbf y}$, for two Cauchy sequences $\mathbf x = (x_n)_{n\ge1}$ and $\mathbf y = (y_n)_{n\ge1}$ in $X$, then the sequence
$$\mathbf w = (\lambda x_n + y_n)_{n\ge1}$$
is Cauchy in $X$, and the element $\lambda p + q \in \tilde X$ is then defined as $\lambda p + q = \tilde{\mathbf w}$.
Finally, there is a natural norm on $\tilde X$, (correctly) defined by
$$\|\tilde{\mathbf x}\| = \tilde d(\tilde{\mathbf x}, \langle 0 \rangle) = \lim_{n\to\infty} \|x_n\|,$$
for all Cauchy sequences $\mathbf x = (x_n)_{n\ge1}$. These considerations then prove that $\tilde X$ is a Banach space, and the map
$$X \ni x \longmapsto \langle x \rangle \in \tilde X$$
is linear and isometric, in the sense that
$$\|\langle x \rangle\| = \|x\|, \quad \forall\, x \in X.$$
In the context of normed vector spaces, the universality property of the completion is stated as follows:
Proposition 3.3. Let $X$ be a normed vector space, let $\tilde X$ denote its completion, and let $Y$ be a Banach space. For every linear continuous map $T : X \to Y$, there exists a unique linear continuous map $\tilde T : \tilde X \to Y$, such that
$$\tilde T\langle x \rangle = Tx, \quad \forall\, x \in X.$$
Moreover the map
$$L(X, Y) \ni T \longmapsto \tilde T \in L(\tilde X, Y)$$
is an isometric linear isomorphism.
Proof. If $T : X \to Y$ is linear and continuous, then $T$ is a Lipschitz map with Lipschitz constant $\|T\|$, because
$$\|Tx - Tx'\| \le \|T\| \cdot \|x - x'\|, \quad \forall\, x, x' \in X.$$
We know, from the theory of metric spaces, that there exists a unique continuous map $\tilde T : \tilde X \to Y$, such that
$$\tilde T\langle x \rangle = Tx, \quad \forall\, x \in X.$$
We also know that $\tilde T$ is Lipschitz, with Lipschitz constant $\|T\|$. The only thing we need to prove is the fact that $\tilde T$ is linear. Start with two points $p, q \in \tilde X$, represented as $p = \tilde{\mathbf x}$ and $q = \tilde{\mathbf z}$, for some Cauchy sequences $\mathbf x = (x_n)_{n\ge1}$ and $\mathbf z = (z_n)_{n\ge1}$ in $X$. If $\lambda \in \mathbb K$, then $\lambda p + q = \tilde{\mathbf w}$, where $\mathbf w = (\lambda x_n + z_n)_{n\ge1}$. We then have
$$\tilde T(\lambda p + q) = \lim_{n\to\infty} T(\lambda x_n + z_n) = \lambda \cdot \Big(\lim_{n\to\infty} Tx_n\Big) + \Big(\lim_{n\to\infty} Tz_n\Big) = \lambda \tilde T p + \tilde T q.$$

Let us prove now that $\|\tilde T\| = \|T\|$. Since $\tilde T$ is Lipschitz, with Lipschitz constant $\|T\|$, we will have $\|\tilde T\| \le \|T\|$. To prove the other inequality, let us consider the sets
$$B_0 = \{p \in \tilde X : \|p\| \le 1\}, \quad B_1 = \{\langle x \rangle : x \in X,\ \|x\| \le 1\}.$$
By definition, we have
$$\|\tilde T\| = \sup_{p\in B_0} \|\tilde T p\|.$$
Since we clearly have $B_0 \supset B_1$, we get
$$\|\tilde T\| = \sup_{p\in B_0} \|\tilde T p\| \ge \sup_{p\in B_1} \|\tilde T p\| = \sup\{\|\tilde T\langle x \rangle\| : x \in X,\ \|x\| \le 1\} = \sup\{\|Tx\| : x \in X,\ \|x\| \le 1\} = \|T\|.$$
The fact that the map $L(X, Y) \ni T \longmapsto \tilde T \in L(\tilde X, Y)$ is linear is obvious. To prove the surjectivity, start with some $S \in L(\tilde X, Y)$. Consider the map
$$\iota : X \ni x \longmapsto \langle x \rangle \in \tilde X.$$
Since $\iota$ is linear and isometric, in particular it is continuous, so the composition $T = S \circ \iota$ is linear and continuous. Notice that
$$S\langle x \rangle = S\big(\iota(x)\big) = (S \circ \iota)x = Tx, \quad \forall\, x \in X,$$
so by uniqueness we have $S = \tilde T$. □
Corollary 3.4. Let $X$ be a normed space, let $Y$ be a Banach space, and let $T : X \to Y$ be an isometric linear map.
(i) Let $\tilde T : \tilde X \to Y$ be the linear continuous map defined in the previous result. Then $\tilde T$ is linear, isometric, and $\tilde T(\tilde X) = \overline{T(X)}$.
(ii) $X$ is complete, if and only if $T(X)$ is closed in $Y$.
Proof. (i). The fact that $\tilde T$ is isometric, and has the range equal to $\overline{T(X)}$, is true in general (i.e. for $X$ a metric space, and $Y$ a complete metric space). The linearity follows from the previous result.
(ii). This is obvious. □
Example 3.2. Let $X$ be a normed vector space. For every $x \in X$ define the map $\epsilon_x : X^* \to \mathbb K$ by
$$\epsilon_x(\phi) = \phi(x), \quad \forall\, \phi \in X^*.$$
Then $\epsilon_x$ is linear and continuous. This is an immediate consequence of the inequality
$$|\epsilon_x(\phi)| = |\phi(x)| \le \|x\| \cdot \|\phi\|, \quad \forall\, \phi \in X^*.$$
Notice that this also proves
$$\|\epsilon_x\| \le \|x\|, \quad \forall\, x \in X.$$
Interestingly enough, we actually have
$$\|\epsilon_x\| = \|x\|, \quad \forall\, x \in X. \tag{4}$$
To prove this fact, we start with an arbitrary $x \in X$, and we consider the linear subspace
$$Y = \mathbb K x = \{\lambda x : \lambda \in \mathbb K\}.$$
If we define $\phi_0 : Y \to \mathbb K$, by
$$\phi_0(\lambda x) = \lambda\|x\|, \quad \forall\, \lambda \in \mathbb K,$$
then it is clear that $\phi_0(x) = \|x\|$, and
$$|\phi_0(y)| \le \|y\|, \quad \forall\, y \in Y.$$
Use then the Hahn-Banach Theorem to find $\phi : X \to \mathbb K$ such that $\phi\big|_Y = \phi_0$, and
$$|\phi(z)| \le \|z\|, \quad \forall\, z \in X.$$
This will clearly imply $\|\phi\| \le 1$, while the first condition will give $\phi(x) = \phi_0(x) = \|x\|$. In particular, we will have
$$\|x\| = |\phi(x)| = |\epsilon_x(\phi)| \le \|\epsilon_x\| \cdot \|\phi\| \le \|\epsilon_x\|.$$
Having proven (4), we now have a linear isometric map
$$E : X \ni x \longmapsto \epsilon_x \in X^{**}.$$
Since $X^{**}$ is a Banach space, we now see that $\tilde E : \tilde X \to \overline{E(X)}$ is an isometric linear isomorphism. In particular, $X$ is Banach, if and only if $E(X)$ is closed in $X^{**}$.
We conclude with a series of results, which are often regarded as the “principles of Banach space theory.” These results are consequences of Baire's Theorem.
Theorem 3.1 (Uniform Boundedness Principle). Let $X$ be a Banach space, let $Y$ be a normed vector space, and let $\mathcal M \subset L(X, Y)$. The following are equivalent:
(i) $\sup\{\|T\| : T \in \mathcal M\} < \infty$;
(ii) $\sup\{\|Tx\| : T \in \mathcal M\} < \infty$, $\forall\, x \in X$.
Proof. The implication (i) ⇒ (ii) is trivial, because if we define
$$M = \sup\{\|T\| : T \in \mathcal M\},$$
then by the definition of the norm, we clearly have
$$\sup\{\|Tx\| : T \in \mathcal M\} \le M\|x\|, \quad \forall\, x \in X.$$
(ii) ⇒ (i). Assume $\mathcal M$ satisfies condition (ii). For each integer $n \ge 1$, let us define the set
$$F_n = \{x \in X : \|Tx\| \le n,\ \forall\, T \in \mathcal M\}.$$
It is obvious that $F_n$ is a closed subset of $X$, for each $n \ge 1$. Moreover, by (ii) we clearly have $\bigcup_{n=1}^\infty F_n = X$. Using Baire's Theorem, there exists some $n \ge 1$, such that $\operatorname{Int}(F_n) \ne \emptyset$. This means that there exist some $x_0 \in X$ and some $r > 0$, such that
$$F_n \supset \bar B_r(x_0) = \{y \in X : \|y - x_0\| \le r\}.$$
Put $M_0 = \sup\{\|Tx_0\| : T \in \mathcal M\}$. Fix for the moment some arbitrary $x \in X$, with $\|x\| \le 1$, and some arbitrary element $T \in \mathcal M$. The vector $y = x_0 + rx$ clearly belongs to $\bar B_r(x_0)$, so we have $\|Ty\| \le n$. We then get
$$\|Tx\| = \Big\|T\Big(\tfrac1r (y - x_0)\Big)\Big\| = \tfrac1r \|Ty - Tx_0\| \le \tfrac1r \big(\|Ty\| + \|Tx_0\|\big) \le \tfrac1r (n + M_0).$$
Keep $T$ fixed, and use the above estimate, which gives
$$\sup\{\|Tx\| : x \in X,\ \|x\| \le 1\} \le \frac{n + M_0}{r},$$
to conclude that $\|T\| \le \frac{n + M_0}{r}$. Since $T \in \mathcal M$ is arbitrary, we finally get
$$\sup\{\|T\| : T \in \mathcal M\} \le \frac{n + M_0}{r} < \infty. \quad □$$
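The completeness of $X$ cannot be dropped in Theorem 3.1. The following sketch illustrates this with an example that is not part of the notes: it assumes the (incomplete) space of finitely supported real sequences on $\{1, 2, \dots\}$ with the sup norm, stored as dictionaries, and the hypothetical family $T_n(\beta) = n\,\beta(n)$. For each fixed $\beta$ only finitely many $T_n\beta$ are non-zero, so condition (ii) holds, yet $\|T_n\| \ge |T_n\delta^n| = n$ is unbounded.

```python
def T(n, beta):
    # Hypothetical functional T_n(beta) = n * beta(n) on finitely supported
    # sequences stored as {index: value}.
    return n * beta.get(n, 0.0)


def pointwise_sup(beta, n_max):
    # max over 1 <= n <= n_max of |T_n(beta)|.
    return max(abs(T(n, beta)) for n in range(1, n_max + 1))


# A fixed finitely supported sequence (an assumption for illustration).
beta = {1: 0.5, 4: -2.0, 7: 1.0}

# The pointwise supremum stabilizes once n passes the support of beta.
sup_small = pointwise_sup(beta, 10)
sup_large = pointwise_sup(beta, 10000)

# Lower bounds for the operator norms, obtained from the unit vectors delta^n.
norm_lower_bounds = [abs(T(n, {n: 1.0})) for n in range(1, 101)]
```

So the family is pointwise bounded but not uniformly bounded, which is consistent with Theorem 3.1 only because this space is not a Banach space.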

Theorem 3.2 (Inverse Mapping Theorem). Let $X$ and $Y$ be Banach spaces, and let $T : X \to Y$ be a bijective linear continuous map. Then the linear map $T^{-1} : Y \to X$ is also continuous.
Proof. Let us denote by $A$ the open unit ball in $X$ centered at the origin, i.e.
$$A = \{x \in X : \|x\| < 1\}.$$
The first step in the proof is contained in the following.
Claim 1: The closure $\overline{T(A)}$ is a neighborhood of $0$ in $Y$.
Consider the sequence of closed sets $\big(k\,\overline{T(A)}\big)_{k=1}^\infty$. (Here we use the notation $kM = \{kv : v \in M\}$.) Since the map $v \longmapsto kv$ is a homeomorphism, one has the equalities
$$k\,\overline{T(A)} = \overline{kT(A)} = \overline{T(kA)}, \quad \forall\, k \ge 1.$$
In particular, we have
$$\bigcup_{k=1}^\infty k\,\overline{T(A)} = \bigcup_{k=1}^\infty \overline{T(kA)} \supset \bigcup_{k=1}^\infty T(kA) = T\Big(\bigcup_{k=1}^\infty [kA]\Big).$$
Since we obviously have $\bigcup_{k=1}^\infty [kA] = X$, and $T$ is surjective, the above computation shows that $\bigcup_{k=1}^\infty k\,\overline{T(A)} = Y$. Using Baire's Theorem, there exists some $k \ge 1$, such that $\operatorname{Int}\big(k\,\overline{T(A)}\big) \ne \emptyset$. Again using the fact that $v \longmapsto kv$ is a homeomorphism, this gives $\operatorname{Int}\big(\overline{T(A)}\big) \ne \emptyset$. Fix now some point $y \in \operatorname{Int}\big(\overline{T(A)}\big)$, and some $r > 0$, such that $\overline{T(A)}$ contains the open ball
$$B_r(y) = \{z \in Y : \|z - y\| < r\}. \tag{5}$$
The proof of the Claim is then finished, once we prove the inclusion
$$\overline{T(A)} \supset B_{r/2}(0).$$
To prove this inclusion, start with some arbitrary $v \in B_{r/2}(0)$, i.e. $v \in Y$ and $\|v\| < \frac r2$. Since $\|(2v + y) - y\| = 2\|v\| < r$, using (5) it follows that $2v + y \in \overline{T(A)}$, i.e. there exists a sequence $(x_n)_{n=1}^\infty \subset X$ with $\|x_n\| < 1$, $\forall\, n \ge 1$, and $2v + y = \lim_{n\to\infty} Tx_n$. Since $y$ itself belongs to $\overline{T(A)}$, there also exists some sequence $(z_n)_{n=1}^\infty \subset X$, with $\|z_n\| < 1$, $\forall\, n \ge 1$, and $y = \lim_{n\to\infty} Tz_n$. On the one hand, if we consider the sequence $(u_n)_{n=1}^\infty \subset X$ given by $u_n = \frac12 (x_n - z_n)$, then it is clear that
$$\|u_n\| \le \tfrac12 \big(\|x_n\| + \|z_n\|\big) < 1, \quad \forall\, n \ge 1,$$
i.e. $(u_n)_{n=1}^\infty \subset A$. On the other hand, we have
$$\lim_{n\to\infty} Tu_n = \lim_{n\to\infty} \tfrac12 \big(Tx_n - Tz_n\big) = \tfrac12 (2v + y - y) = v,$$
so $v$ indeed belongs to $\overline{T(A)}$.


The next step is a slight (but crucial) improvement of Claim 1.
Claim 2: $T(A)$ is a neighborhood of $0$.
Start off by choosing $\varepsilon > 0$, such that
$$\overline{T(A)} \supset B_\varepsilon(0). \tag{6}$$
The Claim will follow, once we prove the inclusion
$$T(A) \supset B_{\varepsilon/2}(0). \tag{7}$$
To prove this inclusion, we start with some arbitrary $y \in B_{\varepsilon/2}(0)$. We want to construct a sequence of vectors $(x_n)_{n=1}^\infty \subset A$, such that, for every $n \ge 1$, we have the inequality
$$\Big\|y - T\Big(\sum_{k=1}^n \tfrac{1}{2^k} x_k\Big)\Big\| \le \frac{\varepsilon}{2^{n+1}}. \tag{8}$$
This sequence is constructed inductively as follows. Since $2y \in B_\varepsilon(0)$, we start by using (6), and we pick $x_1 \in A$ such that $\|2y - Tx_1\| < \frac\varepsilon2$. Once $x_1, \dots, x_p$ are constructed, such that (8) holds with $n = p$, we consider the vector
$$z = 2^{p+1}\Big[y - T\Big(\sum_{k=1}^p \tfrac{1}{2^k} x_k\Big)\Big],$$
which has $\|z\| \le \varepsilon$, so $z \in \overline{T(A)}$ by (6), and we can find $x_{p+1} \in A$, such that $\|z - Tx_{p+1}\| \le \frac\varepsilon2$. We then clearly have
$$\Big\|y - T\Big(\sum_{k=1}^{p+1} \tfrac{1}{2^k} x_k\Big)\Big\| = \frac{\|z - Tx_{p+1}\|}{2^{p+1}} \le \frac{\varepsilon}{2^{p+2}}.$$
Consider now the series $\sum_{k=1}^\infty \frac{1}{2^k} x_k$. Since $\|x_k\| < 1$, $\forall\, k \ge 1$, and $X$ is a Banach space, by Remark 3.1, the sequence $(w_n)_{n=1}^\infty \subset X$ of partial sums
$$w_n = \sum_{k=1}^n \tfrac{1}{2^k} x_k, \quad n \ge 1,$$
is convergent to some point $x \in X$. Moreover, since we have
$$\|w_n\| \le \sum_{k=1}^n \frac{\|x_k\|}{2^k} \le \sum_{k=1}^\infty \frac{\|x_k\|}{2^k}, \quad \forall\, n \ge 1,$$
we get the inequality
$$\|x\| \le \sum_{k=1}^\infty \frac{\|x_k\|}{2^k} < 1,$$
which means that $x \in A$. Note also that using these partial sums, the inequality (8) reads
$$\|y - Tw_n\| \le \frac{\varepsilon}{2^{n+1}}, \quad \forall\, n \ge 1,$$
so by the continuity of $T$, we have $y = Tx \in T(A)$.
Let us show now that $T^{-1}$ is continuous. Use Claim 2, to find some $r > 0$ such that
$$T(A) \supset B_r(0), \tag{9}$$
and let $y \in Y$ be an arbitrary vector with $\|y\| \le 1$. Consider the vector $v = \frac r2 y$, which has $\|v\| \le \frac r2 < r$. By (9), there exists $x \in A$, such that $Tx = v$, which means that $T^{-1} y = \frac2r x$. This forces $\|T^{-1} y\| \le \frac2r$. This argument shows that
$$\sup\{\|T^{-1} y\| : y \in Y,\ \|y\| \le 1\} \le \frac2r < \infty,$$
and the continuity of $T^{-1}$ follows from Proposition 2.4. □
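The inductive construction in Claim 2 turns a coarse approximate solver into an exact preimage. The following toy sketch illustrates the mechanism in the simplest setting, which is an assumption made for illustration only: $X = Y = \mathbb R$, $Tx = 2x$, so that $T(A) = (-2, 2)$ and one may take $\varepsilon = 2$. The hypothetical function `coarse_preimage` plays the role of the choice of $x_{p+1} \in A$ with $\|z - Tx_{p+1}\| \le \varepsilon/2$; exact rational arithmetic is used so that the bound (8) can be checked without rounding issues.

```python
from fractions import Fraction
import math

EPS = Fraction(2)  # T(A) = (-2, 2) contains B_EPS(0) for T x = 2x


def T(x):
    # Toy bijective linear map on the real line (an assumption).
    return 2 * x


def coarse_preimage(z):
    # For |z| < 2, returns x with |x| < 1 and |z - T(x)| < 1 = EPS/2:
    # x is z/2 truncated toward zero to a multiple of 1/2.
    return Fraction(math.trunc(z), 2)


def build_preimage(y, steps):
    # Follows Claim 2: at step n, z = 2^n * (y - T(w_{n-1})), and then
    # w_n = w_{n-1} + x_n / 2^n with x_n = coarse_preimage(z).
    w = Fraction(0)
    residuals = []
    for n in range(1, steps + 1):
        z = 2**n * (y - T(w))
        x_n = coarse_preimage(z)
        assert abs(x_n) < 1  # x_n lies in the open unit ball A
        w += x_n / 2**n
        residuals.append(abs(y - T(w)))
    return w, residuals


y = Fraction(1, 3)  # a point of B_{EPS/2}(0) = (-1, 1)
w, residuals = build_preimage(y, 20)
```

Each entry `residuals[n-1]` is bounded by $\varepsilon / 2^{n+1}$, exactly as in (8), and the partial sums `w` stay inside the open unit ball while $T(w)$ converges to $y$.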
The following two exercises deal with two more “principles of Banach space theory.”
Exercise 5 ♦. (Closed Graph Theorem). Let $X$ and $Y$ be Banach spaces, and let $T : X \to Y$ be a linear map. Prove that the following are equivalent:
(i) $T$ is continuous.
(ii) The graph of $T$
$$G_T = \{(x, Tx) : x \in X\}$$
is a closed subset of $X \times Y$, in the product topology.
Hint: For the implication (ii) ⇒ (i), use Exercise 3, to get the fact that $G_T$ is a Banach space. Then $T$ is exactly the composition of the inverse of $\pi_X\big|_{G_T}$ with the projection $\pi_Y$, where $\pi_X : X \times Y \to X$ and $\pi_Y : X \times Y \to Y$ are the projections onto the first and second coordinates. Use Theorem 3.2.
Exercise 6 ♦. (Open Mapping Theorem). Let $X$ and $Y$ be Banach spaces, and let $T : X \to Y$ be a surjective linear continuous map. Prove that $T$ is an open map, in the sense that
• whenever $D \subset X$ is open, it follows that $T(D)$ is open in $Y$.
Hint: Consider the linear map
$$S : X \times Y \ni (x, y) \longmapsto (x, Tx + y) \in X \times Y.$$
Prove that $S$ is linear, continuous, bijective, hence by Theorem 3.2, it is a homeomorphism. Use this fact to prove that for every open set $D \subset X$, there exists some open set $E \subset X \times Y$, such that $T(D) = \pi_Y(E)$, where $\pi_Y : X \times Y \to Y$ is the projection onto the second coordinate. This reduces the problem to proving the fact that $\pi_Y$ is an open map.
Lecture 13

4. The weak dual topology


In this section we examine the topological duals of normed vector spaces. Be-
sides the norm topology, there is another natural topology which is constructed as
follows.
Definition. Let X be a normed vector space over K (= R, C). For every x ∈ X, let ϵx : X∗ → K be the linear map defined by
ϵx (φ) = φ(x), ∀ φ ∈ X∗ .
We equip the vector space X∗ with the weak topology defined by the family Ξ = (ϵx )x∈X . This topology is called the weak dual topology, and is denoted by w∗ .
Recall (see Section 3) that this topology is characterized by the following property:
(w∗ ) Given a topological space T , a map f : T → X∗ is continuous with respect to the w∗ topology, if and only if ϵx ◦ f : T → K is continuous, for each x ∈ X.
Remark that all the maps ϵx : X∗ → K, x ∈ X, are already continuous with respect to the norm topology, since |ϵx (φ)| ≤ ‖φ‖ · ‖x‖. This gives the fact that
• the w∗ topology on X∗ is weaker than the norm topology.
Remark 4.1. The w∗ topology is Hausdorff. Indeed, if φ, ψ ∈ X∗ are such that φ ≠ ψ, then there exists some x ∈ X such that
ϵx (φ) = φ(x) ≠ ψ(x) = ϵx (ψ).
Proposition 4.1. Let X be a normed vector space over K. For every ε > 0, φ ∈ X∗ , and x ∈ X, define the set
W (φ; ε, x) = {ψ ∈ X∗ : |ψ(x) − φ(x)| < ε}.
Then the collection
W = {W (φ; ε, x) : ε > 0, φ ∈ X∗ , x ∈ X}
is a subbase for the w∗ topology. More precisely, given φ ∈ X∗ , a set N ⊂ X∗ is a neighborhood of φ with respect to the w∗ topology, if and only if there exist ε > 0 and x1 , . . . , xn ∈ X, such that
N ⊃ W (φ; ε, x1 ) ∩ · · · ∩ W (φ; ε, xn ).
Proof. It is clearly sufficient to prove the second assertion, because it would
imply the fact that any w∗ open set is a union of finite intersections of sets in W.
If we define the collection
\[ S = \{ \epsilon_x^{-1}(D) : x \in X,\ D \subset K \text{ open} \}, \]
then we know that S is a subbase for the w∗ topology.



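Fix φ ∈ X∗ . Start with some w∗ neighborhood N of φ, so there exists some w∗ open set E with φ ∈ E ⊂ N . Using the fact that S is a subbase for the w∗ topology, there exist open sets D1 , . . . , Dn ⊂ K, and points x1 , . . . , xn ∈ X, such that
\[ \phi \in \bigcap_{k=1}^{n} \epsilon_{x_k}^{-1}(D_k) \subset E. \]
Fix for the moment k ∈ {1, . . . , n}. The fact that φ belongs to \( \epsilon_{x_k}^{-1}(D_k) \) means that φ(xk ) ∈ Dk . Since Dk is open in K, there exists some εk > 0, such that
\[ D_k \supset B_{\varepsilon_k}\bigl(\phi(x_k)\bigr). \]
Then if we have an arbitrary ψ ∈ W (φ; εk , xk ), we will have
|ψ(xk ) − φ(xk )| < εk ,
which gives ψ(xk ) ∈ Dk . This proves that
\[ W(\phi; \varepsilon_k, x_k) \subset \epsilon_{x_k}^{-1}(D_k). \]
Notice that, if one takes ε = min{ε1 , . . . , εn }, then we clearly have the inclusions
\[ W(\phi; \varepsilon, x_k) \subset W(\phi; \varepsilon_k, x_k) \subset \epsilon_{x_k}^{-1}(D_k). \]
We then immediately get
\[ \bigcap_{k=1}^{n} W(\phi; \varepsilon, x_k) \subset \bigcap_{k=1}^{n} \epsilon_{x_k}^{-1}(D_k) \subset E \subset N. \]
Conversely, each set W (φ; ε, x) is the preimage, under the map ψ ⟼ ψ(x), of the open ball of radius ε centered at φ(x), so it is w∗ open and contains φ. Therefore any set N that contains a finite intersection W (φ; ε, x1 ) ∩ · · · ∩ W (φ; ε, xn ) is a w∗ neighborhood of φ, and we are done.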
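To make the shape of a basic w∗ neighborhood concrete, here is a minimal sketch in the simplest possible setting. It assumes X = R^3, with functionals stored as coefficient tuples (so φ(x) is a dot product); the helper names `pair` and `in_basic_neighborhood` are hypothetical. In finite dimensions the w∗ and norm topologies coincide, so the example only illustrates that a single set W (φ; ε, x1 ) ∩ · · · ∩ W (φ; ε, xn ) constrains finitely many directions and is unbounded in the remaining ones.

```python
def pair(functional, x):
    """Evaluate a functional on R^d, stored as a coefficient tuple, at the point x."""
    return sum(c * xi for c, xi in zip(functional, x))


def in_basic_neighborhood(psi, phi, eps, xs):
    """True when psi lies in W(phi; eps, x_1) ∩ ... ∩ W(phi; eps, x_n)."""
    return all(abs(pair(psi, x) - pair(phi, x)) < eps for x in xs)


phi = (1.0, 2.0, 3.0)
xs = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]  # only two test vectors
eps = 0.1

far = (1.0, 2.0, 3.0 + 1e6)  # differs from phi only in a direction no x_k detects
near = (1.2, 2.0, 3.0)       # differs from phi in the direction tested by x_1

print(in_basic_neighborhood(far, phi, eps, xs))
print(in_basic_neighborhood(near, phi, eps, xs))
```

The functional `far` is at norm distance 10^6 from `phi` yet stays inside the neighborhood, while `near` is much closer in norm but falls outside it.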
Corollary 4.1. Let X be a normed vector space. Then the w∗ topology on X∗
is locally convex, i.e.
• for every φ ∈ X∗ and every w∗ -neighborhood N of φ, there exists a convex
w∗ -open set D such that φ ∈ D ⊂ N .
Proof. Apply the second part of the proposition, together with the obvious
fact that each of the sets W (φ; ε, x) is convex and w∗ -open. 
Proposition 4.2. Let X be a normed vector space. When equipped with the
w∗ topology, the space X∗ is a topological vector space. This means that the maps
X∗ × X∗ ∋ (φ, ψ) ⟼ φ + ψ ∈ X∗
K × X∗ ∋ (λ, φ) ⟼ λφ ∈ X∗
are continuous with respect to the w∗ topology on the target space, and the w∗
product topology on the domain.
Proof. According to the definition of the w∗ topology, it suffices to prove that, for every x ∈ X, the maps
σx : X∗ × X∗ ∋ (φ, ψ) ⟼ ϵx (φ + ψ) ∈ K
γx : K × X∗ ∋ (λ, φ) ⟼ ϵx (λφ) ∈ K
are continuous. But the continuity of σx and γx is obvious, since we have
σx (φ, ψ) = φ(x) + ψ(x) = ϵx (φ) + ϵx (ψ), ∀ (φ, ψ) ∈ X∗ × X∗ ;
γx (λ, φ) = λφ(x) = λϵx (φ), ∀ (λ, φ) ∈ K × X∗ .


Our next goal will be to describe the linear maps X∗ → K, which are continuous
in the w∗ topology.
Proposition 4.3. Let X be a normed vector space over K. For a linear map
ω : X∗ → K, the following are equivalent:
(i) ω is continuous with respect to the w∗ topology;
(ii) there exists some x ∈ X, such that
ω(φ) = φ(x), ∀ φ ∈ X∗ .
Proof. The implication (ii) ⇒ (i) is trivial, since condition (ii) gives ω = ϵx .
(i) ⇒ (ii). Suppose ω is continuous. In particular, ω is continuous at 0, so if
we take the set
D = {λ ∈ K : |λ| < 1},
the set
ω −1 (D) = {φ ∈ X∗ : |ω(φ)| < 1}
is an open neighborhood of 0 in the w∗ topology. By Proposition 4.1 there exist
x1 , . . . , xn ∈ X, and ε > 0, such that
(1) W (0; ε, x1 ) ∩ · · · ∩ W (0; ε, xn ) ⊂ ω −1 (D).
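Claim 1: One has the inequality
\[ |\omega(\phi)| \le \varepsilon^{-1} \cdot \max\{ |\phi(x_1)|, \dots, |\phi(x_n)| \}, \quad \forall\, \phi \in X^*. \]
Fix an arbitrary φ ∈ X∗ , and put M = max{|φ(x1 )|, . . . , |φ(xn )|}. For every integer k ≥ 1, define
\[ \phi_k = \varepsilon \bigl( M + \tfrac{1}{k} \bigr)^{-1} \phi, \]
so that
\[ |\phi_k(x_j)| = \varepsilon \bigl( M + \tfrac{1}{k} \bigr)^{-1} |\phi(x_j)| \le \varepsilon M \bigl( M + \tfrac{1}{k} \bigr)^{-1} < \varepsilon, \quad \forall\, k \ge 1,\ j \in \{1, \dots, n\}. \]
This proves that φk ∈ W (0; ε, xj ), for all k ≥ 1, and all j ∈ {1, . . . , n}. By (1) this will give
|ω(φk )| < 1, ∀ k ≥ 1,
which reads
\[ \varepsilon \bigl( M + \tfrac{1}{k} \bigr)^{-1} |\omega(\phi)| < 1, \quad \forall\, k \ge 1. \]
This gives
\[ |\omega(\phi)| \le \varepsilon^{-1} \bigl( M + \tfrac{1}{k} \bigr), \quad \forall\, k \ge 1, \]
and it will obviously force
\[ |\omega(\phi)| \le \varepsilon^{-1} M. \]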
Having proven the Claim, we now define the linear map T : X∗ → Kn , by
T φ = (φ(x1 ), . . . , φ(xn )), ∀ φ ∈ X∗ .
Claim 2: There exists a linear map σ : Kn → K, such that ω = σ ◦ T .


First we show that we have the inclusion
Ker ω ⊃ Ker T.
If we start with φ ∈ Ker T , then φ(x1 ) = · · · = φ(xn ) = 0, and then by Claim 1 we immediately get ω(φ) = 0, so φ indeed belongs to Ker ω. We now use a bit of linear algebra. On the one hand, since ω|Ker T = 0, there exists a linear map ω̂ : X∗ /Ker T → K, such that ω = ω̂ ◦ π, where π : X∗ → X∗ /Ker T denotes the quotient map. On the other hand, by the Isomorphism Theorem for linear maps, there exists a linear isomorphism T̂ : X∗ /Ker T → Ran T , such that T̂ ◦ π = T .
We then define
σ0 = ω̂ ◦ T̂ −1 : Ran T → K,
and we will have
σ0 ◦ T = (ω̂ ◦ T̂ −1 ) ◦ (T̂ ◦ π) = ω̂ ◦ π = ω.
We finally extend⁵ σ0 : Ran T → K to a linear map σ : Kn → K.
Having proven Claim 2, we choose scalars α1 , . . . , αn ∈ K, such that
σ(λ1 , . . . , λn ) = α1 λ1 + · · · + αn λn , ∀ (λ1 , . . . , λn ) ∈ Kn .
We now have
ω(φ) = σ(T φ) = σ(φ(x1 ), . . . , φ(xn )) = α1 φ(x1 ) + · · · + αn φ(xn ), ∀ φ ∈ X∗ ,
so if we define x = α1 x1 + · · · + αn xn , we clearly have
ω(φ) = φ(x), ∀ φ ∈ X∗ .
(ii) ⇒ (i). This implication is trivial. 
Corollary 4.2. Let X be a normed vector space, let C ⊂ X∗ be a convex set, and let \( \phi \in X^* \smallsetminus \overline{C}^{\,w^*} \). (Here \( \overline{C}^{\,w^*} \) denotes the w∗ -closure of C.) Then there exists an element x ∈ X, and a real number α, such that
Re φ(x) < α ≤ Re ψ(x), ∀ ψ ∈ C.
Proof. Since the w∗ topology on X∗ is locally convex, there exists a convex w∗ -open set A ⊂ X∗ , such that \( \phi \in A \subset X^* \smallsetminus \overline{C}^{\,w^*} \). In particular, we have A ∩ C = ∅.
Apply the Hahn-Banach separation theorem to find a linear map ω : X∗ → K, which
is w∗ -continuous, and a real number α, such that
Re ω(ρ) < α ≤ Re ω(ψ), ∀ ρ ∈ A, ψ ∈ C.
We then apply the above Proposition. 
Comments. The definition of the w∗ topology can be used in a more general
setting, when X is just a topological vector space. The above results are still valid
in this general setting.
In general the unit ball
(X∗ )1 = {φ ∈ X∗ : ‖φ‖ ≤ 1},
although bounded and closed, is not compact in the norm topology. However, when
the w∗ topology is used, we have
Theorem 4.1 (Alaoglu). If X is a normed vector space, then the unit ball
(X∗ )1 , in the topological dual space, is compact in the w∗ topology.
Proof. Let us consider the unit ball in K:
B = {λ ∈ K : |λ| ≤ 1}.
Let us also consider the unit ball in X:
(X)1 = {x ∈ X : ‖x‖ ≤ 1}.
5 One can invoke the Hahn-Banach Theorem here. In fact this is not necessary, since Ran T ⊂
Kn are finite dimensional vector spaces.

Define the product space
\[ P = \prod_{x \in (X)_1} B, \]
identified equivalently as the space of maps (X)1 → B. By Tihonov’s Theorem,
when we equip P with the product topology, it will become a compact topological
space. We denote by πx : P → B, x ∈ (X)1 , the projection onto the factor with
label x. By definition of the product topology πx is continuous.
For any x, y ∈ (X)1 define the map ∆x,y : P → K by
\[ \Delta_{x,y}(f) = \frac{f(x) + f(y)}{2} - f\Bigl( \frac{x + y}{2} \Bigr), \quad \forall\, f \in P. \]
Note that
\[ \Delta_{x,y} = \tfrac{1}{2} (\pi_x + \pi_y) - \pi_{(x+y)/2}, \]
so ∆x,y : P → K is obviously continuous. In particular, the set
\[ A_{x,y} = \Delta_{x,y}^{-1}(\{0\}) = \Bigl\{ f \in P : \frac{f(x) + f(y)}{2} = f\Bigl( \frac{x + y}{2} \Bigr) \Bigr\} \]
is closed in P , for every x, y ∈ (X)1 .
Similarly, for every x ∈ (X)1 and every λ ∈ B, we define the map Σλ,x : P → K by
Σλ,x (f ) = f (λx) − λf (x), ∀ f ∈ P.
Since Σλ,x = πλx − λπx , the map Σλ,x is continuous, so the set
\[ B_{\lambda,x} = \Sigma_{\lambda,x}^{-1}(\{0\}) = \{ f \in P : f(\lambda x) = \lambda f(x) \} \]
is closed in P , for every λ ∈ B, x ∈ (X)1 .


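Define the set
\[ L = \Bigl( \bigcap_{x, y \in (X)_1} A_{x,y} \Bigr) \cap \Bigl( \bigcap_{\lambda \in B,\ x \in (X)_1} B_{\lambda,x} \Bigr). \]
Since L is an intersection of closed sets, it follows that L itself is closed. In particular, L is compact, being a closed subset of the compact space P . By construction, we have
\[ L = \Bigl\{ f : (X)_1 \to B \ :\ \tfrac{1}{2} [f(x) + f(y)] = f\bigl( \tfrac{1}{2} [x + y] \bigr) \text{ and } f(\lambda x) = \lambda f(x),\ \forall\, x, y \in (X)_1,\ \lambda \in B \Bigr\}. \]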
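For any f ∈ L, we define the map ψf : X → K by
\[ \psi_f(x) = \begin{cases} 0 & \text{if } x = 0 \\ \|x\| \cdot f\Bigl( \dfrac{x}{\|x\|} \Bigr) & \text{if } x \ne 0 \end{cases} \]
Claim 1: For any f ∈ L, the map ψf : X → K is linear, and satisfies ψf |(X)1 = f .
Fix f ∈ L. Start with some x ∈ X and some λ ∈ K. We have ‖λx‖ = |λ| · ‖x‖, so we get
\[ \psi_f(\lambda x) = \begin{cases} 0 & \text{if either } x = 0, \text{ or } \lambda = 0 \\ |\lambda| \cdot \|x\| \cdot f\Bigl( \dfrac{\lambda}{|\lambda|} \cdot \dfrac{x}{\|x\|} \Bigr) & \text{if } \lambda \ne 0 \text{ and } x \ne 0 \end{cases} \]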

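If λ ≠ 0 and x ≠ 0, we put
\[ \mu = \frac{\lambda}{|\lambda|} \quad \text{and} \quad y = \frac{x}{\|x\|}, \]
and the fact that µ ∈ B, y ∈ (X)1 , and f ∈ Bµ,y , will give
\[ f\Bigl( \frac{\lambda}{|\lambda|} \cdot \frac{x}{\|x\|} \Bigr) = f(\mu y) = \mu f(y) = \frac{\lambda}{|\lambda|} \cdot f\Bigl( \frac{x}{\|x\|} \Bigr) = \frac{\lambda}{|\lambda| \cdot \|x\|}\, \psi_f(x), \]
so in this case we get
\[ \psi_f(\lambda x) = |\lambda| \cdot \|x\| \cdot f\Bigl( \frac{\lambda}{|\lambda|} \cdot \frac{x}{\|x\|} \Bigr) = |\lambda| \cdot \|x\| \cdot \frac{\lambda}{|\lambda| \cdot \|x\|}\, \psi_f(x) = \lambda \psi_f(x). \]
In the case when either λ = 0 or x = 0, we also get the equality
ψf (λx) = 0 = λψf (x).
This way we have proven the homogeneity of ψf :
(2) ψf (λx) = λψf (x), ∀ λ ∈ K, x ∈ X.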

Let us prove now that ψf |(X)1 = f . Fix x ∈ (X)1 . If x = 0, then using the property
(3) f (µy) = µf (y), ∀ µ ∈ B, y ∈ (X)1
with µ = 0 and y = 0, we immediately get f (x) = 0 = ψf (x). If x ≠ 0, we use (3) with µ = ‖x‖ and y = x/‖x‖, and we again get
\[ f(x) = f(\|x\| \cdot y) = \|x\| \cdot f(y) = \|x\| \cdot f\Bigl( \frac{x}{\|x\|} \Bigr) = \psi_f(x). \]
We now prove that ψf is additive. Start with two elements x, y ∈ X. Define
\[ v = \frac{x}{\|x\| + \|y\| + 1} \quad \text{and} \quad w = \frac{y}{\|x\| + \|y\| + 1}, \]
so that we obviously have v, w ∈ (X)1 and
x = {‖x‖ + ‖y‖ + 1} · v and y = {‖x‖ + ‖y‖ + 1} · w.
By homogeneity, we have
\[ \psi_f(x + y) = \psi_f\bigl( 2 \{ \|x\| + \|y\| + 1 \} \cdot \tfrac{1}{2} [v + w] \bigr) = 2 \{ \|x\| + \|y\| + 1 \} \cdot f\bigl( \tfrac{1}{2} [v + w] \bigr). \]
Using the fact that f ∈ Av,w the above computation can be continued to give:
\[ \begin{aligned} \psi_f(x + y) &= 2 \{ \|x\| + \|y\| + 1 \} \cdot f\bigl( \tfrac{1}{2} [v + w] \bigr) \\ &= 2 \{ \|x\| + \|y\| + 1 \} \cdot \tfrac{1}{2} [f(v) + f(w)] \\ &= \{ \|x\| + \|y\| + 1 \} \cdot f(v) + \{ \|x\| + \|y\| + 1 \} \cdot f(w). \end{aligned} \]
Using the fact that ψf |(X)1 = f , the above equality gives
ψf (x + y) = {‖x‖ + ‖y‖ + 1} · ψf (v) + {‖x‖ + ‖y‖ + 1} · ψf (w).
Finally, using the homogeneity property (2) we get
ψf (x + y) = ψf ({‖x‖ + ‖y‖ + 1} · v) + ψf ({‖x‖ + ‖y‖ + 1} · w) = ψf (x) + ψf (y).
Having proven the Claim, let us now observe that, for f ∈ L, the fact that
ψf (x) = f (x) ∈ B, ∀ x ∈ (X)1 ,
shows that ψf is continuous, and ‖ψf ‖ ≤ 1. Therefore we have a correctly defined map
Ψ : L ∋ f ⟼ ψf ∈ (X∗ )1 .
Claim 2: When (X∗ )1 is equipped with the w∗ topology, the map Ψ is con-
tinuous.
By the definition of the w∗ topology, we need to prove that ϵx ◦ Ψ : L → K is continuous, for every x ∈ X. If x = 0, the composition ϵx ◦ Ψ is the constant map 0, so there is nothing to prove. If x ≠ 0, we define
\[ y = \frac{x}{\|x\|} \in (X)_1, \]
and using Claim 1, we have
\[ (\epsilon_x \circ \Psi)(f) = \epsilon_x(\psi_f) = \psi_f(x) = \|x\| \cdot \psi_f\Bigl( \frac{x}{\|x\|} \Bigr) = \|x\| \cdot \psi_f(y) = \|x\| \cdot f(y), \quad \forall\, f \in L. \]
This proves that
ϵx ◦ Ψ = ‖x‖ · πy (restricted to L),
and since πy : P → B is continuous, the continuity of ϵx ◦ Ψ follows.
In order to finish the proof of the Theorem, it then suffices to prove
Claim 3: The map Ψ : L → (X∗ )1 is surjective.
Start with an arbitrary φ ∈ (X∗ )1 , which means that φ : X → K is linear, continu-
ous, and
|φ(x)| ≤ 1, ∀ x ∈ (X)1 .

In particular, if we define f = φ|(X)1 , then
f (x) ∈ B, ∀ x ∈ (X)1 ,
which means that f ∈ P . Using the fact that φ is linear, it is obvious that f ∈ L.
Using Claim 1, we have
ψf (x) = f (x) = φ(x), ∀ x ∈ (X)1 .
Now, since ψf |(X)1 = φ|(X)1 , and both ψf and φ are linear, we immediately get ψf = φ.

Remarks 4.2. Using the notations from the above proof, the continuous map Ψ : L → (X∗ )1 is in fact bijective. The only thing we need to prove is the injectivity. Suppose ψf = ψg , for some f, g ∈ L. Then
f = ψf |(X)1 = ψg |(X)1 = g.
Since Ψ : L → (X∗ )1 is bijective, continuous, and the spaces L and (X∗ )1 are compact Hausdorff, it follows that Ψ is in fact a homeomorphism. The inverse map Ψ−1 : (X∗ )1 → L is simply defined by
Ψ−1 (φ) = φ|(X)1 , ∀ φ ∈ (X∗ )1 .

Proposition 4.4. Suppose X is a normed vector space, which is separable in the norm topology. When equipped with the w∗ topology, the compact space (X∗ )1 is metrizable.
Proof. Fix a countable dense subset M ⊂ X, and define (M)1 = (X)1 ∩ M. Notice that (M)1 is dense in (X)1 . Indeed, if we start with some x ∈ (X)1 , and some ε > 0 (we may assume ε < 2), then we set xε = (1 − ε/2)x, and we choose y ∈ M such that ‖xε − y‖ < ε/2. On the one hand, we have
\[ \|y\| \le \|x_\varepsilon - y\| + \|x_\varepsilon\| < \frac{\varepsilon}{2} + \Bigl( 1 - \frac{\varepsilon}{2} \Bigr) \cdot \|x\| \le \frac{\varepsilon}{2} + 1 - \frac{\varepsilon}{2} = 1, \]
so y ∈ (M)1 . On the other hand, we have
\[ \|y - x\| \le \|y - x_\varepsilon\| + \|x - x_\varepsilon\| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} \|x\| = \frac{\varepsilon}{2} \cdot \bigl( 1 + \|x\| \bigr) \le \varepsilon. \]
Let us use the notations from the proof of Theorem 4.1. Let us then define the product space
\[ \prod_{x \in (M)_1} B, \]
equipped with the product topology. Define also the map
\[ \Upsilon : \prod_{x \in (X)_1} B \ni f \longmapsto f\big|_{(M)_1} \in \prod_{x \in (M)_1} B. \]
It is obvious that Υ is continuous. Let
\[ \kappa : (X^*)_1 \ni \phi \longmapsto \phi\big|_{(X)_1} \in \prod_{x \in (X)_1} B. \]
We know that κ is continuous and injective (being the inverse of Ψ : L → (X∗ )1 ).
Claim: The composition \( \Upsilon \circ \kappa : (X^*)_1 \to \prod_{x \in (M)_1} B \) is injective.
Indeed, if φ, ψ ∈ (X∗ )1 satisfy (Υ ◦ κ)(φ) = (Υ ◦ κ)(ψ), then we get φ|(M)1 = ψ|(M)1 . Since (M)1 is dense in (X)1 , and both φ and ψ are continuous, this will force φ|(X)1 = ψ|(X)1 , which finally forces φ = ψ.
Using the above Claim, we see that if we define Q = (Υ ◦ κ)((X∗ )1 ), then \( Q \subset \prod_{x \in (M)_1} B \) is compact, and Υ ◦ κ : (X∗ )1 → Q is a homeomorphism. Notice that \( \prod_{x \in (M)_1} B \) is a countable product of metric spaces, so it is metrizable. Therefore Q is also metrizable, and so will be (X∗ )1 .

Remark 4.3. Assume X is separable, and M ⊂ X is a countable dense subset. If we enumerate the countable set (M)1 as
(M)1 = {yn : n ≥ 1},
then a metric d that defines the w∗ topology on (X∗ )1 can be constructed as
\[ d(\phi, \psi) = \sum_{n=1}^{\infty} \frac{|\phi(y_n) - \psi(y_n)|}{2^n}, \quad \forall\, \phi, \psi \in (X^*)_1. \]
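The series defining d can be evaluated term by term. The following is a minimal sketch, assuming functionals are represented as Python callables and that only finitely many points y1 , . . . , yN of the enumeration are available; the name `truncated_metric` and the sample data are hypothetical. Since |φ(yn ) − ψ(yn )| ≤ 2 on the unit balls, the terms omitted after the N-th one add up to at most 2 · 2^(−N).

```python
def truncated_metric(phi, psi, ys):
    """Partial sum of |phi(y_n) - psi(y_n)| / 2**n over the listed points y_1, ..., y_N."""
    return sum(abs(phi(y) - psi(y)) / 2 ** n for n, y in enumerate(ys, start=1))


# Illustration on X = R^2 with the Euclidean norm (an assumption made for the example).
ys = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]  # a few points of the unit ball
phi = lambda y: y[0]                        # functionals of norm at most 1
psi = lambda y: y[1]
zero = lambda y: 0.0

print(truncated_metric(phi, psi, ys))
print(truncated_metric(phi, zero, ys) + truncated_metric(zero, psi, ys))
```

The second printed value bounds the first one from above, as the triangle inequality requires.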
Comments. Let X be a normed vector space. One can extend the map κ to a map
\[ \tilde{\kappa} : X^* \ni \phi \longmapsto \phi\big|_{(X)_1} \in \prod_{x \in (X)_1} K. \]
This map will still be injective and continuous, and one can show that
κ̃ : X∗ → κ̃(X∗ )
is a homeomorphism, when κ̃(X∗ ) is equipped with the induced topology from the product space \( \prod_{x \in (X)_1} K \). In general however, the set κ̃(X∗ ) is not closed in the product space \( \prod_{x \in (X)_1} K \).
If X is separable, and if one takes a countable dense set M ⊂ X, then as before, one still has a continuous map
\[ \tilde{\Upsilon} : \prod_{x \in (X)_1} K \ni f \longmapsto f\big|_{(M)_1} \in \prod_{x \in (M)_1} K, \]
and the composition
\[ \tilde{\Upsilon} \circ \tilde{\kappa} : X^* \to \prod_{x \in (M)_1} K \]
will still be continuous and injective. In general however, it turns out that the map
Υ̃ ◦ κ̃ : X∗ → (Υ̃ ◦ κ̃)(X∗ )
is not a homeomorphism. The exercise below explains exactly when this is the case.
Exercise 1*. Let X be a normed vector space, which is of uncountable dimension (for example, an infinite dimensional Banach space). Prove that the topological space (X∗ , w∗ ) is not metrizable.
Hint: Assume (X∗ , w∗ ) is metrizable. Let d be a metric which gives the w∗ -topology. Then 0 ∈ X∗ will have a countable basic system of neighborhoods. In particular, there exist sequences (xn )n≥1 ⊂ X, and (εn )n≥1 ⊂ (0, ∞), such that the sets
\[ B_n = \bigcap_{k=1}^{n} W(0; \varepsilon_n, x_k) \]
satisfy Bn ⊂ B1/n (0), ∀ n ≥ 1, where B1/n (0) denotes the d-open ball of center 0 and radius 1/n. Consider the set M = {xn : n ∈ N}. We know that Span M ⊊ X. Choose some vector y ∈ X ∖ Span M. For every n ≥ 1, choose a linear map ψn : Span{y, x1 , . . . , xn } → K, such that ψn (y) = 1, and ψn (xk ) = 0, ∀ k ∈ {1, . . . , n}. Extend (use Hahn-Banach) ψn to a linear continuous map φn : X → K. Notice now that φn ∈ Bn , for all n ≥ 1, which would then force d- limn→∞ φn = 0. In particular, this would force limn→∞ φn (x) = 0, ∀ x ∈ X. But this is impossible, since φn (y) = 1, ∀ n ≥ 1.
Comment. If X is a normed vector space of countable dimension, then (X∗ , w∗ ) is metrizable. Indeed, if we take a linear basis {bn : n ∈ N} for X, then the w∗ topology on X∗ is clearly defined by the metric
\[ d(\phi, \psi) = \sum_{n=1}^{\infty} \frac{1}{2^n} \cdot \frac{|\phi(b_n) - \psi(b_n)|}{1 + |\phi(b_n) - \psi(b_n)|}, \quad \phi, \psi \in X^*. \]
Lectures 14-15

5. Banach spaces of continuous functions


In this section we discuss examples of Banach spaces coming from topology.
Notation. Let K be one of the fields R or C, and let Ω be a topological space.
We define
CbK (Ω) = {f : Ω → K : f bounded and continuous}.
In the case when K = C we use the notation Cb (Ω).
Proposition 5.1. With the notations above, if we define
\[ \|f\| = \sup_{p \in \Omega} |f(p)|, \quad \forall\, f \in C_b^K(\Omega), \]
then CbK (Ω) is a Banach space.


Proof. It is obvious that CbK (Ω) is a linear subspace of ℓ∞K (Ω), and the norm is precisely the one coming from ℓ∞K (Ω). Therefore, it suffices to prove that CbK (Ω) is closed in ℓ∞K (Ω).
Start with some sequence (fn )n≥1 ⊂ CbK (Ω), which converges in norm to some f ∈ ℓ∞K (Ω), and let us prove that f : Ω → K is continuous (the fact that f is bounded is automatic).
Fix some point p0 ∈ Ω, and some ε > 0. We need to find some neighborhood
V of p0 , such that
|f (p) − f (p0 )| < ε, ∀ p ∈ V.
Start by choosing n such that ‖fn − f ‖ < ε/3. Use the fact that fn is continuous, to find a neighborhood V of p0 , such that
|fn (p) − fn (p0 )| < ε/3, ∀ p ∈ V.
Suppose now p ∈ V . We have
\[ \begin{aligned} |f(p) - f(p_0)| &\le |f(p) - f_n(p)| + |f_n(p) - f_n(p_0)| + |f_n(p_0) - f(p_0)| \\ &\le |f_n(p) - f_n(p_0)| + 2 \sup_{q \in \Omega} |f_n(q) - f(q)| < \frac{\varepsilon}{3} + 2 \cdot \frac{\varepsilon}{3} = \varepsilon. \end{aligned} \]

A first application of Banach space techniques is the following:
Lemma 5.1 (Urysohn type density). Let Ω be a topological space, let C ⊂ CbR (Ω) be a linear subspace, which contains the constant function 1. Assume
(u) for any two closed sets A, B ⊂ Ω, with A ∩ B = ∅, there exists a function h ∈ C, such that h|A = 0, h|B = 1, and h(p) ∈ [0, 1], for all p ∈ Ω.
Then C is dense in CbR (Ω), in the norm topology.


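Proof. The key step in the proof will be the following:
Claim: For any f ∈ CbR (Ω), there exists g ∈ C, such that
\[ \|g - f\| \le \tfrac{2}{3} \|f\|. \]
To prove this claim we define
\[ \alpha = \inf_{p \in \Omega} f(p) \quad \text{and} \quad \beta = \sup_{p \in \Omega} f(p), \]
so that f (Ω) ⊂ [α, β], and ‖f ‖ = max{|α|, |β|}. If α = β, then f = α1 already belongs to C, and we can take g = f . Assume then α < β, and define the sets
\[ A = f^{-1}\Bigl( \Bigl[ \alpha, \frac{2\alpha + \beta}{3} \Bigr] \Bigr) \quad \text{and} \quad B = f^{-1}\Bigl( \Bigl[ \frac{\alpha + 2\beta}{3}, \beta \Bigr] \Bigr), \]
so that both A and B are closed, and A ∩ B = ∅. Use the hypothesis, to find a function h ∈ C, such that h|A = 0, h|B = 1, and h(p) ∈ [0, 1], for all p ∈ Ω. Define the function g ∈ C by
\[ g = \tfrac{1}{3} \bigl( \alpha 1 + (\beta - \alpha) h \bigr). \]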
Let us examine the difference g − f . Start with some arbitrary point p ∈ Ω. There are three cases to examine:
Case I: p ∈ A. In this case we have h(p) = 0, so we get g(p) = α/3. By the construction of A we also have α ≤ f (p) ≤ (2α + β)/3, so we get
\[ \frac{2\alpha}{3} \le f(p) - g(p) \le \frac{\alpha + \beta}{3}. \]
Case II: p ∈ B. In this case we have h(p) = 1, so we get g(p) = β/3. We also have (2β + α)/3 ≤ f (p) ≤ β, so we get
\[ \frac{\alpha + \beta}{3} \le f(p) - g(p) \le \frac{2\beta}{3}. \]
Case III: p ∈ Ω ∖ (A ∪ B). In this case we have 0 ≤ h(p) ≤ 1, so we get α/3 ≤ g(p) ≤ β/3, and (2α + β)/3 < f (p) < (α + 2β)/3. In particular we get
\[ f(p) - g(p) > \frac{2\alpha + \beta}{3} - \frac{\beta}{3} = \frac{2\alpha}{3}; \qquad f(p) - g(p) < \frac{\alpha + 2\beta}{3} - \frac{\alpha}{3} = \frac{2\beta}{3}. \]
Since 2α/3 ≤ (α + β)/3 ≤ 2β/3, we see that in all three cases we have
\[ \frac{2\alpha}{3} \le f(p) - g(p) \le \frac{2\beta}{3}, \]
so we get
\[ \frac{2\alpha}{3} \le \inf_{p \in \Omega} \bigl[ f(p) - g(p) \bigr] \le \sup_{p \in \Omega} \bigl[ f(p) - g(p) \bigr] \le \frac{2\beta}{3}, \]
so we indeed get the desired inequality
\[ \|g - f\| \le \tfrac{2}{3} \|f\|. \]
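The estimate in the Claim can be checked numerically. The sketch below is only an illustration under simplifying assumptions: Ω is replaced by a finite list of sample points, and the separating function h is built directly from f as a clamped linear ramp (in the Lemma, h is instead supplied by hypothesis (u)). The names `sup_norm` and `two_thirds_step` are hypothetical.

```python
import math


def sup_norm(values):
    """Sup norm of a function given by its list of sample values."""
    return max(abs(v) for v in values)


def two_thirds_step(f_values):
    """Return samples of g = (alpha + (beta - alpha) * h) / 3, as in the Claim."""
    alpha, beta = min(f_values), max(f_values)
    if alpha == beta:
        return list(f_values)  # f is constant, so g = f works
    lo = (2 * alpha + beta) / 3  # h = 0 where f <= lo (the set A)
    hi = (alpha + 2 * beta) / 3  # h = 1 where f >= hi (the set B)
    g = []
    for v in f_values:
        h = min(1.0, max(0.0, (v - lo) / (hi - lo)))
        g.append((alpha + (beta - alpha) * h) / 3)
    return g


f = [math.sin(7 * k / 50) - 0.3 for k in range(51)]
g = two_thirds_step(f)
residual = [fv - gv for fv, gv in zip(f, g)]
print(sup_norm(residual), (2 / 3) * sup_norm(f))
```

Applying the same step repeatedly to the residual f − g shrinks its norm by a factor of at most 2/3 each time, which is the mechanism behind the density argument.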

Having proven the Claim, we now prove the density of C in CbR (Ω). Start with some f ∈ CbR (Ω), and we construct recursively two sequences (gn )n≥1 ⊂ C and (fn )n≥1 ⊂ CbR (Ω), as follows. Set f1 = f . Apply the Claim to find g1 ∈ C such that
\[ \|g_1 - f_1\| \le \tfrac{2}{3} \|f_1\|. \]
Once f1 , f2 , . . . , fn and g1 , g2 , . . . , gn have been constructed, we set
fn+1 = fn − gn ,
and we choose gn+1 ∈ C such that
\[ \|g_{n+1} - f_{n+1}\| \le \tfrac{2}{3} \|f_{n+1}\|. \]
It is clear, by construction, that
\[ \|f_n\| \le \Bigl( \frac{2}{3} \Bigr)^{n-1} \|f\|, \quad \forall\, n \ge 1. \]
Consider the sequence (sn )n≥1 ⊂ C of partial sums, defined by
sn = g1 + g2 + · · · + gn , ∀ n ≥ 1.
Using the equalities
gn = fn − fn+1 , ∀ n ≥ 1,
we get
f − sn = f1 − (g1 + g2 + · · · + gn ) = fn+1 ,
so we have
\[ \|s_n - f\| \le \Bigl( \frac{2}{3} \Bigr)^{n} \|f\|, \quad \forall\, n \ge 1, \]
which clearly gives f = limn→∞ sn , so f indeed belongs to the closure of C.

We are now in position to prove the following


Theorem 5.1 (Tietze Extension Theorem). Let Ω be a normal topological space, let T ⊂ Ω be a closed subset. Let f : T → [0, 1] be a continuous function. (Here T is equipped with the induced topology.) Then there exists a continuous function g : Ω → [0, 1] such that g|T = f .

Proof. Let us introduce the Banach space setting that will make the proof clearer. We consider the Banach spaces CbR (Ω) and CbR (T ). To avoid any confusion, the norms on these Banach spaces will be denoted by ‖ · ‖Ω and ‖ · ‖T . If we define the restriction map
R : CbR (Ω) ∋ g ⟼ g|T ∈ CbR (T ),
then R is obviously linear and continuous.
We define the subspace C = R(CbR (Ω)) ⊂ CbR (T ).
Claim: For every f ∈ C, there exists some g ∈ CbR (Ω) such that f = Rg, and
\[ \inf_{q \in T} f(q) \le g(p) \le \sup_{q \in T} f(q), \quad \forall\, p \in \Omega. \]

To prove this fact, we start first with some arbitrary g0 ∈ CbR (Ω), such that f = Rg0 = g0 |T . Put
\[ \alpha = \inf_{q \in T} f(q) \quad \text{and} \quad \beta = \sup_{q \in T} f(q), \]
so that ‖f ‖T = max{|α|, |β|}. Define the function θ : R → [α, β] by
\[ \theta(t) = \begin{cases} \alpha & \text{if } t < \alpha \\ t & \text{if } \alpha \le t \le \beta \\ \beta & \text{if } t > \beta \end{cases} \]
Then obviously θ is continuous, and the composition g = θ ◦ g0 : Ω → [α, β] will still satisfy g|T = f , and we will clearly have
α ≤ g(p) ≤ β, ∀ p ∈ Ω.
Having proven the Claim, we are going to prove that C is closed. We do this by showing that C is a Banach space, in the norm ‖ · ‖T . To get this, we use Remark ??: it suffices to show that every series in C whose norms are summable is convergent in C. Start with some sequence (fn )n≥1 ⊂ C, with \( \sum_{n=1}^{\infty} \|f_n\|_T < \infty \). Apply the Claim, to construct a sequence (gn )n≥1 ⊂ CbR (Ω), such that Rgn = fn , and
\[ \inf_{q \in T} f_n(q) \le g_n(p) \le \sup_{q \in T} f_n(q), \quad \forall\, p \in \Omega, \]

for each n ≥ 1. Notice that this forces
‖gn ‖Ω ≤ ‖fn ‖T , ∀ n ≥ 1.
Define the sequences of partial sums (hn )n≥1 ⊂ C and (sn )n≥1 ⊂ CbR (Ω), by
hn = f1 + · · · + fn and sn = g1 + · · · + gn , ∀ n ≥ 1.
Since
\[ \sum_{n=1}^{\infty} \|g_n\|_\Omega \le \sum_{n=1}^{\infty} \|f_n\|_T < \infty, \]
and CbR (Ω) is a Banach space, it follows that the sequence (sn )n≥1 is convergent to some point g ∈ CbR (Ω). Since R : CbR (Ω) → CbR (T ) is linear and continuous, we will have
\[ Rg = \lim_{n \to \infty} [Rg_1 + \cdots + Rg_n] = \lim_{n \to \infty} [f_1 + \cdots + f_n] = \lim_{n \to \infty} h_n, \]
which proves that the sequence of partial sums (hn )n≥1 ⊂ C is indeed convergent to Rg ∈ C.
Let us remark now that obviously C contains the constant function 1 = R1. Two disjoint closed subsets of T are also closed in Ω, because T is closed, so using Urysohn Lemma in Ω and restricting to T , it is clear that C satisfies the condition (u) in the above lemma. Using Lemma 5.1, it follows that C is dense in CbR (T ). Since C is closed, we get C = CbR (T ), i.e. R is surjective.
To finish the proof, start with some arbitrary continuous function f : T → [0, 1]. Use surjectivity of R, combined with the Claim, to find g ∈ CbR (Ω), such that Rg = f , and
\[ \inf_{q \in T} f(q) \le g(p) \le \sup_{q \in T} f(q), \quad \forall\, p \in \Omega. \]
This clearly forces g to take values in [0, 1].
Next we concentrate on the case when Ω is a compact Hausdorff space. In
this case, every continuous function F : Ω → K is automatically bounded, and the
Banach space CbK (Ω) will be denoted simply by C K (Ω). (When K = C this space
will be denoted simply by C(Ω).)

Theorem 5.2 (Dini). Let K be a compact Hausdorff space, let (fn )n≥1 ⊂
C R (K) be a monotone sequence. Assume there is some f ∈ C R (K), such that
\[ \lim_{n \to \infty} f_n(p) = f(p), \quad \forall\, p \in K. \]
Then limn→∞ fn = f , in the norm topology.
Proof. Replacing fn with fn − f , we can assume that limn→∞ fn (p) = 0,
∀ p ∈ K. Replacing (if necessary) fn with −fn , we can also assume that the
sequence (fn )n≥1 is decreasing. In particular, each fn is non-negative.
We need to prove that limn→∞ kfn k = 0. Assume this is not true, so there
exists some ε > 0, such that the set
M = {m ∈ N : kfm k ≥ ε}
is infinite. For each integer n ≥ 1, let us define the set
Fn = {p ∈ K : fn (p) ≥ ε}.
Then by the definition of M , we have
Fm ≠ ∅, ∀ m ∈ M.
Claim: One has the inclusion Fn ⊃ Fn+1 , ∀ n ≥ 1.
Indeed, if p ∈ Fn+1 , then
ε ≤ fn+1 (p) ≤ fn (p),
which proves that p ∈ Fn .
Using the claim, plus the fact that the set M is infinite, it follows that Fn ≠ ∅, ∀ n ≥ 1. (Indeed, if we start with some arbitrary n, then since M is infinite, we can find m ∈ M , with m ≥ n, and then using the Claim we have ∅ ≠ Fm ⊂ Fn .)
Since K is compact, and the sets F1 ⊃ F2 ⊃ . . . are closed and non-empty, by the finite intersection property, it follows that
\[ \bigcap_{n=1}^{\infty} F_n \ne \emptyset. \]
But this leads to a contradiction, because if we pick an element p in this intersection, then we will have fn (p) ≥ ε, ∀ n ≥ 1, and then the equality limn→∞ fn (p) = 0 is impossible.
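A small numerical illustration of Dini's Theorem may help. The sketch below uses an example that is not in the text: fn (t) = t^n (1 − t) on [0, 1], which decreases pointwise to the continuous limit 0. The sup norms are only approximated on a finite grid, and the names `sup_norm_on_grid`, `grid`, and `norms` are hypothetical.

```python
def sup_norm_on_grid(func, grid):
    """Approximate the sup norm of func by its largest absolute value on the grid."""
    return max(abs(func(t)) for t in grid)


grid = [k / 1000 for k in range(1001)]  # sample points in [0, 1]

# f_n(t) = t**n * (1 - t) is decreasing in n and tends to 0 at every t in [0, 1]
norms = [sup_norm_on_grid(lambda t, n=n: t ** n * (1 - t), grid) for n in range(1, 41)]

print(norms[0], norms[-1])
```

The computed norms decrease towards 0, in line with the theorem. Continuity of the limit cannot be dropped: the sequence t^n on [0, 1] is also decreasing, but its pointwise limit is discontinuous at t = 1 and the convergence is not uniform.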
Exercise 1. Define the sequence (Pn )n≥1 of polynomials, by P1 (t) = 0, and
\[ P_{n+1}(t) = \tfrac{1}{2} \bigl( t - P_n(t)^2 \bigr) + P_n(t), \quad \forall\, n \ge 1. \]
Prove that
\[ \lim_{n \to \infty} \max_{t \in [0,1]} \bigl| P_n(t) - \sqrt{t} \bigr| = 0. \]
Hint: Define the functions fn , f : [0, 1] → R by fn (t) = Pn (t) and f (t) = √t. Prove that, for every t ∈ [0, 1], the sequence (fn (t))n≥1 is increasing, bounded, and limn→∞ fn (t) = f (t). Then apply Dini’s Theorem.
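The recursion in the exercise is easy to run numerically. The following is a minimal sketch that evaluates Pn (t) pointwise (without expanding the polynomial) and measures the distance to the square root on a finite grid; the names `P`, `sup_error`, `grid`, and `errors` are hypothetical, and the grid maximum only approximates the true maximum over [0, 1].

```python
import math


def P(n, t):
    """Evaluate P_n(t), where P_1 = 0 and P_{k+1}(t) = (t - P_k(t)**2) / 2 + P_k(t)."""
    value = 0.0
    for _ in range(n - 1):
        value = 0.5 * (t - value * value) + value
    return value


def sup_error(n, grid):
    """Largest value of |P_n(t) - sqrt(t)| over the grid points."""
    return max(abs(P(n, t) - math.sqrt(t)) for t in grid)


grid = [k / 200 for k in range(201)]  # sample points in [0, 1]
errors = [sup_error(n, grid) for n in (1, 5, 25, 125)]
print(errors)
```

The printed errors decrease as n grows, which is consistent with the uniform convergence asserted in the exercise.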
Theorem 5.3 (Stone-Weierstrass). Let K be a compact Hausdorff space. Let A ⊂ C R (K) be a unital subalgebra, i.e.
• A ∋ 1, the constant function 1;
• A is a linear subspace;
• if f, g ∈ A, then f g ∈ A.
Assume A separates the points of K, i.e. for any p, q ∈ K, with p ≠ q, there exists f ∈ A such that f (p) ≠ f (q).
Then A is dense in C R (K), in the norm topology.

Proof. Let C denote the closure of A. Remark that C is again a unital sub-
algebra and it still separates the points.
The proof will eventually use the Urysohn density Lemma. Before we get to
that point, we need several preparations.
Step 1. If f ∈ C, then |f | ∈ C.
We can assume f ≠ 0. To prove this fact, we define g = f 2 ∈ C, and we set h = ‖g‖−1 g, so that h ∈ C, and h(p) ∈ [0, 1], for all p ∈ K. Let Pn (t), n ≥ 1 be the polynomials defined in the above exercise. The functions hn = Pn ◦ h, n ≥ 1 are clearly all in C. By the above Exercise, we clearly get
\[ \lim_{n \to \infty} \max_{p \in K} \bigl| h_n(p) - \sqrt{h(p)} \bigr| = 0, \]
which means that limn→∞ hn = √h, in the norm topology. In particular, √h belongs to C. Obviously we have
\[ \sqrt{h} = \|f\|^{-1} \cdot |f|, \]
so |f | indeed belongs to C.
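Step 1 can be tried out on a concrete function. The sketch below assumes K = [−1, 1], sampled on a finite grid, and the sample function f (s) = 3s − 1; it approximates |f | by ‖f ‖ · Pn (f ²/‖f ‖²), using the recursion P1 = 0, Pk+1 (t) = (t − Pk (t)²)/2 + Pk (t) from the exercise. The names `P`, `approx_abs`, `points`, and `worst` are hypothetical, and the norm of f is taken over the grid only.

```python
def P(n, t):
    """Evaluate P_n(t), where P_1 = 0 and P_{k+1}(t) = (t - P_k(t)**2) / 2 + P_k(t)."""
    value = 0.0
    for _ in range(n - 1):
        value = 0.5 * (t - value * value) + value
    return value


def approx_abs(f_value, f_norm, n):
    """Approximate |f(p)| by ||f|| * P_n(h(p)), where h = f**2 / ||f||**2."""
    h = (f_value / f_norm) ** 2
    return f_norm * P(n, h)


points = [k / 100 - 1 for k in range(201)]  # grid on [-1, 1]
f = [3 * s - 1 for s in points]             # sample function f(s) = 3s - 1
f_norm = max(abs(v) for v in f)

worst = max(abs(approx_abs(v, f_norm, 125) - abs(v)) for v in f)
print(worst)
```

Because each Pn is a polynomial and C is an algebra containing f , every approximant belongs to C, which is why the limit |f | lies in the closed set C.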
Step 2: Given two functions f, g ∈ C, the continuous functions max{f, g} and
min{f, g} both belong to C.
This follows immediately from Step 1, and the equalities
\[ \max\{f, g\} = \tfrac{1}{2} \bigl( f + g + |f - g| \bigr) \quad \text{and} \quad \min\{f, g\} = \tfrac{1}{2} \bigl( f + g - |f - g| \bigr). \]
Step 3: For any two points p, q ∈ K, p ≠ q, there exists h ∈ C, such that h(p) = 0, h(q) = 1, and h(s) ∈ [0, 1], ∀ s ∈ K.
Use the assumption on A, to find first a function f ∈ A, such that f (p) ≠ f (q). Put α = f (p) and β = f (q), and define
\[ g = \frac{1}{\beta - \alpha} \bigl( f - \alpha 1 \bigr). \]
The function g still belongs to A, but now we have g(p) = 0 and g(q) = 1. Define the function h = min{g 2 , 1}. By Step 2, h ∈ C, and it clearly satisfies the required properties.
Step 4: Given a closed subset A ⊂ K, and a point p ∈ K ∖ A, there exists a function h ∈ C, such that h(p) = 0, h|A = 1, and h(q) ∈ [0, 1], ∀ q ∈ K.
For every q ∈ A, we use Step 3 to find a function hq ∈ C, such that hq (p) = 0, hq (q) = 1, and hq (s) ∈ [0, 1], ∀ s ∈ K, and we define the open set
Dq = {s ∈ K : hq (s) > 0}.
Using the compactness of A, we find points q1 , . . . , qn ∈ A, such that
A ⊂ Dq1 ∪ · · · ∪ Dqn .
Define the function f = hq1 + · · · + hqn ∈ C, so that f (p) = 0, f (q) > 0, for all q ∈ A, and f (s) ≥ 0, ∀ s ∈ K. If we define
\[ m = \min_{q \in A} f(q), \]
then the function g = m−1 f again belongs to C, and it satisfies g(p) = 0, g(q) ≥ 1, ∀ q ∈ A, and g(s) ≥ 0, ∀ s ∈ K. Finally, the function
h = min{g, 1}
will satisfy the required properties.
Step 5: Given closed sets A, B ⊂ K with A ∩ B = ∅, there exists h ∈ C, such that h|A = 1, h|B = 0, and h(q) ∈ [0, 1], ∀ q ∈ K.
Use Step 4, to find for every p ∈ A, a function hp ∈ C, such that hp |B = 1, hp (p) = 0, and hp (s) ∈ [0, 1], ∀ s ∈ K. Put gp = 1 − hp , so that gp (p) = 1, gp |B = 0, and gp (s) ∈ [0, 1], ∀ s ∈ K. We then proceed as above. For each p ∈ A we define the open set
Dp = {s ∈ K : gp (s) > 0}.
Using the compactness of A, we find points p1 , . . . , pn ∈ A, such that
A ⊂ Dp1 ∪ · · · ∪ Dpn .
Define the function f = gp1 + · · · + gpn ∈ C, so that f |B = 0, f (q) > 0, for all q ∈ A, and f (s) ≥ 0, ∀ s ∈ K. If we define
\[ m = \min_{q \in A} f(q), \]
then the function g = m−1 f again belongs to C, and it satisfies g|B = 0, g(q) ≥ 1, ∀ q ∈ A, and g(s) ≥ 0, ∀ s ∈ K. Finally, the function
h = min{g, 1}
will satisfy the required properties.
We now apply the Urysohn density Lemma, to conclude that C is dense in
C R (K). Since C is already closed, this forces C = C R (K), i.e. A is dense in
C R (K). 
Corollary 5.1 (Complex version of Stone-Weierstrass Theorem). Let K be
a compact Hausdorff space. Let A ⊂ C(K) be a unital subalgebra, which satisfies:
• if f ∈ A, then f̄ ∈ A (where f̄ denotes the complex conjugate of f ).
Assume A separates the points of K. Then A is dense in C(K), in the norm
topology.
Proof. Consider the sub-algebra
AR = {f ∈ A : f̄ = f }.
It is clear that
A = AR + iAR ,
and AR is a unital sub-algebra of C R (K), which separates the points of K. Using
the real version, we know that AR is dense in C R (K). Then A is clearly dense in
C(K). 
Example 5.1. Consider the unit disk
D = {λ ∈ C : |λ| < 1},
and let D̄ denote its closure. Consider the algebra A ⊂ C(D̄) consisting of all polynomial functions. Notice that, although A is unital and separates the points of D̄, it does not have the property
f ∈ A ⇒ f̄ ∈ A.
In fact, one way to see that this property fails is by inspecting the closure of A in C(D̄). This closure is denoted by A(D) and is called the disk algebra. The main feature of A(D) is the following:
Exercise 2*. Prove that
A(D) = {f : D̄ → C : f continuous, and f |D holomorphic}.
We now examine the topological dual of C(K).
Notations. Let K be a compact Hausdorff space, and let 𝕂 be one of the fields R or C (written 𝕂 here to distinguish the field from the space K). We define the space
M𝕂 (K) = C𝕂 (K)∗ = {φ : C𝕂 (K) → 𝕂 : φ 𝕂-linear continuous}.
The unit ball will be denoted by M𝕂 (K)1 . When 𝕂 = C, the superscript C will be omitted from the notation.
Remarks 5.1. Let K be a compact Hausdorff space. The space M(K) = C(K)∗ carries a natural involution, defined as follows. For φ ∈ M(K), we define the map φ⋆ : C(K) → C by
\[ \phi^\star(f) = \overline{\phi(\bar{f})}, \quad \forall\, f \in C(K). \]
For every φ ∈ M(K), the map φ⋆ : C(K) → C is again linear, continuous, and has
‖φ⋆ ‖ = ‖φ‖.
The map φ⋆ will be called the adjoint of φ. We used the term involution, because the map
M(K) ∋ φ ⟼ φ⋆ ∈ M(K)
has the following properties:
• (φ⋆ )⋆ = φ, ∀ φ ∈ M(K);
• (φ + ψ)⋆ = φ⋆ + ψ⋆ , ∀ φ, ψ ∈ M(K);
• (λφ)⋆ = λ̄φ⋆ , ∀ φ ∈ M(K), λ ∈ C.
If we define the space of self-adjoint maps
Msa (K) = {φ ∈ M(K) : φ⋆ = φ},
then it is clear that, for any φ ∈ Msa (K), the restriction φ|C R (K) is real-valued. In fact, for φ ∈ M(K), one has
φ⋆ = φ ⟺ φ|C R (K) is real-valued.
Moreover, one has a map
(1) Msa (K) ∋ φ ⟼ φ|C R (K) ∈ MR (K),
which is an isomorphism of R-vector spaces. The inverse of this map is defined as follows. Start with some φ ∈ MR (K), i.e. φ : C R (K) → R is R-linear and continuous, and we define φ̂ : C(K) → C by
φ̂(f ) = φ(Re f ) + iφ(Im f ), ∀ f ∈ C(K).
It turns out that φ̂ is again linear, continuous, and self-adjoint. Moreover, the correspondence
MR (K) ∋ φ ⟼ φ̂ ∈ Msa (K)
is the inverse of (1).
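The two less obvious properties of φ̂, namely C-linearity and self-adjointness, can be verified by a short computation. The derivation below is a sketch that assumes the adjoint is given by the usual convention \( \phi^\star(f) = \overline{\phi(\bar{f})} \), and it uses only that φ is R-linear and real-valued on real functions.

```latex
% Compatibility with multiplication by i (additivity and real homogeneity are immediate):
\hat{\phi}(if) = \phi(\operatorname{Re}(if)) + i\,\phi(\operatorname{Im}(if))
             = \phi(-\operatorname{Im} f) + i\,\phi(\operatorname{Re} f)
             = i\,\bigl[\phi(\operatorname{Re} f) + i\,\phi(\operatorname{Im} f)\bigr]
             = i\,\hat{\phi}(f).

% Self-adjointness, using that \phi(\operatorname{Re} f) and \phi(\operatorname{Im} f) are real:
\hat{\phi}^{\star}(f) = \overline{\hat{\phi}(\bar{f})}
                     = \overline{\phi(\operatorname{Re} f) - i\,\phi(\operatorname{Im} f)}
                     = \phi(\operatorname{Re} f) + i\,\phi(\operatorname{Im} f)
                     = \hat{\phi}(f).
```

Combined with R-linearity, the first identity gives φ̂(λf ) = λφ̂(f ) for every complex λ, so φ̂ is C-linear.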

Proposition 5.2. Let K be a compact Hausdorff space. Then the map
Msa (K) ∋ φ ⟼ φ|C R (K) ∈ MR (K)
is isometric. Moreover, when the two spaces are equipped with the w∗ topology, this map is a homeomorphism.
Proof. To prove the first statement, fix φ ∈ Msa (K). It is obvious that ‖φ|C R (K) ‖ ≤ ‖φ‖. To prove the other inequality, fix for the moment ε > 0, and choose f ∈ C(K) such that ‖f ‖ ≤ 1, and
|φ(f )| ≥ ‖φ‖ − ε.
Choose a complex number λ with |λ| = 1, such that
|φ(f )| = λφ(f ) = φ(λf ).
If we write λf = g + ih, with g, h ∈ C R (K), then using the fact that φ is self-adjoint, the numbers φ(g) and φ(h) are real. Since φ(λf ) = φ(g) + iφ(h) is real, we will have
|φ(f )| = φ(g).
Since ‖g‖ ≤ ‖λf ‖ = ‖f ‖ ≤ 1, we will get
|φ(f )| ≤ ‖φ|C R (K) ‖,
so our choice of f will give
‖φ‖ − ε ≤ ‖φ|C R (K) ‖.
Since this holds for all ε > 0, we get
‖φ‖ ≤ ‖φ|C R (K) ‖.
The w∗ continuity (both ways) is obvious.
Convention. From now on, we will identify the space MR (K) with Msa (K).
Proposition 5.3. Let K be a compact Hausdorff space. For every p ∈ K, let
γp : C(K) → C be the map
γp : C(K) 3 f 7−→ f (p) ∈ C.

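(i) For every $p \in K$, the maps $\gamma_p$ and $\gamma_p^{\mathbb{R}} = \gamma_p|_{C^{\mathbb{R}}(K)} : C^{\mathbb{R}}(K) \to \mathbb{R}$ are linear and continuous.
(ii) For every $p \in K$, one has $\|\gamma_p\| = \|\gamma_p^{\mathbb{R}}\| = 1$.
(iii) The maps
\[ \Gamma_K : K \ni p \longmapsto \gamma_p \in M(K)_1, \qquad \Gamma_K^{\mathbb{R}} : K \ni p \longmapsto \gamma_p^{\mathbb{R}} \in M^{\mathbb{R}}(K)_1 \]
are injective and continuous, when the target spaces $M(K)_1$ and $M^{\mathbb{R}}(K)_1$ are equipped with the $w^*$ topology.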
Proof. (i)-(ii). The fact that γp is C-linear is obvious. This will also give the
R-linearity of γpR . The continuity follows from the obvious inequality
\[ |\gamma_p(f)| = |f(p)| \le \max_{q \in K} |f(q)| = \|f\|, \quad \forall\, f \in C(K). \]
Among other things, the above inequality also proves
\[ \|\gamma_p\| \le 1 \quad\text{and}\quad \|\gamma_p^{\mathbb{R}}\| \le 1. \]
The fact that these are in fact equalities follows from $\gamma_p(1) = 1$.
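(iii) Let us first prove the injectivity. Assume we have two points $p, q \in K$, with $p \ne q$. Use the Urysohn Lemma to find $f : K \to [0,1]$ continuous, such that $f(p) = 0$ and $f(q) = 1$. Then $f \in C^{\mathbb{R}}(K)$, $\gamma_p^{\mathbb{R}}(f) = f(p) = 0$, and $\gamma_q^{\mathbb{R}}(f) = f(q) = 1$, so we indeed have $\gamma_p^{\mathbb{R}} \ne \gamma_q^{\mathbb{R}}$. (This also implies $\gamma_p \ne \gamma_q$.)
To prove the continuity of the maps $\Gamma_K : K \to M(K)_1$ and $\Gamma_K^{\mathbb{R}} : K \to M^{\mathbb{R}}(K)_1$, we need to prove the continuity of the maps $\hat f \circ \Gamma_K : K \to \mathbb{C}$, $f \in C(K)$, and of the maps $\hat f \circ \Gamma_K^{\mathbb{R}} : K \to \mathbb{R}$, $f \in C^{\mathbb{R}}(K)$. (Recall that $\hat f(\phi) = \phi(f)$, $\forall\, \phi \in C^{\mathbb{K}}(K)^*$.) Notice however that we have in fact the equalities
\[ \hat f \circ \Gamma_K = f, \ \forall\, f \in C(K), \qquad \hat f \circ \Gamma_K^{\mathbb{R}} = f, \ \forall\, f \in C^{\mathbb{R}}(K), \]
so the desired continuity is automatic. □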
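Corollary 5.2. With the above notations, the spaces
\[ \Gamma(K) = \{\gamma_p : p \in K\} \subset M(K)_1 \quad\text{and}\quad \Gamma^{\mathbb{R}}(K) = \{\gamma_p^{\mathbb{R}} : p \in K\} \subset M^{\mathbb{R}}(K)_1 \]
are $w^*$-compact, and the maps
\[ \Gamma_K : K \to \Gamma(K) \quad\text{and}\quad \Gamma_K^{\mathbb{R}} : K \to \Gamma^{\mathbb{R}}(K) \]
are homeomorphisms.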
Here is an interesting application of the above result to topology.
Theorem 5.4 (Urysohn Metrizability Theorem). Let K be a compact Hausdorff space. The following are equivalent:
(i) K is metrizable;
(ii) K is second countable, i.e. the topology has a countable base;
(iiiR ) the Banach space C R (K) is separable;
(iiiC ) the Banach space C(K) is separable.
Proof. (i) ⇒ (ii). We already know this fact. (See the section on metric
spaces).
(ii) ⇒ (iiiR ). Assume K is second countable. Fix a countable base {Dn : n ∈
N} for the topology. Consider the countable set
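\[ \Delta = \{(m,n) \in \mathbb{N}^2 : \overline{D_m} \cap \overline{D_n} = \varnothing\}. \]
Claim: For any two points $p, q \in K$, with $p \ne q$, there exists a pair $(m,n) \in \Delta$ with $p \in D_m$ and $q \in D_n$.
Indeed, since $K$ is Hausdorff, there exist open sets $U_0, V_0 \subset K$ with $p \in U_0$, $q \in V_0$, and $U_0 \cap V_0 = \varnothing$. Since $K$ is (locally) compact, there exist open sets $U, V \subset K$, such that $p \in U \subset \overline{U} \subset U_0$ and $q \in V \subset \overline{V} \subset V_0$. Finally, since $\{D_n : n \in \mathbb{N}\}$ is a base for the topology, there exist $m, n \in \mathbb{N}$ such that $p \in D_m \subset U$ and $q \in D_n \subset V$. Then clearly we have $\overline{D_m} \subset \overline{U} \subset U_0$ and $\overline{D_n} \subset \overline{V} \subset V_0$, which forces $\overline{D_m} \cap \overline{D_n} = \varnothing$.
Having proven the Claim, for every pair $(m,n) \in \Delta$ we choose (using the Urysohn Lemma) a continuous function $h_{mn} : K \to [0,1]$ such that $h_{mn}|_{\overline{D_m}} = 0$ and $h_{mn}|_{\overline{D_n}} = 1$, and we define the countable family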
F = {hmn : (m, n) ∈ ∆}.
Using the Claim, we know that F separates the points of K. We set
P = {h ∈ C R (K) : h is a finite product of functions in F}.
Notice that P is still countable, it also separates the points of K, but also has the
property:
f, g ∈ P ⇒ f g ∈ P.
If we define
A = Span({1} ∪ P),
then A ⊂ C R (K) satisfies the hypothesis of the Stone-Weierstrass Theorem, hence
A is dense in C R (K). Notice that if we define
AQ = SpanQ ({1} ∪ P),
i.e. the set of linear combinations of elements in {1} ∪ P with rational coefficients,
then clearly AQ is dense in A, and so AQ is dense in C R (K). But now we are done,
since AQ is obviously countable.
(iiiR ) ⇒ (iiiC ). Assume C R (K) is separable. Let S ⊂ C R (K) be a countable
dense set. Then the set
S + iS = {f + ig : f, g ∈ S}
is clearly countable, and dense in C(K).
(iiiC ) ⇒ (i). Assume C(K) is separable. By the results from the previous
section, it follows that, when equipped with the w∗ topology, the compact space
M(K)1 is metrizable. Then the compact subset Γ(K) ⊂ M(K)1 is also metrizable.
Since K is homeomorphic to Γ(K), it follows that K itself is metrizable. 
Definition. Let $K$ be a compact Hausdorff space, and let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. A $\mathbb{K}$-linear map $\phi : C^{\mathbb{K}}(K) \to \mathbb{K}$ is said to be positive, if it has the property
\[ f \in C^{\mathbb{R}}(K),\ f \ge 0 \implies \phi(f) \ge 0. \]
Proposition 5.4 (Automatic continuity for positive linear maps). Let $K$ be a compact Hausdorff space, and let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$. Any positive $\mathbb{K}$-linear map $\phi : C^{\mathbb{K}}(K) \to \mathbb{K}$ is continuous. Moreover, one has the equality
\[ \|\phi\| = \phi(1). \]

Proof. In the case when $\mathbb{K} = \mathbb{C}$, it suffices to prove that $\phi|_{C^{\mathbb{R}}(K)}$ is continuous. Therefore, it suffices to prove the statement for $\mathbb{K} = \mathbb{R}$. Start with some arbitrary $f \in C^{\mathbb{R}}(K)$, and define the functions $f_\pm \in C^{\mathbb{R}}(K)$ by
\[ f_+ = \max\{f, 0\} \quad\text{and}\quad f_- = \max\{-f, 0\}, \]
so that $f_\pm \ge 0$, $f = f_+ - f_-$, and $\|f\| = \max\{\|f_+\|, \|f_-\|\}$. On the one hand, by
positivity, we have the inequalities φ(f± ) ≥ 0, so we get
−φ(f− ) ≤ φ(f+ ) − φ(f− ) ≤ φ(f+ ),
which give
(2) |φ(f )| = |φ(f+ ) − φ(f− )| ≤ max{φ(f+ ), φ(f− )}.
On the other hand, we have
kf± k · 1 − f± ≥ 0,
so by positivity we get
kf± k · φ(1) ≥ φ(f± ).
Using this in (2) gives
|φ(f )| ≤ φ(1) · max{kf+ k, kf− k} = φ(1) · kf k.
Since this holds for all f ∈ C R (K), the continuity of φ follows, together with the
estimate
kφk ≤ φ(1).
Since φ(1) ≤ kφk · k1k = kφk, the desired norm equality follows. 
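To see the equality $\|\phi\| = \phi(1)$ in the simplest possible setting, one can take $K$ to be a finite set, so that $C^{\mathbb{R}}(K)$ is $\mathbb{R}^n$ with the max norm and every positive functional is given by non-negative weights. The sketch below is an illustration under that assumption only; the names `functional`, `operator_norm` and the particular weights are hypothetical and not part of the text.

```python
import numpy as np

def functional(weights):
    """Return phi(f) = sum_i weights[i] * f[i] on C^R(K), K = {0, ..., n-1}."""
    w = np.asarray(weights, dtype=float)
    return lambda f: float(np.dot(w, np.asarray(f, dtype=float)))

def operator_norm(weights):
    """Norm of phi for the max norm on R^n; it is attained at f = sign(weights)."""
    return float(np.sum(np.abs(weights)))

# Non-negative weights, so phi is a positive functional (illustrative values).
weights = np.array([0.5, 2.0, 0.0, 1.25])
phi = functional(weights)
one = np.ones_like(weights)

norm_phi = operator_norm(weights)
phi_of_one = phi(one)

# Sample functions with ||f|| <= 1 and record the largest |phi(f)| seen.
rng = np.random.default_rng(0)
samples = rng.uniform(-1.0, 1.0, size=(1000, weights.size))
worst = max(abs(phi(f)) for f in samples)
```

In this finite model the bound $|\phi(f)| \le \phi(1)\,\|f\|$ is just the triangle inequality for a weighted sum, and it is attained at the constant function 1.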

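Notations. Let $K$ be a compact Hausdorff space. We define
\[ M_+^{\mathbb{K}}(K) = \{\phi : C^{\mathbb{K}}(K) \to \mathbb{K} : \phi \text{ is } \mathbb{K}\text{-linear, positive}\}; \]
\[ M_+^{\mathbb{K}}(K)_1 = \{\phi \in M_+^{\mathbb{K}}(K) : \|\phi\| \le 1\} = M_+^{\mathbb{K}}(K) \cap M^{\mathbb{K}}(K)_1. \]
When $\mathbb{K} = \mathbb{C}$, the superscript $\mathbb{C}$ will be omitted.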


Remarks 5.2. Let $K$ be a compact Hausdorff space. We have the inclusion $M_+(K) \subset M_{sa}(K)$. Indeed, if we start with $\phi \in M_+(K)$, then using the fact that every real-valued continuous function $f \in C(K)$ is a difference of non-negative continuous functions $f = f_+ - f_-$, it follows that $\phi(f) = \phi(f_+) - \phi(f_-)$ is a difference of two non-negative (hence real) numbers, so $\phi(f) \in \mathbb{R}$. This implies $\phi^\star = \phi$.
The set $M_+^{\mathbb{R}}(K)$ is $w^*$-closed in $M^{\mathbb{R}}(K)$, and the set $M_+(K)$ is $w^*$-closed in $M(K)$. This follows from the fact that, for each $f \in C^{\mathbb{R}}(K)$, the set
\[ A_f^{\mathbb{K}} = \{\phi \in M^{\mathbb{K}}(K) : \phi(f) \ge 0\} = \hat f^{-1}\bigl([0,\infty)\bigr) \]
is $w^*$-closed, being the preimage of a closed set under a $w^*$-continuous map. Then everything is a consequence of the equality
\[ M_+^{\mathbb{K}}(K) = \bigcap_{\substack{f \in C^{\mathbb{R}}(K) \\ f \ge 0}} A_f^{\mathbb{K}}. \]
In particular, the sets $M_+^{\mathbb{R}}(K)_1$ and $M_+(K)_1$ are $w^*$-compact.
The sets $M_+^{\mathbb{R}}(K)_1$ and $M_+(K)_1$ are convex.

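Using the identification $M^{\mathbb{R}}(K) \simeq M_{sa}(K)$, we have the following hierarchies:
\[
\begin{array}{ccc}
M_+^{\mathbb{R}}(K) & \simeq & M_+(K) \\
\cap & & \cap \\
M^{\mathbb{R}}(K) & \simeq & M_{sa}(K) \\
 & & \cap \\
 & & M(K)
\end{array}
\qquad\qquad
\begin{array}{ccc}
M_+^{\mathbb{R}}(K)_1 & \simeq & M_+(K)_1 \\
\cap & & \cap \\
M^{\mathbb{R}}(K)_1 & \simeq & M_{sa}(K)_1 \\
 & & \cap \\
 & & M(K)_1
\end{array}
\]
with each $\simeq$ an isometry and a $w^*$-homeomorphism.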
Proposition 5.5. Let $K$ be a compact Hausdorff space. Then one has the equality
\[ M_{sa}(K)_1 = \mathrm{conv}\bigl(M_+(K)_1 \cup -M_+(K)_1\bigr). \]
(Here conv denotes the convex hull.)
Proof. Denote the set $\mathrm{conv}\bigl(M_+(K)_1 \cup -M_+(K)_1\bigr)$ simply by $C$.
Claim: One has the equality
\[ C = \{t\phi - (1-t)\psi : \phi, \psi \in M_+(K)_1,\ t \in [0,1]\}. \tag{3} \]
In particular, the set $C$ is $w^*$-compact.
Denote the set on the right hand side of (3) simply by D. The inclusion C ⊃ D is
clear. To prove the inclusion C ⊂ D, we only need to prove that D is convex and it
contains M+ (K)1 ∪ −M+ (K)1 . The second property is clear. The convexity of D
is also clear, being a consequence of the convexity of ±M+ (K)1 .
The $w^*$-compactness of $C$ is then a consequence of the compactness of the product space
M+ (K)1 × M+ (K)1 × [0, 1],
and of the fact that C is the range of the continuous map
M+ (K)1 × M+ (K)1 × [0, 1] 3 (φ, ψ, t) 7−→ tφ − (1 − t)ψ ∈ Msa (K).
Having proven the Claim, we now proceed with the equality
Msa (K)1 = C.
The inclusion ⊃ is clear, since Msa (K)1 is convex, and it contains both M+ (K)1
and −M+ (K)1 .
We prove the other inclusion by contradiction. Assume there is some φ ∈
Msa (K)1 r C. Apply Corollary II.4.2 to find some f ∈ C(K) and a real number α,
such that
Re φ(f ) < α ≤ Re σ(f ), ∀ σ ∈ C.
If we take g = Re f , then this gives
φ(g) < α ≤ σ(g), ∀ σ ∈ C.
Notice that 0 ∈ C, so we get α ≤ 0. If we define β = −α(≥ 0), and h = −g, the
above inequality gives
φ(h) > β ≥ σ(h), ∀ σ ∈ C.
Using the obvious inclusions ±Γ(K) ⊂ C, we get
β ≥ ±γp (h) = ±h(p), ∀ p ∈ K.
Since h is real-valued, this will force khk ≤ β. But then we get a contradiction,
because we also have
β < φ(h) ≤ kφk · khk ≤ khk.

Corollary 5.3. Let K be a compact Hausdorff space, and let φ ∈ Msa (K).
Then there exist φ1 , φ2 ∈ M+ (K), such that φ = φ1 − φ2 , and kφk = kφ1 k + kφ2 k.
Proof. If $\phi \in M_+(K) \cup -M_+(K)$, there is nothing to prove. Assume $\phi \notin M_+(K) \cup -M_+(K)$, in particular $\phi \ne 0$. We define $\psi = \phi / \|\phi\|$, so that $\psi \in M_{sa}(K)_1$.
Find ψ1 , ψ2 ∈ M+ (K)1 and t ∈ [0, 1], such that
ψ = tψ1 − (1 − t)ψ2 .
Since ψ 6∈ M+ (K) ∪ −M+ (K), it follows that 0 < t < 1. Notice that
1 = kψk = ktψ1 − (1 − t)ψ2 k ≤ tkψ1 k + (1 − t)kψ2 k.
If kψ1 k < 1, or kψ2 k < 1, then this would imply tkψ1 k + (1 − t)kψ2 k < 1, which
is impossible by the above estimate. This argument proves that we must have
kψ1 k = kψ2 k = 1. If we define
φ1 = tkφkψ1 and φ2 = (1 − t)kφkψ2 ,
then $\|\phi_1\| = t\|\phi\|$ and $\|\phi_2\| = (1-t)\|\phi\|$, so we indeed have $\|\phi_1\| + \|\phi_2\| = \|\phi\|$. Obviously $\phi_1$ and $\phi_2$ are positive, and
\[ \phi_1 - \phi_2 = \|\phi\| \cdot \bigl(t\psi_1 - (1-t)\psi_2\bigr) = \|\phi\| \cdot \psi = \phi. \]
□
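When $K$ is a finite set, a self-adjoint functional is just a vector of real weights, and the decomposition $\phi = \phi_1 - \phi_2$ with $\|\phi\| = \|\phi_1\| + \|\phi_2\|$ reduces to splitting the weights into positive and negative parts. The sketch below illustrates this under that finite-set assumption; the weight values and helper names are hypothetical.

```python
import numpy as np

def norm(weights):
    """Norm of f -> sum_i weights[i] * f[i] with respect to the max norm."""
    return float(np.sum(np.abs(weights)))

# A real (self-adjoint) functional on C(K), K a 5-point set (illustrative values).
w = np.array([1.5, -0.5, 0.0, -2.0, 0.25])

# Positive and negative parts: both have non-negative weights.
w_plus = np.maximum(w, 0.0)
w_minus = np.maximum(-w, 0.0)

total = norm(w)
parts = norm(w_plus) + norm(w_minus)
```

Here the two parts have disjoint supports, which is what makes the norms add up exactly rather than merely satisfy the triangle inequality.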

Proposition 5.6. Let $K$ be a compact Hausdorff space. The set
\[ \mathrm{conv}\bigl(\Gamma(K) \cup \{0\}\bigr) \]
is $w^*$-dense in $M_+(K)_1$.
Proof. Let $C$ be the $w^*$-closure of $\mathrm{conv}\bigl(\Gamma(K) \cup \{0\}\bigr)$. It is obvious that $C \subset M_+(K)_1$, so we only need to prove the inclusion $M_+(K)_1 \subset C$. We do this by contradiction. Assume there exists some $\phi \in M_+(K)_1 \smallsetminus C$. Since $C$ is $w^*$-closed and convex, there exist some $f \in C(K)$ and a real number $\alpha$, such that
Re φ(f ) < α ≤ Re σ(f ), ∀ σ ∈ C.
In particular, if we take h = −Re f , and β = −α, we get
(4) φ(h) > β ≥ σ(h), ∀ σ ∈ C.
Since $0 \in C$, we have $\beta \ge 0$. Since $\Gamma(K) \subset C$, we also get
\[ \beta \ge \gamma_p(h) = h(p), \quad \forall\, p \in K, \]
which means that $\beta 1 - h \ge 0$. Since $\phi$ is positive, this forces $\phi(\beta 1 - h) \ge 0$,
which gives
φ(h) ≤ φ(β1) = βφ(1) = βkφk.
Finally, since kφk ≤ 1, this gives
φ(h) ≤ β,
thus contradicting (4). 
The results for the Banach spaces of the form C(K), with K compact Hausdorff
space, can be generalized, with suitable modifications, to the situation when K is
replaced with a locally compact space. The following result in fact reduces the
analysis to the compact case.
Theorem 5.5. Let $\Omega$ be a locally compact space, and let $\Omega^\beta$ be the Stone-Čech compactification of $\Omega$. Then the restriction map
\[ R : C^{\mathbb{K}}(\Omega^\beta) \ni f \longmapsto f|_\Omega \in C_b^{\mathbb{K}}(\Omega) \]
is an isometric linear isomorphism.
Proof. The linearity is obvious.
We show that $R$ is bijective, by exhibiting an inverse for it. For every $h \in C_b^{\mathbb{K}}(\Omega)$, we consider the compact set
\[ K_h = \{z \in \mathbb{K} : |z| \le \|h\|\}, \]
so that we can regard $h$ as a continuous map $\Omega \to K_h$. We know from the functoriality of the Stone-Čech compactification that there exists a unique continuous map $h^\beta : \Omega^\beta \to (K_h)^\beta$, with $h^\beta|_\Omega = h$. Since $K_h$ is compact, we have $(K_h)^\beta = K_h$. In particular, this gives the inequality
\[ |h^\beta(x)| \le \|h\|, \quad \forall\, x \in \Omega^\beta. \tag{5} \]
Define the map $T : C_b^{\mathbb{K}}(\Omega) \ni h \longmapsto h^\beta \in C^{\mathbb{K}}(\Omega^\beta)$, and let us show that $T$ is an inverse for $R$. The equality $R \circ T = \mathrm{Id}$ is trivial, by construction. To prove the equality $T \circ R = \mathrm{Id}$, we start with some $f \in C^{\mathbb{K}}(\Omega^\beta)$, and we consider $h = Rf$. Then $Th = h^\beta$, and since $h^\beta|_\Omega = h = f|_\Omega$, the density of $\Omega$ in $\Omega^\beta$ clearly forces $f = h^\beta = Th = T(Rf)$.
The fact that $R$ is isometric is now clear, because on the one hand we clearly have $\|Rf\| \le \|f\|$, $\forall\, f \in C^{\mathbb{K}}(\Omega^\beta)$, and on the other hand, by (5), we also have $\|Th\| \le \|h\|$, $\forall\, h \in C_b^{\mathbb{K}}(\Omega)$. □
If Ω is a locally compact space, the above result suggests that the space CbK (Ω)
is quite “large.” It is then natural to look at smaller spaces.
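Definitions. Let $\Omega$ be a locally compact space. If $\mathbb{K}$ is one of the fields $\mathbb{R}$ or $\mathbb{C}$, and $f : \Omega \to \mathbb{K}$ is a continuous function, we define the support of $f$ by
\[ \mathrm{supp}\, f = \overline{\{\omega \in \Omega : f(\omega) \ne 0\}}. \]
We define the space
\[ C_c^{\mathbb{K}}(\Omega) = \bigl\{f : \Omega \to \mathbb{K} : f \text{ continuous, with compact support}\bigr\}. \]
When $\mathbb{K} = \mathbb{C}$, this space will be denoted simply by $C_c(\Omega)$. Remark that, when equipped with pointwise addition and multiplication, the space $C_c^{\mathbb{K}}(\Omega)$ becomes a $\mathbb{K}$-algebra. One obviously has the inclusion $C_c^{\mathbb{K}}(\Omega) \subset C_b^{\mathbb{K}}(\Omega)$.
We define $C_0^{\mathbb{K}}(\Omega) = \overline{C_c^{\mathbb{K}}(\Omega)}$, the closure of $C_c^{\mathbb{K}}(\Omega)$ in $C_b^{\mathbb{K}}(\Omega)$. (When $\mathbb{K} = \mathbb{C}$, we will denote this space simply by $C_0(\Omega)$.) The Banach space $C_0^{\mathbb{K}}(\Omega)$ can be regarded as the completion of $C_c^{\mathbb{K}}(\Omega)$. Of course, when $\Omega$ is compact, we have the equality
\[ C_0^{\mathbb{K}}(\Omega) = C^{\mathbb{K}}(\Omega). \]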
The following result characterizes the Banach space C0K (Ω).
Proposition 5.7. Let Ω be a locally compact space. For a function f ∈ CbK (Ω),
the following are equivalent:
(i) f ∈ C0K (Ω);
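(ii) for every $\varepsilon > 0$, there exists some compact subset $K_\varepsilon \subset \Omega$, such that
\[ \sup_{\omega \in \Omega \smallsetminus K_\varepsilon} |f(\omega)| \le \varepsilon. \]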

Proof. (i) ⇒ (ii). Suppose $f \in C_0^{\mathbb{K}}(\Omega)$, which means that there exists some sequence $(f_n)_{n=1}^\infty \subset C_c^{\mathbb{K}}(\Omega)$, such that $\lim_{n\to\infty} f_n = f$, in the norm topology in $C_b^{\mathbb{K}}(\Omega)$. Fix some $\varepsilon > 0$, and choose $k \ge 1$, such that $\|f - f_k\| \le \varepsilon$. If we define $K_\varepsilon = \mathrm{supp}\, f_k$, then, for every $\omega \in \Omega \smallsetminus K_\varepsilon$, we have $f_k(\omega) = 0$, so the inequality $\|f - f_k\| \le \varepsilon$ forces $|f(\omega)| \le \varepsilon$.
(ii) ⇒ (i). Suppose $f$ satisfies property (ii). Fix for the moment an integer $n \ge 1$. Use condition (ii) to find a compact subset $K_n \subset \Omega$, such that
\[ |f(\omega)| \le \frac{1}{n}, \quad \forall\, \omega \in \Omega \smallsetminus K_n. \]
Use the Urysohn Lemma to choose some continuous function $h_n : \Omega \to [0,1]$, with compact support, such that $h_n|_{K_n} = 1$. Define the function $f_n = h_n f$, so that $f_n \in C_c^{\mathbb{K}}(\Omega)$. If $\omega \in \Omega \smallsetminus K_n$, then, using the inequality $0 \le h_n \le 1$, and the choice of $K_n$, we have
\[ |f(\omega) - f_n(\omega)| = |f(\omega)| \cdot [1 - h_n(\omega)] \le |f(\omega)| \le \frac{1}{n}. \]
Using the fact that $f_n|_{K_n} = f|_{K_n}$, the above inequality proves that $\|f - f_n\| \le \frac{1}{n}$. This way we have constructed a sequence $(f_n)_{n=1}^\infty \subset C_c^{\mathbb{K}}(\Omega)$, such that $\lim_{n\to\infty} f_n = f$ in $C_b^{\mathbb{K}}(\Omega)$, so by the definition it follows that $f \in C_0^{\mathbb{K}}(\Omega)$. □
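As a concrete illustration of condition (ii), here is a worked example on $\Omega = \mathbb{R}$. The function $f$ and the radius $R_\varepsilon$ are chosen here for illustration and do not appear in the text.

```latex
% Example: f(\omega) = 1/(1+\omega^2) belongs to C_0(\mathbb{R}).
% Given 0 < \varepsilon < 1, take the compact set
\[
  K_\varepsilon = [-R_\varepsilon, R_\varepsilon],
  \qquad R_\varepsilon = \sqrt{\tfrac{1}{\varepsilon} - 1}.
\]
% For |\omega| > R_\varepsilon we have 1 + \omega^2 > 1/\varepsilon, hence
\[
  \sup_{\omega \in \mathbb{R} \smallsetminus K_\varepsilon} |f(\omega)|
  = \sup_{|\omega| > R_\varepsilon} \frac{1}{1+\omega^2} \le \varepsilon .
\]
% By contrast, the constant function 1 fails (ii) for every \varepsilon < 1,
% since no compact subset of \mathbb{R} exhausts \mathbb{R}.
```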


The following establishes an interesting connection with the Alexandrov compactification.
Proposition 5.8. Let Ω be a locally compact space, which is non-compact,
and let Ωα = Ω t {∞} denote the Alexandrov compactification.
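(i) For every function $f \in C_0^{\mathbb{K}}(\Omega)$, the function $f^\alpha : \Omega^\alpha \to \mathbb{K}$, defined by $f^\alpha|_\Omega = f$ and $f^\alpha(\infty) = 0$, is continuous.
(ii) The correspondence $U : C_0^{\mathbb{K}}(\Omega) \ni f \longmapsto f^\alpha \in C^{\mathbb{K}}(\Omega^\alpha)$ is an isometric linear map.
(iii) One has the equality
\[ \mathrm{Ran}\, U = \bigl\{g \in C^{\mathbb{K}}(\Omega^\alpha) : g(\infty) = 0\bigr\}. \tag{6} \]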

Proof. (i). We know that Ω is open in Ωα , which immediately gives the fact
that f α is continuous at every point ω ∈ Ω. So all we need to show is the continuity
of f α at ∞. This amounts to showing that for every neighborhood N of f α (∞) = 0
in K, there exists a neighborhood V of ∞ in Ωα , such that f α (V ) ⊂ N . Start with
a neighborhood N of 0, and choose ε > 0, such that the set Bε = {z ∈ K : |z| ≤ ε}
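is contained in $N$. Choose some compact set $K_\varepsilon \subset \Omega$, such that
\[ \sup_{\omega \in \Omega \smallsetminus K_\varepsilon} |f(\omega)| \le \varepsilon. \]
Define the set $D = (\Omega \smallsetminus K_\varepsilon) \cup \{\infty\}$. By the definition of the topology on $\Omega^\alpha$, the set $D$ is an open neighborhood of $\infty$. We are now done, because we clearly have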
|f α (x)| ≤ ε, ∀ x ∈ D,
which gives the inclusion f α (D) ⊂ Bε ⊂ N .
(ii). This part is trivial.
(iii). Denote the right hand side of (6) by $A$. The inclusion $\mathrm{Ran}\, U \subset A$ is trivial, by definition. Conversely, let us start with some $g \in A$, and let us consider the function $f = g|_\Omega$. Let us show that $f \in C_0^{\mathbb{K}}(\Omega)$, using Proposition 5.7. Start with some $\varepsilon > 0$, and choose some open neighborhood $D_\varepsilon$ of $\infty$ in $\Omega^\alpha$, such that
\[ |g(x)| \le \varepsilon, \quad \forall\, x \in D_\varepsilon. \]
By definition, there exists a compact subset $K_\varepsilon \subset \Omega$, such that $D_\varepsilon = \Omega^\alpha \smallsetminus K_\varepsilon$, so it is immediate that $f$ satisfies condition (ii) from Proposition 5.7. Notice now that, by construction, we have $f^\alpha|_\Omega = g|_\Omega$ and $f^\alpha(\infty) = 0 = g(\infty)$, so we indeed get $g = Uf$. □

Remark 5.3. Let $\Omega$ be a locally compact space, which is non-compact. Use the map $U$ defined above to identify $C_0^{\mathbb{K}}(\Omega)$ with the subspace $\mathrm{Ran}\, U \subset C^{\mathbb{K}}(\Omega^\alpha)$. With this identification, we have the equality
\[ C^{\mathbb{K}}(\Omega^\alpha) = \mathbb{K}1 + C_0^{\mathbb{K}}(\Omega) = \bigl\{\lambda 1 + f : \lambda \in \mathbb{K},\ f \in C_0^{\mathbb{K}}(\Omega)\bigr\}. \]
Indeed, if we start with some function $g \in C^{\mathbb{K}}(\Omega^\alpha)$ and we take $\lambda = g(\infty)$ and $f = g - \lambda 1$, then $f(\infty) = 0$. Note that this argument proves that in fact every $g \in C^{\mathbb{K}}(\Omega^\alpha)$ can be uniquely represented as $g = \lambda 1 + f$, with $\lambda \in \mathbb{K}$ and $f \in C_0^{\mathbb{K}}(\Omega)$.
We conclude with a few generalizations of the various results in this section. Some are proven, and the rest are stated as exercises. The following result is a generalization of Proposition 5.4.
Proposition 5.9. Let Ω be a locally compact space, and let φ : C0R (Ω) → R be
a positive linear map. Then φ is continuous, and one has the equality
(7) kφk = sup{φ(f ) : f ∈ C0R (Ω), 0 ≤ f ≤ 1}.

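Proof. Let us denote the right hand side of (7) by $M$. First we show that $M < \infty$. If $M = \infty$, there exists a sequence $(f_n)_{n=1}^\infty \subset C_0^{\mathbb{R}}(\Omega)$, such that
\[ 0 \le f_n \le 1 \quad\text{and}\quad \phi(f_n) \ge 4^n, \quad \forall\, n \ge 1. \]
Consider then the function $f = \sum_{n=1}^\infty \frac{1}{2^n} f_n$. Since $\sum_{n=1}^\infty \bigl\|\frac{1}{2^n} f_n\bigr\| \le \sum_{n=1}^\infty \frac{1}{2^n} = 1$, the series converges in norm, so it follows that $f \in C_0^{\mathbb{R}}(\Omega)$. Notice however that, since we obviously have $\frac{1}{2^n} f_n \le f$, by the positivity of $\phi$, we get
\[ \phi(f) \ge \phi\Bigl(\frac{1}{2^n} f_n\Bigr) = \frac{1}{2^n}\,\phi(f_n) \ge 2^n, \quad \forall\, n \ge 1, \]
which is clearly impossible. Let us show now that $\phi$ is continuous, by proving the inequality
\[ |\phi(f)| \le M, \quad \forall\, f \in C_0^{\mathbb{R}}(\Omega), \text{ with } \|f\| \le 1. \tag{8} \]
Start with some arbitrary function $f \in C_0^{\mathbb{R}}(\Omega)$ with $\|f\| \le 1$. The functions $g^\pm = |f| \pm f \in C_0^{\mathbb{R}}(\Omega)$ clearly satisfy $g^\pm \ge 0$, so we get $\phi(|f| \pm f) \ge 0$, that is, $\phi(|f|) \ge \pm\phi(f)$. This gives $|\phi(f)| \le \phi(|f|)$, and since $0 \le |f| \le 1$, we immediately get (8).
The inequality (8) proves the inequality $\|\phi\| \le M$. Since we obviously have $M \le \|\phi\|$, we get in fact the equality (7). □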

Corollary 5.4. Let $\Omega$ be a locally compact space, which is non-compact, and let $\Omega^\alpha$ be the Alexandrov compactification of $\Omega$. Using the inclusion $C_0^{\mathbb{R}}(\Omega) \subset C^{\mathbb{R}}(\Omega^\alpha)$, given by Proposition 5.8, every positive linear map $\phi : C_0^{\mathbb{R}}(\Omega) \to \mathbb{R}$ can be uniquely extended to a positive linear map $\psi : C^{\mathbb{R}}(\Omega^\alpha) \to \mathbb{R}$, such that $\|\psi\| = \|\phi\|$.

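Proof. For every $g \in C^{\mathbb{R}}(\Omega^\alpha)$, we know that there exist a unique $\lambda \in \mathbb{R}$ and a unique $f \in C_0^{\mathbb{R}}(\Omega)$, such that $g = \lambda 1 + f$ (namely $\lambda = g(\infty)$ and $f = g - \lambda 1$). We then define $\psi(g) = \lambda\|\phi\| + \phi(f)$. Notice that $\psi(1) = \|\phi\|$. It is obvious that $\psi : C^{\mathbb{R}}(\Omega^\alpha) \to \mathbb{R}$ is linear, and $\psi|_{C_0^{\mathbb{R}}(\Omega)} = \phi$. Let us show that $\psi$ is positive. Start with some $g \in C^{\mathbb{R}}(\Omega^\alpha)$ with $g \ge 0$, and let us prove that $\psi(g) \ge 0$. Write $g = \lambda 1 + f$ with $\lambda \in \mathbb{R}$ and $f \in C_0^{\mathbb{R}}(\Omega)$. We know that $\lambda = g(\infty) \ge 0$. If $\lambda = 0$, then $g = f \ge 0$ and $\psi(g) = \phi(f) \ge 0$. If $\lambda > 0$, we define the function $h = \lambda^{-1} f \in C_0^{\mathbb{R}}(\Omega)$, so that $g = \lambda(1 + h)$. The positivity of $g$ forces $1 + h \ge 0$, which means that if we consider the function $h_- = \max\{-h, 0\} \in C_0^{\mathbb{R}}(\Omega)$, then we have $0 \le h_- \le 1$, as well as $h_- + h \ge 0$. Using Proposition 5.9, which gives $\phi(h_-) \le \|\phi\|$, we then get
\[ \|\phi\| + \phi(h) \ge \phi(h_-) + \phi(h) = \phi(h_- + h) \ge 0, \]
which means that $\psi(1 + h) \ge 0$. Consequently we also get
\[ \psi(g) = \psi(\lambda(1 + h)) = \lambda\,\psi(1 + h) \ge 0. \]
Having shown the positivity of $\psi$, we know that
\[ \|\psi\| = \psi(1) = \|\phi\|. \]
To prove uniqueness, start with another positive linear map $\xi : C^{\mathbb{R}}(\Omega^\alpha) \to \mathbb{R}$, such that $\|\xi\| = \|\phi\|$, with $\xi|_{C_0^{\mathbb{R}}(\Omega)} = \phi$. Since $\xi$ is positive, this forces $\xi(1) = \|\xi\| = \|\phi\| = \psi(1)$. But then we have
\[ \xi(\lambda 1 + f) = \lambda\|\phi\| + \phi(f) = \psi(\lambda 1 + f), \quad \forall\, \lambda \in \mathbb{R},\ f \in C_0^{\mathbb{R}}(\Omega), \]
which proves that $\xi = \psi$. □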
Remark 5.4. Let Ω be a locally compact space, which is not compact, and let
φ : CcR (Ω) → R be a positive linear map. Then the following are equivalent:
(i) φ is continuous;
(ii) $\sup\bigl\{\phi(f) : f \in C_c^{\mathbb{R}}(\Omega),\ 0 \le f \le 1\bigr\} < \infty$.
The implication (i) ⇒ (ii) is trivial. To prove the implication (ii) ⇒ (i) we follow
the exact same steps as in the proof of the equality (7) in Proposition 5.9. Denote
the quantity in (ii) by M , and using the inequality |φ(f )| ≤ φ(|f |), we immediately
get |φ(f )| ≤ M, ∀ f ∈ CcR (Ω), with kf k ≤ 1.
Remark also that if φ is as above, then we have in fact the equality

\[ \|\phi\| = \sup\bigl\{\phi(f) : f \in C_c^{\mathbb{R}}(\Omega),\ 0 \le f \le 1\bigr\}. \]
The following is a generalization of Corollary 5.3.
Proposition 5.10. Let Ω be a locally compact space, and let φ : C0R (Ω) → R be
a linear continuous map. Then there exist positive linear maps φ1 , φ2 : C0R (Ω) → R,
such that φ = φ1 − φ2 , and kφk = kφ1 k + kφ2 k.
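Proof. If $\Omega$ is compact there is nothing to prove (this is Corollary 5.3). Assume $\Omega$ is non-compact. Use the Hahn-Banach Theorem to find a linear continuous map $\psi : C^{\mathbb{R}}(\Omega^\alpha) \to \mathbb{R}$, with $\|\psi\| = \|\phi\|$ and $\psi|_{C_0^{\mathbb{R}}(\Omega)} = \phi$. Apply Corollary 5.3 to find two positive linear maps $\psi_1, \psi_2 : C^{\mathbb{R}}(\Omega^\alpha) \to \mathbb{R}$ such that $\psi = \psi_1 - \psi_2$ and $\|\psi\| = \|\psi_1\| + \|\psi_2\|$. Define the positive linear maps $\phi_k = \psi_k|_{C_0^{\mathbb{R}}(\Omega)}$, $k = 1, 2$. We clearly have $\phi = \phi_1 - \phi_2$, and
\[ \|\phi_1\| + \|\phi_2\| \le \|\psi_1\| + \|\psi_2\| = \|\psi\| = \|\phi\| = \|\phi_1 - \phi_2\| \le \|\phi_1\| + \|\phi_2\|, \]
which forces $\|\phi\| = \|\phi_1\| + \|\phi_2\|$. □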
Exercise 3. (Dini’s Theorem for locally compact spaces) Let Ω be a locally
compact space, let (fn )n≥1 ⊂ C0R (Ω) be a monotone sequence. Assume there is
some f ∈ C0R (Ω), such that
lim fn (ω) = f (ω), ∀ ω ∈ Ω.
n→∞
Then limn→∞ fn = f , in the norm topology.
Exercise 4. (Stone-Weierstrass Theorems) Let Ω be a locally compact space,
which is non-compact, and let A ⊂ C0K (Ω) be a subalgebra, with the following
separation properties
• For any two points ω1 , ω2 ∈ Ω, with ω1 6= ω2 , there exists f ∈ A such that
f (ω1 ) 6= f (ω2 ).
• For any ω ∈ Ω, there exists f ∈ A with f (ω) 6= 0.
A. Prove that, if $\mathbb{K} = \mathbb{R}$, then $A$ is dense in $C_0^{\mathbb{R}}(\Omega)$, in the norm topology.
B. Prove that, if $\mathbb{K} = \mathbb{C}$, and if $A$ has the property $f \in A \Rightarrow \bar f \in A$, then $A$ is dense in $C_0(\Omega)$.
Hint: Work in Ωα (use Remark 5.3), and prove that K1 + A is dense in C K (Ωα ).
Lectures 16-17

6. Hilbert spaces
In this section we examine a special type of Banach spaces.
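Definition. Let $\mathbb{K}$ be one of the fields $\mathbb{R}$ or $\mathbb{C}$, and let $X$ be a vector space over $\mathbb{K}$. An inner product on $X$ is a map
\[ X \times X \ni (\xi, \eta) \longmapsto (\xi|\eta) \in \mathbb{K}, \]
with the following properties:
• $(\xi|\xi) \ge 0$, $\forall\, \xi \in X$;
• if $\xi \in X$ satisfies $(\xi|\xi) = 0$, then $\xi = 0$;
• for any $\xi \in X$, the map $X \ni \eta \longmapsto (\xi|\eta) \in \mathbb{K}$ is $\mathbb{K}$-linear;
• $(\eta|\xi) = \overline{(\xi|\eta)}$, $\forall\, \xi, \eta \in X$.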

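Comments. Combining the last two properties, one gets
\[ (\xi|\lambda\eta_1 + \eta_2) = \lambda(\xi|\eta_1) + (\xi|\eta_2), \quad \forall\, \xi, \eta_1, \eta_2 \in X,\ \lambda \in \mathbb{K}; \]
\[ (\lambda\xi_1 + \xi_2|\eta) = \bar\lambda(\xi_1|\eta) + (\xi_2|\eta), \quad \forall\, \xi_1, \xi_2, \eta \in X,\ \lambda \in \mathbb{K}. \]
In particular, one has
\[ (\lambda\xi|\lambda\xi) = \bar\lambda\lambda(\xi|\xi) = |\lambda|^2 \cdot (\xi|\xi), \quad \forall\, \xi \in X,\ \lambda \in \mathbb{K}. \tag{1} \]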

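Proposition 6.1 (Cauchy-Bunyakovsky-Schwarz Inequality). Let $(\,\cdot\,|\,\cdot\,)$ be an inner product on the $\mathbb{K}$-vector space $X$. Then
\[ |(\xi|\eta)|^2 \le (\xi|\xi) \cdot (\eta|\eta), \quad \forall\, \xi, \eta \in X. \tag{2} \]
Moreover, if equality holds then $\xi$ and $\eta$ are proportional, in the sense that either $\xi = 0$, or $\eta = 0$, or $\xi = \lambda\eta$ for some $\lambda \in \mathbb{K}$.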
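Proof. Fix $\xi, \eta \in X$. Assume $\eta \ne 0$. (In the case when $\eta = 0$, both statements are trivial.) Choose a number $\lambda \in \mathbb{K}$, with $|\lambda| = 1$, such that
\[ |(\xi|\eta)| = \lambda(\xi|\eta) = (\xi|\lambda\eta). \]
Define the map $F : \mathbb{K} \to \mathbb{K}$ by
\[ F(z) = (z\lambda\eta + \xi \,|\, z\lambda\eta + \xi), \quad \forall\, z \in \mathbb{K}. \]
A simple computation gives
\begin{align*}
F(z) &= \bar z\bar\lambda z\lambda(\eta|\eta) + z\lambda(\xi|\eta) + \bar z\bar\lambda(\eta|\xi) + (\xi|\xi) \\
&= |z|^2|\lambda|^2(\eta|\eta) + z\lambda(\xi|\eta) + \overline{z\lambda(\xi|\eta)} + (\xi|\xi) \\
&= |z|^2(\eta|\eta) + z\,|(\xi|\eta)| + \bar z\,|(\xi|\eta)| + (\xi|\xi), \quad \forall\, z \in \mathbb{K}.
\end{align*}
In particular, when we restrict $F$ to $\mathbb{R}$, it becomes a quadratic function:
\[ F(t) = at^2 + bt + c, \quad \forall\, t \in \mathbb{R}, \]
where $a = (\eta|\eta) > 0$, $b = 2|(\xi|\eta)|$, $c = (\xi|\xi)$. Notice that we have
\[ F(t) \ge 0, \quad \forall\, t \in \mathbb{R}. \]
This forces $b^2 - 4ac \le 0$. This last inequality gives
\[ 4|(\xi|\eta)|^2 - 4(\xi|\xi) \cdot (\eta|\eta) \le 0, \]
so we get
\[ |(\xi|\eta)|^2 \le (\xi|\xi) \cdot (\eta|\eta), \]
and the inequality (2) is proven. Let us examine now when we have equality. The equality in (2) gives $b^2 - 4ac = 0$, which in terms of quadratic equations says that the equation
\[ F(t) = at^2 + bt + c = 0 \]
has a unique solution $t_0$. This will give
\[ (t_0\lambda\eta + \xi \,|\, t_0\lambda\eta + \xi) = F(t_0) = 0, \]
which forces $t_0\lambda\eta + \xi = 0$, i.e. $\xi = (-t_0\lambda)\eta$. □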
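The inequality (2) and its equality case can be checked numerically in $\mathbb{C}^n$ with the standard inner product. The sketch below is only an illustration: it assumes NumPy, uses `np.vdot` (which conjugates its first argument, matching the convention that the inner product is linear in the second variable), and the vectors and the scalar `lam` are arbitrary choices.

```python
import numpy as np

def inner(xi, eta):
    """Standard inner product on C^n, conjugate-linear in xi and linear in eta."""
    return np.vdot(xi, eta)

rng = np.random.default_rng(1)
xi = rng.normal(size=5) + 1j * rng.normal(size=5)
eta = rng.normal(size=5) + 1j * rng.normal(size=5)

# Generic vectors: the gap (xi|xi)(eta|eta) - |(xi|eta)|^2 is non-negative.
lhs = abs(inner(xi, eta)) ** 2
rhs = inner(xi, xi).real * inner(eta, eta).real
gap_generic = rhs - lhs

# Proportional vectors: the gap vanishes (up to rounding).
lam = 2.0 - 3.0j
xi_prop = lam * eta
rhs_prop = inner(xi_prop, xi_prop).real * inner(eta, eta).real
gap_prop = rhs_prop - abs(inner(xi_prop, eta)) ** 2
```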
Corollary 6.1. Let $(\,\cdot\,|\,\cdot\,)$ be an inner product on the $\mathbb{K}$-vector space $X$. Then the map
\[ X \ni \xi \longmapsto \sqrt{(\xi|\xi)} \in [0, \infty) \]
is a norm on $X$.
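Proof. Denote $\sqrt{(\xi|\xi)}$ simply by $\|\xi\|$. The fact that $\|\xi\|$ is non-negative is clear. The implication $\|\xi\| = 0 \Rightarrow \xi = 0$ is also clear. Using (1) we have
\[ \|\lambda\xi\| = \sqrt{(\lambda\xi|\lambda\xi)} = \sqrt{|\lambda|^2(\xi|\xi)} = |\lambda| \cdot \sqrt{(\xi|\xi)} = |\lambda| \cdot \|\xi\|, \quad \forall\, \xi \in X,\ \lambda \in \mathbb{K}. \]
Finally, for $\xi, \eta \in X$, we have
\begin{align*}
\|\xi + \eta\|^2 &= (\xi + \eta|\xi + \eta) = (\xi|\xi) + (\eta|\eta) + (\xi|\eta) + (\eta|\xi) \\
&= \|\xi\|^2 + \|\eta\|^2 + (\xi|\eta) + \overline{(\xi|\eta)} = \|\xi\|^2 + \|\eta\|^2 + 2\,\mathrm{Re}\,(\xi|\eta).
\end{align*}
We now use the C-B-S inequality, which reads
\[ |(\xi|\eta)| \le \|\xi\| \cdot \|\eta\|, \tag{3} \]
so the above computation gives
\begin{align*}
\|\xi + \eta\|^2 &= \|\xi\|^2 + \|\eta\|^2 + 2\,\mathrm{Re}\,(\xi|\eta) \le \|\xi\|^2 + \|\eta\|^2 + 2|(\xi|\eta)| \\
&\le \|\xi\|^2 + \|\eta\|^2 + 2\|\xi\| \cdot \|\eta\| = \bigl(\|\xi\| + \|\eta\|\bigr)^2,
\end{align*}
so we immediately get $\|\xi + \eta\| \le \|\xi\| + \|\eta\|$. □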
Definition. The norm constructed in the above result is called the norm defined by the inner product $(\,\cdot\,|\,\cdot\,)$.
Exercise 1. Use the above notations, and assume we have two vectors ξ, η 6= 0,
such that kξ +ηk = kξk+kηk. Prove that there exists some λ > 0 such that ξ = λη.
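Lemma 6.1. Let $X$ be a $\mathbb{K}$-vector space, equipped with an inner product.
(i) [Parallelogram Law] $\|\xi + \eta\|^2 + \|\xi - \eta\|^2 = 2\bigl(\|\xi\|^2 + \|\eta\|^2\bigr)$, $\forall\, \xi, \eta \in X$.
(ii) [Polarization Identities]
(a) If $\mathbb{K} = \mathbb{R}$, then
\[ (\xi|\eta) = \frac{1}{4}\bigl(\|\xi + \eta\|^2 - \|\xi - \eta\|^2\bigr), \quad \forall\, \xi, \eta \in X. \]
(b) If $\mathbb{K} = \mathbb{C}$, then
\[ (\xi|\eta) = \frac{1}{4}\sum_{k=0}^{3} i^{-k}\,\|\xi + i^k\eta\|^2, \quad \forall\, \xi, \eta \in X. \]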

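Proof. (i). This is obvious, since (see the computations in the proof of Corollary 6.1)
\[ \|\xi \pm \eta\|^2 = \|\xi\|^2 + \|\eta\|^2 \pm 2\,\mathrm{Re}\,(\xi|\eta). \]
(ii).(a). In the real case, the above identity gives
\[ \|\xi \pm \eta\|^2 = \|\xi\|^2 + \|\eta\|^2 \pm 2(\xi|\eta), \]
so we immediately get
\[ \|\xi + \eta\|^2 - \|\xi - \eta\|^2 = 4(\xi|\eta). \]
(b). For every $k \in \{0, 1, 2, 3\}$, we have
\[ \|\xi + i^k\eta\|^2 = \|\xi\|^2 + \|\eta\|^2 + 2\,\mathrm{Re}\,(\xi|i^k\eta) = \|\xi\|^2 + \|\eta\|^2 + i^k(\xi|\eta) + i^{-k}(\eta|\xi). \]
Then, when we sum up, we have
\[ \sum_{k=0}^{3} i^{-k}\,\|\xi + i^k\eta\|^2 = \bigl(\|\xi\|^2 + \|\eta\|^2\bigr)\sum_{k=0}^{3} i^{-k} + 4(\xi|\eta) + (\eta|\xi)\sum_{k=0}^{3} i^{-2k}. \]
Since
\[ \sum_{k=0}^{3} i^{-k} = \sum_{k=0}^{3} i^{-2k} = 0, \]
the above computation proves that we indeed have
\[ \sum_{k=0}^{3} i^{-k}\,\|\xi + i^k\eta\|^2 = 4(\xi|\eta). \]
□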
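The complex polarization identity and the Parallelogram Law can be sanity-checked numerically in $\mathbb{C}^n$. This is a minimal sketch assuming NumPy and the standard inner product, linear in the second variable; the helper names `inner`, `norm_sq` and `polarization` are hypothetical.

```python
import numpy as np

def inner(xi, eta):
    """Standard inner product on C^n, linear in the second variable."""
    return np.vdot(xi, eta)

def norm_sq(v):
    """Squared norm defined by the inner product."""
    return inner(v, v).real

def polarization(xi, eta):
    """Recover (xi|eta) from norms only: (1/4) * sum_k i^{-k} ||xi + i^k eta||^2."""
    return 0.25 * sum((1j) ** (-k) * norm_sq(xi + (1j) ** k * eta) for k in range(4))

rng = np.random.default_rng(3)
xi = rng.normal(size=4) + 1j * rng.normal(size=4)
eta = rng.normal(size=4) + 1j * rng.normal(size=4)

recovered = polarization(xi, eta)
direct = inner(xi, eta)

# Parallelogram Law: both sides computed from norms.
par_lhs = norm_sq(xi + eta) + norm_sq(xi - eta)
par_rhs = 2.0 * (norm_sq(xi) + norm_sq(eta))
```

With the opposite convention (linearity in the first variable) the factors $i^{-k}$ would have to be replaced by $i^{k}$, so the check also serves as a test of which convention is in force.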

Corollary 6.2. Let $X$ be a $\mathbb{K}$-vector space equipped with an inner product $(\,\cdot\,|\,\cdot\,)$. Then the map
\[ X \times X \ni (\xi, \eta) \longmapsto (\xi|\eta) \in \mathbb{K} \]
is continuous, with respect to the product topology.
Proof. Immediate from the polarization identities. □
Corollary 6.3. Let $X$ and $Y$ be two $\mathbb{K}$-vector spaces equipped with inner products $(\,\cdot\,|\,\cdot\,)_X$ and $(\,\cdot\,|\,\cdot\,)_Y$. If $T : X \to Y$ is an isometric linear map, then
\[ (T\xi|T\eta)_Y = (\xi|\eta)_X, \quad \forall\, \xi, \eta \in X. \]
Proof. Immediate from the polarization identities. □
Exercise 2. Let $X$ be a normed $\mathbb{K}$-vector space. Assume the norm satisfies the Parallelogram Law. Prove that there exists an inner product $(\,\cdot\,|\,\cdot\,)$ on $X$, such that
\[ \|\xi\| = \sqrt{(\xi|\xi)}, \quad \forall\, \xi \in X. \]
Hint: Define the inner product by the Polarization Identity, and then prove that it is indeed an inner product.

Proposition 6.2. Let $X$ be a $\mathbb{K}$-vector space, equipped with an inner product $(\,\cdot\,|\,\cdot\,)_X$. Let $Z$ be the completion of $X$ with respect to the norm defined by the inner product. Then $Z$ carries a unique inner product $(\,\cdot\,|\,\cdot\,)_Z$, so that the norm on $Z$ is defined by $(\,\cdot\,|\,\cdot\,)_Z$. Moreover, this inner product extends $(\,\cdot\,|\,\cdot\,)_X$, in the sense that
\[ (\langle\xi\rangle|\langle\eta\rangle)_Z = (\xi|\eta)_X, \quad \forall\, \xi, \eta \in X. \]
Proof. It is obvious that the norm on $Z$ satisfies the Parallelogram Law. We then apply Exercise 2. □
Definitions. Let K be one of the fields R or C. A Hilbert space over K is a
K-vector space, equipped with an inner product, which is complete with respect to
the norm defined by the inner product. Some textbooks use the term Euclidean for
real Hilbert spaces, and reserve the term Hilbert only for the complex case.
Examples 6.1. For $I$ a non-empty set, the space $\ell^2_{\mathbb{K}}(I)$ is a Hilbert space. We know that this is a Banach space. The inner product defining the norm is
\[ (\alpha|\beta) = \sum_{j \in I} \overline{\alpha(j)}\,\beta(j), \quad \forall\, \alpha, \beta \in \ell^2_{\mathbb{K}}(I). \]
The fact that the function $\bar\alpha\beta : I \to \mathbb{K}$ is summable is a consequence of Hölder's inequality.
More generally, a Banach space whose norm satisfies the Parallelogram Law is
a Hilbert space.
Definitions. Let $X$ be a $\mathbb{K}$-vector space, equipped with an inner product $(\,\cdot\,|\,\cdot\,)$. Two vectors $\xi, \eta \in X$ are said to be orthogonal, if $(\xi|\eta) = 0$. In this case we write $\xi \perp \eta$. Given a set $M \subset X$, and a vector $\xi \in X$, we write $\xi \perp M$, if
\[ \xi \perp \eta, \quad \forall\, \eta \in M. \]
Finally, two subsets $M, N \subset X$ are said to be orthogonal, and we write $M \perp N$, if
\[ \xi \perp \eta, \quad \forall\, \xi \in M,\ \eta \in N. \]
Notation. Let $X$ be a vector space equipped with an inner product. For a subset $M \subset X$, we define the set
\[ M^\perp = \{\xi \in X : \xi \perp M\}. \]

Remarks 6.1. Let X be a K-vector space equipped with an inner product.


A. The relation ⊥ is symmetric.
B. If $\xi, \eta \in X$ satisfy $\xi \perp \eta$, then one has the Pythagorean Theorem:
\[ \|\xi + \eta\|^2 = \|\xi\|^2 + \|\eta\|^2. \]
This is a consequence of the equality $\|\xi + \eta\|^2 = \|\xi\|^2 + \|\eta\|^2 + 2\,\mathrm{Re}\,(\xi|\eta)$.
C. If $M \subset X$ is an arbitrary subset, then $M^\perp$ is a closed linear subspace of $X$. This follows from the linearity of the inner product in the second variable, and from the continuity.
D. For sets M ⊂ N ⊂ X, one has
M⊥ ⊃ N ⊥ .
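E. For any set $M \subset X$, one has
\[ M^\perp = \bigl(\overline{\mathrm{Span}}\, M\bigr)^\perp, \]
where $\overline{\mathrm{Span}}\, M$ denotes the norm closure of the linear span of $M$. The inclusion
\[ M^\perp \supset \bigl(\overline{\mathrm{Span}}\, M\bigr)^\perp \]
is trivial, since we have $M \subset \overline{\mathrm{Span}}\, M$. Conversely, if $\xi \in M^\perp$, then $M \subset \{\xi\}^\perp$. But since $\{\xi\}^\perp$ is a closed linear subspace, this gives
\[ \overline{\mathrm{Span}}\, M \subset \{\xi\}^\perp, \]
i.e. $\xi \in \bigl(\overline{\mathrm{Span}}\, M\bigr)^\perp$.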
The following result gives a very interesting property of Hilbert spaces.
Proposition 6.3. Let $H$ be a Hilbert space, let $C \subset H$ be a non-empty closed convex set. For every $\xi \in H$, there exists a unique vector $\xi' \in C$, such that
\[ \|\xi - \xi'\| = \mathrm{dist}(\xi, C). \]
Proof. Denote $\mathrm{dist}(\xi, C)$ simply by $\delta$. By definition, we have
\[ \delta = \inf_{\eta \in C} \|\xi - \eta\|. \]

Choose a sequence (ηn )n≥1 ⊂ C, such that limn→∞ kξ − ηn k = δ.


Claim: One has the inequality
kηm − ηn k2 ≤ 2kξ − ηm k2 + 2kξ − ηn k2 − 4δ 2 , ∀ m, n ≥ 1.
Use the Parallelogram Law
(4) 2kξ − ηm k2 + 2kξ − ηn k2 = k2ξ − ηm − ηn k2 + kηm − ηn k2 .
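We notice that, since $C$ is convex, we have $\frac{1}{2}(\eta_m + \eta_n) \in C$, hence
\[ \|\xi - \tfrac{1}{2}(\eta_m + \eta_n)\| \ge \delta, \]
so we get
\[ \|2\xi - \eta_m - \eta_n\|^2 = 4\|\xi - \tfrac{1}{2}(\eta_m + \eta_n)\|^2 \ge 4\delta^2, \]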
so if we go back to (4) we get
2kξ − ηm k2 + 2kξ − ηn k2 = k2ξ − ηm − ηn k2 + kηm − ηn k2 ≥ 4δ 2 + kηm − ηn k2 ,
and the Claim follows.
Having proven the Claim, we now notice that, since $\lim_{n\to\infty} \|\xi - \eta_n\| = \delta$, we immediately get the fact that the sequence $(\eta_n)_{n \ge 1}$ is Cauchy. Since $H$ is complete, it follows that the sequence is convergent to some point $\xi'$. Since $C$ is closed, it follows that $\xi' \in C$. So far we have
\[ \|\xi - \xi'\| = \lim_{n\to\infty} \|\xi - \eta_n\| = \delta = \mathrm{dist}(\xi, C), \]
thus proving the existence.
Let us prove now the uniqueness. Assume $\xi'' \in C$ is another point such that $\|\xi - \xi''\| = \delta$. Using the Parallelogram Law, we have
\[ 4\delta^2 = 2\|\xi - \xi'\|^2 + 2\|\xi - \xi''\|^2 = \|2\xi - \xi' - \xi''\|^2 + \|\xi' - \xi''\|^2. \]
If $\xi' \ne \xi''$, then we will have
\[ 4\delta^2 > \|2\xi - \xi' - \xi''\|^2 = 4\|\xi - \tfrac{1}{2}(\xi' + \xi'')\|^2, \]
so we have a new vector $\eta = \frac{1}{2}(\xi' + \xi'') \in C$, such that
\[ \|\xi - \eta\| < \delta, \]
thus contradicting the definition of $\delta$. □
Definition. Let H be a Hilbert space, and let X ⊂ H be a closed linear
subspace. For every ξ ∈ H, using the above result, we let PX ξ ∈ X denote the
unique vector in X with the property
kξ − PX ξk = dist(ξ, X).
This way we have constructed a map $P_X : H \to H$, which is called the orthogonal projection onto $X$.
The properties of the orthogonal projection are summarized in the following
result.
Proposition 6.4. Let H be a Hilbert space, and let X ⊂ H be a closed linear
subspace.
(i) For vectors ξ ∈ H and ζ ∈ X one has the equivalence
ζ = PX ξ ⇐⇒ (ξ − ζ) ⊥ X.

(ii) $P_X|_X = \mathrm{Id}_X$.
(iii) The map PX : H → X is linear, continuous. If X 6= {0}, then kPX k = 1.
(iv) Ran PX = X and Ker PX = X⊥ .
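Proof. (i). “⇒.” Assume $\zeta = P_X\xi$. Fix an arbitrary vector $\eta \in X \smallsetminus \{0\}$, and choose a number $\lambda \in \mathbb{K}$, with $|\lambda| = 1$, such that
\[ \lambda(\xi - \zeta|\eta) = |(\xi - \zeta|\eta)|. \]
In particular, we have
\[ |(\xi - \zeta|\eta)| = \mathrm{Re}\,(\xi - \zeta|\lambda\eta). \]
Define the map $F : \mathbb{R} \to \mathbb{R}$ by
\[ F(t) = \|\xi - \zeta - t\lambda\eta\|^2 - \|\xi - \zeta\|^2. \]
Since $\zeta + t\lambda\eta \in X$ and $\zeta = P_X\xi$ is the unique point of $X$ closest to $\xi$, we have
\[ F(t) > 0, \quad \forall\, t \in \mathbb{R} \smallsetminus \{0\}. \tag{5} \]
Notice that $F(t) = at^2 + bt$, $\forall\, t \in \mathbb{R}$, where $a = (\lambda\eta|\lambda\eta) = \|\eta\|^2$, and $b = -2\,\mathrm{Re}\,(\xi - \zeta|\lambda\eta) = -2|(\xi - \zeta|\eta)|$. Of course, the property
\[ at^2 + bt > 0, \quad \forall\, t \in \mathbb{R} \smallsetminus \{0\} \]
forces $b = 0$, so we indeed get $(\xi - \zeta|\eta) = 0$.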
“⇐.” Assume (ξ − ζ) ⊥ X. For any η ∈ X, we have (ξ − ζ) ⊥ (ζ − η), so using
the Pythagorean Theorem, we get
\[ \|\xi - \eta\|^2 = \|\xi - \zeta\|^2 + \|\zeta - \eta\|^2, \]
which forces
\[ \|\xi - \eta\| \ge \|\xi - \zeta\|, \quad \forall\, \eta \in X. \]
This proves that
\[ \|\xi - \zeta\| = \operatorname{dist}(\xi, X), \]
i.e. ζ = P_X ξ.
(ii). This property is pretty clear. If ξ ∈ X, then 0 = ξ − ξ is orthogonal to X,
so by (i) we get ξ = PX ξ.
(iii). We prove the linearity of P_X. Start with vectors ξ₁, ξ₂ ∈ H and a scalar
λ ∈ K. Take ζ₁ = P_X ξ₁ and ζ₂ = P_X ξ₂. Consider the vector ζ = λζ₁ + ζ₂. For any
η ∈ X, we have
\[ (\lambda\xi_1 + \xi_2 - \zeta \mid \eta) = \big((\lambda\xi_1 - \lambda\zeta_1) + (\xi_2 - \zeta_2) \mid \eta\big) = (\lambda\xi_1 - \lambda\zeta_1 \mid \eta) + (\xi_2 - \zeta_2 \mid \eta) = \overline{\lambda}\,(\xi_1 - \zeta_1 \mid \eta) + (\xi_2 - \zeta_2 \mid \eta) = 0. \]
Here the last equality holds because, by (i), we have (ξ₁ − ζ₁) ⊥ X and (ξ₂ − ζ₂) ⊥ X. The above computation proves
that
(λξ₁ + ξ₂ − ζ) ⊥ X,
so using (i) we get
P_X(λξ₁ + ξ₂) = ζ = λζ₁ + ζ₂ = λP_X ξ₁ + P_X ξ₂,
so P_X is indeed linear.
To prove the continuity, we start with an arbitrary vector ξ ∈ H and we use
the fact that (ξ − P_X ξ) ⊥ P_X ξ. By the Pythagorean Theorem we then have
\[ \|\xi\|^2 = \|(\xi - P_X\xi) + P_X\xi\|^2 = \|\xi - P_X\xi\|^2 + \|P_X\xi\|^2 \ge \|P_X\xi\|^2. \]
In other words, we have
\[ \|P_X\xi\| \le \|\xi\|, \quad \forall\, \xi \in H, \]
so P_X is indeed continuous, and we have ‖P_X‖ ≤ 1. Using (ii) we immediately get
that, when X ≠ {0}, we have ‖P_X‖ = 1.
(iv). The equality Ran PX = X is trivial by the construction of PX and by (ii).
If ξ ∈ Ker PX , then by (i), we have ξ ∈ X⊥ . Conversely, if ξ ⊥ X, then ζ = 0
satisfies the condition in (i), i.e. PX ξ = 0. 
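As a finite-dimensional illustration of the Proposition (a sketch only, not part of the original notes), one can check criterion (i) numerically in R³. The helper names `dot` and `project`, the plane X and the sample vector are all hypothetical choices. The candidate ζ is built as a sum of components along an orthonormal spanning family of X, and the code then verifies that ξ − ζ is orthogonal to X, that ‖ζ‖ ≤ ‖ξ‖, and that projecting twice changes nothing.

```python
import math

def dot(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def project(xi, onb):
    """Candidate for P_X xi, where X is spanned by the orthonormal list `onb`."""
    zeta = [0.0] * len(xi)
    for u in onb:
        c = dot(u, xi)
        zeta = [z + c * a for z, a in zip(zeta, u)]
    return zeta

# X = plane in R^3 spanned by two orthonormal vectors (illustrative choice).
s = 1 / math.sqrt(2)
onb = [[s, s, 0.0], [0.0, 0.0, 1.0]]
xi = [3.0, 1.0, 2.0]

zeta = project(xi, onb)
residual = [a - b for a, b in zip(xi, zeta)]

# Criterion (i): the residual is orthogonal to every spanning vector of X.
orth_defects = [abs(dot(residual, u)) for u in onb]
# Part (iii): the projection does not increase the norm.
norm_xi = math.sqrt(dot(xi, xi))
norm_zeta = math.sqrt(dot(zeta, zeta))
# Part (ii): projecting a vector that already lies in X changes nothing.
zeta_again = project(zeta, onb)
```

Since the residual is orthogonal to both spanning vectors, it is orthogonal to all of X, so by (i) the vector `zeta` is the orthogonal projection of `xi`.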
Corollary 6.4. If H is a Hilbert space, and X ⊂ H is a closed linear subspace,
then
X + X⊥ = H and X ∩ X⊥ = {0}.
In other words the map
(6) X × X⊥ ∋ (η, ζ) ⟼ η + ζ ∈ H
is a linear isomorphism.
Proof. If ξ ∈ H then P_X ξ ∈ X, and ξ − P_X ξ ∈ X⊥, and then the equality
ξ = P_X ξ + (ξ − P_X ξ)
proves that ξ ∈ X + X⊥. The equality X ∩ X⊥ = {0} is trivial, since for ζ ∈ X ∩ X⊥,
we must have ζ ⊥ ζ, which forces ζ = 0.

Exercise 3. Let H be a Hilbert space.
(i) Prove that, for any closed subspace X ⊂ H, one has the equality
P_{X⊥} = I − P_X.
(ii) Prove that, for two closed subspaces X, Y ⊂ H, the following are equivalent:
– X ⊥ Y;
– P_X P_Y = 0;
– P_Y P_X = 0.
(iii) Prove that, for two closed subspaces X, Y ⊂ H, the following are equivalent:
– X ⊂ Y;
– P_X P_Y = P_X;
– P_Y P_X = P_X.
(iv) Prove that, if X, Y ⊂ H are closed subspaces such that X ⊥ Y, then
– X + Y is a closed linear subspace of H;
– P_{X+Y} = P_X + P_Y.
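Corollary 6.5. Let H be a Hilbert space, and let X ⊂ H be a linear (not
necessarily closed) subspace. Then one has the equality
\[ \overline{X} = (X^\perp)^\perp. \]
Proof. Denote the closed subspace (X⊥)⊥ by Z. Since X⊥ = (\overline{X})⊥, by the
previous exercise we have
\[ P_Z = I - P_{X^\perp} = I - P_{\overline{X}^\perp} = I - (I - P_{\overline{X}}) = P_{\overline{X}}, \]
which forces
\[ Z = \operatorname{Ran} P_Z = \operatorname{Ran} P_{\overline{X}} = \overline{X}. \]
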
Theorem 6.1 (Riesz' Representation Theorem). Let H be a Hilbert space over
K, and let φ : H → K be a linear continuous map. Then there exists a unique
vector ξ ∈ H, such that
\[ \phi(\eta) = (\xi \mid \eta), \quad \forall\, \eta \in H. \]
Moreover one has ‖ξ‖ = ‖φ‖.

Proof. First we show the existence. If φ = 0, we simply take ξ = 0. Assume
φ ≠ 0. Define the subspace X = Ker φ. Notice that X is closed. Using the linear
isomorphism (6) we see that the composition
\[ X^\perp \hookrightarrow H \xrightarrow{\ \text{quotient map}\ } H/X \]
is a linear isomorphism. Since
\[ H/X = H/\operatorname{Ker}\phi \simeq \operatorname{Ran}\phi = K, \]
it follows that dim(X⊥) = 1. In other words, there exists ξ₀ ∈ X⊥, ξ₀ ≠ 0, such
that
X⊥ = Kξ₀.
Start now with some arbitrary vector η ∈ H. On the one hand, using the equality
Kξ₀ + X = H, there exist λ ∈ K and ζ ∈ X, such that
η = λξ₀ + ζ,
and since ζ ∈ X = Ker φ, we get
φ(η) = φ(λξ₀) = λφ(ξ₀).
On the other hand, we have
\[ (\xi_0 \mid \eta) = (\xi_0 \mid \lambda\xi_0) + (\xi_0 \mid \zeta) = \lambda\|\xi_0\|^2, \]
so if we define ξ = \overline{φ(ξ₀)} ‖ξ₀‖⁻² ξ₀ (the inner product being conjugate linear in the first variable) we will have
\[ (\xi \mid \eta) = \big(\overline{\phi(\xi_0)}\,\|\xi_0\|^{-2}\,\xi_0 \mid \eta\big) = \phi(\xi_0)\,\|\xi_0\|^{-2}\,(\xi_0 \mid \eta) = \lambda\phi(\xi_0) = \phi(\eta). \]


To prove uniqueness, assume ξ′ ∈ H is another vector with
\[ \phi(\eta) = (\xi' \mid \eta), \quad \forall\, \eta \in H. \]
In particular, we have
\[ \|\xi - \xi'\|^2 = (\xi - \xi' \mid \xi - \xi') = (\xi \mid \xi - \xi') - (\xi' \mid \xi - \xi') = \phi(\xi - \xi') - \phi(\xi - \xi') = 0, \]
which forces ξ = ξ′.
Finally, to prove the norm equality, we first observe that when ξ = 0, the
equality is trivial. If ξ ≠ 0, then on the one hand, using the C-B-S inequality we have
\[ |\phi(\eta)| = |(\xi \mid \eta)| \le \|\xi\| \cdot \|\eta\|, \quad \forall\, \eta \in H, \]
so we immediately get ‖φ‖ ≤ ‖ξ‖. If we take the vector ζ = ‖ξ‖⁻¹ξ, then ‖ζ‖ = 1,
and
\[ \phi(\zeta) = \big(\xi \mid \|\xi\|^{-1}\xi\big) = \|\xi\|, \]
so we also have ‖φ‖ ≥ ‖ξ‖.
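The construction in the proof can be traced on a small example. The following is a minimal sketch in C², assuming the inner product is conjugate linear in the first variable; the functional `phi`, its coefficients `c`, and the vector `xi0` are hypothetical choices made only for illustration.

```python
def inner(x, y):
    """Inner product on C^n, conjugate linear in the first variable."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

# A linear functional on C^2 (coefficients chosen for illustration).
c = [1 + 2j, 3 + 0j]

def phi(eta):
    return sum(ci * ei for ci, ei in zip(c, eta))

# A non-zero vector orthogonal to Ker(phi): any multiple of conj(c) works,
# because (conj(c) | eta) = phi(eta) vanishes on Ker(phi).
xi0 = [2 * ci.conjugate() for ci in c]

# The representing vector from the proof: xi = conj(phi(xi0)) * ||xi0||^-2 * xi0.
scale = phi(xi0).conjugate() / inner(xi0, xi0).real
xi = [scale * a for a in xi0]

# Check phi(eta) = (xi | eta) on a few sample vectors.
samples = [[1 + 0j, 0j], [0j, 1 + 0j], [2 - 1j, -4 + 5j]]
errors = [abs(phi(eta) - inner(xi, eta)) for eta in samples]
```

In this example the result does not depend on the multiple of conj(c) chosen for `xi0`, in agreement with the uniqueness part of the theorem.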

In the remainder of this section we discuss a Hilbert space notion of linear
independence. This should be thought of as a “rigid” linear independence.
Definition. Let X be a K-vector space, equipped with an inner product. A
set F ⊂ X is said to be orthogonal, if 0 ∉ F, and
ξ ⊥ η, ∀ ξ, η ∈ F, with ξ ≠ η.
A set F ⊂ X is said to be orthonormal, if it is orthogonal, and it also satisfies:
‖ξ‖ = 1, ∀ ξ ∈ F.
Remark that, if one starts with an orthogonal set F ⊂ X, then the set
\[ F^{(1)} = \big\{ \|\xi\|^{-1}\xi : \xi \in F \big\} \]
is orthonormal.
Proposition 6.5. Let X be a K-vector space equipped with an inner product.
Any orthogonal set F ⊂ X is linearly independent.

Proof. Indeed, if one starts with a vanishing linear combination
λ₁ξ₁ + ⋯ + λₙξₙ = 0,
with λ₁, …, λₙ ∈ K, ξ₁, …, ξₙ ∈ F, such that ξ_k ≠ ξ_ℓ, for all k, ℓ ∈ {1, …, n} with
k ≠ ℓ, then for each k ∈ {1, …, n} we clearly have
\[ \lambda_k \|\xi_k\|^2 = (\xi_k \mid \lambda_1\xi_1 + \dots + \lambda_n\xi_n) = 0, \]
and since ξ_k ≠ 0, we get λ_k = 0.

Lemma 6.2. Let X be a K-vector space equipped with an inner product, and let
F ⊂ X be an orthogonal set. Then there exists a maximal (with respect to inclusion)
orthogonal set G ⊂ X with F ⊂ G.
Proof. Consider the sets
A = {G : G orthogonal subset of X},
A_F = {G ∈ A : G ⊃ F},
ordered with the inclusion. We are going to apply Zorn's Lemma to A_F. Let
T ⊂ A_F be a subcollection, which is totally ordered, i.e. for any G₁, G₂ ∈ T one has
G₁ ⊂ G₂ or G₁ ⊃ G₂. Define the set
\[ M = \bigcup_{G \in T} G. \]
Since G ⊂ X ∖ {0}, for all G ∈ T, it is clear that M ⊂ X ∖ {0}. If ξ₁, ξ₂ ∈ M
are vectors with ξ₁ ≠ ξ₂, then we can find G₁, G₂ ∈ T with ξ₁ ∈ G₁ and ξ₂ ∈ G₂.
Using the fact that T is totally ordered, it follows that there is k ∈ {1, 2} such that
ξ₁, ξ₂ ∈ G_k, so we indeed get ξ₁ ⊥ ξ₂. It is now clear that M ∈ A_F, and M ⊃ G, for
all G ∈ T. In other words, we have shown that every totally ordered subset of A_F
has an upper bound in A_F. By Zorn's Lemma, A_F has a maximal element. Finally,
it is clear that any maximal element for A_F is also a maximal element in A.

Remark 6.2. Using the notations from the proof above, given an orthonormal
set M ⊂ X, the following are equivalent:
(i) M is maximal in A;
(ii) M is maximal in
A^{(1)} = {G : G orthonormal subset of X}.
The implication (i) ⇒ (ii) is trivial. Conversely, if M is maximal in A^{(1)}, we use
the Lemma to find a maximal N ∈ A with N ⊃ M. But then N^{(1)} is orthonormal,
and N^{(1)} ⊃ M, which by the maximality of M in A^{(1)} will force N^{(1)} = M. Since N
is linearly independent, the relations
N^{(1)} = M ⊂ N,
will force N = N^{(1)} = M.
Comment. In linear algebra we know that a linearly independent set is max-
imal, if and only if it spans the whole space. In the case of orthogonal sets, this
statement has a version described by the following result.
Theorem 6.2. Let H be a Hilbert space, and let F be an orthogonal set in H.
The following are equivalent:
(i) F is maximal among all orthogonal subsets of H;
(ii) Span F is dense in H in the norm topology.

Proof. (i) ⇒ (ii). Assume F is maximal. We are going to show that Span F is
dense in H, by contradiction. Denote the closure \overline{Span F} simply by X, and assume
X ⊊ H. Since
\[ X = (X^\perp)^\perp, \]
we see that the strict inclusion X ⊊ H forces X⊥ ≠ {0}. But now if we take a
non-zero vector ξ ∈ X⊥, we immediately see that the set F ∪ {ξ} is still orthogonal,
thus contradicting the maximality of F.
(ii) ⇒ (i). Assume Span F is dense in H, and let us prove that F is maximal.
We do this by contradiction. If F is not maximal, then there exists ξ ∈ H ∖ F, such
that F ∪ {ξ} is still orthogonal. This would force ξ ⊥ F, so we will also have
ξ ⊥ Span F.
But since Span F is dense in H, this will give ξ ⊥ H. In particular we have ξ ⊥ ξ,
which would force ξ = 0, thus contradicting the fact that F ∪ {ξ} is orthogonal.
(Recall that all elements of an orthogonal set are non-zero.)
Definition. Let H be a Hilbert space. An orthonormal set B ⊂ H, which is
maximal among all orthogonal (or orthonormal) subsets of H, is called an
orthonormal basis for H.
By Lemma ??, we know that given any orthonormal set F ⊂ H, there exists an
orthonormal basis B ⊃ F.
By the above result, an orthonormal set B ⊂ H is an orthonormal basis for H,
if and only if Span B is dense in H.
Example 6.2. Let I be a non-empty set. Consider the Hilbert space ℓ²_K(I).
Consider (see section II.2) the set
B = {δ_i : i ∈ I}.
Then
Span B = fin_K(I),
which is dense in ℓ²_K(I). The above result then says that B is an orthonormal basis
for ℓ²_K(I).
The following exercise will be useful in the discussion of another interesting
example.
Exercise 4. Equip the space C([0, 1]) with the inner product
\[ (f \mid g) = \int_0^1 \overline{f(t)}\,g(t)\,dt, \quad f, g \in C([0, 1]). \]
The norm defined by this inner product is
\[ \|f\|_2 = \Big( \int_0^1 |f(t)|^2\,dt \Big)^{1/2}, \quad f \in C([0, 1]). \]

Define the maps e_n : [0, 1] ∋ t ⟼ exp(2nπit) ∈ T, n ∈ Z. (Here T denotes the unit
circle in C.) Prove that the set
B = {e_n : n ∈ Z}
is orthonormal in C([0, 1]), and Span B is dense in C([0, 1]) in the topology defined
by the norm ‖·‖₂.
Hints: Define the space
P = {f ∈ C([0, 1]) : f(0) = f(1)}.
Prove that P is dense in C([0, 1]) in the topology defined by the norm ‖·‖₂.
Prove that the map
Φ : C(T) ∋ F ⟼ F ∘ e₁ ∈ P
is a linear isomorphism, which is isometric with respect to the uniform norms.
In order to prove that Span B is dense in C([0, 1]) with respect to ‖·‖₂, it suffices to show
that Span B is dense in P in the uniform norm. Equivalently, it suffices to show that
Φ⁻¹(Span B)
is dense in C(T), with respect to the uniform norm. To get this density use the Stone-Weierstrass
Theorem, plus the fact that the functions ζ_n = Φ⁻¹(e_n) ∈ C(T) are defined by
ζ_n(z) = zⁿ, ∀ z ∈ T, n ∈ Z.

Example 6.3. We define L²([0, 1]) to be the completion of C([0, 1]) with
respect to the norm ‖·‖₂. Regard C([0, 1]) as a dense linear subspace in L²([0, 1]),
so we also regard
B = {e_n : n ∈ Z}
as a subset in L²([0, 1]). Then Span B is dense in L²([0, 1]), so B is an orthonormal
basis for L²([0, 1]).


Lemma 6.3. Let B be an orthonormal basis for the Hilbert space H, and let
F ⊊ B be an arbitrary non-empty subset.
(i) F is an orthonormal basis for the Hilbert space \overline{Span F}.
(ii) (\overline{Span F})⊥ = \overline{Span(B ∖ F)}.
Proof. (i). This is clear, since F is orthonormal and has dense span.
(ii). Denote for simplicity \overline{Span F} = X and \overline{Span(B ∖ F)} = Y. Since
ξ ⊥ η, ∀ ξ ∈ F, η ∈ B ∖ F,
it is pretty obvious that X ⊥ Y. Since X + Y clearly contains Span B, it follows that
X + Y is dense in H. We know however (Exercise 3 (iv)) that X + Y is closed, so we have in fact the
equality
X + Y = H.
This will then give
I = P_H = P_X + P_Y,
so we get
P_Y = I − P_X = P_{X⊥},
so
X⊥ = Ran P_{X⊥} = Ran P_Y = Y.

Theorem 6.3. Let H be a Hilbert space, and let B be an orthonormal basis
for H, labelled⁶ as B = {ξ_j : j ∈ I}. For every vector η ∈ H, let α^η : I → K be the
map defined by
\[ \alpha^\eta(j) = (\xi_j \mid \eta), \quad \forall\, j \in I. \]
(i) For every η ∈ H, the map α^η belongs to ℓ²_K(I).
(ii) The map
T : H ∋ η ⟼ α^η ∈ ℓ²_K(I)
is an isometric linear isomorphism.
⁶ This notation implicitly assumes that ξ_j ≠ ξ_k, for all j, k ∈ I with j ≠ k.
Proof. (i). Fix for the moment η ∈ H. We must show that
\[ \sup\Big\{ \sum_{j \in F} |\alpha^\eta(j)|^2 : F \subset I, \text{ finite} \Big\} < \infty. \]
For any non-empty finite subset F ⊂ I, we define the subspace
H_F = Span{ξ_j : j ∈ F},
and define the vector
\[ \eta_F = \sum_{j \in F} (\xi_j \mid \eta) \cdot \xi_j. \]
Claim: For every finite set F ⊂ I, one has the equality
η_F = P_{H_F} η.
It suffices to prove that
(η − η_F) ⊥ H_F.
But this is obvious, since if we start with some k ∈ F, then using the fact that
(ξ_k | ξ_j) = 0, for all j ∈ F ∖ {k}, together with the equality ‖ξ_k‖ = 1, we get
\[ (\xi_k \mid \eta - \eta_F) = (\xi_k \mid \eta) - \sum_{j \in F} (\xi_j \mid \eta) \cdot (\xi_k \mid \xi_j) = (\xi_k \mid \eta) - (\xi_k \mid \eta) \cdot (\xi_k \mid \xi_k) = 0. \]


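Having proven the Claim, let us observe that, since the terms in the sum that
defines η_F are all orthogonal, we get
\[ \|\eta_F\|^2 = \sum_{j \in F} \big\|(\xi_j \mid \eta) \cdot \xi_j\big\|^2 = \sum_{j \in F} |(\xi_j \mid \eta)|^2 \cdot \|\xi_j\|^2 = \sum_{j \in F} |\alpha^\eta(j)|^2. \]
Combining this computation with the Claim, we now have
\[ \sum_{j \in F} |\alpha^\eta(j)|^2 = \|\eta_F\|^2 = \|P_{H_F}\eta\|^2 \le \|\eta\|^2, \]
which proves that
\[ \sup\Big\{ \sum_{j \in F} |\alpha^\eta(j)|^2 : F \subset I, \text{ finite} \Big\} \le \|\eta\|^2. \]
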
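(ii). The linearity of T is obvious. The above inequality actually proves that
‖Tη‖ ≤ ‖η‖, ∀ η ∈ H.
We now prove that in fact T is isometric. Since T is linear and continuous, and Span B is dense in H, it
suffices to prove that the restriction T|_{Span B} is isometric. Start with some vector η ∈ Span B,
which means that there exist some finite set F ⊂ I, and scalars (λ_k)_{k∈F} ⊂ K, such
that η = Σ_{k∈F} λ_k ξ_k. Remark that, for every j ∈ I,
\[ (\xi_j \mid \eta) = \sum_{k \in F} \lambda_k\,(\xi_j \mid \xi_k) = \begin{cases} \lambda_j & \text{if } j \in F \\ 0 & \text{if } j \in I \setminus F \end{cases} \]
so the element α^η = Tη ∈ ℓ²_K(I) is defined by
\[ \alpha^\eta(k) = \begin{cases} \lambda_k & \text{if } k \in F \\ 0 & \text{if } k \in I \setminus F \end{cases} \]
This gives
\[ \|\eta\|^2 = \sum_{j,k \in F} \overline{\lambda_j}\,\lambda_k\,(\xi_j \mid \xi_k) = \sum_{k \in F} |\lambda_k|^2 = \sum_{k \in F} |\alpha^\eta(k)|^2 = \|\alpha^\eta\|^2, \]
so we indeed get
‖η‖ = ‖Tη‖, ∀ η ∈ Span B.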
Let us prove that T is surjective. Notice that the above computation, applied to
singleton sets F = {k}, k ∈ I, proves that
Tξ_k = δ_k, ∀ k ∈ I.
In particular, we have
Ran T ⊃ T(Span B) = Span T(B) = Span{Tξ_k : k ∈ I} = Span{δ_k : k ∈ I} = fin_K(I),
which proves that Ran T is dense in ℓ²_K(I). We know however that T is isometric,
so Ran T ⊂ ℓ²_K(I) is closed. This forces Ran T = ℓ²_K(I).
Corollary 6.6 (Parseval Identity). Let H be a Hilbert space, and let B =
{ξ_j : j ∈ I} be an orthonormal basis for H. One has:
\[ (\zeta \mid \eta) = \sum_{j \in I} (\zeta \mid \xi_j) \cdot (\xi_j \mid \eta), \quad \forall\, \zeta, \eta \in H. \]
Proof. If we define α(j) = (ξ_j | ζ) and β(j) = (ξ_j | η), ∀ j ∈ I, then by construction
we have α = Tζ and β = Tη. Using the fact that T is isometric, the right hand
side of the above equality is then equal to
\[ \sum_{j \in I} \overline{\alpha(j)}\,\beta(j) = (\alpha \mid \beta) = (T\zeta \mid T\eta) = (\zeta \mid \eta). \]
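A quick numerical check of the Parseval Identity in C² may help fix the conventions. This is only a sketch: the orthonormal basis `basis` and the vectors `zeta`, `eta` are hypothetical choices, and the inner product is assumed to be conjugate linear in the first variable.

```python
import math

def inner(x, y):
    """Inner product on C^n, conjugate linear in the first variable."""
    return sum(a.conjugate() * b for a, b in zip(x, y))

# An orthonormal basis of C^2 (illustrative choice).
r = 1 / math.sqrt(2)
basis = [[r + 0j, r * 1j], [r + 0j, -r * 1j]]

def coefficients(eta):
    """Coefficient map eta -> ((xi_j | eta))_j with respect to `basis`."""
    return [inner(xi_j, eta) for xi_j in basis]

zeta = [2 + 1j, -1 + 3j]
eta = [0.5 - 2j, 4 + 0j]

alpha = coefficients(zeta)
beta = coefficients(eta)

lhs = inner(zeta, eta)
# (zeta | xi_j) is the complex conjugate of alpha(j) = (xi_j | zeta).
rhs = sum(a.conjugate() * b for a, b in zip(alpha, beta))

# Special case zeta = eta: the squared norm equals the sum of |beta(j)|^2.
norm_gap = abs(inner(eta, eta).real - sum(abs(b) ** 2 for b in beta))
```

Both `abs(lhs - rhs)` and `norm_gap` should vanish up to rounding error.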


Notation. Let H be a Hilbert space, let B = {ξ_j : j ∈ I} be an orthonormal
basis for H, and let T : H → ℓ²_K(I) be the isometric linear isomorphism defined in
the previous theorem. Given an element α ∈ ℓ²_K(I), we denote the vector T⁻¹α ∈ H
by
\[ \sum_{j \in I} \alpha(j)\xi_j. \]
The summation notation is justified by the following fact.
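Proposition 6.6. With the above notations, for every ε > 0, there exists some
finite subset F_ε ⊂ I, such that
\[ \Big\| \sum_{j \in I} \alpha(j)\xi_j - \sum_{k \in F} \alpha(k)\xi_k \Big\| < \varepsilon, \quad \text{for all finite sets } F \subset I \text{ with } F \supset F_\varepsilon. \]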
Proof. Define the vector η = Σ_{j∈I} α(j)ξ_j. By construction we have Tη = α.
Likewise, if we define, for each finite set F ⊂ I, the element α_F ∈ ℓ²_K(I) by
\[ \alpha_F(k) = \begin{cases} \alpha(k) & \text{if } k \in F \\ 0 & \text{if } k \in I \setminus F \end{cases} \]
then T⁻¹α_F = Σ_{k∈F} α(k)ξ_k. Using the fact that T is an isometry, we have
‖η − T⁻¹α_F‖ = ‖Tη − α_F‖ = ‖α − α_F‖,
and the desired property follows from the well-known properties of ℓ²_K(I).
Exercise 5. Let H be a Hilbert space, and let F = {ξ_j : j ∈ J} be an orthonormal
set. Define the closed linear subspace H_F = \overline{Span F}. Prove that the orthogonal
projection P_{H_F} is defined by
\[ P_{H_F}\eta = \sum_{j \in J} (\xi_j \mid \eta)\,\xi_j, \quad \forall\, \eta \in H. \]
Hints: Extend F to an orthonormal basis B. Let B be labelled as {ξ_i : i ∈ I} for some set
I ⊃ J. First prove that for any η ∈ H, the map β^η = (Tη)|_J belongs to ℓ²_K(J). In particular, the
sum
\[ \eta_F = \sum_{j \in J} (\xi_j \mid \eta)\,\xi_j \]
is “legitimate” and defines an element in H_F (use the fact that F is an orthonormal basis for H_F).
Finally, prove that (η − η_F) ⊥ F, using Parseval Identity.
Example 6.4. Let us analyze the space L²([0, 1]). Use the orthonormal basis
{e_n : n ∈ Z} defined by
e_n(t) = exp(2nπit), ∀ t ∈ [0, 1], n ∈ Z.
For any f ∈ C([0, 1]) we define
\[ \hat{f}(n) = \int_0^1 \exp(-2n\pi i t)\,f(t)\,dt = (e_n \mid f). \]
We then know that
\[ f = \sum_{n \in Z} \hat{f}(n)\,e_n. \]
One can think of the right hand side as a series, but the reader should be aware of
the fact that this series is convergent only in the norm ‖·‖₂. One can define, for
example, for any N ≥ 1, a partial sum f_N : [0, 1] → C by
\[ f_N(t) = \sum_{n=-N}^{N} \hat{f}(n)\exp(2n\pi i t), \quad t \in [0, 1]. \]
We will have
\[ \lim_{N \to \infty} \|f - f_N\|_2 = 0, \]
but in general there are (many) values of t ∈ [0, 1] for which the limit lim_{N→∞} f_N(t)
does not exist. One can consider a formal infinite series
\[ \sum_{n=-\infty}^{\infty} \hat{f}(n)\exp(2n\pi i t). \tag{7} \]
Although this series is not convergent (pointwise) for all t ∈ [0, 1], it plays an
important role in analysis. The series (7) is called the complex Fourier series of f.
Note that Parseval's Identity gives
\[ \int_0^1 \overline{f(t)}\,g(t)\,dt = \sum_{n=-\infty}^{\infty} \overline{\hat{f}(n)}\,\hat{g}(n). \]
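To make the last identity concrete, here is a small numerical sketch for the sample function f(t) = t (a hypothetical choice, not taken from the notes). The closed form used in `fhat_exact` comes from one integration by parts; `fhat_numeric` is a plain midpoint rule and serves only as a cross-check. With f = g, Parseval's Identity predicts that the sum of |f̂(n)|² equals the integral of |f|², which is 1/3 here.

```python
import cmath
import math

def f(t):
    """Sample function on [0, 1] (illustrative choice)."""
    return t

def fhat_numeric(n, steps=2000):
    """Midpoint-rule approximation of the integral of exp(-2*n*pi*i*t) f(t) over [0, 1]."""
    h = 1.0 / steps
    total = 0j
    for k in range(steps):
        t = (k + 0.5) * h
        total += cmath.exp(-2j * math.pi * n * t) * f(t)
    return total * h

def fhat_exact(n):
    """Closed form for f(t) = t: 1/2 for n = 0, and i / (2*pi*n) otherwise."""
    return 0.5 + 0j if n == 0 else 1j / (2 * math.pi * n)

# The numerical and closed-form coefficients agree for small |n|.
max_gap = max(abs(fhat_numeric(n) - fhat_exact(n)) for n in range(-3, 4))

# Symmetric partial sums of |fhat(n)|^2 approach the integral of |f|^2 = 1/3.
N = 1000
partial = sum(abs(fhat_exact(n)) ** 2 for n in range(-N, N + 1))
```

The partial sum stays slightly below 1/3, and the gap shrinks roughly like 1/N, which reflects the slow decay of the coefficients of this particular f.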

One can construct another orthonormal basis for L²([0, 1]), by taking real and
imaginary parts of e_n. More explicitly, we define the sequences of functions (g_n)_{n=0}^∞
and (h_n)_{n=1}^∞ by
\[ g_0(t) = 1, \quad \forall\, t \in [0, 1]; \]
\[ g_n(t) = \sqrt{2}\cos(2n\pi t), \quad \forall\, t \in [0, 1],\ n \ge 1; \]
\[ h_n(t) = \sqrt{2}\sin(2n\pi t), \quad \forall\, t \in [0, 1],\ n \ge 1. \]
Then B′ = {g_n : n ≥ 0} ∪ {h_n : n ≥ 1} is again an orthonormal basis for L²([0, 1]).
(It is clear that B′ is orthonormal, and Span B′ ∋ e_n, ∀ n ∈ Z, so Span B′ is dense
in L²([0, 1]).) For f ∈ C([0, 1]) one can then define its real Fourier series
\[ \hat{f}(0) + \sqrt{2}\sum_{n=1}^{\infty} \big[ a_n\cos(2n\pi t) + b_n\sin(2n\pi t) \big], \]
where
\[ a_n = \sqrt{2}\int_0^1 f(t)\cos(2n\pi t)\,dt \quad\text{and}\quad b_n = \sqrt{2}\int_0^1 f(t)\sin(2n\pi t)\,dt, \quad \forall\, n \ge 1. \]
Note that
\[ a_n = \frac{\sqrt{2}}{2}\big[\hat{f}(-n) + \hat{f}(n)\big] \quad\text{and}\quad b_n = \frac{\sqrt{2}}{2i}\big[\hat{f}(-n) - \hat{f}(n)\big], \quad \forall\, n \ge 1. \]
The next result discusses the appropriate notion of dimension for Hilbert spaces.
Theorem 6.4. Let H be a Hilbert space. Then any two orthonormal bases of
H have the same cardinality.
Proof. Fix two orthonormal bases B and B′. There are two possible cases.
Case I: One of the sets B or B′ is finite.
In this case H is finite dimensional, since the linear span of a finite set is
automatically closed. Since both B and B′ are linearly independent, it follows that
both B and B′ are finite, hence their linear spans are both closed. It follows that
Span B = Span B′ = H,
so B and B′ are in fact linear bases for H, and then we get
Card B = Card B′ = dim H.
Case II: Both B and B′ are infinite.
The key step we need in this case is the following.
Claim 1: There exists a dense subset Z ⊂ H, with
Card Z = Card B′.
To prove this fact, we define the set
X = Span_Q B′.
It is clear that
Card X = Card B′.
Notice that X is dense in Span_R B′. If we work over K = R, then we are done, taking Z = X. If
we work over K = C, we define
Z = X + iX,
and we will still have
Card Z = Card X = Card B′.
Now we are done, since clearly Z is dense in Span_C B′, which in turn is dense in H.
Choose Z as in Claim 1. For every ξ ∈ B we choose a vector ζ_ξ ∈ Z, such that
\[ \|\xi - \zeta_\xi\| \le \frac{\sqrt{2} - 1}{2}. \]
Claim 2: The map B ∋ ξ ⟼ ζ_ξ ∈ Z is injective.
Start with two vectors ξ₁, ξ₂ ∈ B, such that ξ₁ ≠ ξ₂. In particular, ξ₁ ⊥ ξ₂, so we
also have ξ₁ ⊥ (−ξ₂), and using the Pythagorean Theorem we get
\[ \|\xi_1 - \xi_2\|^2 = \|\xi_1\|^2 + \|{-\xi_2}\|^2 = 2, \]
which gives
\[ \|\xi_1 - \xi_2\| = \sqrt{2}. \]
Using the triangle inequality, we now have
\[ \sqrt{2} = \|\xi_1 - \xi_2\| \le \|\xi_1 - \zeta_{\xi_1}\| + \|\xi_2 - \zeta_{\xi_2}\| + \|\zeta_{\xi_1} - \zeta_{\xi_2}\| \le \sqrt{2} - 1 + \|\zeta_{\xi_1} - \zeta_{\xi_2}\|. \]
This gives
\[ \|\zeta_{\xi_1} - \zeta_{\xi_2}\| \ge 1, \]
which forces ζ_{ξ₁} ≠ ζ_{ξ₂}.
Using Claim 2, we have constructed an injective map B → Z. In particular,
using Claim 1 and the cardinal arithmetic rules, we get
Card B ≤ Card Z = Card B′.
By symmetry we also have
Card B′ ≤ Card B,
and then using the Cantor-Bernstein Theorem, we finally get
Card B = Card B′.

Corollary 6.7 (of the proof). A Hilbert space is separable, in the norm
topology, if and only if it has an orthonormal basis which is at most countable.
Proof. Use Claims 1 and 2 from the proof of the Theorem.
Definition. Let H be a Hilbert space, and let B be an orthonormal basis for
H. By the above theorem, the cardinal number Card B does not depend on the
choice of B. This number is called the hilbertian (or orthogonal ) dimension of H,
and is denoted by h-dim H.
Corollary 6.8. For two Hilbert spaces H and H′, the following are equivalent:
(i) h-dim H = h-dim H′;
(ii) There exists an isometric linear isomorphism U : H → H′.
Proof. (i) ⇒ (ii). Choose a set I with h-dim H = h-dim H′ = Card I.
Apply Theorem ?? to produce isometric linear isomorphisms T : H → ℓ²_K(I) and
T′ : H′ → ℓ²_K(I). Then define U = T′⁻¹ ∘ T.
(ii) ⇒ (i). Assume one has an isometric linear isomorphism U : H → H′.
Choose an orthonormal basis B for H. Then U(B) is clearly an orthonormal basis
for H′, and since U : B → U(B) is bijective, we get
h-dim H = Card B = Card U(B) = h-dim H′.

Chapter III
Measure Theory
Lecture 18

1. Set arithmetic: (σ-)rings, (σ-)algebras, and monotone classes


In this section we discuss various types of set collections used in Measure The-
ory.
Notation. Given a (non-empty) set X, we denote by P(X) the collection of
all subsets of X.
Definition. Let X be a non-empty set. For A ∈ P(X), we define the function
κ_A : X → {0, 1} by
\[ \kappa_A(x) = \begin{cases} 1 & \text{if } x \in A \\ 0 & \text{if } x \in X \setminus A \end{cases} \]
The function κ_A is called the characteristic function of A.
The basic properties of characteristic functions are summarized in the following.
Exercise 1. Let X be a non-empty set. Prove:
(i) κ_∅ = 0 and κ_X = 1.
(ii) For A, B ∈ P(X) one has
A ⊂ B ⇔ κ_A ≤ κ_B;
A = B ⇔ κ_A = κ_B.
(iii) κ_{A∩B} = κ_A · κ_B, ∀ A, B ∈ P(X).
(iv) κ_{A∖B} = κ_A · (1 − κ_B), ∀ A, B ∈ P(X).
(v) κ_{A∪B} = κ_A + κ_B − κ_A · κ_B, ∀ A, B ∈ P(X).
(vi) For all A₁, …, Aₙ ∈ P(X),
\[ \kappa_{A_1 \cup \dots \cup A_n} = \sum_{k=1}^{n} \sum_{1 \le i_1 < \dots < i_k \le n} (-1)^{k-1}\,\kappa_{A_{i_1}} \cdots \kappa_{A_{i_k}}. \]
(vii) κ_{A△B} = |κ_A − κ_B| = κ_A + κ_B − 2κ_A · κ_B, ∀ A, B ∈ P(X). Here △ stands
for the symmetric set difference, defined by A△B = (A ∖ B) ∪ (B ∖ A).
Property (vi) is called the Inclusion-Exclusion Formula.
Hint: (vi). Show that the right hand side is equal to 1 − (1 − κ_{A₁}) ⋯ (1 − κ_{Aₙ}).
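Before proving identities of this kind, it can be reassuring to test them by brute force on a small ambient set. The sketch below does this for intersections, differences, unions and symmetric differences on a four-element set; the names `kappa` and `holds` are hypothetical helpers, and Python's set operators `&`, `-`, `|`, `^` stand for ∩, ∖, ∪ and the symmetric difference.

```python
from itertools import combinations

X = frozenset(range(4))
subsets = [frozenset(c) for r in range(len(X) + 1) for c in combinations(sorted(X), r)]

def kappa(A):
    """Characteristic function of A, stored as a dict x -> 0 or 1 on X."""
    return {x: 1 if x in A else 0 for x in X}

def holds(lhs, rhs):
    """Pointwise equality on X of a stored function and a formula."""
    return all(lhs[x] == rhs(x) for x in X)

ok = True
for A in subsets:
    for B in subsets:
        a, b = kappa(A), kappa(B)
        ok &= holds(kappa(A & B), lambda x: a[x] * b[x])                    # intersection
        ok &= holds(kappa(A - B), lambda x: a[x] * (1 - b[x]))              # difference
        ok &= holds(kappa(A | B), lambda x: a[x] + b[x] - a[x] * b[x])      # union
        ok &= holds(kappa(A ^ B), lambda x: abs(a[x] - b[x]))               # symmetric difference
        ok &= holds(kappa(A ^ B), lambda x: a[x] + b[x] - 2 * a[x] * b[x])  # symmetric difference
```

Such a check is of course no substitute for the proofs asked for in the exercise; it only guards against misreading a formula.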
Remark 1.1. The Inclusion-Exclusion formula has an interesting application
in Combinatorics. If the ambient set X is finite, then the number of elements of
any subset A ⊂ X is given by
\[ |A| = \sum_{x \in X} \kappa_A(x). \]
Using the Inclusion-Exclusion formula, we then get
\[ |A_1 \cup \dots \cup A_n| = \sum_{k=1}^{n} \sum_{1 \le i_1 < \dots < i_k \le n} (-1)^{k-1}\,|A_{i_1} \cap \dots \cap A_{i_k}|. \]
This is known as the Inclusion-Exclusion Principle.
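The Inclusion-Exclusion Principle translates directly into a short counting routine. The following sketch (the function name and the sample sets are hypothetical) computes the size of a union from the sizes of all k-fold intersections and compares it with a direct count.

```python
from itertools import combinations

def union_size_by_inclusion_exclusion(sets):
    """Size of A_1 ∪ ... ∪ A_n computed from the sizes of all k-fold intersections."""
    n = len(sets)
    total = 0
    for k in range(1, n + 1):
        sign = (-1) ** (k - 1)
        for chosen in combinations(sets, k):
            total += sign * len(frozenset.intersection(*chosen))
    return total

# Multiples of 2, 3 and 5 among 1..30 (illustrative data).
A = [frozenset(x for x in range(1, 31) if x % d == 0) for d in (2, 3, 5)]
by_formula = union_size_by_inclusion_exclusion(A)
by_direct_count = len(frozenset.union(*A))
```

For these three sets the formula reads 15 + 10 + 6 − 5 − 3 − 2 + 1 = 22, which matches the direct count.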


Definition. Let X be a non-empty set, and let K be one of the fields⁷ Q, R
or C. A function φ : X → K is said to be elementary, if its range φ(X) is finite.
Remark that this gives
\[ \phi = \sum_{\lambda \in \phi(X)} \lambda \cdot \kappa_{\phi^{-1}(\{\lambda\})} = \sum_{\lambda \in \phi(X) \setminus \{0\}} \lambda \cdot \kappa_{\phi^{-1}(\{\lambda\})}. \]

We define
ElemK (X) = {φ : X → K : φ elementary}.
Given a collection M ⊂ P(X), a function φ : X → K is said to be M-elementary,
if φ is elementary, and moreover,
φ−1 ({λ}) ∈ M, ∀ λ ∈ K r {0}.
We define
M-ElemK (X) = {φ : X → K : φ M-elementary}.
Exercise 2. With the above notations, prove that ElemK (X) is a unital K-
algebra.
Proposition 1.1. Given a non-empty set X, the collection P(X) is a unital
ring, with the operations
A + B = A△B and A · B = A ∩ B, A, B ∈ P(X).
Proof. First of all, it is clear that △ is commutative.
To prove the associativity of △, we simply observe that
κ_{(A△B)△C} = κ_{A△B} + κ_C − 2κ_{A△B}κ_C =
= κ_A + κ_B − 2κ_Aκ_B + κ_C − 2(κ_A + κ_B − 2κ_Aκ_B) · κ_C =
= κ_A + κ_B + κ_C − 2(κ_Aκ_B + κ_Aκ_C + κ_Bκ_C) + 4κ_Aκ_Bκ_C.
Since the final result is symmetric in A, B, C, we see that we get
κ_{A△(B△C)} = κ_{(A△B)△C},
so we indeed get
(A△B)△C = A△(B△C).
The neutral element for △ is the empty set ∅. Since we obviously have A△A = ∅,
it follows that (P(X), △) is indeed an abelian group.
The operation ∩ is clearly commutative, associative, and has the total set X
as the unit.
To check distributivity, we again use characteristic functions:
κ_{(A∩C)△(B∩C)} = κ_{A∩C} + κ_{B∩C} − 2κ_{A∩C}κ_{B∩C} =
= κ_Aκ_C + κ_Bκ_C − 2κ_Aκ_Bκ_C = (κ_A + κ_B − 2κ_Aκ_B)κ_C =
= κ_{A△B}κ_C = κ_{(A△B)∩C},
so we indeed have the equality
(A ∩ C)△(B ∩ C) = (A△B) ∩ C.
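The ring axioms can also be confirmed exhaustively on a small power set. In the sketch below (variable names are hypothetical), Python's `^` on frozensets plays the role of the symmetric difference and `&` that of the intersection.

```python
from itertools import combinations, product

X = frozenset({0, 1, 2})
P = [frozenset(c) for r in range(len(X) + 1) for c in combinations(sorted(X), r)]
empty = frozenset()

# Addition is symmetric difference (^), multiplication is intersection (&).
add_associative = all((A ^ B) ^ C == A ^ (B ^ C) for A, B, C in product(P, repeat=3))
add_commutative = all(A ^ B == B ^ A for A, B in product(P, repeat=2))
zero_is_empty = all(A ^ empty == A for A in P)
self_inverse = all(A ^ A == empty for A in P)
unit_is_X = all(A & X == A for A in P)
distributive = all((A & C) ^ (B & C) == (A ^ B) & C for A, B, C in product(P, repeat=3))
```

Every element is its own additive inverse, so this ring has characteristic 2.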

7 K can be any field.
Definitions. Let X be a non-empty set. A ring on X is a non-empty sub-ring
R ⊂ P(X). We do not require the unit X to belong to R, but we do require ∅ ∈ R.
An algebra on X is a ring A which contains the unit X.
Rings and algebras of sets are characterized as follows.
Proposition 1.2. Let X be a non-empty set.
A. For a non-empty collection R ⊂ P(X), the following are equivalent:
(i) R is a ring on X;
(ii) For any A, B ∈ R, we have A r B ∈ R and A ∪ B ∈ R.
B. For a non-empty collection A ⊂ P(X), the following are equivalent:
(i) A is an algebra on X;
(ii) For any A ∈ A, we have X r A ∈ A, and for any A, B ∈ A, we have
A ∪ B ∈ A.
Proof. A. (i) ⇒ (ii). Assume R is a ring on X, and let A, B ∈ R. Then A ∩ B
belongs to R, so
A ∖ B = A△(A ∩ B)
also belongs to R. It then follows that
A ∪ B = (A△B)△(A ∩ B)
again belongs to R.
(ii) ⇒ (i). Assume R satisfies property (ii). Start with A, B ∈ R. Then A ∖ B
belongs to R, and
A ∩ B = A ∖ (A ∖ B)
again belongs to R. Since A ∪ B also belongs to R, it follows that the set
A△B = (A ∪ B) ∖ (A ∩ B)
again belongs to R.
B. (i) ⇒ (ii). This is clear from the implication A.(i) ⇒ (ii).
(ii) ⇒ (i). Assume A satisfies property (ii). Start with two sets A, B ∈ A.
Then the complements X r A and X r B both belong to A, hence their union
(X r A) ∪ (X r B) = X r (A ∩ B)
belongs to A, and the complement of this union
 
X r X r (A ∩ B) = A ∩ B
will also belong to A.
If A, B ∈ A, then since X ∖ B belongs to A, by the above considerations, it
follows that the intersection
A ∩ (X ∖ B) = A ∖ B
also belongs to A. Likewise, the difference B ∖ A also belongs to A, hence the union
(A ∖ B) ∪ (B ∖ A) = A△B
also belongs to A. By part A, it follows that A is a ring.
Finally, since A is non-empty, if we choose some A ∈ A, then A△A = ∅ belongs
to A, so its complement X ∖ ∅ = X also belongs to A.
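For finite collections, the criteria A(ii) and B(ii) are easy to test mechanically. The following sketch uses hypothetical helper names `is_ring` and `is_algebra` and three small sample collections on X = {1, 2, 3}.

```python
def is_ring(collection):
    """Criterion A(ii): non-empty, closed under difference and union."""
    fam = set(collection)
    return bool(fam) and all(A - B in fam and A | B in fam for A in fam for B in fam)

def is_algebra(collection, X):
    """Criterion B(ii): non-empty, closed under complement and union."""
    fam = set(collection)
    return bool(fam) and all(X - A in fam for A in fam) and all(
        A | B in fam for A in fam for B in fam
    )

X = frozenset({1, 2, 3})
fs = frozenset
ring_only = [fs(), fs({1}), fs({2}), fs({1, 2})]   # a ring that does not contain X
algebra = [fs(), fs({1}), fs({2, 3}), X]           # an algebra on X
not_ring = [fs(), fs({1}), fs({2})]                # not closed under union
```

The collection `ring_only` shows that a ring on X need not contain X, which is exactly the difference between rings and algebras.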
It will be useful to introduce the following terminology.
Definition. A system of sets (Ai )i∈I is said to be pair-wise disjoint, if Ai ∩
Aj = ∅, for all i, j ∈ I with i 6= j.
Lemma 1.1. Let X be a non-empty set, let K be one of the fields Q, R or C, and
let R be a ring on X. For a function φ : X → K, the following are equivalent:
(i) φ is R-elementary;
(ii) there exist an integer n ≥ 1 and sets A₁, …, Aₙ ∈ R, and numbers
λ₁, …, λₙ ∈ K, such that
φ = λ₁κ_{A₁} + ⋯ + λₙκ_{Aₙ}.
(iii) there exist an integer m ≥ 1, and a finite pair-wise disjoint system (B_j)_{j=1}^m ⊂
R, and numbers µ₁, …, µ_m ∈ K, such that
φ = µ₁κ_{B₁} + ⋯ + µ_mκ_{B_m}.
Proof. (i) ⇒ (ii). Assume φ is R-elementary. If φ = 0, there is nothing to
prove, because we have φ = κ_∅. If φ is not identically zero, then we can obviously
write
\[ \phi = \sum_{\lambda \in \phi(X) \setminus \{0\}} \lambda\,\kappa_{\phi^{-1}(\{\lambda\})}, \]
with all sets φ⁻¹({λ}) in R.
(ii) ⇒ (iii). Define
E = {ψ : X → K : ψ satisfies property (iii)}.
Assume φ satisfies (ii), i.e.
φ = λ 1 κ A1 + · · · + λ n κ A n ,
with A1 , . . . , An ∈ R and λ1 , . . . , λn ∈ K. We are going to prove that φ ∈ E, by
induction on n. The case n = 1 is trivial (either φ = 0, so φ = κ ∅ ∈ E, or φ = λκ A
for some A ∈ R and λ 6= 0, in which case we also have φ ∈ E).
Assume
α1 κ D1 + · · · + αk κ Dk ∈ E,
for all D1 , . . . , Dk ∈ R, α1 , . . . , αk ∈ K. Start with a function
φ = λ1 κ A1 + · · · + λk κ Ak + λk+1 κ Ak+1 ,
with A1 , . . . , Ak+1 ∈ R and λ1 , . . . , λk+1 ∈ K, and based on the above inductive
hypothesis, let us show that φ ∈ E. Using the inductive hypothesis, the function
ψ = λ2 κ A2 + · · · + λk κ Ak + λk+1 κ Ak+1
belongs to E, so there exist an integer p ≥ 1, scalars η₁, …, η_p ∈ K, and a pair-wise
disjoint system (C_j)_{j=1}^p ⊂ R, such that
ψ = η₁κ_{C₁} + ⋯ + η_pκ_{C_p}.
With this notation, we have
φ = λ₁κ_{A₁} + η₁κ_{C₁} + ⋯ + η_pκ_{C_p}.
Put then
B_{2j−1} = A₁ ∩ C_j and B_{2j} = C_j ∖ A₁, for all j ∈ {1, …, p};
B_{2p+1} = A₁ ∖ (C₁ ∪ ⋯ ∪ C_p).
It is clear that (B_k)_{k=1}^{2p+1} ⊂ R is pair-wise disjoint. Notice now that the equalities
C_j = B_{2j−1} ∪ B_{2j}, ∀ j ∈ {1, …, p},
A₁ = B₁ ∪ B₃ ∪ ⋯ ∪ B_{2p+1},
combined with the fact that the B's are pairwise disjoint, give
κ_{C_j} = κ_{B_{2j−1}} + κ_{B_{2j}}, ∀ j ∈ {1, …, p},
κ_{A₁} = κ_{B₁} + κ_{B₃} + ⋯ + κ_{B_{2p+1}},
which give
\[ \phi = \sum_{j=1}^{p} \eta_j\,\kappa_{B_{2j}} + \sum_{j=1}^{p} (\eta_j + \lambda_1)\,\kappa_{B_{2j-1}} + \lambda_1\,\kappa_{B_{2p+1}}, \]
which proves that φ indeed belongs to E.
(iii) ⇒ (i). Assume there exists a finite pair-wise disjoint system (B_j)_{j=1}^m ⊂ R,
and numbers µ₁, …, µ_m ∈ K, such that
φ = µ₁κ_{B₁} + ⋯ + µ_mκ_{B_m},
and let us prove that φ is R-elementary.
If all the µ's are zero, there is nothing to prove, since φ = 0.
Assume the µ’s are not all equal to zero. Since the µ’s that are equal to zero
do not have any contribution, we can in fact assume that all the µ’s are non-zero.
Notice that
φ(X) r {0} = {µj : 1 ≤ j ≤ m}.
In particular φ is elementary.
If we start with an arbitrary λ ∈ K ∖ {0}, then either λ ∉ φ(X), or λ ∈
φ(X) ∖ {0}. In the first case we clearly have φ⁻¹({λ}) = ∅ ∈ R. In the second
case, we have the equality
\[ \phi^{-1}(\{\lambda\}) = \bigcup_{j \in M_\lambda} B_j, \]
where
M_λ = {j : 1 ≤ j ≤ m and µ_j = λ}.
Since all B's belong to R, it follows that φ⁻¹({λ}) again belongs to R. Having
shown that φ is elementary, and φ⁻¹({λ}) ∈ R, for all λ ∈ K ∖ {0}, it follows that
φ is indeed R-elementary.
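The passage from an arbitrary linear combination of characteristic functions to one over pairwise disjoint sets can be carried out algorithmically, in the spirit of the inductive step of the proof. The sketch below is an illustration only: the names `disjointify` and `evaluate` are hypothetical, and the terms are absorbed one at a time from left to right rather than in the order used in the proof.

```python
def disjointify(terms):
    """Rewrite sum(lam * kappa_A) as a sum over pairwise disjoint sets.

    `terms` is a list of (lam, A) pairs with A a frozenset. Each new set A
    splits every existing piece B into B & A and B - A, and contributes the
    leftover part of A not covered by the existing pieces.
    """
    pieces = []  # list of (mu, B) with the sets B pairwise disjoint
    for lam, A in terms:
        covered = frozenset().union(*(B for _, B in pieces))
        updated = []
        for mu, B in pieces:
            updated.append((mu + lam, B & A))
            updated.append((mu, B - A))
        updated.append((lam, A - covered))
        pieces = [(mu, B) for mu, B in updated if B]  # drop empty sets
    return pieces

def evaluate(terms, x):
    """Value at x of the function sum(lam * kappa_A)."""
    return sum(lam for lam, A in terms if x in A)

X = range(6)
terms = [(2, frozenset({0, 1, 2})), (5, frozenset({2, 3})), (-1, frozenset({1, 2, 3, 4}))]
pieces = disjointify(terms)

same_function = all(evaluate(terms, x) == evaluate(pieces, x) for x in X)
pairwise_disjoint = all(
    not (B & C) for i, (_, B) in enumerate(pieces) for (_, C) in pieces[i + 1:]
)
```

All pieces are built from the given sets using only intersections and differences, so they stay inside any ring that contains the original sets.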
Proposition 1.3. Let X be a non-empty set, and let K be one of the fields Q,
R, or C.
A. For a non-empty collection R ⊂ P(X), the following are equivalent:
(i) R is a ring on X;
(ii) R-ElemK (X) is a K-subalgebra of ElemK (X).
B. For a non-empty collection A ⊂ P(X), the following are equivalent:
(i) A is an algebra on X;
(ii) A-ElemK (X) is a K-subalgebra of ElemK (X), which contains the constant
function 1.
Proof. A. (i) ⇒ (ii). Assume R is a ring on X. Using Lemma 1.1 we see that
we have the equality:
R-ElemK (X) = Span{κ A : A ∈ R}.
In particular, this shows that R-ElemK (X) is a K-linear subspace of ElemK (X).
Moreover, in order to prove that R-ElemK (X) is a K-subalgebra, it suffices to prove
the implication
A, B ∈ R =⇒ κ A · κ B ∈ R-ElemK (X).
But this implication is trivial, since κ A · κ B = κ A∩B , and A ∩ B belongs to R.


(ii) ⇒ (i). Assume R-ElemK (X) is a K-subalgebra of ElemK (X). First of all,
since κ ∅ = 0 ∈ R-ElemK (X), it follows that ∅ ∈ R.
Start now with two sets A, B ∈ R. Then κ A and κ B belong to R-ElemK (X).
Since R-ElemK (X) is an algebra, the function
κ A∩B = κ A · κ B
belongs to R-ElemK (X), so we immediately see that A ∩ B ∈ R.
Likewise, the function
κ_{A△B} = κ_A + κ_B − 2κ_Aκ_B
belongs to R-Elem_K(X), so we also get A△B ∈ R.
B. This equivalence is clear from part A, plus the identity κ X = 1. 
Algebras of elementary functions give in fact a complete description for rings
or algebras of sets, as indicated in the result below.
Proposition 1.4. Let X be a non-empty set, and let K be one of the fields Q,
R, or C.
A. The map
R ⟼ R-Elem_K(X)
is a bijective correspondence between the collection of all rings on X and the collection
of all K-subalgebras of Elem_K(X).
B. The map
A ⟼ A-Elem_K(X)
is a bijective correspondence between the collection of all algebras on X and the
collection of all K-subalgebras of Elem_K(X) that contain 1.
Proof. A. We start by proving surjectivity. Let E ⊂ ElemK (X) be an arbitrary
K-subalgebra. Define the collection
R = {A ⊂ X : κ A ∈ E}.
If A, B ∈ R, then the equalities
κ_{A∩B} = κ_Aκ_B and κ_{A△B} = κ_A + κ_B − 2κ_Aκ_B,
combined with the fact that E is a subalgebra, prove that κ_{A∩B} and κ_{A△B} both
belong to E, hence A ∩ B and A△B both belong to R. This shows that R is a ring.
It is pretty clear (see Lemma 1.1) that R-ElemK (X) ⊂ E. To prove the other
inclusion, start with some arbitrary function φ ∈ E, and let us prove that φ ∈
R-ElemK (X). If φ = 0, there is nothing to prove. Assume φ is not identically zero.
We write φ(X) r {0} as {λ1 , . . . , λn }, with λi 6= λj for all i, j ∈ {1, . . . , n} with
i 6= j. For each i ∈ {1, . . . , n}, we set Ai = φ−1 ({λi }), so that
n
X
φ= λ i · κ Ai .
i=1
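Since all λ's are different and non-zero, the matrix
\[
T = \begin{pmatrix}
\lambda_1 & \lambda_2 & \dots & \lambda_n \\
\lambda_1^2 & \lambda_2^2 & \dots & \lambda_n^2 \\
\vdots & \vdots & \ddots & \vdots \\
\lambda_1^n & \lambda_2^n & \dots & \lambda_n^n
\end{pmatrix}
\]
is invertible. Take (α_{ij})_{i,j=1}^{n} to be the inverse of T. The obvious equalities
\[ \varphi^k = \sum_{j=1}^{n} \lambda_j^k \, \kappa_{A_j}, \quad \forall\, k = 1, \dots, n, \]
can be written in matrix form as
\[
\begin{pmatrix} \varphi \\ \varphi^2 \\ \vdots \\ \varphi^n \end{pmatrix}
= T \cdot
\begin{pmatrix} \kappa_{A_1} \\ \kappa_{A_2} \\ \vdots \\ \kappa_{A_n} \end{pmatrix},
\]
so multiplying by T^{−1} yields
\[ \kappa_{A_j} = \sum_{k=1}^{n} \alpha_{jk} \, \varphi^k, \quad \forall\, j = 1, \dots, n, \]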
which proves that κ A1 , . . . , κ An ∈ E, so A1 , . . . , An ∈ R. This then shows that
φ ∈ R-ElemK (X).
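The inversion step in the proof can be traced on a concrete example. The sketch below is an illustration under stated assumptions: the function φ on a six-point set is hypothetical (its values are made up), exact rational arithmetic is used, and `invert` is a plain Gauss-Jordan routine written for this example rather than anything from the text.

```python
from fractions import Fraction

def invert(M):
    """Invert a square matrix of Fractions by Gauss-Jordan elimination."""
    n = len(M)
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

X = list(range(6))
phi = {0: 2, 1: 2, 2: 0, 3: -1, 4: 5, 5: -1}   # hypothetical elementary function
lams = sorted(set(phi.values()) - {0})          # distinct non-zero values of phi
n = len(lams)

# Row k (k = 1..n) of T holds the k-th powers of the lambdas.
T = [[Fraction(lam) ** k for lam in lams] for k in range(1, n + 1)]
alpha = invert(T)

# kappa_{A_j}(x) = sum_k alpha[j][k] * phi(x)^(k+1); the index shift is due to 0-based lists.
recovered = [
    {x for x in X
     if sum(alpha[j][k] * Fraction(phi[x]) ** (k + 1) for k in range(n)) == 1}
    for j in range(n)
]
expected = [{x for x in X if phi[x] == lam} for lam in lams]
```

In this example the level sets A_j = φ^{−1}({λ_j}) are recovered purely from linear combinations of the powers φ, φ², φ³, which is exactly what membership of κ_{A_j} in the subalgebra requires.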
We now prove injectivity. Suppose first that R and S are rings such that
R-ElemK (X) = S-ElemK (X), and let us prove that R = S. For every A ∈ R, the
function κ A ∈ R-ElemK (X) is also S-elementary, which means that A ∈ S. This
proves the inclusion R ⊂ S. By symmetry we also have the inclusion S ⊂ R, so
indeed R = S.
B. This part is obvious from A. 
Definitions. Let X be a (non-empty) set. A collection U ⊂ P(X) is called a
σ-ring, if it is a ring, and it has the property:
(σ) Whenever (A_n)_{n=1}^∞ is a sequence in U, it follows that ⋃_{n=1}^∞ A_n also belongs
to U.
A collection S ⊂ P(X) is called a σ-algebra, if it is an algebra, and it has property
(σ).
Clearly, every σ-algebra is a σ-ring.
Remarks 1.2. A. For σ-rings and σ-algebras, one of the properties in the
definition of rings and algebras is redundant. More explicitly:
(i) A collection U ⊂ P(X) is a σ-ring, if and only if it has the property (σ)
and the property: A, B ∈ U ⟹ A ∖ B ∈ U.
(ii) A collection S ⊂ P(X) is a σ-algebra, if and only if it has the property
(σ) and the property: A ∈ S ⟹ X ∖ A ∈ S.
B. If U is a σ-ring, then it also has the property
(δ) (A_n)_{n=1}^∞ ⊂ U ⟹ ⋂_{n=1}^∞ A_n ∈ U.
Since σ-algebras are σ-rings, they will also have property (δ).
Definitions. Let X be a non-empty set. A sequence (An )n≥1 of subsets of X
is said to be monotone, if it satisfies one of the following conditions:
(↑) An ⊂ An+1 , ∀ n ≥ 1,
(↓) An ⊃ An+1 , ∀ n ≥ 1.
In the case (↑) the sequence is said to be increasing, and we define
\[ \lim_{n\to\infty} A_n = \bigcup_{n=1}^{\infty} A_n. \]
In the case (↓) the sequence is said to be decreasing, and we define
\[ \lim_{n\to\infty} A_n = \bigcap_{n=1}^{\infty} A_n. \]
A collection M ⊂ P(X) is said to be a monotone class on X, if it satisfies the
condition:
(m) whenever (An )n≥1 is a monotone sequence in M, it follows that its limit
limn→∞ An also belongs to M.
Proposition 1.5. Let R be a ring on X. Then the following are equivalent:
(i) R is a σ-ring;
(ii) R is a monotone class.
Proof. (i) ⇒ (ii). This is immediate from the definition and Remark 1.2.B.
(ii) ⇒ (i). Assume R is a monotone class, and let us prove that it is a σ-ring.
By Remark 1.2.A, we only need to prove that R has property (σ). Start with an
arbitrary sequence (A_n)_{n≥1} in R, and let us prove that ⋃_{n=1}^∞ A_n again belongs to R.
For every integer n ≥ 1, we define B_n = ⋃_{k=1}^n A_k. Since R is a ring, it follows that
B_n ∈ R, ∀ n ≥ 1. Moreover, the sequence (B_n)_{n≥1} is increasing, so by assumption,
the set ⋃_{n=1}^∞ A_n = lim_{n→∞} B_n indeed belongs to R. □
Lecture 19

2. Constructing (σ-)rings and (σ-)algebras


In this section we outline three methods of constructing (σ-)rings and
(σ-)algebras. It turns out that one can devise some general procedures, which
work for all the types of set collections considered, so it will be natural to begin
with some very general considerations.
Definition. Suppose one has a type Θ of set collections. In other words, for
any set X, one defines what it means for a collection C ⊂ P(X) to be of type Θ. The
type Θ is said to be consistent, if for every set X, one has the following conditions:
• the collection P(X), of all subsets of X, is of type Θ;
• if C_i, i ∈ I, are collections of type Θ, then the intersection ⋂_{i∈I} C_i is again
of type Θ.
Examples 2.1. The following types are consistent:
• The type R of rings;
• The type A of algebras;
• The type S of σ-rings;
• The type Σ of σ-algebras;
• The type M of monotone classes.
The reason for the consistency is simply the fact that each of these types is defined
by means of set operations.
Definition. Let Θ be a consistent type, let X be a set, and let E ⊂ P(X) be
an arbitrary collection of sets. Define
\[ F_\Theta(E, X) = \{ C \subset P(X) : C \supset E, \text{ and } C \text{ is of type } \Theta \text{ on } X \}. \]
Notice that the family FΘ (E, X) is non-empty, since it contains at least the collection
P(X). The collection
\[ \Theta_X(E) = \bigcap_{C \in F_\Theta(E, X)} C \]
is of type Θ on X, and is called the type Θ class generated by E. When there is no
danger of confusion, the ambient set X will be omitted.
Comment. In the above setting, the class Θ(E) is the smallest collection of
type Θ on X, which contains E. In other words, if C is a collection of type Θ on X,
with C ⊃ E, then C ⊃ Θ(E). This follows immediately from the fact that C belongs
to FΘ (E, X).
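On a very small ground set, the definition of the generated class as an intersection can be carried out literally by brute force. The sketch below is an illustration only: it takes Θ to be the type of rings (closed under △ and ∩), uses a hypothetical three-point set X and a hypothetical generating family E, enumerates all 256 collections of subsets of X, and intersects those that are rings containing E.

```python
from itertools import combinations

X = frozenset({0, 1, 2})                                   # hypothetical ground set
subsets = [frozenset(s) for r in range(len(X) + 1)
           for s in combinations(sorted(X), r)]            # all 8 subsets of X

def is_ring(C):
    """True if C is closed under symmetric difference and intersection."""
    return all((A ^ B) in C and (A & B) in C for A in C for B in C)

E = {frozenset({0}), frozenset({0, 1})}                    # hypothetical generators

# F(E, X): every ring on X that contains E, found by exhaustive search.
rings_over_E = []
for mask in range(1 << len(subsets)):
    C = {S for i, S in enumerate(subsets) if (mask >> i) & 1}
    if E <= C and is_ring(C):
        rings_over_E.append(C)

# The generated ring is the intersection of the whole family.
generated = set.intersection(*rings_over_E)
is_smallest = is_ring(generated) and all(generated <= C for C in rings_over_E)
```

The family `rings_over_E` is non-empty because it contains the full power set, mirroring the remark in the definition. The exhaustive search is only feasible for tiny sets; it is meant to mirror the definition, not to be an efficient algorithm.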
Examples 2.2. Let X be a (non-empty) set, and let E be an arbitrary collection
of subsets of X. According to the previous list of consistent types R, A, S, Σ, and
M, one can construct the following collections.
(i) R(E), the ring generated by E; this is the smallest ring that contains E.
(ii) A(E), the algebra generated by E; this is the smallest algebra that contains
E.
(iii) S(E), the σ-ring generated by E; this is the smallest σ-ring that contains
E.
(iv) Σ(E), the σ-algebra generated by E; this is the smallest σ-algebra that
contains E.
(v) M(E), the monotone class generated by E; this is the smallest monotone
class that contains E.
Comment. Assume Θ is a consistent type. Suppose E is an arbitrary collection
of subsets of some fixed non-empty set X. There are instances when we would like
to decide whether a class C ⊃ E coincides with Θ(E). The following is a useful test:
(i) check that C is of type Θ;
(ii) check the inclusion C ⊂ Θ(E).
By (i) we must have C ⊃ Θ(E), so by (ii) we will indeed have equality.
A simple illustration of the above technique allows one to describe the ring and
the algebra generated by a collection of sets.
Proposition 2.1. Let X be a non-empty set, and let E be an arbitrary collec-
tion of subsets of X.
A. For a set A ⊂ X, the following are equivalent:
(i) A ∈ R(E);
(ii) There exist sets A1, A2, . . . , An such that A = A1 △ A2 △ . . . △ An, and each
Ak, k = 1, . . . , n, is a finite intersection of sets in E.
B. The algebra generated by E is
\[ A(E) = R(E) \cup \{ X \smallsetminus A : A \in R(E) \} = R\big(E \cup \{X\}\big). \]

Proof. A. Define R to be the class of all subsets A ⊂ X, which satisfy property


(ii), so that what we have to prove is the equality
R = R(E).
It is clear that E ⊂ R. Since every finite intersection of sets in E belongs to R(E),
and the latter is a ring, it follows that R ⊂ R(E). So in order to prove the desired
equality, all we have to do is to prove that R is a ring. But this is pretty clear, if
we think of △ as the sum operation, and ∩ as the product operation. More explicitly,
let us take Π(E) to be the collection of all finite intersections of sets in E, so that
(1) A ∩ B ∈ Π(E), ∀ A, B ∈ Π(E).
Now if we start with two sets A, B ∈ R, written as A = A1 △ . . . △ Am and B =
B1 △ . . . △ Bn, with A1, . . . , Am, B1, . . . , Bn ∈ Π(E), then the equality
\[
A \cap B = \big[(A_1 \cap B_1) \triangle \dots \triangle (A_m \cap B_1)\big] \triangle \big[(A_1 \cap B_2) \triangle \dots \triangle (A_m \cap B_2)\big] \triangle \dots \triangle \big[(A_1 \cap B_n) \triangle \dots \triangle (A_m \cap B_n)\big],
\]
combined with (1) proves that A ∩ B ∈ R, while the equality
\[ A \triangle B = A_1 \triangle \dots \triangle A_m \triangle B_1 \triangle \dots \triangle B_n \]
proves that A△B also belongs to R.
B. Define the collection
\[ \mathcal{A} = R(E) \cup \{ X \smallsetminus A : A \in R(E) \}. \]
Since we clearly have E ⊂ 𝒜 ⊂ A(E), all we need to prove is the fact that 𝒜 is an
algebra. It is clear that, whenever A ∈ 𝒜, it follows that X ∖ A ∈ 𝒜. Therefore
(see Section III.1), we only need to show that
A, B ∈ 𝒜 ⇒ A ∪ B ∈ 𝒜.
There are four cases to examine: (i) A, B ∈ R(E); (ii) A ∈ R(E) and X ∖ B ∈ R(E);
(iii) X ∖ A ∈ R(E) and B ∈ R(E); (iv) X ∖ A ∈ R(E) and X ∖ B ∈ R(E).
Case (i) is clear, since it will force A ∪ B ∈ R(E).
In case (ii), we use
X ∖ (A ∪ B) = (X ∖ A) ∩ (X ∖ B) = (X ∖ B) ∖ A,
which proves that X ∖ (A ∪ B) ∈ R(E).
Case (iii) is proven exactly as case (ii).
In case (iv) we use
X ∖ (A ∪ B) = (X ∖ A) ∩ (X ∖ B),
which proves that X ∖ (A ∪ B) ∈ R(E).
The equality A(E) = R(E ∪ {X}) is trivial. □
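Both parts of Proposition 2.1 can be sanity-checked on a finite example. The sketch below is an illustration, not a proof: the ground set X and the family E are hypothetical, and `closure` is a helper written for this example that closes a family of sets under the given binary operations.

```python
import operator

def closure(family, ops):
    """Smallest family containing `family` and closed under each binary operation in `ops`."""
    fam = set(family)
    changed = True
    while changed:
        changed = False
        for A in list(fam):
            for B in list(fam):
                for op in ops:
                    C = op(A, B)
                    if C not in fam:
                        fam.add(C)
                        changed = True
    return fam

X = frozenset({1, 2, 3, 4})                        # hypothetical ground set
E = {frozenset({1, 2}), frozenset({2, 3})}         # hypothetical generators

ring = closure(E, [operator.xor, operator.and_])   # R(E): closed under △ and ∩
Pi = closure(E, [operator.and_])                   # Π(E): finite intersections of sets in E
described = closure(Pi, [operator.xor])            # all A1 △ ... △ An with Ak in Π(E)

algebra = closure(E | {X}, [operator.xor, operator.and_])   # R(E ∪ {X})
with_complements = ring | {X - A for A in ring}             # R(E) ∪ {X ∖ A : A ∈ R(E)}
```

In this example `ring` consists of the 8 subsets of {1, 2, 3}, while `algebra` is the full power set of X; part A corresponds to `described == ring` and part B to `algebra == with_complements`.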


Comment. Unfortunately, for σ-rings and σ-algebras, no easy constructive
description is available. There is an analogue of Proposition 2.1, which uses transfinite
induction. In order to formulate such a statement, we introduce the following
notation. For every collection C of subsets of X, we define
\[ C^* = \Big\{ \bigcup_{n=1}^{\infty} (A_n \smallsetminus B_n) : A_n, B_n \in C \cup \{\varnothing\}, \ \forall\, n \geq 1 \Big\}. \]

Notice that
(2) C ∪ {∅} ⊂ C∗ ⊂ S(C).
Theorem 2.1. Let X be a non-empty set, and let E be an arbitrary collection
of subsets of X. For every ordinal number η define the set
Pη = {α : α ordinal number with α < η}.
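Let Ω denote the smallest uncountable ordinal number, and define the classes Eα,
α ∈ PΩ, recursively by E0 = E, and
\[ E_\alpha = \Big( \bigcup_{\beta \in P_\alpha} E_\beta \Big)^{*}, \quad \forall\, \alpha \in P_\Omega \smallsetminus \{0\}. \]
Then the σ-ring generated by E is
\[ S(E) = \bigcup_{\alpha \in P_\Omega} E_\alpha. \]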

Proof. Denote the union ⋃_{α∈PΩ} Eα simply by U. It is obvious that E ⊂ U.
Let us prove that U ⊂ S(E). We do this by showing that Eα ⊂ S(E), ∀ α ∈ PΩ.
We use transfinite induction. The case α = 0 is clear. Assume α ∈ PΩ has the
property that Eβ ⊂ S(E), for all β ∈ Pα, and let us show that we also have the
inclusion Eα ⊂ S(E). On the one hand, if we take the class
\[ C = \bigcup_{\beta \in P_\alpha} E_\beta, \]
then Eα = C∗. On the other hand, by the inductive hypothesis, we have C ⊂ S(E),
which clearly forces S(C) ⊂ S(E). Then the desired inclusion follows from (2).
In order to finish the proof, we only need to prove that U is a σ-ring. It suffices
to prove the equality U∗ = U, which in turn is equivalent to the inclusion U∗ ⊂ U.
Start with some set U ∈ U∗, written as
\[ U = \bigcup_{n=1}^{\infty} (A_n \smallsetminus B_n), \]
for two sequences (A_n)_{n=1}^∞ and (B_n)_{n=1}^∞ in U. For each n ≥ 1 choose αn, βn ∈ PΩ,
such that An ∈ Eαn and Bn ∈ Eβn. Form then the countable set
Z = {αn : n ∈ N} ∪ {βn : n ∈ N} ⊂ PΩ.
Then we clearly have
\[ U \in \Big( \bigcup_{\nu \in Z} E_\nu \Big)^{*}. \]
Since Z is countable, there is a strict upper bound for Z in PΩ, i.e. there exists
γ ∈ PΩ, such that αn < γ and βn < γ, ∀ n ≥ 1. In other words we have Z ⊂ Pγ, so
\[ U \in \Big( \bigcup_{\nu \in P_\gamma} E_\nu \Big)^{*} = E_\gamma, \]
so the set U indeed belongs to the class U. □

Corollary 2.1. Given a non-empty set X, and an arbitrary collection E of
subsets of X, with card E ≥ 2, one has the inequality
\[ \operatorname{card} S(E) \leq (\operatorname{card} E)^{\aleph_0}. \]

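Proof. Using the notations from the proof of the above theorem, we will first
prove, by transfinite induction, that
\[ \operatorname{card} E_\alpha \leq (\operatorname{card} E)^{\aleph_0}, \quad \forall\, \alpha \in P_\Omega. \tag{3} \]
The case α = 0 is clear. Assume now we have α ∈ PΩ ∖ {0}, such that
\[ \operatorname{card} E_\beta \leq (\operatorname{card} E)^{\aleph_0}, \quad \forall\, \beta \in P_\alpha, \]
and let us prove that we also have the inequality card Eα ≤ (card E)^{ℵ0}. If we take
C = ⋃_{β∈Pα} Eβ, we know that C is a countable union of sets, each having cardinality
≤ (card E)^{ℵ0}, so we immediately get
\[ \operatorname{card} C \leq \aleph_0 \cdot (\operatorname{card} E)^{\aleph_0} = (\operatorname{card} E)^{\aleph_0}. \]
Then the collection
D(C) = {A ∖ B : A, B ∈ C}
has cardinality at most (card C)², so we also have
\[ \operatorname{card} D(C) \leq (\operatorname{card} E)^{\aleph_0}. \]
Finally, the collection Eα = C∗ has cardinality at most (card D(C))^{ℵ0}, so we get
\[ \operatorname{card} E_\alpha \leq \big( (\operatorname{card} E)^{\aleph_0} \big)^{\aleph_0} = (\operatorname{card} E)^{\aleph_0}. \]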

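Having proven (3), we now have
\[ \operatorname{card} S(E) = \operatorname{card} \Big( \bigcup_{\alpha \in P_\Omega} E_\alpha \Big) \leq \operatorname{card} P_\Omega \cdot (\operatorname{card} E)^{\aleph_0} = \aleph_1 \cdot (\operatorname{card} E)^{\aleph_0}. \]
Since ℵ1 ≤ c = 2^{ℵ0} ≤ (card E)^{ℵ0}, the above estimate gives
\[ \operatorname{card} S(E) \leq \big( (\operatorname{card} E)^{\aleph_0} \big)^{2} = (\operatorname{card} E)^{\aleph_0}. \qquad \square \]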


Comment. Suppose Θ is a consistent type. There is a very useful technique


for proving results on classes of the form Θ(E). More explicitly, suppose E is an
arbitrary collection of subsets of X, and (p) is a certain property which refers to
subsets of X. Suppose now we want to prove a statement like:
(∗) Every set A ∈ Θ(E) has property (p).
In order to prove such a statement, one defines
\[ U = \{ A \in \Theta(E) : A \text{ has property (p)} \}, \]
and it suffices to prove that:
(i) U is of type Θ;
(ii) U ⊃ E, i.e. every set A ∈ E has property (p).
Indeed, if we prove the above two facts, that would force U ⊃ Θ(E), and since by
construction we have U ⊂ Θ(E), we will in fact get U = Θ(E), thus proving (∗).
As a first illustration of the above technique, we prove the following.
Proposition 2.2. Let X be a non-empty set, and let R be a ring on X. Then
the σ-ring generated by R is the same as the monotone class generated by R, that
is, one has the equality
S(R) = M(R).
Proof. Since S(R) is a monotone class, and contains R, we have the inclusion
S(R) ⊃ M(R).
To prove the other inclusion, using the fact that M(R) contains R, it suffices
to show that M(R) is a σ-ring. Since M(R) is already a monotone class, by
Proposition 1.5 we only need to show that it is a ring. In other words, we need to show that whenever
A, B ∈ M(R), it follows that both A ∖ B and A ∪ B belong to M(R). Define then,
for every A ∈ M(R) the set
\[ M_A = \{ B \in M(R) : A \cup B, \ A \smallsetminus B, \ B \smallsetminus A \in M(R) \}, \]
so that what we need to prove is:
(∗) MA = M(R), ∀ A ∈ M(R).
Before we proceed with the proof of (∗), let us first remark that, for A, B ∈ M(R),
one has
(4) B ∈ MA ⟺ A ∈ MB.
Secondly, we have the following
Claim 1: For every A ∈ M(R), the collection MA is a monotone class.
To prove this, we start with a monotone sequence (B_n)_{n=1}^∞ in MA, and we prove
that the limit B = lim_{n→∞} B_n again belongs to MA. First of all, clearly B belongs
to M(R). Second, since the sequences (A ∪ B_n)_{n=1}^∞, (A ∖ B_n)_{n=1}^∞, and (B_n ∖ A)_{n=1}^∞
are all monotone sequences in M(R), and since M(R) is a monotone class, it follows
that the limits A ∪ B = lim_{n→∞} (A ∪ B_n), A ∖ B = lim_{n→∞} (A ∖ B_n), and B ∖ A =
lim_{n→∞} (B_n ∖ A) all belong to M(R), so B indeed belongs to MA.
Having proven Claim 1, we now prove (∗) in a particular case:
Claim 2: MA = M(R), ∀ A ∈ R.
Fix A ∈ R. We know that MA ⊂ M(R) is a monotone class, so it suffices to prove
that MA ⊃ R. But this is obvious, since R is a ring.
We now proceed with the proof of (∗) in the general case. If we define
\[ U = \{ A \in M(R) : M_A = M(R) \}, \]
all we need to prove is the equality U = M(R). By Claim 2, we know that U ⊃ R, so
it suffices to prove that U is a monotone class. Start then with a monotone sequence
(A_n)_{n=1}^∞ in U, and let us show that the limit A = lim_{n→∞} A_n again belongs to U. First
of all, A belongs to M(R). What we then have to prove is that MA = M(R). Start
with some arbitrary B ∈ M(R). We know that B ∈ M_{A_n}, ∀ n ≥ 1. Using (4)
we have A_n ∈ MB, ∀ n ≥ 1, and using the fact that MB is a monotone class (see
Claim 1), it follows that A = lim_{n→∞} A_n belongs to MB. Using (4) again, this
gives B ∈ MA. This way we have proven that any B ∈ M(R) also belongs to MA,
so we indeed have the equality MA = M(R). □

Corollary 2.2. Let X be a non-empty set, and let E be an arbitrary family
of subsets of X. Then the σ-ring, and the σ-algebra generated by E respectively, are
given as the monotone classes generated by the ring, and by the algebra generated
by E respectively. That is, one has the equalities:
(i) S(E) = M(R(E));
(ii) Σ(E) = M(A(E)).
Proof. (i). By the above result, since R(E) is a ring, we have
(5) M(R(E)) = S(R(E)).
Since S(R(E)) is a σ-ring, and contains E, it follows that
S(R(E)) ⊃ S(E).
Conversely, since S(E) is a ring, and contains E, we get the inclusion
S(E) ⊃ R(E),
and since S(E) is a σ-ring, we will now get
S(E) ⊃ S(R(E)),
so we get
S(R(E)) = S(E).
Using (5), the desired equality follows.
(ii). This follows from Proposition 2.1 and part (i) applied to E ∪ {X}, combined
with the obvious equality Σ(E) = S(E ∪ {X}). □

The σ-ring and the σ-algebra, generated by an arbitrary collection of sets, are
related by means of the following result.
Proposition 2.3. Let X be a non-empty set, and let E be an arbitrary collection
of subsets of X. Define the collection
\[ P_\sigma^{E}(X) = \Big\{ A \subset X : \text{there exists } (E_n)_{n=1}^{\infty} \subset E \text{ with } A \subset \bigcup_{n=1}^{\infty} E_n \Big\}. \]
(i) P^E_σ(X) is a σ-ring on X;
(ii) the σ-ring S(E) and the σ-algebra Σ(E), generated by E, satisfy the equality
S(E) = Σ(E) ∩ P^E_σ(X).

Proof. Part (i) is trivial.
To prove part (ii), we first observe that the intersection Σ(E) ∩ P^E_σ(X) is a
σ-ring, which obviously contains E, so we immediately get the inclusion
S(E) ⊂ Σ(E) ∩ P^E_σ(X).
The key ingredient in proving the inclusion “⊃” is contained in the following.
Claim: Given a set E ∈ E, the collection
\[ A_E(X) = \{ A \subset X : A \cap E \in S(E) \} \]
is a σ-algebra on X.
To prove this we need to check:
(a) if A belongs to AE(X), then X ∖ A also belongs to AE(X);
(b) whenever (A_n)_{n=1}^∞ is a sequence of sets in AE(X), it follows that the union
⋃_{n=1}^∞ A_n also belongs to AE(X).
To check (a) we simply remark that, since both E and A ∩ E belong to S(E), it
follows immediately that (X ∖ A) ∩ E = E ∖ (A ∩ E) also belongs to S(E), which
means that X ∖ A indeed belongs to AE(X).
Property (b) is clear, since the fact that A_n ∩ E belongs to S(E), for all n,
immediately gives the fact that (⋃_{n=1}^∞ A_n) ∩ E = ⋃_{n=1}^∞ (A_n ∩ E) belongs to S(E),
which means precisely that ⋃_{n=1}^∞ A_n belongs to AE(X).
Having proven the Claim, we now proceed with the proof of the inclusion
S(E) ⊃ Σ(E) ∩ P^E_σ(X). Start with some set A ∈ Σ(E) ∩ P^E_σ(X), and we will show
that A belongs to S(E). First of all, there exists a sequence (E_n)_{n=1}^∞ ⊂ E, such that
\[ A \subset \bigcup_{n=1}^{\infty} E_n. \tag{6} \]
Using the Claim, we know that for each n ∈ N, the collection A_{E_n}(X) is a σ-algebra.
This σ-algebra clearly contains E, so we have
Σ(E) ⊂ A_{E_n}(X), ∀ n ∈ N.
In particular, we get the fact that A ∈ A_{E_n}(X), which means that A ∩ E_n belongs to
S(E), for all n ∈ N. But then the inclusion (6) forces the equality
\[ A = \bigcup_{n=1}^{\infty} (A \cap E_n), \]
which then gives the fact that A indeed belongs to S(E). □
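On a finite set, countable unions reduce to finite unions, so the σ-ring and σ-algebra generated by a family coincide with the ring and algebra it generates. Under that assumption, the equality S(E) = Σ(E) ∩ P^E_σ(X) can be checked directly. The sketch below uses a hypothetical ground set and family, and the identity Σ(E) = S(E ∪ {X}) quoted earlier; `ring_closure` is a helper written for this example.

```python
def ring_closure(family):
    """Close a family of frozensets under symmetric difference and intersection."""
    fam = set(family)
    changed = True
    while changed:
        changed = False
        for A in list(fam):
            for B in list(fam):
                for C in (A ^ B, A & B):
                    if C not in fam:
                        fam.add(C)
                        changed = True
    return fam

X = frozenset({1, 2, 3, 4, 5})                      # hypothetical ground set
E = [frozenset({1, 2}), frozenset({2, 3})]          # hypothetical generators

S_E = ring_closure(E)              # S(E); on a finite set a sigma-ring is just a ring
Sigma_E = ring_closure(E + [X])    # Sigma(E) = S(E ∪ {X})

# A set lies in P_sigma^E(X) iff it is covered by members of E, i.e. by their union.
cover = frozenset().union(*E)
Sigma_E_covered = {A for A in Sigma_E if A <= cover}
```

Here Σ(E) has 16 elements (its atoms are {1}, {2}, {3}, {4, 5}), and exactly the 8 of them contained in {1, 2, 3} make up S(E).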

The above result motivates the following.


Definition. A collection E of subsets of X is said to be σ-total in X, if
X ∈ P^E_σ(X), i.e. there exists some sequence (E_n)_{n=1}^∞ ⊂ E with ⋃_{n=1}^∞ E_n = X. By
the above result, this is equivalent to the fact that X belongs to the σ-ring S(E)
generated by E, which in turn is equivalent to the equality Σ(E) = S(E).
We discuss now two more methods of constructing (σ-)rings, (σ-)algebras, or
monotone classes.
Notations. Let f : X → Y be a function, and let E ⊂ P(X) and G ⊂ P(Y )
be two arbitrary collections of sets. We define
\[ f_* E = \{ A \in P(Y) : f^{-1}(A) \in E \} \subset P(Y); \]
\[ f^* G = \{ f^{-1}(G) : G \in G \} \subset P(X). \]


Definitions. Let Θ be a type of set collections. We say that Θ is natural, if


for any map f : X → Y , one has the implications
(i) C of type Θ on X =⇒ f∗ C of type Θ on Y ;
(ii) D of type Θ on Y =⇒ f ∗ D of type Θ on X.
Examples 2.3. The types R, A, S, Σ, and M are natural.
The term “natural” is justified by the following.
Exercise 1. Let f : X → Y and g : Y → Z be maps.
(i) Prove that, for any collection C ⊂ P(X), one has the equality g_∗(f_∗C) =
(g ◦ f)_∗C.
(ii) Prove that, for any collection D ⊂ P(Z), one has the equality f^∗(g^∗D) =
(g ◦ f)^∗D.
Theorem 2.2 (Generating Theorem). Suppose Θ is a consistent class type,
which is natural. Let X and Y be non-empty sets, and let f : X → Y be a map.
For any collection G ⊂ P(Y ), one has the equality
\[ f^* \Theta(G) = \Theta(f^* G). \]
Proof. On the one hand, by naturality, we know that f^∗Θ(G) is of type Θ.
On the other hand, it is pretty clear that, since Θ(G) ⊃ G, we also have the inclusion
f^∗Θ(G) ⊃ f^∗G. Since Θ is consistent, it then follows that we have the inclusion
f^∗Θ(G) ⊃ Θ(f^∗G).
To prove the other inclusion, we consider the class
\[ C = f_* \big( \Theta(f^* G) \big) \subset P(Y). \]
By naturality, it follows that C is of type Θ on Y. For any G ∈ G, the obvious
relation
f^{−1}(G) ∈ f^∗G ⊂ Θ(f^∗G)
proves that G ∈ C. This means that we have the inclusion C ⊃ G, and since C is of
type Θ, it follows that we have the inclusion
Θ(G) ⊂ C.
This means that, for every A ∈ Θ(G), we have f^{−1}(A) ∈ Θ(f^∗G), which means
precisely that we have the desired inclusion
f^∗Θ(G) ⊂ Θ(f^∗G). □
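The Generating Theorem can be illustrated on finite sets by taking Θ to be the type of algebras. The sketch below is a check on one hypothetical example (the sets X, Y, the map f and the family G are all made up for illustration, and f is deliberately not surjective); `generated_algebra` is a helper written for this example, using the identity A(E) = R(E ∪ {X}) from Proposition 2.1.

```python
def generated_algebra(family, ground):
    """Algebra on `ground` generated by `family`, computed as the ring R(family ∪ {ground})."""
    fam = {frozenset(A) for A in family} | {frozenset(ground)}
    changed = True
    while changed:
        changed = False
        for A in list(fam):
            for B in list(fam):
                for C in (A ^ B, A & B):
                    if C not in fam:
                        fam.add(C)
                        changed = True
    return fam

X = frozenset(range(5))
Y = frozenset({10, 20, 30, 40})
f = {0: 10, 1: 10, 2: 20, 3: 20, 4: 30}      # hypothetical map; 40 is not in its range

def preimage(B):
    """f^{-1}(B) as a subset of X."""
    return frozenset(x for x in X if f[x] in B)

G = [{10}, {10, 20}, {40}]                   # hypothetical generators on Y

pullback_of_generated = {preimage(B) for B in generated_algebra(G, Y)}   # f^* A(G)
generated_by_pullback = generated_algebra({preimage(B) for B in G}, X)   # A(f^* G)
```

In this example A(G) has 16 elements on Y, but taking preimages collapses them to 8 subsets of X, which are exactly the elements of the algebra generated by the pulled-back generators.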
Example 2.4. Let Θ be a consistent class type, which is natural.
Let Y be some set, and let C be a collection of type Θ on Y. Given
a subset X ⊂ Y, we consider the inclusion map ι : X ↪ Y. The collection ι^∗C is
then of type Θ on X. It will be denoted by C|_X. Since ι^{−1}(A) = A ∩ X, ∀ A ∈ P(Y),
we have
\[ C\big|_X = \{ A \cap X : A \in C \}. \]
If E ⊂ P(Y) is a collection with C = Θ(E), then by the Generating Theorem we
have the equality
\[ \Theta(E)\big|_X = \Theta\big( \{ E \cap X : E \in E \} \big). \tag{7} \]
Comment. The exercise below shows that a “forward” version of the Gener-
ating Theorem does not hold in general. In other words, an equality of the type
f∗ Θ(G) = Θ(f∗ G) may fail. The reason is the fact that the collection f∗ G may be
relatively “small.”
Exercise 2. Consider the sets X = {1, 2, 3}, Y = {1, 2}, the function f : X →
Y, defined by f(1) = f(2) = 1, f(3) = 2, and the collection C = {{1}, {2}, ∅}.
Describe the collection f_∗C, the algebra A(C) generated by C (on X), and the
algebra A(f_∗C) generated by f_∗C (on Y). Prove that one has a strict inclusion
A(f_∗C) ⊊ f_∗A(C).
Exercise 3. Let Θ be a consistent natural type, let f : X → Y be a surjective
map, and let G be a collection of subsets of X. Assume one has the inclusion
(8) G ⊂ f^∗Θ(f_∗G).
Prove that one has the equality
f_∗Θ(G) = Θ(f_∗G).
(One instance when (8) holds is for example when f^{−1}(f(G)) = G, ∀ G ∈ G.)


Exercise 4*. Let Θ be one of the types A, R, S, Σ, or M. Let f : X → Y


be an injective map, and let G ⊂ P(X) be some arbitrary collection. Prove the
equality
f∗ Θ(G) = Θ(f∗ G).
Natural consistent types are useful, because it is possible to construct product
structures.
Definition. Let Θ be a consistent type which is natural. Let (Xi)i∈I be a
collection of non-empty sets. Assume that, for each i ∈ I, a collection Ei ⊂ P(Xi)
of type Θ is given. Consider the cartesian product X = ∏_{i∈I} Xi, together
with the projection maps πi : X → Xi, i ∈ I. The collection
\[ \Theta\text{-}\mathop{\times}\limits_{i \in I} E_i = \Theta\Big( \bigcup_{i \in I} \pi_i^* E_i \Big) \]
is a collection of type Θ on X, which is called the Θ-product. When there is no
danger of confusion, we use the notation ⨉_{i∈I} Ei.
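Remark 2.1. Use the notations from the above definition. Assume that, for
each i ∈ I, a collection Gi ⊂ P(Xi) is given. Then one has the equality
\[ \mathop{\times}\limits_{i \in I} \Theta(G_i) = \Theta\Big( \bigcup_{i \in I} \pi_i^* G_i \Big). \]
Indeed, if we define Ei = Θ(Gi), the inclusion ⊃ follows from the obvious inclusions
\[ \mathop{\times}\limits_{j \in I} E_j \supset \pi_i^* E_i \supset \pi_i^* G_i, \quad \forall\, i \in I. \]
The inclusion ⊂ follows from the inclusions
\[ \pi_i^* G_i \subset \Theta\Big( \bigcup_{j \in I} \pi_j^* G_j \Big), \quad \forall\, i \in I, \]
which combined with the fact that the right hand side is of type Θ, and the
Generating Theorem, give the inclusions
\[ \pi_i^* E_i = \pi_i^* \Theta(G_i) = \Theta(\pi_i^* G_i) \subset \Theta\Big( \bigcup_{j \in I} \pi_j^* G_j \Big), \quad \forall\, i \in I. \]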
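The equality in Remark 2.1 can be checked on a small product of finite sets, again with Θ taken to be the type of algebras. The sketch below is an illustration only: the factors X1, X2 and the generating families G1, G2 are hypothetical, and `generated_algebra` is a helper written for this example.

```python
from itertools import product

def generated_algebra(family, ground):
    """Algebra on `ground` generated by `family`, computed as the ring R(family ∪ {ground})."""
    fam = {frozenset(A) for A in family} | {frozenset(ground)}
    changed = True
    while changed:
        changed = False
        for A in list(fam):
            for B in list(fam):
                for C in (A ^ B, A & B):
                    if C not in fam:
                        fam.add(C)
                        changed = True
    return fam

X1, X2 = (0, 1), (0, 1, 2)                  # hypothetical factors
X = frozenset(product(X1, X2))              # cartesian product X1 × X2

def pull1(S):
    """Preimage of S ⊂ X1 under the first projection."""
    return frozenset(p for p in X if p[0] in S)

def pull2(S):
    """Preimage of S ⊂ X2 under the second projection."""
    return frozenset(p for p in X if p[1] in S)

G1 = [{0}]                                  # hypothetical generators on X1
G2 = [{0}, {1}]                             # hypothetical generators on X2
E1 = generated_algebra(G1, X1)              # A(G1)
E2 = generated_algebra(G2, X2)              # A(G2)

# Left side: product of the generated algebras; right side: algebra generated by pulled-back generators.
product_algebra = generated_algebra({pull1(S) for S in E1} | {pull2(S) for S in E2}, X)
from_generators = generated_algebra({pull1(S) for S in G1} | {pull2(S) for S in G2}, X)
```

In this example both sides turn out to be the full power set of the six-point product, so only the pulled-back generators are needed to describe the product algebra.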

Natural consistent types also allow one to define disjoint union structures.
Definitions. Let (Xi)i∈I be a collection of non-empty sets. Assume that, for
each i ∈ I, a collection Ci ⊂ P(Xi) is given. On the disjoint union X = ⨆_{i∈I} Xi
one defines the collection
\[ \bigvee_{i \in I} C_i = \{ C \subset X : C \cap X_i \in C_i, \ \forall\, i \in I \}. \]
Assume now Θ is a natural consistent type, and Ci is of type Θ on Xi, for each i ∈ I. If
we consider the inclusion maps ι_i : Xi → X, i ∈ I, then one clearly has the equality
\[ \bigvee_{i \in I} C_i = \bigcap_{i \in I} (\iota_i)_* C_i, \]
which means that ⋁_{i∈I} Ci is a collection of type Θ on X.

Exercise 5. Let I be countable, and let (Xi)i∈I be a collection of non-empty
sets. Assume that, for each i ∈ I, a collection Ci ⊂ P(Xi) is given, such that
∅ ∈ Ci. Prove the equalities
\[ \bigvee_{i \in I} S(C_i) = S\Big( \bigvee_{i \in I} C_i \Big) \quad \text{and} \quad \bigvee_{i \in I} \Sigma(C_i) = \Sigma\Big( \bigvee_{i \in I} C_i \Big). \]

We conclude with a discussion on certain constructions related to topology.


Definitions. Let X be a topological Hausdorff space. We consider the collec-
tion T of all open sets in X. The σ-algebra Σ(T) on X, generated by T is denoted
by Bor(X). The sets in Bor(X) are called Borel sets.
Remark that singleton sets are Borel, since they are closed. Moreover
• every countable set B ⊂ X is Borel.
One also defines the σ-algebra Borc (X) = Σ(CX ) generated by the class CX of
all compact subsets of X.
Another class of sets will also be of interest. Its construction uses the following
terminology.
A subset A ⊂ X is said to be σ-compact, if there exists a sequence (K_n)_{n=1}^∞ of
compact subsets of X, such that A = ⋃_{n=1}^∞ K_n. A set B ⊂ X is said to be relatively
σ-compact, if there exists a σ-compact set A with B ⊂ A. We set
\[ P_{\sigma c}(X) = \{ B \in P(X) : B \text{ relatively } \sigma\text{-compact} \}, \]


and we define
Borσc (X) = Bor(X) ∩ Pσc (X).
Proposition 2.4. Let X be a topological Hausdorff space.


(i) Pσc (X) is a σ-ring on X;
(ii) the σ-ring Borσc (X) coincides with the σ-ring S(CX ) generated by the
collection CX of all compact subsets of X.
Proof. Using the notations from Proposition 2.3, we have Pσc(X) = P^{C_X}_σ(X),

so part (i) is a consequence of Proposition 2.3.(i). By Proposition 2.3.(ii) we also


know that
S(CX ) = Σ(CX ) ∩ Pσc (X) = Borc (X) ∩ Pσc (X),
and since Borc (X) ⊂ Bor(X), we have the inclusion
S(CX ) ⊂ Borσc (X).
To prove the other inclusion, all we need to show is the inclusion
Borσc (X) ⊂ Borc (X).
Start with some arbitrary set B ∈ Borσc(X), and let us prove that
B ∈ Borc(X). Since B is relatively σ-compact, there exists a sequence (K_n)_{n=1}^∞
of compact sets, such that B ⊂ ⋃_{n=1}^∞ K_n. Define, for each integer n ≥ 1, the set
B_n = B ∩ K_n. Since B = ⋃_{n=1}^∞ B_n, it suffices to show that
(9) Bn ∈ Borc (X), ∀ n ∈ N.
Fix n, and let us analyze the inclusion ιn : Kn ,→ X. Denote by T the collection of
all open sets in X, and denote by TKn the collection of all sets D ⊂ Kn , which are
open in the induced topology, that is,
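\[ T_{K_n} = \{ D \cap K_n : D \in T \}. \]
By the Generating Theorem (Example 2.4), we know that
\[ \mathrm{Bor}(X)\big|_{K_n} = \Sigma(T)\big|_{K_n} = \Sigma_{K_n}\big( \{ D \cap K_n : D \in T \} \big) = \Sigma_{K_n}(T_{K_n}) = \mathrm{Bor}(K_n). \]
(Here the notation Σ_{K_n} indicates that the σ-algebra is taken on K_n.) In particular,
we get
\[ B_n = B \cap K_n \in \mathrm{Bor}(X)\big|_{K_n} = \mathrm{Bor}(K_n), \quad \forall\, n \in \mathbb{N}. \tag{10} \]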

Since K_n is compact, the σ-ring S(C_{K_n}), generated by all compact subsets of K_n, is
a σ-algebra on K_n (simply because it contains K_n). Notice that every set D ∈ T_{K_n}
is of the form K_n ∖ F, with F ⊂ K_n compact (in X), therefore D belongs to
S(C_{K_n}). Since S(C_{K_n}) is a σ-algebra on K_n, which contains T_{K_n}, we have
\[ \mathrm{Bor}(K_n) = \Sigma_{K_n}(T_{K_n}) \subset S(C_{K_n}) \subset S(C_X) \subset \mathrm{Bor}_c(X), \quad \forall\, n \in \mathbb{N}. \]


Now (9) immediately follows from the above inclusions, combined with (10). 
Remark 2.2. For a topological Hausdorff space, we always have the inclusions
Borσc (X) ⊂ Borc (X) ⊂ Bor(X).
The following are equivalent
(i) Borσc (X) = Bor(X);
(ii) X is σ-compact.
The following result explains when a minimal set of generators can be chosen
for the Borel sets.
Proposition 2.5. Let X be a topological space which is second countable, i.e.


there is a countable base for the topology. If S is any sub-base for the topology
(countable or not), then
Bor(X) = Σ(S).
Proof. Denote by T the collection of all open sets in X. Denote by V the
collection of all subsets of X, which can be written as finite intersections of sets in
S. It is obvious that S ⊂ V ⊂ Σ(S), so we have the equality Σ(S) = Σ(V). This
means that it suffices to prove the equality
(11) Σ(T) = Σ(V).
Notice that V is a base for the topology, which means that every open subset D ⊊ X
can be written as a union of sets in V. What we want to prove is
Claim: Every open set D ⊊ X is a countable union of sets in V.
To prove this fact, we fix an open set D ⊊ X, as well as a countable base B =
{B_n}_{n=1}^∞ for the topology. For every x ∈ D we define the set
Mx = {n ∈ N : there exists V ∈ V, such that x ∈ B_n ⊂ V ⊂ D}.
It is pretty clear that Mx ≠ ∅, ∀ x ∈ D. (First use the fact that V is a base, to
find V ∈ V such that x ∈ V ⊂ D, and then use the fact that B is a base to find n
such that x ∈ B_n ⊂ V.) If we put M = ⋃_{x∈D} Mx, then it is pretty obvious that
⋃_{n∈M} B_n = D. For every n ∈ M we choose some V_n ∈ V with B_n ⊂ V_n ⊂ D (use
the fact that n must belong to some Mx). It is then clear that D = ⋃_{n∈M} V_n, and
the claim follows.
As a consequence of the Claim, we see that any open set D ⊊ X automatically
belongs to Σ(V), and then we have the inclusion T ⊂ Σ(V) ⊂ Σ(T). This clearly
forces the equality (11). □
Corollary 2.3. Let I be a set which is at most countable, and let (Xi)i∈I be
a collection of second countable topological spaces. Then one has the equality
\[ \mathrm{Bor}\Big( \prod_{i \in I} X_i \Big) = \Sigma\text{-}\mathop{\times}\limits_{i \in I} \mathrm{Bor}(X_i), \tag{12} \]
where the product space ∏_{i∈I} Xi is equipped with the product topology.
Proof. By the definition of the product σ-algebra, we know that
\[ \Sigma\text{-}\mathop{\times}\limits_{i \in I} \mathrm{Bor}(X_i) = \Sigma\Big( \bigcup_{j \in I} \pi_j^* \mathrm{Bor}(X_j) \Big), \]
where πj : ∏_{i∈I} Xi → Xj, j ∈ I, denote the projection maps. Choose, for each j ∈ I,
a countable sub-base Sj for Xj, so that we have the equalities
Bor(Xj) = Σ(Sj), ∀ j ∈ I.
By Remark 2.1 we have the equality
\[ \Sigma\text{-}\mathop{\times}\limits_{i \in I} \mathrm{Bor}(X_i) = \Sigma\Big( \bigcup_{j \in I} \pi_j^* S_j \Big). \]
Since the collection ⋃_{i∈I} π_i^∗ Si is a countable sub-base for the product topology, the above equality,
combined with Proposition 2.5 immediately gives (12). □
Exercise 6. A. Prove that, if X is second countable, and S is a sub-base for its
topology (countable or not), with ⋃_{S∈S} S = X, then we have in fact the equality
Bor(X) = S(S).
B. Prove that, if X is Hausdorff, second countable, with card X ≥ 2, then for any
sub-base S (countable or not), we have the equality
Bor(X) = S(S).
Hints: Follow the proof above. Remark that every open set D ⊂ X, which is a countable union
of sets in V, belongs in fact to the σ-ring S(V) = S(S). So in either case, we only have to show
that X is a countable union of sets in V.
In case A, we trace the proof of the Claim, and we notice that the only property that we
used was the fact that, for every x ∈ D, there exists V ∈ V with x ∈ V ⊂ D, i.e. D is a (possibly
uncountable) union of sets in V. Since X itself satisfies this property, it follows that X is also a
countable union of sets in V.
In case B, we use the Hausdorff property to write X = D1 ∪ D2, with D1, D2 ⊊ X open.
Corollary 2.4. If X is a topological Hausdorff space, which is second count-
able, and X is infinite (as a set), then card Bor(X) = c.
Proof. First of all, since X is infinite, one can choose an infinite countable
subset A ⊂ X. Then A, and all its subsets are Borel, i.e. we have the inclusion
P(A) ⊂ Bor(X), thus proving the inequality
card Bor(X) ≥ card P(A) = 2^{ℵ0} = c.
Secondly, one can choose a base V for the topology, which is countable. We now
have Bor(X) = S(V ∪ {X}), so by Corollary 2.1 we get
\[ \operatorname{card} \mathrm{Bor}(X) \leq \big( \operatorname{card}(V \cup \{X\}) \big)^{\aleph_0} \leq \aleph_0^{\aleph_0} = \mathfrak{c}, \]
and the desired equality follows. □
Examples 2.5. A. Consider the extended real line [−∞, ∞] = R ∪ {−∞, ∞},
thought of as a compact space, homeomorphic to the interval [−π/2, π/2], via the map
f : [−π/2, π/2] → [−∞, ∞], defined by
\[
f(t) = \begin{cases}
-\infty & \text{if } t = -\pi/2 \\
\tan t & \text{if } -\pi/2 < t < \pi/2 \\
\infty & \text{if } t = \pi/2
\end{cases}
\]
Notice that, when restricted to R = (−∞, ∞), this topology agrees with the usual
topology. In particular, this gives the equality Bor([−∞, ∞])|_R = Bor(R).
Let A ⊂ R be a dense subset. Consider the collections
E1 = {(a, ∞] : a ∈ A}; E2 = {[a, ∞] : a ∈ A};
E3 = {[−∞, a) : a ∈ A}; E4 = {[−∞, a] : a ∈ A}.
With these notations we have the equalities
Bor([−∞, ∞]) = Σ(E1) = Σ(E2) = Σ(E3) = Σ(E4).
First of all, we notice that each set in E1 ∪ E2 ∪ E3 ∪ E4 is either open or closed,
which means that
E1 ∪ E2 ∪ E3 ∪ E4 ⊂ Bor([−∞, ∞]),
thus giving the inclusions
Σ(Ek ) ⊂ Bor([−∞, ∞]), ∀ k ∈ {1, 2, 3, 4}.
Second, we observe that E1 ∪ E3 is a sub-base for the topology, and since [−∞, ∞]
is obviously second countable, we will have the equality
Bor([−∞, ∞]) = Σ(E1 ∪ E3 ).
So, in order to finish the proof we only need to show the inclusions
(13) E1 ∪ E3 ⊂ Σ(Ek ), ∀ k ∈ {1, 2, 3, 4}.
Since every set in E2 has its complement in E3, and vice versa, we have the inclusions
E2 ⊂ Σ(E3 ) and E3 ⊂ Σ(E2 ),
which prove the equality
(14) Σ(E2 ) = Σ(E3 ).
Likewise, we have the equality
(15) Σ(E1 ) = Σ(E4 ).
This means that we only have to prove (13) for k = 2 and k = 4. The case k = 2
amounts to proving that E1 ⊂ Σ(E2). Fix some a ∈ A. For every integer n ≥ 1 we
choose a_n ∈ (a, a + 1/n) ∩ A. Then the equality
\[ (a, \infty] = \bigcup_{n=1}^{\infty} [a_n, \infty] \]
clearly shows that (a, ∞] ∈ Σ(E2).
The case k = 4 amounts to proving that E3 ⊂ Σ(E4). Fix some a ∈ A. For
every integer n ≥ 1 we choose a_n ∈ (a − 1/n, a) ∩ A. Then the equality
\[ [-\infty, a) = \bigcup_{n=1}^{\infty} [-\infty, a_n] \]
clearly shows that [−∞, a) ∈ Σ(E4).


B. If we work on R, and we consider the collections
E′_k = {E ∩ R : E ∈ Ek}, k = 1, 2, 3, 4,
then by the Generating Theorem (Example 2.4) we have the equalities
Bor(R) = Σ(E′_k) = S(E′_k), k = 1, 2, 3, 4.
(The fact that the σ-algebra Σ(E′_k) and the σ-ring S(E′_k) coincide is a consequence
of the fact that E′_k is σ-total in R.)
C. Let $X$ be a separable metric space. Let $A \subset X$ be a dense set, and let $R \subset (0, \infty)$ be a subset with $\inf R = 0$. Then the collection
\[
\mathcal{S}_{A,R} = \{B_r(a) : r \in R,\ a \in A\}
\]
is clearly a base for the metric topology. Since $X$ is separable, one can choose both $A$ and $R$ to be countable, which proves that $X$ is automatically second countable. Then for any choice of $A$ and $R$, we will have the equality
\[
\mathrm{Bor}(X) = \Sigma\big(\{B_r(a) : r \in R,\ a \in A\}\big) = S\big(\{B_r(a) : r \in R,\ a \in A\}\big). \tag{16}
\]
(The equality between the generated σ-algebra and σ-ring follows from Exercise 1.A.) As particular cases when the equality (16) holds, one has the metric spaces which are σ-compact.
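Exercise 7*. Let $I$ be an uncountable set, and let $(X_i)_{i \in I}$ be a collection of topological spaces. Assume that for each $i \in I$, there exists at least one non-empty closed subset $F_i \subsetneq X_i$. (This is the case for example when $X_i$ is Hausdorff, and $\operatorname{card} X_i \ge 2$.) Prove that one has a strict inclusion
\[
\mathrm{Bor}\Big(\prod_{i \in I} X_i\Big) \supsetneq \Sigma\text{-}\bigtimes_{i \in I} \mathrm{Bor}(X_i).
\]
Hint: For every subset $J \subset I$, define the projection map $\pi_J : \prod_{i \in I} X_i \to \prod_{i \in J} X_i$. Consider the collection
\[
\mathcal{A} = \Big\{ A \subset \prod_{i \in I} X_i : \text{there exists } J \subset I \text{ countable, such that } A = \pi_J^{-1}\big(\pi_J(A)\big) \Big\}.
\]
Prove that $\mathcal{A} \cup \{\varnothing\}$ is a σ-algebra, which contains $\bigcup_{i \in I} \pi_i^* \mathrm{Bor}(X_i)$. Prove that one has a strict inclusion $\mathrm{Bor}\big(\prod_{i \in I} X_i\big) \supsetneq \mathcal{A} \cup \{\varnothing\}$, by constructing a non-empty closed set $F \subset \prod_{i \in I} X_i$, which does not belong to $\mathcal{A}$.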
Lecture 20

3. Measurable spaces and measurable maps


In this section we discuss a certain type of maps related to σ-algebras.
Definitions. A measurable space is a pair (X, A) consisting of a (non-empty)
set X and a σ-algebra A on X.
Given two measurable spaces (X, A) and (Y, B), a measurable map T : (X, A) →
(Y, B) is simply a map T : X → Y , with the property
(1) T −1 (B) ∈ A, ∀ B ∈ B.
Remark 3.1. In terms of the constructions outlined in Section 2, measurability
for maps can be characterized as follows. Given measurable spaces (X, A) and
(Y, B), and a map T : X → Y , the following are equivalent:
(i) T : (X, A) → (Y, B) is measurable;
(ii) T ∗ B ⊂ A;
(iii) T∗ A ⊃ B.
Recall that
\[
T^* \mathcal{B} = \{T^{-1}(B) : B \in \mathcal{B}\}; \qquad T_* \mathcal{A} = \{B \subset Y : T^{-1}(B) \in \mathcal{A}\}.
\]
With these equalities, everything is immediate.
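The equivalence in Remark 3.1 can be checked by brute force on finite sets. The following is a minimal sketch, not part of the original notes: the sets `X`, `Y`, the two σ-algebras, and the maps `T_good`, `T_bad` are hypothetical examples chosen for illustration, and maps are stored as dictionaries.

```python
from itertools import combinations

def powerset(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return {frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)}

def preimage(T, B):
    """T^{-1}(B) for a map T given as a dict X -> Y."""
    return frozenset(x for x in T if T[x] in B)

def pullback(T, sigma_B):
    """T^* B = { T^{-1}(B) : B in B }."""
    return {preimage(T, B) for B in sigma_B}

def pushforward(T, sigma_A, Y):
    """T_* A = { B subset of Y : T^{-1}(B) in A }."""
    return {B for B in powerset(Y) if preimage(T, B) in sigma_A}

def is_measurable(T, sigma_A, sigma_B):
    """Condition (1): every preimage of a set in B lies in A."""
    return all(preimage(T, B) in sigma_A for B in sigma_B)

# Hypothetical finite measurable spaces (X, A) and (Y, B).
X = {1, 2, 3, 4}
Y = {"u", "v", "w"}
sigma_A = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), frozenset(X)}
sigma_B = {frozenset(), frozenset({"u"}), frozenset({"v", "w"}), frozenset(Y)}

T_good = {1: "u", 2: "u", 3: "v", 4: "w"}   # respects the blocks {1,2} and {3,4}
T_bad = {1: "u", 2: "v", 3: "v", 4: "w"}    # splits the block {1,2}

for T in (T_good, T_bad):
    print(is_measurable(T, sigma_A, sigma_B),
          pullback(T, sigma_B) <= sigma_A,
          pushforward(T, sigma_A, Y) >= sigma_B)
```

For each map the three printed truth values agree, as conditions (i), (ii) and (iii) predict.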


The following summarizes some useful properties of measurable maps.
Proposition 3.1. Let (X, A) be a measurable space.
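(i) If $\mathcal{A}'$ is any σ-algebra, with $\mathcal{A}' \subset \mathcal{A}$, then the identity map $\mathrm{Id}_X : (X, \mathcal{A}) \to (X, \mathcal{A}')$ is measurable.
(ii) For any subset $M \subset X$, the inclusion map $\iota : (M, \mathcal{A}|_M) \hookrightarrow (X, \mathcal{A})$ is measurable.
(iii) If $(Y, \mathcal{B})$ and $(Z, \mathcal{C})$ are measurable spaces, and if $(X, \mathcal{A}) \xrightarrow{\ T\ } (Y, \mathcal{B}) \xrightarrow{\ S\ } (Z, \mathcal{C})$ are measurable maps, then the composition $S \circ T : (X, \mathcal{A}) \to (Z, \mathcal{C})$ is again a measurable map.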

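Proof. (i). This is trivial, since $(\mathrm{Id}_X)^* \mathcal{A}' = \mathcal{A}' \subset \mathcal{A}$.
(ii). This is again trivial, since $\iota^* \mathcal{A} = \mathcal{A}|_M$.
(iii). Start with some set $C \in \mathcal{C}$, and let us prove that $(S \circ T)^{-1}(C) \in \mathcal{A}$. We know that $(S \circ T)^{-1}(C) = T^{-1}\big(S^{-1}(C)\big)$. Since $S$ is measurable, we have $S^{-1}(C) \in \mathcal{B}$, and since $T$ is measurable, we have $T^{-1}\big(S^{-1}(C)\big) \in \mathcal{A}$. □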

Often, one would like to check the measurability condition (1) on a small col-
lection of B’s. Such a criterion is the following.
Lemma 3.1. Let (X, A) and (Y, B) be measurable spaces. Assume B = Σ(E),
for some collection of sets E ⊂ P(Y ). For a map T : X → Y , the following are
equivalent:
(i) T : (X, A) → (Y, B) is measurable;
(ii) T −1 (E) ∈ A, ∀ E ∈ E.

Proof. The implication (i) ⇒ (ii) is trivial.


To prove the implication (ii) ⇒ (i), assume (ii) holds. We first observe that condition (ii) reads $T^* \mathcal{E} \subset \mathcal{A}$. Since $\mathcal{A}$ is a σ-algebra, we get the inclusion
\[
\Sigma(T^* \mathcal{E}) \subset \mathcal{A}.
\]
Using the Generating Theorem 2.2, we have
\[
T^* \mathcal{B} = T^* \Sigma(\mathcal{E}) = \Sigma(T^* \mathcal{E}) \subset \mathcal{A},
\]
and, by the preceding remark, we are done. □
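Lemma 3.1 can be illustrated on a finite set, where the generated σ-algebra is computable by closing the generators under complements and unions. This is only a sketch under stated assumptions: the helper `generate_sigma_algebra`, the sets, the generators, and the map `T` are hypothetical and do not come from the notes.

```python
def generate_sigma_algebra(Y, generators):
    """Smallest family on the finite set Y containing the generators and
    closed under complements and unions (on a finite set this is Sigma(E))."""
    Y = frozenset(Y)
    family = {frozenset(), Y} | {frozenset(E) for E in generators}
    while True:
        new = {Y - S for S in family} | {S | R for S in family for R in family}
        if new <= family:
            return family
        family |= new

# Hypothetical data: T(x) = x mod 6 from X = {0,...,11} to Y = {0,...,5}.
X = range(12)
Y = range(6)

def T(x):
    return x % 6

def preimage(B):
    return frozenset(x for x in X if T(x) in B)

generators = [{0, 1}, {1, 2, 3}]                     # the collection E
sigma_B = generate_sigma_algebra(Y, generators)      # B = Sigma(E)
sigma_A = generate_sigma_algebra(X, [{0, 6}, {1, 7}, {2, 3, 8, 9}])

# Condition (ii): test only the generators.
holds_on_generators = all(preimage(E) in sigma_A for E in generators)
# Condition (i): test every set of the generated sigma-algebra.
holds_everywhere = all(preimage(B) in sigma_A for B in sigma_B)
print(len(sigma_B), holds_on_generators, holds_everywhere)
```

Only two preimages need to be inspected to conclude measurability with respect to all sets of `sigma_B`, which is the practical content of the lemma.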

Corollary 3.1. Let $(X, \mathcal{A})$ be a measurable space, let $Y$ be a topological Hausdorff space which is second countable, and let $\mathcal{S}$ be a sub-base for the topology of $Y$. For a map $T : X \to Y$, the following are equivalent:
(i) $T : (X, \mathcal{A}) \to \big(Y, \mathrm{Bor}(Y)\big)$ is a measurable map;
(ii) $T^{-1}(S) \in \mathcal{A}$, $\forall\, S \in \mathcal{S}$.

Proof. Immediate from the above Lemma, and Proposition 2.2, which states
that Bor(Y ) = Σ(S). 

We know (see Section 19) that the type Σ is consistent and natural. In par-
ticular, measurability behaves nicely with respect to products and disjoint unions.
More explicitly one has the following.
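Proposition 3.2. Let $(X_i, \mathcal{A}_i)_{i \in I}$ be a collection of measurable spaces. Consider the sets $X = \prod_{i \in I} X_i$ and $Y = \bigsqcup_{i \in I} X_i$, and the σ-algebras
\[
\mathcal{A} = \Sigma\text{-}\bigtimes_{i \in I} \mathcal{A}_i \quad \text{and} \quad \mathcal{B} = \bigvee_{i \in I} \mathcal{A}_i.
\]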

Let (Z, G) be a measurable space.


(i) If we denote by πi : X → Xi , i ∈ I, the projection maps, then a map
f : (Z, G) → (X, A) is measurable, if and only if, all the maps πi ◦ f :
(Z, G) → (Xi , Ai ), i ∈ I, are measurable.
(ii) If we denote by $\epsilon_i : X_i \to Y$, $i \in I$, the inclusion maps, then a map $g : (Y, \mathcal{B}) \to (Z, \mathcal{G})$ is measurable, if and only if, all the maps $g \circ \epsilon_i : (X_i, \mathcal{A}_i) \to (Z, \mathcal{G})$, $i \in I$, are measurable.

Proof. (i). By the definition of the product σ-algebra, we know that
\[
\mathcal{A} = \Sigma\Big(\bigcup_{i \in I} \pi_i^* \mathcal{A}_i\Big). \tag{2}
\]
If we fix some index $i \in I$, then the obvious inclusion $\pi_i^* \mathcal{A}_i \subset \mathcal{A}$ immediately shows that $\pi_i : (X, \mathcal{A}) \to (X_i, \mathcal{A}_i)$ is measurable. Therefore, if $f : (Z, \mathcal{G}) \to (X, \mathcal{A})$ is measurable, then by Proposition 3.1 it follows that all compositions $\pi_i \circ f : (Z, \mathcal{G}) \to (X_i, \mathcal{A}_i)$, $i \in I$, are measurable.
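Conversely, assume all the compositions $\pi_i \circ f$ are measurable, and let us show that $f : (Z, \mathcal{G}) \to (X, \mathcal{A})$ is measurable. By Lemma 3.1 and (2), all we need to prove is the fact that
\[
f^*\Big(\bigcup_{i \in I} \pi_i^* \mathcal{A}_i\Big) \subset \mathcal{G},
\]
which is equivalent to
\[
f^*\big(\pi_i^* \mathcal{A}_i\big) \subset \mathcal{G}, \quad \forall\, i \in I.
\]
But this is obvious, because $f^*\big(\pi_i^* \mathcal{A}_i\big) = (\pi_i \circ f)^* \mathcal{A}_i$, and $\pi_i \circ f$ is measurable, for all $i \in I$.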
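(ii). By the definition of the σ-algebra sum, we know that
\[
\mathcal{B} = \bigcap_{i \in I} \epsilon_{i*} \mathcal{A}_i. \tag{3}
\]
If we fix some index $i \in I$, then the obvious inclusion $\epsilon_{i*} \mathcal{A}_i \supset \mathcal{B}$ immediately shows that $\epsilon_i : (X_i, \mathcal{A}_i) \to (Y, \mathcal{B})$ is measurable. Therefore, if $g : (Y, \mathcal{B}) \to (Z, \mathcal{G})$ is measurable, then by Proposition 3.1 it follows that all compositions $g \circ \epsilon_i : (X_i, \mathcal{A}_i) \to (Z, \mathcal{G})$, $i \in I$, are measurable.
Conversely, assume all the compositions $g \circ \epsilon_i$ are measurable, and let us show that $g : (Y, \mathcal{B}) \to (Z, \mathcal{G})$ is measurable. This is equivalent to the inclusion $g_* \mathcal{B} \supset \mathcal{G}$. By (3) we immediately have
\[
g_* \mathcal{B} = g_*\Big(\bigcap_{i \in I} \epsilon_{i*} \mathcal{A}_i\Big) = \bigcap_{i \in I} g_*\big(\epsilon_{i*} \mathcal{A}_i\big). \tag{4}
\]
We know however that, since the maps $g \circ \epsilon_i$ are all measurable, we have
\[
g_*\big(\epsilon_{i*} \mathcal{A}_i\big) = (g \circ \epsilon_i)_* \mathcal{A}_i \supset \mathcal{G}, \quad \forall\, i \in I,
\]
so the desired inclusion is an immediate consequence of (4). □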


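Conventions. Let $(X, \mathcal{A})$ be a measurable space. An extended real-valued function $f : (X, \mathcal{A}) \to [-\infty, \infty]$ is said to be a measurable function, if it is measurable in the above sense as a map $f : (X, \mathcal{A}) \to \big([-\infty, \infty], \mathrm{Bor}([-\infty, \infty])\big)$. If $f$ has values in $\mathbb{R}$, this is equivalent to the fact that $f$ is measurable as a map $f : (X, \mathcal{A}) \to \big(\mathbb{R}, \mathrm{Bor}(\mathbb{R})\big)$. Likewise, a complex valued function $f : (X, \mathcal{A}) \to \mathbb{C}$ is measurable, if it is measurable as a map $f : (X, \mathcal{A}) \to \big(\mathbb{C}, \mathrm{Bor}(\mathbb{C})\big)$. If $\mathbb{K}$ is one of the fields $\mathbb{R}$ or $\mathbb{C}$, we define the set
\[
B_{\mathbb{K}}(X, \mathcal{A}) = \{ f : (X, \mathcal{A}) \to \mathbb{K} : f \text{ measurable function} \}.
\]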

Remark 3.2. Let (X, A) be a measurable space. If A ⊂ R is a dense subset,


then the results from Section 2, combined with Lemma 2.1, show that the measur-
ability of a function f : (X, A) → [−∞, ∞] is equivalent to any of the following
conditions:
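• $f^{-1}\big((a, \infty]\big) \in \mathcal{A}$, $\forall\, a \in A$;
• $f^{-1}\big([a, \infty]\big) \in \mathcal{A}$, $\forall\, a \in A$;
• $f^{-1}\big([-\infty, a)\big) \in \mathcal{A}$, $\forall\, a \in A$;
• $f^{-1}\big([-\infty, a]\big) \in \mathcal{A}$, $\forall\, a \in A$.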
Definition. If X and Y are topological Hausdorff spaces, a map T : X → Y
is said to be Borel measurable, if $T$ is measurable as a map
\[
T : \big(X, \mathrm{Bor}(X)\big) \to \big(Y, \mathrm{Bor}(Y)\big).
\]
In the cases when $Y = \mathbb{R}, \mathbb{C}, [-\infty, \infty]$, a Borel measurable map will be simply called a Borel measurable function.
For $\mathbb{K} = \mathbb{R}, \mathbb{C}$, we define
\[
B_{\mathbb{K}}(X) = \{ f : X \to \mathbb{K} : f \text{ Borel measurable function} \}.
\]
Remark 3.3. If X and Y are topological Hausdorff spaces, then any continuous
map $T : X \to Y$ is Borel measurable. This follows from Lemma 3.1, from the fact that
\[
\mathrm{Bor}(Y) = \Sigma\big(\{D \subset Y : D \text{ open}\}\big),
\]
and the fact that $T^{-1}(D)$ is open, hence in $\mathrm{Bor}(X)$, for every open set $D \subset Y$.
Measurable maps behave nicely with respect to “measurable countable opera-
tions,” as suggested by the following result.
Proposition 3.3. Let $(X, \mathcal{A})$ and $(Z, \mathcal{B})$ be measurable spaces, let $I$ be a set which is at most countable, and let $(Y_i)_{i \in I}$ be a family of topological Hausdorff spaces, each of which is second countable. Suppose a measurable map $T_i : (X, \mathcal{A}) \to \big(Y_i, \mathrm{Bor}(Y_i)\big)$ is given, for each $i \in I$. Define the map $T : X \to \prod_{i \in I} Y_i$ by
\[
T(x) = \big(T_i(x)\big)_{i \in I}, \quad \forall\, x \in X.
\]
Equip the product space $Y = \prod_{i \in I} Y_i$ with the product topology.
For any measurable map $g : \big(Y, \mathrm{Bor}(Y)\big) \to (Z, \mathcal{B})$, the composition $g \circ T : (X, \mathcal{A}) \to (Z, \mathcal{B})$ is measurable.

Proof. We know (see Corollary 2.3) that we have the equality
\[
\mathrm{Bor}(Y) = \Sigma\text{-}\bigtimes_{i \in I} \mathrm{Bor}(Y_i).
\]
By Proposition 3.2, the map $T : (X, \mathcal{A}) \to \big(Y, \mathrm{Bor}(Y)\big)$ is measurable, so by Proposition 3.1, the composition $g \circ T : (X, \mathcal{A}) \to (Z, \mathcal{B})$ is also measurable. □

The above result has many useful applications.


Corollary 3.2. Suppose (X, A) is a measurable space, and K is either R or C.
Then, when equipped with point-wise addition and multiplication, the set BK (X, A)
is a unital K-algebra.

Proof. Clearly the constant function 1 is measurable.


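Also, if $f \in B_{\mathbb{K}}(X, \mathcal{A})$ and $\lambda \in \mathbb{K}$, then the function $\lambda f$ is again measurable, since it can be written as the composition $M_\lambda \circ f$, where $M_\lambda : \mathbb{K} \ni \alpha \longmapsto \lambda\alpha \in \mathbb{K}$ is obviously continuous.
Finally, let us show that if $f_1, f_2 \in B_{\mathbb{K}}(X, \mathcal{A})$, then $f_1 + f_2$ and $f_1 \cdot f_2$ again belong to $B_{\mathbb{K}}(X, \mathcal{A})$. This is however immediate from Proposition 3.3, applied to the index set $I = \{1, 2\}$, the spaces $Y_1 = Y_2 = \mathbb{K}$, and the continuous maps
\[
g_1 : \mathbb{K}^2 \ni (\lambda_1, \lambda_2) \longmapsto \lambda_1 + \lambda_2 \in \mathbb{K}, \qquad g_2 : \mathbb{K}^2 \ni (\lambda_1, \lambda_2) \longmapsto \lambda_1 \cdot \lambda_2 \in \mathbb{K}. \quad \square
\]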

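Corollary 3.3. If $(X, \mathcal{A})$ is a measurable space, then a complex valued function $f : X \to \mathbb{C}$ is measurable, if and only if the real valued functions $\operatorname{Re} f, \operatorname{Im} f : X \to \mathbb{R}$ are measurable.

Proof. If $f$ is measurable, then composing $f$ with the continuous maps
\[
\rho : \mathbb{C} \ni z \longmapsto \operatorname{Re} z \in \mathbb{R} \quad \text{and} \quad \gamma : \mathbb{C} \ni z \longmapsto \operatorname{Im} z \in \mathbb{R},
\]
immediately gives the measurability of $\operatorname{Re} f = \rho \circ f$ and $\operatorname{Im} f = \gamma \circ f$.
Conversely, if both $\operatorname{Re} f, \operatorname{Im} f : X \to \mathbb{R}$ are measurable, then the measurability of $f$ follows from Proposition 3.3, applied to $Y_1 = Y_2 = \mathbb{R}$, the functions $f_1 = \operatorname{Re} f$ and $f_2 = \operatorname{Im} f$, and to the continuous function
\[
g : \mathbb{R}^2 \ni (a, b) \longmapsto a + bi \in \mathbb{C}. \quad \square
\]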

Corollary 3.4. Let $(X, \mathcal{A})$ be a measurable space, let $I$ be a set which is at most countable, and let $f_i : (X, \mathcal{A}) \to [-\infty, \infty]$, $i \in I$, be a collection of measurable functions. Then the functions $g, h : X \to [-\infty, \infty]$, defined by
\[
g(x) = \inf\{f_i(x) : i \in I\} \quad \text{and} \quad h(x) = \sup\{f_i(x) : i \in I\}, \quad \forall\, x \in X,
\]
are both measurable.
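Proof. Define the maps $m, M : \prod_{i \in I} [-\infty, \infty] \to [-\infty, \infty]$ by
\[
m(x) = \inf\{x_i : i \in I\} \quad \text{and} \quad M(x) = \sup\{x_i : i \in I\}, \quad \forall\, x = (x_i)_{i \in I} \in \prod_{i \in I} [-\infty, \infty].
\]
By Proposition 3.3, it suffices to prove the (Borel) measurability of the maps $m$ and $M$.
To prove the measurability of $m$, we are going to show that
\[
m^{-1}\big([-\infty, a)\big) \in \mathrm{Bor}\Big(\prod_{i \in I} [-\infty, \infty]\Big), \quad \forall\, a \in \mathbb{R}.
\]
But this is quite obvious, since a point $x = (x_i)_{i \in I}$ belongs to $m^{-1}\big([-\infty, a)\big)$, if and only if there exists some $j \in I$ with $x_j < a$. In other words, if we define the projections $\pi_j : \prod_{i \in I} [-\infty, \infty] \to [-\infty, \infty]$, then we have
\[
m^{-1}\big([-\infty, a)\big) = \bigcup_{j \in I} \pi_j^{-1}\big([-\infty, a)\big).
\]
This shows that in fact $m^{-1}\big([-\infty, a)\big)$ is open, hence clearly Borel.
To prove the measurability of $M$, we are going to show that
\[
M^{-1}\big((a, \infty]\big) \in \mathrm{Bor}\Big(\prod_{i \in I} [-\infty, \infty]\Big), \quad \forall\, a \in \mathbb{R}.
\]
But this is again clear, since, as before, we have the equality
\[
M^{-1}\big((a, \infty]\big) = \bigcup_{j \in I} \pi_j^{-1}\big((a, \infty]\big),
\]
which shows that in fact $M^{-1}\big((a, \infty]\big)$ is open, hence Borel. □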
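The two set identities behind this proof, namely that the infimum is below $a$ exactly where some $f_i$ is below $a$, and the supremum is above $a$ exactly where some $f_i$ is above $a$, can be checked numerically. The sketch below uses a hypothetical finite family of extended real-valued functions on a six-point set; none of these names come from the notes.

```python
import math

X = range(6)

# Hypothetical family (f_i) of extended real-valued functions on X.
fs = [
    lambda x: x - 2.0,
    lambda x: math.inf if x == 3 else -float(x),
    lambda x: -math.inf if x == 0 else 1.5,
]

def g(x):
    """Pointwise infimum of the family."""
    return min(f(x) for f in fs)

def h(x):
    """Pointwise supremum of the family."""
    return max(f(x) for f in fs)

def below(f, a):
    """The set f^{-1}([-inf, a))."""
    return {x for x in X if f(x) < a}

def above(f, a):
    """The set f^{-1}((a, inf])."""
    return {x for x in X if f(x) > a}

thresholds = [-3.0, -1.0, 0.0, 1.5, 2.0, 10.0]
identities_hold = all(
    below(g, a) == set().union(*(below(f, a) for f in fs))
    and above(h, a) == set().union(*(above(f, a) for f in fs))
    for a in thresholds
)
print(identities_hold)
```

Since each set `below(f, a)` or `above(f, a)` is measurable when `f` is, and the index set is at most countable, the unions stay inside the σ-algebra.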

Corollary 3.5. Let $(X, \mathcal{A})$ be a measurable space, and let $f_n : (X, \mathcal{A}) \to [-\infty, \infty]$, $n \in \mathbb{N}$, be a sequence of measurable functions. Then the functions $g, h : X \to [-\infty, \infty]$, defined by
\[
g(x) = \liminf_{n \to \infty} f_n(x) \quad \text{and} \quad h(x) = \limsup_{n \to \infty} f_n(x), \quad \forall\, x \in X,
\]
are both measurable.

Proof. For every $n \in \mathbb{N}$, define the functions $g_n, h_n : X \to [-\infty, \infty]$ by
\[
g_n(x) = \inf\{f_k(x) : k \ge n\} \quad \text{and} \quad h_n(x) = \sup\{f_k(x) : k \ge n\}, \quad \forall\, x \in X.
\]
By Corollary 3.4, we know that $g_n$ and $h_n$ are measurable for all $n \in \mathbb{N}$. Since
\[
g(x) = \sup\{g_n(x) : n \in \mathbb{N}\} \quad \text{and} \quad h(x) = \inf\{h_n(x) : n \in \mathbb{N}\}, \quad \forall\, x \in X,
\]
the fact that both $g$ and $h$ are measurable follows again from Corollary 3.4. □
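Corollary 3.6. Let $(X, \mathcal{A})$ be a measurable space, and let
\[
f_n : (X, \mathcal{A}) \to [-\infty, \infty], \quad n \in \mathbb{N},
\]
be a sequence of measurable functions, with the property that, for each $x \in X$, the sequence $\big(f_n(x)\big)_{n=1}^{\infty} \subset [-\infty, \infty]$ has a limit. Then the function $f : X \to [-\infty, \infty]$, defined by
\[
f(x) = \lim_{n \to \infty} f_n(x), \quad \forall\, x \in X,
\]
is again measurable.
Proof. Immediate from the above result, since in this case the limit coincides with both $\liminf$ and $\limsup$. □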
Exercise 1. If fn : R → R, n ∈ N, are continuous functions, and if f (x) =
limn→∞ fn (x) exists, for every x ∈ R, then by the above Corollary we know that
$f : \mathbb{R} \to [-\infty, \infty]$ is Borel measurable. Prove that the converse is not true. More explicitly, prove that there is no sequence $(f_n)_{n=1}^{\infty}$ of continuous functions, with
\[
\lim_{n \to \infty} f_n(x) = \kappa_{\mathbb{Q}}(x), \quad \forall\, x \in \mathbb{R}.
\]
Hint: Use Baire’s Theorem.


Exercise 2. Prove that a function f : R → R, which is continuous everywhere,
except for a countable set of points, is Borel measurable. As an application, prove
that any monotone function is Borel measurable.
Corollary 3.6 can be generalized, as follows.
Theorem 3.1. Let $(X, \mathcal{A})$ be a measurable space, let $Y$ be a separable metric space, and let
\[
T_n : (X, \mathcal{A}) \to \big(Y, \mathrm{Bor}(Y)\big), \quad n \in \mathbb{N},
\]
be a sequence of measurable maps. Assume that, for every $x \in X$, the sequence $\big(T_n(x)\big)_{n=1}^{\infty} \subset Y$ is convergent. Define the map $T : X \to Y$ by
\[
T(x) = \lim_{n \to \infty} T_n(x), \quad \forall\, x \in X.
\]
Then $T : (X, \mathcal{A}) \to \big(Y, \mathrm{Bor}(Y)\big)$ is a measurable map.




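Proof. Denote by $d$ the metric on $Y$. The collection
\[
\mathcal{V} = \{B_r(y) : y \in Y,\ r > 0\}
\]
is a base for the topology of $Y$. Since $Y$ is second countable, it suffices then to show that
\[
T^{-1}\big(B_r(y)\big) \in \mathcal{A}, \quad \forall\, y \in Y,\ r > 0. \tag{5}
\]
Claim: For every $y \in Y$ and $r > 0$ one has the equality
\[
T^{-1}\big(B_r(y)\big) = \bigcup_{m,n=1}^{\infty} \Big( \bigcap_{k=m}^{\infty} T_k^{-1}\big(B_{r - \frac{1}{n}}(y)\big) \Big). \tag{6}
\]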
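Denote the set in the right hand side simply by $A$. Start first with some $x \in A$. There exist some $m, n \in \mathbb{N}$ such that
\[
x \in \bigcap_{k=m}^{\infty} T_k^{-1}\big(B_{r - \frac{1}{n}}(y)\big),
\]
which means that
\[
T_k(x) \in B_{r - \frac{1}{n}}(y), \quad \forall\, k \ge m,
\]
that is,
\[
d\big(T_k(x), y\big) < r - \frac{1}{n}, \quad \forall\, k \ge m.
\]
Passing to the limit ($k \to \infty$) then yields
\[
d\big(T(x), y\big) \le r - \frac{1}{n} < r,
\]
which means that $T(x) \in B_r(y)$, i.e. $x \in T^{-1}\big(B_r(y)\big)$, thus proving the inclusion $A \subset T^{-1}\big(B_r(y)\big)$.
Conversely, if $x \in T^{-1}\big(B_r(y)\big)$, we get $T(x) \in B_r(y)$, i.e. $d\big(T(x), y\big) < r$. Choose an integer $n$ such that
\[
d\big(T(x), y\big) < r - \frac{2}{n}. \tag{7}
\]
Since $\lim_{k \to \infty} T_k(x) = T(x)$, there exists some $m \in \mathbb{N}$ such that
\[
d\big(T_k(x), T(x)\big) < \frac{1}{n}, \quad \forall\, k \ge m.
\]
Combining this with (7) then gives
\[
d\big(T_k(x), y\big) \le d\big(T(x), y\big) + d\big(T_k(x), T(x)\big) < r - \frac{2}{n} + \frac{1}{n} = r - \frac{1}{n}, \quad \forall\, k \ge m,
\]
which means that
\[
x \in \bigcap_{k=m}^{\infty} T_k^{-1}\big(B_{r - \frac{1}{n}}(y)\big),
\]
hence $x$ indeed belongs to $A$.
Having proven (6) we now observe that, since the $T_k$'s are measurable, it follows that
\[
T_k^{-1}\big(B_{r - \frac{1}{n}}(y)\big) \in \mathcal{A}, \quad \forall\, k, n \in \mathbb{N},\ r > 0.
\]
Using the fact that $\mathcal{A}$ is closed under countable intersections, it follows that
\[
\bigcap_{k=m}^{\infty} T_k^{-1}\big(B_{r - \frac{1}{n}}(y)\big) \in \mathcal{A}, \quad \forall\, m, n \in \mathbb{N},\ r > 0.
\]
Finally, using the fact that $\mathcal{A}$ is closed under countable unions, the desired property (5) follows. □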
Exercise 3. Let $(X, \mathcal{A})$ be a measurable space, and let $(X_n)_{n=1}^{\infty}$ be a sequence of sets in $\mathcal{A}$, with $X = \bigcup_{n=1}^{\infty} X_n$. Suppose $(Y, \mathcal{B})$ is a measurable space, and $F : X \to Y$ is a map, such that
\[
F\big|_{X_n} : \big(X_n, \mathcal{A}\big|_{X_n}\big) \to (Y, \mathcal{B})
\]
is measurable, for all $n \in \mathbb{N}$. Prove that $F : (X, \mathcal{A}) \to (Y, \mathcal{B})$ is measurable.
Exercise 4*. Let $\Omega_1 \subset \mathbb{R}^n$ be an open set, and let $f_1, \dots, f_n : \Omega_1 \to \mathbb{R}$ be $C^1$ functions, with the property that the matrix
\[
A(p) = \Big[ \frac{\partial f_j}{\partial x_k}(p) \Big]_{j,k=1}^{n}
\]
is invertible, for every point $p \in \Omega_1$. Define the map
\[
F : \Omega_1 \ni p \longmapsto \big(f_1(p), \dots, f_n(p)\big) \in \mathbb{R}^n.
\]
(i) Prove that the set $\Omega_2 = F(\Omega_1)$ is open in $\mathbb{R}^n$.
(ii) Although $F : \Omega_1 \to \Omega_2$ may fail to be injective, prove that there exists a Borel measurable map $\phi : \Omega_2 \to \Omega_1$, with $F \circ \phi = \mathrm{Id}_{\Omega_2}$.
Hint: Use the Inverse Function Theorem, combined with Exercises 2 and 3.
Exercise 5*. Let $P(z)$ be a non-constant polynomial with complex coefficients. Prove that there exists a Borel measurable function $f : \mathbb{C} \to \mathbb{C}$, such that
\[
P\big(f(z)\big) = z, \quad \forall\, z \in \mathbb{C}.
\]
Hint: Use the preceding exercise, applied to the set $\Omega_1 = \{z \in \mathbb{C} : P'(z) \ne 0\}$.
The preceding exercise can be generalized:
Exercise 6*. Let $\Omega_1 \subset \mathbb{C}$ be a connected open set, and let $f : \Omega_1 \to \mathbb{C}$ be a non-constant holomorphic function. By the Open Mapping Theorem we know that the set $\Omega_2 = f(\Omega_1)$ is open. Prove that there exists a Borel measurable function $\phi : \Omega_2 \to \Omega_1$, such that $f \circ \phi = \mathrm{Id}_{\Omega_2}$.
Hint: Use Exercise 4, applied to the set $\Omega_0 = \{z \in \Omega_1 : f'(z) \ne 0\}$. Since $f$ is non-constant, the set $\Omega_1 \smallsetminus \Omega_0$ is countable.
We continue with a discussion on the role of elementary functions.
Proposition 3.4. Let (X, A) be a measurable space, and let K be one of the
fields R or C. For an elementary function f ∈ ElemK (X), the following are equiv-
alent:
(i) f ∈ A-ElemK (X);
(ii) f : (X, A) → K is measurable.

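Proof. (i) ⇒ (ii). We know that $\mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X) = \mathrm{Span}_{\mathbb{K}}\{\kappa_A : A \in \mathcal{A}\}$. Since $B_{\mathbb{K}}(X, \mathcal{A})$ is a vector space, it suffices to show only that $\kappa_A : (X, \mathcal{A}) \to \mathbb{K}$ is measurable, for all $A \in \mathcal{A}$. But this is trivial, since for every Borel set $B \subset \mathbb{K}$ the preimage $\kappa_A^{-1}(B)$ is one of the sets $\varnothing$, $A$, $X \smallsetminus A$, or $X$, depending on which of the values $0$ and $1$ belong to $B$.
(ii) ⇒ (i). Assume now $f$ is measurable. List the range of $f$ as
\[
f(X) = \{\lambda_1, \dots, \lambda_n\},
\]
with $\lambda_j \ne \lambda_k$, for all $j, k \in \{1, \dots, n\}$ with $j \ne k$. Since $f$ is measurable, and the singleton sets $\{\lambda_1\}, \dots, \{\lambda_n\}$ are in $\mathrm{Bor}(\mathbb{K})$, it follows that the sets $A_j = f^{-1}\big(\{\lambda_j\}\big)$, $j = 1, \dots, n$, are all in $\mathcal{A}$. Since we clearly have
\[
f = \lambda_1 \kappa_{A_1} + \dots + \lambda_n \kappa_{A_n},
\]
it follows that $f$ indeed belongs to $\mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X)$. □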
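The decomposition used in (ii) ⇒ (i) is easy to compute for a function with finite range. The following sketch is illustrative only: the function `f`, the set `X`, and the two partitions generating the σ-algebras are hypothetical, and a set is tested for membership in the σ-algebra generated by a finite partition by checking that it is a union of atoms.

```python
def canonical_decomposition(f, X):
    """Return {lambda_j: A_j} with A_j = f^{-1}({lambda_j}),
    so that f = sum_j lambda_j * kappa_{A_j}."""
    levels = {}
    for x in X:
        levels.setdefault(f(x), set()).add(x)
    return levels

def evaluate(levels, x):
    """Evaluate sum_j lambda_j * kappa_{A_j}(x)."""
    return sum(value * (1 if x in A else 0) for value, A in levels.items())

def in_generated_sigma_algebra(S, atoms):
    """S lies in the sigma-algebra generated by a finite partition
    exactly when S is a union of atoms."""
    return all(atom <= S or atom.isdisjoint(S) for atom in atoms)

# Hypothetical elementary function on X = {0,...,9}.
X = range(10)

def f(x):
    return x % 3

levels = canonical_decomposition(f, X)

atoms_mod3 = [set(range(r, 10, 3)) for r in range(3)]
atoms_parity = [set(range(0, 10, 2)), set(range(1, 10, 2))]

measurable_mod3 = all(in_generated_sigma_algebra(A, atoms_mod3) for A in levels.values())
measurable_parity = all(in_generated_sigma_algebra(A, atoms_parity) for A in levels.values())
print(levels, measurable_mod3, measurable_parity)
```

The same function is measurable for one σ-algebra and not for the other, because measurability of an elementary function is exactly the condition that all its level sets belong to the σ-algebra.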
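Remarks 3.4. A. If $(X, \mathcal{A})$ and $(Y, \mathcal{B})$ are measurable spaces, if $T : (X, \mathcal{A}) \to (Y, \mathcal{B})$ is a measurable map, and if $f \in \mathcal{B}\text{-}\mathrm{Elem}_{\mathbb{K}}(Y)$, then $f \circ T \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X)$. This follows from the fact that the composition $f \circ T : (X, \mathcal{A}) \to \mathbb{K}$ is measurable, and elementary.
B. If $(X, \mathcal{A})$ is a measurable space, if $f \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X)$, and if $g : f(X) \to \mathbb{K}$ is an arbitrary function, then $g \circ f \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X)$. This follows from the fact that, if one considers the finite set $Y = f(X)$, and the σ-algebra $\mathcal{P}(Y)$ on it, then
\[
(X, \mathcal{A}) \xrightarrow{\ f\ } \big(Y, \mathcal{P}(Y)\big) \xrightarrow{\ g\ } \mathbb{K}
\]
are measurable. So $g \circ f$ is also measurable, and obviously elementary.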
The following is an interesting converse of Corollary 3.6.
Theorem 3.2. Let $(X, \mathcal{A})$ be a measurable space, and let $f : (X, \mathcal{A}) \to [-\infty, \infty]$ be a measurable function. Then there exists a sequence $(f_n)_{n=1}^{\infty} \subset \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, such that
• $\inf\{f(y) : y \in X\} \le f_n(x) \le \sup\{f(z) : z \in X\}$, $\forall\, x \in X$, $n \ge 1$;
• $\lim_{n \to \infty} f_n(x) = f(x)$, $\forall\, x \in X$.
Moreover,
(i) if $\inf\{f(x) : x \in X\} > -\infty$, then the sequence $(f_n)_{n=1}^{\infty}$ can be chosen to be non-decreasing, i.e. $f_n \le f_{n+1}$, $\forall\, n \in \mathbb{N}$;
(ii) if $\sup\{f(x) : x \in X\} < \infty$, then the sequence $(f_n)_{n=1}^{\infty}$ can be chosen to be non-increasing, i.e. $f_n \ge f_{n+1}$, $\forall\, n \in \mathbb{N}$;
(iii) if $\inf\{f(x) : x \in X\} > -\infty$ and $\sup\{f(x) : x \in X\} < \infty$, then the sequence $(f_n)_{n=1}^{\infty}$ can be chosen either non-decreasing, or non-increasing, and such that it converges uniformly to $f$, i.e.
\[
\lim_{n \to \infty} \Big[ \sup_{x \in X} \big| f_n(x) - f(x) \big| \Big] = 0.
\]

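Proof. We begin with a special case of (iii). Assume $X = [0, 1]$, $\mathcal{A} = \mathrm{Bor}([0, 1])$, and consider the inclusion $F : [0, 1] \hookrightarrow [-\infty, \infty]$. For each $n \in \mathbb{N}$, define the intervals $I_k^n, J_k^n$, $0 \le k \le 2^n - 1$, by
\[
I_k^n = \big[k/2^n, (k+1)/2^n\big), \text{ if } 0 \le k \le 2^n - 2; \qquad I_{2^n - 1}^n = \big[(2^n - 1)/2^n, 1\big],
\]
\[
J_k^n = \big(k/2^n, (k+1)/2^n\big], \text{ if } 1 \le k \le 2^n - 1; \qquad J_0^n = \big[0, 1/2^n\big].
\]
We then define, for each $n \in \mathbb{N}$, the functions $g_n, h_n : [0, 1] \to \mathbb{R}$ by
\[
g_n = 2^{-n} \sum_{k=0}^{2^n - 1} k\, \kappa_{I_k^n} \quad \text{and} \quad h_n = 2^{-n} \sum_{k=0}^{2^n - 1} (k+1)\, \kappa_{J_k^n}.
\]
Remark that
\[
0 \le g_n(s) < 1 \quad \text{and} \quad 0 < h_n(s) \le 1, \quad \forall\, s \in [0, 1]. \tag{8}
\]
Note that, for every $n \in \mathbb{N}$, we have
\[
g_n(0) = 0; \quad g_n(1) = (2^n - 1)/2^n; \tag{9}
\]
\[
h_n(0) = 1/2^n; \quad h_n(1) = 1. \tag{10}
\]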
Claim 1: The sequence $(g_n)_{n=1}^{\infty}$ is non-decreasing, and the sequence $(h_n)_{n=1}^{\infty}$ is non-increasing.
Using (9) and (10), we only need to examine the restrictions to the open interval $(0, 1)$. Fix some point $s \in (0, 1)$. For every integer $n \ge 1$, define
\[
p_n^s = \max\Big\{ k \in \mathbb{Z} : 0 \le \frac{k}{2^n} < s \Big\}.
\]
We clearly have $p_n^s < 2^n$ and
\[
\frac{p_n^s}{2^n} < s \le \frac{p_n^s + 1}{2^n}. \tag{11}
\]
We then have
\[
g_n(s) = \begin{cases} p_n^s/2^n & \text{if } s \ne (p_n^s + 1)/2^n \\ (p_n^s + 1)/2^n & \text{if } s = (p_n^s + 1)/2^n \end{cases} \quad \text{and} \quad h_n(s) = \frac{p_n^s + 1}{2^n}. \tag{12}
\]
We now estimate $g_{n+1}(s)$ and $h_{n+1}(s)$. First of all, using (11), we have
\[
\frac{2 p_n^s}{2^{n+1}} < s \le \frac{2 p_n^s + 2}{2^{n+1}},
\]
which means that either $p_{n+1}^s = 2 p_n^s$, or $p_{n+1}^s = 2 p_n^s + 1$. This immediately gives
\[
h_{n+1}(s) = \frac{p_{n+1}^s + 1}{2^{n+1}} \le \frac{2 p_n^s + 2}{2^{n+1}} = \frac{p_n^s + 1}{2^n} = h_n(s).
\]
Note that, if $s = (p_n^s + 1)/2^n$, we will have $p_{n+1}^s = 2 p_n^s + 1$ and $s = (p_{n+1}^s + 1)/2^{n+1}$, so we get
\[
g_{n+1}(s) = (p_{n+1}^s + 1)/2^{n+1} = (2 p_n^s + 2)/2^{n+1} = (p_n^s + 1)/2^n = g_n(s).
\]
If $s \ne (p_n^s + 1)/2^n$, then
\[
g_n(s) = \frac{p_n^s}{2^n} = \frac{2 p_n^s}{2^{n+1}} \le \frac{p_{n+1}^s}{2^{n+1}} \le g_{n+1}(s).
\]
Claim 2: One has
\[
\lim_{n \to \infty} \Big[ \sup_{s \in [0,1]} \big| g_n(s) - s \big| \Big] = \lim_{n \to \infty} \Big[ \sup_{s \in [0,1]} \big| h_n(s) - s \big| \Big] = 0.
\]
To prove this fact we are going to estimate the differences $|g_n(s) - s|$ and $|h_n(s) - s|$. If $s = 0$ or $s = 1$, then the equalities (9) and (10) immediately show that
\[
|g_n(s) - s| \le \frac{1}{2^n} \quad \text{and} \quad |h_n(s) - s| \le \frac{1}{2^n}, \quad \forall\, n \in \mathbb{N}. \tag{13}
\]
If $s \in (0, 1)$, then the definitions of $g_n(s)$ and $h_n(s)$ clearly show that
\[
s,\ g_n(s),\ h_n(s) \in \big[ p_n^s/2^n, (p_n^s + 1)/2^n \big],
\]
and then we see that we again have the inequalities (13). Since (13) now holds for all $s \in [0, 1]$, the Claim immediately follows.
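Claims 1 and 2 can be sanity-checked with exact rational arithmetic. The sketch below is an illustration, not part of the notes: it evaluates the dyadic step functions $g_n$ and $h_n$ through floor and ceiling formulas (an assumption that matches the interval conventions $I_k^n$ and $J_k^n$ at the endpoints $0$ and $1$) and tests monotonicity in $n$ and the bound $2^{-n}$ on a finite grid.

```python
from fractions import Fraction
from math import floor, ceil

def g(n, s):
    """g_n(s): value k/2^n on [k/2^n, (k+1)/2^n); the last interval is closed at 1."""
    k = min(floor(s * 2**n), 2**n - 1)
    return Fraction(k, 2**n)

def h(n, s):
    """h_n(s): value (k+1)/2^n on (k/2^n, (k+1)/2^n]; the first interval is closed at 0."""
    k = max(ceil(s * 2**n) - 1, 0)
    return Fraction(k + 1, 2**n)

# Grid of rational test points in [0, 1], including dyadic and non-dyadic ones.
grid = [Fraction(j, 96) for j in range(97)]

claims_hold = all(
    g(n, s) <= g(n + 1, s) <= s <= h(n + 1, s) <= h(n, s)   # Claim 1
    and s - g(n, s) <= Fraction(1, 2**n)                    # inequality (13) for g_n
    and h(n, s) - s <= Fraction(1, 2**n)                    # inequality (13) for h_n
    for n in range(1, 8)
    for s in grid
)
print(claims_hold)
```

Using `Fraction` avoids floating point rounding at the dyadic break points, where the case distinction in (12) matters.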
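We proceed now with the proof of the theorem. Define
\[
\alpha = \inf\{f(x) : x \in X\} \quad \text{and} \quad \beta = \sup\{f(x) : x \in X\}.
\]
If $\alpha = \beta$, there is nothing to prove. Assume $\alpha < \beta$. Depending on the finiteness of $\alpha$ and $\beta$, we define a homeomorphism $\Phi : [\alpha, \beta] \to [0, 1]$, as follows.
(a) If $\alpha > -\infty$ and $\beta < \infty$, we define
\[
\Phi(s) = \frac{s - \alpha}{\beta - \alpha}, \quad \forall\, s \in [\alpha, \beta].
\]
(b) If $\alpha > -\infty$ and $\beta = \infty$, we define
\[
\Phi(s) = \begin{cases} \frac{2}{\pi} \arctan(s - \alpha) & \text{if } s \ne \beta \\ 1 & \text{if } s = \beta \end{cases}
\]
(c) If $\alpha = -\infty$ and $\beta < \infty$, we define
\[
\Phi(s) = \begin{cases} 1 + \frac{2}{\pi} \arctan(s - \beta) & \text{if } s \ne \alpha \\ 0 & \text{if } s = \alpha \end{cases}
\]
(d) If $\alpha = -\infty$ and $\beta = \infty$, we define
\[
\Phi(s) = \begin{cases} 0 & \text{if } s = \alpha \\ \frac{1}{2} + \frac{1}{\pi} \arctan s & \text{if } \alpha < s < \beta \\ 1 & \text{if } s = \beta \end{cases}
\]
Notice that $\Phi(\alpha) = 0$, $\Phi(\beta) = 1$, and
\[
\alpha \le s < t \le \beta \Rightarrow \Phi(s) < \Phi(t).
\]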
After these preparations, we proceed with the proof. We begin with the special cases (i), (ii) and (iii).
If $\alpha > -\infty$, we define the functions $f_n = \Phi^{-1} \circ g_n \circ \Phi \circ f$. Since $\Phi$ and $\Phi^{-1}$ are increasing, and $(g_n)_{n=1}^{\infty}$ is non-decreasing, it follows that $(f_n)_{n=1}^{\infty}$ is non-decreasing. Since $0 \le g_n(s) < 1$, $\forall\, s \in [0, 1]$, we see that $\alpha \le f_n(x) < \beta$, $\forall\, x \in X$. In particular, we have $-\infty < f_n(x) < \infty$, for all $n$ and $x$. It is obvious that $f_n$ is elementary and measurable, and since $\lim_{n \to \infty} g_n(s) = s$, $\forall\, s \in [0, 1]$ (by Claim 2), we immediately get $\lim_{n \to \infty} f_n(x) = f(x)$, $\forall\, x \in X$.
If $\beta < \infty$, we define the functions $f_n = \Phi^{-1} \circ h_n \circ \Phi \circ f$. Since $\Phi$ and $\Phi^{-1}$ are increasing, and $(h_n)_{n=1}^{\infty}$ is non-increasing, it follows that $(f_n)_{n=1}^{\infty}$ is non-increasing. Since $0 < h_n(s) \le 1$, $\forall\, s \in [0, 1]$, we see that $\alpha < f_n(x) \le \beta$, $\forall\, x \in X$. In particular, we have $-\infty < f_n(x) < \infty$, for all $n$ and $x$. It is obvious that $f_n$ is elementary and measurable, and since $\lim_{n \to \infty} h_n(s) = s$, $\forall\, s \in [0, 1]$ (by Claim 2), we immediately get $\lim_{n \to \infty} f_n(x) = f(x)$, $\forall\, x \in X$.
If $\alpha > -\infty$ and $\beta < \infty$, then we can take $f_n = \Phi^{-1} \circ g_n \circ \Phi \circ f$, $\forall\, n$, or we can take $f_n = \Phi^{-1} \circ h_n \circ \Phi \circ f$, $\forall\, n$. The inequalities (13), combined with the definition (a) of $\Phi$, show that
\[
|f_n(x) - f(x)| \le \frac{\beta - \alpha}{2^n}, \quad \forall\, x \in X,\ n \in \mathbb{N},
\]
with any of the above choices for $(f_n)_{n=1}^{\infty}$.
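The bounded case (iii) can be tried out on a concrete function. This is a minimal sketch under assumptions: the finite set `X` and the function `f` are hypothetical, `Phi` is the affine map of case (a), and `g` is the floor-based evaluation of the dyadic step function $g_n$.

```python
from fractions import Fraction
from math import floor

def g(n, s):
    """Dyadic step function g_n on [0, 1]: value k/2^n on the k-th dyadic interval."""
    k = min(floor(s * 2**n), 2**n - 1)
    return Fraction(k, 2**n)

# Hypothetical bounded function on a finite set X.
X = range(-3, 4)

def f(x):
    return Fraction(x**3, 10)

alpha = min(f(x) for x in X)
beta = max(f(x) for x in X)

def Phi(s):
    """Case (a): affine homeomorphism [alpha, beta] -> [0, 1]."""
    return (s - alpha) / (beta - alpha)

def Phi_inv(t):
    return alpha + (beta - alpha) * t

def f_n(n, x):
    """f_n = Phi^{-1} o g_n o Phi o f."""
    return Phi_inv(g(n, Phi(f(x))))

def sup_error(n):
    return max(abs(f_n(n, x) - f(x)) for x in X)

for n in range(1, 6):
    print(n, sup_error(n), (beta - alpha) / 2**n)
```

Each `f_n` takes at most $2^n$ distinct values, so it is elementary, and the printed errors stay below $(\beta - \alpha)/2^n$.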
Having proven the cases (i), (ii) and (iii), we now examine the general situation, when $\alpha = -\infty$ and $\beta = \infty$. Consider the functions $f', f'' : X \to [-\infty, \infty]$ defined by
\[
f'(x) = \max\{f(x), 0\} \quad \text{and} \quad f''(x) = \min\{f(x), 0\}, \quad \forall\, x \in X.
\]
By Corollary 3.4, both $f'$ and $f''$ are measurable. Since $\inf_{x \in X} f'(x) \ge 0$, by part (i), there exists a sequence $(f'_n)_{n=1}^{\infty} \subset \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, such that $\lim_{n \to \infty} f'_n(x) = f'(x)$, $\forall\, x \in X$. Since $\sup_{x \in X} f''(x) \le 0$, by part (ii), there exists a sequence $(f''_n)_{n=1}^{\infty} \subset \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, such that $\lim_{n \to \infty} f''_n(x) = f''(x)$, $\forall\, x \in X$. Define the elementary functions $f_n = f'_n + f''_n$, $n \in \mathbb{N}$. Clearly the $f_n$'s are all in $\mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$. We now check that
\[
\lim_{n \to \infty} f_n(x) = f(x), \quad \forall\, x \in X. \tag{14}
\]
There are two cases to examine: (a) $f(x) \ge 0$; (b) $f(x) \le 0$.
In case (a), we have $f'(x) = f(x)$ and $f''(x) = 0$, so $\lim_{n \to \infty} f'_n(x) = f(x)$ and $\lim_{n \to \infty} f''_n(x) = 0$.
In case (b), we have $f'(x) = 0$ and $f''(x) = f(x)$, so $\lim_{n \to \infty} f'_n(x) = 0$ and $\lim_{n \to \infty} f''_n(x) = f(x)$.
In either case, the equality (14) follows. □

We conclude this section with a discussion on an interesting measurable space, that appears often in connection with probability theory.
Example 3.1. Consider the space $T = \{0, 1\}^{\aleph_0}$, i.e.
\[
T = \big\{ a = (\alpha_n)_{n=1}^{\infty} : \alpha_n \in \{0, 1\},\ \forall\, n \in \mathbb{N} \big\}.
\]
We call $T$ the space of infinite coin flippings, having in mind that an element of $T$ is the same as the outcome of an infinite sequence of coin flips (think of 0 as corresponding to tails, and 1 as corresponding to heads). Equip $T$ with the product topology. By Tihonov’s Theorem, $T$ is compact. The product topology on $T$ is in fact given by a metric $d$ defined by
\[
d(a, b) = \sum_{n=1}^{\infty} \frac{|\alpha_n - \beta_n|}{2^n}, \quad \forall\, a = (\alpha_n)_{n=1}^{\infty},\ b = (\beta_n)_{n=1}^{\infty} \in T.
\]
For every number $r \ge 2$ we define a map $\phi_r : T \to [0, 1]$ by
\[
\phi_r(a) = (r - 1) \sum_{n=1}^{\infty} \frac{\alpha_n}{r^n}, \quad \forall\, a = (\alpha_n)_{n=1}^{\infty} \in T.
\]
Since $r^{-n} \le 2^{-n}$ for $r \ge 2$, it is clear that
\[
\big| \phi_r(a) - \phi_r(b) \big| \le (r - 1)\, d(a, b), \quad \forall\, a, b \in T,
\]
so the maps $\phi_r : T \to [0, 1]$, $r \ge 2$, are continuous. In particular, the set $K_r = \phi_r(T)$ is a compact subset of $[0, 1]$.
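The Lipschitz estimate for $\phi_r$ can be verified exactly on sequences that vanish after finitely many terms. The sketch below makes two simplifying assumptions that are not in the notes: sequences are represented by finite tuples of digits (all later digits being $0$), and $r$ is restricted to integers so that the arithmetic stays rational.

```python
from fractions import Fraction
from itertools import product

def d(a, b):
    """Metric d(a, b) = sum |alpha_n - beta_n| / 2^n for equal-length finite tuples."""
    return sum(Fraction(abs(x - y), 2**n) for n, (x, y) in enumerate(zip(a, b), start=1))

def phi(r, a):
    """phi_r(a) = (r - 1) * sum alpha_n / r^n, for an integer r >= 2."""
    return (r - 1) * sum(Fraction(x, r**n) for n, x in enumerate(a, start=1))

# All 0/1 sequences supported on the first 5 positions.
points = list(product((0, 1), repeat=5))

lipschitz_holds = all(
    abs(phi(r, a) - phi(r, b)) <= (r - 1) * d(a, b)
    for r in (2, 3, 5)
    for a in points
    for b in points
)
values_in_unit_interval = all(0 <= phi(r, a) <= 1 for r in (2, 3, 5) for a in points)
print(lipschitz_holds, values_in_unit_interval)
```

The check covers only finitely supported sequences, which are dense in $T$, so it is consistent with, but does not replace, the estimate stated above.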
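Define
\[
T_0 = \big\{ a = (\alpha_n)_{n \in \mathbb{N}} \in T : \text{the set } \{n \in \mathbb{N} : \alpha_n = 0\} \text{ is infinite} \big\}.
\]
The set $T \smallsetminus T_0$ can be described as:
\[
T \smallsetminus T_0 = \big\{ (\alpha_n)_{n \in \mathbb{N}} \in T : \text{there exists } N \in \mathbb{N}, \text{ such that } \alpha_n = 1,\ \forall\, n \ge N \big\}.
\]
The following are well known (see Appendix B, the proof of Proposition B.2).
Facts: 1. The set $T \smallsetminus T_0$ is countable.
2. For any $r \ge 2$, and elements $a = (\alpha_n)_{n=1}^{\infty}$, $b = (\beta_n)_{n=1}^{\infty} \in T_0$, the following are equivalent:
• there exists $N \in \mathbb{N}$ such that $\alpha_N = 1$, $\beta_N = 0$, and $\alpha_n = \beta_n$, for all $n \in \mathbb{N}$ with $n < N$;
• $\phi_r(a) > \phi_r(b)$.
In particular, the map $\phi_r\big|_{T_0} : T_0 \to [0, 1]$ is injective.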
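The restriction to $T_0$ in Fact 2 matters, at least for $r = 2$. The following sketch is an illustration with a hypothetical encoding: an eventually constant sequence is stored as a finite prefix together with the digit repeated forever after it, and the tail is summed with the geometric series $(r-1)\sum_{n>N} r^{-n} = r^{-N}$.

```python
from fractions import Fraction

def phi(r, prefix, tail):
    """phi_r of the sequence `prefix` followed by the constant digit `tail` (0 or 1)."""
    N = len(prefix)
    head = (r - 1) * sum(Fraction(x, r**n) for n, x in enumerate(prefix, start=1))
    # Geometric tail: (r - 1) * sum_{n > N} r^{-n} = r^{-N}.
    return head + tail * Fraction(1, r**N)

a = ((0,), 1)   # (0, 1, 1, 1, ...): finitely many zeros, so a lies outside T_0
b = ((1,), 0)   # (1, 0, 0, 0, ...): infinitely many zeros, so b lies in T_0

print(phi(2, *a), phi(2, *b))   # both binary expansions represent the same number
print(phi(3, *a), phi(3, *b))   # for r = 3 the two values differ
```

For $r = 2$ the two distinct sequences have the same image, the familiar ambiguity of binary expansions, which is why injectivity is only asserted on $T_0$.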

The above constructions have a remarkable feature.


Theorem 3.3. Use the notations above. For a number r ≥ 2 and subset A ⊂ T ,
the following are equivalent:
(i) A ∈ Bor(T );
(ii) φr (A) ∈ Bor(Kr ).
Proof. Throughout the proof the number $r$ will be fixed. The map $\phi_r$ will be denoted by $\phi$, and the compact set $K_r$ will be denoted by $K$.
Since $\phi : T \to K$ is continuous, it is measurable, i.e. we have the implication
\[
B \in \mathrm{Bor}(K) \Rightarrow \phi^{-1}(B) \in \mathrm{Bor}(T). \tag{15}
\]
Before we proceed with the actual proof, we need some preparations. Remark that, since $\phi : T \to K$ is surjective, we have the equality
\[
\phi\big(\phi^{-1}(C)\big) = C, \quad \forall\, C \subset K. \tag{16}
\]
Claim 1: A subset $C \subset K$ is at most countable, if and only if the set $\phi^{-1}(C) \subset T$ is at most countable.
Suppose $C$ is at most countable. If we take $A_0 = \phi^{-1}(C) \cap T_0$, and $A_1 = \phi^{-1}(C) \smallsetminus T_0$, then obviously $\phi^{-1}(C) = A_0 \cup A_1$. Since $A_1 \subset T \smallsetminus T_0$, and $T \smallsetminus T_0$ is countable, it follows that $A_1$ is at most countable, so we only need to prove that $A_0$ is at most countable. But since $\phi\big|_{T_0}$ is injective, and $A_0 \subset T_0$, it follows that $\phi\big|_{A_0} : A_0 \to C$ is injective, and then the fact that $C$ is at most countable forces $A_0$ to be at most countable.
Conversely, if $\phi^{-1}(C)$ is at most countable, then so is $\phi\big(\phi^{-1}(C)\big)$. By (16) we are done.
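For each subset $A \subset T$, we define
\[
\langle A \rangle = \phi^{-1}\big(\phi(A)\big).
\]
Remark that $A \subset \langle A \rangle$, $\forall\, A \subset T$. Note also that, for any family $(A_i)_{i \in I}$ of subsets of $T$, one has the equality
\[
\Big\langle \bigcup_{i \in I} A_i \Big\rangle = \phi^{-1}\Big(\phi\Big(\bigcup_{i \in I} A_i\Big)\Big) = \phi^{-1}\Big(\bigcup_{i \in I} \phi(A_i)\Big) = \bigcup_{i \in I} \phi^{-1}\big(\phi(A_i)\big) = \bigcup_{i \in I} \langle A_i \rangle. \tag{17}
\]
As an application of Claim 1, to the set $C = \phi(T \smallsetminus T_0)$, we see that
(∗) the set $\langle T \smallsetminus T_0 \rangle$ is at most countable.
Claim 2: For any subset $A \subset T_0$, one has the inclusion
\[
\langle A \rangle \smallsetminus A \subset \langle T \smallsetminus T_0 \rangle.
\]
In particular, the difference $\langle A \rangle \smallsetminus A$ is at most countable.
Start with an arbitrary element $x \in \langle A \rangle \smallsetminus A$. This means that $x \notin A$, but $\phi(x) \in \phi(A)$, which means that there exists some $a \in A$, with $\phi(x) = \phi(a)$. Assume now $x \notin T \smallsetminus T_0$, which means that $x \in T_0$. But then, the fact that $x, a \in T_0$, combined with the injectivity of $\phi\big|_{T_0}$, will force $x = a$, which is impossible since $a \in A$ while $x \notin A$. Therefore $x \in T \smallsetminus T_0 \subset \langle T \smallsetminus T_0 \rangle$.
Claim 3: For any set $A \subset T$, the difference $\langle A \rangle \smallsetminus A$ is at most countable.
Take $A_0 = A \cap T_0$ and $A_1 = A \smallsetminus A_0$. Notice that, since $A_1 \subset T \smallsetminus T_0$, we have
\[
\langle A_1 \rangle = \phi^{-1}\big(\phi(A_1)\big) \subset \phi^{-1}\big(\phi(T \smallsetminus T_0)\big) = \langle T \smallsetminus T_0 \rangle,
\]
so it follows that $\langle A_1 \rangle$ is at most countable. We obviously have $A = A_0 \cup A_1$, so by (17)
\[
\langle A \rangle = \langle A_0 \rangle \cup \langle A_1 \rangle.
\]
But now we are done, since
\[
\langle A \rangle \smallsetminus A = \big(\langle A_0 \rangle \cup \langle A_1 \rangle\big) \smallsetminus \big(A_0 \cup A_1\big) \subset \big(\langle A_0 \rangle \smallsetminus A_0\big) \cup \langle A_1 \rangle,
\]
and both $\langle A_0 \rangle \smallsetminus A_0$ (by Claim 2) and $\langle A_1 \rangle$ are at most countable.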
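Claim 4: For any subset $A \subset T$, one has the inclusion
\[
\phi(T \smallsetminus A) \supset K \smallsetminus \phi(A), \tag{18}
\]
and the difference $\phi(T \smallsetminus A) \smallsetminus \big(K \smallsetminus \phi(A)\big)$ is at most countable.
The inclusion (18) is pretty obvious, from the surjectivity of $\phi$. In order to prove that the difference
\[
C = \phi(T \smallsetminus A) \smallsetminus \big(K \smallsetminus \phi(A)\big) = \phi(T \smallsetminus A) \cap \phi(A)
\]
is countable, by Claim 1, it suffices to prove that $\phi^{-1}(C)$ is countable. We have
\[
\phi^{-1}(C) = \phi^{-1}\big(\phi(T \smallsetminus A) \cap \phi(A)\big) = \phi^{-1}\big(\phi(T \smallsetminus A)\big) \cap \phi^{-1}\big(\phi(A)\big) = \langle T \smallsetminus A \rangle \cap \langle A \rangle.
\]
We can write $\phi^{-1}(C) = A_1 \cup A_2$, where
\[
A_1 = (T \smallsetminus A) \cap \langle A \rangle \quad \text{and} \quad A_2 = \big[\langle T \smallsetminus A \rangle \smallsetminus (T \smallsetminus A)\big] \cap \langle A \rangle,
\]
so it suffices to prove that both $A_1$ and $A_2$ are at most countable. But these facts are immediate from Claim 3, since $A_1 = \langle A \rangle \smallsetminus A$, and $A_2 \subset \langle T \smallsetminus A \rangle \smallsetminus (T \smallsetminus A)$.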
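We can now proceed with the proof of the theorem. Define
\[
\mathcal{A} = \big\{ A \subset T : \phi(A) \in \mathrm{Bor}(K) \big\},
\]
so that what we need to prove is the equality $\mathcal{A} = \mathrm{Bor}(T)$.
First, remark that, if $A \in \mathcal{A}$, then $\phi(A) \in \mathrm{Bor}(K)$, and the fact that $\phi$ is Borel measurable will force $\langle A \rangle = \phi^{-1}\big(\phi(A)\big)$ to be a Borel set in $T$. But since $\langle A \rangle \smallsetminus A$ is countable, hence Borel, it follows that
\[
A = \langle A \rangle \smallsetminus \big(\langle A \rangle \smallsetminus A\big)
\]
is again Borel. Therefore, we have the inclusion $\mathcal{A} \subset \mathrm{Bor}(T)$.
Second, remark that if $F \subset T$ is a compact subset, then the continuity of $\phi$ gives the fact that $\phi(F)$ is compact, hence Borel. This then forces $F \in \mathcal{A}$. Therefore $\mathcal{A}$ contains the collection $\mathcal{C}_T$ of all compact subsets of $T$.
Now we have
\[
\mathcal{C}_T \subset \mathcal{A} \subset \mathrm{Bor}(T) = \Sigma(\mathcal{C}_T),
\]
so all we need to prove is the fact that $\mathcal{A}$ is a σ-algebra, i.e. we have the properties
(a) $A \in \mathcal{A} \Rightarrow T \smallsetminus A \in \mathcal{A}$;
(b) for any sequence $(A_n)_{n=1}^{\infty} \subset \mathcal{A}$, the union $\bigcup_{n=1}^{\infty} A_n$ also belongs to $\mathcal{A}$.
To check (a) start with some set $A \in \mathcal{A}$. We know that $\phi(A) \in \mathrm{Bor}(K)$, and we want to show that $\phi(T \smallsetminus A)$ is again Borel. By Claim 4, we know we can write
\[
\phi(T \smallsetminus A) = \big(K \smallsetminus \phi(A)\big) \cup C,
\]
for some set $C \subset K$ which is at most countable. Since $C$ and $K \smallsetminus \phi(A)$ are Borel, this shows that $\phi(T \smallsetminus A)$ is also Borel.
Property (b) is obvious, since $\phi(A_n)$, $n \ge 1$, are all Borel, and
\[
\phi\Big(\bigcup_{n=1}^{\infty} A_n\Big) = \bigcup_{n=1}^{\infty} \phi(A_n). \quad \square
\]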

Corollary 3.7. Use the above notations. For a number $r \ge 2$ and a subset $B \subset K_r$, the following are equivalent:
(i) $B \in \mathrm{Bor}(K_r)$;
(ii) $\phi_r^{-1}(B) \in \mathrm{Bor}(T)$.

Proof. The implication (i) ⇒ (ii) is trivial, since $\phi_r$ is continuous, hence measurable.
Conversely, if the set $A = \phi_r^{-1}(B)$ is Borel, then by the Theorem, $\phi_r(A)$ is Borel. But since $\phi_r$ is surjective, we have $B = \phi_r(A)$. □
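Comments. From the above results, we see that $\phi_r : T \to K_r$ “almost preserves Borel structures.” More explicitly, if one considers the maps
\[
\Phi_r : \mathcal{P}(T) \ni A \longmapsto \phi_r(A) \in \mathcal{P}(K_r), \qquad \Psi_r : \mathcal{P}(K_r) \ni B \longmapsto \phi_r^{-1}(B) \in \mathcal{P}(T),
\]
then
• $(\Phi_r \circ \Psi_r)(B) = B$, for all $B \subset K_r$;
• $(\Psi_r \circ \Phi_r)(A) \supset A$, and $(\Psi_r \circ \Phi_r)(A) \smallsetminus A$ is at most countable, for all $A \subset T$;
• $B \in \mathrm{Bor}(K_r) \Leftrightarrow \Psi_r(B) \in \mathrm{Bor}(T)$;
• $A \in \mathrm{Bor}(T) \Leftrightarrow \Phi_r(A) \in \mathrm{Bor}(K_r)$.
In the particular case $r = 2$, we know that $K_2 = [0, 1]$, so we can think of the measurable space $\big([0, 1], \mathrm{Bor}([0, 1])\big)$ as “approximately the same” as the measurable space $\big(T, \mathrm{Bor}(T)\big)$.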
The case r = 3 will be an interesting one, especially for constructing various
counter-examples. The compact set K3 ⊂ [0, 1] is called the ternary Cantor set.
It turns out that there exists another useful description of the ternary Cantor
set K3 , which yields some interesting properties.
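Notations. We keep the notations above. An element $a = (\alpha_n)_{n=1}^\infty \in T$ will be called finite, if there exists some $N \in \mathbb{N}$ such that $\alpha_n = 0$, $\forall\, n \ge N$. We define
\[ T_{\mathrm{fin}} = \{ a \in T : a \text{ finite} \}. \]
Remark that $T_{\mathrm{fin}} \subset T_0$. In particular the map $\phi_3\big|_{T_{\mathrm{fin}}} : T_{\mathrm{fin}} \to K_3$ is injective.
For $a \in T_{\mathrm{fin}}$ we define its length as
\[ \ell(a) = \min\{ N \in \mathbb{N} : \alpha_n = 0,\ \forall\, n \ge N \} - 1. \]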
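With this definition, for every $a = (\alpha_n)_{n=1}^\infty \in T_{\mathrm{fin}}$, we have
\[ \alpha_{\ell(a)} = 1 \ \text{ and } \ \alpha_n = 0,\ \forall\, n > \ell(a). \tag{19} \]
We define
\[ \Lambda = \{ (k, a) \in \mathbb{Z} \times T_{\mathrm{fin}} : k \ge \ell(a) \}. \]
Finally, for every pair $\lambda = (k, a) \in \Lambda$, we define the open interval
\[ I_\lambda = \Bigl( \phi_3(a) + \frac{1}{3^{k+1}},\ \phi_3(a) + \frac{2}{3^{k+1}} \Bigr). \]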
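Remark that, using (19) we have
\[ \phi_3(a) \le \sum_{n=1}^{\ell(a)} \frac{2}{3^n} = 1 - \frac{1}{3^{\ell(a)}}, \]
with the convention that the sum is 0, if $\ell(a) = 0$. We then get
\[ \phi_3(a) + \frac{2}{3^{k+1}} \le 1 - \frac{1}{3^{\ell(a)}} + \frac{2}{3^{k+1}} < 1 - \frac{1}{3^{\ell(a)}} + \frac{1}{3^{k}} \le 1, \]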
which gives the inclusion Iλ ⊂ (0, 1).


The following result describes an alternative construction of $K_3$.
Theorem 3.4. Use the notations above.
(i) The set $T_{\mathrm{fin}}$ is dense in $T$.
(ii) The system $(I_\lambda)_{\lambda \in \Lambda}$ is pair-wise disjoint.
(iii) $\bigcup_{\lambda \in \Lambda} I_\lambda = [0, 1] \smallsetminus K_3$.
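Proof. The map $\phi_3$ will be simply denoted by $\phi$, and the Cantor set $K_3$ will be denoted simply by $K$.
(i). Fix some element $a = (\alpha_n)_{n=1}^\infty \in T$. For every integer $k \ge 1$ define the element $a^k = (\alpha^k_n)_{n=1}^\infty \in T$, by
\[ \alpha^k_n = \begin{cases} \alpha_n & \text{if } n \le k \\ 0 & \text{if } n > k \end{cases} \]
It is obvious that $a^k \in T_{\mathrm{fin}}$, $\forall\, k \in \mathbb{N}$. The inequality
\[ d(a, a^k) = \sum_{n=k+1}^\infty \frac{\alpha_n}{2^n} \le \sum_{n=k+1}^\infty \frac{1}{2^n} = \frac{1}{2^k}, \quad \forall\, k \in \mathbb{N} \]
then immediately shows that $\lim_{k\to\infty} a^k = a$.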
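The estimate $d(a, a^k) \le 2^{-k}$ can be checked on a concrete sequence. The following is a minimal sketch, not part of the notes: the names `truncate` and `dist` and the sample sequence are hypothetical, and sequences are stored as finite lists that are understood to continue with zeros.

```python
from fractions import Fraction

def truncate(a, k):
    """Return a^k: keep the first k terms of a and replace the rest by 0."""
    return [alpha if n < k else 0 for n, alpha in enumerate(a)]

def dist(a, b):
    """d(a, b) = sum |a_n - b_n| / 2^n for two lists of equal length
    (both lists are understood to continue with zeros)."""
    return sum(Fraction(abs(x - y), 2 ** n)
               for n, (x, y) in enumerate(zip(a, b), start=1))

# Hypothetical sample sequence (zeros from the 9th term on).
a = [1, 0, 1, 1, 0, 1, 1, 1]
gaps = [dist(a, truncate(a, k)) for k in range(1, len(a) + 1)]
```

Each entry `gaps[k - 1]` is the exact value of $d(a, a^k)$ as a fraction, so the bound can be compared without rounding.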


(ii). Assume $\lambda, \mu \in \Lambda$ are such that $\lambda \ne \mu$, and let us prove that $I_\lambda \cap I_\mu = \varnothing$. Let $\lambda = (j, a)$ and $\mu = (k, b)$, where $a = (\alpha_n)_{n=1}^\infty$ and $b = (\beta_n)_{n=1}^\infty$ are elements in $T_{\mathrm{fin}}$ with $\ell(a) \le j$ and $\ell(b) \le k$. Since $\lambda \ne \mu$, we have one (or both) of the following cases: (a) $a \ne b$, or (b) $j \ne k$.
In case (a) we take
\[ m = \min\{ n \in \mathbb{N} : \alpha_n \ne \beta_n \}. \]
Without any loss of generality, we can assume that $\alpha_m = 0$ and $\beta_m = 1$. Note that $k \ge \ell(b) \ge m \ge 1$. We are going to prove that $I_\lambda \cap I_\mu = \varnothing$, by showing that the right end-point of $I_\lambda$ is not greater than the left end-point of $I_\mu$, that is,
\[ \phi(a) + \frac{2}{3^{j+1}} \le \phi(b) + \frac{1}{3^{k+1}}. \tag{20} \]
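Define the number
\[ M = \sum_{n=1}^{m-1} \frac{\alpha_n}{3^n} = \sum_{n=1}^{m-1} \frac{\beta_n}{3^n}, \]
with the convention that $M = 0$, if $m = 1$. We have:
\[ \phi(a) = 2M + 2\sum_{n=m+1}^{\ell(a)} \frac{\alpha_n}{3^n} \le 2M + 2\sum_{n=m+1}^{\ell(a)} \frac{1}{3^n} = 2M + \frac{1}{3^m} - \frac{1}{3^{\ell(a)}}; \]
\[ \phi(b) = 2M + \frac{2}{3^m} + 2\sum_{n=m+1}^{\ell(b)} \frac{\beta_n}{3^n} \ge 2M + \frac{2}{3^m}. \]
The inequality (20) then follows immediately from:
\[ \phi(a) + \frac{2}{3^{j+1}} \le 2M + \frac{1}{3^m} - \frac{1}{3^{\ell(a)}} + \frac{2}{3^{j+1}} < 2M + \frac{1}{3^m} - \frac{1}{3^{\ell(a)}} + \frac{1}{3^{j}} \le 2M + \frac{1}{3^m} < 2M + \frac{2}{3^m} \le \phi(b) < \phi(b) + \frac{1}{3^{k+1}}. \]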
In case (b), based on the fact that we have proven case (a), we can assume, without any loss of generality, that $a = b$ and $j < k$. In this case we have
\[ \phi(b) + \frac{2}{3^{k+1}} = \phi(a) + \frac{2}{3^{k+1}} < \phi(a) + \frac{1}{3^{k}} \le \phi(a) + \frac{1}{3^{j+1}}, \]
which means that the right end-point of $I_\mu$ is not greater than the left end-point of $I_\lambda$, so again we get $I_\lambda \cap I_\mu = \varnothing$.
For the proof of (iii) we are going to use the space
\[ P = \{0, 1, 2\}^{\aleph_0} = \bigl\{ (\alpha_n)_{n=1}^\infty : \alpha_n \in \{0, 1, 2\},\ \forall\, n \in \mathbb{N} \bigr\}. \]
Exactly as is the case with $T$, the product space $P$ is compact with respect to the product topology, which is given by the metric
\[ d(a, b) = \sum_{n=1}^\infty \frac{|\alpha_n - \beta_n|}{2^n}, \quad \forall\, a = (\alpha_n)_{n=1}^\infty,\ b = (\beta_n)_{n=1}^\infty \in P. \]

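The map $\psi : P \to [0, 1]$, defined by
\[ \psi(a) = \sum_{n=1}^\infty \frac{\alpha_n}{3^n}, \quad \forall\, a = (\alpha_n)_{n=1}^\infty \in P, \]
satisfies
\[ |\psi(a) - \psi(b)| \le d(a, b), \quad \forall\, a, b \in P, \]
hence it is continuous. Note also that $\psi$ is surjective. We can write $\phi = \psi \circ \rho$, where
\[ \rho : \{0, 1\}^{\aleph_0} \ni (\alpha_n)_{n=1}^\infty \longmapsto (2\alpha_n)_{n=1}^\infty \in \{0, 1, 2\}^{\aleph_0}. \]
Note also that $\rho : T \to P$ is continuous, since we clearly have
\[ d\bigl(\rho(a), \rho(b)\bigr) \le 2\, d(a, b), \quad \forall\, a, b \in T. \]
We now proceed with the proof of (iii). Denote the open set $\bigcup_{\lambda \in \Lambda} I_\lambda$ simply by $D$. Since $T_{\mathrm{fin}}$ is dense in $T$, it follows that $\phi(T_{\mathrm{fin}})$ is dense in $K = \phi(T)$. Therefore, since $[0, 1] \smallsetminus D$ is closed, in order to prove the inclusion $K \subset [0, 1] \smallsetminus D$ it suffices to prove the inclusion
\[ \phi(T_{\mathrm{fin}}) \subset [0, 1] \smallsetminus D. \]
Using the map $\psi : P \to [0, 1]$, the above inclusion is equivalent to
\[ P \smallsetminus \rho(T_{\mathrm{fin}}) \supset \psi^{-1}(D). \tag{21} \]
In order to prove the inclusion $[0, 1] \smallsetminus D \subset K$, using the surjectivity of $\psi$, it suffices to prove the inclusion
\[ \psi^{-1}(D) \supset P \smallsetminus \psi^{-1}(K). \tag{22} \]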
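To prove (21) start with some element $a = (\alpha_n)_{n=1}^\infty \in \psi^{-1}(D)$, which means that there exists some $b = (\beta_n)_{n=1}^\infty \in T_{\mathrm{fin}}$, and an integer $k \ge \ell(b)$, such that $\psi(a) \in I_{(k,b)}$, i.e.
\[ \frac{2\beta_1}{3} + \cdots + \frac{2\beta_k}{3^k} + \frac{1}{3^{k+1}} < \sum_{n=1}^\infty \frac{\alpha_n}{3^n} < \frac{2\beta_1}{3} + \cdots + \frac{2\beta_k}{3^k} + \frac{2}{3^{k+1}}. \tag{23} \]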
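We prove that $a \notin \rho(T_{\mathrm{fin}})$ by contradiction. Assume $a \in \rho(T_{\mathrm{fin}})$, which means that there exists $c = (\gamma_n)_{n=1}^\infty \in T_{\mathrm{fin}}$, such that $\alpha_n = 2\gamma_n$, $\forall\, n \in \mathbb{N}$. Define the element $\tilde b = (\tilde\beta_n)_{n=1}^\infty \in T_{\mathrm{fin}}$ by
\[ \tilde\beta_n = \begin{cases} \beta_n & \text{if } n \le k \\ 1 & \text{if } n = k + 1 \\ 0 & \text{if } n > k + 1 \end{cases} \]
With this definition, the inequalities (23) give
\[ \phi(b) < \phi(b) + \frac{1}{3^{k+1}} < \phi(c) < \phi(\tilde b). \tag{24} \]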
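By Fact 2 above, there exist $N, N' \in \mathbb{N}$ such that
• $\gamma_N = 1$, $\beta_N = 0$, and $\gamma_n = \beta_n$, for all $n \in \mathbb{N}$ with $n < N$;
• $\gamma_{N'} = 0$, $\tilde\beta_{N'} = 1$, and $\gamma_n = \tilde\beta_n$, for all $n \in \mathbb{N}$ with $n < N'$.
We will examine three cases: (a) $N < N'$, (b) $N = N'$, or (c) $N > N'$.
Case (b) is clearly impossible. In case (a), the inequality $N < N'$ forces $\beta_N = 0$, $\gamma_N = 1$ and $\tilde\beta_N = \gamma_N$, which means that $\tilde\beta_N = 1 \ne \beta_N = 0$. This clearly forces $N = k + 1 > \ell(b)$, which in particular gives $\beta_n = \tilde\beta_n = 0$, $\forall\, n > N$, so we clearly have $\gamma_n \ge \tilde\beta_n$, $\forall\, n \in \mathbb{N}$, so we get $\phi(c) \ge \phi(\tilde b)$, thus contradicting (24). In case (c), we have $\gamma_{N'} = 0$, $\tilde\beta_{N'} = 1$, and since $N' < N$, we also have $\beta_{N'} = \gamma_{N'} = 0$. As before this would force $N' = k + 1$. We then have
\[ \phi(c) = 2\sum_{n=1}^\infty \frac{\gamma_n}{3^n} = 2\sum_{n=1}^{N'-1} \frac{\gamma_n}{3^n} + \frac{2\gamma_{N'}}{3^{N'}} + 2\sum_{n=N'+1}^\infty \frac{\gamma_n}{3^n} = 2\sum_{n=1}^{k} \frac{\beta_n}{3^n} + 0 + 2\sum_{n=k+2}^\infty \frac{\gamma_n}{3^n} = \]
\[ = \phi(b) + 2\sum_{n=k+2}^\infty \frac{\gamma_n}{3^n} \le \phi(b) + 2\sum_{n=k+2}^\infty \frac{1}{3^n} = \phi(b) + \frac{1}{3^{k+1}}, \]
again contradicting (24).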


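To prove (22), we start with some element $a \in P \smallsetminus \psi^{-1}(K)$, and we show that $\psi(a) \in D$. The fact that $a \notin \psi^{-1}(K)$ forces the fact that $a \notin \rho(T)$. In particular, this gives the fact that $a = (\alpha_n)_{n=1}^\infty \in \{0, 1, 2\}^{\aleph_0}$ and there exists some $n \in \mathbb{N}$ such that $\alpha_n = 1$. Put
\[ N = \min\{ n \in \mathbb{N} : \alpha_n = 1 \}. \]
Define the element $b = (\beta_n)_{n=1}^\infty \in \{0, 1\}^{\aleph_0}$, by
\[ \beta_n = \begin{cases} \alpha_n / 2 & \text{if } n < N \\ 0 & \text{if } n \ge N \end{cases} \]
Notice that $b \in T_{\mathrm{fin}}$, and $\ell(b) \le N - 1$. Notice also that $2\beta_n = \alpha_n$, for all $n \in \mathbb{N}$ with $n \le N - 1$. In particular, using the equality $\alpha_N = 1$, this gives
\[ \phi(b) + \frac{1}{3^N} = 2\sum_{n=1}^{N-1} \frac{\beta_n}{3^n} + \frac{\alpha_N}{3^N} = \sum_{n=1}^{N} \frac{\alpha_n}{3^n} \le \sum_{n=1}^{\infty} \frac{\alpha_n}{3^n} = \psi(a); \tag{25} \]
\[ \phi(b) + \frac{2}{3^N} = 2\sum_{n=1}^{N-1} \frac{\beta_n}{3^n} + \frac{\alpha_N}{3^N} + \sum_{n=N+1}^{\infty} \frac{2}{3^n} = \sum_{n=1}^{N} \frac{\alpha_n}{3^n} + \sum_{n=N+1}^{\infty} \frac{2}{3^n} \ge \sum_{n=1}^{\infty} \frac{\alpha_n}{3^n} = \psi(a). \tag{26} \]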
Consider the pair $\lambda = (N - 1, b) \in \Lambda$. We are going to show that $\psi(a) \in I_\lambda$, i.e. we have the inequalities
\[ \phi(b) + \frac{1}{3^N} < \psi(a) < \phi(b) + \frac{2}{3^N}. \tag{27} \]
By (25) and (26) it suffices to prove only that
\[ \psi(a) \ne \phi(b) + \frac{1}{3^N} \quad \text{and} \quad \psi(a) \ne \phi(b) + \frac{2}{3^N}. \]
If $\psi(a) = \phi(b) + \frac{1}{3^N}$, then by the inequalities (25), we are forced to have
\[ \alpha_n = 0,\ \forall\, n > N. \tag{28} \]
If $\psi(a) = \phi(b) + \frac{2}{3^N}$, then by the inequalities (26), we are forced to have
\[ \alpha_n = 2,\ \forall\, n > N. \tag{29} \]
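If (28) holds, we define $c = (\gamma_n)_{n=1}^\infty \in T$, by
\[ \gamma_n = \begin{cases} \alpha_n / 2 & \text{if } n < N \\ 0 & \text{if } n = N \\ 1 & \text{if } n > N \end{cases} \]
and we will have
\[ \phi(c) = 2\sum_{n=1}^\infty \frac{\gamma_n}{3^n} = \sum_{n=1}^{N-1} \frac{2\gamma_n}{3^n} + 2\sum_{n=N+1}^\infty \frac{1}{3^n} = \sum_{n=1}^{N-1} \frac{\alpha_n}{3^n} + \frac{1}{3^N} = \psi(a), \]
thus forcing $\psi(a) \in K$, which is impossible.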


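If (29) holds, we define $c = (\gamma_n)_{n=1}^\infty \in T$, by
\[ \gamma_n = \begin{cases} \alpha_n / 2 & \text{if } n < N \\ 1 & \text{if } n = N \\ 0 & \text{if } n > N \end{cases} \]
and we will have
\[ \phi(c) = 2\sum_{n=1}^{N-1} \frac{\gamma_n}{3^n} + \frac{2}{3^N} = \sum_{n=1}^{N-1} \frac{2\gamma_n}{3^n} + \frac{1}{3^N} + \sum_{n=N+1}^\infty \frac{2}{3^n} = \sum_{n=1}^\infty \frac{\alpha_n}{3^n} = \psi(a), \]
thus forcing again $\psi(a) \in K$, which is impossible.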


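Exercise 7. Using the notations above, prove that the set
\[ [0, 1] \smallsetminus K_3 = \bigcup_{\lambda \in \Lambda} I_\lambda \]
is dense in $[0, 1]$.
Hints: Define the set
\[ P_0 = \bigl\{ (\alpha_n)_{n=1}^\infty \in \{0, 1, 2\}^{\aleph_0} : \text{the set } \{ n \in \mathbb{N} : \alpha_n = 1 \} \text{ is infinite} \bigr\}. \]
Prove that $P_0$ is dense in $P$, and prove that $\psi(P_0) \subset [0, 1] \smallsetminus K$. (Use the arguments employed in the proof of part (iii).)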

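Remarks 3.5. If we set $\Lambda_n = \Lambda \cap \bigl( \{n\} \times T_{\mathrm{fin}} \bigr)$, then we can write the complement of the ternary Cantor set as
\[ [0, 1] \smallsetminus K_3 = \bigcup_{n=0}^\infty D_n, \quad \text{where} \quad D_n = \bigcup_{\lambda \in \Lambda_n} I_\lambda. \]
Then the system of open sets $(D_n)_{n \ge 0}$ is pair-wise disjoint. Moreover, each $D_n$ is a union of $2^n$ disjoint intervals of length $1/3^{n+1}$.
Since $\mathrm{card}\, T_0 = \mathfrak{c}$, and the map $\phi_3\big|_{T_0} : T_0 \to K_3$ is injective, we get $\mathrm{card}\, K_3 \ge \mathfrak{c}$. Since we also have $\mathrm{card}\, K_3 \le \mathrm{card}\, \mathbb{R} = \mathfrak{c}$, we get in fact the equality
\[ \mathrm{card}\, K_3 = \mathfrak{c}. \]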
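The description of the sets $D_n$ can be checked by direct computation for small $n$. The sketch below is illustrative only: the function names `phi3` and `level` are hypothetical, and a finite sequence $a$ with $\ell(a) \le n$ is encoded by the tuple of its first $n$ digits. It lists the intervals $I_{(n,a)}$ with exact rational end-points.

```python
from fractions import Fraction
from itertools import product

def phi3(a):
    """phi_3(a) = sum of 2*a_n / 3^n for a finite 0/1 tuple a (zeros afterwards)."""
    return sum(Fraction(2 * alpha, 3 ** n) for n, alpha in enumerate(a, start=1))

def level(n):
    """The intervals I_(n, a) making up D_n, sorted by left end-point.

    Sequences a with l(a) <= n correspond exactly to n-tuples of 0/1 digits.
    """
    out = []
    for a in product((0, 1), repeat=n):
        base = phi3(a)
        out.append((base + Fraction(1, 3 ** (n + 1)),
                    base + Fraction(2, 3 ** (n + 1))))
    return sorted(out)

# All intervals of D_0, ..., D_4, sorted by left end-point.
removed = sorted(iv for n in range(5) for iv in level(n))
total_length = sum(hi - lo for lo, hi in removed)
```

Under these assumptions `level(n)` has $2^n$ entries of length $3^{-(n+1)}$, and the total length removed up to level 4 is $1 - (2/3)^5$, consistent with the complement of $K_3$ having full length in the limit.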
Lecture 21

4. The concept of measure


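Definition. Let $X$ be a non-empty set, and let $\mathcal{E}$ be an arbitrary collection of subsets of $X$. Assume $\varnothing \in \mathcal{E}$. A measure on $\mathcal{E}$ is a map $\mu : \mathcal{E} \to [0, \infty]$ with the following properties:
(0) $\mu(\varnothing) = 0$.
(add$_\sigma$) Whenever $(E_n)_{n=1}^\infty \subset \mathcal{E}$ is a pair-wise disjoint sequence, with $\bigcup_{n=1}^\infty E_n \in \mathcal{E}$, it follows that we have the equality
\[ \mu\Bigl( \bigcup_{n=1}^\infty E_n \Bigr) = \sum_{n=1}^\infty \mu(E_n). \]
Property (add$_\sigma$) is called σ-additivity.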
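Convention. For a sequence $(\alpha_n)_{n=1}^\infty \subset [0, \infty]$ we define
\[ \sum_{n=1}^\infty \alpha_n = \begin{cases} \text{the usual sum } \sum_{n=1}^\infty \alpha_n & \text{if } \alpha_n \in [0, \infty),\ \forall\, n \in \mathbb{N} \\ \infty & \text{if there exists } n \in \mathbb{N} \text{ with } \alpha_n = \infty. \end{cases} \]
(Of course, in the first case, it is still possible to have $\sum_{n=1}^\infty \alpha_n = \infty$.)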
Remark 4.1. If $\mu$ is a measure on $\mathcal{E}$, then $\mu$ is additive, i.e.
(add) Whenever $(E_n)_{n=1}^N \subset \mathcal{E}$ is a finite pair-wise disjoint system, such that $E_1 \cup \cdots \cup E_N \in \mathcal{E}$, it follows that we have the equality
\[ \mu(E_1 \cup \cdots \cup E_N) = \mu(E_1) + \cdots + \mu(E_N). \]
This follows from (add$_\sigma$) and (0), after completing the sequence $E_1, \dots, E_N$ to an infinite sequence by taking $E_n = \varnothing$, $\forall\, n > N$.
Comment. The most natural setting for measures is the one when $\mathcal{E}$ is a σ-ring. In this case, the stipulation that $\bigcup_{n=1}^\infty E_n \in \mathcal{E}$, which appears in the definition, is superfluous.
The purpose of this section is to study measures on more rudimentary collec-
tions.
Examples 4.1. Let X be a non-empty set.
A. If we take E = {∅, X} and we define µ(∅) = 0 and µ(X) to be any element
in [0, ∞], then µ is obviously a measure on {∅, X}.
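B. If we take $\mathcal{E} = \mathcal{P}(X)$ and we define
\[ \mu(E) = \begin{cases} 0 & \text{if } E = \varnothing \\ \infty & \text{if } E \ne \varnothing \end{cases} \]
then $\mu$ is a measure on $\mathcal{P}(X)$.
C. If we take $\mathcal{E} = \mathcal{P}(X)$ and we define
\[ \mu(E) = \begin{cases} \mathrm{card}\, E & \text{if } E \text{ is finite} \\ \infty & \text{if } E \text{ is infinite} \end{cases} \]
then $\mu$ is a measure on $\mathcal{P}(X)$. This is called the counting measure.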
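Examples B and C can be tried out on finite sets. The following minimal sketch is not from the notes: the helper names are hypothetical, only finite Python sets are representable, and `math.inf` stands in for the value $\infty$. It checks additivity on a finite pair-wise disjoint family.

```python
import math

def counting(E):
    """Counting measure of a finite Python set."""
    return len(E)

def zero_or_infinity(E):
    """The measure of Example B: 0 on the empty set, infinity otherwise."""
    return 0 if not E else math.inf

def is_additive_on(mu, pieces):
    """Check mu(union) == sum of mu(piece) for a pairwise disjoint family."""
    union = set().union(*pieces)
    if sum(len(p) for p in pieces) != len(union):
        raise ValueError("the pieces are not pairwise disjoint")
    return mu(union) == sum(mu(p) for p in pieces)
```

For instance, `is_additive_on(counting, [{1, 2}, {3}, set()])` compares `counting({1, 2, 3})` with the sum of the three individual values.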
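Exercise 1. Let $X_1, X_2$ be non-empty spaces, let $f : X_1 \to X_2$ be a map, and let $\mathcal{E}_k \subset \mathcal{P}(X_k)$ be arbitrary collections with $\varnothing \in \mathcal{E}_k$, $k = 1, 2$. Let $\mu_1$ be a measure on $\mathcal{E}_1$ and $\mu_2$ be a measure on $\mathcal{E}_2$. Consider the collections
\[ f_*\mathcal{E}_1 = \{ A \subset X_2 : f^{-1}(A) \in \mathcal{E}_1 \} \subset \mathcal{P}(X_2); \qquad f^*\mathcal{E}_2 = \{ f^{-1}(A) : A \in \mathcal{E}_2 \} \subset \mathcal{P}(X_1). \]
A. Prove that the map $f_*\mu_1 : f_*\mathcal{E}_1 \to [0, \infty]$, defined by
\[ (f_*\mu_1)(A) = \mu_1\bigl( f^{-1}(A) \bigr), \quad \forall\, A \in f_*\mathcal{E}_1, \]
is a measure on $f_*\mathcal{E}_1$.
B. If $f$ is surjective, prove that the map $f^*\mu_2 : f^*\mathcal{E}_2 \to [0, \infty]$, defined by
\[ (f^*\mu_2)(B) = \mu_2\bigl( f(B) \bigr), \quad \forall\, B \in f^*\mathcal{E}_2, \]
is a measure on $f^*\mathcal{E}_2$.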
We now concentrate on the most rudimentary types of collections E on which
measures can be somehow easily defined. Actually, what we have in mind is a set
of easy conditions on a map µ : E → [0, ∞] which would guarantee that µ is a
measure.
Definition. Let X be a non-empty set. A collection J ⊂ P(X) is called a
semiring, if it satisfies the following properties:
• ∅ ∈ J;
• if A, B ∈ J, then A ∩ B ∈ J;
• if A, B ∈ J and A ⊂ B, then there exists an integer n ≥ 1, and sets
D0 , D1 , . . . , Dn ∈ J, such that A = D0 ⊂ D1 ⊂ · · · ⊂ Dn = B, and
Dk r Dk−1 ∈ J, ∀ k ∈ {1, . . . , n}.
Remark that every ring is a semiring.
Exercise 2. Prove that the semiring type is not consistent. Give an example of
two semirings J1 , J2 ⊂ P(X), such that J1 ∩ J2 is not a semiring.
Hint: Use the set X = {1, 2, 3}.
Exercise 3. Let X1 , . . . , Xn be non-empty sets, and let Jk ⊂ P(Xk ), k =
1, . . . , n, be semirings. Prove that
\[ J = \bigl\{ A_1 \times \cdots \times A_n : A_1 \in J_1, \dots, A_n \in J_n \bigr\} \subset \mathcal{P}(X_1 \times \cdots \times X_n) \]
is a semiring.
Hint: First prove the case n = 2, and then use induction.
Example 4.2. Take X = R. The collection
\[ J = \{\varnothing\} \cup \bigl\{ [a, b) : a, b \in \mathbb{R},\ a < b \bigr\} \subset \mathcal{P}(\mathbb{R}) \]
is a semiring.
Indeed, the first two axioms are pretty clear. To prove the third axiom, we
start with two intervals A = [a, b) and B = [c, d) with A ⊂ B. This means that
a ≥ c and b ≤ d. If a = c or b = d, we set D0 = A and D1 = B. If a > c and b < d,
we set D0 = A, D1 = [a, d) and D2 = B.
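The chain used to verify the third axiom can be written out explicitly. In this minimal sketch (the names `chain` and `step_difference` are hypothetical) a half-open interval $[a, b)$ is encoded as the pair `(a, b)`, and each successive difference $D_k \smallsetminus D_{k-1}$ is computed as a single half-open interval, or `None` for the empty set.

```python
def chain(A, B):
    """Return D_0 = A, ..., D_n = B for half-open intervals A within B."""
    (a, b), (c, d) = A, B
    if not (c <= a < b <= d):
        raise ValueError("A must be a non-empty subinterval of B")
    if a == c or b == d:
        return [A, B]
    return [A, (a, d), B]

def step_difference(big, small):
    """big minus small, for nested half-open intervals sharing an end-point."""
    (c, d), (a, b) = big, small
    if (a, b) == (c, d):
        return None          # empty difference
    if a == c:
        return (b, d)        # same left end-point: the difference is [b, d)
    if b == d:
        return (c, a)        # same right end-point: the difference is [c, a)
    raise ValueError("difference is not a single half-open interval")

def differences(ds):
    """The list of D_k minus D_(k-1) along a chain."""
    return [step_difference(ds[k], ds[k - 1]) for k in range(1, len(ds))]
```

With `A = (2, 3)` and `B = (0, 5)` the chain is `[(2, 3), (2, 5), (0, 5)]`, and the differences are the intervals $[3, 5)$ and $[0, 2)$, both members of the semiring.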
More generally, by Exercise 3, the collection of “half-open boxes”
\[ J_n = \{\varnothing\} \cup \Bigl\{ \prod_{j=1}^n [a_j, b_j) : a_1 < b_1, \dots, a_n < b_n \Bigr\} \subset \mathcal{P}(\mathbb{R}^n) \]
is a semiring.
Exercise 4. Let $J_n \subset \mathcal{P}(\mathbb{R}^n)$ be the semiring defined above. Prove that the σ-ring $S(J_n)$ generated by $J_n$ coincides with $\mathrm{Bor}(\mathbb{R}^n)$.
The ring generated by a semiring has a particularly nice description (compare
to Proposition 2.1):
Proposition 4.1. Let J be a semiring on X. For a subset A ⊂ X, the following
are equivalent:
(i) A belongs to R(J), the ring generated by J;
(ii) There exists an integer $n \ge 1$, and a pair-wise disjoint system $(A_j)_{j=1}^n \subset J$, such that $A = A_1 \cup \cdots \cup A_n$.
Proof. Denote by R the collection of all subsets A ⊂ X that satisfy condition
(ii). It is obvious that
J ⊂ R ⊂ R(J),
so (see Section III.2) we only need to prove that R is a ring.
Let us first remark that we obviously have the property:
(i) if A, B ∈ R, and A ∩ B = ∅, then A ∪ B ∈ R.
Secondly, we remark that we have the implication:
(ii) $A, B \in J \Rightarrow A \smallsetminus B \in R$.
Indeed, since $A \cap B \in J$, by the definition of a semiring, there exist $D_0, D_1, \dots, D_n \in J$ with $A \cap B = D_0 \subset D_1 \subset \cdots \subset D_n = A$, and $D_k \smallsetminus D_{k-1} \in J$, $\forall\, k \in \{1, \dots, n\}$. Then the equality
\[ A \smallsetminus B = \bigcup_{k=1}^n (D_k \smallsetminus D_{k-1}) \]
shows that $A \smallsetminus B$ indeed belongs to $R$.
Thirdly, we prove the implication:
(iii) $A, B \in R \Rightarrow A \cap B \in R$.
Write $A = A_1 \cup \cdots \cup A_m$ and $B = B_1 \cup \cdots \cup B_n$, with $(A_i)_{i=1}^m, (B_k)_{k=1}^n \subset J$ pair-wise disjoint systems. If we define the sets $D_{ik} = A_i \cap B_k \in J$, $(i, k) \in \{1, \dots, m\} \times \{1, \dots, n\}$, then it is obvious that
\[ A \cap B = \bigcup_{i=1}^m \bigcup_{k=1}^n D_{ik}, \]
and $(D_{ik})_{1 \le i \le m,\ 1 \le k \le n} \subset J$ is a pair-wise disjoint system, therefore $A \cap B$ indeed belongs to $R$.
Finally, we show the implication:
(iv) if $A, B \in R$ and $A \supset B$, then $A \smallsetminus B \in R$.
Write $A = A_1 \cup \cdots \cup A_m$, with $(A_i)_{i=1}^m \subset J$ a pair-wise disjoint system. Notice that
\[ A \smallsetminus B = \bigcup_{i=1}^m (A_i \smallsetminus B), \]
with $(A_i \smallsetminus B)_{i=1}^m$ a pair-wise disjoint system, so by (i) it suffices to show that $A_i \smallsetminus B \in R$, $\forall\, i \in \{1, \dots, m\}$. To prove this, we fix $i$ and we write $B = B_1 \cup \cdots \cup B_n$, with $(B_k)_{k=1}^n \subset J$ a pair-wise disjoint system. Then
\[ A_i \smallsetminus B = (A_i \smallsetminus B_1) \cap \cdots \cap (A_i \smallsetminus B_n), \]
and the fact that $A_i \smallsetminus B$ belongs to $R$ follows from (ii) and (iii).
Having proven (i)-(iv), we now prove that $R$ is a ring. By (iii), we only need to prove the implication
(∗) $A, B \in R \Rightarrow A \triangle B \in R$.
Using (iii) and (iv), it follows that the sets $A \smallsetminus B = A \smallsetminus (A \cap B)$ and $B \smallsetminus A = B \smallsetminus (A \cap B)$ both belong to $R$. Since $A \triangle B = (A \smallsetminus B) \cup (B \smallsetminus A)$, and $(A \smallsetminus B) \cap (B \smallsetminus A) = \varnothing$, by (i) it follows that $A \triangle B$ indeed belongs to $R$.
Theorem 4.1 (Semiring-to-ring extension). Let $J$ be a semiring on $X$, and let $\mu : J \to [0, \infty]$ be an additive map with $\mu(\varnothing) = 0$.
(i) There exists a unique additive map $\bar\mu : R(J) \to [0, \infty]$, such that $\bar\mu\big|_J = \mu$.
(ii) If $\mu$ is σ-additive, then so is $\bar\mu$.
Proof. The key step is contained in the following
Claim: If $(A_i)_{i=1}^m \subset J$ and $(B_j)_{j=1}^n \subset J$ are pair-wise disjoint systems, with
\[ A_1 \cup \cdots \cup A_m = B_1 \cup \cdots \cup B_n, \]
then $\mu(A_1) + \cdots + \mu(A_m) = \mu(B_1) + \cdots + \mu(B_n)$.
To prove this fact, we define the pair-wise disjoint system $(D_{ij})_{1 \le i \le m,\ 1 \le j \le n}$ by $D_{ij} = A_i \cap B_j$, $\forall\, (i, j) \in \{1, \dots, m\} \times \{1, \dots, n\}$. Since
\[ \bigcup_{j=1}^n D_{ij} = A_i,\ \forall\, i \in \{1, \dots, m\}, \qquad \bigcup_{i=1}^m D_{ij} = B_j,\ \forall\, j \in \{1, \dots, n\}, \]
using additivity, we have the equalities
\[ \sum_{j=1}^n \mu(D_{ij}) = \mu(A_i),\ \forall\, i \in \{1, \dots, m\}, \qquad \sum_{i=1}^m \mu(D_{ij}) = \mu(B_j),\ \forall\, j \in \{1, \dots, n\}, \]
and then we get
\[ \sum_{i=1}^m \mu(A_i) = \sum_{i=1}^m \Bigl( \sum_{j=1}^n \mu(D_{ij}) \Bigr) = \sum_{j=1}^n \Bigl( \sum_{i=1}^m \mu(D_{ij}) \Bigr) = \sum_{j=1}^n \mu(B_j). \]

To prove (i), for any set $A \in R(J)$ we choose (use Proposition 4.1) a finite pair-wise disjoint system $(A_i)_{i=1}^n \subset J$, with $A = A_1 \cup \cdots \cup A_n$, and we define
\[ \bar\mu(A) = \mu(A_1) + \cdots + \mu(A_n). \tag{1} \]
By the above Claim, the number $\bar\mu(A)$ is independent of the particular choice of the pair-wise disjoint system $(A_i)_{i=1}^n$. Also, it is clear that $\bar\mu\big|_J = \mu$, and $\bar\mu$ is additive. The uniqueness is also clear, because the equality $\bar\mu\big|_J = \mu$ and additivity of $\bar\mu$ force (1).
(ii). Assume now that $\mu$ is σ-additive, and let us prove that $\bar\mu$ is again σ-additive. Start with a pair-wise disjoint sequence $(A_n)_{n=1}^\infty \subset R(J)$, with $\bigcup_{n=1}^\infty A_n \in R(J)$, and let us prove the equality
\[ \bar\mu\Bigl( \bigcup_{n=1}^\infty A_n \Bigr) = \sum_{n=1}^\infty \bar\mu(A_n). \tag{2} \]
Since $\bigcup_{n=1}^\infty A_n \in R(J)$, there exists a finite pair-wise disjoint system $(B_i)_{i=1}^p \subset J$, such that $\bigcup_{n=1}^\infty A_n = B_1 \cup \cdots \cup B_p$. With this choice we have
\[ \bar\mu\Bigl( \bigcup_{n=1}^\infty A_n \Bigr) = \sum_{i=1}^p \mu(B_i). \tag{3} \]
For each $i \in \{1, \dots, p\}$, we have $B_i = \bigcup_{n=1}^\infty (B_i \cap A_n)$. Fix for the moment a pair $(n, i) \in \mathbb{N} \times \{1, \dots, p\}$. Since $B_i \cap A_n \in R(J)$, it follows that there exist an integer $N_{ni} \ge 1$ and a finite pair-wise disjoint system $(C^{ni}_k)_{k=1}^{N_{ni}} \subset J$, such that $B_i \cap A_n = \bigcup_{k=1}^{N_{ni}} C^{ni}_k$.
Since, for each $i \in \{1, \dots, p\}$, the countable system $(C^{ni}_k)_{n \in \mathbb{N},\ 1 \le k \le N_{ni}} \subset J$ is pair-wise disjoint, and we have the equality
\[ \bigcup_{n=1}^\infty \bigcup_{k=1}^{N_{ni}} C^{ni}_k = \bigcup_{n=1}^\infty (B_i \cap A_n) = B_i \in J, \]
by the σ-additivity of $\mu$, we have
\[ \mu(B_i) = \sum_{n=1}^\infty \sum_{k=1}^{N_{ni}} \mu(C^{ni}_k), \quad \forall\, i \in \{1, \dots, p\}. \tag{4} \]
Since, for each $n \in \mathbb{N}$, the finite system $(C^{ni}_k)_{1 \le i \le p,\ 1 \le k \le N_{ni}} \subset J$ is pair-wise disjoint, and we have the equality
\[ \bigcup_{i=1}^p \bigcup_{k=1}^{N_{ni}} C^{ni}_k = \bigcup_{i=1}^p (B_i \cap A_n) = A_n \in R(J), \]
by the definition of $\bar\mu$, we have
\[ \bar\mu(A_n) = \sum_{i=1}^p \sum_{k=1}^{N_{ni}} \mu(C^{ni}_k), \quad \forall\, n \in \mathbb{N}. \]
Combining this with (4) yields
\[ \sum_{n=1}^\infty \bar\mu(A_n) = \sum_{n=1}^\infty \sum_{i=1}^p \sum_{k=1}^{N_{ni}} \mu(C^{ni}_k) = \sum_{i=1}^p \mu(B_i), \]
and the equality (2) follows from (3).
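The Claim at the start of the proof can be illustrated with the length function on half-open intervals. This is only a sketch under stated assumptions: intervals $[a, b)$ are encoded as pairs `(a, b)`, and the names `mu_bar` and `refine` as well as the two sample decompositions are hypothetical. Two decompositions of the same set are compared through their common refinement $D_{ij} = A_i \cap B_j$.

```python
def mu_bar(pieces):
    """Sum of lengths over a pairwise disjoint list of half-open intervals."""
    pieces = sorted(pieces)
    for (a, b), (c, d) in zip(pieces, pieces[1:]):
        if c < b:
            raise ValueError("pieces overlap")
    return sum(b - a for a, b in pieces)

def refine(pieces_a, pieces_b):
    """Common refinement: the non-empty intersections A_i with B_j."""
    out = []
    for a, b in pieces_a:
        for c, d in pieces_b:
            lo, hi = max(a, c), min(b, d)
            if lo < hi:
                out.append((lo, hi))
    return out

# Two decompositions of the same set [0, 3) union [5, 6).
A = [(0, 1), (1, 3), (5, 6)]
B = [(0, 2), (2, 3), (5, 6)]
D = refine(A, B)
```

All three of `mu_bar(A)`, `mu_bar(B)` and `mu_bar(D)` give the same value, which is the content of the Claim in this special case.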


Definition. Let $X$ be a non-empty set, and let $\mathcal{E} \subset \mathcal{P}(X)$ be a collection of sets. We say that a map $\mu : \mathcal{E} \to [0, \infty]$ is sub-additive, if
(add$^-$) whenever $A \in \mathcal{E}$, and $(A_k)_{k=1}^n$ is a finite sequence in $\mathcal{E}$ with $A \subset \bigcup_{k=1}^n A_k$, it follows that $\mu(A) \le \sum_{k=1}^n \mu(A_k)$.


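Note that we do not require the $A_k$’s to be pair-wise disjoint. With this terminology, Theorem 4.1 has the following consequence.
Corollary 4.1. Let $X$ be a non-empty set, and let $J \subset \mathcal{P}(X)$ be a semiring. Then any additive map $\mu : J \to [0, \infty]$ is sub-additive.
Proof. Let $\bar\mu : R(J) \to [0, \infty]$ be the additive extension of $\mu$ to the ring generated by $J$. It suffices to prove that $\bar\mu$ is sub-additive. Start with sets $A, A_1, \dots, A_n \in R(J)$ such that $A \subset A_1 \cup \cdots \cup A_n$. Define the sets $B_1 = A_1$, and
\[ B_k = A_k \smallsetminus (A_1 \cup \cdots \cup A_{k-1}), \quad \text{for all } k \in \{2, \dots, n\}. \]
Since we work in a ring, the sets $B_k$, $B_k \cap A$, $B_k \smallsetminus A$, and $A_k \smallsetminus B_k$, $k \in \{1, \dots, n\}$, all belong to $R(J)$. Moreover, the sequence $(B_k)_{k=1}^n$ is pair-wise disjoint and it satisfies
• $B_k \subset A_k$, $\forall\, k \in \{1, \dots, n\}$,
• $\bigcup_{k=1}^n B_k = \bigcup_{k=1}^n A_k \supset A$,
so by the additivity of $\bar\mu$, we get
\[ \sum_{k=1}^n \bar\mu(A_k) = \sum_{k=1}^n \bar\mu\bigl( (A_k \smallsetminus B_k) \cup B_k \bigr) = \sum_{k=1}^n \bigl[ \bar\mu(A_k \smallsetminus B_k) + \bar\mu(B_k) \bigr] \ge \]
\[ \ge \sum_{k=1}^n \bar\mu(B_k) = \sum_{k=1}^n \bar\mu\bigl( (B_k \smallsetminus A) \cup (B_k \cap A) \bigr) = \sum_{k=1}^n \bigl[ \bar\mu(B_k \smallsetminus A) + \bar\mu(B_k \cap A) \bigr] \ge \]
\[ \ge \sum_{k=1}^n \bar\mu(B_k \cap A) = \bar\mu\Bigl( \bigcup_{k=1}^n [B_k \cap A] \Bigr) = \bar\mu(A). \]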
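The disjointification step $B_k = A_k \smallsetminus (A_1 \cup \cdots \cup A_{k-1})$ used in the proof is easy to carry out on finite sets. The sketch below is illustrative: the function name `disjointify` and the sample sets are hypothetical, and the counting measure `len` plays the role of $\bar\mu$.

```python
def disjointify(sets):
    """Return B_1, ..., B_n with B_1 = A_1 and B_k = A_k minus (A_1 u ... u A_(k-1))."""
    seen = set()
    out = []
    for A_k in sets:
        out.append(set(A_k) - seen)   # remove everything already covered
        seen |= set(A_k)
    return out

# Hypothetical sample: A is covered by three overlapping sets.
A = {1, 2, 3}
cover = [{1, 2}, {2, 3, 4}, {4, 5}]
B = disjointify(cover)

# The three sums appearing in the chain of inequalities, for the counting measure.
sum_cover = sum(len(A_k) for A_k in cover)
sum_disjoint = sum(len(B_k) for B_k in B)
sum_inside = sum(len(B_k & A) for B_k in B)
```

Here `sum_cover >= sum_disjoint >= sum_inside`, and the last sum equals `len(A)`, mirroring the chain of inequalities in the proof.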

Exercise 5*. Let $X_1, X_2$ be non-empty sets, let $J_k \subset \mathcal{P}(X_k)$, $k = 1, 2$, be semirings, and let $\mu_k : J_k \to [0, \infty]$ be additive maps. Consider the semiring (see Exercise 3)
\[ J = \bigl\{ A_1 \times A_2 : A_1 \in J_1,\ A_2 \in J_2 \bigr\} \subset \mathcal{P}(X_1 \times X_2). \]
Then the map $\mu : J \to [0, \infty]$ defined by
\[ \mu(A_1 \times A_2) = \mu_1(A_1) \cdot \mu_2(A_2) \]
is additive. Here we use the convention $0 \cdot \infty = \infty \cdot 0 = 0$.
Hints: One wants to show that, whenever $A_1 \times A_2 \in J$ is written as a union
\[ A_1 \times A_2 = \bigcup_{k=1}^n (A^k_1 \times A^k_2), \]
with $(A^k_1 \times A^k_2)_{k=1}^n \subset J$ pair-wise disjoint, it follows that
\[ \mu_1(A_1) \cdot \mu_2(A_2) = \sum_{k=1}^n \mu_1(A^k_1) \cdot \mu_2(A^k_2). \]
Analyze first the case of “strips,” that is, when $A^1_1 = \cdots = A^n_1 = A_1$ or $A^1_2 = \cdots = A^n_2 = A_2$. In the general case, use induction, by picking some $k$ such that $A^k_1 \subsetneq A_1$ and splitting $A_1 \times A_2$ into “strips” of the form $B_\ell \times A_2$, where $B_1, \dots, B_m \in J_1$ are pairwise disjoint, with $B_1 = A^k_1$ and $B_1 \cup \cdots \cup B_m = A_1$.
Comment. In connection with the above exercise, one can ask the following
Question: With the notations above, is it true that, if both µ1 and µ2 are
measures, then µ is also a measure?
As we shall see a bit later in the course, the answer is “yes.”
Definition. Let $X$ be a non-empty set, and let $\mathcal{E} \subset \mathcal{P}(X)$ be a collection of sets. We say that a map $\mu : \mathcal{E} \to [0, \infty]$ is σ-sub-additive, if
(add$^-_\sigma$) whenever $A \in \mathcal{E}$, and $(A_n)_{n=1}^\infty$ is a sequence in $\mathcal{E}$ with $A \subset \bigcup_{n=1}^\infty A_n$, it follows that $\mu(A) \le \sum_{n=1}^\infty \mu(A_n)$.
Note that we do not require the $A_n$’s to be pair-wise disjoint.
Proposition 4.2 (characterization of semiring measures). Let X be a non-
empty set, let J ⊂ P(X) be a semiring, and let µ : J → [0, ∞] be a map with
µ(∅) = 0. The following are equivalent:
(i) µ is a measure on J;
(ii) µ is additive, and σ-sub-additive.

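Proof. (i) ⇒ (ii). Assume $\mu$ is a measure on $J$. It is clear that $\mu$ is additive, so we only need to prove σ-sub-additivity. Use Theorem 4.1 to find a measure $\bar\mu$ on the ring $R(J)$ generated by $J$, such that
\[ \bar\mu(A) = \mu(A), \quad \forall\, A \in J. \]
Then it suffices to show that $\bar\mu$ is σ-sub-additive. Start with a set $A \in R(J)$, and a sequence $(A_n)_{n=1}^\infty \subset R(J)$, such that $A \subset \bigcup_{n=1}^\infty A_n$. Define the sets $B_1 = A_1$, and
\[ B_n = A_n \smallsetminus (A_1 \cup \cdots \cup A_{n-1}), \quad \text{for all } n \ge 2. \]
Since we work in a ring, the sets $B_n$, $B_n \cap A$, $B_n \smallsetminus A$, and $A_n \smallsetminus B_n$, $n \in \mathbb{N}$, all belong to $R(J)$. Moreover, the sequence $(B_n)_{n=1}^\infty$ is pair-wise disjoint and it satisfies
• $B_n \subset A_n$, $\forall\, n \in \mathbb{N}$,
• $\bigcup_{n=1}^\infty B_n = \bigcup_{n=1}^\infty A_n \supset A$,
so by σ-additivity of $\bar\mu$, we get
\[ \sum_{n=1}^\infty \bar\mu(A_n) = \sum_{n=1}^\infty \bar\mu\bigl( (A_n \smallsetminus B_n) \cup B_n \bigr) = \sum_{n=1}^\infty \bigl[ \bar\mu(A_n \smallsetminus B_n) + \bar\mu(B_n) \bigr] \ge \]
\[ \ge \sum_{n=1}^\infty \bar\mu(B_n) = \sum_{n=1}^\infty \bar\mu\bigl( (B_n \smallsetminus A) \cup (B_n \cap A) \bigr) = \sum_{n=1}^\infty \bigl[ \bar\mu(B_n \smallsetminus A) + \bar\mu(B_n \cap A) \bigr] \ge \]
\[ \ge \sum_{n=1}^\infty \bar\mu(B_n \cap A) = \bar\mu\Bigl( \bigcup_{n=1}^\infty [B_n \cap A] \Bigr) = \bar\mu(A). \]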

(ii) ⇒ (i). Assume $\mu : J \to [0, \infty]$ is additive and σ-sub-additive, and let us show that $\mu$ is σ-additive. We again use Theorem 4.1, to find an additive map $\bar\mu : R(J) \to [0, \infty]$, such that $\bar\mu\big|_J = \mu$. Start with a pair-wise disjoint sequence $(A_n)_{n=1}^\infty \subset J$, such that the union $A = \bigcup_{n=1}^\infty A_n$ belongs to $J$. On the one hand, by σ-sub-additivity, we have the inequality
\[ \mu(A) \le \sum_{n=1}^\infty \mu(A_n). \tag{5} \]
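On the other hand, for any integer $N \ge 1$, we have
\[ \mu(A) = \bar\mu(A) = \bar\mu\Bigl( \Bigl[ \bigcup_{n=1}^N A_n \Bigr] \cup \Bigl[ A \smallsetminus \bigcup_{n=1}^N A_n \Bigr] \Bigr) \ge \bar\mu\Bigl( \bigcup_{n=1}^N A_n \Bigr) = \sum_{n=1}^N \bar\mu(A_n) = \sum_{n=1}^N \mu(A_n), \]
which then gives
\[ \mu(A) \ge \sup_{N \in \mathbb{N}} \sum_{n=1}^N \mu(A_n) = \sum_{n=1}^\infty \mu(A_n), \]
so using (5) we immediately get $\mu(A) = \sum_{n=1}^\infty \mu(A_n)$.
The following technical result will be often employed in subsequent sections.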
The following technical result will be often employed in subsequent sections.
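Lemma 4.1 (Continuity). Let $J$ be a semiring, and let $\mu$ be a measure on $J$.
(i) If $(A_n)_{n=1}^\infty \subset J$ is a sequence of sets, with $A_1 \subset A_2 \subset \dots$, and $\bigcup_{n=1}^\infty A_n \in J$, then
\[ \mu\Bigl( \bigcup_{n=1}^\infty A_n \Bigr) = \lim_{n \to \infty} \mu(A_n). \]
(ii) If $(B_n)_{n=1}^\infty \subset J$ is a sequence of sets, with $B_1 \supset B_2 \supset \dots$, and $\bigcap_{n=1}^\infty B_n \in J$, and $\mu(B_1) < \infty$, then
\[ \mu\Bigl( \bigcap_{n=1}^\infty B_n \Bigr) = \lim_{n \to \infty} \mu(B_n). \]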

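Proof. Using Theorem 4.1, we can assume that $J$ is already a ring. (Otherwise we replace $J$ by $R(J)$, and $\mu$ by its extension $\bar\mu$.)
(i). Consider the sets $D_1 = A_1$, and $D_k = A_k \smallsetminus A_{k-1}$, $\forall\, k \ge 2$. It is clear that $(D_k)_{k=1}^\infty$ is a pairwise disjoint sequence in $J$, and we have the equality
\[ \bigcup_{k=1}^n D_k = A_n, \quad \forall\, n \ge 1. \tag{6} \]
This gives of course the equality
\[ \bigcup_{k=1}^\infty D_k = \bigcup_{n=1}^\infty A_n \in J. \]
Using this equality, combined with the (σ-)additivity of $\mu$, and with (6), we get
\[ \mu\Bigl( \bigcup_{n=1}^\infty A_n \Bigr) = \sum_{k=1}^\infty \mu(D_k) = \lim_{n \to \infty} \sum_{k=1}^n \mu(D_k) = \lim_{n \to \infty} \mu\Bigl( \bigcup_{k=1}^n D_k \Bigr) = \lim_{n \to \infty} \mu(A_n). \]
(ii). Consider the sets $B = \bigcap_{n=1}^\infty B_n$, and $A_n = B_1 \smallsetminus B_n$, $\forall\, n \ge 1$. It is clear that $(A_n)_{n=1}^\infty \subset J$, and we have $A_1 \subset A_2 \subset \dots$. Moreover, we have $\bigcup_{n=1}^\infty A_n = B_1 \smallsetminus B$, so by part (i), we get
\[ \mu(B_1 \smallsetminus B) = \lim_{n \to \infty} \mu(B_1 \smallsetminus B_n). \tag{7} \]
Using the fact that $\mu(B_1) < \infty$, it follows that
\[ \mu(B) \le \mu(B_n) \le \mu(B_1) < \infty, \quad \forall\, n \ge 1. \]
This gives then the equalities
\[ \mu(B_1 \smallsetminus B) = \mu(B_1) - \mu(B) \quad \text{and} \quad \mu(B_1 \smallsetminus B_n) = \mu(B_1) - \mu(B_n),\ \forall\, n \ge 1, \]
so the equality (7) immediately gives $\mu(B) = \lim_{n \to \infty} \mu(B_n)$.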

The above result has a (minor) generalization, which we record for future use.
To formulate it we introduce the following.
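Notation. Let $R$ be a ring, and let $\mu$ be a measure on $R$. For two sets $A, B \in R$, we write $A \underset{\mu}{\subset} B$, if $\mu(A \smallsetminus B) = 0$.
Using this notation, we have the following generalization of Lemma 4.1.
Proposition 4.3. Let $R$ be a ring, and let $\mu$ be a measure on $R$.
(i) If $(A_n)_{n=1}^\infty \subset R$ is a sequence of sets, with $A_1 \underset{\mu}{\subset} A_2 \underset{\mu}{\subset} \dots$, and $\bigcup_{n=1}^\infty A_n \in R$, then
\[ \mu\Bigl( \bigcup_{n=1}^\infty A_n \Bigr) = \lim_{n \to \infty} \mu(A_n). \]
(ii) If $(B_n)_{n=1}^\infty \subset R$ is a sequence of sets, with $B_1 \underset{\mu}{\supset} B_2 \underset{\mu}{\supset} \dots$, and $\bigcap_{n=1}^\infty B_n \in R$, and $\mu(B_1) < \infty$, then
\[ \mu\Bigl( \bigcap_{n=1}^\infty B_n \Bigr) = \lim_{n \to \infty} \mu(B_n). \]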
Sn
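Proof. (i). Define the sequence of sets $(E_n)_{n=1}^\infty \subset R$, by $E_n = \bigcup_{k=1}^n A_k$, $\forall\, n \ge 1$. Notice that $A_1 = E_1$, and for each $n \ge 2$, we have $A_n \subset E_n$, as well as the equality
\[ E_n \smallsetminus A_n = \bigcup_{k=1}^{n-1} [A_k \smallsetminus A_n]. \]
Using sub-additivity, it follows that
\[ \mu(E_n \smallsetminus A_n) \le \sum_{k=1}^{n-1} \mu(A_k \smallsetminus A_n), \]
which forces $\mu(E_n \smallsetminus A_n) = 0$. This gives
\[ \mu(E_n) = \mu(A_n) + \mu(E_n \smallsetminus A_n) = \mu(A_n), \quad \forall\, n \ge 1. \tag{8} \]
Since $\bigcup_{n=1}^\infty E_n = \bigcup_{n=1}^\infty A_n$, and we have the inclusions $E_1 \subset E_2 \subset \dots$, by Lemma 4.1, combined with (8), we get
\[ \mu\Bigl( \bigcup_{n=1}^\infty A_n \Bigr) = \mu\Bigl( \bigcup_{n=1}^\infty E_n \Bigr) = \lim_{n \to \infty} \mu(E_n) = \lim_{n \to \infty} \mu(A_n). \]
Part (ii) is proven exactly as part (ii) from Lemma 4.1.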

Exercise 6. Let $\mu$ be a measure on a ring $R$. Prove that, for $A, B \in R$, one has the implication
\[ A \underset{\mu}{\subset} B \ \Rightarrow\ \mu(A) \le \mu(B). \]
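Example 4.3. Fix some integer $n \ge 1$. Consider the semiring of “half-open boxes” in $\mathbb{R}^n$
\[ J_n = \{\varnothing\} \cup \Bigl\{ \prod_{j=1}^n [a_j, b_j) : a_1 < b_1, \dots, a_n < b_n \Bigr\} \subset \mathcal{P}(\mathbb{R}^n). \]
For a non-empty box $A = [a_1, b_1) \times \cdots \times [a_n, b_n) \in J_n$, we define
\[ \mathrm{vol}_n(A) = \prod_{k=1}^n (b_k - a_k). \]
We also define $\mathrm{vol}_n(\varnothing) = 0$.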


Theorem 4.2. With the above notations, the map $\mathrm{vol}_n : J_n \to [0, \infty]$ is a measure on $J_n$.

Proof. First we prove additivity. Using Exercise ?? (and induction on $n$) it suffices to analyze only the case $n = 1$, i.e. the case of half-open intervals in $\mathbb{R}$. We need to show the implication
\[ \left. \begin{array}{l} [a, b) = \bigcup_{k=1}^p [a_k, b_k) \\ \bigl( [a_k, b_k) \bigr)_{k=1}^p \text{ pair-wise disjoint} \end{array} \right\} \Longrightarrow b - a = \sum_{k=1}^p (b_k - a_k). \tag{9} \]

We can prove this using induction on $p$. The case $p = 1$ is trivial. Assuming that the above fact holds for $p = N$, let us prove it for $p = N + 1$. Pick $k_1 \in \{1, \dots, N + 1\}$ such that $a_{k_1} = a$. Then we clearly have
\[ \bigcup_{\substack{1 \le k \le N+1 \\ k \ne k_1}} [a_k, b_k) = [b_{k_1}, b), \]
so by the inductive hypothesis we get
\[ b - b_{k_1} = \sum_{\substack{1 \le k \le N+1 \\ k \ne k_1}} (b_k - a_k), \]
so we get
\[ \sum_{k=1}^{N+1} (b_k - a_k) = (b_{k_1} - a_{k_1}) + (b - b_{k_1}) = b - a_{k_1} = b - a, \]
and we are done.
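The induction above amounts to repeatedly peeling off the piece that starts at the current left end-point. A minimal sketch of that procedure follows; it is illustrative only, the function name `total_length` is hypothetical, and intervals $[a_k, b_k)$ are encoded as pairs.

```python
def total_length(a, b, pieces):
    """Sum the lengths of pairwise disjoint pieces covering [a, b),
    following the induction: peel off the piece whose left end-point is a."""
    if a == b:
        if pieces:
            raise ValueError("left-over pieces: not a partition of [a, b)")
        return 0
    first = next((p for p in pieces if p[0] == a), None)
    if first is None:
        raise ValueError("no piece starts at the current left end-point")
    rest = [p for p in pieces if p != first]
    # The remaining pieces must partition [first[1], b).
    return (first[1] - first[0]) + total_length(first[1], b, rest)
```

For example, `total_length(0, 10, [(4, 7), (0, 4), (7, 10)])` peels off $[0, 4)$, then $[4, 7)$, then $[7, 10)$, and the lengths add up to $b - a = 10$. A family that is not a partition of $[a, b)$ raises `ValueError`.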


We now prove that $\mathrm{vol}_n$ is σ-sub-additive. Suppose we have $A \in J_n$ and a sequence $(A_k)_{k=1}^\infty \subset J_n$, such that $A \subset \bigcup_{k=1}^\infty A_k$, and let us prove the inequality
\[ \mathrm{vol}_n(A) \le \sum_{k=1}^\infty \mathrm{vol}_n(A_k). \tag{10} \]
It will be helpful to introduce the following notations. For every half-open box
\[ B = [x_1, y_1) \times \cdots \times [x_n, y_n), \]
and every $\delta > 0$, we define the boxes
\[ B^\delta = [x_1 - \delta, y_1) \times \cdots \times [x_n - \delta, y_n) \quad \text{and} \quad B_\delta = [x_1, y_1 - \delta) \times \cdots \times [x_n, y_n - \delta). \]
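It is clear that, for any box $B \in J_n$ we have
\[ B_\delta \subset B \subset \mathrm{Int}(B^\delta), \tag{11} \]
\[ \mathrm{vol}_n(B) = \lim_{\delta \to 0^+} \mathrm{vol}_n(B^\delta) = \lim_{\delta \to 0^+} \mathrm{vol}_n(B_\delta). \tag{12} \]
To prove (10), we fix some $\varepsilon > 0$, and we choose positive numbers $\delta$ and $(\delta_k)_{k=1}^\infty$, such that
\[ \mathrm{vol}_n(A_\delta) > \mathrm{vol}_n(A) - \varepsilon, \quad \text{and} \quad \mathrm{vol}_n\bigl( (A_k)^{\delta_k} \bigr) < \frac{\varepsilon}{2^k} + \mathrm{vol}_n(A_k),\ \forall\, k \in \mathbb{N}. \tag{13} \]
Notice now that the closure $\overline{A_\delta}$ is still contained in $A$, so using (11), we have the inclusions
\[ \overline{A_\delta} \subset A \subset \bigcup_{k=1}^\infty A_k \subset \bigcup_{k=1}^\infty \mathrm{Int}\bigl( (A_k)^{\delta_k} \bigr), \]
and using the compactness of $\overline{A_\delta}$, there exists some $N \ge 1$, such that
\[ \overline{A_\delta} \subset \bigcup_{k=1}^N \mathrm{Int}\bigl( (A_k)^{\delta_k} \bigr). \]
This immediately gives the inclusion
\[ A_\delta \subset \bigcup_{k=1}^N (A_k)^{\delta_k}. \]
Using sub-additivity (see Corollary 4.1) we now get
\[ \mathrm{vol}_n(A_\delta) \le \sum_{k=1}^N \mathrm{vol}_n\bigl( (A_k)^{\delta_k} \bigr), \]
and using (13) we have
\[ \mathrm{vol}_n(A) - \varepsilon \le \sum_{k=1}^N \Bigl[ \frac{\varepsilon}{2^k} + \mathrm{vol}_n(A_k) \Bigr] \le \varepsilon + \sum_{k=1}^N \mathrm{vol}_n(A_k) \le \varepsilon + \sum_{k=1}^\infty \mathrm{vol}_n(A_k). \]
This gives
\[ \mathrm{vol}_n(A) - 2\varepsilon \le \sum_{k=1}^\infty \mathrm{vol}_n(A_k). \]
But since this inequality holds for all $\varepsilon > 0$, the inequality (10) immediately follows.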

Lecture 22

5. Outer measures
Although measures can be defined on arbitrary collections of sets, the most
natural domain of a measure is a σ-ring. In the previous section we dealt however
only with (semi)rings. Therefore it is natural to ask the following
Question 1: Given a measure µ on a (semi)ring J, is it possible to extend it
to a measure on the σ-ring S(J) generated by J?
As a particular case of the above question, we can specifically ask if there exists a
measure on Bor(Rn ), which agrees with voln on “half-open boxes.”
As a consequence of a remarkably clever construction, due to Caratheodory, we
will be able to answer the above general question in the affirmative. Caratheodory’s
approach is based on the following concept.
Definition. Given a non-empty set X, an outer measure on X is simply a
map ν : P(X) → [0, ∞] with the following properties.
(0) ν(∅) = 0.
(m) If A, B ∈ P(X) are such that A ⊂ B, then ν(A) ≤ ν(B).
(add$^-_\sigma$) $\nu$ is σ-sub-additive, i.e. whenever $A \in \mathcal{P}(X)$, and $(A_n)_{n=1}^\infty$ is a sequence in $\mathcal{P}(X)$ with $A \subset \bigcup_{n=1}^\infty A_n$, it follows that $\nu(A) \le \sum_{n=1}^\infty \nu(A_n)$.
The property (m) is called monotonicity.
Remark that ν is automatically sub-additive, in the sense that, whenever
A, A1 , . . . , An ∈ P(X) are such that A ⊂ A1 ∪ · · · ∪ An , it follows that ν(A) ≤
ν(A1 ) + · · · + ν(An ).
The following result explains how a measure on a semiring can be naturally
extended to an outer measure on the ambient space.
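Proposition 5.1. Let $X$ be a non-empty set, let $J$ be a semiring on $X$, and let $\mu : J \to [0, \infty]$ be a measure on $J$. Consider the collection
\[ \mathcal{P}^J_\sigma(X) = \Bigl\{ A \subset X : \text{there exists } (B_n)_{n=1}^\infty \subset J, \text{ with } A \subset \bigcup_{n=1}^\infty B_n \Bigr\}. \]
Define the map $\bar\mu : \mathcal{P}^J_\sigma(X) \to [0, \infty]$ by
\[ \bar\mu(A) = \inf\Bigl\{ \sum_{n=1}^\infty \mu(B_n) : (B_n)_{n=1}^\infty \subset J,\ A \subset \bigcup_{n=1}^\infty B_n \Bigr\}, \quad \forall\, A \in \mathcal{P}^J_\sigma(X). \]
Then the map $\mu^* : \mathcal{P}(X) \to [0, \infty]$, defined by
\[ \mu^*(A) = \begin{cases} \bar\mu(A) & \text{if } A \in \mathcal{P}^J_\sigma(X) \\ \infty & \text{if } A \notin \mathcal{P}^J_\sigma(X) \end{cases} \]
is an outer measure on $X$, and $\mu^*\big|_J = \mu$.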
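Proof. It is obvious that $\mu^*(\varnothing) = 0$. It is also clear that $\mu^*$ is monotone. To prove that $\mu^*$ is σ-sub-additive, start with $A \in \mathcal{P}(X)$ and a sequence $(A_n)_{n=1}^\infty \subset \mathcal{P}(X)$, such that $A \subset \bigcup_{n=1}^\infty A_n$, and let us prove the inequality $\mu^*(A) \le \sum_{n=1}^\infty \mu^*(A_n)$. If there exists some $n$ with $A_n \notin \mathcal{P}^J_\sigma(X)$, there is nothing to prove. Assume $A_n \in \mathcal{P}^J_\sigma(X)$, for all $n$. Then it is clear that $A \in \mathcal{P}^J_\sigma(X)$. We can also assume $\bar\mu(A_n) < \infty$ for all $n$, since otherwise there is again nothing to prove. Fix for the moment some $\varepsilon > 0$. For every $n \in \mathbb{N}$ choose a sequence $(B^n_k)_{k=1}^\infty \subset J$, with $A_n \subset \bigcup_{k=1}^\infty B^n_k$, such that
\[ \sum_{k=1}^\infty \mu(B^n_k) < \frac{\varepsilon}{2^n} + \bar\mu(A_n). \]
It is clear that, if we list the countable family $(B^n_k)_{n,k=1}^\infty$ as a sequence $(D_m)_{m=1}^\infty$, then $A \subset \bigcup_{m=1}^\infty D_m$, and
\[ \bar\mu(A) \le \sum_{m=1}^\infty \mu(D_m) = \sum_{n=1}^\infty \sum_{k=1}^\infty \mu(B^n_k) \le \sum_{n=1}^\infty \Bigl[ \frac{\varepsilon}{2^n} + \bar\mu(A_n) \Bigr] = \varepsilon + \sum_{n=1}^\infty \bar\mu(A_n). \]
Since the above inequality holds for all $\varepsilon > 0$, we conclude that
\[ \mu^*(A) = \bar\mu(A) \le \sum_{n=1}^\infty \bar\mu(A_n) = \sum_{n=1}^\infty \mu^*(A_n), \]
so $\mu^*$ is indeed σ-sub-additive.
Finally, we must show that $\mu^*\big|_J = \mu$. Start with some $A \in J$. On the one hand, since $\mu$ is a measure on $J$, we know that $\mu$ is σ-sub-additive (see Proposition 4.2). This means that, for any sequence $(B_n)_{n=1}^\infty \subset J$ with $A \subset \bigcup_{n=1}^\infty B_n$, we have $\sum_{n=1}^\infty \mu(B_n) \ge \mu(A)$. Since $A$ obviously belongs to $\mathcal{P}^J_\sigma(X)$, this will force
\[ \mu^*(A) = \bar\mu(A) \ge \mu(A). \]
On the other hand, if we consider the sequence $B_1 = A$, $B_2 = B_3 = \cdots = \varnothing$, then we clearly have $\sum_{n=1}^\infty \mu(B_n) = \mu(A)$, which gives $\bar\mu(A) \le \mu(A)$, so in fact we must have equality $\bar\mu(A) = \mu(A)$.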
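When $X$ and $J$ are finite, the infimum defining $\bar\mu$ can be computed by brute force: repeating a member of $J$ in a cover never lowers the sum, so it is enough to minimize over sub-families of $J$. The following is a minimal sketch under that finiteness assumption. The name `outer_extension` and the dictionary encoding of $\mu$ are hypothetical; the sample data are those of Example 5.1 ($X = \{1, 2\}$, $J = \{\varnothing, X\}$, $\mu(X) = 1$).

```python
import math
from itertools import combinations

def outer_extension(mu, A):
    """mu*(A) for a measure mu given as a dict {frozenset: value} on a finite semiring.

    Returns math.inf when no family of members of the semiring covers A.
    """
    A = frozenset(A)
    members = [B for B in mu if B]          # the empty set never helps a cover
    best = math.inf
    for r in range(len(members) + 1):
        for cover in combinations(members, r):
            if A <= frozenset().union(*cover):
                best = min(best, sum(mu[B] for B in cover))
    return best

# Data of Example 5.1.
mu_51 = {frozenset(): 0, frozenset({1, 2}): 1}
```

With these data `outer_extension(mu_51, {1})` and `outer_extension(mu_51, {2})` both equal 1, since the only available cover of a singleton is $X$ itself.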
Definition. The outer measure µ∗ , defined in the above result, is called the
maximal outer extension of µ. This terminology is justified by the following.
Exercise 1. Let $J$ be a semiring on $X$, and let $\mu$ be a measure on $J$. Prove that any outer measure $\nu$ on $X$ with $\nu\big|_J = \mu$ satisfies $\nu \le \mu^*$, in the sense that
\[ \nu(A) \le \mu^*(A), \quad \forall\, A \subset X. \]
Exercise 2. Let $J_1$ and $J_2$ be semirings on $X$ with $J_1 \subset J_2$, and let $\mu_1, \mu_2$ be respectively measures on $J_1, J_2$, such that $\mu_2\big|_{J_1} \le \mu_1$. Let $\mu_1^*, \mu_2^*$ respectively be the maximal outer extensions of $\mu_1, \mu_2$. Prove the inequality $\mu_2^* \le \mu_1^*$.
Given a measure µ on a semiring J on X, one can ask whether there exists a
unique outer measure on X, which extends µ. The answer is no, even in the most
trivial cases.
Example 5.1. Work on the set X = {1, 2}. Take the semiring J = {∅, X}
and define a measure µ on J by µ(∅) = 0 and µ(X) = 1. Choose now any number
$a \in (0, 1)$ and define $\nu_a : \mathcal{P}(X) \to [0, 1]$ by $\nu_a(A) = a\,\kappa_A(1) + (1 - a)\,\kappa_A(2)$. Then $\nu_a$ is an outer measure on $X$ (in fact $\nu_a$ is a measure on $\mathcal{P}(X)$) and $\nu_a\big|_J = \mu$. It is obvious that $\mu^*(\{1\}) = 1 \ne a = \nu_a(\{1\})$ and $\mu^*(\{2\}) = 1 \ne 1 - a = \nu_a(\{2\})$.
We introduce now another concept, which is very important in our analysis.
Definition. Let $\nu$ be an outer measure on a non-empty set $X$. A subset $A \subset X$ is said to be ν-measurable, if it satisfies the condition
(m) $\nu(S) = \nu(S \cap A) + \nu(S \smallsetminus A)$, $\forall\, S \subset X$.
For a given $S$, it is useful to think of the equality $\nu(S) = \nu(S \cap A) + \nu(S \smallsetminus A)$ in unorthodox terms as “$A$ sharply cuts $S$,” so that saying that $A$ is ν-measurable means that “$A$ sharply cuts every set $S \subset X$.”
Remarks 5.1. Let ν be an outer measure on X.
A. Since $\nu$ is (finitely) sub-additive, for any two sets $A, S \subset X$, one always has the inequality $\nu(S) \le \nu(S \cap A) + \nu(S \setminus A)$. Therefore, a set $A \subset X$ is $\nu$-measurable, if and only if
\[ \nu(S) \ge \nu(S \cap A) + \nu(S \setminus A), \quad \forall\, S \subset X. \]
B. Any subset $N \subset X$, with $\nu(N) = 0$, is $\nu$-measurable. Indeed, from the monotonicity of $\nu$, we see that for every $S \subset X$, we have
\[ \nu(S \cap N) + \nu(S \setminus N) \le \nu(N) + \nu(S) = \nu(S), \]
so by the preceding remark, $N$ is indeed $\nu$-measurable. Such a set $N$ is called $\nu$-negligible.
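On a finite set, condition (m) can be tested directly by running over all subsets $S$. The sketch below does this for the two outer measures of Example 5.1; it is only an illustration under the assumption that $X$ is finite, and the helper names `powerset` and `is_measurable` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    xs = sorted(X)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


def is_measurable(nu, X, A):
    """Check condition (m): nu(S) == nu(S & A) + nu(S - A) for every S in P(X)."""
    A = frozenset(A)
    return all(nu(S) == nu(S & A) + nu(S - A) for S in powerset(X))


X = frozenset({1, 2})


def mu_star(A):
    # maximal outer extension from Example 5.1: 0 on the empty set, 1 otherwise
    return 0 if not A else 1


a = Fraction(1, 4)  # an arbitrary choice of a in (0, 1)


def nu_a(A):
    # the measure nu_a from Example 5.1
    return a * int(1 in A) + (1 - a) * int(2 in A)


measurable_for_mu_star = {A for A in powerset(X) if is_measurable(mu_star, X, A)}
measurable_for_nu_a = {A for A in powerset(X) if is_measurable(nu_a, X, A)}
```

Here `measurable_for_mu_star` consists of $\emptyset$ and $X$ only (the set $\{1\}$ fails to sharply cut $S = X$), whereas every subset is measurable for the additive set function `nu_a`.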
The first key result in this section is the following.
Theorem 5.1. Let $\nu$ be an outer measure on a non-empty set $X$. Then the collection
\[ m_\nu(X) = \{ A \subset X : A \text{ is } \nu\text{-measurable} \} \]
is a σ-algebra on $X$. Moreover, the restriction
\[ \nu|_{m_\nu(X)} : m_\nu(X) \to [0, \infty] \]
is a measure on $m_\nu(X)$.
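Proof. The proof will be carried out in several steps.
Step 1: If $A \in m_\nu(X)$, then $X \setminus A \in m_\nu(X)$.
This is trivial, since for every $S \subset X$, one has the equalities
\[ S \cap (X \setminus A) = S \setminus A \quad\text{and}\quad S \setminus (X \setminus A) = S \cap A. \]
Step 2: If $A, B \in m_\nu(X)$, then $A \cap B \in m_\nu(X)$.
Start with some arbitrary $S \subset X$. Since $B$ is $\nu$-measurable, it “sharply cuts the set $S \setminus (A \cap B)$,” which means that
\[ \nu\bigl(S \setminus (A \cap B)\bigr) = \nu\bigl([S \setminus (A \cap B)] \cap B\bigr) + \nu\bigl([S \setminus (A \cap B)] \setminus B\bigr). \]
Since we clearly have $[S \setminus (A \cap B)] \cap B = (S \cap B) \setminus A$, and $[S \setminus (A \cap B)] \setminus B = S \setminus B$, the above equality gives
\[ \nu\bigl(S \setminus (A \cap B)\bigr) = \nu\bigl((S \cap B) \setminus A\bigr) + \nu(S \setminus B). \]
Adding $\nu\bigl((S \cap B) \cap A\bigr)$, and using the fact that $A$ “sharply cuts $S \cap B$,” we now get
\[ \nu\bigl(S \cap (A \cap B)\bigr) + \nu\bigl(S \setminus (A \cap B)\bigr) = \nu\bigl((S \cap B) \cap A\bigr) + \nu\bigl((S \cap B) \setminus A\bigr) + \nu(S \setminus B) = \nu(S \cap B) + \nu(S \setminus B). \]
Finally, using the fact that $B$ “sharply cuts $S$,” we get
\[ \nu\bigl(S \cap (A \cap B)\bigr) + \nu\bigl(S \setminus (A \cap B)\bigr) = \nu(S \cap B) + \nu(S \setminus B) = \nu(S), \]
so $A \cap B$ is indeed $\nu$-measurable.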

So far, Steps 1 and 2 prove that $m_\nu(X)$ is an algebra on $X$.
Step 3: For any pair-wise disjoint finite sequence $(A_n)_{n=1}^N \subset m_\nu(X)$, one has the equality
\[ \nu\bigl(S \cap [A_1 \cup \dots \cup A_N]\bigr) = \sum_{n=1}^{N} \nu(S \cap A_n), \quad \forall\, S \subset X. \]
Since $m_\nu(X)$ is an algebra, it suffices to prove the above equality only for $N = 2$. (The case of arbitrary $N$ follows immediately by induction.) To prove that $\nu\bigl(S \cap (A_1 \cup A_2)\bigr) = \nu(S \cap A_1) + \nu(S \cap A_2)$, we simply use the fact that $A_1$ “sharply cuts $S \cap (A_1 \cup A_2)$,” which gives
\[ \nu\bigl(S \cap (A_1 \cup A_2)\bigr) = \nu\bigl([S \cap (A_1 \cup A_2)] \cap A_1\bigr) + \nu\bigl([S \cap (A_1 \cup A_2)] \setminus A_1\bigr). \]
The desired equality then immediately follows from the obvious equalities
\[ [S \cap (A_1 \cup A_2)] \cap A_1 = S \cap A_1 \quad\text{and}\quad [S \cap (A_1 \cup A_2)] \setminus A_1 = S \cap A_2. \]
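The preceding step can in fact be extended to infinite sequences.
Step 4: For any pair-wise disjoint sequence $(A_n)_{n=1}^\infty \subset m_\nu(X)$, one has the equality
\[ \nu\Bigl(S \cap \Bigl[\bigcup_{n=1}^{\infty} A_n\Bigr]\Bigr) = \sum_{n=1}^{\infty} \nu(S \cap A_n), \quad \forall\, S \subset X. \]
To prove this fact, we fix a sequence $(A_n)_{n=1}^\infty$ as above, as well as $S \subset X$. By σ-sub-additivity, we already know that
\[ \nu\Bigl(S \cap \Bigl[\bigcup_{n=1}^{\infty} A_n\Bigr]\Bigr) = \nu\Bigl(\bigcup_{n=1}^{\infty} [S \cap A_n]\Bigr) \le \sum_{n=1}^{\infty} \nu(S \cap A_n), \]
so the only thing we have to show is the inequality
\[ \sum_{n=1}^{N} \nu(S \cap A_n) \le \nu\Bigl(S \cap \Bigl[\bigcup_{n=1}^{\infty} A_n\Bigr]\Bigr), \quad \forall\, N \in \mathbb{N}. \]
This follows immediately from Step 3 and the monotonicity:
\[ \sum_{n=1}^{N} \nu(S \cap A_n) = \nu\Bigl(S \cap \Bigl[\bigcup_{n=1}^{N} A_n\Bigr]\Bigr) \le \nu\Bigl(S \cap \Bigl[\bigcup_{n=1}^{\infty} A_n\Bigr]\Bigr). \]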

Step 5: $m_\nu(X)$ is a monotone class.
We need to prove the properties:
(i) whenever $(A_n)_{n=1}^\infty \subset m_\nu(X)$ is a sequence with $A_n \subset A_{n+1}$, $\forall\, n \in \mathbb{N}$, it follows that $\bigcup_{n=1}^\infty A_n$ belongs to $m_\nu(X)$;
(ii) whenever $(A_n)_{n=1}^\infty \subset m_\nu(X)$ is a sequence with $A_n \supset A_{n+1}$, $\forall\, n \in \mathbb{N}$, it follows that $\bigcap_{n=1}^\infty A_n$ belongs to $m_\nu(X)$.
Since $m_\nu(X)$ is an algebra, it suffices only to prove (i). Start with an arbitrary subset $S \subset X$, and a sequence $(A_n)_{n=1}^\infty \subset m_\nu(X)$ with $A_n \subset A_{n+1}$, $\forall\, n \in \mathbb{N}$, and denote the union $\bigcup_{n=1}^\infty A_n$ simply by $A$. Define the sets $B_1 = A_1$ and $B_n = A_n \setminus A_{n-1}$, $\forall\, n \ge 2$. It is obvious that $(B_n)_{n=1}^\infty$ is a pair-wise disjoint sequence. Since $m_\nu(X)$ is an algebra, all the $B_n$'s belong to $m_\nu(X)$. We have $\bigcup_{n=1}^\infty B_n = \bigcup_{n=1}^\infty A_n = A$, which, using Step 4, gives
\[ \nu(S \cap A) = \nu\Bigl(S \cap \Bigl[\bigcup_{n=1}^{\infty} A_n\Bigr]\Bigr) = \nu\Bigl(S \cap \Bigl[\bigcup_{n=1}^{\infty} B_n\Bigr]\Bigr) = \sum_{n=1}^{\infty} \nu(S \cap B_n). \tag{1} \]
Using Step 3, combined with the equality $\bigcup_{n=1}^N B_n = A_N$, we also have
\[ \sum_{n=1}^{N} \nu(S \cap B_n) = \nu\Bigl(S \cap \Bigl[\bigcup_{n=1}^{N} B_n\Bigr]\Bigr) = \nu(S \cap A_N), \quad \forall\, N \in \mathbb{N}, \]
so by (1) we have
\[ \nu(S \cap A) = \sum_{n=1}^{\infty} \nu(S \cap B_n) = \lim_{N \to \infty} \nu(S \cap A_N). \tag{2} \]
Notice now that, using the fact that $A_N$ “sharply cuts $S$,” combined with the monotonicity of $\nu$ and the obvious inclusion $S \setminus A \subset S \setminus A_N$, we have
\[ \nu(S \cap A_N) + \nu(S \setminus A) \le \nu(S \cap A_N) + \nu(S \setminus A_N) = \nu(S), \quad \forall\, N \in \mathbb{N}, \]
so using (2), we immediately get
\[ \nu(S \cap A) + \nu(S \setminus A) \le \nu(S). \]
Since the above inequality holds for all $S \subset X$, by Remark 5.1.A it follows that $A$ indeed belongs to $m_\nu(X)$.
By the results from Section 1, we know that the fact that $m_\nu(X)$ is simultaneously an algebra and a monotone class implies the fact that $m_\nu(X)$ is a σ-algebra.
We now show that $\nu|_{m_\nu(X)}$ is a measure. If we start with a pair-wise disjoint sequence $(A_n)_{n=1}^\infty \subset m_\nu(X)$, then the equality
\[ \nu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr) = \sum_{n=1}^{\infty} \nu(A_n) \]
is an immediate consequence of Step 4, applied to the set $S = \bigcup_{n=1}^\infty A_n$, which clearly satisfies $S \cap A_n = A_n$, $\forall\, n \in \mathbb{N}$. □

We are now in a position to answer Question 1.


Theorem 5.2. Let $X$ be a non-empty set, let $J$ be a semiring on $X$, let $\mu$ be a measure on $J$, and let $\mu^*$ be the maximal outer extension of $\mu$. Then $J \subset m_{\mu^*}(X)$. In particular, $m_{\mu^*}(X)$ contains the σ-algebra $\Sigma(J)$ on $X$, generated by $J$, and $\mu^*|_{\Sigma(J)}$ is a measure on $\Sigma(J)$.

Proof. What we need to prove is the fact that every set $A \in J$ is $\mu^*$-measurable. Start with an arbitrary set $S \subset X$. As noticed before (Remark 5.1.A), we only need to prove the inequality
\[ \mu^*(S \cap A) + \mu^*(S \setminus A) \le \mu^*(S). \tag{3} \]
If $\mu^*(S) = \infty$, there is nothing to prove, so we can assume that $\mu^*(S) < \infty$. In particular this means that $S \in P^J_\sigma(X)$. Fix for the moment $\varepsilon > 0$. By the definition of $\mu^*(S) = \bar\mu(S)$, there exists a sequence $(B_n)_{n=1}^\infty \subset J$, such that $S \subset \bigcup_{n=1}^\infty B_n$, and
\[ \sum_{n=1}^{\infty} \mu(B_n) \le \mu^*(S) + \varepsilon. \tag{4} \]

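Since $J$ is a semiring, for each $n \in \mathbb{N}$, we can find some integer $p_n \ge 1$, and a sequence $(D^n_j)_{j=0}^{p_n} \subset J$, such that
• $B_n \cap A = D^n_0 \subset D^n_1 \subset \dots \subset D^n_{p_n} = B_n$,
• $D^n_j \setminus D^n_{j-1} \in J$, $\forall\, j \in \{1, \dots, p_n\}$.
Define the numbers $k_0 = 0$, and $k_n = \sum_{j=1}^{n} p_j$, $\forall\, n \in \mathbb{N}$, and the sequence $(C_m)_{m=1}^\infty \subset J$, by
\[ C_m = D^n_{m-k_{n-1}} \setminus D^n_{m-1-k_{n-1}}, \quad\text{if } k_{n-1} < m \le k_n,\ n \in \mathbb{N}. \]
By construction, for each $n \in \mathbb{N}$, we have
\[ \bigcup_{m=k_{n-1}+1}^{k_n} C_m = \bigcup_{j=1}^{p_n} (D^n_j \setminus D^n_{j-1}) = D^n_{p_n} \setminus D^n_0 = B_n \setminus A. \]
Moreover, for each $n \in \mathbb{N}$ the system
\[ (D^n_0, C_{k_{n-1}+1}, C_{k_{n-1}+2}, \dots, C_{k_n}) = (D^n_0, D^n_1 \setminus D^n_0, D^n_2 \setminus D^n_1, \dots, D^n_{p_n} \setminus D^n_{p_n-1}) \]
in $J$ is pair-wise disjoint, and has
\[ D^n_0 \cup \bigcup_{m=k_{n-1}+1}^{k_n} C_m = B_n, \]
so we get the equality
\[ \mu(D^n_0) + \sum_{m=k_{n-1}+1}^{k_n} \mu(C_m) = \mu(B_n). \]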

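Using (4) we now get
\[ \begin{aligned} \sum_{n=1}^{\infty} \mu(D^n_0) + \sum_{m=1}^{\infty} \mu(C_m) &= \sum_{n=1}^{\infty} \mu(D^n_0) + \sum_{n=1}^{\infty} \Bigl[\sum_{m=k_{n-1}+1}^{k_n} \mu(C_m)\Bigr] \\ &= \sum_{n=1}^{\infty} \Bigl[\mu(D^n_0) + \sum_{m=k_{n-1}+1}^{k_n} \mu(C_m)\Bigr] = \sum_{n=1}^{\infty} \mu(B_n) \le \mu^*(S) + \varepsilon. \end{aligned} \tag{5} \]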

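On the one hand, we clearly have
\[ \begin{aligned} \bigcup_{m=1}^{\infty} C_m &= \bigcup_{n=1}^{\infty} \Bigl[\bigcup_{m=k_{n-1}+1}^{k_n} C_m\Bigr] = \bigcup_{n=1}^{\infty} \Bigl[\bigcup_{j=1}^{p_n} (D^n_j \setminus D^n_{j-1})\Bigr] \\ &= \bigcup_{n=1}^{\infty} (D^n_{p_n} \setminus D^n_0) = \bigcup_{n=1}^{\infty} (B_n \setminus A) = \Bigl[\bigcup_{n=1}^{\infty} B_n\Bigr] \setminus A \supset S \setminus A, \end{aligned} \]
which gives the inequality
\[ \sum_{m=1}^{\infty} \mu(C_m) \ge \mu^*(S \setminus A). \tag{6} \]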

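On the other hand, we also have
\[ \bigcup_{n=1}^{\infty} D^n_0 = \bigcup_{n=1}^{\infty} (B_n \cap A) = \Bigl[\bigcup_{n=1}^{\infty} B_n\Bigr] \cap A \supset S \cap A, \]
which gives the inequality
\[ \sum_{n=1}^{\infty} \mu(D^n_0) \ge \mu^*(S \cap A). \tag{7} \]
Combining (6) and (7) with (5) gives $\mu^*(S \cap A) + \mu^*(S \setminus A) \le \mu^*(S) + \varepsilon$. Since $\varepsilon > 0$ was arbitrary, the desired inequality (3) follows. □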
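The construction
\[ \underset{\text{measure on } J}{\mu} \ \xrightarrow{\ \text{maximal outer extension}\ }\ \underset{\text{outer measure on } X}{\mu^*} \ \xrightarrow{\ \text{restriction}\ }\ \underset{\text{measure on } \Sigma(J)}{\mu^*|_{\Sigma(J)}} \]
is referred to as the Caratheodory construction.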


Definitions. Let $J$ be a semiring on $X$, and let $\mu$ be a measure on $J$. The Caratheodory construction provides us with two measures. The first measure, $\mu^*|_{S(J)}$, is a measure on the σ-ring $S(J)$ generated by $J$, and is called the maximal σ-ring extension of $\mu$. The second measure, $\mu^*|_{\Sigma(J)}$, is a measure on the σ-algebra $\Sigma(J)$ generated by $J$, and is called the maximal σ-algebra extension of $\mu$.
The above terminology is justified by the following result.
Proposition 5.2. Let $J$ be a semiring on $X$, and let $\mu$ be a measure on $J$.
(i) If $\nu$ is a measure on the σ-ring $S(J)$ generated by $J$, with $\nu|_J = \mu$, then $\nu \le \mu^*|_{S(J)}$.
(ii) If $\nu$ is a measure on the σ-algebra $\Sigma(J)$ generated by $J$, with $\nu|_J = \mu$, then $\nu \le \mu^*|_{\Sigma(J)}$.

Proof. We prove both statements simultaneously. Let $J_1$ denote either the σ-ring, or the σ-algebra generated by $J$. In particular $J_1$ is a semiring, and $J \subset J_1$. Since $\nu$ is a measure on $J_1$ with $\nu|_J = \mu$, if we denote by $\nu^*$ its maximal outer extension, then by Exercise 2 we know that $\nu^* \le \mu^*$. In particular, by Proposition 5.1 and Theorem 5.2, we get $\nu = \nu^*|_{J_1} \le \mu^*|_{J_1}$. □

We now discuss the uniqueness of extensions of a semiring measure. In order to clarify this matter, we have to introduce a technical condition, which turns out to be very helpful not only here, but in many other situations.
Definitions. Let $J$ be a semiring on $X$, and let $\mu$ be a measure on $J$.
A. We say that a subset $A \subset X$ is $J$-$\mu$-σ-finite, if there exists a sequence $(B_n)_{n=1}^\infty \subset J$, such that $A \subset \bigcup_{n=1}^\infty B_n$, and $\mu(B_n) < \infty$, $\forall\, n \in \mathbb{N}$. (When there is no danger of confusion, we will use the terms “$\mu$-σ-finite,” or simply “σ-finite.”)
B. We say that the measure $\mu$ is σ-finite, if every $A \in J$ is σ-finite.
C. We say that the measure $\mu$ is finite, if $\mu(A) < \infty$, $\forall\, A \in J$.
Clearly every finite measure on $J$ is σ-finite.
Remark 5.2. Let $J$ be a semiring on $X$, let $\mu$ be a measure on $J$, and let $A$ be a set which belongs to the σ-algebra $\Sigma(J)$ generated by $J$. If $A$ is $J$-$\mu$-σ-finite, then $A$ in fact belongs to the σ-ring $S(J)$ generated by $J$. The only thing that is actually needed here is the existence of a sequence $(B_n)_{n=1}^\infty \subset J$ with $A \subset \bigcup_{n=1}^\infty B_n$. This gives the fact that $A$ belongs to $P^J_\sigma(X)$, so by Proposition 2.3, the set $A$ belongs to the intersection $\Sigma(J) \cap P^J_\sigma(X) = S(J)$.
Using the above terminology, we have the following uniqueness result.
Theorem 5.3. Let $J$ be a semiring on $X$, let $\mu$ be a measure on $J$, let $\mu^*$ be the maximal outer extension of $\mu$, and let $\nu$ be a measure on the σ-ring $S(J)$ generated by $J$, with $\nu|_J = \mu$. Then one has $\nu(A) = \mu^*(A)$, for all $J$-$\mu$-σ-finite sets $A \in S(J)$.

Proof. Fix a $J$-$\mu$-σ-finite set $A \in S(J)$.
Claim: There exists a pair-wise disjoint sequence $(D_n)_{n=1}^\infty \subset S(J)$ such that $A \subset \bigcup_{n=1}^\infty D_n$, and $\nu(D_n) = \mu^*(D_n) < \infty$, $\forall\, n \in \mathbb{N}$.
To prove the above statement, start with a sequence $(B_n)_{n=1}^\infty \subset J$ with $A \subset \bigcup_{n=1}^\infty B_n$ and $\mu(B_n) < \infty$, $\forall\, n \in \mathbb{N}$. Define the sets $D_n$, $n \in \mathbb{N}$ by $D_1 = B_1$, and $D_n = B_n \setminus (B_1 \cup \dots \cup B_{n-1})$, $\forall\, n \ge 2$. It is clear that the sequence $(D_n)_{n=1}^\infty$ is pair-wise disjoint, and
\[ A \subset \bigcup_{n=1}^{\infty} B_n = \bigcup_{n=1}^{\infty} D_n. \]
Moreover, all the $D_n$'s belong to the ring $R(J)$ generated by $J$. The inclusions $D_n \subset B_n$ then prove that
\[ \mu^*(D_n) \le \mu^*(B_n) = \mu(B_n) < \infty, \quad \forall\, n \in \mathbb{N}. \]
Finally, since both $\mu^*|_{R(J)}$ and $\nu|_{R(J)}$ are measures on $R(J)$, which have the same values on $J$, using the Semiring-to-Ring Extension Theorem 4.1, it follows that
\[ \mu^*|_{R(J)} = \nu|_{R(J)}. \tag{8} \]
In particular we have the equalities
\[ \nu(D_n) = \mu^*(D_n), \quad \forall\, n \in \mathbb{N}. \]
Having proven the Claim, we now show that $\nu(A) = \mu^*(A)$. We choose a sequence $(D_n)_{n=1}^\infty \subset S(J)$ as in the Claim. Since the $D_n$'s are pair-wise disjoint, and both $\nu$ and $\mu^*|_{S(J)}$ are measures on the σ-ring $S(J)$, one has the equalities
\[ \nu(A) = \sum_{n=1}^{\infty} \nu(A \cap D_n) \quad\text{and}\quad \mu^*(A) = \sum_{n=1}^{\infty} \mu^*(A \cap D_n). \]
So, in order to prove the equality $\nu(A) = \mu^*(A)$, it suffices to prove that
\[ \nu(A \cap D_n) = \mu^*(A \cap D_n), \quad \forall\, n \in \mathbb{N}. \tag{9} \]
Fix $n \in \mathbb{N}$. On the one hand, by Proposition 5.2(i), we have the inequalities
\[ \nu(A \cap D_n) \le \mu^*(A \cap D_n) < \infty \quad\text{and}\quad \nu(D_n \setminus A) \le \mu^*(D_n \setminus A) < \infty. \tag{10} \]
On the other hand, we have
\[ \nu(A \cap D_n) + \nu(D_n \setminus A) = \nu(D_n) = \mu^*(D_n) = \mu^*(A \cap D_n) + \mu^*(D_n \setminus A). \]
Now if we go back to (10), we see that neither of the two inequalities can be strict, because in that case we would get $\nu(D_n) < \mu^*(D_n)$. (The assumption that $\mu^*(D_n) < \infty$ is essential here.) So we must have (9), and we are done. □

Corollary 5.1. If $\mu$ is a σ-finite measure on a semiring $J$, then there exists a unique measure $\nu$ on the σ-ring $S(J)$ generated by $J$, such that $\nu|_J = \mu$. Moreover, $\nu$ is σ-finite.

Proof. The existence is given by the Caratheodory construction. The uniqueness follows from Theorem 5.3.
To prove σ-finiteness, start with some $A \in S(J)$, and let us find a sequence $(B_n)_{n=1}^\infty \subset S(J)$ with $A \subset \bigcup_{n=1}^\infty B_n$ and $\nu(B_n) < \infty$, $\forall\, n \in \mathbb{N}$. First of all, since $P^J_\sigma(X)$ is a σ-ring which contains $J$, it follows that $S(J) \subset P^J_\sigma(X)$. In particular, there exists $(D_n)_{n=1}^\infty \subset J$ such that $A \subset \bigcup_{n=1}^\infty D_n$. Using the fact that $\mu$ is σ-finite, we see that for each $n$ we can find a sequence $(D^n_k)_{k=1}^\infty \subset J$, with $D_n \subset \bigcup_{k=1}^\infty D^n_k$ and $\mu(D^n_k) < \infty$, $\forall\, k \in \mathbb{N}$. If we list all the sets $D^n_k$, $k, n \in \mathbb{N}$ as a sequence $(B_m)_{m=1}^\infty$, then we are done. □

In the absence of the σ-finiteness condition the uniqueness of the σ-ring extension fails, as illustrated by the following.
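Example 5.2. Consider the set $X = \mathbb{Q}$, and the semiring of rational half-open intervals
\[ J_1 = \{\emptyset\} \cup \bigl\{ [a, b) \cap \mathbb{Q} : a, b \in \mathbb{R},\ a < b \bigr\}. \]
We equip $J_1$ with the measure $\mu$ defined by
\[ \mu(A) = \begin{cases} 0 & \text{if } A = \emptyset \\ \infty & \text{if } A \ne \emptyset \end{cases} \]
Notice that, if we look at the inclusion $\iota : \mathbb{Q} \hookrightarrow \mathbb{R}$, then $J_1 = J|_{\mathbb{Q}}$, where $J$ is the semiring of half-open intervals in $\mathbb{R}$. By the Generating Theorem we then have
\[ S(J_1) = S(J|_{\mathbb{Q}}) = S(J)|_{\mathbb{Q}} = \operatorname{Bor}(\mathbb{R})|_{\mathbb{Q}} = P(\mathbb{Q}). \]
Define now the measures $\nu_1, \nu_2 : S(J_1) \to [0, \infty]$ by
\[ \nu_1(A) = \begin{cases} \operatorname{card} A & \text{if } A \text{ is finite} \\ \infty & \text{if } A \text{ is infinite} \end{cases} \qquad \nu_2(A) = \begin{cases} 2 \cdot \operatorname{card} A & \text{if } A \text{ is finite} \\ \infty & \text{if } A \text{ is infinite} \end{cases} \]
It is obvious that both $\nu_1$ and $\nu_2$ satisfy $\nu_1|_{J_1} = \nu_2|_{J_1} = \mu$ (every non-empty set in $J_1$ is infinite), but $\nu_1$ and $\nu_2$ are clearly not equal.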
Comment. In connection with the Caratheodory construction, it is legitimate to ask the following.
Question 2: What happens if we do the Caratheodory construction twice?
This problem has in fact two aspects.
Question 2A: Suppose $\omega$ is an outer measure on $X$. Take $I = m_\omega(X)$ and $\nu = \omega|_I$, so that $I$ is a semiring (in fact it is a σ-algebra) on $X$, and $\nu$ is a measure on $I$. Let $\nu^*$ be the maximal outer extension of $\nu$. Is it true that $\nu^* = \omega$?
By Exercise 1, we always have $\omega \le \nu^*$. In general the answer to Question 2A is negative, as shown in Exercise ??? below. One can ask however the following.
Question 2B: Same question as 2A, but suppose $\omega = \mu^*$, the maximal outer extension of a measure $\mu$ on a semiring $J$.
The following result shows that Question 2B always has an affirmative answer.

Proposition 5.3. Let $X$ be a non-empty set, let $J$ be a semiring on $X$, and let $\mu$ be a measure on $J$. Let $\mu^*$ be the maximal outer extension of $\mu$. Let $I$ be a semiring, with $I \supset J$. Consider the measure $\nu = \mu^*|_I$, and let $\nu^*$ be the maximal outer extension of $\nu$. Then $\nu^* = \mu^*$.

Proof. First of all, since $\nu|_J = \mu^*|_J = \mu$, by Exercise 2, we have the inequality $\nu^* \le \mu^*$.
To prove the other inequality, we start with an arbitrary set $A \subset X$, and we prove that $\mu^*(A) \le \nu^*(A)$. If $\nu^*(A) = \infty$, there is nothing to prove, so we may assume $\nu^*(A) < \infty$. In particular, $A \in P^I_\sigma(X)$, i.e. there exists at least one sequence $(B_n)_{n=1}^\infty \subset I$, with $A \subset \bigcup_{n=1}^\infty B_n$, and we have
\[ \nu^*(A) = \inf\Bigl\{ \sum_{n=1}^{\infty} \nu(B_n) : (B_n)_{n=1}^\infty \subset I,\ A \subset \bigcup_{n=1}^{\infty} B_n \Bigr\}. \]
Fix for the moment some $\varepsilon > 0$, and choose a sequence $(B^\varepsilon_n)_{n=1}^\infty \subset I$, such that
\[ A \subset \bigcup_{n=1}^{\infty} B^\varepsilon_n \quad\text{and}\quad \sum_{n=1}^{\infty} \nu(B^\varepsilon_n) \le \nu^*(A) + \varepsilon. \tag{11} \]
By σ-subadditivity of $\mu^*$, we have
\[ \mu^*(A) \le \sum_{n=1}^{\infty} \mu^*(B^\varepsilon_n). \]
Using the fact that $\nu = \mu^*|_I$, the above inequality, combined with (11), yields
\[ \mu^*(A) \le \nu^*(A) + \varepsilon. \]
Since this inequality holds for all $\varepsilon > 0$, it forces the inequality $\mu^*(A) \le \nu^*(A)$. □
Exercise 3. Let $X$ be an uncountable set, and define $\omega : P(X) \to [0, \infty]$ by
\[ \omega(A) = \begin{cases} 0 & \text{if } A = \emptyset \\ 1 & \text{if } 0 < \operatorname{card} A \le \aleph_0 \\ 2 & \text{if } A \text{ is uncountable} \end{cases} \]
(i) Prove that $\omega$ is an outer measure on $X$.
(ii) Take $I = m_\omega(X)$. Prove that $I = \{\emptyset, X\}$.
(iii) Consider the measure $\nu = \omega|_I$, and let $\nu^*$ be the maximal outer extension of $\nu$. Prove that there are sets $A \subset X$, with $\omega(A) < \nu^*(A)$.
Hints: For (ii) start with some $A$ with $\emptyset \subsetneq A \subsetneq X$. Prove that $A$ is not $\omega$-measurable, by showing that $A$ does not “sharply cut” sets of the form $\{a, b\}$ with $a \in A$ and $b \in X \setminus A$.
Comment. Suppose J is a semiring on X, and µ is a measure on J. We
have used the maximal outer extension µ∗ as a tool in defining measures on the
σ-ring S(J) and the σ-algebra Σ(J) generated by J, by employing the Caratheodory
construction, which uses the σ-algebra mµ∗ (X) of µ∗ -measurable sets. A legitimate
question is then
Question 3: Is the inclusion Σ(J) ⊂ mµ∗ (X) strict?
In most cases this inclusion is indeed strict (see Examples ?? below, or the discussion in the next section). This can be seen by looking at $\mu^*$-negligible sets $N \subset X$, which are automatically $\mu^*$-measurable. The following result gives some useful information.

Proposition 5.4. Suppose $J$ is a semiring on $X$, and $\mu$ is a measure on $J$. Let $\mu^*$ be the maximal outer extension of $\mu$. For any set $A \in P^J_\sigma(X)$, there exists some set $B$ in the σ-ring $S(J)$ generated by $J$, such that $A \subset B$, and $\mu^*(A) = \mu^*(B)$.
In particular, a subset $N \subset X$ is $\mu^*$-negligible, i.e. $\mu^*(N) = 0$, if and only if there exists a $\mu^*$-negligible set $B \in S(J)$, such that $N \subset B$.

Proof. Since $A \in P^J_\sigma(X)$, there exists a sequence $(D_n)_{n=1}^\infty \subset J$ with $A \subset \bigcup_{n=1}^\infty D_n$. Moreover, we have
\[ \mu^*(A) = \inf\Bigl\{ \sum_{n=1}^{\infty} \mu(D_n) : (D_n)_{n=1}^\infty \subset J,\ A \subset \bigcup_{n=1}^{\infty} D_n \Bigr\}. \]
For each integer $k \ge 1$, we can then choose a sequence $(B^k_n)_{n=1}^\infty \subset J$ with $A \subset \bigcup_{n=1}^\infty B^k_n$ and $\sum_{n=1}^\infty \mu(B^k_n) \le \mu^*(A) + 1/k$. For each integer $k \ge 1$, we define the set $B_k = \bigcup_{n=1}^\infty B^k_n$. It is clear that $B_k \in S(J)$, and $B_k \supset A$, for all $k \in \mathbb{N}$. Moreover, by σ-sub-additivity of $\mu^*$, and the equality $\mu^*|_J = \mu$, we have the inequalities
\[ \mu^*(B_k) \le \sum_{n=1}^{\infty} \mu^*(B^k_n) = \sum_{n=1}^{\infty} \mu(B^k_n) \le \mu^*(A) + \frac{1}{k}, \quad \forall\, k \in \mathbb{N}. \]
If we then form $B = \bigcap_{k=1}^\infty B_k$, then $B$ still belongs to $S(J)$, and we have $A \subset B \subset B_k$, which gives
\[ \mu^*(A) \le \mu^*(B) \le \mu^*(B_k) \le \mu^*(A) + \frac{1}{k}, \quad \forall\, k \in \mathbb{N}, \]
thus forcing $\mu^*(B) = \mu^*(A)$.
To prove the second assertion, we see that the “only if” part is a particular case of the first part. The “if” part is trivial, since the inclusion $N \subset B$ forces the inequality $\mu^*(N) \le \mu^*(B)$. □

In connection to Question 3, it is useful to introduce the following terminology.


Definition. Let X be a non-empty set, and let J be a semiring on X. A
measure µ on J is said to be complete, if it satisfies the condition
(c) whenever N ∈ J has µ(N ) = 0, it follows that J contains all the subsets
of N .

Remarks 5.3. A. Given an outer measure $\nu$ on a set $X$, the measure $\nu|_{m_\nu(X)} : m_\nu(X) \to [0, \infty]$ is always complete, as a consequence of monotonicity, and of Remark 5.1.B.
B. Given a semiring $J$ on $X$, and a measure $\mu$ on $J$, we now see that a sufficient condition, for having a strict inclusion $\Sigma(J) \subsetneq m_{\mu^*}(X)$, is the lack of completeness for the measure $\mu^*|_{\Sigma(J)}$. Later on (see Corollary 5.2) we shall see that in the case of σ-finite measures, defined on σ-total semirings, this condition is also necessary.
The lack of completeness of a σ-ring measure can be compensated by the fol-
lowing result.
Theorem 5.4. Let $X$ be a non-empty set, let $S$ be a σ-ring on $X$, and let $\nu$ be a measure on $S$.
(i) The collection
\[ N(S, \nu) = \{ N \subset X : \text{there exists } D \in S \text{ with } N \subset D \text{ and } \nu(D) = 0 \} \]
is a σ-ring on $X$. Moreover, if $N \in N(S, \nu)$, then $N(S, \nu)$ contains all subsets of $N$.
(ii) For a subset $A \subset X$, the following are equivalent:
(a) there exists $B \in S$ and $N \in N(S, \nu)$, such that $A = B \setminus N$;
(b) there exists $F \in S$ and $M \in N(S, \nu)$, such that $A = F \cup M$.
(iii) The collection $\bar S$ of all subsets $A \subset X$, satisfying the equivalent conditions in (ii), is a σ-ring. We have the equality
\[ \bar S = S\bigl(N(S, \nu) \cup S\bigr), \]
i.e. $\bar S$ is the σ-ring generated by $N(S, \nu) \cup S$.
(iv) There exists a unique measure $\bar\nu$ on $\bar S$, such that $\bar\nu|_{N(S,\nu)} = 0$ and $\bar\nu|_S = \nu$. The measure $\bar\nu$ is complete.
(v) If $E$ is a σ-ring with $E \supset S$, and if $\lambda$ is a complete measure on $E$ with $\lambda|_S = \nu$, then $E \supset \bar S$ and $\lambda|_{\bar S} = \bar\nu$.

Proof. (i). This is pretty clear. In fact, if one takes $E = \{B \in S : \nu(B) = 0\}$, then one has the equality $N(S, \nu) = P^E_\sigma(X)$.
(ii). (a) ⇒ (b). Assume $A = B \setminus N$ with $B \in S$ and $N \in N(S, \nu)$. Choose $D \in S$ with $\nu(D) = 0$ and $N \subset D$. We now have
\[ B \setminus D \subset B \setminus N = A, \]
so if we put $F = B \setminus D$, we have the equality $A = F \cup M$, where
\[ M = A \setminus F = (B \setminus N) \setminus (B \setminus D) \subset D. \]
Notice that $F \in S$, while the inclusion $M \subset D$ shows that $M \in N(S, \nu)$.
(b) ⇒ (a). Assume $A = F \cup M$ with $F \in S$ and $M \in N(S, \nu)$. Choose $D \in S$ with $M \subset D$ and $\nu(D) = 0$. Define $B = F \cup D$. It is clear that $B \in S$, and $A \subset B$. Define $N = B \setminus A$, so we clearly have $A = B \setminus N$. We have
\[ N = (F \cup D) \setminus (F \cup M) \subset D \setminus M \subset D, \]
so $N$ clearly belongs to $N(S, \nu)$.
(iii). We need to prove the following properties:
(∗) whenever $A_1$, $A_2$ are sets in $\bar S$, it follows that the difference $A_1 \setminus A_2$ also belongs to $\bar S$;
(∗∗) whenever $(A_n)_{n=1}^\infty$ is a sequence of sets in $\bar S$, it follows that the union $\bigcup_{n=1}^\infty A_n$ also belongs to $\bar S$.
To prove (∗), we write $A_1 = B \setminus N$ and $A_2 = F \cup M$, with $B, F \in S$ and $M, N \in N(S, \nu)$. Then we have
\[ A_1 \setminus A_2 = (B \setminus N) \setminus (F \cup M) = B \setminus (F \cup M \cup N) = (B \setminus F) \setminus (M \cup N). \]
The difference $B \setminus F$ belongs to $S$, and, using (i), the union $N \cup M$ belongs to $N(S, \nu)$. By (ii) it follows that $A_1 \setminus A_2$ belongs to $\bar S$.
To prove (∗∗), we write, for each $n \in \mathbb{N}$, the set $A_n$ as $A_n = F_n \cup M_n$ with $F_n \in S$ and $M_n \in N(S, \nu)$. Then
\[ \bigcup_{n=1}^{\infty} A_n = \Bigl[\bigcup_{n=1}^{\infty} F_n\Bigr] \cup \Bigl[\bigcup_{n=1}^{\infty} M_n\Bigr]. \]
The union $\bigcup_{n=1}^\infty F_n$ belongs to $S$, and, using (i), the union $\bigcup_{n=1}^\infty M_n$ belongs to $N(S, \nu)$. By (ii), the union $\bigcup_{n=1}^\infty A_n$ belongs to $\bar S$.
Since $\bar S$ is a σ-ring, which clearly contains both $N(S, \nu)$ and $S$, it follows that $\bar S \supset S\bigl(N(S, \nu) \cup S\bigr)$. The other inclusion $\bar S \subset S\bigl(N(S, \nu) \cup S\bigr)$ is trivial, by the definition of $\bar S$.
(iv). To prove the existence, we consider the maximal outer extension $\nu^*$. When restricted to the σ-algebra $m_{\nu^*}(X)$ of all $\nu^*$-measurable sets, $\nu^*$ gives a measure. Notice that $\nu^*(N) = 0$, $\forall\, N \in N(S, \nu)$, which gives the inclusion $N(S, \nu) \subset m_{\nu^*}(X)$. In particular, since $m_{\nu^*}(X)$ is a σ-algebra, which contains both $N(S, \nu)$ and $S$, it follows that
\[ m_{\nu^*}(X) \supset S\bigl(N(S, \nu) \cup S\bigr) = \bar S. \]
In particular, $\bar\nu = \nu^*|_{\bar S}$ is a measure on $\bar S$, which clearly satisfies the required properties.
To prove uniqueness, let $\mu$ be another measure on $\bar S$, such that $\mu|_{N(S,\nu)} = 0$ and $\mu|_S = \nu$. If we start with an arbitrary set $A \in \bar S$, and we write it as $A = F \cup M$, with $F \in S$ and $M \in N(S, \nu)$, then using the fact that $A \setminus F \subset M$, we see that $A \setminus F$ belongs to $N(S, \nu)$, so we have
\[ \mu(A) = \mu(F) + \mu(A \setminus F) = \mu(F) = \nu(F) = \bar\nu(F) = \bar\nu(F) + \bar\nu(A \setminus F) = \bar\nu(A). \]
Finally, we prove that the measure $\bar\nu$ is complete. Let $A \in \bar S$ be a set with $\bar\nu(A) = 0$, and let $U$ be an arbitrary subset of $A$. Using (ii) we write $A = F \cup M$, with $F \in S$ and $M \in N(S, \nu)$. Notice that we have
\[ 0 \le \nu(F) = \bar\nu(F) \le \bar\nu(F \cup M) = \bar\nu(A) = 0, \]
which forces $F \in N(S, \nu)$, so using (i), we see that $A$ itself belongs to $N(S, \nu)$. By (i), it follows that $U \in N(S, \nu) \subset \bar S$.
(v) Let $E$ and $\lambda$ be as indicated. In order to prove the inclusion $E \supset \bar S$, it suffices to prove the inclusion $N(S, \nu) \subset E$. But this inclusion is pretty obvious. If we start with some $N \in N(S, \nu)$, then there exists $A \in S$ with $N \subset A$ and $\nu(A) = 0$. In particular, we have $A \in E$ and $\lambda(A) = 0$, and then the completeness of $\lambda$ forces $N \in E$. Notice that this also forces $\lambda(N) = \bar\nu(N) = 0$. Using (iv) it then follows that $\lambda|_{\bar S} = \bar\nu$. □

Definition. Using the notations above, the σ-ring $\bar S$ is called the completion of $S$ with respect to $\nu$. The correspondence $(S, \nu) \mapsto (\bar S, \bar\nu)$ is referred to as the measure completion. Remark that, if $\nu$ is already complete, then $\bar S = S$ and $\bar\nu = \nu$.
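For a σ-ring on a finite set, the completion described in Theorem 5.4 can be computed explicitly. The sketch below is only an illustration, assuming a finite $X$: it builds $N(S, \nu)$ as the collection of all subsets of measure-zero sets in $S$, forms $\bar S$ as the sets $F \cup M$ from condition (b), and sets $\bar\nu(F \cup M) = \nu(F)$. The function names `subsets` and `completion`, and the particular σ-ring used, are hypothetical.

```python
from itertools import combinations


def subsets(s):
    """All subsets of the finite set s, as frozensets."""
    s = list(s)
    return {frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)}


def completion(nu):
    """Completion of a measure on a sigma-ring over a finite set.

    nu maps each set of the sigma-ring S (a frozenset) to its measure.
    Returns (null_sets, nu_bar), where null_sets plays the role of N(S, nu)
    and nu_bar maps each set F | M of the completed sigma-ring to nu(F).
    """
    null_sets = set()
    for D, value in nu.items():
        if value == 0:
            null_sets |= subsets(D)   # every subset of a measure-zero set in S
    nu_bar = {}
    for F, value in nu.items():
        for M in null_sets:
            A = F | M
            # two representations A = F | M of the same set must give the same value
            assert nu_bar.setdefault(A, value) == value
    return null_sets, nu_bar


# A hypothetical example: S = {emptyset, {1}, {2, 3}, X} on X = {1, 2, 3}.
nu = {
    frozenset(): 0,
    frozenset({1}): 1,
    frozenset({2, 3}): 0,
    frozenset({1, 2, 3}): 1,
}
null_sets, nu_bar = completion(nu)
```

In this example $\{2\}$ and $\{3\}$ are not in $S$ but lie inside the measure-zero set $\{2, 3\}$, so the completed σ-ring turns out to be all of $P(X)$, with `nu_bar` equal to 1 exactly on the sets containing the point 1.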
Exercise 4. Using the notations from Theorem 5.4, prove that for a set $A \subset X$, the condition $A \in \bar S$ is equivalent to any of the following:
(a′) there exist $B \in S$ and $N \in N(S, \nu)$, with $A = B \setminus N$, and $N \subset B$;
(b′) there exist $F \in S$ and $M \in N(S, \nu)$, such that $A = F \cup M$ and $F \cap M = \emptyset$;
(c) there exist $E \in S$ and $Z \in N(S, \nu)$, such that $A = E \,\triangle\, Z$;
(d) there exist $B, F \in S$ such that $F \subset A \subset B$, and $\nu(B \setminus F) = 0$.
The $\mu^*$-measurable sets of a special type can be completely characterized using $\mu^*$-negligible ones.
Theorem 5.5. Suppose $J$ is a semiring on $X$, and $\mu$ is a measure on $J$. Let $\mu^*$ be the maximal outer extension of $\mu$. For a $J$-$\mu$-σ-finite subset $A \subset X$, the following are equivalent:
(i) $A$ is $\mu^*$-measurable;
(ii) there exists $B$ in the σ-ring $S(J)$ generated by $J$, and a $\mu^*$-negligible set $N \subset X$, such that $A = B \setminus N$.

Proof. (i) ⇒ (ii). Start by choosing a sequence $(D_n)_{n=1}^\infty \subset J$ with $A \subset \bigcup_{n=1}^\infty D_n$ and $\mu(D_n) < \infty$, $\forall\, n \in \mathbb{N}$. Since $m_{\mu^*}(X)$ is an algebra, which contains $J$, it follows that all the intersections $A_n = A \cap D_n$, $n \in \mathbb{N}$, belong to $m_{\mu^*}(X)$. For each $n \in \mathbb{N}$, we use Proposition 5.4 to find some set $B_n \in S(J)$ such that $A_n \subset B_n$, and $\mu^*(B_n) = \mu^*(A_n)$. On the one hand, if we put $V_n = B_n \setminus A_n$, then $V_n \in m_{\mu^*}(X)$, so we will have
\[ \mu^*(B_n) = \mu^*(A_n) + \mu^*(V_n). \]
On the other hand, we know that $\mu^*(B_n) = \mu^*(A_n) \le \mu^*(D_n) < \infty$, so the above equality forces $\mu^*(V_n) = 0$.
Since we have $B_n = A_n \cup V_n$, $\forall\, n \in \mathbb{N}$, we will get
\[ \bigcup_{n=1}^{\infty} B_n = \Bigl[\bigcup_{n=1}^{\infty} A_n\Bigr] \cup \Bigl[\bigcup_{n=1}^{\infty} V_n\Bigr] = A \cup \Bigl[\bigcup_{n=1}^{\infty} V_n\Bigr], \]
so if we define $B = \bigcup_{n=1}^\infty B_n$ and $V = \bigcup_{n=1}^\infty V_n$, then $B$ belongs to $S(J)$, we have the equality $B = A \cup V$, and $V$ is $\mu^*$-negligible, because of the inequalities
\[ \mu^*(V) \le \sum_{n=1}^{\infty} \mu^*(V_n) = 0. \]
The set $N = B \setminus A \subset V$ is clearly $\mu^*$-negligible, because $\mu^*(N) \le \mu^*(V)$. Now we are done because $B \setminus N = A$.
(ii) ⇒ (i). This part is trivial, since $m_{\mu^*}(X)$ is an algebra. □
Remarks 5.4. A. The implication (ii) ⇒ (i) holds without the assumption that $A$ is $J$-$\mu$-σ-finite. In fact, for any $A \subset X$, one has the implications (ii) ⇒ (ii′) ⇒ (i), where
(ii′) there exists $B$ in the σ-algebra $\Sigma(J)$ generated by $J$, and a $\mu^*$-negligible set $N \subset X$, such that $A = B \setminus N$.
B. Consider the measure $\mu^*|_{S(J)}$ on the σ-ring $S(J)$. Using the notations from Theorem 5.4, by Proposition 5.4, we clearly have the equality
\[ \{ N \subset X : N \text{ is } \mu^*\text{-negligible} \} = N\bigl(S(J), \mu^*|_{S(J)}\bigr). \]
So, if we denote by $\overline{S(J)}$ the completion of $S(J)$ with respect to $\mu^*|_{S(J)}$, condition (ii) from Theorem 5.5 reads: $A \in \overline{S(J)}$. Similarly, if we denote by $\overline{\Sigma(J)}$ the completion of $\Sigma(J)$ with respect to the measure $\mu^*|_{\Sigma(J)}$, condition (ii′) above reads: $A \in \overline{\Sigma(J)}$. With these notations, we have the inclusions
\[ \overline{S(J)} \subset \overline{\Sigma(J)} \subset m_{\mu^*}(X). \tag{12} \]
With these notations, Theorem 5.5 states that
\[ \overline{S(J)} \cap \{ A \subset X : A \text{ is } J\text{-}\mu\text{-}\sigma\text{-finite} \} = m_{\mu^*}(X) \cap \{ A \subset X : A \text{ is } J\text{-}\mu\text{-}\sigma\text{-finite} \}. \tag{13} \]
Theorem 5.5, written in the form (13), has the following consequence.


Corollary 5.2. If the semiring $J$ is σ-total in $X$, and $\mu$ is a σ-finite measure on $J$, then one has the equalities
\[ \overline{S(J)} = \overline{\Sigma(J)} = m_{\mu^*}(X). \tag{14} \]
Proof. Indeed, under the given assumptions on $J$ and $\mu$, it follows that every set $A \subset X$ is $J$-$\mu$-σ-finite. □
Examples 5.3. A. The implication (i) ⇒ (ii) from Theorem 5.5 may fail, if $A$ is not σ-finite. Start with an arbitrary set $X$, consider the semiring $J = \{\emptyset, X\}$ and the measure $\mu$ on $J$ defined by $\mu(\emptyset) = 0$ and $\mu(X) = \infty$. Notice that $J$ is a σ-algebra, so it is trivial that $J$ is σ-total in $X$. The maximal outer extension $\mu^*$ of $\mu$ is defined by
\[ \mu^*(A) = \begin{cases} 0 & \text{if } A = \emptyset \\ \infty & \text{if } A \ne \emptyset \end{cases} \]
It is clear that, since $\mu^*$ is a measure on $P(X)$, we have the equality $m_{\mu^*}(X) = P(X)$, but the only $\mu^*$-negligible set is the empty set $\emptyset$. This means that the sets satisfying condition (ii) in Theorem 5.5 are only the sets $\emptyset$ and $X$, so, if $\emptyset \ne A \subsetneq X$, the implication (i) ⇒ (ii) fails, although $J$ is σ-total in $X$. What occurs here is the total lack of non-empty $J$-$\mu$-σ-finite sets.
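B. Let $X$ be an uncountable set, and let $J$ be the semiring of all finite subsets of $X$. We have
\[ S(J) = \{ A \subset X : \operatorname{card} A \le \aleph_0 \}, \]
\[ \Sigma(J) = \{ A \subset X : \text{either } \operatorname{card} A \le \aleph_0, \text{ or } \operatorname{card}(X \setminus A) \le \aleph_0 \}. \]
Equip $J$ with the trivial measure $\mu(A) = 0$, $\forall\, A \in J$. The maximal outer extension $\mu^*$ is then defined by
\[ \mu^*(A) = \begin{cases} 0 & \text{if } \operatorname{card} A \le \aleph_0 \\ \infty & \text{if } A \text{ is uncountable} \end{cases} \]
It is clear that $\mu^*$ is a measure on $P(X)$, so we have $m_{\mu^*}(X) = P(X) \supsetneq J$. Notice that both measures $\mu^*|_{S(J)}$ and $\mu^*|_{\Sigma(J)}$ are complete, so using the notations from Remark 5.4.B, we have the equalities
\[ \overline{S(J)} = S(J) \quad\text{and}\quad \overline{\Sigma(J)} = \Sigma(J). \]
It is clear however that both inclusions in (12) are strict, although $\mu$ is finite. What happens here is the fact that $J$ is not σ-total in $X$.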
C. In the same setting as in Example B, if we take $I = \Sigma(J)$, and $\nu = \mu^*|_I$, then $I$ is σ-total in $X$, simply because $I$ is a σ-algebra. In this case, by Proposition 5.3, the maximal outer extension $\nu^*$ of $\nu$ coincides with $\mu^*$. We have
\[ I = S(I) = \overline{S(I)} = \Sigma(I) = \overline{\Sigma(I)} \subsetneq m_{\nu^*}(X), \]
the reason for the strict inclusion being this time the fact that $\nu$ is not σ-finite.
Comment. In the remainder of this section we take another look at Question 3, trying to generalize the answer given by Corollary 5.2. To simplify matters a little bit, we start with a σ-algebra $B$ on $X$ (which is clearly σ-total in $X$), and a measure $\mu$ on $B$. If we take $\mu^*$ to be the maximal outer extension of $\mu$, and consider the completion $\bar B$, we have the inclusion
\[ \bar B \subset m_{\mu^*}(X), \tag{15} \]
so we can ask whether this inclusion is strict. Of course, if $\mu$ is σ-finite, then by Corollary 5.2 the inclusion (15) is not strict. As Example 5.3.C suggests, in the absence of the σ-finiteness assumption, the inclusion (15) may indeed be strict. As it turns out, the fact that the inclusion (15) is strict in Example 5.3.C is a consequence of the fact that there are “new” measurable sets which are not necessarily of the form $B \setminus N$ with $B \in B$ and $N$ negligible. The existence of such sets is suggested by the following.
Remark 5.5. Suppose $\nu$ is an outer measure on $X$. For a set $A \subset X$, the following are equivalent:
(i) $A$ is $\nu$-measurable;
(ii) $\nu(S) \ge \nu(S \cap A) + \nu(S \setminus A)$, for all $S \subset X$ with $\nu(S) < \infty$.
The implication (i) ⇒ (ii) is trivial. To prove the converse, by Remark 5.1.A, we need to show that
\[ \nu(S) \ge \nu(S \cap A) + \nu(S \setminus A), \quad \forall\, S \subset X. \]
But this is trivial, when $\nu(S) = \infty$. If $\nu(S) < \infty$, then this is exactly condition (ii).
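The “new” sets, that were mentioned above, are of a type covered by the following.
Definition. Let $\nu$ be an outer measure on $X$. A subset $N \subset X$ is said to be locally $\nu$-negligible, if
\[ \nu(N \cap A) = 0, \quad\text{for all } A \subset X \text{ with } \nu(A) < \infty. \]
It is clear that every subset of $N$ is also locally $\nu$-negligible.
Remark 5.5 shows that every locally $\nu$-negligible set $N$ is $\nu$-measurable. Indeed, if $\nu(S) < \infty$, then $\nu(S \cap N) = 0$, so $\nu(S \cap N) + \nu(S \setminus N) = \nu(S \setminus N) \le \nu(S)$.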
The term “local” will be used in connection with properties that hold when the
subject set is cut down by sets of finite measure. For example, one can formulate
the following.
Definitions. Let B be a σ-algebra on X, and µ be a measure on B. We say
that a set N ∈ B is locally µ-null, if
(16) µ(F ∩ N ) = 0, for all F ∈ B, with µ(F ) < ∞.
Remark that locally µ-null sets do not necessarily have zero measure (see Example 5.3.C).
We say that µ is locally complete, if it satisfies the condition
(lc) whenever N ∈ B is a locally µ-null set, it follows that B contains all
subsets of N .
Remarks 5.6. Use the notations above.
A. If the measure µ is σ-finite, the local completeness of µ is equivalent to
completeness. The reason is the fact that, in the σ-finite case, condition (16) is
equivalent to µ(N ) = 0.
B. Given an outer measure $\nu$ on $X$, the measure $\nu|_{m_\nu(X)}$ is locally complete.
Comment. If we look at Example 5.3.C, we now see that although the measure $\nu$ on $I$ is complete, it is not locally complete, thus giving another explanation for the strict inclusion $I \subsetneq m_{\nu^*}(X)$.
We are now in position to analyze Question 3, in the simplified given setting.
The following fact will be helpful.
Lemma 5.1. Let $B$ be a σ-algebra on $X$, let $\mu$ be a measure on $B$, and let $\mu^*$ be the maximal outer extension of $\mu$. Then, for every subset $S \subset X$, one has the equality
\[ \mu^*(S) = \inf\{ \mu(B) : B \in B,\ B \supset S \}. \tag{17} \]
Proof. Since $B$ is σ-total in $X$, by definition we have
\[ \mu^*(S) = \inf\Bigl\{ \sum_{n=1}^{\infty} \mu(B_n) : (B_n)_{n=1}^\infty \subset B,\ S \subset \bigcup_{n=1}^{\infty} B_n \Bigr\}. \tag{18} \]
If we denote the right hand side of (??) by $\nu(S)$, then using (18) we clearly have $\mu^*(S) \le \nu(S)$. Conversely, if we start with any sequence $(B_n)_{n=1}^\infty \subset B$ with $S \subset \bigcup_{n=1}^\infty B_n$, then we clearly have
\[ \sum_{n=1}^{\infty} \mu(B_n) \ge \mu\Bigl(\bigcup_{n=1}^{\infty} B_n\Bigr) \ge \nu(S), \]
so taking the infimum yields $\mu^*(S) \ge \nu(S)$. □
Proposition 5.5. Let B be a σ-algebra on X, and let µ be a measure on B.
Define the collection
Bfin = { F ∈ B : µ(F) < ∞ }.
For every F ∈ Bfin, denote by MF the completion of the σ-algebra B|F (on F) with
respect to the measure µ|F (see footnote 8). Denote by µ∗ the maximal outer extension of µ.
A. For a subset A ⊂ X, the following are equivalent
(i) A is µ∗ -measurable;
(ii) A ∩ F ∈ MF , for each F ∈ Bfin ;
(iii) A ∩ F is µ∗ -measurable, for each F ∈ Bfin .
B. For a subset N ⊂ X, the following are equivalent
(i) N is locally µ∗ -neglijeable;
(ii) µ∗ (N ∩ F ) = 0, for all F ∈ Bfin .
Proof. Let us fix some useful notations. By construction, for every F ∈ Bfin ,
we have
B|F = { A ∩ F : A ∈ B } = { B ∈ B : B ⊂ F }.
For each F ∈ Bfin, we denote the σ-ring N(B|F, µ|F) simply by NF. With the
above identification we have
NF = { N ⊂ F : there exists D ∈ B with N ⊂ D ⊂ F and µ(D) = 0 },
so (see Theorem 5.4) the σ-algebra MF is given as
MF = { B ∖ N : B ∈ B, B ⊂ F, N ∈ NF }.
A. To prove the implication (i) ⇒ (ii), start with a µ∗-measurable set A, and
with some F ∈ Bfin. Since F is µ∗-measurable, the intersection A ∩ F is
µ∗-measurable. Since µ∗(A ∩ F) ≤ µ∗(F) = µ(F) < ∞, by Theorem 5.5 there exist
B0 ∈ B and N0 ⊂ X with µ∗(N0) = 0 and A ∩ F = B0 ∖ N0. If we then define
B = B0 ∩ F and N = N0 ∩ F, then we clearly have B ∈ B|F, N ∈ NF, and
A ∩ F = B ∖ N, so A ∩ F indeed belongs to MF.
The implication (ii) ⇒ (iii) is trivial, since every set in MF is clearly
µ∗-measurable.
To prove the implication (iii) ⇒ (i), assume A has property (iii), and let us
show that A is µ∗ -measurable. We are going to use Remark 5.5, which means that
it suffices to prove the inequality
(19) µ∗(S) ≥ µ∗(S ∩ A) + µ∗(S ∖ A),
only for those subsets S ⊂ X with µ∗(S) < ∞. Fix such a subset S. Since
µ∗(S) < ∞, Lemma 5.1 gives
(20) µ∗(S) = inf{ µ(F) : F ∈ Bfin, F ⊃ S }.
Footnote 8. Here µ|F denotes the restriction of µ to the σ-algebra B|F.
Start with some arbitrary ε > 0, and choose some F ∈ Bfin with F ⊃ S and
µ(F) ≤ µ∗(S) + ε. By (iii) the set A ∩ F is µ∗-measurable, so we have
µ∗(F) = µ∗(F ∩ [A ∩ F]) + µ∗(F ∖ [A ∩ F]) = µ∗(F ∩ A) + µ∗(F ∖ A).
Since F ∩ A ⊃ S ∩ A, and F ∖ A ⊃ S ∖ A, we have the inequalities µ∗(F ∩ A) ≥
µ∗(S ∩ A) and µ∗(F ∖ A) ≥ µ∗(S ∖ A), so the above equality gives
µ∗(F) ≥ µ∗(S ∩ A) + µ∗(S ∖ A).
By the choice of F (recall that µ∗(F) = µ(F)), this gives
µ∗(S) + ε ≥ µ∗(S ∩ A) + µ∗(S ∖ A).
Since this inequality holds for all ε > 0, we immediately get the desired inequality
(19).
B. The condition (i) says that
(21) µ∗(N ∩ S) = 0, for all S ⊂ X with µ∗(S) < ∞.
It is obvious that we have the implication (i) ⇒ (ii). Conversely, suppose N
satisfies (ii), and let us prove (21). Start with some arbitrary subset S ⊂ X with
µ∗(S) < ∞. Using (20), there exists some F ∈ Bfin with S ⊂ F. By (ii), and
the monotonicity of µ∗ we have 0 = µ∗(N ∩ F) ≥ µ∗(N ∩ S), which clearly forces
µ∗(N ∩ S) = 0. □
The above result suggests that the σ-algebra mµ∗ (X) can be regarded as some
sort of “local” completion of B. To simplify the exposition a little bit, we introduce
the following.
Notation. Let B be a σ-algebra on X, let µ be a measure on B, and let µ∗
be the maximal outer extension of µ. The σ-algebra mµ∗ (X), of all µ∗ -measurable
subsets of X, will be denoted by Mµ(B) (or just Mµ, when there is no danger of
confusion). The measure µ∗|Mµ will be denoted by µ̃. The pair (Mµ, µ̃) will be
called the quasi-completion of B with respect to µ.
Unfortunately, analogues of Theorem 5.4 are not available, unless some (otherwise
natural) restrictions are imposed. The type of restrictions we have in mind are also
aimed at making the test conditions A.(iii) and B.(ii) easier to check. We would
like to check them on a “small” sub-collection of Bfin. This naturally suggests the
following.
Definition. Let B be a σ-algebra on X, and let µ be a measure on B. A
sufficient µ-finite B-partition of X is a collection F of non-empty subsets of X,
with the following properties:
(i) F is pairwise disjoint, and ⋃_{F∈F} F = X;
(ii) F ⊂ B, and µ(F) < ∞, for all F ∈ F;
(iii) for every set B ∈ B, with µ(B) < ∞, one has the equality
µ(B) = Σ_{F∈F} µ(B ∩ F).
Condition (iii) uses the summation convention from II.2. (The sum is defined as
the supremum of all finite partial sums.)

Remarks 5.7. A. Suppose F is a sufficient µ-finite B-partition of X. For every
set A ∈ B, we define the collection
S_F^µ(A) = { F ∈ F : µ(A ∩ F) > 0 }.
If µ(A) < ∞, then
(a) S_F^µ(A) is at most countable, and
(b) µ(A ∖ ⋃_{F∈S_F^µ(A)} (A ∩ F)) = 0.
By condition (iii) in the definition, it follows that the family (µ(A ∩ F))_{F∈S_F^µ(A)} is
summable, and
(22) Σ_{F∈S_F^µ(A)} µ(A ∩ F) = µ(A).
Since µ(A ∩ F) > 0, ∀ F ∈ S_F^µ(A), property (a) follows from Proposition II.2.2. If
we denote the union ⋃_{F∈S_F^µ(A)} (A ∩ F) by A′, then by the σ-additivity of µ (it is
here where we use (a) in an essential way) the equality (22) gives
µ(A′) = Σ_{F∈S_F^µ(A)} µ(A ∩ F) = µ(A),
which combined with µ(A) < ∞ forces µ(A ∖ A′) = 0.


B. The existence of a sufficient µ-finite B-partition of X is a generalization of
σ-finiteness. In fact the following are equivalent (B is a σ-algebra on X):
• µ is σ-finite;
• there exists a countable sufficient µ-finite B-partition of X.
In the presence of a sufficient µ-finite B-partition, the properties that appear
in Proposition 5.5 are simplified.
Proposition 5.6. Let B be a σ-algebra on X, let µ be a measure on B. As-
sume F is a sufficient µ-finite B-partition of X. Denote by µ∗ the maximal outer
extension of µ.
A. For a subset A ⊂ X, the following are equivalent
(i) A is µ∗ -measurable;
(ii) A ∩ F is µ∗ -measurable, for each F ∈ F.
B. For a subset N ⊂ X, the following are equivalent
(i) N is locally µ∗ -neglijeable;
(ii) µ∗ (N ∩ F ) = 0, for all F ∈ F.
C. If A ⊂ X is a subset with µ∗(A) < ∞, then
(23) µ∗(A) = Σ_{F∈F} µ∗(A ∩ F).
Proof. It will be useful to introduce the following notations (use also the
notations from Proposition 5.5). For every B ∈ Bfin, we define
B′ = ⋃_{F∈S_F^µ(B)} (B ∩ F).
By Remark 5.7.A we know that µ(B ∖ B′) = 0.
A. The implication (i) ⇒ (ii) is trivial. To prove the implication (ii) ⇒ (i),
we start with a set A ⊂ X satisfying condition (ii), and we show that A satisfies
condition (iii) from Proposition 5.5.A. Start with some arbitrary set B ∈ Bfin,
and let us show that A ∩ B is µ∗-measurable. Using the above notation, and the
monotonicity of µ∗ we have
µ∗(A ∩ [B ∖ B′]) ≤ µ∗(B ∖ B′) = µ(B ∖ B′) = 0,
which in particular shows that A ∩ [B ∖ B′] is µ∗-measurable. Since we have
A ∩ B = (A ∩ B′) ∪ (A ∩ [B ∖ B′]), it then suffices to show that A ∩ B′ is
µ∗-measurable. Notice that
A ∩ B′ = ⋃_{F∈S_F^µ(B)} (A ∩ F ∩ B),
and since the indexing set S_F^µ(B) is at most countable, it then suffices to show
that A ∩ F ∩ B is µ∗-measurable, for each F. But this is obvious, since A ∩ F is
µ∗-measurable, by condition (ii), and B ∈ B.
C. Let A ⊂ X be a subset with µ∗(A) < ∞. Using Lemma 5.1, we can find, for
every ε > 0, some set Bε ∈ Bfin, such that Bε ⊃ A, and µ(Bε) ≤ µ∗(A) + ε. Fix
for the moment ε. Since the family (µ(Bε ∩ F))_{F∈F} is summable, and µ∗(A ∩ F) ≤
µ(Bε ∩ F), ∀ F ∈ F, it follows that the family (µ∗(A ∩ F))_{F∈F} is summable, and
moreover one has the inequality
Σ_{F∈F} µ∗(A ∩ F) ≤ Σ_{F∈F} µ(Bε ∩ F) = µ(Bε) ≤ µ∗(A) + ε.
Since we have Σ_{F∈F} µ∗(A ∩ F) ≤ µ∗(A) + ε, for all ε > 0, it follows that we have
in fact the inequality
Σ_{F∈F} µ∗(A ∩ F) ≤ µ∗(A).
To prove the reverse inequality, we fix ε = 1 and we define the set
G = ⋃_{F∈S_F^µ(B1)} F.
Since S_F^µ(B1) is at most countable, the set G belongs to B. With the above notation,
we have the equality B1′ = B1 ∩ G, and by Remark 5.7.A, we have µ(B1 ∖ G) =
µ(B1 ∖ B1′) = 0. Since A ∖ G ⊂ B1 ∖ G, it follows that µ∗(A ∖ G) = 0. Since G is
µ∗-measurable, we get
µ∗(A) = µ∗(A ∩ G) + µ∗(A ∖ G) = µ∗(A ∩ G).
Since G is a countable union of F’s, by the σ-subadditivity of µ∗, we have
µ∗(A) = µ∗(A ∩ G) = µ∗(⋃_{F∈S_F^µ(B1)} [A ∩ F]) ≤ Σ_{F∈S_F^µ(B1)} µ∗(A ∩ F) ≤ Σ_{F∈F} µ∗(A ∩ F).

B. The implication (i) ⇒ (ii) is trivial. To prove the implication (ii) ⇒ (i), we
must show that condition (ii) implies
µ∗ (N ∩ B) = 0, ∀ B ∈ Bfin .
But if we fix some B ∈ Bfin, then of course we have µ∗(N ∩ B) ≤ µ∗(B) = µ(B) < ∞,
so using part C, we have
µ∗(N ∩ B) = Σ_{F∈F} µ∗(N ∩ B ∩ F) ≤ Σ_{F∈F} µ∗(N ∩ F) = 0,
and we are done. □

Comments. Let B be a σ-algebra on X, let µ be a measure on B. Assume F
is a sufficient µ-finite B-partition of X.
By Proposition 5.6.C, it follows that F is also a sufficient µ̃-finite Mµ-partition
of X.
We see now that Mµ may contain more “new” sets, apart from the “natural
candidates,” which are of the form B ∖ N, with B ∈ B and N locally
µ∗-neglijeable.
Such “new” sets are those which belong (see Section 2) to the σ-algebra
⋁_{F∈F} B|F. More precisely, we have the following.
Corollary 5.3. Let B be a σ-algebra on X, let µ be a measure on B. Assume
F is a sufficient µ-finite B-partition of X.
A. One has the equality
Mµ = ⋁_{F∈F} (Mµ|F).
B. For a subset A ⊂ X, the following are equivalent
(i) A ∈ Mµ;
(ii) there exist a set B ∈ ⋁_{F∈F} B|F, and a locally µ∗-neglijeable set
N ⊂ X, such that A = B ∖ N.
Proof. A. This is exactly property A from Proposition 5.6.
B. (i) ⇒ (ii). Assume A ∈ Mµ, i.e. A is µ∗-measurable. For every F ∈ F, the
set A ∩ F is µ∗-measurable. Since µ∗(A ∩ F) < ∞, by Theorem 5.5, it follows that
A ∩ F = BF ∖ NF, with BF ∈ B and µ∗(NF) = 0. Replacing BF with BF ∩ F,
and NF with NF ∩ F, we can assume that BF, NF ⊂ F. Form then the sets
B = ⋃_{F∈F} BF and N = ⋃_{F∈F} NF. On the one hand, we have B ∩ F = BF ∈ B,
∀ F ∈ F, which means precisely that B ∈ ⋁_{F∈F} B|F. On the other hand, we also
have N ∩ F = NF, so we get µ∗(N ∩ F) = 0, ∀ F ∈ F. By Proposition 5.6.B, it
follows that N is locally µ∗-neglijeable. We clearly have A = B ∖ N.
The implication (ii) ⇒ (i) is obvious. □
There is yet another nicer consequence of Proposition 5.6, for which we are
going to use the following terminology.
Definition. Let A be a σ-algebra on X, and let µ be a measure on A. A
family F is called a µ-finite decomposition for A, if
(i) F is a sufficient µ-finite A-partition of X, and
(ii) one has the equality ⋁_{F∈F} A|F = A.
(Given a collection F ⊂ A, one always has the inclusion ⋁_{F∈F} A|F ⊃ A.)
A measure µ on A is said to be decomposable, if there exists at least one µ-finite
decomposition for A.
Remark 5.8. Decomposability is a generalization of σ-finiteness. This follows
from Remark 5.7.B, combined with the fact that whenever F ⊂ A is a countable
sub-collection, one always has the equality ⋁_{F∈F} A|F = A.
With this terminology, Corollary 5.3 states that if F is a sufficient µ-finite
B-partition of X, then F is a µ̃-finite decomposition for Mµ.
With the above terminology, Corollary 5.2 has the following generalization.
Theorem 5.6. Let µ be a decomposable measure on the σ-algebra B.
A. For a subset A ⊂ X, the following are equivalent
(i) A is µ∗ -measurable;
(ii) there exist B ∈ B, and some locally µ∗-neglijeable set N, such that
A = B ∖ N.
B. For a subset N ⊂ X, the following are equivalent
(i) N is locally µ∗ -neglijeable;
(ii) there exists a locally µ-null set D ∈ B with N ⊂ D.
Proof. A. This is clear, by Corollary 5.3.
B. The implication (ii) ⇒ (i) is trivial, because any locally µ-null set D is
locally µ∗ -neglijeable, and so is every subset of D.
To prove the implication (i) ⇒ (ii) start with a locally µ∗-neglijeable set N,
and fix a µ-finite decomposition F of B. We know that µ∗(N ∩ F) = 0, ∀ F ∈ F.
In particular, using Remark 5.4.B, for each F ∈ F, there exists some set EF ∈ B,
with N ∩ F ⊂ EF, and µ(EF) = 0. Consider now the set D = ⋃_{F∈F} (EF ∩ F).
By construction, we have D ∩ F = EF ∩ F ∈ B, ∀ F ∈ F, which means that
D ∈ ⋁_{F∈F} B|F. It is here where we use condition (ii) in the definition of µ-finite
decompositions, to conclude that D belongs to B. Of course, we have
µ(D ∩ F) = µ(EF ∩ F) ≤ µ(EF) = 0, ∀ F ∈ F,
which by Proposition 5.6 means that D is locally µ∗-neglijeable. This means that
µ(D ∩ B) = µ∗(D ∩ B) = 0, ∀ B ∈ Bfin,
which means that D is locally µ-null. Since N ∩ F ⊂ EF ∩ F ⊂ D, ∀ F ∈ F, and F
is a partition of X, we get N ⊂ D. □
Lectures 23-25

6. The Lebesgue measure


In this section we apply various results from the previous sections to a very
basic example: the Lebesgue measure on Rn .
Notations. We fix an integer n ≥ 1. In Section 21 we introduced the semiring
of “half-open boxes” in Rn:
Jn = {∅} ∪ { ∏_{j=1}^n [aj, bj) : a1 < b1, . . . , an < bn } ⊂ P(Rn).
For a non-empty box A = [a1, b1) × · · · × [an, bn) ∈ Jn, we defined its n-dimensional
volume by
voln(A) = ∏_{k=1}^n (bk − ak).
We also defined voln(∅) = 0.
By Theorem 4.2, we know that voln is a finite measure on Jn .
Definitions. The maximal outer extension of voln is called the n-dimensional
outer Lebesgue measure, and is denoted by λ∗n .
The λ∗n-measurable sets in Rn will be called n-Lebesgue measurable. The
σ-algebra mλ∗n(Rn) will be denoted simply by m(Rn). The measure λ∗n|m(Rn) is
simply denoted by λn, and is called the n-dimensional Lebesgue measure. Although
this notation may appear to be confusing, it turns out (see Proposition 5.3) that λ∗n
is indeed the maximal outer extension of λn. In the case when n = 1, the subscript
will be omitted.
We know (see Section 21) that
S(Jn ) = Σ(Jn ) = Bor(Rn ).
Using the fact that the semiring Jn is σ-total in Rn , by the definition of the outer
Lebesgue measure, we have
(1) λ∗n(A) = inf{ Σ_{k=1}^∞ voln(Bk) : (Bk)_{k=1}^∞ ⊂ Jn, ⋃_{k=1}^∞ Bk ⊃ A }, ∀ A ⊂ Rn.
Using Corollary 5.2, we have the equality
m(Rn) = \overline{Bor(Rn)},
where \overline{Bor(Rn)} is the completion of Bor(Rn) with respect to the measure λn|Bor(Rn).
This means that a subset A ⊂ Rn is Lebesgue measurable, if and only if there
exists a Borel set B and a neglijeable set N such that A = B ∪ N. (The fact N is
neglijeable means that λ∗n(N) = 0, and is equivalent to the existence of a Borel set
C ⊃ N with λn(C) = 0.)
Exercise 1. Let A = [a1, b1) × · · · × [an, bn) be a half-open box in Rn. Assume
A ≠ ∅ (which means that a1 < b1, . . . , an < bn). Consider the open box Int(A)
and the closed box \overline{A}, which are given by
Int(A) = (a1, b1) × · · · × (an, bn) and \overline{A} = [a1, b1] × · · · × [an, bn].
Prove the equalities
λn(Int(A)) = λn(\overline{A}) = voln(A).
Remarks 6.1. If D ⊂ Rn is a non-empty open set, then λn (D) > 0. This is a
consequence of the above exercise, combined with the fact that D contains at least
one non-empty open box.
The Lebesgue measure of a countable subset C ⊂ Rn is zero. Using σ-additivity,
it suffices to prove this only in the case of singletons C = {x}. If we write x in
coordinates x = (x1 , . . . , xn ), and if we consider half-open boxes of the form
Jε = [x1 , x1 + ε) × · · · × [xn , xn + ε),
then the obvious inclusion {x} ⊂ Jε will force
0 ≤ λn({x}) ≤ λn(Jε) = ε^n,
so taking the limit as ε → 0, we indeed get λn({x}) = 0.
The (outer) Lebesgue measure is completely determined by its values on open
sets. More explicitly, one has the following result.
Proposition 6.1. Let n ≥ 1 be an integer. For every subset A ⊂ Rn one has:
(2) λ∗n (A) = inf{λn (D) : D open subset of Rn , with D ⊃ A}.
Proof. Throughout the proof the set A will be fixed. Let us denote, for
simplicity, the right hand side of (2) by ν(A). First of all, since every open set is
Lebesgue measurable (being Borel), we have λn (D) = λ∗n (D), for all open sets D,
so by the monotonicity of λ∗n , we get the inequality
λ∗n (A) ≤ ν(A).
We now prove the inequality λ∗n(A) ≥ ν(A). Fix for the moment some ε > 0, and
use (1) to get the existence of a sequence (Bk)_{k=1}^∞ ⊂ Jn, such that ⋃_{k=1}^∞ Bk ⊃ A,
and
Σ_{k=1}^∞ voln(Bk) < λ∗n(A) + ε.
For every k ≥ 1, we write
Bk = [a_1^{(k)}, b_1^{(k)}) × · · · × [a_n^{(k)}, b_n^{(k)}),
so that voln(Bk) = ∏_{j=1}^n (b_j^{(k)} − a_j^{(k)}). Using the obvious continuity of the map
R ∋ t ↦ ∏_{j=1}^n (b_j^{(k)} − a_j^{(k)} − t) ∈ R,
we can find, for each k ≥ 1 some numbers c_1^{(k)} < a_1^{(k)}, . . . , c_n^{(k)} < a_n^{(k)}, with
(3) ∏_{j=1}^n (b_j^{(k)} − c_j^{(k)}) < ε/2^k + ∏_{j=1}^n (b_j^{(k)} − a_j^{(k)}).
Notice that, if we define the half-open boxes
Ek = [c_1^{(k)}, b_1^{(k)}) × · · · × [c_n^{(k)}, b_n^{(k)}),
then for every k ≥ 1, we clearly have Bk ⊂ Int(Ek), and by Exercise 1, combined
with (3), we also have the inequality
λn(Int(Ek)) = voln(Ek) < ε/2^k + voln(Bk).
Summing up we then get
(4) Σ_{k=1}^∞ λn(Int(Ek)) < Σ_{k=1}^∞ ε/2^k + Σ_{k=1}^∞ voln(Bk) = ε + Σ_{k=1}^∞ voln(Bk) < 2ε + λ∗n(A).
Now we observe that by σ-sub-additivity we have
λn(⋃_{k=1}^∞ Int(Ek)) ≤ Σ_{k=1}^∞ λn(Int(Ek)),
so if we define the open set D = ⋃_{k=1}^∞ Int(Ek), then using (4) we get
(5) λn(D) < 2ε + λ∗n(A).
It is clear that we have the inclusions
A ⊂ ⋃_{k=1}^∞ Bk ⊂ ⋃_{k=1}^∞ Int(Ek) = D,
so by the definition of ν(A), combined with (5), we finally get
ν(A) ≤ λn(D) < 2ε + λ∗n(A).
Up to this moment ε > 0 was fixed. Since the inequality ν(A) < 2ε + λ∗n(A) holds
for any ε > 0 however, we finally get the desired inequality ν(A) ≤ λ∗n(A). □
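The only constructive step in the proof above is the choice of the enlarged box Ek, whose volume exceeds that of Bk by less than a prescribed tolerance. A minimal Python sketch of one way to make that choice is given below; the function names `vol` and `enlarge`, and the strategy of halving a common offset t, are illustrative assumptions and not part of the text.

```python
from math import prod


def vol(a, b):
    """Volume of the half-open box [a_1, b_1) x ... x [a_n, b_n)."""
    return prod(bj - aj for aj, bj in zip(a, b))


def enlarge(a, b, tol):
    """Return c with c_j < a_j and vol([c, b)) < vol([a, b)) + tol.

    Assumed strategy: move every left endpoint by the same offset t and
    halve t until the volume condition holds (continuity in t guarantees
    this happens for a moderate tolerance tol > 0).
    """
    t = 1.0
    while vol([aj - t for aj in a], b) >= vol(a, b) + tol:
        t /= 2
    return [aj - t for aj in a]
```

Applied with tol = ε/2^k to each box Bk, this produces left endpoints playing the role of the numbers c_j^{(k)} in (3).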

The Lebesgue measure can also be recovered from its values on compact sets.
Proposition 6.2. Let n ≥ 1 be an integer. For every Lebesgue measurable
subset A ⊂ Rn one has:
(6) λn (A) = sup{λn (K) : K compact subset of Rn , with K ⊂ A}.

Proof. Let us denote, for simplicity, the right hand side of (6) by µ(A). First
of all, by the monotonicity we clearly have the inequality
λn(A) ≥ µ(A).
To prove the inequality λn(A) ≤ µ(A), we shall first use a reduction to the bounded
case. For each integer k ≥ 1, we define the compact box
Bk = [−k, k] × · · · × [−k, k].
Notice that we have B1 ⊂ B2 ⊂ . . . , with ⋃_{k=1}^∞ Bk = Rn. We then have
B1 ∩ A ⊂ B2 ∩ A ⊂ . . . ,
with ⋃_{k=1}^∞ (Bk ∩ A) = A, so using the Continuity Lemma 4.1, we have
(7) λn(A) = lim_{k→∞} λn(Bk ∩ A) = sup{ λn(Bk ∩ A) : k ≥ 1 }.

Fix for the moment some ε > 0, and use (7) to find some k ≥ 1, such that
λn(A) ≤ λn(Bk ∩ A) + ε. (Here we assume λn(A) < ∞; when λn(A) = ∞, one chooses
instead k with λn(Bk ∩ A) as large as desired, and the argument below shows that
µ(A) = ∞.) Apply Proposition 6.1 to the set Bk ∖ A, to find an open
set D, with D ⊃ Bk ∖ A, and λn(Bk ∖ A) ≥ λn(D) − ε. On the one hand, we have
(8) λn(Bk) = λn(Bk ∩ A) + λn(Bk ∖ A) ≥ λn(Bk ∩ A) + λn(D) − ε ≥ λn(Bk ∩ A) + λn(Bk ∩ D) − ε.
On the other hand, we have
λn(Bk) = λn(Bk ∖ D) + λn(Bk ∩ D),
so using (8) we get the inequality
λn(Bk ∖ D) + λn(Bk ∩ D) ≥ λn(Bk ∩ A) + λn(Bk ∩ D) − ε,
and since all numbers involved in the above inequality are finite, we conclude that
λn(Bk ∖ D) ≥ λn(Bk ∩ A) − ε ≥ λn(A) − 2ε.
Obviously the set K = Bk ∖ D is compact, with K ⊂ Bk ∩ A ⊂ A, so we have
µ(A) ≥ λn(K), hence we get the inequality
µ(A) ≥ λn(A) − 2ε.
Since this is true for all ε > 0, the desired inequality µ(A) ≥ λn(A) follows. □

Corollary 6.1. For a set A ⊂ Rn , the following are equivalent:


(i) A is Lebesgue measurable;
(ii) there exists a neglijeable set N and a sequence (Kj)_{j=1}^∞ of compact
subsets of Rn, such that
A = N ∪ ⋃_{j=1}^∞ Kj.

Proof. (i) ⇒ (ii). Start by using the boxes
Bk = [−k, k] × · · · × [−k, k]
which have the property that ⋃_{k=1}^∞ Bk = Rn, so we get A = ⋃_{k=1}^∞ (Bk ∩ A). Fix
for the moment k. Apply Proposition 6.2 to find a sequence (C_r^k)_{r=1}^∞ of compact
subsets of Bk ∩ A, such that lim_{r→∞} λn(C_r^k) = λn(Bk ∩ A). Consider the countable
family (C_r^k)_{k,r=1}^∞ of compact sets, and enumerate it as a sequence (Kj)_{j=1}^∞, so that
we have
⋃_{j=1}^∞ Kj = ⋃_{k=1}^∞ ⋃_{r=1}^∞ C_r^k.
If we define, for each k ≥ 1, the sets Ek = ⋃_{r=1}^∞ C_r^k ⊂ Bk ∩ A and Nk = (Bk ∩ A) ∖ Ek,
then, because of the inclusion C_r^k ⊂ Ek ⊂ Bk ∩ A, we have the inequalities
(9) 0 ≤ λn(Nk) = λn(Bk ∩ A) − λn(Ek) ≤ λn(Bk ∩ A) − λn(C_r^k), ∀ r ≥ 1.
Using the fact that
lim_{r→∞} λn(C_r^k) = λn(Bk ∩ A) ≤ λn(Bk) < ∞,
the inequalities (9) force λn(Nk) = 0, ∀ k ≥ 1. Now if we define the set
N = A ∖ ⋃_{j=1}^∞ Kj, we have
N = ⋃_{k=1}^∞ [(Bk ∩ A) ∖ ⋃_{j=1}^∞ Kj] = ⋃_{k=1}^∞ [(Bk ∩ A) ∖ ⋃_{p=1}^∞ Ep] ⊂ ⋃_{k=1}^∞ [(Bk ∩ A) ∖ Ek] = ⋃_{k=1}^∞ Nk,
which proves that λn(N) = 0.
The implication (ii) ⇒ (i) is trivial. □
Proposition 6.2 does not hold if A ⊂ Rn is non-measurable. In fact the equality
(6), with λn replaced by λ∗n, essentially forces A to be measurable, as shown by the
following.
Exercise 2. Let A ⊂ Rn be an arbitrary subset, with λ∗n(A) < ∞. Prove that
the following are equivalent:
(i) A is Lebesgue measurable;
(ii) λ∗n (A) = sup{λn (K) : K compact subset of Rn , with K ⊂ A}.
Propositions 6.1 and 6.2 are regularity properties. The following terminology is
useful:
Definitions. Suppose A is a σ-algebra on X, and µ is a measure on A. Sup-
pose we have a sub-collection F ⊂ A.
(i) We say that µ is regular from below, with respect to F, if
µ(A) = sup{ µ(F) : F ⊂ A, F ∈ F }, for all A ∈ A.
(ii) We say that µ is regular from above, with respect to F, if
µ(A) = inf{ µ(F) : F ⊃ A, F ∈ F }, for all A ∈ A.

With this terminology, Proposition 6.1 gives the fact that the Lebesgue measure is
regular from above with respect to open sets, while Proposition 6.2 gives the fact
that the Lebesgue measure is regular from below with respect to compact sets.
Exercise 3. For a subset A ⊂ Rn , prove that the following are equivalent:
(i) A is Lebesgue measurable;
(ii) There exist a sequence of compact sets (Kj)_{j=1}^∞, and a sequence of open
sets (Dj)_{j=1}^∞, such that ⋃_{j=1}^∞ Kj ⊂ A ⊂ ⋂_{j=1}^∞ Dj, and the difference
(⋂_{j=1}^∞ Dj) ∖ (⋃_{j=1}^∞ Kj) is neglijeable.
Hint: For the implication (i) ⇒ (ii) analyze first the case when λ∗ (A) < ∞. Then write A as a
countable union of sets of finite outer measure.
In the one-dimensional case n = 1, the Lebesgue measure of open sets can be
computed with the aid of the following result.
Proposition 6.3. For every open set D ⊂ R, there exists a countable (or
finite) pair-wise disjoint collection {Ji}i∈I of open intervals with D = ⋃_{i∈I} Ji.
Proof. For every point x ∈ D, we define
ax = inf{a < x : (a, x) ⊂ D} and bx = sup{b > x : (x, b) ⊂ D}.
(The fact that D is open guarantees the fact that both sets above are non-empty.)
It is clear that, for every x ∈ D, the open interval Jx = (ax, bx) is contained in D, so
we have the equality D = ⋃_{x∈D} Jx. The problem at this point is the fact that the
collection {Jx}x∈D is not pair-wise disjoint. What we need to find is a countable
(or finite) subset X ⊂ D, such that the sub-collection {Jx}x∈X is pair-wise disjoint,
and we still have D = ⋃_{x∈X} Jx. One way to do this is based on the following
Claim: For two points x, y ∈ D, the following are equivalent:
(i) x ∈ Jy;
(ii) Jx ⊃ Jy;
(iii) Jx ∩ Jy ≠ ∅;
(iv) Jx = Jy.
To prove the implication (i) ⇒ (ii) we observe that if x ∈ Jy, then ay < x < by,
so we have (ay, x) ⊂ D and (x, by) ⊂ D, which means that ax ≤ ay and bx ≥ by,
therefore we have the inclusion Jx = (ax, bx) ⊃ (ay, by) = Jy. The implication
(ii) ⇒ (iii) is trivial. To prove (iii) ⇒ (iv), assume Jx ∩ Jy ≠ ∅, and pick a point
z ∈ Jx ∩ Jy. Using the implication (i) ⇒ (ii) we have the inclusions Jz ⊃ Jx and
Jz ⊃ Jy. In particular we have x ∈ Jz, so again using the implication (i) ⇒ (ii) we
get Jx ⊃ Jz, which means that we have in fact the equality Jx = Jz. Likewise we
have the equality Jy = Jz, so (iv) follows. The implication (iv) ⇒ (i) is trivial.
Going back to the proof of the Proposition, we now see that, using the fact
that any open interval contains a rational number, if we put X0 = D ∩ Q, then
for any y ∈ D, there exists x ∈ X0, such that Jx = Jy. This gives the equality
D = ⋃_{x∈X0} Jx, this time with the indexing set X0 countable. Finally, we equip
the set X0 with the equivalence relation
x ∼ y ⇐⇒ Jx = Jy,
and we choose X ⊂ X0 to be a complete set of representatives for the equivalence
classes. This means that, for every y ∈ X0, there exists a unique x ∈ X with
Jx = Jy. It is clear now that we still have D = ⋃_{x∈X} Jx, but now if x, x′ ∈ X
are such that x ≠ x′, then x ≁ x′, so we have Jx ≠ Jx′, which by the Claim
gives Jx ∩ Jx′ = ∅. □
Comments. When we want to compute the Lebesgue measure of an open set
D ⊂ R, we should first try to write D = ⋃_{i∈I} Ji with (Ji)i∈I a countable (or finite)
pair-wise disjoint collection of open intervals. If we succeed, then we would have
λ(D) = Σ_{i∈I} λ(Ji).
For intervals (open or not) the Lebesgue measure is the same as the length.
There are instances when we can manage only to write a given open set D as a
union D = ⋃_{k=1}^∞ Jk, with the J’s not necessarily disjoint. In that case we can only
get the estimate
λ(D) ≤ Σ_{k=1}^∞ λ(Jk).
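For a finite union of open intervals, the disjoint decomposition from Proposition 6.3 can be computed explicitly by sorting and merging overlapping intervals. The sketch below is only an illustration under that finiteness assumption; the function name `measure_of_union` and the representation of intervals as pairs (lo, hi) are hypothetical choices, not notation from the text.

```python
def measure_of_union(intervals):
    """Lebesgue measure of a finite union of open intervals (lo, hi).

    The intervals are merged into pair-wise disjoint components, whose
    lengths are then added up.
    """
    total = 0.0
    cur_lo = cur_hi = None
    for lo, hi in sorted(intervals):
        if cur_hi is None or lo >= cur_hi:
            # A new component starts; intervals that merely touch at an
            # endpoint stay separate, which does not affect the measure.
            if cur_hi is not None:
                total += cur_hi - cur_lo
            cur_lo, cur_hi = lo, hi
        else:
            # Overlap with the current component: extend it.
            cur_hi = max(cur_hi, hi)
    if cur_hi is not None:
        total += cur_hi - cur_lo
    return total
```

With the overlapping family (0, 1), (0.5, 2), (3, 4), the merged components are (0, 2) and (3, 4), so the measure is smaller than the sum 1 + 1.5 + 1 of the individual lengths, in agreement with the sub-additive estimate above.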
Example 6.1. Consider the ternary Cantor set K3 ⊂ [0, 1], discussed in III.3.
We know (see Remarks 3.5) that one can find a pair-wise disjoint sequence (Dn)_{n=0}^∞ of open
subsets of (0, 1) such that K3 = [0, 1] ∖ ⋃_{n=0}^∞ Dn, and such that, for each n ≥ 0,
the open set Dn is a disjoint union of 2^n intervals of length 1/3^{n+1}. In particular,
this means that λ(Dn) = 2^n/3^{n+1}, so
λ(K3) = λ([0, 1]) − λ(⋃_{n=0}^∞ Dn) = 1 − Σ_{n=0}^∞ λ(Dn) = 1 − Σ_{n=0}^∞ 2^n/3^{n+1} = 0.
What is interesting here (see Remarks 3.5) is the fact that card K3 = c.
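The series in Example 6.1 is geometric, and its partial sums can be checked with exact rational arithmetic. The short Python sketch below does this; the function name `removed_length` is a hypothetical helper introduced only for this illustration.

```python
from fractions import Fraction


def removed_length(stages):
    """Total length of D_0, ..., D_{stages-1}: the sum of 2^n / 3^(n+1)."""
    return sum(Fraction(2**n, 3**(n + 1)) for n in range(stages))


# After N stages the remaining length is 1 - removed_length(N) = (2/3)^N,
# which tends to 0 as N grows.
for N in (1, 5, 20):
    print(N, 1 - removed_length(N))
```

The identity 1 − Σ_{n=0}^{N−1} 2^n/3^{n+1} = (2/3)^N makes the limit λ(K3) = 0 visible at each finite stage.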
Remark 6.2. An interesting consequence of the above computation is the
fact that all subsets of K3 are Lebesgue measurable, i.e. one has the inclusion
P(K3 ) ⊂ m(R). This gives the inequality
card m(R) ≥ card P(K3) = 2^{card K3} = 2^c.
Since we also have m(R) ⊂ P(R), we get
card m(R) ≤ card P(R) = 2^{card R} = 2^c,
so using the Cantor-Bernstein Theorem we get the equality
card m(R) = 2^c.
We also know (see Corollary 2.5) that card Bor(R) = c.
As a consequence of this difference in cardinalities, one gets the fact that we
have a strict inclusion
(10) Bor(R) ⊊ m(R).
Later on we shall construct (more or less) explicitly a Lebesgue measurable set
which is not Borel.
Exercise 4. The strict inclusion (10) holds also if R is replaced with Rn,
with n ≥ 2. In this case, instead of using Cantor sets, one can proceed as follows.
Consider the set S = R^{n−1} × {0}. Prove that λn(S) = 0. Conclude that
card m(Rn) = 2^c.
One key feature of the Lebesgue (outer) measure is the translation invariance
property, described in the following result. To formulate it we introduce the follow-
ing notation. For an integer n ≥ 1, a point x ∈ Rn , and a subset A ⊂ Rn , we define
the set
A + x = {a + x : a ∈ A}.
Remark that the map Θx : Rn ∋ a ↦ a + x ∈ Rn is a homeomorphism. In
particular, both Θx and its inverse Θx^{−1} = Θ−x are Borel measurable, which means that, for
a set A ⊂ Rn, one has the equivalence
A ∈ Bor(Rn ) ⇐⇒ A + x ∈ Bor(Rn ).
Proposition 6.4. Let n ≥ 1 be an integer. For any set A ⊂ Rn and any point
x ∈ Rn one has the equality
λ∗n(A + x) = λ∗n(A).

Proof. Fix A and x. First remark that, for every half-open box B ∈ Jn , its
translation B + x is again a half-open box, and we have the equality
voln (B + x) = voln (B).

Fix for the moment ε > 0, and choose a sequence (Bk)_{k=1}^∞ ⊂ Jn, such that
A ⊂ ⋃_{k=1}^∞ Bk, and
Σ_{k=1}^∞ voln(Bk) ≤ λ∗n(A) + ε.
Then, using the obvious inclusion A + x ⊂ ⋃_{k=1}^∞ (Bk + x), by the remark made at
the beginning of the proof, combined with the monotonicity and σ-sub-additivity of
the outer Lebesgue measure, we have
λ∗n(A + x) ≤ λ∗n(⋃_{k=1}^∞ (Bk + x)) ≤ Σ_{k=1}^∞ λ∗n(Bk + x) = Σ_{k=1}^∞ voln(Bk + x) = Σ_{k=1}^∞ voln(Bk) ≤ λ∗n(A) + ε.
Since the inequality λ∗n(A + x) ≤ λ∗n(A) + ε holds for all ε > 0, we get
λ∗n(A + x) ≤ λ∗n(A).
The other inequality follows from the above one applied to the set A + x and the
translation by −x. □
Corollary 6.2. For a subset A ⊂ Rn , one has the equivalence
A ∈ m(Rn ) ⇐⇒ A + x ∈ m(Rn ).
Proof. Assume A ∈ m(Rn), and write A = B ∪ N, with B Borel, and N neglijeable.
Then we have A + x = (B + x) ∪ (N + x). The set B + x is Borel. By the above result
we have λ∗n(N + x) = λ∗n(N) = 0, i.e. N + x is neglijeable. Therefore A + x is Lebesgue
measurable. The converse follows by applying this implication to the set A + x and
the translation by −x. □
As we have seen, the fact that there exist Lebesgue measurable sets that are
not Borel is explained by the difference in cardinalities. Since card m(Rn ) = 2c =
card P(Rn ), it is legitimate to ask whether the inclusion m(Rn ) ⊂ P(Rn ) is strict.
In other words, do there exist sets that are not Lebesgue measurable? The answer
is affirmative, as discussed in the following.
Example 6.2. Equip R with the equivalence relation
x ∼ y ⇐⇒ x − y ∈ Q.
Denote by R/Q the quotient space (this is in fact the quotient group of (R, +) with
respect to the subgroup Q), and denote by π : R → R/Q the quotient map. Since
for every x ∈ R, one can find some y ∼ x, with y ∈ [0, 1), it follows that the map
π|[0,1) : [0, 1) → R/Q is surjective. Choose then a map φ : R/Q → [0, 1), such that
π ◦ φ = Id, and put E = φ(R/Q). The set E is a complete set of representatives for
the equivalence relation ∼. In other words, E ⊂ [0, 1) has the property that, for
every x ∈ R, there exists exactly one element y ∈ E, with x ∼ y. In particular, the
collection of sets (E + q)q∈Q is pair-wise disjoint, and satisfies ⋃_{q∈Q} (E + q) = R.
Using σ-sub-additivity, we get
∞ = λ(R) ≤ Σ_{q∈Q} λ∗(E + q).
Since (by Proposition 6.4) we have λ∗(E + q) = λ∗(E), the above inequality forces
λ∗(E) > 0.
Claim: The set E is not Lebesgue measurable.
Assume E is Lebesgue measurable. If we define the set X = Q ∩ [0, 1), then the
sets E + q, q ∈ X are pair-wise disjoint. On the one hand, the measurability
of E, combined with Corollary 6.2, would imply the measurability of the set
S = ⋃_{q∈X} (E + q). On the other hand, the equalities λ(E + q) = λ(E) > 0 will
force λ(S) = ∞. But this is impossible, since we obviously have S ⊂ [0, 2), which
forces λ(S) ≤ 2.
Exercise 5. Let E ∈ m(Rn ). Prove that the map
Rn ∋ x ↦ λ(E ∪ (E + x)) ∈ [0, ∞]
is continuous.
Hint: Analyze first the case when E is compact. In this particular case, show that for every
x0 ∈ Rn and every open set D ⊃ E ∪ (E + x0 ), there exists some neighborhood V of x0 , such that
D ⊃ E ∪ (E + x), ∀ x ∈ V.
Use then regularity from above, combined with the inequality (see footnote 9)
|λ(A) − λ(B)| ≤ λ(A △ B), for all A, B ∈ m(Rn), with λ(A), λ(B) < ∞.

In the general case, use regularity from below. (The case λ(E) = ∞ is trivial.)
Exercise 6. Let E ∈ m(Rn ), be such that λn (E) > 0. Prove that the set
E − E = {x − y : x, y ∈ E}
is a neighborhood of 0.
Hint: Assume the contrary, which means that there exists a sequence (xp)_{p=1}^∞ ⊂ Rn ∖ (E − E),
with lim_{p→∞} xp = 0. This will force E ∩ (E + xp) = ∅, ∀ p ≥ 1. Use the preceding Exercise to
get a contradiction.
We are now in position to construct a Lebesgue measurable set which is not
Borel.
Example 6.3. In Section 3 we discussed the compact space T = {0, 1}^{ℵ0} and
the maps
φr : T ∋ (αn)_{n=1}^∞ ↦ (r − 1) Σ_{n=1}^∞ αn/r^n ∈ [0, 1].
For each r ≥ 2 the map φr : T → [0, 1] is continuous so the set Kr = φr (T ) is
compact. We have K2 = [0, 1], and K3 is the ternary Cantor set. We also know
(see Theorem 3.5) that, for a set A ⊂ T , one has the equivalence
(11) A ∈ Bor(T ) ⇐⇒ φr (A) ∈ Bor(Kr ).
Choose now a set E ⊂ [0, 1] which is not Lebesgue measurable. In particular, E is
not Borel, so E ∉ Bor([0, 1]). Since φ2 : T → [0, 1] is surjective, by (11) it follows
that the set A = φ2^{−1}(E) is not in Bor(T). Again, by (11) it follows that the set
S = φ3(A) is not in Bor(K3). Since
Bor(K3) = Bor(R)|K3,
this gives S ∉ Bor(R). Notice however that since S ⊂ K3, it follows that S is
Lebesgue measurable.
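To get a feeling for the maps φr used in Example 6.3, one can evaluate them on sequences with only finitely many non-zero terms. The Python sketch below does so with exact fractions; the function name `phi` and the convention that a finite list stands for a sequence continued by zeros are assumptions made for this illustration.

```python
from fractions import Fraction


def phi(r, digits):
    """(r - 1) * sum of alpha_n / r^n over a finite 0/1 list alpha_1, alpha_2, ...

    The list is read as a sequence in T that is zero from some point on.
    """
    return (r - 1) * sum(Fraction(d, r**n) for n, d in enumerate(digits, start=1))


# The same sequence (1, 0, 1, 0, 0, ...) is sent to a binary expansion by
# phi_2 and to a point of the ternary Cantor set by phi_3.
print(phi(2, [1, 0, 1]), phi(3, [1, 0, 1]))
```

For r = 3 the values have ternary expansions using only the digits 0 and 2, which is why φ3 lands in K3, while for r = 2 one obtains ordinary binary expansions filling out [0, 1].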
Comment. When one wants to prove that a Lebesgue measurable set M ⊂ R
has positive measure, a sufficient condition for this property is that Int(M) ≠ ∅
(see Remarks 6.1). It turns out however that this condition is not necessary,
as seen from the following:
Footnote 9. This inequality holds for any additive map defined on a ring.

Exercise 7. Start with the interval [0, 1], and list all rational numbers
in [0, 1] as a sequence Q ∩ [0, 1] = {xn}_{n=1}^∞. Fix some ε > 0, and consider the open
set
D = ⋃_{n=1}^∞ (xn − ε/2^{n+1}, xn + ε/2^{n+1}).
Consider the compact set K = [0, 1] r D.
(i) Prove that λ(D) ≤ ε.
(ii) Prove that λ(K) ≥ 1 − ε.
(iii) Prove that Int(K) = ∅.
Hint: For (iii) use the fact that K ∩ Q = ∅.
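The estimate in part (i) rests on the fact that the n-th interval in the definition of D has length ε/2^n, so the lengths add up to at most ε. The following Python sketch checks this for finitely many intervals with exact fractions; the function name `total_length` is a hypothetical helper, and the check is only a numerical illustration, not a solution of the exercise.

```python
from fractions import Fraction


def total_length(eps, count):
    """Sum of the lengths 2 * eps / 2^(n+1) of the first `count` intervals."""
    return sum(2 * eps / 2**(n + 1) for n in range(1, count + 1))


eps = Fraction(1, 10)
for count in (1, 10, 50):
    # The partial sums equal eps * (1 - 2^(-count)) and stay below eps.
    assert total_length(eps, count) == eps * (1 - Fraction(1, 2**count))
```

Since the intervals may overlap, the sum of the lengths is in general only an upper bound for λ(D), which is all that part (i) requires.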
Exercise 8*. Prove that, for every non-empty open set D ⊂ R, and any two
positive numbers α, β with α + β < λ(D), there exist compact sets A, B ⊂ D, with
λ(A) > α, λ(B) > β, such that A ∩ B = ∅ and (A ∪ B) ∩ Q = ∅.
Hint: Write D as a union of a pair-wise disjoint sequence $(J_n)_{n=1}^\infty$ of open intervals, so that
$\lambda(D) = \sum_{n=1}^\infty \lambda(J_n)$. Find then two sequences $(\alpha_n)_{n=1}^\infty$ and $(\beta_n)_{n=1}^\infty$ of positive numbers, such
that $\sum_{n=1}^\infty \alpha_n > \alpha$, $\sum_{n=1}^\infty \beta_n > \beta$, and $\alpha_n + \beta_n < \lambda(J_n)$, for all n ≥ 1. This reduces essentially
the problem to the case when D is an open interval, for which one can use the construction
outlined in Exercise 7.
Exercise 9*. Construct a Borel set A ⊂ R, such that, for every open interval
I ⊂ R one has λ(I ∩ A) > 0 and λ(I ∖ A) > 0.
Hints: List all open intervals with rational endpoints as a sequence $(I_n)_{n=1}^\infty$. Start off (use Exercise
8) by choosing two compact sets A1 , B1 ⊂ I1 , with A1 ∩ B1 = ∅, (A1 ∪ B1 ) ∩ Q = ∅,
and λ(A1 ), λ(B1 ) > 0. Use Exercise 5 to construct two sequences $(A_n)_{n=1}^\infty$ and $(B_n)_{n=1}^\infty$ of
compact sets, such that, for all n ≥ 1 we have: (i) An ∩ Bn = ∅; (ii) (An ∪ Bn ) ∩ Q = ∅;
(iii) λ(An ), λ(Bn ) > 0; (iv) $A_{n+1} \cup B_{n+1} \subset I_{n+1} \smallsetminus \big[\bigcup_{k=1}^n (A_k \cup B_k)\big]$. Put $A = \bigcup_{n=1}^\infty A_n$ and
$B = \bigcup_{n=1}^\infty B_n$. Notice that A ∩ B = ∅, λ(A), λ(B) > 0, and λ(A ∩ In ), λ(B ∩ In ) > 0, ∀ n ≥ 1.
In the remainder of this section we discuss some applications of the Lebesgue
measure to the theory of Riemann integration. The following technical result will
be very useful.
Lemma 6.1. Let f : [a, b] → R be a non-negative Riemann integrable function,
let A, B ⊂ [a, b] be two disjoint sets, with A∪B = [a, b]. Then one has the estimates
\[ \lambda^*(A) \cdot \inf_{z\in A} f(z) \;\le\; \int_a^b f(t)\,dt \;\le\; (b-a) \cdot \sup_{x\in A} f(x) + \lambda^*(B) \cdot \sup_{y\in B} f(y). \]

Proof. Define the numbers
\[ \alpha = \sup_{x\in A} f(x), \quad \beta = \sup_{y\in B} f(y), \quad\text{and}\quad \gamma = \inf_{z\in A} f(z). \]

Recall first that, if for each partition ∆ = (a = x0 < x1 < · · · < xn = b) of [a, b],
we define the lower and the upper Darboux sums of f with respect to ∆:
\[ L(\Delta, f) = \sum_{k=1}^n (x_k - x_{k-1}) \cdot \inf_{t\in[x_{k-1},x_k]} f(t), \]
\[ U(\Delta, f) = \sum_{k=1}^n (x_k - x_{k-1}) \cdot \sup_{t\in[x_{k-1},x_k]} f(t), \]

then one has the equalities


(12) \[ \int_a^b f(t)\,dt = \sup\big\{ L(\Delta, f) : \Delta \text{ partition of } [a,b] \big\} = \inf\big\{ U(\Delta, f) : \Delta \text{ partition of } [a,b] \big\}. \]
Fix now a partition ∆ = (a = x0 < x1 < · · · < xn = b) of [a, b], and define the set

\[ S = \big\{ k \in \{1,\dots,n\} : [x_{k-1},x_k] \cap A \neq \emptyset \big\}. \]
It is clear that
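\[ \inf_{x\in[x_{k-1},x_k]} f(x) \le \alpha, \quad \sup_{x\in[x_{k-1},x_k]} f(x) \ge \gamma, \quad \forall\, k \in S, \]
\[ \inf_{y\in[x_{k-1},x_k]} f(y) \le \beta, \quad \sup_{y\in[x_{k-1},x_k]} f(y) \ge 0, \quad \forall\, k \in \{1,\dots,n\} \smallsetminus S, \]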

so we get
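(13) \[ L(\Delta, f) \le \Big[ \sum_{k\in S} (x_k - x_{k-1}) \Big] \cdot \alpha + \Big[ \sum_{k\notin S} (x_k - x_{k-1}) \Big] \cdot \beta \]
(14) \[ U(\Delta, f) \ge \Big[ \sum_{k\in S} (x_k - x_{k-1}) \Big] \cdot \gamma \]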

Consider now the sets


\[ M = \bigcup_{k\in S} [x_{k-1},x_k] \quad\text{and}\quad N = \bigcup_{k\notin S} [x_{k-1},x_k]. \]
Since the intervals involved in both M and N have at most singleton overlaps, it
follows that we have the equalities
\[ \sum_{k\in S} (x_k - x_{k-1}) = \lambda(M) \quad\text{and}\quad \sum_{k\notin S} (x_k - x_{k-1}) = \lambda(N), \]

so the estimates (13) and (14) read


(15) L(∆, f ) ≤ λ(M ) · α + λ(N ) · β
(16) U (∆, f ) ≥ λ(M ) · γ
Since we clearly have A ⊂ M ⊂ [a, b] and N ⊂ B, we have the inequalities
λ∗ (A) ≤ λ(M ) ≤ b − a and λ(N ) ≤ λ∗ (B),
so the inequalities (15) and (16) give
L(∆, f ) ≤ (b − a) · α + λ∗ (B) · β and U (∆, f ) ≥ λ∗ (A) · γ.
Since ∆ is arbitrary, the desired inequality then follows from (12). 
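The lower and upper Darboux sums recalled in the proof above can be computed exactly in simple cases. The sketch below assumes that f is non-decreasing, so that the infimum and supremum over each subinterval are attained at its endpoints; the helper names and the choice f(t) = t² on [0, 1] are illustrative assumptions.

```python
from fractions import Fraction

def darboux_sums(f, partition):
    """Lower and upper Darboux sums of f with respect to a partition.

    Assumption: f is non-decreasing, so on each [x_{k-1}, x_k] the infimum
    is f(x_{k-1}) and the supremum is f(x_k).
    """
    pieces = list(zip(partition, partition[1:]))
    lower = sum((right - left) * f(left) for left, right in pieces)
    upper = sum((right - left) * f(right) for left, right in pieces)
    return lower, upper

def uniform_partition(a, b, n):
    """The partition of [a, b] into n pieces of equal length."""
    return [a + (b - a) * Fraction(k, n) for k in range(n + 1)]

def square(t):
    return t * t

lower, upper = darboux_sums(square, uniform_partition(Fraction(0), Fraction(1), 4))
# lower <= 1/3 <= upper, in agreement with (12)
```

Refining the partition narrows the gap between the two sums, which is the content of the equalities (12).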

One application of the above result is the following.


Proposition 6.5. If f : [a, b] → R is Riemann integrable, and the set
N = {x ∈ [a, b] : f (x) ≠ 0}
is negligible, then
(17) \[ \int_a^b f(x)\,dx = 0. \]

Proof. Since f is bounded, there exists some constant C > 0, such that the
Riemann integrable functions C + f and C − f are both non-negative. Apply Lemma
6.1 to these two functions with A = [a, b] ∖ N and B = N . Since $f|_{[a,b]\smallsetminus N} = 0$, we
get $(C \pm f)|_{[a,b]\smallsetminus N} = C$, and since λ∗(N ) = 0, the second estimate in Lemma 6.1 gives
\[ \int_a^b [C \pm f(x)]\,dx \le (b-a) \cdot C, \]
which yields
\[ \pm \int_a^b f(x)\,dx = \int_a^b \big( [C \pm f(x)] - C \big)\,dx = \int_a^b [C \pm f(x)]\,dx - (b-a) \cdot C \le 0, \]
from which (17) immediately follows. □

In order to make the exposition a bit easier to follow, it will be helpful to


introduce the following
Convention. Given two functions f1 , f2 : [a, b] → R, and a relation r on R
(in our case r will be either “=,” or “≥,” or “≤”), we write

f1 r f2 , a.e.

if the set
\[ A = \big\{ x \in [a,b] : f_1(x) \; r \; f_2(x) \big\} \]
has negligible complement in [a, b], i.e. $\lambda^*\big([a,b] \smallsetminus A\big) = 0$. The abbreviation “a.e.”
stands for “almost everywhere.”


For example, using this convention, Proposition 6.5 reads: if f : [a, b] → R is
Riemann integrable, and f = 0, a.e., then $\int_a^b f(x)\,dx = 0$.
Exercise 10. A. Prove that “=, a.e.” is an equivalence relation, and “≥, a.e.” and
“≤, a.e.” are transitive relations on the collection of all functions [a, b] → R.
B. Prove that f1 ≥ f2 , a.e. and f1 ≤ f2 , a.e. imply f1 = f2 , a.e.
C. Prove that these relations are compatible with the arithmetic operations, in
the exact way as their “honest” versions. For example, if r is one of “=,” or “≥,”
or “≤”, and if f1 r f2 , a.e. and g1 r g2 , a.e., then (f1 + g1 ) r (f2 + g2 ), a.e.
Exercise 11. Let f, g : [a, b] → R be continuous functions, such that f ≥ g, a.e.
Prove that f ≥ g.
Exercise 12. Let f : [a, b] → R be a non-negative Riemann integrable function,
with $\int_a^b f(x)\,dx = 0$. Prove that f = 0, a.e.
Comment. Riemann integrability is quite a rigid condition. For example the
characteristic function $\kappa_{\mathbb{Q}\cap[a,b]}$ of the set of rational numbers in [a, b] is not Riemann
integrable. By the above result however, we can introduce a slightly weaker notion,
which will make such functions integrable, in a weaker sense. This will be a first
“improvement” of the Riemann integration theory. Eventually (see Chapter IV), a
more sophisticated theory, the Lebesgue integral, will emerge.
Definition. We say that a function f : [a, b] → R is almost Riemann integrable,
if there exists a Riemann integrable function g : [a, b] → R, with f = g, a.e.
Of course, such a g is not unique. Notice however that, if h : [a, b] → R is another
Riemann integrable function, with f = h, a.e., then g = h, a.e., so by Proposition
6.5 (applied to g − h), we immediately get the equality
\[ \int_a^b g(x)\,dx = \int_a^b h(x)\,dx. \]
This observation shows that we can unambiguously define
\[ {\approx}\!\!\int_a^b f(x)\,dx = \int_a^b g(x)\,dx. \]

Example 6.4. Consider the function $f = \kappa_{\mathbb{Q}\cap[a,b]}$. Since Q ∩ [a, b] is negligible,
we have f = 0, a.e. So f is almost Riemann integrable (although it is not Riemann
integrable), and we have
\[ {\approx}\!\!\int_a^b f(x)\,dx = 0. \]
We now focus our attention to (honest) Riemann integrability, with an eye on
the role played by continuity. For a function f : [a, b] → R we define the set

\[ D_f = \big\{ x \in [a,b] : f \text{ not continuous at } x \big\}. \]
It is well-known that continuous functions are Riemann integrable. There are dis-
continuous functions which are still Riemann integrable, for instance we know that
(18) Df finite =⇒ f Riemann integrable.
Notations. Let f : [a, b] → R be a bounded function. Suppose ∆ = (a =
x0 < x1 < · · · < xn = b) is a partition. For each k ∈ {1, . . . , n} we consider the
numbers
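\[ M_k = \sup_{t\in[x_{k-1},x_k]} f(t) \quad\text{and}\quad m_k = \inf_{t\in[x_{k-1},x_k]} f(t), \]
and we define the functions
\[ f_\Delta = m_1 \cdot \kappa_{[x_0,x_1]} + m_2 \cdot \kappa_{(x_1,x_2]} + \dots + m_n \cdot \kappa_{(x_{n-1},x_n]}, \]
\[ f^\Delta = M_1 \cdot \kappa_{[x_0,x_1]} + M_2 \cdot \kappa_{(x_1,x_2]} + \dots + M_n \cdot \kappa_{(x_{n-1},x_n]}. \]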
Clearly the functions $f_\Delta$ and $f^\Delta$ have only finitely many points of discontinuity, so
they are Riemann integrable.
With these notations we have the following
Proposition 6.6. For a bounded function f : [a, b] → R, the following are
equivalent:
(i) f is Riemann integrable;
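(ii) $\inf\big\{ \int_a^b [f^\Delta(x) - f_\Delta(x)]\,dx : \Delta \text{ partition of } [a,b] \big\} = 0$;
(iii) there exists a sequence $(\Delta_p)_{p=1}^\infty$ of partitions of [a, b], with ∆1 ⊂ ∆2 ⊂ . . . ,
and $\lim_{p\to\infty} \int_a^b \big[ f^{\Delta_p}(x) - f_{\Delta_p}(x) \big]\,dx = 0$.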

Proof. From the definition of Riemann integrability, we know that (i) is equiv-
alent to any of the following two conditions

(ii’) $\inf\big\{ U(\Delta, f) - L(\Delta, f) : \Delta \text{ partition of } [a,b] \big\} = 0$;
(iii’) there exists a sequence $(\Delta_p)_{p=1}^\infty$ of partitions of [a, b], with ∆1 ⊂ ∆2 ⊂ . . . ,
and $\lim_{p\to\infty} \big[ U(\Delta_p, f) - L(\Delta_p, f) \big] = 0$.
Then the Proposition follows immediately from the fact that, for every partition ∆
one has the equalities
\[ \int_a^b f_\Delta(x)\,dx = L(\Delta, f) \quad\text{and}\quad \int_a^b f^\Delta(x)\,dx = U(\Delta, f). \] □

The following result gives a complete description of the relationship between


Riemann integrability and continuity.
Theorem 6.1 (Lebesgue’s criterion for Riemann integrability). Let f : [a, b] →
R be a bounded function. The following are equivalent:
(i) f is Riemann integrable;
(ii) the discontinuity set Df is negligible.
Proof. (i) ⇒ (ii). Assume f is Riemann integrable. Using Proposition 6.6,
there exists a sequence $(\Delta_p)_{p=1}^\infty$ of partitions of [a, b], such that ∆1 ⊂ ∆2 ⊂ . . . and
\[ \lim_{p\to\infty} \int_a^b \big[ f^{\Delta_p}(x) - f_{\Delta_p}(x) \big]\,dx = 0. \]
Notice that
(19) \[ f^{\Delta_1} \ge f^{\Delta_2} \ge f^{\Delta_3} \ge \dots \ge f \ge \dots \ge f_{\Delta_3} \ge f_{\Delta_2} \ge f_{\Delta_1}. \]
Define the Riemann integrable functions $h_p = f^{\Delta_p} - f_{\Delta_p}$, p ∈ N. We then clearly
have
(α) hp ≥ hp+1 ≥ 0, ∀ p ∈ N;
(β) $\lim_{p\to\infty} \int_a^b h_p(x)\,dx = 0$.
Using (α) we can define the function h : [a, b] → R by
\[ h(x) = \lim_{p\to\infty} h_p(x), \quad \forall\, x \in [a,b]. \]

Claim 1: The set N = {x ∈ [a, b] : h(x) ≠ 0} is negligible.


First of all, the functions hp are all Lebesgue measurable. Secondly, since h is
a point-wise limit of a sequence of Lebesgue measurable functions, it follows (see
Theorem 3.2) that h itself is Lebesgue measurable. In particular N is Lebesgue
measurable. For every integer j ≥ 1, define
\[ N_j = \big\{ x \in [a,b] : h(x) > \tfrac{1}{j} \big\}, \]
so that the sets Nj , j ≥ 1 are again Lebesgue measurable, and $N = \bigcup_{j=1}^\infty N_j$. In
order to prove that N is negligible, it then suffices to prove that λ(Nj ) = 0, for
all j ≥ 1. Fix for the moment j ≥ 1. Since hp ≥ h ≥ 0, it follows that
\[ \inf_{x\in N_j} h_p(x) \ge \frac{1}{j}, \quad \forall\, p \ge 1, \]
so by Lemma 6.1 we get the inequality
\[ \frac{\lambda(N_j)}{j} \le \int_a^b h_p(x)\,dx, \quad \forall\, p \ge 1, \]
so by (β) we indeed get λ(Nj ) = 0.
Define the set $S = \bigcup_{p=1}^\infty \Delta_p$.
Claim 2: If y ∈ [a, b] r (N ∪ S), then f is continuous at y.
Fix y ∈ [a, b] ∖ (N ∪ S). In order to prove that f is continuous at y, we must find,
for every ε > 0, some open interval $J_\varepsilon \ni y$, such that
(20) |f (z) − f (y)| < ε, ∀ z ∈ Jε ∩ [a, b].
Since y ∉ N , we have limp→∞ hp (y) = 0. Fix ε and choose p ≥ 1, such that
0 ≤ hp (y) < ε. Write the partition ∆p as
∆p = (a = x0 < x1 < · · · < xn = b).
Using the fact that y ∉ ∆p , if we define $k = \min\big\{ j \in \{1,\dots,n\} : y < x_j \big\}$, we
have y ∈ (xk−1 , xk ). In particular, we get
\[ f^{\Delta_p}(y) = \sup_{t\in[x_{k-1},x_k]} f(t) \quad\text{and}\quad f_{\Delta_p}(y) = \inf_{s\in[x_{k-1},x_k]} f(s), \]
so the inequality 0 ≤ hp (y) < ε gives
\[ \Big[ \sup_{t\in[x_{k-1},x_k]} f(t) \Big] - \Big[ \inf_{s\in[x_{k-1},x_k]} f(s) \Big] < \varepsilon, \]
so if we choose Jε = (xk−1 , xk ), we clearly have (20).


Now we are done, because using the fact that S is countable, it follows that S
is negligible, so N ∪ S is also negligible. Since by Claim 2, we have Df ⊂ N ∪ S,
it follows that Df itself is negligible.
(ii) ⇒ (i). Assume now the discontinuity set Df is negligible, and let us prove
that f is Riemann integrable. Fix a sequence $(\Delta_p)_{p=1}^\infty$ of partitions of [a, b], with
∆1 ⊂ ∆2 ⊂ . . . , and limp→∞ |∆p | = 0 (see footnote 10). As before, we define the set $S = \bigcup_{p=1}^\infty \Delta_p$.
Claim 3: For any point y ∈ [a, b] ∖ (Df ∪ S), one has the equalities
\[ \lim_{p\to\infty} f^{\Delta_p}(y) = \lim_{p\to\infty} f_{\Delta_p}(y) = f(y). \]

Fix for the moment ε > 0. Since f is continuous at y, there exists some δε > 0,
such that
(21) |f (z) − f (y)| < ε, ∀ z ∈ (y − δε , y + δε ) ∩ [a, b].
Choose now q ≥ 1, such that |∆q | < δε . Write ∆q = (a = x0 < x1 < · · · < xn = b).
Using the fact that y ∉ ∆q , we can find k ∈ {1, . . . , n} such that y ∈ (xk−1 , xk ).
Since xk − xk−1 < δε , we have the inclusion [xk−1 , xk ] ⊂ (y − δε , y + δε ), so by (21)
we immediately get
\[ f(y) \le f^{\Delta_q}(y) = \sup_{z\in[x_{k-1},x_k]} f(z) \le f(y) + \varepsilon; \]
\[ f(y) \ge f_{\Delta_q}(y) = \inf_{z\in[x_{k-1},x_k]} f(z) \ge f(y) - \varepsilon. \]
Since the sequence $\big( f^{\Delta_p}(y) \big)_{p=1}^\infty$ is non-increasing, and the sequence $\big( f_{\Delta_p}(y) \big)_{p=1}^\infty$ is
non-decreasing, the above inequalities give
\[ |f^{\Delta_p}(y) - f(y)| \le \varepsilon \quad\text{and}\quad |f_{\Delta_p}(y) - f(y)| \le \varepsilon, \quad\text{for all } p \ge q, \]
and the Claim follows.
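The convergence in Claim 3 can be observed on a concrete case. The sketch below assumes the interval [0, 1], the dyadic partitions into 2^p equal pieces (which are nested and have mesh tending to 0), and a non-decreasing f, so that the supremum and infimum over a piece are attained at its endpoints; the function name `gap_at` and the sample point y = 1/3 are illustrative.

```python
from fractions import Fraction
from math import floor

def gap_at(f, y, p):
    """Difference between the upper and lower step functions at y.

    Assumptions: the partition is the dyadic one of [0, 1] into 2**p equal
    pieces, f is non-decreasing, and y is not a partition point.
    """
    n = 2 ** p
    k = floor(y * n) + 1  # y lies in the open interval (x_{k-1}, x_k)
    left, right = Fraction(k - 1, n), Fraction(k, n)
    return f(right) - f(left)

def square(t):
    return t * t

y = Fraction(1, 3)  # never a dyadic point
gaps = [gap_at(square, y, p) for p in range(1, 11)]
```

The list `gaps` is non-increasing and tends to 0, matching the monotonicity of the two sequences in the proof and the continuity of f at y.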
Going back to the proof of the Theorem, we will now prove that f satisfies
condition (iii) in Proposition 6.6. Fix ε > 0. Since Df ∪ S is also negligible, using
regularity from above with respect to open sets, we can find an open set E ⊂ R
10 Recall that, for a partition ∆ = (a = x0 < · · · < xn = b), the number |∆| is defined as
$|\Delta| = \max\big\{ x_k - x_{k-1} : 1 \le k \le n \big\}$.

such that E ⊃ Df ∪ S, and λ(E) < ε. Define the compact set A = [a, b] r E, and
put B = [a, b] ∩ E. We clearly have
(22) λ(B) ≤ λ(E) < ε.
Define the sequence $(h_p)_{p=1}^\infty$ by $h_p = f^{\Delta_p} - f_{\Delta_p}$. Since A ∩ ∆p = ∅, it follows that
$h_p|_A$ is continuous, for each p ≥ 1. Since A ∩ (Df ∪ S) = ∅, by Claim 3, we know
that limp→∞ hp (y) = 0, ∀ y ∈ A. Since $(h_p)_{p=1}^\infty$ is monotone, by Dini’s Theorem
(see ??) it follows that
\[ \lim_{p\to\infty} \Big[ \max_{y\in A} h_p(y) \Big] = 0. \]

In particular, there exists pε ≥ 1, such that


(23) hpε (y) ≤ ε, ∀ y ∈ A.
Let
\[ M = \sup_{x\in[a,b]} f(x) \quad\text{and}\quad m = \inf_{x\in[a,b]} f(x). \]

Using Lemma 6.1 for hpε and the sets A and B, combined with (22), we have
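\[ \int_a^b h_{p_\varepsilon}(x)\,dx \le (b-a) \cdot \sup_{y\in A} h_{p_\varepsilon}(y) + \lambda^*(B) \cdot \sup_{z\in B} h_{p_\varepsilon}(z) \le \varepsilon(b-a) + \lambda^*(B)(M-m) \le \varepsilon(b-a+M-m). \]
Since $h_{p_\varepsilon} \ge h_p \ge 0$, for all p ≥ pε , we get the inequalities
\[ 0 \le \int_a^b h_p(x)\,dx \le \varepsilon(b-a+M-m), \quad \forall\, p \ge p_\varepsilon. \]
The above argument proves that $\lim_{p\to\infty} \int_a^b h_p(x)\,dx = 0$, i.e.
\[ \lim_{p\to\infty} \int_a^b \big[ f^{\Delta_p}(x) - f_{\Delta_p}(x) \big]\,dx = 0. \]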

By Proposition 6.6, it follows that f is Riemann integrable. 

Exercise 13. Prove that a Riemann integrable function f : [a, b] → R is
Lebesgue measurable.
Hint: Use a sequence of partitions $(\Delta_p)_{p=1}^\infty$, with ∆1 ⊂ ∆2 ⊂ . . . , and limp→∞ |∆p | = 0. Use
the arguments given in the proof of the implication (ii) ⇒ (i), to find a negligible set N ⊂ [a, b],
such that
\[ \lim_{p\to\infty} f_{\Delta_p}(x) = f(x), \quad \forall\, x \in [a,b] \smallsetminus N. \]
The sequence $(f_{\Delta_p})_{p=1}^\infty$ is non-decreasing, so it has a point-wise limit, say g, which is Lebesgue
measurable. Use the fact that
\[ f(x) = g(x), \quad \forall\, x \in [a,b] \smallsetminus N, \]
to show that f itself is Lebesgue measurable.


Exercise 14. Let K ⊂ [0, 1] be a compact set with K ∩ Q = ∅, and λ(K) > 0
(see Exercise 7 for the existence of such sets). Prove that the characteristic function
$f = \kappa_K$ : [0, 1] → R is not Riemann integrable. In fact, f cannot be almost Riemann
integrable either.
Hint: Examine the discontinuity set Df , and prove that K ⊂ Df .
Exercise 15. Let fn : [a, b] → R, n ≥ 1, be a sequence of Riemann integrable
functions. Consider the product space $P = \prod_{n=1}^\infty \operatorname{Ran} f_n$, equipped with the product
topology (the sets Ran fn , n ≥ 1, are equipped with the topology induced from
R), and the function F : [a, b] → P , defined by $F(x) = \big( f_n(x) \big)_{n=1}^\infty$. Prove that, for
every bounded continuous function g : P → R, the composition g ◦ F : [a, b] → R is
Riemann integrable. In other words, the result of a bounded continuous operation,
involving a sequence of Riemann integrable functions, is again a Riemann integrable
function.
Hint: Examine the relationship between the discontinuity set $D_{g\circ F}$ and the discontinuity sets
$D_{f_n}$, n ≥ 1.
Exercise 16. Let M be an arbitrary subset of [a, b], and let f : [a, b] → R be a
Riemann integrable function, such that $f \le \kappa_M$. Prove the inequality
\[ \int_a^b f(x)\,dx \le \lambda^*(M). \]
Hint: Consider the function g : [a, b] → R defined by g(x) = max{f (x), 0}. Then $f \le g \le \kappa_M$,
and g is still Riemann integrable. Apply Lemma 6.1 (the first inequality) to the function 1 − g.
Exercise 17*. Let f : [a, b] → R be a bounded function. Prove that the following
are equivalent:
(i) f is Riemann integrable;
(ii) for every ε > 0, there exist continuous functions g, h : [a, b] → R with
g ≥ f ≥ h, and $\int_a^b [g(x) - h(x)]\,dx < \varepsilon$;
(iii) for every ε > 0, there exist Riemann integrable functions g, h : [a, b] → R
with g ≥ f ≥ h, and $\int_a^b [g(x) - h(x)]\,dx < \varepsilon$.
Hints: For the implication (i) ⇒ (ii) analyze first the particular case when f = κ J , with J
a sub-interval of [a, b]. Then analyze the functions of the type f ∆ and f∆ . For the implication
(iii) ⇒ (i), analyze the relationship among lower/upper Darboux sums of f , g and h.
Comment. The statement of Theorem 6.1 shows that, apart from trivial
cases, the problem of checking that a function f : [a, b] → R is Riemann integrable,
is a rather difficult one. The main difficulty arises from the fact that, if N ⊂ [a, b]
is a negligible set, and $f|_{[a,b]\smallsetminus N}$ is continuous, then f need not be continuous
at all points in [a, b] ∖ N . For instance, if we consider the characteristic function
$f = \kappa_{\mathbb{Q}\cap[a,b]}$ of the rationals in [a, b], and N = Q ∩ [a, b], then clearly N is negligible,
$f|_{[a,b]\smallsetminus N}$ is continuous (because it is constant zero), but Df = [a, b].
As earlier suggested, in the hope that such an anomaly can be eliminated, it is
reasonable to consider the slightly weaker notion of almost Riemann integrability.
In the remainder of this section, we take a closer look at this notion, and we will
eventually show (see Theorem 6.2) that this indeed removes the above anomaly.
We begin with an “almost” version of Exercise 17.
Lemma 6.2. For a function f : [a, b] → R, the following are equivalent:
(i) f is almost Riemann integrable;
(ii) for every ε > 0, there exist continuous functions g, h : [a, b] → R with
g ≥ f ≥ h a.e., and $\int_a^b [g(x) - h(x)]\,dx < \varepsilon$;
(iii) for every ε > 0, there exist Riemann integrable functions g, h : [a, b] → R
with g ≥ f ≥ h a.e., and $\int_a^b [g(x) - h(x)]\,dx < \varepsilon$.
Proof. The implication (i) ⇒ (iii) is trivial.
The implication (iii) ⇒ (ii) follows from Exercise 17.
We now prove (ii) ⇒ (i). Assume f has property (ii). For each integer n ≥ 1,
choose continuous functions gn , hn : [a, b] → R, such that gn ≥ f ≥ hn , a.e., and
$\int_a^b [g_n(x) - h_n(x)]\,dx \le 1/n$. Define the functions Gn , Hn : [a, b] → R, n ≥ 1, by
\[ G_n(x) = \min\big\{ g_1(x), \dots, g_n(x) \big\}, \]
\[ H_n(x) = \max\big\{ h_1(x), \dots, h_n(x) \big\}. \]
It is clear that
(α) Gm ≥ f ≥ Hn , a.e., ∀ m, n ≥ 1;
(β) G1 ≥ G2 ≥ . . . and H1 ≤ H2 ≤ . . . ;
(γ) $\int_a^b [G_n(x) - H_n(x)]\,dx \le \int_a^b [g_n(x) - h_n(x)]\,dx \le 1/n$, ∀ n ≥ 1.
Notice that, since the Gm ’s and the Hn ’s are continuous, by Exercise ??, we also
have
(α′) Gm ≥ Hn (everywhere!), ∀ m, n ≥ 1.
Use (β) to define the functions G, H : [a, b] → R, by
\[ G(x) = \lim_{n\to\infty} G_n(x) \quad\text{and}\quad H(x) = \lim_{n\to\infty} H_n(x), \quad \forall\, x \in [a,b], \]
so by (α′) we clearly have Gn ≥ G ≥ H ≥ Hn , ∀ n ≥ 1. Using then (γ), by
Exercise 17 it follows that both G and H are Riemann integrable. Moreover, we
have G − H ≥ 0 and
\[ 0 \le \int_a^b [G(x) - H(x)]\,dx \le \int_a^b [G_n(x) - H_n(x)]\,dx \le 1/n, \quad \forall\, n \ge 1, \]
which forces $\int_a^b [G(x) - H(x)]\,dx = 0$, so by Exercise ??, we get G = H, a.e. By
(α) it follows that f = G, a.e., so f is indeed almost Riemann integrable. □
We are now in position to prove the “almost” version of Theorem 6.1.
Theorem 6.2. Let f : [a, b] → R be a bounded function. The following are
equivalent:
(i) f is almost Riemann integrable;
(ii) there exists a negligible set N ⊂ [a, b] such that $f|_{[a,b]\smallsetminus N}$ is continuous.

Proof. (i) ⇒ (ii). Assume f is almost Riemann integrable, so there exists a
Riemann integrable function g : [a, b] → R, such that f = g, a.e. By Theorem 6.1,
the discontinuity set Dg is negligible. Take
M = {x ∈ [a, b] : f (x) ≠ g(x)}.
Since f = g, a.e., the set M is negligible, and so is the set N = M ∪ Dg . On
the one hand, since Dg ⊂ N , the restriction $g|_{[a,b]\smallsetminus N}$ is continuous. On the other
hand, since M ⊂ N , we have $f|_{[a,b]\smallsetminus N} = g|_{[a,b]\smallsetminus N}$, so (ii) follows.
(ii) ⇒ (i). We are going to imitate the proof of Theorem 6.1, with some minor
modifications. Fix N ⊂ [a, b] negligible, such that $f|_{[a,b]\smallsetminus N}$ is continuous. Fix
also a sequence $(\Delta_p)_{p=1}^\infty$ of partitions, with ∆1 ⊂ ∆2 ⊂ . . . , and limp→∞ |∆p | = 0.
Put $S = \bigcup_{p=1}^\infty \Delta_p$. Since S is countable, the set N ∪ S is still negligible. We put
T = [a, b] ∖ (N ∪ S), and we define the analogues of the functions $f^{\Delta_p}$ and $f_{\Delta_p}$ as
follows. Write each partition as $\Delta_p = (a = x^p_0 < x^p_1 < \dots < x^p_{n_p} = b)$, and define,
for each k ∈ {1, . . . , np }, the numbers
\[ M^p_k = \sup\big\{ f(t) : t \in [x^p_{k-1}, x^p_k] \cap T \big\} \quad\text{and}\quad m^p_k = \inf\big\{ f(t) : t \in [x^p_{k-1}, x^p_k] \cap T \big\}. \]
We then define, for each p ≥ 1, the functions
\[ g_p = m^p_1 \cdot \kappa_{[x^p_0,x^p_1]} + m^p_2 \cdot \kappa_{(x^p_1,x^p_2]} + \dots + m^p_{n_p} \cdot \kappa_{(x^p_{n_p-1},x^p_{n_p}]}, \]
\[ g^p = M^p_1 \cdot \kappa_{[x^p_0,x^p_1]} + M^p_2 \cdot \kappa_{(x^p_1,x^p_2]} + \dots + M^p_{n_p} \cdot \kappa_{(x^p_{n_p-1},x^p_{n_p}]}. \]
Note that we have the inequalities $g^p(x) \ge f(x) \ge g_p(x)$, ∀ x ∈ T , which give
(24) $g^p \ge f \ge g_p$, a.e., ∀ p ≥ 1.
It is obvious that $g^p$ and $g_p$, p ≥ 1, are all Riemann integrable. We are now going
to estimate the integrals $\int_a^b [g^p(x) - g_p(x)]\,dx$. Put $h_p = g^p - g_p$, p ≥ 1. First we
observe that, since $f|_T$ is continuous, and T ∩ ∆p = ∅, ∀ p ≥ 1, we clearly have the
equalities $\lim_{p\to\infty} g^p(x) = \lim_{p\to\infty} g_p(x) = f(x)$, ∀ x ∈ T , which give
(25) \[ \lim_{p\to\infty} h_p(x) = 0, \quad \forall\, x \in T. \]

Fix some ε > 0, and use regularity from above, to find an open set D with D ⊃ N ∪ S
and λ(D) < ε. Take the compact set A = [a, b] ∖ D. Note that $f|_A$ is continuous,
since A ⊂ [a, b] ∖ N . Note also that, since A ⊂ [a, b] ∖ S, the functions $g^p|_A$ and
$g_p|_A$ are also continuous, and so will be $h_p|_A$, for every p ≥ 1. Since $\big( g^p(x) \big)_{p=1}^\infty$
is non-increasing, and $\big( g_p(x) \big)_{p=1}^\infty$ is non-decreasing, for all x, it follows that the
sequence $(h_p)_{p=1}^\infty$ is monotone, so by Dini’s Theorem, (25) gives
\[ \lim_{p\to\infty} \Big[ \max_{x\in A} h_p(x) \Big] = 0. \]

In particular, there exists some pε ≥ 1, such that


(26) hp (x) ≤ ε, ∀ p ≥ pε , x ∈ A.
Put B = [a, b] ∖ A, and take $M = \sup_{x\in[a,b]} f(x)$ and $m = \inf_{x\in[a,b]} f(x)$. Using
the inclusion B ⊂ D, we get λ∗(B) ≤ λ(D) ≤ ε, so by Lemma 6.1 (the functions
hp , p ≥ 1, are clearly non-negative), combined with (26), we get
\[ \int_a^b h_p(x)\,dx \le (b-a) \cdot \sup_{x\in A} h_p(x) + \lambda^*(B) \cdot \sup_{x\in B} h_p(x) \le (b-a)\varepsilon + \lambda^*(B)(M-m) \le \varepsilon(b-a+M-m), \quad \forall\, p \ge p_\varepsilon. \]
This estimate then proves that $\lim_{p\to\infty} \int_a^b h_p(x)\,dx = 0$, i.e.
\[ \lim_{p\to\infty} \int_a^b [g^p(x) - g_p(x)]\,dx = 0. \]
Combining this with (24), and applying Lemma 6.2, yields the fact that f is almost
Riemann integrable. 
Comment. The hypothesis that f is bounded can be replaced with a slightly
weaker one, which assumes that f is almost bounded, meaning that there exists a
negligible set U ⊂ [a, b], such that $f|_{[a,b]\smallsetminus U}$ is bounded.
Exercise 18. Let fn : [a, b] → R, n ≥ 1, be almost Riemann integrable functions,
such that
(i) fn ≥ fn+1 ≥ 0, a.e., ∀ n ≥ 1;
(ii) limn→∞ fn (x) = 0, for “almost all” x ∈ [a, b], i.e. there exists a negligible
set N ⊂ [a, b], such that limn→∞ fn (x) = 0, ∀ x ∈ [a, b] ∖ N .
Prove that
\[ \lim_{n\to\infty} {\approx}\!\!\int_a^b f_n(x)\,dx = 0. \]
Lectures 26-29

7. Measure theory on locally compact spaces


Earlier in this chapter we discussed the construction of (outer) measures, start-
ing with more primitive objects: semiring measures. The main application was the
construction of the (outer) Lebesgue measure on Rn . In this section we describe an
alternative construction, which has as its starting point another primitive object:
a regular content. The idea is again to start with the measure defined on a “small”
class of sets, extend it to an outer measure, and then use the Caratheodory con-
struction. Among other applications, we will get an alternative construction of the
(outer) Lebesgue measure on Rn .
Definition. Let X be a locally compact space. Denote by CX the collection
of all compact subsets of X. A content on X, is a map ω : CX → [0, ∞), with the
following properties:
(i) ω(∅) = 0;
(ii) if K, L ∈ CX are such that K ⊂ L, then ω(K) ≤ ω(L);
(iii) ω(K ∪ L) ≤ ω(K) + ω(L), for all K, L ∈ CX ;
(iv) ω(K ∪ L) = ω(K) + ω(L), for all K, L ∈ CX , with K ∩ L = ∅.
Comments. Note that ω takes finite values. The collection CX does not have
any nice set-arithmetic properties, except for the following: (i) the union of any
finite collection of sets in CX is again in CX ; (ii) an arbitrary intersection of sets in
CX is again in CX .

Examples 7.1. A. If µ is a measure on Bor(X), then $\mu|_{\mathcal{C}_X}$ is a content.
B. Take X = R, and for a compact subset K ⊂ R, define
\[ \omega(K) = \begin{cases} 1 & \text{if } 0 \in \mathrm{Int}(K) \\ 0 & \text{if } 0 \notin \mathrm{Int}(K) \end{cases} \]
It is obvious that ω is a content on R. Notice however that if we consider the
compact sets $K_n = [-\frac{1}{n}, \frac{1}{n}]$, then $\omega\big( \bigcap_{n=1}^\infty K_n \big) = 0$, but ω(Kn ) = 1, ∀ n ≥ 1. This
shows that, in general, a content cannot be extended to a measure on Bor(X).
One useful property, which will be invoked several times in this section, is
contained in the following:
Exercise 1. Let X be a locally compact space, let K ⊂ X be compact, and let
D1 , D2 ⊂ X be open subsets, with K ⊂ D1 ∪ D2 . Show there exist compact sets
K1 and K2 , such that K1 ⊂ D1 , K2 ⊂ D2 , and K = K1 ∪ K2 .
As Example 7.1.B suggests, one obstruction for the extendability of a content on
X, to a measure on Bor(X), is its behaviour with respect to interiors. The following
notion isolates an important property, which will be shown to be sufficient for the
extendability property.

Definition. Let X be a locally compact space. A content ω on X is said to
be regular, if for any K ∈ CX , one has the equality
\[ \omega(K) = \inf\big\{ \omega(L) : L \in \mathcal{C}_X,\ \mathrm{Int}(L) \supset K \big\}. \]

The following exercise shows how the lack of regularity can always be repaired.
Exercise 2. Let X be a locally compact space, and let ω be a content on X.
Define ω̆ : CX → [0, ∞), by
\[ \breve{\omega}(K) = \inf\big\{ \omega(L) : L \in \mathcal{C}_X,\ \mathrm{Int}(L) \supset K \big\}, \quad \forall\, K \in \mathcal{C}_X. \]


Prove that:
(i) ω̆ is a regular content on X;
(ii) ω̆(K) ≥ ω(K), ∀ K ∈ CX ;
(iii) if η is a regular content on X, with η(K) ≥ ω(K), ∀ K ∈ CX , then
η(K) ≥ ω̆(K), ∀ K ∈ CX ;
(iv) ω is regular, if and only if ω̆ = ω.
Definition. With the notations from Exercise 2, the regular content ω̆ is called
the regularization of ω.
Theorem 7.1. Let X be a locally compact space, and let ω be a content on X.
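Denote by TX the collection of all open subsets of X. Define the map ω̂ : TX →
[0, ∞] by
\[ \hat{\omega}(D) = \sup\big\{ \omega(K) : K \in \mathcal{C}_X,\ K \subset D \big\}, \quad \forall\, D \in \mathcal{T}_X, \]
and define the map ω∗ : P(X) → [0, ∞], by
\[ \omega^*(A) = \inf\big\{ \hat{\omega}(D) : D \in \mathcal{T}_X,\ D \supset A \big\}, \quad \forall\, A \subset X. \]
Then ω∗ is an outer measure on X.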
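The two-step construction in Theorem 7.1 (supremum over compact subsets, then infimum over open supersets) can be run by brute force in a toy case. The sketch below assumes that X is a finite set with the discrete topology, so that every subset is both compact and open; the weights and all function names are illustrative assumptions, not part of the notes.

```python
from itertools import combinations

def subsets(s):
    """All subsets of a finite set, as frozensets."""
    items = list(s)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def make_outer_measure(points, omega):
    """Build the maps omega_hat and omega_star from a content omega.

    Assumption: X = points is finite and discrete, so the compact sets and
    the open sets are both given by all subsets of X.
    """
    X = frozenset(points)

    def omega_hat(D):
        # supremum of omega over the compact sets contained in D
        return max(omega(K) for K in subsets(D))

    def omega_star(A):
        # infimum of omega_hat over the open sets containing A
        A = frozenset(A)
        return min(omega_hat(D) for D in subsets(X) if A <= D)

    return omega_hat, omega_star

weights = {"a": 1, "b": 2, "c": 4}

def omega(K):
    return sum(weights[x] for x in K)

omega_hat, omega_star = make_outer_measure(weights, omega)
value = omega_star({"a", "c"})
```

In this degenerate setting both steps return the content itself, so ω∗ agrees with ω on every subset; the interest of the general construction lies in spaces where compact and open sets differ.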

Proof. We begin by collecting the useful properties of the map ω̂.


Claim: The map ω̂ has the following properties
(i) ω̂(∅) = 0;
(ii) ω̂ is monotone, i.e. whenever D, E ∈ TX satisfy D ⊂ E, it follows
that ω̂(D) ≤ ω̂(E);
(iii) ω̂ is σ-sub-additive, i.e., for any sequence $(D_n)_{n=1}^\infty \subset \mathcal{T}_X$, one has
the inequality $\hat{\omega}\big( \bigcup_{n=1}^\infty D_n \big) \le \sum_{n=1}^\infty \hat{\omega}(D_n)$.
Properties (i) and (ii) are trivial.

To prove property (iii), let us start with some sequence $(D_n)_{n=1}^\infty$ of open sets,
and let us denote for simplicity the union $\bigcup_{n=1}^\infty D_n$ by D. Start with some arbitrary
compact set K ⊂ D. Using compactness, there exists some index p ≥ 1, such that
K ⊂ D1 ∪ D2 ∪ · · · ∪ Dp . Use Exercise 1 (and induction) to find compact sets
K1 ⊂ D1 , K2 ⊂ D2 , . . . , Kp ⊂ Dp , such that K = K1 ∪ K2 ∪ · · · ∪ Kp . We then
clearly have the inequalities
\[ \omega(K) \le \sum_{n=1}^p \omega(K_n) \le \sum_{n=1}^p \hat{\omega}(D_n) \le \sum_{n=1}^\infty \hat{\omega}(D_n). \]
Since we have
\[ \omega(K) \le \sum_{n=1}^\infty \hat{\omega}(D_n), \quad\text{for all } K \in \mathcal{C}_X \text{ with } K \subset D, \]
by the definition of ω̂, we immediately get
\[ \hat{\omega}(D) \le \sum_{n=1}^\infty \hat{\omega}(D_n). \]
Having proven the Claim, we now check the conditions in the definition of an
outer measure. It is clear that ω ∗ (∅) = 0. It is also clear, from the definition, and
property (ii) from the Claim, that
A ⊂ B =⇒ ω ∗ (A) ≤ ω ∗ (B).
Finally, we need to show σ-sub-additivity, i.e.
(1) \[ \omega^*\Big( \bigcup_{n=1}^\infty A_n \Big) \le \sum_{n=1}^\infty \omega^*(A_n). \]

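Start with some sequence $(A_n)_{n=1}^\infty$ of subsets of X. Of course, if one of the terms
in the right hand side of (1) is infinite, there is nothing to prove. Assume that
ω∗(An ) < ∞, ∀ n ≥ 1. Fix some ε > 0, and choose, for each n ≥ 1, an open set
Dn ⊃ An , such that $\hat{\omega}(D_n) \le \omega^*(A_n) + \frac{\varepsilon}{2^n}$. Put $D = \bigcup_{n=1}^\infty D_n$. Using part (iii) of
the Claim, we have
\[ \hat{\omega}(D) \le \sum_{n=1}^\infty \hat{\omega}(D_n) \le \sum_{n=1}^\infty \Big[ \omega^*(A_n) + \frac{\varepsilon}{2^n} \Big] = \varepsilon + \sum_{n=1}^\infty \omega^*(A_n). \]
Since we obviously have the inclusion $D \supset \bigcup_{n=1}^\infty A_n$, the above inequality gives
\[ \omega^*\Big( \bigcup_{n=1}^\infty A_n \Big) \le \hat{\omega}(D) \le \varepsilon + \sum_{n=1}^\infty \omega^*(A_n). \]
Now we have
\[ \omega^*\Big( \bigcup_{n=1}^\infty A_n \Big) \le \varepsilon + \sum_{n=1}^\infty \omega^*(A_n), \]
for all ε > 0, so the inequality (1) follows. □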
Definition. Let X be a locally compact space, and let ω be a content on X.
The outer measure ω ∗ on X, defined in Theorem 7.1, is called the outer measure
induced by ω.
Remarks 7.1. Let X be a locally compact space, let ω be a content on X,
and let ω ∗ be the outer measure induced by ω.
A. The map ω̂ : TX → [0, ∞], defined in the statement of Theorem 7.1, is given
by $\hat{\omega} = \omega^*|_{\mathcal{T}_X}$. To see that this is the case, start with some open set D. On the
one hand, by the definition of ω∗, we know that
\[ \omega^*(D) = \inf\big\{ \hat{\omega}(E) : E \in \mathcal{T}_X,\ E \supset D \big\}, \]
which (using E = D) immediately gives the inequality ω∗(D) ≤ ω̂(D). On the
other hand, using property (ii) from the Claim stated in the proof, we also know
that
ω̂(E) ≥ ω̂(D), for all E ∈ TX with E ⊃ D,
which gives the reverse inequality, ω∗(D) ≥ ω̂(D).
B. As a consequence of the above remark, we get the fact that ω∗ is regular
from above, with respect to the collection TX of all open sets in X, i.e.
\[ \omega^*(A) = \inf\big\{ \omega^*(D) : D \in \mathcal{T}_X,\ D \supset A \big\}, \quad \forall\, A \subset X. \]

C. If one denotes by ω̆ the regularization of ω (see Exercise 2), then ω̆∗ = ω∗.
In fact, using the notations from Theorem 7.1, we have the equality $\hat{\omega} = \hat{\breve{\omega}}$. Indeed,
on the one hand, since we have the inequality
ω̆(K) ≥ ω(K), ∀ K ∈ CX ,
it follows immediately by the definitions, that
\[ \hat{\breve{\omega}}(D) \ge \hat{\omega}(D), \quad \forall\, D \in \mathcal{T}_X. \]
To prove the other inequality, fix some open set D ⊂ X. Suppose K ⊂ D is some
compact subset. Using the well-known properties of locally compact spaces, there
exists some compact set L, with
K ⊂ Int(L) ⊂ L ⊂ D,
so by the definitions of ω̂ and ω̆, we get
ω̂(D) ≥ ω(L) ≥ ω̆(K).
Since we have the inequality
ω̂(D) ≥ ω̆(K), for all K ∈ CX with K ⊂ D,
taking supremum in the right hand side yields
\[ \hat{\omega}(D) \ge \sup\big\{ \breve{\omega}(K) : K \in \mathcal{C}_X,\ K \subset D \big\} = \hat{\breve{\omega}}(D). \]


Proposition 7.1. Let X be a locally compact space, let ω be a content on X,
and let ω∗ be the outer measure induced by ω. If we denote by ω̆ the regularization
of ω, then one has the equality
\[ \omega^*|_{\mathcal{C}_X} = \breve{\omega}. \]
Proof. Using Remark 7.1.C, we can assume that ω is regular, and in this case
we need to prove that $\omega^*|_{\mathcal{C}_X} = \omega$. Start with some compact set K ⊂ X. By the
definition of ω∗, using the notations from Theorem 7.1, we know that
(2) \[ \omega^*(K) = \inf\big\{ \hat{\omega}(D) : D \in \mathcal{T}_X,\ D \supset K \big\}. \]
It is clear that, for every open set D ⊃ K, we have the inequality
ω̂(D) ≥ ω(K),
so taking infimum in the left hand side, and using (2), immediately gives the in-
equality
ω ∗ (K) ≥ ω(K).
To prove the reverse inequality, we start by fixing ε > 0, and we use regularity to
find some compact set L with K ⊂ Int(L), and ω(L) ≤ ω(K) + ε. Consider the
open set D = Int(L). On the one hand, for every compact set F ⊂ D, we have
the obvious inclusion F ⊂ L, which gives ω(F ) ≤ ω(L). Taking supremum over all
compact sets F ⊂ D then gives ω̂(D) ≤ ω(L). By the choice of L, by the definition
of ω ∗ , and using the inclusion D ⊃ K, we then get
ω ∗ (K) ≤ ω̂(D) ≤ ω(L) ≤ ω(K) + ε.
Since the inequality
ω ∗ (K) ≤ ω(K) + ε,
holds for all ε > 0, we then must have ω ∗ (K) ≤ ω(K). 

The above result gives a nice characterization for the regularity of a content,
in terms of the induced outer measure.
Corollary 7.1. Let X be a locally compact space. A content ω is regular, if
and only if $\omega^*|_{\mathcal{C}_X} = \omega$.
Proof. Immediate from Proposition 7.1 and Exercise 2. □


Theorem 7.2. Let X be a locally compact space, let ω be a content on X,
and let ω ∗ be the outer measure induced by ω. Then every open set D ⊂ X is
ω ∗ -measurable.
Proof. Fix an open set D ⊂ X. We need to prove (see Section 5) that D
“sharply cuts” every subset of X, which is equivalent to the fact that, for every
A ⊂ X, one has the inequality:
(3) ω ∗ (A) ≥ ω ∗ (A ∩ D) + ω ∗ (A r D).
This will be shown in several steps.
Claim 1: For any open set E ⊂ X, and any compact set K ⊂ E, one has
the inequality
ω ∗ (E) ≥ ω(K) + ω ∗ (E r K).
To prove this inequality, we first note that, since both E and E r K are open, by
Remark 7.1.A, we have the equalities ω ∗ (E) = ω̂(E) and ω ∗ (E r K) = ω̂(E r K),
where ω̂ : TX → [0, ∞] is the map defined in the statement of Theorem 7.1. If
L ⊂ E r K is an arbitrary compact set, then we obviously have K ∩ L = ∅, so
using the inclusion K ∪ L ⊂ E, we get
ω(K) + ω(L) = ω(K ∪ L) ≤ ω̂(E) = ω ∗ (E),
which then gives
ω ∗ (E) − ω(K) ≥ ω(L), for all L ∈ CX with L ⊂ E r K.
Taking supremum in the right hand side then gives
$\omega^*(E) - \omega(K) \ge \sup\{\omega(L) : L \in \mathcal{C}_X,\ L \subset E \smallsetminus K\} = \hat\omega(E \smallsetminus K) = \omega^*(E \smallsetminus K),$


and the Claim follows.


Claim 2: The inequality (3) holds for all open subsets A ⊂ X.
Assume A is open. If the left hand side of (3) is infinite, there is nothing to prove.
Assume ω ∗ (A) < ∞, so both ω ∗ (A ∩ D) and ω ∗ (A r D) are also finite. Since A ∩ D
is open, we have
(4) $\omega^*(A \cap D) = \hat\omega(A \cap D) = \sup\{\omega(K) : K \in \mathcal{C}_X,\ K \subset A \cap D\}.$
Fix for the moment a compact subset K ⊂ A ∩ D. Using Claim 1 we have the
inequality
ω ∗ (A) ≥ ω(K) + ω ∗ (A r K).
Since we obviously have the inclusion A r K ⊃ A r D, the above inequality gives
ω∗(A) ≥ ω(K) + ω∗(A ∖ D), which can be re-written as
ω ∗ (A) − ω ∗ (A r D) ≥ ω(K), for all K ∈ CX with K ⊂ A ∩ D.
Taking supremum in the right hand side, and using (4), we immediately get the
desired inequality (3).
240 LECTURES 26-29

We now proceed with the proof of (3) for arbitrary A’s. Fix A, and consider
an arbitrary open set E ⊃ A. By Claim 2, we have
ω ∗ (E) ≥ ω ∗ (E ∩ D) + ω ∗ (E r D).
Using the obvious inclusions E ∩ D ⊃ A ∩ D and E r D ⊃ A r D, we then get
ω ∗ (E) ≥ ω ∗ (A ∩ D) + ω ∗ (A r D).
The desired inequality (3) follows now by taking infimum in the left hand side, and
using Remark 7.1.B. 
The most important consequence of Theorem 7.2 is the following.
Corollary 7.2. Let X be a locally compact space, and let ω be a regular
content on X. Then ω can be extended uniquely to a measure µω on Bor(X), with
the following properties.
(i) µω is regular from above, with respect to the collection TX of all open sets,
that is
$\mu_\omega(B) = \inf\{\mu_\omega(D) : D \in \mathcal{T}_X,\ D \supset B\}, \quad \forall\, B \in \mathrm{Bor}(X);$
(ii) for every open set D ⊂ X, one has the equality
$\mu_\omega(D) = \sup\{\mu_\omega(K) : K \in \mathcal{C}_X,\ K \subset D\}.$
Conversely, if µ is a measure on Bor(X) with properties (i) and (ii), and such that
µ(K) < ∞, ∀ K ∈ CX , then $\mu|_{\mathcal{C}_X}$ is a regular content.

Proof. If we denote by mω∗ (X) the σ-algebra of ω∗-measurable sets, then
Theorem 7.2 gives the inclusion Bor(X) ⊂ mω∗ (X), so the existence follows by
taking $\mu_\omega = \omega^*|_{\mathrm{Bor}(X)}$. The fact that µω has properties (i) and (ii) is trivial, by
construction and by Remarks 7.1 and Proposition 7.1.
The uniqueness is trivial, since property (ii) uniquely defines µω on open sets,
and (i) then uniquely defines µω on all Borel sets.
To prove the second assertion, assume µ is a measure on Bor(X) with properties
(i) and (ii), and let us show that $\omega = \mu|_{\mathcal{C}_X}$ is a regular content. The fact that ω is
a content is trivial, so the only thing we must show is regularity. Fix some compact
set K ⊂ X. It is clear that
$\omega(K) \le \inf\{\omega(L) : L \in \mathcal{C}_X,\ K \subset \mathrm{Int}(L)\}.$


To prove the reverse inequality, we use property (i), to find, for each ε > 0, an open set
Dε ⊃ K, such that µ(Dε ) ≤ µ(K) + ε. If we choose, for each ε > 0, a compact set Lε
(which is possible, since X is locally compact), such that
K ⊂ Int(Lε ) ⊂ Lε ⊂ Dε ,
then we obviously have
µ(K) ≤ µ(Lε ) ≤ µ(Dε ) ≤ µ(K) + ε,
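so we get the inequality
$\inf\{\omega(L) : L \in \mathcal{C}_X,\ K \subset \mathrm{Int}(L)\} \le \omega(K) + \varepsilon.$
Since this holds for all ε > 0, we get in fact the inequality
$\inf\{\omega(L) : L \in \mathcal{C}_X,\ K \subset \mathrm{Int}(L)\} \le \omega(K),$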


and we are done. 


Definition. Let X be a locally compact space. A Radon measure on X is a
measure µ on Bor(X) with the following properties:
(i) µ(K) < ∞, for all compact sets K ⊂ X;
(ii) for every open set D one has
$\mu(D) = \sup\{\mu(K) : K \subset D,\ K \text{ compact}\};$
(iii) for every Borel set B one has
$\mu(B) = \inf\{\mu(D) : D \supset B,\ D \text{ open}\}.$
By Corollary 7.2, the map ω ↦ µω establishes a bijective correspondence between
the set of all regular contents on X, and the set of all Radon measures on X. For
a regular content ω, the measure µω is called the Radon measure extension of ω.
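As a quick sanity check of the definition, the following sketch verifies conditions (i)-(iii) for a point mass. The point $x_0$ and the symbol $\delta_{x_0}$ are introduced here only for illustration, and the argument assumes that the locally compact space X is Hausdorff, so that singletons are compact and closed.

```latex
% Illustration only: the Dirac mass at a fixed point x_0 \in X.
% Assumption: X is Hausdorff, so \{x_0\} is compact and X \smallsetminus \{x_0\} is open.
\[
  \delta_{x_0}(B) =
  \begin{cases}
    1, & \text{if } x_0 \in B,\\
    0, & \text{if } x_0 \notin B,
  \end{cases}
  \qquad B \in \mathrm{Bor}(X).
\]
% (i) Finiteness on compact sets:
\[
  \delta_{x_0}(K) \le 1 < \infty, \qquad \forall\, K \in \mathcal{C}_X .
\]
% (ii) Let D be open. If x_0 \notin D, both sides below are 0.
%      If x_0 \in D, the compact set K = \{x_0\} \subset D attains the supremum:
\[
  \sup\{\delta_{x_0}(K) : K \in \mathcal{C}_X,\ K \subset D\}
  = \delta_{x_0}(\{x_0\}) = 1 = \delta_{x_0}(D).
\]
% (iii) Let B be Borel. If x_0 \in B, every open D \supset B has measure 1.
%       If x_0 \notin B, the open set D = X \smallsetminus \{x_0\} contains B, so
\[
  \inf\{\delta_{x_0}(D) : D \in \mathcal{T}_X,\ D \supset B\}
  = \delta_{x_0}(X \smallsetminus \{x_0\}) = 0 = \delta_{x_0}(B).
\]
```

Under these assumptions $\delta_{x_0}$ is a Radon measure, and the corresponding regular content is its restriction to compact sets.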
Proposition 7.2. Let X be a locally compact space.
(a) If µ is a Radon measure on X, and t ∈ [0, ∞), then tµ is also a Radon
measure on X.
(b) If µ1 and µ2 are Radon measures on X, then µ1 + µ2 is also a Radon
measure on X.
Proof. Property (a) is trivial.
To prove property (b) let us denote µ1 + µ2 simply by µ. We first observe that
µ is indeed a measure on Bor(X), and we clearly have
µ(K) = µ1 (K) + µ2 (K) < ∞, ∀ K ∈ CX .
Let us show that µ satisfies condition (ii). Fix some open set D ⊂ X, and let
us prove that
(5) $\mu(D) = \sup\{\mu(K) : K \in \mathcal{C}_X,\ K \subset D\}.$
If µ(D) = ∞, then either µ1 (D) = ∞ or µ2 (D) = ∞, so we get
$\sup\{\max\{\mu_1(K), \mu_2(K)\} : K \in \mathcal{C}_X,\ K \subset D\} = \infty,$
and since $\mu(K) \ge \max\{\mu_1(K), \mu_2(K)\}$, ∀ K ∈ CX , the equality (5) immediately
follows. Suppose now µ(D) < ∞, which is equivalent to the fact that
µ1 (D), µ2 (D) < ∞. Denote the right hand side of (5) by ν(D). For every ε > 0,
using the fact that µ1 and µ2 are Radon measures, we can find two compact sets
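$K_1^\varepsilon, K_2^\varepsilon \subset D$, such that $\mu_1(K_1^\varepsilon) \ge \mu_1(D) - \tfrac{\varepsilon}{2}$ and $\mu_2(K_2^\varepsilon) \ge \mu_2(D) - \tfrac{\varepsilon}{2}$. Of course,
the compact set $K_\varepsilon = K_1^\varepsilon \cup K_2^\varepsilon$ is still a subset of D, and satisfies
$\mu_1(K_\varepsilon) \ge \mu_1(K_1^\varepsilon) \ge \mu_1(D) - \tfrac{\varepsilon}{2},$
$\mu_2(K_\varepsilon) \ge \mu_2(K_2^\varepsilon) \ge \mu_2(D) - \tfrac{\varepsilon}{2},$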
so we get µ(Kε ) = µ1 (Kε ) + µ2 (Kε ) ≥ µ1 (D) + µ2 (D) − ε = µ(D) − ε. This
proves that ν(D) ≥ µ(D) − ε, and since this inequality is true for all ε > 0, we get
ν(D) ≥ µ(D). The inequality ν(D) ≤ µ(D) is trivial.
We now show that µ satisfies condition (iii). Fix some set A ∈ Bor(X), and
let us prove that
(6) $\mu(A) = \inf\{\mu(D) : D \in \mathcal{T}_X,\ D \supset A\}.$
If µ(A) = ∞, there is nothing to prove. Suppose now µ(A) < ∞, which is equivalent
to the fact that µ1 (A), µ2 (A) < ∞. Denote the right hand side of (6) by λ(A). For
every ε > 0, using the fact that µ1 and µ2 are Radon measures, we can find two
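open sets $D_1^\varepsilon, D_2^\varepsilon \supset A$, such that $\mu_1(D_1^\varepsilon) \le \mu_1(A) + \tfrac{\varepsilon}{2}$ and $\mu_2(D_2^\varepsilon) \le \mu_2(A) + \tfrac{\varepsilon}{2}$.
Then the open set $D_\varepsilon = D_1^\varepsilon \cap D_2^\varepsilon$ still contains A, and satisfies
$\mu_1(D_\varepsilon) \le \mu_1(D_1^\varepsilon) \le \mu_1(A) + \tfrac{\varepsilon}{2},$
$\mu_2(D_\varepsilon) \le \mu_2(D_2^\varepsilon) \le \mu_2(A) + \tfrac{\varepsilon}{2},$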
so we get µ(Dε ) = µ1 (Dε ) + µ2 (Dε ) ≤ µ1 (A) + µ2 (A) + ε = µ(A) + ε. This
proves that λ(A) ≤ µ(A) + ε, and since this inequality is true for all ε > 0, we get
λ(A) ≤ µ(A). The inequality λ(A) ≥ µ(A) is trivial. 
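To see the ε/2 argument at work, here is a small worked instance on X = ℝ. The choice µ1 = λ (Lebesgue measure) and µ2 = δ0 (the point mass at 0) is made here purely for illustration; both are standard examples of Radon measures on ℝ.

```latex
% Illustration only: \mu = \lambda + \delta_0 on X = \mathbb{R}.
% Condition (ii) for the open set D = (0,1), where \mu(D) = 1 + 0 = 1:
% for 0 < \varepsilon < 1 take K_\varepsilon = [\varepsilon/2,\ 1 - \varepsilon/2] \subset D.
\[
  \mu(K_\varepsilon)
  = \lambda(K_\varepsilon) + \delta_0(K_\varepsilon)
  = (1 - \varepsilon) + 0
  = \mu(D) - \varepsilon .
\]
% Condition (iii) for the Borel set A = \{0\}, where \mu(A) = 0 + 1 = 1:
% take the open set D_\varepsilon = (-\varepsilon/2,\ \varepsilon/2) \supset A.
\[
  \mu(D_\varepsilon)
  = \lambda(D_\varepsilon) + \delta_0(D_\varepsilon)
  = \varepsilon + 1
  = \mu(A) + \varepsilon .
\]
% Letting \varepsilon \to 0 recovers
% \mu(D) = \sup\{\mu(K) : K \subset D \text{ compact}\} and
% \mu(A) = \inf\{\mu(D') : D' \supset A \text{ open}\}.
```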
Radon measures are also functorial with respect to proper maps, in the following
sense.
Proposition 7.3. Let X and Y be locally compact spaces, let Φ : X → Y
be a proper continuous map, and let µ be a Radon measure on X. Then the map
ν : Bor(Y ) → [0, ∞], defined by
$\nu(B) = \mu\bigl(\Phi^{-1}(B)\bigr), \quad \forall\, B \in \mathrm{Bor}(Y),$


is a Radon measure on Y .
Proof. First of all, remark that since Φ is continuous, it is Borel measurable,
which means that
Φ−1 (B) ∈ Bor(X), ∀ B ∈ Bor(Y ).
Secondly, by the well known properties of measures, the map ν is a measure.
We now check that ν is a Radon measure. First of all, if K ⊂ Y is compact,
then using the fact that Φ is proper, it means that Φ−1 (K) is compact in X, so we
clearly get
$\nu(K) = \mu\bigl(\Phi^{-1}(K)\bigr) < \infty.$


To prove that ν satisfies condition (ii), start with some open set D ⊂ Y ,
and let us find a sequence $(L_n)_{n=1}^\infty$ of compact subsets of D, such that ν(D) =
limn→∞ ν(Ln ). The set Φ−1 (D) is open, so there exists a sequence $(K_n)_{n=1}^\infty$ of
compact subsets of Φ−1 (D), with
(7) $\nu(D) = \mu\bigl(\Phi^{-1}(D)\bigr) = \lim_{n\to\infty} \mu(K_n).$
If we define the subsets Ln = Φ(Kn ), then (Ln )n≥1 is a sequence of compact subsets
of D, and the inclusion Kn ⊂ Φ−1 (Ln ) immediately gives ν(D) ≥ ν(Ln ) ≥ µ(Kn ),
so by (7) we also get ν(D) = limn→∞ ν(Ln ).
To prove condition (iii) start with some arbitrary subset B ∈ Bor(Y ), and let
us find a sequence $(E_n)_{n=1}^\infty$ of open subsets of Y , such that ν(B) = limn→∞ ν(En ), and
En ⊃ B, ∀ n ≥ 1. Use the fact that µ is a Radon measure, to find a sequence
$(D_n)_{n=1}^\infty$ of open subsets of X, such that
(8) $\nu(B) = \mu\bigl(\Phi^{-1}(B)\bigr) = \lim_{n\to\infty} \mu(D_n),$
and Dn ⊃ Φ−1 (B), ∀ n ≥ 1. Put Tn = X ∖ Dn , so that Tn is closed, for each n ≥ 1.
By Proposition I.5.2, the sets Φ(Tn ) are closed in Y , hence their complements
En = Y ∖ Φ(Tn ), n ≥ 1 are open. Remark that we have the inclusions B ⊂ En ,
∀ n ≥ 1. Otherwise, we would have B ∩ Φ(Tn ) ≠ ∅, forcing Tn ∩ Φ−1 (B) ≠ ∅,
which is impossible, since Φ−1 (B) ⊂ Dn = X ∖ Tn . Moreover, we also have the
inclusions
Φ−1 (B) ⊂ Φ−1 (En ) ⊂ Dn , ∀ n ≥ 1,
which then force
ν(B) ≤ ν(En ) ≤ µ(Dn ), ∀ n ≥ 1.
Using (8), this gives the equality limn→∞ ν(En ) = ν(B). □
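The properness hypothesis cannot simply be dropped. The following example is not in the original notes; it uses the one-point space Y = {0} and Lebesgue measure λ on ℝ, both chosen here only to illustrate what goes wrong.

```latex
% Illustration only: a continuous map that is not proper.
% X = \mathbb{R}, \quad Y = \{0\} \ (\text{a compact one-point space}), \quad
% \Phi : \mathbb{R} \to \{0\} \text{ the constant map}, \quad \mu = \lambda .
\[
  \Phi^{-1}(\{0\}) = \mathbb{R} \quad \text{is not compact, so } \Phi \text{ is not proper.}
\]
% The push-forward \nu(B) = \lambda(\Phi^{-1}(B)) is still a measure on Bor(Y), but
\[
  \nu(\{0\}) = \lambda\bigl(\Phi^{-1}(\{0\})\bigr) = \lambda(\mathbb{R}) = \infty ,
\]
% so \nu violates condition (i) in the definition of a Radon measure,
% because \{0\} is a compact subset of Y.
```

This is exactly the step of the proof where properness is used: it guarantees that the preimage of a compact set is compact, hence of finite µ-measure.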
Of course, if X is a compact Hausdorff space, then every Radon measure µ on
X is finite. The following gives an interesting converse of this property, which also
shows that sometimes functoriality can be present beyond the proper case described
above.
Proposition 7.4. Let X be a locally compact space, let µ be a Radon measure
on X, and let (θ, T ) be a compactification of X. The following are equivalent:
(i) µ(X) < ∞;
(ii) the map ν : Bor(T ) → [0, ∞), defined by
$\nu(B) = \mu\bigl(\theta^{-1}(B)\bigr), \quad \forall\, B \in \mathrm{Bor}(T),$


is a Radon measure on T .
Proof. Recall that the fact that (θ, T ) is a compactification of X means that
• T is a compact Hausdorff space;
• θ : X → T is continuous;
• θ(X) is open and dense in T ;
• θ : X → θ(X) is a homeomorphism.
Without any loss of generality, we can assume that X is a dense open subset of T ,
and θ is the inclusion map. With this convention, the map ν is defined by
(9) ν(B) = µ(B ∩ X), ∀ B ∈ Bor(T ).
(i) ⇒ (ii). Assume µ(X) < ∞. It is clear that ν is a finite measure on Bor(T ),
and in fact we have ν(T r X) = 0.
The fact that ν(K) < ∞, for every compact subset K ⊂ T is of course trivial.
We now check the second condition in the definition. Fix some open subset
D ⊂ T , and let us show that

$\nu(D) = \sup\{\nu(K) : K \text{ compact},\ K \subset D\}.$
All we need is a sequence $(K_n)_{n=1}^\infty$ of compact subsets of D, with limn→∞ ν(Kn ) =
ν(D). To get this sequence we simply use the fact that D ∩ X is open (in X), so we
can find a sequence $(K_n)_{n=1}^\infty$ of compact subsets of D ∩ X, with limn→∞ µ(Kn ) =
µ(D ∩ X) = ν(D). Now we are done, because the fact that Kn ⊂ X, gives µ(Kn ) =
ν(Kn ), ∀ n ≥ 1.
We now check the third condition in the definition. Fix some set B ∈ Bor(T ),
and let us show that

$\nu(B) = \inf\{\nu(D) : D \subset T \text{ open},\ D \supset B\}.$
All we need is a sequence $(D_n)_{n=1}^\infty$ of open subsets of T , with Dn ⊃ B, ∀ n ≥ 1,
and limn→∞ ν(Dn ) = ν(B). Start off by choosing a sequence $(K_n)_{n=1}^\infty$ of compact
subsets of X, such that limn→∞ µ(Kn ) = µ(X); we then get limn→∞ µ(X ∖ Kn ) = 0
(the condition that µ(X) < ∞ is essential here). If we then define the open sets
An = T ∖ Kn , then we will have ν(An ) = µ(An ∩ X) = µ(X ∖ Kn ), ∀ n ≥ 1, so we
have
(10) $\lim_{n\to\infty} \nu(A_n) = 0.$
Notice also that


(11) An ⊃ T r X, ∀ n ≥ 1.
Use now the fact that µ is a Radon measure on X, and the fact that B ∩ X ∈
Bor(X), to find a sequence $(E_n)_{n=1}^\infty$ of open subsets of X, with En ⊃ B ∩ X,
∀ n ≥ 1, and
(12) $\lim_{n\to\infty} \nu(E_n) = \mu(B \cap X).$
Since X is open in T , it follows that all the En ’s are open in T . If we define
Dn = En ∪ An , then using (11) we have the inclusions
Dn = En ∪ An ⊃ (B ∩ X) ∪ (T r X) ⊃ B, ∀ n ≥ 1,
as well as the inequalities
µ(B ∩ X) = ν(B) ≤ ν(Dn ) ≤ ν(En ) + ν(An ) =
= µ(En ∩ X) + ν(An ) = µ(En ) + ν(An ), ∀ n ≥ 1,
which, combined with (10) and (12), clearly give limn→∞ ν(Dn ) = µ(B∩X) = ν(B).
(ii) ⇒ (i). This implication is trivial, because the fact that ν is a Radon
measure forces µ(X) = ν(X) ≤ ν(T ) < ∞. 
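Two concrete cases may help fix the statement. Both are supplied here as illustrations (they do not appear in the source); λ denotes Lebesgue measure, and the one-point compactification ℝ ∪ {∞} is used as the compactification in the second case.

```latex
% Illustration only.
% Case 1 (finite measure): X = (0,1), \ T = [0,1], \ \theta = \text{inclusion}, \ \mu = \lambda|_{\mathrm{Bor}((0,1))}.
\[
  \mu(X) = 1 < \infty, \qquad
  \nu(B) = \lambda\bigl(B \cap (0,1)\bigr), \quad B \in \mathrm{Bor}([0,1]),
\]
% and \nu gives mass 0 to the added points: \nu(\{0,1\}) = \nu(T \smallsetminus X) = 0.
% Here \nu is Lebesgue measure restricted to [0,1], which is a Radon measure on T.
%
% Case 2 (infinite measure): X = \mathbb{R}, \ T = \mathbb{R} \cup \{\infty\}, \ \mu = \lambda.
\[
  \nu(T) = \lambda(T \cap \mathbb{R}) = \lambda(\mathbb{R}) = \infty ,
\]
% so \nu fails condition (i) of the definition on the compact set K = T,
% in agreement with the implication (ii) \Rightarrow (i).
```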
Comment. Assume µ is a Radon measure on a locally compact space X.
Although the measure µ is regular from above with respect to open sets by (iii), in
general, one cannot conclude that it is regular from below with respect to compact
sets. The following example illustrates such an anomaly.
Exercise 3*. Equip the space X = ℝ² with the disjoint union topology defined
by the decomposition $X = \bigcup_{y \in \mathbb{R}} \bigl(\mathbb{R} \times \{y\}\bigr)$. More explicitly, if we define, for each
A ⊂ X, and each y ∈ R, the set
$A_y = \{x \in \mathbb{R} : (x, y) \in A\},$
then a set D ⊂ X is declared to be open, if and only if all subsets Dy ⊂ R, y ∈ R
are open (in the usual topology on R). For each subset A ⊂ X, define its support
$S_A = \{y \in \mathbb{R} : A_y \ne \varnothing\}.$
Prove the following.
(i) A set K ⊂ X is compact, if and only if its support SK is finite and, for
each y ∈ SK , the set Ky ⊂ R is compact (in the usual topology on R).
(ii) X is a locally compact space.
(iii) If we define, for every compact subset K ⊂ X, the number
$\omega(K) = \sum_{y \in S_K} \lambda(K_y),$

where λ is the Lebesgue measure on R, then ω is a regular content on X.


(iv) Let µ denote the Radon measure extension of ω. Then for every open set
D ⊂ X, one has the equality
$\mu(D) = \sum_{y \in \mathbb{R}} \lambda(D_y),$

where one uses the summation conventions discussed in II.2. (The sum in
the right hand side is defined as the supremum of all finite sums.)
(v) If B ∈ Bor(X) has uncountable support SB , then µ(B) = ∞.
(vi) Consider the y-axis F = {0} × R ⊂ X. Show that F is closed in X (hence
Borel), it has infinite measure µ(F ) = ∞, but µ(K) = 0, for all compact
subsets K ⊂ F .
Hints: Using regularity from above, it suffices to prove (v) only when B is open. In this case
use the fact that if a map α : R → R is summable, then the set {t ∈ R : α(t) ≠ 0} is countable.
For (vi), the equality µ(F ) = ∞ is a consequence of (v). To get the fact that all compact subsets
of F have measure zero, use part (i).
Remark 7.2. Let X be a locally compact space, and let µ be a Radon measure
on X. We define the maximal outer extension of µ (see Section 5) by
$\mu^*(A) = \inf\{\mu(B) : B \in \mathrm{Bor}(X),\ B \supset A\}, \quad \forall\, A \subset X.$
By the regularity from above, one has the equality
(13) $\mu^*(A) = \inf\{\mu(D) : D \in \mathcal{T}_X,\ D \supset A\}, \quad \forall\, A \subset X.$
If one considers the regular content $\omega = \mu|_{\mathcal{C}_X}$, then µ∗ = ω∗, the outer measure
induced by ω. We also know that if we consider the σ-algebra mµ∗ (X) of
all µ∗-measurable subsets of X, we have the inclusion Bor(X) ⊂ mµ∗ (X), and
$\mu^*|_{\mathrm{Bor}(X)} = \mu$.
Exercise 4. Consider the collection D of all subsets of Rn , of the form
D = (a1 , b1 ) × · · · × (an , bn ), a1 < b1 , . . . , an < bn .
For every such D we define
$\mathrm{vol}_n(D) = \prod_{j=1}^{n} (b_j - a_j).$
We define, for every bounded subset B ⊂ Rn , the number
(14) $v(B) = \inf\Bigl\{\sum_{p=1}^{N} \mathrm{vol}_n(D_p) : (D_p)_{p=1}^{N} \subset \mathcal{D},\ B \subset \bigcup_{p=1}^{N} D_p\Bigr\}.$

(i) If we define $\mathcal{B} = \{B \subset \mathbb{R}^n : B \text{ bounded}\}$, then B is a ring, the map
v : B → [0, ∞) is sub-additive, but not σ-sub-additive. In particular, v
does not extend to an outer measure on Rn .
(ii) If we consider the unit square $S = [0, 1]^n$, then the collection
$\mathcal{N}_v(S) = \{N \subset S : v(N) = 0\}$ is a ring, but not a σ-ring.
(iii) When restricted to the collection $\mathcal{C}_{\mathbb{R}^n}$, of all compact subsets of Rn , the
map $\omega = v|_{\mathcal{C}_{\mathbb{R}^n}}$ defines a regular content on Rn .
(iv) The outer measure ω∗, defined by ω, is precisely the outer Lebesgue measure
$\lambda_n^*$.
The above construction somehow belongs to the “prehistory” of measure theory.
The map v : B → [0, ∞) is called the Jordan content. Bounded sets N ⊂ Rn ,
with v(N ) = 0 are called Jordan negligible. The theory of Riemann integration
(especially for functions of several variables) relies heavily on the use of Jordan
negligible sets. Part (ii) shows that, when restricted to Bor(S), the map v fails
to be a measure. Parts (iii) and (iv) explain how the construction can be “fixed.”
The regular content $\omega = v|_{\mathcal{C}_{\mathbb{R}^n}}$ is called the Lebesgue content. The correspondence
ω ↦ ω∗ gives an alternative construction of the outer Lebesgue measure, which
starts with its definition on compact sets as the Jordan content.
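The failure of σ-sub-additivity can be seen on a standard example in dimension n = 1. The sketch below is an illustration added here, not the intended solution of the exercise; the enumeration $(q_k)_{k \ge 1}$ of the rationals in [0,1] is notation introduced only for this purpose.

```latex
% Illustration only (n = 1): the rationals in [0,1] have Jordan content 1.
% Let Q = \mathbb{Q} \cap [0,1] = \{q_1, q_2, \dots\}.
%
% Each singleton is Jordan negligible, since \{q_k\} \subset (q_k - \delta,\ q_k + \delta):
\[
  v(\{q_k\}) \le 2\delta \quad \forall\, \delta > 0
  \qquad\Longrightarrow\qquad v(\{q_k\}) = 0 .
\]
% If Q \subset D_1 \cup \dots \cup D_N with open intervals D_p, then taking closures gives
% [0,1] = \overline{Q} \subset \overline{D_1} \cup \dots \cup \overline{D_N}, hence
\[
  \sum_{p=1}^{N} \mathrm{vol}_1(D_p)
  = \sum_{p=1}^{N} \lambda\bigl(\overline{D_p}\bigr)
  \ge \lambda([0,1]) = 1 .
\]
% The single interval (-\delta,\ 1 + \delta) covers Q, so v(Q) \le 1 + 2\delta. Therefore
\[
  v(Q) = 1 > 0 = \sum_{k=1}^{\infty} v(\{q_k\}),
\]
% which shows that v is not \sigma-sub-additive.
```

The key point is that only finite covers are allowed in (14); passing to countable covers is what the outer measure ω∗ achieves.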
For Radon measures, the lack of regularity from below, with respect to compact
sets, is to some extent compensated by the following result (compare with Exercise 2 from
Section 6).
Lemma 7.1. Let X be a locally compact space, let µ be a Radon measure on
X, and let µ∗ be the maximal outer extension of µ. For a subset A ⊂ X, with
µ∗ (A) < ∞, the following are equivalent
(i) A is µ∗ -measurable;
(ii) µ∗ (A) = sup{µ(K) : K ∈ CX , K ⊂ A};
(iii) there exists a sequence $(K_n)_{n=1}^\infty$ of compact subsets of A, such that
$\mu^*\Bigl(A \smallsetminus \bigcup_{n=1}^{\infty} K_n\Bigr) = 0.$

Proof. (i) ⇒ (ii). Suppose A is µ∗ -measurable, and let us prove the equality
(ii). Denote the right hand side of (ii) simply by ν(A). It is obvious, by the
monotonicity of µ∗ , and the fact that µ∗ Bor(X) = µ, that we have the inequality
µ∗ (A) ≥ ν(A). To prove the other inequality we fix for the moment some ε > 0.
Using (13), there exists an open set D ⊃ A, such that µ(D) ≤ µ∗ (A) + ε. Use
property (ii) in the definition of Radon measures, to find some compact set L ⊂ D
such that
µ(D) ≤ µ(L) + ε.
Since µ(D) = µ(D r L) + µ(L), and µ(L) ≤ µ(D) < ∞, this inequality gives
µ(D r L) ≤ ε,
which, combined with the obvious inclusion A r L ⊂ D r L, yields
(15) µ∗ (A r L) ≤ µ∗ (D r L) = µ(D r L) ≤ ε.
Using (13) we can also find an open set E ⊃ L r A, such that
(16) µ(E) ≤ µ∗ (L r A) + ε.
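Since L ∖ A is µ∗-measurable, we have $\mu(E) = \mu^*(E) = \mu^*\bigl(E \smallsetminus (L \smallsetminus A)\bigr) + \mu^*(L \smallsetminus A)$.
Since µ∗ (L ∖ A) ≤ µ∗ (E) = µ(E) < ∞, the inequality (16) gives
(17) $\mu^*\bigl(E \smallsetminus (L \smallsetminus A)\bigr) \le \varepsilon.$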
Consider the set K = L r E. It is obvious that K is compact, and we have the
inclusion
K ⊂ L r (L r A) = L ∩ A ⊂ A.
Moreover, we have
(L ∩ A) r K ⊂ E r (L r A).
Using the inequality (17), we then get
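$\mu^*\bigl((L \cap A) \smallsetminus K\bigr) \le \varepsilon.$
Finally, the above inequality, combined with (15), gives
$\mu^*(A \smallsetminus K) \le \mu^*\bigl((L \cap A) \smallsetminus K\bigr) + \mu^*\bigl((A \smallsetminus L) \smallsetminus K\bigr) \le \varepsilon + \mu^*(A \smallsetminus L) \le 2\varepsilon.$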

Since K ⊂ A, we get
µ∗ (A) ≤ µ∗ (A r K) + µ∗ (K) ≤ 2ε + µ(K) ≤ 2ε + ν(A).
Since the inequality µ∗ (A) ≤ 2ε + ν(A) holds for all ε > 0, we get µ∗ (A) ≤ ν(A),
so (ii) follows.
(ii) ⇒ (iii). Assume A satisfies (ii), and let us show that A has property (iii).
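For every integer n ≥ 1, we use (ii) to find a compact set Kn ⊂ A, such that
(18) $\mu^*(A) \le \mu(K_n) + \frac{1}{n}.$
On the one hand, we have the inclusions $A \smallsetminus \bigcup_{n=1}^{\infty} K_n \subset A \smallsetminus K_p$, which give
(19) $\mu^*\Bigl(A \smallsetminus \bigcup_{n=1}^{\infty} K_n\Bigr) \le \mu^*(A \smallsetminus K_p), \quad \forall\, p \ge 1.$
On the other hand, since Kp is measurable, we have the equality
$\mu^*(A) = \mu^*(A \smallsetminus K_p) + \mu^*(K_p) = \mu^*(A \smallsetminus K_p) + \mu(K_p),$
and then the fact that µ∗ (A) < ∞, combined with (18), will force
$\mu^*(A \smallsetminus K_p) \le \frac{1}{p}, \quad \forall\, p \ge 1.$
Using (19), this forces $\mu^*\bigl(A \smallsetminus \bigcup_{n=1}^{\infty} K_n\bigr) = 0$.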
(iii) ⇒ (i). This is pretty obvious. We define the sets $B = \bigcup_{n=1}^{\infty} K_n \subset A$, and
N = A ∖ B. Then µ∗ (N ) = 0, so in particular, N is µ∗-measurable. Since B is
Borel, it is also µ∗-measurable, so A = B ∪ N is indeed µ∗-measurable. □

The following result generalizes Lemma 7.1 to the σ-finite case.


Theorem 7.3. Let X be a locally compact space, let µ be a Radon measure on
X, and let µ∗ be the maximal outer extension of µ. For a set A ⊂ X, the following
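are equivalent
(i) A is µ∗-measurable, and µ∗-σ-finite.
(ii) There exist sequences $(K_n)_{n=1}^\infty \subset \mathcal{C}_X$ and $(D_n)_{n=1}^\infty \subset \mathcal{T}_X$, such that
$\bigcup_{n=1}^{\infty} K_n \subset A \subset \bigcap_{n=1}^{\infty} D_n \quad \text{and} \quad \mu\Bigl(\bigcap_{n=1}^{\infty} D_n \smallsetminus \bigcup_{n=1}^{\infty} K_n\Bigr) = 0.$
(The condition that A is µ∗-σ-finite means that there exists a sequence $(A_n)_{n=1}^\infty$ of
subsets of X, with $A = \bigcup_{n=1}^{\infty} A_n$, and µ∗ (An ) < ∞, for all n ≥ 1.)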

Proof. (i) ⇒ (ii). Assume A is µ∗ -measurable and µ∗ -σ-finite.


Claim 1: There exists a sequence $(A_n)_{n=1}^\infty$ of µ∗-measurable sets, such that
$A = \bigcup_{n=1}^{\infty} A_n$, and µ∗ (An ) < ∞, ∀ n ≥ 1.
A priori, we only know that there exists a sequence $(A'_n)_{n=1}^\infty$ of subsets of X
(not assumed to be µ∗-measurable), with $A = \bigcup_{n=1}^{\infty} A'_n$, and µ∗ (A′n ) < ∞, ∀ n ≥ 1.
Using (13), we can choose however, for each n ≥ 1, an open set En , with A′n ⊂ En ,
and µ(En ) < ∞. In particular, En is µ∗-measurable, and so will be An = A ∩ En .
We clearly have $A = \bigcup_{n=1}^{\infty} A_n$, and µ∗ (An ) ≤ µ∗ (En ) < ∞, ∀ n ≥ 1.

Using Claim 1, we start off by writing $A = \bigcup_{n=1}^{\infty} A_n$, with the An ’s µ∗-
measurable, and µ∗ (An ) < ∞. For each n ≥ 1, we use Lemma 7.1 to find a
sequence $(L_n^p)_{p=1}^\infty$ of compact subsets of An , such that
$\mu^*\Bigl(A_n \smallsetminus \bigcup_{p=1}^{\infty} L_n^p\Bigr) = 0.$
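Let us list the countable collection $\{L_n^p : p, n \ge 1\}$ as a sequence $(K_n)_{n=1}^\infty$, so that
we have
$\bigcup_{n=1}^{\infty} K_n = \bigcup_{n=1}^{\infty} \bigcup_{p=1}^{\infty} L_n^p \subset \bigcup_{n=1}^{\infty} A_n = A.$
Claim 2: The set $M = A \smallsetminus \bigcup_{n=1}^{\infty} K_n$ is µ∗-negligible, i.e. µ∗ (M ) = 0.
Indeed, if we define, for each k ≥ 1 the set $M_k = A_k \smallsetminus \bigcup_{n=1}^{\infty} K_n$, then we have
the obvious equality $M = \bigcup_{k=1}^{\infty} M_k$, and the inclusions
$M_k = A_k \smallsetminus \bigcup_{n=1}^{\infty} \bigcup_{p=1}^{\infty} L_n^p \subset A_k \smallsetminus \bigcup_{p=1}^{\infty} L_k^p, \quad \forall\, k \ge 1,$
which, by the choice of the L’s, prove that µ∗ (Mk ) = 0, ∀ k ≥ 1.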
We proceed now with the construction of the D’s. For each pair of integers
(p, n), we use (13) to find an open set $E_n^p \supset A_n$, such that $\mu(E_n^p) \le \mu^*(A_n) + \frac{1}{2^{p+n}}$.
Since the An ’s are µ∗-measurable, we have
$\mu(E_n^p) = \mu^*(E_n^p) = \mu^*(A_n) + \mu^*(E_n^p \smallsetminus A_n).$
Since µ∗ (An ) < ∞, by the choice of the E’s, we will get
(20) $\mu^*(E_n^p \smallsetminus A_n) \le \frac{1}{2^{p+n}}, \quad \forall\, p, n \ge 1.$
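We then define, for each p ≥ 1, the open set $D_p = \bigcup_{n=1}^{\infty} E_n^p$. Notice that, for each
p ≥ 1, we have the inclusion $A = \bigcup_{n=1}^{\infty} A_n \subset \bigcup_{n=1}^{\infty} E_n^p = D_p$, and
$D_p \smallsetminus A = \bigcup_{n=1}^{\infty} [E_n^p \smallsetminus A] \subset \bigcup_{n=1}^{\infty} [E_n^p \smallsetminus A_n].$
Using (20), we then get
(21) $\mu^*(D_p \smallsetminus A) \le \sum_{n=1}^{\infty} \mu^*(E_n^p \smallsetminus A_n) \le \sum_{n=1}^{\infty} \frac{1}{2^{p+n}} = \frac{1}{2^p}, \quad \forall\, p \ge 1.$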
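Since A ⊂ Dp , ∀ p ≥ 1, we get $A \subset \bigcap_{p=1}^{\infty} D_p$. Moreover, if we define the set
$N = \bigl(\bigcap_{p=1}^{\infty} D_p\bigr) \smallsetminus A$, we obviously have the inclusions N ⊂ Dp ∖ A, ∀ p ≥ 1, and
then (21) clearly forces µ∗ (N ) = 0.
Now we have $\bigcup_{n=1}^{\infty} K_n \subset A \subset \bigcap_{p=1}^{\infty} D_p$, and $\bigl(\bigcap_{p=1}^{\infty} D_p\bigr) \smallsetminus \bigl(\bigcup_{n=1}^{\infty} K_n\bigr) = N \cup M$,
with µ∗ (M ) = µ∗ (N ) = 0, so we indeed have (ii).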
The implication (ii) ⇒ (i) is pretty obvious. If there exist sequences $(K_n)_{n=1}^\infty$
and $(D_n)_{n=1}^\infty$ as in (ii), then the sets $B = \bigcup_{n=1}^{\infty} K_n$ and $G = \bigcap_{n=1}^{\infty} D_n$ are Borel.
Moreover, the inclusions B ⊂ A ⊂ G, give A ∖ B ⊂ G ∖ B, so we have µ∗ (A ∖ B) ≤
µ∗ (G ∖ B). By the second feature in (ii) we know that µ∗ (G ∖ B) = 0, therefore
the set P = A ∖ B is µ∗-negligible, hence µ∗-measurable. Since A = B ∪ P , it
follows that A is indeed µ∗-measurable. □
Comment. The implication (ii) ⇒ (i) in Theorem 7.3 holds without the µ∗ -σ-
finiteness assumption on A. In fact, condition (ii) actually forces A to be µ∗ -σ-finite.
Corollary 7.3. If µ is a Radon measure on X, and the set A is µ∗ -measurable,
and µ∗ -σ-finite, then one has the equality
$\mu^*(A) = \sup\{\mu(K) : K \in \mathcal{C}_X,\ K \subset A\}.$

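Proof. Follow the first part of the proof of (i) ⇒ (ii) in Theorem 7.3 to find a sequence
$(K_n)_{n=1}^\infty$ of compact subsets of A, such that
$\mu^*\Bigl(A \smallsetminus \bigcup_{n=1}^{\infty} K_n\Bigr) = 0.$
Since $\bigcup_{n=1}^{\infty} K_n$ is µ∗-measurable, this forces the equality
$\mu^*(A) = \mu^*\Bigl(\bigcup_{n=1}^{\infty} K_n\Bigr) = \lim_{n\to\infty} \mu^*(K_1 \cup \dots \cup K_n).$
Since each K1 ∪ · · · ∪ Kn is a compact subset of A, this gives the inequality “≤” in the
statement; the reverse inequality is clear, by the monotonicity of µ∗. □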

Exercise 5*. Let X be a locally compact space, and let µ be a Radon measure on
X. Suppose ν : Bor(X) → [0, ∞] is a measure satisfying the following conditions:
(a) ν(B) ≤ µ(B), ∀ B ∈ Bor(X);
(b) for every B ∈ Bor(X), one has the implication ν(B) < ∞ ⇒ µ(B) < ∞.
Prove that ν is a Radon measure on X. (Notice that, in the case when µ is finite,
the condition (b) is superfluous.)
Hints: To prove condition (ii) in the definition of Radon measures, start with some open set
D ⊂ X, and choose a sequence K1 ⊂ K2 ⊂ · · · ⊂ D of compact subsets, such that
$\lim_{n\to\infty} \mu(K_n) = \mu(D),$
and define the Borel set $B = \bigcup_{n=1}^{\infty} K_n \subset D$. Notice that we have the equalities µ(B) =
limn→∞ µ(Kn ) and ν(B) = limn→∞ ν(Kn ). Argue that, when ν(D) = ∞, we must have
ν(B) = ∞. When ν(D) < ∞, show that µ(D ∖ B) = 0. In either case we get ν(B) = ν(D).
The next result explains somehow the anomaly illustrated by Exercise 3.
Proposition 7.5. Let µ be a Radon measure on X, and let µ∗ denote its maximal
outer extension. For a subset N ⊂ X, the following are equivalent
(i) N is µ∗ -measurable, and for every compact subset K ⊂ N , one has the
equality µ(K) = 0;
(ii) µ∗ (D ∩ N ) = 0, for all open subsets D ⊂ X with µ(D) < ∞;
(iii) N is locally µ∗-negligible, i.e.
µ∗ (A ∩ N ) = 0, for all subsets A ⊂ X with µ∗ (A) < ∞.
Proof. (i) ⇒ (ii). Assume N satisfies condition (i). Fix some open set D ⊂
X, with µ(D) < ∞. Then the set D∩N is measurable, and µ∗ (D∩N ) ≤ µ(D) < ∞.
The equality µ∗ (D ∩ N ) = 0 then follows from (i), combined with Corollary 7.3.
(ii) ⇒ (iii). Assume N satisfies condition (ii). Fix some arbitrary subset
A ⊂ X, with µ∗ (A) < ∞. Using (13), there exists some open set D ⊃ A with
µ(D) < ∞. Then we have the inequality µ∗ (A ∩ N ) ≤ µ∗ (D ∩ N ), so condition (ii)
will force µ∗ (A ∩ N ) = 0.
(iii) ⇒ (i). Let N be locally µ∗-negligible. We know that local µ∗-negligibility
implies µ∗-measurability (see Section 5). The fact that µ(K) = 0, for all compact
subsets K ⊂ N is also trivial. 
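The y-axis F from Exercise 3 can be read as an instance of this proposition. The sketch below takes the statements (iv) and (v) of Exercise 3 for granted (they are exercises, not results proved in the text), and uses the fact that µ extends the content ω on compact sets.

```latex
% Illustration only, assuming parts (iv)-(v) of Exercise 3.
% X = \mathbb{R}^2 with the disjoint union topology, \quad F = \{0\} \times \mathbb{R}.
%
% Let D \subset X be open with \mu(D) < \infty. By the contrapositive of (v),
% the support S_D is countable, so
\[
  D \cap F = \{(0, y) : y \in S_D,\ 0 \in D_y\}
\]
% is a countable set. Each singleton \{(0,y)\} is compact, with
\[
  \mu(\{(0, y)\}) = \omega(\{(0, y)\}) = \lambda(\{0\}) = 0 ,
\]
% so by \sigma-additivity
\[
  \mu(D \cap F) = \sum_{(0,y) \in D \cap F} \mu(\{(0, y)\}) = 0 .
\]
% Thus F satisfies condition (ii) of Proposition 7.5, hence it is locally
% \mu^*-negligible, even though \mu(F) = \infty.
```

In other words, the anomaly in Exercise 3 is precisely a set of infinite measure that is invisible on every set of finite outer measure.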
Notation. Let µ be a Radon measure on the locally compact space X, and
let µ∗ be the maximal outer extension of µ. We denote the σ-algebra mµ∗ (X),
of all µ∗-measurable subsets of X, simply by Mµ (X), and we define the measure
$\tilde\mu = \mu^*|_{\mathcal{M}_\mu(X)}$. Using the terminology introduced in Section 5, the pair (Mµ (X), µ̃)
is the quasi-completion of Bor(X) with respect to µ.
Our next goal is to examine the inclusion Bor(X) ⊂ Mµ (X) along the same
lines used in the final part of Section 5. In preparation for the results that follow,
it is helpful to introduce the following terminology.
Definition. Let µ be a Radon measure on the locally compact space X. A
non-empty compact subset K ⊂ X, is said to be µ-tight, if it has the property
• there is no compact non-empty proper subset L ⊊ K, with µ(K) = µ(L).
Remark 7.3. Singleton sets are always µ-tight. If K is µ-tight, and µ(K) = 0
then K must be a singleton.
For a non-empty compact set K with µ(K) > 0, the µ-tightness is equivalent
to the following condition¹¹:
(22) $D \subset X \text{ open},\ D \cap K \ne \varnothing \implies \mu(D \cap K) > 0.$
Indeed, if K is µ-tight, and D ⊂ X is an open set, such that D ∩ K ≠ ∅, then the
compact set L = K r D is either empty, or a proper subset of K. In either case,
we get µ(L) < µ(K), and then the equality D ∩ K = K r L gives µ(D ∩ K) =
µ(K) − µ(L) > 0. Conversely, if K satisfies (22) and if L is a non-empty proper
compact subset of K, then the set D = X ∖ L is open, and satisfies D ∩ K ≠ ∅.
By (22) this forces µ(D ∩ K) > 0, and since we have L = K r (D ∩ K), we get
µ(L) = µ(K) − µ(D ∩ K) < µ(K).
A µ-tight compact set K, with µ(K) > 0, will be called non-degenerate.
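A concrete pair of examples may make condition (22) easier to picture. They are supplied here for illustration and use Lebesgue measure λ on X = ℝ, which is a Radon measure.

```latex
% Illustration only: \mu = \lambda on X = \mathbb{R}.
%
% (a) K = [0,1] is a non-degenerate \lambda-tight compact set.
% If D is open and x \in D \cap [0,1], choose \delta > 0 with (x - \delta,\ x + \delta) \subset D.
% Then D \cap [0,1] contains a non-trivial interval, so
\[
  \lambda\bigl(D \cap [0,1]\bigr)
  \ge \lambda\bigl((x - \delta,\ x + \delta) \cap [0,1]\bigr) > 0 ,
\]
% which is condition (22).
%
% (b) K = [0,1] \cup \{2\} is not \lambda-tight: the proper compact subset L = [0,1] satisfies
\[
  \lambda(L) = 1 = \lambda(K) .
\]
% Equivalently, (22) fails for the open set D = (3/2,\ 5/2):
\[
  D \cap K = \{2\} \ne \varnothing, \qquad \lambda(D \cap K) = 0 .
\]
```

In case (b), discarding the isolated point, which carries no mass, leaves the tight subset [0,1] of the same measure.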
Lemma 7.2. Let X be a locally compact space, let µ be a Radon measure on
X. Every non-empty compact set K ⊂ X has a µ-tight compact subset K0 ⊂ K,
with µ(K0 ) = µ(K).
Proof. If K is already tight, there is nothing to prove. Also, if µ(K) = 0,
then we can pick K0 to be of the form {x}, with x any point in K.
For the remainder of the proof, we are going to assume that K is not µ-tight,
and µ(K) > 0. Consider the collection
$\mathcal{L} = \{L \in \mathcal{C}_X : \varnothing \ne L \subsetneq K \text{ and } \mu(L) = \mu(K)\}.$


Since K is not µ-tight, the collection L is non-empty. One key property of the
collection L is the following.
Claim 1: If L1 , . . . , Ln ∈ L, then L1 ∩ · · · ∩ Ln ∈ L.
Indeed, if we define the sets Aj = K rLj , j = 1, . . . , n, then µ(A1 ) = · · · = µ(An ) =
0, and then the equality
K r [L1 ∩ · · · ∩ Ln ] = A1 ∪ · · · ∪ An

will force $\mu\bigl(K \smallsetminus [L_1 \cap \dots \cap L_n]\bigr) = 0$, thus giving µ(L1 ∩ · · · ∩ Ln ) = µ(K) > 0.
(The last inequality forces of course L1 ∩ · · · ∩ Ln ≠ ∅.)
Using the finite intersection property, it follows that the intersection
$K_0 = \bigcap_{L \in \mathcal{L}} L$ is non-empty.
Claim 2: K0 ∈ L.
Obviously K0 is a compact non-empty proper subset of K, so the only thing we need
to prove is the equality µ(K0 ) = µ(K). Consider the Borel subset
B = K r K0 ⊂ K.
11 Notice that using D = X, condition (22) actually forces µ(K) > 0.
Since B ⊂ K, it follows that µ(B) < ∞. By Corollary 7.3 we have
(23) $\mu(B) = \sup\{\mu(P) : P \text{ compact},\ P \subset B\}.$
Notice however that if P ⊂ B is compact, then we have, by the definition of B, the
equality
$\bigcap_{L \in \mathcal{L}} (L \cap P) = \Bigl(\bigcap_{L \in \mathcal{L}} L\Bigr) \cap P = (K \smallsetminus B) \cap P = \varnothing,$
so again by the finite intersection property, combined with Claim 1, it follows that
there exists L ∈ L, such that P ∩ L = ∅. Then we have µ(K r L) = 0, so the
inclusion P ⊂ K r L will force µ(P ) = 0. Using (23), this forces µ(B) = 0.
We now show that K0 is µ-tight. Indeed, if K0 were not tight, we could find
some non-empty compact proper subset L ⊊ K0 , with µ(L) = µ(K0 ) = µ(K).
This will of course force L to belong to L, and therefore it will force the inclusion
K0 ⊂ L, which is impossible. 
Lemma 7.3. Let X be a locally compact space, let µ be a Radon measure on X,
and let G be a pair-wise disjoint collection of non-degenerate µ-tight compact sets.
For any set A ⊂ X, with µ∗ (A) < ∞, the collection
$S_{\mathcal{G}}(A) = \{G \in \mathcal{G} : G \cap A \ne \varnothing\}$


is at most countable.
Proof. Since µ∗ (A) < ∞, by (13), there exists some open set D ⊃ A with
µ(D) < ∞. It is obvious that SG (A) ⊂ SG (D), so it suffices to prove that SG (D) is
at most countable.
On the one hand, we notice that, for every finite subset F ⊂ SG (D), one has
$\sum_{G \in F} \mu(G \cap D) = \mu\Bigl(\bigcup_{G \in F} [G \cap D]\Bigr) \le \mu(D) < \infty.$
This means that the family $\bigl(\mu(G \cap D)\bigr)_{G \in S_{\mathcal{G}}(D)}$ is summable, and we have
$\sum_{G \in S_{\mathcal{G}}(D)} \mu(G \cap D) \le \mu(D) < \infty.$
On the other hand, by Remark 7.3, we know that all the terms µ(G ∩ D), G ∈
SG (D) are strictly positive. Using Proposition II.2.2, this forces SG (D) to be
countable. □
The main application of the above result is the following.
Theorem 7.4. Let X be a locally compact space, and let µ be a Radon measure
on X. Then there exists a partition F of X into µ-tight compact sets, with the
property that the set
$N_{\mathcal{F}} = \bigcup_{F \in \mathcal{F},\ \mu(F) = 0} F$
is locally µ∗-negligible.
Proof. Define the set
$\Omega = \{\mathcal{F} : \mathcal{F} \text{ pairwise disjoint collection of non-degenerate } \mu\text{-tight compact sets}\}.$
We agree to consider the empty collection as an element of Ω, so that Ω is
non-empty. Equip the set Ω with the order relation ⊂ given by inclusion.
Claim 1: The ordered set (Ω, ⊂) contains a maximal element.


This is a straightforward application of Zorn’s Lemma. Start with some subset Λ
of Ω, which is totally ordered with respect to ⊂, and let us show that there is an
upper bound for Λ (in Ω). If we write Λ = {Gi : i ∈ I}, we define the collection
$\mathcal{G} = \bigcup_{i \in I} \mathcal{G}_i$. It is clear that every element in G is a non-degenerate µ-tight compact
set. If K, L ∈ G are different elements, then there exist i, j ∈ I with K ∈ Gi and
L ∈ Gj . Since Λ is totally ordered, we either have Gi ⊂ Gj , or Gj ⊂ Gi . In either
case, we conclude that there exists some k ∈ I, such that K, L ∈ Gk , and then
K ∩ L = ∅. This shows that G is pairwise disjoint, hence G belongs to Ω. It is
obvious that G is an upper bound for Λ.
Having proven Claim 1, we fix a maximal collection G ∈ Ω, and we define the
set $T = \bigcup_{G \in \mathcal{G}} G$. It is quite possible that G = ∅. In that case we define T = ∅.
Claim 2: For every compact subset K ⊂ X r T , one has µ(K) = 0.
We prove this by contradiction. Assume µ(K) > 0. By Lemma 7.2 there exists
a µ-tight compact subset K0 ⊂ K, with µ(K0 ) = µ(K) > 0 (in particular K0 is
non-degenerate). Since K0 ⊂ X ∖ T is disjoint from every set in G, the collection
G ∪ {K0 } would belong to Ω and strictly contain G, contradicting the
maximality of G.
Claim 3: Whenever D ⊂ X is an open set with µ(D) < ∞, it follows that
the set D r T is Borel, and µ(D r T ) = 0.
By Lemma 7.3, the collection
$S_{\mathcal{G}}(D) = \{G \in \mathcal{G} : G \cap D \ne \varnothing\}$
is at most countable. Now we have
$D \cap T = \bigcup_{G \in S_{\mathcal{G}}(D)} (D \cap G),$
so D ∩ T is a countable union of Borel sets, hence D ∩ T itself is Borel, and so will
be D ∖ T = D ∖ (D ∩ T ). Since µ(D ∖ T ) ≤ µ(D) < ∞, by Corollary 7.3, we have
$\mu(D \smallsetminus T) = \sup\{\mu(K) : K \text{ compact},\ K \subset D \smallsetminus T\}.$
By Claim 2, this forces µ(D ∖ T ) = 0.
Going back to the proof of the theorem, we notice that, by Claim 2, none of
the singletons {x}, x ∈ X ∖ T , has positive measure. We can then define the collection
$\mathcal{F} = \mathcal{G} \cup \bigl\{\{x\} : x \in X \smallsetminus T\bigr\},$
which is obviously a partition of X into µ-tight compact sets. For this partition,
we obviously have the equality NF = X r T . By Claim 3, we have
µ∗ (NF ∩ D) = 0, for all open sets D ⊂ X with µ(D) < ∞.
By Proposition 7.5, it follows that NF is indeed locally µ∗-negligible. □
Definition. Let X be a locally compact space, and let µ be a Radon measure
on X. A partition F of X into µ-tight compact sets, with the property stated in
Theorem 7.4, will be called non-degenerate.
The existence of such partitions is significant, as indicated below.
Theorem 7.5. Let X be a locally compact space, let µ be a Radon measure on
X, and let F be a non-degenerate partition of X into µ-tight compact sets. Then [12]
F is a sufficient µ-finite Bor(X)-partition of X.
[12] See Section 5 for the terminology.

Proof. What we need to prove are the following properties:
(i) $\mathcal{F}$ is pairwise disjoint, and $\bigcup_{F \in \mathcal{F}} F = X$;
(ii) $\mathcal{F} \subset \operatorname{Bor}(X)$, and µ(F) < ∞, for all $F \in \mathcal{F}$;
(iii) for every set B ∈ Bor(X), with µ(B) < ∞, one has the equality [13]
(24) $\mu(B) = \sum_{F \in \mathcal{F}} \mu(B \cap F).$

Conditions (i) and (ii) are obvious.

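To prove condition (iii) we define the sub-collection $\mathcal{G} = \{F \in \mathcal{F} : \mu(F) > 0\}$,
so that $\mathcal{G}$ consists of non-degenerate µ-tight compact sets, and the set
$$N_{\mathcal{F}} = X \smallsetminus \Bigl(\bigcup_{F \in \mathcal{G}} F\Bigr)$$
is locally µ∗-negligible. Assume now B ∈ Bor(X) has µ(B) < ∞. By Lemma 7.3,
the collection
$$S_{\mathcal{G}}(B) = \{F \in \mathcal{G} : B \cap F \neq \emptyset\}$$
is at most countable. In particular, the set
$$B_0 = \bigcup_{F \in S_{\mathcal{G}}(B)} (B \cap F) = B \smallsetminus N_{\mathcal{F}}$$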
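is Borel, and so will be $B \smallsetminus B_0 = B \cap N_{\mathcal{F}}$. On the one hand, since $B \smallsetminus B_0$ is a
subset of $N_{\mathcal{F}}$, it follows that $B \smallsetminus B_0$ is locally µ∗-negligible. On the other hand,
since $B \smallsetminus B_0$ is a subset of B, it follows that $\mu(B \smallsetminus B_0) < \infty$. This clearly forces
$\mu(B \smallsetminus B_0) = 0$, so we have the equality
(25) $\mu(B) = \mu(B_0) = \sum_{F \in S_{\mathcal{G}}(B)} \mu(B \cap F).$

Notice that, if $F \in \mathcal{F} \smallsetminus S_{\mathcal{G}}(B)$, then either $F \notin \mathcal{G}$, in which case we have µ(F) = 0,
or $F \in \mathcal{G} \smallsetminus S_{\mathcal{G}}(B)$, in which case B ∩ F = ∅, so µ(B ∩ F) = 0. This shows that
$$\mu(B \cap F) = 0, \quad \forall\, F \in \mathcal{F} \smallsetminus S_{\mathcal{G}}(B),$$
so the equality (25) immediately gives (24). □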

Corollary 7.4. Under the hypothesis above, the collection $\mathcal{F}$ is a µ̃-finite
decomposition for $\mathfrak{M}_\mu(X)$.

Proof. Immediate from Corollary 5.3. 

In the remainder of this section we discuss two basic examples of methods for
constructing (regular) contents.
To introduce the first construction, let us recall some notations and terminology
introduced in II.5. For a locally compact space X, and K one of the fields R or C, we
denote by CcK (X) the space of all continuous functions f : X → K, with compact
support. An R-linear map φ : CcR (X) → R is said to be positive, if it has the
property:
f ∈ CcR (X), f ≥ 0 ⇒ φ(f ) ≥ 0.
With these notations, we have the following result.
[13] Here we use the summation convention from II.2.

Proposition 7.6. Let X be a locally compact space, and let φ : CcR (X) → R
be a positive R-linear map. For every compact subset K ⊂ X, define the number

$$\omega_\phi(K) = \inf\{\phi(f) : f \in C_c^{\mathbb{R}}(X),\ f \ge \kappa_K\}.$$
Then the map $\mathcal{C}_X \ni K \longmapsto \omega_\phi(K) \in [0, \infty)$ is a regular content on X.
Proof. The inequality f ≥ κ K forces f ≥ 0, so we indeed have ωφ (K) ≥ 0,
∀ K ∈ CX . We now check conditions (i)-(iv) in the definition of a content.
The constant function 0 satisfies 0 ≥ κ ∅ , which immediately gives the equality
ωφ (∅) = 0, so condition (i) is satisfied.
By the definition of ωφ , it is clear that one has the implication
K, L ∈ CX , K ⊂ L =⇒ ωφ (K) ≤ ωφ (L),
thus giving condition (ii).
To check condition (iii), suppose K, L ∈ CX , and let us prove the inequality
(26) ωφ (K ∪ L) ≤ ωφ (K) + ωφ (L).
Start with some ε > 0, and choose functions f, g ∈ CcR (X), such that $f \ge \kappa_K$,
$g \ge \kappa_L$, φ(f) ≤ ωφ(K) + ε, and φ(g) ≤ ωφ(L) + ε. If we consider the function
h = f + g ∈ CcR (X), then we clearly have $h \ge \kappa_{K \cup L}$, so we will have
$$\omega_\phi(K \cup L) \le \phi(h) = \phi(f + g) = \phi(f) + \phi(g) \le \omega_\phi(K) + \omega_\phi(L) + 2\varepsilon.$$
Since the inequality ωφ(K ∪ L) ≤ ωφ(K) + ωφ(L) + 2ε holds for arbitrary ε > 0, it
will clearly force (26).
Finally, to check condition (iv) we start with two disjoint sets $K, L \in \mathcal{C}_X$,
and we prove the equality
(27) ωφ (K ∪ L) = ωφ (K) + ωφ (L).
By (26) it only suffices to show the inequality
(28) ωφ (K ∪ L) ≥ ωφ (K) + ωφ (L).
Start with some arbitrary ε > 0, and choose a function f ∈ CcR (X), with
$f \ge \kappa_{K \cup L}$ and φ(f) ≤ ωφ(K ∪ L) + ε. Use Urysohn's Lemma for locally compact spaces
(Theorem I.5.1) to find a continuous map θ : X → [0, 1], such that $\theta|_K = 1$ and
$\theta|_L = 0$. The functions g = f θ and h = f (1 − θ) are obviously continuous, and
have compact supports. Moreover, one has the inequalities $g \ge \kappa_K$ and $h \ge \kappa_L$.
Since g + h = f, we get
$$\omega_\phi(K \cup L) + \varepsilon \ge \phi(f) = \phi(g + h) = \phi(g) + \phi(h) \ge \omega_\phi(K) + \omega_\phi(L).$$
Since the inequality ωφ(K ∪ L) + ε ≥ ωφ(K) + ωφ(L) holds for all ε > 0, it will
clearly force the inequality (28).
So far, we have shown that ωφ is a content. We now prove that ωφ is regular,
which means that, for every K ∈ CX , one has the equality
$$\omega_\phi(K) = \inf\{\omega_\phi(L) : L \in \mathcal{C}_X,\ K \subset \operatorname{Int}(L)\}.$$
By property (ii) we always have the inequality
$$\omega_\phi(K) \le \inf\{\omega_\phi(L) : L \in \mathcal{C}_X,\ K \subset \operatorname{Int}(L)\},$$
so all we need to prove is the inequality
(29) $\omega_\phi(K) \ge \inf\{\omega_\phi(L) : L \in \mathcal{C}_X,\ K \subset \operatorname{Int}(L)\}.$

Start with some arbitrary ε > 0, and choose a function f ∈ CcR (X) with $f \ge \kappa_K$,
and φ(f) ≤ ωφ(K) + ε. Consider the function g = (1 + ε)f, and the set
$$D = \{x \in X : g(x) > 1\}.$$
Obviously D is an open set, and since f (x) ≥ 1, ∀ x ∈ K, we get g(x) ≥ 1 + ε > 1,
∀ x ∈ K. In particular, this gives the inclusion K ⊂ D. Apply then Lemma I.5.1
to find some compact set L ⊂ D, with K ⊂ Int(L). Since g(x) > 1, ∀ x ∈ L, we
clearly have
ωφ (L) ≤ φ(g) = (1 + ε)φ(f ) ≤ (1 + ε)(ωφ (K) + ε).
This argument shows that, if we denote the right hand side of (29) by ν(K), then
we have the inequality
ν(K) ≤ (1 + ε)(ωφ (K) + ε).
Since this inequality holds for all ε > 0, it will force the inequality ν(K) ≤ ωφ (K),
thus proving (29). 
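
The definition of ωφ can be made concrete in the simplest case. The following is a minimal numerical sketch, assuming X = R, φ is the Riemann integral (as in Exercise 6 with n = 1), and K = [0, 1]; the helper names `phi` and `trapezoid` are hypothetical and are not part of the notes. The trapezoidal functions f_δ satisfy f_δ ≥ κ_K and φ(f_δ) = 1 + δ, which suggests ωφ([0, 1]) = 1.

```python
def phi(f, a=-2.0, b=3.0, n=50_000):
    """Midpoint Riemann sum of f over [a, b]; a stand-in for the positive linear map phi."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def trapezoid(delta):
    """Continuous, compactly supported function dominating the indicator of K = [0, 1]."""
    def f(x):
        if 0.0 <= x <= 1.0:
            return 1.0
        if -delta < x < 0.0:
            return 1.0 + x / delta
        if 1.0 < x < 1.0 + delta:
            return 1.0 - (x - 1.0) / delta
        return 0.0
    return f

deltas = (1.0, 0.1, 0.01)
# Each value is an upper bound for omega_phi([0, 1]); they decrease towards 1.
values = [phi(trapezoid(d)) for d in deltas]
```

Since every admissible f satisfies φ(f) ≥ 1 in this setting, the infimum is exactly 1, the length of K.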
Definition. Let X be a locally compact space, and let φ : CcR (X) → R be a
positive R-linear map. We apply Corollary 7.2 to the regular content ωφ , and we
will denote the Radon measure extension of ωφ simply by µφ . The measure µφ on
Bor(X) is called the Riesz measure associated with φ.
An interesting property, which will later be generalized, is the following.
Lemma 7.4 (Mean Value Property). Let X be a locally compact space, let
φ : CcR (X) → R be a positive R-linear map, and let µφ be the Riesz measure
associated with φ. For any function f ∈ CcR (X), and any compact subset K ⊂ X,
with K ⊃ supp f, one has the inequality
(30) $\bigl[\min_{x \in K} f(x)\bigr] \cdot \mu_\phi(K) \le \phi(f) \le \bigl[\max_{x \in K} f(x)\bigr] \cdot \mu_\phi(K).$

Proof. Since $\min_{x \in K} f(x) = -\max_{x \in K} (-f)(x)$, it suffices to prove only the
inequality
(31) $\phi(f) \le \bigl[\max_{x \in K} f(x)\bigr] \cdot \mu_\phi(K).$

Fix f ∈ CcR (X), as well as the compact set K ⊃ supp f. Denote the number
$\max_{x \in K} f(x)$ simply by M.
If M < 0 the inequality is pretty clear, because the function g = f/M satisfies
$g \ge \kappa_K$, which gives φ(g) ≥ ωφ(K) = µφ(K), and then multiplying by M immediately
gives (31).
The case M = 0 is also trivial, since this forces f ≤ 0, so we get φ(f ) ≤ 0.
Assume M > 0. Fix for the moment some ε > 0, and choose some function
h ∈ CcR (X), with h ≥ κ K , and φ(h) ≤ µφ (K) + ε.
Let us observe that M h − f ≥ 0. Indeed, if we start with some arbitrary point
x ∈ X, then either x ∈ K, in which case we have M h(x) ≥ M ≥ f (x), or we have
x ∈ X r K, in which case M h(x) ≥ 0 = f (x).
Using the positivity of φ we then get φ(M h − f ) ≥ 0, which by the choice of h
gives
$$\phi(f) \le \phi(Mh) = M\phi(h) \le M\bigl(\mu_\phi(K) + \varepsilon\bigr).$$
Since the inequality φ(f) ≤ M(µφ(K) + ε) holds for arbitrary ε > 0, it will clearly
force φ(f ) ≤ M µφ (K). 
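
The Mean Value Property can be checked numerically in a special case. The sketch below assumes X = R, φ is the Riemann integral, and µφ is the Lebesgue measure (the content of Exercise 6); the function `f` and the interval K = [−1.5, 2] are hypothetical choices made only for illustration.

```python
import math

def phi(f, a, b, n=20_000):
    """Midpoint Riemann sum of f over [a, b]; a stand-in for phi."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def f(x):
    """Continuous function with supp f = [-1, 1], taking both signs."""
    return (0.3 + math.sin(3.0 * x)) * max(0.0, 1.0 - x * x)

a, b = -1.5, 2.0                       # K = [a, b] contains supp f
grid = [a + i * (b - a) / 10_000 for i in range(10_001)]
f_min, f_max = min(map(f, grid)), max(map(f, grid))   # sampled min and max of f on K
mu_K = b - a                           # Lebesgue measure of K
value = phi(f, a, b)                   # phi(f), approximately 0.4
lower, upper = f_min * mu_K, f_max * mu_K
```

Here `lower <= value <= upper` is the inequality (30) for this particular f and K.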
The Riesz measure can be implicitly characterized by the following result.

Proposition 7.7. With the notations above, the Riesz measure µφ is the
unique Radon measure which has the interpolation property:
(iφ) whenever F ⊂ X is compact, D ⊂ X is open, and f ∈ CcR (X) satisfies
$\kappa_F \le f \le \kappa_D$, it follows that one has the inequality
$$\mu_\phi(F) \le \phi(f) \le \mu_\phi(D).$$

Proof. Let us first show that µφ has property (iφ ). Start with F , D and f
as in (iφ ). Since µφ (F ) = ωφ (F ), by the definition of ωφ , we immediately get the
inequality µφ (F ) ≤ φ(f ).
To prove the inequality φ(f ) ≤ µφ (D), we need some preparations. For every
integer n ≥ 1 we define the sets
$$A_n = \bigl\{x \in X : f(x) > \tfrac{1}{n}\bigr\} \quad\text{and}\quad B_n = \bigl\{x \in X : f(x) \ge \tfrac{1}{n}\bigr\}.$$
Define also the set E = {x ∈ X : f(x) > 0}, so that $\overline{E} = \operatorname{supp} f$. (Here we use
the obvious fact that f ≥ 0.) The sets $A_n$, n ≥ 1, are open. The sets $B_n$, n ≥ 1,
are closed and contained in $E \subset \overline{E}$, hence they are compact. Notice also that we have the
inclusions
A1 ⊂ B1 ⊂ A2 ⊂ B2 ⊂ · · · ⊂ E ⊂ D.
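For every n ≥ 1, we use Urysohn's Lemma to find a continuous function $h_n : X \to [0, 1]$,
with $h_n|_{B_n} = 1$ and $h_n|_{X \smallsetminus A_{n+1}} = 0$. On the one hand, we notice that the
function $f(1 - h_n)$ has its support contained in the compact set $\overline{E} \smallsetminus A_n \subset X \smallsetminus A_n$.
Moreover, since we clearly have $f(x) \le \tfrac{1}{n}$, $\forall\, x \in X \smallsetminus A_n$, by Lemma 7.4 we get the
inequality
$$\phi(f) = \phi(f h_n) + \phi\bigl(f(1 - h_n)\bigr) \le \phi(f h_n) + \frac{\mu_\phi(\overline{E} \smallsetminus A_n)}{n} \le \phi(f h_n) + \frac{\mu_\phi(\overline{E})}{n}, \quad \forall\, n \ge 1,$$
which shows that
(32) $\phi(f) \le \limsup_{n \to \infty} \phi(f h_n).$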

On the other hand, for each n ≥ 1, the function f hn has support contained in
Bn+1 , and (f hn )(x) ≤ 1, ∀ x ∈ Bn+1 , so again by Lemma 7.4 combined with the
inclusion Bn+1 ⊂ D, we get
φ(f hn ) ≤ µφ (Bn+1 ) ≤ µφ (D).
Using (32) we immediately get φ(f ) ≤ µφ (D).
We now prove the uniqueness. Let µ be a Radon measure with property (iφ ).
Claim 1: For any compact set K ⊂ X and any open set D ⊂ X, with K ⊂ D,
one has the inequality
µφ (K) ≤ µ(D).
Choose a compact set L ⊂ X, with K ⊂ Int(L) ⊂ L ⊂ D, and use Urysohn's Lemma
to find a continuous function f : X → [0, 1] such that $f|_K = 1$ and $f|_{X \smallsetminus \operatorname{Int}(L)} = 0$.
In particular, f has compact support, and satisfies κ K ≤ f ≤ κ D . Using (iφ ) for
µφ and for µ, we then get µφ (K) ≤ φ(f ) ≤ µ(D), and we are done.
Claim 2: for every compact set K ⊂ X, one has the equality µφ (K) = µ(K).
On the one hand, by the definition of the Radon measure, we have
$$\mu(K) = \inf\{\mu(D) : D \subset X \text{ open, with } D \supset K\}.$$
By Claim 1, this immediately gives the inequality µφ(K) ≤ µ(K). On the other
hand, if we choose, for every ε > 0, a function $f_\varepsilon$ ∈ CcR (X) with $f_\varepsilon \ge \kappa_K$ and
$\phi(f_\varepsilon) \le \mu_\phi(K) + \varepsilon$, then the function $g_\varepsilon = \min\{f_\varepsilon, 1\}$ will also satisfy $g_\varepsilon \ge \kappa_K$, and
$\phi(g_\varepsilon) \le \phi(f_\varepsilon) \le \mu_\phi(K) + \varepsilon$. Applying (iφ) for µ, with F = K and D = X, will then
force $\mu(K) \le \phi(g_\varepsilon) \le \mu_\phi(K) + \varepsilon$. Since the inequality µ(K) ≤ µφ(K) + ε holds for
all ε > 0, it will force µ(K) ≤ µφ(K).
Having proven Claim 2, we now see that, using condition (iii) in the definition
of Radon measures, we get the equality µ(D) = µφ (D), for all open sets D ⊂ X.
Using condition (ii) from the definition, it then follows that µ(B) = µφ (B), ∀ B ∈
Bor(X). 

Comment. The Riesz correspondence
$$\bigl\{\text{positive } \mathbb{R}\text{-linear maps } C_c^{\mathbb{R}}(X) \to \mathbb{R}\bigr\} \ni \phi \longmapsto \mu_\phi \in \bigl\{\text{Radon measures on } X\bigr\}$$
will be studied in Chapter IV, where we will eventually prove the fact that it is
bijective. At this point we simply regard it as a method of constructing Radon
measures.
Proposition 7.8. Let X be a locally compact space. Then the Riesz corre-
spondence is “linear” in the following sense.
(i) If φ : CcR (X) → R is a positive R-linear map, and t ∈ [0, ∞), then tφ is
also a positive R-linear map, and one has the equality µtφ = tµφ .
(ii) If φ1 , φ2 : CcR (X) → R are positive R-linear maps, then φ1 + φ2 is also a
positive R-linear map, and one has the equality µφ1 +φ2 = µφ1 + µφ2 .

Proof. (i). Assume φ is positive and t ∈ [0, ∞). The fact that tφ is positive
is trivial. We know, by Proposition 7.2, that tµφ is a Radon measure. Then the
equality $\mu_{t\phi} = t\mu_\phi$ follows from the uniqueness part of Proposition 7.7, combined with the obvious fact
that $t\mu_\phi$ has the interpolation property $(i_{t\phi})$.
(ii). If φ1 and φ2 are positive, then so is φ1 + φ2. Define ψ = φ1 + φ2, and
$\nu = \mu_{\phi_1} + \mu_{\phi_2}$. By Proposition 7.2, we again know that ν is a Radon measure. The
equality $\mu_\psi = \nu$ follows from the uniqueness part of Proposition 7.7, combined with the obvious fact that
ν has the interpolation property $(i_\psi)$. □

The Riesz correspondence is also functorial, with respect to proper maps, in
the following sense.
Proposition 7.9. Let X and Y be locally compact spaces, let Φ : X → Y be
a proper continuous map, and let φ : CcR (X) → R be a positive R-linear map.
(i) Whenever f : Y → R is a continuous function with compact support, it
follows that the composition f ◦ Φ : X → R is also a continuous function
with compact support.
(ii) The map
$$\psi : C_c^{\mathbb{R}}(Y) \ni f \longmapsto \phi(f \circ \Phi) \in \mathbb{R}$$
is R-linear and positive.
(iii) If µφ is the Riesz measure on X defined by φ, and if µψ is the Riesz
measure on Y defined by ψ, then one has the equality
$$\mu_\psi(B) = \mu_\phi\bigl(\Phi^{-1}(B)\bigr), \quad \forall\, B \in \operatorname{Bor}(Y).$$


Proof. (i). This statement is trivial, since Φ is proper.
(ii). The linearity of ψ is a consequence of the linearity of the map
$$T : C_c^{\mathbb{R}}(Y) \ni f \longmapsto f \circ \Phi \in C_c^{\mathbb{R}}(X),$$
and of the obvious equality ψ = φ ◦ T. The positivity of ψ follows from the fact
that f ≥ 0 implies f ◦ Φ ≥ 0.
(iii). Use Proposition 7.3, which states that the map ν : Bor(Y) → [0, ∞],
defined by
$$\nu(B) = \mu_\phi\bigl(\Phi^{-1}(B)\bigr), \quad \forall\, B \in \operatorname{Bor}(Y),$$
is a Radon measure. In order to prove statement (iii), which reads µψ = ν, we
observe that, using the uniqueness part of Proposition 7.7, it suffices to prove that ν has the interpolation
property $(i_\psi)$. Fix then a compact set K ⊂ Y and an open set D ⊂ Y, as well as a
function f ∈ CcR (Y), such that $\kappa_K \le f \le \kappa_D$, and let us prove the inequalities
(33) ν(K) ≤ ψ(f ) ≤ ν(D).
Define the compact set L = Φ−1 (K) (here we use the fact that Φ is proper), and
define the open set E = Φ−1 (D) ⊂ X, so that ν(K) = µφ (L) and ν(D) = µφ (E).
If we define, using (i), the function g = f ◦ Φ ∈ CcR (X), then we have ψ(f ) = φ(g),
and the inequalities (33) are the same as the inequalities
µφ (L) ≤ φ(g) ≤ µφ (E).
But these inequalities follow immediately from the interpolation property of µφ ,
combined with the obvious inequalities κ L ≤ g ≤ κ E . 
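
Statement (iii) can be verified by hand on finite discrete spaces, where every map is proper and every function has compact support. The sketch below is only an illustration under these assumptions: the spaces `X`, `Y`, the map `Phi`, and the weights `w` are hypothetical, and it uses the fact that on a finite discrete space the infimum defining ωφ(K) is attained at the indicator function of K.

```python
from fractions import Fraction
from itertools import chain, combinations

X = range(6)                     # finite discrete space X
Y = range(3)                     # finite discrete space Y
Phi = lambda x: x % 3            # a (trivially proper, continuous) map X -> Y
w = [Fraction(k, 7) for k in (1, 0, 2, 5, 3, 4)]   # non-negative weights defining phi

def phi(f):
    """phi(f) = sum over x of w[x] * f(x): a positive linear map on functions X -> R."""
    return sum(w[x] * f(x) for x in X)

def psi(g):
    """psi(g) = phi(g o Phi), as in part (ii) of the proposition."""
    return phi(lambda x: g(Phi(x)))

def indicator(B):
    return lambda p: 1 if p in B else 0

def subsets(S):
    S = list(S)
    return chain.from_iterable(combinations(S, k) for k in range(len(S) + 1))

# Riesz measures in the finite discrete setting: evaluate the map on indicators.
mu_phi = lambda B: phi(indicator(B))
mu_psi = lambda B: psi(indicator(B))

def preimage(B):
    return {x for x in X if Phi(x) in B}

# Check mu_psi(B) = mu_phi(Phi^{-1}(B)) for every subset B of Y.
pushforward_ok = all(mu_psi(set(B)) == mu_phi(preimage(set(B))) for B in subsets(Y))
```

For instance, `mu_psi({0})` collects the weights of the points 0 and 3, the two points of X that Φ sends to 0.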
Remarks 7.4. Let X be a locally compact space, let φ : CcR (X) → R be a
positive R-linear map, and let µφ be the Riesz measure defined by φ.
A. One has the equality
(34) $\mu_\phi(X) = \sup\{\phi(f) : f \in C_c^{\mathbb{R}}(X),\ 0 \le f \le 1\}.$
Indeed, if we denote the right hand side of (34) by M, then the inequality µφ(X) ≥
M is immediate from the interpolation property. Conversely, if for each compact set
K ⊂ X, we choose (using Urysohn's Lemma) some continuous function $f_K : X \to [0, 1]$,
with compact support, such that $f_K|_K = 1$, then by the interpolation property we
get $M \ge \phi(f_K) \ge \mu_\phi(K)$, so we have
$$M \ge \sup\{\mu_\phi(K) : K \in \mathcal{C}_X\} = \mu_\phi(X).$$
B. As a consequence of the equality (34), and of Remark II.5.4, we get the
equivalence
$$\phi \text{ continuous} \iff \mu_\phi(X) < \infty.$$
Moreover, in this case one has the equality ‖φ‖ = µφ(X).
C. Assume X is non-compact, and φ is continuous. Then φ can be extended to
a positive linear map $\phi_0$ on the completion $C_0^{\mathbb{R}}(X)$ of CcR (X). In this case the
Riesz correspondence has a nice connection with the Alexandrov compactification
$X^\alpha = X \sqcup \{\infty\}$ (see I.5 and II.5). Recall that $C_0^{\mathbb{R}}(X)$ is identified with the space
of all continuous functions $f : X^\alpha \to \mathbb{R}$ with f(∞) = 0. Moreover, $\phi_0$ has a unique
extension to a positive linear map $\psi : C^{\mathbb{R}}(X^\alpha) \to \mathbb{R}$, with $\|\psi\| = \|\phi_0\| = \|\phi\|$.

We can then consider the two Riesz measures: µφ on X, and µψ on $X^\alpha$. One has the
equality
(35) $\mu_\psi(B) = \mu_\phi(B \cap X), \quad \forall\, B \in \operatorname{Bor}(X^\alpha).$
First of all, remark that
(36) $\mu_\psi(K) = \mu_\phi(K), \quad \forall\, K \in \mathcal{C}_X.$
This is a consequence of the fact that for every $g \in C^{\mathbb{R}}(X^\alpha)$ with $g \ge \kappa_K$, there
exists some f ∈ CcR (X), with $g \ge f \ge \kappa_K$. (Simply take f = gh, for some continuous
function h : X → [0, 1] with compact support, with $h|_K = 1$.) Using (36), we
immediately get the equality
(37) $\mu_\psi(B \cap X) = \mu_\phi(B \cap X), \quad \forall\, B \in \operatorname{Bor}(X^\alpha).$
Using this with B = X, we get
$$\mu_\psi(X) = \mu_\phi(X) = \|\phi\| = \|\psi\| = \mu_\psi(X^\alpha),$$
which forces µψ({∞}) = 0, and then (35) is immediate from (37).
Exercise 6. Consider the case when $X = \mathbb{R}^n$. For every continuous function
$f : \mathbb{R}^n \to \mathbb{R}$, with compact support, we define
$$\phi(f) = \int_{a_1}^{b_1} \int_{a_2}^{b_2} \cdots \int_{a_n}^{b_n} f(x_1, x_2, \dots, x_n)\, dx_1\, dx_2 \cdots dx_n,$$
where the numbers $a_1 < b_1, \dots, a_n < b_n$ are chosen (arbitrarily) such that
$$\operatorname{supp} f \subset [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_n, b_n].$$
(One can show that the multiple integral is independent of the choice of the a's and
the b's.) It is obvious that this way we have constructed a positive R-linear map
$\phi : C_c^{\mathbb{R}}(\mathbb{R}^n) \to \mathbb{R}$. The Riesz measure µφ, defined by φ, is precisely the Lebesgue
measure $\lambda_n$.
Hint: Compute the values of µφ on compact boxes.
We conclude this section with an important result from harmonic analysis. The
main object of study is explained in the following.
Definition. A topological group is a group G, which comes also equipped with
a topology, which is compatible with the group structure in the sense that the map
$$G \times G \ni (g, h) \longmapsto gh^{-1} \in G$$
is continuous. Note that this is equivalent to the fact that both maps
G × G ∋ (g, h) ⟼ gh ∈ G and $G \ni g \longmapsto g^{-1} \in G$ are continuous. To avoid any
complications, all topological groups are assumed to be Hausdorff.
Examples 7.2. A. Any group becomes a topological group, when equipped
with the discrete topology. (This is the topology in which every subset is open.)
B. The group $(\mathbb{R}^n, +)$ is a topological group, when equipped with the norm
topology.
C. The unit circle T = {z ∈ C : |z| = 1} is a topological group, when equipped
with the usual multiplication, and the topology induced from C. More generally,
for an integer n ≥ 1, the n-dimensional torus $\mathbb{T}^n$, equipped with coordinate-wise
multiplication, and the product topology, is a topological group.
D. Given an integer n ≥ 1, the group $GL_n(\mathbb{R})$, of all invertible n × n matrices
(with matrix multiplication as the group operation), is a topological group, when
equipped with the topology coming from the identification of $GL_n(\mathbb{R})$ as an open
subset in $\mathbb{R}^{n^2}$.
Notations. Let G be a group. For a subset A ⊂ G and an element g ∈ G, we
define the left and right translations of A by g, as the sets
gA = {gh : h ∈ A} and Ag = {hg : h ∈ A}.
For two subsets A, B ⊂ G, we define
A · B = {hk : h ∈ A, k ∈ B}.
Finally, for a subset A ⊂ G, we define $A^{-1} = \{h^{-1} : h \in A\}$.

Remark 7.5. There is some similarity between topological groups and metric
spaces. The subsets that play the role of open balls are the open neighborhoods of the
identity. More explicitly, if G is a topological group, with identity element e, then
one has the equalities
{N ⊂ G : N open neighborhood of g} =
= {gV ⊂ G : V open neighborhood of e} =
= {W g ⊂ G : W open neighborhood of e}.
For example, given a metric space (X, d), a map f : G → X is continuous at some
point g ∈ G, if and only if, for every ε > 0, there exists some neighborhood $V_\varepsilon$ of
e, such that
$$d\bigl(f(gh), f(g)\bigr) < \varepsilon, \quad \forall\, h \in V_\varepsilon.$$
The following two results will be used several times.
Lemma 7.5. Suppose G is a topological group, with identity element e. For
any open neighborhood U of e, there exists an open neighborhood V of e, such that
$V = V^{-1}$ and V · V ⊂ U.
Proof. Fix the open neighborhood U. Use the continuity of the map G × G ∋
(g, h) ⟼ gh ∈ G, at (e, e), to find an open neighborhood D of (e, e) in G × G, such
that
gh ∈ U, ∀ (g, h) ∈ D.
Since D is open in the product topology, there exist open neighborhoods U1 and
U2 , of e, such that U1 × U2 ⊂ D. Then we obviously have
U1 · U2 ⊂ U.
Consider the open neighborhood W = U1 ∩ U2 . We still have W · W ⊂ U . Finally,
using the continuity of the map $G \ni g \longmapsto g^{-1} \in G$, it follows that $W^{-1}$ is also an
open neighborhood of e. Then we are done, if we take $V = W \cap W^{-1}$. □
Proposition 7.10. Let G be a topological group, and let K, L ⊂ G be two
compact disjoint sets. Then there exists an open neighborhood V of the identity
element e, such that V = V −1 and (K · V ) ∩ (L · V ) = (V · K) ∩ (V · L) = ∅.
Proof. Consider the continuous maps $\phi : G \times G \ni (g, h) \longmapsto gh^{-1} \in G$ and
$\psi : G \times G \ni (g, h) \longmapsto g^{-1}h \in G$, and
the compact set C = (K × L) ∪ (L × K) ⊂ G × G. Since φ and ψ are continuous, it follows
that φ(C) ∪ ψ(C) is a compact subset of G. The condition K ∩ L = ∅ obviously gives the
fact that e ∉ φ(C) ∪ ψ(C). Since φ(C) ∪ ψ(C) is closed, there exists some open neighborhood U
of e, such that (φ(C) ∪ ψ(C)) ∩ U = ∅. Use Lemma 7.5 to find some open neighborhood V
of e, such that $V = V^{-1}$ and V · V ⊂ U.
We now show that (K · V) ∩ (L · V) = ∅. Suppose the contrary, i.e. there exist
g ∈ K, h ∈ L, and v, w ∈ V, such that gv = hw. Then we get $h^{-1}g = wv^{-1} \in
V \cdot V^{-1} = V \cdot V \subset U$, which is impossible, since $h^{-1}g = \psi(h, g)$ also belongs to ψ(C).
Finally, let us show that we also have (V · K) ∩ (V · L) = ∅. Suppose the
contrary, i.e. there exist g ∈ K, h ∈ L, and v, w ∈ V, such that vg = wh. Then we
get $hg^{-1} = w^{-1}v \in V^{-1} \cdot V = V \cdot V \subset U$, which is impossible, since $hg^{-1} = \phi(h, g)$ also belongs
to φ(C). □
In what follows we are going to restrict our attention to those topological groups
which are locally compact in their respective topology. The topological groups listed
in Examples 7.2.A-D are all locally compact.
Definition. Let G be a locally compact group. A Radon measure µ on G is
called a Haar measure on G, if µ(G) > 0, and µ has the left invariance property:
µ(gA) = µ(A), ∀ g ∈ G, A ∈ Bor(G).
Remark that, for every g ∈ G the map $\ell_g : G \ni h \longmapsto gh \in G$ is a homeomorphism,
so for a subset A ⊂ G, one has the equivalence $A \in \operatorname{Bor}(G) \Leftrightarrow gA = \ell_g(A) \in
\operatorname{Bor}(G)$. Likewise, the map $r_g : G \ni h \longmapsto hg \in G$ is a homeomorphism, so A ∈
Bor(G) ⇔ Ag ∈ Bor(G).
Remark 7.6. Let G be a locally compact group. For any element g ∈ G, and
any function F ∈ CcR (G), we define the continuous functions $L_g F, R_g F : G \to \mathbb{R}$ by
$L_g F = F \circ \ell_{g^{-1}}$ and $R_g F = F \circ r_g$. In other words,
$$(L_g F)(h) = F(g^{-1}h) \quad\text{and}\quad (R_g F)(h) = F(hg), \quad \forall\, h \in G.$$
It is fairly obvious that Lg F and Rg F both have compact support. Moreover, for
a fixed g ∈ G, the maps Lg , Rg : CcR (G) → CcR (G) are linear, and continuous in the
norm defined in Exercise 5. One has the equalities
Lgh = Lg ◦ Lh and Rgh = Rg ◦ Rh , ∀ g, h ∈ G,
as well as Le = Re = Id, where e denotes the identity element in G.
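
The composition rules for the translation operators can be checked by brute force on a small non-commutative group. The sketch below assumes G is the symmetric group S_3 with the discrete topology, with permutations stored as tuples; the helper names `mul`, `inv`, `L`, `R` are hypothetical.

```python
from itertools import permutations

G = list(permutations(range(3)))          # the symmetric group S_3

def mul(p, q):
    """Group product p*q as composition of permutations: first q, then p."""
    return tuple(p[q[i]] for i in range(3))

def inv(p):
    """Inverse permutation."""
    r = [0] * 3
    for i, pi in enumerate(p):
        r[pi] = i
    return tuple(r)

def L(g, F):
    """(L_g F)(h) = F(g^{-1} h)."""
    return lambda h: F(mul(inv(g), h))

def R(g, F):
    """(R_g F)(h) = F(h g)."""
    return lambda h: F(mul(h, g))

# An injective test function F : G -> R.
F = {p: float(i * i) for i, p in enumerate(G)}.__getitem__

def same(F1, F2):
    return all(F1(x) == F2(x) for x in G)

left_ok = all(same(L(mul(g, h), F), L(g, L(h, F))) for g in G for h in G)
right_ok = all(same(R(mul(g, h), F), R(g, R(h, F))) for g in G for h in G)
# With the factors in the opposite order the identity fails for some pair (g, h).
swapped_ok = all(same(L(mul(g, h), F), L(h, L(g, F))) for g in G for h in G)
```

The inverse in the definition of L_g is what makes g ⟼ L_g multiplicative; `swapped_ok` is false because S_3 is not commutative.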
The following result gives a sufficient condition for a Riesz measure to be a
Haar measure.
Proposition 7.11. Let G be a locally compact group, and let φ : CcR (G) → R
be a positive R-linear map, which is not identically zero, and has the left invariance
property:
φ ◦ Lg = φ, ∀ g ∈ G.
Then the Riesz measure µφ is a Haar measure on G.
Proof. The key property we need is contained in the following
Claim 1: For any g ∈ G, and any compact subset K ⊂ G, one has the
equality µφ (gK) = µφ (K).
Fix for the moment g ∈ G, as well as the compact set K ⊂ G. The set gK is
compact, so we have

(38) $\mu_\phi(gK) = \inf\{\phi(F) : F \in C_c^{\mathbb{R}}(G),\ F \ge \kappa_{gK}\}.$
Notice that if F ∈ CcR (G) satisfies $F \ge \kappa_{gK}$, this means that $F(gh) \ge \kappa_{gK}(gh)$,
∀ h ∈ G. Notice that, for any h ∈ G, one has the equivalences
$$\kappa_{gK}(gh) = 1 \Leftrightarrow gh \in gK \Leftrightarrow h \in K,$$
which means that
$$\kappa_{gK}(gh) = \kappa_K(h), \quad \forall\, h \in G.$$
The inequality $F \ge \kappa_{gK}$ then gives
$$F(gh) \ge \kappa_K(h), \quad \forall\, h \in G,$$
which reads
$$L_{g^{-1}} F \ge \kappa_K.$$
Using the invariance property, we get
$$\mu_\phi(K) \le \phi\bigl(L_{g^{-1}}(F)\bigr) = (\phi \circ L_{g^{-1}})(F) = \phi(F).$$
In other words, we have
$$\phi(F) \ge \mu_\phi(K), \text{ for all } F \in C_c^{\mathbb{R}}(G) \text{ with } F \ge \kappa_{gK}.$$
Using (38) this immediately gives
$$\mu_\phi(K) \le \mu_\phi(gK).$$
Applying the same inequality with g replaced by $g^{-1}$ and K replaced by gK, yields
$$\mu_\phi(gK) \le \mu_\phi\bigl(g^{-1}(gK)\bigr) = \mu_\phi(K),$$
so the Claim follows.


Claim 2: For any g ∈ G, and any open subset D ⊂ G, one has the equality
µφ (gD) = µφ (D).
For a compact subset L ⊂ G, one clearly has the equivalence $L \subset gD \Leftrightarrow g^{-1}L \subset D$.
So, using Claim 1, for every compact subset L ⊂ gD, one has
$$\mu_\phi(L) = \mu_\phi(g^{-1}L) \le \mu_\phi(D),$$
and using property (iii) for Radon measures, we immediately get the inequality
$$\mu_\phi(gD) = \sup\{\mu_\phi(L) : L \text{ compact},\ L \subset gD\} \le \mu_\phi(D).$$
The inequality µφ(D) ≤ µφ(gD) is proven by replacing g with $g^{-1}$ and D with gD,
in the above inequality.
We now prove that µφ is a Haar measure. Start with some Borel set A ⊂ G.
For every open set D ⊃ gA, one has the inclusion $g^{-1}D \supset A$, which using Claim 2,
gives $\mu_\phi(D) = \mu_\phi(g^{-1}D) \ge \mu_\phi(A)$. Using property (ii) in the definition of Radon
measures, we then have
$$\mu_\phi(gA) = \inf\{\mu_\phi(D) : D \text{ open},\ D \supset gA\} \ge \mu_\phi(A).$$
The inequality µφ(A) ≥ µφ(gA) is proven by replacing g with $g^{-1}$ and A with gA,
in the above inequality. Finally, since φ is positive and not identically zero, there
exists f ∈ CcR (G) with 0 ≤ f ≤ 1 and φ(f) > 0, so the equality (34) gives µφ(G) > 0. □

Comment. Later on, in Chapter IV, we are going to prove that the left invari-
ance property of φ is also a necessary condition for µφ to be a Haar measure.
Examples 7.3. Let us examine the examples 7.2.A-D and let us construct
Haar measures on these groups.
A. On a discrete group G, one has the counting measure µ(A) = Card A,
∀ A ⊂ G, which is obviously a Haar measure.
B. On (Rn , +), the Lebesgue measure is a Haar measure.
C. On the n-dimensional torus $\mathbb{T}^n$, we consider the Riesz measure $\mu_\Lambda$, associated
with the positive R-linear map $\Lambda : C^{\mathbb{R}}(\mathbb{T}^n) \to \mathbb{R}$, defined by
$$\Lambda(F) = \int_0^1 \cdots \int_0^1 F(e^{2\pi i\theta_1}, \dots, e^{2\pi i\theta_n})\, d\theta_1 \cdots d\theta_n.$$
It is not hard to see that $\Lambda \circ L_g = \Lambda$, $\forall\, g \in \mathbb{T}^n$. One easy way is to check directly the
equality $(\Lambda \circ L_g)(P) = \Lambda(P)$, for functions of the form $P(z_1, \dots, z_n) = z_1^{m_1} \cdots z_n^{m_n}$,
with $m_1, \dots, m_n \in \mathbb{Z}$, and then use continuity and the Stone-Weierstrass Theorem
which gives the fact that the linear span of all these P's is dense in $C^{\mathbb{R}}(\mathbb{T}^n)$. Using
Proposition 7.11 it follows that $\mu_\Lambda$ is a Haar measure on $\mathbb{T}^n$.
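D. The construction of a Haar measure on $GL_n(\mathbb{R})$ is outlined in the following.
Exercise 7*. Identify $GL_n(\mathbb{R})$ as an open subset in $\mathbb{R}^{n^2}$. For every continuous
function $F : GL_n(\mathbb{R}) \to \mathbb{R}$, with compact support, we define the function $\breve{F} : \mathbb{R}^{n^2} \to \mathbb{R}$ by
$$\breve{F}(x) = \begin{cases} F(x) \cdot |\det x|^{-n} & \text{if } x \in GL_n(\mathbb{R}) \\ 0 & \text{if } x \in \mathbb{R}^{n^2} \smallsetminus GL_n(\mathbb{R}) \end{cases}$$
and we define
$$\psi(F) = \int_{a_1}^{b_1} \int_{a_2}^{b_2} \cdots \int_{a_{n^2}}^{b_{n^2}} \breve{F}(x_1, x_2, \dots, x_{n^2})\, dx_1\, dx_2 \cdots dx_{n^2},$$
where the numbers $a_1 < b_1, \dots, a_{n^2} < b_{n^2}$ are chosen (arbitrarily) such that
$$\operatorname{supp} \breve{F} \subset [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_{n^2}, b_{n^2}].$$
(One has the equality $\operatorname{supp} \breve{F} = \operatorname{supp} F$, and the multiple integral is independent of
the choice of the a's and the b's.) Prove that $\psi \circ L_s = \psi$, $\forall\, s \in GL_n(\mathbb{R})$. Conclude
that the Riesz measure µψ associated with ψ is a Haar measure on $GL_n(\mathbb{R})$.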
Hints: Fix $s \in GL_n(\mathbb{R})$. The map $\ell_{s^{-1}} : GL_n(\mathbb{R}) \to GL_n(\mathbb{R})$ has an obvious linear extension
$\Phi_s : \mathbb{R}^{n^2} \to \mathbb{R}^{n^2}$, defined by
$$\Phi_s(x) = s^{-1}x, \quad \forall\, x \in \mathbb{R}^{n^2},$$
where the vector space $\mathbb{R}^{n^2}$ is identified with $\operatorname{Mat}_{n \times n}(\mathbb{R})$. Fix now $F \in C_c^{\mathbb{R}}\bigl(GL_n(\mathbb{R})\bigr)$ and consider
the function $H = F \circ \ell_{s^{-1}}$, so that $(\psi \circ L_s)(F) = \psi(H)$. Prove the equality
$$\breve{H}(x) = \breve{F}\bigl(\Phi_s(x)\bigr) \cdot |\det s|^{-n}, \quad \forall\, x \in \mathbb{R}^{n^2}.$$
Prove that the Jacobian of $\Phi_s$ satisfies
$$\bigl|\det[(D\Phi_s)(x)]\bigr| = |\det s|^{-n}, \quad \forall\, x \in \mathbb{R}^{n^2}.$$
Use this equality, combined with the above formula for $\breve{H}$, to get the equality ψ(H) = ψ(F),
as a result of the change of variable theorem. (Use the fact that in the definition of ψ, instead
of integrating over rectangles one can integrate over arbitrary compact sets $\Omega \subset GL_n(\mathbb{R})$, with
Jordan negligible boundary, and Int(Ω) ⊃ supp F.)
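
For Example 7.3.C with n = 1, the invariance Λ ◦ L_g = Λ can be tested numerically on a trigonometric polynomial. This is a minimal sketch, not a proof: the test function `F`, the group element `g`, and the Riemann sum `Lam` are hypothetical choices. For trigonometric polynomials of degree less than the number of sample points, the equally spaced Riemann sum reproduces the integral up to rounding.

```python
import cmath
import math

def Lam(F, n=1000):
    """Equally spaced Riemann sum for the integral of F(e^{2 pi i theta}) over [0, 1]."""
    return sum(F(cmath.exp(2j * math.pi * k / n)) for k in range(n)) / n

def F(z):
    """A real trigonometric polynomial on the unit circle: Re(z^3) + 2 (Re z)^2 + 1/2."""
    return (z ** 3).real + 2.0 * z.real ** 2 + 0.5

g = cmath.exp(2j * math.pi * 0.123)        # a fixed element of the circle group

def LgF(z):
    """(L_g F)(z) = F(g^{-1} z)."""
    return F(z / g)

base, shifted = Lam(F), Lam(LgF)           # both should be close to 1.5
```

The value 1.5 comes from the constant term 1/2 plus the mean value 1 of 2 cos²(2πθ); the term Re(z³) integrates to zero.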
Comments. The Haar measures defined in Examples 7.3.A-D are peculiar in
the sense that they also have the right invariance property:
µ(Ag) = µ(A), ∀ g ∈ G, A ∈ Bor(G).
In general such a property does not hold. At this point, we can only speculate on
this matter, by examining the following example.
Exercise 8*. Consider the group G of all orientation preserving affine
transformations of R, i.e. the collection
$$G = \{T_{ab} : a, b \in \mathbb{R},\ a > 0\},$$
where $T_{ab} : \mathbb{R} \ni x \longmapsto ax + b \in \mathbb{R}$. (Some people call this the “ax + b” group.) It
is not hard to see that compositions and inverses of such transformations are again
of this form. In fact one can identify G as the subgroup of $GL_2(\mathbb{R})$ given by
$$G = \left\{ \begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix} : a, b \in \mathbb{R},\ a > 0 \right\}.$$
The topology on G is the one induced from this inclusion. Equivalently, G can be
identified with the right half-plane (0, ∞) × R. We use this identification to define a
positive R-linear map Λ : CcR (G) → R as follows. For every F ∈ CcR (G), we choose
$0 < c_1 < d_1$ and $c_2 < d_2$, such that $\operatorname{supp} F \subset [c_1, d_1] \times [c_2, d_2]$, and we define
$$\Lambda(F) = \int_{c_1}^{d_1} \int_{c_2}^{d_2} \frac{F(a, b)}{a^2}\, da\, db.$$
The integral does not depend on the particular choice of the rectangle. Prove that
$\Lambda \circ L_g = \Lambda$, ∀ g ∈ G, so that the Riesz measure $\mu_\Lambda$ is a Haar measure. In general the
equality $\Lambda \circ R_g = \Lambda$ fails. As indicated in the comment that followed Proposition
7.11, the fact that $\Lambda \circ R_g \neq \Lambda$ would prevent the Riesz measure $\mu_\Lambda$ from having the
right invariance property.
Hints: Use similar arguments to the ones in Exercise 7. If $g = T_{ab} \in G$, then the map
$\ell_{g^{-1}} : G \to G$ extends to a linear map $\Phi_g : \mathbb{R}^2 \to \mathbb{R}^2$, defined by
$$\Phi_g(x, y) = (ax + by, y), \quad \forall\, (x, y) \in \mathbb{R}^2.$$
Argue as in Exercise 7, and use the change of variable theorem.
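
The left invariance of Λ, and the failure of right invariance, can be observed numerically. The sketch below identifies $T_{ab}$ with the point (a, b) of the half-plane, so that the product is (a, b)(a', b') = (aa', ab' + b). The bump function `F`, the element g = (a0, b0), the integration box, and the grid step are hypothetical choices, and the midpoint rule only approximates the integrals. A change of variables suggests that Λ ◦ R_g = a0 · Λ for g = $T_{a_0 b_0}$, which is what the last ratio illustrates.

```python
def F(a, b):
    """Continuous bump supported in the disc of radius 0.5 around (2, 0)."""
    t = 1.0 - ((a - 2.0) ** 2 + b ** 2) / 0.25
    return t * t if t > 0.0 else 0.0

def Lam(H, a_lo=0.5, a_hi=5.0, b_lo=-3.0, b_hi=3.0, step=0.02):
    """Midpoint-rule approximation of the double integral of H(a, b) / a^2 over a box."""
    na = round((a_hi - a_lo) / step)
    nb = round((b_hi - b_lo) / step)
    total = 0.0
    for i in range(na):
        a = a_lo + (i + 0.5) * step
        row = sum(H(a, b_lo + (j + 0.5) * step) for j in range(nb))
        total += row / (a * a)
    return total * step * step

a0, b0 = 1.5, 0.7                      # the group element g = T_{a0, b0}

def left(a, b):
    """(L_g F)(h) = F(g^{-1} h), where g^{-1} h = (a / a0, (b - b0) / a0)."""
    return F(a / a0, (b - b0) / a0)

def right(a, b):
    """(R_g F)(h) = F(h g), where h g = (a * a0, a * b0 + b)."""
    return F(a * a0, a * b0 + b)

base = Lam(F)
left_ratio = Lam(left) / base          # close to 1: left invariance
right_ratio = Lam(right) / base        # close to a0: right invariance fails
```

The integration box [0.5, 5] × [−3, 3] was chosen to contain the supports of F and of both translates for this particular g.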


Exercise 9. As indicated above, in general, Haar measures need not have the
right invariance property. Prove that when µ is a Haar measure on G, then the
map ν : Bor(G) → [0, ∞] defined by
ν(B) = µ(B −1 ), ∀ B ∈ Bor(G),
is a Radon measure, which has the right invariance property.
Hint: The map $G \ni g \longmapsto g^{-1} \in G$ is a homeomorphism.
The main result we are interested in is the existence of a Haar measure. The
following result reduces the problem to the existence of a left invariant content.
Lemma 7.6. Let G be a locally compact group, and let ω be a content on G,
with the left invariance property:
ω(gK) = ω(K), ∀ g ∈ G, K ∈ CG .
If ω is not identically zero, then the outer measure ω ∗ , induced by ω, also has the
left invariance property:
$$\omega^*(gA) = \omega^*(A), \quad \forall\, g \in G,\ A \subset G.$$
The measure $\mu = \omega^*\big|_{\operatorname{Bor}(G)}$ is a Haar measure on G.

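Proof. We trace the construction outlined in Theorem 7.1. Denote by $\mathcal{T}_G$ the
collection of all open subsets of G, and define the map $\hat\omega : \mathcal{T}_G \to [0, \infty]$ by
$$\hat\omega(D) = \sup\{\omega(K) : K \in \mathcal{C}_G,\ K \subset D\}, \quad \forall\, D \in \mathcal{T}_G.$$
The outer measure $\omega^*$ is then defined by
$$\omega^*(A) = \inf\{\hat\omega(D) : D \in \mathcal{T}_G,\ D \supset A\}, \quad \forall\, A \subset G.$$
Claim: The map $\hat\omega : \mathcal{T}_G \to [0, \infty]$ has the left invariance property:
$$\hat\omega(gD) = \hat\omega(D), \quad \forall\, g \in G,\ D \in \mathcal{T}_G.$$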
Start with some arbitrary compact subset K ⊂ gD. Then g −1 K is a compact
subset of D, so by the left invariance property of ω, we get
ω(K) = ω(g −1 K) ≤ ω̂(D).
This means that we have $\omega(K) \le \hat\omega(D)$, for all compact subsets K ⊂ gD, so by the
definition of $\hat\omega$ we get
$$\hat\omega(gD) = \sup\{\omega(K) : K \in \mathcal{C}_G,\ K \subset gD\} \le \hat\omega(D).$$


The other inequality ω̂(D) ≤ ω̂(gD), follows from the one above if we replace g
with g −1 and D with gD.
We are now in position to prove that ω ∗ has the left invariance property. Fix
for the moment A ⊂ G and g ∈ G. For every open set D ⊃ gA, one has g −1 D ⊃ A,
so by the Claim we get
ω̂(D) = ω̂(g −1 D) ≥ ω ∗ (A).
Since we have $\hat\omega(D) \ge \omega^*(A)$, for all open sets D ⊃ gA, by the definition of $\omega^*$, we
get
$$\omega^*(gA) = \inf\{\hat\omega(D) : D \in \mathcal{T}_G,\ D \supset gA\} \ge \omega^*(A).$$


The other inequality ω ∗ (A) ≥ ω ∗ (gA), follows from the one above if we replace g
with g −1 and A with gA.
In order to prove that µ is a Haar measure, all we need to prove is the fact that
µ(G) > 0. Start with some compact subset K ⊂ G, with ω(K) > 0. We have
µ(G) ≥ µ(K) = ω ∗ (K) = ω̆(K) ≥ ω(K) > 0,
and we are done. 
Before we prove the existence of Haar measures, we need more preparations.
Notations. Let G be a group. For two non-empty subsets A, B ⊂ G, we write
A ≺ B, if there exist elements g1 , . . . , gn ∈ G, such that A ⊂ g1 B ∪ · · · ∪ gn B. In
this case we define the number

$$[A : B] = \min\{n \in \mathbb{N} : \text{there exist } g_1, \dots, g_n \in G \text{ with } A \subset g_1 B \cup \dots \cup g_n B\}.$$
The following result will be useful.
Lemma 7.7. Let G be a group.
(i) If A, B ⊂ G are non-empty sets with A ⊂ B, then A ≺ B, and [A : B] = 1.
(ii) The relation ≺ is transitive, i.e. whenever A, B, C ⊂ G are non-empty
subsets satisfying A ≺ B and B ≺ C, it follows that A ≺ C. Moreover, in
this case one has the inequality
[A : C] ≤ [A : B] · [B : C].
(iii) The relation ≺ is compatible with left translations. This means that for
any two elements g, h ∈ G, and any two non-empty subsets A, B ⊂ G,
one has the equivalence A ≺ B ⇔ gA ≺ hB. Moreover, in this case one
has
[gA : hB] = [A : B].
(iv) If A, B, C ⊂ G are non-empty subsets such that A ≺ C and B ≺ C, then
A ∪ B ≺ C. Moreover, in this case one has the inequality
[A ∪ B : C] ≤ [A : C] + [B : C].
(v) If A, B, C ⊂ G are non-empty sets, such that A ≺ C, B ≺ C, and (A ·
C −1 ) ∩ (B · C −1 ) = ∅, then one has the equality
[A ∪ B : C] = [A : C] + [B : C].

Proof. (i) This part is trivial.
(ii) Put m = [A : B] and n = [B : C]. Choose $g_1, \dots, g_m, h_1, \dots, h_n \in G$, such
that $A \subset g_1 B \cup \dots \cup g_m B$, and $B \subset h_1 C \cup \dots \cup h_n C$. We then obviously have the
inclusion
$$A \subset \bigcup_{i=1}^{m} \bigcup_{j=1}^{n} (g_i h_j) C,$$
which proves that A ≺ C, but also shows that [A : C] ≤ mn.
(iii) This follows immediately from (ii) plus the obvious relations A ≺ gA ≺ A,
B ≺ hB ≺ B, and the equalities
[A : gA] = [gA : A] = [B : hB] = [hB : B] = 1.
(iv) Let m = [A : C] and n = [B : C]. Choose g1 , . . . , gm , gm+1 , . . . , gm+n ∈ G
such that A ⊂ g1 C ∪ · · · ∪ gm C and B ⊂ gm+1 C ∪ · · · ∪ gm+n C. This clearly shows
that A ∪ B ≺ C and [A ∪ B : C] ≤ m + n.
(v) Let p = [A ∪ B : C], and choose g1 , . . . , gp ∈ G, such that A ∪ B ⊂
g1 C ∪ · · · ∪ gp C. Define the sets
M = {j ∈ {1, . . . , p} : A ∩ gj C ≠ ∅} and N = {k ∈ {1, . . . , p} : B ∩ gk C ≠ ∅}.
Notice that M ∩ N = ∅. Indeed, if there exists j ∈ M ∩ N , this means that on the
one hand, we have A ∩ gj C ≠ ∅, which gives gj ∈ A · C −1 , and on the other hand,
we have B ∩ gj C ≠ ∅ which gives gj ∈ B · C −1 . But this clearly contradicts the
assumption that (A · C −1 ) ∩ (B · C −1 ) = ∅.
By the definition of M and N , we clearly have the inclusions
\[ A \subset \bigcup_{j \in M} g_j C \quad\text{and}\quad B \subset \bigcup_{k \in N} g_k C. \]

These immediately give the inequalities [A : C] ≤ card M and [B : C] ≤ card N .


Since M and N are disjoint, and M ∪ N ⊂ {1, . . . , p}, these inequalities give
[A : C] + [B : C] ≤ card M + card N = card(M ∪ N ) ≤ p = [A ∪ B : C].
Using part (iv), we see that in fact we have equality [A : C] + [B : C] = [A ∪ B :
C]. 
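The properties in Lemma 7.7 can be checked by brute force in a small finite group. The sketch below is an illustration only, not part of the notes: it assumes the additive cyclic group Z_n, so translates are g + B and the set A · C −1 becomes the difference set A − C. The helper names covering_number and difference_set are hypothetical.

```python
from itertools import combinations


def covering_number(A, B, n):
    """Return [A : B] in the additive group Z_n.

    A and B are non-empty sets of residues mod n. The result is the least
    number of translates g + B whose union contains A.
    """
    A = frozenset(a % n for a in A)
    # All distinct translates g + B, with g running over Z_n.
    translates = {frozenset((g + b) % n for b in B) for g in range(n)}
    for size in range(1, len(translates) + 1):
        for family in combinations(translates, size):
            if A <= frozenset().union(*family):
                return size
    raise ValueError("B must be non-empty")


def difference_set(A, C, n):
    """Additive analogue of A * C^{-1}: all a - c with a in A, c in C."""
    return {(a - c) % n for a in A for c in C}


n = 12
A, B, C = {0, 1, 2, 3, 4, 5}, {0, 1, 2}, {0, 1}
# Part (ii): [A : C] <= [A : B] * [B : C].
assert covering_number(A, C, n) <= covering_number(A, B, n) * covering_number(B, C, n)

K, L = {0, 1}, {6, 7}
# Part (v): when K - C and L - C are disjoint, the covering numbers add up.
assert difference_set(K, C, n).isdisjoint(difference_set(L, C, n))
assert covering_number(K | L, C, n) == covering_number(K, C, n) + covering_number(L, C, n)
```

The search is exponential in the number of translates, so it is only meant for very small n.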

Remark 7.7. If G is a topological group with identity element e, and if V is
a neighborhood of e, then K ≺ V , for every non-empty compact subset K ⊂ G. Indeed, if we
choose some open set D with e ∈ D ⊂ V , then using the compactness of K, and
the obvious inclusion K ⊂ ⋃_{g∈K} gD, it follows that there exist g1 , . . . , gn ∈ K,
such that K ⊂ g1 D ∪ · · · ∪ gn D ⊂ g1 V ∪ · · · ∪ gn V .
With these preparations we are in position to prove the following fundamental
result.

Theorem 7.6. Let G be a locally compact group, and let A be a compact


neighborhood of the identity element. Then there exists a Haar measure µ on G,
such that µ(A) = 1.

Proof. Denote the identity element of G by e. Throughout the proof the


compact neighborhood A of e will be fixed. For every non-empty compact set
K ⊂ G, we define m(K) = [K : A]. We also put m(∅) = 0.
Let us define V to be the collection of all neighborhoods of e. For every V ∈ V,
we denote by Ω(V ) the set of all maps ω : CG → [0, ∞) with the following properties
(i) 0 ≤ ω(K) ≤ m(K), ∀ K ∈ CG ;
(ii) ω(A) = 1;
(iii) K, L ∈ CG , K ⊂ L ⇒ ω(K) ≤ ω(L);
(iv) ω(K ∪ L) ≤ ω(K) + ω(L), ∀ K, L ∈ CG ;
(v) ω(gK) = ω(K), ∀ g ∈ G, K ∈ CG .
(vi) K, L ∈ CG , (K · V ) ∩ (L · V ) = ∅ ⇒ ω(K ∪ L) = ω(K) + ω(L).
Claim 1: For every V ∈ V, the set Ω(V ) is non-empty.
Fix V . We shall prove this Claim by an explicit construction of an element ω ∈
Ω(V ). Define ω(∅) = 0, and define

\[ \omega(K) = \frac{[K : V^{-1}]}{[A : V^{-1}]}, \]
for all non-empty compact subsets K ⊂ G. The fact that ω has properties (i)-(vi)
is immediate from Lemma 7.7.
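As an illustration (an added example, under the assumption that G carries the discrete topology), one may take V = {e}. Compact subsets are then exactly the finite ones, each translate of V^{-1} = {e} is a single point, and the construction reduces to a normalized counting measure:

```latex
% Assumption: G discrete, V = \{e\}, K and A finite and non-empty.
\[
  [K : \{e\}] = \operatorname{card} K,
  \qquad
  \omega(K) = \frac{[K : \{e\}]}{[A : \{e\}]} = \frac{\operatorname{card} K}{\operatorname{card} A}.
\]
% Property (vi) for V = \{e\} reads: K \cap L = \emptyset implies
% \operatorname{card}(K \cup L) = \operatorname{card} K + \operatorname{card} L.
```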
Let us regard the sets Ω(V ), V ∈ V as subsets of the product space
\[ P = \prod_{K \in \mathcal{C}_G} [0, m(K)]. \]

Notice that, when we equip P with the product topology, it becomes a compact
space, by Tihonov’s Theorem.
Claim 2: For every V ∈ V, the set Ω(V ) is closed in P.
Define, for any K ∈ CG , the map

πK : P ∋ ω ↦ ω(K) ∈ R.

By the definition of the topology of P, all maps πK : P → R are continuous. For


any two sets K, L ∈ CG , consider the functions FKL , TKL : P → R, defined by

FKL (ω) = ω(K) − ω(L) and TKL (ω) = ω(K ∪ L) − ω(K) − ω(L), ∀ ω ∈ P.

Since we have FKL = πK − πL and TKL = πK∪L − πK − πL , it follows that the


maps FKL , TKL : P → R, K, L ∈ CG , are all continuous. As a consequence of the
continuity of these maps, it follows that, for any two sets K, L ∈ CG , the sets
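\[
\begin{aligned}
\Gamma(K, L) &= \{\omega \in P : \omega(K) \le \omega(L)\} = F_{KL}^{-1}\bigl((-\infty, 0]\bigr),\\
\Theta^{-}(K, L) &= \{\omega \in P : \omega(K \cup L) \le \omega(K) + \omega(L)\} = T_{KL}^{-1}\bigl((-\infty, 0]\bigr),\\
\Theta^{+}(K, L) &= \{\omega \in P : \omega(K \cup L) \ge \omega(K) + \omega(L)\} = T_{KL}^{-1}\bigl([0, \infty)\bigr)
\end{aligned}
\]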

are closed subsets of P. It then follows that the sets


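\[
\begin{aligned}
\Omega^{1} &= \{\omega \in P : \omega(A) = 1\} = \pi_A^{-1}\bigl(\{1\}\bigr),\\
\Omega^{2} &= \bigcap_{\substack{(K,L) \in \mathcal{C}_G \times \mathcal{C}_G \\ K \subset L}} \Gamma(K, L),\\
\Omega^{3} &= \bigcap_{(K,L) \in \mathcal{C}_G \times \mathcal{C}_G} \Theta^{-}(K, L),\\
\Omega^{4} &= \bigcap_{K \in \mathcal{C}_G} \bigcap_{g \in G} \bigl[\Gamma(K, gK) \cap \Gamma(gK, K)\bigr],
\end{aligned}
\]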

are all closed, so the intersection


Ω5 = Ω1 ∩ Ω2 ∩ Ω3 ∩ Ω4
is again closed. Notice that
Ω5 = {ω ∈ P : ω has properties (i)-(v)}.


Finally, if we define, for every V ∈ V, the set


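\[ \Omega^{6}_{V} = \bigcap_{\substack{(K,L) \in \mathcal{C}_G \times \mathcal{C}_G \\ (K \cdot V) \cap (L \cdot V) = \emptyset}} \Theta^{+}(K, L), \]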

then Ω6V is also closed, and so will then be the intersection Ω5 ∩ Ω6V = Ω(V ).
Claim 3: The intersection ⋂_{V ∈V} Ω(V ) is non-empty.
Remark that, if V1 , V2 ∈ V are such that V1 ⊂ V2 , then we have the inclusion
Ω(V1 ) ⊂ Ω(V2 ). Indeed, if ω belongs to Ω(V1 ), then properties (i)-(v) are clear. To
check property (vi) for V2 we need to show that whenever K, L ⊂ G are compact
sets, with (K · V2 ) ∩ (L · V2 ) = ∅, it follows that ω(K ∪ L) = ω(K) + ω(L). This
is however trivial, since the inclusion V1 ⊂ V2 forces (K · V1 ) ∩ (L · V1 ) = ∅, and
then the desired equality follows from the property (vi) for V1 . We now see that,
for any finite number of sets V1 , . . . , Vn ∈ V, we have the inclusion
Ω(V1 ∩ · · · ∩ Vn ) ⊂ Ω(V1 ) ∩ · · · ∩ Ω(Vn ),
which by Claim 1, proves that Ω(V1 ) ∩ · · · ∩ Ω(Vn ) ≠ ∅. Using Claim 2, and the
compactness of P, the Claim immediately follows.
Pick now an element ω ∈ ⋂_{V ∈V} Ω(V ).
Claim 4: The map ω : CG → [0, ∞) is a content on G with the left invariance
property
ω(gK) = ω(K), ∀ g ∈ G, K ∈ CG .
Moreover, one has the equality ω(A) = 1.
The fact that ω(A) = 1 is clear, from condition (ii) in the definition of Ω(V ). The
left invariance property follows from condition (v). In order to prove that ω is a
content, we need to prove
(a) ω(∅) = 0;
(b) K, L ∈ CG , K ⊂ L ⇒ ω(K) ≤ ω(L);
(c) ω(K ∪ L) ≤ ω(K) + ω(L), ∀ K, L ∈ CG ;
(d) K, L ∈ CG , K ∩ L = ∅ ⇒ ω(K ∪ L) = ω(K) + ω(L).

Properties (a), (b), and (c) are clear, because every element in Ω(V ), V ∈ V satisfies
them. (Property (a) is a consequence of condition (i), property (b) is a consequence
of (iii), and property (c) is a consequence of (iv).) To prove property (d), we start
with two disjoint compact sets K and L, and we use Proposition 7.5 to find some
V ∈ V such that (K · V ) ∩ (L · V ) = ∅. Then we use the fact that ω belongs to
Ω(V ), and by condition (vi) we indeed get ω(K ∪ L) = ω(K) + ω(L).
Having proven Claim 4, we now define the measure µ0 = ω∗|Bor(G) . By Lemma
7.7, µ0 is a Haar measure on G. Notice that µ0 (A) = ω̆(A) ≥ ω(A) = 1, so if we
define µ : Bor(G) → [0, ∞] by (use the convention ∞/µ0 (A) = ∞)
\[ \mu(B) = \frac{\mu_0(B)}{\mu_0(A)}, \quad \forall\, B \in \operatorname{Bor}(G), \]
then µ is a Haar measure on G, and satisfies µ(A) = 1. 
Comment. Eventually (see Chapter IV) we are going to improve on the above
result by proving the uniqueness of µ.
In concrete examples, it is possible to prove uniqueness.
Exercise 10*. Let S = [0, 1]n be the unit cube in Rn , and let µ be a Haar
measure on (Rn , +), with µ(S) = 1. Prove that µ coincides with the n-dimensional
Lebesgue measure λn .
Hint: Consider first the half open box S0 = [0, 1)n , and its measure β = µ(S0 ). Prove that for
a half open box of the form
B = [a1 , b1 ) × · · · × [an , bn )
with a1 , . . . , an , b1 , . . . , bn ∈ Q, one has µ(B) = βλn (B). Conclude that if a subset A ⊂ Rn is
contained in a hyperplane of the form
Πk (a) = {(x1 , . . . , xn ) ∈ Rn : xk = a},
then µ(A) = 0. Use this to get β = 1, so
µ(B) = λn (B),
for every “rational” half-open box. Prove that this equality holds for all half-open boxes. Use
Corollary 5.1 to conclude that µ = λn .
The following two exercises show how a Haar measure can be used to get some
topological information.
Exercise 11. Let G be a locally compact group, and let µ be a Haar measure
on G. Prove that µ(D) > 0, for every open subset D ⊂ G.
Hint: Use the inequality µ(K) ≤ [K : D] · µ(D), for all compact K ⊂ G.
Exercise 12*. Let G be a locally compact group, and let µ be a Haar measure
on G. Prove that the following are equivalent:
(i) G is compact;
(ii) µ(G) < ∞.
Hint: For the implication (ii) ⇒ (i), start with some compact neighborhood V of the identity,
and choose a maximal subset A ⊂ G, such that the sets gV , g ∈ A are disjoint. Prove that A is
finite. Conclude that G = ⋃_{g∈A} (gV · V −1 ), so G is a finite union of compact sets.
Lectures 30-31

8. Signed measures and complex measures


In this section we discuss a generalization of the notion of a measure, to the
case where the values are allowed to be outside [0, ∞]. The first notion is described
by the following.
Definition. Suppose A is a σ-algebra on a non-empty set X. A function
µ : A → [−∞, ∞] is called a signed measure on A, if it has the properties below.
(i) Either one of the following is true
• µ(A) < ∞, ∀ A ∈ A;
• µ(A) > −∞, ∀ A ∈ A.
(ii) µ(∅) = 0.
(iii) For any pairwise disjoint sequence (An )∞n=1 ⊂ A, one has the equality
\[ \mu\Bigl(\bigcup_{n=1}^{\infty} A_n\Bigr) = \sum_{n=1}^{\infty} \mu(A_n). \tag{1} \]

Here we adopt the convention that if one term in the right hand side of (1) is equal
to ±∞, then the entire sum is equal to ±∞. It is important to use condition (i),
which avoids situations when one term is ∞ and another term is −∞.
Examples 8.1. Let us agree, in this section only, to use the term “honest”
measure, for a measure in the usual sense.
A. Any “honest” measure is of course a signed measure.
B. If µ is a signed measure, then −µ is again a signed measure.
C. If µ1 and µ2 are “honest” measures, one of which is finite, then µ1 − µ2 is
a signed measure. Eventually (see Theorem 8.2) we are going to show that any
signed measure can be written in this form.
One key technical result about signed measures is the following.
Theorem 8.1. Let A be a σ-algebra on a non-empty set X, and let µ be a
signed measure on A. Then there exist sets L, M ∈ A, such that
(2) µ(L) = inf{µ(A) : A ∈ A};
(3) µ(M ) = sup{µ(A) : A ∈ A}.

Proof. Since −µ is also a signed measure, it suffices to prove only the exis-
tence of M satisfying (3). Denote the right hand side of (3) by α, and choose a
sequence (αn )n≥1 ⊂ R, such that limn→∞ αn = α, and αn < α, ∀ n ≥ 1. The key
construction we need is contained in the following.
Claim 1: There exists a family of sets {Bkn : k, n ∈ N, 1 ≤ k ≤ n} ⊂ A,
with the following properties:

(i) for every n ≥ 1, one has the inclusions


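\[
\begin{array}{ccccccccc}
B^{n}_{1} & \subset & B^{n}_{2} & \subset & \dots & \subset & B^{n}_{n} & & \\
\cup & & \cup & & \dots & & \cup & & \\
B^{n+1}_{1} & \subset & B^{n+1}_{2} & \subset & \dots & \subset & B^{n+1}_{n} & \subset & B^{n+1}_{n+1}
\end{array}
\]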
(ii) for every k ≥ 1 one has the inequalities
µ(B_k^n ∖ B_k^{n+1} ) ≤ 0, ∀ n ≥ k.
(iii) µ(Bnn ) ≥ αn , ∀ n ≥ 1.
We construct this family inductively, one row at a time (the rows are indexed by
the upper index n). Choose B_1^1 ∈ A to be any set with µ(B_1^1 ) ≥ α1 . Suppose we
have constructed the first m rows, i.e. we have defined the sets B_k^n , 1 ≤ k ≤ n ≤ m,
so that property (i) holds for all n = 1, . . . , m − 1, property (ii) holds in the form
\[ \alpha_k \le \mu(B^{k}_{k}) \le \mu(B^{k+1}_{k}) \le \dots \le \mu(B^{m}_{k}), \quad \forall\, k = 1, \dots, m, \]
and property (iii) holds for all n = 1, . . . , m. Let us explain now how the next row
B_1^{m+1} ⊂ B_2^{m+1} ⊂ · · · ⊂ B_m^{m+1} ⊂ B_{m+1}^{m+1} is constructed. Define the sets E1 , E2 , . . . , Em ∈ A by
\[ E_1 = B^{m}_{1}, \quad\text{and}\quad E_k = B^{m}_{k} \setminus B^{m}_{k-1}, \quad \forall\, k = 2, \dots, m. \]
The sets Ek , k = 1, . . . , m are pairwise disjoint, and we have
\[ B^{m}_{k} = \bigcup_{j=1}^{k} E_j, \quad \forall\, k = 1, \dots, m. \]

Choose now an arbitrary set D ∈ A, with µ(D) ≥ αm+1 , and define, for each
j ∈ {1, . . . , m}, the set

\[ G_j = \begin{cases} E_j & \text{if } \mu(E_j \setminus D) > 0, \\ E_j \cap D & \text{if } \mu(E_j \setminus D) \le 0. \end{cases} \]
Notice that we have Ej ⊃ Gj , and using the equality µ(Ej ) = µ(Ej ∩ D) + µ(Ej ∖ D),
we also have
(4) µ(Ej ∖ Gj ) ≤ 0 and µ(Gj ) ≥ µ(Ej ∩ D), ∀ j = 1, . . . , m.
Define also the set Gm+1 = D ∖ B_m^m . It is clear that the sets G1 , G2 , . . . , Gm+1 are
pairwise disjoint. Construct now the m + 1 row by taking
\[ B^{m+1}_{k} = \bigcup_{j=1}^{k} G_j, \quad \forall\, k = 1, 2, \dots, m + 1. \]

It is obvious that one has the inclusions


\[ B^{m+1}_{1} \subset B^{m+1}_{2} \subset \dots \subset B^{m+1}_{m+1}. \]
Since Ek ⊃ Gk , ∀ k = 1, . . . , m, it is also clear that we have the vertical inclusions
B_k^m ⊃ B_k^{m+1} , ∀ k = 1, . . . , m. Using (4), for each k = 1, . . . , m, we have
\[ \mu(B^{m}_{k} \setminus B^{m+1}_{k}) = \mu\Bigl(\bigcup_{j=1}^{k} [E_j \setminus G_j]\Bigr) = \sum_{j=1}^{k} \mu(E_j \setminus G_j) \le 0. \]

Finally, again by (4), we have


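\[
\begin{aligned}
\mu(B^{m+1}_{m+1}) &= \mu\Bigl(\bigcup_{k=1}^{m+1} G_k\Bigr) = \sum_{k=1}^{m+1} \mu(G_k) \ge \mu(G_{m+1}) + \sum_{k=1}^{m} \mu(E_k \cap D) \\
&= \mu(G_{m+1}) + \mu\Bigl(\bigcup_{k=1}^{m} [E_k \cap D]\Bigr) = \mu(D \setminus B^{m}_{m}) + \mu(B^{m}_{m} \cap D) = \mu(D) \ge \alpha_{m+1}.
\end{aligned}
\]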

Claim 2: There exists a sequence (Ak )∞k=1 ⊂ A, such that


(i) A1 ⊂ A2 ⊂ A3 ⊂ . . . ;
(ii) µ(Ak ) ≥ αk , ∀ k ≥ 1.

We fix a family {B_k^n : k, n ∈ N, 1 ≤ k ≤ n} satisfying the properties in Claim 1.
For every k ≥ 1, we define Ak = ⋂_{n=k}^∞ B_k^n . Notice that, using property (i) from
Claim 1 (the vertical inclusions), we have
\[ B^{k}_{k} = A_k \cup \bigcup_{n=k}^{\infty} [B^{n}_{k} \setminus B^{n+1}_{k}], \]
and the sets Ak , B_k^n ∖ B_k^{n+1} , n ≥ k, are pairwise disjoint, so using property (ii)
from Claim 1, we have
\[ \mu(B^{k}_{k}) = \mu(A_k) + \sum_{n=k}^{\infty} \mu(B^{n}_{k} \setminus B^{n+1}_{k}) \le \mu(A_k). \]

Using property (iii) from Claim 1, we then get µ(Ak ) ≥ αk . The fact that we have
the inclusions A1 ⊂ A2 ⊂ . . . is clear, from property (i) in Claim 1 (the horizontal
inclusions).
Fix now the sequence (Ak )∞k=1 ⊂ A as in Claim 2, and let us consider the set
M = ⋃_{k=1}^∞ Ak . If we define the sets
M1 = A1 and Mk = Ak ∖ Ak−1 , ∀ k ≥ 2,
then we have M = ⋃_{k=1}^∞ Mk , and the sets M1 , M2 , M3 , . . . are pairwise disjoint. In
particular, this gives
\[ \mu(M) = \sum_{k=1}^{\infty} \mu(M_k) = \lim_{k \to \infty} \sum_{j=1}^{k} \mu(M_j) = \lim_{k \to \infty} \mu\Bigl(\bigcup_{j=1}^{k} M_j\Bigr). \]
Since we obviously have ⋃_{j=1}^k Mj = Ak , ∀ k ≥ 1, the above equality proves that
\[ \mu(M) = \lim_{k \to \infty} \mu(A_k). \tag{5} \]

Since we have αk ≤ µ(Ak ) ≤ α, ∀ k ≥ 1, as well as limk→∞ αk = α, the equality


(5) forces µ(M ) = α. 
Remark 8.1. One interesting application of the above result is the fact that,
whenever µ is a signed measure on A, such that
(6) −∞ < µ(A) < ∞, ∀ A ∈ A,
then
−∞ < inf{µ(A) : A ∈ A} ≤ sup{µ(A) : A ∈ A} < ∞.

A signed measure with property (6) is called finite.


We are now in position to prove the statement made in Example 8.1.C.

Theorem 8.2. Let X be a non-empty set, let A be a σ-algebra on X, and let µ


be a signed measure on A. Then there exist subsets X + , X − ∈ A, with the following
properties:
(i) X + ∩ X − = ∅, and X + ∪ X − = X;
(ii) the maps µ± : A → [−∞, ∞], defined by
µ± (A) = ±µ(A ∩ X ± ), ∀ A ∈ A,
are “honest” measures on A;
(iii) one of the measures µ± is finite, and one has the equality µ = µ+ − µ− .
Proof. Without any loss of generality, we can assume that
µ(A) < ∞, ∀ A ∈ A.
(Otherwise, we replace µ with −µ, and the conclusion does not essentially change.)
Put α = sup{µ(A) : A ∈ A}. By Theorem 8.1, it follows that 0 ≤ α < ∞, and
there exists a set X + ∈ A, such that µ(X + ) = α. Define X − = X ∖ X + .
Claim: The sets X ± have the following properties:
(a) 0 ≤ µ(A) ≤ α, for all A ∈ A, with A ⊂ X + ;
(b) 0 ≥ µ(B), for all B ∈ A, with B ⊂ X − .
To prove (a) start with some arbitrary set A ∈ A with A ⊂ X + . First of all, by the definition
of α, it is clear that µ(A) ≤ α. Second, using the equality µ(X + ) = µ(A) + µ(X + ∖
A), and the fact that µ(X + ) = α is finite, it is clear that µ(A), µ(X + ∖ A) > −∞, so we have
α ≥ µ(X + ∖ A) = µ(X + ) − µ(A) = α − µ(A),
which clearly forces µ(A) ≥ 0. To prove (b), we start with some set B ∈ A with
B ⊂ X − . Using the fact that X + ∩ B = ∅, we have
α ≥ µ(X + ∪ B) = µ(X + ) + µ(B) = α + µ(B),
which clearly forces µ(B) ≤ 0.
Having proven the Claim, we define the maps µ± : A → [−∞, ∞] as in the
statement of the Theorem. By the Claim, we get µ± (A) ≥ 0, ∀ A ∈ A. It is also
pretty clear that both µ+ and µ− are σ-additive, so they define “honest” measures.
Also, by the Claim, we have µ+ (A) ≤ α, ∀ A ∈ A, so µ+ is a finite measure. Finally,
if we start with some arbitrary A ∈ A, and we write it as A = (A ∩ X + ) ∪ (A ∩ X − ),
then using the fact that (A ∩ X + ) ∩ (A ∩ X − ) = ∅, we get
µ(A) = µ(A ∩ X + ) + µ(A ∩ X − ) = µ+ (A) − µ− (A). 
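On a finite set the construction in the proof can be carried out explicitly. The following sketch is an added illustration, assuming a signed measure given by exact point weights (the dictionary layout and the names measure, hahn_jordan, all_subsets are hypothetical). It takes X + to be the points of positive weight, checks that µ(X + ) attains the supremum α, and checks µ = µ+ − µ− on every subset.

```python
from fractions import Fraction
from itertools import combinations


def measure(weights, subset):
    """mu(subset) for the signed measure with the given point weights."""
    return sum((weights[x] for x in subset), Fraction(0))


def hahn_jordan(weights):
    """Return (X_plus, X_minus, mu_plus, mu_minus) for a finite weighted set."""
    X_plus = {x for x, w in weights.items() if w > 0}
    X_minus = set(weights) - X_plus
    mu_plus = {x: (w if x in X_plus else Fraction(0)) for x, w in weights.items()}
    mu_minus = {x: (-w if x in X_minus else Fraction(0)) for x, w in weights.items()}
    return X_plus, X_minus, mu_plus, mu_minus


def all_subsets(points):
    """Yield every subset of a finite collection of points."""
    points = list(points)
    for r in range(len(points) + 1):
        for combo in combinations(points, r):
            yield set(combo)


weights = {"a": Fraction(3), "b": Fraction(-2), "c": Fraction(0), "d": Fraction(1, 2)}
X_plus, X_minus, mu_plus, mu_minus = hahn_jordan(weights)

# mu(X_plus) equals alpha = sup mu(A), as in the proof.
alpha = max(measure(weights, S) for S in all_subsets(weights))
assert measure(weights, X_plus) == alpha

# mu = mu_plus - mu_minus on every subset.
for S in all_subsets(weights):
    assert measure(weights, S) == measure(mu_plus, S) - measure(mu_minus, S)
```

The point "c" has weight zero, so it could be placed in either X + or X − without changing µ+ or µ−.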
It will be helpful not only here, but also in some future discussions, to isolate
a certain feature identified by the above result.
Definition. Given a σ-algebra A on a non-empty set X, and two “honest”
measures µ and ν on A, we say that µ and ν are mutually singular, if there exist
sets M, N ∈ A, with M ∪ N = X and M ∩ N = ∅, such that µ(N ) = ν(M ) = 0.
Notice that this implies the equalities
µ(A) = µ(A ∩ M ) and ν(A) = ν(A ∩ N ), ∀ A ∈ A.
If this situation occurs, we write µ ⊥ ν.
With this terminology, Theorem 8.2 states that any signed measure µ can be
written as µ = µ+ − µ− , with µ+ and µ− “honest” mutually singular measures,
and one of them finite.

Although the sets X ± may not be uniquely determined, the decomposition


µ = µ+ − µ− is unique, as indicated by the following result.
Theorem 8.3 (Minimality). Let X be a non-empty set, let A be a σ-algebra
on X, and let µ be a signed measure on A. Suppose µ+ and µ− are mutually
singular “honest” measures on A, one of them being finite, such that µ = µ+ − µ− .
Suppose ν and η are two “honest” measures on A, one of which being finite, such
that µ = ν − η. Then one has the inequalities µ+ ≤ ν and µ− ≤ η.
Proof. Fix sets X + , X − ∈ A, such that X + ∪ X − = X, X + ∩ X − = ∅, and
µ+ (X − ) = µ− (X + ) = 0.
Start with some arbitrary set A ∈ A. On the one hand, since A = (A ∩ X + ) ∪
(A ∩ X − ), with (A ∩ X + ) ∩ (A ∩ X − ) = ∅, we see that if λ is either one of the
measures µ, µ+ , or µ− , we have the equality
(7) λ(A) = λ(A ∩ X + ) + λ(A ∩ X − ), ∀ A ∈ A.
On the other hand, since µ+ is an “honest” measure, and µ+ (X − ) = 0, the inclusion
A ∩ X − ⊂ X − will force
µ+ (A ∩ X − ) = 0, ∀ A ∈ A.
Likewise, we have the equality
µ− (A ∩ X + ) = 0, ∀ A ∈ A.
These equalities, combined with µ = µ+ − µ− , and with (7), give the equalities
(8) µ+ (A) = µ+ (A ∩ X + ) = µ+ (A ∩ X + ) − µ− (A ∩ X + ) = µ(A ∩ X + ),
(9) µ− (A) = µ− (A ∩ X − ) = −µ+ (A ∩ X − ) + µ− (A ∩ X − ) = −µ(A ∩ X − ),
for all A ∈ A. Fix now some set A ∈ A. Since ν is an “honest” measure, and
η(A ∩ X + ) ≥ 0, using (8) we get
ν(A) ≥ ν(A ∩ X + ) ≥ ν(A ∩ X + ) − η(A ∩ X + ) = µ(A ∩ X + ) = µ+ (A).
Likewise, we have
η(A) ≥ η(A ∩ X − ) ≥ η(A ∩ X − ) − ν(A ∩ X − ) = −µ(A ∩ X − ) = µ− (A). 
Corollary 8.1. Let A be a σ-algebra on X, let µ be a signed measure on A,
and let µ+ , µ− , ν + and ν − be “honest” measures on A with
• µ+ ⊥ µ− , and one of the measures µ+ and µ− is finite;
• ν + ⊥ ν − , and one of the measures ν + and ν − is finite;
• µ = µ+ − µ− = ν + − ν − .
Then one has the equalities µ+ = ν + and µ− = ν − .
Proof. Apply Theorem 8.3 “both ways” to get µ+ ≤ ν + and µ− ≤ ν − , as
well as ν + ≤ µ+ and ν − ≤ µ− . 
Definition. Given a signed measure µ, the decomposition µ = µ+ −µ− , whose
existence is shown in Theorem 8.2, and whose uniqueness is shown above, is called
the Hahn-Jordan decomposition of µ. A pair of sets (X + , X − ), with X ± ∈ A,
X + ∪ X − = X, X + ∩ X − = ∅, and µ+ (X − ) = µ− (X + ) = 0, is called a Hahn-
Jordan set decomposition of X relative to µ.
Exercise 1. Let µ be a signed measure, and let µ = µ+ −µ− be the Hahn-Jordan
decomposition. Prove that the following are equivalent

(i) µ is finite, i.e. −∞ < µ(A) < ∞, ∀ A ∈ A;


(ii) both “honest” measures µ+ and µ− are finite.
The following result characterizes mutual singularity in an approximate fashion.
Lemma 8.1. Let A be a σ-algebra on X, and let µ and ν be “honest” measures
on A. The following are equivalent
(i) µ ⊥ ν;
(ii) for every ε > 0, there exist sets D, E ∈ A, such that µ(D) < ε, ν(E) < ε,
and D ∪ E = X.

Proof. The implication (i) ⇒ (ii) is trivial.


To prove the implication (ii) ⇒ (i) construct, for each ε > 0, two sequences
(D_n^ε )∞n=1 and (E_n^ε )∞n=1 of sets in A, such that µ(D_n^ε ) < ε/2^n , ν(E_n^ε ) < ε/2^n , and
D_n^ε ∪ E_n^ε = X. Put Aε = ⋂_{n=1}^∞ D_n^ε and Bε = ⋃_{n=1}^∞ E_n^ε . Fix for the moment ε > 0.
On the one hand, using the inclusion Aε ⊂ D_n^ε , ∀ n ≥ 1, we get µ(Aε ) ≤ ε/2^n ,
∀ n ≥ 1, which clearly forces
(10) µ(Aε ) = 0.
On the other hand, using σ-subadditivity, we have
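\[ \nu(B_\varepsilon) = \nu\Bigl(\bigcup_{n=1}^{\infty} E^{\varepsilon}_{n}\Bigr) \le \sum_{n=1}^{\infty} \nu(E^{\varepsilon}_{n}) < \sum_{n=1}^{\infty} \frac{\varepsilon}{2^{n}} = \varepsilon. \tag{11} \]
Finally, since we have X ∖ D_n^ε ⊂ E_n^ε , ∀ n ≥ 1, we get
\[ X \setminus A_\varepsilon = \bigcup_{n=1}^{\infty} (X \setminus D^{\varepsilon}_{n}) \subset \bigcup_{n=1}^{\infty} E^{\varepsilon}_{n} = B_\varepsilon, \]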

which gives
(12) Aε ⊃ X ∖ Bε .
Define now the sets N = ⋃_{n=1}^∞ A_{1/n} and M = X ∖ N . On the one hand, using
σ-subadditivity, combined with (10), we get µ(N ) = 0. On the other hand, using
(12), we have
\[ M = X \setminus N = X \setminus \bigcup_{n=1}^{\infty} A_{1/n} = \bigcap_{n=1}^{\infty} (X \setminus A_{1/n}) \subset \bigcap_{n=1}^{\infty} B_{1/n} \subset B_{1/k}, \quad \forall\, k \ge 1, \]

which forces ν(M ) = 0. 
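A standard example of mutual singularity, added here as an illustration under the assumption that X = R with its Borel σ-algebra, is the pair formed by Lebesgue measure λ and the Dirac mass δ_0 at the origin. In this case condition (ii) holds with the same pair of sets for every ε:

```latex
% Assumption: X = \mathbb{R}, \mu = \lambda (Lebesgue measure), \nu = \delta_0.
\[
  D = \{0\}, \qquad E = \mathbb{R} \setminus \{0\}, \qquad D \cup E = \mathbb{R},
\]
\[
  \lambda(D) = 0 < \varepsilon, \qquad \delta_0(E) = 0 < \varepsilon
  \qquad \text{for every } \varepsilon > 0,
\]
% so \lambda \perp \delta_0, with N = \{0\} and M = \mathbb{R} \setminus \{0\}
% in the definition of mutual singularity.
```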

Although the next technical result seems a bit out of context at this point, we
prove it here, and record it for future use.
Lemma 8.2. Let A be a σ-algebra on some non-empty set X, and let µ, η be
signed measures on A. Assume there is an “honest” finite measure ν on A, with
µ + ν = η.
(i) If µ = µ+ − µ− and η = η + − η − are the Hahn-Jordan decompositions of
µ and η respectively, then one has the inequalities
(13) µ+ ≤ η + ≤ µ+ + ν
(14) η − ≤ µ− ≤ η − + ν.

(ii) If (X + , X − ) is a Hahn-Jordan set decomposition of X relative to µ, and
if (Y + , Y − ) is a Hahn-Jordan set decomposition of X relative to η, then
one has the relations X + ⊂_ν Y + and Y − ⊂_ν X − .

Proof. On the one hand, the signed measure η has a decomposition


η = µ + ν = (µ+ + ν) − µ− ,
with µ+ + ν and µ− “honest” measures (one of them finite). Using the minimality
Theorem 8.3, we get the inequalities
(15) η + ≤ µ+ + ν and η − ≤ µ− .
On the other hand, we can also consider the signed measure µ = η − ν, which has
a decomposition
µ = η + − (η − + ν),
with η + and η − + ν “honest” measures (one of them finite). Using again the
minimality Theorem 8.3, we get the inequalities
(16) µ+ ≤ η + and µ− ≤ η − + ν.
Clearly the inequalities (15) and (16) cover the desired inequalities (13) and (14).
(ii). Recall (see Section 4) that the relation A ⊂_ν B means that ν(A ∖ B) = 0.
In our case, we have to look at the set
N = X + ∖ Y + = Y − ∖ X − ,
for which we have to show that ν(N ) = 0. On the one hand, since N ⊂ Y − , we get
η + (N ) = 0. Using (13) this forces µ+ (N ) = 0. On the other hand, since N ⊂ X + ,
we get µ− (N ) = 0, and using (14) we also get η − (N ) = 0. In other words, we get
the equalities
µ(N ) = µ+ (N ) − µ− (N ) = 0,
η(N ) = η + (N ) − η − (N ) = 0,
and then the equality η = µ + ν clearly forces ν(N ) = 0. 

The Hahn-Jordan decomposition has the following interesting application to the


properties of the natural order relation on “honest” measures. The result below
gives the existence of an “infimum” and a “supremum” for a pair of finite “honest”
measures.
Proposition 8.1 (Lattice Property). Let A be a σ-algebra on a non-empty
set X, and let µ and ν be “honest” measures on A, with one of them finite.
(i) There exists a unique measure µ ∨ ν with:
(a) µ ∨ ν ≥ µ and µ ∨ ν ≥ ν;
(b) whenever ω is an “honest” measure on A, with µ ≤ ω and ν ≤ ω, it
follows that one has the inequality µ ∨ ν ≤ ω.
(ii) There exists a unique measure µ ∧ ν with:
(a) µ ≥ µ ∧ ν and ν ≥ µ ∧ ν;
(b) whenever λ is an “honest” measure on A, with µ ≥ λ and ν ≥ λ, it
follows that one has the inequality µ ∧ ν ≥ λ.

Proof. Since the statement of the Proposition is “symmetric,” without any loss
of generality we can assume that µ is finite.
Consider the signed measure η = µ − ν, and its Hahn-Jordan decomposition
η = η + − η − . Let (X + , X − ) be a Hahn-Jordan set decomposition of X relative to
η. This means that, for every A ∈ A, one has
(17) 0 ≤ η + (A) = η(A ∩ X + ) = µ(A ∩ X + ) − ν(A ∩ X + );
(18) 0 ≤ η − (A) = −η(A ∩ X − ) = ν(A ∩ X − ) − µ(A ∩ X − ).
In particular we get
(19) µ(A ∩ X + ) ≥ ν(A ∩ X + ) and µ(A ∩ X − ) ≤ ν(A ∩ X − ), ∀ A ∈ A.
(i). Define the measure µ ∨ ν = µ + η − . Using (18) we have
(20) (µ ∨ ν)(A) = µ(A ∩ X + ) + ν(A ∩ X − ), ∀ A ∈ A.
Notice that, using (19), it follows that, for every A ∈ A, one has the inequalities
(µ ∨ ν)(A ∩ X + ) = µ(A ∩ X + ) ≥ ν(A ∩ X + ),
(µ ∨ ν)(A ∩ X − ) = ν(A ∩ X − ) ≥ µ(A ∩ X − ),
In particular, this gives
(µ ∨ ν)(A) = (µ ∨ ν)(A ∩ X + ) + (µ ∨ ν)(A ∩ X − ) ≥ µ(A ∩ X + ) + µ(A ∩ X − ) = µ(A),
(µ ∨ ν)(A) = (µ ∨ ν)(A ∩ X + ) + (µ ∨ ν)(A ∩ X − ) ≥ ν(A ∩ X + ) + ν(A ∩ X − ) = ν(A),
for every A ∈ A, so µ ∨ ν indeed has property (a).
To prove property (b), start with some “honest” measure ω on A, with µ, ν ≤ ω,
and let us show that µ ∨ ν ≤ ω. This is quite clear, since for any A ∈ A, using (20)
we have
ω(A) = ω(A ∩ X + ) + ω(A ∩ X − ) ≥ µ(A ∩ X + ) + ν(A ∩ X − ) = (µ ∨ ν)(A).
The uniqueness of µ ∨ ν is now clear from (a) and (b).
(ii). Remark that, using the Minimality Theorem 8.3, for the measure η = µ−ν,
it follows that η + ≤ µ. In particular, η + is a finite “honest” measure, and so is the
difference µ − η + . Put µ ∧ ν = µ − η + . Using (17) we have
(21) (µ ∧ ν)(A) = µ(A ∩ X − ) + ν(A ∩ X + ), ∀ A ∈ A.
Notice that, using (19), it follows that, for every A ∈ A, one has the inequalities
(µ ∧ ν)(A ∩ X + ) = ν(A ∩ X + ) ≤ µ(A ∩ X + ),
(µ ∧ ν)(A ∩ X − ) = µ(A ∩ X − ) ≤ ν(A ∩ X − ),
In particular, this gives
(µ ∧ ν)(A) = (µ ∧ ν)(A ∩ X + ) + (µ ∧ ν)(A ∩ X − ) ≤ µ(A ∩ X + ) + µ(A ∩ X − ) = µ(A),
(µ ∧ ν)(A) = (µ ∧ ν)(A ∩ X + ) + (µ ∧ ν)(A ∩ X − ) ≤ ν(A ∩ X + ) + ν(A ∩ X − ) = ν(A),
for every A ∈ A, so µ ∧ ν indeed has property (a).
To prove property (b), start with some “honest” measure λ on A, with λ ≤ µ and λ ≤ ν,
and let us show that µ ∧ ν ≥ λ. This is quite clear, since for any A ∈ A, using (21)
we have
λ(A) = λ(A ∩ X + ) + λ(A ∩ X − ) ≤ ν(A ∩ X + ) + µ(A ∩ X − ) = (µ ∧ ν)(A).
The uniqueness of µ ∧ ν is now clear from (a) and (b). 
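For measures given by point weights on a finite set, formulas (20) and (21) reduce to pointwise maximum and minimum. The sketch below is an added illustration under that assumption; the function name join_and_meet and the dictionary representation are hypothetical. It takes X + to be the set where µ − ν is non-negative.

```python
def join_and_meet(mu, nu):
    """Return (mu v nu, mu ^ nu) as point weights, following (20) and (21).

    mu and nu map each point of the same finite set to a non-negative weight.
    """
    # X_plus is a positive set for eta = mu - nu; its complement is X_minus.
    X_plus = {x for x in mu if mu[x] >= nu[x]}
    # (20): (mu v nu)(A) = mu(A n X_plus) + nu(A n X_minus).
    join = {x: (mu[x] if x in X_plus else nu[x]) for x in mu}
    # (21): (mu ^ nu)(A) = mu(A n X_minus) + nu(A n X_plus).
    meet = {x: (nu[x] if x in X_plus else mu[x]) for x in mu}
    return join, meet


mu = {"a": 3, "b": 0, "c": 2}
nu = {"a": 1, "b": 4, "c": 2}
join, meet = join_and_meet(mu, nu)

# On a finite set the lattice operations are pointwise max and min.
assert join == {x: max(mu[x], nu[x]) for x in mu}
assert meet == {x: min(mu[x], nu[x]) for x in mu}
# Consequence of the two formulas: (mu v nu) + (mu ^ nu) = mu + nu.
assert all(join[x] + meet[x] == mu[x] + nu[x] for x in mu)
```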

We conclude with a series of results that make a connection with the theory of
Radon measures discussed in Section 7.
Definition. Suppose X is a locally compact space, and µ is a signed measure
on Bor(X). We call µ a signed Radon measure on X, if there exist “honest” Radon
measures ν and η on X, one of which is finite, such that µ = ν − η.
Exercise 2*. Let X be a locally compact space, and let µ be a signed measure
on Bor(X). Prove that the following are equivalent:
(i) µ is a signed Radon measure on X;
(ii) if µ = µ+ − µ− denotes the Hahn-Jordan decomposition of µ, then both
µ+ and µ− are Radon measures on X.
Hint: To prove the implication (i) ⇒ (ii) use the fact that µ+ ≤ ν and µ− ≤ η. Moreover,
show that, for any B ∈ Bor(X), one has the implications µ+ (B) < ∞ ⇒ ν(B) < ∞ and
µ− (B) < ∞ ⇒ η(B) < ∞. Then use Exercise 5 from Section 7.
Remark 8.2. Suppose X is a locally compact space. In Section 7 we discussed
the Riesz correspondence, which associates to each linear positive map φ : CcR (X) →
R, a Radon measure µφ on X. As already suggested, this correspondence is in fact
a bijection, although the proof of this fact will come later in Chapter IV. At this
point we would like to analyze the Riesz correspondence in a simpler situation,
namely the case when X is compact. In this case it is interesting to point out that
Riesz correspondence can be extended beyond the positive case. The key fact (see
Corollary II.5.3) is that every linear continuous map φ : C R (X) → R can be written
as a difference φ = φ1 − φ2 , with φ1 , φ2 : C R (X) → R positive linear maps. (In fact
φ1 and φ2 can be chosen such that ‖φ‖ = ‖φ1‖ + ‖φ2‖. This fact will be heavily
exploited a little later.) We would like then to define a finite signed Radon measure
µφ by the formula µφ = µφ1 − µφ2 . There is a minor problem here: What if we
find another pair of continuous positive linear maps ψ1 , ψ2 : C R (X) → R, such that
φ = ψ1 − ψ2 ? Is it true that µψ1 − µψ2 = µφ1 − µφ2 ? The answer is affirmative,
and this is an easy consequence of Proposition 7.6, which gives the equalities
µφ1 + µψ2 = µφ1 +ψ2 = µψ1 +φ2 = µψ1 + µφ2 .
Notations. Suppose X is a compact Hausdorff space. We define
MR (X) = {φ : C R (X) → R : φ R-linear continuous},
RR (X) = {µ : µ signed Radon measure on X}.
The correspondence
(22) MR (X) ∋ φ ↦ µφ ∈ RR (X)
defined above, will still be referred to as the extended Riesz correspondence.
Remark 8.3. If X is a compact Hausdorff space, then the extended Riesz
correspondence (22) is a linear map. This is a consequence of Proposition 7.6.
Given φ ∈ MR (X), the existence of a decomposition of φ, of the particular type
described in Corollary II.5.3, is extremely significant, as suggested by the following
result.
Theorem 8.4. Let X be a compact Hausdorff space, let φ1 , φ2 : C R (X) → R
be positive linear maps, and let µφ1 and µφ2 be the corresponding Riesz measures.
Consider the linear continuous map φ = φ1 − φ2 , and the finite signed measure
(23) µφ = µφ1 − µφ2 .

If ‖φ‖ = ‖φ1‖ + ‖φ2‖, then µφ1 ⊥ µφ2 , so (23) represents the Hahn-Jordan decomposition
of µφ .
Proof. We are going to show that the decomposition (23) satisfies condition
(ii) in Lemma 8.1. The key step in proving this fact is contained in the following.
Claim: For every ε > 0, there exist functions f1 , f2 ∈ C R (X), with f1 , f2 ≥
0, f1 + f2 ≥ 1, and such that φ1 (f2 ) < ε and φ2 (f1 ) < ε.
To prove this we fix ε > 0, and we use the definition of the norm, to find some
function g ∈ C R (X), with ‖g‖ ≤ 1, and |φ(g)| ≥ ‖φ‖ − ε. Replacing g with −g, if
necessary, we can assume that
(24) φ(g) ≥ ‖φ‖ − ε.
Consider the functions g + = max{g, 0} and g − = max{−g, 0}, so that g = g + − g − ,
and we clearly have 0 ≤ g ± ≤ 1. On the one hand, since ‖φk‖ = φk (1) (see
Proposition II.5.4), we have φk (g ± ) ≤ ‖φk‖, k = 1, 2. On the other hand, by (24),
and the positivity of φ1 and φ2 , we know that
‖φ‖ − ε ≤ φ(g) = φ1 (g) − φ2 (g) = φ1 (g + ) + φ2 (g − ) − φ1 (g − ) − φ2 (g + ) ≤
≤ φ1 (g + ) + φ2 (g − ) ≤ ‖φ1‖ + ‖φ2‖ = ‖φ‖,
so we get
ε ≥ ‖φ‖ − φ1 (g + ) − φ2 (g − ) = ‖φ1‖ + ‖φ2‖ − φ1 (g + ) − φ2 (g − ) =
= φ1 (1) + φ2 (1) − φ1 (g + ) − φ2 (g − ) = φ1 (1 − g + ) + φ2 (1 − g − ).
If we define f1 = 1 − g − and f2 = 1 − g + , then it is clear that f1 , f2 ≥ 0. Using
the fact that g + + g − = |g| ≤ 1, we get f1 + f2 = 2 − |g| ≥ 1. Finally, the above
estimate gives φ1 (f2 ) + φ2 (f1 ) ≤ ε, and so the Claim immediately follows.
Having proven the Claim, we are now in position to prove that the two measures
µφ1 and µφ2 satisfy condition (ii) in Lemma 8.1. Start with some arbitrary ε > 0,
and use the Claim to find two functions f1 , f2 ∈ C R (X) with f1 , f2 ≥ 0, f1 +f2 ≥ 1,
such that φ1 (f2 ) ≤ ε/2 and φ2 (f1 ) ≤ ε/2. Consider the compact subsets
K1 = {x ∈ X : f1 (x) ≥ 1/2} and K2 = {x ∈ X : f2 (x) ≥ 1/2}.
Since f1 + f2 ≥ 1, it follows immediately that we have K1 ∪ K2 = X. By construction,
we have 2f1 ≥ κ_{K1} and 2f2 ≥ κ_{K2} , so using the interpolation property (see
Proposition 7.5), we get
µφ1 (K2 ) ≤ φ1 (2f2 ) = 2φ1 (f2 ) ≤ ε;
µφ2 (K1 ) ≤ φ2 (2f1 ) = 2φ2 (f1 ) ≤ ε. 
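The mechanism of the proof is easy to trace on a finite space. The sketch below is an added illustration, assuming X = {0, . . . , n − 1} with the discrete topology, so that C R (X) is R^n with the sup norm and every linear map has the form φ(f ) = Σ w_i f (i), with ‖φ‖ = Σ |w_i|. The helper names split_functional, evaluate and norm are hypothetical. Here the norm is attained exactly, so the Claim holds with φ1 (f2 ) = φ2 (f1 ) = 0.

```python
def split_functional(w):
    """Split weights w into positive and negative parts (phi = phi1 - phi2)."""
    phi1 = [max(wi, 0.0) for wi in w]
    phi2 = [max(-wi, 0.0) for wi in w]
    return phi1, phi2


def evaluate(phi, f):
    """phi(f) = sum_i phi_i * f(i) on the finite space {0, ..., n-1}."""
    return sum(p * x for p, x in zip(phi, f))


def norm(phi):
    """Operator norm with respect to the sup norm on functions."""
    return sum(abs(p) for p in phi)


w = [2.0, -1.5, 0.0, 0.5]
phi1, phi2 = split_functional(w)
assert norm(w) == norm(phi1) + norm(phi2)

# g with sup norm 1 attaining the norm: phi(g) = ||phi||.
g = [1.0 if wi >= 0 else -1.0 for wi in w]
assert evaluate(w, g) == norm(w)

# The functions from the Claim: f1 = 1 - g_minus, f2 = 1 - g_plus.
g_plus = [max(x, 0.0) for x in g]
g_minus = [max(-x, 0.0) for x in g]
f1 = [1.0 - x for x in g_minus]
f2 = [1.0 - x for x in g_plus]
assert all(a + b >= 1.0 for a, b in zip(f1, f2))
assert evaluate(phi1, f2) == 0.0 and evaluate(phi2, f1) == 0.0

# The sets K1, K2 cover X, and each carries none of the other measure.
K1 = {i for i, v in enumerate(f1) if v >= 0.5}
K2 = {i for i, v in enumerate(f2) if v >= 0.5}
assert K1 | K2 == set(range(len(w)))
assert sum(phi1[i] for i in K2) == 0.0 and sum(phi2[i] for i in K1) == 0.0
```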
The above result has several interesting consequences.
Corollary 8.2. Suppose X is a compact Hausdorff space. Then the extended
Riesz correspondence (22) is injective.
Proof. Since the correspondence (22) is linear, it suffices to prove the implication
µφ = 0 ⇒ φ = 0. Start with some linear continuous map φ : C R (X) → R,
such that µφ = 0. Use Corollary II.5.3 to find two positive linear maps φ1 , φ2 :
C R (X) → R, such that φ = φ1 − φ2 , and ‖φ‖ = ‖φ1‖ + ‖φ2‖. By Theorem 8.4
the difference µφ1 − µφ2 = µφ = 0 is the Hahn-Jordan decomposition of the zero
measure. By the uniqueness (see Corollary 8.1) it follows that µφ1 = µφ2 = 0. By
the interpolation property, we know that ‖φk‖ = φk (1) = µφk (X) = 0, k = 1, 2, so


we get φ1 = φ2 = 0, thus forcing φ = 0. 
The injectivity of the extended Riesz correspondence has as a consequence the
uniqueness of the decomposition of linear continuous maps as differences of positive ones,
of the type described in Corollary II.5.3.
Corollary 8.3. Let X be a compact Hausdorff space, and let φ : C R (X) → R
be a linear continuous map. Assume one has positive linear maps φ1 , φ2 , ψ1 , ψ2 :
C R (X) → R, such that
• φ = φ1 − φ2 = ψ1 − ψ2 ;
• ‖φ1‖ + ‖φ2‖ = ‖ψ1‖ + ‖ψ2‖ = ‖φ‖.
Then one has the equalities φ1 = ψ1 and φ2 = ψ2 .
Proof. Consider the signed measure µφ . By Theorem 8.4, the decompositions
µφ = µφ1 − µφ2 = µψ1 − µψ2
both represent the Hahn-Jordan decomposition of µφ . By the uniqueness (Corollary
8.1) we have µφ1 = µψ1 and µφ2 = µψ2 . By Corollary 8.2 this forces φ1 = ψ1 and
φ2 = ψ2 . 
Comment. If X is a compact Hausdorff space, and φ : C R (X) → R is a
linear continuous map, then by the above result, combined with Corollary II.5.3,
we know that there exist unique positive linear maps φ± : C R (X) → R such that
‖φ‖ = ‖φ+‖ + ‖φ−‖, and
(25) φ = φ+ − φ− .
The decomposition (25) will be referred to as the Hahn-Jordan decomposition of φ.
This noation and terminology are used for the following reason. If we take µφ the
measure given by the extended Riesz correspondence, then
µφ = µφ+ − µφ−
is precisely the Hahn-Jordan decomposition of µφ .
Remarks 8.4. There is a version of the extended Riesz correspondence which
works for general locally compact spaces. Start with a locally compact space X,
and define the spaces
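\[ M_0^{\mathbb R}(X) = \{ \phi : C_0^{\mathbb R}(X) \to \mathbb R \,:\, \phi \text{ linear continuous} \}, \]
\[ \mathfrak R_0^{\mathbb R}(X) = \{ \mu \,:\, \mu \text{ finite signed Radon measure on } X \}. \]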

Since $C_0^{\mathbb R}(X)$ is the completion of $C_c^{\mathbb R}(X)$, the correspondence
\[ M_0^{\mathbb R}(X) \ni \phi \longmapsto \phi\big|_{C_c^{\mathbb R}(X)} \]
establishes an isometric linear isomorphism between $M_0^{\mathbb R}(X)$ and the space of all
continuous linear maps $C_c^{\mathbb R}(X) \to \mathbb R$. For every positive $\phi \in M_0^{\mathbb R}(X)$, we denote by
$\mu_\phi$ the Riesz measure associated with the restriction $\phi_c = \phi|_{C_c^{\mathbb R}(X)}$. Since $\|\phi_c\| = \|\phi\|$,
we have the equality $\mu_\phi(X) = \|\phi\|$.
We know (see Proposition II.5.10) that for every linear continuous map
$\phi : C_0^{\mathbb R}(X) \to \mathbb R$, there exist linear positive continuous maps $\phi_1, \phi_2 : C_0^{\mathbb R}(X) \to \mathbb R$, with
$\phi = \phi_1 - \phi_2$. (In fact $\phi_1$ and $\phi_2$ can be chosen such that $\|\phi_1\| + \|\phi_2\| = \|\phi\|$.) We
use this fact to define the finite signed Radon measure $\mu_\phi = \mu_{\phi_1} - \mu_{\phi_2}$. Exactly as
in Remark 8.2, this definition is independent of the particular choice of φ1 and φ2 .


This way we have constructed a map
\[ M_0^{\mathbb R}(X) \ni \phi \longmapsto \mu_\phi \in \mathfrak R_0^{\mathbb R}(X) \tag{26} \]

which we will call the extended finite Riesz correspondence. Of course, if X is already
compact, we have $C_0^{\mathbb R}(X) = C^{\mathbb R}(X)$, $M_0^{\mathbb R}(X) = M^{\mathbb R}(X)$, and $\mathfrak R_0^{\mathbb R}(X) = \mathfrak R^{\mathbb R}(X)$,
so (26) is the extended Riesz correspondence previously defined.
The following result generalizes the statements of Remark 8.3, Theorem 8.4,
and Corollaries 8.2 and 8.3.
Theorem 8.5. Let X be a locally compact space.
A. The extended finite Riesz correspondence (26) is an injective linear map.
B. For every $\phi \in M_0^{\mathbb R}(X)$, there exist unique positive maps $\phi^+, \phi^- \in M_0^{\mathbb R}(X)$,
such that $\phi = \phi^+ - \phi^-$, and $\|\phi\| = \|\phi^+\| + \|\phi^-\|$. Moreover, in this case
µφ = µφ+ − µφ−
is precisely the Hahn-Jordan decomposition of µφ .
Proof. First of all, the correspondence (26) is clearly linear, again as a con-
sequence of Proposition 7.6.
Second, we remark that the existence part in B is already known, from Propo-
sition II.5.10. We are going to use the following version of Theorem 8.4.
Claim: Suppose $\phi \in M_0^{\mathbb R}(X)$ is written as a difference $\phi = \phi_1 - \phi_2$, with
$\phi_1, \phi_2 \in M_0^{\mathbb R}(X)$ positive, and $\|\phi\| = \|\phi_1\| + \|\phi_2\|$. Then
µφ = µφ1 − µφ2
is the Hahn-Jordan decomposition of µφ .
One way to prove this is by employing the Alexandrov compactification $X^\alpha = X \sqcup \{\infty\}$.
We use the identification
\[ C_0^{\mathbb R}(X) = \{ f \in C^{\mathbb R}(X^\alpha) : f(\infty) = 0 \}. \]
We know that there exist positive linear maps $\psi_1, \psi_2 : C^{\mathbb R}(X^\alpha) \to \mathbb R$, such that
$\psi_k|_{C_0^{\mathbb R}(X)} = \phi_k$, and $\|\psi_k\| = \|\phi_k\|$, $k = 1, 2$. If we define $\psi : C^{\mathbb R}(X^\alpha) \to \mathbb R$ by
$\psi = \psi_1 - \psi_2$, it is not hard to see that $\|\psi\| = \|\psi_1\| + \|\psi_2\|$, so if we consider the
Radon measures µψ , µψ1 and µψ2 on the compact space X α , then using Theorem
8.4, we get the fact that
µψ = µψ1 − µψ2
is precisely the Hahn-Jordan decomposition of µψ . This means that there are sets
B1 , B2 ∈ Bor(X α ), with B1 ∪ B2 = X α , B1 ∩ B2 = ∅, and µψ1 (B2 ) = µψ2 (B1 ) = 0.
We know (see Remarks 7.4) that
µψk (B) = µφk (B ∩ X), ∀ B ∈ Bor(X α ), k = 1, 2,
so if we define Ak = Bk ∩ X, we immediately get A1 ∪ A2 = X, A1 ∩ A2 = ∅, and
µφ1 (A2 ) = µφ2 (A1 ) = 0, thus proving that µφ1 ⊥ µφ2 .
Having proven the above Claim, the proof follows line by line the proofs of
Corollaries 8.2 and 8.3. □
The notion of a finite signed measure can be generalized to the complex case.
Definition. Suppose A is a σ-algebra on a non-empty set X. A function
µ : A → C is called a complex measure on A, if it is σ-additive in the sense that
(add_σ) for any pairwise disjoint sequence $(A_n)_{n=1}^\infty \subset \mathcal A$, one has the equality
\[ \mu\Big( \bigcup_{n=1}^\infty A_n \Big) = \sum_{n=1}^\infty \mu(A_n). \tag{27} \]
Remark that the condition µ(∅) = 0 is automatic in this case. Note also that a
map µ : A → C is a complex measure, if and only if the maps Re µ and Im µ are
finite signed measures.
The following result describes an important construction.
Theorem 8.6. Let A be a σ-algebra, and let µ be either a signed measure, or
a complex measure on A. For every A ∈ A, we define
\[ \nu(A) = \sup\Big\{ \sum_{k=1}^\infty |\mu(A_k)| \,:\, (A_k)_{k=1}^\infty \subset \mathcal A \text{ pairwise disjoint},\ \bigcup_{k=1}^\infty A_k = A \Big\}. \tag{28} \]
The map ν : A → [0, ∞] is an “honest” measure on A.
Proof. The first step in the proof is contained in the following.
Claim 1: For any pairwise disjoint sequence $(A_n)_{n=1}^\infty \subset \mathcal A$, one has the inequality
\[ \nu\Big( \bigcup_{n=1}^\infty A_n \Big) \le \sum_{n=1}^\infty \nu(A_n). \tag{29} \]
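Denote the right hand side of (29) by S, and denote the union $\bigcup_{n=1}^\infty A_n$ simply by A.
Start now with some pairwise disjoint sequence $(D_k)_{k=1}^\infty \subset \mathcal A$, with $\bigcup_{k=1}^\infty D_k = A$.
For every k ≥ 1, we have $D_k = \bigcup_{n=1}^\infty (D_k \cap A_n)$, with $(D_k \cap A_n)_{n=1}^\infty \subset \mathcal A$ pairwise
disjoint, so we have
\[ |\mu(D_k)| = \Big| \sum_{n=1}^\infty \mu(D_k \cap A_n) \Big| \le \sum_{n=1}^\infty |\mu(D_k \cap A_n)|, \quad \forall\, k \ge 1. \]
Summing up then yields
\[ \sum_{k=1}^\infty |\mu(D_k)| \le \sum_{k=1}^\infty \Big[ \sum_{n=1}^\infty |\mu(D_k \cap A_n)| \Big] = \sum_{n=1}^\infty \Big[ \sum_{k=1}^\infty |\mu(D_k \cap A_n)| \Big]. \tag{30} \]
Since for each n ≥ 1, the sequence $(D_k \cap A_n)_{k=1}^\infty \subset \mathcal A$ is pairwise disjoint, and
satisfies $\bigcup_{k=1}^\infty (D_k \cap A_n) = A_n$, by the definition of ν, we get
\[ \sum_{k=1}^\infty |\mu(D_k \cap A_n)| \le \nu(A_n), \quad \forall\, n \ge 1. \]
Using these estimates in (30), we then get
\[ \sum_{k=1}^\infty |\mu(D_k)| \le \sum_{n=1}^\infty \nu(A_n). \]
Since the inequality $\sum_{k=1}^\infty |\mu(D_k)| \le S$ holds for all pairwise disjoint sequences
$(D_k)_{k=1}^\infty \subset \mathcal A$, with $\bigcup_{k=1}^\infty D_k = A$, by the definition of ν we get ν(A) ≤ S, and the
Claim is proven.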
Claim 2: For any finite pairwise disjoint collection $(A_n)_{n=1}^N \subset \mathcal A$, one has
the inequality
\[ \nu(A_1 \cup \dots \cup A_N) \ge \nu(A_1) + \dots + \nu(A_N). \]
We use induction on N, and we see immediately that it suffices only to prove
the case N = 2. Fix for the moment a pairwise disjoint sequence $(D_k)_{k=1}^\infty \subset \mathcal A$,
with $\bigcup_{k=1}^\infty D_k = A_1$, and denote the sum $\sum_{k=1}^\infty |\mu(D_k)|$ by R. Suppose we have a
pairwise disjoint sequence $(E_j)_{j=1}^\infty \subset \mathcal A$, with $\bigcup_{j=1}^\infty E_j = A_2$. If we combine it with
the $D_k$'s, i.e. we define
\[ F_p = \begin{cases} D_{p/2} & \text{if } p \text{ is even} \\ E_{(p+1)/2} & \text{if } p \text{ is odd} \end{cases} \]
then we get a new pairwise disjoint sequence $(F_p)_{p=1}^\infty \subset \mathcal A$, with $\bigcup_{p=1}^\infty F_p = A_1 \cup A_2$.
By the definition of ν we will then get
\[ \nu(A_1 \cup A_2) \ge \sum_{p=1}^\infty |\mu(F_p)| = \sum_{k=1}^\infty |\mu(D_k)| + \sum_{j=1}^\infty |\mu(E_j)| = R + \sum_{j=1}^\infty |\mu(E_j)|. \]
Taking supremum over all pairwise disjoint sequences $(E_j)_{j=1}^\infty \subset \mathcal A$, with $\bigcup_{j=1}^\infty E_j = A_2$,
the above inequality yields $\nu(A_1 \cup A_2) \ge R + \nu(A_2)$, so now we have
\[ \nu(A_1 \cup A_2) \ge \nu(A_2) + \sum_{k=1}^\infty |\mu(D_k)|. \]
Taking supremum over all pairwise disjoint sequences $(D_k)_{k=1}^\infty \subset \mathcal A$, with $\bigcup_{k=1}^\infty D_k = A_1$,
the above inequality finally gives $\nu(A_1 \cup A_2) \ge \nu(A_2) + \nu(A_1)$, and the Claim
is proven.
We are now in position to prove that ν is a measure on A. The equality
ν(∅) = 0 is trivial. To prove σ-additivity, we start with some pairwise disjoint
sequence $(A_n)_{n=1}^\infty \subset \mathcal A$, and we must prove the equality
\[ \nu\Big( \bigcup_{n=1}^\infty A_n \Big) = \sum_{n=1}^\infty \nu(A_n). \]
On the one hand, using Claim 1, we know that we have the inequality
$\nu\big( \bigcup_{n=1}^\infty A_n \big) \le \sum_{n=1}^\infty \nu(A_n)$. On the other hand, if we denote the union $\bigcup_{n=1}^\infty A_n$ simply by A,
then using Claim 2, we see that
\[ \nu(A) \ge \nu(A \smallsetminus [A_1 \cup \dots \cup A_N]) + \nu(A_1) + \dots + \nu(A_N) \ge \nu(A_1) + \dots + \nu(A_N), \quad \forall\, N \ge 1, \]
which immediately gives the other inequality $\nu(A) \ge \sum_{n=1}^\infty \nu(A_n)$. □

Definition. With the notations above, and under the hypothesis of Theorem
8.6, the “honest” measure ν, defined by (28), is called the variation measure of µ,
and will be denoted by |µ|. By construction, we have the inequality
|µ(A)| ≤ |µ|(A), ∀ A ∈ A.
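As a sanity check of formula (28), the following minimal Python sketch computes the variation of a complex measure on a four-point set by brute force over all partitions. The point masses and the helper names (POINT_MASS, mu, partitions, variation) are illustrative assumptions and do not come from the notes. On a finite set the supremum in (28) only needs partitions into non-empty blocks, since empty blocks contribute nothing to the sum.

```python
# Hypothetical complex measure on the set {0, 1, 2, 3}, given by point masses.
POINT_MASS = {0: 3 + 4j, 1: -1 + 0j, 2: 0 - 2j, 3: -3 - 4j}


def mu(subset):
    """Value of the complex measure on a subset (sum of its point masses)."""
    return sum((POINT_MASS[x] for x in subset), 0j)


def partitions(subset):
    """Yield every partition of `subset` into non-empty, pairwise disjoint blocks."""
    items = sorted(subset)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # Put `first` into one of the existing blocks ...
        for i in range(len(smaller)):
            yield smaller[:i] + [smaller[i] | {first}] + smaller[i + 1:]
        # ... or give it a block of its own.
        yield smaller + [{first}]


def variation(subset):
    """Largest value of sum |mu(A_k)| over all partitions of `subset`, as in (28)."""
    return max(sum(abs(mu(block)) for block in p) for p in partitions(subset))
```

In this example the maximum is attained on the partition into single points, in line with the inequality |µ(A)| ≤ |µ|(A) applied block by block.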
Remark 8.5. Let µ be either a signed measure, or a complex measure on
the σ-algebra A. Exactly as with numbers (or functions), the measure |µ| has a
minimality property, which can be stated as follows. Whenever ν is an “honest”
measure on A with
|µ(A)| ≤ ν(A), ∀ A ∈ A,
it follows that we have
|µ|(A) ≤ ν(A), ∀ A ∈ A.
This is quite clear, because for any pairwise disjoint sequence $(A_n)_{n=1}^\infty \subset \mathcal A$, with
$\bigcup_{n=1}^\infty A_n = A$, one has the inequality
\[ \sum_{n=1}^\infty |\mu(A_n)| \le \sum_{n=1}^\infty \nu(A_n) = \nu(A), \]
and then the desired inequality follows by taking the supremum in the left hand
side.
In the case of signed measures, the variation measure is also given by the
following.
Proposition 8.2. Let µ be a signed measure on the σ-algebra A. Then one
has the equality
|µ| = µ+ + µ− ,
where µ = µ+ − µ− is the Hahn-Jordan decomposition of µ.
Proof. Denote the measure µ+ + µ− simply by ν. Remark that we obviously
have
−ν(A) = −µ+ (A) − µ− (A) ≤ µ+ (A) − µ− (A) ≤ µ+ (A) + µ− (A) = ν(A), ∀ A ∈ A,
which gives
|µ(A)| ≤ ν(A), ∀ A ∈ A.
By Remark 8.5, this forces the inequality |µ| ≤ ν.
To prove the other inequality, we start by fixing sets X + , X − ∈ A as in Theorem
8.2. We decompose each set A ∈ A as A = A+ ∪ A− , where A± = A ∩ X ± , so that
we have
ν(A) = ν(A+ )+ν(A− ) = µ+ (A+ )+µ+ (A− )+µ− (A+ )+µ− (A− ) = µ+ (A+ )+µ− (A− ).
Notice now that µ(A+ ) = µ+ (A+ ) ≥ 0, and −µ(A− ) = µ− (A− ) ≥ 0, which means
that we have the equalities µ+ (A+ ) = |µ(A+ )| and µ− (A− ) = |µ(A− )|, so the above
equality reads
ν(A) = |µ(A+ )| + |µ(A− )|,
and by the definition of |µ| we then immediately get ν(A) ≤ |µ|(A). 
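The proof above can be illustrated on a finite set, where a Hahn decomposition is obtained by sorting the points according to the sign of their mass. The sketch below is only an illustration under that assumption: the weights and the names WEIGHTS, X_PLUS, X_MINUS, mu_plus, mu_minus and total_variation are hypothetical.

```python
# Hypothetical signed measure on {"a", "b", "c", "d"}, given by point masses.
WEIGHTS = {"a": 2, "b": -5, "c": 0, "d": 3}


def mu(A):
    """Signed measure of a subset."""
    return sum(WEIGHTS[x] for x in A)


# A Hahn decomposition: mu is >= 0 on subsets of X_PLUS and <= 0 on subsets of X_MINUS.
X_PLUS = {x for x, w in WEIGHTS.items() if w >= 0}
X_MINUS = set(WEIGHTS) - X_PLUS


def mu_plus(A):
    """Positive part: mu restricted to X_PLUS."""
    return mu(set(A) & X_PLUS)


def mu_minus(A):
    """Negative part: -mu restricted to X_MINUS."""
    return -mu(set(A) & X_MINUS)


def total_variation(A):
    """|mu|(A) computed as mu_plus(A) + mu_minus(A)."""
    return mu_plus(A) + mu_minus(A)
```

For point masses the value mu_plus(A) + mu_minus(A) is the sum of the absolute values of the masses in A, which is what the supremum in (28) gives on a finite set.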
An interesting consequence is the following.
Corollary 8.4. Let µ be either a finite signed measure, or a complex measure
on the σ-algebra A. Then the variation measure |µ| is finite.
Proof. The signed measure case is clear from the above result.
In the complex case, we write µ = ν + iη, with ν and η finite signed measures
on A. We apply the signed case, to get the fact that both |ν| and |η| are finite.
Notice that we have
|µ(A)| = |ν(A) + iη(A)| ≤ |ν(A)| + |η(A)| ≤ |ν|(A) + |η|(A), ∀ A ∈ A,
so by Remark 8.5 we get |µ| ≤ |ν|+|η|, and then the finiteness of |µ| is a consequence
of the finiteness of |ν| and |η|. 
Exercise 3. Let A be a σ-algebra, and let K be one of the fields R or C. For
the purpose of this exercise, let us agree to use the term K-measure for designating
either a finite signed measure (when K = R), or a complex measure (when K = C).
Prove the following.
(i) The collection of all K-measures on A is a vector space.
(ii) For any two K-measures µ and ν, one has the inequality
|µ + ν| ≤ |µ| + |ν|.
(iii) For any K-measure µ and any α ∈ K, one has the equality
|αµ| = |α| · |µ|.
Proposition 8.2 has another interesting consequence, which is relevant for the
study of the extended finite Riesz correspondence.
Corollary 8.5. Let X be a locally compact space. Then the extended finite
Riesz correspondence (26) has the property
\[ |\mu_\phi|(X) = \|\phi\|, \quad \forall\, \phi \in M_0^{\mathbb R}(X). \tag{31} \]

Proof. From Proposition 8.2 and Theorem 8.5, we know that $|\mu_\phi| = \mu_{\phi^+} + \mu_{\phi^-}$.
Using Remarks 8.4, and Theorem 8.5 again, we have
\[ |\mu_\phi|(X) = \mu_{\phi^+}(X) + \mu_{\phi^-}(X) = \|\phi^+\| + \|\phi^-\| = \|\phi\|. \] □
Comments. Given a locally compact space X, we can define a complex Radon
measure on X as being a complex measure on X, whose real and imaginary part
are both (finite) signed Radon measures. The extended finite Riesz correspondence
can be then defined also over the complex numbers, as a map
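\[ M_0(X) \ni \phi \longmapsto \mu_\phi \in \mathfrak R_0(X), \]
where
\[ M_0(X) = \{ \phi : C_0(X) \to \mathbb C \,:\, \phi \text{ linear continuous} \}, \]
\[ \mathfrak R_0(X) = \{ \mu \,:\, \mu \text{ complex Radon measure on } X \}. \]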
This correspondence is again linear. One will still have the equality (31), but the
proof of this fact will appear later in Chapter IV.
Chapter IV
Integration Theory
Lectures 32-33

1. Construction of the integral


In this section we construct the abstract integral. As a matter of terminology,
we define a measure space as being a triple (X, A, µ), where X is some (non-empty)
set, A is a σ-algebra on X, and µ is a measure on A. The measure space (X, A, µ)
is said to be finite, if µ(X) < ∞.
Definition. Let (X, A, µ) be a measure space, and let K be one of the fields
R or C. A K-valued elementary µ-integrable function on (X, A, µ) is a function
f : X → K, with the following properties:
• the range f(X) of f is a finite set;
• $f^{-1}(\{\alpha\}) \in \mathcal A$, and $\mu\big(f^{-1}(\{\alpha\})\big) < \infty$, for all $\alpha \in f(X) \smallsetminus \{0\}$.
We denote by $L^1_{\mathbb K,\mathrm{elem}}(X, \mathcal A, \mu)$ the collection of all such functions.


Remarks 1.1. Let (X, A, µ) be a measure space.
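A. Every K-valued elementary µ-integrable function f on (X, A, µ) is measurable,
as a map $f : (X, \mathcal A) \to \big(\mathbb K, \mathrm{Bor}(\mathbb K)\big)$. In fact, any such f can be written as
\[ f = \alpha_1 \kappa_{A_1} + \dots + \alpha_n \kappa_{A_n}, \]
with $\alpha_k \in \mathbb K$, $A_k \in \mathcal A$ and $\mu(A_k) < \infty$, ∀ k = 1, . . . , n. Using the notations from
III.1, we have the inclusion
\[ L^1_{\mathbb K,\mathrm{elem}}(X, \mathcal A, \mu) \subset \mathcal A\text{-Elem}_{\mathbb K}(X). \]
B. If we consider the collection $\mathcal R = \{A \in \mathcal A : \mu(A) < \infty\}$, then $\mathcal R$ is a ring,
and we have the equality
\[ L^1_{\mathbb K,\mathrm{elem}}(X, \mathcal A, \mu) = \mathcal R\text{-Elem}_{\mathbb K}(X). \]
In particular, it follows that $L^1_{\mathbb K,\mathrm{elem}}(X, \mathcal A, \mu)$ is a K-vector space.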
The following result is the first step in the construction of the integral.
Theorem 1.1. Let (X, A, µ) be a measure space, and let K be one of the fields
R or C. Then there exists a unique K-linear map $I^\mu_{\mathrm{elem}} : L^1_{\mathbb K,\mathrm{elem}}(X, \mathcal A, \mu) \to \mathbb K$,
such that
\[ I^\mu_{\mathrm{elem}}(\kappa_A) = \mu(A), \tag{1} \]
for all A ∈ A, with µ(A) < ∞.

Proof. For every $f \in L^1_{\mathbb K,\mathrm{elem}}(X, \mathcal A, \mu)$, we define
\[ I^\mu_{\mathrm{elem}}(f) = \sum_{\alpha \in f(X) \smallsetminus \{0\}} \alpha \cdot \mu\big(f^{-1}(\{\alpha\})\big), \]
with the convention that, when f(X) = {0} (which is the same as f = 0), we define
$I^\mu_{\mathrm{elem}}(f) = 0$. It is obvious that $I^\mu_{\mathrm{elem}}$ satisfies the equality (1) for all A ∈ A with
µ(A) < ∞.
One key feature we are going to use is the following.
Claim 1: Whenever we have a finite pairwise disjoint sequence $(A_k)_{k=1}^n \subset \mathcal A$,
with $\mu(A_k) < \infty$, ∀ k = 1, . . . , n, one has the equality
\[ I^\mu_{\mathrm{elem}}(\alpha_1 \kappa_{A_1} + \dots + \alpha_n \kappa_{A_n}) = \alpha_1 \mu(A_1) + \dots + \alpha_n \mu(A_n), \quad \forall\, \alpha_1, \dots, \alpha_n \in \mathbb K. \]
It is obvious that we can assume $\alpha_j \ne 0$, ∀ j = 1, . . . , n. To prove the above equality,
we consider the elementary µ-integrable function $f = \alpha_1 \kappa_{A_1} + \dots + \alpha_n \kappa_{A_n}$, and we
observe that $f(X) \smallsetminus \{0\} = \{\alpha_1\} \cup \dots \cup \{\alpha_n\}$. It may be the case that some of the α's
are equal. We list $f(X) \smallsetminus \{0\} = \{\beta_1, \dots, \beta_p\}$, with $\beta_j \ne \beta_k$, for all j, k ∈ {1, . . . , p}
with j ≠ k. For each k ∈ {1, . . . , p}, we define the set
\[ J_k = \{ j \in \{1, \dots, n\} : \alpha_j = \beta_k \}. \]
It is obvious that the sets $(J_k)_{k=1}^p$ are pairwise disjoint, and we have $J_1 \cup \dots \cup J_p = \{1, \dots, n\}$.
Moreover, for each k ∈ {1, . . . , p}, one has the equality
\[ f^{-1}(\{\beta_k\}) = \bigcup_{j \in J_k} A_j, \]
so we get
\[ \beta_k\, \mu\big(f^{-1}(\{\beta_k\})\big) = \beta_k \sum_{j \in J_k} \mu(A_j) = \sum_{j \in J_k} \alpha_j \mu(A_j), \quad \forall\, k \in \{1, \dots, p\}. \]
By the definition of $I^\mu_{\mathrm{elem}}$ we then get
\[ I^\mu_{\mathrm{elem}}(f) = \sum_{k=1}^p \beta_k\, \mu\big(f^{-1}(\{\beta_k\})\big) = \sum_{k=1}^p \Big[ \sum_{j \in J_k} \alpha_j \mu(A_j) \Big] = \sum_{j=1}^n \alpha_j \mu(A_j). \]

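Claim 2: For every $f \in L^1_{\mathbb K,\mathrm{elem}}(X, \mathcal A, \mu)$, and every A ∈ A with µ(A) < ∞,
one has the equality
\[ I^\mu_{\mathrm{elem}}(f + \alpha \kappa_A) = I^\mu_{\mathrm{elem}}(f) + \alpha \mu(A), \quad \forall\, \alpha \in \mathbb K. \tag{2} \]
Write $f = \alpha_1 \kappa_{A_1} + \dots + \alpha_n \kappa_{A_n}$, with $(A_j)_{j=1}^n \subset \mathcal A$ pairwise disjoint, and $\mu(A_j) < \infty$,
∀ j = 1, . . . , n. In order to prove (2), we are going to write the function $f + \alpha \kappa_A$
in a similar way, and we are going to apply Claim 1. Consider the sets
$B_1, B_2, \dots, B_{2n}, B_{2n+1} \in \mathcal A$ defined by $B_{2n+1} = A \smallsetminus (A_1 \cup \dots \cup A_n)$, and
$B_{2k-1} = A_k \cap A$, $B_{2k} = A_k \smallsetminus A$, ∀ k = 1, . . . , n. It is obvious that the sets $(B_p)_{p=1}^{2n+1}$ are
pairwise disjoint. Moreover, one has the equalities
\[ B_{2k-1} \cup B_{2k} = A_k, \quad \forall\, k \in \{1, \dots, n\}, \tag{3} \]
as well as the equality
\[ A = \bigcup_{k=1}^{n+1} B_{2k-1}. \tag{4} \]
Using these equalities, now we have $f + \alpha \kappa_A = \sum_{p=1}^{2n+1} \beta_p \kappa_{B_p}$, where $\beta_{2n+1} = \alpha$,
and $\beta_{2k} = \alpha_k$ and $\beta_{2k-1} = \alpha_k + \alpha$, ∀ k ∈ {1, . . . , n}. Using these equalities,
combined with Claim 1, and (3) and (4), we now get
\begin{align*}
I^\mu_{\mathrm{elem}}(f + \alpha \kappa_A) &= \sum_{p=1}^{2n+1} \beta_p \mu(B_p) \\
&= \alpha \mu(B_{2n+1}) + \sum_{k=1}^n \big[ (\alpha_k + \alpha)\mu(B_{2k-1}) + \alpha_k \mu(B_{2k}) \big] \\
&= \alpha \Big[ \sum_{k=1}^{n+1} \mu(B_{2k-1}) \Big] + \sum_{k=1}^n \alpha_k \big[ \mu(B_{2k-1}) + \mu(B_{2k}) \big] \\
&= \alpha\, \mu\Big( \bigcup_{k=1}^{n+1} B_{2k-1} \Big) + \sum_{k=1}^n \alpha_k \mu(B_{2k-1} \cup B_{2k}) \\
&= \alpha \mu(A) + \sum_{k=1}^n \alpha_k \mu(A_k) = \alpha \mu(A) + I^\mu_{\mathrm{elem}}(f),
\end{align*}
and the Claim is proven.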
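The refinement used in Claim 2 can be checked mechanically on finite sets. The following Python sketch builds the sets B_1, . . . , B_{2n+1} from a pairwise disjoint family A_1, . . . , A_n and a set A; the function name refine and the sample sets in the usage note are hypothetical, and the list it returns is 0-indexed, so B_p is stored at position p − 1.

```python
def refine(blocks, A):
    """Return [B_1, ..., B_{2n+1}] for pairwise disjoint `blocks` = [A_1, ..., A_n].

    B_{2k-1} = A_k & A, B_{2k} = A_k - A, and B_{2n+1} = A minus all the A_k.
    """
    B = []
    for A_k in blocks:
        B.append(A_k & A)  # B_{2k-1}: the part of A_k inside A
        B.append(A_k - A)  # B_{2k}:   the part of A_k outside A
    covered = set().union(*blocks)
    B.append(A - covered)  # B_{2n+1}: the part of A missed by every A_k
    return B
```

For instance, with blocks [{1, 2}, {3, 4, 5}] and A = {2, 3, 9}, the odd-numbered sets are {2}, {3}, {9}, whose union is A as in (4), and each consecutive pair reassembles the corresponding A_k as in (3).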


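We now prove that $I^\mu_{\mathrm{elem}}$ is linear. The equality
\[ I^\mu_{\mathrm{elem}}(f + g) = I^\mu_{\mathrm{elem}}(f) + I^\mu_{\mathrm{elem}}(g), \quad \forall\, f, g \in L^1_{\mathbb K,\mathrm{elem}}(X, \mathcal A, \mu) \]
follows from Claim 2, using an obvious inductive argument. The equality
\[ I^\mu_{\mathrm{elem}}(\alpha f) = \alpha I^\mu_{\mathrm{elem}}(f), \quad \forall\, \alpha \in \mathbb K,\ f \in L^1_{\mathbb K,\mathrm{elem}}(X, \mathcal A, \mu) \]
is also pretty obvious, from the definition.
The uniqueness is also clear. □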
Definition. With the notations above, the linear map
\[ I^\mu_{\mathrm{elem}} : L^1_{\mathbb K,\mathrm{elem}}(X, \mathcal A, \mu) \to \mathbb K \]
is called the elementary µ-integral.
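To make the definition concrete, here is a minimal Python sketch of the elementary integral on a finite measure space in which every subset is measurable. The weights and the names WEIGHT, measure, simple, add and elementary_integral are assumptions made for this illustration only. The function elementary_integral follows the defining formula (the sum over the non-zero values α of α · µ(f⁻¹({α}))), and exact rational arithmetic is used so that additivity can be compared with equality.

```python
from fractions import Fraction

# Hypothetical measure on X = {0, ..., 5}, given by point weights.
WEIGHT = {0: Fraction(1, 2), 1: Fraction(1), 2: Fraction(3, 2),
          3: Fraction(2), 4: Fraction(0), 5: Fraction(1, 4)}


def measure(A):
    """mu(A) as the sum of the point weights in A."""
    return sum((WEIGHT[x] for x in A), Fraction(0))


def simple(terms):
    """Build alpha_1*kappa_{A_1} + ... + alpha_n*kappa_{A_n} as a dict on X."""
    f = {x: Fraction(0) for x in WEIGHT}
    for alpha, A in terms:
        for x in A:
            f[x] += alpha
    return f


def add(f, g):
    """Pointwise sum of two functions on X."""
    return {x: f[x] + g[x] for x in WEIGHT}


def elementary_integral(f):
    """Sum of alpha * mu(f^{-1}({alpha})) over the non-zero values alpha of f."""
    total = Fraction(0)
    for alpha in set(f.values()) - {0}:
        level_set = {x for x, v in f.items() if v == alpha}
        total += alpha * measure(level_set)
    return total
```

With f = simple([(2, {0, 1}), (2, {2}), (-1, {3, 5})]) the two coefficients equal to 2 are merged into a single level set, which is the situation handled by Claim 1.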
In what follows we are going to encounter also situations when certain relations
among measurable functions hold “almost everywhere.” We are going to use the
following.
Convention. Let T be one of the spaces [−∞, ∞] or C, and let r be some
relation on T (in our case r will be either “=,” or “≥,” or “≤,” on [−∞, ∞]). Given
a measure space (X, A, µ), and two measurable functions f1 , f2 : X → T , we write
f1 r f2 , µ-a.e.
if the set
\[ A = \{ x \in X : f_1(x) \ r\ f_2(x) \} \]
belongs to $\mathcal A$, and it has µ-null complement in X, i.e. $\mu(X \smallsetminus A) = 0$. (If r is one of
the relations listed above, the set A automatically belongs to $\mathcal A$.) The abbreviation
“µ-a.e.” stands for “µ-almost everywhere.”
Remark 1.2. Let (X, A, µ) be a measure space, let $f \in \mathcal A\text{-Elem}_{\mathbb K}(X)$ be such
that
f = 0, µ-a.e.
Then $f \in L^1_{\mathbb K,\mathrm{elem}}(X, \mathcal A, \mu)$, and $I^\mu_{\mathrm{elem}}(f) = 0$. Indeed, if we define the set
\[ N = \{ x \in X : f(x) \ne 0 \}, \]
then $N \in \mathcal A$ and µ(N) = 0. Since $f^{-1}(\{\alpha\}) \subset N$, $\forall\, \alpha \in f(X) \smallsetminus \{0\}$, it follows that
$\mu\big(f^{-1}(\{\alpha\})\big) = 0$, $\forall\, \alpha \in f(X) \smallsetminus \{0\}$, and then by the definition of the elementary
µ-integral, we get $I^\mu_{\mathrm{elem}}(f) = 0$.
One useful property of elementary integrable functions is the following.
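Proposition 1.1. Let (X, A, µ) be a measure space, let $f, g \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$,
and let $h \in \mathcal A\text{-Elem}_{\mathbb R}(X)$ be such that
f ≤ h ≤ g, µ-a.e.
Then $h \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$, and
\[ I^\mu_{\mathrm{elem}}(f) \le I^\mu_{\mathrm{elem}}(h) \le I^\mu_{\mathrm{elem}}(g). \tag{5} \]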
Proof. Consider the sets
A = {x ∈ X : f (x) > h(x)} and B = {x ∈ X : h(x) > g(x)},
which both belong to A, and have µ(A) = µ(B) = 0. The set M = A ∪ B
also belongs to A and has µ(M ) = 0. Define the functions f0 = f (1 − κ M ),
g0 = g(1 − κ M ), and h0 = h(1 − κ M ). It is clear that f0 , g0 , and h0 are all
in $\mathcal A\text{-Elem}_{\mathbb R}(X)$. Moreover, we have the equalities f0 = f , µ-a.e., g0 = g, µ-a.e.,
and h0 = h, µ-a.e., so by Remark 1.2, combined with Theorem 1.1, the functions
$f_0 = f + (f_0 - f)$ and $g_0 = (g_0 - g) + g$ both belong to $L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$, and we have
the equalities
\[ I^\mu_{\mathrm{elem}}(f_0) = I^\mu_{\mathrm{elem}}(f) \quad \text{and} \quad I^\mu_{\mathrm{elem}}(g_0) = I^\mu_{\mathrm{elem}}(g). \tag{6} \]
Notice now that we have the (absolute) inequality
f0 ≤ h0 ≤ g0 .
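Let us show that $h_0$ is elementary integrable. Start with some $\alpha \in h_0(X) \smallsetminus \{0\}$. If
α > 0, then, using the inequality $h_0 \le g_0$, we get
\[ h_0^{-1}(\{\alpha\}) \subset g_0^{-1}\big((0, \infty)\big) \subset \bigcup_{\lambda \in g_0(X) \smallsetminus \{0\}} g_0^{-1}(\{\lambda\}), \]
which proves that $\mu\big(h_0^{-1}(\{\alpha\})\big) < \infty$. Likewise, if α < 0, then, using the inequality
$h_0 \ge f_0$, we get
\[ h_0^{-1}(\{\alpha\}) \subset f_0^{-1}\big((-\infty, 0)\big) \subset \bigcup_{\lambda \in f_0(X) \smallsetminus \{0\}} f_0^{-1}(\{\lambda\}), \]
which proves again that $\mu\big(h_0^{-1}(\{\alpha\})\big) < \infty$.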
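Having shown that $h_0$ is elementary integrable, we now compare the numbers
$I^\mu_{\mathrm{elem}}(f)$, $I^\mu_{\mathrm{elem}}(h_0)$, and $I^\mu_{\mathrm{elem}}(g)$. Define the functions $f_1 = h_0 - f_0$, and $g_1 = g_0 - h_0$.
By Theorem 1.1, we know that $f_1, g_1 \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$. Since $f_1, g_1 \ge 0$, we
have $f_1(X), g_1(X) \subset [0, \infty)$, so it follows immediately that $I^\mu_{\mathrm{elem}}(f_1) \ge 0$ and
$I^\mu_{\mathrm{elem}}(g_1) \ge 0$. Now, again using Theorem 1.1, and (6), we get
\[ I^\mu_{\mathrm{elem}}(h_0) = I^\mu_{\mathrm{elem}}(f_0 + f_1) = I^\mu_{\mathrm{elem}}(f_0) + I^\mu_{\mathrm{elem}}(f_1) \ge I^\mu_{\mathrm{elem}}(f_0) = I^\mu_{\mathrm{elem}}(f); \]
\[ I^\mu_{\mathrm{elem}}(h_0) = I^\mu_{\mathrm{elem}}(g_0 - g_1) = I^\mu_{\mathrm{elem}}(g_0) - I^\mu_{\mathrm{elem}}(g_1) \le I^\mu_{\mathrm{elem}}(g_0) = I^\mu_{\mathrm{elem}}(g). \]
Since $h = h_0$, µ-a.e., by the above Remark it follows that $h \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$, and
$I^\mu_{\mathrm{elem}}(h) = I^\mu_{\mathrm{elem}}(h_0)$, so the desired inequality (5) follows immediately. □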
We now define another type of integral.
Definition. Let (X, A, µ) be a measure space. A measurable function
f : X → [0, ∞] is said to be µ-integrable, if
(a) every $h \in \mathcal A\text{-Elem}_{\mathbb R}(X)$, with 0 ≤ h ≤ f , is elementary µ-integrable;
(b) $\sup\big\{ I^\mu_{\mathrm{elem}}(h) : h \in \mathcal A\text{-Elem}_{\mathbb R}(X),\ 0 \le h \le f \big\} < \infty$.
If this is the case, the above supremum is denoted by $I^\mu_+(f)$. The space of all such
functions is denoted by $L^1_+(X, \mathcal A, \mu)$. The map
\[ I^\mu_+ : L^1_+(X, \mathcal A, \mu) \to [0, \infty) \]
is called the positive µ-integral.
The first (legitimate) question is whether there is an overlap between the two
definitions. This is answered by the following.
Proposition 1.2. Let (X, A, µ) be a measure space, and let f ∈ A-ElemR (X)
be a function with f ≥ 0. The following are equivalent
(i) f ∈ L1+ (X, A, µ);
(ii) f ∈ L1R,elem (X, A, µ).
Moreover, if f is as above, then $I^\mu_{\mathrm{elem}}(f) = I^\mu_+(f)$.
Proof. The implication (i) ⇒ (ii) is trivial.
To prove the implication (ii) ⇒ (i) we start with an arbitrary elementary
h ∈ A-ElemR (X), with 0 ≤ h ≤ f . Using Proposition 1.1, we clearly get
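(a) $h \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$;
(b) $I^\mu_{\mathrm{elem}}(h) \le I^\mu_{\mathrm{elem}}(f)$.
Using these two facts, it follows that $f \in L^1_+(X, \mathcal A, \mu)$, as well as the equality
\[ \sup\big\{ I^\mu_{\mathrm{elem}}(h) : h \in \mathcal A\text{-Elem}_{\mathbb R}(X),\ 0 \le h \le f \big\} = I^\mu_{\mathrm{elem}}(f), \]
which gives $I^\mu_+(f) = I^\mu_{\mathrm{elem}}(f)$. □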
We now examine properties of the positive integral, which are similar to those
of the elementary integral. The following is an analogue of Proposition 1.1.
Proposition 1.3. Let (X, A, µ) be a measure space, let $f \in L^1_+(X, \mathcal A, \mu)$,
and let g : X → [0, ∞] be a measurable function, such that g ≤ f , µ-a.e. Then
$g \in L^1_+(X, \mathcal A, \mu)$, and $I^\mu_+(g) \le I^\mu_+(f)$.
Proof. Start with some elementary function $h \in \mathcal A\text{-Elem}_{\mathbb R}(X)$, with 0 ≤ h ≤ g.
Consider the sets
M = {x ∈ X : h(x) > f (x)} and N = {x ∈ X : g(x) > f (x)},
which obviously belong to $\mathcal A$. Since M ⊂ N , and µ(N ) = 0, we have µ(M ) = 0. If
we define the elementary function $h_0 = h(1 - \kappa_M)$, then we have $h = h_0$, µ-a.e.,
and $0 \le h_0 \le f$, so it follows that $h_0 \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$, and $I^\mu_{\mathrm{elem}}(h_0) \le I^\mu_+(f)$.
Since $h = h_0$, µ-a.e., by Proposition 1.1, it follows that $h \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$,
and $I^\mu_{\mathrm{elem}}(h) = I^\mu_{\mathrm{elem}}(h_0) \le I^\mu_+(f)$. By definition, this gives $g \in L^1_+(X, \mathcal A, \mu)$ and
$I^\mu_+(g) \le I^\mu_+(f)$. □
Remark 1.3. Let (X, A, µ) be a measure space, and let $f \in L^1_+(X, \mathcal A, \mu)$.
Although f is allowed to take the value ∞, it turns out that this is inessential.
More precisely one has
\[ \mu\big(f^{-1}(\{\infty\})\big) = 0. \]
This is in fact a consequence of the equality
\[ \lim_{t \to \infty} \mu\big(f^{-1}([t, \infty])\big) = 0. \tag{7} \]
Indeed, if we define, for each t ∈ (0, ∞), the set $A_t = f^{-1}([t, \infty]) \in \mathcal A$, then we
have $0 \le t\kappa_{A_t} \le f$. This forces the functions $t\kappa_{A_t}$, t ∈ (0, ∞), to be elementary
integrable, and
\[ \mu(A_t) \le \frac{I^\mu_+(f)}{t}, \quad \forall\, t \in (0, \infty). \]
This forces $\lim_{t \to \infty} \mu(A_t) = 0$.
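The estimate µ(A_t) ≤ I₊(f)/t can be tested numerically on a finite measure space, where every non-negative finite function is elementary and its positive integral is the weighted sum of its values (by Proposition 1.2 and Claim 1 of Theorem 1.1). The sketch below relies on that assumption; the weights, the function F and the names integral and tail_measure are hypothetical.

```python
from fractions import Fraction

# Hypothetical measure on X = {0, 1, 2, 3}, given by point weights.
WEIGHT = {0: Fraction(1, 2), 1: Fraction(1), 2: Fraction(3, 2), 3: Fraction(2)}

# Hypothetical non-negative function on X.
F = {0: Fraction(0), 1: Fraction(1, 3), 2: Fraction(5), 3: Fraction(40)}


def integral(f):
    """Weighted sum of the values of f; on a finite space this is the integral."""
    return sum((f[x] * WEIGHT[x] for x in f), Fraction(0))


def tail_measure(f, t):
    """mu(A_t), where A_t = {x : f(x) >= t}."""
    return sum((WEIGHT[x] for x in f if f[x] >= t), Fraction(0))
```

Checking t * tail_measure(F, t) <= integral(F) for several values of t reproduces the inequality above, and the tail measure vanishes once t exceeds the largest value of F.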
The next result explains the fact that positive integrability is a “decomposable”
property.
Proposition 1.4. Let (X, A, µ) be a measure space. Suppose $(A_k)_{k=1}^n \subset \mathcal A$
is a pairwise disjoint finite sequence, with A1 ∪ · · · ∪ An = X. For a measurable
function f : X → [0, ∞], the following are equivalent.
(i) f ∈ L1+ (X, A, µ);
(ii) f κ Ak ∈ L1+ (X, A, µ), ∀ k = 1, . . . , n.
Moreover, if f satisfies these equivalent conditions, one has
\[ I^\mu_+(f) = \sum_{k=1}^n I^\mu_+(f \kappa_{A_k}). \]

Proof. The implication (i) ⇒ (ii) is trivial, since we have 0 ≤ f κ Ak ≤ f , so


we can apply Proposition 1.3.
To prove the implication (ii) ⇒ (i), start by assuming that f satisfies condition
(ii). We first observe that every elementary function h ∈ A-ElemR (X), with 0 ≤
h ≤ f , has the properties:
(a) $h \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$;
(b) $I^\mu_{\mathrm{elem}}(h) \le \sum_{k=1}^n I^\mu_+(f \kappa_{A_k})$.
This is immediate from the fact that we have the equality $h = \sum_{k=1}^n h\kappa_{A_k}$, and all
functions $h\kappa_{A_k}$ are elementary, and satisfy $0 \le h\kappa_{A_k} \le f\kappa_{A_k}$, and then everything
follows from Theorem 1.1 and the definition of the positive integral which gives
\[ I^\mu_{\mathrm{elem}}(h\kappa_{A_k}) \le I^\mu_+(f\kappa_{A_k}). \]
Of course, the properties (a) and (b) above prove that $f \in L^1_+(X, \mathcal A, \mu)$, as well
as the inequality
\[ I^\mu_+(f) \le \sum_{k=1}^n I^\mu_+(f\kappa_{A_k}). \]
To prove that we have in fact equality, we start with some ε > 0, and we choose, for
each k ∈ {1, . . . , n}, a function $h_k \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$, such that $0 \le h_k \le f\kappa_{A_k}$,
and $I^\mu_{\mathrm{elem}}(h_k) \ge I^\mu_+(f\kappa_{A_k}) - \frac{\varepsilon}{n}$. By Theorem 1.1, the function $h = h_1 + \dots + h_n$
belongs to $L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$, and has
\[ I^\mu_{\mathrm{elem}}(h) = \sum_{k=1}^n I^\mu_{\mathrm{elem}}(h_k) \ge \Big[ \sum_{k=1}^n I^\mu_+(f\kappa_{A_k}) \Big] - \varepsilon. \tag{8} \]
We obviously have
\[ h = \sum_{k=1}^n h_k \le \sum_{k=1}^n f\kappa_{A_k} = f, \]
so we get $I^\mu_{\mathrm{elem}}(h) \le I^\mu_+(f)$, thus the inequality (8) gives
\[ I^\mu_+(f) \ge \Big[ \sum_{k=1}^n I^\mu_+(f\kappa_{A_k}) \Big] - \varepsilon. \]
Since this inequality holds for all ε > 0, we get $I^\mu_+(f) \ge \sum_{k=1}^n I^\mu_+(f\kappa_{A_k})$, and we
are done. □
Remark 1.4. Let (X, A, µ) be a measure space, and let $S \in \mathcal A$. We can define
\[ \mathcal A|_S = \{ A \cap S : A \in \mathcal A \} = \{ A \in \mathcal A : A \subset S \}, \]
so that $\mathcal A|_S \subset \mathcal A$ is a σ-algebra on S. The restriction of µ to $\mathcal A|_S$ will be denoted
by $\mu|_S$. With these notations, $(S, \mathcal A|_S, \mu|_S)$ is a measure space. It is not hard to
see that for a measurable function f : X → [0, ∞], the conditions
• $f\kappa_S \in L^1_+(X, \mathcal A, \mu)$,
• $f|_S \in L^1_+(S, \mathcal A|_S, \mu|_S)$
are equivalent. Moreover, in this case one has the equality
\[ I^\mu_+(f\kappa_S) = I^{\mu|_S}_+(f|_S). \]
This is a consequence of the fact that these two conditions are equivalent if f is
elementary, combined with the fact that the restriction map $h \longmapsto h|_S$ establishes
a bijection between the sets
\[ \big\{ h \in \mathcal A\text{-Elem}_{\mathbb R}(X) : 0 \le h \le f\kappa_S \big\}, \]
\[ \big\{ k \in \mathcal A|_S\text{-Elem}_{\mathbb R}(S) : 0 \le k \le f|_S \big\}. \]
The next result gives an alternative definition of the positive integral, for func-
tions that are dominated by elementary integrable ones.
Proposition 1.5. Let (X, A, µ) be a measure space, let f : X → [0, ∞] be
a measurable function. Assume there exists $h_0 \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$, with $h_0 \ge f$.
Then $f \in L^1_+(X, \mathcal A, \mu)$, and one has the equality
\[ I^\mu_+(f) = \inf\big\{ I^\mu_{\mathrm{elem}}(h) : h \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu),\ h \ge f \big\}. \tag{9} \]

Proof. Since $h_0 \ge 0$, by Proposition 1.2, we know that $h_0 \in L^1_+(X, \mathcal A, \mu)$.
The fact that $f \in L^1_+(X, \mathcal A, \mu)$ then follows from Proposition 1.3, combined with
the inequality $h_0 \ge f$. More generally, again by Propositions 1.2 and 1.3, we know
that for any $h \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$, with h ≥ f , we have $h \in L^1_+(X, \mathcal A, \mu)$, as well as
the inequality
\[ I^\mu_+(f) \le I^\mu_+(h) = I^\mu_{\mathrm{elem}}(h). \]
So, if we denote the right hand side of (9) by J(f ), we have
$I^\mu_+(f) \le J(f) \le I^\mu_{\mathrm{elem}}(h_0)$.
We now prove the other inequality $I^\mu_+(f) \ge J(f)$. If $h_0 = 0$, there is nothing
to prove. Assume $h_0$ is not identically zero. Without any loss of generality, we can
assume that $h_0 = \beta\kappa_B$, for some β ∈ (0, ∞) and $B \in \mathcal A$ with µ(B) < ∞. (If we
define $B = h_0^{-1}\big((0, \infty)\big) = \bigcup_{\alpha \in h_0(X) \smallsetminus \{0\}} h_0^{-1}(\{\alpha\})$, and if we set $\beta = \max h_0(X)$,
then we clearly have µ(B) < ∞, and $h_0 \le \beta\kappa_B$.)
For every integer n ≥ 1, we define the sets $A^n_1, \dots, A^n_n \in \mathcal A$ by
\[ A^n_k = f^{-1}\Big( \big( \tfrac{(k-1)\beta}{n}, \tfrac{k\beta}{n} \big] \Big), \quad \forall\, k = 1, \dots, n, \]
and we define the elementary functions
\[ g_n = \sum_{k=1}^n \tfrac{(k-1)\beta}{n}\, \kappa_{A^n_k} \quad \text{and} \quad h_n = \sum_{k=1}^n \tfrac{k\beta}{n}\, \kappa_{A^n_k}. \]
The main features of these constructions are collected in the following.
Claim: For every n ≥ 1, the functions $g_n$ and $h_n$ are elementary integrable,
and satisfy the inequalities $0 \le g_n \le f \le h_n \le h_0$, as well as
\[ I^\mu_{\mathrm{elem}}(h_n) \le I^\mu_{\mathrm{elem}}(g_n) + \frac{\beta\mu(B)}{n}. \]
To prove this fact, we fix n ≥ 1, and we first remark that the sets $(A^n_k)_{k=1}^n$ are
pairwise disjoint. Since $0 \le f \le h_0 = \beta\kappa_B$, we have
\[ A^n_1 \cup \dots \cup A^n_n = f^{-1}\big((0, \beta]\big) \subset B. \]
In particular, if we define $A^n = A^n_1 \cup \dots \cup A^n_n \subset B$, we have
\[ h_n = \sum_{k=1}^n \tfrac{k\beta}{n}\, \kappa_{A^n_k} \le \beta \sum_{k=1}^n \kappa_{A^n_k} = \beta\kappa_{A^n} \le \beta\kappa_B. \]
Let us prove the inequalities $g_n \le f \le h_n$. Start with some arbitrary point x ∈ X,
and let us show that $g_n(x) \le f(x) \le h_n(x)$. If f (x) = 0, there is nothing to prove,
because this forces $\kappa_{A^n_k}(x) = 0$, ∀ k = 1, . . . , n. Assume now f (x) > 0. Since
$f \le \beta\kappa_B$, we know that f (x) ∈ (0, β], so there exists a unique k ∈ {1, . . . , n}, such
that $\tfrac{(k-1)\beta}{n} < f(x) \le \tfrac{k\beta}{n}$, i.e. $x \in A^n_k$. We then obviously have
\[ g_n(x) = \tfrac{(k-1)\beta}{n}\, \kappa_{A^n_k}(x) = \tfrac{(k-1)\beta}{n} < f(x) \le \tfrac{k\beta}{n} = \tfrac{k\beta}{n}\, \kappa_{A^n_k}(x) = h_n(x), \]
and we are done. Finally, let us observe that since $g_n \le h_n \le h_0$, it follows that $g_n$
and $h_n$ are in $L^1_+(X, \mathcal A, \mu)$, so $g_n$ and $h_n$ are elementary integrable. Notice that
\[ h_n - g_n = \sum_{k=1}^n \tfrac{\beta}{n}\, \kappa_{A^n_k} = \tfrac{\beta}{n}\, \kappa_{A^n} \le \tfrac{\beta}{n}\, \kappa_B, \]
so we have $I^\mu_{\mathrm{elem}}(h_n - g_n) \le I^\mu_{\mathrm{elem}}\big(\tfrac{\beta}{n}\kappa_B\big) = \tfrac{\beta\mu(B)}{n}$, so using Theorem 1.1, we get
\[ I^\mu_{\mathrm{elem}}(h_n) = I^\mu_{\mathrm{elem}}(g_n) + I^\mu_{\mathrm{elem}}(h_n - g_n) \le I^\mu_{\mathrm{elem}}(g_n) + \frac{\beta\mu(B)}{n}. \]
Having proven the Claim, we immediately see that by the definition of the
positive integral, we have
\[ J(f) \le I^\mu_{\mathrm{elem}}(h_n) \le I^\mu_{\mathrm{elem}}(g_n) + \frac{\beta\mu(B)}{n} \le I^\mu_+(f) + \frac{\beta\mu(B)}{n}. \]
Since the inequality $J(f) \le I^\mu_+(f) + \tfrac{\beta\mu(B)}{n}$ holds for all n ≥ 1, it will clearly force
$J(f) \le I^\mu_+(f)$. □
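The two step functions g_n and h_n from the proof can be computed explicitly on a finite measure space. The sketch below is an illustration under simplifying assumptions: X is finite with every subset measurable, B = X, and the integral of a function on X is taken to be the weighted sum of its values (which agrees with the elementary integral by Claim 1 of Theorem 1.1). The data WEIGHT, F, BETA, MU_B and the function names are hypothetical.

```python
import math
from fractions import Fraction

# Hypothetical measure on X = {0, 1, 2, 3} and a function F with 0 <= F <= BETA.
WEIGHT = {0: Fraction(1, 2), 1: Fraction(1), 2: Fraction(3, 2), 3: Fraction(2)}
F = {0: Fraction(0), 1: Fraction(1, 3), 2: Fraction(5, 7), 3: Fraction(2)}
BETA = Fraction(2)
MU_B = sum(WEIGHT.values(), Fraction(0))  # mu(B) with B = X


def integral(f):
    """Weighted sum of the values of f (the integral on a finite space)."""
    return sum((f[x] * WEIGHT[x] for x in f), Fraction(0))


def step_functions(f, beta, n):
    """Return (g_n, h_n): values of f rounded down/up to the grid {k*beta/n}."""
    g, h = {}, {}
    for x, v in f.items():
        if v == 0:
            g[x] = h[x] = Fraction(0)
        else:
            # The unique k with (k-1)*beta/n < v <= k*beta/n.
            k = math.ceil(v * n / beta)
            g[x] = Fraction(k - 1) * beta / n
            h[x] = Fraction(k) * beta / n
    return g, h
```

For each n the pair returned by step_functions(F, BETA, n) brackets F pointwise, and the gap between the two integrals is at most BETA * MU_B / n, which is the estimate that drives the end of the proof.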
Our next goal is to prove an analogue of Theorem 1.1, for the positive integral
(Theorem 1.2 below). We discuss first a weaker version.
Lemma 1.1. Let (X, A, µ) be a measure space.
(i) If $f \in L^1_+(X, \mathcal A, \mu)$ and $g \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$ are such that g + f ≥ 0, then
$g + f \in L^1_+(X, \mathcal A, \mu)$, and $I^\mu_+(g + f) = I^\mu_{\mathrm{elem}}(g) + I^\mu_+(f)$.
(ii) If $f \in L^1_+(X, \mathcal A, \mu)$ and $g \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$ are such that g − f ≥ 0, then
$g - f \in L^1_+(X, \mathcal A, \mu)$, and $I^\mu_+(g - f) = I^\mu_{\mathrm{elem}}(g) - I^\mu_+(f)$.
Proof. (i). We start with a weaker version.
Claim: If $f \in L^1_+(X, \mathcal A, \mu)$ and $g \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$ are such that g + f ≥ 0,
then $g + f \in L^1_+(X, \mathcal A, \mu)$, and $I^\mu_+(g + f) \le I^\mu_{\mathrm{elem}}(g) + I^\mu_+(f)$.
What we need to prove is the fact that, for every $h \in \mathcal A\text{-Elem}_{\mathbb R}(X)$, with
0 ≤ h ≤ g + f , we have:
(a) $h \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$;
(b) $I^\mu_{\mathrm{elem}}(h) \le I^\mu_{\mathrm{elem}}(g) + I^\mu_+(f)$.
Consider the elementary function $h_1 = \max\{h - g, 0\}$. It is obvious that $0 \le h_1 \le f$,
so by Proposition 1.3, it follows that $h_1 \in L^1_+(X, \mathcal A, \mu)$, and $I^\mu_+(h_1) \le I^\mu_+(f)$. By
Proposition 1.2, this gives $h_1 \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$, and
\[ I^\mu_{\mathrm{elem}}(h_1) = I^\mu_+(h_1) \le I^\mu_+(f). \tag{10} \]
Using the obvious inequality $-g \le h - g \le h_1$, by Proposition 1.1, it follows
that $h - g \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$, and
\[ I^\mu_{\mathrm{elem}}(h - g) \le I^\mu_{\mathrm{elem}}(h_1). \tag{11} \]
Of course, by Theorem 1.1, this gives the fact that h = (h − g) + g is elementary
µ-integrable, as well as the equality
\[ I^\mu_{\mathrm{elem}}(h) = I^\mu_{\mathrm{elem}}(h - g) + I^\mu_{\mathrm{elem}}(g). \]
Combining this with (11) and (10) immediately gives
\[ I^\mu_{\mathrm{elem}}(h) \le I^\mu_{\mathrm{elem}}(h_1) + I^\mu_{\mathrm{elem}}(g) \le I^\mu_+(f) + I^\mu_{\mathrm{elem}}(g), \]
and the Claim is proven.
Having proven the above Claim, we now proceed with the proof of (i). If
$f \in L^1_+(X, \mathcal A, \mu)$ and $g \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$ are such that g + f ≥ 0, then by the
Claim, we already know that $g + f \in L^1_+(X, \mathcal A, \mu)$, and $I^\mu_+(g + f) \le I^\mu_{\mathrm{elem}}(g) + I^\mu_+(f)$.
We apply now again the Claim to the functions $f_1 = g + f$ and $g_1 = -g$, to get
\[ I^\mu_+(f) = I^\mu_+(g_1 + f_1) \le I^\mu_{\mathrm{elem}}(g_1) + I^\mu_+(f_1) = -I^\mu_{\mathrm{elem}}(g) + I^\mu_+(g + f), \]
which gives the other inequality $I^\mu_{\mathrm{elem}}(g) + I^\mu_+(f) \le I^\mu_+(g + f)$.
(ii). Start with $f \in L^1_+(X, \mathcal A, \mu)$ and $g \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$, with g − f ≥ 0. First
of all, since 0 ≤ g − f ≤ g, by Proposition 1.5, it follows that $g - f \in L^1_+(X, \mathcal A, \mu)$,
and
\[ I^\mu_+(g - f) = \inf\big\{ I^\mu_{\mathrm{elem}}(k) : k \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu),\ k \ge g - f \big\}. \tag{12} \]
Second, remark that, whenever $k \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$ is such that g − f ≤ k, it
follows that k + f ≥ g, so using part (i) combined with Proposition 1.3, we see that
$k + f \in L^1_+(X, \mathcal A, \mu)$, and
\[ I^\mu_{\mathrm{elem}}(g) = I^\mu_+(g) \le I^\mu_+(k + f) = I^\mu_{\mathrm{elem}}(k) + I^\mu_+(f). \]
This means that we have
\[ I^\mu_{\mathrm{elem}}(k) \ge I^\mu_{\mathrm{elem}}(g) - I^\mu_+(f), \]
for all $k \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$, with k ≥ g − f , and then by (12), we immediately get
\[ I^\mu_+(g - f) \ge I^\mu_{\mathrm{elem}}(g) - I^\mu_+(f). \]
To prove the other inequality, we use the definition of the positive integral, which
gives
\[ I^\mu_+(g - f) = \sup\big\{ I^\mu_{\mathrm{elem}}(h) : h \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu),\ 0 \le h \le g - f \big\}. \tag{13} \]
Remark that, whenever $h \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$ is such that 0 ≤ h ≤ g − f , it follows
that 0 ≤ h + f ≤ g, so using part (i) combined with Proposition 1.3, we see that
$h + f \in L^1_+(X, \mathcal A, \mu)$, and
\[ I^\mu_{\mathrm{elem}}(g) = I^\mu_+(g) \ge I^\mu_+(h + f) = I^\mu_{\mathrm{elem}}(h) + I^\mu_+(f). \]
This means that we have
\[ I^\mu_{\mathrm{elem}}(h) \le I^\mu_{\mathrm{elem}}(g) - I^\mu_+(f), \]
for all $h \in L^1_{\mathbb R,\mathrm{elem}}(X, \mathcal A, \mu)$, with 0 ≤ h ≤ g − f , and then by (13), we immediately
get $I^\mu_+(g - f) \le I^\mu_{\mathrm{elem}}(g) - I^\mu_+(f)$. □

We are now in position to prove the following result (compare with Theorem
1.1).
Theorem 1.2. Let (X, A, µ) be a measure space.
(i) If $f_1, f_2 \in L^1_+(X, \mathcal A, \mu)$, then $f_1 + f_2 \in L^1_+(X, \mathcal A, \mu)$, and one has the
equality $I^\mu_+(f_1 + f_2) = I^\mu_+(f_1) + I^\mu_+(f_2)$.
(ii) If $f \in L^1_+(X, \mathcal A, \mu)$, and α ∈ [0, ∞), then (see footnote 14) $\alpha f \in L^1_+(X, \mathcal A, \mu)$, and one
has the equality $I^\mu_+(\alpha f) = \alpha I^\mu_+(f)$.

Proof. (i). Fix f1 , f2 ∈ L1+ (X, A, µ).


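Claim 1: Whenever $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$ satisfies $0 \le h \le f_1+f_2$, it follows that
(a) $h \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$,
(b) $I^\mu_{\mathrm{elem}}(h) \le I^\mu_+(f_1) + I^\mu_+(f_2)$.
Fix an elementary function $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, with $0 \le h \le f_1+f_2$, and let us first show that $h$ is elementary integrable. Fix some $\alpha \in h(X) \smallsetminus \{0\}$, and let us prove that $\mu\bigl(h^{-1}(\{\alpha\})\bigr) < \infty$. If we define the sets $A_j = f_j^{-1}\bigl([\alpha/2,\infty]\bigr) \in \mathcal{A}$, $j = 1,2$, then the elementary functions $h_j = \frac{\alpha}{2}\kappa_{A_j}$ satisfy $0 \le h_j \le f_j$, $j = 1,2$. In particular, it follows that $h_1, h_2 \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$, which forces $\mu(A_1) < \infty$ and $\mu(A_2) < \infty$. Notice however that, for every $x \in h^{-1}(\{\alpha\})$, we have $f_1(x) + f_2(x) \ge h(x) = \alpha$, which forces either $f_1(x) \ge \frac{\alpha}{2}$ or $f_2(x) \ge \frac{\alpha}{2}$. This argument shows that we have the inclusion $h^{-1}(\{\alpha\}) \subset A_1 \cup A_2$, so it follows that we indeed have $\mu\bigl(h^{-1}(\{\alpha\})\bigr) < \infty$.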
Having shown property (a), let us prove property (b). Define the sets
\[ B = \{x \in X : h(x) \ge f_1(x)\} \quad\text{and}\quad D = X \smallsetminus B. \]
It is obvious that $B, D \in \mathcal{A}$ are pairwise disjoint, and $B \cup D = X$. Define the elementary functions $h' = h\kappa_B$, and $h'' = h - h' = h\kappa_D$. On the one hand, we have
\[ f_1\kappa_B \le h' \le f_1\kappa_B + f_2\kappa_B, \]
which gives
\[ 0 \le h' - f_1\kappa_B \le f_2\kappa_B. \]

14 Here we use the convention that when α = 0, we take αf = 0.



By Lemma 1.1.(ii), combined with Proposition 1.4, it follows that $h' - f_1\kappa_B \in L^1_+(X,\mathcal{A},\mu)$ and $I^\mu_{\mathrm{elem}}(h') - I^\mu_+(f_1\kappa_B) = I^\mu_+(h' - f_1\kappa_B) \le I^\mu_+(f_2\kappa_B)$, so we get
\[ I^\mu_{\mathrm{elem}}(h') \le I^\mu_+(f_1\kappa_B) + I^\mu_+(f_2\kappa_B). \tag{14} \]
On the other hand, we have
\[ h'' = h\kappa_D \le f_1\kappa_D, \]
which gives
\[ I^\mu_{\mathrm{elem}}(h'') \le I^\mu_+(f_1\kappa_D) \le I^\mu_+(f_1\kappa_D) + I^\mu_+(f_2\kappa_D). \tag{15} \]
Since $h = h' + h''$, with $h'$ and $h''$ elementary integrable, using Theorem 1.1 combined with Proposition 1.4, by adding the inequalities (14) and (15) we get
\[ I^\mu_{\mathrm{elem}}(h) = I^\mu_{\mathrm{elem}}(h') + I^\mu_{\mathrm{elem}}(h'') \le I^\mu_+(f_1\kappa_B) + I^\mu_+(f_2\kappa_B) + I^\mu_+(f_1\kappa_D) + I^\mu_+(f_2\kappa_D) = I^\mu_+(f_1) + I^\mu_+(f_2), \]
and the Claim is proven.
Claim 1 obviously implies the fact that f1 + f2 ∈ L1+ (X, A, µ), as well as the
inequality
\[ I^\mu_+(f_1+f_2) \le I^\mu_+(f_1) + I^\mu_+(f_2). \]
To prove the other inequality, we use the following.
Claim 2: For every $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, with $0 \le h \le f_1$, one has the inequality
\[ I^\mu_{\mathrm{elem}}(h) \le I^\mu_+(f_1+f_2) - I^\mu_+(f_2). \]
Indeed, if $h$ is as above, then $h$ is in $L^1_+(X,\mathcal{A},\mu)$, hence elementary integrable, and we obviously have $0 \le h+f_2 \le f_1+f_2$. Then by Lemma 1.1.(i), combined with Proposition 1.3, we get
\[ I^\mu_{\mathrm{elem}}(h) + I^\mu_+(f_2) = I^\mu_+(h+f_2) \le I^\mu_+(f_1+f_2), \]
and the Claim follows.
Using Claim 2, and the definition of the positive integral, we get
\[ I^\mu_+(f_1) = \sup\bigl\{ I^\mu_{\mathrm{elem}}(h) : h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X),\ 0 \le h \le f_1 \bigr\} \le I^\mu_+(f_1+f_2) - I^\mu_+(f_2), \]
which then gives
\[ I^\mu_+(f_1) + I^\mu_+(f_2) \le I^\mu_+(f_1+f_2). \]
(ii). This part is obvious. □
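Part (ii) is left to the reader in the text. One way to fill it in is sketched below, assuming the positive integral is defined as the supremum of the elementary integrals of the elementary functions dominated by $f$. The name $E(f)$ is shorthand introduced here for illustration, not notation from these notes.

```latex
% Sketch of Theorem 1.2.(ii) for \alpha > 0; for \alpha = 0 the convention \alpha f = 0 applies.
% E(f) is a hypothetical shorthand for the set of elementary functions dominated by f.
\[
E(f) = \bigl\{ h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X) : 0 \le h \le f \bigr\}.
\]
% The map h \mapsto \alpha h is a bijection of E(f) onto E(\alpha f), with inverse k \mapsto \alpha^{-1} k.
% It preserves elementary integrability, and I_elem(\alpha h) = \alpha I_elem(h) by linearity.
\[
I^\mu_+(\alpha f)
 = \sup_{k \in E(\alpha f)} I^\mu_{\mathrm{elem}}(k)
 = \sup_{h \in E(f)} I^\mu_{\mathrm{elem}}(\alpha h)
 = \alpha \sup_{h \in E(f)} I^\mu_{\mathrm{elem}}(h)
 = \alpha\, I^\mu_+(f).
\]
```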
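Definitions. Let $(X,\mathcal{A},\mu)$ be a measure space. Denote the extended real line $[-\infty,\infty]$ by $\bar{\mathbb{R}}$. A measurable function $f : X \to \bar{\mathbb{R}}$ is said to be µ-integrable, if there exist functions $f_1, f_2 \in L^1_+(X,\mathcal{A},\mu)$, such that
\[ f(x) = f_1(x) - f_2(x), \quad \forall\, x \in X \smallsetminus \bigl[ f_1^{-1}(\{\infty\}) \cup f_2^{-1}(\{\infty\}) \bigr]. \tag{16} \]
By Remark 1.3, we know that the sets $f_k^{-1}(\{\infty\})$, $k = 1,2$, have measure zero. The equality (16) then gives the fact that $f = f_1 - f_2$, µ-a.e. We define
\[ L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu) = \bigl\{ f : X \to \bar{\mathbb{R}} : f \text{ is } \mu\text{-integrable} \bigr\}. \]
We also define the space of “honest” real-valued µ-integrable functions, as
\[ L^1_{\mathbb{R}}(X,\mathcal{A},\mu) = \bigl\{ f \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu) : -\infty < f(x) < \infty,\ \forall\, x \in X \bigr\}. \]
Finally, we define the space of complex-valued µ-integrable functions as
\[ L^1_{\mathbb{C}}(X,\mathcal{A},\mu) = \bigl\{ f : X \to \mathbb{C} : \mathrm{Re}\, f,\ \mathrm{Im}\, f \in L^1_{\mathbb{R}}(X,\mathcal{A},\mu) \bigr\}. \]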


The next result collects the basic properties of $L^1_{\bar{\mathbb{R}}}$. Among other things, it states that this space is “almost” a vector space.
Theorem 1.3. Let (X, A, µ) be a measure space.
(i) For a measurable function f : X → R̄, the following are equivalent:
(a) f ∈ L1R̄ (X, A, µ);
(b) |f | ∈ L1+ (X, A, µ).
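(ii) If $f, g \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$, and if $h : X \to \bar{\mathbb{R}}$ is a measurable function, such that
\[ h(x) = f(x) + g(x), \quad \forall\, x \in X \smallsetminus \bigl[ f^{-1}(\{-\infty,\infty\}) \cup g^{-1}(\{-\infty,\infty\}) \bigr], \]
then $h \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$.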


(iii) If f ∈ L1R̄ (X, A, µ), and α ∈ R, and if g : X → R̄ is a measurable function,
such that
g(x) = αf (x), ∀ x ∈ X r f −1 ({−∞, ∞}),
then g ∈ L1R̄ (X, A, µ).
(iv) One has the inclusion
L1R,elem (X, A, µ) ∪ L1+ (X, A, µ) ⊂ L1R̄ (X, A, µ).

Proof. (i). Consider the measurable functions $f^\pm : X \to [0,\infty]$ defined as
\[ f^+ = \max\{f, 0\} \quad\text{and}\quad f^- = \max\{-f, 0\}. \]
To prove the implication (a) ⇒ (b), assume $f \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$, which means there exist $f_1, f_2 \in L^1_+(X,\mathcal{A},\mu)$, such that
\[ f(x) = f_1(x) - f_2(x), \quad \forall\, x \in X \smallsetminus \bigl[ f_1^{-1}(\{\infty\}) \cup f_2^{-1}(\{\infty\}) \bigr]. \]

Notice that we have the inequalities


\[ f^+ \le f_1, \quad \mu\text{-a.e.}, \tag{17} \]
\[ f^- \le f_2, \quad \mu\text{-a.e.} \tag{18} \]
Indeed, if we put $N = f_1^{-1}(\{\infty\}) \cup f_2^{-1}(\{\infty\})$, then $\mu(N) = 0$, and if we start with some $x \in X \smallsetminus N$, we either have $f_1(x) \ge f_2(x) \ge 0$, in which case we get
\[ f^+(x) = f(x) = f_1(x) - f_2(x) \le f_1(x), \qquad f^-(x) = 0 \le f_2(x), \]
or we have $f_1(x) \le f_2(x)$, in which case we get
\[ f^+(x) = 0 \le f_1(x), \qquad f^-(x) = -f(x) = f_2(x) - f_1(x) \le f_2(x). \]
In other words, we have
\[ f^+(x) \le f_1(x) \quad\text{and}\quad f^-(x) \le f_2(x), \quad \forall\, x \in X \smallsetminus N, \]
so we indeed get (17) and (18). Using these inequalities, and Proposition 1.3, it follows that $f^\pm \in L^1_+(X,\mathcal{A},\mu)$, so by Theorem 1.2, it follows that $f^+ + f^- = |f|$ also belongs to $L^1_+(X,\mathcal{A},\mu)$.

To prove the implication (b) ⇒ (a), start by assuming that $|f| \in L^1_+(X,\mathcal{A},\mu)$. Then, since we obviously have the inequalities $0 \le f^\pm \le |f|$, again by Proposition 1.3, it follows that $f^\pm \in L^1_+(X,\mathcal{A},\mu)$. Since we obviously have
\[ f(x) = f^+(x) - f^-(x), \quad \forall\, x \in X \smallsetminus f^{-1}(\{-\infty,\infty\}), \]
it follows that $f$ indeed belongs to $L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$.
(ii). Assume f , g, and h are as in (ii). By (i), both functions |f | and |g| are in
L1+ (X, A, µ). By Theorem 1.2, it follows that the function k = |f | + |g| also belongs
to L1+ (X, A, µ). Notice that we have the equality
f −1 ({−∞, ∞}) ∪ g −1 ({−∞, ∞}) = k −1 ({∞}),
so the hypothesis on h reads
h(x) = f (x) + g(x), ∀ x ∈ X r k −1 ({∞}),
which then gives
|h(x)| = |f (x) + g(x)| ≤ |f (x)| + |g(x)|, ∀ x ∈ X r k −1 ({∞}).
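Of course, since $\mu\bigl(k^{-1}(\{\infty\})\bigr) = 0$, this gives
\[ |h| \le k, \quad \mu\text{-a.e.}, \]
so by Proposition 1.3 we get $|h| \in L^1_+(X,\mathcal{A},\mu)$, and using (i) it follows that $h$ indeed belongs to $L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$.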
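(iii). Assume $f$, $\alpha$, and $g$ are as in (iii). Exactly as above, we have $|g| = |\alpha| \cdot |f|$, µ-a.e., and then by Theorem 1.2, combined with Proposition 1.3, it follows that $|g| \in L^1_+(X,\mathcal{A},\mu)$, so by (i) the function $g$ belongs to $L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$.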
(iv). The inclusion L1+ (X, A, µ) ⊂ L1R̄ (X, A, µ) is trivial. To prove the inclusion
L1R,elem (X, A, µ) ⊂ L1R̄ (X, A, µ), we use parts (ii) and (iii) to reduce this to the fact
that κ A ∈ L1R̄ (X, A, µ), for all A ∈ A, with µ(A) < ∞. But this fact is now obvious,
because any such function belongs to L1+ (X, A, µ) ⊂ L1R̄ (X, A, µ). □
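To see why parts (ii) and (iii) give only an “almost” vector space structure, the following sketch may help. The set $N$ and the functions $f$, $g$ below are hypothetical choices made for illustration; they assume a measure space that contains a non-empty set of measure zero.

```latex
% Assumption: N \in \mathcal{A} is non-empty, with \mu(N) = 0.
\[
f(x) = \begin{cases} +\infty & \text{if } x \in N \\ 0 & \text{if } x \notin N \end{cases}
\qquad\qquad
g(x) = \begin{cases} -\infty & \text{if } x \in N \\ 0 & \text{if } x \notin N \end{cases}
\]
% Both |f| and |g| vanish \mu-a.e., so f and g belong to L^1_{\bar{\mathbb{R}}}(X, \mathcal{A}, \mu).
% The pointwise sum f(x) + g(x) is undefined on N (it would read \infty - \infty),
% so there is no single function "f + g". Instead, every measurable h with
\[
h(x) = 0 = f(x) + g(x), \qquad \forall\, x \in X \smallsetminus N,
\]
% qualifies as a sum in the sense of part (ii), whatever values h takes on N.
```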
Corollary 1.1. Let (X, A, µ) be a measure space, and let K be one of the
fields R or C.
(i) For a K-valued measurable function f : X → K, the following are equiva-
lent:
(a) f ∈ L1K (X, A, µ);
(b) |f | ∈ L1+ (X, A, µ).
(ii) When equipped with the pointwise addition and scalar multiplication, the
space L1K (X, A, µ) becomes a K-vector space.
Proof. (i). The case K = R is immediate from Theorem 1.3.
In the case when K = C, we use the obvious inequalities
\[ \max\bigl\{ |\mathrm{Re}\, f|,\ |\mathrm{Im}\, f| \bigr\} \le |f| \le |\mathrm{Re}\, f| + |\mathrm{Im}\, f|. \tag{19} \]
If f ∈ L1C (X, A, µ), then both Re f and Im f belong to L1R (X, A, µ), so by
Theorem 1.3, both |Re f | and |Im f | belong to L1+ (X, A, µ). By Theorem 1.2, the
function g = |Re f | + |Im f | belongs to L1+ (X, A, µ), and then using the second
inequality in (19), it follows that |f | belongs to L1+ (X, A, µ).
Conversely, if |f | belongs to L1+ (X, A, µ), then using the first inequality in (19),
it follows that both |Re f | and |Im f | belong to L1+ (X, A, µ), so by Theorem 1.3,
both Re f and Im f belong to L1R (X, A, µ), i.e. f belongs to L1C (X, A, µ).
(ii). This part is pretty clear. If f, g ∈ L1K (X, A, µ), then by (i) both |f |
and |g| belong to L1+ (X, A, µ), and by Theorem 1.2, the function |f | + |g| will

also belong to L1+ (X, A, µ). Since |f + g| ≤ |f | + |g|, it follows that |f + g| itself
belongs to L1+ (X, A, µ), so using (i) again, it follows that f + g indeed belongs to
L1K (X, A, µ). If f ∈ L1K (X, A, µ) and α ∈ K, then |f | belongs to L1+ (X, A, µ), so
|αf | = |α| · |f | again belongs to L1+ (X, A, µ), which by (i) gives the fact that αf
belongs to L1K (X, A, µ). 

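Remark 1.5. Let $(X,\mathcal{A},\mu)$ be a measure space. Then one has the equalities
\[ L^1_+(X,\mathcal{A},\mu) = \bigl\{ f \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu) : f(X) \subset [0,\infty] \bigr\}; \tag{20} \]
\[ L^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu) = L^1_{\mathbb{K}}(X,\mathcal{A},\mu) \cap \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{K}}(X). \tag{21} \]
Indeed, by Theorem 1.3 we have the inclusion
\[ L^1_+(X,\mathcal{A},\mu) \subset \bigl\{ f \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu) : f(X) \subset [0,\infty] \bigr\}. \]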

The inclusion in the other direction follows again from Theorem 1.3, since any
function that belongs to the right hand side of (20) satisfies f = |f |. The inclusion
L1K,elem (X, A, µ) ⊂ L1K (X, A, µ) ∩ A-ElemK (X)
is again contained in Theorem 1.3. To prove the inclusion in the other direction,
it suffices to consider the case K = R. Start with h ∈ L1R (X, A, µ) ∩ A-ElemR (X),
which gives |h| ∈ L1+ (X, A, µ). The function |h| is obviously in A-ElemR (X), so
we get |h| ∈ L1R,elem (X, A, µ). Since L1R,elem (X, A, µ) is a vector space, it will also
contain the function −|h|. The fact that h itself belongs to L1R,elem (X, A, µ) then
follows from Proposition 1.1, combined with the obvious inequalities
−|h| ≤ h ≤ |h|.
The following result deals with the construction of the integral.
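Theorem 1.4. Let $(X,\mathcal{A},\mu)$ be a measure space. There exists a unique map $I^\mu_{\bar{\mathbb{R}}} : L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu) \to \mathbb{R}$, with the following properties:
(i) Whenever $f, g, h \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$ are such that
\[ h(x) = f(x) + g(x), \quad \forall\, x \in X \smallsetminus \bigl[ f^{-1}(\{-\infty,\infty\}) \cup g^{-1}(\{-\infty,\infty\}) \bigr], \]
it follows that $I^\mu_{\bar{\mathbb{R}}}(h) = I^\mu_{\bar{\mathbb{R}}}(f) + I^\mu_{\bar{\mathbb{R}}}(g)$.
(ii) Whenever $f, g \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$ and $\alpha \in \mathbb{R}$ are such that
\[ g(x) = \alpha f(x), \quad \forall\, x \in X \smallsetminus f^{-1}(\{-\infty,\infty\}), \]
it follows that $I^\mu_{\bar{\mathbb{R}}}(g) = \alpha I^\mu_{\bar{\mathbb{R}}}(f)$.
(iii) $I^\mu_{\bar{\mathbb{R}}}(f) = I^\mu_+(f)$, $\forall\, f \in L^1_+(X,\mathcal{A},\mu)$.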

Proof. Let us first show the existence. Start with some $f \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$, and define the functions $f^\pm : X \to [0,\infty]$ by $f^+ = \max\{f,0\}$ and $f^- = \max\{-f,0\}$, so that $f = f^+ - f^-$, and $f^+, f^- \in L^1_+(X,\mathcal{A},\mu)$. We then define
\[ I^\mu_{\bar{\mathbb{R}}}(f) = I^\mu_+(f^+) - I^\mu_+(f^-). \]
It is obvious that $I^\mu_{\bar{\mathbb{R}}}$ satisfies condition (iii).


The key fact that we need is contained in the following.

Claim: Whenever $f \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$, and $f_1, f_2 \in L^1_+(X,\mathcal{A},\mu)$ are such that
\[ f(x) = f_1(x) - f_2(x), \quad \forall\, x \in X \smallsetminus \bigl[ f_1^{-1}(\{\infty\}) \cup f_2^{-1}(\{\infty\}) \bigr], \]
it follows that we have the equality
\[ I^\mu_{\bar{\mathbb{R}}}(f) = I^\mu_+(f_1) - I^\mu_+(f_2). \]
Indeed, since we have $f = f^+ - f^-$, it follows immediately that we have the equality
\[ f_2(x) + f^+(x) = f_1(x) + f^-(x), \quad \forall\, x \in X \smallsetminus \bigl[ f_1^{-1}(\{\infty\}) \cup f_2^{-1}(\{\infty\}) \bigr], \]
which gives
\[ f_2 + f^+ = f_1 + f^-, \quad \mu\text{-a.e.} \]
By Theorem 1.2, this immediately gives
\[ I^\mu_+(f_2) + I^\mu_+(f^+) = I^\mu_+(f_1) + I^\mu_+(f^-), \]
which then gives
\[ I^\mu_+(f_1) - I^\mu_+(f_2) = I^\mu_+(f^+) - I^\mu_+(f^-) = I^\mu_{\bar{\mathbb{R}}}(f). \]
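As a quick sanity check on the Claim, one can shift both parts of the canonical decomposition by the same function. The function $k$ below is a hypothetical element of $L^1_+(X,\mathcal{A},\mu)$ chosen for illustration, and the computation uses only Theorem 1.2.(i).

```latex
% Assumption: k \in L^1_+(X, \mathcal{A}, \mu) is arbitrary; put f_1 = f^+ + k and f_2 = f^- + k.
% Then f_1, f_2 \in L^1_+(X, \mathcal{A}, \mu) by Theorem 1.2.(i),
% and f(x) = f_1(x) - f_2(x) at every point x where both f_1(x) and f_2(x) are finite.
\[
I^\mu_+(f_1) - I^\mu_+(f_2)
 = \bigl[ I^\mu_+(f^+) + I^\mu_+(k) \bigr] - \bigl[ I^\mu_+(f^-) + I^\mu_+(k) \bigr]
 = I^\mu_+(f^+) - I^\mu_+(f^-)
 = I^\mu_{\bar{\mathbb{R}}}(f).
\]
```

The value of the integral therefore does not depend on which admissible pair $(f_1, f_2)$ is used, which is exactly what the Claim asserts in general.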
Having proven the above Claim, let us show now that $I^\mu_{\bar{\mathbb{R}}}$ has properties (i) and (ii). Assume $f$, $g$ and $h$ are as in (i). Notice that if we define $h_1 = f^+ + g^+$ and $h_2 = f^- + g^-$, then we clearly have $0 \le h_1 \le |f| + |g|$ and $0 \le h_2 \le |f| + |g|$, so $h_1$ and $h_2$ both belong to $L^1_+(X,\mathcal{A},\mu)$. By Theorem 1.2, we then have
\[ I^\mu_+(h_1) = I^\mu_+(f^+) + I^\mu_+(g^+) \quad\text{and}\quad I^\mu_+(h_2) = I^\mu_+(f^-) + I^\mu_+(g^-). \tag{22} \]
Notice also that, because of the equalities
\[ h_1^{-1}(\{\infty\}) = f^{-1}(\{\infty\}) \cup g^{-1}(\{\infty\}) \quad\text{and}\quad h_2^{-1}(\{\infty\}) = f^{-1}(\{-\infty\}) \cup g^{-1}(\{-\infty\}), \]
we have
\[ h(x) = h_1(x) - h_2(x), \quad \forall\, x \in X \smallsetminus \bigl[ h_1^{-1}(\{\infty\}) \cup h_2^{-1}(\{\infty\}) \bigr], \]
so by the above Claim, combined with (22), we get
\[ I^\mu_{\bar{\mathbb{R}}}(h) = I^\mu_+(h_1) - I^\mu_+(h_2) = I^\mu_+(f^+) + I^\mu_+(g^+) - I^\mu_+(f^-) - I^\mu_+(g^-) = I^\mu_{\bar{\mathbb{R}}}(f) + I^\mu_{\bar{\mathbb{R}}}(g). \]
Property (ii) is pretty obvious.
The uniqueness is also obvious. If we start with a map J : L1R̄ (X, A, µ) → R
with properties (i)-(iii), then for every f ∈ L1R̄ (X, A, µ), we must have
\[ J(f) = J(f^+) - J(f^-) = I^\mu_+(f^+) - I^\mu_+(f^-). \]
(For the second equality we use condition (iii), combined with the fact that both $f^+$ and $f^-$ belong to $L^1_+(X,\mathcal{A},\mu)$.) □
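Corollary 1.2. Let $(X,\mathcal{A},\mu)$ be a measure space, and let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$. There exists a unique linear map $I^\mu_{\mathbb{K}} : L^1_{\mathbb{K}}(X,\mathcal{A},\mu) \to \mathbb{K}$, such that
\[ I^\mu_{\mathbb{K}}(f) = I^\mu_+(f), \quad \forall\, f \in L^1_+(X,\mathcal{A},\mu) \cap L^1_{\mathbb{K}}(X,\mathcal{A},\mu). \]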
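Proof. Let us start with the case $\mathbb{K} = \mathbb{R}$. In this case, we have the inclusion
\[ L^1_{\mathbb{R}}(X,\mathcal{A},\mu) \subset L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu), \]
so we can define $I^\mu_{\mathbb{R}}$ as the restriction of $I^\mu_{\bar{\mathbb{R}}}$ to $L^1_{\mathbb{R}}(X,\mathcal{A},\mu)$. The uniqueness is again clear, because of the equalities
\[ I^\mu_{\mathbb{R}}(f) = I^\mu_{\mathbb{R}}(f^+) - I^\mu_{\mathbb{R}}(f^-) = I^\mu_+(f^+) - I^\mu_+(f^-). \]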

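In the case $\mathbb{K} = \mathbb{C}$, we define
\[ I^\mu_{\mathbb{C}}(f) = I^\mu_{\mathbb{R}}(\mathrm{Re}\, f) + i\, I^\mu_{\mathbb{R}}(\mathrm{Im}\, f). \]
The linearity is obvious. The uniqueness is also clear, because the restriction of $I^\mu_{\mathbb{C}}$ to $L^1_{\mathbb{R}}(X,\mathcal{A},\mu)$ must agree with $I^\mu_{\mathbb{R}}$. □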
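Definition. Let $(X,\mathcal{A},\mu)$ be a measure space, and let $\mathbb{K}$ be one of the symbols $\bar{\mathbb{R}}$, $\mathbb{R}$, or $\mathbb{C}$. For any $f \in L^1_{\mathbb{K}}(X,\mathcal{A},\mu)$, the number $I^\mu_{\mathbb{K}}(f)$ (which is real, if $\mathbb{K} = \bar{\mathbb{R}}$ or $\mathbb{R}$, and is complex if $\mathbb{K} = \mathbb{C}$) will be denoted by
\[ \int_X f \, d\mu, \]
and is called the µ-integral of $f$. This notation is unambiguous, because if $f \in L^1_{\mathbb{R}}(X,\mathcal{A},\mu)$, then we have $I^\mu_{\bar{\mathbb{R}}}(f) = I^\mu_{\mathbb{C}}(f) = I^\mu_{\mathbb{R}}(f)$.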
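Remark 1.6. If $(X,\mathcal{A},\mu)$ is a measure space, then for every $A \in \mathcal{A}$, with $\mu(A) < \infty$, using the above Corollary, we get
\[ \int_X \kappa_A \, d\mu = I^\mu_+(\kappa_A) = \mu(A). \]
By linearity, if $\mathbb{K} = \mathbb{R}, \mathbb{C}$, one has then the equality
\[ \int_X h \, d\mu = I^\mu_{\mathrm{elem}}(h), \quad \forall\, h \in L^1_{\mathbb{K},\mathrm{elem}}(X,\mathcal{A},\mu). \]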
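For a concrete instance of the last equality, consider the following sketch. The sets $A$, $B$ and the values of their measures are hypothetical, chosen only to make the arithmetic visible.

```latex
% Assumptions: A, B \in \mathcal{A} are disjoint, with \mu(A) = 3 and \mu(B) = 1/2.
\[
h = 2\,\kappa_A - 5\,\kappa_B
\quad\Longrightarrow\quad
\int_X h \, d\mu
 = 2\,\mu(A) - 5\,\mu(B)
 = 2 \cdot 3 - 5 \cdot \tfrac{1}{2}
 = \tfrac{7}{2}.
\]
```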
To make the exposition a bit easier, we will adopt the following.
Convention. If $(X,\mathcal{A},\mu)$ is a measure space, and if $f : X \to [0,\infty]$ is a measurable function, which does not belong to $L^1_+(X,\mathcal{A},\mu)$, then we define
\[ \int_X f \, d\mu = \infty. \]
Remarks 1.7. Let (X, A, µ) be a measure space.
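A. Using the above convention, when $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$ is a function with $h(X) \subset [0,\infty)$, the condition $\int_X h \, d\mu = \infty$ is equivalent to the existence of some $\alpha \in h(X) \smallsetminus \{0\}$, with $\mu\bigl(h^{-1}(\{\alpha\})\bigr) = \infty$.
B. Using the above convention, for every measurable function $f : X \to [0,\infty]$, one has the equality
\[ \int_X f \, d\mu = \sup\Bigl\{ \int_X h \, d\mu : h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X),\ 0 \le h \le f \Bigr\}. \]
C. If $f, g : X \to [0,\infty]$ are measurable, then one has the equalities
\[ \int_X (f+g) \, d\mu = \int_X f \, d\mu + \int_X g \, d\mu, \]
\[ \int_X (\alpha f) \, d\mu = \alpha \int_X f \, d\mu, \quad \forall\, \alpha \in [0,\infty), \]
even in the case when some term is infinite. (We use the convention $\infty + t = \infty$, $\forall\, t \in [0,\infty]$, as well as $\alpha \cdot \infty = \infty$, $\forall\, \alpha \in (0,\infty)$, and $0 \cdot \infty = 0$.)
D. If $f, g : X \to [0,\infty]$ are measurable, and $f \le g$, µ-a.e., then (using B) one has the inequality
\[ \int_X f \, d\mu \le \int_X g \, d\mu, \]
even if one side (or both) is infinite.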

E. Let $\mathbb{K}$ be one of the symbols $\bar{\mathbb{R}}$, $\mathbb{R}$, or $\mathbb{C}$, and let $f : X \to \mathbb{K}$ be a measurable function. Then the function $|f| : X \to [0,\infty]$ is measurable. Using the above convention, the condition that $f$ belongs to $L^1_{\mathbb{K}}(X,\mathcal{A},\mu)$ is equivalent to the inequality $\int_X |f| \, d\mu < \infty$.
In the remainder of this section we discuss several properties of integration that are analogous to those of the positive/elementary integration.
We begin with a useful estimate.
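Proposition 1.6. Let $(X,\mathcal{A},\mu)$ be a measure space, and let $\mathbb{K}$ be one of the symbols $\bar{\mathbb{R}}$, $\mathbb{R}$, or $\mathbb{C}$. For every function $f \in L^1_{\mathbb{K}}(X,\mathcal{A},\mu)$, one has the inequality
\[ \Bigl| \int_X f \, d\mu \Bigr| \le \int_X |f| \, d\mu. \]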

Proof. Let us first examine the case when $\mathbb{K} = \bar{\mathbb{R}}, \mathbb{R}$. In this case we define $f^+ = \max\{f,0\}$ and $f^- = \max\{-f,0\}$, so we have $f = f^+ - f^-$, as well as $|f| = f^+ + f^-$. Using the inequalities $I^\mu_+(f^\pm) \ge 0$, we have
\[ \int_X f \, d\mu = I^\mu_+(f^+) - I^\mu_+(f^-) \le I^\mu_+(f^+) + I^\mu_+(f^-) = \int_X |f| \, d\mu; \]
\[ -\int_X f \, d\mu = -I^\mu_+(f^+) + I^\mu_+(f^-) \le I^\mu_+(f^+) + I^\mu_+(f^-) = \int_X |f| \, d\mu. \]
In other words, we have
\[ \pm \int_X f \, d\mu \le \int_X |f| \, d\mu, \]
and the desired inequality immediately follows.
Let us consider now the case $\mathbb{K} = \mathbb{C}$. Consider the number $\lambda = \int_X f \, d\mu$, and let us choose some complex number $\alpha \in \mathbb{C}$, with $|\alpha| = 1$, and $\alpha\lambda = |\lambda|$. (If $\lambda \ne 0$, we take $\alpha = \lambda^{-1}|\lambda|$; otherwise we take $\alpha = 1$.) Consider the measurable function $g = \alpha f$. Notice now that
\[ \int_X \mathrm{Re}\, g \, d\mu + i \int_X \mathrm{Im}\, g \, d\mu = \int_X g \, d\mu = \alpha \int_X f \, d\mu = \alpha\lambda = |\lambda| \ge 0, \]
so in particular we get
\[ |\lambda| = \int_X \mathrm{Re}\, g \, d\mu. \]
If we apply the real case, we then get
\[ |\lambda| \le \int_X |\mathrm{Re}\, g| \, d\mu. \tag{23} \]
Notice now that we have the inequality $|\mathrm{Re}\, g| \le |g| = |f|$, which gives
\[ I^\mu_+\bigl(|\mathrm{Re}\, g|\bigr) \le I^\mu_+\bigl(|f|\bigr) = \int_X |f| \, d\mu, \]
so the inequality (23) immediately gives
\[ \Bigl| \int_X f \, d\mu \Bigr| = |\lambda| \le \int_X |f| \, d\mu. \qquad \square \]
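The inequality can be strict. As an illustration, under the hypothetical assumption that $A, B \in \mathcal{A}$ are disjoint sets with $\mu(A) = \mu(B) = 1$, one can compare both sides for a real and for a complex elementary function.

```latex
% Assumptions: A, B \in \mathcal{A} are disjoint, with \mu(A) = \mu(B) = 1.
% Real case: the positive and negative parts cancel inside the integral, but not inside |f|.
\[
f = \kappa_A - \kappa_B :
\qquad
\Bigl| \int_X f \, d\mu \Bigr| = |1 - 1| = 0
\;<\;
2 = \int_X |f| \, d\mu .
\]
% Complex case: |g| = \kappa_A + \kappa_B, because A and B are disjoint.
\[
g = \kappa_A + i\,\kappa_B :
\qquad
\Bigl| \int_X g \, d\mu \Bigr| = |1 + i| = \sqrt{2}
\;<\;
2 = \int_X |g| \, d\mu .
\]
```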

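Corollary 1.3. Let $(X,\mathcal{A},\mu)$ be a measure space, and let $\mathbb{K}$ be one of the symbols $\bar{\mathbb{R}}$, $\mathbb{R}$, or $\mathbb{C}$. If a measurable function $f : X \to \mathbb{K}$ satisfies $f = 0$, µ-a.e., then $f \in L^1_{\mathbb{K}}(X,\mathcal{A},\mu)$, and $\int_X f \, d\mu = 0$.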

Proof. Consider the measurable function $|f| : X \to [0,\infty]$, which satisfies $|f| = 0$, µ-a.e. By Proposition 1.3, it follows that $|f| \in L^1_+(X,\mathcal{A},\mu)$, hence $f \in L^1_{\mathbb{K}}(X,\mathcal{A},\mu)$, and $\int_X |f| \, d\mu = 0$. Of course, the last equality forces $\int_X f \, d\mu = 0$. □

Corollary 1.4. Let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$. If $(X,\mathcal{A},\mu)$ is a finite measure space, then every bounded measurable function $f : X \to \mathbb{K}$ belongs to $L^1_{\mathbb{K}}(X,\mathcal{A},\mu)$, and satisfies
\[ \Bigl| \int_X f \, d\mu \Bigr| \le \mu(X) \cdot \sup_{x \in X} |f(x)|. \]

Proof. If we put $\beta = \sup_{x \in X} |f(x)|$, then we clearly have $|f| \le \beta\kappa_X$, which shows that $|f| \in L^1_+(X,\mathcal{A},\mu)$, and also gives $\int_X |f| \, d\mu \le \int_X \beta\kappa_X \, d\mu = \mu(X) \cdot \beta$. Then everything follows from Proposition 1.6. □
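Finiteness of the measure cannot be dropped here. A minimal sketch, assuming the counting measure $\mu$ on all subsets of $\mathbb{N}$ (a standard example chosen for illustration, not one fixed by the text at this point):

```latex
% Assumption: X = \mathbb{N}, \mathcal{A} = \mathcal{P}(\mathbb{N}), \mu = counting measure, so \mu(X) = \infty.
% The constant function f = \kappa_X is bounded and measurable, but the elementary functions
% h_n = \kappa_{\{1, \dots, n\}} satisfy 0 \le h_n \le f, and
\[
I^\mu_{\mathrm{elem}}(h_n) = \mu(\{1, \dots, n\}) = n \longrightarrow \infty .
\]
% Hence the supremum defining the positive integral of |f| = f is infinite,
% so f does not belong to L^1_{\mathbb{R}}(X, \mathcal{A}, \mu).
```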

Comment. The introduction of the space L1R̄ (X, A, µ), of extended real-valued
µ-integrable functions, is useful mostly for technical reasons. In effect, everything
can be reduced to the case when only “honest” real-valued functions are involved.
The following result clarifies this matter.
Lemma 1.2. Let $(X,\mathcal{A},\mu)$ be a measure space, and let $f : X \to \bar{\mathbb{R}}$ be a measurable function. The following are equivalent:
(i) $f \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$;
(ii) there exists $g \in L^1_{\mathbb{R}}(X,\mathcal{A},\mu)$, such that $g = f$, µ-a.e.
Moreover, if $f$ satisfies these equivalent conditions, then any function $g$, satisfying (ii), also has the property
\[ \int_X f \, d\mu = \int_X g \, d\mu. \]

Proof. Consider the set $F = \{x \in X : -\infty < f(x) < \infty\}$, which belongs to $\mathcal{A}$. We obviously have the equality $X \smallsetminus F = |f|^{-1}(\{\infty\})$.
(i) ⇒ (ii). Assume $f \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$, which means that $|f| \in L^1_+(X,\mathcal{A},\mu)$. In particular, we get $\mu(X \smallsetminus F) = 0$. Define the measurable function $g = f\kappa_F$. On the one hand, it is clear that, by construction, we have $-\infty < g(x) < \infty$, $\forall\, x \in X$. On the other hand, it is clear that $g|_F = f|_F$, so using $\mu(X \smallsetminus F) = 0$, we get the fact that $f = g$, µ-a.e. Finally, the inequality $0 \le |g| \le |f|$, combined with Proposition 1.3, gives $|g| \in L^1_+(X,\mathcal{A},\mu)$, so $g$ indeed belongs to $L^1_{\mathbb{R}}(X,\mathcal{A},\mu)$.
(ii) ⇒ (i). Suppose there exists $g \in L^1_{\mathbb{R}}(X,\mathcal{A},\mu)$, with $f = g$, µ-a.e., and let us prove that
(a) $f \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$;
(b) $\int_X f \, d\mu = \int_X g \, d\mu$.
The first assertion is clear, because by Proposition 1.3, the equality $|f| = |g|$, µ-a.e., combined with $|g| \in L^1_+(X,\mathcal{A},\mu)$, forces $|f| \in L^1_+(X,\mathcal{A},\mu)$, i.e. $f \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$. To prove (b), we consider the difference $h = f - g$, which is a measurable function $h : X \to \bar{\mathbb{R}}$, and satisfies $h = 0$, µ-a.e. By Corollary 1.3, we know that $h \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$, and $\int_X h \, d\mu = 0$. By Theorem 1.4, we get
\[ \int_X f \, d\mu = \int_X g \, d\mu + \int_X h \, d\mu = \int_X g \, d\mu. \qquad \square \]

The following result is an analogue of Proposition 1.1 (see also Proposition 1.3).

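Proposition 1.7. Let $(X,\mathcal{A},\mu)$ be a measure space, and let $f_1, f_2 \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$. Suppose $f : X \to \bar{\mathbb{R}}$ is a measurable function, such that $f_1 \le f \le f_2$, µ-a.e. Then $f \in L^1_{\bar{\mathbb{R}}}(X,\mathcal{A},\mu)$, and one has the inequality
\[ \int_X f_1 \, d\mu \le \int_X f \, d\mu \le \int_X f_2 \, d\mu. \]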

Proof. First of all, since f1 and f2 belong to L1R̄ (X, A, µ), it follows that
|f1 | and |f2 |, hence also |f1 | + |f2 |, belong to L1+ (X, A, µ). Second, since we have
f2 ≤ |f2 | ≤ |f1 | + |f2 | and f1 ≥ −|f1 | ≥ −|f1 | − |f2 | (everywhere!), the inequalities
f1 ≤ f ≤ f2 , µ-a.e., give
−|f1 | − |f2 | ≤ f ≤ |f1 | + |f2 |, µ-a.e.,
which reads
|f | ≤ |f1 | + |f2 |, µ-a.e.
Since |f1 | + |f2 | ∈ L1+ (X, A, µ), by Proposition 1.3., we get |f | ∈ L1+ (X, A, µ), so f
indeed belongs to L1R̄ (X, A, µ).
To prove the inequality for integrals, we use Lemma 1.2, to find functions $g_1, g_2, g \in L^1_{\mathbb{R}}(X,\mathcal{A},\mu)$, such that $f_1 = g_1$, µ-a.e., $f_2 = g_2$, µ-a.e., and $f = g$, µ-a.e. Lemma 1.2 also gives the equalities $\int_X f_1 \, d\mu = \int_X g_1 \, d\mu$, $\int_X f_2 \, d\mu = \int_X g_2 \, d\mu$, and $\int_X f \, d\mu = \int_X g \, d\mu$, so what we need to prove are the inequalities
\[ \int_X g_1 \, d\mu \le \int_X g \, d\mu \le \int_X g_2 \, d\mu. \tag{24} \]
Of course, we have
g1 ≤ g ≤ g2 , µ-a.e.
To prove the first inequality in (24), we consider the function $h = g - g_1 \in L^1_{\mathbb{R}}(X,\mathcal{A},\mu)$, and we prove that $\int_X h \, d\mu \ge 0$. But this is quite clear, because we have $h \ge 0$, µ-a.e., which means that $h = |h|$, µ-a.e., so by Lemma 1.2, we get
\[ \int_X h \, d\mu = \int_X |h| \, d\mu = I^\mu_+(|h|) \ge 0. \]
The second inequality in (24) is proven in exactly the same way. □
The next result is an analogue of Proposition 1.4.
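Proposition 1.8. Let $(X,\mathcal{A},\mu)$ be a measure space, and let $\mathbb{K}$ be one of the symbols $\bar{\mathbb{R}}$, $\mathbb{R}$, or $\mathbb{C}$. Suppose $(A_k)_{k=1}^n \subset \mathcal{A}$ is a pairwise disjoint finite sequence, with $A_1 \cup \dots \cup A_n = X$. For a measurable function $f : X \to \mathbb{K}$, the following are equivalent:
(i) $f \in L^1_{\mathbb{K}}(X,\mathcal{A},\mu)$;
(ii) $f\kappa_{A_k} \in L^1_{\mathbb{K}}(X,\mathcal{A},\mu)$, $\forall\, k = 1, \dots, n$.
Moreover, if $f$ satisfies these equivalent conditions, one has
\[ \int_X f \, d\mu = \sum_{k=1}^n \int_X f\kappa_{A_k} \, d\mu. \tag{25} \]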

Proof. It is fairly obvious that $|f\kappa_{A_k}| = |f|\kappa_{A_k}$. Then the equivalence (i) ⇔ (ii) follows from Proposition 1.4 applied to the function $|f| : X \to [0,\infty]$. In the cases when $\mathbb{K} = \mathbb{R}, \mathbb{C}$, the equality (25) follows immediately from linearity, and the obvious equality $f = \sum_{k=1}^n f\kappa_{A_k}$. In the case when $\mathbb{K} = \bar{\mathbb{R}}$, we take $g \in L^1_{\mathbb{R}}(X,\mathcal{A},\mu)$, such that $f = g$, µ-a.e. Then we obviously have $f\kappa_{A_k} = g\kappa_{A_k}$, µ-a.e., for all $k = 1, \dots, n$, and the equality (25) follows from the corresponding equality that holds for $g$. □
Remark 1.8. The equality (25) also holds for arbitrary measurable functions
f : X → [0, ∞], if we use the convention that preceded Remarks 1.7. This is an
immediate consequence of Proposition 1.4, because the left hand side is infinite, if
and only if one of the terms in the right hand side is infinite.
The following is an obvious extension of Remark 1.4.
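Remark 1.9. Let $\mathbb{K}$ be one of the symbols $\bar{\mathbb{R}}$, $\mathbb{R}$, or $\mathbb{C}$, and let $(X,\mathcal{A},\mu)$ be a measure space. For a set $S \in \mathcal{A}$, and a measurable function $f : X \to \mathbb{K}$, one has the equivalence
\[ f\kappa_S \in L^1_{\mathbb{K}}(X,\mathcal{A},\mu) \iff f|_S \in L^1_{\mathbb{K}}\bigl(S, \mathcal{A}|_S, \mu|_S\bigr). \]
If this is the case, one has the equality
\[ \int_X f\kappa_S \, d\mu = \int_S f|_S \, d\mu|_S. \tag{26} \]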
The above equality also holds for arbitrary measurable functions f : X → [0, ∞],
again using the convention that preceded Remarks 1.7.
Notation. The above remark states that, whenever the quantities in (26) are defined, they are equal. (This only requires the fact that $f|_S$ is measurable, and either $f|_S \in L^1_{\mathbb{K}}\bigl(S, \mathcal{A}|_S, \mu|_S\bigr)$, or $f(S) \subset [0,\infty]$.) In this case, the equal quantities in (26) will be simply denoted by $\int_S f \, d\mu$.


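Exercise 1. Let $I$ be some non-empty set. Consider the σ-algebra $\mathcal{P}(I)$, of all subsets of $I$, equipped with the counting measure
\[ \mu(A) = \begin{cases} \mathrm{Card}\, A & \text{if } A \text{ is finite} \\ \infty & \text{if } A \text{ is infinite} \end{cases} \]
Prove that $L^1_{\bar{\mathbb{R}}}(I, \mathcal{P}(I), \mu) = L^1_{\mathbb{R}}(I, \mathcal{P}(I), \mu)$. Prove that, if $\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$, then
\[ L^1_{\mathbb{K}}(I, \mathcal{P}(I), \mu) = \ell^1_{\mathbb{K}}(I), \]
the Banach space discussed in II.2 and II.3.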
Exercise 2. There is an instance when the entire theory developed here is essentially vacuous. Let $X$ be a non-empty set, and let $\mathcal{A}$ be a σ-algebra on $X$. For a measure $\mu$ on $\mathcal{A}$, prove that the following are equivalent:
(i) $L^1_+(X,\mathcal{A},\mu) = \bigl\{ f : X \to [0,\infty] : f \text{ measurable, and } f = 0,\ \mu\text{-a.e.} \bigr\}$;
(ii) for every $A \in \mathcal{A}$, one has $\mu(A) \in \{0, \infty\}$.


A measure space (X, A, µ), with property (ii), is said to be degenerate.
Exercise 3 ♦. Let $(X,\mathcal{A},\mu)$ be a measure space, and let $f : X \to [0,\infty]$ be a measurable function, with $\int_X f \, d\mu = 0$. Prove that $f = 0$, µ-a.e.
Hint: Define the measurable sets $A_n = \{x \in X : f(x) \ge \frac{1}{n}\}$, and analyze the relationship between $f$ and $\kappa_{A_n}$.
Lecture 34

2. Convergence theorems
In this section we analyze the dynamics of integrability in the case when sequences of measurable functions are considered. Roughly speaking, a “convergence theorem” states that integrability is preserved under taking limits. In other words, if one has a sequence $(f_n)_{n=1}^\infty$ of integrable functions, and if $f$ is some kind of a limit of the $f_n$'s, then we would like to conclude that $f$ itself is integrable, as well as the equality $\int f = \lim_{n\to\infty} \int f_n$.
Such results are often employed in two instances:
A. When we want to prove that some function $f$ is integrable. In this case we would look for a sequence $(f_n)_{n=1}^\infty$ of integrable approximants for $f$.
B. When we want to construct an integrable function. In this case, we will produce first the approximants, and then we will examine the existence of the limit.
The first convergence result, which is somewhat primitive, but very useful, is the following.
Lemma 2.1. Let (X, A, µ) be a finite measure space, let a ∈ (0, ∞) and let
fn : X → [0, a], n ≥ 1, be a sequence of measurable functions satisfying
(a) f1 ≥ f2 ≥ · · · ≥ 0;
(b) limn→∞ fn (x) = 0, ∀ x ∈ X.
Then one has the equality
\[ \lim_{n\to\infty} \int_X f_n \, d\mu = 0. \tag{1} \]

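Proof. Let us define, for each $\varepsilon > 0$, and each integer $n \ge 1$, the set
\[ A^\varepsilon_n = \{x \in X : f_n(x) \ge \varepsilon\}. \]
Obviously, we have $A^\varepsilon_n \in \mathcal{A}$, $\forall\, \varepsilon > 0$, $n \ge 1$. One key fact we are going to use is the following.
Claim 1: For every $\varepsilon > 0$, one has the equality
\[ \lim_{n\to\infty} \mu(A^\varepsilon_n) = 0. \]
Fix $\varepsilon > 0$. Let us first observe that, using (a), we have the inclusions
\[ A^\varepsilon_1 \supset A^\varepsilon_2 \supset \dots \tag{2} \]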


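Second, using (b), we clearly have the equality $\bigcap_{k=1}^\infty A^\varepsilon_k = \varnothing$. Since $\mu$ is finite, using the Continuity Property (Lemma III.4.1), we have
\[ \lim_{n\to\infty} \mu(A^\varepsilon_n) = \mu\Bigl( \bigcap_{n=1}^\infty A^\varepsilon_n \Bigr) = \mu(\varnothing) = 0. \]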
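Claim 2: For every $\varepsilon > 0$ and every integer $n \ge 1$, one has the inequality
\[ 0 \le \int_X f_n \, d\mu \le a\mu(A^\varepsilon_n) + \varepsilon\mu(X). \]
Fix $\varepsilon$ and $n$, and let us consider the elementary function
\[ h^\varepsilon_n = a\kappa_{A^\varepsilon_n} + \varepsilon\kappa_{B^\varepsilon_n}, \]
where $B^\varepsilon_n = X \smallsetminus A^\varepsilon_n$. Obviously, since $\mu(X) < \infty$, the function $h^\varepsilon_n$ is elementary integrable. By construction, we clearly have $0 \le f_n \le h^\varepsilon_n$, so using the properties of integration, we get
\[ 0 \le \int_X f_n \, d\mu \le \int_X h^\varepsilon_n \, d\mu = a\mu(A^\varepsilon_n) + \varepsilon\mu(B^\varepsilon_n) \le a\mu(A^\varepsilon_n) + \varepsilon\mu(X). \]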

Using Claims 1 and 2, it follows immediately that
\[ 0 \le \liminf_{n\to\infty} \int_X f_n \, d\mu \le \limsup_{n\to\infty} \int_X f_n \, d\mu \le \varepsilon\mu(X). \]

Since the last inequality holds for arbitrary ε > 0, the desired equality (1) immediately follows. □
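The finiteness of $\mu$ is essential in Lemma 2.1. A minimal sketch of what can go wrong, assuming the counting measure on all subsets of $\mathbb{N}$ (an example chosen here for illustration):

```latex
% Assumption: X = \mathbb{N}, \mathcal{A} = \mathcal{P}(\mathbb{N}), \mu = counting measure (not finite).
% Put f_n = \kappa_{\{n, n+1, \dots\}}, so that f_n : X \to [0, 1].
% (a) f_1 \ge f_2 \ge \dots \ge 0, and (b) f_n(x) = 0 as soon as n > x, so f_n(x) \to 0 for every x.
% However, for 0 < \varepsilon \le 1 every set A^\varepsilon_n is infinite:
\[
\mu(A^\varepsilon_n) = \mu(\{n, n+1, \dots\}) = \infty , \qquad \forall\, n \ge 1 ,
\]
% so Claim 1 fails, and with the convention \int_X f \, d\mu = \infty for non-integrable f \ge 0,
\[
\int_X f_n \, d\mu = \infty , \qquad \forall\, n \ge 1 .
\]
```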

We now turn our attention to a weaker notion of limit, for sequences of mea-
surable functions.
Definition. Let $(X,\mathcal{A},\mu)$ be a measure space, and let $\mathbb{K}$ be one of the symbols $\bar{\mathbb{R}}$, $\mathbb{R}$, or $\mathbb{C}$. Suppose $f_n : X \to \mathbb{K}$, $n \ge 1$, are measurable functions. Given a measurable function $f : X \to \mathbb{K}$, we say that the sequence $(f_n)_{n=1}^\infty$ converges µ-almost everywhere to $f$, if there exists some set $N \in \mathcal{A}$, with $\mu(N) = 0$, such that
\[ \lim_{n\to\infty} f_n(x) = f(x), \quad \forall\, x \in X \smallsetminus N. \]
In this case we write
\[ f = \mu\text{-a.e.-}\lim_{n\to\infty} f_n. \]
Remark 2.1. This notion of convergence has, among other things, a certain uniqueness feature. One way to describe this is to say that the limit of a µ-a.e. convergent sequence is µ-almost unique, in the sense that if $f$ and $g$ are measurable functions which satisfy the equalities $f = \mu\text{-a.e.-}\lim_{n\to\infty} f_n$ and $g = \mu\text{-a.e.-}\lim_{n\to\infty} f_n$, then $f = g$, µ-a.e. This is quite obvious: if $M, N \in \mathcal{A}$ are sets with $\mu(M) = \mu(N) = 0$, such that
\[ \lim_{n\to\infty} f_n(x) = f(x), \quad \forall\, x \in X \smallsetminus M, \]
\[ \lim_{n\to\infty} f_n(x) = g(x), \quad \forall\, x \in X \smallsetminus N, \]
then $\mu(M \cup N) = 0$, and
\[ f(x) = g(x), \quad \forall\, x \in X \smallsetminus [M \cup N]. \]

Comment. The above definition makes sense if K is an arbitrary metric space. Any of the spaces R̄, R, and C is in fact a complete metric space. There are instances where the requirement that f is measurable is in fact redundant. This is somewhat clarified by the next two exercises.
Exercise 1*. Let (X, A, µ) be a measure space, let K be a complete separable
metric space, and let fn : X → K, n ≥ 1, be measurable functions.
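(i) Prove that the set
\[ L = \bigl\{ x \in X : \bigl(f_n(x)\bigr)_{n=1}^\infty \subset K \text{ is convergent} \bigr\} \]
belongs to $\mathcal{A}$.
(ii) If we fix some point $\alpha \in K$, and we define $\ell : X \to K$ by
\[ \ell(x) = \begin{cases} \lim_{n\to\infty} f_n(x) & \text{if } x \in L \\ \alpha & \text{if } x \in X \smallsetminus L \end{cases} \]
then $\ell$ is measurable.
In particular, if $\mu(X \smallsetminus L) = 0$, then $\ell = \mu\text{-a.e.-}\lim_{n\to\infty} f_n$.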
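Hints: If $d$ denotes the metric on $K$, then prove first that, for every $\varepsilon > 0$ and every $m, n \ge 1$, the set
\[ D^\varepsilon_{mn} = \bigl\{ x \in X : d\bigl(f_m(x), f_n(x)\bigr) < \varepsilon \bigr\} \]
belongs to $\mathcal{A}$ (use the results from III.3). Based on this fact, prove that, for every $p, k \ge 1$, the set
\[ E^p_k = \bigl\{ x \in X : d\bigl(f_m(x), f_n(x)\bigr) < \tfrac{1}{p},\ \forall\, m, n \ge k \bigr\} \]
belongs to $\mathcal{A}$. Finally, use completeness to prove that
\[ L = \bigcap_{p=1}^\infty \Bigl( \bigcup_{k=1}^\infty E^p_k \Bigr). \]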

Exercise 2. Let $(X,\mathcal{A},\mu)$, $K$, and $(f_n)_{n=1}^\infty$ be as in Exercise 1. Assume $f : X \to K$ is an arbitrary function, for which there exists some set $N \in \mathcal{A}$ with $\mu(N) = 0$, and
\[ \lim_{n\to\infty} f_n(x) = f(x), \quad \forall\, x \in X \smallsetminus N. \]

Prove that, when $\mu$ is a complete measure on $\mathcal{A}$ (see III.5), the function $f$ is automatically measurable.
Hint: Use the results from Exercise 1. We have $X \smallsetminus N \subset L$, and $f(x) = \ell(x)$, $\forall\, x \in X \smallsetminus N$. Prove that, for a Borel set $B \subset K$, one has the equality $f^{-1}(B) = \ell^{-1}(B) \triangle M$, for some $M \subset N$. By completeness, we have $M \in \mathcal{A}$, so $f^{-1}(B) \in \mathcal{A}$.
The following fundamental result is a generalization of Lemma 2.1.
Theorem 2.1 (Lebesgue Monotone Convergence Theorem). Let $(X,\mathcal{A},\mu)$ be a measure space, and let $(f_n)_{n=1}^\infty \subset L^1_+(X,\mathcal{A},\mu)$ be a sequence with:
• $f_n \le f_{n+1}$, µ-a.e., $\forall\, n \ge 1$;
• $\sup\bigl\{ \int_X f_n \, d\mu : n \ge 1 \bigr\} < \infty$.
Assume $f : X \to [0,\infty]$ is a measurable function, with $f = \mu\text{-a.e.-}\lim_{n\to\infty} f_n$. Then $f \in L^1_+(X,\mathcal{A},\mu)$, and $\int_X f \, d\mu = \lim_{n\to\infty} \int_X f_n \, d\mu$.

Proof. Define $\alpha_n = \int_X f_n \, d\mu$, $n \ge 1$. First of all, we clearly have
\[ 0 \le \alpha_1 \le \alpha_2 \le \dots, \]
so the sequence $(\alpha_n)_{n=1}^\infty$ has a limit $\alpha = \lim_{n\to\infty} \alpha_n$, and we have in fact the equality
\[ \alpha = \sup\{\alpha_n : n \ge 1\} < \infty. \]
With these notations, all we need to prove is the fact that $f \in L^1_+(X,\mathcal{A},\mu)$, and that we have
\[ \int_X f \, d\mu = \alpha. \tag{3} \]
Fix a set M ∈ A, with µ(M ) = 0, and such that limn→∞ fn (x) = f (x),
∀ x ∈ X r M . For each n, we define the set
Mn = {x ∈ X : fn (x) > fn+1 (x)}.
Obviously $M_n \in \mathcal{A}$, and by assumption, we have $\mu(M_n) = 0$, $\forall\, n \ge 1$. Define the set $N = M \cup \bigcup_{n=1}^\infty M_n$. It is clear that $\mu(N) = 0$, and
• 0 ≤ f1 (x) ≤ f2 (x) ≤ · · · ≤ f (x), ∀ x ∈ X r N ;
• f (x) = limn→∞ fn (x), ∀ x ∈ X r N .
So if we put A = X r N , and if we define the measurable functions gn = fn κ A ,
n ≥ 1, and g = f κ A , then we have
(a) 0 ≤ g1 ≤ g2 ≤ · · · ≤ g (everywhere!);
(b) limn→∞ gn (x) = g(x), ∀ x ∈ X;
(c) gn = fn , µ-a.e., ∀ n ≥ 1;
(d) g = f , µ-a.e.;
Notice that property (c) gives $g_n \in L^1_+(X,\mathcal{A},\mu)$ and $\int_X g_n \, d\mu = \alpha_n$, $\forall\, n \ge 1$. By property (d), we see that we have the equivalence
\[ g \in L^1_+(X,\mathcal{A},\mu) \iff f \in L^1_+(X,\mathcal{A},\mu). \]
Moreover, if $g \in L^1_+(X,\mathcal{A},\mu)$, then we will have $\int_X g \, d\mu = \int_X f \, d\mu$. These observations show that it suffices to prove the theorem with the $g$'s in place of the $f$'s. The advantage is now the fact that we have the slightly stronger properties (a) and (b) above. The first step in the proof is the following.
Claim 1: For every $t \in (0,\infty)$, one has the inequality $\mu\bigl(g^{-1}((t,\infty])\bigr) \le \frac{\alpha}{t}$.
Denote the set $g^{-1}((t,\infty])$ simply by $A_t$. For each $n \ge 1$, we also define the set $A^n_t = g_n^{-1}((t,\infty])$. Using property (a) above, it is clear that we have the inclusions
\[ A^1_t \subset A^2_t \subset \dots \subset A_t. \tag{4} \]
Using property (b) above, we also have the equality $A_t = \bigcup_{n=1}^\infty A^n_t$. Using the continuity Lemma 4.1, we then have
\[ \mu(A_t) = \lim_{n\to\infty} \mu(A^n_t), \]
so in order to prove the Claim, it suffices to prove the inequalities
\[ \mu(A^n_t) \le \frac{\alpha_n}{t}, \quad \forall\, n \ge 1. \tag{5} \]
But the above inequality is pretty obvious, since we clearly have $0 \le t\kappa_{A^n_t} \le g_n$, which gives
\[ t\mu(A^n_t) = \int_X t\kappa_{A^n_t} \, d\mu \le \int_X g_n \, d\mu = \alpha_n. \]
Claim 2: For any elementary function $h \in \mathcal{A}\text{-}\mathrm{Elem}_{\mathbb{R}}(X)$, with $0 \le h \le g$, one has
(i) $h \in L^1_{\mathbb{R},\mathrm{elem}}(X,\mathcal{A},\mu)$;
(ii) $\int_X h \, d\mu \le \alpha$.
Start with some elementary function h, with 0 ≤ h ≤ g. Assume h is not identically
zero, so we can write it as
h = β1 κ B1 + · · · + βp κ Bp ,
with $(B_j)_{j=1}^p \subset \mathcal{A}$ pairwise disjoint, and $0 < \beta_1 < \dots < \beta_p$. Define the set $B = B_1 \cup \dots \cup B_p$. It is obvious that, if we put $t = \beta_1/2$, we have the inclusion $B \subset g^{-1}\bigl((t,\infty]\bigr)$, so by Claim 1, we get $\mu(B) < \infty$. This gives, of course $\mu(B_j) < \infty$,
∀ j = 1, . . . , p, so h is indeed elementary integrable. To prove the estimate (ii), we
define the measurable functions hn : X → [0, ∞] by hn = min{gn , h}, ∀ n ≥ 1.
Since 0 ≤ hn ≤ gn , ∀ n ≥ 1, it follows that, hn ∈ L1+ (X, A, µ), ∀ n ≥ 1, and we have
the inequalities
Z Z
(6) hn dµ ≤ gn dµ = αn , ∀ n ≥ 1.
X X
It is obvious that we have
(∗) 0 ≤ h1 ≤ h2 ≤ · · · ≤ h ≤ βp κ B (everywhere);
(∗∗) h(x) = limn→∞ hn (x), ∀ x ∈ X.
Let us restrict everything to B. We consider the σ-algebra ℬ = A|_B, and the
measure ν = µ|_B. Consider the elementary function ψ = h|_B ∈ ℬ-Elem_R(B), as
well as the measurable functions ψ_n = h_n|_B : B → [0, ∞], n ≥ 1. It is clear that
ψ ∈ L^1_{R,elem}(B, ℬ, ν), and we have the equality
(7) ∫_B ψ dν = ∫_X h dµ.
Likewise, using (∗), which clearly forces h_n|_{X∖B} = 0, it follows that, for each n ≥ 1,
the function ψ_n belongs to L^1_+(B, ℬ, ν), and by (6), we have
(8) ∫_B ψ_n dν = ∫_X h_n dµ, ∀ n ≥ 1.

Let us analyze the differences ϕ_n = ψ − ψ_n. On the one hand, using (∗), we
have ϕ_n(x) ∈ [0, β_p], ∀ x ∈ B, n ≥ 1. On the other hand, again by (∗), we have
ϕ_1 ≥ ϕ_2 ≥ . . . . Finally, by (∗∗) we have lim_{n→∞} ϕ_n(x) = 0, ∀ x ∈ B. We can
apply Lemma 2.1, and we will get lim_{n→∞} ∫_B ϕ_n dν = 0. This clearly gives
∫_B ψ dν = lim_{n→∞} ∫_B ψ_n dν,
and then using (7) and (8), we get the equality
∫_X h dµ = lim_{n→∞} ∫_X h_n dµ.
Combining this with (6) immediately gives the desired estimate ∫_X h dµ ≤ α.
Having proven Claim 2, let us observe now that, using the definition of the
positive integral, it follows immediately that g ∈ L^1_+(X, A, µ), and we have the
inequality
∫_X g dµ ≤ α.
The other inequality is pretty obvious, because the inequality g ≥ g_n forces
∫_X g dµ ≥ ∫_X g_n dµ = α_n, ∀ n ≥ 1,
so we immediately get
∫_X g dµ ≥ sup{α_n ; n ≥ 1} = α.

Comment. In the previous section we introduced the convention which defines
∫_X f dµ = ∞, if f : X → [0, ∞] is measurable, but f ∉ L^1_+(X, A, µ). Using this
convention, the Lebesgue Monotone Convergence Theorem has the following general
version.
Theorem 2.2 (General Lebesgue Monotone Convergence Theorem). Let
(X, A, µ) be a measure space, and let f, fn : X → [0, ∞], n ≥ 1, be measurable
functions, such that
• fn ≤ fn+1 , µ-a.e., ∀ n ≥ 1;
• f = µ-a.e.- limn→∞ fn .
Then
(9) ∫_X f dµ = lim_{n→∞} ∫_X f_n dµ.

Proof. As before, the sequence (α_n)_{n=1}^∞ ⊂ [0, ∞], defined by α_n = ∫_X f_n dµ,
∀ n ≥ 1, is non-decreasing, so it has a limit
α = lim_{n→∞} α_n = sup{∫_X f_n dµ : n ≥ 1} ∈ [0, ∞].
There are two cases to analyze.
Case I : α = ∞.
In this case the inequalities f ≥ f_n ≥ 0, µ-a.e. will force
∫_X f dµ ≥ ∫_X f_n dµ = α_n, ∀ n ≥ 1,
which will force ∫_X f dµ ≥ α, so we indeed get
∫_X f dµ = ∞ = α.
Case II : α < ∞.
In this case we apply directly Theorem 2.1. 
The following result provides an equivalent definition of integrability for non-
negative functions (compare to the construction in Section 1).
Corollary 2.1. Let (X, A, µ) be a measure space, and let f : X → [0, ∞] be
a measurable function. The following are equivalent:
(i) f ∈ L1+ (X, A, µ);
(ii) there exists a sequence (h_n)_{n=1}^∞ ⊂ L^1_{R,elem}(X, A, µ), with
• 0 ≤ h_1 ≤ h_2 ≤ . . . ;
• lim_{n→∞} h_n(x) = f(x), ∀ x ∈ X;
• sup{∫_X h_n dµ : n ≥ 1} < ∞.
Moreover, if (h_n)_{n=1}^∞ is as in (ii), then one has the equality
(10) ∫_X f dµ = lim_{n→∞} ∫_X h_n dµ.

Proof. (i) ⇒ (ii). Assume f ∈ L^1_+(X, A, µ). Using Theorem III.3.2, we know
there exists a sequence (h_n)_{n=1}^∞ ⊂ A-Elem_R(X), with
(a) 0 ≤ h_1 ≤ h_2 ≤ · · · ≤ f;
(b) lim_{n→∞} h_n(x) = f(x), ∀ x ∈ X.
Note that (a) forces h_n ∈ L^1_{R,elem}(X, A, µ), as well as the inequalities
∫_X h_n dµ ≤ ∫_X f dµ < ∞, ∀ n ≥ 1, so the sequence (h_n)_{n=1}^∞ clearly satisfies condition (ii).
The implication (ii) ⇒ (i), and the equality (10) immediately follow from the
General Lebesgue Monotone Convergence Theorem.

Corollary 2.2 (Fatou Lemma). Let (X, A, µ) be a measure space, and let
fn : X → [0, ∞], n ≥ 1, be a sequence of measurable functions. Define the function
f : X → [0, ∞] by
f(x) = lim inf_{n→∞} f_n(x), ∀ x ∈ X.
Then f is measurable, and one has the inequality
∫_X f dµ ≤ lim inf_{n→∞} ∫_X f_n dµ.

Proof. The fact that f is measurable is already known (see Corollary III.3.5).
Define the sequence (α_n)_{n=1}^∞ ⊂ [0, ∞] by α_n = ∫_X f_n dµ, ∀ n ≥ 1.
Define, for each integer n ≥ 1, the function g_n : X → [0, ∞] by
g_n(x) = inf{f_k(x) : k ≥ n}, ∀ x ∈ X.
By Corollary III.3.4, we know that g_n, n ≥ 1 are all measurable. Moreover, it is
clear that
• 0 ≤ g_1 ≤ g_2 ≤ . . . ;
• f(x) = lim_{n→∞} g_n(x), ∀ x ∈ X.
By the General Lebesgue Monotone Convergence Theorem 2.2, it follows that
(11) ∫_X f dµ = lim_{n→∞} ∫_X g_n dµ.
Notice that, if we define the sequence (β_n)_{n=1}^∞ ⊂ [0, ∞], by β_n = ∫_X g_n dµ, ∀ n ≥ 1,
then the obvious inequalities 0 ≤ g_n ≤ f_n give ∫_X g_n dµ ≤ ∫_X f_n dµ, so we get
β_n ≤ α_n, ∀ n ≥ 1.
Using (11), we then get
∫_X f dµ = lim_{n→∞} β_n = lim inf_{n→∞} β_n ≤ lim inf_{n→∞} α_n.
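The inequality in the Fatou Lemma can be strict. The following Python sketch is an illustration only and is not part of the original notes: the sequence and the helper names are our own assumptions. On [0, 1] with the Lebesgue measure, take f_n = κ_{[0,1/2]} for even n and f_n = κ_{(1/2,1]} for odd n. Every f_n has integral 1/2, while the values f_n(x) alternate between 0 and 1 at every point x, so lim inf f_n is identically zero and ∫ lim inf f_n dλ = 0 < 1/2 = lim inf ∫ f_n dλ.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def f(n, x):
    # f_n = indicator of [0, 1/2] for even n, indicator of (1/2, 1] for odd n
    in_left = x <= HALF
    return 1 if in_left == (n % 2 == 0) else 0

def integral_f(n):
    # each f_n is the indicator of an interval of length 1/2
    return HALF

def liminf_periodic(seq_value, period=2, start=1):
    # for a sequence that is periodic in n, lim inf = minimum over one period
    return min(seq_value(n) for n in range(start, start + period))

# pointwise lim inf at a few sample points of [0, 1]
xs = [Fraction(k, 10) for k in range(11)]
pointwise_liminf = [liminf_periodic(lambda n, x=x: f(n, x)) for x in xs]
liminf_of_integrals = liminf_periodic(integral_f)

print(pointwise_liminf)      # zero at every sample point
print(liminf_of_integrals)   # 1/2
```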

The following is an important application of Theorem 2.1, that deals with
Riemann integration.
Corollary 2.3. Let a < b be real numbers. Denote by λ the Lebesgue measure,
and consider the Lebesgue space ([a, b], M_λ([a, b]), λ), where M_λ([a, b]) denotes the
σ-algebra of all Lebesgue measurable subsets of [a, b]. Then every Riemann integrable
function f : [a, b] → R belongs to L^1_R([a, b], M_λ([a, b]), λ), and one has the
equality
(12) ∫_{[a,b]} f dλ = ∫_a^b f(x) dx.

Proof. We are going to use the results from III.6. First of all, the fact that f
is Lebesgue integrable, i.e. f belongs to L^1_R([a, b], M_λ([a, b]), λ), is clear since f is
Lebesgue measurable, and bounded. (Here we use the fact that the measure space
([a, b], M_λ([a, b]), λ) is finite.)
Next we prove the equality between the Riemann integral and the Lebesgue
integral. Adding a constant, if necessary, we can assume that f ≥ 0. For every
partition ∆ = (a = x_0 < x_1 < · · · < x_n = b) of [a, b], we define the numbers
m_k = inf_{t∈[x_{k−1},x_k]} f(t), ∀ k = 1, . . . , n,
and we define the function
f_∆ = m_1 κ_{[x_0,x_1]} + m_2 κ_{(x_1,x_2]} + · · · + m_n κ_{(x_{n−1},x_n]}.
Fix a sequence of partitions (∆_p)_{p=1}^∞, with ∆_1 ⊂ ∆_2 ⊂ . . . , and lim_{p→∞} |∆_p| = 0.
We know (see III.6) that we have
f = λ-a.e.- lim_{p→∞} f_{∆_p}.
Clearly we have 0 ≤ f_{∆_1} ≤ f_{∆_2} ≤ · · · ≤ f, so by Theorem 2.1, we get
(13) ∫_{[a,b]} f dλ = lim_{p→∞} ∫_{[a,b]} f_{∆_p} dλ.
Notice however that
∫_{[a,b]} f_{∆_p} dλ = L(f, ∆_p), ∀ p ≥ 1,
where L(f, ∆_p) denotes the lower Darboux sum. Combining this with (13), and with
the well known properties of Riemann integration, we immediately get (12).
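As a small numerical illustration of this proof (a sketch under our own assumptions, not part of the original notes), take f(x) = x² on [0, 1] and let ∆_p be the dyadic partition with 2^p equal cells. Since this f is increasing, m_k = f(x_{k−1}), and the lower Darboux sums L(f, ∆_p) form a non-decreasing sequence approaching 1/3. The function name below is hypothetical.

```python
from fractions import Fraction

def lower_darboux_sum(p):
    """L(f, Delta_p) for f(x) = x^2 on [0, 1], Delta_p = dyadic partition with 2^p cells."""
    n = 2 ** p
    total = Fraction(0)
    for k in range(1, n + 1):
        left = Fraction(k - 1, n)
        m_k = left ** 2                 # inf of x^2 on [x_{k-1}, x_k], since x^2 is increasing
        total += m_k * Fraction(1, n)   # m_k times the length of the k-th cell
    return total

sums = [lower_darboux_sum(p) for p in range(1, 9)]
for p, s in enumerate(sums, start=1):
    print(p, s, float(s))
```

The partitions are nested (∆_1 ⊂ ∆_2 ⊂ . . . ), which is what makes the step functions f_{∆_p} non-decreasing in p.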

The following is another important convergence theorem.


Theorem 2.3 (Lebesgue Dominated Convergence Theorem). Let (X, A, µ) be
a measure space, let K be one of the symbols R̄, R, or C, and let (f_n)_{n=1}^∞ ⊂
L^1_K(X, A, µ). Assume f : X → K is a measurable function, such that
(i) f = µ-a.e.- lim_{n→∞} f_n;
(ii) there exists some function g ∈ L^1_+(X, A, µ), such that
|f_n| ≤ g, µ-a.e., ∀ n ≥ 1.
Then f ∈ L^1_K(X, A, µ), and one has the equality
(14) ∫_X f dµ = lim_{n→∞} ∫_X f_n dµ.

Proof. The fact that f is integrable follows from the following
Claim: |f| ≤ g, µ-a.e.
To prove this fact, we define, for each n ≥ 1, the set
M_n = {x ∈ X : |f_n(x)| > g(x)}.
It is clear that M_n ∈ A, and µ(M_n) = 0, ∀ n ≥ 1. If we choose M ∈ A such that
µ(M) = 0, and
f(x) = lim_{n→∞} f_n(x), ∀ x ∈ X ∖ M,
then the set N = M ∪ ⋃_{n=1}^∞ M_n ∈ A will satisfy
• µ(N) = 0;
• |f_n(x)| ≤ g(x), ∀ x ∈ X ∖ N, n ≥ 1;
• f(x) = lim_{n→∞} f_n(x), ∀ x ∈ X ∖ N.
We then clearly get
|f(x)| ≤ g(x), ∀ x ∈ X ∖ N,
and the Claim follows.
Having proven that f is integrable, we now concentrate on the equality (14).
Case I : K = R̄.
First of all, without any loss of generality, we can assume that 0 ≤ g(x) < ∞,
∀ x ∈ X. (See Lemma 1.2.) Let us define the functions g_n = min{f_n, g} and
h_n = max{f_n, −g}, n ≥ 1. Since we have −g ≤ f_n ≤ g, µ-a.e., we immediately get
(15) g_n = h_n = f_n, µ-a.e., ∀ n ≥ 1,
thus giving the fact that g_n, h_n ∈ L^1_R̄(X, A, µ), ∀ n ≥ 1, as well as the equalities
(16) ∫_X g_n dµ = ∫_X h_n dµ = ∫_X f_n dµ, ∀ n ≥ 1.
Define the measurable functions ϕ, ψ : X → R̄ by
ϕ(x) = lim sup_{n→∞} g_n(x) and ψ(x) = lim inf_{n→∞} h_n(x), ∀ x ∈ X.
Using (15), we clearly have f = ϕ = ψ, µ-a.e., so we get
(17) ∫_X f dµ = ∫_X ϕ dµ = ∫_X ψ dµ.
Remark also that we have the equalities
(18) g(x) − ϕ(x) = lim inf_{n→∞} [g(x) − g_n(x)] and g(x) + ψ(x) = lim inf_{n→∞} [g(x) + h_n(x)], ∀ x ∈ X.

Since we clearly have
g − g_n ≥ 0 and g + h_n ≥ 0, ∀ n ≥ 1,
using (18) and the Fatou Lemma (Corollary 2.2), we get the inequalities
∫_X (g − ϕ) dµ ≤ lim inf_{n→∞} ∫_X (g − g_n) dµ,
∫_X (g + ψ) dµ ≤ lim inf_{n→∞} ∫_X (g + h_n) dµ.
In other words, we get
∫_X g dµ − ∫_X ϕ dµ ≤ lim inf_{n→∞} [∫_X g dµ − ∫_X g_n dµ] = ∫_X g dµ − lim sup_{n→∞} ∫_X g_n dµ,
∫_X g dµ + ∫_X ψ dµ ≤ lim inf_{n→∞} [∫_X g dµ + ∫_X h_n dµ] = ∫_X g dµ + lim inf_{n→∞} ∫_X h_n dµ.
Using the equalities (16) and (17), the above inequalities give
∫_X f dµ = ∫_X ϕ dµ ≥ lim sup_{n→∞} ∫_X g_n dµ = lim sup_{n→∞} ∫_X f_n dµ,
∫_X f dµ = ∫_X ψ dµ ≤ lim inf_{n→∞} ∫_X h_n dµ = lim inf_{n→∞} ∫_X f_n dµ.
In other words, we have
∫_X f dµ ≤ lim inf_{n→∞} ∫_X f_n dµ ≤ lim sup_{n→∞} ∫_X f_n dµ ≤ ∫_X f dµ,
thus giving the equality (14).
The case K = R is trivial (it is in fact contained in case K = R̄).
The case K = C is also pretty clear, using real and imaginary parts, since for
each n ≥ 1, we clearly have
|Re f_n| ≤ g, µ-a.e.,
|Im f_n| ≤ g, µ-a.e.
Exercise 3. Give an example of a sequence of continuous functions f_n : [0, 1] →
[0, ∞), such that
(a) lim_{n→∞} f_n(x) = 0, ∀ x ∈ [0, 1];
(b) ∫_{[0,1]} f_n dλ = 1, ∀ n ≥ 1.
(Here λ denotes the Lebesgue measure.) This shows that the Lebesgue Dominated
Convergence Theorem fails, without the dominance condition (ii).
Hint: Consider the functions f_n defined (for n ≥ 2, so that 2/n ≤ 1) by
f_n(x) = n²x if 0 ≤ x ≤ 1/n;
f_n(x) = n(2 − nx) if 1/n ≤ x ≤ 2/n;
f_n(x) = 0 if 2/n ≤ x ≤ 1.
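The tent functions from the hint can be checked with exact rational arithmetic. The Python sketch below is only an illustration with helper names of our own choosing; it assumes n ≥ 2, so that the support [0, 2/n] lies inside [0, 1]. Because f_n is piecewise linear, the trapezoid rule on its breakpoints gives the exact integral.

```python
from fractions import Fraction

def f(n, x):
    """Tent function from the hint: peak value n at x = 1/n, support [0, 2/n]."""
    if x <= Fraction(1, n):
        return n * n * x
    if x <= Fraction(2, n):
        return n * (2 - n * x)
    return Fraction(0)

def integral(n):
    """Exact integral of f_n over [0, 1] for n >= 2 (trapezoid rule on the breakpoints)."""
    pts = [Fraction(0), Fraction(1, n), Fraction(2, n), Fraction(1)]
    return sum((b - a) * (f(n, a) + f(n, b)) / 2 for a, b in zip(pts, pts[1:]))

x = Fraction(1, 10)
print([integral(n) for n in range(2, 8)])        # every entry equals 1
print([f(n, x) for n in (5, 10, 20, 21, 50)])    # vanishes once 2/n < x
print([f(n, Fraction(1, n)) for n in (2, 10, 100)])  # peaks n grow without bound
```

The growing peaks are consistent with the absence of a single integrable function g dominating all the f_n, which is exactly what condition (ii) would require.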

The Lebesgue Convergence Theorems 2.2 and 2.3 have many applications. They
are among the most important results in Measure Theory. In many instances, these
theorems are employed during proofs, at key steps. The next two results are good
illustrations.
Proposition 2.1. Let (X, A, µ) be a measure space, and let f : X → [0, ∞] be
a measurable function. Then the map
ν : A ∋ A ⟼ ∫_A f dµ ∈ [0, ∞]
defines a measure on A.
Proof. It is clear that ν(∅) = 0. To prove σ-additivity, start with a pairwise
disjoint sequence (A_n)_{n=1}^∞ ⊂ A, and put A = ⋃_{n=1}^∞ A_n. For each integer n ≥ 1,
define the set B_n = ⋃_{k=1}^n A_k, and the measurable function g_n = f κ_{B_n}. Define also
the function g = f κ_A. It is obvious that
• 0 ≤ g_1 ≤ g_2 ≤ · · · ≤ g (everywhere),
• lim_{n→∞} g_n(x) = g(x), ∀ x ∈ X.
Using the General Lebesgue Monotone Convergence Theorem, it follows that
(19) ν(A) = ∫_X f κ_A dµ = ∫_X g dµ = lim_{n→∞} ∫_X g_n dµ.
Notice now that, for each n ≥ 1, one has the equality
g_n = f κ_{A_1} + · · · + f κ_{A_n},
so using Remark 1.7.C, we get
∫_X g_n dµ = Σ_{k=1}^n ∫_X f κ_{A_k} dµ = Σ_{k=1}^n ν(A_k),
so the equality (19) immediately gives ν(A) = Σ_{n=1}^∞ ν(A_n).
The next result is a version of the previous one for K-valued functions.
Proposition 2.2. Let (X, A, µ) be a measure space, let K be one of the symbols
R̄, R, or C, and let (A_n)_{n=1}^∞ ⊂ A be a pairwise disjoint sequence with ⋃_{n=1}^∞ A_n = X.
For a function f : X → K, the following are equivalent.
(i) f ∈ L^1_K(X, A, µ);
(ii) f|_{A_n} ∈ L^1_K(A_n, A|_{A_n}, µ|_{A_n}), ∀ n ≥ 1, and
Σ_{n=1}^∞ ∫_{A_n} |f| dµ < ∞.
Moreover, if f satisfies these equivalent conditions, then
∫_X f dµ = Σ_{n=1}^∞ ∫_{A_n} f dµ.

Proof. (i) ⇒ (ii). Assume f ∈ L^1_K(X, A, µ). Applying Proposition 2.1 to
|f|, we immediately get
Σ_{n=1}^∞ ∫_{A_n} |f| dµ = ∫_X |f| dµ < ∞,
which clearly proves (ii).


(ii) ⇒ (i). Assume f satisfies condition (ii). Define, for each n ≥ 1, the set
B_n = A_1 ∪ · · · ∪ A_n. First of all, since (A_n)_{n=1}^∞ ⊂ A, and ⋃_{n=1}^∞ A_n = X, it follows
that f is measurable. Consider the functions f_n = f κ_{B_n} and g_n = f κ_{A_n}, n ≥ 1.
Notice that, since f|_{A_n} ∈ L^1_K(A_n, A|_{A_n}, µ|_{A_n}), it follows that g_n ∈ L^1_K(X, A, µ),
∀ n ≥ 1, and we also have
∫_X g_n dµ = ∫_{A_n} f dµ, ∀ n ≥ 1.
In fact we also have
∫_X |g_n| dµ = ∫_{A_n} |f| dµ, ∀ n ≥ 1.
Notice that we obviously have f_n = g_1 + · · · + g_n, and |f_n| = |g_1| + · · · + |g_n|, so if
we define
S = Σ_{n=1}^∞ ∫_{A_n} |f| dµ,
we get
∫_X |f_n| dµ = Σ_{k=1}^n ∫_{A_k} |f| dµ ≤ S < ∞, ∀ n ≥ 1.
Notice however that we have 0 ≤ |f_1| ≤ |f_2| ≤ · · · ≤ |f|, as well as the equality
lim_{n→∞} f_n(x) = f(x), ∀ x ∈ X. On the one hand, using the General Lebesgue
Monotone Convergence Theorem, we will get
∫_X |f| dµ = lim_{n→∞} ∫_X |f_n| dµ = lim_{n→∞} [Σ_{k=1}^n ∫_{A_k} |f| dµ] = Σ_{n=1}^∞ ∫_{A_n} |f| dµ = S < ∞,
which proves that |f| ∈ L^1_+(X, A, µ), so in particular f belongs to L^1_K(X, A, µ). On
the other hand, since we have |f_n| ≤ |f|, by the Lebesgue Dominated Convergence
Theorem, we get
∫_X f dµ = lim_{n→∞} ∫_X f_n dµ = lim_{n→∞} [Σ_{k=1}^n ∫_{A_k} f dµ] = Σ_{n=1}^∞ ∫_{A_n} f dµ.

Corollary 2.4. Let (X, A, µ) be a measure space, let K be one of the symbols
R̄, R, or C, and let (X_n)_{n=1}^∞ ⊂ A be a sequence with ⋃_{n=1}^∞ X_n = X, and X_1 ⊂
X_2 ⊂ . . . . For a measurable function f : X → K, the following are
equivalent.
(i) f ∈ L^1_K(X, A, µ);
(ii) f|_{X_n} ∈ L^1_K(X_n, A|_{X_n}, µ|_{X_n}), ∀ n ≥ 1, and
sup{∫_{X_n} |f| dµ : n ≥ 1} < ∞.
Moreover, if f satisfies these equivalent conditions, then
∫_X f dµ = lim_{n→∞} ∫_{X_n} f dµ.

Proof. Apply the above result to the sequence (A_n)_{n=1}^∞ given by A_1 = X_1
and A_n = X_n ∖ X_{n−1}, ∀ n ≥ 2.

Remark 2.2. Suppose (X, A, µ) is a measure space, K is one of the fields R or
C, and f ∈ L^1_K(X, A, µ). By Proposition 2.2, we get the fact that the map
ν : A ∋ A ⟼ ∫_A f dµ ∈ K
is a K-valued measure on A. By Proposition 2.1, we also know that
ω : A ∋ A ⟼ ∫_A |f| dµ ∈ K
is a finite “honest” measure on A. Using Proposition 1.6, we clearly have
|ν(A)| = |∫_A f dµ| ≤ ∫_A |f| dµ = ω(A), ∀ A ∈ A,
which by the results from III.8 gives the inequality |ν| ≤ ω. (Here |ν| denotes the
variation measure of ν.) Later on (see Section 4) we are going to see that in fact
we have the equality |ν| = ω.
Comment. It is important to understand the “sequential” nature of the
convergence theorems discussed here. If we examine for instance the Monotone
Convergence Theorem, we could easily formulate a “series” version, which states
the equality
∫_X [Σ_{n=1}^∞ f_n] dµ = Σ_{n=1}^∞ ∫_X f_n dµ,
for any sequence of measurable functions f_n : X → [0, ∞].
Suppose now we have an arbitrary family f_j : X → [0, ∞], j ∈ J of measurable
functions, and we define
f(x) = Σ_{j∈J} f_j(x), ∀ x ∈ X.
(Here we use the summability convention which defines the sum as the supremum
of all finite sums.) In general, f is not always measurable. But even if it is, one still
cannot conclude that
∫_X f dµ = Σ_{j∈J} ∫_X f_j dµ.

The following example illustrates this anomaly.


Example 2.1. Take the measure space ([0, 1], M_λ([0, 1]), λ), and fix an arbitrary
set J ⊂ [0, 1]. For each j ∈ J we consider the characteristic function f_j = κ_{{j}}.
It is obvious that the function f : [0, 1] → [0, ∞], defined by
f(x) = Σ_{j∈J} f_j(x), ∀ x ∈ [0, 1],
is equal to κ_J. If J is non-measurable, this already gives an example when f =
Σ_{j∈J} f_j is non-measurable. But even if J is measurable, the equality
∫_{[0,1]} f dλ = Σ_{j∈J} ∫_{[0,1]} f_j dλ
fails whenever λ(J) > 0, simply because the right hand side is zero, while the left
hand side is equal to λ(J).
The next two exercises illustrate straightforward (but nevertheless interesting)
applications of the convergence theorems to quite simple situations.
Exercise 4. Let A be a σ-algebra on a (non-empty) set X, and let (µ_n)_{n=1}^∞ be
a sequence of signed measures on A. Assume that, for each A ∈ A, the sequence
(µ_n(A))_{n=1}^∞ has a limit denoted µ(A) ∈ [−∞, ∞]. Prove that the map µ : A →
[0, ∞] defines a measure on A, if the sequence (µ_n)_{n=1}^∞ satisfies one of the following
hypotheses:
A. 0 ≤ µ_1(A) ≤ µ_2(A) ≤ . . . , ∀ A ∈ A;
B. there exists a finite measure ω on A, such that |µ_n(A)| ≤ ω(A), ∀ n ≥ 1,
A ∈ A.
Hint: To prove σ-additivity, fix a pairwise disjoint sequence (A_k)_{k=1}^∞ ⊂ A, and put A = ⋃_{k=1}^∞ A_k.
Treat the problem of proving the equality µ(A) = Σ_{k=1}^∞ µ(A_k) as a convergence problem on
the measure space (N, P(N), ν) - with ν the counting measure - for the sequence of functions
f_n : N → [0, ∞] defined by f_n(k) = µ_n(A_k), ∀ k ∈ N.
Exercise 5*. Let A be a σ-algebra on a (non-empty) set X, and let (µ_j)_{j∈J} be
a family of signed measures on A. Assume either of the following is true:
A. µ_j(A) ≥ 0, ∀ j ∈ J, A ∈ A.
B. There exists a finite measure ω on A, such that Σ_{j∈J} |µ_j(A)| ≤ ω(A),
∀ A ∈ A.
Define the map µ : A → [0, ∞] by µ(A) = Σ_{j∈J} µ_j(A), ∀ A ∈ A. (In Case A, the
sum is defined as the supremum over finite sums. In Case B, it follows that the
family (µ_j(A))_{j∈J} is summable.) Prove that µ is a measure on A.
Hint: To prove σ-additivity, fix a pairwise disjoint sequence (A_k)_{k=1}^∞ ⊂ A, and put A = ⋃_{k=1}^∞ A_k.
To prove the equality µ(A) = Σ_{k=1}^∞ µ(A_k), analyze the following cases: (i) There is some k ≥ 1,
such that µ(A_k) = ∞; (ii) µ(A_k) < ∞, ∀ k ≥ 1. The first case is quite trivial. In the second
case reduce the problem to the previous exercise, by observing that, for each k ≥ 1, the set
J(A_k) = {j ∈ J : µ_j(A_k) > 0} must be countable. Then the set J(A) = {j ∈ J : µ_j(A) > 0} is
also countable.
Comment. One of the major drawbacks of the theory of Riemann integration
is illustrated by the approach to improper integration. Recall that for a function
h : [a, b) → R the improper Riemann integral is defined as
∫_a^{b−} h(t) dt = lim_{x→b−} ∫_a^x h(t) dt,
provided that
(a) h|_{[a,x]} is Riemann integrable, ∀ x ∈ (a, b), and
(b) the above limit exists.
The problem is that although the improper integral may exist, and the function is
actually defined on [a, b], it may fail to be Riemann integrable, for example when
it is unbounded.
In contrast to this situation, by Corollary 2.4, we see that if for example h ≥ 0,
then the Lebesgue integrability of h on [a, b] is equivalent to the fact that
(i) h|_{[a,x]} is Lebesgue integrable, ∀ x ∈ (a, b), and
(ii) lim_{x→b−} ∫_{[a,x]} h dλ exists.
Going back to the discussion on improper Riemann integral, we can see that
a sufficient condition for h : [a, b) → R to be Riemann integrable in the improper
sense, is the fact that h has property (a) above, and h is Lebesgue integrable on
[a, b). In fact, if h ≥ 0, then by Corollary 2.4, this is also necessary.
Notation. Let −∞ ≤ a < b ≤ ∞, and let f be a Lebesgue integrable function,
defined on some interval J which is one of (a, b), [a, b), (a, b], or [a, b]. Then the
Lebesgue integral ∫_J f dλ will be denoted simply by ∫_a^b f dλ.
Exercise 6*. Let (X, A, µ) be a finite measure space. Prove that for every
f ∈ L^1_+(X, A, µ), one has the equality
∫_X f dµ = ∫_0^∞ µ(f^{-1}([t, ∞])) dt,
where the second term is defined as an improper Riemann integral.
Hint: The function ϕ : [0, ∞) → [0, ∞) defined by ϕ(t) = µ(f^{-1}([t, ∞])), ∀ t ≥ 0, is
non-increasing, so it is Riemann integrable on every interval [0, a], a > 0. Prove the inequalities
∫_{X_a} f dµ ≤ ∫_0^a ϕ(t) dt ≤ ∫_X f dµ, ∀ a > 0,
where X_a = f^{-1}([0, a)), by analyzing lower and upper Darboux sums of ϕ|_{[0,a]}. Use Corollary
2.4 to get lim_{a→∞} ∫_{X_a} f dµ = ∫_X f dµ.
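The identity in Exercise 6 can be tested on a toy example. The Python sketch below is an illustration under our own assumptions, not a solution of the exercise: X is a finite set, µ is given by point weights, and the function names are hypothetical. In this setting ϕ(t) = µ(f^{-1}([t, ∞])) is a step function that is constant between consecutive values of f, so its integral over [0, ∞) is a finite sum.

```python
from fractions import Fraction

def integral(values, weights):
    """Integral of f on a finite set: sum of f(x) * mu({x})."""
    return sum(v * w for v, w in zip(values, weights))

def layer_cake(values, weights):
    """Integral over [0, oo) of t -> mu({f >= t}), computed exactly.

    On each interval (prev, t] between consecutive distinct values of f,
    the set {f >= s} equals {f >= t}, so the integrand is constant there.
    """
    total = Fraction(0)
    prev = Fraction(0)
    for t in sorted(set(values)):
        measure = sum(w for v, w in zip(values, weights) if v >= t)
        total += (t - prev) * measure
        prev = t
    return total

# hypothetical data: five points with weights mu({x}) and values f(x) >= 0
weights = [Fraction(1), Fraction(3), Fraction(1, 2), Fraction(1, 4), Fraction(2)]
values = [Fraction(0), Fraction(1, 2), Fraction(2), Fraction(2), Fraction(5)]

print(integral(values, weights), layer_cake(values, weights))
```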
Lecture 35

3. Banach spaces of integrable functions I: the Lp spaces


In this section we discuss an important construction, which is extremely useful
in virtually all branches of Analysis. In Section 1, we have already introduced the
space L1 . The first construction deals with a generalization of this space.
Definitions. Let (X, A, µ) be a measure space, and let K be one of the fields
R or C.
A. For a number p ∈ (1, ∞), we define the space
L^p_K(X, A, µ) = {f : X → K : f measurable, and ∫_X |f|^p dµ < ∞}.
Here we use the convention introduced in Section 1, which defines ∫_X h dµ = ∞,
for those measurable functions h : X → [0, ∞], that are not integrable.
Of course, in this definition we can allow also the value p = 1, and in this case
we get the familiar definition of L^1_K(X, A, µ).
B. For p ∈ [1, ∞), we define the map Q_p : L^p_K(X, A, µ) → [0, ∞) by
Q_p(f) = (∫_X |f|^p dµ)^{1/p}, ∀ f ∈ L^p_K(X, A, µ).

Remark 3.1. The space L^1_K(X, A, µ) was studied earlier (see Section 1). It
has the following features:
(i) L^1_K(X, A, µ) is a K-vector space.
(ii) The map Q_1 : L^1_K(X, A, µ) → [0, ∞) is a seminorm, i.e.
(a) Q_1(f + g) ≤ Q_1(f) + Q_1(g), ∀ f, g ∈ L^1_K(X, A, µ);
(b) Q_1(αf) = |α| · Q_1(f), ∀ f ∈ L^1_K(X, A, µ), α ∈ K.
(iii) |∫_X f dµ| ≤ Q_1(f), ∀ f ∈ L^1_K(X, A, µ).
Property (b) is clear. Property (a) immediately follows from the inequality |f + g| ≤
|f| + |g|, which after integration gives
∫_X |f + g| dµ ≤ ∫_X (|f| + |g|) dµ = ∫_X |f| dµ + ∫_X |g| dµ.

In what follows, we aim at proving similar features for the spaces L^p_K(X, A, µ)
and Q_p, 1 < p < ∞.
The following will help us prove that Lp is a vector space.
Exercise 1 ♦. Let p ∈ (1, ∞). Then one has the inequality
(s + t)^p ≤ 2^{p−1}(s^p + t^p), ∀ s, t ∈ [0, ∞).
Hint: The inequality is trivial, when s = t = 0. If s + t > 0, reduce the problem to the case
t + s = 1, and prove, using elementary calculus techniques that
min_{t∈[0,1]} [t^p + (1 − t)^p] = 2^{1−p}.
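Before attempting the proof, the inequality can be sanity-checked by machine. The Python sketch below is a spot check, not a proof, and rests on our own assumptions: it samples only integer exponents p ∈ {2, 3, 5} and rational grid points, so that all arithmetic is exact. The names gap, phi, GRID and UNIT are hypothetical.

```python
from fractions import Fraction

def gap(s, t, p):
    # 2^(p-1) (s^p + t^p) - (s + t)^p; the exercise claims this is >= 0
    return 2 ** (p - 1) * (s ** p + t ** p) - (s + t) ** p

def phi(t, p):
    # the function minimized in the hint
    return t ** p + (1 - t) ** p

GRID = [Fraction(k, 8) for k in range(25)]    # sample points in [0, 3]
UNIT = [Fraction(k, 64) for k in range(65)]   # sample points in [0, 1], including 1/2

def inequality_holds(p):
    return all(gap(s, t, p) >= 0 for s in GRID for t in GRID)

def sampled_min(p):
    return min(phi(t, p) for t in UNIT)

for p in (2, 3, 5):
    print(p, inequality_holds(p), sampled_min(p), Fraction(1, 2 ** (p - 1)))
```

On this grid the sampled minimum of phi is attained at t = 1/2, in agreement with the value 2^{1−p} stated in the hint.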

Proposition 3.1. Let (X, A, µ) be a measure space, let K be one of the fields
R or C, and let p ∈ (1, ∞). When equipped with pointwise addition and scalar
multiplication, LpK (X, A, µ) is a K-vector space.
Proof. If f, g ∈ L^p_K(X, A, µ), then by Exercise 1 we have
∫_X |f + g|^p dµ ≤ ∫_X (|f| + |g|)^p dµ ≤ 2^{p−1} [∫_X |f|^p dµ + ∫_X |g|^p dµ] < ∞,
so f + g indeed belongs to L^p_K(X, A, µ).
If f ∈ L^p_K(X, A, µ), and α ∈ K, then the equalities
∫_X |αf|^p dµ = ∫_X |α|^p · |f|^p dµ = |α|^p · ∫_X |f|^p dµ
clearly prove that αf also belongs to L^p_K(X, A, µ).
Our next task will be to prove that Qp is a seminorm, for all p > 1. In this
direction, the following is a key result. (The above mentioned convention will be
used throughout this entire section.)
Theorem 3.1 (Hölder’s Inequality for integrals). Let (X, A, µ) be a measure
space, let f, g : X → [0, ∞] be measurable functions, and let p, q ∈ (1, ∞) be such
that 1/p + 1/q = 1. Then one has the inequality¹⁵
(1) ∫_X f g dµ ≤ (∫_X f^p dµ)^{1/p} · (∫_X g^q dµ)^{1/q}.

Proof. If either ∫_X f^p dµ = ∞, or ∫_X g^q dµ = ∞, then the inequality (1) is
trivial, because in this case the right hand side is ∞. For the remainder of the
proof we will assume that ∫_X f^p dµ < ∞ and ∫_X g^q dµ < ∞.
Use Corollary 2.1 to find two sequences (ϕ_n)_{n=1}^∞, (ψ_n)_{n=1}^∞ ⊂ L^1_{R,elem}(X, A, µ),
such that
• 0 ≤ ϕ_1 ≤ ϕ_2 ≤ . . . and 0 ≤ ψ_1 ≤ ψ_2 ≤ . . . ;
• lim_{n→∞} ϕ_n(x) = f(x)^p and lim_{n→∞} ψ_n(x) = g(x)^q, ∀ x ∈ X.
By the Lebesgue Dominated Convergence Theorem, we will also get the equalities
(2) ∫_X f^p dµ = lim_{n→∞} ∫_X ϕ_n dµ and ∫_X g^q dµ = lim_{n→∞} ∫_X ψ_n dµ.
Remark that the functions f_n = ϕ_n^{1/p}, g_n = ψ_n^{1/q}, n ≥ 1 are also elementary (because
they obviously have finite range). It is obvious that we have
• 0 ≤ f_1 ≤ f_2 ≤ . . . , and 0 ≤ g_1 ≤ g_2 ≤ . . . ;
• lim_{n→∞} f_n(x) = f(x), and lim_{n→∞} g_n(x) = g(x), ∀ x ∈ X.
With these notations, the equalities (2) read
(3) ∫_X f^p dµ = lim_{n→∞} ∫_X (f_n)^p dµ and ∫_X g^q dµ = lim_{n→∞} ∫_X (g_n)^q dµ.
Of course, the products f_n g_n, n ≥ 1 are again elementary, and satisfy
• 0 ≤ f_1 g_1 ≤ f_2 g_2 ≤ . . . ;
• lim_{n→∞} [f_n(x) g_n(x)] = f(x) g(x), ∀ x ∈ X.
Using the General Lebesgue Monotone Convergence Theorem, we then get
∫_X f g dµ = lim_{n→∞} ∫_X f_n g_n dµ.
Using (3) we now see that, in order to prove (1), it suffices to prove the inequalities
∫_X f_n g_n dµ ≤ (∫_X (f_n)^p dµ)^{1/p} · (∫_X (g_n)^q dµ)^{1/q}, ∀ n ≥ 1.
In other words, it suffices to prove (1), under the extra assumption that both f and
g are elementary integrable.
¹⁵ Here we use the convention ∞^{1/p} = ∞^{1/q} = ∞.
Suppose f and g are elementary integrable. Then (see III.1) there exist pairwise
disjoint sets (D_j)_{j=1}^m ⊂ A, with µ(D_j) < ∞, ∀ j = 1, . . . , m, and numbers
α_1, β_1, . . . , α_m, β_m ∈ [0, ∞), such that
f = α_1 κ_{D_1} + · · · + α_m κ_{D_m},
g = β_1 κ_{D_1} + · · · + β_m κ_{D_m}.
Notice that we have
f g = α_1 β_1 κ_{D_1} + · · · + α_m β_m κ_{D_m},
so the left hand side of (1) is then given by
∫_X f g dµ = Σ_{j=1}^m α_j β_j µ(D_j).
Define the numbers x_j = α_j µ(D_j)^{1/p}, y_j = β_j µ(D_j)^{1/q}, j = 1, . . . , m. Using these
numbers, combined with 1/p + 1/q = 1, we clearly have
(4) ∫_X f g dµ = Σ_{j=1}^m (x_j y_j).
At this point we are going to use the Hölder inequality for finite sequences (Lemma
II.2.3), which gives
Σ_{j=1}^m (x_j y_j) ≤ [Σ_{j=1}^m (x_j)^p]^{1/p} · [Σ_{j=1}^m (y_j)^q]^{1/q},
so the equality (4) continues with
∫_X f g dµ ≤ [Σ_{j=1}^m (x_j)^p]^{1/p} · [Σ_{j=1}^m (y_j)^q]^{1/q}
= [Σ_{j=1}^m (α_j)^p µ(D_j)]^{1/p} · [Σ_{j=1}^m (β_j)^q µ(D_j)]^{1/q}
= (∫_X f^p dµ)^{1/p} · (∫_X g^q dµ)^{1/q}.
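The elementary case treated at the end of the proof is easy to evaluate numerically. The following Python sketch is an illustration only: the measures µ(D_j), the coefficients α_j, β_j and the function names are made-up assumptions, and floating point arithmetic is used. It computes both sides of (1) for two elementary functions supported on the same pairwise disjoint sets.

```python
def integral(coeffs, measures):
    # integral of sum_j c_j * indicator(D_j) for pairwise disjoint sets D_j
    return sum(c * m for c, m in zip(coeffs, measures))

def holder_sides(alpha, beta, measures, p):
    q = p / (p - 1)   # conjugate exponent, so that 1/p + 1/q = 1
    lhs = integral([a * b for a, b in zip(alpha, beta)], measures)
    rhs = (integral([a ** p for a in alpha], measures) ** (1 / p)
           * integral([b ** q for b in beta], measures) ** (1 / q))
    return lhs, rhs

measures = [0.5, 2.0, 1.25]   # hypothetical values of mu(D_1), mu(D_2), mu(D_3)
alpha = [1.0, 3.0, 0.0]       # coefficients of f
beta = [2.0, 0.5, 4.0]        # coefficients of g
lhs, rhs = holder_sides(alpha, beta, measures, p=3.0)
print(lhs, rhs)               # left and right hand sides of (1)
```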

Corollary 3.1. Let (X, A, µ) be a measure space, let K be one of the fields
R or C, and let p, q ∈ (1, ∞) be such that 1/p + 1/q = 1. For any two functions
f ∈ L^p_K(X, A, µ) and g ∈ L^q_K(X, A, µ), the product f g belongs to L^1_K(X, A, µ) and
one has the inequality
|∫_X f g dµ| ≤ Q_p(f) · Q_q(g).

Proof. By Hölder’s inequality, applied to |f| and |g|, we get
∫_X |f g| dµ ≤ Q_p(f) · Q_q(g) < ∞,
so |f g| belongs to L^1_+(X, A, µ), i.e. f g belongs to L^1_K(X, A, µ). The desired
inequality then follows from the inequality |∫_X f g dµ| ≤ ∫_X |f g| dµ.

Notation. Suppose (X, A, µ) is a measure space, K is one of the fields R or C,
and p, q ∈ (1, ∞) are such that 1/p + 1/q = 1. For any pair of functions f ∈ L^p_K(X, A, µ),
g ∈ L^q_K(X, A, µ), we shall denote the number ∫_X f g dµ ∈ K simply by ⟨f, g⟩. With
this notation, Corollary 3.1 reads:
|⟨f, g⟩| ≤ Q_p(f) · Q_q(g), ∀ f ∈ L^p_K(X, A, µ), g ∈ L^q_K(X, A, µ).


The following result gives an alternative description of the maps Qp , p ∈ (1, ∞).
Proposition 3.2. Let (X, A, µ) be a measure space, let K be one of the fields
R or C, let p, q ∈ (1, ∞) be such that 1/p + 1/q = 1, and let f ∈ L^p_K(X, A, µ). Then
one has the equality
(5) Q_p(f) = sup{|⟨f, g⟩| : g ∈ L^q_K(X, A, µ), Q_q(g) ≤ 1}.

Proof. Let us denote the right hand side of (5) simply by P(f). By Corollary
3.1, we clearly have the inequality
P(f) ≤ Q_p(f).
To prove the other inequality, let us first observe that in the case when Q_p(f) = 0,
there is nothing to prove, because the above inequality already forces P(f) = 0.
Assume then Q_p(f) > 0, and define the function h : X → K by
h(x) = |f(x)|^p / f(x) if f(x) ≠ 0, and h(x) = 0 if f(x) = 0.
It is obvious that h is measurable. Moreover, one has the equality |h| = |f|^{p−1},
which using the equality qp = p + q gives |h|^q = |f|^{qp−q} = |f|^p. This proves that
h ∈ L^q_K(X, A, µ), as well as the equality
Q_q(h) = (∫_X |h|^q dµ)^{1/q} = (∫_X |f|^p dµ)^{1/q} = Q_p(f)^{p/q}.
If we define the number α = Q_p(f)^{−p/q}, then the function g = αh has Q_q(g) = 1,
so we get
P(f) ≥ |∫_X f g dµ| = (1/Q_p(f)^{p/q}) · |∫_X f h dµ|.
Notice that f h = |f|^p, so the above inequality can be continued with
P(f) ≥ (1/Q_p(f)^{p/q}) · ∫_X |f|^p dµ = Q_p(f)^p / Q_p(f)^{p/q} = Q_p(f),
where the last equality uses p − p/q = 1.
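The extremal function g = αh built in this proof can be checked on a toy example. The Python sketch below is an illustration under our own assumptions, not part of the original notes: X is a finite set with point weights, f is complex valued, floating point arithmetic is used, and the helper names Q, extremal and pairing are hypothetical. It verifies numerically that Q_q(g) = 1 and |⟨f, g⟩| = Q_p(f).

```python
def Q(values, weights, p):
    # Q_p(f) = (integral of |f|^p)^(1/p) on a finite weighted set
    return sum(abs(v) ** p * w for v, w in zip(values, weights)) ** (1 / p)

def extremal(values, weights, p):
    # g = alpha * h, with h = |f|^p / f where f != 0 (and 0 elsewhere),
    # and alpha = Q_p(f)^(-p/q); assumes Q_p(f) > 0
    q = p / (p - 1)
    h = [abs(v) ** p / v if v != 0 else 0 for v in values]
    alpha = Q(values, weights, p) ** (-p / q)
    return [alpha * hv for hv in h]

def pairing(f, g, weights):
    # <f, g> = integral of f * g
    return sum(a * b * w for a, b, w in zip(f, g, weights))

p = 3.0
q = p / (p - 1)
weights = [0.5, 1.0, 2.0, 0.25]      # hypothetical point masses
f = [3 + 4j, 0j, -2 + 0j, 1j]        # hypothetical complex-valued function
g = extremal(f, weights, p)

print(Q(g, weights, q))                              # close to 1
print(abs(pairing(f, g, weights)), Q(f, weights, p)) # the two numbers agree
```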
Corollary 3.2. Let (X, A, µ) be a measure space, let K be one of the fields
R or C, and let p ∈ (1, ∞). Then Q_p is a seminorm on L^p_K(X, A, µ), i.e.
(a) Q_p(f_1 + f_2) ≤ Q_p(f_1) + Q_p(f_2), ∀ f_1, f_2 ∈ L^p_K(X, A, µ);
(b) Q_p(αf) = |α| · Q_p(f), ∀ f ∈ L^p_K(X, A, µ), α ∈ K.
Proof. (a). Take q = p/(p − 1), so that 1/p + 1/q = 1. Start with some arbitrary
g ∈ L^q_K(X, A, µ), with Q_q(g) ≤ 1. Then the functions f_1 g and f_2 g belong to
L^1_K(X, A, µ), and so f_1 g + f_2 g also belongs to L^1_K(X, A, µ). We then get
|⟨f_1 + f_2, g⟩| = |∫_X (f_1 g + f_2 g) dµ| = |∫_X f_1 g dµ + ∫_X f_2 g dµ| ≤
≤ |∫_X f_1 g dµ| + |∫_X f_2 g dµ| = |⟨f_1, g⟩| + |⟨f_2, g⟩|.
Using Proposition 3.2, the above inequality gives
|⟨f_1 + f_2, g⟩| ≤ Q_p(f_1) + Q_p(f_2).
Since the above inequality holds for all g ∈ L^q_K(X, A, µ), with Q_q(g) ≤ 1, again by
Proposition 3.2, we get
Q_p(f_1 + f_2) ≤ Q_p(f_1) + Q_p(f_2).
Property (b) is obvious.
Remarks 3.2. Let (X, A, µ) be a measure space, let K be one of the fields R
or C, and let p ∈ [1, ∞).
A. If f ∈ L^p_K(X, A, µ) and if g : X → K is a measurable function, with g = f,
µ-a.e., then g ∈ L^p_K(X, A, µ), and Q_p(g) = Q_p(f).
B. If we define the space
N_K(X, A, µ) = {f : X → K : f measurable, f = 0, µ-a.e.},
then N_K(X, A, µ) is a linear subspace of L^p_K(X, A, µ). In fact one has the equality
N_K(X, A, µ) = {f ∈ L^p_K(X, A, µ) : Q_p(f) = 0}.
The inclusion “⊂” is trivial. Conversely, if f ∈ L^p_K(X, A, µ) has Q_p(f) = 0, then the
measurable function g : X → [0, ∞) defined by g = |f|^p will have ∫_X g dµ = 0. By
Exercise 2.3 this forces g = 0, µ-a.e., which clearly gives f = 0, µ-a.e.
Definition. Let (X, A, µ) be a measure space, let K be one of the fields R or
C, and let p ∈ [1, ∞). We define the quotient space
L^p_K(X, A, µ) = ℒ^p_K(X, A, µ)/N_K(X, A, µ),
where ℒ^p_K(X, A, µ) denotes the space of functions introduced in the Definitions at
the beginning of this section. In other words, L^p_K(X, A, µ) is the collection of
equivalence classes associated with the relation “=, µ-a.e.” For a function
f ∈ ℒ^p_K(X, A, µ) we denote by [f] its equivalence class in L^p_K(X, A, µ). So the
equality [f] = [g] is equivalent to f = g, µ-a.e. By the above Remark, there exists
a (unique) map ‖ · ‖_p : L^p_K(X, A, µ) → [0, ∞), such that
‖[f]‖_p = Q_p(f), ∀ f ∈ ℒ^p_K(X, A, µ).
By the above Remark, it follows that ‖ · ‖_p is a norm on L^p_K(X, A, µ). When K = C
the subscript C will be omitted.
Conventions. Let (X, A, µ), K, and p be as above. We are going to abuse
the notation a bit, by writing
f ∈ L^p_K(X, A, µ),
if f belongs to ℒ^p_K(X, A, µ). (We will always have in mind the fact that this notation
signifies that f is almost uniquely determined.) Likewise, we are going to replace
Q_p(f) with ‖f‖_p.
Given p, q ∈ (1, ∞), with 1/p + 1/q = 1, we use the same notation for the (correctly
defined) map
⟨ · , · ⟩ : L^p_K(X, A, µ) × L^q_K(X, A, µ) → K.
Remark 3.3. Let (X, A, µ) be a measure space, let K be either R or C, and
let p, q ∈ (1, ∞) be such that 1/p + 1/q = 1. Given f ∈ L^p_K(X, A, µ), we define the map
Λ_f : L^q_K(X, A, µ) ∋ g ⟼ ⟨f, g⟩ ∈ K.
According to Proposition 3.2, the map Λ_f is linear, continuous, and has norm
‖Λ_f‖ = ‖f‖_p. If we denote by L^q_K(X, A, µ)^∗ the Banach space of all linear
continuous maps L^q_K(X, A, µ) → K, then we have a correspondence
(6) L^p_K(X, A, µ) ∋ f ⟼ Λ_f ∈ L^q_K(X, A, µ)^∗
which is linear and isometric. This correspondence will be analyzed later in Section
5.
Notation. Given a sequence (f_n)_{n=1}^∞, and a function f, in L^p_K(X, A, µ), we
are going to write
f = L^p- lim_{n→∞} f_n,
if (f_n)_{n=1}^∞ converges to f in the norm topology, i.e. lim_{n→∞} ‖f_n − f‖_p = 0.
The following technical result is very useful in the study of Lp spaces.
Theorem 3.2 (L^p Dominated Convergence Theorem). Let (X, A, µ) be a measure
space, let K be one of the fields R or C, let p ∈ [1, ∞) and let (f_n)_{n=1}^∞ be a
sequence in L^p_K(X, A, µ). Assume f : X → K is a measurable function, such that
(i) f = µ-a.e.- lim_{n→∞} f_n;
(ii) there exists some function g ∈ L^p_K(X, A, µ), such that
|f_n| ≤ |g|, µ-a.e., ∀ n ≥ 1.
Then f ∈ L^p_K(X, A, µ), and one has the equality
f = L^p- lim_{n→∞} f_n.

Proof. Consider the functions ϕ_n = |f_n|^p, n ≥ 1, and ϕ = |f|^p, and ψ = |g|^p.
Notice that
• ϕ = µ-a.e.- lim_{n→∞} ϕ_n;
• |ϕ_n| ≤ ψ, µ-a.e., ∀ n ≥ 1;
• ψ ∈ L^1_+(X, A, µ).
We can apply the Lebesgue Dominated Convergence Theorem, so we get the fact that
ϕ ∈ L^1_+(X, A, µ), which gives the fact that f ∈ L^p_K(X, A, µ). Now if we consider
the functions η_n = |f_n − f|^p, and η = 2^{p−1}(|g|^p + |f|^p), then we have (use Exercise
1):
• 0 = µ-a.e.- lim_{n→∞} η_n;
• |η_n| ≤ η, µ-a.e., ∀ n ≥ 1;
• η ∈ L^1_+(X, A, µ).
Again using the Lebesgue Dominated Convergence Theorem, we get
lim_{n→∞} ∫_X η_n dµ = 0,
which means that
lim_{n→∞} ∫_X |f_n − f|^p dµ = 0,
which reads lim_{n→∞} (‖f_n − f‖_p)^p = 0, so we clearly have f = L^p- lim_{n→∞} f_n.

Our main goal is to prove that the Lp spaces are Banach spaces. The key result
which gives this, but also has some other interesting consequences, is the following.
Theorem 3.3. Let (X, A, µ) be a measure space, let K be one of the fields R
or C, let p ∈ [1, ∞) and let (f_k)_{k=1}^∞ be a sequence in L^p_K(X, A, µ), such that
Σ_{k=1}^∞ ‖f_k‖_p < ∞.
Consider the sequence (g_n)_{n=1}^∞ ⊂ L^p_K(X, A, µ) of partial sums:
g_n = Σ_{k=1}^n f_k, n ≥ 1.
Then there exists a function g ∈ L^p_K(X, A, µ), such that
(a) g = µ-a.e.- lim_{n→∞} g_n;
(b) g = L^p- lim_{n→∞} g_n.
Proof. Denote the sum $\sum_{k=1}^\infty\|f_k\|_p$ simply by $S$. For each integer $n\ge1$, define the function $h_n:X\to[0,\infty]$, by
\[ h_n(x)=\sum_{k=1}^n|f_k(x)|,\quad\forall\,x\in X. \]

It is clear that $h_n\in L^p_{\mathbb R}(X,\mathcal A,\mu)$, and we also have
\[ \|h_n\|_p\le\sum_{k=1}^n\|f_k\|_p\le S,\quad\forall\,n\ge1. \tag{7} \]
Notice also that $0\le h_1\le h_2\le\dots$. Define then the function $h:X\to[0,\infty]$ by
\[ h(x)=\lim_{n\to\infty}h_n(x),\quad\forall\,x\in X. \]

Claim: $h\in L^p_{\mathbb R}(X,\mathcal A,\mu)$.
To prove this fact, we define the functions $\varphi=h^p$ and $\varphi_n=(h_n)^p$, $n\ge1$. Notice that we have
• $0\le\varphi_1\le\varphi_2\le\dots$;
• $\varphi_n\in L^1_{\mathbb R}(X,\mathcal A,\mu)$, $\forall\,n\ge1$;
• $\lim_{n\to\infty}\varphi_n(x)=\varphi(x)$, $\forall\,x\in X$;
• $\sup\bigl\{\int_X\varphi_n\,d\mu:n\ge1\bigr\}\le S^p$ (this is what (7) gives).
Using the Lebesgue Monotone Convergence Theorem, it then follows that $h^p=\varphi\in L^1_{\mathbb R}(X,\mathcal A,\mu)$, so $h$ indeed belongs to $L^p_{\mathbb R}(X,\mathcal A,\mu)$.
Let us consider now the set $N=\{x\in X:h(x)=\infty\}$. On the one hand, since we also have
\[ N=\{x\in X:\varphi(x)=\infty\}, \]
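and $\varphi$ is integrable, it follows that $N\in\mathcal A$, and $\mu(N)=0$. On the other hand, since
\[ \sum_{k=1}^\infty|f_k(x)|=h(x)<\infty,\quad\forall\,x\in X\smallsetminus N, \]
it follows that, for each $x\in X\smallsetminus N$, the series $\sum_{k=1}^\infty f_k(x)$ is convergent. Let us define then $g:X\to\mathbb K$ by
\[ g(x)=\begin{cases}\sum_{k=1}^\infty f_k(x)&\text{if }x\in X\smallsetminus N\\ 0&\text{if }x\in N\end{cases} \]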
It is obvious that $g$ is measurable, and we have
\[ g=\mu\text{-a.e.-}\lim_{n\to\infty}g_n. \]
Since we have
\[ |g_n|=\Bigl|\sum_{k=1}^n f_k\Bigr|\le\sum_{k=1}^n|f_k|=h_n\le h,\quad\forall\,n\ge1, \]
using the Claim, and Theorem 3.2, it follows that $g$ indeed belongs to $L^p_{\mathbb K}(X,\mathcal A,\mu)$ and we also have the equality $g=L^p$-$\lim_{n\to\infty}g_n$. □
Corollary 3.3. Let (X, A, µ) be a measure space, and let K be one of the
fields R or C. Then LpK (X, A, µ) is a Banach space, for each p ∈ [1, ∞).
Proof. This is immediate from the above result, combined with the complete-
ness criterion given by Remark II.3.1. 
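For readers who do not have Remark II.3.1 at hand, the reduction can be sketched as follows. This assumes the criterion is the standard one (a normed space is complete as soon as every absolutely convergent series converges); the names $u_n$ and $n_k$ are introduced here for illustration only.

```latex
% Let (u_n) be a Cauchy sequence in L^p. Choose indices n_1 < n_2 < ... with
\[
\|u_m-u_{n_k}\|_p \le 2^{-k}\qquad\text{for all } m\ge n_k,\ k\ge 1 .
\]
% Telescoping differences form an absolutely convergent series:
\[
f_1=u_{n_1},\quad f_k=u_{n_k}-u_{n_{k-1}}\ (k\ge 2),\qquad
\sum_{k=1}^{\infty}\|f_k\|_p\le \|u_{n_1}\|_p+\sum_{k=2}^{\infty}2^{-(k-1)}<\infty .
\]
% The partial sums are g_m = u_{n_m}; Theorem 3.3 gives g in L^p with
% \|u_{n_m}-g\|_p -> 0, and then for m >= n_k:
\[
\|u_m-g\|_p\le \|u_m-u_{n_k}\|_p+\|u_{n_k}-g\|_p\le 2^{-k}+\|u_{n_k}-g\|_p
\;\xrightarrow[k\to\infty]{}\;0 .
\]
```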
Another interesting consequence of Theorem 3.3 is the following.
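Corollary 3.4. Let $(X,\mathcal A,\mu)$ be a measure space, let $\mathbb K$ be one of the fields $\mathbb R$ or $\mathbb C$, let $p\in[1,\infty)$, and let $f\in L^p_{\mathbb K}(X,\mathcal A,\mu)$. Any sequence $(f_n)_{n=1}^\infty\subset L^p_{\mathbb K}(X,\mathcal A,\mu)$, with $f=L^p$-$\lim_{n\to\infty}f_n$, has a subsequence $(f_{n_k})_{k=1}^\infty$ such that $f=\mu$-a.e.-$\lim_{k\to\infty}f_{n_k}$.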
Proof. Without any loss of generality, we can assume that $f=0$, so that we have
\[ \lim_{n\to\infty}\|f_n\|_p=0. \]
Choose then integers $1\le n_1<n_2<\dots$, such that
\[ \|f_{n_k}\|_p\le\frac1{2^k},\quad\forall\,k\ge1. \]
If we define the functions
\[ g_m=\sum_{k=1}^m f_{n_k}, \]
then by Theorem 3.3, it follows that there exists some $g\in L^p_{\mathbb K}(X,\mathcal A,\mu)$, such that
\[ g=\mu\text{-a.e.-}\lim_{m\to\infty}g_m. \]
This means that there exists some $N\in\mathcal A$, with $\mu(N)=0$, such that
\[ \lim_{m\to\infty}g_m(x)=g(x),\quad\forall\,x\in X\smallsetminus N. \]
In other words, for each $x\in X\smallsetminus N$, the series $\sum_{k=1}^\infty f_{n_k}(x)$ is convergent (to some number $g(x)\in\mathbb K$). In particular, since the terms of a convergent series tend to zero, it follows that
\[ \lim_{k\to\infty}f_{n_k}(x)=0,\quad\forall\,x\in X\smallsetminus N, \]
so we indeed have $0=\mu$-a.e.-$\lim_{k\to\infty}f_{n_k}$. □


The following result collects some properties of $L^p$ spaces in the case when the
underlying measure space is finite.
Proposition 3.3. Suppose (X, A, µ) is a finite measure space, and K is one
of the fields R or C.
(i) If f : X → K is a bounded measurable function, then f ∈ LpK (X, A, µ),
∀ p ∈ [1, ∞).
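(ii) For any $p,q\in[1,\infty)$, with $p<q$, one has the inclusion $L^q_{\mathbb K}(X,\mathcal A,\mu)\subset L^p_{\mathbb K}(X,\mathcal A,\mu)$ at the level of functions. So taking quotients by $N_{\mathbb K}(X,\mathcal A,\mu)$, one gets an inclusion of vector spaces
\[ L^q_{\mathbb K}(X,\mathcal A,\mu)\hookrightarrow L^p_{\mathbb K}(X,\mathcal A,\mu). \tag{8} \]
Moreover the above inclusion is a continuous linear map.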
Proof. The key property that we are going to use here is the fact that the constant function $1=\kappa_X$ is $\mu$-integrable (being elementary $\mu$-integrable).
(i). If we start with a bounded measurable function $f:X\to\mathbb K$ and we take $M=\sup_{x\in X}|f(x)|$, then the inequality $|f|^p\le M^p\cdot1$, combined with the integrability of $1$, forces the integrability of $|f|^p$, i.e. $f\in L^p_{\mathbb K}(X,\mathcal A,\mu)$.
(ii). Fix $1\le p<q<\infty$, as well as a function $f\in L^q_{\mathbb K}(X,\mathcal A,\mu)$. Consider the numbers $r=\frac qp>1$, and $s=\frac r{r-1}$, so that we have $\frac1r+\frac1s=1$. Since $f\in L^q_{\mathbb K}(X,\mathcal A,\mu)$, the function $g=|f|^q$ belongs to $L^1_{\mathbb K}(X,\mathcal A,\mu)$. If we define then the function $h=|f|^p$, then we obviously have $g=h^r$, so we get the fact that $h$ belongs to $L^r_{\mathbb K}(X,\mathcal A,\mu)$. Using part (i), we get the fact that $1\in L^s_{\mathbb K}(X,\mathcal A,\mu)$, so by Corollary 3.1, it follows that $h=1\cdot h$ belongs to $L^1_{\mathbb K}(X,\mathcal A,\mu)$, and moreover, one has the inequality
\[ \int_X|f|^p\,d\mu=\int_X h\,d\mu\le\|1\|_s\cdot\|h\|_r=\Bigl(\int_X1\,d\mu\Bigr)^{1/s}\cdot\Bigl(\int_X h^r\,d\mu\Bigr)^{1/r}=\mu(X)^{1/s}\cdot\Bigl(\int_X|f|^q\,d\mu\Bigr)^{1/r}=\mu(X)^{1/s}\cdot\|f\|_q^{\,q/r}. \]
On the one hand, this inequality proves that $f\in L^p_{\mathbb K}(X,\mathcal A,\mu)$. On the other hand, since $q/r=p$ and $1/s=1-p/q$, this also gives the inequality
\[ \|f\|_p^{\,p}\le\mu(X)^{1/s}\cdot\|f\|_q^{\,q/r}=\mu(X)^{1-\frac pq}\cdot\|f\|_q^{\,p}, \]
which yields
\[ \|f\|_p\le\mu(X)^{\frac1p-\frac1q}\cdot\|f\|_q. \]
This proves that the linear map (8) is continuous (and has norm no greater than $\mu(X)^{\frac1p-\frac1q}$). □
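As a sanity check on the exponents, here is the special case $p=1$, $q=2$, together with a test function showing that the constant cannot be improved. The choice of the test function $\kappa_X$ is ours, not the text's.

```latex
% Special case p = 1, q = 2: then r = 2, s = 2, and the estimate is Cauchy-Schwarz.
\[
\|f\|_1=\int_X 1\cdot|f|\,d\mu\le \mu(X)^{1/2}\,\|f\|_2 .
\]
% Sharpness for general p < q, tested on the constant function 1 = \kappa_X:
\[
\|\kappa_X\|_p=\mu(X)^{1/p}
=\mu(X)^{\frac1p-\frac1q}\cdot\mu(X)^{1/q}
=\mu(X)^{\frac1p-\frac1q}\,\|\kappa_X\|_q .
\]
```

So, whenever $\mu(X)>0$, the bound is attained and the inclusion (8) has norm exactly $\mu(X)^{\frac1p-\frac1q}$.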

Exercise 2. Give an example of a sequence of continuous functions $f_n:[0,1]\to[0,\infty)$, $n\ge1$, such that $L^p$-$\lim_{n\to\infty}f_n=0$, $\forall\,p\in[1,\infty)$, but for which it is not true that $0=\mu$-a.e.-$\lim_{n\to\infty}f_n$. (Here we work on the measure space $\bigl([0,1],\mathcal M_\lambda([0,1]),\lambda\bigr)$.)
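Exercise 3. Let $\Omega\subset\mathbb R^n$ be an open set. Prove that $C_c^{\mathbb K}(\Omega)$ is dense in $L^p_{\mathbb K}(\Omega,\mathcal M_\lambda(\Omega),\lambda)$, for every $p\in[1,\infty)$. (Here $\lambda$ denotes the $n$-dimensional Lebesgue measure, and $\mathcal M_\lambda(\Omega)$ denotes the collection of all Lebesgue measurable subsets of $\Omega$.)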
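Notations. Let $(X,\mathcal A,\mu)$ be a measure space, let $\mathbb K$ be one of the fields $\mathbb R$ or $\mathbb C$. In the formulas below, fraktur letters denote spaces of functions and roman letters denote the corresponding quotient spaces. We define the space
\[ \mathfrak N_{\mathbb K,\mathrm{elem}}(X,\mathcal A,\mu)=\mathfrak L^1_{\mathbb K,\mathrm{elem}}(X,\mathcal A,\mu)\cap\mathfrak N_{\mathbb K}(X,\mathcal A,\mu), \]
and we define the quotient space
\[ L^1_{\mathbb K,\mathrm{elem}}(X,\mathcal A,\mu)=\mathfrak L^1_{\mathbb K,\mathrm{elem}}(X,\mathcal A,\mu)\,/\,\mathfrak N_{\mathbb K,\mathrm{elem}}(X,\mathcal A,\mu). \]
In other words, if one considers the quotient map
\[ \Pi_1:\mathfrak L^1_{\mathbb K}(X,\mathcal A,\mu)\to L^1_{\mathbb K}(X,\mathcal A,\mu), \]
then $L^1_{\mathbb K,\mathrm{elem}}(X,\mathcal A,\mu)=\Pi_1\bigl(\mathfrak L^1_{\mathbb K,\mathrm{elem}}(X,\mathcal A,\mu)\bigr)$. Notice that we have the obvious inclusion
\[ \mathfrak L^1_{\mathbb K,\mathrm{elem}}(X,\mathcal A,\mu)\subset\mathfrak L^p_{\mathbb K}(X,\mathcal A,\mu),\quad\forall\,p\in[1,\infty), \]
so if we consider the quotient map
\[ \Pi_p:\mathfrak L^p_{\mathbb K}(X,\mathcal A,\mu)\to L^p_{\mathbb K}(X,\mathcal A,\mu), \]
we can also define the subspace
\[ L^p_{\mathbb K,\mathrm{elem}}(X,\mathcal A,\mu)=\Pi_p\bigl(\mathfrak L^1_{\mathbb K,\mathrm{elem}}(X,\mathcal A,\mu)\bigr),\quad\forall\,p\in[1,\infty). \]
Remark that, as vector spaces, the spaces $L^p_{\mathbb K,\mathrm{elem}}(X,\mathcal A,\mu)$ are identical, since
\[ \operatorname{Ker}\Pi_p=\mathfrak N_{\mathbb K}(X,\mathcal A,\mu),\quad\forall\,p\in[1,\infty). \]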
With these notations we have the following fact.
Proposition 3.4. LpK,elem (X, A, µ) is dense in LpK (X, A, µ), for each p ∈
[1, ∞).
Proof. Fix $p\in[1,\infty)$, and start with some $f\in L^p_{\mathbb K}(X,\mathcal A,\mu)$. What we need to prove is the existence of a sequence $(f_n)_{n=1}^\infty\subset L^1_{\mathbb K,\mathrm{elem}}(X,\mathcal A,\mu)$, such that $f=L^p$-$\lim_{n\to\infty}f_n$. Taking real and imaginary parts (in the case $\mathbb K=\mathbb C$), it suffices to consider the case when $f$ is real valued. Since $|f|$ also belongs to $L^p$, it follows that $f^+=\max\{f,0\}=\frac12\bigl(|f|+f\bigr)$, and $f^-=\max\{-f,0\}=\frac12\bigl(|f|-f\bigr)$ both belong to $L^p$, so in fact we can assume that $f$ is non-negative. Consider the function $g=f^p\in L^1_+(X,\mathcal A,\mu)$. Use the definition of the integral, to find a sequence $(g_n)_{n=1}^\infty\subset L^1_{\mathbb R,\mathrm{elem}}(X,\mathcal A,\mu)$, such that
• $0\le g_n\le g$, $\forall\,n\ge1$;
• $\lim_{n\to\infty}\int_X g_n\,d\mu=\int_X g\,d\mu$.
This gives the fact that $g=L^1$-$\lim_{n\to\infty}g_n$ (indeed, since $0\le g_n\le g$, we have $\|g-g_n\|_1=\int_X g\,d\mu-\int_X g_n\,d\mu$). Using Corollary 3.4, after replacing $(g_n)_{n=1}^\infty$ with a subsequence, we can also assume that $g=\mu$-a.e.-$\lim_{n\to\infty}g_n$. If we put $f_n=(g_n)^{1/p}$, $\forall\,n\ge1$, we now have
• $0\le f_n\le f$, $\forall\,n\ge1$;
• $f=\mu$-a.e.-$\lim_{n\to\infty}f_n$.
Obviously, the $f_n$'s are still elementary integrable, and by the $L^p$ Dominated Convergence Theorem, we indeed get $f=L^p$-$\lim_{n\to\infty}f_n$. □
Comments. A. The above result gives us the fact that LpK (X, A, µ) is the com-
pletion of LpK,elem (X, A, µ). This allows for the following alternative construction
of the Lp spaces.
B. For a measurable function f : X → K, by the (proof of the) above result,
it follows that the condition f ∈ LpK (X, A, µ) is equivalent to the equality f =
µ-a.e.- limn→∞ fn , for some sequence (fn )∞ n=1 of elementary integrable functions,
which is Cauchy in the Lp norm, i.e.
(c) for every ε > 0, there exists Nε , such that
kfm − fn kp < ε, ∀ m, n ≥ Nε .
One key feature, which will be heavily exploited in the next section, concerns
the case $p=2$, for which we have the following.
Proposition 3.5. Let $(X,\mathcal A,\mu)$ be a measure space, and let $\mathbb K$ be one of the fields $\mathbb R$ or $\mathbb C$.
(i) The map $(\,\cdot\,|\,\cdot\,):L^2_{\mathbb K}(X,\mathcal A,\mu)\times L^2_{\mathbb K}(X,\mathcal A,\mu)\to\mathbb K$, given by
\[ (f\,|\,g)=\langle\bar f,g\rangle=\int_X\bar f g\,d\mu,\quad\forall\,f,g\in L^2_{\mathbb K}(X,\mathcal A,\mu), \]
defines an inner product on $L^2_{\mathbb K}(X,\mathcal A,\mu)$.
(ii) One has the equality
\[ \|f\|_2=\sqrt{(f\,|\,f)},\quad\forall\,f\in L^2_{\mathbb K}(X,\mathcal A,\mu). \]
Consequently, $L^2_{\mathbb K}(X,\mathcal A,\mu)$ is a Hilbert space.
Proof. The properties of the inner product are immediate, from the properties of integration. The second property is also clear. □
Remark 3.4. The main byproduct of the above feature is the fact that the correspondence (6) is an isometric isomorphism, in the case $p=q=2$. This follows from Riesz Theorem (only the surjectivity is the issue here; the rest has been discussed in Remark 3.3). If $\phi:L^2_{\mathbb K}(X,\mathcal A,\mu)\to\mathbb K$ is a linear continuous map, then there exists some $h\in L^2_{\mathbb K}(X,\mathcal A,\mu)$, such that
\[ \phi(g)=(h\,|\,g),\quad\forall\,g\in L^2_{\mathbb K}(X,\mathcal A,\mu). \]
If we put $f=\bar h$, then the above equality gives
\[ \phi(g)=\langle f,g\rangle,\quad\forall\,g\in L^2_{\mathbb K}(X,\mathcal A,\mu), \]
i.e. $\phi=\Lambda_f$.
Comments. Eventually (see Section 5) we shall prove that the correspondence
(6) is surjective also in the general case.
The correspondence (6) also has a version for q = 1. This would require the
definition of an Lp space for the case p = ∞. We shall postpone this until we reach
Section 5. The next exercise hints towards such a construction.
Exercise 4 ♦. Let $(X,\mathcal A,\mu)$ be a measure space, let $\mathbb K$ be one of the fields $\mathbb R$ or $\mathbb C$, and let $f:X\to\mathbb K$ be a bounded measurable function. Define $M=\sup_{x\in X}|f(x)|$. Prove the following.
(i) Whenever $g\in L^1_{\mathbb K}(X,\mathcal A,\mu)$, it follows that the function $fg$ also belongs to $L^1_{\mathbb K}(X,\mathcal A,\mu)$, and one has the inequality
\[ \|fg\|_1\le M\cdot\|g\|_1. \]
(ii) The map
\[ \Lambda_f:L^1_{\mathbb K}(X,\mathcal A,\mu)\ni g\longmapsto\int_X fg\,d\mu\in\mathbb K \]
is linear and continuous. Moreover, one has the inequality $\|\Lambda_f\|\le M$.
Remark 3.5. If we apply the above Exercise to the constant function $f=1$, we get the (already known) fact that the integration map
\[ \Lambda_1:L^1_{\mathbb K}(X,\mathcal A,\mu)\ni g\longmapsto\int_X g\,d\mu\in\mathbb K \tag{9} \]
is linear and continuous, and has norm $\|\Lambda_1\|\le1$. The following exercise gives the exact value of the norm.
Exercise 5. With the notations above, prove that the following are equivalent:
(i) the measure space $(X,\mathcal A,\mu)$ is non-degenerate, i.e. there exists $A\in\mathcal A$ with $0<\mu(A)<\infty$;
(ii) $L^1_{\mathbb K}(X,\mathcal A,\mu)\ne\{0\}$;
(iii) the integration map (9) has norm $\|\Lambda_1\|=1$.
Lectures 36-37

4. Radon-Nikodym Theorems
In this section we discuss a property relating two measures on the same σ-algebra,
which has many important applications.
Definition. Let X be a non-empty set, and let A be a σ-algebra on X. Given
two measures µ and ν on A, we say that ν has the Radon-Nikodym property relative
to µ, if there exists a measurable function f : X → [0, ∞], such that
Z
(1) ν(A) = f dµ, ∀ A ∈ A.
A
Here we use the convention which defines the integral in the right hand side by
\[ \int_A f\,d\mu=\begin{cases}\int_X f\kappa_A\,d\mu&\text{if }f\kappa_A\in L^1_+(X,\mathcal A,\mu)\\ \infty&\text{if }f\kappa_A\notin L^1_+(X,\mathcal A,\mu)\end{cases} \]
In this case, we say that $f$ is a density for $\nu$ relative to $\mu$.
The Radon-Nikodym property has an equivalent useful formulation.
Proposition 4.1 (Change of Variables). Let X be a non-empty set, and let
A be a σ-algebra on X, let µ and ν be measures on A, and let f : X → [0, ∞] be a
measurable function.
A. The following are equivalent
(i) ν has the Radon-Nikodym property relative to µ, and f is a density for ν
relative to µ;
(ii) for every measurable function $h:X\to[0,\infty]$, one has the equality¹⁶
\[ \int_X h\,d\nu=\int_X hf\,d\mu. \tag{2} \]
B. If ν and f are as above, and K is either R or C, then the equality (2)
also holds for those measurable functions h : X → K with h ∈ L1K (X, A, ν) and
hf ∈ L1K (X, A, µ).
Proof. A. (i) ⇒ (ii). Assume property (i) holds, which means that we have
(1). Fix a measurable function h : X → [0, ∞], and use Theorem III.3.2, to find a
sequence (hn )∞n=1 ⊂ A-ElemR (X), with
(a) 0 ≤ h1 ≤ h2 ≤ · · · ≤ h;
(b) limn→∞ hn (x) = h(x), ∀ x ∈ X.
Of course, we also have
(a′) $0\le h_1f\le h_2f\le\dots\le hf$;
(b′) $\lim_{n\to\infty}h_n(x)f(x)=h(x)f(x)$, $\forall\,x\in X$.
(Footnote 16: For the product $hf$ we use the conventions $0\cdot\infty=\infty\cdot0=0$, and $t\cdot\infty=\infty\cdot t=\infty$, $\forall\,t\in(0,\infty]$.)


Using the Monotone Convergence Theorem, we then get the equalities
\[ \int_X h\,d\nu=\lim_{n\to\infty}\int_X h_n\,d\nu\quad\text{and}\quad\int_X hf\,d\mu=\lim_{n\to\infty}\int_X h_nf\,d\mu. \tag{3} \]
Notice that, if we fix $n$ and we write $h_n=\sum_{k=1}^p\alpha_k\kappa_{A_k}$, for some $A_1,\dots,A_p\in\mathcal A$, and $\alpha_1>\dots>\alpha_p>0$, then by (1) we have
\[ \int_X h_n\,d\nu=\sum_{k=1}^p\alpha_k\nu(A_k)=\sum_{k=1}^p\alpha_k\int_X\kappa_{A_k}f\,d\mu=\int_X h_nf\,d\mu, \]
so using (3), we immediately get (2).


The implication (ii) ⇒ (i) is trivial, using functions of the form h = κ A , A ∈ A.
B. Suppose ν has the Radon-Nikodym property relative to µ, and f is a density
for ν relative to µ, and let h : X → K be a measurable function with h ∈ L1K (X, A, ν)
and hf ∈ L1K (X, A, µ). In the complex case, using the inequalities |Re h| ≤ |h| and
|Im h| ≤ |h|, it is clear that both functions Re h and Im h belong to L1 (X, A, ν),
and also the products (Re h)f and (Im h)f belong to L1 (X, A, µ). This shows that
it suffices to prove (2) under the additional hypothesis that h is real-valued. In this
case we consider the functions h± , defined by
h+ = max{h, 0} and h− = max{−h, 0}.
Since we have $0\le h^\pm\le|h|$, it follows that $h^\pm\in L^1_+(X,\mathcal A,\nu)$, as well as $h^\pm f\in L^1_+(X,\mathcal A,\mu)$. In particular, we get the equalities
\[ \int_X h\,d\nu=\int_X h^+\,d\nu-\int_X h^-\,d\nu\quad\text{and}\quad\int_X hf\,d\mu=\int_X h^+f\,d\mu-\int_X h^-f\,d\mu. \tag{4} \]
Since $h^\pm\ge0$, we can use property A.(ii) above, and we have
\[ \int_X h^\pm\,d\nu=\int_X h^\pm f\,d\mu, \]
and then the desired equality (2) immediately follows from (4). □
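A concrete instance may help to see what (2) says. The data below ($X=[0,1]$, Lebesgue measure $\lambda$, density $f(x)=2x$, test function $h(x)=x$) are our own illustrative choices, and the second computation uses the standard layer-cake formula, which is not taken from this text.

```latex
% Assumed data: X = [0,1], \mu = \lambda (Lebesgue), f(x) = 2x, so that
\[
\nu(A)=\int_A 2x\,d\lambda(x),\qquad \nu\bigl((s,1]\bigr)=1-s^{2}\quad(0\le s\le 1).
\]
% Right-hand side of (2) for h(x) = x:
\[
\int_X hf\,d\mu=\int_0^1 x\cdot 2x\,dx=\tfrac{2}{3}.
\]
% Left-hand side of (2), computed from \nu alone via the layer-cake formula:
\[
\int_X h\,d\nu=\int_0^{1}\nu\bigl(\{h>s\}\bigr)\,ds=\int_0^1\bigl(1-s^{2}\bigr)\,ds=\tfrac{2}{3}.
\]
```

Both sides agree, as Proposition 4.1 predicts.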
One important issue is the uniqueness of the density. For this purpose, it will
be helpful to introduce the following.
Definition. Let $T$ be one of the spaces $[-\infty,\infty]$ or $\mathbb C$, and let $r$ be some relation on $T$ (in our case $r$ will be either “=,” or “≥,” or “≤,” on $[-\infty,\infty]$). Given a measure space $(X,\mathcal A,\mu)$, and two measurable functions $f_1,f_2:X\to T$, we write
\[ f_1\ r\ f_2,\quad\mu\text{-l.a.e.} \]
if the set
\[ A=\bigl\{x\in X:f_1(x)\ r\ f_2(x)\bigr\} \]
belongs to $\mathcal A$, and it has locally $\mu$-null complement in $X$, i.e. $\mu\bigl([X\smallsetminus A]\cap F\bigr)=0$, for every set $F\in\mathcal A$ with $\mu(F)<\infty$. (If $r$ is one of the relations listed above, the set $A$ automatically belongs to $\mathcal A$, so all intersections $[X\smallsetminus A]\cap F$, $F\in\mathcal A$, also belong to $\mathcal A$.) The abbreviation “$\mu$-l.a.e.” stands for “$\mu$-locally-almost everywhere.”
Remark that one has the implication
\[ f_1\ r\ f_2,\ \mu\text{-a.e.}\ \Rightarrow\ f_1\ r\ f_2,\ \mu\text{-l.a.e.} \]
Remark that, when $\mu$ is σ-finite, then the other implication also holds:
\[ f_1\ r\ f_2,\ \mu\text{-l.a.e.}\ \Rightarrow\ f_1\ r\ f_2,\ \mu\text{-a.e.} \]

With this terminology, one has the following uniqueness result.


Proposition 4.2. Suppose $\mathcal A$ is a σ-algebra on some non-empty set $X$, and $\mu$ and $\nu$ are measures on $\mathcal A$, such that $\nu$ has the Radon-Nikodym property relative to $\mu$. If $f,g:X\to[0,\infty]$ are densities for $\nu$ relative to $\mu$, then
\[ f=g,\quad\mu\text{-l.a.e.} \]
In particular, if $\mu$ is σ-finite, then
\[ f=g,\quad\mu\text{-a.e.} \]
Proof. Consider the set $B=\{x\in X:f(x)\ne g(x)\}$, which belongs to $\mathcal A$. We need to prove that $B$ is locally $\mu$-null, i.e. one has $\mu(B\cap F)=0$, for all $F\in\mathcal A$ with $\mu(F)<\infty$. Fix $F\in\mathcal A$ with $\mu(F)<\infty$, and let us write $B\cap F=D\cup E$, where
\[ D=\{x\in B\cap F:f(x)<g(x)\}\quad\text{and}\quad E=\{x\in B\cap F:f(x)>g(x)\}. \]
If we define, for each integer $n\ge1$, the sets
\[ D_n=\{x\in B\cap F:f(x)+\tfrac1n\le g(x)\}\quad\text{and}\quad E_n=\{x\in B\cap F:f(x)\ge g(x)+\tfrac1n\}, \]
then it is clear that
\[ B\cap F=D\cup E=\bigcup_{n=1}^\infty(D_n\cup E_n), \]
so in order to prove that $\mu(B\cap F)=0$, it suffices to show that $\mu(D_n)=\mu(E_n)=0$, $\forall\,n\ge1$.
Fix $n\ge1$. It is obvious that $f(x)<\infty$, $\forall\,x\in D_n$, so if we define the sequence $(D_n^k)_{k=1}^\infty\subset\mathcal A$, by
\[ D_n^k=\{x\in D_n:f(x)\le k\},\quad\forall\,k\ge1, \]
we have the equality $D_n=\bigcup_{k=1}^\infty D_n^k$, so in order to prove that $\mu(D_n)=0$, it suffices to show that $\mu(D_n^k)=0$, $\forall\,k\ge1$. On the one hand, since $f(x)\le k$, $\forall\,x\in D_n^k$, using the inclusion $D_n^k\subset F$, we get
\[ \nu(D_n^k)=\int_{D_n^k}f\,d\mu\le\int_X k\kappa_{D_n^k}\,d\mu=k\mu(D_n^k)\le k\mu(F)<\infty. \]
On the other hand, since $g(x)\ge f(x)+\frac1n$, $\forall\,x\in D_n^k$, we get
\[ \nu(D_n^k)=\int_{D_n^k}g\,d\mu\ge\int_X\bigl(f\kappa_{D_n^k}+\tfrac1n\kappa_{D_n^k}\bigr)\,d\mu=\int_X f\kappa_{D_n^k}\,d\mu+\int_X\tfrac1n\kappa_{D_n^k}\,d\mu=\nu(D_n^k)+\tfrac1n\mu(D_n^k). \]
Since $\nu(D_n^k)<\infty$, the above inequality forces $\mu(D_n^k)=0$.
The fact that $\mu(E_n)=0$, $\forall\,n\ge1$, is proven the exact same way. □
In general, the uniqueness of the density does not hold µ-a.e., as it is seen from
the following.
Example 4.1. Take $X$ to be some non-empty set, put $\mathcal A=\{\varnothing,X\}$, and define the measure $\mu$ on $\mathcal A$, by $\mu(\varnothing)=0$ and $\mu(X)=\infty$. It is clear that $\mu$ has the Radon-Nikodym property relative to itself, but as densities one can choose for instance the constant functions $f=1$ and $g=2$. Clearly, the equality $f=g$, $\mu$-a.e. is not true.
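The claims in this example can be verified in two lines. The constant $c$ below is a name introduced here for illustration; the computation uses the convention for $\int_A f\,d\mu$ stated after (1).

```latex
% Any constant c in (0, \infty) is a density for \mu relative to itself:
\[
\int_{\varnothing} c\,d\mu = 0=\mu(\varnothing),\qquad
\int_X c\,d\mu=\infty=\mu(X)\quad(\text{since } c\,\kappa_X\notin L^1_+(X,\mathcal A,\mu)).
\]
% The two densities differ on all of X, which is not \mu-null but is locally \mu-null,
% because the only set of finite measure is the empty set:
\[
\{f\neq g\}=X,\qquad \mu(X)=\infty\neq 0,\qquad
\mu(X\cap F)=\mu(\varnothing)=0\ \text{ for the only } F\in\mathcal A \text{ with } \mu(F)<\infty .
\]
```

So $f=g$ holds $\mu$-l.a.e., in agreement with Proposition 4.2, even though it fails $\mu$-a.e.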

Remark 4.1. The local almost uniqueness result, given in Proposition 4.2, holds under slightly weaker assumptions. Namely, if $(X,\mathcal A,\mu)$ is a measure space, and if $f,g:X\to[0,\infty]$ are measurable functions for which we have the equality
\[ \int_A f\,d\mu=\int_A g\,d\mu, \]
for all $A\in\mathcal A$ with $\mu(A)<\infty$, then we still have the equality $f=g$, $\mu$-l.a.e. This follows actually from Proposition 4.2, applied to functions of the form $f\big|_A$ and $g\big|_A$.
Let us introduce the following.
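Notations. For a measure space $(X,\mathcal A,\mu)$ we define
\[ \mathcal A^\mu_0=\{N\in\mathcal A:\mu(N)=0\}; \]
\[ \mathcal A^\mu_{\mathrm{fin}}=\{F\in\mathcal A:\mu(F)<\infty\}; \]
\[ \mathcal A^\mu_{0,\mathrm{loc}}=\{A\in\mathcal A:\mu(A\cap F)=0,\ \forall\,F\in\mathcal A^\mu_{\mathrm{fin}}\}. \]
With these notations, we have the inclusions
\[ \mathcal A^\mu_0=\mathcal A^\mu_{0,\mathrm{loc}}\cap\mathcal A^\mu_{\mathrm{fin}}\subset\mathcal A^\mu_{0,\mathrm{loc}}\subset\mathcal A, \]
and $\mathcal A^\mu_0$ and $\mathcal A^\mu_{0,\mathrm{loc}}$ are in fact σ-rings.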
Comment. The “locally-almost everywhere” terminology is actually designed to “hide some pathologies under the rug.” For instance, if $(X,\mathcal A,\mu)$ is a degenerate measure space, i.e. $\mu(A)\in\{0,\infty\}$, $\forall\,A\in\mathcal A$, then “anything happens locally almost-everywhere,” which means that we have the equality $\mathcal A^\mu_{0,\mathrm{loc}}=\mathcal A$.
At the other end, there is a particular type of measure spaces on which, even in the absence of σ-finiteness, the notions of “locally-almost everywhere” and “almost everywhere” coincide, i.e. we have the equality $\mathcal A^\mu_{0,\mathrm{loc}}=\mathcal A^\mu_0$. Such spaces are described by the following.
Definition. A measure space (X, A, µ) is said to be nowhere degenerate, or
with finite subset property, if
(f) for every set A ∈ A with µ(A) > 0, there exists some set F ∈ A, with
F ⊂ A, and 0 < µ(F ) < ∞.
With this terminology, one has the following result.
Proposition 4.3. For a measure space $(X,\mathcal A,\mu)$, the following are equivalent:
(i) $\mathcal A^\mu_{0,\mathrm{loc}}=\mathcal A^\mu_0$;
(ii) $(X,\mathcal A,\mu)$ has the finite subset property.

Proof. (i) ⇒ (ii). Assume $\mathcal A^\mu_{0,\mathrm{loc}}=\mathcal A^\mu_0$, and let us prove that $(X,\mathcal A,\mu)$ has the finite subset property. We argue by contradiction, so let us assume there exists some set $A\in\mathcal A$, with $\mu(A)=\infty$, such that $\mu(B)\in\{0,\infty\}$, for every $B\in\mathcal A$, with $B\subset A$. In particular, if we start with some arbitrary $F\in\mathcal A^\mu_{\mathrm{fin}}$, using the fact that $\mu(A\cap F)\le\mu(F)<\infty$, we see that we must have $\mu(A\cap F)=0$. This proves precisely that $A\in\mathcal A^\mu_{0,\mathrm{loc}}$. By assumption, it follows that $A\in\mathcal A^\mu_0$, i.e. $\mu(A)=0$, which is impossible.
(ii) ⇒ (i). Assume that $(X,\mathcal A,\mu)$ has the finite subset property, and let us prove the equality (i). Since one inclusion is always true, all we need to prove is the inclusion $\mathcal A^\mu_{0,\mathrm{loc}}\subset\mathcal A^\mu_0$, which is equivalent to the inclusion $\mathcal A^\mu_{0,\mathrm{loc}}\subset\mathcal A^\mu_{\mathrm{fin}}$. Start with some set $A\in\mathcal A^\mu_{0,\mathrm{loc}}$, but assume $\mu(A)=\infty$. On the one hand, using the finite subset property, there exists some set $F\in\mathcal A$ with $F\subset A$ and $0<\mu(F)<\infty$. On the other hand, since $A\in\mathcal A^\mu_{0,\mathrm{loc}}$ and $F\in\mathcal A^\mu_{\mathrm{fin}}$, we have $\mu(F)=\mu(A\cap F)=0$, which is impossible. □
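Example 4.2. Take $X$ to be an uncountable set, let $\mathcal A=\mathcal P(X)$, and let $\mu$ be the counting measure, i.e.
\[ \mu(A)=\begin{cases}\operatorname{Card}A&\text{if }A\text{ is finite}\\ \infty&\text{if }A\text{ is infinite}\end{cases} \]
Then $(X,\mathcal P(X),\mu)$ has the finite subset property, but is not σ-finite.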
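Both assertions of the example follow from short arguments, sketched below; the sets $X_n$ are hypothetical names used only for the contradiction argument.

```latex
% Finite subset property: if \mu(A) > 0 then A is non-empty, so pick x in A and take
\[
F=\{x\}\subset A,\qquad 0<\mu(F)=1<\infty .
\]
% Failure of sigma-finiteness: if X = \bigcup_n X_n with \mu(X_n) < \infty, then
\[
\operatorname{Card}X_n=\mu(X_n)<\infty\ \ \forall\,n
\quad\Longrightarrow\quad
X=\bigcup_{n=1}^{\infty}X_n \text{ is countable, a contradiction.}
\]
% In particular, in the notation introduced above:
\[
\mathcal A^{\mu}_{0,\mathrm{loc}}=\mathcal A^{\mu}_{0}=\{\varnothing\}.
\]
```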
When we restrict to integrable functions, the two notions µ-l.a.e, and µ-a.e.
coincide. More precisely, we have the following.
Proposition 4.4. Let (X, A, µ) be a measure space, let K be one of the fields
R or C, and let p ∈ [1, ∞). For a function f ∈ LpK (X, A, µ), the following are
equivalent:
(i) f = 0, µ-l.a.e.
(ii) f = 0, µ-a.e.
Proof. Of course, we only need to prove the implication (i) ⇒ (ii). Assume $f=0$, $\mu$-l.a.e. Using the function $g=|f|^p$, we can assume that $p=1$ and $f(x)\ge0$, $\forall\,x\in X$. Consider then the set $N=\{x\in X:f(x)>0\}$, and write it as a union $N=\bigcup_{n=1}^\infty N_n$, where
\[ N_n=\{x\in X:f(x)\ge\tfrac1n\},\quad\forall\,n\ge1. \]
Of course, all we need is the fact that $\mu(N_n)=0$, $\forall\,n\ge1$. Fix $n\ge1$. On the one hand, by the assumption on $f$, it follows that $N_n\in\mathcal A^\mu_{0,\mathrm{loc}}$. On the other hand, the inequality $\frac1n\kappa_{N_n}\le f$ forces the elementary function $\frac1n\kappa_{N_n}$ to be $\mu$-integrable, i.e. $\mu(N_n)<\infty$. Consequently we have
\[ N_n\in\mathcal A^\mu_{0,\mathrm{loc}}\cap\mathcal A^\mu_{\mathrm{fin}}=\mathcal A^\mu_0. \]
□
Comment. In what follows we will discuss several results, which all have as
conclusion the fact that one measure has the Radon-Nikodym property with respect
to another one. All such results will be called “Radon-Nikodym Theorems.”
The first result is in fact quite general, in the sense that it works for finite
signed or complex measures.
Theorem 4.1 (“Easy” Radon-Nikodym Theorem). Let (X, A, µ) be a finite
measure space, let K denote one of the fields R or C, and let C > 0 be some
constant. Suppose ν is a K-valued measure on A, such that
|ν(A)| ≤ Cµ(A), ∀ A ∈ A.
Then there exists some function f ∈ L1K (X, A, µ), such that
Z
(5) ν(A) = f dµ, ∀ A ∈ A.
A
Moreover:
(i) Any function $f\in L^1_{\mathbb K}(X,\mathcal A,\mu)$, satisfying (5) has the property $|f|\le C$, $\mu$-a.e. If $\nu$ is an “honest” measure, then one also has the inequality $f\ge0$, $\mu$-a.e.
(ii) A function satisfying (5) is essentially unique, in the sense that, whenever
f1 , f2 ∈ L1K (X, A, µ) satisfy (5), it follows that f1 = f2 , µ-a.e.
Proof. The idea is to somehow make sense of $\int_X h\,d\nu$, for suitable measurable functions $h$, and to examine the properties of such a number relative to the integral $\int_X h\,d\mu$. The second integral is of course defined, for instance for $h\in L^1_{\mathbb K}(X,\mathcal A,\mu)$, but the first integral is not, because $\nu$ is not an “honest” measure. The proof will be carried out in several steps.
Step 1: There exist four “honest” finite measures νk , k = 1, 2, 3, 4, and num-
bers αk , k = 1, 2, 3, 4, such that ν = α1 ν1 + α2 ν2 + α3 ν3 + α4 ν4 , and
(6) νk ≤ Cµ, ∀ k = 1, 2, 3, 4.
In the case K = R we use the Hahn-Jordan decomposition ν = ν + − ν − . We also
know that ν ± ≤ |ν|, the variation measure of ν. In this case we take α1 = 1,
ν1 = ν + , α2 = −1, ν2 = ν − , and we set ν3 = ν4 = 0, α3 = α4 = 0.
In the case K = C, we write ν = η + iλ, with η and λ finite signed measures,
and we use the Hahn-Jordan decompositions η = η + − η − and λ = λ+ − λ− . We
also know that the variation measures of η and λ satisfy |η| ≤ |ν| and |λ| ≤ |ν|, so
we also have η ± ≤ |ν| and λ± ≤ |ν|. In this case we can then take α1 = 1, ν1 = η + ,
α2 = −1, ν2 = η − , α3 = i, ν3 = λ+ , α4 = −i, ν4 = λ− .
Notice that in either case we have
νk ≤ |ν|, ∀ k = 1, 2, 3, 4.
By Remark III.8.5 it follows that we have |ν| ≤ Cµ, so we immediately get the
inequalities (6).
Step 2: For any measurable function $h:X\to[0,\infty]$, one has the inequality
\[ \int_X h\,d\nu_k\le C\int_X h\,d\mu,\quad\forall\,k=1,2,3,4. \tag{7} \]
To prove this, we choose a sequence of elementary functions $(h_n)_{n=1}^\infty\subset\mathcal A\text{-Elem}_{\mathbb R}(X)$, with
• $0\le h_1\le h_2\le\dots$ (everywhere),
• $\lim_{n\to\infty}h_n(x)=h(x)$, $\forall\,x\in X$,
so that by the General Monotone Convergence Theorem, we get the equalities
\[ \int_X h\,d\mu=\lim_{n\to\infty}\int_X h_n\,d\mu\quad\text{and}\quad\int_X h\,d\nu_k=\lim_{n\to\infty}\int_X h_n\,d\nu_k,\quad\forall\,k=1,2,3,4. \]
This means that, in order to prove (7), it suffices to prove it under the extra assumption that $h$ is elementary. In this case, we have
\[ h=\beta_1\kappa_{B_1}+\dots+\beta_p\kappa_{B_p}, \]
with $\beta_1,\dots,\beta_p\ge0$ and $B_1,\dots,B_p\in\mathcal A$. The inequality is then immediate, from (6) since we have
\[ \int_X h\,d\nu_k=\sum_{j=1}^p\beta_j\nu_k(B_j)\le C\sum_{j=1}^p\beta_j\mu(B_j)=C\int_X h\,d\mu. \]

As a consequence of Step 2, we get the fact that, for every $k=1,2,3,4$, one has the inclusions
\[ L^1_{\mathbb K}(X,\mathcal A,\mu)\subset L^1_{\mathbb K}(X,\mathcal A,\nu_k)\quad\text{and}\quad N_{\mathbb K}(X,\mathcal A,\mu)\subset N_{\mathbb K}(X,\mathcal A,\nu_k). \]
Taking quotients, this gives rise to correctly defined linear maps
\[ \Phi_k:L^1_{\mathbb K}(X,\mathcal A,\mu)\ni h\longmapsto h\in L^1_{\mathbb K}(X,\mathcal A,\nu_k),\quad k=1,2,3,4. \tag{8} \]
(Here we use the abusive notation that identifies an element of the quotient space $L^1$ with a function representing it, which is defined almost uniquely.) Moreover, one has the inequality
\[ \int_X|h|\,d\nu_k\le C\int_X|h|\,d\mu,\quad\forall\,h\in L^1_{\mathbb K}(X,\mathcal A,\mu),\ k=1,2,3,4, \]
in other words, the linear maps (8) are all continuous. For every $k=1,2,3,4$, let $\phi_k$ denote the integration map
\[ \phi_k:L^1_{\mathbb K}(X,\mathcal A,\nu_k)\ni h\longmapsto\int_X h\,d\nu_k\in\mathbb K. \]
We know (see Remark 3.5) that the $\phi_k$'s are continuous. In particular, the compositions $\psi_k=\phi_k\circ\Phi_k:L^1_{\mathbb K}(X,\mathcal A,\mu)\to\mathbb K$, which are defined by
\[ \psi_k:L^1_{\mathbb K}(X,\mathcal A,\mu)\ni h\longmapsto\int_X h\,d\nu_k,\quad k=1,2,3,4, \]
are linear and continuous.
We now use Proposition 3.3 which states that one has an inclusion
\[ \Theta:L^2_{\mathbb K}(X,\mathcal A,\mu)\hookrightarrow L^1_{\mathbb K}(X,\mathcal A,\mu), \tag{9} \]
which is in fact a linear continuous map. So if we consider the compositions $\theta_k=\psi_k\circ\Theta$, which are defined by
\[ \theta_k:L^2_{\mathbb K}(X,\mathcal A,\mu)\ni h\longmapsto\int_X h\,d\nu_k,\quad k=1,2,3,4, \]
then these compositions are linear and continuous. Apply then Riesz Theorem (in the form given in Remark 3.4), to find functions $f_1,f_2,f_3,f_4\in L^2_{\mathbb K}(X,\mathcal A,\mu)$, such that
\[ \theta_k(h)=\langle f_k,h\rangle,\quad\forall\,h\in L^2_{\mathbb K}(X,\mathcal A,\mu),\ k=1,2,3,4. \]
In particular, using functions of the form $h=\kappa_A$, $A\in\mathcal A$ (which all belong to $L^2_{\mathbb K}(X,\mathcal A,\mu)$, due to the finiteness of $\mu$), we get
\[ \nu_k(A)=\int_X\kappa_A\,d\nu_k=\int_X f_k\kappa_A\,d\mu,\quad\forall\,A\in\mathcal A,\ k=1,2,3,4. \]
Finally, if we define the function $f=\alpha_1f_1+\alpha_2f_2+\alpha_3f_3+\alpha_4f_4\in L^2_{\mathbb K}(X,\mathcal A,\mu)$, then the above equalities immediately give the equality (5).
At this point we only know that f belongs to L2K (X, A, µ). Using the inclusion
(9), it turns out that f indeed belongs to L1K (X, A, µ).
Let us prove now the additional properties (i) and (ii).
To prove the first assertion in (i), we start off by fixing some function $f\in L^1_{\mathbb K}(X,\mathcal A,\mu)$, which satisfies (5), and we define the set
\[ A=\{x\in X:|f(x)|>C\}, \]
for which we must prove that $\mu(A)=0$. Since $f$ is measurable, it follows that $A$ belongs to $\mathcal A$. Consider the “rational unit sphere” $S^1_{\mathbb Q}$ in $\mathbb K$, defined as
\[ S^1_{\mathbb Q}=\begin{cases}\{-1,1\}&\text{if }\mathbb K=\mathbb R\\ \{e^{2\pi it}:t\in\mathbb Q\}&\text{if }\mathbb K=\mathbb C\end{cases} \tag{10} \]
The point is that $S^1_{\mathbb Q}$ is dense in the unit sphere $S^1$ in $\mathbb K$:
\[ S^1=\{\alpha\in\mathbb K:|\alpha|=1\}, \]
so we immediately have the equality $A=\bigcup_{\alpha\in S^1_{\mathbb Q}}A_\alpha$, where
\[ A_\alpha=\{x\in X:\operatorname{Re}[\alpha f(x)]>C\}. \]
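Since $S^1_{\mathbb Q}$ is countable, in order to prove that $\mu(A)=0$, it then suffices to show that $\mu(A_\alpha)=0$, $\forall\,\alpha\in S^1_{\mathbb Q}$. Fix then $\alpha\in S^1_{\mathbb Q}$, and consider the $\mathbb K$-valued measure $\eta=\alpha\nu$. It is clear that we still have
\[ |\eta(A)|=|\nu(A)|\le C\mu(A),\quad\forall\,A\in\mathcal A, \tag{11} \]
as well as the equality
\[ \eta(A)=\int_A\alpha f\,d\mu,\quad\forall\,A\in\mathcal A. \tag{12} \]
For each integer $n\ge1$, let us define the set
\[ A^n_\alpha=\{x\in X:\operatorname{Re}[\alpha f(x)]\ge C+\tfrac1n\}, \]
so that we obviously have the equality $A_\alpha=\bigcup_{n=1}^\infty A^n_\alpha$. In particular, in order to prove $\mu(A_\alpha)=0$, it suffices to prove that $\mu(A^n_\alpha)=0$, $\forall\,n\ge1$. Fix for the moment $n\ge1$. Using (12), it follows that
\[ \operatorname{Re}\eta(A^n_\alpha)=\operatorname{Re}\Bigl[\int_{A^n_\alpha}\alpha f\,d\mu\Bigr]=\int_{A^n_\alpha}\operatorname{Re}[\alpha f]\,d\mu=\int_X\operatorname{Re}[\alpha f]\kappa_{A^n_\alpha}\,d\mu. \]
Since we have $\operatorname{Re}[\alpha f]\kappa_{A^n_\alpha}\ge(C+\frac1n)\kappa_{A^n_\alpha}$, the above equality can be continued with
\[ \operatorname{Re}\eta(A^n_\alpha)\ge\int_X(C+\tfrac1n)\kappa_{A^n_\alpha}\,d\mu=(C+\tfrac1n)\mu(A^n_\alpha). \]
Of course, this will give
\[ |\eta(A^n_\alpha)|\ge\operatorname{Re}\eta(A^n_\alpha)\ge(C+\tfrac1n)\mu(A^n_\alpha). \]
Note now that, using (11), this will finally give
\[ C\mu(A^n_\alpha)\ge(C+\tfrac1n)\mu(A^n_\alpha), \]
which clearly forces $\mu(A^n_\alpha)=0$ (recall that $\mu$ is finite).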
Having proven that $|f|\le C$, $\mu$-a.e., let us turn our attention now to the uniqueness property (ii). Suppose $f_1,f_2\in L^1_{\mathbb K}(X,\mathcal A,\mu)$ are such that
\[ \nu(A)=\int_A f_1\,d\mu=\int_A f_2\,d\mu,\quad\forall\,A\in\mathcal A. \]
Consider then the difference $f=f_1-f_2$ and the trivial measure $\nu_0=0$. Obviously we have
\[ |\nu_0(A)|\le\tfrac1n\mu(A),\quad\forall\,A\in\mathcal A, \]
for every integer $n\ge1$, as well as
\[ \nu_0(A)=\int_A f\,d\mu,\quad\forall\,A\in\mathcal A. \]
By the first assertion in (i), it follows that
\[ |f_1-f_2|=|f|\le\frac1n,\quad\mu\text{-a.e.}, \]
for every $n\ge1$. So if we take the sets $(N_n)_{n=1}^\infty\subset\mathcal A$ defined by
\[ N_n=\{x\in X:|f_1(x)-f_2(x)|>\tfrac1n\}, \]
then $\mu(N_n)=0$, $\forall\,n\ge1$. Of course, if we put $N=\bigcup_{n=1}^\infty N_n$, then on the one hand we have $\mu(N)=0$, and on the other hand, we have
\[ f_1(x)-f_2(x)=0,\quad\forall\,x\in X\smallsetminus N, \]
which means that we indeed have $f_1=f_2$, $\mu$-a.e.
Finally, let us prove the second assertion in (i), which starts with the assumption
that ν is an “honest” measure. Let f ∈ L1K (X, A, µ) satisfy (5). By the uniqueness
property (ii), it follows immediately that
f = Re f, µ-a.e.,
so we can assume that f is already real-valued. Consider the “honest” measure
ω = Cµ − ν, and notice that the function g : X → R defined by
g(x) = C − f (x), ∀ x ∈ X,
clearly has the property
Z
ω(A) = g dµ, ∀ A ∈ A.
A
Since we obviously have
0 ≤ ω(A) ≤ Cµ(A), ∀ A ∈ A,
by the first assertion of (i), applied to the measure ω and the function g, it follows
that |g| ≤ C, µ-a.e. In other words, we have now a combined inequality:

\[ \max\bigl\{|f|,|C-f|\bigr\}\le C,\quad\mu\text{-a.e.} \]
Of course, since $f$ is real valued, this forces $f\ge0$, $\mu$-a.e. (indeed, $|C-f|\le C$ means $0\le f\le2C$). □
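On a finite set the theorem can be verified by hand, which makes the roles of the bound $|f|\le C$ and of the "$\mu$-a.e." uniqueness transparent. The weights $w_j$ and values $c_j$ below are hypothetical data chosen for illustration.

```latex
% Assumed setting: X = {1, ..., N}, \mathcal A = \mathcal P(X),
% \mu(\{j\}) = w_j \in [0, \infty),  \nu(\{j\}) = c_j \in \mathbb K,  |c_j| \le C w_j.
\[
f(j)=\begin{cases} c_j/w_j & \text{if } w_j>0\\ \text{arbitrary} & \text{if } w_j=0 \ (\text{then } c_j=0)\end{cases}
\qquad\Longrightarrow\qquad
\int_A f\,d\mu=\sum_{j\in A}f(j)\,w_j=\sum_{j\in A}c_j=\nu(A).
\]
% The bound in (i) holds off the \mu-null set \{j : w_j = 0\}:
\[
|f(j)|=\frac{|c_j|}{w_j}\le C\qquad\text{whenever } w_j>0 .
\]
```

The freedom in choosing $f(j)$ at points with $w_j=0$ is exactly the non-uniqueness permitted by (ii).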

In what follows we are going to offer various generalizations of Theorem 4.1.
Among the several directions in which the theorem can be generalized, the main
one, which we present here, aims at weakening the condition $|\nu|\le C\mu$.
The following result explains that in fact the case of K-valued measures can be
always reduced to the case of “honest” finite ones.
Proposition 4.5 (Polar Decomposition). Let A be a σ-algebra on a non-empty
set X, let K be one of the fields R or C, and let ν be a K-valued measure on A. Let
|ν| denote the variation measure of ν. There exists some function f ∈ L1K (X, A, |ν|),
such that
Z
(13) ν(A) = f d|ν|, ∀ A ∈ A.
A
Moreover
(i) Any function f ∈ L1K (X, A, |ν|), satisfying (13) has the property |f | = 1,
|ν|-a.e.
(ii) A function satisfying (13) is essentially unique, in the sense that, when-
ever f1 , f2 ∈ L1K (X, A, |ν|) satisfy (13), it follows that f1 = f2 , |ν|-a.e.

Proof. We know that
\[ |\nu(A)|\le|\nu|(A),\quad\forall\,A\in\mathcal A. \]
So if we apply Theorem 4.1 for the finite measure $\mu=|\nu|$ and $C=1$, we immediately get the existence of $f\in L^1_{\mathbb K}(X,\mathcal A,|\nu|)$, satisfying (13). Again by Theorem 4.1, the uniqueness property (ii) is automatic, and we also have $|f|\le1$, $|\nu|$-a.e. To prove the fact that we have in fact the equality $|f|=1$, $|\nu|$-a.e., we define the set
\[ A=\{x\in X:|f(x)|<1\}, \]
which belongs to $\mathcal A$, and we prove that $|\nu|(A)=0$. If we define the sequence of sets $(A_n)_{n=1}^\infty\subset\mathcal A$, by
\[ A_n=\{x\in X:|f(x)|\le1-\tfrac1n\},\quad\forall\,n\ge1, \]
then we clearly have $A=\bigcup_{n=1}^\infty A_n$, so all we have to show is the fact that $|\nu|(A_n)=0$, $\forall\,n\ge1$. Fix $n\ge1$. For every $B\in\mathcal A$, with $B\subset A_n$, we have
\[ |f(x)|\le1-\tfrac1n,\quad\forall\,x\in B, \]
so using (13) we get
\[ |\nu(B)|=\Bigl|\int_B f\,d|\nu|\Bigr|\le\int_B|f|\,d|\nu|\le\int_B(1-\tfrac1n)\,d|\nu|=(1-\tfrac1n)|\nu|(B). \]
Now if we take an arbitrary pairwise disjoint sequence $(B_k)_{k=1}^\infty\subset\mathcal A$, with $\bigcup_{k=1}^\infty B_k=A_n$, then the above estimate will give
\[ \sum_{k=1}^\infty|\nu(B_k)|\le(1-\tfrac1n)\sum_{k=1}^\infty|\nu|(B_k)=(1-\tfrac1n)|\nu|(A_n). \]
Taking supremum in the left hand side, and using the definition of the variation measure, the above estimate will finally give
\[ |\nu|(A_n)\le(1-\tfrac1n)|\nu|(A_n), \]
which clearly forces $|\nu|(A_n)=0$. □
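For a purely atomic measure the polar decomposition is explicit. The following sketch assumes $\mathcal A=\mathcal P(X)$ and uses Dirac measures $\delta_{x_j}$ at distinct points $x_j$ with non-zero coefficients $c_j$; all of these are illustrative choices, not data from the text.

```latex
% Assumed data: distinct points x_1, x_2, ... in X, coefficients c_j in \mathbb K \setminus \{0\}
% with \sum_j |c_j| < \infty, and
\[
\nu=\sum_{j}c_j\,\delta_{x_j},\qquad |\nu|=\sum_{j}|c_j|\,\delta_{x_j}.
\]
% The density of \nu relative to |\nu| is the "phase" of the coefficients:
\[
f(x_j)=\frac{c_j}{|c_j|},\qquad |f(x_j)|=1,\qquad
\int_A f\,d|\nu|=\sum_{x_j\in A}\frac{c_j}{|c_j|}\,|c_j|=\sum_{x_j\in A}c_j=\nu(A).
\]
```

Outside the set $\{x_1,x_2,\dots\}$ the values of $f$ are irrelevant, since that complement is $|\nu|$-null.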
Remark 4.2. The case $\mathbb K=\mathbb R$ can be slightly generalized, to include the case of infinite signed measures. If $\nu$ is a signed measure on $\mathcal A$ and if we consider the Hahn-Jordan set decomposition $(X^+,X^-)$, then the density $f$ is simply the function
\[ f(x)=\begin{cases}1&\text{if }x\in X^+\\ -1&\text{if }x\in X^-\end{cases} \]
The equality (13) will then hold only for those sets $A\in\mathcal A$ with $|\nu|(A)<\infty$. Since $|\nu|$ is allowed to be infinite, as explained in Example 4.1, the uniqueness property (ii) will hold only with “$|\nu|$-l.a.e.” in place of “$|\nu|$-a.e.” Likewise, the absolute value property (i) will have to be replaced with “$|f|=1$, $|\nu|$-l.a.e.”
Comment. Up to this point, it seems that the hypotheses from Theorem 4.1
are essential, particularly the dominance condition |ν| ≤ Cµ. It is worth discussing
this property in a bit more detail, especially having in mind that we plan to weaken
it as much as possible.
Notation. Suppose $\mathcal A$ is a σ-algebra on some non-empty set $X$, and suppose $\mu$ and $\nu$ are “honest” (not necessarily finite) measures on $\mathcal A$. We shall write
\[ \nu\Subset\mu, \]
if there exists some constant $C>0$, such that
\[ \nu(A)\le C\mu(A),\quad\forall\,A\in\mathcal A. \]
A few steps in the proof of Theorem 4.1 hold even without the finiteness assumption, as indicated by the following.

Exercise 1*. Suppose A is a σ algebra on some non-empty set X, and suppose


µ and ν are “honest” measures on A. Prove the following.
(i) If ν b µ, then one has the inclusions
NK (X, A, µ) ⊂ NK (X, A, ν) and LpK (X, A, µ) ⊂ LpK (X, A, ν), ∀ p ∈ [1, ∞).
Consequently (see the proof of Theorem 4.1) one has linear maps
LpK (X, A, µ) 3 h 7−→ h ∈ LpK (X, A, ν), ∀ p ∈ [1, ∞).
Show that these linear maps are continuous.
(ii) Conversely, assuming one has the inclusion
LpK0 (X, A, µ) ⊂ LpK0 (X, A, ν),
for some p0 ∈ [1, ∞), prove that ν b µ.
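Hint: To prove (ii) show first that one has the inclusion $L^1_+(X, \mathcal{A}, \mu) \subset L^1_+(X, \mathcal{A}, \nu)$. Then show that the quantity
$$C = \sup\left\{ \int_X h \, d\nu \;:\; h \in L^1_+(X, \mathcal{A}, \mu),\ \int_X h \, d\mu \le 1 \right\}$$
is finite. If $C = \infty$, there exists some sequence $(h_n)_{n=1}^\infty \subset L^1_+(X, \mathcal{A}, \mu)$, with
$$\int_X h_n \, d\mu \le 1 \quad\text{and}\quad \int_X h_n \, d\nu \ge 4^n, \quad \forall\, n \ge 1.$$
Consider then the series $\sum_{n=1}^\infty \frac{1}{2^n} h_n$, and get a contradiction. Finally prove that $\nu(A) \le C\mu(A)$, $\forall\, A \in \mathcal{A}$.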
It is now time to introduce the following relation, which is a highly non-trivial weakening of the relation $\Subset$.
Definition. Let $\mathcal{A}$ be a σ-algebra on some non-empty set X, and suppose µ and ν are “honest” (not necessarily finite) measures on $\mathcal{A}$. We say that ν is absolutely continuous with respect to µ, if for every $A \in \mathcal{A}$, one has the implication
$$\mu(A) = 0 \Longrightarrow \nu(A) = 0. \tag{14}$$
In this case we are going to use the notation
$$\nu \ll \mu.$$
It is obvious that one always has the implication
$$\nu \Subset \mu \;\Rightarrow\; \nu \ll \mu.$$
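The gap between the two relations can be seen on a concrete case. The following worked example is only a sketch, under the assumption that $X = [0,1]$, that µ is Lebesgue measure on the Borel sets, and that ν is the finite measure with density $x \mapsto 1/(2\sqrt{x})$ (none of these choices comes from the text above). It suggests that absolute continuity can hold even though no dominating constant C exists.

```latex
% Assumed setting (illustration only): X = [0,1], \mu = Lebesgue measure,
% \nu(A) = \int_A \frac{1}{2\sqrt{x}}\,d\mu(x), so that \nu(X) = 1.
\begin{align*}
\mu(A) = 0 \;&\Longrightarrow\; \nu(A) = \int_A \frac{1}{2\sqrt{x}}\,d\mu(x) = 0,
  && \text{so the implication (14) holds;} \\
\nu([0,\varepsilon]) &= \sqrt{\varepsilon}, \qquad \mu([0,\varepsilon]) = \varepsilon,
  && 0 < \varepsilon \le 1; \\
\frac{\nu([0,\varepsilon])}{\mu([0,\varepsilon])} &= \frac{1}{\sqrt{\varepsilon}}
  \xrightarrow[\;\varepsilon \to 0\;]{} \infty,
  && \text{so no constant } C \text{ satisfies } \nu \le C\mu.
\end{align*}
```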
Remarks 4.3. Let $(X, \mathcal{A}, \mu)$ be a measure space.
A. If ν is an “honest” measure on $\mathcal{A}$, which has the Radon-Nikodym property relative to µ, then $\nu \ll \mu$. This is pretty obvious, since if we pick $f : X \to [0, \infty]$ to be a density for ν relative to µ, then for every $A \in \mathcal{A}$ with $\mu(A) = 0$, we have $f\kappa_A = 0$, µ-a.e., so we get
$$\nu(A) = \int_A f \, d\mu = \int_X f\kappa_A \, d\mu = 0.$$
B. For an “honest” measure ν on $\mathcal{A}$, the relation $\nu \ll \mu$ is equivalent to the inclusion
$$N_{\mathbb{K}}(X, \mathcal{A}, \mu) \subset N_{\mathbb{K}}(X, \mathcal{A}, \nu).$$
By Exercise 1, this already suggests that the relation $\ll$ is much weaker than $\Subset$ (see Exercise 2 below).
C. If ν is either a signed or a complex measure on $\mathcal{A}$, then the following are equivalent:
(i) the variation measure $|\nu|$ is absolutely continuous with respect to µ;
(ii) for every $A \in \mathcal{A}$, one has the implication (14).
The implication (i) ⇒ (ii) is trivial, since one has
$$|\nu(A)| \le |\nu|(A), \quad \forall\, A \in \mathcal{A}.$$
The implication (ii) ⇒ (i) is also clear, since if we start with some $A \in \mathcal{A}$ with $\mu(A) = 0$, then every $B \in \mathcal{A}$ with $B \subset A$ has $\mu(B) = 0$, so we get $|\nu(B)| = 0$ for all such B, and then arguing exactly as in the proof of Proposition 4.3, we get $|\nu|(A) = 0$.
Convention. Using Remark 4.3.C, we extend the definition of absolute continuity, and the notation $\nu \ll \mu$, to include the case when ν is either a signed measure, or a complex measure on $\mathcal{A}$. In other words, the notation $\nu \ll \mu$ means that $|\nu| \ll \mu$.
The following technical result is key for the second Radon-Nikodym Theorem.
Lemma 4.1. Let $(X, \mathcal{A}, \mu)$ be a finite measure space, and let ν be an “honest” measure on $\mathcal{A}$, with $\nu \ll \mu$. Then there exists a sequence $(\nu_n)_{n=1}^\infty$ of “honest” measures on $\mathcal{A}$, such that
(i) $\nu_n \Subset \mu$, $\forall\, n \ge 1$; in particular the measures $\nu_n$, $n \ge 1$, are all finite;
(ii) $\nu_1 \le \nu_2 \le \dots$;
(iii) $\lim_{n\to\infty} \nu_n(A) = \nu(A)$, $\forall\, A \in \mathcal{A}$.
Proof. Let us define
νn = (nµ) ∧ ν, ∀ n ≥ 1.
Recall (see III.8, the Lattice Property; it is essential here that one of the measures,
namely nµ, is finite) that by construction νn has the following properties:
(a) νn ≤ nµ and νn ≤ ν;
(b) whenever ω is a measure with ω ≤ nµ and ω ≤ ν, it follows that ω ≤ νn .
Property (a) above already gives condition (i). It will be helpful to notice that
property (a) also gives the inequality
(15) νn ≤ ν, ∀ n ≥ 1.
The monotonicity condition is now trivial, since by (b) the inequalities νn−1 ≤
(n − 1)µ ≤ nµ and νn−1 ≤ ν, imply νn−1 ≤ (nµ) ∧ ν = νn .
To derive property (iii), it will be helpful to recall the actual definition of the operation ∧. Fix for the moment $n \ge 1$. One first considers the signed measure $\lambda_n = n\mu - \nu$, and its Hahn-Jordan decomposition $\lambda_n = \lambda_n^+ - \lambda_n^-$. In our case, we get $\lambda_n^+ \le n\mu$ and $\lambda_n^- \le \nu$. With these notations the measures $\nu_n$ are defined by $\nu_n = n\mu - \lambda_n^+$, $\forall\, n \ge 1$. If we fix, for each $n \ge 1$, a Hahn-Jordan set decomposition $(X_n^+, X_n^-)$ for X relative to $\lambda_n$, then we have
$$\nu_n(A) = \nu(A \cap X_n^+) + n\mu(A \cap X_n^-), \quad \forall\, A \in \mathcal{A},\ n \ge 1. \tag{16}$$
Consider then the sets $X_\infty^+ = \bigcup_{n=1}^\infty X_n^+$ and $X_\infty^- = \bigcap_{n=1}^\infty X_n^-$. It is clear that $X_\infty^\pm \in \mathcal{A}$, and $X_\infty^- = X \smallsetminus X_\infty^+$.
Fix now a set $A \in \mathcal{A}$, and let us prove the equality (iii). On the one hand, the obvious inclusions $X_n^- \supset X_\infty^-$, combined with (16), give the inequalities
$$\nu_n(A) \ge \nu(A \cap X_n^+) + n\mu(A \cap X_\infty^-), \quad \forall\, n \ge 1. \tag{17}$$
On the other hand, since $\lambda_{n+1} = \mu + \lambda_n$, $\forall\, n \ge 1$, using Lemma III.8.2, we get the relations
$$X_1^+ \underset{\mu}{\subset} X_2^+ \underset{\mu}{\subset} \dots.$$
(Recall that the notation $D \underset{\mu}{\subset} E$ stands for $\mu(D \smallsetminus E) = 0$.) Since $\nu \ll \mu$, we also have the relations
$$A \cap X_1^+ \underset{\nu}{\subset} A \cap X_2^+ \underset{\nu}{\subset} \dots,$$
so using Proposition III.4.3, one gets the equality
$$\nu(A \cap X_\infty^+) = \lim_{n\to\infty} \nu(A \cap X_n^+).$$
Combining this with the inequalities (15) and (17) then yields the inequality
$$\nu(A) \ge \limsup_{n\to\infty} \nu_n(A) \ge \liminf_{n\to\infty} \nu_n(A) \ge \nu(A \cap X_\infty^+) + \lim_{n\to\infty} \big[ n\mu(A \cap X_\infty^-) \big]. \tag{18}$$

There are two possibilities here.
Case I: $\mu(A \cap X_\infty^-) > 0$.
In this case, the estimate (18) forces
$$\nu(A) = \limsup_{n\to\infty} \nu_n(A) = \liminf_{n\to\infty} \nu_n(A) = \infty.$$
Case II: $\mu(A \cap X_\infty^-) = 0$.
In this case, using absolute continuity, we get $\nu(A \cap X_\infty^-) = 0$, and the equality $A = (A \cap X_\infty^+) \cup (A \cap X_\infty^-)$ yields
$$\nu(A) = \nu(A \cap X_\infty^+).$$
Then (18) forces
$$\limsup_{n\to\infty} \nu_n(A) = \liminf_{n\to\infty} \nu_n(A) = \nu(A).$$
In either case, the conclusion is the same: $\lim_{n\to\infty} \nu_n(A) = \nu(A)$. □
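To see what the truncations $\nu_n = (n\mu) \wedge \nu$ look like, it may help to work them out in a case where a density is already known. The sketch below assumes, purely for illustration, that ν is given by $\nu(A) = \int_A f\,d\mu$ for some measurable $f : X \to [0,\infty]$ (this is exactly what the Lemma is later used to prove, so it is not available in the proof itself). Under that assumption formula (16) reduces to truncating the density at height n.

```latex
% Assumption (illustration only): \nu(A) = \int_A f\,d\mu with f : X \to [0,\infty] measurable.
% Then \lambda_n = n\mu - \nu has density n - f wherever f is finite, so one may take
\begin{align*}
X_n^+ &= \{x \in X : f(x) \le n\}, \qquad X_n^- = \{x \in X : f(x) > n\}, \\
\nu_n(A) &= \nu(A \cap X_n^+) + n\mu(A \cap X_n^-)
          = \int_{A \cap \{f \le n\}} f\,d\mu + \int_{A \cap \{f > n\}} n\,d\mu
          = \int_A \min\{f, n\}\,d\mu, \\
X_\infty^- &= \bigcap_{n=1}^\infty \{f > n\} = \{x \in X : f(x) = \infty\}.
\end{align*}
% Case I of the proof corresponds to \mu(A \cap \{f = \infty\}) > 0, where \nu(A) = \infty;
% Case II corresponds to f < \infty \mu-a.e. on A, where \min\{f,n\} increases to f \mu-a.e. on A.
```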
After the above preparation, we are now in position to prove the following.
Theorem 4.2 (Radon-Nikodym Theorem: the finite case). Let $(X, \mathcal{A}, \mu)$ be a finite measure space.
A. If ν is an “honest” measure on $\mathcal{A}$, with $\nu \ll \mu$, then there exists a measurable function $f : X \to [0, \infty]$, such that
$$\nu(A) = \int_A f \, d\mu, \quad \forall\, A \in \mathcal{A}. \tag{19}$$
Moreover, such a function is essentially unique, in the sense that, whenever $f_1, f_2 : X \to [0, \infty]$ are measurable functions that satisfy (19), it follows that $f_1 = f_2$, µ-a.e.
B. Let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$. If λ is a $\mathbb{K}$-valued measure on $\mathcal{A}$, with $\lambda \ll \mu$, then there exists a function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, such that
$$\lambda(A) = \int_A f \, d\mu, \quad \forall\, A \in \mathcal{A}. \tag{20}$$
Moreover:
(i) A function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfying (20) is essentially unique, in the sense that, whenever $f_1, f_2 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfy (20), it follows that $f_1 = f_2$, µ-a.e.
(ii) If $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ is any function satisfying (20), then the variation measure $|\lambda|$ of λ is given by
$$|\lambda|(A) = \int_A |f| \, d\mu, \quad \forall\, A \in \mathcal{A}.$$

Proof. A. Use Lemma 4.1 to find a sequence $(\nu_n)_{n=1}^\infty$ of “honest” measures on $\mathcal{A}$, such that
• $\nu_n \Subset \mu$, $\forall\, n \ge 1$; in particular the measures $\nu_n$, $n \ge 1$, are all finite;
• $\nu_1 \le \nu_2 \le \dots$;
• $\lim_{n\to\infty} \nu_n(A) = \nu(A)$, $\forall\, A \in \mathcal{A}$.
For each $n \ge 1$, we apply the “Easy” Radon-Nikodym Theorem 4.1, to find some measurable function $f_n : X \to \mathbb{R}$, such that
$$\nu_n(A) = \int_A f_n \, d\mu, \quad \forall\, A \in \mathcal{A}.$$
Claim: The sequence $(f_n)_{n=1}^\infty$ satisfies
$$0 \le f_n \le f_{n+1}, \ \mu\text{-a.e.}, \quad \forall\, n \ge 1.$$
Fix $n \ge 1$. On the one hand, since the $\nu_n$'s are “honest” finite measures, and $\nu_n \Subset \mu$, by part (i) of Theorem 4.1, it follows that $f_n \ge 0$, µ-a.e. On the other hand, since $\nu_{n+1} - \nu_n$ is also an “honest” finite measure with $\nu_{n+1} - \nu_n \Subset \mu$, and with density $f_{n+1} - f_n$, again by part (i) of Theorem 4.1, it follows that $f_{n+1} - f_n \ge 0$, µ-a.e.
Having proven the above Claim, let us define the function $f : X \to [0, \infty]$, by
$$f(x) = \liminf_{n\to\infty} \big[ \max\{f_n(x), 0\} \big], \quad \forall\, x \in X.$$
It is obvious that f is measurable. By the Claim, we have in fact the equality
$$f = \mu\text{-a.e.-}\lim_{n\to\infty} f_n.$$
Since we also have
$$f\kappa_A = \mu\text{-a.e.-}\lim_{n\to\infty} f_n\kappa_A, \quad \forall\, A \in \mathcal{A},$$
using the Claim and the Monotone Convergence Theorem, we get
$$\int_A f \, d\mu = \int_X f\kappa_A \, d\mu = \lim_{n\to\infty} \int_X f_n\kappa_A \, d\mu = \lim_{n\to\infty} \int_A f_n \, d\mu = \lim_{n\to\infty} \nu_n(A) = \nu(A), \quad \forall\, A \in \mathcal{A}.$$
Having shown that f satisfies (19), let us observe that the uniqueness property stated in part A is a consequence of Proposition 4.2.
B. Let λ be a $\mathbb{K}$-valued measure on $\mathcal{A}$, with $\lambda \ll \mu$. In particular, the variation measure $|\lambda|$ is finite, so by the Polar Decomposition (Proposition 4.3) there exists some measurable function $h : X \to \mathbb{K}$, such that
$$\lambda(A) = \int_A h \, d|\lambda|, \quad \forall\, A \in \mathcal{A}, \tag{21}$$
and such that $|h| = 1$, $|\lambda|$-a.e. Replacing h with the measurable function $h' : X \to \mathbb{K}$, defined by
$$h'(x) = \begin{cases} h(x) & \text{if } |h(x)| = 1 \\ 1 & \text{if } |h(x)| \ne 1 \end{cases}$$
we can assume that in fact we have
$$|h(x)| = 1, \quad \forall\, x \in X.$$

Apply then part A, to the measure $|\lambda|$, which is again absolutely continuous with respect to µ, to find some measurable function $g : X \to [0, \infty]$, such that
$$|\lambda|(A) = \int_A g \, d\mu, \quad \forall\, A \in \mathcal{A}.$$
Remark that, since
$$\int_X g \, d\mu = |\lambda|(X) < \infty,$$
it follows that $g \in L^1_+(X, \mathcal{A}, \mu)$. Fix for the moment some set $A \in \mathcal{A}$. On the one hand, since
$$|h\kappa_A| \le 1, \tag{22}$$
and $|\lambda|$ is finite, it follows that $h\kappa_A \in L^1_{\mathbb{K}}(X, \mathcal{A}, |\lambda|)$. On the other hand, since $g \in L^1_+(X, \mathcal{A}, \mu)$, using (22) we get the fact that $h\kappa_A g \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$. Using the Change of Variable formula (Proposition 4.1) we then get the equality
$$\int_X h\kappa_A \, d|\lambda| = \int_X h\kappa_A g \, d\mu,$$
which by (21) reads:
$$\lambda(A) = \int_A hg \, d\mu.$$
Now the function $f_0 = hg$ (which has $|f_0| = g$) belongs to $L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, and clearly satisfies (20).
To prove the uniqueness property (i), we start with two functions $f_1, f_2 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ which satisfy
$$\int_A f_1 \, d\mu = \int_A f_2 \, d\mu = \lambda(A), \quad \forall\, A \in \mathcal{A}.$$
If we define the function $\varphi = f_1 - f_2 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, then we clearly have
$$\int_A \varphi \, d\mu = \int_A 0 \, d\mu = \omega(A), \quad \forall\, A \in \mathcal{A},$$
where ω is the zero measure. Since $\omega \le \mu$, using Theorem 4.1 it follows that $\varphi = 0$, µ-a.e.
To prove (ii) we start with some $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ that satisfies (20), and we use the uniqueness property (i) to get the equality $f = f_0$, µ-a.e., where $f_0$ is the function constructed above. In particular, using the construction of $f_0$, the fact that $|f_0| = g$, and the fact that g is a density for $|\lambda|$ relative to µ, we get
$$\int_A |f| \, d\mu = \int_A |f_0| \, d\mu = \int_A g \, d\mu = |\lambda|(A), \quad \forall\, A \in \mathcal{A}. \qquad \square$$
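A finite example makes the statement of part A, including the meaning of “essentially unique”, quite concrete. The data below (the three-point space and the weights) are assumptions chosen for illustration and do not come from the text.

```latex
% Assumed data (illustration only): X = \{1,2,3\}, \mathcal{A} = \mathcal{P}(X),
%   \mu(\{1\}) = 2,\ \mu(\{2\}) = 1,\ \mu(\{3\}) = 0; \qquad
%   \nu(\{1\}) = 1,\ \nu(\{2\}) = 3,\ \nu(\{3\}) = 0.
% Here \nu \ll \mu, because the only \mu-null sets are \emptyset and \{3\}, and \nu(\{3\}) = 0.
\begin{align*}
f(1) &= \frac{\nu(\{1\})}{\mu(\{1\})} = \frac12, \qquad
f(2) = \frac{\nu(\{2\})}{\mu(\{2\})} = 3, \qquad
f(3) \in [0,\infty] \text{ arbitrary}; \\
\int_{\{1,2\}} f\,d\mu &= f(1)\,\mu(\{1\}) + f(2)\,\mu(\{2\}) = \tfrac12 \cdot 2 + 3 \cdot 1 = 4 = \nu(\{1,2\}).
\end{align*}
% Any two admissible densities differ only at the point 3, which is \mu-null:
% this is the equality f_1 = f_2, \mu-a.e.
```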

At this point we would like to go further, beyond the finite case. The following
generalization of Theorem 4.2 is pretty straightforward.
Corollary 4.1 (Radon-Nikodym Theorem: the σ-finite case). Let $(X, \mathcal{A}, \mu)$ be a σ-finite measure space.
A. If ν is an “honest” measure on $\mathcal{A}$, with $\nu \ll \mu$, then there exists a measurable function $f : X \to [0, \infty]$, such that
$$\nu(A) = \int_A f \, d\mu, \quad \forall\, A \in \mathcal{A}. \tag{23}$$
Moreover, such a function is essentially unique, in the sense that, whenever $f_1, f_2 : X \to [0, \infty]$ are measurable functions that satisfy (23), it follows that $f_1 = f_2$, µ-a.e.
B. Let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$. If λ is a $\mathbb{K}$-valued measure on $\mathcal{A}$, with $\lambda \ll \mu$, then there exists a function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, such that
$$\lambda(A) = \int_A f \, d\mu, \quad \forall\, A \in \mathcal{A}. \tag{24}$$
Moreover:
(i) A function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfying (24) is essentially unique, in the sense that, whenever $f_1, f_2 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfy (24), it follows that $f_1 = f_2$, µ-a.e.
(ii) If $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ is any function satisfying (24), then the variation measure $|\lambda|$ of λ is given by
$$|\lambda|(A) = \int_A |f| \, d\mu, \quad \forall\, A \in \mathcal{A}.$$
Proof. Since µ is σ-finite, there exists a sequence $(A_n)_{n=1}^\infty \subset \mathcal{A}^\mu_{\mathrm{fin}}$, with $\bigcup_{n=1}^\infty A_n = X$. Put $X_1 = A_1$ and $X_n = A_n \smallsetminus (A_1 \cup \dots \cup A_{n-1})$, $\forall\, n \ge 2$. Then $(X_n)_{n=1}^\infty \subset \mathcal{A}^\mu_{\mathrm{fin}}$ is pairwise disjoint, and we still have $\bigcup_{n=1}^\infty X_n = X$. The Corollary follows then immediately from Theorem 4.2, applied to the measure spaces $(X_n, \mathcal{A}|_{X_n}, \mu|_{X_n})$ and the measures $\nu|_{X_n}$ and $\lambda|_{X_n}$ respectively. What is used here is the fact that, if K denotes one of the sets $[0, \infty]$, $\mathbb{R}$ or $\mathbb{C}$, then for a function $f : X \to K$ the fact that f is measurable is equivalent to the fact that $f|_{X_n}$ is measurable for each $n \ge 1$. Moreover, given two functions $f_1, f_2 : X \to K$, the condition $f_1 = f_2$, µ-a.e. is equivalent to the fact that $f_1|_{X_n} = f_2|_{X_n}$, µ-a.e., $\forall\, n \ge 1$. Finally, for $f : X \to K\ (= \mathbb{R}, \mathbb{C})$, the condition $f \in L^1_K(X, \mathcal{A}, \mu)$ is equivalent to the fact that $f|_{X_n} \in L^1_K(X_n, \mathcal{A}|_{X_n}, \mu|_{X_n})$, $\forall\, n \ge 1$, and
$$\sum_{n=1}^\infty \int_{X_n} \big| f|_{X_n} \big| \, d\big(\mu|_{X_n}\big) < \infty. \qquad \square$$

Comment. The σ-finite case of the Radon-Nikodym Theorem, given above, is in fact a particular case of a more general version (Theorem 4.3 below). In order to formulate this, we need a concept which has already appeared earlier in III.5. Recall that a measure space $(X, \mathcal{A}, \mu)$ is said to be decomposable, if there exists a pairwise disjoint subcollection $\mathcal{F} \subset \mathcal{A}^\mu_{\mathrm{fin}}$, such that
(i) $\bigcup_{F \in \mathcal{F}} F = X$;
(ii) for a set $A \subset X$, the condition $A \in \mathcal{A}$ is equivalent to the condition $A \cap F \in \mathcal{A}$, $\forall\, F \in \mathcal{F}$;
(iii) one has the equality
$$\mu(A) = \sum_{F \in \mathcal{F}} \mu(A \cap F), \quad \forall\, A \in \mathcal{A}^\mu_{\mathrm{fin}}.$$
Such a collection $\mathcal{F}$ is then called a decomposition of $(X, \mathcal{A}, \mu)$. Condition (ii) is referred to as the patching property, because it characterizes measurability as follows.
(p) Given a measurable space $(Y, \mathcal{B})$, a function $f : (X, \mathcal{A}) \to (Y, \mathcal{B})$ is measurable, if and only if all restrictions $f|_F : (F, \mathcal{A}|_F) \to (Y, \mathcal{B})$, $F \in \mathcal{F}$, are measurable.
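Theorem 4.3 (Radon-Nikodym Theorem: the decomposable case). Let $(X, \mathcal{A}, \mu)$ be a decomposable measure space. Let $\mathcal{A}^\mu_{\sigma\text{-fin}}$ be the collection of all µ-σ-finite sets in $\mathcal{A}$, that is,
$$\mathcal{A}^\mu_{\sigma\text{-fin}} = \Big\{ A \in \mathcal{A} : \text{there exists } (A_n)_{n=1}^\infty \subset \mathcal{A}^\mu_{\mathrm{fin}}, \text{ with } A = \bigcup_{n=1}^\infty A_n \Big\}.$$
A. If ν is an “honest” measure on $\mathcal{A}$, with $\nu \ll \mu$, then there exists a measurable function $f : X \to [0, \infty]$, such that
$$\nu(A) = \int_A f \, d\mu, \quad \forall\, A \in \mathcal{A}^\mu_{\sigma\text{-fin}}. \tag{25}$$
Moreover, such a function is locally essentially unique, in the sense that, whenever $f_1, f_2 : X \to [0, \infty]$ are measurable functions that satisfy (25), it follows that $f_1 = f_2$, µ-l.a.e.
B. Let $\mathbb{K}$ be either $\mathbb{R}$ or $\mathbb{C}$. If λ is a $\mathbb{K}$-valued measure on $\mathcal{A}$, with $\lambda \ll \mu$, then there exists a function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, such that
$$\lambda(A) = \int_A f \, d\mu, \quad \forall\, A \in \mathcal{A}^\mu_{\sigma\text{-fin}}. \tag{26}$$
Moreover:
(i) A function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfying (26) is essentially unique, in the sense that, whenever $f_1, f_2 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ satisfy (26), it follows that $f_1 = f_2$, µ-a.e.
(ii) If $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ is any function satisfying (26), then the variation measure $|\lambda|$, of λ, satisfies
$$|\lambda|(A) = \int_A |f| \, d\mu, \quad \forall\, A \in \mathcal{A}^\mu_{\sigma\text{-fin}}.$$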

Proof. Fix $\mathcal{F}$ to be a decomposition for $(X, \mathcal{A}, \mu)$.
A. For every $F \in \mathcal{F}$, we apply Theorem 4.2 to the measure space $(F, \mathcal{A}|_F, \mu|_F)$ and the measure $\nu|_F$, to find some measurable function $f_F : F \to [0, \infty]$, such that
$$\nu(A) = \int_A f_F \, d\mu, \quad \forall\, A \in \mathcal{A}|_F.$$
Using the patching property, there exists a measurable function $f : X \to [0, \infty]$, such that $f|_F = f_F$, $\forall\, F \in \mathcal{F}$. The key feature we are going to prove is a particular case of (25).
Claim 1: $\nu(A) = \int_A f \, d\mu$, $\forall\, A \in \mathcal{A}^\mu_{\mathrm{fin}}$.

Fix $A \in \mathcal{A}^\mu_{\mathrm{fin}}$. On the one hand, we know that
$$\mu(A) = \sum_{F \in \mathcal{F}} \mu(A \cap F).$$
Since the sum is finite, it follows that the subcollection
$$\mathcal{F}(A) = \{ F \in \mathcal{F} : \mu(A \cap F) > 0 \}$$
is at most countable. We then form the set $\tilde{A} = \bigcup_{F \in \mathcal{F}(A)} [A \cap F]$, which is clearly a subset of A. The difference $D = A \smallsetminus \tilde{A}$ has again $\mu(D) < \infty$, so its measure is also given as
$$\mu(D) = \sum_{F \in \mathcal{F}} \mu(D \cap F).$$
Notice however that we have $\mu(D \cap F) = 0$, $\forall\, F \in \mathcal{F}$. (If $F \in \mathcal{F}(A)$, we already have $D \cap F = \emptyset$, whereas if $F \in \mathcal{F} \smallsetminus \mathcal{F}(A)$, we have $D \cap F \subset A \cap F$, with $\mu(A \cap F) = 0$.) Using then the above equality, we get $\mu(D) = 0$. By absolute continuity we also get $\nu(D) = 0$. Using the equality $A = \tilde{A} \cup D$, and σ-additivity (it is essential here that $\mathcal{F}(A)$ is countable), it follows that
$$\nu(A) = \nu(\tilde{A}) = \sum_{F \in \mathcal{F}(A)} \nu(A \cap F).$$

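Using the construction of f (recall that $f|_F = f_F$ is a density for $\nu|_F$), we then get
$$\nu(A) = \sum_{F \in \mathcal{F}(A)} \int_{A \cap F} f \, d\mu. \tag{27}$$
Now if we list $\mathcal{F}(A) = \{F_k\}_{k=1}^\infty$, and if we take a partial sum, we have
$$\sum_{k=1}^n \int_{A \cap F_k} f \, d\mu = \int_{G_n} f \, d\mu = \int_X f\kappa_{G_n} \, d\mu,$$
where
$$G_n = \bigcup_{k=1}^n [A \cap F_k], \quad \forall\, n \ge 1.$$
It is clear that we have
• $f\kappa_{G_1} \le f\kappa_{G_2} \le \dots$,
• $\lim_{n\to\infty} (f\kappa_{G_n})(x) = (f\kappa_{\tilde{A}})(x)$, $\forall\, x \in X$,
so using the Monotone Convergence Theorem, it follows that
$$\lim_{n\to\infty} \int_X f\kappa_{G_n} \, d\mu = \int_X f\kappa_{\tilde{A}} \, d\mu = \int_{\tilde{A}} f \, d\mu.$$
Using (27) we then get
$$\nu(A) = \lim_{n\to\infty} \int_X f\kappa_{G_n} \, d\mu = \int_{\tilde{A}} f \, d\mu.$$
On the other hand, since $\mu(A \smallsetminus \tilde{A}) = 0$, it follows that
$$\int_A f \, d\mu = \int_{\tilde{A}} f \, d\mu,$$
so the preceding equality immediately gives the desired equality
$$\nu(A) = \int_A f \, d\mu.$$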
At this point let us remark that the local almost uniqueness of f already follows from Remark 4.1.
Let us prove now the equality (25). Start with some set $A \in \mathcal{A}^\mu_{\sigma\text{-fin}}$, and choose a sequence $(A_n)_{n=1}^\infty \subset \mathcal{A}^\mu_{\mathrm{fin}}$, such that $A = \bigcup_{n=1}^\infty A_n$. Define the sequence $(B_n)_{n=1}^\infty$ by
$$B_n = A_1 \cup \dots \cup A_n, \quad \forall\, n \ge 1,$$
so that we still have $B_n \in \mathcal{A}^\mu_{\mathrm{fin}}$, $\forall\, n \ge 1$, as well as $A = \bigcup_{n=1}^\infty B_n$, but moreover we have $B_1 \subset B_2 \subset \dots$. For each $n \ge 1$, using Claim 1, we have the equality
$$\nu(B_n) = \int_{B_n} f \, d\mu.$$
Using these equalities, combined with
• $0 \le f\kappa_{B_1} \le f\kappa_{B_2} \le \dots$,
• $\lim_{n\to\infty} (f\kappa_{B_n})(x) = (f\kappa_A)(x)$, $\forall\, x \in X$,
the Monotone Convergence Theorem, combined with the continuity of ν along the increasing sequence $(B_n)_{n=1}^\infty$, yields
$$\int_A f \, d\mu = \int_X f\kappa_A \, d\mu = \lim_{n\to\infty} \int_X f\kappa_{B_n} \, d\mu = \lim_{n\to\infty} \int_{B_n} f \, d\mu = \lim_{n\to\infty} \nu(B_n) = \nu(A).$$

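B. We start off by choosing a measurable function $h : X \to \mathbb{K}$, with $|h| = 1$, such that
$$\lambda(A) = \int_A h \, d|\lambda|, \quad \forall\, A \in \mathcal{A}.$$
Using part A, there exists some measurable function $g_0 : X \to [0, \infty]$, such that
$$|\lambda|(A) = \int_A g_0 \, d\mu, \quad \forall\, A \in \mathcal{A}^\mu_{\sigma\text{-fin}}. \tag{28}$$
At this point, $g_0$ may not be integrable, but we have the freedom to perturb it (µ-l.a.e.) to try to make it integrable. This is done as follows. Consider the collection
$$\mathcal{F}_0 = \{ F \in \mathcal{F} : |\lambda|(F) > 0 \}.$$
Since $|\lambda|$ is finite, it follows that $\mathcal{F}_0$ is at most countable. Define then the set $X_0 = \bigcup_{F \in \mathcal{F}_0} F \in \mathcal{A}^\mu_{\sigma\text{-fin}}$. Since $X_0$ is µ-σ-finite, every set $A \in \mathcal{A}$ with $A \subset X_0$ is µ-σ-finite, so we have
$$|\lambda|(A) = \int_A g_0 \, d\mu, \quad \forall\, A \in \mathcal{A}|_{X_0}.$$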

Applying the σ-finite version of the Radon-Nikodym Theorem to the σ-finite measure space $(X_0, \mathcal{A}|_{X_0}, \mu|_{X_0})$ and the finite measure $\lambda|_{X_0}$, it follows that the density $g_0|_{X_0}$ belongs to $L^1_+(X_0, \mathcal{A}|_{X_0}, \mu|_{X_0})$, which means that the function $g = g_0\kappa_{X_0}$ belongs to $L^1_+(X, \mathcal{A}, \mu)$. With this choice of g, let us prove now that the equality (28) still holds, with g in place of $g_0$. Exactly as in the proof of part A, it suffices to prove only the equality
$$|\lambda|(A) = \int_A g \, d\mu, \quad \forall\, A \in \mathcal{A}^\mu_{\mathrm{fin}}. \tag{29}$$

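Claim 2: $|\lambda|(A) = |\lambda|(A \cap X_0)$, $\forall\, A \in \mathcal{A}^\mu_{\sigma\text{-fin}}$.
Since (use the fact that $|\lambda|$ is finite) the equality is equivalent to
$$|\lambda|(A \smallsetminus X_0) = 0, \quad \forall\, A \in \mathcal{A}^\mu_{\sigma\text{-fin}},$$
it suffices to prove it only for $A \in \mathcal{A}^\mu_{\mathrm{fin}}$. If $A \in \mathcal{A}^\mu_{\mathrm{fin}}$, using the properties of the decomposition $\mathcal{F}$, we have
$$\begin{aligned}
|\lambda|(A) &= \sum_{F \in \mathcal{F}} |\lambda|(A \cap F) = \sum_{F \in \mathcal{F}_0} |\lambda|(A \cap F) + \sum_{F \in \mathcal{F} \smallsetminus \mathcal{F}_0} |\lambda|(A \cap F) \\
&= |\lambda|\Big( \bigcup_{F \in \mathcal{F}_0} [A \cap F] \Big) + \sum_{F \in \mathcal{F} \smallsetminus \mathcal{F}_0} |\lambda|(A \cap F) \\
&= |\lambda|(A \cap X_0) + \sum_{F \in \mathcal{F} \smallsetminus \mathcal{F}_0} |\lambda|(A \cap F).
\end{aligned}$$
Notice now that, for $F \in \mathcal{F} \smallsetminus \mathcal{F}_0$, we have $|\lambda|(F) = 0$, which gives $|\lambda|(A \cap F) = 0$, so the Claim follows immediately from the above computation.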
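Having proven the above Claim, let us prove now (29). Fix $A \in \mathcal{A}^\mu_{\mathrm{fin}}$. The desired equality is now immediate from Claim 2, combined with (28):
$$|\lambda|(A) = |\lambda|(A \cap X_0) = \int_{A \cap X_0} g_0 \, d\mu = \int_X g_0\kappa_{A \cap X_0} \, d\mu = \int_X g_0\kappa_{X_0}\kappa_A \, d\mu = \int_X g\kappa_A \, d\mu = \int_A g \, d\mu.$$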

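Define now the function $f_0 = hg$. Since $|f_0| = g \in L^1_+(X, \mathcal{A}, \mu)$, it follows that $f_0 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$. Let us prove that $f_0$ satisfies the equality (26). Start with some $A \in \mathcal{A}^\mu_{\sigma\text{-fin}}$. On the one hand, using Claim 2, we have
$$|\lambda(A \smallsetminus X_0)| \le |\lambda|(A \smallsetminus X_0) = 0,$$
so we get $\lambda(A) = \lambda(A \cap X_0)$. Using the σ-finite version of the Radon-Nikodym Theorem for $(X_0, \mathcal{A}|_{X_0}, \mu|_{X_0})$ and $\lambda|_{X_0}$, we then have
$$\lambda(A) = \lambda(A \cap X_0) = \int_{A \cap X_0} hg_0 \, d\mu = \int_X hg_0\kappa_{A \cap X_0} \, d\mu = \int_X hg_0\kappa_{X_0}\kappa_A \, d\mu = \int_X hg\kappa_A \, d\mu = \int_A hg \, d\mu = \int_A f_0 \, d\mu.$$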

We now prove the uniqueness property (i) of f (µ-a.e.!). Assume $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ is another function, such that
$$\lambda(A) = \int_A f \, d\mu, \quad \forall\, A \in \mathcal{A}^\mu_{\sigma\text{-fin}}.$$
Claim 3: $f = f_0$, µ-l.a.e.
What we need to show here is the fact that
$$f\kappa_B = f_0\kappa_B, \ \mu\text{-a.e.}, \quad \forall\, B \in \mathcal{A}^\mu_{\mathrm{fin}}.$$
But this follows immediately from the uniqueness from part B of Theorem 4.2, applied to the finite measure space $(B, \mathcal{A}|_B, \mu|_B)$ and the measure $\lambda|_B$, which has both $f|_B$ and $f_0|_B$ as densities.
Using Claim 3, we now have $f - f_0 \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$, with $f - f_0 = 0$, µ-l.a.e., so we can apply Proposition 4.4, which forces $f - f_0 = 0$, µ-a.e., so we indeed get $f = f_0$, µ-a.e.
Property (ii) is obvious, since by (i), any function $f \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$ that satisfies (26) automatically satisfies $|f| = |f_0| = g$, µ-a.e. □

Comment. One should be aware of the (severe) limitations of Theorem 4.3, notably the fact that the equalities (25) and (26) hold only for $A \in \mathcal{A}^\mu_{\sigma\text{-fin}}$. For example, consider the measure space $(X, \mathcal{P}(X), \mu)$, with X uncountable, and µ defined by
$$\mu(A) = \begin{cases} \infty & \text{if } A \text{ is uncountable} \\ 0 & \text{if } A \text{ is countable} \end{cases}$$
This measure space is decomposable, with a decomposition consisting of singletons: $\mathcal{F} = \big\{ \{x\} : x \in X \big\}$. For a measure ν on $\mathcal{P}(X)$, the condition $\nu \ll \mu$ means precisely that $\nu(A) = 0$ for all countable subsets $A \subset X$. In this case the equality (25) says practically nothing, since it is restricted solely to countable sets $A \subset X$, when both sides are zero.
In this example, it is also instructive to analyze the case when ν is finite (see part B in Theorem 4.3). If we follow the proof of the Theorem, we see that at some point we have constructed a certain set $X_0 = \bigcup_{F \in \mathcal{F}_0} F$, where $\mathcal{F}_0 = \{ F \in \mathcal{F} : \nu(F) > 0 \}$. In our situation however it turns out that $X_0 = \emptyset$. This example brings up a very interesting question, which turns out to sit at the very foundation of set theory.
Question: Does there exist an uncountable set X, and a finite measure ν on $\mathcal{P}(X)$, such that $\nu(X) > 0$, but $\nu(A) = 0$, for every countable subset $A \subset X$?
(The above vanishing condition is of course equivalent to the fact that $\nu(\{x\}) = 0$, $\forall\, x \in X$.) It turns out not only that the answer to this question is unknown, but in fact several mathematicians are seriously thinking of proposing it as an axiom to be added to the current system of axioms used in set theory!
The limitations of Theorem 4.3 also force limitations in the Change of Variables property (see Proposition 4.1), which in this case has the following statement.
Proposition 4.6 (Local Change of Variables). Let $(X, \mathcal{A}, \mu)$ be a measure space, let ν be a measure on $\mathcal{A}$, and let $f : X \to [0, \infty]$ be a measurable function.
A. The following are equivalent:
(i) one has
$$\nu(A) = \int_A f \, d\mu, \quad \forall\, A \in \mathcal{A}^\mu_{\sigma\text{-fin}};$$
(ii) for every measurable function $h : X \to [0, \infty]$, with the property that the set $E_h = \{x \in X : h(x) \ne 0\}$ belongs to $\mathcal{A}^\mu_{\sigma\text{-fin}}$, one has the equality
$$\int_X h \, d\nu = \int_X hf \, d\mu. \tag{30}$$
B. If ν and f are as above, and $\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$, then the equality (30) also holds for those measurable functions $h : X \to \mathbb{K}$ with $E_h \in \mathcal{A}^\mu_{\sigma\text{-fin}}$, for which $h \in L^1_{\mathbb{K}}(X, \mathcal{A}, \nu)$ and $hf \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$.

Proof. A. (i) ⇒ (ii). Assume (i) holds. Start with some measurable function $h : X \to [0, \infty]$, such that the set $E_h = \{x \in X : h(x) \ne 0\}$ belongs to $\mathcal{A}^\mu_{\sigma\text{-fin}}$. The equality (30) is then immediate from Proposition 4.1, applied to the measure space $(E_h, \mathcal{A}|_{E_h}, \mu|_{E_h})$, and the measure $\nu|_{E_h}$, which has density $f|_{E_h}$.
(ii) ⇒ (i). Assume (ii) holds. If we start with some $A \in \mathcal{A}^\mu_{\sigma\text{-fin}}$, then obviously the measurable function $h = \kappa_A$ will have $E_h = A$, so by (ii) we immediately get
$$\nu(A) = \int_X \kappa_A \, d\nu = \int_X \kappa_A f \, d\mu = \int_A f \, d\mu.$$
B. Assume now ν and f satisfy the equivalent conditions (i) and (ii). Suppose $h : X \to \mathbb{K}$ is measurable, with $E_h \in \mathcal{A}^\mu_{\sigma\text{-fin}}$, such that $h \in L^1_{\mathbb{K}}(X, \mathcal{A}, \nu)$ and $hf \in L^1_{\mathbb{K}}(X, \mathcal{A}, \mu)$. Then the equality (30) follows again from Proposition 4.1, applied to the measure space $(E_h, \mathcal{A}|_{E_h}, \mu|_{E_h})$, and the measure $\nu|_{E_h}$, which has density $f|_{E_h}$. □
Appendix A

Zorn Lemma
In this Appendix we review basic set theoretical results, which are consequences
of the following postulate:
Axiom of Choice. Given any non-empty collection¹⁷ $\{X_i : i \in I\}$ of non-empty sets, the cartesian product
$$\prod_{i \in I} X_i$$
is non-empty.
Recall that the cartesian product is defined as
$$\prod_{i \in I} X_i = \Big\{ f : I \to \bigcup_{i \in I} X_i \;:\; f(i) \in X_i, \ \forall\, i \in I \Big\}.$$
In order to formulate several consequences of the Axiom of Choice, we need several concepts.
Definitions. Given a set X, by a relation on X one means simply a subset
R ⊂ X × X. The standard notation for relations is:
xRy ⇐⇒ (x, y) ∈ R.
An order relation on X is a relation ≺ with the following properties:
• x ≺ x, ∀ x ∈ X;
• if x, y, z ∈ X satisfy x ≺ y and y ≺ z, then x ≺ z;
• if x, y ∈ X satisfy x ≺ y and y ≺ x, then x = y.
In this case the pair (X, ≺) is called an ordered set.
An ordered set (X, ≺) is said to be totally ordered, if
• for any elements x, y ∈ X one has either x ≺ y or y ≺ x.
More generally, given an (arbitrary) ordered set (X, ≺), by a totally ordered subset of (X, ≺), one means a subset T ⊂ X, which becomes totally ordered with respect to the restricted order relation $\prec|_T$.
Example A.1. Fix a set M , and take X to be the collection of all subsets of
M . Then X carries a natural order relation defined by inclusion:
A ≺ B ⇐⇒ A ⊂ B.
A totally ordered subset $\mathcal{C}$ of (X, ⊂) is called a chain of subsets of M. Two subsets A, B ⊂ M will be said to be comparable, if either A ⊂ B, or B ⊂ A, i.e. the collection {A, B} is a chain of subsets of M.
Definition. Let M be a set. A collection $\mathcal{F}$ of subsets of M is said to have the chain property, if
(c) whenever $\mathcal{C} \subset \mathcal{F}$ is a chain, it follows that the union $\bigcup_{C \in \mathcal{C}} C$ also belongs to $\mathcal{F}$.

¹⁷ By a “collection of sets” one simply means a set whose elements are sets themselves.
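Lemma A.1. Let M be a set, and let $\mathcal{F}$ be a collection of subsets of M with the chain property. For every set $A \in \mathcal{F}$, the collection
$$\operatorname{comp}(A; \mathcal{F}) = \{ B \in \mathcal{F} : B \text{ comparable to } A \}$$
has the chain property.
Proof. Let $\mathcal{C} \subset \operatorname{comp}(A; \mathcal{F})$ be a chain, and put $T = \bigcup_{C \in \mathcal{C}} C$. Since $\mathcal{F}$ has the chain property, we have $T \in \mathcal{F}$. To show that T is comparable with A, we consider the two possibilities:
Case 1: $A \supset C$, for all $C \in \mathcal{C}$. In this case we have $A \supset \bigcup_{C \in \mathcal{C}} C = T$.
Case 2: There exists $C_0 \in \mathcal{C}$, such that $A \subset C_0$. In this case we have $A \subset C_0 \subset T$. □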
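Lemma A.2. Let M be some non-empty set, and let $\mathcal{F}$ be a non-empty collection of subsets of M, with the chain property. Suppose one has a map
$$\mathcal{F} \ni A \longmapsto x_A \in M,$$
with the property that
$$A \cup \{x_A\} \in \mathcal{F}, \quad \forall\, A \in \mathcal{F}.$$
Then there exists $A \in \mathcal{F}$ such that $x_A \in A$.
Proof. For each $A \in \mathcal{F}$ we define $A^+ = A \cup \{x_A\}$. Call a subset $\mathcal{G} \subset \mathcal{F}$ inductive, if it has the chain property, and
(+) $A \in \mathcal{G} \Rightarrow A^+ \in \mathcal{G}$.
It is quite clear that if $\mathcal{G}_i$, $i \in I$, is a collection of inductive subsets of $\mathcal{F}$, then the intersection $\bigcap_{i \in I} \mathcal{G}_i$ is again an inductive subset of $\mathcal{F}$.
Fix now some subset $A_0 \in \mathcal{F}$, and define
$$\mathcal{G}_0 = \bigcap_{\substack{\mathcal{G} \text{ inductive} \\ A_0 \in \mathcal{G}}} \mathcal{G}.$$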

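Note that the subset $\mathcal{F}_0 = \{A \in \mathcal{F} : A \supset A_0\}$ is an inductive subset of $\mathcal{F}$, so in particular, $\mathcal{G}_0$ is non-empty, and $\mathcal{G}_0 \subset \mathcal{F}_0$, i.e.
$$A \supset A_0, \quad \forall\, A \in \mathcal{G}_0. \tag{1}$$
Claim: The set $\mathcal{G}_0$ is a chain.
What we need to prove is the fact that $\mathcal{G}_0$ is totally ordered by inclusion. Consider the set
$$\mathcal{T} = \{ T \in \mathcal{G}_0 : T \text{ is comparable with every } A \in \mathcal{G}_0 \} = \bigcap_{A \in \mathcal{G}_0} \operatorname{comp}(A; \mathcal{G}_0),$$
and we try to prove that $\mathcal{T} = \mathcal{G}_0$. By Lemma A.1 it is clear that $\mathcal{T}$ has the chain property. Using (1), it is clear that $A_0 \in \mathcal{T}$. Finally, we need to prove property (+). We prove this indirectly as follows. Fix $T \in \mathcal{T}$, consider the collection
$$\mathcal{V}_T = \operatorname{comp}(T^+; \mathcal{G}_0) = \{ A \in \mathcal{G}_0 : A \text{ comparable with } T^+ \},$$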
and let us prove that $\mathcal{V}_T = \mathcal{G}_0$, by showing that $\mathcal{V}_T$ is an inductive set, and contains $A_0$. First of all, by Lemma A.1, it follows that $\mathcal{V}_T$ has the chain property. Secondly, using (1) we have $A_0 \subset T \subset T^+$, so $A_0 \in \mathcal{V}_T$. Finally, to check property (+), we start with some $V \in \mathcal{V}_T$, and we show that $V^+ \in \mathcal{V}_T$. In the case when $T^+ \subset V$, we are done, because we have $T^+ \subset V \subset V^+$. Assume $T^+ \not\subset V$, so that we have $V \subsetneq T^+$. Since T is comparable with V, and $T^+$ has at most one element more than T, this forces $V \subset T$. Since T is comparable with $V^+$, we either have $V^+ \subset T$, in which case we are done, or we have $T \subset V^+$. In the latter case, we have
$$V \subset T \subset V^+.$$
Since $V^+ = V \cup \{x_V\}$, the above inclusions force either $T = V$, which gives $T^+ = V^+$, or $T = V^+$. Clearly, either case gives $V^+ \in \mathcal{V}_T$. Having shown that $\mathcal{V}_T$ is inductive, the inclusion $\mathcal{V}_T \subset \mathcal{G}_0$ will force the equality $\mathcal{V}_T = \mathcal{G}_0$. In turn, the definition of $\mathcal{V}_T$ proves that $T^+ \in \mathcal{T}$, so $\mathcal{T}$ is indeed inductive. Finally, the inclusion $\mathcal{T} \subset \mathcal{G}_0$ then forces $\mathcal{T} = \mathcal{G}_0$, and by the definition of $\mathcal{T}$, it follows that $\mathcal{G}_0$ is indeed a chain.
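Having proven the Claim, we now take $A = \bigcup_{G \in \mathcal{G}_0} G$. Since $\mathcal{G}_0$ is a chain and has the chain property, it follows that $A \in \mathcal{G}_0$. By construction we have
$$A \supset G, \quad \forall\, G \in \mathcal{G}_0.$$
Since $\mathcal{G}_0$ is inductive, we have $A^+ \in \mathcal{G}_0$, so in particular $A \supset A^+$, which clearly forces $x_A \in A$. □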
Definitions. Let (X, ≺) be an ordered set. By a maximal element for X one
means an element x ∈ X with the property:
{y ∈ X : x ≺ y} = {x}.
In other words, this means that there is no element y ∈ X, with x ≺ y and y ≠ x.
Given a subset S ⊂ X, an element x ∈ X is said to be an upper bound for S, if
s ≺ x, ∀ s ∈ S.
If such an x exists, we say that S has an upper bound. (It is not assumed that x
belongs to S!)
Lemma A.3 (“Easy” Zorn Lemma). Let M be a set, and let F be a collection
of subsets of M . Assume
• the Axiom of Choice is true;
• F has the chain property;
• F is hereditary, in the sense that, whenever A ∈ F, it follows that all
subsets of A belong to F.
Then, when equipped with the inclusion relation, (F, ⊂) has at least one maximal
element.
Proof. The proof will be carried out by contradiction. Assume no $A \in \mathcal{F}$ is maximal. For each $A \in \mathcal{F}$, define
$$X_A = \{ x \in M \smallsetminus A : A \cup \{x\} \in \mathcal{F} \}.$$
Claim: For every $A \in \mathcal{F}$, the set $X_A$ is non-empty.
Indeed, since A is not maximal, there exists some $B \in \mathcal{F}$, with $A \subsetneq B$. In particular, there exists some $x \in B \smallsetminus A$, and since $A \cup \{x\} \subset B$, by the hereditary property, it follows that $x \in X_A$.
Use now the Axiom of Choice, to find a map
$$\mathcal{F} \ni A \longmapsto x_A \in M,$$
such that $x_A \in X_A$, $\forall\, A \in \mathcal{F}$. This means that $A \cup \{x_A\} \in \mathcal{F}$, and $x_A \notin A$, for all $A \in \mathcal{F}$. By Lemma A.2 this is however impossible. □

Theorem A.1 (Zorn Lemma). Assume the Axiom of Choice is true. Let (X, ≺)
be a non-empty ordered set, with the following property
(z) every totally ordered subset A ⊂ X has an upper bound.
Then X has at least one maximal element.
Proof. Define the collection
F = {A ⊂ X : A totally ordered subset}.
Clearly F is non-empty (it contains, for instance, all singletons).
It is quite clear that F satisfies the hypothesis of Lemma A.3. So (F, ⊂) has a
maximal element A. Take now x to be an upper bound for A, i.e. a ≺ x, ∀ a ∈ A.
Now we prove that x is maximal for (X, ≺). Suppose y ∈ X satisfies x ≺ y.
Then clearly A ∪ {y} will still be a totally ordered subset of X, i.e. A ∪ {y} ∈ F.
The maximality of A in (F, ⊂) will force A ∪ {y} = A, so we get y ∈ A, hence y ≺ x.
Since we also have x ≺ y, this forces y = x. 
Appendix B

Cardinal Arithmetic
In this Appendix we discuss cardinal arithmetic. We assume the Axiom of
Choice is true.
Definitions. Two sets A and B are said to have the same cardinality, if there
exists a bijective map A → B. It is clear that this defines an equivalence relation
on the class¹⁸ of all sets.
A cardinal number is thought of as an equivalence class of sets. In other words, if we write a cardinal number as a, it is understood that a consists of all sets of a given cardinality. So when we write card A = a we understand that A belongs to this class, and for another set B we write card B = a, exactly when B has the same cardinality as A. In this case we write card B = card A.
Notations. The cardinality of the empty set ∅ is zero. More generally the
cardinality of a finite set is equal to its number of elements. The cardinality of the
set N, of all natural numbers, is denoted by ℵ0 .
Definition. Let a and b be cardinal numbers. We write a ≤ b if there exist
sets A ⊂ B with card A = a and card B = b.
This is equivalent to the fact that, for any sets A and B, with card A = a and
card B = b, one of the following equivalent conditions holds:
• there exists an injective function f : A → B;
• there exists a surjective function g : B → A.
For two cardinal numbers a and b, we use the notation a < b to indicate that a ≤ b and a ≠ b.
Theorem B.1 (Cantor-Bernstein). Suppose two cardinal numbers a and b sat-
isfy a ≤ b and b ≤ a. Then a = b.
Proof. Fix two sets A and B with card A = a and card B = b, so there exist
injective functions f : A → B and g : B → A. We shall construct a bijective
function h : A → B. Define the sets
A0 = A ∖ g(B) and B0 = B ∖ f(A).
Then define recursively the sequences (An)n≥0 and (Bn)n≥0 by
An = g(Bn−1) and Bn = f(An−1), ∀ n ≥ 1.
Claim 1: One has Am ∩ An = Bm ∩ Bn = ∅, ∀ m > n ≥ 0.
Let us first observe that the case when n = 0 is trivial, since we have the inclusions
Am = g(Bm−1) ⊂ g(B) = A ∖ A0 and Bm = f(Am−1) ⊂ f(A) = B ∖ B0. Next we
prove the desired property by induction on m. The case m = 1 is clear (this forces
n = 0). Suppose the statement is true for m = k, and let us prove it for m = k + 1.
18 The term class is used, because there is no such thing as the “set of all sets.”
Start with some n < k + 1. If n = 0, we are done, by the above discussion. Assume
now n ≥ 1. Since f and g are injective, the inductive hypothesis gives
Ak+1 ∩ An = g(Bk) ∩ g(Bn−1) = g(Bk ∩ Bn−1) = ∅,
Bk+1 ∩ Bn = f(Ak) ∩ f(An−1) = f(Ak ∩ An−1) = ∅,
and we are done.
Put C = A ∖ ⋃_{n≥0} An and D = B ∖ ⋃_{n≥0} Bn.
Claim 2: One has the equality f (C) = D.
First we prove the inclusion f(C) ⊂ D. Start with some point c ∈ C, but assume
f(c) ∉ D. This means that there exists some n ≥ 0 such that f(c) ∈ Bn. Since
f(c) ∈ f(A) = B ∖ B0, we must have n ≥ 1. But then we get f(c) ∈ Bn = f(An−1),
and the injectivity of f will force c ∈ An−1, which is impossible.
Second, we prove that D ⊂ f(C). Start with some d ∈ D. First of all, since
D ⊂ B ∖ B0 = f(A), there exists some c ∈ A with d = f(c). If c ∉ C, then there
exists some n ≥ 0, such that c ∈ An, and then we would get d = f(c) ∈ f(An) =
Bn+1, which is impossible.
We now begin constructing the desired bijection. First we define φ : ⋃_{n≥0} Bn → B by
φ(b) = b, if b ∈ Bn and n is odd;
φ(b) = (f ◦ g)(b), if b ∈ Bn and n is even.
Claim 3: The map φ defines a bijection
φ : ⋃_{n≥0} Bn → ⋃_{n≥1} Bn.
Notice that, if n ≥ 0 is even, then φ(Bn) = f(g(Bn)) = f(An+1) = Bn+2, while if
n ≥ 0 is odd we have φ(Bn) = Bn. Since each restriction φ|_{Bn} is injective, and by
Claim 1 the sets φ(Bn), n ≥ 0, are pairwise disjoint, the map φ is injective. Moreover,
we have indeed the equality
φ(⋃_{n≥0} Bn) = ⋃_{n≥1} Bn.
Now we define ψ : ⋃_{n≥0} An → B by ψ = φ^{−1} ◦ f. (This makes sense, because
f(An) = Bn+1 is contained in the range of φ.) Clearly ψ is injective, and
ψ(⋃_{n≥0} An) = φ^{−1}(⋃_{n≥0} f(An)) = φ^{−1}(⋃_{n≥0} Bn+1) = φ^{−1}(⋃_{n≥1} Bn) = ⋃_{n≥0} Bn,
so ψ defines a bijection
ψ : ⋃_{n≥0} An → ⋃_{n≥0} Bn.
We then combine ψ with the bijection f : C → D, i.e. we define the map h : A → B
by
h(x) = ψ(x), if x ∈ ⋃_{n≥0} An;
h(x) = f(x), if x ∈ A ∖ ⋃_{n≥0} An = C.
Clearly h is injective, and
h(A) = ψ(⋃_{n≥0} An) ∪ f(C) = (⋃_{n≥0} Bn) ∪ D = B,

so h is indeed bijective. 
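The construction can be traced on a concrete example. Unwinding the definitions, h(x) = f(x) when x ∈ C or x ∈ An with n even, and h(x) = g^{−1}(x) when x ∈ An with n odd. The following Python sketch is only an illustration under stated assumptions: the sets A = B = {0, 1, 2, . . .}, the maps f(n) = g(n) = n + 1, and the names `cantor_bernstein`, `f_inv`, `g_inv`, `max_depth` are all hypothetical choices, not part of the text.

```python
def cantor_bernstein(f, f_inv, g_inv, max_depth=10_000):
    """Build h : A -> B from injections f : A -> B and g : B -> A.

    f_inv(b) and g_inv(a) return the unique preimage, or None if none exists.
    """
    def h(x):
        # Trace x backwards (x = g(y), y = f(x'), ...) and count the steps.
        depth, point, in_A = 0, x, True
        while depth < max_depth:
            prev = g_inv(point) if in_A else f_inv(point)
            if prev is None:
                break
            point, in_A, depth = prev, not in_A, depth + 1
        else:
            # Assumption: a chain this long never stops, so x is treated as a point of C.
            return f(x)
        # depth == n means x lies in A_n; for odd n the chain starts in B_0.
        return g_inv(x) if depth % 2 == 1 else f(x)
    return h

# Illustrative choice: A = B = {0, 1, 2, ...}, f(n) = g(n) = n + 1.
f = lambda n: n + 1
f_inv = lambda m: m - 1 if m >= 1 else None
g_inv = lambda m: m - 1 if m >= 1 else None

h = cantor_bernstein(f, f_inv, g_inv)
first_values = [h(x) for x in range(6)]
```

In this example An = {n} and Bn = {n}, so C is empty and h simply swaps each even number with its successor.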

Theorem B.2 (Total ordering for cardinal numbers). Let a and b be cardinal
numbers. Then one has either a ≤ b, or b ≤ a.

Proof. Choose two sets A and B with card A = a and card B = b. In order to
prove the theorem, it suffices to construct either an injective function f : A → B,
or an injective function f : B → A.
We define the set
X = {(C, D, g) : C ⊂ A, D ⊂ B, g : C → D bijection}.
We equip X with the following order relation:
(C, D, g) ≺ (C′, D′, g′) ⇐⇒ C ⊂ C′, D ⊂ D′, and g = g′|_C.
We now check that (X, ≺) satisfies the hypothesis of Zorn Lemma. Let 𝒜 ⊂ X
be a totally ordered subset, say 𝒜 = {(Ci, Di, gi) : i ∈ I}. Define C = ⋃_{i∈I} Ci,
D = ⋃_{i∈I} Di, and g : C → D to be the unique function with the property that
g|_{Ci} = gi, ∀ i ∈ I. (We use here the fact that for i, j ∈ I we either have Ci ⊂ Cj
and gj|_{Ci} = gi, or Cj ⊂ Ci and gi|_{Cj} = gj. In either case, this proves that
gi|_{Ci∩Cj} = gj|_{Ci∩Cj}, ∀ i, j ∈ I, so such a g exists.) It is then pretty clear that
(C, D, g) ∈ X and (Ci, Di, gi) ≺ (C, D, g), ∀ i ∈ I, i.e. (C, D, g) is an upper bound
for 𝒜. Use now Zorn Lemma, to find a maximal element (A0, B0, f) in X.
Claim: Either A0 = A or B0 = B.
We prove this by contradiction. If we have strict inclusions A0 ⊊ A and B0 ⊊ B,
then if we choose a ∈ A ∖ A0 and b ∈ B ∖ B0, we can define a bijection g :
A0 ∪ {a} → B0 ∪ {b} by g(a) = b and g|_{A0} = f. This would then produce a
new element (A0 ∪ {a}, B0 ∪ {b}, g) ∈ X, which would contradict the maximality of
(A0, B0, f).
The theorem now follows immediately from the Claim. If A0 = A, then f :
A → B is injective, and if B0 = B, then f^{−1} : B → A is injective.

We now define the operations with cardinal numbers.


Definitions. Let a and b be cardinal numbers.
• We define a+b = card S, where S is any set which is of the form S = A∪B
with card A = a, card B = b, and A ∩ B = ∅.
• We define a·b = card P , where P is any set which is of the form P = A×B
with card A = a and card B = b.
• We define a^b = card X, where X is any set of the form
X = ∏_{i∈I} Ai,
with card I = b and card Ai = a, ∀ i ∈ I. Equivalently, if we take two sets
A and B with card A = a, and card B = b, and if we define
A^B = ∏_B A = {f : f function from B to A},
then a^b = card(A^B).

It is pretty easy to show that these definitions are correct, in the sense that they
do not depend on the particular choices of the sets involved. Moreover, these
operations are consistent with the usual operations with natural numbers.
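For finite sets this consistency can be checked by direct counting. The following Python sketch is only an illustration: the sets `A` and `B` are arbitrary choices that do not come from the text, and functions B → A are encoded as tuples of (argument, value) pairs.

```python
from itertools import product

A = {"x", "y", "z"}          # card A = 3 (illustrative choice)
B = {0, 1}                   # card B = 2 (illustrative choice)

# Disjoint union: tag the elements so that the two copies cannot overlap.
disjoint_union = {("A", a) for a in A} | {("B", b) for b in B}

# Cartesian product A x B.
cartesian = set(product(A, B))

# A^B: every function B -> A, encoded as a tuple of (argument, value) pairs.
B_list = sorted(B)
functions = {tuple(zip(B_list, values)) for values in product(A, repeat=len(B_list))}

sum_card = len(disjoint_union)      # should agree with 3 + 2
prod_card = len(cartesian)          # should agree with 3 * 2
power_card = len(functions)         # should agree with 3 ** 2
```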
Remark B.1. The operations with cardinal numbers, defined above, satisfy:
• a + b = b + a,
• (a + b) + d = a + (b + d),
• a + 0 = a,
• a · b = b · a,
• (a · b) · d = a · (b · d),
• a · 1 = a,
• a · (b + d) = (a · b) + (a · d),
• (a · b)^d = (a^d) · (b^d),
• a^(b+d) = (a^b) · (a^d),
• (a^b)^d = a^(b·d),
for all cardinal numbers a, b, d ≥ 1.
Remark B.2. The order relation ≤ is compatible with all the operations, in
the sense that, if a1 , a2 , b1 , and b2 are cardinal numbers with a1 ≤ a2 and b1 ≤ b2 ,
then
• a1 + b1 ≤ a2 + b2,
• a1 · b1 ≤ a2 · b2,
• a1^b1 ≤ a2^b2.
Proposition B.1. Let a ≥ 1 be a cardinal number.
(i) If A is a set with card A = a, and if we define
P(A) = {B : B subset of A},
then 2^a = card P(A).
(ii) a < 2^a.
Proof. (i). Put
P = {0, 1}^A = {f : f function from A to {0, 1}},
so that 2^a = card P. We need to define a bijection φ : P → P(A). We take
φ(f) = {a ∈ A : f(a) = 1}, ∀ f ∈ P.
It is clear that, since a function f : A → {0, 1} is completely determined by the set
{a ∈ A : f(a) = 1}, the map φ is indeed bijective.
(ii). The map A ∋ a ↦ {a} ∈ P(A) is clearly injective. This proves the
inequality a ≤ 2^a. We now prove that a ≠ 2^a, by contradiction. Assume there is a
bijection θ : A → P(A). Define the set
B = {a ∈ A : a ∉ θ(a)},
and choose b ∈ A such that B = θ(b). If b ∈ B, then by construction we get
b ∉ θ(b) = B, which is impossible. If b ∉ B, we have b ∉ θ(b), which forces b ∈ B,
again an impossibility.
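The diagonal set B from part (ii) can be examined exhaustively on a small finite set. The sketch below is an illustration only (the set `A = {0, 1, 2}` and the helper names `powerset` and `diagonal_set` are hypothetical): it runs through every map θ : A → P(A) and records whether the diagonal set is ever a value of θ.

```python
from itertools import combinations, product

def powerset(A):
    items = sorted(A)
    return [frozenset(c) for k in range(len(items) + 1) for c in combinations(items, k)]

def diagonal_set(A, theta):
    """B = {a in A : a not in theta(a)}, as in the proof of (ii)."""
    return frozenset(a for a in A if a not in theta[a])

A = {0, 1, 2}
subsets = powerset(A)                      # all subsets of A
missed_every_time = True
for images in product(subsets, repeat=len(A)):
    theta = dict(zip(sorted(A), images))   # one map theta : A -> P(A)
    if diagonal_set(A, theta) in theta.values():
        missed_every_time = False
```

The flag stays true exactly when no map θ reaches its own diagonal set, which is what the argument in (ii) predicts.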
We now discuss the properties of these operations, when infinite cardinal num-
bers are used.
Lemma B.1 (Properties of ℵ0 ).
(i) For any infinite cardinal number a, one has the inequality ℵ0 ≤ a.

(ii) ℵ0 + ℵ0 = ℵ0 ;
(iii) ℵ0 · ℵ0 = ℵ0 ;
Proof. (i). Let a be an infinite cardinal number, and let A be an infinite
set with card A = a. Since for every finite subset F ⊂ A, there exists some
x ∈ A ∖ F, one can construct a sequence (xn)n∈N ⊂ A, with xm ≠ xn, ∀ m > n ≥ 1.
Then the subset B = {xn : n ∈ N} has card B = ℵ0 , so the inclusion B ⊂ A gives
the desired inequality.
(ii). Consider the sets
A0 = {n ∈ N : n even} and A1 = {n ∈ N : n odd}.
Then clearly card A0 = card A1 = ℵ0, and the equalities A0 ∪ A1 = N and A0 ∩ A1 = ∅ give
ℵ0 + ℵ0 = card A0 + card A1 = card(A0 ∪ A1) = card N = ℵ0.
(iii). Take the set P = N × N, so that ℵ0 · ℵ0 = card P . It is obvious that
card P ≥ ℵ0. To prove the other inequality, we define a surjection φ : N → P as
follows. For each n ≥ 1 we take sn = n(n − 1)/2, we set
Bn = {m ∈ N : sn < m ≤ sn+1},
and we define φn : Bn → P by
φn(m) = (n + 1 + sn − m, m − sn), ∀ m ∈ Bn.
Notice that
(1) φn(Bn) = {(p, q) ∈ N × N : p + q = n + 1}.
Notice also that ⋃_{n≥1} Bn = N, and Bj ∩ Bk = ∅, ∀ j > k ≥ 1, so there exists a
(unique) function φ : N → P, such that φ|_{Bn} = φn, for all n ≥ 1. By (1) it is clear
that φ is surjective.
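The block-by-block enumeration can be checked numerically. The following Python sketch is an illustration under an explicit assumption: N is taken to start at 1, and the formula is indexed so that both coordinates of φ(m) are at least 1, namely φ(m) = (n + 1 + sn − m, m − sn) for sn < m ≤ sn+1. The names `s`, `phi`, `values` and `expected` are hypothetical.

```python
def s(n):
    return n * (n - 1) // 2

def phi(m):
    """Map m >= 1 to a pair (p, q) with p, q >= 1, block by block."""
    n = 1
    while s(n + 1) < m:        # find the block B_n with s_n < m <= s_{n+1}
        n += 1
    return (n + 1 + s(n) - m, m - s(n))

N = 6
# Images of the blocks B_1, ..., B_N, i.e. of m = 1, ..., s_{N+1}.
values = [phi(m) for m in range(1, s(N + 1) + 1)]
# All pairs on the anti-diagonals p + q = 2, ..., N + 1.
expected = {(p, q) for p in range(1, N + 1) for q in range(1, N + 1) if p + q <= N + 1}
```

Each block Bn is sent onto the anti-diagonal p + q = n + 1, so the first N blocks cover exactly the pairs listed in `expected`, without repetition.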

Theorem B.3. Let a and b be cardinal numbers, with 1 ≤ b ≤ a, and a infinite.
Then:
(i) a + b = a;
(ii) a · b = a.
Proof. It is clear that
a ≤ a + b ≤ a + a,
a ≤ a · b ≤ a · a,
so in order to prove the theorem, we can assume that a = b.
(i). Fix some set A with card A = a. Use Zorn Lemma, to find a maximal
non-empty family {Ai : i ∈ I} of subsets of A with
(a) card Ai = ℵ0, for all i ∈ I;
(b) Ai ∩ Aj = ∅, for all i, j ∈ I with i ≠ j.
If we put B = A ∖ ⋃_{i∈I} Ai, then by maximality it follows that B is finite. In
particular, if we take i0 ∈ I then obviously card(Ai0 ∪ B) = ℵ0, so if we replace
Ai0 with Ai0 ∪ B, we will still have the above properties (a) and (b), but also
A = ⋃_{i∈I} Ai. This proves that a = card A = ℵ0 · d, where d = card I. In other
words, we have a = card(N × I). Consider then the sets
C0 = {n ∈ N : n even} and C1 = {n ∈ N : n odd},

so that (C0 × I) ∪ (C1 × I) = N × I, and (C0 × I) ∩ (C1 × I) = ∅. In particular,
we get
a = card(C0 × I) + card(C1 × I) =
= (card C0 ) · (card I) + (card C1 ) · (card I) =
= ℵ0 · d + ℵ0 · d = a + a.
(ii). Fix A a set with card A = a. We are going to employ Zorn Lemma to find
a bijection A → A × A. Define
X = {(D, f) : D ⊂ A, f : D → D × D bijective}.
Equip X with the following order:
(D, f) ≺ (D′, f′) ⇐⇒ D ⊂ D′ and f = f′|_D.
Notice that X is non-empty, since we can find at least one set D ⊂ A with card D =
ℵ0, and by Lemma B.1 such a set admits a bijection D → D × D. We now check
that X satisfies the hypothesis of Zorn Lemma. Let T =
{(Di, fi) : i ∈ I} be a totally ordered subset of X. It is fairly clear that if one takes
D = ⋃_{i∈I} Di and one defines f : D → D × D as the unique function with f|_{Di} = fi,
∀ i ∈ I, then f is injective, and
f(D) = ⋃_{i∈I} f(Di) = ⋃_{i∈I} fi(Di) = ⋃_{i∈I} (Di × Di) = D × D,
so the pair (D, f) indeed belongs to X, and is an upper bound for T.

Use Zorn Lemma to produce a maximal element (D, f ) ∈ X. Notice that, if we
take d = card D, then by construction we have
(2) d · d = d.
We would like to prove that D = A. In general this is not the case (for example,
when A = N, every (D, f) ∈ X, with N ∖ D finite, is automatically maximal). We
notice however that all we need to show is the equality
(3) d = a.
We prove this equality by contradiction. We know that we already have d ≤ a.
Suppose d < a. Put G = A ∖ D, and notice that d + card G = a. Since d < a, by (i) we
see that we must have the equality card G = a. Then there exists a subset E ⊂ G
with card E = d. Consider the set
P = (E × E) ∪ (E × D) ∪ (D × E).
Since E ∩ D = ∅, the three sets above are pairwise disjoint, so using (2) combined
again with part (i), we get
card P = card(E × E) + card(E × D) + card(D × E) =
= d · d + d · d + d · d = d + d + d = d = card E.
This means that there exists a bijection g : E → P, which combined with the fact
that E ∩ D = P ∩ (D × D) = ∅, will produce a bijection h : D ∪ E → P ∪ (D × D),
such that h|_D = f and h|_E = g. Since we have P ∪ (D × D) = (D ∪ E) × (D ∪ E),
the pair (D ∪ E, h) ∈ X will contradict the maximality of (D, f).

Corollary B.1. If a is an infinite cardinal number, and if b is a cardinal


number with 2 ≤ b ≤ 2^a, then
b^a = 2^a.

Proof. We have
2^a ≤ b^a ≤ (2^a)^a = 2^(a·a) = 2^a,
and the desired equality follows from the Cantor-Bernstein Theorem. 

Corollary B.2. Let a be an infinite cardinal number, let A be a set with


card A = a, and define
Pfin (A) = {F ∈ P(A) : F finite}.
Then card Pfin (A) = a.

Proof. First of all, the map A ∋ a ↦ {a} ∈ Pfin(A) is injective, so a ≤
card Pfin(A).
We now prove the other inequality. For every integer n ≥ 1, let A^n denote the
n-fold cartesian product of A with itself. We treat the sequence A^1, A^2, . . . as
pairwise disjoint. For every n ≥ 1 we define the map
φn : A^n → Pfin(A),
by
φn(a1, . . . , an) = {a1, . . . , an},
and we define the map φ : ⋃_{n=1}^∞ A^n → Pfin(A) as the unique map such that
φ|_{A^n} = φn, ∀ n ≥ 1. Notice now that, since
card A^n = a^n = a, ∀ n ≥ 1,
it follows that
card(⋃_{n=1}^∞ A^n) = ℵ0 · a = a,
which gives
card(Range φ) ≤ a.
But it is clear that
{∅} ∪ Range φ = Pfin(A),
and the fact that Pfin(A) is infinite, proves that
card Pfin(A) = card(Range φ) ≤ a.


We conclude with a result on the cardinal number c = card R.


Proposition B.2.
(i) For two real numbers a < b, one has
card(a, b) = card[a, b) = card(a, b] = card[a, b] = c.
(ii) c = 2^ℵ0.

Proof. (i). It is clear that, since (a, b) is infinite, we have


card[a, b] = 2 + card(a, b) = card(a, b).
The inclusions (a, b) ⊂ [a, b) ⊂ [a, b] and (a, b) ⊂ (a, b] ⊂ [a, b], combined with the
Cantor-Bernstein Theorem, immediately give
card[a, b) = card(a, b] = card(a, b).
Finally, the bijection
(a, b) ∋ t ↦ tan( π(2t − a − b) / (2(b − a)) ) ∈ R
shows that card(a, b) = c.
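As a numerical sanity check of this map, the following Python sketch evaluates it together with a candidate inverse obtained by solving x = tan(π(2t − a − b)/(2(b − a))) for t. The function names `to_real` and `to_interval` and the sample interval (1, 3) are illustrative assumptions, and floating-point arithmetic only approximates the exact bijection.

```python
import math

def to_real(t, a, b):
    """The map (a, b) -> R from the text."""
    return math.tan(math.pi * (2 * t - a - b) / (2 * (b - a)))

def to_interval(x, a, b):
    """Candidate inverse R -> (a, b), obtained by solving for t."""
    return (a + b) / 2 + (b - a) * math.atan(x) / math.pi

a, b = 1.0, 3.0
midpoint_image = to_real((a + b) / 2, a, b)          # the midpoint is sent to tan(0)
round_trip = to_interval(to_real(1.3, a, b), a, b)   # should be close to 1.3
far_point = to_interval(1.0e6, a, b)                 # large reals land near b, inside (a, b)
```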
(ii). The proof of this result uses a certain construction, which is useful for
many other purposes. Therefore we choose to work in full generality. Consider the
set
T = {0, 1}^ℵ0 = {a = (αn)n∈N : αn ∈ {0, 1}, ∀ n ∈ N},
so 2^ℵ0 = card T. For any real number r ≥ 2, we define the map φr : T → [0, 1] by
φr(a) = (r − 1) ∑_{n=1}^∞ αn / r^n, ∀ a = (αn)n∈N ∈ T.
The maps φr, r ≥ 2, are “almost” injective. To clarify this, we define the set
T0 = {a = (αn)n∈N ∈ T : the set {n ∈ N : αn = 0} is infinite}.
Note that
T ∖ T0 = {(αn)n∈N ∈ T : there exists N ∈ N, such that αn = 1, ∀ n ≥ N}.
For r = 2, the map φ2 is clearly surjective. In fact φ2 is “almost” bijective.
Claim 1: Fix r ≥ 2. For elements a = (αn )n∈N , b = (βn )n∈N ∈ T0 , the
following are equivalent
(∗) φr (a) > φr (b);
(∗∗) there exists k ∈ N, such that αk > βk, and αj = βj, for all j ∈ N
with j < k.
We first prove the implication (∗∗) ⇒ (∗). If a, b ∈ T0 satisfy (∗∗), then
(4) φr(a) − φr(b) = (r − 1)/r^k + (r − 1) ∑_{n=k+1}^∞ (αn − βn)/r^n ≥ (r − 1)/r^k − (r − 1) ∑_{n=k+1}^∞ βn/r^n.
Notice now that there are infinitely many indices n ≥ k + 1 such that βn = 0. This
gives the fact that
∑_{n=k+1}^∞ βn/r^n < ∑_{n=k+1}^∞ 1/r^n = 1/((r − 1) r^k),
so if we go back to (4) we get
φr(a) − φr(b) ≥ (r − 1)/r^k − (r − 1) ∑_{n=k+1}^∞ βn/r^n > (r − 1)/r^k − 1/r^k = (r − 2)/r^k ≥ 0,
so in particular we get φr(a) > φr(b).


Conversely, if φr (a) > φr (b), we choose
k = min{n ∈ N : αn 6= βn }.

Using the implication (∗∗) ⇒ (∗) we see that we cannot have βk > αk, because this
would force φr(b) > φr(a). Therefore we must have αk > βk, and we are done.
Using Claim 1, we now see that φr|_{T0} : T0 → [0, 1] is injective.
Claim 2: card(T r T0 ) = ℵ0 .
This is pretty clear, since we can write

T ∖ T0 = ⋃_{k=1}^∞ Rk,
where
Rk = {a = (αn)n∈N ∈ T : αn = 1, ∀ n ≥ k}.
Since each Rk is finite, and T ∖ T0 is infinite, the desired result follows.
Using Claim 2, we have
2^ℵ0 = card T = card(T ∖ T0) + card T0 = ℵ0 + card T0.
Since ℵ0 < 2^ℵ0, the above equality forces
2^ℵ0 = card T0.
For every r ≥ 2, we also have card φr(T ∖ T0) ≤ ℵ0, which then gives
card(φr(T) ∖ φr(T0)) ≤ ℵ0. Using the injectivity of φr|_{T0}, we have
card φr(T0) = card T0 = 2^ℵ0, so we get
2^ℵ0 = card φr(T0) ≤ card φr(T) = card φr(T0) + card(φr(T) ∖ φr(T0)) ≤ card φr(T0) + ℵ0 = 2^ℵ0 + ℵ0 = 2^ℵ0.
By the Cantor-Bernstein Theorem this forces card φr(T) = 2^ℵ0.


Now we are done, since for r = 2 we clearly have φ2 (T ) = [0, 1]. 
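The failure of injectivity outside T0 can be seen exactly on eventually constant sequences. The Python sketch below is an illustration only: a sequence is encoded (hypothetically) as a finite `prefix` followed by a constant `tail`, and the geometric series identity (r − 1) ∑_{n>L} 1/r^n = 1/r^L is used to sum the tail in exact rational arithmetic.

```python
from fractions import Fraction

def phi(r, prefix, tail):
    """phi_r of the sequence prefix[0], ..., prefix[L-1], tail, tail, ..."""
    r = Fraction(r)
    head = sum(Fraction(alpha) / r**n for n, alpha in enumerate(prefix, start=1))
    # (r - 1) * sum_{n > L} tail / r**n equals tail / r**L (geometric series).
    return (r - 1) * head + Fraction(tail) / r ** len(prefix)

a = ((0,), 1)   # the sequence 0, 1, 1, 1, ...  (not in T0)
b = ((1,), 0)   # the sequence 1, 0, 0, 0, ...  (in T0)

binary_a, binary_b = phi(2, *a), phi(2, *b)     # r = 2: the two values coincide
ternary_a, ternary_b = phi(3, *a), phi(3, *b)   # r = 3: the two values differ
```

For r = 2 both sequences give 1/2, the familiar double binary expansion, while for r = 3 they give 1/3 and 2/3.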
Corollary B.3.
(i) c^ℵ0 = c.
(ii) If we define the set
Pcount(R) = {C ⊂ R : card C ≤ ℵ0},
then card Pcount(R) = c.
Proof. (i). This is immediate from the equality 2^ℵ0 = c and from Corollary
B.1.
(ii). Using the inclusion Pfin(R) ⊂ Pcount(R), combined with Corollary B.2, we
see that we have the inequality
c ≤ card Pcount(R).
To prove the other inequality, we define a map φ : R^N → Pcount(R), as follows. If
a ∈ R^N is a sequence, say a = (αn)n∈N, we put
φ(a) = {αn : n ∈ N}.
Since every non-empty set in Pcount(R) belongs to the range of φ, and Pcount(R) is
infinite, using part (i) we get
card Pcount(R) ≤ card R^N = c^ℵ0 = c.

Appendix C

Ordinal numbers
In this Appendix we discuss ordinal number arithmetic. The Axiom of Choice
is assumed to be true.
Definition. Let X be a non-empty set. A well ordering on X is a total order
relation ≺ on X with the following property:
(w) every non-empty subset A ⊂ X has a smallest element, i.e. there exists
a ∈ A, such that a ≺ x, ∀ x ∈ A.
In this case the pair (X, ≺) is called a well ordered set.
Notations. Let (W, ≺) be a well-ordered set. For any a ∈ W , we define
W(a) = {x ∈ W : x ≺ a and x ≠ a}.
Remark that (W (a), ≺) is well-ordered.
Lemma C.1. Let (W, ≺) be a well ordered set. For a subset S ⊂ W , the
following are equivalent:
(i) for every s ∈ S, one has the inclusion W (s) ⊂ S;
(ii) either S = W , or there exists some a ∈ W , such that S = W (a).
Proof. (i) ⇒ (ii). Assume S ⊊ W. Take a to be the smallest element of the
set W ∖ S. If s ∈ S, then a ≠ s, and by (i) we cannot have a ≺ s, since this would
force a ∈ W(s) ⊂ S. Therefore we must have s ≺ a, i.e. s ∈ W(a). This proves the
inclusion S ⊂ W(a). Conversely, if s ∈ W(a), then s must belong to S. Otherwise
s ∈ W ∖ S would contradict the minimality of a.
(ii) ⇒ (i). This is trivial. 
Definition. A subset S, as above, is called a full subset.
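On a small finite well-ordered set, the equivalence in Lemma C.1 can be verified by brute force. The following Python sketch is an illustration only: the set W = {0, 1, 2, 3, 4} with its usual order and the helper names `initial_segment` and `is_full` are hypothetical choices.

```python
from itertools import combinations

W = list(range(5))                       # 0 < 1 < 2 < 3 < 4

def initial_segment(a):
    """W(a): the elements strictly below a."""
    return frozenset(x for x in W if x < a)

def is_full(S):
    """Condition (i): W(s) is contained in S for every s in S."""
    return all(initial_segment(s) <= S for s in S)

all_subsets = [frozenset(c) for k in range(len(W) + 1) for c in combinations(W, k)]
full_subsets = {S for S in all_subsets if is_full(S)}

# Condition (ii): S = W, or S = W(a) for some a in W.
segments = {initial_segment(a) for a in W} | {frozenset(W)}
```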
The key feature of well-ordered sets is the following.
Lemma C.2 (Transfinite Induction Principle). Let (W, ≺) be a well-ordered
set. Let w1 ∈ W be the smallest element of W . Assume A ⊂ W is a set with the
property
(i) If w ∈ W has the property that, W (w) ⊂ A, then w ∈ A.
Then A = W .
Proof. Consider the set
S = {s ∈ A : W (s) ⊂ A}.
It is obvious that S is full, and S ⊂ A. By Lemma C.1, either S = W , in which
case we clearly get A = W , or there exists w ∈ W , such that S = W (w). In this
case we have W (w) ⊂ A. By (i) this forces w ∈ A, so we get w ∈ S, which is
impossible. 

Another useful feature is


Lemma C.3 (Recursion Principle). Let (W, ≺) be a well-ordered set, and let
w1 be the smallest element in W. Let X be a set, and assume one has a family of
maps Φa : ∏_{W(a)} X → X, a ∈ W ∖ {w1}. Then for any element x1 ∈ X, there
exists a unique function f : W → X, such that
(1) f(w1) = x1 and f(a) = Φa(f|_{W(a)}), ∀ a ∈ W ∖ {w1}.

Proof. For every a ∈ W let us denote the set W(a) ∪ {a} simply by Wa, and
let us define the set
Fa = {g : Wa → X : g(w1) = x1 and g(b) = Φb(g|_{W(b)}), ∀ b ∈ Wa ∖ {w1}}.
Remark that, for any a, b ∈ W, with a ≺ b, one has
(2) f|_{Wa} ∈ Fa, ∀ f ∈ Fb.
Claim: For every a ∈ W , the set Fa is a singleton.
We prove this statement using transfinite induction. Define
A = {a ∈ W : Fa is a singleton}.
Suppose a ∈ W has the property W(a) ⊂ A, which means that Fb is a singleton,
for all b ∈ W(a). (If a = w1, then Fa clearly consists of the single function
w1 ↦ x1, so we may assume a ≠ w1.) For each b ∈ W(a), let fb : Wb → X be the
unique element in Fb.
We notice that, for any b, c ∈ W(a), with b ≺ c, using (2), we have
(3) fc|_{Wb} = fb.
This follows immediately from the fact that fc|_{Wb} belongs to Fb. Using the obvious
equality
W(a) = ⋃_{b∈W(a)} Wb,
we define g : W(a) → X as the unique function with the property that g|_{Wb} = fb,
∀ b ∈ W(a). Finally, we define fa : Wa → X by fa|_{W(a)} = g, and fa(a) = Φa(g). It
is clear that fa ∈ Fa, so Fa has at least one element. If h ∈ Fa is another function,
then for every b ∈ W(a) we have h|_{Wb} ∈ Fb, which forces h|_{Wb} = fb, in particular
giving h|_{W(a)} = g = fa|_{W(a)}. Then h(a) = Φa(g), which means that we also have
h(a) = fa(a), so we must have h = fa.
Having proven the Claim, we now have a family of functions fa : Wa → X,
a ∈ W, with fb|_{Wa} = fa, for all a, b ∈ W with a ≺ b. Using the equality
W = ⋃_{a∈W} Wa,
we then define f : W → X to be the unique function such that f|_{Wa} = fa, ∀ a ∈ W.
Notice that, for each a ∈ W ∖ {w1}, we have f(a) = fa(a), and since fa ∈ Fa,
we immediately get (1). The uniqueness of f with property (1) is also clear, since
any such f will automatically satisfy f|_{Wa} ∈ Fa, for all a ∈ W.
Comment. The system of maps Φa : ∏_{W(a)} X → X, a ∈ W ∖ {w1}, is to be thought
of as a “recurrence relation,” in the sense that it is used to define the value f(a) in
terms of all “preceding” values f(w), w ≺ a, w ≠ a.
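For a finite well-ordered set, the Recursion Principle reduces to a loop that always passes the already computed restriction to the recurrence. The Python sketch below is an illustration only: the function `recursive_definition`, the set W = {0, . . . , 5}, and the particular recurrence (the sum of all preceding values) are hypothetical choices, and a restriction f|_{W(a)} is represented as a dictionary.

```python
def recursive_definition(W, x1, Phi):
    """W: list in increasing order; Phi(a, g) receives g = f restricted to W(a) as a dict."""
    f = {W[0]: x1}
    for a in W[1:]:
        restriction = dict(f)       # at this point f is defined exactly on W(a)
        f[a] = Phi(a, restriction)
    return f

W = [0, 1, 2, 3, 4, 5]
# Recurrence: f(a) is the sum of all preceding values, starting from f(0) = 1.
f = recursive_definition(W, 1, lambda a, g: sum(g.values()))
values = [f[a] for a in W]
```

Here each value depends on all preceding values, not just the immediate predecessor, which is exactly the generality that the maps Φa allow.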

Definitions. Given two well ordered sets (W1 , ≺1 ) and (W2 , ≺2 ), a map f :
(W1, ≺1) → (W2, ≺2) is called a full embedding, if
• f is injective.
• For any two elements x, y ∈ W1 , one has
x ≺1 y ⇒ f (x) ≺2 f (y).
• f (W1 ) is a full subset of W2 .
If f is a full embedding, with f(W1) = W2, then f is called an order isomorphism.
The properties of these types of maps are contained in the following
Proposition C.1. A. Suppose (W1 , ≺1 ) and (W2 , ≺2 ), are well-ordered sets.
(i) If f : (W1, ≺1) → (W2, ≺2) is a full embedding, then
f(W1(a)) = W2(f(a)), ∀ a ∈ W1.
In particular, if w1 is the smallest element in W1, and w2 is the smallest
element in W2, then f(w1) = w2.
(ii) If f : (W1, ≺1) → (W2, ≺2) is an order isomorphism, then
f^{−1} : (W2, ≺2) → (W1, ≺1) is again an order isomorphism.
(iii) There exists at most one full embedding f : (W1, ≺1) → (W2, ≺2).
B. Suppose (W1, ≺1), (W2, ≺2), (W3, ≺3) are well-ordered sets, and
f : (W1, ≺1) → (W2, ≺2) and g : (W2, ≺2) → (W3, ≺3)
are full embeddings.
(i) The composition g ◦ f : (W1, ≺1) → (W3, ≺3) is again a full embedding.
(ii) The composition g ◦ f is an order isomorphism, if and only if both f and
g are order isomorphisms.
Proof. A. (i). Start first with some element x ∈ W1(a). Since x ≺1 a, we have
f(x) ≺2 f(a). Since f is injective, and x ≠ a, we must have f(x) ≠ f(a), hence
f(x) ∈ W2(f(a)). Conversely, if y ∈ W2(f(a)), then using the fact that f(W1) is full
in W2, it follows that y ∈ f(W1), so there exists some x ∈ W1, with y = f(x). If
a ≺1 x, then we would get f(a) ≺2 f(x) = y, which is impossible. Therefore we must
have x ≺1 a and x ≠ a, i.e. x ∈ W1(a), so y indeed belongs to f(W1(a)). The
second assertion is now clear since we have
W2(f(w1)) = f(W1(w1)) = f(∅) = ∅,
which clearly forces f(w1) = w2.
(ii). This is obvious.
(iii). Suppose f, g : (W1 , ≺1 ) → (W2 , ≺2 ) are full embeddings, and let us show
that we must have f = g. We use transfinite induction. Define the set
A = {w ∈ W1 : f (w) = g(w)}.
Let w ∈ W1 be some element such that W1(w) ⊂ A, and let us prove that w ∈ A,
i.e. f(w) = g(w). Denote f(w) by a, and g(w) by b. Using the fact that f|_{W1(w)} =
g|_{W1(w)}, combined with (i), we have
W2(a) = W2(f(w)) = f(W1(w)) = g(W1(w)) = W2(g(w)) = W2(b).
This clearly forces a = b. Indeed, if a ≠ b, then either a ≺2 b, in which case
a ∈ W2(b) ∖ W2(a), or b ≺2 a, in which case b ∈ W2(a) ∖ W2(b). In either case, we
will get W2(a) ≠ W2(b).

B. (i). It is clear that g ◦ f is injective, and satisfies the second condition in the
definition, so the only thing we need to prove is the fact that (g ◦ f )(W1 ) is full. If
f (W1 ) = W2 , there is nothing to prove, since we would get (g ◦ f )(W1 ) = g(W2 ),
which is full.
Assume f(W1) = W2(a), for some a ∈ W2. Then by part A (i), applied to g, we have
(g ◦ f)(W1) = g(f(W1)) = g(W2(a)) = W3(g(a)),
so again (g ◦ f)(W1) is full.
(ii). Assume first that both f and g are order isomorphisms. Then g ◦ f :
(W1 , ≺1 ) → (W3 , ≺3 ) is a full embedding, by (i), and it is clearly surjective, hence
g ◦ f is indeed an order isomorphism.
Conversely, assume g ◦ f : (W1 , ≺1 ) → (W3 , ≺3 ) is an order isomorphism. This
clearly forces g to be surjective, hence an order isomorphism. But then g −1 is an
order isomorphism, and so will be g −1 ◦ (g ◦ f ) = f . 
Corollary C.1. If (W, ≺) is a well-ordered set, and a ∈ W , then there is no
full embedding (W, ≺) → (W (a), ≺).
Proof. Suppose there exists a full embedding f : (W, ≺) → (W (a), ≺). Since
the inclusion ι : (W(a), ≺) ↪ (W, ≺) is obviously a full embedding, the composition
ι ◦ f : (W, ≺) → (W, ≺) is a full embedding. Since we also have IdW : (W, ≺) →
(W, ≺) as a full embedding, this would force ι ◦ f = IdW , which would force ι to be
surjective. But this is obviously impossible. 
Definitions. Two well-ordered sets (W1, ≺1) and (W2, ≺2) are said to have
the same order type, if there exists an order isomorphism (W1 , ≺1 ) → (W2 , ≺2 ).
By the above considerations, this defines an equivalence relation on the class of all
well-ordered sets.
An ordinal number is thought of as an equivalence class of well-ordered sets. In
other words, if we write an ordinal number as α, it is understood that α consists
of all well-ordered sets of a given order type. So when we write ord(W, ≺) = α
we understand that (W, ≺) belongs to this class, and for another well-ordered set
(W′, ≺′) we write ord(W′, ≺′) = α, exactly when (W′, ≺′) has the same order type
as (W, ≺). In this case we write ord(W′, ≺′) = ord(W, ≺).
We regard the empty set ∅ as a well-ordered set, with the empty relation. We
write ord(∅) = 0.
Comments. If (W1 , ≺1 ) and (W2 , ≺2 ) are well-ordered sets, then one has the
obvious implication
ord(W1 , ≺1 ) = ord(W2 , ≺2 ) =⇒ card W1 = card W2 .
Conversely, if the well-ordered sets (W1 , ≺1 ) and (W2 , ≺2 ) are finite, and card W1 =
card W2, then ord(W1, ≺1) = ord(W2, ≺2). Indeed, if we take n = card W1, then
one can define recursively a finite sequence w1, w2, . . . , wn in W1, by taking w1 to be the
smallest element of W1, and defining, for each k ∈ {2, 3, . . . , n} the element wk to
be the smallest element of the set W1 ∖ {w1, w2, . . . , wk−1}. The obvious bijection
{1, 2, . . . , n} ∋ k ↦ wk ∈ W1
will then define an order isomorphism
({1, . . . , n}, ≤) → (W1, ≺1).
Likewise (W2, ≺2) has the same order type as ({1, . . . , n}, ≤).
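The recursive choice of w1, w2, . . . , wn is easy to carry out mechanically. The following Python sketch is an illustration only: the function name `enumerate_well_ordered`, the sample set of words, and the order "shorter word first" are hypothetical choices (the word lengths are distinct, so this is a total order on the sample set).

```python
def enumerate_well_ordered(elements, precedes):
    """Return w1, w2, ..., wn by repeatedly removing the smallest remaining element."""
    remaining = set(elements)
    sequence = []
    while remaining:
        # The smallest element precedes every element that is still left.
        smallest = next(x for x in remaining if all(precedes(x, y) for y in remaining))
        sequence.append(smallest)
        remaining.remove(smallest)
    return sequence

W1 = {"pear", "fig", "banana"}
by_length = lambda x, y: len(x) <= len(y)     # reflexive order: shorter word first

sequence = enumerate_well_ordered(W1, by_length)
# The order isomorphism {1, ..., n} -> W1 is then k -> sequence[k - 1].
```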

Using the above notations, we can then regard all non-negative integers as
ordinal numbers, by identifying ord(W, ≺) = card(W ), for all finite well-ordered
sets (W, ≺).
Notation. If α is an ordinal number, say α = ord(W, ≺), for some well-ordered
set (W, ≺), then the cardinal number card W does not depend on the particular
choice of (W, ≺). We will denote it by card α. As discussed above, if
card α = card β = finite cardinal,
then α = β. As we shall see later, this implication holds only for finite ordinal
numbers.
Definitions. Let α1 and α2 be ordinal numbers, say α1 = ord(W1 , ≺1 ) and
α2 = ord(W2 , ≺2 ), where (W1 , ≺1 ) and (W2 , ≺2 ) are two well-ordered sets. We write
α1 ≤ α2 , if there exists a full embedding f : (W1 , ≺1 ) → (W2 , ≺2 ). By Proposition
C.1, this definition is independent of the choices of (W1 , ≺1 ) and (W2 , ≺2 ).
We write α1 < α2 if α1 ≤ α2 and α1 ≠ α2.
Remark C.1. If α1 and α2 are ordinal numbers, with α1 ≤ α2 , then card α1 ≤
card α2 .
Proposition C.2. The relation ≤ is an order relation, on any set of ordinal
numbers.
Proof. It is obvious that α ≤ α, for any ordinal number α.
Assume α1 and α2 are ordinal numbers with α1 ≤ α2 and α2 ≤ α1, and let
us show that this forces α1 = α2. Let (W1, ≺1) and (W2, ≺2) be well-ordered sets
with α1 = ord(W1, ≺1) and α2 = ord(W2, ≺2). Since α1 ≤ α2, there exists a full
embedding f : (W1, ≺1) → (W2, ≺2). Since α2 ≤ α1, there exists a full
embedding g : (W2, ≺2) → (W1, ≺1). By Proposition C.1.B, the composition g ◦ f :
(W1, ≺1) → (W1, ≺1) is a full embedding. Since we already have a full embedding
IdW1 : (W1, ≺1) → (W1, ≺1), by Proposition C.1.A, we must have g ◦ f = IdW1.
Using Proposition C.1.B this forces f (and g) to be order isomorphisms, so we
indeed have α1 = α2 .
Finally, suppose α1 , α2 and α3 are ordinal numbers such that α1 ≤ α2 and
α2 ≤ α3 . The fact that α1 ≤ α3 follows immediately from Proposition C.1.B. 
Theorem C.1 (Ordinal Comparability Theorem). Let α1 and α2 be ordinal
numbers. Then either α1 ≤ α2 , or α2 ≤ α1 .
Proof. Let (W1 , ≺1 ) and (W2 , ≺2 ) be well-ordered sets with α1 = ord(W1 , ≺1 )
and α2 = ord(W2 , ≺2 ). For every a ∈ W1 we denote the set W1 (a) ∪ {a} simply by
W1a . It is clear that (W1a , ≺1 ) is well-ordered. Consider the set
A = {a ∈ W1 : there exists a full embedding (W1a, ≺1) → (W2, ≺2)}.
By Proposition C.1.A, we know that for any a ∈ A, there exists a unique full
embedding (W1a , ≺1 ) → (W2 , ≺2 ). We denote this full embedding by fa .
Claim 1: The set A is full. Moreover, for any a, b ∈ A, with b ≺1 a, we have
fb = fa|_{W1b}.
Start with some a ∈ A, and let us prove that W1(a) ⊂ A. Fix some arbitrary b ∈
W1(a). Then the inclusion ι : (W1b, ≺1) ↪ (W1a, ≺1) is obviously a full embedding,
since we can write
W1b = W1(c),
where c is the smallest element of the set
Db = {x ∈ W1 : b ≺1 x and b ≠ x}.
(The fact that a ∈ Db shows that Db ≠ ∅.) Then the composition
fa ◦ ι : (W1b, ≺1) → (W2, ≺2)
is a full embedding, so b indeed belongs to A. Moreover, we will have fb = fa ◦ ι =
fa|_{W1b}.
Define the map φ : A → W2 by
φ(a) = fa (a), ∀ a ∈ A.
Remark that
(4) φ|_{W1a} = fa, ∀ a ∈ A.
Indeed, if we take some b ∈ W1(a), then by Claim 1, we have φ(b) = fb(b) = fa(b),
so we get φ|_{W1(a)} = fa|_{W1(a)}.
Claim 2: φ : (A, ≺1 ) → (W2 , ≺2 ) is a full embedding.
We start by proving the first two conditions. Let a, b ∈ A be such that b ≺1 a and
b ≠ a, and let us show that φ(b) ≺2 φ(a) and φ(b) ≠ φ(a). We have b ∈ W1a and
φ|_{W1a} = fa, so using the fact that fa : (W1a, ≺1) → (W2, ≺2) is a full embedding,
we indeed get φ(b) = fa(b) ≺2 fa(a) = φ(a), and φ(b) ≠ φ(a).
We now show that φ(A) is full in (W2, ≺2). Start with some y ∈ φ(A), and let
us show that W2(y) ⊂ φ(A). On the one hand, since we obviously have
A = ⋃_{a∈A} W1a,
we also have
φ(A) = φ(⋃_{a∈A} W1a) = ⋃_{a∈A} φ(W1a),
so there exists some a ∈ A, such that y ∈ φ(W1a ) = fa (W1a ). On the other hand,
since fa : (W1a , ≺1 ) → (W2 , ≺2 ) is a full embedding, it follows that fa (W1a ) is full,
so we get W2 (y) ⊂ fa (W1a ) = φ(W1a ) ⊂ φ(A).
We now finish the proof. Since both A and φ(A) are full, there are three cases
to examine
Case 1: A = W1 . In this case φ : (W1 , ≺1 ) → (W2 , ≺2 ) is a full embedding, so
we get α1 ≤ α2 .
Case 2: φ(A) = W2. In this case φ : (A, ≺1) → (W2, ≺2) is an order
isomorphism, so φ^{−1} : (W2, ≺2) → (W1, ≺1) is a full embedding, and we get α2 ≤
α1.
Case 3: A ⊊ W1 and φ(A) ⊊ W2. This means there exist a1 ∈ W1 and
a2 ∈ W2 such that A = W1(a1) and φ(A) = W2(a2). This case turns out to be
impossible. To see this, we define ψ : W1a1 → W2 by ψ|_{W1(a1)} = φ and ψ(a1) = a2;
then ψ : (W1a1, ≺1) → (W2, ≺2) will still be a full embedding. Indeed, the first
two conditions in the definition are clear, while the equality
ψ(W1a1) = W2a2 = {y ∈ W2 : y ≺2 a2},
proves that ψ(W1a1) is full. The existence of ψ then forces a1 ∈ A, which contradicts
the equality A = W1(a1).

Theorem C.2. Let α be an ordinal number. Then the class Pα of all ordinal
numbers β with β < α is a set. More explicitly, if (W, ≺) is a well-ordered set with
ord(W, ≺) = α, then the map
φ : W ∋ a ↦ ord(W(a), ≺) ∈ Pα
is a bijection. Moreover, (Pα , ≤) is well-ordered, and φ : (W, ≺) → (Pα , ≤) is an
order isomorphism.

Proof. Let β be an ordinal number with β < α. Then there exist a well-ordered
set (W1, ≺1), and a full embedding f : (W1, ≺1) → (W, ≺), such that
• β = ord(W1, ≺1),
• f(W1) = W(a1),
for some a1 ∈ W. This fact already proves that Pα is a set.
Claim: The element a1 ∈ W does not depend on the particular choice of
(W1 , ≺1 ).
Indeed, if (W2 , ≺2 ) is another well-ordered set, and ψ : (W2 , ≺2 ) → (W, ≺) is
another full embedding with
• β = ord(W2 , ≺2 ),
• ψ(W2 ) = W (a2 ),
for some a2 ∈ W , then we would get the existence of an order isomorphism γ :
(W(a1), ≺) → (W(a2), ≺). We can assume (otherwise we replace γ with γ^{−1}) that
a1 ≺ a2. If a1 ≠ a2, we would have a1 ∈ W(a2), so if we work with the well-ordered
set Z = W (a2 ) we would have an order isomorphism (Z, ≺) → (Z(a1 ), ≺). By
Corollary C.1 this is impossible. Therefore, we must have a1 = a2 .
Using the Claim, we then define aβ as the unique element in W , such that
ord(W(aβ), ≺) = β. Define the map ψ : Pα ∋ β ↦ aβ ∈ W. It is clear that
φ ◦ ψ = IdPα .
Let us prove now that ψ ◦ φ = IdW . Start with some arbitrary a ∈ W , and put
β = φ(a) = ord(W (a), ≺). Since ord(W (a), ≺) = β, by the Claim, we must have
aβ = a, i.e. ψ(β) = a, which means that (ψ ◦ φ)(a) = a.
Finally, we note that, if a, b ∈ W are elements with a ≺ b, then the obvious full
embedding (W(a), ≺) ↪ (W(b), ≺) proves that ord(W(a), ≺) ≤ ord(W(b), ≺), i.e.
φ(a) ≤ φ(b).
Since φ is bijective, it is clear that, for a, b ∈ W , we have in fact the equivalence
a ≺ b ⇐⇒ φ(a) ≤ φ(b).
This proves that (Pα , ≤) is well-ordered, and φ : (W, ≺) → (Pα , ≤) is an order
isomorphism. □
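
As a small illustration of Theorem C.2, here is a sketch under assumed notation: the finite ordinals are written as natural numbers, ω denotes ord(ℕ, ≤), and the set W = ℕ ∪ {∞}, with ∞ placed above every natural number, is a hypothetical example not taken from the text. With α = ord(W, ≺), the map φ and the set Pα look as follows.

```latex
% Assumed example: W = \mathbb{N} \cup \{\infty\}, with n \prec \infty for all n.
% W(a) denotes the set of elements strictly below a, as in the text.
\begin{align*}
W(n) &= \{0, 1, \dots, n-1\}, & \varphi(n) &= \operatorname{ord}(W(n), \prec) = n, \\
W(\infty) &= \mathbb{N}, & \varphi(\infty) &= \operatorname{ord}(\mathbb{N}, \le) = \omega,
\end{align*}
\[
P_{\alpha} = \{0, 1, 2, \dots\} \cup \{\omega\}.
\]
```

In this example φ visibly preserves the order and hits every ordinal below α exactly once, which is what the theorem asserts in general.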

Corollary C.2. If S is a set of ordinal numbers, then (S, ≤) is well-ordered.

Proof. By Theorem C.1, (S, ≤) is totally ordered. Fix some non-empty subset
A ⊂ S, and let us show that A has a smallest element. Start with some arbitrary
α ∈ A. If α ≤ β, ∀ β ∈ A, we are done. Otherwise, the intersection A ∩ Pα is
non-empty. We then use the fact that (Pα , ≤) is well-ordered, to choose α1 to be
the smallest element of A ∩ Pα . If we start with some arbitrary β ∈ A, then either α ≤ β, in
which case we immediately get α1 < β, or β < α, in which case β ∈ A ∩ Pα , and
we again get α1 ≤ β. So α1 is in fact the smallest element of A. □

Theorem C.3 (Well-Ordering Theorem). Every non-empty set X has a well-ordering.

Proof. Let
𝒲 = {(W, ≺) : (W, ≺) well-ordered, and W ⊂ X}.
For two elements (W1 , ≺1 ) and (W2 , ≺2 ), we define (W1 , ≺1 ) ⊏ (W2 , ≺2 ), if and
only if W1 ⊂ W2 , and the inclusion map (W1 , ≺1 ) ↪ (W2 , ≺2 ) is a full embedding.
(This is equivalent to the fact that W1 is a full subset of (W2 , ≺2 ), and ≺1 = ≺2|W1 .)
It is obvious that (𝒲, ⊏) is an ordered set. We want to apply Zorn Lemma
to this set. We need to check the hypothesis. Start with a totally ordered subset
𝒯 = {(Wi , ≺i ) : i ∈ I} ⊂ 𝒲, and let us show that 𝒯 has an upper bound in 𝒲.
Define W = ⋃_{i∈I} Wi . For a, b ∈ W , we define a ≺ b, if and only if there exists
i ∈ I, such that a, b ∈ Wi , and a ≺i b. Let us check that (W, ≺) is a well-ordered
set. First of all, we need to show that ≺ is an order relation on W . It is clear
that a ≺ a, ∀ a ∈ W . Suppose a, b ∈ W satisfy a ≺ b and b ≺ a, and let us show
that a = b. We know there exist i, j ∈ I such that a, b ∈ Wi and a ≺i b, and
a, b ∈ Wj and b ≺j a. Now there are two possibilities: either (Wi , ≺i ) ⊏ (Wj , ≺j ),
or (Wj , ≺j ) ⊏ (Wi , ≺i ). In the first case we get a ≺i b and b ≺i a, so we would
get a = b. In the other case, by symmetry, we again get a = b. Let us show now
transitivity. Suppose a, b, c ∈ W satisfy a ≺ b and b ≺ c, and let us show that
a ≺ c. We know there exist i, j ∈ I, such that a, b ∈ Wi and a ≺i b, and b, c ∈ Wj
and b ≺j c. As above, we have two possibilities: either (Wi , ≺i ) ⊏ (Wj , ≺j ), or
(Wj , ≺j ) ⊏ (Wi , ≺i ). In the first case we get a, b, c ∈ Wj and a ≺j b ≺j c, so we get
a ≺j c. In the second case, we get a, b, c ∈ Wi and a ≺i b ≺i c, so we get a ≺i c. In
either case we get a ≺ c.
Next we show that (W, ≺) is totally ordered. Start with arbitrary a, b ∈ W , and
let us prove that either a ≺ b or b ≺ a. If we choose i, j ∈ I such that a ∈ Wi and
b ∈ Wj , then using the two possibilities (Wi , ≺i ) ⊏ (Wj , ≺j ) or (Wj , ≺j ) ⊏ (Wi , ≺i )
we immediately see that we can find k ∈ I (k is either i or j), such that a, b ∈ Wk .
Then using the fact that (Wk , ≺k ) is totally ordered, we either have a ≺k b, or
b ≺k a. This gives either a ≺ b, or b ≺ a.
In order to prove that (W, ≺) is well-ordered, and (Wi , ≺i ) ⊏ (W, ≺), ∀ i ∈ I,
we shall use the following
Claim: For any i ∈ I, one has the implication:
a ∈ Wi =⇒ W (a) ⊂ Wi .
Indeed, if there exists some b ∈ W (a), but b ∉ Wi , this would mean that there
exists some j ∈ I, with b ∈ Wj , b ≺j a, and b ≠ a. This would then force
(Wi , ≺i ) ⊏ (Wj , ≺j ), and b ∈ Wj (a). But this is impossible, since the fact that Wi
is full in (Wj , ≺j ) would force b ∈ Wj (a) ⊂ Wi .
Let us show now that (W, ≺) is well-ordered. Start with some arbitrary non-
empty subset A ⊂ W . Choose i ∈ I, such that A ∩ Wi ≠ ∅, and take a to be the
smallest element in A ∩ Wi , in the well-ordered set (Wi , ≺i ), i.e.
(5) a ∈ A ∩ Wi , and a ≺i x, ∀ x ∈ A ∩ Wi .
Let us prove that a is in fact the smallest element of A, in (W, ≺). Start with some
arbitrary element b ∈ A, and let us prove that a ≺ b. Assume the opposite. Since
(W, ≺) is totally ordered, this means that b ≺ a, and b ≠ a,
i.e. b ∈ W (a). By the Claim, however, this will force b ∈ Wi , so we would get
b ∈ A ∩ Wi , and the choice of a would give a ≺i b, which would then give a ≺ b,
thus contradicting the assumption on b.
We now prove (Wi , ≺i ) ⊏ (W, ≺), ∀ i ∈ I. It is clear that the inclusion map
ι : (Wi , ≺i ) ↪ (W, ≺) satisfies the first two conditions in the definition of full
embeddings, so the only thing we need is the fact that Wi is full in (W, ≺). But
this is precisely the content of the above Claim.
Having shown that every totally ordered subset 𝒯 ⊂ 𝒲 has an upper bound, we
now invoke Zorn Lemma, to get the existence of a maximal element (W, ≺) ∈ 𝒲.
The proof of the Theorem will be finished once we prove that W = X. We prove
this equality by contradiction. Assume W ⊊ X. Pick an element x ∈ X ∖ W , and
define the set W1 = W ∪ {x}. Equip W1 with the order relation ≺1 defined by
a ≺1 b ⇐⇒ (a, b ∈ W and a ≺ b) or b = x.
It is pretty obvious that W = W1 (x) and ≺ = ≺1|W , so (W1 , ≺1 ) is well-ordered
and (W, ≺) ⊏ (W1 , ≺1 ). Since W ⊊ W1 , this would contradict the maximality. □
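
To make the final extension step concrete, here is a toy instance. The three-element set X = {p, q, r} and the choice W = {p, q} are assumptions made purely for illustration; for a finite set the Zorn Lemma argument is of course not needed.

```latex
% Hypothetical toy data: X = {p, q, r}, W = {p, q}, new element x = r.
\begin{align*}
X &= \{p, q, r\}, \qquad W = \{p, q\} \text{ with } p \prec q, \qquad x = r, \\
W_1 &= W \cup \{r\}, \qquad p \prec_1 q \prec_1 r, \\
W_1(r) &= \{b \in W_1 : b \prec_1 r,\ b \neq r\} = \{p, q\} = W.
\end{align*}
```

Here W is exactly the set of elements of W1 strictly below the new top element r, and ≺1 agrees with ≺ on W, so (W, ≺) sits strictly below (W1 , ≺1 ) in the ordering used for Zorn Lemma and cannot have been maximal.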
Comment. An interesting consequence of the Well-Ordering Theorem is the
following: For any cardinal number a, there exists an ordinal number α, such that
card α = a.
Another interesting application is the following:
Corollary C.3. If C is a set of cardinal numbers, then (C, ≤) is well-ordered.
Proof. For any a ∈ C we choose a well-ordered set (Wa , ≺a ) with card Wa = a.
Choose any set X with
a < card X, ∀ a ∈ C.
(For example, we can take Y = ⋃_{a∈C} Wa , so that a ≤ card Y , ∀ a ∈ C, and then we
define X = {0, 1}^Y .) Choose a well-ordering ≺ on the set X. Define α = ord(X, ≺)
and αa = ord(Wa , ≺a ), ∀ a ∈ C. Since
card αa = a < card X = card(X, ≺), ∀ a ∈ C,
it follows that we have αa < α, i.e. αa ∈ Pα , ∀ a ∈ C.
Apply now the fact that the set (Pα , ≤) of ordinal numbers is well-ordered, to find some
a0 ∈ C, such that
αa0 ≤ αa , ∀ a ∈ C.
This will clearly imply
a0 = card αa0 ≤ card αa = a, ∀ a ∈ C. □
Examples C.1. As previously discussed, for every finite cardinal number n ≥
0, there exists exactly one ordinal number with n as its cardinality.
The next interesting case is the class
A = {α : α ordinal number with card α ≤ ℵ0 }.
Then A is a set. Indeed, if we choose an ordinal number γ1 with card γ1 = c, then A
is a subset of Pγ1 . Moreover, if we choose an ordinal number γ2 with card γ2 = 2^c ,
then we see that γ1 ∈ Pγ2 ∖ A. We can then take Ω to be the smallest element of
the non-empty set Pγ2 ∖ A, and we have
A = PΩ .

The inclusion A ⊂ PΩ is clear. To prove the other inclusion, we start with some
ordinal number α < Ω and we see that this forces α ∈ Pγ2 , so it will be impossible to
have ℵ0 < card α, because this would give α ∈ Pγ2 ∖ A, contradicting the minimality
of Ω.
The ordinal number Ω is called the smallest uncountable ordinal number.
Fact 1: The set PΩ is uncountable.
This follows from the fact that
Ω = ord(PΩ , ≤),
which gives ℵ0 < card Ω = card PΩ .
Fact 2: The cardinal number ℵ1 = card Ω = card PΩ is the smallest uncount-
able cardinal number.
Indeed, start with some cardinal number a < ℵ1 , and choose a well-ordered
set (W, ≺) with card W = a. Since card W < card Ω, we must
have ord(W, ≺) < Ω, which then forces a ≤ ℵ0 .
Fact 3: Any countable subset A ⊂ PΩ has a strict upper bound in PΩ , that
is, there exists β ∈ PΩ , such that α < β, ∀ α ∈ A.
We prove this by contradiction. Assume A has no strict upper bound in PΩ , which
means that for every β ∈ PΩ , there exists some α ∈ A such that β ≤ α. This gives
(6) PΩ = ⋃_{α∈A} (Pα ∪ {α}).
But for every α ∈ PΩ we have ord Pα = α, which forces card Pα ≤ ℵ0 . Then the
fact that A is countable, combined with (6) will force PΩ to be countable, which is
impossible.
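
The counting behind the last sentence can be written out as follows. This is a sketch that assumes the standard facts that a countable union of countable sets is countable and that ℵ0 · ℵ0 = ℵ0, which are not restated in this appendix.

```latex
% Uses (6), card A \le \aleph_0, and card P_\alpha \le \aleph_0 for \alpha \in P_\Omega.
\begin{align*}
\operatorname{card} P_\Omega
  &\le \sum_{\alpha \in A} \operatorname{card}\bigl(P_\alpha \cup \{\alpha\}\bigr) \\
  &\le \operatorname{card} A \cdot \aleph_0 \\
  &\le \aleph_0 \cdot \aleph_0 = \aleph_0 .
\end{align*}
```

This would make PΩ countable, contradicting Fact 1.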
The above construction can be generalized to arbitrary cardinal numbers, giving
the following
Fact 4: Given any cardinal number a, there exists a smallest ordinal number
Ωa with a < card Ωa , and the cardinal number a′ = card Ωa is the smallest
cardinal number with a < a′ . Any set A ⊂ PΩa , with card A ≤ a, has a
strict upper bound in PΩa .
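
For orientation, the case a = ℵ0 of Fact 4 recovers the objects constructed above. The following sketch only restates Facts 1 to 3 in the notation of Fact 4 and adds nothing new.

```latex
% Specialization of Fact 4 to a = \aleph_0.
\begin{align*}
\Omega_{\aleph_0} &= \Omega, \\
\operatorname{card} \Omega_{\aleph_0} &= \operatorname{card} P_\Omega = \aleph_1, \\
A \subset P_\Omega,\ \operatorname{card} A \le \aleph_0
  &\implies \exists\, \beta \in P_\Omega \text{ with } \alpha < \beta \ \ \forall\, \alpha \in A.
\end{align*}
```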
