Economics
Andrew McLennan
April 8, 2014
Preface
Over two decades ago now I wrote a rather long survey of the mathematical
theory of fixed points entitled Selected Topics in the Theory of Fixed Points. It
had no content that could not be found elsewhere in the mathematical literature,
but nonetheless some economists found it useful. Almost as long ago, I began
work on the project of turning it into a proper book, and finally that project is
coming to fruition. Various events over the years have reinforced my belief that the
mathematics presented here will continue to influence the development of theoretical
economics, and have intensified my regret about not having completed it sooner.
There is a vast literature on this topic, which has influenced me in many ways,
and which cannot be described in any useful way here. Even so, I should say
something about how the present work stands in relation to three other books on
fixed points. Fixed Point Theorems with Applications to Economics and Game
Theory by Kim Border (1985) is a complement, not a substitute, explaining various
forms of the fixed point principle such as the KKMS theorem and some of the
many theorems of Ky Fan, along with the concrete details of how they are actually
applied in economic theory. Fixed Point Theory by Dugundji and Granas (2003) is,
even more than this book, a comprehensive treatment of the topic. Its fundamental
point of view (applications to nonlinear functional analysis), audience (professional
mathematicians), and technical base (there is extensive use of algebraic topology)
are quite different, but it is still a work with much to offer to economics. Particularly
notable is the extensive and meticulous information concerning the literature and
history of the subject, which is full of affection for the theory and its creators. The
book that was, by far, the most useful to me, is The Lefschetz Fixed Point Theorem
by Robert Brown (1971). Again, his approach and mine have differences rooted in
the nature of our audiences, and the overall objectives, but at their cores the two
books are quite similar, in large part because I borrowed a great deal.
I would like to thank the many people who, over the years, have commented
favorably on Selected Topics. It is a particular pleasure to acknowledge some very
detailed and generous written comments by Klaus Ritzberger. This work would not
have been possible without the support and affection of my families, both present
and past, for which I am forever grateful.
Contents
I Topological Methods 22
2 Planes, Polyhedra, and Polytopes 23
2.1 Affine Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.2 Convex Sets and Cones . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.3 Polyhedra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.4 Polytopes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.5 Polyhedral Complexes . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.6 Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
7 Retracts 95
7.1 Kinoshita's Example . . . . . . . . . . . . . . . . . . . . . . . . 95
7.2 Retracts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
7.3 Euclidean Neighborhood Retracts . . . . . . . . . . . . . . . . . . . 99
7.4 Absolute Neighborhood Retracts . . . . . . . . . . . . . . . . . . . 100
7.5 Absolute Retracts . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
7.6 Domination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
The Brouwer fixed point theorem states that if C is a nonempty compact convex
subset of a Euclidean space and f : C → C is continuous, then f has a fixed point,
which is to say that there is an x* ∈ C such that f(x*) = x*. The proof of
this by Brouwer (1912) was one of the major events in the history of topology.
Since then the study of such results, and the methods used to prove them, has
flourished, undergoing radical transformations, becoming increasingly general and
sophisticated, and extending its influence to diverse areas of mathematics.
Around 1950, most notably through the work of Nash (1950, 1951) on noncoop-
erative games, and the work of Arrow and Debreu (1954) on general equilibrium
theory, it emerged that in economists' most basic and general models, equilibria
are fixed points. The most obvious consequence of this is that fixed point theo-
rems provide proofs that these models are not vacuous. But fixed point theory also
informs our understanding of many other issues such as comparative statics, robust-
ness under perturbations, stability of equilibria with respect to dynamic adjustment
processes, and the algorithmics and complexity of equilibrium computation. In par-
ticular, since the mid 1970s the theory of games has been strongly influenced by
refinement concepts defined largely in terms of robustness with respect to certain
types of perturbations.
As the range and sophistication of economic modelling have increased, more
advanced mathematical tools have become relevant. Unfortunately, the mathematical
literature on fixed points is largely inaccessible to economists, because it relies heav-
ily on homology. This subject is part of the standard graduate school curriculum
for mathematicians, but for outsiders it is difficult to penetrate, due to its abstract
nature and the amount of material that must be absorbed at the beginning before
the structure, nature, and goals of the theory begin to come into view. Many re-
searchers in economics learn advanced topics in mathematics as a side product of
their research, but unlike infinite dimensional analysis or continuous time stochastic
processes, algebraic topology will not gradually achieve popularity among economic
theorists through slow diffusion. Consequently economists have been, in effect,
shielded from some of the mathematics that is most relevant to their discipline.
This monograph presents an exposition of advanced material from the theory of
fixed points that is, in several ways, suitable for graduate students and researchers
in mathematical economics and related fields. In part the fit with the intended
1.1 The First Fixed Point Theorems
Chapter 3 presents various proofs of this result. Although some are fairly brief,
none of them can be described as truly elementary. In general, proofs of Brouwer's
theorem are closely related to algorithmic procedures for finding approximate fixed
points. Chapter 3 discusses the best known general algorithm due to Scarf, a new
algorithm due to the author and Rabee Tourky, and homotopy methods, which are
the most popular in practice, but require differentiability. The last decade has seen
major breakthroughs in computer science concerning the computational complexity
of computing fixed points, with particular reference to (seemingly) simple games
and general equilibrium models. These developments are sketched briefly in Section
3.7.
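In one dimension the link between fixed point theorems and algorithms is especially transparent: for a continuous f : [0, 1] → [0, 1] the displacement g(x) = f(x) − x changes sign, so bisection finds an approximate fixed point. The following sketch (our own illustration; it is not Scarf's algorithm, which handles arbitrary dimension) makes this concrete, with an example function chosen for the purpose.

```python
import math

def approx_fixed_point(f, tol=1e-10):
    """Bisection on g(x) = f(x) - x for continuous f : [0,1] -> [0,1].

    g(0) >= 0 and g(1) <= 0, so the intermediate value theorem (the
    1-dimensional case of Brouwer's theorem) guarantees a zero of g.
    """
    lo, hi = 0.0, 1.0  # invariant: g(lo) >= 0 and g(hi) <= 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) - mid >= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Illustrative map of [0,1] into itself (our choice, not from the text).
f = lambda x: math.cos(x) / 2 + 0.25
x_star = approx_fixed_point(f)
```

Since f is continuous and maps [0, 1] into itself, the returned point satisfies f(x*) ≈ x* to within the tolerance.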
In economics and game theory fixed point theorems are most commonly used to
prove that a model has at least one equilibrium, where an equilibrium is a vector of
endogenous variables for the model with the property that each individual agent's
predicted behavior is rational, or utility maximizing, if that agent regards all
the other endogenous variables as fixed. In economics it is natural, and in game
theory unavoidable, to consider models in which an agent might have more than
one rational choice. Our first generalization of Brouwer's theorem addresses this
concern.
If X and Y are sets, a correspondence F : X → Y is a function from X to the
nonempty subsets of Y. (On the rare occasions when they arise, we use the term set
valued mapping for a function from X to all the subsets of Y, including the empty
set.) We will tend to regard a function as a special type of correspondence, both
intuitively and in the technical sense that we will frequently blur the distinction
between a function f : X → Y and the associated correspondence x ↦ {f(x)}.
If Y is a topological space, F is compact valued if, for all x ∈ X, F(x) is
compact. Similarly, if Y is a subset of a vector space, then F is convex valued if
each F(x) is convex.
The extension of Brouwer's theorem to correspondences requires a notion of
continuity for correspondences. If X and Y are topological spaces, a correspondence
F : X → Y is upper semicontinuous if it is compact valued and, for each x₀ ∈ X
and each neighborhood V ⊆ Y of F(x₀), there is a neighborhood U ⊆ X of x₀ such
that F(x) ⊆ V for all x ∈ U. It turns out that if X and Y are metric spaces and Y
is compact, then F is upper semicontinuous if and only if its graph
desire to provide a simple approach to the von Neumann (1928) minimax theorem,
which is a fundamental result of game theory. This is the fixed point theorem that
is most commonly applied in economic analysis.
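The graph characterization of upper semicontinuity is easy to test in examples. Below (our illustration, not from the text) is a standard upper semicontinuous correspondence with a jump, together with a check that limits of points of its graph remain in the graph:

```python
def in_F(x, y, eps=1e-12):
    """Membership in the graph of F : [-1,1] ->> [0,1], where
    F(x) = {0} for x < 0, F(x) = {1} for x > 0, and F(0) = [0,1].
    This correspondence is upper semicontinuous (its graph is closed)
    but not lower semicontinuous; the example is ours."""
    if x < -eps:
        return abs(y) <= eps
    if x > eps:
        return abs(y - 1) <= eps
    return -eps <= y <= 1 + eps

def graph_is_closed_along(seq):
    """Check closedness of the graph along a convergent sequence of
    graph points: every term lies in the graph, and so does the limit
    (appended as the last entry of seq)."""
    assert all(in_F(x, y) for x, y in seq[:-1])
    x_lim, y_lim = seq[-1]
    return in_F(x_lim, y_lim)

# Graph points approaching (0, 1) from the right, with the limit appended.
seq = [(1.0 / n, 1.0) for n in range(1, 100)] + [(0.0, 1.0)]
```

Because F(0) is the whole interval [0, 1], the limit point (0, 1) remains in the graph, as closedness requires.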
1.2 Fixing Kakutani's Theorem
{ (1 − t)x + tx′ : 0 ≤ t ≤ 1 }
(x, t) ↦ (1 − t)x + tx′.
It seems natural to guess that a nonempty compact contractible space has the
fixed point property. Whether this is the case was an open problem for several years,
but it turns out to be false. In Chapter 7 we will see an example due to Kinoshita
(1953) of a nonempty compact contractible subset of R³ that does not have the
fixed point property. Fixed point theory requires some additional ingredient.
If X is a topological space, a subset A ⊆ X is a retract if there is a continuous
function r : X → A with r(a) = a for all a ∈ A. Here we tend to think of X as a
simple space, and the hope is that although A might seem to be more complex,
or perhaps crumpled up, it nonetheless inherits enough of the simplicity of X. A
particularly important manifestation of this is that if r : X → A is a retraction and
X has the fixed point property, then so does A, because if f : A → A is continuous,
then so is f ∘ r : X → A ⊆ X, so f ∘ r has a fixed point, and this fixed point
necessarily lies in A and is consequently a fixed point of f. Also, a retract of a
contractible space is contractible because if c : X × [0, 1] → X is a contraction of
X and r : X → A ⊆ X is a retraction, then

(a, t) ↦ r(c(a, t))

is a contraction of A.
A set A ⊆ R^m is a Euclidean neighborhood retract (ENR) if there is an
open superset U ⊆ R^m of A and a retraction r : U → A. If X and Y are metric
spaces, an embedding of X in Y is a function e : X → Y that is a homeomorphism
between X and e(X). That is, e is a continuous injection¹ whose inverse is also
continuous when e(X) has the subspace topology inherited from Y. An absolute
neighborhood retract (ANR) is a separable² metric space X such that whenever
Y is a separable metric space and e : X → Y is an embedding, there is an open
superset U ⊆ Y of e(X) and a retraction r : U → e(X). This definition probably
seems completely unexpected, and it's difficult to get any feeling for it right away.
In Chapter 7 we'll see that ANRs have a simple characterization, and that many
of the types of spaces that come up most naturally are ANRs, so this condition
is quite a bit less demanding than one might guess at first sight. In particular, it
will turn out that every ENR is an ANR, so that being an ENR is an intrinsic
property insofar as it depends on the topology of the space and not on how the
space is embedded in a Euclidean space.
An absolute retract (AR) is a separable metric space X such that whenever
Y is a separable metric space and e : X → Y is an embedding, there is a retraction
r : Y → e(X). In Chapter 7 we will prove that an ANR is an AR if and only if it
is contractible.
For practical purposes this is the maximally general topological fixed point
theorem, but for mathematicians there is an additional refinement. There is a con-
cept called acyclicity that is defined in terms of the concepts of algebraic topology.
A contractible set is necessarily acyclic, but there are acyclic spaces (including com-
pact ones) that are not contractible. The famous Eilenberg-Montgomery fixed point
theorem is:
Figure 1.1
The figure above shows a function f : [0, 1] → [0, 1] with two fixed points, s
and t. If we perturb the function slightly by adding a small positive constant, s
disappears in the sense that the perturbed function does not have a fixed point
anywhere near s, but a function close to f has a fixed point near t. More precisely,
if X is a topological space and f : X → X is continuous, a fixed point x* of f is
essential if, for any neighborhood U of x*, there is a neighborhood V of the graph
of f such that any continuous f′ : X → X whose graph is contained in V has a
fixed point in U. If a fixed point is not essential, then we say that it is inessential.
These concepts were introduced by Fort (1950).
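The distinction can be seen numerically. In the sketch below (our own stand-in for the function in Figure 1.1), f touches the diagonal tangentially at s = 0.3 and crosses it transversally at t = 0.9; adding a small constant destroys the fixed point near s but, by the intermediate value theorem, not the one near t.

```python
import numpy as np

# f(x) = x + (x - 0.3)^2 (0.9 - x): fixed points at s = 0.3 (tangency,
# inessential) and t = 0.9 (transversal crossing, essential).
f = lambda x: x + (x - 0.3) ** 2 * (0.9 - x)
eps = 0.01
g = lambda x: f(x) + eps - x   # displacement of the perturbed function

xs_near_s = np.linspace(0.2, 0.4, 1000)
xs_near_t = np.linspace(0.85, 0.95, 1000)

# Near s the displacement is strictly positive: the fixed point vanished.
no_fp_near_s = np.all(g(xs_near_s) > 0)
# Near t the displacement changes sign: a fixed point survives.
fp_near_t = np.any(np.diff(np.sign(g(xs_near_t))) != 0)
```

The tangency at s is exactly what makes s inessential, while the sign change at t persists under every sufficiently small perturbation.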
There need not be an essential fixed point. The function shown in Figure 1.2
has an interval of fixed points. If we shift the function down, there will be a fixed
point near the lower endpoint of this interval, and if we shift the function up there
will be a fixed point near the upper endpoint.
This example suggests that we might do better to work with sets of fixed points.
A set S of fixed points of a function f : X → X is essential if it is closed, it has a
neighborhood that contains no other fixed points, and for any neighborhood U of S,
there is a neighborhood V of the graph of f such that any continuous f′ : X → X
whose graph is contained in V has a fixed point in U. The problem with this
concept is that large connected sets are not of much use. For example, if X is
compact and has the fixed point property, then the set of all fixed points of f is
essential. It seems that we should really be interested in sets of fixed points that are
either essential and connected³ or essential and minimal in the sense of not having
a proper subset that is also essential.
Figure 1.2
In Chapter 8 we will show that any essential set of fixed points contains a min-
imal essential set, and that minimal essential sets are connected. The theory of
refinements of Nash equilibrium (e.g., Selten (1975); Myerson (1978); Kreps and
Wilson (1982); Kohlberg and Mertens (1986); Mertens (1989, 1991); Govindan and
Wilson (2008)) has many concepts that amount to a weakening of the notion of
essential set, insofar as the set is required to be robust with respect to only cer-
tain types of perturbations of the function or correspondence. In particular, Jiang
(1963) pioneered the application of the concept to game theory, defining an es-
sential!Nash equilibrium and an essential set of Nash equilibria in terms
of robustness with respect to perturbations of the best response correspondence
induced by perturbations of the payoffs. The mathematical foundations of such
³ We recall that a subset S of a topological space X is connected if there do not exist two
disjoint open sets U₁ and U₂ with S ∩ U₁ ≠ ∅ ≠ S ∩ U₂ and S ⊆ U₁ ∪ U₂.
1.4 Index and Degree
Figure 1.3
1.4.1 Manifolds
First of all, it makes sense to expand our perspective a bit. An m-dimensional
manifold is a topological space M that resembles R^m in a neighborhood of each of its
points. More precisely, for each p ∈ M there is an open U ⊆ R^m and an embedding
φ : U → M whose image is open and contains p. Such a φ is a parameterization
and its inverse is a coordinate chart. The most obvious examples are R^m itself
and S^m. If, in addition, N is an n-dimensional manifold, then M × N is an (m + n)-
dimensional manifold. Thus the torus S¹ × S¹ is a manifold, and this is just the
most easily visualized member of a large class of examples. An open subset of an
m-dimensional manifold is an m-dimensional manifold. A 0-dimensional manifold is
just a set with the discrete topology. The empty set is a manifold of any dimension,
including negative dimensions. Of course these special cases are trivial, but they
come up in important contexts.
A collection {φᵢ : Uᵢ → M}_{i∈I} of parameterizations is an atlas if its images
cover M. The composition φⱼ⁻¹ ∘ φᵢ (with the obvious domain of definition) is called
a transition function. If, for some 1 ≤ r ≤ ∞, all the transition functions are
C^r functions, then the atlas is a C^r atlas. An m-dimensional C^r manifold is an
m-dimensional manifold together with a C^r atlas. The basic concepts of differential
and integral calculus extend to this setting, leading to a vast range of mathematics.
In our formalities we will always assume that M is a subset of a Euclidean
space R^k called the ambient space, and that the parameterizations φᵢ and the
coordinate charts φᵢ⁻¹ are C^r functions. This is a bit unprincipled (for example,
physicists see only the universe, and their discourse is more disciplined if it does
not refer to some hypothetical ambient space) but this maneuver is justified by
embedding theorems due to Whitney that show that it does not entail any serious
loss of generality. The advantages for us are that this approach bypasses certain
technical pathologies while allowing for simplified definitions, and in many settings
the ambient space will prove quite handy. For example, a function f : M → N
(where N is now contained in some R^ℓ) is C^r for our purposes if it is C^r in the
standard sense: for any S ⊆ R^k, a function h : S → R^ℓ is C^r, by definition, if there
is an open W ⊆ R^k containing S and a C^r function H : W → R^ℓ such that h = H|_S.
Having an ambient space around makes it relatively easy to establish the basic
objects and facts of differential calculus. Suppose that φᵢ : Uᵢ → M is a C^r
parameterization. If x ∈ Uᵢ and φᵢ(x) = p, the tangent space of M at p, which we
denote by T_pM, is the image of Dφᵢ(x). This is an m-dimensional linear subspace
of R^k. If f : M → N is C^r, the derivative

Df(p) : T_pM → T_{f(p)}N
The inverse and implicit function theorems have important generalizations. The
point p is a regular point of f if the image of Df(p) is all of T_{f(p)}N. We say that
f : M → N is a C^r diffeomorphism if m = n, f is a bijection, and both f and f⁻¹
are C^r. The generalized inverse function theorem asserts that if m = n, f : M → N
is C^r, and p is a regular point of f, then there is an open U ⊆ M containing p such
that f(U) is an open subset of N and f|_U : U → f(U) is a C^r diffeomorphism.
If 0 ≤ s ≤ m, a set S ⊆ R^k is an s-dimensional C^r submanifold of M if it
is an s-dimensional C^r submanifold of R^k that happens to be contained in M. We
say that q ∈ N is a regular value of f if every p ∈ f⁻¹(q) is a regular point. The
generalized implicit function theorem, which is known as the regular value theorem,
asserts that if q is a regular value of f, then f⁻¹(q) is an (m − n)-dimensional C^r
submanifold of M.
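As a concrete instance (our example, not from the text): f(x, y) = x² + y² has Df(x, y) = (2x, 2y), which is surjective onto R except at the origin, so q = 1 is a regular value and f⁻¹(1), the unit circle, is a (2 − 1)-dimensional submanifold of R². A quick numerical check:

```python
import math
import random

f = lambda x, y: x ** 2 + y ** 2
grad = lambda x, y: (2 * x, 2 * y)   # Df(x, y); surjective onto R iff nonzero

# Sample points of f^{-1}(1) and confirm that each is a regular point.
random.seed(0)
points = [(math.cos(t), math.sin(t))
          for t in (random.uniform(0, 2 * math.pi) for _ in range(100))]
regular = all(grad(x, y) != (0.0, 0.0) and abs(f(x, y) - 1) < 1e-9
              for (x, y) in points)
```

By contrast q = 0 is not a regular value: f⁻¹(0) = {(0, 0)}, where the derivative vanishes, and indeed the preimage is a point rather than a 1-dimensional submanifold.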
We need to extend the degree to situations in which the target point q is not a
regular value of f , and to functions that are merely continuous. Instead of being
able to define the degree directly, as we did above, we will need to proceed indirectly,
showing that the generalized degree is determined by certain of its properties, which
we treat as axioms.
The first step is to extend the concept, giving it a local character. For a
compact C ⊆ M let ∂C = C ∩ cl(M \ C) be the topological boundary of C, and let
int C = C \ ∂C be its interior. A smooth function f : C → N with compact domain
C ⊆ M is said to be smoothly degree admissible over q ∈ N if f⁻¹(q) ∩ ∂C = ∅
and q is a regular value of f. As above, for such a pair (f, q) we define deg_q(f) to
be the number of p ∈ f⁻¹(q) at which f is orientation preserving minus the number
of p ∈ f⁻¹(q) at which f is orientation reversing. Note that deg_q(f) = deg_q(f|_{C′})
whenever C′ is a compact subset of C and f⁻¹(q) has an empty intersection with
the closure of C \ C′. Also, if C = C₁ ∪ C₂ where C₁ and C₂ are compact and
disjoint, then

deg_q(f) = deg_q(f|_{C₁}) + deg_q(f|_{C₂}).
From the point of view of topology, what makes the degree important is its
invariance under homotopy. If C ⊆ M is compact, a smooth homotopy h : C ×
[0, 1] → N is smoothly degree admissible over q if h⁻¹(q) ∩ (∂C × [0, 1]) = ∅
and q is a regular value of h₀ and h₁. In this circumstance

deg_q(h₀) = deg_q(h₁). (∗)
Figure 1.4
(3) deg_q(h₀) = deg_q(h₁) whenever C ⊆ M is compact and the homotopy h :
C × [0, 1] → N is smoothly degree admissible over q.
We note two additional properties of the smooth degree. The first is that if, in
addition to M and N, M′ and N′ are m′-dimensional smooth manifolds, (f, q) ∈
D(M, N), and (f′, q′) ∈ D(M′, N′), then

deg_{(q,q′)}(f × f′) = deg_q(f) · deg_{q′}(f′).

f⁻¹(q) ⊆ (C₁ ∪ ⋯ ∪ Cᵣ) \ (∂C₁ ∪ ⋯ ∪ ∂Cᵣ).
These conditions are imposed in order to express a property of the index that is
inherited from the multiplicative property of the degree for Cartesian products.
The index also has an additional property that has no analogue in degree theory.
Suppose that C ⊆ R^m and C′ ⊆ R^{m′} are compact, g : C → C′ and g′ : C′ → C are
continuous, and g′ ∘ g and g ∘ g′ are index admissible. Then

Λ_{R^m}(g′ ∘ g) = Λ_{R^{m′}}(g ∘ g′).

When g and g′ are smooth and the fixed points in question are regular, this boils
down to a highly nontrivial fact of linear algebra (Proposition 13.3.2) that was
unknown prior to the development of this aspect of index theory.
This property turns out to be the key to moving the index up to a much higher
level of generality, but before we can explain this we need to extend the setup a
bit, allowing for the possibility that the images of g and g′ are not contained in C′
and C, but that there are compact sets D ⊆ C and D′ ⊆ C′ with g(D) ⊆ C′ and
g′(D′) ⊆ C that contain the relevant sets of fixed points.
(X, C, D, g, X′, C′, D′, g′)
(b) g ∈ C(C, X′) and g′ ∈ C(C′, X) with g(D) ⊆ int C′ and g′(D′) ⊆ int C;
After all these preparations we can finally describe the heart of the matter.
Λ_X(g′ ∘ g|_D) = Λ_{X′}(g ∘ g′|_{D′}).
Λ_{X×X′}(F × F′) = Λ_X(F) · Λ_{X′}(F′).
1.5 Topological Consequences
Let S_Ctr be the class of ANRs, and for each X ∈ S_Ctr let I_{S_Ctr}(X) be the union over
compact C ⊆ X of the sets of index admissible upper semicontinuous contractible
valued correspondences F : C → X. The central goal of this book is:

Theorem 1.4.4. There is a unique index Λ_Ctr for S_Ctr, which is multiplicative.
The passage from the indices Λ_{R^m} to Λ_Ctr has two stages. The first exploits
Commutativity to extend from Euclidean spaces and continuous functions to ANRs
and continuous functions. There is a significant result that is the technical basis for
this. Let X be a metric space with metric d. If Y is a topological space and ε > 0,
a homotopy η : Y × [0, 1] → X is an ε-homotopy if

d(η(y, s), η(y, t)) < ε

for all y ∈ Y and all 0 ≤ s, t ≤ 1. We say that η₀ and η₁ are ε-homotopic. For
ε > 0, a topological space D ε-dominates C ⊆ X if there are continuous functions
φ : C → D and ψ : D → X such that ψ ∘ φ : C → X is ε-homotopic to Id_C. In
Section 7.6 we show that:
The second stage passes from continuous functions to contractible valued corre-
spondences. As in the passage from the smooth degree to the continuous degree,
the idea is to use approximation by functions to define the extension. The basis
of this is a result of Mas-Colell (1974) that was extended to ANRs by the author
(McLennan (1991)) and is the topic of Chapter 9.
(b) a neighborhood U′ of Gr(F) such that, for any two continuous functions f₀, f₁ :
D → Y with Gr(f₀), Gr(f₁) ⊆ U′, there is a homotopy h : C × [0, 1] → Y with
h₀ = f₀|_C, h₁ = f₁|_C, and Gr(hₜ) ⊆ U′ for all 0 ≤ t ≤ 1.
(a) A is compact;
(b) A is invariant;
(a) L⁻¹(0) = A;
One of the oldest results in the theory of dynamical systems (Theorem 15.4.1),
due to Lyapunov, is that if there is a Lyapunov function for A, then A is asymptotically
stable.
A converse Lyapunov theorem is a result asserting that if A is asymptotically
stable, then there is a Lyapunov function for A. Roughly speaking, this is true, but
there is in addition the question of what sort of smoothness conditions one may
require of the Lyapunov function. The history of converse Lyapunov theorems is
rather involved, and the issue was not fully resolved until the 1960s. We present
one such theorem (Theorem 15.5.1) that is sufficient for our purposes.
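As a minimal concrete sketch (our example, using the standard notion in which L vanishes exactly on A and decreases strictly along trajectories off A): for the vector field ẋ = −x on R² and A = {0}, L(x) = ‖x‖² is a Lyapunov function, since along trajectories (d/dt)L(x(t)) = 2⟨x, −x⟩ = −2‖x‖² < 0 away from A, and asymptotic stability is visible in the closed-form flow:

```python
import math

# Flow of xdot = -x in R^2: x(t) = e^{-t} x0, computed in closed form.
flow = lambda x0, t: tuple(math.exp(-t) * c for c in x0)
L = lambda x: x[0] ** 2 + x[1] ** 2   # candidate Lyapunov function

x0 = (0.7, -1.3)
ts = [0.1 * k for k in range(50)]
values = [L(flow(x0, t)) for t in ts]

# L decreases strictly along the trajectory and the trajectory enters
# every neighborhood of A = {0}: asymptotic stability, as Lyapunov's
# theorem predicts.
decreasing = all(a > b for a, b in zip(values, values[1:]))
converges = L(flow(x0, 20.0)) < 1e-12
```

Here L(x(t)) = e^{−2t} L(x₀), so the monotone decrease is exact rather than an artifact of discretization.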
There is a well established definition of the index of an isolated equilibrium of
a vector field. We show that this extends to an axiomatically defined vector field
index. The theory of the vector field index is exactly analogous to the theories of
the degree and the fixed point index, and it can be characterized in terms of the
fixed point index. Specifically, a vector field ζ defined on a compact C ⊆ M is
index admissible if it does not have any equilibria in the boundary of C. It turns
out that if ζ is defined on a neighborhood of C, and satisfies the technical condition
guaranteeing the existence and uniqueness of the flow Φ, then the vector field index
of ζ is the fixed point index of Φ(·, t)|_C for small negative t. (The characterization
is in terms of negative time due to an unfortunate normalization axiom for the
vector field index that is now traditional.) One may define the vector field index
of a compact connected component of the set of equilibria to be the index of the
restriction of the vector field to a small compact neighborhood of the component.
The definition of asymptotic stability, and in particular condition (d), should
make us suspect that there is a connection with the Euler characteristic, because
for small positive t the flow Φ(·, t) will map neighborhoods of A into themselves.
The Lyapunov function given by the converse Lyapunov theorem is used in Section
15.6 to show that if A is dynamically stable and an ANR (otherwise the Euler char-
acteristic is undefined) then the vector field index of A is (−1)^m χ(A). In particular,
if A is a singleton, then A can only be stable when the vector field index of A is
(−1)^m. This is the result of Demichelis and Ritzberger. The special case when
A = {p₀} is a singleton is a prominent result in the theory of dynamical systems
due to Krasnoselski and Zabreiko (1984).
We now describe the relationship between this result and qualitative properties
of an equilibrium's comparative statics. Consider the following stylized example.
Let U be an open subset of R^m; an element of U is thought of as a vector of
endogenous variables. Let P be an open subset of R^n; an element of P is thought of
as a vector of exogenous parameters. Let z : U × P → R^m be a C¹ function, and let
∂ₓz(x, η) and ∂_η z(x, η) denote the matrices of partial derivatives of the components
of z with respect to the components of x and η.
We think of z as a parameterized vector field on U. An equilibrium for a
parameter η ∈ P is an x ∈ U such that z(x, η) = 0. Suppose that x₀ is an
equilibrium for η₀, and ∂ₓz(x₀, η₀) is nonsingular. The implicit function theorem
gives a neighborhood V of η₀ and a C¹ function ψ : V → U with ψ(η₀) = x₀ and
z(ψ(η), η) = 0 for all η ∈ V. The method of comparative statics is to differentiate
this equation with respect to η at η₀, then rearrange, obtaining the equation

(dψ/dη)(η₀) = −∂ₓz(x₀, η₀)⁻¹ ∂_η z(x₀, η₀)

describing how the endogenous variables adjust, in equilibrium, to changes in the
vector of parameters. The Krasnoselski-Zabreiko theorem implies that if {x₀} is an
asymptotically stable set for the dynamical system determined by the vector field
z(·, η₀), then the determinant of −∂ₓz(x₀, η₀)⁻¹ is positive. This is a precise and
general statement of the correspondence principle.
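A numerical sketch of the formula (the linear example is ours): for z(x, η) = Ax + Bη with A nonsingular, the equilibrium map is ψ(η) = −A⁻¹Bη, and a finite-difference Jacobian of ψ matches −∂ₓz⁻¹ ∂_η z.

```python
import numpy as np

# z(x, eta) = A x + B eta, with A = D_x z nonsingular and stable.
A = np.array([[-2.0, 1.0], [0.0, -1.0]])   # eigenvalues -2 and -1
B = np.eye(2)                              # D_eta z
psi = lambda eta: np.linalg.solve(A, -B @ eta)   # exact equilibrium map

eta0 = np.array([0.3, -0.4])
h = 1e-6
# Central-difference Jacobian of psi, column by column.
fd = np.column_stack([(psi(eta0 + h * e) - psi(eta0 - h * e)) / (2 * h)
                      for e in np.eye(2)])
formula = -np.linalg.inv(A) @ B   # comparative statics formula
```

Here A has eigenvalues −2 and −1, so the equilibrium is asymptotically stable for every parameter, and indeed det(−A⁻¹) = 1/2 > 0, consistent with the Krasnoselski-Zabreiko theorem.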
Part I
Topological Methods
Chapter 2

Planes, Polyhedra, and Polytopes
This chapter studies basic geometric objects defined by linear equations and
inequalities. This serves two purposes, the first of which is simply to introduce
basic vocabulary. Beginning with affine subspaces and half spaces, we will pro-
ceed to (closed) cones, polyhedra, and polytopes, which are polyhedra that are
bounded. A rich class of well behaved spaces is obtained by combining polyhedra
to form polyhedral complexes. Although this is foundational, there are nonetheless
several interesting and very useful results and techniques, notably the separating
hyperplane theorem, Farkas' lemma, and barycentric subdivision.
α₀y₀ + ⋯ + αᵣyᵣ
then α = β. If y₀, …, yᵣ are not affinely dependent, then they are affinely inde-
pendent.
23
(c) there do not exist α₀, …, αᵣ ∈ R, not all of which are zero, with Σⱼ αⱼ = 0
and Σⱼ αⱼyⱼ = 0.
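Condition (c) is equivalent to linear independence of the differences y₁ − y₀, …, yᵣ − y₀, which yields a practical numerical test. A minimal sketch (the helper and the examples are ours, not from the text):

```python
import numpy as np

def affinely_independent(points):
    """Test condition (c): there is no nonzero alpha with sum(alpha) = 0
    and sum(alpha_j y_j) = 0.  Equivalently, the differences y_j - y_0
    are linearly independent."""
    y = np.asarray(points, dtype=float)
    diffs = y[1:] - y[0]
    return np.linalg.matrix_rank(diffs) == len(diffs)

triangle = [(0, 0), (1, 0), (0, 1)]    # affinely independent
collinear = [(0, 0), (1, 1), (2, 2)]   # affinely dependent
```

For the collinear triple, α = (1, −2, 1) satisfies both conditions in (c), which is exactly what the rank test detects.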
The affine hull aff(S) of a set S ⊆ V is the set of all affine combinations of
elements of S. The affine hull of S contains S as a subset, and we say that S is
an affine subspace if the two sets are equal. That is, S is an affine subspace if it
contains all affine combinations of its elements. Note that the intersection of two
affine subspaces is an affine subspace. If A ⊆ V is an affine subspace and a₀ ∈ A,
then { a − a₀ : a ∈ A } is a linear subspace, and the dimension dim A of A is,
by definition, the dimension of this linear subspace. The codimension of A is
d − dim A. A hyperplane is an affine subspace of codimension one.
A (closed) half-space is a set of the form

H = { v ∈ V : ⟨v, n⟩ ≤ β }
and for small positive t this is less than ‖x₀ − z‖², contradicting the choice of x₀.
A convex cone is a convex set C that is nonempty and closed under multiplication
by nonnegative scalars, so that λx ∈ C for all x ∈ C and λ ≥ 0. Such a cone is
closed under addition: if x, y ∈ C, then x + y = 2(½x + ½y) is a positive scalar
multiple of a convex combination of x and y. Conversely, if a set is closed under
addition and multiplication by positive scalars, then it is a cone.
The dual of a convex set C is

C° = { n ∈ V : ⟨n, x⟩ ≥ 0 for all x ∈ C }.

Theorem 2.2.2 (Farkas' Lemma). If C is a closed convex cone, then for any b ∈
V \ C there is n ∈ C° such that ⟨n, b⟩ < 0.
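As a numerical illustration (our example, using the dual cone convention C° = {n : ⟨n, x⟩ ≥ 0 for all x ∈ C}), take C to be the nonnegative orthant in R², a closed convex cone, and b outside it. The point x₀ of C nearest to b yields a separating functional n = x₀ − b of the kind the lemma asserts:

```python
import numpy as np

# C = nonnegative orthant in R^2; nearest-point projection onto C is
# coordinatewise clipping at zero.
proj_C = lambda v: np.maximum(v, 0.0)

b = np.array([1.0, -2.0])   # outside C (negative second coordinate)
x0 = proj_C(b)              # nearest point of C to b
n = x0 - b                  # candidate element of the dual cone

in_dual = all(n @ e >= 0 for e in np.eye(2))   # <n, x> >= 0 on generators of C
separates = n @ b < 0                          # <n, b> < 0, as the lemma asserts
```

This mirrors the standard proof: ⟨n, c⟩ ≥ 0 for all c ∈ C by the variational characterization of the projection, while ⟨n, b⟩ = −‖n‖² < 0.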
Lemma 2.2.3. Suppose C is nonempty, closed, and convex. Then R_C is the set
of y ∈ V such that ⟨y, n⟩ ≤ 0 whenever H = { v ∈ V : ⟨v, n⟩ ≤ β } is a half space
containing C, so R_C is closed because it is an intersection of closed half spaces. In
addition, C is bounded if and only if R_C = {0}.
C = (C ∩ L_C^⊥) + L_C.
Lemma 2.2.4. If C ≠ V is a closed convex cone, then there is n ∈ C° with ⟨n, x⟩ > 0
for all x ∈ C \ L_C.
2.3 Polyhedra
A polyhedron in V is an intersection of finitely many closed half spaces. We
adopt the convention that V itself is a polyhedron by virtue of being the intersec-
tion of zero half-spaces. Any hyperplane is the intersection of the two half-spaces
it bounds, and any affine subspace is an intersection of hyperplanes, so any affine
subspace is a polyhedron. The dimension of a polyhedron is the dimension of its
affine hull. Fix a polyhedron P .
A face of P is either the empty set, P itself, or the intersection of P with the
bounding hyperplane of some half-space that contains P . Evidently any face of P
Proof. For each i we cannot have P ⊆ Iᵢ because that would imply that G ⊆ Iᵢ,
making Hᵢ redundant. Therefore P must contain some xᵢ in the interior of each
Hᵢ. If x₀ is a convex combination of x₁, …, x_k with positive weights, then x₀ is
contained in the interior of each Hᵢ.
Proposition 2.3.4. For J ⊆ {1, …, k} let F_J = P ∩ ⋂_{j∈J} Iⱼ. Then F_J is a face
of P, and every nonempty face of P has this form.
Corollary 2.3.5. P has finitely many faces, and the intersection of any two faces
is a face.
Corollary 2.3.7. The facets of P are F{1}, …, F{k}. The dimension of each F{i}
is one less than the dimension of P, and the facets are the only faces of P with this
dimension.
Proof. Minimality implies that each F{i} is a proper face, and the result above
implies that F{i} cannot be a proper subset of another proper face. Thus each F{i}
is a facet.
For each i, minimality implies that for each j ≠ i there is some xⱼ ∈ F{i} \ F{j}.
Let x be a convex combination of these with positive weights; then F{i} contains a
neighborhood of x in Iᵢ, so the dimension of F{i} is the dimension of G ∩ Iᵢ, which
is one less than the dimension of P.
A face F that is not a facet is a proper face of some facet, so its dimension is
not greater than two less than the dimension of P .
Now suppose that P is bounded. Any point in P that is not a vertex can be
written as a convex combination of points in proper faces of P . Induction on the
dimension of P proves that:
Proposition 2.3.8. If P is bounded, then it is the convex hull of its set of vertices.
2.4 Polytopes
A polytope in V is the convex hull of a finite set of points. Polytopes were
already studied in antiquity, but the subject continues to be an active area of
research; Ziegler (1995) is a very accessible introduction. We have just seen that
a bounded polyhedron is a polytope. The most important fact about polytopes is
the converse:
Proof. Let L be its lineality, and let K be a linear subspace of V that is complementary
to L in the sense that K ∩ L = {0} and K + L = V . Let Q = P ∩ K. Then
P = Q + L, and the lineality of Q is {0}, so R_Q is pointed. Let S be the convex
hull of the set of initial points of Q. Above we saw that this is the convex hull of
the set of vertices of Q, so S is a polytope. Now Proposition 2.3.2 gives

P = L + R_Q + S.
is the set of points y such that the x_j for j ∈ J are as close to y as any of the points
x_1 , . . . , x_n . From Euclidean geometry we know that the condition ‖y − x_j‖ ≤ ‖y − x_i‖
determines a half space in V (a quick calculation shows that ‖y − x_j‖² ≤ ‖y − x_i‖² if
and only if ⟨y, x_j − x_i⟩ ≥ ½(‖x_j‖² − ‖x_i‖²)), so each P_J is a polyhedron, and conditions
(a) and (b) are easy consequences of Proposition 2.3.4.
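The half-space identity in this parenthetical is easy to verify numerically. A sketch (the helper names are ours, not the text's), on randomly chosen points in ℝ³:

```python
import random

# Check: ||y - x_j||^2 <= ||y - x_i||^2 iff <y, x_j - x_i> >= (||x_j||^2 - ||x_i||^2)/2,
# the quick calculation cited above.
def sq_dist(y, x):
    return sum((a - b) ** 2 for a, b in zip(y, x))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

random.seed(0)
for _ in range(1000):
    y, xi, xj = ([random.uniform(-1, 1) for _ in range(3)] for _ in range(3))
    lhs = sq_dist(y, xj) <= sq_dist(y, xi)
    rhs = dot(y, [a - b for a, b in zip(xj, xi)]) >= (dot(xj, xj) - dot(xi, xi)) / 2
    assert lhs == rhs
```

Since the two sides of the equivalence are obtained from each other by expanding the squared norms, the assertion never fails.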
Fix a polyhedral complex P. A subcomplex of P is a subset Q ⊆ P that
contains all the faces of its elements, so that Q is also a polyhedral complex. If this
is the case, then |Q| is a closed (because it is a finite union of closed subsets) subset
of |P|. We say that P is a polytopal complex if each Pj is a polytope, in which
case P is said to be a polytopal subdivision of |P|. Note that |P| is necessarily
2.5. POLYHEDRAL COMPLEXES 31
σ_Q = conv({ w_P : P ∈ Q })
This construction shows that the underlying space of a polytopal complex is also
the underlying space of a simplicial complex. In addition, repeating this process
can give a triangulation with small simplices. The diameter of a polytope is the
maximum distance between any two of its points. The mesh of a polytopal complex
is the maximum of the diameters of its polytopes.
Consider an ℓ-dimensional simplex P whose vertices are v_0 , . . . , v_ℓ . The barycenter
of P is

β(P ) := (v_0 + · · · + v_ℓ)/(ℓ + 1).
In the construction above, suppose that P is a simplicial complex, and that we
chose w_P = β(P ) for all P . We would like to bound the diameter of the simplices in
the subdivision of |P|, which amounts to giving a bound on the maximum distance
between the barycenters of any two nested faces. After reindexing, these can be
taken to be the faces spanned by v_0 , . . . , v_k and v_0 , . . . , v_ℓ where 0 ≤ k < ℓ ≤ m and
m is the dimension of P. The following rather crude inequality is sufficient for our
purposes.
‖(v_0 + · · · + v_k)/(k + 1) − (v_0 + · · · + v_ℓ)/(ℓ + 1)‖
    = (1/((k + 1)(ℓ + 1))) ‖Σ_{0≤i≤k} Σ_{0≤j≤ℓ} (v_i − v_j)‖
    ≤ (1/((k + 1)(ℓ + 1))) Σ_{0≤i≤k} Σ_{0≤j≤ℓ, j≠i} ‖v_i − v_j‖
    ≤ (1/((k + 1)(ℓ + 1))) (k + 1) ℓ D ≤ (m/(m + 1)) D.
It follows from this that the mesh of the subdivision of |P| is not greater than
m/(m + 1) times the mesh of P. Since we can subdivide repeatedly:
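The bound on the distance between barycenters of nested faces can be checked numerically. A sketch (the helper names are ours), for a random m-simplex:

```python
import itertools
import math
import random

# For nested faces spanned by v_0..v_k and v_0..v_l, the distance between
# their barycenters should be at most m/(m+1) times the diameter D.
def barycenter(verts):
    return [sum(v[c] for v in verts) / len(verts) for c in range(len(verts[0]))]

random.seed(1)
m = 4
verts = [[random.uniform(-1, 1) for _ in range(m)] for _ in range(m + 1)]
D = max(math.dist(u, v) for u, v in itertools.combinations(verts, 2))
for k in range(m):
    for l in range(k + 1, m + 1):
        gap = math.dist(barycenter(verts[:k + 1]), barycenter(verts[:l + 1]))
        assert gap <= m / (m + 1) * D + 1e-12
```

Iterating the subdivision multiplies the mesh by at most m/(m+1) each time, which is how arbitrarily fine triangulations are obtained.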
2.6 Graphs
A graph is a one dimensional polytopal complex. That is, it consists of finitely
many zero and one dimensional polytopes, with the one dimensional polytopes in-
tersecting at common endpoints, if they intersect at all. A one dimensional polytope
is just a line segment, which is a one dimensional simplex, so a graph is necessarily
a simplicial complex.
Relative to general simplicial complexes, graphs sound pretty simple, and from
the perspective of our work here this is indeed the case, but the reader should be
aware that there is much more to graph theory than this. The formal study of
graphs in mathematics began around the middle of the 20th century and quickly
became an extremely active area of research, with numerous subfields, deep results,
and various applications such as the theory of networks in economic theory. Among
the numerous excellent texts in this area, Bollobas (1979) can be recommended to
the beginner.
This book will use no deep or advanced results about graphs. In fact, almost
everything we need to know about them is given in Lemma 2.6.1 below. The main
purpose of this section is simply to introduce the basic terminology of the subject,
which will be used extensively.
Formally, a graph¹ is a pair G = (V, E) consisting of a finite set V of vertices
and a set E of two element subsets of V . An element e = {v, w} of E is called an
edge, and v and w are its endpoints. Sometimes one writes vw in place of {v, w}.
Two vertices are neighbors if they are the endpoints of an edge. The degree of a
vertex is the cardinality of its set of neighbors.
A walk in G is a sequence v_0 v_1 · · · v_r of vertices such that v_{j−1} and v_j are
neighbors for each j = 1, . . . , r. It is a path if v_0 , . . . , v_r are all distinct. A path is
1
In the context of graph theory the sorts of graphs we describe here are said to be simple,
to distinguish them from a more complicated class of graphs in which there can be loops (that is,
edges whose two endpoints are the same) and multiple edges connecting a single pair of vertices.
They are also said to be undirected to distinguish them from so-called directed graphs in which
each edge is oriented, with a source and target.
34 CHAPTER 2. PLANES, POLYHEDRA, AND POLYTOPES
maximal if it is not contained (in the obvious sense) in a longer path. Two vertices
are connected if they are the endpoints of a path. This is an equivalence relation,
and a component of G is one of the graphs consisting of an equivalence class and
the edges in G joining its vertices. We say that G is connected if it has only one
component, so that any two vertices are connected. A walk v_0 v_1 · · · v_r is a cycle if
r ≥ 3, v_0 , . . . , v_{r−1} are distinct, and v_r = v_0 . If G has no cycles, then it is said to
be acyclic. A connected acyclic graph is a tree.
The following simple fact is the only result from graph theory applied in this
book. It is sufficiently obvious that there would be little point in including a proof.
Lemma 2.6.1. If the degree of each of the vertices of G is at most two, then the
components of G are maximal paths, cycles, and vertices with no neighbors.
This simple principle underlies all the algorithms described in Chapter 3. There
are an even number of endpoints of paths in G. If it is known that an odd number
represent or embody a situation that is not what we are looking for, then the rest
do embody what we are looking for, and in particular the number of solutions is
odd, hence positive. If it is known that exactly one endpoint embodies what we are
not looking for, and that endpoint is easily computed, then we can find a solution
by beginning at that point and following the path to its other endpoint.
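A minimal sketch of this path-following principle in code (the graph encoding and function name are ours, not the text's):

```python
from collections import defaultdict

# Lemma 2.6.1 in action: in a graph whose vertices all have degree at most
# two, the component of a degree-one vertex is a maximal path, so we can walk
# from one endpoint of that path to the other.
def follow_path(edges, start):
    nbrs = defaultdict(set)
    for v, w in edges:
        nbrs[v].add(w)
        nbrs[w].add(v)
    assert len(nbrs[start]) == 1, "start must be an endpoint of a path"
    prev, cur = None, start
    while True:
        nxt = [u for u in nbrs[cur] if u != prev]
        if not nxt:              # no unused edge: the other endpoint
            return cur
        prev, cur = cur, nxt[0]

# Example: the path a-b-c-d together with a separate cycle e-f-g-e.
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("e", "f"), ("f", "g"), ("g", "e")]
print(follow_path(edges, "a"))  # -> d
```

The degree bound is what makes the walk deterministic: at every vertex there is at most one edge other than the one just used.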
Chapter 3
Computing Fixed Points
When it was originally proved, Brouwer's fixed point theorem was a major breakthrough,
providing a resolution of several outstanding problems in topology. Since
that time the development of mathematical infrastructure has provided access to
various useful techniques, and a number of easier demonstrations have emerged, but
there are no proofs that are truly simple.
There is an important reason for this. The most common method of proving
that some mathematical object exists is to provide an algorithm that constructs it,
or some proxy such as an arbitrarily accurate approximation, but for fixed points
this is problematic. Naively, one might imagine a computational strategy that
tried to find an approximate fixed point by examining the value of the function at
various points, eventually halting with a declaration that a certain point was a good
approximation of a fixed point. For a function f : [0, 1] → [0, 1] such a strategy
is feasible because if f (x) > x and f (x′) < x′ (as is the case if x = 0 and x′ = 1
unless one of these is a fixed point) then the intermediate value theorem implies
that there is a fixed point between x and x′. According to the sign of f (x″) − x″,
where x″ = (x + x′)/2, we can replace x or x′ with x″, obtaining an interval with the
same property and half the length. Iterating this procedure provides an arbitrarily
fine approximation of a fixed point.
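A sketch of this bisection strategy (the function name is ours; any continuous f mapping [0, 1] into itself will do):

```python
import math

def bisect_fixed_point(f, tol=1e-10):
    # Invariant: f(lo) >= lo and f(hi) <= hi, so the intermediate value
    # theorem puts a fixed point of f inside [lo, hi].
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) >= mid:
            lo = mid          # a fixed point lies in [mid, hi]
        else:
            hi = mid          # a fixed point lies in [lo, mid]
    return (lo + hi) / 2

x = bisect_fixed_point(math.cos)  # cos maps [0, 1] into itself
print(abs(math.cos(x) - x) < 1e-8)  # -> True
```

Each iteration halves the interval, so the number of oracle evaluations needed for accuracy ε is about log₂(1/ε); as the text goes on to explain, no comparable guarantee is possible in higher dimensions.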
In higher dimensions such a computational strategy can never provide a guar-
antee that the output is actually near a fixed point. To say precisely what we mean
by this we need to be a bit more careful. Suppose you set out in search of a fixed
point of a continuous function f : X → X (where X is a nonempty, compact, and
convex subset of a Euclidean space) armed with nothing more than an oracle that
evaluates f . That is, the only computational resources you can access are the theo-
retical knowledge that f is continuous, and a black box that tells you the value of
f at any point in its domain that you submit to it. An algorithm is, by definition,
a computational procedure that is guaranteed to halt eventually, so our supposed
algorithm for computing a fixed point necessarily halts after sampling the oracle
finitely many times, say at x1 , . . . , xn , with some declaration that such-and-such is
at least an approximation of a fixed point. Provided that the dimension of X is
at least two, the Devil could now change the function to one that agrees with the
original function at every point that was sampled, is continuous, and has no fixed
points anywhere near the point designated by the algorithm. (One way to do this is
36 CHAPTER 3. COMPUTING FIXED POINTS
see that two person games can be used to approximate quite general fixed point
problems.
Formally, a finite two person game consists of:
∆^{k−1} = { σ ∈ ℝ^k_+ : σ_1 + · · · + σ_k = 1 }
Figure 3.1: [the strategy simplices S and T , with vertices s1 , s2 , s3 and t1 , t2 , t3 ]
s1 → t2 → s3 → t1 → s2 → t3 → s1 ,
A similar procedure can be used to find Nash equilibria in which each agent
mixes over two pure strategies. If we consider s1 and s2 , we see that there are two
mixtures that allow agent 2 to mix over two pure strategies, and we will need to
consider both of them, so things are a bit more complicated than they were for pure
strategies because the process branches. Suppose that agent 1 mixes over s1 and
s2 in the proportion that makes t1 and t2 best responses. Agent 2 has a mixture of
t1 and t2 that makes s2 and s3 best responses. There is a mixture of s2 and s3 that
makes t1 and t3 best responses, and a certain mixture of t1 and t3 makes s1 and
s2 best responses. The only hope for continuing this path in a way that might lead
to a Nash equilibrium is to now consider the mixture of s1 and s2 that makes t1
and t3 best responses, and indeed, (σ∗, τ∗) is a Nash equilibrium.
We haven't yet considered the possibility that agent 1 might mix over s1 and s3 ,
nor have we examined what might happen if agent 2 mixes over t2 and t3 . There is
a mixture of s1 and s3 that allows agent 2 to mix over t1 and t2 , which is a possibility
we have already considered, and there is a mixture of t2 and t3 that allows agent
1 to mix over s1 and s3 , which we also analyzed above. Therefore there are no
additional Nash equilibria in which both agents mix over two pure strategies.
Could there be a Nash equilibrium in which one of the agents mixes over all
three pure strategies? Agent 2 does have one mixed strategy that allows agent 1 to
mix freely, but this mixed strategy assigns positive probability to all pure strategies
(such a mixed strategy is said to be totally mixed) so it is not a best response
to any of agent 1's mixed strategies, and we can conclude that there is no Nash
equilibrium of this sort. Thus (σ∗, τ∗) is the only Nash equilibrium.
This sort of analysis quickly becomes extremely tedious as the game becomes
larger. In addition, the fact that we are able to find all Nash equilibria in this way
does not prove that there is always something to find.
Before continuing we reformulate Nash equilibrium using a simple principle with
numerous repercussions, namely that a mixed strategy maximizes expected utility if
and only if it assigns all probability to pure strategies that maximize expected utility.
To understand this formally it suffices to note that agent 1's problem is to maximize

u_1(σ, τ) = σᵀAτ = Σ_{i=1}^m Σ_{j=1}^n σ_i a_ij τ_j

subject to the constraints σ_i ≥ 0 for all i and Σ_{i=1}^m σ_i = 1, taking τ as given. From
this it follows that:
Lemma 3.1.1. A mixed strategy profile (σ, τ) is a Nash equilibrium if and only if:

(a) for each i = 1, . . . , m, either σ_i = 0 or Σ_{j=1}^n a_ij τ_j ≥ Σ_{j=1}^n a_{i′j} τ_j for all
i′ = 1, . . . , m;
For each of the m + n conditions there are two possibilities, so there are 2^{m+n} cases. For
each of these cases the intuition derived from counting equations and unknowns
suggests that the set of solutions of the conditions given in Lemma 3.1.1 will typi-
cally be zero dimensional, which is to say that it is a finite set of points. Thus we
expect that the set of Nash equilibria will typically be finite.
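The characterization in Lemma 3.1.1 is mechanical to check. A sketch (the function name and the Matching Pennies example are ours, not the text's): a profile is an equilibrium exactly when every pure strategy played with positive probability attains the maximal expected payoff.

```python
# Test the optimality conditions of Lemma 3.1.1: condition (a) for agent 1
# and its analogue for agent 2, up to a numerical tolerance eps.
def is_nash(A, B, sigma, tau, eps=1e-9):
    m, n = len(A), len(A[0])
    payoff1 = [sum(A[i][j] * tau[j] for j in range(n)) for i in range(m)]
    payoff2 = [sum(sigma[i] * B[i][j] for i in range(m)) for j in range(n)]
    u, v = max(payoff1), max(payoff2)
    return (all(sigma[i] < eps or payoff1[i] > u - eps for i in range(m))
            and all(tau[j] < eps or payoff2[j] > v - eps for j in range(n)))

# Matching Pennies: the uniform mixture is an equilibrium, pure profiles are not.
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
print(is_nash(A, B, [0.5, 0.5], [0.5, 0.5]))  # -> True
print(is_nash(A, B, [1.0, 0.0], [1.0, 0.0]))  # -> False
```

Checking a given profile is easy; as the text emphasizes, finding one is the hard part.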
The Lemke-Howson algorithm is based on the hope that if we relax one of the
conditions above, say the one saying that either σ_1 = 0 or agent 1's first pure
strategy is a best response, then we may expect that the resulting set will be one
dimensional. Specifically, we let M be the set of pairs (σ, τ) ∈ S × T satisfying:
For the rest of the section we will assume that M is 1-dimensional, and that it does
not contain any point satisfying more than m + n of the 2(m + n − 1) conditions
σ_i = 0, strategy i is optimal, τ_j = 0, and strategy j is optimal, for 2 ≤ i ≤
m and 1 ≤ j ≤ n.
For our example there is a path in M that follows the path
This path alternates between the moves in S and the moves in T shown in Figure
3.2 below:
Figure 3.2: [the path followed by the algorithm through S and T , passing through
the points labelled A, B, C, D, E]
best response at A, so there is the possibility of holding A fixed and moving away
from t2 along the edge of T between t1 and t2 . We can't continue in this way past
B because s3 would no longer be a best response. However, at B both s2 and
s3 are best responses, so the conditions defining M place no constraints on agent
1's mixed strategy. Therefore we can move away from (A, B) by holding B fixed
and moving into the interior of S in a way that obeys the constraints on agent 2's
mixed strategy, which are that t1 and t2 are best responses. This edge bumps into
the boundary of S at C. Since the probability of s3 is now zero, we are no longer
required to have it be a best response, so we can continue from B along the edge of
T until we arrive at t1 . Since the probability of t2 is now zero, we can move away
from C along the edge between s1 and s2 until we arrive at D. Since t3 is now a
best response, we can move away from t1 along the edge between t1 and t3 until we
arrive at E. As we saw above, (D, E) = (σ∗, τ∗) is a Nash equilibrium.
We now explain how this works in general. If Y is a proper subset of {1, . . . , m}
and D is a nonempty subset of {1, . . . , n}, let

S_Y(D) = { σ ∈ S : σ_i = 0 for all i ∈ Y and D ⊆ argmax_{j=1,...,n} Σ_i b_ij σ_i }
be the set of mixed strategies for agent 1 that assign zero probability to every pure
strategy in Y and make every pure strategy in D a best response. Evidently SY (D)
is a polytope.
It is now time to say what "typically" means. The matrix B is said to be in
Lemke-Howson general position if, for all Y and D, S_Y(D) is either empty or
(m − |D| − |Y|)-dimensional. That is, S_Y(D) has the dimension one would expect
by counting equations and unknowns. In particular, if m < |D| + |Y|, then S_Y(D)
is certainly empty.
Similarly, if Z is a proper subset of {1, . . . , n} and C is a nonempty subset of
{1, . . . , m}, let
X
TZ (C) = { T : j = 0 for all j Z and C argmax aij j }.
i=1,...,m
j
The matrix A is said to be in Lemke-Howson general position if, for all Z and C,
T_Z(C) is either empty or (n − |C| − |Z|)-dimensional. Through the remainder of
this section we assume that A and B are in Lemke-Howson general position.
The set of Nash equilibria is the union of the cartesian products S_Y(D) × T_Z(C)
over all quadruples (Y, D, Z, C) with Y ∪ C = {1, . . . , m} and Z ∪ D = {1, . . . , n}.
The general position assumption implies that if such a product is nonempty, then
|Y| + |C| = m and |Z| + |D| = n, so that Y and C are disjoint, as are Z and D,
and S_Y(D) × T_Z(C) is zero dimensional, i.e., a singleton. Thus the general position
assumption implies that there are finitely many equilibria.
In addition, we now have

M = ⋃ S_Y(D) × T_Z(C)    (∗)

(c) {2, . . . , m} ⊆ Y ∪ C;

(d) {1, . . . , n} = Z ∪ D;
(ii) if it is an endpoint of one edge quadruple, then it is either the starting point
of the algorithm, but not a Nash equilibrium, or a Nash equilibrium, but not
the starting point of the algorithm;
(iii) if it is an endpoint of two edge quadruples, then it is neither the starting point
of the algorithm nor a Nash equilibrium.
So, suppose that (Y, D, Z, C) is a vertex quadruple. There are two main cases
to consider, the first of which is that it is a Nash equilibrium, so that 1 ∈ Y ∪ C.
If 1 ∈ Y , then (Y \ {1}, D, Z, C) is the only quadruple that could be an edge
quadruple that has (Y, D, Z, C) as an endpoint, and it is in fact such a quadruple:
(a)-(d) hold obviously, and S_{Y\{1}}(D) is nonempty because S_Y(D) is a nonempty
subset. If 1 ∈ C, then (Y, D, Z, C \ {1}) is the only quadruple that could be an
edge quadruple that has (Y, D, Z, C) as an endpoint, and the same logic shows that
it is, except when C = {1}, in which case Y = {2, . . . , m}, i.e., (Y, D, Z, C) is the
starting point of the algorithm. Summarizing, if (Y, D, Z, C) is a Nash equilibrium
vertex quadruple, it is an endpoint of precisely one edge quadruple except when it
3.1. THE LEMKE-HOWSON ALGORITHM 43
is the starting point of the algorithm, in which case it is not an endpoint of any
edge quadruple.
Now suppose that (Y, D, Z, C) is not a Nash equilibrium. Since S_Y(D) and
T_Z(C) are 0-dimensional, |D| + |Y| = m and |C| + |Z| = n, so, in view of (e), one
of the two intersections Y ∩ C and Z ∩ D is a singleton while the other is empty.
First suppose that Z ∩ D = {j}. Then (Y, D, Z \ {j}, C) and (Y, D \ {j}, Z, C)
are the only quadruples that might be edge quadruples that have (Y, D, Z, C) as an
endpoint, and in fact both are: again (a)-(d) hold obviously (except that one must
note that |D| ≥ 2 because |Z ∪ D| = n, |Z| < n, and |Z ∩ D| = 1) and S_Y(D \ {j})
and T_{Z\{j}}(C) are both nonempty because S_Y(D) and T_Z(C) are nonempty subsets.
Taken together, these observations verify (i)-(iii), and complete the formal veri-
fication of the main properties of the Lemke-Howson algorithm. Two aspects of
the procedure are worth noting. First, when S_Y(D) × T_Z(C) is a vertex that
is an endpoint of two edges, the two edges are either S_{Y\{i}}(D) × T_Z(C) and
S_Y(D) × T_Z(C \ {i}) for some i, or S_Y(D) × T_{Z\{j}}(C) and S_Y(D \ {j}) × T_Z(C) for
some j. In both cases one of the edges is the cartesian product of a line segment
in S and a point in T while the other is the cartesian product of a point in S and
a line segment in T . Geometrically, the algorithm alternates between motion in S
and motion in T .
Second, although our discussion has singled out the first pure strategy of agent
1, this was arbitrary, and any pure strategy of either player could be designated for
this role. It is quite possible that different choices will lead to different equilibria.
In addition, although the algorithm was described in terms of starting at this pure
strategy and its best response, the path following procedure can be started at any
endpoint of a path in M. In particular, having computed a Nash equilibrium using
one designated pure strategy, we can then switch to a different designated pure
strategy and follow the path, for the new designated pure strategy, going away
from the equilibrium. This path may go to the starting point of the algorithm
for the new designated pure strategy, but it is also quite possible that it leads
to a Nash equilibrium that cannot be reached directly by the algorithm using any
designated pure strategy. Equilibria that can be reached by repeated applications of
this maneuver are said to be accessible. A famous example due to Robert Wilson
(reported in Shapley (1974)) shows that there can be inaccessible equilibria even
in games with a surprisingly small number of pure strategies.
Aτ + s′ = u e_m ,  Bᵀσ + t′ = v e_n ,  ⟨s′, σ⟩ = 0 = ⟨t′, τ⟩,  ⟨σ, e_m⟩ = 1 = ⟨τ, e_n⟩,
s′, σ ≥ 0 ∈ ℝ^m ,  t′, τ ≥ 0 ∈ ℝ^n .
The set of Nash equilibria is unaffected if we add a constant to every entry in a
column of A, or to every entry of a row of B. Therefore we may assume that all
the entries of A and B are positive, and will do so henceforth. Now the equilibrium
utilities u and v are necessarily positive, so we can divide in the system above,
obtaining the system
Aτ + s = e_m ,  Bᵀσ + t = e_n ,  ⟨s, σ⟩ = 0 = ⟨t, τ⟩,  s, σ ≥ 0 ∈ ℝ^m ,  t, τ ≥ 0 ∈ ℝ^n

together with the formulas ⟨σ, e_m⟩ = 1/v and ⟨τ, e_n⟩ = 1/u for computing equilibrium
expected payoffs. The components of s and t are called slack variables.
This new system is not quite equivalent to the one above because the one above
in effect requires that σ and τ each have some positive components. The new system
has another solution that does not come from a Nash equilibrium, namely σ = 0,
τ = 0, s = e_m , and t = e_n . It is called the extraneous solution. To see that
this is the only new solution consider that if σ = 0, then t = e_n , so that ⟨t, τ⟩ = 0
implies τ = 0, and similarly τ = 0 implies that σ = 0.
We now wish to see the geometry of the Lemke-Howson algorithm in the new
coordinate system. Let
Figure 3.3: [the images of S and T in the new coordinate system]
Cy + x = q ,  ⟨x, y⟩ = 0 ,  x, y ≥ 0 ∈ ℝ^ℓ .    (∗∗)

Let

P = { (x, y) ∈ ℝ^ℓ × ℝ^ℓ : x ≥ 0, y ≥ 0, and Cy + x = q }.
We will assume that all the components of q are positive, that all the entries of C
are nonnegative, and that each row of C has at least one positive entry, so that P
is bounded and thus a polytope. In general a d-dimensional polytope is said to be
simple if each of its vertices is in exactly d facets. The condition that generalizes
the general position assumption on A and B is that P is simple.
Let the projection of P onto the second copy of ℝ^ℓ be

Q = { y ∈ ℝ^ℓ : y ≥ 0 and Cy ≤ q }.

If C = ( 0 A ; Bᵀ 0 ) and q = e_ℓ , then Q = S′ × T′ , and each edge of Q is either the
cartesian product of a vertex of S′ and an edge of T′ or the cartesian product of
an edge of S′ and a vertex of T′ .
Figure 3.4: [the polytope Q, with its vertices and edges]
M = { (x, y) ∈ P : x_2 y_2 + · · · + x_ℓ y_ℓ = 0 }.
x_1 = q_1 − c_11 y_1 − · · · − c_1ℓ y_ℓ ,
  ⋮
x_i = q_i − c_i1 y_1 − · · · − c_iℓ y_ℓ ,
  ⋮
x_ℓ = q_ℓ − c_ℓ1 y_1 − · · · − c_ℓℓ y_ℓ .

Solving the i-th equation for y_1 gives

y_1 = (1/c_i1) q_i − (1/c_i1) x_i − (c_i2/c_i1) y_2 − · · · − (c_iℓ/c_i1) y_ℓ .

Replacing the i-th equation above with this, and substituting it into the other
equations, gives

x_1 = q_1 − (c_11/c_i1) q_i + (c_11/c_i1) x_i − (c_12 − c_11 c_i2/c_i1) y_2 − · · · − (c_1ℓ − c_11 c_iℓ/c_i1) y_ℓ ,
  ⋮
y_1 = (1/c_i1) q_i − (1/c_i1) x_i − (c_i2/c_i1) y_2 − · · · − (c_iℓ/c_i1) y_ℓ ,
  ⋮
x_ℓ = q_ℓ − (c_ℓ1/c_i1) q_i + (c_ℓ1/c_i1) x_i − (c_ℓ2 − c_ℓ1 c_i2/c_i1) y_2 − · · · − (c_ℓℓ − c_ℓ1 c_iℓ/c_i1) y_ℓ .
This is not exactly a thing of beauty, but it evidently has the same form as what
we started with. The data of the algorithm consists of a tableau [q′, C′], a list
describing how the rows and the last ℓ columns of the tableau correspond to the
original variables of the problem, and the variable that vanished when we arrived
at the corresponding vertex. If this variable is either x_1 or y_1 we are done. Otherwise
the data is updated by letting the variable that is complementary to this one
increase, finding the next variable that will vanish when we do so, then updating
the list and the tableau appropriately. This process is called pivoting.
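A sketch of one such pivot step, under the assumption that the tableau [q, C] represents the system x_r = q_r − Σ_s C[r][s] y_s displayed above (the function name and the small example are ours; exact rational arithmetic sidesteps the round-off issues discussed later):

```python
from fractions import Fraction

def pivot(q, C, row, col):
    # Convert to exact rationals so repeated pivots accumulate no round-off.
    q = [Fraction(v) for v in q]
    C = [[Fraction(v) for v in r] for r in C]
    p = C[row][col]
    assert p != 0, "pivot entry must be nonzero"
    ncols = len(C[0])
    # Solve the pivot row for the entering variable:
    # y_col = q[row]/p - (1/p) x_row - sum_{s != col} (C[row][s]/p) y_s
    new_row = [Fraction(1) / p if s == col else C[row][s] / p
               for s in range(ncols)]
    new_q = q[row] / p
    # Substitute the entering variable into every other row.
    for r in range(len(q)):
        if r == row:
            continue
        f = C[r][col]
        q[r] -= f * new_q
        C[r] = [-f * new_row[s] if s == col else C[r][s] - f * new_row[s]
                for s in range(ncols)]
    q[row], C[row] = new_q, new_row
    return q, C

# Example: x_0 = 4 - 2 y_0 - y_1 and x_1 = 3 - y_0 - y_1; let y_0 enter and
# x_0 leave.  Afterwards row 0 reads y_0 = 2 - (1/2) x_0 - (1/2) y_1.
q2, C2 = pivot([4, 3], [[2, 1], [1, 1]], 0, 0)
print(q2)  # -> [Fraction(2, 1), Fraction(1, 1)]
```

Updating the list recording which variable is basic in each row, and choosing the entering variable to be the complement of the one that just left, turns this single step into the full complementary pivoting loop.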
We can now describe how the algorithm works in the degenerate case when P
is not necessarily simple. From a conceptual point of view, our method of handling
degenerate problems is to deform them slightly, so that they become nondegenerate,
but in the end we will have only a combinatoric rule for choosing the next pivot
variable. Let L = { (x, y) R R : Cy + x = q }, let 1 , . . . , , 1 , . . . , be
distinct positive integers, and for > 0 let
(a) terms with positive coefficients are greater than terms with negative coeffi-
cients;
(b) among terms with positive coefficients, those with smaller exponents are
greater than terms with larger exponents, and if two terms have equal ex-
ponents they are ordered according to the coefficients;
3.3. USING GAMES TO FIND FIXED POINTS 49
(c) among terms with negative coefficients, those with larger exponents are greater
than terms with smaller exponents, and if two terms have equal exponents
they are ordered according to the coefficients.
We now eliminate all i for which the dominant term is not minimal. All remaining
i have the same dominant term, and we continue by subtracting off this term and
comparing the resulting expressions in a similar manner, repeating until only one i
remains. This process necessarily continues until only one i remains, because
if the other terms of the expressions above fail to distinguish between two possibilities,
eventually there will be a comparison involving the terms ε^{α_i}/c_i1 , and the exponents
α_1 , . . . , α_ℓ , β_1 , . . . , β_ℓ are distinct.
Let's review the situation. We have given an algorithm that finds a solution
of the linear complementarity problem (∗∗) that is different from (q, 0). The assumptions
that ensure that the algorithm works are that all the components of q are positive and
that P is a polytope. In particular, these assumptions are satisfied when the linear
complementarity problem is derived from a two person game with positive payoffs, in which
case any solution other than (q, 0) corresponds to a Nash equilibrium. Therefore
any two person game with positive payoffs has a Nash equilibrium, but since the
equilibrium conditions are unaffected by adding a constant to a players payoffs, in
fact we have now shown that any two person game has a Nash equilibrium.
There are additional issues that arise in connection with implementing the al-
gorithm, since computers cannot do exact arithmetic on arbitrary real numbers.
One possibility is to require that the entries of q and C lie in a set of numbers
for which exact arithmetic is possible (usually the rationals, but there are other
possibilities, at least theoretically). Alternatively, one may work with floating point
numbers, which is more practical, but also more demanding because there are issues
associated with round-off error, and in particular its accumulation as the number of
pivots increases. The sort of pivoting we have studied here also underlies the sim-
plex algorithm for linear programming, and the same sorts of ideas are applied to
resolve degeneracy. Numerical analysis for linear programming has a huge amount
of theory, much of which is applicable to the Lemke-Howson algorithm, but it is far
beyond our scope.
Of course this observation does not prove anything, but it does point in a useful
direction. Let x_1 , . . . , x_n , y_1 , . . . , y_n ∈ X be given. We can define a finite two person
game with n × n payoff matrices A = (a_ij ) and B = (b_ij ) by setting

a_ij = −‖x_i − y_j‖²  and  b_ij = 1 if i = j, 0 if i ≠ j.
Then

Σ_j a_ij τ_j = −Σ_j τ_j ‖x_i − y_j‖² = −Σ_j τ_j ⟨x_i − y_j , x_i − y_j⟩
    = −⟨x_i , x_i⟩ + 2 ⟨x_i , Σ_j τ_j y_j⟩ − Σ_j τ_j ⟨y_j , y_j⟩
    = −‖x_i − z‖² + C,

where z = Σ_j τ_j y_j and C = ‖z‖² − Σ_{j=1}^n τ_j ‖y_j‖² is a quantity that does not depend on i. Therefore σ
is a best response to τ if and only if it assigns all probability to those i with x_i as
close to z as possible. If y_1 ∈ F (x_1 ), . . . , y_n ∈ F (x_n ), then there is a sense in which
a Nash equilibrium may be regarded as a point that is approximately fixed.
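The computation above is easy to confirm numerically. In this sketch (all names are ours) the pure best responses against τ coincide with the indices whose x_i are nearest to z:

```python
import random

# Build the payoff matrix A from points x_1..x_n, y_1..y_n, pick a mixed
# strategy tau, and check that maximizing expected payoff is the same as
# minimizing the distance to z = sum_j tau_j y_j.
def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

random.seed(2)
d, n = 3, 5
xs = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
ys = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(n)]
A = [[-sq_dist(xi, yj) for yj in ys] for xi in xs]

w = [random.random() for _ in range(n)]
tau = [wi / sum(w) for wi in w]
z = [sum(tau[j] * ys[j][c] for j in range(n)) for c in range(d)]

payoffs = [sum(A[i][j] * tau[j] for j in range(n)) for i in range(n)]
best = max(range(n), key=lambda i: payoffs[i])
nearest = min(range(n), key=lambda i: sq_dist(xs[i], z))
assert best == nearest
```

The assertion holds because, as shown above, the expected payoff of pure strategy i is −‖x_i − z‖² plus a constant that does not depend on i.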
We are going to make this precise, thereby proving Kakutani's fixed point theorem.
Assume now that F is upper semicontinuous with convex values. Define
sequences x_1 , x_2 , . . . and y_1 , y_2 , . . . inductively as follows. Choose x_1 arbitrarily, and
let y_1 be an element of F (x_1 ). Supposing that x_1 , . . . , x_n and y_1 , . . . , y_n have already
been determined, let (σⁿ, τⁿ) be a Nash equilibrium of the two person game
with payoff matrices Aⁿ = (aⁿ_ij ) and Bⁿ = (bⁿ_ij ) where aⁿ_ij = −‖x_i − y_j‖² and bⁿ_ij is
1 if i = j and 0 otherwise. Let x_{n+1} = Σ_j τⁿ_j y_j , and choose y_{n+1} ∈ F (x_{n+1} ).
Let x∗ be an accumulation point of the sequence {x_n}. To show that x∗ is a
fixed point of F it suffices to show that it is an element of the closure of any convex
neighborhood V of F (x∗). Choose δ > 0 such that F (x) ⊆ V for all x ∈ U_δ(x∗).
Consider an n such that x_{n+1} = Σ_j τⁿ_j y_j ∈ U_{δ/3}(x∗) and at least one of x_1 , . . . , x_n
is also in this ball. Then the points in x_1 , . . . , x_n that are closest to x_{n+1} are in
U_{2δ/3}(x_{n+1}) ⊆ U_δ(x∗), so x_{n+1} is a convex combination of points in V , and is
therefore in V . Therefore x∗ is in the closure of the set of x_n that lie in V , and thus
in the closure of V .
In addition to proving the Kakutani fixed point theorem, we have accumulated
all the components of an algorithm for computing approximate fixed points of
a continuous function f : X → X. Specifically, for any error tolerance ε > 0 we
compute the sequences x_1 , x_2 , . . . and y_1 , y_2 , . . . with f in place of F , halting when
‖x_{n+1} − f (x_{n+1})‖ < ε. The argument above shows that this is, in fact, an algorithm,
in the sense that it is guaranteed to halt eventually. This algorithm is quite new.
Code implementing it exists, and the initial impression is that it performs quite
well. But it has not been extensively tested.
3.4. SPERNERS LEMMA 51
There is one more idea that may have some algorithmic interest. As before, we
consider points x_1 , . . . , x_n , y_1 , . . . , y_n ∈ ℝ^d . Define a correspondence Φ : ℝ^d → ℝ^d
by letting Φ(z) be the convex hull of { y_j : j ∈ argmin_i ‖z − x_i‖ }.
(Evidently this construction is closely related to the Voronoi diagram determined by
x_1 , . . . , x_n . Recall that this is the polyhedral decomposition of ℝ^d whose nonempty
polyhedra are the sets P_J = { z ∈ V : J ⊆ argmin_i ‖z − x_i‖ } where ∅ ≠ J ⊆
{1, . . . , n}.) Clearly Φ is upper semicontinuous and convex valued.
Suppose that z is a fixed point of this correspondence. Then z is a convex
combination Σ_j τ_j y_j with τ_j = 0 if j ∉ argmin_i ‖z − x_i‖. Let J = { j : τ_j >
0 }. If σ_i = 1/|J| when i ∈ J and σ_i = 0 when i ∉ J, then (σ, τ) is a Nash
equilibrium of the game derived from x_1 , . . . , x_n , y_1 , . . . , y_n . Conversely, if (σ, τ) is
a Nash equilibrium of this game, then Σ_j τ_j y_j is a fixed point of Φ. In a sense,
the algorithm described above approximates the given correspondence F with a
correspondence of a particularly simple type.
We may project the path of the Lemke-Howson algorithm, in its application to
the game derived from x_1 , . . . , x_n , y_1 , . . . , y_n , into this setting. Define Φ_1 : ℝ^d → ℝ^d
by letting Φ_1(z) be the convex hull of { y_i : i ∈ {1} ∪ argmin_i ‖z − x_i‖ }. Suppose that
(σ, τ) is an element of the set M defined in Section 3.1, so that all the conditions
of Nash equilibrium are satisfied except that it may be the case that σ_1 > 0 even if
the first pure strategy is not optimal. Let J = { j : τ_j > 0 }, and let z = Σ_j τ_j y_j .
Then J ⊆ { i : σ_i > 0 } ⊆ {1} ∪ argmin_j ‖z − x_j‖, so z ∈ Φ_1(z). Conversely, suppose
z is a fixed point of Φ_1 , and let J = argmin_j ‖z − x_j‖. Then z = Σ_j τ_j y_j for some
τ ∈ ∆^{n−1} with τ_j = 0 for all j ∉ {1} ∪ J. If we let σ be the element of ∆^{n−1} such
that σ_i = 1/|{1} ∪ J| if i ∈ {1} ∪ J and σ_i = 0 if i ∉ {1} ∪ J, then (σ, τ) ∈ M.
If n is large one might guess that there is a sense in which operating in ℝ^d might
be less burdensome than working in ∆^{n−1} × ∆^{n−1} , but it seems to be difficult to
devise algorithms that take concrete advantage of this. Nonetheless this setup does
give a picture of what the Lemke-Howson algorithm is doing that has interesting
implications. For example, if there is no point in ℝ^d that is equidistant from more
than d + 1 points, then there is no point (σ, τ) ∈ M with σ_i > 0 for more than
d + 2 indices. This gives a useful upper bound on the number of pivots of the
Lemke-Howson algorithm.
Figure 3.5: [a triangulation of the simplex with a Sperner labelling; each vertex
carries a label from {1, 2, 3}]
Elementary properties of the determinant imply that each p_σ and p are polynomial
functions. For sufficiently small t the simplices σ(t) are the (d − 1)-dimensional
simplices of a triangulation of ∆^{d−1}.³ Therefore p(t) = (1/d!)D for small t. Since p is a
2
Actually, it is straightforward if you know integration, but Gauss regarded this as too heavy
a tool, expressing a wish for a more elementary theory of the volume of polytopes. The third of
Hilbert's famous problems asks whether it is possible, for any two polytopes of equal volume, to
triangulate the first in such a way that the pieces can be reassembled to give the second. This
was resolved negatively by Hilbert's student Max Dehn within a year of Hilbert's lecture laying
out the problems, and it remains the case today that there is no truly elementary theory of the
volumes of polytopes. In line with this, our discussion presumes basic facts about d-dimensional
measure of polytopes in Rd that are very well understood by people with no formal mathematical
training, but which cannot be justified formally without appealing to relatively advanced theories
of measure and integration.
3
This is visually obvious, and a formal proof would be tedious, so we provide only a sketch.
Suppose that for each v ∈ V we have a path connected neighborhood U_v of v in the interior of the
smallest face of ∆^{d−1} containing v, and this system of neighborhoods satisfies the condition that
for any simplex in P, say with vertices v_1, . . . , v_k, if v_1′ ∈ U_{v_1}, . . . , v_k′ ∈ U_{v_k}, then v_1′, . . . , v_k′ are
affinely independent. We claim that a simplicial complex obtained by replacing each v with some
element of U_v is a triangulation of ∆^{d−1}; note that this can be proved by moving one vertex at a
time along a path. Finally observe that because ℓ is a Sperner labelling, for each v and 0 ≤ t < 1,
v(t) is contained in the interior of the smallest face of ∆^{d−1} containing v.
54 CHAPTER 3. COMPUTING FIXED POINTS
polynomial function of t, it follows that it is constant, and in particular p(1) = (1/d!) D.
We have established the following refinement of Sperner's lemma:
Theorem 3.4.2. If ℓ is a Sperner labelling, then the number of σ ∈ P^{d−1} such that
ℓ is orientation preserving on σ is one greater than the number of σ ∈ P^{d−1} such
that ℓ is orientation reversing on σ.
One of our major themes is that fixed points where the function or correspon-
dence reverses orientation are different from those where orientation is preserved.
Much of what follows is aimed at keeping track of this difference in increasingly
general settings.
[Figure: a labelled triangulation of the two-dimensional simplex.]
Figure 3.6
V = V^0 ∪ · · · ∪ V^{d−1} and E = E^0 ∪ F^1 ∪ E^1 ∪ · · · ∪ E^{d−2} ∪ F^{d−1} ∪ E^{d−1}.
V, we can compute the simplices of its neighbors in and the labels of the vertices
of these simplices. If we remember which of these neighbors we were at prior to
arriving at the current element of V, then the next step in the algorithm is to go to
the other neighbor. Such a step along the path of the algorithm is called a pivot.
[Figure: a labelled triangulation of the two-dimensional simplex.]
Figure 3.7
At this point we remark on a few aspects of the Scarf algorithm, and later
we will compare it with various alternatives. The first point is that it necessarily
moves through ∆^{d−1} rather slowly. Consider a k-almost completely labelled simplex
σ. Each pivot of the algorithm drops one of the vertices of the current simplex,
possibly adding a new vertex, or possibly dropping down to a lower dimensional
face. Therefore a minimum of k pivots are required before one can possibly arrive
at a simplex that has no vertex in common with σ. If the grid is fine, the algorithm
will certainly require many pivots to arrive at a fixed point far from the algorithm's
starting point.
This suggests the following strategy. We first apply the Scarf algorithm to a
coarse given triangulation of ∆^{d−1}, thereby arriving at a completely labelled simplex
that is hopefully a rough approximation of a fixed point. We then subdivide the
given triangulation of ∆^{d−1}, using barycentric subdivision or some other method.
If we could somehow restart the algorithm in the fine triangulation, near the
completely labelled simplex in the coarse triangulation, it might typically be the
case that the algorithm did not have to go very far to find a completely labelled
simplex in the fine triangulation. Restart methods do exist (see, e.g., Merrill (1972),
Kuhn and MacKinnon (1975), and van der Laan and Talman (1979)) but it remains
the case that the Scarf algorithm has not proved to be very useful in practice,
perhaps due in part to its difficulties with high dimensional problems.
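The coarse-to-fine idea is easy to illustrate in one dimension, where a "completely labelled" cell is just a grid interval whose endpoints carry both labels. The following toy sketch is ours (function names are illustrative, and labelling a point by comparing f(x) with x stands in for the Sperner labelling):

```python
# A grid point x gets label 1 if f(x) >= x and label 2 otherwise, so a
# completely labelled cell is a grid interval whose endpoints carry
# both labels, and such a cell contains a fixed point of f.

def completely_labelled_cell(f, a, b, n):
    """Scan a uniform grid of n cells on [a, b] for a cell whose
    endpoints carry labels {1, 2}; returns (lo, hi) or None."""
    h = (b - a) / n
    label = lambda x: 1 if f(x) >= x else 2
    for i in range(n):
        lo, hi = a + i * h, a + (i + 1) * h
        if {label(lo), label(hi)} == {1, 2}:
            return lo, hi
    return None

def refine(f, a, b, n, rounds):
    """Find a labelled cell on a coarse grid, then restart the search
    on successively finer grids restricted to that cell."""
    cell = completely_labelled_cell(f, a, b, n)
    for _ in range(rounds):
        if cell is None:
            return None
        cell = completely_labelled_cell(f, cell[0], cell[1], n)
    return cell
```

Each round shrinks the cell by a factor of n while searching only n cells, rather than the n^rounds cells of the uniformly fine grid.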
There is one more feature of the Scarf algorithm that is worth mentioning. In
our description of the algorithm the ordering of the vertices plays an explicit role,
and can easily make a difference to the outcome. If one wishes to find more than
one completely labelled simplex, or perhaps as many as possible, or perhaps even all
of them, there is the following strategy. Having followed the algorithm for the given
ordering of the indices to its terminus, now proceed from that completely labelled
simplex in the graph associated with some different ordering. This might lead
back to the starting point of the algorithm in , but it is also quite possible that
it might lead to some completely labelled simplex that cannot be reached directly
by the algorithm under any ordering of the indices. A completely labelled simplex
is accessible if it is reachable by the algorithm in this more general sense: there
is a path going to it from the starting point of the algorithm for some ordering of the
indices, along a path that is a union of maximal paths of the various graphs for
the various orderings of the indices.
3.6 Homotopy
Let f : X → X be a continuous function, and let x_0 be an element of X. We
let h : X × [0, 1] → X be the homotopy
h(x, t) = (1 − t)x_0 + t f(x).
Here we think of the variable t as time, and let h_t = h(·, t) : X → X be the function
at time t. In this way we imagine deforming the constant function with value x_0
at time zero into the function f at time one.
Let g : X × [0, 1] → X be the function g(x, t) = h(x, t) − x. The idea of the
homotopy method is to follow a path in Z = g^{−1}(0) starting at (x_0, 0) until we reach
a point of the form (x∗, 1). As a practical matter it is necessary to assume that f is
C^1, so that h and g are C^1. It is also necessary to assume that the derivative of g has
full rank at every point of Z, and that the derivative of the map x ↦ f(x) − x has
full rank at each of the fixed points of f. As we will see later in the book, there is a
sense in which this is typically the case, so that these assumptions are mild. With
these assumptions Z will be a union of finitely many curves. Some of these curves
will be loops, while others will have two endpoints in X × {0, 1}. In particular,
the other endpoint of the curve beginning at (x_0, 0) cannot be in X × {0}, because
there is only one point in Z ∩ (X × {0}), so it must be (x∗, 1) for some fixed point
x∗ of f.
We now have to tell the computer how to follow this path. The standard com-
putational implementation of curve following is called the predictor-corrector
method. Suppose we are at a point z_0 = (x, t) ∈ Z. We first need to compute a
vector v that is tangent to Z at z_0. Algebraically this amounts to finding a nonzero
linear combination of the columns of the matrix of Dg(z_0) that vanishes. For this
it suffices to express one of the columns as a linear combination of the others, and,
roughly speaking, the Gram-Schmidt process can be used to do this. We can divide
any vector we obtain this way by its norm, so that v becomes a unit vector. There is
a parameter of the procedure called the step size that is a number δ > 0, and the
predictor part of the process is completed by passing to the point z_1 = z_0 + δv.
The corrector part of the process uses the Newton method to pass from z_1 to
a new point in Z, or at least very close to it. The first step is to find a vector w_1
that is orthogonal to v such that g(z_1) + Dg(z_1)w_1 = 0. To do this we can use the
Gram-Schmidt process to find a basis for the orthogonal complement of v, compute
the matrix M of the derivative of g with respect to this basis, compute the inverse
of M, and then set w_1 = −M^{−1} g(z_1). We then set z_2 = z_1 + w_1, find a vector w_2
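A minimal numerical sketch of the predictor-corrector loop, for a one dimensional fixed point problem so that the zero set of g is a curve in the (x, t) plane. The function names, the numerical differentiation, and the homotopy h(x, t) = (1 − t)x_0 + t f(x) are our illustrative assumptions, not a definitive implementation.

```python
import numpy as np

def follow(f, x0, delta=0.01, newton_steps=3):
    """Follow the curve g^{-1}(0), g(x,t) = (1-t)x0 + t f(x) - x,
    from (x0, 0) until t reaches 1; returns the approximate fixed point."""
    g = lambda x, t: (1 - t) * x0 + t * f(x) - x
    def grad(x, t, eps=1e-6):                  # numerical derivative of g
        return np.array([(g(x + eps, t) - g(x, t)) / eps,
                         (g(x, t + eps) - g(x, t)) / eps])
    z = np.array([x0, 0.0])
    v_prev = np.array([0.0, 1.0])              # initial direction: increasing t
    while z[1] < 1.0:
        n = grad(*z)
        v = np.array([-n[1], n[0]])            # tangent: orthogonal to gradient
        v /= np.linalg.norm(v)
        if v @ v_prev < 0:                     # keep moving forward, not back
            v = -v
        z = z + delta * v                      # predictor step
        for _ in range(newton_steps):          # corrector: Newton steps back to Z
            n = grad(*z)
            z = z - g(*z) * n / (n @ n)
        v_prev = v
    return z[0]
```

The corrector moves along the gradient of g, which is orthogonal to the tangent v, matching the description above.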
3.7. REMARKS ON COMPUTATION 59
compute:
the next state of the processor,
a bit that will be written at the current location of the input-output device
(overwriting the bit that was just read) and
a motion (forward, back, stay put) of the input-output device.
The computation ends when it reaches a particular state of the machine called
Halt. Once that happens, the data in the storage device is regarded as the
output of the computation.
As you might imagine, an analysis based on a concrete and detailed description
of the operation of a Turing machine can be quite tedious. Fortunately, it is
rarely necessary. Historically, other models of computation were proposed, but were
subsequently found to be equivalent to the Turing model, and the Church-Turing
thesis is the hypothesis that all reasonable models of computation are equivalent,
in the sense that they all yield the same notion of what it means for something to be
computable. This is a metamathematical assertion: it can never be proved, and a
refutation would not be logical, but would instead be primarily a social phenomenon,
consisting of researchers shifting their focus to some inequivalent model.
Once we have the notion of a Turing machine, we can define an algorithm to
be a Turing machine that eventually halts, for any input state of the storage device.
A subtle distinction is possible here: a Turing machine that always halts is not
necessarily the same thing as a Turing machine that can be proved to halt, regardless
of the input. In fact one of the most important early theorems of computer science
is that there is no algorithm that has, as input, a description of a Turing machine
and a particular input, and decides whether the Turing machine with that input will
eventually halt. As a practical matter, one almost always works with algorithms
that can easily be proved to be such, in the sense that it is obvious that they
eventually halt.
A computational problem is a rule that associates a nonempty set of outputs
with each input, where the set of possible inputs and outputs is the set of pairs
consisting of a position of the input-output device and a state of the storage medium
in which there are finitely many nonblank cells. (Almost always the inputs of
interest are formatted in some way, and this definition implicitly makes checking the
validity of the input part of the problem.) A computational problem is computable
if there is an algorithm that passes from each input to one of the acceptable outputs.
The distinction between computational problems that are computable and those
that are not is fundamental, with many interesting and important aspects, but in
our discussion here we will focus exclusively on problems that are known to be
computable.
For us the most important distinction is between those computable computa-
tional problems that are easy and those that are hard, where the definitions
of these terms remain to be specified. In order to be theoretically useful, the
easiness/hardness distinction should not depend on the architecture of a particular
machine or the technology of a particular era. In addition, it should be robust, at
least in the sense that a composition of two easy computational problems, where
the output of the first is the input of the second, should also be easy, and possi-
bly in other senses as well. For these reasons, looking at the running time of an
algorithm on a particular input is not very useful. Instead, it is more informative
to think about how the resources (time and memory) consumed by a computation
increase as the size of the input grows. In theoretical computer science, the most
useful distinction is between algorithms whose worst case running time is bounded
by a polynomial function of the size of the input, and algorithms that do not
have this property. The class of computational problems that have polynomial time
algorithms is denoted by P. If the set of possible inputs of a computational prob-
lem is finite, then the problem is trivially in P, and in fact we will only consider
computational problems with infinite sets of inputs.
There are many kinds of computational problems, e.g., sorting, function evalua-
tion, optimization, etc. For us the most important types are decision problems ,
which require a yes or no answer to a well posed question, and search problems,
which require an instance of some sort of object or a verification that no such ob-
ject exists. An important example of a decision problem is Clique: given a simple
undirected graph G and an integer k, determine whether G has a clique with k
nodes, where a clique is a collection of vertices such that G has an edge between
any two of them. An example of a search problem is to actually find such a clique
or to certify that no such clique exists.
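A brute force decision procedure for Clique is immediate from the definition; this sketch is ours (exponential in the number of vertices, as expected for a naive attack on an NP problem):

```python
from itertools import combinations

def has_clique(n, edges, k):
    """Decide whether the simple undirected graph on vertices 0..n-1
    with the given edge list contains a clique with k vertices, by
    testing every candidate vertex set."""
    adj = set(frozenset(e) for e in edges)
    return any(all(frozenset((u, v)) in adj for u, v in combinations(c, 2))
               for c in combinations(range(n), k))
```

A k-clique found this way is exactly the polynomial time verifiable witness discussed below.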
There is a particularly important class of decision problems called NP, which
stands for nondeterministic polynomial time. Originally NP was thought of as
the class of decision problems for which a Turing machine that chose its next state
randomly has a positive probability of showing that the answer is Yes when this
is the case. For example, if a graph has a k-clique, an algorithm that simply guesses
which elements constitute the clique has a positive probability of stumbling onto
some k-clique. The more modern way of thinking about NP is that it is the class of
decision problems for which a Yes answer has a certificate or witness that can
be verified in polynomial time. In the case of Clique an actual k-clique is such a
witness. Factorization of integers is another algorithmic issue which easily generates
decision problemsfor example, does a given number have a prime factor whose
first digit is 3?that are in NP because a prime factorization is a witness for them.
(One of the historic recent advances in mathematics is the discovery of a polynomial
time algorithm for testing whether a number is prime. Thus it is possible to verify
the primality of the elements of a factorization in polynomial time.)
An even larger computational class is EXP, which is the class of computational
problems that have algorithms with running times that are bounded above by a
function of the form exp(p(s)), where s is the size of the problem and p is a poly-
nomial function. Instead of using time to define a computational class, we can
also use space, i.e., memory; PSPACE is the class of computational problems that
have algorithms that use an amount of memory that is bounded by a polynomial
function of the size of the input. The sizes of the certificates for a problem in
NP are necessarily bounded by some polynomial function of the size of the input,
and the problem can be solved by trying all possible certificates not exceeding this
bound, so any problem in NP is also in PSPACE. In turn, the number of processor
state-memory state pairs during the run of a program using polynomially bounded
P ⊆ NP ⊆ PSPACE ⊆ EXP.
naturally; Clique is one of them. One of the most famous problems in contem-
porary mathematics is to determine whether NP is contained in P. This question
boils down to deciding whether Clique (or any other NP-complete problem) has
a polynomial time algorithm. This is thought to be highly unlikely, both because a
lot of effort has gone into designing algorithms for these problems, and because the
existence of such an algorithm would have remarkable consequences. It should be
mentioned that this problem is, to some extent at least, an emblematic representa-
tive of numerous open questions in computer science that have a similar character.
In fact, one of the implicit conventions of the discipline is to regard a computational
problem as hard if, after some considerable effort, people haven't been able to figure
out whether it is hard or easy.
For any decision problem in NP there is an associated search problem, namely
to find a witness for an affirmative answer or verify that the answer is negative.
For Clique this means not only showing that a clique of size k exists, but actually
producing one. The class of search problems associated with decision problems is
called FNP. (The F stands for function.) For Clique the search problem is
not much harder than the decision problem, in the following sense: if we had a
polynomial time algorithm for the decision problem, we could apply it to the graph
with various vertices removed, repeatedly narrowing the focus until we found the
desired clique, thereby solving the search problem in polynomial time.
However, there is a particular class of problems for which the search problem
is potentially quite hard, even though the decision problem is trivial because the
answer is known to be yes. This class of search problems is called TFNP. (The
T stands for total.) There are some trivial decision problems that give rise
to quite famous problems in this class:
Does an integer have a prime factorization? Testing primality can now be
done in polynomial time, but there is still no polynomial time algorithm for
factoring.
Given a set of positive integers {a_1, . . . , a_n} with a_i < 2^n/n for all i, do there
exist two different subsets with the same sum? There are 2^n different subsets,
and the sum of any one of them is less than 2^n − n + 1, so the pigeonhole
principle implies that the answer is certainly yes.
Does a two person game have sets of pure strategies for the agents that are
the supports4 of a Nash equilibrium? Verifying that a pair of sets are the
support of a Nash equilibrium is a computation involving linear algebra and a
small number of inequality verifications that can be performed in polynomial
time.
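The pigeonhole argument guarantees that two equal-sum subsets exist, but the only obvious way to find them is exhaustive search, as in this sketch (function name is ours):

```python
from itertools import combinations

def equal_sum_subsets(a):
    """Exhaustively search for two different subsets of a with the same
    sum; the pigeonhole argument guarantees success when a_i < 2^n/n,
    but no polynomial time algorithm for this search is known."""
    seen = {}
    n = len(a)
    for r in range(n + 1):
        for c in combinations(range(n), r):
            s = sum(a[i] for i in c)
            if s in seen:
                return seen[s], c   # two index sets with equal sums
            seen[s] = c
    return None
```
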
Problems involving a function defined on some large space must be specified
with a bit more care, because if the function is given by listing its values, then the
problem is easy, relative to the size of the input, because the input is huge. Instead,
one takes the input to be a Turing machine that computes (in polynomial time) the
value of the function at any point in the space.
4
The support of a mixed strategy is the set of pure strategies that are assigned positive
probability.
Given a Turing machine that computes a real valued function at every vertex
of a graph, is there a vertex where the function's value is at least as large as
the function's value at any of the vertex's neighbors in the graph? Since the
graph is finite, the function has a global maximum and therefore at least one
local maximum.
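For this problem, greedy ascent along the graph terminates because the value strictly increases at each step and the graph is finite; a minimal sketch (names are ours, with the function given by a table rather than a Turing machine):

```python
def local_max(values, neighbors):
    """Greedy ascent to a local maximum of a function on a finite graph:
    from vertex 0, repeatedly move to a strictly better neighbor; the
    strictly increasing values guarantee termination."""
    v = 0
    while True:
        better = [u for u in neighbors[v] if values[u] > values[v]]
        if not better:
            return v
        v = better[0]
```
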
There is a rather subtle point that is worth mentioning here. In our descriptions
of Lemke-Howson, Scarf, and homotopy, we implicitly assumed that the algorithm
used its memory of where it had been to decide which direction to go in the graph,
but the definition of EOTL requires that the graph be directed, which means in
effect that if we begin at any point on the path, we can use local information to de-
cide which of the two directions in the graph constitutes forward motion. It turns
out that each of our three algorithms has this property; a proper explanation of
this would require more information about orientation than we have developed at
this point. The class of problems that can be reduced to the computational prob-
lem that has the same features as EOTL, except that the graph is undirected, is
PPA. Despite the close resemblance to PPAD, the theoretical properties of the
two classes differ in important ways.
In a series of rapid developments in 2005 and 2006 (Daskalakis et al. (2006);
Chen and Deng (2006b,a)) it was shown that computing a Nash equilibrium of a
two player game is PPAD-complete, and also that the two dimensional Sperner
problem is PPAD-complete. This means that computing a Nash equilibrium of a
two player game is almost certainly hard, in the sense that there is no polynomial
time algorithm for the problem, because computing general fixed points is almost
certainly hard. Since this breakthrough many other computational problems have
been shown to be PPAD-complete, including finding Walrasian equilibria in seem-
ingly quite simple exchange economies. In various senses the problem does not go
away if we relax the problem, asking for a point that is -approximately fixed for
an that is significantly greater than zero.
The current state of theory presents a contrast between theoretical concepts
that classify even quite simple fixed point problems as intractable, and algorithms
that often produce useful results in a reasonable amount of time. A recent result
presents an even more intense contrast. The computational problem OEOTL has
the same given data as EOTL, but now the goal is to find the other end of the path
beginning at (0, . . . , 0), and not just any second leaf of the graph. Goldberg et al.
(2011) show that OEOTL is PSPACE-complete, even though the Lemke-Howson
algorithm, the Scarf algorithm, and many specific instances of homotopy procedures
can be recrafted as algorithms for OEOTL.
Recent developments have led to a rich and highly interesting theory explaining
why the problem of finding an approximate fixed point is intractable, in the sense
that there is almost certainly no algorithm that always finds an approximate fixed
point in a small amount of time. What is missing at this point are more tolerant
theoretical concepts that give an account of why the algorithms that exist are as
useful as they are in fact, and how they might be compared with each other, and
with theoretical ideals that have not yet been shown to be far out of reach.
Chapter 4
The theories of the degree and the index involve a certain kind of continuity
with respect to the function or correspondence in question, so we need to develop
topologies on spaces of functions and correspondences. The main idea is that one
correspondence is close to another if its graph is close to the graph of the second
correspondence, so we need to have topologies on spaces of subsets of a given space.
In this chapter we study such spaces of sets, and in the next chapter we apply these
results to spaces of functions and correspondences. There are three basic set theo-
retic operations that are used to construct new functions or correspondences from
given ones, namely restriction to a subdomain, cartesian products, and composi-
tion, and our agenda here is to develop continuity results for elementary operations
on sets that will eventually support continuity results for those operations.
To begin with, Section 4.1 reviews some basic properties of topological spaces
that hold automatically in the case of metric spaces. In Section 4.2 we define
topologies on spaces of compact and closed subsets of a general topological space.
Section 4.3 presents a nice result due to Vietoris which asserts that for one of these
topologies the space of nonempty compact subsets of a compact space is compact.
Economists commonly encounter this in the context of a metric space, in which
case the topology is induced by the Hausdorff distance; Section 4.4 clarifies the
connection. In Section 4.5 we study the continuity properties of basic operations
for these spaces. Our treatment is largely drawn from Michael (1951) which contains
a great deal of additional information about these topologies.
4.2. SPACES OF CLOSED AND COMPACT SETS 67
(d) normal if, for any two disjoint closed sets C and D, there are disjoint open
sets U and V with C ⊂ U and D ⊂ V.
UU = { K ⊂ U : K is compact };
U∗U = UU \ {∅};
VU = { K ⊂ X : K is compact and K ∩ U ≠ ∅ };
UU0 = { C ⊂ U : C is closed };
K(X) is the space of compact subsets of X endowed with the topology gen-
erated by the subbase { UU : U ⊂ X is open }.
68 CHAPTER 4. TOPOLOGIES ON SPACES OF SETS
{ UU : U ⊂ X is open } ∪ { VU : U ⊂ X is open }.
K0(X) is the space of closed subsets of X endowed with the topology generated
by the base { UU0 : U ⊂ X is open }.
K0∗(X) is the space of nonempty closed subsets of X endowed with the sub-
space topology inherited from K0(X).
H0(X) is the space of nonempty closed subsets of X endowed with the topol-
ogy generated by the subbase
The topologies of H(X) and H0 (X) are both called the Vietoris topology.
Roughly, a neighborhood of K in K(X) or K∗(X) consists of those K′ that
are close to K in the sense that every point in K′ is close to some point of K.
A neighborhood of K ∈ H(X) consists of those K′ that are close in this sense,
and also in the sense that every point in K is close to some point of K′. Similar
remarks pertain to K0(X), K0∗(X), and H0(X). Section 4.4 develops these intuitions
precisely when X is a metric space.
Compact subsets of Hausdorff spaces are closed, so for practical purposes (i.e.,
when X is Hausdorff) every compact set is closed. In this case K(X), K∗(X), and
H(X) have the subspace topologies induced by the topologies of K0(X), K0∗(X), and
H0(X). Of course it is always the case that K∗(X) and K0∗(X) have the subspace
topologies induced by K(X) and K0(X) respectively.
It is easy to see that { UU : U ⊂ X is open } is a base for K(X) and { UU0 :
U ⊂ X is open } is a base for K0(X). Also, for any open U_1, . . . , U_k we have
U_{U_1} ∩ · · · ∩ U_{U_k} = U_{U_1 ∩ · · · ∩ U_k},
and similarly for U∗U, UU0, and UU0∗, so the subbases of K(X), K∗(X), K0(X), and
K0∗(X) are actually bases.
Lemma 4.3.1. If X has a subbase such that any cover of X by elements of the
subbase has a finite subcover, then X is compact.
4.4. HAUSDORFF DISTANCE 69
Proof. Say that a set is basic if it is a finite intersection of elements of the subbasis.
Any open cover is refined by the collection of basic sets that are subsets of its
elements. If a refinement of an open cover has a finite subcover, then so does the
cover, so it suffices to show that any open cover of X by basic sets has a finite
subcover.
A collection of open covers is a chain if it is completely ordered by inclusion:
for any two covers in the chain, the first is a subset of the second or vice versa. If
each open cover in a chain consists of basic sets, and has no finite subcover, then
the union of the elements of the chain also has these properties (any finite subset
of the union is contained in some member of the chain) so Zorn's lemma implies
that if there is one open cover with these properties, then there is a maximal such
cover, say {U_α : α ∈ A}.
Suppose, for some α ∈ A, that U_α = V_1 ∩ . . . ∩ V_n where V_1, . . . , V_n are in the
subbasis. If, for each i = 1, . . . , n, {U_α : α ∈ A} ∪ {V_i} has a finite subcover C_i,
then each C_i \ {V_i} covers X \ V_i, so
(C_1 \ {V_1}) ∪ . . . ∪ (C_n \ {V_n}) ∪ {U_α}
is a finite subcover from {U_α : α ∈ A}. Therefore there is at least one i such that
{U_α : α ∈ A} ∪ {V_i} has no finite subcover, and maximality implies that V_i is already
in the cover. This argument shows that each element U_α of the cover is contained
in a subbasic set that is also in the cover, so the subbasic sets in {U_α : α ∈ A} cover
X, and by hypothesis there must be a finite subcover after all.
Theorem 4.3.2. If X is compact, then H(X) is compact.
Proof. Suppose that { UU_α : α ∈ A} ∪ { VV_β : β ∈ B} is an open cover of H(X)
by subbasic sets. Let D := X \ ⋃_β V_β; since D is closed and X is compact, D is
compact. We may assume that D is nonempty because otherwise X = V_{β_1} ∪ . . . ∪ V_{β_n}
for some β_1, . . . , β_n, in which case H(X) = VV_{β_1} ∪ . . . ∪ VV_{β_n}. In addition, D must
be contained in some U_α because otherwise D would not be an element of any UU_α
or any VV_β. But then {U_α} ∪ {V_β : β ∈ B} has a finite subcover, so, for some
β_1, . . . , β_n, we have
H(X) = UU_α ∪ VV_{β_1} ∪ . . . ∪ VV_{β_n}.
On the other hand, whenever K ⊂ U with K compact and U open there is some
ε > 0 such that U_ε(K) ⊂ U (otherwise we could take sequences x_1, x_2, . . . in K
and y_1, y_2, . . . in X \ U with d(x_i, y_i) → 0, then take convergent subsequences) so
{ L : δ(L, K) < ε } ⊂ UU. Thus:
Lemma 4.4.1. When X is a metric space, the sets of the form { L : δ(L, K) < ε }
constitute a base of the topology of K(X).
This is a metric. Specifically, it is evident that δ_H(K, L) = δ_H(L, K), and that
δ_H(K, L) = 0 if and only if K = L. If M is a third compact set, then for any ε, η > 0
with δ_H(K, L) < ε and δ_H(L, M) < η we have K ⊂ U_ε(L) ⊂ U_{ε+η}(M) and
M ⊂ U_η(L) ⊂ U_{ε+η}(K), from which it follows easily that the Hausdorff distance
satisfies the triangle inequality.
There is now an ambiguity in our notation, insofar as U_ε(L) might refer either to
the union of the ε-balls around the various points of L or to the set of compact
sets whose Hausdorff distance from L is less than ε. Unless stated otherwise, we
will always interpret it in the first way, as a set of points and not as a set of sets.
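For finite point sets in R^d, which are compact, the Hausdorff distance can be computed directly from the definition; a small sketch (function name is ours):

```python
import numpy as np

def hausdorff(K, L):
    """Hausdorff distance between two finite point sets in R^d: the
    larger of the two one-sided distances, each being the greatest
    distance from a point of one set to the nearest point of the other."""
    K, L = np.asarray(K, float), np.asarray(L, float)
    D = np.linalg.norm(K[:, None, :] - L[None, :, :], axis=2)  # pairwise distances
    return max(D.min(axis=1).max(), D.min(axis=0).max())
```
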
Proposition 4.4.2. The Hausdorff distance induces the Vietoris topology on H(X).
We now show that any element of our subbasis for the Vietoris topology contains
{ L : δ_H(K, L) < ε } for some ε > 0. If U is an open set containing K, then (as we
argued above) U_ε(K) ⊂ U for some ε > 0, so that
Proof. Applying Lemma 4.5.1, it suffices to show that preimages of subbasic open
sets are open. For T ∈ {K, K∗, K0, K0∗} it suffices to note that the preimage of WU
is WU × WU for all four W ∈ {U, U∗, U0, U0∗}. For T ∈ {H, H0} we also need to
observe that
For a nonempty closed set A ⊂ X let K_A(X) and K_A0(X) be the sets of compact
and closed subsets of X that have nonempty intersection with A. Since the topolo-
gies of K_A(X) and K_A0(X) are the subspace topologies inherited from K(X) and K0(X),
the last result has the following immediate consequence.
Lemma 4.5.4. The function K ↦ K ∩ A from K_A(X) to K(A) and the function
C ↦ C ∩ A from K_A0(X) to K0(A) are continuous.
Proof. By Lemma 4.5.1 it suffices to show that, for any open U ⊂ X, the preimage
of UU0 is open. For any (C, D) in this set normality implies that there are disjoint
open sets V and W containing C \ U and D \ U respectively. Then
(U ∪ V) ∩ (U ∪ W) = U, so
(C, D) ∈ (U_{U∪V}0 × U_{U∪W}0) ∩ I0(X), which is contained in the preimage of UU0.
If X is also T1, it is a Hausdorff space, so compact sets are closed. Therefore
∩ : K(X) × K(X) → K(X) is continuous because its domain and range have the
subspace topologies inherited from K0(X) × K0(X) and K0(X).
Let I(X) (resp. I0(X)) be the set of pairs (K, L) of compact (resp. closed)
subsets of X such that K ∩ L ≠ ∅, endowed with the topology it inherits from the
product topology of K(X) × K(X) (resp. K0(X) × K0(X)). The relevant topologies
are relative topologies obtained from the spaces in the last result, so:
4.5.3 Singletons
Lemma 4.5.7. The function x ↦ {x} is a continuous function from X to
T(X) when T ∈ {K, H}. If, in addition, X is a T1-space, then it is continuous
when T ∈ {K0, H0}.
Proof. Singletons are always compact, so for any open U the preimages of UU and
VU are both U. If X is T1, then singletons are closed, so the preimages of UU0 and
VU0 are also U.
Proof. By the definition of the product topology, for each (x, y) ∈ K × L there
are neighborhoods U_{(x,y)} and V_{(x,y)} of x and y such that U_{(x,y)} × V_{(x,y)} ⊂ W. For
each x ∈ K we can find y_1, . . . , y_n such that L ⊂ V_x := ⋃_j V_{(x,y_j)}, and we can then
let U_x := ⋂_j U_{(x,y_j)}. Now choose x_1, . . . , x_m such that K ⊂ U := ⋃_i U_{x_i}, and let
V := ⋂_i V_{x_i}. Then
(K, L) ∈ UU × UV, which is contained in the preimage of UW.
By Lemma 4.5.1, this establishes the asserted continuity when T {K, K}.
To demonstrate continuity when T = H we must also show that the preimage of VW
is open in H(X) × H(Y) whenever W ⊂ X × Y is open. Suppose that (K × L) ∩ W ≠ ∅.
Choose (x, y) ∈ (K × L) ∩ W, and choose open neighborhoods U and V of x and y
with U × V ⊂ W. Then
(K, L) ∈ VU × VV, which is contained in the preimage of VW.
Proof. Preimages of subbasic open sets are open: for any open V ⊂ Y the preimage
of WV is W_{f^{−1}(V)} for all W ∈ {U, U∗, V}.
subset of H(X) is open. If X is either T1 or regular, then similar logic shows that
for either T ∈ {K0, H0} the union of the elements of an open subset of T(X) is
open.
If a subset C of H(X) or H0(X) is compact, then it is automatically compact in
the coarser topology of K(X) or K0(X). Therefore the following two results imply
the analogous claims for H(X) and H0(X), which are already interesting.
Lemma 4.5.14. If S ⊂ K(X) is compact, then L := ⋃_{K∈S} K is compact.
Proof. We will show that X \ D is open; let x be a point in this set. Each element
of S is a closed set that does not contain x, so (since X is regular) it is an element
of U_{X\N}0 for some closed neighborhood N of x. Since S is compact we have
S ⊂ U_{X\N_1}0 ∪ . . . ∪ U_{X\N_k}0
for some N_1, . . . , N_k. Then N_1 ∩ . . . ∩ N_k is a neighborhood
of x that does not intersect any element of S, so x is in the interior of X \ D as
desired.
Chapter 5

Topologies on Functions and Correspondences
In order to study the robustness of fixed points, or sets of fixed points, with respect
to perturbations of the function or correspondence, one must specify topologies on
the relevant spaces of functions and correspondences. We do this by identifying
a function or correspondence with its graph, so that the topologies from the last
chapter can be invoked. The definitions of upper and lower semicontinuity, and their
basic properties, are given in Section 5.1. There are two topologies on the space of
upper semicontinuous correspondences from X to Y . The strong upper topology,
which is defined and discussed in Section 5.2, turns out to be rather poorly behaved,
and the weak upper topology, which is usually at least as coarse, is presented in
Section 5.3. When X is compact the strong upper topology coincides with the weak
upper topology.
The strong upper topology plays an important role in the development of the
topic, and its definition provides an important characterization of the weak upper
topology when the domain is compact, but it does not have any independent signif-
icance. Throughout the rest of the book, barring an explicit indication to the contrary, the
space of upper semicontinuous correspondences from X to Y will be endowed with
the weak upper topology, and the space of continuous functions from X to Y will be
endowed with the weak topology.
5.1 Upper and Lower Semicontinuity
Proof. We show that the complement of the graph is open. Suppose (x, y) ∉ Gr(F).
Since Y is Hausdorff, y and each point z ∈ F(x) have disjoint neighborhoods V_z
and W_z. Since F(x) is compact, F(x) ⊂ W_{z₁} ∪ ⋯ ∪ W_{z_k} for some z₁, …, z_k. Then
V := V_{z₁} ∩ ⋯ ∩ V_{z_k} and W := W_{z₁} ∪ ⋯ ∪ W_{z_k} are disjoint neighborhoods of y and
F(x) respectively. If U is a neighborhood of x with F(x′) ⊂ W for all x′ ∈ U, then
U × V is a neighborhood of (x, y) that does not intersect Gr(F).
FP : US(X, Y) → K(X)
is continuous.
The basic operations for combining given correspondences to create new cor-
respondences are restriction to a subset of the domain, cartesian products, and
composition. We now study the continuity of these constructions.
Lemma 5.2.2. If A is a closed subset of X, then the map F ↦ F|A is continuous
as a function from US(X, Y) to US(A, Y).
Proof. Since A × Y is a closed subset of X × Y, continuity as a function from
US(X, Y) to US(A, Y), that is, continuity of Gr(F) ↦ Gr(F) ∩ (A × Y), follows
immediately from Lemma 4.5.4.
An additional hypothesis is required to obtain continuity of restriction to a
compact subset of the domain, but in this case we obtain a kind of joint continuity.
Lemma 5.2.3. If X is regular, then the map (F, K) ↦ Gr(F|K) is a continuous
function from US(X, Y) × K(X) to K(X × Y). In particular, for any fixed K the
map F ↦ F|K is a continuous function from US(X, Y) to US(K, Y).
Proof. Fix F ∈ US(X, Y), K ∈ K(X), and an open neighborhood W of Gr(F|K).
For each x ∈ K Lemma 4.5.8 gives neighborhoods U_x of x and V_x of F(x) with
U_x × V_x ⊂ W. Choose x₁, …, x_k such that U := U_{x₁} ∪ ⋯ ∪ U_{x_k} contains K.
Since X is regular, each point in K has a closed neighborhood contained in U, and
the interiors of finitely many of these cover K, so K has a closed neighborhood C
contained in U. Let
Let X′ and Y′ be two other topological spaces with Y′ Hausdorff. Since the map
(C, D) ↦ C × D is not a continuous operation on closed sets, we should not expect
the function (F, F′) ↦ F × F′ from US(X, Y) × US(X′, Y′) to US(X × X′, Y × Y′) to be
continuous, and indeed, after giving the matter a bit of thought, the reader should
be able to construct a neighborhood of the graph of the function (x, x′) ↦ (0, 0)
that shows that the map (F, F′) ↦ F × F′ from US(R, R) × US(R, R) to US(R², R²)
is not continuous.
We now turn our attention to composition. Suppose that, in addition to X and
Y, we have a third topological space Z that is Hausdorff. (We continue to assume
that Y is Hausdorff.) We can define a composition operation (F, G) ↦ G ∘ F
from U(X, Y) × U(Y, Z) to U(X, Z) by letting

G(F(x)) := ⋃_{y∈F(x)} G(y).

That is, G(F(x)) is the projection onto Z of Gr(G|F(x)), which is compact by
Proposition 5.1.4, so G(F(x)) is compact. Thus G ∘ F is compact valued. To show
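The union formula for composition is easy to make concrete in the finite case. The following sketch (the dict representation and the sample ground sets are illustrative, not from the text) computes (G ∘ F)(x) = ⋃_{y∈F(x)} G(y):

```python
# Finite sketch of composition of correspondences: (G o F)(x) is the union
# of G(y) over y in F(x). Representation and sample data are illustrative.

def compose(F, G):
    """Return G o F for correspondences given as dicts mapping points to sets."""
    return {x: set().union(*[G[y] for y in F[x]]) for x in F}

F = {0: {0, 1}, 1: {1}}      # F : X -> Y with X = Y = {0, 1}
G = {0: {0}, 1: {0, 1}}      # G : Y -> Z with Z = {0, 1}

print(compose(F, G))         # {0: {0, 1}, 1: {0, 1}}
```

Since every value set here is finite, the compact-valuedness noted above is automatic in this toy setting.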
Proof. We need to show that the identity map from US(X, Y) to UW(X, Y) is continuous, which is to say that for any given compact K ⊂ X, the map Gr(F) ↦
Gr(F|K) = Gr(F) ∩ (K × Y) is continuous. This follows from Lemma 5.3.1 because
K × Y is closed in X × Y whenever K is compact.
If X is compact, the continuity of the identity map from UW(X, Y) to US(X, Y)
follows directly from the definition of the weak upper topology.
There is a useful variant of Lemma 5.2.3.
Lemma 5.3.3. If X is normal, Hausdorff, and locally compact, then the function
(K, F) ↦ Gr(F|K) is a continuous function from K(X) × UW(X, Y) to K(X × Y).
Proof. We will demonstrate continuity at a given point (K, F) in the domain. Local
compactness implies that there is a compact neighborhood C of K. The map
F ↦ F|C from UW(X, Y) to US(C, Y) is a continuous function by virtue of the
definition of the topology of UW(X, Y). Therefore Lemma 5.2.3 implies that the
composition (K′, F′) ↦ (K′, F′|C) ↦ Gr(F′|K′) is continuous, and of course it
agrees with the function in question on a neighborhood of (K, F).
In contrast with the strong upper topology, for the weak upper topology Cartesian products and composition are well behaved. Let X′ and Y′ be two other spaces
with Y′ Hausdorff.
Lemma 5.3.4. If X and X′ are Hausdorff, then the function (F, F′) ↦ F × F′
from UW(X, Y) × UW(X′, Y′) to UW(X × X′, Y × Y′) is continuous.
Proof. First suppose that X and X′ are compact. Then, by Proposition 5.1.4,
the graphs of upper semicontinuous functions with these domains are compact,
and continuity of the function (F, F′) ↦ F × F′ from US(X, Y) × US(X′, Y′) to
US(X × X′, Y × Y′) follows from Proposition 4.5.9.
Because UW(X × X′, Y × Y′) has the quotient topology, to establish the general case we need to show that (F, F′) ↦ (F × F′)|C is a continuous function from
UW(X, Y) × UW(X′, Y′) to US(C, Y × Y′) whenever C ⊂ X × X′ is compact. Let
K and K′ be the projections of C onto X and X′ respectively; of course these sets
are compact. The map in question is the composition

(F, F′) ↦ (F|K, F′|K′) ↦ F|K × F′|K′ ↦ (F|K × F′|K′)|C.

The continuity of the second map has already been established, and the continuity
of the first and third maps follows from Lemma 5.3.1, because compact subsets of
Hausdorff spaces are closed and products of Hausdorff spaces are Hausdorff¹.
Suppose that, in addition to X and Y , we have a third topological space Z that
is Hausdorff.
Lemma 5.3.5. If K ⊂ X is compact, Y is normal and locally compact, and X ×
Y × Z is normal, then

(F, G) ↦ Gr(G ∘ F|K)

is a continuous function from UW(X, Y) × UW(Y, Z) to K(X × Z).
¹I do not know if the compact subsets of X × X′ are closed when X and X′ are compact spaces
whose compact subsets are closed.
are continuous functions of (K, F, G). Since X is T1 while Y and Z are Hausdorff,
X × Y × Z is T1, so Lemma 4.5.6 implies that the intersection
of these two sets is a continuous function of (K, F, G), and Gr(G ∘ F|K) is the
projection of this set onto X × Z, so the claim follows from another application of
Lemma 4.5.10.
Gr(F_x) ⊂ (V × W) ∪ ((Y \ V) × Z)
and CCO(X, Y) will denote the space of continuous functions from X to Y endowed
with this topology. The set of correspondences F : X → Y with Gr(F|K) ⊂ K × V
is open in UW(X, Y), so the compact-open topology is always at least as coarse as
the topology inherited from UW(X, Y).
Proposition 5.5.1. Suppose X is regular. Then the compact-open topology coin-
cides with the weak topology.
Proof. What this means concretely is that whenever we are given a compact K ⊂ X,
an open set W ⊂ K × Y, and a continuous f : X → Y with Gr(f|K) ⊂ W, we can
find a compact-open neighborhood of f whose elements f′ satisfy Gr(f′|K) ⊂ W.
For each x ∈ K the definition of the product topology gives open sets U_x ⊂ K and
V_x ⊂ Y such that (x, f(x)) ∈ U_x × V_x ⊂ W. Since f is continuous, by replacing U_x
with a smaller open neighborhood if necessary, we may assume that f(U_x) ⊂ V_x.
Since X is regular, x has a closed neighborhood C_x ⊂ U_x, and C_x is compact because
it is a closed subset of a compact set. Then f ∈ C_{C_x,V_x} for each x. We can find
x₁, …, x_n such that K = C_{x₁} ∪ ⋯ ∪ C_{x_n}, and clearly Gr(f′|K) ⊂ W whenever
Proof. In view of the subbasis for the strong topology, it suffices to show, for a given
continuous g : Y → Z and an open V ⊂ X × Z containing the graph of g ∘ f, that
is a neighborhood of the graph of g. If not, then some point (y, g(y)) is an accumulation point of points of the form (f(x′), z) where (x′, z) ∉ V. Since X is compact,
it cannot be the case that for each x ∈ X there are neighborhoods A of x and B of
(y, g(y)) such that

{ (x′, z) ∈ (A × Z) \ V : (f(x′), z) ∈ B } = ∅.
Chapter 6

Metric Space Theory

6.1 Paracompactness
Fix a topological space X. A family {S_α}_{α∈A} of subsets of X is locally finite if
every x ∈ X has a neighborhood W such that there are only finitely many α with
W ∩ S_α ≠ ∅. If {U_α}_{α∈A} is a cover of X, a second cover {V_β}_{β∈B} is a refinement of
{U_α}_{α∈A} if each V_β is a subset of some U_α. The space X is paracompact if every
open cover is refined by an open cover that is locally finite. This section is devoted
to the proof of:
Theorem 6.1.1. A metric space is paracompact.
This result is due to Stone (1948). At first the proofs were rather complex, but
eventually Rudin (1969) found a brief and simple argument. A well ordering of a
set Z is a complete ordering ≤ such that any nonempty A ⊂ Z has a least element. That any
set Z has a well ordering is the assertion of the well ordering theorem, which
is a simple consequence of Zorn's lemma. Let O be the set of all pairs (Z′, ≤′)
where Z′ ⊂ Z and ≤′ is a well ordering of Z′. We order O by specifying that
Theorem 6.2.2. For any locally finite open cover {U_α}_{α∈A} of a normal space X
there is a partition of unity subordinate to {U_α}.
A basic tool used in the constructive proof of this result, and many others, is:
Our goal is to find such an F with B = A. The partial thinnings can be partially
ordered as follows: F < G if the domain of F is a proper subset of the domain of
G and F and G agree on this set. We will show that this ordering has maximal
elements, and that the domain of a maximal element is all of A.
Proof of Theorem 6.2.2. The result above gives a closed cover {C_α}_{α∈A} of X with
C_α ⊂ U_α for each α. For each α let f_α : X → [0, 1] be continuous with f_α(x) = 0
for all x ∈ X \ U_α and f_α(x) = 1 for all x ∈ C_α. Then Σ_α f_α is well defined
and continuous everywhere since {U_α} is locally finite, and it is positive everywhere
since {C_α} covers X. For each α ∈ A set

φ_α := f_α / Σ_β f_β.
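The normalization φ_α = f_α / Σ_β f_β can be illustrated numerically. The sketch below (the two-interval cover of [0, 1] and the bump functions are illustrative choices, not from the text) builds a partition of unity subordinate to an open cover:

```python
# Sketch: partition of unity subordinate to the open cover
# U1 = (-0.1, 0.6), U2 = (0.4, 1.1) of X = [0, 1]. Each bump function
# vanishes off its interval, and the phi's are the normalized bumps.

def bump(l, r):
    def f(x):
        return max(0.0, min(x - l, r - x))   # positive inside (l, r), 0 outside
    return f

cover = [(-0.1, 0.6), (0.4, 1.1)]
fs = [bump(l, r) for l, r in cover]

def phi(x):
    vals = [f(x) for f in fs]
    s = sum(vals)                 # positive everywhere since the cover covers X
    return [v / s for v in vals]

print(phi(0.2))   # [1.0, 0.0]: only U1 contains 0.2
```

At a point of the overlap, such as x = 0.5, both weights are positive and the values sum to 1, exactly as in the construction above.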
Proof. Continuity of addition implies that there are neighborhoods of the origin
B₁, B₂ with B₁ + B₂ ⊂ A, and replacing these with their intersection gives a neighborhood B such that B + B ⊂ A. If w ∈ B̄, then w − B intersects any neighborhood
of the origin, and in particular (w − B) ∩ B ≠ ∅. Thus B̄ ⊂ B + B ⊂ A. Applying
this argument again gives a closed neighborhood U of the origin with U ⊂ B̄.
Proof. For each v ∈ K Lemma 6.3.2 gives a closed neighborhood W_v of the origin,
which is convex if V is locally convex, such that v + W_v + W_v ⊂ U. Then there are
v₁, …, v_n such that v₁ + W_{v₁}, …, v_n + W_{v_n} is a cover of K. Let W := W_{v₁} ∩ ⋯ ∩ W_{v_n}.
For any v ∈ K there is an i such that v ∈ v_i + W_{v_i}, so that

v + W ⊂ v_i + W_{v_i} + W_{v_i} ⊂ U.
‖α′v′ − αv‖ ≤ ‖α′v′ − α′v‖ + ‖α′v − αv‖ = |α′| ‖v′ − v‖ + |α′ − α| ‖v‖

and

‖(v′ + w′) − (v + w)‖ ≤ ‖v′ − v‖ + ‖w′ − w‖,

which are easily seen to imply that scalar multiplication and addition are continuous.
A vector space endowed with a norm and the associated metric and topology is
called a normed space.
For a normed space the calculation

‖αv + (1 − α)w‖ ≤ α‖v‖ + (1 − α)‖w‖ < ε    (0 ≤ α ≤ 1, ‖v‖, ‖w‖ < ε)

shows that for any ε > 0, the open ball of radius ε centered at the origin is convex.
The open ball of radius ε centered at any other point is a translation of this ball,
so a normed space is locally convex.
A sequence {v_m} in a topological vector space V is a Cauchy sequence if, for
each neighborhood A of the origin, there is an integer N such that v_m − v_n ∈ A for
all m, n ≥ N. The space V is complete if its Cauchy sequences are all convergent.
A Banach space is a complete normed space.
For the most part there is little reason to consider topological vector spaces
that are not complete, except insofar as they occur as subspaces of complete spaces.
The reason for this is that any topological vector space V can be embedded in
a complete space V̂ whose elements are equivalence classes of Cauchy sequences,
where two Cauchy sequences {v_m} and {w_n} are equivalent if, for each neighborhood
A of the origin, there is an integer N such that v_m − w_n ∈ A for all m, n ≥ N. (This
where A ⊂ V is open. (It is easy to see that the condition v_m ∈ A for all large
m does not depend on the choice of representative {v_m} of [v_m].) A complete
justification of this definition would require verifications of the vector space axioms,
the axioms for a topological space, the continuity of addition and scalar multiplica-
tion, and that {0} is a closed set. Instead of elaborating, we simply assert that the
reader who treats this as an exercise will find it entirely straightforward. A similar
construction can be used to embed any metric space in a completion in which all
Cauchy sequences (in the metric sense) are convergent.
As in the finite dimensional case, the best behaved normed spaces have inner
products. An inner product on a vector space V is a function ⟨·, ·⟩ : V × V → R
that is symmetric, bilinear, and positive definite:
(i) ⟨v, w⟩ = ⟨w, v⟩ for all v, w ∈ V;
0 ≤ ⟨⟨v, v⟩w − ⟨v, w⟩v, ⟨v, v⟩w − ⟨v, w⟩v⟩ = ⟨v, v⟩(⟨v, v⟩⟨w, w⟩ − ⟨v, w⟩²),

which implies the Cauchy-Schwarz inequality: ⟨v, w⟩ ≤ ‖v‖ ‖w‖ for all v, w ∈
V. This holds with equality if v = 0 or ⟨v, v⟩w = ⟨v, w⟩v, which is the case if
and only if w is a scalar multiple of v, and otherwise the inequality is strict. The
Cauchy-Schwarz inequality implies the inequality in the calculation

which implies (c) and completes the verification that ‖·‖ is a norm. A vector space
endowed with an inner product and the associated norm and topology is called an
inner product space. A Hilbert space is a complete inner product space.
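The algebraic identity behind the Cauchy-Schwarz inequality can be checked numerically. The sketch below (sample vectors in R³ with the usual dot product, chosen arbitrarily for illustration) evaluates both sides:

```python
# Numerical check of the identity
#   <<v,v>w - <v,w>v, <v,v>w - <v,w>v> = <v,v>(<v,v><w,w> - <v,w>^2)
# for the standard inner product on R^3. Sample vectors are arbitrary.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

v, w = [1.0, 2.0, 3.0], [0.0, 1.0, -1.0]
u = [dot(v, v) * b - dot(v, w) * a for a, b in zip(v, w)]   # <v,v>w - <v,w>v

lhs = dot(u, u)
rhs = dot(v, v) * (dot(v, v) * dot(w, w) - dot(v, w) ** 2)
print(lhs, rhs)   # 378.0 378.0
```

Since the left side is a squared norm, its nonnegativity forces ⟨v, w⟩² ≤ ⟨v, v⟩⟨w, w⟩, which is the inequality in the text.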
Up to linear isometry there is only one separable² Hilbert space. Let
6.5 Embedding Theorems
An important technique is to endow metric spaces with geometric structures by
embedding them in normed spaces. Let (X, d) be a metric space, and let C(X) be
the space of bounded continuous real valued functions on X. This is, of course, a
vector space under pointwise addition and scalar multiplication. We endow C(X)
with the norm
‖f‖ := sup_{x∈X} |f(x)|.
so f(x_n) → 0. For each i we have 0 ≤ f_{y_i}(x_n) ≤ f(x_n)/α_i → 0, which implies that
x_n → y_i, whence f = f_{y₁} = ⋯ = f_{y_k} ∈ h(X). Thus h(X) is closed in the relative
topology of its convex hull.
Now suppose that X is complete, and that {x_n} is a sequence such that f_{x_n} → f.
Then as above, min{1, d(x_m, x_n)} = ‖f_{x_m} − f_{x_n}‖, and {f_{x_n}} is a Cauchy sequence,
so {x_n} is also Cauchy and has a limit x. Above we saw that f_{x_n} → f_x, so f_x = f.
Thus h(X) is closed in C(X).
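The norm identity ‖f_x − f_y‖ = min{1, d(x, y)} used above can be illustrated on a finite sample. The sketch below takes f_x(y) = min{1, d(x, y)}, a choice consistent with that identity; the four sample points on the real line are purely illustrative:

```python
# Finite-sample sketch of the embedding h : x -> f_x into (C(X), sup norm),
# with f_x(y) = min(1, d(x, y)), so that ||f_x - f_y|| = min(1, d(x, y)).
# The sample metric space (four points on the line) is illustrative.

from itertools import combinations

X = [0.0, 0.3, 1.7, 2.0]                       # d(x, y) = |x - y|

def f(x):
    return {y: min(1.0, abs(x - y)) for y in X}

def supnorm(g, h):
    return max(abs(g[y] - h[y]) for y in X)

for x, y in combinations(X, 2):
    assert abs(supnorm(f(x), f(y)) - min(1.0, abs(x - y))) < 1e-12
print("||f_x - f_y|| = min(1, d(x, y)) holds on the sample")
```

The supremum is attained at one of the two centers themselves, which is why checking it on the sample grid suffices here.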
For separable metric spaces we have the following refinement of Theorem 6.5.2.
Proof. The sets U_{d(x,A)/2}(x) are open and cover X \ A. Theorem 6.1.1 implies the
existence of an open locally finite refinement {W_α}_{α∈I}. Theorem 6.2.2 implies the
existence of a partition of unity {φ_α}_{α∈I} subordinate to {W_α}_{α∈I}. For each α choose
a_α ∈ A with d(a_α, W_α) < 2d(A, W_α), and define the extension by setting

f̄(x) := Σ_{α∈I} φ_α(x) f(a_α)    (x ∈ X \ A).

and

d(x′, x) ≤ d(x′, A)/2 ≤ d(W_α, A) ≤ d(W_α, a_α),

so

d(a_α, x) ≤ d(a_α, x′) + d(x′, x) ≤ 3d(a_α, W_α) ≤ 6d(A, W_α) ≤ 6d(a, x).

Thus d(a_α, a) ≤ d(a_α, x) + d(x, a) ≤ 7d(x, a) < δ whenever x ∈ W_α, so f̄(x) ∈ U.
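A one-dimensional sketch of the extension formula f̄(x) = Σ_α φ_α(x) f(a_α): here A = {0, 1} is closed in X = [0, 1], two intervals cover X \ A, and each is assigned a nearby point of A. All names and numbers are hypothetical illustrations, not data from the text.

```python
# Sketch of the partition-of-unity extension on X = [0, 1] with A = {0, 1}.
# W1 = (0, 0.6) is assigned a_1 = 0 and W2 = (0.4, 1) is assigned a_2 = 1;
# the weights are normalized bumps, as in the proof above.

cover = [((0.0, 0.6), 0.0), ((0.4, 1.0), 1.0)]    # pairs (W_alpha, a_alpha)
f_on_A = {0.0: 2.0, 1.0: 4.0}                     # the function to extend

def weight(l, r, x):
    return max(0.0, min(x - l, r - x))            # vanishes off (l, r)

def ext(x):
    if x in f_on_A:                               # on A, keep f itself
        return f_on_A[x]
    ws = [weight(l, r, x) for (l, r), _ in cover]
    s = sum(ws)
    return sum((w / s) * f_on_A[a] for w, (_, a) in zip(ws, cover))

print(ext(0.5))   # 3.0: equal weights average f(0) = 2 and f(1) = 4
```

Near 0 only the first interval carries weight, so ext agrees with f(0) there; this mirrors how choosing a_α close to W_α makes the extension continuous at A.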
Chapter 7
Retracts
Here C × [0, 1] is the cylindrical side of the tin can, D × {0} is its base, and S × [0, 1]
is the roll of toilet paper. Evidently X is closed, hence compact, and there is an
obvious contraction of X that first pushes the cylinder of the tin can and the toilet
paper down onto the closed unit disk and then contracts the disk to the origin.
We are now going to define functions
This is evidently well defined and continuous. The point (1, θ, z) cannot be fixed
because ζ(z) = z implies that z = 0 or z = 1, and then π(1 − 2z) = ±π is not a multiple of 2π.
Observe that D = { (δ(t), θ) : t ≥ 0, θ ∈ R }. The second function is
f₂(δ(t), θ, 0) := (0, 0, 1 − t/π)    if 0 ≤ t ≤ π,
f₂(δ(t), θ, 0) := (δ(t − π), θ, 0)    if π ≤ t.
This is well defined because δ is invertible and the two formulas give the origin
as the image when t = π. It is continuous because it is continuous on the two
subdomains, which are closed and cover D. It does not have any fixed points
because the δ coordinate of f₂(δ(t), θ, 0) is less than δ(t) except when t = 0, and
f₂(δ(0), θ, 0) = (0, 0, 1).
The third function is

f₃(s(t), z) := (s((t + π)z), 1 − (1 − ζ(z))t/π)    if 0 ≤ t ≤ π,
f₃(s(t), z) := (s(t − π(1 − 2z)), ζ(z))    if π ≤ t.
This is well defined because s is invertible and the two formulas give (s(2πz), ζ(z))
as the image when t = π. It is continuous because it is continuous on the two
subdomains, which are closed and cover S × [0, 1]. Since f₂(s(t), 0) = f₃(s(t), 0)
for all t, f₂ and f₃ combine to define a continuous function on the union of their
domains.
Can (s(t), z) be a fixed point of f₃? If t < π, then the equation

z = 1 − (1 − ζ(z))t/π
Now consider a sequence {(s(t_i), z_i)} converging to a point (1, θ, z). In order for f
to be continuous it must be the case that

f₃(s(t_i), z_i) = (s(t_i − π(1 − 2z_i)), ζ(z_i)) → (1, θ − π(1 − 2z), ζ(z)) = f₁(1, θ, z).

Since s(t_i) → (1, θ) means precisely that t_i → ∞ and t_i mod 2π → θ mod 2π, again
this is clear.
7.2 Retracts
This section prepares for later material by presenting general facts about retrac-
tions and retracts. Let X be a metric space, and let A be a subset of X such that
there is a continuous function r : X A with r(a) = a for all a A. We say that
A is a retract of X and that r is a retraction. Many desirable properties that X
might have are inherited by A.
Lemma 7.2.1. If X has the fixed point property, then A has the fixed point property.
Proof. If f : A → A is continuous, then f ∘ r necessarily has a fixed point, say a*,
which must be in A, so that a* = f(r(a*)) = f(a*), and a* is a fixed point of f.
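The argument can be sketched in one dimension. Below, X = [−1, 1] has the fixed point property, r(x) = max(0, x) retracts X onto A = [0, 1], and a fixed point of f ∘ r, located by bisection, is automatically a fixed point of f. The particular f and the numerical routine are illustrative, not from the text.

```python
# One-dimensional sketch of Lemma 7.2.1. r retracts X = [-1, 1] onto
# A = [0, 1]; a fixed point of f o r on X lies in A and is fixed by f.

def r(x):                       # retraction of [-1, 1] onto [0, 1]
    return max(0.0, x)

def f(a):                       # a continuous self-map of A = [0, 1]
    return 0.5 * (1.0 - a * a)

def fixed_point(g, lo, hi, tol=1e-12):
    # bisection on g(x) - x, valid since g(lo) >= lo and g(hi) <= hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) >= mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

a_star = fixed_point(lambda x: f(r(x)), -1.0, 1.0)
assert abs(f(a_star) - a_star) < 1e-9    # a* = f(r(a*)) = f(a*)
print(a_star)                            # about 0.41421 (= sqrt(2) - 1)
```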
Lemma 7.2.2. If X is contractible, then A is contractible.
Proof. If c : X × [0, 1] → X is a contraction of X, then (a, t) ↦ r(c(a, t)) is a contraction of A.
Lemma 7.2.3. If X is connected, then A is connected.
Proof. We show that if A is not connected, then X is not connected. If U₁ and
U₂ are nonempty open subsets of A with U₁ ∩ U₂ = ∅ and U₁ ∪ U₂ = A, then
r⁻¹(U₁) and r⁻¹(U₂) are nonempty open subsets of X with r⁻¹(U₁) ∩ r⁻¹(U₂) = ∅
and r⁻¹(U₁) ∪ r⁻¹(U₂) = X.
Here are two basic observations that are too obvious to prove.
Lemma 7.2.4. If s : A → B is a second retraction, then s ∘ r : X → B is a
retraction, so B is a retract of X.
Lemma 7.2.5. If A ⊂ Y ⊂ X, then the restriction of r to Y is a retraction, so A
is a retract of Y.
Lemma 7.2.6. Suppose that A is not connected: there are disjoint open sets
U₁, U₂ ⊂ X such that A ⊂ U₁ ∪ U₂, with A₁ := A ∩ U₁ and A₂ := A ∩ U₂ both
nonempty. Then A is a neighborhood retract in X if and only if both A₁ and A₂
are neighborhood retracts in X.
Proof. Since A is locally closed and Rᵐ is locally compact, each point in A has
a closed neighborhood that contains a compact neighborhood. Having a compact
neighborhood is an intrinsic property, so every point in B has such a neighborhood,
and Corollary 7.2.10 implies that B is locally closed. Let V ⊂ Rⁿ be an open set
that has B as a closed subset. The Tietze extension theorem gives an extension of
h⁻¹ to a map j : V → Rᵐ. After replacing V with V ∩ j⁻¹(U), V is still an open set
that contains B, and h ∘ r ∘ j : V → B is a retraction.
Proof. To begin with suppose that there are simplices of positive dimension in K
that are not in K′. Let σ be such a simplex of maximal dimension, and let β be
the barycenter of |σ|. Then |K| \ {β} is a neighborhood of |K| \ int|σ|, and there
is a retraction r of the former set onto the latter that is the identity on the latter,
of course, and which maps (1 − t)x + tβ to x whenever x ∈ ∂|σ| and 0 < t < 1.
Iterating this construction and applying Lemma 7.2.7 above, we find that there
is a neighborhood retract of |K| consisting of |K′| and finitely many isolated points.
Now Lemma 7.2.6 implies that |K′| is a neighborhood retract in |K|.
Proof. Let Δ be the convex hull of the set of unit basis vectors in R^{|V|}. After
repeated barycentric subdivision of Δ there is a (|V| − 1)-dimensional simplex σ
in the interior of Δ. (This is a consequence of Proposition 2.5.2.) Identifying the
vertices of σ with the elements of V leads to an embedding of |K| as a subcomplex
of this subdivision, after which we can apply the result above.
7.6 Domination
In our development of the fixed point index an important idea will be to pass
from a theory for certain simple or elementary spaces to a theory for more general
spaces by showing that every space of the latter type can be approximated by a
simpler space, in the sense of the following definitions. Fix a metric space (X, d).
Definition 7.6.1. If Y is a topological space and ε > 0, a homotopy η : Y × [0, 1] →
X is an ε-homotopy if

d(η(y, s), η(y, t)) < ε

for all y ∈ Y and all 0 ≤ s, t ≤ 1. We say that η₀ and η₁ are ε-homotopic.
Definition 7.6.2. For ε > 0, a topological space D ε-dominates C ⊂ X if there
are continuous functions φ : C → D and ψ : D → X such that ψ ∘ φ : C → X is
ε-homotopic to Id_C.
This section's main result is:
Theorem 7.6.3 (Domination Theorem). If X is a separable ANR and C ⊂ X is
compact, then for any ε > 0 there is a simplicial complex that ε-dominates C.
Proof. If C = ∅, then for any ε > 0 it is ε-dominated by ∅, which we consider
to be a simplicial complex. Similarly, if C is a singleton, then for any ε > 0 it is
ε-dominated by the simplicial complex consisting of a single point. Therefore we
may assume that C has more than one point.
is an open cover of C.
Let e₁, …, e_n be the standard unit basis vectors of Rⁿ. The nerve of the open
cover is

N(U₁, …, U_n) = ⋃_{x∈X} conv({ e_j : x ∈ U_j }) = ⋃_{V_{j₁} ∩ ⋯ ∩ V_{j_k} ≠ ∅} conv(e_{j₁}, …, e_{j_k}).
Of course the denominator is always positive, so these functions are well defined
and continuous. There is a continuous function φ : C → N(U₁, …, U_n) given by

φ(x) := Σ_{j=1}^{n} φ_j(x) e_j.
for all 0 ≤ t ≤ 1.
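The simplices of the nerve can be computed directly for a finite cover: conv(e_{j₁}, …, e_{j_k}) belongs to the nerve exactly when the corresponding sets have a common point. In the sketch below the three-interval cover of [0, 1] and the sample grid are illustrative choices, not from the text:

```python
# Sketch: the simplices of the nerve of a three-interval cover of [0, 1],
# detected on a finite sample of points. Cover and sample are illustrative.

from itertools import combinations

points = [i / 10 for i in range(11)]               # sample of X = [0, 1]
cover = {1: (-1.0, 0.45), 2: (0.35, 0.75), 3: (0.65, 2.0)}

def has_common_point(J):
    return any(all(l < x < r for l, r in (cover[j] for j in J)) for x in points)

simplices = [J for k in (1, 2, 3)
             for J in combinations(cover, k) if has_common_point(J)]
print(simplices)   # [(1,), (2,), (3,), (1, 2), (2, 3)]: U1 and U3 are disjoint
```

Here the nerve is a path of two edges, reflecting the chain-like overlap pattern of the cover.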
Sometimes we will need the following variant.
Chapter 8

Essential Sets of Fixed Points

Figure 2.1 shows a function f : [0, 1] → [0, 1] with two fixed points, s and t.
Intuitively, they are qualitatively different, in that a small perturbation of f can
result in a function that has no fixed points near s, but this is not the case for t.
This distinction was recognized by Fort (1950), who described s as inessential and
t as essential.
[Figure 2.1: the graph of a function f : [0, 1] → [0, 1] crossing the diagonal at an inessential fixed point s and an essential fixed point t.]
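The distinction can be reproduced numerically. In the sketch below (the particular f and the grid search are illustrative, not the figure's actual function) f touches the diagonal tangentially at s = 0.25 and crosses it transversally at t = 0.75; adding a small positive constant removes every fixed point near s but leaves one near t:

```python
# Sketch: an inessential (tangential) fixed point at s = 0.25 and an
# essential (transversal) one at t = 0.75. The function f is illustrative.

def f(x):
    return x + 0.5 * (x - 0.25) ** 2 * (0.75 - x)   # maps [0, 1] into [0, 1]

def fixed_points(g, n=10000):
    # grid scan of [0, 1] for zeros or sign changes of g(x) - x
    pts, prev = [], g(0.0) - 0.0
    for i in range(1, n + 1):
        x = i / n
        cur = g(x) - x
        if cur == 0.0 or prev * cur < 0:
            pts.append(x)
        prev = cur
    return pts

print(fixed_points(f))                              # [0.25, 0.75]
eps = 1e-3
perturbed = fixed_points(lambda x: f(x) + eps)
print(perturbed)                                    # one point, just above 0.75
```

Since f − id is nonnegative near s, the upward perturbation eliminates that zero entirely, while the sign change at t survives any sufficiently small perturbation.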
In game theory one often deals with correspondences whose sets of fixed points
are infinite, and may include continua such as submanifolds. As we will see, the
definition proposed by Fort can be extended to sets of fixed points rather easily:
roughly, a set of fixed points is essential if every neighborhood of it contains fixed
points of every sufficiently close perturbation of the given correspondence. (Here
one needs to be careful, because in the standard terminology of game theory, following Jiang (1963), essential Nash equilibria, and essential sets of Nash equilibria,
are defined in terms of perturbations of the payoffs. This is a form of Q-robustness,
which is studied in Section 8.3.) But it is easy to show that the set of all fixed
is then a perturbation of F that has no fixed points near K, which contradicts the
assumption that K is essential. Much of this chapter is concerned with filling in
the technical details of this argument.
Turning to our particular concerns, Section 8.1 gives the Fan-Glicksberg the-
orem, which is the extension of the Kakutani fixed point theorem to infinite di-
mensional sets. Section 8.2 shows that convex valued correspondences can be ap-
proximated by functions, and defines convex combinations of convex valued corre-
spondences, with continuously varying weights. Section 8.3 then states and proves
Kinoshitas theorem, which implies that minimal connected sets exist. There re-
mains the matter of proving that minimal essential sets actually exist, which is also
handled in Section 8.3.
Proof. We will show that the complement is open. Let y be a point of V that is not
in K + C. For each x ∈ K, translation invariance of the topology of V implies that
x + C is closed, so Lemma 6.3.2 gives a neighborhood W_x of the origin such that
(y + W_x + W_x) ∩ (x + C) = ∅. Since we can replace W_x with W_x ∩ (−W_x), we may
assume that W_x = −W_x, so that (y + W_x) ∩ (x + C + W_x) = ∅. Choose x₁, …, x_k
such that the sets x_i + W_{x_i} cover K, and let W = W_{x₁} ∩ ⋯ ∩ W_{x_k}. Now

(y + W) ∩ (K + C) ⊂ ⋃_i (y + W) ∩ (x_i + C + W_{x_i})
⊂ ⋃_i (y + W_{x_i}) ∩ (x_i + C + W_{x_i}) = ∅.
F_U(x′) = (F(x′) + U) ∩ X ⊂ (F(x) + W + U) ∩ X ⊂ T.
has the finite intersection property because F_{U₁ ∩ ⋯ ∩ U_k} ⊂ F_{U₁} ∩ ⋯ ∩ F_{U_k}, so its
intersection is nonempty. Suppose that x* is an element of this intersection. If x*
were not an element of F(x*), there would be a closed neighborhood U of the origin
such that (x* − U) ∩ F(x*) = ∅, which contradicts x* ∈ F_U(x*), so x* is a fixed point
of F.
(α, K) ↦ αK := { αv : v ∈ K }    (∗)

(K, L) ↦ K + L := { v + w : v ∈ K, w ∈ L }    (∗∗)
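For finite sets these operations are straightforward to compute. The sketch below (finite subsets of R standing in for compact sets; the samples are illustrative) implements the scalar multiple (∗) and the sum (∗∗):

```python
# Sketch of (alpha, K) -> alpha K and (K, L) -> K + L for finite subsets
# of R standing in for compact sets. Sample sets are illustrative.

def scale(alpha, K):
    return {alpha * v for v in K}

def msum(K, L):
    return {v + w for v in K for w in L}

K, L = {0.0, 1.0}, {0.0, 10.0}
print(sorted(msum(scale(2.0, K), L)))   # [0.0, 2.0, 10.0, 12.0]
```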
(α₁, …, α_k, F₁, …, F_k) ↦ α₁F₁ + ⋯ + α_kF_k
is a pointed map. A nonempty compact set K ⊂ FP(F) is Q-robust if, for every neighborhood V ⊂ X of K, there is a neighborhood U ⊂ A of a₀ such that
FP(Q(a)) ∩ V ≠ ∅ for all a ∈ U.
A set of fixed points is essential if and only if it is Q-robust when Q is the identity
map of (ConS(X, X), F). At the other extreme, if Q is a constant function, so that
Q(a) = F for all a, then any nonempty compact K ⊂ FP(F) is Q-robust. The
weakening of the notion of an essential set provided by this definition is useful
when certain perturbations of F are thought to be more relevant than others, or
when the perturbations of F are derived from perturbations of the parameter a in
a neighborhood of a₀. Some of the most important refinements of the Nash
equilibrium concept have this form. In particular, Jiang (1963) defines essential
Nash equilibria, and essential sets of Nash equilibria, in terms of perturbations of
the game's payoffs, while Kohlberg and Mertens (1986) define stable sets of Nash
equilibria in terms of those perturbations of the payoffs that are induced by the
trembles of Selten (1975).
Lemma 8.3.6. F P(F ) is Q-robust.
Proof. The continuity of FP (Theorem 5.2.1) implies that for any neighborhood
V ⊂ X of FP(F) there is a neighborhood U ⊂ A of a₀ such that FP(Q(a)) ⊂ V
for all a ∈ U. The Fan-Glicksberg fixed point theorem implies that FP(Q(a)) is
nonempty.
This result shows that if our goal is to discriminate between some fixed points
and others, these concepts must be strengthened in some way. The two main
methods for doing this are to require either connectedness or minimality.
Proof. Let C be the set of Q-robust sets that are contained in K. We order this set
by reverse inclusion, so that our goal is to show that C has a maximal element. This
follows from Zorn's lemma if we can show that any completely ordered subset O has
an upper bound in C. The finite intersection property implies that the intersection
of all elements of O is nonempty; let K* be this intersection. If K* is not Q-robust,
then there is a neighborhood V of K* such that every neighborhood U of
a₀ contains a point a such that Q(a) has no fixed points in V. If L ∈ O, we cannot
have L ⊂ V because L is Q-robust, but now { L \ V : L ∈ O } is a collection of
compact sets with the finite intersection property, so it has a nonempty intersection
that is contained in K* but disjoint from V. Of course this is absurd.
The argument for connected Q-robust sets follows the same lines, except that in
addition to showing that K* is Q-robust, we must also show that it is connected.
If not, there are disjoint open sets V₁ and V₂ such that K* ⊂ V₁ ∪ V₂ and K* ∩ V₁ ≠
∅ ≠ K* ∩ V₂. For each L ∈ O we have L ∩ V₁ ≠ ∅ ≠ L ∩ V₂, so L \ (V₁ ∪ V₂) must
be nonempty because L is connected. As above, { L \ (V₁ ∪ V₂) : L ∈ O } has a
nonempty intersection that is contained in K* but disjoint from V₁ ∪ V₂, which is
impossible.
Chapter 9
Approximation of Correspondences
Theorem 9.1.2. If X is a compact ANR with the fixed point property, then any
upper semicontinuous contractible valued correspondence F : X → X has a fixed
point.
y : L \ {β} → ∂L  and  t : L \ {β} → [0, 1)

(1 − t(x))y(x) + t(x)β = x.

Let

L₁ = t⁻¹([0, 1/3]),  L₂ = t⁻¹([1/3, 2/3]),  L₃ = t⁻¹([2/3, 1)) ∪ {β}.

We first define f at points in L₂, then extend to L₁ and L₃.
Let d be a metric on L. Since f, t(·), and y(·) are continuous, and L₂ is compact,
for some sufficiently small δ > 0 it is the case that

y⁻¹(F) ∩ t⁻¹(1/3),  y⁻¹(F) ∩ L₂,  y⁻¹(F) ∩ t⁻¹(2/3)

for the various faces F of L. Proposition 2.5.2 implies that repeated barycentric
subdivision of this polyhedral complex results eventually in a simplicial subdivision
of L₂ whose mesh is less than δ.
For each vertex v of this subdivision choose s(v) ∈ (f(y(v)) + A) ∩ S, and set
This definition does not depend on the choice of σ if x is contained in more than
one simplex, it is continuous on each σ, and the simplices are a finite closed cover
of L₂, so f is continuous.
Suppose that v and v′ are two vertices of σ, so they are the endpoints of an
edge. We have d(v, v′) < δ, so f(y(v)) − f(y(v′)) ∈ A and |t(v) − t(v′)| < 1/3. In
addition, s(v) − f(y(v)) and f(y(v′)) − s(v′) are elements of A, so
f((2/3)y(x) + (1/3)β) ∈ (s(v_j) + B) ∩ Q.
118 CHAPTER 9. APPROXIMATION OF CORRESPONDENCES
The point s(v_j) was chosen with f(y(v_j)) − s(v_j) ∈ A, and f(y(x)) − f(y(v_j)) ∈ A
because d((2/3)y(x) + (1/3)β, y(v_j)) < δ, so

f(x) ∈ (s(v_j) + B) ∩ Q ⊂ (S + B) ∩ Q.

Thus f(L₁) ⊂ (S + B) ∩ Q.
Let z be the point S is contracted to by c: c(S, 1) = {z}. We define f on L₃ by
setting f(x) := z. Of course this is a continuous function whose image is contained
in S ⊂ (S + B) ∩ Q.
If x ∈ L₁ ∩ L₂, then t(x) = 1/3 and (2/3)y(x) + (1/3)β = x, so the formula defining f on
L₁ agrees with the definition of f for elements of L₂ at x. If v is a vertex of the
subdivision of L₂ contained in L₂ ∩ L₃, then t(v) = 2/3, so that the definition of f on
L₂ gives f(v) = c(s(v), 3t(v) − 1) = z. If x ∈ L₂ ∩ L₃, then L₂ ∩ L₃ contains any
simplex of the subdivision of L₂ that has x as an element, and the definition of f
on L₂ gives f(x) = z. Thus this definition agrees with the definition of f on L₂ at
points in L₂ ∩ L₃. Thus f is well defined and continuous.
The main argument will employ two technical results, the first of which will also
be applied in the next section. Recall that an ANR can be embedded in a normed
space (Proposition 7.4.3), so it is metrizable.
Proof. By the definition of the product topology, for every z ∈ F(x) there exist
ε_z > 0 and an open neighborhood A_z of the origin in T such that

U_{ε_z}(x) × ((z + A_z) ∩ Z) ⊂ V,
and let V_y be an open subset of U_{((√5−1)/2)r_y}(y) that contains y. Then for all y, y′ ∈ X,
if V_y ∩ V_{y′} ≠ ∅, then V_{y′} ⊂ U_{((1+√5)/2)r_y}(y).
Proof. Let β = (1 + √5)/2 and γ = 3 − √5. Suppose V_y ∩ V_{y′} ≠ ∅. The distance from y to
any point in V_{y′} cannot exceed (β − 1)(r_y + 2r_{y′}), so if V_{y′} is not contained in U_{βr_y}(y), then
(β − 1)(r_y + 2r_{y′}) > βr_y, which boils down to 2r_{y′} > βr_y. Let i_y be one of the indices such
that U_{βr_y}(y) ⊂ U_{i_y}. We claim that x ∈ U_{i_y} because βr_{y′} > γr_y; a quick computation
verifies that β/2 > γ/β, so this follows from the inequality above. Since y′ ∈ U_{i_y}, and the
distance from y to y′ is less than (β − 1)(r_y + r_{y′}), we have (β − 1)r_y > γr_{y′}. Together
this inequality and the one above imply that 2/β > (3 − √5)/(β − 1), but one
may easily compute that in fact these two quantities are equal. This contradiction
completes the proof.
Proof of Proposition 9.3.1. Let m be the largest dimension of any simplex in K that
is not in J. The main idea is to use induction on m, but one of the methods used in
the construction is subdivision of K, and the formulation of the induction hypothesis
must be sensitive to this. Precisely, we will show that for each k = 0, …, m there
is a neighborhood W_k ⊂ W of Gr(F) and a simplicial subdivision of K such that
if H_k is the union of J with the k-skeleton of some further subdivision, then any
f : J → Z with Gr(f) ⊂ W_k has an extension f̄ : H_k → Z with Gr(f̄) ⊂ W.
For k = 0 the claim is obvious: we can let W₀ = W and take K itself without any
further subdivision. By induction we may assume that the claim has already been
established with k − 1 in place of k. That is, there is a neighborhood W_{k−1} ⊂ W
of Gr(F) and a simplicial subdivision of K such that if H_{k−1} is the union of J
with the (k − 1)-skeleton of some further subdivision, then any f : J → Z with
Gr(f) ⊂ W_{k−1} has an extension f̄ : H_{k−1} → Z with Gr(f̄) ⊂ W.
We now develop two open coverings of K. Consider a particular x ∈ K. Fix a
contraction c_x : F(x) × [0, 1] → F(x). Lemma 9.3.2 allows us to choose a convex
balanced neighborhood B_x of the origin in T and ε_x > 0 such that
U_{ε_x}(x) × ((F(x) + B_x) ∩ Z) ⊂ W_{k−1}
and
(F(x) + B_x) ∩ Q ⊂ Z.
Proof. For each x ∈ C Lemma 9.3.2 allows us to choose δ_x > 0 and a neighborhood
A_x of F(x) such that U_{δ_x}(x) × A_x ⊂ V. Replacing δ_x with a smaller number
if need be, we may assume without loss of generality that F(x′) ⊂ A_x for all
x′ ∈ U_{δ_x}(x). Choose x₁, …, x_H such that U_{δ_{x₁}/2}(x₁), …, U_{δ_{x_H}/2}(x_H) cover C. Let
δ := min_i δ_{x_i}/2, and set
V′ := ⋃_i U_{δ_{x_i}/2}(x_i) × A_{x_i}.
Smooth Methods

Chapter 10

Differentiable Manifolds
Df : U → L(R^m, R^n)
Often the domain and range of the pertinent functions are presented to us as
vector spaces without a given or preferred coordinate system, so it is important to
observe that we can use the chain rule to achieve definitions that are independent of
the coordinate systems. Let X and Y be m- and n-dimensional vector spaces. (In
this chapter all vector spaces are finite dimensional, with R as the field of scalars.)
Let c : X → R^m and d : Y → R^n be linear isomorphisms. If U ⊂ X is open, we can
say that a function f : U → Y is C^r, by definition, if d ∘ f ∘ c⁻¹ : c(U) → R^n is C^r,
and if this is the case and x ∈ U, then we can define the derivative of f at x to be
Df(x) := d⁻¹ ∘ D(d ∘ f ∘ c⁻¹)(c(x)) ∘ c.
Using the chain rule, one can easily verify that these definitions do not depend
on the choice of c and d. In addition, the chain rule given above can be used to
show that this coordinate free definition also satisfies a chain rule. Let Z be a
third p-dimensional vector space. Then if V ⊂ Y is open, g : V → Z is C^r, and
f(U) ⊂ V, then g ∘ f is C^r and D(g ∘ f)(x) = Dg(f(x)) ∘ Df(x).
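Since the coordinate free chain rule is stated without proof, a quick numerical check may reassure. The sketch below (the maps f and g are illustrative choices of mine, not from the text) compares a finite-difference Jacobian of g ∘ f with the product of the Jacobians of g and f.

```python
# Numerical sanity check of the chain rule D(g∘f)(x) = Dg(f(x)) ∘ Df(x)
# for concrete maps f, g : R^2 -> R^2 (hypothetical illustrative choices).

def f(v):
    x, y = v
    return (x * x - y, x + 3.0 * y)

def g(v):
    x, y = v
    return (x * y, x - y)

def jacobian(h, v, eps=1e-6):
    """2x2 Jacobian of h at v by central differences."""
    J = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        vp, vm = list(v), list(v)
        vp[j] += eps
        vm[j] -= eps
        hp, hm = h(vp), h(vm)
        for i in range(2):
            J[i][j] = (hp[i] - hm[i]) / (2 * eps)
    return J

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

x0 = (0.7, -0.2)
lhs = jacobian(lambda v: g(f(v)), x0)                # D(g∘f)(x0)
rhs = matmul(jacobian(g, f(x0)), jacobian(f, x0))    # Dg(f(x0)) · Df(x0)
err = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

The two matrices agree up to the discretization error of the difference quotients.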
Sometimes we will deal with functions whose domains are not open, and we need
to define what it means for such a function to be C r . Let S be a subset of X of any
sort whatsoever. If Y is another vector space and f : S → Y is a function, then
f is C^r by definition if there is an open U ⊂ X containing S and a C^r function
F : U → Y such that f = F|_S. Evidently being C^r isn't the same thing as having
a well defined derivative at each point in the domain!
Note that the identity function on S is always C r , and the chain rule implies
that compositions of C r functions are C r . Those who are familiar with the category
concept will recognize that there is a category of subsets of finite dimensional vector
spaces and C^r maps between them. (If you haven't heard of categories it would
certainly be a good idea to learn a bit about them, but what happens later won't
depend on this language.)
We now state coordinate free versions of the inverse and implicit function theo-
rems. Since you are expected to know the usual, coordinate dependent, formulations
of these results, and it is obvious that these imply the statements below, we give
no proofs.
at x and y respectively. In addition
Dg(x₀) = −D_y f(x₀, y₀)⁻¹ ∘ D_x f(x₀, y₀).
The first order of business is to show that such partitions of unity exist. The
key idea is the following ingenious construction.
Proof. Let
φ(t) := 0 if t ≤ 0,  φ(t) := e^{−1/t} if t > 0.
Standard facts of elementary calculus can be combined inductively to show that for
each r ≥ 1 there is a polynomial P_r such that φ^{(r)}(t) = P_r(1/t)e^{−1/t} for t > 0. Since
the exponential function dominates any polynomial, it follows that φ^{(r)}(t)/t → 0 as
t → 0, so that each φ^{(r)} is differentiable at 0 with φ^{(r+1)}(0) = 0. Thus φ is C^∞.
Note that for any open rectangle ∏_{i=1}^m (a_i, b_i) ⊂ R^m the function
x ↦ ∏_i φ(x_i − a_i)φ(b_i − x_i)
is C^∞, positive on the rectangle, and zero outside it.
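The construction above is easy to experiment with. The following sketch (function names are mine) implements φ and the rectangle bump, illustrating that the bump is positive exactly on the open rectangle and that φ is extremely flat at 0.

```python
import math

def phi(t):
    """The C-infinity function that is 0 for t <= 0 and exp(-1/t) for t > 0."""
    return 0.0 if t <= 0 else math.exp(-1.0 / t)

def bump(x, a, b):
    """Smooth bump on the rectangle prod (a_i, b_i): positive inside, 0 outside."""
    return math.prod(phi(xi - ai) * phi(bi - xi) for xi, ai, bi in zip(x, a, b))

# phi is flat at 0: even the first difference quotient is astronomically small.
q = phi(1e-3) / 1e-3          # ~ e^{-1000}/10^{-3}, underflows to 0 in floats
inside = bump((0.5, 0.5), (0.0, 0.0), (1.0, 1.0))    # positive
outside = bump((1.5, 0.5), (0.0, 0.0), (1.0, 1.0))   # exactly 0
```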
Proof. For any integer j ≥ 0 and vector k = (k₁, …, k_m) with integer components
let
Q_{j,k} := ∏_{i=1}^m ((k_i − 1)/2^j, (k_i + 1)/2^j)  and  Q′_{j,k} := ∏_{i=1}^m ((k_i − 2)/2^j, (k_i + 3)/2^j).
The cover consists of those Q_{j,k} such that the closure of Q′_{j,k} is contained in some
U_α and, if j > 0, there is no α such that the closure of Q′_{j−1,k′} is contained in U_α.
Consider a point x ∈ U. The last requirement implies that x has a neighborhood
that intersects only finitely many cubes in the collection, which is to say that the
collection is locally finite.
For any j the Q_{j,k} cover R^m, so there is some k such that x ∈ Q_{j,k}, and if j
is sufficiently large, then the closure of Q′_{j,k} is contained in some U_α. If Q_{j,k} is
not in the collection, then the closure of Q′_{j−1,k′} is contained in some U_α. Define k′
by letting k′_i be k_i/2 or (k_i + 1)/2 according to whether k_i is even or odd. Then
Q_{j,k} ⊂ Q_{j−1,k′} ⊂ Q′_{j,k}. Repeating this leads eventually to an element of the collection
that contains x, so the collection is indeed a cover of U.
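A one dimensional sketch of this construction (with a hypothetical cover of U = (0, 1) by two intervals; all names are mine) may clarify the selection rule and the descent from Q_{j,k} to Q_{j−1,k′}.

```python
# 1-D sketch of the dyadic-cube cover from the proof. The "cubes" are the
# intervals Q(j,k) below; Q_big(j,k) plays the role of Q'_{j,k}.
from fractions import Fraction as F

U_ALPHAS = [(F(0), F(3, 5)), (F(2, 5), F(1))]   # hypothetical open cover of (0, 1)

def Q(j, k):     return (F(k - 1, 2**j), F(k + 1, 2**j))
def Q_big(j, k): return (F(k - 2, 2**j), F(k + 3, 2**j))

def closure_inside_some_alpha(j, k):
    lo, hi = Q_big(j, k)
    return any(a < lo and hi < b for a, b in U_ALPHAS)

def parent(k):   # k' from the proof: k/2 if k is even, (k+1)/2 if k is odd
    return k // 2 if k % 2 == 0 else (k + 1) // 2

def in_collection(j, k):
    if not closure_inside_some_alpha(j, k):
        return False
    return j == 0 or not closure_inside_some_alpha(j - 1, parent(k))

def covered(x, max_j=12):
    """Is x contained in some selected cube (up to scale max_j)?"""
    for j in range(max_j + 1):
        for k in range(-2, 2**j + 3):
            lo, hi = Q(j, k)
            if lo < x < hi and in_collection(j, k):
                return True
    return False
```

Sample points of U are covered, while overly coarse cubes (whose enlargement leaves U) are never selected.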
10.3 Manifolds
The maneuver we saw in Section 10.1passing from a calculus of functions
between Euclidean spaces to a calculus of functions between vector spaceswas
accomplished not by fully eliminating the coordinate systems of the domain and
range, but instead by showing that the real meaning of the derivative would not
change if we replaced those coordinate systems by any others. The definition of a
C r manifold, and of a C r function between such manifolds, is a more radical and
far reaching application of this idea.
A manifold is an object like the sphere, the torus, and so forth, that looks like
a Euclidean space in a neighborhood of any point, but which may have different
sorts of large scale structure. We first of all need to specify what "looks like" means,
and this will depend on a degree of differentiability. Fix an m-dimensional vector
space X, an open U X, and a degree of differentiability 0 r .
Recall that if A and B are topological spaces, a function e : A B is an
embedding if it is continuous and injective, and its inverse is continuous when
e(A) has the subspace topology. Concretely, e maps open sets of A to open subsets
of e(A). Note that the restriction of an embedding to any open subset of the domain
is also an embedding.
Although the definition above makes sense when r = 0, we will have no use for
this case because there are certain pathologies that we wish to avoid. Among other
things, the beautiful example known as the Alexander horned sphere (Alexander
(1924)) shows that a C 0 manifold may have what is known as a wild embedding
in a Euclidean space. From this point on we assume that r 1.
There are many obvious examples of C r manifolds such as spheres, the torus,
etc. In analytic work one should bear in mind the most basic examples:
(ii) Any open subset (including the empty set) of an m-dimensional affine sub-
space of Rk is an m-dimensional C r manifold. More generally, an open subset
of an m-dimensional C r manifold is itself an m-dimensional C r manifold.
(a) f is C r ;
Proof. Because compositions of C r functions are C r , (a) implies (c), and since each
point in a manifold is contained in the image of a C r parameterization, it is clear
that (c) implies (b). Fix a point p ∈ M and C^r parameterizations φ : U → M and
ψ : V → N with p ∈ φ(U) and f(φ(U)) ⊂ ψ(V). Lemma 10.3.1 implies that φ⁻¹
and ψ⁻¹ are C^r, so (ψ⁻¹ ∘ f ∘ φ) ∘ φ⁻¹ is C^r on its domain of definition. Since
p was arbitrary, we have shown that f is locally C^r, and Proposition 10.2.6 implies
that f is C^r. Thus (b) implies (a).
the ambient space.) However, in the abstract approach there are certain technical
difficulties that must be overcome just to get acceptable definitions. In addition,
the Whitney embedding theorems (cf. Hirsch (1976)) show that, under as-
sumptions that are satisfied in almost all applications, a manifold satisfying the
abstract definition can be embedded in some Rk , so our approach is not less general
in any important sense. From a technical point of view, the assumed embedding
of M in R^k is extremely useful because it automatically imposes conditions such
as metrizability and thus paracompactness, and it allows certain constructions that
simplify many proofs.
There is a category of C r manifolds and C r maps between them. (This can
be proved from the definitions, or we can just observe that this category can be
obtained from the category of subsets of finite dimensional vector spaces and C r
maps between them by restricting the objects.) The notion of isomorphism for this
category is:
If M and N are C r diffeomorphic we will, for the most part, regard them as
two different realizations of the same object. In this sense the spirit of the
definition of a C r manifold is that the particular embedding of M in Rk is of no
importance, and k itself is immaterial.
Tφ : U × X → { (p, v) ∈ TM : p ∈ φ(U) } ⊂ TM
by setting
Tφ(x, w) := (φ(x), Dφ(x)w).
Lemma 10.5.3. If r ≥ 2, then Tφ is a C^{r−1} parameterization for TM.
We need to show that this definition does not depend on the choice of extension
F. Let φ : U → M be a C^r parameterization whose image is a neighborhood of p,
let x = φ⁻¹(p), and observe that, for any v ∈ T_pM, there is some w ∈ R^m such that
v = Dφ(x)w, so that
DF(p)v = DF(φ(x))Dφ(x)w = D(F ∘ φ)(x)w = D(f ∘ φ)(x)w,
which does not involve the choice of F.
We also need to show that the image of Df(p) is, in fact, contained in T_{f(p)}N.
Let ψ : V → N be a C^r parameterization of a neighborhood of f(p). The last
equation shows that the image of Df(p) is contained in the image of
D(f ∘ φ)(x) = Dψ(ψ⁻¹(f(p))) ∘ D(ψ⁻¹ ∘ f ∘ φ)(x),
so the image of Df(p) is contained in the image of Dψ(ψ⁻¹(f(p))), which is T_{f(p)}N.
Naturally the chain rule is the most important basic result about the derivative.
We expect that many readers have seen the following result, and at worst it is a
suitable exercise, following from the chain rule of multivariable calculus without
trickery, so we give no proof.
Proof. Since Id_{R^k} is a C^∞ extension of Id_M, we clearly have DId_M(p) = Id_{T_pM} for
each p ∈ M. The claim now follows directly from the definition of TId_M.
Tg(Tf(p, v)) = Tg(f(p), Df(p)v) = (g(f(p)), Dg(f(p))Df(p)v)
For the categorically minded we mention that Proposition 10.5.4 and the last
three results can be summarized very succinctly by saying that if r 2, then T
is a functor from the category of C r manifolds and C r maps between them to the
category of C r1 manifolds and C r1 maps between them. Again, we will not use
this language later, so in a sense you do not need to know what a functor is, but
categorical concepts and terminology are pervasive in modern mathematics, so it
would certainly be a good idea to learn the basic definitions.
Let's relate the definitions above to more elementary notions of differentiation.
Consider a C¹ function f : (a, b) → M and a point t ∈ (a, b). Formally Df(t) is
a linear function from T_t(a, b) to T_{f(t)}M, but thinking about things in this way is
usually rather cumbersome. Of course T_t(a, b) is just a copy of R, and we define
f′(t) = Df(t)1 ∈ T_{f(t)}M, where 1 is the element of T_t(a, b) corresponding to 1 ∈ R.
When M is an open subset of R we simplify further by treating f′(t) as a number
under the identification of T_{f(t)}M with R. In this way we recover the concept of
the derivative as we first learned it in elementary calculus.
10.6 Submanifolds
For almost any kind of mathematical object, we pay special attention to subsets,
or perhaps substructures of other sorts, that share the structural properties of the
object. One only has to imagine a smooth curve on the surface of a sphere to see
that such substructures of manifolds arise naturally. Fix a degree of differentiability
1 ≤ r ≤ ∞. If M ⊂ R^k is an m-dimensional C^r manifold, N is an n-dimensional C^r
manifold that is also embedded in R^k, and N ⊂ M, then N is a C^r submanifold of M. The
integer m − n is called the codimension of N in M.
The reader can certainly imagine a host of examples, so we only mention one
that might easily be overlooked because it is so trivial: any open subset of M
is a C^r submanifold of M. Conversely, any codimension zero submanifold of M is just an
open subset. Evidently submanifolds of codimension zero are not in themselves
particularly interesting, but of course they occur frequently.
Submanifolds arise naturally as images of smooth maps, and as solution sets of
systems of equations. We now discuss these two points of view at length, arriving
eventually at an important characterization result. Let M ⊂ R^k and N ⊂ R^ℓ be C^r
manifolds that are m- and n-dimensional respectively, and let f : M → N be a C^r
function. We say that p ∈ M is:
(a) an immersion point of f if Df(p) : T_pM → T_{f(p)}N is injective;
Then
{ φ(x, g(x)) : x ∈ V } = f⁻¹(f(p)) ∩ φ(W)
is a neighborhood of p in f⁻¹(f(p)), and x ↦ φ(x, g(x)) is a C^r embedding because
its inverse is the composition of φ⁻¹ with the projection (x, y) ↦ x.
We obviously have T_p f⁻¹(q) ⊂ ker Df(p), and the two vector spaces have the
same dimension.
Proposition 10.6.3. If p is a diffeomorphism point of f , then there is a neighbor-
hood W of p such that f (W ) is a neighborhood of f (p) and f |W : W f (W ) is a
C r diffeomorphism.
Proof. Let φ : U → M be a C^r parameterization of a neighborhood of p, let
x = φ⁻¹(p), and let ψ : V → N be a C^r parameterization of a neighborhood of
f(p). Then
D(ψ⁻¹ ∘ f ∘ φ)(x) = Dψ⁻¹(f(p)) ∘ Df(p) ∘ Dφ(x)
is nonsingular, so the inverse function theorem implies that, after replacing U and
V with smaller open sets containing x and ψ⁻¹(f(p)), ψ⁻¹ ∘ f ∘ φ is invertible with
C^r inverse. Let W = φ(U). We now have
(f|_W)⁻¹ = φ ∘ (ψ⁻¹ ∘ f ∘ φ)⁻¹ ∘ ψ⁻¹,
which is C^r.
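The proof reduces everything to the classical inverse function theorem, which can be illustrated numerically: the map below (an illustrative choice of mine, not from the text) has derivative equal to the identity at the origin, so it is locally invertible there, and Newton's method computes the local inverse.

```python
# Inverse function theorem in action: f(x, y) = (x + y^2, y + x^2) has
# Df(0, 0) = I, hence a local C-infinity inverse near the origin, which we
# compute by Newton's method (hypothetical illustrative map).

def f(x, y):
    return (x + y * y, y + x * x)

def invert(u, v, steps=50):
    """Solve f(x, y) = (u, v) by Newton's method, starting from (u, v)."""
    x, y = u, v
    for _ in range(steps):
        fx, fy = f(x, y)
        rx, ry = fx - u, fy - v               # residual
        # Df(x, y) = [[1, 2y], [2x, 1]]; solve Df * (dx, dy) = (rx, ry)
        det = 1.0 - 4.0 * x * y
        dx = (rx - 2.0 * y * ry) / det
        dy = (ry - 2.0 * x * rx) / det
        x, y = x - dx, y - dy
    return x, y

x0, y0 = 0.1, -0.05
u, v = f(x0, y0)
x1, y1 = invert(u, v)   # recovers (x0, y0) up to roundoff
```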
Now let P be a p-dimensional C r submanifold of N. The following is the tech-
nical basis of the subsequent characterization theorem.
Lemma 10.6.4. If q ∈ P then:
(a) There is a neighborhood V ⊂ P of q, a p-dimensional C^r manifold M, a C^r func-
tion f : M → P, a point p̄ ∈ f⁻¹(q) that is an immersion point of f, and a
neighborhood U of p̄, such that f(U) = V.
(The third equality follows from the final assertion of Proposition 10.6.2, and the
fourth is the transversality assumption.) Thus p is a submersion point of this composition. Since
p is an arbitrary point of f⁻¹(P), the claim follows from Theorem 10.6.5.
We now have
Like the tangent bundle, the normal bundle attaches a vector space of a certain
dimension to each point of N. (The general term for such a construct is a vector
bundle.) The zero section of the normal bundle is { (q, 0) : q ∈ N }. There are maps
and let = |U . The main topic of this section is the following result and its
many applications.
The inverse function theorem implies that each (q, 0) in the zero section has a
neighborhood that is mapped C^{r−1} diffeomorphically onto a neighborhood of
q in R^ℓ. The methods used to produce a suitable neighborhood of the zero section
with this property are topological and quite technical, in spite of their elementary
character.
Proof. For s ∈ S let ε(s) be one half of the supremum of the set of ε > 0 such that
U_ε(s) ⊂ N_s and f|_{U_ε(s)} is an embedding. The restriction of an embedding to any
subset of its domain is an embedding, which implies that ε is continuous.
Since f|_S is invertible, its inverse is continuous. In conjunction with the
continuity of ε and d, this implies that for each s ∈ S there is a δ_s > 0 such that (∗) holds
for all s′ ∈ S with e(f(s), f(s′)) < δ_s. For each s choose an open U_s ⊂ X such that
s ∈ U_s ⊂ U_{ε(s)/2}(s) and f(U_s) ⊂ U_{δ_s/3}(f(s)). Let U = ⋃_{s∈S} U_s. We will show that
f|_U is injective with continuous inverse.
Consider s, s′ ∈ S and y, y′ ∈ Y with e(f(s), y) < δ_s/3 and e(f(s′), y′) < δ_{s′}/3.
We claim that if y = y′, then (∗) holds: otherwise e(f(s), f(s′)) > δ_s, δ_{s′}, so that
It is easy to see (and not hard to compute formally using the chain rule) that
the derivative at (q, 0) is Id_{T_qN} ⊕ Id_{T_qN} under the natural identification of T_{(q,0)}(TN) with T_qN ⊕
T_qN. The inverse function theorem implies that after replacing N_q with a smaller
neighborhood of (q, 0), the restriction of the map to N_q is a diffeomorphism onto its image.
We can now proceed as in the proof of the tubular neighborhood theorem.
The following construction simulates convex combination.
Proposition 10.7.9. There is a neighborhood W of the diagonal in N × N and a
continuous function c : W × [0, 1] → N such that:
(a) c((q, q′), 0) = q for all (q, q′) ∈ W;
D^m := { x ∈ R^m : ‖x‖ ≤ 1 },
We will often write ∂-manifold in place of the cumbersome phrase "manifold with
boundary."
Fix an m-dimensional C^r ∂-manifold M ⊂ R^k. We say that p ∈ M is a boundary
point of M if there is a C^r ∂-parameterization of M that maps a point in the boundary
of H to p. If any C^r parameterization of a neighborhood of p has this property, then
all do; this is best understood as a consequence of invariance of domain (Theorem
14.4.4) which is most commonly proved using algebraic topology. Invariance of
domain is quite intuitive, and eventually we will be able to establish it, but in the
meantime there arises the question of whether our avoidance of results derived from
algebraic topology is pure. One way of handling this is to read the definition
of a ∂-manifold as specifying which points are in the boundary. That is, a ∂-
manifold is defined to be a subset of R^k together with an atlas of m-dimensional
C^r ∂-parameterizations {φ_i}_{i∈I} such that each φ_j⁻¹ ∘ φ_i maps points in the boundary
of H to points in the boundary and points in the interior to points in the interior.
In order for this to be rigorous it is necessary to check that all the constructions in
our proofs preserve this feature, but this will be clear throughout. With this point
cleared up, the boundary of M is well defined; we denote this subset by ∂M. Note
that ∂M automatically inherits a system of coordinate systems that display it as
an (m − 1)-dimensional C^r manifold (without boundary).
Naturally our analytic work will be facilitated by characterizations of ∂-manifolds
that are somewhat easier to verify than the definition.
Examination of the matrix of partial derivatives shows that the derivative at x is nonsingular, so,
by the inverse function theorem, after replacing W with a smaller neighborhood of x, we may
assume that the map is a C^r embedding. Let Ũ be its image, U = Ũ ∩ H, and let φ be the
inverse restricted to U. Evidently φ is a C^r ∂-parameterization for M.
The following consequence is obvious, but is still worth mentioning because it
will have important applications.
Proposition 10.8.3. If M is an m-dimensional C^r manifold, f : M → R is C^r,
and a is a regular value of f, then f⁻¹([a, ∞)) is an m-dimensional C^r ∂-manifold.
The definitions of tangent spaces, tangent manifolds, and derivatives, are only
slightly different from what we saw earlier. Suppose that M ⊂ R^k is an m-
dimensional C^r ∂-manifold, φ : U → M is a C^r ∂-parameterization, x ∈ U, and
φ(x) = p. The definition of a C^r function gives a C^r extension φ̃ : Ũ → R^k of φ to
an open (in R^m) superset of U, and we define T_pM to be the image of Dφ̃(x). (Of
course there is no difficulty showing that Dφ̃(x) does not depend on the choice of
extension φ̃.) As before, the tangent manifold of M is
TM = ⋃_{p∈M} {p} × T_pM.
The reason this is the relevant notion has to do with transversality. Suppose
that M is a C^r ∂-manifold, N is a C^r manifold, without boundary, P is a C^r
submanifold of N, and f : M → N is C^r. We say that f is transversal to P along
S ⊂ M, and write f ⋔_S P, if f|_{M∖∂M} ⋔_{S∖∂M} P and f|_{∂M} ⋔_{S∩∂M} P. As above, when
S = M we write f ⋔ P.
The transversality theorem generalizes as follows:
Clearly this will be satisfactory if (s) > 0 for all s. A brief calculation gives
If Q is larger than the upper bound for s f (s), then (s) > 0 when (s) is close
to 0 or 1. Since those s for which this is not the case are contained in a compact
interval on which the quantity in question is positive and continuous, hence bounded below by a positive
constant, if Q is sufficiently large then it is positive for all s.
Proof of Proposition 10.9.1. Let M be a nonempty compact connected 1-dimensional
C r manifold. We can pass from a C r atlas for M to a C r atlas whose elements all
have connected domains by taking the restrictions of each element of the atlas to
the connected components of its domain. To be concrete, we will assume that the
domains of the parameterizations are connected subsets of R, i.e., open intervals.
Since we can pass from a parameterization with unbounded domain to a countable
collection of restrictions to bounded domains, we may assume that all domains are
bounded. Since M is compact, any atlas has a finite subset that is also an atlas.
We now have an atlas of the form
{ φ₁ : (a₁, b₁) → M, …, φ_K : (a_K, b_K) → M }.
is increasing. Applying the lemma above again, there is a real number R and an
increasing C^r diffeomorphism η : (b′₂, b₂) → (b′₂ − R, s₀) such that η(s) = s − R
for s near b′₂ and η(s) = g(s) for s near b₂.
We now define λ : [s₀, s₀ + R) → M by setting
λ(s) := γ(s) if s₀ ≤ s ≤ b′₂,  λ(s) := γ(η⁻¹(s − R)) if b′₂ < s < s₀ + R.
Identifying [s₀, s₀ + R) with a circle of circumference R gives a function from the circle (of circumference
R) to M. This function is easily seen to be injective, and it maps open sets to open
sets, so its image is open, but also compact, hence closed. Since M is connected,
its image must be all of M, so we have constructed the desired C^r diffeomorphism
between the circle and M.
The argument for a compact connected one dimensional C^r ∂-manifold with
nonempty boundary is similar, but somewhat simpler, so we leave it to the reader.
Although it will not figure in the work here, the reader should certainly be
aware that the analogous issues for higher dimensions are extremely important in
topology, and mathematical culture more generally. In general, a classification of
some type of mathematical object is a description of all the isomorphism classes
(for whatever is the appropriate notion of isomorphism) of the object in question.
The result above classifies compact connected 1-dimensional C r manifolds.
The problem of classifying oriented surfaces (2-dimensional manifolds) was first
considered in a paper of Möbius in 1870. The classification of all compact connected
surfaces was correctly stated by von Dyck in 1888. This result was proved for
surfaces that can be triangulated by Dehn and Heegaard in 1907, and in 1925 Radó
showed that any surface can be triangulated.
After some missteps, Poincaré formulated a fundamental problem for the
classification of 3-manifolds: is a simply connected compact 3-manifold necessarily
homeomorphic to S³? (A topological space X is simply connected if it is con-
nected and any continuous function f : S¹ = { (x, y) ∈ R² : x² + y² = 1 } → X has
a continuous extension F : D² = { (x, y) ∈ R² : x² + y² ≤ 1 } → X.) Although
Poincaré did not express a strong view, this became known as the Poincaré con-
jecture, and over the course of the 20th century, as it resisted solution and the four
color theorem and Fermat's last theorem were proved, it became perhaps the most
famous open problem in mathematics. Curiously, the analogous theorems for higher
dimensions were proved first, by Smale in 1961 for dimensions five and higher, and
by Freedman in 1982 for dimension four. Finally in late 2002 and 2003 Perelman
posted three papers on the internet that sketched a proof of the original conjecture.
Over the next three years three different teams of two mathematicians set about
filling in the details of the argument. In the middle of 2006 each of the teams posted
a (book length) paper giving a complete argument. Although Perelman's papers
were quite terse, and many details needed to be filled in, all three teams agreed
that all gaps in his argument were minor.
Chapter 11

Sard's Theorem
11.2 develops a Fubini-like result for sets of measure zero. With these elements
in place, it becomes possible to state and prove Sard's theorem in Section 11.3.
Section 11.4 explains how to extend the result to maps between sufficiently smooth
manifolds.
The application of Sard's theorem that is most important in the larger scheme of
this book is given in Section 11.5. The overall idea is to show that any map between
manifolds can be approximated by one that is transversal to a given submanifold
of the range.
Definition 11.1.1. A set S ⊂ R^m has measure zero if, for any ε > 0, there is
a sequence {(x_j, r_j)}_{j=1}^∞ in R^m × (0, 1) such that
S ⊂ ⋃_j U_{r_j}(x_j)  and  Σ_j r_j^m < ε.
Proof. For given ε take the union of a countable cover of S₁ by rectangles of total
volume < ε/2, a countable cover of S₂ by rectangles of total volume < ε/4, etc.
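The ε/2^j bookkeeping in this proof can be made concrete. The sketch below (an illustration with names of my own choosing) covers the first hundred rationals in [0, 1] — a dense set, yet one of measure zero — by intervals of total length less than ε = 1/10.

```python
# The eps/2^j trick from the proof above, in one dimension: cover a countable
# set by intervals whose total length is less than eps.
from fractions import Fraction

def rationals_01(n):
    """First n rationals in [0, 1] under a simple enumeration p/q, q = 1, 2, ..."""
    out = []
    q = 1
    while len(out) < n:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in out:
                out.append(r)
                if len(out) == n:
                    break
        q += 1
    return out

def cover(points, eps):
    """Interval of length eps/2^(j+1) around the j-th point; total length < eps."""
    return [(x - Fraction(eps) / 2**(j + 2), x + Fraction(eps) / 2**(j + 2))
            for j, x in enumerate(points)]

pts = rationals_01(100)
ivals = cover(pts, Fraction(1, 10))
total = sum(b - a for a, b in ivals)   # < 1/10, though the points are dense
```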
Lemma 11.1.3. If S has measure zero, its interior is empty, so its complement is
dense.
Proof. Suppose that, on the contrary, S has a nonempty interior. Then it contains a
closed cube C, say of side length 2δ. Fixing ε > 0, suppose that S has a covering by
cubes of side length 2ρ_j with Σ_j (2ρ_j)^m < ε. Then it has a covering by open cubes
C_j of side length 3ρ_j, and there is a finite subcover of C. For some large integer K,
consider all standard cubes of the form ∏_{j=1}^m [i_j/K, (i_j + 1)/K]. For each cube in our finite
subcover, let D_j be the union of all such standard cubes contained in C_j, and let n_j
be the number of such cubes. Let D be the union of all standard cubes containing a
point in C, and let n be the number of them. Simply as a matter of counting (that
is to say, without reference to any theory of volume) we have n_j/K^m ≤ (3ρ_j)^m and
n/K^m ≥ (2δ)^m. If K is sufficiently large, then D ⊂ ⋃_j D_j, so that n ≤ Σ_j n_j and
(2δ)^m ≤ n/K^m ≤ Σ_j n_j/K^m ≤ Σ_j (3ρ_j)^m < (3/2)^m ε,
which is impossible if ε < (4δ/3)^m.
The next result implies that the notion of a set of measure zero is invariant
under C 1 changes of coordinates. In the proof of Theorem 11.3.1 we will use this
flexibility to choose coordinate systems with useful properties. In addition, this
fact is the key to the definition of sets of measure zero in manifolds. Recall that if
L : R^m → R^m is a linear transformation, then the operator norm of L is
‖L‖ := max_{‖v‖≤1} ‖L(v)‖.
Proof. Let C ⊂ U be a closed cube. Since U can be covered by countably many such
cubes (e.g., all cubes contained in U with rational centers and rational side lengths)
it suffices to show that f(S ∩ C) has measure zero. Let B := max_{x∈C} ‖Df(x)‖. For
any x, y ∈ C we have
‖f(x) − f(y)‖ = ‖∫₀¹ Df((1 − t)x + ty)(y − x) dt‖ ≤ ∫₀¹ ‖Df((1 − t)x + ty)‖ ‖y − x‖ dt ≤ B‖y − x‖.
If {(x_j, r_j)}_{j=1}^∞ is a sequence such that
S ∩ C ⊂ ⋃_j U_{r_j}(x_j)  and  Σ_j r_j^m < ε,
then
f(S ∩ C) ⊂ ⋃_j U_{Br_j}(f(x_j))  and  Σ_j (Br_j)^m < B^m ε.
be the t-slice of S. Let P(S) be the set of t such that S(t) does not have (m − 1)-
dimensional measure zero. Certainly it seems natural to expect that if S is a set
of m-dimensional measure zero, then P(S) should be a set of 1-dimensional measure zero,
and conversely. This is true, by virtue of Fubini's theorem, but we do not have the
means to prove it in full generality. Fortunately all we will need going forward is a
special case.
Specifically, Lemma 11.1.2 implies that (a) and (b) are equivalent, and also that
P(S) = ⋃_j P(C ∩ A_j), after which the equivalence of (c) and (d) follows from a
third application of the result. The equivalence of (b) and (c) follows from the
lemma above.
We now need to prove Lemma 11.2.2. Fix a compact set C, which we assume
is contained in the rectangle ∏_{i=1}^m [a_i, b_i]. For each δ > 0 let P_δ(C) be the set of
t such that C(t) cannot be covered by a finite collection of open rectangles whose
total (m − 1)-dimensional volume is less than δ.
Proof. If t is in the complement of P_δ(C), then any finite collection of open rectangles that
covers C(t) also covers C(t′) for t′ sufficiently close to t, because C is compact.
The next two results are the two implications of Lemma 11.2.2.
Lemma 11.2.4. If P (C) has measure zero, then C has measure zero.
Proof. Fix ε > 0, and choose δ < ε/(2(b₁ − a₁)). Since P_δ(C) ⊂ P(C), it has one
dimensional measure zero, and since it is closed, hence compact, it can be covered by
the union J of finitely many open intervals of total length ε/(2(b₂ − a₂) ⋯ (b_m − a_m)).
In this way { x ∈ C : x₁ ∈ J } is covered by a union of open rectangles of total
volume ε/2.
For each t ∉ J we can choose a finite union of rectangles in R^{m−1} of total volume
less than δ that covers C(t), and these will also cover C(t′) for all t′ in some open
interval around t. Since [a₁, b₁] ∖ J is compact, it is covered by a finite collection of
such intervals, and it is evident that we can construct a cover of { x ∈ C : x₁ ∉ J }
of total volume less than ε/2.
Lemma 11.2.5. If C has measure zero, then P (C) has measure zero.
Proof. Since P(C) = ⋃_{n=1,2,…} P_{1/n}(C), it suffices to show that P_δ(C) has measure
zero for any δ > 0. For any ε > 0 there is a covering of C by finitely many rectangles
of total volume less than ε. For each t there is an induced covering of C(t) by a finite
collection of rectangles, and there is an induced covering of [a₁, b₁]. The total length
of the intervals whose induced coverings have total volume greater than δ cannot exceed
ε/δ.
Proof of (a): We will show that each x ∈ C ∖ C₁ has a neighborhood V such that
f(V ∩ C) has measure zero. This suffices because C ∖ C₁ is an open subset of a
closed set, so it is covered by countably many compact sets, each of which is covered
by finitely many such neighborhoods, and consequently it has a countable cover by
such neighborhoods.
After reindexing we may assume that ∂f₁/∂x₁(x) ≠ 0. Let V be a neighborhood of
x in which ∂f₁/∂x₁ does not vanish. Let h : V → R^m be the function
h(z) := (f₁(z), z₂, …, z_m),
so the inverse function theorem implies that, after replacing V with a smaller neigh-
borhood of x, h is a diffeomorphism onto its image. The chain rule implies that
the critical values of f are the critical values of g = f ∘ h⁻¹, so we can replace f
with g, and g has the additional property that g₁(z) = z₁ for all z in its domain.
The upshot of this argument is that we may assume without loss of generality that
f₁(x) = x₁ for all x ∈ V.
For each t ∈ R let V^t := { w ∈ R^{m−1} : (t, w) ∈ V }, let f^t : V^t → R^{n−1} be the
function
f^t(w) := (f₂(t, w), …, f_n(t, w)),
and let C^t be the set of critical points of f^t. The matrix of partial derivatives of f
at x ∈ V is
[ 1            0            ⋯   0            ]
[ ∂f₂/∂x₁(x)   ∂f₂/∂x₂(x)   ⋯   ∂f₂/∂x_m(x)  ]
[ ⋮            ⋮                 ⋮            ]
[ ∂f_n/∂x₁(x)  ∂f_n/∂x₂(x)  ⋯   ∂f_n/∂x_m(x) ]
so x is a critical point of f if and only if (x₂, …, x_m) is a critical point of f^{x₁}, and
consequently
C ∩ V = ⋃_t {t} × C^t  and  f(C ∩ V) = ⋃_t {t} × f^t(C^t).
Since the result is known to be true with (m, n) replaced by (m − 1, n − 1), each
f^t(C^t) has (n − 1)-dimensional measure zero. In addition, the continuity of the
relevant partial derivatives implies that C ∖ C₁ is locally closed, so Proposition
11.2.1 implies that f(C ∩ V) has measure zero.
Proof of (b): As above, it is enough to show that an arbitrary x ∈ C_i ∖ C_{i+1} has a
neighborhood V such that f(C_i ∩ V) has measure zero. Choose a partial derivative
of some component f_ℓ of f,
∂^{i+1}f_ℓ / (∂x_{s_{i+1}} ∂x_{s_i} ⋯ ∂x_{s_1}),
that does not vanish at x. Define h : U → R^m by
h(x) := ( ∂^i f_ℓ/(∂x_{s_i} ⋯ ∂x_{s_1})(x), x₂, …, x_m ).
After reindexing we may assume that s_{i+1} = 1, so that the matrix of partial deriva-
tives of h at x is triangular with nonzero diagonal entries. By the inverse function
theorem the restriction of h to some neighborhood V of x is a C^∞ diffeomorphism.
Let g := f ∘ (h|_V)⁻¹. Then h(V ∩ C_i) ⊂ {0} × R^{m−1}. Let
g⁰ : { y ∈ R^{m−1} : (0, y) ∈ h(V) } → R^n
be the map g⁰(y) = g(0, y). Then f(V ∩ (C_i ∖ C_{i+1})) is contained in the set of critical
values of g⁰, and the latter set has measure zero because the result is already known
when (m, n) is replaced by (m − 1, n).
Proof of (c): Since U can be covered by countably many compact cubes, it suffices
to show that f(C_r ∩ I) has measure zero whenever I ⊂ U is a compact cube. Since I
is compact and the partials of f of order r are continuous, Taylor's theorem implies
that for every ε > 0 there is δ > 0 such that

‖f(x + h) − f(x)‖ ≤ ε‖h‖^r

whenever x, x + h ∈ I with x ∈ C_r and ‖h‖ < δ. Let L be the side length of I. For
each integer d > 0 divide I into dᵐ subcubes of side length L/d. The diameter of
such a subcube is √m L/d. If this quantity is less than δ and the subcube contains a
point x ∈ C_r, then its image is contained in a cube of side length 2ε(√m L/d)^r centered
at f(x). There are dᵐ subcubes of I, each one of which may or may not contain a
point in C_r, so for large d, f(C_r ∩ I) is contained in a finite union of cubes of total
volume at most

dᵐ (2ε(√m L/d)^r)ⁿ = (2ε)ⁿ (√m L)^{rn} d^{m−nr}.

Now observe that nr ≥ m: either m < n and r ≥ 1, or m ≥ n and

nr ≥ n(m − n + 1) = (n − 1)(m − n) + m ≥ m.

Therefore f(C_r ∩ I) is contained in a finite union of cubes of total volume at most
(2ε(√m L)^r)ⁿ, and ε may be arbitrarily small.
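The subdivision estimate in the proof of (c) can be seen numerically in the simplest case m = n = r = 1. In the sketch below (the particular function f and the crude sampling bound for the image are my own choices, not the text's) we divide I = [−2, 2] into d subintervals and add up the lengths of the images of those subintervals that contain a critical point of f(x) = x³ − 3x; the total shrinks toward zero as d grows, which is why the set of critical values has measure zero.

```python
# Illustration of the cube-counting argument for f(x) = x^3 - 3x on I = [-2, 2].
# The critical points are x = +1 and x = -1 (where f'(x) = 3x^2 - 3 vanishes).

def f(x):
    return x**3 - 3*x

def fprime(x):
    return 3*x**2 - 3

def cover_length(d, a=-2.0, b=2.0):
    """Total length of the images of subintervals that contain a critical point."""
    total = 0.0
    width = (b - a) / d
    for i in range(d):
        lo, hi = a + i*width, a + (i + 1)*width
        if fprime(lo) * fprime(hi) <= 0:        # a critical point lies in [lo, hi]
            ys = [f(lo + k*width/10) for k in range(11)]  # crude bound on the image
            total += max(ys) - min(ys)
    return total

lengths = [cover_length(d) for d in (10, 100, 1000)]
print(lengths)  # strictly decreasing toward 0
```

As the proof predicts, refining the subdivision makes the total length of the cover of the critical values as small as desired.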
Instead of worrying about just which degree of differentiability is the smallest
that allows all required applications of Sard's theorem, in the remainder of the book
we will, for the most part, work with objects that are smooth, where smooth is a
synonym for C^∞. This will result in no loss of generality, since for the most part
the arguments depend on the existence of smooth objects, which will follow from
Proposition 10.2.7. However, in Chapter 15 there will be given objects that may, in
applications, be only C¹, but Sard's theorem will be applicable because the domain
and range have the same dimension. It is perhaps worth mentioning that for this
particular case there is a simpler proof, which can be found on p. 72 of Spivak
(1965).
We briefly describe the most general and powerful version of Sard's theorem,
which depends on a more general notion of dimension.
Definition 11.3.2. For γ > 0, a set S ⊂ ℝᵏ has γ-dimensional Hausdorff
measure zero if, for any ε > 0, there is a sequence {(xⱼ, δⱼ)}_{j=1}^∞ such that

S ⊂ ⋃ⱼ U_{δⱼ}(xⱼ)   and   Σⱼ δⱼ^γ < ε.
11.4. MEASURE ZERO SUBSETS OF MANIFOLDS 157
Note that this definition makes perfect sense even if γ is not an integer! Let
U ⊂ ℝᵐ be open, and let f : U → ℝⁿ be a C^r function. For 0 ≤ p < m let R_p
be the set of points x ∈ U such that the rank of Df(x) is less than or equal to
p. The most general and sophisticated version of Sard's theorem, due to Federer,
states that f(R_p) has γ-dimensional measure zero for all γ ≥ p + (m − p)/r. A beautiful
informal introduction to the circle of ideas surrounding these concepts, which is the
branch of analysis called geometric measure theory, is given by Morgan (1988). The
proof itself is in Section 3.4 of Federer (1969). This reference also gives a complete
set of counterexamples showing this result to be best possible.
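A concrete instance of Definition 11.3.2 may be helpful here (a standard example of my own choosing, not taken from the text): the middle-thirds Cantor set S is covered at stage k by 2ᵏ intervals of radius δₖ = 3⁻ᵏ, so for any γ > log 2 / log 3 the sum of δₖ^γ over the cover tends to 0, i.e. S has γ-dimensional Hausdorff measure zero even though γ is not an integer.

```python
# Cover sums for the middle-thirds Cantor set: at stage k the set is covered by
# 2^k intervals of radius 3^-k, so the Definition 11.3.2 sum is 2^k * (3^-k)^gamma.

import math

def cover_sum(gamma, k):
    """Sum of delta^gamma over the 2^k covering intervals at stage k."""
    return (2**k) * (3.0**(-k))**gamma

gamma = 0.7  # any gamma > log(2)/log(3) ~ 0.6309 works
sums = [cover_sum(gamma, k) for k in (5, 20, 60)]
print(sums)  # decreasing toward 0, so the Cantor set has 0.7-dimensional measure zero
```

At the borderline exponent γ = log 2 / log 3 the sums are identically 1, which is the sense in which that value is the Hausdorff dimension of the Cantor set.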
In order for this to be sensible, it should be the case that φ(S) has measure zero
whenever φ : U → M is a C¹ parameterization and S ⊂ U has measure zero. That
is, it must be the case that if ψ : U′ → M is another C¹ parameterization, then
ψ⁻¹(φ(S)) has measure zero. This follows from the application of Lemma 11.1.4
to ψ⁻¹ ∘ φ.
Clearly the basic properties of sets of measure zero in Euclidean spaces (the
complement of a set of measure zero is dense, and countable unions of sets of mea-
sure zero have measure zero) extend, by straightforward verifications, to subsets of
manifolds of measure zero. Since uncountable unions of sets of measure zero need
not have measure zero, the following fact about manifolds (as we have defined them,
namely submanifolds of Euclidean spaces) is comforting, even if the definition above
makes it superfluous.
(a) Gr(f̄) ⊂ A;
(b) f̄|_{M\Z} = f|_{M\Z};
(d) f̄ ⋔_{C∩K} P.
Proof. Let
f̄|_{φ(U)} = g ∘ φ⁻¹.
U_{δ(p)}(p) ⊂ φᵢ(Uᵢ) for some i, and let i_p be an integer that attains the maximum. Then
δ : M → (0, ∞) is a continuous function. For each i let εᵢ := min_{p∈φᵢ(Uᵢ)} δ(p), and
let

Kᵢ = { p ∈ φᵢ(Uᵢ) : U_{εᵢ}(p) ⊂ φᵢ(Uᵢ) }.

Clearly Kᵢ is a closed subset of φᵢ(Uᵢ), so it is compact. For any p ∈ M we have
p ∈ K_{i_p}, so the sets K₁, K₂, … cover M.
Let C₀ = ∅, and for each positive i let Cᵢ = K₁ ∪ … ∪ Kᵢ. Let f₀ = f. Suppose
for some i = 1, 2, … that we have already constructed a neighborhood W_{i−1} of C_{i−1}
and a continuous function f_{i−1} : M → N with Gr(f_{i−1}) ⊂ A such that f_{i−1}|_{W_{i−1}} is
smooth and f_{i−1} ⋔_{C_{i−1}} P. Let Z′ᵢ be an open subset of φᵢ(Uᵢ) that contains Kᵢ, and
whose closure is compact and contained in φᵢ(Uᵢ). Now Lemma 11.5.5 gives an open
Zᵢ ⊂ Z′ᵢ containing Kᵢ and a continuous function fᵢ : M → N with Gr(fᵢ) ⊂ A such
that fᵢ|_{W_{i−1}∪Zᵢ} is smooth, fᵢ|_{M\Zᵢ} = f_{i−1}|_{M\Zᵢ}, and fᵢ ⋔_{Cᵢ} P. Set Wᵢ = W_{i−1} ∪ Zᵢ.
Evidently this constructive process can be extended to all i.
For each i, φᵢ(Uᵢ) intersects only finitely many φⱼ(Uⱼ), so the sequence

f₁|_{φᵢ(Uᵢ)}, f₂|_{φᵢ(Uᵢ)}, …

is unchanging after some point. Thus the sequence f₁, f₂, … has a well defined limit
that is smooth and transversal to P, and whose graph is contained in A.
We now turn to the proof of Proposition 11.5.4. The main idea is to select a
suitable member from a family of perturbations of g. The following lemma isolates
the step in the argument that uses Sard's theorem.

Lemma 11.5.7. If U ⊂ ℝᵐ and B ⊂ ℝ^{n−s} are open, P is a p-dimensional smooth
submanifold of ℝⁿ, and G : U × B → ℝⁿ is smooth and transversal to P, then for
almost every b ∈ B the function g_b = G(·, b) : U → ℝⁿ is transversal to P.
Proof. Let Q = G⁻¹(P). By the transversality theorem, Q is a smooth manifold,
of dimension (m + (n − s)) − (n − p) = m + p − s. Let π be the natural projection
U × B → B. Sard's theorem implies that almost every b ∈ B is a regular value of
π|_Q. Fix such a b. We will show that g_b is transversal to P.

Fix x ∈ g_b⁻¹(P), set q = g_b(x), and choose some y ∈ ℝⁿ. Since G is transversal
to P there is a u ∈ T_{(x,b)}(U × B) such that y is the sum of DG(x, b)u and an element
of T_q P. Let u = (v, w) where v ∈ ℝᵐ and w ∈ ℝ^{n−s}. Since (x, b) is a regular point
of π|_Q, there is a u′ ∈ T_{(x,b)}Q such that Dπ|_Q(x, b)u′ = w. Let u′ = (v′, w).
Then T_q P contains DG(x, b)u′, so it contains
For any (x, b) the image of DG(x, b) contains the image of Dg(x), so, since g|_Y ⋔ P,
we have G|_{Y×B} ⋔ P. Since g^s is a submersion, at every (x, b) such that γ(x) > 0 the
image of DG(x, b) is all of ℝⁿ, so G|_{(U\Y)×B} ⋔ P. Therefore G ⋔ P. The last result
implies that for some b ∈ B, g_b = G(·, b) is transversal to P. Evidently g_b agrees
with g on Y.

Let Z′ be an open subset of U with K ⊂ Z′ and Z̄′ ⊂ Z. Corollary 10.2.5 gives
a smooth γ : U → [0, 1] that is identically one on Z̄′ and vanishes on U \ Z. Define
g̃ by setting

g̃(x) = ( g^s(x), γ(x) g_b^{n−s}(x) + (1 − γ(x)) g^{n−s}(x) ).
Degree Theory
12.1 Orientation
The intuition underlying orientation is simple enough, but the formalism is a bit
heavy, with the main definitions expressed as equivalence classes of an equivalence
relation. We assume prior familiarity with the main facts about determinants of
matrices.
No doubt most readers are well aware that a linear automorphism (that is, a
linear transformation from a vector space to itself) has a determinant. What we
mean by this is that the determinant of the matrix representing the transformation
does not depend on the choice of coordinate system. Concretely, if L and L′ are the
matrices of the transformation in two coordinate systems, then there is a matrix U
(expressing the change of coordinates) such that L′ = U⁻¹LU, so that

|L′| = |U⁻¹LU| = |U|⁻¹ |L| |U| = |L|.
from [0, ] to the space of ordered bases. Evidently such paths can be combined to
construct a path from (d₁e₁, …, dₘeₘ) to (e₁, e₂, …, eₘ).
There are two parts to the argument, the first of which is concrete and geometric.
The general result is obtained by applying this in the context of a finite collection
of parameterizations that cover the image of the path.
Proof of Proposition 12.1.3. There are a = t₀ < t₁ < ⋯ < t_{J−1} < t_J = b such
that for each j = 1, …, J, the image of γ|_{[t_{j−1},tⱼ]} is contained in the image of a
smooth parameterization. We may assume that J = 1 because the general case
can obviously be obtained from J applications of this special case. Thus there is
a smooth parameterization φ : U → M whose image contains the image of γ. Let
γ̃ := φ⁻¹ ∘ γ, let ū_{h+1} := D(φ⁻¹)(γ(a)) v_{h+1}(a), and define a moving h-frame
ū along γ̃ by setting

The last result gives a u_{h+1} : [a, b] → ℝᵐ such that u_{h+1}(a) = ū_{h+1} and (u₁, …, u_h, u_{h+1})
is a moving (h + 1)-frame along γ̃. We define v_{h+1} on [a, b] by setting
Proof. For a ≤ t ≤ b let A(t) = (a_{ij}(t)) be the matrix such that

v′ᵢ(t) = Σ_{j=1}^{m} a_{ij}(t) vⱼ(t).

Then A is continuous, and the determinant is continuous, so t ↦ |A(t)| is a con-
tinuous function that never vanishes, and consequently |A(a)| > 0 if and only if
|A(b)| > 0.
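The mechanism of this proof can be checked numerically. In the sketch below (the particular path of matrices is a hypothetical example of mine, not from the text) t ↦ A(t) is a path of invertible matrices, so its determinant never vanishes, and the sign of the determinant is therefore constant along the path.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0]*m[1][1] - m[0][1]*m[1][0]

def A(t):
    # rotation by angle t*pi/2 composed with a scaling: invertible for every t,
    # with det A(t) = 2 for all t
    c, s = math.cos(t*math.pi/2), math.sin(t*math.pi/2)
    return [[2*c, -s], [2*s, c]]

# sample the path at 101 points: the determinant's sign never changes
signs = {det2(A(k/100)) > 0 for k in range(101)}
print(signs)  # {True}
```

If the path were allowed to pass through a singular matrix, the sign could flip, which is exactly what the hypothesis |A(t)| ≠ 0 rules out.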
If γ(b) = γ(a) and a given orientation of T_{γ(a)}M = T_{γ(b)}M differs from the
one induced by the given orientation and γ, then we say that γ is an orientation
reversing loop. Suppose that M has no orientation reversing loops. For any
choice of a base point p₀ in each path component of M and any specification
of an orientation of each T_{p₀}M, there is an induced orientation of T_pM for each
p ∈ M defined by requiring that whenever γ : [a, b] → M is a continuous path with
γ(a) = p₀, the orientation of T_{γ(b)}M is the one induced by γ and the given orientation
of T_{p₀}M. If γ′ : [a′, b′] → M is a second path with γ′(a′) = γ(a) and γ′(b′) = γ(b),
then for any given orientation of T_{γ(a)}M the orientations of T_{γ(b)}M induced by γ and γ′
must be the same, because otherwise following γ, then backtracking along γ′, would
be an orientation reversing loop. Thus, in the absence of orientation reversing loops,
an orientation of T_{p₀}M induces an orientation at every p in the path component of
p₀.
We have arrived at the following collection of concepts.
Definition 12.1.7. An orientation for M is an assignment of an orientation to
each tangent space T_pM such that for every moving frame v along a path γ : [a, b] → M
define the induced orientation of T_pM, then they give the same induced orientation.
Suppose that v₁ and v′₁ are both outward pointing. Since the set of outward pointing
vectors is convex,

t ↦ ((1 − t)v₁ + tv′₁, v₂, …, vₘ)

are two bases of T_pM, then the matrix of the linear transformation taking each vᵢ
to v′ᵢ is the same as the matrix of the linear transformation taking each Df(p)vᵢ to
Df(p)v′ᵢ.
We can generalize this in a way that does not play a very large role in later
developments, but does provide some additional illumination at little cost. Sup-
pose that M is an oriented m-dimensional smooth ∂-manifold, N is an oriented
n-dimensional boundaryless manifold, P is an oriented (n − m)-dimensional sub-
manifold of N, and f : M → N is a smooth map that is transversal to P. We say
that f is positively oriented relative to P at a point p ∈ f⁻¹(P) if

Df(p)v₁, …, Df(p)vₘ, w₁, …, w_{n−m}

is a positively oriented ordered basis of T_{f(p)}N whenever v₁, …, vₘ is a positively
oriented ordered basis of T_pM and w₁, …, w_{n−m} is a positively oriented ordered
basis of T_{f(p)}P. It is easily checked that whether or not this is the case does not
depend on the choice of positively oriented ordered bases v₁, …, vₘ and
w₁, …, w_{n−m}. When this is not the case we say that f is negatively oriented
relative to P at p.
Now, in addition, suppose that f⁻¹(P) is finite. The oriented intersection
number I(f, P) is the number of points in f⁻¹(P) at which f is positively oriented
I(f|_{∂M}, P) = 0.
f⁻¹(q) ∩ ∂C = ∅.
(1) deg^∞_q(f) = 1 for all (f, q) ∈ D^∞(M, N) such that f⁻¹(q) is a singleton {p}
and f is orientation preserving at p.

(2) deg^∞_q(f) = Σ_{i=1}^{r} deg^∞_q(f|_{Cᵢ}) whenever (f, q) ∈ D^∞(M, N), the domain of f
is C, and C₁, …, C_r are pairwise disjoint compact subsets of C such that

f⁻¹(q) ⊂ (C₁ ∪ … ∪ C_r) \ (∂C₁ ∪ … ∪ ∂C_r).

(3) deg^∞_q(h₀) = deg^∞_q(h₁) whenever C ⊂ M is compact and the homotopy h :
C × [0, 1] → N is smoothly degree admissible over q.

Concretely, deg^∞_q(f) is the number of p ∈ f⁻¹(q) at which f is orientation preserv-
ing minus the number of p ∈ f⁻¹(q) at which f is orientation reversing.
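In one dimension this signed count can be computed directly. The sketch below (the grid-scanning `degree` function and the example maps are my own illustration, not the text's construction) counts solutions of f(x) = q on a compact interval, with sign +1 at upward crossings (f orientation preserving) and −1 at downward crossings (f orientation reversing).

```python
def degree(f, a, b, q, n=200000):
    """Signed count of solutions of f(x) = q on [a, b], scanning a fine grid.
    Assumes q is a regular value not attained at the endpoints."""
    deg, prev = 0, f(a) - q
    for i in range(1, n + 1):
        cur = f(a + (b - a) * i / n) - q
        if prev < 0 <= cur:
            deg += 1          # upward crossing: orientation preserving
        elif prev > 0 >= cur:
            deg -= 1          # downward crossing: orientation reversing
        prev = cur
    return deg

# f(x) = x^3 - 3x on [-3, 3] has three preimages of q = 1, with signs +1, -1, +1
print(degree(lambda x: x**3 - 3*x, -3, 3, 1))   # 1
# f(x) = x^2 on [-2, 2] has two preimages of q = 1 with opposite signs
print(degree(lambda x: x**2, -2, 2, 1))         # 0
```

Note how the cancellation in the second example reflects axiom (2): the two preimages lie in disjoint compact sets whose degrees are +1 and −1.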
Proof. For (f, q) ∈ D^∞(M, N) the inverse function theorem implies that each p ∈
f⁻¹(q) has a neighborhood that contains no other element of f⁻¹(q), and since the
domain of f is compact it follows that f⁻¹(q) is finite. Let deg^∞_q(f) be the number of
p ∈ f⁻¹(q) at which f is orientation preserving minus the number of p ∈ f⁻¹(q) at
which f is orientation reversing.
Clearly deg^∞ satisfies (1) and (2). Suppose that h : C × [0, 1] → N is
smoothly degree admissible over q. Let V be a neighborhood of q such that for all
q′ ∈ V:

(a) h⁻¹(q′) ⊂ U × [0, 1];

(b) q′ is a regular value of h₀ and h₁;

(c) deg^∞_{q′}(h₀) = deg^∞_q(h₀) and deg^∞_{q′}(h₁) = deg^∞_q(h₁).
In preparation for the next result we show that deg^∞ is continuous in a rather
strong sense.

Proposition 12.3.4. If C ⊂ M is compact, f : C → N is continuous, and q ∈
N \ f(∂C), then there are neighborhoods Z ⊂ C(C, N) of f and V ⊂ N \ f(∂C) of q
such that

deg^∞_{q′}(f′) = deg^∞_q(f)
12.3. THE DEGREE 173
Z′ = { f′ ∈ C(C, N) : f′(∂C) ⊂ N \ V }

is an open subset of C(C, N), and Theorem 10.7.7 gives an open Z ⊂ Z′ containing f
such that for any f′, f″ ∈ Z ∩ C^∞(C, N) there is a smooth homotopy h : C × [0, 1] →
N with h₀ = f′, h₁ = f″, and hₜ ∈ Z′ for all t, which implies that h is a degree
admissible homotopy, so (3) implies that deg^∞_{q′}(f′) = deg^∞_{q′}(f″) whenever q′ ∈ V
is a regular value of both f′ and f″.
Since Sard's theorem implies that such a q′ exists, it now suffices to show that
deg^∞_{q′}(f′) = deg^∞_{q″}(f′) whenever f′ ∈ Z ∩ C^∞(C, N) and q′, q″ ∈ V are regular values
of f′. Let j : N × [0, 1] → N be a smooth function with the following properties:

(a) j₀ = Id_N;

(d) j₁(q′) = q″.
(D1) deg_q(f) = 1 for all (f, q) ∈ D(M, N) such that f is smooth, f⁻¹(q) is a
singleton {p}, and f is orientation preserving at p.

(D2) deg_q(f) = Σ_{i=1}^{r} deg_q(f|_{Cᵢ}) whenever (f, q) ∈ D(M, N), the domain of f is C,
and C₁, …, C_r are pairwise disjoint compact subsets of C such that

f⁻¹(q) ⊂ (C₁ ∪ … ∪ C_r) \ (∂C₁ ∪ … ∪ ∂C_r).
Proof. We claim that if deg : D(M, N) → ℤ satisfies (D1)-(D3), then its restriction
to D^∞(M, N) satisfies (1)-(3). For (1) and (2) this is automatic. Suppose
that C ⊂ M is compact and h : C × [0, 1] → N is a smoothly degree admissible
homotopy over q. Such a homotopy may be regarded as a continuous function from
[0, 1] to C(C, N). Therefore (D3) implies that deg_q(hₜ) is a locally constant function
of t, and since [0, 1] is connected, it must be constant. Thus (3) holds.

Proposition 11.5.1 implies that for any (f, q) ∈ D(M, N) the set of smooth
f′ : M → N that have q as a regular value is dense at f. In conjunction with
Proposition 12.3.4, this implies that the only possibility consistent with (D3) is to
set deg_q(f) = deg^∞_{q′}(f′) for (f′, q′) ∈ D^∞(M, N) with f′ and q′ close to f and q.
This establishes uniqueness, and Proposition 12.3.4 also implies that the definition
is unambiguous. It is easy to see that (D1) and (D2) follow from (1) and (2),
and (D3) is automatic.
Since (D2) implies that the degree of f over q is the sum of the degrees of the
restrictions of f to the various connected components of the domain of f , it makes
sense to study the degree of the restriction of f to a single component. For this
reason, when studying the degree one almost always assumes that M is connected.
(In applications of the degree this may fail to be the case, of course.) The image
of a connected set under a continuous mapping is connected, so if M is connected
and f : M → N is continuous, its image is contained in one of the connected
components of N. Therefore it also makes sense to assume that N is connected.

So, assume that N is connected, and that f : M → N is continuous. We have
(f, q) ∈ D(M, N) for all q ∈ N, and (D3) asserts that deg_q(f) is continuous as
a function of q. Since ℤ has the discrete topology, this means that it is a locally
constant function, and since N is connected, it is in fact constant. That is, deg_q(f)
does not depend on q; when this is the case we will simply write deg(f), and we
speak of the degree of f without any mention of a point in N.
12.4. COMPOSITION AND CARTESIAN PRODUCT

Proof. Since C^∞(C, N) and C^∞(D, P) are dense in C(C, N) and C(D, P) (Theorem
10.7.6) and composition is a continuous operation (Proposition 5.3.6), the continuity
property (D3) of the degree implies that it suffices to prove the claim when f and
g are smooth. Sard's theorem implies that there are points r′ arbitrarily near r
that are regular values of both g and g ∘ f, and Proposition 12.3.4 implies that the
relevant degrees are unaffected if r is replaced by such a point, so we may assume
that r has these regularity properties.
For q ∈ g⁻¹(r) let s_g(q) be 1 or −1 according to whether g is orientation pre-
serving or orientation reversing at q. For p ∈ (g ∘ f)⁻¹(q) define s_f(p) and s_{g∘f}(p)
similarly. In view of the chain rule and the definition of orientation preservation
and reversal, s_{g∘f}(p) = s_g(f(p)) s_f(p). Therefore

deg(g ∘ f) = Σ_{p∈(g∘f)⁻¹(r)} s_g(f(p)) s_f(p) = Σ_{q∈g⁻¹(r)} s_g(q) Σ_{p∈f⁻¹(q)} s_f(p)
           = Σ_{q∈g⁻¹(r)} s_g(q) deg_q(f).
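This composition formula can be checked numerically in one dimension (the scan-based degree function below is my own illustration, not the text's construction). With f(x) = x³ − 3x on [−3, 3] and g(y) = y², the regular value r = 1 of g has preimages q = −1 (where g is decreasing, s_g = −1) and q = +1 (where g is increasing, s_g = +1).

```python
def degree(f, a, b, q, n=400000):
    """Signed count of solutions of f(x) = q on [a, b], scanning a fine grid."""
    deg, prev = 0, f(a) - q
    for i in range(1, n + 1):
        cur = f(a + (b - a) * i / n) - q
        if prev < 0 <= cur:
            deg += 1          # orientation preserving crossing
        elif prev > 0 >= cur:
            deg -= 1          # orientation reversing crossing
        prev = cur
    return deg

f = lambda x: x**3 - 3*x
g = lambda y: y**2

lhs = degree(lambda x: g(f(x)), -3, 3, 1)                       # deg of g o f over r = 1
rhs = (-1) * degree(f, -3, 3, -1) + (+1) * degree(f, -3, 3, 1)  # sum_q s_g(q) deg_q(f)
print(lhs, rhs)  # 0 0
```

Here deg_{−1}(f) = deg_{1}(f) = 1, and the opposite signs s_g(−1) = −1, s_g(1) = +1 force the composition to have degree 0, in agreement with the direct count of the six crossings of (x³ − 3x)² = 1.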
We now take up the theory of the fixed point index. For continuous functions de-
fined on compact subsets of Euclidean spaces it is no more than a different rendering
of the theory of the degree; this perspective is developed in Section 13.1.
But we will see that it extends to a much higher level of generality, because the
domain and the range of the function or correspondence have the same topology.
Concretely, there is a property called Commutativity that relates the indices of the
two compositions g′ ∘ g and g ∘ g′, where g : C → X′ and g′ : C′ → X are continuous,
and other natural restrictions on this data (that will give rise to a quite cumbersome
definition) are satisfied. This property requires that we extend our framework to
allow comparison across spaces. Section 13.2 introduces the necessary abstractions
and verifies that Commutativity is indeed satisfied in the smooth case. It turns out
that this boils down to a fact of linear algebra that came as a surprise when this
theory was developed.
When we extended the degree from smooth to continuous functions, we showed
that continuous functions could be approximated by smooth ones, and that this gave
a definition of the degree for continuous functions that made sense and was uniquely
characterized by certain axioms. In somewhat the same way Commutativity will
be used, in Section 13.4, to extend from Euclidean spaces to separable ANRs, as
per the ideas developed in Section 7.6. The argument is lengthy, technically dense,
and in several ways the culmination of our work to this point.
The Continuity axiom is then used in Section 13.5 to extend the index to con-
tractible valued correspondences. The underlying idea is the one used to extend
from smooth to continuous functions: approximate and show that the resulting def-
inition is consistent and satisfies all properties. Again, there are many verifications,
and the argument is rather dense.
Multiplication is an additional property of the index that describes its behav-
ior in connection with cartesian products. For continuous functions on subsets of
Euclidean spaces it is a direct consequence of Proposition 12.4.3. At higher levels
of generality it is, in principle, a consequence of the axioms, because those axioms
characterize the index uniquely, but an argument deriving Multiplication from the
other axioms is not known. Therefore we carry Multiplication along as an addi-
tional property that is extended from one level of generality to the next along with
everything else.
13.1. AXIOMS FOR AN INDEX ON A SINGLE SPACE 177
For each integer m ≥ 0 an index base for ℝᵐ is given by letting I_m be the set of
index admissible continuous functions f : C → ℝᵐ.
We can now state the first batch of axioms.
Proof. Continuity implies that Λ_X(hₜ) is a locally constant function of t, and [0, 1]
is connected.
Our first index scope S⁰ has the collection of spaces S_{S⁰} = {ℝ⁰, ℝ¹, ℝ², …} with
I_{S⁰}(ℝᵐ) = I_m for each m. Of course (b) is satisfied by identifying ℝᵐ × ℝⁿ with
ℝ^{m+n}.
To understand the motivation for the following definition, first suppose that
X, X′ ∈ S_S, and that g : X → X′ and g′ : X′ → X are continuous. In this
circumstance it will be the case that g′ ∘ g and g ∘ g′ have the same index. We
would like to develop this idea in greater generality, for functions g : C → X′ and
g′ : C′ → X, but for our purposes it is too restrictive to require that g(C) ⊂ C′ and
g′(C′) ⊂ C. In this way we arrive at the following definition.

Definition 13.2.2. A commutativity configuration is a tuple

(X, D, E, g, X′, D′, E′, g′)

(b) g ∈ C(D, X′) and g′ ∈ C(D′, X) with g(E) ⊂ int D′ and g′(E′) ⊂ int D;

Λ_X(g′ ∘ g|_E) = Λ_{X′}(g ∘ g′|_{E′}).
180 CHAPTER 13. THE FIXED POINT INDEX
Λ_{X×X′}(F × F′) = Λ_X(F) · Λ_{X′}(F′).
We can now state the result that has been the main objective of all our work.
Let S_{S^{Ctr}} be the class of separable absolute neighborhood retracts, and for each
X ∈ S_{S^{Ctr}} let I_{S^{Ctr}}(X) be the union over compact C ⊂ X of the sets of index
admissible upper semicontinuous contractible valued correspondences F : C → X.
Since cartesian products of contractible valued correspondences are contractible
valued, we have defined an index scope S^{Ctr}.

Theorem 13.2.4. There is a unique index Λ^{Ctr} for S^{Ctr}, which is multiplicative.
V₁ = ker K ∩ im L,   V₁ ⊕ V₂ = im L,   V₁ ⊕ V₃ = ker K,

and similarly for W. With suitably chosen bases the matrices of K and L have the
forms

⎡ 0  K₁₂  0  K₁₄ ⎤        ⎡ 0  L₁₂  0  L₁₄ ⎤
⎢ 0  K₂₂  0  K₂₄ ⎥  and   ⎢ 0  L₂₂  0  L₂₄ ⎥
⎢ 0  0    0  0   ⎥        ⎢ 0  0    0  0   ⎥
⎣ 0  0    0  0   ⎦        ⎣ 0  0    0  0   ⎦
13.3. THE INDEX FOR EUCLIDEAN SPACES 181
|Id_V − LK| = |L⁻¹| |Id_V − LK| |L| = |L⁻¹(Id_V − LK)L| = |Id_W − KL|.
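The linear algebra fact behind this calculation, det(Id_V − LK) = det(Id_W − KL) for maps K : V → W and L : W → V between spaces of possibly different dimensions, can be checked concretely. In the sketch below the particular integer matrices are arbitrary choices of mine.

```python
def matmul(A, B):
    """Product of two matrices given as nested lists."""
    return [[sum(A[i][k]*B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1)**j * M[0][j] * det([row[:j] + row[j+1:] for row in M[1:]])
               for j in range(len(M)))

def id_minus(M):
    """Id - M for a square matrix M."""
    return [[(1 if i == j else 0) - M[i][j] for j in range(len(M))] for i in range(len(M))]

K = [[1, 2, 0], [0, 1, 1]]       # 2x3 matrix: a map R^3 -> R^2
L = [[1, 0], [2, 1], [0, 3]]     # 3x2 matrix: a map R^2 -> R^3

# LK is 3x3 and KL is 2x2, yet det(Id - LK) = det(Id - KL)
print(det(id_minus(matmul(L, K))), det(id_minus(matmul(K, L))))  # 8 8
```

Since the identity is polynomial in the entries of K and L, verifying it on matrices in general position is evidence for the identity itself; the text's proof reduces to the invertible case by the block decomposition displayed above.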
Lemma 13.3.4 states that if we fix the sets X, D, E, X′, D′, E′, then the set of pairs
(g, g′) giving a commutativity configuration is open. This is simple and unsurprising,
but without the spadework we did in Chapter 5 the proof would be a tedious slog.
We extract one piece of the argument in order to be able to refer to it later.

(g, g′) ↦ g′ ∘ g|_E

Proof. Lemma 5.3.1 implies that the function g ↦ g|_E is continuous, after which
Proposition 5.3.6 implies that (g, g′) ↦ g′ ∘ g|_E is continuous.
Proof. Lemma 4.5.10 implies that the set of (g, g′) such that g(E) ⊂ int D′ and
g′(E′) ⊂ int D is an open subset of C(D, X′) × C(D′, X). The lemma above implies
that the functions (g, g′) ↦ g′ ∘ g|_E and (g, g′) ↦ g ∘ g′|_{E′} are continuous, and Propo-
sition 13.1.4 implies that the set of (g, g′) satisfying part (c) of Definition 13.2.2 is
open. In view of the discussion in the last section, a pair (g, g′) that satisfies (a)-(c)
of Definition 13.2.2 will also satisfy (d) if and only if

Since (g, g′) ↦ g′ ∘ g|_E and (g, g′) ↦ g ∘ g′|_{E′} are continuous, Theorem 5.2.1 and Lemma
4.5.10 imply that the set of such pairs is open.
Proof of Theorem 13.3.1. Uniqueness and (I1)-(I3) follow from Proposition 13.1.5,
so we only need to prove that (I4) and (M) are satisfied.

Suppose that (ℝᵐ, D, E, g, ℝ^{m′}, D′, E′, g′) is a commutativity configuration. Lemma
13.3.4 states that it remains a commutativity configuration if g and g′ are replaced
by functions in any sufficiently small neighborhood, and Lemma 13.3.3 implies that
g′ ∘ g|_E and g ∘ g′|_{E′} are continuous functions of (g, g′), so, since we already know that
(I3) holds, it suffices to prove the equation of (I4) after such a replacement. Since
the smooth functions are dense in C(D, ℝ^{m′}) and C(D′, ℝᵐ) (Proposition 10.2.7) we
may assume that g and g′ are smooth. In addition, Sard's theorem implies that
the regular values of Id_E − g′ ∘ g|_E are dense, so after perturbing g by adding an
arbitrarily small constant, we can make it the case that 0 is a regular value. In the
same way we can add a small constant to g′ to make 0 a regular value of Id_{E′} − g ∘ g′|_{E′},
and if the constant is small enough it will still be the case that 0 is a regular value
of Id_E − g′ ∘ g|_E.
Let x₁, …, x_r be the fixed points of g′ ∘ g|_E, and for each i let x′ᵢ = g(xᵢ). Then
x′₁, …, x′_r are the fixed points of g ∘ g′|_{E′}. Let D₁, …, D_r be pairwise disjoint compact
subsets of E with xᵢ ∈ int Dᵢ, and let D′₁, …, D′_r be pairwise disjoint compact subsets
of E′ with x′ᵢ ∈ int D′ᵢ. For each i let Eᵢ be a compact subset of g⁻¹(int D′ᵢ) with
xᵢ ∈ int Eᵢ, and let E′ᵢ be a compact subset of g′⁻¹(int Dᵢ) with x′ᵢ ∈ int E′ᵢ. It is easy
to check that each (ℝᵐ, Dᵢ, Eᵢ, g|_{Dᵢ}, ℝ^{m′}, D′ᵢ, E′ᵢ, g′|_{D′ᵢ}) is a commutativity configuration.
Recalling the relationship between the index and the degree, we have

Λ_{ℝᵐ}(g′ ∘ g|_{Eᵢ}) = Λ_{ℝ^{m′}}(g ∘ g′|_{E′ᵢ})
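In dimension one, Commutativity can be illustrated numerically. In the sketch below (the maps and the sign-count are my own hypothetical example, not the text's), the index of a map h at a regular fixed point x∗ is sign(1 − h′(x∗)), computed here as the degree of x ↦ x − h(x) over 0, and the total indices of g2 ∘ g and g ∘ g2 agree.

```python
def index_sum(h, a, b, n=200000):
    """Signed count of fixed points of h on [a, b]: the degree of x - h(x) over 0."""
    total, prev = 0, a - h(a)
    for i in range(1, n + 1):
        x = a + (b - a) * i / n
        cur = x - h(x)
        if prev < 0 <= cur:
            total += 1        # sign(1 - h') = +1 at this fixed point
        elif prev > 0 >= cur:
            total -= 1        # sign(1 - h') = -1 at this fixed point
        prev = cur
    return total

g  = lambda x: 1 - x**2   # plays the role of g
g2 = lambda y: 0.5 * y    # plays the role of g'

# fixed points of g2(g(x)) are x = -1 +/- sqrt(2); those of g(g2(y)) are
# y = -2 +/- 2*sqrt(2), and the fixed points correspond under g
print(index_sum(lambda x: g2(g(x)), -4, 4))   # 0
print(index_sum(lambda y: g(g2(y)), -6, 6))   # 0
```

The correspondence x∗ ↦ g(x∗) matches the fixed points of the two compositions one for one, with matching signs, which is the content of Commutativity in this smooth one-dimensional setting.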
Throughout this section we will work with two fixed index scopes S and S̃. We
say that S̃ subsumes S if, for every X ∈ S_S, we have X ∈ S_{S̃} and I_S(X) ⊂ I_{S̃}(X).
If this is the case, and Λ̃ is an index for S̃, then the restriction (in the obvious sense)
of Λ̃ to S is an index for S. (It is easy to check that this is an automatic consequence
of the definition of an index.) If Λ is an index for S, then an extension of Λ to S̃ is an
index Λ̃ for S̃ whose restriction to S is Λ.

If f : C → X is in I_{S̃}(X), a narrowing of focus for f is a pair (D, E) of
compact subsets of int C such that FP(f) ⊂ int E and f(E) ∪ E ⊂ int D. Let ε_{(D,E)} be
the supremum of the set of ε > 0 such that d(x, f(x′)) > 2ε whenever x′ ∈
C \ int E, x ∈ C, and d(x, x′) < ε,
where d is the given metric for X. (Of course X has many metrics that give the
same topology. In contexts such as this we will implicitly assume that one has been
selected.)
Since f is continuous and admissible, narrowings of focus for f exist: continuity
implies the existence of an open neighborhood V of FP(f) satisfying V̄ ∪ f(V̄) ⊂
int C. Repeating this observation gives an open neighborhood W of FP(f) satis-
fying W̄ ∪ f(W̄) ⊂ V, and we can let D = V̄ and E = W̄.
Let C be a compact subset of a metric space X. An (S, ε)-domination of C is
a quadruple (X̃, C̃, φ, ψ) in which X̃ ∈ S_S, C̃ is a compact subset of X̃, φ : C → C̃
and ψ : C̃ → X are continuous functions, and ψ ∘ φ is ε-homotopic to Id_C. We say
that S dominates S̃ if, for each X ∈ S_{S̃}, each compact C ⊂ X, and each ε > 0,
there is an (S, ε)-domination of C. This section's main result is:

Λ̃_X(f) = Λ_{X̃}(φ ∘ f ∘ ψ|_{ψ⁻¹(E)})   (∗)
Let S_{S^{ANR}} be the class of compact absolute neighborhood retracts, and for each
X ∈ S_{S^{ANR}} let I_{S^{ANR}}(X) be the union over open C ⊂ X of the sets of index
admissible functions in C(C̄, X). These definitions specify an index scope S^{ANR}
because S_{S^{ANR}} is closed under formation of finite cartesian products, and f × f′ ∈
I_{S^{ANR}}(X × X′) whenever X, X′ ∈ S_{S^{ANR}}, f ∈ I_{S^{ANR}}(X), and f′ ∈ I_{S^{ANR}}(X′).
Theorem 13.4.2. There is a unique index Λ^{ANR} for S^{ANR} that extends Λ⁰, and
Λ^{ANR} is multiplicative.

Proof. Theorem 7.6.4 implies that S⁰ dominates S^{ANR}, and S^{ANR} evidently sub-
sumes S⁰.
The rest of this section is devoted to the proof of Theorem 13.4.1. Before
proceeding, the reader should be warned that this is, perhaps, the most difficult
argument in this book. Certainly it is the most cumbersome, from the point of
view of the burden of notation, because the setup used to extend the index is
complex, and then several verifications are required in that setting. To make the
expressions somewhat more compact, from this point forward we will frequently
drop the symbol ∘ for composition, for instance writing ψφ rather than ψ ∘ φ.
Lemma 13.4.3. Suppose X ∈ S_{S̃}, f : C → X is in I_{S̃}(X), (D, E) is a narrowing
of focus for f, 0 < ε < ε_{(D,E)}, and (X̃, C̃, φ, ψ) is an (S, ε)-domination of C. Let
D̃ = ψ⁻¹(D) and Ẽ = ψ⁻¹(E). Then

(X, D, E, φf|_D, X̃, D̃, Ẽ, ψ|_{D̃})

is a commutativity configuration.

Proof. We need to verify (a)-(d) of Definition 13.2.2. We have E ⊂ D ⊂ X with
D and E compact, so Ẽ ⊂ D̃ ⊂ X̃. In addition D̃ and Ẽ are closed because ψ is
continuous, so they are compact because they are subsets of C̃. Thus (a) holds.

Of course φf|_D and ψ|_{D̃} are continuous. We have

ψ(φ(f(E))) ⊂ U_ε(f(E)) ⊂ int D,

so φ(f(E)) ⊂ ψ⁻¹(int D) ⊂ int D̃. In addition, ψ(Ẽ) ⊂ E ⊂ int D. Thus (b) holds.

If x ∈ D \ int E, then d(x, f(x)) > 2ε_{(D,E)} > 2ε and d(f(x), ψ(φ(f(x)))) < ε,
so x cannot be a fixed point of ψφf. Thus FP(ψφf|_D) ⊂ int E. If x̃ ∈ D̃ is a fixed
point of φfψ|_{D̃}, then ψ(x̃) is a fixed point of ψφf|_D, so FP(φfψ|_{D̃}) ⊂ ψ⁻¹(int E) ⊂
int Ẽ. Thus (c) holds.

We now establish (∗∗) and (∗∗∗). We have

ψ(φ(f(FP(ψφf|_D)))) = FP(ψφf|_D) ⊂ int E,

so

φ(f(FP(ψφf|_D))) ⊂ ψ⁻¹(int E) ⊂ Ẽ,

and FP(φfψ|_{D̃}) ⊂ int Ẽ, so

ψ(FP(φfψ|_{D̃})) ⊂ ψ(int Ẽ) ⊂ E.

Thus (d) holds.
From this point forward we assume that there is a given index Λ for S. In
order for our proposed definition of Λ̃ to be workable it needs to be the case that
the definition of the derived index does not depend on the choice of narrowing or
domination, and it turns out that proving this will be a substantial part of the
overall effort. The argument is divided into a harder part proving a special case
and a reduction of the general case to this.
13.4. EXTENSION BY COMMUTATIVITY 185
Then

Λ_{X̃₁}(φ₁ f ψ₁|_{Ẽ₁}) = Λ_{X̃₂}(φ₂ f ψ₂|_{Ẽ₂}).

Proof. The definition of domination gives an ε₁-homotopy h : C × [0, 1] → X with
h₀ = Id_C and h₁ = ψ₁φ₁, and an ε₂-homotopy j : C × [0, 1] → X with j₀ = Id_C and
j₁ = ψ₂φ₂. We will show that:

(a) the homotopy t ↦ φ₁ jₜ f ψ₁|_{Ẽ₁} is well defined and index admissible;

Specifically, in view of (a) and (b) the first and third equalities follow from the
homotopy principle, while (c) permits an application of Commutativity that gives
the second equality.

For each t the composition φ₁ jₜ f ψ₁|_{Ẽ₁} is well defined because

and

ψ₁(φ₁(ψ₂(Ẽ₂))) ⊂ U_{ε₁}(ψ₂(Ẽ₂)) ⊂ U_ε(E) ⊂ int D,
186 CHAPTER 13. THE FIXED POINT INDEX
so

which is to say that (∗∗) and (∗∗∗) hold, which implies (d), completing the proof.
The hypotheses of the next result are mostly somewhat more general, but we
now need to assume that S dominates S̃.
Lemma 13.4.5. Assume that S dominates S̃. Let X be an element of S_{S̃}, and let
f : C → X be an element of I_{S̃}(X). Suppose (D₁, E₁) and (D₂, E₂) are narrowings
of focus for f, 0 < ε₁ < ε_{(D₁,E₁)}, 0 < ε₂ < ε_{(D₂,E₂)}, and (X̃₁, C̃₁, φ₁, ψ₁) and
(X̃₂, C̃₂, φ₂, ψ₂) are an (S, ε₁)-domination and an (S, ε₂)-domination of C. Set
Ẽ₁ = ψ₁⁻¹(E₁) and Ẽ₂ = ψ₂⁻¹(E₂). Then

Λ_{X̃₁}(φ₁ f ψ₁|_{Ẽ₁}) = Λ_{X̃₂}(φ₂ f ψ₂|_{Ẽ₂}).
Proof. It suffices to show this when D₁ ⊂ D₂ and E₁ ⊂ E₂, because then the
general case follows from two applications in which first D₁ and E₁, and then D₂
and E₂, are replaced by D₁ ∩ D₂ and E₁ ∩ E₂. The assumption
that S dominates S̃ guarantees the existence of an (S, ε₂′)-domination of C
for arbitrarily small ε₂′, and if we apply the lemma above to this domination and
the given one we find that it suffices to prove the result with the given domination
replaced by this one. This means that we may assume that ε₂ is as small as need
be, and in particular we may assume that ε₂ < ε_{(D₁,E₁)}. Now Additivity implies
that

Λ_{X̃₂}(φ₂ f ψ₂|_{Ẽ₂}) = Λ_{X̃₂}(φ₂ f ψ₂|_{ψ₂⁻¹(E₁)}),

which means that it suffices to establish the result with D₂ and E₂ replaced by D₁
and E₁, which is the case established in the lemma above.
Proof of Theorem 13.4.1. Since S dominates S̃, the objects used to define Λ̃ exist,
and the last result implies that the definition of Λ̃ does not depend on the choice
of (D, E), ε, and (X̃, C̃, φ, ψ). We now verify that Λ̃ satisfies (I1)-(I4) and (M).
Λ̃_X(f) = Λ_{X̃}(φ f ψ|_Ẽ) = 1.
Additivity:
Suppose that FP(f) ⊂ int C₁ ∪ … ∪ int C_r where C₁, …, C_r ⊂ C are compact
and pairwise disjoint. For each j = 1, …, r choose compact sets Dⱼ ⊂ D ∩ Cⱼ and
Eⱼ ⊂ E ∩ Cⱼ such that (Dⱼ, Eⱼ) is a narrowing of focus for (Cⱼ, f|_{Cⱼ}). In view of
Lemma 13.4.5 we may assume that ε < ε_{(Dⱼ,Eⱼ)} for all j. It is easy to see that for
each j, (X̃, C̃, φ|_{Cⱼ}, ψ) is an (S, ε)-domination of Cⱼ. For each j let Ẽⱼ = ψ⁻¹(Eⱼ).
Additivity for Λ gives

Λ̃_X(f) = Λ_{X̃}(φ f ψ|_Ẽ) = Σⱼ Λ_{X̃}(φ f ψ|_{Ẽⱼ}) = Σⱼ Λ̃_X(f|_{Cⱼ}).
Continuity:
It is easy to see that if f′ : C → X is sufficiently close to f, then (D, E) is
a narrowing of focus for (C, f′), and (X̃, C̃, φ, ψ) is an (S, ε)-domination of C. Since
f′ ↦ φ f′ ψ is continuous (Propositions 5.5.2 and 5.5.3), Continuity for Λ gives

Λ̃_X(f′) = Λ_{X̃}(φ f′ ψ|_Ẽ) = Λ_{X̃}(φ f ψ|_Ẽ) = Λ̃_X(f)
For any positive ε₁ < ε_{(D₁,E₁)} and ε₂ < ε_{(D₂,E₂)} Lemma 13.4.5 implies that there is
an (S, ε₁)-domination (X̃₁, C̃₁, φ₁, ψ₁) of C₁ and an (S, ε₂)-domination (X̃₂, C̃₂, φ₂, ψ₂)
of C₂. Let
Here the first and fifth equalities are from the definition of Λ̃, the second and fourth
are implied by Continuity for Λ, and the third is from Commutativity for Λ. In
order for this to work it must be the case that all the compositions in this calculation
are well defined, in the sense that the image of the first function is contained in the
domain of the second function, the homotopies

is a commutativity configuration. Clearly this will be the case when ε₁ and ε₂ are
sufficiently small.
Multiplication:
We now consider X1 , X2 SS , f1 : C1 X1 in IS (X1 ), and f2 : C2 X2 in
IS (X2 ). For each i = 1, 2 let (Di , Ei ) be a narrowing of focus for (Ci , fi ), and let
(Xi , Ci , i , i ) be an (S, i )-domination of Ci , where i < (Di ,Ei ) . The definition of
an index scope requires that X1 × X2 SS and (C1 × C2 , f1 × f2 ) IS (X1 × X2 ).
Clearly (D1 × D2 , E1 × E2 ) is a narrowing of focus for (C1 × C2 , f1 × f2 ). If d1 and
d2 are given metrics for X1 and X2 respectively, endow X1 × X2 with the metric
((x1 , x2 ), (y1 , y2 )) ↦ max{d1 (x1 , y1 ), d2 (x2 , y2 )}.
the second one is an application of Multiplication for and the third is the definition
of .
We now prove that if S subsumes S, then is the unique extension of to
S. Consider X SS and (C, f ) IS (X). For any > 0, (X, C, IdC , IdC ) is
an (S, )-domination of C. For any narrowing of focus (D, E), equation () gives
X (f ) = X (f |E ), and Additivity for gives X (f |E ) = X (f ). Thus extends .
Two indices for S that restrict to necessarily agree everywhere because, by
Continuity and Commutativity, () holds in the circumstances described in the
statement of Theorem 13.4.1.
13.5 Extension by Continuity
Proposition 13.5.4. Suppose I and I are index bases for a compact metric space
X, and I approximates I. Then for any index X : I Z there is a unique
index X : I Z such that for each open C X with compact closure, each F
I U(C, X), and each open D with F P(F ) D and D C, there is a neighborhood
E U(D, X) of F |D such that X (F ) = X (f ) for all f E C(D, X) I.
Proof. Fix C, F I U(C, X), and D as in the hypotheses. Then F |D is index
admissible, hence an element of I because I is an index base.
Applying (E2), let B D X be a neighborhood of Gr(F |D ) such that for
any f, f I C(D, X) with Gr(f ), Gr(f ) B there is a homotopy h : [0, 1]
C(D, X) with h0 = f , h1 = f , and
Gr(ht ) (C X) \ { (x, x) : x C \ D }
for all t. Since F has no fixed points in C \ D, the right hand side is a neighborhood
of Gr(F |D ). We define X by setting
X (F ) := X (f )
and Continuity implies that this definition does not depend on the choice of f .
Since A and B can be replaced by smaller open sets, it does not depend on the
choice of A and B. We must also show that it does not depend on the choice of D.
So, let D′ be another open set with D′ ⊆ C and F P(F ) ∩ (C \ D′ ) = ∅. Then
F P(F ) ⊆ D ∩ D′ . The desired result follows if we can show that it holds when D
and D′ are replaced by D and D ∩ D′ , and also when they are replaced by D ∩ D′
and D′ . Therefore we may assume that D′ ⊆ D.
Let B′ ⊆ D′ × X be a neighborhood of Gr(F |D′ ) such that for any f, f ′ ∈ I ∩
C(D′ , X) with Gr(f ), Gr(f ′ ) ⊆ B′ there is a homotopy h : [0, 1] → C(D′ , X) with
h0 = f , h1 = f ′ , and
Gr(ht ) ⊆ (C × X) \ { (x, x) : x ∈ C \ D′ }
X (g g|D ) = X ( |D ) = X ( |D ) = X (g g |D ).
Multiplication:
For spaces X, X SS and open C X and C X with compact closure
consider
F IS (X) U(C, X) and F IS (X ) U(C , X ).
Then the definition of an index scope implies that F F IS (XX ). Choose open
sets D and D with F P(F ) D, D C, F P(F ) D , and D C . As above, we
can find neighborhoods B U(D, X), B U(D , X ), and D U(DD , X X ),
of F |D , F |D , and (F F )|DD respectively, such that X (F ) = X (f ) for all
f B IS (X), X (F ) = X (f ) for all f B IS (X ), and XX (F F ) =
XX (j) for all j D IS (X X ). Since the formation of cartesian products
of correspondences is a continuous operation (this is Lemma 5.3.4) we may replace
B and B with smaller neighborhoods to obtain F F D for all F B and
F B . Assumption (E1) implies that there are
XX (F F ) = XX (f f ) = X (f ) X (f ) = X (F ) X (F ).
Part III
Chapter 14
Topological Consequences
This chapter is a relaxing and refreshing change of pace. Instead of working very
hard to slowly build up a toolbox of techniques and specific facts, we are going to
harvest the fruits of our earlier efforts, using the axiomatic description of the fixed
point index, and other major results, to quickly derive a number of quite famous
results. In Section 14.1 we define the Euler characteristic, relate it to the Lefschetz
fixed point theorem, and then describe the Eilenberg-Montgomery as a special case.
For two general compact manifolds, the degree of a map from one to the other
is a rather crude invariant, in comparison with many others that topologists have
defined. Nevertheless, when the range is the m-dimensional sphere, the degree
is already a complete invariant in the sense that it classifies functions up to
homotopy: if M is a compact m-dimensional manifold that is connected, and f
and f are functions from M to the m-sphere of the same degree, then f and f
are homotopic. This famous theorem, due to Hopf, is the subject of Section 14.2.
Section 12.4 proves a simple result asserting that the degree of a composition of two
functions is the products of their degrees.
Section 14.3 presents several other results concerning fixed points and antipodal
maps of a map from a sphere to itself. Some of these are immediate consequences
of index theory and the Hopf theorem, but the Borsuk-Ulam theorem requires a
substantial proof, so it should be thought of as a significant independent fact of
topology. It has many consequences, including the fact that spheres of different
dimensions are not homeomorphic.
In Section 14.4 we state and prove the theorem known as invariance of domain.
It asserts that if U Rm is open, and f : U Rm is continuous and injective, then
the image of f is open, and the inverse is continuous. One may think of this as a
purely topological version of the inverse function theorem, but from the technical
point of view it is much deeper.
If a connected set of fixed points has a nonzero index, it is essential. This raises
the question of whether a connected set of fixed points of index zero is necessarily
inessential. Section 14.5 presents two results of this sort.
14.1 Euler, Lefschetz, and Eilenberg-Montgomery
Here is a sketch of a proof that our definition of (M) agrees with Euler's
when M is a triangulated compact 2-manifold. We deform the identity function
slightly, achieving a function f : M M defined as follows. Each vertex of the
triangulation is mapped to itself by f . Each barycenter of an edge is mapped to
itself, and the points on the edge between the barycenter and either of the vertices
of the edge are moved toward the barycenter. Each barycenter of a two dimensional
simplex is mapped to itself. If x is a point on the boundary of the 2-simplex, the
line segment between x and the barycenter is mapped to the line segment between
f (x) and the barycenter, with points on the interior of the line segment pushed
toward the barycenter, relative to the affine mapping. It is easy to see that the only
fixed points of f are the vertices and the barycenters of the edges and 2-simplices.
Euler's formula follows once we show that the index of a vertex is +1, the index of
the barycenter of an edge is −1, and the index of the barycenter of a 2-simplex is +1.
We will not give a detailed argument to this effect; very roughly it corresponds to
the intuition that f is expansive at each vertex, compressive at the barycenter
of each 2-simplex, and expansive in one direction and compressive in another at the
barycenter of an edge.
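As a sanity check on this sketch, the signed count of fixed points can be tallied for a concrete triangulation. The following illustrative computation (my own, not from the text) uses the boundary of the tetrahedron, a triangulation of the 2-sphere, whose Euler characteristic is 2, together with the index values ±1 claimed above:

```python
# Signed count of the fixed points of the deformation described above, for the
# boundary of the tetrahedron: a triangulation of the 2-sphere, so chi = 2.
from itertools import combinations

vertices = list(range(4))                    # 4 vertices
edges = list(combinations(vertices, 2))      # 6 edges
faces = list(combinations(vertices, 3))      # 4 triangular faces

# One fixed point per vertex (index +1), per edge barycenter (index -1),
# and per face barycenter (index +1).
total_index = (+1) * len(vertices) + (-1) * len(edges) + (+1) * len(faces)
euler = len(vertices) - len(edges) + len(faces)  # V - E + F
print(total_index, euler)  # 2 2
```

By Additivity the total index is the index of f , which agrees with the index of the identity, and the agreement of the two printed numbers is exactly Euler's formula in this case.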
Although Euler could not have expressed the idea in modern language, he cer-
tainly understood that the Euler characteristic is important because it is a topo-
logical invariant.
(X) = (X ).
(F | ) = (F ) = (F | ) + (F |).
Proof. Recall (Proposition 7.5.3) that an absolute retract is an ANR that is con-
tractible. Theorem 9.1.1 implies that F can be approximated in the sense of
Continuity by a continuous function, so X (F ) = X (f ) for some continuous
f : X X. Let c : X × [0, 1] X be a contraction. Then (x, t) ↦ c(f (x), t) (or
(x, t) ↦ f (c(x, t))) is a homotopy between f and a constant function, so Homotopy
and Normalization imply that X (f ) = 1. Now the claim follows from the
last result.
Of course the first two expressions agree when x1 = 0, so this is well defined and
continuous, and h1 (x) = q for all x.
In preparation for an application of the Hopf theorem, we introduce an important
concept from topology. If X is a topological space and A X, the pair (X, A)
has the homotopy extension property if, for any topological space Y and any
function g : (X × {0}) (A × [0, 1]) Y , there is a homotopy h : X × [0, 1] Y
that is an extension of g: h(x, 0) = g(x, 0) for all x X and h(x, t) = g(x, t)
for all (x, t) A × [0, 1].
Lemma 14.2.2. The pair (X, A) has the homotopy extension property if and only
if (X {0}) (A [0, 1]) is a retract of X [0, 1].
Proof. If (X, A) has the homotopy extension property, then the inclusion map from
(X {0}) (A [0, 1]) to X [0, 1] has a continuous extension to all of X [0, 1],
which is to say that there is a retraction. On the other hand, if r is a retraction,
then for any g there is continuous extension h = g r.
We will only be concerned with the example given by the next result, but it
is worth noting that this concept takes on greater power when one realizes that
(X, A) has the homotopy extension property whenever X is a simplicial complex
and A is a subcomplex. It is easy to prove this if there is only one simplex in X
that is not in A; either the boundary of is contained in A, in which case there
is an argument like the proof of the following, or it isn't, and another very simple
construction works. The general case follows by induction because if (X, A) and
(A, B) have the homotopy extension property, then so does (X, B). To show this
suppose that g : (X {0}) (B [0, 1]) Y is given. There is a continuous
extension h : A [0, 1] Y of the restriction of g to (A {0}) (B [0, 1]). The
extension of h to all of (X {0}) (A [0, 1]) defined by setting h|X{0} = g|X{0}
is continuous because it is continuous on X {0} and A [0, 1], both of which
are closed subsets of X [0, 1] (here the requirement that A is closed finally shows
up) and since (X, A) has the homotopy extension property this h can be further
extended to all of X [0, 1].
Lemma 14.2.3. The pair (D m , S m1 ) has the homotopy extension property.
Proof. There is an obvious retraction
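One standard choice of retraction projects D m × [0, 1] radially from the point (0, 2) ∈ Rm × R onto (D m × {0}) ∪ (S m−1 × [0, 1]). The following sketch of that projection in coordinates is illustrative only; the case split is my own bookkeeping:

```python
import math

def retract(x, t):
    """Radial projection of (x, t) in D^m x [0, 1] from the point (0, 2),
    landing on (D^m x {0}) union (S^{m-1} x [0, 1])."""
    nx = math.sqrt(sum(c * c for c in x))
    if 2 * nx <= 2 - t:
        # The ray from (0, 2) through (x, t) exits through the bottom face.
        s = 2.0 / (2.0 - t)
        return tuple(s * c for c in x), 0.0
    # Otherwise it exits through the cylindrical side S^{m-1} x [0, 1].
    return tuple(c / nx for c in x), 2.0 - (2.0 - t) / nx

# Points of the target are fixed, as a retraction requires:
print(retract((0.3, 0.4), 0.0))   # a point of D^2 x {0} is unchanged
print(retract((0.6, 0.8), 0.7))   # a point of S^1 x [0, 1] is unchanged
```

The two formulas agree when 2‖x‖ = 2 − t, so the map is continuous, and it is the identity on the target, which is what Lemma 14.2.2 asks for.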
S m := { x Rm+1 : ‖x‖ = 1 }.
Some of our arguments involve induction on m, and for this purpose we will regard
S m1 as a subset of S m by setting
S m1 = { x S m : xm+1 = 0 }.
am (x) = −x.
rm (x, y, t) := (tx + (1 − t)y)/‖tx + (1 − t)y‖.
(f ) = 1 + (−1)m deg(f ).
Proof. Hopf's theorem (Theorem 14.2.1) implies that two maps from S m to itself are
homotopic if they have the same degree, and the index is a homotopy invariant, so
it suffices to determine the relationship between the degree and index for a specific
instance of a map of each possible degree.
We begin with m = 1. For d Z let f1,d : S 1 S 1 be the function
f1,d (cos θ, sin θ) = (cos dθ, sin dθ), so that deg(f1,d ) = d.
Now observe that f1,1 is homotopic to a map without fixed points, while for
d ≠ 1 the fixed points of f1,d are the points
(cos(2kπ/(d − 1)), sin(2kπ/(d − 1)))  (k = 0, . . . , d − 2).
If d > 1, then motion in the domain is translated by f1,d into more rapid motion
in the range, so the index of each fixed point is −1. When d < 1, f1,d translates
motion in the domain into motion in the opposite direction in the range, so the
index of each fixed point is +1. Combining these facts, we conclude that
(f1,d ) = 1 − d,
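These counts are easy to check mechanically. The sketch below (illustrative, using the standard representative θ ↦ dθ) tabulates the fixed points of f1,d and their indices for a few degrees d, with the index values ±1 argued above:

```python
import math

def fixed_points(d):
    # Fixed points of theta -> d*theta (mod 2pi) for d != 1:
    # theta = 2*pi*k/(d - 1), of which there are |d - 1|.
    return [2 * math.pi * k / (d - 1) for k in range(abs(d - 1))]

def local_index(d):
    # Index of each fixed point: -1 if d > 1, +1 if d < 1 (the sign of 1 - d).
    return -1 if d > 1 else 1

for d in (3, -2, 0):
    fp = fixed_points(d)
    print(d, len(fp), sum(local_index(d) for _ in fp), 1 - d)
```

In each case the total index is 1 − d, matching the displayed formula with m = 1.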
S m+ = { αx + βem+1 : x S m1 , β ≥ 0, α2 + β 2 = 1 }.
f am |D = an f.
Proof. There are smooth maps arbitrarily close to f . For such an f the map
p ↦ rm (f (p), −f (−p), 1/2)
(f 1 (q) f 1 (q)) S m1 = ,
centered at the North pole. Since f is antipode preserving, q is also a regular value
of f . In view of the inverse function theorem, f 1 (D D ) is a disjoint union
of diffeomorphic images of D , and none of these intersect S m1 if is sufficiently
small. Concretely, for each p f 1 (q) f 1 (q) the component C p of f 1 (D
D ) containing p is mapped diffeomorphically by f to either D or D , and the
various C p are disjoint from each other and S m1 . Therefore we wish to show that
f 1 (D D ) S+m has an odd number of components.
Let M = S+m \ f 1 (D D ). Clearly M is a compact m-dimensional smooth
∂-manifold. Each point in S m \ {q, −q} has a unique representation of the form
αy + βq where y S m1 , 0 < α ≤ 1, and α2 + β 2 = 1. Let j : S m \ {q, −q} S m1
be the function j(αy + βq) := y, and let
g := j f |M : M S m1 .
Sard's theorem implies that some q ′ S m1 is a regular value of both g and g|∂M .
Theorem 12.2.1 implies that degq′ (g|∂M ) = 0, so (g|∂M )1 (q ′ ) has an even number of
elements. Evidently g maps the boundary of each Cp diffeomorphically onto S m1 ,
so each such boundary contains exactly one element of (g|∂M )1 (q ′ ). In addition,
j maps antipodal points of S m \ {q, −q} to antipodal points of S m1 , so g|S m1 is
antipodal, and our induction hypothesis implies that (g|∂M )1 (q ′ ) S m1 has an
odd number of elements. Therefore the number of components of f 1 (D D )
contained in S+m is odd, as desired.
The hypotheses can be weakened:
Corollary 14.3.7. If the map f : S m S m satisfies f (p) ≠ f (−p) for all p, then
the degree of f is odd.
Proof. This will follow from the last result once we have shown that f is homotopic
to an antipodal map. Let h : S m × [0, 1] S m be the homotopy h(p, t) =
rm (f (p), −f (−p), 1 − t/2). The hypothesis implies that this is well defined, h0 = f ,
and h1 is antipodal.
This result has a wealth of geometric consequences.
Theorem 14.3.8 (Borsuk-Ulam Theorem). The following are true:
(a) If f : S m Rm is continuous, then there is a p S m such that f (p) =
f (am (p)).
(e) Any cover F1 , . . . , Fm+1 of S m by m + 1 closed sets has at least one set that
contains a pair of antipodal points.
(f ) Any cover U1 , . . . , Um+1 of S m by m + 1 open sets has at least one set that
contains a pair of antipodal points.
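Part (a) can be illustrated concretely. The sketch below uses a hypothetical f : S 2 → R2 (my own choice, not from the text) for which the point p with f (p) = f (−p), whose existence the theorem guarantees, can be written in closed form:

```python
import math

# A hypothetical f : S^2 -> R^2 for which the point promised by (a) is explicit.
def f(p):
    x, y, z = p
    return (x + 0.5 * z, y - 0.5 * z)

def g(p):
    # The odd map g(p) = f(p) - f(-p); part (a) says g has a zero.
    q = tuple(-c for c in p)
    return tuple(a - b for a, b in zip(f(p), f(q)))

# Here g(p) = (2x + z, 2y - z), which vanishes on the unit sphere at:
z = math.sqrt(2.0 / 3.0)
p = (-z / 2, z / 2, z)
print(sum(c * c for c in p), g(p))  # norm^2 is 1 (up to rounding); g(p) is (0.0, 0.0)
```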
In the argument above we showed that (a) (b) (c) (d) and (a) (e)
(f). There are also easy arguments for the implications (d) (c) (b) (a) and
(f) (e) (c), so (a)-(f) are equivalent in the sense of each being an elementary
consequence of each other. The proofs that (d) (c) and (c) (b) are obvious
and can be safely left to the reader. To show that (b) (a), for a given continuous
f : S m Rm we apply (b) to f − f ◦ am . To show that (f) (e) observe that if
F1 , . . . , Fm+1 are closed and cover S m , then for each n the sets U1/n (Fi ) are open
and cover S m , so there is a pn with pn , −pn U1/n (Fi ) for some i. Any limit point
of the sequence {pn } has the desired property.
The proof that (e) (c) is more interesting. Consider an m-simplex that is
embedded in D m with the origin in its interior. Let F1 , . . . , Fm+1 be the radial
projections of the facets of the simplex onto S m1 . These sets are closed and cover
S m1 , and since each facet is separated from the origin by a hyperplane, no Fi
contains an antipodal pair of points. If f : S m S m1 is continuous,
then f 1 (F1 ), . . . , f 1 (Fm+1 ) are a cover of S m by closed sets, and (e) implies the
existence of p, −p f 1 (Fi ) for some i. If f were also antipodal, then f (p) and
f (−p) = −f (p) would both lie in Fi , which is impossible.
As a consequence of the Borsuk-Ulam theorem, the following obvious fact is
actually highly nontrivial.
Theorem 14.3.9. Spheres of different dimensions are not homeomorphic.
Proof. If k < m then, since S k can be embedded in Rm , part (a) of the Borsuk-Ulam
theorem implies that a continuous function from S m to S k cannot be injective.
The next result is quite famous, being commonly regarded as one of the major
accomplishments of algebraic topology. As the elementary nature of the assertion
suggests, it is applied quite frequently.
Proof. The last result can be applied to a closed disk surrounding any point in the
domain, so for any open V U, f (V ) is open. Thus f 1 is continuous.
= : V M M
Proposition 11.5.2 and Corollary 10.2.5 combine to imply that there is a vector field
on Y0 with image contained in 1 (W ) that agrees with on Y0 \ Y1 , is C r1 on
Y2 , and has only regular equilibria, all of which are in Y2 . The number of equilibria
is necessarily finite, and we may assume that our vector field minimizes this number
among all the vector fields on Y0 that agree with it on Y0 \ Y1 , are C r1 on Y2 , and
have only regular equilibria in Y2 . If it has no equilibria, then we may define a continuous
function f : C X without any fixed points whose graph is contained in W by
setting f(p) = ((p)) if p Y0 and setting f(p) = f (p) otherwise.
Aiming at a contradiction, suppose that has equilibria. Since the index is
zero, there must be two equilibria of opposite index, say p0 and p1 , and it suffices
to show that we can further perturb in a way that eliminates both of them.
There is a C r embedding : (, 1 + ) Upp with (0) = p0 and (1) = p1 .
(This is obvious, but painful to prove formally, and in addition the case m = 1
requires special treatment. A formal verification would do little to improve the
reader's understanding, so we omit the details.) Applying the tubular neighborhood
theorem, this path can be used to construct a C r parameterization : Z U where
Z Rm is a neighborhood of D m .
Let g : Z Rm be defined by setting g(x) = D(x)1 (x) . Proposition 14.5.1
gives a continuous function g : Z Rm \ {0} that agrees with g on the closure of
Z \ D m . We extend g to all of Z by setting g(x) = g(x) if x ∉ D m . Define a new
vector field on (Z) by setting
There are two final technical points. In order to ensure that (p) 1 (W ) for
all p we can first multiply g by a C r function : D m (0, 1] that is identically
1 on Z \ D m and close to zero in the interior of D m outside of some neighborhood of
S m1 . We can also use Proposition 11.5.2 and Corollary 10.2.5 to further perturb
to make it C r1 without introducing any additional equilibria. This completes the
construction, thereby arriving at a contradiction that completes the proof.
Economic applications call for a version of the result for correspondences. Ideally
one would like to encompass contractible valued correspondences in the setting of
a manifold, but the methods used here are not suitable. Instead we are restricted
to convex valued correspondences, and thus to settings where convexity is defined.
Theorem 14.5.3. If X Rm is compact and convex, C X is compact, F :
C X is an index admissible upper semicontinuous convex valued correspondence,
(F ) = 0, and F P(F ) is connected, then F is inessential.
Caution: The analogous result does not hold for essential sets of Nash equilibria,
which are defined by Jiang (1963) in terms of perturbations of the game's payoffs.
Hauk and Hurkens (2002) give an example of a game with a component of the set
of Nash equilibria that has index zero but is robust with respect to perturbations
of payoffs.
Proof. Let W C X be an open set containing the graph of F . We will show that
there is a continuous f : C X with Gr(f ) W and F P(f ) = . Let x0 be a point
in the interior of X, let h : X [0, 1] X be the contraction h(x, t) = (1t)x+tx0 ,
and for t [0, 1] let ht ◦ F be the correspondence x ↦ ht (F (x)). This correspondence
is obviously upper semicontinuous and convex valued, and Gr(ht ◦ F ) ⊆ W for small
t > 0, so it suffices to prove the result with F replaced by ht ◦ F for such t. Therefore
we may assume that the image of F is contained in the interior of X.
For each x F P(F ) we choose convex neighborhoods Yx C of x and Zx X
of F (x) such that Yx Zx and Yx Zx W . Choose x1 , . . . , xk such that F P(F )
Yx1 . . . Yxk , and let
Note that for all (x, y) Z0 , Z0 contains the line segment { (x, (1 − t)y + tx0 ) : t [0, 1] }. Let
Y1 and Y2 be open subsets of C with F P(F ) Y2 , Y 2 Y1 , Y 1 Y0 , and Y2 is
is C , its graph is contained in W0 , and it has only regular fixed points. If > 0
is sufficiently small, then f (x) U (x) for all x Y 2 . Therefore we may assume
that f (x) U (x) for all x Y 2 .
Define a function : Y2 Rm by setting (x) = f (x) − x. Aiming at a
contradiction, suppose that has zeros. Since the index of f is zero, there must be two zeros
of opposite index, say x0 and x1 . As in the last proof, there is a C r embedding :
(, 1 + ) Y2 with (0) = x0 and (1) = x1 . Applying the tubular neighborhood
theorem, this path can be used to construct a C parameterization : T Y2
where T Rm is a neighborhood of D m .
Let g : T Rm be defined by setting g(x) = D(x)1 (x) . Proposition 14.5.1
gives a continuous function g : T Rm \ {0} that agrees with g on the closure of
T \ D m . We extend g to all of T by setting g(x) = g(x) if x ∉ D m . Define a new
vector field on (T ) by setting
There are two final technical points. In order to ensure that ‖(x)‖ < for all
x we can first multiply g by a C function : D m (0, 1] that is identically 1
on T \ D m and close to zero in the interior of D m outside of some neighborhood of
S m1 . We can also use Proposition 11.5.2 and Corollary 10.2.5 to further perturb
to make it C without introducing any additional zeros. We can now define
a function f : C X by setting f (x) = x + (x) if x T and f (x) = f (x)
otherwise. Since f has all the properties of f , and two fewer fixed points, this is a
contradiction, and the proof is complete.
Chapter 15
Vector Fields and Their Equilibria
Under mild technical conditions, explained in Sections 15.1 and 15.2, a vector
field on a manifold M determines a dynamical system. That is, there is a function
: W M, where W M × R is a neighborhood of M × {0}, such that the
derivative of at (p, t) W , with respect to time, is the value of the vector field
at (p, t). In this final chapter
we develop the relationship between the fixed point index and the stability of rest
points, and sets of rest points, of such a dynamical system.
In addition to the degree and the fixed point index, there is a third expression of
the underlying mathematical principle for vector fields. In Section 15.3 we present
an axiomatic description of the vector field index, paralleling our axiom systems
for the degree and fixed point index. Existence and uniqueness are established by
showing that the vector field index of |C , for suitable compact C M, agrees
with the fixed point index of (, t)|C for small negative t. Since we are primarily
interested in forward stability, it is more to the point to say that the fixed point
index of (, t)|C for small positive t agrees with the vector field index of |C .
The notion of stability we focus on, asymptotic stability, has a rather compli-
cated definition, but the intuition is simple: a compact set A is asymptotically
stable if the trajectory of each point in some neighborhood of A is eventually drawn
into, and remains inside, arbitrarily small neighborhoods of A. In order to use the
fixed point index to study stability, we need to find some neighborhood of such an A
that is mapped into itself by (, t) for small positive t. The tool we use to achieve
this is the converse Lyapunov theorem, which asserts that if A is asymptotically
stable, then there is a Lyapunov function for that is defined on a neighborhood of
A. Unlike the better known Lyapunov theorem, which asserts that the existence of
a Lyapunov function implies asymptotic stability, the converse Lyapunov theorem
is a more recent and difficult result. We prove a version of it that is sufficient for
our needs in Section 15.5.
Once all this background material is in place, it will not take long to prove the
culminating result, that if A is asymptotically stable, and an ANR, then the vector
field index of is the Euler characteristic of A. This was proved in the context of
a game theoretic model by Demichelis and Ritzberger (2003). The special case of A
being a singleton is a prominent result in the theory of dynamical systems, due to
Krasnoselski and Zabreiko (1984): if an isolated rest point is asymptotically stable
for , then the vector field index of that point for is 1.
Due to its fundamental character, a detailed proof would be out of place here,
but we will briefly describe the central ideas of two methods. First, for any > 0
one can define a piecewise linear approximate solution going forward in time by
setting F (x, 0) = x and inductively applying the equation
i (p, t) = i (Fi (1
i (p), t))
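The piecewise linear construction is Euler's method: F (x, 0) = x and F (x, (k + 1)δ) = F (x, kδ) + δ times the vector field at F (x, kδ). Here is an illustrative sketch for the vector field x ↦ −x on R, whose exact flow is x e−t (this example field is mine, not the text's):

```python
import math

def euler_flow(zeta, x, t, delta):
    """Piecewise linear approximate solution: F(x, 0) = x and
    F(x, (k+1)*delta) = F(x, k*delta) + delta * zeta(F(x, k*delta))."""
    for _ in range(int(round(t / delta))):
        x = x + delta * zeta(x)
    return x

zeta = lambda x: -x                      # exact flow: Phi(x, t) = x * exp(-t)
exact = math.exp(-1.0)
for delta in (0.1, 0.01, 0.001):
    print(delta, abs(euler_flow(zeta, 1.0, 1.0, delta) - exact))
```

The error shrinks roughly linearly in δ, which is the convergence underlying the existence argument.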
If W and is a second pair with these properties, then W W satisfies (a), and
uniqueness implies that and agree on W W , so the function on W W
that agrees with on W and with on W satisfies (b). In fact this logic extends
to any, possibly infinite, collection of pairs. Applying it to the collection of all such
pairs shows that there is a maximal W satisfying (a), called the flow domain of
, such that there is a unique : W M satisfying (b), which is called the flow
of . Since the flow agrees, in a neighborhood of any point, with a function derived
(by change of time) from one of those given by Theorem 15.2.5, it is continuous,
and it is C s (1 s r) if is C s .
The vector field is said to be complete if W = M × R. When this is the case
each (, t) : M M is a homeomorphism (or a C s diffeomorphism if the vector
field is C s ) with inverse (, −t), and t ↦ (, t) is a homomorphism from R (thought
of as a group) to the space of homeomorphisms (or C s diffeomorphisms) between
M and itself.
It is important to understand that when is not complete, it is because there
are trajectories that go to infinity in finite time. One way of making this rigorous is
to define going to infinity as a matter of eventually being outside any
compact set. Suppose that Ip = (a, b), where b < ∞, and C M is compact. If we
had (p, tn ) C for all n, where {tn } is a sequence in (a, b) converging to b, then
after passing to a subsequence we would have (p, tn ) → q for some q C, and we
could use the method of the last proof to show that (p, b) W .
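A standard illustration of incompleteness (my example, not the text's) is the vector field x ↦ x2 on R: the trajectory from x0 > 0 is x(t) = x0 /(1 − x0 t), so Ip = (−∞, 1/x0 ) and the trajectory leaves every compact set as t → 1/x0 :

```python
# For zeta(x) = x**2 on R the trajectory from x0 > 0 is x(t) = x0/(1 - x0*t),
# which solves x' = x**2 and escapes to infinity as t -> b = 1/x0 < infinity.
def trajectory(x0, t):
    assert t < 1.0 / x0, "t is beyond the flow domain"
    return x0 / (1.0 - x0 * t)

x0 = 2.0                                  # blow-up time b = 0.5
for t in (0.0, 0.4, 0.49, 0.499):
    print(t, trajectory(x0, t))           # grows without bound as t -> 0.5
```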
(V1) ind() = 1 for all V(M) with domain C such that there is a C r parame-
terization : V M with C (V ), 1 (C) = D m , and D(x)1 (x) = x
for all x D m = { x Rm : ‖x‖ ≤ 1 }.
(V2) ind() = Σsi=1 ind(|Ci ) whenever V(M), C is the domain of , and
C1 , . . . , Cs are pairwise disjoint compact subsets of C such that has no equilibria
in C \ (int C1 ∪ . . . ∪ int Cs ).
Remark: In the theory of dynamical systems we are more interested in the future
than in the past. In particular, forward stability is of much greater interest than
backward stability, even though the symmetry t ↦ −t makes the study of one
equivalent to the study of the other. From this point of view it seems that it would
have been preferable to define the vector field index with (V1) replaced by the
normalization requiring that the vector field x ↦ −x ∈ Tx Rm has index 1.
The remainder of this section is devoted to the proof of Theorem 15.3.2. Fix
V(M) with domain C. The first order of business is to show that can be ap-
proximated by a well enough behaved vector field that is defined on a neighborhood
of C.
Since C is compact, it is covered by the interiors of a finite collection K1 , . . . , Kk
of compact sets, with each Ki contained in an open Vi that is the image of a C r
parameterization i . Each i induces an isomorphism between T Vi and Vi Rm ,
so that the Tietze extension theorem implies that there is a vector field ζi on Vi that
agrees with on C ∩ Vi . There is a partition of unity {φi } for K1 ∪ . . . ∪ Kk subordinate
to the cover V1 , . . . , Vk , and we may define an extension of to V = ∪i Vi by setting
(p) = Σ{i : p∈Vi } φi (p)ζi (p).
π2 : Tp M × Tp M → Tp M, π2 (v, w) = w,
be the projection onto the second component. We say that p is a regular equilibrium
of if p is an equilibrium of and π2 ◦ D(p) is nonsingular. (Intuitively, the
derivative at p of the vector field has rank m.) We need the following local result.
Proof. The equidimensional case of Sard's theorem implies that the set of regular
values of f |V is dense, and if y is a regular value of f |V , then all the zeros of fy |K
are regular. If the claim is false, there must be a sequence yn → 0 such that for
each n there is an xn V D such that xn is a singular zero of fyn . But V D is
compact, so the sequence {xn } must have a limit point, which is a singular zero of
f |D by continuity, contrary to assumption.
find that the index of the equilibrium of the second type is −1, so the vector field
index is indeed uniquely determined by the axioms.
We still need to construct the index. One way to proceed would be to define
the vector field index to be the index of nearby smooth approximations with reg-
ular equilibria. This is possible, but the key step, namely showing that different
approximations give the same result, would duplicate work done in earlier chapters.
Instead we will define the vector field index using the characterization in terms of
the fixed point index given in the statement of Theorem 15.3.2, after which the
axioms for the fixed point index will imply (V1)-(V3).
We need the following technical fact.
Therefore the intermediate value theorem implies that |⟨(p, t) − p, v⟩| ≤ M |t| ‖v‖.
Since v may be any unit vector, () follows from this.
Now suppose that (p, t) = p for some (p, t) C ((, 0) (0, )). Rolle's
theorem implies that there is some s between 0 and t such that
0 = (d/ds)⟨(p, s) − p, p ⟩ = ⟨(p,s) , p ⟩,
but among the vectors that are orthogonal to p , the origin is the closest, and
so this is impossible.
We now define the vector field index of the pair (U, ) to be ( (, t)|C ), where
is a nearby C r1 vector field, for all sufficiently small negative t. Since any such
vector field is vector field admissible, the last result (applied to C) implies that
(, t)|C is index admissible for all small negative t, and it also (by Homotopy)
implies that the choice of t does not affect the definition.
We must also show that the choice of does not matter. Certainly there is a
neighborhood such that for 0 and 1 in this neighborhood and all s [0, 1], s =
(1 − s)0 + s1 is index admissible. In addition, that s (p, t) is jointly continuous
as a function of (p, t, s) follows from Theorem 15.2.5 applied to the vector field
(p, s) 7 (s (p), 0) T(p,s) (M (, 1 + )) on M (, 1 + ), where is a suitable
small positive number. Therefore Continuity for the fixed point index implies that
(0 (, t)|C ) = (1 (, t)|C ).
We now have to verify that our definition satisfies (V1)-(V3). But the result
established in the last paragraph immediately implies (V3). Of course (V2) follows
directly from the Additivity property of the fixed point index. Finally, the flow of
the vector field (x) = x on Rm is (x, t) = et x, so for small negative t there is
an index admissible homotopy between (, t)|Dm and the constant map x ↦ 0, so
(V1) follows from Continuity and Normalization for the fixed point index.
All that remains of the proof of Theorem 15.3.2 is to show that if is locally
Lipschitz and defined in a neighborhood of C, then
for sufficiently small positive t. Since we can approximate with a vector field that
is C r1 and has only regular equilibria, by (V2) it suffices to prove this when C is
a single regular equilibrium. If is one of the two vector fields x ↦ x ∈ Tx Rm or
x ↦ (−x1 , x2 , . . . , xm ) ∈ Tx Rm on Rm , then (x, t) = (x, t) for all x and t,
so the result follows from the relationship between the index and the determinant
of the derivative.
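In the diagonal case this relationship can be checked directly: the time-t map of the linear field x ↦ Ax with A = diag(a1 , . . . , am ) has derivative diag(eta1 , . . . , etam ) at 0, so its fixed point index there is the sign of det(I − etA ), and for small negative t this agrees with the sign of det A. A sketch (illustrative only):

```python
import math

def fp_index_small_negative_t(eigs, t=-0.01):
    # Fixed point index at 0 of x -> exp(t*A) x for A = diag(eigs):
    # the sign of det(I - exp(t*A)) = prod(1 - exp(t*a)).
    det = 1.0
    for a in eigs:
        det *= 1.0 - math.exp(t * a)
    return 1 if det > 0 else -1

def sign_det(eigs):
    det = 1.0
    for a in eigs:
        det *= a
    return 1 if det > 0 else -1

# Each factor 1 - exp(t*a) is approximately -t*a, which for t < 0 has the
# sign of a, so the two computations agree at any regular equilibrium.
for eigs in [(1.0, 1.0), (-1.0, 1.0), (-1.0, -2.0, 3.0)]:
    print(eigs, fp_index_small_negative_t(eigs), sign_det(eigs))
```

The first case is the normalization (V1), and the second is the "second type" of equilibrium discussed above, with index −1.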
One of the earliest and most useful tools for understanding stability was intro-
duced by Lyapunov toward the end of the 19th century. A function f : M R is
-differentiable if the -derivative
f (p) = (d/dt) f ((p, t))|t=0
is defined for every p M. A continuous function L : M [0, ) is a Lyapunov
function for A M if:
(a) L1 (0) = A;
(a) A is compact;
(b) A is invariant;
Proof. If L((p, t)) > 0 for some p A and t > 0, the intermediate value theorem
would give a t′ [0, t] with
0 < (d/dt) L((p, t))|t=t′ = (d/dt) L(((p, t′ ), t))|t=0 ,
contrary to (b). Therefore A = L1 (0) is invariant.
{ (p, t) : t 0 } L1 ([0, ]) K,
Proof. Let U be a neighborhood of A that does not contain any element of {pn }.
Since {(pn , tn )} is bounded, it is contained in a compact set, so the last result
gives a T such that ((pn , tn ), t) = (pn , tn + t) U for all t T . For all n we
have tn > T because otherwise pn = (pn , 0) U.
Proof. Otherwise there is a $p$ and sequence $\{t_n\}$ with $t_n \to -\infty$ such that $\{\Phi(p, t_n)\}$ is bounded and consequently contained in a compact set. The last result implies that this is impossible.
Let $\theta : M \to [0, \infty)$ be the function
$$\theta(p) = \min_{t \le 0} d(\Phi(p, t), A).$$
Then $\theta$ is continuous.
Proof. Since $\theta(p) \le d(p, A)$, $\theta$ is continuous at points in $A$. Suppose that $\{p_n\}$ is a sequence converging to a point $p \notin A$. The last result implies that there are $t^* \le 0$ and $t_n \le 0$ for each $n$ such that $\theta(p) = d(\Phi(p, t^*), A)$ and $\theta(p_n) = d(\Phi(p_n, t_n), A)$. The continuity of $\Phi$ and $d$ gives
$$\limsup_n \theta(p_n) \le \lim_n d(\Phi(p_n, t^*), A) = d(\Phi(p, t^*), A) = \theta(p).$$
On the other hand $d(\Phi(p_n, t_n), A) \le d(p_n, A)$, so the sequence $\{\Phi(p_n, t_n)\}$ is bounded, and Lemma 15.5.4 implies that $\{t_n\}$ is bounded below. Passing to a subsequence, we may suppose that $t_n \to t^{**}$, so that
$$\theta(p_n) = d(\Phi(p_n, t_n), A) \to d(\Phi(p, t^{**}), A) \ge \theta(p),$$
and therefore $\theta(p_n) \to \theta(p)$.
We are now ready for the main construction. Let $L : M \to [0, \infty)$ be defined by
$$L(p) = \int_0^\infty \theta(\Phi(p, s)) \exp(-s)\, ds.$$
The rest of the argument verifies that $L$ is, in fact, a Lyapunov function.
Since $A$ is invariant, $L(p) = 0$ if $p \in A$. If $p \notin A$, then $L(p) > 0$ because $\theta(p) > 0$.
To show that $L$ is continuous at an arbitrary $p \in M$ we observe that for any $\varepsilon > 0$ there is a $T$ such that $\theta(\Phi(p, T)) < \varepsilon/2$. Since $\theta$ is continuous we have $\theta(\Phi(p', T)) < \varepsilon/2$ and $|\theta(\Phi(p', t)) - \theta(\Phi(p, t))| < \varepsilon/2$ for all $p'$ in some neighborhood of $p$ and all $t \in [0, T]$, so that
$$|L(p') - L(p)| \le \int_0^T |\theta(\Phi(p', s)) - \theta(\Phi(p, s))| \exp(-s)\, ds + \int_T^\infty \theta(\Phi(p', s)) \exp(-s)\, ds + \int_T^\infty \theta(\Phi(p, s)) \exp(-s)\, ds < \varepsilon$$
for all $p'$ in this neighborhood.
To show that $L$ is $\Phi$-differentiable, and to compute its derivative, we observe that
$$L(\Phi(p, t)) = \int_0^\infty \theta(\Phi(p, t + s)) \exp(-s)\, ds = \exp(t) \int_t^\infty \theta(\Phi(p, s)) \exp(-s)\, ds,$$
so that
$$L(\Phi(p, t)) - L(p) = (\exp(t) - 1) \int_t^\infty \theta(\Phi(p, s)) \exp(-s)\, ds - \int_0^t \theta(\Phi(p, s)) \exp(-s)\, ds.$$
Dividing by $t$ and taking the limit as $t \to 0$ gives
$$\dot L(p) = L(p) - \theta(p).$$
Note that
$$L(p) < \theta(p) \int_0^\infty \exp(-s)\, ds = \theta(p)$$
because $\theta(\Phi(p, \cdot))$ is weakly decreasing with $\lim_{t \to \infty} \theta(\Phi(p, t)) = 0$. Therefore $\dot L(p) < 0$ when $p \notin A$.
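The pieces of this construction can be traced through a concrete one-dimensional case. The following sketch is not from the text: it takes the flow $\Phi(p, t) = e^{-t}p$ on $\mathbb{R}$ with $A = \{0\}$, for which $\theta(p) = |p|$ and $L(p) = |p|/2$ in closed form, and checks the identity $\dot L(p) = L(p) - \theta(p)$ numerically; all function names are illustrative.

```python
import numpy as np

# Flow of the field zeta(x) = -x on R, with A = {0}: Phi(p, t) = exp(-t) p.
def Phi(p, t):
    return np.exp(-t) * p

def theta(p):
    # min over t <= 0 of d(Phi(p, t), A); attained at t = 0 for this flow.
    return abs(p)

def L(p, cutoff=40.0, n=400000):
    # L(p) = integral_0^infinity theta(Phi(p, s)) exp(-s) ds, midpoint rule.
    ds = cutoff / n
    s = (np.arange(n) + 0.5) * ds
    return float(np.sum(theta(Phi(p, s)) * np.exp(-s)) * ds)

p, h = 2.0, 1e-5
Ldot = (L(Phi(p, h)) - L(Phi(p, -h))) / (2 * h)  # Phi-derivative of L at p
print(L(p))                    # close to |p| / 2 = 1.0
print(Ldot, L(p) - theta(p))   # both close to -1.0
```

Here $L$ decreases strictly along the flow away from $A$, exactly as the displayed inequality requires.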
We need one more technical result.
Lemma 15.5.7. If $\{(p_n, t_n)\}$ is a sequence such that $d(p_n, A) \to \infty$ and there is a number $T$ such that $t_n < T$ for all $n$, then $d(\Phi(p_n, t_n), A) \to \infty$.
Proof. Suppose not. After passing to a subsequence there is a $B > 0$ such that $d(\Phi(p_n, t_n), A) < B$ for all $n$, so the sequence $\{\Phi(p_n, t_n)\}$ is contained in a compact set $K$. Since the domain of attraction of $A$ is all of $M$, $\Phi$ is continuous, and $K$ is compact, for any $\varepsilon > 0$ there is some $S$ such that $d(\Phi(p, t), A) < \varepsilon$ whenever $p \in K$ and $t > S$. The function $(p, t) \mapsto d(\Phi(p, t), A)$ is continuous, hence bounded on the compact set $K \times [-T, S]$, so it is bounded on all of $K \times [-T, \infty)$. But this is impossible because $-t_n > -T$ and
$$d(\Phi(\Phi(p_n, t_n), -t_n), A) = d(p_n, A) \to \infty.$$
It remains to show that if $U$ is open and contains $A$, then there is an $\varepsilon > 0$ such that $L^{-1}([0, \varepsilon]) \subset U$. The alternative is that there is some sequence $\{p_n\}$ in $M \setminus U$ with $L(p_n) \to 0$. Since $L$ is continuous and positive on $M \setminus U$, the sequence must eventually be outside any compact set. For each $n$ we can choose $t_n \le 1$ such that $\theta(\Phi(p_n, 1)) = d(\Phi(p_n, t_n), A)$, and the last result implies that $\theta(\Phi(p_n, 1)) \to \infty$, so
$$L(p_n) \ge \int_0^1 \theta(\Phi(p_n, t)) \exp(-t)\, dt \ge \theta(\Phi(p_n, 1)) \int_0^1 \exp(-t)\, dt \to \infty.$$
This contradiction completes the proof that $L$ is a Lyapunov function, so the proof of Theorem 15.5.1 is complete.
This works because there is some neighborhood $U$ of $A$ such that $c(\Phi(p, r(p)), t)$ is defined and in the interior of $A$ for all $p \in U$ and all $0 \le t \le 1$, and $\Phi(A, T) \subset U$ if $T$ is sufficiently large.
Physical equilibrium concepts are usually rest points of explicit dynamical systems, for which the notion of stability is easily understood. For economic models, dynamic adjustment to equilibrium is a concept that goes back to Walras' notion of tâtonnement, but such adjustment is conceptually problematic. If there is gradual adjustment of prices, or gradual adjustment of mixed strategies, and the agents understand and expect this, then instead of conforming to such dynamics the agents will exploit and undermine them. For this reason there are, to a rough approximation, no accepted theoretical foundations for a prediction that an economic or strategic equilibrium is dynamically stable.
Paul Samuelson (1941, 1942, 1947) advocated a correspondence principle, according to which dynamical stability of an equilibrium has implications for the qualitative properties of the equilibrium's comparative statics. Samuelson's writings consider many particular models, but he never formulated the correspondence principle as a precise and general theorem, and the economics profession's understanding of it has languished, being largely restricted to 1-dimensional cases; see Echenique (2008) for a succinct summary. However, it is possible to pass quickly from the Krasnoselski-Zabreiko theorem to a general formulation of the correspondence principle, as we now explain.
Let $U \subset \mathbb{R}^m$ be open, let $P$ be a space of parameter values that is an open subset of $\mathbb{R}^n$, and let $z : U \times P \to \mathbb{R}^m$ be a $C^1$ function that we understand as a parameterized vector field. (Working in a Euclidean setting allows us to avoid discussing differentiation of vector fields on manifolds, which is a very substantial topic.) For $(x, \lambda) \in U \times P$ let $D_x z(x, \lambda)$ and $D_\lambda z(x, \lambda)$ denote the matrices of partial derivatives of the components of $z$ with respect to the components of $x$ and $\lambda$ respectively.
We consider a point $(x_0, \lambda_0)$ with $z(x_0, \lambda_0) = 0$ such that $D_x z(x_0, \lambda_0)$ is nonsingular. The implicit function theorem implies that there is a neighborhood $V$ of $\lambda_0$ and a $C^1$ function $\xi : V \to U$ such that $\xi(\lambda_0) = x_0$ and $z(\xi(\lambda), \lambda) = 0$ for all $\lambda \in V$. The method of comparative statics is to differentiate this equation with respect to $\lambda$, using the chain rule, then rearrange, arriving at
$$\frac{d\xi}{d\lambda}(\lambda_0) = -D_x z(x_0, \lambda_0)^{-1} D_\lambda z(x_0, \lambda_0).$$
The last result implies that if $\{x_0\}$ is asymptotically stable for the vector field $z(\cdot, \lambda_0)$, then the determinant of $-D_x z(x_0, \lambda_0)$ is positive, as is the determinant of its inverse. When $m = 1$ this says that the vector $\frac{d\xi}{d\lambda}(\lambda_0)$ is a positive scalar multiple of $D_\lambda z(x_0, \lambda_0)$.
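The comparative statics formula and its sign implication can be illustrated on a concrete example. The following sketch uses an assumed field, not one from the text: $z(x, \lambda) = \lambda - x - x^3$ on $\mathbb{R} \times \mathbb{R}$, whose equilibrium at $(0, 0)$ is asymptotically stable since $D_x z = -1$; the slope of $\xi$ given by the formula is compared with a direct numerical computation.

```python
# Assumed illustrative field: z(x, lam) = lam - x - x**3, equilibrium at (0, 0).
def z(x, lam):
    return lam - x - x**3

x0, lam0 = 0.0, 0.0
Dx = -1.0 - 3.0 * x0**2   # D_x z(x0, lam0) = -1, so det(-D_x z) = 1 > 0
Dlam = 1.0                # D_lam z(x0, lam0)

slope_formula = -Dlam / Dx   # d(xi)/d(lam) at lam0 from the displayed formula

def xi(lam, lo=-10.0, hi=10.0):
    # Solve z(xi, lam) = 0 by bisection; z is strictly decreasing in x.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if z(mid, lam) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

h = 1e-4
slope_numeric = (xi(lam0 + h) - xi(lam0 - h)) / (2.0 * h)
print(slope_formula, slope_numeric)   # both close to 1.0
```

Since $m = 1$ and the equilibrium is stable, the computed slope is a positive multiple of $D_\lambda z(x_0, \lambda_0) = 1$, as the correspondence principle predicts.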
Bibliography

Arrow, K., Block, H. D., and Hurwicz, L. (1959). On the stability of the competitive equilibrium, II. Econometrica, 27:82–109.
Border, K. C. (1985). Fixed point theorems with applications to economics and game theory. Cambridge University Press, Cambridge.
Browder, F. (1948). The Topological Fixed Point Theory and its Applications to Functional Analysis. PhD thesis, Princeton University.
Brown, R. (1971). The Lefschetz Fixed Point Theorem. Scott Foresman and Co., Glenview, IL.
Chen, X. and Deng, X. (2006b). Settling the complexity of two-player Nash equilibrium. In Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science, pages 261–272.
Fan, K. (1952). Fixed point and minimax theorems in locally convex linear spaces. Proceedings of the National Academy of Sciences, 38:121–126.
Goldberg, P., Papadimitriou, C., and Savani, R. (2011). The complexity of the homotopy method, equilibrium selection, and Lemke-Howson solutions. In Proceedings of the 52nd Annual IEEE Symposium on the Foundations of Computer Science.
Hart, O. and Kuhn, H. (1975). A proof of the existence of equilibrium without the free disposal assumption. Journal of Mathematical Economics, 2:335–343.
Hirsch, M., Papadimitriou, C., and Vavasis, S. (1989). Exponential lower bounds for finding Brouwer fixed points. Journal of Complexity, 5:379–416.
Hopf, H. (1928). A new proof of the Lefschetz formula on invariant points. Proceedings of the National Academy of Sciences, USA, 14:149–153.
Jiang, J.-h. (1963). Essential component of the set of fixed points of the multivalued mappings and its application to the theory of games. Scientia Sinica, 12:951–964.
Kinoshita, S. (1953). On some contractible continua without the fixed point property. Fundamenta Mathematicae, 40:96–98.
Klee, V. and Minty, G. (1972). How good is the simplex algorithm? In Shisha, O., editor, Inequalities III. Academic Press, New York.
Kuhn, H. and MacKinnon, J. (1975). Sandwich method for finding fixed points. Journal of Optimization Theory and Applications, 17:189–204.
Lyapunov, A. (1992). The General Problem of the Stability of Motion. Taylor and Francis, London.
Mertens, J.-F. (1991). Stable equilibria – a reformulation, part II: discussion of the definition and further results. Mathematics of Operations Research, 16:694–753.
Ritzberger, K. (1994). The theory of normal form games from the differentiable viewpoint. International Journal of Game Theory, 23:201–236.
Stone, A. H. (1948). Paracompactness and product spaces. Bull. Amer. Math. Soc., 54:977–982.
van der Laan, G. and Talman, A. (1979). A restart algorithm for computing fixed points without an extra dimension. Mathematical Programming, 17:74–84.
Index

C^r, 126
C^r-embedding, 144
C^r-immersion, 144
C^r atlas, 10
C^r function, 127
C^r manifold, 10, 131
C^r submanifold, 11, 136
Q-robust set, 113
  minimal, 114
  minimal connected, 114
T_1-space, 66
ω-limit set, 19, 221
-parameterization, 144
-domination, 17, 104
-homotopy, 17, 104
EXP, 61
FNP, 63
NP, 61
PLS (polynomial local search), 64
PPAD, 64
PPA, 65
PPP (polynomial pigeonhole principle), 64
PSPACE, 61
P, 61
TFNP, 63
Clique, 61
EOTL (end of the line), 64
OEOTL (other end of the line), 65
absolute neighborhood retract, 6, 100
absolute retract, 6, 102
acyclic, 34
affine
  combination, 23
  dependence, 23
  hull, 24
  independence, 23
  subspace, 24
Alexander horned sphere, 131
algorithm, 60
ambient space, 10, 133
annulus, 144
antipodal function, 203
antipodal points, 200
approximates, 189
Arrow, Kenneth, 2
asymptotic stability, 20, 221
atlas, 10, 131
axiom of choice, 36
balanced set, 116
Banach space, 90
barycenter, 32
base of a topology, 67
bijection, 6
Bing, R. H., 196
Border, Kim, i
Borsuk, Karol, 18
Borsuk-Ulam theorem, 18, 204
bounding hyperplane, 24
Brouwer's fixed point theorem, 3
Brouwer, Luitzen, 2
Brown, Robert, i
category, 135
Cauchy sequence, 90
Cauchy–Schwarz inequality, 91
certificate, 61
Church-Turing thesis, 60
closed function, 74
codimension, 24, 136
commutativity, 16, 179
compact-open topology, 83
complete invariant, 194
complete metric space, 90
complete vector field, 216