Applied Functional Analysis Applications To Mathematical Physics

Applied Mathematical Sciences
Volume 108
Editors
J.E. Marsden L. Sirovich
Advisors
S. Antman J.K. Hale P. Holmes
T. Kambe J. Keller K. Kirchgässner
B.J. Matkowsky C.S. Peskin
Springer Science+Business Media, LLC

Eberhard Zeidler
Applied Functional Analysis

Applications to Mathematical Physics
With 56 Illustrations
Springer
Eberhard Zeidler
Max-Planck-Institut für Mathematik
in den Naturwissenschaften
Inselstraße 22-26
D-04103 Leipzig
Germany
Editors
J.E. Marsden
Control and Dynamical Systems, 107-81
California Institute of Technology
Pasadena, CA 91125
USA

L. Sirovich
Division of Applied Mathematics
Brown University
Providence, RI 02912
USA
Mathematics Subject Classification (1991): 34A12, 42A16, 35J05
Library of Congress Cataloging-in-Publication Data

Zeidler, Eberhard
Applied functional analysis : applications to mathematical physics
/ Eberhard Zeidler
p. cm. - (Applied mathematical sciences ; 108)
Includes bibliographical references and index.
ISBN 978-1-4612-6910-6 ISBN 978-1-4612-0815-0 (eBook)
DOI 10.1007/978-1-4612-0815-0
1. Functional analysis. 2. Mathematical physics. I. Title.
II. Series: Applied mathematical sciences (Springer-Verlag New York
Inc.) ; v. 108.
QA1.A647 vol. 108
[QA320]
510 s--dc20 94-43219
[515'.7]
Printed on acid-free paper.
© 1995 Springer Science+Business Media New York

Originally published by Springer-Verlag New York, Inc. in 1995
Softcover reprint of the hardcover 1st edition 1995
All rights reserved. This work may not be translated or copied in whole or in part without
the written permission of the publisher Springer Science+Business Media, LLC,
except for brief excerpts in connection with reviews or scholarly
analysis. Use in connection with any form of information storage and retrieval, electronic adaptation,
computer software, or by similar or dissimilar methodology now known or hereafter
developed is forbidden.
The use of general descriptive names, trade names, trademarks, etc., in this publication, even
if the former are not especially identified, is not to be taken as a sign that such names, as
understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely
by anyone.
Production managed by Laura Carlson; manufacturing supervised by Joe Quatela.

Photocomposed copy prepared from the author's LaTeX files.
9 8 7 6 5 4 3 (Corrected third printing, 1999)
ISBN 978-1-4612-6910-6 SPIN 10738833

To My Students
Textbooks should be attractive by showing the beauty of the subject.
Johann Wolfgang von Goethe (1749-1832)
I am not able to learn any mathematics unless I can see some problem
I am going to solve with mathematics, and I don't understand how
anyone can teach mathematics without having a battery of problems
that the student is going to be inspired to want to solve and then
see that he or she can use the tools for solving them.
Steven Weinberg
(Winner of the Nobel Prize in physics in 1979)
The more I have learned about physics, the more convinced I am that
physics provides, in a sense, the deepest applications of mathematics.
The mathematical problems that have been solved, or techniques
that have arisen out of physics in the past, have been the lifeblood
of mathematics .... The really deep questions are still in the physical
sciences. For the health of mathematics at its research level, I think
it is very important to maintain that link as much as possible.
Sir Michael Atiyah

(Winner of the Fields Medal in 1966)
David Hilbert (1862-1943)
Stefan Banach (1892-1945)
John von Neumann (1903-1957)
Preface
A theory is the more impressive,

the simpler are its premises,
the more distinct are the things it connects,
and the broader is its range of applicability.
Albert Einstein
There are two different ways of teaching mathematics, namely,

(i) the systematic way, and
(ii) the application-oriented way.
More precisely, by (i), I mean a systematic presentation of the material
governed by the desire for mathematical perfection and completeness of
the results. In contrast to (i), approach (ii) starts out from the question
"What are the most important applications?" and then tries to answer this
question as quickly as possible. Here, one walks directly on the main road
and does not wander into all the nice and interesting side roads.
The present book is based on the second approach. It is addressed to
undergraduate and beginning graduate students of mathematics, physics,
and engineering who want to learn how functional analysis elegantly solves
mathematical problems that are related to our real world and that have
played an important role in the history of mathematics. The reader should
sense that the theory is being developed, not simply for its own sake, but
for the effective solution of concrete problems.
This introduction to functional analysis is divided into the following two

parts:
Part I: Applications to mathematical physics (the present AMS Vol. 108);
Part II: Main principles and their applications (AMS Vol. 109).
Our presentation of the material is self-contained. As prerequisites we
assume only that the reader is familiar with some basic facts from calculus.
One of the special features of our introduction to functional analysis is that
we try to combine the following topics on a fairly elementary level:
(a) linear functional analysis;
(b) nonlinear functional analysis;
(c) numerical functional analysis; and
(d) substantial applications related to the main stream of mathematics

and physics.
I think that the time is ripe for such an approach. From a general point of view,
functional analysis is based on an assimilation of analysis, geometry, alge-
bra, and topology. The applications to be considered concern the following
topics:
ordinary differential equations (initial-value problems, boundary-eigen-
value problems, and bifurcation);
linear and nonlinear integral equations;
variational problems, partial differential equations, and Sobolev spaces;
optimization (e.g., Čebyšev approximation, control of rockets, game theory,
and dual problems);
Fourier series and generalized Fourier series;
the Fourier transformation;
generalized functions (distributions) and the role of the Green function;
partial differential equations of mathematical physics (e.g., the Laplace
equation, the heat equation, the wave equation, and the Schrödinger equation);
time evolution and semigroups;
the N-body problem in celestial mechanics;
capillary surfaces;
minimal surfaces and harmonic maps;
superfluids, superconductors, and phase transition (the Landau-Ginz-
burg model);
viscous fluids (the Navier-Stokes equations);
boundary-value problems and obstacle problems in nonlinear elasticity;

quantum mechanics (both the Schrödinger equation approach and the
Feynman path integral approach);
quantum statistics (both the Hilbert space approach and the C*-algebra
approach);
quantum field theory (the Fock space);
quarks in elementary particle physics;
gauge field theory (the Yang-Mills-Dirac equations);
string theory.
We also study the following fundamental approximation methods:
iteration method via k-contractions;
iteration method via monotonicity in ordered Banach spaces;
the Ritz method and the method of finite elements;
the dual Ritz method (also called the Trefftz method);
the Galerkin method; and
cubature formulas.
We shall make no attempt to present concepts in the most general way
but will rather try to expose their essential core without, on the other
hand, trivializing them. In the experience of the author, it is substantially
easier for the student to take a mathematical concept and extend it to
a more general situation than to struggle through a theorem formulated
in its broadest generality and burdened with numerous technicalities in
an attempt to divine the basic concept. Here it is the teacher's duty to
be helpful. To assist the reader in recognizing the central results, these
propositions are denoted as "theorems." A list of the theorems along with
a list of the most important definitions can be found at the end of this book.
Furthermore, a number of schematic overviews should help the reader to
understand the interrelations between the abstract principles and their
applications.
Functional analysis is a child of the twentieth century. It provides us
with a new language that allows us to formulate apparently different top-
ics in a unique way. It seems that functional analysis is deeply rooted
in our real world, since it is the appropriate tool for describing quantum
phenomena in terms of mathematics. For example, the famous Heisenberg
uncertainty principle on position and momentum of particles follows easily
from the Schwarz inequality, which represents the most important inequal-
ity in Hilbert space theory. In the study of many problems, the following
steps are used:
(i) translating the given concrete problem into the language of functional
analysis;
(ii) applying abstract functional analytic theorems;

(iii) verifying the assumptions in step (ii), which often requires applying
very specific analytical tools.
The basic idea of functional analysis is to formulate differential and in-
tegral equations in terms of operator equations. For example, the operator
equation
$$Au = f, \qquad u \in X, \tag{E}$$
may represent a concise formulation of the following integral equation for
the unknown function u:
$$u(x) - \int_a^b A(x, y)\,\varphi(u(y), y)\,dy = f(x), \qquad a \le x \le b, \tag{$\mathrm{E}_{\mathrm{int}}$}$$
provided we introduce the operator A through
$$(Au)(x) := u(x) - \int_a^b A(x, y)\,\varphi(u(y), y)\,dy \quad \text{for all } x \in [a, b]. \tag{Op}$$
From the abstract point of view, we assume that u is an element of the
"space" X, where X := C[a, b] denotes the set of all continuous functions
$u\colon [a, b] \to \mathbb{R}$.
More precisely, the definition of the operator A is to be understood in the
following sense. To each function $u \in X$ we assign a new function Au on
the interval [a, b] given by (Op). For example, if $u(x) \equiv 1$, then the function
Au is given by
$$(Au)(x) = 1 - \int_a^b A(x, y)\,\varphi(1, y)\,dy \quad \text{for all } x \in [a, b].$$
The set X is also called a function space. Typically, functional analysis
employs the fact that the function spaces possess an additional structure.
For example, the space C[a, b] can be equipped with the norm
$$\|u\| := \max_{a \le x \le b} |u(x)|.$$
We call $\|u\|$ the length of the vector (function) u. This way, C[a, b] becomes
a so-called Banach space. In the special case where $\varphi(u, y) \equiv u$, the integral
equation ($\mathrm{E}_{\mathrm{int}}$) is said to be linear. Generally, the importance of nonlinear
problems stems from the fact that they describe processes in nature with
interactions.
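As a minimal numerical sketch of the operator (Op) and of the maximum norm, one can sample u on a grid and replace the integral by the trapezoidal rule. The kernel A(x, y) = xy, the choice φ(u, y) = u (the linear case), the interval [0, 1], and the helper names `apply_integral_operator` and `max_norm` are illustrative assumptions and are not taken from the text.

```python
import numpy as np

def apply_integral_operator(u, kernel, phi, a, b, n=201):
    """Sample (Au)(x) = u(x) - int_a^b kernel(x, y) * phi(u(y), y) dy on a grid."""
    x = np.linspace(a, b, n)
    y = x                                   # same grid for the integration variable
    w = np.full(n, (b - a) / (n - 1))       # trapezoidal weights
    w[0] *= 0.5
    w[-1] *= 0.5
    K = kernel(x[:, None], y[None, :])      # K[i, j] = kernel(x_i, y_j)
    integrand = K * phi(u(y), y)[None, :]
    return x, u(x) - integrand @ w

def max_norm(values):
    """Discrete stand-in for ||u|| = max |u(x)| over a <= x <= b."""
    return float(np.max(np.abs(values)))

# Hypothetical data chosen only for illustration.
kernel = lambda x, y: x * y
phi = lambda u_values, y: u_values          # linear case: phi(u, y) = u
u_one = lambda x: np.ones_like(x)           # the function u(x) = 1

x, Au = apply_integral_operator(u_one, kernel, phi, 0.0, 1.0)
```

For this particular choice, (Au)(x) = 1 - x/2 can be computed by hand, so the sampled values can be compared with the exact function.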
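In contrast to the integral equation ($\mathrm{E}_{\mathrm{int}}$), the operator equation (E) may
also correspond to the following boundary-value problem for a differential
equation:
$$\begin{aligned}
&u''(x) + c(x)u(x) = f(x), \qquad a \le x \le b,\\
&u(a) = u(b) = 0 \quad \text{(boundary condition)},
\end{aligned} \tag{$\mathrm{E}_{\mathrm{diff}}$}$$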

provided we define the operator A through
$$(Au)(x) := u''(x) + c(x)u(x) \quad \text{for all } x \in [a, b].$$
Naturally enough, we now assume that u is an element of the space X,
where X denotes the set of all functions $u\colon [a, b] \to \mathbb{R}$ that are twice continuously
differentiable on the interval [a, b] and that satisfy the boundary
condition u(a) = u(b) = 0.
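To make the differential operator A concrete, the following sketch replaces u'' by central second differences on a grid, so that Au = f becomes a finite linear system. This discretization is a numerical technique swapped in for illustration, not a method described in the text. The data [a, b] = [0, 1], c ≡ 0, and f(x) = -π² sin(πx) are assumptions chosen so that the exact solution u(x) = sin(πx) is known, and the function name is hypothetical.

```python
import numpy as np

def solve_boundary_value_problem(c, f, a, b, n):
    """Approximate u'' + c u = f on [a, b] with u(a) = u(b) = 0 at n interior points."""
    h = (b - a) / (n + 1)
    x = np.linspace(a + h, b - h, n)            # interior grid points only
    # Central second difference; the zero boundary values are built in.
    second_difference = (
        np.diag(-2.0 * np.ones(n))
        + np.diag(np.ones(n - 1), 1)
        + np.diag(np.ones(n - 1), -1)
    ) / h**2
    A = second_difference + np.diag(c(x))       # matrix stand-in for the operator A
    u = np.linalg.solve(A, f(x))                # raises LinAlgError if A is singular
    return x, u

# Hypothetical data with known exact solution u(x) = sin(pi x).
c = lambda x: np.zeros_like(x)
f = lambda x: -np.pi**2 * np.sin(np.pi * x)

x, u = solve_boundary_value_problem(c, f, 0.0, 1.0, 199)
error = float(np.max(np.abs(u - np.sin(np.pi * x))))
```

The quantity `error` measures the distance between the discrete and the exact solution in the maximum norm on the grid.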
Finally, set $u := (u_1, u_2)$, $f := (f_1, f_2)$, where $u_1$, $u_2$, $f_1$, and $f_2$ are real
numbers, i.e., $u, f \in \mathbb{R}^2$. Then, the following system of real equations
corresponds to the original operator equation (E), too. Here we define the
operator A through
for all $u \in X$,
where we set $X := \mathbb{R}^2$. Obviously, $u \in X$ implies $Au \in X$. Thus, the
operator $A\colon X \to X$ maps the space X into itself.
Furthermore, for example, the abstract minimum problem
$$F(u) = \min!, \qquad u \in X, \tag{M}$$
corresponds to Euler's classical variational problem
$$\begin{aligned}
&\int_a^b L(x, u(x), u'(x))\,dx = \min!,\\
&u(a) = u(b) = 0 \quad \text{(boundary condition)},
\end{aligned} \tag{$\mathrm{M}_{\mathrm{var}}$}$$
provided we set
$$F(u) := \int_a^b L(x, u(x), u'(x))\,dx \quad \text{for all } u \in X,$$
where X denotes an appropriate space of functions that satisfy the boundary
condition u(a) = u(b) = 0. Since F(u) is a real number for each function
$u \in X$, the operator $F\colon X \to \mathbb{R}$ from the space X to the space $\mathbb{R}$ of real
numbers is called a functional. In addition, many problems in optimization
and control theory can be formulated in terms of the abstract minimum
problem (M). Roughly speaking:
Functional analysis provides us with existence theorems for both the op-
erator equation (E) and the minimum problem (M) and with convergent
approximation methods for (E) and (M).
Typically, the spaces X are infinite-dimensional. From the physical point

of view, such spaces describe physical systems with an infinite number of
degrees of freedom.
Problems of the type
$$\min_{u \in A} \max_{p \in B} L(u, p) = \max_{p \in B} \min_{u \in A} L(u, p) = L(u_0, p_0) \tag{Minimax}$$
and, more generally,
$$\inf_{u \in A} \sup_{p \in B} L(u, p) = \sup_{p \in B} \inf_{u \in A} L(u, p)$$
represent basic problems in game theory and duality theory. This will be
shown in Chapter 2 of AMS Vol. 109.
Functional analysis also establishes a calculus for linear operators. For
example, let us consider the abstract differential equation
$$\begin{aligned}
&u'(t) = Au(t), \qquad t > 0,\\
&u(0) = u_0 \quad \text{(initial condition)},
\end{aligned} \tag{D}$$
where A is a linear operator. Formally, the solution of (D) is given by
$$u(t) = e^{tA}u_0.$$
It is the goal of the theory of semigroups to give the formal symbol $e^{tA}$ a
rigorous meaning. Equation (D) describes many time-dependent processes
in nature. It turns out that if (D) corresponds to an irreversible process in
nature (e.g., diffusion or heat conduction), then the symbol $e^{tA}$ only makes
sense for time $t \ge 0$.
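In finite dimensions the symbol e^{tA} is simply the matrix exponential, which allows a small experiment. The sketch below takes for A a discretized second-derivative matrix, a hypothetical stand-in for heat conduction on [0, 1] with zero boundary values, and forms e^{tA} from the eigendecomposition of this symmetric matrix. The grid size, the initial value u_0, and the name `exp_tA` are assumptions. Note that for a matrix e^{tA} exists for every real t; the rapid growth of its norm for t < 0 only hints at why the infinite-dimensional heat problem is restricted to t ≥ 0.

```python
import numpy as np

def exp_tA(A, t):
    """Return e^{tA} for a real symmetric matrix A, using A = Q diag(lam) Q^T."""
    lam, Q = np.linalg.eigh(A)
    return (Q * np.exp(t * lam)) @ Q.T      # scales column j of Q by exp(t * lam_j)

n = 50
h = 1.0 / (n + 1)
# Discrete second derivative with zero boundary values (symmetric, negative definite).
A = (np.diag(-2.0 * np.ones(n))
     + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2
grid = np.linspace(h, 1.0 - h, n)
u0 = np.sin(np.pi * grid)                   # hypothetical initial temperature profile

# Semigroup property: e^{(s+t)A} u0 = e^{sA} (e^{tA} u0).
u_direct = exp_tA(A, 0.03) @ u0
u_two_steps = exp_tA(A, 0.01) @ (exp_tA(A, 0.02) @ u0)

# Check u'(t) = A u(t) at t = 0.02 with a central difference quotient.
t, dt = 0.02, 1e-6
derivative = (exp_tA(A, t + dt) @ u0 - exp_tA(A, t - dt) @ u0) / (2.0 * dt)
residual = float(np.max(np.abs(derivative - A @ (exp_tA(A, t) @ u0))))

# Going backward in time amplifies high-frequency components enormously.
backward_norm = float(np.linalg.norm(exp_tA(A, -0.001), 2))
```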
Let us briefly discuss the contents of the present AMS Vol. 108 and of
AMS Vol. 109.
Chapter 1 concerns Banach spaces. For the convenience of the reader,
the most important notions of functional analysis are explained in terms of
the simple space C[a, b] of continuous functions without using the Lebesgue
integral. This way, the first chapter may serve as a quite elementary intro-
duction to functional analysis. The applications to be studied in Chapter
1 concern existence proofs for ordinary differential equations as well as for
linear and nonlinear integral equations. Here, we will use the two most im-
portant fixed-point theorems due to Banach and Schauder. We also justify
the following fundamental principle in mathematics:
A priori estimates yield existence.
In an abstract functional analytic setting, this principle was established by
Leray and Schauder in 1934.
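The iteration behind the Banach fixed-point theorem can be sketched for a discretized linear integral equation u(x) = 1 + λ ∫ xy u(y) dy on [0, 1]. The kernel, the value λ = 0.5, the trapezoidal quadrature, and all names below are assumptions made for illustration. With these choices the map T is a contraction in the maximum norm with constant at most 1/4, and the exact solution of the continuous problem is u(x) = 1 + 0.3x.

```python
import numpy as np

def fixed_point_iteration(T, u_start, tol=1e-12, max_iter=200):
    """Iterate u_{k+1} = T(u_k) until successive iterates differ by less than tol."""
    u = u_start
    for k in range(1, max_iter + 1):
        u_next = T(u)
        if np.max(np.abs(u_next - u)) < tol:
            return u_next, k
        u = u_next
    raise RuntimeError("no convergence within max_iter iterations")

n = 201
x = np.linspace(0.0, 1.0, n)
w = np.full(n, 1.0 / (n - 1))               # trapezoidal weights on [0, 1]
w[0] *= 0.5
w[-1] *= 0.5
lam = 0.5                                   # hypothetical parameter
K = lam * np.outer(x, x)                    # K[i, j] = lam * x_i * y_j

def T(u):
    """Discrete version of (Tu)(x) = 1 + lam * int_0^1 x y u(y) dy."""
    return 1.0 + K @ (w * u)

u, iterations = fixed_point_iteration(T, np.zeros(n))
```

Starting from another initial function changes the number of iterations but not the limit, which reflects the uniqueness statement of the fixed-point theorem.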
Riemann's famous Dirichlet principle stands at the beginning of Chapter
2, which is devoted to Hilbert space theory. We give an elegant functional
analytic justification for the Dirichlet principle based on an existence theo-

rem for quadratic minimum problems in Hilbert spaces. In this connection,
the use of the Lebesgue integral is indispensable. Basic facts about this integral
are summarized in the appendix. Thus, the book is also accessible to
those readers who are not familiar with the Lebesgue integral. In fact, our
abstract setting for the Dirichlet principle represents one of several equiva-
lent formulations of the so-called linear orthogonality principle for Hilbert
spaces, which will be studied in Section 2.13. In terms of geometry, the
linear orthogonality principle tells us that:
In Hilbert spaces, there exists a perpendicular from any point to any
closed plane.
In other words, there exists an orthogonal projection onto closed linear
subspaces of Hilbert spaces. If one tries to generalize this fundamental or-
thogonality principle to nonlinear operators, then one obtains an existence
theorem for so-called monotone operator equations.
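In the finite-dimensional Hilbert space ℝ³, where every linear subspace is closed, the perpendicular can be computed explicitly. The sketch below projects a point b onto the plane spanned by the columns of a matrix M by solving a least-squares problem; the matrix, the point, and the function name are hypothetical choices for illustration.

```python
import numpy as np

def orthogonal_projection(M, b):
    """Project b onto the column space of M via the least-squares solution of M c = b."""
    coeffs, *_ = np.linalg.lstsq(M, b, rcond=None)
    return M @ coeffs

# Hypothetical plane in R^3 spanned by the two columns of M.
M = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [0.0, 1.0]])
b = np.array([1.0, 2.0, 4.0])

p = orthogonal_projection(M, b)     # foot of the perpendicular, here (0, 3, 3)
residual = b - p                    # the perpendicular itself, here (1, -1, 1)
orthogonality = M.T @ residual      # vanishes: residual is orthogonal to the plane
```

The vanishing of `orthogonality` expresses that b - p is perpendicular to every vector of the plane, and p is then the point of the plane closest to b.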
Each Hilbert space is a Banach space. But Hilbert spaces possess a richer
structure than Banach spaces, since the concept of orthogonality is avail-
able.
In Chapter 3 we shall show that complete orthonormal systems in Hilbert
spaces are the right tool for solving the convergence problem for Fourier
series and more general series expansions of functions. This convergence
problem was a famous open problem in the nineteenth century.
Hilbert discovered around 1900 that many eigenvalue problems of clas-
sical analysis for differential and integral equations can be formulated in
terms of a general theory for compact symmetric operators in Hilbert
spaces. This approach, which is closely related to Chapter 3, will be stud-
ied in Chapter 4. This way, it is possible to understand why the "Fourier
method" of physicists works. In terms of physics, this method represents
general states as superpositions of so-called eigenstates, which correspond
to eigenoscillations of the system under consideration. Functional analysis
rigorously establishes the old conjecture by Daniel Bernoulli (1700-1782)
that physical systems with an infinite number of degrees of freedom possess
an infinite number of eigenoscillations.
Around 1935 Friedrichs found out that the partial differential equations
of mathematical physics can be understood best by means of the Friedrichs
extension of symmetric operators. This extension procedure generates self-
adjoint operators, which von Neumann introduced in connection with his
mathematical foundations of quantum mechanics in 1932. From the physi-
cal point of view, the Friedrichs approach is intimately related to the con-
cept of energy. This will be studied in Chapter 5, where we also show
that time-dependent processes in nature can be described mathematically
either by semigroups (irreversible processes) or by one-parameter groups
(reversible processes).
Around 1950 Kato proved that the Schrödinger equation for large classes of
physical systems corresponds to a uniquely determined self-adjoint Hamil-

tonian. This way Kato showed that von Neumann's abstract setting for
quantum mechanics from 1932 represents the right tool for the mathemat-
ical description of the behavior of atoms and molecules.
Chapter 5 represents the heart of the present book. It is devoted to the
close relations between functional analysis and both classical and modern
mathematical physics. For example, in Sections 5.21 through 5.24, which
discuss the Dirac calculus and the Feynman path integral in quantum
physics,
we try to build a bridge between the language and thoughts of physicists
and mathematicians.
The mathematician should have the following in mind. Until today, it has
not been possible to develop a mathematically rigorous quantum field the-
ory for describing the behavior of elementary particles. For about 40 years,
however, physicists have worked with dubious mathematical methods that
are in fantastic coincidence with experiment (e.g., in quantum electrody-
namics).
As a typical example for the difference between the language of physicists
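and mathematicians, let us consider the "delta function" $\delta$, which the famous
physicist Paul Dirac introduced around 1930. In terms of physics, the
function $\delta = \delta(x)$ describes the mass density of a point of mass m = 1 at
x = 0 on the real line. This physical interpretation of $\delta$ leads us immediately
to
$$\delta(x) = \begin{cases} 0 & \text{if } x \ne 0,\\ +\infty & \text{if } x = 0, \end{cases} \tag{I}$$
as well as
$$\int_{-\infty}^{\infty} \delta(x)\,dx = \text{total mass} = 1 \tag{II}$$
and
$$\int_{-\infty}^{\infty} f(x)\delta(x)\,dx = f(0) \cdot (\text{mass at } 0) = f(0). \tag{III}$$
Using the substitution x := z - y and g(z) := f(z - y), we also get
$$\int_{-\infty}^{\infty} g(z)\delta(z - y)\,dz = g(y) \quad \text{for all } y \in \mathbb{R}. \tag{IV}$$
Set $u(x) := \delta(x - y)$. Applying (IV) to the Fourier transformation
$$v(k) = \int_{-\infty}^{\infty} e^{-ikx}u(x)\,dx \quad \text{for all } k \in \mathbb{R}$$
and the inverse Fourier transformation
$$u(x) = (2\pi)^{-1}\int_{-\infty}^{\infty} e^{ikx}v(k)\,dk \quad \text{for all } x \in \mathbb{R},$$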

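we formally obtain that
$$e^{-iky} = \int_{-\infty}^{\infty} e^{-ikx}\delta(x - y)\,dx \quad \text{for all } k, y \in \mathbb{R} \tag{V}$$
and
$$\delta(x - y) = (2\pi)^{-1}\int_{-\infty}^{\infty} e^{ik(x-y)}\,dk \quad \text{for all } x, y \in \mathbb{R}. \tag{VI}$$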
From a mathematical point of view, there is no classical function $\delta$ that
satisfies (I) and (II). At first glance, it seems that (I) along with (II) is
nonsense. In the introduction to his famous 1932 monograph Foundations
of Quantum Mechanics, John von Neumann points out that the Dirac calculus
lacks a rigorous justification. Therefore, von Neumann did not use this
calculus. Around 1950 Laurent Schwartz created the theory of generalized
functions (distributions), which allows a rigorous definition of the delta
distribution related to Dirac's "delta function." As we will show in Chapters
2 and 3, the theory of generalized functions gives formulas (III) through
(VI) a precise meaning. However, physics textbooks do not use the rigorous
mathematical approach to generalized functions. Physicists prefer formulas
(I) through (VI) because of their mnemotechnical elegance. Experience
shows that, generally, the calculi used by physicists possess the advantage
of working on their own and leading very quickly to the desired results at
least on a heuristic level. Therefore, it is useful to learn both the language
of physicists and the language of mathematicians. The present book tries
to support this.
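On the heuristic level, formulas (I) through (III) can be explored numerically by replacing δ with narrow Gaussian densities of width ε. This is a common approximation introduced here as an assumption; it is not the distributional definition announced for Chapters 2 and 3. The test function f = cos, the grid, and the names `delta_eps` and `integrate` are likewise illustrative.

```python
import numpy as np

def delta_eps(x, eps):
    """Gaussian density of width eps; its peak value 1/(eps*sqrt(2*pi)) grows as eps -> 0."""
    return np.exp(-x**2 / (2.0 * eps**2)) / (eps * np.sqrt(2.0 * np.pi))

def integrate(values, dx):
    """Plain Riemann sum; adequate here because the integrands vanish near the ends."""
    return float(np.sum(values) * dx)

x = np.linspace(-1.0, 1.0, 200001)
dx = x[1] - x[0]
f = np.cos                                   # hypothetical test function with f(0) = 1

results = {}
for eps in (0.1, 0.03, 0.01):
    mass = integrate(delta_eps(x, eps), dx)              # compare with (II)
    smeared = integrate(f(x) * delta_eps(x, eps), dx)    # compare with (III)
    results[eps] = (mass, smeared)
```

As ε decreases, the total mass stays close to 1 while the smeared value approaches f(0) = 1, and the peak value at x = 0 grows without bound, in the spirit of (I).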
A mathematician who teaches mathematics to physics students should
try to help the students understand the differences and connections between
the two different languages of mathematics and physics. In order to avoid
confusion, we clearly distinguish between physical motivations and purely
mathematical results. The word "proof" is always understood in the sense
of a rigorous mathematical proof.
Let us now briefly discuss the contents of AMS Vol. 109.
In Chapter 1 of AMS Vol. 109 we show that the Hahn-Banach theorem
allows us to solve interesting convex optimization problems. Here, in terms
of geometry, we use the separation of convex sets by hyperplanes.
Chapter 2 of AMS Vol. 109 is devoted to variational principles. In partic-
ular, we generalize the classical Weierstrass existence theorem for minimum
problems via weak convergence. Furthermore, we consider the Ekeland vari-
ational principle on the existence of quasi-minimal points. For example,
combining this principle with the Palais-Smale condition, we will get the
mountain pass theorem on saddle points. Functional analysis explains why
the nineteenth-century mathematicians encountered many difficulties in es-
tablishing existence theorems for variational problems. The reason for this
is the following simple geometric fact:
The closed unit ball in an infinite-dimensional Banach space is not com-
pact.
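A standard illustration, sketched here in the Hilbert space ℓ² of square-summable real sequences (this choice of space is an assumption made for concreteness): the unit vectors e_1 = (1, 0, 0, ...), e_2 = (0, 1, 0, ...), ... all lie in the closed unit ball, yet

```latex
\|e_n - e_m\|^2 = \|e_n\|^2 + \|e_m\|^2 - 2\,(e_n \mid e_m) = 1 + 1 - 0 = 2
\qquad \text{for all } n \neq m .
```

Hence no subsequence of (e_n) is a Cauchy sequence, so no subsequence converges, and the closed unit ball of ℓ² is not compact.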
At the end of the 1920s, Banach proved a number of important theorems

on linear continuous operators in Banach spaces, which follow from the
Baire category theorem, which, in turn, is a consequence of a straightforward
generalization of Cantor's nested interval principle to Banach spaces. These
so-called principles of linear functional analysis are presented in Chapter 3
of AMS Vol. 109. Applications to linear and nonlinear operator equations
are studied in Chapters 4 and 5 in AMS Vol. 109. In particular, in Chapter 4
of AMS Vol. 109 we will use the implicit function theorem in order to study
the local behavior of nonlinear operators (diffeomorphisms, submersions,
immersions, and subimmersions). This is important for global analysis (i.e.,
the theory of finite-dimensional and infinite-dimensional manifolds).
Chapter 5 of AMS Vol. 109 is devoted to a study of linear and non-
linear Fredholm operators along with bifurcation theory. Many differential
and integral operators correspond to Fredholm operators in appropriate
function spaces. The theory of Fredholm operators generalizes the classical
Fredholm alternative for integral equations formulated first by Fredholm
around 1900. In fact, the theory of linear and nonlinear Fredholm opera-
tors represents the completely natural generalization of the classical theory
for finite systems of real equations to infinite dimensions. Bifurcation the-
ory mathematically models an essential change of the behavior of systems
in nature (e.g., the buckling of beams, ecological catastrophes, etc.). The
theory of nonlinear Fredholm operators dates back to a 1965 fundamental
paper by Smale.
The creation of functional analysis by Hilbert around 1900 was strongly
influenced by the theory of integral equations. Until the 1930s, partial dif-
ferential equations were treated by being reduced to integral equations.
The more successful modern functional analytic approach to partial differ-
ential equations is based on an inspection of the operator equations that
correspond directly to the differential equations (cf. (E) and ($\mathrm{E}_{\mathrm{diff}}$)). This
approach dates back to von Neumann and Friedrichs in the 1930s. In fact,
this point of view works successfully in numerical analysis, too. Note that
all the basic equations of physical field theories (elasticity, hydrodynam-
ics, thermodynamics, gas dynamics, electrodynamics, quantum mechanics,
quantum field theory, general relativity, gauge field theory, etc.) are partial
differential equations. It seems fair to say that the theory of integral equa-
tions has reached a certain final shape. In contrast, there are still many
deep open questions in the theory of those partial differential equations
related to physics.
At the end of each chapter, the reader will find problems. Most of them
are routine. I hope that such a carefully selected collection of fairly simple
problems will help the student to check her or his basic understanding of
the material. Some more advanced problems are marked with a star and
provided with hints for further reading. For an in-depth presentation of non-
linear functional analysis and its many applications to the natural sciences,
the reader is referred to the five-volume treatise Nonlinear Functional Anal-
ysis and Its Applications by the same author. In particular, Vols. 4 and 5
contain a detailed motivation of the basic equations in classical and modern
mathematical physics along with both abstract existence proofs and inter-
esting applications to concrete problems in physics, chemistry, biology, and
economics.
The presentation takes into account that in general no book is read
completely from beginning to end. We hope that even a quick skimming of
the text will suffice to grasp the essential contents. To this end, we recom-
mend reading the introductions to the individual chapters, the definitions,
the "theorems" (without proofs), and the examples (without proofs) as well
as the motivations and comments in the text, which point out the meaning
of the specific results. The proofs are worked out in great detail. Grasping
the individual steps in the proofs as well as their essential ideas is made
easier by the careful organization. It is a truism that only a precise study
of the proofs enables one to penetrate more deeply into a mathematical
theory.
Readers have the following two options:
(i) Those who want to become acquainted as quickly as possible with the
Hilbert space approach to mathematical physics and numerical anal-
ysis can immediately begin with Chapter 2 after glancing at the last
section of Chapter 1, which summarizes important notions concerning
Banach spaces.
(ii) Those interested in the main principles of functional analysis and

their applications might skip to AMS Vol. 109 after reading Chap-
ter 1.
The book is based on lectures I have given for students of mathematics
and physics at Leipzig University. The manuscript was finished during
a stay at the "Sonderforschungsbereich 256" of Bonn University and at
the Max Planck Institute for Mathematics in Bonn. I would like to thank
Professors Stefan Hildebrandt and Friedrich Hirzebruch for the invitations
and the kind hospitality. Finally, my special thanks are due to Springer-
Verlag for the harmonious collaboration.
I hope that the reader of this book enjoys getting a feel for the unity of
mathematics by discovering interrelations between apparently completely
different subjects.
Leipzig, Spring 1995
Eberhard Zeidler
Prologue
Each progress in mathematics is based on the discovery of stronger

tools and easier methods, which at the same time makes it easier
to understand earlier methods. By making these stronger tools and
easier methods his own, it is possible for the individual researcher to
orientate himself in the different branches of mathematics.
The organic unity of mathematics is inherent in the nature of this
science, for mathematics is the foundation of all exact knowledge of
natural phenomena.
David Hilbert, 1900

(Paris lecture)¹

¹ In this fundamental lecture, Hilbert formulated his famous 23 open problems,
which strongly influenced the development of mathematics in the twentieth
century.

In order to understand the great achievement of Hilbert (1862-1943)
in the field of analysis, it is necessary to first comment on the state
of analysis at the end of the nineteenth century. After Weierstrass
(1815-1897) had made sure of the foundations of complex function
theory, and it had reached an impressive level, research switched
to boundary-value problems, which first arose in physics. The work
of Riemann (1826-1866) on complex function theory, however, had
shown that boundary-value problems have great importance for pure
mathematics as well. Two problems had to be solved:
(i) the problem of the existence of a potential function for given
boundary values; and
(ii) the problem of eigenoscillations of elastic bodies, for example,
string and membrane.
The state of the theory was bad at the end of the nineteenth
century. Riemann had believed that, by using the Dirichlet princi-
ple, one could deal with these problems in a simple and uniform way.
After Weierstrass' substantial criticism of the Dirichlet principle in
1870, special methods had to be developed for these problems. These
methods, by C. Neumann, Schwarz, and Poincaré, were very elaborate
and still have great aesthetic appeal today; but because of their
variety they were confusing, although at the end of the nineteenth
century, Poincaré (1854-1912), in particular, endeavoured with great
astuteness to standardize the theory. There was, however, a lack of
"simple basic facts" from which one could easily get complete results
without sophisticated investigations of limiting processes.
Hilbert first looked for these "simple basic facts" in the calculus
of variations. He considered so-called regular variational problems
which satisfy the Legendre condition. In 1900 he had an immediate
and great success; he succeeded in justifying the Dirichlet principle.
While Hilbert used variational methods, the Swedish mathemati-
cian Fredholm (1866-1927) approached the same goal by developing
Poincaré's work by using linear integral equations. In the winter
semester 1900/01 Holmgren, who had come from Uppsala (Sweden)
to study under Hilbert in Göttingen, held a lecture in Hilbert's seminar
on Fredholm's work on linear integral equations which had been
published the previous year. This was a decisive day in Hilbert's life.
He took up Fredholm's new discovery with great zeal, and combined
it with his variational methods. In this way he succeeded in
creating a uniform theory which solved problems (i) and (ii) above.
In 1904 Hilbert's first note on the "Foundations of a General
Theory of Linear Integral Equations" was published in the Göttinger
Nachrichten. These results were based on lectures which Hilbert held
from the summer of 1901 onwards. Fredholm had proved the exis-
tence of solutions for linear integral equations of the second kind.
His result was sufficient to solve the boundary-value problems of po-
tential theory. But Fredholm's theory did not include the eigenoscil-
lations and the expansions of arbitrary functions with respect to
eigenfunctions. Only Hilbert solved this problem by using finite-
dimensional approximations and a passage to the limit. In this way
he obtained a generalization of the classical principal-axis transfor-
mation for symmetric matrices to infinite-dimensional matrices. The
symmetry of the matrices corresponds to the symmetry of the ker-
nels of integral equations, and it shows that the kernels appearing in
oscillation problems are indeed symmetrical.
From our point of view today, Hilbert's paper of 1904 appears

clumsy, compared to the elegance of Erhard Schmidt's method pub-
lished in 1907 which he developed in his dissertation written while a
student of Hilbert in Göttingen. But the first step had been made.
In the same year, 1904, Hilbert, in his second note, was able to apply
his theory to general Sturm-Liouville eigenvalue problems. His third
note in 1905 contained a very important result. Of the great problems
which Riemann had posed in complex function theory,
there was still one left open: the proof of the existence of differential
equations with a prescribed monodromy group. Hilbert solved this
problem by reducing it to the determination of two functions which
are holomorphic in both the interior and the exterior of a closed
curve, and whose real and imaginary parts satisfy appropriate lin-
ear combinations on the curve (the Riemann-Hilbert problem). The
solution to this problem is a classic example for the axiomatics of lim-
iting processes demanded by Hilbert. No concrete limiting processes
are used, but everything results from the existence of the Green func-
tion for the interior and the exterior of the closed curve, and from the
Fredholm alternative which says that either the homogeneous inte-
gral equation has a nontrivial solution or the inhomogeneous integral
equation has a solution.
Hilbert soon noticed that limits are set to the method of integral
equations. In order to overcome these limits he created, in his fourth
and fifth notes in 1906, the general theory of quadratic forms of an
infinite number of variables. Hilbert believed that with this theory he
had provided analysis with a great general basis which corresponds
to an axiomatics of limiting processes. The further development of
mathematics has proved him to be right.
Otto Blumenthal, 1932
The perfection of mathematical beauty is such that whatsoever is

most beautiful and regular is also found to be most useful and ex-
cellent.
D'Arcy W. Thompson, 1917

On Growth and Form
Contents
Preface
Prologue
Contents of AMS Volume 109
1 Banach Spaces and Fixed-Point Theorems
1.1 Linear Spaces and Dimension
1.2 Normed Spaces and Convergence
1.3 Banach Spaces and the Cauchy Convergence Criterion
1.4 Open and Closed Sets
1.5 Operators
1.6 The Banach Fixed-Point Theorem and the Iteration Method
1.7 Applications to Integral Equations
1.8 Applications to Ordinary Differential Equations
1.9 Continuity
1.10 Convexity
1.11 Compactness
1.12 Finite-Dimensional Banach Spaces and Equivalent Norms
1.13 The Minkowski Functional and Homeomorphisms
1.14 The Brouwer Fixed-Point Theorem
1.15 The Schauder Fixed-Point Theorem
1.17 Applications to Ordinary Differential Equations
1.18 The Leray-Schauder Principle and a priori Estimates
1.19 Sub- and Supersolutions, and the Iteration Method in Ordered Banach Spaces
1.20 Linear Operators
1.21 The Dual Space
1.22 Infinite Series in Normed Spaces
1.23 Banach Algebras and Operator Functions
1.24 Applications to Linear Differential Equations in Banach Spaces
1.25 Applications to the Spectrum
1.26 Density and Approximation
1.27 Summary of Important Notions
2 Hilbert Spaces, Orthogonality, and the Dirichlet Principle
2.1 Hilbert Spaces
2.2 Standard Examples
2.3 Bilinear Forms
2.4 The Main Theorem on Quadratic Variational Problems
2.5 The Functional Analytic Justification of the Dirichlet Principle
2.6 The Convergence of the Ritz Method for Quadratic Variational Problems
2.7 Applications to Boundary-Value Problems, the Method of Finite Elements, and Elasticity
2.8 Generalized Functions and Linear Functionals
2.9 Orthogonal Projection
2.10 Linear Functionals and the Riesz Theorem
2.11 The Duality Map
2.12 Duality for Quadratic Variational Problems
2.13 The Linear Orthogonality Principle
2.14 Nonlinear Monotone Operators
2.15 Applications to the Nonlinear Lax-Milgram Theorem and the Nonlinear Orthogonality Principle
3 Hilbert Spaces and Generalized Fourier Series
3.1 Orthonormal Series
3.2 Applications to Classical Fourier Series
3.3 The Schmidt Orthogonalization Method
3.4 Applications to Polynomials
3.5 Unitary Operators
3.6 The Extension Principle
3.7 Applications to the Fourier Transformation
3.8 The Fourier Transform of Tempered Generalized Functions
4 Eigenvalue Problems for Linear Compact Symmetric Operators
4.1 Symmetric Operators
4.2 The Hilbert-Schmidt Theory
4.3 The Fredholm Alternative
4.5 Applications to Boundary-Eigenvalue Problems
5 Self-Adjoint Operators, the Friedrichs Extension and the Partial Differential Equations of Mathematical Physics
5.1 Extensions and Embeddings
5.2 Self-Adjoint Operators
5.3 The Energetic Space
5.4 The Energetic Extension
5.5 The Friedrichs Extension of Symmetric Operators
5.6 Applications to Boundary-Eigenvalue Problems for the Laplace Equation
5.7 The Poincaré Inequality and Rellich's Compactness Theorem
5.8 Functions of Self-Adjoint Operators
5.9 Semigroups, One-Parameter Groups, and Their Physical Relevance
5.10 Applications to the Heat Equation
5.11 Applications to the Wave Equation
5.12 Applications to the Vibrating String and the Fourier Method
5.13 Applications to the Schrödinger Equation
5.14 Applications to Quantum Mechanics
5.15 Generalized Eigenfunctions
5.16 Trace Class Operators
5.17 Applications to Quantum Statistics
5.18 C*-Algebras and the Algebraic Approach to Quantum Statistics
5.19 The Fock Space in Quantum Field Theory and the Pauli Principle
5.20 A Look at Scattering Theory
5.21 The Language of Physicists in Quantum Physics and the Justification of the Dirac Calculus
5.22 The Euclidean Strategy in Quantum Physics
5.23 Applications to Feynman's Path Integral
5.24 The Importance of the Propagator in Quantum Physics
5.25 A Look at Solitons and Inverse Scattering Theory
Epilogue
Appendix
References
Hints for Further Reading
List of Symbols
List of Theorems
List of the Most Important Definitions
Subject Index

Contents of AMS Volume 109
Preface
Contents of AMS Volume 108
1 The Hahn-Banach Theorem and Optimization Problems
1.1 The Hahn-Banach Theorem
1.2 Applications to the Separation of Convex Sets
1.3 The Dual Space C[a, b]*
1.4 Applications to the Moment Problem
1.5 Minimum Norm Problems and Duality Theory
1.6 Applications to Čebyšev Approximation
1.7 Applications to the Optimal Control of Rockets
2 Variational Principles and Weak Convergence

2.1 The nth Variation
2.2 Necessary and Sufficient Conditions for Local Extrema
and the Classical Calculus of Variations
2.3 The Lack of Compactness in Infinite-Dimensional Banach
Spaces
2.4 Weak Convergence
2.5 The Generalized Weierstrass Existence Theorem
2.6 Applications to the Calculus of Variations
2.7 Applications to Nonlinear Eigenvalue Problems
2.8 Reflexive Banach Spaces

2.9 Applications to Convex Minimum Problems and
Variational Inequalities
2.10 Applications to Obstacle Problems in Elasticity
2.11 Saddle Points
2.12 Applications to Duality Theory
2.13 The von Neumann Minimax Theorem on the Existence of
Saddle Points
2.14 Applications to Game Theory
2.15 The Ekeland Principle about Quasi-Minimal Points
2.16 Applications to a General Minimum Principle via the
Palais-Smale Condition
2.17 Applications to the Mountain Pass Theorem
2.18 The Galerkin Method and Nonlinear Monotone Operators
2.19 Symmetries and Conservation Laws (The Noether
Theorem)
2.20 The Basic Ideas of Gauge Field Theory
2.21 Representations of Lie Algebras
2.22 Applications to Elementary Particles
3 Principles of Linear Functional Analysis

3.1 The Baire Theorem
3.2 Application to the Existence of Nondifferentiable
Continuous Functions
3.3 The Uniform Boundedness Theorem
3.4 Applications to Cubature Formulas
3.5 The Open Mapping Theorem
3.6 Product Spaces
3.7 The Closed Graph Theorem
3.8 Applications to Factor Spaces
3.9 Applications to Direct Sums and Projections
3.10 Dual Operators
3.11 The Exactness of the Duality Functor
3.12 Applications to the Closed Range Theorem and to
Fredholm Alternatives
4 The Implicit Function Theorem

4.1 m-Linear Bounded Operators
4.2 The Differential of Operators and the Fréchet Derivative
4.3 Applications to Analytic Operators
4.4 Integration
4.5 Applications to the Taylor Theorem
4.6 Iterated Derivatives
4.7 The Chain Rule
4.8 The Implicit Function Theorem
4.9 Applications to Differential Equations

4.10 Diffeomorphisms and the Local Inverse Mapping Theorem
4.11 Equivalent Maps and the Linearization Principle
4.12 The Local Normal Form for Nonlinear Double Splitting
Maps
4.13 The Surjective Implicit Function Theorem
4.14 Applications to the Lagrange Multiplier Rule
5 Fredholm Operators
5.1 Duality for Linear Compact Operators
5.2 The Riesz-Schauder Theory on Hilbert Spaces
5.3 Applications to Integral Equations
5.4 Linear Fredholm Operators
5.5 The Riesz-Schauder Theory on Banach Spaces
5.6 Applications to the Spectrum of Linear Compact
Operators
5.7 The Parametrix
5.8 Applications to the Perturbation of Fredholm Operators
5.9 Applications to the Product Index Theorem
5.10 Fredholm Alternatives via Dual Pairs
5.11 Applications to Integral Equations and Boundary-Value
Problems
5.12 Bifurcation Theory
5.13 Applications to Nonlinear Integral Equations
5.14 Applications to Nonlinear Boundary-Value Problems
5.15 Nonlinear Fredholm Operators
5.16 Interpolation Inequalities
5.17 Applications to the Navier-Stokes Equations
References
Subject Index
1
Banach Spaces and Fixed-Point
Theorems
The role of functional analysis has been decisive exactly in connec-

tion with classical problems. Almost all problems are on the appli-
cations, where functional analysis enables one to focus on a specific
set of concrete analytical tasks and organize material in a clear and
transparent form so that you know what the difficulties are.
Concrete and functional analysis exist today in an inextricable sym-
biosis. When someone writes down a system of axioms, no one is
going to take them seriously, unless they arise from some intuitive
body of concrete subject matter that you would really want to study,
and about which you really want to find out something.
Felix E. Browder, 1975
In a Banach space, the so-called norm
$$\|u\| = \text{nonnegative number}$$
is assigned to each element $u$. This generalizes the absolute value $|u|$ of a
real number $u$. The norm can be used in order to define the convergence
$$\lim_{n\to\infty} u_n = u$$
by means of
$$\lim_{n\to\infty} \|u_n - u\| = 0.$$
FIGURE 1.1. Hierarchy of notions: linear space (linear combination $\alpha u + \beta v$; dimension and convexity), normed space (norm $\|u\|$; convergence and boundedness), Banach space (Cauchy convergence criterion), with the examples $\mathbb{R}^N$ and $C[a, b]$.
As a standard example for a Banach space we will consider the space
$$C[a, b],$$
which consists of all continuous functions $u\colon [a, b] \to \mathbb{R}$ along with the norm
$$\|u\| := \max_{a \le x \le b} |u(x)|, \quad\text{where } -\infty < a < b < \infty.$$
Figure 1.1 shows the relations between Banach spaces and other impor-
tant notions. For example, Figure 1.1 tells us that each Banach space is
also a normed space, etc. In this chapter we will prove the two fundamental
fixed-point theorems of Banach and Schauder
along with applications to integral equations and ordinary differential
equations (cf. Figures 1.2 and 1.3). We will show in Chapter 4 of AMS Vol. 109
that the fundamental implicit function theorem is a simple consequence of
the Banach fixed-point theorem.
The first chapter can be understood without any knowledge of the
Lebesgue integral.
1.1 Linear Spaces and Dimension

In the following let
$$\mathbb{K} := \mathbb{R} \quad\text{or}\quad \mathbb{K} := \mathbb{C},$$
where $\mathbb{R}$ and $\mathbb{C}$ denote the set of real and complex numbers, respectively.
Roughly speaking, in a linear space $X$ over $\mathbb{K}$ it is possible to form "linear
combinations"
$$\alpha u + \beta v,$$
where $u, v \in X$ and $\alpha, \beta \in \mathbb{K}$. In addition, the "usual rules" hold for
$\alpha u + \beta v$.
FIGURE 1.2. Two routes starting from the norm: a $k$-contraction $A$ ($\|Au - Av\| \le k\|u - v\|$, $0 \le k < 1$) leads via the Banach fixed-point theorem ($Au = u$) to the Picard-Lindelöf theorem for the ordinary differential equation $u' = F(x, u)$; a compact operator $A$ leads via the Schauder fixed-point theorem ($Au = u$) to the Peano theorem for the ordinary differential equation $u' = F(x, u)$.
FIGURE 1.3. The Brouwer fixed-point theorem in $\mathbb{R}^N$ (convex set), combined with compactness, leads to the Schauder fixed-point theorem in Banach spaces and further to the Leray-Schauder principle and a priori estimates.
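Definition 1. A linear space $X$ over $\mathbb{K}$ is a set $X$ together with an addition
$$u + v, \qquad u, v \in X,$$
and a scalar multiplication
$$\alpha u, \qquad \alpha \in \mathbb{K},\ u \in X,$$
where all the usual rules are satisfied.
More precisely, for all $u, v \in X$ and all $\alpha \in \mathbb{K}$,
$$u + v \quad\text{and}\quad \alpha u$$
are defined elements of $X$ such that, for all $u, v, w \in X$ and $\alpha, \beta \in \mathbb{K}$, the
following are true:
$$u + v = v + u, \qquad (u + v) + w = u + (v + w),$$
$$(\alpha + \beta)u = \alpha u + \beta u, \qquad \alpha(u + v) = \alpha u + \alpha v,$$
$$\alpha(\beta u) = (\alpha\beta)u, \qquad \alpha u = u \ \text{ if } \alpha = 1.$$
Furthermore, there exists exactly one element $\theta$ in $X$ such that
$$u + \theta = u \quad\text{for all } u \in X. \tag{1}$$
Finally, for each given $u \in X$, the equation
$$u + v = \theta \tag{2}$$
has exactly one solution $v \in X$.
$X$ is called a real or complex linear space as $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$, respectively.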
To simplify notation, let us write
$$\theta := 0 \quad\text{and}\quad v := -u$$
in (1) and (2), respectively. The following proposition shows that this
convention makes sense. We also write $w - u$ instead of $w + (-u)$.
Proposition 2. Let $X$ be a linear space over $\mathbb{K}$. Then:
(i) $\alpha\theta = \theta$ for all $\alpha \in \mathbb{K}$.
(ii) $0u = \theta$ for all $u \in X$.
(iii) $(-\alpha)u = -(\alpha u)$ for all $\alpha \in \mathbb{K}$ and $u \in X$.
Proof. Ad (i).¹ It follows from
$$\alpha u = \alpha(u + \theta) = \alpha u + \alpha\theta$$
and the unique solvability of the equation $\alpha u = \alpha u + v$ that $v = \alpha\theta = \theta$.
Ad (ii). Since
$$\alpha u = (\alpha + 0)u = \alpha u + 0u,$$
we get $0u = \theta$.
Ad (iii). It follows from
$$\theta = 0u = (\alpha + (-\alpha))u = \alpha u + (-\alpha)u$$
that $-(\alpha u) = (-\alpha)u$. □
¹The Latin term "Ad (i)" stands for "proof of (i)."
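Example 3. Let $X := \mathbb{K}$. Then, $X$ is a linear space over $\mathbb{K}$, where
$$\alpha u + \beta v \quad\text{with } \alpha, \beta \in \mathbb{K} \text{ and } u, v \in X$$
is to be understood in the classical sense.
Example 4. Let $X := \mathbb{K}^N$, where $N = 1, 2, \ldots$; that is, the set $X$ consists
of all the $N$-tuples
$$(\xi_1, \ldots, \xi_N) \quad\text{with } \xi_k \in \mathbb{K} \text{ for all } k.$$
Define
$$(\xi_1, \ldots, \xi_N) + (\eta_1, \ldots, \eta_N) = (\xi_1 + \eta_1, \ldots, \xi_N + \eta_N),$$
$$\alpha(\xi_1, \ldots, \xi_N) = (\alpha\xi_1, \ldots, \alpha\xi_N), \qquad \alpha \in \mathbb{K}.$$
Then, $X$ becomes a linear space over $\mathbb{K}$.
Obviously, $\theta = (0, \ldots, 0)$.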
Example 5. Let $C[a, b]$ denote the set of all continuous functions
$$u\colon [a, b] \to \mathbb{R},$$
where $-\infty < a < b < \infty$. For $u, v \in C[a, b]$ and $\alpha \in \mathbb{R}$, let
$$u + v \quad\text{and}\quad \alpha u$$
denote the corresponding functions, i.e.,
$$(u + v)(x) = u(x) + v(x) \quad\text{and}\quad (\alpha u)(x) = \alpha u(x) \quad\text{for all } x \in [a, b].$$
Since the sum and the product of two continuous functions are again
continuous, we get
$$u + v \in C[a, b] \quad\text{and}\quad \alpha u \in C[a, b].$$
Thus, $C[a, b]$ becomes a real linear space.
Definition 6. Let $X$ be a linear space over $\mathbb{K}$. The elements $u_1, \ldots, u_N$ of
$X$ are called linearly independent iff
$$\alpha_1 u_1 + \cdots + \alpha_N u_N = 0 \quad\text{with } \alpha_k \in \mathbb{K} \text{ for all } k$$
always implies $\alpha_1 = \cdots = \alpha_N = 0$.
Let $N = 1, 2, \ldots$. We write
$$\dim X = N$$
iff the maximal number of linearly independent elements in $X$ is equal to
$N$. The number $N$ is called the dimension of $X$.
We write
$$\dim X = \infty$$
iff, for each $N = 1, 2, \ldots$, there exist $N$ linearly independent elements in
$X$. In this case, $X$ is called an infinite-dimensional space.
For $X = \{0\}$, we set $\dim X = 0$.
The space $X$ is called finite-dimensional iff $0 \le \dim X < \infty$.
Example 7. Let $X := \mathbb{K}$. Then $\dim X = 1$.
Here, $X$ is considered to be a linear space over $\mathbb{K}$.
Proof. Let $u \in X$ with $u \ne 0$. Then
$$\alpha u = 0 \quad\text{implies}\quad \alpha = 0.$$
Hence $X$ contains at least one linearly independent element.
Let $u$ and $v$ be two linearly independent elements of $X$, i.e.,
$$\alpha u + \beta v = 0 \ \text{ with } \alpha, \beta \in \mathbb{K} \quad\text{implies}\quad \alpha = \beta = 0. \tag{3}$$
Hence $u \ne 0$ and $v \ne 0$. Setting
$$\alpha := \frac{v}{u} \quad\text{and}\quad \beta := -1,$$
we obtain a contradiction to (3). Thus, there are no two linearly independent
elements in $X$, i.e., $\dim X = 1$. □
The following result is well known from the course on linear algebra.
Example 8. Let $X := \mathbb{K}^N$ for fixed $N = 1, 2, \ldots$, where $X$ is considered
to be a linear space over $\mathbb{K}$.
Then, $\dim X = N$.
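Example 9. Let $X := C[a, b]$. Then, $\dim X = \infty$.
Proof. Set
$$u_k(x) := x^k \quad\text{for all } x \in [a, b] \text{ and } k = 0, 1, \ldots.$$
Let $\alpha_0, \ldots, \alpha_N \in \mathbb{R}$. It follows from
$$\alpha_0 u_0 + \cdots + \alpha_N u_N = 0 \quad\text{in } C[a, b]$$
that
$$\alpha_0 + \alpha_1 x + \cdots + \alpha_N x^N = 0 \quad\text{for all } x \in [a, b].$$
Since a proper polynomial has only a finite number of zeros, we get
$\alpha_0 = \cdots = \alpha_N = 0$.
Consequently, for each $N = 1, 2, \ldots$, the elements $u_0, \ldots, u_N$ in $C[a, b]$
are linearly independent, i.e., $\dim C[a, b] = \infty$. □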
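The argument of Example 9 can be illustrated numerically. The following Python sketch is only an illustration under stated assumptions: the helper names `determinant` and `monomial_matrix`, the interval $[0, 1]$, and the sample points are hypothetical choices and not part of the text. If a linear combination of the monomials $u_k(x) = x^k$, $k = 0, \ldots, N$, vanishes on $[a, b]$, then it vanishes at any $N + 1$ distinct points, which gives a linear system for the coefficients. The matrix of this system has a nonzero determinant, so all coefficients vanish.

```python
from fractions import Fraction

def determinant(matrix):
    """Determinant by Gaussian elimination over exact rational numbers."""
    a = [[Fraction(entry) for entry in row] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        # Look for a row with a nonzero pivot in the current column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap changes the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def monomial_matrix(points):
    """Row i contains u_0(x_i), ..., u_N(x_i) with u_k(x) = x**k."""
    return [[Fraction(x) ** k for k in range(len(points))] for x in points]

# Hypothetical sample points in [a, b] = [0, 1]; any distinct points would do.
points = [Fraction(j, 4) for j in range(5)]
det = determinant(monomial_matrix(points))
print(det != 0)  # a nonzero determinant forces all coefficients to vanish
```

Exact rational arithmetic is used here so that the test "determinant is nonzero" is not affected by rounding.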
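Definition 10. If $A$ and $B$ are subsets of a linear space over $\mathbb{K}$, then we
set
$$A + B := \{a + b : a \in A \text{ and } b \in B\},$$
$$\alpha A := \{\alpha a : a \in A\}, \qquad \alpha \in \mathbb{K},$$
$$A \times B := \{(a, b) : a \in A,\ b \in B\}.$$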
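For finite sets of real numbers, the three operations of Definition 10 can be sketched in Python as follows. The function names `set_sum`, `scale`, and `cartesian` and the sample sets are hypothetical and serve only as an illustration.

```python
from itertools import product

def set_sum(A, B):
    """A + B = {a + b : a in A and b in B}."""
    return {a + b for a in A for b in B}

def scale(alpha, A):
    """alpha A = {alpha * a : a in A}."""
    return {alpha * a for a in A}

def cartesian(A, B):
    """A x B = {(a, b) : a in A, b in B}."""
    return set(product(A, B))

A = {0, 1}
B = {10, 20}
sum_AB = set_sum(A, B)   # all sums a + b
twice_A = scale(2, A)    # all multiples 2a
pairs = cartesian(A, B)  # all ordered pairs (a, b)
```

Note that $A + B$ and $\alpha A$ are again subsets of the linear space, whereas $A \times B$ is a set of ordered pairs.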
1.2 Normed Spaces and Convergence

Recall that $\mathbb{K} := \mathbb{R}$ or $\mathbb{K} := \mathbb{C}$.
Definition 1. Let $X$ be a linear space over $\mathbb{K}$.
Then, $X$ is called a normed space over $\mathbb{K}$ iff there exists a norm $\|\cdot\|$ on
$X$, i.e., for all $u, v \in X$ and $\alpha \in \mathbb{K}$, the following are true:
(i) $\|u\| \ge 0$ (i.e., $\|u\|$ is a nonnegative real number).
(ii) $\|u\| = 0$ iff $u = 0$.
(iii) $\|\alpha u\| = |\alpha|\,\|u\|$.
(iv) $\|u + v\| \le \|u\| + \|v\|$ (triangle inequality).
A normed space over $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$ is called a real or complex normed
space, respectively.
The number $\|u - v\|$ is called the distance between the two points $u$ and
$v$. In particular,
$$\|u\| = \text{distance between the point } u \text{ and the origin } v = 0.$$
Since $-u = (-1)u$, relation (iii) implies
$$\|-u\| = \|u\| \quad\text{for all } u \in X. \tag{4}$$
It follows from (iv) that
$$\|(u + v) + w\| \le \|u + v\| + \|w\| \le \|u\| + \|v\| + \|w\|.$$
Analogously, by induction, we get
$$\Big\|\sum_{j=1}^{N} u_j\Big\| \le \sum_{j=1}^{N} \|u_j\| \quad\text{for all } u_1, \ldots, u_N \in X,\ N = 1, 2, \ldots.$$
Example 2. Let $X := \mathbb{R}$. We set
$$\|u\| := |u| \quad\text{for all } u \in \mathbb{R},$$
where $|u|$ denotes the absolute value of the real number $u$.
Then, $X$ becomes a real normed space.
Example 3. Let $X := \mathbb{C}$. We set
$$\|u\| := |u| \quad\text{for all } u \in \mathbb{C},$$
where $|u|$ denotes the absolute value of the complex number $u$.
Then, $X$ becomes a complex normed space.
In these two examples, the triangle inequality (iv) from Definition 1 cor-
responds to the classical triangle inequality for real and complex numbers.
The norm generalizes the absolute value of numbers.
Further examples will be considered in the next section.
Proposition 4 (Generalized triangle inequality). Let $X$ be a normed space.
Then, for all $u, v \in X$,
$$\big|\,\|u\| - \|v\|\,\big| \le \|u \pm v\| \le \|u\| + \|v\|. \tag{5}$$
Proof. By the triangle inequality,
$$\|u \pm v\| = \|u + (\pm v)\| \le \|u\| + \|\pm v\| = \|u\| + \|v\|,$$
and
$$\|u\| = \|(u - v) + v\| \le \|u - v\| + \|v\|.$$
Hence
$$\|u\| - \|v\| \le \|u - v\|.$$
Analogously,
$$\|v\| - \|u\| \le \|v - u\| = \|u - v\|.$$
This implies
$$\big|\,\|u\| - \|v\|\,\big| \le \|u - v\|.$$
Replacing $v$ with $-v$ and observing that $u - (-v) = u + v$ and $\|-v\| = \|v\|$,
we also get
$$\big|\,\|u\| - \|v\|\,\big| \le \|u + v\|. \qquad □$$
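Inequality (5) can be spot-checked for the complex normed space $X = \mathbb{C}$ of Example 3. The following Python sketch is an illustration only: the function names, the sample size, the random seed, and the rounding tolerance `tol` are assumptions. A numerical check of this kind does not replace the proof.

```python
import random

def satisfies_inequality_5(u, v, tol=1e-12):
    """Check | |u| - |v| | <= |u +- v| <= |u| + |v| for complex numbers u, v."""
    lower = abs(abs(u) - abs(v))
    upper = abs(u) + abs(v)
    # The tolerance only absorbs floating-point rounding.
    return all(lower - tol <= abs(w) <= upper + tol for w in (u + v, u - v))

def random_complex(rng):
    """A complex number with real and imaginary part drawn from [-1, 1]."""
    return complex(rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))

rng = random.Random(0)  # fixed seed, so the run is reproducible
pairs = [(random_complex(rng), random_complex(rng)) for _ in range(1000)]
all_hold = all(satisfies_inequality_5(u, v) for u, v in pairs)
print(all_hold)
```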
Definition 5. Let $(u_n)$ be a sequence in the normed space $X$, i.e., $u_n \in X$
for all $n$. We write
$$\lim_{n\to\infty} u_n = u \tag{6}$$
iff $\lim_{n\to\infty} \|u_n - u\| = 0$.
We say that the sequence $(u_n)$ converges to $u$. Instead of (6) we also
write $u_n \to u$ as $n \to \infty$.
Intuitively, the convergence (6) means that the distance $\|u_n - u\|$ between
the points $u_n$ and $u$ goes to zero as $n \to \infty$.
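Proposition 6. Let $X$ be a normed space over $\mathbb{K}$. Let $u_n, v_n, u, v \in X$ and
$\alpha_n, \alpha \in \mathbb{K}$ for all $n = 1, 2, \ldots$. Then the following hold:
(i) The limit point $u$ in (6) is uniquely determined.
(ii) If $u_n \to u$ as $n \to \infty$, then the sequence $(u_n)$ is bounded, i.e., there
exists a number $r \ge 0$ such that $\|u_n\| \le r$ for all $n$.
(iii) If $u_n \to u$ as $n \to \infty$, then
$$\|u_n\| \to \|u\| \quad\text{as } n \to \infty.$$
(iv) If $u_n \to u$ and $v_n \to v$ as $n \to \infty$, then
$$u_n + v_n \to u + v \quad\text{as } n \to \infty.$$
(v) If $u_n \to u$ and $\alpha_n \to \alpha$ as $n \to \infty$, then
$$\alpha_n u_n \to \alpha u \quad\text{as } n \to \infty.$$
Proof. Ad (i). Let $u_n \to u$ and $u_n \to v$ as $n \to \infty$. Then
$$\|u - v\| = \|(u - u_n) + (u_n - v)\| \le \|u - u_n\| + \|u_n - v\| \to 0 \quad\text{as } n \to \infty.$$
Hence $\|u - v\| = 0$, i.e., $u = v$.
Ad (ii). Let $u_n \to u$ as $n \to \infty$. Hence $\|u_n - u\| \to 0$ as $n \to \infty$. Thus,
the real sequence $(\|u_n - u\|)$ is bounded, i.e., there is a number $R$ such that
$$\|u_n - u\| \le R \quad\text{for all } n.$$
This implies
$$\|u_n\| = \|(u_n - u) + u\| \le \|u_n - u\| + \|u\| \le R + \|u\| \quad\text{for all } n.$$
Ad (iii). Let $u_n \to u$ as $n \to \infty$. Then, by (5),
$$\big|\,\|u_n\| - \|u\|\,\big| \le \|u_n - u\| \to 0 \quad\text{as } n \to \infty.$$
Ad (iv). If $u_n \to u$ and $v_n \to v$ as $n \to \infty$, then
$$\|(u_n + v_n) - (u + v)\| = \|(u_n - u) + (v_n - v)\| \le \|u_n - u\| + \|v_n - v\| \to 0 \quad\text{as } n \to \infty.$$
Ad (v). If $u_n \to u$ and $\alpha_n \to \alpha$ as $n \to \infty$, then, with the bound $r$ from (ii),
$$\|\alpha_n u_n - \alpha u\| = \|(\alpha_n - \alpha)u_n + \alpha(u_n - u)\|$$
$$\le \|(\alpha_n - \alpha)u_n\| + \|\alpha(u_n - u)\|$$
$$\le |\alpha_n - \alpha| \cdot \|u_n\| + |\alpha| \cdot \|u_n - u\|$$
$$\le |\alpha_n - \alpha|\, r + |\alpha| \cdot \|u_n - u\| \to 0 \quad\text{as } n \to \infty. \qquad □$$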
Definition 7. The sequence $(u_n)$ in the normed space $X$ is called a Cauchy
sequence iff, for each $\varepsilon > 0$, there is a number $n_0(\varepsilon)$ such that
$$\|u_n - u_m\| < \varepsilon \quad\text{for all } n, m \ge n_0(\varepsilon).$$
Proposition 8. In a normed space, each convergent sequence is Cauchy.
Proof. Let $u_n \to u$ as $n \to \infty$. Hence $\|u_n - u\| \to 0$ as $n \to \infty$, i.e., for
each $\varepsilon > 0$, there is a number $n_0(\varepsilon)$ such that
$$\|u_n - u\| < \frac{\varepsilon}{2} \quad\text{for all } n \ge n_0(\varepsilon).$$
This implies
$$\|u_n - u_m\| = \|(u_n - u) + (u - u_m)\| \le \|u_n - u\| + \|u - u_m\| < \varepsilon \quad\text{for all } n, m \ge n_0(\varepsilon). \qquad □$$
1.3 Banach Spaces and the Cauchy Convergence Criterion
Definition 1. The normed space $X$ is called a Banach space iff each Cauchy
sequence is convergent.
Therefore, from Proposition 8 in the preceding section, we get the
following so-called Cauchy convergence criterion:
In a Banach space, a sequence is convergent iff it is Cauchy.
Banach spaces are also called complete normed spaces.
Example 2. The space $X := \mathbb{K}$ is a Banach space over $\mathbb{K}$ with the norm
$$\|u\| := |u| \quad\text{for all } u \in \mathbb{K}.$$
This follows from the classical Cauchy convergence criterion.
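Example 3. Let $N = 1, 2, \ldots$. The space $X := \mathbb{K}^N$ is a Banach space over
$\mathbb{K}$ with the norm $\|x\| := |x|_\infty$, where
$$|x|_\infty := \max_{1 \le k \le N} |\xi_k| \quad\text{and}\quad x = (\xi_1, \ldots, \xi_N).$$
Let $x_n = (\xi_{1n}, \ldots, \xi_{Nn})$. Then
$$\lim_{n\to\infty} |x_n - x|_\infty = 0 \quad\text{iff}\quad \lim_{n\to\infty} \xi_{kn} = \xi_k \ \text{ for all } k = 1, \ldots, N. \tag{7}$$
That is, the convergence $x_n \to x$ as $n \to \infty$ in $X$ is equivalent to the
convergence of the corresponding components.
Proof. The inequality
$$|\xi_{kn} - \xi_k| \le |x_n - x|_\infty = \max_{1 \le j \le N} |\xi_{jn} - \xi_j| \quad\text{for all } k = 1, \ldots, N$$
implies statement (7). In fact, if $|x_n - x|_\infty \to 0$ as $n \to \infty$, then $\xi_{kn} \to \xi_k$
as $n \to \infty$ for all $k$, and the converse is also true.
Let us now prove that $|\cdot|_\infty$ is a norm. Obviously,
$$|x|_\infty = 0 \iff \xi_j = 0 \text{ for all } j \iff x = 0,$$
and
$$|\alpha x|_\infty = |\alpha|\,|x|_\infty \quad\text{for all } \alpha \in \mathbb{K},\ x \in \mathbb{K}^N.$$
Furthermore, the classical triangle inequality
$$|\xi_j + \eta_j| \le |\xi_j| + |\eta_j|$$
implies
$$|x + y|_\infty \le |x|_\infty + |y|_\infty \quad\text{for all } x = (\xi_1, \ldots, \xi_N),\ y = (\eta_1, \ldots, \eta_N) \text{ in } \mathbb{K}^N.$$
Finally, we have to show that $X$ is a Banach space with respect to the
norm $|\cdot|_\infty$. To this end, let $(x_n)$ be a Cauchy sequence. Then
$$|\xi_{kn} - \xi_{km}| \le |x_n - x_m|_\infty < \varepsilon \quad\text{for all } n, m \ge n_0(\varepsilon).$$
Thus, for each $k$, the sequence $(\xi_{kn})$ is also Cauchy. The classical Cauchy convergence
criterion implies the convergence
$$\lim_{n\to\infty} \xi_{kn} = \xi_k, \qquad k = 1, \ldots, N.$$
By (7), $x_n \to x$ as $n \to \infty$. □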

Example 4. Let $N = 1, 2, \ldots$. The space $X := \mathbb{R}^N$ is a Banach space with
the Euclidean norm $\|x\| := |x|$, where
$$|x| := \Big(\sum_{j=1}^{N} \xi_j^2\Big)^{1/2}, \qquad x = (\xi_1, \ldots, \xi_N).$$
Moreover,
$$\lim_{n\to\infty} |x_n - x| = 0 \quad\text{iff}\quad \lim_{n\to\infty} \xi_{kn} = \xi_k \ \text{ for all } k = 1, \ldots, N. \tag{8}$$
Convention 5. Unless explicitly stated otherwise, the space
$\mathbb{R}^N$ is equipped with the Euclidean norm $|\cdot|$.
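Proof. Statement (8) follows from (7) by using the following inequality:
$$|x_n - x|_\infty \le |x_n - x| \le N\,|x_n - x|_\infty.$$
Next we want to prove that $|\cdot|$ is a norm. Obviously,
$$|x| = 0 \iff \xi_j = 0 \text{ for all } j \iff x = 0,$$
and
$$|\alpha x| = |\alpha|\,|x| \quad\text{for all } \alpha \in \mathbb{R},\ x \in \mathbb{R}^N.$$
To prove the triangle inequality
$$|x + y| \le |x| + |y| \tag{9}$$
for $x = (\xi_1, \ldots, \xi_N)$ and $y = (\eta_1, \ldots, \eta_N)$, we will use the classic Schwarz inequality
$$\Big|\sum_{j=1}^{N} \xi_j \eta_j\Big| \le \Big(\sum_{j=1}^{N} \xi_j^2\Big)^{1/2} \Big(\sum_{j=1}^{N} \eta_j^2\Big)^{1/2} \tag{10}$$
for all real numbers $\xi_j, \eta_j$, $j = 1, \ldots, N$. Hence
$$|x + y|^2 = \sum_{j=1}^{N} (\xi_j + \eta_j)^2 = \sum_{j=1}^{N} \big(\xi_j^2 + 2\xi_j\eta_j + \eta_j^2\big)$$
$$\le \sum_{j=1}^{N} \xi_j^2 + 2\Big(\sum_{j=1}^{N} \xi_j^2\Big)^{1/2} \Big(\sum_{j=1}^{N} \eta_j^2\Big)^{1/2} + \sum_{j=1}^{N} \eta_j^2$$
$$= |x|^2 + 2|x|\,|y| + |y|^2 = (|x| + |y|)^2.$$
This implies (9).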

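It remains to prove (10). If all $\xi_j$ or all $\eta_j$ vanish, then (10) is trivial, so suppose that this is not the case. From $0 \le (a \pm b)^2 = a^2 \pm 2ab + b^2$ we get
$$|ab| \le \tfrac{1}{2}\,(a^2 + b^2) \quad\text{for all } a, b \in \mathbb{R}.$$
Choosing $a := \xi_j \big/ \big(\sum_{i=1}^{N} \xi_i^2\big)^{1/2}$ and $b := \eta_j \big/ \big(\sum_{i=1}^{N} \eta_i^2\big)^{1/2}$ and summing
over $j$, it follows that
$$\frac{\sum_{j=1}^{N} |\xi_j \eta_j|}{\big(\sum_{i=1}^{N} \xi_i^2\big)^{1/2} \big(\sum_{i=1}^{N} \eta_i^2\big)^{1/2}} \le \frac{1}{2} + \frac{1}{2} = 1.$$
This implies (10).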

Finally, we have to show that $\mathbb{R}^N$ is a Banach space with respect to the
Euclidean norm $|\cdot|$. To this end, let $(x_n)$ be a Cauchy sequence with respect
to the norm $|\cdot|$. It follows from
$$|x_n - x_m|_\infty \le |x_n - x_m| < \varepsilon \quad\text{for all } n, m \ge n_0(\varepsilon)$$
that $(x_n)$ is also a Cauchy sequence with respect to the norm $|\cdot|_\infty$. By
Example 3, we get the convergence
$$\lim_{n\to\infty} \xi_{kn} = \xi_k \quad\text{for all } k,$$
and hence (8) implies $|x_n - x| \to 0$ as $n \to \infty$. □
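The relation between the two norms on $\mathbb{R}^N$ and the componentwise convergence in (7) and (8) can be observed on a concrete sequence. The following Python sketch is only an illustration: the sequence `x_n`, its limit, and the function names are hypothetical choices. It tabulates both norms of $x_n - x$ and checks the Schwarz inequality (10) for one pair of vectors.

```python
import math

def max_norm(x):
    """The norm |x|_infinity = max |xi_k| from Example 3."""
    return max(abs(t) for t in x)

def euclidean_norm(x):
    """The Euclidean norm |x| from Example 4."""
    return math.sqrt(sum(t * t for t in x))

def schwarz_holds(x, y, tol=1e-12):
    """Check inequality (10); tol only absorbs floating-point rounding."""
    inner = abs(sum(s * t for s, t in zip(x, y)))
    return inner <= euclidean_norm(x) * euclidean_norm(y) + tol

# Hypothetical sequence in R^3 whose components converge to (0, 1, -2).
limit = (0.0, 1.0, -2.0)

def x_n(n):
    return (1.0 / n, 1.0 + 1.0 / n**2, -2.0 + (-1.0) ** n / n)

N = len(limit)
table = []
for n in (1, 10, 100, 1000):
    diff = [s - t for s, t in zip(x_n(n), limit)]
    table.append((n, max_norm(diff), euclidean_norm(diff)))

for n, m, e in table:
    print(n, m, e)  # both distances shrink together as n grows
```

Both columns tend to zero at the same time, in agreement with the two-sided estimate used at the beginning of the proof.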
Standard Example 6. Let $-\infty < a < b < \infty$. Then, $X := C[a, b]$ is a
real Banach space with the norm
$$\|u\| := \max_{a \le x \le b} |u(x)|.$$
The convergence $u_n \to u$ in $X$ as $n \to \infty$ means
$$\|u_n - u\| = \max_{a \le x \le b} |u_n(x) - u(x)| \to 0 \quad\text{as } n \to \infty,$$
i.e., the sequence $(u_n)$ of continuous functions $u_n\colon [a, b] \to \mathbb{R}$ converges
uniformly on $[a, b]$ to the continuous function $u\colon [a, b] \to \mathbb{R}$.
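Proof. We first prove that $\|\cdot\|$ is a norm. Obviously,
$$\|\alpha u\| = |\alpha|\,\|u\| \quad\text{for all } \alpha \in \mathbb{R},\ u \in X,$$
and
$$\|u\| = 0 \iff \max_{a \le x \le b} |u(x)| = 0 \iff u(x) = 0 \text{ on } [a, b] \iff u = 0 \text{ in } C[a, b].$$
Moreover, from $|u(x) + v(x)| \le |u(x)| + |v(x)|$ we get the triangle inequality
$$\|u + v\| \le \|u\| + \|v\|.$$
Finally, we have to show that $X = C[a, b]$ is a Banach space. Let $(u_n)$
be a Cauchy sequence in $X$, i.e.,
$$\|u_n - u_m\| = \max_{a \le x \le b} |u_n(x) - u_m(x)| < \varepsilon \quad\text{for all } n, m \ge n_0(\varepsilon). \tag{11}$$
For each fixed $x \in [a, b]$, the real sequence $(u_n(x))$ is therefore Cauchy, and the classical
Cauchy convergence criterion yields the pointwise convergence
$$u_n(x) \to u(x) \quad\text{as } n \to \infty \text{ for each } x \in [a, b]. \tag{12}$$
Letting $m \to \infty$ in (11), we obtain
$$|u_n(x) - u(x)| \le \varepsilon \quad\text{for all } n \ge n_0(\varepsilon) \text{ and all } x \in [a, b].$$
Thus, the convergence in (12) is uniform on the interval $[a, b]$. By a classical
result, this implies the continuity of the limit function $u\colon [a, b] \to \mathbb{R}$. Hence
$u \in X$ and
$$u_n \to u \quad\text{in } X \text{ as } n \to \infty. \qquad □$$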
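Convergence in $C[a, b]$, i.e., uniform convergence, can be observed numerically. The following Python sketch is an illustration under stated assumptions: the maximum over $[a, b]$ is only approximated by sampling on a finite grid, and the interval $[0, 1]$, the grid size, and the choice of the partial sums $u_n(x) = \sum_{k=0}^{n} x^k/k!$ of the exponential series are hypothetical.

```python
import math

def sup_norm(f, a, b, samples=1001):
    """Approximate max |f(x)| on [a, b] by sampling on an equidistant grid."""
    step = (b - a) / (samples - 1)
    return max(abs(f(a + i * step)) for i in range(samples))

def partial_sum(n):
    """Return u_n with u_n(x) = sum of x**k / k! for k = 0, ..., n."""
    def u_n(x):
        return sum(x**k / math.factorial(k) for k in range(n + 1))
    return u_n

a, b = 0.0, 1.0
distances = []
for n in range(1, 9):
    u_n = partial_sum(n)
    # Approximate distance ||u_n - exp|| in C[a, b].
    distances.append(sup_norm(lambda x: u_n(x) - math.exp(x), a, b))

for n, d in enumerate(distances, start=1):
    print(n, d)  # the distances decrease towards zero
```

The decreasing distances indicate that $u_n \to \exp$ in $C[0, 1]$. The grid-based maximum is a lower estimate of the true norm, so this is a plausibility check rather than a proof.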
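Proposition 7. Let $(u_n)$ be a Cauchy sequence in the normed space $X$
over $\mathbb{K}$, which has a convergent subsequence $(u_{n'})$, that is,
$$u_{n'} \to u \quad\text{in } X \text{ as } n' \to \infty.$$
Then, the entire sequence converges to $u$, i.e., $u_n \to u$ in $X$ as $n \to \infty$.
Proof. Let $\varepsilon > 0$ be given. There is an $n_0(\varepsilon)$ such that
$$\|u_n - u_m\| < \varepsilon \quad\text{for all } n, m \ge n_0(\varepsilon).$$
Since $(u_{n'})$ converges to $u$, there exists some fixed index $m$ of the subsequence such that
$$\|u_m - u\| < \varepsilon, \quad\text{where } m \ge n_0(\varepsilon).$$
By the triangle inequality,
$$\|u_n - u\| \le \|u_n - u_m\| + \|u_m - u\| < 2\varepsilon \quad\text{for all } n \ge n_0(\varepsilon).$$
Hence $u_n \to u$ as $n \to \infty$. □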
Corollary 8. Suppose that
$$\sum_{j=1}^{\infty} \|u_{j+1} - u_j\| < \infty,$$
where $(u_n)$ is a sequence in a normed space $X$ over $\mathbb{K}$. Then, $(u_n)$ is a
Cauchy sequence in $X$.
Proof. By the triangle inequality, for all $k = 1, 2, \ldots$, we get
$$\|u_n - u_{n+k}\| \le \sum_{j=n}^{\infty} \|u_{j+1} - u_j\| \to 0 \quad\text{as } n \to \infty. \qquad □$$
1.4 Open and Closed Sets

Definition 1. Let $X$ be a normed space. For fixed $u_0 \in X$ and $\varepsilon > 0$, the
set
$$U_\varepsilon(u_0) := \{u \in X : \|u - u_0\| < \varepsilon\}$$
is called an $\varepsilon$-neighborhood of the point $u_0$.
The subset $M$ of $X$ is called open iff, for each point $u_0 \in M$, there is
some $\varepsilon$-neighborhood $U_\varepsilon(u_0)$ such that
$$U_\varepsilon(u_0) \subseteq M$$
(cf. Figure 1.4). The subset $M$ of $X$ is called closed iff the set $X - M$ is
open.
Recall that $X - M := \{u \in X : u \notin M\}$. By an open neighborhood $U(u)$
of the point $u$, we understand an open subset of $X$ containing $u$.
Proposition 2. Let $M \subseteq X$, where $X$ is a normed space. Then, the
following are equivalent:
(i) $M$ is closed.
(ii) It follows from $u_n \in M$ for all $n$ and
$$u_n \to u \quad\text{as } n \to \infty$$
that $u \in M$.
Proof. (i) $\Rightarrow$ (ii). Let $u_n \to u$ as $n \to \infty$ and $u_n \in M$ for all $n$. We have
to show that $u \in M$. If this is not true, then $u \in X - M$. Since the set
$X - M$ is open, there is some $\varepsilon$-neighborhood $U_\varepsilon(u)$ such that
$$U_\varepsilon(u) \subseteq X - M.$$
From $\|u_n - u\| \to 0$ as $n \to \infty$ we get $\|u_m - u\| < \varepsilon$ for some index $m$, and
hence
$$u_m \in U_\varepsilon(u),$$
i.e., $u_m \in X - M$. This contradicts $u_m \in M$.
(ii) $\Rightarrow$ (i). Suppose that the set $M$ is not closed, i.e., the set $X - M$ is
not open. Then, there exists a point
$$u \in X - M$$
such that no $\varepsilon$-neighborhood $U_\varepsilon(u)$ is contained in the set $X - M$. Thus,
choosing $\varepsilon = \frac{1}{n}$, $n = 1, 2, \ldots$, we get a sequence $(u_n)$ such that
$$u_n \in U_{1/n}(u) \quad\text{and}\quad u_n \in M \quad\text{for all } n.$$
Hence
$$\|u_n - u\| < \frac{1}{n} \to 0 \quad\text{as } n \to \infty.$$
By (ii), $u \in M$. This contradicts $u \in X - M$. □
FIGURE 1.4. (a) Open set, with an $\varepsilon$-neighborhood $U_\varepsilon(u_0)$ of the point $u_0$; (b) closed set.

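Example 3 (Balls). Let $X$ be a normed space. For fixed $v \in X$ and fixed
$r > 0$, define
$$B := \{u \in X : \|u - v\| \le r\}.$$
Then, $B$ is closed.
The set $B$ is called a closed ball of radius $r$ around the point $v$.
Proof. Let $u_n \in B$ for all $n$, i.e.,
$$\|u_n - v\| \le r \quad\text{for all } n.$$
If $u_n \to u$ as $n \to \infty$, then $\|u_n - v\| \to \|u - v\|$ by Proposition 6 in Section 1.2. Thus $\|u - v\| \le r$, and hence $u \in B$. By Proposition 2, the set $B$ is closed. □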
1.5 Operators
Definition 1. Let $M$ and $Y$ be sets. An operator
$$A\colon M \to Y$$
associates to each point $u$ in $M$ a point $v$ in $Y$ denoted by $v = Au$.
The set $M$ is called the domain of definition of $A$. We also write $D(A)$
for $M$. The set
$$A(M) := \{v \in Y : v = Au \text{ for some } u \in M\}$$
is called the range of $A$. We also write $R(A)$ for $A(M)$.
The operator $A\colon M \to Y$ is called surjective iff $A(M) = Y$.
The operator $A\colon M \to Y$ is called injective iff
$$Au = Av \quad\text{implies}\quad u = v.$$
The operator $A\colon M \to Y$ is called bijective iff $A$ is both surjective and
injective.
If the operator $A\colon M \to Y$ is bijective, then there exists the so-called
inverse operator
$$A^{-1}\colon Y \to M$$
defined through
$$A^{-1}v := u \quad\text{iff}\quad Au = v.$$
This definition makes sense, since for each given $v \in Y$, there exists exactly
one $u \in M$ such that $Au = v$. For a subset $N$ of $Y$, the set
$$A^{-1}(N) := \{u \in M : Au \in N\}$$
is called the preimage of the set $N$. Operators are also called functions.
Convention 2. In order to indicate conveniently that the domain of definition
$M$ of the operator $A\colon M \to Y$ is contained in the set $X$, we frequently
write
$$A\colon M \subseteq X \to Y.$$
In particular, if $Y = \mathbb{K}$, then the operator $A\colon M \subseteq X \to \mathbb{K}$ is called a
functional.
Example 3. Let $M := [a, b]$, $Y := [c, d]$, and $X := \mathbb{R}$, where $-\infty < a < b < \infty$
and $-\infty < c < d < \infty$. The operator
$$A\colon M \subseteq X \to Y$$
pictured in Figures 1.5(a), (b), and (c) is injective, surjective, and bijective,
respectively. In addition, $A$ is not injective in Figure 1.5(b).
FIGURE 1.5. Graphs of operators $A\colon [a, b] \to [c, d]$: (a) injective; (b) surjective; (c) bijective.
Example 4. Let $-\infty < a < b < \infty$, and let the function
$$F\colon [a, b] \times [a, b] \times \mathbb{R} \to \mathbb{R}$$
be continuous. We set
$$(Au)(x) := \int_a^x F(x, y, u(y))\,dy \quad\text{for all } x \in [a, b],$$
and
$$(Bu)(x) := \int_a^b F(x, y, u(y))\,dy \quad\text{for all } x \in [a, b].$$
Then, we obtain the two operators
$$A\colon C[a, b] \to C[a, b] \quad\text{and}\quad B\colon C[a, b] \to C[a, b].$$
In fact, it is a well-known classical result that the continuity of the function
$$u\colon [a, b] \to \mathbb{R}$$
implies the continuity of the two functions
$$Au\colon [a, b] \to \mathbb{R} \quad\text{and}\quad Bu\colon [a, b] \to \mathbb{R},$$
i.e., $u \in C[a, b]$ implies both $Au \in C[a, b]$ and $Bu \in C[a, b]$.
The operators $A$ and $B$ are called integral operators.
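The two integral operators of Example 4 map a function to a function, which can be mirrored directly in code. The following Python sketch is a minimal illustration under stated assumptions: the integrals are approximated by the trapezoidal rule, and the helper names `trapezoid`, `make_A`, `make_B`, the kernel $F(x, y, u) = xyu$, and the interval $[0, 1]$ are hypothetical choices and not part of the text.

```python
def trapezoid(g, lo, hi, steps=200):
    """Approximate the integral of g over [lo, hi] by the trapezoidal rule."""
    h = (hi - lo) / steps
    total = 0.5 * (g(lo) + g(hi))
    for i in range(1, steps):
        total += g(lo + i * h)
    return total * h

def make_A(F, a):
    """Operator A with (Au)(x) = integral from a to x of F(x, y, u(y)) dy."""
    def A(u):
        return lambda x: trapezoid(lambda y: F(x, y, u(y)), a, x)
    return A

def make_B(F, a, b):
    """Operator B with (Bu)(x) = integral from a to b of F(x, y, u(y)) dy."""
    def B(u):
        return lambda x: trapezoid(lambda y: F(x, y, u(y)), a, b)
    return B

# Hypothetical data: F(x, y, u) = x * y * u on [a, b] = [0, 1], and u = 1.
F = lambda x, y, u: x * y * u
A = make_A(F, 0.0)
B = make_B(F, 0.0, 1.0)
u = lambda y: 1.0

Au = A(u)  # exact value: (Au)(x) = x**3 / 2
Bu = B(u)  # exact value: (Bu)(x) = x / 2
print(Au(0.5), Bu(0.5))
```

For this kernel the integrand is linear in $y$, so the trapezoidal rule reproduces the exact values up to rounding. The only difference between $A$ and $B$ is the variable upper limit $x$ in $A$.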
1.6 The Banach Fixed-Point Theorem and the Iteration Method
The Banach fixed-point theorem represents a fundamental convergence
theorem for a broad class of iteration methods.
We want to solve the operator equation
$$u = Au, \qquad u \in M, \tag{13}$$
by means of the following iteration method:
$$u_{n+1} = Au_n, \qquad n = 0, 1, \ldots, \tag{14}$$
where $u_0 \in M$. Each solution of (13) is called a fixed point of the operator
$A$.
Theorem 1.A (The fixed-point theorem of Banach). We assume that:
(a) M is a closed nonempty set in the Banach space X over lK, and
(b) the operator A: M --> M is k-contractive, i.e., by definition,
IIAu - Avll :s: kllu - vii for all u, v E M, (15)
and fixed k, 0 :s: k < 1.

Then, the following hold true:
(i) Existence and uniqueness. The original equation (13) has exactly one
solution u, i.e., the operator A has exactly one fixed point U on the
set M.
(ii) Convergence of the iteration method. For each given Uo E M, the

sequence (un) constructed by (14) converges to the unique solution U
of equation (13).
(iii) Error estimates. For all n = 0,1, ... we have the so-called a priori
error estimate
(16)
and the so-called a posteriori error estimate
(17)
(iv) Rate of convergence. For all n = 0, 1, ... we have
This theorem was proved by Banach in 1920. The Banach fixed-point
theorem is also called the contraction principle.
The a priori estimate (16) makes it possible to use a knowledge of the
initial value $u_0$ along with $u_1 = Au_0$ to determine the maximal number of
steps of iteration required to attain a desired level of precision.
In contrast to this, the a posteriori estimate (17) allows the use of the
computed values $u_n$ and $u_{n+1}$ to determine the accuracy of the
approximation $u_{n+1}$.
Experience shows that, as a rule, a posteriori estimates are better than
a priori estimates.
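The role of the two estimates can be illustrated numerically. The following Python sketch is not part of the original text: the function names `a_priori_steps` and `iterate` and the test operator $A = \cos$ on $M = [0, 1]$ (where $|A'(u)| \le \sin 1 < 1$) are assumptions chosen for illustration. It computes the step count guaranteed by the a priori bound $k^n(1-k)^{-1}\|u_1 - u_0\| \le \text{tol}$ and compares it with the number of steps actually needed when the a posteriori bound $k(1-k)^{-1}\|u_{n+1} - u_n\|$ is used as a stopping rule.

```python
import math

def a_priori_steps(A, u0, k, tol):
    """Smallest n with k**n / (1 - k) * |u1 - u0| <= tol (a priori estimate)."""
    d = abs(A(u0) - u0)
    if d == 0.0:
        return 0  # u0 is already a fixed point
    if k == 0.0:
        return 1  # a 0-contractive map reaches the fixed point in one step
    n = math.log(tol * (1.0 - k) / d) / math.log(k)
    return max(0, math.ceil(n))

def iterate(A, u0, k, tol):
    """Iterate u_{n+1} = A(u_n) until the a posteriori bound is <= tol."""
    u_prev, u, steps = u0, A(u0), 1
    while k / (1.0 - k) * abs(u - u_prev) > tol:
        u_prev, u = u, A(u)
        steps += 1
    return u, steps

# Illustrative choice (an assumption): A = cos on M = [0, 1], k = sin(1).
k = math.sin(1.0)
n_max = a_priori_steps(math.cos, 1.0, k, 1e-8)   # guaranteed step count
u, n_used = iterate(math.cos, 1.0, k, 1e-8)      # steps actually used
```

In this example the a posteriori stopping rule terminates well before the a priori step count is reached, which is consistent with the remark that a posteriori estimates are usually sharper.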
Proof. Ad (i), (ii). Step 1: We show first that $(u_n)$ is a Cauchy sequence.
Let $n = 1, 2, \ldots$. Using (15) we get
$$\begin{aligned}
\|u_{n+1} - u_n\| &= \|Au_n - Au_{n-1}\| \le k\|u_n - u_{n-1}\| \\
&= k\|Au_{n-1} - Au_{n-2}\| \le k^2\|u_{n-1} - u_{n-2}\| \\
&\le \cdots \le k^n \|u_1 - u_0\|.
\end{aligned}$$
Now let $n = 0, 1, \ldots$ and $m = 1, 2, \ldots$. The triangle inequality and the sum
formula for the geometric series yield
$$\begin{aligned}
\|u_n - u_{n+m}\| &= \|(u_n - u_{n+1}) + (u_{n+1} - u_{n+2}) + \cdots + (u_{n+m-1} - u_{n+m})\| \\
&\le \|u_n - u_{n+1}\| + \|u_{n+1} - u_{n+2}\| + \cdots + \|u_{n+m-1} - u_{n+m}\| \\
&\le (k^n + k^{n+1} + \cdots + k^{n+m-1})\|u_1 - u_0\| \\
&\le k^n(1 + k + k^2 + \cdots)\|u_1 - u_0\| \\
&= k^n(1 - k)^{-1}\|u_1 - u_0\|.
\end{aligned}$$
It follows from $0 \le k < 1$ that $k^n \to 0$ as $n \to \infty$. Hence the sequence $(u_n)$
is Cauchy.
Since $X$ is a Banach space, the Cauchy sequence $(u_n)$ converges, i.e.,
$$u_n \to u \quad \text{as } n \to \infty.$$
Step 2: We show that the limit point $u$ is a solution of the original
equation (13).
From $u_0 \in M$ and $u_1 = Au_0$ along with $A(M) \subseteq M$, we get $u_1 \in M$.
Similarly, by induction,
$$u_n \in M \quad \text{for all } n = 0, 1, \ldots.$$
Since the set $M$ is closed, we obtain
$$u \in M,$$
and hence $Au \in M$. By (15),
$$\|Au_n - Au\| \le k\|u_n - u\| \to 0 \quad \text{as } n \to \infty.$$
Letting $n \to \infty$ it follows from $u_{n+1} = Au_n$ that
$$u = Au.$$
Step 3: We show the uniqueness of the solution $u$ of (13). It follows from
$Au = u$ and $Av = v$ with $u, v \in M$ that
$$\|u - v\| = \|Au - Av\| \le k\|u - v\|.$$
Since $0 \le k < 1$, this implies $\|u - v\| = 0$, and hence
$$u = v.$$
Ad (iii). Letting $m \to \infty$ it follows from
$$\|u_n - u_{n+m}\| \le k^n(1 - k)^{-1}\|u_1 - u_0\|$$
that
$$\|u_n - u\| \le k^n(1 - k)^{-1}\|u_1 - u_0\| \quad \text{for all } n = 0, 1, \ldots.$$
This is the error estimate (16).
Let $n = 0, 1, \ldots$ and $m = 1, 2, \ldots$. To prove the error estimate (17),
observe that
$$\begin{aligned}
\|u_{n+1} - u_{n+m+1}\| &\le \|u_{n+1} - u_{n+2}\| + \|u_{n+2} - u_{n+3}\| + \cdots + \|u_{n+m} - u_{n+m+1}\| \\
&\le (k + k^2 + \cdots + k^m)\|u_n - u_{n+1}\|.
\end{aligned}$$
Letting $m \to \infty$ we get
$$\|u_{n+1} - u\| \le k(1 - k)^{-1}\|u_n - u_{n+1}\|.$$
Ad (iv). Observe that
$$\|u_{n+1} - u\| = \|Au_n - Au\| \le k\|u_n - u\|.$$ □
Example 1. Let $-\infty < a < b < \infty$. Suppose that we are given the
differentiable function
$$A\colon [a, b] \to [a, b]$$
such that
$$|A'(u)| \le k < 1 \quad \text{for all } u \in [a, b] \text{ and fixed } k.$$
Then, Theorem 1.A can be applied to the equation
$$u = Au, \qquad u \in [a, b], \tag{18}$$
with $M := [a, b]$, $X := \mathbb{R}$, and the norm $\|u\| := |u|$.
In particular, equation (18) has a unique solution $u$. This solution
corresponds to the intersection point between the graph of $A$ and the diagonal
in Figure 1.6.
Proof. The set $M = [a, b]$ is closed in the real Banach space $X = \mathbb{R}$.
By the classical mean value theorem, for each $u, v \in [a, b]$, there exists a
point $w \in [a, b]$ such that
$$|Au - Av| = |A'(w)(u - v)| \le k|u - v|,$$
i.e., the function $A\colon [a, b] \to [a, b]$ is $k$-contractive.
Therefore, the assumptions of Theorem 1.A are satisfied. □
FIGURE 1.6.
1.7 Applications to Integral Equations

We want to solve the integral equation
$$u(x) = \lambda \int_a^b F(x, y, u(y))\,dy + f(x), \qquad a \le x \le b, \tag{19}$$
along with the iteration method
$$u_{n+1}(x) = \lambda \int_a^b F(x, y, u_n(y))\,dy + f(x), \qquad a \le x \le b, \quad n = 0, 1, \ldots, \tag{19*}$$
where $u_0(x) \equiv 0$ and $-\infty < a < b < \infty$.
Proposition 1. Assume the following:
(a) The function $f\colon [a, b] \to \mathbb{R}$ is continuous.
(b) The function $F\colon [a, b] \times [a, b] \times \mathbb{R} \to \mathbb{R}$ is continuous, and the partial
derivative
$$F_u\colon [a, b] \times [a, b] \times \mathbb{R} \to \mathbb{R}$$
is also continuous.
(c) There is a number $\mathcal{L}$ such that
$$|F_u(x, y, u)| \le \mathcal{L} \quad \text{for all } x, y \in [a, b],\ u \in \mathbb{R}.$$
(d) Let the real number $\lambda$ be given such that $(b - a)|\lambda|\mathcal{L} < 1$.
(e) Set $X := C[a, b]$ and $\|u\| := \max_{a \le x \le b} |u(x)|$.
Then, the following hold true:
(i) The original problem (19) has a unique solution $u \in X$.
(ii) The sequence $(u_n)$ constructed by (19*) converges to $u$ in $X$.
(iii) For all $n = 0, 1, 2, \ldots$ we get the following error estimates:
$$\|u_n - u\| \le k^n(1 - k)^{-1}\|u_1\|,$$
$$\|u_{n+1} - u\| \le k(1 - k)^{-1}\|u_{n+1} - u_n\|, \quad \text{where } k := (b - a)|\lambda|\mathcal{L}.$$
Proof. Define the operator
$$(Au)(x) := \lambda \int_a^b F(x, y, u(y))\,dy + f(x) \quad \text{for all } x \in [a, b].$$
Then, the original equation (19) corresponds to the fixed-point problem
$$u = Au.$$
If $u\colon [a, b] \to \mathbb{R}$ is continuous, then so is the function $Au\colon [a, b] \to \mathbb{R}$. This
way we get the operator
$$A\colon X \to X.$$
For each $x, y \in [a, b]$ and $u, v \in \mathbb{R}$, there exists a $w \in \mathbb{R}$ such that
$$|F(x, y, u) - F(x, y, v)| \le |F_u(x, y, w)|\,|u - v| \le \mathcal{L}|u - v|,$$
by the classical mean value theorem. This implies
$$\|Au - Av\| = \max_{a \le x \le b} |(Au)(x) - (Av)(x)| \le |\lambda|(b - a)\mathcal{L} \max_{a \le x \le b} |u(x) - v(x)|.$$
Hence
$$\|Au - Av\| \le k\|u - v\| \quad \text{for all } u, v \in X.$$
Letting $M := X$, the assertions follow now from the Banach fixed-point
theorem (Theorem 1.A in Section 1.6). □
Example 2 (Linear integral equation). Let
$$F(x, y, u) := K(x, y)u, \tag{20}$$
and suppose that the function $K\colon [a, b] \times [a, b] \to \mathbb{R}$ is continuous.
Then, the assumptions of Proposition 1 are satisfied with
$$\mathcal{L} = \max_{a \le x, y \le b} |K(x, y)|.$$
Therefore, all the statements of Proposition 1 are true for the integral
equation (19) with (20).
In the special case (20), the original problem (19) is called a linear integral
equation.
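The iteration method for the integral equation can be tried out on a grid. The sketch below is an illustration only, not the book's method of computation: the function name `solve_integral_equation`, the use of the trapezoidal rule to replace the integral, and the test kernel $K(x, y) = xy$ with $f \equiv 1$, $\lambda = 1/2$ on $[0, 1]$ are all assumptions. For this test case $k = (b - a)|\lambda| \max|K| = 1/2 < 1$, and one checks by substitution that $u(x) = 1 + 0.3x$ solves the equation exactly.

```python
def solve_integral_equation(F, f, lam, a, b, n=101, tol=1e-12, max_iter=500):
    """Iterate u_{m+1}(x) = lam * int_a^b F(x, y, u_m(y)) dy + f(x) on a uniform grid.

    The integral is approximated by the trapezoidal rule (an assumption of
    this sketch), so the result carries an O(h^2) discretization error.
    """
    h = (b - a) / (n - 1)
    xs = [a + i * h for i in range(n)]
    w = [h] * n
    w[0] = w[-1] = h / 2.0            # trapezoidal weights
    u = [0.0] * n                     # starting function u_0(x) = 0
    for _ in range(max_iter):
        u_new = [lam * sum(w[j] * F(x, xs[j], u[j]) for j in range(n)) + f(x)
                 for x in xs]
        change = max(abs(p - q) for p, q in zip(u_new, u))
        u = u_new
        if change <= tol:             # successive iterates agree on the grid
            break
    return xs, u

# Linear test case: F(x, y, u) = K(x, y) u with K(x, y) = x*y, f = 1, lambda = 1/2.
xs, u = solve_integral_equation(lambda x, y, v: x * y * v,
                                lambda x: 1.0, 0.5, 0.0, 1.0)
max_err = max(abs(ui - (1.0 + 0.3 * xi)) for xi, ui in zip(xs, u))
```

The remaining error `max_err` comes from the quadrature rule, not from the fixed-point iteration, which converges geometrically.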
FIGURE 1.7.
1.8 Applications to Ordinary Differential Equations
We want to solve the following initial-value problem:
$$u' = F(x, u), \quad x_0 - h \le x \le x_0 + h, \qquad u(x_0) = u_0, \tag{21}$$
where the point $(x_0, u_0) \in \mathbb{R}^2$ is given. More precisely, we are looking for a
solution $u = u(x)$ of (21) such that
$$u\colon [x_0 - h, x_0 + h] \to \mathbb{R} \text{ is differentiable, and } (x, u(x)) \in S \text{ for all } x \in [x_0 - h, x_0 + h], \tag{21*}$$
with the square $S := \{(x, u) \in \mathbb{R}^2 : |x - x_0| \le r \text{ and } |u - u_0| \le r\}$ for fixed
$r > 0$ (see Figure 1.7). We set
$$X := C[x_0 - h, x_0 + h] \quad \text{and} \quad M := \{u \in X : \|u - u_0\| \le r\}.$$
Recall that $\|u\| = \max_{a \le x \le b} |u(x)|$, here with $a := x_0 - h$ and $b := x_0 + h$.
Parallel to (21), (21*) let us consider the integral equation
$$u(x) = u_0 + \int_{x_0}^x F(y, u(y))\,dy, \qquad x_0 - h \le x \le x_0 + h, \quad u \in M, \tag{22}$$
along with the iteration method
$$u_{n+1}(x) = u_0 + \int_{x_0}^x F(y, u_n(y))\,dy, \qquad x_0 - h \le x \le x_0 + h, \quad n = 0, 1, \ldots, \tag{23}$$
where $u_0(x) \equiv u_0$.
Proposition 1 (The Picard-Lindelöf theorem). Assume the following:
(a) The function $F\colon S \to \mathbb{R}$ is continuous and the partial derivative
$$F_u\colon S \to \mathbb{R}$$
is also continuous.
(b) We set
$$M := \max_{(x,u) \in S} |F(x, u)| \quad \text{and} \quad \mathcal{L} := \max_{(x,u) \in S} |F_u(x, u)|,$$
and we choose the real number $h$ in such a way that
$$0 < h \le r, \quad hM \le r, \quad \text{and} \quad h\mathcal{L} < 1.$$
Then, the following hold true:
(i) The original problem (21) has a unique solution of the form (21*).
(ii) This is also the unique solution of the integral equation (22).
(iii) The sequence $(u_n)$ constructed by (23) converges to $u$ in the Banach
space $X$.
(iv) For $n = 0, 1, \ldots$ we have the following error estimates:
$$\|u_n - u\| \le k^n(1 - k)^{-1}\|u_1 - u_0\|,$$
$$\|u_{n+1} - u\| \le k(1 - k)^{-1}\|u_{n+1} - u_n\|, \quad \text{where } k := h\mathcal{L}.$$
Proof. Step 1: The integral equation. Define the operator $A$ through
$$(Au)(x) := u_0 + \int_{x_0}^x F(y, u(y))\,dy \quad \text{for all } x \in [x_0 - h, x_0 + h].$$
Then, the integral equation (22) corresponds to the following fixed-point
problem:
$$u = Au, \qquad u \in M. \tag{22*}$$
If $u \in M$, then the function $u\colon [x_0 - h, x_0 + h] \to \mathbb{R}$ is continuous and
$(x, u(x)) \in S$ for all $x \in [x_0 - h, x_0 + h]$. Therefore, the function
$$x \mapsto F(x, u(x))$$
is also continuous on the interval $[x_0 - h, x_0 + h]$. This implies the continuity
of the function
$$Au\colon [x_0 - h, x_0 + h] \to \mathbb{R}.$$
This way we get the operator
$$A\colon M \to X.$$
Let us prove that
(a) $A(M) \subseteq M$.
(b) $\|Au - Av\| \le k\|u - v\|$ for all $u, v \in M$.

Ad (a). Let $u \in M$. Then
$$\left|\int_{x_0}^x F(y, u(y))\,dy\right| \le |x - x_0| \max_{(y,u) \in S} |F(y, u)| \le hM \le r \quad \text{for all } x \in [x_0 - h, x_0 + h],$$
and hence
$$\|Au - u_0\| = \max_{x_0 - h \le x \le x_0 + h} \left|\int_{x_0}^x F(y, u(y))\,dy\right| \le r,$$
i.e., $Au \in M$.
Ad (b). By the classical mean value theorem,
$$|F(x, u) - F(x, v)| = |F_u(x, w)|\,|u - v| \le \mathcal{L}|u - v|$$
for all $(x, u), (x, v) \in S$. Observe that $w$ depends on $u$ and $v$, where $(x, w) \in S$.
Hence, for all $u, v \in M$, we obtain
$$\|Au - Av\| = \max_{x_0 - h \le x \le x_0 + h} \left|\int_{x_0}^x [F(y, u(y)) - F(y, v(y))]\,dy\right| \le h\mathcal{L} \max_{x_0 - h \le y \le x_0 + h} |u(y) - v(y)| = k\|u - v\|,$$
where $k := h\mathcal{L}$.
We now apply the Banach fixed-point theorem (Theorem 1.A in Section
1.6) to equation (22*). This yields the statements concerning the integral
equation (22).
Step 2: Equivalence. Let $u$ be a solution of the integral equation (22).
Differentiating (22), it follows that the function $u$ is also a solution of the
original initial-value problem (21), (21*).
Conversely, let $u$ be a solution of (21), (21*). Integration of (21) shows
that the function $u$ is also a solution of the integral equation (22).
Therefore, the two problems (21), (21*) and (22) are equivalent. □
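The iteration (23) can be sketched numerically. The following Python code is an illustration under stated assumptions and is not taken from the text: the function name `picard_iteration`, the uniform grid, and the cumulative trapezoidal rule used for the integral are choices made here. The test problem $u' = u$, $u(0) = 1$ with $r = 1$ gives $\max_S |F| = 2$ and $\mathcal{L} = 1$, so $h = 1/2$ satisfies the three conditions on $h$; the exact solution is $e^x$.

```python
import math

def picard_iteration(F, x0, u0, h, n_half=200, n_iter=30):
    """Approximate the iterates u_{m+1}(x) = u0 + int_{x0}^x F(y, u_m(y)) dy
    on a uniform grid over [x0 - h, x0 + h] (trapezoidal rule, an assumption)."""
    step = h / n_half
    xs = [x0 + (i - n_half) * step for i in range(2 * n_half + 1)]
    u = [u0] * len(xs)                       # constant starting function
    for _ in range(n_iter):
        g = [F(x, v) for x, v in zip(xs, u)]
        new = [u0] * len(xs)                 # value at x0 stays equal to u0
        for i in range(n_half + 1, len(xs)):         # integrate to the right of x0
            new[i] = new[i - 1] + 0.5 * step * (g[i - 1] + g[i])
        for i in range(n_half - 1, -1, -1):          # integrate to the left of x0
            new[i] = new[i + 1] - 0.5 * step * (g[i + 1] + g[i])
        u = new
    return xs, u

# Test problem (an assumption): u' = u, u(0) = 1 on [-1/2, 1/2].
xs, u = picard_iteration(lambda x, v: v, 0.0, 1.0, 0.5)
max_err = max(abs(v - math.exp(x)) for x, v in zip(xs, u))
```

After a few dozen iterations the remaining error is dominated by the quadrature rule rather than by the contraction.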
The following three sections serve as a preparation for the proof of the
fundamental Schauder fixed-point theorem, which allows a generalization of
the Picard-Lindelöf theorem. The basic notions to be considered are:
continuity, convexity, and compactness.
1.9 Continuity
Definition 1. Let $X$ and $Y$ be normed spaces over $\mathbb{K}$. The operator
$$A\colon M \subseteq X \to Y \tag{24}$$
is called sequentially continuous iff, for each sequence $(u_n)$ in $M$,
$$\lim_{n \to \infty} u_n = u \text{ with } u \in M \quad \text{implies} \quad \lim_{n \to \infty} Au_n = Au.$$
The operator $A$ in (24) is called continuous iff, for each point $u \in M$ and
each number $\varepsilon > 0$, there is a number $\delta(\varepsilon, u) > 0$ such that
$$\|v - u\| < \delta(\varepsilon, u) \text{ and } v \in M \quad \text{imply} \quad \|Av - Au\| < \varepsilon. \tag{25}$$
In addition, if it is possible to choose the number $\delta(\varepsilon, u) > 0$ in such a
way that it does not depend on the point $u \in M$, then the operator $A$ in
(24) is called uniformly continuous.
Example 2. Let $X$ and $Y$ be normed spaces over $\mathbb{K}$. The operator
$A\colon M \subseteq X \to Y$ is called Lipschitz continuous iff there is a number $L > 0$ such that
$$\|Av - Au\| \le L\|v - u\| \quad \text{for all } u, v \in M. \tag{26}$$
Each Lipschitz continuous operator is uniformly continuous.
In fact, condition (26) implies (25) with $\delta(\varepsilon) = \varepsilon / L$.
Proposition 3. We are given the operator $A\colon M \subseteq X \to Y$, where $X$
and $Y$ are normed spaces over $\mathbb{K}$. Then, the following two statements are
equivalent:
(i) $A$ is continuous.
(ii) $A$ is sequentially continuous.
Proof. (i) $\Rightarrow$ (ii). Suppose that $A$ is continuous. Let
$$u_n \to u \text{ as } n \to \infty, \quad \text{where } u_n, u \in M \text{ for all } n.$$
Then, for each $\varepsilon > 0$, there is a number $n_0$ such that
$$\|u_n - u\| < \delta(\varepsilon, u) \quad \text{for all } n \ge n_0,$$
where the number $\delta$ corresponds to (25). It follows from (25) that
$\|Au_n - Au\| < \varepsilon$ for all $n \ge n_0$. Hence
$$Au_n \to Au \quad \text{as } n \to \infty.$$
(ii) $\Rightarrow$ (i). Suppose that the operator $A$ is not continuous. Then, there
exist a point $u \in M$ and a number $\varepsilon_0 > 0$ such that condition (25) is violated
for each number $\delta > 0$. In particular, when $\delta = \frac{1}{n}$ there is a point $u_n \in M$ such that
$$\|u_n - u\| < \frac{1}{n} \quad \text{and} \quad \|Au_n - Au\| \ge \varepsilon_0 \quad \text{for all } n = 1, 2, \ldots.$$
Hence $u_n \to u$ as $n \to \infty$. By (ii),
$$Au_n \to Au \quad \text{as } n \to \infty.$$
This contradicts $\|Au_n - Au\| \ge \varepsilon_0$ for all $n$. □
Proposition 4 (Composition of continuous operators). Let
$$A\colon M \subseteq X \to Y \quad \text{and} \quad B\colon A(M) \to Z$$
be two continuous operators, where $X$, $Y$, and $Z$ are normed spaces over
$\mathbb{K}$. Set $C := B \circ A$, i.e.,
$$Cu := B(Au) \quad \text{for all } u \in M.$$
Then, the operator $C\colon M \to Z$ is continuous.
Proof. Let $u_n, u \in M$ for all $n$. Then, $Au_n, Au \in A(M)$ for all $n$. It follows
from $u_n \to u$ as $n \to \infty$ that
$$Au_n \to Au \quad \text{as } n \to \infty,$$
and hence $B(Au_n) \to B(Au)$ as $n \to \infty$. □
Definition 5. Let $M$ and $Y$ be subsets of normed spaces over $\mathbb{K}$. The
operator
$$A\colon M \to Y$$
is called a homeomorphism iff it is continuous and bijective, and the inverse
operator $A^{-1}\colon Y \to M$ is also continuous.
The two sets $M$ and $Y$ are called homeomorphic iff there exists a
homeomorphism $A\colon M \to Y$.
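Example 6. Let $r, a, b > 0$. The disk
$$D := \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 \le r^2\}$$
and the ellipse
$$E := \{(\xi, \eta) \in \mathbb{R}^2 : \xi^2/a^2 + \eta^2/b^2 \le 1\}$$
are homeomorphic. The homeomorphism $A\colon D \to E$ is given through
$$A(x, y) := r^{-1}(ax, by),$$
along with the inverse operator $A^{-1}(\xi, \eta) = r(a^{-1}\xi, b^{-1}\eta)$ (see Figure 1.8).
The intuitive meaning of a homeomorphism is a "rubber-sheet transformation."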
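A small numerical check of this example can be written as follows. The sketch assumes that the disk has radius $r$, that the ellipse has semi-axes $a$ and $b$, and that the map is $A(x, y) = r^{-1}(ax, by)$; this form is inferred from the inverse $A^{-1}(\xi, \eta) = r(a^{-1}\xi, b^{-1}\eta)$ stated in the text. The function names `to_ellipse` and `to_disk` are hypothetical.

```python
def to_ellipse(x, y, r, a, b):
    """Assumed forward map A: disk of radius r -> ellipse with semi-axes a, b."""
    return (a * x / r, b * y / r)

def to_disk(xi, eta, r, a, b):
    """Inverse map A^{-1}(xi, eta) = r * (xi / a, eta / b), as stated in the text."""
    return (r * xi / a, r * eta / b)

# A boundary point of the disk (radius 2) is sent to the boundary of the ellipse.
r, a, b = 2.0, 3.0, 1.0
x, y = 1.2, 1.6                      # x^2 + y^2 = r^2
xi, eta = to_ellipse(x, y, r, a, b)
on_ellipse = xi ** 2 / a ** 2 + eta ** 2 / b ** 2
back = to_disk(xi, eta, r, a, b)     # round trip returns (x, y) up to rounding
```

Both maps are linear, hence continuous, which is why the pair forms a homeomorphism.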
FIGURE 1.8.
1.10 Convexity
Definition 1. The set $M$ in a linear space is called convex iff
$$u, v \in M \text{ and } 0 \le \alpha \le 1 \quad \text{imply} \quad \alpha u + (1 - \alpha)v \in M.$$
The function $f\colon M \to \mathbb{R}$ is called convex iff $M$ is convex and
$$f(\alpha u + (1 - \alpha)v) \le \alpha f(u) + (1 - \alpha)f(v)$$
for all $u, v \in M$ and all $\alpha$, $0 \le \alpha \le 1$.
Intuitively, the convexity of the set $M$ means that if the two points $u$
and $v$ belong to $M$, then the segment joining them also belongs to $M$ (see
Figure 1.9(a)).
The convexity of the real function $f\colon [a, b] \to \mathbb{R}$ means that the chords
always lie above the graph of $f$ (see Figure 1.9(b)).
Example 2. Let $X$ be a normed space, and let $u_0 \in X$, $r \ge 0$ be given.
Then, the ball
$$B = \{u \in X : \|u - u_0\| \le r\}$$
is convex.
Proof. If $u, v \in B$ and $0 \le \alpha \le 1$, then
$$\begin{aligned}
\|\alpha u + (1 - \alpha)v - u_0\| &= \|\alpha(u - u_0) + (1 - \alpha)(v - u_0)\| \\
&\le \|\alpha(u - u_0)\| + \|(1 - \alpha)(v - u_0)\| \\
&= \alpha\|u - u_0\| + (1 - \alpha)\|v - u_0\| \le \alpha r + (1 - \alpha)r = r.
\end{aligned}$$
Hence $\alpha u + (1 - \alpha)v \in B$. □
Example 3. Let $X$ be a normed space with the norm $\|\cdot\|$. Set $f(u) := \|u\|$.
Then, the function $f\colon X \to \mathbb{R}$ is continuous and convex.
Proof. It follows from $u_n \to u$ as $n \to \infty$ that $\|u_n\| \to \|u\|$. Hence $f$ is
continuous.
For $u, v \in X$ and $0 \le \alpha \le 1$,
$$\|\alpha u + (1 - \alpha)v\| \le \|\alpha u\| + \|(1 - \alpha)v\| = \alpha\|u\| + (1 - \alpha)\|v\|.$$
This proves the convexity of $f$. □
FIGURE 1.9. (a) convex sets; (b) convex functions.
Definition 4. A subset $L$ of the linear space $X$ over $\mathbb{K}$ is called a linear
subspace of $X$ iff
$$u, v \in L \text{ and } \alpha, \beta \in \mathbb{K} \quad \text{imply} \quad \alpha u + \beta v \in L.$$
By a closed linear subspace $L$ of the normed space $X$ over $\mathbb{K}$ we mean a
linear subspace that is a closed set.
Obviously, each linear subspace is convex.
Example 5. Let $X := C[a, b]$, $-\infty < a < b < \infty$, and set
$$L := \{u \in X : u(a) = 0\}.$$
Then, $L$ is a closed linear subspace of $X$.
Proof. If $u, v \in L$ and $\alpha, \beta \in \mathbb{R}$, then
$$(\alpha u + \beta v)(a) = \alpha u(a) + \beta v(a) = 0,$$
and hence $\alpha u + \beta v \in L$, i.e., $L$ is a linear subspace of $X$.
If $u_n \in L$ for all $n$ and $u_n \to u$ in $X$ as $n \to \infty$, then
$$u_n(a) \to u(a) \quad \text{as } n \to \infty,$$
and hence $u(a) = 0$, i.e., $u \in L$. Thus, the set $L$ is closed. □

Definition 6. Let $M$ be a subset of the linear space $X$ over $\mathbb{K}$. Then:
span $M$ := smallest linear subspace of $X$ containing $M$;
co $M$ := smallest convex set of $X$ containing $M$.
Let $X$ be a normed space over $\mathbb{K}$. Then:
$\overline{M}$ := smallest closed set of $X$ containing $M$;
$\overline{\mathrm{co}}\, M$ := smallest closed convex set of $X$ containing $M$;
int $M$ := largest open set of $X$ contained in $M$.
Here, we use the following terminology:
span $M$ = linear hull of $M$;
$\overline{M}$ = closure of $M$;
co $M$ = convex hull of $M$;
$\overline{\mathrm{co}}\, M$ = closed convex hull of $M$;
int $M$ = interior of $M$.
The set
$$\partial M := \overline{M} - \operatorname{int} M$$
is called the boundary of $M$.
Finally, the set ext $M$ := int$(X - M)$ is called the exterior of $M$.
The point $u$ is called an interior point, boundary point, or exterior point
of $M$ iff $u \in \operatorname{int} M$, $u \in \partial M$, or $u \in \operatorname{ext} M$, respectively.
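Proposition 7. Let $M$ be a nonempty subset of the normed space $X$ over
$\mathbb{K}$. Then, the following hold true:
(i) $u \in \operatorname{span} M$ iff, for some fixed $n = 1, 2, \ldots$,
$$u = \alpha_1 u_1 + \cdots + \alpha_n u_n, \tag{27}$$
where $u_1, \ldots, u_n \in M$ and $\alpha_1, \ldots, \alpha_n \in \mathbb{K}$.
(ii) $u \in \operatorname{co} M$ iff, for some fixed $n = 1, 2, \ldots$,
$$u = \alpha_1 u_1 + \cdots + \alpha_n u_n, \tag{28}$$
where $u_1, \ldots, u_n \in M$ and $0 \le \alpha_1, \ldots, \alpha_n \le 1$ with $\alpha_1 + \cdots + \alpha_n = 1$.
(iii) $u \in \overline{M}$ iff, for some sequence $(u_n)$ in $M$,
$$u_n \to u \quad \text{as } n \to \infty.$$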
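Proof. Ad (i). Let $L$ be the set of all the linear combinations of the form
(27). Then
$$u, v \in L \text{ and } \alpha, \beta \in \mathbb{K} \quad \text{imply} \quad \alpha u + \beta v \in L,$$
i.e., $L$ is a linear subspace of $X$. In fact, a linear combination of two
expressions of the form (27) is again of the form (27). Moreover, $M \subseteq L$
(choose $n = 1$ and $\alpha_1 = 1$ in (27)).
Conversely, let $\mathcal{L}$ be a linear subspace of $X$ such that $M \subseteq \mathcal{L}$. Then, it
follows from $u_1, \ldots, u_n \in M$ that $u \in \mathcal{L}$, where $u$ is given by (27). Hence
$$L \subseteq \mathcal{L}.$$
Thus, $L$ is the smallest linear subspace of $X$ that contains the set $M$, i.e.,
$L = \operatorname{span} M$.
Ad (ii). Use a similar argument as in the proof of (i). In this connection,
observe that it follows from
$$0 \le \alpha_j, \beta_k \le 1 \quad \text{and} \quad 0 \le \alpha, \beta \le 1$$
as well as $\alpha_1 + \cdots + \alpha_n = 1$, $\beta_1 + \cdots + \beta_m = 1$, and $\alpha + \beta = 1$, that
$$\alpha\alpha_1 + \cdots + \alpha\alpha_n + \beta\beta_1 + \cdots + \beta\beta_m = \alpha + \beta = 1.$$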
Ad (iii). Let $C$ be the set of all the points $u \in X$ such that $u_n \to u$ as
$n \to \infty$ for some sequence $(u_n)$ in $M$, i.e.,
$$\|u_n - u\| \to 0 \text{ as } n \to \infty \quad \text{and} \quad u_n \in M \text{ for all } n. \tag{29}$$
We want to show that the set $C$ is closed. To this end, let $(v_n)$ be a sequence
in $C$ such that
$$v_n \to v \quad \text{as } n \to \infty.$$
By the definition of $C$, for each $v_n$, there is a point $w_n \in M$ such that
$$\|w_n - v_n\| \le \frac{1}{n} \quad \text{for all } n = 1, 2, \ldots.$$
This implies
$$\|w_n - v\| = \|(w_n - v_n) + (v_n - v)\| \le \|w_n - v_n\| + \|v_n - v\| \to 0 \quad \text{as } n \to \infty.$$
Hence $v \in C$. Thus, the set $C$ is closed.
Conversely, let $\mathcal{C}$ be a closed subset of $X$ such that $M \subseteq \mathcal{C}$. Then
$$C \subseteq \mathcal{C}.$$
Therefore, $C$ is the smallest closed subset that contains the set $M$, i.e.,
$C = \overline{M}$. □
FIGURE 1.10. (Diagram relating the notions: norm, convergence, open set, closed set, compact set, relatively compact set, bounded set, continuous operator, compact operator.)
1.11 Compactness
We want to study both compact sets and compact operators. The notion
of a compact set generalizes the classical Bolzano-Weierstrass convergence
theorem.
Compact operators allow us to generalize classical results for opera-
tor equations in finite-dimensional normed spaces to infinite-dimensional
normed spaces via approximation and a limiting process. For example, this
method will be used in the proof of the Schauder fixed-point theorem.
Compactness plays a key role in functional analysis.
Figure 1.10 tells us, for example, that each compact set is closed, and so
on.
1.11.1 Compact Sets

Definition 1. Let $M$ be a set in a normed space.
$M$ is called relatively sequentially compact iff each sequence $(u_n)$ in $M$
has a convergent subsequence $u_{n'} \to u$ as $n' \to \infty$.
$M$ is called sequentially compact iff each sequence $(u_n)$ in $M$ has a
convergent subsequence $u_{n'} \to u$ as $n' \to \infty$ such that $u \in M$.
$M$ is called bounded iff there is a number $r \ge 0$ such that $\|u\| \le r$ for all
$u \in M$.
Convention 2. For brevity of terminology, we will use "relatively compact"

and "compact" instead of "relatively sequentially compact" and "sequen-
tially compact," respectively.
We will show in Problem 1.13 of AMS Vol. 109 that in normed spaces
this convention makes sense with respect to the corresponding definitions
in general topological spaces.
Proposition 3. The set $M$ is compact iff it is relatively compact and closed.
Proof. Let $M$ be compact. By definition, this implies that $M$ is also
relatively compact. Furthermore, let
$$u_n \to v \text{ as } n \to \infty \quad \text{with } u_n \in M \text{ for all } n.$$
Since $M$ is compact, there is a subsequence $u_{n'} \to u$ as $n' \to \infty$ with
$u \in M$. Obviously, $u = v$. Hence $M$ is closed.
Conversely, let $M$ be a relatively compact and closed set. Consider any
sequence $(u_n)$ in $M$. Then, there is a subsequence such that $u_{n'} \to u$ as
$n' \to \infty$. Since $M$ is closed, $u \in M$. Thus, $M$ is compact. □
Proposition 4. Each relatively compact set is bounded.
Proof. Let the set $M$ be relatively compact and suppose that $M$ is not
bounded. Then, there exists a sequence $(u_n)$ in $M$ such that
$$\|u_n\| \ge n \quad \text{for all } n. \tag{30}$$
Since $M$ is relatively compact, there exists a convergent subsequence $(u_{n'})$.
Hence $(u_{n'})$ is bounded. This contradicts (30). □
Example 5. Let $M$ be a subset of $\mathbb{R}$ equipped with the usual norm
$\|u\| := |u|$.
Then, $M$ is relatively compact iff it is bounded.
Proof. Let $M$ be bounded, and let $(u_n)$ be a sequence in $M$. By the classical
Bolzano-Weierstrass theorem, there exists a convergent subsequence
$u_{n'} \to u$ as $n' \to \infty$. Hence $M$ is relatively compact.
Conversely, if the set $M$ is relatively compact, then $M$ is bounded, by
Proposition 4. □
Standard Example 6. Let $M$ be a set in $\mathbb{K}^N$ equipped with the norm
$\|u\| := |u|_\infty$, where $N = 1, 2, \ldots$.
Then, $M$ is relatively compact iff it is bounded.
Proof. By Proposition 4, it is sufficient to prove that each bounded set in
$\mathbb{K}^N$ is relatively compact.
Step 1: $\mathbb{K} = \mathbb{R}$, $N = 1$. Observe that $\mathbb{R}^1 = \mathbb{R}$ and use Example 5.
Step 2: $\mathbb{K} = \mathbb{R}$, $N = 2$. Suppose that $M$ is bounded. Let $(u_n)$ be a
sequence in $M$, i.e.,
$$u_n := (\xi_{1n}, \xi_{2n}).$$
Then, $(u_n)$ is bounded. Since
$$|\xi_{1n}|, |\xi_{2n}| \le |u_n|_\infty \quad \text{for all } n, \tag{31}$$
the sequence $(\xi_{1n})$ is bounded in $\mathbb{R}$. By Example 5, there is a convergent
subsequence
$$\xi_{1n'} \to \xi_1 \quad \text{as } n' \to \infty.$$
Similarly, by (31), the sequence $(\xi_{2n'})$ is bounded. Thus, there is a
convergent subsequence
$$\xi_{2n''} \to \xi_2 \quad \text{as } n'' \to \infty.$$
Setting $u := (\xi_1, \xi_2)$, we get $u_{n''} \to u$ as $n'' \to \infty$.
Step 3: $\mathbb{K} = \mathbb{R}$, $N \ge 3$. Proceed similarly to Step 2.
Step 4: $\mathbb{K} = \mathbb{C}$, $N = 1$. Suppose that $M$ is a bounded set in $\mathbb{C}$. Consider
a sequence $(v_n)$ in $M$, i.e.,
$$v_n := \xi_{1n} + i\xi_{2n}.$$
Since $M$ is bounded,
$$|\xi_{1n}|, |\xi_{2n}| \le |v_n| \le r \quad \text{for all } n \text{ and fixed } r \ge 0.$$
As in the proof of Step 2, we get subsequences
$$\xi_{1n''} \to \xi_1 \quad \text{and} \quad \xi_{2n''} \to \xi_2 \quad \text{as } n'' \to \infty.$$
Letting $v := \xi_1 + i\xi_2$, this implies $v_{n''} \to v$ as $n'' \to \infty$.
Step 5: $\mathbb{K} = \mathbb{C}$, $N \ge 2$. Use the same argument as in Step 2 along with
Step 4. □
Standard Example 7 (The Arzelà-Ascoli theorem). Let $X := C[a, b]$
with $\|u\| := \max_{a \le x \le b} |u(x)|$ and $-\infty < a < b < \infty$. Suppose that we are
given a set $M$ in $X$ such that
(i) $M$ is bounded, i.e., $\|u\| \le r$ for all $u \in M$ and fixed $r \ge 0$.
(ii) $M$ is equicontinuous, i.e., by definition, for each $\varepsilon > 0$, there is a $\delta > 0$
such that
$$|x - y| < \delta \text{ and } u \in M \quad \text{imply} \quad |u(x) - u(y)| < \varepsilon.$$
Then, $M$ is a relatively compact subset of $X$.
Proof. Suppose that we are given a sequence $(u_n)$ in $M$, i.e., the functions
$$u_n\colon [a, b] \to \mathbb{R}, \quad n = 1, 2, \ldots$$
are continuous. Let $\mathbb{Q}$ denote the set of rational numbers contained in the
interval $[a, b]$. This set is countable. Thus, we may write
$$\mathbb{Q} = \{r_i : i = 1, 2, \ldots\}.$$

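Step 1: Diagonal sequence. By assumption (i), the sequence $(u_n(r_1))$
is bounded in $\mathbb{R}$. Thus, there is a subsequence $(u_n^{(1)})$ of $(u_n)$ such that
$(u_n^{(1)}(r_1))$ is convergent, i.e., there is a real number $w_1$ such that
$$u_n^{(1)}(r_1) \to w_1 \quad \text{as } n \to \infty.$$
Again by (i), the sequence $(u_n^{(1)}(r_2))$ is bounded in $\mathbb{R}$. Hence there exists a
subsequence $(u_n^{(2)})$ of $(u_n^{(1)})$ such that
$$u_n^{(2)}(r_2) \to w_2 \quad \text{as } n \to \infty.$$
Continuing this construction, for each $k = 1, 2, \ldots$, we obtain a subsequence
$(u_n^{(k)})$ of $(u_n)$ such that, as $n \to \infty$,
$$u_n^{(k)}(r_j) \to w_j \quad \text{for all } j = 1, \ldots, k.$$
In addition, $(u_n^{(k+1)})$ is a subsequence of $(u_n^{(k)})$ for all $k$.
We now consider the diagonal sequence
$$v_n := u_n^{(n)}, \quad n = 1, 2, \ldots.$$
Then
$$v_n(r_j) \to w_j \text{ as } n \to \infty \quad \text{for all } j = 1, 2, \ldots. \tag{32}$$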
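Step 2: Cauchy sequence in $C[a, b]$. Let $\varepsilon > 0$ be given. We choose the
number $\delta > 0$ as in (ii). Then, there exists a finite number of points
$x_1, \ldots, x_s \in \mathbb{Q}$ such that, for each $x \in [a, b]$, there is some $x_j$ such that
$$|x - x_j| < \delta. \tag{33}$$
By (32), for each $j = 1, \ldots, s$, the sequence $(v_n(x_j))$ is convergent, and
hence it is a Cauchy sequence. Thus, there is a number $n_0(\varepsilon)$ such that
$$|v_n(x_j) - v_m(x_j)| < \varepsilon \quad \text{for all } n, m \ge n_0(\varepsilon),\ j = 1, \ldots, s.$$
Finally, for each $x \in [a, b]$, it follows from assumption (ii) and (33) that
$$|v_n(x) - v_m(x)| \le |v_n(x) - v_n(x_j)| + |v_n(x_j) - v_m(x_j)| + |v_m(x_j) - v_m(x)| < 3\varepsilon \quad \text{for all } n, m \ge n_0(\varepsilon).$$
This implies
$$\|v_n - v_m\| = \max_{a \le x \le b} |v_n(x) - v_m(x)| \le 3\varepsilon \quad \text{for all } n, m \ge n_0(\varepsilon),$$
i.e., $(v_n)$ is a Cauchy subsequence of $(u_n)$.
Since $C[a, b]$ is a Banach space, $(v_n)$ represents a convergent subsequence
of $(u_n)$ in $C[a, b]$. □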
Proposition 8 (The Weierstrass theorem). Let
$$f\colon M \to \mathbb{R}$$
be a continuous function on the compact nonempty subset $M$ of a normed
space.
Then, $f$ has a minimum and a maximum on $M$.
Proof. Set
$$\alpha := \inf_{u \in M} f(u).$$
Then $-\infty \le \alpha < \infty$. Recall that if the set
$$\mathcal{M} := \{f(u) : u \in M\}$$
is bounded below, then $\alpha$ is the largest lower bound of $\mathcal{M}$, and hence
$\alpha > -\infty$. If $\mathcal{M}$ is not bounded below, then $\alpha := -\infty$.
By construction of $\alpha$, there exists a sequence $(u_n)$ in $M$ such that
$$f(u_n) \to \alpha \quad \text{as } n \to \infty.$$
Since the set $M$ is compact, there exists a convergent subsequence $u_{n'} \to v$
as $n' \to \infty$ with $v \in M$. By the continuity of the function $f$,
$$f(u_{n'}) \to f(v) \quad \text{as } n' \to \infty.$$
Hence $\alpha = f(v)$. Consequently, $\alpha > -\infty$ and
$$f(v) = \inf_{u \in M} f(u).$$
That is, the function $f$ has a minimum on the set $M$.
Replacing $f$ with $-f$, we obtain the corresponding result for the
maximum of $f$. □
Proposition 9. Let $X$ and $Y$ be normed spaces over $\mathbb{K}$, and let
$$A\colon M \subseteq X \to Y$$
be a continuous operator on the compact nonempty subset $M$ of $X$.
Then, $A$ is uniformly continuous on $M$.
Proof. Recall that the uniform continuity of $A$ means that, for each $\varepsilon > 0$,
there is a number $\delta(\varepsilon) > 0$ such that
$$\|u - v\| < \delta(\varepsilon) \text{ and } u, v \in M \quad \text{imply} \quad \|Au - Av\| < \varepsilon. \tag{34}$$
Suppose that $A$ is not uniformly continuous. Then, there exist a number
$\varepsilon_0 > 0$ and two sequences $(u_n)$ and $(v_n)$ in $M$ such that
$$\|u_n - v_n\| \le \frac{1}{n} \quad \text{and} \quad \|Au_n - Av_n\| \ge \varepsilon_0 \quad \text{for all } n. \tag{35}$$
Since $M$ is compact, there exists a subsequence of $(u_n)$, again denoted by
$(u_n)$, such that
$$u_n \to u \text{ as } n \to \infty \quad \text{and} \quad u \in M.$$
This implies $\|v_n - u\| \le \|v_n - u_n\| + \|u_n - u\| \to 0$ as $n \to \infty$, and hence
$$v_n \to u \quad \text{as } n \to \infty.$$
By the continuity of the operator $A$, $Au_n - Av_n \to 0$ as $n \to \infty$. This
contradicts condition (35). □
Proposition 10 (Finite $\varepsilon$-net). Let $M$ be a nonempty set in the Banach
space $X$. Then, the following two statements are equivalent:
(i) $M$ is relatively compact.
(ii) $M$ has a finite $\varepsilon$-net; that is, by definition, for each $\varepsilon > 0$, there exists
a finite number of points $v_1, \ldots, v_J \in M$ such that
$$\min_{1 \le j \le J} \|u - v_j\| \le \varepsilon \quad \text{for all } u \in M.$$
Proof. (i) $\Rightarrow$ (ii). Let $M$ be relatively compact. Suppose that (ii) is not
true. Then, there is a number $\varepsilon_0 > 0$ such that $M$ has no finite $\varepsilon_0$-net.
Choose a fixed point $u_1 \in M$. Then, there exists a point $u_2 \in M$ such that
$$\|u_2 - u_1\| > \varepsilon_0.$$
Furthermore, there exists a point $u_3 \in M$ such that
$$\|u_3 - u_1\| > \varepsilon_0 \quad \text{and} \quad \|u_3 - u_2\| > \varepsilon_0.$$
Continuing this construction, we get a sequence $(u_n)$ in $M$ such that
$$\|u_n - u_m\| > \varepsilon_0 \quad \text{for all } n, m = 1, 2, \ldots \text{ with } n \ne m.$$
Consequently, no subsequence of $(u_n)$ is Cauchy, i.e., $(u_n)$ does not
contain any convergent subsequence. This is a contradiction to the relative
compactness of $M$.
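(ii) $\Rightarrow$ (i). Suppose that condition (ii) is satisfied. Let $(u_n)$ be a sequence
in $M$. Fix $\varepsilon = 1$. By (ii), there is some $v_j \in M$ such that
$$\|u_n - v_j\| \le 1 \quad \text{for infinitely many indices } n.$$
Thus, there is a subsequence $(u_n^{(1)})$ of $(u_n)$ such that
$$\|u_n^{(1)} - v_j\| \le 1 \quad \text{for all } n.$$
Hence
$$\|u_n^{(1)} - u_m^{(1)}\| \le 2 \quad \text{for all } n, m.$$
Continuing this construction for $\varepsilon = \frac{1}{n}$, $n = 2, 3, \ldots$, we obtain the
sequences
$$(u_n^{(1)}),\ (u_n^{(2)}),\ (u_n^{(3)}),\ \ldots$$
with the following properties. For each $k = 1, 2, \ldots$, $(u_n^{(k+1)})$ is a
subsequence of $(u_n^{(k)})$ and
$$\|u_n^{(k)} - u_m^{(k)}\| \le \frac{2}{k} \quad \text{for all } n, m. \tag{36}$$
Consider now the diagonal sequence
$$v_n := u_n^{(n)}.$$
By (36),
$$\|v_n - v_m\| \le \frac{2}{n} \quad \text{for all } n, m \text{ with } m \ge n.$$
Therefore, $(v_n)$ is a Cauchy subsequence of $(u_n)$. Since $X$ is a Banach space,
the sequence $(v_n)$ is convergent. This proves the relative compactness of the
set $M$. □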
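The construction used in the step from (i) to (ii), namely picking points that are pairwise more than $\varepsilon$ apart until no further point can be added, can be sketched for a finite point set in $\mathbb{R}^2$ with the Euclidean norm. This is an illustration only; the function name `greedy_eps_net` and the sample grid are assumptions.

```python
import math

def greedy_eps_net(points, eps):
    """Greedily select points that are pairwise more than eps apart.

    Every point that is not selected lies within eps of a selected one,
    so the selection is a finite eps-net for the given point set.
    """
    net = []
    for p in points:
        if all(math.dist(p, v) > eps for v in net):
            net.append(p)
    return net

# Sample set (an assumption): a 21 x 21 grid in the unit square.
pts = [(i / 20, j / 20) for i in range(21) for j in range(21)]
net = greedy_eps_net(pts, 0.25)
covered = all(min(math.dist(p, v) for v in net) <= 0.25 for p in pts)
```

For a bounded set in a finite-dimensional space this selection always stops after finitely many steps, whereas in the proof above it is the relative compactness of $M$ that forces termination.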
1.11.2 Compact Operators

Definition 11. Let $X$ and $Y$ be normed spaces over $\mathbb{K}$. The operator
$$A\colon M \subseteq X \to Y$$
is called compact iff
(i) $A$ is continuous, and
(ii) $A$ transforms bounded sets into relatively compact sets.
Obviously, property (ii) is equivalent to the following: If $(u_n)$ is a bounded
sequence in $M$, then there exists a subsequence $(u_{n'})$ of $(u_n)$ such that the
sequence $(Au_{n'})$ is convergent in $Y$.
Standard Example 12. Let us consider the integral operator
$$(Au)(x) := \int_a^b F(x, y, u(y))\,dy \quad \text{for all } x \in [a, b],$$
where $-\infty < a < b < \infty$. Set
$$Q := \{(x, y, u) \in \mathbb{R}^3 : x, y \in [a, b] \text{ and } |u| \le r\} \quad \text{for fixed } r > 0.$$
Suppose that the function $F\colon Q \to \mathbb{R}$ is continuous. Set $X := C[a, b]$ and
$$M := \{u \in X : \|u\| \le r\}.$$
Then, the operator $A\colon M \to X$ is compact.
Proof. By Proposition 9 in Section 1.11.1, the function $F$ is uniformly
continuous on the compact set $Q$. This implies that, for each $\varepsilon > 0$, there
is a number $\delta > 0$ such that
$$|F(x, y, u) - F(z, y, v)| < \varepsilon \tag{37}$$
for all $(x, y, u), (z, y, v) \in Q$ with $|x - z| + |u - v| < \delta$.
We first show that the operator $A\colon M \to X$ is continuous. In fact, if
$u \in M$, then the function $u\colon [a, b] \to \mathbb{R}$ is continuous, and $|u(y)| \le r$ for
all $y \in [a, b]$. Hence the function $Au\colon [a, b] \to \mathbb{R}$ is also continuous. Let
$u, v \in M$. Then
$$\|u - v\| = \max_{a \le y \le b} |u(y) - v(y)| < \delta$$
implies
$$\|Au - Av\| = \max_{a \le x \le b} \left|\int_a^b [F(x, y, u(y)) - F(x, y, v(y))]\,dy\right| \le (b - a)\varepsilon, \tag{38}$$
by (37). Hence $A\colon M \to X$ is continuous.
We now show that $A\colon M \to X$ is compact. Since the set $M$ is bounded,
it suffices to show that the set $A(M)$ is relatively compact. By the
Arzelà-Ascoli theorem (Standard Example 7 in Section 1.11.1), it remains to show
that
(i) $A(M)$ is bounded, and
(ii) $A(M)$ is equicontinuous.
Ad (i). Set $\mathcal{M} := \max_{(x,y,u) \in Q} |F(x, y, u)|$. Then, for all $u \in M$,
$$\|Au\| = \max_{a \le x \le b} \left|\int_a^b F(x, y, u(y))\,dy\right| \le (b - a)\mathcal{M}.$$
Ad (ii). Let $|x - z| < \delta$ and $x, z \in [a, b]$. Then, by (37),
$$|(Au)(x) - (Au)(z)| \le \int_a^b |F(x, y, u(y)) - F(z, y, u(y))|\,dy \le (b - a)\varepsilon \quad \text{for all } u \in M.$$ □
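Proposition 13 (Approximation theorem for compact operators). Let
$$A\colon M \subseteq X \to Y$$
be a compact operator, where $X$ and $Y$ are Banach spaces over $\mathbb{K}$, and $M$
is a bounded nonempty subset of $X$.
Then, for every $n = 1, 2, \ldots$ there exists a continuous operator
$$A_n\colon M \to Y$$
such that
$$\sup_{u \in M} \|Au - A_n u\| \le \frac{1}{n} \quad \text{and} \quad \dim(\operatorname{span} A_n(M)) < \infty,$$
as well as $A_n(M) \subseteq \operatorname{co} A(M)$.
Proof. The set $A(M)$ is relatively compact in $Y$. Thus, for every $n =
1, 2, \ldots$, there exists a finite $\frac{1}{2n}$-net for $A(M)$. That is, there are elements
$u_j \in A(M)$, $j = 1, \ldots, J$, such that
$$\min_{1 \le j \le J} \|Au - u_j\| \le \frac{1}{2n} \quad \text{for all } u \in M.$$
Define the Schauder operator
$$A_n u := \frac{\sum_{j=1}^J a_j(u)\,u_j}{\sum_{j=1}^J a_j(u)} \quad \text{for all } u \in M,$$
where
$$a_j(u) := \max\{n^{-1} - \|Au - u_j\|,\ 0\} \quad \text{for all } u \in M,\ j = 1, \ldots, J.$$
The function $u \mapsto \|Au - u_j\|$ is continuous, by Proposition 6(iii) in Section
1.2. Thus, the function $a_j\colon M \to \mathbb{R}$ is also continuous. Moreover, for each
$u \in M$, the $a_j(u)$ do not all vanish simultaneously. Hence the operator
$A_n\colon M \to Y$ is continuous.
Finally, for each $u \in M$,
$$\|Au - A_n u\| = \frac{\left\|\sum_{j=1}^J a_j(u)(Au - u_j)\right\|}{\sum_{j=1}^J a_j(u)} \le \frac{\sum_{j=1}^J a_j(u)\|Au - u_j\|}{\sum_{j=1}^J a_j(u)} \le \frac{1}{n},$$
since $a_j(u) \ne 0$ only if $\|Au - u_j\| < n^{-1}$.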
1.12 Finite-Dimensional Banach Spaces and Equivalent Norms
Definition 1. Let $X$ be an $N$-dimensional linear space over $\mathbb{K}$, where
$N = 1, 2, \ldots$. By a basis $\{e_1, \ldots, e_N\}$ of $X$ we understand a set of elements
$e_1, \ldots, e_N$ of $X$ such that, for each $u \in X$,
$$u = \alpha_1 e_1 + \cdots + \alpha_N e_N, \tag{39}$$
where the numbers $\alpha_1, \ldots, \alpha_N \in \mathbb{K}$ are uniquely determined by $u$.
The numbers $\alpha_1, \ldots, \alpha_N$ are called the components of $u$.
In particular, letting $u = 0$ in (39) it follows from the uniqueness of the
components that $\alpha_1 = \cdots = \alpha_N = 0$, i.e., the elements $e_1, \ldots, e_N$ of a basis
are linearly independent.
Proposition 2. Let $N = 1, 2, \ldots$. In each $N$-dimensional linear space $X$
over $\mathbb{K}$ there exists a basis $\{e_1, \ldots, e_N\}$.
Proof. Since $\dim X = N$, there exist $N$ linearly independent elements
$e_1, \ldots, e_N$ of $X$, and $N + 1$ elements of $X$ are never linearly independent.
Thus, for given $u \in X$, there are numbers $\beta_0, \beta_1, \ldots, \beta_N$ such that
$$\beta_0 u + \beta_1 e_1 + \cdots + \beta_N e_N = 0,$$
where $\beta_k \ne 0$ for some $k$. Since $\beta_0 = 0$ implies $\beta_1 = \cdots = \beta_N = 0$, we get
$\beta_0 \ne 0$. Letting $\alpha_j := -\beta_j / \beta_0$ we obtain (39).
Finally, it follows from (39) and $u = \alpha_1' e_1 + \cdots + \alpha_N' e_N$ that
$$(\alpha_1 - \alpha_1')e_1 + \cdots + (\alpha_N - \alpha_N')e_N = 0,$$
and hence $\alpha_j - \alpha_j' = 0$ for all $j$. This yields the uniqueness of the components
$\alpha_j$. □
Definition 3. The two norms $\|\cdot\|$ and $\|\cdot\|_1$ on the normed space $X$ are
called equivalent iff there are positive numbers $\alpha$ and $\beta$ such that
$$\alpha\|u\| \le \|u\|_1 \le \beta\|u\| \quad \text{for all } u \in X. \tag{40}$$
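Proposition 4. Two norms on a finite-dimensional linear space $X$ over
$\mathbb{K}$ are always equivalent.
Proof. If $\dim X = 0$, then $X = \{0\}$. In this case, the inequality (40) is
satisfied trivially. Let $\dim X = N$ for fixed $N = 1, 2, \ldots$. Suppose that $\|\cdot\|$
is a norm on $X$. By (39), two arbitrary elements $u$ and $v$ of $X$ allow the
following representations:
$$u = \sum_{j=1}^N \alpha_j e_j \quad \text{and} \quad v = \sum_{j=1}^N \beta_j e_j, \quad \text{where } \alpha_j, \beta_j \in \mathbb{K} \text{ for all } j.$$
Set $\alpha = (\alpha_1, \ldots, \alpha_N)$ and define
$$\|u\|_\infty := |\alpha|_\infty, \quad \text{where } |\alpha|_\infty := \max_{1 \le j \le N} |\alpha_j|.$$
One checks easily that $\|\cdot\|_\infty$ is a norm on $X$. We want to show that there
exist positive numbers $a$ and $b$ such that
$$a\|u\|_\infty \le \|u\| \le b\|u\|_\infty \quad \text{for all } u \in X. \tag{41}$$
Observe first that
$$\|u\| = \Big\|\sum_{j=1}^N \alpha_j e_j\Big\| \le \sum_{j=1}^N \|\alpha_j e_j\| \le b\|u\|_\infty, \quad \text{where } b := \sum_{j=1}^N \|e_j\|.$$
Since $e_j \ne 0$ for all $j$, we get $b > 0$.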
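Furthermore, set
$$M := \{\alpha \in \mathbb{K}^N : |\alpha|_\infty = 1\}.$$
Since $|\cdot|_\infty$ is a norm on $\mathbb{K}^N$, it follows from $\alpha^{(n)} \to \alpha$ in $\mathbb{K}^N$ as $n \to \infty$
that
$$|\alpha^{(n)}|_\infty \to |\alpha|_\infty \quad \text{as } n \to \infty.$$
Thus, the set $M$ is closed and bounded in $\mathbb{K}^N$ with respect to $|\cdot|_\infty$, i.e.,
$M$ is compact in $\mathbb{K}^N$. Define the function
$$f(\alpha_1, \ldots, \alpha_N) := \Big\|\sum_{j=1}^N \alpha_j e_j\Big\|.$$
Then, $f\colon \mathbb{K}^N \to \mathbb{R}$ is continuous. This follows from
$$|f(\alpha_1, \ldots, \alpha_N) - f(\beta_1, \ldots, \beta_N)| = \Big|\,\Big\|\sum_{j=1}^N \alpha_j e_j\Big\| - \Big\|\sum_{j=1}^N \beta_j e_j\Big\|\,\Big| \le \Big\|\sum_{j=1}^N (\alpha_j - \beta_j)e_j\Big\| \le |\alpha - \beta|_\infty \sum_{j=1}^N \|e_j\| \quad \text{for all } \alpha, \beta \in \mathbb{K}^N.$$
By the Weierstrass theorem (Proposition 8 in Section 1.11.1), the continuous
function $f\colon M \to \mathbb{R}$ on the compact set $M$ has a minimum. Denote the
minimal value of $f$ by $a$. Then
$$f(\beta) = \|v\| \ge a \quad \text{for all } v \in X \text{ with } \|v\|_\infty = 1,$$
where $v := \sum_{j=1}^N \beta_j e_j$. Note that $\|v\|_\infty = 1$ implies $\beta_j \ne 0$ for some $j$, and
therefore $v \ne 0$. Hence $a > 0$. For given $u \ne 0$, set $v := \|u\|_\infty^{-1} u$. Hence
$$\|u\| \ge a\|u\|_\infty \quad \text{for all } u \in X.$$
This proves inequality (41).
To finish our argument, let $\|\cdot\|_1$ be a second norm on $X$. Replacing $\|\cdot\|$
with $\|\cdot\|_1$, from (41) we obtain
$$a_1\|u\|_\infty \le \|u\|_1 \le b_1\|u\|_\infty \quad \text{for all } u \in X \tag{41*}$$
and fixed positive numbers $a_1$ and $b_1$. The desired inequality (40) follows
now from (41) and (41*). □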
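The two constants in the proof can be computed in a small case. The sketch below is an illustration under assumptions made here, not part of the text: $X = \mathbb{R}^2$ with the Euclidean norm, the basis $e_1 = (1, 0)$, $e_2 = (1, 1)$, and a sampling of the set $\{|\alpha|_\infty = 1\}$ in place of the exact minimization. It evaluates $b = \|e_1\| + \|e_2\|$, approximates $a$ as the minimum of $\|\alpha_1 e_1 + \alpha_2 e_2\|$ over the sampled set, and checks the two-sided estimate $a|\alpha|_\infty \le \|u\| \le b|\alpha|_\infty$ on random coefficient vectors.

```python
import math
import random

basis = [(1.0, 0.0), (1.0, 1.0)]      # illustrative basis of R^2 (an assumption)

def combo(alpha):
    """u = alpha_1 e_1 + alpha_2 e_2 in standard coordinates."""
    return tuple(sum(c * e[i] for c, e in zip(alpha, basis)) for i in range(2))

def norm(u):
    """The given norm on X; here the Euclidean norm."""
    return math.hypot(u[0], u[1])

def max_norm(alpha):
    """|alpha|_inf, the maximum of the absolute values of the components."""
    return max(abs(c) for c in alpha)

b = sum(norm(e) for e in basis)

# Approximate a = min of f over {|alpha|_inf = 1} by sampling the square's boundary.
ts = [-1.0 + 2.0 * i / 2000 for i in range(2001)]
boundary = ([(s, t) for t in ts for s in (-1.0, 1.0)]
            + [(t, s) for t in ts for s in (-1.0, 1.0)])
a = min(norm(combo(al)) for al in boundary)

random.seed(0)
samples = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(1000)]
ok = all(a * max_norm(al) - 1e-12 <= norm(combo(al)) <= b * max_norm(al) + 1e-12
         for al in samples)
```

In this case the minimum is attained at $\alpha = (1, -1/2)$, which gives $a = 1/\sqrt{2}$, while $b = 1 + \sqrt{2}$.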
The following consequences of Proposition 4 show that
Finite-dimensional normed spaces possess a simple structure.
Proposition 5. Let (un) be a sequence in a finite-dimensional normed

space X with dim X > o. Then
Un ----; U in X as n ----; 00 (42)
iff the corresponding components with respect to any fixed basis converge to
each other.
Proof. Let {e1' ... , eN} be a fixed basis of X. Set

N N
Un = LO:jnej and U = L O:jej, where N = dim X.
j=l j=l
1.13. The Minkowski Functional and Homeomorphisms 45
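By (41),
$$b^{-1}\|u_n - u\| \le \max_{1 \le j \le N} |\alpha_{jn} - \alpha_j| \le a^{-1}\|u_n - u\|.$$
Relation (42) means that $\|u_n - u\| \to 0$ as $n \to \infty$. In turn, this is equivalent to
$$\alpha_{jn} \to \alpha_j \quad\text{as } n \to \infty \text{ for all } j. \tag{42*}$$
□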
Corollary 6. Each finite-dimensional normed space is a Banach space.
Proof. Let dim X = O. Then, X = {O}, and the statement is trivial.

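Let $\dim X > 0$. Suppose that $(u_n)$ is a Cauchy sequence. Then, by (41), for each $\varepsilon > 0$,
$$|\alpha_{jn} - \alpha_{jm}| \le a^{-1}\|u_n - u_m\| < \varepsilon \quad\text{for all } n, m \ge n_0(\varepsilon) \text{ and all } j.$$
Consequently, each sequence $(\alpha_{jn})_n$ of the corresponding components is also Cauchy, and hence convergent in $\mathbb{K}$. Hence we obtain (42*), which implies (42). □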
Corollary 7. Each finite-dimensional linear subspace L of a normed space

is closed.
Proof. Let $u_n \to u$ as $n \to \infty$ with $u_n \in L$ for all $n$. Then, $(u_n)$ is Cauchy, and hence $u \in L$, by Corollary 6. □
Corollary 8. Let M be a subset of a finite-dimensional normed space X.

Then
(i) M is relatively compact iff it is bounded.
(ii) M is compact iff it is bounded and closed.
Proof. For dim X = 0, i.e., X = {O}, the statements are trivial.

Let $N := \dim X > 0$. By Section 1.11.1, all the statements are true in the special case of the space $\mathbb{K}^N$ equipped with the norm $|\cdot|_\infty$. Now use Proposition 5 along with inequality (41). □
1.13 The Minkowski Functional and Homeomorphisms
The following elementary geometrical considerations will be used in the
proof of the Brouwer fixed-point theorem in the next section.
Definition 1. Let $N = 1, 2, \ldots$. The points $u_0, \ldots, u_N$ in the linear space $X$ over $\mathbb{K}$ are said to be in general position iff
$$u_1 - u_0, \ \ldots, \ u_N - u_0$$
are linearly independent.
This definition does not depend on the numbering of the points. For example, if $u_0, \ldots, u_N$ are in general position, then so are $u_1, u_0, u_2, \ldots, u_N$. In fact, it follows from
$$\alpha_0(u_0 - u_1) + \alpha_2(u_2 - u_1) + \cdots + \alpha_N(u_N - u_1) = 0 \quad\text{with } \alpha_j \in \mathbb{K} \text{ for all } j$$
that
$$(\alpha_0 + \alpha_2 + \cdots + \alpha_N)(u_0 - u_1) + \alpha_2(u_2 - u_0) + \cdots + \alpha_N(u_N - u_0) = 0,$$
and hence $\alpha_0 + \alpha_2 + \cdots + \alpha_N = 0$, $\alpha_2 = \cdots = \alpha_N = 0$. This implies $\alpha_j = 0$ for all $j$.
Proposition 2. Let $N = 1, 2, \ldots$. Suppose that the points $u_0, \ldots, u_N$ are in general position, and suppose that
$$u - u_0 \notin \operatorname{span}\{u_1 - u_0, \ldots, u_N - u_0\}. \tag{43}$$
Then the points $u_0, \ldots, u_N, u$ are also in general position.
Proof. Let
$$\alpha_1(u_1 - u_0) + \cdots + \alpha_N(u_N - u_0) + \alpha(u - u_0) = 0 \quad\text{with } \alpha_j, \alpha \in \mathbb{K} \text{ for all } j.$$
By (43), $\alpha = 0$. Hence $\alpha_j = 0$ for all $j$. □
Definition 3. Let $N = 1, 2, \ldots$, and let $X$ be a linear space over $\mathbb{K}$. By an N-simplex we understand the set
$$S := \operatorname{co}\{u_0, \ldots, u_N\}, \tag{44}$$
where the points $u_0, \ldots, u_N \in X$ are in general position.
By a 0-simplex $S$, we understand a single point of $X$, i.e., $S = \{u_0\}$.
Example 4. 1-simplices are segments, and 2-simplices are triangles (see Figure 1.11).
Let $N = 0, 1, \ldots$. The points $u_0, \ldots, u_N$ in (44) are called the vertices of the simplex $S$. Explicitly,
$$S = \Big\{\sum_{j=0}^{N} \alpha_j u_j : \alpha_j \ge 0 \text{ for all } j \text{ and } \alpha_0 + \cdots + \alpha_N = 1\Big\}. \tag{44*}$$
Using $\alpha_0 = 1 - (\alpha_1 + \cdots + \alpha_N)$, we also get
$$S = \Big\{u_0 + \sum_{j=1}^{N} \alpha_j (u_j - u_0) : \alpha_j \ge 0 \text{ for all } j \text{ and } \alpha_1 + \cdots + \alpha_N \le 1\Big\}. \tag{44**}$$
FIGURE 1.11. (a) 1-simplex, (b) 2-simplex, (c).
The point
$$b := \frac{1}{N+1} \sum_{j=0}^{N} u_j$$
is called the barycenter of $S$ (see Figure 1.11(c)).
Let $N = 1, 2, \ldots$. The $(N-1)$-simplices
$$S_0 := \operatorname{co}\{u_1, \ldots, u_N\}, \quad S_1 := \operatorname{co}\{u_0, u_2, \ldots, u_N\}, \quad \ldots, \quad S_N := \operatorname{co}\{u_0, \ldots, u_{N-1}\}$$
are called the $(N-1)$-faces of $S$ opposite to the points $u_0, \ldots, u_N$, respectively (see Figure 1.11(c)).
By a $k$-face of $S$ we understand the convex hull of $k + 1$ distinct vertices of $S$, where $k = 0, 1, \ldots, N$.
Definition 5. Let $M$ be a nonempty set in a normed space $X$. Then, we define the diameter of $M$ through
$$\operatorname{diam} M := \sup_{u, v \in M} \|u - v\|.$$
The number
$$\operatorname{dist}(u, M) := \inf_{w \in M} \|u - w\|$$
is called the distance of the point $u \in X$ from the set $M$.
Instead of $\operatorname{dist}(u, M)$, we also write $\operatorname{dist}_X(u, M)$.
Standard Example 6 (N-simplices). Let
$$S = \operatorname{co}\{u_0, \ldots, u_N\}$$
be an N-simplex in the normed space $X$ over $\mathbb{K}$, where $N = 1, 2, \ldots$. Then, the following are true:
(i) The set $S$ is convex and compact.
(ii) $S \subseteq u_0 + Y$, where $Y := \operatorname{span}\{u_1 - u_0, \ldots, u_N - u_0\}$.
(iii) The barycenter $b$ is an interior point of $S$ with respect to the metric subspace $u_0 + Y$ of $X$.
(iv) $\operatorname{diam} S \le 2 \max_{1 \le j \le N} \|u_j - u_0\|$.
Proof. Ad (i). By (44), $S$ is convex.
To show that $S$ is compact, let $(v_n)$ be a sequence in $S$. Then
$$v_n = \sum_{j=0}^{N} \alpha_{jn} u_j,$$
where
$$0 \le \alpha_{jn} \le 1 \quad\text{and}\quad \alpha_{0n} + \cdots + \alpha_{Nn} = 1 \quad\text{for all } j, n.$$
By Standard Example 6 in Section 1.11.1, there exist convergent subsequences
$$\alpha_{jn'} \to \alpha_j \quad\text{as } n' \to \infty \text{ for all } j.$$
Hence
$$0 \le \alpha_j \le 1 \quad\text{and}\quad \alpha_0 + \cdots + \alpha_N = 1 \quad\text{for all } j.$$
Letting $v := \sum_j \alpha_j u_j$, we get $v \in S$ and $v_{n'} \to v$ as $n' \to \infty$. Thus, $S$ is compact.
Ad (ii). This follows from (44**).
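Ad (iii). Let $\|\cdot\|$ denote the norm on $X$. Then,
$$Y := \operatorname{span}\{u_1 - u_0, \ldots, u_N - u_0\}$$
is an N-dimensional linear subspace of $X$. We have $u \in Y$ iff
$$u = \sum_{j=1}^{N} \beta_j (u_j - u_0) \quad\text{with } \beta_j \in \mathbb{K} \text{ for all } j.$$
The norm
$$\|u\|_\infty := \max_{1 \le j \le N} |\beta_j| \quad\text{for all } u \in Y$$
is equivalent to the norm $\|\cdot\|$ on $Y$, by Proposition 4 in Section 1.12. Hence
$$|\beta_j(u)| \le c\|u\| \quad\text{for all } u \in Y,\ j = 1, \ldots, N, \text{ and fixed } c > 0. \tag{45}$$
Let $v \in u_0 + Y$ and
$$\|v - b\| < r \quad\text{for fixed } r > 0.$$
We have to show that this implies $v \in S$ provided $r$ is sufficiently small. By (44**), $v - u_0 \in Y$ and $b - u_0 \in Y$. It follows from (45) that the coordinates $\beta_j(v - u_0)$ and $\beta_j(b - u_0)$ of the points $v - u_0$ and $b - u_0$, respectively, satisfy
$$|\beta_j(v - u_0) - \beta_j(b - u_0)| \le c\|v - b\| < cr, \quad j = 1, \ldots, N.$$
Note that $\beta_j(b - u_0) = \frac{1}{N+1}$, $j = 1, \ldots, N$, so that these coordinates are positive and their sum $\frac{N}{N+1}$ is less than 1. Consequently, for sufficiently small $r > 0$, we get
$$\beta_j(v - u_0) \ge 0 \quad\text{and}\quad \beta_1(v - u_0) + \cdots + \beta_N(v - u_0) \le 1 \quad\text{for all } j.$$
By (44**), this implies $v \in S$.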

Ad (iv). Let $u, v \in S$. By (44**),
$$\|u - v\| = \Big\|\sum_{j=1}^{N} (\alpha_j(u) - \alpha_j(v))(u_j - u_0)\Big\| \le \max_{1 \le j \le N} \|u_j - u_0\| \sum_{j=1}^{N} |\alpha_j(u) - \alpha_j(v)| \le 2 \max_{1 \le j \le N} \|u_j - u_0\|. \qquad □$$
Definition 7. By a barycentric subdivision of the 1-simplex $S = \operatorname{co}\{u_0, u_1\}$, we understand the collection of the following two 1-simplices:
$$S_0 := \operatorname{co}\{b, u_0\} \quad\text{and}\quad S_1 := \operatorname{co}\{b, u_1\},$$
where $b$ is the barycenter of $S$ (see Figure 1.12(a)).
By induction, the barycentric subdivision of an N-simplex $S$ with barycenter $b$ is the collection of all the N-simplices
$$\operatorname{co}\{b, v_0, \ldots, v_{N-1}\},$$
where $v_0, \ldots, v_{N-1}$ are the vertices of any $(N-1)$-simplex obtained by a barycentric subdivision of an $(N-1)$-face of $S$.
The barycentric subdivision of a 2-simplex is pictured in Figure 1.12(b). Intuitively, a barycentric subdivision corresponds to a triangulation based on barycenters.
Proposition 8. Let $M$ be a closed, bounded, convex, nonempty subset of a normed space $X$, where $M$ has an interior point.
Then, $M$ is homeomorphic to the closed ball $B := \{u \in X : \|u\| \le 1\}$.
Proof. If $X = \{0\}$, then $M = \{0\}$, and the statement is trivial.
Now let $X \ne \{0\}$, and let $u_0 \in \operatorname{int} M$. Replacing $u$ with $u - u_0$, we may assume that $u_0 = 0$.
FIGURE 1.12. (a), (b).
FIGURE 1.13.
Step 1: Minkowski functional. For each $u \in X$, we define the Minkowski functional of the set $M$ through
$$p(u) := \inf\{\lambda : \lambda^{-1} u \in M,\ \lambda > 0\}.$$
The intuitive meaning of $p(u)$ is pictured in Figure 1.13, i.e., the ray through the point $u$ and the origin intersects the boundary $\partial M$ of the set $M$ at the point $p(u)^{-1} u$. We want to show that the following are true:
(i) $a\|u\| \le p(u) \le b\|u\|$ for all $u \in X$ and fixed $a, b > 0$.
(ii) $p(\alpha u) = \alpha p(u)$ for all $\alpha \ge 0$.
(iii) $p(u + v) \le p(u) + p(v)$ for all $u, v \in X$ (triangle inequality).
(iv) $p\colon X \to \mathbb{R}$ is continuous.
(v) $M = \{u \in X : p(u) \le 1\}$.
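Ad (i). Since $0 \in \operatorname{int} M$, there is a number $r > 0$ such that
$$\|u\| \le r \quad\text{implies}\quad u \in M.$$
Obviously, $p(0) = 0$. Now let $u \in X$ and $u \ne 0$. Then $\|\lambda^{-1} u\| = r$ for $\lambda := r^{-1}\|u\|$. Hence $\lambda^{-1} u \in M$, i.e., the definition of $p(u)$ makes sense, and
$$p(u) \le r^{-1}\|u\| \quad\text{for all } u \in X.$$
The set $M$ is bounded, i.e.,
$$\|u\| \le R \quad\text{for all } u \in M \text{ and fixed } R > 0.$$
Consequently, if $\lambda^{-1} u \in M$, then $\|\lambda^{-1} u\| \le R$, i.e., $\lambda \ge R^{-1}\|u\|$. This implies
$$p(u) \ge R^{-1}\|u\| \quad\text{for all } u \in X.$$
Thus, (i) holds with $a := R^{-1}$ and $b := r^{-1}$.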
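Ad (ii). For $\alpha = 0$, the claim is $p(0) = 0$. Let $\alpha > 0$. Observe that $\lambda^{-1} u \in M$ with $\lambda > 0$ iff $(\alpha\lambda)^{-1} \alpha u \in M$.
Ad (iii). Let $u, v \in X$. For fixed $\varepsilon > 0$, choose numbers $\alpha$ and $\beta$ such that
$$p(u) < \alpha < p(u) + \varepsilon \quad\text{and}\quad p(v) < \beta < p(v) + \varepsilon.$$
Then $\alpha^{-1} u, \beta^{-1} v \in M$. Let $\gamma := \alpha + \beta$. Since $\gamma^{-1}\alpha + \gamma^{-1}\beta = 1$ and the set $M$ is convex, the point
$$\gamma^{-1}(u + v) = \gamma^{-1}\alpha\,(\alpha^{-1} u) + \gamma^{-1}\beta\,(\beta^{-1} v)$$
lies in $M$. By the definition of $p$,
$$p(u + v) \le \gamma = \alpha + \beta < p(u) + p(v) + 2\varepsilon.$$
Letting $\varepsilon \to 0$, we get (iii).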
Ad (iv). It follows from (iii) that
$$p(u) = p(v + (u - v)) \le p(v) + p(u - v).$$
Interchanging $u$ and $v$ and using (i), we obtain that
$$|p(u) - p(v)| \le \max\{p(u - v), p(v - u)\} \le b\|u - v\| \quad\text{for all } u, v \in X.$$
Thus, $p$ is continuous on $X$.
Ad (v). Let $u \in M$. Since $0 \in M$ and the set $M$ is convex, we get $\mu u \in M$ for all $\mu$: $0 \le \mu \le 1$. Hence $\lambda^{-1} u \in M$ for all $\lambda \ge 1$. This implies $p(u) \le 1$.
Conversely, let $p(u) \le 1$. If $u = 0$, then $u \in M$. Suppose now that $u \ne 0$. Then, $p(u) > 0$ by (i), and, for each $\varepsilon > 0$,
$$\lambda^{-1} u \in M \quad\text{for all } \lambda \ge p(u) + \varepsilon.$$
Letting $\varepsilon \to 0$, this implies $p(u)^{-1} u \in M$, since $M$ is closed. Using $0 \in M$ and $p(u)^{-1} \ge 1$, the convexity of $M$ implies $u \in M$.
Step 2: Homeomorphism $A\colon X \to X$. Set
$$Au := \begin{cases} \dfrac{p(u)}{\|u\|}\, u & \text{if } u \in X,\ u \ne 0, \\ 0 & \text{if } u = 0. \end{cases}$$
By (i),
$$\|Au\| \le b\|u\| \quad\text{for all } u \in X. \tag{46}$$
Thus, $A\colon X \to X$ is continuous. In fact, let
$$u_n \to u \quad\text{as } n \to \infty.$$
If $u = 0$, then $Au_n \to 0$ as $n \to \infty$, by (46). Moreover, if $u \ne 0$ it follows from the continuity of $p$ and $\|\cdot\|$ along with $\|u\| \ne 0$ and Proposition 6(v) in Section 1.2 that $Au_n \to Au$ as $n \to \infty$.
The inverse operator $A^{-1}\colon X \to X$ is given through
$$A^{-1}v := \begin{cases} \dfrac{\|v\|}{p(v)}\, v & \text{if } v \in X,\ v \ne 0, \\ 0 & \text{if } v = 0. \end{cases}$$
This follows from $v = p(u)\|u\|^{-1} u$ if $u \ne 0$. Again by (i), $A^{-1}\colon X \to X$ is continuous. Thus, $A\colon X \to X$ is a homeomorphism.
Step 3: Obviously, $A(M) \subseteq B$ and $A^{-1}(B) \subseteq M$, by (v) and (ii), respectively. Hence $A(M) = B$. Consequently, the restriction $A\colon M \to B$ represents the desired homeomorphism. □
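The construction in Steps 1 and 2 can be tried out numerically. The sketch below assumes $X = \mathbb{R}^2$ with the Euclidean norm and takes $M$ to be the square $[-1, 1]^2$; the membership test `in_square` and the helpers `minkowski` and `to_ball` are hypothetical names chosen for this illustration. The value $p(u)$ is approximated by bisection, which is justified because $\lambda^{-1}u \in M$ for every $\lambda \ge p(u)$ when $M$ is convex and contains 0.

```python
import math

def in_square(u):
    """Membership test for the example set M = [-1, 1]^2."""
    return max(abs(u[0]), abs(u[1])) <= 1.0

def minkowski(u, contains, hi=1e6, tol=1e-12):
    """Approximate p(u) = inf{lam > 0 : u/lam in M} by bisection.
    Assumes M is closed, bounded, convex, with 0 in its interior,
    and that u/hi already lies in M."""
    if all(t == 0.0 for t in u):
        return 0.0
    lo = 0.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if contains(tuple(t / mid for t in u)):
            hi = mid          # u/mid is in M, so p(u) <= mid
        else:
            lo = mid          # u/mid is outside M, so p(u) >= mid
    return hi

def to_ball(u, contains):
    """The map A u = (p(u)/||u||) u from Step 2, with A 0 = 0."""
    n = math.hypot(u[0], u[1])
    if n == 0.0:
        return (0.0, 0.0)
    p = minkowski(u, contains)
    return tuple(p / n * t for t in u)
```

For the square, $p$ coincides with the maximum norm, so the corner $(1, 1)$ of $M$ is sent to a point of Euclidean norm 1 on the boundary of the ball $B$.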
Proposition 9. Let $M$ be a compact, convex, nonempty set in a finite-dimensional normed space $X$.
Then, $M$ is homeomorphic to some N-simplex $S$ in $X$ with $N = 0, 1, \ldots$.
Proof. If $M$ consists of a single point, then the statement is true for $N = 0$.
Suppose now that $M$ contains at least two distinct points. Since $\dim X < \infty$, there is a largest number $N$ for which $M$ contains $N + 1$ points in general position. Let
$$u_0, \ldots, u_N \in M$$
be in general position. After using a translation, we may assume that $u_0 = 0$. Set
$$L := \operatorname{span}\{u_1, \ldots, u_N\} \quad\text{and}\quad S := \operatorname{co}\{u_0, \ldots, u_N\}.$$
By Proposition 2 and the maximality of $N$, we get
$$M \subseteq L,$$
and the convexity of $M$ implies
$$S \subseteq M.$$
By Standard Example 6, the simplex $S$ has an interior point in the normed space $L$, and hence $M$ also has an interior point in $L$. By Proposition 8 applied to the space $L$, the set $M$ is homeomorphic to the ball
$$B := \{u \in L : \|u\| \le 1\}.$$
Again by Proposition 8, the compact convex set $S$ is also homeomorphic to the ball $B$.
Consequently, there exist homeomorphisms
$$A\colon M \to B \quad\text{and}\quad C\colon S \to B.$$
Then, the map
$$C^{-1} \circ A\colon M \xrightarrow{\ A\ } B \xrightarrow{\ C^{-1}\ } S$$
is the desired homeomorphism from the given set $M$ onto the simplex $S$. □
1.14 The Brouwer Fixed-Point Theorem

Theorem 1.B. The continuous operator
$$A\colon M \to M$$
has at least one fixed point when $M$ is a compact, convex, nonempty set in a finite-dimensional normed space over $\mathbb{K}$.
A variant of this famous theorem was proved by Brouwer in 1912. The

Brouwer fixed-point theorem (Theorem 1.B) represents one of the most important
existence principles in mathematics. It is equivalent to numerous,
apparently completely different, propositions. This can be found in Zeidler
(1986), Vol. 4, Chapter 77, along with interesting applications to game the-
ory, mathematical economics, and numerical mathematics. By the proof of
Example 2 ahead, the Brouwer fixed-point theorem generalizes the classi-
cal intermediate-value theorem for continuous functions, which was proved
first by Bolzano in 1817.
Further important existence principles in mathematics are the following:
the Hahn-Banach theorem (Section 1.1 of AMS Vol. 109);
the Weierstrass existence theorem for minima (Section 2.5 of AMS Vol.
109);
the Baire category theorem (Section 3.1 of AMS Vol. 109).
The Brouwer fixed-point theorem implies the Schauder fixed-point theorem
(cf. Section 1.15). For example, we will prove in Section 1.18 that the
Schauder fixed-point theorem implies the Leray-Schauder principle: a priori
estimates yield existence.
Corollary 1. The continuous operator
$$B\colon K \to K$$
has at least one fixed point when $K$ is a subset of a normed space that is homeomorphic to a set $M$ as considered in Theorem 1.B.
FIGURE 1.14.
The proof of Theorem 1.B will be given in Section 1.14.4.

We first show that Corollary 1 is a simple consequence of Theorem 1.B.
Proof of Corollary 1. Let $C\colon M \to K$ be a homeomorphism. Then, the operator
$$C^{-1} \circ B \circ C\colon M \xrightarrow{\ C\ } K \xrightarrow{\ B\ } K \xrightarrow{\ C^{-1}\ } M$$
is continuous. By Theorem 1.B, there exists a fixed point $u$ of $A := C^{-1} \circ B \circ C$, i.e.,
$$C^{-1}(B(Cu)) = u, \quad u \in M.$$
Letting $v = Cu$, this implies $Bv = v$, $v \in K$, i.e., $B$ has a fixed point. □
Example 2. Let $M = [a, b]$, where $-\infty < a < b < \infty$. Then, each continuous function
$$A\colon [a, b] \to [a, b]$$
has a fixed point $u$ (see Figure 1.14).
This is the simplest special case of the Brouwer fixed-point theorem (Theorem 1.B). Let us give a direct proof. To this end, we set
$$B(u) := A(u) - u \quad\text{for all } u \in [a, b].$$
Since $A(a), A(b) \in [a, b]$, we get $A(a) \ge a$ and $A(b) \le b$. Hence
$$B(a) \ge 0 \quad\text{and}\quad B(b) \le 0.$$
By the intermediate-value theorem, the continuous real function $B$ has a zero $u \in [a, b]$, i.e., $B(u) = 0$. Hence $A(u) = u$. □
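The direct proof of Example 2 translates into a bisection search for a zero of $B(u) = A(u) - u$. The following minimal sketch uses $A = \cos$ on $[0, 1]$ as an assumed example (the cosine maps $[0, 1]$ into itself); the function name `fixed_point` is hypothetical.

```python
import math

def fixed_point(A, a, b, tol=1e-12):
    """Locate a fixed point of a continuous A: [a, b] -> [a, b] by bisection
    on B(u) = A(u) - u, keeping B(lo) >= 0 and B(hi) <= 0 throughout."""
    lo, hi = a, b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if A(mid) - mid >= 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

u = fixed_point(math.cos, 0.0, 1.0)
```

Bisection only locates one fixed point; the theorem itself makes no uniqueness claim.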
1.14.1 Intuitive Proof of the Brouwer Fixed-Point Theorem

Let $M$ be a closed disk in $\mathbb{R}^2$, and let $A\colon M \to M$ be a continuous operator.
We want to use a simple intuitive argument in order to prove that A has a
fixed point.
FIGURE 1.15.
Suppose $A\colon M \to M$ were a fixed-point free operator, i.e., $Au \ne u$ for all $u \in M$. Then, we can construct an operator
$$R\colon M \to \partial M \tag{47}$$
as follows. For each point $u \in M$ follow the directed line segment from the point $Au$ through the point $u$ to its intersection with the boundary $\partial M$, and let the intersection point be $Ru$, as in Figure 1.15. Obviously, the operator $R$ in (47) is a so-called retraction, that is, $R$ is continuous and
$$Ru = u \quad\text{for all } u \in \partial M.$$
Intuitively, such a retraction does not exist. This is the desired contradiction.
However, a rigorous proof for the nonexistence of a retraction of the form (47) is highly nontrivial. Such a proof can be found in Zeidler (1986), Vol. 1, p. 51, by means of the mapping degree, which represents an important tool from topology.
At this place we want to give a completely different proof, one that uses
only elementary facts about simplices. This elegant proof was discovered
by Knaster, Kuratowski, and Mazurkiewicz in 1929.
Remark 3. Intuitively, all the closed sets M pictured in Figure 1.16 are
homeomorphic to a 2-simplex, and hence by Corollary 1 the Brouwer fixed-
point theorem applies to these sets.
Remark 4 (Counterexamples). We want to show through counterexam-

ples that each of the assumptions of the Brouwer fixed-point theorem is
essential.
(i) Let $M := [0, 1]$. The function $A\colon M \to M$ pictured in Figure 1.17(a) has no fixed point. The set $M$ is compact and convex, but $A$ is not continuous.
FIGURE 1.16. Homeomorphic sets (a), (b), (c).
FIGURE 1.17. (a), (b).
(ii) Let $M := \mathbb{R}$. The continuous function $A\colon M \to M$ defined through $Au := u + 1$ has no fixed point. The set $M$ is convex, but not compact.
(iii) Let $M$ be a closed annulus as pictured in Figure 1.17(b). Then, a proper rotation $A\colon M \to M$ of the annulus around the center is fixed-point free. Here, the operator $A$ is continuous and $M$ is compact, but $M$ is not convex.
For our proof of the Brouwer fixed-point theorem we need the prepara-
tions that we consider in Sections 1.14.2 and 1.14.3.
1.14.2 The Sperner Lemma

Let
$$S = \operatorname{co}\{u_0, \ldots, u_N\}$$
be an N-simplex with $N \ge 1$. By a triangulation of $S$ we mean a finite collection
$$S_1, \ldots, S_J \tag{48}$$
of N-simplices $S_j$ such that
(a) $S = \bigcup_{j=1}^{J} S_j$, and
(b) if $j \ne k$, then the intersection $S_j \cap S_k$ is either empty or a common face of dimension $\le N - 1$ (cf. Figure 1.18).
FIGURE 1.18.
Lemma 5. Let one of the numbers $0, 1, \ldots, N$ be associated with each vertex $v$ of the simplices $S_j$ in (48). Suppose that if
$$v \in \operatorname{co}\{u_{i_0}, \ldots, u_{i_k}\}, \quad k = 0, \ldots, N, \tag{49}$$
then one of the numbers $i_0, \ldots, i_k$ is associated with $v$.
By definition, $S_j$ is called a Sperner simplex iff all of its vertices carry different numbers, i.e., the vertices of $S_j$ carry the numbers $0, 1, \ldots, N$.
Then, the number of Sperner simplices is odd.
Condition (49) means the following. Each vertex $u_j$ of the original simplex $S$ carries the number $j$. Moreover, let $\mathcal{F}$ be that lowest-dimensional face of the original simplex $S$ that contains the point $v$. Then, the number associated with $v$ is equal to one of the numbers of the vertices of $\mathcal{F}$ (cf. Figure 1.19).
Proof. Step 1: Let $N = 1$ (Figure 1.19(a)). Then, each $S_j$ is a 1-simplex (segment). A 0-face (vertex) of $S_j$ is called distinguished iff it carries the number 0. We have exactly the following two possibilities:
(i) $S_j$ has precisely one distinguished $(N-1)$-face (i.e., $S_j$ is a Sperner simplex);
(ii) $S_j$ has precisely two or no distinguished $(N-1)$-faces (i.e., $S_j$ is not a Sperner simplex).
Count the distinguished 0-faces separately for each segment $S_j$ and add up. Since the distinguished 0-faces occur twice in the interior and once on the boundary, the total number of distinguished 0-faces counted in this way is odd. Hence, by (i) and (ii), the number of Sperner simplices is odd.
FIGURE 1.19. (a), (b). Label in figure: Sperner simplex.
Step 2: Let $N = 2$ (Figure 1.19(b)). Then, $S_j$ is a 2-simplex. A 1-face (segment) of $S_j$ is called distinguished iff it carries the numbers 0, 1. Then, conditions (i) and (ii) above are satisfied for $N = 2$.
The distinguished 1-faces occur twice in the interior. By (49), the distinguished 1-faces on the boundary are subsets of $\operatorname{co}\{u_0, u_1\}$. It follows from Step 1 that the number of distinguished 1-faces on $\operatorname{co}\{u_0, u_1\}$ is odd. Thus, the total number of distinguished 1-faces is odd, and hence the number of Sperner 2-simplices is also odd.
Step 3: Induction. Let $N \ge 3$. Suppose that the lemma is true for $N - 1$. Then it is also true for $N$. This follows as in Step 2. In this connection, an $(N-1)$-face of $S_j$ is called distinguished iff its vertices carry the numbers $0, 1, \ldots, N - 1$. □
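The case $N = 1$ of the Sperner lemma is easy to check by experiment. In the sketch below, a triangulation of the segment $\operatorname{co}\{u_0, u_1\}$ is represented only by the list of labels of its consecutive vertices; the first label is 0 and the last is 1, as condition (49) requires, while interior labels are arbitrary. The function names and the random test setup are assumptions made for this illustration.

```python
import random

def count_sperner_segments(labels):
    """Number of consecutive vertex pairs carrying both labels 0 and 1."""
    return sum(1 for p, q in zip(labels, labels[1:]) if p != q)

def check_random_labelings(trials=1000, max_interior=20, seed=0):
    """Return True if every random admissible labeling has an odd number
    of Sperner segments."""
    rng = random.Random(seed)
    for _ in range(trials):
        interior = [rng.randint(0, 1) for _ in range(rng.randint(0, max_interior))]
        labels = [0] + interior + [1]   # endpoints u_0, u_1 carry 0 and 1
        if count_sperner_segments(labels) % 2 != 1:
            return False
    return True
```

A random test of this kind does not replace the counting argument of Step 1; it merely illustrates the parity statement.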
1.14.3 The Lemma of Knaster, Kuratowski, and Mazurkiewicz

Lemma 6. Let $S = \operatorname{co}\{u_0, \ldots, u_N\}$ be an N-simplex in a finite-dimensional normed space $X$, where $N = 0, 1, \ldots$. Suppose that we are given closed sets $C_0, \ldots, C_N$ in $X$ such that
$$\operatorname{co}\{u_{i_0}, \ldots, u_{i_k}\} \subseteq \bigcup_{m=0}^{k} C_{i_m} \tag{50}$$
for all possible systems of indices $\{i_0, \ldots, i_k\}$ and all $k = 0, \ldots, N$.
Then, there exists a point $v$ in $S$ such that $v \in C_j$ for all $j = 0, \ldots, N$.
Proof. For $N = 0$, $S$ consists of a single point, and the statement is trivial.
Now let $N \ge 1$.
Step 1: Consider a triangulation $S_1, \ldots, S_J$ of $S$. Let $v$ be any vertex of $S_j$, $j = 1, \ldots, J$, where
$$v \in \operatorname{co}\{u_{i_0}, \ldots, u_{i_k}\} \quad\text{for some } k = 0, \ldots, N.$$
By (50), there is an index $i_m \in \{i_0, \ldots, i_k\}$ such that
$$v \in C_{i_m}.$$
We associate the number $i_m$ with the vertex $v$, so that condition (49) is satisfied. It follows from the Sperner lemma (Lemma 5) that there is a Sperner simplex $S_j$ whose vertices carry the numbers $0, \ldots, N$. Hence the vertices $v_0, \ldots, v_N$ of $S_j$ satisfy the condition
$$v_k \in C_k \quad\text{for all } k = 0, \ldots, N.$$
Step 2: We now consider a sequence of triangulations of the simplex $S$ such that the diameters of the simplices of the triangulation go to zero. For example, one can choose a sequence of barycentric subdivisions of $S$.
By Step 1, there are points
$$v_k^{(n)} \in C_k \quad\text{for all } k = 0, \ldots, N \text{ and } n = 1, 2, \ldots$$
such that
$$\lim_{n \to \infty} \operatorname{diam} \operatorname{co}\{v_0^{(n)}, \ldots, v_N^{(n)}\} = 0. \tag{51}$$
Since the simplex $S$ is compact, there exists a subsequence, again denoted by $(v_k^{(n)})$, such that
$$v_0^{(n)} \to v \quad\text{as } n \to \infty \text{ and } v \in S.$$
By (51),
$$v_k^{(n)} \to v \quad\text{as } n \to \infty \text{ for all } k = 0, \ldots, N.$$
Since the set $C_k$ is closed, this implies
$$v \in C_k \quad\text{for all } k = 0, \ldots, N. \qquad □$$
1.14.4 Proof of the Brouwer Fixed-Point Theorem

Step 1: Simplices. Let $S$ be an N-simplex in a finite-dimensional normed space, and let the operator
$$A\colon S \to S$$
be continuous, where $N = 0, 1, \ldots$. We want to show that $A$ has a fixed point.
For $N = 0$, the set $S$ consists of a single point and the statement is trivial. For $N = 1$, the proof has been given in Example 2.
Now let $N = 2$. Then, $S = \operatorname{co}\{u_0, u_1, u_2\}$, i.e., $S$ is a triangle. Each point $u$ in $S$ has the representation
$$u = \alpha_0(u) u_0 + \alpha_1(u) u_1 + \alpha_2(u) u_2,$$
where
$$0 \le \alpha_0, \alpha_1, \alpha_2 \le 1 \quad\text{and}\quad \alpha_0 + \alpha_1 + \alpha_2 = 1. \tag{52}$$
With
$$u - u_0 = \alpha_1(u)(u_1 - u_0) + \alpha_2(u)(u_2 - u_0)$$
and $\alpha_0(u) = 1 - \alpha_1(u) - \alpha_2(u)$, it follows from the linear independence of $u_1 - u_0$, $u_2 - u_0$ that the barycentric coordinates $\alpha_0(u)$, $\alpha_1(u)$, and $\alpha_2(u)$ of the point $u$ are uniquely determined by $u$ and depend continuously on $u$, by Proposition 5 in Section 1.12.
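We set
$$C_j := \{u \in S : \alpha_j(Au) \le \alpha_j(u)\}, \quad j = 0, 1, 2.$$
Since $\alpha_j(\cdot)$ and $A$ are continuous on $S$, the set $C_j$ is closed. Furthermore, the crucial condition (50) of the lemma of Knaster, Kuratowski, and Mazurkiewicz is satisfied, i.e.,
$$\operatorname{co}\{u_{i_0}, \ldots, u_{i_k}\} \subseteq \bigcup_{m=0}^{k} C_{i_m}, \quad k = 0, 1, 2.$$
In fact, if this is not true, then there exists a point $u \in \operatorname{co}\{u_{i_0}, \ldots, u_{i_k}\}$ such that $u \notin \bigcup_{m=0}^{k} C_{i_m}$, i.e.,
$$\alpha_{i_m}(Au) > \alpha_{i_m}(u) \quad\text{for all } m = 0, \ldots, k \text{ and some } k = 0, 1, 2. \tag{53}$$
This is a contradiction to (52). In fact, if we renumber the vertices, if necessary, condition (53) means that
$$\alpha_j(Au) > \alpha_j(u) \quad\text{for all } j = 0, \ldots, k \text{ and some } k = 0, 1, 2. \tag{53*}$$
In addition, since $u \in S$ and $Au \in S$, it follows from (52) that
$$\alpha_0(u) + \alpha_1(u) + \alpha_2(u) = \alpha_0(Au) + \alpha_1(Au) + \alpha_2(Au) = 1. \tag{53**}$$
For $k = 2$, relation (53*) is impossible, by (53**). If $k = 1$ or $k = 0$, then $u \in \operatorname{co}\{u_0, u_1\}$ or $u \in \operatorname{co}\{u_0\}$, and hence
$$\alpha_2(u) = 0 \quad\text{or}\quad \alpha_1(u) = \alpha_2(u) = 0,$$
respectively. Again, (53*) contradicts (53**), since the coordinates of $Au$ are nonnegative.
The lemma of Knaster, Kuratowski, and Mazurkiewicz (Lemma 6) tells us now that there is a point $v \in S$ such that
$$v \in C_j \quad\text{for all } j = 0, 1, 2.$$
This implies
$$\alpha_j(Av) \le \alpha_j(v) \quad\text{for all } j = 0, 1, 2.$$
According to (53**) with $u = v$, we get $\alpha_j(Av) = \alpha_j(v)$ for $j = 0, 1, 2$, and hence $Av = v$. Thus, $v$ is the desired fixed point of $A$ in the case where $N = 2$.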
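The role of barycentric coordinates in the case $N = 2$ can be illustrated numerically. The sketch below assumes the triangle with vertices $(0,0)$, $(1,0)$, $(0,1)$ and a hypothetical continuous map $A$ that cyclically shifts the barycentric coordinates; the sets `in_C(j, ·)` test membership in $\{u \in S : \alpha_j(Au) \le \alpha_j(u)\}$, which is the form of $C_j$ assumed here. All names are chosen for illustration only.

```python
U0, U1, U2 = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)

def barycentric(u):
    """Coordinates (a0, a1, a2) with u = a0*U0 + a1*U1 + a2*U2 and a0+a1+a2 = 1,
    obtained from u - U0 = a1*(U1 - U0) + a2*(U2 - U0) by Cramer's rule."""
    d1 = (U1[0] - U0[0], U1[1] - U0[1])
    d2 = (U2[0] - U0[0], U2[1] - U0[1])
    w = (u[0] - U0[0], u[1] - U0[1])
    det = d1[0] * d2[1] - d1[1] * d2[0]
    a1 = (w[0] * d2[1] - w[1] * d2[0]) / det
    a2 = (d1[0] * w[1] - d1[1] * w[0]) / det
    return (1.0 - a1 - a2, a1, a2)

def from_barycentric(al):
    """Point of the triangle with the given barycentric coordinates."""
    return (al[0] * U0[0] + al[1] * U1[0] + al[2] * U2[0],
            al[0] * U0[1] + al[1] * U1[1] + al[2] * U2[1])

def A(u):
    """Hypothetical continuous self-map of S: cyclic shift of the coordinates."""
    a0, a1, a2 = barycentric(u)
    return from_barycentric((a2, a0, a1))

def in_C(j, u, eps=1e-12):
    """Membership test for C_j = {u in S : alpha_j(Au) <= alpha_j(u)}."""
    return barycentric(A(u))[j] <= barycentric(u)[j] + eps

b = from_barycentric((1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0))  # barycenter of S
```

For this particular map the barycenter lies in all three sets and is the fixed point, while each vertex $u_j$ lies in $C_j$, in agreement with condition (50) for $k = 0$.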
If $N \ge 3$, then use the same argument as for $N = 2$ above.
Step 2: Let $M$ be a compact, convex, nonempty subset of a finite-dimensional normed space. By Proposition 9 in Section 1.13, the set $M$ is homeomorphic to some N-simplex $S$. Using Step 1, the same argument as in the proof of Corollary 1 shows that each continuous operator $A\colon M \to M$ has a fixed point.
This finishes the proof of the Brouwer fixed-point theorem (Theorem 1.B). □
1.15 The Schauder Fixed-Point Theorem

Theorem 1.C. The compact operator
$$A\colon M \to M$$
has at least one fixed point when $M$ is a bounded, closed, convex, nonempty subset of a Banach space $X$ over $\mathbb{K}$.
This theorem was proved by Schauder in 1930. If $\dim X < \infty$, then Theorem 1.C coincides with the Brouwer fixed-point theorem (Theorem 1.B in Section 1.14).
Proof. Let $u_0 \in M$. Replacing $u$ with $u - u_0$, if necessary, we may assume that $0 \in M$.
It follows from the approximation theorem for compact operators (Proposition 13 in Section 1.11) that, for every $n = 1, 2, \ldots$, there exists a finite-dimensional subspace $X_n$ of $X$ and a continuous operator
$$A_n\colon M \to X_n$$
such that
$$\|Au - A_n u\| \le \frac{1}{n} \quad\text{for all } u \in M. \tag{54}$$
Define
$$M_n := X_n \cap M.$$
Then, $M_n$ is a bounded, closed, convex subset of $X_n$ with $0 \in M_n$ and $A_n(M) \subseteq \operatorname{co} A(M) \subseteq M$, since $M$ is convex.
By the Brouwer fixed-point theorem (Theorem 1.B in Section 1.14), the operator $A_n\colon M_n \to M_n$ has a fixed point $u_n$, i.e.,
$$A_n u_n = u_n, \quad u_n \in M_n \quad\text{for all } n = 1, 2, \ldots. \tag{55}$$
By (54),
$$\|Au_n - u_n\| \le \frac{1}{n} \quad\text{for all } n = 1, 2, \ldots. \tag{56}$$
Since $M_n \subseteq M$ for all $n$, the sequence $(u_n)$ is bounded. The compactness of the operator $A\colon M \to M$ implies that there is a subsequence, again denoted by $(u_n)$, such that
$$Au_n \to v \quad\text{as } n \to \infty.$$
By (56), $\|v - u_n\| \le \|v - Au_n\| + \|Au_n - u_n\| \to 0$ as $n \to \infty$. Hence
$$u_n \to v \quad\text{as } n \to \infty.$$
Since $Au_n \in M$ for all $n$ and the set $M$ is closed, we get $v \in M$. Finally, since the operator $A\colon M \to M$ is continuous, it follows that
$$Av = v, \quad v \in M. \qquad □$$

1.16 Applications to Integral Equations

We want to solve the integral equation
$$u(x) = \lambda \int_a^b F(x, y, u(y))\,dy, \quad a \le x \le b, \tag{57}$$
where $-\infty < a < b < \infty$ and $\lambda \in \mathbb{R}$. Let
$$Q := \{(x, y, u) \in \mathbb{R}^3 : x, y \in [a, b],\ |u| \le r\} \quad\text{for fixed } r > 0.$$
Proposition 1. Assume the following:
(a) The function $F\colon Q \to \mathbb{R}$ is continuous.
(b) We define $\mathcal{M} := (b - a) \max_{(x,y,u) \in Q} |F(x, y, u)|$. Let the real number $\lambda$ be given such that $|\lambda|\mathcal{M} \le r$.
(c) We set $X := C[a, b]$ and $M := \{u \in X : \|u\| \le r\}$.
Then, the original integral equation (57) has at least one solution $u \in M$.
This generalizes Proposition 1 in Section 1.7.
Proof. Define the operator
$$(Au)(x) := \lambda \int_a^b F(x, y, u(y))\,dy \quad\text{for all } x \in [a, b].$$
Then, the integral equation (57) corresponds to the following fixed-point problem:
$$u = Au, \quad u \in M. \tag{57*}$$
The operator $A\colon M \to M$ is compact, by Standard Example 12 in Section 1.11. For each $u \in M$,
$$\|Au\| \le |\lambda| \max_{a \le x \le b} \Big|\int_a^b F(x, y, u(y))\,dy\Big| \le |\lambda|\mathcal{M} \le r.$$
Hence $A(M) \subseteq M$.
Thus, the Schauder fixed-point theorem (Theorem 1.C in Section 1.15) tells us that equation (57*) has a solution. □
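The self-mapping property $A(M) \subseteq M$ used in the proof can be checked on a discretized example. The sketch below assumes the hypothetical kernel $F(x, y, u) = \sin(x + y + u)$ on $[0, 1]$ with $r = 1$ and $\lambda = 0.5$, and replaces the integral by the trapezoid rule; none of these choices come from the text. The Schauder theorem itself gives no algorithm; the simple iteration used here converges only because this particular kernel is in addition Lipschitz in $u$ with a small constant.

```python
import math

def apply_A(u, xs, lam, F):
    """Trapezoid-rule version of (Au)(x) = lam * int_a^b F(x, y, u(y)) dy
    on the uniform grid xs; u holds the values u(xs[k])."""
    n = len(xs) - 1
    h = (xs[-1] - xs[0]) / n
    result = []
    for x in xs:
        vals = [F(x, y, uy) for y, uy in zip(xs, u)]
        integral = h * (0.5 * vals[0] + sum(vals[1:-1]) + 0.5 * vals[-1])
        result.append(lam * integral)
    return result

a, b, r, lam = 0.0, 1.0, 1.0, 0.5
F = lambda x, y, u: math.sin(x + y + u)      # hypothetical kernel, |F| <= 1
xs = [a + (b - a) * k / 50 for k in range(51)]

u = [r] * len(xs)                            # start at a point of the ball M
for _ in range(60):                          # iterate u <- Au
    u = apply_A(u, xs, lam, F)
residual = max(abs(s - t) for s, t in zip(u, apply_A(u, xs, lam, F)))
```

Here $|\lambda|(b - a)\max|F| \le 0.5 \le r$, so every iterate stays in the ball of radius $r$.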
1.17 Applications to Ordinary Differential Equations

Let us consider the following initial-value problem:
$$u' = F(x, u), \quad x_0 - h \le x \le x_0 + h, \qquad u(x_0) = u_0, \tag{58}$$
where the point $(x_0, u_0) \in \mathbb{R}^2$ is given. Set $S := \{(x, u) \in \mathbb{R}^2 : |x - x_0| \le r,\ |u - u_0| \le r\}$ for fixed $r > 0$.
Proposition 1 (The Peano theorem). Assume the following:
(a) The function $F\colon S \to \mathbb{R}$ is continuous.
(b) We set $\mathcal{M} := \max_{(x,u) \in S} |F(x, u)|$, and we choose a number $h$ in such a way that $0 < h \le r$ and $h\mathcal{M} \le r$.
Then, the original initial-value problem (58) has at least one solution.
This generalizes the Picard-Lindelof theorem from Section 1.8 based on
the Banach fixed-point theorem. In contrast to the Picard-Lindelof theo-
rem, the Peano theorem does not tell us that the solution is unique. The
following proof will be based on the Schauder fixed point theorem.
Proof. Let us consider the integral equation
$$u(x) = u_0 + \int_{x_0}^{x} F(y, u(y))\,dy, \quad x_0 - h \le x \le x_0 + h. \tag{59}$$
Let $a := x_0 - h$ and $b := x_0 + h$. Set $X := C[a, b]$ and
$$M := \{u \in X : \|u - u_0\| \le r\}.$$
Define the operator
$$(Au)(x) := u_0 + \int_{x_0}^{x} F(y, u(y))\,dy \quad\text{for all } x \in [a, b].$$
Then, the following are true:
(i) $A\colon M \to X$ is continuous.
(ii) $A(M)$ is equicontinuous.
(iii) $A(M) \subseteq M$, i.e., in particular, the set $A(M)$ is bounded.
Ad (i), (ii). This follows as in the proof of Standard Example 12 in Section 1.11. In this connection, observe that
$$\Big|\int_{x_0}^{x} F(y, u(y))\,dy - \int_{x_0}^{z} F(y, u(y))\,dy\Big| = \Big|\int_{z}^{x} F(y, u(y))\,dy\Big| \le |x - z|\,\mathcal{M}$$
for all $x, z \in [a, b]$ and $u \in M$, where $\mathcal{M}$ is the maximum of $|F|$ on $S$ from assumption (b).
Ad (iii). If $u \in M$, then
$$\|Au - u_0\| = \max_{a \le x \le b} \Big|\int_{x_0}^{x} F(y, u(y))\,dy\Big| \le h\mathcal{M} \le r.$$
By the Arzelà-Ascoli theorem (Standard Example 7 in Section 1.11), it follows from (ii), (iii) that the set $A(M)$ is relatively compact in $X$. Since the set $M$ is bounded, this together with (i) implies the compactness of the operator $A\colon M \to M$. The Schauder fixed-point theorem (Theorem 1.C in Section 1.15) tells us that the operator equation
$$Au = u, \quad u \in M,$$
has a solution, i.e., the integral equation (59) has a solution $u \in M$.
Differentiating the integral equation (59) with respect to $x$, we see that $u$ is also a solution of the original problem (58). □
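That the Peano theorem cannot assert uniqueness is shown by a classical example, which is not taken from the text: for $F(x, u) = \sqrt{|u|}$ with $x_0 = u_0 = 0$, both $u \equiv 0$ and $u(x) = x|x|/4$ solve the problem. The sketch below checks numerically that both functions satisfy the integral equation (59) on $[-1, 1]$ (take $r = h = 1$); the helper `integral_residual` and the midpoint-rule quadrature are assumptions of this illustration.

```python
import math

def F(x, u):
    """Right-hand side F(x, u) = sqrt(|u|): continuous, but not Lipschitz at u = 0."""
    return math.sqrt(abs(u))

def integral_residual(u, x, x0=0.0, u0=0.0, n=2000):
    """|u(x) - u0 - int_{x0}^{x} F(y, u(y)) dy|, with the integral
    approximated by the midpoint rule on n subintervals."""
    step = (x - x0) / n
    total = 0.0
    for k in range(n):
        y = x0 + (k + 0.5) * step
        total += F(y, u(y))
    return abs(u(x) - u0 - total * step)

zero_solution = lambda x: 0.0
parabola_solution = lambda x: x * abs(x) / 4.0
```

Both residuals vanish up to rounding, so the integral equation (59), and hence the initial-value problem (58), has at least two solutions in this example.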
1.18 The Leray-Schauder Principle and a priori Estimates

Let $X$ be a Banach space. We want to solve the equation
$$u = Au, \quad u \in X, \tag{60}$$
by using properties of the parametrized equation
$$u = tAu, \quad u \in X, \quad 0 \le t < 1. \tag{61}$$
For $t = 0$, equation (61) has the trivial solution $u = 0$, whereas (61) coincides with (60) if $t = 1$. The following condition is crucial:
(A) A priori estimate. There is a number $r > 0$ such that if $u$ is a solution of (61), then
$$\|u\| \le r.$$
Observe that we do not assume that equation (61) has a solution. Condition (A) is satisfied trivially if the set $A(X)$ is bounded, i.e., there is a number $r > 0$ such that $\|Au\| \le r$ for all $u \in X$.
Theorem 1.D. Suppose that the compact operator $A\colon X \to X$ on the Banach space $X$ over $\mathbb{K}$ satisfies condition (A).
Then, the original equation (60) has a solution.
This theorem was proved by Leray and Schauder in 1934. Roughly speaking, Theorem 1.D corresponds to the following important principle in mathematics:
A priori estimates yield existence.
A typical application of this principle to the famous Navier-Stokes equations for viscous fluids will be considered in Section 5.17 of AMS Vol. 109. Further applications to a general class of quasilinear elliptic partial differential equations can be found in Zeidler (1986), Vol. 1, Chapter 6.
Proof. Set $M := \{u \in X : \|u\| \le 2r\}$. We define an operator
$$Bu := \begin{cases} Au & \text{if } \|Au\| \le 2r, \\ \dfrac{2rAu}{\|Au\|} & \text{if } \|Au\| > 2r. \end{cases}$$
Obviously, $\|Bu\| \le 2r$ for all $u \in X$, i.e., $B(M) \subseteq M$. We claim that $B\colon M \to M$ is compact. In fact, $B$ is continuous. This follows from the continuity of the operator $A\colon X \to X$ and from
$$Au = \frac{2rAu}{\|Au\|} \quad\text{if } \|Au\| = 2r,$$
by using the $(\varepsilon - \delta)$-definition of continuity (Definition 1 in Section 1.9).

To establish compactness, let $(u_n)$ be a sequence in the ball $M$. We consider two cases, namely, there is a subsequence $(v_n)$ of $(u_n)$ such that
(a) $\|Av_n\| \le 2r$ for all $n$;
(b) $\|Av_n\| > 2r$ for all $n$.
In case (a), the boundedness of the set $M$ and the compactness of the operator $A$ imply that there is a subsequence $(w_n)$ of $(v_n)$ such that $Bw_n = Aw_n \to z$ as $n \to \infty$.
In case (b), one can choose a subsequence $(w_n)$ of $(v_n)$ so that
$$\frac{1}{\|Aw_n\|} \to \alpha \quad\text{and}\quad Aw_n \to z \quad\text{as } n \to \infty$$
for suitable $\alpha$ and $z$, since the sequence $\big(\|Aw_n\|^{-1}\big)$ is bounded and the operator $A$ is compact. Hence
$$Bw_n \to 2r\alpha z \quad\text{as } n \to \infty.$$
The Schauder fixed-point theorem (Theorem 1.C in Section 1.15) applied to the compact operator $B\colon M \to M$ provides us with a point $u \in M$ such that
$$u = Bu.$$
If $\|Au\| \le 2r$, then $Bu = Au$, and hence $u = Au$, i.e., $u$ is a solution of the original problem (60).
The other case $\|Au\| > 2r$ is impossible by the a priori estimate (A). In fact, let $u = Bu$ with $\|Au\| > 2r$. Then
$$u = Bu = tAu \quad\text{with } t := \frac{2r}{\|Au\|} < 1. \tag{62}$$
This forces $\|u\| = |t| \cdot \|Au\| = 2r$. On the other hand, equation (62) implies $\|u\| \le r$, by (A), which is a contradiction. □
Applications can be found in Problems 1.10 and LIp.
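The truncated operator $B$ from the proof of Theorem 1.D is a radial cut-off of $A$ at the sphere of radius $2r$. A minimal finite-dimensional sketch, assuming $X = \mathbb{R}^n$ with the Euclidean norm (the helper names `norm`, `truncate`, and `make_B` are hypothetical):

```python
import math

def norm(w):
    """Euclidean norm on R^n."""
    return math.sqrt(sum(t * t for t in w))

def truncate(w, r):
    """Return w if ||w|| <= 2r, and the radial projection 2r*w/||w|| otherwise."""
    n = norm(w)
    if n <= 2.0 * r:
        return tuple(w)
    return tuple(2.0 * r * t / n for t in w)

def make_B(A, r):
    """Build Bu := truncate(Au, r), so that ||Bu|| <= 2r for all u."""
    return lambda u: truncate(A(u), r)
```

Since both branches of `truncate` agree on the sphere $\|w\| = 2r$, the cut-off is continuous, which is the observation used in the proof.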
1.19 Sub- and Supersolutions, and the Iteration Method in Ordered Banach Spaces
The idea of ordered Banach spaces is to introduce a relation
$$u \le v,$$
which generalizes the corresponding relation for real numbers.
Definition 1. A subset $X_+$ of a normed space $X$ is called an order cone iff the following are true:
(i) $X_+$ is closed, convex, and nonempty, and $X_+ \ne \{0\}$.
(ii) If $u \in X_+$ and $\alpha \ge 0$, then $\alpha u \in X_+$.

(iii) If $u \in X_+$ and $-u \in X_+$, then $u = 0$.
Let $u, v \in X$. We define
$$u \le v \quad\text{iff}\quad v - u \in X_+.$$
By an ordered normed space (resp., ordered Banach space) we understand a normed space (resp., Banach space) together with an order cone.
We also define the order interval
$$[u, w] := \{v \in X : u \le v \le w\}.$$
The order cone $X_+$ is called normal iff there is a number $c > 0$ such that
$$0 \le u \le v \quad\text{implies}\quad \|u\| \le c\|v\|.$$
Example 2. The Banach space $X := \mathbb{R}$ with the norm $\|u\| := |u|$ is an ordered Banach space with the order cone $X_+ := \mathbb{R}_+ := \{u \in X : u \ge 0\}$. Here, the order relation $u \le v$ in $X$ coincides with the corresponding classical relation. Since $0 \le u \le v$ implies $|u| \le |v|$, the order cone $X_+$ is normal.
Example 3. The Banach space $X := \mathbb{R}^N$, $N = 1, 2, \ldots$, with the Euclidean norm $|\cdot|$ is an ordered Banach space with the order cone
$$X_+ := \mathbb{R}^N_+ := \{(\xi_1, \ldots, \xi_N) \in \mathbb{R}^N : \xi_j \ge 0 \text{ for all } j\}.$$
Here,
$$(\xi_1, \ldots, \xi_N) \le (\eta_1, \ldots, \eta_N) \quad\text{iff}\quad \xi_j \le \eta_j \text{ for all } j. \tag{63}$$
By (63), $0 \le x \le y$ implies $0 \le \xi_j \le \eta_j$ for all $j$, and hence $|x| \le |y|$. Thus, the order cone $X_+$ is normal. The order cone $\mathbb{R}^2_+$ in $\mathbb{R}^2$ is pictured in Figure 1.20.
Standard Example 4. The Banach space $X := C[a, b]$, $-\infty < a < b < \infty$, with the usual norm $\|u\| := \max_{a \le x \le b} |u(x)|$, is an ordered Banach space with the normal order cone
$$X_+ := C_+[a, b] = \{u \in C[a, b] : u(x) \ge 0 \text{ on } [a, b]\}.$$
Here,
$$u \le v \text{ on } X \quad\text{iff}\quad u(x) \le v(x) \text{ on } [a, b].$$
It follows from $0 \le u \le v$ in $X$ that $0 \le u(x) \le v(x)$ on $[a, b]$, and hence $\|u\| \le \|v\|$. Thus, the order cone $X_+$ is normal.
FIGURE 1.20.
The following proposition shows that the relation u :::; v has the usual
properties.
Proposition 5. Let $u, v, w, u_n, v_n \in X$ for all $n$, where $X_+$ is an order cone in the Banach space $X$. Then:
(i) $u \le v$ and $v \le w$ imply $u \le w$.
(ii) $u \le v$ and $v \le u$ imply $u = v$.
(iii) $u \le v$ implies $u + w \le v + w$ and $\alpha u \le \alpha v$ for all $\alpha \ge 0$.
(iv) $u_n \le v_n$ for all $n$ and $u_n \to u$ and $v_n \to v$ as $n \to \infty$ imply $u \le v$.
(v) If the order cone $X_+$ is normal, then $u \le v \le w$ implies $\|v - u\| \le c\|w - u\|$ and $\|w - v\| \le c\|w - u\|$.
Proof. Ad (i). $v - u \in X_+$ and $w - v \in X_+$ imply $2^{-1}(v - u) + 2^{-1}(w - v) \in X_+$, since $X_+$ is convex. Hence $(v - u) + (w - v) \in X_+$, i.e., $w - u \in X_+$.
Ad (ii). $v - u \in X_+$ and $-(v - u) \in X_+$ imply $v - u = 0$.
Ad (iii). $v - u \in X_+$ implies $(v + w) - (u + w) \in X_+$ and $\alpha(v - u) \in X_+$ for all $\alpha \ge 0$.
Ad (iv). If $v_n - u_n \in X_+$ for all $n$ and $u_n \to u$ and $v_n \to v$ as $n \to \infty$, then $v - u \in X_+$, since $X_+$ is closed.
Ad (v). Adding $-u$ to $u \le v \le w$, we get $0 \le v - u \le w - u$, and hence
$$\|v - u\| \le c\|w - u\|.$$
Moreover, $u \le v \le w$ implies $v - u \in X_+$ and $(w - u) - (w - v) \in X_+$. Hence $0 \le w - v \le w - u$. This yields $\|w - v\| \le c\|w - u\|$. □
We now want to solve the operator equation
\[ u = Au, \qquad u \in X, \tag{64} \]
by means of the two iteration methods
\[ u_{n+1} = Au_n \quad \text{and} \quad v_{n+1} = Av_n, \qquad n = 0, 1, \ldots, \tag{65} \]
where $u_0, v_0 \in X$ are given.
Theorem 1.E. Suppose that the following conditions are met:
(a) The operator $A: [u_0, v_0] \subseteq X \to X$ is compact, where $X$ is an ordered
Banach space with normal order cone.
(b) The operator $A$ is monotone increasing, i.e., $u \le v$ implies $Au \le Av$.
(c) $u_0$ is a subsolution of (64), i.e., $u_0 \le Au_0$.
(d) $v_0$ is a supersolution of (64), i.e., $Av_0 \le v_0$.
Then, the iterative sequences $(u_n)$ and $(v_n)$ constructed in (65) converge
to solutions $u$ and $v$ of the original equation (64), respectively. In addition,
we have the error estimates
\[ u_0 \le u_1 \le \cdots \le u_n \le u \le v \le v_n \le \cdots \le v_1 \le v_0 \quad \text{for all } n. \tag{66} \]
This theorem corresponds to the following general existence principle in
mathematics:
The existence of both a subsolution and a supersolution yields the existence
of a solution.
Proof. We use the same arguments as in the classical case $X = \mathbb{R}$.
Step 1: Monotonicity of $(u_n)$ and $(v_n)$. For all $n$,
\[ u_0 \le u_1 \le \cdots \le u_n \le v_n \le \cdots \le v_1 \le v_0. \tag{66*} \]
In fact, $u_0 \le Au_0$ implies $u_0 \le u_1$. Since $A$ is monotone increasing, $u_0 \le u_1$
yields $Au_0 \le Au_1$, i.e., $u_1 \le u_2$. Moreover, $u_0 \le v_0$ implies $Au_0 \le Av_0$. By
hypothesis, $Av_0 \le v_0$. Hence $u_1 \le v_1 \le v_0$. Relation (66*) follows now by
induction.
Step 2: Convergence of $(u_n)$. By Proposition 5(v), it follows from (66*)
that
\[ \|v_0 - u_n\| \le c\|v_0 - u_0\| \quad \text{for all } n, \]
i.e., the sequence $(u_n)$ is bounded. Since the operator $A$ is compact, there
exists a subsequence $(u_{n'})$ such that
\[ Au_{n'} \to u \quad \text{as } n' \to \infty. \]
Let $\varepsilon > 0$ be given. Since $u_{n+1} = Au_n$, there is a number $n_0(\varepsilon)$ such that
\[ \|u_{n_0} - u\| < \varepsilon. \]
Letting $n' \to \infty$ in (66*), we get
\[ u_{n_0} \le u_n \le u \quad \text{for all } n \ge n_0(\varepsilon). \]
Hence $\|u - u_n\| \le c\|u - u_{n_0}\| < c\varepsilon$ for all $n \ge n_0(\varepsilon)$, i.e.,
\[ u_n \to u \quad \text{as } n \to \infty. \]
Since the operator $A$ is continuous, letting $n \to \infty$ in $u_{n+1} = Au_n$ produces
\[ u = Au. \]
Step 3: Similarly, one proves that $v_n \to v$ as $n \to \infty$ and $v = Av$. Letting
$n \to \infty$ in (66*), we obtain (66). □
An application to integral equations will be considered in Problem 1.1n.
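The two iterations (65) can be tried out numerically in the classical case $X = \mathbb{R}$. The following Python sketch is only an illustration under stated assumptions: the map $A(u) = \sqrt{u + 2}$, the starting points $u_0 = 0$ and $v_0 = 4$, and the helper name `monotone_iteration` are chosen for this example and do not come from the text. The map is monotone increasing on $[0, 4]$, $u_0$ is a subsolution, $v_0$ is a supersolution, and the unique fixed point in $[0, 4]$ is $u = 2$.

```python
import math

def A(u):
    # Illustrative monotone increasing map on [0, 4]; its fixed point is u = 2.
    return math.sqrt(u + 2.0)

def monotone_iteration(A, u0, v0, steps):
    """Return the lists (u_n) and (v_n) with u_{n+1} = A(u_n), v_{n+1} = A(v_n)."""
    us, vs = [u0], [v0]
    for _ in range(steps):
        us.append(A(us[-1]))
        vs.append(A(vs[-1]))
    return us, vs

# u0 is a subsolution (u0 <= A(u0)), v0 is a supersolution (A(v0) <= v0).
u0, v0 = 0.0, 4.0
us, vs = monotone_iteration(A, u0, v0, 40)

# (u_n) increases, (v_n) decreases, and both approach the same fixed point here.
print(us[:4], vs[:4])
print(us[-1], vs[-1])
```

In this example both sequences enclose the solution from below and above, in the spirit of the error estimates (66); in general the limits $u$ and $v$ need not coincide.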
1.20 Linear Operators

Definition 1. Let $X$ and $Y$ be linear spaces over $\mathbb{K}$. The operator
$A: L \subseteq X \to Y$ is called linear iff $L$ is a linear subspace of $X$ and
\[ A(\alpha u + \beta v) = \alpha Au + \beta Av \quad \text{for all } u, v \in L \text{ and } \alpha, \beta \in \mathbb{K}. \]
Recall that $R(A) := \{v \in Y : v = Au \text{ for some } u \in X\}$. We also introduce
the null space
\[ N(A) := \{u \in X : Au = 0\}. \]
The linear operator $A: X \to Y$ is injective iff $N(A) = \{0\}$.
This follows from $Au - Av = A(u - v)$. In fact, if $N(A) = \{0\}$, then
$Au = Av$ implies $u = v$, i.e., $A$ is injective. Conversely, if $A$ is injective,
then $Au = 0$ implies $u = 0$, i.e., $N(A) = \{0\}$.
Proposition 2. Let $A: X \to Y$ be a linear operator, where $X$ and $Y$ are
normed spaces over $\mathbb{K}$. Then the following two conditions are equivalent:
(i) $A$ is continuous.
(ii) There is a number $c > 0$ such that $\|Au\| \le c\|u\|$ for all $u \in X$.
For a linear continuous operator $A: X \to Y$, we define the operator norm
through
\[ \|A\| := \sup_{\|v\| \le 1} \|Av\|. \tag{67} \]
By (ii), $\|A\| < \infty$. From (67) we get
\[ \|Au\| \le \|A\|\|u\| \quad \text{for all } u \in X, \tag{67*} \]
by letting $v := \|u\|^{-1}u$ for $u \ne 0$. In fact, then $\|v\| = 1$ and
$\|Av\| = \|Au\|\|u\|^{-1} \le \|A\|$.
Obviously, if $X \ne \{0\}$, then
\[ \|A\| = \sup_{\|v\| = 1} \|Av\|. \]
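Proof. (i) $\Rightarrow$ (ii). Because of the linearity of $A$, condition (ii) is equivalent
to $\|Av\| \le c$ for all $v \in X$ with $\|v\| \le 1$.
Let $A$ be continuous. If (ii) is not true, then there is a sequence $(v_n)$ with
\[ \|v_n\| \le 1 \quad \text{and} \quad \|Av_n\| \ge n \quad \text{for all } n = 1, 2, \ldots. \]
Setting $w_n := n^{-1}v_n$, we get $w_n \to 0$ as $n \to \infty$ and
\[ \|Aw_n\| \ge 1 \quad \text{for all } n. \tag{68} \]
Since $A$ is linear, $A(0) = 0$. Moreover, since $A$ is continuous, $w_n \to 0$ as
$n \to \infty$ implies $Aw_n \to 0$ as $n \to \infty$. This contradicts (68).
(ii) $\Rightarrow$ (i). For given $\varepsilon > 0$ choose $\delta = \varepsilon c^{-1}$. Then
\[ \|u - v\| < \delta \quad \text{implies} \quad \|Au - Av\| < \varepsilon, \]
since
\[ \|A(u - v)\| \le c\|u - v\| < \varepsilon. \] □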
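The following proposition tells us that linear continuous operators between
finite-dimensional normed spaces correspond to matrices. The two
basic formulas are given through
\[ A\left(\sum_{n=1}^{N} \xi_n e_n\right) = \sum_{m=1}^{M} \eta_m f_m, \tag{69} \]
where
\[ \eta_m = \sum_{n=1}^{N} a_{mn}\xi_n, \qquad m = 1, \ldots, M. \tag{69*} \]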
Proposition 3. Let $X$ and $Y$ be finite-dimensional normed spaces over $\mathbb{K}$
with $\dim X = N$ and $\dim Y = M$, where $N, M \ge 1$. Let $\{e_1, \ldots, e_N\}$ and
$\{f_1, \ldots, f_M\}$ be a basis in $X$ and $Y$, respectively.
Then, the operator $A: X \to Y$ is linear iff there is an $(M \times N)$-matrix
$(a_{mn})$ with
\[ a_{mn} \in \mathbb{K} \quad \text{for all } n = 1, \ldots, N, \; m = 1, \ldots, M \]
such that the formulas (69) and (69*) hold.
All these operators are continuous.
Proof. Suppose that $A$ is linear. Then $Ae_n \in Y$, and hence there are
numbers $a_{mn}$ in $\mathbb{K}$ such that
\[ Ae_n = \sum_{m=1}^{M} a_{mn} f_m \quad \text{for all } n = 1, \ldots, N. \]
Since $A$ is linear,
\[ A\left(\sum_{n=1}^{N} \xi_n e_n\right) = \sum_{n=1}^{N} \xi_n Ae_n. \]
This yields (69) with (69*).
Conversely, define the operator $A: X \to Y$ through (69) and (69*). Then,
$A$ is linear. Recall that $\|u\|_\infty := \max_n |\xi_n|$ for $u = \xi_1 e_1 + \cdots + \xi_N e_N$. Then
\[ \|Au\|_\infty \le A_\infty \|u\|_\infty \quad \text{for all } u \in X, \]
where
\[ A_\infty := \max_{1 \le m \le M} \sum_{n=1}^{N} |a_{mn}|. \]
Hence
\[ \|A\|_\infty := \sup_{\|u\|_\infty \le 1} \|Au\|_\infty \le A_\infty. \tag{70} \]
Since each norm $\|\cdot\|$ on a finite-dimensional normed space is equivalent to
the norm $\|\cdot\|_\infty$, we get
\[ \|Au\| \le C\|u\| \quad \text{for all } u \in X \text{ and fixed } C > 0, \]
by (40). Hence the operator $A: X \to Y$ is continuous. □
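The row-sum bound (70) is easy to check numerically. The following Python sketch is only an illustration: the matrix `a`, the helper names and the random sampling are assumptions made for this example. It applies formula (69*) to coordinate vectors, compares $\|Au\|_\infty$ with $A_\infty \|u\|_\infty$, and shows that a suitable sign vector attains the bound for this particular matrix.

```python
import random

def apply(a, xi):
    # eta_m = sum_n a[m][n] * xi[n], cf. formula (69*)
    return [sum(a_mn * x for a_mn, x in zip(row, xi)) for row in a]

def max_norm(v):
    return max(abs(c) for c in v)

def row_sum_bound(a):
    # A_infty := maximum over the rows m of sum_n |a_mn|
    return max(sum(abs(a_mn) for a_mn in row) for row in a)

a = [[1.0, -2.0, 0.5],
     [0.0, 3.0, -1.0]]          # hypothetical (2 x 3)-matrix, i.e., M = 2, N = 3
bound = row_sum_bound(a)

# Sample coordinate vectors with max norm at most 1 and record the largest image norm.
random.seed(0)
worst = 0.0
for _ in range(1000):
    xi = [random.uniform(-1.0, 1.0) for _ in range(3)]
    worst = max(worst, max_norm(apply(a, xi)))

# The sign pattern of the row with the largest absolute row sum attains the bound.
xi_star = [1.0 if c >= 0 else -1.0 for c in a[1]]
attained = max_norm(apply(a, xi_star))
print(bound, worst, attained)
```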

Standard Example 4. Let $X := C[a, b]$ with the norm
\[ \|u\| := \max_{a \le x \le b} |u(x)|, \]
where $-\infty < a < b < \infty$. Suppose that the function
\[ K: [a, b] \times [a, b] \to \mathbb{R} \]
is continuous. Define the integral operator
\[ (Au)(x) := \int_a^b K(x, y)u(y)\,dy \quad \text{for all } x \in [a, b]. \]
Then the operator $A: X \to X$ is linear and continuous with
\[ \|A\| \le (b - a) \max_{a \le x, y \le b} |K(x, y)|. \]
In addition, it follows from Standard Example 12 in Section 1.11 that
$A: X \to X$ is also compact.
Proof. Let $u \in X$. It follows from
\[ \left|\int_a^b K(x, y)u(y)\,dy\right| \le \max_{a \le x, y \le b} |K(x, y)| \int_a^b |u(y)|\,dy \]
that
\[ \|Au\| \le (b - a) \max_{a \le x, y \le b} |K(x, y)| \, \|u\|. \] □
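The norm estimate for the integral operator can be observed on a discretized version of $A$. The following Python sketch is an illustration only: the kernel $K(x, y) = e^{-|x - y|}$, the function $u(y) = \cos 3y$, the interval $[0, 1]$, the midpoint rule and the helper name `integral_operator` are all assumptions made for this example.

```python
import math

def integral_operator(K, u, a, b, n=400):
    """Return x -> midpoint-rule approximation of int_a^b K(x, y) u(y) dy."""
    h = (b - a) / n
    ys = [a + (j + 0.5) * h for j in range(n)]
    def Au(x):
        return h * sum(K(x, y) * u(y) for y in ys)
    return Au

a, b = 0.0, 1.0
K = lambda x, y: math.exp(-abs(x - y))   # hypothetical continuous kernel
u = lambda y: math.cos(3.0 * y)          # hypothetical element of C[a, b]

Au = integral_operator(K, u, a, b)
grid = [a + i * (b - a) / 200 for i in range(201)]

sup_Au = max(abs(Au(x)) for x in grid)                 # approximates ||Au||
sup_u = max(abs(u(x)) for x in grid)                   # approximates ||u||
max_K = max(abs(K(x, y)) for x in grid for y in grid)  # approximates max |K|
bound = (b - a) * max_K * sup_u
print(sup_Au, bound)
```

The maxima are taken over a finite grid, so the printed numbers only approximate the quantities in the estimate.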
Proposition 5. Let $L(X, Y)$ denote the space of linear continuous operators
\[ A: X \to Y, \]
where $X$ is a normed space over $\mathbb{K}$ and $Y$ is a Banach space over $\mathbb{K}$.
Then $L(X, Y)$ is a Banach space over $\mathbb{K}$ with respect to the operator
norm $\|A\|$.
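Proof. Step 1: $L(X, Y)$ is a linear space, where the linear combination
\[ \alpha A + \beta B, \qquad A, B \in L(X, Y), \; \alpha, \beta \in \mathbb{K}, \]
is defined in the usual way through
\[ (\alpha A + \beta B)u = \alpha Au + \beta Bu \quad \text{for all } u \in X, \; \alpha, \beta \in \mathbb{K}. \]
Step 2: The operator norm represents a norm on $L(X, Y)$. In fact, it
follows from (67) and (67*) that $\|A\| = 0$ iff $A = 0$.
Let $\alpha \in \mathbb{K}$, and $A, B \in L(X, Y)$. Then
\[ \|\alpha A\| = \sup_{\|u\| \le 1} \|\alpha Au\| = |\alpha| \sup_{\|u\| \le 1} \|Au\| = |\alpha|\|A\|. \]
Finally, the triangle inequality follows from
\[ \|A + B\| = \sup \|Au + Bu\| \le \sup(\|Au\| + \|Bu\|) \le \sup \|Au\| + \sup \|Bu\| = \|A\| + \|B\|, \]
where the supremum is taken over all $u \in X$ with $\|u\| \le 1$.
Step 3: Cauchy sequences. Let $(A_n)$ be a Cauchy sequence in $L(X, Y)$,
i.e., $\|A_n - A_m\| < \varepsilon$ for all $n, m \ge n_0(\varepsilon)$. Hence
\[ \|A_n u - A_m u\| \le \|A_n - A_m\|\|u\| \le \varepsilon\|u\| \quad \text{for all } n, m \ge n_0(\varepsilon). \tag{71} \]
Thus, the sequence $(A_n u)$ is Cauchy. Since $Y$ is a Banach space, $(A_n u)$ is
convergent. Define
\[ Au := \lim_{n \to \infty} A_n u \quad \text{for all } u \in X. \]
Letting $n \to \infty$, it follows from $A_n(\alpha u + \beta v) = \alpha A_n u + \beta A_n v$ that
\[ A(\alpha u + \beta v) = \alpha Au + \beta Av \quad \text{for all } u, v \in X, \; \alpha, \beta \in \mathbb{K}, \]
i.e., the operator $A$ is linear. By (71),
\[ \|A_n u\| \le \|A_n u - A_{n_0} u\| + \|A_{n_0} u\| \le (\varepsilon + \|A_{n_0}\|)\|u\| \quad \text{for all } n \ge n_0(\varepsilon). \]
Letting $n \to \infty$, this implies
\[ \|Au\| \le (\varepsilon + \|A_{n_0}\|)\|u\| \quad \text{for all } u \in X, \]
i.e., $A$ is continuous.
Finally, letting $m \to \infty$ in (71), we get
\[ \|A_n u - Au\| \le \varepsilon\|u\| \quad \text{for all } n \ge n_0(\varepsilon) \text{ and all } u \in X. \]
Hence
\[ \|A_n - A\| \le \varepsilon \quad \text{for all } n \ge n_0(\varepsilon), \]
i.e., $A_n \to A$ in $L(X, Y)$ as $n \to \infty$.
This proves that each Cauchy sequence in $L(X, Y)$ is convergent, i.e.,
$L(X, Y)$ is a Banach space. □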
Proposition 6. Let $A: X \to Y$ and $B: Y \to X$ be linear operators, where
$X$ and $Y$ are linear spaces over $\mathbb{K}$. Suppose that
\[ AB = I \quad \text{and} \quad BA = I. \]
Then $A$ is bijective and $A^{-1} = B$.
Proof. Since $A(Bu) = u$ for all $u \in Y$, the operator $A$ is surjective.
Moreover, if $Au = Av$, then $A(u - v) = 0$. Hence $u - v = BA(u - v) = 0$,
i.e., $A$ is injective. Consequently, $A$ is bijective.
Applying $A^{-1}$ to $AB = I$, we get $A^{-1}AB = A^{-1}$, i.e., $B = A^{-1}$. □
1.21 The Dual Space

Definition 1. Let $X$ be a normed space over $\mathbb{K}$. By a linear continuous
functional on $X$ we understand a linear continuous operator
\[ f: X \to \mathbb{K}. \]
The set of all linear continuous functionals on $X$ is called the dual space
$X^*$ of $X$.
Obviously, $X^* = L(X, \mathbb{K})$. We set
\[ \langle f, u \rangle := f(u) \quad \text{for all } u \in X, \; f \in X^*. \]
Let $f \in X^*$. By (67), the norm of $f$ is given through
\[ \|f\| := \sup_{\|v\| \le 1} |f(v)|. \tag{72} \]
Hence
\[ |\langle f, u \rangle| = |f(u)| \le \|f\|\|u\| \quad \text{for all } u \in X, \; f \in X^*. \tag{72*} \]
Proposition 2. Let $X$ be a normed space over $\mathbb{K}$. Then the dual space $X^*$
is a Banach space over $\mathbb{K}$ with respect to the norm $\|f\|$.
This follows from Proposition 5 in Section 1.20.
Example 3. Let $X := C[a, b]$, $-\infty < a < b < \infty$, and let $v \in X$. Define
\[ f(u) := \int_a^b u(x)v(x)\,dx \quad \text{for all } u \in X. \]
Then, $f \in X^*$, and $\|f\| \le (b - a)\|v\|$.
Proof. For all $u \in X$,
\[ |f(u)| \le (b - a) \max_{a \le x \le b} |u(x)| \max_{a \le x \le b} |v(x)| = (b - a)\|u\|\|v\|. \] □
A complete description of the dual space $C[a, b]^*$ will be given in Section
2.3 of AMS Vol. 109, along with applications to the famous classical moment
problem.
Proposition 4. Let $X$ be a finite-dimensional normed space over $\mathbb{K}$ with
$\dim X \ge 1$. Let $\{e_1, \ldots, e_N\}$ be a basis in $X$, and let
\[ u = \sum_{j=1}^{N} \xi_j e_j. \]
Then, $f \in X^*$ iff there exist numbers $a_n \in \mathbb{K}$, $n = 1, \ldots, N$, such that
\[ f(u) = \sum_{n=1}^{N} a_n \xi_n \quad \text{for all } u \in X. \]
This is a special case of Proposition 3 in Section 1.20 with $Y := \mathbb{K}$,
$M = 1$, and $f_1 := 1$.
Important properties of the dual space $X^*$ will be studied in Chapters 2
and 3 of AMS Vol. 109, namely, the Hahn-Banach extension theorem and
its consequences, the separation of convex sets, reflexive Banach spaces,
and variational principles.
1.22 Infinite Series in Normed Spaces

Definition 1. Let $X$ be a normed space over $\mathbb{K}$ and let $u_j \in X$ for all $j$.
We set
\[ \sum_{j=0}^{\infty} u_j := \lim_{n \to \infty} \sum_{j=0}^{n} u_j, \tag{73} \]
provided this limit exists. This infinite series is called absolutely convergent
iff
\[ \sum_{j=0}^{\infty} \|u_j\| < \infty. \tag{73*} \]
Proposition 2. Each absolutely convergent infinite series in a Banach
space is convergent.
Proof. Set $s_n := \sum_{j=0}^{n} u_j$. By (73*), for each $\varepsilon > 0$, there is an $n_0(\varepsilon)$ such
that
\[ \|s_{n+k} - s_n\| \le \sum_{j=n+1}^{n+k} \|u_j\| < \varepsilon \quad \text{for all } n \ge n_0(\varepsilon) \text{ and all } k = 1, 2, \ldots. \]
Hence the sequence $(s_n)$ is Cauchy, i.e., the limit (73) exists. □
1.23 Banach Algebras and Operator Functions

Definition 1. By a Banach algebra $\mathcal{B}$ over $\mathbb{K}$ we understand a Banach
space over $\mathbb{K}$, where an additional multiplication "$AB$" is defined such
that
\[ AB \in \mathcal{B} \quad \text{for all } A, B \in \mathcal{B}. \]
Moreover, for all $A, B, C \in \mathcal{B}$ and $\alpha \in \mathbb{K}$, the following are true:
\[ (AB)C = A(BC), \quad A(B + C) = AB + AC, \quad (B + C)A = BA + CA, \]
\[ \alpha(AB) = (\alpha A)B = A(\alpha B), \quad \|AB\| \le \|A\|\|B\|. \]
In addition, we postulate that there exists an $E \in \mathcal{B}$ such that
\[ A = AE = EA \quad \text{for all } A \in \mathcal{B} \quad \text{and} \quad \|E\| = 1. \]
Standard Example 2. Let $X$ be a Banach space over $\mathbb{K}$ with $X \ne \{0\}$.
Then $L(X, X)$ represents a Banach algebra, where $AB$ corresponds to
the usual multiplication of operators defined through
\[ (AB)u := A(Bu) \quad \text{for all } u \in X, \]
and $E$ is equal to the identity operator, i.e., $Eu := u$ for all $u \in X$.
Proof. Observe that
\[ \|AB\| = \sup \|A(Bu)\| \le \sup \|A\|\|Bu\| = \|A\|\|B\|, \qquad \|E\| = \sup \|u\| = 1, \]
where the supremum is taken over all $u \in X$ with $\|u\| \le 1$. □
Proposition 3. Let $\mathcal{B}$ be a Banach algebra, and let $A, B, A_n, B_n \in \mathcal{B}$ for
all $n$. Then:
(i) $\|A^k\| \le \|A\|^k$ for all $k = 0, 1, 2, \ldots$, where we set $A^0 := E$.
(ii) If $A_n \to A$ and $B_n \to B$ in $\mathcal{B}$ as $n \to \infty$, then $A_nB_n \to AB$ in $\mathcal{B}$ as
$n \to \infty$.
Proof. Ad (i). Use $\|A^{m+1}\| = \|A^mA\| \le \|A^m\|\|A\|$ for $m = 1, 2, \ldots$.
Ad (ii). Since the sequences $(A_n)$ and $(B_n)$ are bounded, we get
\[ \|A_nB_n - AB\| = \|(A_n - A)B_n - A(B - B_n)\| \le \|A_n - A\|\|B_n\| + \|A\|\|B - B_n\| \to 0 \quad \text{as } n \to \infty. \] □
Our next goal is the definition of operator functions through
\[ F(A) := \sum_{j=0}^{\infty} a_j A^j, \qquad a_j \in \mathbb{K} \text{ for all } j, \tag{74} \]
where
\[ F(z) = \sum_{j=0}^{\infty} a_j z^j, \qquad z \in \mathbb{K}, \tag{75} \]
along with
\[ \sum_{j=0}^{\infty} |a_j||z|^j < \infty \quad \text{for all } z \in \mathbb{C} \text{ with } |z| < r \text{ and some fixed } r > 0. \tag{75*} \]
Recall that $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$.
Proposition 4. Let $X$ be a Banach space over $\mathbb{K}$. Suppose we are given
the function $F$ as in (75) with (75*).
Then, for each $A \in L(X, X)$ with
\[ \|A\| < r, \]
formula (74) defines an operator $F(A) \in L(X, X)$.
The following proof shows that this proposition remains valid if we replace
$L(X, X)$ with a Banach algebra $\mathcal{B}$. Then, $F(A) \in \mathcal{B}$.
Proof. Let $\|A\| < r$. Then, by Proposition 3(i),
\[ \sum_{j=0}^{\infty} \|a_j A^j\| \le \sum_{j=0}^{\infty} |a_j|\|A\|^j < \infty. \]
Thus, the infinite series $\sum_{j=0}^{\infty} a_j A^j$ from (74) is absolutely convergent and
hence convergent in $L(X, X)$. □
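Example 5 (The exponential function). Let $X$ be a Banach space over $\mathbb{K}$.
Then:
(i) The infinite series
\[ e^A := \sum_{j=0}^{\infty} \frac{1}{j!} A^j \]
converges absolutely for all $A \in L(X, X)$.
(ii) For each $A \in L(X, X)$ and all $t, s \in \mathbb{K}$,
\[ e^{tA}e^{sA} = e^{(t+s)A}. \tag{76} \]
Proof. Ad (i). Observe that $\sum_{j=0}^{\infty} \frac{|z|^j}{j!} < \infty$ for all $z \in \mathbb{C}$, by a well-known
property of the classical exponential function
\[ e^z = \sum_{j=0}^{\infty} \frac{1}{j!} z^j \quad \text{for all } z \in \mathbb{C}. \]
Ad (ii). As for the classical exponential function, we get
\[ \sum_{j=0}^{\infty} \frac{t^j A^j}{j!} \sum_{k=0}^{\infty} \frac{s^k A^k}{k!}
 = \sum_{r=0}^{\infty} \sum_{j+k=r} \frac{t^j s^k}{j!\,k!} A^r
 = \sum_{r=0}^{\infty} \sum_{j=0}^{r} \frac{t^j s^{r-j}}{j!\,(r-j)!} A^r \]
\[ = \sum_{r=0}^{\infty} \sum_{j=0}^{r} \binom{r}{j} \frac{t^j s^{r-j}}{r!} A^r
 = \sum_{r=0}^{\infty} \frac{(t+s)^r}{r!} A^r. \]
Here, the product of the two series is obtained by letting $n \to \infty$ in the product
of the partial sums and using Proposition 3(ii); the rearrangement is justified
by absolute convergence. This yields (76). □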
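Relation (76) can be checked numerically for matrices, which are operators in $L(X, X)$ with $X = \mathbb{R}^2$. The following Python sketch is only an illustration: the matrix `A`, the numbers `t` and `s`, the truncation after 40 terms and all helper names are assumptions made for this example.

```python
def mat_mul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_scale(c, P):
    return [[c * x for x in row] for row in P]

def mat_add(P, Q):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(P, Q)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_exp(P, terms=40):
    """Partial sum of the exponential series: sum of P^j / j! for j < terms."""
    n = len(P)
    result = identity(n)
    term = identity(n)
    for j in range(1, terms):
        term = mat_scale(1.0 / j, mat_mul(term, P))   # now term = P^j / j!
        result = mat_add(result, term)
    return result

def max_abs_diff(P, Q):
    return max(abs(x - y) for r, s in zip(P, Q) for x, y in zip(r, s))

A = [[0.0, 1.0], [-2.0, -0.3]]     # hypothetical operator on R^2
t, s = 0.7, -1.2

lhs = mat_mul(mat_exp(mat_scale(t, A)), mat_exp(mat_scale(s, A)))
rhs = mat_exp(mat_scale(t + s, A))
err = max_abs_diff(lhs, rhs)
print(err)
```

The two sides agree up to rounding and truncation errors; the computation relies on the fact that $tA$ and $sA$ commute.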
Example 6 (The geometric series). Let $X$ be a Banach space over $\mathbb{K}$ with
$X \ne \{0\}$. The classical geometric series
\[ (1 - z)^{-1} = \sum_{j=0}^{\infty} z^j \]
converges absolutely for all $z \in \mathbb{C}$ with $|z| < 1$. By Proposition 4, for each
operator $A \in L(X, X)$ with $\|A\| < 1$, the infinite series
\[ \sum_{j=0}^{\infty} A^j \]
converges absolutely to an operator $B \in L(X, X)$. This series is called the
Neumann series. In addition,
\[ (I - A)^{-1} = \sum_{j=0}^{\infty} A^j. \]
Proof. Obviously,
\[ (I - A)B = I \quad \text{and} \quad B(I - A) = I. \]
Hence $B = (I - A)^{-1}$, by Proposition 6 in Section 1.20. □
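The Neumann series can be summed numerically for a matrix with small norm. The following Python sketch is an illustration under stated assumptions: the matrix `A` (with maximal absolute row sum $0.5$, so that its operator norm with respect to the maximum norm is less than one), the number of terms and the helper names are chosen for this example only.

```python
def mat_mul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_partial_sum(A, terms):
    """Return I + A + A^2 + ... + A^(terms - 1)."""
    n = len(A)
    total = identity(n)
    power = identity(n)
    for _ in range(1, terms):
        power = mat_mul(power, A)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return total

def distance_to_identity(P):
    n = len(P)
    return max(abs(P[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))

A = [[0.2, 0.3], [0.1, 0.4]]       # hypothetical operator with ||A|| < 1
B = neumann_partial_sum(A, 200)
I_minus_A = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(2)] for i in range(2)]

left = distance_to_identity(mat_mul(I_minus_A, B))    # (I - A)B compared with I
right = distance_to_identity(mat_mul(B, I_minus_A))   # B(I - A) compared with I
print(left, right)
```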
Let $X$ and $Y$ be Banach spaces over $\mathbb{K}$ with $X \ne \{0\}$ and $Y \ne \{0\}$.
Denote by $L_{\mathrm{inv}}(X, Y)$ the set of all the operators $A \in L(X, Y)$ such that
the inverse operator $A^{-1}: Y \to X$ exists and $A^{-1} \in L(Y, X)$.
Proposition 7. If $A \in L_{\mathrm{inv}}(X, Y)$ and $B \in L(X, Y)$ with
\[ \|B\| < \|A^{-1}\|^{-1}, \]
then $A + B \in L_{\mathrm{inv}}(X, Y)$.
Corollary 8. The set $L_{\mathrm{inv}}(X, Y)$ is open in $L(X, Y)$.
Proof. Let $A \in L_{\mathrm{inv}}(X, Y)$. It follows from $AA^{-1} = I$ that $A^{-1} \ne 0$, and
hence $\|A^{-1}\| \ne 0$.
If $A \in L_{\mathrm{inv}}(X, Y)$ and $C \in L_{\mathrm{inv}}(X, X)$, then $AC \in L_{\mathrm{inv}}(X, Y)$ and
\[ (AC)^{-1} = C^{-1}A^{-1}. \tag{77} \]
In fact, $(C^{-1}A^{-1})(AC) = I = (AC)(C^{-1}A^{-1})$.
Since $\|A^{-1}B\| \le \|A^{-1}\|\|B\| < 1$, it follows from Example 6 that
\[ C := (I + A^{-1}B) \in L_{\mathrm{inv}}(X, X). \]
By (77), $AC = A + B \in L_{\mathrm{inv}}(X, Y)$. □

1.24 Applications to Linear Differential Equations in Banach Spaces
Definition 1. Let
\[ u: U(t_0) \subseteq \mathbb{R} \to X \]
be a function, where $X$ is a normed space over $\mathbb{K}$ and $U(t_0)$ is an open
neighborhood of the point $t_0 \in \mathbb{R}$. We define the derivative
\[ u'(t_0) := \lim_{h \to 0} h^{-1}(u(t_0 + h) - u(t_0)), \]
provided this limit exists.
Proposition 2. If the derivative $u'(t_0)$ exists, then the function $u(\cdot)$ is
continuous at the point $t_0$.
Proof. The identity
\[ u(t_0 + h) = u(t_0) + h\bigl(h^{-1}(u(t_0 + h) - u(t_0))\bigr) \]
yields
\[ u(t_0 + h) \to u(t_0) \quad \text{as } h \to 0. \] □
Let us now consider the following initial-value problem:
\[ u'(t) = Au(t), \quad -\infty < t < \infty, \qquad u(0) = u_0, \tag{78} \]
where $u_0 \in X$ is given.
Proposition 3. Let $X$ be a Banach space over $\mathbb{K}$, and let the operator
$A \in L(X, X)$ be given.
Then the initial-value problem (78) has a unique solution given by
\[ u(t) = e^{tA}u_0 \quad \text{for all } t \in \mathbb{R}. \]
Example 4. Consider the special case where $X := \mathbb{R}^N$, $N \ge 1$, and
$A = (a_{jk})$ is a real $(N \times N)$-matrix. Then problem (78) corresponds to the
following system of linear differential equations:
\[ \xi_j'(t) = \sum_{k=1}^{N} a_{jk}\xi_k(t), \quad -\infty < t < \infty, \qquad j = 1, \ldots, N. \tag{78*} \]
By Proposition 3, this system has a unique solution for each given initial value $u(0) = u_0$.
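Proof of Proposition 3. Step 1: Existence. Let $h \in \mathbb{R}$ with $0 < |h| \le 1$. It follows from
\[ e^{hA} = I + hA + \frac{h^2}{2!}A^2 + \cdots \]
that
\[ \|h^{-1}(e^{hA} - I) - A\| \le |h| \sum_{j=2}^{\infty} \frac{|h|^{j-2}}{j!}\|A\|^j \le \mathrm{const}\,|h| \to 0 \quad \text{as } h \to 0. \]
Since
\[ u(t + h) = e^{(t+h)A}u_0 = e^{hA}e^{tA}u_0 \quad \text{for all } t, h \in \mathbb{R}, \]
we get
\[ \|h^{-1}(u(t + h) - u(t)) - Au(t)\| = \|(h^{-1}(e^{hA} - I) - A)e^{tA}u_0\| \]
\[ \le \|h^{-1}(e^{hA} - I) - A\|\,\|e^{tA}\|\,\|u_0\| \to 0 \quad \text{as } h \to 0. \]
This implies $u'(t) = Au(t)$ for all $t \in \mathbb{R}$. In addition, $u(t) = e^{tA}u_0 = u_0$ for
$t = 0$.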
Step 2: For the uniqueness proof, we need the following result, which will
be proved in Section 1.1 of AMS Vol. 109 as an easy consequence of the
Hahn-Banach theorem: For all $v \in X$,
\[ \|v\| = \sup_{\|f\| \le 1} |\langle f, v \rangle|, \]
where $f \in X^*$.
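Step 3: Let $u = u(t)$ and $v = v(t)$ be two solutions of the original problem
(78). Set
\[ w(t) := u(t) - v(t) \quad \text{for all } t \in \mathbb{R}. \]
Then
\[ w'(t) = Aw(t), \quad -\infty < t < \infty, \qquad w(0) = 0. \tag{79} \]
We have to show that (79) implies $w(t) = 0$ for all $t \in \mathbb{R}$.
Let $w = w(t)$ be a solution of (79). Choose $f \in X^*$. By (79),
\[ \langle f, w'(t) \rangle = \langle f, Aw(t) \rangle \quad \text{for all } t \in \mathbb{R}. \]
Since $f: X \to \mathbb{K}$ is linear and continuous, we get
\[ \lim_{h \to 0} \frac{\langle f, w(t + h) \rangle - \langle f, w(t) \rangle}{h} = \lim_{h \to 0} \langle f, h^{-1}(w(t + h) - w(t)) \rangle = \langle f, w'(t) \rangle \quad \text{for all } t \in \mathbb{R}. \]
This implies
\[ \frac{d}{dt}\langle f, w(t) \rangle = \langle f, Aw(t) \rangle \quad \text{for all } t \in \mathbb{R}. \tag{80} \]
By Proposition 2, the function $t \mapsto w(t)$ is continuous on $\mathbb{R}$. Hence, the
function $t \mapsto \langle f, Aw(t) \rangle$ is also continuous, since $A$ and $f$ are continuous.
Integrating (80) and observing that $\langle f, w(0) \rangle = 0$, we obtain
\[ \langle f, w(t) \rangle = \int_0^t \langle f, Aw(s) \rangle\,ds \quad \text{for all } t \in \mathbb{R}. \]
By (72*), for all $t$ with $|t| \le h$, we get
\[ |\langle f, w(t) \rangle| \le \left|\int_0^t \|f\|\|A\|\|w(s)\|\,ds\right| \le h\|f\|\|A\| \max_{|s| \le h} \|w(s)\|. \]
It follows from Step 2 that
\[ \|w(t)\| \le h\|A\| \max_{|s| \le h} \|w(s)\| \quad \text{for all } t: |t| \le h. \]
Hence
\[ \max_{|t| \le h} \|w(t)\| \le h\|A\| \max_{|t| \le h} \|w(t)\|. \]
This yields $w(t) = 0$ for all $t \in \mathbb{R}$ provided $A = 0$.
If $A \ne 0$, then we choose the number $h := \frac{1}{2\|A\|}$, so that $h\|A\| = \frac{1}{2}$. Hence
\[ w(t) = 0 \quad \text{for all } t: |t| \le h. \]
Now applying the same argument to the initial-value problems
\[ w'(t) = Aw(t), \quad -\infty < t < \infty, \qquad w(\pm h) = 0, \]
we get $w(t) = 0$ for all $t \in [-2h, 2h]$. Continuing this, we obtain $w(t) = 0$
for all $t \in \mathbb{R}$. □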
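The solution formula $u(t) = e^{tA}u_0$ can be tested numerically in the finite-dimensional situation of Example 4. The following Python sketch is an illustration only: the matrix `A` (the generator of a plane rotation), the initial value `u0`, the truncation of the exponential series and the helper names are assumptions made for this example. For this particular choice one has $u(t) = (\cos t, -\sin t)$ in closed form, which gives an independent check.

```python
import math

def mat_vec(P, v):
    return [sum(p * x for p, x in zip(row, v)) for row in P]

def exp_series_apply(A, t, u0, terms=60):
    """Approximate e^{tA} u0 by the partial sum of (tA)^j u0 / j! for j < terms."""
    result = list(u0)
    term = list(u0)
    for j in range(1, terms):
        term = [t / j * c for c in mat_vec(A, term)]   # now term = (tA)^j u0 / j!
        result = [r + c for r, c in zip(result, term)]
    return result

A = [[0.0, 1.0], [-1.0, 0.0]]    # hypothetical example with A^2 = -I
u0 = [1.0, 0.0]

t = 2.0
u_t = exp_series_apply(A, t, u0)
exact = [math.cos(t), -math.sin(t)]   # closed-form solution for this A and u0

# Compare a central difference quotient of u with A u(t), cf. u'(t) = A u(t).
h = 1e-5
u_plus = exp_series_apply(A, t + h, u0)
u_minus = exp_series_apply(A, t - h, u0)
diff_quot = [(p - m) / (2.0 * h) for p, m in zip(u_plus, u_minus)]
Au_t = mat_vec(A, u_t)
ode_residual = max(abs(d - a) for d, a in zip(diff_quot, Au_t))
print(u_t, exact, ode_residual)
```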
1.25 Applications to the Spectrum

Let us consider the equation
\[ Au = \lambda u, \qquad u \in X, \; \lambda \in \mathbb{C}. \tag{81} \]
Definition 1. Let $A \in L(X, X)$, where $X$ is a complex Banach space with
$X \ne \{0\}$.
The complex number $\lambda$ is called an eigenvalue of the operator $A$ iff
equation (81) has a nontrivial solution $u \ne 0$.
The resolvent set $\rho(A)$ of $A$ is defined to be the set of all the complex
numbers $\lambda$ for which the inverse operator $(\lambda I - A)^{-1}: X \to X$ exists and
\[ (\lambda I - A)^{-1} \in L(X, X). \]
In this case, the operator $(\lambda I - A)^{-1}$ is called the resolvent of $A$ at $\lambda$.
The spectrum $\sigma(A)$ of $A$ is defined through $\sigma(A) := \mathbb{C} - \rho(A)$.
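Proposition 2. The spectrum $\sigma(A)$ is a compact subset of $\mathbb{C}$ and
\[ |\lambda| \le \|A\| \quad \text{for all } \lambda \in \sigma(A). \]
Each eigenvalue $\lambda$ of $A$ belongs to the spectrum of $A$.
The resolvent set $\rho(A)$ is open in $\mathbb{C}$.
Proof. Let $\lambda \in \rho(A)$ and $\mu \in \mathbb{C}$. Hence $\lambda I - A \in L_{\mathrm{inv}}(X, X)$. Since
\[ \|(\lambda I - A) - (\mu I - A)\| \le |\lambda - \mu|, \]
it follows from Example 6 in Section 1.23 that $\mu I - A \in L_{\mathrm{inv}}(X, X)$ provided
$|\lambda - \mu|$ is sufficiently small. Hence the set $\rho(A)$ is open.
If $|\lambda| > \|A\|$, then $\lambda \in \rho(A)$. In fact, since
\[ \|\lambda^{-1}A\| \le |\lambda^{-1}|\|A\| < 1, \]
Example 6 in Section 1.23 tells us that $(I - \lambda^{-1}A)^{-1} \in L(X, X)$. Hence
\[ (\lambda I - A)^{-1} = \lambda^{-1}(I - \lambda^{-1}A)^{-1} \in L(X, X). \]
Consequently, the spectrum $\sigma(A) = \mathbb{C} - \rho(A)$ is closed and bounded, i.e.,
$\sigma(A)$ is compact.
Finally, if $\lambda \in \rho(A)$, then
\[ (\lambda I - A)u = 0 \quad \text{implies} \quad u = (\lambda I - A)^{-1}(0) = 0, \]
i.e., $\lambda$ is not an eigenvalue of $A$. □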
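For a matrix, the statements of Proposition 2 can be observed directly. The following Python sketch is only an illustration: the matrix `A`, the choice $\lambda = 5$, the use of the maximum norm on $\mathbb{C}^2$ (whose operator norm is the maximal absolute row sum) and all helper names are assumptions made for this example. It computes the eigenvalues from the characteristic polynomial, compares their moduli with $\|A\|$, and builds the resolvent for $|\lambda| > \|A\|$ from the Neumann series of $\lambda^{-1}A$.

```python
import cmath

def eigenvalues_2x2(A):
    """Roots of the characteristic polynomial z^2 - tr(A) z + det(A)."""
    tr = A[0][0] + A[1][1]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return [(tr + root) / 2.0, (tr - root) / 2.0]

def row_sum_norm(A):
    # Operator norm of A with respect to the maximum norm on C^2.
    return max(sum(abs(x) for x in row) for row in A)

def mat_mul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def resolvent_by_neumann(A, lam, terms=300):
    """Approximate (lam I - A)^{-1} = lam^{-1} sum_j (A / lam)^j for |lam| > ||A||."""
    T = [[x / lam for x in row] for row in A]
    total = [[1.0, 0.0], [0.0, 1.0]]
    power = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(1, terms):
        power = mat_mul(power, T)
        total = [[total[i][j] + power[i][j] for j in range(2)] for i in range(2)]
    return [[x / lam for x in row] for row in total]

A = [[1.0, 2.0], [-3.0, 1.0]]    # hypothetical operator on C^2
eigs = eigenvalues_2x2(A)        # here: 1 + i*sqrt(6) and 1 - i*sqrt(6)
norm_A = row_sum_norm(A)         # here: 4.0

lam = 5.0                        # |lam| > ||A||, so lam lies in the resolvent set
R = resolvent_by_neumann(A, lam)
lamI_minus_A = [[lam - A[0][0], -A[0][1]], [-A[1][0], lam - A[1][1]]]
product = mat_mul(lamI_minus_A, R)
residual = max(abs(product[i][j] - (1.0 if i == j else 0.0))
               for i in range(2) for j in range(2))
print(eigs, norm_A, residual)
```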
The operator $B: X \to X$ on the Banach space $X$ is called semi-Fredholm
iff the range $R(B)$ is closed and the null space of $B$ has a finite dimension,
i.e., $0 \le \dim N(B) < \infty$.
Definition 3. Let the operator $A \in L(X, X)$ be given, where $X$ is a
complex Banach space with $X \ne \{0\}$. The essential spectrum $\sigma_e(A)$ of
$A$ consists of all $\lambda \in \mathbb{C}$ such that the operator $\lambda I - A: X \to X$ is not
semi-Fredholm.
Obviously, $\sigma_e(A) \subseteq \sigma(A)$. Moreover, $\sigma_e(A)$ contains all the eigenvalues
$\lambda$ of $A$ that have an infinite multiplicity, i.e., $\dim N(A - \lambda I) = \infty$.
Example 4. If $\dim X < \infty$, then the essential spectrum of $A$ is empty.
Proof. Observe that finite-dimensional linear subspaces of Banach spaces
are always closed. Hence $R(\lambda I - A)$ is closed for all $\lambda \in \mathbb{C}$. Moreover,
$\dim N(\lambda I - A) \le \dim X < \infty$. □
Example 5. If the operator $A$ is compact, then we shall show in Section
5.6 of AMS Vol. 109 that the operator $\lambda I - A: X \to X$ is Fredholm (and
hence also semi-Fredholm) for all $\lambda \in \mathbb{C}$ with $\lambda \ne 0$. Thus, either $\sigma_e(A)$ is
empty or $\sigma_e(A) = \{0\}$.
The essential spectrum plays a fundamental role in quantum mechanics
with respect to scattering processes (cf. Section 5.20).
1.26 Density and Approximation

Classical approximation theorems can frequently be formulated in terms of
dense sets in normed spaces.
Definition 1. Let $X$ be a normed space. A subset $M$ of $X$ is called dense
in $X$ iff
\[ \overline{M} = X, \]
i.e., for each $u \in X$ and each $\varepsilon > 0$, there is a $v \in M$ such that $\|u - v\| < \varepsilon$.
The space $X$ is called separable iff there is an at most countable dense
subset $M$ of $X$.
Recall that a set $M$ is called countable iff there exists a bijective map
$A: M \to \mathbb{N}$, where $\mathbb{N}$ denotes the set of natural numbers $n = 1, 2, \ldots$.
The set $M$ is called at most countable iff it is either finite or countable.
It is well known that the set $\mathbb{Q}$ of rational numbers is countable.
Proposition 2. Let $X := C[a, b]$, where $-\infty < a < b < \infty$. Then the set
of all the polynomials
\[ p(x) := a_0 + a_1 x + \cdots + a_n x^n, \qquad n = 0, 1, \ldots, \]
with real coefficients $a_j$ is dense in $X$.
This is the classical Weierstrass approximation theorem. In fact, Proposition 2
tells us that for each continuous function $u: [a, b] \to \mathbb{R}$ and each
$\varepsilon > 0$, there is a polynomial $p$ such that
\[ \|u - p\| := \max_{a \le x \le b} |u(x) - p(x)| < \varepsilon. \]
Corollary 3. The space $C[a, b]$ is separable.
Proof of Corollary 3. For each real number $a_j$ and each $\varepsilon > 0$, there is
a rational number $r_j$ such that
\[ |a_j - r_j| < \varepsilon. \tag{82} \]
Letting
\[ q(x) := r_0 + r_1 x + \cdots + r_n x^n, \]
it follows from (82) that
\[ \|u - q\| \le \|u - p\| + \|p - q\| < \varepsilon + \sum_{j=0}^{n} |a_j - r_j| \Bigl(\max_{a \le x \le b} |x|\Bigr)^j \le \mathrm{const} \cdot \varepsilon. \]
Thus, the set $M$ of all the polynomials $q$ with rational coefficients is dense
in $C[a, b]$. But the set $M$ is countable, since the set of rational numbers is
countable, and the union of a countable number of countable sets is again
countable. □
Proof of Proposition 2. Let $\sum := \sum_{k=0}^{n}$. We only consider the case where
$a = 0$ and $b = 1$. The general case can be reduced to this special case by
using the coordinate transformation $x = a + (b - a)y$, which transforms
$[0, 1]$ into $[a, b]$.
Step 1: Two identities. For $b_k(x) := \binom{n}{k} x^k (1 - x)^{n-k}$, we have
\[ \sum b_k(x) = 1 \tag{83a} \]
and
\[ \sum (k - nx)^2 b_k(x) = nx(1 - x) \quad \text{for all } x \in \mathbb{R} \text{ and } n = 0, 1, \ldots. \tag{83b} \]
To prove this, we begin with the binomial theorem
\[ (x + y)^n = \sum \binom{n}{k} x^k y^{n-k}. \tag{84} \]
Setting $y = 1 - x$, we get (83a).
Differentiating (84) once (resp., twice) with respect to $x$ and multiplying by $x$ (resp.,
$x^2$) yields
\[ nx(x + y)^{n-1} = \sum \binom{n}{k} k x^k y^{n-k}, \]
\[ n(n - 1)x^2(x + y)^{n-2} = \sum \binom{n}{k} k(k - 1) x^k y^{n-k}. \]
Setting $y = 1 - x$, we obtain (83b) by summation. In fact,
\[ \sum b_k(x)(n^2x^2 - 2nxk + k^2) = n^2x^2 - 2n^2x^2 + [n(n - 1)x^2 + nx] = nx(1 - x). \]
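Step 2: The Bernstein polynomials $B_n$. Let $u \in C[0, 1]$ and
$\|u\| := \max_{0 \le x \le 1} |u(x)|$. We set
\[ B_n(x) := \sum u\!\left(\frac{k}{n}\right) b_k(x). \]
By (83a),
\[ |u(x) - B_n(x)| = \left|\sum \left(u(x) - u\!\left(\frac{k}{n}\right)\right) b_k(x)\right| \le \sum \left|u(x) - u\!\left(\frac{k}{n}\right)\right| b_k(x). \]
Let $\varepsilon > 0$ be given. Since $u$ is uniformly continuous on $[0, 1]$, there is a
$\delta > 0$ such that
\[ \left|u(x) - u\!\left(\frac{k}{n}\right)\right| < \varepsilon \quad \text{if } \left|x - \frac{k}{n}\right| < \delta \text{ and } x \in [0, 1]. \]
Obviously,
\[ \left|u(x) - u\!\left(\frac{k}{n}\right)\right| \le 2\|u\| \le 2\|u\| \left(x - \frac{k}{n}\right)^2 / \delta^2 \quad \text{if } \left|x - \frac{k}{n}\right| \ge \delta. \]
Hence, by (83),
\[ |u(x) - B_n(x)| \le \sum \left(\varepsilon + 2\|u\| \left(x - \frac{k}{n}\right)^2 / \delta^2\right) b_k(x) = \varepsilon + 2\|u\| x(1 - x)/(\delta^2 n). \]
This implies that
\[ \|u - B_n\| \le 2\varepsilon \quad \text{for all } n \ge n_0(\varepsilon) \]
with suitable $n_0(\varepsilon)$. □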
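The Bernstein polynomials from this proof are easy to evaluate. The following Python sketch is an illustration under stated assumptions: the test function $u(x) = |x - 1/2|$ on $[0, 1]$, the evaluation grid, the degrees 10, 40 and 160 and the helper names are chosen for this example. It checks the identities (83a) and (83b) at one sample point and records the maximal error on the grid for increasing $n$.

```python
import math

def bernstein_basis(n, k, x):
    # b_k(x) = binom(n, k) x^k (1 - x)^(n - k)
    return math.comb(n, k) * x**k * (1.0 - x)**(n - k)

def bernstein(u, n, x):
    # B_n(x) = sum_k u(k / n) b_k(x)
    return sum(u(k / n) * bernstein_basis(n, k, x) for k in range(n + 1))

u = lambda x: abs(x - 0.5)              # hypothetical continuous test function
grid = [i / 200 for i in range(201)]

def sup_error(n):
    return max(abs(u(x) - bernstein(u, n, x)) for x in grid)

# Identities (83a) and (83b) at a sample point.
n, x = 12, 0.3
sum_basis = sum(bernstein_basis(n, k, x) for k in range(n + 1))
second_moment = sum((k - n * x) ** 2 * bernstein_basis(n, k, x) for k in range(n + 1))

errors = {m: sup_error(m) for m in (10, 40, 160)}
print(sum_basis, second_moment, errors)
```

For this test function the error decreases slowly as $n$ grows, which is consistent with the fact that the proof only yields convergence and no fast rate.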
Example 4. The set $\mathbb{Q}$ of rational numbers is dense in $\mathbb{R}$.
In fact, for each real number $x$ and each $\varepsilon > 0$, there is a rational number
$r$ such that $|x - r| < \varepsilon$.
Example 5. The set $\mathbb{Q} + i\mathbb{Q} = \{\alpha + i\beta : \alpha, \beta \in \mathbb{Q}\}$ is dense in $\mathbb{C}$.
This follows from Example 4 and from the inequality
\[ |\alpha + i\beta - (\gamma + i\delta)| \le |\alpha - \gamma| + |\beta - \delta| \quad \text{for all } \alpha, \beta, \gamma, \delta \in \mathbb{R}. \]
Since the set $\mathbb{Q}$ of rational numbers is countable, so is the set $\mathbb{Q} + i\mathbb{Q}$.
Consequently, $\mathbb{R}$ and $\mathbb{C}$ are separable.
Proposition 6. Each finite-dimensional normed space over $\mathbb{K}$ is separable.
Proof. Let $X = \{0\}$. Then the statement is trivial.
Now let $\dim X = N$ with $N \ge 1$. Choose a fixed basis $\{e_1, \ldots, e_N\}$ of
$X$. Then, each $u \in X$ can be represented in the form
\[ u = \sum_{j=1}^{N} \xi_j e_j, \quad \text{where } \xi_j \in \mathbb{K} \text{ for all } j. \]
Case 1: $\mathbb{K} = \mathbb{R}$. Let $M$ be the set of all the $u$ in $X$ with $\xi_j \in \mathbb{Q}$ for all $j$.
Let $\tilde{u} = \sum_{j=1}^{N} \tilde{\xi}_j e_j \in X$. Then, for given $\varepsilon > 0$, there is a $u \in M$ such that
\[ \|u - \tilde{u}\| \le \sum_{j=1}^{N} |\xi_j - \tilde{\xi}_j|\|e_j\| < \varepsilon. \]
This follows from Example 4.
Thus, the set $M$ is dense in $X$. Since the set $\mathbb{Q}$ is countable, so is the set
$M$. Hence $X$ is separable.
Case 2: $\mathbb{K} = \mathbb{C}$. Let $M$ be the set of all the $u$ in $X$ with $\xi_j \in \mathbb{Q} + i\mathbb{Q}$ for
all $j$. Then, the countable set $M$ is dense in $X$, by the same argument as
in Case 1 along with Example 5. □
The following proposition is important for the construction of approximation
methods such as the Ritz and the Galerkin methods.
Proposition 7. Let $X$ be a separable normed space over $\mathbb{K}$. Then there
exists a sequence $\{X_n\}$ of finite-dimensional linear subspaces $X_n$ of $X$ such
that
\[ X_1 \subseteq X_2 \subseteq \cdots \subseteq X_n \subseteq \cdots \subseteq X \]
and
\[ \overline{\bigcup_{n=1}^{\infty} X_n} = X. \]
Proof. First let $X = \{0\}$. Then, we set $X_n := X$ for all $n$.
Now let $1 \le \dim X \le \infty$. Since $X$ is separable, there exists an at
most countable dense subset $M$ of $X$. By the proof of Proposition 6, $M$ is
countable, i.e.,
\[ M = \{u_1, u_2, \ldots\}. \]
Set $X_n := \operatorname{span}\{u_1, \ldots, u_n\}$ and $K := \bigcup_{n=1}^{\infty} X_n$. Since $M \subseteq K$, it follows
from $\overline{M} \subseteq \overline{K}$ and $\overline{M} = X$ that $\overline{K} = X$. □
1.27 Summary of Important Notions

We may distinguish between the ants, who read page n before page
(n + 1), and the grasshoppers, who skim and skip until something of
interest appears and only then attempt to trace its logical ancestry.
For the sake of the grasshoppers, herewith is a listing of certain
basics.
Dan Henry (1981)
Let us summarize a number of important notions that have been introduced
in this chapter. These notions will be used frequently throughout this book.
The most important notion is compactness (compact sets and compact
operators).
The symbol $\mathbb{K}$ stands for either $\mathbb{R}$ (the set of real numbers) or $\mathbb{C}$ (the set
of complex numbers).
1.27.1 Linear Spaces

By a linear space $X$ over $\mathbb{K}$ we understand a set such that the linear
combinations
\[ \alpha u + \beta v \]
are defined for all $u, v \in X$ and $\alpha, \beta \in \mathbb{K}$. In addition, we postulate that
the usual rules for classical vectors in the three-dimensional space of our
intuition remain valid. The precise definition can be found in Section 1.1.
The points $u_1, \ldots, u_m$ in $X$ are called linearly independent iff, for
$\alpha_1, \ldots, \alpha_m \in \mathbb{K}$,
\[ \alpha_1 u_1 + \cdots + \alpha_m u_m = 0 \quad \text{implies} \quad \alpha_1 = \cdots = \alpha_m = 0. \]
The maximal number of linearly independent points is called the dimension
$\dim X$ of $X$. In particular, we write $\dim X = \infty$ iff there is no finite
maximal number of linearly independent points.
Let $X$, $Y$ be linear spaces over $\mathbb{K}$. A subset $M$ of $X$ is called a linear
subspace of $X$ iff
\[ u, v \in M \quad \text{implies} \quad \alpha u + \beta v \in M \text{ for all } \alpha, \beta \in \mathbb{K}. \]
The set $M$ is called convex iff
\[ u, v \in M \quad \text{implies} \quad tu + (1 - t)v \in M \text{ for all } t \in [0, 1]. \]
An operator
\[ A: M \subseteq X \to Y \]
assigns to each point $u \in M$ precisely one point in $Y$, which is denoted by
$Au$. The set $M$ is called the domain of definition of $A$. We sometimes write
$D(A)$ instead of $M$. The set $A(M)$ of all the image points of $A$ is called
the range of $A$. We also write $R(A)$ for $A(M)$.
The operator $A$ is called surjective (resp., injective) iff $R(A) = Y$ (resp.,
$Au = Av$ implies $u = v$). The operator $A$ is called bijective iff it is both
injective and surjective.
The operator $A: M \subseteq X \to Y$ is called linear iff $M$ is a linear subspace
of $X$ and
\[ A(\alpha u + \beta v) = \alpha Au + \beta Av \quad \text{for all } u, v \in M \text{ and } \alpha, \beta \in \mathbb{K}. \]
Operators of the form $A: M \subseteq X \to \mathbb{K}$ are also called functionals. The
functional $A: M \subseteq X \to \mathbb{R}$ is called convex iff $M$ is a convex set and
\[ A(tu + (1 - t)v) \le tAu + (1 - t)Av \quad \text{for all } u, v \in M, \; t \in [0, 1]. \]
1.27.2 Sets in Normed Spaces

It is typical for normed spaces that many notions from topology can be
characterized by means of sequences.
A linear space $X$ over $\mathbb{K}$ is called a normed space over $\mathbb{K}$ iff to each $u \in X$
a real number $\|u\| \ge 0$ is assigned such that, for all $u, v \in X$ and $\alpha \in \mathbb{K}$,
the following hold:
(a) $\|u\| = 0$ iff $u = 0$.
(b) $\|\alpha u\| = |\alpha|\|u\|$.
(c) $\|u + v\| \le \|u\| + \|v\|$.
The number $\|u - v\|$ is called the distance between the two points $u$ and $v$.
A sequence $(u_n)$ in the normed space $X$ converges to the point $u$ iff
\[ \|u_n - u\| \to 0 \quad \text{as } n \to \infty. \]
We briefly write $u_n \to u$ as $n \to \infty$. The sequence $(u_n)$ in $X$ is called a
Cauchy sequence iff, for each $\varepsilon > 0$, there is a number $n_0(\varepsilon)$ such that
\[ \|u_n - u_m\| < \varepsilon \quad \text{for all } n, m \ge n_0(\varepsilon). \]
A normed space $X$ over $\mathbb{K}$ is called a Banach space over $\mathbb{K}$ iff each Cauchy
sequence in $X$ is convergent.
Let $M$ be a subset of the normed space $X$. Then, $M$ is called open iff,
for each point $u \in M$, there is a number $\varepsilon > 0$ such that the set
\[ \{v \in X : \|v - u\| < \varepsilon\} \]
is contained in $M$. The set $M$ is called closed iff the complement $X - M$
is open. This is equivalent to the fact that $M$ is sequentially closed, i.e.,
\[ u_n \to u \text{ as } n \to \infty \text{ and } u_n \in M \text{ for all } n \quad \text{implies} \quad u \in M. \]
By definition, the closure $\overline{M}$ of $M$ is the smallest closed set that contains
$M$, and the interior $\operatorname{int} M$ of $M$ is the largest open set contained in $M$.
The set $M$ is called bounded iff there exists a number $r > 0$ such that
\[ \|u\| \le r \quad \text{for all } u \in M. \]
The set $M$ is called dense in $X$ iff for each point $u \in X$ there exists a
sequence $(u_n)$ in $M$ such that $u_n \to u$ as $n \to \infty$.
The set $M$ is called relatively sequentially compact iff each sequence $(u_n)$
in $M$ has a convergent subsequence. If, in addition, the limits of these
convergent subsequences belong to $M$, then $M$ is called sequentially compact.
The set $M$ is called compact iff each family of open sets that covers
$M$ possesses a finite subfamily that already covers $M$. The set $M$ is called
relatively compact iff the closure $\overline{M}$ of $M$ is compact. We have the following
equivalences:²
$M$ is compact iff $M$ is sequentially compact;
$M$ is relatively compact iff $M$ is relatively sequentially compact.
By an open neighborhood $U(u_0)$ of the point $u_0 \in X$, we understand an
open subset of $X$ which contains the point $u_0$.
1.27.3 Operators in Normed Spaces

Let $A: M \subseteq X \to Y$ be an operator, where $X$ and $Y$ are normed spaces
over $\mathbb{K}$. Then, $A$ is called continuous at the point $u \in M$ iff, for each $\varepsilon > 0$,
there is a $\delta(\varepsilon) > 0$ such that
\[ \|v - u\| < \delta(\varepsilon) \text{ and } v \in M \quad \text{imply} \quad \|Av - Au\| < \varepsilon. \]
This is equivalent to the fact that $A$ is sequentially continuous at the point
$u$, i.e., for each sequence $(u_n)$ in $M$,
\[ u_n \to u \text{ as } n \to \infty \quad \text{implies} \quad Au_n \to Au \text{ as } n \to \infty. \]
² This will be proved in Problem 1.12 of AMS Vol. 109. In the following
chapters, we only need the concepts of relative sequential compactness and
sequential compactness. However, to simplify notation, we will use "compact" and
"relatively compact" instead of "sequentially compact" and "relatively
sequentially compact." By the equivalences above, this convention cannot cause any
misunderstandings.
The operator $A: M \subseteq X \to Y$ is called continuous iff it is continuous at
each point $u \in M$. If, in addition, $A$ maps bounded sets onto relatively
compact sets, then the operator $A$ is called compact.
The linear operator $A: X \to Y$ is continuous iff it is bounded, i.e., there
exists a number $d \ge 0$ such that
\[ \|Au\| \le d\|u\| \quad \text{for all } u \in X. \]
The smallest of these numbers $d$ is called the norm $\|A\|$ of the operator $A$,
i.e.,
\[ \|A\| := \sup_{\|u\| \le 1} \|Au\|. \]
Hence, $\|Au\| \le \|A\|\|u\|$ for all $u \in X$.
The set of all linear continuous functionals $f: X \to \mathbb{K}$ on the normed
space $X$ over $\mathbb{K}$ becomes a Banach space $X^*$ with respect to the norm
$\|f\|$, i.e., we define the linear combination $\alpha f + \beta g$ through
\[ (\alpha f + \beta g)(u) := \alpha f(u) + \beta g(u) \quad \text{for all } u \in X, \]
and we set $\|f\| := \sup_{\|u\| \le 1} |f(u)|$. Here, $X^*$ is called the dual space to $X$.
Let $X$ be a normed space over $\mathbb{K}$ and let $Y$ be a Banach space over
$\mathbb{K}$. Similarly to $X^*$, the space $L(X, Y)$ of all linear continuous operators
$A: X \to Y$ becomes a Banach space over $\mathbb{K}$ equipped with the operator
norm $\|A\|$. Obviously, $X^* = L(X, \mathbb{K})$.
Let $A: X \to X$ be a linear operator, where $X$ is a normed space over $\mathbb{K}$.
The number $\lambda \in \mathbb{K}$ is called an eigenvalue of $A$ iff the equation
\[ Au = \lambda u, \qquad u \in X, \]
has a nontrivial solution $u \ne 0$. Let $\mathbb{K} = \mathbb{C}$. The resolvent set $\rho(A)$ of
$A$ consists precisely of all those numbers $\lambda \in \mathbb{C}$ for which the continuous
inverse operator
\[ (\lambda I - A)^{-1}: X \to X \]
exists. This operator is called the resolvent of $A$ at $\lambda$.
The complement $\sigma(A) := \mathbb{C} - \rho(A)$ of $\rho(A)$ is called the spectrum of
the operator $A$. Each eigenvalue of $A$ is contained in the spectrum of $A$.
However, the spectrum may contain additional points.
Problems
1.1. Simple examples. Let $X := C[a, b]$, where $-\infty < a < b < \infty$ and
$\|u\| := \max_{a \le x \le b} |u(x)|$. Show that
1.1a. $\{u \in X : \int_a^b u(x)\,dx = 0\}$ is a closed linear subspace of X. This set
is not dense in X.
1.1b. $\{u \in X : u(a)^2 = u(b)\}$ is a closed subset of X, but not a linear
subspace of X.
1.1c. $\{u \in X : u(a) > 0\}$ is an open, convex, not dense subset of X.
1.1d. $\{u \in X : u(a) = 1\}$ is a closed, convex, not dense subset of X.
1.1e. $\{u \in X : \|u\| \le 1\}$ is not a compact subset of X.
Hint: Construct a sequence $(u_n)$ of, say, piecewise linear continuous
functions such that $u_n(x) \to 0$ as $n \to \infty$ for all $x \in [a, b]$, where this
convergence is not uniform on [a, b].
1.1f. $\{u \in X : u(x) = 0 \text{ on } [c, d]\}$ is not dense in X provided $a \le c < d \le b$.
1.1g. $\{u \in X : u(a) \ge 0\}$ is the closure of the set $\{u \in X : u(a) > 0\}$ in
X.
1.1h. If we set $\phi(u) := |u(a)|$, then $\phi$ is not a norm on X.
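1.1i. If we set
$$\|u\|_1 := \int_a^b |u(x)|\,dx,$$
then $\|\cdot\|_1$ is a norm on X, but X is not a Banach space with respect to
$\|\cdot\|_1$.
Hint: Define a discontinuous function $w\colon [a, b] \to \mathbb{R}$, say,
$$w(x) := \begin{cases} 1 & \text{if } a \le x \le c < b \\ 0 & \text{if } c < x \le b. \end{cases}$$
Construct a sequence $(u_n)$ in X such that
$$\int_a^b |u_n(x) - w(x)|\,dx \to 0 \quad \text{as } n \to \infty.$$
Show that $(u_n)$ is Cauchy with respect to $\|\cdot\|_1$. Suppose that $\|u_n - u\|_1 \to 0$
as $n \to \infty$, where $u \in X$. Then,
$$\int_a^b |u(x) - w(x)|\,dx \le \|u - u_n\|_1 + \int_a^b |u_n(x) - w(x)|\,dx \to 0 \quad \text{as } n \to \infty.$$
Hence u(x) = w(x) on [a, b], contradicting the continuity of the function u.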
1.1j. The operators $A\colon X \to X$ and $B\colon X \to X$ defined through
$$(Au)(x) := u(a) \quad \text{and} \quad (Bu)(x) := \int_a^x u(y)\,dy$$
are linear and continuous with $\|A\| = 1$ and $\|B\| = b - a$.

1.1k. If we set
$$f(u) := \int_a^b y\,u(y)\,dy \quad \text{for all } u \in X,$$

then $f \in X^*$ with $\|f\|$ = (b-2a)2.
1.1l. Let $\alpha \in \mathbb{R}$ with $|\alpha|(b - a) < 1$. For each given $u_0 \in X$, the iteration
method
$$u_{n+1}(x) = \alpha \int_a^b \sin u_n(x)\,dx + 1, \quad n = 0, 1, \ldots, \quad x \in [a, b],$$
converges uniformly on [a, b] to the unique solution $u \in X$ of the integral
equation
$$u(x) = \alpha \int_a^b \sin u(x)\,dx + 1, \quad x \in [a, b].$$
1.1m. Let $\alpha \in \mathbb{R}$ with $|\alpha| < 1$. For each given $u_0 \in \mathbb{R}$, the iteration
method
$$u_{n+1} = \alpha \sin u_n + 1, \quad n = 0, 1, \ldots,$$
converges to the unique solution $u \in \mathbb{R}$ of the equation $u = \alpha \sin u + 1$.
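The scalar iteration just stated can be tried out numerically. The following is a minimal sketch; the function name `iterate`, the tolerance, and the sample value 0.5 for the coefficient are assumptions made for illustration. Since the map $u \mapsto \alpha \sin u + 1$ has Lipschitz constant $|\alpha| < 1$, different starting values should lead to the same limit.

```python
import math


def iterate(alpha, u0, tol=1e-12, max_steps=10_000):
    """Iterate u_{n+1} = alpha*sin(u_n) + 1 until successive values differ by <= tol.

    Returns the last iterate and the number of steps used.
    """
    u = u0
    for step in range(1, max_steps + 1):
        u_next = alpha * math.sin(u) + 1.0
        if abs(u_next - u) <= tol:
            return u_next, step
        u = u_next
    raise RuntimeError("no convergence within max_steps")


alpha = 0.5  # sample value with |alpha| < 1
limit_a, steps_a = iterate(alpha, u0=-3.0)
limit_b, steps_b = iterate(alpha, u0=10.0)

# Residual of the fixed-point equation u = alpha*sin(u) + 1.
residual = abs(limit_a - (alpha * math.sin(limit_a) + 1.0))
```

Both runs end at the same number up to the tolerance, which is what uniqueness of the fixed point predicts.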
1.1n. Let $K(x, y)\colon [a, b] \times [a, b] \to \mathbb{R}$ be continuous with $0 \le K(x, y) \le d$
for all $x, y \in [a, b]$. Let $2(b - a)d \le 1$ along with $u_0(x) \equiv 0$ and $v_0(x) \equiv 2$.
Then, the two iteration methods
$$u_{n+1}(x) = \int_a^b K(x, y)\,u_n(y)\,dy + 1, \quad n = 0, 1, \ldots, \quad x \in [a, b],$$
$$v_{n+1}(x) = \int_a^b K(x, y)\,v_n(y)\,dy + 1$$
converge uniformly on [a, b] to the unique solution $u \in X$ of the integral
equation
$$u(x) = \int_a^b K(x, y)\,u(y)\,dy + 1, \quad x \in [a, b],$$
where $u_0(x) \le u_1(x) \le \cdots \le v_1(x) \le v_0(x)$ for all $x \in [a, b]$.

Hint: Use Theorem 1.E on sub- and supersolutions and Example 2 in
Section 1.7.
1.1o. Let $\alpha \in \mathbb{R}$ and $f \in X$ be given. Then, the nonlinear integral
equation
$$u(x) = \alpha \int_a^b \sin u(x)\,dx + f(x)$$
has a solution $u \in X$.

Hint: Use the Leray-Schauder principle from Section 1.18.
1.1p. Let $f\colon \mathbb{R}^2 \to \mathbb{R}$ be continuous. Then, the system
e = 1027 + sin/(e, "'),

has a solution $(\xi, \eta) \in \mathbb{R}^2$.

Hint: Use the Leray-Schauder principle from Section 1.18.
1.2. Balls. Set $B := \{u \in X : \|u\| \le r\}$ and $B_0 := \{u \in X : \|u\| < r\}$ for
fixed $r > 0$, where X is a normed space over $\mathbb{K}$. Show that
$$\overline{B} = B = \overline{B_0}, \quad \operatorname{int} B = B_0, \quad \partial B = \{u \in X : \|u\| = r\}.$$
1.3. The spectrum. Let $\sigma(A)$ denote the spectrum of the linear operator
$A\colon X \to X$. Show that
1.3a. $\sigma(A) = \{2\}$ provided $X := \mathbb{C}$ and $Au = 2u$.
1.3b. If $X = \mathbb{C}^N$, $N \ge 1$, then the spectrum $\sigma(A)$ of the matrix operator
$A\colon X \to X$ given through
$$\eta_j = \sum_{k=1}^N a_{jk}\,\xi_k, \quad j = 1, \ldots, N,$$
consists precisely of all the eigenvalues $\lambda \in \mathbb{C}$ of the matrix $(a_{jk})$, i.e., $\lambda$ is
a solution of the characteristic equation
$$\det(\lambda\,\delta_{jk} - a_{jk}) = 0,$$
where det(·) denotes the determinant of the corresponding (N×N)-matrix.
1.3c. u(A) = {2~, -2~} and IIAII = 2 provided
where $X := \mathbb{C}^2$ and $\|(\xi, \eta)\| := \max\{|\xi|, |\eta|\}$ on X.

1.3d. If X is a complex, finite-dimensional normed space, then the spectrum
$\sigma(A)$ consists precisely of all the eigenvalues of the operator A.
Hint: Use Proposition 3 in Section 1.20.
1.4. The spectral radius. Let $A\colon X \to X$ be a linear continuous operator on
the complex Banach space X. Define the spectral radius r(A) of A through
$$r(A) := \sup_{\lambda \in \sigma(A)} |\lambda|.$$
Show that
1.4a. $r(A) \le \|A\|$, by Proposition 2 in Section 1.25.
1.4b.* $r(A) = \lim_{n \to \infty} \|A^n\|^{1/n}$.

Hint: Cf. Yosida (1980), Chapter 8, Section 2.
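The limit formula for the spectral radius can be observed on a small matrix. The sketch below uses a hypothetical 2×2 example with the double eigenvalue 1 (so the spectral radius is 1) and the operator norm induced by the maximum norm, which for matrices is the largest absolute row sum; the helper names are assumptions made for this illustration. For this matrix the norm of the n-th power is n + 1, so the n-th roots decrease slowly toward 1 although the norm of the matrix itself is 2.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def row_sum_norm(m):
    """Operator norm induced by the maximum norm: largest absolute row sum."""
    return max(sum(abs(x) for x in row) for row in m)


def gelfand_estimates(a, exponents):
    """Return {n: ||a^n||^(1/n)} for the requested exponents n >= 1."""
    wanted = set(exponents)
    estimates = {}
    power = a
    for n in range(1, max(wanted) + 1):
        if n > 1:
            power = matmul(power, a)
        if n in wanted:
            estimates[n] = row_sum_norm(power) ** (1.0 / n)
    return estimates


# Sample matrix: its only eigenvalue is 1, hence r(A) = 1, while ||A|| = 2.
A = [[1, 1], [0, 1]]
estimates = gelfand_estimates(A, [1, 10, 100, 1000])
```

The values in `estimates` start at 2 for n = 1 and approach 1 from above as n grows.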
1.4c. Volterra integral operator. Let $X := C[a, b]_{\mathbb{C}}$, where $-\infty < a <
b < \infty$ (cf. Problem 1.6e). Define the operator $A\colon X \to X$ through
$$(Au)(x) := \int_a^x K(x, y)\,u(y)\,dy \quad \text{for all } x \in [a, b],$$
where $K\colon [a, b] \times [a, b] \to \mathbb{C}$ is continuous. Then, r(A) = 0, and hence
$\sigma(A) = \{0\}$.
Hint: Use Problem 1.4b. Cf. Zeidler (1986), Vol. 1, p. 38.
1.4d. Fredholm integral operator. Let $X := C[a, b]_{\mathbb{C}}$. Define $A\colon X \to X$
through
$$(Au)(x) := \int_a^b K(x, y)\,u(y)\,dy \quad \text{for all } x \in [a, b],$$
where K is given as in Problem 1.4c. Show that
$$r(A) \le (b - a) \max_{x, y \in [a, b]} |K(x, y)|.$$
Hint: Use Problem 1.4a.
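1.5. The Banach space $l_\infty^{\mathbb{K}}$. Let $\mathbb{K}^\infty$ denote the space of all sequences
$(u_n)_{n \ge 1}$, where $u_n \in \mathbb{K}$ for all $n \in \mathbb{N}$. Moreover, let $l_\infty^{\mathbb{K}}$ denote the set
of all $(u_n) \in \mathbb{K}^\infty$ such that
$$\|(u_n)\|_\infty := \sup_{n \ge 1} |u_n| < \infty.$$
Define
$$\alpha(u_n) + \beta(v_n) := (\alpha u_n + \beta v_n) \quad \text{for all } \alpha, \beta \in \mathbb{K}.$$
Show that
1.5a. $\mathbb{K}^\infty$ is an infinite-dimensional linear space over $\mathbb{K}$.
1.5b. $l_\infty^{\mathbb{K}}$ is an infinite-dimensional Banach space over $\mathbb{K}$ with respect to
the norm $\|\cdot\|_\infty$.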
1.6. Classical function spaces on [a, b]. Let $-\infty < a < b < \infty$. Show that
the following function spaces are Banach spaces.
Observe that Hölder continuous functions play a fundamental role in
the theory of linear and nonlinear elliptic and parabolic partial differential
equations³ as well as in classical potential theory.⁴
³ Cf. Zeidler (1986), Vol. 1, Chapter 6, and Gilbarg and Trudinger (1977).
⁴ Cf. Kellogg, Foundations of Potential Theory, Springer-Verlag, 1929.
1.6a. Let B[a, b] denote the set of all bounded functions $u\colon [a, b] \to \mathbb{R}$ and
set
$$\|u\| := \sup_{a \le x \le b} |u(x)|.$$
1.6b. For $0 < \alpha \le 1$, let $C^{0,\alpha}[a, b]$ denote the set of all the so-called
Hölder continuous functions $u\colon [a, b] \to \mathbb{R}$, i.e., by definition,
$$|u(x) - u(y)| \le \text{const}\,|x - y|^\alpha \quad \text{for all } x, y \in [a, b]. \tag{85}$$
Let
$$H_\alpha(u) := \sup \frac{|u(x) - u(y)|}{|x - y|^\alpha},$$
where the supremum is taken over all $x, y \in [a, b]$ with $x \ne y$, i.e., the so-called
Hölder constant $H_\alpha(u)$ of u is the smallest constant such that (85)
holds. In particular,
$$|u(x) - u(y)| \le H_\alpha(u)\,|x - y|^\alpha \quad \text{for all } x, y \in [a, b].$$
Set
$$\|u\| := \max_{a \le x \le b} |u(x)| + H_\alpha(u).$$
1.6c. Let $C^k[a, b]$ with k = 1, 2, ... denote the set of all continuous
functions $u\colon [a, b] \to \mathbb{R}$ that have continuous derivatives on [a, b] up to order
k. Set
$$\|u\| := \sum_{j=0}^k \max_{a \le x \le b} |u^{(j)}(x)|,$$
where $u^{(j)}$ denotes the jth derivative.
1.6d. For $0 < \alpha \le 1$ and k = 1, 2, ..., let $C^{k,\alpha}[a, b]$ denote the set of all
functions $u \in C^k[a, b]$ with $u^{(k)} \in C^{0,\alpha}[a, b]$. Set
$$\|u\| := \sum_{j=0}^k \max_{a \le x \le b} |u^{(j)}(x)| + H_\alpha(u^{(k)}).$$
1.6e. Let $C[a, b]_{\mathbb{C}}$ denote the set of all complex continuous functions
$u\colon [a, b] \to \mathbb{C}$. Define
$$\|u\| := \max_{a \le x \le b} |u(x)|.$$
1.7. Compact embedding. Use the Arzelà-Ascoli theorem from Section
1.11.1 in order to prove that the embedding
$$C^{0,\alpha}[a, b] \subseteq C[a, b], \quad 0 < \alpha \le 1,$$

is compact, i.e., each bounded set in $C^{0,\alpha}[a, b]$ is relatively compact in
$C[a, b]$.
Show also that the embedding
$$C^{0,\alpha}[a, b] \subseteq C^{0,\beta}[a, b], \quad 0 < \beta < \alpha \le 1,$$
is compact, again by the Arzelà-Ascoli theorem.
Hint: Cf. Zeidler (1986), Vol. 2A, p. 283.
1.8. Classical function spaces on subsets of $\mathbb{R}^N$. Let us directly generalize
the function spaces from Problem 1.6. Let G be a nonempty bounded open
set in $\mathbb{R}^N$, $N \ge 1$. Show that the following function spaces are Banach
spaces.
1.8a. Let B(M) denote the set of all bounded functions $u\colon M \to \mathbb{R}$. Set
$$\|u\| := \sup_{x \in M} |u(x)|,$$
where M is an arbitrary nonempty subset of $\mathbb{R}^N$.

1.8b. Let C(M) denote the set of all continuous functions $u\colon M \to \mathbb{R}$.
Set
$$\|u\| := \max_{x \in M} |u(x)|,$$
where M is a nonempty compact subset of $\mathbb{R}^N$ (e.g., $M = \overline{G}$).

1.8c. For $0 < \alpha \le 1$, let $C^{0,\alpha}(\overline{G})$ denote the set of all Hölder continuous
functions $u\colon \overline{G} \to \mathbb{R}$, i.e.,
$$|u(x) - u(y)| \le \text{const}\,|x - y|^\alpha \quad \text{for all } x, y \in \overline{G}.$$
Set
$$\|u\| := \max_{x \in \overline{G}} |u(x)| + H_\alpha(u),$$
where
$$H_\alpha(u) := \sup_{x, y \in \overline{G},\ x \ne y} \frac{|u(x) - u(y)|}{|x - y|^\alpha}.$$
1.8d. For k = 1, 2, ..., let $C^k(\overline{G})$ denote the set of all functions $u \in C(\overline{G})$
which have continuous partial derivatives on G up to order k. Moreover,
suppose that all these partial derivatives can be extended continuously to
the closure $\overline{G}$. Set
$$\|u\| := \sum_{|\beta| \le k} \max_{x \in \overline{G}} |\partial^\beta u(x)|,$$
where we sum over the function and all its partial derivatives up to order
k.
1.8e. For $0 < \alpha \le 1$ and k = 1, 2, ..., let $C^{k,\alpha}(\overline{G})$ denote the set of all
functions $u \in C^k(\overline{G})$ such that all the partial derivatives of u of order k
belong to $C^{0,\alpha}(\overline{G})$. We set
1.9. Complexification. Let X be a real linear space. Define
$$X_{\mathbb{C}} := \{(u, v) : u, v \in X\}.$$
Show that
1.9a. $X_{\mathbb{C}}$ forms a complex linear space with respect to the following
operations:
$$(u, v) + (w, z) = (u + w, v + z), \tag{86}$$
$$(\alpha + i\beta)(u, v) = (\alpha u - \beta v, \alpha v + \beta u), \quad \alpha, \beta \in \mathbb{R}. \tag{87}$$
Instead of (u, v) we also write u + iv. Then, (87) corresponds to the
following formal multiplication rule:
$$(\alpha + i\beta)(u + iv) = \alpha u - \beta v + i(\alpha v + \beta u).$$
1.9b. If X is a real normed space, then $X_{\mathbb{C}}$ becomes a complex normed
space equipped with the following norm:
$$\|u + iv\| := \max_{0 \le \phi \le 2\pi} \|(\cos\phi)u + (\sin\phi)v\|.$$
1.9c. If X is a real Banach space, then $X_{\mathbb{C}}$ is a complex Banach space.

1.9d. Every linear continuous operator $A\colon X \to X$ can be extended to a
linear continuous operator $A_{\mathbb{C}}\colon X_{\mathbb{C}} \to X_{\mathbb{C}}$ by the quite natural definition
$$A_{\mathbb{C}}(u + iv) := Au + iAv.$$
Then, $\|A\| \le \|A_{\mathbb{C}}\| \le 2\|A\|$.

1.9e. Show that if $X := \mathbb{R}$, then $X_{\mathbb{C}} = \mathbb{C}$ along with $\|u + iv\| = |u + iv|$
for all complex numbers u + iv with $u, v \in \mathbb{R}$.
1.10. Density. Let D be a dense subset of the normed space X over $\mathbb{K}$.

Show that
$$\langle u^*, u \rangle = 0 \quad \text{for all } u \in D \text{ and fixed } u^* \in X^*$$
implies $u^* = 0$.
Solution: Recall that $u^*(u) = \langle u^*, u \rangle$. Let $v \in X$ be given. Since D is
dense in X, there exists a sequence $(u_n)$ in D such that $u_n \to v$ as $n \to \infty$.
The functional $u^*$ is continuous and $u^*(u_n) = 0$ for all n. Hence
$$u^*(v) = \lim_{n \to \infty} u^*(u_n) = 0 \quad \text{for all } v \in X.$$
Therefore, $u^* = 0$.
1.11. The importance of equivalent norms. Let $\|\cdot\|$ and $\|\cdot\|_1$ be two
equivalent norms on the linear space X over $\mathbb{K}$, and let Y be a normed
space over $\mathbb{K}$.
1.11a. Show that the following notions are invariant under a passage
from the norm $\|\cdot\|$ to $\|\cdot\|_1$: convergent sequence, open set, closed set,
bounded set, closure of a set, interior of a set, sequentially compact set,
relatively sequentially compact set, dense set in X, continuous (resp., compact)
operator $A\colon X \to Y$, and Cauchy sequence.
In particular, X is a Banach space with respect to $\|\cdot\|$ iff X is a Banach
space with respect to $\|\cdot\|_1$.
1.11b. Use Banach's continuous inverse theorem from Section 3.5 of
AMS Vol. 109 in order to prove the following. Suppose that there is a
constant d > 0 such that
$$\|u\| \le d\,\|u\|_1 \quad \text{for all } u \in X, \tag{88}$$
and suppose that X is a Banach space with respect to both the norms $\|\cdot\|$
and $\|\cdot\|_1$. Then, $\|\cdot\|$ is equivalent to $\|\cdot\|_1$ on X.
Solution: Let X and $X_1$ denote the linear space X equipped with the
norm $\|\cdot\|$ and $\|\cdot\|_1$, respectively. Define the operator
$$A\colon X_1 \to X \quad \text{through } Au := u \text{ for all } u \in X_1.$$
Obviously, A is bijective. By (88), A is continuous. Banach's continuous inverse
theorem tells us that the inverse operator $A^{-1}\colon X \to X_1$ is continuous,
too. Hence there is a constant c > 0 such that
$$\|A^{-1}u\|_1 \le c\,\|u\| \quad \text{for all } u \in X. \tag{89}$$
By (88) and (89), $c^{-1}\|u\|_1 \le \|u\| \le d\,\|u\|_1$ for all $u \in X$.

2
Hilbert Spaces, Orthogonality, and
the Dirichlet Principle
When the answers to a mathematical problem cannot be found, then

the reason is frequently the fact that we have not recognized the
general idea, from which the given problem appears only as a single
link in a chain of related problems.
David Hilbert, 1900 (Paris lecture)
In a famous paper from 1857, Riemann used the Dirichlet principle for
the foundation of the theory of complex analytic functions. In 1870 Weierstrass
showed that there are variational problems that do not have any
solution.¹ This way the justification of the Dirichlet principle became an
important open problem, which Hilbert solved in 1900.
In his Paris lecture, Hilbert formulated 23 open problems. In connection
with the twentieth problem, he said
The sophisticated methods of Schwarz, C. Neumann, and Poincaré
essentially solved the boundary-value problems for the Laplace equation.
However, these methods cannot be directly extended to more
general cases ... I am convinced that it will be possible to get these
existence proofs by a general basic idea, toward which the Dirichlet principle
points. Perhaps it will then also be possible to answer the question
of whether or not every regular variational problem possesses a solution
if, with regard to boundary conditions, certain assumptions are
fulfilled and if, when necessary, one sensibly generalizes the concept
of solution.
¹ This classical counterexample will be considered in Problem 2.1. A detailed
historical discussion of the Dirichlet principle and its influence on modern analysis
can be found in Zeidler (1986), Vol. 2A, Sections 18.7 through 18.9.
In this chapter we want to show that the Dirichlet principle can be

justified extremely elegantly by using elementary geometric arguments from
the theory of Hilbert spaces and by introducing Sobolev spaces based on
generalized derivatives. We will also study the convergence of the method
of finite elements, which represents one of the most important methods in
modern numerical analysis. A physical interpretation in terms of elasticity
can be found in Section 2.7.
In Hilbert spaces, an inner product (u | v) is defined, allowing us to
introduce the fundamental notion of orthogonality.
Figure 2.1 shows the relationship between Hilbert spaces and Banach
spaces, and others, that is, each Hilbert space is a Banach space, and so
forth. With a view to applications, the most important Hilbert spaces are
the Lebesgue spaces $L_2(G)$, $L_2^{\mathbb{C}}(G)$ and the related Sobolev spaces $W_2^1(G)$
and $\mathring{W}_2^1(G)$. Roughly speaking, the real Lebesgue space $L_2(G)$ (resp., the
complex Lebesgue space $L_2^{\mathbb{C}}(G)$) consists of all functions
$$u\colon G \subseteq \mathbb{R}^N \to \mathbb{R} \quad (\text{resp., } u\colon G \to \mathbb{C}) \quad \text{with} \quad \int_G |u(x)|^2\,dx < \infty.$$
[FIGURE 2.1. Hierarchy of spaces: the Lebesgue spaces ($L_2(G)$, $L_2^{\mathbb{C}}(G)$) and the Sobolev spaces ($W_2^1(G)$, $\mathring{W}_2^1(G)$) are Hilbert spaces; each Hilbert space is a Banach space (the Cauchy criterion is valid) and a pre-Hilbert space (inner product (u | v)); these are normed spaces (norm $\|u\| = (u \mid u)^{1/2}$), which are linear spaces ($\alpha u + \beta v$).]
[FIGURE 2.2. Applications: Fourier series and integral equations (Sections 3.2 and 4.4); partial differential equations of mathematical physics (Chapter 5); Sobolev spaces $W_2^1(G)$ and $\mathring{W}_2^1(G)$ with the Dirichlet principle and the calculus of variations (Section 2.4); complex Lebesgue space $L_2^{\mathbb{C}}(\mathbb{R}^N)$ with quantum mechanics (Section 5.13) and the Fourier transformation (Section 3.7).]
The corresponding inner product²
$$(u \mid v) := \int_G \overline{u(x)}\,v(x)\,dx$$
generalizes the classic Euclidean inner product
$$(u \mid v) = \sum_{j=1}^N \overline{u_j}\,v_j$$
on $\mathbb{R}^N$ and $\mathbb{C}^N$, where $u = (u_1, \ldots, u_N)$ and $v = (v_1, \ldots, v_N)$. Observe that
the integral $\int_G \ldots$ is to be understood in the sense of Lebesgue.
² The bar denotes the conjugate complex number. In the case of the real space
$L_2(G)$, u(x) is real, and hence
$(u \mid v) = \int_G u(x)v(x)\,dx$.
The theory of Hilbert spaces forces the use of the Lebesgue integral.
The deeper reason for this is the fact that in the case of the classical
Riemann integral the limiting relation
$$\lim_{n \to \infty} \int_G u_n(x)\,dx = \int_G u(x)\,dx$$
is only valid under very restrictive assumptions, in contrast to the Lebesgue
integral. The Riemann integral leads only to pre-Hilbert spaces for which
the fundamental Cauchy criterion is not valid.
For the convenience of the reader, basic facts about the Lebesgue integral
are summarized in the appendix.
Figure 2.2 portrays some applications of Lebesgue spaces and Sobolev
spaces to important concrete problems. Moreover, Figure 2.3 displays the
logical structure of this chapter.
[FIGURE 2.3. Logical structure of the chapter, with the nodes: Banach fixed-point theorem; orthogonality principle; nonlinear Lipschitz continuous, strongly monotone operators; nonlinear Lax-Milgram theorem; Riesz theorem; orthogonal decomposition; existence of a perpendicular; main theorem on quadratic variational problems; generalized functions (distributions); generalized derivative; Sobolev spaces; justification of the Dirichlet principle; convergence of the Ritz approximation method (finite elements).]
2.1 Hilbert Spaces

Recall that $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$.
Definition 1. Let X be a linear space over $\mathbb{K}$. An inner product on X
assigns to each pair (u, v) with $u, v \in X$ a number
$$(u \mid v) \in \mathbb{K}$$
such that the following hold for all $u, v, w \in X$ and $\alpha, \beta \in \mathbb{K}$:
(i) $(u \mid u) \ge 0$ and $(u \mid u) = 0$ iff u = 0;

(ii) $(u \mid \alpha v + \beta w) = \alpha(u \mid v) + \beta(u \mid w)$;
(iii) $(u \mid v) = \overline{(v \mid u)}$ (cf. Remark 11 on page 379).
Here, the bar denotes the conjugate complex number.
A pre-Hilbert space over $\mathbb{K}$ is a linear space X over $\mathbb{K}$ together with an
inner product.
It follows from (ii) and (iii) that
$$(\alpha v + \beta w \mid u) = \overline{\alpha}(v \mid u) + \overline{\beta}(w \mid u) \quad \text{for all } u, v, w \in X,\ \alpha, \beta \in \mathbb{K}. \tag{1}$$
Let $u, v \in X$. Then, u is called orthogonal to v iff
$$(u \mid v) = 0. \tag{2}$$
The following Schwarz inequality (3) is the most important inequality
in pre-Hilbert and Hilbert spaces. In Section 5.14, we shall show that the
famous Heisenberg uncertainty relation in quantum mechanics follows from
the Schwarz inequality.
Proposition 2. Let X be a pre-Hilbert space. Then
$$|(u \mid v)| \le (u \mid u)^{1/2}\,(v \mid v)^{1/2} \quad \text{for all } u, v \in X. \tag{3}$$
Using the norm $\|\cdot\|$ introduced in Proposition 3 ahead, we can write the
Schwarz inequality in the following form:
$$|(u \mid v)| \le \|u\|\,\|v\| \quad \text{for all } u, v \in X. \tag{3*}$$
Proof. For v = 0, both sides of (3) vanish. Let $v \ne 0$. Then, we get (3) from
$$0 \le (u - \alpha v \mid u - \alpha v) = (u \mid u) - \alpha(u \mid v) - \overline{\alpha}\,[(v \mid u) - \alpha(v \mid v)]$$
with $\alpha := \dfrac{(v \mid u)}{(v \mid v)}$. □
Proposition 3. Each pre-Hilbert space X over $\mathbb{K}$ is also a normed space
over $\mathbb{K}$ with respect to the norm
$$\|u\| := (u \mid u)^{1/2} \quad \text{for all } u \in X. \tag{4}$$
Proof. We have $\|u\| \ge 0$ for all $u \in X$, and $\|u\| = 0$ iff u = 0. Furthermore,
$$\|\alpha u\| = (\alpha u \mid \alpha u)^{1/2} = (\overline{\alpha}\alpha)^{1/2}(u \mid u)^{1/2} = |\alpha|\,\|u\| \quad \text{for all } u \in X,\ \alpha \in \mathbb{K}.$$
Finally, the triangle inequality
$$\|u + v\| \le \|u\| + \|v\| \quad \text{for all } u, v \in X,$$
follows from the Schwarz inequality (3*). In fact, for all $u, v \in X$,
$$\|u + v\|^2 = (u + v \mid u + v) = (u \mid u) + (u \mid v) + \overline{(u \mid v)} + (v \mid v)$$
$$= \|u\|^2 + 2\operatorname{Re}(u \mid v) + \|v\|^2$$
$$\le \|u\|^2 + 2\|u\|\,\|v\| + \|v\|^2 = (\|u\| + \|v\|)^2,$$
where Re z denotes the real part of the complex number z. □
From Proposition 3 we obtain the following:
All the notions and theorems for normed spaces³ remain valid for
pre-Hilbert spaces with respect to the norm $\|u\|$ from (4).
In particular, the convergence
$$u_n \to u \quad \text{as } n \to \infty$$
in the pre-Hilbert space X is to be understood in the following sense:
$$\|u_n - u\| \to 0 \quad \text{as } n \to \infty.$$
Proposition 4. Let X be a pre-Hilbert space. Then, the following hold
true:
(i) The inner product is continuous, that is,
$$u_n \to u \quad \text{and} \quad v_n \to v \quad \text{as } n \to \infty$$
imply $(u_n \mid v_n) \to (u \mid v)$ as $n \to \infty$.
(ii) Let M be a dense subset of X. If
$$(u \mid v) = 0 \quad \text{for fixed } u \in X \text{ and all } v \in M,$$
then u = 0.
³ The basic notions concerning normed spaces and Banach spaces are summarized
in Section 1.27.
Proof. Ad (i). Since $(v_n)$ is bounded, it follows from the Schwarz inequality
(3*) that
$$|(u_n \mid v_n) - (u \mid v)| = |(u_n - u \mid v_n) + (u \mid v_n - v)|$$
$$\le |(u_n - u \mid v_n)| + |(u \mid v_n - v)|$$
$$\le \|u_n - u\|\,\|v_n\| + \|u\|\,\|v_n - v\| \to 0 \quad \text{as } n \to \infty.$$
Ad (ii). Since M is dense in X, there is a sequence $(v_n)$ in M such that
$v_n \to u$ in X as $n \to \infty$. Letting $n \to \infty$, it follows from
$$(u \mid v_n) = 0 \quad \text{for all } n$$
that $(u \mid u) = 0$. Hence u = 0. □
Example 5 (The product rule). Let X be a pre-Hilbert space, and let
$u, v\colon U(s) \subseteq \mathbb{R} \to X$ be two functions defined on an open neighborhood of
$s \in \mathbb{R}$ that are differentiable at the point s.
Then the function $t \mapsto (u(t) \mid v(t))$ is differentiable at s, where
$$\frac{d}{dt}(u(t) \mid v(t))\Big|_{t=s} = (u'(s) \mid v(s)) + (u(s) \mid v'(s)).$$
Proof. Set $\phi(t) := (u(t) \mid v(t))$. Letting $h \to 0$, the assertion follows from
$$\frac{\phi(s + h) - \phi(s)}{h} = \left(\frac{u(s + h) - u(s)}{h} \,\Big|\, v(s + h)\right) + \left(u(s) \,\Big|\, \frac{v(s + h) - v(s)}{h}\right)$$
and the continuity of the inner product. □
The following definition is basic.
Definition 6. By a Hilbert space we mean a pre-Hilbert space that is a

Banach space with respect to the norm Ilull from (4).
In other words, a linear space X over ][{ is a Hilbert space iff the following
hold:
(i) there exists an inner product in X, and
(ii) each Cauchy sequence with respect to the norm Ilull from (4) is con-
vergent.
If $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$, then X is called a real or complex Hilbert space,
respectively.
Example 7. Set $X := \mathbb{R}$. Then, X is a real Hilbert space with the inner
product
$$(u \mid v) := uv \quad \text{for all } u, v \in \mathbb{R}.$$
The corresponding norm $\|u\| = (u \mid u)^{1/2}$ equals |u|.

Example 8. Set $X := \mathbb{C}$. Then, X is a complex Hilbert space with the
inner product
$$(u \mid v) := \overline{u}\,v \quad \text{for all } u, v \in \mathbb{C},$$
and the norm $\|u\| = (u \mid u)^{1/2} = |u|$.

Proposition 9. Each finite-dimensional pre-Hilbert space is a Hilbert
space.
This follows immediately from the fact that each finite-dimensional

normed space is a Banach space.
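Proposition 10. Let X be a Hilbert space (resp., Banach space) over $\mathbb{K}$,
and let L be a linear subspace of X.
Then, the closure $\overline{L}$ of L is also a Hilbert space (resp., Banach space)
with respect to the restriction of the inner product (resp., norm) on X to
$\overline{L}$.
Proof. We first prove that $\overline{L}$ is a linear space over $\mathbb{K}$. In fact, let $u, v \in \overline{L}$
and $\alpha, \beta \in \mathbb{K}$. Then, there are sequences $(u_n)$ and $(v_n)$ in L such that
$$u_n \to u \quad \text{and} \quad v_n \to v \quad \text{in } X \text{ as } n \to \infty.$$
Letting $n \to \infty$, it follows from
$$\alpha u_n + \beta v_n \in L \quad \text{for all } n$$
that $\alpha u + \beta v \in \overline{L}$.
Restrict the inner product (resp., norm) on X to the subset $\overline{L}$ of X.
Then, $\overline{L}$ is a pre-Hilbert space (resp., normed space).
Finally, let $(u_n)$ be a Cauchy sequence in $\overline{L}$. Then, since X is complete, there is a $u \in X$ such that
$$u_n \to u \quad \text{in } X \text{ as } n \to \infty.$$
Since $\overline{L}$ is closed, $u \in \overline{L}$. Hence
$$u_n \to u \quad \text{in } \overline{L} \text{ as } n \to \infty. \quad □$$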
2.2 Standard Examples

The reader who wants to pass to important applications of the Hilbert space
theory as quickly as possible should only read the examples and proposi-
tions of this section without studying the corresponding proofs, which are
based on important properties of the Lebesgue integral.
Standard Example 1. The space $X := \mathbb{K}^N$, N = 1, 2, ..., is an
N-dimensional Hilbert space over $\mathbb{K}$ with the inner product
$$(x \mid y) := \sum_{j=1}^N \overline{\xi_j}\,\eta_j \quad \text{for all } x, y \in \mathbb{K}^N,$$
where $x = (\xi_1, \ldots, \xi_N)$ and $y = (\eta_1, \ldots, \eta_N)$. The corresponding norm is
given through
$$\|x\| = (x \mid x)^{1/2} = \Big(\sum_{j=1}^N |\xi_j|^2\Big)^{1/2} \quad \text{for all } x \in \mathbb{K}^N.$$
This is a real or complex Hilbert space if $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$, respectively.

For $X = \mathbb{R}^N$, the norm $\|x\|$ is identical to the Euclidean norm |x|.
Proof. One checks easily that (· | ·) represents an inner product. Thus, X
is a finite-dimensional pre-Hilbert space, and hence it is a Hilbert space. □
The corresponding space in the case where N = 00 will be studied in

Problem 2.2.
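To make the conventions concrete, here is a minimal numerical sketch of the inner product on $\mathbb{C}^3$, conjugate-linear in the first argument and linear in the second, as in Definition 1. The function names `inner` and `norm` and the sample vectors are assumptions chosen for illustration. The code evaluates the inner product, the induced norm, and both sides of the Schwarz inequality.

```python
import math


def inner(x, y):
    """Inner product on C^N: conjugate the first argument, as in (u|v) = sum conj(u_j) v_j."""
    return sum(complex(a).conjugate() * b for a, b in zip(x, y))


def norm(x):
    """Norm induced by the inner product, ||x|| = (x|x)^(1/2)."""
    return math.sqrt(inner(x, x).real)


# Hypothetical sample vectors in C^3.
x = [1 + 2j, -1j, 3]
y = [2, 1 + 1j, -1 + 4j]

ip = inner(x, y)                      # equals -2 + 9j for these vectors
schwarz_lhs = abs(ip)                 # |(x|y)|
schwarz_rhs = norm(x) * norm(y)       # ||x|| ||y||

# Conjugate symmetry (x|y) = conj((y|x)) and conjugate-linearity in the first slot.
symmetry_gap = abs(ip - inner(y, x).conjugate())
first_slot_gap = abs(inner([1j * a for a in x], y) - (-1j) * ip)
```

For these vectors `schwarz_lhs` is the square root of 85 and `schwarz_rhs` is the square root of 15 times 23, so the Schwarz inequality holds with room to spare.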
Example 2. Let $-\infty < a < b < \infty$. For all $u, v \in C[a, b]$, we define
$$(u \mid v) := \int_a^b uv\,dx. \tag{5}$$
One checks easily that this is an inner product on C[a, b].

The corresponding pre-Hilbert space is denoted by $C_*[a, b]$. Obviously,
the norm on $C_*[a, b]$, namely,
$$\|u\| = \Big(\int_a^b u^2\,dx\Big)^{1/2} \quad \text{for all } u \in C[a, b]$$
differs from the maximum norm $\max_{a \le x \le b} |u(x)|$ introduced in Chapter 1.

We shall show in Example 9 that $C_*[a, b]$ is not a Hilbert space, but only
a dense subset of the Hilbert space $L_2(a, b)$ whose definition is based on
the Lebesgue integral.
Convention 3. In the following, all integrals are to be understood in the

sense of Lebesgue.
2.2.1 The Hilbert Space $L_2(a, b)$

Standard Example 4. Suppose that $-\infty \le a < b \le \infty$. Let $L_2(a, b)$
denote the set of all measurable functions
$$u\colon\, ]a, b[\, \to \mathbb{R}$$
such that
$$\int_a^b |u|^2\,dx < \infty.$$
Then
(i) $L_2(a, b)$ is a real Hilbert space with respect to the following inner
product:
$$(u \mid v) := \int_a^b uv\,dx \quad \text{for all } u, v \in L_2(a, b).$$
(ii) $\dim L_2(a, b) = \infty$.
More precisely, we use the following identification principle:

(I) Two functions u and v correspond to the same element in the Hilbert
space $L_2(a, b)$ iff
$$u(x) = v(x) \quad \text{for almost all } x \in\, ]a, b[.$$

Thus, the elements of $L_2(a, b)$ are classes of functions characterized by
(I).
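Proof. In the following we will essentially make use of the Fatou lemma
for the Lebesgue integral (cf. the appendix).
Ad (i). Step 1: The classic Schwarz inequality. We want to show that if
$u, v \in L_2(a, b)$, then the integral $\int_a^b uv\,dx$ exists and
$$\Big|\int_a^b uv\,dx\Big| \le \Big(\int_a^b |u|^2\,dx\Big)^{1/2} \Big(\int_a^b |v|^2\,dx\Big)^{1/2}. \tag{6}$$
To prove this, we start with the simple, classic inequality
$$|\xi\eta| \le 2^{-1}(|\xi|^2 + |\eta|^2) \quad \text{for all } \xi, \eta \in \mathbb{C}, \tag{7}$$
and we write $\|u\| := (\int_a^b |u|^2\,dx)^{1/2}$.
First let $\|u\| = 0$ or $\|v\| = 0$. Then
$$u(x) = 0 \quad \text{or} \quad v(x) = 0 \quad \text{for almost all } x \in\, ]a, b[,$$
respectively. Hence $\int_a^b uv\,dx = 0$, i.e., (6) is true.
Suppose now that $\|u\| \ne 0$ and $\|v\| \ne 0$. Replacing u with $u/\|u\|$ and v
with $v/\|v\|$, if necessary, we may assume that $\|u\| = 1$ and $\|v\| = 1$. By (7),
$$|u(x)v(x)| \le 2^{-1}(|u(x)|^2 + |v(x)|^2) \quad \text{for all } x \in\, ]a, b[. \tag{8}$$
Since the functions u and v are measurable on ]a, b[, so is the product uv.
By (8), the existence of the integrals
$$\int_a^b |u|^2\,dx \quad \text{and} \quad \int_a^b |v|^2\,dx$$
implies the existence of the integral $\int_a^b |uv|\,dx$, and hence the existence of
$\int_a^b uv\,dx$. Furthermore, it follows from (8) that
$$\Big|\int_a^b uv\,dx\Big| \le \int_a^b |uv|\,dx \le 2^{-1}\Big(\int_a^b |u|^2\,dx + \int_a^b |v|^2\,dx\Big) = 1 = \|u\|\,\|v\|.$$
This is the desired inequality (6).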
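Step 2: We show that $L_2(a, b)$ is a linear space. In fact, for all $\alpha, \beta \in \mathbb{R}$
and all $x \in\, ]a, b[$,
$$|\alpha u(x) + \beta v(x)|^2 \le \alpha^2 |u(x)|^2 + 2|\alpha\beta|\,|u(x)v(x)| + \beta^2 |v(x)|^2.$$
Let $u, v \in L_2(a, b)$. Then, the integrals $\int_a^b |u|^2\,dx$ and $\int_a^b |v|^2\,dx$ exist. By
Step 1, this implies the existence of $\int_a^b |uv|\,dx$. Hence the integral
$$\int_a^b |\alpha u + \beta v|^2\,dx$$
exists, i.e., $\alpha u + \beta v \in L_2(a, b)$.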

Step 3: We prove that $(u \mid v) := \int_a^b uv\,dx$ is an inner product on $L_2(a, b)$.
We first show that
$$(u \mid u) = 0 \quad \text{iff} \quad u = 0.$$
In fact, let $(u \mid u) = 0$. Then
$$\int_a^b |u|^2\,dx = 0 \quad \text{implies} \quad u(x) = 0 \text{ for almost all } x \in\, ]a, b[.$$

By the identification principle (I) given earlier, we obtain that the function
u = u(x) corresponds to the zero element u = 0 in $L_2(a, b)$.
Conversely, let u = 0 be the zero element in $L_2(a, b)$. By (I), this element
corresponds to the class of all the functions u with u(x) = 0 for almost all
$x \in\, ]a, b[$. Hence $(u \mid u) = 0$.
Furthermore, it follows from
$$u(x) = u_1(x) \quad \text{and} \quad v(x) = v_1(x) \quad \text{for almost all } x \in\, ]a, b[$$
that $u(x)v(x) = u_1(x)v_1(x)$ for almost all $x \in\, ]a, b[$, and
$$\int_a^b uv\,dx = \int_a^b u_1 v_1\,dx, \quad \text{i.e.,} \quad (u \mid v) = (u_1 \mid v_1).$$
Consequently, the inner product respects the identification principle (I),
i.e., it depends only on the corresponding class of functions.
Finally, if $u, v, w \in L_2(a, b)$, then $(u \mid v) = (v \mid u)$ and
$$\int_a^b u(\alpha v + \beta w)\,dx = \alpha \int_a^b uv\,dx + \beta \int_a^b uw\,dx,$$
i.e., $(u \mid \alpha v + \beta w) = \alpha(u \mid v) + \beta(u \mid w)$.

Consequently, $L_2(a, b)$ is a pre-Hilbert space. Obviously, the Schwarz inequality
$$|(u \mid v)| \le \|u\|\,\|v\| \quad \text{for all } u, v \in L_2(a, b)$$
corresponds to the classic Schwarz inequality (6) from Step 1.
Step 4: Hilbert space. We want to show that $L_2(a, b)$ is a Hilbert space.
To this end, we have to show that each Cauchy sequence in $L_2(a, b)$ is
convergent. By Proposition 7 in Section 1.3, it is sufficient to prove that
each Cauchy sequence in $L_2(a, b)$ has a convergent subsequence.
Let $(u_n)$ be a Cauchy sequence in $L_2(a, b)$, i.e., for each $\varepsilon > 0$ there is an $n_0(\varepsilon)$ such that
$$\|u_n - u_m\| < \varepsilon \quad \text{for all } n, m \ge n_0(\varepsilon).$$
Choosing $\varepsilon = 2^{-k}$, k = 1, 2, ..., there follows the existence of natural
numbers $n_1 \le n_2 \le \cdots$ such that
$$\|u_{n_{k+1}} - u_{n_k}\| \le 2^{-k} \quad \text{for all } k = 1, 2, \ldots.$$
Set $v_k := u_{n_k}$ and
$$s_m(x) := \sum_{k=1}^m |v_{k+1}(x) - v_k(x)|.$$
Since the sequence $(s_m(x))$ is monotone increasing, the limit
$$S(x) := \lim_{m \to \infty} s_m(x)^2$$
exists for all $x \in\, ]a, b[$, where $0 \le S(x) \le \infty$. Since $L_2(a, b)$ is a pre-Hilbert
space, the triangle inequality holds. Hence
$$\|s_m\| \le \sum_{k=1}^m \|v_{k+1} - v_k\| \le 2^{-1} + 2^{-2} + \cdots \le 1 \quad \text{for all } m \ge 1.$$
This implies
$$\int_a^b s_m^2\,dx \le 1 \quad \text{for all } m \ge 1.$$
Thus, by the Fatou lemma, the function S is integrable over ]a, b[ with
$$\int_a^b S\,dx \le 1.$$
In particular, S(x) is finite for almost all $x \in\, ]a, b[$. In the remaining points
let us redefine S by setting S(x) := 0.
Letting $s(x) := S(x)^{1/2}$, we get
$$s(x) = \lim_{m \to \infty} s_m(x) \quad \text{for almost all } x \in\, ]a, b[, \tag{9}$$
and $\int_a^b s^2\,dx \le 1$, i.e., $s \in L_2(a, b)$.
We now use the identity
$$v_m(x) = v_1(x) + \sum_{k=1}^{m-1} \big(v_{k+1}(x) - v_k(x)\big). \tag{10}$$
By (9),
$$s(x) = \sum_{k=1}^\infty |v_{k+1}(x) - v_k(x)| < \infty \quad \text{for almost all } x \in\, ]a, b[. \tag{11}$$
Thus, the finite limit
$$v(x) := \lim_{m \to \infty} v_m(x)$$
exists for almost all $x \in\, ]a, b[$. In the remaining points x of the interval ]a, b[
we set v(x) := 0. As the limit of measurable functions $v_m$, the function v
is also measurable on the interval ]a, b[. According to (10) and (11),
$$|v(x)| \le |v_1(x)| + s(x) \quad \text{for almost all } x \in\, ]a, b[. \tag{12}$$
Since $v_1 \in L_2(a, b)$, we get $|v_1|, s \in L_2(a, b)$, and hence $|v_1| + s \in L_2(a, b)$,
by Step 2. It follows from (12) that
$$\int_a^b |v(x)|^2\,dx \le \int_a^b (|v_1(x)| + s)^2\,dx < \infty,$$
i.e., $v \in L_2(a, b)$.

Finally, we want to show that
$$v_n \to v \quad \text{in } L_2(a, b) \text{ as } n \to \infty. \tag{13}$$
In fact, since $(v_n)$ is a Cauchy sequence, for each $\varepsilon > 0$ there is an $m_0(\varepsilon)$
such that
$$\|v_n - v_m\|^2 = \int_a^b |v_n - v_m|^2\,dx \le \varepsilon^2 \quad \text{for all } n, m \ge m_0(\varepsilon).$$
Letting $m \to \infty$, it follows from the Fatou lemma that
$$\|v_n - v\|^2 = \int_a^b |v_n(x) - v(x)|^2\,dx = \int_a^b \lim_{m \to \infty} |v_n(x) - v_m(x)|^2\,dx \le \liminf_{m \to \infty} \int_a^b |v_n(x) - v_m(x)|^2\,dx \le \varepsilon^2,$$
for all $n \ge m_0(\varepsilon)$. This is (13).
Ad (ii). Choose a fixed compact interval [c, d] with $[c, d] \subset\, ]a, b[$ and
c < d. Define
$$u_n(x) := \begin{cases} x^n & \text{if } x \in [c, d] \\ 0 & \text{otherwise.} \end{cases}$$
Then $u_n \in L_2(a, b)$ for all n = 0, 1, 2, ....
It follows as in the proof of Example 9 in Section 1.1 that the functions
$u_0, \ldots, u_n$ are linearly independent for each n. Hence $\dim L_2(a, b) = \infty$. □
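2.2.2 The Lebesgue Spaces $L_2(G)$ and $L_2^{\mathbb{C}}(G)$

Proposition 5 (The space $L_2^{\mathbb{K}}(G)$). Let G be a nonempty measurable subset
of $\mathbb{R}^N$, $N \ge 1$ (e.g., G is open or closed), and let $L_2^{\mathbb{K}}(G)$ denote the set of
all measurable functions
$$u\colon G \to \mathbb{K}$$
such that
$$\int_G |u|^2\,dx < \infty. \tag{14}$$
Then
(i) $L_2^{\mathbb{K}}(G)$ is a Hilbert space with respect to the following inner product:
$$(u \mid v) := \int_G \overline{u}\,v\,dx \quad \text{for all } u, v \in L_2^{\mathbb{K}}(G). \tag{15}$$
More precisely, two functions u and v correspond to the same element
of the Hilbert space $L_2^{\mathbb{K}}(G)$ iff u(x) = v(x) for almost all $x \in G$.
For $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$, $L_2^{\mathbb{K}}(G)$ is a real or complex Hilbert space,
respectively.
(ii) If G is open, then $\dim L_2^{\mathbb{K}}(G) = \infty$.
In particular, the Schwarz inequality (3) applied to the Hilbert space
$L_2^{\mathbb{K}}(G)$ reads as follows:
$$\Big|\int_G \overline{u}\,v\,dx\Big| \le \Big(\int_G |u|^2\,dx\Big)^{1/2} \Big(\int_G |v|^2\,dx\Big)^{1/2} \tag{16}$$
for all $u, v \in L_2^{\mathbb{K}}(G)$. For brevity of notation, we set
$$L_2(G) := L_2^{\mathbb{K}}(G) \quad \text{if } \mathbb{K} = \mathbb{R}.$$
Note also that $L_2(a, b) = L_2(G)$ with G = ]a, b[.

Proof. Ad (i). Use the same argument as in the proof of Standard Example
4, by replacing the interval ]a, b[ with the set G.
Ad (ii). Since G is open, there is a cuboid $C := \{(\xi_1, \ldots, \xi_N) \in \mathbb{R}^N :
a < \xi_j < b \text{ for all } j\}$ with $C \subseteq G$, where $-\infty < a < b < \infty$. Define
$$u_n(x) := \begin{cases} \xi_1^n & \text{if } x \in C \\ 0 & \text{if } x \in \mathbb{R}^N - C, \end{cases} \tag{17}$$
where $x = (\xi_1, \ldots, \xi_N)$. Then, $u_n \in L_2^{\mathbb{K}}(G)$ for all n = 0, 1, 2, ....
It follows as in the proof of Example 9 in Section 1.1 that the functions
$u_0, \ldots, u_n$ are linearly independent for all n. Hence $\dim L_2^{\mathbb{K}}(G) = \infty$. □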
2.2.3 The Space $C_0^\infty(G)$ and Density in the Hilbert Space $L_2(G)$

The space $C_0^\infty(G)$ plays a fundamental role in modern analysis.
Definition 6. Let G be a nonempty open set in $\mathbb{R}^N$, $N \ge 1$. Then
(a) $C^k(G)$ is the set of all real functions
$$u\colon G \to \mathbb{R}$$
that have continuous partial derivatives of orders⁴ m = 0, 1, ..., k.
(b) $C^k(\overline{G})$ is the set of all $u \in C^k(G)$ for which all partial derivatives of
order m = 0, ..., k can be extended continuously to the closure $\overline{G}$ of
G.
(c) If $u \in C^k(G)$ (resp., $u \in C^k(\overline{G})$) for all k = 0, 1, 2, ..., then we write
$u \in C^\infty(G)$ (resp., $u \in C^\infty(\overline{G})$).
(d) $C_0^\infty(G)$ is the set of all functions $u \in C^\infty(G)$ that vanish outside
a compact subset C of G that depends on u, i.e., u(x) = 0 for all
$x \in G - C$ (see Figure 2.4).
⁴ As usual, we understand the derivative of order m = 0 to be the function
itself.
[FIGURE 2.4. Panels (a) and (b); panel (a) shows a function $u \in C_0^\infty(\mathbb{R})$.]
Instead of $C^0(G)$ (resp., $C^0(\overline{G})$) we write briefly $C(G)$ (resp., $C(\overline{G})$).
That is, $C(G)$ consists of all continuous functions $u\colon G \to \mathbb{R}$, and $C(\overline{G})$
consists of all continuous functions $u\colon \overline{G} \to \mathbb{R}$.
The set $C_0^\infty(G)_{\mathbb{C}}$ consists of all functions $u\colon G \to \mathbb{C}$ for which both the
real and the imaginary parts of u belong to $C_0^\infty(G)$. Similarly, we define
$C^k(G)_{\mathbb{C}}$, and so on.
If $u \in C^k(G)$, then we say "u is $C^k$ on G." In the one-dimensional special
case where G = ]a, b[, we write briefly
and
and so forth.
Proposition 7. Let G be a nonempty open set in $\mathbb{R}^N$, $N \ge 1$. Then, the following hold true:
(i) The set $C_0^\infty(G)$ is dense in $L_2(G)$.
(ii) The set $C(G)$ is dense in $L_2(G)$.
(iii) The sets $C_0^\infty(G)_{\mathbb{C}}$ and $C(G)_{\mathbb{C}}$ are dense in $L_2^{\mathbb{C}}(G)$.
Corollary 8. The spaces $L_2(G)$ and $L_2^{\mathbb{C}}(G)$ are separable.

The proofs of Proposition 7 and Corollary 8 will be given in Problems 2.12ff by using an important smoothing technique.
Example 9. The pre-Hilbert space C*[a, b] is not a Hilbert space.
Proof. Let L := C*[a, b] and $X := L_2(a, b)$. By Proposition 7, the linear subspace L is dense in X, i.e., $\overline{L} = X$.
If L were a Hilbert space, then L would be closed. Hence $L = \overline{L} = X$. But this is impossible, since there are functions u with $u \in X$ and $u \notin L$. For example, this is true for
    $u(x) := \begin{cases} 1 & \text{if } a \le x \le c \text{ for fixed } c \in \,]a, b[ \\ 0 & \text{if } c < x \le b. \end{cases}$    (18)
□
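The density argument of Example 9 can be illustrated numerically. The following sketch is not part of the original text. It assumes [a, b] = [0, 1] and c = 1/2, and the helper names `step`, `ramp`, and `l2_distance` are hypothetical. It compares the step function (18) with continuous piecewise linear functions $u_n$: the $L_2$ distance tends to zero, although no continuous function coincides with the step function.

```python
import math

A, B, C = 0.0, 1.0, 0.5  # assumed interval [a, b] and jump point c


def step(x):
    """The discontinuous function u from (18)."""
    return 1.0 if x <= C else 0.0


def ramp(n):
    """Continuous u_n: equal to 1 on [a, c], linear down to 0 on [c, c + 1/n]."""
    def u_n(x):
        if x <= C:
            return 1.0
        return max(0.0, 1.0 - n * (x - C))
    return u_n


def l2_distance(f, g, cells=200_000):
    """Midpoint-rule approximation of the L2(a, b) norm of f - g."""
    h = (B - A) / cells
    total = 0.0
    for i in range(cells):
        x = A + (i + 0.5) * h
        total += (f(x) - g(x)) ** 2
    return math.sqrt(total * h)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        # The exact distance is 1 / sqrt(3 n), which tends to 0 as n grows.
        print(n, l2_distance(step, ramp(n)), 1.0 / math.sqrt(3 * n))
```

The closed form $1/\sqrt{3n}$ follows from $\int_0^{1/n} (1 - nt)^2\,dt = 1/(3n)$.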
2.2.4 The Space $C_0^\infty(G)$ and the Variational Lemma

Variational Lemma 10. Let G be a nonempty open set in $\mathbb{R}^N$, $N \ge 1$. Then, it follows from $u \in L_2(G)$ and
    $\int_G uv\,dx = 0$ for all $v \in C_0^\infty(G)$    (19)
that $u(x) = 0$ for almost all $x \in G$.

If, in addition, $u \in C(G)$, then $u(x) = 0$ for all $x \in G$.
This lemma plays a fundamental role in the calculus of variations. We shall use it in Section 2.5.
Proof. Let $X := L_2(G)$. By (19),
    $(u \mid v) = 0$ for all $v \in C_0^\infty(G)$.

Since the set $C_0^\infty(G)$ is dense in X, it follows as in the proof of Proposition 4(ii) in Section 2.1 that
    $(u \mid u) = \int_G |u|^2\,dx = 0$.    (20)
Hence $u(x) = 0$ for almost all $x \in G$.

If u is continuous on G, then (20) implies $u(x) = 0$ for all $x \in G$. □
2.2.5 The Space $C_0^\infty(G)$ and Integration by Parts

The classic integration-by-parts formula reads as follows:
    $\int_a^b u'v\,dx = uv\big|_a^b - \int_a^b uv'\,dx$,    (21)
with the "boundary integral" $uv\big|_a^b = u(b)v(b) - u(a)v(a)$. In particular, if $v(a) = v(b) = 0$, then
    $\int_a^b u'v\,dx = -\int_a^b uv'\,dx$.    (21*)
Proposition 11. Let $-\infty < a < b < \infty$. Then, the following hold true:
(i) The integration-by-parts formula (21) holds for all
    $u, v \in C^1[a, b]$.
(ii) Formula (21*) holds for all
    $u \in C^1(a, b)$ and $v \in C_0^\infty(a, b)$.
Here, we set $C^1[a, b] := C^1(\overline{G})$ and $C^1(a, b) := C^1(G)$, where G = ]a, b[, and so on.
Proof. Ad (i). By the fundamental theorem of calculus,
    $\int_a^b (u'v + uv')\,dx = \int_a^b (uv)'\,dx = uv\big|_a^b$.
Ad (ii). Since the function v vanishes in a neighborhood of the two boundary points x = a and x = b, we can choose a subinterval [c, d] of ]a, b[ such that $v \in C_0^\infty(c, d)$. Furthermore, $u \in C^1(a, b)$ implies $u \in C^1[c, d]$. Hence
    $\int_a^b (u'v + uv')\,dx = \int_c^d (uv)'\,dx = uv\big|_c^d = 0$,
since v(c) = v(d) = 0. □
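Formula (21*) can be checked numerically. The following sketch is an illustration only and makes several assumptions that do not come from the text: the interval is ]0, 1[, the test function v is the standard bump $\exp(-1/(x(1-x)))$, and the integrals are approximated by the midpoint rule. The names `bump`, `bump_prime`, `midpoint`, and `parts_defect` are hypothetical. Note that u only has to be $C^1$ on the open interval, so an unbounded function such as u(x) = 1/x is admissible.

```python
import math


def bump(x):
    """A C-infinity function vanishing outside ]0, 1[ (assumed test function v)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return math.exp(-1.0 / (x * (1.0 - x)))


def bump_prime(x):
    """Classic derivative of bump, obtained with the chain rule."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return bump(x) * (1.0 - 2.0 * x) / (x * (1.0 - x)) ** 2


def midpoint(f, a, b, cells=4000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / cells
    return h * sum(f(a + (i + 0.5) * h) for i in range(cells))


def parts_defect(u, u_prime, a=0.0, b=1.0):
    """Returns int u'v dx + int u v' dx, which (21*) predicts to be zero."""
    left = midpoint(lambda x: u_prime(x) * bump(x), a, b)
    right = midpoint(lambda x: u(x) * bump_prime(x), a, b)
    return left + right


if __name__ == "__main__":
    print(parts_defect(math.exp, math.exp))
    # u(x) = 1/x is C^1 on ]0, 1[ but unbounded near x = 0.
    print(parts_defect(lambda x: 1.0 / x, lambda x: -1.0 / x ** 2))
```

Both printed values should be zero up to quadrature and rounding errors.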
The generalization of the integration-by-parts formula (21) to higher dimensions reads as follows:
    $\int_G (\partial_j u)v\,dx = \int_{\partial G} uv\,n_j\,dO - \int_G u\,\partial_j v\,dx$,  $j = 1, \ldots, N$,    (22)
where $x = (\xi_1, \ldots, \xi_N)$ and $\partial_j u := \partial u/\partial \xi_j$. In addition, the outer unit normal vector to the boundary $\partial G$ is denoted by $n = (n_1, \ldots, n_N)$. In the special two-dimensional case (N = 2), the surface integral $\int_{\partial G} \ldots dO$ is to be understood in the sense of $\int_{\partial G} \ldots ds$, where s denotes arclength, and the boundary curve $\partial G$ is oriented in such a way that the set G lies on the left-hand side of $\partial G$ (see Figure 2.5(a)).
[FIGURE 2.5. Panels (a) and (b).]
In the special case where v = 0 on $\partial G$, formula (22) passes over to
    $\int_G (\partial_j u)v\,dx = -\int_G u\,\partial_j v\,dx$,  $j = 1, \ldots, N$.    (22*)
Proposition 12 (Integration by parts). For N = 1, 2, ..., the following hold true:
(i) Formula (22) holds for all
    $u, v \in C^1(\overline{G})$,
provided G is a nonempty bounded open set in $\mathbb{R}^N$ that has a sufficiently smooth boundary.
(ii) Formula (22*) holds for all
    $u \in C^1(G)$ and $v \in C_0^\infty(G)$,
provided G is a nonempty open set in $\mathbb{R}^N$.
The integration-by-parts formula (22) is the key to the modern theory of partial differential equations and to the modern calculus of variations.
In this book, we only need the special case (22*). Let us sketch the proof of Proposition 12.
Ad (i). The generalization of the fundamental theorem of calculus
    $\int_a^b w'\,dx = w\big|_a^b$
to higher dimensions is given by the famous Gauss theorem:
    $\int_G \partial_j w\,dx = \int_{\partial G} w\,n_j\,dO$.
Hence
    $\int_G ((\partial_j u)v + u\,\partial_j v)\,dx = \int_G \partial_j(uv)\,dx = \int_{\partial G} uv\,n_j\,dO$.
This is (22).
Ad (ii). Let $v \in C_0^\infty(G)$. Then, the function v vanishes outside a compact subset of G, i.e., v vanishes on a "boundary strip" of $\partial G$. Thus, it is possible to construct a nonempty bounded open subset H of G such that v = 0 outside H and the boundary $\partial H$ is sufficiently smooth (see Figure 2.5(b)). Hence
    $\int_G ((\partial_j u)v + u\,\partial_j v)\,dx = \int_H \partial_j(uv)\,dx = \int_{\partial H} uv\,n_j\,dO = 0$,
since v = 0 on $\partial H$.
2.3 Bilinear Forms

Definition 1. Let X be a normed space over $\mathbb{K}$. By a bounded bilinear form on X we understand a function
    $a\colon X \times X \to \mathbb{K}$
that has the following properties:
(i) Bilinearity. For all $u, v, w \in X$ and $\alpha, \beta \in \mathbb{K}$,
    $a(\alpha u + \beta v, w) = \alpha a(u, w) + \beta a(v, w)$
and
    $a(w, \alpha u + \beta v) = \alpha a(w, u) + \beta a(w, v)$.
(ii) Boundedness. There is a constant d > 0 such that
    $|a(u, v)| \le d\|u\|\,\|v\|$ for all $u, v \in X$.
In addition, $a(\cdot, \cdot)$ is called symmetric iff
    $a(u, v) = a(v, u)$ for all $u, v \in X$.
Moreover, $a(\cdot, \cdot)$ is called positive iff
    $0 \le a(u, u)$ for all $u \in X$.
Finally, $a(\cdot, \cdot)$ is called strongly positive iff there is a constant c > 0 such that
    $c\|u\|^2 \le a(u, u)$ for all $u \in X$.
Proposition 2. Let $a\colon X \times X \to \mathbb{K}$ be a bounded bilinear form on the normed space X over $\mathbb{K}$. Then
    $u_n \to u$ and $v_n \to v$ as $n \to \infty$
imply $a(u_n, v_n) \to a(u, v)$ as $n \to \infty$.
Proof. Since the sequence $(v_n)$ is bounded, we get
    $|a(u_n, v_n) - a(u, v)| = |a(u_n - u, v_n) + a(u, v_n - v)|$
    $\le d\|u_n - u\|\,\|v_n\| + d\|u\|\,\|v_n - v\| \to 0$ as $n \to \infty$. □
2.4 The Main Theorem on Quadratic Variational Problems

We consider the minimum problem
    $2^{-1}a(u, u) - b(u) = \min!$,  $u \in X$.    (23)
In the next section we shall show that the famous Dirichlet principle is a special case of the following theorem.
Theorem 2.A (Main theorem on quadratic variational problems). Suppose that
(a) $a\colon X \times X \to \mathbb{R}$ is a symmetric, bounded, strongly positive, bilinear form on the real Hilbert space X.
(b) $b\colon X \to \mathbb{R}$ is a linear continuous functional on X.
Then the following hold true:
(i) The variational problem (23) has a unique solution.
(ii) Problem (23) is equivalent to the following so-called variational equation:
    $a(u, v) = b(v)$ for fixed $u \in X$ and all $v \in X$.    (23*)
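In finite dimensions, Theorem 2.A reduces to a familiar fact from linear algebra, which may help to fix ideas. The following sketch assumes $X = \mathbb{R}^2$, $a(u, v) = u^T A v$ with a symmetric positive definite matrix A, and $b(v) = r^T v$. The matrix, the vector, and all function names are illustrative choices, not data from the text. The variational equation (23*) then becomes the linear system Au = r, and its solution minimizes the quadratic functional from (23).

```python
from fractions import Fraction as Fr

# Assumed data: X = R^2, a(u, v) = u^T A v with a symmetric positive definite
# matrix A, and b(v) = RHS^T v. None of these numbers come from the text.
A = [[Fr(2), Fr(1)], [Fr(1), Fr(3)]]
RHS = [Fr(1), Fr(2)]


def a(u, v):
    """Symmetric, bounded, strongly positive bilinear form on R^2."""
    return sum(u[i] * A[i][j] * v[j] for i in range(2) for j in range(2))


def b(v):
    """Linear continuous functional on R^2."""
    return sum(RHS[i] * v[i] for i in range(2))


def F(u):
    """The quadratic functional from (23)."""
    return Fr(1, 2) * a(u, u) - b(u)


def solve_variational_equation():
    """Solves a(u, v) = b(v) for all v, i.e. A u = RHS, by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    u1 = (RHS[0] * A[1][1] - A[0][1] * RHS[1]) / det
    u2 = (A[0][0] * RHS[1] - A[1][0] * RHS[0]) / det
    return [u1, u2]


if __name__ == "__main__":
    u = solve_variational_equation()
    print("solution of (23*):", u, " minimal value:", F(u))
    for v in ([Fr(1), Fr(0)], [Fr(0), Fr(1)], [Fr(-3), Fr(2)]):
        # F(u + v) - F(u) = a(v, v) / 2 > 0, so u is the unique minimizer.
        w = [u[0] + v[0], u[1] + v[1]]
        print(v, F(w) - F(u), a(v, v) / 2)
```

Exact rational arithmetic is used so that the identity $F(u + v) - F(u) = 2^{-1}a(v, v)$ can be observed without rounding errors.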
Proof of Theorem 2.A. By hypothesis, there are constants c > 0 and d > 0 such that
    $c\|u\|^2 \le a(u, u) \le d\|u\|^2$ for all $u \in X$.    (24)
Step 1: Equivalent equation. We show that (23) is equivalent to (23*). To this end, we set
    $F(u) := 2^{-1}a(u, u) - b(u)$ for all $u \in X$.
Moreover, for fixed $u, v \in X$, we set
    $\varphi(t) := F(u + tv)$ for all $t \in \mathbb{R}$.
Using the symmetry condition a(u, v) = a(v, u), we obtain
    $\varphi(t) = 2^{-1}t^2 a(v, v) + t[a(u, v) - b(v)] + 2^{-1}a(u, u) - b(u)$.
Note that $a(v, v) \ge c\|v\|^2 > 0$ for all $v \in X$ with $v \ne 0$. Thus, the original problem (23),
    $F(u) = \min!$,  $u \in X$,
has a solution u iff the real quadratic function $\varphi = \varphi(t)$ has a minimum at the point t = 0 for each fixed $v \in X$, i.e.,
    $\varphi'(0) = 0$.    (25)
Equation (25) is identical to
    $a(u, v) - b(v) = 0$ for all $v \in X$.
This is (23*).
Step 2: Uniqueness. Let u and w be solutions of the original problem $F(u) = \min!$, $u \in X$. By Step 1,
    $a(u, v) = b(v)$ and $a(w, v) = b(v)$ for all $v \in X$.
Subtracting, we obtain a(u - w, v) = 0 for all $v \in X$. Letting v := u - w, we get
    $c\|u - w\|^2 \le a(u - w, u - w) = 0$.
Hence u = w, i.e., the original problem (23) has at most one solution.
Step 3: Existence proof. Set
    $\alpha := \inf_{u \in X} F(u)$.
Since
    $F(u) = 2^{-1}a(u, u) - b(u) \ge 2^{-1}c\|u\|^2 - \|b\|\,\|u\|$,
we obtain $F(u) \to +\infty$ if $\|u\| \to +\infty$. Hence $\alpha > -\infty$.
By the definition of $\alpha$, there is a sequence $(u_n)$ such that
    $F(u_n) \to \alpha$ as $n \to \infty$.
Obviously, we have the following key identity:
    $2a(u, u) + 2a(v, v) = a(u - v, u - v) + a(u + v, u + v)$ for all $u, v \in X$.    (26)
Hence, with $u := u_n$ and $v := u_m$,
    $F(u_n) + F(u_m) = 4^{-1}a(u_n - u_m, u_n - u_m) + 2F\left(\frac{u_n + u_m}{2}\right)$
    $\ge 4^{-1}c\|u_n - u_m\|^2 + 2\alpha$.
Since $F(u_n) + F(u_m) \to 2\alpha$ as $n, m \to \infty$, it follows that $(u_n)$ is a Cauchy sequence. Hence, by the completeness of X,
    $u_n \to u$ as $n \to \infty$.
Since $F\colon X \to \mathbb{R}$ is continuous,
    $F(u_n) \to F(u)$ as $n \to \infty$.
This implies
    $F(u) = \alpha$,
i.e., u is a solution of the original problem $F(u) = \min!$, $u \in X$. □
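Proposition 1. Let X be a pre-Hilbert space over $\mathbb{K}$. Then, for all $u, v \in X$, we have the so-called parallelogram identity
    $2\|u\|^2 + 2\|v\|^2 = \|u + v\|^2 + \|u - v\|^2$.    (27)
Proof. From
    $(u \pm v \mid u \pm v) = (u \mid u) \pm (u \mid v) \pm (v \mid u) + (v \mid v)$
we get $\|u \pm v\|^2 = \|u\|^2 \pm (u \mid v) \pm (v \mid u) + \|v\|^2$. Adding the two identities for the signs + and - yields (27). □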
Remark 2 (The geometrical meaning of the Dirichlet principle). Figure 2.6(a) shows the geometrical meaning of the parallelogram identity (27). By Figure 2.6(b), it is obvious that (27) generalizes the classic Pythagorean theorem.
The proof of Theorem 2.A has been based on the identity (26). If we introduce the energetic inner product
    $(u \mid v)_E := a(u, v)$ for all $u, v \in X$,
then (26) can be written in the following form:
    $2\|u\|_E^2 + 2\|v\|_E^2 = \|u - v\|_E^2 + \|u + v\|_E^2$,
where $\|u\|_E^2 = (u \mid u)_E$. This is precisely the parallelogram identity with respect to the energetic inner product.
In Section 2.13 we shall prove that Theorem 2.A is equivalent to the perpendicular principle, which says the following:
In a Hilbert space, there exists a perpendicular from each point u to each given closed linear subspace L.
[FIGURE 2.6. Panel (a): $2\|u\|^2 + 2\|v\|^2 = \|u + v\|^2 + \|u - v\|^2$; panel (b).]
[FIGURE 2.7.]
This principle is pictured in Figure 2.7. Since the Dirichlet principle
follows from Theorem 2.A, we can say the following:
The functional analytic justification of the Dirichlet principle is based on
the idea of orthogonality.
There are ideas in mathematics that remain eternally young and that
lose nothing in their intellectual freshness, even after thousands of years.
Mathematicians of the Pythagorean school in ancient Greece attributed the
Pythagorean theorem to the master of their school, Pythagoras of Samos
(circa 560 B.C.-480 B.C.). It is said that Pythagoras sacrificed one hundred
oxen to the gods in gratitude. In fact, this theorem was already known in
Babylon at the time of King Hammurabi (circa 1728 B.C.-1686 B.C.). Pre-
sumably, however, it was a mathematician of the Pythagorean school who
first proved the Pythagorean theorem. This theorem appears as Proposition
47 in Book I of Euclid's Elements (300 B.C.).
The theory of Hilbert spaces is the abstract and very efficient formulation
of the idea of orthogonality.
It seems that this idea has deep roots in our real world, since the Hilbert
space theory is the right mathematical tool for describing quantum physics.
This will be discussed in Section 5.14.
2.5 The Functional Analytic Justification of the Dirichlet Principle

We want to study the following variational problem:
    $F(u) := 2^{-1}\int_G \sum_{j=1}^N (\partial_j u)^2\,dx - \int_G fu\,dx = \min!$,  $u = g$ on $\partial G$.    (28)
This problem is also called the Dirichlet problem. Here, we assume that
(H) G is a nonempty bounded open set in $\mathbb{R}^N$, N = 1, 2, ....
In addition, we set $x = (\xi_1, \ldots, \xi_N)$ and $\partial_j u := \partial u/\partial \xi_j$.
The Dirichlet principle says that problem (28) has a solution u. After
the necessary preparations, the final existence theorem will be proved at
the end of this section.
2.5.1 The Classic Euler-Lagrange Equation

Along with (28) let us consider the following boundary-value problem for the Poisson equation:
    $-\Delta u = f$ on G,
    $u = g$ on $\partial G$.    (28a)
In addition, let us also study the so-called generalized problem to (28a):
    $\int_G \sum_{j=1}^N \partial_j u\,\partial_j v\,dx = \int_G fv\,dx$ for all $v \in C_0^\infty(G)$.    (28b)
Here, the Laplacian is defined through
    $\Delta u := \sum_{j=1}^N \partial_j^2 u$.
If $f \equiv 0$, then (28a) is called the first boundary-value problem for the Laplace equation.
Proposition 1. Assume (H). Let the continuous functions $g\colon \partial G \to \mathbb{R}$ and $f\colon \overline{G} \to \mathbb{R}$ be given. Suppose that $u \in C^2(\overline{G})$. Then, the following hold true:
(i) If the function u is a solution of the original variational problem (28), i.e., more precisely, u is a solution of
    $F(w) = \min!$,
    $w = g$ on $\partial G$,    (28*)
then u is a solution of the boundary-value problem (28a).
(ii) The function u is a solution of the boundary-value problem (28a) iff it is a solution of the generalized boundary-value problem (28b).
Equation (28a) is called the Euler-Lagrange equation to the variational problem (28). The following arguments are typical for the calculus of variations.
Proof. Ad (i). Step 1: Admissible functions. Let u be a solution of (28*). Then, for each fixed $v \in C_0^\infty(G)$ and $t \in \mathbb{R}$, the function
    $w := u + tv$
is admissible for the variational problem (28*), i.e.,
    $w = g$ on $\partial G$ and $w \in C^2(\overline{G})$.
Step 2: Reduction to a minimum problem for real functions. For fixed $v \in C_0^\infty(G)$, we set
    $\varphi(t) := F(u + tv)$ for all $t \in \mathbb{R}$.
Explicitly,
    $\varphi(t) = 2^{-1}\int_G \sum_{j=1}^N (\partial_j u + t\,\partial_j v)^2\,dx - \int_G f(u + tv)\,dx$.
Since u is a solution of (28*), the function $\varphi\colon \mathbb{R} \to \mathbb{R}$ has a minimum at the point t = 0. Hence
    $\varphi'(0) = 0$.
Explicitly,
    $\varphi'(0) = \int_G \sum_{j=1}^N \partial_j u\,\partial_j v\,dx - \int_G fv\,dx = 0$ for all $v \in C_0^\infty(G)$.    (29a)
Thus, u is a solution of the generalized problem (28b).

Step 3: The variational lemma. Applying integration by parts to (29a), we get
    $-\int_G \sum_{j=1}^N (\partial_j^2 u)v\,dx - \int_G fv\,dx = 0$ for all $v \in C_0^\infty(G)$,    (29b)
that is,
    $-\int_G (\Delta u + f)v\,dx = 0$ for all $v \in C_0^\infty(G)$.
By the variational lemma from Section 2.2.4, this implies
    $\Delta u + f = 0$ on G,    (29c)
i.e., u is a solution to the boundary-value problem (28a).
Ad (ii). By Step 3, equation (29a) implies (29c). Conversely, integration by parts tells us that (29c) implies (29a). □
Remark 2 (Lack of classic solutions). By Proposition 1, each sufficiently

smooth solution u to the Dirichlet problem (28) is also a solution to the
boundary-value problem (28a). However, the point is that
There exist reasonable situations where the Dirichlet problem (28) lacks
smooth solutions.
In order to understand this typical difficulty of the calculus of variations,
let us consider the following two simple minimum problems:
    $f(u) = \min!$,  $u \in [a, b]$,    (30)
and
    $f(u) = \min!$,  $u \in [a, b] \cap \mathbb{Q}$,    (30*)
where $-\infty < a < b < \infty$, and $\mathbb{Q}$ denotes the set of rational numbers. Suppose that the function $f\colon [a, b] \to \mathbb{R}$ is continuous. Then
(a) problem (30) always has a solution, but
(b) there are reasonable functions f for which problem (30*) has no so-
lution.
In fact, statement (a) follows from the classical Weierstrass theorem.
Suppose now that all the solutions u of (30) are irrational numbers. Then,
problem (30*) has no solution. Consequently, mathematicians who do not
know irrational numbers cannot prove the Weierstrass existence theorem
for minimum problems.
A similar situation is encountered with respect to the Dirichlet problem.
Roughly speaking, the search for smooth solutions to the Dirichlet problem
(28) corresponds to problem (30*). In order to get a situation comparable
with the solvable problem (30), we have to add ideal elements to the class
of smooth solutions. These ideal elements correspond to functions
    $u \in W_2^1(G)$
in the Sobolev space $W_2^1(G)$, which will be introduced ahead. Such functions only possess generalized derivatives of first order. Summarizing:
The introduction of Sobolev spaces corresponds to the introduction of
real numbers by completion of the set of rational numbers via irrational
numbers.
Our program is now the following:
(i) We define generalized derivatives via integration by parts.

(ii) We define Sobolev spaces.
(iii) We prove the inequality of Poincare-Friedrichs.
(iv) We apply the main theorem on quadratic variational problems (Theorem 2.A) to the Dirichlet problem (28) with respect to a suitable linear closed subspace $\mathring{W}_2^1(G)$ of the Sobolev space $W_2^1(G)$.
Here, the Poincare-Friedrichs inequality ensures the fundamental strong

positivity of the quadratic main part of the Dirichlet problem. This way
we obtain both generalized solutions of the Dirichlet problem (28) and
generalized solutions of the classical boundary-value problem (28a) for the
Poisson equation.
The modern theory of variational problems and partial differential equa-
tions is governed by the notion of generalized solutions.
In this connection, one uses the following general strategy:
(α) One proves the existence of generalized solutions by using the methods of functional analysis.
(β) One uses sophisticated analytical methods in order to prove that generalized solutions are also classic solutions, provided the situation is sufficiently regular (e.g., the boundary $\partial G$ and the functions f and g in (28) are sufficiently smooth).
Step (β) represents the subject of the so-called regularity theory. An elementary introduction to regularity theory can be found in Zeidler (1986), Vol. 2A.
2.5.2 Generalized Derivatives

The point of departure for the definition of generalized derivatives is the classic integration-by-parts formula:
    $\int_G u\,\partial_j v\,dx = -\int_G (\partial_j u)v\,dx$ for all $v \in C_0^\infty(G)$,    (31)
where $u \in C^1(G)$. The simple trick is to set
    $w := \partial_j u$.
This way we obtain the key formula
    $\int_G u\,\partial_j v\,dx = -\int_G wv\,dx$ for all $v \in C_0^\infty(G)$.    (31*)

The point is that this formula remains valid for certain nonsmooth functions u and w.
Definition 3. Let G be a nonempty open set in $\mathbb{R}^N$, $N \ge 1$. Let $u, w \in L_2(G)$, and suppose that (31*) holds. Then, the function w is called a generalized derivative of the function u on the set G of type $\partial_j$.
As in the classic case, we write $w = \partial_j u$.
Proposition 4. The generalized derivative $w = \partial_j u$ is uniquely determined up to the values of w on a set of N-dimensional measure zero.
Proof. Suppose that (31*) holds for $w, \tilde{w} \in L_2(G)$. This yields
    $\int_G (w - \tilde{w})v\,dx = 0$ for all $v \in C_0^\infty(G)$,
and the variational lemma from Section 2.2.4 implies that
    $w(x) = \tilde{w}(x)$ for almost all $x \in G$. □
Example 5. Consider the function $u\colon \,]-1, 1[\, \to \mathbb{R}$ with
    $u(x) := |x|$ for all $x \in \,]-1, 1[$.
Set
    $w(x) := \begin{cases} -1 & \text{if } -1 < x < 0 \\ c & \text{if } x = 0 \\ 1 & \text{if } 0 < x < 1, \end{cases}$
where c is a fixed, but otherwise arbitrary, real number. Then, the function w represents the generalized derivative of the function u on the interval ]-1, 1[. We write
    $u' = w$ on ]-1, 1[.
Note that w is the classic derivative of u on both the subintervals ]-1, 0[ and ]0, 1[, but the classic derivative of u does not exist at the point x = 0.
Proof. For all $v \in C_0^\infty(-1, 1)$, integration by parts yields
    $\int_{-1}^1 uv'\,dx = \int_{-1}^0 uv'\,dx + \int_0^1 uv'\,dx$
    $= -\int_{-1}^0 u'v\,dx - \int_0^1 u'v\,dx + uv\big|_{-1}^0 + uv\big|_0^1$
    $= -\int_{-1}^0 wv\,dx - \int_0^1 wv\,dx + u(0)v(0) - u(-1)v(-1) + u(1)v(1) - u(0)v(0)$
    $= -\int_{-1}^1 wv\,dx$,
since $v(\pm 1) = 0$. More precisely, for small $\varepsilon > 0$, note that integration by parts yields
    $\int_{-1}^{-\varepsilon} uv'\,dx = -\int_{-1}^{-\varepsilon} u'v\,dx + uv\big|_{-1}^{-\varepsilon}$,
since u and v are $C^1$ on $]-1, -\varepsilon[$. Letting $\varepsilon \to +0$, we get
    $\int_{-1}^0 uv'\,dx = -\int_{-1}^0 u'v\,dx + uv\big|_{-1}^0$,
since uv is continuous on [-1, 0].
Similarly, applying $\varepsilon \to +0$ to $\int_\varepsilon^1 \ldots$, we get the corresponding formula $\int_0^1 \ldots = \ldots$. □
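The defining relation (31*) for Example 5 can also be tested numerically. The sketch below is only an illustration. The test function v (a bump multiplied by $e^x$, chosen so that both integrals are nonzero), the midpoint rule, and all function names are assumptions and do not come from the text. An even number of cells is used so that the kink of |x| at x = 0 falls on a cell boundary.

```python
import math


def v(x):
    """Assumed test function in C_0^infinity(-1, 1): a bump times exp(x)."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x)) * math.exp(x)


def v_prime(x):
    """Classic derivative of v (product and chain rule)."""
    if abs(x) >= 1.0:
        return 0.0
    return v(x) * (1.0 - 2.0 * x / (1.0 - x * x) ** 2)


def w(x, c=0.0):
    """Candidate generalized derivative of |x|; the value c at 0 is arbitrary."""
    if x < 0.0:
        return -1.0
    if x > 0.0:
        return 1.0
    return c


def midpoint(f, a, b, cells):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / cells
    return h * sum(f(a + (i + 0.5) * h) for i in range(cells))


def defect(cells=20_000):
    """Returns int |x| v' dx + int w v dx over ]-1, 1[; (31*) predicts zero."""
    lhs = midpoint(lambda x: abs(x) * v_prime(x), -1.0, 1.0, cells)
    rhs = midpoint(lambda x: w(x) * v(x), -1.0, 1.0, cells)
    return lhs + rhs


if __name__ == "__main__":
    print("defect in (31*):", defect())
```

The value c = w(0) never enters the computation, in agreement with Proposition 4: changing w on a set of measure zero does not change the integrals.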
2.5.3 The Sobolev Space $W_2^1(G)$

Definition 6. Let G be a nonempty open set in $\mathbb{R}^N$, $N \ge 1$. The Sobolev space $W_2^1(G)$ consists precisely of all the functions
    $u \in L_2(G)$
that have generalized derivatives
    $\partial_j u \in L_2(G)$ for all $j = 1, \ldots, N$.
Furthermore, for all $u, v \in W_2^1(G)$, we set
    $(u \mid v)_{1,2} := \int_G \Bigl(uv + \sum_{j=1}^N \partial_j u\,\partial_j v\Bigr)dx$,
and $\|u\|_{1,2} := (u \mid u)_{1,2}^{1/2}$.
Proposition 7. The space $W_2^1(G)$ together with the inner product $(\cdot \mid \cdot)_{1,2}$ becomes a real Hilbert space, provided we identify two functions whose values differ only on a set of N-dimensional measure zero.
Proof. Let $u \in W_2^1(G)$. From $(u \mid u)_{1,2} = 0$ we get $\int_G u^2\,dx = 0$, and hence u(x) = 0 for almost all $x \in G$, i.e., u is the zero element. Hence $(\cdot \mid \cdot)_{1,2}$ is an inner product on $W_2^1(G)$. Thus, $W_2^1(G)$ is a pre-Hilbert space.
In order to prove that $W_2^1(G)$ is a Hilbert space, let $(u_n)$ be a Cauchy sequence in $W_2^1(G)$, i.e., for each $\varepsilon > 0$ there is a number $n_0(\varepsilon)$ such that
    $\|u_n - u_m\|_{1,2} < \varepsilon$ for all $n, m \ge n_0(\varepsilon)$.
Hence $(u_n)$ and $(\partial_j u_n)$ are Cauchy sequences in $L_2(G)$. Since $L_2(G)$ is a Hilbert space, there are functions $u, w_j \in L_2(G)$ such that, as $n \to \infty$,
    $u_n \to u$ in $L_2(G)$ and $\partial_j u_n \to w_j$ in $L_2(G)$ for all j.    (32)

Letting $n \to \infty$, from
    $\int_G u_n\,\partial_j v\,dx = -\int_G (\partial_j u_n)v\,dx$ for all $v \in C_0^\infty(G)$
we obtain
    $\int_G u\,\partial_j v\,dx = -\int_G w_j v\,dx$ for all $v \in C_0^\infty(G)$,    (33)
using the continuity of the inner product on the Hilbert space $L_2(G)$ (see Proposition 4 in Section 2.1).
Equation (33) tells us that the function u has the generalized derivatives
    $w_j = \partial_j u$ on G for all j.
Since $\partial_j u \in L_2(G)$ for all j, we get $u \in W_2^1(G)$.
Finally, it follows from (32) that
    $\|u_n - u\|_{1,2} \to 0$ as $n \to \infty$,
i.e., $u_n \to u$ in $W_2^1(G)$ as $n \to \infty$. Hence $W_2^1(G)$ is a Hilbert space. □
2.5.4 The Sobolev Space $\mathring{W}_2^1(G)$

Definition 8. Let $\mathring{W}_2^1(G)$ denote the closure of the set $C_0^\infty(G)$ in the Hilbert space $W_2^1(G)$.
We will later discuss that it makes sense to say that all of the functions $u \in \mathring{W}_2^1(G)$ satisfy the boundary condition
    $u = 0$ on $\partial G$
in some generalized sense.
Proposition 9. The space $\mathring{W}_2^1(G)$ is a real Hilbert space.
Proof. Note that $C_0^\infty(G)$ is a linear subspace of the Hilbert space $W_2^1(G)$. Now use Proposition 10 from Section 2.1. □
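In the special case where N = 1 and G = ]a, b[, let us briefly write
    $W_2^1(a, b)$ and $\mathring{W}_2^1(a, b)$
instead of $W_2^1(G)$ and $\mathring{W}_2^1(G)$, respectively. The following example shows that the functions $u \in \mathring{W}_2^1(a, b)$ possess a simple structure.
Example 10. Let $-\infty < a < b < \infty$. If $u \in \mathring{W}_2^1(a, b)$, then there exists a unique continuous function $v\colon [a, b] \to \mathbb{R}$ such that u(x) = v(x) for almost all $x \in \,]a, b[$ and
    $v(a) = v(b) = 0$.
In addition, we have the estimate
    $|v(x)| \le (b - a)^{1/2}\|u\|_{1,2}$ for all $x \in [a, b]$.
Recall that, by Section 2.5.3,
    $(u \mid v)_{1,2} = \int_a^b (uv + u'v')\,dx$ and $\|u\|_{1,2} = (u \mid u)_{1,2}^{1/2}$
for all $u, v \in \mathring{W}_2^1(a, b)$.
Proof. Uniqueness of v. If two continuous functions $v, w\colon [a, b] \to \mathbb{R}$ differ at a single point, then they also differ on a small interval J with meas(J) > 0. Hence
    "v(x) = w(x) for almost all $x \in \,]a, b[$" implies v(x) = w(x) on [a, b].
Existence of v. First let $w \in C_0^\infty(a, b)$. Then
    $w(x) = \int_a^x w'\,dy$ for all $x \in [a, b]$.
By the Schwarz inequality (6),
    $|w(x)| \le \int_a^b |w'|\,dy \le \Bigl(\int_a^b dy\Bigr)^{1/2}\Bigl(\int_a^b |w'|^2\,dy\Bigr)^{1/2} \le (b - a)^{1/2}\|w\|_{1,2}$ for all $x \in [a, b]$.

Now let $u \in \mathring{W}_2^1(a, b)$. Then, there exists a sequence $(v_n)$ in $C_0^\infty(a, b)$ such that
    $\|v_n - u\|_{1,2} \to 0$ as $n \to \infty$.
Since $(v_n)$ is a Cauchy sequence in $\mathring{W}_2^1(a, b)$, it follows from
    $\max_{a \le x \le b} |v_n(x) - v_m(x)| \le (b - a)^{1/2}\|v_n - v_m\|_{1,2}$
that $(v_n)$ is also a Cauchy sequence in the Banach space C[a, b]. Thus, there is a function $v \in C[a, b]$ such that
    $\max_{a \le x \le b} |v_n(x) - v(x)| \to 0$ as $n \to \infty$.
Since $v_n(a) = v_n(b) = 0$ for all n, this implies v(a) = v(b) = 0. Applying the estimate above to $v_n$ and letting $n \to \infty$, we obtain the estimate for v.

Finally, it follows from
    $v_n \to v$ in $L_2(a, b)$ and $v_n \to u$ in $L_2(a, b)$ as $n \to \infty$
that v(x) = u(x) for almost all $x \in \,]a, b[$. □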

The following example will be used in Section 2.7.3 in order to prove the convergence of the important method of finite elements.
Standard Example 11. Let $-\infty < a < b < \infty$, and let the function
    $u\colon [a, b] \to \mathbb{R}$
be continuous and piecewise continuously differentiable. Denote by C the set of points x where the classic derivative exists. Define the real function
    $w(x) := \begin{cases} u'(x) & \text{if } x \in C \\ \text{arbitrary} & \text{otherwise.} \end{cases}$
More precisely, we assume the following:
(a) The function u is continuous on [a, b].
(b) There exists a finite number of points $a_j$ with
    $a = a_0 < a_1 < \cdots < a_n = b$
such that, for all j, u is continuously differentiable on the open subintervals $]a_j, a_{j+1}[$ and the derivative u' can be extended continuously to the closed subinterval $[a_j, a_{j+1}]$ (cf. Figure 2.8).
Then, the function u has the following properties:
(i) The function w is the generalized derivative of u on ]a, b[, i.e., w = u' on ]a, b[.
(ii) $u \in W_2^1(a, b)$.
(iii) $u \in \mathring{W}_2^1(a, b)$ iff u(a) = u(b) = 0.
Proof. Ad (i). Divide the interval [a, b] into the subintervals $[a_j, a_{j+1}]$ and use integration by parts as in Example 5.
Ad (ii). Since u is continuous and w = u' is piecewise continuous and bounded, we get
    $\int_a^b u^2\,dx < \infty$ and $\int_a^b w^2\,dx < \infty$.
Hence $u \in W_2^1(a, b)$.
[FIGURE 2.8. Panels (a), (b), and (c); panel (a) shows a function $u \in W_2^1(a, b)$.]
Ad (iii). If $u \in \mathring{W}_2^1(a, b)$, then u(a) = u(b) = 0, by Example 10.
Conversely, let u(a) = u(b) = 0. Choose a number $\eta > 0$. By smoothing the function u at the corners, we obtain a function $v \in C^1[a, b]$ such that v vanishes in a neighborhood of the two boundary points x = a and x = b along with
    $\|u - v\|_{1,2} < \eta$.
The idea of the construction of the function v related to u is pictured in Figures 2.8(b) and (c).
Choose the function $\phi_\varepsilon$ as in Problem 2.12. Letting
    $v_\varepsilon(x) := \int_a^b \phi_\varepsilon(x - y)v(y)\,dy$,
it follows from the density proof in Problem 2.13a that $v_\varepsilon \in C_0^\infty(a, b)$, for sufficiently small $\varepsilon > 0$, and
    $v_\varepsilon \to v$ in $L_2(a, b)$ as $\varepsilon \to +0$.
Differentiation and integration by parts yield
    $v_\varepsilon'(x) = \int_a^b \phi_\varepsilon'(x - y)v(y)\,dy = \int_a^b \phi_\varepsilon(x - y)v'(y)\,dy$.
Hence
    $v_\varepsilon' \to v'$ in $L_2(a, b)$ as $\varepsilon \to +0$.
Summarizing, we get
    $\|u - v_\varepsilon\|_{1,2} \le \|u - v\|_{1,2} + \|v - v_\varepsilon\|_{1,2} < 2\eta$, where $v_\varepsilon \in C_0^\infty(a, b)$,
for sufficiently small $\varepsilon > 0$. Since $\eta > 0$ is arbitrary, there is a sequence $(u_n)$ with
    $u_n \to u$ in $W_2^1(a, b)$ as $n \to \infty$,
where $u_n \in C_0^\infty(a, b)$ for all n. Hence $u \in \mathring{W}_2^1(a, b)$. □
2.5.5 Generalized Boundary Values

Definition 12. Let G be a nonempty bounded open set in $\mathbb{R}^N$, $N \ge 1$. If $u \in \mathring{W}_2^1(G)$, then we say that the function u satisfies the boundary condition
    $u = 0$ on $\partial G$,    (34)
in the generalized sense.
Remark 13 (Motivation of (34)). A very formal motivation is based on the fact that the set $C_0^\infty(G)$ is dense in $\mathring{W}_2^1(G)$ and the functions $u \in C_0^\infty(G)$ vanish on a boundary strip of G, i.e., u satisfies condition (34) in the classic sense. A more convincing motivation is obtained as follows:
(a) Let $G \subseteq \mathbb{R}^N$ with N = 1 and G = ]a, b[. Then, Example 10 tells us that (34) holds true in the "classic sense."
(b) Let $G \subseteq \mathbb{R}^N$ with $N \ge 2$, and suppose that the boundary $\partial G$ of the nonempty bounded open set G is sufficiently regular. Then it can be proved that
    $\int_{\partial G} u^2\,dO \le \text{const}\,\|u\|_{1,2}^2$ for all $u \in W_2^1(G)$.    (B)
This implies the following: If $u \in \mathring{W}_2^1(G)$, then there exists a sequence $(u_n)$ in $C_0^\infty(G)$ such that $u_n \to u$ in $W_2^1(G)$ as $n \to \infty$. By (B),
    $\int_{\partial G} (u - u_n)^2\,dO \le \text{const}\,\|u - u_n\|_{1,2}^2 \to 0$ as $n \to \infty$.
Since $u_n = 0$ on $\partial G$, we get
    $\int_{\partial G} u^2\,dO = 0$,
and hence
    $u(x) = 0$ for almost all $x \in \partial G$,    (34*)
in the sense of the surface measure on $\partial G$.
The elementary proof of the boundary inequality (B) can be found in Zeidler (1986), Vol. 2A, p. 247. This proof is based on the Schwarz inequality.
2.5.6 The Poincaré-Friedrichs Inequality

We want to prove the following Poincaré-Friedrichs inequality:
    $C\int_G u^2\,dx \le \int_G \sum_{j=1}^N (\partial_j u)^2\,dx$ for all $u \in \mathring{W}_2^1(G)$.    (35)
Proposition 14. Let G be a nonempty bounded open set in $\mathbb{R}^N$, N = 1, 2, .... Then there exists a constant C > 0 such that inequality (35) holds.
Example 15. We consider first the special case where N = 1 and G = ]a, b[ with $-\infty < a < b < \infty$.
Step 1: Let $u \in C_0^\infty(a, b)$. Then
    $u(x) = \int_a^x u'(y)\,dy$ for all $x \in [a, b]$.
By the Schwarz inequality (6),
    $u(x)^2 \le \Bigl(\int_a^b 1 \cdot |u'|\,dy\Bigr)^2 \le \int_a^b dy \int_a^b (u')^2\,dy$,
and hence
    $\int_a^b u^2\,dx \le (b - a)^2 \int_a^b (u')^2\,dx$.
Step 2: Let $u \in \mathring{W}_2^1(a, b)$. Then, there is a sequence $(u_n)$ in $C_0^\infty(a, b)$ such that $\|u - u_n\|_{1,2} \to 0$ as $n \to \infty$. Hence
    $u_n \to u$ in $L_2(a, b)$ and $u_n' \to u'$ in $L_2(a, b)$ as $n \to \infty$.
By Step 1,
    $\int_a^b u_n^2\,dx \le (b - a)^2 \int_a^b (u_n')^2\,dx$ for all n.
Letting $n \to \infty$, this implies the special Poincaré-Friedrichs inequality,
    $\int_a^b u^2\,dx \le (b - a)^2 \int_a^b (u')^2\,dx$ for all $u \in \mathring{W}_2^1(a, b)$.    (35*)
[FIGURE 2.9.]
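The special inequality (35*) is easy to probe numerically. The following sketch is an illustration under assumptions that are not taken from the text: the interval is ]0, 2[, the three sample functions are arbitrary choices that vanish at both end points (by Standard Example 11 they belong to $\mathring{W}_2^1(0, 2)$), and the integrals are approximated by the midpoint rule. All names are hypothetical.

```python
import math


def midpoint(f, a, b, cells=20_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / cells
    return h * sum(f(a + (i + 0.5) * h) for i in range(cells))


def poincare_ratio(u, u_prime, a, b):
    """Returns (int u^2 dx) / (int u'^2 dx) for a function with u(a) = u(b) = 0."""
    numerator = midpoint(lambda x: u(x) ** 2, a, b)
    denominator = midpoint(lambda x: u_prime(x) ** 2, a, b)
    return numerator / denominator


A, B = 0.0, 2.0  # assumed interval

# Assumed sample functions (u, u'), each vanishing at both end points.
SAMPLES = {
    "parabola": (lambda x: x * (2.0 - x), lambda x: 2.0 - 2.0 * x),
    "sine": (lambda x: math.sin(math.pi * x / 2.0),
             lambda x: math.pi / 2.0 * math.cos(math.pi * x / 2.0)),
    "hat": (lambda x: 1.0 - abs(x - 1.0), lambda x: 1.0 if x < 1.0 else -1.0),
}

if __name__ == "__main__":
    for name, (u, du) in SAMPLES.items():
        # (35*) asserts that each ratio is at most (b - a)^2.
        print(name, poincare_ratio(u, du, A, B), "<=", (B - A) ** 2)
```

For these samples the ratios stay well below $(b - a)^2 = 4$, which indicates that the constant obtained from the Schwarz inequality in Step 1 is not sharp.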
Proof of Proposition 14. Let N = 2. The general case proceeds analogously.
Step 1: Let $u \in C_0^\infty(G)$. As in Figure 2.9 consider a rectangle $R := [a, b] \times [c, d]$ with $G \subseteq \operatorname{int} R$. Note that u vanishes outside G. Then
    $u(\xi, \eta) = \int_c^\eta u_\eta(\xi, y)\,dy$ for all $(\xi, \eta) \in R$.
The Schwarz inequality yields
    $u(\xi, \eta)^2 = \Bigl(\int_c^\eta 1 \cdot u_\eta(\xi, y)\,dy\Bigr)^2 \le \int_c^\eta dy \int_c^\eta u_\eta(\xi, y)^2\,dy$
    $\le (d - c)\int_c^d u_\eta(\xi, y)^2\,dy$ for all $(\xi, \eta) \in R$.
Integrating this over R, we get
    $\int_G u^2\,dx \le (d - c)^2 \int_G u_\eta^2\,dx$.
This is (35) with $C := (d - c)^{-2}$.
Step 2: Let $u \in \mathring{W}_2^1(G)$. Then there is a sequence $(u_n)$ in $C_0^\infty(G)$ such that
    $\|u_n - u\|_{1,2} \to 0$ as $n \to \infty$.
Hence
    $u_n \to u$ in $L_2(G)$ and $\partial_j u_n \to \partial_j u$ in $L_2(G)$
for all j. By Step 1,
    $C\int_G u_n^2\,dx \le \int_G \sum_j (\partial_j u_n)^2\,dx$ for all n.
Letting $n \to \infty$, we get the desired inequality (35). □
In 1890, inequalities of type (35) were considered by Poincaré in a famous paper on eigenvalue problems for the Laplace equation. In 1934, Friedrichs recognized that such inequalities represent the key to the functional analytic existence theory for linear elliptic differential equations.
2.5.7 The Existence Theorem for the Dirichlet Problem

That's what it was really all about.
Faust
Parallel to Section 2.5.1, let us consider the generalized Dirichlet problem
    $2^{-1}\int_G \sum_{j=1}^N (\partial_j u)^2\,dx - \int_G fu\,dx = \min!$,  $u - g \in \mathring{W}_2^1(G)$,    (36)
along with the generalized boundary-value problem
    $\int_G \sum_{j=1}^N \partial_j u\,\partial_j v\,dx = \int_G fv\,dx$ for all $v \in \mathring{W}_2^1(G)$,  $u - g \in \mathring{W}_2^1(G)$.    (37)
As defined in (34), the condition "$u - g \in \mathring{W}_2^1(G)$" corresponds to the boundary condition
    $u - g = 0$ on $\partial G$,
in the generalized sense. We only consider such boundary functions g on $\partial G$ which can be continued to a function g on G such that $g \in W_2^1(G)$.
Theorem 2.B (The Dirichlet principle). Let G be a nonempty bounded open set in $\mathbb{R}^N$, N = 1, 2, .... We are given $f \in L_2(G)$ and $g \in W_2^1(G)$.
(i) The generalized Dirichlet problem (36) has a unique solution $u \in W_2^1(G)$.
(ii) This is also the unique solution $u \in W_2^1(G)$ of the generalized boundary-value problem (37).
Proof. We set $X := \mathring{W}_2^1(G)$ and
    $a(u, v) := \int_G \sum_{j=1}^N \partial_j u\,\partial_j v\,dx$,  $b_1(u) := \int_G fu\,dx$
for all $u, v \in W_2^1(G)$. Introducing w := u - g, the original problem (36) can be written in the following form:
    $2^{-1}a(w + g, w + g) - b_1(w + g) = \min!$,  $w \in X$.
If we use a(w + g, w + g) = a(w, w) + 2a(w, g) + a(g, g) and $b_1(w + g) = b_1(w) + b_1(g)$, then this minimum problem is equivalent to
    $2^{-1}a(w, w) - b(w) = \min!$,  $w \in X$,    (36*)
where we set
    $b(w) := b_1(w) - a(w, g)$ for all $w \in X$.
Furthermore, the generalized boundary-value problem (37) is equivalent to
    $a(w, v) = b(v)$ for fixed $w \in X$ and all $v \in X$.    (37*)
We want to apply Theorem 2.A to problems (36*) and (37*).
Step 1: Properties of $a\colon X \times X \to \mathbb{R}$. Set
    $\|v\|_2 := \Bigl(\int_G v^2\,dx\Bigr)^{1/2}$
and recall that
    $\|v\|_{1,2} = \Bigl(\int_G \Bigl(v^2 + \sum_{j=1}^N (\partial_j v)^2\Bigr)dx\Bigr)^{1/2}$
represents the norm on X. By the Schwarz inequality (16), for all $v, w \in X$, we get
    $|a(v, w)| \le \sum_{j=1}^N \int_G |\partial_j v\,\partial_j w|\,dx \le \sum_{j=1}^N \|\partial_j v\|_2 \|\partial_j w\|_2 \le N\|v\|_{1,2}\|w\|_{1,2}$,    (38)
i.e., $a(\cdot, \cdot)$ is bounded. Obviously, $a(\cdot, \cdot)$ is bilinear, and
    $a(v, w) = a(w, v)$ for all $v, w \in X$,
i.e., $a(\cdot, \cdot)$ is symmetric. Finally, the Poincaré-Friedrichs inequality (35) tells us that
    $C\int_G \Bigl(v^2 + \sum_{j=1}^N (\partial_j v)^2\Bigr)dx \le (1 + C)\int_G \sum_{j=1}^N (\partial_j v)^2\,dx$ for all $v \in X$.
Hence
    $C(1 + C)^{-1}\|v\|_{1,2}^2 \le a(v, v)$ for all $v \in X$,
i.e., $a(\cdot, \cdot)$ is strongly positive.
Step 2: The functional $b\colon X \to \mathbb{R}$. By the Schwarz inequality,
    $|b_1(v)| \le \int_G |fv|\,dx \le \|f\|_2 \|v\|_2 \le \|f\|_2 \|v\|_{1,2}$ for all $v \in X$.
By (38),
    $|a(v, g)| \le N\|v\|_{1,2}\|g\|_{1,2}$ for all $v \in X$.
Hence
    $|b(v)| \le \text{const}\,\|v\|_{1,2}$ for all $v \in X$.    (39)
Obviously, $b(\cdot)$ is linear. By (39), $b\colon X \to \mathbb{R}$ is a linear continuous functional.
Thus, the assumptions of Theorem 2.A are satisfied. Consequently, using Theorem 2.A, we obtain that problem (36*) has a unique solution $w \in X$, and w is also the unique solution of (37*).
Finally, set
    $u = g + w$.
Then $u \in W_2^1(G)$, and u is the unique solution of both (36) and (37). □
2.6 The Convergence of the Ritz Method for Quadratic Variational Problems

Let us again consider the variational problem
    $F(u) := 2^{-1}a(u, u) - b(u) = \min!$,  $u \in X$,    (40)
along with the corresponding variational equation
    $a(u, v) = b(v)$ for fixed $u \in X$ and all $v \in X$.    (40*)
In order to construct the fundamental Ritz method for solving approximately the problems (40) and (40*), let us consider the Ritz problem
    $F(u_n) = 2^{-1}a(u_n, u_n) - b(u_n) = \min!$,  $u_n \in X_n$,    (41)
and the Ritz equation
    $a(u_n, v_n) = b(v_n)$ for fixed $u_n \in X_n$ and all $v_n \in X_n$,    (41*)
where
    $X_n$ is a finite-dimensional subspace of the real Hilbert space X,
i.e., $0 < \dim X_n < \infty$. Comparing (40) with (41), it turns out that the space X of the original problem is replaced with $X_n$. Thus, the Ritz method is based on a quite natural idea. Let $\{e_{1n}, \ldots, e_{Nn}\}$ be a basis of $X_n$. Then, the elements $u_n$ and $v_n$ of $X_n$ allow the following simple representation:
    $u_n = \sum_{i=1}^N \alpha_{in} e_{in}$ and $v_n = \sum_{i=1}^N \beta_{in} e_{in}$.
Choosing $v_n := e_{jn}$, we obtain that the Ritz equation (41*) is equivalent to the following system of linear equations:
    $\sum_{i=1}^N \alpha_{in}\,a(e_{in}, e_{jn}) = b(e_{jn})$,  $j = 1, \ldots, N$,    (41**)
for the N unknown real numbers $\alpha_{1n}, \ldots, \alpha_{Nn}$.
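The passage from the Ritz equation (41*) to the linear system (41**) can be sketched in code. The following is a minimal illustration under assumptions that are not taken from the text: $X = \mathring{W}_2^1(0, 1)$, $a(u, v) = \int_0^1 u'v'\,dx$, $b(v) = \int_0^1 fv\,dx$ with the load f(x) = x, and the polynomial basis $e_i(x) = x^i - x^{i+1}$, for which all integrals can be evaluated in closed form. The function names are hypothetical. For this data the classic solution of $-u'' = x$, u(0) = u(1) = 0, is $u = (x - x^3)/6 = (e_1 + e_2)/6$, so the Ritz solution should be exact as soon as $n \ge 2$.

```python
from fractions import Fraction as Fr


def stiffness(i, j):
    """a(e_i, e_j) = int_0^1 e_i' e_j' dx for e_i(x) = x^i - x^(i+1)."""
    return (Fr(i * j, i + j - 1)
            - Fr(i * (j + 1) + j * (i + 1), i + j)
            + Fr((i + 1) * (j + 1), i + j + 1))


def load(j):
    """b(e_j) = int_0^1 f e_j dx for the assumed load f(x) = x."""
    return Fr(1, j + 2) - Fr(1, j + 3)


def solve(matrix, rhs):
    """Gaussian elimination in exact rational arithmetic (no pivoting needed
    here, because the Ritz matrix is symmetric and positive definite)."""
    n = len(rhs)
    m = [row[:] + [rhs[k]] for k, row in enumerate(matrix)]
    for col in range(n):
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [Fr(0)] * n
    for row in reversed(range(n)):
        s = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - s) / m[row][row]
    return x


def ritz_coefficients(n):
    """Solves the Ritz system (41**) on X_n = span{e_1, ..., e_n}."""
    # Row j of the system reads: sum_i alpha_i a(e_i, e_j) = b(e_j).
    matrix = [[stiffness(i, j) for i in range(1, n + 1)] for j in range(1, n + 1)]
    rhs = [load(j) for j in range(1, n + 1)]
    return solve(matrix, rhs)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, ritz_coefficients(n))
```

With n = 1 the method returns the best approximation in the one-dimensional space spanned by $e_1$. From n = 2 on, the coefficients reproduce the exact solution and the additional coefficients vanish.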

We assume the following:
(H1) X is a real Hilbert space. There exists a sequence $(X_n)$ of finite-dimensional linear subspaces $X_n$ of X such that
    $\lim_{n \to \infty} \operatorname{dist}_X(u, X_n) = 0$ for all $u \in X$.
(H2) The bilinear form $a\colon X \times X \to \mathbb{R}$ is bounded, symmetric, and strongly positive, i.e., there exist positive constants c and d such that
    $|a(u, v)| \le d\|u\|\,\|v\|$ and $c\|u\|^2 \le a(u, u)$ for all $u, v \in X$.
(H3) The functional $b\colon X \to \mathbb{R}$ is linear and continuous.
Theorem 2.C (The Ritz method). Assume (H1) through (H3). Then, the following hold true:
(i) Existence and uniqueness. The original variational problem (40) has a unique solution u. This is also the unique solution of the variational equation (40*). We have the a priori estimate
    $c\|u\| \le \|b\|$.    (42)
(ii) The Ritz equation. For all n = 1, 2, ..., the Ritz problem (41) has a unique solution $u_n$. This is also the unique solution of the Ritz equation (41*).
(iii) Convergence. The Ritz method converges, i.e., $u_n \to u$ in X as $n \to \infty$.
(iv) Rate of convergence. For all n = 1, 2, ...,
    (43)
(v) Error estimates. Suppose that we know a lower bound $\beta$ for the minimal value of the original problem (40), i.e., $F(u) \ge \beta$. Then,
for all n = 1, 2, ....
Remark 1 (Discussion of Theorem 2.C).

(a) Typical applications. For example, in elasticity we encounter the fol-
lowing situation:
u = displacement of the elastic body (e.g., a beam or a plate);

$2^{-1}a(u, u)$ = elastic potential energy of the body;
b(u) = work of the outer forces.
Furthermore:
minimum problem (40) = principle of minimal potential energy,

variational equation (40*) = principle of virtual power.
A simple example will be considered in Section 2.7 (the deformation of a

string). Applications of functional analysis to elasticity are studied in detail
in Zeidler (1986), Vol. 4.
In 1909, Ritz introduced his "Ritz method" in order to compute approx-
imately the deformation of plates under the action of forces.
(b) Lower bounds and error estimates via duality. In part (v) of Theorem 2.C we need a lower bound $\beta$ for the minimal value F(u) in order to get error estimates. Such lower bounds can be obtained by using duality theory. Here, the basic idea is the following. Along with the original minimum problem
    $F(u) = \min!$,  $u \in X$,    (V)
we also study a dual maximum problem
    $G(v) = \max!$,  $v \in Y$,    (V*)
which has the crucial property that
    minimal value of (V) = maximal value of (V*).
Therefore, if u is a solution of (V), then, for each $v \in Y$ and each $w \in X$, we get
    $G(v) \le F(u) \le F(w)$,
i.e., G(v) is a lower bound and F(w) is an upper bound for the minimal
value of (V).
Such dual problems will be considered in Section 2.12.
(c) The golden rule for the rate of convergence. Using information from approximation theory, it is possible to obtain estimates for the distance $\operatorname{dist}_X(u, X_n)$ which depend on the smoothness of the solution $u$ in the case of boundary-value problems for elliptic differential equations (e.g., the Poisson equation or problems in elastostatics). Then, relation (43) yields information on the rate of convergence of the Ritz method. This will be explained with a simple example in the next section. Generally speaking, one has the following golden rule of numerical analysis:
The smoother the solution $u$ of the original problem and the smoother the functions in $X_n$, the faster is the convergence of the Ritz method.
A detailed discussion of this golden rule can be found in Zeidler (1986), Vol. 2A.
(d) Finite elements. In engineering, the spaces Xn consist frequently of
so-called finite elements, which are piecewise smooth functions. A simple
example will be studied in the next section. More information can be found
The method of finite elements is one of the greatest achievements of
modern numerical analysis.
Proof of Theorem 2.C. Ad (i). This follows from Theorem 2.A.
In addition, if $u$ is a solution of the variational equation (40*), then
$$a(u,u) = b(u).$$
Hence $c\,\|u\|^2 \le a(u,u) = b(u) \le \|b\|\,\|u\|$. This implies $c\,\|u\| \le \|b\|$.
Ad (ii). This follows from Theorem 2.A applied to the Hilbert space $X_n$. Note that each finite-dimensional linear subspace of a Hilbert space is again a Hilbert space.
Ad (iii), (iv). The key to the proof is relation (43). Subtracting the Ritz equation (41*) from the variational equation (40*), we obtain the so-called orthogonality relation
$$a(u - u_n, v) = 0 \quad \text{for all } v \in X_n.$$
Letting $v := u_n$, we get $a(u - u_n, u_n) = 0$, and hence
$$a(u - u_n, u - u_n) = a(u - u_n, u - v) \quad \text{for all } v \in X_n. \tag{44}$$
This yields
$$c\,\|u - u_n\|^2 \le a(u - u_n, u - u_n) = a(u - u_n, u - v) \le d\,\|u - u_n\|\,\|u - v\| \quad \text{for all } v \in X_n.$$
Hence
$$c\,\|u - u_n\| \le d \inf_{v \in X_n} \|u - v\| = d\,\operatorname{dist}_X(u, X_n),$$
which is (43). By (H1), $\operatorname{dist}_X(u, X_n) \to 0$ as $n \to \infty$. Hence $u_n \to u$ as $n \to \infty$.
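The orthogonality relation and estimate (43) can be checked by hand in a tiny finite-dimensional setting. The following is a minimal sketch under illustrative assumptions that are not taken from the text: $X = \mathbb{R}^2$ with the Euclidean norm, $a(u,v) = u^T A v$ for a fixed symmetric matrix `A` with eigenvalues 1 and 3 (so that $c = 1$ and $d = 3$), and $X_1$ spanned by the first coordinate vector.

```python
import math

# Illustrative data (assumptions of this sketch, not from the text).
A = [[2.0, 1.0], [1.0, 2.0]]   # symmetric matrix with eigenvalues 1 and 3
rhs = [1.0, 0.0]               # b(v) = rhs . v
c, d = 1.0, 3.0                # c|u|^2 <= a(u,u) and |a(u,v)| <= d|u||v|

def a(u, v):
    """Bilinear form a(u, v) = u^T A v."""
    return sum(u[i] * A[i][j] * v[j] for i in range(2) for j in range(2))

def norm(u):
    return math.sqrt(sum(x * x for x in u))

# Solution of a(u, v) = b(v) for all v, i.e. A u = rhs (Cramer's rule).
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
u = [(rhs[0] * A[1][1] - A[0][1] * rhs[1]) / det,
     (A[0][0] * rhs[1] - A[1][0] * rhs[0]) / det]

# Ritz solution in X_1 = span{e}: a(u_1, e) = b(e).
e = [1.0, 0.0]
u1 = [rhs[0] / A[0][0], 0.0]

err = [u[0] - u1[0], u[1] - u1[1]]
orthogonality = a(err, e)      # a(u - u_1, v) for the basis vector v = e
dist = abs(u[1])               # Euclidean distance from u to span{e}
```

In this toy case `orthogonality` vanishes up to rounding, and `c * norm(err)` stays below `d * dist`, as (43) predicts.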

Ad (v). For all $v \in X$,
$$F(u + v) = 2^{-1}a(u + v, u + v) - b(u + v) = 2^{-1}a(v,v) + \big(a(u,v) - b(v)\big) + 2^{-1}a(u,u) - b(u).$$
By the variational equation (40*), $a(u,v) - b(v) = 0$. This implies
$$F(u + v) = 2^{-1}a(v,v) + F(u),$$
and hence
$$F(u + v) - F(u) \ge 2^{-1}c\,\|v\|^2 \quad \text{for all } v \in X.$$
If we set $w := u + v$ and if $F(u) \ge \beta$, then
$$F(w) - \beta \ge 2^{-1}c\,\|u - w\|^2. \qquad \Box$$
Definition 2. We set
$$(u \mid v)_E := a(u,v) \quad \text{for all } u, v \in X$$
and call $(\cdot \mid \cdot)_E$ the energetic inner product to $a(\cdot,\cdot)$. The energetic norm is defined through
$$\|u\|_E := (u \mid u)_E^{1/2} \quad \text{for all } u \in X.$$
The linear space $X$ equipped with $(\cdot \mid \cdot)_E$ is called the energetic space $X_E$ to $a(\cdot,\cdot)$.
A physical motivation for this terminology will be given in the next

section.
Corollary 3. Let $X$ be a real Hilbert space, and assume that the bilinear form $a\colon X \times X \to \mathbb{R}$ satisfies assumption (H2) of Theorem 2.C.
Then, the following are met:
(i) The energetic space $X_E$ is a real Hilbert space.
(ii) The original norm $\|\cdot\|$ on $X$ is equivalent to the energetic norm $\|\cdot\|_E$, i.e.,
$$c\,\|u\|^2 \le \|u\|_E^2 \le d\,\|u\|^2 \quad \text{for all } u \in X. \tag{45}$$
(iii) A set is dense in the energetic space $X_E$ iff it is dense in the original Hilbert space $X$.
Proof. The inequality (45) follows from (H2).
Obviously, $(\cdot \mid \cdot)_E$ is an inner product on $X$, since $a(\cdot,\cdot)$ is bilinear and symmetric, and $\|u\|_E = 0$ implies $\|u\| = 0$, i.e., $u = 0$, by (45).
Let $(u_n)$ be a Cauchy sequence in $X_E$. By (45), this is also a Cauchy sequence in $X$, and hence it is convergent in $X$. Furthermore, (45) tells us that $(u_n)$ is also convergent in $X_E$. Thus, $X_E$ is a Hilbert space.
Assertion (iii) follows immediately from (45). □
Corollary 4. Assume (H1) through (H3) of Theorem 2.C. Then, we get the following crucial error relation for the Ritz method:
$$\|u - u_n\|_E = \operatorname{dist}_{X_E}(u, X_n) \quad \text{for all } n = 1, 2, \ldots.$$
Proof. Apply Theorem 2.C to the energetic space $X_E$. This situation corresponds to $c = d = 1$. By Theorem 2.C(iv),
$$\|u - u_n\|_E \le \operatorname{dist}_{X_E}(u, X_n).$$
But since $u_n \in X_n$, we can replace "$\le$" with "$=$" by the definition of the distance $\operatorname{dist}(\cdot)$. □
2.7 Applications to Boundary-Value Problems, the Method of Finite Elements, and Elasticity
We want to solve the following boundary-value problem:
$$-u''(x) = f(x), \quad \alpha < x < \beta, \qquad u(\alpha) = u(\beta) = 0, \tag{46}$$
where $-\infty < \alpha < \beta < \infty$. According to Section 2.5.1, the corresponding variational problem reads as follows:
$$F(u) := \int_\alpha^\beta \big(2^{-1}u'^2 - uf\big)\,dx = \min!, \qquad u(\alpha) = u(\beta) = 0, \quad u \in C^2[\alpha,\beta]. \tag{47}$$

[FIGURE 2.10. Deflection $u = u(x)$ of the string over the $x$-axis, with the force $f(x_0)$ marked at the point $x_0$.]
This problem allows the following physical interpretation (see footnote 5; cf. Figure 2.10):
$u(x)$ = deflection of a string at the point $x$ under the vertical outer force density $\mathbf{f}(x) = f(x)e$;
$F(u)$ = total potential energy of the string;
$\int_\alpha^\beta 2^{-1}u'^2\,dx$ = elastic energy of the string;
$\int_\alpha^\beta uf\,dx$ = work of the outer force density $\mathbf{f}$ with respect to the vertical displacement $\mathbf{u}(x) = u(x)e$ of the string
= $-$(potential energy stored by the force density $\mathbf{f}$).
Let us also consider the following generalized variational problem:
$$F(u) := \int_\alpha^\beta \big(2^{-1}u'^2 - uf\big)\,dx = \min!, \tag{47*}$$
where we replace the classical boundary condition "$u(\alpha) = u(\beta) = 0$ and $u \in C^2[\alpha,\beta]$" in (47) with the more general condition "$u \in \mathring{W}_2^1(\alpha,\beta)$."
Observe that the solutions $u$ of (46) through (47**) are assumed to have different smoothness properties.
From the physical point of view, problem (47*) is quite natural, namely:
The functions $u \in \mathring{W}_2^1(\alpha,\beta)$ have a finite elastic energy, i.e.,
$$\int_\alpha^\beta 2^{-1}u'^2\,dx < \infty,$$
and $u(\alpha) = u(\beta) = 0$, in the sense of Example 10 in Section 2.5.
Footnote 5: That is, the force $\big(\int_\gamma^\delta f(x)\,dx\big)e$ acts on the part of the string over the interval $[\gamma,\delta]$.
Furthermore, we set
$$(u \mid v)_E := \int_\alpha^\beta u'v'\,dx \quad\text{and}\quad \|u\|_E := (u \mid u)_E^{1/2},$$
where $(\cdot \mid \cdot)_E$ and $\|\cdot\|_E$ are called the energetic inner product and the energetic norm, respectively. Obviously,
$$2^{-1}(u \mid u)_E = 2^{-1}\|u\|_E^2 = \text{elastic potential energy with respect to the displacement } u.$$
Finally, the variational equation reads as follows:
$$\frac{dF(u + tv)}{dt}\bigg|_{t=0} = \int_\alpha^\beta (u'v' - fv)\,dx = 0 \quad \text{for all } v \in \mathring{W}_2^1(\alpha,\beta). \tag{47**}$$
The variational equation (47**) represents the principle of virtual power.
To explain this, consider the motion $t \mapsto u + tv$ of the string, where $t$ denotes time. Then, $F(u + tv)$ is the total potential energy of the string at time $t$, and relation (47**) tells us that the time derivative of the energy at time $t = 0$ equals zero. In physics, the time derivative of energy is called power (see footnote 6).
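We will also need the integral formula
$$u(x) = \int_\alpha^\beta \mathcal{G}(x,y)f(y)\,dy, \tag{48}$$
where
$$\mathcal{G}(x,y) := \begin{cases} \dfrac{(\beta - y)(x - \alpha)}{\beta - \alpha} & \text{if } \alpha \le x \le y \le \beta, \\[2mm] \dfrac{(\beta - x)(y - \alpha)}{\beta - \alpha} & \text{if } \alpha \le y < x \le \beta \end{cases}$$
is called the Green function to the original boundary-value problem (46).
Functions of this type were introduced by Green in 1828. The physical meaning of the Green function will be discussed in Section 2.8.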
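The integral formula (48) is easy to try out numerically. The following is a minimal sketch, assuming the constant force density $f \equiv 1$ on the interval $[0, 2]$ and a plain midpoint rule; the helper names `green` and `solve_by_green` are illustrative and do not come from the text. For this $f$, the classical solution of (46) is $u(x) = (x - \alpha)(\beta - x)/2$, which serves as the reference value.

```python
def green(x, y, alpha, beta):
    """Green function of -u'' = f on ]alpha, beta[ with u(alpha) = u(beta) = 0."""
    if x <= y:
        return (beta - y) * (x - alpha) / (beta - alpha)
    return (beta - x) * (y - alpha) / (beta - alpha)

def solve_by_green(f, x, alpha, beta, m=2000):
    """Approximate u(x) = int G(x, y) f(y) dy by the midpoint rule with m cells."""
    h = (beta - alpha) / m
    total = 0.0
    for k in range(m):
        y = alpha + (k + 0.5) * h
        total += green(x, y, alpha, beta) * f(y)
    return total * h

alpha, beta = 0.0, 2.0          # illustrative interval
f = lambda y: 1.0               # illustrative force density
exact = lambda x: 0.5 * (x - alpha) * (beta - x)

x0 = 0.7
approx = solve_by_green(f, x0, alpha, beta)
```

The value `approx` should agree with `exact(x0)` up to the quadrature error, and `green(x, y, ...)` is symmetric in `x` and `y`.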
2.7.1 Existence and Uniqueness
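We set
$$\|u\|_2 := \Big(\int_\alpha^\beta u^2\,dx\Big)^{1/2} \quad\text{and}\quad \|u\|_{1,2} := \Big(\int_\alpha^\beta (u^2 + u'^2)\,dx\Big)^{1/2}.$$
Footnote 6: For historical reasons, the principle of virtual power is frequently called the principle of virtual work.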
Recall that
Proposition 1. Let the force density $f \in L_2(\alpha,\beta)$ be given. Then, the following hold true:
(i) The variational problem (47*) has a unique solution $u \in \mathring{W}_2^1(\alpha,\beta)$. This is also the unique solution of the variational equation (47**). We obtain
$$\|u\|_{1,2} \le \text{const}\,\|f\|_2 \quad \text{for all } f \in L_2(\alpha,\beta). \tag{49}$$
(ii) If $f \in C[\alpha,\beta]$, then the original boundary-value problem (46) has a unique solution $u \in C^2[\alpha,\beta]$, which is given by the integral formula (48). This function $u$ is identical to the unique solution of the two variational problems (47) and (47*).
Corollary 2. Set $X := \mathring{W}_2^1(\alpha,\beta)$. Let $X_E$ denote the linear space $X$ equipped with the energetic inner product $(\cdot \mid \cdot)_E$. Then, $X_E$ is a real Hilbert space.
By our physical motivation, it is quite natural to call $X_E$ the energetic space of the string.
The integral formula (48) explicitly relates the force density $f$ to the displacement $u$. By Proposition 1(ii), the integral operator from (48) is the inverse operator to the differential operator $A$ from (46) given by $A\colon D(A) \subseteq C^2[\alpha,\beta] \to C[\alpha,\beta]$ along with
$$Au := -u'' \quad\text{and}\quad D(A) := \{u \in C^2[\alpha,\beta]\colon u(\alpha) = u(\beta) = 0\}.$$
Roughly speaking:
Integral operators are frequently inverse operators to differential opera-
tors.
In Section 4.5, we will use the integral formula (48) in order to reduce boundary-eigenvalue problems to integral equations and to solve them. In this connection, it is important that the Green function is symmetric, i.e.,
$$\mathcal{G}(x,y) = \mathcal{G}(y,x) \quad \text{for all } x, y \in [\alpha,\beta].$$
It was one of the main goals of functional analysis in the twentieth cen-
tury to generalize the results above to more general elliptic partial differen-
tial equations.
The Ritz method for solving numerically the original problem (46) will
be studied ahead.
Proof of Proposition 1. We set $X := \mathring{W}_2^1(\alpha,\beta)$ along with
$$a(u,v) := \int_\alpha^\beta u'v'\,dx \quad\text{and}\quad b(v) := \int_\alpha^\beta fv\,dx.$$
The norm on $X$ is given by $\|\cdot\|_{1,2}$.
The variational problem (47*) is identical to
$$2^{-1}a(u,u) - b(u) = \min!, \quad u \in X, \tag{V}$$
and the variational equation (47**) is identical to
$$a(u,v) - b(v) = 0 \quad \text{for fixed } u \in X \text{ and all } v \in X. \tag{E}$$

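Step 1: Uniqueness of the solution $u$ of the original boundary-value problem (46). Let $u_1$ and $u_2$ be solutions of (46). Set $w := u_1 - u_2$. Then
$$w'' = 0, \quad \alpha < x < \beta, \qquad w(\alpha) = w(\beta) = 0.$$
Integration by parts yields
$$0 = \int_\alpha^\beta w''w\,dx = -\int_\alpha^\beta w'^2\,dx.$$
Hence $w' = 0$ on $[\alpha,\beta]$, and $w(\alpha) = 0$ gives $w = 0$ on $[\alpha,\beta]$.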

Step 2: Existence for (46). Let $f \in C[\alpha,\beta]$. We want to show that the function $u$ from (48) represents a solution of (46). In fact, for all $x \in [\alpha,\beta]$,
$$(\beta - \alpha)u(x) = (\beta - \alpha)\int_\alpha^\beta \mathcal{G}(x,y)f(y)\,dy = (\beta - x)\int_\alpha^x (y - \alpha)f(y)\,dy + (x - \alpha)\int_x^\beta (\beta - y)f(y)\,dy,$$
and hence
$$(\beta - \alpha)u'(x) = -\int_\alpha^x (y - \alpha)f(y)\,dy + (\beta - x)(x - \alpha)f(x) + \int_x^\beta (\beta - y)f(y)\,dy - (\beta - x)(x - \alpha)f(x),$$
which yields
$$(\beta - \alpha)u''(x) = -(x - \alpha)f(x) - (\beta - x)f(x) = -(\beta - \alpha)f(x).$$
In addition, $u(\alpha) = u(\beta) = 0$.
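Step 3: Existence and uniqueness for (V) and (E). By the Schwarz inequality,
$$|b(v)| = \Big|\int_\alpha^\beta fv\,dx\Big| \le \|f\|_2\,\|v\|_2 \le \|f\|_2\,\|v\|_{1,2} \quad \text{for all } v \in X.$$
Hence $b\colon X \to \mathbb{R}$ is a linear continuous functional with
$$\|b\| \le \|f\|_2.$$
Again by the Schwarz inequality,
$$|a(u,v)| \le \Big(\int_\alpha^\beta u'^2\,dx\Big)^{1/2}\Big(\int_\alpha^\beta v'^2\,dx\Big)^{1/2} \le \|u\|_{1,2}\,\|v\|_{1,2} \quad \text{for all } u, v \in X.$$
Furthermore, it follows from the Poincaré-Friedrichs inequality (35*) that
$$\int_\alpha^\beta u^2\,dx \le (\beta - \alpha)^2 \int_\alpha^\beta u'^2\,dx \quad \text{for all } u \in X, \tag{50}$$
and hence
$$\int_\alpha^\beta (u^2 + u'^2)\,dy \le \big((\beta - \alpha)^2 + 1\big)\int_\alpha^\beta u'^2\,dy,$$
i.e., there is a constant $c > 0$ such that
$$c\,\|u\|_{1,2}^2 \le a(u,u) \quad \text{for all } u \in X. \tag{51}$$
By Theorem 2.C(i) in Section 2.6, problem (V) has a unique solution $u$, which is also the unique solution of (E). Moreover,
$$\|u\|_{1,2} \le \text{const}\,\|b\| \le \text{const}\,\|f\|_2.$$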

Step 4: Regularity of the solution of the variational equation (E). For the moment, let $w$ denote the unique solution of the original boundary-value problem (46). By Standard Example 11 in Section 2.5,
$$w \in X.$$
Integration by parts yields
$$a(w, v_n) - b(v_n) = \int_\alpha^\beta (w'v_n' - fv_n)\,dx = -\int_\alpha^\beta (w'' + f)v_n\,dx = 0 \quad \text{for all } v_n \in C_0^\infty(\alpha,\beta). \tag{52}$$
For each given $v \in X$, there is a sequence $(v_n)$ in $C_0^\infty(\alpha,\beta)$ such that $v_n \to v$ in $X$ as $n \to \infty$. Letting $n \to \infty$ in (52) we obtain
$$a(w,v) - b(v) = 0 \quad \text{for all } v \in X.$$
Since the solution of (E) is unique, we get $w = u$.
By Step 3, $w$ is also the unique solution of the variational problem (V), which corresponds to (47*).
Finally, since $w$ satisfies the side conditions of the variational problem (47) and $w$ is a solution of the variational problem (47*), the function $w$ is also a solution of (47). □
Corollary 2 follows from Corollary 3 in Section 2.6.
2.7.2 Finite Elements and the Ritz Method

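In order to solve approximately the given boundary-value problem
$$-u''(x) = f(x), \quad \alpha < x < \beta, \qquad u(\alpha) = u(\beta) = 0, \tag{53}$$
by means of the so-called method of finite elements, let us divide the interval $[\alpha,\beta]$ into $n + 1$ equal subintervals, i.e.,
$$a_0 = \alpha < a_1 < \cdots < a_n < a_{n+1} = \beta, \tag{54}$$
where $a_j := \alpha + j\,\dfrac{\beta - \alpha}{n + 1}$. By definition, a finite element
$$e_{in}, \quad i = 1, \ldots, n,$$
is a piecewise linear function with
$$e_{in}(a_i) = 1 \quad\text{and}\quad e_{in}(a_j) = 0 \quad \text{for all } j \ne i$$
(cf. Figure 2.11(a)). We also set
$$X_n := \operatorname{span}\{e_{1n}, \ldots, e_{nn}\}.$$
Then, $u_n \in X_n$ iff
$$u_n = \sum_{i=1}^n \alpha_{in}e_{in} \quad \text{with } \alpha_{1n}, \ldots, \alpha_{nn} \in \mathbb{R}.$$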
The point is that:
Each basic function $e_{in}$ satisfies the boundary condition $e_{in}(\alpha) = e_{in}(\beta) = 0$ of the original problem (53). Hence
$$u_n(\alpha) = u_n(\beta) = 0 \quad \text{for all } u_n \in X_n. \tag{55}$$
[FIGURE 2.11. Panels (a) and (b).]
The function $u_n \in X_n$ is piecewise linear and $u_n(a_i) = \alpha_{in}$ for all $i = 1, \ldots, n$. Thus, the space $X_n$ consists precisely of all piecewise linear functions, with respect to the node points $\alpha, a_1, \ldots, a_n, \beta$, which satisfy the boundary condition (55) (Figure 2.11(b)).
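Along with the original boundary-value problem we consider the variational problem
$$F(u) := \int_\alpha^\beta \big(2^{-1}u'^2 - uf\big)\,dx = \min!, \quad u \in \mathring{W}_2^1(\alpha,\beta). \tag{56}$$
This induces the Ritz problem
$$F(u_n) = \min!, \quad u_n \in X_n, \tag{57}$$
which represents a minimum problem with respect to the real variables $\alpha_{1n}, \ldots, \alpha_{nn}$. If $u_n$ is a solution of (57), then
$$\frac{\partial F(u_n)}{\partial \alpha_{jn}} = 0, \quad j = 1, \ldots, n.$$
This yields the Ritz equation,
$$\int_\alpha^\beta (u_n'e_{jn}' - fe_{jn})\,dx = 0, \quad u_n \in X_n, \quad j = 1, \ldots, n. \tag{58}$$
Explicitly, we get the linear system
$$\sum_{i=1}^n \alpha_{in}\int_\alpha^\beta e_{in}'e_{jn}'\,dx = \int_\alpha^\beta fe_{jn}\,dx, \quad j = 1, \ldots, n,$$
for the unknown real coefficients $\alpha_{1n}, \ldots, \alpha_{nn}$ of $u_n$.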
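The linear system of the Ritz equation can be assembled and solved in a few lines. The following is a minimal sketch, not the book's own algorithm: for piecewise linear hat functions on an equidistant mesh the stiffness integrals $\int e_{in}'e_{jn}'\,dx$ equal $2/h$ on the diagonal and $-1/h$ next to it, the load integrals are approximated here by Simpson's rule on each half of the hat (an assumption of this sketch), and the tridiagonal system is solved by the Thomas algorithm. The function name `ritz_fem` is hypothetical.

```python
def ritz_fem(f, alpha, beta, n):
    """Ritz solution of -u'' = f, u(alpha) = u(beta) = 0, with n hat functions.

    Returns the interior nodes a_1..a_n and the coefficients alpha_1n..alpha_nn,
    which are the nodal values u_n(a_i).
    """
    h = (beta - alpha) / (n + 1)
    nodes = [alpha + j * h for j in range(1, n + 1)]
    # Stiffness matrix int e_i' e_j' dx: 2/h on the diagonal, -1/h beside it.
    diag = [2.0 / h] * n
    off = [-1.0 / h] * (n - 1)
    # Load vector int f e_j dx, Simpson's rule on each half of the hat.
    load = [h / 3.0 * (f(a - 0.5 * h) + f(a) + f(a + 0.5 * h)) for a in nodes]
    # Thomas algorithm for the symmetric tridiagonal system.
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = off[0] / diag[0] if n > 1 else 0.0
    dp[0] = load[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off[i - 1] * cp[i - 1]
        cp[i] = off[i] / denom if i < n - 1 else 0.0
        dp[i] = (load[i] - off[i - 1] * dp[i - 1]) / denom
    coeff = [0.0] * n
    coeff[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        coeff[i] = dp[i] - cp[i] * coeff[i + 1]
    return nodes, coeff

# Illustrative run: f = 1 on [0, 1], where u(x) = x(1 - x)/2.
nodes, coeff = ritz_fem(lambda x: 1.0, 0.0, 1.0, 9)
exact = [0.5 * x * (1.0 - x) for x in nodes]
max_nodal_error = max(abs(c - e) for c, e in zip(coeff, exact))
```

For this particular right-hand side the nodal values coincide with the exact solution up to rounding; in general only the error estimates of the Ritz method apply.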
Proposition 3 (The Ritz method via finite elements). Let $f \in C[\alpha,\beta]$.
Then, the Ritz method (58) converges to the unique solution $u$ of the original boundary-value problem (53), in the sense of the Sobolev space $\mathring{W}_2^1(\alpha,\beta)$. That is, for each $n = 1, 2, \ldots$, the Ritz equation (58) has the unique solution $u_n$ and
$$\|u - u_n\|_{1,2} \to 0 \quad \text{as } n \to \infty.$$
More precisely, for all $n = 1, 2, \ldots$, we get the following error estimates:
$$\|u - u_n\|_E \le h_n\,\|f\|_2, \tag{59}$$
$$\Big(\int_\alpha^\beta |u(x) - u_n(x)|^2\,dx\Big)^{1/2} = \|u - u_n\|_2 \le h_n^2\,\|f\|_2, \tag{60}$$
where $h_n$ denotes the mesh size $h_n := \dfrac{\beta - \alpha}{n + 1}$.
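The proof will be based on the following result.
Lemma 4.
(i) $X_n \subseteq X_E$ for all $n$.
(ii) For all $u \in C^2[\alpha,\beta]$ with $u(\alpha) = u(\beta) = 0$ and all $n = 1, 2, \ldots$,
$$\operatorname{dist}_{X_E}(u, X_n) \le h_n\,\|u''\|_2.$$
(iii) For all $u \in X_E$, $\operatorname{dist}_{X_E}(u, X_n) \to 0$ as $n \to \infty$.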
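Proof. Ad (i). This follows from Standard Example 11 in Section 2.5.
Ad (ii). Set
$$v_n(x) := \sum_{j=1}^n u(a_j)e_{jn}(x) \quad\text{and}\quad v(x) := u(x) - v_n(x).$$
Then the function $v_n$ is piecewise linear and $v(a_j) = 0$ for all $j = 0, \ldots, n + 1$. Consider a fixed subinterval $[a_j, a_{j+1}]$, $j = 0, \ldots, n$. Since $v(a_j) = v(a_{j+1}) = 0$, it follows from the mean value theorem of calculus that
$$v'(\xi) = 0 \quad \text{for some } \xi \in\, ]a_j, a_{j+1}[.$$
Hence we get the key formula
$$v'(x) = \int_\xi^x v''(y)\,dy \quad \text{for all } x \in [a_j, a_{j+1}].$$
This implies
$$|v'(x)| \le \int_{a_j}^{a_{j+1}} |v''(y)|\,dy.$$
By the Schwarz inequality,
$$|v'(x)|^2 \le \int_{a_j}^{a_{j+1}} dy \int_{a_j}^{a_{j+1}} |v''|^2\,dy = h_n\int_{a_j}^{a_{j+1}} |v''|^2\,dy.$$
Hence
$$\int_{a_j}^{a_{j+1}} |v'|^2\,dx \le h_n^2\int_{a_j}^{a_{j+1}} |v''|^2\,dy = h_n^2\int_{a_j}^{a_{j+1}} |u''|^2\,dy,$$
since $v_n'' = 0$ on $]a_j, a_{j+1}[$, by the linearity of $v_n$ on $[a_j, a_{j+1}]$. Summing over $j$, we get
$$\int_\alpha^\beta |v'|^2\,dx \le h_n^2\int_\alpha^\beta |u''|^2\,dx.$$
Recall that $v = u - v_n$. Since $v_n \in X_n$, this implies
$$\operatorname{dist}_{X_E}(u, X_n) \le \|u - v_n\|_E \le h_n\,\|u''\|_2.$$
Ad (iii). The set $C_0^\infty(\alpha,\beta)$ is dense in $X = \mathring{W}_2^1(\alpha,\beta)$. By Corollary 3 in Section 2.6, $C_0^\infty(\alpha,\beta)$ is also dense in $X_E$. Thus, assertion (ii) implies (iii), since $h_n \to 0$ as $n \to \infty$. □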
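The interpolation estimate of Lemma 4(ii) can be observed numerically. The following is a minimal sketch, assuming the sample function $u(x) = \sin(\pi x)$ on $[0,1]$ (an illustrative choice, not from the text), for which $\|u''\|_2 = \pi^2/\sqrt{2}$. The hypothetical helper `energy_interp_error` computes $\|u - v_n\|_E$ for the piecewise linear interpolant $v_n$ by a midpoint rule on each cell.

```python
import math

def energy_interp_error(u, u_prime, alpha, beta, n, m=200):
    """||u - v_n||_E for the piecewise linear interpolant v_n on n + 1 equal cells."""
    h = (beta - alpha) / (n + 1)
    total = 0.0
    for j in range(n + 1):
        left = alpha + j * h
        right = left + h
        slope = (u(right) - u(left)) / h       # v_n' is constant on the cell
        sub = h / m
        for k in range(m):                     # midpoint rule on the cell
            x = left + (k + 0.5) * sub
            total += (u_prime(x) - slope) ** 2 * sub
    return math.sqrt(total)

# Illustrative sample function with u(0) = u(1) = 0.
u = lambda x: math.sin(math.pi * x)
du = lambda x: math.pi * math.cos(math.pi * x)
norm_u2 = math.pi ** 2 / math.sqrt(2.0)        # ||u''||_2 on [0, 1]

results = []
for n in (3, 7, 15):
    h = 1.0 / (n + 1)
    results.append((h, energy_interp_error(u, du, 0.0, 1.0, n), h * norm_u2))
```

Each entry of `results` holds the mesh size, the measured energetic error of the interpolant, and the bound $h_n\|u''\|_2$; the error should stay below the bound and roughly halve when $h_n$ is halved.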
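Proof of Proposition 3. We set $X := \mathring{W}_2^1(\alpha,\beta)$ and
$$b(u) := \int_\alpha^\beta fu\,dx.$$
It is convenient to use the energetic space $X_E$, which is equal to the linear space $X$ equipped with the energetic inner product
$$(u \mid v)_E := \int_\alpha^\beta u'v'\,dx.$$
Recall that $\|u\|_E = (u \mid u)_E^{1/2}$. Along with the variational problem
$$2^{-1}(u \mid u)_E - b(u) = \min!, \quad u \in X_E, \tag{V}$$
and the variational equation
$$(u \mid v)_E - b(v) = 0 \quad \text{for fixed } u \in X_E \text{ and all } v \in X_E, \tag{E}$$
we consider the Ritz problem
$$2^{-1}(u_n \mid u_n)_E - b(u_n) = \min!, \quad u_n \in X_n, \tag{V$_n$}$$
and the corresponding Ritz equation
$$(u_n \mid v_n)_E - b(v_n) = 0 \quad \text{for fixed } u_n \in X_n \text{ and all } v_n \in X_n. \tag{E$_n$}$$
Problems (V), (V$_n$), and (E$_n$) correspond to (56), (57), and (58), respectively. By (50),
$$\|u\|_2 \le (\beta - \alpha)\|u\|_E \quad \text{for all } u \in X_E.$$
According to the Schwarz inequality along with (50),
$$|b(u)| = \Big|\int_\alpha^\beta fu\,dx\Big| \le \|f\|_2\,\|u\|_2 \le (\beta - \alpha)\|f\|_2\,\|u\|_E \quad \text{for all } u \in X_E.$$
Hence $\|b\|_E \le (\beta - \alpha)\|f\|_2$.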
Ad (59). Applying Theorem 2.C to (V) through (E$_n$), we obtain the convergence of the Ritz method along with the following error estimates:
$$\|u - u_n\|_E \le \operatorname{dist}_{X_E}(u, X_n) \le h_n\,\|u''\|_2, \quad n = 1, 2, \ldots,$$
by Lemma 4. It follows from Proposition 1 that the solution $u$ of (V) is also a solution of the original boundary-value problem (53), i.e., $-u'' = f$ on $[\alpha,\beta]$. Hence
$$\|u - u_n\|_E \le h_n\,\|f\|_2. \tag{61}$$
By Example 10 in Section 2.5,
Ad (60). Let us consider the following two functions:
$w$ = solution of the original variational problem (V) with $f := u - u_n$;
$w_n$ = solution of the Ritz equation (E$_n$) with $f := u - u_n$.
Replacing $u, u_n, f$ with $w, w_n, u - u_n$ in (59), respectively, we get the key estimate
$$\|w - w_n\|_E \le h_n\,\|u - u_n\|_2. \tag{62}$$
Choosing $v := u - u_n$ in the variational equation (E) for $w$, we obtain
$$(w \mid u - u_n)_E = \int_\alpha^\beta (u - u_n)^2\,dx = \|u - u_n\|_2^2.$$
From (E) and (E$_n$) with $v := w_n$ and $v_n := w_n$ we get
$$(u \mid w_n)_E = b(w_n)$$
and
$$(u_n \mid w_n)_E = b(w_n),$$
respectively. This yields the decisive orthogonality relation
$$(u - u_n \mid w_n)_E = 0.$$
Hence
$$\|u - u_n\|_2^2 = (u - u_n \mid w - w_n)_E.$$
By the Schwarz inequality along with (61) and (62),
$$\|u - u_n\|_2^2 \le \|u - u_n\|_E\,\|w - w_n\|_E \le h_n^2\,\|f\|_2\,\|u - u_n\|_2.$$
This implies (60). □
2.8 Generalized Functions and Linear Functionals

We want to discuss the physical meaning of the Green function $\mathcal{G}$ from Section 2.7 and its relation to the theory of generalized functions.
2.8.1 Special Force Densities

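Let us consider the basic boundary-value problem (46) in the special case where the force density $f_{y,\varepsilon}(x)e$ is located around the given point
$$y \in\, ]\alpha,\beta[.$$
That is, let us study the following boundary-value problem:
$$-u''(x) = f_{y,\varepsilon}(x), \quad \alpha < x < \beta, \qquad u(\alpha) = u(\beta) = 0, \tag{63}$$
where $f_{y,\varepsilon}\colon [\alpha,\beta] \to \mathbb{R}$ is continuous with
$$f_{y,\varepsilon}(x) \begin{cases} \ge 0 & \text{if } x \in [y - \varepsilon, y + \varepsilon], \\ = 0 & \text{if } x \notin [y - \varepsilon, y + \varepsilon], \end{cases}$$
for sufficiently small $\varepsilon > 0$, along with
$$\int_\alpha^\beta f_{y,\varepsilon}(x)\,dx = 1$$
(cf. Figure 2.12). By (48), the solution $u = u_{y,\varepsilon}$ of (63) is given through
$$u_{y,\varepsilon}(x) = \int_\alpha^\beta \mathcal{G}(x,z)f_{y,\varepsilon}(z)\,dz. \tag{64}$$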
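[FIGURE 2.12. The force density $f_{y,\varepsilon}$ over $[\alpha,\beta]$, concentrated on $[y - \varepsilon, y + \varepsilon]$.]
[FIGURE 2.13. Panels (a) and (b): deflection of the string over $[\alpha,\beta]$ with the point $y$ marked; panel (b) shows $\mathcal{G}(\cdot, y)$ and the unit vector $e$.]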
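In terms of physics, $u_{y,\varepsilon}(x)$ represents the deflection of a string at the point $x$ under the vertical outer force density
$$f_{y,\varepsilon}(x)e$$
with the unit vector $e$, i.e., the force
$$\Big(\int_\gamma^\delta f_{y,\varepsilon}(x)\,dx\Big)e \tag{65}$$
acts onto each subinterval $[\gamma,\delta]$ of $[\alpha,\beta]$ (cf. Figure 2.13(a)).
Let us now consider any fixed family $(f_{y,\varepsilon})_{\varepsilon > 0}$, and let us study the limiting process $\varepsilon \to +0$ (cf. Figure 2.14).
Since the Green function $\mathcal{G}\colon [\alpha,\beta] \times [\alpha,\beta] \to \mathbb{R}$ is continuous, the mean value theorem along with (64) tells us that
$$u_{y,\varepsilon}(x) = \mathcal{G}(x, \hat{y}), \quad \text{where } y - \varepsilon \le \hat{y} \le y + \varepsilon.$$
This yields the classical key relation
$$\mathcal{G}(x,y) = \lim_{\varepsilon \to +0} u_{y,\varepsilon}(x) \quad \text{for all } x, y \in\, ]\alpha,\beta[. \tag{66}$$
[FIGURE 2.14. The force densities $f_{y,\varepsilon}$ in the limit $\varepsilon \to +0$.]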
Along with (65) we get the following physical interpretation of the Green function $\mathcal{G}$:
For fixed $y \in\, ]\alpha,\beta[$, the function
$$x \mapsto \mathcal{G}(x,y)$$
describes the deflection of a string caused by the vertical unit force
$$F = e, \tag{67}$$
which is concentrated at the point $y$ (cf. Figure 2.13(b)).
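2.8.2 Formal Approach of Physicists via the Dirac δ-Function

In order to describe formally the point force $F$ from (67) by a "force density $\delta_y$," physicists introduce the Dirac $\delta_y$-function defined by
$$\delta_y(x) := \begin{cases} +\infty & \text{if } x = y, \\ 0 & \text{if } x \ne y, \end{cases} \tag{68a}$$
along with
$$\int_\gamma^\delta \delta_y(x)\,dx := \begin{cases} 1 & \text{if } y \in [\gamma,\delta], \\ 0 & \text{if } y \notin [\gamma,\delta], \end{cases} \tag{68b}$$
and they formally write
$$f_{y,\varepsilon}(x) \to \delta_y(x) \quad \text{as } \varepsilon \to +0 \text{ for all } x \in [\alpha,\beta] \tag{68c}$$
(cf. Figure 2.14). By (68b),
$$\lim_{\varepsilon \to +0} \int_\gamma^\delta f_{y,\varepsilon}(x)\,dx = \int_\gamma^\delta \delta_y(x)\,dx,$$
for all subintervals $[\gamma,\delta]$ of $[\alpha,\beta]$. Finally, with $\varepsilon \to +0$ it follows formally from (63) and (66) that
$$-\frac{\partial^2 \mathcal{G}(x,y)}{\partial x^2} = \delta_y(x), \quad \alpha < x < \beta, \qquad \mathcal{G}(\alpha,y) = \mathcal{G}(\beta,y) = 0, \tag{69}$$
for each $y \in\, ]\alpha,\beta[$.
Obviously, there is no classic function $\delta_y$ that satisfies relations (68a) and (68b). In the following we want to show that $\delta_y$ can be regarded as a generalized function.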
The theory of generalized functions allows us to justify rigorously equa-
tion (69).
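Let $\phi\colon [\alpha,\beta] \to \mathbb{R}$ be continuous. Formally, we have "$(\phi\delta_y)(x) = \phi(y)\delta_y(x)$," since "$\delta_y(x) = 0$ for $x \ne y$." Hence
$$\int_\alpha^\beta \phi(x)\delta_y(x)\,dx = \phi(y)\int_\alpha^\beta \delta_y(x)\,dx = \phi(y). \tag{70}$$
The rigorous definition of $\delta_y$ will be based on this formal relation.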
2.8.3 Rigorous Approach via Generalized Functions

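In this section, $G$ denotes a nonempty open set in $\mathbb{R}^N$, $N \ge 1$.
Definition 1. Let $x = (\xi_1, \ldots, \xi_N)$ and $\partial_j := \partial/\partial\xi_j$. By a multiindex $\alpha = (\alpha_1, \ldots, \alpha_N)$, we understand a tuple of nonnegative integers $\alpha_1, \ldots, \alpha_N$. We set $|\alpha| := \alpha_1 + \cdots + \alpha_N$ and
$$\partial^\alpha u := \partial_1^{\alpha_1} \cdots \partial_N^{\alpha_N} u,$$
i.e.,
$$\partial^\alpha u = \frac{\partial^{|\alpha|}u}{\partial\xi_1^{\alpha_1} \cdots \partial\xi_N^{\alpha_N}}.$$
For $\alpha = (0, \ldots, 0)$, we set $\partial^\alpha u := u$.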
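Proposition 2 (Integration by parts). For all $u, v \in C_0^\infty(G)$ and all multiindices $\alpha$,
$$\int_G (\partial^\alpha u)v\,dx = (-1)^{|\alpha|}\int_G u\,\partial^\alpha v\,dx. \tag{71}$$
Proof. Use repeatedly the classic formula
$$\int_G (\partial_j u)v\,dx = -\int_G u\,\partial_j v\,dx \quad \text{for all } u, v \in C_0^\infty(G).$$
For example,
$$\int_G (\partial_i\partial_j u)v\,dx = -\int_G \partial_j u\,\partial_i v\,dx = \int_G u(\partial_j\partial_i v)\,dx$$
for all $u, v \in C_0^\infty(G)$. This is (71) for $\partial^\alpha = \partial_i\partial_j$; the general case follows in the same way. □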
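We set $\mathcal{D}(G) := C_0^\infty(G)$. Let $\phi_n, \phi \in \mathcal{D}(G)$ for all $n$. We write
$$\phi_n \to \phi \quad \text{in } \mathcal{D}(G) \text{ as } n \to \infty \tag{72}$$
iff
(i) there exists a compact subset $K$ of the open set $G$ such that
$$\phi_n(x) = 0 \quad \text{for all } x \in G - K \text{ and all } n,$$
and
(ii) we have the uniform convergence $\partial^\alpha\phi_n \to \partial^\alpha\phi$ on $K$ for all multiindices $\alpha$, i.e.,
$$\max_{x \in K} |\partial^\alpha\phi_n(x) - \partial^\alpha\phi(x)| \to 0 \quad \text{as } n \to \infty \text{ for all multiindices } \alpha.$$
If $G = \,]\alpha,\beta[$, then we set $\mathcal{D}(\alpha,\beta) := \mathcal{D}(G)$.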
Fundamental Definition 3. By a generalized function $U \in \mathcal{D}'(G)$, we understand a linear, sequentially continuous functional
$$U\colon \mathcal{D}(G) \to \mathbb{R},$$
with respect to the convergence (72), i.e.,
$$U(a\phi + b\psi) = aU(\phi) + bU(\psi) \quad \text{for all } \phi, \psi \in \mathcal{D}(G),\ a, b \in \mathbb{R},$$
and, as $n \to \infty$,
$$\phi_n \to \phi \text{ in } \mathcal{D}(G) \quad \text{implies} \quad U(\phi_n) \to U(\phi).$$
The theory of generalized functions (also called distributions) was created by Laurent Schwartz around 1950.
We want to show first that a broad class of functions can be identified with generalized functions. More precisely, we want to show that
$$L_2(G) \subseteq \mathcal{D}'(G), \tag{73}$$
i.e., the Hilbert space $L_2(G)$ can be identified with a linear subspace of the linear space $\mathcal{D}'(G)$.
Standard Example 4. For $u \in L_2(G)$, we define
$$U(\phi) := \int_G u(x)\phi(x)\,dx \quad \text{for all } \phi \in \mathcal{D}(G). \tag{74}$$
Then
(i) $U$ is a generalized function, i.e., $U \in \mathcal{D}'(G)$.
(ii) If $u = v$ in the Hilbert space $L_2(G)$, then $U = V$ in $\mathcal{D}'(G)$.
(iii) The map $u \mapsto U$ from $L_2(G)$ into $\mathcal{D}'(G)$ is injective.
Proof. Ad (i). Obviously, the functional $U\colon \mathcal{D}(G) \to \mathbb{R}$ is linear, and
$$\phi_n \to \phi \quad \text{in } \mathcal{D}(G) \text{ as } n \to \infty$$
implies $\phi_n \to \phi$ in $L_2(G)$, and hence $U(\phi_n) \to U(\phi)$ as $n \to \infty$.
Ad (ii). It follows from $u(x) = v(x)$ for almost all $x \in G$ that
$$\int_G u(x)\phi(x)\,dx = \int_G v(x)\phi(x)\,dx \quad \text{for all } \phi \in \mathcal{D}(G).$$
Ad (iii). Let $u, v \in L_2(G)$, and suppose that
$$U(\phi) = \int_G u(x)\phi(x)\,dx = \int_G v(x)\phi(x)\,dx \quad \text{for all } \phi \in \mathcal{D}(G).$$
This implies
$$\int_G (u(x) - v(x))\phi(x)\,dx = 0 \quad \text{for all } \phi \in \mathcal{D}(G).$$
By the variational lemma from Section 2.2.4, we get $u = v$ in $L_2(G)$. □
Standard Example 5. Let $y \in G$. We set
$$\delta_y(\phi) := \phi(y) \quad \text{for all } \phi \in \mathcal{D}(G). \tag{75}$$
Obviously, this is a generalized function, which we call the Dirac $\delta_y$-distribution.
In the special case where $G = \,]\alpha,\beta[$, definition (75) is motivated by the formal relation (70) of physicists and by (74).
General Strategy 6. The basic definitions in the theory of generalized functions are chosen in such a way that they are generalizations of the corresponding definitions for functions via relation (74).
As a typical example, let us consider the derivative of a generalized function.
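Definition 7. For a generalized function $U \in \mathcal{D}'(G)$, the derivative $\partial^\alpha U$ is defined through
$$(\partial^\alpha U)(\phi) := (-1)^{|\alpha|}U(\partial^\alpha\phi) \quad \text{for all } \phi \in \mathcal{D}(G), \tag{76}$$
and for all multiindices $\alpha$.
In order to motivate this formula, assume that $U$ corresponds to the function $u \in \mathcal{D}(G)$ and $V$ corresponds to $\partial^\alpha u$, in the sense of Standard Example 4, i.e., for all $\phi \in \mathcal{D}(G)$,
$$U(\phi) = \int_G u\phi\,dx \quad\text{and}\quad V(\phi) = \int_G (\partial^\alpha u)\phi\,dx.$$
Integration by parts (71) yields $V(\phi) = (-1)^{|\alpha|}\int_G u\,\partial^\alpha\phi\,dx = (-1)^{|\alpha|}U(\partial^\alpha\phi)$. Hence $V = \partial^\alpha U$.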
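Proposition 8. If $U \in \mathcal{D}'(G)$, then $\partial^\alpha U \in \mathcal{D}'(G)$ for all multiindices $\alpha$.
Proof. The functional $\partial^\alpha U\colon \mathcal{D}(G) \to \mathbb{R}$ is linear. In fact, for all $\phi, \psi \in \mathcal{D}(G)$ and all $a, b \in \mathbb{R}$,
$$(\partial^\alpha U)(a\phi + b\psi) = (-1)^{|\alpha|}U(a\partial^\alpha\phi + b\partial^\alpha\psi) = (-1)^{|\alpha|}\big(aU(\partial^\alpha\phi) + bU(\partial^\alpha\psi)\big) = a(\partial^\alpha U)(\phi) + b(\partial^\alpha U)(\psi).$$
Furthermore, as $n \to \infty$, it follows from
$$\phi_n \to \phi \quad \text{in } \mathcal{D}(G)$$
that $\partial^\alpha\phi_n \to \partial^\alpha\phi$ in $\mathcal{D}(G)$, and hence
$$U(\partial^\alpha\phi_n) \to U(\partial^\alpha\phi),$$
i.e., $(\partial^\alpha U)(\phi_n) \to (\partial^\alpha U)(\phi)$. □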

In contrast to classic functions, generalized functions possess derivatives
of arbitrary order.
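Standard Example 9. For fixed $y \in \mathbb{R}$, define the function $u\colon \mathbb{R} \to \mathbb{R}$ through
$$u(x) := \begin{cases} 0 & \text{if } x < y, \\ 1 & \text{if } x \ge y. \end{cases}$$
Then
$$U' = \delta_y \quad \text{in } \mathcal{D}'(\mathbb{R}),$$
where $U$ denotes that generalized function that corresponds to $u$.
Proof. For all $\phi \in \mathcal{D}(\mathbb{R})$,
$$U(\phi) = \int_{-\infty}^\infty u(x)\phi(x)\,dx.$$
Hence
$$U'(\phi) = -U(\phi') = -\int_y^\infty \phi'(x)\,dx = \phi(y) = \delta_y(\phi),$$
since the function $\phi$ vanishes outside some compact interval. □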
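Definition 10. Let $U_n, U \in \mathcal{D}'(G)$ for all $n$. We write
$$U_n \to U \quad \text{as } n \to \infty \text{ in } \mathcal{D}'(G) \tag{77}$$
iff $U_n(\phi) \to U(\phi)$ as $n \to \infty$ for all $\phi \in \mathcal{D}(G)$.
Proposition 11. If $u_n, u \in L_2(G)$ for all $n$ and
$$u_n \to u \quad \text{in } L_2(G) \text{ as } n \to \infty,$$
then relation (77) holds true for the corresponding generalized functions.
Proof. Observe that $\mathcal{D}(G) \subseteq L_2(G)$. Hence, for all $\phi \in \mathcal{D}(G)$,
$$\int_G u_n\phi\,dx \to \int_G u\phi\,dx \quad \text{as } n \to \infty. \qquad \Box$$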
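Standard Example 12. Let $-\infty < \alpha < \beta < \infty$ and $y \in\, ]\alpha,\beta[$. Suppose that we are given a family $(f_{y,\varepsilon})$ of continuous functions $f_{y,\varepsilon}\colon\, ]\alpha,\beta[\, \to \mathbb{R}$ as considered in Section 2.8.1. Then,
$$F_{y,\varepsilon} \to \delta_y \quad \text{as } \varepsilon \to +0 \text{ in } \mathcal{D}'(\alpha,\beta),$$
where $F_{y,\varepsilon}$ denotes the generalized function that corresponds to $f_{y,\varepsilon}$. This is a rigorous formulation of (68c).
Proof. Let $\phi \in \mathcal{D}(\alpha,\beta)$. By the mean value theorem, there is a $\hat{y} \in [y - \varepsilon, y + \varepsilon]$ such that
$$F_{y,\varepsilon}(\phi) = \int_\alpha^\beta f_{y,\varepsilon}(x)\phi(x)\,dx = \phi(\hat{y})\int_\alpha^\beta f_{y,\varepsilon}(x)\,dx = \phi(\hat{y}).$$
This yields
$$\lim_{\varepsilon \to +0} F_{y,\varepsilon}(\phi) = \phi(y). \qquad \Box$$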
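The convergence $F_{y,\varepsilon}(\phi) \to \phi(y)$ can be watched numerically. The following is a minimal sketch under illustrative assumptions: the interval is $]-1, 1[$, the test function is the bump $\phi(x) = \exp(-1/(1 - x^2))$, and $f_{y,\varepsilon}$ is a triangular "hat" density with integral one supported in $[y - \varepsilon, y + \varepsilon]$. None of these concrete choices come from the text.

```python
import math

def phi(x):
    """Bump test function on ]-1, 1[ (illustrative choice)."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def hat(x, y, eps):
    """Continuous, nonnegative, supported in [y - eps, y + eps], integral 1."""
    return max(0.0, 1.0 - abs(x - y) / eps) / eps

def F(y, eps, m=2000):
    """F_{y,eps}(phi) = int f_{y,eps}(x) phi(x) dx, midpoint rule on the support."""
    h = 2.0 * eps / m
    total = 0.0
    for k in range(m):
        x = y - eps + (k + 0.5) * h
        total += hat(x, y, eps) * phi(x)
    return total * h

y = 0.3
gaps = [abs(F(y, eps) - phi(y)) for eps in (0.2, 0.1, 0.05, 0.025)]
```

The entries of `gaps` shrink as $\varepsilon$ decreases, which is the numerical counterpart of $F_{y,\varepsilon} \to \delta_y$ tested against this one $\phi$.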
Applications to electrostatics can be found in Problem 2.9.
2.8.4 Applications to the Green Function

Proposition 13. Let $\mathcal{G} = \mathcal{G}(x,y)$ denote the Green function as defined in (48). Then, for each $y \in\, ]\alpha,\beta[$,
$$-U'' = \delta_y \quad \text{on } \mathcal{D}(\alpha,\beta),$$
where $U$ denotes the generalized function that corresponds to the classic function $u(x) := \mathcal{G}(x,y)$ for all $x \in [\alpha,\beta]$.
This result rigorously justifies equation (69).
Proof. To simplify the computations, we set $\alpha = 0$ and $\beta = 1$. The general case proceeds analogously. Let $\phi \in \mathcal{D}(0,1)$. Then
$$U(\phi) = \int_0^1 \mathcal{G}(x,y)\phi(x)\,dx.$$
By (48),
$$U''(\phi) = \int_0^1 \mathcal{G}(x,y)\phi''(x)\,dx = \int_0^y (1 - y)x\phi''(x)\,dx + \int_y^1 (1 - x)y\phi''(x)\,dx.$$
With $\phi(0) = \phi'(0) = \phi(1) = \phi'(1) = 0$, integration by parts yields
$$U''(\phi) = (1 - y)y\phi'(y) - \int_0^y (1 - y)\phi'(x)\,dx - (1 - y)y\phi'(y) + \int_y^1 y\phi'(x)\,dx = -(1 - y)\phi(y) - y\phi(y) = -\phi(y) = -\delta_y(\phi). \qquad \Box$$
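The computation in this proof can be replayed with a concrete function. The following is a minimal sketch, assuming $\alpha = 0$, $\beta = 1$, and the polynomial $\phi(x) = x^2(1 - x)^2$. This $\phi$ is not an element of $\mathcal{D}(0,1)$, but it satisfies the boundary conditions $\phi(0) = \phi'(0) = \phi(1) = \phi'(1) = 0$ that the integration by parts actually uses, and it makes both integrands cubic polynomials, so Simpson's rule on $[0, y]$ and $[y, 1]$ is exact.

```python
def phi(x):
    """Illustrative function with phi = phi' = 0 at x = 0 and x = 1."""
    return x * x * (1.0 - x) ** 2

def d2phi(x):
    """Second derivative of phi(x) = x^2 - 2x^3 + x^4."""
    return 2.0 - 12.0 * x + 12.0 * x * x

def simpson(g, a, b):
    """Simpson's rule on [a, b]; exact for polynomials up to degree 3."""
    return (b - a) / 6.0 * (g(a) + 4.0 * g(0.5 * (a + b)) + g(b))

def second_derivative_of_U(y):
    """U''(phi) = int_0^1 G(x, y) phi''(x) dx, split at the kink x = y."""
    left = simpson(lambda x: (1.0 - y) * x * d2phi(x), 0.0, y)
    right = simpson(lambda x: (1.0 - x) * y * d2phi(x), y, 1.0)
    return left + right

checks = [(y, second_derivative_of_U(y), -phi(y)) for y in (0.25, 0.5, 0.8)]
```

In each triple of `checks`, the second and third entries coincide up to rounding, i.e., $U''(\phi) = -\phi(y)$.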
2.8.5 Generalized Derivatives

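Recall that $G$ denotes a nonempty open set in $\mathbb{R}^N$, $N \ge 1$.
Definition 14. Let $u, w \in L_2(G)$, and let $\alpha$ denote any multiindex. We write
$$w = \partial^\alpha u \tag{78}$$
iff this relation holds for the corresponding generalized functions, i.e., $W = \partial^\alpha U$. Explicitly,
$$\int_G w\phi\,dx = (-1)^{|\alpha|}\int_G u\,\partial^\alpha\phi\,dx \quad \text{for all } \phi \in \mathcal{D}(G). \tag{79}$$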
If (78) holds true, then we call the function $w$ a generalized derivative of the function $u$ of type $\partial^\alpha$.
This generalizes Definition 3 in Section 2.5.2.
Proposition 15. The generalized derivative $w$ from (78) is uniquely determined up to the values of $w$ on a set of $N$-dimensional measure zero. Moreover, $w$ is uniquely determined as an element of the Hilbert space $L_2(G)$.
Proof. Let $u, v, w \in L_2(G)$, and let
$$w = \partial^\alpha u \quad \text{as well as} \quad v = \partial^\alpha u.$$
By (79),
$$\int_G (v - w)\phi\,dx = 0 \quad \text{for all } \phi \in \mathcal{D}(G),$$
and the variational lemma from Section 2.2.3 implies
$$w(x) = v(x) \quad \text{for almost all } x \in G.$$
Hence $w = v$ in $L_2(G)$. □
2.9 Orthogonal Projection

We consider the minimum problem
$$\|u - v\| = \min!, \quad v \in M, \tag{80}$$
and make the following assumptions:
(H) Let $M$ be a closed linear subspace of the real or complex Hilbert space $X$, and let $u \in X$ be given.
Figure 2.15 shows the geometrical meaning of (80), i.e., we seek the foot $v$ of a perpendicular from the point $u$ to the plane $M$. Let $M^\perp$ denote the orthogonal complement to $M$, that is, by definition,
$$M^\perp := \{w \in X\colon (w \mid v) = 0 \text{ for all } v \in M\}.$$
Theorem 2.D (The perpendicular principle). Assume (H). Then, the minimum problem (80) has a unique solution $v$, and $u - v \in M^\perp$.
Corollary 1 (Orthogonal decomposition). If (H) holds, then there exists a unique decomposition of $u$ of the form
$$u = v + w, \quad v \in M, \quad w \in M^\perp.$$
[FIGURE 2.15.]
Proof of Theorem 2.D. Since
$$\|u - v\|^2 = (u - v \mid u - v) = (u \mid u) - (u \mid v) - (v \mid u) + (v \mid v),$$
problem (80) is equivalent to
$$2^{-1}a(v,v) - b(v) = \min!, \quad v \in M, \tag{80*}$$
where
$$a(v,w) := \operatorname{Re}(v \mid w), \quad b(v) := 2^{-1}[(u \mid v) + (v \mid u)] = \operatorname{Re}(u \mid v).$$
Note that $a(v,v) = (v \mid v)$ for all $v \in X$. By the Schwarz inequality,
$$|a(v,w)| \le \|v\|\,\|w\|, \quad a(v,v) \ge \|v\|^2, \quad |b(v)| \le \|u\|\,\|v\|,$$
for all $v, w \in X$.
First let $X$ be a real Hilbert space; then $M$ is also a real Hilbert space. It follows from Theorem 2.A that problem (80*) has a unique solution $v$. Hence the original problem (80) has also a unique solution $v$.
Now let $X$ be a complex Hilbert space. Then, $X$ becomes a real Hilbert space with respect to the new inner product
$$(v \mid w)_* := \operatorname{Re}(v \mid w) \quad \text{for all } v, w \in X.$$
Again by Theorem 2.A, problem (80*) has a unique solution $v$, and hence (80) has a unique solution $v$.
Finally, let $X$ be a Hilbert space over $\mathbb{K}$. We want to show that $u - v \in M^\perp$. Since $v$ is a solution of (80), we get
$$\|u - v\|^2 \le \|u - (v + \lambda w)\|^2 \quad \text{for all } w \in M,\ \lambda \in \mathbb{K}.$$
Hence
$$(u - v \mid u - v) \le (u - v \mid u - v) - \lambda(u - v \mid w) - \bar{\lambda}(w \mid u - v) + |\lambda|^2(w \mid w).$$
Suppose that $u - v \ne 0$ and $w \ne 0$. Letting $\lambda := \dfrac{(w \mid u - v)}{\|w\|^2}$, we get $0 \le -\dfrac{|(u - v \mid w)|^2}{\|w\|^2}$, and hence
$$(u - v \mid w) = 0 \quad \text{for all } w \in M.$$
This remains true if $u - v = 0$. □
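Proof of Corollary 1. The existence of such a decomposition follows from Theorem 2.D.
To prove the uniqueness of the decomposition, let
$$u = v_1 + w_1, \quad v_1 \in M, \quad w_1 \in M^\perp,$$
be a second decomposition of $u$. Then
$$0 = (v - v_1) + (w - w_1).$$
Since $v - v_1 \in M$ and $w - w_1 \in M^\perp$, inner multiplication by $v - v_1$ yields $\|v - v_1\|^2 = 0$. Hence $v = v_1$ and $w = w_1$. □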
Corollary 2 (The Pythagorean theorem). If $u$ is orthogonal to $v$, i.e., $(u \mid v) = 0$, then
$$\|u + v\|^2 = \|u\|^2 + \|v\|^2$$
(cf. Figure 2.6(b)).
Proof. $\|u + v\|^2 = (u + v \mid u + v) = (u \mid u) + (u \mid v) + (v \mid u) + (v \mid v) = (u \mid u) + (v \mid v)$. □
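The perpendicular principle and the Pythagorean theorem can be illustrated in $\mathbb{R}^3$. The following is a minimal sketch, assuming the Euclidean inner product and a subspace $M$ spanned by two linearly independent vectors chosen for illustration; the projection is computed by Gram-Schmidt orthonormalization, and the function name `project` is hypothetical.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(u, basis):
    """Orthogonal projection of u onto span(basis) in R^n (Euclidean inner product).

    Assumes the basis vectors are linearly independent.
    """
    ortho = []
    for m in basis:
        w = list(m)
        for q in ortho:                      # Gram-Schmidt step
            coef = dot(q, w)
            w = [wi - coef * qi for wi, qi in zip(w, q)]
        length = math.sqrt(dot(w, w))
        ortho.append([wi / length for wi in w])
    v = [0.0] * len(u)
    for q in ortho:
        coef = dot(q, u)
        v = [vi + coef * qi for vi, qi in zip(v, q)]
    return v

u = [1.0, 2.0, 3.0]                          # illustrative data
M = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
v = project(u, M)                            # foot of the perpendicular
w = [ui - vi for ui, vi in zip(u, v)]        # u - v, expected in the complement
```

Here `w` is orthogonal to both spanning vectors of `M`, and `dot(u, u)` equals `dot(v, v) + dot(w, w)` up to rounding, as in the orthogonal decomposition $u = v + w$.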
2.10 Linear Functionals and the Riesz Theorem

Theorem 2.E (The Riesz theorem). Let $X$ be a Hilbert space over $\mathbb{K}$, and let $X^*$ denote the dual space of $X$.
Then, $f \in X^*$ iff there is a $v \in X$ such that
$$f(u) = (v \mid u) \quad \text{for all } u \in X. \tag{81}$$
Here, the element $v$ of $X$ is uniquely determined by $f$.
In addition,
$$\|f\| = \|v\|. \tag{82}$$
Proof. Step 1: Uniqueness of $v$. It follows from
$$(v \mid u) = (v_1 \mid u) \quad \text{for all } u \in X$$
that $(v - v_1 \mid u) = 0$ for $u = v - v_1$, and hence $v = v_1$.
Step 2: Existence of $v$. Let $f \in X^*$ with $f \ne 0$. The null space
$$N(f) := \{u \in X\colon f(u) = 0\}$$
is a closed linear subspace of $X$. In fact, if $f(u_n) = 0$ for all $n$ and $u_n \to u$ as $n \to \infty$, then $f(u) = 0$, by the continuity of $f$. According to the orthogonal decomposition theorem (Corollary 1 in Section 2.9), there exists an element
$$u_0 \in N(f)^\perp \quad \text{with } u_0 \ne 0.$$
Otherwise we would have $N(f)^\perp = \{0\}$, and hence $N(f) = X$. But this is impossible because of $f \ne 0$. Since $u_0 \notin N(f)$, we get $f(u_0) \ne 0$. Without any loss of generality we may assume that $f(u_0) = 1$. This implies
$$f(u - f(u)u_0) = 0 \quad \text{for all } u \in X,$$
i.e., $u - f(u)u_0 \in N(f)$. Hence we obtain the orthogonal decomposition
$$u = w + f(u)u_0, \quad w \in N(f), \quad u_0 \in N(f)^\perp. \tag{83}$$
Inner multiplication by $u_0$ yields
$$(u_0 \mid u) = f(u)(u_0 \mid u_0) \quad \text{for all } u \in X.$$
This implies (81) with $v := \dfrac{u_0}{(u_0 \mid u_0)}$.
If $f = 0$, then (81) holds with $v = 0$.
Step 3: Conversely, if $f$ is given through (81), then $f \in X^*$. In fact, $f$ is linear because
$$f(\alpha u + \beta w) = (v \mid \alpha u + \beta w) = \alpha(v \mid u) + \beta(v \mid w) = \alpha f(u) + \beta f(w) \quad \text{for all } u, w \in X,\ \alpha, \beta \in \mathbb{K}.$$
Furthermore, the continuity of $f$ follows from
$$|f(u)| = |(v \mid u)| \le \|v\|\,\|u\| \quad \text{for all } u \in X. \tag{84}$$
Step 4: By (84), $\|f\| \le \|v\|$. Furthermore, $f(v) = \|v\|\,\|v\|$. Hence
$$\|f\| = \sup_{\|u\| \le 1} |f(u)| = \|v\|. \qquad \Box$$
Equation (83) tells us the following fundamental geometrical fact:
If $f$ is a nonzero linear continuous functional on a Hilbert space, then the null space $N(f)$ of $f$ is a closed plane and its orthogonal complement $N(f)^\perp$ has dimension one, i.e.,
$$\dim N(f)^\perp = 1.$$
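In finite dimensions the Riesz representer can be written down explicitly. The following is a minimal sketch, assuming $X = \mathbb{R}^3$ with a weighted inner product $(u \mid w) = \sum_i m_i u_i w_i$ with positive weights, and a functional $f(u) = \sum_i c_i u_i$; the weights and coefficients are illustrative values, not from the text. The representer is then $v_i = c_i/m_i$.

```python
import math

weights = [1.0, 2.0, 4.0]      # (u | w) := sum_i weights[i] * u[i] * w[i]
coeffs = [3.0, -2.0, 8.0]      # f(u) := sum_i coeffs[i] * u[i]

def inner(u, w):
    return sum(m * a * b for m, a, b in zip(weights, u, w))

def f(u):
    return sum(c * a for c, a in zip(coeffs, u))

# Riesz representer: f(u) = (v | u) for all u.
v = [c / m for c, m in zip(coeffs, weights)]
norm_v = math.sqrt(inner(v, v))

# The supremum in ||f|| = sup_{||u|| <= 1} |f(u)| is attained at u = v / ||v||.
unit = [x / norm_v for x in v]
sample = [1.0, -1.0, 0.5]
```

For every vector, `f(u)` equals `inner(v, u)`, and `f(unit)` equals `norm_v`, which mirrors (81) and (82).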
2.11 The Duality Map

Definition 1. Let $X$ be a Hilbert space over $\mathbb{K}$. We define the duality map
$$J\colon X \to X^*$$
of $X$ through $J(v) := f$, where $f$ is given by (81).
Using the notation $\langle f, u\rangle = f(u)$ for $f \in X^*$ and $u \in X$, this means
$$\langle J(v), u\rangle := (v \mid u) \quad \text{for all } u, v \in X.$$
Proposition 2. The duality map $J$ is bijective, continuous, and norm preserving, i.e.,
$$\|J(u)\| = \|u\| \quad \text{for all } u \in X.$$
If $X$ is a real Hilbert space, then $J$ is linear.
If $X$ is a complex Hilbert space, then $J$ is antilinear, i.e.,
$$J(\alpha v + \beta w) = \bar{\alpha}Jv + \bar{\beta}Jw \quad \text{for all } \alpha, \beta \in \mathbb{C},\ v, w \in X.$$
Proof. This follows from Theorem 2.E. □

The duality map will be used critically in Section 5.4 in connection with
the energetic extension of symmetric operators. This allows important ap-
plications to mathematical physics.
2.12 Duality for Quadratic Variational Problems

Along with the original minimum problem
$F(u) := a(u,u) - 2b(u) = \min!$, $u \in X$, (85)
let us consider the dual maximum problem
$F^*(v) := -a(v,v) = \max!$, $v \in v_0 + Y$, (85*)
where $v_0 \in Z$ is a fixed solution of the following equation:
$a(v_0, w) = b(w)$ for all $w \in X$.
We make the following assumptions:
(H1) Let $X$ be a linear closed subspace of the real Hilbert space $Z$.
(H2) Let $a: Z \times Z \to \mathbb{R}$ be a bounded, symmetric, positive bilinear form.
(H3) Let $a: X \times X \to \mathbb{R}$ be strongly positive, i.e.,
$c\|u\|^2 \le a(u,u)$ for all $u \in X$ and fixed $c > 0$.
(H4) The functional $b: X \to \mathbb{R}$ is linear and continuous.
Finally, we set
$Y := \{u \in Z : a(u,w) = 0 \text{ for all } w \in X\}$.
Theorem 2.F (Duality). The original minimum problem (85) has a unique
solution $u_0$.
The element $u_0$ is also the unique solution of the dual maximum problem
(85*), and the extremal values of (85) and (85*) are the same, i.e.,
$F(u_0) = F^*(u_0)$.
Moreover, we have
$a(u - v, u - v) = F(u) - F^*(v)$ for all $u \in X$, $v \in v_0 + Y$. (86)
Corollary 1 (Error estimates). For all $u \in X$ and $v \in v_0 + Y$, we get
$F^*(v) \le F(u_0) \le F(u)$ (87)
and
$c\|u_0 - u\|^2 \le a(u_0 - u, u_0 - u) \le F(u) - F^*(v)$. (88)
In numerical analysis, one computes u and v in Corollary 1 as solutions

of the Ritz method for (85) and (85*), respectively. The Ritz method for
the dual problem (85*) is also called the Trefftz method.
Proof. By Theorem 2.A in Section 2.4, problem (85) has a unique solution
$u_0$ that satisfies the variational equation
$a(u, u_0) = b(u)$ for all $u \in X$.
By construction of $Y$,
$a(u, v - v_0) = 0$ for all $u \in X$, $v \in v_0 + Y$,
and the choice of $v_0$ yields the key relation
$a(u, v) = a(u, v_0) = b(u)$ for all $u \in X$, $v \in v_0 + Y$.
Hence
$0 \le a(u - v, u - v) = a(u,u) - 2a(u,v) + a(v,v) = a(u,u) - 2b(u) + a(v,v) = F(u) - F^*(v)$ for all $u \in X$, $v \in v_0 + Y$. (89)
This implies
$F^*(v) \le F(u_0)$ for all $v \in v_0 + Y$.
Furthermore, we shall show that $u_0 \in v_0 + Y$ and
$F^*(u_0) = F(u_0)$. (90)
Thus, $u_0$ is a solution of the dual problem (85*).

To prove (90) observe that
$a(u_0, w) = b(w)$ and $a(v_0, w) = b(w)$ for all $w \in X$.
Hence $a(u_0 - v_0, w) = 0$ for all $w \in X$, i.e., $u_0 - v_0 \in Y$. Furthermore,
letting $u = v = u_0$ in (89), we get (90).
Corollary 1 is an immediate consequence of Theorem 2.F. $\Box$
Example 2. Let $X$ be a linear closed subspace of the real Hilbert space
$Z$. For given $v_0 \in Z$, we consider the minimum problem
$F(u) := \|v_0 - u\|^2 - \|v_0\|^2 = \min!$, $u \in X$, (91)
together with the dual maximum problem
$F^*(w) := -\|v_0 - w\|^2 = \max!$, $w \in X^\perp$, (91*)
where $\|\cdot\|$ denotes the norm on $Z$.

Then, problem (91) has a unique solution $u_0$, and $v_0 - u_0$ is the unique
solution of (91*). Moreover,
$F(u_0) = F^*(v_0 - u_0)$. (92)
The geometrical meaning of the result is pictured in Figure 2.16. Relation
(92) is identical to the Pythagorean theorem
$\|v_0\|^2 = \|u_0\|^2 + \|v_0 - u_0\|^2$, $u_0 \in X$, $v_0 - u_0 \in X^\perp$.
Proof. Use Theorem 2.F with
$a(u, v) := (u \mid v)_Z$ and $b(u) := (v_0 \mid u)_Z$.
Here, $Y = X^\perp$. $\Box$
FIGURE 2.16. [Sketch: the subspace $X$, its orthogonal complement $X^\perp$, and the elements $u_0 \in X$ and $v_0 - u_0 \in X^\perp$.]
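
The duality relations can be tried out on a small finite-dimensional instance. The following is a minimal sketch, assuming $Z = \mathbb{R}^3$ with the Euclidean inner product, $X$ the span of the first two unit vectors (so that $Y = X^\perp$ is the third coordinate axis), and an arbitrarily chosen $v_0$; none of these concrete choices comes from the text.

```python
def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))

def sub(p, q):
    return [pi - qi for pi, qi in zip(p, q)]

v0 = [1.0, 2.0, 3.0]             # assumed sample element of Z = R^3

def F(u):
    """Primal functional a(u,u) - 2 b(u) with a = dot and b(u) = (v0 | u)."""
    return dot(u, u) - 2.0 * dot(v0, u)

def F_star(v):
    """Dual functional -a(v,v), to be maximized over v0 + Y."""
    return -dot(v, v)

u0 = [v0[0], v0[1], 0.0]         # orthogonal projection of v0 onto X

u = [0.5, -1.0, 0.0]             # an arbitrary element of X
v = [1.0, 2.0, 3.0 - 4.5]        # an arbitrary element of v0 + Y

gap = F(u) - F_star(v)                    # right-hand side of the duality identity
energy = dot(sub(u, v), sub(u, v))        # a(u - v, u - v)
error = dot(sub(u0, u), sub(u0, u))       # a(u0 - u, u0 - u), with c = 1 here
```

For this data `gap` and `energy` coincide, `F(u0)` equals `F_star(u0)`, and `error` is bounded by `gap`, which is what the two-sided estimate of Corollary 1 predicts.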
2.13 The Linear Orthogonality Principle

The following three existence principles are mutually equivalent:
(i) the existence principle for quadratic minimum problems (Theorem 2.A);
(ii) the perpendicular principle (Theorem 2.D);
(iii) the Riesz theorem (Theorem 2.E).
These three principles represent variants of the linear orthogonality principle in Hilbert spaces.
In the preceding sections we have already proved that
(i) $\Rightarrow$ (ii) $\Rightarrow$ (iii).
It remains to show that (iii) $\Rightarrow$ (i). To this end, we consider the minimum
problem
$2^{-1} a(u,u) - b(u) = \min!$, $u \in X$, (93)
as in Section 2.4. Let us introduce the energetic inner product on $X$ through
$(u \mid v)_E := a(u, v)$ for all $u, v \in X$.
By Definition 2 in Section 2.6, the energetic space $X_E$ consists of the set
$X$ equipped with $(\cdot \mid \cdot)_E$, and Corollary 3 in Section 2.6 tells us that $X_E$ is
a Hilbert space. Moreover, there are positive constants $\alpha$ and $\beta$ such that
$\alpha\|u\|_E \le \|u\|_X \le \beta\|u\|_E$ for all $u \in X$.
By assumption, the linear functional $b: X \to \mathbb{R}$ is continuous. Hence
$|b(u)| \le \|b\|\,\|u\|_X \le \beta\|b\|\,\|u\|_E$ for all $u \in X$.
That is, $b(\cdot)$ also represents a linear continuous functional on $X_E$. By the
Riesz theorem, there is a $v \in X_E$ such that
$b(u) = (v \mid u)_E$ for all $u \in X_E$.

Consequently, problem (93) can be written in the following form:
$2^{-1}(u \mid u)_E - (v \mid u)_E = \min!$, $u \in X_E$.
This is equivalent to the problem
$(u - v \mid u - v)_E = (u \mid u)_E - 2(u \mid v)_E + (v \mid v)_E = \min!$,
which has the unique solution $u = v$.
2.14 Nonlinear Monotone Operators

We want to solve the nonlinear operator equation
$Au = z$, $u \in X$. (94)

(H1) The operator $A: X \to X$ is strongly monotone on the real Hilbert
space $X$, i.e., by definition, there is a constant $c > 0$ such that
$(Au - Av \mid u - v) \ge c\|u - v\|^2$ for all $u, v \in X$.
(H2) The operator $A$ is Lipschitz continuous, i.e., there is a constant $L > 0$
such that
$\|Au - Av\| \le L\|u - v\|$ for all $u, v \in X$.
Theorem 2.G. For each given $z \in X$, problem (94) has a unique solution
$u$.
This theorem was proved by Zarantonello in 1960. It marks the beginning

of the modern theory of monotone operators that allows many applications
to nonlinear mathematical physics. This can be found in Zeidler (1986),
Vols. 2, 4, and 5.
Proof. We will use the Banach fixed-point theorem. The idea of our proof is
to replace the original equation (94) by the equivalent fixed-point problem
$u = Bu$, $u \in X$, (95)
where
$Bu := u - t(Au - z)$ for fixed real $t > 0$.
If $X = \{0\}$, then the statement is trivial. Let $X \ne \{0\}$. For all $u, v \in X$,
$\|Bu - Bv\|^2 = \|u - v\|^2 - 2t(Au - Av \mid u - v) + t^2\|Au - Av\|^2 \le m\|u - v\|^2$, (96)
where
$m := 1 - 2tc + t^2L^2$.
By (96), $m \ge 0$. If $t = 0$ or $t = 2c/L^2$, then $m = 1$. This implies
$k := \sqrt{m} < 1$ for all $t \in\, ]0, 2c/L^2[$.

Therefore,
$\|Bu - Bv\| \le k\|u - v\|$ for all $u, v \in X$,
i.e., the operator $B$ is $k$-contractive for each $t \in\, ]0, 2c/L^2[$.
By the Banach fixed-point theorem in Section 1.6, problem (95) has a
unique solution $u$. $\Box$
In addition, it follows from the Banach fixed-point theorem that, for each
given $u_0 \in X$ and each fixed $t \in\, ]0, 2c/L^2[$, the iteration method
$u_{n+1} = u_n - t(Au_n - z)$, $n = 0, 1, \ldots$,
converges to the unique solution u of the original problem (94). Moreover,
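we have the error estimates
$\|u_n - u\| \le k^n(1 - k)^{-1}\|u_1 - u_0\|$ and $\|u_{n+1} - u\| \le k(1 - k)^{-1}\|u_{n+1} - u_n\|$ for $n = 1, 2, \ldots$.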
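
The iteration method can be sketched in a few lines. The example below is only an illustration under stated assumptions: $X = \mathbb{R}^2$, and the operator `A` is a hypothetical choice with $(Au)_i = u_i + 0.5\sin u_i$, which is strongly monotone with $c = 0.5$ and Lipschitz continuous with $L = 1.5$. The stopping rule uses the standard a posteriori bound of the Banach fixed-point theorem, $k(1-k)^{-1}\|u_{n+1} - u_n\|$.

```python
import math

def A(u):
    """Hypothetical operator on R^2: A(u)_i = u_i + 0.5*sin(u_i), so c = 0.5 and L = 1.5."""
    return [ui + 0.5 * math.sin(ui) for ui in u]

def solve(z, c=0.5, L=1.5, tol=1e-12, max_iter=10_000):
    """Iterate u_{n+1} = u_n - t (A u_n - z) with a step size t inside ]0, 2c/L^2[."""
    t = c / L**2                                  # midpoint of the admissible interval
    k = math.sqrt(1.0 - 2.0 * t * c + (t * L) ** 2)   # contraction constant sqrt(m)
    u = [0.0] * len(z)
    steps = 0
    for steps in range(1, max_iter + 1):
        step = [t * (ai - zi) for ai, zi in zip(A(u), z)]
        u = [ui - si for ui, si in zip(u, step)]
        # A posteriori bound on the distance between the new iterate and the solution.
        bound = k / (1.0 - k) * math.sqrt(sum(s * s for s in step))
        if bound < tol:
            break
    return u, steps, k

z = [1.0, -2.0]
u, steps, k = solve(z)
residual = max(abs(ai - zi) for ai, zi in zip(A(u), z))
```

The choice $t = c/L^2$ minimizes $m = 1 - 2tc + t^2L^2$, giving $k = \sqrt{1 - c^2/L^2}$; any other $t$ in the open interval would also converge, only more slowly in the worst case.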
2.15 Applications to the Nonlinear Lax-Milgram Theorem and the Nonlinear Orthogonality Principle

We want to solve the equation
$a(u, v) = b(v)$ for fixed $u \in X$ and all $v \in X$. (97)
(H1) Let $b: X \to \mathbb{R}$ be a linear continuous functional on the real Hilbert
space $X$.
(H2) Let $a: X \times X \to \mathbb{R}$ be a function such that, for each $w \in X$,
$v \mapsto a(w, v)$
represents a linear continuous functional on $X$.
(H3) There are positive constants $L$ and $c$ such that, for all $u, v, w \in X$,
$c\|u - v\|^2 \le a(u, u - v) - a(v, u - v)$
and
$|a(u, w) - a(v, w)| \le L\|u - v\|\,\|w\|$.
Theorem 2.H (The nonlinear Lax-Milgram theorem). Problem (97) has
a unique solution.
Proof. By (H2) and the Riesz theorem in Section 2.10, for each $w \in X$,
there is an element called $Aw$ such that
$a(w, u) = (Aw \mid u)$ for all $u \in X$.
This way we get an operator $A: X \to X$. It follows from (H3) that
$c\|u - v\|^2 \le (Au - Av \mid u - v)$ for all $u, v \in X$,
i.e., $A$ is strongly monotone. Furthermore,
$|(Au - Av \mid w)| \le L\|u - v\|\,\|w\|$ for all $u, v, w \in X$.
Hence
$\|Au - Av\| = \sup_{\|w\| \le 1} |(Au - Av \mid w)| \le L\|u - v\|$ for all $u, v \in X$.
Again by the Riesz theorem, there is a $z \in X$ such that
$b(u) = (z \mid u)$ for all $u \in X$.
Consequently, the original problem (97) is equivalent to the operator equation
$Au = z$, $u \in X$. (98)
It follows now from Theorem 2.G in the preceding section that equation
(98) has a unique solution $u$. $\Box$
In the special case where $a: X \times X \to \mathbb{R}$ is bilinear, bounded, and strongly
positive, i.e.,
$a(w, w) \ge c\|w\|^2$ for all $w \in X$ and fixed $c > 0$,
assumptions (H2) and (H3) are satisfied. Then, Theorem 2.H is called the
linear Lax-Milgram theorem. If, in addition, $a(\cdot, \cdot)$ is symmetric, then problem (97) is identical to the variational equation from Theorem 2.A.
Section 2.13 motivates the following point of view:
Theorem 2.H can be regarded as a nonlinear orthogonality principle on
Hilbert spaces.
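
For the linear, nonsymmetric case, the reduction to an operator equation $Au = z$ can be illustrated on $\mathbb{R}^2$. This is a minimal sketch under assumptions that are not taken from the text: the matrix `M` defines a hypothetical bilinear form $a(u, v) = (Mu \mid v)$ with $a(w, w) = 2\|w\|^2$, the vector `z` represents $b$, and the equation is solved by the contraction iteration used in the proof of Theorem 2.G.

```python
import math

M = [[2.0, 1.0],
     [-1.0, 2.0]]          # hypothetical nonsymmetric form: a(u, v) = (M u | v)
z = [1.0, 3.0]             # hypothetical right-hand side: b(v) = (z | v)

def apply(mat, u):
    """Matrix-vector product; here it plays the role of the operator A."""
    return [sum(mij * uj for mij, uj in zip(row, u)) for row in mat]

c = 2.0                    # strong positivity constant: a(w, w) = 2 |w|^2
L = math.sqrt(5.0)         # Lipschitz constant: spectral norm of M, since M^T M = 5 I
t = c / L**2               # step size inside ]0, 2c/L^2[

u = [0.0, 0.0]
for _ in range(100):
    residual = [ai - zi for ai, zi in zip(apply(M, u), z)]
    u = [ui - t * ri for ui, ri in zip(u, residual)]

# After the loop, a(u, v) = b(v) should hold for the two basis vectors v = e1, e2.
check = [ai - zi for ai, zi in zip(apply(M, u), z)]
```

Solving $Mu = z$ by hand gives $u = (-0.2, 1.4)$, which the iteration reproduces up to rounding.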
Problems
In Problems 2.9ff the importance of generalized functions for mathematical
physics will be studied on an elementary level. Moreover, in Problems 2.12ff
we will study in detail a smoothing technique that plays a fundamental role
in modern analysis (Friedrichs' mollification).
2.1. Weierstrass' classical counterexample from 1870. Consider the minimum problem
$F(u) := \int_{-1}^{1} (x u'(x))^2\,dx = \min!$, $u \in C^1[-1,1]$, $u(-1) = 0$, $u(1) = 1$. (P)
Use the sequence
$u_n(x) := \frac{1}{2} + \frac{1}{2}\,\frac{\arctan nx}{\arctan n}$, $n = 1, 2, \ldots$,
in order to show that this variational problem has no solution. Recall
that $C^1[-1,1]$ denotes the space of continuously differentiable functions
$u: [-1,1] \to \mathbb{R}$.
Solution: Set $M := \{u \in C^1[-1,1] : u(-1) = 0 \text{ and } u(1) = 1\}$. Then,
problem (P) can be written in the following form:
$F(u) = \min!$, $u \in M$.
Since $u_n(-1) = 0$ and $u_n(1) = 1$, we get $u_n \in M$ for all $n$. Explicitly,
$F(u_n) = \frac{1}{4n(\arctan n)^2} \int_{-1}^{1} \frac{n^2x^2}{(1 + (nx)^2)^2}\,n\,dx = \frac{1}{4n(\arctan n)^2} \int_{-n}^{n} \frac{y^2}{(1 + y^2)^2}\,dy \le \frac{1}{4n(\arctan n)^2} \int_{-\infty}^{\infty} \frac{y^2}{(1 + y^2)^2}\,dy$.
Hence $F(u_n) \to 0$ as $n \to \infty$. Since $F(u) \ge 0$ for all $u \in M$, this implies
$\inf_{u \in M} F(u) = 0$,
i.e., $(u_n)$ is a minimal sequence for (P).

Suppose now that $u$ is a solution of (P). Then,
$F(u) = 0$, $u \in M$,
and hence
$x u'(x) = 0$ for all $x \in [-1,1]$.
This implies $u'(x) = 0$ on $[-1,1]$, i.e., $u(x) = \text{const}$. But, this contradicts
the side condition $u(-1) = 0$ and $u(1) = 1$.
This example was given by Weierstrass to show that a minimum problem
in the calculus of variations need not always have a solution, namely,
The infimum of the functional $F$ on the set $M$ is not attained at any
point $u$ of $M$.
We shall discover in Chapter 2 of AMS Vol. 109 that the reason for this
bad structure of (P) is related to the fact that the Banach space $C^1[-1,1]$
is not reflexive.
A detailed historical discussion can be found in Zeidler (1986), Vol. 2A,
Sections 18.7 through 18.9.
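
The decay of $F(u_n)$ along the Weierstrass sequence can be observed numerically. The following is a minimal sketch using a midpoint rule; the function names and the number of quadrature cells are arbitrary choices for this illustration. The closed form used for comparison follows from the antiderivative $\tfrac{1}{2}(\arctan y - y/(1+y^2))$ of $y^2/(1+y^2)^2$.

```python
import math

def F_un(n, cells=200_000):
    """Midpoint-rule approximation of F(u_n) = int_{-1}^{1} (x u_n'(x))^2 dx."""
    h = 2.0 / cells
    total = 0.0
    for i in range(cells):
        x = -1.0 + (i + 0.5) * h
        du = n / (2.0 * math.atan(n) * (1.0 + (n * x) ** 2))   # u_n'(x)
        total += (x * du) ** 2
    return total * h

def F_un_exact(n):
    """Closed form (arctan n - n/(1+n^2)) / (4 n arctan(n)^2)."""
    return (math.atan(n) - n / (1.0 + n * n)) / (4.0 * n * math.atan(n) ** 2)

values = {n: F_un(n) for n in (1, 10, 100, 1000)}
```

The values decrease roughly like $1/(2\pi n)$, so the infimum $0$ is approached although, as shown above, no admissible function attains it.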
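2.2. The classical Hilbert space $l_2^{\mathbb{K}}$. By definition, the space $l_2^{\mathbb{K}}$ consists of
all the sequences $(u_n)_{n \ge 1}$ with $u_n \in \mathbb{K}$ for all $n \in \mathbb{N}$ and
$\sum_{n=1}^{\infty} |u_n|^2 < \infty$.
The linear operations are defined as in Problem 1.5 for $\mathbb{K}^\infty$. Show that
$l_2^{\mathbb{K}}$ is an infinite-dimensional Hilbert space over $\mathbb{K}$ equipped with the inner
product
$(u \mid v) := \sum_{n=1}^{\infty} \bar{u}_n v_n$,
where $u := (u_n)$ and $v := (v_n)$. As usual, the bar denotes the conjugate
complex number.
Hint: Apply the limiting process $N \to \infty$ to the classical Schwarz inequality
$\sum_{n=1}^{N} |u_n v_n| \le \left(\sum_{n=1}^{N} |u_n|^2\right)^{1/2} \left(\sum_{n=1}^{N} |v_n|^2\right)^{1/2}$,
which corresponds to the Hilbert space $\mathbb{K}^N$ (cf. Standard Example 1 in
Section 2.2).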
Hilbert introduced the space l2 in 1906. He used this space in order to
establish his general theory of integral equations (cf. Hilbert (1912)). The
notion of an abstract Hilbert space was introduced by von Neumann in
1929.
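2.3. Simple identities. Let $X$ be a pre-Hilbert space over $\mathbb{K}$ with the inner
product $(\cdot \mid \cdot)$. Show that the following hold true:
(i) If $\mathbb{K} = \mathbb{R}$, then
$4(u \mid v) = \|u + v\|^2 - \|u - v\|^2$ for all $u, v \in X$. (99a)
(ii) If $\mathbb{K} = \mathbb{C}$, then
$4(u \mid v) = \|u + v\|^2 - \|u - v\|^2 - i\|u + iv\|^2 + i\|u - iv\|^2$ for all $u, v \in X$. (99b)
(iii) Apollonius' identity. If $\mathbb{K} = \mathbb{R}, \mathbb{C}$, then
$\|u - w\|^2 + \|v - w\|^2 = 2^{-1}\|u - v\|^2 + 2\|w - 2^{-1}(u + v)\|^2$ for all $u, v, w \in X$.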
2.4. The Banach space $C[a, b]$. Let $-\infty < a < b < \infty$. Show that the
Banach space $C[a, b]$ equipped with the usual maximum norm
$\|u\| = \max_{a \le x \le b} |u(x)|$ is not a Hilbert space.
Hint: Prove that the parallelogram identity is violated.
Use the same method in order to show that the Banach space $\mathbb{C}$ of
complex numbers equipped with the norm
$|x + iy| = |x| + |y|$ for all $x + iy \in \mathbb{C}$
is not a Hilbert space.
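
The hint can be made concrete with explicit elements. The sketch below is one possible choice and not the only one: on $C[0,1]$ it takes $u(x) = 1$ and $v(x) = x$ (the maximum norm is evaluated on a grid, which is exact here because the extrema lie at the endpoints), and on $\mathbb{C}$ it takes the numbers $1$ and $i$. The helper `parallelogram_defect` is a name introduced only for this illustration.

```python
def max_norm(f, a=0.0, b=1.0, samples=1001):
    """Maximum norm on C[a, b], approximated on an equidistant grid."""
    h = (b - a) / (samples - 1)
    return max(abs(f(a + i * h)) for i in range(samples))

def parallelogram_defect(norm, add, sub, u, v):
    """||u+v||^2 + ||u-v||^2 - 2||u||^2 - 2||v||^2, which vanishes in every pre-Hilbert space."""
    return norm(add(u, v)) ** 2 + norm(sub(u, v)) ** 2 - 2 * norm(u) ** 2 - 2 * norm(v) ** 2

# C[0, 1] with the maximum norm, u(x) = 1 and v(x) = x.
one = lambda x: 1.0
ident = lambda x: x
defect_C = parallelogram_defect(
    max_norm,
    lambda f, g: (lambda x: f(x) + g(x)),
    lambda f, g: (lambda x: f(x) - g(x)),
    one, ident)

# The complex numbers with the norm |x + iy| = |x| + |y|, using the elements 1 and i.
sum_norm = lambda w: abs(w.real) + abs(w.imag)
defect_sum = parallelogram_defect(
    sum_norm, lambda p, q: p + q, lambda p, q: p - q, 1 + 0j, 1j)
```

In both cases the defect is nonzero ($1$ and $4$, respectively), so neither norm is induced by an inner product.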
2.5.* The role of the parallelogram identity. Let $X$ be a normed space over
$\mathbb{K}$. Show that $X$ is a pre-Hilbert space iff the parallelogram identity holds,
i.e.,
$\|u + v\|^2 + \|u - v\|^2 = 2\|u\|^2 + 2\|v\|^2$ for all $u, v \in X$.
Hint: Use (99a) and (99b). Cf. Jordan and v. Neumann, On inner products in linear metric spaces, Annals of Math. (1935), 719-723.
2.6. Complexification of real Hilbert spaces. Let $X$ be a real pre-Hilbert space.
As in Problem 1.9, consider the complexification $X_{\mathbb{C}}$ of the space $X$, where
$X_{\mathbb{C}}$ consists of all the elements
$u + iv$ with $u, v \in X$.
Show that
(i) $X_{\mathbb{C}}$ becomes a complex pre-Hilbert space with the inner product
$(u + iv \mid w + iz) := (u \mid w) - i(v \mid w) + i(u \mid z) + (v \mid z)$
for all $u + iv, w + iz \in X_{\mathbb{C}}$.
(ii) If $X$ is a real Hilbert space, then $X_{\mathbb{C}}$ is a complex Hilbert space.
2.7. Orthogonal complements. Let $L$ be a linear subspace of the Hilbert
space $X$ over $\mathbb{K}$. Set $L^{\perp\perp} := (L^\perp)^\perp$. Show that
(a) $\overline{L} = L^{\perp\perp}$.
(b) $L$ is closed iff $L = L^{\perp\perp}$.
2.8. The Ritz method. By Section 2.7.1, the variational problem
$\int_0^\pi (2^{-1}u'^2 - u\cos x)\,dx = \min!$, $u(0) = u(\pi) = 0$, (V)
is equivalent to the boundary-value problem
$u''(x) + \cos x = 0$ on $[0, \pi]$, $u(0) = u(\pi) = 0$, (B)
which has a unique solution $u$. Explicitly,
$u(x) = \cos x + 2\pi^{-1}x - 1$.

Use the Ritz method in order to compute an approximate solution $u_{2n}$
of (V), by making the ansatz
$u_{2n}(x) = \sum_{k=1}^{2n} c_k \sin kx$.
Determine the coefficients $c_1, \ldots, c_{2n}$. Show that $(u_{2n})$ converges uniformly
on $[0, \pi]$ to the solution $u$ of (V).
Hint: Cf. Zeidler (1986), Vol. 2A, p. 94.
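
A numerical check of the Ritz approximation may help before attempting the proof. The sketch below is an illustration and not the worked solution from the cited reference: it assumes the Ritz equations decouple because $\int_0^\pi (\sin kx)'(\sin jx)'\,dx = \tfrac{\pi}{2}k^2\delta_{kj}$, and it uses $\int_0^\pi \cos x \sin kx\,dx = 2k/(k^2 - 1)$ for even $k$ and $0$ for odd $k$, which gives $c_k = 4/(\pi k(k^2 - 1))$ for even $k$.

```python
import math

def ritz_coefficient(k):
    """c_k from (pi/2) k^2 c_k = int_0^pi cos(x) sin(kx) dx (zero for odd k)."""
    if k % 2 == 1:
        return 0.0
    return 4.0 / (math.pi * k * (k * k - 1))

def u_ritz(x, n):
    """Ritz approximation u_{2n}(x) = sum_{k=1}^{2n} c_k sin(kx)."""
    return sum(ritz_coefficient(k) * math.sin(k * x) for k in range(1, 2 * n + 1))

def u_exact(x):
    return math.cos(x) + 2.0 * x / math.pi - 1.0

def max_error(n, samples=2001):
    """Maximum deviation between u_{2n} and the exact solution on a grid over [0, pi]."""
    xs = [math.pi * i / (samples - 1) for i in range(samples)]
    return max(abs(u_ritz(x, n) - u_exact(x)) for x in xs)

errors = {n: max_error(n) for n in (2, 10)}
```

Since $\sum_k |c_k|$ converges, the tail of the series bounds the uniform error, which is consistent with the observed decrease of `errors[n]`.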
2.9. Applications of generalized functions to mathematical physics. In the
following, we set $\delta := \delta_0$, i.e., $\delta(\phi) = \phi(0)$ for all $\phi \in C_0^\infty(\mathbb{R}^N)$. We want
to use very elementary arguments in order to explain the importance of
generalized functions for mathematical physics. In fact, the theory of generalized functions allows us to justify many classical heuristic arguments
of physicists (for example, see Problems 2.9b and 2.9f on applications to
electrostatics).
2.9a. A special fundamental solution. Show that the equation
$u^{(n)} = \delta$ on $\mathbb{R}$, $n = 1, 2, \ldots$,
has a solution $U \in \mathcal{D}'(\mathbb{R})$ that corresponds to the function
$u(x) = \begin{cases} \frac{x^{n-1}}{(n-1)!} & \text{if } x \ge 0, \\ 0 & \text{if } x < 0. \end{cases}$
Hint: Use a similar argument as in Standard Example 9 from Section
2.8.
2.9b. The electric field of a charged point. Let us first consider the classic
basic equation of electrostatics in a vacuum:
$-\varepsilon_0 \Delta u = \rho$ on $\mathbb{R}^3$. (100)
Here, $\rho$ denotes the charge density and $\varepsilon_0$ denotes the so-called dielectricity
constant in a vacuum. The function $u$ is called the electrostatic potential
caused by the charge density $\rho$. If there is a particle $P$ at the point $x$ of
charge $Q$, then the force $K$ acts on $P$, where
$K = QE(x)$ with $E(x) = -\operatorname{grad} u(x)$.
Here, the vector $E(x)$ is called the electric field at the point $x$.
Suppose now that $\rho$ corresponds to a point of charge $q$ at the origin
$x = 0$. Then, formally we get
$\rho(x) = \begin{cases} 0 & \text{if } x \ne 0, \\ \infty & \text{if } x = 0, \end{cases}$
and $\int_{\mathbb{R}^3} \rho(x)\,dx = q$. Thus, $\rho$ corresponds to the Dirac distribution $q\delta$.
Consequently, let us replace the original equation (100) with the equation
$-\varepsilon_0 \Delta U = q\delta$ (101)
in the sense of generalized functions.

Show that equation (101) has a solution $U \in \mathcal{D}'(\mathbb{R}^3)$ that corresponds to
the function
$u(x) = \frac{q}{4\pi\varepsilon_0 |x|}$. (102)
Moreover, show that $-\operatorname{grad} U$ corresponds to the classical vector field
$E(x) = \frac{qx}{4\pi\varepsilon_0 |x|^3}$. (103)
This is the classic electric field caused by a charged point at the origin with
charge $q$. The corresponding force $K = QE(x)$ is the Coulomb force.
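Solution: Ad (101). To simplify notation, set $q = 4\pi$ and $\varepsilon_0 = 1$. Let
$U(\phi) = \int_{\mathbb{R}^3} \frac{\phi(x)}{|x|}\,dx$.
By (101), we have to show that $-(\Delta U)(\phi) = 4\pi\delta(\phi)$ for all $\phi \in C_0^\infty(\mathbb{R}^3)$.
Since $\Delta = \partial_1^2 + \partial_2^2 + \partial_3^2$, the definition of derivatives of generalized functions
yields
$-U(\Delta\phi) = 4\pi\delta(\phi)$.
Explicitly, this means that
$-\int_{\mathbb{R}^3} \frac{\Delta\phi(x)}{|x|}\,dx = 4\pi\phi(0)$. (104)
To prove this, set $G := \{x \in \mathbb{R}^3 : 0 < r < |x| < R\}$. Choose $R$ so large
that $\phi(x) = 0$ if $|x| \ge R/2$. Using spherical coordinates, we get $dx = r^2 \sin\vartheta\,dr\,d\vartheta\,d\varphi$. Hence
$\int_{\mathbb{R}^3} \frac{\Delta\phi(x)}{|x|}\,dx = \int_0^{2\pi} \int_0^{\pi} \int_{r=0}^{R} \Delta\phi\; r \sin\vartheta\,dr\,d\vartheta\,d\varphi$.
Thus,
$\int_{\mathbb{R}^3} \frac{\Delta\phi(x)}{|x|}\,dx = \lim_{r \to 0} \int_G \frac{\Delta\phi(x)}{|x|}\,dx$.
Integration by parts (Green's formula) yields
$J := \int_G \frac{\Delta\phi(x)}{|x|}\,dx = \int_G \phi\,\Delta\!\left(\frac{1}{|x|}\right) dx + \int_{\partial G} \left( \frac{1}{|x|}\frac{\partial\phi}{\partial n} - \phi\,\frac{\partial}{\partial n}\!\left(\frac{1}{|x|}\right) \right) dO$,
where $\partial/\partial n$ denotes the derivative in the direction of the outer unit normal
vector, i.e., $\partial/\partial n = -\partial/\partial r$ for $|x| = r$. Since $\Delta(1/|x|) = 0$ on $G$ and $\phi$ vanishes
in a neighborhood of the set $\{x : |x| = R\}$, we get
$J = \int_{|x| = r} \left( -\frac{1}{r}\frac{\partial\phi}{\partial r} + \phi\,\frac{\partial}{\partial r}\!\left(\frac{1}{r}\right) \right) r^2 \sin\vartheta\,d\vartheta\,d\varphi$.
By the mean value theorem, there are points $y$ and $z$ with $|y| = |z| = r$
such that
$J = 4\pi(-r\phi_r(y) - \phi(z)) \to -4\pi\phi(0)$ as $r \to 0$.
This is (104).
Ad (103). By the definition of the derivative of generalized functions, we
have to show that
$\int_{\mathbb{R}^3} E(x)\phi(x)\,dx = -(\operatorname{grad} U)(\phi) = U(\operatorname{grad}\phi) = \int_{\mathbb{R}^3} \frac{q \operatorname{grad}\phi(x)}{4\pi\varepsilon_0 |x|}\,dx$ for all $\phi \in C_0^\infty(\mathbb{R}^3)$,
where $E(x) = \frac{qx}{4\pi\varepsilon_0 |x|^3}$. However, this follows as above by using integration
by parts.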
2.9c. Special fundamental solutions. Justify Table 2.1.

Table 2.1
Differential equation                                        Fundamental solution ($x \in \mathbb{R}^3$)
$-\Delta U = \delta$ on $\mathbb{R}^3$ (potential equation)          $u(x) = \frac{1}{4\pi|x|}$ (potential)
$U_t - \Delta U = \delta$ on $\mathbb{R}^4$ (heat equation)          $u(x, t) = \frac{e^{-|x|^2/4t}}{8(\pi t)^{3/2}}$ if $t > 0$, and $u(x, t) = 0$ if $t \le 0$ (temperature)
$U_{tt} - \Delta U = \delta$ on $\mathbb{R}^4$ (wave equation)       $U(\phi) = \frac{1}{4\pi} \int_0^\infty t^{-1} \left( \int_{|x| = t} \phi(x, t)\,dO_x \right) dt$
2.9d. Convolution of generalized functions. Let $L_{1,0}(\mathbb{R}^N)$ denote the set
of all measurable (e.g., continuous) functions $f: \mathbb{R}^N \to \mathbb{R}$ that vanish outside some compact set and that are integrable, i.e., $\int_{\mathbb{R}^N} |f(x)|\,dx < \infty$. Let
$F$ denote the generalized function corresponding to $f$, i.e.,
$F(\phi) = \int_{\mathbb{R}^N} f(x)\phi(x)\,dx$.
For all generalized functions $U \in \mathcal{D}'(\mathbb{R}^N)$ and all $f \in L_{1,0}(\mathbb{R}^N)$, we define
the convolution $U * F$ through
$(U * F)(\phi) = U\!\left( \int_{\mathbb{R}^N} f(x)\phi(x + y)\,dx \right)$,
where $U$ acts on the variable $y$.
Show that
(i) $U * F \in \mathcal{D}'(\mathbb{R}^N)$.
(ii) $\delta * F = F$.
(iii) $D^\alpha(U * F) = (D^\alpha U) * F$ for all derivatives $D^\alpha$.
Convince yourself that this follows simply from the corresponding definitions.
2.9e. The importance of fundamental solutions. Let $L$ denote any linear
differential operator of order $m = 1, 2, \ldots$ with real coefficients, i.e.,
$L := \sum_{|\alpha| \le m} a_\alpha D^\alpha$, (105)
where $a_\alpha \in \mathbb{R}$ for all $\alpha$. Suppose that $U \in \mathcal{D}'(\mathbb{R}^N)$ is a fundamental solution
of $L$, i.e.,
$LU = \delta$.
Let $f \in L_{1,0}(\mathbb{R}^N)$. Show that the convolution $V = U * F$ is a solution of
the nonhomogeneous equation
$LV = F$.
Solution: By Problem 2.9d,
$LV = L(U * F) = (LU) * F = \delta * F = F$.
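2.9f. Applications to electrostatics. We are given the charge density $\rho \in L_{1,0}(\mathbb{R}^3) \cap L_2(\mathbb{R}^3)$ (e.g., $\rho$ is continuous and vanishes outside some large
ball $B$). Let $P \in \mathcal{D}'(\mathbb{R}^3)$ denote the generalized function corresponding to
$\rho$. Show that the generalized function $V \in \mathcal{D}'(\mathbb{R}^3)$ corresponding to the
classic function
$v(x) = \int_{\mathbb{R}^3} \frac{\rho(y)\,dy}{4\pi\varepsilon_0 |x - y|}$ (106)
is a solution of the fundamental equation of electrostatics:
$-\varepsilon_0 \Delta V = P$. (107)
The function $v$ from (106) is called the classic volume potential. Recall
from classic analysis that $v$ is a classic solution of (107) provided $\rho$ is
sufficiently smooth (e.g., $\rho \in C^1(\mathbb{R}^3)$).
Solution: Using spherical coordinates, it follows as in Problem 2.9b that
$\sup_{x \in G} \int_B \frac{dy}{|x - y|^2} < \infty$,
where $B$ and $G$ denote arbitrary balls in $\mathbb{R}^3$. By the Schwarz inequality,
$\left| \int_{\mathbb{R}^3} \frac{\rho(y)\,dy}{|x - y|} \right|^2 \le \int_B \rho(y)^2\,dy \int_B \frac{dy}{|x - y|^2} < \infty$,
provided $\rho$ vanishes outside the ball $B$. Thus, the function $v$ is well defined
on $\mathbb{R}^3$ and bounded on each ball. Let $\int := \int_{\mathbb{R}^3}$. For all $\phi \in C_0^\infty(\mathbb{R}^3)$, set
$V(\phi) := \int v(x)\phi(x)\,dx$ and $P(\phi) := \int \rho(x)\phi(x)\,dx$.

By Problem 2.9b, the generalized function $U$ corresponding to $u(x) = (4\pi\varepsilon_0|x|)^{-1}$ is a fundamental solution of (107), i.e., $-\varepsilon_0 \Delta U = \delta$,
and Problem 2.9e tells us that the convolution $W := U * P$ is a solution of
(107). Explicitly, for all $\phi \in C_0^\infty(\mathbb{R}^3)$,
$W(\phi) = U\!\left( \int \rho(x)\phi(x + y)\,dx \right) = \int \frac{1}{4\pi\varepsilon_0 |y|} \left( \int \rho(x)\phi(x + y)\,dx \right) dy = \int \frac{1}{4\pi\varepsilon_0 |y|} \left( \int \rho(z - y)\phi(z)\,dz \right) dy$.
By the Tonelli theorem (cf. the appendix) and by the substitution $x := z - y$,
we get
$W(\phi) = \int \left( \int \frac{\rho(x)\,dx}{4\pi\varepsilon_0 |z - x|} \right) \phi(z)\,dz = \int v(z)\phi(z)\,dz = V(\phi)$.
Thus, $W = V$.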
2.9g. Generalized plane waves. Let $x := (\xi_1, \xi_2, \xi_3) \in \mathbb{R}^3$, $t \in \mathbb{R}$, and
consider the wave operator
Let $n := (n_1, n_2, n_3)$, where $n^2 = n_1^2 + n_2^2 + n_3^2 = 1$. Let the function
$\phi: \mathbb{R} \to \mathbb{R}$ be given such that $\phi \in L_{1,\mathrm{loc}}(\mathbb{R})$ (e.g., $\phi$ is continuous). Then,
the function
$u(x, t) := \phi(nx - ct)$, (108)
is called a plane wave, where $nx := n_1\xi_1 + n_2\xi_2 + n_3\xi_3$. Observe that the
function $u$ is constant for $nx - ct = \text{const}$. That is, $u$ is constant on planes
that move with the velocity $c$ in the direction of $n$.
Show that the generalized function $U$ corresponding to $u$ is a solution of
the wave equation
$\Box U = 0$. (109)
Observe that $u$ is a classic solution of (109) if $\phi$ is sufficiently smooth (e.g.,
$\phi \in C^2(\mathbb{R})$).
Hint: Approximate $\phi$ by polynomials $\phi_n$. Then, $\Box\phi_n = 0$ in the classical
sense. Cf. Zeidler (1986), Vol. 2, p. 1050.
2.9h. A special tensor product. In classic analysis, the tensor product
$\phi \otimes \psi$ of the two functions $\phi = \phi(x)$ and $\psi = \psi(y)$ is defined to be the
function
$(\phi \otimes \psi)(x, y) := \phi(x)\psi(y)$.
Let $U \in \mathcal{D}'(\mathbb{R}^N)$ and $\delta \in \mathcal{D}'(\mathbb{R})$. We define the tensor product $U \otimes \delta^{(n)}$
through
$(U \otimes \delta^{(n)})(\chi) := (-1)^n\, U\!\left( \frac{\partial^n \chi}{\partial t^n}(\cdot, 0) \right)$, $n = 0, 1, \ldots$, (110)
for all functions $\chi = \chi(x, t)$ with $\chi \in C_0^\infty(\mathbb{R}^{N+1})$. Here, $\delta$ acts on the time
variable $t$.
To motivate this definition, let us formally consider the product
$(u \otimes \delta)(x, t) = u(x)\delta(t)$.
If $U$ denotes the generalized function to $u$, then formally we get
$(U \otimes \delta)(\chi) = \int u(x)\delta(t)\chi(x, t)\,dt\,dx = \int u(x)\chi(x, 0)\,dx$.
This is (110) for $n = 0$.

Show that the tensor product (110) yields a generalized function, i.e.,
$U \otimes \delta^{(n)} \in \mathcal{D}'(\mathbb{R}^{N+1})$.
2.9i. The generalized initial-value problem for the wave equation in $\mathbb{R}^3$.
Using the notation from Problem 2.9g, the classic initial-value problem for
the wave equation reads as follows:
$\Box u = f$,
$u(x, 0) = u_0(x)$ and $u_t(x, 0) = u_1(x)$. (111)
For given functions $u_0$ (initial state) and $u_1$ (initial velocity), we are looking
for a function $u = u(x, t)$ such that (111) is satisfied.
Set $\mathbb{R}^4_+ := \{(x, t) \in \mathbb{R}^4 : t \ge 0\}$. We are given
$f \in C(\mathbb{R}^4_+)$, $u_0 \in C^1(\mathbb{R}^3)$, $u_1 \in C(\mathbb{R}^3)$.
Suppose that the function $u \in C^2(\operatorname{int} \mathbb{R}^4_+) \cap C^1(\mathbb{R}^4_+)$ is a solution of (111).
Set $u(x, t) = 0$ and $f(x, t) = 0$ outside $\mathbb{R}^4_+$. Let $U, U_0, U_1, F$ denote the
generalized functions corresponding to $u, u_0, u_1, f$, respectively. Show that
$U$ is a solution to the following equation:
$\Box U = F + U_0 \otimes \delta' + U_1 \otimes \delta$ on $\mathbb{R}^4$, where $\operatorname{supp} U \subseteq \mathbb{R}^4_+$. (111*)
Here, $\operatorname{supp} U \subseteq \mathbb{R}^4_+$ means that $U(\phi) = 0$ for all functions $\phi \in C_0^\infty(\mathbb{R}^4)$
which vanish on an open neighborhood of $\mathbb{R}^4_+$. Problem (111*) is called the
generalized problem to (111).
In physics, the right-hand side $f$ corresponds to some outer force. In problem (111*), the initial conditions from (111) are replaced with additional
outer forces $U_0 \otimes \delta'$ and $U_1 \otimes \delta$, where the appearance of the $\delta$-distribution
is responsible for the fact that these additional forces only act at the initial
time $t = 0$.
Hint: Use integration by parts. Cf. Zeidler (1986), Vol. 2, p. 1054.
2.10.* The general tensor product. Let $U \in \mathcal{D}'(\mathbb{R}^N)$ and $V \in \mathcal{D}'(\mathbb{R}^M)$.
Then, there exists exactly one generalized function $W \in \mathcal{D}'(\mathbb{R}^{N+M})$ such
that
$W(\phi \otimes \psi) = U(\phi)V(\psi)$.
We call $W := U \otimes V$ the tensor product of $U$ and $V$. Explicitly,
$(U \otimes V)(\chi) = U(V(\chi(x, \cdot)))$,
where $V$ acts on the function $y \mapsto \chi(x, y)$.

Study the proof in Hörmander (1983), Vol. 1, Section 5.1.
2.11.* The existence theorem for fundamental solutions. To each (nonzero)
linear differential operator $L$ with constant coefficients (cf. (105)), there
exists a fundamental solution $U$, i.e., the equation
$LU = \delta$
has a solution $U \in \mathcal{D}'(\mathbb{R}^N)$.

The proof of this famous Malgrange-Ehrenpreis theorem from 1955 can
be found in Yosida (1980), Chapter 6. This proof relies on properties of the
Fourier transform (the Paley-Wiener theorem).
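2.12. Smoothing of functions by using mean values (Friedrichs' mollification). The point of departure is the integral
$u_\varepsilon(x) := \int_{\mathbb{R}^N} \phi_\varepsilon(x - y)u(y)\,dy$, (112)
where $\phi_\varepsilon(x) := \varepsilon^{-N}\phi(\varepsilon^{-1}x)$ along with
$\phi(x) := \begin{cases} c\,e^{-(1 - |x|^2)^{-1}} & \text{if } x \in \mathbb{R}^N \text{ and } |x| < 1, \\ 0 & \text{if } x \in \mathbb{R}^N \text{ and } |x| \ge 1. \end{cases}$
Then
(i) $\phi \in C_0^\infty(\mathbb{R}^N)$.
(ii) $\phi \ge 0$ on $\mathbb{R}^N$.
(iii) $\int_{\mathbb{R}^N} \phi(x)\,dx = 1$ for a suitable choice of the constant $c > 0$.
Hence:
(i*) $\phi_\varepsilon \in C_0^\infty(\mathbb{R}^N)$ and $\phi_\varepsilon(x) = 0$ if $|x| \ge \varepsilon$ for all $\varepsilon > 0$.
(ii*) $\phi_\varepsilon \ge 0$ on $\mathbb{R}^N$ for all $\varepsilon > 0$.
(iii*) $\int_{\mathbb{R}^N} \phi_\varepsilon(x)\,dx = 1$ (see Figure 2.17 for $N = 1$).

Let $u \in L_2(G)$, where $G$ is a nonempty open set in $\mathbb{R}^N$, $N \ge 1$. We set
$u(x) = 0$ outside $G$. Show that
($\alpha$) $u_\varepsilon \in C^\infty(\mathbb{R}^N)$ for all $\varepsilon > 0$.
($\beta$) $u_\varepsilon \in L_2(\mathbb{R}^N)$ for all $\varepsilon > 0$.
($\gamma$) $u_\varepsilon \to u$ in $L_2(G)$ as $\varepsilon \to +0$.

FIGURE 2.17. [Graph of $\phi_\varepsilon$ for $N = 1$, vanishing outside $]-\varepsilon, \varepsilon[$.]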
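Solution: Ad ($\alpha$). Consider the ball
$B := \{x \in \mathbb{R}^N : |x - x_0| < 1\}$
around the given point $x_0$, and consider the set
$B_\varepsilon := \{y \in G : \operatorname{dist}(B, y) \le \varepsilon\}$.
Since $\phi_\varepsilon(x - y) = 0$ for all points $x, y \in \mathbb{R}^N$ with $|x - y| \ge \varepsilon$, from (112)
we get the key formula
$u_\varepsilon(x) = \int_{B_\varepsilon} \phi_\varepsilon(x - y)u(y)\,dy$ for all $x \in B$. (113)
By the Schwarz inequality (16), we obtain
$\int_{B_\varepsilon} |u(y)|\,dy = \int_{B_\varepsilon} 1 \cdot |u(y)|\,dy \le \left( \int_{B_\varepsilon} dy \right)^{1/2} \left( \int_{B_\varepsilon} |u(y)|^2\,dy \right)^{1/2} < \infty$,
since $\int_{B_\varepsilon} dy = \operatorname{meas}(B_\varepsilon) < \infty$ and $u \in L_2(\mathbb{R}^N)$ implies $u \in L_2(B_\varepsilon)$. Thus,
the function $y \mapsto |u(y)|$ is integrable over $B_\varepsilon$.
First let $N = 1$. For all $x \in B$, $y \in B_\varepsilon$, and $k = 0, 1, 2, \ldots$, $\varepsilon > 0$, we
obtain
$|\phi_\varepsilon^{(k)}(x - y)u(y)| \le \operatorname{const}(k, \varepsilon)|u(y)|$, (114)
where $\phi_\varepsilon^{(k)}$ denotes the $k$th derivative. In this connection, note that the
function $\phi_\varepsilon^{(k)}$ is continuous on $\mathbb{R}$, and hence it is bounded on compact sets
by the Weierstrass theorem (Proposition 8 in Section 1.11). In particular,
$\phi_\varepsilon^{(k)}$ is bounded on each ball.
Applying standard theorems on parameter integrals (see the appendix)
to (113), the majorant condition (114) tells us that the continuous derivative
$u_\varepsilon^{(k)}$ exists on $B$, where
$u_\varepsilon^{(k)}(x) = \int_{B_\varepsilon} \phi_\varepsilon^{(k)}(x - y)u(y)\,dy$ for all $x \in B$, $k = 0, 1, \ldots$.

Since the center $x_0$ of the ball $B$ is arbitrary, this implies ($\alpha$).

In the case where $N = 2, 3, \ldots$, we use the same argument with respect
to partial derivatives.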
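Ad ($\beta$). Set $\int := \int_{\mathbb{R}^N}$ and $\iint := \int_{\mathbb{R}^N \times \mathbb{R}^N}$. Using the substitution $z := \varepsilon^{-1}(x - y)$, it follows from (112) that
$u_\varepsilon(x) = \int \phi(z)u(x - \varepsilon z)\,dz$. (115)
Observe now the identity $\phi u = \phi^{1/2}(\phi^{1/2}u)$. Thus, by the Schwarz inequality
(16), we get
$|u_\varepsilon(x)|^2 \le \int \phi(z)|u(x - \varepsilon z)|^2\,dz$,
noting that $\int \phi(z)\,dz = 1$. Substituting $y := x - \varepsilon z$, we obtain
$\int \left( \int \phi(z)|u(x - \varepsilon z)|^2\,dx \right) dz = \int \phi(z) \left( \int |u(y)|^2\,dy \right) dz = \int |u(y)|^2\,dy < \infty$.
Therefore, it follows from the Fubini-Tonelli theorem (see the appendix)
that
$\int |u_\varepsilon(x)|^2\,dx \le \int \left( \int \phi(z)|u(x - \varepsilon z)|^2\,dz \right) dx = \int \left( \int \phi(z)|u(x - \varepsilon z)|^2\,dx \right) dz < \infty$,
and hence $u_\varepsilon \in L_2(\mathbb{R}^N)$.
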
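Ad ($\gamma$). Let $B := \{z \in \mathbb{R}^N : |z| < 1\}$. Recall that $\phi = 0$ outside $B$ and
$\int_B \phi(z)\,dz = 1$. By (115),
$u_\varepsilon(x) = \int_B u(x - \varepsilon z)\phi(z)\,dz$,
and hence
$u_\varepsilon(x) - u(x) = \int_B (u(x - \varepsilon z) - u(x))\phi(z)\,dz$.
The Schwarz inequality (16) yields
$|u_\varepsilon(x) - u(x)|^2 \le C \int_B |u(x - \varepsilon z) - u(x)|^2\,dz$,
where $C$ is a positive constant. By the $p$-mean continuity of the Lebesgue
integral with $p = 2$ (see the appendix), for each $\eta > 0$ there is an $\varepsilon_0 > 0$ such
that
$\int_G |u(x - \varepsilon z) - u(x)|^2\,dx < \eta$,
for all $z \in B$ and all $\varepsilon$: $0 < \varepsilon \le \varepsilon_0$. Thus, it follows from the Fubini-Tonelli
theorem (see the appendix) that
$\int_G |u_\varepsilon(x) - u(x)|^2\,dx \le C \int_G \left( \int_B |u(x - \varepsilon z) - u(x)|^2\,dz \right) dx = C \int_B \left( \int_G |u(x - \varepsilon z) - u(x)|^2\,dx \right) dz \le C \operatorname{meas}(B) \cdot \eta$,
for all $\varepsilon$: $0 < \varepsilon \le \varepsilon_0$. Hence
$\int_G |u_\varepsilon(x) - u(x)|^2\,dx \to 0$ as $\varepsilon \to +0$.
This is ($\gamma$).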
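
The smoothing procedure can be tried out numerically for $N = 1$. The following is a minimal sketch, assuming $G = ]0,1[$ and $u$ equal to the indicator function of $G$; the grid spacing `H`, the helper names, and the use of a midpoint rule are choices made for this illustration only. The constant $c$ is replaced by a numerically computed normalizer.

```python
import math

def bump(z):
    """Unnormalized mollifier exp(-1/(1 - z^2)) on ]-1, 1[, zero elsewhere."""
    return math.exp(-1.0 / (1.0 - z * z)) if abs(z) < 1.0 else 0.0

H = 0.002                                              # quadrature spacing (arbitrary)
NODES = [-1.0 + (i + 0.5) * H for i in range(int(round(2.0 / H)))]
WEIGHTS = [bump(z) for z in NODES]
NORMALIZER = sum(WEIGHTS) * H                          # approximates 1/c

def u(x):
    """Indicator function of G = ]0, 1[, extended by zero outside G."""
    return 1.0 if 0.0 < x < 1.0 else 0.0

def mollify(x, eps):
    """u_eps(x) = int phi(z) u(x - eps z) dz, evaluated by the midpoint rule."""
    return sum(w * u(x - eps * z) for z, w in zip(NODES, WEIGHTS)) * H / NORMALIZER

def l2_error(eps, cells=400):
    """Approximate L2(G)-norm of u_eps - u, using u = 1 on G."""
    h = 1.0 / cells
    return math.sqrt(sum((mollify((j + 0.5) * h, eps) - 1.0) ** 2 for j in range(cells)) * h)

errors = [l2_error(eps) for eps in (0.2, 0.1, 0.05)]
```

For this discontinuous $u$ the error is concentrated in a boundary layer of width $\varepsilon$, so the computed $L_2$ errors shrink roughly like $\sqrt{\varepsilon}$.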
2.13. Density (Proof of Proposition 7 in Section 2.2). Let $G$ be a nonempty
open set in $\mathbb{R}^N$, $N \ge 1$.
2.13a. Show that the set $C^\infty(G)$ is dense in $L_2(G)$.
Solution: This follows immediately from Problem 2.12 ($\alpha$)-($\gamma$).
2.13b. Show that $C_0^\infty(G)$ is dense in $L_2(G)$.
Solution: Case A: The nonempty open set $G$ is bounded. Let $C$ be a
compact set$^7$ with $C \subset G$, and let $u \in L_2(G)$. We set
$v(x) := \begin{cases} u(x) & \text{on } C, \\ 0 & \text{on } G - C. \end{cases}$
Then
$\int_G |u - v|^2\,dx = \int_{G - C} |u|^2\,dx$.
By the absolute continuity of the integral (see the appendix), the right-hand integral is arbitrarily small provided the measure of the set $G - C$ is
sufficiently small. Thus, for each given $\eta > 0$, we can choose the set $C$ in such
a way that
$\|u - v\| = \left( \int_G |u - v|^2\,dx \right)^{1/2} < \eta$.
By Problem 2.12, there is a function $v_\varepsilon \in C^\infty(\mathbb{R}^N)$ such that
$\|v - v_\varepsilon\| < \eta$ for all $\varepsilon$: $0 < \varepsilon \le \varepsilon_0$.
Next let us show that $v_\varepsilon \in C_0^\infty(G)$ for sufficiently small $\varepsilon$. In fact, since
$v = 0$ outside $C$, it follows from (112) that
$v_\varepsilon(x) = \int_G \phi_\varepsilon(x - y)v(y)\,dy$.
$^7$ $C \subseteq G$ means that $C$ is a subset of $G$, and $C \subset G$ means that $C$ is a proper
subset of $G$, i.e., $C \subseteq G$ and $C \ne G$.
FIGURE 2.18. [Sketch: the open set $G$ and an open set $H$ inside $G$.]
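Hence $v_\varepsilon(x) = 0$ for all $x \in G$ with $\operatorname{dist}(x, C) > \varepsilon$ because of $\phi_\varepsilon(x - y) = 0$
for $|x - y| \ge \varepsilon$. Since $C$ is a compact subset of the open set $G$, there is an
open set $H$ such that
$C \subset H \subseteq \overline{H} \subset G$
(see Figure 2.18).$^8$ Consequently, if we choose the number $\varepsilon$ sufficiently
small, then $\operatorname{dist}(x, C) > \varepsilon$ for all $x \in G - \overline{H}$, and hence
$v_\varepsilon(x) = 0$ for all $x \in G - \overline{H}$,
i.e., $v_\varepsilon \in C_0^\infty(G)$. Summarizing,
$\|u - v_\varepsilon\| \le \|u - v\| + \|v - v_\varepsilon\| < 2\eta$,
i.e., $C_0^\infty(G)$ is dense in $L_2(G)$.
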
Case B: The open set $G$ is unbounded. Then, for each $\eta > 0$, there is an
open ball $B$ such that
$\int_{G - H} |u|^2\,dx < \eta^2$,
where $H := G \cap B$ and $H \ne \emptyset$, by a well-known property of the Lebesgue
integral.
Applying Case A to the nonempty bounded open set $H$, there is a function
$v_\varepsilon \in C_0^\infty(H)$, and hence $v_\varepsilon \in C_0^\infty(G)$, such that
$\int_H |u - v_\varepsilon|^2\,dx < \eta^2$.
Since $v_\varepsilon = 0$ on $G - H$, we get
$\int_G |u - v_\varepsilon|^2\,dx = \int_H |u - v_\varepsilon|^2\,dx + \int_{G - H} |u|^2\,dx < 2\eta^2$,
i.e., $C_0^\infty(G)$ is dense in $L_2(G)$.
$^8$ In fact, for each point $x \in C$, there exists an open ball $B$ around $x$ such that
$B \subseteq \overline{B} \subset G$. Since $C$ is compact, a finite set of such balls already covers $C$. Call
the union of these balls $H$.

2.13c. Show that $C(G)$ is dense in $L_2(G)$.
Solution: Note that $C_0^\infty(G) \subseteq C(G)$ and use Problem 2.13b.
2.14. Show that both $C_0^\infty(G)_{\mathbb{C}}$ and $C(G)_{\mathbb{C}}$ are dense in $L_2(G)_{\mathbb{C}}$.
Hint: Use the same arguments as above.
2.15. Separability (Proof of Corollary 8 in Section 2.2).

2.15a. Let $G = ]a, b[$ be a bounded open interval in $\mathbb{R}$. Show that $L_2(G)$
is separable.
Solution: Let $u \in L_2(G)$ and $\varepsilon > 0$ be given. By Problem 2.13c, the set
$C[a, b]$ is dense in $L_2(G)$, i.e., there is a function $v \in C[a, b]$ such that
$\|u - v\| < \varepsilon$.
By the Weierstrass approximation theorem (Proposition 2 in Section
1.25), the set of polynomials with real coefficients is dense in the Banach
space $C[a, b]$, i.e., there is a real polynomial $p$ such that
$\|v - p\|_* := \max_{a \le x \le b} |v(x) - p(x)| < \varepsilon$.
Let us introduce
$M$ := set of all polynomials with rational coefficients.
By the proof of Corollary 3 in Section 1.26, for each polynomial $p$, there is
a polynomial $q \in M$ such that
$\|p - q\|_* < \varepsilon$.
Hence $\|v - q\|_* \le \|v - p\|_* + \|p - q\|_* < 2\varepsilon$. This implies
$\|v - q\| = \left( \int_a^b |v - q|^2\,dx \right)^{1/2} \le (b - a)^{1/2}\|v - q\|_* < (b - a)^{1/2}\,2\varepsilon$.
Summarizing, for each $\varepsilon > 0$, there is a $q \in M$ such that
$\|u - q\| \le \|u - v\| + \|v - q\| < \varepsilon + (b - a)^{1/2}\,2\varepsilon$.
That is, the set $M$ is dense in $L_2(G)$. Since the set $M$ is countable, the
space $L_2(G)$ is separable.
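2.15b. Let $G$ be an unbounded open interval in $\mathbb{R}$, e.g., $G = \mathbb{R}$. Show
that $L_2(G)$ is separable.
Solution: There exists a sequence $(G_n)$ of bounded open intervals $G_n$
such that $G_1 \subseteq G_2 \subseteq \cdots \subseteq G$ and
$G = \bigcup_{n=1}^{\infty} G_n$.
Define
$\chi_n(x) := \begin{cases} 1 & \text{if } x \in G_n, \\ 0 & \text{if } x \in \mathbb{R} - G_n, \end{cases}$
and
$M_\infty := \{\chi_n q : q \in M \text{ and } n = 1, 2, \ldots\}$.
Let $u \in L_2(G)$ and $\varepsilon > 0$ be given. There exists a bounded interval $J$
with $J \subseteq G$ and
$\int_{G - J} |u|^2\,dx < \varepsilon^2$,
by a well-known property of the Lebesgue integral. Choose some interval
$G_n$ such that $J \subseteq G_n \subseteq G$. Then
$\int_{G - G_n} |u|^2\,dx < \varepsilon^2$.
By Problem 2.15a, there is a polynomial $q \in M$ such that
$\int_{G_n} |u - q|^2\,dx < \varepsilon^2$.
Hence
$\int_G |u - \chi_n q|^2\,dx = \int_{G_n} |u - q|^2\,dx + \int_{G - G_n} |u|^2\,dx < 2\varepsilon^2$.
Consequently, the countable set $M_\infty$ is dense in $L_2(G)$, i.e., $L_2(G)$ is separable.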
2.16. Let $G$ be an open interval in $\mathbb{R}$. Show that $L_2^{\mathbb{C}}(G)$ is separable.

Use an analogous argument as in Problem 2.15, based on polynomials
with complex coefficients.
2.17. Let $G$ be a nonempty open set in $\mathbb{R}^N$, $N \ge 1$. Show that $L_2(G)$ and
$L_2^{\mathbb{C}}(G)$ are separable.
Hint: Use the same arguments as above. Replace the one-dimensional
Weierstrass approximation theorem with the $N$-dimensional Weierstrass
approximation theorem (cf. Problem 1.19b in AMS Vol. 109).
2.18.* The Sobolev embedding theorems. An elementary approach to the

Sobolev embedding theorems can be found in Zeidler (1986), Vol. 2, Section
21.3. Study these proofs. The appendix to Zeidler (1986), Vol. 2, contains a
summary of important material related to the Sobolev embedding theorems

(relation to the theory of generalized functions, interpolation theory, etc.).
We also recommend Gilbarg and Trudinger (1983).
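2.19. A formal relation for the $\delta$-function. Let $f: \mathbb{R} \to \mathbb{R}$ be a $C^1$-function
that has precisely the zeros $x_1, \ldots, x_N$. In addition, assume that $f'(x_j) \ne 0$
for all $j$. Use the formal definition of the $\delta$-function to show that
$\delta(f(x)) = \sum_{j=1}^{N} \delta(x - x_j)|f'(x_j)|^{-1}$ for all $x \in \mathbb{R}$.
This relation is frequently used by physicists (cf. Standard Example 18 in
Section 5.24). For example, if $a \ne 0$, then
$\delta(x^2 - a^2) = \frac{\delta(x - a) + \delta(x + a)}{2|a|}$ for all $x \in \mathbb{R}$.
Solution: Let $\phi \in C_0^\infty(\mathbb{R})$, and let $g_j$ be the inverse function to $f$ in
a sufficiently small neighborhood of $x_j$ such that $f(g_j(y)) = y$. Then, for
sufficiently small $\varepsilon > 0$,
$\int_{\mathbb{R}} \delta(f(x))\phi(x)\,dx = \sum_j \int_{x_j - \varepsilon}^{x_j + \varepsilon} \delta(f(x))\phi(x)\,dx = \sum_j \int_{J_j} \delta(y)\phi(g_j(y))|g_j'(y)|\,dy = \sum_j \phi(g_j(0))|g_j'(0)| = \sum_j \phi(x_j)|f'(x_j)|^{-1}$,
where $J_j$ denotes the image of the interval $]x_j - \varepsilon, x_j + \varepsilon[$ under $f$.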
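
The example $\delta(x^2 - a^2)$ can be tested numerically by replacing $\delta$ with a narrow approximating kernel. The following is a minimal sketch under stated assumptions: the Gaussian kernel `delta_eps`, the integration window, and the rapidly decaying (not compactly supported) test function `phi` are illustrative stand-ins, not part of the formal argument.

```python
import math

def delta_eps(y, eps):
    """Gaussian approximation of the delta function with unit integral (an assumed choice)."""
    return math.exp(-((y / eps) ** 2)) / (eps * math.sqrt(math.pi))

def smeared_integral(phi, a, eps, lo=-5.0, hi=5.0, cells=400_000):
    """Midpoint rule for the integral of delta_eps(x^2 - a^2) * phi(x) over [lo, hi]."""
    h = (hi - lo) / cells
    total = 0.0
    for i in range(cells):
        x = lo + (i + 0.5) * h
        total += delta_eps(x * x - a * a, eps) * phi(x)
    return total * h

def formal_value(phi, a):
    """Right-hand side of the formal relation: (phi(a) + phi(-a)) / (2|a|)."""
    return (phi(a) + phi(-a)) / (2.0 * abs(a))

phi = lambda x: math.exp(-x * x) * (1.0 + x)   # smooth, rapidly decaying test function
a = 1.5
approx = smeared_integral(phi, a, eps=0.01)
exact = formal_value(phi, a)
```

As `eps` decreases, `approx` approaches `exact`, which reflects the factor $|f'(x_j)|^{-1} = 1/(2|a|)$ picked up at each of the two zeros $x = \pm a$.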
3 Hilbert Spaces and Generalized Fourier Series
The interplay between generality and individuality, deduction and

construction, logic and imagination-this is the profound essence of
live mathematics.
Any one or another of the aspects can be at the center of a
given achievement. In a far-reaching development all of them will be
involved. Generally speaking, such a development will start from the
"concrete ground," then discard ballast by abstraction and rise to
the lofty layers of thin air where navigations and observations are
easy; after this flight comes the crucial test of landing and reaching
specific goals in the newly surveyed low plains of individual "reality."
In brief, the flight into abstract generality must start from and
return to the concrete and the specific.
Richard Courant (1888-1972)
Let $f\colon \mathbb{R} \to \mathbb{R}$ be a function of period $2\pi$. Then, the corresponding classical Fourier series reads as follows:
\[ f(x) = 2^{-1}a_0 + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx), \tag{1} \]
with the so-called Fourier coefficients
\[ a_k := \pi^{-1}\int_{-\pi}^{\pi} f(x)\cos kx\,dx, \qquad b_k := \pi^{-1}\int_{-\pi}^{\pi} f(x)\sin kx\,dx, \qquad k = 0, 1, 2, \ldots. \]

From the physical point of view, relation (1) tells us that the $2\pi$-periodic "oscillation" $f$ can be represented as a superposition of simple "harmonic oscillations"
\[ h(x) := \cos kx,\ \sin kx, \tag{2} \]
of period $2\pi/k$, where $k = 1, 2, \ldots$. The Fourier coefficients $a_k$ and $b_k$ correspond to the amplitudes of the harmonic oscillations (2). For example, if $|a_k|$ is large for some $k$, then the harmonic oscillation $x \mapsto \cos kx$ contributes strongly to the oscillation $f$. Therefore, physicists are interested in such periods $2\pi/k$ for which $|a_k|$ or $|b_k|$ is large with respect to the other Fourier coefficients.
For example, if $f\colon \mathbb{R} \to \mathbb{R}$ has the period $2\pi$ with
\[ f(x) := \frac{\pi}{4}\Bigl(\frac{\pi}{2} - |x|\Bigr) \quad \text{for } -\pi \le x \le \pi, \]
then
\[ f(x) = \cos x + \frac{\cos 3x}{3^2} + \frac{\cos 5x}{5^2} + \cdots \quad \text{for all } x \in \mathbb{R}. \]
Figure 3.1 represents schematically the corresponding superposition.
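The superposition in the preceding example can also be checked numerically. The following Python sketch is an illustration only; the function names `f`, `partial_sum`, and `max_error`, as well as the sampling grid, are assumptions introduced here and do not come from the text. It compares the triangle function $f(x) = \frac{\pi}{4}(\frac{\pi}{2} - |x|)$ with partial sums of the cosine series on a grid of points in $[-\pi, \pi]$.

```python
import math

def f(x):
    """Triangle function f(x) = (pi/4) * (pi/2 - |x|), extended 2*pi-periodically."""
    # Reduce x to the basic interval [-pi, pi) using the periodicity.
    x = (x + math.pi) % (2 * math.pi) - math.pi
    return (math.pi / 4) * (math.pi / 2 - abs(x))

def partial_sum(x, terms):
    """Sum of cos(kx)/k^2 over the first `terms` odd values k = 1, 3, 5, ..."""
    return sum(math.cos((2 * j + 1) * x) / (2 * j + 1) ** 2 for j in range(terms))

def max_error(terms, samples=400):
    """Largest deviation between f and the partial sum on a uniform grid."""
    xs = [-math.pi + 2 * math.pi * i / samples for i in range(samples + 1)]
    return max(abs(f(x) - partial_sum(x, terms)) for x in xs)

if __name__ == "__main__":
    for terms in (1, 5, 50):
        print(terms, max_error(terms))
```

For this particular function the coefficients decay like $1/k^2$, so the partial sums even converge uniformly; for general functions only the weaker convergence discussed below is available.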
In the nineteenth century, many mathematicians studied in detail the convergence of Fourier series. In 1876, du Bois-Reymond obtained the surprising result that there are continuous functions $f$ whose Fourier series does not converge at every point $x \in \mathbb{R}$. This counterexample shows that the classical convergence of infinite series is not the right concept for solving the fundamental convergence problem for Fourier series.
In 1907, Fischer and Riesz proved independently that a natural answer to the convergence problem for (1) can be given in terms of the Hilbert space $L_2(-\pi, \pi)$.
In this chapter we will show that this is a special case of an abstract result on complete orthonormal systems in Hilbert spaces. In particular, it turns out that, for each function
\[ f \in L_2(-\pi, \pi), \]
the Fourier series (1) converges in the Hilbert space $L_2(-\pi, \pi)$, i.e.,
\[ \lim_{n \to \infty} \|f - s_n\| = 0, \]
where
\[ s_n(x) := 2^{-1}a_0 + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx), \]
and $\|\cdot\|$ denotes the norm on $L_2(-\pi, \pi)$. Explicitly,
\[ \|f - s_n\|^2 = \int_{-\pi}^{\pi} (f(x) - s_n(x))^2\,dx. \]
FIGURE 3.1. Panels (a), (b), (c) (graphics not reproduced).
We shall also show that this corresponds to the convergence of the Gauss
method of least squares.
In the nineteenth century, mathematicians and physicists also used more general series expansions than (1) of the following form:
\[ f(x) = \sum_{k=0}^{\infty} c_k f_k(x), \quad x \in G, \tag{3} \]
where $G$ is a nonempty open set in $\mathbb{R}^N$, and the functions $f_0, f_1, \ldots$ satisfy the so-called orthogonality relation
\[ (f_k \mid f_m) = \delta_{km}, \quad k, m = 0, 1, \ldots. \tag{4} \]
Here, $\delta_{km} = 0$ if $k \neq m$ and $\delta_{km} = 1$ if $k = m$. Observe that $(\cdot \mid \cdot)$ represents the inner product in the Hilbert space $L_2(G)$.
Using (4), it is easy to compute the unknown coefficients $c_k$ formally. Namely, multiplying (3) by $f_m(x)$ and integrating over $G$, it follows formally from (4) that
\[ c_m = (f \mid f_m) \equiv \int_G f(x) f_m(x)\,dx, \quad m = 0, 1, \ldots. \]
Our abstract results will show that the infinite series (3) converges in the Hilbert space $L_2(G)$ provided the set of all the real linear combinations
\[ a_0 f_0 + \cdots + a_n f_n, \quad a_0, \ldots, a_n \in \mathbb{R},\ n = 0, 1, \ldots, \]
is dense in $L_2(G)$. This is a quite natural result.

In Chapters 4 and 5 we shall show that series expansions of the form (3) are closely related to eigenvalue problems for integral equations and differential equations where $f_0, f_1, \ldots$ are the corresponding eigenfunctions. In terms of physics, for example, the infinite series (3) represents the oscillations of an elastic body as a superposition of "eigenoscillations." Moreover, in quantum mechanics the eigenfunctions $f_k$ from (3) correspond to bound states of atoms and molecules with a well-defined energy, whereas (3) describes the superposition of a state $f$ by means of the energy states $f_k$.
The continuous version of the Fourier series (1) is given through the Fourier integral
\[ f(x) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} a(k) e^{ikx}\,dk, \quad -\infty < x < \infty, \tag{5} \]
with
\[ a(k) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} f(x) e^{-ikx}\,dx, \quad -\infty < k < \infty. \tag{5*} \]
The function $a(\cdot)$ is called the Fourier transform of $f(\cdot)$.

From the physical point of view, formula (5) describes the superposition of an arbitrary, not necessarily periodic "oscillation" $f$ by means of the simple "harmonic oscillations"
\[ e^{-ikx} = \cos kx - i \sin kx \tag{6} \]
of period $2\pi/|k|$ with real $k$, and $(2\pi)^{-1/2} a(k)$ is the amplitude of the harmonic oscillation (6).
Differentiating formally equation (5), we get
\[ \frac{df(x)}{dx} = (2\pi)^{-1/2} \int_{-\infty}^{\infty} ik\,a(k) e^{ikx}\,dk. \]
Thus, the Fourier transform of the differential operator
\[ f(x) \mapsto \frac{df(x)}{dx} \]
is given through the multiplication operator
\[ a(k) \mapsto ik\,a(k). \]
This is the most important property of the Fourier transformation.

The Fourier transformation allows the reduction of differential equations
to purely algebraic problems.
For example, suppose we want to solve the differential equation
\[ f'(x) = f_1(x), \tag{7} \]
where $f_1$ is given. Applying formally the Fourier transformation to (7), we get
\[ ik\,a(k) = a_1(k), \]
where $a$ and $a_1$ denote the Fourier transforms of $f$ and $f_1$, respectively. Hence
\[ a(k) = -\frac{i\,a_1(k)}{k}, \]
and the solution $f$ of (7) is obtained by the Fourier transformation (5).
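The rule "differentiation becomes multiplication by $ik$" can be tested numerically for a smooth, rapidly decaying function. The Python sketch below is an illustration under assumptions made here: the test function $u(x) = (1+x)e^{-x^2}$, the helper name `fourier_transform`, and the midpoint quadrature on $[-12, 12]$ are choices of this example, not part of the text. It approximates the transform (5*) of $u$ and of $u'$ and compares the latter with $ik$ times the former.

```python
import cmath
import math

def fourier_transform(u, k, half_width=12.0, samples=4000):
    """Midpoint-rule approximation of (2*pi)^(-1/2) * integral of u(x) e^{-ikx} dx.

    The integral over the real line is truncated to [-half_width, half_width],
    which is harmless for the rapidly decaying test functions used below.
    """
    h = 2.0 * half_width / samples
    total = 0j
    for i in range(samples):
        x = -half_width + (i + 0.5) * h
        total += u(x) * cmath.exp(-1j * k * x)
    return total * h / math.sqrt(2.0 * math.pi)

def u(x):
    """Hypothetical test function u(x) = (1 + x) e^{-x^2}."""
    return (1.0 + x) * math.exp(-x * x)

def du(x):
    """Derivative u'(x) = (1 - 2x - 2x^2) e^{-x^2}, computed by hand."""
    return (1.0 - 2.0 * x - 2.0 * x * x) * math.exp(-x * x)

if __name__ == "__main__":
    for k in (0.5, 1.0, 2.0):
        lhs = fourier_transform(du, k)          # transform of the derivative
        rhs = 1j * k * fourier_transform(u, k)  # ik times the transform of u
        print(k, abs(lhs - rhs))
```

The printed differences should be at the level of rounding errors, in agreement with the formal computation above.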

In classical mathematics, the crucial difficulty of this method was caused by the lack of convergence for the Fourier integrals (5) and (5*). For example, if $f(x) \equiv 1$, then
\[ a(k) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} e^{-ikx}\,dx. \]
But this integral does not exist. In Section 3.7 we shall show that the extension of the Fourier transformation to generalized functions of the class $\mathcal{S}'(\mathbb{R}^N)$ overcomes the classical difficulties.
In order to describe this formally, choose
\[ f(x) := \delta_y(x), \]
where $\delta_y$ denotes the "Dirac function" from Section 2.8.2. By (5*),
\[ a(k) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} \delta_y(x) e^{-ikx}\,dx = (2\pi)^{-1/2} e^{-iky}. \]
Thus, in the formal language of physicists, the "Dirac function" $\delta_y$ has the Fourier transform $k \mapsto (2\pi)^{-1/2} e^{-iky}$. In particular, the "Dirac function" $\delta_0$ has the constant Fourier transform
\[ a(k) = (2\pi)^{-1/2} \quad \text{for all } k \in \mathbb{R}. \]
A rigorous approach will be considered in Section 3.7.
3.1 Orthonormal Series

In this section, we make the following assumption:
(H) Let $X$ be a Hilbert space over $\mathbb{K} = \mathbb{R}, \mathbb{C}$, and let $\{u_0, u_1, \ldots\}$ be a finite or countable orthonormal system in $X$, i.e., by definition,
\[ (u_k \mid u_m) = \delta_{km} \quad \text{for all } k, m. \tag{8} \]
Our goal is to study the convergence of the so-called abstract Fourier series
\[ u = \sum_{n=0}^{\infty} (u_n \mid u) u_n. \tag{9} \]
We also set
\[ s_m := \sum_{n=0}^{m} (u_n \mid u) u_n. \tag{9*} \]
The numbers $(u_n \mid u)$ are called the Fourier coefficients of $u$.
Definition 1. Assume (H). The finite orthonormal system $\{u_0, \ldots, u_N\}$ is called complete in $X$ iff
\[ u = \sum_{n=0}^{N} (u_n \mid u) u_n \quad \text{for all } u \in X. \tag{10} \]
The countable orthonormal system $\{u_0, u_1, \ldots\}$ is called complete in $X$ iff the infinite series (9) converges for all $u \in X$, i.e.,
\[ u = \lim_{m \to \infty} s_m \quad \text{for all } u \in X. \]
Proposition 2. The finite orthonormal system $\{u_0, \ldots, u_N\}$ is complete in the Hilbert space $X$ over $\mathbb{K}$ iff it is a basis of $X$.
Proof. Let $\{u_n\}$ be a basis in $X$. Then,
\[ u = \sum_{n=0}^{N} c_n u_n \quad \text{for all } u \in X, \tag{10*} \]
where the coefficients $c_0, \ldots, c_N \in \mathbb{K}$ depend on $u$. Using (8), we get
\[ (u_k \mid u) = \sum_{n=0}^{N} c_n (u_k \mid u_n) = c_k, \quad k = 0, 1, \ldots, N. \]
This implies (10), i.e., $\{u_n\}$ is complete.
Conversely, let $\{u_n\}$ be a complete orthonormal system; then $\{u_n\}$ is a basis of $X$, by (10). In this connection, note that $\{u_0, \ldots, u_N\}$ is linearly independent, since (10*) with $u = 0$ implies that $c_k = 0$ for all $k$. □
Corollary 3. Let $\{u_n\}$ be a countable orthonormal system in the Hilbert space $X$ over $\mathbb{K}$. Assume that the infinite series
\[ u = \sum_{n=0}^{\infty} c_n u_n \quad \text{with } c_n \in \mathbb{K} \text{ for all } n \]
is convergent for some fixed $u \in X$.
Then, $c_n = (u_n \mid u)$ for all $n$.
Proof. Using (8) we get
\[ (u_k \mid u) = \lim_{m \to \infty} \Bigl(u_k \Bigm| \sum_{n=0}^{m} c_n u_n\Bigr) = \lim_{m \to \infty} \sum_{n=0}^{m} c_n (u_k \mid u_n) = c_k. \]
□
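This motivates the ansatz (9). Let us give a second motivation for (9). In order to get a good approximation of $u$ by a linear combination $c_0 u_0 + \cdots + c_m u_m$, we set
\[ f(c_0, \ldots, c_m) := \Bigl\| u - \sum_{n=0}^{m} c_n u_n \Bigr\|^2, \]
and we consider the following minimum problem:
\[ f(c_0, \ldots, c_m) = \min!, \quad c_0, \ldots, c_m \in \mathbb{K}, \tag{11} \]
according to the least-squares method of Gauss.
Proposition 4. Assume (H). Then, the unique solution of (11) is given through the Fourier coefficients $c_n = (u_n \mid u)$, $n = 0, \ldots, m$.
Proof. By (8),
\[ f(c) = \Bigl( u - \sum_{n=0}^{m} c_n u_n \Bigm| u - \sum_{k=0}^{m} c_k u_k \Bigr) = \|u\|^2 - \sum_{n=0}^{m} \overline{c_n}\,(u_n \mid u) - \sum_{n=0}^{m} c_n \overline{(u_n \mid u)} + \sum_{n=0}^{m} |c_n|^2. \]
Hence
\[ f(c) = \|u\|^2 - \sum_{n=0}^{m} |(u_n \mid u)|^2 + \sum_{n=0}^{m} |c_n - (u_n \mid u)|^2. \tag{12} \]
The smallest value of $f$ is attained for $c_n = (u_n \mid u)$, $n = 0, \ldots, m$. □
In particular, it follows from (9*) that
\[ \|u - s_m\| \le \Bigl\| u - \sum_{n=0}^{m} c_n u_n \Bigr\| \quad \text{for all } c_0, \ldots, c_m \in \mathbb{K} \text{ and all } m. \tag{13} \]
By (12),
\[ \|u - s_m\|^2 = \|u\|^2 - \sum_{n=0}^{m} |(u_n \mid u)|^2 \quad \text{for all } u \in X \text{ and all } m. \tag{14} \]
Hence we obtain the following Bessel inequality:
\[ \sum_{n=0}^{m} |(u_n \mid u)|^2 \le \|u\|^2 \quad \text{for all } u \in X \text{ and all } m. \tag{15} \]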
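A small finite-dimensional experiment may make Proposition 4 and relations (14), (15) concrete. In the Python sketch below, the space $\mathbb{R}^3$ with the Euclidean inner product, the two orthonormal vectors in `system`, the vector `u`, and all function names are assumptions chosen for illustration. The squared error for the Fourier coefficients is compared with the squared error for perturbed coefficients.

```python
import math

def inner(a, b):
    """Euclidean inner product on R^n."""
    return sum(x * y for x, y in zip(a, b))

def combination(coeffs, system):
    """Linear combination c_0 u_0 + ... + c_m u_m of the vectors in `system`."""
    dim = len(system[0])
    return [sum(c * v[i] for c, v in zip(coeffs, system)) for i in range(dim)]

def squared_error(u, coeffs, system):
    """The least-squares functional f(c) = ||u - sum c_n u_n||^2."""
    approx = combination(coeffs, system)
    diff = [x - y for x, y in zip(u, approx)]
    return inner(diff, diff)

r = 1.0 / math.sqrt(2.0)
system = [[r, r, 0.0], [r, -r, 0.0]]        # an orthonormal system in R^3 (not complete)
u = [3.0, 1.0, 2.0]
fourier = [inner(v, u) for v in system]     # Fourier coefficients (u_n | u)

if __name__ == "__main__":
    best = squared_error(u, fourier, system)
    bessel_sum = sum(c * c for c in fourier)
    print("error with Fourier coefficients:", best)
    print("||u||^2 - sum |(u_n|u)|^2:      ", inner(u, u) - bessel_sum)
    print("error with perturbed coefficients:",
          squared_error(u, [fourier[0] + 0.1, fourier[1] - 0.2], system))
```

Since the chosen system is not complete in $\mathbb{R}^3$, the Bessel inequality is strict for this $u$, and the minimal error does not vanish.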
Proposition 5 (Convergence criterion). Let $\{u_n\}$ be a countable orthonormal system in the Hilbert space $X$ over $\mathbb{K}$. Then, the series
\[ \sum_{n=0}^{\infty} c_n u_n, \quad c_n \in \mathbb{K} \text{ for all } n, \]
is convergent iff the series $\sum_{n=0}^{\infty} |c_n|^2$ is convergent.
Proof. Set $s_m := \sum_{n=0}^{m} c_n u_n$. By (8),
\[ \|s_{m+k} - s_m\|^2 = \sum_{n=m+1}^{m+k} |c_n|^2, \tag{16} \]
for all $m, k = 1, 2, \ldots$.
If $\sum_n |c_n|^2$ is convergent, then $(s_m)$ is a Cauchy sequence. Hence $(s_m)$ is convergent, i.e., $\sum_n c_n u_n$ is convergent.
Conversely, if $\sum_n c_n u_n$ is convergent, then $(s_m)$ is a Cauchy sequence, and hence $\sum_n |c_n|^2$ is convergent, by (16). □
It follows from Proposition 5 and the Bessel inequality (15) that for each $u \in X$ the Fourier series is convergent, i.e., there is some $v \in X$ such that
\[ v = \sum_{n=0}^{\infty} (u_n \mid u) u_n. \]
However, it is possible that $v \neq u$. But if the orthonormal system $\{u_n\}$ is complete, then $v = u$ for all $u \in X$.
Theorem 3.A. Let $\{u_n\}$ be a countable orthonormal system in the Hilbert space $X$ over $\mathbb{K}$. Then, the following two conditions are equivalent:
(i) The system $\{u_n\}$ is complete in $X$.
(ii) The linear hull of $\{u_n\}$ is dense in $X$.
Proof. (i) ⇒ (ii). This is obvious, by Definition 1.
(ii) ⇒ (i). Let $u \in X$. For given $\varepsilon > 0$, there exist coefficients $c_0, \ldots, c_m \in \mathbb{K}$ such that
\[ \Bigl\| u - \sum_{n=0}^{m} c_n u_n \Bigr\| < \varepsilon. \]
We also may assume that $m$ is sufficiently large, by letting $c_n = 0$ for large $n$. Choosing $\varepsilon = 1/r$, $r = 1, 2, \ldots$, it follows from (13) that there exists a subsequence $(s_{m_r})$ such that
\[ \|u - s_{m_r}\| < \frac{1}{r}, \quad r = 1, 2, \ldots. \tag{17} \]
Since a Fourier series is always convergent, the sequence $(s_m)$ is convergent, i.e., $s_m \to v$ as $m \to \infty$. Letting $r \to \infty$, it follows from (17) that $v = u$. □
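Corollary 6. Let $\{u_n\}$ be a countable complete orthonormal system in the Hilbert space $X$ over $\mathbb{K}$. Then, the following hold true:
(i) For all $u, v \in X$,
\[ (u \mid v) = \sum_{n=0}^{\infty} \overline{c_n(u)}\, c_n(v) \quad \text{(the Parseval equation)}, \tag{18} \]
where $c_n(w) := (u_n \mid w)$.
(ii) For all $u \in X$, the Bessel inequality is replaced with the so-called special Parseval equation
\[ \|u\|^2 = \sum_{n=0}^{\infty} |(u_n \mid u)|^2. \tag{18*} \]
(iii) If $(u_n \mid u) = 0$ for all $n$ and fixed $u \in X$, then $u = 0$.
Proof. Ad (i). By (8) and (9),
\[ (u \mid v) = \lim_{m \to \infty} \Bigl( \sum_{n=0}^{m} c_n(u) u_n \Bigm| \sum_{k=0}^{m} c_k(v) u_k \Bigr) = \lim_{m \to \infty} \sum_{n=0}^{m} \overline{c_n(u)}\, c_n(v). \]
Ad (ii). This is a special case of (i).
Ad (iii). This follows from (9). □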
3.2 Applications to Classic Fourier Series
Recall that the inner product in the Hilbert space $L_2(-\pi, \pi)$ is given through
\[ (u \mid v) = \int_{-\pi}^{\pi} u(x) v(x)\,dx. \]
For all $x \in [-\pi, \pi]$, we set $u_0(x) := (2\pi)^{-1/2}$ and
\[ u_{2m-1}(x) := \pi^{-1/2} \cos mx, \qquad u_{2m}(x) := \pi^{-1/2} \sin mx, \qquad m = 1, 2, \ldots. \]
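Proposition 1. The set $\{u_0, u_1, \ldots\}$ forms a complete orthonormal system in the Hilbert space $L_2(-\pi, \pi)$.
This proposition tells us that for each $u \in L_2(-\pi, \pi)$ the Fourier series
\[ u = \sum_{n=0}^{\infty} c_n u_n, \quad c_n := (u_n \mid u), \tag{19} \]
converges in $L_2(-\pi, \pi)$. This is identical to the classic Fourier series
\[ u(x) = 2^{-1} a_0 + \sum_{k=1}^{\infty} (a_k \cos kx + b_k \sin kx), \tag{19*} \]
where
\[ a_k := \pi^{-1} \int_{-\pi}^{\pi} u(x) \cos kx\,dx, \qquad b_k := \pi^{-1} \int_{-\pi}^{\pi} u(x) \sin kx\,dx, \qquad k = 0, 1, 2, \ldots. \]
In fact,
\[ (u_{2m} \mid u)\, u_{2m}(x) = \pi^{-1} \Bigl( \int_{-\pi}^{\pi} u(x) \sin mx\,dx \Bigr) \sin mx, \quad m = 1, 2, \ldots, \]
and so on. Consequently, Proposition 1 implies the following corollary.
Corollary 2. For each $u \in L_2(-\pi, \pi)$, the classic Fourier series converges in $L_2(-\pi, \pi)$, i.e.,
\[ \lim_{n \to \infty} \int_{-\pi}^{\pi} \Bigl( u(x) - 2^{-1} a_0 - \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx) \Bigr)^2 dx = 0. \]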
The proof of Proposition 1 will be based on the following classic approximation theorem due to Weierstrass (see Lemma 3). Let $T$ denote the set of all trigonometric polynomials, i.e., $p \in T$ iff
\[ p(x) := \sum_{n=0}^{m} (\alpha_n \cos nx + \beta_n \sin nx), \]
where $m = 0, 1, \ldots$, and all the coefficients $\alpha_n, \beta_n$ are real numbers. It follows from the classical addition theorems for $\sin(\cdot)$ and $\cos(\cdot)$ that
\[ p, q \in T \quad \text{implies} \quad pq \in T. \]
We also set
\[ \|f\|_{C[a,b]} := \max_{a \le x \le b} |f(x)|. \]
Lemma 3. For each function $f \in C[-\pi, \pi]$ with $f(-\pi) = f(\pi)$ and each $\varepsilon > 0$, there exists a function $p \in T$ such that
\[ \|f - p\|_{C[-\pi, \pi]} < \varepsilon. \]
Proof. Step 1: Let $f$ be even, i.e., $f(-x) = f(x)$ for all $x \in [-\pi, \pi]$. The function
\[ \varphi(x) := \cos x \]
is strictly decreasing on $[0, \pi]$. Since $y \mapsto f(\varphi^{-1}(y))$ is continuous on $[-1, 1]$, it follows from the Weierstrass approximation theorem (Proposition 2 in Section 1.26) that there exists a polynomial $p(y) = c_0 + c_1 y + \cdots + c_n y^n$ such that
\[ \max_{-1 \le y \le 1} |f(\varphi^{-1}(y)) - p(y)| < \varepsilon. \]
Letting $y = \cos x$, this implies
\[ \max_{0 \le x \le \pi} |f(x) - q(x)| < \varepsilon, \]
where $q(x) := p(\cos x)$, and hence $q \in T$. Since $f$ and $q$ are even, we also get
\[ \max_{-\pi \le x \le \pi} |f(x) - q(x)| < \varepsilon. \]
Step 2: Let $f$ be odd, i.e., $f(-x) = -f(x)$ for all $x \in [-\pi, \pi]$, and let $f(0) = f(\pi) = 0$. Choose $\delta > 0$. Set
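\[ g(x) := \begin{cases} f\Bigl(\dfrac{\pi(x - \delta)}{\pi - 2\delta}\Bigr) & \text{if } 0 < \delta \le x \le \pi - \delta, \\ 0 & \text{if } 0 \le x \le \delta \text{ or } \pi - \delta \le x \le \pi. \end{cases} \]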
Finally, let $g(x) := -g(-x)$ if $-\pi \le x \le 0$. Since $f$ is uniformly continuous on $[-\pi, \pi]$, we get
\[ \max_{-\pi \le x \le \pi} |f(x) - g(x)| < \frac{\varepsilon}{2} \]
for sufficiently small $\delta > 0$. Applying Step 1 to the even continuous function $x \mapsto g(x)/\sin x$ on $[-\pi, \pi]$, it follows that there exists a $q \in T$ such that
\[ \max_{-\pi \le x \le \pi} \Bigl| \frac{g(x)}{\sin x} - q(x) \Bigr| < \frac{\varepsilon}{2}. \]
Setting $r(x) := q(x) \sin x$, we obtain that $r \in T$ and
\[ \max_{-\pi \le x \le \pi} |g(x) - r(x)| < \frac{\varepsilon}{2}. \]
Hence
\[ \max_{-\pi \le x \le \pi} |f(x) - r(x)| < \varepsilon. \]
Step 3: In the general case, we use the decomposition
\[ f(x) = 2^{-1}(f(x) + f(-x)) + 2^{-1}(f(x) - f(-x)), \]
and we apply Steps 1 and 2 to the even part and odd part, respectively. Observe that $f(x) - f(-x) = 0$ for $x = 0, \pi$. □
Corollary 4. The set $T$ of trigonometric polynomials is dense in $L_2(-\pi, \pi)$.
Proof. Let $u \in L_2(-\pi, \pi)$, and let $\varepsilon > 0$ be given. By Proposition 7 in Section 2.2, the set $C[-\pi, \pi]$ is dense in $L_2(-\pi, \pi)$. Thus, there exists a continuous function $f\colon [-\pi, \pi] \to \mathbb{R}$ such that
\[ \|u - f\| < \varepsilon. \]
Changing continuously the function $f$ near the point $x = \pi$, we may assume that $f(-\pi) = f(\pi)$. By Lemma 3, there exists a function $q \in T$ such that
\[ \|f - q\| \le (2\pi)^{1/2} \max_{-\pi \le x \le \pi} |f(x) - q(x)| < \varepsilon. \]
By the triangle inequality, $\|u - q\| < 2\varepsilon$. □
Proof of Proposition 1. We first show that $\{u_n\}$ forms an orthonormal system, i.e.,
\[ (u_n \mid u_k) = \delta_{nk} \quad \text{for } n, k = 0, 1, \ldots. \tag{20} \]
Using
\[ e^{\pm inx} = \cos nx \pm i \sin nx, \quad n = 1, 2, \ldots, \]
relation (20) follows from
\[ \cos nx = \frac{e^{inx} + e^{-inx}}{2}, \qquad \sin nx = \frac{e^{inx} - e^{-inx}}{2i} \]
and
\[ \int_{-\pi}^{\pi} e^{imx}\,dx = 0, \quad m = \pm 1, \pm 2, \ldots, \]
by a simple computation.
By Corollary 4, the set $T = \operatorname{span}\{u_0, u_1, \ldots\}$ is dense in $L_2(-\pi, \pi)$. It follows from Theorem 3.A that the orthonormal system $\{u_n\}$ is complete in $L_2(-\pi, \pi)$. □
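Completeness of the trigonometric system means that the special Parseval equation (18*) holds in $L_2(-\pi, \pi)$. As an illustration, the Python sketch below checks it numerically for the triangle function $f(x) = \frac{\pi}{4}(\frac{\pi}{2} - |x|)$ from the introduction of this chapter, whose only nonvanishing classical coefficients are $a_k = 1/k^2$ for odd $k$. Since $(u_{2k-1} \mid f) = \pi^{1/2} a_k$, equation (18*) reads $\int_{-\pi}^{\pi} f(x)^2\,dx = \pi \sum_{k \text{ odd}} k^{-4}$. The function names and the midpoint quadrature are assumptions of this example.

```python
import math

def f(x):
    """Triangle function on [-pi, pi]: f(x) = (pi/4) * (pi/2 - |x|)."""
    return (math.pi / 4) * (math.pi / 2 - abs(x))

def norm_squared(samples=20000):
    """Midpoint-rule approximation of the integral of f(x)^2 over [-pi, pi]."""
    h = 2 * math.pi / samples
    return h * sum(f(-math.pi + (i + 0.5) * h) ** 2 for i in range(samples))

def coefficient_sum(terms=2000):
    """pi * sum of a_k^2 with a_k = 1/k^2 for the odd k = 1, 3, 5, ... (truncated)."""
    return math.pi * sum(1.0 / (2 * j + 1) ** 4 for j in range(terms))

if __name__ == "__main__":
    print("||f||^2          =", norm_squared())
    print("sum of |c_n|^2   =", coefficient_sum())
    print("closed form pi^5/96 =", math.pi ** 5 / 96)
```

Both sides agree with the closed form $\pi^5/96$, which follows from $\sum_{k \text{ odd}} k^{-4} = \pi^4/96$.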
3.3 The Schmidt Orthogonalization Method

Proposition 1. In each separable Hilbert space $X$ over $\mathbb{K}$ with $X \neq \{0\}$, there exists a complete orthonormal system.
Proof. By assumption, there exists an at most countable set $\{v_0, v_1, \ldots\}$ that is dense in $X$. We may assume that $v_0 \neq 0$. Set
\[ u_0 := \frac{v_0}{\|v_0\|}. \]
Suppose that we have already constructed $u_0, \ldots, u_n$ such that $\{u_0, \ldots, u_n\}$ forms an orthonormal system. Then, let
\[ w_{n+1} := v_{n+1} - \sum_{k=0}^{n} (u_k \mid v_{n+1}) u_k. \tag{21} \]
If $w_{n+1} \neq 0$, then we set
\[ u_{n+1} := \frac{w_{n+1}}{\|w_{n+1}\|}. \tag{21*} \]
Thus, $(u_m \mid u_{n+1}) = 0$ for $m = 0, \ldots, n$, and $(u_{n+1} \mid u_{n+1}) = 1$.
If $w_{n+1} = 0$, then we use $v_{n+2}$, and so forth. This way we obtain an orthonormal system $\{u_m\}$.
By induction, it follows that all the $v_m$ are finite linear combinations of the $u_n$. Hence the linear hull of $\{u_m\}$ is dense in $X$. If $\{u_m\}$ is countable, then Theorem 3.A tells us that $\{u_m\}$ is complete.
If $\{u_m\}$ is finite, then $\operatorname{span}\{u_m\} = X$, since each finite-dimensional linear subspace of a Hilbert space is closed, by Corollary 7 in Section 1.12. Thus, $\{u_m\}$ is again complete. □
Method (21) for constructing the orthonormal system {un} is called the
Schmidt orthogonalization method. The following two results will be used
critically in the next section.
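The construction (21), (21*) translates directly into a short program. The following Python sketch works in $\mathbb{R}^n$ with the Euclidean inner product; the function names `inner` and `schmidt` and the tolerance `tol`, which decides numerically whether $w_{n+1} = 0$, are assumptions of this illustration.

```python
import math

def inner(a, b):
    """Euclidean inner product on R^n."""
    return sum(x * y for x, y in zip(a, b))

def schmidt(vectors, tol=1e-12):
    """Orthonormalize `vectors` following (21) and (21*).

    A vector whose remainder w is (numerically) zero is skipped,
    as in the proof of Proposition 1.
    """
    basis = []
    for v in vectors:
        # (21): subtract the components of v along the u_k constructed so far.
        w = list(v)
        for u in basis:
            c = inner(u, v)
            w = [wi - c * ui for wi, ui in zip(w, u)]
        norm = math.sqrt(inner(w, w))
        if norm > tol:
            # (21*): normalize the nonzero remainder.
            basis.append([wi / norm for wi in w])
    return basis

if __name__ == "__main__":
    vs = [[1.0, 1.0, 0.0], [2.0, 2.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
    for u in schmidt(vs):
        print(u)
```

In the usage example the second vector is a multiple of the first and is therefore skipped, so three orthonormal vectors are returned.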
Proposition 2. We assume the following:
(i) Let $\{v_0, v_1, \ldots\}$ be a sequence in the Hilbert space $X$ over $\mathbb{K}$ such that $v_0, \ldots, v_m$ are linearly independent for each $m = 0, 1, \ldots$.
(ii) Let the span $\{v_0, v_1, \ldots\}$ be dense in $X$.
(iii) Let $\{u_0, u_1, \ldots\}$ be a countable orthonormal system in $X$ such that
\[ u_0 = a_0 v_0, \quad a_0 > 0, \tag{22} \]
and
\[ u_{n+1} = a_{n+1} v_{n+1} + \sum_{k=0}^{n} a_k v_k, \quad a_{n+1} > 0, \tag{23} \]
for all $n = 0, 1, \ldots$ and appropriate coefficients $a_k \in \mathbb{K}$, $k = 0, \ldots, n$.
Then:
(a) The system $\{u_n\}$ is obtained from $\{v_n\}$ by means of the Schmidt orthogonalization method.
(b) The system $\{u_n\}$ is complete in $X$.
Proof. Ad (a). It follows from (22) that $a_0 = 1/\|v_0\|$, and hence
\[ u_0 = \frac{v_0}{\|v_0\|}. \]
Let $n \ge 1$. By (23),
\[ v_k \in \operatorname{span}\{u_0, \ldots, u_n\} \quad \text{for } k = 0, \ldots, n. \tag{23*} \]
Hence
\[ u_{n+1} = a_{n+1} v_{n+1} + \sum_{m=0}^{n} \beta_m u_m, \]
where $\beta_0, \ldots, \beta_n \in \mathbb{K}$ are appropriate coefficients. Using $(u_k \mid u_{n+1}) = 0$ for $k = 0, \ldots, n$, we get $\beta_k = -a_{n+1}(u_k \mid v_{n+1})$. By (21),
\[ u_{n+1} = a_{n+1} w_{n+1} \quad \text{and} \quad u_{n+1} \neq 0. \]
Thus, according to (21*), $u_{n+1}$ corresponds to the Schmidt orthogonalization.
Ad (b). Since $\operatorname{span}\{v_n\}$ is dense in $X$, it follows from (23*) that the set $\operatorname{span}\{u_n\}$ is also dense in $X$. By Theorem 3.A, $\{u_n\}$ is complete. □
Corollary 3. Let $\{v_0, v_1, \ldots\}$ be a sequence in the Hilbert space $X$ over $\mathbb{K}$. Suppose that
\[ (v_n \mid u) = 0 \text{ for all } n \text{ and fixed } u \in X \quad \text{implies} \quad u = 0. \tag{24} \]
Then, the set $\operatorname{span}\{v_0, v_1, \ldots\}$ is dense in $X$.
Proof. Let $S := \operatorname{span}\{v_0, v_1, \ldots\}$. By (24), $S^{\perp} = \{0\}$. Thus, Corollary 1 from Section 2.9 tells us that $X = \overline{S}$. □
3.4 Applications to Polynomials

Standard Example 1 (Legendre polynomials). For $n = 0, 1, \ldots$, let
\[ v_n(x) := x^n, \quad -1 \le x \le 1. \]
Applying the Schmidt orthogonalization method to $\{v_n\}$, we get the complete orthonormal system $\{u_n\}$ in the Hilbert space $L_2(-1, 1)$. Explicitly,
\[ u_n(x) = \frac{(2n+1)^{1/2}}{n!\,2^{n+1/2}} \frac{d^n}{dx^n} (x^2 - 1)^n, \quad n = 0, 1, \ldots. \tag{25} \]
The polynomials
\[ P_n(x) := \frac{1}{2^n n!} \frac{d^n}{dx^n} (x^2 - 1)^n, \quad n = 0, 1, \ldots, \]
are called the Legendre polynomials.
Proof. Step 1: We show that the system $\{u_n\}$ defined by (25) represents an orthonormal system in $L_2(-1, 1)$, i.e.,
\[ (u_n \mid u_m) = \delta_{nm} \quad \text{for } n, m = 0, 1, \ldots. \]
To this end, we set
\[ w_n(x) := \frac{d^n}{dx^n} (x^2 - 1)^n, \quad n = 0, 1, \ldots. \]
Let $n > m \ge 0$. Then,
\[ \frac{d^r}{dx^r} (x^2 - 1)^n = 0 \quad \text{for } x = \pm 1 \text{ and } r = 0, 1, \ldots, n - 1. \]
Thus, integration by parts yields
\[ \int_{-1}^{1} w_n(x) w_m(x)\,dx = (-1)^n \int_{-1}^{1} (x^2 - 1)^n \frac{d^{m+n}}{dx^{m+n}} (x^2 - 1)^m\,dx = 0, \]
since $m + n > 2m$. Similarly,
\[ \int_{-1}^{1} w_m(x) w_m(x)\,dx = (-1)^m \int_{-1}^{1} (x^2 - 1)^m \frac{d^{2m}}{dx^{2m}} (x^2 - 1)^m\,dx = (2m)! \int_{-1}^{1} (1 - x^2)^m\,dx = \frac{(m!)^2\, 2^{2m+1}}{2m + 1}. \]
Step 2: The set $\operatorname{span}\{v_n\}$ is dense in $L_2(-1, 1)$. This follows from the density of the set $C[-1, 1]$ in $L_2(-1, 1)$ and from the fact that $\operatorname{span}\{v_n\}$ is dense in the Banach space $C[-1, 1]$ by the Weierstrass approximation theorem (cf. the proof of Corollary 4 in Section 3.2).
Step 3: The assertion follows now from Proposition 2 in Section 3.3. In fact, condition (23) is satisfied, since $u_n$ is a polynomial of $n$th degree with a positive coefficient at $x^n$. □
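For small $n$, the relation between the Schmidt method and the Legendre polynomials can be verified with exact rational arithmetic. In the Python sketch below, a polynomial is stored as its list of coefficients, and the orthogonalization step (21) is carried out on $1, x, x^2, \ldots$ without the normalization (21*), so that all numbers stay rational. The representation and the function names are assumptions of this example.

```python
from fractions import Fraction

def inner(p, q):
    """Exact inner product in L2(-1, 1) of two real polynomials.

    A polynomial is a list of coefficients [c_0, c_1, ...] for c_0 + c_1 x + ...
    The integral of x^(i+j) over [-1, 1] is 2/(i+j+1) for even i+j and 0 otherwise.
    """
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:
                total += Fraction(a) * Fraction(b) * Fraction(2, i + j + 1)
    return total

def orthogonal_monomials(count):
    """Apply step (21) to 1, x, x^2, ..., skipping the normalization (21*)."""
    result = []
    for n in range(count):
        v = [Fraction(0)] * n + [Fraction(1)]     # the monomial v_n(x) = x^n
        w = list(v)
        for u in result:
            c = inner(u, v) / inner(u, u)         # projection coefficient along u
            for i, ui in enumerate(u):
                w[i] -= c * ui
        result.append(w)
    return result

if __name__ == "__main__":
    for w in orthogonal_monomials(4):
        print([str(c) for c in w])
```

The output contains $x^2 - \frac{1}{3}$ and $x^3 - \frac{3}{5}x$, which are the Legendre polynomials $P_2$ and $P_3$ up to the positive factors $\frac{3}{2}$ and $\frac{5}{2}$, in agreement with (25).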
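Standard Example 2 (Hermitean functions). For $n = 0, 1, \ldots$, let
\[ v_n(x) := x^n e^{-x^2/2}, \quad -\infty < x < \infty. \]
Applying the Schmidt orthogonalization method to $\{v_n\}$, we get the complete orthonormal system $\{u_n\}$ in the Hilbert spaces $L_2(-\infty, \infty)$ and $L_2^{\mathbb{C}}(-\infty, \infty)$. Explicitly,
\[ u_n(x) = \alpha_n H_n(x) e^{-x^2/2}, \quad n = 0, 1, \ldots, \tag{26} \]
with the Hermitean polynomials
\[ H_n(x) := (-1)^n e^{x^2} \frac{d^n}{dx^n} e^{-x^2}, \quad n = 0, 1, \ldots, \]
and $\alpha_n := 2^{-n/2} (n!)^{-1/2} \pi^{-1/4}$.
The functions $u_n$ are called Hermitean functions. These functions play an important role in quantum mechanics (cf. Section 5.14.3).
Proof. Step 1: We show that the system $\{u_n\}$ defined by (26) forms an orthonormal system in $L_2^{\mathbb{K}}(-\infty, \infty)$ with $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$, i.e.,
\[ (u_n \mid u_m) = \delta_{nm} \quad \text{for } n, m = 0, 1, \ldots. \tag{27} \]
A simple computation shows that
\[ u_n''(x) + (2n + 1 - x^2) u_n(x) = 0, \quad -\infty < x < \infty,\ n = 0, 1, \ldots, \tag{28} \]
\[ u_m''(x) + (2m + 1 - x^2) u_m(x) = 0, \quad -\infty < x < \infty,\ m = 0, 1, \ldots. \tag{29} \]
Observe that, for all $a > 0$ and $k = 0, 1, \ldots$,
\[ \lim_{x \to \pm\infty} e^{-ax^2} x^k = 0. \tag{30} \]
Consequently, integration by parts yields
\[ \int_{-\infty}^{\infty} u_n''(x) u_m(x)\,dx = \lim_{N \to \infty} u_n'(x) u_m(x) \Big|_{-N}^{N} - \lim_{N \to \infty} \int_{-N}^{N} u_n'(x) u_m'(x)\,dx = -\int_{-\infty}^{\infty} u_n'(x) u_m'(x)\,dx, \quad n, m = 0, 1, \ldots. \]
Similarly,
\[ \int_{-\infty}^{\infty} u_n''(x) u_m(x)\,dx = \int_{-\infty}^{\infty} u_n(x) u_m''(x)\,dx, \quad n, m = 0, 1, \ldots. \]
Thus, multiplying (28) and (29) by $u_m$ and $u_n$, respectively, integrating, and subtracting, we get
\[ (2n + 1 - 2m - 1) \int_{-\infty}^{\infty} u_n(x) u_m(x)\,dx = 0, \quad n, m = 0, 1, \ldots. \]
This implies (27) for $n \neq m$.
Furthermore, if we use (30), then integration by parts yields
\[ \int_{-\infty}^{\infty} u_n(x)^2\,dx = \alpha_n^2 \int_{-\infty}^{\infty} H_n(x)^2 e^{-x^2}\,dx = \alpha_n^2\, 2^n n!\, \pi^{1/2} = 1. \]
This yields (27) for $n = m$.
Step 2: A classic lemma. Let the function $f\colon \mathbb{R} \to \mathbb{C}$ be integrable, i.e., $\int_{-\infty}^{\infty} |f(x)|\,dx < \infty$. Suppose that
\[ \int_{-\infty}^{\infty} f(x) e^{-ikx}\,dx = 0 \quad \text{for all } k \in \mathbb{R}. \]
Then, $f(x) = 0$ for almost all $x \in \mathbb{R}$.
The proof of this well-known uniqueness theorem for the Fourier transformation can be found in Rudin (1966), p. 200.
Step 3: We want to show that $\operatorname{span}\{v_n\}$ is dense in $L_2^{\mathbb{C}}(-\infty, \infty)$. According to Corollary 3 in Section 3.3, we have to show that if $u \in L_2^{\mathbb{C}}(-\infty, \infty)$ and
\[ (v_n \mid u) \equiv \int_{-\infty}^{\infty} x^n e^{-x^2/2} u(x)\,dx = 0 \quad \text{for all } n = 0, 1, \ldots, \tag{31} \]
then $u = 0$. To this end, let $M := \{k \in \mathbb{C}\colon |\operatorname{Im} k| < 1\}$ and set
\[ g(k) := \int_{-\infty}^{\infty} e^{-x^2/2} u(x) e^{-ikx}\,dx \quad \text{for all } k \in M. \]
Formally,
\[ g^{(n)}(k) = \int_{-\infty}^{\infty} e^{-x^2/2} u(x) (-ix)^n e^{-ikx}\,dx \quad \text{for all } k \in M,\ n = 0, 1, 2, \ldots. \tag{32} \]
For all $x \in \mathbb{R}$ and $k \in M$, we get
\[ \bigl| e^{-x^2/2} u(x) (-ix)^n e^{-ikx} \bigr| \le \operatorname{const}(n)\, e^{-x^2/4} |u(x)|, \quad n = 0, 1, \ldots. \tag{33} \]
Since $x \mapsto e^{-x^2/4}$ and $u$ are elements of $L_2^{\mathbb{C}}(-\infty, \infty)$,
\[ \int_{-\infty}^{\infty} e^{-x^2/4} |u(x)|\,dx < \infty. \]
Thus, the majorant condition (33) justifies formula (32) (cf. Parameter Integrals in the appendix). Consequently, the function $g$ is analytic on the strip $M$. By (31) and (32),
\[ g^{(n)}(0) = 0 \quad \text{for all } n = 0, 1, \ldots. \]
Hence $g(k) = 0$ for all $k \in M$. By Step 2, $u(x) = 0$ for almost all $x \in \mathbb{R}$.
Step 4: The assertion follows now from Proposition 2 in Section 3.3. □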
3.5 Unitary Operators

Definition 1. Let $X$ and $Y$ be Hilbert spaces over $\mathbb{K}$. The operator $U\colon X \to Y$ is called unitary iff $U$ is linear, surjective, and
\[ (Uv \mid Uw) = (v \mid w) \quad \text{for all } v, w \in X. \tag{34} \]
Proposition 2. If the operator $U\colon X \to Y$ is unitary, then $U$ is bijective and continuous, and
\[ \|Uv\| = \|v\| \quad \text{for all } v \in X. \tag{35} \]
Moreover, there exists the inverse operator $U^{-1}\colon Y \to X$, which is also unitary.
Proof. Equation (35) follows from (34) with $v = w$. By (35), $U$ is continuous.
If $Uz = Uw$, then $U(z - w) = 0$, and hence $z - w = 0$, by (35). That is, $U$ is injective. Since $U$ is surjective by definition, $U$ is bijective.
Finally, equation (34) implies $(a \mid b) = (U^{-1}a \mid U^{-1}b)$ for all $a, b \in Y$, i.e., $U^{-1}$ is unitary. □
3.6 The Extension Principle

Proposition 1. Suppose that
(a) $X$ and $Y$ are Banach spaces over $\mathbb{K}$. The linear operator $A\colon D \subseteq X \to Y$ satisfies
\[ \|Au\| \le c\|u\| \quad \text{for all } u \in D, \tag{36} \]
where $c \ge 0$ is a constant.
(b) The set $D$ is a linear dense subset of $X$.
Then:
(i) The operator $A$ can be uniquely extended to a linear continuous operator $A\colon X \to Y$ such that (36) holds for all $u \in X$.
(ii) If, in addition, $A$ is compact on $D$, then so is the extended operator $A\colon X \to Y$.
Proof. Ad (i).
Step 1: Existence. Let $u \in X - D$. Since $D$ is dense in $X$, there exists a sequence $(u_n)$ in $D$ such that $u_n \to u$ as $n \to \infty$. In particular, $(u_n)$ is a Cauchy sequence. By (36),
\[ \|Au_n - Au_m\| \le c\|u_n - u_m\|. \]
Hence $(Au_n)$ is also a Cauchy sequence, i.e., $(Au_n)$ converges, since $Y$ is a Banach space. We define
\[ Au := \lim_{n \to \infty} Au_n. \tag{37} \]
We have to show that this definition is independent of the choice of $(u_n)$. To this end, let $(v_n)$ be another sequence in $D$ such that $v_n \to u$ as $n \to \infty$. Then
\[ \|Av_n - Au_n\| \le c\|v_n - u_n\| \to 0 \quad \text{as } n \to \infty. \]
Hence $Av_n \to Au$ as $n \to \infty$.
A passage to the limit shows that (36) holds for all $u \in X$ and the operator $A$ is linear on $X$.
Step 2: Uniqueness. Each linear continuous extension $A\colon X \to Y$ of the operator $A\colon D \to Y$ satisfies (37). Hence the extension is unique.
Ad (ii). Let $(u_n)$ be a bounded sequence in $X$. Since $D$ is dense in $X$, there exists a bounded sequence $(v_n)$ in $D$ such that $u_n - v_n \to 0$ as $n \to \infty$. By assumption, the operator $A\colon D \to Y$ is compact. Thus, there exists a convergent subsequence $(Av_{n'})$. Since
\[ Au_{n'} = A(u_{n'} - v_{n'}) + Av_{n'}, \]
the sequence $(Au_{n'})$ is also convergent. □
Standard Example 2 (Unitary operators). Let $A\colon D \subseteq X \to D$ be a linear surjective operator such that
\[ (Av \mid Aw) = (v \mid w) \quad \text{for all } v, w \in D, \tag{38} \]
where $D$ is a linear dense subspace of the Hilbert space $X$ over $\mathbb{K}$.
Then, this operator can be uniquely extended to a unitary operator $A\colon X \to X$.
Proof. It follows as in the proof of Proposition 2 in Section 3.5 that $A\colon D \to D$ is bijective and
\[ \|Au\| = \|A^{-1}u\| = \|u\| \quad \text{for all } u \in D. \]
According to Proposition 1, the operators $A\colon D \to D$ and $A^{-1}\colon D \to D$ can be uniquely extended to linear continuous operators $A\colon X \to X$ and $B\colon X \to X$, respectively. Using (37), a passage to the limit shows that the two relations (38) and
\[ ABv = v \quad \text{for all } v \in D \]
remain true for all $v, w \in X$. Hence $AB(X) = X$, i.e., $A$ is surjective. □
3.7 Applications to the Fourier Transformation

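Definition 1. The space $\mathcal{S}$ consists precisely of all the $C^\infty$-functions $u\colon \mathbb{R} \to \mathbb{C}$ with
\[ \|u\|_{p,q} < \infty \quad \text{for all } p, q = 0, 1, \ldots, \tag{39} \]
where
\[ \|u\|_{p,q} := \sup_{x \in \mathbb{R}} (1 + |x|^p) \sum_{n=0}^{q} |u^{(n)}(x)|. \tag{40} \]
Let $u_n, u \in \mathcal{S}$ for all $n$. We introduce the convergence
\[ u_n \xrightarrow{\ \mathcal{S}\ } u \quad \text{as } n \to \infty \]
by means of
\[ \|u_n - u\|_{p,q} \to 0 \quad \text{as } n \to \infty \text{ for all } p, q = 0, 1, 2, \ldots. \]
The operator $A\colon \mathcal{S} \to \mathcal{S}$ is called sequentially continuous iff, as $n \to \infty$,
\[ u_n \xrightarrow{\ \mathcal{S}\ } u \quad \text{implies} \quad Au_n \xrightarrow{\ \mathcal{S}\ } Au. \]
Obviously, $u \in \mathcal{S}$ implies
\[ |u^{(n)}(x)| \le \frac{\operatorname{const}(n, p)}{1 + |x|^p} \quad \text{on } \mathbb{R} \text{ for all } n, p = 0, 1, \ldots. \tag{41} \]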
Therefore, the functions u from the linear space S are called rapidly de-
creasing at infinity. In particular, if u, v E S, then
for all n = 0,1, ... ,
and integration by parts yields
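\[ \int_{-\infty}^{\infty} u'(x) v(x)\,dx = \lim_{N \to \infty} u(x) v(x) \Big|_{-N}^{N} - \lim_{N \to \infty} \int_{-N}^{N} u(x) v'(x)\,dx = -\int_{-\infty}^{\infty} u(x) v'(x)\,dx. \tag{42} \]
Obviously, formula (42) remains true if $u \in \mathcal{S}$ and $v \in C^1(\mathbb{R})$ along with
\[ |v(x)| + |v'(x)| \le \operatorname{const} \quad \text{on } \mathbb{R}. \]
Each $\|\cdot\|_{p,q}$ represents a norm on $\mathcal{S}$. In contrast to a linear normed space, $\mathcal{S}$ is equipped with a countable set of norms.
Example 2. Set $u(x) := e^{-x^2/2}$. Obviously, $u \in \mathcal{S}$. Moreover,
\[ e^{-k^2/2} = (2\pi)^{-1/2} \int_{-\infty}^{\infty} e^{-x^2/2} e^{-ikx}\,dx \quad \text{for all } k \in \mathbb{R}. \tag{43} \]
In terms of the Fourier transformation introduced ahead, relation (43) tells us that
\[ u(k) = (Fu)(k) \quad \text{for all } k \in \mathbb{R}, \]
i.e., $u(\cdot)$ is a fixed point of the Fourier transformation $F\colon \mathcal{S} \to \mathcal{S}$.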
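Proof. Set
\[ f(k) := \int_{-\infty}^{\infty} e^{-x^2/2} e^{-ikx}\,dx \quad \text{for all } k \in \mathbb{R}. \tag{44} \]
Formally,
\[ f^{(n)}(k) = \int_{-\infty}^{\infty} e^{-x^2/2} e^{-ikx} (-ix)^n\,dx \quad \text{for all } k \in \mathbb{R},\ n = 0, 1, \ldots. \]
This can be justified rigorously because of the majorant condition
\[ \bigl| e^{-x^2/2} e^{-ikx} (-ix)^n \bigr| \le |x|^n e^{-x^2/2} \quad \text{for all } k \in \mathbb{R},\ n = 0, 1, \ldots \]
(cf. Parameter Integrals in the appendix). Integration by parts yields
\[ f'(k) = \int_{-\infty}^{\infty} e^{-x^2/2} e^{-ikx} (-ix)\,dx = \int_{-\infty}^{\infty} \frac{d e^{-x^2/2}}{dx}\, i e^{-ikx}\,dx = -\int_{-\infty}^{\infty} e^{-x^2/2}\, k e^{-ikx}\,dx = -k f(k) \quad \text{for all } k \in \mathbb{R}. \]
Hence
\[ f(k) = e^{-k^2/2} f(0) \quad \text{for all } k \in \mathbb{R}. \]
By (44), $f(0) = \int_{-\infty}^{\infty} e^{-x^2/2}\,dx = (2\pi)^{1/2}$. This is (43). □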
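Relation (43) is easy to test numerically. The Python sketch below approximates the integral in (43) by the midpoint rule on a truncated interval and compares the result with $e^{-k^2/2}$. The function name `fourier_transform`, the truncation to $[-12, 12]$, and the number of sample points are assumptions of this illustration, not part of the proof.

```python
import cmath
import math

def gaussian(x):
    """The function u(x) = exp(-x^2 / 2) from Example 2."""
    return math.exp(-x * x / 2.0)

def fourier_transform(u, k, half_width=12.0, samples=4000):
    """Midpoint-rule approximation of (2*pi)^(-1/2) * integral of u(x) e^{-ikx} dx.

    The real line is truncated to [-half_width, half_width]; for the Gaussian
    the neglected tails are far below double precision.
    """
    h = 2.0 * half_width / samples
    total = 0j
    for i in range(samples):
        x = -half_width + (i + 0.5) * h
        total += u(x) * cmath.exp(-1j * k * x)
    return total * h / math.sqrt(2.0 * math.pi)

if __name__ == "__main__":
    for k in (0.0, 0.5, 1.0, 2.0):
        print(k, fourier_transform(gaussian, k), gaussian(k))
```

The two printed columns should agree up to rounding errors, and the imaginary part of the computed transform should be negligible because the Gaussian is even.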
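Let us now study the so-called Fourier transformation
\[ a(k) := (2\pi)^{-1/2} \int_{-\infty}^{\infty} u(x) e^{-ikx}\,dx \quad \text{for all } k \in \mathbb{R}, \tag{45} \]
along with the inverse Fourier transformation
\[ u(x) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} a(k) e^{ikx}\,dk \quad \text{for all } x \in \mathbb{R}. \tag{46} \]
We set $Fu := a$.
Proposition 3. The following hold true:
(i) The Fourier transformation
\[ F\colon \mathcal{S} \to \mathcal{S} \]
is linear, bijective, and sequentially continuous.
(ii) The inverse transformation
\[ F^{-1}\colon \mathcal{S} \to \mathcal{S} \]
is also sequentially continuous. Explicitly, $F^{-1}$ is given through (46).
(iii) For all $u, v \in \mathcal{S}$,
\[ \int_{-\infty}^{\infty} \overline{u(x)}\, v(x)\,dx = \int_{-\infty}^{\infty} \overline{(Fu)(k)}\, (Fv)(k)\,dk. \tag{47} \]
(iv) For all $u \in \mathcal{S}$ and all $k \in \mathbb{R}$, we have
\[ k^p a(k) = (-i)^p (F u^{(p)})(k) \tag{48a} \]
and
\[ a^{(q)}(k) = (-i)^q F[x^q u(x)](k), \tag{48b} \]
where $p, q = 0, 1, \ldots$.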
Corollary 4. Let u E S. Then Fu = 0 implies u = O.
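Proof. Step 1: We prove (48). Let $u \in \mathcal{S}$. Formal differentiation yields
\[ a^{(q)}(k) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} u(x) (-ix)^q e^{-ikx}\,dx \]
for all $k \in \mathbb{R}$, $q = 0, 1, \ldots$. This can be justified rigorously by using the majorant condition
\[ \bigl| u(x) (-ix)^q e^{-ikx} \bigr| \le \frac{\operatorname{const}(q)}{1 + x^2} \quad \text{for all } k \in \mathbb{R} \]
(cf. Parameter Integrals in the appendix).
Moreover, integration by parts yields
\[ k^p a^{(q)}(k) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} u(x) (-ix)^q (-i)^{-p} \frac{d^p e^{-ikx}}{dx^p}\,dx = (2\pi)^{-1/2} (-1)^p \int_{-\infty}^{\infty} \Bigl\{ \frac{d^p}{dx^p} \bigl[ u(x) (-ix)^q \bigr] \Bigr\} (-i)^{-p} e^{-ikx}\,dx. \tag{48*} \]
This implies (48).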
Step 2: We want to show that F: S ----+ S is sequentially continuous. Let
q ~ m. By (48*),
This implies
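\[ \|a\|_{p,q} \le \operatorname{const}(p, q)\, \|u\|_{q+2,p}. \]
Recall that $a = Fu$. Since $F\colon \mathcal{S} \to \mathcal{S}$ is linear, we get
\[ \|Fu - Fu_n\|_{p,q} \le \operatorname{const}(p, q)\, \|u - u_n\|_{q+2,p} \quad \text{for all } p, q = 0, 1, \ldots, \]
and all $u, u_n \in \mathcal{S}$. Consequently, as $n \to \infty$,
\[ u_n \xrightarrow{\ \mathcal{S}\ } u \quad \text{implies} \quad Fu_n \xrightarrow{\ \mathcal{S}\ } Fu. \]
Step 3: Let us prove the following key formula:
\[ \int_{-\infty}^{\infty} a(k) v(k) e^{iky}\,dk = \int_{-\infty}^{\infty} b(z) u(z + y)\,dz, \tag{49} \]
provided $a = Fu$ and $b = Fv$ with $u, v \in \mathcal{S}$. In fact, by the Fubini theorem,
\[ \int_{-\infty}^{\infty} a(k) v(k) e^{iky}\,dk = (2\pi)^{-1/2} \int_{-\infty}^{\infty} v(k) e^{iky} \Bigl( \int_{-\infty}^{\infty} e^{-ikx} u(x)\,dx \Bigr) dk \]
\[ = (2\pi)^{-1/2} \int_{-\infty}^{\infty} u(x) \Bigl( \int_{-\infty}^{\infty} e^{-ik(x-y)} v(k)\,dk \Bigr) dx = \int_{-\infty}^{\infty} u(x) b(x - y)\,dx = \int_{-\infty}^{\infty} b(z) u(z + y)\,dz \]
(cf. Iterated Integration in the appendix).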

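Step 4: Let $\varepsilon > 0$ and choose $v(x) = e^{-\varepsilon^2 x^2/2}$. Using the substitution $z = \varepsilon x$, it follows from Example 2 and $b = Fv$ that $b(k) = \varepsilon^{-1} e^{-k^2/(2\varepsilon^2)}$. By (49),
\[ \int_{-\infty}^{\infty} a(k) e^{-\varepsilon^2 k^2/2} e^{iky}\,dk = \int_{-\infty}^{\infty} \varepsilon^{-1} e^{-z^2/(2\varepsilon^2)} u(z + y)\,dz = \int_{-\infty}^{\infty} e^{-t^2/2} u(\varepsilon t + y)\,dt. \tag{50} \]
Observe that
\[ \bigl| e^{-t^2/2} u(\varepsilon t + y) \bigr| \le e^{-t^2/2} \sup_{x \in \mathbb{R}} |u(x)| \]
for all $t, y \in \mathbb{R}$ and $\varepsilon > 0$, since $u \in \mathcal{S}$. Thus, letting $\varepsilon \to +0$ in (50), it follows from the Lebesgue dominated convergence theorem (cf. the appendix) that
\[ u(y) = (2\pi)^{-1/2} \int_{-\infty}^{\infty} a(k) e^{iky}\,dk \quad \text{for all } y \in \mathbb{R}. \tag{51} \]
Instead of (51), let us write
\[ Ga := u. \]
This way we obtain the linear sequentially continuous operator $G\colon \mathcal{S} \to \mathcal{S}$. Observe that $G$ is obtained from $F$ by replacing $e^{-ikx}$ with $e^{ikx}$, and use the same argument as in Step 2.
Step 5: We want to prove that the linear operator $F\colon \mathcal{S} \to \mathcal{S}$ is bijective. By (51) with $a = Fu$, we get
\[ GFu = u \quad \text{for all } u \in \mathcal{S}. \tag{52} \]
Replacing $e^{-ikx}$ with $e^{ikx}$, the argument from Step 4 yields
\[ FGu = u \quad \text{for all } u \in \mathcal{S}. \tag{53} \]
By Proposition 6 in Section 1.20, $F\colon \mathcal{S} \to \mathcal{S}$ is bijective and $F^{-1} = G$.
Similarly as in Step 2 it follows that $F^{-1}\colon \mathcal{S} \to \mathcal{S}$ is sequentially continuous.
Step 6: Let us prove (47). By (49) with $y = 0$,
\[ \int_{-\infty}^{\infty} (Fu)(k) (F^{-1}b)(k)\,dk = \int_{-\infty}^{\infty} u(z) b(z)\,dz \quad \text{for all } u, b \in \mathcal{S}. \]
Set $b = \overline{w}$. Then
\[ (F^{-1}\overline{w})(k) = \overline{(Fw)(k)} \quad \text{for all } k \in \mathbb{R}. \]
Hence
\[ \int_{-\infty}^{\infty} (Fu)(k)\, \overline{(Fw)(k)}\,dk = \int_{-\infty}^{\infty} u(z)\, \overline{w(z)}\,dz \quad \text{for all } u, w \in \mathcal{S}. \]
This is (47). □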
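Proposition 5. The Fourier transformation $F\colon \mathcal{S} \to \mathcal{S}$ can be uniquely extended to a unitary operator
\[ F\colon L_2^{\mathbb{C}}(\mathbb{R}) \to L_2^{\mathbb{C}}(\mathbb{R}). \]
Proof. By the definition of $\mathcal{S}$,
\[ C_0^\infty(\mathbb{R})_{\mathbb{C}} \subseteq \mathcal{S} \subseteq L_2^{\mathbb{C}}(\mathbb{R}). \]
Since the set $C_0^\infty(\mathbb{R})_{\mathbb{C}}$ is dense in $L_2^{\mathbb{C}}(\mathbb{R})$, so is the set $\mathcal{S}$. By (47),
\[ (Fu \mid Fv) = (u \mid v) \quad \text{for all } u, v \in \mathcal{S}, \]
where $(\cdot \mid \cdot)$ denotes the inner product in $L_2^{\mathbb{C}}(\mathbb{R})$. The assertion follows now from Standard Example 2 in Section 3.6. □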
3.8 The Fourier Transform of Tempered

Generalized Functions
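Definition 1. The set $\mathcal{S}'$ consists exactly of all the linear, sequentially continuous mappings
\[ T\colon \mathcal{S} \to \mathbb{C}, \]
i.e., we have $T \in \mathcal{S}'$ iff
\[ T(\alpha u + \beta v) = \alpha Tu + \beta Tv \quad \text{for all } \alpha, \beta \in \mathbb{C},\ u, v \in \mathcal{S}, \]
and, as $n \to \infty$,
\[ u_n \xrightarrow{\ \mathcal{S}\ } u \quad \text{implies} \quad Tu_n \to Tu. \]
The elements $T$ of $\mathcal{S}'$ are called tempered generalized functions (or tempered distributions).
Definition 2. Let $T \in \mathcal{S}'$. The Fourier transform $FT$ of $T$ is defined by
\[ (FT)(u) := T(Fu) \quad \text{for all } u \in \mathcal{S}. \]
Proposition 3. The operator $F\colon \mathcal{S}' \to \mathcal{S}'$ is linear and bijective.
Proof. Let $T \in \mathcal{S}'$. If $u_n \xrightarrow{\ \mathcal{S}\ } u$, then $Fu_n \xrightarrow{\ \mathcal{S}\ } Fu$, as $n \to \infty$. Hence
\[ (FT)(u_n) = T(Fu_n) \to T(Fu) = (FT)(u) \quad \text{as } n \to \infty. \]
Consequently, $FT \in \mathcal{S}'$.
Let $S \in \mathcal{S}'$. Define
\[ T(u) := S(F^{-1}u) \quad \text{for all } u \in \mathcal{S}. \]
Then,
\[ (FT)(u) = S(F^{-1}Fu) = S(u) \quad \text{for all } u \in \mathcal{S}. \]
Hence $FT = S$, i.e., $F\colon \mathcal{S}' \to \mathcal{S}'$ is surjective.
Moreover, let $FT = 0$. Then $T(Fu) = 0$ for all $u \in \mathcal{S}$, and hence $Tv = 0$ for all $v \in \mathcal{S}$, i.e., $T = 0$. Thus, $F\colon \mathcal{S}' \to \mathcal{S}'$ is bijective.
Let $\alpha, \beta \in \mathbb{C}$ and $T, S \in \mathcal{S}'$. Then
\[ F(\alpha T + \beta S)(u) = (\alpha T + \beta S)(Fu) = \alpha T(Fu) + \beta S(Fu) = \alpha (FT)(u) + \beta (FS)(u) \quad \text{for all } u \in \mathcal{S}. \]
Hence $F(\alpha T + \beta S) = \alpha FT + \beta FS$, i.e., $F\colon \mathcal{S}' \to \mathcal{S}'$ is linear. □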
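Standard Example 4. Let $v\colon \mathbb{R} \to \mathbb{C}$ be a measurable bounded function (e.g., $v$ is continuous and bounded, i.e., $|v(x)| \le \operatorname{const}$ for all $x \in \mathbb{R}$). Define
\[ T(u) = \int_{-\infty}^{\infty} v(x) u(x)\,dx \quad \text{for all } u \in \mathcal{S}. \]
Then, $T \in \mathcal{S}'$.
Proof. Let $u_n \xrightarrow{\ \mathcal{S}\ } u$ as $n \to \infty$. Then
\[ \Delta_n := \sup_{x \in \mathbb{R}} (1 + x^2) |u_n(x) - u(x)| \to 0 \quad \text{as } n \to \infty. \]
Hence
\[ |T(u_n - u)| \le \int_{-\infty}^{\infty} \frac{|v(x)|}{1 + x^2} (1 + x^2) |u_n(x) - u(x)|\,dx \le \operatorname{const} \cdot \Delta_n \to 0 \quad \text{as } n \to \infty, \]
i.e., $Tu_n \to Tu$ as $n \to \infty$. □
Standard Example 5. Let $y \in \mathbb{R}$. Define the tempered delta distribution $\delta_y$ through
\[ \delta_y(u) := u(y) \quad \text{for all } u \in \mathcal{S}. \]
Then
(i) $\delta_y \in \mathcal{S}'$.
(ii) $F\delta_y = (2\pi)^{-1/2}\, \text{"}e^{-iky}\text{"}$.
(iii) $\delta_y = (2\pi)^{-1/2} F^{-1}(\text{"}e^{-iky}\text{"})$.
Here, the tempered generalized function "$e^{-iky}$" corresponds to the classic function $a(k) = e^{-iky}$ for all $k \in \mathbb{R}$ and fixed $y \in \mathbb{R}$, in the sense of Standard Example 4. That is,
\[ \text{"}e^{-iky}\text{"}(u) = \int_{-\infty}^{\infty} e^{-iky} u(k)\,dk \quad \text{for all } u \in \mathcal{S}. \]
Proof. Ad (i). Let $u_n \xrightarrow{\ \mathcal{S}\ } u$ as $n \to \infty$. Then
\[ \sup_{x \in \mathbb{R}} |u_n(x) - u(x)| \to 0 \quad \text{as } n \to \infty. \]
Hence
\[ \delta_y(u_n) \to \delta_y(u) \quad \text{as } n \to \infty. \]
Ad (ii). For all $u \in \mathcal{S}$,
\[ (F\delta_y)(u) = \delta_y(Fu) = (Fu)(y) = \int_{-\infty}^{\infty} u(k) (2\pi)^{-1/2} e^{-iyk}\,dk. \]
Ad (iii). Observe that $F\colon \mathcal{S}' \to \mathcal{S}'$ is bijective. □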
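Remark 6 (The language of physicists). Instead of (ii) and (iii) from Standard Example 5, physicists formally write
\[ (2\pi)^{-1/2} \int_{-\infty}^{\infty} \delta(x - y) e^{-ikx}\,dx = (2\pi)^{-1/2} e^{-iky} \quad \text{for all } k, y \in \mathbb{R}, \tag{54} \]
and
\[ \delta(x - y) = (2\pi)^{-1} \int_{-\infty}^{\infty} e^{ik(x - y)}\,dk \quad \text{for all } x, y \in \mathbb{R}. \tag{55} \]
Formally, (54) follows from the "naive interpretation" $\int_{-\infty}^{\infty} f(x) \delta(x - y)\,dx = f(y)$ of the Dirac delta function $\delta$. Moreover, (55) follows from (54) by means of the inverse Fourier transformation if we regard $\delta$ as a "classical function."
Formulas (54) and (55) are frequently used in quantum physics. Applying the Dirac calculus from Section 5.21, physicists "elegantly obtain" (55) in the following way:
\[ (2\pi)^{-1} \int_{-\infty}^{\infty} e^{ik(x - y)}\,dk = \sum_k (x \mid k)(k \mid y) = (x \mid y) = \delta(x - y). \]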
Problems
3.1. Density. Let $M$ be a subset of a Hilbert space $X$ over $\mathbb{K}$. Show that the set span $M$ is dense in $X$ iff
$$(u \mid v) = 0 \ \text{ for all } v \in M \quad \text{implies} \quad u = 0.$$
3.2. The Parseval equation. Let $(u_n)_{n \ge 1}$ be an orthonormal system in the separable Hilbert space $X$ over $\mathbb{K}$. Show that $(u_n)$ is complete iff
$$\sum_{n \ge 1} |(u_n \mid u)|^2 = \|u\|^2 \quad \text{for all } u \in X.$$
3.3. A fundamental completeness theorem. Let $-\infty \le a < b \le \infty$. We are given a measurable function $f\colon\, ]a, b[\, \to \mathbb{K}$ (e.g., $f$ is continuous) such that
$$|f(x)| \le Ce^{-\alpha|x|} \quad \text{for all } x \in \mathbb{R} \text{ and fixed } \alpha > 0 \text{ and } C > 0.$$
Show that the linear hull of the system $\{x^n f(x)\}_{n=0,1,\ldots}$ is dense in the Hilbert space $L_2^{\mathbb{K}}(a, b)$.
Hint: Use a similar argument as in the proof of Standard Example 2 in Section 3.4. Cf. Kolmogorov and Fomin (1975), Section 8.4.3.
3.4. The completeness of the system of the Laguerre functions. Starting from the system
$$x^n e^{-x/2}, \quad n = 0, 1, \ldots, \quad x \in \mathbb{R},$$
the Schmidt orthogonalization method yields a system of functions
$$L_n(x)e^{-x/2}, \quad n = 0, 1, \ldots, \quad x \in \mathbb{R}. \tag{56}$$
Show that the following are true:
(i) System (56) forms a complete orthonormal system in $L_2(0, \infty)$.
(ii) Explicitly,
$$L_n(x) := \frac{(-1)^n}{n!}\,e^{x}\,\frac{d^n}{dx^n}\bigl(e^{-x}x^n\bigr), \quad n = 0, 1, \ldots, \quad x \in \mathbb{R}.$$
Hint: Use a similar argument as in the proof of Standard Example 2 in Section 3.4 and use Problem 3.3.
3.5.* Properties of the Fourier transform. Let $f\colon \mathbb{R} \to \mathbb{R}$ be a measurable function (e.g., $f$ is continuous) such that $\int_{\mathbb{R}} |f(x)|\,dx < \infty$. Assume that
$$\int_{\mathbb{R}} f(x)e^{-ixt}\,dx = 0 \quad \text{for all } t \in \mathbb{R}.$$
Then, $f(x) = 0$ for almost all $x \in \mathbb{R}$.
Study the proof in Rudin (1966), p. 200.
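3.6. Applications to density. Problem 3.5 can be used in order to prove the density of certain sets in $X := L_2^{\mathbb{C}}(\mathbb{R})$ via the Fourier transformation $F\colon X \to X$. Let $D$ denote the set of all the Gaussian functions
$$u_{\alpha,\beta}(x) := e^{-\beta(x-\alpha)^2} \quad \text{for all } x \in \mathbb{R}, \text{ where } \alpha \in \mathbb{R} \text{ and } \beta > 0.$$
Show that $D$ is dense in $X$.
Solution: Since $F\colon X \to X$ is a unitary operator, it is sufficient to show that $F(D)$ is dense in $X$. Observe that, for all $k \in \mathbb{R}$,
$$(Fu_{\alpha,\beta})(k) = (2\pi)^{-1/2}\int_{-\infty}^{\infty} e^{-ikx}u_{\alpha,\beta}(x)\,dx = (2\beta)^{-1/2}e^{-i\alpha k}w(k),$$
where $w(k) := e^{-k^2/(4\beta)}$.
To prove that span $F(D)$ is dense in $X$, by Problem 3.1, we have to show that
$$(Fu_{\alpha,\beta} \mid v) = \int_{-\infty}^{\infty} \overline{(Fu_{\alpha,\beta})(k)}\,v(k)\,dk = 0 \quad \text{for all } \alpha \in \mathbb{R},\ \beta > 0, \tag{57}$$
implies $v(k) = 0$ for almost all $k \in \mathbb{R}$. In fact, from (57) we get
$$\int_{-\infty}^{\infty} e^{i\alpha k}w(k)v(k)\,dk = 0 \quad \text{for all } \alpha \in \mathbb{R}.$$
Since $v, w \in L_2^{\mathbb{C}}(\mathbb{R})$, we get $\int_{-\infty}^{\infty} |v(k)w(k)|\,dk < \infty$. Thus, Problem 3.5 implies that $v(k)w(k) = 0$ for almost all $k \in \mathbb{R}$, and hence $v(k) = 0$ for almost all $k \in \mathbb{R}$.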
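The closed form of the Fourier transform of a Gaussian used in this solution can be checked numerically. The following is a minimal sketch, assuming the normalization $(2\pi)^{-1/2}\int e^{-ikx}u(x)\,dx$ for $F$; the function names, the truncation of the integral to $[\alpha - 10, \alpha + 10]$, the trapezoidal rule, and the sample values of $\alpha$, $\beta$, $k$ are choices made only for this illustration.

```python
import cmath
import math


def fourier_transform(f, k, lo, hi, steps):
    """Trapezoidal approximation of (2*pi)^(-1/2) * integral of exp(-i*k*x) * f(x) over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) * cmath.exp(-1j * k * lo) + f(hi) * cmath.exp(-1j * k * hi))
    for j in range(1, steps):
        x = lo + j * h
        total += f(x) * cmath.exp(-1j * k * x)
    return total * h / math.sqrt(2.0 * math.pi)


def gaussian(alpha, beta):
    """The function u_{alpha,beta}(x) = exp(-beta * (x - alpha)^2)."""
    return lambda x: math.exp(-beta * (x - alpha) ** 2)


def closed_form(alpha, beta, k):
    """(2*beta)^(-1/2) * exp(-i*alpha*k) * exp(-k^2 / (4*beta))."""
    return cmath.exp(-1j * alpha * k) * math.exp(-k * k / (4.0 * beta)) / math.sqrt(2.0 * beta)


# Sample parameters (assumptions of this sketch, not taken from the text).
alpha, beta, k = 0.7, 1.5, 2.0
numeric = fourier_transform(gaussian(alpha, beta), k, alpha - 10.0, alpha + 10.0, 4000)
exact = closed_form(alpha, beta, k)
print(abs(numeric - exact))  # difference between quadrature and closed form
```

The factor $e^{-i\alpha k}$ is the only place where the shift $\alpha$ enters, which is why letting $\alpha$ range over $\mathbb{R}$ in (57) produces exactly the hypothesis of Problem 3.5.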
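3.7.* The fundamental Paley-Wiener theorem. Let us consider the Fourier transform
$$F(z) := (2\pi)^{-1/2}\int_{-\infty}^{\infty} f(x)e^{-izx}\,dx, \quad z \in \mathbb{C}.$$
Then, for each fixed $R > 0$, the following two statements are equivalent:
(i) The function $F\colon \mathbb{C} \to \mathbb{C}$ is holomorphic and, for each $N = 1, 2, \ldots$, there is a constant $C_N > 0$ such that
$$|F(z)| \le C_N (1 + |z|)^{-N} e^{R\,|\operatorname{Im} z|} \quad \text{for all } z \in \mathbb{C}.$$
(ii) The function $f$ belongs to $C_0^{\infty}(\mathbb{R})_{\mathbb{C}}$ and vanishes outside the interval $[-R, R]$.
Study the proof in Yosida (1980), Chapter 6.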
3.8. The tensor product $X \otimes Y$. Let $X$ and $Y$ be linear spaces over $\mathbb{K}$, and let $X^*$ denote the space of all linear functionals $u^*\colon X \to \mathbb{K}$. Define
$$(u \otimes v)(u^*, v^*) := u^*(u)\,v^*(v), \tag{58}$$
for all $u \in X$, $v \in Y$, $u^* \in X^*$, $v^* \in Y^*$. Obviously, $u \otimes v$ is a bilinear form on $X^* \times Y^*$. Furthermore, let $X \otimes Y$ denote the set of all possible finite linear combinations
$$\sum_{j,k} \alpha_{jk}\,(u_j \otimes v_k), \tag{59}$$
where $u_j \in X$, $v_k \in Y$, and $\alpha_{jk} \in \mathbb{K}$. Thus, each element of $X \otimes Y$ is a bilinear form on $X^* \times Y^*$ given by (59), i.e.,
$$\Bigl(\sum_{j,k} \alpha_{jk}\,(u_j \otimes v_k)\Bigr)(u^*, v^*) = \sum_{j,k} \alpha_{jk}\,u^*(u_j)\,v^*(v_k)$$
for all $u^* \in X^*$, $v^* \in Y^*$. Naturally enough, if $a, b \in X \otimes Y$, then we say that $a$ is identical to $b$ iff the corresponding bilinear forms are identical, i.e.,
$$a = b \quad \text{iff} \quad a(u^*, v^*) = b(u^*, v^*) \ \text{ for all } u^* \in X^*,\ v^* \in Y^*. \tag{60}$$
Observe that different expressions (59) may correspond to identical bilinear forms, i.e., the representation (59) is not unique for the elements of $X \otimes Y$.
Show that
(i) $X \otimes Y$ is a linear space, by means of the natural linear operations for bilinear forms. More precisely, if $a, b \in X \otimes Y$ and $\alpha, \beta \in \mathbb{K}$, then $\alpha a + \beta b$ is given by
$$(\alpha a + \beta b)(u^*, v^*) := \alpha a(u^*, v^*) + \beta b(u^*, v^*) \quad \text{for all } u^* \in X^*,\ v^* \in Y^*.$$
(ii) The symbol "$\otimes$" behaves like a product, i.e., for all $u, v \in X$, $w, z \in Y$, and $\alpha, \beta \in \mathbb{K}$, we get the following distributive laws:
$$(\alpha u + \beta v) \otimes w = \alpha(u \otimes w) + \beta(v \otimes w)$$
and
$$u \otimes (\alpha w + \beta z) = \alpha(u \otimes w) + \beta(u \otimes z).$$
(iii) Let $\{u_1, \ldots, u_N\}$ and $\{v_1, \ldots, v_M\}$ be bases of the finite-dimensional linear spaces $X$ and $Y$, respectively. Then,
$$\{u_j \otimes v_k : j = 1, \ldots, N,\ k = 1, \ldots, M\}$$
forms a basis of the tensor product $X \otimes Y$.
(iv) Let $X$ and $Y$ be Hilbert spaces. Set
$$(u \otimes v \mid w \otimes z) := (u \mid w)(v \mid z),$$
and generalize this definition to linear combinations in a natural way by letting
$$\Bigl(\sum_{j,k} \alpha_{jk}\,u_j \otimes v_k \Bigm| \sum_{r,s} \beta_{rs}\,w_r \otimes z_s\Bigr) := \sum_{j,k,r,s} \overline{\alpha_{jk}}\,\beta_{rs}\,(u_j \mid w_r)(v_k \mid z_s).$$
Then, we get an inner product on $X \otimes Y$.
(v) Consider situation (iii). If $X$ and $Y$ are finite-dimensional Hilbert spaces and $\{u_j\}$ and $\{v_k\}$ are orthonormal bases of $X$ and $Y$, respectively, then $\{u_j \otimes v_k\}$ is an orthonormal basis of the Hilbert space $X \otimes Y$.
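Solution: Ad (i), (ii). Use elementary computations.
Ad (iii). Define $u_r^* \in X^*$ and $v_s^* \in Y^*$ by
$$u_r^*(u_j) := \delta_{jr}, \quad j = 1, \ldots, N,$$
and
$$v_s^*(v_k) := \delta_{ks}, \quad k = 1, \ldots, M.$$
By (ii), each $a \in X \otimes Y$ can be represented as a linear combination of the elements $u_j \otimes v_k$. Moreover, it follows from
$$\sum_{j,k} \alpha_{jk}\,(u_j \otimes v_k) = 0,$$
applied to the pair $(u_r^*, v_s^*)$, that $\alpha_{rs} = 0$ for all $r, s$.
Ad (iv). We first want to show that the definition of $(\cdot \mid \cdot)$ is independent of the choice of the representatives. For given $w \in X$ and $z \in Y$, define $w^* \in X^*$ and $z^* \in Y^*$ by
$$w^*(u) := (w \mid u) \quad \text{and} \quad z^*(v) := (z \mid v),$$
for all $u \in X$ and $v \in Y$, respectively. According to (58),
$$(u \otimes v)(w^*, z^*) = w^*(u)\,z^*(v) = (w \mid u)(z \mid v) = (w \otimes z \mid u \otimes v). \tag{61}$$
Thus,
$$u \otimes v = 0 \quad \text{implies} \quad (w \otimes z \mid u \otimes v) = 0 \ \text{ for all } w \in X,\ z \in Y.$$
Now let $a, b, c \in X \otimes Y$. Suppose that $a = b$. Then $a - b = 0$. Using linear combinations, it follows from (61) that
$$(c \mid a - b) = 0.$$
Thus, $a = b$ implies $(c \mid a) = (c \mid b)$.
Furthermore, one checks easily that
$$(\alpha a + \beta b \mid c) = \overline{\alpha}\,(a \mid c) + \overline{\beta}\,(b \mid c), \qquad (a \mid b) = \overline{(b \mid a)},$$
for all $a, b, c \in X \otimes Y$ and $\alpha, \beta \in \mathbb{K}$. Finally, we have to show that
$$(a \mid a) \ge 0 \quad \text{and} \quad (a \mid a) = 0 \ \text{ implies } \ a = 0. \tag{62}$$
To this end, let $a = \sum \alpha_{jk}\,(u_j \otimes v_k)$. Set $L := \operatorname{span}\{u_j\}$ and $M := \operatorname{span}\{v_k\}$. Choose orthonormal bases $\{e_j\}$ and $\{f_k\}$ of $L$ and $M$, respectively. By (ii),
$$a = \sum \beta_{jk}\,(e_j \otimes f_k).$$
Since $(e_j \otimes f_k \mid e_r \otimes f_s) = (e_j \mid e_r)(f_k \mid f_s) = \delta_{jr}\delta_{ks}$,
$$(a \mid a) = \sum \overline{\beta_{jk}}\,\beta_{jk}.$$
By (iii), $a = 0$ iff $\beta_{jk} = 0$ for all $j, k$. This yields (62).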
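For finite-dimensional real spaces with fixed bases, the statements (ii), (iv), and (v) can be tried out on coordinates. The sketch below identifies $u \otimes v$ with the array of products $u_j v_k$ (the Kronecker product); this identification, the helper names `dot` and `kron`, and the sample vectors are assumptions of the illustration rather than part of the problem.

```python
def dot(a, b):
    """Euclidean inner product of two real coordinate lists."""
    return sum(x * y for x, y in zip(a, b))


def kron(u, v):
    """Coordinates of u (x) v in the product basis: all products u_j * v_k."""
    return [uj * vk for uj in u for vk in v]


u, v = [1, 2], [3, -1, 2]
w, z = [0, 1], [1, 1, 1]

# (iv): (u (x) v | w (x) z) = (u | w)(v | z)
lhs = dot(kron(u, v), kron(w, z))
rhs = dot(u, w) * dot(v, z)

# (ii): (a*u + b*w) (x) v = a*(u (x) v) + b*(w (x) v)
a, b = 2, -3
left = kron([a * x + b * y for x, y in zip(u, w)], v)
right = [a * s + b * t for s, t in zip(kron(u, v), kron(w, v))]

# (v): products of orthonormal basis vectors form an orthonormal system
e = [[1, 0], [0, 1]]
f = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
basis = [kron(ej, fk) for ej in e for fk in f]
gram = [[dot(p, q) for q in basis] for p in basis]

print(lhs, rhs, left == right)
```

The Gram matrix `gram` of the six product vectors is the identity, which is the coordinate form of the relation $(e_j \otimes f_k \mid e_r \otimes f_s) = \delta_{jr}\delta_{ks}$ used in the proof of (62).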
3.9. The tensor product $X \otimes Y \otimes Z$. Let $X$, $Y$, and $Z$ be linear spaces over $\mathbb{K}$. Similarly to Problem 3.8, we set
$$(u \otimes v \otimes w)(u^*, v^*, w^*) := u^*(u)\,v^*(v)\,w^*(w)$$
for all $u \in X$, $v \in Y$, $w \in Z$, $u^* \in X^*$, $v^* \in Y^*$, $w^* \in Z^*$. Thus, $u \otimes v \otimes w$ is a trilinear form on $X^* \times Y^* \times Z^*$. By definition, the tensor product $X \otimes Y \otimes Z$ consists of all possible finite linear combinations
$$\sum_{j,k,m} u_j \otimes v_k \otimes w_m,$$
where $u_j \in X$, $v_k \in Y$, $w_m \in Z$. Show that
(i) $X \otimes Y \otimes Z$ is a linear space.
(ii) The symbol "$\otimes$" behaves like a product. That is, for all $u, v \in X$, $y \in Y$, $z \in Z$, and $\alpha, \beta \in \mathbb{K}$, we get
$$(\alpha u + \beta v) \otimes y \otimes z = \alpha(u \otimes y \otimes z) + \beta(v \otimes y \otimes z), \text{ and so on.}$$
(iii) If $\{u_j\}$, $\{v_k\}$, $\{w_m\}$ form bases of the finite-dimensional linear spaces $X$, $Y$, $Z$, respectively, then $\{u_j \otimes v_k \otimes w_m\}$ forms a basis of the tensor product $X \otimes Y \otimes Z$.
(iv) Let $X$, $Y$, $Z$ be Hilbert spaces. Define
$$(u \otimes v \otimes w \mid x \otimes y \otimes z) := (u \mid x)(v \mid y)(w \mid z)$$
and extend this definition to linear combinations in a natural way. Then, we get an inner product on $X \otimes Y \otimes Z$.
(v) If $\{u_j\}$, $\{v_k\}$, $\{w_m\}$ form orthonormal bases of the finite-dimensional Hilbert spaces $X$, $Y$, $Z$, respectively, then $\{u_j \otimes v_k \otimes w_m\}$ forms an orthonormal basis of the Hilbert space $X \otimes Y \otimes Z$.
It will be shown in Section 2.22 of AMS Vol. 109 that

Tensor products describe composite states of elementary particles.
4
Eigenvalue Problems for Linear
Compact Symmetric Operators
The validity of theorems on eigenfunctions can be made plausible

by the following observation made by Daniel Bernoulli (1700-1782).
A mechanical system of n degrees of freedom possesses exactly n
eigensolutions. A membrane is, however, a system with an infinite
number of degrees of freedom. This system will, therefore, have an
infinite number of eigenoscillations.
Arnold Sommerfeld, 1900
In 1900 Fredholm had proved the existence of solutions for linear

integral equations of the second kind. His result was sufficient to
solve the boundary-value problems of potential theory. But Fred-
holm's theory did not include the eigenoscillations and the expansion
of arbitrary functions with respect to eigenfunctions. Only Hilbert
solved this problem by using finite-dimensional approximations and
a passage to the limit. In this way he obtained a generalization of
the classical principal-axis transformation for symmetric matrices to
infinite-dimensional matrices. The symmetry of the matrices corre-
sponds to the symmetry of the kernels of integral equations, and
it turns out that the kernels appearing in oscillation problems are
indeed symmetrical.
Otto Blumenthal, 1932
A great master of mathematics passed away when Hilbert died in
Göttingen on February 14, 1943, at the age of eighty-one. In retrospect,
it seems that the era of mathematics upon which he impressed
the seal of his spirit, and which is now sinking below the horizon,
achieved a more perfect balance than has prevailed before or since,
between the mastering of single concrete problems and the formation
of general abstract concepts.
Hermann Weyl, 1944
In this chapter we want to study the following eigenvalue problem:
$$Au = \lambda u, \quad u \in X, \quad \lambda \in \mathbb{K}, \quad u \ne 0, \tag{1}$$
on the Hilbert space $X$ over $\mathbb{K}$, along with applications to integral equations and boundary-value problems.
Each solution $(u, \lambda)$ of (1) is called an eigensolution of $A$, where $u$ is called an eigenvector and $\lambda$ is called an eigenvalue of $A$, respectively. Recall that $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$.
The set of all the eigenvectors $u$ that correspond to a fixed eigenvalue $\lambda$ is called the eigenspace to $\lambda$.
By definition, the eigenvalue $\lambda$ has finite multiplicity iff the corresponding eigenspace has finite dimension.
In this chapter we will assume that $A\colon X \to X$ is a linear compact symmetric operator. We want to show that such operators possess a complete orthonormal system of eigenvectors.
In the next chapter we will study problem (1) for more general symmetric operators $A\colon D(A) \subseteq X \to X$, along with applications to the Laplace and Poisson equations, the heat equation, the wave equation, and the Schrödinger equation in quantum mechanics.
4.1 Symmetric Operators

Definition 1. The linear operator $A\colon D(A) \subseteq X \to X$ on the Hilbert space $X$ over $\mathbb{K}$ is called symmetric iff the domain of definition $D(A)$ is dense in $X$ and
$$(Au \mid v) = (u \mid Av) \quad \text{for all } u, v \in D(A).$$
Proposition 2. Let $A\colon D(A) \subseteq X \to X$ be a linear symmetric operator on the Hilbert space $X$ over $\mathbb{K}$. Then
(i) $(Au \mid u)$ is real for all $u \in D(A)$.
(ii) All the eigenvalues of $A$ are real.
(iii) Two eigenvectors of $A$ with different eigenvalues are orthogonal.
(iv) Let $\{u_1, u_2, \ldots\}$ be a finite or countable complete orthonormal system of eigenvectors of $A$. Then the corresponding system $\{\lambda_1, \lambda_2, \ldots\}$ of eigenvalues contains all the eigenvalues of $A$.
Proof. Ad (i). Let $u \in D(A)$. Then,
$$(Au \mid u) = (u \mid Au) = \overline{(Au \mid u)}.$$
Ad (ii). It follows from (1) that
$$\lambda(u \mid u) = (u \mid Au) = (Au \mid u) = \overline{\lambda}\,(u \mid u)$$
with $(u \mid u) \ne 0$. Hence $\lambda = \overline{\lambda}$.
Ad (iii). From $Au = \lambda u$ and $Av = \mu v$ along with $\lambda, \mu \in \mathbb{R}$ and $\lambda \ne \mu$ it follows that
$$(\lambda - \mu)(u \mid v) = (Au \mid v) - (u \mid Av) = 0,$$
and hence $(u \mid v) = 0$.
Ad (iv). Since $\{u_n\}$ is complete, we have
$$u = \sum_{n=1}^{N} (u_n \mid u)\,u_n \quad \text{for all } u \in X, \tag{2}$$
where $N$ is a natural number or "$N = \infty$." Let $Au = \lambda u$ with $u \ne 0$ and $\lambda \ne \lambda_n$ for all $n$. By (iii), $(u_n \mid u) = 0$ for all $n$. Hence $u = 0$, by (2). This contradicts $u \ne 0$. □
Proposition 3. Let $A\colon X \to X$ be a linear continuous symmetric operator on the Hilbert space $X$ over $\mathbb{K}$ with $X \ne \{0\}$. Then
$$\sup_{\|u\|=1} |(Au \mid u)| = \|A\|.$$
Proof. Set $\alpha := \sup_{\|u\|=1} |(Au \mid u)|$. Since $A$ is linear,
$$|(Av \mid v)| \le \alpha\|v\|^2 \quad \text{for all } v \in X.$$
Moreover, by the Schwarz inequality,
$$|(Au \mid u)| \le \|Au\|\,\|u\| \le \|A\|\,\|u\|^2 \quad \text{for all } u \in X.$$
Hence $\alpha \le \|A\|$. To prove that $\|A\| \le \alpha$, we set
$$v_{\pm} := \lambda u \pm \lambda^{-1}Au, \quad \lambda > 0.$$
It follows from $(A^2u \mid u) = (Au \mid Au)$ and $\|Au\|^2 = (Au \mid Au)$ that
$$\|Au\|^2 = 4^{-1}\bigl[(Av_+ \mid v_+) - (Av_- \mid v_-)\bigr] \le 4^{-1}\alpha\bigl(\|v_+\|^2 + \|v_-\|^2\bigr) = 2^{-1}\alpha\bigl(\lambda^2\|u\|^2 + \lambda^{-2}\|Au\|^2\bigr).$$
Assume first that $Au \ne 0$. Letting $\lambda^2 = \|Au\|$ and $\|u\| = 1$, we find that $\|Au\|^2 \le \alpha\|Au\|$ if $Au \ne 0$. Hence
$$\|Au\| \le \alpha \quad \text{for all } u \in X \text{ with } \|u\| = 1.$$
This implies $\|A\| \le \alpha$. □
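In two real dimensions the identity of Proposition 3 can be observed directly by scanning the unit circle. The following is a minimal sketch under the assumption $X = \mathbb{R}^2$ with the Euclidean inner product; the symmetric sample matrix, the grid size, and the helper names are choices of this illustration only.

```python
import math

# A symmetric sample matrix (an assumption of this sketch); its eigenvalues are (-1 +/- sqrt(29))/2.
A = [[2.0, 1.0], [1.0, -3.0]]


def apply(matrix, u):
    """Matrix-vector product for a 2x2 matrix."""
    return [matrix[0][0] * u[0] + matrix[0][1] * u[1],
            matrix[1][0] * u[0] + matrix[1][1] * u[1]]


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


steps = 200000
quad_sup = 0.0   # approximates sup |(Au | u)| over ||u|| = 1
norm_sup = 0.0   # approximates sup ||Au|| over ||u|| = 1, i.e. ||A||
for j in range(steps):
    # Half the circle suffices, since u and -u give the same values.
    t = math.pi * j / steps
    u = [math.cos(t), math.sin(t)]
    Au = apply(A, u)
    quad_sup = max(quad_sup, abs(dot(Au, u)))
    norm_sup = max(norm_sup, math.sqrt(dot(Au, Au)))

print(quad_sup, norm_sup)
```

Both suprema agree with the largest absolute value of an eigenvalue, $(1 + \sqrt{29})/2$, up to the resolution of the grid. For a nonsymmetric matrix the two numbers generally differ, which is why symmetry enters the proof through $(A^2u \mid u) = (Au \mid Au)$.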
4.2 The Hilbert-Schmidt Theory

Theorem 4.A. Let $A\colon X \to X$ be a linear compact symmetric operator on the separable Hilbert space $X$ over $\mathbb{K}$ with $X \ne \{0\}$. Then, the following hold true:
(i) The operator $A$ has a complete orthonormal system of eigenvectors.
(ii) All the eigenvalues of $A$ are real, and each eigenvalue $\lambda \ne 0$ of $A$ has finite multiplicity.
(iii) Two eigenvectors of $A$ that correspond to different eigenvalues are orthogonal.
(iv) If the operator $A$ has a countable set of eigenvalues (e.g., $\lambda = 0$ is not an eigenvalue of $A$ and $\dim X = \infty$), then the eigenvalues of $A$ form a sequence $(\lambda_n)$ such that
$$\lambda_n \to 0 \quad \text{as } n \to \infty.$$
Proof of Theorem 4.A under the additional assumption (A). Suppose that
(A) $Au = 0$ implies $u = 0$, and let $\dim X = \infty$.
Since $X \ne \{0\}$, this implies $A \ne 0$ and hence $\|A\| \ne 0$.
Step 1 is decisive.
Step 1: Variational problem for constructing an eigensolution. We consider the maximum problem
$$|(Au \mid u)| = \max!, \quad \|u\| = 1. \tag{3}$$
By Proposition 3 in Section 4.1, the maximal value is equal to $\|A\|$. Thus, there exists a sequence $(v_n)$ with $\|v_n\| = 1$ for all $n$ such that
$$|(Av_n \mid v_n)| \to \|A\| \quad \text{as } n \to \infty.$$
Set $\alpha_n := (Av_n \mid v_n)$. Since the real sequence $(\alpha_n)$ is bounded, there exists a convergent subsequence of $(\alpha_n)$. Consequently, there exist a subsequence, again denoted by $(v_n)$, and a real number $\lambda_1$ such that
$$(Av_n \mid v_n) \to \lambda_1 \quad \text{as } n \to \infty.$$
Therefore,
$$|\lambda_1| = \|A\| > 0.$$
This implies $\|Av_n\| \le \|A\|\,\|v_n\| = |\lambda_1|$, and hence
$$0 \le \|Av_n - \lambda_1 v_n\|^2 = \|Av_n\|^2 - 2\lambda_1(Av_n \mid v_n) + \lambda_1^2 \le 2\lambda_1^2 - 2\lambda_1(Av_n \mid v_n) \to 0 \quad \text{as } n \to \infty.$$
The operator $A$ is compact. Thus, there exists a subsequence, again denoted by $(v_n)$, such that $(Av_n)$ converges. Since
$$Av_n - \lambda_1 v_n \to 0 \quad \text{as } n \to \infty,$$
and $\lambda_1 \ne 0$, the sequence $(v_n)$ also converges to a certain element $u_1$, i.e.,
$$v_n \to u_1 \quad \text{as } n \to \infty.$$
This implies
$$Au_1 - \lambda_1 u_1 = 0, \quad u_1 \in X, \quad \|u_1\| = 1,$$
i.e., $(u_1, \lambda_1)$ is an eigensolution of $A$.
Step 2: Induction. Let
$$Y := \{u \in X : (u \mid u_1) = 0\}.$$
Then, $Y$ is a closed linear subspace of $X$. The key to our induction argument is the relation
$$A(Y) \subseteq Y, \tag{4}$$
i.e., the Hilbert space $Y$ is invariant with respect to the operator $A$. In fact, let $u \in Y$. Then
$$(Au \mid u_1) = (u \mid Au_1) = \lambda_1(u \mid u_1) = 0,$$
and hence $Au \in Y$.
Since $\dim X = \infty$, $Y \ne \{0\}$. Furthermore, $A \ne 0$ on $Y$. Otherwise there would exist an element $u \in Y$ with $u \ne 0$ and $Au = 0$, which contradicts assumption (A).
Therefore, we may apply Step 1 to the restricted operator $A\colon Y \to Y$. This way we obtain the eigensolution $(u_2, \lambda_2)$, i.e.,
$$Au_2 = \lambda_2 u_2, \quad u_2 \in Y, \quad \|u_2\| = 1.$$
According to Step 1, $|\lambda_2|$ is equal to the norm of $A\colon Y \to Y$. Hence
$$|\lambda_2| = \sup_{\|v\|=1,\ v \in Y} \|Av\|.$$
By Step 1,
$$|\lambda_1| = \|A\| = \sup_{\|v\|=1,\ v \in X} \|Av\|,$$
and hence
$$|\lambda_1| \ge |\lambda_2| > 0.$$
We now set $Z := \{u \in Y : (u \mid u_2) = 0\}$ and continue this procedure. This way we obtain a countable system $\{u_n, \lambda_n\}$ of eigensolutions, i.e.,
$$Au_n = \lambda_n u_n, \quad n = 1, 2, \ldots, \qquad |\lambda_1| \ge |\lambda_2| \ge \cdots > 0,$$
where $\{u_n\}$ is an orthonormal system.
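The construction of Steps 1 and 2 can be imitated for a symmetric matrix. The sketch below is not the variational argument of the proof: it swaps in power iteration to locate an eigenvector with eigenvalue of largest absolute value, and then repeats the search on the orthogonal complement of the vectors already found, as in Step 2. The sample matrix, the starting vector, the iteration count, and all function names are assumptions of this illustration; the method also presupposes that on each invariant subspace the eigenvalue of largest absolute value is unique.

```python
import math


def matvec(A, u):
    return [sum(a * x for a, x in zip(row, u)) for row in A]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def project_out(u, found):
    """Remove from u its components along the orthonormal vectors in `found`."""
    for e in found:
        c = dot(e, u)
        u = [x - c * y for x, y in zip(u, e)]
    return u


def normalize(u):
    n = math.sqrt(dot(u, u))
    return [x / n for x in u]


def eigensolutions(A, start, iterations=500):
    """Return eigenvalues and orthonormal eigenvectors, largest |eigenvalue| first."""
    found, values = [], []
    for _ in range(len(A)):
        # Work inside the orthogonal complement of the eigenvectors found so far.
        u = normalize(project_out(start, found))
        for _ in range(iterations):
            u = normalize(project_out(matvec(A, u), found))
        found.append(u)
        values.append(dot(matvec(A, u), u))  # the value (Au | u) with ||u|| = 1
    return values, found


# Symmetric and injective sample matrix with eigenvalues 2 + sqrt(2), 2, 2 - sqrt(2).
A = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
values, vectors = eigensolutions(A, [1.0, 0.5, 0.25])
print(values)
```

The projection step is the numerical counterpart of relation (4): since the complement is invariant under the operator, iterating inside it never reintroduces the eigenvectors already found.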

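Step 3: We show that
$$\lambda_n \to 0 \quad \text{as } n \to \infty.$$
Otherwise, $|\lambda_n| \ge \text{const} > 0$ for all $n$, and hence the sequence $(\lambda_n^{-1}u_n)$ is bounded. The operator $A$ is compact. Since
$$A(\lambda_n^{-1}u_n) = u_n, \quad n = 1, 2, \ldots,$$
the sequence $(u_n)$ contains a convergent subsequence. But this is impossible, since $(u_n \mid u_m) = 0$ for $n \ne m$, and hence
$$\|u_n - u_m\|^2 = \|u_n\|^2 + \|u_m\|^2 = 2 \quad \text{for all } n, m \text{ with } n \ne m,$$
i.e., no subsequence of $(u_n)$ is Cauchy.
Step 4: We show that
$$Au = \sum_{n=1}^{\infty} \lambda_n (u_n \mid u)\,u_n \quad \text{for all } u \in X. \tag{5}$$
To this end, let
$$w_m := u - \sum_{n=1}^{m} (u_n \mid u)\,u_n, \quad m = 1, 2, \ldots.$$
Set $V := \{u \in X : (u \mid u_j) = 0,\ j = 1, \ldots, m\}$. By Step 2, $A(V) \subseteq V$, and $|\lambda_{m+1}|$ equals the norm of $A\colon V \to V$, i.e.,
$$\|Av\| \le |\lambda_{m+1}|\,\|v\| \quad \text{for all } v \in V.$$
Obviously, $w_m \in V$, and hence
$$\|Aw_m\| \le |\lambda_{m+1}|\,\|w_m\|.$$
Since
$$\|w_m\|^2 = \|u\|^2 - \sum_{n=1}^{m} |(u_n \mid u)|^2 \le \|u\|^2 \quad \text{for all } m,$$
we get
$$\|Aw_m\| \le |\lambda_{m+1}|\,\|u\| \to 0 \quad \text{as } m \to \infty.$$
Hence
$$Aw_m = Au - \sum_{n=1}^{m} (u_n \mid u)\lambda_n u_n \to 0 \quad \text{as } m \to \infty.$$
This is (5).
Step 5: We show that, for all $u \in X$,
$$u = \sum_{n=1}^{\infty} (u_n \mid u)\,u_n. \tag{6}$$
By Proposition 5 in Section 3.1, each Fourier series is convergent, i.e., there is some $v \in X$ such that
$$v = \sum_{n=1}^{\infty} (u_n \mid u)\,u_n.$$
Hence, by (5),
$$Av = \sum_{n=1}^{\infty} (u_n \mid u)\,Au_n = \sum_{n=1}^{\infty} \lambda_n (u_n \mid u)\,u_n = Au.$$
By assumption (A), it follows from $A(v - u) = 0$ that $v - u = 0$.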
Step 6: We show that each eigenvalue $\lambda \ne 0$ of $A$ has finite multiplicity.
Since $\{u_n\}$ forms a complete orthonormal system in $X$, $\lambda$ is identical to some $\lambda_m$, by Proposition 2 in Section 4.1.
For simplifying notation, assume first that $\lambda = \lambda_1$. Since $\lambda_n \ne 0$ and $\lambda_n \to 0$ as $n \to \infty$, there exists a number $N$ such that $\lambda_1 = \cdots = \lambda_N$ and $\lambda_j \ne \lambda_1$ for all $j > N$. Let
$$Au = \lambda u, \quad u \ne 0.$$
By (5),
$$\lambda u = \sum_{n=1}^{\infty} \lambda_n (u_n \mid u)\,u_n = \lambda_1 \sum_{n=1}^{N} (u_n \mid u)\,u_n,$$
since two eigenvectors of $A$ with different eigenvalues are orthogonal, i.e., $(u_n \mid u) = 0$ for all $n > N$. Hence $\{u_1, \ldots, u_N\}$ forms a basis of the eigenspace to $\lambda$, i.e., $\lambda$ has the multiplicity $N$.
The same argument applies to the general case where $\lambda = \lambda_m$.
The proof of Theorem 4.A is complete under the additional assumption (A). □
Proof of Theorem 4.A under the additional assumption (B). Suppose that
(B) $Au = 0$ implies $u = 0$, and let $\dim X = N$, where $N = 1, 2, \ldots$.
In this case, it follows from Step 2 that there exist eigensolutions
$$Au_n = \lambda_n u_n, \quad n = 1, \ldots, M,$$
where $S := \{u_1, \ldots, u_M\}$ is an orthonormal system. Let $M$ be the largest possible number. The construction from Step 2 shows that $M = N$. Hence the system $S$ with $M = N$ is complete.
This finishes the proof of Theorem 4.A under the additional assumption (B). □
Proof of Theorem 4.A. Finally, let us consider the general case. We may assume that $\lambda = 0$ is an eigenvalue of $A$. Otherwise, we meet assumption (A) or (B). Set
$$N(A) := \{u \in X : Au = 0\}.$$
Then, $N(A) \ne \{0\}$. Since the operator $A$ is continuous, $N(A)$ is a closed linear subspace of $X$. In fact, if $Au_n = 0$ for all $n$ and $u_n \to u$ as $n \to \infty$, then $Au = 0$.
By Proposition 1 in Section 3.3, there exists a complete orthonormal system $\{w_k\}$ in $N(A)$. Hence
$$Aw_k = 0 \quad \text{for all } k.$$
Moreover, it follows from Corollary 1 in Section 2.9 that for each $u \in X$ there exists the unique decomposition
$$u = w + z, \quad w \in N(A), \quad z \in N(A)^{\perp}. \tag{7}$$
Recall that
$$N(A)^{\perp} := \{z \in X : (z \mid w) = 0 \text{ for all } w \in N(A)\}.$$
Thus, $N(A)^{\perp}$ is a closed linear subspace of $X$. We have these conditions:
(a) The operator $A$ maps $N(A)^{\perp}$ into $N(A)^{\perp}$.
(b) If $Az = 0$ with $z \in N(A)^{\perp}$, then $z = 0$.
Ad (a). Let $z \in N(A)^{\perp}$. Then
$$(Az \mid w) = (z \mid Aw) = 0 \quad \text{for all } w \in N(A),$$
and hence $Az \in N(A)^{\perp}$.
Ad (b). If $Az = 0$ with $z \in N(A)^{\perp}$, then $z \in N(A) \cap N(A)^{\perp}$. By the uniqueness of the decomposition (7), $z = 0$.
We now apply Theorem 4.A with the additional assumption (A) or (B) to the restricted operator
$$A\colon N(A)^{\perp} \to N(A)^{\perp}.$$
This way we get a complete orthonormal system $\{u_n\}$ of eigenvectors of $A$ on $N(A)^{\perp}$. Recall that $\{w_k\}$ forms a complete orthonormal system in $N(A)$.
By (7), for each $u \in X$,
$$u = w + z = \sum_k (w_k \mid w)\,w_k + \sum_n (u_n \mid z)\,u_n.$$
Since $w \in N(A)$ and $z \in N(A)^{\perp}$, we get $(w_k \mid z) = 0$ for all $k$ and $(u_n \mid w) = 0$ for all $n$. Hence
$$u = \sum_k (w_k \mid u)\,w_k + \sum_n (u_n \mid u)\,u_n.$$
Consequently, $\{w_1, u_1, w_2, u_2, \ldots\}$ represents a complete orthonormal system of eigenvectors of $A$.
The proof of Theorem 4.A is complete. □
4.3 The Fredholm Alternative

Let us consider the equation
$$\lambda u - Au = b, \quad u \in X, \tag{8}$$
along with the homogeneous problem
$$\lambda v - Av = 0, \quad v \in X. \tag{8*}$$
Let us make the following assumption:
(H) The operator $A\colon X \to X$ is linear, compact, and symmetric on the separable Hilbert space $X$ over $\mathbb{K}$ with $X \ne \{0\}$.
Theorem 4.B. Assume (H). We are given $b \in X$ and the number $\lambda \in \mathbb{K}$ with $\lambda \ne 0$.
Then, the original equation (8) has a solution iff
$$(b \mid v) = 0 \quad \text{for all solutions } v \text{ of (8*)}. \tag{9}$$
Corollary 1. Assume (H). We are given $b \in X$ and $\lambda \in \mathbb{K}$ with $\lambda \ne 0$. Suppose that equation (8) has at most one solution. Then, the following hold true:
(i) There exists the linear continuous operator $(\lambda I - A)^{-1}\colon X \to X$.
(ii) Equation (8) has the unique solution $u = (\lambda I - A)^{-1}b$.
This corollary tells us that the following important principle holds true for the original problem (8):
Uniqueness implies existence.
Corollary 2. Assume (H) with $\mathbb{K} = \mathbb{C}$. Then, the following are met:
(i) If $\dim X < \infty$, then the spectrum $\sigma(A)$ of the operator $A$ consists precisely of all the eigenvalues of the operator $A$.
(ii) If $\dim X = \infty$, then the spectrum $\sigma(A)$ consists of all the eigenvalues of $A$ together with the point $\lambda = 0$.
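Proof of Theorem 4.B. Suppose first that the operator $A$ has a countable system of nonzero eigenvalues.
By (5), there exists an orthonormal system $\{u_n\}$ in $X$ such that
$$Au = \sum_{n=1}^{\infty} \lambda_n (u_n \mid u)\,u_n, \tag{10}$$
along with $Au_n = \lambda_n u_n$ for all $n$. The system $\{\lambda_n\}$ contains all the nonzero eigenvalues of $A$, and
$$\lambda_n \to 0 \quad \text{as } n \to \infty. \tag{11}$$
Furthermore,
$$(u_n \mid Au) = (Au_n \mid u) = \lambda_n (u_n \mid u) \quad \text{for all } u \in X \text{ and all } n. \tag{12}$$
Case 1: Suppose that $\lambda \ne \lambda_n$ for all $n$, i.e., equation (8*) only has the trivial solution $v = 0$. Then, equation (8) has at most one solution.
Let $u$ be a solution of (8). Then, by (10),
$$\lambda u = b + Au = b + \sum_{n=1}^{\infty} \lambda_n (u_n \mid u)\,u_n.$$
By (8) and (12),
$$(\lambda - \lambda_n)(u_n \mid u) = (u_n \mid (\lambda I - A)u) = (u_n \mid b) \quad \text{for all } n. \tag{13}$$
Hence
$$u = \lambda^{-1}\Bigl(b + \sum_{n=1}^{\infty} \alpha_n (u_n \mid b)\,u_n\Bigr), \quad \text{where } \alpha_n := \frac{\lambda_n}{\lambda - \lambda_n}. \tag{14}$$
Conversely, it follows from the Bessel inequality
$$\sum_{n=1}^{\infty} |(u_n \mid b)|^2 \le \|b\|^2$$
along with $\lambda_n \to 0$ as $n \to \infty$ that $|\alpha_n| \le \text{const}$ for all $n$, and hence
$$\sum_{n=1}^{\infty} |\alpha_n (u_n \mid b)|^2 \le \text{const}\,\|b\|^2.$$
By Proposition 5 in Section 3.1, the series (14) is convergent. It follows from (10) with $u = b$ and (14) that
$$Au = \lambda^{-1}\Bigl\{Ab + \sum_{n=1}^{\infty} \alpha_n \lambda_n (u_n \mid b)\,u_n\Bigr\} = \sum_{n=1}^{\infty} \lambda^{-1}\lambda_n (1 + \alpha_n)(u_n \mid b)\,u_n$$
and
$$\lambda u = b + \sum_{n=1}^{\infty} \alpha_n (u_n \mid b)\,u_n.$$
Since $\lambda^{-1}\lambda_n(1 + \alpha_n) = \alpha_n$, we obtain
$$\lambda u - Au = b,$$
i.e., $u$ is a solution of (8). Since this solution is unique, the inverse operator $(\lambda I - A)^{-1}\colon X \to X$ exists. In addition, (14) tells us that
$$\|u\|^2 = |\lambda^{-2}|\Bigl(\|b\|^2 + \sum_{n=1}^{\infty} \bigl(2\alpha_n |(u_n \mid b)|^2 + \alpha_n^2 |(u_n \mid b)|^2\bigr)\Bigr) \le |\lambda^{-2}|\Bigl(\|b\|^2 + \sum_{n=1}^{\infty} \text{const}\,|(u_n \mid b)|^2\Bigr) \le \text{const}\,\|b\|^2.$$
This implies
$$\|u\| \le \text{const}\,\|b\| \quad \text{for all } b \in X.$$
Hence the linear operator $(\lambda I - A)^{-1}\colon X \to X$ is continuous.
Case 2: Suppose $\lambda$ is an eigenvalue of $A$, i.e., $\lambda = \lambda_m$ for some $m$. To simplify notation, let us assume that $\lambda = \lambda_1$. Then $\lambda_1 = \lambda_2 = \cdots = \lambda_N$ for some natural number $N$ and $\lambda_n \ne \lambda_1$ for all $n > N$.
If $u$ is a solution of (8), then it follows from (13) that
$$(u_n \mid b) = 0 \quad \text{for all } n = 1, \ldots, N. \tag{15}$$
This is equivalent to condition (9). Hence
(16)
Conversely, let condition (15) be fulfilled. As in Case 1, one checks easily that $u$ from (16) satisfies $\lambda u - Au = b$, i.e., $u$ is a solution of (8).
Finally, observe that if the operator $A$ has only a finite number of nonzero eigenvalues $\lambda_n$, then the series from (14) and (16) reduce to finite sums. □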
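The two cases of the proof can be followed on a symmetric $2 \times 2$ matrix, where the eigensolutions are known explicitly. The sketch below assumes $X = \mathbb{R}^2$ and the matrix $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ with eigenvalues $1$ and $3$; it determines the Fourier coefficients of $u$ from relation (13) and reports failure when the solvability condition (9) is violated. The function names and the tolerance are hypothetical choices of this illustration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def solve(eigenpairs, lam, b, tol=1e-12):
    """Solve lam*u - A*u = b from (lam - lam_n)(u_n | u) = (u_n | b); None if unsolvable."""
    u = [0.0] * len(b)
    for lam_n, u_n in eigenpairs:
        coeff = dot(u_n, b)
        if abs(lam - lam_n) < tol:
            if abs(coeff) > tol:
                return None      # b is not orthogonal to the eigenvector: condition (9) fails
            continue             # the component along u_n is free; this sketch chooses 0
        u = [x + coeff / (lam - lam_n) * y for x, y in zip(u, u_n)]
    return u


s = 1.0 / math.sqrt(2.0)
A = [[2.0, 1.0], [1.0, 2.0]]
eigenpairs = [(1.0, [s, -s]), (3.0, [s, s])]   # complete orthonormal system of eigenvectors of A


def residual(lam, u, b):
    """Largest component of lam*u - A*u - b in absolute value."""
    Au = [dot(row, u) for row in A]
    return max(abs(lam * x - y - c) for x, y, c in zip(u, Au, b))


u_unique = solve(eigenpairs, 2.0, [1.0, 0.0])    # Case 1: lam is not an eigenvalue
u_special = solve(eigenpairs, 3.0, [1.0, -1.0])  # Case 2: b orthogonal to the eigenvector (s, s)
u_none = solve(eigenpairs, 3.0, [1.0, 1.0])      # Case 2: b not orthogonal, hence no solution
print(u_unique, u_special, u_none)
```

In Case 2 the solution is not unique: any multiple of the eigenvector to $\lambda$ may be added, and the sketch simply returns the solution orthogonal to that eigenvector.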
Proof of Corollary 1. If equation (8) has at most one solution, then
$$\lambda u - Au = \lambda v - Av \quad \text{implies} \quad u = v.$$
By the linearity of $A$, this is equivalent to the fact that $\lambda w - Aw = 0$ implies $w = 0$. The assertion follows now from Case 1 in the preceding proof. □
Proof of Corollary 2. Let $\lambda \ne \lambda_n$ for all $n$ and $\lambda \ne 0$. By Corollary 1(i), the point $\lambda \in \mathbb{C}$ belongs to the resolvent set of $A$.
Ad (i). Statement (i) follows from Problem 1.4. Let us give a different proof. If $\lambda = 0$ is an eigenvalue of $A$, then $0 \in \sigma(A)$, by Section 1.25.
If $\lambda = 0$ is not an eigenvalue of $A$, then $Av = 0$ implies $v = 0$. By Theorem 4.A, the operator $A$ has a complete orthonormal system $\{u_n\}$ of eigenvectors. Set
$$u := \sum_n \lambda_n^{-1}(u_n \mid b)\,u_n.$$
Then, $Au = b$ and $\|u\| \le \text{const}\,\|b\|$. Thus, the operator $A^{-1}\colon X \to X$ is continuous, i.e., $\lambda = 0$ belongs to the resolvent set of $A$.
Ad (ii). Suppose first that the operator $A$ has only a finite number of nonzero eigenvalues. Since all these eigenvalues have finite multiplicity and $\dim X = \infty$, it follows from Theorem 4.A that there is some $v \ne 0$ for which $Av = 0$, i.e., $\lambda = 0$ belongs to the spectrum $\sigma(A)$.
Suppose now that the operator $A$ has a countable set $\{\lambda_n\}$ of nonzero eigenvalues. Then, $\lambda_n \to 0$ as $n \to \infty$. Since the spectrum $\sigma(A)$ is compact, the limit point $\lambda = 0$ belongs to $\sigma(A)$. □

4.4 Applications to Integral Equations

We want to study the following integral equation:
$$\int_a^b A(x, y)u(y)\,dy = \lambda u(x), \quad a \le x \le b. \tag{17}$$
Let $-\infty < a < b < \infty$. We are looking for eigensolutions $\lambda \in \mathbb{R}$ and $u \in L_2(a, b)$ with $u \ne 0$. To this end, we assume the following:
(H1) The function $A\colon [a, b] \times [a, b] \to \mathbb{R}$ is continuous.
(H2) The function $A$ is symmetric, i.e.,
$$A(x, y) = A(y, x) \quad \text{for all } x, y \in [a, b].$$
We set $X := L_2(a, b)$ along with the inner product
$$(u \mid v) := \int_a^b u(x)v(x)\,dx.$$
We also define the integral operator
$$(Au)(x) := \int_a^b A(x, y)u(y)\,dy.$$
Then, the original equation (17) can be written in the following form:
$$Au = \lambda u, \quad \lambda \in \mathbb{R}, \quad u \in X, \quad u \ne 0. \tag{17*}$$
Proposition 1 (Eigensolutions). Under assumptions (H1) and (H2), the following hold true:
(i) The original integral equation (17) has a countable system of eigenfunctions $\{u_1, u_2, \ldots\}$, which forms a complete orthonormal system in the Hilbert space $L_2(a, b)$.
(ii) Two eigenfunctions $u$ and $v$ of (17) which correspond to different eigenvalues are orthogonal in $L_2(a, b)$, i.e., $(u \mid v) = 0$.
(iii) Each nonzero eigenvalue of (17) has finite multiplicity.
(iv) If the integral equation (17) has a countable number of eigenvalues (e.g., $\lambda = 0$ is not an eigenvalue of (17)), then all the nonzero eigenvalues of (17) form a sequence $(\lambda_n)$ with $\lambda_n \to 0$.
(v) For each eigenvalue $\lambda \ne 0$ of (17), the eigenfunctions $u$ are continuous on $[a, b]$.
By (i), for each $u \in L_2(a, b)$, the Fourier series
$$u = \sum_{n=1}^{\infty} (u_n \mid u)\,u_n \tag{18}$$
converges in $L_2(a, b)$, i.e.,
$$\lim_{m \to \infty} \int_a^b \Bigl(u(x) - \sum_{n=1}^{m} (u_n \mid u)\,u_n(x)\Bigr)^2 dx = 0.$$
Corollary 2 (Classical convergence). Suppose that the function $u$ allows the following representation:
$$u(x) = \int_a^b A(x, y)v(y)\,dy \quad \text{for all } x \in [a, b],$$
where $v \in L_2(a, b)$. Then, the Fourier series
$$u(x) = \sum_{n=1}^{\infty} (u_n \mid u)\,u_n(x) \tag{18*}$$
converges absolutely and uniformly on the interval $[a, b]$.
In addition, we have $(u_n \mid u) = 0$ if the eigenvector $u_n$ corresponds to the eigenvalue $\lambda = 0$.
Proposition 1 follows immediately from Theorem 4.A by using the following result.
Lemma 3. Assume (H1) above and let $X := L_2(a, b)$. Then, the following hold true:
(a) The operator $A\colon X \to X$ is linear and compact.
(b) If $u \in X$, then $Au \in C[a, b]$.
(c) If, in addition, (H2) holds, then the operator $A\colon X \to X$ is symmetric.
Proof. Ad (a), (b). Let $u \in X$, and let $\|u\|$ denote the norm of $u$ in $X$. By the Schwarz inequality,
$$\int_a^b 1 \cdot |u(y)|\,dy \le \Bigl(\int_a^b dy\Bigr)^{1/2}\Bigl(\int_a^b |u(y)|^2\,dy\Bigr)^{1/2} = (b - a)^{1/2}\,\|u\|.$$
Set $v := Au$, i.e.,
$$v(x) = \int_a^b A(x, y)u(y)\,dy \quad \text{for all } x \in [a, b].$$
Since the set $[a, b] \times [a, b]$ is compact, the function $A$ is uniformly continuous on $[a, b] \times [a, b]$. Thus, for each $\varepsilon > 0$, there is a $\delta > 0$ such that
$$x, z \in [a, b] \text{ and } |x - z| < \delta \quad \text{implies} \quad \alpha := \max_{a \le y \le b} |A(x, y) - A(z, y)| < \varepsilon.$$
Hence
$$|v(x) - v(z)| \le \alpha \int_a^b |u(y)|\,dy \le \varepsilon (b - a)^{1/2}\,\|u\|, \tag{19}$$
for all $x, z \in [a, b]$ with $|x - z| < \delta$. This proves the continuity of the function $v$ on $[a, b]$.
Moreover, we also get
$$\max_{a \le x \le b} |v(x)| \le \max_{a \le x, y \le b} |A(x, y)| \int_a^b |u(y)|\,dy \le \text{const}\,\|u\|. \tag{20}$$
This implies
$$\|Au\| = \Bigl(\int_a^b |v(x)|^2\,dx\Bigr)^{1/2} \le \text{const}\,\|u\|. \tag{21}$$
Obviously, the operator $A\colon X \to X$ is linear. Consequently, relation (21) tells us that $A\colon X \to X$ is continuous.
Let $M$ be a bounded set in $X$. Then, it follows from (19) and (20) along with the Arzelà-Ascoli theorem (Example 7 in Section 1.11) that the set $A(M)$ is relatively compact in $C[a, b]$.
Each relatively compact set in $C[a, b]$ is also relatively compact in $X = L_2(a, b)$. In fact, if $v_n \to v$ in $C[a, b]$ as $n \to \infty$, then
$$\|v_n - v\| = \Bigl(\int_a^b (v_n(x) - v(x))^2\,dx\Bigr)^{1/2} \le \max_{a \le x \le b} |v_n(x) - v(x)|\,(b - a)^{1/2} \to 0 \quad \text{as } n \to \infty.$$
Thus, the set $A(M)$ is relatively compact in $X$, and hence $A\colon X \to X$ is compact.
Ad (c). It follows from the Tonelli theorem¹ (cf. Iterated Integration in the appendix) that, for all $u, v \in X$,
$$(Au \mid v) = \int_a^b \Bigl(\int_a^b A(x, y)u(y)\,dy\Bigr)v(x)\,dx = \int_a^b \Bigl(\int_a^b A(x, y)v(x)\,dx\Bigr)u(y)\,dy = (u \mid Av),$$
since $A(x, y) = A(y, x)$ for all $x, y \in [a, b]$. □

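Proof of Corollary 2. We are given $u \in L_2(a, b)$. Let $\lambda_n$ denote the eigenvalue to the eigenvector $u_n$, i.e., $Au_n = \lambda_n u_n$, $\lambda_n \in \mathbb{R}$. Observe that $\lambda_n = 0$ is possible for some indices $n$. By (18),
$$Au = \sum_{n=1}^{\infty} (u_n \mid Au)\,u_n. \tag{22}$$
This series converges in $L_2(a, b)$. Observe that
$$(u_n \mid Au) = (Au_n \mid u) = \lambda_n (u_n \mid u).$$
We now want to study the classical convergence of the series
$$\sum_{n=1}^{\infty} (u_n \mid Au)\,u_n(x) \quad \text{for all } x \in [a, b].$$
Let $k, m \ge 1$. By the classic Schwarz inequality (10) from Chapter 1,
$$\sum_{n=k}^{k+m} |(u_n \mid Au)\,u_n(x)| = \sum_{n=k}^{k+m} |(u_n \mid u)\,\lambda_n u_n(x)| \le \Bigl(\sum_{n=k}^{k+m} |(u_n \mid u)|^2\Bigr)^{1/2}\Bigl(\sum_{n=k}^{k+m} |\lambda_n u_n(x)|^2\Bigr)^{1/2} \quad \text{for all } x \in [a, b]. \tag{23}$$
By the Bessel inequality (15) from Chapter 3, it follows that for all $x \in [a, b]$ and $k, m \ge 1$,
$$\sum_{n=k}^{k+m} |\lambda_n u_n(x)|^2 = \sum_{n=k}^{k+m} |(A(x, \cdot) \mid u_n)|^2 \le \|A(x, \cdot)\|^2 = \int_a^b |A(x, y)|^2\,dy \le (b - a)\Bigl(\sup_{a \le x, y \le b} |A(x, y)|\Bigr)^2 \le \text{const}.$$
Since the Fourier series $\sum_n (u_n \mid u)\,u_n$ converges, it follows from Proposition 5 in Section 3.1 that the series
$$\sum_{n=1}^{\infty} |(u_n \mid u)|^2$$
is also convergent. Thus, for each $\varepsilon > 0$, there exists a number $n_0(\varepsilon)$ such that
$$\sum_{n=k}^{k+m} |(u_n \mid u)|^2 < \varepsilon^2 \quad \text{for all } k \ge n_0(\varepsilon),\ m \ge 1.$$
By (23),
$$\sum_{n=k}^{k+m} |(u_n \mid Au)\,u_n(x)| \le \text{const} \cdot \varepsilon,$$
for all $k \ge n_0(\varepsilon)$, $m \ge 1$, and $x \in [a, b]$. This proves the absolute and uniform convergence of the series $\sum_{n=1}^{\infty} (u_n \mid Au)\,u_n(x)$ on $[a, b]$. □

¹ Observe that $\int_a^b |A(x, y)|\,|u(y)|\,dy \le \text{const} \int_a^b |u(y)|\,dy$.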
We now study the following nonhomogeneous integral equation:
$$\int_a^b A(x, y)u(y)\,dy - \lambda u(x) = h(x), \quad a \le x \le b. \tag{24}$$
To this end, we need the corresponding homogeneous equation
$$\int_a^b A(x, y)v(y)\,dy - \lambda v(x) = 0, \quad a \le x \le b. \tag{24*}$$
Proposition 4 (The Fredholm alternative). Assume (H1) and (H2). Let the function $h \in L_2(a, b)$ and the real number $\lambda \ne 0$ be given. Then, the following hold true:
(i) If $\lambda$ is not an eigenvalue of the homogeneous integral equation (24*), then the original equation (24) has a unique solution $u \in L_2(a, b)$.
(ii) If $\lambda$ is an eigenvalue of (24*), then (24) has a solution $u \in L_2(a, b)$ iff
$$\int_a^b h(x)v(x)\,dx = 0,$$
for all the eigenfunctions $v$ corresponding to $\lambda$.
(iii) If $h \in C[a, b]$, then each solution $u$ of (24) is continuous on $[a, b]$.
This follows from Theorem 4.B along with Lemma 3. Observe that equation (24) can be written in the following form:
$$Au - \lambda u = h, \quad u \in X, \quad \lambda \in \mathbb{R},$$
where $X := L_2(a, b)$ and $(h \mid v) := \int_a^b h(x)v(x)\,dx$.

4.5 Applications to Boundary-Eigenvalue
Problems
Let us consider the following boundary-eigenvalue problem:
$$-u''(x) = \mu u(x), \quad 0 < x < \pi, \qquad u(0) = u(\pi) = 0. \tag{25}$$
This problem can be written in the following form:
$$Au = \mu u, \quad \mu \in \mathbb{R}, \quad u \in D(A), \tag{25*}$$
where $Au := -u''$ and
$$D(A) := \{u \in C^2[0, \pi] : u(0) = u(\pi) = 0\}.$$
Let us also study the following integral equation:
$$u(x) = \mu \int_0^{\pi} \mathcal{G}(x, y)u(y)\,dy, \tag{26}$$
where
$$\mathcal{G}(x, y) := \begin{cases} \dfrac{(\pi - y)x}{\pi} & \text{if } 0 \le x \le y \le \pi, \\[6pt] \dfrac{(\pi - x)y}{\pi} & \text{if } 0 \le y < x \le \pi. \end{cases}$$
Observe the following:
The Green function $\mathcal{G}$ is continuous and symmetric on $[0, \pi] \times [0, \pi]$.
We set $X := L_2(0, \pi)$ and
$$(u \mid v) := \int_0^{\pi} u(x)v(x)\,dx \quad \text{for all } u, v \in X.$$
Lemma 1.
(i) The linear operator $A\colon D(A) \subseteq X \to X$ is symmetric.
(ii) Two eigenfunctions $u$ and $v$ of (25) which correspond to different eigenvalues are orthogonal in $X$, i.e., $(u \mid v) = 0$.
(iii) Each eigenvalue $\mu$ of (25) is positive.
(iv) The original boundary-eigenvalue problem (25) is equivalent to the integral equation (26).
Proof. Ad (i). Integration by parts shows that, for all $u, v \in D(A)$,
$$(Au \mid v) = \int_0^{\pi} (-u'')v\,dx = -u'v\Big|_0^{\pi} + \int_0^{\pi} u'v'\,dx = \int_0^{\pi} u'v'\,dx = uv'\Big|_0^{\pi} - \int_0^{\pi} uv''\,dx = -\int_0^{\pi} uv''\,dx = (u \mid Av), \tag{27}$$
since $u$ and $v$ vanish at the boundary points $x = 0$ and $x = \pi$.
Ad (ii). This follows from Proposition 2 in Section 4.1.
Ad (iii). Let $Au = \mu u$, where $\mu \in \mathbb{R}$, $u \in D(A)$, and $u \ne 0$. By (27),
$$\mu(u \mid u) = (Au \mid u) = \int_0^{\pi} u'^2\,dx > 0.$$
Hence $\mu > 0$.
Ad (iv). If $u$ is a solution of (25), then $u$ is also a solution of (26), by Proposition 1(ii) in Section 2.7.1. In this connection, set $f := \mu u$.
Conversely, if $u$ is a solution of (26), then it follows from Lemma 3(b) in Section 4.4 that $u \in C[0, \pi]$, and again Proposition 1(ii) in Section 2.7.1 tells us that $u$ is a solution of (25). □
Recall that eigenvalues of multiplicity one are called simple.
Proposition 2.
(i) The original problem (25) has precisely the eigenvalues
$$\mu_n = n^2, \qquad n = 1, 2, \ldots,$$
which are simple.
(ii) The normalized eigenfunction $u_n$ to $\mu_n$ with $(u_n \mid u_n) = 1$ is given by
$$u_n(x) = \sqrt{\frac{2}{\pi}}\,\sin nx, \qquad n = 1, 2, \ldots.$$
(iii) For each function $u \in D(A)$, the Fourier series
$$u(x) = \sum_{n=1}^{\infty} (u_n \mid u)u_n(x) \tag{28}$$
converges absolutely and uniformly on the interval $[0,\pi]$. The same is true for
$$u'(x) = \sum_{n=1}^{\infty} (u_n \mid u)u_n'(x). \tag{28*}$$
For each function $u \in D(A)$ with $u'' \in D(A)$, the series
$$u''(x) = \sum_{n=1}^{\infty} (u_n \mid u)u_n''(x) \tag{28**}$$
converges absolutely and uniformly on the interval $[0,\pi]$.
(iv) For each function $u \in X$, the Fourier series (28) converges in $X := L_2(0,\pi)$, i.e., $\{u_1, u_2, \ldots\}$ forms a complete orthonormal system in $X$.
Applications of these results to the vibrating string can be found in Section 5.12.
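The uniform convergence claimed in Proposition 2(iii) can be observed numerically. The following is a minimal sketch, assuming the sample function $u(x) = x(\pi - x)$ (an illustrative choice that lies in $D(A)$) and the normalized sine functions $u_n(x) = \sqrt{2/\pi}\,\sin nx$; the names `coefficient`, `partial_sum`, and `sup_error` are hypothetical helpers, not notation from the text.

```python
import math

def u(x):
    """Sample function in D(A): twice continuously differentiable, zero at 0 and pi."""
    return x * (math.pi - x)

def coefficient(n):
    """Inner product (u_n | u) for u_n(x) = sqrt(2/pi) sin(n x), in closed form.

    Two integrations by parts give
    integral_0^pi x (pi - x) sin(n x) dx = 2 (1 - (-1)^n) / n^3.
    """
    return math.sqrt(2.0 / math.pi) * 2.0 * (1.0 - (-1.0) ** n) / n ** 3

def partial_sum(x, terms):
    """Partial sum of the Fourier series (28) with the given number of terms."""
    return sum(coefficient(n) * math.sqrt(2.0 / math.pi) * math.sin(n * x)
               for n in range(1, terms + 1))

# Maximum deviation on a uniform grid of [0, pi] for two truncation levels.
grid = [k * math.pi / 200 for k in range(201)]
sup_error = {N: max(abs(u(x) - partial_sum(x, N)) for x in grid) for N in (5, 51)}
```

The coefficients decay like $n^{-3}$, so the maximum error on the grid shrinks quickly as more terms are added.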
Proof. Ad (i), (ii). By Lemma 1, each eigenvalue $\mu$ of (25) is positive. The general solution of the differential equation $-u'' = \mu u$, $\mu > 0$, is given through
$$u(x) = C \sin \mu^{1/2}x + D \cos \mu^{1/2}x,$$
where $C$ and $D$ are real constants. From $u(0) = 0$ we get $D = 0$. Moreover, the second boundary condition $u(\pi) = 0$ along with $u \not\equiv 0$ implies
$$\mu^{1/2} = n, \qquad n = 1, 2, \ldots.$$
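As a numerical cross-check of Lemma 1(iv) and Proposition 2(i), (ii), one can verify that $u_n(x) = \sqrt{2/\pi}\,\sin nx$ with $\mu_n = n^2$ satisfies the integral equation (26). This is only a sketch under stated assumptions: the midpoint rule with 2000 nodes, the sample points, and the helper names `green`, `apply_green`, and `residual` are illustrative choices.

```python
import math

def green(x, y):
    """Green function of -u'' = f on (0, pi) with u(0) = u(pi) = 0."""
    if x <= y:
        return (math.pi - y) * x / math.pi
    return (math.pi - x) * y / math.pi

def apply_green(u, x, m=2000):
    """Midpoint-rule approximation of the integral of G(x, y) u(y) over [0, pi]."""
    h = math.pi / m
    return h * sum(green(x, (k + 0.5) * h) * u((k + 0.5) * h) for k in range(m))

def residual(n, x):
    """u_n(x) - mu_n * integral of G(x, y) u_n(y) dy, with mu_n = n^2."""
    def u_n(y):
        return math.sqrt(2.0 / math.pi) * math.sin(n * y)
    return u_n(x) - n * n * apply_green(u_n, x)

# The residual of (26) should vanish up to quadrature error.
max_residual = max(abs(residual(n, x)) for n in (1, 2, 3) for x in (0.3, 1.0, 2.5))
```

The residual is limited only by the quadrature error, which is of second order in the step size.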
Ad (iii). Let $u \in D(A)$. It follows from Proposition 1(ii) in Section 2.7.1 with $f := -u''$ that
$$u(x) = -\int_0^\pi \mathcal{G}(x,y)u''(y)\,dy.$$
Thus, the assertion for (28) follows from Corollary 2 in Section 4.4.
Let us prove (28*). Set
$$(Bf)(x) := \int_0^\pi \mathcal{G}(x,y)f(y)\,dy \quad \text{for all } x \in [0,\pi].$$
It follows from (26) that
since $\mathcal{G}_x$ is piecewise continuous and bounded on $[0,\pi] \times [0,\pi]$. From the
symmetry condition $\mathcal{G}(x,y) = \mathcal{G}(y,x)$ for all $x, y \in [0,\pi]$, we get
because of $u_n = \mu_n B u_n$, by (26). Hence
The uniform convergence of this series on $[0,\pi]$ follows as in the proof of (23), since $\mathcal{G}_x$ is bounded on $[0,\pi] \times [0,\pi]$. This justifies the formal differentiation of (28) in order to get (28*).
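Finally, let us prove (28**). Observe that
$$(u_n \mid -u'') = (u_n \mid Au) = (Au_n \mid u) = \mu_n (u_n \mid u).$$
Since $u'' \in D(A)$, relation (28) remains true if we replace $u$ with $u''$. Using $u_n'' = -\mu_n u_n$, we get
$$u''(x) = \sum_{n=1}^{\infty} (u_n \mid u'')u_n(x) = \sum_{n=1}^{\infty} (u_n \mid u)u_n''(x).$$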
Ad (iv). We have to prove that span $\{u_1, u_2, \ldots\}$ is dense in $L_2(0,\pi)$. Then, the assertion follows from Theorem 3.A in Section 3.1.
The set $C_0^\infty(0,\pi)$ is dense in $L_2(0,\pi)$. Let $v \in L_2(0,\pi)$, and let $\varepsilon > 0$ be given. Then, there is a function $u \in C_0^\infty(0,\pi)$ such that
$$\|v - u\| = \left( \int_0^\pi (v(x) - u(x))^2\,dx \right)^{1/2} < \varepsilon.$$
Observe that $u \in D(A)$. By (iii), there exists a function $w \in$ span $\{u_1, u_2, \ldots\}$ such that
$$\Delta := \max_{0 \le x \le \pi} |u(x) - w(x)| < \varepsilon.$$
Hence
$$\|u - w\| = \left( \int_0^\pi (u(x) - w(x))^2\,dx \right)^{1/2} \le \pi^{1/2}\Delta < \pi^{1/2}\varepsilon.$$
This implies
$$\|v - w\| \le \|v - u\| + \|u - w\| < \varepsilon + \pi^{1/2}\varepsilon.$$
Thus, the set span $\{u_1, u_2, \ldots\}$ is dense in $L_2(0,\pi)$. $\Box$
Let us now study the following nonhomogeneous boundary-value problem:
$$-u''(x) = \mu u(x) + f(x), \quad 0 < x < \pi, \qquad u(0) = u(\pi) = 0. \tag{29}$$
To this end, we need the corresponding homogeneous problem:
$$-u''(x) = \mu u(x), \quad 0 < x < \pi, \qquad u(0) = u(\pi) = 0. \tag{29*}$$
Recall that (29*) has the following eigensolutions:
$$\mu_n = n^2, \qquad u_n(x) = \sqrt{\frac{2}{\pi}}\,\sin nx, \qquad n = 1, 2, \ldots.$$
Proposition 3 (The Fredholm alternative). We are given the function $f \in C[0,\pi]$ and the real number $\mu$. Then, the following hold true:
(i) If $\mu$ is not an eigenvalue of (29*), then the original equation (29) has a unique solution $u$.
(ii) If $\mu$ is an eigenvalue of (29*), i.e., $\mu = \mu_n$ for some $n = 1, 2, \ldots$, then equation (29) has a solution $u$ iff the so-called non-resonance condition
$$\int_0^\pi f(x)u_n(x)\,dx = 0$$
is satisfied.
Proof. It follows as in the proof of Lemma 1 that the boundary-value problem (29) is equivalent to the integral equation
$$u(x) = \int_0^\pi \mathcal{G}(x,y)(\mu u(y) + f(y))\,dy, \qquad 0 \le x \le \pi.$$
This can be written as
$$u(x) = \mu \int_0^\pi \mathcal{G}(x,y)u(y)\,dy + b(x), \qquad 0 \le x \le \pi, \tag{30}$$
where $b(x) := \int_0^\pi \mathcal{G}(x,y)f(y)\,dy$.
The corresponding homogeneous integral equation
$$u(x) = \mu \int_0^\pi \mathcal{G}(x,y)u(y)\,dy, \qquad 0 \le x \le \pi,$$
is equivalent to (29*). In particular, we get
$$u_n(x) = \mu_n \int_0^\pi \mathcal{G}(x,y)u_n(y)\,dy, \qquad 0 \le x \le \pi, \quad n = 1, 2, \ldots. \tag{31}$$
Finally, recall that the inner product on $L_2(0,\pi)$ is given by
$$(f \mid g) := \int_0^\pi f(x)g(x)\,dx.$$
Ad (i). This follows from Proposition 4(i) in Section 4.4.
Ad (ii). It follows from Proposition 4(ii) in Section 4.4 that (30) has a solution iff
$$(b \mid u_n) = 0.$$
By (31),
$$\begin{aligned} (b \mid u_n) &= \int_0^\pi \left( \int_0^\pi \mathcal{G}(x,y)f(y)\,dy \right) u_n(x)\,dx \\ &= \int_0^\pi \left( \int_0^\pi \mathcal{G}(x,y)u_n(x)\,dx \right) f(y)\,dy = \int_0^\pi \mu_n^{-1}u_n(y)f(y)\,dy \\ &= \mu_n^{-1}(f \mid u_n), \end{aligned}$$
since $\mathcal{G}(x,y) = \mathcal{G}(y,x)$ for all $x, y \in [0,\pi]$. Therefore, $(b \mid u_n) = 0$ iff $(f \mid u_n) = 0$.
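The Fredholm alternative of Proposition 3 becomes very concrete in terms of sine coefficients. If $u = \sum c_n u_n$ and $f = \sum f_n u_n$, then $-u'' = \mu u + f$ amounts to $(n^2 - \mu)c_n = f_n$ for every $n$. The sketch below works with finitely many coefficients only; the function name `solve_coefficients` and the tolerance are assumptions made for illustration.

```python
def solve_coefficients(mu, f_coeffs, tol=1e-12):
    """Sine coefficients c_n of a solution of -u'' = mu*u + f, u(0) = u(pi) = 0.

    f_coeffs[n - 1] is the coefficient (u_n | f), n = 1, 2, ...
    Returns None if the non-resonance condition is violated.
    """
    solution = []
    for n, f_n in enumerate(f_coeffs, start=1):
        gap = n * n - mu                 # mu_n - mu with mu_n = n^2
        if gap != 0:
            solution.append(f_n / gap)   # coefficient is uniquely determined
        elif abs(f_n) > tol:
            return None                  # resonance: (f | u_n) != 0, no solution
        else:
            solution.append(0.0)         # solvable; any multiple of u_n may be added
    return solution

not_an_eigenvalue = solve_coefficients(0, [1.0, 0.0, 0.0])   # case (i)
non_resonant = solve_coefficients(1, [0.0, 1.0, 0.0])        # case (ii), solvable
resonant = solve_coefficients(1, [1.0, 0.0, 0.0])            # case (ii), not solvable
```

In the solvable resonant case the returned list is one particular solution; uniqueness is lost because multiples of $u_n$ can be added.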
Problems
Let $-\infty < a < b < \infty$.
4.1. Integral equations with degenerate kernels. Consider the integral equation
$$\int_a^b \mathcal{K}(x,y)u(y)\,dy = \lambda u(x) \quad \text{for all } x \in [a,b]. \tag{32}$$
Suppose that
$$\mathcal{K}(x,y) := \sum_{j=1}^{N} f_j(x)g_j(y) \quad \text{for all } x, y \in [a,b],$$
where the nonzero functions $f_j, g_j: [a,b] \to \mathbb{R}$ are continuous for all $j$.
Compute the eigenvalues and eigenfunctions of (32). Consider first the special case where $N = 1$.
4.2. The Green function. We want to study the following boundary-value problem:
$$-(p(x)u'(x))' + q(x)u(x) = f(x) \quad \text{on } [a,b], \tag{33a}$$
along with the boundary conditions
$$\alpha u(a) + \beta u'(a) = 0, \tag{33b}$$
$$\gamma u(b) + \delta u'(b) = 0, \tag{33c}$$
where $\alpha$, $\beta$, $\gamma$, and $\delta$ are fixed real numbers with $\alpha^2 + \beta^2 \neq 0$ and $\gamma^2 + \delta^2 \neq 0$.
We are given the continuous functions $p, q, f: [a,b] \to \mathbb{R}$ such that
$$p(x) \neq 0 \quad \text{on } [a,b].$$
Suppose that the homogeneous problem (33) with $f \equiv 0$ has only the trivial solution $u \equiv 0$.
Let $u_1$ and $u_2$ be nontrivial solutions of (33a,b) and (33a,c) with $f \equiv 0$, respectively. Define
$$\mathcal{G}(x,y) := \begin{cases} \dfrac{u_1(x)u_2(y)}{\rho(x)} & \text{if } a \le x \le y, \\[2mm] \dfrac{u_2(x)u_1(y)}{\rho(x)} & \text{if } y < x \le b, \end{cases}$$
where $\rho(x) := p(x)(u_2(x)u_1'(x) - u_1(x)u_2'(x))$. Show that
(i) $\mathcal{G}$ is the Green function to (33), i.e., if we set $v(x) := \mathcal{G}(x,y)$ for fixed $y \in [a,b]$, then $v$ is a solution of (33) with

(ii) $\mathcal{G}$ is symmetric, i.e., $\mathcal{G}(x,y) = \mathcal{G}(y,x)$ for all $x, y \in [a,b]$.
(iii) Problem (33) has the solution
$$u(x) = \int_a^b \mathcal{G}(x,y)f(y)\,dy \quad \text{for all } x \in [a,b].$$
(iv) Reduce the boundary-eigenvalue problem
$$-(p(x)u'(x))' + q(x)u(x) = \lambda u(x) \quad \text{on } [a,b] \tag{34}$$
along with the boundary conditions (33b,c) to an integral equation and prove the results for (34), which are similar to Section 4.5.
4.3. A special problem. Compute the Green function to the following boundary-value problem:
$$-u'' = f \quad \text{on } [0,1], \qquad u(0) = u'(1) = 0.$$
5
Self-Adjoint Operators,
the Friedrichs Extension,
and the Partial Differential
Equations of Mathematical
Physics
In the fall of 1926, the young John von Neumann (1903-1957) arrived
at Göttingen to take up his duties as Hilbert's assistant. These were
the hectic years during which quantum mechanics was developing at
breakneck speed, with a new idea popping up every few weeks from
all over the horizon.
Jean Dieudonné
History of Functional Analysis, 1981
Stimulated by an interest in quantum mechanics, John von Neumann (1903-1957) began the work in operator theory .... The result was a paper von Neumann submitted for publication to the Mathematische Zeitschrift but later withdrew. The reason for this withdrawal was that in 1928 Erhard Schmidt and myself, independently, saw the role which could be played in the theory by the concept of the adjoint operator, and the importance which should be attached to self-adjoint operators. When von Neumann learned from Professor Schmidt of this observation, he was able to rewrite his paper in a much more satisfactory and complete form .... Incidentally, for permission to withdraw the paper, the publisher exacted from Professor von Neumann a promise to write a book on quantum mechanics. The book soon appeared and has become one of the classics of modern physics (Foundations of Quantum Mechanics, Springer-Verlag, 1932).
Marshall Harvey Stone, 1970

At the very beginning the given elliptic differential operator is only defined on a space of $C^2$-functions. This operator is extended to an abstractly defined operator by using a formal closure. The main task is to show that the extended operator is self-adjoint. In this case it is possible to apply the methods of John von Neumann.
Kurt Otto Friedrichs, 1934
The fundamental quality required of operators representing physical quantities in quantum mechanics is that they be self-adjoint which is equivalent to saying that the eigenvalue problem is completely solvable for them, that is, there exists a complete set (discrete or continuous) of eigenfunctions.
The problem has, of course, been solved in the case of operators for which the eigenvalue problem is explicitly solved by separation of variables or other methods, but it seems not to have been settled in the general case of many-particle systems.
The main purpose of the present paper is to show that the Schrödinger Hamiltonian operator of every atom, molecule, or ion, in short, of every system composed of a finite number of particles interacting with each other through a potential energy, for instance, of Coulomb type, is essentially self-adjoint (i.e., this operator possesses a unique self-adjoint extension). Thus, our result serves as a mathematical basis for all theoretical works concerning nonrelativistic quantum mechanics.
Tosio Kato, 1951
The interaction between physics and mathematics has always played

an important role. The physicist who does not have the latest math-
ematical knowledge available to him is at a distinct disadvantage.
The mathematician who shies away from physical applications will
most likely miss important insights and motivations.
Martin Schechter, 1981
In this chapter we want to study the following problems in a Hilbert space $X$, where $A: D(A) \subseteq X \to X$ is a linear symmetric operator that has additional properties to be discussed ahead (cf. also Figure 5.1).
(i) Abstract boundary-value problem:
$$Au = f, \qquad u \in D(A). \tag{1}$$
(ii) Abstract Dirichlet problem:
$$2^{-1}(Au \mid u) - (f \mid u) = \min!, \qquad u \in D(A), \tag{1*}$$

[FIGURE 5.1. Diagram of the connections between the quadratic variational problem (Chapter 2), orthogonal projection, the Riesz theorem, the duality map, the energetic space (abstract Sobolev space), the energetic extension, the Friedrichs extension, the Hilbert-Schmidt theory for symmetric compact operators (Chapter 4), the eigenvalue problem and Fredholm alternative, complete orthonormal systems of eigenvectors (abstract Fourier series), the functional calculus for self-adjoint operators, and the abstract heat, wave, and Schrödinger equations.]
which is equivalent to (1). This minimum problem is also equivalent to
$$2^{-1}(u \mid u)_E - (f \mid u) = \min!, \qquad u \in X_E, \tag{1**}$$
where $X_E$ denotes the so-called energetic space of the operator $A$.
(iii) Abstract boundary-eigenvalue problem:
$$Au - \mu u = f, \qquad u \in D(A), \quad \mu \in \mathbb{R}. \tag{2}$$
(iv) Abstract heat equation:
$$u'(t) + Au(t) = 0, \quad t \ge 0, \qquad u(0) = u_0, \tag{3}$$
with the solution
$$u(t) = e^{-At}u_0 \quad \text{for all } t \ge 0,$$
which corresponds to the semigroup $\{e^{-At}\}_{t \ge 0}$.
(v) Abstract wave equation:
$$u''(t) + Au(t) = 0, \quad t \in \mathbb{R}, \qquad u(0) = u_0, \quad u'(0) = u_1, \tag{4}$$
with the solution
$$u(t) = (\cos A^{1/2}t)u_0 + (A^{-1/2}\sin A^{1/2}t)u_1.$$
(vi) Abstract Schrödinger equation:
$$iu'(t) = Au(t), \quad t \in \mathbb{R}, \qquad u(0) = u_0, \tag{5}$$
with the solution
$$u(t) = e^{-iAt}u_0,$$
which corresponds to the one-parameter group $\{e^{-iAt}\}_{t \in \mathbb{R}}$.
Problems (i)-(vi) allow important applications to the partial differential equations of mathematical physics.
For example, this concerns boundary-value problems and boundary-eigenvalue problems for the Laplace equation or the Poisson equation, the heat equation, the wave equation, and the Schrödinger equation in quantum mechanics. Such applications of the abstract theory will be considered in this chapter. In particular, in applications to elasticity the "energetic space" $X_E$ corresponds to "states" $u$ of the elastic body which have finite energy, and
$$2^{-1}(u \mid u)_E = \text{elastic energy in the state } u, \qquad (f \mid u) = \text{work of outer forces}.$$
Thus, the variational problem (1**) corresponds to the principle of minimal potential energy.
Observe that the "solutions" in (iv)-(vi) correspond to classic solutions of ordinary differential equations if we assume that $A$ is a real number and $u = u(t)$ is a real or complex function. The beauty of functional analysis consists in the fact that
The classic formulas remain true for operator equations if we define operator functions
$$A \mapsto F(A)$$
in an appropriate way.
The simplest method for constructing such operator functions is the following. Suppose that the operator $A$ has a
complete orthonormal system $\{u_1, u_2, \ldots\}$
of eigenvectors with the corresponding eigenvalues $\{\lambda_1, \lambda_2, \ldots\}$, i.e., $Au_n = \lambda_n u_n$ for all $n$. Then
$$u = \sum_{n=1}^{\infty} (u_n \mid u)u_n \quad \text{for all } u \in X.$$
This yields
$$Au = \sum_{n=1}^{\infty} (u_n \mid Au)u_n = \sum_{n=1}^{\infty} \lambda_n (u_n \mid u)u_n \quad \text{for all } u \in D(A), \tag{6}$$
since the symmetry of $A$ implies
$$(u_n \mid Au) = (Au_n \mid u) = \lambda_n (u_n \mid u).$$
Formula (6) motivates the following definition:
$$F(A)u := \sum_{n=1}^{\infty} F(\lambda_n)(u_n \mid u)u_n. \tag{7}$$
It is quite natural to define the domain of definition $D(F(A))$ of the operator $F(A)$ as follows:
$u \in D(F(A))$ iff the series (7) converges.
By the convergence criterion for abstract Fourier series (Proposition 5 in Section 3.1), we get
$$u \in D(F(A)) \quad \text{iff} \quad \sum_{n=1}^{\infty} |F(\lambda_n)(u_n \mid u)|^2 < \infty.$$
If we apply this to (6), then we obtain
$$u \in D(A) \quad \text{iff} \quad \sum_{n=1}^{\infty} |\lambda_n (u_n \mid u)|^2 < \infty. \tag{8}$$
It turns out that
If the linear operator $A: D(A) \subseteq X \to X$ is self-adjoint, then condition (8) is satisfied.
In this way, the functional calculus leads naturally to self-adjoint operators.
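Definition (7) can be tried out in finite dimensions, where every series is a finite sum. The following sketch assumes the hypothetical stand-in $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ on $X = \mathbb{R}^2$, whose eigenvalues are 1 and 3; the names `EIGENPAIRS` and `apply_function` are illustrative and do not come from the text.

```python
import math

# Orthonormal eigenvectors of the symmetric matrix A = [[2, 1], [1, 2]]:
# eigenvalue 1 with (1, -1)/sqrt(2) and eigenvalue 3 with (1, 1)/sqrt(2).
EIGENPAIRS = [
    (1.0, (1 / math.sqrt(2), -1 / math.sqrt(2))),
    (3.0, (1 / math.sqrt(2), 1 / math.sqrt(2))),
]

def inner(u, v):
    """Euclidean inner product (u | v) on R^2."""
    return sum(a * b for a, b in zip(u, v))

def apply_function(F, u):
    """F(A)u := sum over n of F(lambda_n) (u_n | u) u_n, as in formula (7)."""
    result = [0.0, 0.0]
    for lam, u_n in EIGENPAIRS:
        weight = F(lam) * inner(u_n, u)
        result = [r + weight * c for r, c in zip(result, u_n)]
    return result

u0 = (1.0, 0.0)
squared = apply_function(lambda lam: lam ** 2, u0)           # agrees with A(A u0) = (5, 4)
heat = apply_function(lambda lam: math.exp(-0.5 * lam), u0)  # e^{-tA} u0 at t = 0.5
```

Choosing $F(\lambda) = e^{-t\lambda}$ gives the finite-dimensional analogue of the heat semigroup from (iv).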
In applications to mathematical physics one encounters the following situation. For example, the classic boundary-value problem for the Poisson equation
$$-\Delta u = f \ \text{ on } G, \qquad u = 0 \ \text{ on } \partial G \tag{9}$$
can be written in the following form:
$$Bu = f, \qquad u \in D(B), \tag{9*}$$
where we set $Bu := -\Delta u$ and
$$D(B) := \{u \in C^2(\overline{G}) : u = 0 \text{ on } \partial G\}.$$
Here, $G$ is a nonempty bounded open set in $\mathbb{R}^N$. Letting $X := L_2(G)$, we get the linear symmetric operator
$$B: D(B) \subseteq X \to X.$$
However, if $N \ge 2$, then the operator $B$ is not surjective. More precisely:
There are functions $f \in C(\overline{G})$ for which equation (9*) has no solution.
This is identical to the fact that problem (9) does not always have a classic solution if $f \in C(\overline{G})$.
The idea of Friedrichs was to extend the operator $B$ to a self-adjoint operator
$$A: D(A) \subseteq X \to X,$$
i.e., we have
$$Bu = Au \quad \text{for all } u \in D(B) \quad \text{and} \quad D(B) \subseteq D(A).$$
The operator $A$ is called the Friedrichs extension of the original operator $B$. It turns out that the equation
$$Au = f, \qquad u \in D(A) \tag{9**}$$
has a unique solution $u$ for each given $f \in X$. This solution $u$ of (9**) can be regarded as
a generalized solution to the classic problem (9).
In terms of the expansion (6), the situation is as follows. If $G$ has a sufficiently smooth boundary, then the symmetric operator $B: D(B) \subseteq X \to X$ has an orthonormal system $\{u_1, u_2, \ldots\}$ of classic eigenfunctions, which is complete in the Hilbert space $X = L_2(G)$. These eigenfunctions $u_n$ correspond to eigensolutions of the following classic eigenvalue problem:
$$-\Delta u_n = \lambda_n u_n \ \text{ on } G, \qquad u_n = 0 \ \text{ on } \partial G.$$
Using the same argument as for (6), we obtain that
$$Bu = \sum_{n=1}^{\infty} \lambda_n (u_n \mid u)u_n \quad \text{for all } u \in D(B). \tag{10}$$
However, this series also converges for points $u \in X$ that do not live in $D(B)$. Naturally enough, the Friedrichs extension $A$ of the original operator $B$ is given through formulas (6) and (8), i.e.,
$$Au = \sum_{n=1}^{\infty} \lambda_n (u_n \mid u)u_n \quad \text{for all } u \in D(A), \tag{10*}$$
where
$$u \in D(A) \quad \text{iff} \quad \sum_{n=1}^{\infty} |\lambda_n (u_n \mid u)|^2 < \infty.$$
The preceding considerations motivate the appearance of the Friedrichs extension in a quite natural way. However, the general theory of the Friedrichs extension is independent of Fourier series expansions. The basic idea is the following. We are given a linear, symmetric, strongly monotone operator $B: D(B) \subseteq X \to X$ on the real Hilbert space $X$, i.e.,
$$(Bu \mid u) \ge c\|u\|^2 \quad \text{for all } u \in D(B) \text{ and fixed } c > 0.$$
We first construct the so-called energetic extension
$$B_E: X_E \to X_E^*$$
of the operator $B$, where
$$D(B) \subseteq X_E \subseteq X \subseteq X_E^*, \tag{11}$$
and $B_E$ is the duality map of $X_E$. Then, the Friedrichs extension
$$A: D(A) \subseteq X \to X$$
of $B$ is an appropriate restriction of $B_E$, namely, we set
$$Au := B_E u \quad \text{for all } u \in D(A),$$
where $D(A) := \{u \in X_E : B_E u \in X\}$. This construction guarantees automatically that the Friedrichs extension $A: D(A) \subseteq X \to X$ is bijective, since the duality map $B_E$ is bijective. That is, the equation $Au = f$, $u \in D(A)$, has a unique solution for each given $f \in X$.
For brevity we write
$$B \subseteq A \subseteq B_E, \tag{11*}$$
i.e., $A$ is an extension of $B$. In turn, $B_E$ is an extension of $A$.

In terms of Fourier series, the energetic space $X_E$ of the symmetric operator $B: D(B) \subseteq X \to X$ from (10) is given through
$$X_E = \Big\{ u \in X : \sum_{n=1}^{\infty} \lambda_n |(u_n \mid u)|^2 < \infty \Big\}.$$
If the operator $B$ corresponds to the classic boundary-value problem (9) for the Poisson equation, i.e., $B$ is given through (9*), then
$$X_E = \mathring{W}_2^1(G),$$
i.e., the energetic space is a Sobolev space.
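The difference between the energetic space and the domain of the Friedrichs extension can be seen in the one-dimensional model case $G = (0,\pi)$, $Bu = -u''$, with $\lambda_n = n^2$ and $u_n(x) = \sqrt{2/\pi}\,\sin nx$. The sketch below uses the tent function $u(x) = \min(x, \pi - x)$ as an assumed example: its coefficients decay like $n^{-2}$, so the energetic series converges (to $\int_0^\pi u'^2\,dx = \pi$), whereas the series from condition (8) diverges. All function names are illustrative.

```python
import math

def tent_coefficient(n):
    """(u_n | u) for u(x) = min(x, pi - x) and u_n(x) = sqrt(2/pi) sin(n x).

    Integration by parts gives integral_0^pi u(x) sin(n x) dx = 2 sin(n pi / 2) / n^2.
    """
    return math.sqrt(2.0 / math.pi) * 2.0 * math.sin(n * math.pi / 2.0) / n ** 2

def energetic_sum(terms):
    """Partial sum of lambda_n |(u_n | u)|^2 with lambda_n = n^2 (membership in X_E)."""
    return sum(n ** 2 * tent_coefficient(n) ** 2 for n in range(1, terms + 1))

def domain_sum(terms):
    """Partial sum of |lambda_n (u_n | u)|^2, the series in condition (8)."""
    return sum((n ** 2 * tent_coefficient(n)) ** 2 for n in range(1, terms + 1))

energy_2000 = energetic_sum(2000)                    # approaches pi
growth = (domain_sum(200), domain_sum(2000))         # grows linearly with the cutoff
```

So the tent function has finite energy but does not belong to $D(A)$, in line with the fact that its second derivative is not a function in $L_2(0,\pi)$.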

As we shall show in this chapter, the compactness of the embedding
$$\mathring{W}_2^1(G) \subseteq L_2(G)$$
plays a fundamental role. This is the famous Rellich compactness theorem.
In fact, this compact embedding guarantees that the Laplacian with zero boundary conditions possesses a complete orthonormal system of eigenfunctions in the Hilbert space $L_2(G)$, i.e., series (10) is convergent. This result will be critically used in order to solve the heat and wave equations. Our approach justifies the classic Fourier method of physicists.
The Friedrichs extension represents the functional analytic core of math-
ematical physics.
This approach is closely related to the fundamental physical concept of
energy.
In quantum physics, physical states correspond to unit vectors in a
Hilbert space and the physical quantities (e.g., energy, momentum, and so
on) correspond to self-adjoint operators. This will be discussed in Section
5.14.
In Section 5.2 we shall show that
Self-adjoint operators are closely related to both orthogonality and gen-
eralized derivatives.
5.1 Extensions and Embeddings

Definition 1. Let $A: D(A) \subseteq X \to Y$ and $B: D(B) \subseteq X \to Y$ be operators, where $X$ and $Y$ are linear spaces over $\mathbb{K}$. We write
$$B \subseteq A$$
iff
$$Au = Bu \quad \text{for all } u \in D(B) \quad \text{and} \quad D(B) \subseteq D(A).$$
In this case, we say that the operator $A$ is an extension of the operator $B$. Obviously, $A = B$ iff $B \subseteq A$ and $A \subseteq B$.
Example 2. Set $X = Y := \mathbb{R}$ and $D(B) := [0,1]$ as well as $D(A) := [-1,1]$. Moreover, let
$$Bu := u \ \text{ on } D(B) \quad \text{and} \quad Au := |u| \ \text{ on } D(A).$$
Then, $B \subseteq A$ (cf. Figure 5.2).
[FIGURE 5.2. Graphs of the operators $B$ and $A$ with $B \subseteq A$.]
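Example 2 stresses that an operator consists of a rule together with a domain. A small sketch of this point, in which the domain check by raising `ValueError` is an illustrative design choice rather than anything prescribed by the text:

```python
def B(u):
    """Bu := u on D(B) = [0, 1]."""
    if not 0.0 <= u <= 1.0:
        raise ValueError("u is outside D(B)")
    return u

def A(u):
    """Au := |u| on D(A) = [-1, 1]."""
    if not -1.0 <= u <= 1.0:
        raise ValueError("u is outside D(A)")
    return abs(u)

# A agrees with B on sample points of D(B), and D(B) is contained in D(A).
samples = [k / 10 for k in range(11)]
agrees_on_domain = all(A(u) == B(u) for u in samples)

# A is defined at -0.5, whereas B is not.
value_outside = A(-0.5)
```

The extension $A$ is free to behave differently outside $D(B)$; only the values on $D(B)$ are tied to $B$.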
Definition 3. Let $X$ and $Y$ be normed spaces over $\mathbb{K}$.
(i) We say that the embedding "$X \subseteq Y$" is continuous iff there exists an operator
$$j: X \to Y \tag{12}$$
that is linear, continuous, and injective.
(ii) The embedding "$X \subseteq Y$" is called compact iff the operator $j$ from (12) is linear, compact, and injective.
Let $X$ be a subset of $Y$, i.e., $X \subseteq Y$. Then we set $j(u) := u$ for all $u \in X$. In terms of sequences, we have the following:
(a) The embedding $X \subseteq Y$ is continuous iff, as $n \to \infty$,
$$u_n \to u \ \text{ in } X \quad \text{implies} \quad u_n \to u \ \text{ in } Y.$$
(b) The embedding $X \subseteq Y$ is compact iff it is continuous and each bounded sequence $(u_n)$ in $X$ has a subsequence $(u_{n'})$ that converges in $Y$, i.e., as $n' \to \infty$,
$$u_{n'} \to v \ \text{ in } Y.$$
In the general case of Definition 3, we may identify $u$ with $j(u)$. This makes sense since $j: X \to Y$ is injective. In this sense, we may regard the space $X$ as a subset of $Y$, and we may write $X \subseteq Y$ instead of "$X \subseteq Y$" for brevity.
Standard Example 4. Let $G$ be a nonempty bounded open set in $\mathbb{R}^N$, $N \ge 1$. Then, the following hold true:
(i) The embedding $C(\overline{G}) \subseteq L_2(G)$ is continuous.
(ii) The embedding $\mathring{W}_2^1(G) \subseteq L_2(G)$ is compact.
This is the prototype for embedding theorems, which play a fundamental role in modern analysis.
Proof. Ad (i). We are given $u \in C(\overline{G})$. Let $j(u)$ denote the element of $L_2(G)$ that corresponds to $u$. Then
$$\|j(u)\|_{L_2(G)} = \left( \int_G |u(x)|^2\,dx \right)^{1/2} \le \left( \int_G dx \right)^{1/2} \max_{x \in \overline{G}} |u(x)| \le \text{const}\,\|u\|_{C(\overline{G})}.$$
Thus, the operator $j: C(\overline{G}) \to L_2(G)$ is linear and continuous.
Moreover, $j$ is also injective. In fact, if $j(u) = j(v)$ and $u, v \in C(\overline{G})$, then
$$u(x) = v(x) \quad \text{for almost all } x \in G.$$
Since $u$ and $v$ are continuous, this implies $u(x) = v(x)$ for all $x \in G$.
Ad (ii). This is the famous Rellich embedding theorem, which will be proved in Section 5.7. $\Box$
Definition 5. Let $X$, $Y$, and $Z$ be linear spaces over $\mathbb{K}$, and let $A: D(A) \subseteq X \to Y$ and $B: D(B) \subseteq X \to Y$ be linear operators. For each $\alpha \in \mathbb{K}$, we define the operators
$$(\alpha A)u := \alpha Au \quad \text{for all } u \in D(A),$$
and
$$(A + B)u := Au + Bu \quad \text{for all } u \in D(A) \cap D(B),$$
i.e., $D(\alpha A) := D(A)$ and $D(A + B) := D(A) \cap D(B)$.
Let $C: D(C) \subseteq Y \to Z$ be a linear operator. We set
$$(CA)u := C(Au) \quad \text{for all } u \in D(A) \text{ with } Au \in D(C).$$
Obviously, the operators $\alpha A$, $A + B$, and $CA$ are also linear.

5.2 Self-Adjoint Operators

While working with operators $A: D(A) \subseteq X \to X$ that are not defined on the total space, observe carefully the specific form of the domain of definition $D(A)$ of $A$.
The following examples show that, roughly speaking, we have the following situation:
(i) Linear integral operators correspond to linear operators $A: X \to X$ that are defined on the total space $X$.
(ii) Linear differential operators $A: D(A) \subset X \to X$ correspond to linear operators that are not defined on the total space $X$.
The definition of the adjoint operator $A^*$ is based on the following formula:
$$(Au \mid v) = (u \mid A^*v) \quad \text{for all } u \in D(A), \ v \in D(A^*). \tag{13}$$
Definition 1. Let $A: D(A) \subseteq X \to X$ be a linear operator, where $D(A)$ is dense in the Hilbert space $X$ over $\mathbb{K}$. By definition,
$$v \in D(A^*)$$
iff there exists an element $w \in X$ such that
$$(Au \mid v) = (u \mid w) \quad \text{for all } u \in D(A). \tag{14}$$
Furthermore, we set $A^*v := w$. This way we obtain the adjoint operator $A^*: D(A^*) \subseteq X \to X$.
We have to show that this definition makes sense. In fact, suppose that relation (14) also holds if we replace $w$ with $w_1$. Then
$$(u \mid w - w_1) = 0 \quad \text{for all } u \in D(A).$$
Since $D(A)$ is dense in $X$, we get $w = w_1$.
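Proposition 2. Let $A: D(A) \subseteq X \to X$ and $B: D(B) \subseteq X \to X$ be linear operators, where $D(A)$ and $D(B)$ are dense in the Hilbert space $X$ over $\mathbb{K}$.
(i) The adjoint operator $A^*: D(A^*) \subseteq X \to X$ is linear.
(ii) For each $\alpha \in \mathbb{K}$,
$$(\alpha A)^* = \overline{\alpha}A^*. \tag{15}$$
(iii) $A \subseteq B$ implies $B^* \subseteq A^*$.
Consequently, if $D(A^*)$ is dense in $X$, then the operator $(A^*)^*$ exists. In this case, we set
$$A^{**} := (A^*)^*.$$
Proof. Ad (i). Let $\alpha_1, \alpha_2 \in \mathbb{K}$. If $(Au \mid v_j) = (u \mid w_j)$ for all $u \in D(A)$, $j = 1, 2$, then
$$(Au \mid \alpha_1 v_1 + \alpha_2 v_2) = \alpha_1 (Au \mid v_1) + \alpha_2 (Au \mid v_2) = (u \mid \alpha_1 w_1 + \alpha_2 w_2) \quad \text{for all } u \in D(A).$$
Thus, $v_j \in D(A^*)$ for $j = 1, 2$ implies $\alpha_1 v_1 + \alpha_2 v_2 \in D(A^*)$ and
$$A^*(\alpha_1 v_1 + \alpha_2 v_2) = \alpha_1 A^*v_1 + \alpha_2 A^*v_2.$$
Ad (ii). If $\alpha \neq 0$, then relation (15) follows from
$$(Au \mid v) = (u \mid w) \iff (\alpha Au \mid v) = (u \mid \overline{\alpha}w).$$
For $\alpha = 0$, the assertion is trivial.
Ad (iii). Let $A \subseteq B$. It follows from
$$(Bu \mid v) = (u \mid B^*v) \quad \text{for all } u \in D(B), \ v \in D(B^*)$$
that $(Au \mid v) = (u \mid B^*v)$ for all $u \in D(A)$, and hence $A^*v = B^*v$ for all $v \in D(B^*)$. This implies $B^* \subseteq A^*$. $\Box$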
Definition 3. Let $A: D(A) \subseteq X \to X$ be a linear operator, where $D(A)$ is dense in the Hilbert space $X$ over $\mathbb{K}$.
(i) $A$ is called symmetric iff $A \subseteq A^*$, i.e., $(Au \mid v) = (u \mid Av)$ for all $u, v \in D(A)$.
(ii) $A$ is called self-adjoint iff $A = A^*$.
(iii) $A$ is called skew-symmetric iff $A \subseteq -A^*$, i.e., $(Au \mid v) = -(u \mid Av)$ for all $u, v \in D(A)$.
(iv) $A$ is called skew-adjoint iff $A = -A^*$.
Proposition 4. Let the operator $A: X \to X$ be linear and continuous on the Hilbert space $X$ over $\mathbb{K}$. Then, the adjoint operator
$$A^*: X \to X$$
is also linear and continuous. In addition, $\|A\| = \|A^*\|$. Moreover, $A^{**} = A$.
Proof. Let $v \in X$. Set
$$f(u) := (v \mid Au) \quad \text{for all } u \in X.$$
Then
$$|f(u)| \le \|Au\|\,\|v\| \le \|A\|\,\|u\|\,\|v\| \quad \text{for all } u \in X.$$
Hence the linear functional $f: X \to \mathbb{K}$ is continuous with $\|f\| \le \|A\|\,\|v\|$.
By the Riesz theorem from Section 2.10, there exists an element $w \in X$ such that
$$f(u) = (w \mid u) \quad \text{for all } u \in X,$$
and $\|w\| = \|f\|$. Hence $(Au \mid v) = (u \mid w)$ for all $u \in X$. This implies
$$A^*v = w,$$
and $\|A^*v\| \le \|A\|\,\|v\|$ for all $v \in X$. Therefore,
$$\|A^*\| \le \|A\|. \tag{16}$$
It follows from $(Au \mid v) = (u \mid A^*v)$ that
$$(A^*v \mid u) = (v \mid Au) \quad \text{for all } u, v \in X.$$
Hence $(A^*)^* = A$. Replacing $A$ with $A^*$ in (16), we get $\|A\| = \|(A^*)^*\| \le \|A^*\|$. This implies $\|A\| = \|A^*\|$. $\Box$
Standard Example 5 (Integral operators). Let $\mathcal{A}: [a,b] \times [a,b] \to \mathbb{R}$ be a continuous function, where $-\infty < a < b < \infty$. Define
$$(Au)(x) := \int_a^b \mathcal{A}(x,y)u(y)\,dy \quad \text{for all } x \in [a,b], \tag{17}$$
and set $X := L_2(a,b)$. Then, the following are met:
(i) The operator $A: X \to X$ is linear and compact.
(ii) The adjoint operator $A^*: X \to X$ is given through
$$(A^*u)(x) = \int_a^b \mathcal{A}(y,x)u(y)\,dy \quad \text{for all } x \in [a,b]. \tag{17*}$$
The operator $A^*: X \to X$ is linear and compact.
(iii) If $\mathcal{A}$ is symmetric, i.e., $\mathcal{A}(x,y) = \mathcal{A}(y,x)$ for all $x, y \in [a,b]$, then the operator $A: X \to X$ is self-adjoint.
Proof. Ad (i). This follows from Lemma 3 in Section 4.4.
Ad (ii). We are given $v \in X$. Set
$$w(x) := \int_a^b \mathcal{A}(y,x)v(y)\,dy \quad \text{for all } x \in [a,b].$$
As in the proof of Lemma 3(c) in Section 4.4, it follows from the Tonelli theorem that
$$\begin{aligned} (Au \mid v) &= \int_a^b \left( \int_a^b \mathcal{A}(x,y)u(y)\,dy \right) v(x)\,dx \\ &= \int_a^b \left( \int_a^b \mathcal{A}(x,y)v(x)\,dx \right) u(y)\,dy \\ &= \int_a^b w(y)u(y)\,dy = (u \mid w) \quad \text{for all } u \in X. \end{aligned}$$
Hence $A^*v = w$. This yields (17*).
By (17*) and Lemma 3 in Section 4.4, the operator $A^*: X \to X$ is linear and compact.
Ad (iii). If $\mathcal{A}$ is symmetric, then $A = A^*$, by (17) and (17*). $\Box$
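Formula (17*) says that passing to the adjoint swaps the two arguments of the kernel. After discretization this is the familiar statement that the adjoint of a real matrix is its transpose. The sketch below assumes the interval $[0,1]$, a hypothetical non-symmetric kernel $\mathcal{A}(x,y) = x^2 + y$, and midpoint quadrature; the names `kernel`, `apply_kernel`, and `inner` are illustrative.

```python
def kernel(x, y):
    """Hypothetical non-symmetric continuous kernel on [0, 1] x [0, 1]."""
    return x * x + y

M = 400
H = 1.0 / M
NODES = [(k + 0.5) * H for k in range(M)]   # midpoint quadrature nodes

def apply_kernel(kern, u_vals):
    """Midpoint approximation of x -> integral of kern(x, y) u(y) dy at the nodes."""
    return [H * sum(kern(x, y) * u for y, u in zip(NODES, u_vals)) for x in NODES]

def inner(u_vals, v_vals):
    """Midpoint approximation of the real L_2 inner product (u | v)."""
    return H * sum(a * b for a, b in zip(u_vals, v_vals))

u_vals = [x for x in NODES]                 # u(x) = x
v_vals = [1.0 - x * x for x in NODES]       # v(x) = 1 - x^2

lhs = inner(apply_kernel(kernel, u_vals), v_vals)                       # (Au | v)
rhs = inner(u_vals, apply_kernel(lambda x, y: kernel(y, x), v_vals))    # (u | A*v)
```

Both sides are the same double sum written in a different order, so they agree up to rounding.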
Proposition 6. Let $A: D(A) \subseteq X \to X$ be a linear operator on the Hilbert space $X$ over $\mathbb{K}$ such that $D(A)$ is dense in $X$. Then, the following hold true:
(i) $A$ is self-adjoint iff $A$ is symmetric and
$$(Au \mid v) = (u \mid w) \quad \text{for all } u \in D(A) \text{ and fixed } v, w \in X \tag{18}$$
implies that $v \in D(A)$ and $w = Av$.
(ii) $A$ is skew-adjoint iff $A$ is skew-symmetric and
$$(Au \mid v) = -(u \mid w) \quad \text{for all } u \in D(A) \text{ and fixed } v, w \in X \tag{19}$$
implies $v \in D(A)$ and $w = Av$.
(iii) Let $\mathbb{K} = \mathbb{C}$ and $\alpha \in \mathbb{R}$ with $\alpha \neq 0$. Then,
$$A \text{ is skew-symmetric} \iff \alpha iA \text{ is symmetric},$$
$$A \text{ is skew-adjoint} \iff \alpha iA \text{ is self-adjoint}.$$
Proof. Ad (i). Observe that $A = A^*$ iff $A \subseteq A^*$ and $A^* \subseteq A$.
Ad (ii). Use $A = -A^*$ iff $A \subseteq -A^*$ and $-A^* \subseteq A$.
Ad (iii). Since $(\alpha iA)^* = -\alpha iA^*$, we get
$$A \subseteq -A^* \iff \alpha iA \subseteq (\alpha iA)^*$$
and
$$A = -A^* \iff \alpha iA = (\alpha iA)^*. \qquad \Box$$
Corollary 7.
(i) Each self-adjoint linear operator $A: D(A) \subseteq X \to X$ on the Hilbert space $X$ over $\mathbb{K}$ is maximally symmetric, i.e., by definition, if we have
$$A \subseteq S$$
for any symmetric operator $S: D(S) \subseteq X \to X$, then $A = S$.
(ii) Each skew-adjoint linear operator $A: D(A) \subseteq X \to X$ is maximally skew-symmetric.
Proof. Ad (i). It follows from $A \subseteq S$ that $S^* \subseteq A^*$. Since $A = A^*$ and $S \subseteq S^*$, we get $S \subseteq A$. Thus, $S = A$.
Ad (ii). If $A \subseteq S$ with $A = -A^*$ and $S \subseteq -S^*$, then $S^* \subseteq A^*$, and hence $S \subseteq A$. Thus, $A = S$. $\Box$
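Standard Example 8 (Differential operator). Let $X := L_2^{\mathbb{C}}(\mathbb{R})$. Define
$$(Au)(x) := u'(x) \quad \text{for all } x \in \mathbb{R},$$
where $D(A) := \{u \in X : u' \in X\}$. Here, the derivative $u'$ is to be understood in the generalized sense. Then, the following hold true:
(i) The operator $A: D(A) \subseteq X \to X$ is skew-adjoint.
(ii) For each $\alpha \in \mathbb{R}$ with $\alpha \neq 0$, the operator $\alpha iA$ is self-adjoint.
Proof. Ad (i). We have $u \in D(A)$ iff $u \in X$ and there is a function $w \in X$ such that
$$\int_{\mathbb{R}} u(x)\varphi'(x)\,dx = -\int_{\mathbb{R}} w(x)\varphi(x)\,dx \quad \text{for all } \varphi \in C_0^\infty(\mathbb{R})_{\mathbb{C}}. \tag{20}$$
If this holds true, then we get $w = u'$ in the generalized sense.
Step 1: Approximation. Let $v \in D(A)$. We want to show that there exists a sequence $(v_n)$ in $C^\infty(\mathbb{R})_{\mathbb{C}}$ such that, as $n \to \infty$,
$$v_n \to v \ \text{ in } L_2^{\mathbb{C}}(\mathbb{R}) \quad \text{and} \quad v_n' \to v' \ \text{ in } L_2^{\mathbb{C}}(\mathbb{R}).$$
To prove this we will use the same arguments as in the proof of Proposition 7 from Section 2.2. To this end, we set
$$v_n(x) := \int_{\mathbb{R}} \varphi_{1/n}(x - y)v(y)\,dy, \qquad n = 1, 2, \ldots$$
(see (115) in Chapter 2). Then, $v_n \in C^\infty(\mathbb{R})_{\mathbb{C}}$ for all $n$, and we get $v_n \to v$ in $L_2^{\mathbb{C}}(\mathbb{R})$ as $n \to \infty$.
Since for fixed $x \in \mathbb{R}$, the function $y \mapsto \varphi_{1/n}(x - y)$ belongs to the space $C_0^\infty(\mathbb{R})$, it follows from the definition of the generalized derivative that
$$v_n'(x) = \int_{\mathbb{R}} \frac{d}{dx}\varphi_{1/n}(x - y)\,v(y)\,dy = -\int_{\mathbb{R}} \Big\{ \frac{d}{dy}\varphi_{1/n}(x - y) \Big\} v(y)\,dy = \int_{\mathbb{R}} \varphi_{1/n}(x - y)v'(y)\,dy.$$
Hence $v_n' \to v'$ in $L_2^{\mathbb{C}}(\mathbb{R})$ as $n \to \infty$, by Problem 2.12(r).
Since the function $\varphi_{1/n}$ is real, we also get
$$\overline{v_n} \to \overline{v} \quad \text{and} \quad \overline{(v_n')} \to \overline{(v')} \quad \text{in } L_2^{\mathbb{C}}(\mathbb{R}) \text{ as } n \to \infty. \tag{21}$$
Step 2: We want to show that the operator $A$ is skew-symmetric. Replacing $\varphi$ with $\overline{v_n}$ in (20) and letting $n \to \infty$, we get
$$\int_{\mathbb{R}} u\,\overline{(v')}\,dx = -\int_{\mathbb{R}} u'\,\overline{v}\,dx \quad \text{for all } u, v \in D(A). \tag{22}$$
Hence
$$(Av \mid u) = -(v \mid Au) \quad \text{for all } u, v \in D(A).$$
Step 3: We prove that the operator $A$ is skew-adjoint. In fact, it follows from
$$(Av \mid u) = -(v \mid w) \quad \text{for all } v \in D(A) \text{ and fixed } u, w \in X$$
that
$$\int_{\mathbb{R}} \overline{(v')}\,u\,dx = -\int_{\mathbb{R}} \overline{v}\,w\,dx \quad \text{for all } v \in C_0^\infty(\mathbb{R})_{\mathbb{C}}.$$
Thus, we get $u' = w$ in the generalized sense, i.e., $w = Au$. By Proposition 6, $A$ is skew-adjoint.
Ad (ii). This follows from Proposition 6(iii). $\Box$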
The following example shows that the notion of the generalized derivative is quite natural from the operator theory viewpoint.
Example 9. Let $X := L_2^{\mathbb{C}}(\mathbb{R})$. In contrast to the preceding Standard Example 8, let us consider the classic differential operator
$$(Bu)(x) := u'(x) \quad \text{for all } x \in \mathbb{R},$$
where $D(B) := \{u \in C^1(\mathbb{R}) : u, u' \in X\}$. Then
(i) The operator $B: D(B) \subseteq X \to X$ is skew-symmetric but not skew-adjoint.
(ii) For each $\alpha \in \mathbb{R}$ with $\alpha \neq 0$, the operator $\alpha iB$ is symmetric but not self-adjoint.
Proof. Ad (i). By Step 2 of the preceding proof, $B$ is skew-symmetric. Set
$$u(x) := |x| \quad \text{and} \quad w(x) := u'(x) \ \text{ if } x \neq 0, \quad w(0) := 0.$$
As in Example 5 from Section 2.5.2, we obtain that
$$w = u' \quad \text{in the generalized sense}.$$
However, $u \notin D(B)$, since $u$ is not $C^1$. By Standard Example 8,
$$Au = w.$$
Hence the operator $A$ is a proper skew-symmetric extension of $B$. Thus, $B$ is not skew-adjoint, by Corollary 7.
Ad (ii). This follows from (i) and Proposition 6(iii). $\Box$
Standard Example 10 (The multiplication operator). Let $X := L_2^{\mathbb{C}}(\mathbb{R})$. Define
$$(Mu)(x) := xu(x) \quad \text{for all } x \in \mathbb{R},$$
where $D(M) := \{u \in X : Mu \in X\}$. Then, the operator $M: D(M) \subseteq X \to X$ is self-adjoint.
The self-adjoint operators $iA$ and $M$ from Standard Examples 8 and 10 correspond to momentum and position in quantum mechanics, respectively (cf. Section 5.14).
Proof. For all $u, v \in D(M)$,
$$(Mu \mid v) = \int_{\mathbb{R}} \overline{(xu(x))}\,v(x)\,dx = \int_{\mathbb{R}} \overline{u(x)}\,(xv(x))\,dx = (u \mid Mv).$$
Hence $M$ is symmetric.
Moreover, it follows from
$$(Mu \mid v) = (u \mid w) \quad \text{for all } u \in D(M) \text{ and fixed } v, w \in X$$
that
$$\int_{\mathbb{R}} \overline{u(x)}\,(xv(x))\,dx = \int_{\mathbb{R}} \overline{u(x)}\,w(x)\,dx \quad \text{for all } u \in C_0^\infty(\mathbb{R})_{\mathbb{C}}.$$
Hence $w(x) = xv(x)$ for almost all $x \in \mathbb{R}$, i.e., $w = Mv$. Thus, $M$ is self-adjoint, by Proposition 6. $\Box$
Next we want to show that orthogonal projections are closely related to a special class of self-adjoint operators. Let $M$ be a closed linear subspace of the Hilbert space $X$ over $\mathbb{K}$. By Section 2.9, for each $u \in X$, there exists the unique decomposition
$$u = v + w, \quad \text{where } v \in M \text{ and } w \in M^\perp. \tag{23}$$
Definition 11. The operator
$$Pu := v$$
is called the orthogonal projection from $X$ onto $M$ (cf. Figure 5.3).
[FIGURE 5.3. The orthogonal projection $Pu$ of $u$ onto $M$.]
Proposition 12. Let $X$ be a Hilbert space over $\mathbb{K}$. Then
(i) The orthogonal projection $P: X \to M$ from $X$ onto the closed linear subspace $M$ of $X$ is linear, continuous, and self-adjoint and $P^2 = P$. If $M \neq \{0\}$, then $\|P\| = 1$.
(ii) Conversely, let $P: X \to X$ be a linear continuous self-adjoint operator with $P^2 = P$. Then, $P$ is the orthogonal projection from $X$ onto the closed linear subspace $P(X)$.
Proof. Ad (i). By the Pythagorean theorem, it follows from (23) that
$$\|u\|^2 = \|v\|^2 + \|w\|^2.$$
Hence $\|Pu\| \le \|u\|$ for all $u \in X$. Moreover, if $u \in M$, then $Pu = u$. Hence $\|P\| = 1$.
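Let
$$u_j = v_j + w_j, \quad \text{where } v_j \in M \text{ and } w_j \in M^\perp, \quad j = 1, 2.$$
Then, $(v_j \mid w_k) = 0$ for $j, k = 1, 2$. Hence
$$(v_1 \mid v_2 + w_2) = (v_1 \mid v_2) = (v_1 + w_1 \mid v_2).$$
This implies
$$(Pu_1 \mid u_2) = (u_1 \mid Pu_2) \quad \text{for all } u_1, u_2 \in X.$$
Hence $P = P^*$, i.e., $P$ is self-adjoint.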

If $v \in M$, then $v = v + 0$, where $v \in M$, $0 \in M^\perp$. Hence $Pv = v$. By (23),
$$P^2u = Pv = v = Pu \quad \text{for all } u \in X.$$
Therefore, $P^2 = P$.
Ad (ii). Set $M := P(X)$. Since $P$ is linear, $M$ is a linear subspace of $X$. It follows from $P^2 = P$ that $M$ is closed. In fact, let $(u_n)$ be a sequence in $M$ such that $u_n \to u$ as $n \to \infty$. Then, $u_n = Pv_n$ for some $v_n$ and $Pu_n = P^2v_n = Pv_n = u_n$. Hence
$$u = \lim_{n \to \infty} u_n = \lim_{n \to \infty} Pu_n = Pu,$$
i.e., $u \in M$. Furthermore, since $P$ is self-adjoint and $P^2 = P$, we get
$$(Pu \mid (I - P)v) = (Pu \mid v) - (Pu \mid Pv) = (Pu \mid v) - (P^2u \mid v) = 0 \quad \text{for all } u, v \in X.$$
Hence $(I - P)v \in M^\perp$ for all $v \in X$. Thus, it follows from
$$u = Pu + (I - P)u, \qquad Pu \in M, \quad (I - P)u \in M^\perp,$$
that $P$ is the orthogonal projection from $X$ onto $M$. $\Box$
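The properties listed in Proposition 12(i) are easy to check for the simplest nontrivial case, a one-dimensional subspace $M = \operatorname{span}\{e\}$ of $\mathbb{R}^3$. The following sketch assumes this special case; the vectors and the helper names `project` and `inner` are illustrative choices.

```python
def inner(u, v):
    """Euclidean inner product (u | v) on R^3."""
    return sum(a * b for a, b in zip(u, v))

def project(u, e):
    """Orthogonal projection of u onto M = span{e}, for a nonzero vector e."""
    scale = inner(u, e) / inner(e, e)
    return [scale * b for b in e]

e = [1.0, 2.0, 2.0]
u = [3.0, -1.0, 4.0]
v = [0.5, 2.0, -1.0]

Pu, Pv = project(u, e), project(v, e)
idempotent_gap = max(abs(a - b) for a, b in zip(project(Pu, e), Pu))   # P^2 u - P u
symmetry_gap = abs(inner(Pu, v) - inner(u, Pv))                         # (Pu | v) - (u | Pv)
orthogonality = inner(Pu, [a - b for a, b in zip(u, Pu)])               # (Pu | u - Pu)
```

The three quantities vanish up to rounding, which reflects $P^2 = P$, $P = P^*$, and $u - Pu \in M^\perp$.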

Proposition 13. Let $A: D(A) \subseteq X \to X$ be a linear symmetric operator on the Hilbert space $X$, where $R(A)$ is dense in $X$. Then,
$$(A^{-1})^* = (A^*)^{-1},$$
where all the appearing inverse and adjoint operators exist.
If, in addition, $A$ is self-adjoint, then so is $A^{-1}$.
Proof. The operator $A$ is injective. In fact, $Au = 0$ implies
$$(u \mid Av) = (Au \mid v) = 0 \quad \text{for all } v \in D(A).$$
Since $R(A)$ is dense in $X$, $u = 0$.
The operator $A^*$ is also injective. In fact, if $A^*u = 0$, then it follows from
$$(Av \mid u) = (v \mid A^*u) = 0 \quad \text{for all } v \in D(A)$$
that $u = 0$.
Consequently, the inverse operators $A^{-1}$ and $(A^*)^{-1}$ exist. Since $D(A^{-1}) = R(A)$ and $R(A)$ is dense in $X$, the adjoint operator $(A^{-1})^*$ exists.
Set $B := (A^{-1})^*$. We have
$$(u \mid v) = (A^{-1}Au \mid v) = (Au \mid Bv) \quad \text{for all } u \in D(A), \ v \in D(B), \tag{24}$$
and
$$(z \mid w) = (AA^{-1}z \mid w) = (A^{-1}z \mid A^*w) \quad \text{for all } w \in D(A^*), \ z \in D(A^{-1}). \tag{25}$$
It follows from (24) that $Bv \in D(A^*)$ and
$$(u \mid v) = (u \mid A^*Bv) \quad \text{for all } u \in D(A), \ v \in D(B).$$
Since $D(A)$ is dense in $X$, this implies
$$A^*Bv = v \quad \text{for all } v \in D(B). \tag{26}$$
Analogously, it follows from (25) that
$$BA^*w = w \quad \text{for all } w \in D(A^*). \tag{27}$$
Hence $B = (A^*)^{-1}$, by Proposition 6 in Section 1.20, i.e., $(A^{-1})^* = (A^*)^{-1}$.
If $A$ is self-adjoint, then $A = A^*$, and hence $(A^{-1})^* = (A^*)^{-1} = A^{-1}$, i.e., $A^{-1}$ is also self-adjoint. $\Box$
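Proposition 14. Let $A\colon D(A) \subseteq X \to X$ be a linear operator where $D(A)$ is dense in the Hilbert space $X$ over $\mathbb{K}$. Suppose that there exists a sequence $(u_n)$ in $D(A^*)$ such that
$$u_n \to u \quad\text{and}\quad A^*u_n \to v \quad\text{in } X \text{ as } n \to \infty.$$
Then, $u \in D(A^*)$ and $A^*u = v$.

This implies that if $A$ is self-adjoint or skew-adjoint, then
$$u_n \to u \quad\text{and}\quad Au_n \to v \quad\text{in } X$$
imply $u \in D(A)$ and $Au = v$ (cf. Problem 5.3).
Proof. Letting $n \to \infty$, it follows from
$$(Aw \mid u_n) = (w \mid A^*u_n) \quad\text{for all } w \in D(A)$$
that $(Aw \mid u) = (w \mid v)$ for all $w \in D(A)$. Hence $u \in D(A^*)$ and $A^*u = v$. □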

Proposition 15. For a linear operator $U\colon X \to X$ on the Hilbert space $X$ over $\mathbb{K}$, the following four conditions are mutually equivalent:
(i) $U$ is unitary, i.e., $U$ is surjective and $(Uv \mid Uw) = (v \mid w)$ for all $v, w \in X$.
(ii) $UU^* = U^*U = I$.
(iii) $U$ is bijective and $U^{-1} = U^*$.
(iv) $U$ is surjective and $\|Uv\| = \|v\|$ for all $v \in X$.
Proof. (i) ⇒ (ii). It follows from $(Uv \mid Uw) = (v \mid w)$ for all $v, w \in X$ that
$$U^*(Uw) = w \quad\text{for all } w \in X,$$
i.e., $U^*U = I$. In particular, $U$ is injective and hence bijective, so that $U^{-1}$ exists. Hence $UU^*UU^{-1}w = w$ for all $w \in X$, i.e., $UU^* = I$.

(ii) ⇔ (iii). This follows from Proposition 6 in Section 1.20.
(ii) ⇒ (i). From $UU^* = I$ it follows that $D(U^*) = X$ and that $U$ is surjective, and $U^*U = I$ implies
$$(Uv \mid Uw) = (v \mid U^*Uw) = (v \mid w) \quad\text{for all } v, w \in X.$$
(i) ⇔ (iv). Observe that the inner product can be expressed by norms (cf. Problem 2.2). For example, if $X$ is a real Hilbert space, then
$$(u \mid v) = \tfrac14\bigl(\|u + v\|^2 - \|u - v\|^2\bigr) \quad\text{for all } u, v \in X.$$
Hence $\|Uv\| = \|v\|$ for all $v \in X$ is equivalent to $(Uv \mid Uw) = (v \mid w)$ for all $v, w \in X$. □
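As a small finite-dimensional illustration of the equivalence (i) ⇔ (iv), a plane rotation preserves both norms and inner products, and the real polarization identity recovers the inner product from norms. This is a sketch under stated assumptions: the space $\mathbb{R}^2$, the angle, the test vectors, and the helper names are chosen here for illustration only.

```python
import math

def rotate(theta, v):
    """Apply the rotation by angle theta, a unitary operator on R^2."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def inner(a, b):
    return a[0] * b[0] + a[1] * b[1]

def norm_sq(a):
    return inner(a, a)

def polarization(a, b):
    """Real polarization identity: (a|b) = (|a+b|^2 - |a-b|^2) / 4."""
    plus = (a[0] + b[0], a[1] + b[1])
    minus = (a[0] - b[0], a[1] - b[1])
    return (norm_sq(plus) - norm_sq(minus)) / 4.0

theta = 0.7                      # assumed sample angle
v, w = (1.0, 2.0), (-3.0, 0.5)   # assumed sample vectors
Uv, Uw = rotate(theta, v), rotate(theta, w)
```

Comparing `inner(Uv, Uw)` with `inner(v, w)` tests condition (i); comparing `norm_sq(Uv)` with `norm_sq(v)` tests condition (iv).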
5.3 The Energetic Space

Definition 1. The linear operator $B\colon D(B) \subseteq X \to X$ on the real Hilbert space $X$ is called strongly monotone iff
$$(Bu \mid u) \ge c\|u\|^2 \quad\text{for all } u \in D(B) \text{ and fixed } c > 0. \tag{28}$$
(H) The operator $B\colon D(B) \subseteq X \to X$ is linear, symmetric, and strongly monotone on the real Hilbert space $X$.
Let $(\cdot \mid \cdot)$ and $\|\cdot\|$ denote the inner product and the norm on $X$, respectively.
Let us also introduce the energetic inner product
$$(u \mid v)_E := (Bu \mid v) \quad\text{for all } u, v \in D(B),$$
and the energetic norm
$$\|u\|_E := (u \mid u)_E^{1/2} \quad\text{for all } u \in D(B).$$
By (28), $(u \mid u)_E = 0$ implies $u = 0$. The symmetry of $B$ yields
$$(u \mid v)_E = (v \mid u)_E \quad\text{for all } u, v \in D(B).$$
Thus, $(\cdot \mid \cdot)_E$ represents an inner product on the linear space $D(B)$.
Definition 2. Let the operator $B$ be given as in (H). Then, the energetic space $X_E$ of the operator $B$ consists precisely of all the $u \in X$ that have the following two properties:
(i) There exists a sequence $(u_n)$ in $D(B)$ such that $u_n \to u$ in $X$ as $n \to \infty$.
(ii) The sequence $(u_n)$ is Cauchy with respect to the energetic norm $\|\cdot\|_E$.
Each sequence $(u_n)$ having the properties (i) and (ii) is called an admissible sequence for $u \in X_E$.
For all $u, v \in X_E$, we set
$$(u \mid v)_E := \lim_{n\to\infty} (u_n \mid v_n)_E,$$
where $(u_n)$ and $(v_n)$ are admissible for $u$ and $v$, respectively. We shall show below that this limit exists and is independent of the chosen admissible sequences.
Proposition 3. Assume (H). Then, the following hold true:
(i) The energetic space $X_E$ becomes a real Hilbert space with respect to the energetic inner product $(\cdot \mid \cdot)_E$. The set $D(B)$ is dense in $X_E$.
(ii) The embedding $X_E \subseteq X$ is continuous, i.e.,
$$c\|u\|^2 \le \|u\|_E^2 \quad\text{for all } u \in X_E.$$
(iii) There exists a continuous embedding "$X \subseteq X_E^*$" given by the operator $j\colon X \to X_E^*$, where
$$j(f)(v) := (f \mid v) \quad\text{for all } v \in X_E \text{ and each fixed } f \in X.$$
If we identify $f$ with $j(f)$, then $X$ becomes a subset of $X_E^*$, i.e.,
$$X_E \subseteq X \subseteq X_E^*,$$
and
$$\langle f, v\rangle_E = (f \mid v) \quad\text{for all } f \in X,\ v \in X_E, \tag{29}$$
where we set
$$\langle g, v\rangle_E := g(v) \quad\text{for all } g \in X_E^*,\ v \in X_E.$$
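Proof. Ad (i), (ii). Step 1: Let $(u_n)$ be an admissible sequence for $u = 0$. We want to show that
$$\lim_{n\to\infty} \|u_n\|_E = 0.$$
In fact, since
$$\bigl|\,\|u_n\|_E - \|u_m\|_E\,\bigr| \le \|u_n - u_m\|_E \tag{30}$$
and $(u_n)$ is Cauchy with respect to $\|\cdot\|_E$, the sequence $(\|u_n\|_E)$ is also Cauchy. Thus, the limit
$$a := \lim_{n\to\infty} \|u_n\|_E$$
exists. Furthermore, observe that
$$|(u_n \mid u_m)_E - (u_r \mid u_r)_E| \le |(u_n - u_r \mid u_m)_E| + |(u_r \mid u_m - u_r)_E| \le \|u_n - u_r\|_E \|u_m\|_E + \|u_r\|_E \|u_m - u_r\|_E < \varepsilon \tag{31}$$
for all $n, m, r \ge n_0(\varepsilon)$. Since $u_n \to 0$ in $X$, we get $(u_n \mid u_m)_E = (u_n \mid Bu_m) \to 0$ as $n \to \infty$. Letting $n \to \infty$ and $r \to \infty$ in (31) for fixed $m$, we get $a^2 \le \varepsilon$ for all $\varepsilon > 0$. Hence $a = 0$.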
Step 2: Let $u \in X_E$. Choose an admissible sequence $(u_n)$ for $u$. By (30), $(\|u_n\|_E)$ is Cauchy. Define
$$\|u\|_E := \lim_{n\to\infty} \|u_n\|_E.$$
Let $(v_n)$ be another admissible sequence for $u$. We want to show that
$$\|u\|_E = \lim_{n\to\infty} \|v_n\|_E.$$
In fact, since the sequence $(u_n - v_n)$ is admissible for $w = 0$, we obtain that
$$\bigl|\,\|u_n\|_E - \|v_n\|_E\,\bigr| \le \|u_n - v_n\|_E \to 0 \quad\text{as } n \to \infty,$$
by Step 1.
Step 3: For each $u_n, v_n \in D(B)$, we have the identity
$$(u_n \mid v_n)_E = \tfrac14\bigl(\|u_n + v_n\|_E^2 - \|u_n - v_n\|_E^2\bigr).$$
Let $(u_n)$ and $(v_n)$ be admissible for $u \in X_E$ and $v \in X_E$, respectively. Then, the sequence $(u_n \pm v_n)$ is admissible for $u \pm v$. By Step 2, the limit
$$(u \mid v)_E := \lim_{n\to\infty} (u_n \mid v_n)_E \tag{32}$$
exists and is independent of the chosen admissible sequences.
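Step 4: Let $(u_n)$ be an admissible sequence for $u \in X_E$. Then
$$\lim_{n\to\infty} \|u - u_n\|_E = 0.$$
In fact, since $(u_n)$ is Cauchy with respect to $\|\cdot\|_E$,
$$\|u_n - u_m\|_E < \varepsilon \quad\text{for all } n, m \ge n_0(\varepsilon).$$
For each fixed $m$, the sequence $(u_n - u_m)$ is admissible for $u - u_m$. Letting $n \to \infty$, we get
$$\|u - u_m\|_E \le \varepsilon \quad\text{for all } m \ge n_0(\varepsilon),$$
by Step 2.
Step 5: Let $(u_n)$ be admissible for $u \in X_E$. By (28),
$$c\|u_n\|^2 \le \|u_n\|_E^2 \quad\text{for all } n.$$
Letting $n \to \infty$, this implies
$$c\|u\|^2 \le \|u\|_E^2 \quad\text{for all } u \in X_E. \tag{33}$$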
Step 6: $X_E$ is a pre-Hilbert space with respect to $(\cdot \mid \cdot)_E$. In fact, $(u \mid u)_E = 0$ implies $u = 0$, by (33). Observe that $(\cdot \mid \cdot)_E$ is an inner product on $D(B)$. Using admissible sequences and the limiting relation (32), it follows that $(\cdot \mid \cdot)_E$ is an inner product on the real linear space $X_E$.
Step 7: The set $D(B)$ is dense in $X_E$. In fact, let $u \in X_E$. Then, there exists a sequence $(u_n)$ in $D(B)$ that is admissible for $u$. By Step 4, for each $\varepsilon > 0$, there is a $u_n \in D(B)$ such that $\|u - u_n\|_E < \varepsilon$.
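Step 8: $X_E$ is a Hilbert space. To prove this, let $(u_n)$ be a Cauchy sequence in $X_E$. We have to show that there exists a $u \in X_E$ such that
$$u_n \to u \quad\text{in } X_E \text{ as } n \to \infty. \tag{34}$$
In fact, since $D(B)$ is dense in $X_E$, there exists a sequence $(v_n)$ in $D(B)$ such that
$$\|u_n - v_n\|_E \le \frac1n \quad\text{for all } n.$$
It follows from
$$\|v_n - v_m\|_E \le \|v_n - u_n\|_E + \|u_n - u_m\|_E + \|u_m - v_m\|_E < \varepsilon$$
for all $n, m \ge n_0(\varepsilon)$, that $(v_n)$ is Cauchy in $X_E$. By (33), $(v_n)$ is also Cauchy in $X$. Thus,
$$v_n \to u \quad\text{in } X \text{ as } n \to \infty.$$
Hence the sequence $(v_n)$ is admissible for $u$. By Step 4,
$$\|v_n - u\|_E \to 0 \quad\text{as } n \to \infty.$$
Therefore,
$$\|u_n - u\|_E \le \|u_n - v_n\|_E + \|v_n - u\|_E \to 0 \quad\text{as } n \to \infty.$$
This is (34).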
Ad (iii). Let $f \in X$. Set
$$j(f)(v) := (f \mid v) \quad\text{for all } v \in X_E.$$
By (33),
$$|(f \mid v)| \le \|f\|\,\|v\| \le c^{-1/2}\|f\|\,\|v\|_E \quad\text{for all } v \in X_E.$$
Hence $j(f) \in X_E^*$ and $\|j(f)\|_{X_E^*} \le c^{-1/2}\|f\|$. Thus, the operator
$$j\colon X \to X_E^*$$
is linear and continuous. In addition, if $j(f) = j(g)$, then
$$(f - g \mid v) = 0 \quad\text{for all } v \in X_E.$$
Since $D(B) \subseteq X_E$ and $D(B)$ is dense in $X$, $f = g$. Hence $j$ is injective. □
Standard Example 4 (The Laplacian). Let $G$ be a nonempty bounded open set in $\mathbb{R}^N$, $N \ge 1$. Set $X := L_2(G)$ and
$$Bu := -\Delta u, \quad D(B) := C_0^\infty(G).$$
Then, the following are met:
(i) The operator $B\colon D(B) \subseteq X \to X$ is linear, symmetric, and strongly monotone.
(ii) The corresponding energetic space is given through
$$X_E = \mathring{W}_2^1(G),$$
i.e., $X_E$ is a Sobolev space.
(iii) For all $u, v \in X_E$,
$$(u \mid v)_E = \int_G \sum_{j=1}^N \partial_j u\,\partial_j v\,dx,$$
where the derivatives $\partial_j u$ and $\partial_j v$ are to be understood in the generalized sense.
(iv) The embedding $X_E \subseteq X$ is compact.

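Proof. Ad (i). Integration by parts yields
$$(Bu \mid v) = \int_G (-\Delta u)v\,dx = \int_G u(-\Delta v)\,dx = (u \mid Bv) \quad\text{for all } u, v \in D(B).$$
Hence $B$ is symmetric.
For all $u, v \in D(B)$, integration by parts yields
$$(u \mid v)_E = (Bu \mid v) = \int_G \sum_{j=1}^N \partial_j u\,\partial_j v\,dx. \tag{35}$$
By the Poincaré-Friedrichs inequality from Section 2.5.6, we get
$$c\|u\|^2 \le (Bu \mid u) \quad\text{for all } u \in D(B),$$
i.e., $B$ is strongly monotone.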

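Ad (ii). Let $u \in X_E$. Then there exists an admissible sequence $(u_n)$ for $u$, i.e., $u_n \in D(B)$ for all $n$,
$$u_n \to u \quad\text{in } X \text{ as } n \to \infty, \tag{36}$$
and $(u_n)$ is Cauchy in $X_E$. Hence $(\partial_j u_n)$ is Cauchy in $X$, since
$$\|u_n - u_m\|_E^2 = \sum_{j=1}^N \|\partial_j u_n - \partial_j u_m\|^2,$$
by (35). Thus, for each $j = 1, \dots, N$, there exists a function $v_j \in X$ such that
$$\partial_j u_n \to v_j \quad\text{in } X \text{ as } n \to \infty. \tag{37}$$
Letting $n \to \infty$, it follows from
$$\int_G u_n\,\partial_j w\,dx = -\int_G (\partial_j u_n)\,w\,dx \quad\text{for all } w \in C_0^\infty(G)$$
that
$$\int_G u\,\partial_j w\,dx = -\int_G v_j\,w\,dx \quad\text{for all } w \in C_0^\infty(G).$$
Hence $v_j = \partial_j u$ in the generalized sense. By (36) and (37), $u \in \mathring{W}_2^1(G)$.
Conversely, let $u \in \mathring{W}_2^1(G)$. Then there exists a sequence $(u_n)$ in $D(B)$ such that (36) and (37) hold true with $v_j := \partial_j u$. Thus, $(u_n)$ is an admissible sequence for $u$, and hence $u \in X_E$.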
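Ad (iii). Let $u, v \in X_E$, and let $(u_n)$ and $(v_n)$ be admissible sequences for $u$ and $v$, respectively. Letting $n \to \infty$, it follows from
$$(u_n \mid v_n)_E = \int_G \sum_{j=1}^N \partial_j u_n\,\partial_j v_n\,dx \quad\text{for all } n$$
that
$$(u \mid v)_E = \int_G \sum_{j=1}^N \partial_j u\,\partial_j v\,dx.$$
Ad (iv). This will be proved in Section 5.7. □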
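A finite-difference analogue may help to visualize the statements of Standard Example 4 for $N = 1$ and $G = \,]0,1[$. The sketch below is an illustration only: the grid, the discrete inner product, the test functions, and all names are assumptions introduced here. It checks the symmetry of the discrete operator, the discrete counterpart of (35) (summation by parts), and a strong monotonicity estimate with the constant $c = 4$, which is valid for this particular discretization.

```python
def apply_B(u, h):
    """Discrete version of -u'' with zero boundary values (Dirichlet)."""
    padded = [0.0] + list(u) + [0.0]
    return [(-padded[j - 1] + 2.0 * padded[j] - padded[j + 1]) / h**2
            for j in range(1, len(u) + 1)]

def inner(u, v, h):
    """Discrete L2 inner product h * sum(u_j v_j)."""
    return h * sum(a * b for a, b in zip(u, v))

def energetic_inner(u, v, h):
    """Discrete analogue of the integral of u'v', using forward differences."""
    pu = [0.0] + list(u) + [0.0]
    pv = [0.0] + list(v) + [0.0]
    return h * sum((pu[j + 1] - pu[j]) * (pv[j + 1] - pv[j]) / h**2
                   for j in range(len(pu) - 1))

n = 9                                   # number of interior grid points (assumed)
h = 1.0 / (n + 1)
x = [(j + 1) * h for j in range(n)]
u = [t * (1.0 - t) for t in x]          # sample functions vanishing at 0 and 1
v = [t * t * (1.0 - t) for t in x]

lhs = inner(apply_B(u, h), v, h)        # (Bu | v)
rhs = inner(u, apply_B(v, h), h)        # (u | Bv)
energy = energetic_inner(u, v, h)       # discrete (u | v)_E
```

The equality of `lhs` and `energy` is the discrete form of integration by parts used in the proof of (i).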
5.4 The Energetic Extension

Definition 1. Let $B\colon D(B) \subseteq X \to X$ be a linear, symmetric, strongly monotone operator on the real Hilbert space $X$. Then, the duality map
$$B_E\colon X_E \to X_E^*$$
of the energetic space $X_E$ is called the energetic extension of the operator $B$.
By Section 2.11, the operator $B_E$ is defined through
$$\langle B_E u, v\rangle_E = (u \mid v)_E \quad\text{for all } u, v \in X_E. \tag{38}$$
Moreover, $B_E\colon X_E \to X_E^*$ is a linear homeomorphism with
$$\|B_E u\|_{X_E^*} = \|u\|_E \quad\text{for all } u \in X_E.$$
Proposition 2. The operator $B_E$ is an extension of $B$, i.e.,
$$Bu = B_E u \quad\text{for all } u \in D(B).$$
Proof. Let $u \in D(B)$ be given. It follows from (29) and (38) that
$$\langle B_E u, v\rangle_E = (u \mid v)_E = (Bu \mid v) = \langle Bu, v\rangle_E \quad\text{for all } v \in D(B).$$
Since $D(B)$ is dense in $X_E$, this implies
$$\langle B_E u, v\rangle_E = \langle Bu, v\rangle_E \quad\text{for all } v \in X_E.$$
Hence $B_E u = Bu$. □
5.5 The Friedrichs Extension of Symmetric Operators

(H) $B\colon D(B) \subseteq X \to X$ is a linear, symmetric, strongly monotone operator on the real Hilbert space $X$. In particular, this means that $D(B)$ is dense in $X$ and
$$(Bu \mid u) \ge c\|u\|^2 \quad\text{for all } u \in D(B) \text{ and fixed } c > 0.$$
Definition 1. The Friedrichs extension $A\colon D(A) \subseteq X \to X$ of the operator $B$ is defined through
$$Au := B_E u \quad\text{for all } u \in D(A),$$
where $D(A) := \{u \in X_E : B_E u \in X\}$.
By Proposition 3(iii) in Section 5.3, we obtain that $u \in D(A)$ iff there exists an $f \in X$ such that
$$\langle B_E u, v\rangle_E = (f \mid v) \quad\text{for all } v \in X_E.$$
Observe that $D(B) \subseteq X_E \subseteq X \subseteq X_E^*$ and
$$B \subseteq A \subseteq B_E.$$
Theorem 5.A. Assume (H). The Friedrichs extension $A$ possesses the following properties:
(i) The operator $A\colon D(A) \subseteq X \to X$ is self-adjoint and bijective, and
$$(Au \mid u) \ge c\|u\|^2 \quad\text{for all } u \in D(A).$$
(ii) The inverse operator
$$A^{-1}\colon X \to X$$
is linear, continuous, and self-adjoint.
(iii) If, in addition, the embedding $X_E \subseteq X$ is compact, then $A^{-1}\colon X \to X$ is compact.
Proof. The operator $A$ is a restriction of $B_E$. Since $B_E\colon X_E \to X_E^*$ is bijective, so is the operator $A\colon D(A) \to X$. For all $u \in D(A)$,
$$(Au \mid u) = \langle Au, u\rangle_E = \langle B_E u, u\rangle_E = (u \mid u)_E \ge c\|u\|^2,$$
by (33).
The operator $B_E^{-1}\colon X_E^* \to X_E$ is linear and continuous. Since the embedding $X \subseteq X_E^*$ is continuous, the restriction
$$B_E^{-1}\colon X \to X_E$$
is also continuous.¹ This operator is identical to $A^{-1}$. Hence the operator
$$A^{-1}\colon X \to X_E \tag{39}$$
is linear and continuous. In turn, since the embedding $X_E \subseteq X$ is continuous, the operator
$$A^{-1}\colon X \to X \tag{40}$$
is linear and continuous. Moreover, if the embedding $X_E \subseteq X$ is compact, then it follows from (39) that the operator (40) is compact.
Let $f, g \in X$. By (29) and (38),
$$(A^{-1}f \mid A^{-1}g)_E = \langle B_E(A^{-1}f), A^{-1}g\rangle_E = \langle f, A^{-1}g\rangle_E = (f \mid A^{-1}g).$$
Since $(A^{-1}f \mid A^{-1}g)_E = (A^{-1}g \mid A^{-1}f)_E$, we get
$$(f \mid A^{-1}g) = (g \mid A^{-1}f) = (A^{-1}f \mid g) \quad\text{for all } f, g \in X.$$
Thus, the linear, continuous operator $A^{-1}\colon X \to X$ is symmetric, and hence $A^{-1}$ is self-adjoint. Finally, by Proposition 13 in Section 5.2, the operator $A$ is self-adjoint. □
5.5.1 Variational Problem

For given $f \in X$, let us consider the following two variational problems:
$$2^{-1}(u \mid u)_E - (f \mid u) = \min!, \quad u \in X_E, \tag{41}$$
and
$$2^{-1}(Au \mid u) - (f \mid u) = \min!, \quad u \in D(A). \tag{41*}$$
Proposition 2. Let $A$ be the Friedrichs extension of the operator $B$ given through (H). Then, the two variational problems (41) and (41*) have the unique solution
$$u_0 = A^{-1}f.$$
lObserve that, for all U E X,

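Proof. Ad (41). By (29) and (38), it follows that, for all $u \in X_E$,
$$(f \mid u) = \langle f, u\rangle_E = \langle Au_0, u\rangle_E = \langle B_E u_0, u\rangle_E = (u_0 \mid u)_E.$$
Hence
$$2^{-1}(u - u_0 \mid u - u_0)_E = 2^{-1}(u \mid u)_E - (f \mid u) + 2^{-1}(u_0 \mid u_0)_E.$$
Thus, the minimum problem (41) is equivalent to
$$2^{-1}\|u - u_0\|_E^2 = \min!, \quad u \in X_E.$$
Obviously, this problem has the unique solution $u_0$.

Ad (41*). Since $(u_0 \mid Au) = (Au_0 \mid u) = (f \mid u)$,
$$2^{-1}(Au - Au_0 \mid u - u_0) = 2^{-1}(Au \mid u) - (f \mid u) + 2^{-1}(Au_0 \mid u_0).$$
Therefore, problem (41*) is equivalent to
$$2^{-1}(A(u - u_0) \mid u - u_0) = \min!, \quad u \in D(A).$$
Because of $(Av \mid v) \ge c\|v\|^2$ for all $v \in D(A)$, this problem has the unique solution $u = u_0$. □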
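In finite dimensions, the argument for (41*) reduces to the familiar fact that a quadratic functional with a symmetric positive definite matrix is minimized exactly at the solution of the linear system. The following sketch uses an assumed $2 \times 2$ matrix, an assumed right-hand side, and illustrative helper names; it checks that the energy gap at a perturbed point equals $2^{-1}(Ad \mid d)$ for the perturbation $d$, as in the identity used above.

```python
A = [[2.0, 1.0], [1.0, 3.0]]   # symmetric, positive definite (assumed example)
f = [1.0, 2.0]                 # assumed right-hand side

def matvec(M, u):
    return [M[0][0] * u[0] + M[0][1] * u[1],
            M[1][0] * u[0] + M[1][1] * u[1]]

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def energy(u):
    """The functional 2^{-1}(Au | u) - (f | u)."""
    return 0.5 * dot(matvec(A, u), u) - dot(f, u)

def solve_2x2(M, b):
    """Solve M u = b by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(M[1][1] * b[0] - M[0][1] * b[1]) / det,
            (M[0][0] * b[1] - M[1][0] * b[0]) / det]

u0 = solve_2x2(A, f)           # u0 = A^{-1} f

perturbations = [[0.1, 0.0], [0.0, -0.2], [-0.3, 0.4]]
gaps = [energy([u0[0] + d[0], u0[1] + d[1]]) - energy(u0)
        for d in perturbations]
half_quadratic = [0.5 * dot(matvec(A, d), d) for d in perturbations]
```

Every entry of `gaps` is positive, so `u0` is the unique minimizer among the sampled points.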
5.5.2 Operator Equation

Suppose we are given the operator equation
$$Bu = f, \quad u \in D(B), \tag{42}$$
where the operator $B$ satisfies condition (H). Let $A$ be the Friedrichs extension of $B$. Note that $D(B) \subseteq D(A) \subseteq X$ and $Bu = Au$ for all $u \in D(B)$. Consequently, each solution of (42) is also a solution to the following equation:
$$Au = f, \quad u \in D(A). \tag{43}$$
Let us also consider the problem
$$(u \mid Bv) = (f \mid v) \quad\text{for fixed } u \in X_E \text{ and all } v \in D(B), \tag{43*}$$
along with the minimum problem
$$2^{-1}(u \mid u)_E - (f \mid u) = \min!, \quad u \in X_E. \tag{43**}$$
Theorem 5.B (The abstract Dirichlet problem). Assume (H). Let $f \in X$. Then, each of the three problems (43), (43*), and (43**) has the unique solution $u_0 = A^{-1}f$.
The solution $u_0$ is called a generalized solution to the original "classic problem" (42). We will show in Section 5.6 that (43**) corresponds to the Dirichlet problem and that the solution $u_0$ of (43*) is a solution to the Poisson equation, in the sense of the theory of generalized functions.
Proof. By Theorem 5.A and Proposition 2, problems (43) and (43**) have the unique solution $u_0$.
Equation (43*) is identical to
$$(Av \mid u) = (f \mid v) \quad\text{for fixed } u \in X_E \text{ and all } v \in D(B).$$
Since $(Av \mid u) = \langle B_E v, u\rangle_E = (v \mid u)_E$ and $(v \mid u)_E = (u \mid v)_E$ for all $u \in X_E$ and $v \in D(B)$, equation (43*) is equivalent to
$$\langle B_E u, v\rangle_E = \langle f, v\rangle_E \quad\text{for fixed } u \in X_E \text{ and all } v \in D(B).$$
Since $D(B)$ is dense in $X_E$, this is equivalent to
$$B_E u = f, \quad u \in X_E$$
(cf. Problem 1.10). Finally, since $f \in X$, this is equivalent to $Au = f$, $u \in D(A)$. □
5.5.3 Eigenvalue Problem

We now consider the operator equation
$$Bu = \mu u + f, \quad u \in D(B),\ \mu \in \mathbb{R},\ u \ne 0, \tag{44}$$
along with the following two generalized problems:
$$Au = \mu u + f, \quad u \in D(A),\ \mu \in \mathbb{R}, \tag{44*}$$
and
$$(u \mid Bv) = \mu(u \mid v) + (f \mid v) \quad\text{for fixed } u \in X_E,\ \mu \in \mathbb{R}, \text{ and all } v \in D(B). \tag{44**}$$
It follows as in the proof of Theorem 5.B that
Problem (44*) is equivalent to (44**).
(H1) Let $X$ be a real separable Hilbert space with $\dim X = \infty$. We are given the operator $B\colon D(B) \subseteq X \to X$ as in (H), i.e., $B$ is linear and symmetric, along with $(Bu \mid u) \ge c\|u\|^2$ for all $u \in D(B)$ and fixed $c > 0$. Let
$$A\colon D(A) \subseteq X \to X$$
be the Friedrichs extension of $B$. In addition, we assume that the embedding $X_E \subseteq X$ is compact.
Theorem 5.C (Eigenvalue problem). Assume (H1) and set $f = 0$. Then, the following hold true:
(i) The operator $A$ has a countable system $\{u_n, \mu_n\}$ of eigensolutions that contains all the eigensolutions of $A$.
(ii) The eigenvectors $\{u_n\}$ form a complete orthonormal system in the Hilbert space $X$. In addition, $u_n \in X_E$ for all $n$.
(iii) All the eigenvalues $\mu_n$ have finite multiplicity. Furthermore, we have
$$0 < c \le \mu_1 \le \mu_2 \le \cdots \quad\text{and}\quad \mu_n \to +\infty \text{ as } n \to \infty.$$
Proof. According to Theorem 5.A, $(Au \mid u) \ge c\|u\|^2$ for all $u \in D(A)$. If $u$ is a solution of (44*) with $f = 0$ and $u \ne 0$, then
$$\mu(u \mid u) = (Au \mid u) \ge c(u \mid u),$$
and hence $\mu \ge c > 0$.
Again by Theorem 5.A, the operator $A^{-1}\colon X \to X$ is symmetric and compact. Let $\lambda := \mu^{-1}$. Then the eigenvalue problem (44**) with $f = 0$ is equivalent to
$$A^{-1}u = \lambda u, \quad u \in X. \tag{45}$$
Observe that $\lambda = 0$ is not an eigenvalue of $A^{-1}$, since $A^{-1}u = 0$ implies $u = 0$. The assertion follows now from Theorem 4.A in Section 4.2 applied to the inverse operator $A^{-1}$. □
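The structure described in Theorem 5.C can be observed explicitly for a finite-difference model of $-u''$ on $]0,1[$ with zero boundary values. The sketch below is illustrative only: the grid size, the discrete inner product, and the function names are assumptions made here. It uses the well-known closed form of the discrete eigenpairs, $\mu_k = 4h^{-2}\sin^2(k\pi h/2)$ with grid eigenvectors $\sqrt{2}\sin(k\pi x_j)$, and checks that they are eigensolutions, orthonormal, positive, and increasing.

```python
import math

def apply_B(u, h):
    """Discrete version of -u'' with zero boundary values (Dirichlet)."""
    padded = [0.0] + list(u) + [0.0]
    return [(-padded[j - 1] + 2.0 * padded[j] - padded[j + 1]) / h**2
            for j in range(1, len(u) + 1)]

def inner(u, v, h):
    """Discrete L2 inner product h * sum(u_j v_j)."""
    return h * sum(a * b for a, b in zip(u, v))

n = 19                                   # interior grid points (assumed)
h = 1.0 / (n + 1)
x = [(j + 1) * h for j in range(n)]

def eigenpair(k):
    """Closed-form k-th eigenvalue and normalized eigenvector of apply_B."""
    mu = 4.0 / h**2 * math.sin(k * math.pi * h / 2.0) ** 2
    u = [math.sqrt(2.0) * math.sin(k * math.pi * t) for t in x]
    return mu, u

def residual(k):
    """Maximum norm of B u_k - mu_k u_k."""
    mu, u = eigenpair(k)
    return max(abs(b - mu * a) for a, b in zip(u, apply_B(u, h)))

mus = [eigenpair(k)[0] for k in range(1, n + 1)]
```

The smallest value `mus[0]` is close to $\pi^2$, the smallest Dirichlet eigenvalue of $-u''$ on the unit interval.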
5.5.4 The Fredholm Alternative

Theorem 5.D. Assume (H1). We are given $f \in X$ and $\mu \in \mathbb{R}$. Then, the following hold true:
(i) If $\mu$ is not an eigenvalue of the operator $A$, then equation (44*) has a unique solution $u$.
(ii) If $\mu$ is an eigenvalue of $A$, then equation (44*) has a solution $u$ iff
$$(f \mid v) = 0$$
for all eigenvectors $v$ of $A$ corresponding to $\mu$.
Proof. By Theorem 5.A, the operator $A^{-1}\colon X \to X$ is symmetric and compact.
Case 1: Let $\mu \ne 0$. Equation (44*) is equivalent to
$$A^{-1}u - \lambda u = -\lambda A^{-1}f, \quad u \in X, \tag{46}$$
where $\lambda := \mu^{-1}$.
Ad (i). This follows from Theorem 4.B in Section 4.3 applied to $A^{-1}$.
Ad (ii). By Theorem 4.B, equation (46) has a solution $u$ iff
$$(A^{-1}f \mid v) = 0 \tag{47}$$
for all eigenvectors $v$ of $A^{-1}$ corresponding to $\lambda$. Observe that
$$(A^{-1}f \mid v) = (f \mid A^{-1}v) = \lambda(f \mid v).$$
Thus, condition (47) is equivalent to $(f \mid v) = 0$ for all eigenvectors $v$ of $A^{-1}$ corresponding to $\lambda$. In turn, this is equivalent to assertion (ii).
Case 2: If $\mu = 0$, then $\mu$ is not an eigenvalue of $A$. By Theorem 5.B, equation (44*) has a unique solution $u$. □
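The alternative in Theorem 5.D has a transparent finite-dimensional counterpart, which can be sketched with an eigenvector expansion. Everything specific below is an assumption for illustration: the symmetric matrix $\begin{pmatrix}2&1\\1&2\end{pmatrix}$ with eigenvalues $1$ and $3$, the tolerance, and the function names. The routine solves $Au - \mu u = f$ componentwise and reports unsolvability when $\mu$ is an eigenvalue and $f$ is not orthogonal to the corresponding eigenvector.

```python
import math

s = 1.0 / math.sqrt(2.0)
# Orthonormal eigenpairs of the assumed matrix [[2, 1], [1, 2]].
eigenpairs = [(1.0, (s, -s)), (3.0, (s, s))]

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def apply_A(u):
    return (2.0 * u[0] + u[1], u[0] + 2.0 * u[1])

def solve(mu, f, tol=1e-12):
    """Solve A u - mu u = f by eigen-expansion; return None if unsolvable."""
    u = [0.0, 0.0]
    for lam, v in eigenpairs:
        coeff = dot(f, v)
        if abs(lam - mu) < tol:
            if abs(coeff) > tol:
                return None          # f is not orthogonal to the eigenvector
            continue                 # this component is free; choose 0
        u[0] += coeff / (lam - mu) * v[0]
        u[1] += coeff / (lam - mu) * v[1]
    return tuple(u)

def residual(mu, u, f):
    Au = apply_A(u)
    return max(abs(Au[i] - mu * u[i] - f[i]) for i in range(2))
```

For example, with the eigenvalue $\mu = 1$ the call `solve(1.0, (1.0, 0.0))` returns `None`, whereas `solve(1.0, (1.0, 1.0))` returns a solution because $(1,1)$ is orthogonal to the eigenvector $(1,-1)/\sqrt{2}$.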
5.6 Applications to Boundary-Eigenvalue Problems for the Laplace Equation

Let us consider the following fundamental classical boundary-eigenvalue problem for the Laplacian:
$$-\Delta u - \mu u = f \ \text{ on } G,\ \mu \in \mathbb{R}, \qquad u = 0 \ \text{ on } \partial G. \tag{48}$$
Let $G$ be a nonempty bounded open set in $\mathbb{R}^N$. For $\mu = 0$, (48) is called the Poisson equation.
Definition 1. The generalized problem to (48) reads as follows. We are looking for a function $u \in \mathring{W}_2^1(G)$ such that
$$\int_G u(-\Delta v)\,dx - \mu\int_G uv\,dx = \int_G fv\,dx \quad\text{for all } v \in C_0^\infty(G). \tag{48*}$$
This means that equation (48) is satisfied in the sense of the theory of generalized functions. Formally, we obtain (48*) by multiplying (48) with $v$ and subsequent integration by parts.
Proposition 2 (Eigenvalue problem). Let $f \equiv 0$. Then, the following are met:
(i) The generalized problem (48*) has a countable system $\{u_n, \mu_n\}$ of eigensolutions that contains all the possible eigenvalues.
(ii) The system $\{u_n\}$ of eigenfunctions forms a complete orthonormal system in the Hilbert space $L_2(G)$. In addition, $u_n \in \mathring{W}_2^1(G)$ for all $n$.
(iii) All the eigenvalues $\mu_n$ have finite multiplicity. Furthermore, we have
$$0 < \mu_1 \le \mu_2 \le \cdots \quad\text{and}\quad \mu_n \to +\infty \text{ as } n \to \infty.$$
Proof. We set $X := L_2(G)$ and
$$Bu := -\Delta u, \quad\text{where } D(B) := C_0^\infty(G).$$
By Standard Example 4 in Section 5.3, the operator $B\colon D(B) \subseteq X \to X$ is linear, symmetric, and strongly monotone. The corresponding energetic space is given through
$$X_E = \mathring{W}_2^1(G),$$
and the embedding $X_E \subseteq X$ is compact. Moreover,
$$(u \mid v)_E = \int_G \sum_{j=1}^N \partial_j u\,\partial_j v\,dx \quad\text{for all } u, v \in X_E.$$
Let $A$ be the Friedrichs extension of $B$. Then, problem (48*) corresponds to (44**). The assertion follows now from Theorem 5.C. □
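Denote the energetic extension $B_E$ of $B = -\Delta$ by $-\Delta_E$. Then,
$$-\Delta_E\colon \mathring{W}_2^1(G) \to \mathring{W}_2^1(G)^*$$
is a linear homeomorphism, by Section 5.4. More precisely, the operator $-\Delta_E$ is identical to the duality map of the Sobolev space $\mathring{W}_2^1(G)$. Explicitly,
$$\langle -\Delta_E u, v\rangle_E = (u \mid v)_E \quad\text{for all } u, v \in X_E.$$
This means
$$\langle -\Delta_E u, v\rangle_E = \int_G \sum_{j=1}^N \partial_j u\,\partial_j v\,dx \quad\text{for all } u, v \in \mathring{W}_2^1(G).$$
Recall that the Friedrichs extension $A$ of $B = -\Delta$ is given through
$$Au = -\Delta_E u \quad\text{for all } u \in D(A),$$
where $D(A)$ consists precisely of all those functions $u \in \mathring{W}_2^1(G)$ for which $-\Delta_E u \in L_2(G)$. This means that $u \in D(A)$ iff $u \in \mathring{W}_2^1(G)$ and there exists a function $f \in L_2(G)$ such that
$$\int_G \sum_{j=1}^N \partial_j u\,\partial_j v\,dx = \int_G fv\,dx \quad\text{for all } v \in \mathring{W}_2^1(G).$$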

Proposition 3 (The Fredholm alternative). We are given $f \in L_2(G)$ and $\mu \in \mathbb{R}$. Then, the following hold true:
(i) If $\mu$ is not an eigenvalue, then problem (48*) has a unique solution.
(ii) If $\mu$ is an eigenvalue, then (48*) has a solution iff
$$\int_G fv\,dx = 0$$
for all eigenfunctions $v \in \mathring{W}_2^1(G)$ corresponding to $\mu$.
Proof. Problem (48*) corresponds to (44**), which is equivalent to (44*). Thus, the assertion follows from Theorem 5.D. □
Proposition 4 (The variational problem). Let $\mu = 0$, and let $f \in L_2(G)$. Then, the unique solution $u$ of problem (48*) is equal to the unique solution of the following minimum problem:
$$2^{-1}\int_G \sum_{j=1}^N (\partial_j u)^2\,dx - \int_G fu\,dx = \min!, \quad u \in \mathring{W}_2^1(G). \tag{49}$$
Proof. Problems (48*) and (49) correspond to (43*) and (43**), respectively. Thus, the assertion follows immediately from Theorem 5.B. □
Problem (49) is identical to the Dirichlet problem from Section 2.5.
5.7 The Poincaré Inequality and Rellich's Compactness Theorem

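Proposition 1. Let $G$ be a nonempty bounded open set in $\mathbb{R}^N$, $N \ge 1$. Then, the embedding
$$\mathring{W}_2^1(G) \subseteq L_2(G)$$
is compact.
This proposition was proved by Rellich in 1930. Our proof will be based on the following special Poincaré inequality:
Lemma 2. Let $C$ be a closed cube in $\mathbb{R}^N$, $N \ge 1$, with edge length $2R > 0$. Then, the relation
$$\int_C u^2\,dx \le (2R)^{-N}\Bigl(\int_C u\,dx\Bigr)^2 + 2NR^2\int_C \sum_{j=1}^N (\partial_j u)^2\,dx \tag{50}$$
holds for all $u \in C^1(C)$.
Proof. We will use the well-known inequality
$$\Bigl(\sum_{j=1}^N a_j\Bigr)^2 \le N\sum_{j=1}^N a_j^2 \quad\text{for all } a_1, \dots, a_N \in \mathbb{R}, \tag{51}$$
which follows from $2ab \le a^2 + b^2$ for all $a, b \in \mathbb{R}$.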
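Observe that (50) is invariant under a translation. Moreover, using the transformation $x \mapsto 2Rx$, it is sufficient to prove (50) for $R = \tfrac12$, i.e.,
$$C = [-\tfrac12, \tfrac12] \times \cdots \times [-\tfrac12, \tfrac12].$$
Step 1: Let $N = 1$ and $C = [-\tfrac12, \tfrac12]$. We are given $u \in C^1(C)$. Then,
$$u(y) - u(x) = \int_x^y u'(t)\,dt \quad\text{for all } x, y \in C.$$
By the Schwarz inequality,
$$(u(x) - u(y))^2 = u(x)^2 + u(y)^2 - 2u(x)u(y) \le \Bigl(\int_C 1 \cdot |u'(t)|\,dt\Bigr)^2 \le \int_C u'(t)^2\,dt,$$
since $\int_C dt = 1$. Applying the integral $\int_{C\times C} \cdots\,dx\,dy$, we get
$$\int_C u(x)^2\,dx + \int_C u(y)^2\,dy \le \int_C u'(t)^2\,dt + 2\Bigl(\int_C u(x)\,dx\Bigr)\Bigl(\int_C u(y)\,dy\Bigr).$$
Hence
$$2\int_C u(x)^2\,dx \le \int_C u'(x)^2\,dx + 2\Bigl(\int_C u(x)\,dx\Bigr)^2.$$
This is (50) with $R = \tfrac12$.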
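Step 2: Let $N = 2$ and $C = [-\tfrac12, \tfrac12] \times [-\tfrac12, \tfrac12]$. We are given $u \in C^1(C)$. Let $x = (\xi, \eta)$ and $y = (\alpha, \beta)$. Then
$$u(x) - u(y) = \int_\alpha^\xi u_\xi(t, \beta)\,dt + \int_\beta^\eta u_\eta(\xi, t)\,dt \quad\text{for all } x, y \in C.$$
By (51), $(a + b)^2 \le Na^2 + Nb^2$ for all $a, b \in \mathbb{R}$. Hence
$$(u(x) - u(y))^2 = u(x)^2 + u(y)^2 - 2u(x)u(y) \le N\Bigl(\int_\alpha^\xi u_\xi(t, \beta)\,dt\Bigr)^2 + N\Bigl(\int_\beta^\eta u_\eta(\xi, t)\,dt\Bigr)^2.$$
By the Schwarz inequality, this implies
$$u(x)^2 + u(y)^2 \le N\int_{-1/2}^{1/2} \bigl[u_\xi(t, \beta)^2 + u_\eta(\xi, t)^2\bigr]\,dt + 2u(x)u(y).$$
Applying the integral $\int_{C\times C} \cdots\,dx\,dy$ and observing that $\int_{C\times C} dx\,dy = 1$, we get
$$2\int_C u^2\,dx \le N\int_C (u_\xi^2 + u_\eta^2)\,dx + 2\Bigl(\int_C u\,dx\Bigr)^2.$$
This is (50) with $R = \tfrac12$.
Step 3: The proof proceeds analogously for $\mathbb{R}^N$ with $N \ge 3$. □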
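The one-dimensional inequality obtained in Step 1, namely $\int_C u^2\,dx \le \bigl(\int_C u\,dx\bigr)^2 + \tfrac12\int_C u'^2\,dx$ on $C = [-\tfrac12, \tfrac12]$, can be tested numerically. The sketch below is illustrative: the midpoint quadrature rule, the sample functions, and the names are assumptions made here, and a numerical check of a few samples is of course no substitute for the proof.

```python
import math

def integrate(g, a=-0.5, b=0.5, n=2000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def poincare_gap(u, du):
    """Right side minus left side of the Step 1 inequality (N = 1, R = 1/2)."""
    lhs = integrate(lambda x: u(x) ** 2)
    rhs = integrate(u) ** 2 + 0.5 * integrate(lambda x: du(x) ** 2)
    return rhs - lhs

# Assumed sample functions, each paired with its derivative.
samples = [
    (lambda x: x, lambda x: 1.0),
    (lambda x: math.sin(3.0 * x) + x * x,
     lambda x: 3.0 * math.cos(3.0 * x) + 2.0 * x),
    (lambda x: math.exp(x), lambda x: math.exp(x)),
]
gaps = [poincare_gap(u, du) for u, du in samples]
```

For constant functions both sides coincide, so the gap vanishes up to rounding; this shows that the first term on the right cannot be dropped.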
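Proof of Proposition 1. We set
$$j(u) := u.$$
Then, the linear operator
$$j\colon \mathring{W}_2^1(G) \to L_2(G)$$
is continuous because
$$\|u\|_{L_2(G)} \le \|u\|_{1,2} \quad\text{for all } u \in \mathring{W}_2^1(G).$$
Since the set $C_0^\infty(G)$ is dense in $\mathring{W}_2^1(G)$, the compactness of $j$ follows from the compactness of the operator
$$j\colon \mathring{W}_2^1(G) \cap C_0^\infty(G) \to L_2(G) \tag{52}$$
by the extension principle from Section 3.6.
Let $B$ be an open ball in $\mathbb{R}^N$ such that $G \subseteq B$. Since each function $u \in C_0^\infty(G)$ vanishes outside a compact subset of $G$, we get $C_0^\infty(G) \subseteq C_0^\infty(B)$, and the compactness of the operator
$$j\colon \mathring{W}_2^1(B) \cap C_0^\infty(B) \to L_2(B) \tag{53}$$
implies the compactness of $j$ from (52). Consequently, Proposition 1 follows from Lemma 3, given next. □
Lemma 3. Let $B$ be a closed ball in $\mathbb{R}^N$. Set $j(u) := u$. Then, the operator $j$ from (53) is compact.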
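Proof of Lemma 3 for N = 1. Let $B := \,]a, b[$, where $-\infty < a < b < \infty$. We are given $u \in C_0^\infty(B)$. Then,
$$u(x) = \int_a^x u'(t)\,dt \quad\text{for all } x \in [a, b].$$
Hence
$$\max_{a \le x \le b} |u(x)| \le \int_a^b 1 \cdot |u'(t)|\,dt \le \Bigl(\int_a^b dt\Bigr)^{1/2}\Bigl(\int_a^b u'(t)^2\,dt\Bigr)^{1/2} \le (b - a)^{1/2}\|u\|_{1,2}, \tag{54}$$
and
$$|u(x) - u(y)| = \Bigl|\int_y^x u'(t)\,dt\Bigr| \le |x - y|^{1/2}\|u\|_{1,2}, \tag{55}$$
where
$$\|u\|_{1,2} := \Bigl(\int_a^b (u^2 + u'^2)\,dx\Bigr)^{1/2}.$$
Let $M$ be a bounded set in $\mathring{W}_2^1(B) \cap C_0^\infty(B)$. By the Arzelà-Ascoli theorem from Section 1.11, it follows from (54) and (55) that the set $j(M)$ is relatively compact in $C(\overline{B})$. Since
$$\int_a^b (v_n - v)^2\,dx \le (b - a)\max_{a \le x \le b} |v_n(x) - v(x)|^2,$$
it follows from $v_n \to v$ in $C(\overline{B})$ as $n \to \infty$ that
$$v_n \to v \quad\text{in } L_2(B) \text{ as } n \to \infty.$$
Consequently, the set $j(M)$ is also relatively compact in $L_2(B)$. □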
Proof of Lemma 3 for N = 2. Let $x = (\xi, \eta)$ and set
$$\|u\|_{1,2} := \Bigl(\int_B (u^2 + u_\xi^2 + u_\eta^2)\,dx\Bigr)^{1/2}.$$
Define
$$M := \{u \in C_0^\infty(B) : \|u\|_{1,2} \le 1\}.$$
We want to prove that
(A) The set $j(M)$ is relatively compact in $L_2(B)$.
Since the operator $j$ is linear, $j(\alpha M) = \alpha j(M)$ for each $\alpha > 0$. Thus, assertion (A) implies that $j$ sends bounded sets from $\mathring{W}_2^1(B) \cap C_0^\infty(B)$ to relatively compact sets in $L_2(B)$. Consequently, (A) implies the assertion of Lemma 3.
By Proposition 10 in Section 1.11, the set $j(M)$ is relatively compact iff, for each $\varepsilon > 0$, the set $j(M)$ has a finite $\varepsilon$-net. Therefore, it remains to prove the following:
(B) For each $\varepsilon > 0$, there exist functions $u_1, \dots, u_r \in M$ such that
$$\min_{1 \le k \le r} \|u - u_k\|_{L_2(B)} < \varepsilon \quad\text{for all } u \in M. \tag{56}$$
[FIGURE 5.4, panels (a) and (b).]
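Step 1: Boundary strip. We first prove the following: For each $\delta > 0$, there exists an open subset $H$ of the open ball $B$ such that $\overline{H} \subseteq B$ and
$$\int_{B - H} u^2\,dx < \delta \quad\text{for all } u \in M \tag{57}$$
(cf. Figure 5.4(a)). To this end, we choose a local $(\mu, \zeta)$-coordinate system as pictured in Figure 5.4(b). More precisely, we assume that the boundary $\partial B$ has a local representation of the form
$$\zeta = g(\mu), \quad \mu \in J,$$
where $g\colon J \to \mathbb{R}$ is a $C^1$-function on the interval $J := \,]{-a}, a[$ with $a > 0$.

For sufficiently small $\beta > 0$, the local boundary strip
$$B_\beta := \{(\mu, \zeta) : \mu \in J,\ g(\mu) - \beta < \zeta < g(\mu)\}$$
is a subset of $B$. For $0 < \varepsilon < \beta$, let $B_\varepsilon \subseteq B_\beta$ denote the corresponding strip of width $\varepsilon$. Let $u \in M$. For all points $(\mu, \zeta)$ and $(\mu, \tau)$ in $B_\beta$,
$$u(\mu, \zeta) = \int_\tau^\zeta u_\zeta(\mu, t)\,dt + u(\mu, \tau).$$
From $(a + b)^2 \le 2a^2 + 2b^2$ for all $a, b \in \mathbb{R}$ and the Schwarz inequality, it follows that
$$u(\mu, \zeta)^2 \le 2\beta\int_{g(\mu) - \beta}^{g(\mu)} u_\zeta(\mu, t)^2\,dt + 2u(\mu, \tau)^2.$$
Integration over $\tau$ yields
$$\beta\,u(\mu, \zeta)^2 \le \int_{g(\mu) - \beta}^{g(\mu)} \bigl(2\beta^2 u_\zeta(\mu, t)^2 + 2u(\mu, t)^2\bigr)\,dt,$$
and integration over $B_\varepsilon$ yields
$$\beta\int_{B_\varepsilon} u^2\,dx \le \varepsilon\int_{B_\beta} (2\beta^2 u_\zeta^2 + 2u^2)\,dx \le \varepsilon \cdot \mathrm{const}\,\|u\|_{1,2}^2.$$
There exists a number $n$ independent of $\varepsilon$ such that $n$ local boundary strips cover a boundary strip of the ball $B$. Thus, choosing the number $\varepsilon$ sufficiently small, we obtain (57).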
Step 2: The inequality of Poincare on H. We choose closed cubes Cl , ... ,
Cs of edge length 2R that cover the set H such that Cj <:;;; 8 for all j. Let
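By the special Poincaré inequality (50) with $N = 2$,
$$\int_H u^2\,dx \le \sum_{j=1}^s \int_{C_j} u^2\,dx \le 4R^2\sum_{j=1}^s \int_{C_j} (u_\xi^2 + u_\eta^2)\,dx + (2R)^{-2}\sum_{j=1}^s \Bigl(\int_{C_j} u\,dx\Bigr)^2 \tag{58}$$
for all $u \in C_0^\infty(B)$.
We set
$$F(u) := (Y_1, \dots, Y_s), \quad\text{where } Y_j := \int_{C_j} u\,dx.$$
Relation (58) yields
$$\int_H u^2\,dx \le 4R^2\|u\|_{1,2}^2 + (2R)^{-2}|F(u)|^2 \quad\text{for all } u \in C_0^\infty(B), \tag{59}$$
where $|F(u)|^2 = \sum_j |Y_j|^2$. By the Schwarz inequality,
$$|Y_j|^2 \le \operatorname{meas}(C_j)\int_{C_j} u^2\,dx \le \operatorname{meas}(B)\int_B u^2\,dx,$$
for all $j$. Thus, the set $F(M)$ is bounded in $\mathbb{R}^s$ and is hence relatively compact. Consequently, for each $\eta > 0$, the set $F(M)$ has a finite $\eta$-net, i.e., there exist functions $u_1, \dots, u_r \in M$ such that
$$\min_{1 \le k \le r} |F(u) - F(u_k)| < \eta \quad\text{for all } u \in M.$$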
From (57) and (59) along with $(u - u_k)^2 \le 2u^2 + 2u_k^2$, we get the key formula
$$\int_B (u - u_k)^2\,dx = \int_{B - H} (u - u_k)^2\,dx + \int_H (u - u_k)^2\,dx \le 4\delta + 4R^2\|u - u_k\|_{1,2}^2 + (2R)^{-2}|F(u) - F(u_k)|^2 \le 4\delta + 16R^2 + (2R)^{-2}\eta^2,$$
for all $u \in M$, where the index $k \in \{1, \dots, r\}$ is chosen such that $|F(u) - F(u_k)| < \eta$.

Finally, if we choose the positive numbers $\delta$, $R$, and $\eta$ sufficiently small, then we get the desired estimate (56). □
For $N \ge 3$, the proof proceeds completely analogously.
5.8 Functions of Self-Adjoint Operators

Our objective is to construct a simple functional calculus for an important
class of self-adjoint operators. We make the following assumptions:
(H) The linear operator $A\colon D(A) \subseteq X \to X$ is self-adjoint on the separable Hilbert space $X$ over $\mathbb{K}$, and $A$ possesses a complete orthonormal system $\{u_n\}$ of eigenvectors in $X$, where $Au_n = \lambda_n u_n$ for all $n$.
The system $\{u_n\}$ is finite or countable if $\dim X < \infty$ or $\dim X = \infty$, respectively.
Proposition 1. Assume (H). Then
$$Au = \sum_n \lambda_n (u_n \mid u)u_n \quad\text{for all } u \in D(A). \tag{60}$$
Furthermore, the following three conditions are mutually equivalent:
(i) $u \in D(A)$.
(ii) $\sum_n \lambda_n (u_n \mid u)u_n$ is convergent.
(iii) $\sum_n |\lambda_n (u_n \mid u)|^2$ is convergent.
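Proof. (i) ⇒ (ii). Since $\{u_n\}$ is complete,
$$v = \sum_n (u_n \mid v)u_n \quad\text{for all } v \in X. \tag{61}$$
If $u \in D(A)$, then $(u_n \mid Au) = (Au_n \mid u) = \lambda_n (u_n \mid u)$. Hence
$$Au = \sum_n (u_n \mid Au)u_n = \sum_n \lambda_n (u_n \mid u)u_n \quad\text{for all } u \in D(A).$$
(ii) ⇔ (iii). This equivalence represents the convergence criterion for Fourier series (Proposition 5 in Section 3.1).
(ii) ⇒ (i). We construct the linear operator $C\colon D(C) \subseteq X \to X$ through
$$Cu := \sum_n \lambda_n (u_n \mid u)u_n \quad\text{for all } u \in D(C),$$
where $u \in D(C)$ iff this series converges. It follows from
$$(Cu \mid v) = \sum_n \lambda_n (u \mid u_n)(u_n \mid v) = (u \mid Cv) \quad\text{for all } u, v \in D(C)$$
that $C$ is symmetric. Obviously, $C$ is an extension of $A$. Since $A$ is self-adjoint, there does not exist a proper symmetric extension of $A$ (Corollary 7 in Section 5.2). Hence $A = C$. □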
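Let the function $F\colon \mathbb{R} \to \mathbb{K}$ be given. We define the operator
$$F(A)\colon D(F(A)) \subseteq X \to X$$
through the quite natural formula
$$F(A)u := \sum_n F(\lambda_n)(u_n \mid u)u_n \quad\text{for all } u \in D(F(A)), \tag{62}$$
where $u \in D(F(A))$ iff $\sum_n F(\lambda_n)(u_n \mid u)u_n$ is convergent. By Proposition 5 in Section 3.1 this means that $u \in D(F(A))$ iff
$$\sum_n |F(\lambda_n)(u_n \mid u)|^2 < \infty.$$
In particular, we obtain $u_n \in D(F(A))$ and
$$F(A)u_n = F(\lambda_n)u_n \quad\text{for all } n.$$
Since $\{u_n\}$ is a complete orthonormal system in $X$, the set $\operatorname{span}\{u_1, u_2, \dots\}$ is dense in $X$, and hence $D(F(A))$ is dense in $X$.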
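Formula (62) can be tried out directly in a finite-dimensional setting, where all series are finite sums. The following sketch is an illustration under stated assumptions: the symmetric matrix $\begin{pmatrix}2&1\\1&2\end{pmatrix}$ with orthonormal eigenvectors $(1,-1)/\sqrt2$, $(1,1)/\sqrt2$ and eigenvalues $1$, $3$, the sample vector, and the helper names are chosen here and do not come from the text.

```python
import math

s = 1.0 / math.sqrt(2.0)
# Orthonormal eigenpairs (lambda_n, u_n) of the assumed matrix [[2, 1], [1, 2]].
eigenpairs = [(1.0, (s, -s)), (3.0, (s, s))]

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def apply_A(u):
    return (2.0 * u[0] + u[1], u[0] + 2.0 * u[1])

def apply_F(F, u):
    """F(A)u = sum_n F(lambda_n) (u_n | u) u_n, cf. formula (62)."""
    out = [0.0, 0.0]
    for lam, e in eigenpairs:
        coeff = F(lam) * dot(e, u)
        out[0] += coeff * e[0]
        out[1] += coeff * e[1]
    return tuple(out)

u = (1.0, -2.0)                                  # assumed sample vector
identity_case = apply_F(lambda lam: lam, u)      # should equal A u
square_case = apply_F(lambda lam: lam ** 2, u)   # should equal A(A u)
inverse_case = apply_F(lambda lam: 1.0 / lam, u) # should equal A^{-1} u
```

The choice $F(\lambda) = \lambda$ reproduces (60), and $F(\lambda) = \lambda^{-1}$ yields the inverse operator, which is how functions such as $A^{1/2}$ are handled later in this section.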
Proposition 2. Assume (H). Then, for each real function $F\colon \mathbb{R} \to \mathbb{R}$, the operator $F(A)$ is self-adjoint.
Proof. Let $C := F(A)$. As in the proof of Proposition 1, we obtain that $C$ is symmetric, i.e., $C \subseteq C^*$.
In order to prove that $C^* \subseteq C$, let $u \in D(C^*)$. Since
$$(u_n \mid C^*u) = (Cu_n \mid u) = F(\lambda_n)(u_n \mid u) \quad\text{for all } n,$$
we obtain
$$C^*u = \sum_n (u_n \mid C^*u)u_n = \sum_n F(\lambda_n)(u_n \mid u)u_n.$$
Hence $u \in D(C)$. □
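With a view to applications in the next section, we now consider functions of the form
$$F(A, t)u := \sum_n F(\lambda_n, t)(u_n \mid u)u_n$$
depending on a real parameter $t$, which will play the role of time. Set
$$x(t) := F(A, t)u$$
for fixed $u \in X$. We expect that
$$x'(t) = F_t(A, t)u, \tag{63}$$
where
$$F_t(A, t)u := \sum_n F_t(\lambda_n, t)(u_n \mid u)u_n.$$
In order to justify this we need the following two majorant conditions:
$$\sum_n |c_n (u_n \mid u)|^2 < \infty, \quad\text{where } c_n := \sup_{t \in J} |F(\lambda_n, t)|, \tag{64}$$
and
$$\sum_n |d_n (u_n \mid u)|^2 < \infty, \quad\text{where } d_n := \sup_{t \in J} |F_t(\lambda_n, t)|. \tag{65}$$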
Proposition 3. Suppose that (H) holds. Let the function $F\colon \mathbb{R} \times J \to \mathbb{R}$ be given, where $J$ is a real interval. Then, the following hold true:
(i) Assume (64) for $u \in X$. If $t \mapsto F(\lambda, t)$ is continuous on $J$ for each $\lambda \in \mathbb{R}$, then $t \mapsto x(t)$ is continuous on $J$.
(ii) Assume (64) and (65) for fixed $u \in X$. If $t \mapsto F(\lambda, t)$ is continuously differentiable on $J$ for each $\lambda \in \mathbb{R}$, then $t \mapsto x(t)$ is continuously differentiable on $J$ and relation (63) holds true for all $t \in J$.
Proof. Ad (i). We set $F_n(t) := F(\lambda_n, t)$ and $a_n := (u_n \mid u)$. Since $|a + b|^2 \le 2|a|^2 + 2|b|^2$ for all $a, b \in \mathbb{K}$, we get
$$\Delta := \|F(A, t)u - F(A, s)u\|^2 = \sum_n |(F_n(t) - F_n(s))a_n|^2.$$
Let $\varepsilon > 0$ be given. By (64), there is a number $N$ such that $2\sum_{n > N} |c_n a_n|^2 < \varepsilon$. Since $F_n$ is continuous on $J$, this implies
$$\Delta < \varepsilon + \varepsilon,$$
provided $|t - s|$ is sufficiently small.

Ad (ii). Use the mean value theorem
$$\frac{F_n(s) - F_n(t)}{s - t} - F_n'(t) = F_n'(t + \vartheta(s - t)) - F_n'(t), \quad 0 < \vartheta < 1,$$
and an analogous argument as in the proof of (i). □

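Example 4 (Characterization of the energetic space $X_E$). Let $B\colon D(B) \subseteq X \to X$ be a linear, symmetric, strongly monotone operator on the real separable Hilbert space $X$. Suppose that the embedding
$$X_E \subseteq X$$
is compact. Let $A\colon D(A) \subseteq X \to X$ denote the Friedrichs extension of $B$. Then
$$X_E = D(A^{1/2}) \tag{66}$$
and
$$(u \mid v)_E = (A^{1/2}u \mid A^{1/2}v) \quad\text{for all } u, v \in X_E. \tag{67}$$
Proof. By Theorems 5.A and 5.C, the operator $A$ satisfies condition (H), where $\lambda_n \ge c > 0$ for all $n$. Then
$$A^{1/2}u = \sum_n \lambda_n^{1/2}(u_n \mid u)u_n \quad\text{for all } u \in D(A^{1/2}), \tag{68}$$
where $u \in D(A^{1/2})$ iff
$$\sum_n \lambda_n |(u_n \mid u)|^2 < \infty.$$
Observe that $u \in D(A)$ iff $\sum_n \lambda_n^2 |(u_n \mid u)|^2 < \infty$. Since $0 < \lambda_n \le \mathrm{const} \cdot \lambda_n^2$ for all $n$,
$$D(A) \subseteq D(A^{1/2}).$$
In addition, note that $D(B) \subseteq D(A)$. Furthermore, it follows from (60) and (68) that
$$(u \mid v)_E = (Bu \mid v) = (Au \mid v) = \sum_n \lambda_n (u \mid u_n)(u_n \mid v) = (A^{1/2}u \mid A^{1/2}v) \quad\text{for all } u, v \in D(B). \tag{69}$$
Moreover, by (68) we get
$$\|A^{1/2}u\|^2 = \sum_n \lambda_n |(u_n \mid u)|^2 \ge c\sum_n |(u_n \mid u)|^2 = c\|u\|^2 \quad\text{for all } u \in D(A^{1/2}). \tag{69*}$$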
Step 1: We set $Y := D(A^{1/2})$ and
$$(u \mid v)_Y := (A^{1/2}u \mid A^{1/2}v) \quad\text{for all } u, v \in Y.$$
By (69*),
$$c\|u\|^2 \le \|u\|_Y^2 \quad\text{for all } u \in Y. \tag{70}$$
We want to show that $Y$ becomes a real Hilbert space equipped with the inner product $(\cdot \mid \cdot)_Y$.
In fact, if $\|u\|_Y = 0$, then $u = 0$, by (70). Now let $(u_n)$ be a Cauchy sequence in $Y$. Then, $(A^{1/2}u_n)$ is a Cauchy sequence in $X$. Hence
$$A^{1/2}u_n \to v \quad\text{in } X \text{ as } n \to \infty. \tag{71}$$
It follows from (70) that $(u_n)$ is also a Cauchy sequence in $X$, and hence
$$u_n \to u \quad\text{in } X \text{ as } n \to \infty. \tag{71*}$$
The operator $A^{1/2}$ is self-adjoint. According to Proposition 14 in Section 5.2, it follows from (71) and (71*) that $u \in D(A^{1/2})$ and $A^{1/2}u = v$, i.e., $u \in Y$. Finally, by (71),
$$\|u_n - u\|_Y = \|A^{1/2}u_n - A^{1/2}u\| \to 0 \quad\text{as } n \to \infty.$$
In the following we want to show that $X_E = Y$.

Step 2: We prove that the set $D(A)$ is dense in $Y$. In fact, if $\dim X < \infty$, then $D(A) = Y = X$. Now let $\dim X = \infty$. Suppose we are given $u \in Y$. Hence $u \in D(A^{1/2})$. Set
$$u_m := \sum_{n=1}^m (u_n \mid u)u_n.$$
Then, $u_m \in D(A)$ for all $m$, and
$$\|u_m - u\|_Y^2 = \|A^{1/2}u_m - A^{1/2}u\|^2 = \sum_{n > m} \lambda_n |(u_n \mid u)|^2 \to 0 \quad\text{as } m \to \infty,$$
by (68).
Step 3: We prove that $X_E \subseteq Y$. Let $u \in X_E$. Then there exists an admissible sequence $(u_n)$ for $u$. This means that $u_n \in D(B)$ for all $n$ as well as
$$u_n \to u \quad\text{in } X \text{ as } n \to \infty,$$
and $(u_n)$ is a Cauchy sequence with respect to $\|\cdot\|_E$. By (69), $(u_n)$ is a Cauchy sequence in $Y$. Thus, Step 1 tells us that
$$u_n \to w \quad\text{in } Y \text{ as } n \to \infty,$$
along with
$$u_n \to w \quad\text{in } X \text{ as } n \to \infty.$$
Hence $u = w$, i.e., $u \in Y$.
Step 4: Let us prove (67), i.e.,
$$(u \mid v)_E = (u \mid v)_Y \quad\text{for all } u, v \in X_E. \tag{72}$$
If $u, v \in X_E$, then there exist admissible sequences $(u_n)$ and $(v_n)$ for $u$ and $v$, respectively. Step 3 yields
$$u_n \to u \quad\text{and}\quad v_n \to v \quad\text{in } Y \text{ as } n \to \infty.$$
By (69), $(u_n \mid v_n)_E = (u_n \mid v_n)_Y$ for all $n$. Letting $n \to \infty$, we get (72).
Step 5: Since $X_E$ is a Hilbert space, $X_E$ is closed with respect to the norm $\|\cdot\|_E$. By Step 4, $\|u\|_E = \|u\|_Y$ for all $u \in X_E$. Thus, $X_E$ is a closed linear subspace of $Y$. Since $D(A) \subseteq X_E \subseteq Y$ and $D(A)$ is dense in $Y$, we get $X_E = Y$. □
5.9 Semigroups, One-Parameter Groups, and Their Physical Relevance

The notion of a semigroup is the most important notion for describing
time-dependent processes in nature in terms of functional analysis. The key
relations are
S(t + s) = S(t)S(s) for all t, s 2:: 0, (73)
S(O) = I, (74)
and (75).
Definition 1. Let X be a Banach space over lK. A semigroup {S(t)k::o

on X consists of a family of operators S(t): X -+ X for all t 2:: 0 such that
(73) and (74) hold true.
The generator A: D(A) ~ X -+ X of the semigroup {S(t)} is defined
through
Au:= lim C 1 (S(t) - I)u, (75)
t--->+O
where u E D(A) iff this limit exists.
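The simplest instance of relations (73) to (75) is the scalar semigroup $S(t)u = e^{at}u$ on $X = \mathbb{R}$, whose generator is multiplication by $a$. The sketch below is only an illustration: the value of $a$, the sample data, and the function names are assumptions made here. It checks the semigroup property, the normalization $S(0) = I$, and the convergence of the difference quotient in (75).

```python
import math

a = -0.5                      # assumed growth rate; the generator is u -> a*u

def S(t, u):
    """Scalar semigroup S(t)u = exp(a t) u on X = R."""
    return math.exp(a * t) * u

def difference_quotient(t, u):
    """The quotient t^{-1}(S(t) - I)u from (75), for t > 0."""
    return (S(t, u) - u) / t

u0 = 2.0                      # assumed initial value
t, s = 0.3, 1.1               # assumed sample times
```

As $t \to +0$, `difference_quotient(t, u0)` approaches `a * u0`, in agreement with the definition of the generator.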
Let $\mathcal{S}_+ = \{S(t)\}$ be a semigroup. Set
$u(t) := S(t)u_0$ for all $t \ge 0$. (76)
Then
(i) $\mathcal{S}_+$ is called strongly continuous iff the function $u\colon [0, \infty[ \to X$ is
continuous for each $u_0 \in X$.
(ii) $\mathcal{S}_+$ is called nonexpansive iff $\mathcal{S}_+$ is strongly continuous and $S(t)\colon X \to X$
is nonexpansive for all $t \ge 0$, i.e.,
$\|S(t)u_0 - S(t)v_0\| \le \|u_0 - v_0\|$ for all $u_0, v_0 \in X$ and each $t \ge 0$.
(iii) $\mathcal{S}_+$ is called linear iff $S(t)\colon X \to X$ is linear and continuous for all
$t \ge 0$.
Definition 2. A one-parameter group $\{S(t)\}_{t \in \mathbb{R}}$ on the Banach space $X$
consists of a family of operators $S(t)\colon X \to X$ for all $t \in \mathbb{R}$ such that
$S(0) = I$ and
$S(t + s) = S(t)S(s)$ for all $t, s \in \mathbb{R}$. (77)
The generator $A\colon D(A) \subseteq X \to X$ of the one-parameter group $\{S(t)\}$ is
defined through
$Au := \lim_{t \to 0} t^{-1}(S(t) - I)u$, (78)
where $u \in D(A)$ iff this limit exists.
Let $\mathcal{S} = \{S(t)\}$ be a one-parameter group on $X$. Set
$u(t) := S(t)u_0$ for all $t \in \mathbb{R}$. (79)
Parallel to semigroups, we introduce the following terminology:
(i) $\mathcal{S}$ is called strongly continuous iff the function $u\colon \mathbb{R} \to X$ is continuous
for each $u_0 \in X$.
(ii) $\mathcal{S}$ is called linear iff $S(t)\colon X \to X$ is linear and continuous for all
$t \in \mathbb{R}$.
(iii) $\mathcal{S}$ is called uniformly continuous iff $\mathcal{S}$ is linear and the function $t \mapsto S(t)$
is continuous from $\mathbb{R}$ to $L(X, X)$, i.e.,
$\|S(t + h) - S(t)\| \to 0$ as $h \to 0$,
for all $t \in \mathbb{R}$.
Each uniformly continuous one-parameter group $\{S(t)\}$ is strongly continuous.
This follows from
$\|S(t + h)u_0 - S(t)u_0\| \le \|S(t + h) - S(t)\|\,\|u_0\| \to 0$ as $h \to 0$,
for all $u_0 \in X$.
Example 3. Let $X$ be a Banach space over $\mathbb{K}$, and let $A\colon X \to X$ be a
linear continuous operator. We set
$S(t) := e^{tA}$ for all $t \in \mathbb{R}$.
Then
(i) $\mathcal{S} = \{S(t)\}$ forms a linear one-parameter group on $X$ with the generator $A$.
(ii) $\mathcal{S}$ is uniformly continuous and hence strongly continuous.
(iii) Let $u_0 \in X$ be given. If we set
$u(t) := e^{tA}u_0$ for all $t \in \mathbb{R}$,
then $u = u(t)$ is the unique solution to the following ordinary differential equation:
$u'(t) = Au(t)$, $-\infty < t < \infty$,
$u(0) = u_0$. (80)
This shows that there exists a close connection between one-parameter
groups and ordinary differential equations. We shall discuss later that
one-parameter groups describe reversible processes in nature. Therefore:
If the operator $A\colon X \to X$ is linear and continuous, then the differential
equation (80) cannot describe an irreversible process in nature.
This explains why it is necessary to study the differential equation (80) in
the case where the linear operator
$A\colon D(A) \subseteq X \to X$
cannot be extended to a continuous operator on $X$. If $D(A)$ is dense in $X$,
then this means that
$\sup_{\|u\| \le 1,\ u \in D(A)} \|Au\| = \infty$,
by the extension principle from Section 3.6. Such linear operators are called
unbounded.
Proof. By Example 5 in Section 1.23,
$e^{(t+s)A} = e^{tA}e^{sA}$ for all $t, s \in \mathbb{R}$.
This is the group property (77). Statement (iii) follows from Proposition 3
in Section 1.24. In particular, we get $u'(0) = Au_0$ for each $u_0 \in X$, i.e., the
operator $A$ is the generator of $\{S(t)\}$.
Furthermore,
$e^{hA} - I = hA + \frac{(hA)^2}{2!} + \cdots$ for all $h \in \mathbb{R}$.
Hence $\|e^{hA} - I\| \le \|hA\| + \frac{\|hA\|^2}{2!} + \cdots \le e^{\|hA\|} - 1$. For each $t \in \mathbb{R}$, this
implies
$\|S(t + h) - S(t)\| = \|e^{(t+h)A} - e^{tA}\| = \|e^{tA}(e^{hA} - I)\| \le \|e^{tA}\|\,\|e^{hA} - I\| \to 0$ as $h \to 0$,
i.e., $\{S(t)\}$ is uniformly continuous. □
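The statements of Example 3 can be checked numerically in finite dimensions. The following minimal sketch assumes $X = \mathbb{R}^2$ with the maximum norm (so that the operator norm is the maximal absolute row sum) and an arbitrarily chosen matrix `A`; the helper names (`expm`, `row_sum_norm`, and so on) are illustrative and not taken from the text. It evaluates the truncated exponential series, the defect in the group property (77), the estimate $\|e^{hA} - I\| \le e^{\|hA\|} - 1$, and the difference quotient from (78).

```python
import math

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(P, Q):
    return [[p + q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]

def mat_scale(c, P):
    return [[c * p for p in row] for row in P]

def row_sum_norm(P):
    # Operator norm induced by the maximum norm on R^n.
    return max(sum(abs(p) for p in row) for row in P)

def expm(A, t, terms=40):
    # Truncated power series I + tA + (tA)^2/2! + ... (assumption: 40 terms suffice here).
    tA = mat_scale(t, A)
    result = identity(len(A))
    term = identity(len(A))
    for k in range(1, terms):
        term = mat_scale(1.0 / k, mat_mul(term, tA))
        result = mat_add(result, term)
    return result

A = [[0.0, 1.0], [-2.0, -0.3]]  # arbitrary illustrative matrix
t, s, h = 0.7, -1.2, 1e-3

# Group property (77): e^{(t+s)A} = e^{tA} e^{sA}, also for negative times.
lhs = expm(A, t + s)
rhs = mat_mul(expm(A, t), expm(A, s))
group_defect = row_sum_norm(mat_add(lhs, mat_scale(-1.0, rhs)))

# Estimate used for uniform continuity: ||e^{hA} - I|| <= e^{||hA||} - 1.
increment = mat_add(expm(A, h), mat_scale(-1.0, identity(2)))
increment_norm = row_sum_norm(increment)
series_bound = math.exp(row_sum_norm(mat_scale(h, A))) - 1.0

# Difference quotient h^{-1}(S(h) - I) from (78) approximates the generator A.
generator_error = row_sum_norm(
    mat_add(mat_scale(1.0 / h, increment), mat_scale(-1.0, A)))

print(group_defect, increment_norm, series_bound, generator_error)
```

The group defect is at the level of rounding errors, and the generator error shrinks proportionally to `h`, in line with the first neglected term $hA^2/2$ of the series.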

Definition 4. Let $X$ be a Hilbert space over $\mathbb{K}$. By a one-parameter unitary
group we understand a strongly continuous, one-parameter group $\{S(t)\}$
where each operator $S(t)\colon X \to X$ is unitary, i.e.,
$\|S(t)u\| = \|u\|$ for all $u \in X$, $t \in \mathbb{R}$.
Example 5. Let $A\colon X \to X$ be a linear continuous self-adjoint operator on
the complex Hilbert space $X$. Set
$S(t) := e^{iAt}$ for all $t \in \mathbb{R}$.
Then, $\{S(t)\}$ is a one-parameter unitary group with the generator $iA$.
Proof. By Example 3, $\{S(t)\}$ is a uniformly continuous, one-parameter
group with the generator $iA$.
Since $A$ is self-adjoint, we get
$(A^2u \mid v) = (Au \mid Av) = (u \mid A^2v)$ for all $u, v \in X$,
i.e., $A^2$ is also self-adjoint. Analogously, $A^n$ is self-adjoint for $n = 0, 1, 2, \ldots$.
This implies
$\left(\sum_{n=0}^{m} \frac{(iA)^n}{n!}u \,\Big|\, v\right) = \left(u \,\Big|\, \sum_{n=0}^{m} \frac{(-iA)^n}{n!}v\right)$ for all $u, v \in X$.
Letting $m \to \infty$, we get
$(e^{iA}u \mid v) = (u \mid e^{-iA}v)$ for all $u, v \in X$.
Since $tA$ is also self-adjoint for each $t \in \mathbb{R}$, the same argument applies to $tA$.
Thus, $(e^{itA})^* = e^{-itA}$, and hence
$S(t)S(t)^* = S(t)^*S(t) = I$ for all $t \in \mathbb{R}$.
By Proposition 15 in Section 5.2, $S(t)$ is unitary. □
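As a finite-dimensional illustration of Example 5, the sketch below assumes $X = \mathbb{C}^2$ with the Euclidean inner product and an arbitrarily chosen Hermitian matrix `A`; the function names are hypothetical helpers. It computes $S(t) = e^{itA}$ from the truncated exponential series and measures how far $S(t)S(t)^*$ is from the identity and how well the norm of a vector is preserved.

```python
import math

def mat_mul(P, Q):
    n = len(P)
    return [[sum(P[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(P):
    # Conjugate transpose of a square complex matrix.
    n = len(P)
    return [[P[j][i].conjugate() for j in range(n)] for i in range(n)]

def expm(M, terms=60):
    # Truncated series I + M + M^2/2! + ... for a complex matrix M.
    n = len(M)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, M)]
        result = [[r + x for r, x in zip(rr, rx)] for rr, rx in zip(result, term)]
    return result

def norm(u):
    return math.sqrt(sum(abs(c) ** 2 for c in u))

A = [[1.0 + 0j, 1j], [-1j, 2.0 + 0j]]  # Hermitian: A equals its conjugate transpose
t = 0.9
S = expm([[1j * t * a for a in row] for row in A])  # S(t) = e^{itA}

u = [0.3 - 0.4j, 1.2 + 0.5j]
Su = [sum(S[i][k] * u[k] for k in range(2)) for i in range(2)]

SS = mat_mul(S, adjoint(S))
unitarity_defect = max(abs(SS[i][j] - (1.0 if i == j else 0.0))
                       for i in range(2) for j in range(2))
norm_defect = abs(norm(Su) - norm(u))

print(unitarity_defect, norm_defect)
```

Both defects stay at the level of rounding errors, which reflects $S(t)S(t)^* = I$ and $\|S(t)u\| = \|u\|$.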

Remark 6 (Physical interpretation). Let $X$ be a Banach space. We regard
the elements $u$ of $X$ as "states" of a physical system. Furthermore, let
t = time.
(i) One-parameter groups and reversible processes in nature. Let $\mathcal{S} =
\{S(t)\}_{t \in \mathbb{R}}$ be a one-parameter group. Each function $u = u(t)$ given through
$u(t) := S(t)u_0$ for all $t \in \mathbb{R}$ and fixed $u_0$ (81)
is called a possible process of the system. We say that the system is in the
state $u(t)$ at time $t$. In particular, since $S(0) = I$, the state $u_0$ corresponds
to the "initial state" of the system at time $t = 0$.
It follows from the group property $S(t + t_0) = S(t)S(t_0)$ for all $t, t_0 \in \mathbb{R}$
that
$u(t + t_0) = S(t)u(t_0)$ for all $t \in \mathbb{R}$ and fixed $t_0 \in \mathbb{R}$. (82)
This allows the following interpretation:
(C) Strong causality. The state of the system $u(t_0)$ at a fixed time $t_0$
determines uniquely all the states of the system in the future $t > t_0$
and in the past $t < t_0$.
(H) Homogeneity in time. If $t \mapsto u(t)$ is a possible process of the physical
system, then the process $t \mapsto u(t + t_0)$ is also possible for each fixed
$t_0 \in \mathbb{R}$.
Observe that the transformation $t \mapsto t + t_0$ corresponds to a translation of
time.
To explain this, consider two observers $O_1$ and $O_2$. Suppose that $O_1$
and $O_2$ perform two experiments that correspond to possible processes.
Furthermore, assume that $O_1$ measures the state $u_0$ at time $t = 0$, whereas
$O_2$ measures the state $u_0$ at time $t = t_0$. Then, $O_1$ and $O_2$ observe the
same process provided $O_2$ changes his clock by replacing the initial time $t_0$
with the time $t = 0$.
In addition, the group property tells us that $S(t)S(-t) = S(-t)S(t) =
S(0) = I$ for each $t \in \mathbb{R}$. Thus, we obtain that the operator $S(t)\colon X \to X$
is bijective and
$S(t)^{-1} = S(-t)$ for all $t \in \mathbb{R}$.
Hence
$u(-t) = S(t)^{-1}u_0$. (83)
(R) Reversibility. If $t \mapsto u(t)$ is a possible process of the system, then the
reverse process $t \mapsto u(-t)$ is also possible.
Consequently, one-parameter groups describe reversible processes in nature,
e.g., wave processes without friction (energy dissipation). In fact, if $u = u(t)$
corresponds to such a wave process, then the reverse process $u = u(-t)$ is
also possible (cf. Figure 5.5).
If the one-parameter group $\mathcal{S}$ is strongly continuous, then each possible
process $u = u(t)$ depends continuously on time $t$ for all $t \in \mathbb{R}$.
[Figure 5.5, panels (a) and (b)]
One-parameter groups are also called dynamical systems. For example,
many systems in mechanics are dynamical systems (e.g., the motion of
planets).
Suppose that the gravitational field of a star changes in time. Then the
motion of its planets is not homogeneous in time, i.e., this motion cannot
be described by a one-parameter group.
We will show in Sections 5.11 and 5.14 that
One-parameter unitary groups reflect energy conservation of wave processes
or probability conservation of quantum processes.
(ii) Semigroups and irreversible processes in nature. Let $\mathcal{S}_+ = \{S(t)\}_{t \ge 0}$
be a semigroup. Each function $u = u(t)$ defined through
$u(t) := S(t)u_0$ for all $t \ge 0$ and fixed $u_0 \in X$ (84)
is called a possible process of the physical system. In contrast to
one-parameter groups, such a process is only defined for time points $t \ge 0$,
and the reversibility condition (R) is not satisfied. Proper semigroups
describe irreversible processes. For example, the growth of a human being is
irreversible.
The semigroup property $S(t + t_0) = S(t)S(t_0)$ for all $t, t_0 \ge 0$ yields
$u(t + t_0) = S(t)u(t_0)$ for all $t \ge 0$ and fixed $t_0 \ge 0$. (85)
(C) Causality. The state of the system at time $t_0 \ge 0$ uniquely determines
all the states $u(t)$ of the system in the future $t > t_0$.
(H) Homogeneity in time. If $t \mapsto u(t)$ is a possible process for all $t \ge 0$,
then so is the process $t \mapsto u(t + t_0)$ for all $t \ge 0$ and fixed $t_0 \ge 0$.
Example 7 (The harmonic oscillator). The motion $\mathbf{x} = \mathbf{x}(t)$ of a mass
point of mass $m > 0$ on $\mathbb{R}^1$ is described by the basic equation of classic
mechanics, "force equals mass times acceleration," i.e.,
$\mathbf{K} = m\mathbf{x}''$, (86)
where $\mathbf{x} = u\mathbf{e}$ and $\mathbf{K} = K\mathbf{e}$. Here, $\mathbf{e}$ denotes a unit vector (cf. Figure 5.6).
[Figure 5.6, panels (a) and (b)]
For small $|\mathbf{x}|$, the Taylor expansion yields
$K(u) = K(0) + K'(0)u + \cdots$.
We assume that $K(0) = 0$ and $K'(0) < 0$. Neglecting the higher-order terms, we get from (86) that
$u''(t) + Au(t) = 0$, $-\infty < t < \infty$,
$u(0) = u_0$, $u'(0) = v_0$, (87)
where $A = -K'(0)/m$. For example, equation (87) describes the motion of a
spring (cf. Figure 5.6(b)). Introducing the new variable $v := u'$, equation
(87) is equivalent to the following first-order system:
$u' = v$,
$v' = -Au$, $u(0) = u_0$, $v(0) = v_0$. (87*)
This system can be written as
$w' = \mathcal{A}w$, $w(0) = w_0$, (87**)
where we set $w := (u, v)$, $w_0 := (u_0, v_0)$, and $\mathcal{A}w := (v, -Au)$. The space $X := \mathbb{R}^2$ with
the inner product
$(w \mid z) := Au\tilde{u} + v\tilde{v}$ for $w = (u, v)$, $z = (\tilde{u}, \tilde{v})$
becomes a Hilbert space, and the operator $\mathcal{A}\colon X \to X$ is linear, continuous,
and skew-symmetric, i.e.,
$(\mathcal{A}w \mid z) = -(w \mid \mathcal{A}z)$ for all $w, z \in X$.
According to Example 3, for each given $w_0 \in X$, problem (87**) has the
unique solution
$w(t) = e^{t\mathcal{A}}w_0$ for all $t \in \mathbb{R}$, (88)
where $\{e^{t\mathcal{A}}\}$ represents a one-parameter group on $X$. On the other hand,
one checks easily that, with $C := A^{1/2}$,
$u(t) = (\cos tC)u_0 + C^{-1}(\sin tC)v_0$,
$v(t) = -C(\sin tC)u_0 + (\cos tC)v_0$
is the solution of (87*). In matrix notation, this means that
$\begin{pmatrix} u(t) \\ v(t) \end{pmatrix} = \begin{pmatrix} \cos tC & C^{-1}\sin tC \\ -C\sin tC & \cos tC \end{pmatrix} \begin{pmatrix} u_0 \\ v_0 \end{pmatrix} = e^{t\mathcal{A}} \begin{pmatrix} u_0 \\ v_0 \end{pmatrix}$.
The group property $e^{(t+s)\mathcal{A}}w_0 = e^{t\mathcal{A}}e^{s\mathcal{A}}w_0$ for all $t, s \in \mathbb{R}$ is equivalent to
$\begin{pmatrix} \cos(t+s)C & C^{-1}\sin(t+s)C \\ -C\sin(t+s)C & \cos(t+s)C \end{pmatrix} = \begin{pmatrix} \cos tC & C^{-1}\sin tC \\ -C\sin tC & \cos tC \end{pmatrix} \begin{pmatrix} \cos sC & C^{-1}\sin sC \\ -C\sin sC & \cos sC \end{pmatrix}$,
for all $t, s \in \mathbb{R}$. In turn, this is equivalent to
$\cos(t + s)C = (\cos tC)\cos sC - (\sin tC)\sin sC$,
$\sin(t + s)C = (\sin tC)\cos sC + (\cos tC)\sin sC$ for all $t, s \in \mathbb{R}$,
which represents the addition theorems for the trigonometric functions. In
summary,
The addition theorems for the trigonometric functions are equivalent to
the fact that the harmonic oscillator corresponds to a dynamical system.
Finally, let us study the energy
$E := 2^{-1}mv^2 + 2^{-1}mAu^2$
of the harmonic oscillator. We want to show that
The energy $E$ is constant along each possible motion of the harmonic
oscillator.
In fact, it follows from (87*) that
$\frac{dE(t)}{dt} = mv(t)v'(t) + mAu(t)u'(t) = 0$ for all $t \in \mathbb{R}$.
In Section 5.11 we shall show that the same argument applies to the wave
equation. Then, the original equation (87) represents the wave equation,
where the self-adjoint operator $A$ is the Friedrichs extension of $-\Delta$.
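The three properties of the harmonic oscillator discussed in Example 7 (group property, reversibility, energy conservation) can be verified numerically from the explicit solution formula. The sketch below is only an illustration; the function names `flow` and `energy` and all numerical values are assumptions chosen for the example.

```python
import math

def flow(t, u0, v0, A):
    # Explicit solution of u' = v, v' = -A u with C = sqrt(A), A > 0.
    C = math.sqrt(A)
    u = math.cos(t * C) * u0 + math.sin(t * C) / C * v0
    v = -C * math.sin(t * C) * u0 + math.cos(t * C) * v0
    return u, v

def energy(u, v, m, A):
    # E = m v^2 / 2 + m A u^2 / 2
    return 0.5 * m * v * v + 0.5 * m * A * u * u

m, A = 2.0, 3.0        # illustrative mass and constant A = -K'(0)/m
u0, v0 = 0.4, -1.1     # illustrative initial state
t, s = 0.8, 2.5

# Group property: flowing for time t + s equals flowing for s and then for t.
direct = flow(t + s, u0, v0, A)
composed = flow(t, *flow(s, u0, v0, A), A)
group_defect = max(abs(a - b) for a, b in zip(direct, composed))

# Reversibility: S(-t) undoes S(t).
back = flow(-t, *flow(t, u0, v0, A), A)
reverse_defect = max(abs(back[0] - u0), abs(back[1] - v0))

# Energy along the motion, sampled at a few times (including a negative one).
energies = [energy(*flow(tau, u0, v0, A), m, A) for tau in (0.0, 0.8, 2.5, -4.0)]

print(group_defect, reverse_defect, energies)
```

For the chosen data the initial energy is $2^{-1} \cdot 2 \cdot 1.21 + 2^{-1} \cdot 2 \cdot 3 \cdot 0.16 = 1.69$, and the sampled energies agree with this value up to rounding.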
5.10 Applications to the Heat Equation

We consider the following initial-value problem:
$u'(t) + Au(t) = 0$, $0 \le t < \infty$,
$u(0) = u_0$. (89)
We make the following assumptions:
(H1) The operator $B\colon D(B) \subseteq X \to X$ is linear, symmetric, and strongly
monotone on the real separable Hilbert space $X$ with $\dim X = \infty$.
(H2) $A\colon D(A) \subseteq X \to X$ is the Friedrichs extension of $B$ with the energetic
space $X_E$. Suppose that the embedding $X_E \subseteq X$ is compact.
Theorem 5.E (The abstract heat equation). For each given initial value
$u_0 \in D(A)$, the original equation (89) has a unique $C^1$-solution $u\colon [0, \infty[ \to
X$. This solution is given by
$u(t) = e^{-tA}u_0$ for all $t \ge 0$. (90)
The operator $-A$ is the generator of the linear, strongly continuous,
nonexpansive semigroup $\{e^{-tA}\}$.
We regard (89) as a generalized problem to the following classic problem:
$u'(t) + Bu(t) = 0$, $0 \le t < \infty$,
$u(0) = u_0$. (91)
The function $u = u(t)$ from (90) is defined for each $u_0 \in X$. This function
is called a mild (generalized) solution of both (89) and (91).
Standard Example 1 (The classic heat equation). Let $G$ be a nonempty
bounded open set in $\mathbb{R}^N$, $N \ge 1$. Set $\mathbb{R}_+ := [0, \infty[$. We consider the
following initial boundary-value problem for the heat equation:
$u_t - \Delta u = 0$ on $G \times \mathbb{R}_+$,
$u(x, t) = 0$ on $\partial G \times \mathbb{R}_+$ (boundary condition), (92)
$u(x, 0) = u_0(x)$ on $G$ (initial condition).
For example, this problem allows the following physical interpretation. Set
$u(x, t)$ = temperature at the point $x$ at time $t$.
Then equation (92) describes the distribution of temperature in the "body"
$G$ without any outer heat sources.² The given function $u_0$ corresponds to
the initial temperature of the body at time $t = 0$.
² A detailed physical motivation can be found in Zeidler (1986), Vol. 4, Section 69.2.
We set $X := L_2(G)$ and
$B := -\Delta$ with $D(B) := C_0^\infty(G)$.
By Standard Example 4 in Section 5.3, conditions (H1) and (H2) are
satisfied with the energetic space
$X_E = \mathring{W}_2^1(G)$.
Theorem 5.E tells us that for each given initial temperature $u_0 \in L_2(G)$ the
original classic problem (92) has a uniquely determined mild (generalized)
solution in the sense of (90).
The semigroup property of $\{e^{-tA}\}$ reflects the fact that
Heat conduction represents an irreversible process in nature.
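In eigenvector coordinates the semigroup $\{e^{-tA}\}$ acts by damping each coefficient $(u_n \mid u_0)$ with the factor $e^{-t\lambda_n}$. The following sketch illustrates this under stated assumptions: it takes the one-dimensional case $G = \,]0, \pi[$ with eigenvalues $\lambda_n = n^2$, truncates to 20 modes, and uses an arbitrary coefficient sequence; the names `heat_semigroup` and `norm` are illustrative only.

```python
import math

def heat_semigroup(t, coeffs, eigenvalues):
    # e^{-tA} in eigenvector coordinates: c_n -> exp(-t * lambda_n) * c_n, t >= 0 only.
    if t < 0:
        raise ValueError("the semigroup is only defined for t >= 0")
    return [math.exp(-t * lam) * c for lam, c in zip(eigenvalues, coeffs)]

def norm(coeffs):
    # Parseval: the norm of u equals the l2 norm of its coefficients (u_n | u).
    return math.sqrt(sum(c * c for c in coeffs))

eigenvalues = [float(n * n) for n in range(1, 21)]  # assumed: lambda_n = n^2 on ]0, pi[
u0 = [1.0 / n for n in range(1, 21)]                # illustrative initial coefficients

t, s = 0.3, 0.2
direct = heat_semigroup(t + s, u0, eigenvalues)
composed = heat_semigroup(t, heat_semigroup(s, u0, eigenvalues), eigenvalues)
semigroup_defect = max(abs(a - b) for a, b in zip(direct, composed))

# Nonexpansivity: the norm of e^{-tA} u0 does not increase in t.
norms = [norm(heat_semigroup(tau, u0, eigenvalues)) for tau in (0.0, 0.1, 0.5, 2.0)]

print(semigroup_defect, norms)
```

The `ValueError` for negative times mirrors the fact that only a semigroup, not a group, is available: backward evolution would multiply the coefficients by the unbounded factors $e^{|t|\lambda_n}$.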
Proof of Theorem 5.E. Uniqueness. Let $u, v\colon \mathbb{R}_+ \to X$ be two $C^1$-solutions
of (89). Define $w(t) := u(t) - v(t)$. Then, $w(0) = 0$. By the
product rule (Example 5 in Section 2.1), it follows from
$\frac{d}{dt}(w(t) \mid w(t)) = 2(w'(t) \mid w(t)) = -2(Aw(t) \mid w(t)) \le 0$
for all $t \in \mathbb{R}_+$, together with $w(0) = 0$, that $(w(t) \mid w(t)) = 0$ for all $t \in \mathbb{R}_+$, and hence $w(t) = 0$ for
all $t \in \mathbb{R}_+$.
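The semigroup. By Theorem 5.C, the operator $A$ has a complete orthonormal
system $\{u_n\}_{n \ge 1}$ of eigenvectors with
$Au_n = \lambda_n u_n$ for all $n = 1, 2, \ldots$.
We now use the functional calculus from Section 5.8. Then,
$Au = \sum_n \lambda_n (u_n \mid u)u_n$
and $u \in D(A)$ iff $\sum_n |\lambda_n (u_n \mid u)|^2 < \infty$. By definition,
$e^{-tA}u := \sum_n e^{-t\lambda_n}(u_n \mid u)u_n$ for each $t \in \mathbb{R}_+$,
where $u \in D(e^{-tA})$ iff $\sum_n |e^{-t\lambda_n}(u_n \mid u)|^2 < \infty$.
Observe that $\lambda_n \ge 0$. Thus, for all $u \in X$ and $t \in \mathbb{R}_+$,
$\sum_n |e^{-t\lambda_n}(u_n \mid u)|^2 \le \sum_n |(u_n \mid u)|^2 = \|u\|^2$.
Hence $D(e^{-tA}) = X$ and $\|e^{-tA}\| \le 1$. From
$e^{-t\lambda_n}e^{-s\lambda_n} = e^{-(t+s)\lambda_n}$
we get the semigroup property
$e^{-tA}e^{-sA} = e^{-(t+s)A}$ for all $t, s \in \mathbb{R}_+$. (93)
In fact, let $u \in X$ and $t, s \in \mathbb{R}_+$. Since the operator $e^{-sA}$ is self-adjoint,
$(u_n \mid e^{-sA}u) = (e^{-sA}u_n \mid u) = e^{-s\lambda_n}(u_n \mid u)$.
Hence
$e^{-tA}(e^{-sA}u) = \sum_n e^{-t\lambda_n}(u_n \mid e^{-sA}u)u_n = \sum_n e^{-(t+s)\lambda_n}(u_n \mid u)u_n = e^{-(t+s)A}u$.
We now set
$u(t) := e^{-tA}u_0$ for all $t \in \mathbb{R}_+$.
For all $u_0 \in X$ and $t \in \mathbb{R}_+$, we have the decisive majorant condition
$\sum_n |e^{-t\lambda_n}(u_n \mid u_0)|^2 \le \sum_n |(u_n \mid u_0)|^2 < \infty$.
Thus, by Proposition 3 in Section 5.8, the function $t \mapsto u(t)$ is continuous
on $\mathbb{R}_+$, and $u(0) = u_0$.
Let $u_0 \in D(A)$ and $t \in \mathbb{R}_+$. Then, we have the majorant condition
$\sum_n |\lambda_n e^{-t\lambda_n}(u_n \mid u_0)|^2 \le \sum_n |\lambda_n (u_n \mid u_0)|^2 < \infty$.
Again by Proposition 3 in Section 5.8, this implies that the derivative
$u'(t) = -\sum_n \lambda_n e^{-t\lambda_n}(u_n \mid u_0)u_n$
exists for each $t \in \mathbb{R}_+$, and the function $u'$ is continuous on $\mathbb{R}_+$.
Furthermore,
$\sum_n |\lambda_n (u_n \mid u(t))|^2 = \sum_n |\lambda_n e^{-t\lambda_n}(u_n \mid u_0)|^2 < \infty$ for all $t \in \mathbb{R}_+$.
Hence $u(t) \in D(A)$. In addition,
$Au(t) = \sum_n \lambda_n (u_n \mid e^{-tA}u_0)u_n = \sum_n \lambda_n e^{-t\lambda_n}(u_n \mid u_0)u_n = -u'(t)$,
for all $t \in \mathbb{R}_+$. Therefore, $u = u(t)$ represents a solution of the original
problem (89).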
Generator of the semigroup. Let us define the operator $C\colon D(C) \subseteq X \to
X$ through
$Cw := \lim_{t \to +0} t^{-1}(e^{-tA}w - w)$,
where $w \in D(C)$ iff this limit exists. We want to show that $C = -A$. In
fact, differentiation of the relation
$(e^{-tA}u \mid v) = (u \mid e^{-tA}v)$ for all $u, v \in D(C)$ and all $t \in \mathbb{R}_+$
with respect to $t$ at $t = 0$ yields
$(Cu \mid v) = (u \mid Cv)$ for all $u, v \in D(C)$,
i.e., $C$ is symmetric. Let $u(t) := e^{-tA}u_0$ for all $t \in \mathbb{R}_+$. For each $u_0 \in D(A)$,
$u'(0) = -Au_0$. Hence $-A \subseteq C$. That is, the symmetric operator $C$ is an
extension of the self-adjoint operator $-A$. By Corollary 7 in Section 5.2,
$C = -A$. □
5.11 Applications to the Wave Equation

We consider the initial-value problem
$u''(t) + Au(t) = 0$, $-\infty < t < \infty$,
$u(0) = u_0$, $u'(0) = v_0$, (94)
and we make the following assumptions:
(H1) The linear operator $B\colon D(B) \subseteq X \to X$ is symmetric and strongly
monotone on the real separable Hilbert space $X$ with $\dim X = \infty$.
(H2) $A\colon D(A) \subseteq X \to X$ is the Friedrichs extension of $B$ with the energetic
space $X_E$. Suppose that the embedding $X_E \subseteq X$ is compact.
We also set $C := A^{1/2}$.
In the trivial case where $X = \mathbb{R}$ and $A > 0$, equation (94) describes the
motion of a harmonic oscillator with the classic solution
$u(t) = (\cos tC)u_0 + C^{-1}(\sin tC)v_0$,
$u'(t) = -(\sin tC)Cu_0 + (\cos tC)v_0$. (95)
Up to a constant, the energy of the harmonic oscillator at time $t$ is given
through
$E(t) = 2^{-1}(|u'(t)|^2 + |Cu(t)|^2)$ (96)
(cf. Example 7 in Section 5.9).
Definition 1. By a classic solution to the original problem (94), we
understand a $C^2$-function $u\colon \mathbb{R} \to D(A)$ such that (94) holds true and $t \mapsto Cu(t)$
is $C^1$ from $\mathbb{R}$ into $X$.
By Example 4 in Section 5.8,
$D(A) \subseteq D(C)$, $X_E = D(C)$, and $\|u\|_E = \|Cu\|$ for all $u \in X_E$. (97)
Generalizing the classic expression (96) for the energy of a harmonic
oscillator, we define the energy of a classic solution $u = u(t)$ of (94) at time
$t$ through
$E(t) := 2^{-1}\{\|u'(t)\|^2 + \|Cu(t)\|^2\}$.
This is equal to
$E(t) = 2^{-1}\{\|u'(t)\|^2 + \|u(t)\|_E^2\}$. (98)
Theorem 5.F (The abstract wave equation). Assume (H1) and (H2).
Then, for given initial values
$u_0 \in D(A)$ and $v_0 \in D(C)$,
the original problem (94) has a unique classic solution. This solution is
given by (95).
The energy $E(t)$ is constant along this solution.
The proof will be given ahead.

We regard (94) as a generalized problem to the following classic problem:
$u''(t) + Bu(t) = 0$, $-\infty < t < \infty$,
$u(0) = u_0$, $u'(0) = v_0$. (99)
Let us introduce the product space $X_E \times X$, which consists of all ordered
pairs³
$w := (u, v)$, where $u \in X_E$, $v \in X$.
³ General product spaces will be studied in Section 3.6 of AMS Vol. 109.
Then, $X_E \times X$ becomes a real Hilbert space equipped with the inner product
$(w \mid \tilde{w})_{X_E \times X} := (u \mid \tilde{u})_E + (v \mid \tilde{v})$ for $w = (u, v)$, $\tilde{w} = (\tilde{u}, \tilde{v})$.
Let us define the operator
$S(t)(u_0, v_0) = (u(t), u'(t))$ (100)
through (95). If $u_0 \in D(A)$ and $v_0 \in D(C)$, then $u = u(t)$ from (100) is
the classic solution to the original problem (94). Observe that the energy
at time $t$ is given by
$E(t) = 2^{-1}\|S(t)(u_0, v_0)\|_{X_E \times X}^2$.
This shows that the use of the space $X_E \times X$ is quite natural.
Energy conservation means that
$\|S(t)(u_0, v_0)\|_{X_E \times X} = \|(u_0, v_0)\|_{X_E \times X}$ for all $t \in \mathbb{R}$. (100*)
However, we will show that the operator $S(t)$ is still defined for all
$u_0 \in X_E$ and $v_0 \in X$, (100**)
and the relation (100*) remains valid.
Definition 2. Assume (100**). Then the function $u = u(t)$ from (100) is
called a mild (generalized) solution of both (94) and (99).
Corollary 3 (One-parameter group). Assume (H1) and (H2). Then the
operator family $\{S(t)\}$ defined in (100) represents a one-parameter unitary
group on the product space $X_E \times X$.
The proof will be given later. The original equation (94) describes wave
processes. Corollary 3 reflects the fact that these wave processes are
reversible.
Standard Example 4 (The classic wave equation). Let us consider the
following initial boundary-value problem for the wave equation:
$u_{tt} - \Delta u = 0$ on $G \times \mathbb{R}$,
$u(x, t) = 0$ on $\partial G \times \mathbb{R}$ (boundary condition),
$u(x, 0) = u_0(x)$ on $G$ (initial position), (101)
$u_t(x, 0) = v_0(x)$ on $G$ (initial velocity).
Here, $G$ is assumed to be a nonempty bounded open set in $\mathbb{R}^N$, $N \ge 1$.
We set $X := L_2(G)$ and
$Bu := -\Delta u$ with $D(B) := C_0^\infty(G)$.
By Standard Example 4 in Section 5.3, conditions (H1) and (H2) are
satisfied, where
$X_E = \mathring{W}_2^1(G)$ and $\|u\|_E = \left(\int_G \sum_{j=1}^{N} (\partial_j u)^2\,dx\right)^{1/2}$.
Thus, we may apply the results to (101). In particular, for given initial
values
$u_0 \in X_E$ and $v_0 \in L_2(G)$,
we get a uniquely determined mild (generalized) solution for (101). An
application to the vibrating string will be considered in the next section.
According to (98), the energy of the wave process at time $t$ is given
through
$E(t) = 2^{-1}\int_G \left[u_t(x, t)^2 + \sum_{j=1}^{N} (\partial_j u(x, t))^2\right] dx$.
This is the well-known classic energy formula.
Proof of Theorem 5.F. We will use the functional calculus from Section
5.8. In fact, the assertions of Theorem 5.F follow by means of simple formal
computations. The point is that we have to justify these formal computations
by respecting the domains of definition of the operators $A$, $C$, and
so forth, and by using majorant conditions in the sense of Proposition 3 in
Section 5.8.
By Theorem 5.C, the operator $A$ has a complete orthonormal system
$\{u_n\}$ of eigenvectors with
$Au_n = \lambda_n u_n$ for all $n = 1, 2, \ldots$,
where
$0 < c \le \lambda_1 \le \cdots \le \lambda_n \le \cdots$ and $\lambda_n \to \infty$ as $n \to \infty$.
By Section 5.8, we get
$A^\alpha u := \sum_n \lambda_n^\alpha (u_n \mid u)u_n$, $\alpha > 0$,
where $u \in D(A^\alpha)$ iff $\sum_n |\lambda_n^\alpha (u_n \mid u)|^2 < \infty$. Hence $D(A) \subseteq D(C)$, where
$C := A^{1/2}$. Moreover,
$C(Cu) = Au$ for all $u \in D(A)$.
In fact, since $C$ is self-adjoint, we get $(u_n \mid Cu) = (Cu_n \mid u) = \lambda_n^{1/2}(u_n \mid u)$.
Hence, for all $u \in D(A)$,
$C(Cu) = \sum_n \lambda_n^{1/2}(u_n \mid Cu)u_n = \sum_n \lambda_n (u_n \mid u)u_n = Au$.
This implies
$(Cu \mid Cv) = (u \mid C(Cv)) = (u \mid Av)$ for all $u, v \in D(A)$.
Uniqueness via energy conservation. Let $u = u(t)$ and $v = v(t)$ be two
classic solutions to (94). Set $w(t) := u(t) - v(t)$. Then, $w(\cdot)$ is a classic
solution of (94) with $w(0) = w'(0) = 0$. Observe that
$w(t) \in D(A)$ for all $t \in \mathbb{R}$,
and hence $w(t) \in D(C)$ for all $t \in \mathbb{R}$. We shall show that
$(Cw(t))' = Cw'(t)$ for all $t \in \mathbb{R}$. (102)
By the product rule (Example 5 in Section 2.1), differentiation of the energy
function
$E(t) = 2^{-1}(w'(t) \mid w'(t)) + 2^{-1}(Cw(t) \mid Cw(t))$
yields
$E'(t) = (w''(t) \mid w'(t)) + (Cw'(t) \mid Cw(t)) = -(Aw(t) \mid w'(t)) + (w'(t) \mid Aw(t)) = 0$ for all $t \in \mathbb{R}$.
Since $E(0) = 0$, this implies $E(t) \equiv 0$. Therefore, $w'(t) \equiv 0$, and hence
$w(t) \equiv 0$ because $w(0) = 0$.
Energy conservation. Similarly, we obtain that $E'(t) \equiv 0$ along each
classic solution $u = u(t)$ of (94).
Proof of (102). Let $w_h(t) := h^{-1}(w(t + h) - w(t))$. Then
$Cw_h(t) = h^{-1}(Cw(t + h) - Cw(t))$.
By Definition 1, the derivatives $w'(t)$ and $(Cw(t))'$ exist. Letting $h \to 0$,
we get
$w_h(t) \to w'(t)$ and $Cw_h(t) \to (Cw(t))'$.
Since $C$ is self-adjoint, this yields $(Cw(t))' = Cw'(t)$, by Proposition 14 in
Section 5.2. This proves (102).
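Let $u_0 \in D(A)$ and $v_0 \in D(C)$ be given. Set $\mu_n := \lambda_n^{1/2}$. Then,
$\sum_n |\mu_n^2 (u_n \mid u_0)|^2 < \infty$ and $\sum_n |\mu_n (u_n \mid v_0)|^2 < \infty$. (103)
For all $t \in \mathbb{R}$, define⁴
$u(t) := (\cos tC)u_0 + C^{-1}(\sin tC)v_0 = \sum_n \cos\mu_n t\,(u_n \mid u_0)u_n + \sum_n \frac{\sin\mu_n t}{\mu_n}(u_n \mid v_0)u_n$. (104)
⁴ For brevity of notation, we write $\cos\mu_n t\,(u_n \mid u_0)$ instead of $(\cos\mu_n t)(u_n \mid u_0)$, and so on.
Formal differentiation yields
$u'(t) = -\sum_n \mu_n \sin\mu_n t\,(u_n \mid u_0)u_n + \sum_n \cos\mu_n t\,(u_n \mid v_0)u_n$,
$u''(t) = -\sum_n \mu_n^2 \cos\mu_n t\,(u_n \mid u_0)u_n - \sum_n \mu_n \sin\mu_n t\,(u_n \mid v_0)u_n$. (105)
For all $t \in \mathbb{R}$ and all $n = 1, 2, \ldots$, we get
$|\cos\mu_n t|^2$, $|\sin\mu_n t|^2$, $\left|\frac{\sin\mu_n t}{\mu_n}\right|^2 \le \text{const}$,
and $|\mu_n| \le \text{const}\,|\mu_n|^2$. Thus, using (103), the necessary majorant conditions
from Proposition 3 in Section 5.8 are satisfied. This justifies the preceding
formal differentiation. Hence the function $u\colon \mathbb{R} \to X$ from (104) is $C^2$ and
$u(0) = u_0$, $u'(0) = v_0$.
Similar arguments⁵ show that
$Au(t) = \sum_n \mu_n^2 \cos\mu_n t\,(u_n \mid u_0)u_n + \sum_n \mu_n \sin\mu_n t\,(u_n \mid v_0)u_n$,
and $u(t) \in D(A)$ for all $t \in \mathbb{R}$. Hence
$u''(t) = -Au(t)$ for all $t \in \mathbb{R}$.
In addition, we get
$Cu(t) = \sum_n \mu_n \cos\mu_n t\,(u_n \mid u_0)u_n + \sum_n \sin\mu_n t\,(u_n \mid v_0)u_n$, (106)
and $u(t) \in D(C)$ for all $t \in \mathbb{R}$, along with
$(Cu(t))' = \sum_n -\mu_n^2 \sin\mu_n t\,(u_n \mid u_0)u_n + \sum_n \mu_n \cos\mu_n t\,(u_n \mid v_0)u_n$,
for all $t \in \mathbb{R}$, i.e., the function $t \mapsto Cu(t)$ is $C^1$ on $\mathbb{R}$. □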
Proof of Corollary 3. Let $u_0 \in D(C)$ and $v_0 \in X$. Then,
$\sum_n |\mu_n (u_n \mid u_0)|^2 < \infty$ and $\sum_n |(u_n \mid v_0)|^2 < \infty$.
By the majorant criterion from Proposition 3 in Section 5.8, it follows from
(104) and (105) that $u = u(t)$ is $C^1$ on $\mathbb{R}$.
Set
$S(t)(u_0, v_0) := (u(t), u'(t))$.
Since $D(A)$ and $D(C)$ are dense in $X_E$ and $X$, respectively, it follows from
the extension principle in Section 3.6 that relation (100*) remains valid for
all $u_0 \in X_E$ and $v_0 \in X$, i.e.,
$\|S(t)(u_0, v_0)\|_{X_E \times X} = \|(u_0, v_0)\|_{X_E \times X}$ for all $(u_0, v_0) \in X_E \times X$. (107)
⁵ For example, observe that
$u(t) = \sum_n \alpha_n u_n$,
where
$\alpha_n := (u_n \mid u(t)) = (u_n \mid u_0)\cos\mu_n t + (u_n \mid v_0)\mu_n^{-1}\sin\mu_n t$,
by (104). To prove that $u(t) \in D(A)$, we need
$\sum_n |\mu_n^2 \alpha_n|^2 < \infty$.
However, this follows from (103) along with $|a + b|^2 \le 2(|a|^2 + |b|^2)$ for all $a, b \in \mathbb{C}$.
Hence the operator $S(t)$ is linear and continuous on $X_E \times X$.
The group property
$S(t + s) = S(t)S(s)$ for all $t, s \in \mathbb{R}$
follows as in Example 7 from Section 5.7 by using the addition theorems
for $\sin\mu t$ and $\cos\mu t$ along with the functional calculus. Thus, the operator
$S(t)$ is bijective by Section 5.9. Finally, it follows from (107) that $S(t)$ is
unitary for each $t$. □
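In eigenvector coordinates, the operator $S(t)$ from (100) acts on each pair of coefficients $a_n = (u_n \mid u_0)$, $b_n = (u_n \mid v_0)$ like a harmonic oscillator with frequency $\mu_n$, and the energy is $2^{-1}\sum_n (b_n^2 + \mu_n^2 a_n^2)$. The sketch below illustrates Corollary 3 in this form; it assumes a truncation to 10 modes with $\mu_n = n$ and arbitrarily chosen coefficients, and the names `wave_group` and `energy` are hypothetical.

```python
import math

def wave_group(t, a, b, mu):
    # S(t) in eigenvector coordinates, following (104) and (105) mode by mode.
    new_a = [math.cos(m * t) * an + math.sin(m * t) / m * bn
             for an, bn, m in zip(a, b, mu)]
    new_b = [-m * math.sin(m * t) * an + math.cos(m * t) * bn
             for an, bn, m in zip(a, b, mu)]
    return new_a, new_b

def energy(a, b, mu):
    # Half the squared norm of (u, u') in X_E x X: ||u'||^2 + ||Cu||^2.
    return 0.5 * sum(bn * bn + (m * an) ** 2 for an, bn, m in zip(a, b, mu))

mu = [float(n) for n in range(1, 11)]          # assumed frequencies mu_n = n
a0 = [1.0 / n ** 2 for n in range(1, 11)]      # illustrative coefficients of u_0
b0 = [(-1.0) ** n / n for n in range(1, 11)]   # illustrative coefficients of v_0

t, s = 1.3, -0.4
direct = wave_group(t + s, a0, b0, mu)
composed = wave_group(t, *wave_group(s, a0, b0, mu), mu)
group_defect = max(abs(x - y)
                   for p, q in zip(direct, composed) for x, y in zip(p, q))

e0 = energy(a0, b0, mu)
energy_defect = max(abs(energy(*wave_group(tau, a0, b0, mu), mu) - e0)
                    for tau in (-3.0, 0.5, 7.0))

print(group_defect, energy_defect)
```

Negative times are admissible here, in contrast to the heat semigroup, which is the coordinate form of the reversibility of wave processes.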
5.12 Applications to the Vibrating String and the Fourier Method
We will show that the motion of a vibrating string⁶ is governed by the
following equations:
$u_{tt} - u_{xx} = 0$, $0 < x < \pi$, $-\infty < t < \infty$,
$u(x, t) = 0$, $x = 0, \pi$, $-\infty < t < \infty$ (boundary condition),
$u(x, 0) = u_0(x)$, $0 \le x \le \pi$ (initial position), (108)
$u_t(x, 0) = v_0(x)$, $0 \le x \le \pi$ (initial velocity).
⁶ For simplicity of notation we set $c = 1$, where $c$ denotes the wave velocity
(cf. Remark 4 ahead). Moreover, we choose the string length $\ell = \pi$.
Here, $u(x, t)$ denotes the deflection of the string at the point $x$ at time $t$
(cf. Figure 5.7). Set
$u_n(x) := (2/\pi)^{1/2}\sin nx$, $\lambda_n := n^2$, $(u \mid v) := \int_0^\pi u(x)v(x)\,dx$.
By a classic solution to (108), we understand a solution $u$ such that
$u, u_x, u_{xx}, u_t, u_{tt} \in C([0, \pi] \times \mathbb{R})$.
Proposition 1. (i) Classic solution. We are given the functions
$u_0, v_0 \in C^4[0, \pi]$
with the boundary conditions $u_0^{(k)}(0) = u_0^{(k)}(\pi) = v_0^{(k)}(0) = v_0^{(k)}(\pi) = 0$ for
$k = 0, 2$.
Then, the original problem (108) has a unique classic solution $u$, where
$u(x, t) = \sum_{n=1}^{\infty} \{(u_n \mid u_0)\cos nt + (u_n \mid v_0)n^{-1}\sin nt\}u_n(x)$. (109)
[Figure 5.7: the deflection $u = u(x, t)$ of the string]
This series converges uniformly for all $x \in [0, \pi]$ and $t \in \mathbb{R}$.
(ii) Generalized solution. Suppose that the initial data satisfy the weaker
conditions
$u_0 \in \mathring{W}_2^1(0, \pi)$ and $v_0 \in L_2(0, \pi)$.
Then, for each $t \in \mathbb{R}$, series (109) converges in the sense of the Hilbert
space $L_2(0, \pi)$.
The proof ahead shows that, under the assumption (ii), series (109)
represents the mild (generalized) solution of the original problem (108), in the
sense of Theorem 5.F in Section 5.11.
Remark 2 (The classic Fourier method). Let us recall the famous classic
motivation of the ansatz (109). First we are looking for special solutions of
the string equation of the form
$u(x, t) = \varphi(x)\psi(t)$.
If $u$ satisfies the equation $u_{tt} - u_{xx} = 0$, then
$\varphi(x)\psi''(t) - \varphi''(x)\psi(t) = 0$. (110)
Suppose that there exist points $x_0$ and $t_0$ such that $\varphi(x_0) \ne 0$ and $\psi(t_0) \ne 0$.
Then, the boundary condition "$u(0, t) = u(\pi, t) = 0$ for all times $t \in \mathbb{R}$"
implies $\varphi(0) = \varphi(\pi) = 0$. Letting either $t = t_0$ or $x = x_0$ in (110), we obtain
the following boundary-eigenvalue problem:
$\varphi''(x) = -\lambda\varphi(x)$, $0 < x < \pi$, $\lambda \in \mathbb{R}$,
$\varphi(0) = \varphi(\pi) = 0$, (111)
along with the differential equation
$\psi''(t) = -\lambda\psi(t)$, $-\infty < t < \infty$, (112)
where $\lambda = -\frac{\psi''(t_0)}{\psi(t_0)} = -\frac{\varphi''(x_0)}{\varphi(x_0)}$. Problem (111) has been studied in Section
4.5. The eigensolutions of (111) are
$\varphi = u_n$, $\lambda = \lambda_n = n^2$, $n = 1, 2, \ldots$.
Moreover, each solution of (112) with $\lambda = \lambda_n$ is a linear combination of the
two special solutions
$\psi_1(t) = \cos nt$ and $\psi_2(t) = \sin nt$, $n = 1, 2, \ldots$.
The special solutions $u(x, t) := u_n(x)\psi_1(t)$ and $u(x, t) := u_n(x)\psi_2(t)$ of
$u_{tt} - u_{xx} = 0$, i.e.,
$u_n(x)\cos nt$ and $u_n(x)\sin nt$, $n = 1, 2, \ldots$,
are called the eigenoscillations of the string.
Now to the point of the classic Fourier method. We assume that
Each motion of the string is a superposition of eigenoscillations.
Therefore, we make the following ansatz:⁷
$u(x, t) = \sum_{n=1}^{\infty} (\alpha_n \cos nt + \beta_n \sin nt)u_n(x)$. (113)
⁷ This general superposition principle dates back to a famous paper by Daniel
Bernoulli in 1753. Interestingly enough, Euler did not believe that the ansatz
(113) describes the most general form of the string vibration. Obviously, the
time was not ripe for a general theory of Fourier series. A detailed discussion can
be found in Szabó (1987), Chapter 4.
To determine the unknown coefficients $\alpha_n$ and $\beta_n$, we use the following
formal argument. Let $t = 0$. Then,
$u_0(x) = u(x, 0) = \sum_{n=1}^{\infty} \alpha_n u_n(x)$. (113*)
Observe the orthogonality relation
$\int_0^\pi u_n(x)u_m(x)\,dx = \delta_{nm}$ for all $n, m = 1, 2, \ldots$.
Thus, multiplying equation (113*) by $u_m(x)$ and integrating over the
interval $[0, \pi]$, we get
$\alpha_m = (u_m \mid u_0)$, $m = 1, 2, \ldots$.
Furthermore, formal differentiation of (113) yields
$v_0(x) = u_t(x, 0) = \sum_{n=1}^{\infty} n\beta_n u_n(x)$, (113**)
and hence
$\beta_m = m^{-1}(u_m \mid v_0)$, $m = 1, 2, \ldots$.
This way we obtain the expression (109).
In the following proof we have to justify these formal considerations.
Here, our earlier investigations about the boundary-eigenvalue problem
(111) will play a decisive role (cf. Section 4.5).
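The formal recipe of the Fourier method can be tried out numerically. The sketch below is an illustration under stated assumptions: it uses the orthonormal functions $u_n(x) = (2/\pi)^{1/2}\sin nx$, approximates the coefficients $(u_n \mid u_0)$ and $(u_n \mid v_0)$ by the midpoint rule, truncates the series (109) after 50 modes, and takes the sample data $u_0(x) = x(\pi - x)$, $v_0 = 0$. For comparison it evaluates $\frac{1}{2}[\tilde{u}_0(x - t) + \tilde{u}_0(x + t)]$, where $\tilde{u}_0$ is the odd $2\pi$-periodic extension of $u_0$; this comparison formula is a standard fact for $v_0 = 0$ that is not derived in the text. Since $u_0''(0) \ne 0$, the sample data only yield a generalized solution in the sense of Proposition 1(ii). All function names are hypothetical.

```python
import math

PI = math.pi

def u_n(n, x):
    # Orthonormal eigenfunctions of the string on [0, pi].
    return math.sqrt(2.0 / PI) * math.sin(n * x)

def inner(f, g, points=2000):
    # Midpoint rule for (f | g) = int_0^pi f(x) g(x) dx.
    h = PI / points
    return h * sum(f((k + 0.5) * h) * g((k + 0.5) * h) for k in range(points))

def fourier_solution(u0, v0, modes=50):
    # Coefficients alpha_n = (u_n | u0), beta_n = (u_n | v0) / n, as in (109).
    alpha = [inner(lambda x, n=n: u_n(n, x), u0) for n in range(1, modes + 1)]
    beta = [inner(lambda x, n=n: u_n(n, x), v0) / n for n in range(1, modes + 1)]

    def u(x, t):
        return sum((alpha[n - 1] * math.cos(n * t) + beta[n - 1] * math.sin(n * t))
                   * u_n(n, x) for n in range(1, modes + 1))
    return u

def travelling_wave_solution(u0, x, t):
    # (1/2)[U0(x - t) + U0(x + t)] with U0 the odd 2*pi-periodic extension of u0.
    def odd_ext(y):
        y = (y + PI) % (2.0 * PI) - PI
        return u0(y) if y >= 0.0 else -u0(-y)
    return 0.5 * (odd_ext(x - t) + odd_ext(x + t))

u0 = lambda x: x * (PI - x)   # illustrative initial position
v0 = lambda x: 0.0            # string initially at rest

u = fourier_solution(u0, v0)
samples = [(0.5, 0.0), (1.0, 0.7), (2.0, 3.0), (2.5, 10.0)]
max_gap = max(abs(u(x, t) - travelling_wave_solution(u0, x, t)) for x, t in samples)
print(max_gap)
```

The coefficients of this $u_0$ decay like $n^{-3}$, so the truncated series and the travelling-wave formula agree to about three decimal places at the sample points.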
Proof. Ad (i). Uniqueness. Let $u$ be a classic solution to (108). Then, for
each $t \in \mathbb{R}$,
$u(\cdot, t) \in C^2[0, \pi]$ and $u(0, t) = u(\pi, t) = 0$.
By Proposition 2(iii) in Section 4.5,
$u(x, t) = \sum_{n=1}^{\infty} b_n(t)u_n(x)$ for all $x \in [0, \pi]$, $t \in \mathbb{R}$,
where
$b_n(t) := (u_n \mid u(\cdot, t)) = \int_0^\pi u_n(x)u(x, t)\,dx$, $n = 1, 2, \ldots$.
Hence
$b_n'(t) = \int_0^\pi u_n(x)u_t(x, t)\,dx$.
Let $n = 1, 2, \ldots$. Integration by parts yields
$b_n''(t) = \int_0^\pi u_n(x)u_{tt}(x, t)\,dx = \int_0^\pi u_n(x)u_{xx}(x, t)\,dx = \int_0^\pi u_n''(x)u(x, t)\,dx = -\lambda_n b_n(t)$,
since $u_n'' = -\lambda_n u_n$. This implies the following initial-value problem
$b_n''(t) = -\lambda_n b_n(t)$, $-\infty < t < \infty$,
$b_n(0) = (u_n \mid u_0)$, $b_n'(0) = (u_n \mid v_0)$, $n = 1, 2, \ldots$,
which has the following unique solution:
$b_n(t) = (u_n \mid u_0)\cos nt + (u_n \mid v_0)n^{-1}\sin nt$ for all $t \in \mathbb{R}$.
Recall that $\lambda_n = n^2$. This way we obtain (109).
Existence. Formal differentiation of (109) yields the following formulas:
$u_t(x, t) = \sum_{n=1}^{\infty} \{-(u_n \mid u_0)n\sin nt + (u_n \mid v_0)\cos nt\}u_n(x)$,
$u_{tt}(x, t) = \sum_{n=1}^{\infty} \{-(u_n \mid u_0)n^2\cos nt - (u_n \mid v_0)n\sin nt\}u_n(x)$,
$u_x(x, t) = \sum_{n=1}^{\infty} \{(u_n \mid u_0)\cos nt + (u_n \mid v_0)n^{-1}\sin nt\}u_n'(x)$,
$u_{xx}(x, t) = \sum_{n=1}^{\infty} \{(u_n \mid u_0)\cos nt + (u_n \mid v_0)n^{-1}\sin nt\}u_n''(x)$. (114)
To justify this, observe the following. By Proposition 2 in Section 4.5, each
of the series
$\sum_{n=1}^{\infty} |(u_n \mid w)u_n^{(k)}(x)|$, $k = 0, 1, 2$, $w = u_0, v_0$ (115)
converges uniformly on $[0, \pi]$. For all $t \in \mathbb{R}$ and $n = 1, 2, \ldots$,
$|\cos nt|, |\sin nt| \le 1$ and $n \le n^2$.
In addition, $n^2u_n = -u_n''$. Therefore, each of the series from (114) can be
majorized by (115) and is hence uniformly convergent on $[0, \pi] \times \mathbb{R}$. This
justifies the formulas (114). In addition, we obtain that the functions $u$, $u_t$,
$u_{tt}$, $u_x$, $u_{xx}$ are continuous on $[0, \pi] \times \mathbb{R}$.
One checks easily that the function $u$ from (109) represents a solution of
the original problem (108). In fact, from
$u_n(0) = u_n(\pi) = 0$, $n = 1, 2, \ldots$,
we obtain the boundary relation $u(0, t) = u(\pi, t) = 0$ for all $t \in \mathbb{R}$. Since
$u_n'' = -n^2u_n$, $n = 1, 2, \ldots$,
it follows from (114) that the differential equation
$u_{tt} - u_{xx} = 0$
is satisfied on $[0, \pi] \times \mathbb{R}$.
Finally, using Proposition 2 in Section 4.5, we obtain the following two
initial conditions:
$u(x, 0) = \sum_{n=1}^{\infty} (u_n \mid u_0)u_n(x) = u_0(x)$,
$u_t(x, 0) = \sum_{n=1}^{\infty} (u_n \mid v_0)u_n(x) = v_0(x)$ for all $x \in [0, \pi]$.
Ad (ii). We set $X := L_2(0, \pi)$ and
$Bu := -u''$ with $D(B) := C_0^\infty(0, \pi)$.
Then, problem (108) is a special case of Standard Example 4 in Section 5.11.
Let $A$ denote the Friedrichs extension of the operator $B$. The corresponding
energetic space is given through
$X_E = \mathring{W}_2^1(0, \pi)$.
Since $u_n(0) = u_n(\pi) = 0$, it follows from Standard Example 11 in Section
2.5 that
$u_n \in X_E$, $n = 1, 2, \ldots$.
Integration by parts yields
$-\int_0^\pi u_n v''\,dx = -\int_0^\pi u_n'' v\,dx = \lambda_n \int_0^\pi u_n v\,dx$ for all $v \in C_0^\infty(0, \pi)$.
That is,
$(u_n \mid Bv) = \lambda_n (u_n \mid v)$ for all $v \in D(B)$.
By (44**), this means that
$Au_n = \lambda_n u_n$, $n = 1, 2, \ldots$.
By Proposition 2 in Section 4.5 and Section 4.1, the orthonormal system $\{u_n\}$ is complete in
$L_2(0, \pi)$. Thus, the self-adjoint operator $A$ has no eigenvalues
different from $\{\lambda_n\}$. According to (104), the mild (generalized) solution
to the original problem (108) is given through
$u(t) = \sum_{n=1}^{\infty} \{(u_n \mid u_0)\cos\mu_n t + (u_n \mid v_0)\mu_n^{-1}\sin\mu_n t\}u_n$,
where $\mu_n := \lambda_n^{1/2} = n$. This is precisely the series from (109). □
Example 3 (Physical motivation of the string energy). Let $\rho$ denote the
constant density of the string, where $\rho > 0$. Then, a small piece $P$ of the
string has the mass
$\Delta m = \rho\Delta s$,
where $\Delta s$ denotes the arclength of $P$ (cf. Figure 5.7). Let $x_0$ denote the
position of $P$. Then, the velocity $v$ of $P$ at time $t$ is given through
$v = u_t(x_0, t)$.
Thus, we obtain
$E_{\text{kin}}(P) = 2^{-1}\Delta m\,v^2 = 2^{-1}\rho\,\Delta s\,u_t(x_0, t)^2$
for the kinetic energy of $P$. Finally, we assume that the potential energy
$E_{\text{pot}}(P)$ of $P$ is proportional to the extension of $P$, i.e.,
$E_{\text{pot}}(P) = a(\Delta s - \Delta x)$, where $a = \text{const} > 0$.
Then, the total kinetic energy $E_{\text{kin}}$ and the total potential energy $E_{\text{pot}}$ of
the string are given by summing over all small pieces of the string, i.e.,
$E_{\text{kin}} = \sum_P 2^{-1}\rho(\Delta s)u_t^2$ and $E_{\text{pot}} = \sum_P a(\Delta s - \Delta x)$.
Observe that
$\Delta s = \sqrt{1 + u_x(x, t)^2}\,\Delta x$.
More precisely, the final definition of $E_{\text{kin}}$ and $E_{\text{pot}}$ at time $t$ will be
based on the following integrals:
$E_{\text{kin}}(t) := \int_0^\pi 2^{-1}\rho u_t(x, t)^2\,dx$,
$E_{\text{pot}}(t) := a\int_0^\pi \left(\sqrt{1 + u_x(x, t)^2} - 1\right)dx$.
Suppose that the deflection of the string is small, i.e., $|u_x|$ is small. By
Taylor series expansion,
$\sqrt{1 + u_x^2} = 1 + 2^{-1}u_x^2 + \cdots$.
Hence the total energy $E(t)$ of the string at time $t$ is given approximately
through
$E(t) = 2^{-1}\rho\int_0^\pi (u_t^2 + c^2u_x^2)\,dx$,
where $c := (a/\rho)^{1/2}$.
Let $f\colon \mathbb{R} \to \mathbb{R}$ be a $C^1$-function. Then, $f$ is said to be stationary at the
point $x$ iff
$f'(x) = 0$,
i.e., the tangent line at the point $x$ is horizontal (cf. Figure 5.8 on next
page). Thus, $f$ is stationary at local minima, local maxima, and horizontal
inflection points. If $f'(x) = 0$, then we also say that $x$ is a critical point of
$f$, and $f(x)$ is called a critical value of $f$.
Remark 4 (Physical motivation of the string equation). The fundamental principle of stationary action in mechanics says that
$$\text{action } A(u) := \int_{t_0}^{t_1} \bigl(E_{\mathrm{kin}}(t) - E_{\mathrm{pot}}(t)\bigr)dt = \text{stationary!}.$$
Here, we vary over those states of the system that are fixed at both the initial time $t_0$ and the terminal time $t_1$ and that satisfy the boundary conditions.
FIGURE 5.8.
For the string with fixed end points, this means the following for the action $A(u)$:
$$A(u) := \tfrac{1}{2}\,\rho\int_{t_0}^{t_1}\Bigl(\int_0^{\pi}\bigl(u_t^2 - c^2 u_x^2\bigr)dx\Bigr)dt = \text{stationary!}, \tag{116a}$$
where
$$\begin{aligned}
u(0, t) = u(\pi, t) &= 0 &&\text{for all } t \in [t_0, t_1] &&\text{(boundary condition)},\\
u(x, t_0) &= \text{fixed} &&\text{for all } x \in [0, \pi] &&\text{(initial condition)},\\
u(x, t_1) &= \text{fixed} &&\text{for all } x \in [0, \pi] &&\text{(terminal condition)}.
\end{aligned} \tag{116b}$$
The meaning of "stationary!" will be explained in (117*). Define
$$\Omega := \,]0, \pi[\; \times \;]t_0, t_1[.$$
Let $u = u(x, t)$ be a sufficiently smooth solution of (116). We set
$$w(x, t) := u(x, t) + \tau v(x, t),$$
where $\tau$ is a real number. The sufficiently smooth function $v$ is called admissible iff
$$v = 0 \quad\text{on } \partial\Omega.$$
This guarantees that both $w$ and $u$ satisfy the same side conditions (116b). Set
$$\varphi_v(\tau) := A(u + \tau v).$$
By definition, $A$ is stationary at $u$ iff
$$\varphi_v'(0) = 0 \quad\text{for all admissible functions } v. \tag{117*}$$
We want to show that (117*) implies the string equation
$$u_{tt}(x, t) - c^2 u_{xx}(x, t) = 0 \quad\text{on } \Omega. \tag{117}$$

In fact, it follows from (117*) that
$$\varphi_v'(0) = \rho\int_{\Omega}\bigl(u_t v_t - c^2 u_x v_x\bigr)dx\,dt = 0 \quad\text{for all } v \in C_0^{\infty}(\Omega).$$
Integration by parts yields
$$\int_{\Omega}\bigl(u_{tt} - c^2 u_{xx}\bigr)v\,dx\,dt = 0 \quad\text{for all } v \in C_0^{\infty}(\Omega).$$
By the variational lemma from Section 2.2.3, this implies (117).
FIGURE 5.9. (a) $u = a(x - ct)$; (b) $u = b(x + ct)$.
If the functions $a, b: \mathbb{R} \to \mathbb{R}$ are $C^2$, then the function
$$u(x, t) := a(x - ct) + b(x + ct) \tag{118}$$
is a solution of the string equation (117) for all $(x, t) \in \mathbb{R}^2$. Here, $x \mapsto a(x - ct)$ corresponds to a wave that moves with velocity $c$ from left to right (cf. Figure 5.9), whereas $x \mapsto b(x + ct)$ moves with velocity $c$ from right to left. In each textbook on partial differential equations one proves that (118) represents the most general $C^2$-solution of the string equation (117). This justifies the designation "one-dimensional wave equation" for the string equation.
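The claim that (118) solves the string equation can be checked numerically. The following is only a sketch: the profiles `a` and `b`, the velocity `c`, the sample point, and the step size `h` are hypothetical choices made for illustration, and the second derivatives are replaced by central differences.

```python
import math

def wave_residual(a, b, c, x, t, h=1e-3):
    """Central-difference estimate of u_tt - c^2 u_xx for u = a(x - ct) + b(x + ct)."""
    u = lambda x_, t_: a(x_ - c * t_) + b(x_ + c * t_)
    u_tt = (u(x, t + h) - 2.0 * u(x, t) + u(x, t - h)) / h**2
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / h**2
    return u_tt - c**2 * u_xx

# Hypothetical smooth profiles, chosen only for illustration.
a = lambda s: math.exp(-s * s)   # wave moving from left to right
b = lambda s: math.sin(s)        # wave moving from right to left

residual = wave_residual(a, b, c=2.0, x=0.3, t=0.7)
print(abs(residual))  # small, limited by the discretization error of order h^2
```

The residual vanishes only up to the discretization error, so this is a plausibility check and not a proof.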
5.13 Applications to the Schrödinger Equation

The following so-called abstract Schrödinger equation
$$\begin{aligned} u'(t) &= -iAu(t), \quad -\infty < t < \infty,\\ u(0) &= u_0 \end{aligned} \tag{119}$$
governs the motion of quantum systems.
Theorem 5.G. Let $A: D(A) \subseteq X \to X$ be a self-adjoint operator on the complex Hilbert space $X$. Then, the following hold true:
(i) There exists a unique one-parameter unitary group $\{S(t)\}$ generated by the skew-adjoint operator $-iA$.
(ii) For each $u_0 \in D(A)$, the function
$$u(t) := S(t)u_0 \quad\text{for all } t \in \mathbb{R} \tag{120}$$
is the unique $C^1$-solution to (119).
For each $u_0 \in X$, the continuous function $u: \mathbb{R} \to X$ from (120) is called a mild solution of (119).
Proof. Uniqueness. Let $\{S(t)\}$ be a one-parameter unitary group generated by the operator $C := -iA$. Then
$$S(t + h) - S(t) = S(t)(S(h) - I) = (S(h) - I)S(t). \tag{121}$$
Let $u_0 \in D(A)$ be given. Then
$$\frac{d}{dt}\bigl(S(t)u_0\bigr) = \lim_{h \to 0} S(t)h^{-1}(S(h) - I)u_0 = S(t)Cu_0,$$
since $\{S(t)\}$ is strongly continuous. On the other hand, it follows from (121) that
$$\lim_{h \to 0} h^{-1}(S(h) - I)S(t)u_0 = S(t)Cu_0.$$
By the definition of the generator $C$ in Section 5.9, this implies $S(t)u_0 \in D(C)$ and
$$CS(t)u_0 = S(t)Cu_0 \quad\text{for all } t \in \mathbb{R}.$$
Thus, the function $u(t) := S(t)u_0$ satisfies the initial-value problem
$$\begin{aligned} u'(t) &= Cu(t), \quad -\infty < t < \infty,\\ u(0) &= u_0. \end{aligned} \tag{122}$$
Let $v = v(t)$ be another solution of (122). We shall show below that
$$\begin{aligned} \frac{d}{ds}\,S(t - s)v(s) &= -S(t - s)Cv(s) + S(t - s)v'(s)\\ &= -S(t - s)Cv(s) + S(t - s)Cv(s) = 0 \end{aligned} \tag{123}$$
for all $t, s \in \mathbb{R}$. This implies that
$$S(t - s)v(s) = \text{const}(t) \quad\text{for all } s \in \mathbb{R},$$
and hence $S(-s)v(s) = v(0)$ for all $s \in \mathbb{R}$. Applying $S(s)$ to this equation, we get
$$v(s) = S(s)u_0 = u(s) \quad\text{for all } s \in \mathbb{R}.$$
This proves the uniqueness of the solution $u = u(t)$ to (122).
Let $\{T(t)\}$ be another one-parameter unitary group generated by the operator $C$. Then, the function
$$w(t) := T(t)u_0 \quad\text{for all } t \in \mathbb{R}$$
is a solution of (122), and hence $w(t) = u(t)$ for all $t \in \mathbb{R}$. This implies
$$(T(t) - S(t))u_0 = 0$$
for all $u_0$ from the dense subset $D(A)$ of the Hilbert space $X$. By the extension principle from Section 3.6,
$$T(t) = S(t) \quad\text{for all } t \in \mathbb{R}.$$
This proves that the operator $C$ cannot generate two different one-parameter unitary groups.
Proof of (123). Since the derivative $v'(s)$ exists, we get
$$v(s + h) = v(s) + hv'(s) + h\varepsilon(h),$$
where $\varepsilon(h) \to 0$ as $h \to 0$. Observe that
$$h^{-1}\bigl[S(t - s - h)v(s + h) - S(t - s)v(s)\bigr] = \Delta_1 + \Delta_2,$$
where
$$\Delta_1 := h^{-1}\bigl[S(t - s - h) - S(t - s)\bigr]v(s), \qquad \Delta_2 := h^{-1}S(t - s - h)\bigl(v(s + h) - v(s)\bigr).$$
Obviously,
$$\Delta_1 = S(t - s)h^{-1}(S(-h) - I)v(s) \to -S(t - s)Cv(s) \quad\text{as } h \to 0.$$
Moreover,
$$\Delta_2 = S(t - s - h)v'(s) + r(h), \quad\text{where } r(h) := S(t - s - h)\varepsilon(h).$$
Since $\|S(t)\| \le 1$ for all $t$, we get $\|r(h)\| \le \|\varepsilon(h)\| \to 0$ as $h \to 0$, and hence
$$\Delta_2 \to S(t - s)v'(s) \quad\text{as } h \to 0.$$
Existence. Case 1: We first make the following additional assumption.
(H) The self-adjoint operator $A: D(A) \subseteq X \to X$ possesses a finite or countable complete orthonormal system $\{u_n\}$ of eigenvectors with the corresponding eigenvalues $\{\lambda_n\}$, i.e.,
$$Au_n = \lambda_n u_n \quad\text{for all } n.$$
We now use the functional calculus from Section 5.8. By definition,
$$e^{-itA}u := \sum_n e^{-it\lambda_n}(u_n \mid u)\,u_n, \tag{124}$$
where $u \in D(e^{-itA})$ iff $\sum_n |e^{-it\lambda_n}(u_n \mid u)|^2 < \infty$. Since $|e^{i\alpha}| = 1$ for all real numbers $\alpha$, we obtain $D(e^{-itA}) = X$. Moreover,
$$\|e^{-itA}u\|^2 = \sum_n |e^{-it\lambda_n}(u_n \mid u)|^2 = \sum_n |(u_n \mid u)|^2 = \|u\|^2. \tag{125}$$
As in the proof of Theorem 5.E, it follows from
$$e^{-it\lambda}e^{-is\lambda} = e^{-i(t+s)\lambda} \quad\text{for all } t, s, \lambda \in \mathbb{R}$$
that
$$e^{-itA}e^{-isA} = e^{-i(t+s)A} \quad\text{for all } t, s \in \mathbb{R}.$$
That is, $\{e^{-itA}\}$ represents a linear one-parameter group. Hence the operator $S(t) := e^{-itA}: X \to X$ is bijective, by Section 5.9. Moreover, relation (125) tells us that $S(t)$ is unitary on $X$ for each $t$.
Set
$$u(t) := e^{-itA}u_0 \quad\text{for all } t \in \mathbb{R}.$$
First let $u_0 \in X$. According to (124), it follows from the majorant criterion (Proposition 3 in Section 5.8) that $u = u(t)$ is continuous on $\mathbb{R}$. Thus, $\{e^{-itA}\}$ is strongly continuous. Summarizing, we obtain that $\{e^{-itA}\}$ is a one-parameter unitary group.
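Now let $u_0 \in D(A)$. Formal differentiation of (124) with $u = u_0$ yields
$$u'(t) = \sum_{n=1}^{\infty} e^{-i\lambda_n t}(-i\lambda_n)(u_n \mid u_0)\,u_n. \tag{126}$$
Because of the majorant condition
$$\sum_{n=1}^{\infty} |e^{-i\lambda_n t}\lambda_n(u_n \mid u_0)|^2 \le \sum_{n=1}^{\infty} |\lambda_n(u_n \mid u_0)|^2 < \infty, \tag{127}$$
formula (126) holds true for all $t \in \mathbb{R}$, and $u'$ is continuous on $\mathbb{R}$. By (124) with $u = u_0$ and (127), we get $u(t) \in D(A)$ for all $t \in \mathbb{R}$ and
$$Au(t) = \sum_{n=1}^{\infty} e^{-i\lambda_n t}\lambda_n(u_n \mid u_0)\,u_n \quad\text{for all } t \in \mathbb{R}.$$
Hence
$$u'(t) = -iAu(t) \quad\text{for all } t \in \mathbb{R}. \tag{128}$$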
Finally, we want to show that the operator $-iA$ is the generator of $\{e^{-itA}\}$. Let $C$ denote the generator of $\{e^{-itA}\}$. Since $e^{-itA}$ is unitary,
$$(e^{-itA}u \mid e^{-itA}v) = (u \mid v) \quad\text{for all } u, v \in X.$$
Differentiating this with respect to $t$ at $t = 0$, we obtain
$$(Cu \mid v) = (u \mid -Cv) \quad\text{for all } u, v \in D(C).$$
Thus, the operator $C$ is skew-symmetric. By (128) with $t = 0$, $C$ is an extension of the skew-adjoint operator $-iA$, and hence $C = -iA$ by Corollary 7 in Section 5.2.
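The construction of Case 1 can be illustrated in finite dimensions, where a vector is represented by its Fourier coefficients $(u_n \mid u)$ and $e^{-itA}$ acts by multiplication with $e^{-it\lambda_n}$, as in (124). The following is a minimal sketch; the spectrum, the initial coefficients, and the times are hypothetical values chosen for illustration.

```python
import cmath
import math

def evolve(coeffs, eigenvalues, t):
    """Apply e^{-itA} in the eigenbasis: c_n -> e^{-i t lambda_n} c_n, cf. (124)."""
    return [cmath.exp(-1j * t * lam) * c for c, lam in zip(coeffs, eigenvalues)]

def norm(coeffs):
    """Hilbert space norm expressed through the Fourier coefficients."""
    return math.sqrt(sum(abs(c) ** 2 for c in coeffs))

eigenvalues = [1.0, 2.0, 3.5]          # hypothetical eigenvalues lambda_n
u0 = [0.6, 0.8j, -0.5 + 0.2j]          # hypothetical coefficients (u_n | u0)
t, s = 0.4, 1.3

# Group property e^{-itA} e^{-isA} = e^{-i(t+s)A}.
lhs = evolve(evolve(u0, eigenvalues, s), eigenvalues, t)
rhs = evolve(u0, eigenvalues, t + s)
group_error = max(abs(x - y) for x, y in zip(lhs, rhs))

# Unitarity (125): the norm is preserved.
norm_error = abs(norm(rhs) - norm(u0))
print(group_error, norm_error)
```

Both errors are at the level of floating-point rounding, in agreement with the group property and with (125).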
Case 2: In the general case where the operator $A: D(A) \subseteq X \to X$ is merely self-adjoint, one has to use the general functional calculus. Then, we define
$$e^{-itA}u := \int_{-\infty}^{\infty} e^{-it\lambda}\,dE_\lambda u \quad\text{for all } u \in X,$$
in the sense of Remark 4 in the upcoming Section 5.14, and the proof proceeds analogously to Case 1. The details can be found in Zeidler (1986), Vol. 2A, p. 186. □
Note that our applications of Theorem 5.G to the harmonic oscillator in quantum mechanics correspond to Case 1, for which we have given a full proof (cf. Section 5.14).
5.14 Applications to Quantum Mechanics

We want to show that the theory of Hilbert spaces represents a proper tool
for the mathematical description of quantum systems.
5.14.1 An Abstract Setting for Quantum Mechanics

A quantum system, e.g., an atom or a molecule, is described by a complex
Hilbert space X.
(i) Physical states. The unit vectors $\psi$ of $X$ are called states, that is,
$$(\psi \mid \psi) = 1.$$
The two unit vectors $\psi$ and $\phi$ are called equivalent iff $\psi = \lambda\phi$ for some complex number $\lambda$ with $|\lambda| = 1$.
Intuitively, each physical state of the quantum system corresponds to a state. We assume that equivalent states represent the same physical state.⁸
(ii) Physical quantities. The self-adjoint operators $A: D(A) \subseteq X \to X$ on the Hilbert space $X$ are called observables.
Intuitively, each "physical quantity" of the quantum system corresponds to an observable.
In particular, the energy of the quantum system corresponds to a self-adjoint operator $H: D(H) \subseteq X \to X$, which is called the Hamiltonian of the quantum system.
(iii) Measurements. Suppose that we measure the observable $A$ in the state $\psi$. A fundamental feature of quantum physics is that, in contrast to classical physics, the prediction of a measurement's outcome is only statistical. The numbers
$$\bar{A} := (\psi \mid A\psi), \quad \psi \in D(A),$$
and
$$(\Delta A)^2 := \|A\psi - \bar{A}\psi\|^2, \quad \psi \in D(A),$$
correspond to the mean value $\bar{A}$ and the dispersion $(\Delta A)^2$ of the observable $A$ in the state $\psi$, respectively. The dispersion is also called variance.
Since the operator $A$ is symmetric, the mean value $\bar{A}$ is real. Moreover, $\Delta A = \|A\psi - \bar{A}\psi\| \ge 0$.⁹
(iv) Dynamics. The equation
$$\psi(t) = e^{-itH/\hbar}\psi_0, \quad t \in \mathbb{R}, \tag{129}$$
describes the time-evolution of the quantum system, i.e., if $\psi_0 \in X$ corresponds to the state of the system at time $t = 0$, then $\psi(t)$ corresponds to the state of the system at time $t$.
Here, $\{e^{-itH/\hbar}\}$ denotes the one-parameter unitary group generated by the skew-adjoint operator $-iH/\hbar$ (cf. Theorem 5.G in Section 5.13). Recall that if $\psi_0 \in D(H)$, then
$$i\hbar\,\psi'(t) = H\psi(t) \quad\text{for all } t \in \mathbb{R}.$$
This is the Schrödinger equation, where $\hbar := h/2\pi$ and $h$ denotes Planck's quantum of action. If we measure length, time, and mass in meter, second, and kilogram, respectively, then
$$h = 6.626 \cdot 10^{-34}\ \mathrm{kg}\,\frac{\mathrm{m}^2}{\mathrm{s}}.$$
Since the operator $U(t) := e^{-itH/\hbar}$ is unitary for each $t \in \mathbb{R}$, we get
$$(\psi(t) \mid \psi(t)) = (\psi_0 \mid \psi_0) = 1,$$
i.e., if $\psi_0$ is a state, then so is $\psi(t)$ for each time $t$. Therefore, we obtain the following crucial fact:
Time evolution of quantum systems preserves states.
⁸ It was generally believed until 1952 that there exists a one-to-one correspondence between "states" and "physical states." However, in quantum field theory there exist so-called superselection rules, e.g., for charge and baryon number. These superselection rules say that there exist "states" (resp., "observables") that do not correspond to "physical states" (resp., "physical quantities"). For example, suppose that the two states $\psi_1$ and $\psi_2$ correspond to a charged particle, with the charge $e_1$ and $e_2$, respectively, where $e_1 \ne e_2$. Then, the state $\alpha_1\psi_1 + \alpha_2\psi_2$ with $\alpha_1 \ne 0$ and $\alpha_2 \ne 0$ does not correspond to a "physical state."
⁹ To motivate the definition of $\Delta A$, assume that $\psi \in D(A^2)$, i.e., $\psi \in D(A)$ and $A\psi \in D(A)$. Then
$$(\Delta A)^2 = ((A - \bar{A}I)\psi \mid (A - \bar{A}I)\psi) = (\psi \mid (A - \bar{A}I)^2\psi),$$
i.e., $(\Delta A)^2$ is the mean value of the observable $(A - \bar{A}I)^2$. This coincides with the definition of dispersion (variance) in probability theory.
5.14.2 Discussion of the Abstract Setting

Example 1 (Physical interpretation of eigenvalues). Suppose that $\alpha$ is an eigenvalue of the observable $A$ with the corresponding eigenstate $\psi$, i.e.,
$$A\psi = \alpha\psi, \qquad (\psi \mid \psi) = 1.$$
Then
$$\bar{A} = (\psi \mid A\psi) = \alpha$$
and
$$\Delta A = \|A\psi - \bar{A}\psi\| = 0.$$
In this case we say that the observable $A$ has the "sharp" value $\alpha$ in the eigenstate $\psi$.
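Example 2 (Physical interpretation of Fourier coefficients and probability). Suppose that $\{\psi_n\}$ is an orthonormal system of eigenvectors of the observable $A$, i.e.,
$$A\psi_n = \alpha_n\psi_n \quad\text{for all } n.$$
Set $Y :=$ closure of span $\{\psi_n\}$. Then, $\{\psi_n\}$ represents a complete orthonormal system in $Y$. Thus, for each state $\psi \in Y$, we get the Fourier expansion
$$\psi = \sum_n (\psi_n \mid \psi)\,\psi_n. \tag{130}$$
Since $(\psi \mid \psi) = 1$,
$$\sum_n |(\psi_n \mid \psi)|^2 = 1.$$
Therefore, the following physical interpretation of the Fourier coefficients makes sense. Suppose that the system is in the state $\psi \in Y$. Then
$$|(\psi_n \mid \psi)|^2 = \text{probability for the realization of the eigenstate } \psi_n.$$
Assume now that $\psi \in Y$ and $A\psi \in Y$. Then
$$\bar{A} = \sum_n \alpha_n |(\psi_n \mid \psi)|^2 \tag{131}$$
and
$$(\Delta A)^2 = \sum_n (\alpha_n - \bar{A})^2 |(\psi_n \mid \psi)|^2. \tag{132}$$
Proof. Since $A\psi \in Y$,
$$A\psi = \sum_n (\psi_n \mid A\psi)\,\psi_n = \sum_n (A\psi_n \mid \psi)\,\psi_n = \sum_n \alpha_n(\psi_n \mid \psi)\,\psi_n.$$
Hence
$$\bar{A} = (\psi \mid A\psi) = \sum_n \alpha_n |(\psi_n \mid \psi)|^2.$$
This is (131). Moreover,
$$(\Delta A)^2 = ((A - \bar{A}I)\psi \mid (A - \bar{A}I)\psi) = \Bigl(\sum_n (\alpha_n - \bar{A})(\psi_n \mid \psi)\psi_n \Bigm| \sum_m (\alpha_m - \bar{A})(\psi_m \mid \psi)\psi_m\Bigr),$$
which implies (132). □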
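Formulas (131) and (132) say that the mean value and the dispersion are the usual mean and variance of the eigenvalues, weighted by the probabilities $|(\psi_n \mid \psi)|^2$. A small sketch with hypothetical eigenvalues and Fourier coefficients (chosen so that the probabilities sum to one):

```python
def measurement_statistics(alphas, coeffs):
    """Return probabilities |(psi_n | psi)|^2, the mean (131) and the dispersion (132)."""
    probs = [abs(c) ** 2 for c in coeffs]
    mean = sum(a * p for a, p in zip(alphas, probs))
    variance = sum((a - mean) ** 2 * p for a, p in zip(alphas, probs))
    return probs, mean, variance

alphas = [1.0, 2.0, 4.0]              # hypothetical eigenvalues alpha_n
coeffs = [0.5, 0.5j, 2 ** -0.5]       # hypothetical coefficients (psi_n | psi)
probs, mean, variance = measurement_statistics(alphas, coeffs)
print(probs, mean, variance)
```

Here the probabilities are 1/4, 1/4, and 1/2, so that the mean value is 2.75 and the dispersion is 1.6875, up to rounding.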

More generally, let the quantum system be in the state $\psi \in X$. Let $\varphi$ be another state. Then, we define
$$|(\psi \mid \varphi)|^2 = \text{probability for realizing the state } \varphi.$$
This definition makes sense. In fact, by the Schwarz inequality, we have
$$0 \le |(\psi \mid \varphi)|^2 \le (\psi \mid \psi)(\varphi \mid \varphi) = 1.$$
If $U: X \to X$ is a unitary operator, then $(U\psi \mid U\varphi) = (\psi \mid \varphi)$, and hence
$$|(U\psi \mid U\varphi)|^2 = |(\psi \mid \varphi)|^2 \quad\text{for all } \varphi, \psi \in X.$$
Therefore, we obtain the following:
Unitary operators preserve probability.
In particular, since the time-evolution operator $e^{-itH/\hbar}$ is unitary, we obtain that
Time evolution of quantum systems preserves probability.
Proposition 3 (The uncertainty inequality). Let $A$ and $B$ be two observables, and let $\psi$ be a state in the Hilbert space $X$ such that
$$\psi \in D(A) \cap D(B), \quad A\psi \in D(B), \quad\text{and}\quad B\psi \in D(A).$$
Then
$$\Delta A\,\Delta B \ge |C|, \tag{133}$$
where $\Delta A$ and $\Delta B$ correspond to the state $\psi$, and
$$C = 2^{-1}((BA - AB)\psi \mid \psi).$$
Roughly speaking, relation (133) tells us that
If the two observables $A$ and $B$ do not commute, then it is impossible to measure precisely the corresponding two physical quantities at the same time.
In Section 5.14.5 we will show that (133) implies the classical Heisenberg uncertainty principle:
It is impossible to measure precisely position and momentum of a quantum particle at the same time.
The proof of the fundamental inequality (133) will be based on the Schwarz inequality.
Proof. For all $\bar{A}, \bar{B} \in \mathbb{R}$,
$$\begin{aligned}
((BA - AB)\psi \mid \psi) &= ((B - \bar{B}I)(A - \bar{A}I)\psi \mid \psi) - ((A - \bar{A}I)(B - \bar{B}I)\psi \mid \psi)\\
&= ((B - \bar{B}I)(A - \bar{A}I)\psi \mid \psi) - (\psi \mid (B - \bar{B}I)(A - \bar{A}I)\psi)\\
&= 2i\,\operatorname{Im}((B - \bar{B}I)(A - \bar{A}I)\psi \mid \psi).
\end{aligned}$$
With $\bar{A} := (\psi \mid A\psi)$ and $\bar{B} := (\psi \mid B\psi)$, the Schwarz inequality yields
$$\begin{aligned}
\Delta A\,\Delta B = \|(A - \bar{A}I)\psi\|\,\|(B - \bar{B}I)\psi\| &\ge |((A - \bar{A}I)\psi \mid (B - \bar{B}I)\psi)|\\
&\ge |\operatorname{Im}((B - \bar{B}I)(A - \bar{A}I)\psi \mid \psi)| = 2^{-1}|((BA - AB)\psi \mid \psi)|. \qquad\Box
\end{aligned}$$
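Inequality (133) can be tested in the finite-dimensional Hilbert space $\mathbb{C}^2$, where every Hermitian matrix is an observable. The sketch below uses the Pauli matrices $\sigma_x$ and $\sigma_y$ and a hypothetical unit vector `psi`; these choices are illustrative assumptions and do not come from the text. The inner product is antilinear in the first argument.

```python
import math

def matvec(M, v):
    """Matrix-vector product for small dense matrices given as nested lists."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def inner(u, v):
    """Inner product (u | v), antilinear in the first argument."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def dispersion(M, psi):
    """Delta M = ||M psi - mean * psi|| with mean = (psi | M psi)."""
    m_psi = matvec(M, psi)
    mean = inner(psi, m_psi).real
    r = [x - mean * p for x, p in zip(m_psi, psi)]
    return math.sqrt(inner(r, r).real)

def commutator_bound(A, B, psi):
    """|C| with C = ((BA - AB) psi | psi) / 2, as in (133)."""
    ba = matvec(B, matvec(A, psi))
    ab = matvec(A, matvec(B, psi))
    return abs(inner([x - y for x, y in zip(ba, ab)], psi)) / 2.0

A = [[0, 1], [1, 0]]                     # Pauli matrix sigma_x
B = [[0, -1j], [1j, 0]]                  # Pauli matrix sigma_y
psi = [complex(0.6), complex(0.0, 0.8)]  # hypothetical unit vector

lhs = dispersion(A, psi) * dispersion(B, psi)
rhs = commutator_bound(A, B, psi)
print(lhs, rhs)
```

For this particular state both sides equal 0.28 up to rounding, so (133) holds with equality.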
5.14.3 A Look at the General Functional Calculus

Remark 4 (General functional calculus). In the general spectral theory for self-adjoint operators, one shows that each self-adjoint operator (observable) allows a representation of the following form:
$$\text{``}A = \int_{-\infty}^{\infty} \lambda\,dE_\lambda.\text{''}$$
Explicitly, this means that
$$(v \mid Au) = \int_{-\infty}^{\infty} \lambda\,d(v \mid E_\lambda u) \quad\text{for all } u \in D(A),\ v \in X, \tag{134}$$
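and
$$u \in D(A) \quad\text{iff}\quad \int_{-\infty}^{\infty} \lambda^2\,d(u \mid E_\lambda u) < \infty.$$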
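This way it is possible to define functions of the operator $A$ through
$$\text{``}F(A) = \int_{-\infty}^{\infty} F(\lambda)\,dE_\lambda\text{''}$$
for the given function $F: \mathbb{R} \to \mathbb{C}$. Explicitly, this means that
$$(v \mid F(A)u) = \int_{-\infty}^{\infty} F(\lambda)\,d(v \mid E_\lambda u) \quad\text{for all } u \in D(F(A)),\ v \in X,$$
where
$$u \in D(F(A)) \quad\text{iff}\quad \int_{-\infty}^{\infty} |F(\lambda)|^2\,d(u \mid E_\lambda u) < \infty.$$
In addition,
$$\|F(A)u\|^2 = \int_{-\infty}^{\infty} |F(\lambda)|^2\,d(u \mid E_\lambda u) \quad\text{for all } u \in D(F(A)).$$
Moreover, we get $D(F(A)^*) = D(F(A))$, and
$$\text{``}F(A)^* = \int_{-\infty}^{\infty} \overline{F(\lambda)}\,dE_\lambda,\text{''}$$
i.e.,
$$(v \mid F(A)^*u) = \int_{-\infty}^{\infty} \overline{F(\lambda)}\,d(v \mid E_\lambda u) \quad\text{for all } u \in D(F(A)^*),\ v \in X.$$
The meaning of the Stieltjes integrals $\int \ldots d(v \mid E_\lambda u)$ will be discussed ahead.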
This generalized functional calculus dates back to von Neumann (1932)
who generalized the spectral theory of Hilbert (1912) for bounded symmet-
ric operators to general self-adjoint operators.
In terms of quantum theory, the formula
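allows the following interpretation: Let $J$ be an interval. Suppose we measure the observable $A$ in the state $\psi$. Let $\alpha$ be the measured value. Then
$$\int_J d(\psi \mid E_\lambda\psi) = \text{probability for } \alpha \in J.$$
Using probability theory, the corresponding mean value $\bar{A}$ and the dispersion $(\Delta A)^2$ are given through
$$\bar{A} = \int_{-\infty}^{\infty} \lambda\,d(\psi \mid E_\lambda\psi), \quad \psi \in D(A),$$
$$(\Delta A)^2 = \int_{-\infty}^{\infty} (\lambda - \bar{A})^2\,d(\psi \mid E_\lambda\psi), \quad \psi \in D(A^2).$$
This implies $\bar{A} = (\psi \mid A\psi)$ and $(\Delta A)^2 = (\psi \mid (A - \bar{A}I)^2\psi) = \|(A - \bar{A}I)\psi\|^2$, which coincides with the definition given earlier.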
which coincides with the definition given earlier.
This general spectral theory can be found in Riesz and Nagy (1955).
Applications to quantum theory are contained in Reed and Simon (1972),
Triebel (1972), and Prugovecki (1981).
Remark 5 (Spectral family and Stieltjes integral). More precisely, the general functional calculus is to be understood as follows. For each self-adjoint operator $A: D(A) \subseteq X \to X$, there exists a unique spectral family $\{E_\lambda\}$ having the following properties:
(i) For each $\lambda \in \mathbb{R}$, the operator $E_\lambda: X \to X$ is linear, continuous, and self-adjoint with $E_\lambda^2 = E_\lambda$, i.e., $E_\lambda$ is an orthogonal projection.
(ii) For each $u \in X$, the function
$$\lambda \mapsto (u \mid E_\lambda u)$$
is nondecreasing on $\mathbb{R}$.
(iii) For each $u \in X$,
$$\lim_{\lambda \to -\infty} E_\lambda u = 0 \quad\text{and}\quad \lim_{\lambda \to +\infty} E_\lambda u = u.$$
(iv) For each u E X and each fL E JR,
(v) The operator $A$ allows a representation of the form (134).
Since $E_\lambda^* = E_\lambda$ and $E_\lambda^2 = E_\lambda$,
$$(u \mid E_\lambda u) = (E_\lambda u \mid E_\lambda u) = \|E_\lambda u\|^2 \quad\text{for all } u \in X,\ \lambda \in \mathbb{R}.$$
Let $\lambda \in \mathbb{R}$. Then, for all $u, v \in X$,
$$4(v \mid E_\lambda u) = \|E_\lambda(v + u)\|^2 - \|E_\lambda(v - u)\|^2 + \alpha\|E_\lambda(v + \alpha u)\|^2 - \alpha\|E_\lambda(v - \alpha u)\|^2, \tag{135}$$
where $\alpha = 0$ or $\alpha = i$ if $X$ is a real or complex Hilbert space, respectively.
Since the function $\lambda \mapsto \|E_\lambda w\|^2$ is nondecreasing for each $w \in X$, it follows from (135) that the function
$$\lambda \mapsto (v \mid E_\lambda u)$$
is of bounded variation. That is, the integrals $\int \ldots d(v \mid E_\lambda u)$ from Remark 4 are to be understood as Stieltjes integrals (see the appendix).
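Standard Example 6. Let $A: D(A) \subseteq X \to X$ be a self-adjoint operator on the separable Hilbert space $X$, which possesses a complete orthonormal system $\{u_1, u_2, \ldots\}$ of eigenvectors with the corresponding eigenvalues $\{\lambda_1, \lambda_2, \ldots\}$, i.e.,
$$Au_n = \lambda_n u_n \quad\text{for all } n.$$
Then,
$$E_\lambda u = \sum_n e_\lambda(\lambda_n)(u_n \mid u)\,u_n \quad\text{for all } u \in X,\ \lambda \in \mathbb{R},$$
where
$$e_\lambda(\mu) := \begin{cases} 0 & \text{if } \lambda \le \mu,\\ 1 & \text{if } \mu < \lambda. \end{cases}$$
Let the arbitrary function $F: \mathbb{R} \to \mathbb{C}$ be given. Then, for all $u \in D(F(A))$ and $v \in X$, we get
$$(v \mid F(A)u) = \int_{-\infty}^{\infty} F(\lambda)\,d(v \mid E_\lambda u) = \sum_n F(\lambda_n)(u_n \mid u)(v \mid u_n).$$
Here, $u \in D(F(A))$ iff
$$\int_{-\infty}^{\infty} |F(\lambda)|^2\,d(u \mid E_\lambda u) = \sum_n |F(\lambda_n)|^2 |(u_n \mid u)|^2 < \infty.$$
Thus, the functional calculus from Section 5.8 represents a special case of the general functional calculus from Remark 4.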
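In the situation of Standard Example 6, the function $\lambda \mapsto (v \mid E_\lambda u)$ is a step function that jumps by $(v \mid u_n)(u_n \mid u)$ at $\lambda = \lambda_n$, so the Stieltjes integral collapses to a sum. The sketch below checks this for three hypothetical eigenvalues by comparing a Riemann-Stieltjes sum on a fine grid with the sum over the eigenvalues. It assumes the convention that $E_\lambda$ collects the eigenvalues $\lambda_n < \lambda$; the grid, the coefficients, and the function `F` are illustrative choices.

```python
def spectral_measure(eigenvalues, coeffs_u, coeffs_v, lam):
    """(v | E_lambda u): sum of conj((u_n|v)) * (u_n|u) over eigenvalues lambda_n < lambda."""
    return sum(cv.conjugate() * cu
               for ln, cu, cv in zip(eigenvalues, coeffs_u, coeffs_v) if ln < lam)

def stieltjes_integral(F, eigenvalues, coeffs_u, coeffs_v, grid):
    """Riemann-Stieltjes sum of F(lambda) d(v | E_lambda u) over the given grid."""
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        jump = (spectral_measure(eigenvalues, coeffs_u, coeffs_v, right)
                - spectral_measure(eigenvalues, coeffs_u, coeffs_v, left))
        total += F(0.5 * (left + right)) * jump
    return total

eigenvalues = [0.5, 1.25, 3.0]                 # hypothetical eigenvalues lambda_n
coeffs_u = [1.0 + 0j, 0.5j, -0.25 + 0j]        # hypothetical coefficients (u_n | u)
coeffs_v = [0.5 + 0j, 1.0 + 0j, 1j]            # hypothetical coefficients (u_n | v)
F = lambda lam: lam * lam

grid = [-1.0 + 0.001 * k for k in range(6001)]  # covers [-1, 5]
approx = stieltjes_integral(F, eigenvalues, coeffs_u, coeffs_v, grid)
exact = sum(F(ln) * cv.conjugate() * cu
            for ln, cu, cv in zip(eigenvalues, coeffs_u, coeffs_v))
print(approx, exact)
```

The two values agree up to the grid resolution, since $F$ is evaluated at interval midpoints instead of exactly at the jump points.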
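Example 7 (The multiplication operator). Let $X := L_2^{\mathbb{C}}(\mathbb{R})$. Define
$$(Au)(x) := xu(x) \quad\text{for all } x \in \mathbb{R},$$
where $u \in D(A)$ iff $u \in X$ and $\int_{-\infty}^{\infty} |xu(x)|^2dx < \infty$. By Example 10 in Section 5.2, the operator $A: D(A) \subseteq X \to X$ is self-adjoint.
(i) The spectral family $\{E_\lambda\}$ of $A$ is given through
$$(E_\lambda u)(x) := \begin{cases} u(x) & \text{if } x \le \lambda,\\ 0 & \text{if } x > \lambda, \end{cases}$$
for all $u \in X$ and each $\lambda \in \mathbb{R}$.
(ii) For all $\psi \in X$ and $-\infty \le a < b \le \infty$,
$$\int_a^b d(\psi \mid E_\lambda\psi) = \int_a^b |\psi(x)|^2dx. \tag{136}$$
Proof. Ad (i). For all $u, v \in X$ and $\lambda \in \mathbb{R}$,
$$(v \mid E_\lambda u) = \int_{-\infty}^{\lambda} \overline{v(x)}\,u(x)\,dx = (E_\lambda v \mid u). \tag{137}$$
In addition,
$$\|E_\lambda u\| \le \|u\| \quad\text{for all } u \in X.$$
Thus, the operator $E_\lambda: X \to X$ is linear, continuous, and self-adjoint. Obviously, $E_\lambda^2 = E_\lambda$ for all $\lambda \in \mathbb{R}$.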
By (137), the function $\lambda \mapsto (u \mid E_\lambda u)$ is nondecreasing on $\mathbb{R}$. Let $-\infty < \lambda < \mu < \infty$. Then, for each $u \in X$,
and hence
Similarly, we get
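$$\lim_{\mu \to -\infty} E_\mu u = 0 \quad\text{and}\quad \lim_{\lambda \to +\infty} (E_\lambda u - u) = 0 \quad\text{for all } u \in X.$$
Let $-\infty \le a < b \le \infty$, and let $F: \mathbb{R} \to \mathbb{C}$ be a measurable function. It follows from (137) and from (5) in the appendix that
$$\int_a^b F(\lambda)\,d(v \mid E_\lambda u) = \int_a^b F(\lambda)\,\overline{v(\lambda)}\,u(\lambda)\,d\lambda, \tag{138}$$
provided $u, v \in X$ and the integral $\int_{-\infty}^{\infty} F(\lambda)\overline{v(\lambda)}u(\lambda)\,d\lambda$ exists.
For all $u \in D(A)$ and $v \in X$,
$$(v \mid Au) = \int_{-\infty}^{\infty} \overline{v(\lambda)}\,\lambda u(\lambda)\,d\lambda = \int_{-\infty}^{\infty} \lambda\,d(v \mid E_\lambda u).$$
In this connection, observe that $u \in D(A)$ iff $\int_{-\infty}^{\infty} \lambda^2|u(\lambda)|^2d\lambda < \infty$. Thus, the integral $\int_{-\infty}^{\infty} \overline{v(\lambda)}\lambda u(\lambda)\,d\lambda$ exists, by the Schwarz inequality.
Ad (ii). Use (138) with $F \equiv 1$. □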
5.14.4 Quantization of Classical Mechanics and the Schrödinger Equation

In classical mechanics, the motion $x = x(t)$ of a particle of mass $m$ in $\mathbb{R}^3$ is governed by the following classical Newtonian equation:
$$mx''(t) = K(x(t)). \tag{139}$$
Here, $x$ denotes the vector of position. Let us suppose that the force field $K = K(x)$ possesses a potential $U$, i.e.,
$$K(x) = -\operatorname{grad} U(x).$$
Then, the total energy $E$ of the particle is given through
$$E = \frac{p^2}{2m} + U(x), \tag{140}$$
where $p(t) := mx'(t)$ denotes the momentum vector at time $t$. Recall that $\frac{p^2}{2m}$ is called the kinetic energy and $U$ is called the potential energy of the particle.
If $x = x(t)$ is a solution to (139), then
$$E = \text{const along the motion } x = x(t) \quad\text{(conservation of energy)},$$
provided the potential $U$ is $C^1$ in a neighborhood of the trajectory. In fact,
$$E'(t) = mx'(t)x''(t) + x'(t)\operatorname{grad} U(x(t)) = 0.$$
In quantum mechanics, the motion of a particle of mass $m$ is described by the following Schrödinger equation:
$$i\hbar\,\psi_t = -\frac{\hbar^2}{2m}\Delta\psi + U\psi, \tag{141}$$
which Schrödinger formulated in 1926. We are looking for solutions $\psi = \psi(x, t)$ of (141) such that
$$\int_{\mathbb{R}^3} |\psi(x, t)|^2dx = 1 \quad\text{for all times } t \in \mathbb{R}. \tag{142}$$
The function $\psi$ describes the physical state of the particle. More precisely, $\psi$ allows the following interpretation:
(i) Probability. Let $G$ be a nonempty open set in $\mathbb{R}^3$. Then,
$$\int_G |\psi(x, t)|^2dx = \text{probability of finding the particle in the set } G \text{ at time } t.$$
(ii) Stationary particle states of fixed energy $E$. Substituting the ansatz
$$\psi(x, t) = \phi(x)\,e^{-iEt/\hbar}$$
into the Schrödinger equation (141), we get the stationary Schrödinger equation
$$E\phi = -\frac{\hbar^2}{2m}\Delta\phi + U\phi, \tag{143}$$
where the eigenvalue $E$ corresponds to the energy of the particle in the state $\phi$. The normalization condition (142) is equivalent to
$$\int_{\mathbb{R}^3} |\phi(x)|^2dx = 1.$$
Formally, the Schrödinger equation (141) is obtained from the classical energy relation (140) by using the following simple substitutions:
$$E \Rightarrow i\hbar\frac{\partial}{\partial t} \quad\text{and}\quad p \Rightarrow -i\hbar\operatorname{grad}.$$
That is, when passing from classical mechanics to quantum mechanics, classical physical quantities (e.g., energy and momentum) are replaced with differential operators.¹⁰
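In this connection, observe that in a Cartesian coordinate system we get
$$\operatorname{grad} = \sum_{j=1}^{3} e_j\frac{\partial}{\partial x_j},$$
where the basis vectors $\{e_1, e_2, e_3\}$ form an orthonormal system. Hence
$$p^2 \Rightarrow (-i\hbar\operatorname{grad})^2 = -\hbar^2\Delta.$$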
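Remark 8 (Interpretation in terms of functional analysis). Let us introduce the Hilbert space
$$X := L_2^{\mathbb{C}}(\mathbb{R}^3).$$
Then, the normalization condition (142) reads as follows:
$$(\psi(t) \mid \psi(t)) = 1 \quad\text{for all } t \in \mathbb{R},$$
i.e., $\psi(t)$ is a unit vector in $X$ for each $t$. The differential operator $\mathcal{H}: D(\mathcal{H}) \subseteq X \to X$ given by $D(\mathcal{H}) := C_0^{\infty}(\mathbb{R}^3)$ and
$$\mathcal{H}\varphi := -\frac{\hbar^2}{2m}\Delta\varphi + U(x)\varphi$$
is called the formal Hamiltonian of the particle. Integration by parts yields
$$\int_{\mathbb{R}^3} \overline{(\Delta\phi)}\,\psi\,dx = \int_{\mathbb{R}^3} \overline{\phi}\,(\Delta\psi)\,dx \quad\text{for all } \phi, \psi \in D(\mathcal{H}),$$
and hence
$$(\phi \mid \mathcal{H}\psi) = \int_{\mathbb{R}^3} \overline{\phi}\,\mathcal{H}\psi\,dx = \int_{\mathbb{R}^3} \overline{(\mathcal{H}\phi)}\,\psi\,dx = (\mathcal{H}\phi \mid \psi) \quad\text{for all } \phi, \psi \in D(\mathcal{H}),$$
provided the real function $U = U(x)$ is sufficiently regular. Thus, the formal Hamiltonian $\mathcal{H}$ is a linear symmetric operator on the Hilbert space $X$.
¹⁰ A more detailed motivation can be found in Zeidler (1986), Vol. 4, p. 112.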
One of the main tasks of a rigorous mathematical approach to quantum mechanics consists in extending the formal Hamiltonian $\mathcal{H}$ to a self-adjoint operator $H: D(H) \subseteq X \to X$, which is called the Hamiltonian of the particle. Then, the spectrum of $H$ corresponds to the possible energy values of the particle. An important special case will be studied in the next section. In Problem 5.10 we will show that for the electron of the hydrogen atom the Hamiltonian $H$ can be obtained as the Friedrichs extension of $\mathcal{H}$.
5.14.5 Applications to the Harmonic Oscillator in Quantum Mechanics

We want to explain how the abstract setting of quantum physics from Section 5.14.1 can be realized in the special case of a harmonic oscillator.
In classical mechanics, a harmonic oscillator corresponds to a point of mass $m > 0$, where the motion $x = x(t)$ in $\mathbb{R}$ is governed by the ordinary differential equation
$$mx''(t) = -m\omega^2x(t) \tag{144}$$
for fixed $\omega > 0$ (cf. Example 7 in Section 5.9). The total energy is given through
$$E = \frac{p^2}{2m} + \frac{m\omega^2x^2}{2},$$
where $p(t) := mx'(t)$ denotes the momentum of the particle at time $t$. That is,
$$\text{velocity of the particle} = \frac{\text{momentum}}{\text{mass}}.$$
In quantum mechanics, the motion of the harmonic oscillator is described by the Schrödinger equation
$$i\hbar\,\psi_t = -\frac{\hbar^2}{2m}\psi_{xx} + \frac{m\omega^2x^2}{2}\psi. \tag{145}$$
This is formally obtained from (144) by means of the substitutions
$$p \Rightarrow -i\hbar\frac{\partial}{\partial x} \quad\text{and}\quad E \Rightarrow i\hbar\frac{\partial}{\partial t}. \tag{146}$$
We are looking for solutions $\psi = \psi(x, t)$ of (145) with
$$\int_{-\infty}^{\infty} |\psi(x, t)|^2dx = 1 \quad\text{for all times } t \in \mathbb{R}.$$
Using the ansatz
$$\psi(x, t) = \phi(x)\,e^{-iEt/\hbar},$$
from (145) we obtain the stationary Schrödinger equation
$$E\phi = -\frac{\hbar^2}{2m}\phi'' + \frac{m\omega^2}{2}x^2\phi \quad\text{on } \mathbb{R}. \tag{147}$$
Let us introduce the Hilbert space
$$X := L_2^{\mathbb{C}}(\mathbb{R})$$
with the inner product
$$(\phi \mid \psi) = \int_{-\infty}^{\infty} \overline{\phi(x)}\,\psi(x)\,dx.$$
Each unit vector $\phi \in X$ is called a state of the particle (harmonic oscillator), i.e.,
$$\int_{-\infty}^{\infty} |\phi(x)|^2dx = 1.$$
Let $-\infty \le a < b \le \infty$. By definition,
$$\int_a^b |\phi(x)|^2dx := \text{probability of finding the particle in the interval } [a, b]. \tag{148}$$
This will be motivated in Remark 16.
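Definition 9. The formal Hamiltonian $\mathcal{H}: D(\mathcal{H}) \subseteq X \to X$ of the harmonic oscillator is given through
$$\mathcal{H}\phi := -\frac{\hbar^2}{2m}\phi'' + \frac{m\omega^2}{2}x^2\phi,$$
where $D(\mathcal{H}) := \mathcal{S}$.
The space $\mathcal{S}$ has been introduced in Section 3.7. Recall from Section 3.4 that the Hermitean functions $u_n$ are defined through
$$u_n(x) := (-1)^n\alpha_n\,e^{x^2/2}\,\frac{d^n e^{-x^2}}{dx^n}, \quad n = 0, 1, 2, \ldots, \tag{149}$$
where
$$\alpha_n := \frac{1}{2^{n/2}\,\pi^{1/4}\,(n!)^{1/2}}.$$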
Proposition 10.
(i) The operator $\mathcal{H}$ is symmetric.
(ii) For all $n = 0, 1, \ldots$,
$$\mathcal{H}\phi_n = E_n\phi_n, \tag{150}$$
where $\phi_n(x) := u_n\bigl(\tfrac{x}{x_0}\bigr)\,x_0^{-1/2}$ with $x_0 := \bigl(\tfrac{\hbar}{m\omega}\bigr)^{1/2}$, and
$$E_n := \hbar\omega\bigl(n + \tfrac{1}{2}\bigr), \quad n = 0, 1, 2, \ldots. \tag{150*}$$
(iii) The eigenfunctions $\{\phi_n\}$ form a complete orthonormal system in $X$.
By (iii), all the eigenvalues of $\mathcal{H}$ are given through (150*). In terms of physics, the numbers $E_0, E_1, \ldots$ are the only possible energy levels of the harmonic oscillator in quantum mechanics. This tells us that
The energy of the simplest oscillating system is quantized.
Planck made this fundamental discovery in 1900. He formulated such a quantum hypothesis in order to get the right radiation law.
This marked the beginning of quantum physics. Relation (150*) is closely related to the following fundamental physical fact. In modern physics, one assumes that light consists of particles called photons. The energy $\Delta E$ of such a photon is given through
$$\Delta E = E_{n+1} - E_n = \hbar\omega, \quad\text{where } \omega = \frac{2\pi c}{\lambda}. \tag{151}$$
Here, $c$ = velocity of light in vacuum and $\lambda$ = wavelength of the specific light. A derivation of Planck's radiation law from (151) and its applications to the expansion of our universe and the vaporization of black holes can be found in Zeidler (1986), Vol. 4.
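Proof of Proposition 10. Ad (i). For all $\phi, \psi \in \mathcal{S}$, integration by parts yields
$$\begin{aligned}
\int_{-\infty}^{\infty} \overline{\phi''}\,\psi\,dx &= \lim_{N \to +\infty} \overline{\phi'(x)}\,\psi(x)\Big|_{-N}^{N} - \int_{-\infty}^{\infty} \overline{\phi'}\,\psi'\,dx\\
&= -\int_{-\infty}^{\infty} \overline{\phi'}\,\psi'\,dx = \int_{-\infty}^{\infty} \overline{\phi}\,\psi''\,dx.
\end{aligned}$$
Hence
$$(\phi \mid \mathcal{H}\psi) = (\mathcal{H}\phi \mid \psi) \quad\text{for all } \phi, \psi \in D(\mathcal{H}).$$
Ad (ii). By (149), $u_n(x) = e^{-x^2/2} \times \text{polynomial}(x)$. Hence we obtain $\phi_n \in \mathcal{S}$. A fairly simple computation yields (150).
Ad (iii). This has been proved in Section 3.4. □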
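The "fairly simple computation" behind (150) can be cross-checked numerically. The sketch below works in units with $\hbar = m = \omega = 1$ (an assumption made for illustration), so that $x_0 = 1$, $\phi_n = u_n$, and $E_n = n + \tfrac{1}{2}$. It further assumes the standard representation $u_n(x) = \alpha_n H_n(x)e^{-x^2/2}$ with the Hermite polynomials $H_n$ generated by the three-term recurrence, and it replaces $\phi_n''$ by a central difference.

```python
import math

def hermite_function(n, x):
    """u_n(x) = alpha_n H_n(x) exp(-x^2/2), H_n from H_{k+1} = 2x H_k - 2k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * x            # H_0 and H_1
    if n == 0:
        h = h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    alpha = 1.0 / math.sqrt(2.0 ** n * math.factorial(n) * math.sqrt(math.pi))
    return alpha * h * math.exp(-0.5 * x * x)

def eigen_residual(n, x, step=1e-3):
    """(H phi_n - E_n phi_n)(x) for hbar = m = omega = 1, with phi_n'' by central differences."""
    u = lambda y: hermite_function(n, y)
    u_xx = (u(x + step) - 2.0 * u(x) + u(x - step)) / step**2
    return -0.5 * u_xx + 0.5 * x * x * u(x) - (n + 0.5) * u(x)

worst = max(abs(eigen_residual(n, x))
            for n in range(5) for x in (-2.0, -0.7, 0.0, 0.4, 1.5))
print(worst)  # small, limited by the finite-difference error
```

The residual is nonzero only because of the finite-difference approximation; the exact eigenvalue relation is (150).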
Definition 11. The operator $H: D(H) \subseteq X \to X$ defined through
$$H\phi := \sum_{n=0}^{\infty} E_n(\phi_n \mid \phi)\,\phi_n$$
is called the Hamiltonian of the harmonic oscillator. Here, $\phi \in D(H)$ iff
$$\sum_{n=0}^{\infty} |E_n(\phi_n \mid \phi)|^2 < \infty.$$
Proposition 12.
(i) The Hamiltonian $H: D(H) \subseteq X \to X$ is self-adjoint.
(ii) The operator $H$ is an extension of the formal Hamiltonian $\mathcal{H}$.
Proof. Ad (i). This follows from Proposition 2 in Section 5.8.
Ad (ii). Let $\phi \in D(\mathcal{H})$, i.e., $\phi \in \mathcal{S}$. By the definition of the space $\mathcal{S}$ in Section 3.7, $\mathcal{H}\phi \in \mathcal{S}$. Since $\{\phi_n\}$ forms a complete orthonormal system in $X$,
$$\mathcal{H}\phi = \sum_{n=0}^{\infty} (\phi_n \mid \mathcal{H}\phi)\,\phi_n = \sum_{n=0}^{\infty} E_n(\phi_n \mid \phi)\,\phi_n, \tag{152}$$
by the symmetry of $\mathcal{H}$ along with (150). Hence the series (152) is convergent, i.e., $\phi \in D(H)$. □
Since $\mathcal{H} \subseteq H$, we get
$$H\phi_n = E_n\phi_n, \quad n = 0, 1, 2, \ldots.$$
According to Example 1, we say that the particle has the sharp energy $E_n$ in the state $\phi_n$.
Suppose that the particle is in the state $\phi$. By Example 2,
$$|(\phi_n \mid \phi)|^2 = \text{probability of having the sharp energy } E_n.$$
Remark 13 (Dynamics of the harmonic oscillator). We are given
$$\psi_0 \in X \quad\text{with } (\psi_0 \mid \psi_0) = 1.$$
Suppose that the harmonic oscillator is in the state $\psi_0$ at time $t = 0$. According to Section 5.14.1, the state $\psi(t)$ of the harmonic oscillator at time $t$ is given through
$$\psi(t) = e^{-itH/\hbar}\psi_0 \quad\text{for all } t \in \mathbb{R}.$$
Explicitly,
$$\psi(t) = \sum_{n=0}^{\infty} e^{-itE_n/\hbar}(\phi_n \mid \psi_0)\,\phi_n \quad\text{for all } t \in \mathbb{R}.$$
This series converges in the Hilbert space $X = L_2^{\mathbb{C}}(\mathbb{R})$.
In addition, if $\psi_0 \in \mathcal{S}$, then it follows from Theorem 5.G in Section 5.13 that
$$\begin{aligned} i\hbar\,\psi'(t) &= H\psi(t) \quad\text{for all } t \in \mathbb{R},\\ \psi(0) &= \psi_0. \end{aligned} \tag{153}$$
The abstract Schrödinger equation generalizes the classical Schrödinger equation (145).
Next we want to study both the momentum operator $A$ and the position operator $B$. Recall that $X := L_2^{\mathbb{C}}(\mathbb{R})$.
Definition 14. The operator $A: D(A) \subseteq X \to X$ with
$$(A\phi)(x) := -i\hbar\frac{d}{dx}\phi(x) \quad\text{for all } x \in \mathbb{R}$$
is called the momentum operator. Here, $D(A) := \{\phi \in X: \phi' \in X\}$, where the derivative is to be understood in the generalized sense.
The operator $B: D(B) \subseteq X \to X$ with
$$(B\phi)(x) := x\phi(x) \quad\text{for all } x \in \mathbb{R}$$
is called the position operator. Here, $D(B) := \{\phi \in X: B\phi \in X\}$.
According to Standard Examples 8 and 10 in Section 5.2, the operators $A$ and $B$ are self-adjoint. The definition of the momentum operator $A$ and the position operator $B$ can be motivated by (146) and (154), (155), respectively. A more detailed motivation can be found in Zeidler (1986), Vol. 4, p. 112ff.
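Remark 15 (Heisenberg's uncertainty principle). We are given the state
$$\phi \in \mathcal{S} \quad\text{with } (\phi \mid \phi) = 1.$$
By Section 5.14.1, the mean position $\bar{X}$ and the corresponding dispersion $(\Delta X)^2$ of the particle in the state $\phi$ are given through
$$\bar{X} = (\phi \mid B\phi) = \int_{-\infty}^{\infty} x|\phi(x)|^2dx \tag{154}$$
and
$$(\Delta X)^2 = \|B\phi - \bar{X}\phi\|^2 = \int_{-\infty}^{\infty} (x - \bar{X})^2|\phi(x)|^2dx. \tag{155}$$
Furthermore, the mean momentum $\bar{P}$ and the corresponding dispersion $(\Delta P)^2$ in the state $\phi$ are equal to
$$\bar{P} = (\phi \mid A\phi) \quad\text{and}\quad (\Delta P)^2 = \|A\phi - \bar{P}\phi\|^2.$$
Obviously,
$$AB\phi - BA\phi = i\hbar\bigl(x\phi' - (x\phi)'\bigr) = -i\hbar\phi.$$
Thus, it follows from Proposition 3 that
$$\Delta X\,\Delta P \ge \frac{\hbar}{2}. \tag{156}$$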
Heisenberg formulated this famous uncertainty principle in 1927. Relation (156) tells us that it is impossible to measure exactly position and momentum (i.e., the velocity) of the particle at the same time.
More precisely, we get the following:
(i) If we localize sharply the particle (i.e., $\Delta X$ is small), then the velocity of the particle is highly uncertain (i.e., $\Delta P$ is large).
(ii) Conversely, if we determine sharply the velocity of the particle (i.e., $\Delta P$ is small), then the position of the particle is highly uncertain (i.e., $\Delta X$ is large).
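The uncertainty relation can be illustrated by numerical quadrature for two explicit real-valued functions. The sketch below assumes units with $\hbar = 1$ and uses the unnormalized functions $e^{-x^2/2}$ and $xe^{-x^2/2}$ (normalization is carried out inside the routine). For real-valued $\phi$ the mean momentum vanishes, so that $(\Delta P)^2 = \hbar^2\int|\phi'|^2dx$. The integration interval and the number of nodes are illustrative choices.

```python
import math

def integrate(f, a=-10.0, b=10.0, n=4000):
    """Composite trapezoidal rule; adequate for rapidly decaying integrands."""
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + k * h) for k in range(1, n)) + 0.5 * f(b))

def uncertainty_product(phi, dphi, hbar=1.0):
    """Delta X * Delta P for a real-valued phi, for which the mean momentum vanishes."""
    norm2 = integrate(lambda x: phi(x) ** 2)
    mean_x = integrate(lambda x: x * phi(x) ** 2) / norm2
    var_x = integrate(lambda x: (x - mean_x) ** 2 * phi(x) ** 2) / norm2
    var_p = hbar ** 2 * integrate(lambda x: dphi(x) ** 2) / norm2
    return math.sqrt(var_x * var_p)

ground = uncertainty_product(lambda x: math.exp(-0.5 * x * x),
                             lambda x: -x * math.exp(-0.5 * x * x))
excited = uncertainty_product(lambda x: x * math.exp(-0.5 * x * x),
                              lambda x: (1.0 - x * x) * math.exp(-0.5 * x * x))
print(ground, excited)
```

The Gaussian attains the lower bound $\hbar/2$, whereas the second function gives $3\hbar/2$, in agreement with the inequality.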
Remark 16 (Justification of (148)). Let $\{E_\lambda\}$ be the spectral family of the position operator $B$. We are given a state $\phi$ of the particle, i.e., $\phi \in X$ and $(\phi \mid \phi) = 1$. By Remark 4,
$$\int_a^b d(\phi \mid E_\lambda\phi) = \text{probability of measuring the position } x \text{ of the particle in the interval } [a, b],$$
and Example 7 tells us that
$$\int_a^b d(\phi \mid E_\lambda\phi) = \int_a^b |\phi(x)|^2dx.$$
This yields (148).
5.15 Generalized Eigenfunctions

In order to explain the basic idea in terms of physics, let us begin with the following simple relation:
$$-i\hbar\frac{d}{dx}e^{ipx/\hbar} = p\,e^{ipx/\hbar} \quad\text{for all } x \in \mathbb{R} \tag{157}$$
and each fixed $p \in \mathbb{R}$. That is, the function $\phi_p(x) := e^{ipx/\hbar}$ is an eigenfunction of the differential operator $-i\hbar\frac{d}{dx}$ that corresponds to the momentum operator $A$ from Section 5.14.5. The point is that
The function $\phi_p$ does not live in the Hilbert space $X = L_2^{\mathbb{C}}(\mathbb{R})$.
In fact,
$$\int_{-\infty}^{\infty} |\phi_p(x)|^2dx = \infty.$$
Therefore, $\phi_p$ does not correspond to the state of a single particle. However, physicists use the following interpretation. The function $\phi_p$ corresponds to a particle stream in $\mathbb{R}$ from left to right with the velocity
$$v = \frac{p}{m},$$
where $m > 0$ denotes the particle mass (Figure 5.10(a)). In addition, the density $\rho$ of the particle stream is given through
$$\rho(x) := |\phi_p(x)|^2 \quad\text{for all } x \in \mathbb{R}.$$
That is,
$$\int_a^b \rho(x)\,dx = \text{number of particles in the interval } [a, b].$$
FIGURE 5.10. (a) Particle stream of velocity $v$; (b) particle localized in a neighborhood of the point $y$.
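Definition 1. Set $X := L_2^{\mathbb{C}}(\mathbb{R})$. Let $A: D(A) \subseteq X \to X$ be a symmetric operator such that $\mathcal{S} \subseteq D(A)$. Then the tempered distribution $T \in \mathcal{S}'$ with $T \ne 0$ is called a generalized eigenfunction of the operator $A$ with the eigenvalue $\lambda \in \mathbb{R}$ iff
$$T(A\phi) = \lambda T(\phi) \quad\text{for all } \phi \in \mathcal{S}.$$
The system $\{T_\alpha\}_{\alpha \in \mathcal{A}}$ of generalized eigenfunctions of $A$ is called complete iff
$$T_\alpha(\phi) = 0 \quad\text{for all } \alpha \in \mathcal{A} \text{ and fixed } \phi \in \mathcal{S}$$
implies $\phi \equiv 0$.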
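Lemma 2. We set
$$T(\phi) := \int_{-\infty}^{\infty} \overline{\psi(x)}\,\phi(x)\,dx \quad\text{for all } \phi \in \mathcal{S}, \tag{158}$$
i.e., $T(\phi) = (\psi \mid \phi)$ for all $\phi \in \mathcal{S}$. Then, the following hold:
(i) For each $\psi \in X$, $T \in \mathcal{S}'$.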
(ii) The corresponding map $\psi \mapsto T$ is antilinear and injective from the space $X$ into $\mathcal{S}'$.
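Proof. Ad (i). Let $\phi_n \to \phi$ in $\mathcal{S}$ as $n \to \infty$. By the Schwarz inequality,
$$\begin{aligned}
|T(\phi_n) - T(\phi)|^2 &\le \int_{-\infty}^{\infty} |\psi(x)|^2dx \int_{-\infty}^{\infty} |\phi_n(x) - \phi(x)|^2\,\frac{(1 + x^2)^2}{(1 + x^2)^2}\,dx\\
&\le \text{const}\,\Bigl[\sup_{x \in \mathbb{R}}\,(1 + x^2)|\phi_n(x) - \phi(x)|\Bigr]^2 \to 0 \quad\text{as } n \to \infty.
\end{aligned}$$
Ad (ii). Let $\psi_j \in X$, $j = 1, 2$. Since $\mathcal{S}$ is dense in $X$, it follows from
$$(\psi_1 \mid \phi) = (\psi_2 \mid \phi) \quad\text{for all } \phi \in \mathcal{S}$$
that $\psi_1 = \psi_2$. □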
Proposition 3. Each eigenfunction $\psi \in D(A)$ of the operator $A$ from Definition 1 is also a generalized eigenfunction in the sense of (158).
Proof. Let $A\psi = \lambda\psi$ with $\psi \ne 0$ and $\lambda \in \mathbb{R}$. For each $\phi \in \mathcal{S}$, it follows from the symmetry of $A$ that
$$T(A\phi) = (\psi \mid A\phi) = (A\psi \mid \phi) = \lambda(\psi \mid \phi) = \lambda T(\phi). \qquad\Box$$
In the following let us consider an interpretation of equation (157) in terms of generalized eigenfunctions.
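Standard Example 4 (Momentum operator). Let $p \in \mathbb{R}$. We set
$$T_p(\varphi) := \int_{-\infty}^{\infty} \overline{\varphi_p(x)}\,\varphi(x)\,dx \quad \text{for all } \varphi \in \mathcal{S}, \tag{159}$$
where $\varphi_p(x) := e^{ipx/\hbar}$. By Standard Example 4 in Section 3.8, $T_p \in \mathcal{S}'$.
Let $A: D(A) \subseteq X \to X$ be the momentum operator from Definition 14
in Section 5.14. Then,
$$\{\varphi_p\}_{p \in \mathbb{R}}$$
forms a complete system of generalized eigenfunctions for $A$, in the sense
of (159).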
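Proof. Recall that $A\psi = -i\hbar\psi'$. For each $\varphi \in \mathcal{S}$, integration by parts
yields
$$T_p(A\varphi) = \int_{-\infty}^{\infty} i\hbar\,\overline{\varphi_p'(x)}\,\varphi(x)\,dx = \int_{-\infty}^{\infty} p\,\overline{\varphi_p(x)}\,\varphi(x)\,dx = p\,T_p(\varphi).$$
In order to prove the completeness of $\{T_p\}$, suppose that
$$T_p(\varphi) = 0 \quad \text{for all } p \in \mathbb{R} \text{ and fixed } \varphi \in \mathcal{S}.$$
This means
$$\int_{-\infty}^{\infty} e^{-ipx/\hbar}\,\varphi(x)\,dx = 0 \quad \text{for all } p \in \mathbb{R}.$$
Using the Fourier transformation $F: \mathcal{S} \to \mathcal{S}$ from Section 3.7, this implies
$F\varphi = 0$, and hence $\varphi = 0$. $\Box$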
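The eigenvalue relation $T_p(A\varphi) = p\,T_p(\varphi)$ can be checked numerically for one concrete test function. The following is a minimal sketch, not part of the original text; it assumes units with $\hbar = 1$, the Gaussian $\varphi(x) = e^{-x^2/2}$ as test function, and a trapezoidal rule on the cutoff interval $[-12, 12]$. The names `phi`, `A_phi`, and `T_p` are illustrative only.

```python
import cmath
import math

HBAR = 1.0  # assumption: units with hbar = 1


def phi(x):
    """Test function phi(x) = exp(-x^2/2), an element of the Schwartz space."""
    return math.exp(-0.5 * x * x)


def A_phi(x):
    """(A phi)(x) = -i*hbar*phi'(x); here phi'(x) = -x*phi(x)."""
    return -1j * HBAR * (-x) * phi(x)


def T_p(p, g, a=-12.0, b=12.0, n=24000):
    """Trapezoidal approximation of T_p(g) = int conj(phi_p(x)) g(x) dx,
    where phi_p(x) = exp(i*p*x/hbar). The cutoff [a, b] is an assumption;
    it is harmless here because the Gaussian is negligible outside it."""
    h = (b - a) / n
    total = 0j
    for k in range(n + 1):
        x = a + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * cmath.exp(-1j * p * x / HBAR) * g(x)
    return total * h


p = 1.3
lhs = T_p(p, A_phi)    # T_p(A phi)
rhs = p * T_p(p, phi)  # p * T_p(phi)
print(abs(lhs - rhs))  # expected to be at rounding level
```

For this Gaussian, $T_p(\varphi)$ is also known in closed form, $\sqrt{2\pi}\,e^{-p^2/2}$, which gives an independent check of the quadrature.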
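Standard Example 5 (The position operator). Let $B: D(B) \subseteq X \to X$
be the position operator from Definition 14 in Section 5.14. Then,
$$\{\delta_y\}_{y \in \mathbb{R}}$$
forms a complete system of generalized eigenfunctions for $B$. More precisely,
$$\delta_y(B\varphi) = y\,\delta_y(\varphi) \quad \text{for all } \varphi \in \mathcal{S}. \tag{160}$$
Proof. Recall that $(B\varphi)(x) = x\varphi(x)$ for all $x \in \mathbb{R}$. Hence
$$\delta_y(B\varphi) = (B\varphi)(y) = y\varphi(y) = y\,\delta_y(\varphi) \quad \text{for all } \varphi \in \mathcal{S}.$$
In order to prove the completeness of $\{\delta_y\}$, let $\delta_y(\varphi) = 0$ for all $y \in \mathbb{R}$.
Then, $\varphi(y) = 0$ for all $y \in \mathbb{R}$, and hence $\varphi = 0$.
Because of (160), physicists regard the "Dirac delta function $\delta_y$" as a
"state" of the particle in which the particle is localized at the point $y \in \mathbb{R}$.
Such a "state" is approximated by a state $\psi_\varepsilon \in X$, where $\psi_\varepsilon \in C_0^\infty(\mathbb{R})$ and
$$\psi_\varepsilon(x) = 0 \quad \text{outside } [y - \varepsilon, y + \varepsilon] \text{ for small } \varepsilon > 0, \tag{161}$$
along with $(\psi_\varepsilon \mid \psi_\varepsilon) = \int_{-\infty}^{\infty} |\psi_\varepsilon(x)|^2\,dx = 1$. By (161), the probability of
finding the particle outside the interval $[y - \varepsilon, y + \varepsilon]$ is equal to zero (Figure
5.10(b)). For the mean position $\bar{x}_\varepsilon$ of the particle in the state $\psi_\varepsilon$, we get
$$\bar{x}_\varepsilon = \int_{-\infty}^{\infty} x\,|\psi_\varepsilon(x)|^2\,dx \to y \quad \text{as } \varepsilon \to +0. \quad \Box$$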
5.16 Trace Class Operators

Definition 1. Let $X$ be a separable Hilbert space over $\mathbb{K}$.
(i) The linear operator $A: X \to X$ is said to be of the trace class iff the
series
$$\operatorname{tr} A := \sum_n (v_n \mid A v_n)$$
converges for each complete orthonormal system $\{v_n\}$ of $X$ and the
value of the series is independent of $\{v_n\}$. The number $\operatorname{tr} A$ is called
the trace of $A$.
(ii) The linear continuous operator $A: X \to X$ is called a Hilbert-Schmidt
operator iff $A^*A$ is of trace class.
Standard Example 2. Let $A: X \to X$ be a linear continuous symmetric
operator on the separable Hilbert space $X$ over $\mathbb{K}$. Suppose that $A$ possesses
a complete orthonormal system $\{u_n\}$ of eigenvectors with the corresponding
eigenvalues $\{\lambda_n\}$, i.e.,
$$A u_n = \lambda_n u_n \quad \text{for all } n.$$
(i) If $\lambda_n \ge 0$ for all $n$ and $\sum_n \lambda_n < \infty$, then the operator $A$ is of trace
class and
$$\operatorname{tr} A = \sum_n \lambda_n.$$
(ii) If $\sum_n \lambda_n^2 < \infty$, then $A$ is a Hilbert-Schmidt operator.

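Proof. Ad (i). Since the orthonormal system $\{u_n\}$ is complete, the series
$$u = \sum_n (u_n \mid u)\,u_n$$
converges for each $u \in X$, and hence
$$Au = \sum_n \lambda_n (u_n \mid u)\,u_n \quad \text{for all } u \in X.$$
Let $\alpha > 0$. Since the sequence $(\lambda_n)$ is bounded,$^{11}$ it follows that
$$\sum_n \lambda_n^{2\alpha}\,|(u_n \mid u)|^2 < \infty \quad \text{for all } u \in X.$$
$^{11}$ By Proposition 2 in Section 1.25, $|\lambda_n| \le \|A\|$ for all $n$.

By the functional calculus from Section 5.8, we get $D(A^\alpha) = X$ and
$$A^\alpha u = \sum_n \lambda_n^\alpha (u_n \mid u)\,u_n \quad \text{for all } u \in X.$$
In particular, the operator $A^{1/2}: X \to X$ is self-adjoint and $A^{1/2}A^{1/2} = A$. Let
$\{v_n\}$ be an arbitrary complete orthonormal system in $X$. Then,
$$\sum_n (v_n \mid A v_n) = \sum_n \|A^{1/2} v_n\|^2 = \sum_n \sum_m |(u_m \mid A^{1/2} v_n)|^2 = \sum_n \sum_m \lambda_m\,|(u_m \mid v_n)|^2.$$
Since this is a convergent double series with nonnegative terms, the summation
can be interchanged. Hence
$$\sum_n \sum_m \lambda_m\,|(u_m \mid v_n)|^2 = \sum_m \lambda_m \sum_n |(u_m \mid v_n)|^2 = \sum_m \lambda_m.$$
Ad (ii). Obviously, $A^2 u_n = A(A u_n) = \lambda_n^2 u_n$ for all $n$. Since $\{u_n\}$ is
complete, all the eigenvalues of $A^2$ are given through $\{\lambda_n^2\}$. Because of
$A^*A = A^2$, the assertion (ii) is a special case of (i). $\Box$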
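In finite dimensions, the independence of the trace from the orthonormal basis can be observed directly. The following sketch is an illustration under assumptions that are not in the text: $X = \mathbb{R}^3$, a hand-picked symmetric matrix with eigenvalues $3, 1, 1$, and a second orthonormal basis produced by Gram-Schmidt from arbitrarily chosen vectors. All function names are hypothetical helpers.

```python
import math


def inner(u, v):
    """Inner product (u | v) on R^3 (real case, so no conjugation is needed)."""
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    return [inner(row, v) for row in A]


def gram_schmidt(vectors):
    """Turn linearly independent vectors into an orthonormal system."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = inner(b, w)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(inner(w, w))
        basis.append([wi / norm for wi in w])
    return basis


def trace_in_basis(A, basis):
    """Sum of (v_n | A v_n) over the given orthonormal basis."""
    return sum(inner(v, matvec(A, v)) for v in basis)


# Symmetric matrix with eigenvalues 3, 1, 1 (illustrative choice).
A = [[2.0, 1.0, 0.0],
     [1.0, 2.0, 0.0],
     [0.0, 0.0, 1.0]]
# A*A = A^2, computed by hand; its eigenvalues are 9, 1, 1.
A2 = [[5.0, 4.0, 0.0],
      [4.0, 5.0, 0.0],
      [0.0, 0.0, 1.0]]

standard = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rotated = gram_schmidt([[1.0, 2.0, 3.0], [0.0, 1.0, 4.0], [5.0, 6.0, 0.0]])

tr_standard = trace_in_basis(A, standard)  # sum of the eigenvalues of A
tr_rotated = trace_in_basis(A, rotated)    # same value in another basis
tr_A2 = trace_in_basis(A2, rotated)        # sum of the squared eigenvalues
print(tr_standard, tr_rotated, tr_A2)
```

Both bases give the sum of the eigenvalues, as in part (i), and the trace of $A^*A$ gives the sum of their squares, as in part (ii).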
5.17 Applications to Quantum Statistics

The true logic in this world lies in probability theory.
James Clerk Maxwell (1831-1879)
Don't trust any statistics that you didn't falsify yourself.
Folklore
5.17.1 The Abstract Setting of Quantum Statistics

Let $\Gamma$ be a quantum system (e.g., a gas). Assume that the "physical states"
of $\Gamma$ correspond to states of the separable Hilbert space $X$. We want to
describe the physical behavior of $\Gamma$ in terms of statistics.
(i) Statistical states. By a statistical state $\Psi$ of the system $\Gamma$ we understand
the tuple
$$\Psi = (\psi_1, p_1;\ \psi_2, p_2;\ \ldots), \tag{162}$$
where $\{\psi_m\}$ forms a complete orthonormal system in $X$, and $p_1, p_2, \ldots$ are
real numbers with $0 \le p_m \le 1$ for all $m$ and
$$\sum_m p_m = 1.$$
Intuitively, we say that $p_m$ is the probability of finding the system $\Gamma$ in the
state $\psi_m$. In terms of statistics, roughly speaking, this means the following.
Let us consider $C$ copies of the system $\Gamma$, where the number $C$ is very large.
Then, $p_m C$ copies of the system $\Gamma$ are in the state $\psi_m$.
The statistical state $\Psi$ is called a pure state iff $p_{m_0} = 1$ for some fixed
$m_0$ and $p_m = 0$ for all $m \neq m_0$. Otherwise, $\Psi$ is called a mixed state.
(ii) Measurements. Let $A$ be an observable of the system $\Gamma$, i.e., the
operator $A: D(A) \subseteq X \to X$ is self-adjoint. Suppose that we measure
the "physical quantity" corresponding to $A$ in the statistical state $\Psi$ from
(162). Naturally enough, we assume that the outcome of this measurement
is statistical. More precisely, we assume that the mean value $\bar{A}$ and the
dispersion $(\Delta A)^2$ of this measurement are given through$^{12}$
$$\bar{A} = \sum_m p_m (\psi_m \mid A\psi_m)$$
and
$$(\Delta A)^2 = \sum_m p_m \|A\psi_m - \bar{A}\psi_m\|^2.$$
(iii) Entropy. By definition, the entropy $S$ of the statistical state $\Psi$ from
(162) is given through
$$S = -k \sum_m p_m \ln p_m.$$
Here, $k$ is the Boltzmann constant, where $k = 1.381 \cdot 10^{-23}$ Joule/Kelvin.
(iv) Dynamics. Let $H$ be the Hamiltonian of $\Gamma$, i.e., $H: D(H) \subseteq X \to X$
is a self-adjoint operator which corresponds to the energy of the system $\Gamma$.
Suppose that the system $\Gamma$ is in the given statistical state
$$\Psi_0 = (\psi_{10}, p_1;\ \psi_{20}, p_2;\ \ldots)$$
at time $t = 0$. Then, the system $\Gamma$ is in the statistical state
$$\Psi(t) = (\psi_1(t), p_1;\ \psi_2(t), p_2;\ \ldots)$$
at time $t \in \mathbb{R}$, where $p_m = \text{const}$ and
$$\psi_m(t) = e^{-itH/\hbar}\psi_{m0} \quad \text{for all times } t \in \mathbb{R} \text{ and all } m.$$
That is, the time evolution of the state $\psi_{m0}$ is identical to the time evolution
of quantum states in Section 5.14.1.
Let $t \in \mathbb{R}$. We have to show that $\Psi(t)$ represents a statistical state. In
fact, since the operator $e^{-itH/\hbar}$ is unitary and the given orthonormal system
$\{\psi_{m0}\}$ is complete in $X$, the set $\{\psi_m(t)\}$ also forms a complete orthonormal
system in $X$.
$^{12}$ We assume tacitly that $\psi_m \in D(A)$ for all $m$. Furthermore, note that $(\Delta A)^2$
is the mean value of $(A - \bar{A}I)^2$ provided $\psi_m \in D(A^2)$ for all $m$. In fact,
$$\|A\psi_m - \bar{A}\psi_m\|^2 = (\psi_m \mid (A - \bar{A}I)^2\psi_m).$$
5.17.2 Discussion of the Abstract Setting

Definition 1. The operator $\rho: X \to X$ is called a statistical operator iff
the following are met:
(a) $\rho$ is linear, continuous, and self-adjoint.
(b) $\rho$ possesses a complete orthonormal system $\{\psi_m\}$ of eigenvectors with
the corresponding eigenvalues $\{p_m\}$, i.e.,
$$\rho\psi_m = p_m\psi_m \quad \text{for all } m.$$
(c) $\sum_m p_m = 1$ and $0 \le p_m \le 1$ for all $m$.
Proposition 2. There exists a one-to-one correspondence between the statistical
states $\Psi$ of the system $\Gamma$ from (162) and the statistical operators $\rho$,
which is given through
$$\rho\psi = \sum_m p_m (\psi_m \mid \psi)\,\psi_m \quad \text{for all } \psi \in X. \tag{163}$$
Proof. Let $\Psi$ be a statistical state. Since
$$\sum_m p_m^2\,|(\psi_m \mid \psi)|^2 \le \sum_m |(\psi_m \mid \psi)|^2 < \infty \quad \text{for all } \psi \in X,$$
the operator $\rho$ defined through (163) is a statistical operator, by Proposition
2 in Section 5.8.
Conversely, each statistical operator is of the form (163) and it determines
uniquely a statistical state of the form (162). $\Box$
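If $\rho$ is a statistical operator, then $\rho$ is of trace class and
$$\operatorname{tr} \rho = \sum_m p_m = 1.$$
In addition, by Section 5.8,
$$(\rho \ln \rho)\psi = \sum_m (p_m \ln p_m)(\psi_m \mid \psi)\,\psi_m \quad \text{for all } \psi \in X.$$
Hence the entropy of the statistical state $\Psi$ corresponding to $\rho$ is given
through
$$S = -k \operatorname{tr}(\rho \ln \rho).$$
Let the operator $A$ be given as in (ii) above. If $\rho A$ is of trace class, then
the mean value $\bar{A}$ is equal to
$$\bar{A} = \operatorname{tr}(\rho A).$$
In fact,
$$\operatorname{tr}(\rho A) = \sum_m (\psi_m \mid \rho A\psi_m) = \sum_m (\rho\psi_m \mid A\psi_m) = \sum_m p_m (\psi_m \mid A\psi_m).$$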
Proposition 3. If the statistical state $\Psi_0$ corresponds to the statistical
operator $\rho_0$, then the time evolution $t \mapsto \Psi(t)$ corresponds to $t \mapsto \rho(t)$ with
$$\rho(t) = U(t)\rho_0 U(t)^{-1} \quad \text{for all } t \in \mathbb{R}, \tag{164}$$
where $U(t) := e^{-itH/\hbar}$.
Equation (164) represents the basic equation of quantum statistics. A
formal differentiation of (164) yields
$$i\hbar\rho'(t) = H\rho(t) - \rho(t)H \quad \text{for all } t \in \mathbb{R}.$$
Proof. It follows from (163) that for each $\psi \in X$
$$\rho(t)\psi = \sum_m p_m (\psi_m(t) \mid \psi)\,\psi_m(t).$$
Since $\psi_m(t) = U(t)\psi_{m0}$ and hence $(\psi_m(t) \mid \psi) = (\psi_{m0} \mid U(t)^*\psi)$, we get
$$\rho(t)\psi = U(t)\rho_0 U(t)^*\psi \quad \text{for all } \psi \in X.$$
Noting that $U(t)$ is unitary and hence $U(t)^* = U(t)^{-1}$, we obtain (164). $\Box$
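The evolution law (164) and the differentiated equation can be illustrated on a two-level system. The sketch below is not from the text; it assumes $\hbar = 1$, takes the Pauli matrix $\sigma_x$ as an illustrative Hamiltonian (so that $e^{-itH} = \cos t\, I - i \sin t\, \sigma_x$), and starts from $\rho_0 = \mathrm{diag}(3/4, 1/4)$. It checks that trace and determinant of $\rho(t)$ (hence its eigenvalues and entropy) stay constant, and that $i\rho'(t) = H\rho(t) - \rho(t)H$ holds up to a finite-difference error.

```python
import math


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(A):
    """Adjoint (conjugate transpose) of a 2x2 matrix."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]


H = [[0.0, 1.0], [1.0, 0.0]]       # illustrative Hamiltonian (Pauli matrix sigma_x)
RHO0 = [[0.75, 0.0], [0.0, 0.25]]  # statistical operator with p_1 = 3/4, p_2 = 1/4


def U(t):
    """U(t) = exp(-i t H) with hbar = 1; for H = sigma_x this equals
    cos(t) I - i sin(t) sigma_x because sigma_x^2 = I."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -1j * s], [-1j * s, c]]


def rho(t):
    """rho(t) = U(t) rho_0 U(t)^{-1}, using U(t)^{-1} = U(t)^*."""
    return matmul(matmul(U(t), RHO0), dagger(U(t)))


def trace(A):
    return A[0][0] + A[1][1]


def det(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


t, h = 0.7, 1e-5
r = rho(t)
# Central difference for rho'(t), compared with the commutator H rho - rho H.
drho = [[(rho(t + h)[i][j] - rho(t - h)[i][j]) / (2 * h) for j in range(2)]
        for i in range(2)]
Hr, rH = matmul(H, r), matmul(r, H)
residual = max(abs(1j * drho[i][j] - (Hr[i][j] - rH[i][j]))
               for i in range(2) for j in range(2))
print(trace(r), det(r), residual)
```

Since trace and determinant fix both eigenvalues of a $2 \times 2$ matrix, the probabilities $p_m$ are indeed constant in time, in line with the statement $p_m = \text{const}$ in Section 5.17.1.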
5.17.3 The Standard Model in Statistical Physics

The huge field of statistical physics can be understood best by studying
the following standard situation.
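Let $X$ be an $M$-dimensional complex Hilbert space with the orthonormal
basis
$$\psi_1, \ldots, \psi_M.$$
We define the linear operators $H, N, \rho: X \to X$ through
$$H\psi_m = E_m\psi_m, \quad N\psi_m = N_m\psi_m, \quad \rho\psi_m = p_m\psi_m$$
for $m = 1, \ldots, M$, where
$$p_m := \frac{e^{(\mu N_m - E_m)/kT}}{\sum_{n=1}^M e^{(\mu N_n - E_n)/kT}}. \tag{165}$$
Here, the positive numbers $E_m$ and the nonnegative integers $N_m$ are given.
A motivation for the choice of $p_m$ will be given in Remark 8.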
Remark 4 (Physical interpretation). Let $\Gamma$ be a physical system (e.g., a
gas). Then, $\psi_m$ corresponds to a state of $\Gamma$, where
$E_m$ = energy of $\Gamma$ in the state $\psi_m$, and
$N_m$ = particle number of $\Gamma$ in the state $\psi_m$.
In addition,
$p_m$ = probability of finding $\Gamma$ in the state $\psi_m$.
Furthermore, the positive real parameter $T$ and the real parameter $\mu$ possess
the following physical meanings:
$T$ = absolute temperature of $\Gamma$;
$\mu$ = chemical potential of $\Gamma$.
By Section 5.17.1,
$$\bar{E} = \text{mean value of energy of } \Gamma = \sum_{m=1}^M p_m(\mu, T) E_m,$$
$$\bar{N} = \text{mean value of the particle number of } \Gamma = \sum_{m=1}^M p_m(\mu, T) N_m.$$
This relates $T$ and $\mu$ to $\bar{E}$ and $\bar{N}$. In this connection, note that
$$\bar{E} = \operatorname{tr}(\rho H) = \sum_{m=1}^M (\psi_m \mid \rho H\psi_m) = \sum_{m=1}^M p_m E_m$$
and
$$\bar{N} = \operatorname{tr}(\rho N) = \sum_{m=1}^M p_m N_m.$$
Since $\|H\psi_m - \bar{E}\psi_m\| = |E_m - \bar{E}|$ and $\|N\psi_m - \bar{N}\psi_m\| = |N_m - \bar{N}|$, we
obtain that
$$(\Delta E)^2 = \text{dispersion of the energy of } \Gamma = \sum_{m=1}^M p_m(\mu, T)\,|E_m - \bar{E}|^2;$$
$$(\Delta N)^2 = \text{dispersion of the particle number of } \Gamma = \sum_{m=1}^M p_m(\mu, T)\,|N_m - \bar{N}|^2.$$
Finally,
$$S = \text{entropy of } \Gamma = -k \sum_{m=1}^M p_m(\mu, T) \ln p_m(\mu, T).$$
Definition 5. The function
$$Z(\mu, T) := \sum_{m=1}^M e^{(\mu N_m - E_m)/kT}$$
is called the partition function of $\Gamma$, and
$$\Omega(\mu, T) := -kT \ln Z(\mu, T)$$
is called the statistical potential of $\Gamma$.
Obviously,
$$Z(\mu, T) = \operatorname{tr} e^{(\mu N - H)/kT}.$$
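Proposition 6. All important thermodynamical quantities of the system $\Gamma$
can be computed from the function $\Omega$. We have
$$S = -\frac{\partial \Omega}{\partial T}, \quad \bar{N} = -\frac{\partial \Omega}{\partial \mu}, \quad \bar{E} = \mu\bar{N} - T^2 \frac{\partial}{\partial T}\Big(\frac{\Omega}{T}\Big).$$
Proof. Use simple computations. For example,
$$\Omega_T = -k \ln Z - \frac{kT Z_T}{Z} = -k \ln Z + \frac{1}{TZ} \sum_{m=1}^M (\mu N_m - E_m)\,e^{(\mu N_m - E_m)/kT} = k \sum_{m=1}^M p_m \ln p_m = -S. \quad \Box$$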
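The relations of Proposition 6 can be checked numerically for a small system. The following sketch is illustrative only: it assumes units with $k = 1$, four hand-picked levels $(E_m, N_m)$, and central differences in place of the partial derivatives of $\Omega$. The names `Z`, `Omega`, and `probabilities` are hypothetical helpers that mirror Definition 5 and formula (165).

```python
import math

K_B = 1.0                 # assumption: units with Boltzmann constant k = 1
E = [0.5, 1.0, 1.7, 2.4]  # illustrative energies E_m > 0
N = [0, 1, 1, 2]          # illustrative particle numbers N_m


def Z(mu, T):
    """Partition function Z(mu, T) = sum_m exp((mu N_m - E_m) / kT)."""
    return sum(math.exp((mu * n - e) / (K_B * T)) for e, n in zip(E, N))


def Omega(mu, T):
    """Statistical potential Omega = -kT ln Z."""
    return -K_B * T * math.log(Z(mu, T))


def probabilities(mu, T):
    """Probabilities p_m from formula (165)."""
    z = Z(mu, T)
    return [math.exp((mu * n - e) / (K_B * T)) / z for e, n in zip(E, N)]


mu, T, h = 0.3, 0.8, 1e-5
p = probabilities(mu, T)
N_mean = sum(pm * n for pm, n in zip(p, N))
E_mean = sum(pm * e for pm, e in zip(p, E))
S = -K_B * sum(pm * math.log(pm) for pm in p)

# Central differences for the partial derivatives of Omega and Omega/T.
dOmega_dmu = (Omega(mu + h, T) - Omega(mu - h, T)) / (2 * h)
dOmega_dT = (Omega(mu, T + h) - Omega(mu, T - h)) / (2 * h)
dOmegaOverT_dT = (Omega(mu, T + h) / (T + h) - Omega(mu, T - h) / (T - h)) / (2 * h)

print(N_mean, -dOmega_dmu)                             # N = -dOmega/dmu
print(S, -dOmega_dT)                                   # S = -dOmega/dT
print(E_mean, mu * N_mean - T * T * dOmegaOverT_dT)    # E = mu N - T^2 d(Omega/T)/dT
```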
Definition 7. Suppose that for each $m$ the energy $E_m$ and the particle
number $N_m$ depend on the volume $V$ of the system $\Gamma$. Then, the pressure
$P$ of $\Gamma$ is defined through$^{13}$
$$P := -\frac{\partial \Omega}{\partial V}.$$
$^{13}$ A motivation of this definition in terms of phenomenological thermodynamics
can be found in Zeidler (1986), Vol. 4, pp. 387 and 400.
Remark 8 (Motivation of the fundamental formula (165) for the probability
$p_m$). Let us use the principle of maximal entropy for fixed mean energy
and fixed mean particle number. That is, let us consider the following maximum
problem:
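$$\text{entropy } S(p) := -k \sum_{m=1}^M p_m \ln p_m = \max!, \tag{166}$$
along with the side conditions
$$\bar{E}(p) := \sum_{m=1}^M p_m E_m = \text{const},$$
$$\bar{N}(p) := \sum_{m=1}^M p_m N_m = \text{const}, \tag{167a}$$
$$W(p) := \sum_{m=1}^M p_m = 1,$$
and
$$0 \le p_m \le 1, \quad m = 1, \ldots, M. \tag{167b}$$
We are given the real numbers $E_m$, $N_m$, $m = 1, \ldots, M$, such that
$$\operatorname{rank} \begin{pmatrix} E_1 & \ldots & E_M \\ N_1 & \ldots & N_M \\ 1 & \ldots & 1 \end{pmatrix} = 3. \tag{168}$$
We are also given the mean energy $\bar{E}$ and the mean particle number $\bar{N}$ of
the system $\Gamma$.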
Suppose that $p = (p_1, \ldots, p_M)$ is a solution of (166), (167) with $0 < p_m < 1$
for all $m$. By the Lagrange multiplier rule,$^{14}$ there exist real parameters
$\alpha$, $\beta$, and $\gamma$ such that
$$\frac{\partial L(p)}{\partial p_m} = 0, \quad m = 1, \ldots, M, \tag{169}$$
where $L := S(p) + \alpha\bar{E}(p) + \beta\bar{N}(p) + \gamma W(p)$. It follows from (169) that
$$-k \ln p_m - k + \alpha E_m + \beta N_m + \gamma = 0.$$
Hence
$$p_m = \text{const} \cdot e^{(\alpha E_m + \beta N_m)/k}.$$
Using $\sum_m p_m = 1$, we get (165) with
$$\alpha = -\frac{1}{T} \quad \text{and} \quad \mu = \beta T.$$
Therefore, we obtain the surprising fact that from a purely mathematical
point of view temperature $T$ and chemical potential $\mu$ are nothing more
than Lagrange multipliers.
$^{14}$ A rigorous justification of the general Lagrange multiplier rule can be found
in Section 4.14 of AMS Vol. 109. Observe that condition (168) implies that $p$ is
a regular point of the side conditions (167a), i.e.,
$$\operatorname{rank}(\bar{E}'(p), \bar{N}'(p), W'(p)) = \text{maximal} = 3.$$
Cf. also Zeidler (1986), Vol. 3, p. 293.
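The maximum property behind Remark 8 can be observed on a small example. The sketch below is an illustration under assumptions not in the text: $k = 1$, $M = 4$, hand-picked values $E_m$, $N_m$ satisfying the rank condition (168), and a direction `d` chosen by hand to be orthogonal to $(E_m)$, $(N_m)$, and $(1, \ldots, 1)$. Moving the distribution (165) along `d` keeps all three side conditions (167a) and should lower the entropy.

```python
import math

K_B = 1.0                   # assumption: units with k = 1
E = [1.0, 2.0, 3.0, 5.0]    # illustrative energies
N = [0.0, 1.0, 0.0, 1.0]    # illustrative particle numbers
mu, T = 0.4, 1.5

# Probabilities from formula (165).
weights = [math.exp((mu * n - e) / (K_B * T)) for e, n in zip(E, N)]
p = [w / sum(weights) for w in weights]


def entropy(q):
    """S(q) = -k sum_m q_m ln q_m for a strictly positive distribution q."""
    return -K_B * sum(qm * math.log(qm) for qm in q)


def mean(q, values):
    return sum(qm * v for qm, v in zip(q, values))


# d is orthogonal to (E_m), (N_m) and (1, 1, 1, 1), so p + s*d satisfies the
# side conditions (167a) for every real s.
d = [3.0, -2.0, -3.0, 2.0]


def perturbed(s):
    return [pm + s * dm for pm, dm in zip(p, d)]


q_plus, q_minus = perturbed(0.01), perturbed(-0.01)
print(entropy(p), entropy(q_plus), entropy(q_minus))
```

The step size $0.01$ is small enough that all components stay positive for these particular numbers, so (167b) is respected as well.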
5.17.4 Bose-Einstein Statistics and Fermi-Dirac Statistics

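Suppose that the system $\Gamma$ (e.g., a gas) consists of particles that may assume
one of the energy values $\varepsilon_1, \ldots, \varepsilon_J$. By definition, a state $\psi$ of $\Gamma$ is
characterized through
$$\psi = (n_1, \ldots, n_J). \tag{170}$$
This means that $n_j$ particles of $\Gamma$ have the energy $\varepsilon_j$, $j = 1, \ldots, J$. For each
such state $\psi$, the particle number $N$ and the energy $E$ are given through
$$N = \sum_{j=1}^J n_j \quad \text{and} \quad E = \sum_{j=1}^J n_j\varepsilon_j.$$
Thus, the partition function $Z$ and the statistical potential $\Omega$ of $\Gamma$ are equal
to
$$Z(\mu, T) = \sum_{\psi} e^{(\mu N - E)/kT} = \prod_{j=1}^J \sum_{n_j} e^{n_j(\mu - \varepsilon_j)/kT}$$
and
$$\Omega(\mu, T) = -kT \ln Z(\mu, T) = \sum_{j=1}^J -kT \ln \sum_{n_j} \left(e^{(\mu - \varepsilon_j)/kT}\right)^{n_j}. \tag{171}$$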
Standard Example 9 (Bose-Einstein statistics). Suppose that
Each occupation number $n_j$ may assume the values $0, 1, \ldots, n$.
Using the geometric series, it follows from (171) that
$$\Omega(\mu, T) = \sum_{j=1}^J -kT \ln \sum_{n_j=0}^n \left[e^{(\mu - \varepsilon_j)/kT}\right]^{n_j} = \sum_{j=1}^J -kT \ln \frac{1 - e^{(n+1)(\mu - \varepsilon_j)/kT}}{1 - e^{(\mu - \varepsilon_j)/kT}}.$$
To simplify computations, suppose that the maximal occupation number
$n$ is very large and $\mu - \varepsilon_j < 0$ for all $j$. Letting $n \to \infty$, we get
$$\Omega(\mu, T) = \sum_{j=1}^J kT \ln\left(1 - e^{(\mu - \varepsilon_j)/kT}\right).$$
By Proposition 6, the mean particle number $\bar{N}$ and the mean energy $\bar{E}$ of
the system $\Gamma$ are given through $\bar{N} = -\Omega_\mu$ and $\bar{E} = \mu\bar{N} - T^2(\Omega/T)_T$. Hence
$$\bar{N} = \sum_{j=1}^J N_j \quad \text{and} \quad \bar{E} = \sum_{j=1}^J N_j\varepsilon_j, \tag{172}$$
where
$$N_j := \frac{e^{(\mu - \varepsilon_j)/kT}}{1 - e^{(\mu - \varepsilon_j)/kT}}. \tag{173}$$
By (172), $N_j$ is the mean occupation number of the energy level $\varepsilon_j$.
In the special case where $e^{(\mu - \varepsilon_j)/kT}$ is very small (e.g., the energies $\varepsilon_1, \ldots, \varepsilon_J$
are very large for fixed $\mu$ and $T$), we approximately obtain
$$N_j = e^{(\mu - \varepsilon_j)/kT}. \tag{173*}$$
This corresponds to the classic Maxwell-Boltzmann statistics.
Standard Example 10 (Fermi-Dirac statistics). In contrast to Standard
Example 9, we now assume that
Each occupation number $n_j$ may only assume the values $0, 1$.
This corresponds to the Pauli principle. By (171), we get
$$\Omega(\mu, T) = \sum_{j=1}^J -kT \ln\left(1 + e^{(\mu - \varepsilon_j)/kT}\right).$$
Hence the mean particle number $\bar{N}$ and the mean energy $\bar{E}$ of the system
$\Gamma$ are given through $\bar{N} = -\Omega_\mu$ and $\bar{E} = \mu\bar{N} - T^2(\Omega/T)_T$, i.e.,
$$\bar{N} = \sum_{j=1}^J N_j \quad \text{and} \quad \bar{E} = \sum_{j=1}^J N_j\varepsilon_j,$$
where
$$N_j = \frac{e^{(\mu - \varepsilon_j)/kT}}{1 + e^{(\mu - \varepsilon_j)/kT}}.$$
If $e^{(\mu - \varepsilon_j)/kT}$ is very small, then again the mean occupation number $N_j$ corresponds
approximately to the Maxwell-Boltzmann statistics from (173*).
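The occupation numbers of the two statistics can be compared numerically. The sketch below is illustrative and not part of the text; it assumes $k = 1$, three hand-picked energy levels with $\mu < \varepsilon_j$, and a central difference for $\bar{N} = -\Omega_\mu$. The function names are hypothetical.

```python
import math

K_B = 1.0  # assumption: units with k = 1


def omega_bose(mu, T, eps):
    """Omega = sum_j kT ln(1 - exp((mu - eps_j)/kT)), valid for mu < eps_j."""
    return sum(K_B * T * math.log(1.0 - math.exp((mu - e) / (K_B * T))) for e in eps)


def omega_fermi(mu, T, eps):
    """Omega = sum_j -kT ln(1 + exp((mu - eps_j)/kT))."""
    return sum(-K_B * T * math.log(1.0 + math.exp((mu - e) / (K_B * T))) for e in eps)


def n_bose(mu, T, e):
    """Mean occupation number (173) of Bose-Einstein statistics."""
    x = math.exp((mu - e) / (K_B * T))
    return x / (1.0 - x)


def n_fermi(mu, T, e):
    """Mean occupation number of Fermi-Dirac statistics."""
    x = math.exp((mu - e) / (K_B * T))
    return x / (1.0 + x)


def n_boltzmann(mu, T, e):
    """Maxwell-Boltzmann approximation (173*)."""
    return math.exp((mu - e) / (K_B * T))


eps = [1.0, 1.5, 2.5]  # illustrative energy levels
mu, T, h = 0.2, 0.7, 1e-6

N_bose = sum(n_bose(mu, T, e) for e in eps)
N_fermi = sum(n_fermi(mu, T, e) for e in eps)
# Mean particle number from N = -dOmega/dmu (central difference).
N_bose_fd = -(omega_bose(mu + h, T, eps) - omega_bose(mu - h, T, eps)) / (2 * h)
N_fermi_fd = -(omega_fermi(mu + h, T, eps) - omega_fermi(mu - h, T, eps)) / (2 * h)

# For a high energy level, the three occupation numbers nearly coincide.
e_high = 10.0
ratio_bose = n_bose(mu, T, e_high) / n_boltzmann(mu, T, e_high)
ratio_fermi = n_fermi(mu, T, e_high) / n_boltzmann(mu, T, e_high)
print(N_bose, N_bose_fd, N_fermi, N_fermi_fd, ratio_bose, ratio_fermi)
```

For every level the Fermi-Dirac value lies below the Maxwell-Boltzmann value, which in turn lies below the Bose-Einstein value; the gap closes as $e^{(\mu - \varepsilon_j)/kT}$ becomes small.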
In physics, the Bose-Einstein statistics can be applied to particles with
integer spin (e.g., photons), whereas the Fermi-Dirac statistics can be applied
to particles with half-integer spin (e.g., electrons, protons, neutrons,
etc.).
Interesting applications to cosmology (the Planck radiation law, the Big
Bang and the expansion of our universe, white dwarfs, etc.) can be found
5.18 C*-Algebras and the Algebraic Approach to Quantum Statistics

Banach algebras have been defined in Section 1.23. Recall that we always
assume that such a Banach algebra contains a unit element $E$, which we
also denote by $I$. In the following, let us introduce special Banach algebras,
where we have the following implications:
von Neumann algebra $\Rightarrow$ $C^*$-algebra $\Rightarrow$ Banach algebra.
Definition 1. By a $C^*$-algebra $\mathfrak{A}$, we understand a Banach algebra over
$\mathbb{C}$ such that there exists a map $A \mapsto A^*$ from $\mathfrak{A}$ to $\mathfrak{A}$ having the following
properties for all $A, B \in \mathfrak{A}$, and all $\alpha, \beta \in \mathbb{C}$:
(i) $(A^*)^* = A$;
(ii) $(\alpha A + \beta B)^* = \bar{\alpha}A^* + \bar{\beta}B^*$;
(iii) $(AB)^* = B^*A^*$;
(iv) $\|A^*A\| = \|A\|^2$.
In addition, $\mathfrak{A}$ is called commutative iff $AB = BA$ for all $A, B \in \mathfrak{A}$. An
element $A$ of $\mathfrak{A}$ is called self-adjoint (resp., unitary or normal) iff $A = A^*$
(resp., $AA^* = A^*A = I$ or $AA^* = A^*A$).
Standard Example 2. Let $X$ be a complex Hilbert space. Then, the
Banach space $L(X, X)$ of all linear continuous operators $A: X \to X$ is a
$C^*$-algebra if $A^*$ denotes the adjoint operator.
Proof. Properties (i)-(iii) follow easily from the definition of the adjoint
operator. Let us prove (iv). Assume that $X \neq \{0\}$. Since
$$(A^*A)^* = A^*A^{**} = A^*A,$$
the operator $A^*A$ is self-adjoint. By Proposition 3 in Section 4.1,
$$\|A^*A\| = \sup_{\|u\|=1} |(A^*Au \mid u)| = \sup_{\|u\|=1} |(Au \mid Au)| = \sup_{\|u\|=1} \|Au\|^2 = \|A\|^2. \quad \Box$$
Standard Example 3. Consider the Banach space $C[a, b]_{\mathbb{C}}$ of all continuous
functions $f: [a, b] \to \mathbb{C}$ equipped with the norm $\|f\| := \max_{a \le x \le b} |f(x)|$.
Letting
$$f^*(x) := \overline{f(x)} \quad \text{for all } x \in [a, b],$$
$C[a, b]_{\mathbb{C}}$ becomes a commutative $C^*$-algebra.
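Definition 4. Let $\mathfrak{A}$ and $\mathfrak{B}$ be two $C^*$-algebras. By a $*$-homomorphism,
we understand a linear map $\varphi: \mathfrak{A} \to \mathfrak{B}$ such that, for all $A, B \in \mathfrak{A}$,
$$\varphi(AB) = \varphi(A)\varphi(B) \quad \text{and} \quad \varphi(A^*) = \varphi(A)^*.$$
If, in addition, $\varphi$ is bijective, then $\varphi$ is called a $*$-isomorphism between $\mathfrak{A}$
and $\mathfrak{B}$.
Each $*$-isomorphism $\varphi: \mathfrak{A} \to \mathfrak{A}$ is called a $*$-automorphism of $\mathfrak{A}$.
The following notion is crucial for quantum statistics.
Definition 5. By a state $\omega$ of a $C^*$-algebra $\mathfrak{A}$, we understand a linear
continuous functional $\omega: \mathfrak{A} \to \mathbb{C}$ such that $\omega(I) = 1$ and
$$\omega(A^*A) \ge 0 \quad \text{for all } A \in \mathfrak{A}.$$
A state $\omega$ is called mixed iff there exist two different states $\omega_1$ and $\omega_2$ of
$\mathfrak{A}$ such that
$$\omega = \lambda\omega_1 + (1 - \lambda)\omega_2 \quad \text{for some } \lambda \in\ ]0, 1[.$$
Otherwise, the state $\omega$ is called pure.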
Standard Example 6. Let $\mathfrak{A} := L(X, X)$, where $X$ is a complex Hilbert
space, and $X \neq \{0\}$. For fixed $u \in X$ with $\|u\| = 1$, we set
$$\omega(A) := (u \mid Au) \quad \text{for all } A \in L(X, X).$$
Then, $\omega$ is a state of $\mathfrak{A}$.
Proof. Obviously, $\omega(I) = 1$ and
$$\omega(A^*A) = (Au \mid Au) \ge 0 \quad \text{for all } A \in L(X, X). \quad \Box$$
Let us now define von Neumann algebras based on the algebraic relation
$$\mathfrak{B} = \mathfrak{B}''. \tag{174}$$
Definition 7. Let $X$ be a complex Hilbert space. A subset $\mathfrak{B}$ of $L(X, X)$
is called a $C^*$-subalgebra of $L(X, X)$ iff $\mathfrak{B}$ is a linear subspace of $L(X, X)$
and
$$A, B \in \mathfrak{B} \quad \text{implies} \quad AB \in \mathfrak{B} \text{ and } A^* \in \mathfrak{B}.$$
The set $\mathfrak{B}' := \{A \in L(X, X): AB = BA \text{ for all } B \in \mathfrak{B}\}$ is called the
commutant of $\mathfrak{B}$. We also set $\mathfrak{B}'' := (\mathfrak{B}')'$.
By a von Neumann algebra, we understand a $C^*$-subalgebra of $L(X, X)$
such that (174) holds.
Operator algebras represent an important tool of modern mathematical

physics. John von Neumann introduced von Neumann algebras in the 1930s.
The theory of C*-algebras was developed by Gelfand and Naimark in the
1940s. In particular, the famous Gelfand-Naimark theorem says that each
C* -algebra is *-isomorphic to a C* -subalgebra of L(X, X) for some complex
Hilbert space X (cf. Problem 1.20 of AMS Vol. 109).
5.18.1 The Algebraic Setting for Quantum Statistics

A physical system $\Gamma$ (e.g., a gas) is described by a $C^*$-algebra $\mathfrak{A}$.
(i) The self-adjoint elements $A$ of $\mathfrak{A}$ are called observables. They correspond
to physical quantities like energy, particle number, and so forth.
(ii) The states $\omega$ of $\mathfrak{A}$ correspond to physical states of $\Gamma$.
(iii) We define
$\omega(A)$ := expectation value $\bar{A}$ of the observable $A$ in the state $\omega$;
$\omega((A - \omega(A)I)^2)$ := dispersion $(\Delta A)^2$ of the observable $A$ in the state $\omega$.
(iv) Dynamics. We postulate that there exists a one-parameter family
$\{\varphi_t\}_{t \in \mathbb{R}}$ of $*$-automorphisms $\varphi_t: \mathfrak{A} \to \mathfrak{A}$ such that, for all $t, s \in \mathbb{R}$,
$$\varphi_{t+s} = \varphi_t\varphi_s \quad \text{and} \quad \varphi_0 = \text{identity}.$$
This allows the following physical interpretation. If the system $\Gamma$ is in
the state $\omega$ at time $t = 0$, then it is in the state
$$\omega_t := \omega \circ \varphi_t$$
at time $t$. We have to show that $\omega_t$ is a state. In fact, for each $A \in \mathfrak{A}$, we get
$\varphi_t(A^*A) = B^*B$, where $B := \varphi_t(A)$, and hence $\omega_t(A^*A) = \omega(B^*B) \ge 0$.
Definition 8. Let $\beta \in \mathbb{R}$. A state $\omega$ is called a $\beta$-KMS-state iff
$$\int_{-\infty}^{\infty} f(t)\,\omega(A\varphi_t(B))\,dt = \int_{-\infty}^{\infty} f(t + i\beta\hbar)\,\omega(\varphi_t(B)A)\,dt \tag{175}$$
for all $A, B \in \mathfrak{A}$ and for all continuous functions $f: \mathbb{R} \to \mathbb{R}$ whose Fourier
transform belongs to $C_0^\infty(\mathbb{R})$.
In particular, condition (175) is satisfied if
$$\omega(A\varphi_t(B)) = \omega(\varphi_{t - i\beta\hbar}(B)A) \quad \text{for all } t \in \mathbb{R}, \tag{175*}$$
and these functions are continuous on $\mathbb{R}$ with respect to time $t$.
It was discovered around 1960 by the physicists Kubo, Martin, and
Schwinger that such states may describe thermodynamic equilibrium states.
From the mathematical point of view, we postulate that the KMS-states
correspond to thermodynamical equilibrium states of the system $\Gamma$. The
corresponding temperature $T$ is given by
$$T = \frac{1}{k\beta},$$
where $k$ denotes the Boltzmann constant. A motivation will be given ahead,
where we also introduce a more general concept of KMS-states that includes
systems with variable particle number (i.e., there is a nonvanishing chemical
potential $\mu$). By (175) and (175*), we obtain the surprising fact that
The temperature $T$ of thermodynamic equilibrium states is related to
imaginary time $i\beta\hbar$, where $\beta = \frac{1}{kT}$.
5.18.2 Applications to the Standard Model of Statistical Physics

Let us reconsider the situation from Section 5.17.3. We are given a finite-dimensional
Hilbert space $X$ with the orthonormal basis
$$\psi_1, \ldots, \psi_M.$$
Let
$$p_m := \frac{e^{\beta(\mu N_m - E_m)}}{\sum_{n=1}^M e^{\beta(\mu N_n - E_n)}}, \quad m = 1, \ldots, M,$$
where $\beta := \frac{1}{kT}$. The operators $H, N, \rho: X \to X$ are defined through
$$H\psi_m = E_m\psi_m, \quad N\psi_m = N_m\psi_m, \quad \rho\psi_m = p_m\psi_m$$
for all $m = 1, \ldots, M$.

Constant particle number. Let us first consider the case where $\mu = 0$,
i.e., the particle numbers $N_m$ are constant for each $m$.
(i) Algebra. Set $\mathfrak{A} := L(X, X)$.
(ii) States. If we define
$$\omega(A) := \operatorname{tr}(\rho A) \quad \text{for all } A \in \mathfrak{A}, \tag{176}$$
then $\omega$ is a state. In fact, it follows from
$$\omega(A) = \sum_{m=1}^M p_m (\psi_m \mid A\psi_m)$$
that
$$\omega(A^*A) = \sum_{m=1}^M p_m (\psi_m \mid A^*A\psi_m) = \sum_{m=1}^M p_m (A\psi_m \mid A\psi_m) \ge 0 \quad \text{for all } A \in \mathfrak{A}$$
and $\omega(I) = 1$.
The state $\omega$ from (176) is mixed provided we have $p_j > 0$ and $p_k > 0$ for
two different indices $j$ and $k$.
(iii) Dynamics. Define
$$\varphi_t(A) := e^{itH/\hbar} A e^{-itH/\hbar} \quad \text{for all } A \in \mathfrak{A} \text{ and all } t \in \mathbb{R}. \tag{177}$$
Then, $\varphi_t: \mathfrak{A} \to \mathfrak{A}$ is a $*$-automorphism for each time $t \in \mathbb{R}$. In fact, it follows from
$$\left(e^{\pm itH/\hbar}\right)^* = e^{\mp itH/\hbar}$$
and $(AB)^* = B^*A^*$ that
$$\varphi_t(AB) = \varphi_t(A)\varphi_t(B) \quad \text{and} \quad \varphi_t(A^*) = \varphi_t(A)^*$$
for all $A, B \in \mathfrak{A}$, and all $t \in \mathbb{R}$. Furthermore, $\varphi_t(A) = B$ implies $A =
\varphi_{-t}(B)$, by (177). Hence $\varphi_t^{-1} = \varphi_{-t}$.
To motivate (177), observe that
$$\omega(\varphi_t(A)) = \sum_{m=1}^M p_m (\psi_m(t) \mid A\psi_m(t)),$$
where $\psi_m(t) := e^{-itH/\hbar}\psi_m$.
This coincides with the time evolution from Section 5.17.1.

From the physical point of view, the state $\omega$ from (176) corresponds to a
physical state, where the total energy $E_m$ is realized with the probability
$p_m$. Intuitively, we expect that such a state corresponds to a thermodynamic
equilibrium. The following proposition justifies this in terms of the KMS-condition.
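Proposition 9. Each state $\omega$ from (176) is a $\beta$-KMS-state with respect to
$\varphi_t$, i.e.,
$$\omega(A\varphi_t(B)) = \omega(\varphi_{t - i\beta\hbar}(B)A) \quad \text{for all } A, B \in \mathfrak{A} \text{ and all } t \in \mathbb{R}. \tag{178}$$
The temperature of this state is given by $T = \frac{1}{k\beta}$.
Proof. For simplifying notation, let us use such units of time that $\hbar = 1$.
Observing that $\operatorname{tr}(C) := \sum_{m=1}^M (\psi_m \mid C\psi_m)$ for all $C \in \mathfrak{A}$, we get
$$\operatorname{tr}(CD) = \operatorname{tr}(DC) \quad \text{for all } C, D \in \mathfrak{A} \tag{179}$$
(cf. Problem 5.12). This implies (178). In fact, we have
$$\omega(C) = \operatorname{tr}(\rho C) = Z^{-1}\operatorname{tr}(e^{-\beta H}C),$$
where $Z := \operatorname{tr} e^{-\beta H}$. By (179),
$$Z\omega(\varphi_{t - i\beta}(B)A) = \operatorname{tr}\left(e^{-\beta H} e^{i(t - i\beta)H} B e^{-i(t - i\beta)H} A\right)$$
$$= \operatorname{tr}\left(e^{itH} B e^{-itH} e^{-\beta H} A\right)$$
$$= \operatorname{tr}\left(e^{-\beta H} A e^{itH} B e^{-itH}\right) = Z\omega(A\varphi_t(B)). \quad \Box$$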
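The KMS relation (178) can be tested numerically in the finite-dimensional setting of this section. The sketch below is illustrative only and rests on assumptions not in the text: $\hbar = 1$, $\mu = 0$, $M = 3$, hand-picked eigenvalues of $H$, and two arbitrary complex matrices $A$ and $B$ written in the eigenbasis of $H$. Because $H$ is diagonal there, $\varphi_z(B)$ has the entries $e^{iz(E_m - E_n)}B_{mn}$, also for complex $z$.

```python
import cmath
import math

BETA = 0.9           # inverse temperature beta = 1/(kT), illustrative value
E = [0.0, 1.0, 2.5]  # eigenvalues of H in the basis psi_1, psi_2, psi_3
M = len(E)

Z = sum(math.exp(-BETA * e) for e in E)
P = [math.exp(-BETA * e) / Z for e in E]  # p_m = exp(-beta E_m) / Z (case mu = 0)


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(M)) for j in range(M)]
            for i in range(M)]


def phi(z, B):
    """phi_z(B) = exp(i z H) B exp(-i z H) for complex z (hbar = 1).
    Since H is diagonal, the (m, n) entry is exp(i z (E_m - E_n)) * B[m][n]."""
    return [[cmath.exp(1j * z * (E[m] - E[n])) * B[m][n] for n in range(M)]
            for m in range(M)]


def omega(C):
    """State omega(C) = tr(rho C) = sum_m p_m C[m][m]."""
    return sum(P[m] * C[m][m] for m in range(M))


# Two arbitrary (non-self-adjoint) matrices, chosen only for illustration.
A = [[1 + 2j, 0.5, -1j], [2.0, -1 + 1j, 0.3], [0.7j, 1.0, 2 - 1j]]
B = [[0.2, 1j, 1.5], [-0.4 + 1j, 1.0, 2j], [3.0, -1 - 1j, 0.5]]

t = 0.37
lhs = omega(matmul(A, phi(t, B)))              # omega(A phi_t(B))
rhs = omega(matmul(phi(t - 1j * BETA, B), A))  # omega(phi_{t - i beta}(B) A)
print(abs(lhs - rhs))
```

The shift of the time argument by $-i\beta$ produces the factor $e^{\beta(E_m - E_n)}$, which converts $p_m$ into $p_n$; this is exactly the cyclicity argument with (179) used in the proof.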
Variable particle number. Let us now consider the more general case
where $\mu \neq 0$. Here, the state $\omega$ from (176) corresponds to a physical state
where the particle number $N_m$ and the total energy $E_m$ are realized with
probability $p_m$. We expect again that this represents a thermodynamic
equilibrium state. Introducing
$$\varphi_t^\mu(A) := e^{iH(\mu)t/\hbar} A e^{-iH(\mu)t/\hbar} \quad \text{for all } A \in \mathfrak{A},$$
where $H(\mu) := H - \mu N$, we obtain the following generalization of Proposition 9.
Proposition 10. The state $\omega$ from (176) is a $\beta$-KMS-state with respect to
$\varphi_t^\mu$, i.e.,
$$\omega(A\varphi_t^\mu(B)) = \omega(\varphi_{t - i\beta\hbar}^\mu(B)A) \quad \text{for all } A, B \in \mathfrak{A} \text{ and } t \in \mathbb{R}.$$
This equilibrium state corresponds to the temperature $T = \frac{1}{k\beta}$ and to the
chemical potential $\mu$.
Proof. Replace $H$ with $H(\mu)$ and use the same argument as in the proof
of Proposition 9. $\Box$
5.19 The Fock Space in Quantum Field Theory and the Pauli Principle

In quantum mechanics the number of particles is fixed. Quantum field
theory describes the interaction between elementary particles where the
number of particles is not fixed. The Fock space allows us to describe such
a situation. Experience shows that there exist two completely different
kinds of elementary particles in nature, namely,
(i) bosons (i.e., particles with integer spin like photons) and
(ii) fermions (i.e., particles with half-integer spin like electrons or
quarks).
The Pauli principle postulates the following:

It is impossible for two identical fermions to be in the same state.
Furthermore, we have the following principle of indistinguishability for both
bosons and fermions:
It is impossible to distinguish between n identical particles.
This means the following. For example, consider two electrons and let $A$
and $B$ be two one-electron states. In classical physics, we can distinguish
between the following two-electron states:
$$A_1A_2, \quad A_1B_2, \quad B_1A_2, \quad B_1B_2.$$
Here, $A_1B_2$ means that electron 1 is in state $A$ and electron 2 is in state
$B$. In quantum physics, only the following states exist:
$$AA, \quad AB, \quad BB,$$
where $AA$ means that the two electrons are both in state $A$, and so on.
In quantum statistics, the number of different states is of fundamental im-
portance. A different counting of states yields completely different physical
results. In fact, the two preceding principles were discovered by physicists
via quantum statistics.
The completely elementary proofs of the following propositions are left
to the reader as exercises.
5.19.1 The Fock Space for Bosons

We start with the Hilbert space $X_n := L_2^{\mathbb{C}}(\mathbb{R}^{3n})$, $n = 1, 2, \ldots$, with the
inner product
$$(f \mid g)_n := \int_{\mathbb{R}^{3n}} \overline{f(x)}\,g(x)\,dx,$$
and we set $X_0 := \mathbb{C}$ with $(f \mid g)_0 = \bar{f}g$. As usual, let $\|f\|_n := (f \mid f)_n^{1/2}$.
Definition 1. The bosonic Fock space $X$ consists of all the sequences
$(\psi_n)_{n=0,1,\ldots}$ such that
$$\sum_{n=0}^\infty \|\psi_n\|_n^2 < \infty.$$
Here, we assume that each function $\psi_n = \psi_n(x_1, \ldots, x_n)$ is symmetric with
respect to all arguments $x_1, \ldots, x_n \in \mathbb{R}^3$.
We also set $V_n := C_0^\infty(\mathbb{R}^{3n})$, $n \ge 1$, and $V(X) := \{(\psi_n) \in X: \psi_n \in V_n$
for all $n \ge 1\}$.
Proposition 2. The bosonic Fock space $X$ is a complex Hilbert space with
respect to the inner product
$$(\psi \mid \varphi) := \sum_{n=0}^\infty (\psi_n \mid \varphi_n)_n.$$
Definition 3. We set $D(N) := \{\psi \in X: \sum_{n=0}^\infty n^2\|\psi_n\|_n^2 < \infty\}$ and define
the particle number operator $N: D(N) \subseteq X \to X$ through
$$N\psi := (n\psi_n).$$
Example 4. Let $\psi_n \in X_n$ with $\|\psi_n\|_n = 1$ for fixed $n = 0, 1, \ldots$. Set
$$\psi := (0, \ldots, 0, \psi_n, 0, \ldots),$$
where $\psi_n$ stands at the $n$th place. Then, $\|\psi\| = 1$ and
$$N\psi = n\psi.$$
We say that the state $\psi$ corresponds to $n$ identical bosons (e.g., $n$ photons).
The state
$$\Omega := (1, 0, 0, \ldots)$$
is called the vacuum (or the ground state). Obviously, $N\Omega = 0$.
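Definition 5 (Creation operators $b_+$ and annihilation operators $b_-$). We
are given $f \in V_1$ and $\psi \in V(X)$. Let us define the operators
$$b_\pm(f): V(X) \subseteq X \to V(X)$$
in the following way. For all $x_1, \ldots, x_n \in \mathbb{R}^3$ and all $n = 1, 2, \ldots$, let
$$(b_+(f)\psi)_n(x_1, \ldots, x_n) := n^{-1/2} \sum_{j=1}^n f(x_j)\,\psi_{n-1}(x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n).$$
For $n = 0, 1, 2, \ldots$, let
$$(b_-(f)\psi)_n(x_1, \ldots, x_n) := (n + 1)^{1/2} \int_{\mathbb{R}^3} \overline{f(x)}\,\psi_{n+1}(x, x_1, \ldots, x_n)\,dx.$$
Furthermore, for $n = 0$, we set $(b_+(f)\psi)_0 := 0$.
Example 6. If $f, g \in V_1$, then
$$b_+(f)\Omega = (0, f, 0, \ldots) \quad \text{and} \quad b_+(g)b_+(f)\Omega = (0, 0, h, 0, \ldots),$$
where $h(x_1, x_2) := 2^{-1/2}(g(x_1)f(x_2) + f(x_1)g(x_2))$.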
The following result is crucial.
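Proposition 7 (The commutation relations). For all $f, g \in V_1$ and $\psi, \varphi \in
V(X)$, we get
$$b_+(f)b_+(g)\psi - b_+(g)b_+(f)\psi = 0, \tag{180}$$
$$b_-(f)b_-(g)\psi - b_-(g)b_-(f)\psi = 0, \tag{181}$$
$$b_-(f)b_+(g)\psi - b_+(g)b_-(f)\psi = (f \mid g)_1\psi. \tag{182}$$
Furthermore,
$$(b_+(f)\psi \mid \varphi) = (\psi \mid b_-(f)\varphi).$$
Example 8. Let $f, g \in V_1$ with $\|f\|_1 = \|g\|_1 = 1$. Since $b_-(f)\Omega = 0$, it
follows from (182) that
$$b_-(f)b_+(f)\Omega = b_+(f)b_-(f)\Omega + \Omega = \Omega$$
and
$$b_-(f)b_+(f)b_+(f)\Omega = b_+(f)b_-(f)b_+(f)\Omega + b_+(f)\Omega$$
$$= b_+(f)b_+(f)b_-(f)\Omega + 2b_+(f)\Omega = 2b_+(f)\Omega.$$
Moreover, from (180) we get
$$b_+(f)b_+(g)\Omega = b_+(g)b_+(f)\Omega. \tag{183}$$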
Physical Interpretation 9. Let $f, f_1, \ldots, f_m \in V_1$ with $\|f\|_1 = 1$ and
$\|f_j\|_1 = 1$ for all $j$. Then, we regard $f_j$ as the state of one particle. Suppose
that $\psi \neq 0$, where
$$\psi := \alpha_m b_+(f_1)b_+(f_2) \cdots b_+(f_m)\Omega.$$
Choose $\alpha_m \in \mathbb{C}$ in such a way that $\|\psi\| = 1$. Then, we regard $\psi$ as a state
of $m$ identical particles (bosons) in the states $f_1, \ldots, f_m$. It follows as in
(183) that $\psi$ remains unchanged under a permutation of $f_1, \ldots, f_m$. This
reflects the principle of indistinguishability of identical particles. Observe
that
$$N\psi = m\psi.$$
We say that the operator $b_+(f)$ creates one particle in the state $f$ from the
vacuum. Furthermore, by Example 8,
$$b_-(f)b_+(f)\Omega = \Omega.$$
Therefore, we say that $b_-(f)$ annihilates one particle in the state $f$.
5.19.2 The Fock Space for Fermions

In contrast to the bosonic Fock space, the functions $\psi_n$ are now antisymmetric.
As we will show, this forces the Pauli principle.
Definition 10. The fermionic Fock space $Y$ consists of all the sequences
$(\psi_n)_{n=0,1,\ldots}$ such that
$$\sum_{n=0}^\infty \|\psi_n\|_n^2 < \infty.$$
Here, we assume that each function $\psi_n = \psi_n(x_1, \ldots, x_n)$ is antisymmetric
with respect to all arguments $x_1, \ldots, x_n \in \mathbb{R}^3$.
Then, $Y$ is a complex Hilbert space with respect to the inner product
$$(\psi \mid \varphi) := \sum_{n=0}^\infty (\psi_n \mid \varphi_n)_n.$$
Let $V(Y) := \{(\psi_n) \in Y: \psi_n \in V_n \text{ for all } n \ge 1\}$. The particle number
operator $N: D(N) \subseteq Y \to Y$ is defined through
$$N(\psi_n) = (n\psi_n)$$
and $D(N) := \{\psi \in Y: \sum_{n=0}^\infty n^2\|\psi_n\|_n^2 < \infty\}$.
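Definition 11 (Creation operators $a_+$ and annihilation operators $a_-$). We
are given $f \in V_1$ and $\psi \in V(Y)$. Let us define the operators
$$a_\pm(f): V(Y) \subseteq Y \to V(Y)$$
in the following way. For all $x_1, \ldots, x_n \in \mathbb{R}^3$ and all $n = 1, 2, \ldots$, let
$$(a_+(f)\psi)_n(x_1, \ldots, x_n) := n^{-1/2} \sum_{j=1}^n (-1)^{j-1} f(x_j)\,\psi_{n-1}(x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n).$$
For all $n = 0, 1, 2, \ldots$,
$$(a_-(f)\psi)_n(x_1, \ldots, x_n) := (n + 1)^{1/2} \int_{\mathbb{R}^3} \overline{f(x)}\,\psi_{n+1}(x, x_1, \ldots, x_n)\,dx.$$
Furthermore, for $n = 0$, we set $(a_+(f)\psi)_0 := 0$.
Example 12. If $f \in V_1$, then
$$a_-(f)(0, f, 0, 0, \ldots) = (f \mid f)_1\Omega.$$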
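Proposition 13 (The anticommutation relations). For all $f, g \in V_1$ and
$\psi, \varphi \in V(Y)$, we get
$$a_+(f)a_+(g)\psi + a_+(g)a_+(f)\psi = 0, \tag{184}$$
$$a_-(f)a_-(g)\psi + a_-(g)a_-(f)\psi = 0, \tag{185}$$
$$a_-(f)a_+(g)\psi + a_+(g)a_-(f)\psi = (f \mid g)_1\psi. \tag{186}$$
Furthermore,
$$(a_+(f)\psi \mid \varphi) = (\psi \mid a_-(f)\varphi).$$
Example 14. Let $f, g \in V_1$ with $\|f\|_1 = \|g\|_1 = 1$. Since $a_-(f)\Omega = 0$, it
follows from (186) that
$$a_-(f)a_+(f)\Omega = -a_+(f)a_-(f)\Omega + \Omega = \Omega$$
and
$$a_-(f)a_+(f)a_+(f)\Omega = -a_+(f)a_-(f)a_+(f)\Omega + a_+(f)\Omega$$
$$= a_+(f)a_+(f)a_-(f)\Omega - a_+(f)\Omega + a_+(f)\Omega = 0.$$
From (184) we get
$$a_+(f)a_+(g)\Omega = -a_+(g)a_+(f)\Omega. \tag{187}$$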
Physical Interpretation 15. Let $f_1, \ldots, f_m \in V_1$ with $\|f_j\|_1 = 1$ for all
$j$. Then, we regard $f_j$ as the state of one particle. Suppose that $\psi \neq 0$,
where
$$\psi := \alpha_m a_+(f_1)a_+(f_2) \cdots a_+(f_m)\Omega.$$
Choose $\alpha_m \in \mathbb{C}$ in such a way that $\|\psi\| = 1$. Then, we regard $\psi$ as a state
of $m$ identical particles (fermions) in the states $f_1, \ldots, f_m$. It follows as in
(187) that $\psi$ passes to $\psi$ (resp., $-\psi$) if we perform an even (resp., odd)
permutation of $f_1, \ldots, f_m$. This reflects the principle of indistinguishability
of identical particles. If $f_1 = f_2 = \cdots = f_m = f$ with $m \ge 2$, then it follows as in (187) that
$$a_+(f)a_+(f) \cdots a_+(f)\Omega = 0.$$
This is the Pauli principle.
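The anticommutation relations and the Pauli principle can be made concrete in a finite-dimensional model. The sketch below is not the $L_2$ Fock space of this section; it assumes a one-particle space spanned by two orthonormal states $f$ and $g$ and represents the resulting four-dimensional fermionic Fock space by $4 \times 4$ matrices (a Jordan-Wigner type construction, which is a swapped-in standard technique and not taken from the text). All names are illustrative.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def transpose(A):
    return [list(col) for col in zip(*A)]


def kron(A, B):
    """Kronecker product of two square matrices."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def apply(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


I2 = [[1, 0], [0, 1]]
SZ = [[1, 0], [0, -1]]    # sign factor producing the antisymmetry
LOWER = [[0, 1], [0, 0]]  # single-mode annihilation: occupied -> empty

# Annihilation operators for two orthonormal one-particle states f and g.
a_f = kron(LOWER, I2)
a_g = kron(SZ, LOWER)
# Creation operators are the adjoints (real matrices, so transposes).
c_f, c_g = transpose(a_f), transpose(a_g)

ZERO = [[0] * 4 for _ in range(4)]
I4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
VACUUM = [1, 0, 0, 0]     # the vacuum: both modes empty

anti_cc = add(matmul(c_f, c_g), matmul(c_g, c_f))  # relation (184)
anti_ff = add(matmul(a_f, c_f), matmul(c_f, a_f))  # relation (186) with f = g
anti_fg = add(matmul(a_f, c_g), matmul(c_g, a_f))  # relation (186) with (f | g) = 0
pauli = apply(matmul(c_f, c_f), VACUUM)            # a_+(f) a_+(f) applied to the vacuum
state_fg = apply(matmul(c_f, c_g), VACUUM)         # a_+(f) a_+(g) applied to the vacuum
state_gf = apply(matmul(c_g, c_f), VACUUM)         # a_+(g) a_+(f) applied to the vacuum
print(pauli, state_fg, state_gf)
```

Creating the same state twice gives the zero vector, and exchanging the order of $f$ and $g$ changes the sign of the two-particle state, as in (187).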
Remark 16. Physicists use a heuristic machinery (e.g., the path integral)
in order to compute physical effects in particle accelerators with a high
accuracy. Unfortunately, there is no rigorous mathematical justification of
the arguments of physicists. It is one of the most important challenges of
mathematics to construct a rigorous quantum field theory which describes
realistic physical situations. Rigorous application of Fock spaces to quan-
tum statistics can be found in Bratteli and Robinson (1979), Vol. 2. Math-
ematical models of quantum field theory are studied rigorously in Glimm
and Jaffe (1981), and in Grosse (1995).
5.20 A Look at Scattering Theory

Scattering theory studies the motion of particles that move like free particles
as time $t$ goes to $\pm\infty$. By a free particle, we mean a particle that is
free of forces. In classical mechanics, free motion corresponds to a uniform
motion on straight lines. For example, in celestial mechanics the motion of
a comet with a hyperbolic trajectory is free as $t \to \pm\infty$ (cf. Figure 5.11).
In particle accelerators, scattering experiments are performed in order to
study properties of elementary particles. In the following, let us study scattering
processes in quantum physics.
We are given a complex Hilbert space $X$ along with the self-adjoint
operator $H: D(H) \subseteq X \to X$ such that
$$H = H_0 + H_1,$$
and the motion of a particle is given by
$$\psi(t) = e^{-itH/\hbar}\psi(0) \quad \text{for all } t \in \mathbb{R}.$$
We assume that $H_j: D(H_j) \subseteq X \to X$, $j = 0, 1$, are self-adjoint. In terms
of physics, we regard the motion
$$\psi_0(t) = e^{-itH_0/\hbar}\psi_0(0) \quad \text{for all } t \in \mathbb{R}$$
as a free motion, whereas $H_1$ corresponds to the action of forces.
Definition 1. The motion $\psi = \psi(t)$ is called asymptotically free as $t \to +\infty$
(resp., $t \to -\infty$) iff there exists a $\psi_0(0) \in X$ such that
$$\lim_{t \to +\infty} \|\psi(t) - \psi_0(t)\| = 0$$
(resp., $\lim_{t \to -\infty} \|\psi(t) - \psi_0(t)\| = 0$).
The motion $\psi = \psi(t)$ is called asymptotically free (or a scattering motion)
iff there exists a $\psi_0(0) \in X$ such that
$$\lim_{t \to +\infty} \|\psi(t) - \psi_0(t)\| = 0 \quad \text{and} \quad \lim_{t \to -\infty} \|\psi(t) - \psi_0(t)\| = 0.$$
FIGURE 5.11.
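Let us also define the wave operators $W_\pm: D(W_\pm) \subseteq X \to X$ in the
following way. We set
$$W_\pm\psi_\pm(0) := \lim_{t \to \pm\infty} e^{itH/\hbar}e^{-itH_0/\hbar}\psi_\pm(0), \tag{188}$$
where $\psi_\pm(0) \in D(W_\pm)$ iff the limit (188) exists.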
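Proposition 2. Let $\psi(0) \in X$. Then the motion
$$\psi(t) = e^{-itH/\hbar}\psi(0) \quad \text{for all } t \in \mathbb{R} \tag{189}$$
is asymptotically free as $t \to +\infty$ (resp., $t \to -\infty$) iff
$\psi(0) \in R(W_+)$ (resp., $\psi(0) \in R(W_-)$).
More precisely, if $\psi(0) = W_\pm \psi_\pm$, then
$$\lim_{t \to \pm\infty} \|\psi(t) - \psi_0(t)\| = 0, \tag{190}$$
where $\psi_0(t) := e^{-itH_0/\hbar}\psi_\pm$.
In particular, if $D(W_+) = X$ (resp., $D(W_-) = X$), then for each free
motion $\psi_0 = \psi_0(t)$, there exists a motion $\psi = \psi(t)$ under the
action of the "force $H_1$" such that (190) holds as $t \to +\infty$ (resp.,
$t \to -\infty$).
Proof. Since the operator $e^{itH/\hbar}: X \to X$ is unitary for each
$t \in \mathbb{R}$,
$$\|\psi(t) - \psi_0(t)\| = \|e^{-itH/\hbar}\psi(0) - e^{-itH_0/\hbar}\psi_\pm\| = \|\psi(0) - e^{itH/\hbar}e^{-itH_0/\hbar}\psi_\pm\|. \tag{191}$$
□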
At the same time, relation (191) tells us the following.
Corollary 3. Let $\psi(0) \in X$. Then, the motion (189) is asymptotically
free iff $\psi(0) \in R(W_+) \cap R(W_-)$.
Example 4 (One-dimensional classical scattering motion). Let us first
consider the classical motion $x = x(t)$ of a particle of mass $m > 0$ on the
real $x$-axis under the action of the force
$$K(x) := -U'(x), \quad x \in \mathbb{R},$$
in the direction of the positive $x$-axis. This motion is governed by the
Newtonian equation
$$m x''(t) = K(x(t)) \quad \text{for all } t \in \mathbb{R}, \qquad x(0) = x_0, \quad x'(0) = x_1. \tag{192}$$
We assume that
(A) The potential $U: \mathbb{R} \to \mathbb{R}$ is $C^1$ and vanishes outside
some compact interval.
Since $K$ is bounded on $\mathbb{R}$, the classical theory of ordinary
differential equations tells us that, for given real numbers $x_0$ and $x_1$,
problem (192) has a solution for all times $t$. If $x = x(t)$ is a solution
of (192), then we get
$$\frac{d}{dt}\left(2^{-1} m x'(t)^2 + U(x(t))\right) = [m x''(t) - K(x(t))]\, x'(t) = 0,$$
and hence
$$2^{-1} m x'(t)^2 + U(x(t)) = \text{const} = E \quad \text{for all times } t, \tag{193}$$
where $E$ = energy. This means conservation of energy.

Case 1: Bound motion. Let $E < 0$ and let $U$ be as given in Figure 5.12.
Then it follows from (193) that the motion is only possible in the region
$$\{x \in \mathbb{R}: E - U(x) \ge 0\}.$$
Case 2: Asymptotically free motion. Let $E > 0$ and let $U$ be as given in
Figure 5.12. By (193), $x'(t) \ne 0$ for all times $t \in \mathbb{R}$. Thus, if
$x_1 > 0$, then $x'(t) \ge \left(\frac{2E}{m}\right)^{1/2} > 0$ for all times
$t \ge 0$, and hence
$$x(t) \ge \left(\frac{2E}{m}\right)^{1/2} t + x_0 \quad \text{for all } t \ge 0.$$
Replacing $t$ with $-t$, this implies
$$x(t) \to \pm\infty \quad \text{as } t \to \pm\infty.$$
FIGURE 5.12. (a) bound motion ($E$ = energy); (b) asymptotically free motion.
Recall that the classical momentum $p$ is given by $p = m x'(t)$. If
$U \equiv 0$, then the force $K$ vanishes and the energy $E$ of the free
motion $x = x_1 t + x_0$ is equal to
$$E = \frac{p^2}{2m}.$$
Since each real value $p$ is possible, we obtain the following:
The energy values $E$ of classical free motion fill up the interval $[0, \infty[$.
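The two statements of Case 2 (conservation of energy and the lower bound on $x(t)$) can be checked numerically. The following is only a sketch under stated assumptions: the well $U(x) = -(1 - x^2)^2$ on $[-1, 1]$ is a hypothetical choice that satisfies (A), the mass is set to $m = 1$, and the velocity-Verlet scheme and the names `potential`, `force`, `integrate` are illustrative, not taken from the text.

```python
import math

def potential(x, depth=1.0):
    """Hypothetical C^1 well: -depth*(1 - x^2)^2 on [-1, 1], zero elsewhere."""
    return -depth * (1.0 - x * x) ** 2 if abs(x) < 1.0 else 0.0

def force(x, depth=1.0):
    """K(x) = -U'(x) for the well above."""
    return -4.0 * depth * x * (1.0 - x * x) if abs(x) < 1.0 else 0.0

def energy(x, v, m=1.0):
    """Left-hand side of (193)."""
    return 0.5 * m * v * v + potential(x)

def integrate(x0, x1, m=1.0, dt=1e-3, t_end=20.0):
    """Velocity-Verlet integration of m x'' = K(x), x(0) = x0, x'(0) = x1."""
    x, v = x0, x1
    acc = force(x) / m
    for _ in range(int(round(t_end / dt))):
        x += v * dt + 0.5 * acc * dt * dt
        new_acc = force(x) / m
        v += 0.5 * (acc + new_acc) * dt
        acc = new_acc
    return x, v

x_start, v_start, t_end = -5.0, 1.0, 20.0
E = energy(x_start, v_start)                 # positive energy: Case 2
x_end, v_end = integrate(x_start, v_start, t_end=t_end)
# Lower bound (2E/m)^(1/2) t + x_0 from the text, with m = 1.
free_bound = math.sqrt(2.0 * E) * t_end + x_start
```

After the particle has left the support of $U$, its velocity returns to $(2E/m)^{1/2}$, so the motion is again free, but it is ahead of the bound `free_bound` because it was accelerated inside the well.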
Standard Example 5 (One-dimensional scattering motion in quantum
mechanics). Parallel to the classical motion (192), let us now study the
corresponding motion in quantum mechanics described by the one-dimensional
Schrödinger equation
$$i\hbar\psi_t = H\psi, \tag{192*}$$
where $\psi = \psi(x, t)$, and $H = H_0 + H_1$ with the free Hamiltonian
$$H_0 := \frac{P^2}{2m},$$
the momentum operator $P\psi := -i\hbar\psi'$, and $H_1\psi := U\psi$. Let
$X := L_2^{\mathbb{C}}(\mathbb{R})$ along with
$D(H) := \{\psi \in X: \psi', \psi'' \in X\}$. Here, the prime denotes the
derivative with respect to $x$. Assume (A). Then, the following hold true:
(i) The Hamiltonian $H: D(H) \subseteq X \to X$ is self-adjoint.
(ii) Let $E(H)$ denote the linear hull of the eigenvectors of $H$. Then
$$D(W_+) = D(W_-) = X$$
and
$$R(W_+) = R(W_-) = E(H)^\perp,$$
where $E(H)^\perp$ denotes the orthogonal complement to $E(H)$.

(iii) The spectrum $\sigma(H)$ of $H$ satisfies the relation
$$[0, \infty[\ \subseteq \sigma(H) \subseteq \mathbb{R},$$
where the essential spectrum of $H$ is equal to $[0, \infty[$.
If $\lambda$ is an eigenvalue of $H$, then $\lambda < 0$ and $\lambda$ is
simple. There exists at most a finite number of such eigenvalues.
(iv) If $\int_{\mathbb{R}} U(x)\,dx < 0$, then $H$ has at least one eigenvalue.
(v) If $U \equiv 0$, then $H = H_0$ has no eigenvalues and the spectrum
$\sigma(H_0)$ is equal to the essential spectrum $[0, \infty[$.
Proof. The sophisticated proofs can be found in Schechter (1981). □
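Statements (iv) and (v) can be illustrated, not proved, by a finite-difference experiment. The sketch below assumes units with $\hbar = m = 1$, truncates the real line to a hypothetical box $[-30, 30]$ with Dirichlet boundary values, and uses the hypothetical well $U(x) = -\tfrac12 (1 - x^2)^2$ on $[-1, 1]$, which satisfies (A) and has negative integral. The function `count_eigenvalues_below` is an illustrative name; it counts the negative pivots of the tridiagonal matrix (a Sturm count), which by Sylvester's law of inertia equals the number of eigenvalues of the discretized operator below the given level.

```python
def well(x, depth=0.5):
    """Hypothetical C^1 potential with compact support and negative integral."""
    return -depth * (1.0 - x * x) ** 2 if abs(x) < 1.0 else 0.0

def free(x):
    """U = 0, i.e. H = H_0."""
    return 0.0

def count_eigenvalues_below(lam, potential, half_width=30.0, h=0.05):
    """Sturm count for the finite-difference matrix of -(1/2) psi'' + U psi
    on [-half_width, half_width] with Dirichlet boundary values:
    returns the number of matrix eigenvalues smaller than lam."""
    n = int(round(2.0 * half_width / h)) - 1
    diag_kinetic = 1.0 / h**2            # diagonal entry of -(1/2) d^2/dx^2
    off_squared = (0.5 / h**2) ** 2      # squared off-diagonal entry
    count = 0
    pivot = None
    for j in range(1, n + 1):
        x = -half_width + j * h
        d = diag_kinetic + potential(x) - lam
        pivot = d if pivot is None else d - off_squared / pivot
        if pivot == 0.0:
            pivot = 1e-30                # avoid division by zero in the next step
        if pivot < 0.0:
            count += 1
    return count

bound_states_free = count_eigenvalues_below(0.0, free)
bound_states_well = count_eigenvalues_below(0.0, well)
```

In this discretization the free operator has no negative eigenvalue, whereas the well produces at least one, in line with (v) and (iv). The box truncation turns the essential spectrum into closely spaced positive eigenvalues, so only the count below zero is meaningful here.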
By (iii) and (v), the essential spectrum of the free Hamiltonian $H_0$ is
stable under the perturbation $H_1$ corresponding to the potential $U$.
Generally, it is a fundamental property of the essential spectrum that it is
stable under reasonable perturbations.
In terms of physics, the eigenvalues of $H$ are the energy levels of bound
states of the particle. These energy levels do not belong to the essential
spectrum. From (ii) we get the following. Let $\psi(0) \in X$. Then, it
follows from $R(W_+) = R(W_-) = E(H)^\perp$ and Corollary 3 that
$\psi(0)$ is the initial state of an asymptotically free (scattering) motion
iff $\psi(0)$ is orthogonal to all eigenvectors (bound states) of $H$.
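In order to motivate the spectral property $\sigma(H_0) = [0, \infty[$, observe
that the function $\varphi_p(x) := e^{ipx/\hbar}$ is an eigenfunction of
$H_0 := -\left(\frac{\hbar^2}{2m}\right)\frac{d^2}{dx^2}$ with the eigenvalue
$\frac{p^2}{2m}$, i.e.,
$$H_0\varphi_p = \frac{p^2}{2m}\varphi_p \quad \text{for all } p \in \mathbb{R},$$
but $\varphi_p$ does not live in the Hilbert space
$X = L_2^{\mathbb{C}}(\mathbb{R})$. However, for each $p \in \mathbb{R}$,
$\varphi_p$ is a generalized eigenfunction of $H_0$ in the sense of Section
5.15. In fact, if we set
$$T_p(\varphi) := \int_{-\infty}^{\infty} \varphi_p(x)\varphi(x)\,dx \quad \text{for all } \varphi \in \mathcal{S},$$
then $T_p \in \mathcal{S}'$ for all $p \in \mathbb{R}$ and integration by parts
yields
$$T_p(H_0\varphi) = \frac{p^2}{2m} T_p(\varphi) \quad \text{for all } \varphi \in \mathcal{S} \text{ and all } p \in \mathbb{R}.$$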
5.21 The Language of Physicists in Quantum

Physics and the Justification of the Dirac
Calculus
If one does not sometimes think the illogical, one will never discover
new ideas in science.
Max Planck, 1945
"I think this is so," says Cicha, "in the fight for new insights, the
breaking brigades are marching in the front row. The vanguard that
does not look to left nor to right, but simply forges ahead-those
are the physicists. And behind them there are following the vari-
ous canteen men, all kinds of stretcher bearers, who clear the dead
bodies away or, simply put, get things in order. Well, those are the
mathematicians. "
From the criminal novel Dead Loves Poetry of the

Czech physicist Jan Klima (born in 1938)
In this section, we try to build a bridge between the language of physicists

and mathematicians.
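Let $X$ be a separable Hilbert space over $\mathbb{K}$ with the inner product
$(u \mid v)$, where $u, v \in X$. Physicists write
$$|u\rangle \quad \text{instead of } u.$$
This forces
$$|\alpha u\rangle = \alpha|u\rangle \quad \text{for all } \alpha \in \mathbb{K} \text{ and } u \in X.$$
They also use the symbol $\langle v|$ along with the formal multiplication
$$\langle v| \cdot |u\rangle = \langle v \mid u\rangle := (v \mid u) \quad \text{for all } u, v \in X.$$
This forces
$$\langle \alpha v| = \bar{\alpha}\langle v| \quad \text{for all } \alpha \in \mathbb{K} \text{ and } v \in X.$$
Furthermore, if $A$ is an operator on $X$, then this formal multiplication
yields
$$\langle v| \cdot A|u\rangle = \langle v|A|u\rangle = (v \mid Au).$$
Finally, the symbol
$$|u\rangle\langle v|$$
stands for the operator $B: X \to X$ defined through
$$Bw := (v \mid w)u \quad \text{for all } w \in X.$$
In fact, formally we get $|u\rangle\langle v| \cdot |w\rangle = |u\rangle\langle v \mid w\rangle$.
Like every perfect calculus, the Dirac calculus works on its own.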
5.21.1 The Discrete Dirac Calculus for Complete

Orthonormal Systems
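Let $\{u_j\}$ and $\{v_j\}$ be two complete orthonormal systems in $X$.
The Completeness and Orthogonality Relation 1. Physicists write
$$\sum_j |u_j\rangle\langle u_j| = I \quad \text{(completeness relation)} \tag{194}$$
and
$$\langle u_j \mid u_k\rangle = \delta_{jk} \quad \text{(orthogonality relation)}. \tag{195}$$
Formal Consequences 2. Using the formal rules introduced previously,
we conveniently get the following relations from (194):
$$|u\rangle = \sum_j |u_j\rangle\langle u_j \mid u\rangle \quad \text{for all } u \in X, \tag{196}$$
$$\langle v \mid u\rangle = \sum_j \langle v \mid u_j\rangle\langle u_j \mid u\rangle \quad \text{for all } u, v \in X. \tag{197}$$
Along with $\sum_j |v_j\rangle\langle v_j| = I$ and (194), we also obtain
$$\sum_{j,k} \langle v \mid u_j\rangle\langle u_j \mid v_k\rangle\langle v_k \mid u\rangle = \langle v \mid u\rangle \quad \text{for all } u, v \in X. \tag{198}$$
Justification of the Dirac Calculus 3. Since $\{u_j\}$ is complete,
$$u = \sum_j (u_j \mid u)u_j \quad \text{for all } u \in X. \tag{196*}$$
This is (196). From (196*) we obtain
$$(v \mid u) = \sum_j (v \mid u_j)(u_j \mid u) \quad \text{for all } u, v \in X. \tag{197*}$$
This is (197). Finally, since $\{v_k\}$ is complete, we have
$$u_j = \sum_k (v_k \mid u_j)v_k,$$
and from
$$v = \sum_j (u_j \mid v)u_j = \sum_{j,k} (u_j \mid v)(v_k \mid u_j)v_k$$
we obtain
$$(v \mid u) = \sum_{k,j} (v \mid u_j)(u_j \mid v_k)(v_k \mid u), \tag{198*}$$
noting that $(\alpha u \mid v) = \bar{\alpha}(u \mid v)$ for all
$\alpha \in \mathbb{K}$. This is (198).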
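In finite dimensions the relations (196*), (197*), and (198*) are plain linear algebra and can be checked directly. The following sketch assumes $X = \mathbb{C}^2$ with the inner product that is antilinear in the first argument; the two orthonormal bases and the vectors `u`, `v` are arbitrary illustrative choices.

```python
import math

def inner(u, v):
    """(u | v): antilinear in the first, linear in the second argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def scale(c, u):
    return [c * a for a in u]

def add(u, v):
    return [a + b for a, b in zip(u, v)]

r = 1.0 / math.sqrt(2.0)
u_basis = [[r, r], [r, -r]]                  # complete orthonormal system {u_j}
v_basis = [[r, 1j * r], [r, -1j * r]]        # complete orthonormal system {v_k}

u = [1 + 2j, 3 - 1j]
v = [0.5j, 2.0]

# (196*): u = sum_j (u_j | u) u_j
expansion = [0j, 0j]
for uj in u_basis:
    expansion = add(expansion, scale(inner(uj, u), uj))

# (197*): (v | u) = sum_j (v | u_j)(u_j | u)
single_sum = sum(inner(v, uj) * inner(uj, u) for uj in u_basis)

# (198*): (v | u) = sum_{j,k} (v | u_j)(u_j | v_k)(v_k | u)
double_sum = sum(inner(v, uj) * inner(uj, vk) * inner(vk, u)
                 for uj in u_basis for vk in v_basis)
```

Up to rounding, `expansion` reproduces `u`, and both `single_sum` and `double_sum` agree with `inner(v, u)`, which is what inserting $\sum_j |u_j\rangle\langle u_j| = I$ once or twice amounts to.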
5.21.2 The Continuous Dirac Calculus, the Fourier

Transformation, and the Momentum Operator
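Physicists also use the Dirac calculus formally in the case where a
continuum of indices appears. As an example, let us introduce the function
$$\varphi_k(x) := (2\pi)^{-1/2} e^{ikx} \quad \text{for all } x \in \mathbb{R} \text{ and each index } k \in \mathbb{R}.$$
Physicists set
$$|k\rangle := \varphi_k \quad \text{for all } k \in \mathbb{R}$$
and
$$\sum_k \ldots := \int_{\mathbb{R}} \ldots\, dk.$$
Recall also the formal use of the Dirac $\delta$-function, namely,
$$\sum_k f(k)\delta(k - k') = \int_{\mathbb{R}} f(k)\delta(k - k')\,dk = f(k').$$
This relation tells us the following:
The Dirac function $\delta(k - k')$ can be regarded as a continuous version of
the Kronecker symbol $\delta_{kk'}$.
The Completeness Relation and the Orthogonality Relation 4.
Physicists write
$$\sum_k |k\rangle\langle k| = I \quad \text{(completeness relation)} \tag{199}$$
and
$$\langle k' \mid k\rangle = \delta(k' - k) \quad \text{for all } k', k \in \mathbb{R} \quad \text{(orthogonality relation)}. \tag{200}$$
Formal Consequences 5. From (199) we get
$$|u\rangle = \sum_k |k\rangle\langle k \mid u\rangle \quad \text{for all } u \in X \tag{201}$$
and
$$\langle v \mid u\rangle = \sum_k \langle v \mid k\rangle\langle k \mid u\rangle \quad \text{for all } u, v \in X. \tag{202}$$
Furthermore, it follows from $\sum_{k'} |k'\rangle\langle k'| = I$ that
$$\langle v| = \sum_{k'} \langle v \mid k'\rangle\langle k'| \quad \text{and} \quad |u\rangle = \sum_k |k\rangle\langle k \mid u\rangle.$$
Hence, by the orthogonality relation (200),
$$\langle v \mid u\rangle = \sum_{k,k'} \langle v \mid k'\rangle\langle k' \mid k\rangle\langle k \mid u\rangle = \sum_{k,k'} \langle v \mid k'\rangle\delta(k' - k)\langle k \mid u\rangle = \sum_k \langle v \mid k\rangle\langle k \mid u\rangle.$$
This coincides with (202). Thus, the orthogonality relation (200) guarantees
that the computation of $\langle v \mid u\rangle$ can be based on the
multiplication of $\langle v|$ by $|u\rangle$.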
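Justification 6. Let $X := L_2^{\mathbb{C}}(\mathbb{R})$. Then,
$$\langle u \mid v\rangle = (u \mid v) = \int_{\mathbb{R}} \overline{u(x)}\,v(x)\,dx \quad \text{for all } u, v \in X.$$
Now let $u, v \in \mathcal{S}$. Then
$$\langle k \mid u\rangle = (\varphi_k \mid u) = \int_{\mathbb{R}} \overline{\varphi_k(x)}\,u(x)\,dx = (2\pi)^{-1/2} \int_{\mathbb{R}} e^{-ikx}u(x)\,dx.$$
Consequently, the function $k \mapsto \langle k \mid u\rangle$ represents the
Fourier transform of $u$. Hence the inverse Fourier transformation yields
$$u(x) = \int_{\mathbb{R}} \varphi_k(x)\langle k \mid u\rangle\,dk \quad \text{for all } x \in \mathbb{R},$$
which is (201). Furthermore, since the Fourier transformation represents a
unitary operator on $X$, we get
$$(u \mid v) = \int_{\mathbb{R}} \overline{\langle k \mid u\rangle}\,\langle k \mid v\rangle\,dk.$$
Because of $\overline{\langle k \mid u\rangle} = \overline{(k \mid u)} = (u \mid k) = \langle u \mid k\rangle$,
this is (202).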
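The rigorous content of (201) and (202), namely Fourier inversion and the unitarity of the Fourier transformation, can be tested on Gaussians, for which all integrals are known in closed form. The sketch below is an illustration under stated assumptions: the integrals over $\mathbb{R}$ are replaced by Riemann sums on the hypothetical grid $[-10, 10]$ with step $0.05$, and the functions `u` and `v` are arbitrary real Schwartz functions.

```python
import cmath
import math

H_STEP = 0.05
GRID = [-10.0 + H_STEP * n for n in range(401)]    # used for both x and k

def phi(k, x):
    """phi_k(x) = (2 pi)^(-1/2) exp(i k x)."""
    return cmath.exp(1j * k * x) / math.sqrt(2.0 * math.pi)

def bracket(k, f):
    """<k | f> = integral of conj(phi_k(x)) f(x) dx, as a Riemann sum."""
    return H_STEP * sum(phi(k, x).conjugate() * f(x) for x in GRID)

def u(x):
    return math.exp(-0.5 * x * x)

def v(x):
    return math.exp(-0.5 * (x - 1.0) ** 2)

u_hat = [bracket(k, u) for k in GRID]
v_hat = [bracket(k, v) for k in GRID]

# (u | v) computed in position space (u is real, so conj(u) = u) ...
position_side = H_STEP * sum(u(x) * v(x) for x in GRID)
# ... and via (202) in momentum space.
momentum_side = H_STEP * sum(a.conjugate() * b for a, b in zip(u_hat, v_hat))

# (201) at the sample point x = 0.3: u(x) = integral of phi_k(x) <k | u> dk.
reconstructed = H_STEP * sum(phi(k, 0.3) * a for k, a in zip(GRID, u_hat))
```

Both sides of (202) agree with the exact value $\sqrt{\pi}\,e^{-1/4}$, and `reconstructed` returns `u(0.3)` up to rounding.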
Example 7 (The momentum operator). As in Section 5.15, let us consider
the momentum operator
$$P := -i\frac{d}{dx}.$$
Here, we use such physical units that $\hbar = 1$. The equation
$$-i\varphi_p'(x) = p\,\varphi_p(x) \quad \text{for all } x \in \mathbb{R} \text{ and each } p \in \mathbb{R}$$
can be written as
$$P|p\rangle = p|p\rangle \quad \text{for all } p \in \mathbb{R}.$$
Because of the completeness relation (199) and the orthogonality relation
(200), physicists say that $\{|p\rangle\}$ forms a "complete orthonormal
system" of eigenstates of the momentum operator $P$.
Observe that $|p\rangle$ does not live in the Hilbert space $X$. But recall
from Section 5.15 that, in terms of mathematics, the system $\{|p\rangle\}$
forms a complete system of generalized eigenfunctions of $P$.
5.21.3 The Continuous Dirac Calculus and the Position

Operator
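In order to deal with the position operator in a similar way, physicists
introduce the formal state $|x\rangle$ and they set
$$u(x) := \langle x \mid u\rangle \quad \text{for all } x \in \mathbb{R}.$$
The Formal Completeness and Orthogonality Relation 8. Physicists write
formally
$$\sum_x |x\rangle\langle x| = I \quad \text{(completeness relation)} \tag{203}$$
and
$$\langle x' \mid x\rangle = \delta(x' - x) \quad \text{for all } x', x \in \mathbb{R} \quad \text{(orthogonality relation)}. \tag{204}$$
Moreover, they set $\sum_x \ldots := \int_{\mathbb{R}} \ldots\, dx$.
Formal Consequences 9. From (203) we get
$$\langle v \mid u\rangle = \sum_x \langle v \mid x\rangle\langle x \mid u\rangle. \tag{205}$$
This is identical to the rigorous expression
$$\langle v \mid u\rangle = (v \mid u) = \int_{\mathbb{R}} \overline{v(x)}\,u(x)\,dx \quad \text{for all } u, v \in X,$$
where $X := L_2^{\mathbb{C}}(\mathbb{R})$. From
$$\langle v| = \sum_{x'} \langle v \mid x'\rangle\langle x'| \quad \text{and} \quad |u\rangle = \sum_x |x\rangle\langle x \mid u\rangle$$
along with the orthogonality relation (204), we obtain
$$\langle v \mid u\rangle = \sum_{x,x'} \langle v \mid x'\rangle\langle x' \mid x\rangle\langle x \mid u\rangle = \sum_{x,x'} \langle v \mid x'\rangle\delta(x' - x)\langle x \mid u\rangle = \sum_x \langle v \mid x\rangle\langle x \mid u\rangle.$$
This coincides with (205). Formally,
$$u(x) = \langle x \mid u\rangle = \int_{\mathbb{R}} u(y)\delta(y - x)\,dy.$$
Therefore, physicists write
$$|x\rangle = \delta_x,$$
where $\delta_x(y) := \delta(y - x)$ for all $y \in \mathbb{R}$.
Finally, we want to show that the Dirac calculus is so powerful that it
automatically leads to the right formula for the inverse Fourier
transformation. In fact, from
$$|u\rangle = \sum_k |k\rangle\langle k \mid u\rangle$$
with $|k\rangle = \varphi_k$, we obtain
$$\langle x \mid u\rangle = \sum_k \langle x \mid k\rangle\langle k \mid u\rangle = \sum_k \varphi_k(x)\langle k \mid u\rangle. \tag{206}$$
Furthermore, $\sum_x |x\rangle\langle x| = I$ yields
$$\langle k \mid u\rangle = \sum_x \langle k \mid x\rangle\langle x \mid u\rangle = \sum_x \overline{\langle x \mid k\rangle}\,\langle x \mid u\rangle = \sum_x \overline{\varphi_k(x)}\,u(x). \tag{207}$$
Letting $a(k) := \langle k \mid u\rangle$ and recalling that
$\varphi_k(x) := (2\pi)^{-1/2} e^{ikx}$, relation (207) is identical to the
Fourier transformation
$$a(k) = \int_{\mathbb{R}} \overline{\varphi_k(x)}\,u(x)\,dx \quad \text{for all } k \in \mathbb{R},$$
and (206) is identical to the inverse Fourier transformation
$$u(x) = \int_{\mathbb{R}} \varphi_k(x)a(k)\,dk \quad \text{for all } x \in \mathbb{R}.$$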
Example 10 (The position operator). Let $Q$ be the position operator
defined by
$$(Q\psi)(y) := y\psi(y) \quad \text{for all } y \in \mathbb{R}.$$
Formally,
$$y\delta_x(y) = y\delta(y - x) = x\delta(y - x) = x\delta_x(y) \quad \text{for all } y \in \mathbb{R},$$
since "$\delta(y - x) = 0$ if $y \ne x$." Hence
$$Q|x\rangle = x|x\rangle \quad \text{for all } x \in \mathbb{R}.$$
Because of the completeness relation (203) and the orthogonality relation
(204), physicists say that $\{|x\rangle\}$ forms a "complete system" of
eigenstates of the position operator $Q$.
Observe that $|x\rangle$ does not live in the Hilbert space $X$. But recall
from Section 5.15 that, in terms of mathematics, the system $\{|x\rangle\}$
forms a complete system of generalized eigenfunctions of $Q$.
Remark 11 (An artificial barrier between mathematics and physics). In
many math textbooks, the inner product on a complex Hilbert space $X$ is
defined in such a way that
$$(\alpha u \mid v) = \alpha(u \mid v) \quad \text{for all } \alpha \in \mathbb{C} \text{ and all } u, v \in X.$$
Hence $(u \mid \alpha v) = \bar{\alpha}(u \mid v)$. However, the Dirac
calculus used in all physics textbooks forces the convention
$$(u \mid \alpha v) = \alpha(u \mid v) \quad \text{for all } \alpha \in \mathbb{C} \text{ and all } u, v \in X, \tag{208}$$
which is used in the present book. In the future, mathematicians should

pass to the convention (208) in order to avoid an artificial barrier between
the language of physicists and mathematicians.
The beauty and elegance of the Dirac calculus will become clear in the
next section. The relation between the general Dirac calculus and rigorous
mathematics (namely, the general spectral theorem due to von Neumann),
is discussed in Zeidler (1986), Vol. 5, Chapter 89.
5.22 The Euclidean Strategy in Quantum Physics

5.22.1 Diffusion
Let us start out from the diffusion equation
$$u_t(x, t) = a\,u_{xx}(x, t) \quad \text{for all } x \in \mathbb{R},\ t > 0, \tag{209}$$
$$u(x, 0) = u_0(x) \quad \text{for all } x \in \mathbb{R} \quad \text{(initial condition)}.$$
This equation describes the mass conservation of a diffusion process on the
real line, where
$u(x, t)$ := mass density at the point $x$ at time $t$.
Hence
$$\int_c^d u(x, t)\,dx = \text{mass on the interval } [c, d] \text{ at time } t.$$
The fixed positive number $a$ is called the diffusion coefficient. A physical
motivation of (209) can be found in Zeidler (1986), Vol. 4, Section 69.1.
By a classical solution of (209), we understand a bounded continuous
function $u: \mathbb{R} \times [0, \infty[ \to \mathbb{R}$ which is $C^1$ on
$\mathbb{R} \times ]0, \infty[$ and satisfies (209).
Proposition 1. Let the bounded continuous function
$u_0: \mathbb{R} \to \mathbb{R}$ be given. Then, the initial-value problem
(209) has a unique classical bounded solution given by
$$u(x, t) := \begin{cases} (4\pi a t)^{-1/2} \displaystyle\int_{-\infty}^{\infty} e^{-\frac{(x-y)^2}{4at}} u_0(y)\,dy & \text{if } x \in \mathbb{R},\ t > 0, \\ u_0(x) & \text{if } x \in \mathbb{R},\ t = 0. \end{cases}$$
In addition, $u$ is $C^\infty$ on $\mathbb{R} \times ]0, \infty[$.
The well-known classical proof of Proposition 1 can be found in John
(1982), Chapter 7.
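The solution formula of Proposition 1 can be evaluated by direct quadrature. The sketch below is an illustration under stated assumptions: the integral over $\mathbb{R}$ is truncated to the hypothetical interval $[-12, 12]$ and replaced by a Riemann sum, and the initial datum $u_0(y) = e^{-y^2}$ is chosen because the convolution with the Gauss kernel is then known in closed form, namely $(1 + 4at)^{-1/2} e^{-x^2/(1+4at)}$.

```python
import math

def heat_solution(u0, x, t, a, half_width=12.0, h=0.01):
    """Riemann-sum evaluation of the formula in Proposition 1."""
    if t == 0.0:
        return u0(x)
    n = int(round(2.0 * half_width / h))
    total = 0.0
    for j in range(n + 1):
        y = -half_width + j * h
        total += math.exp(-(x - y) ** 2 / (4.0 * a * t)) * u0(y)
    return h * total / math.sqrt(4.0 * math.pi * a * t)

def gaussian_data(y):
    """Initial mass density u_0(y) = exp(-y^2)."""
    return math.exp(-y * y)

def gaussian_exact(x, t, a):
    """Closed-form solution for the Gaussian initial datum."""
    s = 1.0 + 4.0 * a * t
    return math.exp(-x * x / s) / math.sqrt(s)

numeric = heat_solution(gaussian_data, 0.7, 0.5, 1.0)
exact = gaussian_exact(0.7, 0.5, 1.0)
```

For the constant datum $u_0 \equiv 1$ the same routine returns 1 up to the truncation error, which reflects the fact that the Gauss kernel has total mass 1.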
Remark 2 (Probabilistic interpretation of the diffusion process via
Brownian motion). Suppose that the diffusion process (209) corresponds to the
motion of a large number of identical particles of mass $m > 0$. Define
(P) $\int_c^d (4\pi t a)^{-1/2} e^{-\frac{(x-y)^2}{4ta}}\,dx$ = probability of
finding the particle in the interval $[c, d]$ at time $t > 0$ provided the
particle is at the point $y$ at time $t = 0$.
Naturally enough, this corresponds to a Gauss distribution with the mean
value $y$ and the dispersion $\sigma^2 = 2ta$. We want to show that (P)
implies Proposition 1. Let the mass density $u_0$ be given at time $t = 0$.
For small $\Delta y > 0$, the number $N$ of particles in the interval
$[y, y + \Delta y]$ at time $t = 0$ is approximately equal to
$$N = \frac{u_0(y)\Delta y}{m}.$$
These $N$ particles spread over the real line during the time interval
$[0, t]$. Let $M$ of them be in the small interval $[x, x + \Delta x]$ at
time $t$. By (P),
$$M = N (4\pi t a)^{-1/2} e^{-\frac{(x-y)^2}{4ta}} \Delta x.$$
This corresponds to the partial mass density $\frac{mM}{\Delta x}$ at the
point $x$ at time $t$. The total mass in the interval $[x, x + \Delta x]$ at
time $t$ is obtained by summing over all the contributions coming from all
the possible intervals $[y, y + \Delta y]$. Letting $\Delta x \to 0$ and
$\Delta y \to 0$, we get the formula from Proposition 1 for the mass density
$u(x, t)$ at the point $x$ at time $t$.
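The probabilistic picture (P) can be illustrated by a small Monte Carlo experiment. The following sketch rests on assumptions that are not part of the text: the Brownian path is approximated by a sum of independent Gaussian increments of variance $2a\,\Delta t$, and the particle number, step number, and random seed are arbitrary illustrative values. The closed form of the integral in (P) is expressed through the error function.

```python
import math
import random

def simulate_positions(y, t, a, n_particles=4000, n_steps=50, seed=0):
    """Positions at time t of particles that start at y and receive
    independent Gaussian kicks with variance 2*a*dt in each time step."""
    rng = random.Random(seed)
    sigma_step = math.sqrt(2.0 * a * t / n_steps)
    positions = []
    for _ in range(n_particles):
        x = y
        for _ in range(n_steps):
            x += rng.gauss(0.0, sigma_step)
        positions.append(x)
    return positions

def gauss_probability(c, d, y, t, a):
    """The integral in (P), evaluated in closed form."""
    s = math.sqrt(4.0 * a * t)
    return 0.5 * (math.erf((d - y) / s) - math.erf((c - y) / s))

def sample_variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

a, t, y = 0.5, 1.0, 0.0
positions = simulate_positions(y, t, a)
variance = sample_variance(positions)                       # compare with 2 t a
fraction = sum(1 for x in positions if -1.0 <= x <= 1.0) / len(positions)
predicted = gauss_probability(-1.0, 1.0, y, t, a)           # value given by (P)
```

Within the statistical error of the sample, `variance` is close to the dispersion $2ta$ and `fraction` is close to `predicted`.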
5.22.2 The Schrodinger Equation as a Diffusion Equation

in Imaginary Time
By (141), the Schrödinger equation for a free particle of mass $m > 0$ on
the real line reads as follows:
$$\psi_t = ia\psi_{xx} \quad \text{for all } x \in \mathbb{R},\ t > 0, \tag{209*}$$
$$\psi(x, 0) = \psi_0(x) \quad \text{for all } x \in \mathbb{R} \quad \text{(initial condition)},$$
where $a := \frac{\hbar}{2m}$. Recall that
$$\int_c^d |\psi(x, t)|^2\,dx = \text{probability of finding the particle in the interval } [c, d].$$
Comparing (209*) with (209), we obtain the following fundamental result:

If we pass from the real time t to the imaginary time it, then the diffusion
equation passes to the Schrodinger equation.
This leads us immediately to the so-called Euclidean strategy of physicists:
In order to compute quantum processes in nature, consider first diffusion
processes and pass then to imaginary time.
The theory of diffusion processes on a microscopic level (the Brownian
motion) was created in rigorous mathematical terms by Norbert Wiener in
1923. Roughly speaking, we have the following:
Brownian motion in real time $\Rightarrow$ the Wiener path integral
$\Rightarrow$ diffusion process in nature.
Formally using the Euclidean strategy, we obtain the Feynman approach
to quantum processes discovered in the 1940s:
Brownian motion in imaginary time $\Rightarrow$ the Feynman path integral
$\Rightarrow$ quantum process in nature.
Unfortunately, the latter approach frequently works only on a formal level.
However, from the physical point of view, the Feynman approach provides
us with deep insight concerning the relation between classical physics and
quantum physics.
Let us first use the Euclidean strategy in order to solve the Schrödinger
equation (209*) in terms of classical mathematics. The Feynman path integral
will be studied in the next section. Replacing $t$ with $it$ and letting
$a := \frac{\hbar}{2m}$, from Proposition 1 we obtain the following formal
solution of (209*):
$$(A\psi_0)(x, t) := \left(\frac{m}{2\pi i\hbar t}\right)^{1/2} \int_{-\infty}^{\infty} e^{-\frac{m(x-y)^2}{2i\hbar t}} \psi_0(y)\,dy$$
for all $x \in \mathbb{R}$ and $t > 0$. Here we choose
$\sqrt{i} := e^{i\pi/4}$. To justify this, let us write the Schrödinger
equation (209*) in the usual operator form
$$i\hbar\psi'(t) = H_0\psi(t) \quad \text{for all } t \in \mathbb{R}, \qquad \psi(0) = \psi_0, \tag{209**}$$
with the Hilbert space $X := L_2^{\mathbb{C}}(\mathbb{R})$ and the free
Hamiltonian $H_0: D(H_0) \subseteq X \to X$ defined through
$D(H_0) := \{\psi \in X: \psi', \psi'' \in X\}$ and
$H_0\psi := -\left(\frac{\hbar^2}{2m}\right)\psi''$. By Problem 5.7, $H_0$ is
self-adjoint. Thus, the quantum motion corresponding to (209**) is given
through
$$\psi(t) = e^{-itH_0/\hbar}\psi_0 \quad \text{for all } t \in \mathbb{R}.$$
Finally, let us introduce the Gauss function
$$u_{\alpha,\beta}(x) := e^{-\beta(x-\alpha)^2} \quad \text{for all } x \in \mathbb{R}$$
and define $D := \{u_{\alpha,\beta}: \alpha \in \mathbb{R}, \beta > 0\}$. It has
been proved in Problem 3.6 that span $D$ is dense in $X$. We shall show below
that $(A\psi_0)(x, t)$ can be continued to all times $t \in \mathbb{R}$ if
$\psi_0 \in$ span $D$.
By a classical solution of the Schrödinger equation (209*), we understand
a $C^\infty$-function $\psi = \psi(x, t)$ on $\mathbb{R}^2$ that satisfies (209*).
Proposition 3. Let $\psi_0 \in$ span $D$ be given. Then, the following hold
true:
(i) $A\psi_0$ is a classical solution of the Schrödinger equation (209*).
(ii) $A\psi_0$ coincides with the corresponding quantum dynamics, i.e.,
$$(A\psi_0)(\cdot, t) = e^{-itH_0/\hbar}\psi_0 \quad \text{for all } t \in \mathbb{R}.$$
Proof. Ad (i). Let $u_{\alpha,\beta} \in D$. To simplify notation, set
$\hbar = 2m = 1$, so that $a = 1$. Computing the classical integral, we obtain
$$(Au_{\alpha,\beta})(x, t) = \frac{1}{\sqrt{1 + 4i\beta t}}\, e^{-\frac{\beta(x-\alpha)^2}{1 + 4i\beta t}} \quad \text{for all } x \in \mathbb{R},\ t > 0.$$
But the right-hand side also makes sense if $t \le 0$. Therefore, we define
$$(Au_{\alpha,\beta})(x, t) := \frac{1}{\sqrt{1 + 4i\beta t}}\, e^{-\frac{\beta(x-\alpha)^2}{1 + 4i\beta t}} \quad \text{for all } x \in \mathbb{R},\ t \le 0.$$
More precisely, we have $-\frac{\pi}{2} < \arg(1 + 4i\beta t) < \frac{\pi}{2}$
for all $t \in \mathbb{R}$, and we choose that branch of the square root where
$$-\frac{\pi}{4} < \arg\sqrt{1 + 4i\beta t} < \frac{\pi}{4} \quad \text{for all } t \in \mathbb{R}.$$
To simplify notation, set $B(x, t) := (Au_{\alpha,\beta})(x, t)$. For all
$x, t \in \mathbb{R}$,
$$B_t(x, t) = \frac{2i\beta\left(2\beta(x-\alpha)^2 - 1 - 4i\beta t\right)}{(1 + 4i\beta t)^2\sqrt{1 + 4i\beta t}}\, e^{-\frac{\beta(x-\alpha)^2}{1 + 4i\beta t}}.$$
Computing the partial derivative $B_{xx}$ the same way, we obtain
$$B_t(x, t) = iB_{xx}(x, t) \quad \text{for all } x, t \in \mathbb{R}.$$
Thus, $B$ is a classical solution of the Schrödinger equation (209*).

Ad (ii). Since the operator $H_0$ is linear, it is sufficient to prove the
statement for $\psi_0 \in D$. Let $\psi_0 := u_{\alpha,\beta}$. Define
$$\psi(t) := B(\cdot, t) \quad \text{for all } t \in \mathbb{R},$$
i.e., $\psi(t)$ represents the function $x \mapsto B(x, t)$ on $\mathbb{R}$.
Let us show the following:
(a) $\psi(t) \in D(H_0)$ for all $t \in \mathbb{R}$.
(b) The time derivative $\psi'(t)$ exists in the Hilbert space $X$ for all
$t \in \mathbb{R}$. Explicitly, $\psi'(t) = B_t(\cdot, t)$ for all $t \in \mathbb{R}$.
(c) The function $t \mapsto \psi'(t)$ is continuous from $\mathbb{R}$ to $X$.
(d) $\psi$ is a solution of the Schrödinger equation (209**).
This follows easily from the explicit expression for B(x, t) by using ma-
jorants for parameter integrals based on
(M)
for all a E JR, f3 > 0, and k = 0,1, ....

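Proof of (a). By (M),
$$\int_{-\infty}^{\infty} \left|\frac{\partial^r B(x, t)}{\partial x^r}\right|^2 dx < \infty, \quad r = 0, 1, \ldots,$$
for all $t \in \mathbb{R}$. Thus, the functions $B(\cdot, t)$, $B_x(\cdot, t)$,
and $B_{xx}(\cdot, t)$ belong to $X = L_2^{\mathbb{C}}(\mathbb{R})$. Hence
$\psi(t) \in D(H_0)$.
Proof of (b). For all $t \in \mathbb{R}$,
$$\lim_{h \to 0} \left\|\frac{\psi(t + h) - \psi(t)}{h} - B_t(\cdot, t)\right\|^2 = \lim_{h \to 0} \int_{-\infty}^{\infty} \left|\frac{B(x, t + h) - B(x, t)}{h} - B_t(x, t)\right|^2 dx = 0.$$
This follows by using a majorant of type (M) (cf. Parameter Integrals in
the appendix).
Proof of (c). Similarly, for all $t \in \mathbb{R}$,
$$\lim_{h \to 0} \|\psi'(t + h) - \psi'(t)\|^2 = \lim_{h \to 0} \int_{-\infty}^{\infty} |B_t(x, t + h) - B_t(x, t)|^2\,dx = 0.$$
Proof of (d). Observe that $B$ is a classical solution of the Schrödinger
equation (209*).
In summary, $\psi$ represents a $C^1$-solution of the Schrödinger equation
(209**). Therefore, assertion (ii) follows from Theorem 5.G in Section 5.13.
□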
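The two assertions of Proposition 3 can be probed numerically for a single Gauss function. The sketch below works with the closed form $B(x, t) = (1 + 4i\beta t)^{-1/2} e^{-\beta(x-\alpha)^2/(1+4i\beta t)}$ from the proof; a direct computation shows that this expression satisfies $B_t = iB_{xx}$, i.e., (209*) with $a = 1$. The parameter values, the finite-difference step, and the integration interval are hypothetical choices. `cmath.sqrt` returns the principal branch, which is the branch with $|\arg| < \pi/4$ selected in the proof because the real part of $1 + 4i\beta t$ is positive.

```python
import cmath
import math

ALPHA, BETA = 0.3, 0.8     # illustrative parameters of the Gauss function

def B(x, t):
    """Closed form of (A u_{alpha,beta})(x, t) from the proof of (i)."""
    s = 1.0 + 4.0j * BETA * t
    return cmath.exp(-BETA * (x - ALPHA) ** 2 / s) / cmath.sqrt(s)

def residual(x, t, h=1e-3):
    """|B_t - i B_xx| with central finite differences."""
    b_t = (B(x, t + h) - B(x, t - h)) / (2.0 * h)
    b_xx = (B(x + h, t) - 2.0 * B(x, t) + B(x - h, t)) / h**2
    return abs(b_t - 1j * b_xx)

def norm_squared(t, half_width=60.0, h=0.05):
    """Riemann sum for the integral of |B(x, t)|^2 over the real line."""
    n = int(round(2.0 * half_width / h))
    return h * sum(abs(B(ALPHA - half_width + j * h, t)) ** 2
                   for j in range(n + 1))

# The quantum motion is unitary, so the norm must not depend on t.
exact_norm_squared = math.sqrt(math.pi / (2.0 * BETA))
```

The residual is of the size of the discretization error for positive and negative times, and `norm_squared(t)` equals $\sqrt{\pi/(2\beta)}$ for every $t$, as it must for a motion generated by a unitary group.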
Remark 4 (The generalized quantum dynamics). Let the initial state
$\psi_0 \in X$ be given. Then the corresponding quantum motion reads as follows:
$$\psi(t) := U(t)\psi_0 \quad \text{for all } t \in \mathbb{R}$$
with the unitary operator $U(t) := e^{-itH_0/\hbar}$. Hence $\|U(t)\| = 1$ for
all $t \in \mathbb{R}$.
Since the set span $D$ is dense in the Hilbert space $X$, there exists a
sequence $(\psi_{0n})$ in span $D$ such that $\psi_{0n} \to \psi_0$ in $X$ as
$n \to \infty$. This implies
$$\psi(t) = \lim_{n \to \infty} (A\psi_{0n})(\cdot, t) \quad \text{in } X,$$
for all $t \in \mathbb{R}$. In fact,
$$\|\psi(t) - (A\psi_{0n})(\cdot, t)\| = \|U(t)(\psi_0 - \psi_{0n})\| \le \|U(t)\|\,\|\psi_0 - \psi_{0n}\| = \|\psi_0 - \psi_{0n}\| \to 0 \quad \text{as } n \to \infty.$$
The preceding considerations show that the formal integral expression

for (A?fo)(x, t) corresponds to the rigorous quantum dynamics provided we
use analytic continuation and the approximation argument from Remark
4.
It happens frequently that formal solutions lead to rigorous solutions
after discovering the proper mathematical interpretation of the formal so-
lution.
5.23 Applications to Feynman's Path Integral

Dick Feynman (winner of the Nobel Prize in physics in 1965) was a
profoundly original scientist. He refused to take anybody's word for
anything. This meant that he was forced to rediscover or reinvent
for himself almost the whole of physics. It took him five years of
concentrated work to reinvent quantum mechanics. At the end, he
had a version of quantum mechanics that he could understand. The
calculations I did for Hans Bethe, using the orthodox theory (via the
Schrodinger equation), took me several months of work and several
hundred sheets of paper. Dick could get the same answer, calculating
on a blackboard, in half an hour.
In orthodox physics it can be said: Suppose an electron is in this
state at a certain time, then you calculate what it will do next by
solving a certain differential equation (the Schrodinger equation from
1926). Instead of this, Dick said simply: "The electron does whatever
it likes." A history of the electron is any possible path in space and
time. The behavior of the electron is just the result of adding together
all the histories according to some simple rules that Dick worked out.
I had the enormous luck to be at Cornell in 1948 when the idea was
newborn, and to be for a short time Dick's sounding board ....
Dick distrusted my mathematics and I distrusted his intuition.
Dick fought against my scepticism, arguing that Einstein had failed
because he stopped thinking in concrete physical images and became
a manipulator of equations. I had to admit that was true. The great
discoveries of Einstein's earlier years were all based on direct physical
intuition. Einstein's later unified theories failed because they were
only sets of equations without physical meaning ....
Nobody but Dick could use his theory. Without success I tried to
understand him .... For two weeks I had not thought about physics,
and then it came bursting into my consciousness like an explosion.
Feynman's pictures and Schwinger's equations began sorting them-
selves out in my head with a clarity they had never had before. I
had no pencil or paper, but everything was so clear I did not need to
write it down. Feynman and Schwinger were just looking at the same
set of ideas from two different sides. Putting their methods together,
you would have a theory of quantum electrodynamics that combined
the mathematical precision of Schwinger with the practical flexibility
of Feynman.
Freeman J. Dyson in Disturbing the Universe

(Harper & Row, New York, 1979)
In this section, we only use purely formal arguments in the spirit of the
two great physicists Dirac and Feynman. We hope that our detailed pre-
sentation helps mathematicians to understand the thoughts of physicists.

Our goal is to compute the Green function $G = G(x, t; y, s)$.
If the Green function $G$ is known, then the process $u = u(x, t)$ is known
for each initial state $u(x, s) = u_0(x)$ at the initial time $s$, namely,
$$u(x, t) = \int_{-\infty}^{\infty} G(x, t; y, s)u(y, s)\,dy \quad \text{for all } x \in \mathbb{R},\ t \ge s. \tag{210}$$
This will be shown later. Therefore, the Green function plays a fundamental
role in physics. It is decisive that we will compute the Green function
without referring to the results from Section 5.22. To this end, we will use
the Dirac calculus and the Euclidean strategy. Recall from Section 5.21 that
$$\langle x \mid p\rangle = \varphi_p(x) = (2\pi)^{-1/2} e^{ipx} \quad \text{for all } x, p \in \mathbb{R}. \tag{211}$$
5.23.1 Diffusion and the Wiener Path Integral

Let us first consider the following generalized diffusion equation:
$$u_t(x, t) = a\,u_{xx}(x, t) - U(x)u(x, t) \quad \text{for all } x \in \mathbb{R},\ t > s, \tag{212}$$
$$u(x, s) = u_0(x) \quad \text{for all } x \in \mathbb{R},$$
and fixed initial time $s$. Here, $u(x, t)$ denotes the mass density at the
point $x$ at time $t$, and $a > 0$ is the so-called diffusion coefficient.
Introducing the operator
$$(Hv)(x) := -av''(x) + U(x)v(x) \quad \text{for all } x \in \mathbb{R},$$
equation (212) can be written as an operator equation of the following
form:
$$u'(t) = -Hu(t) \quad \text{for all } t > s, \qquad u(s) = u_0 \tag{212*}$$
with the solution
$$u(t) = S(t, s)u_0 \quad \text{for all } t \ge s,$$
where $S(t, s) := e^{-(t-s)H}$. Here, $u(t)$ stands for the function
$x \mapsto u(x, t)$ on $\mathbb{R}$.
Formal Definition 1. The function $G$ defined through
$$G(x, t; y, s) := \langle x|S(t, s)|y\rangle \quad \text{for all } x, y \in \mathbb{R},\ t \ge s$$
is called the Green function (or the propagator) of the diffusion equation
(212).
Formal Proposition 2. For given $u_0$, equation (210) yields the solution
$u = u(x, t)$ of the initial-value problem (212) for the generalized diffusion
equation.
Formal Proof. By the Dirac calculus, $\sum_y |y\rangle\langle y| = I$. Hence
$$u(x, t) = \langle x \mid u(t)\rangle = \langle x|S(t, s)|u_0\rangle = \sum_y \langle x|S(t, s)|y\rangle\langle y \mid u_0\rangle = \int_{-\infty}^{\infty} G(x, t; y, s)u_0(y)\,dy.$$
□
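Formal Proposition 3. The Green function $G$ satisfies the following
fundamental relation:
$$G(x, t; y, s) = \int_{-\infty}^{\infty} G(x, t; z, \tau)G(z, \tau; y, s)\,dz \tag{213}$$
for all $x, y \in \mathbb{R}$ and all times $t \ge \tau \ge s$.
Formal Proof. Observe that
$$S(t, s) = S(t, \tau)S(\tau, s) \quad \text{if } t \ge \tau \ge s. \tag{214}$$
Hence
$$\langle x|S(t, s)|y\rangle = \langle x|S(t, \tau)S(\tau, s)|y\rangle = \sum_z \langle x|S(t, \tau)|z\rangle\langle z|S(\tau, s)|y\rangle.$$
This is (213). □
The proof shows that (213) can be regarded as a localized version of the
semigroup property (214).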
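Formal Proposition 4. Let $\Delta t > 0$ be small. Then, for all
$x, y, t \in \mathbb{R}$,
$$G(x, t + \Delta t; y, t) = (2\pi)^{-1} \int_{-\infty}^{\infty} e^{-\Delta t(ap^2 + U(y)) + i(x-y)p}\,dp, \tag{215}$$
up to terms of order $(\Delta t)^2$.
Formal Proof. Since $e^{-\Delta t H} = I - \Delta t \cdot H + \frac{(\Delta t)^2 H^2}{2} - \ldots$, we get
$$G(x, t + \Delta t; y, t) = \langle x|e^{-\Delta t H}|y\rangle = \langle x \mid y\rangle - \Delta t\langle x|H|y\rangle + O((\Delta t)^2).$$
It follows from $H = U - a\frac{d^2}{dx^2}$ that
$$A := \langle x \mid y\rangle - \Delta t\langle x|H|y\rangle = \langle x \mid y\rangle - \Delta t\langle x|U|y\rangle + \Delta t \sum_{p,p'} \langle x \mid p'\rangle\langle p'|\,a\frac{d^2}{dx^2}\,|p\rangle\langle p \mid y\rangle.$$
Observing $|y\rangle = \delta_y$, we obtain
$\langle x|U|y\rangle = U(y)\langle x \mid y\rangle$ and
$$\langle p'|\frac{d^2}{dx^2}|p\rangle = (\varphi_{p'} \mid \varphi_p'') = -p^2(\varphi_{p'} \mid \varphi_p) = -p^2\langle p' \mid p\rangle = -p^2\delta(p' - p),$$
as well as $\langle x \mid y\rangle = \sum_p \langle x \mid p\rangle\langle p \mid y\rangle$. Hence
$$A = \sum_p \langle x \mid p\rangle\langle p \mid y\rangle\left(1 - \Delta t(ap^2 + U(y))\right) = \sum_p \langle x \mid p\rangle\langle p \mid y\rangle e^{-\Delta t(ap^2 + U(y))} + O((\Delta t)^2).$$
Thus, up to terms of order $(\Delta t)^2$,
$$G(x, t + \Delta t; y, t) = A = \int_{-\infty}^{\infty} \varphi_p(x)\overline{\varphi_p(y)}\,e^{-\Delta t(ap^2 + U(y))}\,dp.$$
This is (215). □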
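Formal Proposition 5. If we set $\Delta t := \frac{t - s}{n + 1}$,
$x_{n+1} := x$, $x_0 := y$, and
$$S := \sum_{j=1}^{n+1} \left\{ i\left(\frac{x_j - x_{j-1}}{\Delta t}\right)p_j - ap_j^2 - U(x_{j-1}) \right\}(-i\Delta t), \tag{216}$$
then
$$G(x, t; y, s) = \lim_{n \to \infty} (2\pi)^{-n-1} \int_{-\infty}^{\infty} e^{iS}\,dx_1 \cdots dx_n\,dp_1 \cdots dp_{n+1} \tag{217}$$
for all $x, y \in \mathbb{R}$ and $t > s$.
Formal Proof. Let $t_k := k\Delta t + s$. Then,
$s := t_0 < t_1 < \ldots < t_{n+1} := t$. By (213),
$$G(x, t; y, s) = \int_{-\infty}^{\infty} G(x, t; x_n, t_n)G(x_n, t_n; x_{n-1}, t_{n-1}) \cdots G(x_1, t_1; y, s)\,dx_n \cdots dx_1.$$
Using (215) and letting $\Delta t \to 0$, we obtain (217). □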
Formal Example 6. If $U \equiv 0$, then
$$G(x, t; y, s) = (4\pi a(t - s))^{-1/2}\, e^{-\frac{(x-y)^2}{4a(t-s)}}. \tag{218}$$
Formal Proof. By Example 2 in Section 3.7, the Fourier transformation
yields the following rigorous formula:
$$(2\pi)^{-1} \int_{-\infty}^{\infty} e^{-i\beta p} e^{-ap^2}\,dp = (4\pi a)^{-1/2} e^{-\frac{\beta^2}{4a}}$$
for all $a > 0$ and $\beta \in \mathbb{R}$. Furthermore, we will use the formal
relation^15
$$(2\pi)^{-1} \int_{-\infty}^{\infty} e^{ix(k'-k)}\,dx = \sum_x \langle k \mid x\rangle\langle x \mid k'\rangle = \delta(k - k').$$
By (216),
$$iS = ip_{n+1}x_{n+1} - ip_1x_0 + i\sum_{j=1}^{n} x_j(p_j - p_{j+1}) - \sum_{j=1}^{n+1} ap_j^2\Delta t,$$
where $(n + 1)\Delta t = t - s$. Hence
$$A := (2\pi)^{-n-1} \int_{-\infty}^{\infty} e^{iS}\,dx_1 \cdots dx_n\,dp_1 \cdots dp_{n+1} = (2\pi)^{-n-1} \int_{-\infty}^{\infty} e^{i(p_{n+1}x - p_1y)} \prod_{j=1}^{n+1} e^{-ap_j^2\Delta t}\,dp_j \prod_{r=1}^{n} e^{ix_r(p_r - p_{r+1})}\,dx_r.$$
Integrating first over $dx_r$, we get
$$A = (2\pi)^{-1} \int_{-\infty}^{\infty} e^{i(p_{n+1}x - p_1y)} \prod_{j=1}^{n+1} e^{-ap_j^2\Delta t}\,dp_j \prod_{r=1}^{n} \delta(p_{r+1} - p_r).$$
Observing that $\int_{-\infty}^{\infty} \delta(p - q)f(p)\,dp = f(q)$, integration
over $p_{n+1}, p_n, \ldots, p_2$ yields
$$A = (2\pi)^{-1} \int_{-\infty}^{\infty} e^{ip_1(x-y)} e^{-ap_1^2(n+1)\Delta t}\,dp_1 = (4\pi a(t - s))^{-1/2}\, e^{-\frac{(x-y)^2}{4a(t-s)}}.$$
The assertion follows now from (217). □

Observe the following:
The Green function $G$ from (218) coincides with the classical Green
function from Proposition 1 in Section 5.22.
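For $U \equiv 0$ the composition rule (213) can be verified numerically with the explicit kernel (218). The sketch below is an illustration under stated assumptions: the integral over the intermediate point $z$ is truncated to the hypothetical interval $[-20, 20]$ and replaced by a Riemann sum, and the sample points and times are arbitrary.

```python
import math

def green(x, t, y, s, a=1.0):
    """Free Green function (218); requires t > s."""
    tau = t - s
    return math.exp(-(x - y) ** 2 / (4.0 * a * tau)) / math.sqrt(4.0 * math.pi * a * tau)

def compose(x, t, y, s, tau, a=1.0, half_width=20.0, h=0.01):
    """Right-hand side of (213): integral over the intermediate point z."""
    n = int(round(2.0 * half_width / h))
    total = 0.0
    for j in range(n + 1):
        z = -half_width + j * h
        total += green(x, t, z, tau, a) * green(z, tau, y, s, a)
    return h * total

direct = green(0.5, 1.0, -0.3, 0.0)
via_intermediate_time = compose(0.5, 1.0, -0.3, 0.0, tau=0.4)
```

The two values agree up to rounding. Iterating this composition over $n$ intermediate times is the step that turns (213) into the multiple integral (217).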
^15 The rigorous version of this formula can be found in Standard Example 5 of
Section 3.7.

Definition 7. We are given $\Delta x > 0$ and $\Delta p > 0$. By a discrete
path, we understand a curve
$$x = x_n(\tau), \quad p = p_n(\tau), \quad s \le \tau \le t,$$
where $x_n, p_n: [s, t] \to \mathbb{R}$ are piecewise linear, continuous
functions such that
$$x_n(t_j) = \text{integer} \cdot \Delta x, \quad p_n(t_j) = \text{integer} \cdot \Delta p \quad \text{for } j = 1, \ldots, n.$$
Furthermore, we postulate that $x_n(\cdot)$ connects the fixed initial point
$y$ with the fixed end point $x$, i.e., we set
$$x_n(s) := y, \quad x_n(t) := x, \quad \text{and} \quad p_n(s) := 0, \quad p_n(t) := \text{integer} \cdot \Delta p.$$
Let $\mathcal{P}_n$ denote the set of all these paths. According to (216), we
define
$$S(x_n, p_n) := \sum_{j=1}^{n+1} \left\{ i\left(\frac{x_{n,j} - x_{n,j-1}}{\Delta t}\right)p_{n,j} - ap_{n,j}^2 - U(x_{n,j-1}) \right\}(-i\Delta t). \tag{219}$$
Here, $x_{n,j} := x_n(t_j)$ and $p_{n,j} := p_n(t_j)$.
By an admissible path, we understand a curve
$$x = x(\tau), \quad p = p(\tau), \quad s \le \tau \le t,$$
which connects the fixed initial point $y$ and the fixed end point $x$, i.e.,
$x(s) := y$ and $x(t) := x$. Furthermore, let $p(s) := 0$. In addition, we
demand that the integral
$$S(x(\cdot), p(\cdot)) := \int_s^t \left\{ ix'(\tau)p(\tau) - ap(\tau)^2 - U(x(\tau)) \right\}(-i)\,d\tau \tag{220}$$
exists. The set of all these admissible paths is denoted by $\mathcal{P}$.
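Formal Observation 8 (The path integral). Replace the integrals
$\int \ldots dx$ and $\int \ldots dp$ with sums $\sum \ldots \Delta x$ and
$\sum \ldots \Delta p$, respectively. Then, the integral from (217) has to be
replaced by
$$J := \sum e^{iS} (\Delta x)^n \left(\frac{\Delta p}{2\pi}\right)^{n+1}.$$
Here, the function $S$ has to be taken at all the possible node points, i.e.,
$$x_j := m\Delta x, \quad p_r := k\Delta p, \quad \text{for all integers } m, k,$$
and $j = 1, \ldots, n$, $r = 1, \ldots, n + 1$. Now to the point. A simple
combinatorial argument shows that
$$J = \sum_{(x_n, p_n) \in \mathcal{P}_n} e^{iS(x_n, p_n)} (\Delta x)^n \left(\frac{\Delta p}{2\pi}\right)^{n+1},$$
where we sum over all discrete paths $(x_n, p_n)$. Therefore, from (217) we
get the following fundamental formula:
$$G(x, t; y, s) = \lim \sum_{(x_n, p_n) \in \mathcal{P}_n} e^{iS(x_n, p_n)} (\Delta x)^n \left(\frac{\Delta p}{2\pi}\right)^{n+1}, \tag{221}$$
where the symbol "lim" stands for the following formal limiting process:
$$n \to \infty, \quad \Delta t \to 0, \quad \Delta x \to 0, \quad \Delta p \to 0.$$
Instead of (221), physicists write
$$G(x, t; y, s) = \int_{\mathcal{P}} e^{iS(x(\cdot), p(\cdot))}\,\mathcal{D}x\,\mathcal{D}p. \tag{221*}$$
Here, the "integral" has to be taken over all paths
$(x(\cdot), p(\cdot)) \in \mathcal{P}$ and $\mathcal{D}x\,\mathcal{D}p$ denotes a
"measure" on the path space $\mathcal{P}$.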
Remark 9 (Rigorous approach). The integral from (221*) can be given
a precise meaning by constructing a measure on $\mathcal{P}$. This is the
so-called Wiener measure introduced by Norbert Wiener in 1923. In this paper,
Wiener created a mathematical theory for the Brownian motion. In terms
of physics, the Brownian motion was first studied by Einstein in 1905. A
rigorous justification of the so-called Feynman-Kac formula (221*) can be
found in Reed and Simon (1972), Vol. 2, Section X.11. See also Albeverio
and Brezniak (1993) for a rigorous justification of the Feynman path
integral on the level of quantum mechanics.
5.23.2 Quantum Mechanics and the Feynman Path Integral

Parallel to the generalized diffusion equation (212), let us consider the
Schrödinger equation
$$-i\hbar u_t(x, t) = a\,u_{xx}(x, t) - U(x)u(x, t) \quad \text{for all } x, t \in \mathbb{R},\ t > s, \tag{222}$$
$$u(x, s) = u_0(x) \quad \text{for all } x \in \mathbb{R},$$
for the fixed initial time $s$. Here, $a := \frac{\hbar^2}{2m}$. According to
(141), this equation describes the motion of a particle of mass $m$ on the
real line, under the action of the force $-U'(x)$ at the point
$x \in \mathbb{R}$ in direction of the positive $x$-axis. Recall that
$$\int_c^d |u(x, t)|^2\,dx = \text{probability of finding the particle in the interval } [c, d], \tag{223}$$
provided $\int_{-\infty}^{\infty} |u(x, t)|^2\,dx = 1$. The Hamiltonian (i.e.,
the energy operator) is given by
$$H := \frac{P^2}{2m} + U,$$
where $P := -i\hbar\frac{d}{dx}$ denotes the momentum operator. Hence
$$(Hv)(x) = -av''(x) + U(x)v(x),$$
i.e., the Hamiltonian coincides with the operator $H$ introduced in Section
5.23.1 for the diffusion equation. Using the Hamiltonian, the solution of
(222) is given by
$$u(t) = e^{-i(t-s)H/\hbar}u_0 \quad \text{for all } t \in \mathbb{R}.$$
First choose such physical units that $\hbar = 1$. Then, all the formulas
from Section 5.23.1 can be applied to (222) if we replace the real time $t$
by the imaginary time $it$.
In order to get formulas that display the role of Planck's quantum of
*
action fi, forget about the convention fi = 1. Then, the rescaling t ~
and p ~ yields the following basic formulas:
*
(i) The Green function G of the Schrödinger equation (222) is defined
through
$$G(x,t;y,s) := \langle x \mid S(t,s) \mid y \rangle \quad \text{for all } x, y, t, s \in \mathbb{R}, \tag{224}$$
where $S(t,s) := e^{-i(t-s)H/\hbar}$.
(ii) The solution of the initial-value problem (222) for the Schrödinger
equation is given by
$$u(x,t) = \int_{-\infty}^{\infty} G(x,t;y,s)u_0(y)\,dy \quad \text{for all } x, t \in \mathbb{R}. \tag{225}$$
(iii) By definition, the set P of admissible paths is given by the curves
(226)
which connect the fixed initial point y and the fixed end point x, i.e.,
x(s) := y and x(t) := x. Furthermore, we postulate that p(s) = 0 and that
the integral
(227)
exists.
(iv) In terms of classical mechanics, the integral (227) represents the
action¹⁶ of the admissible path $(x(\cdot),p(\cdot))$. Here, x(t) := position at time
t, p(t) := momentum at time t, and
$$\mathcal{H}(t) := \frac{p(t)^2}{2m} + U(x(t)) = \text{energy at time } t.$$
¹⁶ The physical meaning of action is discussed in great detail in Zeidler (1986),
Vol. 4, Sections 58.19ff. Observe the following peculiarity. If x = x(t) describes the
motion of a particle of mass m, then the momentum at time t is given by p(t) :=
mx'(t), where x'(t) equals the velocity of the particle at time t. However, we
also consider such paths (224) in the (x,p)-phase space, where $p(\cdot)$ is completely
independent of $x(\cdot)$.
(v) The Green function G can be expressed by the following Feynman
path integral:
$$G(x,t;y,s) = \int_P e^{iS(x(\cdot),p(\cdot))/\hbar}\,\mathcal{D}x\,\mathcal{D}p, \tag{F}$$
where we "integrate" over all admissible paths $(x(\cdot),p(\cdot))$.

The Feynman formula (F) is one of the most wonderful formulas of
physics.
In fact, (F) explains the relation between classical mechanics and quan-
tum mechanics. Namely, according to (223) and (i), the Green function G
describes
the propagation of probability
in quantum mechanics. This propagation is obtained in the following way.
Let the particle perform
all the admissible paths $x = x(\tau)$, $p = p(\tau)$ in the (x,p)-phase space.
Then, by (F), the Green function is
the mean value over all the numbers $e^{iS(x(\cdot),p(\cdot))/\hbar}$,
where S denotes the classical action along the admissible path $x = x(\tau)$,
$p = p(\tau)$ in the (x,p)-phase space. This action is measured in units of $\hbar$,
which corresponds to
the quantization of action in quantum mechanics.
This approach goes back to Feynman's Princeton dissertation in 1942.

Feynman's universal method also works in quantum field theory. This can
be found in Kaku (1993), Sterman (1993), and in Zeidler (1986), Vol. 5,
Chapter 92.
Formal Example 10 (Free motion). Let $U \equiv 0$. Then, the corresponding
Green function $G_0$ is given by
$$G_0(x,t;y,s) = \left(\frac{2\pi\hbar i(t-s)}{m}\right)^{-1/2} e^{\frac{im(x-y)^2}{2\hbar(t-s)}}.$$
Formal Proof. This follows from Formal Example 5 by letting $\alpha := \hbar^2/2m$
and replacing t with $it/\hbar$ according to our general strategy. □
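The formula of Formal Example 10 can be checked numerically. The following is only a sketch and not part of the original text: it assumes $y = s = 0$ by default, uses the principal branch of the complex square root, and replaces derivatives by central differences with a hypothetical step size `h`. The function names are illustrative. It evaluates the residual $i\hbar\,\partial_t G_0 + \frac{\hbar^2}{2m}\,\partial_x^2 G_0$, which should vanish up to discretization error for $t > s$.

```python
import cmath
import math


def free_green(x, t, y=0.0, s=0.0, m=1.0, hbar=1.0):
    """Free-particle Green function G_0(x, t; y, s) for t > s (principal branch)."""
    tau = t - s
    prefactor = (2j * math.pi * hbar * tau / m) ** -0.5
    return prefactor * cmath.exp(1j * m * (x - y) ** 2 / (2 * hbar * tau))


def schrodinger_residual(x, t, h=1e-3, m=1.0, hbar=1.0):
    """Central-difference estimate of i*hbar*G_t + (hbar^2 / 2m)*G_xx at (x, t)."""
    g = lambda xx, tt: free_green(xx, tt, m=m, hbar=hbar)
    g_t = (g(x, t + h) - g(x, t - h)) / (2 * h)
    g_xx = (g(x + h, t) - 2 * g(x, t) + g(x - h, t)) / h ** 2
    return 1j * hbar * g_t + hbar ** 2 / (2 * m) * g_xx


if __name__ == "__main__":
    # The residual should be small compared with |G_0| itself.
    print(abs(schrodinger_residual(0.3, 1.0)), abs(free_green(0.3, 1.0)))
```

The residual is only as small as the finite-difference error allows, so it should be compared with the size of $G_0$ rather than with zero.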
5.24 The Importance of the Propagator in Quantum Physics
In the following we want to show that
The propagator S(·,·) contains all the information about the quantum
system.
Namely, this information includes
(i) time evolution (the Dyson formula);
(ii) bound states of fixed energy E; and
(iii) transition probabilities.
We will apply this to
(a) time-dependent scattering theory, the Feynman diagrams, and the
Heisenberg S-matrix; and
(b) time-independent scattering theory and the Lipman-Schwinger inte-
gral equation.
Furthermore, observe that
The knowledge of the propagator S(·,·) is equivalent to the knowledge of
the Green function G.
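In fact, if the propagator S(t, s) is known, then by definition the Green
function G is given through
$$G(x,t;y,s) := \langle x \mid S(t,s) \mid y \rangle.$$
Conversely, let the Green function be given, and let $\{\phi_\alpha\}$ be a complete
orthonormal system. Then
$$S(t,s)\phi := \sum_\alpha S_\alpha(t,s)\phi_\alpha,$$
where
$$S_\alpha(t,s) := \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \overline{\phi_\alpha(x)}\,G(x,t;y,s)\,\phi(y)\,dx\,dy.$$
This follows from
$$S_\alpha(t,s) = \langle \phi_\alpha \mid S(t,s) \mid \phi \rangle = \sum_{x,y} \langle \phi_\alpha \mid x \rangle \langle x \mid S(t,s) \mid y \rangle \langle y \mid \phi \rangle.$$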
The following considerations possess a "universal" character. They can

be generalized directly to three-dimensional quantum mechanics. Moreover,
the same approach also applies to quantum field theory. This can be found
in Mandl and Shaw (1989), and in Zeidler (1986), Vol. 5, Chapters 89ff.
As in the preceding section, we restrict ourselves to purely formal argu-
ments.
5.24.1 Time Evolution

Let us consider the Schrödinger equation
$$i\hbar u_t = H(t)u \quad \text{for all } x \in \mathbb{R},\ t > s,$$
$$u(x,s) = u_0(x) \quad \text{for all } x \in \mathbb{R}, \tag{228}$$
where
$$H(t) := H_0 + U(\cdot,t),$$
and $H_0 := -\frac{\hbar^2}{2m}\frac{d^2}{dx^2}$. Observe that the potential U = U(x, t) may depend on
time t. Let u = u(t) be the solution of (228). Then, the propagator S(t, s)
is defined through
$$u(t) = S(t,s)u_0. \tag{229}$$
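Formal Theorem 1. The propagator is given through the fundamental
Dyson formula:
$$S(t,s) = I + \sum_{n=1}^{\infty} \frac{1}{(i\hbar)^n n!} \int_s^t \cdots \int_s^t T\big(H(t_1)\cdots H(t_n)\big)\,dt_1\cdots dt_n \tag{230}$$
for all $t, s \in \mathbb{R}$ with t > s.
Here, T denotes the chronological operator, i.e.,
$$T\big(H(t_1)H(t_2)\big) := \begin{cases} H(t_1)H(t_2) & \text{if } t_1 \ge t_2, \\ H(t_2)H(t_1) & \text{if } t_2 \ge t_1. \end{cases}$$
More generally,
$$T\big(H(t_1)\cdots H(t_n)\big) := H(t_{1'})\cdots H(t_{n'}),$$
where $t_{1'}, \ldots, t_{n'}$ denotes a permutation of $t_1, \ldots, t_n$ with $t_{1'} \ge t_{2'} \ge \cdots \ge t_{n'}$.
Instead of (230), physicists formally write
$$S(t,s) = T\exp\left(\frac{1}{i\hbar}\int_s^t H(\tau)\,d\tau\right). \tag{230*}$$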
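Formal Proof. From (228) we get the integral equation
$$u(t) = u_0 + (i\hbar)^{-1}\int_s^t H(\tau)u(\tau)\,d\tau.$$
Using the iteration method
$$u_{n+1}(t) = u_0 + (i\hbar)^{-1}\int_s^t H(\tau)u_n(\tau)\,d\tau, \quad n = 0, 1, \ldots,$$
we obtain
$$u(t) = u_0 + \sum_{n=1}^{\infty} (i\hbar)^{-n} \int H(t_1)\cdots H(t_n)u_0,$$
where $\int := \int_s^t dt_1 \int_s^{t_1} dt_2 \cdots \int_s^{t_{n-1}} dt_n$. This yields (230). In fact, for example, we get
$$\int_s^t\int_s^{t_1} dt_1\,dt_2\,H(t_1)H(t_2) = \int_s^t\int_s^t H(t_1)H(t_2)\,\theta(t_1 - t_2)\,dt_1\,dt_2,$$
where $\theta(x) := 1$ if $x \ge 0$ and $\theta(x) := 0$ if $x < 0$. Using a permutation of
indices, we get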
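Formal Proposition 2. The evolution operator S(t, s) is unitary for all
times t and s.
Formal Proof. By (229),
$$\frac{d}{dt}\langle u(t) \mid u(t)\rangle = \langle u'(t) \mid u(t)\rangle + \langle u(t) \mid u'(t)\rangle$$
$$= \langle (i\hbar)^{-1}Hu(t) \mid u(t)\rangle + \langle u(t) \mid (i\hbar)^{-1}Hu(t)\rangle$$
$$= -(i\hbar)^{-1}\langle Hu(t) \mid u(t)\rangle + (i\hbar)^{-1}\langle u(t) \mid Hu(t)\rangle = 0.$$
Hence $\langle u(t) \mid u(t)\rangle = \langle u_0 \mid u_0\rangle$ for all $t \in \mathbb{R}$. □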
5.24.2 States with Sharp Energy

Suppose that the potential U is independent of time t. Then, it follows
from (230) that
$$S(t,s) = I + \sum_{n=1}^{\infty} \frac{1}{(i\hbar)^n n!}(t-s)^n H^n = e^{-i(t-s)H/\hbar}.$$
Recall that by definition the state $\phi$ is called a state of sharp energy E iff
$H\phi = E\phi$, i.e.,
$$i\hbar S_t(0,0)\phi = E\phi. \tag{231}$$
In terms of the Green function G(x, t; y, s), this means that
$$i\hbar \sum_y \frac{\partial}{\partial t}\langle x \mid S(t,0) \mid y\rangle\langle y \mid \phi\rangle\Big|_{t=0} = E\langle x \mid \phi\rangle,$$
i.e., equation (231) is equivalent to
$$i\hbar \int_{-\infty}^{\infty} G_t(x,0;y,0)\phi(y)\,dy = E\phi(x) \quad \text{for all } x \in \mathbb{R}.$$
5.24.3 Transition Amplitude and Transition Probabilities

Formal Definition 3. Let $\phi$ and $\psi$ be two states. Suppose that the quantum
system is in the state $\phi$ at time s. Then
$$|\langle \psi \mid S(t,s) \mid \phi\rangle|^2 = \text{probability of finding the system in the state } \psi \text{ at time } t. \tag{232}$$
Moreover, the complex number $\langle \psi \mid S(t,s) \mid \phi\rangle$ is called the transition
amplitude from the state $\phi$ at time s to the state $\psi$ at time t.
Definition (232) makes sense since S(t, s) is unitary. Hence
$\|S(t,s)\phi\| = \|\phi\| = 1$, i.e., $S(t,s)\phi$ is a state.
Formal Proposition 4. We have
$$\langle \psi \mid S(t,s) \mid \phi\rangle = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \overline{\psi(x)}\,G(x,t;y,s)\,\phi(y)\,dx\,dy.$$
Formal Proof. By the Dirac calculus,
$$\langle \psi \mid S(t,s) \mid \phi\rangle = \sum_{x,y}\langle \psi \mid x\rangle\langle x \mid S(t,s) \mid y\rangle\langle y \mid \phi\rangle. \quad \Box$$
Formal Proposition 5 (The fundamental Feynman relation for transition
amplitudes). Let $\{\phi_\alpha\}$ be a "complete orthonormal system" in the sense of
the Dirac calculus, i.e., $\sum_\alpha |\phi_\alpha\rangle\langle\phi_\alpha| = I$. Then
$$\langle \psi \mid S(t,s) \mid \phi\rangle = \sum_\alpha \langle \psi \mid S(t,a) \mid \phi_\alpha\rangle\langle \phi_\alpha \mid S(a,s) \mid \phi\rangle \quad \text{if } s \le a \le t.$$
Formal Proof. By (228) and (229),
$$S(t,s)\phi = S(t,a)[S(a,s)\phi].$$
Hence S(t, s) = S(t, a)S(a, s). Now use the Dirac calculus. □
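Unitarity (Formal Proposition 2) and the composition rule $S(t,s) = S(t,a)S(a,s)$ can be illustrated in finite dimensions. The sketch below is not from the original text. It assumes $\hbar = 1$, replaces the Hamiltonian by a hypothetical Hermitian $2 \times 2$ matrix $H(\tau) = \sigma_z + \tau\sigma_x$ whose values at different times do not commute, and approximates $S(t,s)$ by a chronologically ordered product of short-time exponentials (later times to the left), in the spirit of (230*). All function names are illustrative.

```python
def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def expm(a, terms=25):
    """Matrix exponential by Taylor series; adequate for matrices of small norm."""
    result = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for n in range(1, terms):
        term = [[z / n for z in row] for row in matmul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result


def hamiltonian(tau):
    """Hypothetical Hermitian H(tau) = sigma_z + tau * sigma_x (an assumption)."""
    return [[1 + 0j, tau + 0j], [tau + 0j, -1 + 0j]]


def propagator(s, t, steps):
    """Time-ordered product of exp(-i * dt * H(tau_k)), later times on the left."""
    dt = (t - s) / steps
    prop = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for k in range(steps):
        tau = s + (k + 0.5) * dt
        step = expm([[-1j * dt * z for z in row] for row in hamiltonian(tau)])
        prop = matmul(step, prop)
    return prop


def unitarity_defect(u):
    """Largest entry of |U^* U - I|."""
    udag = [[u[j][i].conjugate() for j in range(2)] for i in range(2)]
    prod = matmul(udag, u)
    return max(abs(prod[i][j] - (1 if i == j else 0))
               for i in range(2) for j in range(2))


def max_difference(a, b):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))
```

With equal step sizes, `propagator(0, 2, 2000)` and `matmul(propagator(1, 2, 1000), propagator(0, 1, 1000))` are built from the same factors in the same order, so they agree up to rounding; this mirrors the formal argument rather than proving it.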
Formal Example 6 (Complete orthonormal systems). First let {¢a} be
a complete orthonormal system, i.e.,
Set
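$$W_{\alpha,\beta} := \text{transition probability from the state } \phi_\beta \text{ at time } s \text{ to the state } \phi_\alpha \text{ at time } t.$$
Suppose that
$$\langle \phi_\beta \mid S(t,s) \mid \phi_\alpha\rangle = \sum_j a_j \delta_{\beta_j \alpha}.$$
By (232),
$$W_{\beta,\alpha} = \begin{cases} |a_j|^2 & \text{if } \beta = \beta_j, \\ 0 & \text{if } \beta \ne \beta_j \text{ for all } j. \end{cases} \tag{233}$$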
Now let {<Pa} be a "complete orthonormal system" in the sense of the

Dirac calculus, i.e.,
and suppose that
$$\langle \phi_\beta \mid S(t,s) \mid \phi_\alpha\rangle = \sum_j a_j \delta(\beta_j - \alpha).$$
Then, $\phi_\alpha$ is a "generalized state" and physicists assume that (233) remains
valid. This is motivated by the philosophy that $\delta(\alpha - \beta)$ represents a
continuous version of the Kronecker symbol $\delta_{\alpha\beta}$.
This argument will be used later in scattering theory (cf. Remark 10).
5.24.4 Applications to Time-Dependent Scattering Theory and the Feynman Diagrams
Consider first a classical particle of mass m on the real line which moves
with the velocity v from left to right (Figure 5.13(a)). Such a particle has
the momentum p = mv and the energy
$$E(p) := \frac{1}{2}mv^2 = \frac{p^2}{2m}.$$
In terms of quantum mechanics such a particle is described by the function
More precisely, $\psi_p$ represents a particle stream of velocity $v = p/m$
and
particle density $\rho(x,t) = |\psi_p(x,t)|^2$, i.e.,
$$\int_a^b \rho(x,t)\,dx = \text{number of particles in the interval } [a,b] \text{ at time } t.$$
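[FIGURE 5.13. (a) Scattering process $p \to q$. (b) $\langle q \mid S_1(t,s) \mid p\rangle$ with one interaction $U(t_1)$, $s \le t_1 \le t$. (c) $\langle q \mid S_2(t,s) \mid p\rangle$ with interactions $U(t_1)$, $U(t_2)$ and intermediate momentum $p_1$, $s \le t_1 \le t_2 \le t$.]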
In the following, we set a := 1 and $\hbar := 1$. We also write
$$|p,t\rangle := \psi_p(t).$$
Recall that $\{\phi_p\}$ forms a complete orthonormal system in the sense of the
Dirac calculus, i.e.,
Furthermore, observe that
$$\psi_p(t) = e^{-itH_0}\phi_p.$$
Since the operator $e^{-itH_0}$ is unitary, $\{\psi_p(t)\}$ also forms a "complete
orthonormal system" for each time t.
We want to study the scattering of a particle stream on the real line under
the influence of a time-dependent potential U = U(x, t).
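Formal Theorem 7. Let t > s. For the transition amplitude we get
$$\langle q,t \mid S(t,s) \mid p,s\rangle = \delta(q-p) + \sum_{n=1}^{\infty}\langle q \mid S_n(t,s) \mid p\rangle, \tag{234}$$
where
$$\langle q \mid S_n(t,s) \mid p\rangle := \int \frac{1}{i^n n!}\,T\langle q \mid U(t_1) \mid p_1\rangle\langle p_1 \mid U(t_2) \mid p_2\rangle \times \cdots \times \langle p_{n-1} \mid U(t_n) \mid p\rangle,$$
along with
$$\int := \int_s^t dt_1 \cdots dt_n \int_{-\infty}^{\infty} dp_1 \cdots dp_{n-1}$$
and
$$\langle q \mid U(t) \mid p\rangle := e^{it(E(q)-E(p))}\int_{-\infty}^{\infty} \overline{\phi_q(x)}\,U(x,t)\,\phi_p(x)\,dx.$$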
The chronological operator T was introduced in Section 5.24.1. The phys-

ical meaning of (q, t I S(t, s) I p, s) will be discussed in Remark 10.
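Remark 8 (The Feynman diagrams). Explicitly, we get
$$\langle q \mid S_1(t,s) \mid p\rangle = -i\int_s^t \langle q \mid U(t_1) \mid p\rangle\,dt_1$$
and
$$\langle q \mid S_2(t,s) \mid p\rangle = -\frac{1}{2}\int T\langle q \mid U(t_1) \mid p_1\rangle\langle p_1 \mid U(t_2) \mid p\rangle,$$
where $\int := \int_s^t dt_1\,dt_2 \int_{-\infty}^{\infty} dp_1$.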

This is represented graphically in Figures 5.13(b) and (c) by means of
the so-called Feynman diagrams. Intuitively, these diagrams show that
A scattering process in quantum mechanics can be regarded as the super-
position of infinitely many microscattering processes.
Observe the fundamental fact that the transition amplitude can be com-
pletely computed if the first approximation (q I U(t) I p) is known. In the
language of Feynman diagrams, this means the following:
The Feynman diagrams of higher order are obtained from the first-order
Feynman diagrams (see Figure 5.13).
In fact, Feynman diagrams provide us with a deep insight into the structure
of scattering processes. Today they represent the most important tool in
quantum field theory (elementary particle physics).
In the quotation at the beginning of Section 5.23, Dyson pointed out that
Feynman was able to compute complex physical effects on the blackboard
in short time. The reason for this is the fact that the local language of
Feynman diagrams is much closer to the physical effects than the global
language of the Schrodinger equation.
Formal Proof of Theorem 7. Set $u(t) := S(t,s)u_0$. By (228),
$$iu'(t) = (H_0 + U)u(t), \quad u(s) = u_0.$$
To get a simpler differential equation we define
$$v(t) := e^{itH_0}u(t).$$
Then
$$v'(t) = e^{itH_0}iH_0u(t) + e^{itH_0}u'(t) = e^{itH_0}(-iU)u(t).$$
Hence
$$iv'(t) = Vv(t), \quad v(s) = e^{isH_0}u_0, \quad \text{where } V := e^{itH_0}Ue^{-itH_0}.$$
Letting $u_0 := e^{-isH_0}\phi$ and observing that $v(s) = \phi$, it follows from the
Formal Theorem 1 that
$$v(t) = e^{itH_0}S(t,s)e^{-isH_0}\phi \tag{235}$$
Since $\psi_p = e^{-itH_0}\phi_p$, we obtain
By the Dirac calculus, for example,
$$\langle \phi_q \mid V(t)V(\tau) \mid \phi_p\rangle = \int_{-\infty}^{\infty}\langle \phi_q \mid V(t) \mid \phi_{p_1}\rangle\langle \phi_{p_1} \mid V(\tau) \mid \phi_p\rangle\,dp_1$$
$$= \int \langle \psi_q(t) \mid U(t) \mid \psi_{p_1}(t)\rangle\langle \psi_{p_1}(\tau) \mid U(\tau) \mid \psi_p(\tau)\rangle\,dp_1.$$
Furthermore,
$$\langle \psi_q(t) \mid U(t) \mid \psi_p(t)\rangle = e^{it(E(q)-E(p))}\int dx\,\overline{\phi_q(x)}\,U(x,t)\,\phi_p(x).$$
This yields (234). □
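Formal Standard Example 9. Suppose that the potential U = U(x) is
independent of time t and vanishes outside a compact interval, say [c, d].
Let p > 0.
Then, as $t \to +\infty$ and $s \to -\infty$, the first approximation of the transition
amplitude reads as follows:
$$\langle q \mid S \mid p\rangle := \lim_{s\to-\infty,\ t\to+\infty}\langle q,t \mid S(t,s) \mid p,s\rangle = a_+(p)\delta(q-p) + a_-(p)\delta(q+p), \tag{236}$$
where
$$a_+(p) := 1 - 2\pi i m p^{-1}U_F(0), \qquad a_-(p) := -2\pi i m p^{-1}U_F(-2p),$$
and $U_F$ denotes the Fourier transform of U. In particular, if $U(x) \equiv 0$, then
$a_+(p) = 1$ and $a_-(p) = 0$.
Formal Proof. By Remark 8,
$$\langle q,t \mid S(t,s) \mid p,s\rangle = \delta(q-p) + \langle q \mid S_1(t,s) \mid p\rangle + \cdots,$$
where
$$\lim_{s\to-\infty,\ t\to+\infty}\langle q \mid S_1(t,s) \mid p\rangle = -i\int_{-\infty}^{\infty} d\tau\,e^{i\tau(E(q)-E(p))}\int_{-\infty}^{\infty}\overline{\phi_q(x)}\,U(x)\,\phi_p(x)\,dx$$
$$= -i\,\delta(E(q)-E(p))\int_{-\infty}^{\infty} e^{-i(q-p)x}U(x)\,dx$$
$$= -2\pi i\,\delta(E(q)-E(p))\,U_F(q-p).$$
Since $E(p) = \frac{p^2}{2m}$, it follows from Problem 2.19 that
$$\delta(E(q)-E(p)) = mp^{-1}\{\delta(q-p) + \delta(q+p)\}. \quad \Box$$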
The operator
$$S\phi := \lim_{s\to-\infty,\ t\to+\infty} e^{itH_0}S(t,s)e^{-isH_0}\phi$$
is called the Heisenberg S-operator. Since
$$\langle q,t \mid S(t,s) \mid p,s\rangle = \langle \phi_q \mid e^{itH_0}S(t,s)e^{-isH_0} \mid \phi_p\rangle,$$
we get $\langle q \mid S \mid p\rangle = \langle \phi_q \mid S \mid \phi_p\rangle$.
Heisenberg introduced the S-operator (or the S-matrix) in 1942 as a convenient
way to describe scattering processes for elementary particles. Explicitly,
it follows from (235) that
where $V(t) := e^{itH_0}Ue^{-itH_0}$.
Remark 10 (Physical interpretation of (236) and the Born approximation).
Consider a potential U = U(x) as in the Formal Standard Example
9. Let p > 0. Define
$$W_{q,p} := \text{transition probability for a particle from momentum } p \text{ to momentum } q.$$
Then, motivated by the Formal Example 6, it follows from (236) that
(i) $W_{q,p} = 0$ if $q \ne \pm p$.
(ii) $W_{\pm p,p} = |a_\pm(p)|^2$.
In particular, if $U(x) \equiv 0$, then $W_{p,p} = 1$ and $W_{q,p} = 0$ for $q \ne p$, as
expected (no scattering).
[FIGURE 5.14. Particle streams with momentum label p near the interval [c, d].]
In terms of the function $\psi$, this means the following. We are given a
stream of particles with mass m which moves from left to right with the
velocity $v_{in} = p/m$
and the particle density
$$\rho_{in} = 1.$$
For large $-t$ and $-x$, this stream is described by the function
$$\psi_{in}(x,t) = e^{-itE(p)}e^{ipx}.$$
These particles either pass the potential barrier U or are reflected (cf.
Figure 5.14). More precisely, we get two different particle streams near
time t = +00, namely:
(i) a particle stream from left to right of velocity $v^+_{out} = v_{in}$ with the
particle density $\rho^+_{out} = |a_+(p)|^2$,
(ii) and a reflected particle stream from right to left of velocity $v^-_{out} = -v_{in}$
with the particle density $\rho^-_{out} = |a_-(p)|^2$.
For large t and x, the particle streams in (i) and (ii) are described by the
functions
$$\psi^\pm_{out}(x,t) = a_\pm(p)e^{-itE(p)}e^{\pm ipx}.$$
Observe that
As we shall show, this result is identical to the Born approximation that

follows from a completely different approach (the Lipman-Schwinger inte-
gral equation).
Naturally enough, (i) and (ii) above correspond to conservation of energy,
(237)
In fact, since U(x) = 0 for large |x|, the incoming and outgoing particles
have the energy $E_{in} = 2^{-1}mv_{in}^2$ and $E_{out} = 2^{-1}mv_{out}^2$, respectively. It
follows from $E_{in} = E_{out}$ that $v_{out} = \pm v_{in}$.
Observe that the structure of quantum scattering processes differs completely
from the behavior of classical particles. For example, in classical
mechanics, it follows from (237) and U(x) = 0 for all $x \notin [c,d]$ that a scattered
particle P of initial velocity $v_{in}$ has the energy $E = 2^{-1}mv_{in}^2$. Thus,
by (237), the particle P cannot pass the potential barrier if U(x) > E for
some point x.
5.24.5 The Lipman-Schwinger Integral Equation in Time-Independent Scattering Theory
The Lipman-Schwinger integral equation reads as follows:
$$\phi_{out}(x) = e^{ipx} - \int_{-\infty}^{\infty} e^{ip|x-y|}V(y)\phi_{out}(y)\,dy, \tag{238}$$
where $V(x) := imp^{-1}U(x)$ and $\hbar := 1$. A physical interpretation of $\phi_{out}$
will be given ahead. Let us assume that the time-independent potential
U = U(x) vanishes outside the compact interval [c, d].
The integral equation (238) can be solved by using the following iteration
method:
$$\phi^{(n)}_{out}(x) = e^{ipx} - \int_{-\infty}^{\infty} e^{ip|x-y|}V(y)\phi^{(n-1)}_{out}(y)\,dy, \quad n = 1, 2, \ldots,$$
where $\phi^{(0)}_{out}(x) := e^{ipx}$. This way we get the first approximation
$$\phi^{(1)}_{out}(x) = e^{ipx} - \int_{-\infty}^{\infty} e^{ip|x-y|+ipy}V(y)\,dy, \tag{239}$$
which is called the Born approximation.
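Formal Proposition 11 (Asymptotic behavior of $\phi$). If $\phi$ is a solution of
(238), then
$$\phi(x) = a_+(p)e^{ipx} \quad \text{for all } x \ge d$$
and
$$\phi(x) = e^{ipx} + a_-(p)e^{-ipx} \quad \text{for all } x \le c,$$
where
$$a_+(p) := 1 - \int_{-\infty}^{\infty} e^{-ipy}V(y)\phi(y)\,dy$$
and
$$a_-(p) := -\int_{-\infty}^{\infty} e^{ipy}V(y)\phi(y)\,dy.$$
Formal Proof. By (238),
$$\phi_{out}(x) = e^{ipx} - e^{ipx}\int_{-\infty}^{x} e^{-ipy}V(y)\phi_{out}(y)\,dy - e^{-ipx}\int_{x}^{\infty} e^{ipy}V(y)\phi_{out}(y)\,dy.$$
Since V vanishes outside [c, d], the second integral vanishes for $x \ge d$ and
the first one vanishes for $x \le c$. □
In particular, for the Born approximation (239) we get
$$a_-(p) = -\int_{-\infty}^{\infty} e^{2ipy}V(y)\,dy = -2\pi m i p^{-1}U_F(-2p), \tag{240}$$
where $U_F$ denotes the Fourier transform of the potential U.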
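The integral in (240) is easy to evaluate numerically. As an illustration that is not part of the original text, the sketch below assumes a square barrier $U(x) = U_0$ on $[c,d]$ (a hypothetical choice), uses $V = imp^{-1}U$ with $\hbar = 1$, and compares a midpoint-rule evaluation of $a_-(p) = -\int e^{2ipy}V(y)\,dy$ with the elementary closed form $-\frac{mU_0}{2p^2}\left(e^{2ipd} - e^{2ipc}\right)$ obtained by integrating the exponential directly. Function names and the number of quadrature nodes are assumptions.

```python
import cmath


def born_reflection(p, potential, c, d, m=1.0, n=4000):
    """Midpoint-rule value of a_-(p) = -int_c^d e^{2ipy} V(y) dy, V = i*m*U/p."""
    h = (d - c) / n
    total = 0j
    for k in range(n):
        y = c + (k + 0.5) * h
        total += cmath.exp(2j * p * y) * (1j * m / p) * potential(y)
    return -total * h


def born_reflection_square(p, u0, c, d, m=1.0):
    """Closed form of the same integral for the square barrier U = u0 on [c, d]."""
    return -(m * u0 / (2 * p ** 2)) * (cmath.exp(2j * p * d) - cmath.exp(2j * p * c))


if __name__ == "__main__":
    numeric = born_reflection(1.5, lambda y: 0.2, -1.0, 1.0)
    exact = born_reflection_square(1.5, 0.2, -1.0, 1.0)
    print(abs(numeric - exact))
```

For a general potential vanishing outside $[c,d]$, only `potential` needs to be replaced; the closed form applies to the square barrier alone.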
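Formal Proposition 12 (Solution of the Schrödinger equation). Let $\phi_{out}$
be a solution of the Lipman-Schwinger equation (238). Then,
$$\psi(x,t) := e^{-itE(p)}\phi_{out}(x), \quad x, t \in \mathbb{R}, \tag{241}$$
is a solution of the Schrödinger equation $i\psi_t = H\psi$. Here, $E(p) := \frac{p^2}{2m}$.
Formal Proof. By Problem 5.1,
$$\left(\frac{d^2}{dx^2} + p^2\right)e^{ip|x-y|} = 2ip\,\delta(x-y).$$
Recall that $H_0 = -\frac{1}{2m}\frac{d^2}{dx^2}$. Thus, it follows from (238) that
$$(H_0 - E(p))\phi_{out}(x) = (H_0 - E(p))e^{ipx} - \int_{-\infty}^{\infty}\delta(y-x)U(y)\phi_{out}(y)\,dy = -U(x)\phi_{out}(x).$$
Hence $H\phi_{out} = E(p)\phi_{out}$. This implies $i\psi_t = H\psi$. □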
Remark 13 (Physical interpretation). The solution $\psi$ from (241) has the
following asymptotic behavior:
$$\psi(x,t) = \psi_{in}(x,t) + \psi^-_{out}(x,t) \quad \text{for all } x \le c,$$
$$\psi(x,t) = \psi^+_{out}(x,t) \quad \text{for all } x \ge d,$$
where
$$\psi_{in}(x,t) := e^{-itE(p)}e^{ipx}, \qquad \psi^\pm_{out}(x,t) := a_\pm(p)e^{-itE(p)}e^{\pm ipx},$$
for fixed p > 0. Here, $\psi_{in}$ and $\psi^+_{out}$ correspond to an incoming and outgoing
particle stream of velocity $v := p/m$ from left to right and particle density
$\rho_{in} = 1$ and $\rho^+_{out} = |a_+(p)|^2$, respectively. Furthermore, $\psi^-_{out}$ corresponds
to a reflected particle stream from right to left of velocity $v = -p/m$ and
particle density $\rho^-_{out} = |a_-(p)|^2$ (cf. Figure 5.14).
If we use the Born approximation (240) for $a_\pm(p)$, then this result coincides
with Remark 10.
5.25 A Look at Solitons and Inverse Scattering Theory
Waves are one of the most fundamental motions: waves on the water's
surface and of earthquakes, waves along springs, light waves, radio
waves, sound waves, waves of clouds, waves of crowds, brain waves,
and so forth.
Waves are recorded, and records are analyzed. In the case of
sound waves and light waves, it is customary to analyze a wave as the
sum of simple sinusoidal waves (Fourier series). This is the principle
of linear superposition.
However, when we observe water waves carefully, we see that the
linear superposition principle cannot be applied in general, except
for very small amplitudes. The study of water waves of finite am-
plitude was one of the main topics in nineteenth-century physics. In
recent years, many nonlinear phenomena have become important.
For example, strong laser beams and waves in gas plasmas exhibit
nonlinear phenomena.
The increasing importance of such phenomena has given rise to
intensive study by means of high-speed computers, and it has been
revealed that in many nonlinear waves, stable pulses are to be con-
sidered as fundamental entities. The stable pulses in nonlinear media
are called solitons. The discovery of solitons has led to new develop-
ments in mathematics, which have made it possible to solve a variety
of nonlinear evolution equations in the last 30 years.
Morikazu Toda, 1989
5.25.1 Solitons
The Korteweg-de Vries equation (KdV equation) is given by
$$u_t + 6uu_x + u_{xxx} = 0, \quad -\infty < x, t < \infty. \tag{242}$$
[FIGURE 5.15. (a) Soliton (solitary wave). (b) Collision between two solitons with velocities $c_1$ and $c_2$.]
An explicit calculation shows that this equation has the following solution:¹⁷
$$u(x,t) = 2k^2\,\mathrm{sech}^2\,k(x - ct - x_0), \tag{243}$$
where $c = 4k^2$, $k > 0$, and $x_0 \in \mathbb{R}$. This corresponds to a soliton (solitary
wave) that moves with the velocity c from left to right (cf. Figure 5.15(a)).
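Proposition 1 (Two-soliton solutions). The KdV equation (242) has the
solution
$$u(x,t) = 2\frac{\partial^2}{\partial x^2}\ln\varphi(x,t), \tag{244}$$
where
$$\varphi(x,t) := 1 + A_1e^{2\eta_1} + A_2e^{2\eta_2} + A_3e^{2(\eta_1+\eta_2)}$$
along with
$$\eta_j := k_j(x - c_jt), \quad c_j = 4k_j^2, \quad j = 1, 2,$$
and $A_3 := \left(\frac{k_1-k_2}{k_1+k_2}\right)^2 A_1A_2$. Here, the numbers $k_1, k_2 > 0$ and $A_1, A_2 > 0$
are given.
This follows from an explicit calculation. In particular, if $A_1 = 1$, $k_1 = k$,
$A_2 = 0$, then (244) is identical to the soliton (243) with $x_0 = 0$.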
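The special case mentioned last can be checked numerically. The sketch below is an assumption-laden illustration: it takes $\varphi = 1 + e^{2\eta}$ with $\eta = k(x - 4k^2t)$ for the one-soliton case (this explicit form of $\varphi$ is an assumption of the sketch), approximates $2\,\partial_x^2\ln\varphi$ by a second central difference, and compares it with $2k^2\,\mathrm{sech}^2\eta$ from (243) with $x_0 = 0$. Function names and the step size are hypothetical.

```python
import math


def phi_one_soliton(x, t, k=1.0):
    """Assumed one-soliton phi = 1 + exp(2*eta), eta = k (x - 4 k^2 t)."""
    eta = k * (x - 4 * k ** 2 * t)
    return 1 + math.exp(2 * eta)


def u_from_phi(phi, x, t, h=1e-3):
    """u = 2 d^2/dx^2 ln(phi), approximated by a second central difference."""
    f = lambda xx: math.log(phi(xx, t))
    return 2 * (f(x + h) - 2 * f(x) + f(x - h)) / h ** 2


def soliton_243(x, t, k=1.0):
    """Soliton (243) with x0 = 0."""
    return 2 * k ** 2 / math.cosh(k * (x - 4 * k ** 2 * t)) ** 2


if __name__ == "__main__":
    print(u_from_phi(phi_one_soliton, 0.4, 0.2), soliton_243(0.4, 0.2))
```

The agreement reflects the identity $\frac{d^2}{d\eta^2}\ln(1 + e^{2\eta}) = \mathrm{sech}^2\eta$, which is the one-soliton instance of (244).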
Corollary 2. Let $c_1 < c_2$. Then as $t \to \pm\infty$, the solution (244) is a
superposition of the following two solitons:
j = 1,2. (245)
¹⁷ Recall that $\mathrm{sech}\,x := \frac{1}{\cosh x} = \frac{2}{e^x + e^{-x}}$.
This shows that the solution (244) behaves like two solitons at time
$t = -\infty$ and time $t = +\infty$ (cf. Figure 5.15(b)). It is quite remarkable
that the two solitons are stable, i.e., they do not change their shape and
velocity after collision. There appears only a phase shift $x_0^{\pm}$. In summary,
we observe the following:
Solitons behave like particles under collisions.
This important fact was discovered by Kruskal and Zabusky via computer
experiments in 1963.
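Proof. Let $(x,t) \in \mathbb{R}^2$ be given such that
$$|x - c_1t| < a$$
for fixed a > 0. Then, as $t \to +\infty$,
$$\eta_2 = k_2(x - c_2t) \to -\infty,$$
since $c_2 > c_1$. Hence
$$\varphi(x,t) \simeq 1 + A_1e^{2\eta_1}, \quad t \to +\infty.$$
This yields (245) for j = 1. Now let $(x,t) \in \mathbb{R}^2$ be given such that
$$|x - c_2t| < a.$$
Then, as $t \to +\infty$,
$$\eta_1 = k_1(x - c_1t) \to +\infty.$$
Hence
$$\varphi(x,t) \simeq A_1e^{2\eta_1} + A_3e^{2(\eta_1+\eta_2)} = A_1e^{2\eta_1}\left(1 + \frac{A_3}{A_1}e^{2\eta_2}\right), \quad t \to +\infty.$$
This implies
$$u(x,t) = 2\frac{\partial^2}{\partial x^2}\ln\varphi(x,t) \simeq 2\frac{\partial^2}{\partial x^2}\ln\left(1 + \frac{A_3}{A_1}e^{2\eta_2}\right),$$
which corresponds to (245) for j = 2.
Similarly, we obtain the asymptotic behavior as $t \to -\infty$. □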
5.25.2 Summary of Inverse Scattering Theory and the Spectral Transform
We consider the stationary Schrödinger equation
$$-\psi''(x) + u(x)\psi(x) = k^2\psi(x), \quad -\infty < x < \infty. \tag{246}$$
We assume that the real $C^\infty$-potential u vanishes sufficiently fast at infinity,
i.e.,
$$\int_{-\infty}^{\infty} |u(x)|(1 + |x|)\,dx < \infty.$$
We are looking for complex-valued eigenfunctions $\psi$. The spectrum of (246)
has the following structure:
(i) Continuous spectrum. For each real $k \ne 0$, the number $k^2$ is a double
eigenvalue of (246) with the two linearly independent eigenfunctions $\psi_1$ and
$\psi_2$, which are uniquely characterized by the following asymptotic behavior:
$$\psi_1(x) = e^{-ikx} + o(1), \quad x \to -\infty,$$
$$\psi_2(x) = e^{ikx} + o(1), \quad x \to -\infty. \tag{247}$$
Additionally, we obtain
$$\psi_1(x) = a(k)e^{-ikx} + b(k)e^{ikx} + o(1), \quad x \to +\infty,$$
$$\psi_2(x) = \overline{b(k)}e^{-ikx} + \overline{a(k)}e^{ikx} + o(1), \quad x \to +\infty. \tag{248}$$
(ii) Discrete spectrum. Equation (246) has either no negative eigenvalues
or a finite number of negative eigenvalues
$$-\infty < k_1^2 < k_2^2 < \cdots < k_N^2 < 0.$$
All these eigenvalues are simple. Letting $k_j = iq_j$ with $q_j > 0$, the corresponding
eigenfunctions $\psi^{[j]}$, j = 1, ..., N, are characterized by the following
asymptotic behavior:
$$\psi^{[j]}(x) = e^{q_jx} + o(e^{q_jx}), \quad x \to -\infty. \tag{249}$$
Additionally, we have
$$\psi^{[j]}(x) = c_je^{-q_jx} + o(e^{-q_jx}), \quad x \to +\infty,$$
where $c_j$ is real.
The mapping
(250)
with k E IR and j = 1, ... ,N, is called the spectral transform.
In terms of quantum mechanics, (i) and (ii) correspond to scattered par-
ticles and to bound states of particles, respectively.
Let all the scattering data a(k), b(k), qj, and Cj be given. The main
task of inverse scattering theory consists in constructing the corresponding
potential u. To this end, we set
and we consider the Gelfand-Levitan-Marchenko integral equation
$$K(x,y) + F(x+y) + \int_x^{\infty} K(x,z)F(z+y)\,dz = 0. \tag{251}$$
If we know a solution K of this linear integral equation, then we obtain the
unknown potential u by the relation
$$u(x) = -2\frac{d}{dx}K(x,x).$$
5.25.3 Construction of Solutions of the Korteweg-de Vries Equation via the Inverse Spectral Transform
We consider the initial-value problem for the nonlinear Korteweg-de Vries
equation¹⁸
$$u_t - 6uu_x + u_{xxx} = 0, \quad -\infty < x < \infty,\ t > 0,$$
$$u(x,0) = u_0(x), \tag{252}$$
together with the linear Schrödinger equation
$$-\psi''(x) + u(x,t)\psi(x) = 0$$
for fixed, but otherwise arbitrary, time t. Then, we have the following important
result:
¹⁸ Replacing u with $-u$, we get (242).
(R) Let u = u(x, t) be a solution of (252), which vanishes sufficiently fast
as $|x| \to \infty$. Then the spectral transform of u satisfies the following
simple linear equations:
$$\dot{a}(k,t) = 0, \qquad \dot{b}(k,t) = 8ik^3b(k,t),$$
$$\dot{q}_j = 0, \qquad \dot{c}_j = 8q_j^3c_j. \tag{253}$$
The dot denotes the t-derivative.
According to (R), we can use the following elegant procedure in order to
solve the initial-value problem (252) for the Korteweg-de Vries equation:
(a) For given initial values $u = u_0(x)$, we compute the spectral transform
$a(k,0)$, $b(k,0)$, $q_j(0)$, $c_j(0)$.
(b) We solve equation (253). This yields
$$a(k,t) = a(k,0), \qquad q_j(t) = q_j(0), \quad t > 0,$$
$$b(k,t) = b(k,0)e^{8ik^3t}, \qquad c_j(t) = c_j(0)e^{8q_j(0)^3t}.$$
(c) We obtain the solution of the original problem (252) by using the
inverse spectral transform
$$(a(k,t), b(k,t), q_j(t), c_j(t)) \mapsto u(x,t)$$
from Section 5.25.2.
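Step (b) of this procedure is explicit enough to be written down directly. The following sketch is only an illustration under stated assumptions: it treats a single wave number k and a single bound state j, takes the scattering data at time 0 as given numbers (computing them in step (a) and inverting them in step (c) are not attempted), and uses hypothetical function names. The second function checks by a central difference that the evolved b satisfies $\dot{b} = 8ik^3b$.

```python
import cmath
import math


def evolve_scattering_data(a0, b0, q0, c0, k, t):
    """Return (a, b, q, c) at time t from their values at t = 0, as in step (b)."""
    a_t = a0                                 # a(k, t) = a(k, 0)
    b_t = b0 * cmath.exp(8j * k ** 3 * t)    # b(k, t) = b(k, 0) e^{8 i k^3 t}
    q_t = q0                                 # q_j(t) = q_j(0)
    c_t = c0 * math.exp(8 * q0 ** 3 * t)     # c_j(t) = c_j(0) e^{8 q_j(0)^3 t}
    return a_t, b_t, q_t, c_t


def b_dot_residual(b0, k, t, h=1e-5):
    """Central-difference check of b' = 8 i k^3 b for the evolved b(k, t)."""
    b = lambda tt: evolve_scattering_data(1.0, b0, 1.0, 1.0, k, tt)[1]
    return (b(t + h) - b(t - h)) / (2 * h) - 8j * k ** 3 * b(t)
```

Note that $|b(k,t)|$ stays constant in time while $c_j(t)$ grows exponentially; only the phase of b and the size of $c_j$ carry the time dependence.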
Motivation of (R). In order to display the simple idea of proof as clearly

as possible, we restrict ourselves to formal considerations.
Step 1: The discrete spectrum. Let us introduce the following two differential
operators:
$$L(t)\psi := -\frac{d^2}{dx^2}\psi(x) + u(x,t)\psi(x),$$
$$A(t)\psi := 4\frac{d^3}{dx^3}\psi(x) - 3u(x,t)\frac{d}{dx}\psi(x) - 3\frac{\partial}{\partial x}\big(u(x,t)\psi(x)\big),$$
where the fixed function u satisfies the KdV equation (252). From (252) we
obtain the key relation
$$L_t = LA - AL. \tag{254}$$
Hence
Consequently,
$$L(t)\psi = \lambda\psi \quad \text{iff} \quad L(0)\phi = \lambda\phi, \quad \text{where } \phi := e^{tA}\psi.$$
Thus, the operator L(t) has the same eigenvalues as the operator L(0), i.e.,
we obtain the crucial relation
$$q_j(t) = q_j(0) \quad \text{for all } t,$$
and hence $\dot{q}_j = 0$. The pair {L, A} with (254) is called a Lax pair. Let
$\psi = \psi(x,t)$ and let
$$L(t)\psi = -q^2\psi \tag{255}$$
for fixed t, where $q = q_j$. By Section 5.25.2, for fixed t, the eigenfunction
$\psi$ can be characterized by the following asymptotic behavior:
$$\psi(x,t) = e^{qx} + o(e^{qx}), \quad x \to -\infty. \tag{256a}$$
In addition,
$$\psi(x,t) = c(t)e^{-qx} + o(e^{-qx}), \quad x \to +\infty. \tag{256b}$$
Differentiation of (255) with respect to time t yields
$$L_t\psi + L\psi_t = -q^2\psi_t.$$
Note that q is independent of t. By (254) and (255), $(L + q^2)(\psi_t + A\psi) = 0$.
Hence we obtain
$$L(t)\phi = -q^2\phi,$$
where $\phi := \psi_t + A\psi$. By (256a),
$$\phi(x,t) = 4q^3e^{qx} + o(e^{qx}), \quad x \to -\infty.$$
Hence $\phi = 4q^3\psi$, i.e., we obtain the key equation
$$\psi_t + A\psi = 4q^3\psi. \tag{257}$$
In this connection, note that u is rapidly vanishing as $|x| \to \infty$, i.e., we
may put $A = 4\frac{d^3}{dx^3}$ as $|x| \to \infty$. Finally, it follows from (256b) and (257)
that
$$\dot{c}(t) = 8q^3c(t).$$
Step 3: Continuous spectrum. Let k > 0 be fixed, and let $\psi$ denote the
eigenfunction of the equation
$$L(t)\psi = k^2\psi \tag{258}$$
for fixed t, where $\psi = \psi(x,t)$ is characterized by the following asymptotic
behavior:
$$\psi(x,t) = e^{-ikx} + o(1), \quad x \to -\infty. \tag{259a}$$
In addition,
$$\psi(x,t) = a(k,t)e^{-ikx} + b(k,t)e^{ikx} + o(1), \quad x \to +\infty. \tag{259b}$$
As above, differentiation of (258) with respect to time t yields
$$L(t)\phi = k^2\phi, \quad \text{where } \phi := \psi_t + A\psi.$$
By (259a),
$$\phi(x,t) = 4ik^3e^{-ikx} + o(1), \quad x \to -\infty.$$
Hence $\phi = 4ik^3\psi$, i.e., we obtain the key equation
$$\psi_t + A\psi = 4ik^3\psi.$$
Using (259b), we get
$$\dot{a}e^{-ikx} + \dot{b}e^{ikx} = \left(-4\frac{d^3}{dx^3} + 4ik^3\right)\left(ae^{-ikx} + be^{ikx}\right) + o(1), \quad x \to +\infty.$$
Hence $\dot{a} = 0$ and $\dot{b} = 8ik^3b$. This finishes the motivation of (R).
Remark 3 (Nonlinear Fourier transformation). The method (a) through
(c) represents a nonlinear variant of the classical Fourier transformation.
To explain this, consider the linearized Korteweg-de Vries equation
$$u_t + u_{xxx} = 0.$$
Using the Fourier transformation
$$u(x,t) = \int_{-\infty}^{\infty} b(k,t)e^{ikx}\,dk,$$
we get
$$\dot{b}(k,t) = ik^3b(k,t).$$
This implies
$$b(k,t) = b(k,0)e^{ik^3t},$$
which corresponds to (253).
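The linear analogue can be tried out directly. In the sketch below, which is an illustration and not part of the original text, the Fourier integral is replaced by a finite sum over a few hypothetical wave numbers (an assumption made for simplicity), each coefficient evolves as $b(k,0)e^{ik^3t}$, and the residual $u_t + u_{xxx}$ is estimated by central differences. Function names are illustrative.

```python
import cmath


def linear_kdv_solution(x, t, modes):
    """u(x, t) = sum_k b(k, 0) e^{i k^3 t} e^{i k x} for a dict {k: b(k, 0)}."""
    return sum(b0 * cmath.exp(1j * (k ** 3 * t + k * x)) for k, b0 in modes.items())


def linear_kdv_residual(x, t, modes, h=1e-3):
    """Central-difference estimate of u_t + u_xxx at the point (x, t)."""
    u = lambda xx, tt: linear_kdv_solution(xx, tt, modes)
    u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)
    u_xxx = (u(x + 2 * h, t) - 2 * u(x + h, t)
             + 2 * u(x - h, t) - u(x - 2 * h, t)) / (2 * h ** 3)
    return u_t + u_xxx


if __name__ == "__main__":
    modes = {1.0: 1.0, -2.0: 0.5j, 0.5: -0.3}  # hypothetical Fourier data
    print(abs(linear_kdv_residual(0.4, 0.3, modes)))
```

Each mode solves the equation separately, which is the linear superposition principle that the inverse spectral transform replaces in the nonlinear case.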

A detailed discussion of this theory and its applications to the compu-
tation of special solutions (N-soliton solutions) can be found in Novikov
(1984) and Toda (1989).
Problems
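5.1. A special fundamental solution. Let $p \in \mathbb{R}$. Show that
$$\left(\frac{d^2}{dx^2} + p^2\right)e^{ip|x|} = 2ip\,\delta \quad \text{for all } x \in \mathbb{R},$$
in the sense of distributions.
Solution: Let
$$U(\phi) := \int_{-\infty}^{\infty} e^{ip|x|}\phi(x)\,dx \quad \text{for all } \phi \in C_0^\infty(\mathbb{R}).$$
We have to show that
$$U(\phi'') + p^2U(\phi) = 2pi\,\phi(0).$$
In fact, using integration by parts, this follows from the decomposition
$$U(\phi'') = \int_0^{\infty} e^{ipx}\phi''(x)\,dx + \int_{-\infty}^{0} e^{-ipx}\phi''(x)\,dx.$$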
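The identity $U(\phi'') + p^2U(\phi) = 2pi\,\phi(0)$ can be tested numerically. The sketch below is a plausibility check rather than a proof, and it makes two assumptions that are not in the original: the test function is the Gaussian $\phi(x) = e^{-x^2}$, which is rapidly decaying but not compactly supported, and the integrals are truncated to a hypothetical interval $[-10, 10]$ and evaluated by Simpson's rule, split at the kink $x = 0$.

```python
import cmath
import math


def simpson(f, a, b, n=4000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def distribution_identity_gap(p, cutoff=10.0):
    """U(phi'') + p^2 U(phi) - 2 i p phi(0) for the Gaussian phi(x) = exp(-x^2)."""
    phi = lambda x: math.exp(-x * x)
    phi2 = lambda x: (4 * x * x - 2) * math.exp(-x * x)  # second derivative of phi
    integrand = lambda x: cmath.exp(1j * p * abs(x)) * (phi2(x) + p * p * phi(x))
    lhs = simpson(integrand, -cutoff, 0.0) + simpson(integrand, 0.0, cutoff)
    return lhs - 2j * p * phi(0.0)
```

Splitting the integral at 0 mirrors the decomposition used in the solution and keeps each piece smooth, so Simpson's rule converges at its usual rate.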
5.2. The nonhomogeneous stationary Schrödinger equation. Let $f: \mathbb{R} \to \mathbb{C}$
be a continuous function that vanishes outside a compact interval. Set
$$v(x) := \int_{-\infty}^{\infty} \frac{ie^{ip|x-y|}}{2p}f(y)\,dy.$$
Show that, for each $p \in \mathbb{R}$ with $p \ne 0$, the function v is a $C^2$-solution of
$$-v'' - p^2v = f \quad \text{on } \mathbb{R}.$$
Hint: Use the decomposition $v(x) = \int_x^{\infty} \cdots + \int_{-\infty}^{x} \cdots$ and an analogous
argument as in the proof of Proposition 1 in Section 2.7.
5.3. Graph closed operators. Let $A: D(A) \subseteq X \to X$ be a linear operator
on the Hilbert space X over $\mathbb{K}$ such that D(A) is dense in X. The set
$$G(A) := \{(u, Au): u \in D(A)\}$$
is called the graph of A. The operator A is called graph closed iff G(A) is
closed in X × X, i.e.,
$$u_n \in D(A),\quad u_n \to u \text{ and } Au_n \to v \text{ in } X \text{ as } n \to \infty$$
imply $u \in D(A)$ and Au = v. The linear operator $B: D(B) \subseteq X \to X$ is called the closure
of A iff $A \subseteq B$ and¹⁹
$$\overline{G(A)} = G(B).$$
We write $\overline{A}$ instead of B. Show the following:
(i) The adjoint operator A* is graph closed.
(ii) The closure $\overline{A}$ exists iff it follows from $u_n \in D(A)$ for all n along with
$$Au_n \to v \text{ and } u_n \to 0 \text{ as } n \to \infty$$
that v = 0.
(iii) If there exists a linear graph closed operator $C: D(C) \subseteq X \to X$ such
that $A \subseteq C$, then the closure $\overline{A}$ exists and
$$\overline{A} \subseteq C.$$
Hence the closure $\overline{A}$ is the smallest graph closed extension of A. In
particular, $\overline{A}$ is uniquely determined by A.
(iv) If A is symmetric, then the closure $\overline{A}$ exists and is symmetric.
(v) If $\overline{A}$ exists, then $(\overline{A})^* = A^*$.
(vi) If A is self-adjoint, then $\overline{A} = A$.
(vii) The operator A is graph closed iff D(A) is a Hilbert space over $\mathbb{K}$
equipped with the inner product
$$(u \mid v)_A := (u \mid v) + (Au \mid Av).$$
¹⁹ Observe that $(u,v) \in \overline{G(A)}$ iff there exists a sequence $\{(u_n, v_n)\}$ in G(A)
such that $u_n \to u$ and $v_n \to v$ in X as $n \to \infty$.
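5.4. Symmetric operators. Let $A: D(A) \subseteq X \to X$ be a linear symmetric
operator on the complex Hilbert space X. For all $\lambda \in \mathbb{C}$ and $u \in D(A)$, show
that
(i) $\|Au - \lambda u\| \ge |\mathrm{Im}\,\lambda|\,\|u\|$;
(ii) $(A - \lambda I)^* = A^* - \bar{\lambda}I$;
(iii) $X = N(A^* - \bar{\lambda}I) \oplus \overline{R(A - \lambda I)}$ (orthogonal direct sum).
(iv) If A is graph closed, then
$$\overline{R(A - \lambda I)} = R(A - \lambda I),$$
for all $\lambda \in \mathbb{C}$ with $\mathrm{Im}\,\lambda \ne 0$.
(v) $A^{**} = \overline{A}$.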
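Solution: Ad (i). By the Schwarz inequality and $2ab \le a^2 + b^2$,
$$2(\mathrm{Re}\,\lambda)(Au \mid u) \le 2|\mathrm{Re}\,\lambda|\,\|Au\|\,\|u\| \le \|Au\|^2 + |\mathrm{Re}\,\lambda|^2\|u\|^2.$$
Hence
$$\|Au - \lambda u\|^2 = (Au - \lambda u \mid Au - \lambda u)$$
$$= \|Au\|^2 - (Au \mid \lambda u) - (\lambda u \mid Au) + |\lambda|^2\|u\|^2$$
$$= \|Au\|^2 - (2\,\mathrm{Re}\,\lambda)(Au \mid u) + |\lambda|^2\|u\|^2$$
$$\ge |\lambda|^2\|u\|^2 - |\mathrm{Re}\,\lambda|^2\|u\|^2 = |\mathrm{Im}\,\lambda|^2\|u\|^2.$$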
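The inequality of Problem 5.4(i) can be probed in finite dimensions. The sketch below is an illustration, not part of the original solution: it assumes that random Hermitian matrices serve as a stand-in for symmetric operators, draws random vectors u and complex numbers $\lambda$ with a fixed seed, and returns the smallest observed value of $\|Au - \lambda u\| - |\mathrm{Im}\,\lambda|\,\|u\|$, which should never be negative beyond rounding. Function names and sampling ranges are hypothetical.

```python
import math
import random


def random_hermitian(n, rng):
    """Random n x n Hermitian matrix as nested lists of complex numbers."""
    a = [[0j] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = complex(rng.uniform(-1, 1), 0.0)
        for j in range(i + 1, n):
            z = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
            a[i][j] = z
            a[j][i] = z.conjugate()
    return a


def norm(v):
    """Euclidean norm of a complex vector."""
    return math.sqrt(sum(abs(z) ** 2 for z in v))


def smallest_margin(trials=200, n=3, seed=0):
    """Minimum over random samples of ||Au - lam*u|| - |Im lam| * ||u||."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        a = random_hermitian(n, rng)
        u = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
        lam = complex(rng.uniform(-2, 2), rng.uniform(-2, 2))
        residual = [sum(a[i][j] * u[j] for j in range(n)) - lam * u[i]
                    for i in range(n)]
        worst = min(worst, norm(residual) - abs(lam.imag) * norm(u))
    return worst
```

A random test of this kind cannot establish the inequality, but a negative margin would immediately expose an error in the statement or in the sign conventions.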
Ad (ii). Use the definition of A*.
Ad (iii). Let $v \in N(A^* - \bar{\lambda}I)$. Then
$$(Au - \lambda u \mid v) = (u \mid A^*v - \bar{\lambda}v) = 0 \quad \text{for all } u \in D(A),$$
and hence $v \in R(A - \lambda I)^\perp$. Conversely, if $v \in R(A - \lambda I)^\perp$, then
$$(Au - \lambda u \mid v) = 0 \quad \text{for all } u \in D(A).$$
This implies $v \in D((A - \lambda I)^*)$ and $A^*v - \bar{\lambda}v = (A - \lambda I)^*v = 0$, i.e.,
$v \in N(A^* - \bar{\lambda}I)$.
Ad (iv). Let $v_n := (A - \lambda I)u_n \to v$ as $n \to \infty$. By Problem 5.4(i),
$\|v_n - v_m\| \ge |\mathrm{Im}\,\lambda|\,\|u_n - u_m\|$, and hence $(u_n)$ is Cauchy, i.e.,
$$u_n \to u \quad \text{as } n \to \infty.$$
Since A is graph closed, so is $A - \lambda I$. Hence $(A - \lambda I)u = v$.
Ad (v). Cf. Riesz and Nagy (1955), Section 117.
5.5. Self-adjoint operators. Let $A: D(A) \subseteq X \to X$ be a linear symmetric
operator on the complex Hilbert space X. Show the following:
(i) If $R(A - \lambda I) = R(A - \bar{\lambda}I) = X$ for some fixed $\lambda \in \mathbb{C}$, then A is
self-adjoint.
(ii) If A is self-adjoint, then all the points $\lambda \in \mathbb{C}$ with $\mathrm{Im}\,\lambda \ne 0$ belong
to the resolvent set $\rho(A)$ of A.
(iii) The operator A is self-adjoint iff $R(A \pm iI) = X$.
(iv) The operator A is self-adjoint iff A is graph closed and
$N(A^* \pm iI) = \{0\}$.
(v) $A^2 + I = (A - iI)(A + iI)$.
(vi) If A is self-adjoint, then so is $A^2$.
Solution: Ad (i). Since A is symmetric, $A \subseteq A^*$. Thus, we have to show
that $A^* \subseteq A$. To this end, let $v \in D(A^*)$. Since $R(A - \bar{\lambda}I) = X$, there is a
$w \in D(A)$ such that
$$Aw - \bar{\lambda}w = A^*v - \bar{\lambda}v.$$
Thus, for all $u \in D(A)$,
$$(Au - \lambda u \mid v - w) = (u \mid A^*v - \bar{\lambda}v) - (u \mid Aw - \bar{\lambda}w) = 0.$$
Since $R(A - \lambda I) = X$, this implies v = w, i.e., $v \in D(A)$.
Ad (ii). Let $\lambda \in \mathbb{C}$ be given with $\mathrm{Im}\,\lambda \ne 0$. We first show that
$$N(A - \lambda I) = \{0\}. \tag{260}$$
In fact, if $Av - \lambda v = 0$, then
$$(\mathrm{Im}\,\lambda)\|v\|^2 = \mathrm{Im}\{\lambda(v \mid v)\} = \mathrm{Im}(v \mid \lambda v) = \mathrm{Im}(v \mid Av) = 0,$$
since $(Av \mid v) = (v \mid Av) = \overline{(Av \mid v)}$. Thus, the operator $A - \lambda I: D(A) \to X$
is injective. By Problem 5.4,
$$\|(A - \lambda I)^{-1}w\| \le |\mathrm{Im}\,\lambda|^{-1}\|w\| \quad \text{for all } w \in R(A - \lambda I),$$
and
$$R(A - \lambda I) = X.$$
In this connection, observe that A = A*, and hence A is graph closed.
Thus, $\lambda \in \rho(A)$.
Ad (iii). Use (i) and (ii).
Ad (iv). If A is self-adjoint, then A = A*. Hence A is graph closed and
$$N(A^* \pm iI) = R(A \pm iI)^\perp = \{0\}.$$
Conversely, let A be graph closed and $N(A^* \pm iI) = \{0\}$. By Problem
5.4, $R(A \pm iI) = X$. Hence A is self-adjoint, by (i).
Ad (v). Observe that $D(A^2 + I) = D(A^2)$ and
$$D(A + iI) = D(A).$$
Hence $D((A - iI)(A + iI)) = D(A^2)$.
Ad (vi). It follows from $A^2 + I = (A - iI)(A + iI)$ and $R(A \pm iI) = X$
that $R(A^2 + I) = X$. By Problem 5.5(i), $A^2$ is self-adjoint.
5.6. The Kato perturbation theorem. Let $A: D(A) \subseteq X \to X$ be a linear
self-adjoint operator on the complex Hilbert space X, and let
$B: D(B) \subseteq X \to X$ be a linear symmetric operator such that $D(A) \subseteq D(B)$ and
$$\|Bu\| \le a\|Au\| + b\|u\| \quad \text{for all } u \in D(A), \qquad (261)$$
where a and b are fixed real numbers with $0 \le a < 1$ and $b \ge 0$.
Show that A + B is self-adjoint.
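Solution: Let $\alpha \in \mathbb{R}$ with $\alpha \neq 0$. Since $i\alpha \in \rho(A)$, the operator
$(A - i\alpha I)^{-1}: X \to X$ is linear and continuous. We shall show below that
$$\|B(A - i\alpha I)^{-1}\| < 1 \quad \text{for all } \alpha \in \mathbb{R}: |\alpha| \ge \alpha_0, \qquad (262)$$
provided $\alpha_0$ is sufficiently large. Since
$$(A + B - i\alpha I)(A - i\alpha I)^{-1} = I + B(A - i\alpha I)^{-1},$$
we get
$$R(A + B - i\alpha I) = X \quad \text{for all } \alpha \in \mathbb{R}: |\alpha| \ge \alpha_0$$
(cf. the Neumann series from Section 1.23). Since A + B is symmetric on D(A),
it follows from Problem 5.5(i) that A + B is self-adjoint.
Proof of (262). By Problem 5.4(i),
$$\|(A - i\alpha I)^{-1}u\| \le |\alpha|^{-1}\|u\| \quad \text{for all } u \in X.$$
Furthermore, since A is symmetric, the mixed terms cancel and
$$\|Av\|^2 + |\alpha|^2\|v\|^2 = (Av - i\alpha v \mid Av - i\alpha v) = \|Av - i\alpha v\|^2 \quad \text{for all } v \in D(A).$$
Letting $v := (A - i\alpha I)^{-1}u$, this implies
$$\|A(A - i\alpha I)^{-1}u\| \le \|u\| \quad \text{for all } u \in X.$$
Thus, it follows from (261) that
$$\|B(A - i\alpha I)^{-1}u\| \le a\|A(A - i\alpha I)^{-1}u\| + b\|(A - i\alpha I)^{-1}u\| \le (a + b|\alpha|^{-1})\|u\| \quad \text{for all } u \in X.$$
Since $a < 1$, we have $a + b|\alpha|^{-1} < 1$ as soon as $|\alpha| > b/(1 - a)$.
This yields (262).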
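The final estimate of the proof says that the bound $a + b|\alpha|^{-1}$ drops below 1 once $|\alpha|$ exceeds $b/(1 - a)$. The following lines are only a small numerical sketch of this arithmetic; the function names `kato_threshold` and `relative_bound` are illustrative assumptions and do not appear in the text.

```python
def kato_threshold(a: float, b: float) -> float:
    """Return b / (1 - a); for |alpha| larger than this value, a + b/|alpha| < 1.

    Assumes the constants from (261): 0 <= a < 1 and b >= 0.
    """
    if not (0.0 <= a < 1.0) or b < 0.0:
        raise ValueError("need 0 <= a < 1 and b >= 0")
    return b / (1.0 - a)


def relative_bound(a: float, b: float, alpha: float) -> float:
    """Upper bound a + b/|alpha| for the norm of B(A - i*alpha*I)^(-1)."""
    if alpha == 0.0:
        raise ValueError("alpha must be nonzero")
    return a + b / abs(alpha)


# Sample constants (hypothetical): a = 1/2, b = 2.
alpha_0 = kato_threshold(0.5, 2.0)
bound_beyond = relative_bound(0.5, 2.0, 2.0 * alpha_0)
```

Any $\alpha_0$ strictly larger than the returned threshold can serve as the constant in (262).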
5.7. The Hamiltonian. Set $X = L_2^{\mathbb{C}}(\mathbb{R})$. Let
$$H := H_0 + U,$$
where the continuous function $U: \mathbb{R} \to \mathbb{R}$ vanishes outside a compact interval.
Let $H_0 := P^2/2m$ be the free Hamiltonian, where $P := \frac{\hbar}{i}\frac{d}{dx}$ with
$$D(P) := \{u \in X: u' \in X\}.$$
Show that H is self-adjoint.
Solution: Since $P: D(P) \subseteq X \to X$ is self-adjoint, by Example 8 in
Section 5.2, so is $P^2$. Moreover,
$$\|Uu\| \le \max_{x \in \mathbb{R}}|U(x)|\,\|u\| \quad \text{for all } u \in X.$$
The assertion follows now from Problem 5.6.
5.8. The Friedrichs extension in complex Hilbert spaces. Let $A: D(A) \subseteq X \to X$
be a symmetric operator on the complex Hilbert space X such
that
$$(Au \mid u) \ge c\|u\|^2 \quad \text{for all } u \in D(A),$$
where c is a real constant. For fixed $\lambda \in \mathbb{R}$ with $\lambda + c > 0$, define
$$(u \mid v)_\lambda := (Au \mid v) + \lambda(u \mid v)$$
and
$$\|u\|_\lambda := (u \mid u)_\lambda^{1/2} \quad \text{for all } u, v \in D(A).$$
Furthermore, let $X_\lambda$ be the set of all the points $u \in X$ such that there is
an admissible sequence $(u_n)$ for u, i.e., by definition,
(a) $u_n \in D(A)$ for all n,
(b) $u_n \to u$ in X as $n \to \infty$, and
(c) $(u_n)$ is a Cauchy sequence with respect to $\|\cdot\|_\lambda$.
Show that, for $\lambda, \mu \in \mathbb{R}$ with $\lambda + c > 0$ and $\mu + c > 0$, the following hold
true:
(i) $X_\lambda$ is a complex Hilbert space equipped with the inner product
$$(u \mid v)_\lambda := \lim_{n \to \infty}(u_n \mid v_n)_\lambda \quad \text{for all } u, v \in X_\lambda, \qquad (263)$$
where $(u_n)$ and $(v_n)$ are admissible sequences for u and v, respectively.
The limit (263) is independent of the chosen admissible sequences.
(ii) $X_\lambda = X_\mu$.
(iii) $\|\cdot\|_\lambda$ is equivalent to $\|\cdot\|_\mu$.
(iv) Set $D(A_F) := D(A^*) \cap X_\lambda$ and
$$A_F u := A^*u \quad \text{for all } u \in D(A_F).$$
Then, the operator $A_F: D(A_F) \subseteq X \to X$ is a self-adjoint extension of
A. In addition,
$$(A_F u \mid u) \ge c\|u\|^2 \quad \text{for all } u \in D(A_F).$$
The operator $A_F$ is called the Friedrichs extension.
Hint: Use the same arguments as in Section 5.3ff.
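5.9. A classical inequality. Show that, for all $u \in C_0^\infty(\mathbb{R}^3)$,
$$\int_{\mathbb{R}^3}(u_\xi^2 + u_\eta^2 + u_\zeta^2)\,dx \ge \frac{1}{4}\int_{\mathbb{R}^3}\frac{u^2}{r^2}\,dx, \qquad (264)$$
where $x = (\xi, \eta, \zeta)$ and $r := |x|$.
Solution: Let $u \in C_0^\infty(\mathbb{R}^3)$. Set $v = r^{1/2}u$. Then
$$u_\xi^2 + u_\eta^2 + u_\zeta^2 = r^{-1}(v_\xi^2 + v_\eta^2 + v_\zeta^2) - \tfrac{1}{2}r^{-2}(v^2)_r + (4r^3)^{-1}v^2,$$
where $(4r^3)^{-1}v^2 = u^2/4r^2$. Observe that, for sufficiently large R,
$$\int_{|x| \le R} r^{-2}(v^2)_r\,dx = \int_{|\omega| = 1}\Big(\int_0^R (v^2)_r\,dr\Big)\,dS_\omega = 0,$$
since $v(x) = 0$ for $x = 0$ and $|x| = R$.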
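5.10. The hydrogen atom and the Friedrichs extension. The Schrödinger
equation for the motion of the electron in the hydrogen atom is given
through
$$i\hbar\psi_t = \mathcal{H}\psi,$$
where $\mathcal{H} := -\frac{\hbar^2}{2m}\Delta + U$ with the Coulomb potential for the electron
$$U(x) := -\frac{e^2}{4\pi\varepsilon_0|x|} \quad \text{on } \mathbb{R}^3 - \{0\}.$$
Here m = mass of the electron, e = electric charge of the electron, and $\varepsilon_0$
= dielectricity constant. Let $D(\mathcal{H})$ consist of all the functions $w = u + iv$
with $u, v \in C_0^\infty(\mathbb{R}^3)$.
Show that there is a real number c such that
$$(\mathcal{H}u \mid u) \ge c(u \mid u) \quad \text{for all } u \in D(\mathcal{H}). \qquad (265)$$
Consequently, the self-adjoint Friedrichs extension $H := \mathcal{H}_F$ exists; H is
called the Hamiltonian of the hydrogen atom.
Solution: Let $u \in C_0^\infty(\mathbb{R}^3)$. For each given $a > 0$, there is a $b > 0$ such
that
$$\frac{1}{r} \le \frac{a}{r^2} + b \quad \text{for all } r > 0. \qquad (266)$$
Hence
$$(\mathcal{H}u \mid u) = \int_{\mathbb{R}^3}\Big\{\frac{\hbar^2}{2m}(u_\xi^2 + u_\eta^2 + u_\zeta^2) - \frac{e^2}{4\pi\varepsilon_0}\frac{u^2}{r}\Big\}\,dx.$$
Using (264) this implies
$$(\mathcal{H}u \mid u) \ge c(u \mid u),$$
provided we choose the number a sufficiently small in (266). Finally, let
$w \in D(\mathcal{H})$. Then, $w = u + iv$, where $u, v \in C_0^\infty(\mathbb{R}^3)$. Since u and v are real
functions,
$$(\mathcal{H}w \mid w) = (\mathcal{H}u \mid u) - i(\mathcal{H}v \mid u) + i(u \mid \mathcal{H}v) + (\mathcal{H}v \mid v)$$
$$= (\mathcal{H}u \mid u) + (\mathcal{H}v \mid v) \ge c(u \mid u) + c(v \mid v) = c(w \mid w).$$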
5.11**. The spectrum of the hydrogen atom. Show that the spectrum $\sigma(H)$
of the Hamiltonian H from Problem 5.10 consists of the eigenvalues
$$E_n = -\frac{me^4}{2(4\pi\varepsilon_0)^2\hbar^2 n^2}, \quad n = 1, 2, \ldots,$$
and of the essential spectrum
$$\sigma_{ess}(H) = [0, \infty[.$$
Hint: Study the proof in Triebel (1972). In terms of a classical picture,
the eigenvalues $E_n$ and $E \in \sigma_{ess}(H)$ correspond to the energy of bounded
orbits (Figure 5.16(a)) and unbounded orbits (Figure 5.16(b)), respectively.
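5.12. Trace class operators. Let $A, B: X \to X$ be linear operators on the
Hilbert space X over $\mathbb{K}$ with $0 < \dim X < \infty$. Show the following:
(i) The operator A is of trace class.
(ii) $\operatorname{tr}(AB) = \operatorname{tr}(BA)$.
(iii) $\operatorname{tr} A^* = \overline{\operatorname{tr} A}$.
(iv) $\operatorname{tr}(\alpha A + \beta B) = \alpha \operatorname{tr} A + \beta \operatorname{tr} B$.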
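Solution: Ad (i). Let $\{u_j\}$ and $\{v_j\}$ be two complete orthonormal systems
in X. Then, by the Dirac calculus from Section 5.21,
$$(v_m \mid Av_m) = \sum_{k,r}(v_m \mid u_k)(u_k \mid Au_r)(u_r \mid v_m),$$
and hence
$$\sum_m (v_m \mid Av_m) = \sum_{m,k,r}(v_m \mid u_k)(u_k \mid Au_r)(u_r \mid v_m) = \sum_{k,r}(u_k \mid Au_r)\sum_m (u_r \mid v_m)(v_m \mid u_k) = \sum_k (u_k \mid Au_k).$$
Thus, the sum $\sum_k (u_k \mid Au_k)$ is finite and independent of the chosen orthonormal system.
[Figure 5.16: the electron in (a) a bound state and (b) a scattering state.]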
Ad (ii). Observe that
$$\operatorname{tr}(AB) = \sum_k (u_k \mid ABu_k) = \sum_{k,m}(u_k \mid Av_m)(v_m \mid Bu_k)$$
$$= \sum_{k,m}(v_m \mid Bu_k)(u_k \mid Av_m) = \sum_m (v_m \mid BAv_m) = \operatorname{tr}(BA).$$
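The identities (ii) and (iii) can be checked numerically on small matrices. The following is only an illustrative sketch in plain Python, assuming that operators on $\mathbb{C}^2$ are represented by nested lists with respect to the standard orthonormal basis; the helper names `matmul`, `trace`, and `adjoint` and the sample matrices are assumptions made for this example.

```python
def matmul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(A):
    """Sum of the diagonal entries, i.e. sum_k (e_k | A e_k) for the standard basis."""
    return sum(A[i][i] for i in range(len(A)))


def adjoint(A):
    """Conjugate transpose of a square matrix."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]


# Two arbitrary sample operators on C^2 (hypothetical data).
A = [[1 + 2j, 0.5], [3, -1j]]
B = [[2, 1j], [-1, 4 + 1j]]

tr_AB = trace(matmul(A, B))
tr_BA = trace(matmul(B, A))
tr_A_star = trace(adjoint(A))
```

For these sample matrices both traces equal $2.5 + 3i$, although the products AB and BA themselves differ.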
5.13. The extension of isometric operators. Let $C: D(C) \subseteq X \to X$ be a
linear isometric operator on the Hilbert space X over $\mathbb{K}$, i.e.,
$$\|Cu\| = \|u\| \quad \text{for all } u \in D(C).$$
Suppose that D(C) is a closed linear subspace of X. Let
$$\dim D(C)^\perp < \infty \quad \text{and} \quad \dim R(C)^\perp < \infty,$$
where "$\perp$" denotes the orthogonal complement. Show that the following
two statements are equivalent:
(i) There exists a linear unitary operator $U: X \to X$ with $C \subseteq U$.
(ii) $\dim D(C)^\perp = \dim R(C)^\perp$.
Hint: Let $\{u_1, \ldots, u_n\}$ and $\{v_1, \ldots, v_n\}$ be an orthonormal basis of
$D(C)^\perp$ and $R(C)^\perp$, respectively. Set $Uu_j := v_j$.
5.14. The Cayley transform. Let $A: D(A) \subseteq X \to X$ be a linear symmetric
operator on the complex Hilbert space X. The operator
$$C_A := (A - iI)(A + iI)^{-1}$$
is called the Cayley transform of A. Show that
(i) $D(C_A) = R(A + iI)$ and $R(C_A) = R(A - iI)$.
(ii) $C_A$ is graph closed iff A is graph closed.
(iii) $C_A$ is isometric, i.e., $\|C_A u\| = \|u\|$ for all $u \in D(C_A)$.
(iv) $C_A$ is unitary iff A is self-adjoint.
(v) Let $B: D(B) \subseteq X \to X$ be linear and symmetric. Then,
$A \subseteq B$ iff $C_A \subseteq C_B$.
(vi) If A is graph closed, then $D(C_A)$ and $R(C_A)$ are closed linear subspaces
of X.
Hint: Use Problems 5.3 through 5.5. Cf. Riesz and Nagy (1955), Section
123.
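In the simplest situation, where A acts on an eigenvector with real eigenvalue a, the Cayley transform acts as multiplication by the number $(a - i)/(a + i)$, which has modulus 1. The following sketch only illustrates this scalar case; the function names `cayley` and `inverse_cayley` and the sample eigenvalues are assumptions made for the example.

```python
def cayley(a: float) -> complex:
    """Scalar Cayley transform (a - i)/(a + i) of a real number a."""
    return (a - 1j) / (a + 1j)


def inverse_cayley(c: complex) -> complex:
    """Recover a from c = (a - i)/(a + i) via a = i(1 + c)/(1 - c); needs c != 1."""
    return 1j * (1 + c) / (1 - c)


# Hypothetical real eigenvalues of a self-adjoint operator.
eigenvalues = [-3.0, -0.5, 0.0, 2.0, 10.0]
images = [cayley(a) for a in eigenvalues]
moduli = [abs(c) for c in images]
recovered = [inverse_cayley(c) for c in images]
```

All images lie on the unit circle, in agreement with statement (iii), and the value 1 is never attained for finite a.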
5.15. The extension of symmetric operators to self-adjoint operators. Let
$A: D(A) \subseteq X \to X$ be a linear, symmetric, graph closed operator on the
complex Hilbert space X. The two numbers
$$n_\pm := \dim N(A^* \pm iI)$$
are called the defect indices of A. By Problem 5.4,
$$n_\pm = \dim R(A \pm iI)^\perp.$$
Suppose that $n_\pm < \infty$. Show that the following two statements are equivalent:
(i) The operator A can be extended to a self-adjoint operator.
(ii) $n_+ = n_-$.
In particular, A has no proper self-adjoint extension iff $n_+ = n_- = 0$.
Hint: By Problem 5.14, it is sufficient to study unitary extensions of
the Cayley transform $C_A$. Observe that $n_+ = \dim D(C_A)^\perp$ and
$n_- = \dim R(C_A)^\perp$. Now use Problem 5.13.
5.16. Essentially self-adjoint operators. Let $A: D(A) \subseteq X \to X$ be a symmetric
operator on the complex Hilbert space X. Show that the following
three statements are equivalent:
(i) A is essentially self-adjoint, i.e., by definition, the closure $\bar A$ is self-adjoint.
(ii) A has exactly one self-adjoint extension.
(iii) $N(A^* \pm iI) = \{0\}$.
Solution: (i) $\Rightarrow$ (ii). Let B be a self-adjoint extension of A, i.e.,
$$A \subseteq B \quad \text{and} \quad B^* = B.$$
Since B is graph closed and $\bar A = A^{**}$, by Problem 5.4(iv), we get
$$A^{**} \subseteq B.$$
Hence $B^* \subseteq (A^{**})^*$. This implies
$$B \subseteq A^{**},$$
since $\bar A$ is self-adjoint. Thus, $B = A^{**} = \bar A$.
(i) $\Leftrightarrow$ (iii). By Problem 5.5(iv), $\bar A$ is self-adjoint iff $N((\bar A)^* \pm iI) = \{0\}$.
Note that $(\bar A)^* = A^*$.
(ii) $\Leftrightarrow$ (iii). Cf. Problem 5.15.
5.17**. The Gelfand-Kostyuchenko theorem on generalized eigenvectors.
Set $X := L_2^{\mathbb{C}}(\mathbb{R})$. Let
$$A: \mathcal{S} \to X$$
be a linear symmetric operator which can be extended to a self-adjoint
operator $A: D(A) \subseteq X \to X$. Then, A has a complete system $\mathcal{F} = \{F\}$ of
generalized eigenvectors, i.e., for all $F \in \mathcal{F}$ the following hold:$^{20}$
(i) $F \in \mathcal{S}'$ and $F \neq 0$.
(ii) $F(Au) = \lambda F(u)$ for all $u \in \mathcal{S}$, where $\lambda \in \mathbb{R}$.
(iii) $F(u) = 0$ for all $F \in \mathcal{F}$ implies $u = 0$.
This result is a special case of a famous general theorem. Study the
proof in Gelfand and Shilov (1964), Vol. 4, Chapter 1, §4. This proof is based
on the theory of nuclear spaces and deep results from spectral theory.
$^{20}$ The space $\mathcal{S}$ was introduced in Section 3.7.

Epilogue
If one does not sometimes think the illogical, one will never discover
new ideas in science.
Max Planck, 1945
Mathematics is not a deductive science-that's a cliche. When you

try to prove a theorem, you don't just list the hypotheses, and then
start to reason. What you do is trial-and-error, experimentation, and
guesswork.
Paul Halmos, 1985
The most vitally characteristic fact about mathematics, in my opin-

ion, is its quite peculiar relationship to the natural sciences, or more
generally, to any science which interprets experience on a higher more
than on a purely descriptive level. ...
I think that this is a relatively good approximation to truth-
which is much too complicated to allow anything but approxima-
tions -that mathematical ideas originate in empirics, although the
genealogy is sometimes long and obscure. But, once they are so con-
ceived, the subject begins to live a peculiar life of its own and is
better, compared to a creative one, governed by almost entirely aes-
thetic motivations, than to anything else and, in particular, to an
empirical science ....
But there is a grave danger that the subject will develop along the
line of least resistance, that the stream, so far from its source, will
separate into a multitude of insignificant tributaries, and that the

discipline will become a disorganized mass of details and complexi-
ties. In other words, at a great distance from its empirical sources, or
after much "abstract" inbreeding, a mathematical object is in danger
of degeneration. At the inception, the style is usually classical; when
it shows signs of becoming baroque, then the danger signal is up ....
Whenever this stage is reached, then the only remedy seems to be
a rejuvenating return to the source: the reinjection of more or less
directly empirical ideas. I am convinced that this was a necessary
condition to conserve the freshness and the vitality of the subject
and that this will remain equally true in the future.
John von Neumann, 1947
Mathematics is an ancient art, and from the outset it has been both
the most highly esoteric and the most intensely practical of human
endeavors. As long ago as 1800 B.C., the Babylonians investigated
the abstract properties of numbers; and in Athenian Greece, geome-
try attained the highest intellectual status. Alongside this theoretical
understanding, mathematics blossomed as a day-to-day tool for sur-
veying lands, for navigation, and for the engineering of public works.
The practical problems and the theoretical pursuits stimulated one
another; it would be impossible to disentangle these two strands.
Much the same is true today. In the twentieth century, mathemat-
ics has burgeoned in scope and in diversity and has been deepened
in its complexity and abstraction. So profound has this explosion
of research been that entire areas of mathematics may seem unin-
telligible to laymen-and frequently to mathematicians working in
other subfields. Despite this trend towards-indeed because of it-
mathematics has become more concrete and vital than ever before.
In the past quarter of a century, mathematics and mathematical
techniques have become an integral, pervasive, and essential compo-
nent of science, technology, and business. In our technically oriented
society, "innumeracy" has replaced illiteracy as our principal edu-
cational gap. One could compare the contributions of mathematics
to our society with the necessity of air and food for life. In fact, we
could say that we live in the age of mathematics-that our culture
has been "mathematized." No reflection of mathematics around us
is more striking than the omnipresent computer ....
There is an exciting development taking place right now, reuni-
fication of mathematics with theoretical physics ....
In the last ten or fifteen years mathematicians and physicists re-
alized that modern geometry is in fact the natural framework for
gauge theory (cf. Sections 2.20ff in AMS Vol. 109). The gauge po-
tential or gauge theory is the connection of mathematics. The gauge
field is the mathematical curvature defined by the connection; cer-

tain "charges" in physics are the topological invariants studied by
mathematicians. While the mathematicians and physicists worked
separately on similar ideas, they did not just duplicate each other's
efforts. The mathematicians produced general, far-reaching theories
and investigated their ramifications. Physicists worked out details
of certain examples which turned out to describe nature beautifully
and elegantly. When the two met again, the results are more powerful
than either anticipated ....
In mathematics we now have a new motivation to use specific
insights from the examples worked out by physicists. This signals
the return to an ancient tradition ....
Mathematical research should be as broad and as original as pos-
sible, with very long-range goals. We expect history to repeat itself:
we expect that the most profound and useful future applications of
mathematics cannot be predicted today, since they will arise from
mathematics yet to be discovered.
Arthur M. Jaffe, 1984
Mathematics is an organ of knowledge and an infinite refinement of

language. It grows from the usual language and world of intuition as
does a plant from the soil, and its roots are the numbers and simple
geometrical intuitions. We do not know which kind of content math-
ematics (as the only adequate language) requires; we cannot imagine
into what depths and distances this spiritual eye (mathematics) will
lead us.
Erich Kahler, 1941

Appendix
Almost all concepts, which relate to the modern measure and inte-
gration theory, go back to the works of Henri Lebesgue (1875-1941).
The introduction of these concepts was the turning point in the tran-
sition from mathematics of the nineteenth century to mathematics
of the twentieth century.
Naum Jakovlevic Vilenkin, 1975
For the convenience of the reader we summarize a number of important

results about the following topics:
the Lebesgue measure;
the Lebesgue integral;
ordered sets and Zorn's lemma.
The Lebesgue Measure

Let us consider the space $\mathbb{R}^N$ for fixed N = 1, 2, ....
By an N-cuboid we understand the set
$$C := \{(\xi_1, \ldots, \xi_N) \in \mathbb{R}^N: a_j < \xi_j < b_j \text{ for } j = 1, \ldots, N\},$$
where $a_j$ and $b_j$ are fixed real numbers with $a_j < b_j$ for all j. The volume
of C is defined through
$$\operatorname{vol}(C) := \prod_{j=1}^N (b_j - a_j).$$
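The Lebesgue measure $\mu$ generalizes the classical volume of sufficiently
regular sets in $\mathbb{R}^N$ to certain "irregular" sets.
More precisely, we have the following quite natural situation. There exists
a collection $\mathcal{A}$ of subsets of $\mathbb{R}^N$ which has the following properties:
(i) Each open or closed subset of $\mathbb{R}^N$ belongs to $\mathcal{A}$.
(ii) If $A, B \in \mathcal{A}$, then
$$A \cup B \in \mathcal{A}, \quad A \cap B \in \mathcal{A}, \quad \text{and} \quad A - B \in \mathcal{A}.$$
(iii) If $A_n \in \mathcal{A}$ for all n = 1, 2, ..., then
$$\bigcup_{n=1}^\infty A_n \in \mathcal{A} \quad \text{and} \quad \bigcap_{n=1}^\infty A_n \in \mathcal{A}.$$
(iv) To each set A in $\mathcal{A}$ there is assigned a number $\mu(A)$, where
$$0 \le \mu(A) \le \infty.$$
Here, $\mu(A)$ is called the (N-dimensional) measure of A, and the sets A in
$\mathcal{A}$ are called measurable (in $\mathbb{R}^N$).
(v) If $A, B \in \mathcal{A}$ and $A \cap B = \emptyset$, then
$$\mu(A \cup B) = \mu(A) + \mu(B).$$
If $A_n \in \mathcal{A}$ for all n = 1, 2, ... and $A_n \cap A_m = \emptyset$ for all n, m with $n \neq m$,
then
$$\mu\Big(\bigcup_{n=1}^\infty A_n\Big) = \sum_{n=1}^\infty \mu(A_n).$$
Here, we use "$\infty + a = \infty$" for each a with $0 \le a \le \infty$.
(vi) If C is an N-cuboid, then $C \in \mathcal{A}$ and
$$\mu(C) = \operatorname{vol}(C).$$
(vii) The subset A of $\mathbb{R}^N$ has the N-dimensional measure zero, i.e., $A \in \mathcal{A}$
and
$$\mu(A) = 0$$
iff, for each $\varepsilon > 0$, there is a countable number of N-cuboids $C_1, C_2, \ldots$
such that
$$A \subseteq \bigcup_{j=1}^\infty C_j \quad \text{and} \quad \sum_{j=1}^\infty \operatorname{vol}(C_j) < \varepsilon.$$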
(viii) If the set A has the N-dimensional measure zero and $B \subseteq A$, then
the set B also has the N-dimensional measure zero.
(ix) The collection $\mathcal{A}$ is minimal, i.e., if a collection $\mathcal{A}'$ satisfies conditions
(i) through (viii), then $\mathcal{A} \subseteq \mathcal{A}'$.
The measure $\mu$ is unique on $\mathcal{A}$.
$\mu$ is called the Lebesgue measure. As usual, we write "meas" instead of $\mu$,
i.e.,
$$\operatorname{meas}(A) := \mu(A) \quad \text{for all } A \in \mathcal{A}.$$
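Example. A finite or countable number of points in $\mathbb{R}^N$ has the N-dimensional
measure zero.
In particular, the set $\mathbb{Q}$ of rational numbers has the one-dimensional
measure zero in $\mathbb{R}$, and the set
$$\mathbb{Q}^N := \{(\xi_1, \ldots, \xi_N) \in \mathbb{R}^N: \xi_j \in \mathbb{Q} \text{ for all } j\}$$
has the N-dimensional measure zero in $\mathbb{R}^N$.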
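The reason why a countable set has measure zero can be made concrete for N = 1: the j-th point of an enumeration is covered by an open interval of length $\varepsilon/2^j$, so that the total length stays below $\varepsilon$. The following sketch checks this with exact rational arithmetic for finitely many points; the names `cover` and `total_length`, the sample enumeration, and the value of $\varepsilon$ are assumptions made for the illustration.

```python
from fractions import Fraction


def cover(points, eps):
    """Open intervals ]x_j - eps/2**(j+1), x_j + eps/2**(j+1)[ for j = 1, 2, ...

    The j-th interval has length eps/2**j.
    """
    return [(x - eps / 2 ** (j + 1), x + eps / 2 ** (j + 1))
            for j, x in enumerate(points, start=1)]


def total_length(intervals):
    """Sum of the lengths of the covering intervals."""
    return sum(b - a for a, b in intervals)


# A finite piece of an enumeration of the rationals in [0, 1] (repetitions allowed).
points = [Fraction(p, q) for q in range(1, 8) for p in range(q + 1)]
eps = Fraction(1, 100)
intervals = cover(points, eps)
length = total_length(intervals)
```

For n points the total length equals $\varepsilon(1 - 2^{-n})$, which remains below $\varepsilon$ however many points of the enumeration are used.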
Convention. By definition, a property P holds true "almost everywhere"
iff P holds true for all points of $\mathbb{R}^N$ with the exception of a set of N-dimensional
measure zero.
One also uses "almost all." For example, almost all real numbers are
irrational. Let $M \subseteq \mathbb{R}^N$. We write
$$u(x) = \lim_{n \to \infty} u_n(x) \quad \text{for almost all } x \in M$$
iff this limiting relation holds for all $x \in M - Z$, where the set Z has the
N-dimensional measure zero.
Approximation Property. The Lebesgue measure is regular, i.e., for each
measurable set M in $\mathbb{R}^N$, we have
$$\operatorname{meas}(M) = \inf \operatorname{meas}(G),$$
where the infimum is taken over all the open subsets G of $\mathbb{R}^N$ with $M \subseteq G$.
In particular,
$$\operatorname{meas}(\mathbb{R}^N) = +\infty \quad \text{and} \quad \operatorname{meas}(\emptyset) = 0.$$
[Figure A.1: graph of a step function u over the interval [a, b].]
Step Functions
Recall that $\mathbb{K} = \mathbb{R}$ or $\mathbb{K} = \mathbb{C}$. A function
$$u: M \subseteq \mathbb{R}^N \to \mathbb{K}$$
is called a step function iff u is piecewise constant. To be precise, we suppose
that the set M is measurable and that there exists a finite number of
pairwise disjoint measurable subsets $M_j$ of M such that $\operatorname{meas}(M_j) < \infty$
for all j and
$$u(x) = \begin{cases} a_j & \text{for } x \in M_j \text{ and all } j, \\ 0 & \text{otherwise,} \end{cases}$$
where $a_j \in \mathbb{K}$ for all j.
The integral of a step function u is defined through
$$\int_M u\,dx := \sum_j a_j \operatorname{meas}(M_j).$$
Example. Let $u: [a, b] \to \mathbb{R}$ be a step function as pictured in Figure A.1.
Then the integral of u defined above is equal to the classic integral.
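The integral of a step function is a finite sum of the products "value times measure". A minimal sketch, assuming that a step function on an interval is described by a list of pairs $(a_j, \operatorname{meas}(M_j))$; the function name `integrate_step` and the sample data are hypothetical.

```python
def integrate_step(pieces):
    """Integral of a step function given as pairs (a_j, meas(M_j)).

    The sets M_j are assumed to be pairwise disjoint with finite measure.
    """
    return sum(value * measure for value, measure in pieces)


# Sample step function on [0, 4]:
# u = 2 on [0, 1[, u = -1 on [1, 3[, u = 0.5 on [3, 4].
pieces = [(2.0, 1.0), (-1.0, 2.0), (0.5, 1.0)]
integral = integrate_step(pieces)
```

Here the result is $2 \cdot 1 - 1 \cdot 2 + 0.5 \cdot 1 = 0.5$, which agrees with the classic integral of this piecewise constant function.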
Measurable Functions
The function
$$u: M \subseteq \mathbb{R}^N \to \mathbb{K}$$
is called measurable iff the following hold:
(i) The domain of definition M is measurable.
(ii) There exists a sequence $(u_n)$ of step functions $u_n: M \to \mathbb{K}$ such that
$$u(x) = \lim_{n \to \infty} u_n(x) \quad \text{for almost all } x \in M.$$
Theorem of Luzin. Let M be a measurable subset of $\mathbb{R}^N$. Then, the
function
$$u: M \to \mathbb{K}$$
is measurable iff it is continuous up to small sets, i.e., for each $\delta > 0$, there
is an open subset $M_\delta$ of $\mathbb{R}^N$ such that the function
$$u: M - M_\delta \to \mathbb{K}$$
is continuous and $\operatorname{meas}(M_\delta) < \delta$.
Standard Example. The function $f: M \subseteq \mathbb{R}^N \to \mathbb{K}$ is measurable if it is
almost everywhere continuous on the measurable set M (e.g., M is open
or closed).
Calculus. Linear combinations and limits of measurable functions are
again measurable.
More precisely, we set
$$F(x) := a(x)u(x) + b(x)v(x), \quad G(x) := |u(x)|,$$
$$H(x) := \lim_{n \to \infty} u_n(x), \qquad (L)$$
and we assume that the functions
$$a, b, u, v, u_n: M \subseteq \mathbb{R}^N \to \mathbb{K}$$
are measurable for all n and the limit (L) exists for all $x \in M$. Then, the
functions
$$F, G, H: M \subseteq \mathbb{R}^N \to \mathbb{K}$$
are also measurable.
Modification of Measurable Functions. If we change a measurable
function at the points of a set of measure zero, then the modified function
is again measurable.
For example, if the limit (L) exists only for almost all $x \in M$, i.e., for all
$x \in M - Z$ with $\operatorname{meas}(Z) = 0$, and if we set $H(x) := 0$ for all $x \in Z$, then the
function $H: M \to \mathbb{K}$ is measurable provided all the functions $u_n: M \to \mathbb{K}$
are measurable.
The Lebesgue Integral

The definition of the Lebesgue integral is based on the very natural formula
$$\int_M u\,dx := \lim_{n \to \infty}\int_M u_n\,dx \qquad (A)$$
together with the following two formulas
$$u(x) = \lim_{n \to \infty} u_n(x) \quad \text{for almost all } x \in M \qquad (B)$$
and
$$\int_M |u_n - u_m|\,dx < \varepsilon \quad \text{for all } n, m \ge n_0(\varepsilon). \qquad (C)$$
Definition of the Lebesgue Integral. Let M be a nonempty measurable
set. The function $u: M \subseteq \mathbb{R}^N \to \mathbb{K}$ is called integrable (over M) iff the
following two conditions are satisfied:
(i) There is a sequence $(u_n)$ of step functions $u_n: M \to \mathbb{K}$ such that (B)
holds.
(ii) For each $\varepsilon > 0$, there is a number $n_0(\varepsilon)$ such that (C) holds.
If u is integrable, then we define the integral through (A).
This definition makes sense since the limit exists in (A), and this limit
is independent of the choice of the sequence $(u_n)$.
Obviously, each integrable function is measurable.
For the empty set $M = \emptyset$ we define $\int_\emptyset u\,dx = 0$. We also use the following
terminology synonymously:
(a) $\int_M u\,dx$ exists;
(b) u is integrable (over M);
(c) $|\int_M u\,dx| < \infty$.
Standard Example 1. Let M be a bounded open or compact subset of
$\mathbb{R}^N$, and suppose that the function
$$u: M \to \mathbb{K}$$
is bounded and continuous almost everywhere, i.e., there is a set $Z \subseteq M$
with $\operatorname{meas}(Z) = 0$ such that u is continuous on the set $M - Z$ and
$$|u(x)| \le \text{const} \quad \text{for all } x \in M.$$
Then, u is integrable over M.
Standard Example 2. Let the function $u: M \subseteq \mathbb{R}^N \to \mathbb{K}$ be almost
everywhere continuous on the measurable set M (e.g., $M = \mathbb{R}^N$). Suppose
that
$$|u(x)| \le \frac{\text{const}}{(1 + |x|)^\alpha} \quad \text{for all } x \in M \qquad (G)$$
and fixed $\alpha > N$. Then, u is integrable over M.
Condition (G) controls the growth of the function u as $|x| \to \infty$.
Standard Example 3. Let the function $u: M \subseteq \mathbb{R}^N \to \mathbb{R}$ be almost everywhere
continuous on the bounded measurable set M (e.g., M is bounded
and open or M is compact). Suppose that there is a point $x_0$ in M such
that
$$|u(x)| \le \frac{\text{const}}{|x - x_0|^\beta} \quad \text{for all } x \in M \text{ with } x_0 \neq x \qquad (H)$$
and fixed $\beta$: $0 \le \beta < N$. Then, u is integrable over M.
Condition (H) controls the growth of the function u as $x \to x_0$.
Measure. Let M be a measurable subset of $\mathbb{R}^N$ with $\operatorname{meas}(M) < \infty$. Then
$$\int_M dx = \operatorname{meas}(M),$$
where we write $\int_M dx$ instead of $\int_M u\,dx$ with $u \equiv 1$.
Linearity. Let the functions $u, v: M \to \mathbb{K}$ be integrable over M and let
$\alpha, \beta \in \mathbb{K}$. Then, the function $\alpha u + \beta v$ is also integrable over M and
$$\int_M (\alpha u + \beta v)\,dx = \alpha\int_M u\,dx + \beta\int_M v\,dx.$$
Absolute Integrability. Let $u: M \subseteq \mathbb{R}^N \to \mathbb{K}$ be a measurable function.
Then
$$\int_M u\,dx \text{ exists} \quad \text{iff} \quad \int_M |u|\,dx \text{ exists.}$$
In addition, if one of these two integrals exists, then we have the generalized
triangle inequality
$$\Big|\int_M u\,dx\Big| \le \int_M |u|\,dx.$$
Transformation rule. Let the function $u: M \subseteq \mathbb{R}^N \to \mathbb{R}$ be integrable
over the nonempty open set M. Suppose that the function $f: K \to M$ is a
$C^1$-diffeomorphism$^{21}$ from the open subset K of $\mathbb{R}^N$ onto M. Then
$$\int_M u(x)\,dx = \int_K u(f(y))\,|\det f'(y)|\,dy.$$
Here, $\det f'(y)$ denotes the determinant of the first partial derivatives of
the function f at the point y.
$^{21}$ That is, f is bijective and both f and $f^{-1}$ are $C^1$.
Majorant Criterion. Let the function $u: M \subseteq \mathbb{R}^N \to \mathbb{K}$ be measurable,
and suppose that there exists a function $g: M \to \mathbb{R}$ that is integrable over
M such that
$$|u(x)| \le g(x) \quad \text{for almost all } x \in M.$$
Then, the functions u and |u| are also integrable over M and
$$\Big|\int_M u\,dx\Big| \le \int_M |u|\,dx \le \int_M g\,dx.$$
Vanishing Integrals. Let $u: M \subseteq \mathbb{R}^N \to \mathbb{R}$ be a measurable function such
that $u(x) \ge 0$ for all $x \in M$. Then
$$\int_M u\,dx = 0 \quad \text{iff} \quad u(x) = 0 \text{ for almost all } x \in M.$$
Let the function $v: M \to \mathbb{K}$ be integrable. Then, the integral $\int_M v\,dx$
remains unchanged if we change the function v at the points of a set of
N-dimensional measure zero.
Additivity with respect to domains. Let M and K be two disjoint
measurable subsets of $\mathbb{R}^N$, and suppose that the function $u: M \cup K \to \mathbb{K}$
is integrable over M and K. Then, u is also integrable over $K \cup M$, and
$$\int_{K \cup M} u\,dx = \int_K u\,dx + \int_M u\,dx.$$
Convergence with respect to domains. Let $u: M \subseteq \mathbb{R}^N \to \mathbb{K}$ be a
function. Suppose that
$$M_1 \subseteq M_2 \subseteq \cdots \quad \text{and} \quad M = \bigcup_{n=1}^\infty M_n.$$
Then, u is integrable over M iff u is integrable over all sets $M_n$ and
$\sup_n \int_{M_n} |u|\,dx < \infty$. In this case,
$$\int_M u\,dx = \lim_{n \to \infty}\int_{M_n} u\,dx.$$
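Absolute Continuity. Let $u: M \subseteq \mathbb{R}^N \to \mathbb{K}$ be integrable. Then, for each
$\varepsilon > 0$, there is a $\delta > 0$ such that
$$\Big|\int_A u\,dx\Big| < \varepsilon$$
holds true for all measurable subsets A of M with $\operatorname{meas}(A) < \delta$.
Reduction to Bounded Sets. Let M be a nonempty unbounded measurable
subset of $\mathbb{R}^N$, N = 1, 2, ..., and let the function $u: M \to \mathbb{K}$ be
integrable.
Then, for each $\varepsilon > 0$, there is an open ball B in $\mathbb{R}^N$ such that
$$\Big|\int_{M - H} u\,dx\Big| \le \int_{M - H} |u|\,dx < \varepsilon,$$
where $H := M \cap B$. Hence
$$\Big|\int_M u\,dx - \int_H u\,dx\Big| < \varepsilon.$$
Observe that the set H is bounded.
p-Mean Continuity. Let $u: M \subseteq \mathbb{R}^N \to \mathbb{K}$ be a measurable function on
the nonempty bounded measurable set M. Suppose that
$$\int_M |u|^p\,dx < \infty$$
for fixed $p \ge 1$. Set $u(x) := 0$ outside M. Then, for each $\varepsilon > 0$, there is a
$\delta(\varepsilon) > 0$ such that
$$\int_M |u(x + h) - u(x)|^p\,dx < \varepsilon \quad \text{for all } h \in \mathbb{R}^N \text{ with } |h| < \delta(\varepsilon).$$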
Limits of Functions and Integrals

Theorem on Dominated Convergence. We have
$$\lim_{n \to \infty}\int_M u_n\,dx = \int_M \lim_{n \to \infty} u_n(x)\,dx,$$
where all the integrals and limits exist, provided the following two conditions
are satisfied:
(i) The functions $u_n: M \subseteq \mathbb{R}^N \to \mathbb{K}$ are measurable for all n and the
limit
$$\lim_{n \to \infty} u_n(x) \quad \text{exists for almost all } x \in M.$$
(ii) There is an integrable function $g: M \to \mathbb{R}$ such that
$$|u_n(x)| \le g(x) \quad \text{for almost all } x \in M \text{ and all } n.$$
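A simple instance of the theorem is $u_n(x) := x^n$ on $M = ]0, 1[$ with the majorant $g \equiv 1$: the pointwise limit is 0 on M, and the integrals $1/(n + 1)$ tend to 0 as well. The following sketch approximates these integrals by a midpoint rule; the helper `midpoint_integral`, the number of cells, and the chosen exponents are assumptions made for the illustration, and the midpoint rule only stands in for the Lebesgue integral of these continuous functions.

```python
def midpoint_integral(f, a, b, cells=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / cells
    return h * sum(f(a + (k + 0.5) * h) for k in range(cells))


def power(n):
    """The function u_n(x) = x**n, dominated by g(x) = 1 on ]0, 1[."""
    return lambda x: x ** n


# Approximate integrals of u_n over ]0, 1[; the exact values are 1/(n + 1).
integrals = {n: midpoint_integral(power(n), 0.0, 1.0) for n in (1, 10, 100)}
```

The computed values decrease towards 0, the integral of the limit function.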

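Theorem on Monotone Convergence. Let $(u_n)$ be a sequence of integrable
functions $u_n: M \subseteq \mathbb{R}^N \to \mathbb{R}$ such that
$$u_1(x) \le u_2(x) \le \cdots \quad \text{for all } x \in M$$
and
$$\int_M u_n\,dx \le C \quad \text{for all } n \text{ and fixed } C > 0.$$
Then, there exists an integrable function $u: M \to \mathbb{R}$ such that
$$u(x) = \lim_{n \to \infty} u_n(x) \quad \text{for almost all } x \in M$$
and
$$\lim_{n \to \infty}\int_M u_n\,dx = \int_M u\,dx.$$
Lemma of Fatou. Let $(u_n)$ be a sequence of integrable functions
$u_n: M \subseteq \mathbb{R}^N \to \mathbb{R}$. Suppose that
(a) $u_n(x) \ge 0$ for all $x \in M$ and all n.
(b) $\int_M u_n\,dx \le C$ for all n.
Then
$$\int_M \liminf_{n \to \infty} u_n(x)\,dx \le \liminf_{n \to \infty}\int_M u_n\,dx \le C.$$
More precisely,
$$u(x) := \liminf_{n \to \infty} u_n(x) \quad \text{is finite for almost all } x \in M.$$
If we set $u(x) := 0$ for all the points x of M with $\liminf_{n \to \infty} u_n(x) = \infty$, then
the function $u: M \to \mathbb{R}$ is integrable and
$$\int_M u\,dx \le \liminf_{n \to \infty}\int_M u_n\,dx \le C.$$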
Iterated Integration
Our goal is the following fundamental formula:
$$\int_M u(x, y)\,dx\,dy = \int_{\mathbb{R}^N}\Big(\int_{\mathbb{R}^L} u(x, y)\,dy\Big)dx = \int_{\mathbb{R}^L}\Big(\int_{\mathbb{R}^N} u(x, y)\,dx\Big)dy. \qquad (I)$$
Here, we set $u(x, y) = 0$ outside M. Furthermore, let $x \in \mathbb{R}^N$, $y \in \mathbb{R}^L$, and
$M \subseteq \mathbb{R}^{N+L}$.
Theorem of Fubini. Let $u: M \subseteq \mathbb{R}^{N+L} \to \mathbb{K}$ be integrable. Then formula
(I) holds true.
To be precise, the inner integrals exist for almost all $x \in \mathbb{R}^N$ (resp., for
almost all $y \in \mathbb{R}^L$), and the outer integrals exist.
Theorem of Tonelli. Let $u: M \subseteq \mathbb{R}^{N+L} \to \mathbb{K}$ be measurable. Then the
following two conditions are equivalent:
(i) The function u is integrable over M.
(ii) There exists at least one of the iterated integrals from (I) if u is
replaced by |u|, i.e., $\int(\int |u|\,dy)\,dx$ exists or $\int(\int |u|\,dx)\,dy$ exists.
If condition (ii) is satisfied, then all the assertions of Fubini's theorem
are valid.
Special Case. Let $M := \{(x, y) \in \mathbb{R}^2: a < x < b,\ c < y < d\}$, where
$-\infty \le a < b \le \infty$ and $-\infty \le c < d \le \infty$.
Then, N = L = 1 and formula (I) reads as follows:
$$\int_M u(x, y)\,dx\,dy = \int_a^b\Big(\int_c^d u(x, y)\,dy\Big)dx = \int_c^d\Big(\int_a^b u(x, y)\,dx\Big)dy.$$
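The special case can be tried out numerically on a bounded rectangle. The following sketch evaluates both iterated integrals of the sample function $u(x, y) = xy^2$ over $]0, 1[ \times ]0, 2[$ by midpoint sums; the helper names, the grid size, and the sample function are assumptions made for this illustration, and the sums only approximate the integrals.

```python
def midpoints(a, b, n):
    """Midpoints of n equal cells of [a, b], together with the cell width."""
    h = (b - a) / n
    return [a + (k + 0.5) * h for k in range(n)], h


def iterated_dy_dx(u, a, b, c, d, n=200):
    """Approximate the integral over x of the inner integral over y."""
    xs, hx = midpoints(a, b, n)
    ys, hy = midpoints(c, d, n)
    return sum(sum(u(x, y) for y in ys) * hy for x in xs) * hx


def iterated_dx_dy(u, a, b, c, d, n=200):
    """Approximate the integral over y of the inner integral over x."""
    xs, hx = midpoints(a, b, n)
    ys, hy = midpoints(c, d, n)
    return sum(sum(u(x, y) for x in xs) * hx for y in ys) * hy


def sample(x, y):
    """Sample integrand u(x, y) = x * y**2 with exact integral 4/3 on the rectangle."""
    return x * y * y


first_order = iterated_dy_dx(sample, 0.0, 1.0, 0.0, 2.0)
second_order = iterated_dx_dy(sample, 0.0, 1.0, 0.0, 2.0)
```

Both orders of integration give the same value up to rounding, close to the exact integral $\frac{1}{2} \cdot \frac{8}{3} = \frac{4}{3}$.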
Parameter Integrals
We consider the function
$$F(p) := \int_M f(x, p)\,dx$$
for all parameters $p \in P$. We are given the function
$$f: M \times P \to \mathbb{K},$$
where M is a measurable subset of $\mathbb{R}^N$ and P is a subset of $\mathbb{R}^L$ or $\mathbb{C}^L$.
Continuity. The function $F: P \to \mathbb{K}$ is well-defined and continuous provided
the following three conditions are satisfied:
(i) The function $x \mapsto f(x, p)$ is measurable on M for all parameters
$p \in P$.
(ii) There exists an integrable function $g: M \to \mathbb{R}$ such that
$$|f(x, p)| \le g(x) \quad \text{for all } p \in P \text{ and almost all } x \in M.$$
(iii) The function $p \mapsto f(x, p)$ is continuous on P for almost all $x \in M$.
Differentiability. Let P be a nonempty open subset of $\mathbb{R}$ or $\mathbb{C}$. Then, the
function $F: P \to \mathbb{K}$ is differentiable and
$$F'(p) = \int_M f_p(x, p)\,dx \quad \text{for all } p \in P,$$
provided the following two conditions are satisfied:
(i) The integral $\int_M f(x, p)\,dx$ exists for all parameters $p \in P$.
(ii) There exists an integrable function $g: M \to \mathbb{R}$ such that
$$|f_p(x, p)| \le g(x) \quad \text{for all } p \in P \text{ and almost all } x \in M.$$
This condition tacitly includes the existence of the partial derivative
$f_p(x, p)$ for all $p \in P$ and almost all $x \in M$.
Functions of Bounded Variation

Let $-\infty < a < b < \infty$. The function $g: [a, b] \to \mathbb{C}$ is called of bounded
variation iff
$$V(g) := \sup_{\mathcal{D}}\sum_{k=1}^n |g(x_k^{(n)}) - g(x_{k-1}^{(n)})| < \infty, \qquad (1)$$
where the supremum is taken over all the possible finite decompositions $\mathcal{D}$ of
the interval [a, b], i.e.,
$$a = x_0^{(n)} < x_1^{(n)} < \cdots < x_n^{(n)} = b \quad \text{with } n = 1, 2, \ldots. \qquad (2)$$
The number V(g) is called the total variation of the function g on the
interval [a, b].
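For a concrete function, each decomposition gives a lower bound for the total variation, and the supremum is reached once the decomposition resolves all turning points. The following sketch evaluates the sums from (1) for $g(x) = |x|$ on $[-1, 2]$, whose total variation is $1 + 2 = 3$; the names `variation_sum` and `uniform_partition` and the chosen decompositions are assumptions made for this example.

```python
def variation_sum(g, points):
    """Sum of |g(x_k) - g(x_{k-1})| over consecutive points of a decomposition."""
    return sum(abs(g(points[k]) - g(points[k - 1])) for k in range(1, len(points)))


def uniform_partition(a, b, n):
    """Decomposition of [a, b] into n cells of equal length."""
    return [a + (b - a) * k / n for k in range(n + 1)]


# g(x) = |x| on [-1, 2]: decreasing on [-1, 0], increasing on [0, 2].
coarse = variation_sum(abs, uniform_partition(-1.0, 2.0, 2))  # misses the point 0
fine = variation_sum(abs, uniform_partition(-1.0, 2.0, 3))    # contains the point 0
```

The coarse decomposition yields only 2, whereas every decomposition containing the turning point 0 yields the total variation 3.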
Theorem of Jordan. The function $g: [a, b] \to \mathbb{C}$ is of bounded variation
iff there exist nondecreasing functions $g_j: [a, b] \to \mathbb{R}$, j = 1, 2, 3, 4, such that
$$g(x) = g_1(x) - g_2(x) + i(g_3(x) - g_4(x)) \quad \text{for all } x \in [a, b]. \qquad (3)$$
The Classic Stieltjes Integral

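We are given the continuous function $f: [a, b] \to \mathbb{C}$ and the function
$g: [a, b] \to \mathbb{C}$ of bounded variation, where $-\infty < a < b < \infty$. Then, there
exists the limit
$$\int_a^b f(x)\,dg(x) := \lim_{n \to \infty}\sum_{k=1}^n f(x_k^{(n)})\big(g(x_k^{(n)}) - g(x_{k-1}^{(n)})\big), \qquad (4)$$
which is independent of the decomposition of the interval [a, b] from (2),
provided the maximal length of the subintervals goes to zero as $n \to \infty$.
Hence
$$\Big|\int_a^b f(x)\,dg(x)\Big| \le \Big(\max_{a \le x \le b}|f(x)|\Big)V(g).$$
If the function $f: \mathbb{R} \to \mathbb{C}$ is continuous and the function $g: \mathbb{R} \to \mathbb{C}$ is of
bounded variation on each compact interval, then we set
$$\int_{-\infty}^\infty f(x)\,dg(x) := \lim_{\substack{b \to +\infty \\ a \to -\infty}}\int_a^b f(x)\,dg(x),$$
provided this limit exists.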
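The sums that define the classic Stieltjes integral are easy to evaluate for smooth data. The following sketch uses a uniform decomposition and evaluates f at the right endpoint of each subinterval; this choice of evaluation point, the function name `stieltjes_sum`, and the sample functions $f(x) = x$, $g(x) = x^2$ on $[0, 1]$ are assumptions made for the illustration.

```python
def stieltjes_sum(f, g, a, b, n):
    """Sum of f(x_k) * (g(x_k) - g(x_{k-1})) over a uniform decomposition of [a, b]."""
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    return sum(f(xs[k]) * (g(xs[k]) - g(xs[k - 1])) for k in range(1, n + 1))


def identity(x):
    """Integrand f(x) = x."""
    return x


def square(x):
    """Integrator g(x) = x**2, nondecreasing on [0, 1] with V(g) = 1."""
    return x * x


# The limit is the integral of x * 2x over [0, 1], i.e. 2/3.
approx = stieltjes_sum(identity, square, 0.0, 1.0, 1000)
```

The value is close to 2/3 and respects the bound $(\max |f|)\,V(g) = 1$.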
The Lebesgue-Stieltjes Integral

If the function f is not continuous, then one introduces the so-called
Lebesgue-Stieltjes integral which is identical to the Lebesgue integral in the
special case where $g(x) := x$ for all $x \in \mathbb{R}$.
A summary of important properties of the Lebesgue-Stieltjes integral including
measure theory can be found in Zeidler (1986), Vol. 2B, Appendix.
Standard Example. Let $-\infty \le a < b \le \infty$. Then, the formula
$$\int_a^b f(x)\,dg(x) = \int_a^b f(x)g'(x)\,dx \qquad (5)$$
holds true provided the following assumptions are satisfied:
(i) The functions $f, h: ]a, b[ \to \mathbb{C}$ are measurable, and the functions h and
fh are integrable over ]a, b[, in the sense of the Lebesgue integral.
(ii) For all $x \in ]a, b[$,
$$g(x) := \int_a^x h(y)\,dy.$$
More precisely, under the assumptions (i) and (ii), the left-hand integral
from (5) exists in the sense of a Lebesgue-Stieltjes integral, whereas the
right-hand integral from (5) exists in the sense of a Lebesgue integral with
$g' = h$.
If, in addition, f is continuous on the closure of ]a, b[, then the left-hand
integral from (5) exists in the sense of a classic Stieltjes integral.
Ordered Sets and Zorn's Lemma

The set C is called ordered iff there is a relation, written as
$$u \le v,$$
among some pairs of elements of C such that the following hold:
(i) $u \le u$ for all $u \in C$.
(ii) If $u \le v$ and $v \le w$, then $u \le w$.
(iii) If $u \le v$ and $v \le u$, then u = v.
By a maximal element m of C we understand an element of C such that
$$m \le u \text{ and } u \in C \quad \text{imply} \quad m = u.$$
A nonempty subset T of C is called totally ordered iff, for all $u, v \in T$, we
have
$$u \le v \quad \text{or} \quad v \le u.$$
Zorn's Lemma. Let C be a nonempty ordered set which has the property
that each totally ordered subset T of C has an upper bound, i.e., there is an
element b of C such that
$$u \le b \quad \text{for all } u \in T,$$
where b depends on T.
Then, there exists a maximal element in C.
Example 1. Let S be a set, and let C be the collection of all the subsets
of S. For $u, v \in C$, we write
$$u \le v \quad \text{iff} \quad u \subseteq v.$$
Then, C becomes an ordered set.
Example 2. The set $\mathbb{R}$ of real numbers is totally ordered, but $\mathbb{R}$ does not
have any maximal element.
Zorn's lemma can be used in mathematics if the usual induction argument
fails, since the set under consideration is not countable. In Section 1.1
of AMS Vol. 109 we use Zorn's lemma in order to prove the Hahn-Banach
theorem.
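For finite ordered sets, maximal elements can be found by direct search, which also shows that a maximal element need not be unique. The following sketch uses the inclusion order from Example 1 on the proper subsets of $S = \{1, 2, 3\}$; the function name `maximal_elements` and the sample set are assumptions made for the illustration.

```python
from itertools import combinations


def maximal_elements(collection):
    """Elements m such that m <= u with u in the collection implies m == u.

    The order is set inclusion; `m < u` tests for a proper subset.
    """
    return [m for m in collection if not any(m < u for u in collection)]


S = {1, 2, 3}
# All proper subsets of S (sizes 0, 1, 2), ordered by inclusion.
proper_subsets = [frozenset(c)
                  for r in range(len(S))
                  for c in combinations(sorted(S), r)]
maximal = maximal_elements(proper_subsets)
```

Here the three two-element subsets are all maximal, while none of them is an upper bound for the whole collection.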
References
Abraham, R., Marsden, J., and Ratiu, T. (1983): Manifolds, Tensor Anal-
ysis, and Applications. Addison-Wesley, Reading, MA.
Albers, D., Alexanderson, G., and Reid, C. (1987): International Mathe-
matical Congresses: An Illustrated History 1893-1986. Springer-Verlag,
New York.
Albeverio, S. and Høegh-Krohn, R. (1975): Mathematical Theory of Feynman
Path Integrals. Lecture Notes in Mathematics, Vol. 523, Springer-Verlag,
Berlin, Heidelberg.
Albeverio, S. and Brzeźniak, Z. (1993): Finite-Dimensional Approximation
Approach to Oscillatory Integrals and Stationary Phase in Infinite Dimensions.
J. Funct. Anal. 113, 177-244.
Allgower, E. and Georg, K. (1990): Numerical Continuation Methods.
Springer-Verlag, New York.
Alt, H. (1992): Lineare Funktionalanalysis: eine anwendungsorientierte
Einführung. 2nd edition. Springer-Verlag, Berlin, Heidelberg.
Amann, H. (1990): Ordinary Differential Equations: An Introduction to
Nonlinear Analysis. De Gruyter, Berlin.
Amann, H. (1995): Linear and Quasilinear Parabolic Problems, Vol. 1.
Birkhäuser, Basel.
Ambrosetti, A. (1993): A Primer of Nonlinear Analysis. Cambridge University
Press, Cambridge, UK.
Ambrosetti, A. and Coti-Zelati, V. (1993): Periodic Solutions of Singular
Lagrangian Systems. Birkhäuser, Basel.
Antman, S. (1995): Nonlinear Elasticity. Springer-Verlag, New York.
Appell, J. and Zabrejko, P. (1990): Nonlinear Superposition Operators.
Cambridge University Press, Cambridge, UK.
Arnold, L. (1998): Stochastic Differential Equations: Theory and
Applications. Krieger, Malabar, FLA.
Arnold, V. and Khesin, B. (1997): Topological Methods in Hydrodynamics.
Aubin, J. (1977): Applied Functional Analysis. Wiley, New York.
Aubin, J. (1993): Optima and Equilibria: An Introduction to Nonlin-
ear Analysis. Springer-Verlag, Berlin, Heidelberg. (Translated from
French.)
Aubin, J. and Ekeland, I. (1983): Applied Nonlinear Functional Analysis.
Wiley, New York.
Baggett, L. (1992): Functional Analysis: A Primer. Marcel Dekker, New
York.
Bakelman, I. (1994): Convex Analysis and Nonlinear Geometric Elliptic
Equations. Springer-Verlag, Berlin, Heidelberg.
Banach, S. (1932): Théorie des opérations linéaires. Warszawa. (English
edition: Theory of Linear Operations. North-Holland, Amsterdam, 1987.)
Banks, R. (1994): Growth and Diffusion Phenomena. Springer-Verlag,
Berlin, Heidelberg.
Barton, G. (1989): Elements of Green's Functions and Propagation: Poten-
tials, Diffusion, and Waves. Clarendon Press, Oxford.
Bellissard, J. (1996): Applications of C*-Techniques to Modern Quantum
Physics. Springer-Verlag, Berlin, Heidelberg.
Berberian, S. (1974): Lectures in Functional Analysis and Operator Theory.
Berezin, F. (1987): Introduction to Superanalysis. Reidel, Dordrecht.
Berezin, F. and Shubin, M. (1991): The Schrödinger Equation. Kluwer,
Dordrecht.
Berger, M. (1977): Nonlinearity and Functional Analysis. Academic Press,
New York.
Boccara, N. (1990): Functional Analysis. Academic Press, New York.
Bogoljubov, N., Logunov, A., Oksak, A., and Todorov, I. (1990): General
Principles of Quantum Field Theory. Kluwer, Dordrecht. (Translated
from Russian.)
Bogoljubov, N. and Shirkov, D. (1983): Quantum Fields. Benjamin, Read-
ing, MA. (Translated from Russian.)
Booss, B. and Bleecker, D. (1985): Topology and Analysis. Springer-Verlag,
New York.
Borodin, A. and Salminen, P. (1996): Handbook of Brownian Motion.
Birkhäuser, Basel.
Braides, A. and Defranceschi, A. (1998): Homogenization of Multiple
Integrals. Clarendon Press, New York.
Bratteli, O. and Robinson, D. (1979): Operator Algebras and Quantum
Statistical Mechanics, Vols. 1, 2. Springer-Verlag, New York.
Bredon, G. (1993): Topology and Geometry. Springer-Verlag, New York.
Brezis, H. (1983): Analyse fonctionnelle et applications. Masson, Paris.
Brezis, H. and Browder, F. (1999): Partial differential equations in the
20th century. In: The History of the Twentieth Century. Enciclopedia
Italiana (to appear).
Brokate, M. and Sprekels, J. (1996): Hysteresis and Phase Transitions.
Browder, F. (ed.) (1992): Nonlinear and Global Analysis. Reprints from the
Bulletin of the American Mathematical Society. Providence, RI.
Brown, R. (1993): A Topological Introduction to Nonlinear Analysis.
Birkhäuser, Basel.
Casacuberta, C. and Castellet, M. (1992): Mathematical Research Today
and Tomorrow: Viewpoints of Seven Fields Medalists. Springer-Verlag,
Berlin, Heidelberg.
Cercignani, C., Illner, R., and Pulvirenti, M. (1996): The Theory of Dilute
Gases. Springer-Verlag, Berlin, Heidelberg.
Chang, K. (1993): Infinite Dimensional Morse Theory and Multiple Solution
Problems. Birkhäuser, Basel.
Chang, S. (1990): Introduction to Quantum Field Theory. World Scientific,
Singapore.
Choquet-Bruhat, Y., DeWitt-Morette, C., and Dillard-Bleick, M. (1988):
Analysis, Manifolds, and Physics, Vols. 1, 2. North-Holland, Amsterdam.
Chung, K. and Zhao, Z. (1995): From Brownian Motion to Schrödinger's
Equation. Springer-Verlag, Berlin, Heidelberg.
Ciarlet, P. (1977): Numerical Analysis of the Finite Element Method for
Elliptic Boundary-Value Problems. North-Holland, Amsterdam.
Ciarlet, P. (1983): Lectures on Three-Dimensional Elasticity. Springer-Ver-
lag, New York.
Clarke, F. (1998): Nonsmooth Analysis and Control Theory. Springer-
Verlag, New York.
Colombeau, J. (1985): Elementary Introduction to New Generalized Func-
tions. North-Holland, New York.
Connes, A. (1994): Noncommutative Geometry. Academic Press, New York.
Conway, J. (1990): A Course in Functional Analysis. Springer-Verlag, New
York.
Cornwell, J. (1989): Group Theory in Physics. Vol. 1: Fundamental Con-
cepts; Vol. 2: Lie Groups and Their Applications; Vol. 3: Supersymme-
tries and Infinite-Dimensional Algebras. Academic Press, New York.
Courant, R. and Hilbert, D. (1937): Die Methoden der Mathematischen
Physik, Vols. 1, 2. (English edition: Methods of Mathematical Physics,
Vols. 1, 2, Wiley, New York, 1989.)
Courant, R. and John, F. (1988): Introduction to Calculus and Analysis,
Vols. 1, 2. 2nd edition. Springer-Verlag, New York.
Cycon, R., Froese, R., Kirsch, W., and Simon, B. (1986): Schrödinger
Operators. Springer-Verlag, New York.
Das, A. (1993): Field Theory: A Path Integral Approach. World Scientific,
Singapore.
Dautray, D. and Lions, J. (1990): Mathematical Analysis and Numerical
Methods for Science and Technology; Vol. 1: Physical Origins and Clas-
sical Methods; Vol. 2: Functional and Variational Methods; Vol. 3: Spec-
tral Theory and Applications; Vol. 4: Integral Equations and Numerical
Methods; Vol. 5: Evolution Problems I; Vol. 6: Evolution Problems II -
the Navier-Stokes Equations, the Transport Equations, and Numerical
Methods. Springer-Verlag, Berlin, Heidelberg. (Translated from French.)
Davies, P. (ed.) (1989): The New Physics. Cambridge University Press,
Cambridge, UK.
Deimling, K. (1985): Nonlinear Functional Analysis. Springer-Verlag, New
York.
Deimling, K. (1992): Multivalued Differential Equations. De Gruyter,
Berlin.
Deuflhard, P. and Hohmann, A. (1993): Numerische Mathematik I. De
Gruyter, Berlin. (English edition: Numerical Analysis: A First Course
in Scientific Computation. De Gruyter, Berlin, 1994.)
Deuflhard, P. and Bornemann, F. (1994): Numerische Mathematik II.
Integration gewöhnlicher Differentialgleichungen. De Gruyter, Berlin.
(English edition in preparation.)
DeVito, C. (1990): Functional Analysis and Linear Operator Theory. Addi-
son-Wesley, Reading, MA.
Diekmann, O., van Gils, S., Verduyn Lunel, S., and Walther, H.-O.
(1995): Delay Equations: Functional-, Complex-, and Nonlinear Analy-
sis. Springer-Verlag, New York.
Dierkes, U., Hildebrandt, S., Küster, A., and Wohlrab, O. (1992): Minimal
Surfaces, Vols. 1, 2. Springer-Verlag, Berlin, Heidelberg.
Dieudonne, J. (1969): Foundations of Modern Analysis. Academic Press,
New York.
Dieudonne, J. (1981): History of Functional Analysis. North-Holland, Am-
sterdam.
Dieudonne, J. (1992): Mathematics-the Music of Reason. Springer-Verlag,
Berlin, Heidelberg.
Di Francesco, P., Mathieu, P., and Senechal, D. (1997): Conformal Field
Theory. Springer-Verlag, New York.
Dittrich, W. and Reutter, M. (1994): Classical and Quantum Dynamics
from Classical Paths to Path Integrals. Springer-Verlag, Berlin, Heidel-
berg.
Diu, B., Guthmann, C., Lederer, D., and Roulet, B. (1989): Éléments de
Physique Statistique. Hermann, Paris. (German edition: Grundlagen der
Statistischen Physik, de Gruyter, Berlin, 1369 pages.)
Donoghue, J., Golowich, E., and Holstein, B. (1992): The Dynamics of the
Standard Model. Cambridge University Press, Cambridge, UK.
Dubin, D. (1974): Solvable Models in Algebraic Statistical Mechanics.
Clarendon Press, Oxford.
Dunham, W. (1991): Journey Through Genius: The Great Theorems of
Mathematics. Penguin Books, New York.
Dunford, N. and Schwartz, J. (1988): Linear Operators, Vols. 1-3. Wiley,
New York.
Dyson, F. (1979): Disturbing the Universe. Harper & Row, New York.
Economou, E. (1988): Green's Functions in Quantum Physics. Springer-
Verlag, New York.
Edwards, R. (1994): Functional Analysis. Dover, New York.
Ekeland, I. and Temam, R. (1974): Analyse convexe et problèmes
variationnels. Dunod, Paris. (English edition: North-Holland, New York,
1976.)
Ekeland, I. (1990): Convexity Methods in Hamiltonian Mechanics.
Springer-Verlag, New York.
Esposito, G. (1993): Quantum Gravity, Quantum Cosmology, and Lorent-
zian Geometries. Springer-Verlag, New York.
Evans, C. and Gariepy, R. (1992): Measure Theory and Fine Properties of
Functions. CRC Press, New York.
Evans, L. (1998): Partial Differential Equations. Amer. Math. Soc.,
Providence, RI.
Fenyő, S. and Stolle, H. (1982): Theorie und Praxis der linearen
Integralgleichungen, Vols. 1-4. Deutscher Verlag der Wissenschaften, Berlin.
Feynman, R., Leighton, R., and Sands, M. (1963): The Feynman Lectures
in Physics, Vols. 1-3. Addison-Wesley, Reading, MA.
Feynman, R. and Hibbs, A. (1965): Quantum Mechanics and Path Integrals.
McGraw-Hill, New York.
Finn, R. (1985): Equilibrium Capillary Surfaces. Springer-Verlag, Berlin,
Heidelberg.
Friedman, A. (1982): Variational Principles and Free Boundary-Value
Problems. Wiley, New York.
Friedman, A. (1989/96): Mathematics in Industrial Problems, Vols. 1-8.
Fulde, P. (1995): Electron Correlations in Molecules and Solids. 3rd
enlarged edition. Springer-Verlag, Berlin, Heidelberg.
Gajewski, H., Gröger, K., and Zacharias, K. (1974): Nichtlineare
Operatorgleichungen. Akademie-Verlag, Berlin.
Galdi, G. (1994): An Introduction to the Mathematical Theory of the
Navier-Stokes Equations, Vols. 1-4. Springer-Verlag, Berlin, Heidelberg
(Vols. 3 and 4 to appear).
Gelfand, I. and Shilov, E. (1964): Generalized Functions, Vols. 1-5.
Academic Press, New York. (Translated from Russian.)
Gelfand, I. (1987/89): Collected Papers, Vols. 1-3. Springer-Verlag, New
York.
Gell-Mann, M. (1994): The Quark and the Jaguar: Adventures in the Simple
and the Complex. Freeman, San Francisco, CA.
Giaquinta, M. (1993): Introduction to Regularity Theory for Nonlinear
Elliptic Systems. Birkhäuser, Basel.
Giaquinta, M. and Hildebrandt, S. (1995): Calculus of Variations, Vols. 1,
2. Springer-Verlag, New York.
Gilbarg, D. and Trudinger, N. (1994): Elliptic Partial Differential
Equations of Second Order. 2nd edition. Springer-Verlag, New York.
Gilkey, P. (1984): Invariance Theory, the Heat Equation, and the Atiyah-
Singer Index Theorem. Publish or Perish, Boston, MA.
Glimm, J., Impagliazzo, J., and Singer, I. (eds.) (1990): The Legacy of John
von Neumann. Amer. Math. Soc., Providence, RI.
Glimm, J. and Jaffe, A. (1981): Quantum Physics. Springer-Verlag, New
York.
Godlewski, E. and Raviart, R. (1996): Numerical Approximation of Hyper-
bolic Systems of Conservation Laws. Springer-Verlag, New York.
Goldstein, H. (1980): Classical Mechanics. 2nd edition. Addison-Wesley,
Reading, MA.
Golub, G. and Ortega, J. (1993): Scientific Computing: An Introduction
with Parallel Computing. Academic Press, New York.
Green, M., Schwarz, J., and Witten, E. (1987): Superstrings, Vols. 1, 2.
Cambridge University Press, Cambridge, UK.
Greiner, W. (1993): Relativistic Quantum Mechanics. Springer-Verlag,
Berlin, Heidelberg.
Greiner, W. (1993): Gauge Theory of Weak Interactions. Springer-Verlag,
Berlin, Heidelberg.
Greiner, W. (1993/94): Theoretical Physics, Vols. 1-7. Cf. the following
titles.
Greiner, W. (1994): Classical Physics, Vols. 1ff. Springer-Verlag, New York.
Greiner, W. (1994): Quantum Mechanics: An Introduction. Springer-
Greiner, W. and Müller, B. (1994): Quantum Mechanics: Symmetries.
Springer-Verlag, Berlin, Heidelberg.
Greiner, W. and Reinhardt, J. (1994): Quantum Electrodynamics. Springer-
Greiner, W. and Schäfer, A. (1994): Quantum Chromodynamics. Springer-
Greiner, W. and Reinhardt, J. (1996): Field Quantization. Springer-Verlag,
Berlin, Heidelberg.
Grosche, G., Ziegler, D., Ziegler, V., and Zeidler, E. (eds.) (1995): Teubner
- Taschenbuch der Mathematik II. Teubner-Verlag, Stuttgart, Leipzig
(English edition in preparation).
Grosche, C. and Steiner, F. (1998): Handbook of Feynman Path Integrals.
Grosse, H. (1996): Models in Statistical Physics and Quantum Field Theory.
Gruber, P. and Wills, J. (1993): Handbook of Convex Geometry, Vols. 1, 2.
North-Holland, Amsterdam.
Guillemin, V. and Sternberg, S. (1990): Symplectic Techniques in Physics.
Cambridge University Press, Cambridge, UK.
Haag, R. (1993): Local Quantum Physics: Fields, Particles, Algebras.

Hackbusch, W. (1985): Multi-Grid Methods and Applications. Springer-
Hackbusch, W. (1992): Elliptic Differential Equations: Theory and Numer-
ical Treatment. Springer-Verlag, Berlin, Heidelberg.
Hackbusch, W. (1994): Iterative Solution of Large Sparse Systems of Equa-
tions. Springer-Verlag, New York. (Translated from German.)
Hackbusch, W. (1995): Integral Equations: Theory and Numerical Treatment.
Birkhäuser, Basel.
Hackbusch, W. (1996): Partielle Differentialgleichungen und Wissenschaft-
liches Rechnen. In: Zeidler, E. (ed.) (1996), Teubner-Taschenbuch der
Mathematik, Chapter 7. Teubner-Verlag, Stuttgart-Leipzig.
Hagihara, Y. (1976): Celestial Mechanics. MIT Press, Cambridge, MA.
Hale, J. and Koçak, H. (1991): Dynamics and Bifurcations. Springer-Verlag,
Berlin, Heidelberg (cf. also Koçak (1989)).
Hatfield, B. (1992): Quantum Field Theory of Point Particles and Strings.
Addison-Wesley, Redwood City, CA.
Heisenberg, W. (1989): Encounters with Einstein and Other Essays on Peo-
ple, Places, and Particles. Princeton University Press, Princeton, NJ.
Henneaux, M. and Teitelboim, C. (1993): Quantization of Gauge Systems.
Princeton University Press, Princeton, NJ.
Henry, D. (1981): Geometric Theory of Semilinear Parabolic Equations.
Lecture Notes in Mathematics, Vol. 840. Springer-Verlag, New York.
Hermann, C. and Sapoval, B. (1994): Physics of Semiconductors. Springer-
Verlag, New York.
Heuser, H. (1975): Funktionalanalysis. Teubner-Verlag, Stuttgart. (English
edition: Functional Analysis, Wiley, New York, 1982.)
Hilbert, D. (1912): Grundzüge einer allgemeinen Theorie der
Integralgleichungen. Teubner-Verlag, Leipzig.
Hilbert, D. (1932): Gesammelte Werke (Collected Works), Vols. 1-3.
Springer-Verlag, Berlin.
Hilborn, R. (1994): Chaos and Nonlinear Dynamics: An Introduction for
Scientists and Engineers. Oxford University Press, New York.
Hildebrandt, S. and Tromba, T. (1985): Mathematics and Optimal Form.
Scientific American Library, Freeman, New York.
Hiriart-Urruty, J. and Lemaréchal, C. (1993): Convex Analysis and
Minimization Algorithms, Vols. 1, 2. Springer-Verlag, Berlin, Heidelberg.
Hirzebruch, F. and Scharlau, W. (1971): Einführung in die
Funktionalanalysis. Bibliographisches Institut, Mannheim.
Hislop, P. and Sigal, I. (1996): Introduction to Spectral Theory: With
Applications to Schrödinger Equations. Springer-Verlag, New York.
Hofer, H. and Zehnder, E. (1994): Symplectic Invariants and Hamiltonian
Dynamics. Birkhäuser, Basel.
Holmes, M. (1995): Introduction to Perturbation Methods. Springer-Verlag,
New York.
Holmes, R. (1975): Geometrical Functional Analysis and Its Applications.
Honerkamp, J. (1998): Statistical Physics: An Advanced Approach with Ap-
plications. Springer-Verlag, Berlin, Heidelberg.
Honerkamp, J. and Römer, H. (1993): Theoretical Physics: A Classical
Approach. Springer-Verlag, New York.
Hörmander, L. (1983): The Analysis of Linear Partial Differential
Operators; Vol. 1: Distribution Theory and Fourier Analysis; Vol. 2:
Differential Operators with Constant Coefficients; Vol. 3:
Pseudodifferential Operators; Vol. 4: Fourier Integral Operators.
Springer-Verlag, New York.
Iagolnitzer, D. (1993): Scattering in Quantum Field Theory. Princeton Uni-
versity Press, Princeton, NJ.
Isakov, V. (1998): Inverse Problems for Partial Differential Equations.
Isham, C. (1989): Modern Differential Geometry for Physicists. World Sci-
entific, Singapore.
Ivanchenko, Yu. and Lisyansky, A. (1996): Physics of Critical Fluctuations.
John, F. (1982): Partial Differential Equations. Springer-Verlag, New York.
Jost, J. (1991): Two-Dimensional Geometric Variational Problems. Wiley,
New York.
Jost, J. (1994): Differentialgeometrie und Minimalflächen. Springer-Verlag,
Berlin, Heidelberg.
Jost, J. (1998): Postmodern Analysis. Springer-Verlag, Berlin, Heidelberg.
Jost, J. (1998a): Partielle Differentialgleichungen: Elliptische (und parabo-
lische) Gleichungen. Springer-Verlag, Berlin, Heidelberg.
Jost, J. and Li-Jost, X. (1999): Calculus of Variations. Cambridge Univer-
sity Press, Cambridge, UK.
Kac, M., Rota, G., and Schwartz, J. (1992): Discrete Thoughts: Essays on
Mathematics, Science, and Philosophy. Birkhäuser, Basel.
Kadison, R. and Ringrose, J. (1983): Fundamentals of the Theory of Oper-
ator Algebras, Vols. 1-4. Academic Press, New York.
Kaiser, G. (1994): A Friendly Guide to Wavelets. Birkhäuser, Basel.
Kaku, M. (1987): Introduction to Superstring Theory. Springer-Verlag, New
York.
Kaku, M. and Trainer, J. (1987): Beyond Einstein: The Cosmic Quest for
the Theory of the Universe. Bantam Books, New York.
Kaku, M. (1991): Strings, Conformal Fields, and Topology. Springer-
Verlag, New York.
Kaku, M. (1993): Quantum Field Theory. Oxford University Press, Oxford.
Kantorovich, L. and Akilov, G. (1964): Functional Analysis in Normed
Spaces. Pergamon Press, Oxford. (Translated from Russian.)
Kanwal, R. (1983): Generalized Functions. Academic Press, New York.

Kassel, C. (1995): Quantum Groups. Springer-Verlag, New York.
Kato, T. (1976): Perturbation Theory for Linear Operators. 2nd edition.
Katok, A. and Hasselblatt, B. (1995): Introduction to the Modern
Theory of Dynamical Systems. Cambridge University Press, Cambridge,
UK.
Kesavan, S. (1989): Topics in Functional Analysis and Applications. Wiley,
New York.
Kevorkian, J. and Cole, J. (1996): Multiple Scale and Singular Perturbation
Methods. Springer-Verlag, New York.
Kichenassamy, S. (1996): Nonlinear Wave Equations. Marcel Dekker, New
York.
Kirillov, A. and Gvishiani, A. (1982): Theory and Problems in Functional
Analysis. Springer-Verlag, New York.
Kittel, C. (1987): Quantum Theory of Solids. Second revised printing. Wi-
ley, New York.
Kittel, C. (1996): Introduction to Solid State Physics. 7th edition. Wiley,
New York.
Koçak, H. (1989): Differential and Difference Equations Through Computer
Experiments. With Diskettes. Springer-Verlag, New York (cf. also Hale
and Koçak (1991)).
Kolmogorov, A., Fomin, S., and Silverman, R. (1975): Introductory Real
Analysis. Dover, New York. (Enlarged translation from Russian.)
Kolmogorov, A. and Fomin, S. (1975): Reelle Funktionen und Funktional-
analysis. Deutscher Verlag der Wissenschaften, Berlin. (Translated from
Russian.)
Kornhuber, R. (1997): Adaptive Monotone Multigrid Methods for Nonlinear
Variational Inequalities. Teubner-Verlag, Stuttgart.
Krasnoselskii, M. and Zabreiko, P. (1984): Geometrical Methods in Nonlin-
ear Analysis. Springer-Verlag, New York. (Translated from Russian.)
Kress, R. (1989): Linear Integral Equations. Springer-Verlag, New York.
Kreyszig, E. (1989): Introductory Functional Analysis with Applications.
Wiley, New York.
Kufner, A., John, O., and Fucik, S. (1977): Function Spaces. Academia,
Prague.
Kufner, A. and Fucik, S. (1980): Nonlinear Differential Equations. Elsevier,
New York.
Landau, L. and Lifšic, E. (1982): Course of Theoretical Physics, Vols. 1-10.
Elsevier, New York.
Lang, S. (1993): Real Analysis. 3rd edition. Springer-Verlag, New York.
Lazutkin, V. (1993): KAM-Theory and Semiclassical Approximations to
Eigenfunctions. Springer-Verlag, Berlin, Heidelberg.
Leis, R. (1986): Initial-Boundary Value Problems in Mathematical Physics.
Wiley, New York.
Leung, A. (1989): Systems of Nonlinear Partial Differential Equations:
Applications to Biology and Engineering. Kluwer, Dordrecht.
LeVeque, R. (1990): Numerical Methods for Conservation Laws.
Birkhäuser, Basel.
Levitan, B. and Sargsjan, I. (1991): Sturm-Liouville and Dirac Operators.
Kluwer, Boston, MA. (Translated from Russian.)
Lions, J. (1969): Quelques méthodes de résolution des problèmes aux limites
non linéaires. Dunod, Paris.
Lions, J. (1971): Optimal Control of Systems Governed by Partial
Differential Equations. Springer-Verlag, Berlin. (Translated from
French.)
Lions, J. and Magenes, E. (1972): Inhomogeneous Boundary-Value
Problems, Vols. 1-3. Springer-Verlag, New York.
Lions, P.L. (1996): Mathematical Topics in Fluid Dynamics. Vol. 1: Incom-
pressible Models. Vol. 2: Compressible Models. Oxford University Press,
Oxford.
Louis, A. (1989): Inverse und schlecht gestellte Probleme. Teubner,
Stuttgart.
Louis, A., Maass, P., and Rieder, A. (1998): Wavelets: Theorie und
Anwendungen. 2nd edition. Teubner, Stuttgart.
Luenberger, D. (1969): Optimization by Vector Space Methods. Wiley, New
York.
Lüst, D. and Theissen, S. (1989): Lectures on String Theory. Springer-
Lusztig, G. (1993): Introduction to Quantum Groups. Birkhäuser, Boston,
MA.
Mackey, G. (1963): The Mathematical Foundations of Quantum Mechanics.
Benjamin, New York.
Mackey, G. (1992): The Scope and History of Commutative and Noncom-
mutative Harmonic Analysis. American Mathematical Society, Provi-
dence, RI.
Mandl, F. and Shaw, G. (1989): Quantum Field Theory. Wiley, New York.
Marathe, K. and Martucci, G. (1992): The Mathematical Foundations of
Gauge Theory. North-Holland, Amsterdam.
Marchioro, C. and Pulvirenti, M. (1994): Mathematical Theory of Inviscid
Fluids. Springer-Verlag, New York.
Markowich, P. (1990): Semiconductor Equations. Springer-Verlag, Berlin,
Heidelberg.
Marsden, J. (1992): Lectures in Mechanics. Cambridge University Press,
Cambridge, UK.
Marsden, J. and Ratiu, T. (1994): Introduction to Mechanics and Sym-
metry: A Basic Exposition of Classical Mechanical Systems. Springer-
Verlag, New York.
Matveev, V. (1994): Algebro - Geometrical Approach to Nonlinear Evolu-
tion Equations. Springer-Verlag, New York.
Maurin, K. (1972): Methods of Hilbert Spaces. Polish Scientific Publishers,
Warsaw.
Maurin, K. (1998): The Riemann Legacy: Riemann's Ideas in Mathematics
and Physics. Kluwer, Boston, MA.
Mawhin, J. and Willem, M. (1987): Critical Point Theory and Hamiltonian
Systems. Springer-Verlag, New York.
Meyer, K. and Hall, G. (1992): Introduction to Hamiltonian Dynamical
Systems and the N-Body Problem. Springer-Verlag, New York.
Mielke, A. (1991): Hamiltonian and Lagrangian Flows on Center Manifolds
with Applications to Elliptic Variational Problems. Lecture Notes in
Mathematics, Vol. 1489. Springer-Verlag, Berlin, Heidelberg.
Monastirsky, M. (1993): Topology of Gauge Fields and Condensed Matter.
Plenum Press, New York.
Murray, J. (1989): Mathematical Biology. Springer-Verlag, Berlin, Heidel-
berg.
Nachtmann, O. (1990): Elementary Particle Physics: Concepts and Phe-
nomena. Springer-Verlag, Berlin, Heidelberg.
Nakahara, M. (1990): Geometry, Topology, and Physics. Hilger, Bristol.
Nečas, J. (1967): Les méthodes directes en théorie des équations elliptiques.
Academia, Prague.
Neumann, J.v. (1932): Mathematische Grundlagen der Quantenmechanik.
Springer-Verlag, Berlin. (English edition: Mathematical Foundations
of Quantum Mechanics, Princeton University Press, Princeton, NJ,
1955.)
Neutsch, W. and Scherer, K. (1992): Celestial Mechanics: An Intro-
duction to Classical and Contemporary Methods. Wissenschaftsverlag,
Mannheim.
Newton, R. (1988): Scattering Theory of Waves and Particles. Springer-
Nikiforov, A. and Uvarov, V. (1987): Special Functions of Mathematical
Physics. Birkhäuser, Boston, MA. (Translated from Russian.)
Nishikawa, K. and Wakatani, M. (1993): Plasma Physics: Basic Theory
with Fusion Applications. Springer-Verlag, Berlin, Heidelberg.
Novikov, S. et al. (1984): Theory of Solitons. Plenum Press, New York.
(Translated from Russian.)
Oberguggenberger, M. (1992): Multiplication of Distributions and
Applications to Partial Differential Equations. Longman, Harlow, UK.
Pazy, A. (1983): Semigroups of Linear Operators and Applications to Par-
tial Differential Equations. Springer-Verlag, New York.
Peebles, P. (1991): Quantum Mechanics. Princeton University Press,
Princeton, NJ.
Peebles, P. (1993): Principles of Physical Cosmology. Princeton University
Press, Princeton, NJ.
Penrose, R. (1992): The Emperor's New Mind Concerning Computers,
Minds, and the Laws of Physics. Oxford University Press, Oxford.
Penrose, R. (1994): Shadows of the Mind: The Search for the Missing
Science of Consciousness. Oxford University Press, Oxford.
Petrina, D. (1995): Mathematical Foundations of Quantum Statistical Me-
chanics. Kluwer, Dordrecht.
Polak, E. (1997): Optimization. Springer-Verlag, New York.
Polchinski, J. (1998): String Theory, Vols. 1,2. Cambridge University Press,
Cambridge, UK.
Polianin, A. and Manzhirov, A. (1998): Handbook of Integral Equations.
CRC Press, Boca Raton, FLA.
Polianin, A. and Zaitsev, V. (1995): Handbook of Exact Solutions for Or-
dinary Differential Equations. CRC Press, Boca Raton, FLA.
Polyakov, A. (1987): Gauge Fields and Strings. Academic Publishers, Har-
wood, NJ.
Prugovecki, E. (1981): Quantum Mechanics in Hilbert Space. Academic
Press, New York.
Quarteroni, A. and Valli, A. (1994): Numerical Approximation of Partial
Differential Equations. Springer-Verlag, Berlin, Heidelberg.
Rabinowitz, P. (1986): Methods in Critical Point Theory with Applications.
Amer. Math. Soc., Providence, RI.
Racke, R. (1992): Lectures on Evolution Equations. Vieweg, Braunschweig.
Rauch, J. (1991): Partial Differential Equations. Springer-Verlag, New
York.
Reed, M. and Simon, B. (1972): Methods of Modern Mathematical Physics.
Vol. 1: Functional Analysis; Vol. 2: Fourier Analysis, Self-Adjointness;
Vol. 3: Scattering Theory; Vol. 4: Analysis of Operators. Academic
Press, New York.
Reid, C. (1970): Hilbert. Springer-Verlag, New York.
Reid, C. (1976): Courant in Gottingen and New York. Springer-Verlag, New
York.
Renardy, M. and Rogers, R. (1993): Introduction to Partial Differential
Equations. Springer-Verlag, New York.
Riesz, F. and Nagy, B. (1955): Leçons d'analyse fonctionnelle. (English
edition: Functional Analysis, Frederick Ungar, New York, 1978.)
Rivers, R. (1990): Path Integral Methods in Quantum Field Theory.
Cambridge University Press, Cambridge, UK.
Rolnick, W. (1994): Fundamental Particles and Their Interactions.
Addison-Wesley, Reading, MA.
Roubicek, T. (1997): Relaxation in Optimization Theory. De Gruyter,
Berlin, New York.
Royden, H. (1988): Real Analysis. Macmillan, New York.
Rudin, W. (1966): Real and Complex Analysis. McGraw-Hill, New York.
Rudin, W. (1973): Functional Analysis. McGraw-Hill, New York.
Ruelle, D. (1993): Chance and Chaos. Princeton University Press, Prince-
ton, NJ.
Sakai, A. (1991): Operator Algebras. Cambridge University Press, Cam-
bridge, UK.
Sattinger, D. and Weaver, O. (1993): Lie Groups, Lie Algebras, and Their
Representations. Springer-Verlag, New York.
Scharf, G. (1995): Finite Quantum Electrodynamics. Springer-Verlag,
Berlin, Heidelberg.
Schechter, M. (1971): Principles of Functional Analysis. Wiley, New York.
Schechter, M. (1982): Operator Methods in Quantum Mechanics. North-
Holland, Amsterdam.
Schechter, M. (1986): Spectra of Partial Differential Operators. North-
Holland, Amsterdam.
Schmutzer, E. (1989): Grundlagen der theoretischen Physik, Vols. 1, 2.
Deutscher Verlag der Wissenschaften, Berlin.
Schwabl, F. (1995): Quantum Mechanics. 2nd edition. Springer-Verlag,
Berlin, Heidelberg.
Schwabl, F. (1997): Quantenmechanik für Fortgeschrittene. Springer-
Schwarz, A. (1993): Quantum Field Theory and Topology. Springer-Verlag,
Berlin, Heidelberg.
Schwarz, A. (1994): Topology for Physicists. Springer-Verlag, Berlin, Hei-
delberg.
Schweber, S. (1994): QED (Quantum Electrodynamics) and the Men Who
Made It: Dyson, Feynman, Schwinger, and Tomonaga. Princeton Uni-
versity Press, Princeton, NJ.
Scott, G. and Davidson, K. (1994): Wrinkles in Time. Morrow, New York.
Simon, B. (1993): The Statistical Mechanics of Lattice Gases. Princeton
University Press, Princeton, NJ.
Smoller, J. (1994): Shock Waves and Reaction-Diffusion Equations. 2nd
enlarged edition. Springer-Verlag, New York.
Spohn, H. (1991): Large Scale Dynamics of Interacting Particles. Springer-
Sterman, G. (1993): An Introduction to Quantum Field Theory. Cambridge
University Press, Cambridge, UK.
Stoer, J. and Bulirsch, R. (1993): Introduction to Numerical Analysis.
Springer-Verlag, New York. (Translated from German.)
Strang, G. and Fix, G. (1973): An Analysis of the Finite Element Method.
Prentice-Hall, Englewood Cliffs, NJ.
Stroke, H. (ed.) (1995): The Physical Review: The First Hundred Years-A
Selection of Seminal Papers and Commentaries. American Institute of
Physics, New York.
Struwe, M. (1988): Plateau's Problem and the Calculus of Variations.
Princeton University Press, Princeton, NJ.
Struwe, M. (1996): Variational Methods, 2nd edition. Springer-Verlag, New
York.
Sunder, V. (1987): An Invitation to von Neumann Algebras. Springer-
Verlag, New York.
Szabo, I. (1987): Geschichte der mechanischen Prinzipien und ihrer
wichtigsten Anwendungen. Birkhäuser, Basel.
Taylor, M. (1996): Partial Differential Equations, Vols. 1-3, Springer-Verlag,
New York.
Temam, R. (1988): Infinite-Dimensional Dynamical Systems in Mechanics
and Physics. Springer-Verlag, New York.
Thaller, B. (1992): The Dirac Equation. Springer-Verlag, Berlin,
Heidelberg.
Thirring, W. (1991): A Course in Mathematical Physics. Vol. 1: Classical
Dynamical Systems; Vol. 2: Classical Field Theory; Vol. 3: Quantum
Mechanics of Atoms and Molecules; Vol. 4: Quantum Mechanics of Large
Systems. Springer-Verlag, New York.
Thorne, K. (1994): Black Holes and Time Warps: Einstein's Outrageous
Legacy. Norton, New York.
Toda, M. (1989): Nonlinear Waves and Solitons. Kluwer, Dordrecht.
Triebel, H. (1972): Höhere Analysis. Verlag der Wissenschaften, Berlin.
Triebel, H. (1987): Analysis and Mathematical Physics. Kluwer, Dordrecht.
Triebel, H. (1992): Theory of Function Spaces II. Birkhäuser, Basel.
Visintin, A. (1994): Differentiable Models of Hysteresis. Springer-Verlag,
Berlin, Heidelberg.
Visintin, A. (1997): Models of Phase Transitions. Birkhäuser, Basel.
Wald, R. (1984): General Relativity. The University of Chicago Press,
Chicago, IL.
Weinberg, S. (1992): Dreams of a Final Theory. Pantheon Books, New
York.
Weinberg, S. (1995/96): The Quantum Theory of Fields, Vols. 1, 2. Cam-
bridge University Press, Cambridge, UK.
Wess, J. and Bagger, J. (1991): Supersymmetry and Supergravity. Second
edition revised and expanded. Princeton University Press, Princeton,
NJ.
Wiegmann, P. (1996): Completely Solvable Models of Quantum Field The-
ory. World Scientific, Singapore.
Wiggins, S. (1990): Introduction to Applied Dynamical Systems and Chaos.
Yosida, K. (1988): Functional Analysis. 5th edition. Springer-Verlag, New
York.
Yosida, K. (1991): Lectures on Differential and Integral Equations. Dover,
New York.
Zabczyk, J. (1992): Optimal Control Theory. Birkhäuser, Basel.
Zeidler, E. (1986): Nonlinear Functional Analysis and Its Applications.
Vol. 1: Fixed-Point Theorems; Vol. 2A: Linear Monotone Operators;
Vol. 2B: Nonlinear Monotone Operators; Vol. 3: Variational Methods and
Optimization; Vols. 4, 5: Applications to Mathematical Physics.
Springer-Verlag, New York. (Second enlarged edition of Vol. 1, 1992;
second enlarged edition of Vol. 4, 1997; Vol. 5 in preparation.)
Zeidler, E. (ed.) (1996): Teubner-Taschenbuch der Mathematik. Teubner-
Verlag, Stuttgart-Leipzig. (English edition in preparation.)
Zeidler, E. (1996): Chapters 1-6 and 10-19 of Teubner-Taschenbuch der
Mathematik, Vols. 1, 2. Teubner-Verlag, Stuttgart-Leipzig. (English
edition in preparation.)
Zinn-Justin, J. (1996): Quantum Field Theory and Critical Phenomena,
3rd edition. Clarendon Press, Oxford.
Zuily, C. (1988): Problems in Distributions and Partial Differential Equa-
tions. North-Holland, Amsterdam.
Hints for Further Reading
Comprehensive collection of exercises: Kirillov and Gvishiani (1982).
History of functional analysis: Dieudonne (1981), Mackey (1992)
(harmonic analysis).
International mathematical congresses: Albers, Alexanderson, and
Reid (1987).
Biographies of Hilbert and Courant: Reid (1970), (1976).
A summary of important material from linear functional anal-
ysis: appendices to Zeidler (1986), Vols. 1, 2B, and 3.
Comprehensive bibliographies: Zeidler (1986), Vols. 1-5.
Classical textbooks on linear functional analysis: Riesz and Nagy
(1955), Schechter (1971), Rudin (1973), Kolmogorov and Fomin (1975),
Kato (1976), Dunford and Schwartz (1988), Yosida (1988).
Nonlinear functional analysis: Berger (1977), Aubin and Ekeland
(1983), Deimling (1985), Zeidler (1986ff), Vols. 1-5, Ambrosetti (1993).
Operator algebras: Kadison and Ringrose (1983), Vols. 1-4, Sunder
(1987), Sakai (1991).
Generalized functions, pseudodifferential operators, and Four-
ier integral operators: Hörmander (1983), Vols. 1-4, Kanwal (1983).
Function spaces: Kufner, John, and Fučík (1980), Triebel (1992).
Applications to partial differential equations: Leis (1986), Zei-
dler (1986), Vols. 1-5, Dautray and Lions (1990), Vols. 1-6, Alt (1992),
Racke (1992), Giaquinta (1993), Renardy and Rogers (1993), Evans (1994),
Smoller (1994), Amann (1995), Taylor (1996), Vols. 1-3.
Applications to the calculus of variations: Friedman (1982), Rabi-
nowitz (1986), Zeidler (1986), Vol. 3, Mawhin and Willem (1987), Giaquinta
and Hildebrandt (1995), Chang (1993), Struwe (1996), and Jost (1999).
Minimal surfaces: Dierkes, Hildebrandt, Küster, and Wohlrab (1992).
Applications to integral equations: Kress (1989), Dautray and Lions
(1990), Vol. 4.
Applications to optimization and mathematical economics: Lu-
enberger (1969), Zeidler (1986), Vol. 3, Aubin (1993), Zabczyk (1993).
Numerical functional analysis: Zeidler (1986), Vols. 2A, 2B, and 3,
Dautray and Lions (1990), Vols. 1-6, Louis (1989).
Scientific computing: Allgower and Georg (1990), LeVeque (1990),
Golub and Ortega (1993), Deuflhard and Hohmann (1993), Deuflhard and
Bornemann (1994), Quarteroni and Valli (1994), Hackbusch (1985), (1992),
(1994), (1995), (1996), Stoer and Bulirsch (1993), Kornhuber (1997).
Applications to industrial problems: Friedman (1989/94), Vols. 1-8.
Applications to the natural sciences: Zeidler (1986), Vols. 4 and 5,
Dautray and Lions (1990), Vols. 1-6, Grosche, Ziegler, and Zeidler (1995)
(handbook).
Applications to mechanics: Marsden (1992).
Applications to celestial mechanics: Meyer and Hall (1992), Neutsch
and Scherer (1992), Ambrosetti and Coti-Zelati (1993).
Applications to dynamical systems: Temam (1988), Amann (1990),
Wiggins (1990), Hale and Koçak (1991), Mielke (1991), Hofer and Zehnder
(1994), Marsden and Ratiu (1994), Katok and Hasselblatt (1995).
Manifolds: Abraham, Marsden, and Ratiu (1983), Zeidler (1986), Vol.
4, Isham (1989).
Applications to mathematical biology: Murray (1989).
Applications to nonlinear elasticity: Ciarlet (1983), Zeidler (1986),
Vol. 4, Antman (1994).
Applications to fluid mechanics: Zeidler (1986), Vol. 4, Galdi (1994),
Marchioro and Pulvirenti (1994), Lions (1995), Vols. 1, 2.
Solitons: Novikov (1984), Toda (1989), Matveev (1994).
Applications to capillarity: Finn (1985).
Large scale dynamics of multi-particle systems: Spohn (1991),
Cercignani, Illner, and Pulvirenti (1995).
Hysteresis and phase transitions: Visintin (1994), (1997), Brokate
and Sprekels (1996).
Semiconductors: Markowich (1990).
Plasma physics and fusion: Nishikawa and Wakatani (1993).
Symplectic techniques in physics: Guillemin and Sternberg (1990),
Hofer and Zehnder (1994).
Applications to quantum mechanics: Reed and Simon (1972), Vols.
1-4, Prugovecki (1981), Schechter (1982), Berezin and Shubin (1991).
Quantum statistics: Bratteli and Robinson (1979), Diu et al. (1989),
Haag (1993), Simon (1993), Grosse (1995), Petrina (1995), Honerkamp
(1998).
Quantum field theory: Glimm and Jaffe (1981), Reed and Simon
(1972), Vol. 2 (the Gårding-Wightman axioms), Bogoljubov and Shirkov
(1983), Mandl and Shaw (1989), Bogoljubov, et al. (1990), Chang (1990),
Haag (1993), Kaku (1993), Weinberg (1995/96), Vols. 1,2, Scharf (1995),
Zinn-Justin (1996), Greiner and Reinhardt (1996).
Scattering theory: Reed and Simon (1972), Vol. 3, Newton (1988),
Colton and Kress (1992), Iagolnitzer (1993).
Elementary particles: Rolnick (1994).
Standard model of elementary particles: Nachtmann (1990),
Donoghue, Golowich, and Holstein (1992), Kaku (1993), Weinberg (1996),
Vol. 2.
Noncommutative geometry and the standard model of elemen-
tary particles: Connes (1994).
The Feynman path integral: Albeverio (1975) as well as Albeverio
and Brzeźniak (1993) (rigorous theory), Das (1993), Dittrich and Reutter
(1994), Weinberg (1995/96), Vols. 1,2, Greiner and Reinhardt (1996), Zinn-
Justin (1996), Grosche and Steiner (1998).
Cosmology: Zeidler (1986), Vol. 4, Peebles (1993).
Quantum cosmology: Esposito (1993).
Supersymmetry: Berezin (1987), Wess and Bagger (1991).
Superstring theory: Green, Schwarz, and Witten (1987), Kaku (1987),
Lüst and Theisen (1989), Hatfield (1992).
Quantum groups: Lusztig (1993), Kassel (1995).
Conformal field theory: Kaku (1991), Di Francesco et al. (1997).
Topology and physics: Nakahara (1990), Marathe and Martucci
(1992), Monastirsky (1993), Schwarz (1994).
Topology, partial differential equations, pseudodifferential op-
erators, and the Atiyah-Singer index theorem: Gilkey (1984).
Textbooks in physics: Feynman, Leighton, and Sands (1963), Schmut-
zer (1989), Vols. 1, 2, Greiner (1993), Vols. 1-7, Honerkamp and Römer
(1993).
A survey on modern physics: Davies (1989).
Seminal papers in physics: Stroke (1995).
Essays on modern physics: Kaku and Trainer (1987), Heisenberg
(1989), Weinberg (1992), Gell-Mann (1994), Schweber (1994), Scott and
Davidson (1994), Thorne (1994).
Essays on modern mathematics: Casacuberta and Castellet (1992)
(viewpoints of seven Fields medalists), Penrose (1992), (1994), Kac, Rota,
and Schwartz (1994).
List of Symbols
What's in a name? That which we call a rose
By any other word would smell as sweet.
William Shakespeare (1564-1616)
Romeo and Juliet 2,2
General Notation
$A \Rightarrow B$    A implies B
iff    if and only if
$A \Leftrightarrow B$    A iff B (i.e., $A \Rightarrow B$ and $B \Rightarrow A$)
$f(x) := 2x$    $f(x) = 2x$ by definition
$x \in S$    x is an element of the set S
$x \notin S$    x is not an element of the set S
$\{x : \ldots\}$    set of all elements x with the property ...
$S \subseteq T$    the set S is contained in the set T
$S \subset T$    $S \subseteq T$ and $S \neq T$ (the set S is properly contained in T)
$S \cup T$    the union of the sets S and T (the set of all elements that live in S or T)
$S \cap T$    the intersection of the sets S and T (the set of all elements that live in S and T)
$S - T$    the difference set (the set of all elements that live in S and not in T)
$\emptyset$    empty set
$2^S$    set of all subsets of S (the power set of S)
$S \times T$    product set $\{(x, y) : x \in S \text{ and } y \in T\}$
$\{p\}$    set of the single point p
$\mathbb{N}$    set of the natural numbers 1, 2, ...
$\mathbb{R}, \mathbb{C}, \mathbb{Q}, \mathbb{Z}$    set of the real, complex, rational, integer numbers
$\mathbb{K}$    $\mathbb{R}$ or $\mathbb{C}$
$\mathbb{R}^N$    set of all real N-tuples $x = (x_1, \ldots, x_N)$ (i.e., $x_j \in \mathbb{R}$ for all j)
$\mathbb{C}^N$    set of all complex N-tuples $(x_1, \ldots, x_N)$ (i.e., $x_j \in \mathbb{C}$ for all j)
$\mathbb{K}^N$    $\mathbb{R}^N$ or $\mathbb{C}^N$
Re z, Im z    real part of the complex number $z = x + yi$, imaginary part of z (i.e., Re z := x, Im z := y)
$\bar{z}$    conjugate complex number, $\bar{z} := x - yi$
$|z|$    absolute value of the complex number z, $|z| := \sqrt{x^2 + y^2}$
$[a, b]$    closed interval (the set $\{x \in \mathbb{R} : a \le x \le b\}$)
$]a, b[$    open interval (the set $\{x \in \mathbb{R} : a < x < b\}$)
$]a, b]$    half-open interval (the set $\{x \in \mathbb{R} : a < x \le b\}$)
$[a, b[$    half-open interval (the set $\{x \in \mathbb{R} : a \le x < b\}$)
sgn r    signum of the real number r
$\delta_{jk}$    Kronecker symbol, $\delta_{jk} := 1$ if $j = k$, and $\delta_{jk} := 0$ if $j \neq k$
inf S    infimum of the set S of real numbers (the largest lower bound of S)
sup S    supremum of the set S of real numbers (the smallest upper bound of S)
min S    the minimum of the set S of real numbers (the smallest element of S)
max S    the maximum of the set S of real numbers (the largest element of S)
$\underline{\lim}_{n \to \infty} a_n$    lower limit of the real sequence $(a_n)$
$\overline{\lim}_{n \to \infty} a_n$    upper limit of the real sequence $(a_n)$
The Landau Symbols

$f(x) = O(g(x))$, $x \to a$    $|f(x)| \le \text{const}\,|g(x)|$ for all x in a neighborhood of the point a
$f(x) = o(g(x))$, $x \to a$    $\lim_{x \to a} \frac{f(x)}{g(x)} = 0$
Norms and Inner Products

$\|x\|$    norm of x    7
$\lim_{n \to \infty} x_n = x$ (or $x_n \to x$ as $n \to \infty$)    the sequence $(x_n)$ converges to the point x    9
$\sum_{n=1}^{\infty} x_n$    infinite series in a Banach space    76
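$(x \mid y)$    inner product    105
$(x \mid y)$    Euclidean inner product, $(x \mid y) := \sum_{n=1}^{N} \bar{x}_n y_n$ ($\bar{x}_n$ conjugate complex number to $x_n$)    109
$|x|$    Euclidean norm, $|x| := (x \mid x)^{1/2} = \left( \sum_{n=1}^{N} |x_n|^2 \right)^{1/2}$    109
$|x|_\infty$    special norm, $|x|_\infty := \sup_n |x_n|$    11
$(u \mid v)_2$    inner product on the Lebesgue spaces $L_2(G)$ and $L_2^{\mathbb{K}}(G)$, $(u \mid v)_2 := \int_G \overline{u(x)}\, v(x)\, dx$    114
$\|u\|_2$    norm on the Lebesgue spaces $L_2(G)$ and $L_2^{\mathbb{K}}(G)$, $\|u\|_2 := (u \mid u)_2^{1/2} = \left( \int_G |u(x)|^2\, dx \right)^{1/2}$    114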
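$(u \mid v)_{1,2}$    inner product on the Sobolev space $W_2^1(G)$, $(u \mid v)_{1,2} := \int_G \left( uv + \sum_{j=1}^{N} \partial_j u\, \partial_j v \right) dx$    120
$\|u\|_{1,2}$    norm on the Sobolev space $W_2^1(G)$, $\|u\|_{1,2} := (u \mid u)_{1,2}^{1/2} = \left( \int_G \left( u^2 + \sum_{j=1}^{N} (\partial_j u)^2 \right) dx \right)^{1/2}$
$(\cdot \mid \cdot)_E$    energetic inner product    273
$\|\cdot\|_E$    energetic norm    273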
Operators
$A : S \subseteq X \to Y$    operator from the set S into the set Y, where $S \subseteq X$    17
$D(A)$ (or dom A)    domain of definition of the operator A    17
$R(A)$ (or im A)    range (or image) of the operator A    17
$N(A)$ (or ker A)    null space (or kernel) of the operator A, $N(A) := \{x : Ax = 0\}$    70
$I$ (or id)    identical operator, $Ix := x$ for all x    76
$A(S)$    image of the set S, $A(S) := \{Ax : x \in S\}$
$A^{-1}(T)$    preimage of the set T, $A^{-1}(T) := \{x : Ax \in T\}$    17
$A^{-1}$    inverse operator to A    17
$G(A)$    graph of the operator A, $G(A) := \{(x, Ax) : x \in D(A)\}$    414
$\|A\|$    norm of the linear operator A    70
$\|f\|$    norm of the functional f    75
$AB$ (or $A \circ B$)    the product of the operators A and B, $(AB)(u) := A(Bu)$    28
$A \subseteq B$    the operator B is an extension of the operator A    260
$A^*$    adjoint operator to the linear operator A    263
$A^T$    dual operator to the linear operator A (see Section 3.10 of AMS Vol. 109)
$\bar{A}$    closure of the linear operator A    415
$\sigma(A)$    spectrum of the linear operator A    83
$\rho(A)$    resolvent set of the linear operator A    83
$r(A)$    spectral radius of the linear operator A    94
rank A    rank of the linear operator A, rank A := dim R(A) (see Section 3.9 of AMS Vol. 109)
ind A    index of the linear operator A, ind A := dim N(A) - codim R(A)
det A    determinant of the matrix A
tr A    trace of the $(N \times N)$-matrix $A = (a_{km})$, tr A := $a_{11} + \cdots + a_{NN}$
tr A    trace of the linear operator A in a Hilbert space    347
Special Sets
$\bar{S}$    closure of the set S    31
int S    interior of the set S    31
ext S    exterior of the set S    31
$\partial S$    boundary of the set S    31
$U_\varepsilon(p)$    $\varepsilon$-neighborhood of the point p in a normed space, $U_\varepsilon(p) := \{x \in X : \|x - p\| < \varepsilon\}$    15
$U(p)$    neighborhood of the point p    15
dim X    dimension of the linear space X    6
$X_{\mathbb{C}}$    complexification of the linear space X    98
$X/L$    factor space (see Section 3.9 of AMS Vol. 109)
codim L    codimension of the linear subspace L, codim L := dim(X/L) (see Section 3.9 of AMS Vol. 109)
$L^\perp$    orthogonal complement to the linear subspace L    165
$\alpha S$    the product $\alpha S := \{\alpha x : x \in S\}$, $\alpha \in \mathbb{R}, \mathbb{C}$    7
$S + T$    the sum $S + T := \{x + y : x \in S \text{ and } y \in T\}$    7
$M \oplus L$    orthogonal direct sum ($M \oplus L$, where $L = M^\perp$)    165
$X \otimes Y$    tensor product    224
$X^*$    dual space    75
$X_E$    energetic space    273
span S    linear hull of the set S    31
co S    convex hull of the set S    31
$\overline{\text{co}}\, S$    closed convex hull of the set S    47
dist(p, S)    distance of the point p from the set S
diam S    diameter of the set S    47
meas S    measure of the set S    431
$\delta(x)$    the Dirac delta function    158
$\delta$    the delta distribution    161
Derivatives
$u'(t)$    derivative of an operator function $u = u(t)$ at time t    80
$\partial_j f$    partial derivative $\partial f / \partial x_j$
$\partial^\alpha f$    $\partial_1^{\alpha_1} \partial_2^{\alpha_2} \cdots \partial_N^{\alpha_N} f$, where $\alpha = (\alpha_1, \ldots, \alpha_N)$ (the classical symbols are also used for the derivatives of generalized functions)    159
$|\alpha|$    the sum $\alpha_1 + \cdots + \alpha_N$    159
$\partial / \partial n$    derivative in the direction of the exterior normal    181
$\Delta f$    Laplacian, $\Delta f := \sum_{j=1}^{N} \partial_j^2 f$    125
$\delta F(x; h)$    variation of the functional F at the point x in direction of h (see Section 2.1 of AMS Vol. 109)
$\delta^n F(x; h)$    nth variation of the functional F at the point x in the direction of h
$A'(x)$ (or $dA(x)$)    Fréchet derivative of the operator A at the point x (see Section 4.2 of AMS Vol. 109)
$d^n A(x)(h_1, \ldots, h_n)$    nth Fréchet differential of the operator A at the point x in the directions of $h_1, \ldots, h_n$ (see Section 4.2 of AMS Vol. 109)
Spaces of Continuous Functions

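$C[a, b]$, $C(G)$    14, 116
$L(X, Y)$, $L_{\text{inv}}(X, Y)$    73, 79
Spaces of Hölder Continuous Functions
$C^\alpha[a, b]$, $C^{k,\alpha}[a, b]$, $C^\alpha(G)$, $C^{k,\alpha}(G)$ ($C^\alpha(G) = C^{0,\alpha}(G)$)    95ff
Spaces of Smooth Functions
$C^k[a, b]$, $C^k(G)$, $C^k(\overline{G})$, $C^\infty(G)$, $C^k(G)_{\mathbb{C}}$ ($C^0(G) := C(G)$)    96, 116
$C_0^\infty(G)$ (or $\mathcal{D}(G)$), $\mathcal{S}$    116, 214
Spaces of Integrable Functions (Lebesgue Spaces)
$L_2(a, b)$, $L_2(G)$, $L_2^{\mathbb{K}}(G)$ ($L_2(G) := L_2^{\mathbb{K}}(G)$ if $\mathbb{K} = \mathbb{R}$)    130, 114
Sobolev Spaces
131, 132
Spaces of Sequences
$\mathbb{K}^\infty$, $l_2^{\mathbb{K}}$, $l_\infty^{\mathbb{K}}$ ($l_2 := l_2^{\mathbb{K}}$ if $\mathbb{K} = \mathbb{R}$)    95, 177
Spaces of Distributions
$\mathcal{D}'(G)$, $\mathcal{S}'$    160, 219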
List of Theorems
A good memory does not recall everything, but forgets the unimpor-
tant.
Folklore
Theorem 1.A (The Banach fixed-point theorem) 19
Theorem 1.B (The Brouwer fixed-point theorem) 53
Theorem 1.C (The Schauder fixed-point theorem) 61
Theorem 1.D (The Leray-Schauder principle) 65
Theorem 1.E (The method of sub- and supersolutions) 69
Theorem 2.A (Main theorem on quadratic minimum problems) 121
Theorem 2.B (The Dirichlet principle) 138
Theorem 2.C (The Ritz method) 141
Theorem 2.D (The perpendicular principle) 165
Theorem 2.E (The Riesz theorem) 167
Theorem 2.F (Dual quadratic variational problems) 170
Theorem 2.G (Nonlinear monotone operators) 173
Theorem 2.H (The nonlinear Lax-Milgram theorem) 175
Theorem 3.A (Complete orthonormal systems) 202
Theorem 4.A (Eigenvalues and eigenvectors of linear, symmetric,
compact operators) 232
Theorem 4.B (The Fredholm alternative for linear, symmetric,
compact operators) 237
Theorem 5.A (The Friedrichs extension of symmetric operators) 280
Theorem 5.B (The abstract Dirichlet problem) 282
Theorem 5.C (The eigenvalue problem) 284
Theorem 5.D (The Fredholm alternative) 306
Theorem 5.E (The abstract heat equation) 310
Theorem 5.F (The abstract wave equation) 310
Theorem 5.G (The abstract Schrödinger equation) 323
List of the Most Important
Definitions
Intelligence consists of this; that we recognize the similarity of dif-
ferent things and the difference between similar ones.
Baron de la Brède et de Montesquieu (1689-1755)
Spaces
linear space 7
dimension 7
linear subspace 30
Banach space 10
norm 7
separable 84
reflexive (see Section 2.8 of AMS Vol. 109)
Hilbert space 107
inner product 105
orthogonal elements 105
orthogonal projection 165
complete orthonormal system 200
Fock space (bosons or fermions) 364
Lebesgue space 114
Sobolev space 273
energetic space 273
dual space 74
metric space and topological space (see Chapter 1 of AMS Vol. 109)
Convergence
norm convergence 8
Cauchy sequence 10
weak convergence (see Section 2.4 of AMS Vol. 109)
sequentially continuous 27
sequentially compact 33
relatively sequentially compact 33
Operators
domain of definition 17
range and preimage 17
injective 17
surjective 17
bijective 17
inverse operator 17
linear 70
symmetric 264
the Friedrichs extension 280
adjoint 263
dual (cf. Section 3.10 of AMS Vol. 109)
self-adjoint 264
Hamiltonian 328
orthogonal projection operator 270
skew-adjoint 264
unitary 212
Fourier transformation 216
trace class 347
statistical state 348
statistical operator 350
Hilbert-Schmidt operator 347
continuous 26
k-contraction 19
Lipschitz continuous 27
Hölder continuous 97
homeomorphism 28
diffeomorphism 436
compact 39
strongly monotone 273
monotone or coercive (see Section 2.18 of AMS Vol. 109)
semigroup 298
Green function (propagator) 386
one-parameter group 298

dynamics of a quantum system 328
Fredholm alternative 237
linear Fredholm operator and index
nonlinear Fredholm operator (see Section 5.15 of AMS Vol. 109)
m-linear bounded (see Section 4.1 of AMS Vol. 109)
Functional
nonlinear 17
linear 74
convex 29
bilinear form 120
bounded 120
symmetric 120
distribution (generalized function) 160
tempered distribution 219
Fourier transformation 220
generalized eigenfunction 344
Dirac delta distribution 161
Green function 160
fundamental solution 183
Palais-Smale condition (see Section 2.16 of AMS Vol. 109)
Embedding
continuous 261
compact 261
Spectrum
eigenvalue and eigenvector 83

generalized eigenvector 344
resolvent set 83
resolvent operator 83
essential spectrum 84
spectral family 333
measurements in quantum systems 343
Set
open 15
neighborhood 15
interior 30
closed 15
closure 30
boundary 31
compact or relatively compact 33
dense 83
convex 29
bounded 33
countable 84
Point
fixed point 18
critical point (see Section 2.1 of AMS Vol. 109)
saddle point (see Section 2.2 of AMS Vol. 109)
bifurcation point (see Section 5.12 of AMS Vol. 109)
Operator Algebras
Banach algebra 76
von Neumann algebra 359
C*-algebra 357
observable 359
state 358
pure 358
mixed 358
KMS-state (thermodynamic equilibrium) 360
*-automorphism 358
dynamics of a quantum system 359
Derivative
time derivative 80
generalized derivative of a function 129
derivative of a distribution 162
nth variation (see Section 2.1 of AMS Vol. 109)
Fréchet derivative (see Section 4.2 of AMS Vol. 109)
Integral
Lebesgue integral 434

Lebesgue measure 429
integration by parts 118
Lebesgue-Stieltjes integral 441
Feynman path integral 385
Subject Index
a posteriori error estimate 19
a priori error estimate 19
a priori estimates 64
absolute continuity 437
absolute integrability 435
absolute temperature 352
absolutely convergent 76
abstract boundary-eigenvalue problem 255
abstract boundary-value problem 254
abstract Dirichlet problem 282
abstract Fourier series 200
abstract heat equation 255, 306
abstract Schrödinger equation 256, 323
abstract setting for quantum mechanics 327
abstract setting of quantum statistics 348
abstract wave equation 256, 310
action 393
addition theorems 305
adjoint operator 263
admissible paths 392
admissible sequence 274
algebraic approach to quantum statistics 357
almost all 431
almost everywhere 431
annihilation operator 366
anticommutation relations 367
antilinear 169
Apollonius' identity 178
Arzelà-Ascoli theorem 35
asymptotically free 368
*-automorphism 358

balls 16, 94
Banach algebra 76
Banach fixed-point theorem 18
Banach space 10
barycenter 47
barycentric subdivision 49
basic equation of quantum statistics 351
basis 42
Bernstein polynomials 86
Bessel inequality 202
beauty of functional analysis 256
Big Bang 357
bijective 17
bilinear form 120
black holes 340
Bolzano-Weierstrass theorem 34
Born approximation 403, 404
Bose-Einstein statistics 355
bosonic Fock space 364
bosons 363
bound states 372, 394, 396
boundary 31
boundary point 31
boundary-eigenvalue problem 245, 285, 316
boundary-value problem 125
bounded 33
bounded bilinear form 120
bounded orbits 421
bounded sequence 9
Brouwer fixed-point theorem 53
Brownian motion 381

C*-algebra 357
calculus of variations 117, 126
Cauchy sequence 10
Cayley transform 423
characteristic equation 94
charge density 180
chemical potential 352
chronological operator 395
classical function spaces 95, 97
classical Schwarz inequality 12
closed 15
closed balls 16
closed linear subspace 30
closure 31
closure of an operator 414
closed convex hull 31
commutation relations 365
commutative C*-algebra 358
compact 90
compact embedding 96, 261
compact operator 39
compact set 33
compactness 33
complete 10
complete orthonormal system 200, 209, 210, 223, 232, 247, 374
complete system of generalized eigenfunctions 346
completeness relation 374
completeness theorem 222
complex linear space 4
complex normed space 7
complexification 98
complexification of real Hilbert spaces 178
composite states of elementary particles 227
conservation of energy 336, 404
continuity 26
continuous 27
continuous Dirac calculus 375
continuous embedding 261
continuous spectrum 409
contraction principle 19
convergence 9
convex 29
convex hull 31
convexity 29
convolution 182
Coulomb force 180
Coulomb potential 420
countable 84
creation operators 364
critical point 321

defect indices 423
deflection of a string 157
degenerate kernels 251
degrees of freedom 229
dense 84
density 84, 189, 222
derivative 80
diagonal sequence 36
diameter 47
dielectricity constant 420
diffeomorphism 435
differential operator 267
diffusion 379, 386
diffusion equation 379
dimension 6
Dirac calculus 222, 373
Dirac δ-distribution 161
Dirac δ-function 158, 346
Dirac function 375
Dirichlet principle 101, 123, 138
Dirichlet problem 125
discrete Dirac calculus 374
discrete spectrum 409
disk 28
dispersion 328
dispersion of the energy 352
dispersion of the particle number 352
distance 7, 47
distributions 160
domain of definition 17
dominated convergence 437
dual maximum problem 169
dual space 75
duality map 169, 279
duality of quadratic variational problems 169
duality theory 142
dynamics 359
dynamical systems 303
dynamics of the harmonic oscillator 341
dynamics of quantum systems 328
dynamics of statistical systems 349
Dyson formula 395

eigenfunction 247, 340
eigenoscillations 198
eigenoscillations of the string 317
eigensolution 230, 241
eigenspace 230
eigenstate 329
eigenvalue 83, 230
eigenvalue problem 283
eigenvector 230
elastic energy 256
elasticity 145
electric field 180
electric field of a charged point 180
electrostatics 183
electrostatic potential 180
embedding theorems 262
energetic extension 279
energetic inner product 123, 144, 273
energetic norm 144, 273
energetic space 144, 154, 273
energy 309
energy of the harmonic oscillator 309
energy conservation 303, 310, 313
entropy 349, 353
equicontinuous 35
equivalent norms 42, 99
error estimate 19, 142
error estimates via duality 142
essential spectrum 84, 372
Euclidean norm 12, 109
Euclidean strategy in quantum physics 379
Euler-Lagrange equation 125
expansion of our universe 340, 357
exponential function 78
extension 261
extension principle 213
exterior 31
exterior point 31

Fatou lemma 110, 438
Fermi-Dirac statistics 356
fermions 363
fermionic Fock space 366
Feynman diagrams 400
Feynman formula 393
Feynman path integral 381, 385, 393
Feynman relation for transition amplitudes 397
Feynman-Kac formula 391
finite-dimensional Banach spaces 42
finite-dimensional space 6
finite elements 151
finite ε-net 38
finite multiplicity 230
fixed point 19
Fock space 363
force density 156
formal Hamiltonian 339
Fourier coefficients 195, 200
Fourier integral 198
Fourier method 315, 316
Fourier series 195, 203, 241, 247
Fourier transform 216, 376, 378, 413
Fourier transform of tempered generalized functions 219
Fredholm alternative 237, 245, 249, 284, 287
Fredholm integral operator 95
free Hamiltonian 371
Friedrichs extension 258, 280
Friedrichs extension in complex Hilbert spaces 419
Friedrichs' mollification 186
Fubini's theorem 439
function 17
functional calculus 293, 331
functions of bounded variation 440
functions of self-adjoint operators 293
fundamental solution 179, 182, 183, 186, 413
fundamental theorem of calculus 119

Gauss method of least squares 197
Gauss theorem 119
Gaussian functions 223
Gelfand-Kostyuchenko theorem 424
Gelfand-Levitan-Marchenko integral equation 410
general position 45
generalized boundary values 135, 138
generalized derivative 129
generalized diffusion equation 386
generalized Dirichlet problem 138
generalized eigenfunctions 343, 372, 377
generalized eigenvectors 424
generalized Fourier series 195
generalized functions 156, 160
generalized functions in mathematical physics 179
generalized initial-value problem 185
generalized plane wave 184
generalized problem 285, 306, 310
generalized solution 258
generalized triangle inequality 8
generator 298
geometric series 79
golden rule for the rate of convergence 143
golden rule of numerical analysis 143
graph 414
graph closed operators 414
Green function 147, 157, 164, 246, 251, 387, 392
group property 300

half-numberly spin 357
Hamiltonian 328, 371, 418
Hamiltonian of the harmonic oscillator 341
Hamiltonian of the hydrogen atom 420
harmonic oscillations 196, 303
harmonic oscillation in quantum mechanics 338
heat equation 182
Heisenberg S-operator 402
Heisenberg uncertainty principle 331, 342
Hermitean functions 210, 339
Hermitean polynomials 210
Hilbert space 107
Hilbert-Schmidt operator 347
Hilbert-Schmidt theory 232
Hölder continuous functions 96, 97
homeomorphic 28
homeomorphism 28
homogeneity in time 302
*-homomorphism 358
hydrogen atom and the Friedrichs extension 420

idea of orthogonality 124
ideal elements 127
identical operator 77
infinite series 76
infinite-dimensional space 6
initial-value problem 24
injective 17
inner product 105
integer spin 357
integrable 434
integration by parts 117, 119, 159
integral equations 22, 62, 240
integral operator 18, 40, 265
integral of a step function 432
iterated integration 438
interior 31
interior point 31
inverse Fourier transformation 216, 378
inverse operator 17
inverse scattering theory 406, 409
irreversible process in nature 300, 303
isometric operators 422
*-isomorphism 358
iteration method 18, 68, 395, 404

k-contractive 19
Kato perturbation theorem 417
KdV equation 407
kinetic energy 321, 336
KMS-states 360
Knaster, Kuratowski, and Mazurkiewicz lemma 58
Korteweg-de Vries equation 406

lack of classic solution 127
Lagrange multiplier rule 354
Laguerre functions 222
language of physicists 221, 373
Laplacian 125, 277, 285
Lax pair 412
Lax-Milgram theorem 174
least-squares method 201
Lebesgue integral 434
Lebesgue measure 429
Lebesgue spaces 114
Lebesgue-Stieltjes integral 441
Legendre polynomials 209
Leray-Schauder principle 64
linear combinations 2
linear continuous functional 74
linear hull 31
linear integral equation 23
linear Lax-Milgram theorem 175
linear operator 70
linear orthogonality principle 172
linear space 3
linear subspace 30
linearly independent 5
Lippmann-Schwinger integral equation 404
Lipschitz continuous 27
Luzin's theorem 432

majorant criterion 436
Malgrange-Ehrenpreis theorem 186
mapping degree 55
mass conservation 379
matrix 71
maximally skew-symmetric 267
maximally symmetric 267
maximum 37
Maxwell-Boltzmann statistics 356
mean continuity 437
measure 430
measurable functions 432
measurable set 430
measurements 328, 349
method of finite elements 145
microscattering processes 400
mild (generalized) solution 307, 311, 320, 324
minimal sequence 176
minimum 37
Minkowski functional 50
mixed state 358
models of quantum field theory 368
momentum operator 342, 345, 376
momentum vector 336
monotone convergence 437
monotone operators 173
multiindex 159
multiplication operator 269, 334
multiplicity 230

Navier-Stokes equations 65
neighborhood 15
Neumann series 79
Newtonian equation 336
nonexpansive semigroup 299
nonlinear Fourier transformation 413
nonlinear Lax-Milgram theorem 175
nonlinear mathematical physics 173
nonlinear orthogonality principle 174
norm 7
normal order cone 67
normed space 7
nuclear spaces 424
null space 70

observables 328, 359
one-dimensional wave 323
one-parameter group 299
one-parameter unitary group 301, 328
open 15
open neighborhood 15
operator 16
operator functions 76, 77, 257
operator norm 70
order cone 66
ordered Banach space 67
ordered normed space 67
ordered sets 441
ordinary differential equations 24, 63
orthogonal 105
orthogonal complement 165, 178
orthogonal decomposition 165
orthogonal projection 165, 270
orthogonality 124
orthogonality principle 172, 175
orthonormal system 199

parallelogram identity 123, 178
parameter integrals 439
Parseval equation 203, 222
particle number 352
particle number operator 366
particle stream 344, 403
partial differential equations of mathematical physics 256
partition function 353
path integral 390
Pauli principle 356, 363
Paley-Wiener theorem 186, 223
Peano theorem 63
perpendicular principle 123, 165
phase space 393
photon 340
physical interpretation of the Green function 158
physical states 327
Picard-Lindelöf theorem 24
Planck quantum action 329
Planck's radiation law 340, 357
plane wave 184
Poincaré inequality 287
Poincaré-Friedrichs inequality 136
Poisson equation 125, 258, 285
position operator 342, 346, 377
positive bilinear form 120
potential 182
potential barrier 403
potential energy 321, 336
potential equation 182
potential theory 95
pre-Hilbert space 105
preimage 17
principle of indistinguishability 363
principle of maximal entropy 353
principle of minimal potential energy 256
principle of minimal potential errors 142
principle of stationary action 321
principle of virtual power 142, 147
principle of virtual work 147
probability 336, 352
probability conservation of quantum processes 303
probability of measuring the position 343
product rule 107
propagation of probability 393
propagator 387
pure state 358
Pythagorean theorem 167

quadratic variational problems 121
quantization 336
quantization of action 393
quantum field theory 363
quantum hypothesis 340
quantum mechanics 327, 336
quantum statistics 348
quantum system 327

range 17
rate of convergence 19, 142
real normed space 7
real linear space 4
regularity theory 128
relatively compact 90
relatively sequentially compact 33
Rellich's compactness theorem 287
resolvent 83
resolvent set 83
resonance condition 249
restriction 259
retraction 55
reversibility 302
reversible processes in nature 302
Riesz theorem 167
Ritz equation 140, 152
Ritz method 140, 151, 179

S-matrix 402
scalar multiplication 3
scattering of a particle stream 399
scattering theory 368
Schauder fixed-point theorem 61
Schauder operator 41
Schmidt orthogonalization method 207
Schrödinger equation 323, 336, 338, 381, 395
Schwarz inequality 105
self-adjoint operator 264, 416
semi-Fredholm 83
semigroup 298
separable 84, 116
separability 191
sequentially compact 33
sequentially continuous 27
sharp energy 341
simple eigenvalue 247
simplex 46
skew-adjoint operator 264
skew-symmetric operator 264
smoothing of functions 186
smoothing technique 117
Sobolev embedding theorems 192
Sobolev space 130, 260, 277
solitons 406
spectral family 333
spectral radius 94
spectral transformation 410
spectrum 83, 94, 238
spectrum of the hydrogen atom 421
Sperner lemma 56
Sperner simplex 57
standard model in statistical physics 351, 360
states 327, 359
stationary particle states 337
stationary Schrödinger equation 337
statistical operator 350
statistical potential 353
statistical states 348
step function 432
Stieltjes integral 333, 440
string 158, 315
string energy 320
string equation 321
strong causality 302
strongly continuous semigroup 298
strongly monotone operators 173, 259, 273
strongly positive bilinear form 120
subsolution 69
superselection rules 328
superposition 196
superposition of eigenoscillations 317
supersolution 69
surjective 17
symmetric bilinear form 120
symmetric operator 230, 264, 415

temperature
tempered delta distribution 221
tempered distributions 220
tensor product 224, 226
tensor product of functions 184
tensor product of generalized functions 185
thermodynamic equilibrium 360
thermodynamical quantities 353
time evolution 329
time evolution of quantum systems 330
time-dependent processes in nature 298
time-dependent scattering theory 398
time-independent scattering theory 404
Tonelli's theorem 439
total energy 321, 336
totally ordered 442
trace class operators 347, 421
transformation rule for integrals 435
transition amplitude 397
transition probabilities 397
Trefftz method 170
triangle inequality 7
triangulation 49
trigonometric polynomials 204, 206
two-soliton solutions 407

unbounded operator 300
unbounded orbits 421
uncertainty inequality 330
uncertainty principle 343
uniformly continuous 27
uniformly continuous one-parameter group 299
uniqueness implies existence 237
unitary operator 212, 219, 272, 330

variance 328
variational equation 140
variational lemma 117
variational problem 125, 140, 281, 287
vertices 46
vibrating string 315
Volterra integral operator 95
volume potential 183
von Neumann algebra 359

waves 406
wave equation 182, 185, 309
wave operators 369
Weierstrass approximation theorem 84
Weierstrass classical counterexample 176
Weierstrass theorem 37
white dwarfs 357
Wiener path integral 381, 386

Zorn's lemma 442



Production managed by Laura Carlson; manufacturing supervised by Joe Quatela.

9 8 7 6 5 4 3 (Corrected third printing, 1999)

ISBN 978-1-4612-6910-6 SPIN 10738833
