
ANALYSIS

AN INTRODUCTORY COURSE
Ivan F Wilde
Mathematics Department
King's College London
iwilde@mth.kcl.ac.uk
Contents
1 Sets
2 The Real Numbers
3 Sequences
4 Series
5 Functions
6 Power Series
7 The elementary functions
Chapter 1
Sets
It is very convenient to introduce some notation and terminology from set
theory. A set is just a collection of objects which will usually be certain
mathematical objects, such as numbers, points in the plane, functions or
some such. If $A$ denotes some given set and $x$ denotes an object belonging
to $A$, then this fact is indicated by the expression
$$x \in A$$
to be read as "$x$ belongs to $A$", or "$x$ is a member of $A$", or "$x$ is an element
of $A$". If $x$ denotes some object which does not belong to the set $A$, then
this is indicated by the symbolism
$$x \notin A$$
and is read as "$x$ does not belong to $A$", or "$x$ is not a member of $A$", or
"$x$ is not an element of $A$".
To say that the sets $A$ and $B$ are equal is to say that they have the same
elements. In other words, to say that $A = B$ is to say both that if $x \in A$
then also $x \in B$ and if $y \in B$ then also $y \in A$. We can write this as
$$A = B \quad\text{is the same as}\quad \begin{cases} x \in A \implies x \in B \\ y \in B \implies y \in A. \end{cases}$$
The verification that given sets $A$ and $B$ are equal is made up of two
parts. The first is the verification that every element of $A$ is also an element
of $B$ and the second part is the verification that every element of $B$ is also
an element of $A$.
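The two-part verification can be mirrored on finite sets. The following is only a sketch: the helper name `sets_equal` is an illustrative assumption, and Python's built-in `set` type stands in for the abstract notion of a set.

```python
def sets_equal(a, b):
    """Return True when a and b have the same elements (double inclusion)."""
    a_in_b = all(x in b for x in a)   # every element of a is an element of b
    b_in_a = all(y in a for y in b)   # every element of b is an element of a
    return a_in_b and b_in_a


print(sets_equal({2, 3, 4}, {4, 3, 2}))   # True
print(sets_equal({2, 3, 4}, {2, 3}))      # False: 4 is not in the second set
```

Both halves are needed: the second call passes the "every element of b is in a" test but fails the other one.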
We list a few examples of sets and also introduce some notation.
Examples 1.1.
1. The set consisting of the three integers 2, 3, 4. We write this as $\{2, 3, 4\}$.
2. The set of natural numbers $\{1, 2, 3, 4, 5, 6, \dots\}$ (i.e., all strictly positive
integers). This set is denoted by $\mathbb{N}$. Notice that $0 \notin \mathbb{N}$.
3. The set of all real numbers, denoted by $\mathbb{R}$. For example, $8$, $11$, $0$,
$\sqrt{5}$, $\tfrac{1}{2}$, $\tfrac{1}{3}$ are elements of $\mathbb{R}$.
4. The set of complex numbers is denoted by $\mathbb{C}$.
5. The set of all integers (positive, negative and including zero) is denoted
by $\mathbb{Z}$.
6. The set of all rational numbers (all real numbers of the form $\frac{m}{n}$ for
integers $m$, $n$ with $n \neq 0$) is denoted by $\mathbb{Q}$. For example, the real numbers
$\tfrac{3}{4}$, $\tfrac{17}{9}$, $0$, $78$, $3$ belong to $\mathbb{Q}$, but $\sqrt{2} \notin \mathbb{Q}$.
7. The set of even natural numbers $\{2, 4, 6, 8, \dots\}$. This could also be
written as $\{n \in \mathbb{N} : n = 2m \text{ for some } m \in \mathbb{N}\}$. (The colon ":" stands for
"such that" (or "with the property that"), so this can be read as "the
set of all $n$ in $\mathbb{N}$ such that $n = 2m$ for some $m$ in $\mathbb{N}$".)
8. The set $\{x \in \mathbb{R} : x > 1\}$ is the set of all those real numbers strictly
greater than 1.
9. The set $\{z \in \mathbb{C} : |z| = 1\}$ is the set of complex numbers with absolute
value equal to 1. This is the unit circle in $\mathbb{C}$ (the circle with centre
at the origin and with radius equal to 1).
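Certain sets of real numbers, so-called intervals, are given a special
notation with the use of round and square brackets. Let $a \in \mathbb{R}$ and $b \in \mathbb{R}$
and suppose that $a < b$.
$\{x \in \mathbb{R} : a \le x \le b\}$ is denoted $[a, b]$ (closed interval)
$\{x \in \mathbb{R} : a < x < b\}$ is denoted $(a, b)$ (open interval)
$\{x \in \mathbb{R} : a \le x < b\}$ is denoted $[a, b)$ (closed-open interval)
$\{x \in \mathbb{R} : a < x \le b\}$ is denoted $(a, b]$ (open-closed interval)
$\{x \in \mathbb{R} : x \le a\}$ is denoted $(-\infty, a]$
$\{x \in \mathbb{R} : x < a\}$ is denoted $(-\infty, a)$
$\{x \in \mathbb{R} : a \le x\}$ is denoted $[a, \infty)$
$\{x \in \mathbb{R} : a < x\}$ is denoted $(a, \infty)$.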
It is important to realize that all this is just notation, a useful visual
short-hand. In particular, the symbol $\infty$ is used in four of the cases. This in
no way is meant to imply that $\infty$ represents a real number; it positively,
absolutely, certainly is not.
$\infty$ is not a real number. There is no such real number as $\infty$.
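Given sets $A$ and $B$, we say that $A$ is a subset of $B$ if every element of
$A$ is also an element of $B$, i.e., $x \in A \implies x \in B$. If this is the case, we
write
$$A \subseteq B$$
read "$A$ is a subset of $B$". By virtue of our earlier discussion of the
equality $A = B$, we can say that
$$A = B \quad\text{if and only if}\quad \text{both } A \subseteq B \text{ and } B \subseteq A.$$
We have $\mathbb{N} \subseteq \mathbb{R}$, $\mathbb{Q} \subseteq \mathbb{R}$, $\mathbb{N} \subseteq \mathbb{Z}$.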
Definition 1.2. Suppose that $A$ and $B$ are given sets. The union of $A$ and $B$,
denoted by $A \cup B$, is the set with elements which belong to either $A$ or $B$
(or both);
$$A \cup B = \{x : x \in A \text{ or } x \in B\}$$
read "$A$ union $B$ equals . . . ". Note that the usage of the word "or"
allows both.
In non-mathematical language, the union $A \cup B$ is obtained by bundling
together everything in $A$ and everything in $B$. Clearly, by construction,
$A \subseteq A \cup B$ and also $B \subseteq A \cup B$.
Example 1.3. Suppose that $A = \{1, 2, 3\}$ and $B = \{3, 6, 8\}$. Then we find
that $A \cup B = \{1, 2, 3, 6, 8\}$.
Definition 1.4. The intersection of $A$ and $B$, denoted by $A \cap B$, is the set
with elements which belong to both $A$ and $B$;
$$A \cap B = \{x : x \in A \text{ and } x \in B\}$$
read "$A$ intersect $B$ equals . . . ".
In non-mathematical language, the intersection $A \cap B$ is got by selecting
everything which belongs to both $A$ and $B$. Clearly, by construction, we see
that $A \cap B \subseteq A$ and also $A \cap B \subseteq B$.
Example 1.5. With $A = \{1, 2, 3\}$ and $B = \{3, 6, 8\}$, as in the example
above, we see that $A \cap B = \{3\}$.
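The sets $A = \{1, 2, 3\}$ and $B = \{3, 6, 8\}$ from the examples can be reproduced with Python's built-in `set` type, where `|` plays the role of $\cup$, `&` the role of $\cap$ and `<=` the role of $\subseteq$. This is a small illustrative sketch, not part of the formal development.

```python
A = {1, 2, 3}
B = {3, 6, 8}

union = A | B            # elements belonging to A or B (or both)
intersection = A & B     # elements belonging to both A and B

print(sorted(union))         # [1, 2, 3, 6, 8]
print(sorted(intersection))  # [3]

# The inclusions noted after the definitions: A and B sit inside the union,
# and the intersection sits inside both A and B.
print(A <= union and B <= union)                 # True
print(intersection <= A and intersection <= B)   # True
```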
If $A$ and $B$ have no elements in common then their intersection $A \cap B$ has
no elements at all. It is convenient to provide a symbol for this situation.
We let $\emptyset$ denote the set with no elements. $\emptyset$ is called the empty set.
Then $A \cap B = \emptyset$ if $A$ and $B$ have no common elements. In such a situation,
we say that $A$ and $B$ are disjoint.
Example 1.6. Let $A$ and $B$ be the intervals in $\mathbb{R}$ given as $A = (1, 4]$ and
$B = (4, 6)$. Then $A \cap B = \emptyset$ and $A \cup B = (1, 6)$.
Remark 1.7. Let $A$ and $B$ be given sets and consider the truth, or otherwise,
of the statement "$A \subseteq B$". This fails to be true precisely when $A$ possesses
an element which is not a member of $B$.
Now suppose that $A = \emptyset$. The statement "$\emptyset \subseteq B$" is false provided
that there is some "nuisance" element of $\emptyset$ which is not an element of $B$.
However, $\emptyset$ has no elements at all, so there can be no such "nuisance"
element. In other words, the statement "$\emptyset \subseteq B$" cannot be false and
consequently must be true; $\emptyset$ obeys $\emptyset \subseteq B$ for any set $B$. This might seem
a bit odd, but is just a logical consequence of the formalism.
Theorem 1.8. For sets $A$, $B$ and $C$, we have
(1) $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$.
(2) $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$.
Proof. (1) We must show that lhs $\subseteq$ rhs and that rhs $\subseteq$ lhs. First, we shall
show that lhs $\subseteq$ rhs. If lhs $= \emptyset$, then we are done, because $\emptyset$ is a subset of
any set. So now suppose that lhs $\neq \emptyset$ and let $x \in$ lhs $= A \cup (B \cap C)$. Then
$x \in A$ or $x \in (B \cap C)$ (or both).
(i) Suppose $x \in A$. Then $x \in A \cup B$ and also $x \in A \cup C$ and therefore
$x \in (A \cup B) \cap (A \cup C)$, that is, $x \in$ rhs.
(ii) Suppose that $x \in (B \cap C)$. Then $x \in B$ and $x \in C$ and so $x \in A \cup B$
and also $x \in A \cup C$. Therefore $x \in (A \cup B) \cap (A \cup C)$, that is, $x \in$ rhs.
So in either case (i) or (ii) (and at least one of these must be true), we find
that $x \in$ rhs. Since $x \in$ lhs is arbitrary, we deduce that every element of the
left hand side is also an element of the right hand side, that is, lhs $\subseteq$ rhs.
Now we shall show that rhs $\subseteq$ lhs. If rhs $= \emptyset$, then there is no more
to prove. So suppose that rhs $\neq \emptyset$. Let $x \in$ rhs. Then $x \in (A \cup B)$ and
$x \in (A \cup C)$.
Case (i): suppose $x \in A$. Then certainly $x \in A \cup (B \cap C)$ and so $x \in$ lhs.
Case (ii): suppose $x \notin A$. Then since $x \in (A \cup B)$, it follows that $x \in B$.
Also $x \in (A \cup C)$ and so it follows that $x \in C$. Hence $x \in B \cap C$ and so
$x \in A \cup (B \cap C)$ which tells us that $x \in$ lhs.
We have seen that every element of the right hand side also belongs to the
left hand side, that is, rhs $\subseteq$ lhs.
Combining these two parts, we have lhs $\subseteq$ rhs and also rhs $\subseteq$ lhs and so
it follows that lhs = rhs, as required.
(2) This is left as an exercise.
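A finite check is no substitute for the proof, but it is a useful sanity test of both identities in Theorem 1.8. The sketch below runs through every triple of subsets of a small universe; the helper names `powerset` and `distributive_laws_hold` are illustrative assumptions.

```python
from itertools import chain, combinations, product


def powerset(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def distributive_laws_hold(universe):
    """Check parts (1) and (2) of Theorem 1.8 for all A, B, C inside universe."""
    subsets = powerset(universe)
    for A, B, C in product(subsets, repeat=3):
        if A | (B & C) != (A | B) & (A | C):   # part (1)
            return False
        if A & (B | C) != (A & B) | (A & C):   # part (2)
            return False
    return True


print(distributive_laws_hold({0, 1, 2}))   # True (8 subsets, 512 triples)
```

The empty set is among the subsets tested, so the special cases lhs $= \emptyset$ and rhs $= \emptyset$ treated in the proof are covered as well.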
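The notions of union and intersection extend to the situation with more
than just two sets. For example,
$$\begin{aligned}
A_1 \cup A_2 \cup A_3 &= \{x : x \in A_1 \text{ or } x \in A_2 \text{ or } x \in A_3\} \\
&= \{x : x \text{ belongs to at least one of the sets } A_1, A_2, A_3\} \\
&= \{x : x \in A_i \text{ for some } i = 1 \text{ or } 2 \text{ or } 3\} \\
&= \{x : x \in A_i \text{ for some } i \in \{1, 2, 3\}\}.
\end{aligned}$$
More generally, for $n$ sets $A_1, A_2, \dots, A_n$, we have
$$A_1 \cup A_2 \cup \dots \cup A_n = \{x : x \in A_i \text{ for some } i \in \{1, 2, \dots, n\}\}.$$
This union is often denoted by $\bigcup_{i=1}^{n} A_i$, which is somewhat more concise than
the alternative $A_1 \cup A_2 \cup \dots \cup A_n$. Let $\Lambda$ denote the index set $\{1, 2, \dots, n\}$.
This is just the set of labels for the collection of sets we are considering. Then
the above can be conveniently written as
$$\bigcup_{i \in \Lambda} A_i = \{x : x \in A_i \text{ for some } i \in \Lambda\}.$$
This all makes sense for any non-empty index set. Indeed, suppose that we
have some collection of sets indexed (that is, labelled) by a set $\Lambda$. Suppose
the set with label $\alpha \in \Lambda$ is denoted by $A_\alpha$. The union of all the $A_\alpha$s is
defined to be
$$\bigcup_{\alpha \in \Lambda} A_\alpha = \{x : x \in A_\alpha \text{ for some } \alpha \in \Lambda\}.$$
If $\Lambda = \mathbb{N}$, one often writes $\bigcup_{i=1}^{\infty} A_i$ for $\bigcup_{i \in \mathbb{N}} A_i$.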
Examples 1.9.
1. Suppose that $\Lambda = \{1, 2, 3, \dots, 57, 58\}$ and $A_j = [j, j + 1]$ for each $j \in \Lambda$.
(So, for example, with $j = 7$, $A_7 = [7, 7 + 1] = [7, 8]$.) Then
$$\bigcup_{j=1}^{58} A_j = [1, 59].$$
2. Suppose that $\Lambda = \mathbb{N}$ and $A_j = [1, j + 1]$ for $j \in \mathbb{N}$. Then
$$\bigcup_{j=1}^{\infty} A_j = [1, \infty).$$
To see this, suppose that $x \in \bigcup_{j=1}^{\infty} A_j$. Then $x$ is an element of at least
one of the $A_j$s, that is, there is some $j_0$, say, in $\mathbb{N}$ such that $x \in A_{j_0}$.
This means that $x \in [1, j_0 + 1]$, that is, $1 \le x \le j_0 + 1$ and so certainly
$x \in [1, \infty)$. It follows that lhs $\subseteq$ rhs.
Now suppose that $x \in [1, \infty)$. Then, in particular, $x \ge 1$. Let $N$
be any natural number satisfying $N > x$. Then certainly $x$ satisfies
$1 \le x \le N + 1$ which means that $x \in A_N$ and so $x \in \bigcup_{j=1}^{\infty} A_j$. Hence
rhs $\subseteq$ lhs and the equality lhs = rhs follows.
3. Suppose that $\Lambda$ is the interval $(0, 1)$ and, for each $\alpha \in (0, 1)$, $A_\alpha$ is given
by $A_\alpha = \{(x, y) \in \mathbb{R}^2 : x = \alpha\}$. In other words, $A_\alpha$ is the vertical line
$x = \alpha$ in the plane $\mathbb{R}^2$. Then
$$\bigcup_{\alpha \in \Lambda} A_\alpha = \{(x, y) \in \mathbb{R}^2 : 0 < x < 1\}$$
which is the vertical strip in $\mathbb{R}^2$ with boundary edges given by the lines
with $x = 0$ and $x = 1$, respectively. Note that these lines (boundary
edges) are not part of the union of the $A_\alpha$s.
4. Let $\Lambda$ be the interval $[3, 5]$ and for each $\alpha \in [3, 5]$ let $A_\alpha = \{\alpha\}$. In other
words, $A_\alpha$ consists of just one point, the real number $\alpha$. Then
$$\bigcup_{\alpha \in \Lambda} A_\alpha = [3, 5]$$
which just says that the interval $[3, 5]$ is the union of all its points (as it
should be).
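The first two of these examples can be explored numerically. In the sketch below, `in_finite_union` tests membership of the union in item 1 directly from the definition ("$x \in A_j$ for some $j$"), and `witness_index` produces the index $N$ used in the argument of item 2. Both function names are illustrative assumptions, and exact rationals (`Fraction`) are used to avoid rounding.

```python
from fractions import Fraction
import math


def in_finite_union(x, n=58):
    """Is x in the union of the intervals [j, j + 1] for j = 1, ..., n?"""
    return any(j <= x <= j + 1 for j in range(1, n + 1))


def witness_index(x):
    """For x >= 1, return a natural number N with x in A_N = [1, N + 1]."""
    N = math.floor(x) + 1        # a natural number with N > x
    assert 1 <= x <= N + 1       # so x belongs to A_N
    return N


print(in_finite_union(Fraction(117, 2)))   # 58.5 lies in [58, 59]: True
print(in_finite_union(Fraction(119, 2)))   # 59.5 lies outside [1, 59]: False
print(witness_index(Fraction(7, 2)))       # 4, since 1 <= 3.5 <= 5
```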
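A similar discussion can be made regarding intersections.
$$\begin{aligned}
A_1 \cap A_2 \cap A_3 &= \{x : x \in A_1 \text{ and } x \in A_2 \text{ and } x \in A_3\} \\
&= \{x : x \text{ belongs to every one of the sets } A_1, A_2, A_3\} \\
&= \{x : x \in A_i \text{ for all } i = 1, 2, 3\} \\
&= \{x : x \in A_i \text{ for all } i \in \{1, 2, 3\}\}.
\end{aligned}$$
In general, if $\{A_\alpha\}_{\alpha \in \Lambda}$ is any collection of sets indexed by the (non-empty)
set $\Lambda$, then the intersection of the $A_\alpha$s is
$$\bigcap_{\alpha \in \Lambda} A_\alpha = \{x : x \in A_\alpha \text{ for all } \alpha \in \Lambda\}.$$
If $\Lambda = \{1, 2, \dots, n\}$, we usually write $\bigcap_{i=1}^{n} A_i$ for $\bigcap_{i \in \Lambda} A_i$ and if $\Lambda = \mathbb{N}$, then
we usually write $\bigcap_{i=1}^{\infty} A_i$ for $\bigcap_{i \in \Lambda} A_i$.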
Examples 1.10.
1. Suppose that $\Lambda = \mathbb{N}$ and for each $j \in \mathbb{N}$, let $A_j = [0, j]$. Then
$$\bigcap_{j \in \mathbb{N}} A_j = [0, 1].$$
2. Let $\Lambda = \mathbb{N}$ and set $A_j = [j, j + 1]$ for $j \in \mathbb{N}$. Then
$$\bigcap_{j=1}^{\infty} A_j = \emptyset.$$
3. Let $\Lambda = \mathbb{N}$ and set $A_j = [j, \infty)$ for $j \in \mathbb{N}$. Then
$$\bigcap_{j=1}^{\infty} A_j = \emptyset.$$
To see this, note that $x \in \bigcap_{j=1}^{\infty} A_j$ provided that $x$ belongs to every $A_j$.
This means that $x$ satisfies $j \le x$ for all $j \in \mathbb{N}$. But clearly this
fails whenever $j$ is a natural number strictly greater than $x$. In other
words, there are no real numbers which satisfy this criterion.
4. Suppose that $\Lambda = \mathbb{N}$ and for each $k \in \mathbb{N}$ let $A_k$ be the interval given by
$A_k = [0, 1/k)$. Then, in this case,
$$\bigcap_{k=1}^{\infty} A_k = \{0\}.$$
This follows because the only non-negative real number which is smaller
than every $1/k$ (where $k \in \mathbb{N}$) is zero.
5. Let $\Lambda = \mathbb{N}$ and let $A_k = [0, 1 + 1/k]$ for $k \in \mathbb{N}$. Then
$$\bigcap_{k=1}^{\infty} A_k = [0, 1].$$
Indeed, $[0, 1] \subseteq A_k$ for every $k$ and if $x \notin [0, 1]$ then $x$ must fail to belong
to some $A_k$.
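Item 4 can be made concrete: for any $x > 0$ one can name an index $k$ for which $x \notin A_k = [0, 1/k)$, which is why only $0$ survives in the intersection. The sketch below uses exact rationals; the names `in_A` and `excluding_index` are illustrative assumptions.

```python
from fractions import Fraction
import math


def in_A(x, k):
    """Membership of x in A_k = [0, 1/k)."""
    return 0 <= x < Fraction(1, k)


def excluding_index(x):
    """For x > 0, the smallest natural number k with x not in A_k.

    x fails to lie in [0, 1/k) exactly when x >= 1/k, i.e. when k >= 1/x.
    """
    return math.ceil(1 / x)


x = Fraction(3, 10)
k = excluding_index(x)
print(k, in_A(x, k), in_A(x, k - 1))   # 4 False True
print(all(in_A(0, j) for j in range(1, 1000)))   # True: 0 lies in every A_j tested
```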
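Theorem 1.11. Suppose that $A$ and $B_\alpha$, for $\alpha \in \Lambda$, are given sets. Then
(1) $A \cup \Bigl(\bigcap_{\alpha \in \Lambda} B_\alpha\Bigr) = \bigcap_{\alpha \in \Lambda} (A \cup B_\alpha)$.
(2) $A \cap \Bigl(\bigcup_{\alpha \in \Lambda} B_\alpha\Bigr) = \bigcup_{\alpha \in \Lambda} (A \cap B_\alpha)$.
Proof. (1) Suppose that $x \in A \cup \bigl(\bigcap_\alpha B_\alpha\bigr)$. If $x \in A$, then $x \in A \cup B_\alpha$ for
all $\alpha$ and so $x \in \bigcap_\alpha (A \cup B_\alpha)$. If $x \notin A$, then it must be the case that
$x \in \bigcap_\alpha B_\alpha$, in which case $x \in B_\alpha$ for all $\alpha$ and so $x \in \bigcap_\alpha (A \cup B_\alpha)$. We
have shown that $A \cup \bigl(\bigcap_\alpha B_\alpha\bigr) \subseteq \bigcap_\alpha (A \cup B_\alpha)$.
To establish the reverse inclusion, suppose that $x \in \bigcap_\alpha (A \cup B_\alpha)$. Then
$x \in A \cup B_\alpha$ for every $\alpha$. If $x \in A$, then certainly $x \in A \cup \bigl(\bigcap_\alpha B_\alpha\bigr)$. If
$x \notin A$, then we must have that $x \in B_\alpha$ for every $\alpha$, that is, $x \in \bigcap_\alpha B_\alpha$.
But then it follows that $x \in A \cup \bigl(\bigcap_\alpha B_\alpha\bigr)$.
Hence $\bigcap_\alpha (A \cup B_\alpha) \subseteq A \cup \bigl(\bigcap_\alpha B_\alpha\bigr)$ and so the equality
$$A \cup \Bigl(\bigcap_\alpha B_\alpha\Bigr) = \bigcap_\alpha (A \cup B_\alpha)$$
follows.
(2) The proof of this proceeds along similar lines to part (1).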
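For a finite family, both parts of Theorem 1.11 can be checked directly. The following is only a sketch on one sample family (a Python list stands in for the indexed collection $B_\alpha$, and the function names are illustrative assumptions); it does not replace the proof, which covers arbitrary index sets.

```python
def union_over_intersection(A, family):
    """Part (1): A ∪ (⋂ B) == ⋂ (A ∪ B) over a non-empty list of sets."""
    lhs = A | set.intersection(*family)
    rhs = set.intersection(*[A | B for B in family])
    return lhs == rhs


def intersection_over_union(A, family):
    """Part (2): A ∩ (⋃ B) == ⋃ (A ∩ B) over a non-empty list of sets."""
    lhs = A & set.union(*family)
    rhs = set.union(*[A & B for B in family])
    return lhs == rhs


A = {1, 2}
family = [{2, 3, 4}, {3, 4, 5}, {4, 5, 6}]
print(union_over_intersection(A, family))   # True: both sides equal {1, 2, 4}
print(intersection_over_union(A, family))   # True: both sides equal {2}
```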
Chapter 2
The Real Numbers
In this chapter, we will discuss the properties of R, the real number system.
It might well be appropriate to ask exactly what a real number is. It is
the job of mathematics to set out clear descriptions of the objects within its
scope, so it is not at all unreasonable to expect an answer to this. One must
start somewhere. For example, in geometry, one might take the concept
of "point" as a basic undefined object. Lines are then specified by pairs
of points: the line passing through them. Beginning with the natural
numbers, $\mathbb{N}$, one can construct $\mathbb{Z}$ and from $\mathbb{Z}$ one constructs the rationals, $\mathbb{Q}$.
Finally from $\mathbb{Q}$ it is possible to construct the real numbers $\mathbb{R}$. We will not
do this here, but rather we will take a close look at the structure and special
properties of R. Of course, everybody knows that numbers can be added
and multiplied and even subtracted and it makes sense to divide one number
by another (as long as the latter, the denominator, is not zero). We can also
compare two numbers and discuss which is the larger. It is precisely these
properties (or axioms) that we wish to isolate and highlight.
Arithmetic
To each pair of real numbers $a, b \in \mathbb{R}$, there corresponds a third, denoted
$a + b$. This pairing, denoted $+$ and called addition, obeys
(A1) $a + (b + c) = (a + b) + c$, for all $a, b, c \in \mathbb{R}$.
(A2) $a + b = b + a$, for all $a, b \in \mathbb{R}$.
(A3) There is a unique element, denoted $0$, in $\mathbb{R}$ such that $a + 0 = a$, for
any $a \in \mathbb{R}$.
(A4) For any $a \in \mathbb{R}$, there is a unique element (denoted $-a$) in $\mathbb{R}$ such that
$a + (-a) = 0$.
The properties (A1) to (A4) say that $\mathbb{R}$ is an abelian group with respect to
the binary operation "$+$".
Next, we consider multiplication. To each pair $a, b \in \mathbb{R}$, there is a
third, denoted $a.b$, the "product" of $a$ and $b$. The operation "$.$", called
multiplication, obeys
(A5) $a.(b.c) = (a.b).c$, for any $a, b, c \in \mathbb{R}$.
(A6) $a.b = b.a$, for any $a, b \in \mathbb{R}$.
(A7) There is a unique element, denoted $1$, in $\mathbb{R}$, with $1 \neq 0$ and such that
$a.1 = a$, for any $a \in \mathbb{R}$.
(A8) For any $a \in \mathbb{R}$ with $a \neq 0$, there is a unique element in $\mathbb{R}$, written $a^{-1}$
or $\frac{1}{a}$, such that $a.a^{-1} = 1$. The element $a^{-1}$ is called the (multiplicative)
inverse, or reciprocal, of $a$.
(A9) $a.(b + c) = a.b + a.c$, for all $a, b, c \in \mathbb{R}$.
Remarks 2.1.
1. $0^{-1}$ is not defined. The element $0$ has no reciprocal. Such an object
simply does not exist in $\mathbb{R}$. $1/0$ has no meaning.
2. Subtraction is given by $a - b = a + (-b)$, for $a, b \in \mathbb{R}$.
3. Division is defined via $a \div b = a.(b^{-1})$ ($= a.\frac{1}{b} = a/b$) provided $b \neq 0$. If
it should happen that $b = 0$, then the expression $a/b$ has no meaning.
4. It is usual to omit the dot and write just $ab$ for the product $a.b$. There
is almost never any confusion from this.
All the familiar arithmetic results are consequences of the above properties
(A1) to (A9).
Examples 2.2.
1. For any $x \in \mathbb{R}$, $x.0 = 0$.
Proof. By (A3), $0 + 0 = 0$ and so $x.(0 + 0) = x.0$. Hence, by (A9),
$x.0 + x.0 = x.0$. Adding $-(x.0)$ to both sides gives
$$\begin{aligned}
(x.0 + x.0) + (-(x.0)) &= x.0 + (-(x.0)) \\
&= 0, \quad\text{by (A4)}.
\end{aligned}$$
Hence, by (A1), $x.0 + (x.0 + (-(x.0))) = 0$ and so, using property (A4)
again, we get $x.0 + 0 = 0$. However, by (A3), $x.0 + 0 = x.0$ and so by
equating these last two expressions for $x.0 + 0$ we obtain $x.0 = 0$, as
required.
2. For any $x, y \in \mathbb{R}$, $x.(-y) = -(x.y)$.
Proof.
$$\begin{aligned}
(x.y) + x.(-y) &= x.(y + (-y)), \quad\text{by (A9)}, \\
&= x.0, \quad\text{by (A4)}, \\
&= 0, \quad\text{by the previous result}.
\end{aligned}$$
By (A4) (uniqueness), $-(x.y)$ must be the same as $x.(-y)$.
3. For any $x \in \mathbb{R}$, $-(-x) = x$.
Proof. We have
$$\begin{aligned}
x &= x + 0, \quad\text{by (A3)}, \\
&= x + ((-x) + (-(-x))), \quad\text{by (A4)}, \\
&= (x + (-x)) + (-(-x)), \quad\text{by (A1)}, \\
&= 0 + (-(-x)), \quad\text{by (A4)}, \\
&= -(-x), \quad\text{by (A3)}
\end{aligned}$$
as required.
4. For any $x, y \in \mathbb{R}$, $x.y = (-x).(-y)$.
Proof. By example 2, above, $\alpha.(-\beta) = -(\alpha.\beta)$ for any $\alpha, \beta \in \mathbb{R}$. If we
now choose $\alpha = -x$ and $\beta = y$, we get
$$\begin{aligned}
(-x).(-y) &= -((-x).y) \\
&= -(y.(-x)), \quad\text{by (A6)}, \\
&= -(-(y.x)), \quad\text{by example 2, above}, \\
&= -(-(x.y)), \quad\text{by (A6)}, \\
&= x.y, \quad\text{by example 3, above},
\end{aligned}$$
and we are done.
Order properties
Here we formalize the idea of one number being greater than another. We
can order two numbers by thinking of the larger as being the higher in
order. More precisely, there is a relation $<$ (read "less than") between
elements of $\mathbb{R}$ satisfying the following:
(A10) For any $a, b \in \mathbb{R}$, exactly one of the following is true:
$a < b$, $b < a$ or $a = b$ (trichotomy).
The notation $u > a$ (read "$u$ is greater than $a$") means that $a < u$.
(A11) If $a < b$ and $b < c$, then $a < c$.
(A12) If $a < b$, then $a + c < b + c$, for any $c \in \mathbb{R}$.
(A13) If $a < b$ and $\alpha > 0$, then $\alpha a < \alpha b$.
Notation. We write $a \le b$ to signify that either $a < b$ is true or else $a = b$
is true. In view of (A10), we can say that $a \le b$ means that it is false that
$a > b$. The notation $x \ge w$ is used to mean that $w \le x$ and, as already noted
above, $x > w$ is used to mean $w < x$.
By (A10), if $x \neq 0$, then either $x > 0$ or else $x < 0$. If $x > 0$, then $x$ is
said to be (strictly) positive and if $x < 0$, we say that $x$ is (strictly) negative.
Thus, if $x$ is not zero, then it is either positive or else it is negative. It is
quite common to call a number $x$ positive if it obeys $x \ge 0$ or negative if it
obeys $x \le 0$. Should it be necessary to indicate that $x$ is not zero, then one
adds the adjective "strictly".
Examples 2.3.
1. For any $x \in \mathbb{R}$, we have $x > 0 \iff (-x) < 0$.
Proof. Using (A12), we have
$$\begin{aligned}
0 < x &\implies 0 + (-x) < x + (-x) \quad\text{(adding $(-x)$ to both sides)}, \\
&\implies (-x) < 0 \quad\text{since rhs $= 0$, by (A4)}.
\end{aligned}$$
Conversely, again from (A12),
$$\begin{aligned}
(-x) < 0 &\implies (-x) + x < 0 + x \quad\text{(adding $x$ to both sides)}, \\
&\implies 0 < x \quad\text{by (A2), (A3) and (A4)}
\end{aligned}$$
and the result follows.
2. For any $x \neq 0$, we have $x^2 > 0$.
Proof. Since $x \neq 0$, we must have either $x > 0$ or $x < 0$, by (A10). If
$x > 0$, then by (A13) we have $x^2 > 0$ (take $a = 0$, $b = \alpha = x$). On the
other hand, if $x < 0$, then $-x > 0$ by the example above. Hence, by
(A13) (with $a = 0$, $b = \alpha = (-x)$), it follows that $(-x)(-x) > 0$.
But we know (from the arithmetic properties) that $(-\alpha)(-\beta) = \alpha\beta$, for
any $\alpha, \beta \in \mathbb{R}$ and so we have
$$x^2 = xx = (-x)(-x) > 0$$
as required.
The number $1$ was introduced in (A7). If we set $x = 1$ here, then we
see that $1 = 1^2 > 0$, i.e., $1 > 0$. We have deduced that the number $1$ is
positive. Nobody would doubt this, but we see explicitly that this is a
consequence of our set-up. Note that it follows from this, by (A12), that
$a < a + 1$, for any $a \in \mathbb{R}$.
3. If $a, b \in \mathbb{R}$ with $a \le b$, then $-b \le -a$.
Proof. If $a = b$ then certainly $-a = -b$, so we need only consider the
case when $a < b$.
$$\begin{aligned}
a < b &\implies a + (-a) < b + (-a) \quad\text{by (A12)}, \\
&\implies 0 < b + (-a) \\
&\implies -b < (-b) + b + (-a) \quad\text{by (A12) and (A4)}, \\
&\implies -b < -a \quad\text{by (A1), (A2) and (A4)}
\end{aligned}$$
and the result follows.
From now on, we will work with real numbers and inequalities just as we
normally would and will not follow through a succession of steps invoking
the various listed properties as required as we go. Suffice it to say that we
could do so if we wished.
Next, we introduce a very important function, the modulus or absolute
value.
Definition 2.4. For any $x \in \mathbb{R}$, the modulus (or absolute value) of $x$ is the
number $|x|$ defined according to the rule
$$|x| = \begin{cases} x, & \text{if } x \ge 0, \\ -x, & \text{if } x < 0. \end{cases}$$
For example, $|5| = 5$, $|0| = 0$, $|-3| = 3$ and $\left|-\tfrac{1}{2}\right| = \tfrac{1}{2}$. Note that $|x|$ is never
negative. We also see that $|x| = \max\{x, -x\}$.
Let $f(x) = x$ and $g(x) = -x$. Then $|x| = f(x)$ when $x \ge 0$ and
$|x| = g(x)$ when $x < 0$. Now, we know what the graphs of $y = f(x) = x$ and
$y = g(x) = -x$ look like and so we can sketch the graph of the function $|x|$.
It is made up of two straight lines, meeting at the origin.
[Figure 2.1: The absolute value function $|x|$. The graph consists of the line $y = x$ for $x \ge 0$ and the line $y = -x$ for $x < 0$, meeting at the origin.]
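The case-by-case rule of Definition 2.4 translates directly into code. The sketch below (the function name `modulus` is an illustrative assumption; Python's built-in `abs` does the same job) also confirms on the sample values from the text that the rule agrees with the description $|x| = \max\{x, -x\}$.

```python
from fractions import Fraction


def modulus(x):
    """Absolute value by cases: x if x >= 0, and -x if x < 0."""
    return x if x >= 0 else -x


for x in (5, 0, -3, Fraction(-1, 2)):
    # The two descriptions of |x| agree on every sample value.
    assert modulus(x) == max(x, -x)
    print(x, modulus(x))
```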
The basic properties of the absolute value are contained in the following
two propositions. They are used time and time again in analysis and it is
absolutely essential to be fluent in their use.
Proposition 2.5.
(i) For any $a, b \in \mathbb{R}$, we have $|ab| = |a|\,|b|$.
(ii) For any $a \in \mathbb{R}$ and $r > 0$, the inequality $|a| < r$ is equivalent to the
pair of inequalities $-r < a < r$.
Proof. (i) We just consider the various possibilities. If either $a$ or $b$ is zero,
then so is the product $ab$. Hence $|ab| = 0$ and at least one of $|a|$ or $|b|$ is also
zero. Therefore $|ab| = 0 = |a|\,|b|$. If both $a > 0$ and $b > 0$, then $ab > 0$ and
we have $|ab| = ab$, $|a| = a$ and $|b| = b$ and so $|ab| = |a|\,|b|$ in this case.
Now, if $a > 0$ but $b < 0$, then $ab < 0$ so we have $|a| = a$, $|b| = -b$ and
$|ab| = -ab = |a|\,|b|$. The case $a < 0$ and $b > 0$ is similar.
Finally, suppose that both $a < 0$ and $b < 0$. Then $ab > 0$ and we have
$|ab| = ab$, $|a| = -a$ and $|b| = -b$. Hence, $|ab| = ab = (-a)(-b) = |a|\,|b|$.
(ii) Suppose that $|a| < r$. Then $\max\{a, -a\} < r$ and so both $a < r$
and $-a < r$. In other words, $a < r$ and $-r < a$ which can be written as
$-r < a < r$.
On the other hand, if $-r < a < r$, then both $a < r$ and $-a < r$ so that
$\max\{a, -a\} < r$. That is, $|a| < r$, as required.
Remark 2.6. Putting $b = -1$ in (i), above, and using the fact that $|-1| = 1$,
we see that $|-a| = |a|$.
Proposition 2.7. For any real numbers $a$ and $b$,
(i) $|a + b| \le |a| + |b|$.
(ii) $|a - b| \le |a| + |b|$.
(iii) $\bigl|\,|a| - |b|\,\bigr| \le |a - b|$.
Proof. (i) We have $a + b \le |a| + |b|$ and $-(a + b) = -a - b \le |a| + |b|$. Hence
$$|a + b| = \max\{a + b, -(a + b)\} \le |a| + |b|.$$
(ii) Let $c = -b$ and apply (i) to the real numbers $a$ and $c$ to get the
inequality $|a + c| \le |a| + |c|$. Since $|c| = |-b| = |b|$, this means that $|a - b| \le |a| + |b|$.
(iii) We have
$$|a| = |(a - b) + b| \le |a - b| + |b|$$
by part (i) (with $(a - b)$ replacing $a$). This implies that $|a| - |b| \le |a - b|$.
Swapping around $a$ and $b$, we have $-(|a| - |b|) = |b| - |a| \le |b - a| = |a - b|$
and therefore
$$\bigl|\,|a| - |b|\,\bigr| = \max\{|a| - |b|, -(|a| - |b|)\} \le |a - b|$$
as required.
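The three inequalities of Proposition 2.7 can be spot-checked on a grid of exact rationals. This is only a sanity check under the stated sample (the function name and the choice of sample points are illustrative assumptions), not a proof.

```python
from fractions import Fraction
from itertools import product


def check_modulus_inequalities(samples):
    """Test parts (i), (ii), (iii) of Proposition 2.7 on all pairs of samples."""
    for a, b in product(samples, repeat=2):
        assert abs(a + b) <= abs(a) + abs(b)        # (i)
        assert abs(a - b) <= abs(a) + abs(b)        # (ii)
        assert abs(abs(a) - abs(b)) <= abs(a - b)   # (iii)
    return True


# Quarter-integers from -3 to 3: positive, negative and zero values.
samples = [Fraction(n, 4) for n in range(-12, 13)]
print(check_modulus_inequalities(samples))   # True
```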
If $a$ and $b$ are real numbers, how far apart are they? For example, if $a = 7$
and $b = 11$ then we might say that the distance between $a$ and $b$ is 4. If,
on the other hand, $a = -10$ and $b = 6$, then we would say that the distance
between them is 16. In either case, we notice that the distance is given by
$|a - b|$. It is extremely useful to view $|a - b|$ as the distance between the
numbers $a$ and $b$. For example, to say that $|a - b|$ is very small is to say
that $a$ and $b$ are close to each other.
Proposition 2.8. Let $a, b \in \mathbb{R}$ be given and suppose that for any given $\varepsilon > 0$,
$a$ and $b$ obey the inequality $a < b + \varepsilon$. Then $a \le b$. In particular, if $x < \varepsilon$
for all $\varepsilon > 0$, then $x \le 0$.
Proof. We know that either $a \le b$ or else $a > b$. Suppose the latter were
true, namely, $a > b$. Set $\varepsilon_1 = a - b$. Then $\varepsilon_1 > 0$ and $a = b + \varepsilon_1$. Taking
$\varepsilon = \varepsilon_1$, we see that this conflicts with the hypothesis that $a < b + \varepsilon$ for every
$\varepsilon > 0$ (it fails for the choice $\varepsilon = \varepsilon_1$). We conclude that $a > b$ must be false
and so $a \le b$.
For the last part, simply set $a = x$ and $b = 0$ to get the desired conclusion.
We have listed a number of properties obeyed by the real numbers:
(A1) . . . (A9) arithmetic
(A10) . . . (A13) order.
Is this it? Are there any more to be included? We notice that all of these
properties are satisfied by the rational numbers, $\mathbb{Q}$. Are all real numbers
rational, i.e., is it true that $\mathbb{Q} = \mathbb{R}$? Or do we need to consider yet further
properties which distinguish between $\mathbb{Q}$ and $\mathbb{R}$? Consider an apparently
unrelated question. Do all numbers have square roots? Since $a^2$ is positive
for any $a \in \mathbb{R}$, it is clear that no negative number can have a square root in
$\mathbb{R}$. (Indeed, it is the consideration of $\mathbb{C}$, the complex numbers, which allows
for square roots of negative numbers.) So we ask, does every positive real
number have a square root? Does every natural number $n$ have a square
root in $\mathbb{R}$? In particular, is there such a real number as the square root
of 2? It would be nice to think that there is such a real number. In fact,
according to Pythagoras' Theorem, this should be the length of the diagonal
of a square whose sides have unit length. The following proposition tells us
that there is certainly no such rational number.
Proposition 2.9. There are no integers $m, n \in \mathbb{N}$ satisfying $m^2 = 2n^2$. In
particular, $\sqrt{2}$ is not a rational number.
Proof. To say that $\sqrt{2}$ is rational is to say that there are integers $m$ and
$n$ (with $n \neq 0$) such that $m/n = \sqrt{2}$. This means that $m^2/n^2 = 2$ and so
$m^2 = 2n^2$ for $m, n \in \mathbb{N}$. (By replacing $m$ or $n$ by $-m$ or $-n$, if necessary,
we may assume that $m$ and $n$ in this last equality are both positive.) So the
fact that $\sqrt{2} \notin \mathbb{Q}$ is a consequence of the first part of the proposition.
Consider the equality
$$m^2 = 2n^2 \qquad (*)$$
To show that $m^2 = 2n^2$ is impossible for any $m, n \in \mathbb{N}$, suppose the contrary,
namely that there are numbers $m$ and $n$ in $\mathbb{N}$ obeying $(*)$. We will show
that this leads to a contradiction.
Indeed, if $m^2 = 2n^2$, then $m^2$ is even. The square of an odd number is
odd and so it follows that $m$ must also be even. This means that we can
express $m$ as $m = 2k$ for some suitable $k \in \mathbb{N}$. But then
$$(2k)^2 = m^2 = 2n^2$$
which means that $2k^2 = n^2$ and so $n^2$ is even. Arguing as above, we deduce
that $n$ can be expressed as $n = 2j$ for some $j \in \mathbb{N}$. Substituting, we see that
$k$ and $j$ also obey $(*)$, namely, $k^2 = 2j^2$. This tells us that $m/2$ and $n/2$ are
integers also obeying $(*)$.
Repeating this whole argument with $m' = m/2$ and $n' = n/2$, we find
that both $m'/2$ and $n'/2$ belong to $\mathbb{N}$ and also satisfy $(*)$. In other words,
$m/2^2$ and $n/2^2$ belong to $\mathbb{N}$ and obey $(*)$. We can keep repeating this
argument to deduce that $m/2^j$ and $n/2^j$ are integers obeying $(*)$. In particular,
$m/2^j \in \mathbb{N}$ implies that $m/2^j \ge 1$ and so $m \ge 2^j$. But this holds for any
$j \in \mathbb{N}$ and we can take $j$ as large as we wish. We can take $j$ so large that
$2^j > m$. This leads to a contradiction, as we wanted it to. We finally
conclude that there are no natural numbers $m$ and $n$ obeying $(*)$ and as a
consequence, there is no element of $\mathbb{Q}$ whose square is equal to 2, that is,
$\sqrt{2}$ is not a rational number.
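A computer search cannot prove Proposition 2.9, since it only covers finitely many cases, but it illustrates the claim. The sketch below looks for natural numbers with $m^2 = 2n^2$ up to a bound; the function name `solutions` and the bound are illustrative assumptions, and `math.isqrt` is the standard-library integer square root.

```python
from math import isqrt


def solutions(bound):
    """All pairs (m, n) of natural numbers with n <= bound and m*m == 2*n*n."""
    found = []
    for n in range(1, bound + 1):
        target = 2 * n * n
        m = isqrt(target)          # largest integer m with m*m <= target
        if m * m == target:        # exact equality would mean m/n = sqrt(2)
            found.append((m, n))
    return found


print(solutions(10_000))   # []
```

Integer arithmetic is used throughout, so no floating-point rounding can produce a false "solution".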


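Remark 2.10. A somewhat similar argument can be used to show that many
other numbers do not have square roots in $\mathbb{Q}$. For example, $\sqrt{3} \notin \mathbb{Q}$. In
fact, one can show that if $n \in \mathbb{N}$, then either $\sqrt{n} \in \mathbb{N}$, that is, $n$ is a perfect
square, or else $\sqrt{n} \notin \mathbb{Q}$. For example, $\sqrt{16} = 4 \in \mathbb{N}$ but $\sqrt{17} \notin \mathbb{Q}$.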
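Taking the dichotomy of Remark 2.10 for granted, deciding whether $\sqrt{n}$ is rational reduces to testing whether $n$ is a perfect square. A minimal sketch (the function name is an illustrative assumption):

```python
from math import isqrt


def is_perfect_square(n):
    """True when the natural number n equals r*r for some integer r."""
    r = isqrt(n)
    return r * r == n


print(is_perfect_square(16), is_perfect_square(17))          # True False
print([n for n in range(1, 30) if is_perfect_square(n)])     # [1, 4, 9, 16, 25]
```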
Returning to the discussion of the defining properties of $\mathbb{R}$, we still have
to pinpoint the extra property that $\mathbb{R}$ has which is not shared by $\mathbb{Q}$. First
we need some terminology.
Definition 2.11. A non-empty subset $S$ of $\mathbb{R}$ is said to be bounded from above
if there is some $M \in \mathbb{R}$ such that
$$a \le M$$
for all $a \in S$. Any such number $M$ is called an upper bound for the set $S$.
Evidently, if $M$ is an upper bound for $S$, then so is any number greater
than $M$.
We say that a non-empty subset $S$ of $\mathbb{R}$ is bounded from below if there
is some $m \in \mathbb{R}$ such that
$$m \le a$$
for all $a \in S$. Any such number $m$ is called a lower bound for the set $S$. If
$m$ is a lower bound for $S$, then so is any number less than $m$.
If $S$ is both bounded from above and from below, then $S$ is said to be
bounded.
Example 2.12. Consider the set $A = (-6, 4]$. Then $A$ is bounded because
any $x \in A$ obeys $-6 \le x \le 4$. (In fact, any $x \in A$ obeys the inequalities
$-6 < x \le 4$.) Any real number greater than or equal to 4 is an upper bound
for $A$ and any real number less than or equal to $-6$ is a lower bound for $A$.
The set $A$ has a maximal element, namely 4, but $A$ does not have a minimal
element.
Let
$$B = (1, \tfrac{3}{2}) \cup (2, \tfrac{5}{2}) \cup (3, \tfrac{7}{2}) \cup \dots = \bigcup_{k=1}^{\infty} (k, k + \tfrac{1}{2}).$$
Then the set $B$ is bounded from below (the number 1 is clearly a lower
bound for $B$). However, $B$ contains $k + \tfrac{1}{4}$ for every $k \in \mathbb{N}$ and so $B$ is not
bounded from above (so $B$ is not bounded). We also see that $B$ does not
have a minimal element.
Remark 2.13. What does it mean to say that a set $S$ is not bounded from
above? Consider the inequality
$$x \le M. \qquad (*)$$
Now, given $S$ and some particular real number $M$, the inequality $(*)$ may
hold for some elements $x$ in $S$ but may fail for other elements of $S$. To say
that $S$ is bounded from above is to say that there is some $M$ such that $(*)$
holds for all elements $x \in S$. If $S$ is not bounded from above, then it must
be the case that whatever $M$ we try, there will always be some $x$ in $S$ for
which $(*)$ fails, that is, for any given $M$ there will be some $x \in S$ such that
$x > M$. In particular, if we try $M = 1$, then there will be some element
(many, in fact) in $S$ greater than 1. Let us pick any such element and label
it as $x_1$. Then we have $x_1 \in S$ and $x_1 > 1$.
We can now try $M = 2$. Again, $(*)$ must fail for at least one element
in $S$ and it could even happen that $x_1 > 2$. To ensure that we get a new
element from $S$, let $M = \max\{2, x_1\}$. Then there must be at least one
element of $S$ greater than this $M$. Let $x_2$ denote any such element. Then
we have $x_2 \in S$ and $x_2 > 2$ and $x_2 \neq x_1$.
Now setting $M = \max\{3, x_1, x_2\}$, we may say that there is some element
in $S$, which we choose to denote by $x_3$, such that $x_3 > 3$ and $x_3 \neq x_1$ and
$x_3 \neq x_2$. We can continue to do this and so we see that if $S$ is not bounded
from above, then there exist elements $x_1, x_2, x_3, \dots, x_n, \dots$ (which are all
different) such that $x_n > n$ for each $n \in \mathbb{N}$.
The following concepts play an essential role.
Definition 2.14. Suppose that $S$ is a non-empty subset of $\mathbb{R}$ which is bounded
from above. The number $M$ is the least upper bound (lub) of $S$ if
(i) $a \le M$ for all $a \in S$ (i.e., $M$ is an upper bound for $S$).
(ii) If $M'$ is any upper bound for $S$, then $M \le M'$.
If $S$ is a non-empty subset of $\mathbb{R}$ which is bounded from below, then the
number $m$ is the greatest lower bound (glb) of $S$ if
(i) $m \le a$ for all $a \in S$ (i.e., $m$ is a lower bound for $S$).
(ii) If $m'$ is any lower bound for $S$, then $m' \le m$.
Note that the least upper bound and the greatest lower bound of a set $S$
need not themselves belong to $S$. They may or they may not. The least
upper bound is also called the supremum (sup) and the greatest lower bound
is also called the infimum (inf). The ideas are illustrated by some examples.
Examples 2.15.
1. Let $S$ be the following set consisting of 4 elements, $S = \{-3, 1, 2, 5\}$.
Then clearly $S$ is bounded from above and from below. The least upper
bound is 5 and the greatest lower bound is $-3$.
2. Let $S$ be the interval $S = (-6, 4]$. Then $\operatorname{lub} S = 4$ and $\operatorname{glb} S = -6$. Note
that $4 \in S$ whereas $-6 \notin S$.
3. Let $S = (1, \infty)$. $S$ is not bounded from above and so has no least upper
bound. $S$ is bounded from below and we see that $\operatorname{glb} S = 1$. Note that
$\operatorname{glb} S \notin S$ in this case.
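For a finite set such as the one in item 1, the least upper bound and greatest lower bound are simply the largest and smallest elements. The sketch below checks condition (i) of Definition 2.14 exactly and condition (ii) only against a finite list of integer candidates, so it is an illustration rather than a verification; the helper names are illustrative assumptions.

```python
def is_upper_bound(M, S):
    """Condition (i) for upper bounds: a <= M for all a in S."""
    return all(a <= M for a in S)


def is_lower_bound(m, S):
    """Condition (i) for lower bounds: m <= a for all a in S."""
    return all(m <= a for a in S)


S = {-3, 1, 2, 5}
lub, glb = max(S), min(S)
print(lub, glb)   # 5 -3

candidates = range(-10, 11)
upper_bounds = [M for M in candidates if is_upper_bound(M, S)]
lower_bounds = [m for m in candidates if is_lower_bound(m, S)]
# Among the candidates, no upper bound is below lub and no lower bound is above glb.
print(min(upper_bounds) == lub, max(lower_bounds) == glb)   # True True
```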
Remark 2.16. Suppose that $M$ is the lub for a set $S$. Let $\varepsilon > 0$. Then
$M - \varepsilon < M$ and since any upper bound $M'$ for $S$ has to obey $M \le M'$, we
see that $M - \varepsilon$ cannot be an upper bound for $S$. But this means that it
is false that $a \le M - \varepsilon$ for all $a \in S$. In other words, there must be some
$a \in S$ which satisfies $M - \varepsilon < a$. Since $M$ is an upper bound for $S$, we also
have $a \le M$ and so $a$ obeys
$$M - \varepsilon < a \le M.$$
So no matter how small $\varepsilon$ may be, there will always be some element $a \in S$
(possibly depending on $\varepsilon$ and there may be many) such that $M - \varepsilon < a \le M$,
where $M = \operatorname{lub} S$.
For any $\varepsilon > 0$, there is $a \in S$ such that $\operatorname{lub} S - \varepsilon < a \le \operatorname{lub} S$.
Now suppose that $m = \operatorname{glb} S$. Then for any $\varepsilon > 0$ (however small), we
note that $m < m + \varepsilon$ and so $m + \varepsilon$ cannot be a lower bound for $S$ (because
all lower bounds for $S$ must be less than or equal to $m$). Hence, there is
some $a \in S$ such that $a < m + \varepsilon$, which means that
$$m \le a < m + \varepsilon.$$
For any $\varepsilon > 0$, there is $a \in S$ such that $\operatorname{glb} S \le a < \operatorname{glb} S + \varepsilon$.
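As an illustration of the first boxed statement, take the set $S = \{1 - 1/n : n \in \mathbb{N}\}$, which is not from the text and is chosen here as an assumption for the example. Its least upper bound is 1, which does not belong to $S$. Given $\varepsilon > 0$, the sketch below produces an element $a \in S$ with $1 - \varepsilon < a \le 1$; the function name is illustrative and exact rationals are used.

```python
from fractions import Fraction
import math


def element_near_sup(eps):
    """For S = {1 - 1/n : n in N} (lub S = 1), return a in S with 1 - eps < a <= 1."""
    n = math.floor(1 / eps) + 1     # natural number with n > 1/eps, so 1/n < eps
    return 1 - Fraction(1, n)


for eps in (Fraction(1, 2), Fraction(1, 10), Fraction(1, 1000)):
    a = element_near_sup(eps)
    print(eps, a, 1 - eps < a <= 1)
```

The smaller $\varepsilon$ is, the further along the set one has to look, which is the dependence of $a$ on $\varepsilon$ mentioned in the remark.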
Remark 2.17. As already noted above, $\operatorname{lub} S$ and $\operatorname{glb} S$ may or may not
belong to the set $S$. If it should happen that $\operatorname{lub} S \in S$, then in this case
$\operatorname{lub} S$ (or $\sup S$) is the maximum element of $S$, denoted $\max S$. If $\operatorname{glb} S \in S$,
then $\operatorname{glb} S$ (or $\inf S$) is the minimum element of $S$, denoted $\min S$.
For example, the interval $S = (2, 5]$ is bounded and, by inspection, we
see that $\sup S = 5$ and $\inf S = 2$. Since $\sup S = 5 \in S$, the set $S$ does
indeed have a maximum element, namely, $5 = \sup S$. However, $\inf S \notin S$
and so $S$ has no minimum element.
We are now in a position to discuss the final property satisfied by $\mathbb{R}$ and
it is precisely this last property which distinguishes $\mathbb{R}$ from $\mathbb{Q}$.
(A14) (The completeness property of R)
Any non-empty subset of R which is bounded from above possesses a
least upper bound.
Any non-empty subset of R which is bounded from below possesses a
greatest lower bound.
These statements might appear self-evident, but as we will see, they have
far-reaching consequences. We note here that these two statements are not
independent, in fact, each implies the other, that is, they are equivalent.
Remark 2.18. It is very convenient to think of R as the set of points on a line
(the real line). Indeed, this is standard procedure when sketching graphs of
functions where the coordinate axes represent the real numbers.
Imagine now the following situation.
[Figure 2.2: The real line has no gaps. An arrow marks a point of the line $\mathbb{R}$, with the set $A$ to its left and the set $B$ to its right.]
The set A consists of all points on the line (real numbers) to the left
of the arrow and B comprises all those points to the right. Numbers are
bigger the more they are to the right. The arrow points to the least upper
bound of A (which is also the greatest lower bound of B). The completeness
property (A14) ensures the existence of the real number in R that the arrow
supposedly points to. There are no gaps or missing points on the real
line. We can think of the integers Z or even the rationals Q as collections
of dots on a line, but it is property (A14) which allows us to visualize R as
the whole unbroken line itself.
The next result is so obvious that it seems hardly worth noting. However,
it is very important and follows from property (A14).
Theorem 2.19 (Archimedean Property). For any given $x \in \mathbb{R}$, there is some $n \in \mathbb{N}$ such that $n > x$.
Proof. Let $x \in \mathbb{R}$ be given. We use the method of proof by contradiction, so suppose that there is no $n \in \mathbb{N}$ obeying $n > x$. This means that $n \le x$ for all $n \in \mathbb{N}$, that is, $x$ is an upper bound for $\mathbb{N}$ in $\mathbb{R}$. By the completeness property, (A14), $\mathbb{N}$ has a least upper bound, $\alpha$, say. Then $\alpha$ is an upper bound for $\mathbb{N}$ so that
$$n \le \alpha \qquad (*)$$
for all $n \in \mathbb{N}$. Since $\alpha$ is the least upper bound, $\alpha - 1$ cannot be an upper bound for $\mathbb{N}$ and so there must be some $k \in \mathbb{N}$ such that $\alpha - 1 < k$. But we can rewrite this as $\alpha < k + 1$, which contradicts $(*)$ since $k + 1 \in \mathbb{N}$. We conclude that there is some $n \in \mathbb{N}$ obeying $n > x$, as claimed.
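Corollary 2.20.
(i) For any given $\varepsilon > 0$, there is some $n \in \mathbb{N}$ such that $\frac{1}{n} < \varepsilon$.
(ii) For any $\delta > 0$, $\varepsilon > 0$, there is $n \in \mathbb{N}$ such that $\frac{\delta}{n} < \varepsilon$.
Proof. (i) Let $\varepsilon > 0$ be given. By the Archimedean Property, there is some $n \in \mathbb{N}$ such that $n > 1/\varepsilon$. But then this gives $1/n < \varepsilon$, as required.
(ii) For given $\varepsilon > 0$ and $\delta > 0$, set $\varepsilon' = \varepsilon/\delta$. By (i), there is $n \in \mathbb{N}$ such that $1/n < \varepsilon' = \varepsilon/\delta$ and so $\delta/n < \varepsilon$.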
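As an illustration only, the Archimedean Property and Corollary 2.20 can be tried out numerically. The sketch below assumes the inputs are rationals (or integers), so that exact arithmetic with `Fraction` is available; the function names are our own and do not come from the notes.

```python
import math
from fractions import Fraction


def natural_number_exceeding(x):
    """Return some n in N = {1, 2, 3, ...} with n > x (Theorem 2.19)."""
    # floor(x) + 1 is an integer strictly greater than x; max keeps n >= 1.
    return max(1, math.floor(x) + 1)


def n_with_small_quotient(delta, eps):
    """Return n in N with delta / n < eps, assuming delta > 0 and eps > 0."""
    # Corollary 2.20 (ii): any natural number n > delta / eps will do.
    return natural_number_exceeding(Fraction(delta) / Fraction(eps))
```

For instance, `n_with_small_quotient(5, Fraction(1, 7))` picks a natural number larger than $35$, in line with the proof of part (ii).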
The next result is no surprise either.
Theorem 2.21. For any $a \in \mathbb{R}$, there is a unique integer $n \in \mathbb{Z}$ such that $n \le a < n + 1$.
Proof. Let $S = \{ k \in \mathbb{Z} : k > a \}$. By theorem 2.19, $S$ is not empty and is bounded below (by $a$). Hence, by the completeness property (A14), $S$ has a greatest lower bound $\alpha$, say, in $\mathbb{R}$. We have
$$a \le \alpha \le k$$
for all $k \in S$. (The inequality $a \le \alpha$ follows because $a$ is a lower bound and $\alpha$ is the greatest lower bound, and the inequality $\alpha \le k$ follows because $\alpha$ is a lower bound of $S$.) Since $\alpha$ is the greatest lower bound, $\alpha + 1$ cannot be a lower bound of $S$ and so there is some $m \in S$ such that $m < \alpha + 1$, that is, $m - 1 < \alpha$.
[Figure 2.3: The integer part of $a$. The point $a$ lies on the real line between the integers $m - 1 = n$ and $m = n + 1$.]
Now, $\alpha$ is a lower bound for $S$ and $m - 1 < \alpha$ and so $m - 1 \notin S$. But then, by the defining property of $S$, this means that it is false that $m - 1 > a$. In other words, we have $m - 1 \le a$. But $m \in S$ and so $m > a$ and so $m$ satisfies $m - 1 \le a < m$. Putting $n = m - 1$, we get $n \in \mathbb{Z}$ and $n$ satisfies the required inequalities $n \le a < n + 1$.
To show the uniqueness of such $n \in \mathbb{Z}$, suppose that also $n' \in \mathbb{Z}$ obeys $n' \le a < n' + 1$. Suppose that $n < n'$. Then $n + 1 \le n'$ and so the inequalities $n' \le a$ and $a < n + 1$ give
$$n' \le a < n + 1 \le n',$$
giving $n' < n'$, which is impossible. Similarly, the assumption that $n' < n$ would lead to the impossible inequality $n < n$. We conclude that $n = n'$, which is to say that $n$ is unique.
Remark 2.22. For $x \in \mathbb{R}$, let $n \in \mathbb{Z}$ be the unique integer obeying the inequalities $n \le x < n + 1$. Set $r = x - n$. Then we see that $0 \le x - n = r < 1$ and so $x = n + r$ with $n \in \mathbb{Z}$ and where $0 \le r < 1$. The unique integer $n$ here is called the integer part of the real number $x$ and is denoted by $[x]$ (or sometimes by $\lfloor x \rfloor$).
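The decomposition $x = n + r$ can be checked on a few rational values. The following is a minimal sketch, assuming that Python's `math.floor` plays the role of the integer part $[x]$; the helper name is hypothetical.

```python
import math
from fractions import Fraction


def integer_and_fractional_part(x):
    """Return (n, r) with x = n + r, n an integer and 0 <= r < 1."""
    n = math.floor(x)  # the unique integer with n <= x < n + 1
    r = x - n          # the remainder r = x - n
    return n, r
```

Note that for negative inputs the integer part lies to the left of $x$: for example, $-7/2 = -4 + 1/2$.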
Theorem 2.23. Between any pair of real numbers $a < b$, there are infinitely many rational numbers and also infinitely many irrational numbers.
Proof. First, we shall show that there is at least one such rational, that is, we shall show that for any given $a < b$ in $\mathbb{R}$, there is some $q \in \mathbb{Q}$ such that $a < q < b$. The idea of the proof is as follows. If there is an integer between $a$ and $b$, then we are done. In any case, we note that since the integers are spread one unit apart, there should certainly be at least one integer between $a$ and $b$ if the distance between $a$ and $b$ is greater than 1. If the distance between $a$ and $b$ is less than 1, then we can open up the gap between them by multiplying both by a sufficiently large (positive) integer, $n$, say. The gap between $na$ and $nb$ is $n(b - a)$. Clearly, if $n$ is large enough, this value is greater than 1. Then there will be some integer $m$, say, between $na$ and $nb$, i.e., $na < m < nb$. But then we see (since $n$ is positive) that $a < m/n < b$ and $q = m/n$ is a rational number which does the job.
We shall now write this argument out formally. Let $n \in \mathbb{N}$ be sufficiently large that $n(b - a) > 1$, so that $na + 1 < nb$, and let $m = [na] + 1$. Since $[na] \le na < [na] + 1$, it follows that
$$[na] \le na < \underbrace{[na] + 1}_{m} \le na + 1 < nb$$
and so $na < m < nb$ and hence $a < m/n < b$. (Note that $n > 0$, so this last step is valid.) Setting $q = m/n$, we have that $q \in \mathbb{Q}$ and $q$ obeys $a < q < b$, as required.
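The construction $q = m/n$ with $m = [na] + 1$ is explicit enough to compute. Here is a small sketch under the assumption that $a$ and $b$ are themselves given as rationals (the theorem allows arbitrary reals, which a program cannot represent exactly); the function name is our own.

```python
import math
from fractions import Fraction


def rational_between(a, b):
    """Return a rational q with a < q < b, following the proof's recipe."""
    a, b = Fraction(a), Fraction(b)
    if not a < b:
        raise ValueError("need a < b")
    n = math.floor(1 / (b - a)) + 1  # natural number with n * (b - a) > 1
    m = math.floor(n * a) + 1        # m = [na] + 1, so na < m <= na + 1 < nb
    return Fraction(m, n)
```

Applying the function again to the pair `(q, b)` gives a further rational strictly between $q$ and $b$, which is the repetition used to obtain infinitely many rationals.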
To see that there are infinitely many rationals between $a$ and $b$, we just repeat the above argument but with, say, $q$ and $b$ instead of $a$ and $b$. This tells us that there is a rational, $q_2$, say, obeying $q < q_2 < b$. Once again, repeating this argument, there is a rational, $q_3$, say, obeying $q_2 < q_3 < b$. Continuing in this way, we see that for any $n \in \mathbb{N}$, there are $n$ rationals, $q, q_2, \dots, q_n$, obeying
$$a < q < q_2 < q_3 < \dots < q_n < b.$$
Hence it follows that there are infinitely many rationals between $a$ and $b$.
To show that there are infinitely many irrational numbers between $a$ and $b$, we use a trick together with the observation that if $r$ is a non-zero rational, then $r/\sqrt{2}$ is irrational. The trick is simply to apply the first part to the numbers $a\sqrt{2}$ and $b\sqrt{2}$ to deduce that for any $n \in \mathbb{N}$ there are rational numbers $r_1, r_2, \dots, r_n$ obeying
$$a\sqrt{2} < r_1 < r_2 < \dots < r_n < b\sqrt{2}.$$
(At most one of these can be zero and, if so, it may simply be discarded.) Now let $\mu_j = r_j/\sqrt{2}$ for $j = 1, 2, \dots, n$. Then each $\mu_j$ is irrational and we have
$$a < \mu_1 < \mu_2 < \dots < \mu_n < b$$
and the result follows.
As a further application of the Completeness Property of $\mathbb{R}$, we shall show that any positive real number has a positive $n$th root.
Theorem 2.24. Let $x \ge 0$ and $n \in \mathbb{N}$ be given. Then there is a unique $s \ge 0$ such that $s^n = x$. The real number $s$ is called the (positive) $n$th root of $x$ and is denoted by $x^{1/n}$.
Proof. If $x = 0$, then we can take $s = 0$, so suppose that $x > 0$.
Let $A$ be the set $A = \{ t \ge 0 : t^n < x \}$. Then $0 \in A$ and so $A$ is not empty and, by the Archimedean Property, there is some $K \in \mathbb{N}$ with $K > x$. But then every $t \in A$ must obey $t < K$ because otherwise we would have $t \ge K$ and therefore $t^n \ge K^n \ge K > x$, which is not possible for any $t \in A$. This means that $A$ is bounded from above. By the Completeness Property of $\mathbb{R}$, $A$ has a least upper bound, $\operatorname{lub} A = s$, say. Note now that, since $x > 0$, by the Archimedean Property there is some $m \in \mathbb{N}$ such that $m > 1/x$. Hence $m^n \ge m > 1/x$, which implies that $1/m^n < x$, so that $1/m^n \in A$ (because $(1/m^n)^n \le 1/m^n < x$). This means that $s \ge 1/m^n$. In particular, $s > 0$.
Now, exactly one of the statements $s^n = x$, $s^n < x$ or $s^n > x$ is true. We claim that $s^n = x$ and to show this we shall show that the last two statements must be false.
Indeed, suppose that $s^n < x$. For $k \in \mathbb{N}$, let $s_k = s(1 + \frac{1}{k})$. Then evidently $s_k > s$ and we will show that $s_k^n < x$ for suitably large $k$.
Let $d = x - s^n$. Then $d > 0$ and
$$x - s_k^n = x - s^n + s^n - s_k^n = d - (s_k^n - s^n) = d - s^n \left[ \left(1 + \tfrac{1}{k}\right)^n - 1 \right].$$
Now, writing $\beta = 1 + \frac{1}{k}$ and noting that $1 < \beta \le 2$, we estimate
$$\begin{aligned}
\left(1 + \tfrac{1}{k}\right)^n - 1 &= \beta^n - 1 \\
&= (\beta - 1)(\beta^{n-1} + \beta^{n-2} + \dots + 1) \\
&\le (\beta - 1)(2^{n-1} + 2^{n-2} + \dots + 1) \\
&\le (\beta - 1)\, n 2^n \\
&= \tfrac{1}{k}\, n 2^n.
\end{aligned}$$
Hence
$$s_k^n - s^n = s^n \left[ \left(1 + \tfrac{1}{k}\right)^n - 1 \right] \le \frac{s^n\, n 2^n}{k}.$$
For sufficiently large $k$, the right hand side of this inequality is less than $d$ and so
$$x - s_k^n = x - s^n + s^n - s_k^n = d - (s_k^n - s^n) > 0.$$
It follows that if $k$ is large enough, then $s_k \in A$. But $s_k > s$, which means that $s$ cannot be the least upper bound of $A$ and we have a contradiction. Hence it must be false that $s^n < x$.
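Suppose now that $s^n > x$ and let $\delta = s^n - x$. For given $k \in \mathbb{N}$, let $t_k = s(1 - \frac{1}{k})$. Writing $\beta = 1 - \frac{1}{k}$ and noting that $0 \le \beta \le 1$, we estimate that
$$\begin{aligned}
1 - \left(1 - \tfrac{1}{k}\right)^n &= 1 - \beta^n = -(\beta^n - 1) \\
&= -(\beta - 1)(\beta^{n-1} + \beta^{n-2} + \dots + 1) \\
&= (1 - \beta)(\beta^{n-1} + \beta^{n-2} + \dots + 1) \\
&\le (1 - \beta)\, n \\
&= \tfrac{1}{k}\, n.
\end{aligned}$$
It follows that
$$s^n - t_k^n = s^n \left( 1 - \left(1 - \tfrac{1}{k}\right)^n \right) \le \tfrac{1}{k}\, s^n n < \delta$$
for sufficiently large $k$. But then this means that
$$t_k^n - x = t_k^n - s^n + s^n - x = \delta - (s^n - t_k^n) > 0$$
for large $k$. However, $t_k < s$ and since $s = \operatorname{lub} A$, it follows that $t_k$ is not an upper bound for $A$. In other words, there is some $\tau \in A$ such that $\tau > t_k$ and therefore $\tau^n - x > t_k^n - x > 0$. However, $\tau \in A$ means that $\tau^n < x$, which is a contradiction, and so it is false that $s^n > x$.
We have now shown that $s^n < x$ is false and also that $s^n > x$ is false and so we conclude that it must be true that $s^n = x$, as required.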
We have established the existence of some $s \ge 0$ such that $s^n = x$ and so, finally, we must prove that such an $s$ is unique. If $x = 0$, then $s = 0$ obeys $s^n = 0 = x$. No $s \ne 0$ can obey $s^n = 0$ because $s^n (1/s)^n = 1 \ne 0$, so $s = 0$ is the only solution to $s^n = 0$.
Now let $s > 0$ and $t > 0$. If $s > t$, then $s/t > 1$ so that $(s/t)^n > 1$ and we find that $s^n > t^n$. Interchanging the roles of $s$ and $t$, it follows that if $s < t$, then $s^n < t^n$. We conclude that if $s^n = x = t^n$ then both $s < t$ and $s > t$ are impossible and so $s = t$.
The proof is complete.
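The proof obtains $s$ as the least upper bound of $A = \{ t \ge 0 : t^n < x \}$, which does not by itself give a numerical value. As a hedged illustration (bisection is our own choice here, not the method of the notes), one can trap $s$ between a member of $A$ and an upper bound of $A$, assuming $x > 0$ is rational so that the arithmetic is exact.

```python
from fractions import Fraction


def nth_root_bounds(x, n, steps=40):
    """Return (lo, hi) with lo**n < x <= hi**n, halving hi - lo each step."""
    x = Fraction(x)
    if x <= 0 or n < 1:
        raise ValueError("need x > 0 and n >= 1")
    lo = Fraction(0)           # 0 belongs to A because 0**n = 0 < x
    hi = max(Fraction(1), x)   # hi**n >= x, so hi is an upper bound for A
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid ** n < x:
            lo = mid           # mid belongs to A
        else:
            hi = mid           # mid is an upper bound for A
    return lo, hi
```

At every stage `lo` is an element of $A$ and `hi` is an upper bound for $A$, so the root $s = \operatorname{lub} A$ always satisfies `lo <= s <= hi`.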
Principle of induction
Suppose that, for each $n \in \mathbb{N}$, $P(n)$ is a statement about the number $n$ such that
(i) $P(1)$ is true.
(ii) For any $k \in \mathbb{N}$, the truth of $P(k)$ implies the truth of $P(k + 1)$.
Then $P(n)$ is true for all $n \in \mathbb{N}$.
Example 2.25. For any $n \in \mathbb{N}$,
$$1^2 + 2^2 + 3^2 + \dots + n^2 = \frac{n(n + 1)(2n + 1)}{6}.$$
Proof. For $n \in \mathbb{N}$, let $P(n)$ be the statement that
$$1^2 + 2^2 + 3^2 + \dots + n^2 = \frac{n(n + 1)(2n + 1)}{6}.$$
Then $P(1)$ is the statement that
$$1^2 = \frac{1(1 + 1)(2 + 1)}{6}$$
which is true.
Now suppose that $k \in \mathbb{N}$ and that $P(k)$ is true. We wish to show that $P(k + 1)$ is also true. Since we are assuming that $P(k)$ is true, we see that
$$\begin{aligned}
1^2 + 2^2 + 3^2 + \dots + k^2 + (k + 1)^2 &= \frac{k(k + 1)(2k + 1)}{6} + (k + 1)^2, \quad \text{using the truth of } P(k), \\
&= \frac{k(k + 1)(2k + 1) + (k + 1)(6k + 6)}{6} \\
&= \frac{(k + 1)(2k^2 + k + 6k + 6)}{6} \\
&= \frac{(k + 1)(k + 2)(2k + 3)}{6}
\end{aligned}$$
which is to say that $P(k + 1)$ is true. By the principle of induction, we conclude that $P(n)$ is true for all $n \in \mathbb{N}$.
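A quick numerical check of the formula is easy to write. It is only a sanity check over finitely many values of $n$ and is no substitute for the induction argument; the function names are our own.

```python
def sum_of_squares(n):
    """Compute 1^2 + 2^2 + ... + n^2 directly."""
    return sum(k * k for k in range(1, n + 1))


def closed_form(n):
    """Evaluate n(n + 1)(2n + 1)/6; the product is always divisible by 6."""
    return n * (n + 1) * (2 * n + 1) // 6
```

Comparing `sum_of_squares(n)` with `closed_form(n)` for a range of $n$ confirms the identity on that range.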
We can rephrase the principle of induction as follows. Let $T$ be the set given by $T = \{ k \in \mathbb{N} : P(k) \text{ is true} \}$, so $k \in T$ if and only if $P(k)$ is true. In particular, $P(1)$ is true if and only if $1 \in T$. Hence the principle of induction may be rephrased as follows.
Let $T$ be a set of natural numbers such that $1 \in T$ and such that if $T$ contains $k$ then it also contains $k + 1$. Then $T = \mathbb{N}$.
Principle of induction (2nd form)
Suppose that $Q(n)$ is a statement about the natural number $n$ such that
(i) $Q(1)$ is true.
(ii) For any $k \in \mathbb{N}$, the truth of all $Q(1), Q(2), \dots, Q(k)$ implies the truth of $Q(k + 1)$.
Then $Q(n)$ is true for all $n \in \mathbb{N}$.
In a nutshell:
Suppose that: $Q(1)$ is true, and that ($Q(1)$ true, $Q(2)$ true, $Q(3)$ true, $\dots$, $Q(k)$ true) $\implies$ $Q(k + 1)$ true.
Conclusion: $Q(n)$ is true for all $n \in \mathbb{N}$.
This follows from the usual form of the principle.
To see this, let $S = \{ m \in \mathbb{N} : Q(m) \text{ is true} \}$. We shall use the usual form of induction to show that the hypotheses above imply that $S = \mathbb{N}$.
For any $n \in \mathbb{N}$, let $P(n)$ be the statement "$\{1, 2, \dots, n\} \subseteq S$".
Now, by hypothesis, $Q(1)$ is true and so $1 \in S$. Hence $\{1\} \subseteq S$, which is to say that $P(1)$ is true.
Next, we show that the truth of $P(k)$ implies that of $P(k + 1)$. Assume that $P(k)$ is true. This means that $\{1, 2, \dots, k\} \subseteq S$, that is, each of $Q(1), Q(2), \dots, Q(k)$ is true. But then by the 2nd part of the hypothesis above, $Q(k + 1)$ is true, that is to say, $k + 1 \in S$. Hence $\{1, 2, \dots, k, k + 1\} \subseteq S$. But this just tells us that $P(k + 1)$ is true. By induction (usual form), it follows that $P(n)$ is true for all $n \in \mathbb{N}$. This means that $\{1, 2, \dots, n\} \subseteq S$ for all $n$. In particular, $n \in S$ for every $n \in \mathbb{N}$, that is, $Q(n)$ is true for all $n \in \mathbb{N}$, which is the content of the 2nd form of the principle.
Chapter 3
Sequences
A sequence of real numbers is just a listing $a_1, a_2, a_3, \dots$ of real numbers labelled by $\mathbb{N}$, the set of natural numbers. Thus, to each $n \in \mathbb{N}$, there corresponds a real number $a_n$. Not surprisingly, $a_n$ is called the $n$th term of the sequence.
[Figure 3.1: The sequence $(a_n)_{n \in \mathbb{N}}$. The terms $a_1, a_2, a_3, \dots, a_k, a_{k+1}, \dots$ are labelled by $\mathbb{N}$, and $a_k$ is the $k$th term.]
Whilst it may seem a trivial comment, it is important to note that the essential thing about a sequence is that it has a notion of direction: it makes sense to talk about one term being further down the sequence than another. For example, $a_{101}$ is further down the sequence than, say, $a_{45}$.
It is convenient to denote the above sequence by $(a_n)_{n \in \mathbb{N}}$ or even simply by $(a_n)$. Note that there is no requirement that the terms be different. It is quite permissible for $a_j$ to be the same as $a_n$ for different $j$ and $n$. Indeed, one could have $a_n = \alpha$, say, for all $n$. This is just a sequence with constant terms (all equal to $\alpha$), a somewhat trivial sequence, but a sequence nonetheless.
Remark 3.1. On a more formal level, one can think of a sequence of real numbers as nothing but a function from $\mathbb{N}$ into $\mathbb{R}$. Indeed, we can define such a function $f : \mathbb{N} \to \mathbb{R}$ by setting $f(n) = a_n$ for $n \in \mathbb{N}$. Conversely, any $f : \mathbb{N} \to \mathbb{R}$ will determine a sequence of real numbers, as above, via the assignment $a_n = f(n)$.
One might wish to consider a finite sequence such as, say, the four term sequence $a_1, a_2, a_3, a_4$. We will use the word "sequence" to mean an infinite sequence and simply include the adjective "finite" when this is meant.
Examples 3.2.
1. $1, 4, 9, 16, \dots$
Here the general term $a_n$ is given by the simple formula $a_n = n^2$.
2. $2, 3/2, 4/3, 5/4, 6/5, \dots$
The general term is $a_n = (n + 1)/n$.
3. $2, 0, 2, 0, 2, 0, \dots$
Here $a_n = 0$ if $n$ is even and $a_n = 2$ if $n$ is odd. This can also be expressed as $a_n = 1 - (-1)^n$.
4. Let $a_n$ be defined by the prescription $a_1 = a_2 = 1$ and $a_n = a_{n-1} + a_{n-2}$ for $n \ge 3$. The sequence $(a_n)$ is then
$1, 1, 2, 3, 5, 8, 13, \dots$
These are known as the Fibonacci numbers.
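In the spirit of Remark 3.1, each of these sequences is just a function of $n \in \mathbb{N}$. The sketch below writes the four examples as Python functions; the function names are our own, and `Fraction` is used so that example 2 is computed exactly.

```python
from fractions import Fraction


def squares(n):
    return n * n                 # a_n = n^2


def ratio(n):
    return Fraction(n + 1, n)    # a_n = (n + 1)/n


def alternating(n):
    return 1 - (-1) ** n         # 2 for odd n, 0 for even n


def fibonacci(n):
    a, b = 1, 1                  # a_1 = a_2 = 1
    for _ in range(n - 1):
        a, b = b, a + b          # a_n = a_{n-1} + a_{n-2}
    return a


def first_terms(f, count):
    """List f(1), f(2), ..., f(count)."""
    return [f(n) for n in range(1, count + 1)]
```

For example, `first_terms(fibonacci, 7)` reproduces the seven terms listed in example 4.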
We are usually interested in the long-term behaviour of sequences, that is, what happens as we look further and further down the sequence. What happens to $a_n$ when $n$ gets very large? Do the terms "settle down" or do they get sometimes big, sometimes small, or what?
In examples 3.2.1 and 3.2.4, the terms just get huge.
In example 3.2.2, we see, for example, that $a_{99} = 100/99$, $a_{10000} = 10001/10000$, $a_{10^{20}} = (10^{20} + 1)/10^{20}$, $\dots$, so it looks as though the terms become close to 1.
In example 3.2.3, the terms just keep oscillating between the two values 0 and 2.
In example 3.2.2, we would like to say that the sequence approaches 1 as we go further and further down it. Indeed, for example, the difference between $a_{10^{10}}$ and 1 is that between $(10^{10} + 1)/10^{10}$ and 1, that is, $10^{-10}$.
How can we formulate this idea of convergence of a sequence precisely? We might picture a sequence in two ways, as follows. The first is as the graph of the function $n \mapsto a_n$. (Notice that we do not join up the dots.)
[Figure 3.2: A sequence as a graph. The values $a_1, a_2, a_3, a_4, \dots, a_n$ are plotted as separate points against $n \in \mathbb{N}$.]
The second way is just to indicate the values of the sequence on the real line.
[Figure 3.3: Plot the values of the sequence on the real line. The points $a_1, a_2, a_3, a_4$ are marked on $\mathbb{R}$.]
The example 3.2.3 above would then be pictured either as
[Figure 3.4: Graph with values 0 or 2. The points $a_1, a_2, \dots, a_6, \dots$ are plotted against $n \in \mathbb{N}$.]
or as
[Figure 3.5: The values of $a_n$ are either 0 or 2. On the real line, $a_2, a_4, \dots$ sit at 0 and $a_1, a_3, \dots$ sit at 2.]
Returning to the general situation now, how should we formulate the idea that a sequence $(a_n)$ converges to $\alpha$? According to our first pictorial description, we would want the plotted points of the sequence (the graph) to eventually become very close to the line $y = \alpha$.
[Figure 3.6: The graph gets close to the line $y = \alpha$.]
In terms of the second pictorial description, we would simply demand that the values of the sequence eventually cluster around the value $x = \alpha$.
[Figure 3.7: The values of $(a_n)$ cluster around $x = \alpha$.]
If we think of the index $n$ as representing time, then we can think of $a_n$ as the value of the sequence at the time $n$. The sequence can be considered to have some property "eventually" provided we are prepared to wait long enough for it to become established. It is very convenient to use this word "eventually", so we shall indicate precisely what we mean by it.
We say that a sequence eventually has some particular property if there is some $N \in \mathbb{N}$ such that all the terms $a_n$ after $a_N$ (i.e., all $a_n$ with $n > N$) have the property under consideration. (The number $N$ can be thought of as some offered time after which we are guaranteed that the property under consideration will hold and will continue to hold.)
As an example of this usage, let $(a_n)$ be the sequence given by the prescription $a_n = 100 - n$, for $n \in \mathbb{N}$. Then $a_1 = 99$, $a_2 = 98$, etc. It is clear that $a_n$ is negative whenever $n$ is greater than 100. Thus, we can say that this sequence $(a_n)$ is eventually negative.
Now we can formulate the notion of convergence of a sequence. The idea is that $(a_n)$ converges to the number $\alpha$ if eventually it is as close to $\alpha$ as desired. That is to say, given some preassigned tolerance $\varepsilon$, no matter how small, we demand that eventually $(a_n)$ is close to within $\varepsilon$ of $\alpha$. In
other words, the distance between $a_n$ and $\alpha$ (as points on the real line) is eventually smaller than $\varepsilon$.
Definition 3.3. We say that the sequence $(a_n)_{n \in \mathbb{N}}$ of real numbers converges to the real number $\alpha$ if for any given $\varepsilon > 0$, there is some natural number $N \in \mathbb{N}$ such that $|a_n - \alpha| < \varepsilon$ whenever $n > N$.
$\alpha$ is called the limit of the sequence. In such a situation, we write $a_n \to \alpha$ as $n \to \infty$ or alternatively $\lim_{n \to \infty} a_n = \alpha$.
The use of the symbol $\infty$ is just as part of a phrase and it has no meaning in isolation. There is no real number $\infty$.
Remark 3.4. The positive number $\varepsilon$ is the assigned tolerance demanded. Typically, the smaller $\varepsilon$ is, the larger we should expect $N$ to have to be. For example, consider the sequence $(a_n)$ where $a_n = 1/n$. We would expect that $a_n \to 0$ as $n \to \infty$. To see this, let $\varepsilon > 0$ be given. (We are not able to choose this. It is given to us and its actual value is beyond our control.) It will be true that $|a_n - 0| < \varepsilon$ provided $n > 1/\varepsilon$. So after some contemplation, we proceed as follows. We are unable to influence the choice of $\varepsilon$ given to us, but once it is given then we can (and must) base our tactics on it. So let $N$ be any natural number larger than $1/\varepsilon$. If $n > N$, then $n > N > 1/\varepsilon$ and so $1/n < \varepsilon$. That is, if $n > N$, then $|a_n - 0| = 1/n < \varepsilon$ and so, according to our definition, we have shown that $a_n \to 0$ as $n \to \infty$. Notice that the smaller $\varepsilon$ is, the larger $N$ has to be.
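Note that the statement
if $n > N$ then $|a_n - \alpha| < \varepsilon$
can also be written as
$|a_n - \alpha| < \varepsilon$ whenever $n > N$
or also as
$n > N \implies |a_n - \alpha| < \varepsilon$.
Also, we should note that the inequality $|a_n - \alpha| < \varepsilon$, telling us that the distance between the real numbers $a_n$ and $\alpha$ is less than $\varepsilon$, can also be expressed by the pair of inequalities
$$-\varepsilon < a_n - \alpha < \varepsilon$$
or equivalently by the pair
$$\alpha - \varepsilon < a_n < \alpha + \varepsilon.$$
This simply means that $a_n$ lies on the real line somewhere between the two values $\alpha - \varepsilon$ and $\alpha + \varepsilon$. This must happen eventually if the sequence is to be convergent (to $\alpha$).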
[Figure 3.8: The value of $a_n$ lies within $\varepsilon$ of $\alpha$, that is, in the interval $(\alpha - \varepsilon, \alpha + \varepsilon)$ on the real line.]
Example 3.5. Let $(a_n)_{n \in \mathbb{N}}$ be the sequence with
$$a_n = \frac{2n + 5}{n}$$
for $n \in \mathbb{N}$. Does $(a_n)$ converge? Looking at the first few terms, we find
$$(a_n) = \left(7, \tfrac{9}{2}, \tfrac{11}{3}, \tfrac{13}{4}, \tfrac{15}{5}, \tfrac{17}{6}, \tfrac{19}{7}, \dots, \tfrac{205}{100}, \dots \right).$$
It seems that $a_n \to 2$ as $n \to \infty$, but we must prove it.
Let $\varepsilon > 0$ be given. We have to show that eventually $(a_n)$ is within $\varepsilon$ of 2. We have $|a_n - 2| = |(2n + 5)/n - 2| = 5/n$. Now, the inequality $5/n < \varepsilon$ is the same as $n > 5/\varepsilon$. Let $N$ be any natural number which obeys $N > 5/\varepsilon$. Then if $n > N$, we have
$$n > N > \frac{5}{\varepsilon}$$
and so $5/n < \varepsilon$. This means that if $n > N$ then $|a_n - 2| < \varepsilon$ and we have succeeded in proving that $a_n \to 2$ as $n \to \infty$.
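The choice of $N$ in this example can be tested numerically. The sketch below is illustrative only: it assumes a rational tolerance so that `Fraction` arithmetic is exact, and the function names are hypothetical.

```python
import math
from fractions import Fraction


def term(n):
    """The term a_n = (2n + 5)/n of Example 3.5."""
    return Fraction(2 * n + 5, n)


def threshold(eps):
    """Return a natural number N with N > 5/eps, as chosen in the example."""
    return math.floor(Fraction(5) / Fraction(eps)) + 1
```

With `eps = Fraction(1, 100)`, every term checked beyond the returned $N$ lies within $\varepsilon$ of 2. A finite check of this kind cannot replace the proof, but it shows how $N$ depends on $\varepsilon$.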
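Example 3.6. Let $(a_n)_{n \in \mathbb{N}}$ be the sequence $a_n = 1/n^2$. We shall show that $a_n \to 0$ as $n \to \infty$.
Let $\varepsilon > 0$ be given.
We wish to show that there is $N \in \mathbb{N}$ such that if $n > N$ then $|a_n - 0| < \varepsilon$, that is, $|a_n - 0| = 1/n^2 < \varepsilon$.
Now,
$$\frac{1}{n^2} < \varepsilon \iff n^2 > \frac{1}{\varepsilon} \iff n > \frac{1}{\sqrt{\varepsilon}}$$
so take $N \in \mathbb{N}$ to be any natural number satisfying $N > 1/\sqrt{\varepsilon}$. Then if $n > N$, it follows that $n > 1/\sqrt{\varepsilon}$ and so $n^2 > 1/\varepsilon$, which in turn implies that $1/n^2 < \varepsilon$ and the proof is complete.
Alternatively, we note that $1/n^2 \le 1/n$ and so if $1/n < \varepsilon$ then it follows that $1/n^2 \le 1/n < \varepsilon$. So let $N \in \mathbb{N}$ be any natural number such that $N > 1/\varepsilon$. Then $1/N < \varepsilon$ and so if $n > N$ we have
$$\frac{1}{n} < \frac{1}{N} < \varepsilon \implies \frac{1}{n^2} \le \frac{1}{n} < \frac{1}{N} < \varepsilon.$$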
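Example 3.7. Let $(a_n)_{n \in \mathbb{N}}$ be the sequence $a_n = \frac{4}{n^3} + \frac{1}{\sqrt{n}}$. We shall show that $a_n \to 0$ as $n \to \infty$.
Let $\varepsilon > 0$ be given.
We must show that
$$|a_n - 0| = \frac{4}{n^3} + \frac{1}{\sqrt{n}} < \varepsilon$$
whenever $n$ is large enough. To see this, we note that
$$\frac{4}{n^3} + \frac{1}{\sqrt{n}} \le \frac{4}{n} + \frac{1}{\sqrt{n}} \le \frac{4}{\sqrt{n}} + \frac{1}{\sqrt{n}} = \frac{5}{\sqrt{n}}.$$
If the right hand side is less than $\varepsilon$, then so is the left hand side. Let $N \in \mathbb{N}$ satisfy $5/\sqrt{N} < \varepsilon$, that is, $25/N < \varepsilon^2$ or $N > 25/\varepsilon^2$. Then if $n > N$, we may say that $25/n < 25/N < \varepsilon^2$ and so
$$\frac{4}{n^3} + \frac{1}{\sqrt{n}} \le \frac{5}{\sqrt{n}} < \frac{5}{\sqrt{N}} < \varepsilon$$
that is, $|a_n - 0| < \varepsilon$ whenever $n > N$.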
Example 3.8. Let $|x| < 1$ and for $n \in \mathbb{N}$, let $a_n = x^n$. Does $(a_n)$ converge?
Since $|x| < 1$, $|x^n| = |x|^n$ gets smaller and smaller as $n$ increases, so we might guess that $x^n \to 0$ as $n \to \infty$.
Let $\varepsilon > 0$ be given.
We must show that eventually $|x^n - 0| < \varepsilon$, which is the same as showing that eventually $|x|^n < \varepsilon$. Set $d = |x|$. Then we wish to show that eventually $d^n < \varepsilon$. Notice that $d \ge 0$ and so we no longer have to worry about whether $x$ is positive or negative. We have transferred the problem from one about $x$ to one about $d$.
Consider first the case $x = 0$. Then also $d = 0$ and $d^n = 0$ for all $n$. In particular, if we go through the motions by choosing $N = 1$, then certainly $d^n < \varepsilon$ whenever $n > N$ (because $d^n = 0$), which tells us (trivially) that eventually $d^n < \varepsilon$ and so therefore $x^n \to 0$ as $n \to \infty$.
Now suppose that $x \ne 0$. Then $0 < |x| < 1$, so that $0 < d < 1$. Define $t$ by $d = 1/(1 + t)$, that is, $t = (1 - d)/d$. Then $t > 0$. By the binomial theorem, we have
$$(1 + t)^n = 1 + nt + \binom{n}{2} t^2 + \dots + t^n > nt$$
for any $n \in \mathbb{N}$. Hence
$$d^n = \frac{1}{(1 + t)^n} < \frac{1}{nt}.$$
We shall use this to estimate $d^n$. If the right hand side is less than $\varepsilon$, then so is the left hand side. To carry this through, let $N$ be any natural number obeying $N > 1/(t\varepsilon)$. Then this means that $1/(Nt) < \varepsilon$. For any $n > N$, we therefore have the inequality $1/n < 1/N$ and (since $t > 0$) we also have
$$d^n < \frac{1}{nt} < \frac{1}{Nt} < \varepsilon.$$
In other words, we have shown that eventually $d^n$ is less than $\varepsilon$. In terms of $x$ and $a_n$, we have
$$|a_n - 0| = |x|^n = \frac{1}{(1 + t)^n} < \frac{1}{nt} < \frac{1}{Nt} < \varepsilon$$
whenever $n > N$. Hence if $|x| < 1$ then $x^n \to 0$ as $n \to \infty$.
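The estimate $d^n < 1/(nt)$ gives an explicit, if rather generous, threshold $N$. As a rough sketch (assuming rational $x$ and $\varepsilon$ for exact arithmetic, and with a function name of our own choosing):

```python
import math
from fractions import Fraction


def geometric_threshold(x, eps):
    """Return N such that |x|**n < eps for all n > N, assuming |x| < 1."""
    d = abs(Fraction(x))
    if d >= 1:
        raise ValueError("need |x| < 1")
    if d == 0:
        return 1                 # the trivial case x = 0
    t = (1 - d) / d              # so that d = 1/(1 + t) with t > 0
    # Any natural number N > 1/(t * eps) works, since d**n < 1/(n*t).
    return math.floor(1 / (t * Fraction(eps))) + 1
```

The threshold obtained this way is far from the smallest possible, but the definition of convergence only asks that some $N$ exists.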
Is it possible for a sequence to converge to two different limits? To convince ourselves that this is not possible, suppose the contrary. That is, suppose that $(a_n)$ is some sequence which has the property that it converges both to $\alpha$ and $\beta$, say, with $\alpha \ne \beta$. Let $\varepsilon > 0$ be given. Then by definition of convergence, $(a_n)$ is eventually within distance $\varepsilon$ of $\alpha$ and also $(a_n)$ is eventually within distance $\varepsilon$ of $\beta$.
[Figure 3.9: The sequence $(a_n)$ is eventually within $\varepsilon$ of both $\alpha$ and $\beta$, that is, eventually in each of the intervals $(\alpha - \varepsilon, \alpha + \varepsilon)$ and $(\beta - \varepsilon, \beta + \varepsilon)$.]
As one can see from the figure, if $\varepsilon$ is small enough, then the two intervals $(\alpha - \varepsilon, \alpha + \varepsilon)$ and $(\beta - \varepsilon, \beta + \varepsilon)$ will not overlap and it will not be possible for any terms of the sequence $(a_n)$ to belong to both of these intervals simultaneously. We can turn this into a rigorous argument as follows.
Theorem 3.9. Suppose that $(a_n)_{n \in \mathbb{N}}$ is a sequence such that $a_n \to \alpha$ and also $a_n \to \beta$ as $n \to \infty$. Then $\alpha = \beta$, that is, a convergent sequence has a unique limit.
Proof. Let $\varepsilon > 0$ be given.
Since we know that $a_n \to \alpha$, we are assured that eventually $(a_n)$ is within $\varepsilon$ of $\alpha$. Thus, there is some $N_1 \in \mathbb{N}$ such that if $n > N_1$ then the distance between $a_n$ and $\alpha$ is less than $\varepsilon$, i.e., if $n > N_1$ then $|a_n - \alpha| < \varepsilon$.
Similarly, we know that $a_n \to \beta$ as $n \to \infty$ and so eventually $(a_n)$ is within $\varepsilon$ of $\beta$. Thus, there is some $N_2 \in \mathbb{N}$ such that if $n > N_2$ then the distance between $a_n$ and $\beta$ is less than $\varepsilon$, i.e., if $n > N_2$ then $|a_n - \beta| < \varepsilon$.
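So far so good. What next? To get both of these happening simultaneously, we let $N = \max\{N_1, N_2\}$. Then $n > N$ means that both $n > N_1$ and also $n > N_2$. Hence we can say that if $n > N$ then both $|a_n - \alpha| < \varepsilon$ and also $|a_n - \beta| < \varepsilon$.
Now what? We expand out these sets of inequalities. Pick and fix any $n > N$ (for example $n = N + 1$ would do). Then
$$\alpha - \varepsilon < a_n < \alpha + \varepsilon$$
$$\beta - \varepsilon < a_n < \beta + \varepsilon.$$
The left hand side of the first pair together with the right hand side of the second pair of inequalities gives $\alpha - \varepsilon < a_n < \beta + \varepsilon$ and so
$$\alpha - \varepsilon < \beta + \varepsilon.$$
Similarly, the left hand side of the second pair together with the right hand side of the first pair of inequalities gives $\beta - \varepsilon < a_n < \alpha + \varepsilon$ and so
$$\beta - \varepsilon < \alpha + \varepsilon.$$
Combining these we see that
$$-2\varepsilon < \alpha - \beta < 2\varepsilon$$
which is to say that $|\alpha - \beta| < 2\varepsilon$. This happens for any given $\varepsilon > 0$ and so the non-negative number $|\alpha - \beta|$ must actually be zero. But this means that $\alpha = \beta$ and the proof is complete.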
Definition 3.10. We say that the sequence $(a_n)_{n \in \mathbb{N}}$ is bounded from above if there is some $M \in \mathbb{R}$ such that
$$a_n \le M$$
for all $n \in \mathbb{N}$. The sequence $(a_n)_{n \in \mathbb{N}}$ is said to be bounded from below if there is some $m \in \mathbb{R}$ such that
$$m \le a_n$$
for all $n \in \mathbb{N}$. If $(a_n)$ is bounded both from above and from below, then we say that $(a_n)$ is bounded.
Examples 3.11.
1. Let $a_n = n + (-1)^n n$ for $n \in \mathbb{N}$. Then we see that $(a_n)$ is the sequence given by $(a_n) = (0, 4, 0, 8, 0, 12, 0, \dots)$. Evidently $(a_n)$ is bounded from below (in fact, $a_n \ge 0$) but $(a_n)$ is not bounded from above. (There is no $M$ for which $a_n \le M$ holds for all $n$. Indeed, for any fixed $M$ whatsoever, if $n$ is any even natural number greater than $M$, then $a_n = 2n > n > M$.)
2. Let $a_n = 1/n$, $n \in \mathbb{N}$. It is clear that $a_n$ obeys $0 \le a_n \le 2$ for all $n$ and so $(a_n)$ is bounded both from above and from below, that is, $(a_n)$ is bounded.
Proposition 3.12. The sequence $(a_n)$ is bounded if and only if there is some $K \ge 0$ such that $|a_n| \le K$ for all $n$.
Proof. Suppose first that $(a_n)$ is bounded. Then there are $m$ and $M$ such that
$$m \le a_n \le M$$
for all $n$. We do not know whether $m$ or $M$ are positive or negative. However, we can introduce $|m|$ and $|M|$ as follows. For any $x \in \mathbb{R}$, it is true that $-|x| \le x \le |x|$. Applying this to $m$ and $M$ in the above inequalities, we see that
$$-|m| \le m \le a_n \le M \le |M|.$$
Let $K = \max\{ |m|, |M| \}$. Then clearly,
$$-K \le -|m| \le m \le a_n \le M \le |M| \le K$$
which gives the inequalities
$$-K \le a_n \le K$$
so that $|a_n| \le K$, for all $n$, as required.
For the converse, suppose that there is $K \ge 0$ so that $|a_n| \le K$ for all $n$. Then this can be expressed as
$$-K \le a_n \le K$$
for all $n$ and therefore $(a_n)$ is bounded (taking $m = -K$ and $M = K$ in the definition).
Theorem 3.13. If a sequence converges then it is bounded.
Proof. Suppose that $(a_n)$ is a convergent sequence, $a_n \to \alpha$, say, as $n \to \infty$. Then, in particular, $(a_n)$ is eventually within distance 1, say, of $\alpha$. This means that there is some $N \in \mathbb{N}$ such that if $n > N$ then the distance between $a_n$ and $\alpha$ is less than 1, i.e., if $n > N$ then $|a_n - \alpha| < 1$. We can rewrite this as
$$-1 \le a_n - \alpha \le 1$$
or
$$\alpha - 1 \le a_n \le \alpha + 1$$
whenever $n > N$. This tells us that the tail ($a_n$ for $n > N$) of the sequence is bounded but what about the whole sequence? This is now easy: we know
about $a_n$ when $n > N$ so we only still need to take into account the beginning of the sequence up to the $N$th term, that is, the terms $a_1, a_2, \dots, a_N$. Let $M = \max\{ a_1, a_2, \dots, a_N, \alpha + 1 \}$ and let $m = \min\{ a_1, a_2, \dots, a_N, \alpha - 1 \}$. Then certainly $\alpha + 1 \le M$ and $m \le \alpha - 1$. Hence if $n > N$, then
$$m \le a_n \le M.$$
But by construction of $m$ and $M$, we also have the inequalities $m \le a_n \le M$ for any $1 \le n \le N$. Piecing together these two parts of the argument, we conclude that
$$m \le a_n \le M$$
for any $n$ and we have shown that $(a_n)$ is bounded, as required.
Remark 3.14. The converse of this is false. For example, let $(a_n)$ be the sequence with $a_n = (-1)^n$. Then $(a_n) = (-1, 1, -1, 1, -1, \dots)$, which is bounded (for example, $-1 \le a_n \le 1$ for all $n$) but does not converge.
Definition 3.15. A sequence $(a_n)$ of real numbers is said to be
(i) increasing if $a_{n+1} \ge a_n$ for all $n$;
(ii) strictly increasing if $a_{n+1} > a_n$ for all $n$;
(iii) decreasing if $a_{n+1} \le a_n$ for all $n$;
(iv) strictly decreasing if $a_{n+1} < a_n$ for all $n$.
A sequence satisfying any of these conditions is said to be monotonic or monotone. It is strictly monotonic if it satisfies either (ii) or (iv).
One reason for an interest in monotonic sequences is the following.
Theorem 3.16. If $(a_n)$ is an increasing sequence of real numbers and is bounded from above, then it converges.
Proof. Suppose then that $a_n \le a_{n+1}$ and that $a_n \le M$ for all $n$. Let $K = \operatorname{lub}\{ a_n : n \in \mathbb{N} \}$, so that $K$ is well-defined with $K \le M$. We claim that $a_n \to K$ as $n \to \infty$.
Let $\varepsilon > 0$ be given. We must show that eventually $(a_n)$ is within distance $\varepsilon$ of $K$. Now, $K$ is an upper bound for $\{ a_n : n \in \mathbb{N} \}$ and so $a_n \le K$ for all $n$. It is enough then to show that $K - \varepsilon < a_n$ eventually. However, this is true for the following reason. $K - \varepsilon < K$ and $K$ is the least upper bound of $\{ a_n : n \in \mathbb{N} \}$ and so $K - \varepsilon$ is not an upper bound for $\{ a_n : n \in \mathbb{N} \}$. This means that there is some $a_j$, say, with $a_j > K - \varepsilon$. But the sequence $(a_n)$ is increasing and so $a_n \ge a_j$ for all $n > j$. Hence $a_n > K - \varepsilon$ for all $n > j$. We have shown that
$$K - \varepsilon < a_n \le K < K + \varepsilon$$
for all $n > j$. This means that eventually $|a_n - K| < \varepsilon$ and so the proof is complete.
Remark 3.17. Note that in the course of the proof of the above result, we have not only shown that $(a_n)$ converges but we have actually established what the limit is: it is the least upper bound of the set of real numbers $\{ a_n : n \in \mathbb{N} \}$. Of course, this does not necessarily provide us with the numerical value of the limit.
It is also worth noting that from this result and the fact that a convergent sequence is bounded, we can say that an increasing sequence converges if and only if it is bounded. The sequence $(a_n)$ with $a_n = n$ is clearly increasing. It is not bounded and so we can say immediately that it does not converge (which is no surprise, in this case).
Corollary 3.18. Any sequence which is decreasing and bounded from below
must converge.
Proof. Suppose that $(b_n)$ is a sequence which is decreasing and bounded from below. Then $b_{n+1} \le b_n$ for all $n$ and there is some $k$ such that $b_n \ge k$ for all $n$. Set $a_n = -b_n$ and $K = -k$. Then these inequalities become $a_n \le a_{n+1}$ and $a_n \le K$ for all $n$, that is, $(a_n)$ is increasing and is also bounded from above. By the theorem, we deduce that $(a_n)$ converges. Denote its limit by $\alpha$ and let $\beta = -\alpha$. We will show that $b_n \to \beta$ as $n \to \infty$ (as one might well expect). Let $\varepsilon > 0$ be given. Then there is some $N \in \mathbb{N}$ such that if $n > N$ then
$$|a_n - \alpha| < \varepsilon.$$
In terms of $b_n$ and $\beta$, the left hand side becomes $|-b_n + \beta|$, which is equal to $|b_n - \beta|$, and so we have established that
$$|b_n - \beta| < \varepsilon$$
whenever $n > N$, which completes the proof.
Example 3.19. Let $(a_n)$ be the sequence given by
$$a_1 = 1, \quad a_2 = 1 + 1, \quad a_3 = 1 + 1 + \frac{1}{2!}, \quad a_4 = 1 + 1 + \frac{1}{2!} + \frac{1}{3!}, \quad \dots, \quad a_n = 1 + 1 + \frac{1}{2!} + \dots + \frac{1}{(n-1)!}, \quad \dots$$
This can be written more succinctly as $a_1 = 1$ and $a_n = a_{n-1} + \frac{1}{(n-1)!}$ for $n \ge 2$. Does $(a_n)$ converge? It is clear that $a_{n+1} > a_n$ and so $(a_n)$ is increasing (in fact, strictly increasing). If we can show that it is also bounded then we conclude that it must converge. Can we find $K$ such that
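$a_n \le K$ for all $n$? We have $a_1 = 1$ and for any $n \ge 1$
$$\begin{aligned}
a_{n+1} &= 1 + 1 + \frac{1}{2} + \frac{1}{2 \cdot 3} + \frac{1}{2 \cdot 3 \cdot 4} + \dots + \frac{1}{2 \cdot 3 \cdots n} \\
&\le 1 + 1 + \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \dots + \frac{1}{2^{n-1}} \\
&= 1 + \frac{1 - (\frac{1}{2})^n}{1 - \frac{1}{2}}, \quad \text{summing the GP,} \\
&= 1 + 2\left(1 - (\tfrac{1}{2})^n\right) \\
&< 1 + 2 = 3.
\end{aligned}$$
Hence the increasing sequence $(a_n)$ is bounded above, by 3. We conclude that $(a_n)$ converges. Because it is increasing, we know that its limit is equal to $\operatorname{lub}\{ a_n : n \in \mathbb{N} \} = \alpha$, say. But $a_n$ obeys $a_n \le 3$ and so 3 is an upper bound for $\{ a_n : n \in \mathbb{N} \}$ and therefore $\operatorname{lub}\{ a_n : n \in \mathbb{N} \} \le 3$, that is, $\alpha \le 3$. Of course, $\alpha = \operatorname{lub}\{ a_n : n \in \mathbb{N} \} \ge a_k$ for any particular $k$. Taking $k = 3$, we get that $\alpha \ge a_3 > 2$ and so we can say that $2 < \alpha \le 3$. In fact, $\alpha$ is just $e$ (and $e = 2.71828\dots$).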
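The partial sums in this example are easy to tabulate. The following sketch (with a function name of our own choosing) computes $a_n$ exactly with `Fraction`, which makes it straightforward to observe that the terms increase and stay below 3.

```python
from fractions import Fraction
from math import factorial


def partial_sum(n):
    """Return a_n = 1 + 1 + 1/2! + ... + 1/(n-1)! as an exact fraction."""
    return sum(Fraction(1, factorial(k)) for k in range(n))
```

Converting, say, `partial_sum(20)` to a float gives a value agreeing with $e$ to many decimal places, consistent with the remark that the limit is $e$.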
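If $a_n \to \alpha$ and $b_n \to \beta$, then we might expect it to be the case that $a_n + b_n \to \alpha + \beta$. After all, if $(a_n)$ is eventually close to $\alpha$ and $(b_n)$ is eventually close to $\beta$, then it seems quite reasonable to guess that $(a_n + b_n)$ is eventually close to $\alpha + \beta$. This is true, but we must take care with the details.
Theorem 3.20. Suppose that $(a_n)$ and $(b_n)$ are sequences in $\mathbb{R}$.
(i) If $a_n \to \alpha$ as $n \to \infty$, then $\lambda a_n \to \lambda \alpha$ as $n \to \infty$, for any $\lambda \in \mathbb{R}$.
(ii) If $a_n \to \alpha$ as $n \to \infty$ and $b_n \to \beta$ as $n \to \infty$, then $a_n + b_n \to \alpha + \beta$ as $n \to \infty$ and also $a_n b_n \to \alpha \beta$ as $n \to \infty$.
(iii) If $a_n \to \alpha$ as $n \to \infty$ and if $b_n \to \beta$ as $n \to \infty$ and if $b_n \ne 0$ for all $n$ and if $\beta \ne 0$, then $a_n / b_n \to \alpha / \beta$ as $n \to \infty$.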
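Proof. (i) Fix $\lambda \in \mathbb{R}$. Let $\varepsilon > 0$ be given. We must show that $|\lambda a_n - \lambda\alpha| < \varepsilon$ eventually. If $\lambda = 0$, then $\lambda a_n = 0$ for all $n$ and so it is clear that in this case $\lambda a_n = 0 \to 0 = \lambda\alpha$ as $n \to \infty$.
So now suppose that $\lambda \ne 0$. Let $\varepsilon' > 0$. (We will specify $\varepsilon'$ in a moment.) Then since we know that $a_n \to \alpha$, it follows that there is some $N \in \mathbb{N}$ such that $n > N$ implies that
$$|a_n - \alpha| < \varepsilon'.$$
Now,
$$|\lambda a_n - \lambda\alpha| = |\lambda|\,|a_n - \alpha| < |\lambda|\,\varepsilon'$$
whenever $n > N$. If we choose $\varepsilon' = \varepsilon / |\lambda|$ then we see that
$$|\lambda a_n - \lambda\alpha| < |\lambda|\,\varepsilon' = \varepsilon$$
whenever $n > N$. Hence $\lambda a_n \to \lambda\alpha$, as required.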
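(ii) Let $\varepsilon > 0$ be given. Suppose $\varepsilon' > 0$. We will specify the value of $\varepsilon'$ in a moment. There is $N_1 \in \mathbb{N}$ such that $n > N_1$ implies that $|a_n - \alpha| < \varepsilon'$. Also, there is $N_2 \in \mathbb{N}$ such that $n > N_2$ implies that $|b_n - \beta| < \varepsilon'$. Set $N = \max\{N_1, N_2\}$. Then if $n > N$, we see that
$$|a_n + b_n - (\alpha + \beta)| = |a_n - \alpha + b_n - \beta| \le |a_n - \alpha| + |b_n - \beta| < \varepsilon' + \varepsilon' = 2\varepsilon'.$$
Setting $\varepsilon' = \tfrac12\varepsilon$, it follows that if $n > N$, then
$$|a_n + b_n - (\alpha + \beta)| < 2\varepsilon' = \varepsilon,$$
that is, $a_n + b_n \to \alpha + \beta$ as $n \to \infty$.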
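To show that $a_n b_n \to \alpha\beta$, consider first the case $x_n \to 0$ and $y_n \to 0$. We shall show that $x_n y_n \to 0$.
Let $\varepsilon > 0$ be given.
Then we know that there is $N_1 \in \mathbb{N}$ such that if $n > N_1$ then $|x_n| < \sqrt{\varepsilon}$. Similarly, we know that there is $N_2 \in \mathbb{N}$ such that if $n > N_2$ then $|y_n| < \sqrt{\varepsilon}$. Let $N = N_1 + N_2$. Then if $n > N$, it follows that
$$|x_n y_n| < \sqrt{\varepsilon}\,\sqrt{\varepsilon} = \varepsilon,$$
that is, $x_n y_n \to 0$ as $n \to \infty$.
Now, in the general case, we simply use previous results to note that
$$a_n b_n = (a_n - \alpha)(b_n - \beta) + \alpha b_n + \beta a_n - \alpha\beta \to 0 + \alpha\beta + \alpha\beta - \alpha\beta = \alpha\beta$$
as required.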
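(iii) Now suppose that $a_n \to \alpha$, $b_n \to \beta$ and suppose that $b_n \ne 0$ for all $n$ and that $\beta \ne 0$. Let $\gamma_n = 1/b_n$ and let $\gamma = 1/\beta$. Then $a_n / b_n = a_n \gamma_n$. To show that $a_n / b_n \to \alpha / \beta$, we shall show that $\gamma_n \to \gamma$ as $n \to \infty$. The desired conclusion will then follow from the second part of (ii), above.
We have
$$|\gamma_n - \gamma| = |1/b_n - 1/\beta| = \frac{|\beta - b_n|}{|b_n|\,|\beta|}.$$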
For large enough $n$, the numerator is small and the denominator is close to $|\beta|^2$, so we might hope that the whole expression is small. (Note that it is imperative here that $\beta \ne 0$.) We shall show that $1/|b_n|$ is bounded from above. Indeed, $|\beta| > 0$ and so, taking $\tfrac12|\beta|$ as our $\varepsilon$, we can say that there is some $N'$ such that $n > N'$ implies that
$$|b_n - \beta| < \tfrac12 |\beta|.$$
Hence, if $n > N'$, we have
$$|\beta| = |\beta - b_n + b_n| \le |\beta - b_n| + |b_n| < \tfrac12|\beta| + |b_n|$$
and so $\tfrac12|\beta| < |b_n|$. If we set $K = \min\{|b_1|, |b_2|, \dots, |b_{N'}|, \tfrac12|\beta|\}$, then it is true that $K > 0$ and $|b_n| \ge K$ for all $n$. Hence $1/|b_n| \le 1/K$ for all $n$.
Let $\varepsilon > 0$ be given. Let $\varepsilon' = \varepsilon K |\beta|$. Since $b_n \to \beta$, there is $N$ such that $n > N$ implies that
$$|b_n - \beta| < \varepsilon'.$$
But then, for any $n > N$, we have
$$|\gamma_n - \gamma| = \frac{|\beta - b_n|}{|b_n|\,|\beta|} < \frac{\varepsilon'}{K\,|\beta|} = \varepsilon$$
and the proof is complete.
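Examples 3.21.
1. Taking $a_n = 1/n$, it follows that $\lambda/n \to 0$ as $n \to \infty$ for any $\lambda \in \mathbb{R}$.
2. Suppose that $a_n \to \alpha$ as $n \to \infty$. Then it follows immediately that $a_n - \alpha \to 0$ as $n \to \infty$. Indeed, for any given $\varepsilon > 0$, there is some $N \in \mathbb{N}$ such that $n > N$ implies that $|a_n - \alpha| < \varepsilon$. But $|a_n - \alpha| = |(a_n - \alpha) - 0|$, so to say that $a_n \to \alpha$ is just to say that $a_n - \alpha \to 0$ as $n \to \infty$.
3. With $a_n = b_n$, we see that if $a_n \to \alpha$, then $a_n^2 \to \alpha^2$ as $n \to \infty$. Now with $b_n = a_n^2$, it follows that $a_n^3 \to \alpha^3$ as $n \to \infty$. Repeating this (i.e., by induction), we see that if $a_n \to \alpha$ as $n \to \infty$, then $a_n^k \to \alpha^k$ as $n \to \infty$ for any given $k \in \mathbb{N}$.
4. Let $a_n = \dfrac{3n^2 - 4}{2n^2 + 1}$ for $n \in \mathbb{N}$. We can rewrite $a_n$ as $a_n = \dfrac{3 - 4/n^2}{2 + 1/n^2}$. Then we note that $4/n^2 \to 0$ and $1/n^2 \to 0$, so that $3 - 4/n^2 \to 3$ and $2 + 1/n^2 \to 2$ as $n \to \infty$. Finally, it follows that $a_n = \dfrac{3 - 4/n^2}{2 + 1/n^2} \to 3/2$ as $n \to \infty$.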
5. Let $a_n = \dfrac{7n^3 - 5n^2 + 3n - 9}{3n^3 + 4n^2 - 8n + 2}$. The first thing we do is to divide through by the highest power of $n$ occurring in the numerator or denominator, i.e., in this case, by $n^3$. So, $a_n$ can be rewritten as
$$a_n = \frac{7 - 5/n + 3/n^2 - 9/n^3}{3 + 4/n - 8/n^2 + 2/n^3}.$$
Then we see that the numerator converges to 7 and the denominator converges to 3 as $n \to \infty$. Hence $a_n \to 7/3$ as $n \to \infty$.
6. Let $a_n = \dfrac{n^4 - 8}{n^7 + 3}$. Then we have $a_n = \dfrac{1/n^3 - 8/n^7}{1 + 3/n^7}$ and so it follows that $a_n \to 0/1 = 0$ as $n \to \infty$.
7. Let $a_n = \dfrac{2n^5 + 4}{n^3 + 6}$. Then $a_n = \dfrac{2 + 4/n^5}{1/n^2 + 6/n^5}$. Now, the numerator converges to 2 whilst the denominator converges to 0 as $n \to \infty$. The above theorem about the convergence of $a_n/b_n$ says nothing about the case when $b_n$ or $\beta = \lim b_n$ are zero. In this example, we back up and note that, by inspection, we have
$$a_n = \frac{2n^5 + 4}{n^3 + 6} > \frac{2n^5}{n^3 + 6} \ge \frac{2n^5}{n^3 + 6n^3} = \frac{2n^2}{7}.$$
It follows that $a_n$ is not bounded from above and so cannot converge.
8. Suppose that $|x| < 1$ and consider the sequence $a_n = x^n$, for $n \in \mathbb{N}$. Then the sequence $b_n = |a_n| = |x|^n$ is monotone decreasing and is bounded below (by 0) and so therefore it converges, to $\ell$, say: $b_n \to \ell$ as $n \to \infty$. Hence the sequence $(b_{2n})$ also converges to $\ell$. However,
$$b_{2n} = |a_{2n}| = |x^{2n}| = |x|^n\,|x|^n = b_n b_n \to \ell^2$$
and so we see that $\ell = \ell^2$. Therefore either $\ell = 0$ or else $\ell = 1$. The value $\ell = 1$ is not possible because $(b_n)$ converges to its greatest lower bound and the value 1 is not a lower bound. Hence $\ell = 0$ and we conclude that $|a_n| \to 0$ as $n \to \infty$.
Let $\varepsilon > 0$ be given.
Then there is some $N$ such that $n > N$ implies that
$$\bigl|\,|a_n| - 0\,\bigr| = |a_n| = |a_n - 0| < \varepsilon$$
which shows that $x^n = a_n \to 0$ as $n \to \infty$.
The next result is very useful.
Proposition 3.22. Suppose that $(c_n)$ is a sequence in $\mathbb{R}$ with $c_n \ge 0$ for all $n \in \mathbb{N}$ and such that $c_n \to \ell$ as $n \to \infty$. Then $\ell \ge 0$. In other words, the limit of a convergent positive sequence is positive. (Note that we are using the term positive to mean not strictly negative, so that the value zero is allowed.)
Proof. Exactly one of $\ell < 0$, $\ell = 0$ or $\ell > 0$ is true. We wish to show that the first is impossible. To do this, suppose the contrary, that is, suppose that $\ell < 0$. We will obtain a contradiction from this.
Let $\varepsilon = -\ell$. Then according to our hypothesis, $\varepsilon > 0$. We know that $c_n \to \ell$ as $n \to \infty$ and so we can say that there is some $N$ in $\mathbb{N}$ such that $n > N$ implies that $|c_n - \ell| < \varepsilon$. How can we use this? Fix any $n > N$, for example we could take $n = N + 1$. The inequality $|c_n - \ell| < \varepsilon$ is equivalent to the pair of inequalities
$$-\varepsilon < c_n - \ell < \varepsilon.$$
Recalling that $\varepsilon = -\ell$, we find that
$$c_n - \ell < \varepsilon = -\ell.$$
This tells us that $c_n < 0$ which is false. We have obtained our contradiction and so we can conclude that, as claimed, it is true that $\ell \ge 0$.
It is natural to ask whether strict positivity of every $c_n$ implies that of the limit $\ell$, that is, if $c_n > 0$ for all $n$, can we deduce that necessarily $\ell > 0$? The answer is no. To show this, we just need to exhibit an explicit example. Such an example is provided by the sequence $c_n = 1/n$. It is true here that $c_n = 1/n > 0$ for every $n$. The sequence $(c_n)$ converges, but its limit is $\ell = 0$. So we have $c_n > 0$ for all $n$, $c_n \to \ell$ as $n \to \infty$ but $\ell = 0$.
The following theorem provides a useful technique for exhibiting convergence of a sequence even under circumstances where we do not know explicitly the values of the terms of the sequence.
Theorem 3.23 (Sandwich Principle). Suppose that $(a_n)$, $(b_n)$ and $(x_n)$ are sequences in $\mathbb{R}$ such that
(i) $a_n \le x_n \le b_n$ for all $n \in \mathbb{N}$ and
(ii) both $a_n \to \ell$ and $b_n \to \ell$ as $n \to \infty$.
Then $(x_n)$ converges and its limit is $\ell$.
Proof. Let $\varepsilon > 0$ be given.
The inequalities $a_n \le x_n \le b_n$ can be rewritten as
$$0 \le \underbrace{x_n - a_n}_{y_n} \le \underbrace{b_n - a_n}_{z_n}.$$
Since both $(a_n)$ and $(b_n)$ converge to $\ell$, it follows that $z_n = b_n - a_n \to \ell - \ell = 0$ as $n \to \infty$. Hence there is some $N$ in $\mathbb{N}$ such that $n > N$ implies that $|z_n| < \varepsilon$. But since $y_n = x_n - a_n \ge 0$, we have $|y_n| = y_n$ and so $n > N$ implies that
$$|y_n| = y_n \le z_n = |z_n| < \varepsilon$$
which means that $y_n \to 0$ as $n \to \infty$. To finish the proof, we observe that $x_n = y_n + a_n \to 0 + \ell$ as $n \to \infty$ and we are done.
We illustrate this with a proof that any real number can be approximated by rationals.
Theorem 3.24. Any real number is the limit of some sequence of rational numbers.
Proof. Let $a$ be any given real number. For each $n \in \mathbb{N}$, we know that there is a rational number $q_n$, say, lying between the numbers $a$ and $a + 1/n$. That is, $q_n$ satisfies
$$a \le q_n \le a + \frac{1}{n}.$$
Since $1/n \to 0$, an application of the Sandwich Principle tells us immediately that $q_n \to a$ as $n \to \infty$, as required.
Note that a similar proof shows that any real number is the limit of a sequence of irrational numbers (just replace the adjective rational by irrational). The point though is that even though one might think of the irrational numbers as somewhat weird, they can nevertheless be approximated as closely as desired by rational numbers.
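For a concrete instance of the proof of Theorem 3.24, take $a = \sqrt{2}$. The sketch below (the function name `rational_above_sqrt2` is our own) produces a rational $q_n$ with $\sqrt{2} < q_n \le \sqrt{2} + 1/n$ using integer arithmetic only, so no floating-point approximation of $\sqrt{2}$ is involved.

```python
from fractions import Fraction
from math import isqrt


def rational_above_sqrt2(n):
    """Return a rational q_n with sqrt(2) < q_n <= sqrt(2) + 1/n."""
    m = isqrt(2 * n * n)       # largest integer with m*m <= 2*n*n, i.e. m/n <= sqrt(2)
    return Fraction(m + 1, n)  # (m+1)/n > sqrt(2), while (m+1)/n - 1/n = m/n <= sqrt(2)


if __name__ == "__main__":
    for n in (1, 10, 100, 10**6):
        q = rational_above_sqrt2(n)
        print(n, q, float(q))
```

The choice of $\sqrt{2}$ is only for illustration; the proof itself works for any real number $a$.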
Subsequences
Consider the sequence $(a_n)$ given by $a_n = \sin(\tfrac12 n\pi)$ for $n \in \mathbb{N}$. Evidently, $a_n = 0$ if $n$ is even and alternates between $\pm 1$ for $n$ odd. For example, the first 5 terms are $a_1 = 1$, $a_2 = 0$, $a_3 = -1$, $a_4 = 0$, $a_5 = 1$.
Next, consider the sequence
$$(b_n) = \bigl(1, \tfrac23, \tfrac13, \tfrac45, \tfrac15, \dots\bigr).$$
This is given by
$$b_n = \begin{cases} \dfrac{1}{n}, & \text{for } n \text{ odd} \\[1ex] \dfrac{n}{n+1}, & \text{for } n \text{ even.} \end{cases}$$
We notice that the odd terms approach 0 whereas the even terms approach 1.
These two examples suggest that we might well be interested in considering certain terms of a sequence in isolation from the original sequence. This idea is formalized in the concept of a subsequence of a sequence. Roughly speaking, a subsequence of a sequence is simply any sequence obtained by leaving out particular terms from the original sequence. For example, the even terms $a_2, a_4, a_6, \dots$ form a subsequence of the sequence $(a_n)$. Another subsequence of $(a_n)$ is obtained by considering, say, every tenth term, $a_{10}, a_{20}, a_{30}, \dots$.
Definition 3.25. Let $(a_n)$ be a given sequence. A subsequence of $(a_n)_{n\in\mathbb{N}}$ is any sequence of the form $(a_{n_1}, a_{n_2}, a_{n_3}, \dots)$ where $n_1 < n_2 < n_3 < \dots$ is any (strictly increasing) sequence of natural numbers.
We can express this somewhat more formally as follows. A sequence $(b_k)_{k\in\mathbb{N}}$ is a subsequence of the sequence $(a_n)_{n\in\mathbb{N}}$ if there is some mapping $\varphi : \mathbb{N} \to \mathbb{N}$ such that $i < j$ implies that $\varphi(i) < \varphi(j)$ (i.e., $\varphi$ is strictly increasing) and such that $b_k = a_{\varphi(k)}$ for each $k \in \mathbb{N}$. This agrees with the above formulation if we simply set $\varphi(k) = n_k$ and put $b_k = a_{\varphi(k)} = a_{n_k}$. (It really just amounts to a matter of notation.)
Of course, $(b_k)_{k\in\mathbb{N}}$ is a sequence in its own right and so one can consider subsequences of $(b_k)$. Evidently, a subsequence of $(b_k)$ is also a subsequence of $(a_n)_{n\in\mathbb{N}}$. This is intuitively clear. We get a subsequence of $(b_k)$ by leaving out some of its terms. However, $(b_k)$ itself was obtained from $(a_n)$ by leaving out various terms of $(a_n)$, so if we leave out both lots in one step, we get our subsequence of $(b_k)$ directly from $(a_n)$. To see this more formally, suppose that $(c_j)_{j\in\mathbb{N}}$ is a subsequence of $(b_k)_{k\in\mathbb{N}}$. Then there is a strictly increasing map $\psi : \mathbb{N} \to \mathbb{N}$ such that $c_j = b_{\psi(j)}$ for all $j \in \mathbb{N}$. However, since $(b_k)$ is a subsequence of $(a_n)$, there is a strictly increasing map $\varphi : \mathbb{N} \to \mathbb{N}$ such that $b_k = a_{\varphi(k)}$ for all $k \in \mathbb{N}$. This means that we can write $c_j$ as
$$c_j = b_{\psi(j)} = a_{\varphi(\psi(j))}$$
for $j \in \mathbb{N}$. Let $\gamma : \mathbb{N} \to \mathbb{N}$ be the map $\gamma(j) = \varphi(\psi(j))$. Evidently $\gamma$ is strictly increasing and $c_j = a_{\gamma(j)}$ for $j \in \mathbb{N}$. This shows that $(c_j)_{j\in\mathbb{N}}$ is a subsequence of $(a_n)_{n\in\mathbb{N}}$.
Remark 3.26. Let $(a_{n_j})$ be a subsequence of $(a_n)$. It is intuitively clear that, say, the 20th term of $(a_{n_j})$ has to be at least the 20th term of $(a_n)$. In general, the term $a_{n_j}$ is at least as far along the $(a_n)$ sequence as the $j$th or, in other words, $n_j \ge j$.
We will verify this by induction. For $j \in \mathbb{N}$, let $P(j)$ be the statement "$n_j \ge j$". Now, $n_j \in \mathbb{N}$ and so, in particular, $n_1 \ge 1$, which means that $P(1)$ is true. Fix $j \in \mathbb{N}$ and suppose that $P(j)$ is true. We will show that this implies that $P(j+1)$ is also true. Indeed, $n_j$ is strictly increasing in $j$ and so we have
$$n_{j+1} > n_j \ge j, \quad \text{by the induction hypothesis that } P(j) \text{ is true.}$$
Since all quantities under consideration are integer-valued, we deduce that $n_{j+1} \ge j + 1$, i.e., $P(j+1)$ is true. It follows, by induction, that $P(j)$ is true for all $j \in \mathbb{N}$.
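The inequality $n_j \ge j$, and the fact that a subsequence of a subsequence is again a subsequence, can both be tried out on randomly generated index sequences. This is a sketch only; the helper names `random_index_sequence` and `compose` are ours, and lists are used with the convention that position $j-1$ of a list holds the $j$th index.

```python
import random


def random_index_sequence(length, max_gap=5, seed=0):
    """Return a strictly increasing list n_1 < n_2 < ... of natural numbers."""
    rng = random.Random(seed)
    indices, current = [], 0
    for _ in range(length):
        current += rng.randint(1, max_gap)  # each step moves strictly forward
        indices.append(current)
    return indices


def compose(phi, psi):
    """Index map gamma(j) = phi(psi(j)) of a subsequence of a subsequence."""
    return [phi[p - 1] for p in psi if p <= len(phi)]


if __name__ == "__main__":
    phi = random_index_sequence(200)
    psi = random_index_sequence(40, seed=1)
    gamma = compose(phi, psi)
    print(all(n_j >= j for j, n_j in enumerate(gamma, start=1)))
```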
Proposition 3.27. Suppose that $(a_n)$ converges to $\ell$. Then so does every subsequence of $(a_n)$.
Proof. Let $(a_{n_k})_{k\in\mathbb{N}}$ be any subsequence of $(a_n)$ whatsoever. We wish to show that $a_{n_k} \to \ell$ as $k \to \infty$.
Let $\varepsilon > 0$ be given.
Now, we know that $a_n \to \ell$ as $n \to \infty$. Therefore, we are assured that there is some $N \in \mathbb{N}$ such that $n > N$ implies that $|a_n - \ell| < \varepsilon$. But $(a_{n_k})$ is a subsequence of $(a_n)$ and so we know that $n_k \ge k$ for all $k \in \mathbb{N}$. It follows that if $k > N$, then certainly $n_k > N$. Hence, $k > N$ implies that $|a_{n_k} - \ell| < \varepsilon$ and the proof is complete.
Remark 3.28. The proposition tells us that if $(a_n)$ converges, then so does any subsequence, and to the same limit.
Consider the sequence $a_n = (-1)^n$. Then we see that $a_{2n} = 1$ for all $n$, whereas $a_{2n-1} = -1$ for all $n$, so that $(a_n)$ certainly possesses two subsequences which both converge but to different limits. Consequently, the original sequence cannot possibly converge. (If it did, every subsequence would have to converge to the same limit, namely, the limit of the original sequence.)
Bolzano-Weierstrass Theorem
Before we launch into one of the most important results of real analysis, let us make one or two observations regarding upper and lower bounds.
Suppose that $A$ and $B$ are subsets of $\mathbb{R}$ with $A \subseteq B$. If $M$ is such that $b \le M$ for all $b \in B$, then certainly, in particular, $a \le M$ for all $a \in A$. In other words, an upper bound for $B$ is also (a fortiori) an upper bound for any subset $A$ of $B$. Now, $\operatorname{lub} B$ is an upper bound for $B$ and so $\operatorname{lub} B$ is certainly an upper bound for $A$. It follows that
$$\operatorname{lub} A \le \operatorname{lub} B.$$
It is possible for the inequality here to be strict. For example, if $A$ is the interval $A = [1, 2]$ and $B$ is the interval $B = [0, 3]$, then $A \subseteq B$ and evidently $\operatorname{lub} A = 2$ whereas $\operatorname{lub} B = 3$, so that $\operatorname{lub} A < \operatorname{lub} B$ in this case.
Similarly, we note that if $m$ is a lower bound for $B$, then $m$ is also a lower bound for $A$ and so
$$\operatorname{glb} B \le \operatorname{glb} A.$$
With the example $A = [1, 2]$ and $B = [0, 3]$, as above, we see that $\operatorname{glb} B = 0$ and $\operatorname{glb} A = 1$.
Theorem 3.29 (Bolzano-Weierstrass Theorem). Any bounded sequence of real numbers possesses a convergent subsequence.
(In other words, if $(a_n)_{n\in\mathbb{N}}$ is a bounded sequence in $\mathbb{R}$, then there is a strictly increasing sequence $(n_k)_{k\in\mathbb{N}}$ of natural numbers such that $(a_{n_k})_{k\in\mathbb{N}}$ converges.)
Proof. Suppose that $M$ and $m$ are upper and lower bounds for $(a_n)$,
$$m \le a_n \le M. \qquad (*)$$
The idea of the proof is to construct a certain bounded monotone decreasing sequence and use the fact that this converges to its greatest lower bound and to drag a suitable subsequence of $(a_n)$ along with this.
We construct the first element of the auxiliary monotone sequence. Let $M_1 = \operatorname{lub}\{a_n : n \in \mathbb{N}\}$. Then $M_1 - 1$ is not an upper bound for $\{a_n : n \in \mathbb{N}\}$ and so there must be some $n_1$, say, in $\mathbb{N}$ such that
$$M_1 - 1 < a_{n_1} \le M_1.$$
(The value 1 subtracted here (from $M_1$) is not important. We could have chosen any positive number. However, we shall repeat this process and we require a sequence of positive numbers which converge to 0. The numbers $1, \tfrac12, \tfrac13, \dots$ suit our purpose.) We note that $M_1$ is an upper bound for $(a_n)$ and so, in particular, it is an upper bound for the set $\{a_n : n > n_1\}$.
Next, we construct $M_2$ as follows. Let $M_2 = \operatorname{lub}\{a_n : n > n_1\}$. Then $M_2 \le M_1$ and moreover, $M_2 - \tfrac12$ is not an upper bound for $\{a_n : n > n_1\}$ and so there is some $n_2 > n_1$ such that
$$M_2 - \tfrac12 < a_{n_2} \le M_2.$$
The way ahead is now clear. Let $M_3 = \operatorname{lub}\{a_n : n > n_2\}$. Then $M_3 \le M_2$ and since $M_3 - \tfrac13$ is not an upper bound for $\{a_n : n > n_2\}$ there must be some $n_3 > n_2$ such that
$$M_3 - \tfrac13 < a_{n_3} \le M_3.$$
Continuing in this way, we construct a sequence $(M_j)_{j\in\mathbb{N}}$ and a sequence $(n_j)_{j\in\mathbb{N}}$ of natural numbers such that $M_{j+1} \le M_j$, $n_{j+1} > n_j$, and
$$M_j - \frac{1}{j} < a_{n_j} \le M_j$$
for all $j \in \mathbb{N}$.
Now we note that $m \le a_{n_j} \le M_j$ and so $(M_j)$ is a decreasing sequence which is bounded from below. It follows that $M_j \to \mu$ as $j \to \infty$, where $\mu = \operatorname{glb}\{M_j : j \in \mathbb{N}\}$. We are not interested in the value of this limit $\mu$. All we need to know is that the sequence $(M_j)_{j\in\mathbb{N}}$ converges to something. However, by our very construction,
$$M_j - \frac{1}{j} < a_{n_j} \le M_j$$
and so, by the Sandwich Principle, Theorem 3.23, $a_{n_j} \to \mu$ as $j \to \infty$.
We have succeeded in exhibiting a convergent subsequence, namely the subsequence $(a_{n_j})_{j\in\mathbb{N}}$ and the proof is complete.
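The construction in the proof cannot be carried out literally on a computer, since a least upper bound of an infinite tail is not computable in general. As a rough illustration only, the sketch below mimics the steps on a finite list, with the maximum of the remaining terms standing in for the least upper bound. The function name `bw_indices` and the sample sequence are our own choices.

```python
def bw_indices(a, steps):
    """Mimic the proof's construction on a finite list a (a[0] plays the role of a_1).

    Returns the chosen 0-based positions n_j and the values M_j.
    """
    indices, bounds = [], []
    start = 0                        # the tail under consideration is a[start:]
    for j in range(1, steps + 1):
        tail = a[start:]
        if not tail:
            break
        M_j = max(tail)              # stands in for the lub of the tail
        # first position in the tail whose term exceeds M_j - 1/j
        n_j = next(i for i in range(start, len(a)) if a[i] > M_j - 1.0 / j)
        indices.append(n_j)
        bounds.append(M_j)
        start = n_j + 1              # later choices must come strictly after n_j
    return indices, bounds


if __name__ == "__main__":
    # sample bounded sequence a_n = (-1)^n (1 + 1/n), n = 1, ..., 5000
    a = [(-1) ** n * (1 + 1 / n) for n in range(1, 5001)]
    idx, M = bw_indices(a, 10)
    print([i + 1 for i in idx])      # 1-based indices n_1 < n_2 < ...
    print([a[i] for i in idx])
```

For this sample sequence the selected terms are the even-indexed ones, which decrease towards 1.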
Remark 3.30. Note that the theorem does not tell us anything about the subsequence or its limit. Indeed, it cannot, because we know nothing about our original sequence other than the fact that it is bounded. It can also happen that there are many convergent subsequences with different limits. It is easy to construct such examples. For example, let $(u_n)$, $(v_n)$ and $(w_n)$ be any three given convergent sequences, say, $u_n \to u$, $v_n \to v$ and $w_n \to w$. We construct the sequence $(a_n)$ as follows:
$$(a_1, a_2, a_3, a_4, \dots) = (u_1, v_1, w_1, u_2, v_2, w_2, u_3, \dots).$$
In other words, the three sequences $(u_n)$, $(v_n)$ and $(w_n)$ are dovetailed to form $(a_n)$. Explicitly, for $n \in \mathbb{N}$,
$$a_n = \begin{cases} u_k, & \text{if } n = 3k - 2 \text{ for some } k \in \mathbb{N}, \\ v_k, & \text{if } n = 3k - 1 \text{ for some } k \in \mathbb{N}, \\ w_k, & \text{if } n = 3k \text{ for some } k \in \mathbb{N}. \end{cases}$$
Evidently, if $u$, $v$ and $w$ are different, then the sequences $(a_{3j-2})_{j\in\mathbb{N}} = (u_j)_{j\in\mathbb{N}}$, $(a_{3j-1})_{j\in\mathbb{N}} = (v_j)_{j\in\mathbb{N}}$ and $(a_{3j})_{j\in\mathbb{N}} = (w_j)_{j\in\mathbb{N}}$ are three convergent subsequences of $(a_n)_{n\in\mathbb{N}}$ with different limits.
Let us say that a real number is a limit point of a given sequence if it is the limit of some convergent subsequence. Then in this terminology, the real numbers $u$, $v$ and $w$ are limit points of the sequence $(a_n)$.
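The dovetailing in Remark 3.30 is easy to try out. In the sketch below, the function name `dovetail` and the three sample sequences are our own choices; with 0-based lists, the subsequence $(a_{3j-2})$ corresponds to the slice `a[0::3]`, and similarly for the other two.

```python
def dovetail(u, v, w):
    """Interleave three equal-length lists as (u_1, v_1, w_1, u_2, v_2, w_2, ...)."""
    a = []
    for uk, vk, wk in zip(u, v, w):
        a.extend([uk, vk, wk])
    return a


if __name__ == "__main__":
    ks = range(1, 2001)
    u = [1 / k for k in ks]       # u_k -> 0
    v = [1 + 1 / k for k in ks]   # v_k -> 1
    w = [2 - 1 / k for k in ks]   # w_k -> 2
    a = dovetail(u, v, w)
    print(a[0::3][-1], a[1::3][-1], a[2::3][-1])  # close to 0, 1 and 2
```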
Next, we need a little more terminology.
Definition 3.31. A sequence $(a_n)_{n\in\mathbb{N}}$ is said to be a Cauchy sequence (also known as a fundamental sequence) if it has the property that for any given $\varepsilon > 0$ there is some $N \in \mathbb{N}$ such that both $n > N$ and $m > N$ imply that
$$|a_n - a_m| < \varepsilon.$$
In other words, eventually the distance between any two terms of the sequence is less than $\varepsilon$.
Proposition 3.32. Every Cauchy sequence is bounded.
Proof. Suppose that $(a_n)$ is a Cauchy sequence. Then we know that there is some $N \in \mathbb{N}$ such that both $n > N$ and $m > N$ imply that
$$|a_n - a_m| < 1.$$
(The value 1 on the right hand side here is not at all critical. We could have selected any positive real number instead, with obvious modifications to the following reasoning.) In particular, for any $j > N$,
$$|a_j| \le |a_j - a_{N+1}| + |a_{N+1}| < 1 + |a_{N+1}|.$$
It follows that if we let $M = 1 + \max\{|a_i| : 1 \le i \le N + 1\}$, then we have
$$|a_k| \le M$$
for all $k \in \mathbb{N}$. This shows that $(a_n)$ is bounded.
We have seen that a bounded monotone sequence must converge. The next theorem is very important as it gives us a necessary and sufficient condition for convergence of a sequence.
Theorem 3.33. A sequence converges in $\mathbb{R}$ if and only if it is a Cauchy sequence.
Proof. We must show that any Cauchy sequence has to converge and, conversely, that any convergent sequence is a Cauchy sequence.
So suppose that $(a_n)_{n\in\mathbb{N}}$ is a Cauchy sequence. We must show that there is some $\ell$ such that $a_n \to \ell$ as $n \to \infty$. At first, this might seem impossible because there is no way of knowing what $\ell$ might be. However, it turns out that we do not need to know the actual value of $\ell$ but rather just that it does exist. Indeed, we have seen that a Cauchy sequence is bounded and the Bolzano-Weierstrass Theorem tells us that a bounded sequence possesses a convergent subsequence. We shall show that this is enough to guarantee that the sequence itself converges.
Let $\varepsilon > 0$ be given.
As noted above, we know that $(a_n)$ has some convergent subsequence, say $a_{n_k} \to \ell$ as $k \to \infty$. We shall show that $a_n \to \ell$ by an $\varepsilon/2$-argument. Since we know that $a_{n_k} \to \ell$ as $k \to \infty$, we can say that there is $k_0 \in \mathbb{N}$ such that $k > k_0$ implies that
$$|a_{n_k} - \ell| < \tfrac12\varepsilon.$$
Since $(a_n)$ is a Cauchy sequence, there is $N_0$ such that both $n > N_0$ and $m > N_0$ imply that
$$|a_n - a_m| < \tfrac12\varepsilon.$$
Let $N = \max\{k_0, N_0\}$. Now, if $k > N$ it follows that also $n_k > N$ (since $n_k \ge k$) and so if $k > N$ then
$$|a_k - \ell| \le |a_k - a_{n_k}| + |a_{n_k} - \ell| < \tfrac12\varepsilon + \tfrac12\varepsilon = \varepsilon.$$
Thus $a_k \to \ell$ as $k \to \infty$ as required.
Next, suppose that $(a_n)$ converges. We must show that $(a_n)$ is a Cauchy sequence.
Let $\varepsilon > 0$ be given.
We use an $\varepsilon/2$-argument. Let $\ell$ denote $\lim_{n\to\infty} a_n$. Then there is $N \in \mathbb{N}$ such that $n > N$ implies that
$$|a_n - \ell| < \tfrac12\varepsilon.$$
But then if both $n > N$ and $m > N$, we have
$$|a_n - a_m| \le |a_n - \ell| + |\ell - a_m| < \tfrac12\varepsilon + \tfrac12\varepsilon = \varepsilon$$
which verifies that $(a_n)$ is indeed a Cauchy sequence, as claimed.
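The Cauchy condition can be explored numerically by measuring how far apart the stored terms beyond a given $N$ can be. The sketch below uses our own helper `tail_spread`; note that it only inspects finitely many terms, so it can suggest, but never establish, that a sequence is Cauchy.

```python
def tail_spread(a, N):
    """Largest |a_n - a_m| over the stored terms with n, m > N (a[0] is a_1)."""
    tail = a[N:]
    return max(tail) - min(tail)  # the largest gap is between the extreme values


if __name__ == "__main__":
    # partial sums of 1/k^2 (a convergent sequence) and the sequence (-1)^n
    sums, total = [], 0.0
    for k in range(1, 10001):
        total += 1 / k**2
        sums.append(total)
    alternating = [(-1) ** n for n in range(1, 10001)]
    for N in (10, 100, 1000):
        print(N, tail_spread(sums, N), tail_spread(alternating, N))
```

The first spread shrinks as $N$ grows whereas the second stays equal to 2, in line with the theorem: the first sequence converges and the second does not.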
Some special sequences
Example 3.34. What happens to $c^{1/n}$ as $n \to \infty$ for given fixed $c > 0$?
To investigate this, let $c > 0$ and consider the sequence given by $(c^{1/n}) = (c, c^{1/2}, c^{1/3}, c^{1/4}, \dots)$. Suppose first that $c > 1$. Then $c^{1/n} > 1$. For $n \in \mathbb{N}$, let $d_n$ be given by $d_n = c^{1/n} - 1$, so that $d_n > 0$ and $c^{1/n} = 1 + d_n$. Hence, by the binomial theorem,
$$c = (1 + d_n)^n = 1 + n d_n + \binom{n}{2} d_n^2 + \dots + \binom{n}{n-1} d_n^{n-1} + d_n^n \ge 1 + n d_n.$$
It follows that $c - 1 \ge n d_n$ and so we have
$$0 < d_n \le \frac{c - 1}{n}.$$
It follows from the Sandwich Principle that $d_n \to 0$ as $n \to \infty$. Hence, for any $c > 1$, $c^{1/n} = 1 + d_n \to 1$ as $n \to \infty$.
If $c = 1$, then evidently $c^{1/n} = 1^{1/n} = 1 \to 1$ as $n \to \infty$.
Now suppose that $0 < c < 1$. Set $\gamma = 1/c$ so that $\gamma > 1$. Then from the above, $c^{1/n} = (1/\gamma)^{1/n} = 1/(\gamma^{1/n}) \to 1$ as $n \to \infty$.
We conclude that $c^{1/n} \to 1$ as $n \to \infty$ for any fixed $c > 0$.
$$c^{1/n} \to 1 \text{ as } n \to \infty \text{ for any fixed } c > 0$$
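The estimate $0 < d_n \le (c-1)/n$ for $c > 1$ can be checked in floating-point arithmetic. This is a sketch only; the function name `d` mirrors the notation $d_n$ above, and a tiny tolerance is allowed for rounding.

```python
def d(c, n):
    """d_n = c**(1/n) - 1, as in Example 3.34."""
    return c ** (1.0 / n) - 1.0


if __name__ == "__main__":
    c = 10.0
    for n in (1, 10, 100, 1000):
        print(n, d(c, n), (c - 1) / n)  # d_n and its upper bound (c - 1)/n
```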
Example 3.35. What happens to $n^{1/n}$ as $n \to \infty$? There is conflicting behaviour here. Taking the $n$th root would tend to make things smaller, but one is taking the $n$th root of $n$ which itself gets larger. It is not immediately clear what will happen.
Define $k_n$ by $n^{1/n} = 1 + k_n$ (so that $k_n = n^{1/n} - 1$). Then $k_n > 0$ for all $n > 1$. We shall show that $k_n \to 0$ as $n \to \infty$. To see this, notice that for any $n > 1$
$$n = (1 + k_n)^n = 1 + n k_n + \frac{n(n-1)}{2} k_n^2 + \dots + k_n^n > \frac{n(n-1)}{2} k_n^2.$$
Hence, for $n > 1$,
$$0 < k_n < \sqrt{\frac{2}{n-1}}$$
and by the Sandwich Principle, we deduce that $k_n \to 0$ as $n \to \infty$. Hence $n^{1/n} = 1 + k_n \to 1$ as $n \to \infty$.
$$n^{1/n} \to 1 \text{ as } n \to \infty$$
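As a numerical illustration of Example 3.35, the following sketch compares $k_n = n^{1/n} - 1$ with the bound $\sqrt{2/(n-1)}$ obtained above. The function name `k` simply mirrors the notation $k_n$.

```python
from math import sqrt


def k(n):
    """k_n = n**(1/n) - 1, as in Example 3.35."""
    return n ** (1.0 / n) - 1.0


if __name__ == "__main__":
    for n in (2, 10, 100, 10000):
        print(n, k(n), sqrt(2 / (n - 1)))  # k_n and its upper bound
```

The table suggests that the bound is far from sharp, but any bound tending to zero is enough for the Sandwich Principle.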
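Example 3.36. What happens to $c^n/n!$ as $n \to \infty$ for fixed $c \in \mathbb{R}$? If $c > 1$, then $c^n$ gets large as $n$ grows but so does the denominator $n!$. There is conflicting behaviour here, so it is not obvious what does happen.
For any $c \in \mathbb{R}$, choose an integer $k \in \mathbb{N}$ such that $k > |c|$. The $(k+m)$th term of the sequence is
$$\frac{c^{k+m}}{(k+m)!} = \frac{c^k\, c^m}{k!\,(k+1)(k+2)\cdots(k+m)}.$$
We have
$$0 \le \left|\frac{c^{k+m}}{(k+m)!}\right| \le \frac{|c^k|}{k!}\,\frac{|c|^m}{(k+1)(k+2)\cdots(k+m)} \le \frac{|c^k|}{k!}\,\frac{|c|^m}{k^m} = \frac{|c^k|}{k!}\,\gamma^m$$
where $\gamma = |c|/k < 1$. Now let $\alpha_j = |c|^j/j!$ for $1 \le j \le k$ and let $\alpha_j = \dfrac{|c|^k}{k!}\gamma^m$ for $j = k + m$ with $m \ge 1$. Then evidently $\alpha_j \to 0$ as $j \to \infty$ and
$$0 \le \left|\frac{c^j}{j!}\right| \le \alpha_j.$$
By the Sandwich Principle, it follows that $|c^j/j!| \to 0$ and hence we also have $c^j/j! \to 0$ as $j \to \infty$.
$$c^j/j! \to 0 \text{ as } j \to \infty \text{ for any fixed } c \in \mathbb{R}$$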
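The geometric bound in Example 3.36 can be checked numerically. In this sketch the names `terms` and `geometric_bound` are ours; the terms are built up by repeated multiplication so that no very large intermediate numbers appear.

```python
from math import factorial


def terms(c, count):
    """Return [c**1/1!, ..., c**count/count!], built by t_j = t_{j-1} * c / j."""
    out, t = [], 1.0
    for j in range(1, count + 1):
        t *= c / j
        out.append(t)
    return out


def geometric_bound(c, k, m):
    """(|c|**k / k!) * gamma**m with gamma = |c|/k, as in the estimate above."""
    gamma = abs(c) / k
    return abs(c) ** k / factorial(k) * gamma**m


if __name__ == "__main__":
    c, k = -7.5, 8              # any integer k > |c| will do
    t = terms(c, 40)
    for m in (1, 10, 30):
        print(k + m, abs(t[k + m - 1]), geometric_bound(c, k, m))
```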
Example 3.37. What happens to $\sqrt{n+1} - \sqrt{n}$ as $n \to \infty$? Each of the two terms becomes large but what about their difference? To see what does happen, we use a trick and write
$$0 < \sqrt{n+1} - \sqrt{n} = \frac{(\sqrt{n+1} - \sqrt{n})(\sqrt{n+1} + \sqrt{n})}{\sqrt{n+1} + \sqrt{n}} = \frac{(n+1) - n}{\sqrt{n+1} + \sqrt{n}} = \frac{1}{\sqrt{n+1} + \sqrt{n}} < \frac{1}{\sqrt{n}}$$
and so by the Sandwich Principle, we deduce that $\sqrt{n+1} - \sqrt{n} \to 0$ as $n \to \infty$.
$$\sqrt{n+1} - \sqrt{n} \to 0 \text{ as } n \to \infty$$
Example 3.38. Let $0 < a < 1$ and let $k \in \mathbb{N}$ be fixed. What happens to $n^k a^n$ as $n \to \infty$? The term $n^k$ gets large but the term $a^n$ becomes small as $n$ grows. We have conflicting behaviour.
To investigate this, first let us note that $n^k = (n^{k/n})^n$ and also that $n^{k/n} = (n^{1/n})^k \to 1^k = 1$ as $n \to \infty$. It follows that $n^{k/n} a \to a$ as $n \to \infty$. Let $r$ obey $a < r < 1$. Then eventually $n^{k/n} a < r$ (because with $\varepsilon = r - a$, eventually $n^{k/n} a - a < \varepsilon = r - a$). It follows that eventually
$$0 < n^k a^n = (n^{k/n} a)^n < r^n.$$
But $r^n \to 0$ and so by the Sandwich Principle we conclude that $n^k a^n \to 0$ as $n \to \infty$. (There is $N \in \mathbb{N}$ such that $n > N$ implies that
$$0 < n^k a^n = (n^{k/n} a)^n < r^n.$$
For $1 \le n \le N$, set $b_n = n^k a^n$ and for $n > N$ set $b_n = r^n$. Then $b_n \to 0$ as $n \to \infty$ and we have
$$0 < n^k a^n \le b_n$$
and so the Sandwich Principle tells us that $n^k a^n \to 0$ as $n \to \infty$.)
$$n^k a^n \to 0 \text{ as } n \to \infty \text{ for any fixed } 0 < a < 1$$
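The competition between $n^k$ and $a^n$ in Example 3.38 shows up clearly in a table of values. The following sketch uses the sample values $k = 3$, $a = 0.9$ and $r = 0.95$, which are our own choices and satisfy $a < r < 1$.

```python
def term(n, k, a):
    """n**k * a**n, as in Example 3.38."""
    return n**k * a**n


if __name__ == "__main__":
    k, a, r = 3, 0.9, 0.95
    for n in (1, 20, 100, 500, 1000):
        # the terms grow at first, but are eventually below r**n
        print(n, term(n, k, a), r**n)
```

The terms initially grow, which is why the comparison with $r^n$ can only hold eventually and not for all $n$.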
Sequences of functions
Just as one can have a sequence of real numbers, so one can have a sequence of functions. By this is simply meant a family of functions labelled by the natural numbers $\mathbb{N}$. Consider, then, a sequence $(f_n)_{n\in\mathbb{N}}$ of functions. For each given $x$, the sequence $(f_n(x))_{n\in\mathbb{N}}$ is just a sequence of real numbers, as considered already. Here, as always, $f_n(x)$ is the notation for the value taken by the function $f_n$ at the real number $x$. In this way, we get many sequences, one for each $x$. Now, for some particular values of $x$ the sequence $(f_n(x))_{n\in\mathbb{N}}$ may converge whereas for other values of $x$ it may not. Even when it does converge, its limit will, in general, depend on the value of $x$. These various values of the limit themselves determine a function of $x$. This leads to the following notion of convergence of a sequence of functions.
Definition 3.39. Suppose that $(f_n)_{n\in\mathbb{N}}$ is a sequence of functions each defined on a particular subset $S$ in $\mathbb{R}$. We say that the sequence $(f_n)_{n\in\mathbb{N}}$ converges pointwise on $S$ to the function $f$ if for each $x \in S$ the sequence $(f_n(x))_{n\in\mathbb{N}}$ of real numbers converges to the real number $f(x)$. We write $f_n \to f$ pointwise on $S$ as $n \to \infty$.
Some examples will illustrate this important idea.
Examples 3.40.
1. Let $f_n(x) = x^n$ and let $S$ be the open interval $S = (-1, 1)$. We have seen that $x^n \to 0$ as $n \to \infty$ for any $x$ with $|x| < 1$. This simply says that $(f_n)$ converges pointwise to $f = 0$, the function identically zero on the set $(-1, 1)$.
2. Let $f_n(x) = x^n$ as above, but now let $S = (-1, 1]$. Then for $|x| < 1$, we know that $f_n(x) = x^n \to 0$ as $n \to \infty$. Furthermore, with $x = 1$, we have $f_n(1) = 1^n = 1$, so that $f_n(1) \to 1$ as $n \to \infty$. Let $f$ be the function on $S = (-1, 1]$ given by
$$f(x) = \begin{cases} 0, & \text{for } -1 < x < 1, \\ 1, & \text{for } x = 1. \end{cases}$$
Then we can say that $f_n \to f$ pointwise on $(-1, 1]$.
3. Once again, let $f_n(x) = x^n$ but now let $S$ be the interval $S = [-1, 1]$. We know that for each $x \in (-1, 1]$, the sequence $(f_n(x))$ of real numbers converges. We must investigate what happens for $x = -1$. We see that $f_n(-1) = (-1)^n$, so that the sequence $(f_n(-1))_{n\in\mathbb{N}}$ of real numbers does not converge. This means that there does not exist a function $f$ on $[-1, 1]$ with the property that $f_n(-1) \to f(-1)$. The conclusion is that in this case $(f_n)$ does not converge pointwise on $[-1, 1]$ to any function at all.
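The three examples can be mirrored in a few lines of code. In this sketch, `f` and `pointwise_limit` are our own names; `pointwise_limit` just encodes the limit function found in the second example, and the sample points are arbitrary.

```python
def f(n, x):
    """f_n(x) = x**n."""
    return x**n


def pointwise_limit(x):
    """The pointwise limit of f_n on S = (-1, 1], as found in the examples."""
    if not (-1 < x <= 1):
        raise ValueError("no pointwise limit claimed outside (-1, 1]")
    return 1.0 if x == 1 else 0.0


if __name__ == "__main__":
    for x in (-0.9, 0.0, 0.5, 0.99, 1.0):
        print(x, f(5000, x), pointwise_limit(x))
    # at x = -1 the values alternate between -1 and 1 and do not settle down
    print([f(n, -1) for n in range(1, 7)])
```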
These examples illustrate the obvious but nevertheless crucial point that pointwise convergence of a sequence of functions involves not only a particular sequence of functions but also the set on which the pointwise convergence is to be considered to take place. The notion of pointwise convergence only makes sense when used together with the set to which it refers.
Chapter 4
Series
Given a sequence $a_1, a_2, \dots$ we wish to discuss the "infinite sum"
$$a_1 + a_2 + a_3 + \dots$$
Such an expression is called an infinite series and is denoted by $\sum_{k=1}^{\infty} a_k$. We shall attempt to interpret such a series as a suitable limiting object. To this end, let $s_n$ be the so-called $n$th partial sum
$$s_n = \sum_{k=1}^{n} a_k = a_1 + \dots + a_n.$$
Then as $n$ becomes larger, so $s_n$ looks more like the series $\sum_{k=1}^{\infty} a_k$. Of course, there is the matter of convergence to be considered. The point is that one can always write down the expression $\sum_{k=1}^{\infty} a_k$ but without some extra discussion it is not at all clear what it actually means. It is certainly a combination of symbols, but does it have any reasonable interpretation as a real number? For example, if it happens to be the case that $a_k = 1$ for every $k$, then $\sum_{k=1}^{\infty} a_k = \sum_{k=1}^{\infty} 1$. What does this mean? We see that in this special case, $s_n = n$ which gets large. The answer is that $\sum_{k=1}^{\infty} a_k$ simply has no meaning in this case. We say that the series $\sum_{k=1}^{\infty} a_k$ diverges.
As another example, suppose that $a_k = (-1)^{k+1}$. Then $a_k = 1$ for odd $k$ and is otherwise equal to $-1$. Then $\sum_{k=1}^{\infty} a_k = 1 - 1 + 1 - 1 + 1 - 1 + \dots$ which means what exactly? In this example, we see that $s_n = 1$ if $n$ is odd but is zero if $n$ is even. The partial sums flip interminably between the two values 1 and 0.
Definition 4.1. The series $\sum_{k=1}^{\infty} a_k$ is said to be convergent if the sequence of partial sums $(s_n)_{n\in\mathbb{N}}$ converges.
If $s_n \to \alpha$ as $n \to \infty$, then $\alpha$ is said to be the sum of the series and the expression $\sum_{k=1}^{\infty} a_k$ is defined to be this limit $\alpha$.
A series which is not convergent is said to be divergent.
Example 4.2. Let $a_k = (1/3)^k$, so that
$$\sum_{k=1}^{\infty} a_k = \sum_{k=1}^{\infty} \frac{1}{3^k}.$$
We see that the partial sums are given by
$$s_n = \frac13 + \dots + \Bigl(\frac13\Bigr)^n = \frac{\frac13 - (\frac13)^{n+1}}{1 - \frac13} = \frac12 - \frac12\Bigl(\frac13\Bigr)^n \to \frac12$$
as $n \to \infty$. Hence
$$\sum_{k=1}^{\infty} \frac{1}{3^k} = \frac12.$$
Note that the same argument shows that
$$\sum_{k=1}^{\infty} x^k = \frac{x}{1 - x} \qquad (*)$$
for any $x$ with $|x| < 1$. (The requirement that $|x| < 1$ ensures that $x^n \to 0$ as $n \to \infty$.)
Note that if we were to ignore the fact that it was not valid but go ahead anyway and simply set $x = 1$ in the above formula $(*)$, then we would have $\sum_{k=1}^{\infty} 1$ on the left hand side and $1/0$ on the right hand side, neither of which have meaning as real numbers. Again, if we ignore the fact that it is invalid but anyway set $x = -1$ in $(*)$, then the left hand side becomes $(-1 + 1 - 1 + 1 - \dots)$ and the right hand side becomes $-\tfrac12$ which might lead one to suggest that $1 - 1 + 1 - 1 + \dots$ is in some sense equal to $\tfrac12$. The fact is that $1 - 1 + 1 - 1 + \dots$ has no sensible interpretation as a real number.
Returning to the series itself and setting $x = 5$, say, we see that the partial sum $s_n = 5 + \dots + 5^n \ge 5^n$ so that the sequence $(s_n)$ does not converge (it is not bounded) and therefore $\sum_{k=1}^{\infty} 5^k$ is divergent. That's it; nothing more to say. The expression $\sum_{k=1}^{\infty} 5^k$ does not represent a real number and it cannot be manipulated as if it did. (It is tempting to say that $\sum_{k=1}^{\infty} 5^k$ has no meaning at all. However, it does implicitly carry with it the discussion here to the effect that the sequence of partial sums does not converge.)
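The closed form for the partial sums of the geometric series can be verified exactly with rational arithmetic. In this sketch, `partial_sum` and `closed_form` are our own names for the two sides of the identity $s_n = (x - x^{n+1})/(1 - x)$.

```python
from fractions import Fraction


def partial_sum(x, n):
    """s_n = x + x**2 + ... + x**n, summed term by term."""
    return sum(x**k for k in range(1, n + 1))


def closed_form(x, n):
    """(x - x**(n+1)) / (1 - x), valid for x != 1."""
    return (x - x ** (n + 1)) / (1 - x)


if __name__ == "__main__":
    x = Fraction(1, 3)
    for n in (1, 2, 5, 20):
        s = partial_sum(x, n)
        print(n, s, Fraction(1, 2) - s)  # the gap to 1/2 is (1/2)(1/3)**n
```

With $x = 5$ the same functions show the partial sums growing at least as fast as $5^n$, in line with the divergence discussed above.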
The following divergence test allows us to immediately spot certain series as being divergent.
Proposition 4.3 (Test for divergence). Suppose that the sequence $(a_n)$ fails to converge to 0 as $n \to \infty$. Then the series $\sum_{k=1}^{\infty} a_k$ diverges.
Proof. We must show that if $\sum_{k=1}^{\infty} a_k$ is convergent then $a_n \to 0$. So suppose that $\sum_{k=1}^{\infty} a_k$ is convergent with sum $\alpha$, say. This means that $s_n \to \alpha$ as $n \to \infty$, where $s_n = \sum_{k=1}^{n} a_k$.
Let $\varepsilon > 0$ be given.
We need an $\varepsilon/2$-argument. Since $s_n \to \alpha$, we are assured that there is some $N' \in \mathbb{N}$ such that
$$|s_n - \alpha| < \tfrac12\varepsilon$$
whenever $n > N'$. But then
$$|a_n| = |s_n - s_{n-1}| = |s_n - \alpha + \alpha - s_{n-1}| \le |s_n - \alpha| + |\alpha - s_{n-1}| < \tfrac12\varepsilon + \tfrac12\varepsilon = \varepsilon$$
provided $n > N'$ and $n - 1 > N'$. So if we set $N = N' + 1$, then if $n > N$ we can be sure that
$$|a_n| < \varepsilon$$
which establishes that $a_n \to 0$ as $n \to \infty$ and the proof is complete.
Remark 4.4. It is very important to understand what this proposition says and what it does not say. It says that if the terms of a series fail to converge to zero, then the series itself is divergent.
It is quite possible to find a series whose terms do converge to zero but nevertheless, the series is divergent. Such an example is provided by the series $\sum_{k=1}^{\infty} a_k$ with $a_k = 1/k$.
$$1 + \tfrac12 + \tfrac13 + \tfrac14 + \tfrac15 + \dots \text{ is a divergent series}$$
Indeed, the sequence of partial sums $(s_n)$ is not bounded. One can see this as follows. That portion of the graph of the function $y = 1/x$ between the values $x = k$ and $x = k + 1$ lies below the line $y = 1/k$. Let $R_k$ denote the rectangle with height $1/k$ and with base on the interval $[k, k+1]$. Then the area of $R_k$ is greater than the area under the graph of $y = 1/x$ between the values $x = k$ and $x = k + 1$, that is,
$$\text{area } R_k = \frac{1}{k} > \int_k^{k+1} \frac{1}{x}\,dx = \ln(k+1) - \ln k.$$
Summing from $k = 1$ to $k = n$, we get
$$s_n = 1 + \frac12 + \dots + \frac1n > \sum_{k=1}^{n} \bigl(\ln(k+1) - \ln k\bigr) = \ln(n+1) - \ln 1 = \ln(n+1).$$
But $\ln(n+1) > \ln n$ which becomes arbitrarily large for large enough $n$. So we conclude that $(s_n)$ is unbounded and so $\sum_{k=1}^{\infty} a_k$, with $a_k = 1/k$, is divergent despite the fact that $a_n = 1/n \to 0$ as $n \to \infty$.
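An alternative argument is as follows. One notices that
\[
1 + \frac{1}{2} + \underbrace{\frac{1}{3} + \frac{1}{4}}_{>\,\frac{1}{2}} + \underbrace{\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}}_{>\,4\cdot\frac{1}{8}\,=\,\frac{1}{2}} + \underbrace{\frac{1}{9} + \dots + \frac{1}{16}}_{>\,8\cdot\frac{1}{16}\,=\,\frac{1}{2}} + \underbrace{\frac{1}{17} + \dots + \frac{1}{32}}_{>\,16\cdot\frac{1}{32}\,=\,\frac{1}{2}} + \dots
\]
and so we see that
\[
s_1 = 1, \quad s_2 = s_1 + \tfrac{1}{2}, \quad s_4 > s_2 + \tfrac{1}{2}, \quad s_8 > s_4 + \tfrac{1}{2}, \quad s_{16} > s_8 + \tfrac{1}{2}, \ \dots
\]
and so it follows that $s_{2^j} \ge (j + 2)\,\tfrac{1}{2}$ for $j \in \mathbb{N}$. (This inequality is strict for $j > 1$, but this is of no consequence here.) So the sequence $(s_n)$ is not bounded.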
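The two lower bounds obtained above can be checked numerically. The following is only an illustrative sketch (the function name `harmonic` is our own choice, not part of the notes); it computes the partial sums $s_n$ of the harmonic series for $n = 2^j$ and confirms both $s_n > \ln(n+1)$ and $s_{2^j} \ge (j+2)/2$ for the values tried.

```python
import math


def harmonic(n: int) -> float:
    """Return the partial sum s_n = 1 + 1/2 + ... + 1/n."""
    return math.fsum(1.0 / k for k in range(1, n + 1))


for j in range(1, 15):
    n = 2 ** j
    s = harmonic(n)
    # Integral comparison from the text: s_n > ln(n + 1).
    assert s > math.log(n + 1)
    # Grouping argument from the text: s_{2^j} >= (j + 2)/2.
    assert s >= (j + 2) / 2
    print(f"j={j:2d}  n={n:6d}  s_n={s:8.4f}  "
          f"ln(n+1)={math.log(n + 1):8.4f}  (j+2)/2={(j + 2) / 2:4.1f}")
```

Both bounds grow without limit, but very slowly: doubling $n$ is only guaranteed to add about $\tfrac{1}{2}$ to the partial sum, which is why the divergence is invisible in a short table of values.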
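The next result tells us that we can do arithmetic with convergent series just as we would expect.
Proposition 4.5. Suppose that $\sum_{k=1}^{\infty} a_k$ and $\sum_{k=1}^{\infty} b_k$ are convergent series. Then the series $\sum_{k=1}^{\infty} \lambda a_k$, for any $\lambda \in \mathbb{R}$, and $\sum_{k=1}^{\infty} (a_k + b_k)$ are also convergent and have sums such that
\[
\sum_{k=1}^{\infty} \lambda a_k = \lambda \sum_{k=1}^{\infty} a_k \quad\text{and}\quad \sum_{k=1}^{\infty} (a_k + b_k) = \sum_{k=1}^{\infty} a_k + \sum_{k=1}^{\infty} b_k .
\]
Proof. We just need to look at the partial sums and their limits. So let $s_n = \sum_{k=1}^{n} a_k$ and let $t_n = \sum_{k=1}^{n} b_k$, and let $\alpha = \sum_{k=1}^{\infty} a_k$ and $\beta = \sum_{k=1}^{\infty} b_k$. By hypothesis, we know that $s_n \to \alpha$ and $t_n \to \beta$ as $n \to \infty$. But
\[
\sum_{k=1}^{n} \lambda a_k = \lambda s_n \to \lambda\alpha \quad\text{and}\quad \sum_{k=1}^{n} (a_k + b_k) = s_n + t_n \to \alpha + \beta
\]
as $n \to \infty$. It follows that $\sum_{k=1}^{\infty} \lambda a_k$ is convergent with sum $\lambda\alpha = \lambda \sum_{k=1}^{\infty} a_k$ and also that $\sum_{k=1}^{\infty} (a_k + b_k)$ is convergent with sum given by $\alpha + \beta = \sum_{k=1}^{\infty} a_k + \sum_{k=1}^{\infty} b_k$, as required.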
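Example 4.6. For $k \in \mathbb{N}$, let $a_k = 9/10^k$. Then
\[
\sum_{k=1}^{\infty} a_k = \frac{9}{10} + \frac{9}{10^2} + \frac{9}{10^3} + \dots
\]
which is usually referred to as $0.9\ldots$ recurring. Is this series convergent and if so, what is its sum? We see that
\[
\sum_{k=1}^{\infty} a_k = \sum_{k=1}^{\infty} \frac{9}{10^k} = 9 \sum_{k=1}^{\infty} \frac{1}{10^k} .
\]
But we have seen earlier that $\sum_{k=1}^{\infty} \frac{1}{10^k}$ is convergent with sum equal to $\frac{1}{10} \big/ \bigl(1 - \frac{1}{10}\bigr) = \frac{1}{9}$. It follows that
\[
\sum_{k=1}^{\infty} \frac{9}{10^k} = 9 \cdot \frac{1}{9} = 1 .
\]
\[
\sum_{k=1}^{\infty} \frac{9}{10^k} = 0.9999\ldots = 1
\]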
Continuing with this theme, we have the following.
Example 4.7. Let $(a_k)_{k \in \mathbb{N}}$ be any sequence of integers taking values in the set $\{0, 1, 2, \dots, 9\}$. Then $\sum_{k=1}^{\infty} a_k/10^k$ is convergent with sum lying in the interval $[0, 1]$.
To see this, let $s_n = \sum_{k=1}^{n} \frac{a_k}{10^k}$ denote the $n$th partial sum of the series $\sum_{k=1}^{\infty} a_k/10^k$. We note that $s_{n+1} - s_n = a_{n+1}/10^{n+1} \ge 0$ and so the sequence of partial sums $(s_n)$ is monotone increasing. Furthermore, since each $a_k$ is an integer in the range 0 to 9, it follows that $a_k/10 \le 9/10$ and so we can say that $a_k/10^k \le 9/10^k$. Hence, for any $n \in \mathbb{N}$,
\[
s_n = \sum_{k=1}^{n} \frac{a_k}{10^k} \le \sum_{k=1}^{n} \frac{9}{10^k} = 9\,\frac{\frac{1}{10} - \bigl(\frac{1}{10}\bigr)^{n+1}}{1 - \frac{1}{10}} < 9\,\frac{\frac{1}{10}}{1 - \frac{1}{10}} = 1 .
\]
We have shown that the sequence $(s_n)$ is monotone increasing and bounded and therefore it converges. Hence, by definition, the series $\sum_{k=1}^{\infty} a_k/10^k$ is convergent.
We must now show that $\sum_{k=1}^{\infty} a_k/10^k = \alpha$, say, lies between 0 and 1. However, each $s_n \ge 0$ and since $s_n \to \alpha$ as $n \to \infty$, it follows that it must also be true that $\alpha \ge 0$. Furthermore, we have seen that $s_n < 1$ and so we have $1 - s_n > 0$. But $1 - s_n \to 1 - \alpha$ and so it also follows that $1 - \alpha \ge 0$. Hence $0 \le \alpha \le 1$, as claimed.
We can interpret this as saying that every infinite decimal represents some real number $x$ lying in the range $0 \le x \le 1$. The converse is also true.
Example 4.8. Let $x$ be a real number satisfying $0 \le x \le 1$. Then there is a sequence of integers $(a_k)_{k \in \mathbb{N}}$ with values in $\{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}$ such that the series $\sum_{k=1}^{\infty} a_k/10^k$ is convergent with sum equal to $x$.
To show this, we must construct the sequence $(a_k)$. We know how to do this for $x = 1$ (take $a_k = 9$ for all $k$), so let us suppose now that $0 \le x < 1$. If we are told that
\[
x = \frac{a_1}{10} + \frac{a_2}{10^2} + \frac{a_3}{10^3} + \dots
\]
then evidently,
\[
10x = a_1 + \frac{a_2}{10} + \frac{a_3}{10^2} + \dots
\]
so that $a_1$ is the integer part of $10x$, $a_1 = [10x]$. Similarly,
\[
100x = 10a_1 + a_2 + \frac{a_3}{10} + \dots
\]
so that $a_2$ is the integer part of $100x - 10a_1$, $a_2 = [100x - 10[10x]]$. In this way, we can write any $a_k$ in terms of $x$. We simply use this idea to construct the $a_k$s.
We isolate the following fact: if $u$ is any real number with $0 \le u < 1$, then there is an integer $a$ in the set $\{0, 1, 2, \dots, 8, 9\}$ and a real number $\rho$ obeying $0 \le \rho < 1$ such that $10u = a + \rho$. To see this, we first note that $0 \le 10u < 10$ and so $[10u]$, the integer part of $10u$, lies in the set $\{0, 1, 2, \dots, 8, 9\}$. Let $a = [10u]$ and let $\rho = 10u - [10u]$. Since $0 \le w - [w] < 1$ for any real number $w$, we see that $0 \le \rho < 1$ and $10u = a + \rho$ as required.
Since $0 \le x < 1$, as noted above, we can write $10x$ as $10x = a_1 + \rho_1$ where $a_1$ is an integer in the set $\{0, 1, 2, \dots, 8, 9\}$ and $0 \le \rho_1 < 1$. Then
\[
x = \frac{a_1}{10} + \frac{\rho_1}{10} .
\]
Now, with $\rho_1$ instead of $x$, we can say that we can write $\rho_1$ as
\[
\rho_1 = \frac{a_2}{10} + \frac{\rho_2}{10}
\]
for some integer $a_2$ in the set $\{0, 1, 2, \dots, 8, 9\}$ and some real number $\rho_2$ obeying $0 \le \rho_2 < 1$. Then
\[
x = \frac{a_1}{10} + \frac{\rho_1}{10} = \frac{a_1}{10} + \frac{a_2}{10^2} + \frac{\rho_2}{10^2} .
\]
Repeating this for $\rho_2$, we get
\[
x = \frac{a_1}{10} + \frac{a_2}{10^2} + \frac{a_3}{10^3} + \frac{\rho_3}{10^3}
\]
with $a_3 \in \{0, 1, 2, \dots, 8, 9\}$ and $0 \le \rho_3 < 1$.
Continuing in this way, we construct integers $a_n$ in the range 0 to 9 and real numbers $\rho_n$ obeying $0 \le \rho_n < 1$ such that
\[
x = \underbrace{\frac{a_1}{10} + \frac{a_2}{10^2} + \frac{a_3}{10^3} + \dots + \frac{a_n}{10^n}}_{s_n} + \frac{\rho_n}{10^n} .
\]
Finally, we note that
\[
|x - s_n| = \frac{\rho_n}{10^n} \le \frac{9}{10^n} \to 0
\]
as $n \to \infty$, that is, $s_n \to x$ as $n \to \infty$ and so it follows that the series $\sum_{k=1}^{\infty} a_k/10^k$ converges with sum equal to $x$, and the proof is complete.
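The construction just described is an algorithm: repeatedly multiply the current remainder by 10 and split off the integer part. The sketch below illustrates it with exact rational arithmetic, so it only handles rational $x$. The names `decimal_digits`, `partial_sum` and the variable `rho` (the remainder at each step) are our own labels rather than notation fixed by the notes.

```python
from fractions import Fraction


def decimal_digits(x: Fraction, n: int) -> list[int]:
    """Return the first n digits a_1, ..., a_n of x, where 0 <= x < 1."""
    if not 0 <= x < 1:
        raise ValueError("x must satisfy 0 <= x < 1")
    digits = []
    rho = x                      # current remainder, always in [0, 1)
    for _ in range(n):
        a = int(10 * rho)        # integer part of 10 * rho (rho >= 0)
        rho = 10 * rho - a       # new remainder, again in [0, 1)
        digits.append(a)
    return digits


def partial_sum(digits: list[int]) -> Fraction:
    """Return s_n = a_1/10 + a_2/10^2 + ... + a_n/10^n exactly."""
    return sum((Fraction(a, 10 ** k) for k, a in enumerate(digits, start=1)),
               Fraction(0))


x = Fraction(1, 7)
digits = decimal_digits(x, 12)
s = partial_sum(digits)
print(digits)
# The error x - s_n equals remainder/10^n, so it lies in [0, 1/10^n).
assert 0 <= x - s < Fraction(1, 10 ** 12)
```

Exact fractions are used here because floating-point numbers would themselves introduce rounding after about sixteen digits, which would obscure the point of the construction.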
This provides another proof that any given real number is the limit of a sequence of rationals. Indeed, for $b \in \mathbb{R}$, write $b = [b] + x$ where $[b]$ is the integer part of $b$ and $0 \le x < 1$. As discussed above, $x = \lim s_n$ where each $s_n$ is the partial sum of a series with rational terms of the form $a_k/10^k$ for suitable $a_k \in \{0, 1, 2, \dots, 9\}$. In particular, each $s_n$ is rational and so is $[b] + s_n$. However, $[b] + s_n \to [b] + x = b$ and the result follows.
Since $0.99\ldots = 1 = 1.00\ldots$ it is clear that the decimal expansion of a real number need not be unique. Indeed, further examples are provided by $0.5 = 0.499\ldots$ or $0.63 = 0.6299\ldots$ and so on. However, this is the only possible kind of ambiguity, as the next theorem shows.
Theorem 4.9. Suppose that $0 \le x < 1$ and that
\[
x = 0.a_1 a_2 \ldots = 0.b_1 b_2 \ldots ,
\]
that is,
\[
x = \sum_{k=1}^{\infty} \frac{a_k}{10^k} = \sum_{k=1}^{\infty} \frac{b_k}{10^k}
\]
where each $a_k$ and $b_k$ belong to $\{0, 1, 2, \dots, 8, 9\}$. Then either $a_k = b_k$ for all $k \in \mathbb{N}$ or else there is some $N \in \mathbb{N}$ such that $a_k = b_k$ for $1 \le k < N$ and either $a_N = b_N + 1$ and $a_k = 0$ and $b_k = 9$ for all $k > N$, or $b_N = a_N + 1$ and $b_k = 0$ and $a_k = 9$ for all $k > N$.
Proof. We will use the following result.
Lemma 4.10. Suppose that $0 \le \gamma_k \le 9$ and that $\sum_{k=1}^{\infty} \gamma_k/10^k = 0$. Then $\gamma_k = 0$ for all $k \in \mathbb{N}$.
Proof of Lemma. Let $s_n = \sum_{k=1}^{n} \gamma_k/10^k$ denote the partial sums of the series $\sum_{k=1}^{\infty} \gamma_k/10^k$. Then $s_{n+1} - s_n = \gamma_{n+1}/10^{n+1} \ge 0$ so that $(s_n)$ is a positive increasing sequence. Moreover, each $s_n$ obeys $s_n \le \sum_{k=1}^{n} 9/10^k < 1$ so that $(s_n)$ converges, that is, $\sum_{k=1}^{\infty} \gamma_k/10^k$ is a convergent series. Its value obeys $s_n \le \sum_{k=1}^{\infty} \gamma_k/10^k$. Hence, for any $m \in \mathbb{N}$,
\[
0 \le \frac{\gamma_m}{10^m} \le s_m \le \sum_{k=1}^{\infty} \frac{\gamma_k}{10^k} = 0
\]
so that it follows that $\gamma_m = 0$ as claimed and the proof of the lemma is complete.
We turn now to the proof of the theorem.
Case 1: $x = 0$.
In this case, we have $0 = x = \sum_{k=1}^{\infty} a_k/10^k$ with $a_k \in \{0, 1, \dots, 9\}$. By the Lemma, we conclude that $a_k = 0$ for all $k \in \mathbb{N}$. The same argument applies to $(b_k)$. Hence $a_k = b_k = 0$ for $k \in \mathbb{N}$.
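Case 2: $0 < x < 1$.
Suppose that
\[
x = \sum_{k=1}^{\infty} \frac{a_k}{10^k} = \sum_{k=1}^{\infty} \frac{b_k}{10^k}
\]
and it is false that $a_k = b_k$ for all $k \in \mathbb{N}$. Let $N$ be the smallest integer for which $a_k \ne b_k$, so that $a_k = b_k$ for $1 \le k < N$ but $a_N \ne b_N$.
Suppose that $a_N > b_N$. Then $0 \le b_N < a_N \le 9$ and $a_N \ge 1$. We have
\[
0 = x - x = \sum_{k=1}^{\infty} \frac{a_k}{10^k} - \sum_{k=1}^{\infty} \frac{b_k}{10^k} = \sum_{k=1}^{\infty} \frac{(a_k - b_k)}{10^k} = \sum_{k=N}^{\infty} \frac{(a_k - b_k)}{10^k} = \frac{(a_N - b_N)}{10^N} + \frac{(a_{N+1} - b_{N+1})}{10^{N+1}} + \dots
\]
Multiplying by $10^N$, we see that
\[
0 = (a_N - b_N) + \frac{(a_{N+1} - b_{N+1})}{10} + \dots = (a_N - b_N) + \sum_{n=1}^{\infty} \frac{c_n}{10^n}
\]
where $c_n = a_{N+n} - b_{N+n}$ for all $n \in \mathbb{N}$. Now, we can write $(a_N - b_N)$ as
\[
a_N - b_N = c + 1
\]
where $c$ is an integer with $c \ge 0$. We also note that each $c_n$ belongs to the set $\{-9, -8, \dots, 8, 9\}$. Hence, writing $1 = \sum_{n=1}^{\infty} 9/10^n$, we get
\[
0 = c + \sum_{n=1}^{\infty} \frac{9}{10^n} + \sum_{n=1}^{\infty} \frac{c_n}{10^n}
\]
that is,
\[
0 = c + \sum_{n=1}^{\infty} \frac{(9 + c_n)}{10^n} .
\]
Now, $9 + c_n \ge 0$ and $c \ge 0$ so that both terms on the right hand side above are non-negative. It must be the case that $c = 0$ and also $\sum_{n=1}^{\infty} \frac{(9 + c_n)}{10^n} = 0$. But then $c_n = -9$ for all $n \in \mathbb{N}$, by the Lemma (whose proof uses only that the terms are non-negative and that the series has sum zero).
Hence $a_N = b_N + 1$ and $a_{N+n} - b_{N+n} = -9$, which implies that $a_{N+n} = 0$ and $b_{N+n} = 9$ for all $n \in \mathbb{N}$. The case $b_N > a_N$ is handled in the same way, with the roles of $a$ and $b$ interchanged, and the result follows.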
Returning now to the general theory, it is clear that the convergence of a series will not be affected by changing the values of a few terms, although of course, this will change the value of its sum. This is confirmed formally in the next proposition.
Proposition 4.11. Suppose that $\sum_{k=1}^{\infty} a_k$ is a convergent series and $\sum_{k=1}^{\infty} b_k$ is any series such that $b_k = a_k$ except for at most finitely many $k$. Then $\sum_{k=1}^{\infty} b_k$ is also convergent.
Proof. As always, we look at the partial sums, so let $s_n = \sum_{k=1}^{n} a_k$ and let $t_n = \sum_{k=1}^{n} b_k$. Evidently,
\[
t_n = \sum_{k=1}^{n} b_k = \sum_{k=1}^{n} \underbrace{(b_k - a_k)}_{=\,c_k,\ \text{say}} + \sum_{k=1}^{n} a_k = \sum_{k=1}^{n} c_k + s_n .
\]
Next, let $u_n = \sum_{k=1}^{n} c_k$. Now, by hypothesis, $c_k = 0$ except for at most finitely many $k$. In other words, there is some $N \in \mathbb{N}$ such that $c_k = 0$ for all $k > N$. This means that $u_n$ is eventually constant,
\[
u_n = \sum_{k=1}^{n} c_k = \sum_{k=1}^{N} c_k = u_N ,
\]
whenever $n > N$, and so $(u_n)$ converges (to the value $u_N$). But
\[
t_n = u_n + s_n
\]
and since the right hand side converges, so does the left hand side and the result follows.
Theorem 4.12 (Comparison Test for positive series). Suppose $0 \le a_k \le b_k$ for all $k \in \mathbb{N}$ and that $\sum_{k=1}^{\infty} b_k$ converges. Then $\sum_{k=1}^{\infty} a_k$ also converges.
Proof. Let $s_n = \sum_{k=1}^{n} a_k$ and $t_n = \sum_{k=1}^{n} b_k$. By hypothesis, $(t_n)$ converges and so $(t_n)$ is a bounded sequence. Therefore there is some $M > 0$ such that
\[
t_n \le M
\]
for all $n \in \mathbb{N}$. But since $0 \le a_k \le b_k$, it follows that $s_n \le t_n$ and so the sequence $(s_n)$ of partial sums is bounded above (by $M$). Furthermore, $s_{n+1} - s_n = a_{n+1} \ge 0$ and so $(s_n)$ is monotone increasing. However, we know that a monotone increasing sequence which is bounded above must converge. Hence the result.
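Example 4.13. Consider the series
\[
\sum_{k=1}^{\infty} \frac{1}{k^2} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \dots .
\]
Let $s_n = \sum_{k=1}^{n} \frac{1}{k^2}$ denote the $n$th partial sum. Then $(s_n)$ is increasing. Furthermore, for $n > 1$, we see that
\[
\begin{aligned}
s_n &= 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \dots + \frac{1}{n^2} \\
&< 1 + \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \frac{1}{3 \cdot 4} + \dots + \frac{1}{(n-1)n} \\
&= 1 + \Bigl(1 - \frac{1}{2}\Bigr) + \Bigl(\frac{1}{2} - \frac{1}{3}\Bigr) + \Bigl(\frac{1}{3} - \frac{1}{4}\Bigr) + \dots + \Bigl(\frac{1}{n-1} - \frac{1}{n}\Bigr) \\
&= 2 - \frac{1}{n} < 2
\end{aligned}
\]
and so $(s_n)$ is bounded from above and therefore must converge. Hence, by definition, $\sum_{k=1}^{\infty} \frac{1}{k^2}$ is convergent. Note, however, that this discussion gives us no hint as to the value of its sum. This is an example where the convergence of a series can quite sensibly be discussed without actually knowing what its sum is.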
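The telescoping bound in this example is easy to confirm by machine. The following is a small sketch using exact rational arithmetic (the helper name `check_bound` is ours); it verifies that the partial sums of $\sum 1/k^2$ increase and stay strictly below $2 - 1/n$ for every $n$ from 2 up to a chosen limit.

```python
from fractions import Fraction


def check_bound(n_max: int) -> Fraction:
    """Check s_{n-1} < s_n < 2 - 1/n for 2 <= n <= n_max; return s_{n_max}."""
    s = Fraction(1)                      # s_1 = 1
    for n in range(2, n_max + 1):
        previous = s
        s += Fraction(1, n * n)          # s_n = s_{n-1} + 1/n^2
        assert previous < s              # the partial sums are increasing
        assert s < 2 - Fraction(1, n)    # the telescoping bound from the text
    return s


s_200 = check_bound(200)
print(float(s_200))
```

As the text points out, a bound of this kind establishes convergence without identifying the limit; the printed value is only an approximation from below.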
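Example 4.14. What about the series $\sum_{k=1}^{\infty} \frac{1}{k^4}$?
Since $\frac{1}{k^4} \le \frac{1}{k^2}$ for all $k \in \mathbb{N}$, we can apply the Comparison Test for positive series to deduce that $\sum_{k=1}^{\infty} \frac{1}{k^4}$ is convergent. Indeed, since $\frac{1}{k^{\alpha}} \le \frac{1}{k^2}$ for all $k \in \mathbb{N}$ for any $\alpha \ge 2$, we can say that the series $\sum_{k=1}^{\infty} 1/k^{\alpha}$ is convergent whenever $\alpha \ge 2$.
\[
\sum_{k=1}^{\infty} \frac{1}{k^{2+\delta}} \quad \text{is convergent for any } \delta \ge 0
\]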
Example 4.15. What about the series $\sum_{k=1}^{\infty} \frac{1}{\sqrt{k}}$?
If this series were convergent, then we could use the inequality
\[
\frac{1}{k} \le \frac{1}{\sqrt{k}} ,
\]
for every $k \in \mathbb{N}$, together with the Comparison Test for positive series to conclude that the series $\sum_{k=1}^{\infty} 1/k$ were convergent. However, we know this not to be the case. It follows that the series $\sum_{k=1}^{\infty} 1/\sqrt{k}$ is not convergent.
Indeed, we can apply this reasoning to the series $\sum_{k=1}^{\infty} 1/k^{\alpha}$ for any $\alpha \le 1$.
\[
\sum_{k=1}^{\infty} \frac{1}{k^{\alpha}} \quad \text{is not convergent for any } \alpha \le 1
\]


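Example 4.16. We have seen above that the series $\sum_{k=1}^{\infty} \frac{1}{k^{\alpha}}$ is convergent for $\alpha \ge 2$ but not convergent for $\alpha \le 1$. It is natural to ask what happens for values of $\alpha$ lying in the range $1 < \alpha < 2$. We shall see that the series is convergent for all $\alpha > 1$.
Write $\alpha = 1 + \delta$, where $\delta > 0$, and let $s_n = \sum_{k=1}^{n} 1/k^{\alpha}$. Evidently $(s_n)$ is an increasing sequence so if we can show that it is bounded, then we will be able to conclude that it converges. The idea is to compare the terms $1/k^{\alpha}$ with the integral of the function $y = 1/x^{\alpha}$ over unit intervals. In fact, over the range $k \le x \le k + 1$, the function $y = 1/x^{1+\delta}$ is greater than or equal to $1/(k+1)^{1+\delta}$ and so
\[
\frac{1}{(k+1)^{1+\delta}} \le \int_k^{k+1} \frac{dx}{x^{1+\delta}} .
\]
Summing over $k$, we find that
\[
\begin{aligned}
s_n &= 1 + \frac{1}{2^{\alpha}} + \frac{1}{3^{\alpha}} + \dots + \frac{1}{n^{\alpha}} \\
&\le 1 + \int_1^2 \frac{dx}{x^{1+\delta}} + \int_2^3 \frac{dx}{x^{1+\delta}} + \dots + \int_{n-1}^{n} \frac{dx}{x^{1+\delta}} \\
&= 1 + \int_1^n \frac{dx}{x^{1+\delta}} = 1 + \Bigl[ -\frac{1}{\delta x^{\delta}} \Bigr]_1^n = 1 + \frac{1}{\delta}\Bigl(1 - \frac{1}{n^{\delta}}\Bigr) \le 1 + \frac{1}{\delta} .
\end{aligned}
\]
We see that the sequence $(s_n)$ is bounded from above and since it is also increasing, it must converge.
\[
\sum_{k=1}^{\infty} \frac{1}{k^{\alpha}} \quad \text{is convergent for all } \alpha > 1 \text{ and divergent for all } \alpha \le 1
\]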
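The integral bound from this example can be tested numerically. The sketch below writes the exponent as $1 + \delta$ with $\delta > 0$ (the names `alpha`, `delta` and `p_series_partial_sum` are labels chosen for this illustration) and checks that the partial sums never exceed $1 + \int_1^n x^{-(1+\delta)}\,dx = 1 + (1 - n^{-\delta})/\delta$, which in turn is at most $1 + 1/\delta$.

```python
import math


def p_series_partial_sum(alpha: float, n: int) -> float:
    """Return s_n = 1 + 1/2^alpha + ... + 1/n^alpha."""
    return math.fsum(1.0 / k ** alpha for k in range(1, n + 1))


for delta in (1.0, 0.5, 0.1):
    alpha = 1.0 + delta
    for n in (10, 1000, 100000):
        s_n = p_series_partial_sum(alpha, n)
        # 1 + integral of x^(-(1 + delta)) from 1 to n, evaluated exactly.
        integral_bound = 1.0 + (1.0 - n ** (-delta)) / delta
        assert s_n <= integral_bound <= 1.0 + 1.0 / delta
        print(f"delta={delta:4.1f}  n={n:6d}  s_n={s_n:8.4f}  "
              f"bound={integral_bound:8.4f}")
```

For small $\delta$ the uniform bound $1 + 1/\delta$ becomes large, reflecting how slowly the series converges when the exponent is only slightly greater than 1.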


This technique of comparing terms of a series with integrals can be quite useful. The general idea is contained in the following theorem.
Theorem 4.17 (Integral Test). Suppose that $\varphi : [1, \infty) \to \mathbb{R}$ is a positive decreasing function such that the sequence of integrals $\bigl(\int_1^n \varphi(x)\,dx\bigr)_{n \in \mathbb{N}}$ converges as $n \to \infty$. Then $\sum_{n=1}^{\infty} \varphi(n)$ is convergent.
Proof. Since $\varphi(x) \ge 0$, the sequence of partial sums $s_n = \sum_{k=1}^{n} \varphi(k)$ is increasing. Now, because $\varphi$ is decreasing, it follows that $\varphi(k) \le \varphi(x)$ for all $x \in [k-1, k]$ for all $k \ge 2$. Hence $\int_{k-1}^{k} (\varphi(k) - \varphi(x))\,dx \le 0$, that is, $\varphi(k) \le \int_{k-1}^{k} \varphi(x)\,dx$. Therefore
\[
\begin{aligned}
s_n &= \varphi(1) + \varphi(2) + \varphi(3) + \dots + \varphi(n) \\
&\le \varphi(1) + \int_1^2 \varphi(x)\,dx + \int_2^3 \varphi(x)\,dx + \dots + \int_{n-1}^{n} \varphi(x)\,dx \\
&= \varphi(1) + \int_1^n \varphi(x)\,dx .
\end{aligned}
\]
By hypothesis, the sequence of integrals $\bigl(\int_1^n \varphi(x)\,dx\bigr)$ converges and so is bounded. It follows that the sequence of partial sums $(s_n)$ is bounded from above and therefore converges. The result follows.
This theorem can be rephrased in a slightly more general form, as follows.
Theorem 4.18 (Integral Test). Let $(a_n)$ be a sequence of positive real numbers and suppose that there is some positive function $\varphi$ such that the sequence of integrals $\bigl(\int_1^n \varphi(x)\,dx\bigr)_{n \in \mathbb{N}}$ converges as $n \to \infty$ and such that, for each $k \ge 2$,
\[
a_k \le \varphi(x)
\]
for all $(k - 1) \le x \le k$. Then $\sum_{k=1}^{\infty} a_k$ is convergent.
Proof. As usual, let $s_n = \sum_{k=1}^{n} a_k$. Then $(s_n)_{n \in \mathbb{N}}$ is an increasing sequence (because each $a_k \ge 0$). We need only show that $(s_n)$ is bounded. To see this, note that for $k \ge 2$,
\[
a_k = \int_{k-1}^{k} a_k\,dx \le \int_{k-1}^{k} \varphi(x)\,dx
\]
and so
\[
\begin{aligned}
s_n &= a_1 + a_2 + \dots + a_n \\
&\le a_1 + \int_1^2 \varphi(x)\,dx + \int_2^3 \varphi(x)\,dx + \dots + \int_{n-1}^{n} \varphi(x)\,dx \\
&= a_1 + \int_1^n \varphi(x)\,dx .
\end{aligned}
\]
Now, the sequence $\bigl(\int_1^n \varphi(x)\,dx\bigr)$ converges, by hypothesis, and so it is bounded and therefore there is a constant $C$ such that $\int_1^n \varphi(x)\,dx \le C$ for all $n \in \mathbb{N}$. Hence, for any $n$,
\[
0 \le s_n \le a_1 + \int_1^n \varphi(x)\,dx \le a_1 + C
\]
which shows that $(s_n)$ is a bounded sequence and the result follows.
The following test for convergence of positive series is very useful.
Theorem 4.19 (D'Alembert's Ratio Test for positive series).
Suppose that $a_n > 0$ for all $n$.
(i) Suppose that there is some $0 < \rho < 1$ and some $N \in \mathbb{N}$ such that if $n > N$ then $\dfrac{a_{n+1}}{a_n} < \rho$. Then the series $\sum_{n=1}^{\infty} a_n$ is convergent.
(ii) If there is $N' \in \mathbb{N}$ such that $\dfrac{a_{n+1}}{a_n} \ge 1$ for all $n > N'$, then the series $\sum_{n=1}^{\infty} a_n$ is divergent.
Proof. (i) Suppose that $0 < \rho < 1$ and that $\dfrac{a_{n+1}}{a_n} < \rho$ for $n > N$. Then
\[
\begin{aligned}
a_{N+2} &< a_{N+1}\,\rho \\
a_{N+3} &< a_{N+2}\,\rho < a_{N+1}\,\rho^2 \\
a_{N+4} &< a_{N+3}\,\rho < a_{N+1}\,\rho^3 \\
&\ \ \vdots \\
a_{N+k+1} &< a_{N+1}\,\rho^k .
\end{aligned}
\]
Hence $a_{N+k+1} < K\rho^{N+k+1}$ for all $k \ge 1$, where we have let $K = \dfrac{a_{N+1}}{\rho^{N+1}}$. In other words, we have $a_n < K\rho^n$ for all $n > N + 1$. Now we construct a new sequence $(u_n)$ by setting
\[
u_n = \begin{cases} 0, & n \le N + 1, \\ a_n, & n > N + 1 . \end{cases}
\]
Then certainly $u_n < K\rho^n$ for all $n$. Now, we know that $\sum_{n=1}^{\infty} K\rho^n$ is convergent (with sum $K\rho/(1 - \rho)$) because $0 < \rho < 1$.
By the Comparison Test, it follows that $\sum_{n=1}^{\infty} u_n$ is convergent. However, $a_n = u_n$ eventually and so it follows (by Proposition 4.11) that $\sum_{n=1}^{\infty} a_n$ is also convergent and the proof of (i) is complete.
(ii) Suppose now that $\dfrac{a_{n+1}}{a_n} \ge 1$ for all $n > N'$. Then for any $k \in \mathbb{N}$,
\[
a_{N'+k} \ge a_{N'+k-1} \ge \dots \ge a_{N'+1} > 0 .
\]
This means that it is impossible for $a_n \to 0$ as $n \to \infty$ (every term after the $(N'+1)$th is at least as large as $a_{N'+1}$). We conclude that $\sum_{n=1}^{\infty} a_n$ must be divergent.
There is another (weaker) but also very useful version of this theorem.
Theorem 4.20 (D'Alembert's Ratio Test for positive series (2nd version)).
Suppose that $a_n > 0$ for all $n$ and that $\dfrac{a_{n+1}}{a_n} \to L$ as $n \to \infty$.
(i) If $L < 1$, then $\sum_{n=1}^{\infty} a_n$ is convergent.
(ii) If $L > 1$, then $\sum_{n=1}^{\infty} a_n$ is divergent.
(There is no claim as to what happens when $L = 1$.)
Proof. (i) Suppose that $\dfrac{a_{n+1}}{a_n} \to L$ where $0 \le L < 1$. Then for any $\varepsilon > 0$, we may say that eventually $\dfrac{a_{n+1}}{a_n} \in (L - \varepsilon, L + \varepsilon)$. In particular, $\dfrac{a_{n+1}}{a_n} < L + \varepsilon$ eventually. Let $\varepsilon$ be so small that $L + \varepsilon < 1$; then $\dfrac{a_{n+1}}{a_n} < \rho$ where $\rho = L + \varepsilon < 1$. By the previous version of the Theorem, it follows that $\sum_{n=1}^{\infty} a_n$ is convergent.
(ii) Now suppose that $L > 1$. Then for any $\varepsilon > 0$, $\dfrac{a_{n+1}}{a_n}$ eventually belongs to the interval $(L - \varepsilon, L + \varepsilon)$. In particular, eventually $\dfrac{a_{n+1}}{a_n} > L - \varepsilon$. But $L > 1$, so if $\varepsilon > 0$ is chosen so small that $L - \varepsilon > 1$, then we may say that eventually $\dfrac{a_{n+1}}{a_n} > L - \varepsilon > 1$ and so $\sum_{n=1}^{\infty} a_n$ is divergent, by the previous version of the Theorem.
Example 4.21. What can be said when $L = 1$? Without further analysis, the answer is "nothing". Indeed, there are examples of series which converge when $L = 1$ and other examples of series which diverge when $L = 1$.
For example, we know that $\sum_{k=1}^{\infty} 1/k^2$ is convergent and we see that $a_{n+1}/a_n = n^2/(n+1)^2 \to 1$ as $n \to \infty$, so that $L = 1$ in this case. However, we also know that $\sum_{k=1}^{\infty} 1/k$ is divergent, but here again, we see that $a_{n+1}/a_n = n/(n+1) \to 1 = L$ as $n \to \infty$.
When $L = 1$, the Ratio Test tells us nothing.
Example 4.22. For fixed $0 \le c < 1$, the series $\sum_{k=1}^{\infty} k c^k$ is convergent.
If $c = 0$, there is nothing to prove, so suppose that $0 < c < 1$. Setting $a_n = n c^n$, we see that
\[
\frac{a_{n+1}}{a_n} = \frac{(n+1)c^{n+1}}{n c^n} = \frac{(n+1)c}{n} \to c
\]
as $n \to \infty$. Since $a_n > 0$ for all $n$ and since $L = c < 1$, we can apply the Ratio Test to conclude that $\sum_{k=1}^{\infty} k c^k$ is convergent.
The same argument shows that for any power $p$, the series $\sum_{k=1}^{\infty} k^p c^k$ is convergent (provided $0 \le c < 1$).
For any $0 \le c < 1$ and any $p \in \mathbb{N}$, the series $\sum_{k=1}^{\infty} k^p c^k$ is convergent.
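To see the Ratio Test at work on this example, one can tabulate the ratios $a_{n+1}/a_n$ for $a_n = n c^n$ and watch them settle down to $c$. The sketch below is purely illustrative (the function `ratios_and_sum` is our own helper). It also compares the partial sum with the closed form $c/(1-c)^2$ for $\sum k c^k$, a standard identity which is not derived in this section and is quoted here only as a sanity check.

```python
def ratios_and_sum(c: float, n_terms: int) -> tuple[list[float], float]:
    """Return the ratios a_{n+1}/a_n and the partial sum for a_n = n * c**n."""
    terms = [k * c ** k for k in range(1, n_terms + 1)]
    ratios = [terms[i + 1] / terms[i] for i in range(n_terms - 1)]
    return ratios, sum(terms)


c = 0.5
ratios, total = ratios_and_sum(c, 200)
print(ratios[:5], ratios[-1])

# The ratios (n + 1)c/n decrease towards the limit L = c < 1.
assert all(r > c for r in ratios)
assert abs(ratios[-1] - c) < 0.01
# Sanity check against the closed form c/(1 - c)^2 (equal to 2 when c = 1/2).
assert abs(total - c / (1 - c) ** 2) < 1e-9
```

The first ratio is $2c$, which may well exceed 1 when $c > \tfrac12$; the test only cares about what the ratios do eventually.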
Theorem 4.23 ($n$th Root Test). Suppose that $a_n > 0$ for all $n \in \mathbb{N}$ and that $(a_n)^{1/n} \to \rho$ as $n \to \infty$.
(i) If $\rho < 1$, the series $\sum_{k=1}^{\infty} a_k$ is convergent.
(ii) If $\rho > 1$, then the series $\sum_{k=1}^{\infty} a_k$ is divergent.
(There is no conclusion when $\rho = 1$.)
Proof. Suppose that $\rho < 1$. Choose $\delta$ such that $\rho < \delta < 1$ and set $\varepsilon = \delta - \rho$. Then $\varepsilon > 0$ and so there is some $N \in \mathbb{N}$ such that $|a_n^{1/n} - \rho| < \varepsilon$ whenever $n > N$. In particular,
\[
(a_n)^{1/n} < \rho + \varepsilon = \delta
\]
i.e., $a_n < \delta^n$, whenever $n > N$. We must show that $s_n = \sum_{k=1}^{n} a_k$ converges. Since $a_n > 0$, the sequence $(s_n)$ is monotone increasing so it is enough to show that $(s_n)$ is bounded from above. But for any $n > N$,
\[
\begin{aligned}
s_n &= a_1 + a_2 + \dots + a_n = s_N + a_{N+1} + \dots + a_n \\
&< s_N + \delta^{N+1} + \delta^{N+2} + \dots + \delta^{n} \\
&= s_N + \frac{\delta^{N+1} - \delta^{n+1}}{1 - \delta} \\
&< s_N + \frac{\delta^{N+1}}{(1 - \delta)} .
\end{aligned}
\]
Hence, for any $j$,
\[
s_j < s_{N+j} < s_N + \frac{\delta^{N+1}}{1 - \delta}
\]
which shows that the sequence $(s_n)$ is bounded from above and therefore converges, as claimed.
Next, suppose that $\rho > 1$. Choose $d$ such that $1 < d < \rho$ and let $\varepsilon = \rho - d$. Then $\varepsilon > 0$ and there is $N \in \mathbb{N}$ such that
\[
a_n^{1/n} \in (\rho - \varepsilon, \rho + \varepsilon)
\]
whenever $n > N$. In particular, for $n > N$,
\[
\rho - \varepsilon < a_n^{1/n}
\]
which means that $a_n > d^n$. It follows that, for any $n > N$,
\[
s_n = s_N + a_{N+1} + \dots + a_n > a_n > d^n > 1 .
\]
From this, we see that it is false that $a_n \to 0$ as $n \to \infty$ and so by the Test for divergence, $\sum_{n=1}^{\infty} a_n$ is divergent.
We have considered tests applicable only to positive series. The following is a convergence test for the case when the terms alternate between positive and negative values.
Theorem 4.24 (Alternating Series Test). Suppose that $(a_n)$ is a positive, decreasing sequence such that $a_n \to 0$ as $n \to \infty$. Then the (alternating) series
\[
a_1 - a_2 + a_3 - a_4 + \dots = \sum_{n=1}^{\infty} (-1)^{n+1} a_n
\]
is convergent.
Proof. By hypothesis, $a_n \ge 0$, $a_{n+1} \le a_n$ and $a_n \to 0$ as $n \to \infty$. Let $s_n = a_1 - a_2 + a_3 - a_4 + \dots + (-1)^{n+1} a_n$ denote the $n$th partial sum of the series, as usual. We shall consider the two cases when $n$ is even and when $n$ is odd.
Suppose that $n$ is even, say $n = 2m$. Then
\[
s_{2m+2} = s_{2m} + (\underbrace{a_{2m+1} - a_{2m+2}}_{\ge 0})
\]
and so $s_{2m+2} \ge s_{2m}$.
Next, we note that
\[
\begin{aligned}
s_{2m} &= a_1 - a_2 + a_3 - a_4 + a_5 - \dots - a_{2m} \\
&= a_1 - (\underbrace{a_2 - a_3}_{\ge 0}) - (\underbrace{a_4 - a_5}_{\ge 0}) - \dots - (\underbrace{a_{2m-2} - a_{2m-1}}_{\ge 0}) - \underbrace{a_{2m}}_{\ge 0} \\
&\le a_1 .
\end{aligned}
\]
For notational convenience, let $x_m = s_{2m}$. Then we have shown that $(x_m)$ is increasing and bounded from above (by $a_1$). It follows that $(x_m)$ converges, say $x_m \to \alpha$ as $m \to \infty$.
Claim: $s_n \to \alpha$ as $n \to \infty$.
Let $\varepsilon > 0$ be given.
Then there is $N_1 \in \mathbb{N}$ such that if $m > N_1$ then $|x_m - \alpha| < \tfrac{1}{2}\varepsilon$. Also, there is $N_2 \in \mathbb{N}$ such that if $n > N_2$ then $|a_n| < \tfrac{1}{2}\varepsilon$. Let $N = 2(N_1 + N_2)$.
Let $n > N$ and consider $|s_n - \alpha|$. If $n$ is even, say $n = 2m$, then
\[
n = 2m > N \implies 2m > 2(N_1 + N_2) \implies m > N_1
\]
and so
\[
|s_n - \alpha| = |s_{2m} - \alpha| = |x_m - \alpha| < \tfrac{1}{2}\varepsilon < \varepsilon .
\]
If $n$ is odd, say $n = 2k + 1$, then
\[
n = 2k + 1 > N \implies 2k \ge N = 2(N_1 + N_2) \implies k > N_1 .
\]
Moreover, since $N > N_2$, we have $n > N \implies n > N_2$ and so we see that
\[
n = 2k + 1 > N \implies \text{both } k > N_1 \text{ and } n > N_2 .
\]
Hence
\[
\begin{aligned}
|s_n - \alpha| &= |s_{2k+1} - \alpha| = |s_{2k} + a_{2k+1} - \alpha| \\
&= |x_k - \alpha + a_n| \\
&\le |x_k - \alpha| + |a_n| \\
&< \tfrac{1}{2}\varepsilon + \tfrac{1}{2}\varepsilon \\
&= \varepsilon .
\end{aligned}
\]
So regardless of whether $n$ is even or odd, if $n > N$ then $|s_n - \alpha| < \varepsilon$. Hence $s_n \to \alpha$ as $n \to \infty$, as claimed, and we conclude that the alternating series $\sum_{n=1}^{\infty} (-1)^{n+1} a_n$ is convergent.
Example 4.25. The series $1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \dots$ converges.
This follows immediately from the Alternating Series Test.
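The structure of the proof of the Alternating Series Test can be observed directly on this example. The sketch below (with a helper `alternating_partial_sums` of our own naming) computes the partial sums of $1 - \tfrac12 + \tfrac13 - \dots$, checks that the even-indexed sums $s_{2m}$ increase and are bounded by $a_1 = 1$, and that consecutive partial sums differ by exactly one term. It is a standard fact, not proved in this section and used here only for comparison, that the sum is $\ln 2$.

```python
import math


def alternating_partial_sums(n: int) -> list[float]:
    """Return [s_1, ..., s_n] for the series 1 - 1/2 + 1/3 - 1/4 + ..."""
    sums, s = [], 0.0
    for k in range(1, n + 1):
        s += (-1) ** (k + 1) / k
        sums.append(s)
    return sums


sums = alternating_partial_sums(2000)
even = sums[1::2]                      # s_2, s_4, s_6, ...

# (x_m) = (s_{2m}) is increasing and bounded above by a_1 = 1.
assert all(x < y for x, y in zip(even, even[1:]))
assert all(x <= 1.0 for x in even)
# Odd and even partial sums differ by a single term a_n, which tends to 0.
assert abs((sums[-2] - sums[-1]) - 1 / 2000) < 1e-12
# Comparison with the known value ln 2 (assumed, not derived here).
assert abs(sums[-1] - math.log(2)) < 1 / 2000
print(sums[-1], math.log(2))
```

The convergence is slow: the error after $n$ terms is of the order of the next term $1/(n+1)$.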
Definition 4.26. The series $\sum_{n=1}^{\infty} a_n$ is said to converge absolutely if the series $\sum_{n=1}^{\infty} |a_n|$ is convergent.
The series $\sum_{n=1}^{\infty} a_n$ is said to converge conditionally if it converges but does not converge absolutely, i.e., it converges but the series $\sum_{n=1}^{\infty} |a_n|$ is not convergent.
Example 4.27. We have seen that the series
\[
\sum_{n=1}^{\infty} (-1)^{n+1} \frac{1}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \dots
\]
converges. However, we know that $\sum_{n=1}^{\infty} \frac{1}{n}$ does not converge and so the series $\sum_{n=1}^{\infty} (-1)^{n+1} \frac{1}{n}$ is an example of a conditionally convergent series.
Theorem 4.28. Every absolutely convergent series is convergent.
Proof. Suppose that $\sum_{n=1}^{\infty} a_n$ is absolutely convergent. Let $t_n = \sum_{k=1}^{n} |a_k|$ and $s_n = \sum_{k=1}^{n} a_k$. Then we know that $(t_n)$ converges (since $\sum_{n=1}^{\infty} a_n$ converges absolutely). It follows that $(t_n)$ is a Cauchy sequence. We shall show that $(s_n)$ is also a Cauchy sequence.
Let $\varepsilon > 0$ be given.
Then there is $N$ such that $n, m > N$ imply that
\[
|t_n - t_m| < \varepsilon .
\]
However, for $n > m$,
\[
|s_n - s_m| = |a_{m+1} + \dots + a_n| \le |a_{m+1}| + \dots + |a_n| = |t_n - t_m|
\]
and so it follows that
\[
|s_n - s_m| < \varepsilon
\]
whenever $n, m > N$, which shows that $(s_n)$ is a Cauchy sequence. But any Cauchy sequence in $\mathbb{R}$ converges and the result follows.
We know that if $a$ and $b$ are real numbers, then $a + b = b + a$. More generally, if $a_1, \dots, a_m$ is a collection of $m$ real numbers, then their sum $a_1 + \dots + a_m$ is the same irrespective of the order in which we choose to add them together. Now, a series $\sum_{n=1}^{\infty} a_n$ is the result of adding together real numbers, so it is natural to guess that the order of the addition does not matter. To discuss this, we shall need the notion of a rearrangement.
Definition 4.29. The series $\sum_{n=1}^{\infty} b_n$ is a rearrangement of the series $\sum_{n=1}^{\infty} a_n$ if there is some one-one map $\varphi$ of $\mathbb{N}$ onto $\mathbb{N}$ such that $b_n = a_{\varphi(n)}$ for each $n \in \mathbb{N}$. In other words, every $b$ is one of the $a$s and every $a$ appears as some $b$.
Theorem 4.30. Suppose that the series $\sum_{n=1}^{\infty} a_n$ converges absolutely. Then every rearrangement also converges, with the same sum.
Proof. Let $\sum_{n=1}^{\infty} b_n$ be a rearrangement of $\sum_{n=1}^{\infty} a_n$. Then there is some one-one map $\varphi$ of $\mathbb{N}$ onto $\mathbb{N}$ such that $b_n = a_{\varphi(n)}$ for every $n$. Let $s_n = \sum_{k=1}^{n} a_k$, $t_n = \sum_{k=1}^{n} |a_k|$, $r_n = \sum_{k=1}^{n} b_k$ and let $s = \sum_{k=1}^{\infty} a_k = \lim_{n \to \infty} s_n$. We must show that $(r_n)$ converges and that its limit is equal to $s$.
Let $\varepsilon > 0$ be given.
Since $s_n \to s$ and $(t_n)$ is a Cauchy sequence, there is some $N \in \mathbb{N}$ such that $n, m \ge N$ imply that both
\[
|s_n - s| < \varepsilon/2 \quad\text{and}\quad |t_n - t_m| < \varepsilon/2 .
\]
Now, the sequence of $b$s is a relabelling of the $a$s and so for each $j$ there is some $k_j$ such that $a_j = b_{k_j}$. Let $N' = \max\{ k_j : 1 \le j \le N \}$ so that the collection $a_1, a_2, \dots, a_N$ is included in the collection $b_1, b_2, \dots, b_{N'}$. Then for any $n > N'$
\[
b_1 + b_2 + \dots + b_n = a_1 + a_2 + \dots + a_N + \rho_n
\]
where $\rho_n = a_{\mu_1} + \dots + a_{\mu_r}$ for some integers $\mu_1, \dots, \mu_r$ with $N < \mu_1 < \dots < \mu_r$. Now
\[
|\rho_n| \le |a_{\mu_1}| + \dots + |a_{\mu_r}| \le \sum_{k=N+1}^{\mu_r} |a_k| = t_{\mu_r} - t_N < \varepsilon/2
\]
and so if $n > N'$
\[
\begin{aligned}
|r_n - s| &= |s_N + \rho_n - s| \\
&\le |s_N - s| + |\rho_n| \\
&< \varepsilon/2 + \varepsilon/2 = \varepsilon
\end{aligned}
\]
and the proof is complete.
Theorem 4.31 (Cauchy's Condensation Test). Suppose $(a_n)_{n \in \mathbb{N}}$ is a positive, decreasing sequence of real numbers (that is, $a_n \ge 0$ and $a_{n+1} \le a_n$). For each $k \in \mathbb{N}$, let $b_k = 2^k a_{2^k}$. Then the series $\sum_{n=1}^{\infty} a_n$ is convergent if and only if the series $\sum_{k=1}^{\infty} b_k$ is convergent.
(In other words, either both series converge or neither does.)
Proof. Let $s_n = \sum_{m=1}^{n} a_m$ and $t_k = \sum_{i=1}^{k} b_i$ be the partial sums of the series under consideration. Since $a_n \ge 0$, the sequences $(s_n)_{n \in \mathbb{N}}$ and $(t_k)_{k \in \mathbb{N}}$ are increasing sequences. Now, we know that if an increasing sequence in $\mathbb{R}$ is bounded from above, then it must converge, so our strategy is to show that the sequences of partial sums are bounded (from above).
The idea is to estimate the partial sums of the series in terms of each other by bracketing the $a_n$ terms into groups of size 2, 4, 8, 16, . . . and using the fact that $a_n \ge a_{n+1}$. We note that
\[
\begin{aligned}
2a_4 &\le a_3 + a_4 \le 2a_2 \\
4a_8 &\le a_5 + a_6 + a_7 + a_8 \le 4a_4 \\
8a_{16} &\le a_9 + a_{10} + \dots + a_{15} + a_{16} \le 8a_8 \\
&\ \ \vdots
\end{aligned}
\]
Summing, we find that (for $k > 1$)
\[
2a_4 + 4a_8 + \dots + 2^{k-1} a_{2^k} \le a_3 + a_4 + \dots + a_{2^k} \le 2a_2 + 4a_4 + \dots + 2^{k-1} a_{2^{k-1}} .
\]
In terms of the $b_k$s, this becomes
\[
\tfrac{1}{2}(b_2 + \dots + b_k) \le a_3 + a_4 + \dots + a_{2^k} \le b_1 + b_2 + \dots + b_{k-1}
\]
giving the pair of inequalities
\[
\tfrac{1}{2} t_k - b_1 \le s_{2^k} - (a_1 + a_2) \le t_{k-1} \qquad (*)
\]
Suppose now that the series $\sum_{n=1}^{\infty} a_n$ converges and, for clarity, let us write $s = \sum_{n=1}^{\infty} a_n = \lim_{j \to \infty} s_j$. Since $(s_j)$ is increasing, it follows that $s_j \le s$ for all $j \in \mathbb{N}$. From $(*)$, it follows that
\[
\tfrac{1}{2} t_k - b_1 \le s_{2^k} - (a_1 + a_2) \le s - (a_1 + a_2)
\]
for all $k \in \mathbb{N}$. Hence $(t_k)$ is a bounded, increasing sequence and so converges. But, by definition, this means that $\sum_{i=1}^{\infty} b_i$ is convergent.
Next, suppose that the series $\sum_{i=1}^{\infty} b_i$ converges and write $t$ for its sum. Then $t_k \le t$ for all $k \in \mathbb{N}$. Now, for any $n \in \mathbb{N}$, it is true that $2^n > n$ (as can be seen by the Binomial Theorem, as follows: $2^n = (1+1)^n = 1 + n + \binom{n}{2} + \dots + 1 > n$). Hence $s_n \le s_{2^n}$ and so, using $(*)$, we get
\[
s_n \le s_{2^n} \le t_{n-1} + (a_1 + a_2) \le t + (a_1 + a_2)
\]
for all $n - 1 \in \mathbb{N}$. Therefore $(s_n)$ is a bounded, increasing sequence and so converges. Therefore $\sum_{j=1}^{\infty} a_j$ is convergent.
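The key step in this proof is the pair of inequalities relating the partial sums of the two series, $\tfrac12 t_k - b_1 \le s_{2^k} - (a_1 + a_2) \le t_{k-1}$ for $k \ge 2$. As a sanity check, the sketch below evaluates both sides for two sample positive decreasing sequences; the function name `check_condensation` and the choice of test sequences are ours.

```python
import math
from typing import Callable


def check_condensation(a: Callable[[int], float], k_max: int) -> None:
    """Check (1/2) t_k - b_1 <= s_{2^k} - (a_1 + a_2) <= t_{k-1} for 2 <= k <= k_max."""

    def b(k: int) -> float:
        # Condensed term b_k = 2^k * a_{2^k}.
        return 2 ** k * a(2 ** k)

    for k in range(2, k_max + 1):
        s = math.fsum(a(n) for n in range(1, 2 ** k + 1))      # s_{2^k}
        t_k = math.fsum(b(i) for i in range(1, k + 1))         # t_k
        t_km1 = math.fsum(b(i) for i in range(1, k))           # t_{k-1}
        middle = s - (a(1) + a(2))
        assert 0.5 * t_k - b(1) <= middle <= t_km1


# a_n = 1/n^(3/2): positive and decreasing, and the series converges.
check_condensation(lambda n: 1.0 / n ** 1.5, 15)
# a_n = 1/n: positive and decreasing, and the series diverges.
check_condensation(lambda n: 1.0 / n, 15)
print("inequalities hold for all k tested")
```

Note that the inequalities hold whether or not the series converge; it is only the boundedness of one side that transfers to the other.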
Example 4.32. We already know that the series $\sum_{n=1}^{\infty} 1/n$ diverges, but let us consider it again via the Condensation Test. First, we note that $a_n = 1/n$ satisfies the hypotheses required to apply the Condensation Test. Now,
\[
b_k = 2^k a_{2^k} = 2^k / 2^k = 1
\]
and so it is clear that $\sum_k b_k = \sum_k 1$ diverges. Applying the Condensation Test, we conclude that $\sum_{n=1}^{\infty} 1/n$ diverges. (In fact, we have already shown this from first principles using this method of grouping.)
Next, consider $\sum_{n=1}^{\infty} 1/n^{1+\delta}$ for given $\delta > 0$. Once again, $a_n = 1/n^{1+\delta}$ satisfies the hypotheses required to apply the Condensation Test. In this case, we have
\[
b_k = 2^k a_{2^k} = \frac{2^k}{2^{k(1+\delta)}} = \frac{2^k}{2^k\, 2^{k\delta}} = \frac{1}{2^{k\delta}}
\]
so that $\sum_k b_k$ is a geometric series with common ratio $1/2^{\delta}$. This series therefore converges (because $1/2^{\delta}$ is smaller than 1) and so, by the Condensation Test, $\sum_{n=1}^{\infty} 1/n^{1+\delta}$ converges.
We might say that the series $\sum_n 1/n$ diverges presumably because the terms $1/n$ do not become small enough quickly enough. Increasing the power of $n$ from 1 to $1 + \delta$ is sufficient to speed things up so that $\sum_n 1/n^{1+\delta}$ does converge, no matter how small $\delta$ may be. Consider the series $\sum_{n=2}^{\infty} 1/(n \ln n)$ (we cannot start this series with $n = 1$ because $\ln 1 = 0$). Is the change from $a_n = 1/n$ to $a_n = 1/(n \ln n)$ enough to give convergence of the series?
To investigate its convergence or otherwise, let $a_n = 1/(n \ln n)$ for $n \ge 2$ and set $a_1 = 5$, say, or any value greater than $1/(2 \ln 2)$. This choice of $a_1$ is not quite arbitrary but is chosen so that $(a_n)_{n \in \mathbb{N}}$ satisfies the hypotheses required to apply the Condensation Test. The series $\sum_{n=2}^{\infty} a_n$ converges if and only if the series $\sum_{n=1}^{\infty} a_n$ does, regardless of our choice for $a_1$. Applying the Condensation Test, we may say that the series $\sum_{n=2}^{\infty} 1/(n \ln n)$ converges if (and only if) $\sum_{n=1}^{\infty} 2^n a_{2^n}$ does. But
\[
2^n a_{2^n} = \frac{2^n}{2^n \ln(2^n)} = \frac{1}{\ln 2}\,\frac{1}{n}
\]
and we know that the series $\sum_n 1/n$ does not converge. We can conclude, then, that the series $\sum_{n=2}^{\infty} 1/(n \ln n)$ is divergent.
\[
\text{The series } \sum_{n=2}^{\infty} \frac{1}{n \ln n} \text{ is divergent.}
\]
Chapter 5
Functions
Suppose that $x$ represents the value of the length of a side of a square. Then its area depends on $x$ and, in fact, is given by the formula: area $= x^2$. The area is a function of $x$. In general, if $S$ is some given subset of $\mathbb{R}$, then a real-valued function $f$ on $S$ is a rule or assignment by which to each element $x \in S$ is associated some real number, denoted by $f(x)$. We write $f : S \to \mathbb{R}$, which is read as "$f$ maps $S$ into $\mathbb{R}$". One also writes $x \mapsto f(x)$, which is read as "$x$ is mapped to the value $f(x)$". The set $S$ is called the domain (of definition) of the function $f$. If $x \notin S$, then $f(x)$ has not been given a meaning.
More generally, if $A$ and $B$ are given sets, then a mapping $g : A \to B$ is an association $a \mapsto g(a)$ of each element $a$ of $A$ to some element $g(a) \in B$. For example, for each $0 \le t \le 1$, let $g(t) = \begin{pmatrix} t & 1 \\ 0 & t^3 \end{pmatrix}$. Then $t \mapsto g(t)$ is an example of a mapping from the interval $[0, 1]$ into the set of $2 \times 2$ real matrices.
In general, if $B$ is equal to either $\mathbb{R}$ or $\mathbb{C}$, then the mapping is often referred to as a function. Note that a function may be given by a "pretty" formula but it does not have to be. For example, the function $f : \mathbb{R} \to \mathbb{R}$ with $f(x) = 1 + x^2$ is given by a formula. To get $f(x)$, we just substitute the value of $x$ into the formula. However, the function
\[
x \mapsto \begin{cases} x^2, & x < -1, \\ 1, & -1 \le x \le 0, \\ x^3 + 1, & x > 0 \end{cases}
\]
is a perfectly good function, but is not given by a formula in the same way as the previous example. In fact, this function seems to be a concoction constructed from the functions $x^2$, $1$ and $x^3 + 1$. A slightly more involved example is
\[
x \mapsto \begin{cases} 0, & \text{if } x \notin \mathbb{Q}, \\ 1/n, & \text{if } x \in \mathbb{Q} \text{ and } x = k/n \text{ (with } k \in \mathbb{Z},\ n \in \mathbb{N} \text{ and where } k \text{ and } n \text{ have no common divisors).} \end{cases}
\]
For a function to be well-defined, there must be specified
(i) its domain of definition,
(ii) some assignment giving the value it takes at each point of its domain.
It is often very useful to consider the visual representation of $f$ given by plotting the points $(x, f(x))$ in $\mathbb{R}^2$. This is the graph of $f$.
Examples 5.1.
1. Linear functions: $x \mapsto f(x) = mx + c$ for constants $m, c \in \mathbb{R}$ and $x \in S$.
2. Polynomials: $x \mapsto f(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n$ for $x \in S$, where the coefficients $a_0, a_1, \dots, a_n$ are constants in $\mathbb{R}$ and $a_n \ne 0$. $n$ is the degree of such a polynomial.
3. Rational functions: $x \mapsto f(x) = \dfrac{p(x)}{q(x)}$ for $x \in S$ where $p$ and $q$ are polynomials. Note that the right hand side is not defined for any values of $x$ for which $q(x) = 0$.
4. $S = \mathbb{R}$, $f(x) = \begin{cases} 1/x, & x \ne 0, \\ 3, & x = 0 . \end{cases}$
5. $S = [-1, 1]$, $f(x) = \begin{cases} x^2, & x \ne 0, \\ 2, & x = 0 . \end{cases}$
6. $S = \mathbb{R}$, $f(x) = \begin{cases} 0, & x \notin \mathbb{Q}, \\ 1, & x \in \mathbb{Q} . \end{cases}$
(A thought: let $g(x) = \begin{cases} 1, & x \notin \mathbb{Q}, \\ 0, & x \in \mathbb{Q} \end{cases}$ and let $h = f + g$. Then we see that $h(x) = f(x) + g(x) = 1$ for all $x \in \mathbb{R}$. Certainly $\int_0^1 h(x)\,dx = 1$ but what are the values of $\int_0^1 f(x)\,dx$ and $\int_0^1 g(x)\,dx$ and is it true that
\[
1 = \int_0^1 h(x)\,dx = \int_0^1 f(x)\,dx + \int_0^1 g(x)\,dx\,?)
\]
7. $S = \mathbb{R}$, $f(x) = \begin{cases} 0, & x < 0, \\ \tfrac{1}{4}, & 0 \le x < 1, \\ \tfrac{1}{2}, & 1 \le x < 6, \\ 1, & x \ge 6 . \end{cases}$
This kind of step-function is familiar from probability theory: it is the cumulative distribution function $f(x) = \mathrm{Prob}\{ X \le x \}$ for a random variable $X$ taking the values 0, 1 and 6 with probabilities $\tfrac{1}{4}$, $\tfrac{1}{4}$ and $\tfrac{1}{2}$, respectively.
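The step function in item 7 illustrates the point made earlier that a function need not be given by a single formula. As a sketch, it can be written down in two ways, once case by case and once by summing the stated probabilities; the names `step` and `cdf_from_distribution` are our own, and the second function simply adds up the probabilities of all values not exceeding $x$.

```python
from fractions import Fraction

# Values of X and their probabilities, as stated in item 7.
DISTRIBUTION = {0: Fraction(1, 4), 1: Fraction(1, 4), 6: Fraction(1, 2)}


def step(x: float) -> Fraction:
    """The step function of item 7, defined case by case."""
    if x < 0:
        return Fraction(0)
    if x < 1:
        return Fraction(1, 4)
    if x < 6:
        return Fraction(1, 2)
    return Fraction(1)


def cdf_from_distribution(x: float) -> Fraction:
    """Prob{X <= x}: add the probabilities of all values v with v <= x."""
    return sum((p for v, p in DISTRIBUTION.items() if v <= x), Fraction(0))


# The two descriptions agree at sample points, including the jump points.
for x in (-2, -0.001, 0, 0.5, 1, 3, 5.999, 6, 10):
    assert step(x) == cdf_from_distribution(x)
print([str(step(x)) for x in (-1, 0, 1, 6)])
```

Both definitions specify the same domain and the same value at every point, so they define the same function, even though only one of them looks like a "formula".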
Let $f : S \to \mathbb{R}$ be a given function and let $A \subseteq S$.
We say that $f$ is bounded from above on $A$ if there is some $M$ such that $f(x) \le M$ for all $x \in A$.
Analogously, $f$ is said to be bounded from below on $A$ if there is some $m$ such that $f(x) \ge m$ for all $x \in A$.
If $f$ is both bounded from above and from below on $A$, then we say that $f$ is bounded on $A$.
We say that $f$ is increasing (respectively, decreasing) on $A$ if $f(x_1) \le f(x_2)$ (respectively, $f(x_1) \ge f(x_2)$) for any $x_1, x_2 \in A$ with $x_1 < x_2$.
We say that $f$ is strictly increasing (respectively, strictly decreasing) on $A$ if $f(x_1) < f(x_2)$ (respectively, $f(x_1) > f(x_2)$) for any $x_1, x_2 \in A$ with $x_1 < x_2$.
Examples 5.2.
1. $S = \mathbb{R}$, $f(x) = x^2$. Then $f$ is bounded from below on $\mathbb{R}$ (by $m = 0$) but $f$ is not bounded from above on $\mathbb{R}$. $f$ is strictly increasing on $[0, \infty)$ and $f$ is strictly decreasing on $(-\infty, 0]$. $f$ is bounded on any bounded interval $[a, b]$. (We see that $0 \le f(x) \le \max\{ a^2, b^2 \}$ on $[a, b]$.)
2. $S = [1, \infty)$, $f(x) = 1 - \frac{1}{x}$ for $x \in S$. Then $f$ is increasing and bounded on $S$. We see that $f$ attains its glb, namely 0, but does not attain its lub, 1.
Definition 5.3. Let $f : S \to \mathbb{R}$ and let $x_0 \in S$. We say that $f$ is continuous at the point $x_0$ if for any given $\varepsilon > 0$ there is some $\delta > 0$ such that
\[
x \in S \text{ and } |x - x_0| < \delta \implies |f(x) - f(x_0)| < \varepsilon .
\]
We say that $f$ is continuous on some given set $A$ if $f$ is continuous at each point of $A$.
What's going on? Note that continuity is defined at some point $x_0$. The idea is that one is first given a margin of error, this is the $\varepsilon > 0$. For $f$ to be continuous at the specified point $x_0$, we demand that $f(x)$ be within distance $\varepsilon$ of $f(x_0)$ as long as $x$ is suitably close to $x_0$, i.e., $x$ is within some suitable distance $\delta$ of $x_0$. It must be possible to find such $\delta$ no matter how small $\varepsilon$ is. In general, one must expect that the smaller $\varepsilon$ is, then the smaller $\delta$ will need to be. The requirement that $x \in S$ ensures that $f(x)$ actually makes sense in the first place.
If we set $h = x - x_0$, then we demand that $f(x_0 + h)$ be within distance $\varepsilon$ of $f(x_0)$ whenever $|h| < \delta$ (provided that $x_0 + h \in S$). The point $x_0$ and the error value $\varepsilon$ must be given first. Then one must be able to find a suitable $\delta$ as indicated.
We shall illustrate the idea with a simple example (so no surprises here).
Example 5.4. Let $S = \mathbb{R}$ and set $f(x) = x^2$. Let $x_0 \in \mathbb{R}$ be arbitrary (but fixed).
We shall show that $f$ is continuous at $x_0$. The procedure is as follows.
Let $\varepsilon > 0$ be given.
We must find some $\delta > 0$ such that $|f(x) - f(x_0)| < \varepsilon$ whenever $|x - x_0| < \delta$.
For convenience, write $x = x_0 + h$. We see that
$$|f(x) - f(x_0)| = |x^2 - x_0^2| = |(x_0 + h)^2 - x_0^2| = |2x_0 h + h^2|. \qquad (*)$$
How small must $h$ be in order for this to be smaller than $\varepsilon$? We do not
need an optimal estimate, any will do. One idea would be to notice that
$|2x_0 h + h^2| \le |2x_0 h| + h^2$ and then try to make each of these two terms
smaller than $\tfrac12\varepsilon$, that is, we try to make sure that both $|2x_0 h| < \tfrac12\varepsilon$ and
$h^2 < \tfrac12\varepsilon$. This suggests the two requirements that $|h| < \varepsilon/(4|x_0|)$ and
$|h| < \sqrt{\varepsilon/2}$. We must be careful here because it might happen that $x_0 = 0$,
in which case we cannot divide by $|x_0|$. To side-step this nuisance, we shall
consider the two cases $x_0 = 0$ and $x_0 \ne 0$ separately.
So first suppose that $x_0 \ne 0$. Then we simply choose $\delta$ to be the minimum
of the two terms $\varepsilon/(4|x_0|)$ and $\sqrt{\varepsilon/2}$. This will ensure that if $|h| < \delta$ then
$|f(x) - f(x_0)| < \varepsilon$.
Next, suppose that $x_0 = 0$. Then the right hand side of $(*)$ is simply equal
to $h^2$. If we choose $\delta = \sqrt{\varepsilon}$, then $|h| < \delta$ implies that $h^2 < \varepsilon$ and so, by $(*)$,
we have $|f(x) - f(x_0)| < \varepsilon$.
In either of the cases $x_0 = 0$ or $x_0 \ne 0$, we have exhibited a suitable $\delta$
so that $|x - x_0| < \delta$ implies that $|f(x) - f(x_0)| < \varepsilon$. We have shown that
$f(x) = x^2$ is continuous at any given point $x_0 \in \mathbb{R}$ and the proof is complete.
Notice that the $\delta$ depends on both $\varepsilon$ and $x_0$. We must always expect this to
happen (even though in some trivial situations it might not).
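The choice of the margin in Example 5.4 can be tried out numerically. The sketch below is an illustration only: the helper names `delta_for_square` and `check` are hypothetical, and random sampling can support the estimate but of course does not prove it.

```python
import math
import random

def delta_for_square(eps, x0):
    """The margin chosen in Example 5.4 for f(x) = x**2 at the point x0."""
    if x0 == 0:
        return math.sqrt(eps)
    return min(eps / (4 * abs(x0)), math.sqrt(eps / 2))

def check(eps, x0, trials=10_000, seed=0):
    """Sample x with |x - x0| < delta and test that |x**2 - x0**2| < eps."""
    rng = random.Random(seed)
    delta = delta_for_square(eps, x0)
    for _ in range(trials):
        h = 0.999 * rng.uniform(-delta, delta)  # keeps |h| strictly below delta
        x = x0 + h
        if not abs(x * x - x0 * x0) < eps:
            return False
    return True

print(check(0.1, 3.0), check(1e-3, 0.0), check(0.5, -100.0))
```

Note how the returned margin shrinks as `abs(x0)` grows, which is the dependence on the point remarked on above.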
The following theorem gives us an extremely useful characterization of
continuity.
Theorem 5.5. Let $f : S \to \mathbb{R}$ and let $x_0 \in S$. The following two statements
are equivalent:
(i) $f$ is continuous at $x_0$;
(ii) if $(a_n)_{n \in \mathbb{N}}$ is any sequence in $S$ such that $a_n \to x_0$ as $n \to \infty$, then
the sequence $(f(a_n))_{n \in \mathbb{N}}$ converges to $f(x_0)$.
Proof. Suppose that statement (i) holds. To show that (ii) is also true, let
$(a_n)$ be any sequence in $S$ with the property that $a_n \to x_0$ as $n \to \infty$. We
must show that $f(a_n) \to f(x_0)$ as $n \to \infty$.
Let $\varepsilon > 0$ be given.
By hypothesis, $f$ is continuous at $x_0$ and so there is some $\delta > 0$ such
that
$$x \in S \text{ and } |x - x_0| < \delta \implies |f(x) - f(x_0)| < \varepsilon. \qquad (*)$$
But $a_n \to x_0$ as $n \to \infty$ and so there exists $N \in \mathbb{N}$ such that
$$n > N \implies |a_n - x_0| < \delta. \qquad (**)$$
Evidently, $(*)$ and $(**)$ together (with $x = a_n$) tell us that
$$n > N \implies |f(a_n) - f(x_0)| < \varepsilon,$$
which means that $(f(a_n))$ converges to $f(x_0)$ as $n \to \infty$, as required.
Now suppose that (ii) holds. We must show that this implies that $f$ is
continuous at $x_0$. Suppose that this were not true, that is, let us
suppose that $f$ is not continuous at the point $x_0 \in S$. What does this mean?
It means that there is some $\varepsilon_0 > 0$ such that it is false that there is some
$\delta > 0$ so that
$$x \in S \text{ and } |x - x_0| < \delta \implies |f(x) - f(x_0)| < \varepsilon_0.$$
That is, there is some $\varepsilon_0 > 0$ such that no matter what $\delta > 0$ we choose, it
will be false that
$$x \in S \text{ and } |x - x_0| < \delta \implies |f(x) - f(x_0)| < \varepsilon_0.$$
That is, there is some $\varepsilon_0 > 0$ such that for any $\delta > 0$ there is some $x \in S$
with $|x - x_0| < \delta$ such that it is false that $|f(x) - f(x_0)| < \varepsilon_0$.
That is, there is some $\varepsilon_0 > 0$ such that for any $\delta > 0$ there is some
$x \in S$ with $|x - x_0| < \delta$ such that $|f(x) - f(x_0)| \ge \varepsilon_0$. Note that $x$ may
well depend on $\delta$.
How does this help? For given $n \in \mathbb{N}$, set $\delta = \frac{1}{n}$. Then, according to the
discussion above, there is some point $x \in S$ with $|x - x_0| < \frac{1}{n}$ but such that
$|f(x) - f(x_0)| \ge \varepsilon_0$. The number $x$ could depend on $n$, so let us relabel it
and call it $a_n$. Then
$$|a_n - x_0| < \frac{1}{n} \quad \text{but} \quad |f(a_n) - f(x_0)| \ge \varepsilon_0.$$
If we do this for each $n \in \mathbb{N}$ we get a sequence $(a_n)_{n \in \mathbb{N}}$ in $S$ which clearly
converges to $x_0$. However, because $|f(a_n) - f(x_0)| \ge \varepsilon_0$ for all $n \in \mathbb{N}$, the
sequence $(f(a_n))_{n \in \mathbb{N}}$ does not converge to $f(x_0)$. This is a contradiction (we
started with the hypothesis that (ii) was true). Therefore our assumption
that $f$ was not continuous at $x_0$ is wrong and we conclude that $f$ is indeed
continuous at $x_0$. This completes the proof that the truth of statement (ii)
implies that of statement (i).
We can now apply this theorem, together with various known results
about sequences, to establish some (not very surprising but) basic properties
of continuous functions.
Theorem 5.6. Suppose that $f : S \to \mathbb{R}$, $g : S \to \mathbb{R}$ and that $\alpha \in \mathbb{R}$. Suppose
that $x_0 \in S$ and that $f$ and $g$ are continuous at $x_0$. Then
(i) The sum $f + g$ is continuous at $x_0$.
(ii) $\alpha f$ is continuous at $x_0$.
(iii) The product $fg$ is continuous at $x_0$.
(iv) If $g$ does not vanish on $S$, then the quotient $f/g$ is defined on $S$
and is continuous at $x_0$.
Proof. Suppose that $(a_n)$ is any sequence in $S$ with the property that $a_n \to x_0$
as $n \to \infty$. Then we know from the previous theorem that $f(a_n) \to f(x_0)$
and also that $g(a_n) \to g(x_0)$ as $n \to \infty$. It follows that
(i) The sum $(f + g)(a_n) = f(a_n) + g(a_n) \to f(x_0) + g(x_0) = (f + g)(x_0)$
as $n \to \infty$.
(ii) $\alpha f(a_n) \to \alpha f(x_0)$ as $n \to \infty$.
(iii) The product $(fg)(a_n) = f(a_n)\,g(a_n) \to f(x_0)\,g(x_0)$ as $n \to \infty$.
(iv) Since $g$ does not vanish on $S$, the quotient $f/g$ is well-defined on $S$.
Moreover, $g(a_n) \ne 0$ for any $n \in \mathbb{N}$ and so $(f/g)(a_n) = f(a_n)/g(a_n) \to
f(x_0)/g(x_0)$ as $n \to \infty$.
Now applying the previous theorem once again proves (i)-(iv).
Remark 5.7. We could also have proved the above facts directly from the
definition of continuity. For example, a proof that $f + g$ is continuous at $x_0$
is as follows.
Let $\varepsilon > 0$ be given.
Then there is some $\delta' > 0$ such that
$$|x - x_0| < \delta' \ (\text{and } x \in S) \implies |f(x) - f(x_0)| < \tfrac12 \varepsilon. \qquad (*)$$
The reason for using $\tfrac12\varepsilon$ rather than $\varepsilon$ will become clear below. Similarly,
there is some $\delta'' > 0$ such that
$$|x - x_0| < \delta'' \ (\text{and } x \in S) \implies |g(x) - g(x_0)| < \tfrac12 \varepsilon. \qquad (**)$$
Now, let $\delta = \min\{\delta', \delta''\}$. Then, from $(*)$ and $(**)$,
$$|(f + g)(x) - (f + g)(x_0)| = |f(x) - f(x_0) + g(x) - g(x_0)|
\le |f(x) - f(x_0)| + |g(x) - g(x_0)|
< \tfrac12\varepsilon + \tfrac12\varepsilon = \varepsilon$$
whenever $|x - x_0| < \delta$ (and $x \in S$) and so, by definition, it follows that $f + g$
is continuous at the point $x_0$ in $S$.
Remark 5.8. The function $f(x) = x$ is continuous on $\mathbb{R}$ and so with $g = f$, we
deduce from the theorem that $f^2(x)$ is also continuous on $\mathbb{R}$. This is just the
statement that the function $x^2$ is continuous. By induction, we can deduce
from the theorem that products of continuous functions and also finite linear
combinations of continuous functions are continuous, i.e., if $f_1, \dots, f_k$ are
each continuous at $x_0$, then so is the product function $f_1 f_2 \cdots f_k$ as well
as the linear combination $\alpha_1 f_1 + \dots + \alpha_k f_k$, for any $\alpha_1, \dots, \alpha_k \in \mathbb{R}$. In
particular, any power of a continuous function is continuous and taking
$f(x) = x$, we see that any polynomial $a_0 + a_1 x + \dots + a_n x^n$ is continuous
on $\mathbb{R}$.
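Example 5.9. The function $x \mapsto \sqrt{x}$ is continuous on $[0, \infty)$. This can be
shown as follows.
Let $x_0 \in [0, \infty)$ be fixed and let $\varepsilon > 0$ be given. Suppose first that $x_0 > 0$.
For any $x \ge 0$, we have
$$|\sqrt{x} - \sqrt{x_0}| = \frac{|\sqrt{x} - \sqrt{x_0}|\,|\sqrt{x} + \sqrt{x_0}|}{|\sqrt{x} + \sqrt{x_0}|}
= \frac{|x - x_0|}{\sqrt{x} + \sqrt{x_0}} \le \frac{|x - x_0|}{\sqrt{x_0}} < \varepsilon$$
provided $|x - x_0| < \delta$, where we have chosen $\delta = \varepsilon\sqrt{x_0}$.
To conclude, consider the case $x_0 = 0$. Then we simply observe that
$$|\sqrt{x} - \sqrt{x_0}| = \sqrt{x} < \varepsilon$$
whenever $|x - 0| < \delta$ with $\delta$ chosen to be $\varepsilon^2$.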
Example 5.10. The function $x \mapsto 1/x$, for $x > 0$, is continuous on $(0, \infty)$.
Let $f(x) = 1/x$ for $x > 0$. To show that $f$ is continuous on $(0, \infty)$, let
$x_0 \in (0, \infty)$ be given and suppose that $(a_n)_{n \in \mathbb{N}}$ is any sequence in $(0, \infty)$
such that $a_n \to x_0$ as $n \to \infty$. We know that this implies that $1/a_n \to 1/x_0$,
that is, $f(a_n) \to f(x_0)$ as $n \to \infty$. By Theorem 5.5, this implies that $f$ is continuous at
$x_0$, as required.
Note that $f$ is bounded from below (by 0) but $f$ is not bounded from
above on $(0, \infty)$. For any $M > 0$, there is $k \in \mathbb{N}$ such that $k > M$, by the
Archimedean Property. Hence, if $0 < x < 1/k$, then $f(x) = 1/x > k > M$.
It follows that there is no constant $M$ such that $f(x) < M$ for all $x \in (0, \infty)$,
that is, $f$ is not bounded from above on $(0, \infty)$.
From the example above, we see that if $f(x) = 1/x$ for $x$ in any interval
of the form $(0, b)$, say, then $f$ is continuous on $(0, b)$ but is not bounded
there. This situation cannot happen on closed intervals. This is the content
of the following important theorem.
Theorem 5.11. Suppose that the function $f : [a, b] \to \mathbb{R}$ is continuous on the
closed interval $[a, b]$. Then $f$ is bounded on $[a, b]$.
Proof. We argue by contradiction. Suppose that $f$ is continuous on $[a, b]$ but
is not bounded. Suppose that $f$ is not bounded from above. This means
that for any given $M$ whatsoever, there will be some $x \in [a, b]$ such that
$f(x) > M$. In particular, for each $n \in \mathbb{N}$ (taking $M = n$) we know that
there is some point $a_n$, say, in the interval $[a, b]$ such that $f(a_n) > n$.
Consider the sequence $(a_n)_{n \in \mathbb{N}}$. This sequence lies in the bounded interval
$[a, b]$ and so, by the Bolzano-Weierstrass Theorem, it has a convergent
subsequence $(a_{n_k})_{k \in \mathbb{N}}$, say; $a_{n_k} \to \alpha$ as $k \to \infty$. Since $a \le a_{n_k} \le b$ for all
$k$, it follows that $a \le \alpha \le b$. (The limit of a convergent sequence belonging
to a closed interval also belongs to the same closed interval.) But, by hypothesis,
$f$ is continuous at $\alpha$ and so $a_{n_k} \to \alpha$ implies that $f(a_{n_k}) \to f(\alpha)$.
It is this that will provide our sought after contradiction. By construction,
$f(a_{n_k}) > n_k$ and so it looks rather unlikely that $(f(a_{n_k}))$ could converge.
To see that this is the situation, we observe that there is some $K \in \mathbb{N}$ such
that
$$|f(a_{n_k}) - f(\alpha)| < 1$$
for all $k > K$ (because $f(a_{n_k}) \to f(\alpha)$). But then
$$f(a_{n_k}) = f(a_{n_k}) - f(\alpha) + f(\alpha) \le |f(a_{n_k}) - f(\alpha)| + f(\alpha) < 1 + f(\alpha)$$
for all $k > K$. However, $f(a_{n_k}) > n_k \ge k$, so $1 + f(\alpha) > k$ for all $k > K$ and hence for all $k \in \mathbb{N}$.
This is a contradiction and we conclude that $f$ is bounded from above.
To show that $f$ is also bounded from below, we consider $g = -f$. Then
$g$ is continuous because $f$ is. The argument just presented, applied to $g$,
shows that $g$ is bounded from above. But this just means that $f$ is bounded
from below and the proof is complete.
Remark 5.12. The two essential ingredients are that $f$ is continuous and that
the interval is both closed and bounded. The boundedness was required so
that we could invoke the Bolzano-Weierstrass Theorem and the fact that it
was closed ensured that $\alpha$, the limit of the Bolzano-Weierstrass convergent
subsequence, actually also belonged to the interval. This in turn guaranteed
that $f$ was not only defined at $\alpha$ but was continuous there.
If we try to relax these requirements, we see that the conclusion of the
theorem need no longer be true. For example, we must insist that $f$ be
continuous. Indeed, consider the function $f$ on the closed interval $[0, 1]$
given by
$$f(x) = \begin{cases} 0, & \text{for } x = 0, \\ 1/x, & \text{for } 0 < x \le 1. \end{cases}$$
Evidently $f$ is not bounded on $[0, 1]$ but then $f$ is not continuous at the
point $x = 0$.
Taking $f(x) = 1/x$ for $x \in (0, 1]$, we see that again $f$ is not bounded on
$(0, 1]$, but then $(0, 1]$ is not a closed interval.
Let $f(x) = x$ for $x \in [0, \infty)$. Again, $f$ is not bounded on the interval $[0, \infty)$
but this interval is not bounded.
We have seen that a continuous function on a closed interval is bounded.
The next theorem tells us that it attains its bounds.
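Theorem 5.13. Suppose that $f$ is continuous on the closed interval $[a, b]$.
Then there is some $\alpha \in [a, b]$ and $\beta \in [a, b]$ such that $f(\alpha) \le f(x) \le f(\beta)$
for all $x \in [a, b]$. In other words, if $\operatorname{ran} f = \{f(x) : x \in [a, b]\}$ is the range
of $f$, then $f(\alpha) = \inf \operatorname{ran} f = \min \operatorname{ran} f$ and $f(\beta) = \sup \operatorname{ran} f = \max \operatorname{ran} f$.
Proof. We have seen that $f$ is bounded. Let $m = \inf \operatorname{ran} f$ and $M =
\sup \operatorname{ran} f$. By definition of the supremum, there is some sequence $(y_n)$ in
$\operatorname{ran} f$ such that $y_n \to M$ as $n \to \infty$. Since $y_n \in \operatorname{ran} f$, there is some
$x_n \in [a, b]$ such that $y_n = f(x_n)$. By the Bolzano-Weierstrass Theorem, $(x_n)$
has a convergent subsequence $(x_{n_k})_{k \in \mathbb{N}}$. Let $\beta = \lim_{k \to \infty} x_{n_k}$. Then $\beta \in [a, b]$.
Since $f$ is continuous on $[a, b]$, it follows that $f(x_{n_k}) \to f(\beta)$ as $k \to \infty$.
But $f(x_{n_k}) = y_{n_k}$ and $(y_{n_k})_{k \in \mathbb{N}}$ is a subsequence of the convergent sequence
$(y_n)$. Therefore $(y_{n_k})_{k \in \mathbb{N}}$ converges to the same limit, that is, $y_{n_k} \to M$ as
$k \to \infty$. Since $y_{n_k} = f(x_{n_k}) \to f(\beta)$ as $k \to \infty$, we deduce that $M = f(\beta)$.
That is, $\sup \operatorname{ran} f = f(\beta)$ and so $f(x) \le f(\beta)$ for all $x \in [a, b]$.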
We can argue in a similar way to show that there is some $\alpha \in [a, b]$ such
that $m = f(\alpha)$. However, we can draw the same conclusion using the above
result as follows. Note that if $g = -f$, then $g$ is continuous on the interval
$[a, b]$ and $\sup \operatorname{ran} g = -m$. By the argument above, there is some $\alpha \in [a, b]$
such that $-m = g(\alpha)$. This gives the desired result that $m = f(\alpha)$.
Alternative Proof. We know that $f$ is bounded. Let $M = \sup \operatorname{ran} f$. To
show that $f$ achieves its least upper bound $M$, we suppose not and obtain a
contradiction. Since $M$ is an upper bound and is not achieved by $f$, we must
have that $f(x) < M$ for all $x \in [a, b]$. In particular, $M - f$ is continuous and
strictly positive on $[a, b]$. It follows that $h = 1/(M - f)$ is also continuous
and positive on $[a, b]$. But then $h$ is bounded on $[a, b]$ and so there is some
constant $K$ such that $0 < h \le K$ on $[a, b]$, that is,
$$0 < \frac{1}{M - f} \le K.$$
Hence $f \le M - 1/K$, which says that $M - 1/K$ is an upper bound for $f$ on
$[a, b]$. But then this contradicts the fact that $M$ is the least upper bound for
$f$ on $[a, b]$. We conclude that $f$ achieves this bound, i.e., there is $\beta \in [a, b]$
such that $f(\beta) = M = \sup \operatorname{ran} f$.
In a similar way, if $f$ does not achieve its greatest lower bound, $m$, then
$f - m$ is continuous and strictly positive on $[a, b]$. Hence there is $L$ such
that
$$0 < \frac{1}{f - m} \le L$$
on $[a, b]$. Hence $m + 1/L \le f$ and $m + 1/L$ is a lower bound for $f$ on $[a, b]$.
This contradicts the fact that $m$ is the greatest lower bound for $f$ on $[a, b]$
and we can conclude that $f$ does achieve its greatest lower bound, that is,
there is $\alpha \in [a, b]$ such that $f(\alpha) = m$.
Theorem 5.14 (Intermediate-Value Theorem). Any real-valued function $f$
continuous on the interval $[a, b]$ assumes all values between $f(a)$ and $f(b)$.
In other words, if $\gamma$ lies between the values $f(a)$ and $f(b)$, then there is
some $s$ with $a \le s \le b$ such that $f(s) = \gamma$.
Proof. Suppose $f$ is continuous on $[a, b]$ and let $\gamma$ be any value between $f(a)$
and $f(b)$. If $\gamma = f(a)$, take $s = a$ and if $\gamma = f(b)$ take $s = b$.
Suppose that $f(a) < f(b)$ and let $f(a) < \gamma < f(b)$. Let $A$ be the set
$A = \{x \in [a, b] : f(x) < \gamma\}$. Then $a \in A$ and so $A$ is a non-empty subset
of the bounded interval $[a, b]$. Hence $A$ is bounded and so has a least upper
bound, $s$, say. We shall show that $f(s) = \gamma$.
Since $s = \operatorname{lub} A$, there is some sequence $(a_n)$ in $A$ such that $a_n \to s$. But
$A \subseteq [a, b]$ and so $a \le a_n \le b$ and it follows that $a \le s \le b$. Furthermore,
by the continuity of $f$ at $s$, it follows that $f(a_n) \to f(s)$. However, $a_n \in A$
and so $f(a_n) < \gamma$ for each $n$ and it follows that $f(s) \le \gamma$. Since, in addition,
$\gamma < f(b)$, we see that $s \ne b$ and so we must have $a \le s < b$.
Let $(t_n)$ be any sequence in $(s, b)$ such that $t_n \to s$. Since $t_n \in [a, b]$
and $t_n > s$, it must be the case that $t_n \notin A$, that is, $f(t_n) \ge \gamma$. Now, $f$
is continuous at $s$ and so $f(t_n) \to f(s)$, which implies that $f(s) \ge \gamma$. We
deduce that $f(s) = \gamma$, as required.
Now suppose that $f(a) > \gamma > f(b)$. Set $g(x) = -f(x)$. Then we have
that $g(a) < -\gamma < g(b)$ and applying the above result to $g$, we can say that
there is $s \in [a, b]$ such that $g(s) = -\gamma$, that is, $f(s) = \gamma$ and the proof is
complete.
Corollary 5.15. Suppose that $f$ is continuous on $[a, b]$. Then $\operatorname{ran} f$, the range
of $f$, is a closed interval $[m, M]$.
Proof. We know that $f$ is bounded and that $f$ achieves its bounds, that is,
there is $\alpha \in [a, b]$ and $\beta \in [a, b]$ such that
$$m = \inf \operatorname{ran} f = f(\alpha) \le f(x) \le M = \sup \operatorname{ran} f = f(\beta)$$
for all $x \in [a, b]$. Evidently, $\operatorname{ran} f \subseteq [m, M]$.
Let $c$ obey $m \le c \le M$. By the Intermediate-Value Theorem, there is
some $s$ between $\alpha$ and $\beta$ such that $f(s) = c$. In particular, $c \in \operatorname{ran} f$ and so
we conclude that $\operatorname{ran} f = [m, M]$.
Example 5.16. $f(x) = x^6 + 3x^2 - 1$ has a zero inside the interval $[0, 1]$.
To see this, we simply notice that $f(0) = -1$ and $f(1) = 3$. Since $f$ is
continuous on $\mathbb{R}$, it is continuous on $[0, 1]$ and so, by the Intermediate-Value
Theorem, $f$ assumes every value between $-1$ and 3 over the interval $[0, 1]$.
In particular, there is some $s \in [0, 1]$ such that $f(s) = 0$, as claimed. Of
course, this argument does not tell us whether such $s$ is unique or not. (In
fact, it is unique, because $f$ is strictly increasing on $[0, \infty)$ and so cannot take any
value twice on $[0, \infty)$. A moment's reflection reveals that $f(x) \ge f(0) = -1$,
$f$ is not bounded from above and $f(x) = f(-x)$. Therefore $f$ assumes every
value in the range $(-1, \infty)$ exactly twice and assumes the value $-1$ at the
single point $x = 0$.)
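The Intermediate-Value Theorem only asserts that a zero exists, but repeated halving of the interval locates it to any desired accuracy. The following Python sketch uses bisection, a standard technique that is not part of the text; the function name `bisect` and the tolerance `tol` are illustrative assumptions.

```python
def f(x):
    """The polynomial of Example 5.16."""
    return x**6 + 3 * x**2 - 1

def bisect(func, a, b, tol=1e-12):
    """Approximate a zero of func in [a, b], assuming func(a) < 0 < func(b)."""
    while b - a > tol:
        mid = (a + b) / 2
        if func(mid) < 0:
            a = mid      # the sign change lies in [mid, b]
        else:
            b = mid      # the sign change lies in [a, mid]
    return (a + b) / 2

s = bisect(f, 0.0, 1.0)
print(s, f(s))
```

At every step the function is negative at the left end and non-negative at the right end, so the Intermediate-Value Theorem guarantees a zero in each of the shrinking intervals.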
Example 5.17 (Thomae's function). We wish to exhibit a function which is
continuous at each irrational point in $[0, 1]$ but is not continuous at any
rational point in $[0, 1]$. Such a function was constructed by Thomae in 1875.
Any rational number $x$ may be written as $x = p/q$ where we may assume
that $p$ and $q$ are coprime and that $p \in \mathbb{Z}$ and $q \in \mathbb{N}$. This done, we define
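$\varphi : \mathbb{Q} \to \mathbb{R}$ by setting $\varphi(x) = 1/q$ where $x = p/q$. For example,
$\varphi(x) = 1$ for $x = 1$,
$\varphi(x) = \tfrac12$ for $x = \tfrac12$,
$\varphi(x) = \tfrac13$ for $x = \tfrac13, \tfrac23$,
$\varphi(x) = \tfrac14$ for $x = \tfrac14, \tfrac34$,
...
$\varphi(x) = \tfrac{1}{11}$ for $x = \tfrac{1}{11}, \tfrac{2}{11}, \dots, \tfrac{10}{11}$,
and so on.
Suppose $x \in \mathbb{Q}$ obeys $0 < x < 1$ and that $\varphi(x) = 1/q$. Then $x$ must be of
the form $x = p/q$ for some $p \in \mathbb{N}$ with $1 \le p \le q - 1$. In particular, for any
given $q \in \mathbb{N}$, $\{x \in \mathbb{Q} : 0 < x < 1 \text{ and } \varphi(x) = 1/q\}$ is a finite set of rational
numbers.
Next, we define $f : [0, 1] \to \mathbb{R}$ with the help of $\varphi$ as follows.
$$f(x) = \begin{cases} 1, & x = 0, \\ \varphi(x), & x \in \mathbb{Q} \cap [0, 1], \\ 0, & x \notin \mathbb{Q} \cap [0, 1]. \end{cases}$$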
Claim: $f$ is discontinuous at every rational in $[0, 1]$.
Proof. First we note that $f(0) = 1$ and that $f(x) = 1/q$ when $x$ has the
form $p/q$ (with $p$, $q$ coprime). In any event, $f(x) > 0$ for any given rational
$x$ in $[0, 1]$. Now let $r \in \mathbb{Q} \cap [0, 1]$ be given and let $(x_n)$ be any sequence of
irrationals in $[0, 1]$ which converges to $r$. (For example, if $r \ne 0$ we could let
$x_n = r\bigl(1 - 1/(n\sqrt{2})\bigr)$ but otherwise let $x_n = 1/(n\sqrt{2})$.) Then $f(x_n) = 0$ for
every $n$ so it cannot be true that $f(x_n) \to f(r)$ (because $f(r) > 0$), that is,
$f$ fails to be continuous at $r$, as claimed.
Claim: $f$ is continuous at every irrational in $[0, 1]$.
Proof. Let $x_0$ be any given irrational number with $0 < x_0 < 1$. Then
$f(x_0) = 0$.
Let $\varepsilon > 0$ be given.
We must show that there is some $\delta > 0$ such that
$$x \in [0, 1] \text{ and } |x - x_0| < \delta \implies |f(x) - f(x_0)| = |f(x) - 0| = |f(x)| < \varepsilon. \qquad (*)$$
Now $f(0) = f(1) = 1$ and so the conclusion of $(*)$ must fail for $x = 0$ or $x = 1$ if $\varepsilon < 1$.
Furthermore, it fails if $x = p/q$ ($p$, $q$ coprime) and $\varphi(x) = 1/q \ge \varepsilon$, that is,
$q \le 1/\varepsilon$. In other words, the conclusion of $(*)$ will fail if $x = 0$, $x = 1$ or else $x \in \mathbb{Q} \cap [0, 1]$ and
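$\varphi(x) = 1/q$ where $q \le 1/\varepsilon$. However, there are only finitely-many numbers
$q \in \mathbb{N}$ obeying $q \le 1/\varepsilon$ and so the set
$$A = \{r \in \mathbb{Q} \cap [0, 1] : r = 0 \text{ or } \varphi(r) \ge \varepsilon\}$$
is finite. Write $A = \{r_1, \dots, r_m\}$.
Since $x_0 \notin \mathbb{Q}$, it follows that $x_0 \ne r_j$ for any $1 \le j \le m$. For each
$1 \le j \le m$, let $\delta_j = |x_0 - r_j|$ and let $\delta = \min\{\delta_j : 1 \le j \le m\}$. Then
$\delta > 0$ and if $x$ obeys $|x - x_0| < \delta$ it must be the case that $x \ne r_j$ for any
$1 \le j \le m$. It follows that if $x \in [0, 1]$ and obeys $|x - x_0| < \delta$, then either
$x \notin \mathbb{Q}$ and so $f(x) = 0$ or else $x \in \mathbb{Q}$ but $x \notin A$ and so $f(x) = \varphi(x) < \varepsilon$. In
any event, $(*)$ holds and so $f$ is continuous at $x_0$, as required.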
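The finite exceptional set and the resulting margin in the proof above can be computed explicitly. The Python sketch below is an illustration under stated assumptions: the names `thomae`, `exceptional_set` and `delta_at` are hypothetical, only rational arguments are handled exactly (via `fractions.Fraction`), and the irrational point is merely approximated by a floating-point number.

```python
from fractions import Fraction

def thomae(x):
    """Value of the function at a rational x in [0, 1]: 1/q for x = p/q in lowest terms."""
    x = Fraction(x)               # Fraction stores p/q with p, q coprime and q > 0
    return Fraction(1, x.denominator)

def exceptional_set(eps):
    """The finite set A: 0 together with all rationals r in [0, 1] whose value is >= eps."""
    q_max = int(1 / Fraction(eps))            # largest q with q <= 1/eps
    points = {Fraction(p, q) for q in range(1, q_max + 1) for p in range(q + 1)}
    return sorted(points | {Fraction(0)})

def delta_at(x0, eps):
    """Distance from x0 (a float standing in for an irrational) to the set A."""
    return min(abs(x0 - float(r)) for r in exceptional_set(eps))

x0 = 2 ** 0.5 / 2                 # floating-point stand-in for the irrational 1/sqrt(2)
eps = Fraction(1, 10)
print(len(exceptional_set(eps)), delta_at(x0, eps))
```

Any rational closer to `x0` than the returned distance lies outside the exceptional set, so its value is smaller than `eps`, exactly as in the argument above.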
Differentiability
We know from calculus that the slope of the tangent to the graph of a
function $f$ at some point is given by the so-called derivative at the point
in question. To find this slope, one considers the limiting behaviour of the
Newton quotient
$$\frac{f(a + h) - f(a)}{h}$$
as $h$ approaches 0. We wish to set this up formally.
Definition 5.18. We say that the function $f$ is differentiable at the point $a$
if $\lim_{h \to 0,\, h \ne 0} \frac{f(a+h) - f(a)}{h}$ exists, that is, if there is some $\lambda \in \mathbb{R}$ such that for
any $\varepsilon > 0$ there is some $\delta > 0$ such that
$$0 < |h| < \delta \implies \left| \frac{f(a + h) - f(a)}{h} - \lambda \right| < \varepsilon.$$
The real number $\lambda$ is called the derivative of $f$ at $a$ and is usually written
as $f'(a)$ or as $\frac{df}{dx}(a)$.
Remarks 5.19.
1. Note that the Newton quotient $\frac{f(a + h) - f(a)}{h}$ is not defined for $h = 0$
and clearly, it will only make any sense if both $f(a)$ and $f(a + h)$ are
defined. We shall take it to be part of the definition that this is true, at
least for suitably small values of $h$. That is, we assume that there is some
(possibly very small) open interval around $a$ of the form $(a - \rho, a + \rho)$
on which $f$ is defined. This means that if a function is defined only
on the integers $\mathbb{Z}$, say, then it will not make any sense to discuss its
differentiability.
2. We see immediately that if $f$ is constant, then $f(a + h) = f(a)$ for any
$h$ and so the Newton quotient is zero for all $h \ne 0$ and therefore $f$ is
indeed differentiable at $a$ with derivative $f'(a) = 0$.
3. Suppose that $f$ is differentiable at $a$ with derivative $f'(a)$. Let $\Phi_{f,a}$ be
the function given by
$$\Phi_{f,a}(x) = \begin{cases} \dfrac{f(x) - f(a)}{x - a}, & x \ne a, \\ f'(a), & x = a. \end{cases}$$
Then
$$\Phi_{f,a}(a + h) = \begin{cases} \dfrac{f(a + h) - f(a)}{h}, & h \ne 0, \\ f'(a), & h = 0. \end{cases}$$
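By definition of differentiability, for any given $\varepsilon > 0$ there is some $\delta > 0$
such that
$$0 < |h| < \delta \implies |\Phi_{f,a}(a + h) - f'(a)| < \varepsilon,$$
that is,
$$0 < |h| < \delta \implies |\Phi_{f,a}(a + h) - \Phi_{f,a}(a)| < \varepsilon. \qquad (*)$$
Now, $(*)$ is still valid if we allow $h = 0$ and so (with $x = a + h$), we see
that
$$|x - a| < \delta \implies |\Phi_{f,a}(x) - \Phi_{f,a}(a)| < \varepsilon.$$
In other words, the differentiability of $f$ at $a$ implies that $\Phi_{f,a}$ is continuous
at $x = a$.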
Example 5.20. Let $f(x) = \begin{cases} x^3, & x \ge 0, \\ x^2, & x < 0. \end{cases}$
What is $f'(x)$?
Consider the region $x > 0$. Here, $f(x) = x^3$ and so $f$ is differentiable
with derivative $3x^2$ for any $x > 0$. In the region $x < 0$, $f(x) = x^2$ and so
$f'(x) = 2x$ for any $x < 0$. What about $x = 0$? We must argue from first
principles. The Newton quotient (with $a = 0$) is
$$\frac{f(0 + h) - f(0)}{h} = \begin{cases} \dfrac{h^3 - 0}{h} = h^2 & \text{for } h > 0, \\[1ex] \dfrac{h^2 - 0}{h} = h & \text{for } h < 0, \end{cases}$$
which tends to 0 as $h \to 0$.
Hence $f$ is differentiable at $x = 0$ with derivative $f'(0) = 0$.
Proposition 5.21. If $f$ is differentiable at $a$, then $f$ is continuous at $a$.
Proof. The idea is straightforward. For $h \ne 0$, we can write
$$f(a + h) - f(a) = \left( \frac{f(a + h) - f(a)}{h} \right) h.$$
The first term on the right hand side approaches $f'(a)$ as $h \to 0$ and so the
whole right hand side should approach zero as $h \to 0$. Looking at the left
hand side, this means that $f(a + h)$ approaches $f(a)$ as $h \to 0$. Formally,
we have
$$f(x) = f(a) + \Phi_{f,a}(x)\,(x - a),$$
where $\Phi_{f,a}(x)$ is the Newton quotient $(f(x) - f(a))/(x - a)$ for $x \ne a$ and $\Phi_{f,a}(a) = f'(a)$.
The second term on the right hand side is the product of the two functions $\Phi_{f,a}(x)$ and $(x - a)$,
each being continuous at $x = a$ and so the same is true of their product.
Therefore the left hand side is continuous at $x = a$, as required.
Example 5.22. The converse to Proposition 5.21 is false. As an example,
consider $f(x) = |x|$ for $x \in \mathbb{R}$. Then $f$ is continuous at every $x \in \mathbb{R}$.
However, $f$ is not differentiable at $x = 0$. Indeed,
$$\frac{f(0 + h) - f(0)}{h} = \frac{|h + 0| - |0|}{h} = \frac{|h|}{h} = \begin{cases} 1, & \text{if } h > 0, \\ -1, & \text{if } h < 0, \end{cases}$$
so the Newton quotient does not have a limit as $h \to 0$ (with $h \ne 0$) and
consequently $f$ is not differentiable at $x = 0$.
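The contrast between Example 5.20 and Example 5.22 shows up clearly if the Newton quotients at 0 are tabulated for small positive and negative h. The Python sketch below is purely illustrative; the names `newton_quotient` and `piecewise` are hypothetical.

```python
def newton_quotient(f, a, h):
    """Return (f(a + h) - f(a)) / h for h != 0."""
    return (f(a + h) - f(a)) / h

def piecewise(x):
    """The function of Example 5.20: x**3 for x >= 0 and x**2 for x < 0."""
    return x**3 if x >= 0 else x**2

for h in (0.1, 0.01, 0.001):
    print(
        h,
        newton_quotient(piecewise, 0.0, h),   # equals h**2 up to rounding
        newton_quotient(piecewise, 0.0, -h),  # equals -h up to rounding
        newton_quotient(abs, 0.0, h),         # always 1
        newton_quotient(abs, 0.0, -h),        # always -1
    )
```

For the piecewise function both one-sided quotients shrink to 0, whereas for the modulus function they stay at 1 and -1, so no single limit can exist.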
The following are familiar and very important rules.
Proposition 5.23. Suppose that $f$ and $g$ are differentiable at $x_0$.
(i) For any $\alpha \in \mathbb{R}$, $\alpha f$ is differentiable at $x_0$ and $(\alpha f)'(x_0) = \alpha f'(x_0)$.
(ii) The sum $f + g$ is differentiable at $x_0$ and
$$(f + g)'(x_0) = f'(x_0) + g'(x_0).$$
(iii) The product $fg$ is differentiable at $x_0$ and
$$(fg)'(x_0) = f'(x_0)\,g(x_0) + f(x_0)\,g'(x_0).$$
(iv) Suppose that $f \ne 0$. Then $1/f$ is differentiable at $x_0$ and
$$\left( \frac{1}{f} \right)'(x_0) = \frac{-f'(x_0)}{(f(x_0))^2}.$$
Proof. In the following, $h$ is small but $h \ne 0$.
(i) We have
$$\frac{(\alpha f)(x_0 + h) - (\alpha f)(x_0)}{h} = \alpha\,\frac{f(x_0 + h) - f(x_0)}{h} = \alpha\,\Phi_{f,x_0}(x_0 + h) \to \alpha f'(x_0)$$
as $h \to 0$.
(ii) We have
$$\frac{(f + g)(x_0 + h) - (f + g)(x_0)}{h} = \frac{f(x_0 + h) + g(x_0 + h) - f(x_0) - g(x_0)}{h}
= \frac{f(x_0 + h) - f(x_0)}{h} + \frac{g(x_0 + h) - g(x_0)}{h} \to f'(x_0) + g'(x_0)$$
as $h \to 0$.
(iii) We have
$$\frac{(fg)(x_0 + h) - (fg)(x_0)}{h} = \frac{f(x_0 + h)\,g(x_0 + h) - f(x_0)\,g(x_0)}{h}
= \frac{f(x_0 + h) - f(x_0)}{h}\,g(x_0 + h) + \frac{g(x_0 + h) - g(x_0)}{h}\,f(x_0)
\to f'(x_0)\,g(x_0) + g'(x_0)\,f(x_0)$$
as $h \to 0$, since $g$ is continuous at $x_0$.
(iv) We have
$$\frac{1/f(x_0 + h) - 1/f(x_0)}{h} = \frac{1}{h} \left( \frac{1}{f(x_0 + h)} - \frac{1}{f(x_0)} \right)
= \frac{1}{h} \left( \frac{f(x_0) - f(x_0 + h)}{f(x_0 + h)\,f(x_0)} \right)
= \frac{-\Phi_{f,x_0}(x_0 + h)}{f(x_0 + h)\,f(x_0)} \to \frac{-f'(x_0)}{(f(x_0))^2}$$
as $h \to 0$ since $f$ is continuous at $x_0$.
Recall that $f \circ g$ denotes the composition $x \mapsto f(g(x))$ (function of a
function). Of course, for this to be well-defined the range of $g$ must be
contained in the domain of definition of $f$. In the following, we assume that
this is satisfied.
Theorem 5.24 (Chain Rule). Suppose that $g$ is differentiable at $x_0$ and that
$f$ is differentiable at $v_0 = g(x_0)$. Then the composition $f \circ g$ is differentiable
at $x_0$ and
$$(f \circ g)'(x_0) = f'(g(x_0))\,g'(x_0).$$
Proof. Suppose that $h$ is small and that $h \ne 0$. Let $v_0 = g(x_0)$ and put
$\kappa = g(x_0 + h) - g(x_0)$ so that $g(x_0 + h) = v_0 + \kappa$. Then
$$\frac{(f \circ g)(x_0 + h) - (f \circ g)(x_0)}{h} = \frac{f(g(x_0 + h)) - f(g(x_0))}{h}
= \frac{f(v_0 + \kappa) - f(v_0)}{h}
= \frac{1}{h}\,\Phi_{f,v_0}(v_0 + \kappa)\,\kappa \quad (\text{even if } \kappa = 0)$$
$$= \Phi_{f,v_0}(v_0 + \kappa) \left( \frac{g(x_0 + h) - g(x_0)}{h} \right)
= \Phi_{f,v_0}(v_0 + \kappa)\,\Phi_{g,x_0}(x_0 + h).$$
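Now, $g(x_0 + h) \to g(x_0)$ as $h \to 0$ because $g$ is continuous at $x_0$. In other
words, $\kappa = g(x_0 + h) - g(x_0) \to 0$ as $h \to 0$. Since $\Phi_{f,v_0}$ is continuous at $v_0$
and $\Phi_{g,x_0}$ is continuous at $x_0$, it follows that
$$\Phi_{f,v_0}(v_0 + \kappa)\,\Phi_{g,x_0}(x_0 + h) \to \Phi_{f,v_0}(v_0)\,\Phi_{g,x_0}(x_0) = f'(v_0)\,g'(x_0) = f'(g(x_0))\,g'(x_0)$$
as $h \to 0$ and the result follows.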
Imagine a function $f(x)$ on the interval $[0, 1]$, say, which has the property
that $f(0) = f(1)$. Can we draw any conclusions about the behaviour of $f(x)$
for $x$ between 0 and 1? It seems clear that either $f$ is constant on $[0, 1]$ or
else goes up and/or down but in any event must have a turning point.
We know from calculus that this should demand that $f'$ be zero somewhere.
However, it is clear that $f$ cannot be entirely arbitrary for this to be true. For
example, suppose that $f(0) = 0 = f(1)$ and that $f(x) = 5x$ for $0 < x < 1$.
Evidently $f'$ is never zero. In fact, $f'(x) = 5$ for $0 < x < 1$. We note that $f$
is not continuous at $x = 1$.
As another example, consider $f(x) = 1 - |x|$ for $x \in [-1, 1]$. We see that
$f(-1) = 0 = f(1)$ but is it true that $f'$ is zero for some $x$ between $-1$ and 1?
No, it is not. We see that $f'(x) = 1$ for $-1 < x < 0$ and that $f'(x) = -1$
for $0 < x < 1$ and $f$ is not differentiable at $x = 0$. In this example, $f$ is
continuous on $[-1, 1]$ but fails to be differentiable on $(-1, 1)$.
If we impose suitable continuity and differentiability hypotheses, then
what we want will be true.
Theorem 5.25 (Rolle's Theorem). Suppose that $f$ is continuous on the closed
interval $[a, b]$ and is differentiable in the open interval $(a, b)$. Suppose further
that $f(a) = f(b)$. Then there is some $\xi \in (a, b)$ such that $f'(\xi) = 0$. (Note
that $\xi$ need not be unique.)
Proof. Since $f$ is continuous on $[a, b]$, it follows that $f$ is bounded and attains
its bounds, by Theorem 5.13. Let $m = \inf\{f(x) : x \in [a, b]\}$ and let
$M = \sup\{f(x) : x \in [a, b]\}$, so that
$$m \le f(x) \le M, \quad \text{for all } x \in [a, b].$$
If $m = M$, then $f$ is constant on $[a, b]$ and this means that $f'(x) = 0$ for all
$x \in (a, b)$. In this case, any $\xi \in (a, b)$ will do.
Suppose now that $m \ne M$, so that $m < M$. Since $f(a) = f(b)$ at least
one of $m$ or $M$ must be different from this common value $f(a) = f(b)$.
Suppose that $M \ne f(a)$ ($= f(b)$). As noted above, by Theorem 5.13,
there is some $\xi \in [a, b]$ such that $f(\xi) = M$. Now, $M \ne f(a)$ and $M \ne f(b)$
and so $\xi \ne a$ and $\xi \ne b$. It follows that $\xi$ belongs to the open interval $(a, b)$.
We shall show that $f'(\xi) = 0$. To see this, we note that $f(x) \le M = f(\xi)$
for any $x \in [a, b]$ and so (putting $x = \xi + h$) it follows that $f(\xi + h) - f(\xi) \le 0$
provided $|h|$ is small enough to ensure that $\xi + h \in [a, b]$. Hence
$$\frac{f(\xi + h) - f(\xi)}{h} \le 0 \quad \text{for } h > 0 \text{ and small} \qquad (*)$$
and
$$\frac{f(\xi + h) - f(\xi)}{h} \ge 0 \quad \text{for } h < 0 \text{ and small.} \qquad (**)$$
But the quotient in $(*)$ approaches $f'(\xi)$ as $h \to 0$, which implies that $f'(\xi) \le 0$. On the other
hand, the quotient in $(**)$ approaches $f'(\xi)$ as $h \to 0$ and so $f'(\xi) \ge 0$. Putting these two
results together, we see that it must be the case that $f'(\xi) = 0$, as required.
It remains to consider the case when $M = f(a)$. This must require that
$m < f(a)$ ($= f(b)$). We proceed now just as before to deduce that there
is some $\xi \in (a, b)$ such that $f(\xi) = m$ and so $(*)$ and $(**)$ hold but with
the inequalities reversed. However, the conclusion is the same, namely that
$f'(\xi) = 0$.
Theorem 5.26 (Mean Value Theorem). Suppose that $f$ is continuous on the
closed interval $[a, b]$ and differentiable on the open interval $(a, b)$. Then there
is some $\xi \in (a, b)$ such that
$$f'(\xi) = \frac{f(b) - f(a)}{b - a}.$$
Proof. Let $y = \ell(x) = mx + c$ be the straight line passing through the pair
of points $(a, f(a))$ and $(b, f(b))$. Then the slope $m$ is equal to the ratio
$(f(b) - f(a))/(b - a)$.
Let $g(x) = f(x) - \ell(x)$. Evidently, $g$ is continuous on $[a, b]$ and differentiable
on $(a, b)$ (because $\ell$ is). Furthermore, since $\ell(a) = f(a)$ and
$\ell(b) = f(b)$, by construction, we find that $g(a) = 0 = g(b)$. By Rolle's Theorem,
Theorem 5.25, applied to $g$, there is some $\xi \in (a, b)$ such that $g'(\xi) = 0$.
However, $g'(x) = f'(x) - m$ for any $x \in (a, b)$ and so
$$f'(\xi) = m = \frac{f(b) - f(a)}{b - a}$$
and the proof is complete.
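The Mean Value Theorem does not say where the point in question lies, but for a concrete function it can be located numerically. The sketch below takes the hypothetical example f(x) = x^3 on [0, 2], which is not from the text, and uses bisection on the derivative; this works here only because the derivative 3x^2 is increasing on the interval.

```python
import math

def f(x):
    return x**3

def f_prime(x):
    return 3 * x**2

a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)   # slope of the chord through (a, f(a)) and (b, f(b))

# f_prime is increasing on [a, b] with f_prime(a) < slope < f_prime(b),
# so bisection on f_prime(x) - slope homes in on the required point.
lo, hi = a, b
while hi - lo > 1e-12:
    mid = (lo + hi) / 2
    if f_prime(mid) < slope:
        lo = mid
    else:
        hi = mid
xi = (lo + hi) / 2

print(xi, f_prime(xi), slope)
```

In this example the point can also be found by hand, since 3x^2 = 4 gives x = 2/sqrt(3), which agrees with the numerical value.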
We know that a function which is constant on an open interval is differentiable
and that its derivative is zero. The converse is true (so no surprise
there then).
Corollary 5.27. Suppose that $f$ is differentiable on the open interval $(a, b)$
and that $f'(x) = 0$ for all $x \in (a, b)$. Then $f$ is constant on $(a, b)$.
Proof. Let $\alpha$ and $\beta$ be any pair of points in $(a, b)$. We shall show that
$f(\alpha) = f(\beta)$. By relabelling, if necessary, we may suppose that $\alpha < \beta$. By
hypothesis, $f$ is differentiable at each point in the closed interval $[\alpha, \beta]$ and
so is also continuous there, by Proposition 5.21. $f$ obeys the hypotheses
of the Mean Value Theorem on $[\alpha, \beta]$ and so we can say that there is some
$\xi \in (\alpha, \beta)$ such that
$$f'(\xi) = \frac{f(\beta) - f(\alpha)}{\beta - \alpha}.$$
However, $f'$ vanishes on $(a, b)$ and so $f'(\xi) = 0$, which means that we must
have $f(\alpha) = f(\beta)$ and the result follows.
Remark 5.28. The Mean Value Theorem can sometimes be useful for obtaining
inequalities. For example, setting $f(x) = \sin x$ and assuming standard
properties of the trigonometric functions, we can apply the Mean Value
Theorem to $f$ on the interval $[0, x]$ for $x > 0$ to find that
$$f'(\xi) = \frac{f(x) - f(0)}{x - 0} \quad \text{or} \quad \cos \xi = \frac{\sin x}{x}$$
for some $\xi \in (0, x)$. However, $\cos \xi \le 1$ for all $\xi$ and so we find that $\sin x \le x$
for all $x > 0$.
Similarly, applying the Mean Value Theorem to $f(x) = \ln(1 + x)$ on the
interval $[0, x]$, we find that
$$f'(\xi) = \frac{f(x) - f(0)}{x - 0} \quad \text{or} \quad \frac{1}{1 + \xi} = \frac{\ln(1 + x)}{x}$$
for some $\xi \in (0, x)$. But then $1/(1 + \xi) < 1$ and we find that $\ln(1 + x) < x$
for any $x > 0$.
These inequalities could also have easily been obtained from the fact
that the integral of a positive function is positive. Indeed,
$$x - \sin x = \int_0^x (1 - \cos t)\,dt \ge 0.$$
In the same way,
$$x - \ln(1 + x) = \int_0^x \left( 1 - \frac{1}{1 + t} \right) dt \ge 0.$$
In fact, one can show that both integrals are strictly positive if $x > 0$ so this
last method gives the strict inequalities $\sin x < x$ and $\ln(1 + x) < x$ for all
$x > 0$. (In this connection, note that if $\ln(1 + x) = x$, then $1 + x = e^x$. This
is not possible for any $x > 0$ as is seen from the series expansion for $e^x$.)
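As a quick sanity check, the two strict inequalities of Remark 5.28 can be spot-checked in floating-point arithmetic. The Python sketch below is illustrative only: the helper name `inequalities_hold` and the sample points are arbitrary choices, and for extremely small x rounding would make sin x and x indistinguishable, so such values are deliberately avoided.

```python
import math

def inequalities_hold(x):
    """Check sin x < x and ln(1 + x) < x at a single point x > 0."""
    return math.sin(x) < x and math.log1p(x) < x   # log1p(x) computes ln(1 + x)

samples = [10.0 ** k for k in range(-6, 4)]        # 1e-6, 1e-5, ..., 1e3
print(all(inequalities_hold(x) for x in samples))
```

A finite check of this kind proves nothing, of course; the Mean Value Theorem argument above is what establishes the inequalities for every x > 0.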
Suppose that $f$ and $g$ are continuous on $[a, b]$, differentiable on $(a, b)$ and
that $g'$ is never zero on $(a, b)$. The Mean Value Theorem applied to $f$ and $g$
tells us that there is some $\xi$ and $\eta$ in $(a, b)$ such that
$$\frac{f(b) - f(a)}{b - a} = f'(\xi) \quad \text{and} \quad \frac{g(b) - g(a)}{b - a} = g'(\eta).$$
Dividing (and noting that $g(b) - g(a) \ne 0$ since $g'(\eta) \ne 0$, by hypothesis),
gives
$$\frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(\xi)}{g'(\eta)}.$$
It is possible to do a little better.
Theorem 5.29 (Cauchy's Mean Value Theorem). Suppose that $f$ and $g$ are
continuous on $[a, b]$ and differentiable on $(a, b)$. Suppose further that $g'$ is
never zero on $(a, b)$. Then there is some $\xi \in (a, b)$ such that
$$\frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(\xi)}{g'(\xi)}.$$
Proof. First, we observe that if $g(a) = g(b)$ then Rolle's Theorem tells us
that $g'(\eta) = 0$ for some $\eta \in (a, b)$. However, $g'$ has no zeros on $(a, b)$, by
hypothesis, and so it follows as noted above that $g(a) \ne g(b)$. Set
$$\psi(x) = \bigl( g(b) - g(a) \bigr) f(x) - \bigl( f(b) - f(a) \bigr) g(x).$$
Then
$$\psi(a) = g(b)\,f(a) - f(b)\,g(a) = \psi(b)$$
and $\psi$ satisfies the hypotheses of Rolle's Theorem. Hence there is some
$\xi \in (a, b)$ such that $\psi'(\xi) = 0$, that is,
$$\bigl( g(b) - g(a) \bigr) f'(\xi) - \bigl( f(b) - f(a) \bigr) g'(\xi) = 0$$
or
$$\frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(\xi)}{g'(\xi)},$$
as required.
Remark 5.30. Notice that interchanging $a$ and $b$ does not affect the left hand side of the above equality. This means that we can slightly rephrase Cauchy's Mean Value Theorem to say that for any $a \ne b$ there is some $\xi$ between $a$ and $b$ such that
\[ \frac{f(b) - f(a)}{g(b) - g(a)} = \frac{f'(\xi)}{g'(\xi)} , \]
regardless of whether $a < b$ or $a > b$.
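To make the theorem concrete, here is a small sketch that locates the point $\xi$ numerically for one illustrative pair, $f(x) = x^3$ and $g(x) = x^2$ on $[1, 2]$ (these functions, and the helper `bisect_root`, are choices made for this example only). It applies bisection to the derivative of the auxiliary function $\varphi$ used in the proof; by hand, $3\xi/2 = 7/3$ gives $\xi = 14/9$.

```python
def bisect_root(func, lo, hi, steps=80):
    """Locate a root of func in [lo, hi] by bisection, assuming a sign change."""
    f_lo = func(lo)
    for _ in range(steps):
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

a, b = 1.0, 2.0
f = lambda x: x ** 3
g = lambda x: x ** 2
f_prime = lambda x: 3 * x ** 2
g_prime = lambda x: 2 * x

def phi_prime(x):
    """Derivative of phi(x) = (g(b) - g(a)) f(x) - (f(b) - f(a)) g(x)."""
    return (g(b) - g(a)) * f_prime(x) - (f(b) - f(a)) * g_prime(x)

xi = bisect_root(phi_prime, a, b)
print(xi, f_prime(xi) / g_prime(xi), (f(b) - f(a)) / (g(b) - g(a)))
```

Bisection works here only because $\varphi'$ happens to change sign on $[1, 2]$; in general the theorem guarantees existence of $\xi$ but gives no recipe for finding it.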
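Taylor's Theorem
It is convenient to let $f^{(k)}$ denote the $k$th derivative of $f$ (whenever it exists). Now, if $k > j$, then $\dfrac{d^k (x^j)}{dx^k} = 0$, whereas if $k \le j$, then we see that
\[ \frac{d^k (x^j)}{dx^k} = j(j - 1) \cdots (j - (k - 1))\, x^{j-k} . \]
When $k < j$ this vanishes at $x = 0$ (and when $k = j$ it equals $k!$), so we see that
\[ \left. \frac{d^k (x^j)}{dx^k} \right|_{x=0} = 0 \]
for any $k, j \in \mathbb{N}$ with $k \ne j$.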
Consider the polynomial $p(x) = \alpha_0 + \alpha_1 x + \alpha_2 x^2 + \dots + \alpha_m x^m$. Taking derivatives and setting $x = 0$, we find $p(0) = \alpha_0$, $p'(0) = \alpha_1$, $p^{(2)}(0) = 2\alpha_2$, $p^{(3)}(0) = 3!\,\alpha_3$. In general,
\[ p^{(k)}(0) = k!\,\alpha_k . \]
Now consider some general function $f(x)$ and define
\[ a_0 = f(0), \quad a_1 = f'(0), \quad a_2 = \frac{1}{2} f^{(2)}(0), \quad \dots, \quad a_k = \frac{1}{k!} f^{(k)}(0), \quad \dots, \text{ etc.} \]
Let
\[ P_{n-1}(x) = a_0 + a_1 x + a_2 x^2 + \dots + a_{n-1} x^{n-1}, \qquad R_n(x) = f(x) - P_{n-1}(x) . \]
If $f(x)$ is a polynomial of degree $n - 1$, then $f(x) = P_{n-1}(x)$ and $R_n(x) = 0$. So, in general, we can think of $P_{n-1}(x)$ as a polynomial approximation to $f(x)$ and $R_n(x)$ as the remainder. The smaller $R_n(x)$ is, the closer $f(x)$ is to a polynomial. The question is, what can be said about $R_n(x)$? This is the content of Taylor's Theorem.
To begin with, we notice that for $k \le n - 1$,
\[ R_n^{(k)}(0) = f^{(k)}(0) - P_{n-1}^{(k)}(0) = f^{(k)}(0) - k!\, a_k = 0 , \]
by our construction of the $a_k$'s. We will use this in the following discussion.
Now, for $x \ne 0$, we apply Cauchy's Mean Value Theorem to the pair of functions $R_n(t)$ and $g_n(t) = t^n$, to write
\[ \frac{R_n(x) - R_n(0)}{g_n(x) - g_n(0)} = \frac{R_n'(\xi)}{g_n'(\xi)} \]
for some $\xi$ lying between $0$ and $x$. (It does not matter whether $x > 0$ or $x < 0$.) Now, any such $\xi$ can be expressed in the form $\xi = \theta_1 x$ for some $0 < \theta_1 < 1$. Hence
\[ \frac{R_n(x)}{g_n(x)} = \frac{R_n(x) - R_n(0)}{g_n(x) - g_n(0)} = \frac{R_n'(\theta_1 x)}{g_n'(\theta_1 x)} \]
for some $0 < \theta_1 < 1$, since both $R_n(0) = 0$ and $g_n(0) = 0$.
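We repeat this argument applied successively to $R_n^{(k)}(t)$ and $g_n^{(k)}(t)$, and use the facts that $R_n^{(k)}(0) = 0$ and $g_n^{(k)}(0) = 0$ for $k \le n - 1$, to deduce that
\[
\begin{aligned}
\frac{R_n(x)}{g_n(x)} &= \frac{R_n(x) - R_n(0)}{g_n(x) - g_n(0)} = \frac{R_n'(\theta_1 x)}{g_n'(\theta_1 x)} && \text{for some } 0 < \theta_1 < 1, \\
&= \frac{R_n'(\theta_1 x) - R_n'(0)}{g_n'(\theta_1 x) - g_n'(0)} = \frac{R_n''(\theta_2 \theta_1 x)}{g_n''(\theta_2 \theta_1 x)} && \text{for some } 0 < \theta_2 < 1, \\
&= \frac{R_n''(\theta_2 \theta_1 x) - R_n''(0)}{g_n''(\theta_2 \theta_1 x) - g_n''(0)} = \frac{R_n^{(3)}(\theta_3 \theta_2 \theta_1 x)}{g_n^{(3)}(\theta_3 \theta_2 \theta_1 x)} && \text{for some } 0 < \theta_3 < 1, \\
&\;\;\vdots \\
&= \frac{R_n^{(n-1)}(\theta_{n-1} \cdots \theta_1 x) - R_n^{(n-1)}(0)}{g_n^{(n-1)}(\theta_{n-1} \cdots \theta_1 x) - g_n^{(n-1)}(0)} = \frac{R_n^{(n)}(\theta_n \cdots \theta_1 x)}{g_n^{(n)}(\theta_n \cdots \theta_1 x)} && \text{for some } 0 < \theta_n < 1.
\end{aligned}
\]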
However, $R_n^{(n)}(s) = f^{(n)}(s) - P_{n-1}^{(n)}(s) = f^{(n)}(s)$ since $P_{n-1}^{(n)}(s) = 0$, and $g_n^{(n)}(s) = n!$. Let $\theta = \theta_1 \theta_2 \cdots \theta_n$. Then $0 < \theta < 1$ and we get that
\[ \frac{f(x) - P_{n-1}(x)}{x^n} = \frac{R_n(x)}{g_n(x)} = \frac{f^{(n)}(\theta x)}{n!} . \]
We can rewrite this to give
\[ f(x) = P_{n-1}(x) + \frac{x^n}{n!} f^{(n)}(\theta x) \]
for some $0 < \theta < 1$. We have established the following theorem.
Theorem 5.31 (Taylor's Theorem). Suppose $f$ is defined on some interval $(\alpha, \beta)$ and has derivatives up to order $n$ at all points in $(\alpha, \beta)$. Suppose also that $0 \in (\alpha, \beta)$ and $x \in (\alpha, \beta)$. Then
\[ f(x) = f(0) + x f'(0) + \frac{x^2}{2!} f''(0) + \dots + \frac{x^{n-1}}{(n-1)!} f^{(n-1)}(0) + R_n(x) \]
where $R_n(x) = \dfrac{x^n}{n!} f^{(n)}(\xi)$ for some $\xi$ between $0$ and $x$.
Remark 5.32. Note that $\xi$ will generally depend on $f$, $x$ and also $n$.
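Example 5.33. Let $f(x) = \ln(1 + x)$ on, say, $(-1, 3)$. The derivatives of $f$ are given by
\[ f^{(k)}(x) = \frac{(-1)^{k+1} (k - 1)!}{(1 + x)^k} \quad \text{for } k \in \mathbb{N}. \]
For any $x \in (-1, 3)$, by Taylor's Theorem (up to remainder order $n + 1$), we may say that
\[ \ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots + \frac{(-1)^{n+1} x^n}{n} + R_{n+1}(x) \]
where
\[ R_{n+1}(x) = \frac{x^{n+1}}{(n + 1)!} \, \frac{(-1)^{n+2}\, n!}{(1 + \xi)^{n+1}} = \frac{x^{n+1}}{(n + 1)} \, \frac{(-1)^{n+2}}{(1 + \xi)^{n+1}} \]
for some $\xi$ between $0$ and $x$. Now let $x = 1$. Then $f(1) = \ln 2$ and so
\[ \ln 2 - \Bigl( 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \dots + \frac{(-1)^{n+1}}{n} \Bigr) = R_{n+1}(1) \]
where $R_{n+1}(1) = \dfrac{(-1)^{n+2}}{(n + 1)(1 + \xi)^{n+1}}$ for some $0 < \xi < 1$. But $|R_{n+1}(1)| < \dfrac{1}{n + 1}$ which means that $R_{n+1}(1) \to 0$ as $n \to \infty$. It follows that
\[ \ln 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \dots = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} . \]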
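The remainder estimate $|R_{n+1}(1)| < 1/(n+1)$ from Example 5.33 can be observed numerically. The following sketch compares partial sums of the alternating series with the library value of $\ln 2$; the function name `alternating_partial_sum` is introduced here purely for illustration.

```python
import math

def alternating_partial_sum(n):
    """Return 1 - 1/2 + 1/3 - ... + (-1)^(n+1)/n."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))

for n in (10, 100, 1000):
    error = abs(math.log(2) - alternating_partial_sum(n))
    # The example predicts error < 1/(n + 1).
    print(n, error, 1 / (n + 1))
```

The convergence is slow: the bound only promises about one more correct digit each time the number of terms is multiplied by ten.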
There is a further more general formulation. For fixed $a$, let $g(s) = f(s + a)$ and apply Taylor's Theorem to $g(s)$ to get
\[ g(s) = g(0) + s\, g'(0) + \frac{1}{2} s^2 g''(0) + \dots + \frac{s^{n-1}}{(n-1)!} g^{(n-1)}(0) + \frac{s^n}{n!} g^{(n)}(\eta) \]
for some $\eta$ between $0$ and $s$. Now, $g(s) = f(s + a)$ and $g(0) = f(a)$. Furthermore, by the chain rule, we find that $g^{(k)}(0) = f^{(k)}(a)$ and $g^{(n)}(\eta) = f^{(n)}(\eta + a)$. But if $\eta$ lies between $0$ and $s$ then $\eta + a$ lies between $a$ and $s + a$. Putting $x = s + a$, we have $s = x - a$ and so $\xi = \eta + a$ lies between $a$ and $x$. We arrive at the following version of Taylor's Theorem.
Theorem 5.34 (Taylor's Theorem for $f$ about $a$). Suppose $f$ is defined on some interval $(\alpha, \beta)$ and has derivatives up to order $n$ at all points in $(\alpha, \beta)$. Suppose also that $a \in (\alpha, \beta)$ and $x \in (\alpha, \beta)$. Then
\[ f(x) = f(a) + (x - a) f'(a) + \frac{(x - a)^2}{2!} f''(a) + \dots + \frac{(x - a)^{n-1}}{(n-1)!} f^{(n-1)}(a) + R_n(x) \]
where $R_n(x) = \dfrac{(x - a)^n}{n!} f^{(n)}(\xi)$ for some $\xi$ between $a$ and $x$.
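As an illustration of Taylor's Theorem about a point $a \ne 0$, the sketch below expands $\sin$ about $a = 1$ and compares the actual remainder at $x = 2$ with the bound $|x - a|^n / n!$, which follows from the remainder formula because every derivative of $\sin$ is bounded by $1$. The identity $\sin^{(k)}(a) = \sin(a + k\pi/2)$ is a standard fact assumed for the sketch, and the function names are illustrative.

```python
import math

def sin_derivative(k, a):
    """k-th derivative of sin at a, using sin^(k)(a) = sin(a + k*pi/2)."""
    return math.sin(a + k * math.pi / 2)

def taylor_polynomial(x, a, n):
    """P_{n-1}(x): the Taylor polynomial of sin about a, of degree at most n - 1."""
    return sum(sin_derivative(k, a) * (x - a) ** k / math.factorial(k)
               for k in range(n))

a, x = 1.0, 2.0
for n in (2, 4, 6, 8):
    remainder = math.sin(x) - taylor_polynomial(x, a, n)
    bound = abs(x - a) ** n / math.factorial(n)
    print(n, remainder, bound)
```

In each row the absolute value of the remainder should not exceed the bound, and both shrink rapidly as $n$ grows.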
Chapter 6
Power Series
Definition 6.1. A series of the form $\sum_{n=0}^{\infty} a_n (x - \alpha)^n$, where the $a_n$ are constants, is called a power series (about $x = \alpha$).
We notice immediately that such a power series always converges for $x = \alpha$ (in this case, all terms, except possibly for the $a_0$ term, are zero). What can be said about the convergence of power series? The following results explain the situation. By setting $w = x - \alpha$, it is often sufficient to consider the case $\alpha = 0$, so that the powers are simply powers of $x$ and we will usually do this.
Proposition 6.2. Suppose that the power series $\sum_{n=0}^{\infty} a_n x^n$ converges for some value $x = x_0$ with $x_0 \ne 0$. Then it converges absolutely for every $x$ satisfying $|x| < |x_0|$.
Proof. Let $S_n(x) = \sum_{k=0}^{n} a_k x^k$. By hypothesis, $(S_n(x_0))_{n \in \mathbb{N} \cup \{0\}}$ converges. In particular, $(a_k x_0^k)$ converges (to zero) and so is a bounded sequence; that is, there is some $M > 0$ such that $|a_k x_0^k| < M$ for all $k$.
We wish to show that $\sum_{k=0}^{\infty} |a_k x^k|$ converges for every $x$ with $|x| < |x_0|$. Suppose, then, that $x$ obeys $|x| < |x_0|$ and set $\rho = |x / x_0|$. Evidently, $0 \le \rho < 1$ and so $\sum_{k=0}^{\infty} \rho^k$ converges. But then
\[ |a_k x^k| = |a_k x_0^k| \, |x / x_0|^k \le M \rho^k \]
and so $\sum_{k=0}^{\infty} |a_k x^k|$ converges by the Comparison Test.
Radius of Convergence of a Power Series
Consider a given power series $\sum_{n=0}^{\infty} a_n x^n$ and let
\[ J = \Bigl\{ x \in \mathbb{R} : \sum_{n=0}^{\infty} a_n x^n \text{ converges} \Bigr\} . \]
What can be said about $J$? Certainly, $0 \in J$ and it could happen that this is the only element of $J$. For example, if $a_n = n^n$, then $a_n x^n = (nx)^n$ and so no matter how small $x$ is, eventually $|nx| > 1$ provided $x \ne 0$. This means that for any given $x \ne 0$, it is false that $a_n x^n \to 0$ as $n \to \infty$ and so the power series cannot converge. In this case $J = \{ 0 \}$.
Suppose that $x_0 \in J$, so that $\sum_{n=0}^{\infty} a_n x_0^n$ is convergent. Then we know that $\sum_{n=0}^{\infty} a_n x^n$ also converges (absolutely) for every $x$ obeying $|x| < |x_0|$. In other words, if $x_0 \in J$, then every point in the interval $(-|x_0|, |x_0|)$ also belongs to $J$. What does this mean for $J$? There are 3 distinct (mutually exclusive) possibilities.
(i) $J = \{ 0 \}$.
(ii) $J$ is bounded but there is some $t \ne 0$ with $t \in J$ (that is, $J \ne \{ 0 \}$ but is bounded).
(iii) $J$ is unbounded.
We can immediately deduce that if $J$ is not bounded, case (iii), then it must be the whole of $\mathbb{R}$. Indeed, to say that $J$ is not bounded is to say that for any $r > 0$, there is some $x \in J$ with $|x| > r$. Hence $[-r, r] \subset J$ for all $r > 0$ and so $J = \mathbb{R}$.
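Now consider case (ii) and let
\[ A = \Bigl\{ r > 0 : \sum_{n=0}^{\infty} a_n x^n \text{ converges for } x \in (-r, r) \Bigr\} . \]
Evidently, if $t \in J$ with $t \ne 0$, then $|t| \in A$, so $A$ is not empty, and $A$ is bounded because $J$ is. Let $R = \operatorname{lub} A$. Then $R > 0$, otherwise we are in case (i).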
Suppose $0 < \rho < R$. Then, by definition of lub, there is $r \in A$ such that $\rho < r \le R$. But then the series $\sum_{n=0}^{\infty} a_n x^n$ converges (absolutely) for $x \in (-r, r)$ and, in particular, for $x$ with $|x| = \rho$.
Next, suppose that $x \in \mathbb{R}$ with $|x| = \rho > R$. If $\sum_{n=0}^{\infty} a_n x^n$ were to converge, then we could deduce that $(-\rho, \rho) \subset J$ which would mean that $\rho \in A$. This contradicts the fact that $R$ is an upper bound for $A$ and so $\sum_{n=0}^{\infty} a_n x^n$ cannot converge for any such $x$.
Case (ii) means then that there is some $R > 0$ such that $\sum_{n=0}^{\infty} a_n x^n$ converges (absolutely) for all $x$ with $|x| < R$ but diverges for any $x$ with $|x| > R$. The behaviour of the power series when $|x| = R$ (i.e., $x = \pm R$) requires separate extra discussion and will depend on the particular power series. Anything is possible.
This discussion is summarized in the following very important theorem.
Theorem 6.3 (Radius of Convergence Theorem for Power Series). For any given power series $\sum_{n=0}^{\infty} a_n (x - \alpha)^n$, exactly one of the following three possibilities applies.
(i) $\sum_{n=0}^{\infty} a_n (x - \alpha)^n$ converges only for $x = \alpha$.
(ii) There is $R > 0$ such that $\sum_{n=0}^{\infty} a_n (x - \alpha)^n$ converges (absolutely) for all $|x - \alpha| < R$ but diverges for any $x$ with $|x - \alpha| > R$.
(iii) $\sum_{n=0}^{\infty} a_n (x - \alpha)^n$ converges (absolutely) for all $x$.
Definition 6.4. The value $R$ above is called the radius of convergence of the power series. In case (iii), one says that the series has an infinite radius of convergence.
Examples 6.5.
1. Consider $\sum_{n=0}^{\infty} x^n$. This series converges if $|x| < 1$ (by the Ratio Test) and otherwise diverges, so $R = 1$. Note that the series diverges at both of the boundary values $x = \pm 1$.
2. Consider
\[ \sum_{n=0}^{\infty} a_n x^n = 1 + x + \frac{x^2}{2} + \frac{x^3}{3} + \dots \]
The series converges if $|x| < 1$ (by Comparison with $1 + x + x^2 + \dots$). If $x = 1$, then it becomes $1 + 1 + \frac{1}{2} + \frac{1}{3} + \dots$ which we know diverges. It follows that it cannot converge for any $x$ with $|x| > 1$. When $x = -1$, it becomes $1 - 1 + \frac{1}{2} - \frac{1}{3} + \dots$ which converges. So $1 + x + x^2/2 + x^3/3 + \dots$ converges at $x = -1$ but diverges at $x = 1$.
Replacing $x$ by $-x$, we see that the series
\[ 1 - x + \frac{x^2}{2} - \frac{x^3}{3} + \frac{x^4}{4} - \frac{x^5}{5} + \dots \]
converges for $|x| < 1$ and for $x = 1$ but diverges when $x = -1$.
3. Formally adding together the two series above suggests the power series
\[ 2 + \frac{2x^2}{2} + \frac{2x^4}{4} + \frac{2x^6}{6} + \dots = 1 + 1 + x^2 + \frac{x^4}{2} + \frac{x^6}{3} + \dots \]
which converges for $|x| < 1 = R$ but diverges when $x = \pm 1$.
4. The series
\[ 1 - \frac{x^2}{2} + \frac{x^6}{3} - \frac{x^8}{4} + \dots \]
converges for $|x| < 1 = R$ and also converges for both $x = \pm 1$.
5. The series
\[ \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots \]
converges absolutely for all $x \in \mathbb{R}$, by the Ratio Test.
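The different behaviour at the two boundary points in Example 2 can be seen by computing partial sums. This is only a numerical illustration; the helper `partial_sum` is a name chosen here, and the limiting value $1 - \ln 2$ at $x = -1$ is taken from the expansion of $\ln 2$ obtained in Example 5.33.

```python
import math

def partial_sum(x, terms):
    """Partial sum 1 + x + x^2/2 + ... + x^terms/terms of the series in Example 2."""
    return 1.0 + sum(x ** n / n for n in range(1, terms + 1))

for terms in (10, 1000, 100000):
    # At x = -1 the sums settle down; at x = 1 they keep growing (slowly).
    print(terms, partial_sum(-1.0, terms), partial_sum(1.0, terms))

print(1 - math.log(2))
```

The growth at $x = 1$ is only logarithmic, so a table of partial sums alone could easily be mistaken for convergence; the divergence has to be argued, as in the text.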
If the power series $\sum_{n=0}^{\infty} a_n x^n$ is differentiated term by term, then the resulting power series is $\sum_{n=1}^{\infty} n a_n x^{n-1}$. This is called the associated derived series. The next theorem tells us that this makes sense.
Theorem 6.6. Suppose that $\sum_{n=0}^{\infty} a_n x^n$ has radius of convergence $R > 0$. Then the series $\sum_{n=1}^{\infty} n a_n x^{n-1}$ also has radius of convergence equal to $R$. (The possibility of an infinite radius of convergence is included.)
Proof. Suppose that $0 < |u| < R$. Let $r > 0$ obey $0 < |u| < r < R$. Then $\sum_{n=0}^{\infty} |a_n| r^n$ converges. Since $n^{1/n} \to 1$ as $n \to \infty$, it follows that there is some $N \in \mathbb{N}$ such that $n^{1/n} < r / |u|$ for all $n > N$. Therefore
\[ n |a_n| |u^n| = |a_n| \, (n^{1/n} |u|)^n < |a_n| r^n \]
for all $n > N$. By Comparison, it follows that $\sum_{n=1}^{\infty} n a_n u^{n-1} = (1/u) \sum_{n=1}^{\infty} n a_n u^n$ converges absolutely. It follows that the power series $\sum_{n=1}^{\infty} n a_n x^{n-1}$ has radius of convergence at least equal to $R$.
On the other hand, if the derived series $\sum_{n=1}^{\infty} n a_n x^{n-1}$ converges absolutely, then the inequality
\[ |a_n| |x|^n \le |x| \, n |a_n| |x|^{n-1} \]
for $n \ge 1$ implies that $\sum_{n=0}^{\infty} a_n x^n$ converges absolutely, by Comparison. The result follows.
Remark 6.7. By applying the theorem once again, we see that the power series $\sum_{n=2}^{\infty} n(n - 1) a_n x^{n-2}$ also has radius of convergence equal to $R$. Of course, we can now apply the theorem again . . .
The big question is whether the derived series is indeed the derivative of the original power series. We shall now show that this is true.
We recall that Taylor's Theorem, with 2nd order remainder for a function $f$ about $x_0$, gives
\[ f(x) = f(x_0) + (x - x_0) f'(x_0) + \frac{f^{(2)}(c)}{2!} (x - x_0)^2 \]
for some $c$ between $x$ and $x_0$. Setting $f(x) = x^k$ gives the equality
\[ x^k - x_0^k = k (x - x_0) x_0^{k-1} + \tfrac{1}{2} k(k - 1) c_k^{\,k-2} (x - x_0)^2 \]
for some $c_k$ between $x_0$ and $x$. Note that $c_k$ may depend on $k$ (as well as $x_0$ and $x$). If $x = x_0 + h$, then this becomes
\[ (x_0 + h)^k - x_0^k = h k x_0^{k-1} + \tfrac{1}{2} k(k - 1) c_k^{\,k-2} h^2 \tag{$*$} \]
for some $c_k$ between $x_0$ and $x_0 + h$.
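We can use this to find the derivative of a power series inside its disc of convergence. Indeed, suppose that the power series $f(x) = \sum_{n=0}^{\infty} a_n x^n$ has radius of convergence $R > 0$. Let $|x_0| < R$ be given and let $r > 0$ obey $|x_0| < r < R$. Let $h \ne 0$ be so small that $|x_0| + |h| < r$. This means that $-r < x_0 + h < r$ so that $\sum_{n=0}^{\infty} a_n (x_0 + h)^n$ converges (absolutely). Using $(*)$ with $k = n$, multiplying by $a_n$ and summing over $n$, we find that
\[ \frac{f(x_0 + h) - f(x_0)}{h} - \sum_{n=1}^{\infty} n a_n x_0^{n-1} = \tfrac{1}{2} h \sum_{n=2}^{\infty} n(n - 1) \, a_n c_n^{\,n-2} . \]
Now $c_n$ is between $x_0$ and $x_0 + h$ and both of these points lie in the interval $(-r, r)$ and so it follows that $c_n \in (-r, r)$, that is $|c_n| < r$. But then, by Comparison with the series $\sum_{n=2}^{\infty} n(n - 1) |a_n| r^{n-2}$ (which converges, by Remark 6.7, because $r < R$), the series on the right hand side is convergent and its sum is bounded in absolute value by a constant which does not depend on $h$. Letting $h \to 0$ gives the desired result that
\[ f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h} = \sum_{n=1}^{\infty} n a_n x_0^{n-1} . \]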
We have proved the following important theorem.
Theorem 6.8 (Differentiation of Power Series). The power series $\sum_{n=0}^{\infty} a_n x^n$ is differentiable at each point $x_0$ inside its radius of convergence. Moreover, its derivative is given by the derived series $\sum_{n=1}^{\infty} n a_n x_0^{n-1}$.
Example 6.9. We shall show that
\[ \ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots \]
for any $x \in (-1, 1)$. The radius of convergence, $R$, of the power series on the right hand side is $R = 1$.
Let us begin by guessing that $\ln(1 + x) = a_0 + a_1 x + a_2 x^2 + \dots$. If this is to be true, then putting $x = 0$, we should have $\ln 1 = a_0 + 0$, that is, $a_0 = 0$. Differentiating term by term and then setting $x = 0$, we might guess that $\frac{d}{dx} \ln(1 + x) \big|_{x=0} = a_1$. This gives $1 = a_1$. Differentiating twice (term by term) and setting $x = 0$, we might guess that $\frac{d^2}{dx^2} \ln(1 + x) \big|_{x=0} = 2 a_2$, that is, $a_2 = -\frac{1}{2}$. Repeating this, we guess that $a_k = (-1)^{k+1} / k$. So much for the guessing, now let us justify our reasoning.
Let $g(x)$ be the power series
\[ g(x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots . \]
We see that this power series converges for $x = 1$ and so it must converge absolutely for $|x| < 1$. (This can also be seen directly by the Ratio Test.) The series does not converge when $x = -1$ and so we deduce that its radius of convergence is $R = 1$. For any $x$ with $|x| < R = 1$, the power series can be differentiated and the derivative is that obtained by term by term differentiation. Hence
\[ g'(x) = 1 - x + x^2 - x^3 + \dots \]
for any $x$ with $|x| < 1$. However, we know that
\[ \frac{1}{1 + x} = 1 - x + x^2 - x^3 + \dots \]
for $|x| < 1$ and so $g'(x) = 1/(1 + x)$ for $x \in (-1, 1)$. But $\frac{d}{dx} \ln(1 + x) = 1/(1 + x)$ for $x \in (-1, 1)$ and so $\ln(1 + x) - g(x)$ has zero derivative on $(-1, 1)$. It follows that $\ln(1 + x) - g(x)$ is constant on $(-1, 1)$. Setting $x = 0$, we see that this constant must be $\ln 1 - g(0) = 0$ and so $\ln(1 + x) = g(x)$ on the interval $(-1, 1)$, as required.
Note that we have shown that $\ln(1 + x) = x - \frac{1}{2} x^2 + \frac{1}{3} x^3 - \dots$ for any $x \in (-1, 1)$. We have already seen (thanks to Taylor's Theorem) that $\ln 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \dots$ which means that this expansion is also valid for $x = 1$.
When $x = -1$, the left hand side becomes $\ln 0$, which is not defined, and the right hand side becomes the divergent series $-1 - \frac{1}{2} - \frac{1}{3} - \frac{1}{4} - \dots$.
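The identity of Example 6.9 can be checked numerically at a few points of $(-1, 1)$. The sketch below sums 200 terms of the series (an arbitrary cut-off) and compares with the library function `math.log1p`, which computes $\ln(1 + x)$; the name `log_series` is introduced only for this illustration.

```python
import math

def log_series(x, terms=200):
    """Partial sum x - x^2/2 + x^3/3 - ... with the given number of terms."""
    return sum((-1) ** (n + 1) * x ** n / n for n in range(1, terms + 1))

for x in (-0.9, -0.5, 0.1, 0.5, 0.9):
    print(x, log_series(x), math.log1p(x))
```

Close to the end points $\pm 1$ many more terms are needed for the same accuracy, which is consistent with the radius of convergence being exactly $1$.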
Chapter 7
The elementary functions
We have already used the elementary functions (the trigonometric functions, exponential function and the logarithm) as examples to illustrate various aspects of the theory. Now is the time to give their formal definitions.
The trigonometric functions $\sin x$ and $\cos x$ and the exponential function $\exp x$ are defined as follows.
Definition 7.1. For any $x \in \mathbb{R}$,
\[
\begin{aligned}
\sin x &= \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n+1}}{(2n + 1)!} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots \\
\cos x &= \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{(2n)!} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots \\
\exp x &= \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots .
\end{aligned}
\]
Each of these power series converges absolutely for all $x \in \mathbb{R}$ (by the Ratio Test) so they have an infinite radius of convergence.
Remark 7.2. These are the definitions and so each and every property that these functions possess must be obtainable from these definitions.
We can see immediately that $\sin 0 = 0$, $\cos 0 = 1$ and $\exp 0 = 1$. We also note that $\sin(-x) = -\sin x$ (so $\sin x$ is an odd function) and $\cos(-x) = \cos x$ (so $\cos x$ is an even function). Furthermore, by the basic differentiation of power series theorem, Theorem 6.8, we see that these functions are differentiable at every $x \in \mathbb{R}$ with derivatives given by term by term differentiation so that
\[
\begin{aligned}
\frac{d}{dx} \sin x &= \frac{d}{dx} \Bigl( x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots \Bigr) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \dots = \cos x \\
\frac{d}{dx} \cos x &= \frac{d}{dx} \Bigl( 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \dots \Bigr) = -x + \frac{x^3}{3!} - \frac{x^5}{5!} + \dots = -\sin x \\
\frac{d}{dx} \exp x &= \frac{d}{dx} \Bigl( 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots \Bigr) = 0 + 1 + x + \frac{x^2}{2!} + \dots = \exp x .
\end{aligned}
\]
We shall establish further familiar properties.
Theorem 7.3. For any $x \in \mathbb{R}$, $\sin^2 x + \cos^2 x = 1$.
Proof. Let $\varphi(x) = \sin^2 x + \cos^2 x$. Then we calculate the derivative
\[ \varphi'(x) = 2 \sin x \cos x - 2 \cos x \sin x = 0 . \]
It follows that $\varphi(x)$ is constant on $\mathbb{R}$. In particular,
\[ \varphi(x) = \varphi(0) = \sin^2 0 + \cos^2 0 = 0 + 1 = 1 \]
that is, $\sin^2 x + \cos^2 x = 1$, as required.
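Since everything here is derived from the power series alone, it may be instructive to evaluate truncated versions of the defining series and observe the identity numerically. The truncation at 40 terms and the function names `sin_series` and `cos_series` are choices made for this sketch; floating point rounding means the identity only holds approximately.

```python
def sin_series(x, terms=40):
    """Truncated power series x - x^3/3! + x^5/5! - ... from Definition 7.1."""
    total, term = 0.0, x
    for n in range(terms):
        total += term
        # Ratio of consecutive terms: -x^2 / ((2n + 2)(2n + 3)).
        term *= -x * x / ((2 * n + 2) * (2 * n + 3))
    return total

def cos_series(x, terms=40):
    """Truncated power series 1 - x^2/2! + x^4/4! - ... from Definition 7.1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        # Ratio of consecutive terms: -x^2 / ((2n + 1)(2n + 2)).
        term *= -x * x / ((2 * n + 1) * (2 * n + 2))
    return total

for x in (-3.0, -1.0, 0.0, 0.5, 2.0):
    print(x, sin_series(x) ** 2 + cos_series(x) ** 2)
```

Each printed value should be very close to 1, even though nothing in the individual series makes this obvious.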
Remark 7.4. Since both terms $\sin^2 x$ and $\cos^2 x$ are non-negative, we can say that $-1 \le \sin x \le 1$ and also $-1 \le \cos x \le 1$ for all $x \in \mathbb{R}$. The functions $\sin x$ and $\cos x$ are bounded (by $\pm 1$). This is not at all obvious just by looking at the power series in their definitions.
Theorem 7.5 (Addition Formulae). For any $a, b \in \mathbb{R}$, we have
\[
\begin{aligned}
\sin(a + b) &= \sin a \cos b + \cos a \sin b \\
\cos(a + b) &= \cos a \cos b - \sin a \sin b .
\end{aligned}
\]
Proof. Let $\varphi(x) = \sin(\alpha - x) \cos x + \cos(\alpha - x) \sin x$. Then we see that
\[ \varphi'(x) = -\cos(\alpha - x) \cos x - \sin(\alpha - x) \sin x + \sin(\alpha - x) \sin x + \cos(\alpha - x) \cos x = 0 . \]
It follows that $\varphi(x)$ is constant on $\mathbb{R}$ and so $\varphi(x) = \varphi(0)$, that is,
\[ \sin(\alpha - x) \cos x + \cos(\alpha - x) \sin x = \sin \alpha . \]
Putting $\alpha = a + b$ and $x = b$, we obtain the desired formula
\[ \sin(a + b) = \sin a \cos b + \cos a \sin b . \]
The other formula can be obtained similarly. Indeed, let
\[ \psi(x) = \cos(\alpha - x) \cos x - \sin(\alpha - x) \sin x . \]
Then we find that $\psi'(x) = 0$ so that $\psi(x)$ is constant on $\mathbb{R}$. Hence $\psi(x) = \psi(0) = \cos \alpha$. Again setting $\alpha = a + b$ and $x = b$, we find that
\[ \cos(a + b) = \cos a \cos b - \sin a \sin b \]
and the proof is complete.
Remark 7.6. The formulae
\[
\begin{aligned}
\sin(a - b) &= \sin a \cos b - \cos a \sin b \\
\cos(a - b) &= \cos a \cos b + \sin a \sin b
\end{aligned}
\]
follow by replacing $b$ by $-b$ and using the facts that $\sin(-b) = -\sin b$ whereas $\cos(-b) = \cos b$. Notice further that if we set $a = x$ and $b = x$ in this last formula, then we get
\[ \cos(x - x) = \cos^2 x + \sin^2 x \]
that is, we recover the formula $\sin^2 x + \cos^2 x = 1$.
The number $\pi$
The elementary geometric approach to the trigonometric functions is by means of triangles and circles. The number $\pi$ makes its appearance in the formula relating the circumference and the radius of a circle (or giving the area $A = \pi r^2$ of a circle of radius $r$). For us here, we must always proceed via the power series definitions of the trigonometric functions. The identification of $\pi$ begins with some preliminary properties of the functions $\sin x$ and $\cos x$.
Lemma 7.7.
(i) $\sin x > 0$ for all $x \in (0, 2)$.
(ii) $\cos 2 < 0$.
Proof. (i) Taylor's Theorem (up to order 2) says that
\[ f(x) = f(0) + x f'(0) + \frac{x^2}{2!} f''(c) \]
for some $c$ between $0$ and $x$. With $f(x) = \sin x$, we obtain
\[ \sin x = 0 + x - \frac{x^2}{2} \sin(c) \ge x - \frac{x^2}{2} \]
for some $c$ between $0$ and $x$. We have used the facts that $\sin 0 = 0$, $\cos 0 = 1$ and $\sin(c) \le 1$. Hence
\[ \sin x \ge x - \tfrac{1}{2} x^2 = \tfrac{1}{2} x (2 - x) > 0 \]
if $0 < x < 2$, as claimed.
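(ii) Applying Taylor's Theorem (up to order 4), we may say that there is some $\xi$ between $0$ and $x$ such that
\[ \cos x = 1 - 0 - \frac{x^2}{2!} + 0 + \frac{x^4}{4!} \cos \xi . \]
But $\cos \xi \le 1$ and so
\[ \cos x \le 1 - \frac{x^2}{2} + \frac{x^4}{4!} . \]
Putting $x = 2$ gives
\[ \cos 2 \le 1 - \frac{4}{2} + \frac{16}{24} = -1 + \frac{2}{3} = -\frac{1}{3} \]
which implies that $\cos 2 \le -\frac{1}{3} < 0$, as required.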
Now we come to the crucial part.
Theorem 7.8. There is a unique $0 < \mu < 2$ such that $\cos \mu = 0$.
Proof. We know that $\cos 0 = 1$ and we have just seen that $\cos 2 < 0$. It follows by the Intermediate Value Theorem (applied to the function $\cos x$ on the interval $[0, 2]$) that there is some $\mu \in (0, 2)$ such that $\cos \mu = 0$.
We must now show that there is only one such $\mu$. To see this, suppose that $\cos \lambda = 0$ for some $\lambda \in (0, 2)$ with $\lambda \ne \mu$. Then by Rolle's Theorem, there is some $\xi$ between $\mu$ and $\lambda$ such that $\frac{d}{dx} \cos x \big|_{x=\xi} = 0$, that is, $\sin \xi = 0$. But we have shown that $\sin x > 0$ on $(0, 2)$. This gives a contradiction and so we conclude that there can be no such $\lambda$. In other words, there is a unique $\mu$ with $0 < \mu < 2$ such that $\cos \mu = 0$.
Definition 7.9. The real number $\pi$ is defined to be $\pi = 2\mu$, where $\mu$ is the unique solution in $(0, 2)$ to $\cos \mu = 0$.
All we can say at the moment is that $0 < \pi < 4$. It is known that $\pi$ is irrational and its decimal expansion is known to some two million decimal places. Curiously enough, it seems that each of the digits $0, 1, \dots, 9$ appears with about the same frequency in this expansion.
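The definition of $\pi$ suggests a way to approximate it: locate the unique zero of $\cos$ in $(0, 2)$ and double it. The sketch below does this by bisection, mirroring the Intermediate Value Theorem argument, and evaluates $\cos$ via a truncated version of its defining power series. The truncation at 40 terms, the 60 bisection steps, and the function names are all choices made for this illustration.

```python
import math

def cos_series(x, terms=40):
    """Truncated power series 1 - x^2/2! + x^4/4! - ... for cos x."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= -x * x / ((2 * n + 1) * (2 * n + 2))
    return total

def first_zero_of_cos(steps=60):
    """Bisection on [0, 2], where cos 0 = 1 > 0 and cos 2 < 0."""
    lo, hi = 0.0, 2.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if cos_series(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

pi_estimate = 2 * first_zero_of_cos()
print(pi_estimate, math.pi)
```

Uniqueness of the zero is what makes the bisection unambiguous: whichever half of the interval retains the sign change must contain the one and only root.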
Theorem 7.10. The number $\pi$ is such that $\sin(\frac{1}{2}\pi) = 1$, $\cos(2\pi) = 1$ and $\sin(2\pi) = 0$. Furthermore, for any $x \in \mathbb{R}$
\[
\begin{aligned}
\sin(x + 2\pi) &= \sin x \\
\cos(x + 2\pi) &= \cos x .
\end{aligned}
\]
Proof. By its very definition, we know that $\cos(\frac{1}{2}\pi) = 0$. But since we have the identity $\sin^2 x + \cos^2 x = 1$, it follows that $\sin(\frac{1}{2}\pi) = \pm 1$. However, we have seen that $\sin x > 0$ on $(0, 2)$ and so it follows that $\sin(\frac{1}{2}\pi) = 1$.
By the addition formulae, $\sin \pi = 2 \sin(\frac{1}{2}\pi) \cos(\frac{1}{2}\pi) = 0$. This then implies that $\sin(2\pi) = 2 \sin \pi \cos \pi = 0$. To show that $\cos(2\pi) = 1$, we use the addition formula again to find that
\[ \cos(2x) = \cos^2 x - \sin^2 x = 1 - 2 \sin^2 x . \]
Setting $x = \pi$, we get $\cos(2\pi) = 1$ because $\sin \pi = 0$.
Finally, using the above results together with the addition formulae, we calculate
\[
\begin{aligned}
\sin(x + 2\pi) &= \sin x \cos(2\pi) + \cos x \sin(2\pi) = \sin x \\
\cos(x + 2\pi) &= \cos x \cos(2\pi) - \sin x \sin(2\pi) = \cos x
\end{aligned}
\]
for any $x \in \mathbb{R}$ and the proof is complete.
Properties of the exponential function
We now turn to a discussion of the exponential function.
Proposition 7.11. The function $\exp x$ enjoys the following properties.
(i) $\frac{d}{dx} \exp x = \exp x$ for all $x \in \mathbb{R}$.
(ii) $\exp 0 = 1$.
(iii) For any $a, b \in \mathbb{R}$, $\exp(a + b) = \exp a \, \exp b$.
(iv) $\exp(-x) = 1 / \exp x$ for all $x \in \mathbb{R}$.
(v) $\exp x > 0$ for all $x \in \mathbb{R}$.
Proof. (i) As already noted, this follows because the derivative of the power series is the power series obtained by differentiating term by term.
(ii) Putting $x = 0$ in the power series gives $\exp 0 = 1$.
(iii) Fix $u \in \mathbb{R}$ and set $\varphi(x) = \exp x \, \exp(u - x)$. Then
\[ \varphi'(x) = \exp x \, \exp(u - x) - \exp x \, \exp(u - x) = 0 \]
for all $x \in \mathbb{R}$. It follows that $\varphi(x)$ is constant, so that $\varphi(x) = \varphi(0)$. But $\varphi(0) = \exp u$ and so $\varphi(x) = \exp u$. Letting $u = a + b$ and $x = a$, we find that $\exp a \, \exp b = \exp(a + b)$, as required.
(iv) From the above, we find that $\exp x \, \exp(-x) = \exp 0 = 1$ and so $\exp(-x) = 1 / \exp x$.
(v) Since $\exp x \, \exp(-x) = 1$ it follows that $\exp x \ne 0$ for any $x \in \mathbb{R}$. However, it is clear from the power series that $\exp x > 0$ if $x > 0$ and so the formula $\exp x \, \exp(-x) = 1$ implies that $\exp(-x) > 0$ too.
(Alternatively, one can note that $\exp x = \exp(\frac{1}{2} x) \exp(\frac{1}{2} x) = (\exp(\frac{1}{2} x))^2$ which is positive.)
Because of the property $\exp(a + b) = \exp a \, \exp b$, one often writes $e^x$ for $\exp x$, so this reads $e^{a+b} = e^a e^b$. However, this notation needs some further discussion. The point is that the symbols $e^2$, say, now appear to have two interpretations. Firstly as $\exp(2)$ and secondly as the square of the number $e$. The real number $e$ is defined as $\exp(1)$ and we see that
\[ e^2 = \exp(1)^2 = \exp(1) \exp(1) = \exp(1 + 1) = \exp(2) \]
so the two interpretations actually agree. What about, say, $e^{1/2}$? This is interpreted as either $\exp(\frac{1}{2})$ or as the square root of $e$. But
\[ \exp(\tfrac{1}{2}) \exp(\tfrac{1}{2}) = \exp(\tfrac{1}{2} + \tfrac{1}{2}) = \exp(1) = e \]
so $\exp(\frac{1}{2})$ is the square root of $e$. This extends to any rational power.
Theorem 7.12. For any $r \in \mathbb{Q}$, $\exp(r) = e^r$, where $e = \exp(1)$.
Proof. If $r = 0$, then $\exp(0) = 1 = e^0$, by definition of the power $e^0$. Now suppose that $r > 0$ and write $r = p/q$ for $p$ and $q \in \mathbb{N}$. We have
\[
\begin{aligned}
(\exp r)^q &= \underbrace{\exp r \cdots \exp r}_{q \text{ factors}} = \exp(rq) \\
&= \exp p = \exp(\underbrace{1 + 1 + \dots + 1}_{p \text{ terms}}) \\
&= \underbrace{\exp 1 \cdots \exp 1}_{p \text{ factors}} = e^p
\end{aligned}
\]
and so $\exp(r) = e^{p/q} = e^r$.
Now let $r = -s$ where $s \in \mathbb{Q}$ and $s > 0$. The above discussion tells us that $\exp(s) = e^s$ so that
\[ \exp(r) = \exp(-s) = \frac{1}{\exp(s)} = \frac{1}{e^s} = e^{-s} = e^r \]
and we are done.
Remark 7.13. This result clarifies the symbolism $e^x$. This can always be considered as shorthand notation for $\exp x$, but if $x$ is rational, then it can also mean the $x$th power of the real number $e$. In this (rational) case, the values are the same, as the theorem shows, so there is no ambiguity.
Remark 7.14. We have seen that the power series expression for $\exp x$ tells us that $\frac{d}{dx} \exp x = \exp x$ and $\exp 0 = 1$. These properties completely determine $\exp x$. In fact, if $\varphi(x)$ is the power series $\varphi(x) = a_0 + a_1 x + a_2 x^2 + \dots$, then the requirement that $\varphi'(x) = \varphi(x)$ demands that
\[ a_1 + 2 a_2 x + 3 a_3 x^2 + 4 a_4 x^3 + \dots = a_0 + a_1 x + a_2 x^2 + \dots \]
This holds if $k a_k = a_{k-1}$ for all $k = 1, 2, \dots$ which means that $a_1 = a_0$, $a_2 = a_1 / 2 = a_0 / 2$, $\dots$, $a_k = a_{k-1} / k = a_{k-2} / k(k - 1) = \dots = a_0 / k!$. If $\varphi(0) = 1$, then $a_0 = 1$ and $a_k = 1/k!$ so we find that $\varphi(x) = \exp x$.
This holds without assuming that we begin with a power series. Indeed, suppose that $\varphi(x)$ is differentiable on $\mathbb{R}$ and that $\varphi'(x) = \varphi(x)$ and $\varphi(0) = 1$. We shall show that $\varphi(x) = \exp x$.
Let $g(x) = \varphi(x) \exp(-x)$. Then $g$ is differentiable on $\mathbb{R}$ and
\[ g'(x) = \varphi'(x) \exp(-x) - \varphi(x) \exp(-x) = 0 \]
since $\varphi'(x) = \varphi(x)$. Fix $u \in \mathbb{R}$ and let $(a, b)$ be any interval in $\mathbb{R}$ such that both $u \in (a, b)$ and $0 \in (a, b)$. Then $g'$ is zero on the interval $(a, b)$ and so $g$ is constant there. In particular, $g(u) = g(0)$. However, by construction, $g(0) = \varphi(0) \exp 0 = 1$ and so $g(u) = g(0) = 1$. Hence $\varphi(u) \exp(-u) = 1$ and we finally arrive at the required result that $\varphi(u) = \exp u$.
The function $\exp x$ has further interesting properties.
Theorem 7.15. The function $\exp x$ obeys the following.
(i) The map $x \mapsto \exp x$ is one-one from $\mathbb{R}$ onto $(0, \infty)$. In fact, $\exp x$ is strictly increasing on $\mathbb{R}$.
(ii) For any $k \in \mathbb{N}$, $x^k / \exp x \to 0$ as $x \to \infty$.
Proof. (i) From the power series expression for $\exp x$, we see that if $x > 0$ then $\exp x > 1 + x > 1$. Suppose that $a, b \in \mathbb{R}$ and that $a < b$. Then $b - a > 0$ so that $\exp(b - a) > 1$. Multiplying by $\exp a$ (which is positive), we see that $\exp(b - a) \exp a > \exp a$, that is $\exp b > \exp a$. It follows that $\exp x$ is strictly increasing and so $\exp a = \exp b$ is only possible if $a = b$, that is, $x \mapsto \exp x$ is one-one.
(Alternatively, the Mean Value Theorem tells us that $(\exp b - \exp a)/(b - a)$ is equal to the derivative of $\exp x$ evaluated at some point between $a$ and $b$. This derivative is always positive and so $(\exp b - \exp a)$ and $(b - a)$ always have the same sign. In particular, $\exp a = \exp b$ only if $a = b$.)
We still have to show that $\exp x$ maps $\mathbb{R}$ onto $(0, \infty)$. To see this, let $v \in (0, \infty)$. We must show that there is some $u \in \mathbb{R}$ such that $\exp u = v$. Let $\alpha > v$ and let $\beta > 1/v$. Then $\exp \alpha > 1 + \alpha > v$ and $\exp \beta > 1 + \beta > 1/v$, so that $\exp(-\beta) = 1 / \exp \beta < v$. So we have
\[ \exp(-\beta) < v < \exp \alpha . \]
Now $\exp x$ is continuous on $\mathbb{R}$ and so in particular is continuous on the closed interval $[-\beta, \alpha]$. By the Intermediate Value Theorem, there is some $u$ between $-\beta$ and $\alpha$ such that $\exp u = v$, as required.
(ii) For $x > 0$, the power series expression for $\exp x$ tells us that
\[ \exp x = \sum_{n=0}^{\infty} \frac{x^n}{n!} > \frac{x^{k+1}}{(k + 1)!} . \]
Hence $0 < x^k / \exp x < (k + 1)! / x$ for $x > 0$ and so $x^k / \exp x \to 0$ as $x \to \infty$.
Remark 7.16. This last result can be written as $x^k \exp(-x) \to 0$ as $x \to \infty$ or as $(\exp x) / x^k \to \infty$ as $x \to \infty$ and it implies that $x^k \exp x \to 0$ as $x \to -\infty$.
It is clear from the power series definition (with $x = 1$) that $e > 1 + 1 = 2$. We can easily obtain an upper bound for $e$ via Taylor's Theorem. Indeed, $\exp^{(k)}(x) = \exp(x)$ for any $k \in \mathbb{N}$ and $\exp(0) = 1$, so by Taylor's Theorem up to remainder of order 3, we have
\[ \exp(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} \exp(c_x) \]
for some $c_x$ between $0$ and $x$. If $x = 1$, then $c_1 < 1$ and so $e^{c_1} < e$ and we get
\[ e - (1 + 1 + \tfrac{1}{2}) = \tfrac{1}{6} e^{c_1} < \tfrac{1}{6} e , \]
that is,
\[ e < 3 . \]
We can profitably pursue this method of estimation. Taylor's Theorem up to remainder of order $m + 1$ gives
\[ e^x = 1 + x + \frac{x^2}{2!} + \dots + \frac{x^m}{m!} + R_{m+1} \]
where $R_{m+1} = \dfrac{x^{m+1}}{(m+1)!} e^{c_x}$. Now setting $x = 1$ and noting the inequalities $0 < e^{c_1} < e < 3$, we see that $0 < R_{m+1} < \dfrac{3}{(m+1)!}$.
However, if $m \ge 3$, then $\dfrac{3}{(m+1)!} < \dfrac{1}{m!}$ and we deduce that
\[ 0 < e - \Bigl( 1 + 1 + \frac{1}{2!} + \dots + \frac{1}{m!} \Bigr) < \frac{1}{m!} . \tag{$*$} \]
This can be rewritten as
\[ 1 + 1 + \frac{1}{2!} + \dots + \frac{1}{m!} < e < \Bigl( 1 + 1 + \frac{1}{2!} + \dots + \frac{1}{m!} \Bigr) + \frac{1}{m!} \]
for any $m \ge 3$. These estimates allow us to prove the following interesting fact.
Theorem 7.17. The real number $e$ is irrational.
Proof. The proof is by contradiction. Suppose it were the case that $e \in \mathbb{Q}$ and let $e = p/q$ where $p, q \in \mathbb{N}$. Let $m \in \mathbb{N}$ obey $m > q + 3$ (so that $m$ is greater than both $q$ and 3). Using the estimate $(*)$ and multiplying through by $m!$, we see that
\[ 0 < m! \Bigl( \frac{p}{q} - \Bigl( 1 + 1 + \frac{1}{2!} + \dots + \frac{1}{m!} \Bigr) \Bigr) < 1 . \]
However,
\[ m! \Bigl( \frac{p}{q} - \Bigl( 1 + 1 + \frac{1}{2!} + \dots + \frac{1}{m!} \Bigr) \Bigr) = \frac{m!\, p}{q} - \Bigl( m! + m! + \frac{m!}{2!} + \dots + \frac{m!}{m!} \Bigr) \]
which is an integer because each term is an integer. This gives us our contradiction since there is no integer lying strictly between 0 and 1. The proof is complete.
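The two-sided estimate for $e$ used in this proof (valid for $m \ge 3$) is easy to tabulate. The sketch below compares the partial sums with the library constant `math.e`; the helper name `partial_sum` and the range of $m$ are choices made for illustration, and floating point arithmetic limits how far the table can meaningfully be extended.

```python
import math

def partial_sum(m):
    """Return 1 + 1 + 1/2! + ... + 1/m!."""
    return sum(1 / math.factorial(k) for k in range(m + 1))

for m in range(3, 11):
    lower = partial_sum(m)
    upper = lower + 1 / math.factorial(m)
    # The estimate says lower < e < upper.
    print(m, lower, upper, lower < math.e < upper)
```

The gap between the bounds is $1/m!$, so each additional term pins down $e$ dramatically better than the last.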
In fact, we can prove more, namely that all powers and roots of powers are irrational. That is, $e^{p/q}$ is irrational for any $p, q \in \mathbb{Z}$ with $p \ne 0$ and $q \ne 0$. (If $q = 0$, then $p/q$ does not make any sense. If $q \ne 0$ but $p = 0$, then we have $e^{p/q} = e^0 = 1$ which is rational.) In order to show this, we need a few preliminary results.
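Lemma 7.18. For given $n \in \mathbb{N}$, let $f(x) = \dfrac{x^n (1 - x)^n}{n!}$. Then
(i) $f(x) = \dfrac{1}{n!} \sum_{m=n}^{2n} c_m x^m$, with $c_m \in \mathbb{Z}$.
(ii) If $0 < x < 1$, then $0 < f(x) < 1/n!$.
(iii) $f^{(k)}(0) \in \mathbb{Z}$ and $f^{(k)}(1) \in \mathbb{Z}$ for all $k \ge 0$.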
Proof. (i) The Binomial Theorem tells us that $(1 - x)^n$ can be written as
\[ (1 - x)^n = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n \]
for suitable integers $a_0, a_1, \dots, a_n$. In fact, $a_0 = 1$, $a_1 = -n$, $a_2 = n(n - 1)/2$ and so on. In general, $a_m = (-1)^m \, n! / ((n - m)! \, m!)$. Multiplying by $x^n / n!$ then gives (i), with $c_{n+j} = a_j$ for $0 \le j \le n$.
Alternatively, the fact that the coefficients are integers can be proved by induction. Indeed, let $P(n)$ be the statement that $(1 - x)^n = a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n$ for coefficients $a_0, a_1, \dots, a_n \in \mathbb{Z}$. Then with $n = 1$, we have $(1 - x)^1 = 1 - x$ and we see that $P(1)$ is true.
Now suppose that $n \in \mathbb{N}$ and $P(n)$ is true. Then
\[ (1 - x)^{n+1} = (1 - x)(1 - x)^n = (1 - x)(a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n) \]
for coefficients $a_0, a_1, \dots, a_n \in \mathbb{Z}$. Expanding the right hand side gives
\[
\begin{aligned}
(1 - x)^{n+1} &= a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n - x (a_0 + a_1 x + a_2 x^2 + \dots + a_n x^n) \\
&= a_0 + (a_1 - a_0) x + (a_2 - a_1) x^2 + \dots + (a_n - a_{n-1}) x^n - a_n x^{n+1} .
\end{aligned}
\]
Evidently, the coefficients all belong to $\mathbb{Z}$ and so $P(n + 1)$ is true. By induction, it follows that $P(n)$ is true for all $n \in \mathbb{N}$.
(ii) If $0 < x < 1$, then also $0 < (1 - x) < 1$ and therefore both $0 < x^n < 1$ and $0 < (1 - x)^n < 1$. Hence $0 < f(x) < 1/n!$.
(iii) We first note that differentiating $k$ times the power $x^m$ and then setting $x = 0$ gives
\[ \left. \frac{d^k x^m}{dx^k} \right|_{x=0} = \begin{cases} 0, & \text{if } m \ne k \\ k!, & \text{if } m = k. \end{cases} \]
It follows directly from (i) that
\[ f^{(k)}(0) = \begin{cases} k!\, c_k / n!, & \text{if } n \le k \le 2n \\ 0, & \text{otherwise.} \end{cases} \]
Furthermore, if $k \ge n$, then $k! / n! \in \mathbb{Z}$ and so we see that $f^{(k)}(0) \in \mathbb{Z}$ for any $k \ge 0$.
Next, we use the relation $f(x) = f(1 - x)$ together with the chain rule to find $f^{(k)}(1)$. Let $u = 1 - x$. Then $du/dx = -1$ so that
\[ \frac{d f(1 - x)}{dx} = \frac{df(u)}{du} \frac{du}{dx} = \frac{df(u)}{du} (-1) . \]
Differentiating $k$ times gives
\[ \frac{d^k f(1 - x)}{dx^k} = \frac{d^k f(u)}{du^k} (-1)^k . \]
Hence, using the equality $f(x) = f(1 - x) = f(u)$, we get
\[ f^{(k)}(x) = \frac{d^k f(x)}{dx^k} = \frac{d^k f(1 - x)}{dx^k} = (-1)^k \frac{d^k f(u)}{du^k} \]
for all $x$. Putting $x = 1$ gives $u = 0$ and so $f^{(k)}(1) = (-1)^k f^{(k)}(0)$. However, we know that $f^{(k)}(0) \in \mathbb{Z}$ and so $(-1)^k f^{(k)}(0) \in \mathbb{Z}$. That is, $f^{(k)}(1) \in \mathbb{Z}$, as claimed.
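Part (iii) of the lemma can be checked by exact rational arithmetic for a particular $n$. The following sketch builds the coefficients of $f$, differentiates repeatedly, and evaluates at $0$ and $1$; the helper names and the choice $n = 4$ are made for this illustration only.

```python
from fractions import Fraction
from math import comb, factorial

def lemma_polynomial(n):
    """Coefficients (lowest degree first) of f(x) = x^n (1 - x)^n / n!."""
    coeffs = [Fraction(0)] * (2 * n + 1)
    for j in range(n + 1):
        # (1 - x)^n contributes (-1)^j C(n, j) x^j; multiplying by x^n shifts the degree.
        coeffs[n + j] = Fraction((-1) ** j * comb(n, j), factorial(n))
    return coeffs

def derivative(coeffs):
    """Coefficients of the derivative of the polynomial with the given coefficients."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Value of the polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

def derivative_values(n):
    """Pairs (f^(k)(0), f^(k)(1)) for k = 0, ..., 2n."""
    coeffs, values = lemma_polynomial(n), []
    for _ in range(2 * n + 1):
        values.append((evaluate(coeffs, 0), evaluate(coeffs, 1)))
        coeffs = derivative(coeffs)
    return values

for k, (at_zero, at_one) in enumerate(derivative_values(4)):
    print(k, at_zero, at_one)
```

Every printed value should be an integer, and the value at $1$ should equal $(-1)^k$ times the value at $0$, as in the proof.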
We are now in a position to prove the following result we are interested
in concerning the irrationality of various powers of e.
Theorem 7.19. e
r
is irrational for every r Q 0 .
Proof. We rst show that e
s
/ Q for every s N. The proof is by contra-
diction, so suppose the contrary, namely that there is some s N such that
e
s
Q. Then we can write e
s
= p/q for p, q N.
Choose and fix $n \in \mathbb{N}$ obeying the inequality $n! > p\,s^{2n+1}$ and let $f(x) = x^n(1-x)^n/n!$
as in the previous lemma, Lemma 7.17. We introduce the
following function $F(x)$ defined to be
$$F(x) = s^{2n} f(x) - s^{2n-1} f'(x) + s^{2n-2} f''(x) - s^{2n-3} f^{(3)}(x) + \cdots - s\,f^{(2n-1)}(x) + f^{(2n)}(x).$$
Now, by part (i) of Lemma 7.17, $f(x)$ has degree $2n$ and so $f^{(k)}(x) = 0$ for
all $k > 2n$. Hence, differentiating the formula above, we find that
$$F'(x) = s^{2n} f'(x) - s^{2n-1} f''(x) + s^{2n-2} f^{(3)}(x) - s^{2n-3} f^{(4)}(x) + \cdots - s\,f^{(2n)}(x) + \underbrace{f^{(2n+1)}(x)}_{=0}.$$
Hence (after many cancellations)
$$\begin{aligned}
F'(x) + sF(x) &= s^{2n} f'(x) - s^{2n-1} f''(x) + s^{2n-2} f^{(3)}(x) - s^{2n-3} f^{(4)}(x) + \cdots - s\,f^{(2n)}(x) \\
&\quad + s^{2n+1} f(x) - s^{2n} f'(x) + s^{2n-1} f''(x) - s^{2n-2} f^{(3)}(x) + \cdots - s^2 f^{(2n-1)}(x) + s\,f^{(2n)}(x) \\
&= s^{2n+1} f(x).
\end{aligned}$$
It follows that
$$\frac{d}{dx}\bigl(e^{sx}F(x)\bigr) = s\,e^{sx}F(x) + e^{sx}F'(x) = e^{sx}\bigl(sF(x) + F'(x)\bigr) = s^{2n+1} e^{sx} f(x).$$
Hence
$$I = \int_0^1 s^{2n+1} e^{sx} f(x)\,dx = \bigl[e^{sx}F(x)\bigr]_0^1 = e^s F(1) - F(0) = \frac{p}{q}\,F(1) - F(0)$$
since $e^s = p/q$. Therefore
$$q\,I = p\,F(1) - q\,F(0).$$
Now, $s \in \mathbb{N}$ and by Lemma 7.17 we know that $f^{(k)}(0) \in \mathbb{Z}$ and $f^{(k)}(1) \in \mathbb{Z}$
for all $k \ge 0$. It follows from the expression for $F(x)$ that both $F(0) \in \mathbb{Z}$
and $F(1) \in \mathbb{Z}$. Hence $q\,I \in \mathbb{Z}$. Furthermore, the integrand in the formula
for $I$ is positive on $(0,1)$ and so $I > 0$. It follows that $q\,I \in \mathbb{N}$.
Now, by Lemma 7.17 again, $0 < f(x) < 1/n!$ for $0 < x < 1$ and $e^{sx} < e^s$
for $x < 1$ and so
$$\begin{aligned}
0 < q\,I &= q \int_0^1 s^{2n+1} e^{sx} f(x)\,dx \\
&< q\,s^{2n+1} \int_0^1 e^{sx}\,\frac{1}{n!}\,dx \\
&< q\,s^{2n+1} e^s\,\frac{1}{n!} = \frac{p\,s^{2n+1}}{n!} < 1
\end{aligned}$$
by our very choice of $n$ at the start. However, there are no integers strictly
between 0 and 1 so we have finally arrived at our contradiction and we
conclude that $e^s$ is irrational for every $s \in \mathbb{N}$.
Let $m \in \mathbb{N}$. Since $e^{-m} = 1/e^m$ and we have just shown that $e^m \notin \mathbb{Q}$, it
follows that $e^{-m} \notin \mathbb{Q}$ and so we may conclude that $e^s \notin \mathbb{Q}$ for all $s \in \mathbb{Z} \setminus \{0\}$.
Now let $r \in \mathbb{Q} \setminus \{0\}$. Write $r = m/n$ for $m \in \mathbb{Z} \setminus \{0\}$ and $n \in \mathbb{N}$. If
$e^r$ were rational, it would follow that $e^m = (e^{m/n})^n = (e^r)^n$ is also rational.
But we know that $e^m$ is irrational for every $m \in \mathbb{Z} \setminus \{0\}$ and so it follows
that $e^r$ is irrational and the proof is complete.
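The key identity $F'(x) + sF(x) = s^{2n+1} f(x)$, obtained in the proof "after many cancellations", can be confirmed symbolically for small cases. The sketch below is an illustration only, assuming polynomials are stored as coefficient lists (lowest degree first); the helper names `poly_f`, `diff`, `add`, `scale`, `build_F`, `strip` and `identity_holds` are hypothetical and do not come from the notes.

```python
from fractions import Fraction
from math import comb, factorial


def poly_f(n):
    """Coefficients of f(x) = x^n (1-x)^n / n!, lowest degree first."""
    tail = [Fraction(comb(n, j) * (-1) ** j, factorial(n)) for j in range(n + 1)]
    return [Fraction(0)] * n + tail


def diff(p):
    """Derivative of a polynomial given by its coefficient list."""
    return [k * c for k, c in enumerate(p)][1:]


def add(p, q):
    """Sum of two polynomials, padding the shorter one with zeros."""
    m = max(len(p), len(q))
    p = p + [Fraction(0)] * (m - len(p))
    q = q + [Fraction(0)] * (m - len(q))
    return [a + b for a, b in zip(p, q)]


def scale(c, p):
    """Multiply every coefficient of p by the constant c."""
    return [c * a for a in p]


def build_F(n, s):
    """F = sum over k = 0..2n of (-1)^k s^(2n-k) f^(k)."""
    F = []
    d = poly_f(n)
    for k in range(2 * n + 1):
        F = add(F, scale(Fraction((-1) ** k * s ** (2 * n - k)), d))
        d = diff(d)
    return F


def strip(p):
    """Drop trailing zero coefficients so that lists can be compared."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def identity_holds(n, s):
    """Check F' + s F == s^(2n+1) f exactly."""
    F = build_F(n, s)
    lhs = add(diff(F), scale(Fraction(s), F))
    rhs = scale(Fraction(s ** (2 * n + 1)), poly_f(n))
    return strip(lhs) == strip(rhs)
```

For instance, `identity_holds(1, 2)` corresponds to $f(x) = x - x^2$ and $F(x) = -4 + 8x - 4x^2$, for which $F' + 2F = 8(x - x^2)$ can also be verified by hand.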
Compound Interest
If one pound is invested for one year at an annual interest rate of 100r% and
compounded at n regular intervals, the compound interest formula states
that its value on maturity is $(1 + r/n)^n$ pounds. For large $n$, this value is approximately
equal to $e^r$. We shall see why this is so.
Proposition 7.20. For fixed $r > 0$, let $x_n = (1 + r/n)^n$, $n \in \mathbb{N}$. Then $(x_n)$ is
a bounded increasing sequence and so converges.
Proof. Using the Binomial Theorem, we write
$$x_n = \bigl(1 + r/n\bigr)^n = \sum_{k=0}^{n} \binom{n}{k} \Bigl(\frac{r}{n}\Bigr)^k = \sum_{k=0}^{n} \frac{n(n-1)\cdots(n-(k-1))}{k!} \Bigl(\frac{r}{n}\Bigr)^k = \sum_{k=0}^{n} b_k(n)\,r^k$$
where
$$b_k(n) = \frac{n(n-1)\cdots(n-(k-1))}{k!\,n^k} = \frac{1}{k!}\cdot 1 \cdot \Bigl(1 - \frac{1}{n}\Bigr)\Bigl(1 - \frac{2}{n}\Bigr)\cdots\Bigl(1 - \frac{k-1}{n}\Bigr).$$
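Now, as $n$ increases, $j/n$ decreases and so $\bigl(1 - \frac{j}{n}\bigr)$ increases. In other
words, for each fixed $k \le n$, $b_k(n) \le b_k(n+1)$, with strict inequality when $k \ge 2$
(for $k = 0$ and $k = 1$ both sides are equal to 1). Since also $b_{n+1}(n+1)\,r^{n+1} > 0$, it follows that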
$$x_{n+1} = \sum_{k=0}^{n+1} b_k(n+1)\,r^k = \sum_{k=0}^{n} b_k(n+1)\,r^k + b_{n+1}(n+1)\,r^{n+1} > \sum_{k=0}^{n} b_k(n)\,r^k = x_n$$
which shows that $(x_n)$ is an increasing sequence. Moreover, it is clear that
$b_k(n) \le \frac{1}{k!}$ so we see that
$$x_n = \sum_{k=0}^{n} b_k(n)\,r^k \le \sum_{k=0}^{n} \frac{r^k}{k!} < e^r$$
and the proof is complete.
We can now establish the result we are interested in.
Theorem 7.21. For any fixed $r > 0$, $\bigl(1 + r/n\bigr)^n \to e^r$ as $n \to \infty$.
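Proof. Let $\varepsilon > 0$ be given.
Using the notation established above, we know that $x_n = (1 + r/n)^n \to \ell$
for some $\ell \in \mathbb{R}$. We must show that $\ell = e^r$. Let $N_1 \in \mathbb{N}$ be such that if
$n > N_1$ then
$$|x_n - \ell| < \tfrac{1}{5}\,\varepsilon.$$
Next, let $s_n = \sum_{k=0}^{n} r^k/k!$ and let $N_2 \in \mathbb{N}$ be such that if $n > N_2$ then
$$|s_n - e^r| < \tfrac{1}{5}\,\varepsilon.$$
(We know that $s_n \to e^r$.)
Now we note that for each fixed $k$, $b_k(n) \to 1/k!$ as $n \to \infty$. Fix
$N > N_1 + N_2$ and let $N_3 \in \mathbb{N}$ be such that $N_3 > N$ and if $n > N_3$ then
$$\Bigl|\sum_{k=0}^{N} b_k(n)\,r^k - \sum_{k=0}^{N} \frac{1}{k!}\,r^k\Bigr| < \tfrac{1}{5}\,\varepsilon.$$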
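For any $n > N_3$, we have
$$\begin{aligned}
|x_n - s_n| &= \Bigl|\sum_{k=0}^{n} b_k(n)\,r^k - \sum_{k=0}^{n} \frac{1}{k!}\,r^k\Bigr| \\
&= \Bigl|\sum_{k=0}^{N} b_k(n)\,r^k - \sum_{k=0}^{N} \frac{1}{k!}\,r^k + \sum_{k=N+1}^{n} b_k(n)\,r^k - \sum_{k=N+1}^{n} \frac{1}{k!}\,r^k\Bigr| \\
&\le \Bigl|\sum_{k=0}^{N} b_k(n)\,r^k - \sum_{k=0}^{N} \frac{1}{k!}\,r^k\Bigr| + 2\sum_{k=N+1}^{n} \frac{1}{k!}\,r^k \\
&< \tfrac{1}{5}\,\varepsilon + 2\sum_{k=N+1}^{\infty} \frac{1}{k!}\,r^k \\
&= \tfrac{1}{5}\,\varepsilon + 2\,(e^r - s_N) \\
&< \tfrac{1}{5}\,\varepsilon + \tfrac{2}{5}\,\varepsilon = \tfrac{3}{5}\,\varepsilon.
\end{aligned}$$
But then, for $n > N_3$,
$$|\ell - e^r| \le |\ell - x_n| + |x_n - s_n| + |s_n - e^r| < \tfrac{1}{5}\,\varepsilon + \tfrac{3}{5}\,\varepsilon + \tfrac{1}{5}\,\varepsilon = \varepsilon$$
and we conclude that $\ell = e^r$.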
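The convergence of $(1 + r/n)^n$ to $e^r$ can be observed numerically. The following sketch is an illustration only; the rate `r = 0.05` and the list of compounding frequencies are arbitrary choices of ours, and the function names `compound_value` and `partial_exp_sum` are hypothetical.

```python
import math


def compound_value(r, n):
    """Value of one pound after a year at rate r, compounded n times."""
    return (1.0 + r / n) ** n


def partial_exp_sum(r, n):
    """Partial sum s_n = sum_{k=0}^{n} r^k / k! of the exponential series."""
    return sum(r ** k / math.factorial(k) for k in range(n + 1))


# Assumed sample data: 5% annual interest, compounded yearly, half-yearly,
# quarterly, monthly, weekly and daily.
r = 0.05
periods = [1, 2, 4, 12, 52, 365]
values = [compound_value(r, n) for n in periods]
limit = math.exp(r)

for n, x_n in zip(periods, values):
    # The gap e^r - x_n stays positive and shrinks as n grows.
    print(f"n = {n:4d}   x_n = {x_n:.8f}   e^r - x_n = {limit - x_n:.2e}")
```

The printed values increase with `n` and stay below `math.exp(r)`, which is the behaviour one expects from a bounded increasing sequence with limit $e^r$. Floating-point arithmetic is used here, so this is numerical evidence rather than a proof.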
Corollary 7.22. For any $r > 0$, $(1 - r/n)^n \to e^{-r}$ as $n \to \infty$.
Proof. Let $y_n = (1 - r^2/n^2)^n$ and suppose that $n$ is so large that $r/n < 1$.
For such $n$, we see that
$$\begin{aligned}
0 < 1 - y_n &= 1 - (1 - r^2/n^2)^n = 1 - \sum_{k=0}^{n} b_k(n) \Bigl(-\frac{r^2}{n}\Bigr)^k \\
&= -\sum_{k=1}^{n} b_k(n) \Bigl(-\frac{r^2}{n}\Bigr)^k \le \sum_{k=1}^{n} b_k(n) \Bigl(\frac{r^2}{n}\Bigr)^k < e^{r^2/n} - 1.
\end{aligned}$$
Now $e^{r^2/n} \to 1$ and so by the Sandwich Principle, we see that $y_n \to 1$ as
$n \to \infty$. However, we then find that
$$(1 - r/n)^n = \frac{(1 - r^2/n^2)^n}{(1 + r/n)^n} \to \frac{1}{e^r} = e^{-r}$$
as required.
The logarithm
The logarithm is defined via the exponential function. We know that $\exp x$
maps $\mathbb{R}$ one-one onto $(0, \infty)$. This means that to each $x \in (0, \infty)$ there is
one and only one $v \in \mathbb{R}$ such that $\exp v = x$.
Definition 7.23. For $x \in (0, \infty)$, $\log x$ is the value $v \in \mathbb{R}$ such that $\exp v = x$.
It follows that $x \mapsto \log x$ maps $(0, \infty)$ onto $\mathbb{R}$.
In other words, $\log x$ is defined by the formula $e^{\log x} = x$ for $x > 0$.
Remark 7.24. The notation $\ln x$ is also used for the function $\log x$ here. The
notation ln emphasizes the fact that this is the logarithm to base $e$, the
so-called natural logarithm.
Proposition 7.25. The function $\log x$ has the following properties.
(i) $\log 1 = 0$ and $\log e = 1$.
(ii) For any $s, t > 0$, $\log(st) = \log s + \log t$.
(iii) For any $x > 0$, $\log(1/x) = -\log x$.
(iv) $\log x$ is strictly increasing and $\log x \to \infty$ as $x \to \infty$.
(v) $(\log x)/x \to 0$ as $x \to \infty$.
Proof. We shall make use of the identity $\log(e^s) = s$ for $s \in \mathbb{R}$.
(i) We have $\log 1 = \log(e^0) = 0$. Also $\log e = \log e^1 = 1$.
(ii) For any $s, t > 0$, we have
$$\log(st) = \log(e^{\log s}\,e^{\log t}) = \log(e^{\log s + \log t}) = \log s + \log t.$$
(iii) We have
$$\log(1/x) = \log(1/e^{\log x}) = \log(e^{-\log x}) = -\log x.$$
(iv) Suppose that $a < b$. Then $e^{\log a} = a < b = e^{\log b}$ and so we have
$\log a < \log b$ because $\exp x$ is strictly increasing.
Now let $M > 0$ be given. Set $m = e^M$. Then if $x > m$, it follows that
$\log x > \log m$, that is, $\log x > \log(e^M) = M$.
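(v) Let $v = \log x$. Then $x = e^v$ and
$$\frac{\log x}{x} = \frac{v}{e^v}.$$
Now, if $x \to \infty$ then also $\log x \to \infty$, that is, $v \to \infty$. However,
we already know that $v/e^v \to 0$ as $v \to \infty$ and so $(\log x)/x \to 0$ as
$x \to \infty$.
The proof is complete.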
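The properties in Proposition 7.25 can be spot-checked with the natural logarithm from Python's standard `math` module. This is a numerical illustration under the assumption that `math.log` agrees with the function $\log x$ defined above; the sample values `s` and `t` and the helper name `log_over_x` are our own.

```python
import math


def log_over_x(x):
    """The ratio (log x)/x from part (v)."""
    return math.log(x) / x


s, t = 2.5, 7.0  # arbitrary positive sample values

checks = {
    # (i) log 1 = 0 and log e = 1
    "log 1 = 0": math.isclose(math.log(1.0), 0.0, abs_tol=1e-15),
    "log e = 1": math.isclose(math.log(math.e), 1.0, rel_tol=1e-12),
    # (ii) log(st) = log s + log t
    "product rule": math.isclose(math.log(s * t), math.log(s) + math.log(t), rel_tol=1e-12),
    # (iii) log(1/x) = -log x
    "reciprocal rule": math.isclose(math.log(1.0 / s), -math.log(s), rel_tol=1e-12),
    # (iv) log is strictly increasing
    "increasing": math.log(s) < math.log(t),
}

for name, ok in checks.items():
    print(f"{name}: {ok}")

# (v) the ratio (log x)/x becomes small as x grows
for x in (10.0, 1e3, 1e6):
    print(f"x = {x:.0e}   (log x)/x = {log_over_x(x):.3e}")
```

Floating-point comparisons are made with a tolerance, since identities such as $\log(st) = \log s + \log t$ hold only up to rounding error on a computer.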
Theorem 7.26. The function $\log x$ is continuous at each point in $(0, \infty)$.
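Proof. Let $s \in (0, \infty)$ be given and let $\varepsilon > 0$ be given. We know that the
function $t \mapsto e^t$ is strictly increasing, that is, $\alpha < \beta$ if and only if $e^\alpha < e^\beta$.
This means that
$$\alpha < t < \beta \iff e^\alpha < e^t < e^\beta.$$
In particular, if $\alpha = \log s - \varepsilon$, $t = \log x$ and $\beta = \log s + \varepsilon$, this becomes
$$\log s - \varepsilon < \log x < \log s + \varepsilon \iff s\,e^{-\varepsilon} < x < s\,e^{\varepsilon}.$$
Let $\delta = \min\{\, s\,e^{\varepsilon} - s,\; s - s\,e^{-\varepsilon} \,\}$. Then
$$|x - s| < \delta \implies s - \delta < x < s + \delta \implies s\,e^{-\varepsilon} < x < s\,e^{\varepsilon}.$$
Therefore $\log s - \varepsilon < \log x < \log s + \varepsilon$, that is, $|\log x - \log s| < \varepsilon$, and it follows that
$\log x$ is continuous at $s$.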
The next theorem tells us what the derivative of log x is.
Theorem 7.27. The function $\log x$ is differentiable at every $s > 0$ and its
derivative at $s$ is $1/s$. (In other words, $\frac{d}{dx} \log x = 1/x$ on $(0, \infty)$.)
Proof. Let $s \in (0, \infty)$ be given and let $h \neq 0$ be small enough that $s + h > 0$. We must show
that the Newton quotient $\frac{1}{h}\bigl(\log(s+h) - \log(s)\bigr)$ approaches $1/s$ as $h \to 0$. To
see this, let $v = \log s$ and let $k = \log(s+h) - \log s$, so that $\log(s+h) = v + k$.
The continuity of $\log x$ at $s$ implies that $\log(s+h) \to \log(s)$ as $h \to 0$, that
is, $k \to 0$ as $h \to 0$. In terms of $v$ and $k$, we have
$$h = (s+h) - s = e^{\log(s+h)} - e^{\log s} = e^{v+k} - e^v = \exp(v+k) - \exp(v).$$
Note also that since $h \neq 0$, it follows that $s + h \neq s$ and so $\log(s+h) \neq \log s$
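which means that $k \neq 0$. Using these remarks, we get
$$\begin{aligned}
\frac{\log(s+h) - \log(s)}{h} &= \frac{k}{\exp(v+k) - \exp(v)} = \Bigl(\frac{\exp(v+k) - \exp(v)}{k}\Bigr)^{-1} \\
&\to \bigl(\exp'(v)\bigr)^{-1} \quad \text{as } h \to 0 \text{ (since also } k \to 0\text{)}, \\
&= \frac{1}{\exp(v)} = \frac{1}{s}
\end{aligned}$$
as required.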
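The limit of the Newton quotient can be illustrated numerically. The sketch below is an illustration only, assuming that `math.log` is the natural logarithm discussed here; the function name `newton_quotient` and the sample point `s = 2.0` are our own choices.

```python
import math


def newton_quotient(s, h):
    """Newton quotient (log(s + h) - log(s)) / h for s > 0 and h != 0."""
    if s <= 0 or s + h <= 0 or h == 0:
        raise ValueError("need s > 0, s + h > 0 and h != 0")
    return (math.log(s + h) - math.log(s)) / h


s = 2.0
for h in (1e-1, 1e-2, 1e-3, 1e-4, -1e-4):
    q = newton_quotient(s, h)
    # The difference from 1/s shrinks roughly in proportion to |h|.
    print(f"h = {h:+.0e}   quotient = {q:.8f}   quotient - 1/s = {q - 1 / s:+.2e}")
```

Very small values of `h` are avoided on purpose: in floating-point arithmetic the subtraction `math.log(s + h) - math.log(s)` loses accuracy when `h` is tiny, so the quotient eventually stops improving.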
For any positive real number $a$, we know what the power $a^k$ means for
any $k \in \mathbb{N}$. We also know what $a^{p/q}$ means for $p, q \in \mathbb{N}$: it is the positive real
number whose $q$th power is equal to $a^p$. However, it is not at all clear what
a power such as $3^{\sqrt{2}}$ means. We would like to set up a reasonable definition
for powers such as this. We need some preliminary results.
Proposition 7.28. For any $a > 0$ and $m, n \in \mathbb{N}$,
$$\log(a^{m/n}) = \frac{m}{n}\,\log(a).$$
Proof. First note that $\log(s^k) = k \log(s)$ for any $s > 0$ and $k \in \mathbb{N}$. We shall
verify this by induction. For $k \in \mathbb{N}$, let $P(k)$ be the statement $\log(s^k) = k \log(s)$.
Evidently, $P(1)$ is true. Using the previous proposition, we see
that
$$\log(s^{k+1}) = \log(s^k\,s) = \log(s^k) + \log(s) = k \log(s) + \log(s) = (k+1) \log(s)$$
if $P(k)$ is true. Hence the truth of $P(k+1)$ follows from that of $P(k)$ and
so, by induction, we conclude that $P(k)$ is true for all $k \in \mathbb{N}$.
Let $t = a^{1/n}$ so that $t^n = a$ and $t^m = a^{m/n}$. We have
$$\begin{aligned}
n \log(a^{m/n}) &= n \log(t^m) \\
&= n\,m \log(t) \\
&= m \log(t^n) \\
&= m \log(a)
\end{aligned}$$
and it follows that $\log(a^{m/n}) = \frac{m}{n} \log(a)$.
From the above, we see that $a^{m/n} = \exp\bigl(\frac{m}{n} \log(a)\bigr)$. Moreover,
$a^{-m/n} = 1/a^{m/n} = 1/\exp\bigl(\frac{m}{n} \log(a)\bigr) = \exp\bigl(-\frac{m}{n} \log(a)\bigr)$, and $a^0 = 1 = e^0$.
Hence $a^r = e^{r \log a}$ for any $a > 0$ and any $r \in \mathbb{Q}$. Now, the left hand side, $a^r$, has so far
been given a meaning only for rational $r$, but the right hand side, namely $e^{r \log a}$ (which is
short-hand notation for $\exp(r \log a)$), is well-defined for any real number $r$. This suggests the
following definition of the power $a^s$ for any $s \in \mathbb{R}$.
Definition 7.29. For $a > 0$ and $s \in \mathbb{R}$, the power $a^s$ is defined to be
$$a^s = e^{s \log a}.$$
A further remark is in order here. By setting $a = e$, the real number
$\exp(1)$, we have a formula for the power $e^s$. But this is just
$$e^s = \exp(s \log e) = \exp s$$
since $\log e = 1$. This is in agreement with our use of the short-hand
notation $e^s = \exp s$, so everything works out all right; that is, we can
think of the expression $e^s$ as being the power, as defined above, or as an
abbreviation for the exponential, $\exp s$. These are the same thing. The next
proposition tells us that the expected power laws hold.
Proposition 7.30. For any $a \in (0, \infty)$ and any $s, t \in \mathbb{R}$, we have
$$a^{s+t} = a^s\,a^t \quad \text{and} \quad (a^s)^t = a^{st}.$$
Proof. From the definition, we have
$$a^{s+t} = \exp((s+t) \log a) = \exp(s \log a)\,\exp(t \log a) = a^s\,a^t.$$
Similarly,
$$(a^s)^t = \exp(t \log(a^s)) = \exp(t \log(e^{s \log a})) = \exp(t\,s \log a) = a^{st},$$
as required.
Proposition 7.31. For any $a \in \mathbb{R}$, the function $f(x) = x^a$ is differentiable
on $(0, \infty)$ and
$$f'(x) = a\,x^{a-1}.$$
Proof. From the definition, $f(x) = x^a = e^{a \log x}$ and so the standard rules
of differentiation imply that
$$f'(x) = \frac{a}{x}\,e^{a \log x} = a\,\frac{x^a}{x} = a\,x^{a-1}$$
as claimed.
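The definition $a^s = e^{s \log a}$ and the results that follow from it translate directly into a few lines of code. The sketch below is an illustration only: the function names `power` and `central_difference` are hypothetical, and Python's built-in `**` operator is used merely as an independent reference for comparison.

```python
import math


def power(a, s):
    """The power a^s for a > 0 and real s, computed as exp(s * log a)."""
    if a <= 0:
        raise ValueError("the base a must be positive")
    return math.exp(s * math.log(a))


def central_difference(g, x, h=1e-6):
    """Symmetric difference quotient approximating g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)


root2 = math.sqrt(2.0)

# An irrational exponent: 3 to the power sqrt(2).
print(power(3.0, root2), 3.0 ** root2)

# Power laws: a^(s+t) = a^s a^t and (a^s)^t = a^(st).
a, s, t = 3.0, root2, 0.75
print(power(a, s + t), power(a, s) * power(a, t))
print(power(power(a, s), t), power(a, s * t))

# Derivative of x^a at x = 3 with a = sqrt(2), compared with a x^(a-1).
x = 3.0
print(central_difference(lambda y: power(y, root2), x), root2 * power(x, root2 - 1.0))
```

Each printed pair should agree up to rounding error (and, for the derivative, up to the small error of the difference quotient). Restricting `power` to `a > 0` mirrors the definition, since $\log a$ is only defined for positive $a$.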