
EMT4801/1

Contents

Module 1. SEQUENCES AND SERIES
UNIT 1: REVISION OF SEQUENCES
UNIT 2: SERIES AND THE CONVERGENCE THEREOF
UNIT 3: POWER SERIES
UNIT 4: POWER SERIES EXPANSION OF FUNCTIONS
Module 2. COMPLEX ANALYSIS
UNIT 1: REVISION OF COMPLEX NUMBERS
UNIT 2: COMPLEX FUNCTIONS AND MAPPINGS
UNIT 3: COMPLEX DIFFERENTIATION
UNIT 4: COMPLEX INTEGRATION
UNIT 5: COMPLEX SERIES
UNIT 6: RESIDUES AND INTEGRATION REVISITED
A RESUMÉ OF COMPLEX INTEGRATION
Module 3. LAPLACE TRANSFORMS – CONTINUOUS SIGNALS AND SYSTEMS
UNIT 1: DEFINITIONS AND PROPERTIES
UNIT 2: THE LAPLACE METHOD FOR SOLVING DIFFERENTIAL EQUATIONS
UNIT 3: ENGINEERING APPLICATIONS
UNIT 4: STEP AND IMPULSE FUNCTIONS
UNIT 5: TRANSFER FUNCTIONS
UNIT 6: STATE SPACE EQUATIONS
Module 4. Z-TRANSFORMS – DISCRETE SIGNALS AND SYSTEMS
UNIT 1: DEFINITIONS AND PROPERTIES
UNIT 2: THE INVERSE Z-TRANSFORM
UNIT 3: THE Z-TRANSFORM METHOD FOR SOLVING DIFFERENCE EQUATIONS
UNIT 4: TRANSFER FUNCTIONS AND STABILITY
UNIT 5: STATE-SPACE EQUATIONS
Appendix A. MATRIX THEORY
A.1. OBJECTIVE
A.2. BASIC CONCEPTS
A.3. TYPES OF MATRICES
A.4. DETERMINANTS
A.5. OPERATIONS WITH MATRICES
A.6. INVERSE MATRICES
A.7. APPLICATION: THE SOLUTION OF SYSTEMS OF SIMULTANEOUS LINEAR EQUATIONS
A.8. EIGENVALUES AND EIGENVECTORS
Appendix B. TABLE OF LAPLACE TRANSFORMS
Appendix C. TABLE OF Z-TRANSFORMS

MODULE 1

SEQUENCES AND SERIES

UNIT 1: REVISION OF SEQUENCES


1.1.1. OBJECTIVE. Revision of work previously done on sequences and the
introduction of new sequences.

1.1.2. OUTCOMES. At the end of this unit the student should


• Understand what is meant by the phrase “convergence of a sequence”;
• Know how to compute the limits of some basic convergent sequences (in-
cluding ones that are defined by some given recursive formula);
• Know what is meant by the terms arithmetic progression, harmonic pro-
gression and geometric progression, and be able to identify such sequences.

1.1.3. INTRODUCTION. This module is a prerequisite for module 2, unit 5. In this module, sequences and series done in earlier courses will be revised. New classes of series as well as more convergence tests will also be presented.

1.1.4. DEFINITIONS. If to any integer value of n there is assigned a quantity u_n, the set

. . . , u_{-n}, u_{-n+1}, . . . , u_{-2}, u_{-1}, u_0, u_1, u_2, . . . , u_{n-1}, u_n, . . .

is called a two-sided infinite sequence and is written as {u_i}_{i=-∞}^{i=∞}.

If for some fixed N ∈ N we consider only

u_{-N}, u_{-N+1}, . . . , u_{-2}, u_{-1}, u_0, u_1, u_2, . . . , u_{N-1}, u_N

we call this set a two-sided finite sequence and write it as {u_i}_{i=-N}^{i=N}.

Sets of the form

u_1, u_2, u_3, . . . , u_N    and    u_1, u_2, u_3, . . . , u_n, . . .

are respectively called one-sided finite sequences and one-sided infinite sequences, and are written as

(1.1.1)    {u_i}_{i=1}^{i=N}    and    {u_i}_{i=1}^{i=∞}

In the following we will consider only one-sided sequences! Given an infinite sequence {u_i}_{i=1}^{i=∞} we say that the sequence converges to some number u if the terms u_n approach u as n increases. In such a case we say that u is the limit of the sequence, and we write lim_{n→∞} u_n = u. In precise mathematical terms we say that the sequence {u_i}_{i=1}^{i=∞} converges to u if, given any distance ε > 0 (no matter how small), there exists some positive integer N such that for all terms u_n with n ≥ N we have

|u_n − u| < ε.
If such a limit does NOT exist, we say that the sequence diverges.
There are basically two things that can go wrong and cause a sequence to diverge. If for example we had that either u_n → ∞ or u_n → −∞ as n increases, the limit would fail to exist, so in such a case the sequence would diverge. Thus if {u_n}_{n=1}^{n=∞} is unbounded, the sequence will diverge. However, even if {u_n}_{n=1}^{n=∞} is bounded, the sequence may still diverge. If, say, some of the terms of the sequence oscillate between two or more numbers as n increases (instead of getting closer to one unique number), the limit would also fail to exist and the sequence would diverge. For example, the terms of the infinite sequence

1, −1, 1, −1, 1, . . .

do not move closer to one single number, but rather jump around between 1 and −1. Hence this sequence diverges.
If we randomly select an increasing sequence of natural numbers n_1 < n_2 < n_3 < . . . , then we may construct a new sequence from a given sequence u_1, u_2, u_3, . . . by setting a_1 = u_{n_1}, a_2 = u_{n_2}, a_3 = u_{n_3}, etc. This new sequence is called a subsequence of the original sequence. Any sequence does of course have infinitely many subsequences. For example, suppose we are given a sequence u_1 = −1, u_2 = −2, u_3 = −5, u_4 = −12, . . . , u_n = n − 2^n, . . . . One possible subsequence is given by taking all the terms with even indices. This gives a_1 = u_2 = −2, a_2 = u_4 = −12, a_3 = u_6 = 6 − 2^6 = −58, . . . Another possible subsequence is given by selecting all the terms for which the indices are prime numbers. This gives b_1 = u_2 = −2, b_2 = u_3 = −5, b_3 = u_5 = 5 − 2^5 = −27, . . .
If a given sequence u1 , u2 , u3, . . . converges, then all its subsequences will con-
verge to the same limit. However if the sequence we start with does not converge,
its subsequences can behave very differently from the original sequence. For ex-
ample the sequences 1, 1, 1, . . . and −1, −1, −1, . . . , both converge and are both
subsequences of the sequence 1, −1, 1, −1, 1, . . . , which does NOT converge.
Suppose now we are given a one-sided infinite sequence {u_i}_{i=1}^{i=∞}. The formal sum

u_1 + u_2 + u_3 + . . . + u_n + . . .

of the elements is called a series, and is written as

(1.1.2)    S = ∑_{n=1}^{∞} u_n = u_1 + u_2 + u_3 + . . . + u_n + . . .

The idea of such a “sum” may be intuitively clear, but we have to be sure that it makes mathematical sense. This is not as straightforward as one may think. For example, if we want to “compute” such a sum we can't really take a calculator and add up all the terms until we are done. After all, there are infinitely many terms, and no matter how many we have added up at any given time, there will still be infinitely many left to add! So how exactly do we make mathematical sense of the idea of an infinite sum? The way we do this is by constructing a NEW sequence from the old. From the sequence {u_i}_{i=1}^{i=∞}, we construct the sequence

S_1, S_2, S_3, . . .

of partial sums, where for each n ∈ N we set

S_n = ∑_{i=1}^{n} u_i = u_1 + u_2 + u_3 + . . . + u_n.

This new sequence {S_n}_{n=1}^{n=∞} is called the series associated with {u_i}_{i=1}^{i=∞}. If the partial sums S_n approach a limit, say S, as n increases to infinity, then we say that the series converges and we call the limit lim_{n→∞} S_n = S the sum or limit of the series. In such a case we will write

S = ∑_{i=1}^{∞} u_i = u_1 + u_2 + u_3 + . . .

for this limit of partial sums. We remind the reader that the existence of such a limit S is equivalent to saying that there exists a quantity S such that for any quantity ε > 0 (no matter how small), there exists a positive integer N such that for all the partial sums S_n with n ≥ N, the error |S − S_n| is less than ε; that is, we have that

|S − S_n| < ε for all n ≥ N.
If Sn fails to approach a limit as n → ∞, we say that the series is divergent.
As was the case with sequences, there are basically two things that can go wrong and cause a series to diverge. If the sequence of partial sums {S_n}_{n=1}^{n=∞} is either unbounded, or else tends to oscillate between two or more numbers as n increases, there would be no way to give meaning to the idea of “sum”. So in these cases, the series will diverge. For example, for the series of constant terms 2 + 2 + 2 + 2 + . . . the partial sums S_n = ∑_{i=1}^{n} 2 = 2 + 2 + · · · + 2 (n terms) = 2n keep on increasing, whereas for the series 2 − 2 + 2 − 2 + 2 − 2 + . . . the partial sums oscillate between 2 and 0. Hence both these series diverge.

Now if u_n ≥ 0 for every n, then of course S_n = u_1 + u_2 + u_3 + · · · + u_n will not oscillate, but will keep on growing as we add more terms. So for such a series, the only thing that can go wrong and cause it to diverge is if {S_n}_{n=1}^{n=∞} is unbounded.
In the case where the terms in equation (1.1.2) are alternately positive and negative, the series is called an alternating series.

If the series of absolute values

∑_{n=1}^{∞} |u_n|

converges, the series is called absolutely convergent. If a given series converges, but the series of absolute values does not, we say that such a series is conditionally convergent.
Although we will not prove it, we take some time to note a very interesting
property of absolutely convergent series, namely
a series of real numbers converges absolutely if and only if the
terms may be rearranged in any order without altering the limit
to which the series converges.
This means that if we have a series which converges conditionally, we need to be
very careful about the order in which we add the terms! Add them up in the wrong
order, and you will get the wrong answer!

1.1.5. SEQUENCES. We start off this section with two examples demonstrating how the concept of a sequence may arise naturally in applications.

A certain bacterium propagates itself by subdividing, creating four additional bacteria. If the bacteria keep on subdividing in this manner n times then, assuming none of them die, the number of bacteria present after each subdivision is given by the sequence:

{b_k}_{k=0}^{k=n}

where

b_0 = 1
b_1 = 1 + 4 = 5
b_2 = 1 + 4 + 4^2 = 21
b_3 = 1 + 4 + 4^2 + 4^3 = 85
. . .
b_n = 1 + 4 + 4^2 + . . . + 4^n

Hence, each term of the sequence 1, 5, 21, 85, . . . may be expressed explicitly
in terms of a formula depending on n only. This is not always the case. Often
sequences are defined by means of a recurrence relation, as in the next example.
Newton's method for finding the square root of a number uses the following recurrence relation:

x_{n+1} = (1/2)(x_n + a/x_n)

where a is the number of which the square root is to be found, and x_0 the first approximation. As n increases, the terms x_n converge extremely rapidly to √a. (On a historical note, we observe that although this algorithm is commonly called “Newton's method”, it was in fact known to the Babylonians!)
This formula uses the idea that if x is an approximation to √a, then a/x will also be an approximation to √a, so that the average of the two approximations will be an even better approximation. Now if the successive approximations x_n do converge to some non-zero limit b, then from the equation displayed above it follows that such a b must satisfy b = (1/2)(b + a/b), or equivalently b^2 − a = 0. Thus b = √a as required. Let's suppose we want to calculate √3. To do this we let

a = 3

and take

x_0 = 1

as a first approximation. The sequence {x_n}_{n=1}^{n=∞} is then defined by

x_{n+1} = (1/2)(x_n + 3/x_n),    n ≥ 0.

Therefore

x_1 = (1/2)(1 + 3/1) = 2
x_2 = (1/2)(2 + 3/2) = 1,75
x_3 = (1/2)(1,75 + 3/1,75) = 1,732142857


Proceeding inductively we get the following sequence of approximations to √3:

x_0 = 1
x_1 = 2
x_2 = 1,75
x_3 = 1,732142857 . . .
x_4 = 1,732050810 . . .
x_5 = 1,732050807 . . .
x_6 = 1,732050807 . . .

In the previous example it can be seen that the sequence {x_n}_{0}^{∞} gets closer to √3 after each step (x_5 = x_6 to nine decimals). The sequence converges, and the limit of x_n turns out to be √3 as n → ∞.
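The recursion is easy to experiment with numerically. Below is a small Python sketch (our own illustration, not part of the original notes; the function name babylonian_sqrt is an arbitrary choice) that reproduces the table of approximations above.

    import math

    # Babylonian (Newton) iteration: x_{n+1} = (x_n + a/x_n)/2.
    def babylonian_sqrt(a, x0=1.0, steps=6):
        xs = [x0]
        for _ in range(steps):
            xs.append(0.5 * (xs[-1] + a / xs[-1]))
        return xs

    for n, x in enumerate(babylonian_sqrt(3)):
        print(n, x)        # x_5 and x_6 agree with sqrt(3) to nine decimals
    print(math.sqrt(3))    # 1.7320508075688772, for comparison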

1.1.5.1. Properties of convergent sequences. Convergent sequences display the following basic properties:

• A convergent sequence is bounded. That is, if {x_n}_{0}^{∞} is convergent, then there is a positive number M such that |x_n| < M for all n.
(It is actually quite easy to prove this! Suppose x_n → a as n → ∞. For a fixed ε > 0 we can then find N so that |x_n − a| < ε for n ≥ N. But then |x_n| < ε + |a| for n ≥ N. So if we set M = max{|x_1|, |x_2|, . . . , |x_{N−1}|, ε + |a|}, we will have that |x_n| < M for all n as required.)
• If lim_{n→∞} a_n = a and lim_{n→∞} b_n = b, then
(a) lim_{n→∞} (a_n + b_n) = a + b;
(b) lim_{n→∞} (a_n − b_n) = a − b;
(c) lim_{n→∞} (a_n b_n) = ab;
(d) if b ≠ 0 and b_n ≠ 0 for all n, then lim_{n→∞} (a_n/b_n) = a/b.

We show how the above properties may be used to compute limits of sequences.

Example 1.1.1. Determine the limit of the sequence {x_n}_{n=1}^{∞} for

(a) x_n = n/(n + 1)
(b) x_n = (3n^2 − 4n + 1)/(6n^2 + 50)
(c) x_{n+1} = 1 + 1/x_n,    x_n > 0

Solution:
(a)

lim_{n→∞} x_n = lim_{n→∞} n/(n + 1) = lim_{n→∞} 1/(1 + 1/n) = 1/(1 + lim_{n→∞} 1/n) = 1/(1 + 0) = 1

[using properties (a) and (d)].
(b) Similarly to (a) above, we get

x_n = (3n^2 − 4n + 1)/(6n^2 + 50) = (3 − 4/n + 1/n^2)/(6 + 50/n^2) → (3 − 0 + 0)/(6 + 0) = 1/2 as n → ∞.
(c)

x_{n+1} = 1 + 1/x_n,    x_n > 0

Here the approach is different to that of the previous cases. If the limit of {x_n} is a, then the limit of {x_{n+1}} is also a. Thus by property (a),

lim_{n→∞} x_{n+1} = lim_{n→∞} (1 + 1/x_n) = 1 + 1/lim_{n→∞} x_n.

But this is the same as saying that

a = 1 + 1/a,

whence a^2 − a − 1 = 0. Therefore a = (1/2)(1 ± √5). But since all the x_n's are positive, they can't converge to a negative number like (1/2)(1 − √5). Thus the only possible value for the limit is a = (1/2)(1 + √5). Therefore

lim_{n→∞} x_n = a = (1/2)(1 + √5) = 1,6180339887 . . . .
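For part (c), one can also iterate the recursion numerically and watch the terms settle at the golden ratio; a minimal Python sketch (our own check, not part of the original text):

    # Iterate x_{n+1} = 1 + 1/x_n starting from x_1 = 1.
    x = 1.0
    for _ in range(30):
        x = 1.0 + 1.0 / x
    print(x)                   # 1.618033988749895
    print((1 + 5 ** 0.5) / 2)  # (1 + sqrt(5))/2, the limit found above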



Determining the limit of a sequence is not always sufficient. One should also be able to determine the degree of accuracy of a given approximation. To get some idea of how to do this, let us go back to the example of finding √3.

The question may be asked: how many iterations of the algorithm

x_{n+1} = (1/2)(x_n + 3/x_n),    n ≥ 0

must be performed (that is, how many terms of the sequence must be calculated) to find a term which approximates √3 correctly to, say, four decimal places? (Recall that for an approximation to be correct to n decimal places, the error must be less than (1/2) × 10^{−n}. So for accuracy up to the fourth place we want the error to be less than (1/2) × 10^{−4} = 0,00005.) Let ε_0 be the error in our first estimate x_0 = 1. This is the difference between x_0 and the actual value of √3. Thus

x_0 = √3 + ε_0.

If for each n we let ε_n be the error in the nth estimate, then of course

x_n = √3 + ε_n
x_{n+1} = √3 + ε_{n+1}.
Substituting into

x_{n+1} = (1/2)(x_n + 3/x_n) = (x_n^2 + 3)/(2x_n)

we obtain

√3 + ε_{n+1} = (3 + 2√3 ε_n + ε_n^2 + 3)/(2(√3 + ε_n)).

Therefore

ε_{n+1} = (6 + 2√3 ε_n + ε_n^2 − 2√3(√3 + ε_n))/(2(√3 + ε_n)) = ε_n^2/(2x_n).
Notice that for any x > 0 we have that x^2 + 3 − 2x = (x − 1)^2 + 2 > 0. But x^2 + 3 − 2x > 0 can be rewritten as (1/2)(x + 3/x) > 1, and hence (1/2)(x + 3/x) > 1 for any x > 0. Since x_n > 0 for each n ∈ N, this means that 1 is less than x_n for all values of n ≥ 1. It therefore follows that

ε_{n+1} = ε_n^2/(2x_n) < (1/2)ε_n^2.

Since x_0 = 1, we may conclude from x_0 = √3 + ε_0 that ε_0 = 1 − √3. Thus

|ε_0| = |1 − √3| < 1.

[Note: ε_0 may be negative, but ε_n > 0 for all n ≥ 1. To see this, note that since x_{n−1} > 0, we have

ε_n = x_n − √3
    = (1/2)(x_{n−1} + 3/x_{n−1}) − √3
    = (x_{n−1}^2 − 2√3 x_{n−1} + 3)/(2x_{n−1})
    = (x_{n−1} − √3)^2/(2x_{n−1})
    ≥ 0

for any n ≥ 1.]
Using the inequality ε_{n+1} < (1/2)ε_n^2 we see that

ε_1 < (1/2)ε_0^2 < 1/2 = 2^{1−2}
ε_2 < (1/2)ε_1^2 < (1/2)(2^{1−2})^2 = 2^{1−2^2}.

Continuing inductively we see that if for some n we have ε_n < 2^{1−2^n}, then also ε_{n+1} < (1/2)ε_n^2 < (1/2)(2^{1−2^n})^2 = 2^{1−2^{n+1}}. Thus clearly

ε_n < 2^{1−2^n} for all n ∈ N.
For the answer to be correct to p decimal places whenever n ≥ N, we require an N such that ε_n < (1/2) · 10^{−p} for all n ≥ N. By the previous centred inequality this will happen if we choose N so that

2^{1−2^N} < (1/2) · 10^{−p},

or equivalently 2^{2^N − 2} > 10^p. Taking logarithms, we see that this inequality can be written as (2^N − 2) log_{10}(2) > p; i.e. 2^N > p/log_{10}(2) + 2. Again taking logarithms, it follows that we need N log_{10}(2) = log_{10}(2^N) > log_{10}[p/log_{10}(2) + 2], or equivalently

N > log_{10}[p/log_{10}(2) + 2] / log_{10}(2).

Now we may finally answer the question: which is the first term to be accurate to at least four decimal places? The error may of course not exceed 0,00005 (that is, (1/2) · 10^{−4}). So let p = 4 in

N > log_{10}[p/log_{10}(2) + 2] / log_{10}(2).
The right-hand side is approximately 3,93, so taking N = 4 will do the trick. Hence at most four terms need to be calculated. Let us compare the above estimate to the actual errors, using √3 = 1,732050807568877 . . . :

x_0 = 1
x_1 = 2
x_2 = 1,75
x_3 = 1,732142857 . . .
x_4 = 1,732050810 . . .

Rounding to the nearest ninth decimal, the corresponding errors for these terms are

ε_0 = −0,732050808
ε_1 = 0,267949192
ε_2 = 0,017949192
ε_3 = 0,000092049
ε_4 = 0,000000002

We see that, as required, the error in the fourth term is less than 0,00005.
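The bound ε_n < 2^{1−2^n} can also be verified directly; the following short Python sketch (our own addition, not in the original notes) computes both the actual error and the bound for the first few n.

    import math

    # Compare the actual error e_n = x_n - sqrt(3) with the bound 2^(1 - 2^n).
    x, root = 1.0, math.sqrt(3)
    for n in range(1, 6):
        x = 0.5 * (x + 3.0 / x)        # one iteration step
        bound = 2.0 ** (1 - 2 ** n)
        print(n, x - root, bound, x - root < bound)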
1.1.5.2. Some special sequences.
(a) Triangular numbers (T_n)
These are the numbers of dots needed to form an equilateral triangle by arranging dots in a triangular pattern, T_n being the total number of dots in a triangle with n rows.

Hence {T_n} = 1, 3, 6, 10, 15, . . .


(b) An arithmetic progression (AP)
This is a sequence with the property that

a_{n+1} − a_n = d,

where d is called the constant difference. If a is the first term of an AP {a_n}_{1}^{∞}, then

{a_n} = {a, a + d, a + 2d, a + 3d, . . . , a + (n − 1)d, . . .}.

Note that {a_n} is divergent whenever d ≠ 0, since

lim_{n→∞} [a + (n − 1)d] = ∞ if d > 0, and −∞ if d < 0.
(c) A geometric progression (GP)
This is a sequence with the property that

a_{n+1}/a_n = r;    r ≠ 1 (for r = 1 it is also an AP).

r is called the constant ratio. If a is the first term of the GP, then

{a_n} = {a, ar, ar^2, ar^3, . . . , ar^{n−1}, . . .}
(d) A harmonic progression (HP)
This is a sequence having the property that, for any three consecutive terms a, b and c in the sequence, the following applies:

a/c = (a − b)/(b − c)

The most common HP is

1, 1/2, 1/3, 1/4, . . .
Example 1.1.2. Determine u5 and u10 of an AP with u1 = 4 and
u2 = 7.
Solution: The first term is a = u1 = 4, and the common difference
d = un+1 − un must be d = u2 − u1 = 7 − 4 = 3. Therefore the formula
for the nth term un = a + (n − 1)d, becomes un = 4 + (n − 1)3. Therefore
u5 = 4 + (4)3 = 16 and u10 = 4 + (9)3 = 31.


Example 1.1.3. Prove that the reciprocals of the terms of a harmonic progression form an arithmetic progression.
Solution:
If a, b and c are any three consecutive terms of an HP, then

a/c = (a − b)/(b − c)

We must prove that

1/a, 1/b, 1/c, . . .

form an AP. That is, prove that

1/b − 1/a = 1/c − 1/b

From

a/c = (a − b)/(b − c)

it follows that

ac − bc = ab − ac

Dividing throughout by abc now yields

1/b − 1/a = 1/c − 1/b,

which was what needed to be proved. Hence

1/a, 1/b, 1/c, . . .

forms an AP. 

EXERCISE 1.1.
1. Prove the validity of Newton's method for finding the square root of a number by means of the recurrence relation

x_{n+1} = (1/2)(x_n + a/x_n),    n ≥ 0

[i.e. show that for any x_0 > 0, we have

lim_{n→∞} x_n = √a

for the sequence {x_n} defined as above.]

2. Using x_1 = 1 in example 1.1.1(c), determine the first 15 terms of the sequence to indicate that, as n increases, x_n approaches

(1/2)(1 + √5) = 1,6180339887 . . .

3. Find the intermediate terms u_2, u_3, . . . , u_7 of a harmonic progression for which

u_1 = 2/3 and u_8 = 2/17.

[HINT: Use the result of example 1.1.3.]
4. A beaker of capacity V is rinsed n times with distilled water. What percentage of the original solution remains in the beaker, if after each rinse a volume v of the solution remains behind?
[Ans: (v/V)^n × 100]

5. Use Newton's recurrence relation

x_{n+1} = (1/2)(x_n + a/x_n)

to approximate the square root of 110 (x_0 = 10). State how many terms need to be computed to guarantee an answer correct to:
(a) 1 dp,
(b) 3 dp,
(c) 4 dp,
(d) 5 dp,
(e) 11 dp,
(f) 12 dp, and
(g) 100 dp.
[Ans: At most 1, 2, 2, 3, 3, 4, 6.]
6. Show that

1/(1 + √x), 1/(1 − x), 1/(1 − √x)

are in arithmetic progression, and find its nth term if these three terms are the first three terms of the AP.
[Ans: u_n = (1 + (n − 2)√x)/(1 − x)]

7. Two capacitors A and B are connected in such a manner that every time a switch is closed momentarily, 15% of the charge on capacitor A will go to capacitor B and 35% of the charge on capacitor B will go to capacitor A. If A originally had 20% of the total charge Q, show that the percentage Q_n of the charge on A satisfies the recurrence relation

Q_{n+1} = 35 + (1/2)Q_n

If the switch is repeatedly closed momentarily, the percentage of the charge Q on A will approach an equilibrium value. What will this value be?
[Ans: 70%]
8. The voltage V_k at the kth pin of the insulator chain shown in figure 1.1 satisfies the recurrence relation

V_{k+2} − (2 + C_1/C_2)V_{k+1} + V_k = 0

with

V_0 = 0

and

V_n = v (the amplitude applied at the head of the chain).

Show that V_k = A_k · V_1, where A_k satisfies the recurrence relation

A_{k+2} − (2 + C_1/C_2)A_{k+1} + A_k = 0

with

A_0 = 0 and A_1 = 1.

Hence deduce the value of V_1 and tabulate V_k for k = 0, 1, 2, . . . , 10, if the capacitance ratio C_1/C_2 = 0,1 and v = 1000.

k:   0 | 1    | 2    | 3    | 4     | 5     | 6     | 7     | 8     | 9     | 10
V_k: 0 | 27,5 | 57,8 | 93,8 | 139,2 | 198,6 | 277,8 | 384,7 | 530,2 | 728,7 | 1000

Figure 1.1

UNIT 2: SERIES AND THE CONVERGENCE THEREOF


1.2.1. OBJECTIVE. Revision of work done on series and the introduction
of convergence tests, as well as the formal proof of some of these tests.
1.2.2. OUTCOMES. At the end of this unit the student should
• Understand what is meant by the phrase convergence of a series;
• Know what is meant by the term geometric series, be able to determine whether or not a given geometric series converges, and also be able to compute the sum of those that do converge;
• Understand and be able to apply the various convergence tests;
• If convergence of a series can be proved by one of the ratio test, the
integral test or the alternating series test, the student should know how
to estimate the maximum error when such a series is estimated by the
sum of the first n terms.
1.2.3. INTRODUCTION. As mentioned in unit 1, a series is formally considered to be the sum of the terms of a given sequence. Given a sequence {u_i}_{i=1}^{i=∞}, we give meaning to this concept by defining the associated series as the sequence S_1, S_2, S_3, . . . of partial sums, where for each n ∈ N, S_n = ∑_{i=1}^{n} u_i = u_1 + u_2 + u_3 + . . . + u_n. If the partial sums S_n approach a limit, say S, when n → ∞, we call the limit lim_{n→∞} S_n = S the sum of the series and write S = ∑_{i=1}^{∞} u_i = u_1 + u_2 + u_3 + . . . . In such a case we say that the series converges. Having introduced the concept of convergence, we would like to develop reliable tests with which to identify those series that actually do converge. More generally, given a fixed sequence {a_i}_{i=0}^{i=∞} of coefficients and a fixed real number b, we will ultimately be interested in finding a way to compute those values of x for which the series

a_0 + a_1(x − b) + a_2(x − b)^2 + a_3(x − b)^3 + . . .

converges. Series of the above type (where a variable is involved) are called power series, and will be dealt with more fully in unit 3.

1.2.4. SOME SPECIAL SERIES.

1.2.4.1. The arithmetic series.

Theorem 1.2.1. The sum of the first n terms of an arithmetic progression is given by

S_n = a + (a + d) + (a + 2d) + . . . + [a + (n − 1)d]
    = ∑_{i=1}^{n} [a + (i − 1)d]
    = (n/2)[2a + (n − 1)d]
    = (n/2)[first term + last term]

Proof:

S_n = a + (a + d) + (a + 2d) + . . . + [a + (n − 1)d]

Writing S_n in reverse order, we get

S_n = [a + (n − 1)d] + [a + (n − 2)d] + [a + (n − 3)d] + . . . + a.

Adding the two sums term for term yields

2S_n = [2a + (n − 1)d] + [2a + (n − 1)d] + . . . + [2a + (n − 1)d]
     = n[2a + (n − 1)d].

(Here we used the fact that after adding, there are n terms of the form 2a + (n − 1)d.)


1.2.4.2. The geometric series.

Theorem 1.2.2. The sum of the first n terms of a geometric progression is given by

S_n = a + ar + ar^2 + ar^3 + . . . + ar^{n−1}
    = ∑_{i=1}^{n} ar^{i−1}
    = a(r^n − 1)/(r − 1)    (r ≠ 1).

Proof: Notice that

S_n = a + ar + ar^2 + ar^3 + . . . + ar^{n−1}
rS_n = ar + ar^2 + ar^3 + . . . + ar^{n−1} + ar^n.

Thus

(1 − r)S_n = S_n − rS_n = a − ar^n = a(1 − r^n).

Solving for S_n yields

S_n = a(1 − r^n)/(1 − r).
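The closed formula is easily checked against a term-by-term sum; a small Python sketch (our own illustration, with arbitrarily chosen a, r and n):

    # Compare S_n = a(1 - r^n)/(1 - r) with a direct term-by-term sum (r != 1).
    def geometric_sum(a, r, n):
        return a * (1 - r ** n) / (1 - r)

    a, r, n = 2.0, 0.5, 10
    direct = sum(a * r ** i for i in range(n))  # a + ar + ... + ar^(n-1)
    print(direct, geometric_sum(a, r, n))       # both print 3.99609375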

1.2.4.3. The p-series.

1/1^p + 1/2^p + 1/3^p + 1/4^p + . . . + 1/n^p + . . .    where p ∈ R.

1.2.4.4. A simple harmonic series.

1 + 1/2 + 1/3 + 1/4 + . . . + 1/n + . . .
Example 1.2.3. Obtain a simple formula for

S_n = ∑_{k=1}^{n} k^2 = 1^2 + 2^2 + 3^2 + . . . + n^2.

Solution: Using

(k + 1)^3 = k^3 + 3k^2 + 3k + 1

we may write

∑_{k=1}^{n} [(k + 1)^3 − k^3] = ∑_{k=1}^{n} [3k^2 + 3k + 1].

By cancelling intermediate terms, the left-hand side becomes

∑_{k=1}^{n} [(k + 1)^3 − k^3] = [2^3 − 1^3] + [3^3 − 2^3] + [4^3 − 3^3] + . . . + [n^3 − (n − 1)^3] + [(n + 1)^3 − n^3]
                             = (n + 1)^3 − 1

Now notice that if a = d = 1, it follows from the formula for the sum of a finite arithmetic series that ∑_{k=1}^{n} k = 1 + 2 + 3 + · · · + n = (n/2)(n + 1). Thus the right-hand side of the second centred equation becomes

∑_{k=1}^{n} [3k^2 + 3k + 1] = 3∑_{k=1}^{n} k^2 + 3∑_{k=1}^{n} k + ∑_{k=1}^{n} 1 = 3∑_{k=1}^{n} k^2 + 3(n/2)(n + 1) + n

Therefore

(n + 1)^3 − 1 = 3∑_{k=1}^{n} k^2 + 3(n/2)(n + 1) + n.

Solving for ∑_{k=1}^{n} k^2, we get

∑_{k=1}^{n} k^2 = (1/3)[(n + 1)^3 − (3/2)n(n + 1) − (n + 1)]
               = ((n + 1)/3)[(n + 1)^2 − (3/2)n − 1]
               = ((n + 1)/3) · (2n^2 + n)/2
               = (n/6)(n + 1)(2n + 1)
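A quick computational check of the closed form (our own illustration, not in the original):

    # Check sum_{k=1}^{n} k^2 = n(n+1)(2n+1)/6 for a few values of n.
    for n in (1, 10, 100):
        direct = sum(k * k for k in range(1, n + 1))
        closed = n * (n + 1) * (2 * n + 1) // 6
        print(n, direct, closed, direct == closed)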

Example 1.2.4. Obtain the sum of the series

1/((2)(4)) + 1/((4)(6)) + 1/((6)(8)) + . . . + 1/((2n)(2(n + 1))) = ∑_{k=1}^{n} 1/(4k(k + 1)) = (1/4)∑_{k=1}^{n} 1/(k(k + 1))

Solution: Resolve

1/(k(k + 1))

into partial fractions to get

1/(k(k + 1)) = 1/k − 1/(k + 1).

Then

S_n = (1/4)[∑_{k=1}^{n} 1/k − ∑_{k=1}^{n} 1/(k + 1)]
    = (1/4)[(1 + 1/2 + 1/3 + 1/4 + . . . + 1/n) − (1/2 + 1/3 + 1/4 + . . . + 1/(n + 1))]
    = (1/4)(1 − 1/(n + 1))
    = (1/4) · n/(n + 1)
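Exact arithmetic makes the telescoping visible numerically; a small Python sketch (our own check) using the standard fractions module:

    from fractions import Fraction

    # Partial sums of sum 1/(4k(k+1)) should equal n/(4(n+1)) exactly.
    def partial_sum(n):
        return sum(Fraction(1, 4 * k * (k + 1)) for k in range(1, n + 1))

    for n in (1, 5, 50):
        print(n, partial_sum(n), Fraction(n, 4 * (n + 1)))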

1.2.5. CONVERGENCE OF INFINITE SERIES. In example 1.2.4 in the previous subsection, we showed that the nth partial sum of the series

∑_{k=1}^{∞} 1/(4k(k + 1))

is exactly

S_n = ∑_{k=1}^{n} 1/(4k(k + 1)) = (1/4) · n/(n + 1).

If we now take the limit as n → ∞, it is clear that the series must converge to

∑_{k=1}^{∞} 1/(4k(k + 1)) = lim_{n→∞} ∑_{k=1}^{n} 1/(4k(k + 1)) = lim_{n→∞} (1/4) · n/(n + 1) = 1/4.

For geometric series we also had a very elegant formula for the nth partial sum, which can be used in a very similar way to test for convergence. Specifically, given the series

a + ar + ar^2 + · · · + ar^{n−1} + . . . ,

we saw that the sum of the first n terms a + ar + ar^2 + · · · + ar^{n−1} is exactly

S_n = a(1 − r^n)/(1 − r)    (r ≠ 1).

Now {r^n} converges to 0 if |r| < 1, and is unbounded if |r| > 1. If we apply this fact to the formula for the partial sum, it is clear that lim_{n→∞} S_n = a/(1 − r) if |r| < 1, and that the S_n's are unbounded if |r| > 1. Hence the geometric series converges for |r| < 1, and diverges for |r| > 1. If r = 1 the series is a + a + a + . . . , and if r = −1 we get a − a + a − a + a − . . . . Both of these also diverge (assuming a ≠ 0).

Now in both the above cases we had access to a very elegant formula for S_n, which we used to test the series for convergence or divergence. However, it is not always possible to find such a formula. Because of this, a number of convergence tests have been developed that rely only on information about the general term of the series, rather than on the existence of some clever formula for the partial sums. We will present and investigate such tests in the subsections that follow. However, before doing this we take some time out to make a very general observation about the behaviour of the general term of a convergent series:

Proposition 1.2.5. If the series

u_1 + u_2 + u_3 + . . . + u_n + . . .

converges, then we will have that

lim_{n→∞} u_n = 0.

NOTE: The converse of the above statement is not true! On its own, the condition lim_{n→∞} u_n = 0 is not enough to force convergence. In the next section (on the integral test) we will prove that even though lim_{n→∞} 1/n = 0, the associated series

1 + 1/2 + 1/3 + 1/4 + · · · + 1/n + . . .

actually diverges!

So what is the above fact good for then? The one very useful conclusion we can make from the above fact is that if lim_{n→∞} u_n ≠ 0, the series u_1 + u_2 + u_3 + . . . + u_n + . . . must diverge! So if lim_{n→∞} u_n ≠ 0, we immediately know that the series must diverge, and we are done. If however lim_{n→∞} u_n = 0, we have a bit of work to do before we can be sure whether the series diverges or converges.
Proof: If the series is convergent, then by definition

lim_{n→∞} S_n = S.

But then also

lim_{n→∞} S_{n−1} = S.

However,

u_n = S_n − S_{n−1},

and thus, necessarily,

lim_{n→∞} u_n = lim_{n→∞} (S_n − S_{n−1})
             = lim_{n→∞} S_n − lim_{n→∞} S_{n−1}
             = S − S
             = 0

1.2.5.1. The integral test.

Theorem 1.2.6 (The integral test). If f(x) is a non-negative monotone decreasing function on [1, ∞) which takes on the values

f(n) = u_n for n = 1, 2, 3, . . .

then the series of positive terms

u_1 + u_2 + u_3 + . . . + u_n + . . .

converges if the improper integral ∫_1^∞ f(x) dx exists, and diverges if the improper integral fails to exist.

Note: To be able to use the integral test, we must be sure that f decreases on [1, ∞)! If this fact is not immediately clear, then one way in which to check this is to see if df/dx ≤ 0 on [1, ∞).
Proof: Consider the graph of a typical positive monotone decreasing function f(x), presented in figure 1.2 below.

Figure 1.2

It is evident that the area under the curve y = f(x) between x = 1 and x = n is larger than the sum of the smaller rectangular areas, and smaller than the sum of the larger rectangular areas standing over each unit interval on the x-axis. Thus

u_2 + u_3 + u_4 + . . . + u_n < ∫_1^n f(x) dx < u_1 + u_2 + u_3 + . . . + u_{n−1}

Suppose that

∫_1^∞ f(x) dx = lim_{n→∞} ∫_1^n f(x) dx

exists. Then

u_2 + u_3 + u_4 + . . . + u_n

is bounded above by the real number ∫_1^∞ f(x) dx. Hence it is clear that for EACH n,

S_n = u_1 + (u_2 + u_3 + u_4 + . . . + u_n)

is bounded above by

u_1 + ∫_1^∞ f(x) dx.

Since the S_n's have an upper bound, they must then surely have a smallest upper bound. Call this smallest upper bound L. Next notice that every u_k is positive. This means that as we keep on adding terms, S_n increases monotonically as n increases. The value to which the S_n's increase must surely be the smallest upper bound L. Therefore the series u_1 + u_2 + u_3 + u_4 + · · · + u_n + . . . converges to this L.

On the other hand, if

∫_1^∞ f(x) dx = lim_{n→∞} ∫_1^n f(x) dx

does not exist, that is if ∫_1^n f(x) dx → ∞ as n → ∞, then since

∫_1^n f(x) dx ≤ u_1 + u_2 + u_3 + u_4 + . . . + u_{n−1} = S_{n−1},

we must surely also have S_{n−1} → ∞ as n → ∞ (equivalently S_n → ∞ as n → ∞). Thus the series is divergent. 

Example 1.2.7. Test the p-series

1/1^p + 1/2^p + 1/3^p + 1/4^p + . . .

for convergence.

Solution: For p ≤ 0 we have (1/n)^p ≥ (1/n)^0 = 1. Clearly 1/n^p will then NOT decrease to 0 as n increases, so for these values of p we already know from proposition 1.2.5 that the p-series diverges. Hence let p > 0. For p > 0, x ↦ 1/x^p = x^{−p} is a decreasing function on (0, ∞), and hence here we can use the integral test. (To see that x^{−p} is decreasing, observe that (d/dx)(x^{−p}) = −p x^{−p−1} < 0 for all x > 0.)

We consider the cases p ≠ 1 and p = 1 separately. First let p ≠ 1. In this case

∫_1^∞ (1/x^p) dx = lim_{n→∞} ∫_1^n x^{−p} dx
                = lim_{n→∞} [x^{1−p}/(1 − p)]_1^n
                = (1/(1 − p)) lim_{n→∞} [n^{1−p} − 1]
                = (1/(1 − p)) lim_{n→∞} [1/n^{p−1} − 1]
                = 1/(p − 1) for p > 1, and ∞ for p < 1.

Hence the p-series converges for p > 1 and diverges for p < 1.

If p = 1, we have

∫_1^∞ f(x) dx = lim_{n→∞} ∫_1^n (1/x) dx
             = lim_{n→∞} [ln x]_1^n
             = lim_{n→∞} [ln n − ln 1]
             = lim_{n→∞} ln n
             = ∞

Hence the p-series diverges for p = 1.

To sum up, we have proved that the p-series converges for p > 1, and diverges for p ≤ 1. 
Example 1.2.8. Test the harmonic series

1 + 1/2 + 1/3 + . . . + 1/n + . . .

for convergence.

Solution: This is a p-series with p = 1, and hence it is divergent. Observe that even though the series diverges, we nevertheless have that u_n = 1/n → 0 as n → ∞. 
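Numerically the divergence of the harmonic series is easy to miss, since the partial sums grow very slowly, roughly like ln n. A brief Python sketch (our own illustration, not part of the original notes):

    import math

    # Partial sums of the harmonic series grow without bound, roughly like ln(n).
    for n in (10, 1000, 100000):
        s = sum(1.0 / k for k in range(1, n + 1))
        print(n, s, math.log(n))  # s - ln(n) tends to 0.5772... (Euler's constant)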

1.2.5.2. The comparison test.

Theorem 1.2.9 (Comparison test). Suppose we are given two series, u_1 + u_2 + u_3 + . . . and a_1 + a_2 + a_3 + . . . , with 0 ≤ u_n ≤ a_n for each n ∈ N. Then the following holds:
• If a_1 + a_2 + a_3 + . . . converges to say A, then u_1 + u_2 + u_3 + . . . will converge to a sum which is less than or equal to A.
• If u_1 + u_2 + u_3 + . . . diverges, then a_1 + a_2 + a_3 + . . . will also diverge.

Proof: Let A_n denote the nth partial sum of {a_n}. Then

A_n = a_1 + a_2 + a_3 + . . . + a_n ≤ A for all n.

Since u_i ≤ a_i for each i, it follows that

S_n = u_1 + u_2 + u_3 + . . . + u_n ≤ A_n ≤ A.

Since every u_i is non-negative, the sequence {S_n} is monotonically increasing and also bounded from above by A. Thus it must increase to some limit which is less than or equal to A.

To see the second claim, suppose that u_1 + u_2 + u_3 + . . . diverges. Now since the terms of the series u_1 + u_2 + u_3 + . . . are all non-negative, the only way the sequence of partial sums S_n = u_1 + u_2 + u_3 + . . . + u_n can diverge is if they increase to ∞. But since

S_n = u_1 + u_2 + u_3 + . . . + u_n ≤ a_1 + a_2 + a_3 + . . . + a_n = A_n,

the partial sums A_n = a_1 + a_2 + a_3 + . . . + a_n must then also increase to ∞. Thus the series a_1 + a_2 + a_3 + . . . is then also divergent. 

Of course, to use the comparison test when testing for convergence or divergence, we need to have a bank of series known to either converge or diverge with which to compare the series we wish to test. It is here that series like geometric series and the p-series (whose behaviour is known) prove to be very useful.

Example 1.2.10. Prove that the series

∑_{n=1}^{∞} 1/√(n^3 + n)

is convergent.

Solution: We need a series for comparison. One way to get such a series is to simplify the denominator of the general term. Since n ≥ 1, it follows that √(n^3 + n) > √(n^3), and hence that

1/√(n^3 + n) < 1/√(n^3) for each n ≥ 1.

But ∑_{n=1}^{∞} 1/√(n^3) is the p-series with

p = 3/2 > 1.

Thus ∑_{n=1}^{∞} 1/√(n^3) converges. Hence

∑_{n=1}^{∞} 1/√(n^3 + n)

must also converge. 

Example 1.2.11. Prove that the series

∑_{n=1}^{∞} (2n − 1)/(n^2 + n)

is divergent.

Solution: We need a series for comparison. If in both the numerator and denominator of the general term we keep only the dominant terms and discard the rest, we end up with 2n/n^2 = 2/n. Since ∑_{n=1}^{∞} 2/n = 2∑_{n=1}^{∞} 1/n diverges (p-series, p = 1), we suspect that the given series will also diverge. Now when trying to prove divergence by means of the comparison test, we need a divergent series of positive terms that is term for term less than the given series. One way to get such a series is to simplify the given general term by decreasing the numerator to n and increasing the denominator to 2n^2 (note that 2n^2 = n^2 + n^2 ≥ n^2 + n and that 2n − 1 ≥ 2n − n = n). Then

(2n − 1)/(n^2 + n) ≥ n/(2n^2) = 1/(2n).

But ∑_{n=1}^{∞} 1/(2n) = (1/2)∑_{n=1}^{∞} 1/n is a multiple of the p-series with p = 1 and thus divergent. Hence

∑_{n=1}^{∞} (2n − 1)/(n^2 + n)

also diverges. 

1.2.5.3. Absolute convergence forces convergence.

Theorem 1.2.12. Consider the series S = u_1 + u_2 + u_3 + . . . . If |u_1| + |u_2| + |u_3| + . . . converges, then so does u_1 + u_2 + u_3 + . . . .

Proof: Suppose that the series

|u_1| + |u_2| + |u_3| + . . .

converges to some limit A. (Then 2|u_1| + 2|u_2| + 2|u_3| + . . . will of course converge to 2A.) For any n we of course have that −|u_n| ≤ u_n ≤ |u_n|, and hence that 0 ≤ u_n + |u_n| ≤ 2|u_n|. Thus by the comparison test, (u_1 + |u_1|) + (u_2 + |u_2|) + (u_3 + |u_3|) + . . . will also converge, say to B. Now write S_n = u_1 + u_2 + u_3 + . . . + u_n, A_n = |u_1| + |u_2| + |u_3| + . . . + |u_n| and B_n = (u_1 + |u_1|) + (u_2 + |u_2|) + (u_3 + |u_3|) + . . . + (u_n + |u_n|) for the nth partial sums of these three series. Rearranging the terms of B_n, we get

B_n = (u_1 + u_2 + u_3 + . . . + u_n) + (|u_1| + |u_2| + |u_3| + . . . + |u_n|) = S_n + A_n.

Thus S_n = B_n − A_n, and hence u_1 + u_2 + u_3 + . . . must converge, since

lim_{n→∞} S_n = lim_{n→∞} [B_n − A_n] = B − A

exists. 

Example 1.2.13. Prove that the series

−1 + 1/3 − 1/9 + 1/27 + 1/81 − 1/243 + . . .

converges.

Solution: The corresponding series of positive terms is

1 + 1/3 + 1/9 + 1/27 + 1/81 + 1/243 + . . . .

The terms u_1 = 1, u_2 = 1/3, u_3 = 1/9, . . . are of the form u_n = 1/3^{n−1}, which is a geometric series with |r| = 1/3 < 1, and thus convergent. Hence

−1 + 1/3 − 1/9 + 1/27 + 1/81 − 1/243 + . . .

also converges. 

1.2.5.4. Conditions for convergence of alternating series.

Theorem 1.2.14 (Alternating Series Test). Suppose we are given a series u_1 − u_2 + u_3 − u_4 + . . . with u_n ≥ 0 for each n. If
• u_{n+1} ≤ u_n for each n ≥ 1, and
• u_n → 0 as n → ∞,
then the series will converge.

Proof: Suppose we are given a series

u_1 − u_2 + u_3 − u_4 + . . .

with u_n ≥ 0 for each n, satisfying
• u_{n+1} ≤ u_n for each n ≥ 1, and
• u_n → 0 as n → ∞.

The partial sums are of the form

S_n = u_1 − u_2 + u_3 − u_4 + . . . + u_{2k−1} − u_{2k} + u_{2k+1} − . . . ± u_n

[−u_n or +u_n depending on whether n is even or odd].

First suppose that n is even, that is, n = 2k for some k ∈ N. Then

S_{2k} = u_1 − u_2 + u_3 − u_4 + . . . + u_{2k−1} − u_{2k}
       = (u_1 − u_2) + (u_3 − u_4) + . . . + (u_{2k−1} − u_{2k})

Since u_{2m−1} − u_{2m} ≥ 0 for each m, the partial sums of the form S_{2k} are monotonically increasing. S_{2k} may also be written as

S_{2k} = u_1 − (u_2 − u_3) − (u_4 − u_5) − . . . − (u_{2k−2} − u_{2k−1}) − u_{2k}.

Since u_{2m} − u_{2m+1} ≥ 0 for each m, it is clear from the above that S_{2k} ≤ u_1. Therefore the S_{2k}'s must increase to some limit S less than or equal to u_1. Thus the S_n's converge for n even.

Next suppose that n is odd, that is, suppose that n is of the form n = 2k + 1 for some k ≥ 0. Since

S_{2k+1} = u_1 − u_2 + u_3 − u_4 + . . . + u_{2k−1} − u_{2k} + u_{2k+1} = S_{2k} + u_{2k+1},

we therefore have that

lim_{k→∞} S_{2k+1} = lim_{k→∞} (S_{2k} + u_{2k+1}) = S + 0 = S.

Thus for n odd, the S_n's converge to the same limit S! But then lim_{n→∞} S_n = S (with no restriction on n), which means that the series is convergent. 
Example 1.2.15. Show that the alternating harmonic series

1 − 1/2 + 1/3 − 1/4 + . . .

is convergent.

Solution: Each term 1/n is numerically less than the preceding term, and the nth term 1/n approaches zero as n approaches infinity. Hence the series converges by the alternating series test.

Recall that we already know that the harmonic series

1 + 1/2 + 1/3 + 1/4 + . . .

diverges. So the series 1 − 1/2 + 1/3 − 1/4 + . . . is an example of a series which converges, but not absolutely. That is, it is a so-called conditionally convergent series. 

1.2.5.5. D'Alembert's ratio test.

Theorem 1.2.16 (D'Alembert's ratio test). The infinite series u_1 + u_2 + . . .
• is absolutely convergent if lim_{n→∞} |u_{n+1}|/|u_n| < 1 (Case 1),
• and divergent if lim_{n→∞} |u_{n+1}|/|u_n| > 1 (Case 2).
• If lim_{n→∞} |u_{n+1}|/|u_n| = 1 (Case 3), the test gives no information about either divergence or convergence.

Proof – Case 1: Let

lim_{n→∞} |u_{n+1}|/|u_n| = L < 1

and let L < r < 1. From the definition of a limit it follows that there exists a number N such that for all n ≥ N, the ratio |u_{n+1}|/|u_n| is closer to L than it is to r. Then |u_{n+1}|/|u_n| < r for all n ≥ N. That is,

|u_{n+1}| < r|u_n| for all n ≥ N.

But then |u_{N+2}| < r|u_{N+1}| < r^2|u_N|, |u_{N+3}| < r|u_{N+2}| < r(r^2|u_N|) = r^3|u_N|, etc. Continuing inductively we see that |u_{N+k}| < r^k|u_N| for all k ≥ 1. Thus

|u_1| + |u_2| + |u_3| + . . . + |u_N| + |u_{N+1}| + |u_{N+2}| + . . .
< |u_1| + |u_2| + |u_3| + . . . + |u_N| + r|u_N| + r^2|u_N| + . . .
= |u_1| + |u_2| + |u_3| + . . . + |u_{N−1}| + |u_N|(1 + r + r^2 + . . .)

The series 1 + r + r^2 + . . . is a geometric series, and since r < 1, it converges. The part of the series |u_1| + |u_2| + |u_3| + . . . + |u_{N−1}| consists of a fixed finite number of terms, and is thus a constant. Hence |u_1| + |u_2| + |u_3| + . . . + |u_N| + |u_{N+1}| + |u_{N+2}| + . . . converges by the comparison test, which means that u_1 + u_2 + u_3 + . . . is absolutely convergent in this case.
Proof – Case 2: The proof is similar to that of case 1. Here we let r be a number such that 1 < r < L. For some natural number N this leads to the ratio

|u_{n+1}|/|u_n| > r for all n ≥ N,

from which it follows that

|u_{N+k}| > r^k|u_N| for all k ≥ 1.

Since r > 1, we clearly have r^k|u_N| → ∞ as k → ∞. By the above inequality we then also have |u_{N+k}| → ∞ as k → ∞ (equivalently |u_n| → ∞ as n → ∞). Thus by proposition 1.2.5, the series u_1 + u_2 + u_3 + . . . diverges.

Proof – Case 3: Here it is sufficient to give examples of both a convergent and a divergent series for which

lim_{n→∞} |u_{n+1}|/|u_n| = 1.

To this end, recall that we know that

1 − 1/2 + 1/3 − 1/4 + . . .

converges and that

1 + 1/2 + 1/3 + 1/4 + . . .

diverges. Yet in each case

lim_{n→∞} |u_{n+1}|/|u_n| = 1. 

Example 1.2.17. Test the series

1/3 + 2^3/3^2 + 3^3/3^3 + 4^3/3^4 + 5^3/3^5 + . . .

for convergence.

Solution: The general term is of the form

u_n = n^3/3^n.

So

|u_{n+1}|/|u_n| = ((n + 1)^3/3^{n+1}) · (3^n/n^3) = (n^3 + 3n^2 + 3n + 1)/(3n^3).

Thus

lim_{n→∞} |u_{n+1}|/|u_n| = 1/3 < 1.

The series is therefore convergent. 
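The ratio test guarantees convergence but does not give the sum. Adding up terms numerically (a Python sketch of our own; the value the partial sums settle at is a numerical observation, not something derived in the text) shows how quickly the series converges:

    # Partial sums of sum n^3/3^n; convergence is guaranteed by the ratio test above.
    s = 0.0
    for n in range(1, 41):
        s += n ** 3 / 3 ** n
    print(s)  # the partial sums are seen to settle near 4.125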

Example 1.2.18. Test the series

1/((2)(4)) + 1/((3)(5)) + 1/((4)(6)) + 1/((5)(7)) + . . .

for convergence.

Solution: The general term of the series is of the form

u_n = 1/((n + 1)(n + 3)).

The ratio test fails, since here

lim_{n→∞} |u_{n+1}|/|u_n| = lim_{n→∞} (n + 1)(n + 3)/(((n + 1) + 1)((n + 1) + 3)) = lim_{n→∞} (n^2 + 4n + 3)/(n^2 + 6n + 8) = 1.

Thus in this case we must use one of the other tests.

The general term 1/((n + 1)(n + 3)) = 1/(n^2 + 4n + 3) of the series is less than the general term of the p-series with p = 2, namely 1/n^2. Since ∑_{n=1}^{∞} 1/n^2 converges, the given series also converges by the comparison test.

NOTE: The integral test could also have been used to prove convergence. Show this yourself. 

1.2.6. ACCURACY OF A CONVERGENT SERIES. In all of the series we have encountered thus far, we were able to determine whether they were divergent or convergent by using any of a number of tests. Although some of these had elegant formulas for the sum, most of these tests only provide a means for testing convergence, and provide no way of actually computing the limit (sum) of the series once convergence has been established. By way of example, notice that

1 + 1/3 + 1/9 + . . .

is a geometric series which, according to the theory for geometric series, converges to

1/(1 − 1/3) = 3/2.

But what does something like

1/3 + 2^3/3^2 + 3^3/3^3 + 4^3/3^4 + 5^3/3^5 + . . .

converge to? In principle one can try to approximate the sum of a convergent series by adding up the first few terms of the series. But for this idea to be of any practical use, we need some way of determining the accuracy of our approximation without actually knowing the limit.

If a series S = u_1 + u_2 + u_3 + . . . is estimated by its nth partial sum S_n = u_1 + u_2 + u_3 + . . . + u_n, the error S − S_n will be denoted by R_n = S − S_n. R_n is also sometimes referred to as the remainder after n terms.

1.2.6.1. Estimating an alternating series.

Theorem 1.2.19. Suppose we are given a series u_1 − u_2 + u_3 − u_4 + . . . with
• 0 ≤ u_{n+1} ≤ u_n for each n ≥ 1, and
• u_n → 0 as n → ∞.
By the alternating series test this series will converge. The error in estimating the limit S = u_1 − u_2 + u_3 − u_4 + . . . with the nth partial sum S_n = u_1 − u_2 + u_3 − u_4 + . . . + (−1)^{n−1}u_n is less than or equal to the absolute value of the (n + 1)th term. Specifically, |S − S_n| ≤ |u_{n+1}|.

Proof: For the series S = u_1 − u_2 + u_3 − u_4 + . . . the general term is of the form (−1)^{k−1}u_k. So if n is an even number, the (n + 1)th term will be of the form (−1)^n u_{n+1} = u_{n+1}. Therefore in this case

R_n = S − S_n
    = (u_1 − u_2 + u_3 − u_4 + . . . + u_{n−1} − u_n + u_{n+1} − . . .) − (u_1 − u_2 + u_3 − u_4 + . . . + u_{n−1} − u_n)
    = u_{n+1} − u_{n+2} + u_{n+3} − u_{n+4} + . . .

This may be written as either

R_n = (u_{n+1} − u_{n+2}) + (u_{n+3} − u_{n+4}) + . . .

or

R_n = u_{n+1} − (u_{n+2} − u_{n+3}) − (u_{n+4} − u_{n+5}) − . . .

Since u_{i+1} ≤ u_i for each i (that is, u_i − u_{i+1} ≥ 0), the first form shows that R_n ≥ 0, whereas the second form shows that R_n is less than or equal to u_{n+1}. Thus 0 ≤ R_n ≤ u_{n+1}.

If n is an odd number, we have (−1)^n u_{n+1} = −u_{n+1}, and hence S = u_1 − u_2 + u_3 − u_4 + . . . − u_{n−1} + u_n − u_{n+1} + . . . and S_n = u_1 − u_2 + u_3 − u_4 + . . . − u_{n−1} + u_n. So in this case

R_n = S − S_n = −u_{n+1} + u_{n+2} − u_{n+3} + u_{n+4} − . . .

This may be written as either

R_n = −(u_{n+1} − u_{n+2}) − (u_{n+3} − u_{n+4}) − . . .

or

R_n = −u_{n+1} + (u_{n+2} − u_{n+3}) + (u_{n+4} − u_{n+5}) + . . .

As before, since u_{i+1} ≤ u_i for each i, the first form shows that R_n ≤ 0, and the second form that R_n ≥ −u_{n+1}. Thus −u_{n+1} ≤ R_n ≤ 0. Hence whether n is even or odd, −u_{n+1} ≤ R_n ≤ u_{n+1} (or equivalently |R_n| ≤ u_{n+1}). 

1.2.6.2. Series conditioned to the ratio test.

Theorem 1.2.20. Let u_1 + u_2 + u_3 + . . . be a series whose convergence can be established by the ratio test (that is, lim_{n→∞} |u_{n+1}|/|u_n| < 1). Given N ∈ N and 0 < r < 1 so that

|u_{n+1}|/|u_n| ≤ r < 1 for all n ≥ N,

the remainder after n terms, R_n, will then satisfy the estimate

|R_n| ≤ r|u_n|/(1 − r) for all n ≥ N.

Proof: Given N ∈ N and 0 < r < 1 so that |u_{n+1}|/|u_n| ≤ r < 1 for all n ≥ N, we may argue exactly as in the proof of case 1 of theorem 1.2.16 (D'Alembert's ratio test) to see that then |u_{n+k}| ≤ r^k|u_n| whenever n ≥ N. Therefore, given n ≥ N we have

|R_n| = |u_{n+1} + u_{n+2} + . . . |
      ≤ |u_{n+1}| + |u_{n+2}| + . . .
      ≤ r|u_n| + r^2|u_n| + r^3|u_n| + . . .
      = r|u_n|[1 + r + r^2 + . . .]
      = r|u_n|/(1 − r)

as required. 

1.2.6.3. Series conditioned to the integral test. Consider the series

∑_{n=1}^{∞} (2n + 6)/(n + 2)^3 = 8/3^3 + 10/4^3 + 12/5^3 + . . .

For this series the ratio test does NOT prove convergence, since for u_n = (2n + 6)/(n + 2)^3 we have

|u_{n+1}|/|u_n| = ((2n + 8)/(n + 3)^3) · ((n + 2)^3/(2n + 6)) → 1 as n → ∞.

The algorithm described in the previous section is therefore not applicable. However, there is another estimate, based on the integral test, that we can use to estimate the remainder in this case. We briefly describe how it works:

Theorem 1.2.21. Suppose we have a series ∑_{n=1}^{∞} u_n whose convergence can be established by the integral test. Hence there exists a decreasing function f on [1, ∞) with f(x) ≥ 0 for all x, f(n) = u_n for all n ≥ 1, and ∫_1^∞ f(x) dx < ∞. Then

|S − S_n| = |R_n| ≤ ∫_n^∞ f(x) dx

Proof: Let m ∈ N be fixed, and let n > m. By a similar argument to that of the proof of theorem 1.2.6, we have

u_{m+1} + u_{m+2} + . . . + u_n ≤ ∫_m^n f(x) dx < u_m + u_{m+1} + . . . + u_n.

We already know that

lim_{n→∞} ∫_1^n f(x) dx = ∫_1^∞ f(x) dx

exists (by assumption), so

lim_{n→∞} ∫_m^n f(x) dx = ∫_m^∞ f(x) dx

must also exist. If therefore we let n → ∞, then from the first centred inequality in the proof we get

u_{m+1} + u_{m+2} + . . . = ∑_{n=m+1}^{∞} u_n ≤ ∫_m^∞ f(x) dx

as required. 

Coming back to the series ∑_{n=1}^{∞} (2n + 6)/(n + 2)^3 = 8/3^3 + 10/4^3 + 12/5^3 + . . . , we firstly note that the function f(x) = (2x + 6)/(x + 2)^3 is decreasing on [1, ∞), and that

∫_1^∞ (2x + 6)/(x + 2)^3 dx = ∫_1^∞ [2/(x + 2)^2 + 2/(x + 2)^3] dx
                           = lim_{b→∞} [−2/(x + 2) − 1/(x + 2)^2]_1^b
                           = lim_{b→∞} [2/3 + 1/9 − 2/(b + 2) − 1/(b + 2)^2]
                           = 7/9.

Hence for this series, convergence can indeed be proved by means of the integral test. If now we use S_n = u_1 + u_2 + . . . + u_n to approximate the sum of

∑_{n=1}^{∞} u_n = u_1 + u_2 + u_3 + . . . ,

then by the theorem we must have

|R_n| ≤ ∫_n^∞ f(x) dx.

If for example we use the first n = 5 terms to approximate the sum of the series, the error can then be estimated by

|R_5| ≤ ∫_5^∞ (2x + 6)/(x + 2)^3 dx
     = ∫_5^∞ [2/(x + 2)^2 + 2/(x + 2)^3] dx
     = lim_{b→∞} [−2/(x + 2) − 1/(x + 2)^2]_5^b
     = lim_{b→∞} [2/7 + 1/49 − 2/(b + 2) − 1/(b + 2)^2]
     = 2/7 + 1/49
     = 15/49
     = 0,3061 . . .
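The bound 15/49 can be compared with the actual remainder by brute force; a small Python sketch (our own check, truncating the infinite tail at a large but arbitrary cut-off):

    # Compare the true remainder after 5 terms with the bound 15/49 = 0.3061...
    def term(n):
        return (2 * n + 6) / (n + 2) ** 3

    tail = sum(term(n) for n in range(6, 200000))  # stand-in for the infinite tail
    print(tail, 15 / 49)                           # the remainder lies below the bound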
Example 1.2.22. Estimate the sum of the series

2^3/3 + 3^3/3^2 + 4^3/3^3 + 5^3/3^4 + 6^3/3^5 + . . .

accurately up to the third decimal place.

Note: An estimate is said to be accurate up to the nth decimal place if |R_n| < (1/2) · 10^{−n}. So in the above problem we need to have |R_n| < (1/2) · 10^{−3} = 1/2000.

Solution: The general term is of the form

u_n = (n + 1)^3/3^n.

Since

lim_{n→∞} u_{n+1}/u_n = lim_{n→∞} ((n + 2)/(n + 1))^3 · (1/3) = 1/3 < 1,

the ratio test proves convergence in this case. We may therefore use theorem 1.2.20. It is not too difficult to see that

u_{n+1}/u_n = ((n + 2)/(n + 1))^3 · (1/3)

decreases as n increases. In addition, the first value of n for which u_{n+1}/u_n < 1 is n = 2. (We have u_2/u_1 = (3/2)^3 · (1/3) > 1, and u_3/u_2 = (4/3)^3 · (1/3) = 64/81 < 1.) So in this case we have

u_{n+1}/u_n ≤ u_3/u_2 = (4/3)^3 · (1/3) = 64/81 < 1

for all n ≥ 2. We may therefore take N = 2 and r = 64/81, and conclude from theorem 1.2.20 that

|R_n| < r|u_n|/(1 − r) = (64/17)u_n = 64(n + 1)^3/(17 · 3^n)

for all n ≥ 2. We need to pick n so that 64(n + 1)^3/(17 · 3^n) < 1/2000, or equivalently so that 3^n/(n + 1)^3 > 128000/17 = 7529,4117647 . . . . Fortunately for us this series converges quite rapidly, which means that here we can find the required n by inspection. If we take n = 15 as a first try, we get 3^{15}/16^3 = 3503,151123 . . . , which is almost big enough. For n = 16 we get 3^{16}/17^3 = 8761,7995115 · · · > 7529,4117647 . . . as required. Hence we need at least 16 terms to approximate the sum to the required accuracy. 
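The inspection step can also be automated; a minimal Python sketch (our own addition) that searches for the first n satisfying the error bound derived above:

    # Find the smallest n >= 2 with 64(n+1)^3/(17*3^n) < 1/2000.
    n = 2
    while 64 * (n + 1) ** 3 / (17 * 3 ** n) >= 1 / 2000:
        n += 1
    print(n)  # 16, in agreement with the inspection above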

Example 1.2.23. How many terms of the series

1/((2)(3)(4)) − 5/((6)(7)(8)) + 9/((10)(11)(12)) − 13/((14)(15)(16)) + . . .

must be used to ensure that the error in the approximation of the limit of the sum does not exceed 0,00001?

Solution: The general term of the series is

(−1)^{n−1}(4n − 3)/((4n − 2)(4n − 1)(4n))

Let u_n = (4n − 3)/((4n − 2)(4n − 1)(4n)). Since the series alternates, u_{n+1} < u_n, and lim_{n→∞} u_n = 0, the error in taking the first n terms does not exceed the value of u_{n+1} = (4n + 1)/((4n + 2)(4n + 3)(4n + 4)). We therefore need to pick n so that (4n + 1)/((4n + 2)(4n + 3)(4n + 4)) < 0,00001 = 10^{−5}. If we can determine where we have equality, we will have some idea of where to look for an n which gives us the required inequality. We will have equality when

(4N + 1) · 10^5 = (4N + 2)(4N + 3)(4N + 4) = 8(8N^3 + 18N^2 + 13N + 3);

that is, when

8N^3 + 18N^2 + (13 − (1/2) · 10^5)N + (3 − (1/8) · 10^5) = 0.

We use the Newton–Raphson iteration method to approximate the zeros of this polynomial. Setting f(N) = 8N^3 + 18N^2 + (13 − (1/2) · 10^5)N + (3 − (1/8) · 10^5) and taking N_0 = 70 as a first approximation of a zero of f, the iterative formula

N_{i+1} = N_i − f(N_i)/f′(N_i)

yields N = 78,0561 (4 dp). So for any integer N > 78,0561 we should have u_{N+1} < 0,00001. For example, for N = 79 we have u_80 = 0,000009765 < 0,00001. (N = 78 gives u_79 = 0,0000100142 > 0,00001.) We therefore need n = 79 terms. 
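Instead of Newton–Raphson, one can simply scan for the first admissible n; a short Python sketch (our own check of the answer n = 79):

    # The error after n terms is at most u_{n+1}, with u_n as in the example.
    def u(n):
        return (4 * n - 3) / ((4 * n - 2) * (4 * n - 1) * (4 * n))

    n = 1
    while u(n + 1) >= 1e-5:
        n += 1
    print(n, u(n + 1))  # 79, with u_80 roughly 9.77e-06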

EXERCISE 1.2.
1. Find the sum of the first n terms of

∑_{k=1}^{n} k^3

[Hint: consider (k + 1)^4 − k^4] [Ans: (1/4)n^2(n + 1)^2]
2. Consider the series

S_n = ∑_{k=1}^{n} k/2^k

Write down the series for (1/2)S_n and show that

S_n − (1/2)S_n = 1/2 + 1/4 + 1/8 + . . . + 1/2^n − n/2^{n+1}

Hence find the sum of the series S_n.
3. Consider the general arithmetic–geometric series

S_n = a + (a + d)r + (a + 2d)r^2 + (a + 3d)r^3 + . . . + [a + (n − 1)d]r^{n−1}

Show that

(1 − r)S_n = a + dr + dr^2 + . . . + dr^{n−1} − [a + (n − 1)d]r^n

Find a simple expression for S_n.
4. Evaluate each of the following sums:
(a) 1 + 2 + 3 + . . . + 153 + 154
(b) (1)^2 + (2)^2 + (3)^2 + . . . + (153)^2 + (154)^2
(c) 1/((2)(4)) + 1/((4)(6)) + 1/((6)(8)) + . . . + 1/((152)(154))
5. A loan of R10 000 repaid over a period of n years at r% per year satisfies the equation

10 000 = x/(1 + r/100) + x/(1 + r/100)^2 + x/(1 + r/100)^3 + . . . + x/(1 + r/100)^n

where x is the repayment installment. Find x in terms of r and n and compute its value if r = 10 and n = 20.
6. Is it possible to find a convergent harmonic series? Explain with a proof.
7. Test the following series for convergence or divergence:
(a) 1/2^2 + √2/3^2 + √3/4^2 + √4/5^2 + . . . [Convergent]
(b) 0,001/(1 + 10) − 0,002/(1 + 20) + 0,003/(1 + 30) − 0,004/(1 + 40) + . . . [Divergent]
(c) 1!/10 − 2!/10^2 + 3!/10^3 − 4!/10^4 + . . . [Divergent]
(d) 1/2 − 1/3 + 1/5 − 1/7 + 1/11 − 1/13 + 1/17 − 1/19 + . . . [Convergent]
(e) 1/(1 + 2000) + 2^2/(2^3 + 3000) + 3^2/(3^3 + 4000) + 4^2/(4^3 + 5000) + . . . [Divergent]
(f) (1)(3)/3 + (2)(4)/3^2 + (3)(5)/3^3 + (4)(6)/3^4 + (5)(7)/3^5 + . . . [Convergent]
(g) 3/2^4 + 3^2/2^5 + 3^3/2^6 + 3^4/2^7 + . . . [Divergent]
8. (a) Estimate the error made in using the first 5 terms of the series

1/3 + 2^2/3^2 + 3^2/3^3 + 4^2/3^4 + 5^2/3^5 + . . .

as an approximation to the sum of the series.
(b) How many terms of this series are needed to ensure an error of less than 0,01?
[Ans: 0,30864 . . . ; 10]

UNIT 3: POWER SERIES


1.3.1. OBJECTIVE. Introducing the concept of regions of convergence for
power series.
1.3.2. OUTCOMES. At the end of this unit the student should
• Know what is meant by the terms power series and interval of convergence;
• Know how to compute the interval of convergence of a given power series.
1.3.3. POWER SERIES. The ratio test proves to be very useful in studying the convergence of series of functions

u_1(x) + u_2(x) + u_3(x) + . . . + u_n(x) + . . . ;

particularly series of the form

a_0 + a_1(x − x_0) + a_2(x − x_0)^2 + a_3(x − x_0)^3 + . . . + a_n(x − x_0)^n + . . .

A series of this form is known as a power series in (x − x_0). Such a series is also referred to as a power series centred at x_0. If x_0 = 0, it is known as a power series in x (alternatively, a power series centred at 0).

For such a series the limit of the ratio |u_{n+1}|/|u_n| will generally depend on x, and may be less than 1 for some values of x, and greater than 1 (or equal to 1) for other values of x. Hence the convergence (or divergence) of the series depends on the value of x. The set of all values of x for which the series converges is known as the region of convergence or interval of convergence of the power series.

Example 1.3.1. For what values of x does the series

1 + x/3 + x^2/5 + x^3/7 + . . .

converge?

Solution: The general term is of the form u_n(x) = x^{n−1}/(2n − 1), and hence

|u_{n+1}|/|u_n| = (|x^n|/(2n + 1)) · ((2n − 1)/|x^{n−1}|) = ((2n − 1)/(2n + 1))|x|.

Clearly

lim_{n→∞} |u_{n+1}|/|u_n| = |x|.

Hence by d'Alembert's ratio test, the series will converge for |x| < 1 (equivalently −1 < x < 1), and diverge for |x| > 1 (equivalently either x < −1 or x > 1). The test gives us no information about the cases where |x| = 1, and hence these cases need to be investigated separately. Now of course if |x| = 1, the possible values of x are x = −1 or x = 1. If x = −1, then the series is

1 − 1/3 + 1/5 − 1/7 + . . .

which is an alternating series with general term u_n = (−1)^{n−1}/(2n − 1). Since lim_{n→∞} 1/(2n − 1) = 0, and 1/(2(n + 1) − 1) ≤ 1/(2n − 1) for each n, this series converges by the alternating series test. If x = 1, then the series is

1 + 1/3 + 1/5 + 1/7 + . . .

which has general term of the form u_n = 1/(2n − 1). Since (1/2)∑_{n=1}^{∞} 1/n = ∑_{n=1}^{∞} 1/(2n) diverges (p-series, p = 1) and

1/(2n − 1) > 1/(2n) for each n ∈ N,

the comparison test tells us that the given series must also diverge. Hence the interval of convergence is −1 ≤ x < 1. 
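The behaviour at sample points can be probed numerically; in the Python sketch below (our own illustration) the partial sums settle down for x inside the interval but keep growing at the endpoint x = 1:

    # Probe the series 1 + x/3 + x^2/5 + ... inside and on the edge of the interval.
    def partial(x, n):
        return sum(x ** (k - 1) / (2 * k - 1) for k in range(1, n + 1))

    for n in (10, 100, 1000, 10000):
        print(n, partial(-0.5, n), partial(1.0, n))
    # The x = -0.5 column stabilises; the x = 1.0 column keeps growing (divergence).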

Example 1.3.2. For what values of x does the following series converge?

1/2 + (x^2 − 5x + 2)/2^2 + (x^2 − 5x + 2)^2/2^3 + (x^2 − 5x + 2)^3/2^4 + . . .

Solution: The general term is of the form

u_n = (x^2 − 5x + 2)^{n−1}/2^n.

Applying the ratio test we get

|u_{n+1}|/|u_n| = (|x^2 − 5x + 2|^n/2^{n+1}) · (2^n/|x^2 − 5x + 2|^{n−1}) = (1/2)|x^2 − 5x + 2|.

Since (1/2)|x^2 − 5x + 2| is independent of n, we have

lim_{n→∞} |u_{n+1}|/|u_n| = (1/2)|x^2 − 5x + 2|.

Thus the series converges for

(1/2)|x^2 − 5x + 2| < 1;

i.e. for

−1 < (1/2)(x^2 − 5x + 2) < 1.

Now notice that

x^2 − 5x + 2 > −2 ⇔ x^2 − 5x + 4 > 0 ⇔ (x − 1)(x − 4) > 0 ⇔ either x > 1 and x > 4, or x < 1 and x < 4 ⇔ either x > 4 or x < 1

and

x^2 − 5x + 2 < 2 ⇔ x^2 − 5x < 0 ⇔ x(x − 5) < 0 ⇔ either x > 0 and x < 5, or x < 0 and x > 5 (impossible) ⇔ 0 < x < 5.

From the above it is clear that −2 < x^2 − 5x + 2 < 2 will hold precisely when either of x > 4 or x < 1 holds simultaneously with 0 < x < 5; that is, when either 0 < x < 1 or 4 < x < 5. The case

(1/2)|x^2 − 5x + 2| = 1

(corresponding to x = 0, 1, 4, 5) still needs to be investigated. Substituting into

1/2 + (x^2 − 5x + 2)/2^2 + (x^2 − 5x + 2)^2/2^3 + (x^2 − 5x + 2)^3/2^4 + . . .

we get

x = 0 : 1/2 + 1/2 + 1/2 + 1/2 + . . . which is divergent
x = 1 : 1/2 − 1/2 + 1/2 − 1/2 + . . . which is divergent
x = 4 : 1/2 − 1/2 + 1/2 − 1/2 + . . . which is divergent
x = 5 : 1/2 + 1/2 + 1/2 + 1/2 + . . . which is divergent.

(In each of the above cases, divergence follows from proposition 1.2.5.) Thus the series converges for 0 < x < 1 and 4 < x < 5. 

EXERCISE 1.3. Determine the interval of convergence of each of the following series:

1. 1 − x^2/1! + x^4/2! − x^6/3! + . . .
(−∞ < x < ∞)
2. 2/((1)(3)) + 3x/((2)(4)) + 4x^2/((3)(5)) + 5x^3/((4)(6)) + . . .
(−1 ≤ x < 1)
3. 1 + x/2^2 + x^2/2^4 − x^3/2^6 + . . .
(−4 < x < 4)
4. 1/((1)(3)(5)) + (x + 2)^2/((2)(4)(6)) + (x + 2)^4/((3)(5)(7)) + (x + 2)^6/((4)(6)(8)) + . . .
(−3 ≤ x ≤ −1)
5. 1 − (1!x) + (2!x^2) − (3!x^3) + . . .
(x = 0)
6. 1 + 2/x + 3/x^2 + 4/x^3 + . . .
(x < −1 or x > 1)
7. (x^2 + 4x − 1)/(4 + 1) + (x^2 + 4x − 1)^2/(4^2 + 2^4) + (x^2 + 4x − 1)^3/(4^3 + 3^4) + (x^2 + 4x − 1)^4/(4^4 + 4^4) + . . .
(−5 < x < −3; −1 < x < 1)

UNIT 4: POWER SERIES EXPANSION OF FUNCTIONS


1.4.1. OBJECTIVE. To apply the work on power series to function theory.
In particular to find conditions under which a given function may be represented
as a power series. In this regard we give a formal verification of conditions under
which a Taylor series converges to the function from which it was generated.
1.4.2. OUTCOMES. At the end of this unit the student should
• Know and understand the concepts of Maclaurin series and Taylor series
(revision);
• Know the Maclaurin expansions of some elementary functions like e^x, ln(1 + x), sin x, cos x and (h + x)^n;
• Know when and how to perform basic operations with power series like
adding, subtracting, multiplying, differentiating and integrating;
• Given a function which is a combination of elementary functions, the stu-
dent should know how to compute the Taylor series of the given function
from those of the elementary functions, by means of the above operations.
(For example the student should be able to use the Maclaurin series of $e^x$ and $\cos x$ to compute the Maclaurin series of $\cos(e^x - 1)$.)
1.4.3. INTRODUCTION. In this chapter we shall investigate in detail the
formal methods by which functions may be expressed as infinite power series. We
shall also examine the conditions under which such power series are valid repre-
sentations of the functions which generated them. In closing we shall consider the
practical applications of power series representations to such problems as the com-
putation of definite integrals and the evaluation of limits of indeterminate forms.
1.4.4. MACLAURIN EXPANSIONS. (Colin Maclaurin (1698 – 1746) :
Scottish mathematician)
Suppose we are given a function f for which derivatives of all orders exist.
Formally if such a function may indeed be written in the form f (x) = b0 + b1 x +
b2 x2 + . . ., then we expect that
$$f'(x) = b_1 + 2b_2 x + 3b_3 x^2 + 4b_4 x^3 + 5b_5 x^4 + \dots$$
$$f''(x) = 2b_2 + 3\cdot 2\,b_3 x + 4\cdot 3\,b_4 x^2 + 5\cdot 4\,b_5 x^3 + \dots$$
$$f'''(x) = 3\cdot 2\cdot 1\,b_3 + 4\cdot 3\cdot 2\,b_4 x + 5\cdot 4\cdot 3\,b_5 x^2 + \dots$$
$$f^{(4)}(x) = 4\cdot 3\cdot 2\cdot 1\,b_4 + 5\cdot 4\cdot 3\cdot 2\,b_5 x + \dots$$
$$\vdots$$
So if we evaluate $f$ and each of its derivatives at $0$, then on inserting $0$ into each of the above expansions, we expect those values to be
$$f(0) = b_0,\quad f'(0) = b_1,\quad f''(0) = 2!\,b_2,\quad f'''(0) = 3!\,b_3,\quad f^{(4)}(0) = 4!\,b_4,\ \dots$$
Thus if $f$ is going to have a well-behaved series expansion in terms of $x$, the coefficients should really be of the form $b_n = \frac{f^{(n)}(0)}{n!}$ for each $n$, where we define $0!$ to be $1$. So if $f$ is a function for which derivatives of all orders exist at $x = 0$, we shall call the proposed power series form of $f$,
$$f(x) \sim f(0) + f'(0)\,x + f''(0)\,\frac{x^2}{2!} + f'''(0)\,\frac{x^3}{3!} + \dots$$
the Maclaurin expansion of $f$.
Now in the above discussion we started by assuming that $f$ has some power series expansion $f(x) = b_0 + b_1 x + b_2 x^2 + \dots$, and then presented a strong case for suspecting that in such cases, the $b_n$'s MUST be of the form $b_n = \frac{f^{(n)}(0)}{n!}$. What this suggests is that in a given region, the power series expansion of a function is unique. In other words, if for a function $f$ there are several ways of generating a power series expansion around $0$, they should all give you the Maclaurin series.
Several questions need to be answered about this expansion if it is going to be
of any practical use whatsoever. For example:
• Given f , for which values of x will the associated Maclaurin series con-
verge?
• At those values of x for which the Maclaurin series does converge, does it
converge to f , or to something else?
• What happens when we differentiate (or integrate) a function? Can we
get the Maclaurin series of f 0 by differentiating (or integrating) the one
for f term for term?
• What about doing algebra with Maclaurin series? If we add, multiply or
divide functions, can we get the Maclaurin series of the resultant function
by adding, dividing or multiplying the Maclaurin series of the original
functions?
As you may remember, this expansion was first introduced and studied in Math-
ematics II. However in Mathematics II, attention was only given to the actual
computation of the Maclaurin expansion, without considering regions of conver-
gence, or whether the series actually converges to the function which generated it,
or not. In this course we will go some way to filling these gaps.
Up to now we have been arguing very loosely, which is of course no way to do
mathematics. At some stage we need to start making the above ideas more exact.
However, before doing so, we will first take some time to reacquaint ourselves with
the actual process of computing a Maclaurin series. For now we will therefore
assume that the above claims about uniqueness, differentiating (integrating) term
for term, and doing algebra all hold.
Example 1.4.1. What is the Maclaurin expansion of ex and for which values
of x does it converge?
Solution: Recall that for $f(x) = e^x$,
$$f'(x) = e^x,\quad f''(x) = e^x,\quad f'''(x) = e^x,\ \dots$$
Thus
$$1 = e^0 = f(0) = f'(0) = f''(0) = f'''(0) = \dots$$
Substituting into the Maclaurin series formula we get
$$e^x \sim 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \dots + \frac{x^{n-1}}{(n-1)!} + \dots$$
To determine the region of convergence, we use the ratio test. Given any x we
see that
$$\frac{|u_{n+1}|}{|u_n|} = \frac{|x|^n}{n!}\cdot\frac{(n-1)!}{|x|^{n-1}} = \frac{|x|}{n} \to 0 < 1 \quad\text{as } n\to\infty$$
Hence the series converges for all values of x, and the interval of convergence is
therefore −∞ < x < ∞. 
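As a quick illustration (our addition, not part of the original text), the partial sums of this series can be compared with Python's math.exp; the helper name below is hypothetical.

```python
import math

# Partial sums of 1 + x + x^2/2! + ... approach e^x for any fixed x
# (illustrative sketch; 20 terms is ample for moderate x).
def exp_partial_sum(x, terms=20):
    return sum(x**k / math.factorial(k) for k in range(terms))

for x in [-2.0, 0.5, 3.0]:
    print(x, exp_partial_sum(x), math.exp(x))
```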

Example 1.4.2. Determine the Maclaurin series for sin x and determine its
region of convergence.
Note: For all the common differentiation formulas for sin x to hold, x must be a
variable expressed in radians.
Solution: Here
$$\begin{aligned}
f(x) &= \sin x & f(0) &= 0\\
f'(x) &= \cos x & f'(0) &= 1\\
f''(x) &= -\sin x & f''(0) &= 0\\
f'''(x) &= -\cos x & f'''(0) &= -1\\
f^{(4)}(x) &= \sin x & f^{(4)}(0) &= 0\\
&\ \vdots & &\ \vdots
\end{aligned}$$
Thus
$$\sin x \sim x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots$$
The absolute value of the $n$th term is
$$|u_n| = \frac{|x|^{2n-1}}{(2n-1)!}.$$
Hence for any $x$,
$$\frac{|u_{n+1}|}{|u_n|} = \frac{|x|^{2n+1}}{(2n+1)!}\cdot\frac{(2n-1)!}{|x|^{2n-1}} = \frac{x^2}{(2n+1)\,2n} \to 0 < 1 \quad\text{as } n\to\infty.$$
Thus the series converges for all values of $x$. Formally differentiating the series for $\sin x$ leads to
$$\cos x \sim 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots$$
which can be shown to also converge for all $x$. □

Example 1.4.3. Determine the Maclaurin series for tanh−1 x and determine
its region of convergence.
Solution: Instead of working directly with $f(x) = \tanh^{-1}x$, it is easier to find a series for $\ln(1+x)$, and then use the standard formula
$$\tanh^{-1}x = \frac{1}{2}\ln\frac{1+x}{1-x}$$
to obtain a series for $\tanh^{-1}x$. For $f(x) = \ln(1+x)$ we have
$$\begin{aligned}
f(x) &= \ln(1+x) & f(0) &= 0\\
f'(x) &= \tfrac{1}{1+x} & f'(0) &= 1\\
f''(x) &= -\tfrac{1}{(1+x)^2} & f''(0) &= -1\\
f'''(x) &= +\tfrac{2}{(1+x)^3} & f'''(0) &= 2 = 2!\\
f^{(4)}(x) &= -\tfrac{3\cdot 2}{(1+x)^4} & f^{(4)}(0) &= -3\cdot 2 = -3!\\
f^{(5)}(x) &= \tfrac{4\cdot 3\cdot 2}{(1+x)^5} & f^{(5)}(0) &= 4\cdot 3\cdot 2 = 4!\\
f^{(6)}(x) &= -\tfrac{5\cdot 4\cdot 3\cdot 2}{(1+x)^6} & f^{(6)}(0) &= -5\cdot 4\cdot 3\cdot 2 = -5!\\
&\ \vdots & &\ \vdots
\end{aligned}$$
Thus
$$\ln(1+x) \sim x - \frac{x^2}{2!} + 2!\,\frac{x^3}{3!} - 3!\,\frac{x^4}{4!} + 4!\,\frac{x^5}{5!} - 5!\,\frac{x^6}{6!} + \dots = x - \frac{1}{2}x^2 + \frac{1}{3}x^3 - \frac{1}{4}x^4 + \frac{1}{5}x^5 - \frac{1}{6}x^6 + \dots$$
On replacing $x$ with $-x$, we get
$$\ln(1-x) = -x - \frac{1}{2}x^2 - \frac{1}{3}x^3 - \frac{1}{4}x^4 - \frac{1}{5}x^5 - \frac{1}{6}x^6 - \dots$$
If now we subtract the second series from the first, we get
$$\tanh^{-1}x \sim \frac{1}{2}\ln\frac{1+x}{1-x} = \frac{1}{2}\left[\ln(1+x) - \ln(1-x)\right] = x + \frac{1}{3}x^3 + \frac{1}{5}x^5 + \dots$$
The $n$th term is of the form
$$u_n = \frac{x^{2n-1}}{2n-1}.$$
Hence by the ratio test
$$\frac{|u_{n+1}|}{|u_n|} = \frac{|x|^{2n+1}}{2n+1}\cdot\frac{2n-1}{|x|^{2n-1}} = \frac{2n-1}{2n+1}\,x^2 \to x^2 \quad\text{as } n\to\infty.$$
Thus the series converges if $x^2 < 1$ (i.e. $-1 < x < 1$) and diverges if $x^2 > 1$ (that is, when either $x > 1$ or $x < -1$). When $x = 1$, we get the series
$$1 + \frac{1}{3} + \frac{1}{5} + \dots$$
with general term $u_n = \frac{1}{2n-1}$. But since
$$\frac{1}{2n-1} > \frac{1}{2n} \quad\text{for all } n \ge 1,$$
and since $\frac{1}{2}\sum_{n=1}^{\infty}\frac{1}{n}$ diverges ($p$-test, $p = 1$), the comparison test now tells us that $1 + \frac{1}{3} + \frac{1}{5} + \dots$ also diverges. When $x = -1$ we get $-1 - \frac{1}{3} - \frac{1}{5} - \dots$, which must also diverge since it is just a multiple of the series $1 + \frac{1}{3} + \frac{1}{5} + \dots$ which we know diverges. Thus the interval of convergence for the Maclaurin series of $\tanh^{-1}$ is $-1 < x < 1$. □
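For readers with access to a computer algebra system, the expansion can also be verified symbolically. Below is a minimal sketch assuming the sympy library is available (our addition, not part of the original text).

```python
import sympy as sp

# The Maclaurin series of atanh(x) should match x + x**3/3 + x**5/5 + ...
x = sp.symbols('x')
print(sp.series(sp.atanh(x), x, 0, 8))
# -> x + x**3/3 + x**5/5 + x**7/7 + O(x**8)
```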

Example 1.4.4. Suppose that f (x) is an unknown function with the properties
that it has derivatives of all orders, and

$$f''(x) = x f(x),\qquad f(0) = 1,\qquad\text{and}\qquad f'(0) = 0.$$
Find the Maclaurin series of $f(x)$.
Solution: We already know that $f(0) = 1$ and $f'(0) = 0$. Inductively using the given properties, we see that
$$\begin{aligned}
f''(x) &= x f(x) & f''(0) &= 0\\
f'''(x) &= f(x) + x f'(x) & f'''(0) &= 1\\
f^{(4)}(x) &= 2f'(x) + x f''(x) & f^{(4)}(0) &= 0\\
f^{(5)}(x) &= 3f''(x) + x f'''(x) & f^{(5)}(0) &= 0\\
f^{(6)}(x) &= 4f'''(x) + x f^{(4)}(x) & f^{(6)}(0) &= (4)(1)\\
f^{(7)}(x) &= 5f^{(4)}(x) + x f^{(5)}(x) & f^{(7)}(0) &= 0\\
f^{(8)}(x) &= 6f^{(5)}(x) + x f^{(6)}(x) & f^{(8)}(0) &= 0\\
f^{(9)}(x) &= 7f^{(6)}(x) + x f^{(7)}(x) & f^{(9)}(0) &= (7)(4)(1)\\
&\ \vdots
\end{aligned}$$
Hence
$$f(x) \sim 1 + \frac{x^3}{3!} + \frac{(1)(4)}{6!}x^6 + \frac{(1)(4)(7)}{9!}x^9 + \frac{(1)(4)(7)(10)}{12!}x^{12} + \dots$$
See if you can show that this series converges for all values of x! □
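One way to cross-check the coefficients mechanically is to match powers of $x$ in $f''(x) = x f(x)$: writing $f = \sum c_k x^k$ gives the recurrence $c_{k+2} = c_{k-1}/((k+2)(k+1))$. The sketch below is our own illustration of this, not part of the original text.

```python
from fractions import Fraction

# Coefficient recurrence implied by f'' = x f with f = sum c_k x^k:
# (k+2)(k+1) c_{k+2} = c_{k-1} for k >= 1, with c_0 = 1, c_1 = c_2 = 0.
N = 13
c = [Fraction(0)] * N
c[0] = Fraction(1)
for k in range(1, N - 2):
    c[k + 2] = c[k - 1] / ((k + 2) * (k + 1))

for k, ck in enumerate(c):
    if ck:
        print(f"x^{k}: {ck}")
# x^0: 1, x^3: 1/6 (= 1/3!), x^6: 1/180 (= 4/6!), x^9: 1/12960, ...
```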
1.4.5. TAYLOR EXPANSIONS. (Brook Taylor (1685–1731): English mathematician)
In the previous section we introduced the idea of a Maclaurin series. Given a
function f for which derivatives of all orders exist, the basic idea was to try and
write f as a power series of the form b0 + b1 x + b2 x2 + . . . However there is nothing
sacred about x. We could just as well have tried to write f as a power series of the
form a0 + a1 (x − 10) + a2 (x − 10)2 + a3 (x − 10)3 + . . . instead of b0 + b1 x + b2 x2 + . . ..
In fact if we were interested in approximating f with a polynomial at the point
x = 10 rather than x = 0, it would be more appropriate to look for a power series
in powers of (x − 10), rather than powers of x. So rather than focussing on x only,
we are in this section interested in finding ways of writing $f(x)$ in the form
$$f(x) = b_0 + b_1(x-a) + b_2(x-a)^2 + \dots + b_n(x-a)^n + \dots$$
where $a$ is a constant, and the $b_n$'s have to be determined.


Assuming that $f(x)$ has derivatives of all orders, a small modification of the argument at the start of the previous section shows that if such an expansion does exist, then in this case we expect the coefficients to be of the form
$$b_n = \frac{f^{(n)}(a)}{n!}.$$
The proposed series
$$f(x) \sim f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \frac{f'''(a)}{3!}(x-a)^3 + \dots + \frac{f^{(n)}(a)}{n!}(x-a)^n + \dots$$
is known as the Taylor series around the point x = a. When referring to the Taylor
series around the point a, all the following expressions have essentially the same
meaning:
• Expand f (x) in powers of (x − a) .
• Expand f (x) in a power series about the point x = a.
• Expand f (x) in a Taylor series about the point x = a.
As was the case in the previous section, we defer theoretical considerations for
the moment, and first take some time to acquaint ourselves with the actual process
of computing a Taylor series.

Example 1.4.5. Expand $\cos x$ in powers of $\left(x - \frac{\pi}{2}\right)$.
Solution: Proceeding inductively we have
$$\begin{aligned}
f(x) &= \cos x & f(\tfrac{\pi}{2}) &= 0\\
f'(x) &= -\sin x & f'(\tfrac{\pi}{2}) &= -1\\
f''(x) &= -\cos x & f''(\tfrac{\pi}{2}) &= 0\\
f'''(x) &= \sin x & f'''(\tfrac{\pi}{2}) &= 1\\
f^{(4)}(x) &= \cos x & f^{(4)}(\tfrac{\pi}{2}) &= 0\\
&\ \vdots & &\ \vdots
\end{aligned}$$
Thus
$$\begin{aligned}
\cos x &\sim 0 - \left(x-\frac{\pi}{2}\right) + 0 + \frac{1}{3!}\left(x-\frac{\pi}{2}\right)^3 - \frac{1}{5!}\left(x-\frac{\pi}{2}\right)^5 + \dots\\
&= -\left(x-\frac{\pi}{2}\right) + \frac{1}{3!}\left(x-\frac{\pi}{2}\right)^3 - \frac{1}{5!}\left(x-\frac{\pi}{2}\right)^5 + \dots + (-1)^n\,\frac{1}{(2n-1)!}\left(x-\frac{\pi}{2}\right)^{2n-1} + \dots
\end{aligned}$$
By means of the ratio test it can be shown that this series converges for all values of x. □
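A brief numerical illustration (ours, not the guide's): truncations of this expansion approximate $\cos x$ well, which is easy to confirm against Python's math.cos. The function name is hypothetical.

```python
import math

# Taylor polynomial of cos x about a = pi/2:
# cos x ~ sum over n of (-1)^n (x - pi/2)^(2n-1) / (2n-1)!
def cos_about_half_pi(x, terms=10):
    t = x - math.pi / 2
    return sum((-1)**n * t**(2*n - 1) / math.factorial(2*n - 1)
               for n in range(1, terms + 1))

for x in [0.0, 1.0, 3.0]:
    print(x, cos_about_half_pi(x), math.cos(x))
```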

1.4.6. TAYLOR'S THEOREM. So far we have computed Taylor and Maclaurin series of certain functions, and also determined the regions of convergence of
these series. But do these series actually converge to the functions that generated
them? If they don’t they are of no use when studying the actual functions. The
answer to this question is provided by Taylor’s theorem. The essence of this theo-
rem, is to provide conditions under which the Taylor (or Maclaurin) expansion of a
function actually converges to the function. So let f have derivatives of all orders
at a, and let it be associated with the Taylor expansion
$$f(x) \sim f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \frac{f'''(a)}{3!}(x-a)^3 + \dots + \frac{f^{(n-1)}(a)}{(n-1)!}(x-a)^{n-1} + \frac{f^{(n)}(a)}{n!}(x-a)^n + \dots \tag{1.4.1}$$
Let
$$S_n(x) = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \frac{f'''(a)}{3!}(x-a)^3 + \dots + \frac{f^{(n-1)}(a)}{(n-1)!}(x-a)^{n-1}$$

be the $n$th partial sum of the Taylor series. If we can show that for some fixed $x$
$$\lim_{n\to\infty}|f(x) - S_n(x)| = 0,$$
it would mean that the series converges to $f(x)$ at this point.


We proceed to obtain a formula for estimating the difference $f(x) - S_n(x)$, where $S_n$ is as above. To do this, fix $x$ for now, and let $T_n$ be a number so that
$$f(x) = S_n(x) + \frac{(x-a)^n}{n!}\,T_n = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \frac{f'''(a)}{3!}(x-a)^3 + \dots + \frac{f^{(n-1)}(a)}{(n-1)!}(x-a)^{n-1} + \frac{(x-a)^n}{n!}\,T_n.$$
The above may trivially be rewritten as
$$f(x) - f(a) - f'(a)(x-a) - \frac{f''(a)}{2!}(x-a)^2 - \frac{f'''(a)}{3!}(x-a)^3 - \dots - \frac{f^{(n-1)}(a)}{(n-1)!}(x-a)^{n-1} - \frac{(x-a)^n}{n!}\,T_n = 0. \tag{1.4.2}$$
Now define a function $\varphi(u)$ as follows:
$$\varphi(u) = f(x) - f(u) - f'(u)(x-u) - \frac{f''(u)}{2!}(x-u)^2 - \frac{f'''(u)}{3!}(x-u)^3 - \dots - \frac{f^{(n-1)}(u)}{(n-1)!}(x-u)^{n-1} - \frac{(x-u)^n}{n!}\,T_n. \tag{1.4.3}$$
It is clear from the above that when u = x, we get φ(x) = 0. On the other hand if
u = a, we have from equation 1.4.2 that φ(a) = 0. By hypothesis all the derivatives
of f exist. Hence since φ (u) and φ0 (u) are both finite sums of products of powers
of (x − u), and derivatives of f (u), all of which exist and are continuous, φ and φ0
are themselves continuous functions over the closed interval from a to x. Hence by
Rolle’s theorem (see Mathematics II), there must be at least one value of u between
a and x (say u = x1 ) at which the gradient of φ (u) must be zero, as shown in the
figure below.

Figure 1.3. The graph of $\varphi(u)$ between $u = a$ and $u = x$, showing points $x_1$, $x_2$, $x_3$ at which the gradient of $\varphi$ is zero.

Thus, $\varphi'(x_1) = 0$ for some $x_1$ between $a$ and $x$. Taking into account that $x$ is constant with respect to $u$, when we differentiate equation 1.4.3 with respect to $u$, we will get:
$$\begin{aligned}
\varphi'(u) ={}& 0 - f'(u) - \left[-f'(u) + f''(u)(x-u)\right] - \left[-f''(u)(x-u) + \frac{f'''(u)}{2!}(x-u)^2\right] - \dots\\
&- \left[-\frac{f^{(n-2)}(u)}{(n-3)!}(x-u)^{n-3} + \frac{f^{(n-1)}(u)}{(n-2)!}(x-u)^{n-2}\right]\\
&- \left[-\frac{f^{(n-1)}(u)}{(n-2)!}(x-u)^{n-2} + \frac{f^{(n)}(u)}{(n-1)!}(x-u)^{n-1}\right] + \frac{(x-u)^{n-1}}{(n-1)!}\,T_n
\end{aligned}$$
If in this expression we cancel terms where possible, it simplifies to
$$\varphi'(u) = -\frac{f^{(n)}(u)}{(n-1)!}(x-u)^{n-1} + \frac{(x-u)^{n-1}}{(n-1)!}\,T_n.$$
But as we have already seen, $0 = \varphi'(x_1)$, which forces
$$0 = \varphi'(x_1) = -\frac{f^{(n)}(x_1)}{(n-1)!}(x-x_1)^{n-1} + \frac{(x-x_1)^{n-1}}{(n-1)!}\,T_n.$$
Therefore
$$T_n = f^{(n)}(x_1).$$
Substituting into equation 1.4.2, we get
$$f(x) = f(a) + f'(a)(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \frac{f'''(a)}{3!}(x-a)^3 + \dots + \frac{f^{(n-1)}(a)}{(n-1)!}(x-a)^{n-1} + \frac{(x-a)^n}{n!}f^{(n)}(x_1) \tag{1.4.4}$$
for some $x_1$ between $a$ and $x$.
Equation 1.4.4 is known as Taylor's Theorem, and may be written as
$$f(x) = S_n(x) + \frac{(x-a)^n}{n!}f^{(n)}(x_1).$$
Computing the error between $f(x)$ and the value of the Taylor series therefore boils down to computing
$$\lim_{n\to\infty}|f(x) - S_n(x)| = \lim_{n\to\infty}\left|\frac{(x-a)^n}{n!}f^{(n)}(x_1)\right|.$$
This must equal zero for the series to converge to $f(x)$. We note that there are other ways in which this error can be expressed, but the one given above will be convenient for our purposes.
Finally, let us consider the three possibilities:
$$\text{Case 1:}\quad \lim_{n\to\infty}\frac{(x-a)^n}{n!}f^{(n)}(x_1)\ \text{fails to exist;}$$
$$\text{Case 2:}\quad \lim_{n\to\infty}\frac{(x-a)^n}{n!}f^{(n)}(x_1) = L \ne 0;$$
$$\text{Case 3:}\quad \lim_{n\to\infty}\frac{(x-a)^n}{n!}f^{(n)}(x_1) = 0.$$

In case 1, the series diverges and represents no function. In case 2, the series
converges, but to f (x) − L instead of f (x). In case 3, the series converges to f (x).
A Maclaurin series is just a Taylor series with a = 0, and hence this result is of
course applicable to any Maclaurin series we wish to compute. Now if the Taylor
series DOES converge to f (x), then in the language of section 1.2.6, the remainder
term
$$R_n(x) = \frac{(x-a)^n}{n!}f^{(n)}(x_1)$$
is just the error obtained in estimating the sum $f(x)$ with the $n$th partial sum $S_n(x)$.

Example 1.4.6. Prove that the Maclaurin expansion of f (x) = sin x actually
converges to sin x.
Solution: In example 1.4.2 the Maclaurin expansion of $f(x) = \sin x$ was found to be
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots$$
It was shown to converge for all values of $x$. To prove that the limit to which the series converges is actually $\sin x$, we need to prove that
$$\lim_{n\to\infty}|f(x) - S_n(x)| = \lim_{n\to\infty}\left|\frac{x^n}{n!}f^{(n)}(x_1)\right| = 0,$$
where $x_1$ is some number between $x$ and $0$. The $n$th derivative of $\sin x$ is of course always one of either $\pm\sin x$ or $\pm\cos x$. The absolute value of each of these is always less than or equal to $1$, and so for $f(x) = \sin x$, $\left|f^{(n)}(x_1)\right| \le 1$. Thus,
$$\lim_{n\to\infty}|\sin x - S_n(x)| \le \lim_{n\to\infty}\left|\frac{x^n}{n!}\right| = 0$$
for all x. Hence, the series converges to sin x. □
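The speed at which the bound $|x|^n/n!$ collapses is worth seeing once; the fragment below (our illustration) prints it for a fairly large $x$.

```python
import math

# The remainder bound |x|^n / n! tends to 0 even for large |x|;
# here x = 10, and the factorial eventually swamps the power.
x = 10.0
for n in [5, 10, 20, 40, 60]:
    print(n, x**n / math.factorial(n))
```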

In the above we assumed that
$$\lim_{n\to\infty}\left|\frac{x^n}{n!}\right| = 0$$
for all values of $x$. The easiest way to see that this holds is to use the facts we earlier proved about Maclaurin series. The term $\frac{x^n}{n!}$ is just the general term in the Maclaurin series of $e^x$, and in section 1.4.4 we proved that this Maclaurin series converges for all values of $x$. But since this term is then just the general term of a convergent series, we must have
$$\lim_{n\to\infty}\frac{x^n}{n!} = 0$$
by proposition 1.2.5.

1.4.7. THE BINOMIAL SERIES. Let $h > 0$ be given. Then for any real number $n$ we have
$$(h+x)^n = h^n + nh^{n-1}x + \frac{n(n-1)}{2!}h^{n-2}x^2 + \frac{n(n-1)(n-2)}{3!}h^{n-3}x^3 + \dots + \binom{n}{m}h^{n-m}x^m + \dots$$
whenever $|x| < |h|$.
Note: Given any $n \in \mathbb{R}$ and $m \in \mathbb{N}$, we define $\binom{n}{m}$ by
$$\binom{n}{m} = \frac{n(n-1)(n-2)\dots(n-m+1)}{m!}.$$
For $n$ an integer with $n \ge m$, this formula may be rewritten as
$$\binom{n}{m} = \frac{n(n-1)(n-2)\dots(n-m+1)}{m!}\cdot\frac{(n-m)!}{(n-m)!} = \frac{n!}{m!\,(n-m)!}.$$
If $n \in \mathbb{N}$ with $n < m$, we will have $\binom{n}{m} = 0$, since in this case the product $n(n-1)(n-2)\dots(n-m+1)$ must contain a factor of the form $(n-n) = 0$.
Proof: Computing the Maclaurin series of $f(x) = (h+x)^n$, we see that
$$\begin{aligned}
f(x) &= (h+x)^n\\
f'(x) &= n(h+x)^{n-1}\\
f''(x) &= n(n-1)(h+x)^{n-2}\\
f'''(x) &= n(n-1)(n-2)(h+x)^{n-3}\\
&\ \vdots\\
f^{(r)}(x) &= n(n-1)(n-2)\dots(n-r+1)(h+x)^{n-r}
\end{aligned}$$
whence
$$\begin{aligned}
f(0) &= h^n\\
f'(0) &= nh^{n-1}\\
f''(0) &= n(n-1)h^{n-2}\\
f'''(0) &= n(n-1)(n-2)h^{n-3}\\
&\ \vdots\\
f^{(r)}(0) &= n(n-1)(n-2)\dots(n-r+1)h^{n-r}.
\end{aligned}$$
Thus
$$(h+x)^n \sim h^n + nh^{n-1}x + \frac{n(n-1)}{2!}h^{n-2}x^2 + \frac{n(n-1)(n-2)}{3!}h^{n-3}x^3 + \dots + \frac{n(n-1)(n-2)\dots(n-r+1)}{r!}h^{n-r}x^r + \dots = \sum_{r=0}^{\infty}\binom{n}{r}h^{n-r}x^r$$
Note that if $n$ is a natural number, the series will have exactly $n+1$ terms, since in this case we have $\binom{n}{r} = 0$ for all $r > n$. To compute the interval of convergence, we use the ratio test:
$$\frac{|u_{r+1}|}{|u_r|} = \frac{\left|\binom{n}{r+1}h^{n-(r+1)}x^{r+1}\right|}{\left|\binom{n}{r}h^{n-r}x^r\right|} = \left|\frac{n(n-1)(n-2)\dots(n-r)}{(r+1)!}\cdot\frac{r!}{n(n-1)(n-2)\dots(n-r+1)}\right|\cdot\left|\frac{x}{h}\right| = \left|\frac{n-r}{r+1}\right|\left|\frac{x}{h}\right| \to \left|\frac{x}{h}\right| \quad\text{as } r\to\infty,$$
which converges when $\left|\frac{x}{h}\right| < 1$; that is, when $|x| < h$. (Of course if $n$ is a positive integer, there will be only finitely many terms, and so in this case the "series" is just a polynomial which "converges" for all values of $x$.)
It remains to show that the binomial series converges to $(h+x)^n$ when $|x| < h$. We could prove this using the formula for the remainder in Taylor's theorem, but that proof is a bit tricky, and hence we will not present it here. Instead we outline a rather ingenious alternative way of proving what we want. See if you can work out the details! Define $\psi$ to be the function
$$\psi(x) = \sum_{r=0}^{\infty}\binom{n}{r}h^{n-r}x^r \quad\text{for all } -h < x < h.$$
Our task is to show that $\psi(x) = (h+x)^n$ within its domain of definition. Firstly notice that
$$\psi(0) = h^n.$$
If we differentiate $\psi$, we get $\psi'(x) = \sum_{r=1}^{\infty} r\binom{n}{r}h^{n-r}x^{r-1}$. (The $r = 0$ term is constant and therefore vanishes when we differentiate.) We may rewrite this expression as $\psi'(x) = \sum_{r=0}^{\infty}(r+1)\binom{n}{r+1}h^{n-r-1}x^r$. Now show that this expression for $\psi'$ satisfies
$$(h+x)\psi'(x) = (h+x)\sum_{r=0}^{\infty}(r+1)\binom{n}{r+1}h^{n-r-1}x^r = \sum_{r=0}^{\infty}\left[(r+1)\binom{n}{r+1} + r\binom{n}{r}\right]h^{n-r}x^r.$$
Next verify that $(r+1)\binom{n}{r+1} + r\binom{n}{r} = n\binom{n}{r}$, and insert this formula into the above equation to get
$$(h+x)\psi'(x) = n\psi(x).$$
But this is a first order separable differential equation which, together with the initial condition $\psi(0) = h^n$, has a unique solution, namely $(h+x)^n$. So we must have ψ(x) = (h + x)n , as required.
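As an illustration of the theorem (our sketch; the helper names are hypothetical), partial sums of the binomial series with a fractional exponent can be compared directly with $(h+x)^n$:

```python
import math

def binom(n, m):
    """Generalised binomial coefficient n(n-1)...(n-m+1)/m! for real n."""
    p = 1.0
    for k in range(m):
        p *= (n - k)
    return p / math.factorial(m)

def binom_series(h, x, n, terms=30):
    return sum(binom(n, r) * h**(n - r) * x**r for r in range(terms))

h, x, n = 4.0, 1.0, 0.5          # |x| < h, non-integer exponent
print(binom_series(h, x, n), (h + x)**n)   # both approximately sqrt(5)
```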

1.4.8. OPERATIONS WITH POWER SERIES. Polynomials may be added, subtracted, multiplied and divided. Rearranging their terms does not alter
them. They may be differentiated or integrated term by term. Because of the
resemblance which power series bear to polynomials, it is natural to expect that
they can be manipulated in a similar fashion. But can they really? If we take two
power series and multiply or divide them with each other, what exactly will we
end up with? To be able to answer questions like these, we first need to give some
attention to the issue of uniqueness of power series expansions. This principle of
uniqueness is described by the following theorem:
Theorem 1.4.7. Let $\sum_{n=0}^{\infty} b_n(x-a)^n = b_0 + b_1(x-a) + b_2(x-a)^2 + \dots$ be a power series which for some $r > 0$ converges in the interval $a - r < x < a + r$. Let $f$ be the function defined by
$$f(x) = \sum_{n=0}^{\infty} b_n(x-a)^n \quad\text{for all } a - r < x < a + r.$$
Then
$$b_n = \frac{f^{(n)}(a)}{n!} \quad\text{for all } n \ge 0.$$
Although we will not prove this theorem, it has important consequences and
hence it is essential that we take note of it. But exactly how does this theorem
help us to understand which operations with power series are valid? In principle
the message of the theorem is that if in some interval around a there is a way of
writing a function f as power series in powers of (x − a), then there is only one
way of doing it! This means that if in some way we manage to expand f in
powers of (x − a) around the point a, then it doesn’t really matter how we got to
the expansion. By the uniqueness theorem the expansion we end up with must be
the Taylor series for f at a. So for example to get the Maclaurin series for ex sin x
we may simply multiply the Maclaurin series for ex with the series for sin x, and
then rearrange the terms in increasing powers of x. The result will be a power series
in x which converges to ex sin x, and which by the uniqueness theorem, must be
the Maclaurin series of ex sin x. For this to work you need to be sure that you are
working on an interval on which all the series you are trying to combine, converge
absolutely! If you mess around with either conditionally convergent or divergent
series, strange things may happen! For example as we noted earlier, at a point
where the series converges conditionally, we can no longer rearrange the terms in
any order. The following example illustrates this fact.

Example 1.4.8. Consider the series
$$S = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \frac{1}{7} - \frac{1}{8} + \dots$$
Using the Alternating Series Test, we showed earlier that this series converges! However since $1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \dots$ diverges, the above series does not converge absolutely. Rewriting it in the form
$$S = \left(1 - \frac{1}{2}\right) + \left(\frac{1}{3} - \frac{1}{4}\right) + \left(\frac{1}{5} - \frac{1}{6}\right) + \left(\frac{1}{7} - \frac{1}{8}\right) + \dots$$
clearly shows that $S > \frac{1}{2}$ and hence that $S \ne 0$. By simply rearranging the terms of this series, we get
$$\begin{aligned}
&\left(1 - \frac{1}{2}\right) - \frac{1}{4} + \left(\frac{1}{3} - \frac{1}{6}\right) - \frac{1}{8} + \left(\frac{1}{5} - \frac{1}{10}\right) - \frac{1}{12} + \left(\frac{1}{7} - \frac{1}{14}\right) - \dots\\
&= \frac{1}{2} - \frac{1}{4} + \frac{1}{6} - \frac{1}{8} + \frac{1}{10} - \frac{1}{12} + \frac{1}{14} - \dots\\
&= \frac{1}{2}\left(1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \frac{1}{7} - \dots\right)\\
&= \frac{1}{2}S
\end{aligned}$$
Since $S \ne 0$, we clearly have $\frac{1}{2}S \ne S$. Hence the sum of the rearranged series is NOT the same as that of the original series. We remind the reader that this problem does not occur in the case of absolutely convergent series. Their terms may be rearranged at will, without altering the limit to which they converge.
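This phenomenon is easy to witness numerically. In the sketch below (our addition), the original series heads for $\ln 2 \approx 0{,}693$ while the rearranged one heads for $(\ln 2)/2$:

```python
import math

def original(n):
    # 1 - 1/2 + 1/3 - 1/4 + ... (n terms)
    return sum((-1)**(k + 1) / k for k in range(1, n + 1))

def rearranged(blocks):
    # blocks of the form 1/(2k-1) - 1/(4k-2) - 1/(4k)
    return sum(1/(2*k - 1) - 1/(4*k - 2) - 1/(4*k)
               for k in range(1, blocks + 1))

print(original(100000), math.log(2))        # ~0.69314...
print(rearranged(100000), math.log(2) / 2)  # ~0.34657...
```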
In closing we take some time to highlight some of the facts hinted at in the
preceding discussion.

1.4.9. PROPERTIES OF POWER SERIES.


• If $f(x)$ can be expanded in powers of $(x-a)$ in the neighbourhood of $x = a$, the expansion is unique. For example, expanding
$$f(x) = \frac{1}{1+x}$$
in powers of $x$ using the Maclaurin expansion, or expanding $(1+x)^{-1}$ as a binomial series or, the easiest of all, performing long division,

    1 - x + x^2 - ...
    _________________
    1 + x ) 1
            1 + x
            -----
               -x
               -x - x^2
               --------
                     x^2
                     ...

all lead to the same series, namely
$$\frac{1}{1+x} = 1 - x + x^2 - x^3 + \dots$$
• If on some interval $a - r < x < a + r$ ($r > 0$)
$$f(x) = a_0 + a_1(x-a) + a_2(x-a)^2 + \dots$$
and
$$g(x) = b_0 + b_1(x-a) + b_2(x-a)^2 + \dots,$$
then the two series may be added or subtracted term for term to give
$$f(x) \pm g(x) = (a_0 \pm b_0) + (a_1 \pm b_1)(x-a) + (a_2 \pm b_2)(x-a)^2 + \dots$$
on $a - r < x < a + r$. The resulting series will therefore be valid for all values of $x$ for which both of the original series converge. To see how this works in practice, let's compute the Maclaurin expansion of $\cosh x$. Since by definition $\cosh x = \frac{1}{2}\left(e^x + e^{-x}\right)$, we conclude that
$$\begin{aligned}
\cosh x &= \frac{1}{2}\left(e^x + e^{-x}\right)\\
&= \frac{1}{2}\left[\left(1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \dots\right) + \left(1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \frac{x^4}{4!} - \dots\right)\right]\\
&= 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \dots
\end{aligned}$$
for all values of x.
• If on some interval $a - r < x < a + r$ ($r > 0$)
$$f(x) = a_0 + a_1(x-a) + a_2(x-a)^2 + \dots$$
and
$$g(x) = b_0 + b_1(x-a) + b_2(x-a)^2 + \dots,$$
then the two series may be multiplied as ordinary polynomials. If $b_0 \ne 0$, then we may also divide the series for $f$ by the series for $g$, to get an expansion for $\frac{f(x)}{g(x)}$. The series generated in this way for $f(x)\cdot g(x)$ converges for all values of $x$ for which both $f(x)$ and $g(x)$ converge, but in the case of $\frac{f(x)}{g(x)}$, this may not be enough. Here we need to insist that in addition the quotient series also converges. For example, the Maclaurin series for $\sin x$ and $\cos x$ both converge for all values of $x$. Dividing the series for $\sin$ by the series for $\cos$ will give us a Maclaurin series for $\tan = \frac{\sin}{\cos}$, but this series does not converge at $x = \frac{\pi}{2}$ since $\tan$ is not defined at $x = \frac{\pi}{2}$.
• If
$$f(y) = a_0 + a_1 y + a_2 y^2 + \dots$$
and
$$g(x) = b_0 + b_1(x-a) + b_2(x-a)^2 + \dots,$$
and if $g(a) = b_0$ is inside the interval of convergence of $f(y)$, then if in the expansion for $f(y)$ we substitute $y$ with the series of $g(x)$, the result will converge to $f \circ g(x)$ for all $x$ in some small neighbourhood of $a$. However, if the series of $f(y)$ converges everywhere, the resulting series will converge for all values of $x$ for which the series of $g(x)$ converges. (Warning: When we substitute $y$ with $g(x)$ we get $f(g(x)) = a_0 + a_1 g(x) + a_2(g(x))^2 + \dots$, which is generally not a power series, but rather an expression from which we can extract the required power series by some clever manipulation of terms. So once this expression has been obtained there is still some work to be done; see example 1.4.10.)
• The derivative (integral) of a function represented by a Taylor expansion
can be found by term by term differentiation (integration) of the series
for all values of x within the region of convergence.

Example 1.4.9. Find the Maclaurin series expansions for
(a) $\sin x \cdot \tanh^{-1}x$, and
(b) $\tan x = \frac{\sin x}{\cos x}$.

Solution: We showed earlier that
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots \quad\text{and}\quad \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \dots \quad\text{for all } x,$$
and that
$$\tanh^{-1}x = x + \frac{1}{3}x^3 + \frac{1}{5}x^5 + \dots \quad\text{for all } -1 < x < 1.$$
(We did not show that the Maclaurin series of $\tanh^{-1}$ actually converges to $\tanh^{-1}$ when $-1 < x < 1$, but this is indeed the case.)
(a) Using the above expansions we conclude that
$$\begin{aligned}
\sin x \cdot \tanh^{-1}x &= \left(x - \frac{x^3}{6} + \frac{x^5}{120} - \dots\right)\left(x + \frac{1}{3}x^3 + \frac{1}{5}x^5 + \dots\right)\\
&= x^2 + \left(\frac{1}{3} - \frac{1}{6}\right)x^4 + \left(\frac{1}{5} + \frac{1}{120} - \frac{1}{18}\right)x^6 + \dots\\
&= x^2 + \frac{1}{6}x^4 + \frac{11}{72}x^6 + \dots
\end{aligned}$$
for all $-1 < x < 1$.
(b) Again using the known series expansions, we get
$$\tan x = \frac{\sin x}{\cos x} = \frac{x - \frac{x^3}{6} + \frac{x^5}{120} - \dots}{1 - \frac{x^2}{2} + \frac{x^4}{24} + \dots}.$$
The required series may be obtained from the above quotient by long division, but it is usually simpler to express it as
$$\frac{x - \frac{x^3}{6} + \frac{x^5}{120} - \dots}{1 - \frac{x^2}{2} + \frac{x^4}{24} + \dots} = a_0 + a_1 x + a_2 x^2 + \dots$$
For this to hold we must have
$$\begin{aligned}
x - \frac{x^3}{6} + \frac{x^5}{120} - \dots &= \left(a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + \dots\right)\left(1 - \frac{x^2}{2} + \frac{x^4}{24} + \dots\right)\\
&= a_0 + a_1 x + \left(-\frac{a_0}{2} + a_2\right)x^2 + \left(-\frac{a_1}{2} + a_3\right)x^3 + \left(\frac{a_0}{24} - \frac{a_2}{2} + a_4\right)x^4 + \dots
\end{aligned}$$
Thus
$$\begin{aligned}
a_0 &= 0\\
a_1 &= 1\\
-\frac{a_0}{2} + a_2 &= 0\\
-\frac{a_1}{2} + a_3 &= -\frac{1}{6}\\
\frac{a_0}{24} - \frac{a_2}{2} + a_4 &= 0\\
\frac{a_1}{24} - \frac{a_3}{2} + a_5 &= \frac{1}{120}\\
&\ \vdots
\end{aligned}$$
Solving for the $a_n$'s we get
$$\tan x = \frac{\sin x}{\cos x} = x + \frac{1}{3}x^3 + \frac{2}{15}x^5 + \dots$$
But since we cannot divide by zero, and since $\cos x$ will have some zeros on any interval larger than $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$, this technique will only make sense if we stay inside $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$. □
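The coefficient matching above can be confirmed with a computer algebra system; a one-line check assuming the sympy library follows (our addition).

```python
import sympy as sp

# Dividing the sin and cos series should reproduce the tan expansion.
x = sp.symbols('x')
print(sp.series(sp.sin(x) / sp.cos(x), x, 0, 7))
# -> x + x**3/3 + 2*x**5/15 + O(x**7)
```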

Example 1.4.10. What is the Maclaurin expansion of $e^{\sinh x}$? Find the series up to the $x^5$ term.

Solution: Since the Maclaurin series for $e^x$ converges to $e^x$ for all $x$, we have
$$f(y) = e^y = 1 + y + \frac{y^2}{2!} + \frac{y^3}{3!} + \dots \quad\text{for all } y.$$
Applying this same expansion to the definition of $\sinh$ shows that
$$g(x) = \sinh x = \frac{1}{2}\left(e^x - e^{-x}\right) = \frac{1}{2}\left[\left(1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots\right) - \left(1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \dots\right)\right] = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \dots$$

for all values of $x$. Hence, the series obtained by substituting every $y$ with
$$\left(x + \frac{x^3}{3!} + \frac{x^5}{5!} + \dots\right)$$
will converge to $e^{\sinh x}$ for all values of $x$. Doing the substitution, we obtain
$$e^{\sinh x} = 1 + \left(x + \frac{x^3}{3!} + \frac{x^5}{5!} + \dots\right) + \frac{1}{2!}\left(x + \frac{x^3}{3!} + \dots\right)^2 + \frac{1}{3!}\left(x + \frac{x^3}{3!} + \dots\right)^3 + \frac{1}{4!}\left(x + \frac{x^3}{3!} + \dots\right)^4 + \frac{1}{5!}\left(x + \dots\right)^5 + \dots$$
Notice that the "tail" of this expansion, namely
$$\frac{1}{6!}\left(x + \frac{x^3}{3!} + \dots\right)^6 + \frac{1}{7!}\left(x + \frac{x^3}{3!} + \dots\right)^7 + \dots,$$
contains no $x^5$ terms, but only terms of order 6 and higher. Hence to get a series which is accurate up to the $x^5$ term, we only need to go up to the $\frac{1}{5!}\left(x + \frac{x^3}{3!} + \dots\right)^5$ term in the above expansion. Expanding and simplifying this expression now yields
$$\begin{aligned}
e^{\sinh x} &= 1 + \left[x + \frac{x^3}{6} + \frac{x^5}{120} + \dots\right] + \left[\frac{x^2}{2} + \frac{x^4}{6} + \frac{x^6}{45} + \dots\right] + \left[\frac{x^3}{6} + \frac{x^5}{12} + \dots\right] + \left[\frac{x^4}{24} + \frac{x^6}{36} + \dots\right] + \left[\frac{x^5}{120} + \dots\right] + \dots\\
&= 1 + x + \frac{1}{2}x^2 + \frac{1}{3}x^3 + \frac{5}{24}x^4 + \frac{1}{10}x^5 + \dots
\end{aligned}$$
for all x. □
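A symbolic cross-check of this substitution, assuming the sympy library (our sketch, not part of the original text):

```python
import sympy as sp

# Composition of the exp and sinh series; sympy should return
# 1 + x + x**2/2 + x**3/3 + 5*x**4/24 + x**5/10 + O(x**6).
x = sp.symbols('x')
print(sp.series(sp.exp(sp.sinh(x)), x, 0, 6))
```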
Example 1.4.11. Find the Maclaurin expansion of $\frac{1}{1-x^2}$, using the expansion of $\tanh^{-1}x$ derived in example 1.4.3.
Solution: Given that
$$\tanh^{-1}x = x + \frac{1}{3}x^3 + \frac{1}{5}x^5 + \dots \quad\text{for all } -1 < x < 1,$$
we may differentiate term for term to get
$$\frac{1}{1-x^2} = \frac{d}{dx}\tanh^{-1}x = 1 + x^2 + x^4 + \dots$$
This expansion could also have been obtained either by means of long division, or by substituting $x$ with $x^2$ in the expansion
$$\frac{1}{1-x} = 1 + x + x^2 + \dots \quad\text{for all } -1 < x < 1.$$

Example 1.4.12. Find a series for $\tan^{-1}x$ by using the series
$$\frac{1}{1-x^2} = 1 + x^2 + x^4 + \dots \quad\text{for all } -1 < x < 1.$$
Solution: Noting that
$$\frac{d}{dx}\tan^{-1}x = \frac{1}{1+x^2},$$
we first substitute $x^2$ with $-x^2$ in
$$1 + (x^2) + (x^2)^2 + (x^2)^3 + \dots = \frac{1}{1-x^2} = 1 + x^2 + x^4 + x^6 + \dots$$
to get
$$\frac{1}{1+x^2} = 1 + (-x^2) + (-x^2)^2 + (-x^2)^3 + \dots = 1 - x^2 + x^4 - x^6 + \dots$$
Hence
$$\frac{d}{dx}\tan^{-1}x = \frac{1}{1+x^2} = 1 - x^2 + x^4 - \dots$$
Integrating term for term, we get
$$\tan^{-1}x = \left(x - \frac{x^3}{3} + \frac{x^5}{5} - \dots\right) + c$$
($c$ is the constant of integration). Since $\tan^{-1}0 = 0$, it follows that $c = 0$, and hence that
$$\tan^{-1}x = x - \frac{x^3}{3} + \frac{x^5}{5} - \dots$$
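Term-by-term integration is easily verified symbolically; the minimal sketch below (our addition, assuming sympy) integrates the truncated series and compares with the library's atan expansion.

```python
import sympy as sp

# Term-by-term integration of 1 - x**2 + x**4 gives the first terms
# of the atan series, in agreement with sympy's own expansion.
x = sp.symbols('x')
print(sp.integrate(1 - x**2 + x**4, x))   # x - x**3/3 + x**5/5
print(sp.series(sp.atan(x), x, 0, 8))
```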


EXERCISE 1.4.
1. The Maclaurin expansion of $\ln(1+x)$ was shown to be
$$\ln(1+x) = x - \frac{1}{2}x^2 + \frac{1}{3}x^3 - \frac{1}{4}x^4 + \frac{1}{5}x^5 - \frac{1}{6}x^6 + \dots$$
in example 1.4.3. Find its region of convergence.
[$-1 < x \le 1$]
2. Obtain the general term and the interval of convergence for the Maclaurin expansion of $y = \sin^2 x$.
[$y = \frac{2x^2}{2!} - \frac{2^3x^4}{4!} + \frac{2^5x^6}{6!} - \dots + (-1)^{n-1}\frac{2^{2n-1}}{(2n)!}x^{2n} + \dots$, for all $x$]
3. In each case obtain the first three nonzero terms of the Maclaurin expansion of $y$:
(a) $y = \tan x$
(b) $y = \sin x^2$
(c) $ye^y = 3x$
[$y = x + \frac{x^3}{3} + \frac{2x^5}{15} + \dots$; $\;y = x^2 - \frac{x^6}{3!} + \frac{x^{10}}{5!} - \dots$; $\;y = 3x - 9x^2 + \frac{81}{2}x^3 + \dots$]
4. Expand $\frac{1-x}{1+x}$ as a power series in $x$, and determine its region of convergence.
[$1 - 2\left(x - x^2 + x^3 - x^4 + \dots\right)$, $\;|x| < 1$]
5. Expand $\sqrt{x}$ in powers of $(x-2)$.
[$\sqrt{2}\left(1 + \frac{x-2}{4} - \frac{(x-2)^2}{32} + \frac{(x-2)^3}{128} - \dots\right)$]
6. Obtain the first four terms of the expansion of $x$ around $t = 1$ if $x(1)$ is real and $x^3 + 2xt - 3 = 0$.
[$x = 1 - \frac{2(t-1)}{5} + \frac{8(t-1)^2}{5^3} + \frac{56(t-1)^3}{5^5}$]
7. Obtain the first four terms of the expansion of $y$ around $x = 1$ if $y'' + y' + xy = 0$ and $y(1) = 0$, $y'(1) = 1$.
[$(x-1) - \frac{(x-1)^2}{2} - \frac{(x-1)^4}{24} + \frac{(x-1)^5}{30} + \dots$]
8. Prove that the Maclaurin expansions of $\cos x$ and $\sinh x$ actually do respectively converge to these functions for all values of $x$.
9. Using the binomial expansion of $\left(1-x^2\right)^{-\frac{1}{2}}$, obtain the Maclaurin expansion of $\sin^{-1}x$ given that
$$\frac{d}{dx}\sin^{-1}x = \frac{1}{\sqrt{1-x^2}}.$$
Also find its region of convergence.
[$x + \frac{1}{2}\cdot\frac{x^3}{3} + \frac{1\cdot 3}{2\cdot 4}\cdot\frac{x^5}{5} + \frac{1\cdot 3\cdot 5}{2\cdot 4\cdot 6}\cdot\frac{x^7}{7} + \dots$, $\;-1 < x < 1$]
10. Determine
$$\int_0^1 \frac{\sin x}{x}\,dx$$
by using the Maclaurin expansion of $\sin x$ up to the term in $x^7$. Also estimate the maximum error.
[0,946082766...; 3,06 × 10⁻⁷]
11. Determine
$$\lim_{x\to 0}\frac{e^x\sin 2x - 2xe^{-x^2}}{\tan^2 x}$$
using Maclaurin series expansions of the functions instead of L'Hôpital's rule.
[2]
MODULE 2

COMPLEX ANALYSIS

UNIT 1: REVISION OF COMPLEX NUMBERS


2.1.1. OBJECTIVE. To revise the work previously done on complex num-
bers in preparation for the work on complex analysis.
2.1.2. OUTCOMES. This unit is mostly revision. At the end of this unit the student should be familiar with the concept of a complex number, know how to perform basic operations with complex numbers, and know and be able to manipulate the definitions of $\mathrm{Arg}(z)$, $\mathrm{arg}(z)$, $e^z$, $\ln(z)$, and $\mathrm{Ln}(z)$.
2.1.3. REVISION. Before commencing with the unit on complex analysis,
it is necessary to revise the work done on complex numbers in the Mathematics I
course.
2.1.3.1. Definitions. To start off with, we postulate the existence of an imaginary unit $j$ with the property that $j^2 = -1$. In principle this imaginary unit will for us play the role of $\sqrt{-1}$. By the term complex number we then mean a number of the form
$$z = a + jb$$
where a is the so-called real part of our complex number z, and b the imaginary
part. We will sometimes write <(z) and =(z) for a and b respectively. Complex
numbers expressed in this form are said to be in Rectangular or Cartesian form.
Note: Many texts denote the imaginary unit by i instead of j. However in the
fields of engineering and physics, the notation j is more common. Hence here too
we will adopt the convention of denoting the imaginary unit by j rather than i.
Since all complex numbers have both a real and imaginary part, the set of all
complex numbers C, is therefore a two-dimensional object. In fact if we identify
the set of all complex numbers with the two-dimensional plane, then on this plane our complex number $z = a + jb$ may be represented by the point $(a, b)$ on the circumference of the circle with radius $r = \sqrt{a^2+b^2}$ and centre $0$ as shown in figure 2.1. For non-zero complex numbers, $\theta$ is the angle that the ray from $0$ to $(a, b)$ makes with the positive real axis.

Figure 2.1


The quantity $r = \sqrt{a^2+b^2}$ is referred to as the so-called modulus of the complex
number and is denoted by |z|. Geometrically |z| is really just the distance from
(a, b) to the origin. Now since a = r cos θ, b = r sin θ, it is clear that z = a + jb may
be written in the equivalent form
z = r(cos θ + j sin θ).
We refer to this form as the so-called polar form of the complex number. It is
clear from the above equation that the value of r and θ will uniquely determine our
complex number z. For this reason we often employ the shorthand notations
z = r(cos θ + j sin θ) = r∠θ
and
cis(θ) = (cos θ + j sin θ)
for the polar form.
The angle θ is referred to as the argument of the complex number. However
a word of warning is in order here: since for any integer k we have cos(θ) =
cos(θ + 2kπ) and sin(θ) = sin(θ + 2kπ), this angle is not unique! (Here θ is of course
in radians!) In fact for any k = 0, ±1, ±2, . . . we will have
z = r∠θ = r∠(θ + 2kπ).
So in principle we could’ve used any one of θ + 2kπ as the argument of z. That
specific angle θ which lies in the interval −π < θ ≤ π, is referred to as the principal
value of the argument, and is written Arg(z). For the set θ + 2kπ we write arg(z) =
θ + 2kπ.
For any $z \ne 0$, the principal value of the argument may be computed by
$$\mathrm{Arg}(z) = \begin{cases} \tan^{-1}\left(\frac{b}{a}\right) & \text{if } a > 0\\[2pt] \tan^{-1}\left(\frac{b}{a}\right) + \pi & \text{if } a < 0,\ b \ge 0\\[2pt] \tan^{-1}\left(\frac{b}{a}\right) - \pi & \text{if } a < 0,\ b < 0\\[2pt] \frac{\pi}{2} & \text{if } a = 0,\ b > 0\\[2pt] -\frac{\pi}{2} & \text{if } a = 0,\ b < 0 \end{cases}$$
(The basic idea behind the above formula is that $\tan^{-1}\left(\frac{b}{a}\right)$ gives one a reference angle which in some cases may have to be shifted by $\pi$ radians to make sure we end up in the correct quadrant.)
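This case analysis is precisely what the two-argument arctangent of most programming languages implements; the sketch below (ours, with a hypothetical function name) checks the formula against Python's math.atan2.

```python
import math

def principal_arg(a, b):
    """Arg(a + jb) following the case analysis above (illustrative)."""
    if a > 0:
        return math.atan(b / a)
    if a < 0 and b >= 0:
        return math.atan(b / a) + math.pi
    if a < 0 and b < 0:
        return math.atan(b / a) - math.pi
    if b > 0:
        return math.pi / 2
    if b < 0:
        return -math.pi / 2
    raise ValueError("Arg(0) is undefined")

for a, b in [(1.0, 1.0), (-2.0, 2.0), (-1.0, -1.0), (0.0, 3.0)]:
    print(principal_arg(a, b), math.atan2(b, a))   # the pairs agree
```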
Using the standard trigonometric identities and the fact that j 2 = −1, it is
possible to prove that
(cos θ + j sin θ)n = (cos(nθ) + j sin(nθ)) for all n ∈ Z.
This formula is known as de Moivre’s theorem. From the above it is clear that
the quantity (cos θ + j sin θ) has certain exponential properties. With de Moivre’s
theorem as motivation, we may therefore make the following definition (known as
Euler’s equation):
ejθ = (cos θ + j sin θ).
Using this definition, the polar form z = r∠θ of our complex number z may be
rewritten as
z = rejθ .
When written in this form, we say that z is in exponential form.
Using Euler’s equation as our starting point we may now define complex ver-
sions of both the exponential and logarithmic functions. Given z = a + jb we define
$e^z$ by
$$e^z = e^a\cdot e^{jb} = e^a\left(\cos(b) + j\sin(b)\right).$$
(Here $b$ is taken to be in radians.) Now suppose that in exponential form $z = re^{j\theta}$. For any $w \in \mathbb{C}$ which is of the form $w = \ln(r) + j(\theta + 2k\pi)$ for some integer $k$, it
is easy to see that $e^w = e^{\ln(r)}\cdot e^{j(\theta+2k\pi)} = re^{j\theta} = z$. Noting that here $r = |z|$, it therefore makes sense to define
$$\ln(z) = \ln|z| + j(\theta + 2k\pi) = \ln|z| + j\,\mathrm{arg}(z). \tag{2.1.1}$$
If in the above definition we restrict ourselves to values of the argument which lie
in the interval (−π, π], we get what is called the Principal Logarithm (denoted by
Ln). Specifically for any z 6= 0, we let
Ln(z) = ln |z| + jArg(z)
where Arg(z) is that value of θ + 2kπ for which −π < θ + 2kπ ≤ π.
Finally, for any complex number $z$ we have the so-called complex conjugate of $z$ (denoted by $\bar{z}$). If $z = a + jb$, then $\bar{z}$ is defined to be
$$\bar{z} = a - jb.$$
So in principle the process of passing from $z$ to $\bar{z}$ is just reflection across the real axis. The process of complex conjugation is closely related to the modulus in that
$$z\bar{z} = a^2 + b^2 = |z|^2.$$
In exponential form we have that
$$\bar{z} = re^{-j\theta} \quad\text{whenever } z = re^{j\theta}.$$
We may further obtain the formulas
$$z + \bar{z} = 2a = 2\Re(z)$$
and
$$z - \bar{z} = j2b = j2\Im(z).$$
2.1.3.2. Operations with complex numbers. Given complex numbers z1 and z2
with
z1 = a + jb = r1 ∠θ1 and z2 = c + jd = r2 ∠θ2 ,
the following will apply:
• Equality: We say that z1 = z2 , if and only if a = c and b = d.
• Addition: z1 + z2 = (a + c) + j (b + d)
• Multiplication: Expressed in cartesian form we have
z1 z2 = (ac − bd) + j(ad + bc)
and in exponential form
z1 z2 = r1 ejθ1 · r2 ejθ2 = (r1 · r2 )ej(θ1 +θ2 ) .
• Division: In Cartesian form we have that
$$\frac{z_1}{z_2} = \frac{a+jb}{c+jd} = \frac{a+jb}{c+jd}\cdot\frac{c-jd}{c-jd} = \frac{(ac+bd) + j(bc-ad)}{c^2+d^2}$$
and in exponential form
$$\frac{z_1}{z_2} = \frac{r_1 e^{j\theta_1}}{r_2 e^{j\theta_2}} = \frac{r_1}{r_2}\,e^{j(\theta_1-\theta_2)}.$$
• $n$th roots: Given a positive integer $n$, and a complex number $z = r\angle\theta$ in polar form, the $n$th roots of $z$ are defined to be the collection of all those complex numbers $w$ for which $w^n = z$. These $n$th roots may be computed by means of the following formula:
$$z^{\frac{1}{n}} = r^{\frac{1}{n}}\angle\left(\frac{\theta + 2k\pi}{n}\right)$$
where $k = 0, 1, 2, \dots, n-1$.

Example 2.1.1. Given that
$$z_1 = 2 + j,\qquad z_2 = 3 - j2,\qquad z_3 = -\frac{1}{2} + j\frac{\sqrt{3}}{2},$$
evaluate each of the following:
(a) $|2z_1 - 3z_2|$
(b) $(\bar{z}_3)^4$
Solution:
(a)
$$|2z_1 - 3z_2| = |2(2+j) - 3(3-j2)| = |-5 + j8| = \sqrt{25 + 64} = \sqrt{89}$$
(b)
$$(\bar{z}_3)^4 = \left[-\frac{1}{2} - j\frac{\sqrt{3}}{2}\right]^4 = \left[1\angle\left(-\frac{2\pi}{3}\right)\right]^4 = 1\angle\left(-\frac{8\pi}{3}\right) = 1\left[\cos\left(-\frac{8\pi}{3}\right) + j\sin\left(-\frac{8\pi}{3}\right)\right] = -\frac{1}{2} - j\frac{\sqrt{3}}{2}$$
(To write $-\frac{1}{2} - j\frac{\sqrt{3}}{2}$ in polar form we first note that $\left|-\frac{1}{2} - j\frac{\sqrt{3}}{2}\right| = \sqrt{\left(\frac{1}{2}\right)^2 + \left(\frac{\sqrt{3}}{2}\right)^2} = 1$. With $a = -\frac{1}{2}$ and $b = -\frac{\sqrt{3}}{2}$, we obtain the reference angle $\tan^{-1}\left(\frac{b}{a}\right) = \tan^{-1}\left(\sqrt{3}\right) = \frac{\pi}{3}$ radians. However $-\frac{1}{2} - j\frac{\sqrt{3}}{2}$ is in the third quadrant. Hence to get its argument, we need to rotate $\frac{\pi}{3}$ (clockwise) by $\pi$ radians to get $\theta = \frac{\pi}{3} - \pi = -\frac{2\pi}{3}$.)


Example 2.1.2. Determine the real numbers a and b if 3a+2jb−ja+5b = 7+j5.


Solution: The above equality may be rewritten as
(3a + 5b) + j (2b − a) = 7 + j5,
from which it follows that


(2.1.2) 3a + 5b = 7
and
(2.1.3) 2b − a = 5.
From equation 2.1.3 we have that
a = 2b − 5.
Substituting into equation 2.1.2 will now yield 3(2b − 5) + 5b = 7. Solving for b
from this equation yields b = 2. Therefore
a = 2b − 5 = 2(2) − 5 = −1.

Example 2.1.3. Determine the following and give the answer in Cartesian (rectangular) form:
(a) $(3+j4)^3$
(b) $(3+j4)^{\frac{1}{3}}$
Solution:
(a)
$$\begin{aligned}
(3+j4)^3 &= (5\angle 0{,}927295218\dots)^3\\
&= (5)^3\angle\left[(3)(0{,}927295218\dots)\right]\\
&= 125\angle 2{,}781885654\dots\\
&= 125\left(\cos(2{,}781885654\dots) + j\sin(2{,}781885654\dots)\right)\\
&= -117 + j44
\end{aligned}$$
(b) Since
$$(3+j4) = 5\angle 0{,}927295218\dots$$
we have
$$(3+j4)^{\frac{1}{3}} = (5)^{\frac{1}{3}}\angle\left[\frac{1}{3}\left(0{,}927295218\dots + 2k\pi\right)\right]$$
for $k = 0, 1, 2$. So the roots are
$$(5)^{\frac{1}{3}}\angle 0{,}309\dots,\qquad (5)^{\frac{1}{3}}\angle 2{,}403\dots,\qquad (5)^{\frac{1}{3}}\angle 4{,}498\dots$$
Expressed in Cartesian form (accurate up to three decimal places), the roots are
(1, 629 + j0, 520) , (−1, 265 + j1, 151) and (−0, 364 − j1, 671) .
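The three roots can be reproduced with Python's cmath module; the check below is our own illustration.

```python
import cmath

# Cube roots of 3 + 4j via r**(1/3) * exp(j(theta + 2k*pi)/3), k = 0, 1, 2.
z = 3 + 4j
r, theta = cmath.polar(z)
for k in range(3):
    w = r**(1/3) * cmath.exp(1j * (theta + 2 * cmath.pi * k) / 3)
    print(w, w**3)   # each w**3 returns (3+4j) up to rounding
```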

Example 2.1.4. Determine all the values of $\ln(-2 + j2)$ and then give its principal value.
Solution: In radians the line segment from $0$ to $-2 + j2$ lies at an angle of $3\pi/4$ to the positive real axis. Thus clearly $\mathrm{arg}(-2+j2) = 3\pi/4 + 2k\pi$. In addition $|-2+j2| = \sqrt{8}$. So
$$\ln(-2+2j) = \ln\sqrt{8} + j\left(3\pi/4 + 2k\pi\right) \qquad k = 0, \pm 1, \pm 2, \dots$$
To obtain the principal value, we must select $k$ so that $-\pi < 3\pi/4 + 2k\pi \le \pi$. Thus the principal value is
$$\mathrm{Ln}(-2+2j) = \ln\sqrt{8} + j\left(3\pi/4\right).$$
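Python's cmath.log returns exactly this principal value, which gives a quick check (our addition):

```python
import cmath, math

z = -2 + 2j
print(cmath.log(z))                              # Ln(z)
print(math.log(math.sqrt(8)), 3 * math.pi / 4)   # ln sqrt(8) and 3*pi/4
```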

2.1.4. Engineering applications: electrical circuits. The following material on electrical circuits does not form part of the course per se, but is included
primarily to demonstrate the applicability of the theory of complex numbers. Now
if $i = i(t)$ is the current in the circuit at time $t$, then using well known electrical laws, it is possible to show that
• the voltage drop across a resistor with resistance $R$ is $iR$;
• the voltage drop across an inductor with inductance $L$ is $L\frac{di}{dt}$;
• the voltage drop across a capacitor with capacitance $C$ is $\frac{1}{C}\int_0^t i(s)\,ds$;
• the instantaneous electrical charge (or quantity of electricity) present in the capacitor at any time $t$ is given by $q(t) = \int_0^t i\,ds$. Thus we may say that $i = \frac{dq}{dt}$ and that the voltage drop across a capacitor is given by $\frac{q}{C}$.
Therefore, given a current of the form $i = i_m\sin\omega t$, where $i_m$ is the amplitude (or maximum current), $f$ the frequency and $\omega = 2\pi f$, the voltage across
• a resistor $R$ is $v = i_m R\sin\omega t$ ($v$ is in phase with $i$);
• an inductor $L$ is $v = i_m\omega L\cos(\omega t) = i_m\omega L\sin\left(\omega t + \frac{\pi}{2}\right)$ ($v$ leads $i$ by $\frac{\pi}{2}$ radians);
• a capacitor $C$ is $v = \frac{1}{C}\cdot\frac{-i_m}{\omega}\cos(\omega t) = \frac{i_m}{\omega C}\sin\left(\omega t - \frac{\pi}{2}\right)$ ($v$ lags $i$ by $\frac{\pi}{2}$ radians).
radians.)

Figure 2.2. For a resistor the voltage is in phase with the current.

Figure 2.3. For an inductor the voltage leads the current by π/2 radians.
Figure 2.4. For a capacitor the voltage lags the current by π/2 radians.

Using the fact that $e^{j\theta} = \cos\theta + j\sin\theta$, $v$ may therefore be written as
$$v = \begin{cases} \Im\left\{i_m R\,e^{j\omega t}\right\} & \text{for a resistor}\\ \Im\left\{i_m\omega L\,e^{j(\omega t + \pi/2)}\right\} & \text{for an inductor}\\ \Im\left\{\frac{i_m}{\omega C}\,e^{j(\omega t - \pi/2)}\right\} & \text{for a capacitor} \end{cases}$$
Now notice that $e^{j(\omega t + \pi/2)} = e^{j\pi/2}\cdot e^{j\omega t} = j\,e^{j\omega t}$. Similarly $e^{j(\omega t - \pi/2)} = -j\,e^{j\omega t}$. Using these facts the voltage may be further rewritten as
$$v = \begin{cases} \Im\left\{i_m R\,e^{j\omega t}\right\} & \text{for a resistor}\\ \Im\left\{i_m\,j\omega L\,e^{j\omega t}\right\} & \text{for an inductor}\\ \Im\left\{i_m\frac{-j}{\omega C}\,e^{j\omega t}\right\} & \text{for a capacitor.} \end{cases}$$
Based on the above form of the voltage, we define the so-called complex impedance of the circuit to be
$$Z = \begin{cases} R & \text{for a resistor}\\ j\omega L & \text{for an inductor}\\ \frac{-j}{\omega C} & \text{for a capacitor.} \end{cases}$$
The voltage of the circuit would then be of the form
$$v = \Im\left\{i_m Z\,e^{j\omega t}\right\}.$$


More generally we define the complex impedance to be $Z = R + jX$, where $R$ is the resistance and $X$ the so-called reactance of the circuit due to a capacitor and/or inductor. The reactance $X$ of a circuit is of the form $X = X_L - X_C$ where
$$X_C = \frac{1}{\omega C}\ \text{(reactance of capacitor)},\qquad X_L = \omega L\ \text{(reactance of inductor)}.$$
The actual impedance of the circuit is then given by $|Z| = \sqrt{R^2 + X^2}$. The complex admittance $Y$ of a circuit is defined to be the reciprocal of the impedance, that is $Y = \frac{1}{Z}$. It is a striking fact of the impedance and admittance that if $N$ electrical components are connected in series, the impedance of the composite circuit can be found by simply adding the impedances of each of the components. That is, with the components in series we have $Z = Z_1 + Z_2 + \dots + Z_N$, where $Z$ is the impedance of the composite circuit and $Z_k$ the impedance of the $k$th component. If however the components are connected in parallel, it is the admittances, not the impedances, that combine to yield the admittance of the composite circuit. So if the $N$ components are connected in parallel, we have that $\frac{1}{Z} = \frac{1}{Z_1} + \frac{1}{Z_2} + \dots + \frac{1}{Z_N}$. The quantity $V = i_m Z e^{j\omega t}$ is defined to be the complex voltage. Notice that the actual voltage may be expressed in terms of the complex impedance and complex voltage. In particular
$$v = \Im\left\{i_m Z\,e^{j\omega t}\right\} = \Im\{V\}.$$

Proposition 2.1.5. The actual voltage of a circuit is of the form
$$v = i_m|Z|\sin(\omega t + \phi)$$
where $\phi$ is the so-called phase of the circuit. For a resistor in series with a capacitor and/or inductor, the phase is given by
$$\phi = \tan^{-1}\frac{X}{R}.$$
Proof: If we write $Z$ in exponential form we have
$$Z = |Z|e^{j\phi}$$
where $\phi = \mathrm{Arg}(Z)$. Then
$$v = \Im\left\{i_m Z e^{j\omega t}\right\} = \Im\left\{i_m|Z|e^{j\phi}e^{j\omega t}\right\} = \Im\left\{i_m|Z|e^{j(\omega t+\phi)}\right\} = \Im\left\{i_m|Z|\left(\cos(\omega t+\phi) + j\sin(\omega t+\phi)\right)\right\} = i_m|Z|\sin(\omega t+\phi).$$
Now since $R \ge 0$, $Z = R + jX$ is in the first or fourth quadrant. Thus if $R > 0$, the argument of $Z$ will be of the form $\phi = \mathrm{Arg}(Z) = \tan^{-1}\frac{X}{R}$. □

Figure 2.5.

For a resistor, inductor and capacitor in series (figure 2.5) the complex impedance will be
$$Z = R + j(X_L - X_C) = R + j\left(\omega L - \frac{1}{\omega C}\right)$$
and the actual impedance
$$|Z| = \sqrt{R^2 + \left(\omega L - \frac{1}{\omega C}\right)^2}.$$
From this it can clearly be seen that in this case the actual impedance $|Z|$ of the circuit depends on $\omega$, and that it will reach a minimum when $\omega L - \frac{1}{\omega C} = 0$, i.e. when $\omega = \frac{1}{\sqrt{LC}}$. Thus, the impedance of the circuit will be a minimum when the frequency $f = \frac{\omega}{2\pi}$ has the value $f = \frac{1}{2\pi\sqrt{LC}}$. (See figure 2.6.)
Figure 2.6
For such a circuit the phase is given by
$$\phi = \tan^{-1}\left(\frac{\omega L - \frac{1}{\omega C}}{R}\right).$$
Example 2.1.6. Calculate the impedance and phase of the element shown in
figure 2.7 if an alternating current of frequency 50 Hz flows. Here R = 15Ω and
L = 41, 3 mH.

Figure 2.7.

Solution: In the given circuit there is no capacitor. So
$$X = X_L = \omega L = 2\pi f L = 2\pi(50)\left(41{,}3\times 10^{-3}\right) = 4{,}13\pi\ \Omega.$$
Therefore the complex impedance is
$$Z = R + jX_L = (15 + j4{,}13\pi)\ \Omega$$
and the actual impedance
$$|Z| = \sqrt{(15)^2 + (4{,}13\pi)^2} = 19{,}8329\dots\ \Omega.$$
The phase is given by
$$\phi = \tan^{-1}\frac{X}{R} = \tan^{-1}\frac{4{,}13\pi}{15} = 0{,}7131\dots\ \text{radians}.$$

Example 2.1.7. Express the impedance of the following circuits at a frequency
of 50 Hz in rectangular as well as polar form. Also calculate the amplitude of the
current in each case, as well as the phase of each current relative to the applied
voltage, if the applied voltage is 230 V at 50 Hz.
(a) A resistance of 20Ω in series with an inductance of 0, 1 H.
(b) A resistance of 50Ω in series with a capacitance of 40µF.
(c) Circuits (a) and (b) in series.

Solution:
(a) Here $X = X_L = \omega L = 2\pi f L$. So
$$Z = R + jX = 20 + j\left[2\pi(50)(0{,}1)\right] = (20 + j10\pi)\ \Omega = 37{,}24191778\dots\angle 1{,}0038848\dots\ \Omega$$
Since $V = i_m Z e^{j\omega t}$, we have $|V| = i_m|Z|$. Given that $Z = |Z|e^{j\phi}$, we have
$$i_m e^{-j\phi} = \frac{|V|}{Z} = \frac{230\angle 0}{37{,}24191778\dots\angle 1{,}0038848\dots} = 6{,}1758\dots\angle(-1{,}0038848\dots)\ \text{A}$$
Thus, the amplitude of the current is $6{,}1758\dots$ A, lagging the voltage by $1{,}0038848\dots$ radians.
(b) Here $X = -X_C = \frac{-1}{\omega C} = \frac{-1}{2\pi f C}$. So
$$Z = R + jX = 50 - \frac{j}{2\pi(50)\left(40\times 10^{-6}\right)} = \left(50 - j\frac{250}{\pi}\right)\Omega = 93{,}98177\dots\angle(-1{,}00981\dots)\ \Omega.$$
$$i_m e^{-j\phi} = \frac{|V|}{Z} = \frac{230\angle 0}{93{,}98177\dots\angle(-1{,}00981\dots)} = 2{,}44728\dots\angle 1{,}00981\dots\ \text{A}$$
Thus the amplitude of the current is $2{,}44728\dots$ A, leading the voltage by $1{,}00981\dots$ radians.
(c) Here
$$Z = (20 + j10\pi) + \left(50 - j\frac{250}{\pi}\right) = (70 - j48{,}161545\dots)\ \Omega = 84{,}96784\dots\angle(-0{,}60264\dots)\ \Omega$$
$$i = \frac{230\angle 0}{84{,}96784\dots\angle(-0{,}60264\dots)} = 2{,}706906\dots\angle 0{,}60264\dots\ \text{A}$$
Thus, the current is 2, 706906 . . . A, leading the voltage by 0, 60264 . . . radians. □
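The three parts of this example collapse into a few lines of complex arithmetic; the sketch below is our own recomputation with the same component values.

```python
import cmath, math

f = 50.0
w = 2 * math.pi * f
Za = 20 + 1j * w * 0.1              # (a) resistor + inductor
Zb = 50 - 1j / (w * 40e-6)          # (b) resistor + capacitor
for name, Z in [("(a)", Za), ("(b)", Zb), ("(c)", Za + Zb)]:
    i = 230 / Z                     # complex current for the 230 V supply
    # amplitude of i and its phase relative to the voltage:
    print(name, abs(Z), abs(i), cmath.phase(i))
```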

EXERCISE 2.1.
1. Simplify the following and give the answer in exponential form
$$\frac{(2+j4)^4}{(1-2j)^6} + \frac{(5-j)^2}{(-1+j)^4}$$
2. Determine all the values of $\ln(-2+4j)$ and give only the principal value in rectangular form.
3. Determine the values of $a$ and $b$ if
(a) $a(2+j) + b(1-j) = 1{,}5 + 0{,}5j$
(b) $(a+2bj)(2-j) - (1-j) = -5 + 5j$
4. If $u = 1 - j2$ and $v = 1 + j$, determine
(a) $\ln(2v)$
(b) $\frac{v^6}{u}$
(c) $u - 2v$
(d) $u^{\frac{1}{3}}$
(e) $2uv$
(f) $j^5 - u^j$
5. Simplify the following and write the answers in the $a + jb$ form
(a) $j^7 + j^{-1} + (-4)^{\frac{1}{2}}$
(b) $\frac{\left(1 - j\sqrt{3}\right)\left(2\angle(\pi/6)\right)}{(-2+2j)^4}$
(c) $(1+2j)^{\frac{1}{3}}$. Give all the roots.
(d) $e^{j\pi} + \sqrt{2}\,\mathrm{cis}\,\frac{\pi}{4}$
(e) $\frac{1}{(1+j)^3}$
6. Determine the values of $a$ and $b$ if
(a) $(a-4j)(-1+2j) = 2 + j - bj$
(b) $\frac{aj}{j-1} = 2 + 3bj$
7. Calculate the complex impedance for the circuit shown in figure 2.8 when an alternating current of 50 Hz flows.

Figure 2.8.
UNIT 2: COMPLEX FUNCTIONS AND MAPPINGS


2.2.1. OBJECTIVE. To introduce complex functions and show how they
may be realised as mappings from the z–plane into the w–plane.

2.2.2. OUTCOMES. At the end of this unit the student should


• Be familiar with the definitions of elementary complex functions;
• Be able to interpret complex functions as coordinate transformations of the complex plane;
• Be familiar with the concept of modulus and know how it is used to
describe distance;
• Know how the equation of a circle may be written in terms of the modulus;
• Know what effect transformations defined in terms of the elementary op-
erations with complex numbers will have on regions in the complex plane;
• Be able to apply this knowledge to compute the images of simple regions
under such transformations.

2.2.3. INTRODUCTION. The concept of a function involves two sets X


and Y and a rule f that assigns to each x ∈ X exactly one y ∈ Y. We say that f is
a function mapping the elements in X onto elements of Y , and write
f :X→Y
or
y = f (x) , x ∈ X, y ∈ Y
Here x is the so-called independent variable and y the dependent variable. In
the case of a complex variable
z = a + jb,
the dependent variable w = f (z) is usually also complex. Here it is not possible
to graphically represent z and f (z) on one system of axes, as we will need a set of
axes with four degrees of freedom for such a task (two degrees of freedom for the
real and imaginary parts of z, and another two for the real and imaginary parts
of w = f (z)). The plane containing the independent variable z, will be called the
z–plane and the plane containing the dependent variable w = f (z) the w–plane.
Consider the following example.
$$w = f(z) = z^2$$
With $z = x + jy$ (where $x, y \in \mathbb{R}$), we have
$$w = (x+jy)^2 = \left(x^2 - y^2\right) + j2xy.$$
Thus if we write $w = u + jv$ where $u = \Re(w)$, $v = \Im(w)$, then from
$$u + jv = w = \left(x^2 - y^2\right) + j(2xy)$$
it follows that $u = x^2 - y^2$ and $v = 2xy$. So if we regard $w = z^2$ as a transformation from the $z$-plane to the $w$-plane, then under this mapping the point $P(x, y)$ will map onto the point $P'(u, v)$ where $u = x^2 - y^2$ and $v = 2xy$. For example the point $P(1, 2)$ in the $z$-plane will map onto the point $P'(-3, 4)$ in the $w$-plane as shown in figure 2.9 (note that $u(1,2) = 1^2 - 2^2 = -3$, $v(1,2) = 2(1)(2) = 4$).
More generally, any complex function $w = f(z) = u + jv$ can be regarded as a transformation from the $z$-plane to the $w$-plane which maps points $P(x, y)$ in the $z$-plane (the domain) onto points $P'(u, v)$ in the $w$-plane (the range), where u = <(w), v = =(w).
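The mapping is trivial to experiment with in code; the fragment below (our illustration, with a hypothetical helper name) sends a few $z$-plane points to their images.

```python
# Images of z-plane points under w = z**2, i.e. (x, y) -> (x^2 - y^2, 2xy).
def image_under_square(x, y):
    w = complex(x, y)**2
    return (w.real, w.imag)

print(image_under_square(1, 2))   # (-3.0, 4.0), the point P'(-3, 4) of figure 2.9
print(image_under_square(0, 1))   # (-1.0, 0.0)
```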
Figure 2.9.

2.2.4. SINGLE- AND MULTIPLE-VALUED FUNCTIONS. In the theory of complex functions there are some well-defined concrete processes which
assign more than one possible value of w to each fixed z. Where convenient we
will refer to these processes as multiple-valued functions (alternatively multi-valued
functions). The more standard functions which assign only one value of w = f (z)
to each z, will be referred to as single–valued functions. For example the process
w = z 2 is such a single–valued function.
Now consider the process $w = z^{\frac{1}{2}}$. For each value of $z$, there are two possible values of $w = z^{\frac{1}{2}}$. Hence $w = z^{\frac{1}{2}}$ is multiple-valued (two-valued actually). Specifically, if $z = r\angle\theta$ where $-\pi < \theta \le \pi$, then the possible values of $w = z^{\frac{1}{2}}$ are
$$w_0 = \sqrt{r}\angle\left(\frac{\theta}{2}\right)$$
and
$$w_1 = \sqrt{r}\angle\left(\frac{\theta}{2} + \pi\right).$$
(See the formula for $z^{\frac{1}{n}}$ given earlier.)
Sometimes it is possible to construct a single-valued function from a multi-valued one. In particular, if for a multi-valued function $w = f(z)$ we can find a natural way of selecting, for each $z$, a single value of $w$ from the many that $f$ assigns to $z$, then the result will of course be a single-valued function, which we refer to as a branch of the original multi-valued function. In most cases we can therefore think of a multi-valued function as a collection of such single-valued branches "glued together" in a specific way, where the branch that is most commonly used is referred to as the principal branch. In the case of $w = z^{\frac{1}{2}}$ there are actually infinitely many possible branches. Given any real number $\alpha$, we can define a branch cut for $w = z^{\frac{1}{2}}$ corresponding to $\alpha$, by setting
$$z^{\frac{1}{2}} = \sqrt{r}\angle\left(\frac{\theta}{2}\right)$$
whenever $z = r\angle\theta$ with $\alpha < \theta \le \alpha + 2\pi$. In the specific case where $-\pi < \theta \le \pi$, the values
$$w_0 = \sqrt{r}\angle\left(\frac{\theta}{2}\right) \quad\text{and}\quad w_1 = \sqrt{r}\angle\left(\frac{\theta}{2} + \pi\right)$$
can be shown to correspond to the branch cuts for $\alpha = -\pi$ and $\alpha = \pi$ respectively. For $w = z^{\frac{1}{2}}$ the principal branch is $w_0 = \sqrt{r}\angle\frac{\theta}{2}$ where $-\pi < \theta \le \pi$.
In order to avoid confusion we will in the remainder of this course adopt the
convention that when using the word “function”, we will mean a single–valued
function if the phrase “multi–valued” is not explicitly mentioned.

2.2.5. INVERSE FUNCTIONS. A function $g$ is called the inverse function of a given function $f$ if for every $z$ in the domain of $f$ we have $g \circ f(z) = z$, and for every $z$ in the domain of $g$ we have $f \circ g(z) = z$. Such a function will sometimes be denoted by $f^{-1}$. Please note that this is NOT the same as $\frac{1}{f}$.
Of course not all functions have an inverse in the above sense. For a function $f$ to have such an inverse, then for any two points $z_0, z_1$ in the domain, we must have that $f(z_0) \ne f(z_1)$ whenever $z_0 \ne z_1$.

2.2.6. ELEMENTARY FUNCTIONS. We proceed to present the complex versions of some common elementary functions.
2.2.6.1. Polynomial functions. A polynomial function $P$ is a function which takes points $z$ in the complex plane onto points of the form
$$w = P(z) = a_0 + a_1 z + a_2 z^2 + a_3 z^3 + \dots + a_n z^n$$
where $a_0, a_1, a_2, \dots, a_n$ are constants (complex or real) and $a_n \ne 0$. Here $n$ is a positive integer, and is called the degree of the polynomial. Polynomials of the form
$$w = a_0 + a_1 z$$
will map straight lines in the $z$-plane onto straight lines in the $w$-plane, and for that reason are often called linear transformations.
2.2.6.2. Rational functions. A function of the form
$$w = \frac{P(z)}{Q(z)},$$
where $P(z)$ and $Q(z)$ are polynomials, is called a rational function. The domain of such a function consists of all points $z$ for which $Q(z) \ne 0$. Rational functions of the form
$$w = \frac{az+b}{cz+d}$$
where $ad - bc \ne 0$, are called Möbius functions, or bilinear transformations.
2.2.6.3. Exponential functions. Given $z = x + jy$ (where $x, y \in \mathbb{R}$), we may use Euler's equation $e^{jy} = \cos y + j\sin y$ to define the exponential function to the base $e$ by
$$w = e^z = e^{x+jy} = e^x e^{jy} = e^x(\cos y + j\sin y).$$
The exponential function defined above actually turns out to be periodic (with a period of $2\pi$) along lines parallel to the imaginary axis. More specifically we have the following:
Proposition 2.2.1. For any $z \in \mathbb{C}$ we have that $e^{z+j2k\pi} = e^z$, $k = 0, \pm 1, \pm 2, \dots$.
Proof: For any $z = x + jy$ we may conclude from the periodicity of $\sin$ and $\cos$ that
$$e^{z+j2k\pi} = e^{x+j(y+2k\pi)} = e^x\left(\cos(y+2k\pi) + j\sin(y+2k\pi)\right) = e^x\left(\cos(y) + j\sin(y)\right) = e^z. \qquad\square$$

With this definition in the bag, we can now use it to define exponential functions for more general bases. Specifically, given a real number $a$ with $a > 0$, $a \ne 1$, we now define
$$a^z = e^{z\ln a}.$$
2.2.6.4. Trigonometric functions. Given a real number $x$ we have from Euler's equation that $e^{jx} = \cos x + j\sin x$, and $e^{-jx} = \cos(-x) + j\sin(-x) = \cos x - j\sin x$. Solving for $\cos$ and $\sin$, we have that $\sin x = \frac{e^{jx}-e^{-jx}}{2j}$ and $\cos x = \frac{e^{jx}+e^{-jx}}{2}$. Since we already have a well-established complex version of the exponential function, the complex versions of $\cos$ and $\sin$ may now be defined by simply replacing $x$ with $z$ in the above formulas. In particular we get
$$\sin z = \frac{e^{jz} - e^{-jz}}{2j},\qquad \cos z = \frac{e^{jz} + e^{-jz}}{2}.$$
Using $\cos$ and $\sin$ we may now introduce complex versions of all the other trigonometric functions. We get
$$\tan z = \frac{\sin z}{\cos z} = \frac{1}{j}\cdot\frac{e^{jz}-e^{-jz}}{e^{jz}+e^{-jz}},\qquad \cot z = \frac{\cos z}{\sin z} = \frac{j\left(e^{jz}+e^{-jz}\right)}{e^{jz}-e^{-jz}},$$
$$\sec z = \frac{1}{\cos z} = \frac{2}{e^{jz}+e^{-jz}},\qquad \mathrm{cosec}\,z = \frac{1}{\sin z} = \frac{2j}{e^{jz}-e^{-jz}}.$$
Identities:
It is not difficult to prove that the above complex functions satisfy the following identities:
$$\sin^2 z + \cos^2 z = 1,\qquad \sec^2 z = 1 + \tan^2 z,\qquad \mathrm{cosec}^2 z = 1 + \cot^2 z,$$
$$\sin(-z) = -\sin z,\qquad \cos(-z) = \cos z,\qquad \tan(-z) = -\tan z,$$
$$\sin(z_1 \pm z_2) = \sin z_1\cos z_2 \pm \cos z_1\sin z_2,$$
$$\cos(z_1 \pm z_2) = \cos z_1\cos z_2 \mp \sin z_1\sin z_2,$$
$$\tan(z_1 \pm z_2) = \frac{\tan z_1 \pm \tan z_2}{1 \mp \tan z_1\tan z_2},$$
$$\sin 2z = 2\sin z\cos z,$$
$$\cos 2z = \cos^2 z - \sin^2 z = 2\cos^2 z - 1 = 1 - 2\sin^2 z.$$
2.2.6.5. Hyperbolic functions. The definitions of the real-valued versions may be copied directly to yield
$$\sinh z = \frac{e^z - e^{-z}}{2},\qquad \cosh z = \frac{e^z + e^{-z}}{2},\qquad \tanh z = \frac{e^z - e^{-z}}{e^z + e^{-z}} = \frac{\sinh z}{\cosh z},$$
$$\mathrm{cosech}\,z = \frac{2}{e^z - e^{-z}} = \frac{1}{\sinh z},\qquad \mathrm{sech}\,z = \frac{2}{e^z + e^{-z}} = \frac{1}{\cosh z},\qquad \coth z = \frac{e^z + e^{-z}}{e^z - e^{-z}} = \frac{1}{\tanh z}.$$

Identities:
As in the real-valued case, the following identities pertain for these functions
cosh²z − sinh²z = 1
sech²z = 1 − tanh²z
cosech²z = coth²z − 1
sinh(−z) = −sinh z
cosh(−z) = cosh z
tanh(−z) = −tanh z
sinh(z1 ± z2) = sinh z1 cosh z2 ± cosh z1 sinh z2
cosh(z1 ± z2) = cosh z1 cosh z2 ± sinh z1 sinh z2
tanh(z1 ± z2) = (tanh z1 ± tanh z2)/(1 ± tanh z1 tanh z2)
sinh 2z = 2 sinh z cosh z
cosh 2z = cosh²z + sinh²z = 2cosh²z − 1 = 1 + 2sinh²z
Note: From the definitions given in subsubsections 2.2.6.4 and 2.2.6.5, it can
be seen that
sin jz = j sinh z
cos jz = cosh z
tan jz = j tanh z
sinh jz = j sin z
cosh jz = cos z
tanh jz = j tan z
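These relations can likewise be confirmed numerically (the sample point below is an arbitrary choice):

import cmath

z = 1.1 - 0.4j
print(abs(cmath.sin(1j*z) - 1j*cmath.sinh(z)))   # ~0
print(abs(cmath.cos(1j*z) - cmath.cosh(z)))      # ~0
print(abs(cmath.tan(1j*z) - 1j*cmath.tanh(z)))   # ~0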

2.2.6.6. Logarithmic functions. We saw earlier (see equation 2.1.1) that the
complex version of the natural logarithm may be defined by
ln z = ln |z| + j arg(z).

If in exponential form z = re^{jθ} where −π < θ ≤ π, then of course |z| = r and
arg(z) = θ + 2kπ where k = 0, ±1, ±2, . . .. Thus the formula for ln may then be
rewritten as
ln z = ln |z| + j(θ + 2kπ).
This is clearly a multivalued function. In order to extract a single-valued branch,
we need to restrict the values θ + 2kπ in such a way that there is only one possible
value for each z. Given any real number α, the way to do this is to insist that
k must be chosen in such a way that α < θ + 2kπ ≤ α + 2π. The branch of ln
corresponding to α will then be given by
ln z = ln |z| + j(θ + 2k_z π)
where for each z, k_z is an integer chosen so that α < θ + 2k_z π ≤ α + 2π. There
are therefore clearly infinitely many branches of ln. The branch given by α = −π
corresponds to the case where k = 0 in the above equation, and is the principal
branch (principal value) of ln. Specifically the principal value of ln is defined by
Ln z = ln |z| + jθ = ln |z| + jArg(z)
where θ is as above. Complex logarithms for a real base a (where a > 0 and a 6= 1)
other than e, may be defined by setting
log_a z = (ln z)/(Ln a).
2.2.6.7. Inverse trigonometric functions. With clear definitions of complex log-
arithms in place, we are now well positioned to introduce complex versions of in-
verse trigonometric and inverse hyperbolic functions. We deal with the inverse
trigonometric functions first, leaving the inverse hyperbolic functions for the next
subsection. By formally copying the definitions of the real-valued version we get

sin⁻¹ z = −j ln[jz + (1 − z²)^{1/2}]
cos⁻¹ z = −j ln[z + (z² − 1)^{1/2}]
tan⁻¹ z = (j/2) ln[(j + z)/(j − z)]
Having defined sin⁻¹, cos⁻¹ and tan⁻¹, we now define the balance of the inverse
trigonometric functions by setting cosec⁻¹ z = sin⁻¹(1/z), sec⁻¹ z = cos⁻¹(1/z), and
cot⁻¹ z = tan⁻¹(1/z).
z
2.2.6.8. Inverse hyperbolic functions. The real-valued definitions may be suit-
ably modified to yield
sinh⁻¹ z = ln[z + (z² + 1)^{1/2}]
cosh⁻¹ z = ln[z + (z² − 1)^{1/2}]
tanh⁻¹ z = (1/2) ln[(1 + z)/(1 − z)]
The balance of the inverse hyperbolic functions are defined by setting sech⁻¹ z =
cosh⁻¹(1/z), cosech⁻¹ z = sinh⁻¹(1/z), and coth⁻¹ z = tanh⁻¹(1/z).
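As a quick plausibility check, the formulas above can be compared with the principal-valued inverse functions built into Python's cmath module. For the sample point below the two sides agree; note though that for other points differing branch choices could produce answers differing by the usual multiples of 2πj or a sign.

import cmath

z = 0.3 + 0.8j
print(abs(cmath.asinh(z) - cmath.log(z + cmath.sqrt(z**2 + 1))))          # ~0
print(abs(cmath.atanh(z) - 0.5*cmath.log((1 + z)/(1 - z))))               # ~0
print(abs(cmath.asin(z) - (-1j)*cmath.log(1j*z + cmath.sqrt(1 - z**2))))  # ~0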

2.2.7. MODULUS AND DISTANCE. Let
z1 = x1 + jy1 and z2 = x2 + jy2
be two points in the complex plane. The modulus function | · | defined earlier, has
the following basic properties:
(a) |z1 z2| = |z1| |z2|
(b) |z1/z2| = |z1|/|z2| where z2 ≠ 0.
(c) |z1 + z2| ≤ |z1| + |z2|
(d) |z1 + z2| ≥ |z1| − |z2|
(e) For any z ∈ C, z · z̄ = |z|².
The distance between (x1, y1) and (x2, y2) (corresponding to z1 and z2 respectively) is given by √[(x1 − x2)² + (y1 − y2)²]. But by definition
|z1 − z2| = |(x1 + jy1) − (x2 + jy2)|
= |(x1 − x2) + j(y1 − y2)|
= √[(x1 − x2)² + (y1 − y2)²]
Hence geometrically the modulus |z1 − z2 | is nothing but the distance between z1
and z2 .

Figure 2.10

Given any z, we may therefore think of |z| as the distance of the point z from
the origin.

2.2.8. THE EQUATION OF A CIRCLE.

Figure 2.11

Consider the circle with centre z1 = x1 + jy1 and radius r > 0. This circle is
just the locus of all points z = x + jy which are a constant distance r from the
point z1. Thus since the distance from z to z1 is just |z − z1|, the equation of the
circle with centre z1 and radius r, may be written as
|z − z1| = r.
We proceed to show how the arguably more familiar form of the equation of a circle
may be derived from the above form. We have
|z − z1 | = r ⇔ |(x + jy) − (x1 + jy1 )| = r
⇔ |(x − x1 ) + j(y − y1 )| = r
⇔ |(x − x1 ) + j(y − y1 )|2 = r2
⇔ (x − x1 )2 + (y − y1 )2 = r2 .

Example 2.2.2. Represent the values of z graphically where
|(z − 5)/(z + 5)| = 3/2

Solution: Firstly note that if 2|z − 5| = 3|z + 5| we cannot have |z + 5| = 0.
To see this note that if |z + 5| = 0, then z = −5, in which case 2|z − 5| = 2|(−5) −
5| = 20 ≠ 0 = 3|z + 5|. Therefore in general |(z − 5)/(z + 5)| = 3/2 ⇔ 2|z − 5| = 3|z + 5|.
Proceeding from this fact, we see that
|(z − 5)/(z + 5)| = 3/2 ⇔ 2|z − 5| = 3|z + 5|
⇔ 2|(x − 5) + jy| = 3|(x + 5) + jy|
⇔ 4[(x − 5)² + y²] = 9[(x + 5)² + y²]
⇔ 5x² + 130x + 5y² = −125
⇔ x² + 26x + y² = −25
⇔ x² + 26x + 13² + y² = 169 − 25
⇔ (x + 13)² + (y − 0)² = 144
This is the equation of a circle with radius 12 and centre (−13, 0), which may of
course be rewritten as |z + 13| = 12. (Show this yourself.) Thus
|(z − 5)/(z + 5)| = 3/2 ⇔ |z + 13| = 12.
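The equivalence just derived is easy to test numerically. In the minimal Python sketch below (eight arbitrary sample points), every point on the circle |z + 13| = 12 gives the ratio 3/2:

import cmath

for k in range(8):
    z = -13 + 12*cmath.exp(2j*cmath.pi*k/8)   # a point on |z + 13| = 12
    print(abs(z - 5)/abs(z + 5))              # prints 1.5 every time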


Example 2.2.3. Represent
|z + 1| < 3
graphically.
Solution: Notice that |z + 1| = |z − (−1)| is just the distance from z to the
point −1. The set of all z for which |z + 1| < 3, is just the set of all z for which
the distance between z and −1 is less than 3 units. But this is precisely the interior
of the circle with centre (−1, 0) and radius 3. To demonstrate this fact, notice that
|z + 1| < 3 ⇔ |(x + 1) + jy| < 3 ⇔ (x + 1)² + y² < 9.


Example 2.2.4. Represent


|z + 2j| > 2
graphically.

Solution: Here |z + 2j| = |z − (−2j)| is the distance from z to the point −2j.
So the set of all z for which |z + 2j| > 2, is just the locus of all points z for which
the distance between z and −2j is greater than 2 units. Thus it is the exterior
of the circle with centre (0, −2) and radius 2. As before we may demonstrate this
fact by noting that
|z + 2j| > 2 ⇔ |x + j(y + 2)| > 2 ⇔ x² + (y + 2)² > 4.


2.2.9. GENERAL TRANSFORMATIONS AND MAPPINGS. Let a
function w = f(z) be given. As mentioned in section 2.2.3, the function z ↦ f(z) =
w can be regarded as a transformation mapping points P in the z–plane onto points
P′ in the w–plane by means of the transformation w = f(z). In such a case P′ is
then referred to as the image of P.

Example 2.2.5. Determine the image of the point P (2, 3) (i.e. of z = 2 + j3)
in the w–plane, under the transformation w = z 2 .
Solution: Here
u + jv = w = (x + jy)² = x² − y² + j2xy.
Clearly
u(x, y) = x² − y² and v(x, y) = 2xy.
Thus at (2, 3)
u(2, 3) = 2² − 3² = 4 − 9 = −5 and v(2, 3) = 2(2)(3) = 12.
So P(2, 3) gets mapped onto P′(−5, 12) by w = z².

2.2.9.1. Translation. A translation is defined by


w =z+β
where β = a + jb is a complex constant. Here
u + jv = w
= z+β
= (x + jy) + (a + jb)
= (x + a) + j (y + b) .
Thus u = x + a and v = y + b.
From this it follows that if say a, b > 0, then each point in the z–plane is shifted
a units to the right and b units upwards.
Example 2.2.6. Determine the image of the square with vertices ±1 ± j, and
±1 ∓ j, in the z–plane under the transformation w = z + (1 + j).
Solution: Here u+jv = w = (x + 1)+j (y + 1) so that u = x+1 and v = y+1.
The four sides map as follows:
AB (−1 ≤ x ≤ 1, y = 1) → A′B′ (0 ≤ u ≤ 2, v = 2)
AD (x = −1, 1 ≥ y ≥ −1) → A′D′ (u = 0, 2 ≥ v ≥ 0)
DC (−1 ≤ x ≤ 1, y = −1) → D′C′ (0 ≤ u ≤ 2, v = 0)
CB (x = 1, −1 ≤ y ≤ 1) → C′B′ (u = 2, 0 ≤ v ≤ 2)


Figure 2.12

The square in the z–plane has been shifted one unit to the right and one unit
upwards.
2.2.9.2. Rotation. Let z ∈ C be given and suppose that in exponential form
z = re^{jθ}. Multiplying z by a complex number e^{jα} (where α ∈ R) then yields
z · e^{jα} = re^{jθ} · e^{jα} = re^{j(θ+α)}
Therefore in passing from z to z · e^{jα}, all that changes is that the argument changes
from θ to θ + α. Thus multiplication by e^{jα} is really just an operator rotating
points in the z-plane through an angle of α radians. It rotates clockwise if α < 0,
and counter–clockwise if α > 0.
Example 2.2.7. Find the image of the square described in the previous example,
under the transformation
w = e^{jπ/4} · z.
Solution: We have that
u + jv = w = e^{jπ/4} · z
= [cos(π/4) + j sin(π/4)](x + jy)
= (1/√2)(1 + j)(x + jy)
= (x − y)/√2 + j(x + y)/√2
Thus
u = (x − y)/√2 and v = (x + y)/√2
We proceed to compute the image of each of the sides.
AB(−1 ≤ x ≤ 1, y = 1): If y = 1, then u = (x − 1)/√2 and v = (x + 1)/√2. But then
u + 1/√2 = v − 1/√2 (both sides are equal to x/√2). Therefore v = u + √2. Since
−1 ≤ x ≤ 1, the possible values of u = (x − 1)/√2 range from −√2 (when x = −1)
to 0 (when x = 1). Thus the line segment AB maps onto the line segment
A′B′(−√2 ≤ u ≤ 0, v = u + √2).
AD(x = −1, −1 ≤ y ≤ 1): If x = −1, then u = (−1 − y)/√2 and v = (−1 + y)/√2. But
then −u − 1/√2 = v + 1/√2 (both sides are equal to y/√2). Therefore v = −u − √2. Since
−1 ≤ y ≤ 1, the possible values of u = (−1 − y)/√2 range from −√2 (when y = 1)
to 0 (when y = −1). Thus the line segment AD maps onto the line segment
A′D′(−√2 ≤ u ≤ 0, v = −u − √2).
Similar arguments can be used to show that DC(−1 ≤ x ≤ 1, y = −1) maps
onto D′C′(0 ≤ u ≤ √2, v = u − √2), and that CB(x = 1, −1 ≤ y ≤ 1) maps onto
C′B′(0 ≤ u ≤ √2, v = −u + √2). The image (shown in the figure below), is then
still a square, has the same area, and has been rotated counter–clockwise through
an angle of π/4 radians.

Figure 2.13
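A small numerical sketch of this rotation, applying w = e^{jπ/4} · z to the four corners directly (corner order as in the text):

import cmath

rot = cmath.exp(1j*cmath.pi/4)
for z in [-1 + 1j, 1 + 1j, 1 - 1j, -1 - 1j]:     # A, B, C, D
    w = rot*z
    print(z, '->', w, abs(w), cmath.phase(w) - cmath.phase(z))
# Each image keeps its modulus sqrt(2); each argument grows by pi/4.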
2.2.9.3. Magnification (stretching or shrinking). Under the transformation
w = az
where a is real, all figures in the z–plane are magnified if |a| > 1 and compressed if
|a| < 1. If a > 0 the orientation of the figures remains the same as before, whereas
if a < 0, the figures are also rotated by π radians.
Example 2.2.8. Apply w = 2z to the square of example 2.2.6.
Solution: Here
u + jv = w = 2 (x + jy)
Thus
u = 2x and v = 2y.
The four sides of the square therefore clearly map as follows:
AB (−1 ≤ x ≤ 1, y = 1) → A′B′ (−2 ≤ u ≤ 2, v = 2)
AD (x = −1, 1 ≥ y ≥ −1) → A′D′ (u = −2, 2 ≥ v ≥ −2)
DC (−1 ≤ x ≤ 1, y = −1) → D′C′ (−2 ≤ u ≤ 2, v = −2)
CB (x = 1, −1 ≤ y ≤ 1) → C′B′ (u = 2, −2 ≤ v ≤ 2)

Figure 2.14

The image of the square in the figure above is still a square, but the lengths of the
sides have been doubled. 

Remark 2.2.9. For w = −2z, the figure would also have been rotated through π
radians. To see this notice that −1 = e^{jπ}. We may therefore write −2z = e^{jπ}(2z),
i.e. a rotation through π radians applied on top of w = 2z.

2.2.9.4. Affine linear transformation.
w = αz + β
where α and β are complex constants. This transformation is a combination of all
three of the previous transformations. To see this notice that if in exponential form
α is of the form α = re^{jθ}, the transformation can be written as
w = re^{jθ} · z + β.
This transformation is therefore merely a combination of the following three transformations:
Magnification: z ↦ w1 = rz
Rotation: w1 ↦ w2 = e^{jθ} · w1
Translation: w2 ↦ w3 = w2 + β
An important property of these transformations is that they map straight lines in
the z–plane onto straight lines in the w–plane. (See if you can prove this! Given
that z = x + jy and w = u + jv, see if you can show that for a suitable choice of
m′ and c′, the transformation z ↦ w = αz + β, will transform an equation of
the form y = mx + c, into an equation of the form v = m′u + c′.)
2.2.9.5. Inversion. Inversion is defined by
w = 1/z
where z ≠ 0.

Example 2.2.10. Apply w = 1/z to the square of the previous examples.

Solution: Suppose that z = x + jy and w = u + jv where w = 1/z. We first


describe u and v in terms of x and y. These formulas will be useful in computing
the range of values of u and v on the image of each of the sides AB, AD, DC, and
CB of the square. We have that

u + jv = w = 1/(x + jy)
= [1/(x + jy)] · [(x − jy)/(x − jy)]
= (x − jy)/(x² + y²)
Thus
u = x/(x² + y²) and v = −y/(x² + y²)
Since w = 1/z is not a linear transformation, it is clear that the images of the
sides of the square are no longer straight lines. Thus before the figure in the w–
plane can be sketched, it will be necessary to determine the equations of the images
of each of the sides of the square. For this we describe x and y in terms of u and

v. Note that
x + jy = z = 1/w = 1/(u + jv)
= [1/(u + jv)] · [(u − jv)/(u − jv)]
= (u − jv)/(u² + v²)
Thus
x = u/(u² + v²) and y = −v/(u² + v²)
First consider the line segment AB(−1 ≤ x ≤ 1, y = 1). Since y = 1, we have
u = x/(x² + 1) and v = −1/(x² + 1). Now as x ranges from −1 to 1, the value of
u = x/(x² + 1) will range from −1/2 to 1/2, and of v = −1/(x² + 1) from −1 (at
x = 0) to −1/2. Again since y = 1, we have that 1 = y = −v/(u² + v²), which may
be rewritten as u² + v² + v = 0. Completing the square, we get that
(u − 0)² + (v + 1/2)² = (1/2)². This is a circle with centre at (0, −1/2) and radius
1/2. Thus the image A′B′ of the side AB, is the portion of the circle
(u − 0)² + (v + 1/2)² = (1/2)² for which −1/2 ≤ u ≤ 1/2 and −1 ≤ v ≤ −1/2.
Next consider the side AD(x = −1, −1 ≤ y ≤ 1). Since x = −1 we have
u = −1/(1 + y²) and v = −y/(1 + y²). Now as y ranges from −1 to 1, the value of
u = −1/(1 + y²) will range from −1 (at y = 0) to −1/2, and of v = −y/(1 + y²)
between −1/2 and 1/2. Finally notice that in terms of u and v, x = −1 becomes
−1 = x = u/(u² + v²). This may be rewritten as u² + u + v² = 0. On completing the
squares, this yields (u + 1/2)² + v² = (1/2)². Thus the image A′D′ is that portion
of the circle (u + 1/2)² + v² = (1/2)² for which −1 ≤ u ≤ −1/2 and −1/2 ≤ v ≤ 1/2.
Similarly DC(−1 ≤ x ≤ 1, y = −1) can be shown to map onto the circular arc
D′C′ described by u² + (v − 1/2)² = (1/2)² where −1/2 ≤ u ≤ 1/2 and 1/2 ≤ v ≤ 1.
The line segment CB(x = 1, −1 ≤ y ≤ 1) maps onto the circular arc C′B′
described by (u − 1/2)² + v² = (1/2)² where 1/2 ≤ u ≤ 1 and −1/2 ≤ v ≤ 1/2.

Figure 2.15
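A numerical spot-check of the first of these images (a minimal sketch): points on AB (y = 1) should land on the circle u² + (v + 1/2)² = (1/2)².

for x in [-1.0, -0.5, 0.0, 0.5, 1.0]:
    w = 1/complex(x, 1)                             # w = 1/z on the side AB
    print(w.real**2 + (w.imag + 0.5)**2 - 0.25)     # ~0 each time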



2.2.9.6. Bilinear transformation. Bilinear transformations are transformations
of the form
w = (az + b)/(cz + d) where ad − bc ≠ 0.
Transformations of the above type are sometimes also called Möbius transformations.
It is not difficult to see that these transformations are just a combination of all the
previous transformations. They have a lot of very nice properties. We proceed to
mention just a few:
• If we add a point called ∞ to C and formally set 1/0 = ∞ and 1/∞ = 0,
bilinear transformations yield 1-1 transformations from C ∪ {∞} onto
C ∪ {∞}.
• Bilinear transformations will always map any circle or line, onto either a
circle or line.
• The composition of two bilinear transformations is again a bilinear trans-
formation. (That is if f and g are both bilinear transformations, then
f ◦ g(z) = f (g(z)) is also a bilinear transformation.)
• Given three distinct points z1 , z2 , z3 in the z-plane, and three distinct
points w1 , w2 , w3 in the w-plane, there is a unique bilinear transformation
which maps each of z1 , z2 , and z3 onto w1 , w2 , and w3 respectively.

Example 2.2.11. Find the image in the w–plane of the line y = 2x in the
z–plane under the transformation
w = (z + j)/(z − j)
Sketch the graphs in both planes, shading the region in the w–plane corresponding
to the image of the region below the line y = 2x in the z–plane.
Solution: Suppose that z = x + jy and w = u + jv where w = (z + j)/(z − j). To find
the equation of the image of y = 2x, we first describe x and y in terms of u and v.
From w = (z + j)/(z − j) we cross-multiply to get
wz − wj = z + j,
rewrite this as
z(w − 1) = j(w + 1)
and then solve for z to get
x + jy = z = j(1 + w)/(w − 1)
= j(1 + u + jv)/(u − 1 + jv)
= [(−v + j(1 + u))/((u − 1) + jv)] · [((u − 1) − jv)/((u − 1) − jv)]
= [−v(u − 1) + v(u + 1) + j(v² + (u² − 1))]/[(u − 1)² + v²]
= [2v + j(v² + u² − 1)]/[(u − 1)² + v²]
Hence
x = 2v/[(u − 1)² + v²] and y = (v² + u² − 1)/[(u − 1)² + v²]

Thus y = 2x becomes
(v² + u² − 1)/[(u − 1)² + v²] = 4v/[(u − 1)² + v²],
which simplifies to 4v = u² + v² − 1, so that u² + v² − 4v + 4 = 1 + 4. We therefore
arrive at
(u − 0)² + (v − 2)² = 5.
This is the equation of a circle with centre (0, 2) and radius √5 in the w–plane.
Using the above formulas, we see that
(u − 0)² + (v − 2)² < 5 ⇔ v² + u² − 1 < 4v
⇔ (v² + u² − 1)/[(u − 1)² + v²] < 4v/[(u − 1)² + v²]
⇔ y < 2x.
Thus points below the line map onto points inside the circle.

Figure 2.16

Had we wanted to compute the image in the w–plane of specific points in the
z–plane, we would’ve needed to write u and v in terms of x and y. To do this write
u + jv = w = (z + j)/(z − j)
= [x + j(y + 1)]/[x + j(y − 1)] · [x − j(y − 1)]/[x − j(y − 1)]
= [(x² + y² − 1) + j2x]/[x² + (y − 1)²]
From this it is clear that
u = (x² + y² − 1)/[x² + (y − 1)²] and v = 2x/[x² + (y − 1)²]
Using these formulas we can now easily test our earlier claim that points below
the line map onto points inside the circle. For example the point F(0, −2) is below
the line and maps onto F′(1/3, 0), which is inside the circle. The point G(1, 4) is
above the line and maps onto G′(8/5, 1/5), which is outside the circle.
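These claims can be checked numerically as well. A minimal Python sketch: points on y = 2x should map onto the circle |w − 2j| = √5, the point below the line should land inside it, and the point above the line outside it.

f = lambda z: (z + 1j)/(z - 1j)

for x in [-2.0, -0.5, 0.0, 1.0, 3.0]:
    print(abs(f(complex(x, 2*x)) - 2j))   # ~ sqrt(5) = 2.2360... each time

print(abs(f(0 - 2j) - 2j))                # F(0,-2), below the line: < sqrt(5)
print(abs(f(1 + 4j) - 2j))                # G(1,4), above the line: > sqrt(5)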

Example 2.2.12. Let R be the square in the z–plane, with corners at the ver-
tices ±1 ± j and ±1 ∓ j.

For each of the transformations (i) w = z + b, (ii) w = az and (iii) w = az + b,
where a = 3 + j4 and b = 2 − j,
(a) determine the image R′ of R;
(b) and discuss the effect that a and b have on R, in terms of magnification,
rotation and translation (shifting).
Solution:
(i) (a) Notice that
u + jv = w
= z+b
= x + jy + 2 − j
= (x + 2) + j (y − 1)
Thus
u=x+2 and v = y − 1.
Since the transformation is affine linear, straight lines in the z–plane
will map onto straight lines in the w–plane. The line segments
forming the sides of the square R, will therefore map onto other line
segments forming the sides of R′. Thus the square R with corners at
A, B, C and D, will map onto another square R′ with corners at A′,
B′, C′ and D′ where
A(1, 1) → A′(3, 0)
B(−1, 1) → B′(1, 0)
C(−1, −1) → C′(1, −2)
D(1, −1) → D′(3, −2)
(b) The effect of the mapping z ↦ z + b (where b = 2 − j), is then to
shift R 2 units to the right, and 1 unit down.
(ii) (a) Notice that
u + jv = w
= az
= (3 + j4) (x + jy)
= (3x − 4y) + j (4x + 3y)

Thus here
u = 3x − 4y and v = 4x + 3y.
Since this transformation is also affine linear, straight lines in the
z–plane will again map onto straight lines in the w–plane. The line
segments forming the sides of the square R, will therefore map onto
other line segments forming the sides of R′. Thus the square R with
corners at A, B, C and D, will map onto another square R′ with
corners at A′, B′, C′ and D′ where
A(1, 1) → A′(−1, 7)
B(−1, 1) → B′(−7, −1)
C(−1, −1) → C′(1, −7)
D(1, −1) → D′(7, 1)

Figure 2.17

(b) If we write a = 3 + j4 in exponential form, we have that a = re^{jθ},
where r = |a| = √(3² + 4²) = 5, and θ = Arg(a) = arctan(4/3) =
0.927295 . . . (radians). Thus z ↦ az can be written as z ↦ 5 · e^{jθ} · z,
from which it is clear that the effect of this mapping is to magnify
by a factor of 5, and then to rotate counter-clockwise by 0.927295 . . .
radians.
(iii) It is left to the student to verify that this is a combination of (i) and (ii),
i.e. R is magnified by a factor of |a| = 5, rotated by Arg(a) = 0.927295 . . .
radians, and then translated 2 units to the right, and 1 unit down. 
Example 2.2.13. Determine the equation of the image of the line y = 2x + 1
in the z–plane under the transformation w = 2z + 4 + j3.
Solution: Given that z = x + jy and w = u + jv, we have that
u + jv = w
= 2z + 4 + j3
= 2 (x + jy) + 4 + j3
= (2x + 4) + j (2y + 3)
Thus
u = 2x + 4 and v = 2y + 3
Solving for x and y, we get
x = u/2 − 2 and y = (v − 3)/2.

Using these formulas, y = 2x + 1 becomes (v − 3)/2 = 2(u/2 − 2) + 1, which simplifies to
v = 2u − 3.
v = 2u − 3.

Figure 2.18

Note: There is no rotation in the transformation z ↦ 2z + (4 + j3) – only
a magnification by a factor 2, followed by a translation (shift) made up of a shift
of 4 units to the right and 3 units up. Note by way of illustration that the points
P(0, 1) and Q(2, 5) are on y = 2x + 1 (which has a slope of 2), and that the length
of PQ is √[(2 − 0)² + (5 − 1)²] = √20. Using the formulas u = 2x + 4, v = 2y + 3,
it is easy to see that P and Q map onto
P′(4, 5) and Q′(8, 13)
respectively. Both P′ and Q′ lie on the line v = 2u − 3 (which also has a slope of
2), and the length of the line segment from P′ to Q′ is
√[(8 − 4)² + (13 − 5)²] = √80 = 2√20
(exactly twice the length of PQ).
Example 2.2.14. Determine the image in the w–plane of each of the contours
(a) y = x,
(b) x2 + y 2 = 4
in the z–plane, under the transformation w = z 2 .
Solution: As always we assume that z = x + jy, and w = u + jv.
(a) From example 2.2.5 we have that u = x² − y² and v = 2xy. So if y = x,
then u = 0 and v = 2x² ≥ 0. Thus when y = x, the possible values for u
and v are u = 0, and v ≥ 0 – i.e. the positive v–axis.

Figure 2.19

(b) The equation x² + y² = 4 may also be written as |z|² = 4 (i.e. |z| = 2).
Now since w = z², we have |w| = |z²| = |z|² (by property (a) of section
2.2.7). Thus under the transformation w = z², the equation |z| = 2,
becomes |w| = 4 – the circle in the w–plane, centred at the origin, with
radius 4. Another way to see this is to use the formulas u = x² − y² and
v = 2xy to note that
u² + v² = (x² − y²)² + (2xy)²
= x⁴ − 2x²y² + y⁴ + 4x²y²
= x⁴ + 2x²y² + y⁴
= (x² + y²)²
= 4².

Figure 2.20

Note: One revolution in the z–plane gives two revolutions in the w–
plane. To see this note that if in exponential form a point on this circle in
the z–plane is of the form z = 2e^{jθ} where −π < θ ≤ π, the image w will
then be w = z² = (2e^{jθ})² = 2²e^{j2θ}. Thus in passing from z to w, the
distance from the origin gets squared, and the argument doubled. The sector of the
circle in the first quadrant of the z–plane will for example map onto the
semi–circle above the u–axis. Moving from A to E along the circumference
of |z| = 2 in a positive sense (counter–clockwise), these points (the left
half of the circle |z| = 2) map onto the points in the w–plane from A′ to E′
counter–clockwise (that is one full counter–clockwise rotation of the circle
|w| = 4, starting and finishing at (−4, 0)). If now in the z plane we move
from E to A counter–clockwise (the right half of the circle |z| = 2), the
image of these points in the w–plane trace out yet another full rotation
of the circle |w| = 4 (from E′ to A′).
Example 2.2.15. Determine the image in the w–plane of each of the contours
(a) y = x,
(b) x² + y² = 4,
(c) y = 2x + 1,
in the z–plane, under the transformation w = 1/z.

Solution: Before computing the images of the given curves, we describe x and
y in terms of u and v, and u and v in terms of x and y. We have that
x + jy = z = 1/w = [1/(u + jv)] · [(u − jv)/(u − jv)] = (u − jv)/(u² + v²).
Thus
x = u/(u² + v²) and y = −v/(u² + v²).
Similarly
u = x/(x² + y²) and v = −y/(x² + y²).
(a) Using the above formulas, y = x may be rewritten as
u/(u² + v²) = −v/(u² + v²).
After cross–multiplying, this gives v = −u, which is a straight line.

Figure 2.21

(b) Since w = 1/z, it is simple to conclude that |w| = 1/|z|. The equation
x² + y² = 4 can be rewritten as |z| = 2. So since |w| = 1/|z|, |z| = 2 must
map onto |w| = 1/2 – the circle in the w–plane centred at the origin, and
with radius 1/2. Another way to see this is to again use the formulas for x
and y, to rewrite 4 = x² + y² as
4 = [u/(u² + v²)]² + [−v/(u² + v²)]² = (u² + v²)/(u² + v²)².
This then simplifies to
u² + v² = 1/4,
the circle in the w–plane centred at the origin, with radius 1/2.

Figure 2.22

(c) Using the formulas for x and y, the equation y = 2x + 1 becomes
−v/(u² + v²) = 2u/(u² + v²) + 1,
which simplifies to
(u² + 2u) + (v² + v) = 0.
If now we complete the squares, we get
(u + 1)² + (v + 1/2)² = 1 + 1/4 = 5/4
Thus the line y = 2x + 1 maps onto the circle centred at (−1, −1/2), with
radius √(5/4) = √5/2.

Figure 2.23

More generally one can show that the transformation w = 1/z maps
lines through the origin onto other lines through the origin, but all lines not
passing through the origin (described by equations of the form y = mx + c
where c ≠ 0), onto circles. By the same argument we used in (b), we can
show that the circles |z| = r in the z–plane map onto circles |w| = 1/r in
the w–plane. Also points inside |z| = r map onto points outside |w| = 1/r
and points outside |z| = r map onto points inside |w| = 1/r. [Try to show
this yourself.]

Example 2.2.16. Determine the bilinear transformation that maps the points
−j, 0 and −1 in the z–plane, onto the points 1, j and 0 in the w–plane respectively.
Solution: We know that the transformation must be of the form w = (az + b)/(cz + d).
Thus all we need to do is find suitable values for a, b, c and d.
When z = −1, we need w = 0. That is we need 0 = (−a + b)/(−c + d). The only way this
equation can hold is if
a = b.
When z = 0, we must have w = j. This means that j = (a · 0 + b)/(c · 0 + d) = b/d. Thus
d = (1/j)b = −jb. Using the fact that b = a, we arrive at the conclusion that
d = −ja.
Finally when z = −j, we need w = 1. Thus we must have 1 = (−aj + b)/(−cj + d). Solving
for c we get c = −(1/j)(−aj + b − d). Inserting the values b = a and d = −ja, this
becomes
c = −(1/j)a = ja.
The required transformation is therefore of the form
w = (az + a)/(jaz − ja) = (z + 1)/[j(z − 1)] = −j(z + 1)/(z − 1).
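As a quick verification (a minimal sketch), the transformation found should indeed send −j, 0 and −1 onto 1, j and 0 respectively:

w = lambda z: -1j*(z + 1)/(z - 1)

for z, expected in [(-1j, 1), (0, 1j), (-1, 0)]:
    print(z, '->', w(z), '   expected:', expected)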


2.2.10. ENGINEERING APPLICATION: ANALYSING AC
CIRCUITS. Consider the circuit below. Both the impedance Z and admittance
Y = 1/Z depend on the capacitance C. As the capacitance C varies from 0 to ∞,
Z and Y will then also vary. This variation can be represented as curves in the
complex plane. Find and sketch the curves representing the variation in impedance
Z and admittance Y = 1/Z, as the capacitance C varies from 0 to ∞.

Figure 2.24

Solution: (We assume that the reader is familiar with the content of section
2.1.4.) Let ZR and ZC respectively be the impedances of the resistor and the
capacitor. Since in the above circuit these components are connected in parallel,

we have that
Y = 1/Z = 1/Z_R + 1/Z_C
= 1/R − ωC/j
= 1/R + jωC
= (1 + jωRC)/R.
As C varies from 0 to ∞, the real part of Y will remain constant at 1/R, whereas
the imaginary part (that is ωC) will vary from 0 to ∞. As shown in the sketch
below, this is a half-line in the upper half-plane, parallel to the imaginary axis and
starting from (1/R, 0).
Recall that Y = 1/Z (equivalently Z = 1/Y). Thus we use inversion to pass from Y
to Z. As noted in example 2.2.15, inversion will map lines that don’t pass through
the origin, onto circles. So if the variation in Y traced out the half-line described
above, we expect that the variation in Z = 1/Y will trace out the image of this
half-line (under inversion), which should be part of a circle. The question is just
which one?
Let Z = u + jv where u and v are the real and imaginary parts of Z. In the
analysis of Y we saw that
Re(Y) = 1/R and Im(Y) = ωC ≥ 0.
However
Y = 1/Z = [1/(u + jv)] · [(u − jv)/(u − jv)] = (u − jv)/(u² + v²)
Thus we must have
u/(u² + v²) = Re(Y) = 1/R and −v/(u² + v²) ≥ 0.
But this can only be the case if u² − Ru + v² = 0 and v ≤ 0. On completing the
square, u² − Ru + v² = 0 becomes
(u − R/2)² + v² = (R/2)².
Recalling that v ≤ 0, the variation in Z is therefore represented by the lower half
of the circle (u − R/2)² + v² = (R/2)² (starting from (R, 0)).
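This result can be spot-checked numerically. In the sketch below (R, ω and the capacitances are arbitrary illustrative values) every computed Z lies on the circle |Z − R/2| = R/2 with non-positive imaginary part:

import math

R, omega = 100.0, 2*math.pi*50

for C in [1e-9, 1e-7, 1e-6, 1e-5, 1e-4]:
    Z = 1/(1/R + 1j*omega*C)
    print(abs(Z - R/2), Z.imag)   # first value is always 50.0; Z.imag <= 0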

EXERCISE 2.2.
1. The line joining P(−3, 1) and Q(1, 3) in the z–plane is mapped onto a
curve P′Q′ in the w–plane by w = z². Determine the equation of P′Q′.
2. Given that w = 2z − z 2 , determine the image of z when
(a) z = 1 + j;
(b) z = 2 − 2j
3. Given that f(z) = (1 + z)/(1 − z), z ≠ 1, determine
(a) f (j);
(b) f (1 − j).
4. Let w = f(z) = (z + 2)/(2z − 1), z ≠ 1/2.
(a) Find all values of z for which (i) f (z) = j, and (ii) f (z) = 2 − 3j.
(b) Show that z may be written as a single valued function of w.
(c) Find the values of z such that f (z) = z, and explain geometrically
(or otherwise) why we would call such values the fixed or invariant
points of the transformation.
5. Determine u and v if w = u + jv and
(a) w = 2z² − j3z;
(b) w = z + 1/z.
6. Determine u (x, y) and v (x, y) where f (z) = u + jv and
(a) f(z) = e^{j3z};
(b) f (z) = cos z.
7. Determine u (x, y) and v (x, y) where
(a) u + jv = sinh 2z;
(b) u + jv = z cosh z.
8. Determine the value(s) of
(a) 4 sinh(jπ/3);
(b) cosh[j(2k + 1)π/2], k = 0, ±1, ±2, . . .
9. Show that
ln(−1/2 − j√3/2) = j(−2π/3 + 2kπ), k = 0, ±1, ±2, . . .
and hence write down the principal value of ln(−1/2 − j√3/2).
10. Determine all the values of
(a) ln (−4),
(b) ln (3j),
and give the principal value in each case.
11. Show that under appropriate restrictions
ln(z − 1) = (1/2) ln[(x − 1)² + y²] + j tan⁻¹[y/(x − 1)].
Explicitly state the restrictions (if any) needed for this equality to hold.
12. Find a bilinear transformation which maps (0, 0) , (0, −1) and (−1, 0) in
the z–plane onto (0, 1) , (1, 0) and (0, 0) respectively, in the w–plane.
13. Given the triangle R in the z–plane with vertices at j, 1 − j and 1 + j, de-
termine the contour R0 onto which R is mapped under the transformation
w = 3z + 4 − 2j. Also discuss the relationship between R and R0 .
86

14. (a) Determine the image of the circle C : |z + 3| = 5 under the transformation w = 1/z.
(b) Into what region is the interior of C mapped?
15. Determine the equation of the curve in the w–plane into which the line
y = 1 − x is mapped under the transformation w = z 2 .
16. (a) Given that w = u + jv, determine u and v if w = 2z − 3jz + 5 − 4j.
(b) Determine the image of the triangle R in question 13, under the
transformation in (a). The image should be another triangle. Are
the triangles similar?
17. Find and graph the variation in impedance Z and admittance Y = 1/Z as
L varies from 0 to ∞.
(Hint: Here 1/Z = 1/R + 1/(jωL).)

Figure 2.25
18. A circuit consists of a resistance R1 and an inductance L in parallel,
connected with a second resistor R2 in series. A voltage with frequency
f = ω/(2π) is applied to the circuit. The resulting complex impedance Z is
given by
1/(Z − R2) = 1/R1 + 1/(jωL)
Show that as L varies from zero to infinity, the locus of the corresponding
points Z, is part of a circle. Also find the centre and radius of this circle.

UNIT 3: COMPLEX DIFFERENTIATION


2.3.1. OBJECTIVE. To introduce the concepts of limit and continuity, as
well as the process of differentiation of complex functions. With this as background,
to then specialise to the theory of analytic functions, giving specific attention to
connections with the theory of harmonic functions, and the description of zeros and
singular points.
2.3.2. OUTCOMES. At the end of this unit the student should
• Understand what is meant by respectively saying that a complex function
is differentiable, and that it is analytic;
• Be familiar with the Cauchy-Riemann equations and able to use them to
determine where a given function is differentiable;
• Know what is meant by a harmonic function, be able to test for harmonic-
ity, and be able to compute harmonic conjugates;
• Know what is meant by a conformal mapping and be able to test a given
function for conformality;
• Be familiar with the definitions of the three types of isolated singularities
and be able to test whether such a singularity is either removable or a
pole.
2.3.3. LIMIT AND CONTINUITY. Firstly, formal definitions will be
given for “limit” and “continuity”. For our purposes it will be sufficient to have a
heuristic understanding of these two ideas.
2.3.3.1. Limits. If in some neighbourhood of z = z0, with the possible exclusion
of z = z0 itself, f(z) is well-defined and single–valued, we say that the limit of f(z)
as z approaches z0 is l, if for any ε > 0, we can find a δ > 0 so that |f(z) − l| < ε
whenever 0 < |z − z0| < δ. In such a case we will denote the existence of the limit
l by writing
lim_{z→z0} f(z) = l.

Figure 2.26

The idea behind the definition is that for any distance ε, no matter how small,
we should have that the distance between f(z) and l is less than ε whenever z
is close enough to z0 (without actually being equal to z0). The value f(z0) may
or may not be defined. Therefore if the limit does exist, the route along which z
tends to z0 should not make a difference. No matter which path it follows, as z
gets closer to z0, f(z) should get closer to l.
Consider for example the function f(z) = (z² − 1)/(z − 1). Although this function is
clearly not defined at z = 1, we nevertheless have
lim_{z→1} (z² − 1)/(z − 1) = lim_{z→1} [(z − 1)(z + 1)]/(z − 1) = lim_{z→1} (z + 1) = 2

On the other hand for g(z) = (z² + 1)/(z + 1) we have g(1) = 1 and
lim_{z→1} (z² + 1)/(z + 1) = (1 + 1)/(1 + 1) = 1.
2.3.3.2. Continuity. We say that f (z) is continuous at z = z0 if the following
three conditions hold:
(i) limz→z0 f (z) exists,
(ii) f (z0 ) is defined,
(iii) limz→z0 f (z) = f (z0 ).
If f fails to be continuous at a point, we say that it is discontinuous at that point.
Now for f(z) = (z² − 1)/(z − 1), we saw in the preceding discussion that although
lim_{z→1} f(z) did exist, f(1) did not exist. So this function is discontinuous at z = 1.
By contrast all three of the above conditions will hold at z = 1 for the function
g(z) = (z² + 1)/(z + 1). So this function is continuous at z = 1.
2.3.3.3. Theorems on limits. If both lim_{z→z0} f(z) and lim_{z→z0} g(z) exist, it
follows that
1. lim_{z→z0} [f(z) ± g(z)] = lim_{z→z0} f(z) ± lim_{z→z0} g(z)
2. lim_{z→z0} [f(z) · g(z)] = lim_{z→z0} f(z) · lim_{z→z0} g(z)
3. lim_{z→z0} [f(z)/g(z)] = [lim_{z→z0} f(z)]/[lim_{z→z0} g(z)] provided that in addition lim_{z→z0} g(z) ≠ 0

2.3.3.4. Theorems on continuity.


1. If both f(z) and g(z) are continuous at z = z0, then so are f(z) ± g(z) and
f(z) · g(z). The function f(z)/g(z) will also be continuous at z0 if in addition
g(z0) ≠ 0.
2. w = f [g (z)] is continuous at z = z0 , if g is continuous at z = z0 , and f
continuous at g(z0 ) – i.e. a continuous function of a continuous function
is continuous.
3. If f (z) is continuous in a closed bounded region, it will be bounded in
that region; that is there will exist a constant M > 0 such that |f (z)| < M
for all points z in the region.
4. If f (z) is continuous in a region, then so are the real and imaginary parts
of f .
5. If f(z) is continuous at w0, and lim_{z→z0} g(z) = w0, then
lim_{z→z0} f(g(z)) = f(lim_{z→z0} g(z)).

2.3.4. DIFFERENTIABILITY AND ANALYTICITY.


2.3.4.1. Introduction. In Mathematics I we defined the derivative at a point x0
of a real-valued function f of one real variable x, to be
f′(x0) = lim_{∆x→0} [f(x0 + ∆x) − f(x0)]/∆x
whenever the limit existed. If the limit failed to exist, we said that f was not
differentiable at x0 . This derivative of course represents the instantaneous rate of
change of f with respect to the variable x at the point x0 .

Figure 2.27

In passing to complex functions of a complex variable, we may follow the same


formal prescription to introduce the idea of a derivative for these functions. Given
a complex function f defined in some neighbourhood of z0 , we define the derivative
of f at z0 to be
lim_{∆z→0} [f(z0 + ∆z) − f(z0)]/∆z
whenever the limit exists. In such a case we say that f is differentiable at z0 . If
the limit fails to exist, we say that f is not differentiable at z0 . If f 0 (z) exists for
all z in a given region G, we will say that f is differentiable in that region.
More generally given f , we can define a new function f 0 with domain the set
of all z for which f 0 (z) exists, and with the action of taking each z in the domain
onto f′(z). If we write w = f(z) and set ∆w = f(z + ∆z) − f(z) then for any z in
the domain of f′, we will where convenient use the shorthand
dw/dz = lim_{∆z→0} ∆w/∆z
to denote
f′(z) = lim_{∆z→0} [f(z + ∆z) − f(z)]/∆z

Figure 2.28

Here we are effectively dealing in two-dimensional vectors. To illustrate the
point suppose P and Q in the z–plane map onto P′ and Q′ in the w–plane. Then
P′Q′ = ∆w = (w0 + ∆w) − w0 = f(z0 + ∆z) − f(z0)
Now as ∆z tends to 0, Q will tend to P, and (if f is continuous), Q′ will tend to
P′. Thus here too the derivative represents some sort of rate of change of w = f(z)
with respect to z. Specifically
dw/dz |_{z=z0} = lim_{∆z→0} [f(z0 + ∆z) − f(z0)]/∆z = lim_{Q→P} P′Q′/PQ.

All the usual rules of differentiation apply for complex functions as
well. At all points z where both f′(z) and g′(z) exist, we have
• d/dz [f(z) ± g(z)] = f′(z) ± g′(z),
• d/dz [f(z) · g(z)] = f′(z)g(z) + f(z)g′(z),
• and d/dz [f(z)/g(z)] = [f′(z)g(z) − f(z)g′(z)]/[g(z)]² if in addition g(z) ≠ 0.
Also
• if g is differentiable at z0 and f is differentiable at g(z0), then (f ◦ g)′(z0) =
f′(g(z0)) · g′(z0),
• and d/dz (z^n) = nz^{n−1} for all n ∈ N. (If z ≠ 0, this formula is also valid for
−n ∈ N.)
Suppose for example that f(z) = z². We can then prove from first principles that
f′(z) = 2z, by noting that
f′(z) = lim_{∆z→0} [f(z + ∆z) − f(z)]/∆z
= lim_{∆z→0} [(z + ∆z)² − z²]/∆z
= lim_{∆z→0} [2z · ∆z + (∆z)²]/∆z
= lim_{∆z→0} (2z + ∆z)
= 2z
What about functions which are not differentiable? Somewhat surprisingly
there are some very simple complex functions that are not differentiable at any
point of C! Consider for example the function defined by f(z) = z̄ (so f sends z to
its conjugate z̄). Let z0 be fixed. To compute the derivative at z0 (if it exists), we
would need to compute the limit
lim_{∆z→0} [f(z0 + ∆z) − f(z0)]/∆z = lim_{∆z→0} [(z̄0 + ∆z̄) − z̄0]/∆z
= lim_{∆z→0} ∆z̄/∆z
= lim_{∆z→0} (∆x − j∆y)/(∆x + j∆y)
(Of course here z0 = x0 + jy0 and ∆z = ∆x + j∆y.)
If ∆z tended to 0 along the line ∆x = 0, the limit will then be
lim_{∆z→0} ∆z̄/∆z = lim_{∆y→0} (−j∆y)/(+j∆y) = −1
If on the other hand ∆z tended to 0 along the line ∆y = 0, the limit turns out to
be
lim_{∆z→0} ∆z̄/∆z = lim_{∆x→0} ∆x/∆x = 1
But if the original limit did exist, we should’ve obtained the same answer no matter
how ∆z tended to 0. Since we get different answers along different paths, the limit
lim_{∆z→0} [f(z0 + ∆z) − f(z0)]/∆z does not exist. Since z0 was an arbitrary point
in C, what we have actually proved is
the function f(z) = z̄ is not differentiable at any point of C.
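This direction-dependence shows up clearly in a numerical experiment. The sketch below compares difference quotients for f(z) = z̄ taken along the real and imaginary directions, and does the same for f(z) = z², where both directions agree:

def quotient(f, z, dz):
    return (f(z + dz) - f(z))/dz

z0 = 1.0 + 2.0j
for dz in [1e-6, 1e-6j]:                 # step along the real / imaginary axis
    print(quotient(lambda z: z.conjugate(), z0, dz))   # 1, then -1
    print(quotient(lambda z: z**2, z0, dz))            # ~ 2*z0 both times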

2.3.4.2. Analytic (regular or holomorphic) functions. Let w = f (z) be a func-


tion which is defined and single-valued in some neighbourhood of z0 . We say that f
is analytic at z = z0 if for some R > 0, it is differentiable at each point z satisfying
|z − z0 | < R. (Thus to be analytic f must not just be differentiable at z0 , but also
at every other point close enough to z0 ). A point at which f (z) fails to be analytic,
but nevertheless has other points arbitrarily close to it at which f is analytic, is
called a singular point or singularity.
Remark 2.3.1. Let p(z) = a0 + a1 z + · · · + am z^m be given where m ∈ N. Using
the differentiation formulas given in the previous subsection, it is easy enough to
see that p′(z) exists for each z and that
p′(z) = a1 + 2a2 z + · · · + mam z^{m−1}.
Thus any such polynomial is analytic at all points of C. However as we saw earlier,
the function f(z) = z̄ is by contrast nowhere differentiable, and hence also nowhere
analytic.
2.3.5. THE CAUCHY–RIEMANN EQUATIONS. With the concepts
of differentiability and analyticity in place, the next challenge we have to face, is
to find a reliable way to test a given function to see if it is differentiable at a given
point or not. It is here where the so-called Cauchy-Riemann equations come into
their own. They are the most useful and reliable tool in testing for differentiability
(or lack thereof). It is therefore worth our while to take some time to properly get
to grips with these equations.
Suppose we are given a function f (z) = u(x, y) + jv(x, y) which is differentiable
at a point z0 = x0 +jy0 . Given ∆z = ∆x+j∆y we write w = f (z) and ∆u+j∆v =
∆w = f (z0 + ∆z) − f (z0 ) where
∆u = u(x0 + ∆x, y0 + ∆y) − u(x0 , y0 ), ∆v = v(x0 + ∆x, y0 + ∆y) − v(x0 , y0 ).
Then
f′(z0) = lim_{∆z→0} ∆w/∆z = lim_{(∆x,∆y)→(0,0)} (∆u + j∆v)/(∆x + j∆y)

Now of course since the limit lim_{∆z→0} ∆w/∆z does exist, the path that ∆z follows as it
approaches 0, should not make a difference. Each path must give the same answer!
First let ∆z tend to 0 along the path where ∆x = 0 (that is ∆x = 0 and
∆y → 0). Then the above formula for the derivative becomes
f′(z0) = lim_{∆y→0} (∆u + j∆v)/(j∆y)
= lim_{∆y→0} {[u(x0, y0 + ∆y) − u(x0, y0)] + j[v(x0, y0 + ∆y) − v(x0, y0)]}/(j∆y)
= lim_{∆y→0} [u(x0, y0 + ∆y) − u(x0, y0)]/(j∆y) + [v(x0, y0 + ∆y) − v(x0, y0)]/∆y
= −j ∂u/∂y (x0, y0) + ∂v/∂y (x0, y0)
Now suppose that ∆z tends to 0 along the path where ∆y = 0 (that is ∆y = 0
and ∆x → 0). In this case the formula for the derivative becomes
f′(z0) = lim_{∆x→0} (∆u + j∆v)/∆x
= lim_{∆x→0} [u(x0 + ∆x, y0) − u(x0, y0)]/∆x + j[v(x0 + ∆x, y0) − v(x0, y0)]/∆x
= ∂u/∂x (x0, y0) + j ∂v/∂x (x0, y0)

Thus we have
∂v/∂y (x0, y0) − j ∂u/∂y (x0, y0) = f′(z0) = ∂u/∂x (x0, y0) + j ∂v/∂x (x0, y0).
If now we equate real and imaginary parts, we get that
∂u/∂x (x0, y0) = ∂v/∂y (x0, y0) and ∂v/∂x (x0, y0) = −∂u/∂y (x0, y0).
These equations are the so called Cauchy–Riemann equations. To summarise,
we have proved the following theorem in our discussion:
Theorem 2.3.2 (Necessity of the Cauchy–Riemann equations). If a complex
function f(z) = u(x, y) + jv(x, y) is differentiable at a point z0 = x0 + jy0, then at
that point we must have ∂u/∂x (x0, y0) = ∂v/∂y (x0, y0) and ∂v/∂x (x0, y0) = −∂u/∂y (x0, y0). In
such a case
f′(z0) = ∂u/∂x (x0, y0) + j ∂v/∂x (x0, y0) = ∂v/∂y (x0, y0) − j ∂u/∂y (x0, y0).
Notice that the above theorem tells us what should happen at a point if the
function IS differentiable there. It does NOT tell us how to check if the function
actually is differentiable. So what is this theorem good for then? Basically what
this theorem tells us is that
if for a given function f (z) = u(x, y) + jv(x, y) the Cauchy-
Riemann equations do not hold at some given point, the function
is not differentiable at that point.
So it gives us a useful way to check if a function is not differentiable at a given
point. But how do we check to see if it IS differentiable at a point? Does the above
theorem work the other way around as well? The answer is “not quite”. For this
theorem to work the other way around, the Cauchy–Riemann equations on their
own are not enough. We also need the partial derivatives to be well-behaved. What
can be proved is the following:
Theorem 2.3.3 (Sufficiency of the Cauchy–Riemann equations). Let f(z) =
u(x, y) + jv(x, y) be a complex function, and suppose that the partial derivatives
∂u/∂x, ∂u/∂y, ∂v/∂x and ∂v/∂y exist at some point (x0, y0). Then if in addition
• these partial derivatives satisfy the Cauchy–Riemann equations
• AND are continuous at (x0, y0),
the function f will be differentiable at z0 = x0 + jy0.
The proof of this theorem requires some deep results from real analysis. We
will therefore only outline the proof, and not go into too much detail. Basically if
the first partial derivatives are continuous at z0 = x0 + jy0, then by a theorem of
real analysis we have that
∆u = ∂u/∂x (x0, y0)∆x + ∂u/∂y (x0, y0)∆y + ε1 √[(∆x)² + (∆y)²]
and
∆v = ∂v/∂x (x0, y0)∆x + ∂v/∂y (x0, y0)∆y + ε2 √[(∆x)² + (∆y)²]
where ε1 and ε2 tend to 0 as (∆x, ∆y) → 0. If now the Cauchy Riemann equations
also hold, we can use the above formulas to write
∆w/∆z = (∆u + j∆v)/(∆x + j∆y) = ∂u/∂x (x0, y0) + j ∂v/∂x (x0, y0) + (ε1 + jε2) √[(∆x)² + (∆y)²]/∆z.

As ∆z → 0, the right-hand side will then tend to ∂u/∂x (x0, y0) + j ∂v/∂x (x0, y0). So we
have that
f′(z0) = lim_{∆z→0} ∆w/∆z = ∂u/∂x (x0, y0) + j ∂v/∂x (x0, y0)
as required.
Example 2.3.4. Use the Cauchy–Riemann equations to show that f (z) = z 2
is analytic and to find its derivative.
Solution: For f(z) = z² we have
u + jv = f(z) = (x + jy)² = x² − y² + j2xy.
So u = x² − y² and v = 2xy. On computing the partial derivatives, we see that
∂u/∂x = 2x, ∂u/∂y = −2y,
∂v/∂x = 2y, ∂v/∂y = 2x.
These derivatives are all polynomials, and hence continuous at every point (x, y).
In addition for every (x, y) we have that
∂u/∂x = 2x = ∂v/∂y and ∂u/∂y = −2y = −∂v/∂x.
Thus u and v also satisfy the Cauchy–Riemann equations at each point (x, y). It
follows from our second theorem that f(z) = z² is differentiable at every z ∈ C,
and hence that f is analytic on all of C. From the first theorem we now conclude
that
f′(z) = ∂u/∂x + j ∂v/∂x = 2x + j2y = 2z.
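The computation in this example can also be delegated to a computer algebra system. A sketch using the sympy library (assuming it is installed):

import sympy as sp

x, y = sp.symbols('x y', real=True)
f = (x + sp.I*y)**2
u, v = sp.expand(f).as_real_imag()

print(sp.simplify(sp.diff(u, x) - sp.diff(v, y)))     # 0: first CR equation
print(sp.simplify(sp.diff(u, y) + sp.diff(v, x)))     # 0: second CR equation
print(sp.expand(sp.diff(u, x) + sp.I*sp.diff(v, x)))  # 2*x + 2*I*y, i.e. 2z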

2.3.6. HARMONIC FUNCTIONS AND HARMONIC CONJUGATES.
A function u : R² → R that satisfies Laplace's equation,
∇²u = 0,
where ∇² is the operator
∇² = ∂²/∂x² + ∂²/∂y²,
is said to be harmonic. As it turns out, analytic functions are very useful for
finding solutions to Laplace’s equation, since if a function f (z) = u(x, y) + jv(x, y)
is analytic on some domain Ω, u and v will both be harmonic on that domain. To
prove this we need the Cauchy–Riemann equations, and the fact that u and v have
continuous second order partial derivatives. We will see later that if a function
f (z) is analytic on the interior of a domain, then its derivatives of all orders exist
and are analytic on the same domain. Since the derivatives of f can be expressed
in terms of the partial derivatives of u and v, this means that u and v actually
have (continuous) partial derivatives of all orders on that domain. For the sake of
argument assume this to be true for now. Since f is analytic on Ω, we have that
∂u/∂x = ∂v/∂y and ∂v/∂x = −∂u/∂y
on the domain Ω. If now we differentiate the first equation with respect to x, and
the second with respect to y, we get that
∂²u/∂x² = ∂²v/∂x∂y and ∂²u/∂y² = −∂²v/∂y∂x.
But since the second partial derivatives of v are continuous, we have that
∂²v/∂x∂y = ∂²v/∂y∂x.
Combining these three sets of equations we get that
∂²u/∂x² = −∂²u/∂y²
as required. The proof that
∂²v/∂x² = −∂²v/∂y²
runs along similar lines and is left as an exercise. Thus we arrive at the following
Theorem 2.3.5. If f(z) = u(x, y) + jv(x, y) is analytic, then
∂²u/∂x² + ∂²u/∂y² = 0 and ∂²v/∂x² + ∂²v/∂y² = 0.
We saw that we can get solutions to Laplace’s equation from analytic functions.
But we can also get analytic functions from solutions to the Laplace equation.
Suppose we are given a function u : R² → R that satisfies Laplace's equation.
Any other harmonic function v for which u + jv is analytic, is called a harmonic
conjugate of u. To compute the harmonic conjugate of some given u, we simply
look for another harmonic function v chosen so that the pair u and v satisfy the
Cauchy-Riemann equations.
Given the importance of Laplace’s equation in mathematical modelling, har-
monic functions have applications in such areas as stress analysis in plates, fluid
flow, and electrostatics.

Example 2.3.6. Show that for any branch cut of sin⁻¹(z), the Cauchy–Riemann
equations hold wherever that branch cut is analytic. Then show that the derivative
of sin⁻¹(z) is 1/(1 − z²)^{1/2}.

Solution: Whilst it is possible to prove what is required of us directly, we


will use a more indirect approach based on the chain rule. Given a branch cut
w = sin⁻¹(z) = u + jv, it follows from the definition of sin⁻¹ that
sin(u + jv) = sin(w) = z = x + jy.
Recall that u and v are functions of (x, y), and that their first partial derivatives
exist where w = sin⁻¹(z) is analytic. So at any such point we may use the chain
rule for partial derivatives to conclude that
cos(w) · ∂w/∂x = cos(w) · (∂u/∂x + j ∂v/∂x) = ∂z/∂x = 1 + j0.
By a similar argument
cos(w) · (∂u/∂y + j ∂v/∂y) = 0 + j.
If now we solve for 1/cos(w), we get that
∂u/∂x + j ∂v/∂x = 1/cos(w) = (1/j)(∂u/∂y + j ∂v/∂y) = ∂v/∂y − j ∂u/∂y.
By comparing real and imaginary parts we may now conclude that
∂u/∂x = ∂v/∂y and −∂u/∂y = ∂v/∂x

wherever cos(w) ≠ 0. So the Cauchy–Riemann equations hold whenever cos(w) ≠ 0.
(We will see shortly that these points are just the points where dw/dz exists.) By now
applying the chain rule for complex differentiation, we get that
cos(w) · dw/dz = d/dz [sin(w)] = dz/dz = 1.
Since cos²(w) = 1 − sin²(w) = 1 − z², it therefore follows that
dw/dz = 1/cos(w) = 1/(1 − z²)^{1/2}
where the value of the derivative depends on the branch chosen for (1 − z²)^{1/2}.
Since cos²(w) = 1 − sin²(w) = 1 − z², it is clear that the points where cos(w) ≠ 0,
are just the points where the multi-valued function 1/(1 − z²)^{1/2} exists.

Example 2.3.7. Show that u = x² − y² + 2x is harmonic and find the form of
the harmonic conjugate v of u. Then write down the analytic function u + jv = f(z)
as an expression in terms of z.
Solution: On partially differentiating u we see that
∂u/∂x = 2x + 2, ∂u/∂y = −2y
and hence that
∂²u/∂x² = 2, ∂²u/∂y² = −2.
Thus
∂²u/∂x² + ∂²u/∂y² = 0
and u is therefore harmonic. Next we need to find a function v for which u + jv is
analytic. From our theorems on the Cauchy–Riemann equations we need the first
partial derivatives to be continuous, and also
∂u/∂x = ∂v/∂y and ∂v/∂x = −∂u/∂y
But if the last set of equations holds, we must have that
∂v/∂y = 2x + 2 and ∂v/∂x = 2y
Since all of the first partial derivatives of u and v are then clearly continuous, the
function u + jv will then be analytic. Thus the solution to the equations ∂v/∂y = 2x + 2
and ∂v/∂x = 2y will indeed be the harmonic conjugate of u. To find v from the above
equations, we “partially integrate each equation”. For example if we “partially
integrate” ∂v/∂y = 2x + 2 with respect to y, then x should be regarded as a constant,
which means that the constant of integration can be a function of x. Thus from
∂v/∂y = 2x + 2 we get v = 2xy + 2y + g(x). Similarly it follows from ∂v/∂x = 2y that
v = 2xy + h(y). If we compare these expressions it is clear that 2y + g(x) = h(y),
or equivalently that
g(x) = h(y) − 2y.
Since here x and y are independent of each other, this equation can only hold if
both sides are constant. Thus
v = 2xy + 2y + c.
Finally notice that then
f(z) = u + jv = x² − y² + 2x + j(2xy + 2y + c).
Whenever z = x + j0, we have that f(x + j0) = x² + 2x + jc. Thus we suspect that for
general z we also have f(z) = z² + 2z + jc. It is now a simple matter to check
that these two expressions for f are indeed the same.
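The steps of this example can be mirrored symbolically (a sketch, again assuming the sympy library is available):

import sympy as sp

x, y = sp.symbols('x y', real=True)
u = x**2 - y**2 + 2*x

# u is harmonic: its Laplacian vanishes identically.
print(sp.diff(u, x, 2) + sp.diff(u, y, 2))            # 0

# Integrate dv/dy = du/dx with respect to y; the "constant" may depend on x.
g = sp.Function('g')
v = sp.integrate(sp.diff(u, x), y) + g(x)
print(v)                                              # 2*x*y + 2*y + g(x)

# The second Cauchy-Riemann equation dv/dx = -du/dy forces g'(x) = 0:
print(sp.simplify(sp.diff(v, x) + sp.diff(u, y)))     # Derivative(g(x), x)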

2.3.7. CONFORMAL MAPPINGS. With a good idea of what analyticity


is all about, we now come back to the question of mappings, and see what sorts
of properties mappings w = f (z) from the z-plane to the w-plane have when f is
analytic. A transformation w = f (z) is said to be conformal at a point z0 if f is
analytic at the point z0 , and f 0 (z0 ) 6= 0. If w = f (z) is conformal at all points of
some given region D, we say that f is conformal on D. The main theorem of this
section clarifies why conformality is so important.
Let C1 and C2 be any two contours intersecting at a point z0 = (x0 , y0 ) in
the z–plane, and let C10 and C20 be their images in the w-plane, intersecting at
w0 = f (z0 ) = (u0 , v0 ). We say that the mapping w = f (z) preserves both the
magnitude and sense of angles at z0 if the angle at (x0 , y0 ) between C1 and C2 is
equal to the angle at (u0 , v0 ) between C10 and C20 at (u0 , v0 ) both in magnitude and
direction.

Theorem 2.3.8. If w = f (z) is conformal at a point (x0 , y0 ), then the mapping


w = f (z) preserves both the sense and the magnitude of the angles at (x0 , y0 ). If
w = f (z) is conformal in a region D, then the mapping w = f (z) preserves both
the sense and the magnitude of angles at all points of D.
The transformation w = z² is analytic everywhere on C. Since dw/dz = 2z ≠ 0
precisely when z ≠ 0, it follows from the theorem that w = z² preserves the sense
and the magnitude of angles at all points z ≠ 0. Let us demonstrate this.
Example 2.3.9. Let R be the triangle ABC in the z–plane with A, B and C
the points (0, 2), (2, 0) and (2, 2). Let A0 , B 0 and C 0 be the images of these points.
Show that at each of these points, the angles subtended by the images of the line
segments AB, BC, and AC is equal in sense and magnitude to the angles at A, B,
and C. Also sketch the graphs in both planes.
Solution: The easy way to do this, is of course to use the theorem. The
transformation w = z² is analytic everywhere on C, and the derivative dw/dz = 2z is
non-zero precisely when z ≠ 0. So at all other points (including of course A, B, and C),
w = z² will preserve both the sense and the magnitude of angles. It is possible to
compute the angles directly (without using the theorem). Let's see how that works:
For w = z² we have u + jv = w = z² = (x + jy)² = x² − y² + j2xy, from which it
follows that
u = x² − y² and v = 2xy.
Then
A(0, 2) → A′(−4, 0)
B(2, 0) → B′(4, 0)
C(2, 2) → C′(0, 8)
The equation of the line segment AB is y = −x + 2 where 0 ≤ x ≤ 2. When
y = −x + 2, u = x² − (2 − x)² = 4(x − 1) and v = 2x(2 − x) = 4x − 2x². If we solve
for x from the first equation we get x = u/4 + 1. On substituting this into the second
equation, we arrive at v = (u + 4) − 2(u/4 + 1)² = 2 − u²/8. Thus the line y = −x + 2
maps onto the parabola
v = 2 − u²/8.

Figure 2.29

The equation of the line segment BC is x = 2 where 0 ≤ y ≤ 2. Now if x = 2,
then u = 4 − y² and v = 4y. Solving for y we get y = v/4. On substituting this into
the first equation, we then get the parabola
u = 4 − v²/16.
Thus the image of the line x = 2 is u = 4 − v²/16.
The equation of the line segment AC is y = 2 where 0 ≤ x ≤ 2. Now if y = 2,
then u = x² − 4 and v = 4x. Solving for x we get x = v/4. On substituting this into
the first equation, we then get the parabola
u = v²/16 − 4.
Thus the image of the line y = 2 is u = v²/16 − 4.
At the point C′(0, 8), the slope of the tangent to u = 4 − v²/16, is given by
m1 = dv/du |_{(0,8)}. To find dv/du in this case, we differentiate the equation u = 4 − v²/16
implicitly to get 1 = −(v/8) dv/du. Therefore dv/du = −8/v, whence m1 = −8/v |_{(0,8)} = −1. The
slope of the tangent to u = v²/16 − 4 at C′ is given by m2 = dv/du |_{(0,8)}. By a similar
argument to the one we used above, we can show that in this case dv/du = 8/v. So
m2 = 8/v |_{(0,8)} = 1. Now recall that the angle α between two lines, having gradients
m1 and m2 is π/2 radians whenever m1 m2 = −1, and given by the formula
α = tan⁻¹[(m1 − m2)/(1 + m1 m2)]
in other cases. In the specific case we are interested in, we have that m1 m2 = −1.
So the angle subtended at C′ is π/2 radians – the same as the angle Ĉ. It is clear
from the sketch that the sense of the angle at C′ is also the same as the one at C.
As v tends to 0 on the image of the line segment BC (so along the parabola
u = 4 − v²/16 moving from C′ to B′), the slope dv/du = −8/v will tend to −∞. Thus
this parabola has a vertical tangent at B′, which means that the angle it makes
with the u–axis is π/2 radians. By contrast the slope of the tangent to v = 2 − u²/8
at B′(4, 0) is given by m3 = dv/du |_{(4,0)} = −u/4 |_{(4,0)} = −1. Since the slope is −1, the
tangent has an angle of π/4 with the u-axis. So the magnitude of the angle between
the two tangents, is π/2 − π/4 = π/4. But this is exactly the same as the angle B̂. Once
again it is clear from the sketch that the sense of the angles at B and B′ are also
the same.

The fact that the angle subtended at A′ has the same magnitude and sense as
the one at A, follows along similar lines.

We finally demonstrate that w = z² does NOT preserve the magnitude of angles
at 0. To see this consider the x-axis (described by the equation y = 0) and the line
y = x in the z–plane. These two lines have an angle of π/4 radians with each other
at the origin. The mapping w = z² will map y = 0 onto the contour described by
v = 2x · 0 = 0 and u = x² ≥ 0 – that is onto the positive u-axis. The line y = x
gets mapped onto the contour described by u = x² − x² = 0 and v = 2x² ≥ 0 – the
positive v-axis. But the angle between the positive u-axis and the positive v-axis
is π/2 radians and no longer π/4 radians!
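Both phenomena can be illustrated numerically (a sketch): we compare the angle between two short steps out of a point with the angle between their images under w = z². Away from the origin the angle π/4 is preserved; at the origin it doubles to π/2.

import cmath

def image_angle(z0, h1, h2):
    f = lambda z: z**2
    w1 = f(z0 + h1) - f(z0)
    w2 = f(z0 + h2) - f(z0)
    return cmath.phase(w2/w1)

h1 = 1e-6
h2 = 1e-6*cmath.exp(1j*cmath.pi/4)     # a direction pi/4 away from h1
print(image_angle(1 + 1j, h1, h2))     # ~ pi/4 = 0.7853...  (conformal)
print(image_angle(0, h1, h2))          # ~ pi/2 = 1.5707...  (angle doubled)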

2.3.8. SINGULAR POINTS AND ZEROS. A point z0 is said to be a


singularity, or singular point, of f if f fails to be analytic at z0 , but nevertheless
does have other points arbitrarily close to z0 at which it is analytic. A zero of f (z)
is a point in the z–plane at which f (z) = 0.
2.3.8.1. Isolated singularities. The point z = z0 is called an isolated singularity
if we can find δ > 0 such that the circle |z − z0 | = δ encloses no other singular point
besides z0 . There are essentially three types of isolated singularities. We will discuss
each type in turn.
• Removable singularities:
The singularity z = z0 is removable if
lim f (x)
z→z0

exists. If say limz→z0 f (z) = w0 , then we can simply extend the function
f to a larger domain including z0 , and define its value at z0 to be w0 .
This extended function is then continuous at z0 as well. Thus by making
such an extension, the singularity can be “removed”. E.g. if
sin z
f (z) =
z
the singularity at z = 0 is removable since
sin z
lim =1
z→z0 z

Note: By using Maclaurin expansion of


z3 z5
sin z = z − + − ...
3! 5!
it follows that
sin z z2 z4
=1− + − ... = 1
z 3! 5!
when z 6= 0.
• Poles:
If we can find a positive integer n ∈ N such that
n
lim (z − z0 ) f (z) = A 6= 0,
z→z0

then z = z0 is called a pole of order n. For example


5z + 3
f (z) = 3 2
(2z − 3) (z − 2) (z + j)
has a pole of order 3 at z = 23 , a pole of order 2 at z = 2, and a simple
pole at z = −j. Functions having poles as their only singularities are called
meromorphic functions.
99 EMT4801/1

• Essential singularities:
An isolated singular point that is neither removable, nor a pole, is called
an essential singularity. E.g.
1
f (z) = e z+4
has an essential singularity at z = −4.
2.3.8.2. Singularities and branch cuts. What about singular points that are
NOT isolated? Is there anything we can say about them? We will not be very pre-
cise here, and will content ourselves with just a few brief comments aimed at giving
the reader some idea of how non-isolated singularities may arise in a natural way.
It turns out that any branch cut of a multi–valued function has lots of singularities
1
that are NOT isolated. Consider for example the multi–valued function w = z 2 .
1
Recall that if z = r∠θ where −π < θ ≤ π, then the principal branch of w = z 2 is
given by

 
θ
w0 = r∠ .
2
But what if z lies on the negative real axis? If say z0 = −2, then −2 = z0 = 2∠π
1 1 √ √ √
and by definition z02 = (−2) 2 = 2∠ π2 = 2(cos( π2 ) + j sin( π2 )) = 2j. However
it is also true that −2 = z0 = 2∠ − π. Now suppose we have a sequence given by
zn = 2∠θn , where −π < θn ≤ π and θn → −π as√n → ∞. √ Then clearly z√
n =
1
2∠θn → 2∠ − π = z0 as n → ∞. However (zn ) 2 = 2∠ θ2n → 2∠ − π2 = − 2j.
1 1
Thus although zn → z0 , for this branch cut {zn2 } does not converge to z02 . Thus
1
the principal branch of w = z 2 is not continuous at z0 . This point is therefore
a singularity. In fact all points on the negative real axis will be singularities for
this branch. Speaking very loosely we can say that all points on the “edge” of a
branch cut are singularities. Now if we make the branch cut somewhere else and
not on the negative real axis, most of the points on the negative real axis will no
longer be singularities. For example if instead of asking that −π < θ ≤ π we rather
ask that − 3π π
2 < θ ≤ 2 , then it is the points on the positive imaginary axis that
will now be singularities, and not those on the negative real axis. However there
are some singularities of a branch-cut that cannot be removed by making the cut
1
in a different place. For example for w = z 2 one can show that no matter where
we make the cut, 0 will always be part of the branch cut. Such singularities that
cannot be removed by making the cut in a different place, are called branch points.

2.3.8.3. L’Hôpital’s Rule. Isolated singularities are classified according to the


behaviour of different kinds of limits. So to be able to classify singularities, you
will need to be skilled at computing limits. With this in mind we now take the
time to present a very helpful tool in computing limits, namely L’Hôpital’s Rule. If
f (z) and g (z) are both analytic in a region containing the point z = z0 for which
f (z0 ) = g (z0 ) = 0, but with g 0 (z0 ) 6= 0, then
f (z) f 0 (z0 )
lim = 0
z→z0 g (z) g (z0 )
provided
f (z)
lim
z→z0 g (z)
0 ∞
leads to the indeterminate forms 0 or ∞.

sin z
Example 2.3.10. Determine limz→0 z .
100

Solution: Note that both sin(z) and w = z are analytic and that the limit
limz→0 sinz z leads to the indeterminate form 00 . Thus by L’Hôpital’s Rule we now
have that
sin z cos z 1
lim = lim = = 1.
z→0 z z→0 1 1

Example 2.3.11. Evaluate each of the following limits:
(a)  

 z
limπ z − e 3
z→ej 3 z3 + 1
(b)  14 
z +1
lim
z→j z 10 + 1
(c)
sin z 2
 
lim
z→0 1 − cos z
(d)  
1 − cos z
lim
z→0 z2
0
Note that each of the above lead to 0 . Hence try to apply L’Hôpital’s Rule.
Solution:
(a)
π π
(z − ej 3 )z 1.z + (z − ej 3 ).z
   
limπ = limπ
z→ej 3 z3 + 1 z→ej 3 3z 2
π 1
= ej 3 · 2π
3ej 3

1 −j π
= e 3
3
1 √ 
= 1−j 3
6
(b)
z 14 + 1 14z 13
  
lim = lim
z→j z 10 + 1 z→j 10z 9
= lim 1.4z 4

z→j
= 1.4
(c) On applying L’Hôspital’s rule we get
sin z 2 2z cos z 2
   
lim = lim .
z→0 1 − cos z z→0 sin z
But the right hand side is still an indeterminate form, and so we apply
the rule for a second time to get
sin z 2 2z cos z 2
   
lim = lim
z→0 1 − cos z z→0 sin z
2 cos z 2 − 4z 2 sin z 2
 
= lim
z→0 cos z
2−0
=
1
= 2
101 EMT4801/1

(d) In this case we once again need two applications of L’Hôspital’s rule to be
able to compute the limit. Specifically
   
1 − cos z sin z
lim = lim
z→0 z2 z→0 2z
h cos z i
= lim
z→0 2
1
=
2


Example 2.3.12. Locate the singularities in each of the following cases. Say
whether they are isolated or not, and classify those that are isolated.
z
(a) f (z) = (z2 +9) 2

ln(z+4)
(b) f (z) = (z 2 +4z+5)3
1
(c) f (z) = cosec z1 = 1

sin z

Solution:
z 2
(a) The function f (z) = (z2 +9) 2 has singularities where z = −9; that is
where z = ±j3. Both of these are isolated. Notice that
z z
f (z) = 2 = 2 2
2
(z + 9) (z + j3) (z − j3)
and hence that
2 z j3 j
lim (z − j3) f (z) = lim 2 = 2 =− 6= 0.
z→j3 z→j3 (z + j3) (j6) 12
Thus z = j3 is a pole of order two. Similarly z = −j3 can be shown to be
a pole of order two
(b) Consider
ln (z + 4)
f (z) = 3
2
(z + 4z + 5)
The given function is a multi-valued function for which the point z = −4
is a branch point. If we turn the function ln(z + 4) into a single valued
function by making a branch cut, z = −4 will not be isolated since all
points on the “edge” of that branch cut, including z = −4, will then be
singularities. (Notice that since 0 is a branch point of ln z, z + 4 = 0
will be a branch point of ln(z + 4).) There are also singularities where
z 2 + 4z + 5 = 0, i.e. where

−4 ± 16 − 20
z= = −2 ± j.
2
If the branch cut is made in such a way that neither of these points are on
the “edge” of the cut for ln(z + 4), they will then be poles of order three
of f . To see this notice that we may write
ln(z + 4)
f (z) =
(z − (−2 + j))3 (z − (−2 − j))3
from which it follows that for example
ln((−2 + j) + 4) ln(2 + j)
lim (z − (−2 + j))3 f (z) = = 6= 0.
z→−2+j ((−2 + j) − (−2 − j))3 (2j)3
102

(c)
1 1
f (z) = cosec = .
z sin z1
This equation has singularities at z = 0, as well as when sin z1 = 0, i.e.
when z1 = nπ. Thus at
1
z=0 and zn = n = ±1, ±2, . . . .

1
Since zn = nπ → 0 as n → ∞, the singularity z = 0 is not isolated. (There
are other singularities arbitrarily close to it.) By contrast the singularities
1
zn = nπ are isolated. Since for any n = ±1, ±2, . . .
1
z − nπ
 
1
lim1 z − · f (z) = lim1 1
z→ nπ nπ z→ nπ sin z
 
1
= lim1 1 1
z→ nπ − z 2 cos z
1
n 2 π2
= −
cos nπ
 2
n+1 1
= (−1)

6= 0,
the singularities are all simple poles (poles of order one).


2.3.9. ENGINEERING APPLICATION: A HEAT TRANSFER PROB-


LEM. The outer temperature of a cylindrical pipe is 0o C. Steam passes at 100o C
3
through an offset cylindrical hole. The radius of the inner circle is 10 of that of the
outer circle. If r is the radius of the pipe cylinder, the center of the offset hole is
3
on a radial line 10 r units from the centre of the cylinder. Find a function T (x, y)
giving the temperature at any point (x, y) in the pipe. (Since this is a heat transfer
problem, T needs to be a solution Laplace’s equation – that is a harmonic function.)

Figure 2.30

Solution: We need to find a function T (x, y) such that


∂2T ∂2T
+ =0
∂x2 ∂y 2
in the region between the circles |z − 0, 3| = and |z| = 1, with T = 100 on
|z − 0, 3| = 0, 3 and T = 0 on |z| = 1.
103 EMT4801/1

Figure 2.31

We can simplify our task by using bilinear mappings. Specifically the mapping
z−3
w=
3z − 1
transforms the circle |z| = 1 onto the circle |w| = 1 and the circle |z − 0, 3| = 0, 3
onto the circle |w| = 3. (See if you can show this yourself.) Thus to solve the
problem we first find a harmonic function Te(u, v) on the w–plane such that Te = 100
on |w| = 1 and Te = 0 on |w| = 3, and then determine T from this by using the
z−3
expression u + jv = w = 3z−1 to write u2 + v 2 in terms of x and y. Harmonic
functions with such axial symmetry as Te (that is with constant values on concentric
circles centred at the origin), have the general form

Te (u, v) = A ln u2 + v 2 + B

(where A and B are constants)

On the circle u2 + v 2 = 1, Te = 100, and on the circle u2 + v 2 = 9, Te = 0. Thus

100 = A ln 1 + B ⇒ B = 100,
100
0 = A ln 9 + 100 ⇒ A=− .
ln 9
Therefore
100 100 
ln u2 + v 2 + 100 = ln 9 − ln u2 + v 2 .
 
Te (u, v) = −
ln 9 ln 9
Next notice that
2
u2 + v 2 = |w|
z − 3 2

=

3z − 1
|(x − 3) + jy|2
=
|(3x − 1) + jy|2
2
(x − 3) + y 2
= 2
(3x − 1) + y 2
Thus
" #
2
100 (x − 3) + y 2
T (x, y) = ln 9 − ln 2
ln 9 (3x − 1) + y 2

104

EXERCISE 2.3.
1. Show that u = 2x (1 − y) is harmonic, find its harmonic conjugate v and
hence, express w = u + jv as an analytic function of z.
2
2. Verify that the functions ez , cosh 4z are analytic.
3. Determine which of the following functions v are harmonic. For each
harmonic v, determine another harmonic function u such that w = u + jv
is analytic, and express w = u + jv as a function of z.
(a) 2xy + 3xy 2 − 2y 3
(b) 3x2 y + 2x2 − y 3 − 2y 2
(c) e−2xy sin x2 − y 2
(d) xex cos y − yex sin y
h i
2 2
4. Prove that v = ln (x − 2) + (y − 1) is harmonic in every region ex-
cluding the point (2, 1) in the z–plane. Then find u such that u + jv will
be analytic and hence, express w = u + jv in terms of z.
5. (a) Sketch the lines y − 3x = 0 and y = 5 − x in the z–plane, as well as
their images in the w–plane, where w is the mapping w = z 2 .
(b) Show that the angles of intersection of the curves are the same in
both planes and explain why this is so.

6. (a) Sketch the curves |z − 3| = 2 and 2x + 3y = 7, as well as their
images in the w–plane, where w = z1 .
(b) Determine whether the circle and line intersect at the same angles as
their images. Justify each of your assertions.
2
7. Evaluate the limit limz→j zz4 −2jz−1
+2z 2 +1 .
8. Locate
√ and classify the singularities in each of the following cases:
(a) z 3 + z
ln (z + j3)
(b)
z3
1
(c) e z

z2 + 1
(d) 3
z2
9. Find the zeros and singularities of the following functions:
z−1
(a) 4
z − z 2 (1 + j) + j
sin (z − 1)
(b) 4
z − z 2 (1 + j) + j
10. (a) Show that
z+3
w=
z−3
maps the circle
1 + k2
x2 + y 2 + 6 x + 9 = 0, k 2 6= 1
1 − k2
in the z–plane onto the circle u2 + v 2 = k 2 in the w–plane.
(b) Two long cylindrical wires, each of radius 4 mm, are placed parallel
to each other with their axes 10 mm apart, as shown in the figure
below. The wires carry potentials of +V0 and −V0 . Show that the
potential V (x, y) at any point (x, y) is given by
2
V0 (x + 3) + y 2
V (x, y) = ln
ln 4 (x − 3)2 + y 2
105 EMT4801/1

Figure 2.32

11. A hole of radius 3 cm has been drilled through a metal block, with its
centre 5 cm above the base of the block (see the figure below). Steam
with a temperature of 100◦ C passes through the hole, and the base of the
block is supported on a slab of ice at 0◦ C.

Figure 2.33

Determine the image of both the circle and the x–axis under the
mapping
z + j4
w= ,
z − j4
and then show that the temperature T (x, y) at any point (x, y) in the
shaded region is given by
2
50 x2 + (y + 4)
T (x, y) = ln
ln 3 x2 + (y − 4)2
106

UNIT 4: COMPLEX INTEGRATION


2.4.1. OBJECTIVE. The introduction of contour integrals and of Cauchy’s
theorem and its consequences.

2.4.2. OUTCOMES. At the end of this unit the student should


• Know what a contour integral is and be able to compute some simple
contour integrals;
• Know when antiderivatives may be used to compute a complex integral,
and be able to do such a calculation when necessary;
• Know and be able to apply Cauchys theorem;
• Know and be able to apply Cauchys integral formulas to compute integrals
of complex functions of the from fp(z)
(z)
where p is a polynomial.

2.4.3. CONTOUR INTEGRALS. A complex valued function z(t) = x(t)+


jy(t) defined on some interval [a, b] in the real line, is said to be differentiable on
[a, b], if each of x(t) and y(t) are differentiable on [a, b] in the usual sense for real
functions. In such a case we set z 0 (t) = x0 (t) + jy 0 (t). A point set C in the complex
plane will be called a smooth curve (or smooth arc) if it is the range of such a
function z : [a, b] → C for which in addition
• z 0 (t) exists and is continuous on all of [a, b],
• z 0 (t) is never zero on [a, b]
The points z(a) and z(b) are respectively called the initial and terminal points of
the curve, and the function z(t) is called a parametrisation of the curve. The length
of such a smooth curve is determined by the formula
Z b
L= |z 0 (t)|dt,
a
0
p
where of course |z (t)| = x0 (t)2 + y 0 (t)2 .

Figure 2.34. Smooth curve

If the initial and terminal points z(a) and z(b) of a smooth curve equal each
other, the curve is said to be closed.
If now in addition the points z(t) where a < t < b are all different, this means
that as t varies from a to b, z(t) will not cross the same point twice as it travels
from z(a) to z(b) along C (ie. the curve does not intersect itself). Such a curve is
called a simple curve (or a simple closed curve if indeed it is closed). A contour is
an curve made up of a finite number of smooth curves joined end to end. For this
reason contours are sometimes also called piecewise smooth curves.
107 EMT4801/1

Figure 2.35. Closed curve intersecting itself

Figure 2.36. Simple closed curve

We note finally that if we are given a parametrisation z : [a, b] → C of a


contour C, then this parametrisation directs the contour in the sense that it traces
the contour C in a specific direction. As t varies from a to b, the values z(t) trace
out the contour with a definite starting point at z(a), and finishing point at z(b).
So we may then think of C as a contour running from z(a) to z(b). When in
this way we assign the idea of dircetion to a contour C, we will call C a directed
contour. It is of course a simple matter to reverse the direction. Specifically if the
parametrisation z : [a, b] → C traces out C from z(a) to z(b), then the function
ze : [a, b] → C defined by ze(t) = z((a + b) − t) will trace out the same contour, but
from ze(a) = z(b) to ze(b) = z(a). If C is a directed contour, the the same contour
with the direction reversed, will be denoted by −C.
Given a directed contour C from say α to β, and a complex function f which
is piecewise continuous on C, we may define the contour integral of f on C to be
Z Z b
f (z)dz = f (z(t))z 0 (t)dt
C a

where z : [a, b] → C is a (piecewise smooth) parametrisation of C which traces


C from α to β. What is not so clear is that the value of this integral does not
actually depend on the parametrisation! If we had used a a different piecewise
smooth parametrisation
R of C also tracing C from α to β, it would’ve yielded the
same value for C f (z)dz. Changing the direction of the the contour C will have
the effect of changing the value of the integral by a factor -1. Specifically we have
108

that Z Z
f (z)dz = − f (z)dz.
−C C
(See if you can prove this.) If M is an upper bound for |f | on C (ie. |f (z)| ≤ M
for every z ∈ C), then Z

f (z)dz ≤ M L

C
where L is the length of C. If C is made of contours C1 , C2 , . . . , Cn joined end to
end, then
Z Z Z Z
f (z)dz = f (z)dz + f (z)dz + · · · + f (z)dz.
C C1 C2 Cn
R
In closing we note that it is in fact also possible to express C f (z)dz in terms
of line integrals of real valued functions on R2 . To see this let z(t) = x(t) + jy(t)
be a function parametrising C, and suppose that f (z) = u(x, y) + jv(x, y), where
u and v are the real and imaginary parts of f . Then
Z Z b
f (z) dz = f (z(t))z 0 (t)dt
C a
Z b
= (u(x(t), y(t)) + jv(x(t), y(t)))(x0 (t) + jy 0 (t))dt
Za
= (u + jv)(dx + jdy)
ZC Z
= (udx − vdy) + j (vdx + udy) .
C C
R
By using this form of C f (z)dz, many of the theorems about real line integrals,
can now be applied to complex contour integrals.

Example 2.4.1. Evaluate Z


z 2 dz
C
along the path C from 0 + j0 to 2 + 8j where C is composed of
(a) the line segments, first from 0+j0 to 2+j0 and then from 2+j0 to 2+8j,
(b) the line joining 0 + j0 to 2 + 8j and
(c) the parabola y = 2x2 joining 0 + j0 to 2 + 8j.

Figure 2.37
109 EMT4801/1

Solution:
(a) Let C1 be the line segment from 0 + j0 to 2 + j0, and C2 the line segment
from 2 + j0 to 2 + 8j. For the function f (z) = Rz 2 we have z 2 = (x2 −
y 2 ) + j2xy where z = x + jy. Thus if we write C z 2 dz in terms of real
line integrals, we get
Z Z Z
z 2 dz = [(x2 − y 2 )dx − 2xydy] + j [2xydx + (x2 − y 2 )dy].
C C C

R R
We will use this formula to separately compute C1 f (z)dz and C2 f (z)dz.
Now along AB (that is on C1 ), we have y = 0 (thus dy = 0), with
0 ≤ x ≤ 2. Thus
Z Z Z
z 2 dz = [(x2 − y 2 )dx − 2xydy] + j [2xydx + (x2 − y 2 )]dy
C1 C1 C1
Z Z
2
= [(x − 0)dx − 0] + j [0dx + 0]
C C1
Z 1
= x2 dx
C1
Z2
= x2 dx
0
8
=
3

Along BC (that is C2 ), we have x = 2 (so dx = 0), with 0 ≤ y ≤ 8. So in


this case
Z Z Z
z 2 dz = [(x2 − y 2 )dx − 2xydy] + j [2xydx + (x2 − y 2 )dy]
C2 C2 C2
Z Z
= [0 − 2(2y)dy] + j [0 + (22 − y 2 )dy]
C2 C2
Z8 Z 8
= (−4y)dy + j (4 − y 2 )dy
0 0
   8
2 1 3
= −2y + 4y − y
3
  0
512
= −128 + j 32 −
3
416
= −128 − j
3

Thus
Z Z Z
2 2
z dz = z dz + z 2 dz
C C1 C2
376 416
= − −j
3 3

(b) The equation of the line joining 0+j0 and 2+8j is y = 4x where 0 ≤ x ≤ 2.
From this equation it is clear that the function z(x) = x + j4x, where
110

x ∈ [0, 2], parametrises C in this case. Therefore


Z Z 2
2
z dz = z(x)2 z 0 (x)dx
C 0
Z 2
= (x + j4x)2 .(1 + j4)dx
0
Z 2
= (1 + 4j)3 x2 dx
0
x3 2
= (1 + 4j)3|
3 0
376 416
= − −j
3 3
(c) The points on the segment of the parabola y = 2x2 from 0 + j0 to 2 + 8j
are all of the form z = x + j2x2 where 0 ≤ x ≤ 2. Thus in this case the
function z(x) = x + j2x2 , where x ∈ [0, 2] parametrises C. Therefore
Z Z 2
z 2 dz = z(x)2 z 0 (x)dx
C 0
Z 2
= (x + j2x2 )2 .(1 + j4x)dx
0
Z 2
= [(x2 − 20x4 ) + j(8x3 − 16x5 )]dx
0
 2
8x6
  
1 3 5 4
= x − 4x + j 2x −
3 3 0
   
8 512
= − 128 + j 32 −
3 3
376 416
= − −j
3 3
The result is the same in all three cases. We shall soon see that this no coincedence,
but rather a property shared by all analytic functions. 
2.4.4. SIMPLY– AND MULTIPLY–CONNECTED DOMAINS. A
subset R of the complex plane will be called open if no part of the boundary of R
actually belongs to R. An open set R is said to be connected if it has the additional
property that any two points z1 , z2 in R can be joined by a path consisting of a
finite number of line segments joined end-to-end, all of which lie completely in R.
Such an open and connected set is called a domain. A set which consists of a
domain with some, all, or none of its boundary points added, is called a region. A
domain R in the complex plane is called simply–connected if every simple closed
curve which lies in R, can be shrunk to a point without any part of it ever leaving
R. Thus simply–connected domains, are domains without “holes”. A domain which
is not simply–connected is called multiply–connected.
In figure 2.38(a) τ lies inside R, and any such τ can be shrunk to a point
without leaving R. Hence, |z| < 3 is simply–connected. In figure 2.38(b), although
τ lies inside R (the annulus 1 < |z| < 3), it cannot be shrunk to a point without
leaving R. Hence, in this case R is not simply–connected, but multiply–connected.
111 EMT4801/1

Figure 2.38.

In closing we take note of the following very important convention: We say


that the boundary C of a region is traversed in the positive sense if an observer
travelling along the boundary in the specified direction, always has the region on
his left. Thus, if the region is a disc for which the boundary is some given circle,
the positive direction will be counter–clockwise along the circle. If the region is an
annulus, i.e. a region lying between two concentric circles, then the boundary will
of course be in two parts consisting of the inner circle and the outer circle. The
positive direction for the outer circle will be counter–clockwise, whereas for the
inner circle it will be clockwise. Any closed curve can of course be considered to be
the boundary of the region it encloses. For such a closed curve
H C, integration around
it in the positive sense is often indicated by the symbol . Unless clearly otherwise
stated, we will generally assume the direction of closed curves to be positive. Thus
if C is traversed in the negative sense, the direction must be clearly indicated.

2.4.5. ANTIDERIVATIVES. Given a continuous function f on some do-


main D, we say that a function F on D is an antiderivative of f on D, if we have that
F 0 (z) = f (z) for all z in D. In the theory of real-valued functions on the real line,
we learnt that all continous functions on some interval [a, b] have antiderivatives,
Rb
and also that integrals like a g(x)dx can be evaluated by means of antiderivatives;
Rb
specifically if G is an antiderivative of g on [a, b], then a g(x)dx = G(b) − G(a).
So what about complex functions? Do all continuous complex functions have an-
tiderivatives? Can we generally use antiderivatives to evaluate complex contour
integrals? The good news is that if a continuous function f does have an anti-
derivative on a domain D, then we can use that antiderivative to compute contour
integrals.

Proposition 2.4.2. Let f be a function which has an antiderivative F on a


domain D, and let C be a directed contour with initial point w0 , and terminal point
w1 , which lies completely in D. Then
Z
f (z)dz = F (w1 ) − F (w0 ).
C

It is not difficult to prove this result. Let’s consider the case where C is a
smooth contour, and z : [a, b] → C a parametrisation of C (so z(a) = w0 and
z(b) = w1 ). Then we can use the theory of functions of a real variable to conclude
112

that
Z Z b
f (z)dz = f (z(t)).z 0 (t)dt
C a
Z b
= F 0 (z(t)).z 0 (t)dt
a
= F (z(t))|ba
= F (z(b)) − F (z(a))
= F (w1 ) − F (w0 )

However there is some bad news as well. For complex functions not every
continuous function has an antiderivative! In fact the class of functions that do
have antiderivatives, is quite special. There are even some very nice elementary
functions that do NOT have antiderivatives. (For example it can be shown that the
function f (z) = z −1 does NOT have antiderivative on the domain {z ∈ C|z 6= 0}.)
So which functions do have antiderivatives? The following theorem goes some way
to answering that question.

Theorem 2.4.3. Let f be a continuous function defined on some domain D.


Then the following are equivalent:
(i) f has an antiderivative F on D. R
(ii) For any closed contour lying completey in D, we have that C f (z)dz.
(iii) For any two points w0 and w1 in D, the integrals along contours running
from w0 to w1 lying completely in D, all have the same value.

We will not prove this result, but merely sketch how it can be proved. In this
regard note that the implication (i) ⇒ (ii) is clear. If C is a closed contour, then
it starts and finishes at the same point. Let R that point be w. So if (i)
R holds and
if we use the antidrivative F to integrate C f (z)dz, we should get C f (z)dz =
F (w) − F (w) = 0.
Secondly if (ii) holds then (iii) should also hold. To see this let C1 and C2 be
any two contoursin D running from w0 to w1 . If we reverse the direction of C2 (the
reversed contour is denoted by −C2 ),R then C1 ∪−C2 forms R a closed contour.
R There-
fore if (ii) holds we must have 0 = C1 ∪−C2 f (z)dz = C1 f (z)dz + −C2 f (z)dz =
R R
C1
f (z)dz − C2 f (z)dz. But this means that (iii) must then hold.
Finally if (iii) holds, then since contour integrals only depend onRthe endpoints
w
of the contour, and not the actual path followed, we may simply write w01 f (z)dz for
an integral along a contour running from w0 to w1 . We now define the antiderivative
F as follows: Pick a point w0 ∈ D atRrandom. For any other point w ∈ D we then
w
define the value F (z) to be F (w) = w0 f (z)dz. To show that (i) holds, it is then
0
a simple matter of checking that F (w) = f (w).
In truth although the above theorem does provide us with different ways of
saying the same thing (that is that a function has an antiderivative), it still does
not help to identify a large class of functions which actually do have antiderivatives.
This gap is filled by Cauchy’s remarkable theorem, which we shall discuss in the
next subsection.

2.4.6. CAUCHY’S THEOREM. We now come to arguably the most im-


portant theorem on integration of complex functions. This is the theorem known
as Cauchy’s theorem, which makes the following claim:
113 EMT4801/1

Theorem 2.4.4 (Cauchy’s Theorem). If a complex function f is analytic at


every point inside and on a simply closed contour C, then
Z
f (z)dz = 0.
C
Cauchy’s proof of this theorem was only valid for the case where f 0 (z) is con-
tinuous inside and on C. Later Goursat presented another proof in which this
restriction was lifted, and for this reason this theorem is often also referred to as
the Cauchy–Goursat theorem. The proof for the most general case is quite techni-
cal, and outside the scope of this course. We do however show how the result can be
proved in the case considered by Cauchy (where f 0 (z) is assumed to be continuous
inside and on C).
In proving the result we will make use of Green’s theorem for the plane, which
states that
Theorem 2.4.5 (Green’s Theorem). If P (x, y) and Q (x, y) are continuous and
have continuous partial derivatives in a region R and on its boundary C, then
I Z Z  
∂Q ∂P
P dx + Qdy = − dx dy
C R ∂x ∂y
Since f (z) = u + jv is analytic inside and on C, it is also continuous there.
It follows that each of u and v is continuous inside and on C. Similarly since the
derivative f 0 (z) is continuous inside and on C, so are the first partial derivatives of
u and v. Now recall that
Z Z Z
f (z)dz = (udx − vdy) + j (vdx + udy) .
c C C
If we let R denote the interior of the contour C, and apply Green’s theorem to each
of the integrals on the right-hand side, we get that
Z Z Z   Z Z  
∂v ∂u ∂u ∂v
f (z)dz = − − dx dy + j − dx dy.
C R ∂x ∂y R ∂x ∂y
But since f is analytic throughout R, the Cauchy–Riemann equations
∂u ∂v ∂u ∂v
= , =−
∂x ∂y ∂y ∂x
hold throughout R. In view of this fact, we have that
∂v ∂u ∂u ∂v
− − = 0, − =0
∂x ∂y ∂x ∂y
throughout R. Applying this to the integral equation above, we conclude that
Z
f (z)dz = 0
C
With a bit of effort, Cauchy’s theorem can be extended to obtain the following
more general statement:
Theorem 2.4.6. If a complex function f is analytic on a simply connected
domain D, then Z
f (z)dz = 0
C
for every closed contour C lying in D.
If we compare this theorem to the one in the previous section, it follows that if a
function f is analytic on a simply connected domain D, then f has an antiderivative
on D.
A remarkable theorem of Morera demonstrates that analytic functions are in
fact the only continuous functions for which the above theorem holds.
114

R Theorem 2.4.7 (Morera’s theorem). If f (z) is continuous in a domain D and


if C
f (z)dz = 0 around every closed contour in D, then f is analytic in D.
Thus in the case where D is simply connected, Morera’s theorem is a kind of
converse to Cauchy’s theorem.

2.4.7. SOME CONSEQUENCES OF CAUCHY’S THEOREM.


Theorem 2.4.8. If f is analytic on a domain D, then for any two points a and
b in D, the integrals along contours in D running from a to b, all have the same
value.
Note: In a case like the one described above where the value of integral does
Rb
not depend on the path followed from a to b, we will often write a f (z)dz for an
integral along a contour from a to b.
Proof: Let C1 and C2 be any two contours running from a to b. If we reverse
the direction of say C2 (this reversed contour is of course denoted by −C2 ), then C1
together with −C2 forms a closed contour lying entirely in D. So if C = C1 ∪ −C2 ,
then I
f (z) dz = 0
C
But
Z Z Z Z Z
f (z)dz = f (z)dz + f (z)dz = f (z)dz − f (z)dz.
C C1 −C2 C1 C2
Thus if we combine the above two facts, we get that
Z Z
f (z)dz = f (z)dz
C1 C2
as required.

Figure 2.39


Theorem 2.4.9. Let C and C1 be two positively oriented simple closed contours
with C1 inside C. If f (z) is analytic on the region consisting of these two contours
and all points between them, then
Z Z
f (z)dz = f (z)dz.
C C1

Proof: To see why this result holds, consider the above sketch very carefully.
Make a “cut” from a point A on C, to a point B on C1 . The line segment from A
to B will be denoted by AB and the one from B to A, by BA. We now define a
new contour C0 consisting of
115 EMT4801/1

• C from A to A,
• the line segment AB,
• C1 in NEGATIVE (clockwise) direction from B to B,
• the line segment BA.
(That is C0 = C ∪ AB ∪ −C1 ∪ BA.) The result is then a simply closed contour for
which the interior is really just the region between C and C1 .

Figure 2.40

Since f is analytic in this region, we can apply Cauchy’s theorem to conclude


that Z
f (z)dz = 0.
C0
But
Z Z Z ZZ
f (z)dz = f (z)dz + f (z)dz +
f (z)dz + f (z)dz
C0 C AB −C1 BA
Z Z Z Z
= f (z)dz + f (z)dz − f (z)dz − f (z)dz
C AB C1 AB
Z Z
= f (z)dz − f (z)dz
C C1
R R
Combining these two equations now yields C
f (z)dz = C1
f (z)dz as required. 
Remark 2.4.10. It is a simple matter to extend the above result to the case
where we have a simply closed contour C containing n disjoint simply closed con-
tours where f is assumed to be analytic on each of these contours and at each point
in the region between them. In this case we get that
Z Z Z Z
f (z)dz = f (z)dz + f (z)dz + · · · + f (z)dz
C C1 C2 Cn

Figure 2.41
116

R
Example 2.4.11. Evaluate C
zdz from (0, 0) to (4, 2) where C is:
(a) the curve given by z(t) = t2 + jt where 0 ≤ t ≤ 2;
(b) the line segment from (0, 0) to (0, 2), followed by the line segment from
(0, 2) to (4, 2) .
Solution:
(a) In this case the contour is parametrised by the function z(t) = t2 + jt
where 0 ≤ t ≤ 2. So
Z Z 2
zdz = z(t).z 0 (t)dt
C 0
Z 2  
= t2 + jt (2t + j) dt
0
Z 2
t2 − jt (2t + j) dt

=
0
Z 2
2t3 + t − jt2 dt

=
0
  2
1 4 1 2 j 3
= t + t − t
2 2 3 0
8
= 10 − j
3
(b) We firstly note that
Z Z Z
zdz = zdz + zdz
C C1 C2

where C1 is the line segment from (0, 0) to (0, 2), and C2 the line segment
from (0, 2) to (4, 2). If f (z) = z, then u = x, and v = −y.R So when
expressed in terms of real line integrals, the contour integral C1 zdz be-
comes
Z Z Z Z
zdz = (x − jy)(dx + jdy) = (xdx + ydy) + j (xdy − ydx).
C1 C1 C1 C1

On C1 we have x = 0 and dx = 0. So the above simplifies to


Z Z 2 2
1 2
zdz = ydy = y = 2.
C1 0 2 0
On the line segment C2 from (0, 2) to (4, 2), we have y = 2 (therefore
dy = 0) and 0 ≤ x ≤ 4. So by a similar argument to the one above, we
have
Z Z Z
zdz = (xdx + ydy) + j (xdy − ydx)
C2 C2 C2
Z 4 Z 4
= xdx − j 2dx
0 0
4
1 2
= x − j2x
2 0
= 8 − j8
Thus Z Z Z
zdz = zdz + zdz = 10 − j8.
C C1 C2

117 EMT4801/1

We see that the results


R of (a) and (b) differ, so even if the endpoints
are fixed, the value of C zdz nevertheless does depend on the path. This
is because f (z) = z is not analytic.

Example 2.4.12. Given a positively oriented contour C, evaluate


Z
dz
C z −a
when
(a) a is inside C;
(b) a is outside C.
Solution:
1
(a) f (z) = z−a has a simple pole at z = a.

Let C be any contour containing a. Then for  > 0 small enough, the
circle C1 with radius  and centre at a, will be inside C. Assume that this
1
is the case. Since z−a is analytic on C and C1 , and also on the region
between these two contours, we have that
Z Z
dz dz
=
C z − a C1 z −a
The circle with centre a and radius  can of course be parametrised by
z(θ) = a + (cos(θ) + j sin(θ)) = a + ejθ where θ ∈ [0, 2π]. Thus
Z Z
dz dz
=
C z − a C1 z −a
Z 2π
1
= .jejθ dθ
0 a − (a + ejθ )
Z 2π
= jdθ
0

= jθ|0
= 2πj
1
(b) If a is outside C, then z−a is analytic everywhere inside and on C. Thus
in this case it will follow from Cauchy’s theorem that
Z
dz
= 0.
C z −a


Example 2.4.13. Generalise the result of part (a) of the previous example to
the case I
dz
n
C (z − a)
where n = 1, 2, 3, 4, . . . .
118

Solution: Since the case n = 1 was already dealt with in the previous example,
we may assume that n ≥ 2. Let C be a positively oriented contour containing a,
and as before let C1 be circle inside C with radius  and centre a. Arguing as before
we have that
Z Z
dz dz
n = n
C (z − a) C1 (z − a)
Z 2π
1
= jθ ))n
.jejθ dθ
0 (a − (a + e
Z 2π
jejθ
= n dθ
0 (ejθ )
Z 2π
j
= ejθ(1−n) dθ
n−1 0

1−n jθ(1−n)
= e
(1 − n)
0
1−n h
 i
= ej2π(1−n) − 1
(1 − n)
= 0
(Recall that ej2π(1−n) = cos(2π(1 − n)) + j sin(2π(1 − n)) = 1 + j0 for any integer
n.) Thus in summary
I 
dz 0 if n 6= 1
n =
C (z − a) 2πj if n=1

Remark 2.4.14. We could have used the theory of antiderivatives in the pre-
vious example. Suppose that n is an integer with n 6= 1, and let D be the domain
1 1
{z ∈ C|z 6= a}. For such an n, the function F (z) = 1−n (z−a)n−1 is defined and dif-
0 1 dz
R
ferentiable for all z 6= a, with F (z) = (z−a)n . Thus by theorem 2.4.3, C (z−a)n = 0
for any closed contour lying completelyR in D. But C will lie completely in D pre-
dz
cisely when the point a is not on C. So C (z−a) n = 0 for any contour not containing

a.
By contrast we saw in example R dz2.4.12 that there are some closed contours in
D = {z ∈ C|z 6= a} for which C z−a 6= 0. By theorem 2.4.3 this can only mean
1
that z−a does NOT have an antiderivative on {z ∈ C|z 6= a}.
It is tempting to think that the function F (z) = ln(z − a) is an antiderivative of
1
z−a on D. However keep in mind that F (z) = ln(z − a) is a multivalued function,
and that we need to make a branch cut in order to turn ln(z −a) into a single-valued
function. At all points on the “edge” of this branch cut, ln(z−a) is not differentiable.
d 1
So the problem with ln(z − a) is that although we do have that dz (ln(z − a)) = z−a
for some of the points of D, this formula does not hold for all the points of D.

Example 2.4.15. Evaluate


Z
zdz
C (z − 1) (z + 2j)
where C is any contour enclosing
(a) none of the singularities;
(b) z = −2j, but not z = 1;
(c) both singularities.
119 EMT4801/1

Solution: The first thing we can do is to simplify our task by using partial
fractions. Specifically if
z A B
= + ,
(z − 1)(z + 2j) z − 1 z + 2j
then z = A(z + 2j) + B(z − 1). For z = 1 we then get 1 = A(1 + 2j) + 0, and for
1−2j
z = −2j we get −2j = 0 + B(−1 − 2j). Thus A = 1+2j 1 1
= 1+2j . 1−2j = 1−2j
5 and
2j 2j 1−2j 4+2j
B = 1+2j = 1+2j . 1−2j = 5 . Therefore
z 1 − 2j 1 4 + 2j 1
= + .
(z − 1)(z + 2j) 5 z−1 5 z + 2j
(a) Since in this case C encloses none of the singularities, f (z) is analytic
inside and on C. Thus by Cauchy’s theorem
Z
zdz
=0
C (z − 1) (z + 2j)
(b) Here we may write
1 − 2j
Z Z Z
zdz 1 4 + 2j 1
= dz + .
C (z − 1) (z + 2j) 5 C z−1 5 C z + 2j
R 1 R 1
By example 2.4.12 C z−1 dz = 0, and C z+2j = j2π. So
Z
zdz 4 + 2j 4π
=0+ .j2π = (−1 + 2j).
C (z − 1) (z + 2j) 5 5
R 1
(c) In this case we may conclude from example 2.4.12 that C z−1 dz = j2π,
1
R
and C z+2j = j2π. So here
1 − 2j
Z Z Z
zdz 1 4 + 2j 1
= dz +
C (z − 1) (z + 2j) 5 C z−1 5 C z + 2j
1 − 2j 4 + 2j
= (j2π) + (j2π)
5 5
= j2π


Example 2.4.16. Evaluate


Z
12z 2 − 4jz dz,

C

where C is the curve y = x − 3x2 + 4x − 1, joining (1, 1) and (2, 3) .


3

Solution: The function F (z) = 4z 3 − 2jz 2 is an antiderivative of f (z) =


2
12z − 4jz on all of C. But by theorem 2.4.3 the integral then only depends on
the endpoints of C, and not the path followed between these points. Thus we may
replace C by a simpler path (with the same endpoints) if we wish. The quickest
way to compute this integral is to use proposition 2.4.2 to write
Z
12z 2 − 4jz dz = 4z 3 − j2z 2 |2+j3
 
1+j
C
h i h i
3 2 3 2
= 4 (2 + j3) − j2 (2 + j3) − 4 (1 + j) − j2 (1 + j)
= [−160 + j46] − [−4 + j8]
= −156 + j38

120

2.4.8. CAUCHY’S INTEGRAL FORMULA. We now come to a very


elegant formula of Cauchy.
Theorem 2.4.17. Let f be analytic everywhere inside and on a simple closed
contour C taken in the positive sense. Then for any point a inside C, we have
Z
1 f (z)
f (a) = dz.
2πj C z − a
We merely outline the the proof without going into R any detail.R To prove
this theorem one first proves by careful analysis that C fz−a (z)
dz = C fz−a (a)
dz =
R 1 R 1
f (a) C z−a dz. Once this is done the next step is use the fact that C z−a dz = 2πj
to conclude that C fz−a
R (z)
dz = 2πjf (a).
There is one very important consequence of this formula, which we now point
out. Let C be a simple closed contour with D the interior of this contour,
R f (z) and let f
1
be analytic on C and its interior. Consider the function w → 2πj C z−w
dz defined
on D. It is not difficult to see that this function has derivatives of all orders on D
with
dn dn
 Z  Z   Z
1 f (z) 1 f (z) n! f (z)
n
dz = n
dz = dz.
dw 2πj C z − w 2πj C dw z−w 2πj C (z − w)n+1
1
R f (z)
But by Cauchy’s integral formula, the function w → 2πj C z−w
dz agrees with the
function w → f (w) on D. So we have proved that f has derivatives of all orders
on D, and that
Z
(n) n! f (z)
f (w) = dz
2πj C (z − w)n+1
for all w ∈ D. By refining this argument we may arrive at the following conclusion:
Theorem 2.4.18. Let f be analytic on a domain D. Then f has derivatives
of all orders on D. Moreover each of these derivatives is also analytic on D. In
addition if f (z) is analytic inside and on a simply closed contour C, then for any
point a inside C, we have
Z
n! f (z)
dz = f (n) (a), n = 0, 1, 2, 3, . . .
2πj C (z − a)n+1
(Here we use the convention that 0! = 1.)
If the formula in the above theorem is rewritten as
Z
f (z) 2πj (n)
n+1
dz = f (a),
C (z − a) n!
it can be used to compute a large number of complex integrals.

Example 2.4.19. Use Cauchy’s integral formula to evaluate each of


R sin πz 2 + cos πz 2
(a) C dz, and
(z − 1)(z − 2)
e2z
R
(b) C (z+1) 4 dz

where C is the circle |z| = 3.

Solution:
(a) Using partial fractions we may write
1 1 1
= − .
(z − 1) (z − 2) z−2 z−1
121 EMT4801/1

So
sin πz 2 + cos πz 2 sin πz 2 + cos πz 2 sin πz 2 + cos πz 2
Z Z Z
dz = dz − dz.
C (z − 1) (z − 2) C z−2 C z−1
Since both z = 1 and z = 2 are inside |z| = 3, it will now follow from
Cauchy’s integral formula that
sin πz 2 + cos πz 2 sin πz 2 + cos πz 2 sin πz 2 + cos πz 2
Z Z Z
dz = dz − dz
C (z − 1) (z − 2) C z−2 C z−1
= 2πj [sin 4π + cos 4π] − 2πj [sin π + cos π]
= 2πj [0 + 1] − 2πj [0 − 1]
= 4πj.
(b) The point z = −1 lies inside |z| = 3. So we may use the formulas proved
above to conclude that
e2z 2πj d3
Z
2z

4 dz = 3! dz 3
e
C (z + 1) z=−1
2πj  −2 
= 8e
3!
8πj
=
3e2

EXERCISE 2.4.
R (2,5)
1. (a) Evaluate (0,1) (3x + y) dx + (2y − x) dy along
(i) the curve y = x2 + 1,
(ii) the straight line joining (0, 1) and (2, 5) ,
(iii) the contour made up of line segments from (0, 1) to (0, 5) and
then from (0, 5) to (2, 5),
(iv) the contour made up of line segments from (0, 1) to (2, 1) and
then from (2, 1) to (2, 5.).
[(i) 88/3, (ii) 32, (iii) 40, (iv) 24]
H
(b) Evaluate C (x + 2y) dx + (y − 2x) dy around the ellipse x = 4 cos θ,
y = 3 sin θ where 0 ≤ θ ≤ 2π. [−48π]
H 2
(c) Evaluate C |z| dz around the square with vertices at (0, 0) , (1, 0) ,
(1, 1) and (0, 1) . [−1 + j]
R 2

(d) C z + 3z dz, along
(i) the circle from (2, 0) to (0, 2) ,
(ii) the straight line from (2, 0) to (0, 2) and
(iii) the contour made up of line segments from (2, 0) to (2, 2) and
then from (2, 2) to (0, 2) .
[−44/3 − j8/3 in all cases]
R 2−j 
(e) Evaluate j 3xy + jy 2 dz, along the curve x = 2t − 2, y = 1 +
 −1 79

t − t2 , (t ∈ [1, 2]). 3 + j 30
H 2
(f) Evaluate C (z) dz around the circle
(i) |z| = 1 and
(ii) |z − 1| = 1
[(i)0, (ii)4πj]
Z
dz
(g) Evaluate around
C −2
z
(i) |z + 2| = 3,
122

(ii) the square with vertices at 2 ± 2j and −2 ± 2j.


H
2. Evaluate C (5x + 6y − 3) dx + (3x − 4y + 2) dy, where C is the triangle
with vertices at (0, 0) , (4, 0) and (4, 3) . [−18]
3. (a) Let C be any simple closed curve bounding a region having area A.
Prove that Z
1
A= xdy − ydx
2 C
(b) Hence, find the area bounded by the ellipse x = a cos θ, y = b sin θ
(0 ≤ θ ≤ 2π). [πab]
H 2 y
 y
4. (a) Prove that C y cos x − 2e dx + (2y sin x − 2xe ) dy = 0 around
any simple closed curve C.
(b) Evaluate the integral in (a) along the parabola y = x2 fromh (0, 0) toi
 2
π, π 2 . −2πeπ
(Hint: See if you can replace y = x2 with a simpler contour.)
5. Verify Cauchy’s theorem for the functions
(a) f (z) = 3 sin 2z if C is the square with vertices at ±2 ± 2j,
(b) f (z) = z 3 − jz 2 − 5z + 2j if C is the circle |z| = 1.
H
6. Determine C z − 3dz where C is the circle |z − 2| = 5. Does your answer
contradict Cauchy’s theorem? Explain.
Show that C e−2z dz is independent of the path C joining
R
7.  1 −2 the points
1 − e−2

(1, −π) and (2, 3π) and then determine its value. 2e
8. Evaluate Z
2zdz
,
C (2z − 1) (z + 2)
where C is the circle
(a) |z| = 1,
(b) |z| = 3. [(a) 52 πj, (b) 2πj]
9. Evaluate Z
5zdz
,
C (z + 1) (z − 2) (z + 4j)
where C is the circle
(a) |z| = 3,
(b) |z| = 5. [(a) 4π
17 (9 + 2j) , (b) 0]
10. Use Cauchy’s integral formulas to evaluate each of the following:
z3 + z
Z  
3πj
(a) 3 dz, where C is the circle |z| = 1. −
C (2z + 1) 8
Z
4z
(b) 3 dz, where C is the circle |z| = 3. [0]
C (z − 1) (z + 2)
ez
Z
(c) dz, where C is the circle
Cz−2
(i) |z| = 3,  
(ii) |z| = 1. 2πje2 , 0
Z
sin 3z
(d) π dz, where C is |z| = 5. [2πj]
C + 2
z
ejz
Z
(e) 3
dz, where C is |z| = 2. [−πj]
C z
sin6 z
Z  
πj
(f) (i) π dz if C is |z| = 1,
C z − 6 32
123 EMT4801/1

sin6 z
Z  
21πj
(ii)  dz if C is |z| = 1.
π 3
C z− 16
6
ezt
I  
1 1
(g) dz, if t > 0 and C is |z| = 3. (sin t − t cos t)
2πj C (z 2 + 1)2 2
124

UNIT 5: COMPLEX SERIES


2.5.1. OBJECTIVE. To extend the work done on series in module 1 to series
of complex functions. In particular to see how one may obtain series expansions of
complex functions with isolated singularities, and to then use this theory to obtain
an algorithm for computing the residue about a given pole.

2.5.2. OUTCOMES. At the end of this unit the student should


• Be familiar with the concepts of a complex power series and its region of
convergence;
• Know and be able to apply the complex version of Taylors theorem;
• Be able to use either Taylors theorem, or results on geometric series to
compute regions of convergence of given Taylor series;
• Know the Maclaurin expansions of some elementary functions like ez ,
Ln(1 + z), sin z, cos z and (1 + z)n ;
• Given a function which is a combination of elementary functions, know
how to compute the Taylor series of the given function from those of the
elementary functions, by means of the basic operations on power series;
• Be familiar with Laurents theorem and the concept of a Laurent series;
• Be able to use either Taylors theorem, or results on geometric series to
compute regions of convergence of given Taylor series;
• Know the Maclaurin expansions of some elementary functions like ez ,
Ln(1 + z), sin z, cos z and (1 + z)n ;
• Given a function which appears as a combination of elementary functions,
the student should be able to use basic operations with series to compute
the Laurent expansion of this function from the associated Taylor expan-
sions of these elementary functions. The student should also be able to
determine the annulus of convergence of such an expansion.
• The student should know how to determine the type of an isolated singu-
larity from the form of the Laurent series at that point.
• The student should be familiar with the concept of a residue and be able
compute residues of poles.

2.5.3. POWER SERIES. The theory of complex sequences and series runs
along much the same lines as the theory for real sequences and series, and hence
we will not go into too much detail here. We will content ourselves with defining
only what we need for the theory of complex power series.
Given an infinite sequence of complex numbers z1 , z2 , . . . , the formal sum
z1 + z2 + . . . + zk + . . .
of the elements is called a series. We say that the series
Pnconverges to a sum S, if the
sequence of partial sums S1 , S2 , S3 , . . . (where Sn = k=1 zk = z1 + z2 + · · · + zn )
converges to a complex number S. In such a case we write

X
(2.5.1) S= zk = z1 + z2 + z3 + . . . + zk + . . .
k=1

If the series fails to converge, we say it diverges. If the series


|z1 | + |z2 | + . . . + |zk | + . . .
made up of the moduli of the zn ’s converges, we say that the series is absolutely
convergent. Just as in the theory of series of real numbers, we have that z1 + z2 +
. . . + zk + . . . converges whenever |z1 | + |z2 | + . . . + |zk | + . . . converges absolutely.
125 EMT4801/1

A series of the form


X∞
n 2 r
f (z) = an (z − z0 ) = a0 + a1 (z − z0 ) + a2 (z − z0 ) + . . . + ar (z − z0 ) + . . .
n=0
is called a power series about z = z0 . We will sometimes also say that it is centred
at z0 . The first thing we can say about such power series, is that if it converges at
a point z1 6= z0 , it will also converge at any other point that is closer to z0 than z1
is. Specifically we get
Theorem 2.5.1. If a power series

X n
an (z − z0 )
n=0
converges at some point z1 where z1 6= z0 , then it is absolutely convergent at each
point z for which |z − z0 | < |z1 − z0 |.
Basically this theorem tells us that if a power series centred at z0 converges at
some point z1 which is different from z0 , then we can always find a circle centred at
z0 (with radius R1 = |z1 − z0 |), such that the power series converges at all points
inside
P∞ that circle.n The biggest circle centred at z0 such that the power series
n=0 an (z − z0 ) converges at each point inside that circle, is called the circle
of convergence of the power series. The radius of this circle is called the radius of
convergence of the power series. One can now use the previous theorem to establish
the following facts:
Theorem 2.5.2. Let |z − z0 | = R be the circle of convergence of the power
series
X ∞
an (z − z0 )n .
n=0
Then the power series converges absolutely at each point inside the circle (so when-
ever |z − z0 | < R), and diverges at each point outside the circle (ie. whenever
|z − z0 | > R).

But how do we find this radius of convergence? If the limit limn→∞ aan+1 ex-
n
ists,
P∞ then we can use D’Alembert’s ratio test to prove that the radius of convergence
of n=0 an (z − z0 )n is precisely

an
R = lim .
n→∞ an+1

In cases where this limit does not exist, we will need to look for other ways to
compute the radius of convergence.
Lets look at a simple example to illustrate the point.
Example 2.5.3. Consider the complex geometric series
a + az + az 2 + az 3 + . . .
The nth partial sum of this series corresponds to Sn = a + az + az 2 + · · · + az n−1 .
These partial sums may of course be computed by using the following well-known
formula:
(1 − z n )
Sn = a + az + az 2 + · · · + az n−1 = a .
(1 − z)
From this formula it is clear that the sequence of partial sums will converge if and
if {z n } converges. Next note that limn→∞ z n → 0 whenever |z| < 1 and that {z n }
diverges if |z| > 1. If we apply this to the formula for the partial sums, we have that
a
limn→∞ Sn = 1−z whenever |z| < 1, with the sequence of partial sums diverging if
126

|z| > 1. So the circle of convergence of the series a + az + az 2 + az 3 + . . ., is |z| = 1,


a
and the series converges to the sum 1−z at every z inside that circle.
We close this section by indicating what happens to a power series when we
either integrate or differentiate term for term.
P∞
Theorem 2.5.4. Let n=0 an (z − z0 )n be a power series centred at z0 with
|z − z0 | = R its circle of convergence. Then following holds:
• The function defined by

X
S(z) = an (z − z0 )n whenever |z − z0 | < R
n=0

is well-defined and continuous on the region |z − z0 | < R.


• S(z) is differentiable at each z satisfying |z − z0 | < R. At each such point

X
S 0 (z) = nan (z − z0 )n−1 .
n=1
P∞ P∞
The series n=0 an (z −z0 )n and n=1 nan (z −z0 )n−1 will moreover have
same circles of convergence.
• For any contour C in the interior of the circle of convergence, and any
function g which is continuous on C, we have that
Z X ∞ Z
g(z)S(z)dz = an g(z)(z − z0 )n dz.
C n=0 C
P∞
Notice that since the function S(z) = n=0 an (z − z0 )n is differentiable on the
entire domain |z − z0 | < R, it is in fact analytic on this domain. Also notice that in
the formula for the
P∞derivative, there is no n = 0 term. This is because in the series
formula S(z) = n=0 an (z − z0 )n , the n = 0 term is a constant which vanishes on
being differentiated.

2.5.4. TAYLOR SERIES. We saw in the previous section that any power
series defines an analytic function in the interior of its circle of convergence. Taylor’s
theorem for analytic functions provides a kind of converse by giving details of how
analytic functions may be written as power series.
Theorem 2.5.5 (Taylor’s Theorem). Let f be a function which is analytic
throughout an open disc |z − z0 | < R0 centred at z0 , and with radius R0 . For every
z with |z − z0 | < R0 we then have that

X
f (z) = an (z − z0 )n
n=0
where
f (n) (z0 )
an = n = 0, 1, 2, 3, . . .
n!
Here we agree that f (0) (z0 ) = f (z0 ) and 0! = 1.
We will refer to the series

X f (n) (z0 ) f (1) (z0 ) f (2) (z0 )
f (z) = (z − z0 )n = f (z0 ) + (z − z0 ) + (z − z0 )2 + . . .
n=0
n! 1! 2!
as the Taylor series of f about the point z0 . In the special case where z0 = 0, this
series is sometimes also called the Maclaurin series of f .
Proof of the case z0 = 0: The proof of this theorem relies on a clever use
of Cauchy’s integration formulas. We briefly outline the proof for the case where
127 EMT4801/1

z0 = 0. So suppose that f is analytic inside the disc |z| < R0 . Given any w with
|w| < R0 , we may then enclose w in a circle C0 centred at 0, which lies completely
inside the disc |z| < R0 . For any point s on C0 we will then have that |w| < |s|, ie.
| ws | < 1. Thus by the example in the previous section we will have that

1 1 1 1 X  w n
= = .
s−w s 1 − (w/s) s n=0 s
Using this expansion and Cauchy’s Integration Formulas, we may now conclude
that
Z
1 f (s)
f (w) = ds
2πj C0 s − w
Z ∞
1 1 X  w n
= f (s) ds
2πj C0 s n=0 s
∞  Z 
X 1 f (s)
= n+1
ds wn
n=0
2πj C0 s

X f (n) (0) n
= w
n=0
n!
as required. 

A very useful fact regarding the Taylor series is that it is unique! Exactly what
we mean by saying it is unique, is explained in the following theorem:
Theorem 2.5.6. Let f be a functionP∞ which is analytic throughout an open disc
|z − z0 | < R0 centred at z0 , and let n=0 bn (z − z0 )n be a power series such that

X
f (z) = bn (z − z0 )n
n=0
(n)
for all z in the disc |z − z0 | < R0 . Then bn = f n!(z0 ) (n = 0, 1, 2, 3, . . . ), that is
n
P
n=0 bn (z − z0 ) is the Taylor series centred at z0 of f .

(See if you can use the theorem on differentiating


P∞ power series in the previous
section to prove this result. Given that f (z) = n=0 bn (z − z0 )n holds in some
disc centred at z0 , you need to differentiate this expression n times, and then set z
equal to z0 .)
When it comes to actually computing a Taylor series centred at z0 of a function
f , one obvious way in which to do it, is to one by one compute the coefficients
(n)
an = f n!(z0 ) . However this is not always necessary. The message from the above
theorem is that if by some other means we are able to generate a power series
centred at z0 which represents f in some disc centred at z0 , then such a series must
in fact be the Taylor series we are looking for.
Example 2.5.7. Expand the function
1
f (z) =
z−3
as Taylor series in each of the following forms, and in each case state the region of
convergence:

an z n
P
(a)
n=0

P n
(b) an (z − 2)
n=0
128

Solution: The results of example 2.5.3 will be used in both cases.


(a) We know from example 2.5.3 that

1 z   z 2  z 3
z =1+ + + + ...
1− 3 3 3 3

whenever | z3 | < 1; that is whenever |z| < 3. But then

1 1 1
= −
z−3 3 1 − z3
  z   z 2  z 3 
1
= − 1+ + + + ...
3 3 3 3
1 z z2 z3
= − − − − ...
3 9 27 81
whenever |z| < 3.
(b) Notice that if |z − 2| < 1, then by example 2.5.3 we have that

1
= 1 + (z − 2) + (z − 2)2 + (z − 2)3 + . . .
1 − (z − 2)

Therefore

1 1
=
z−3 (z − 2) − 1
 
1
= −
1 − (z − 2))
2 3
= −1 − (z − 2) − (z − 2) − (z − 2) − . . .

whenever |z − 2| < 1. 

Example 2.5.8. Determine the Taylor series for


(a) Ln(z) (the principal branch of ln z) about z = 1,
(b) sin z about z = π2 .

Solution:
(a) The principal branch of ln(z) is not differentiable at 0 and on the negative
real axis. However the disc |z − 1| < 1 (centred at 1 with raius 1) does not
contain any of these points. So on this disc Ln(z) will be analytic, and
hence by Taylor’s theorem, Ln(z) will have a expansion. Differentiating
and evaluating the derivatives at 0, we see that

f (z) = Ln(z) f (1) = Ln(1) = 0


f 0 (z) = z1 f 0 (1) = 1
f 00 (z) = −1
z2 f 00 (1) = −1
f 000 (z) = 1.2
z3 f 000 (1) = 2!
f (4) (z) = −1.2.3
z4 f (4) (1) = −3!

Carrying on inductively we see that

(−1)n−1 (n − 1)!
f (n) (z) = f (n) (1) = (−1)n−1 (n − 1)!
zn
129 EMT4801/1

Thus
1 2 2! 3 3! 4
Ln(z) = 0 + (z − 1) − (z − 1) + (z − 2) − (z − 1) +
2! 3! 4!
(−1)n−1 (n − 1)!
··· + (z − 1)n + . . .
n!
1 2 1 3 1 4
= (z − 1) − (z − 1) + (z − 1) − (z − 1) +
2 3 4
(−1)n−1
··· + (z − 1)n + . . .
n
We know from Taylor’s theorem that this series must converge for |z − 1| <
1. There is moreover no larger open disc centred at 1 on which the above
expansion is valid. To see this notice that 0 lies on the boundary of the
open disc |z − 1| < 1 (since |0 − 1| = 1) and that Ln(z) is undefined at 0,
with the series itself becoming −1 − 21 − 13 − . . . at 0, which is a divergent
series.
(b) Differentiating f (z) = sin(z) and evaluating the derivatives at π2 , yields
f π2 = 1

f (z) = sin(z)
f 0 (z) = cos(z) f 0 π2  = 0
00
f (z) = − sin z f 00 π2  = −1
000
f (z) = − cos(z) f 000 π2 = 0
(4)
f (z) = sin(z) f (4) π2 = 1
Thus
2 4
z − π2 z − π2
sin(z) = 1 − + + ...
2! 4!
By Taylor’s theorem this expansion will be valid on any disc of the form
|z − π2 | < R0 on which sin is analytic. But since sin is analytic on all of C,
the radius R0 can be as big as we like. Thus the series actually converges
for all values of z. 
Example 2.5.9. Expand the principal branch of tanh−1 z as a Taylor series
about z = 0.

Solution: Recall that


1 1+z
tanh−1 (z) = ln .
2 1−z
The principal branch of tanh−1 is given by
1 1+z
tanh−1 (z) = Ln .
2 1−z
Now let w1 and w2 lie in the right hand half of the complex plane. That means
they can be written in the form w1 = r1 ejθ1 and w2 = r2 ejθ2 , where r1 , r2 > 0, and
− π2 < θ1 , θ2 < π2 . But then w r1 j(θ1 −θ2 )
w2 = r2 e
1
with −π < θ1 − θ2 < π. Therefore
   
w1 r1
Ln(w1 ) = ln(r1 ) + jθ1 , Ln(w2 ) = ln(r1 ) + jθ2 and Ln w = ln + j(θ1 − θ2 ) =
2
  r1
w1
[ln(r1 )−ln(r1 )]+j(θ1 −θ2 ). It follows that in this case Ln w2 = Ln(w1 )−Ln(w2 ).
For any z with |z| < 1, both 1 + z and 1 − z will lie in the right hand side of the
complex plane. So for all z with |z| < 1, the principal branch of tanh−1 satisfies
1 1+z 1
tanh−1 z = Ln = [Ln (1 + z) − Ln (1 − z)]
2 1−z 2
130

Now let w = 1 − z. If |z| < 1, then |w − 1| = | − z| < 1. We may therefore use the
expansion for Ln in the previous example to conclude that
Ln(1 − z) = Ln(w)
1 1 1
= (w − 1) − (w − 1)2 + (w − 1)3 − (w − 1)4 + . . .
2 3 4
(−z)2 (−z)3 (−z)4
= (−z) − + − + ...
2 3 4
z2 z3 z4
= −z − − − − ...
2 3 4
Similarly for v = 1 + z, the same expansion yields
Ln(1 + z) = Ln(v)
1 1 1
(v − 1) − (v − 1)2 + (v − 1)3 − (v − 1)4 + . . .
=
2 3 4
z2 z3 z4
= z− + − + ...
2 3 4
Thus whenever |z| < 1, we have
1
tanh−1 z = [Ln (1 + z) − Ln (1 − z)]
2 
z2 z3 z4 z2 z3 z4
  
1
= z− + − + . . . − −z − − − − ...
2 2 3 4 2 3 4

z3 z5
= z+ + + ... for |z| < 1
3 5
Note: The radius of convergence of this power series is actually precisely R = 1.
To see this note that

an
= lim 2n + 1 = 1.

R = lim
n→∞ an+1 n→∞ 2n − 1


Example 2.5.10. Expand
1
f (z) =
z (z − 2j)
as a Taylor series about the point z = j, i.e. in powers of (z − j) .
Solution: To successively differentiate f (z) will become tedious, so lets resolve
f (z) into partial fractions, and try to use the results of example 2.5.3. If
1 A B
= + ,
z (z − 2j) z z − 2j
then A (z − 2j) + Bz = 1. From this it is clear that 2jB = 1 if z = 2j and that
−2jA = 1 when z = 0. So A = −1 1
2j and B = 2j .Thus
 
1 1 1 1
f (z) = = − .
z (z − 2j) 2j z − 2j z
Notice that 1
z−2j
1
= − j−(z−j) = − 1j 1−(1z−j ) . Moreover if | z−j
j | < 1 (equivalently
j
|z − j| < |j| = 1), then
   2  3
1 z−j z−j z−j
=1+ + + + ...
1 − ( z−j
j )
j j j
131 EMT4801/1

So

1/(z − 2j) = −(1/j) · 1/(1 − (z − j)/j)
  = −(1/j)[1 + (z − j)/j + ((z − j)/j)^2 + ((z − j)/j)^3 + ⋯]
  = (1/j)[−1 − (z − j)/j − ((z − j)/j)^2 − ((z − j)/j)^3 − ⋯]

Similarly for 1/z we have that

1/z = 1/((z − j) + j) = (1/j) · 1/(1 − (−(z − j)/j))
  = (1/j)[1 + (−(z − j)/j) + (−(z − j)/j)^2 + (−(z − j)/j)^3 + ⋯]
  = (1/j)[1 − (z − j)/j + ((z − j)/j)^2 − ((z − j)/j)^3 + ⋯]

If we subtract the second expansion from the first, we get

1/(z − 2j) − 1/z = (1/j)[−2 − 2((z − j)/j)^2 − 2((z − j)/j)^4 − ⋯]
  = (2/j)[−1 + (z − j)^2 − (z − j)^4 + ⋯]

(In the last step above we used the fact that j^2 = −1, j^4 = 1, j^6 = −1, etc.) From this we may now finally conclude that

f(z) = (1/2j)[1/(z − 2j) − 1/z]
  = (1/j^2)[−1 + (z − j)^2 − (z − j)^4 + ⋯]
  = 1 − (z − j)^2 + (z − j)^4 − ⋯

whenever |z − j| < 1.

Note: The two singularities of f(z) occur at z = 0 and z = 2j, which are both 1 unit removed from j. Hence the radius of convergence being 1 comes as no surprise!

We close this section by presenting a list of the Maclaurin expansions of some common complex functions. In the case of multi-valued functions, it is always the principal branch that is in view.


e^z = 1 + z + z^2/2! + z^3/3! + ⋯   |z| < ∞

sin(z) = z − z^3/3! + z^5/5! − z^7/7! + ⋯   |z| < ∞

cos(z) = 1 − z^2/2! + z^4/4! − ⋯   |z| < ∞

Ln(1 + z) = z − z^2/2 + z^3/3 − z^4/4 + ⋯   |z| < 1

tan⁻¹(z) = z − z^3/3 + z^5/5 − ⋯   |z| < 1

tanh⁻¹(z) = z + z^3/3 + z^5/5 + ⋯   |z| < 1

(1 + z)^n = 1 + nz + n(n − 1)z^2/2! + n(n − 1)(n − 2)z^3/3! + ⋯   |z| < 1
(Binomial series)
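The expansions in this table are easy to cross-check with a computer algebra system. The following is a minimal sketch (not part of the original table), assuming Python's sympy library is available; series(expr, z, 0, n) returns the Maclaurin expansion up to, but excluding, order n.

    import sympy as sp

    z = sp.symbols('z')

    # Each call should reproduce one of the expansions listed above.
    print(sp.series(sp.exp(z), z, 0, 4))      # 1 + z + z**2/2 + z**3/6 + O(z**4)
    print(sp.series(sp.sin(z), z, 0, 8))      # z - z**3/6 + z**5/120 - z**7/5040 + O(z**8)
    print(sp.series(sp.log(1 + z), z, 0, 5))  # z - z**2/2 + z**3/3 - z**4/4 + O(z**5)
    print(sp.series(sp.atanh(z), z, 0, 7))    # z + z**3/3 + z**5/5 + O(z**7)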
2.5.5. LAURENT SERIES. We have seen how, at each point where a function is analytic, one can use Taylor series to write the function as a power series in some small neighbourhood of that point. But what about functions which are not analytic everywhere, but have a few singularities here and there? Can they too be written as some sort of power series? The answer to this question is provided by Laurent's theorem.

Theorem 2.5.11 (Laurent's Theorem). Let f be a function which is analytic throughout an annular domain r₂ < |z − z₀| < r₁ centred at z₀ (here r₂ ≥ 0), and let C be any positively oriented simple closed contour around z₀ which lies completely in that domain. Then for every z with r₂ < |z − z₀| < r₁, we have that

f(z) = Σ_{n=0}^∞ a_n (z − z₀)^n + Σ_{n=1}^∞ b_n/(z − z₀)^n

where

a_n = (1/2πj) ∮_C f(z)/(z − z₀)^{n+1} dz   (n = 0, 1, 2, 3, …)

and

b_n = (1/2πj) ∮_C f(z)/(z − z₀)^{−n+1} dz   (n = 1, 2, 3, …).
Note: An annulus is defined to be the ring-shaped region between two concentric circles. (See the sketch below.)

We will refer to the series

f(z) = Σ_{n=0}^∞ a_n (z − z₀)^n + Σ_{n=1}^∞ b_n/(z − z₀)^n

as the Laurent series of f valid for the domain r₂ < |z − z₀| < r₁. The expansion in Laurent's theorem can also be written in the form

f(z) = Σ_{n=−∞}^∞ c_n (z − z₀)^n   (r₂ < |z − z₀| < r₁)

Figure 2.42

where

c_n = (1/2πj) ∮_C f(z)/(z − z₀)^{n+1} dz   (n = 0, ±1, ±2, ±3, …).

The proof of this theorem is a little more technical than that of Taylor's theorem, but uses the same basic ideas. For that reason we will not prove the theorem, but merely give a very brief hint of how it can be proved. For the case where z₀ = 0, it follows from a clever use of Cauchy's integration formulas that

f(z) = (1/2πj) ∮_{C₁} f(s)/(s − z) ds + (1/2πj) ∮_{C₂} f(s)/(z − s) ds.

To get the series expansion from this equation, it is then a matter of showing that we may set

1/(s − z) = Σ_{n=0}^∞ z^n/s^{n+1}

in the first integral, and

1/(z − s) = Σ_{n=1}^∞ s^{n−1}/z^n

in the second.
Remark 2.5.12. It is important to realise that Taylor's theorem is really just a special case of Laurent's theorem. What we mean by that is: if f is analytic throughout the domain |z − z₀| < r₁, then the Laurent series of f in this domain is exactly the Taylor series. To see this, note that if f is analytic throughout |z − z₀| < r₁, then for each n = 1, 2, … the function f(z)/(z − z₀)^{−n+1} = f(z)(z − z₀)^{n−1} will also be analytic, which means that we can then use Cauchy's theorem to conclude that for each such n

b_n = (1/2πj) ∮_C f(z)/(z − z₀)^{−n+1} dz = 0.

In addition we will then be able to use Cauchy's integration formulas to conclude that

a_n = (1/2πj) ∮_C f(z)/(z − z₀)^{n+1} dz = f^{(n)}(z₀)/n!   (n = 0, 1, 2, …)

Thus in this case

Σ_{n=0}^∞ a_n (z − z₀)^n + Σ_{n=1}^∞ b_n/(z − z₀)^n = Σ_{n=0}^∞ (f^{(n)}(z₀)/n!)(z − z₀)^n.

We next come to the question of the uniqueness of a Laurent series.

Theorem 2.5.13. Let f be a function which is analytic throughout an annular domain r₂ < |z − z₀| < r₁ centred at z₀, and let Σ_{n=0}^∞ p_n(z − z₀)^n + Σ_{n=1}^∞ q_n/(z − z₀)^n be a series which converges to f(z) at all points z in the domain r₂ < |z − z₀| < r₁. Then this series is the Laurent series for f in powers of (z − z₀) valid for that domain.

What this theorem says is that, in any given domain of the form r₂ < |z − z₀| < r₁ on which f is analytic, there is one and only one series expansion of the form Σ_{n=0}^∞ p_n(z − z₀)^n + Σ_{n=1}^∞ q_n/(z − z₀)^n which represents f, namely the Laurent series. On a different domain the series can look different. For example, if we pick some r₃ > r₁ and instead look at the domain r₁ < |z − z₀| < r₃, then on this different domain we may get a different Laurent series.

When it comes to actually computing a Laurent series, the conclusion we can draw from this theorem is "anything goes". What we mean by that is: if in any way whatsoever we are able to generate a series of the form Σ_{n=0}^∞ p_n(z − z₀)^n + Σ_{n=1}^∞ q_n/(z − z₀)^n which represents f in the domain r₂ < |z − z₀| < r₁, then that series will be the Laurent series of f for that domain. This fact will prove to be very useful when it comes to the computation of Laurent series.

Example 2.5.14. For each of the following functions determine the Laurent series about the indicated singularity which converges in a region of the form 0 < |z − z₀| < r. In each case give the precise region of convergence of the series.

(a) e^{2z}/(z − 2)^3,   z = 2;
(b) (z − 2) sin(1/(z + 3)),   z = −3;
(c) (z − sin z)/z^3,   z = 0;
(d) z/((z + 2)(z + 3)),   z = −3;
(e) 1/(z^2(z − 4)^2),   z = 4.
Solutions:

(a) We need to write e^{2z}/(z − 2)^3 as a series in powers of (z − 2). If we let z − 2 = u, we can use the Taylor series

e^w = 1 + w + w^2/2! + w^3/3! + w^4/4! + ⋯

(valid for all w) to conclude that

e^{2z}/(z − 2)^3 = e^{2(u+2)}/u^3
  = (e^4/u^3) e^{2u}
  = (e^4/u^3)[1 + 2u + (2u)^2/2! + (2u)^3/3! + (2u)^4/4! + ⋯]
  = e^4/u^3 + 2e^4/u^2 + 2^2e^4/(2!u) + 2^3e^4/3! + 2^4e^4u/4! + ⋯
  = e^4/(z − 2)^3 + 2e^4/(z − 2)^2 + 2^2e^4/(2!(z − 2)) + 2^3e^4/3! + 2^4e^4(z − 2)/4! + ⋯

Since e^{2u} = 1 + 2u + (2u)^2/2! + (2u)^3/3! + (2u)^4/4! + ⋯ for all u, the series will converge for all u = z − 2, except where z − 2 = 0. Thus the region of convergence is 0 < |z − 2| < ∞.
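As a quick cross-check of part (a): sympy's series() can produce the Laurent terms directly when the singularity is a pole. A sketch, assuming sympy is available (not part of the original solution):

    import sympy as sp

    z = sp.symbols('z')
    f = sp.exp(2*z)/(z - 2)**3

    # Expanding about the pole z = 2 should reproduce the expansion above:
    # exp(4)/(z-2)**3 + 2*exp(4)/(z-2)**2 + 2*exp(4)/(z-2) + 4*exp(4)/3 + 2*exp(4)*(z-2)/3 + ...
    print(sp.series(f, z, 2, 2))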

(b) Let z + 3 = u. Notice that then

(z − 2) sin(1/(z + 3)) = (u − 5) sin(1/u),

and set w = 1/u in the Maclaurin series

sin(w) = w − w^3/3! + w^5/5! − ⋯

to get

(z − 2) sin(1/(z + 3)) = (u − 5) sin(1/u)
  = (u − 5)[1/u − (1/3!)(1/u)^3 + (1/5!)(1/u)^5 − ⋯]
  = 1 − 5/u − 1/(3!u^2) + 5/(3!u^3) + 1/(5!u^4) − 5/(5!u^5) + ⋯
  = 1 − 5/u − 1/(6u^2) + 5/(6u^3) + 1/(120u^4) − 1/(24u^5) + ⋯
  = 1 − 5/(z + 3) − 1/(6(z + 3)^2) + 5/(6(z + 3)^3) + 1/(120(z + 3)^4) − ⋯

Since the series expansion of sin(1/u) converges whenever z + 3 = u ≠ 0, the above series converges for all 0 < |z + 3| < ∞ (i.e. whenever z ≠ −3).

(c) For (z − sin z)/z^3 we may once again use the Maclaurin series of sin (which converges for all z) to conclude that

(z − sin z)/z^3 = (1/z^3)[z − (z − z^3/3! + z^5/5! − z^7/7! + ⋯)]
  = 1/3! − z^2/5! + z^4/7! − ⋯

Since the coefficients of all the 1/z^n terms are zero, this series in fact converges for all values of z, although it only equals (z − sin z)/z^3 when z ≠ 0.

(d) We want to expand

z/((z + 2)(z + 3))

in powers of (z + 3), so let u = z + 3. Then z/((z + 2)(z + 3)) = (u − 3)/(u(u − 1)). We may therefore use the results of example 2.5.3 to conclude that

z/((z + 2)(z + 3)) = (u − 3)/(u(u − 1))
  = ((u − 3)/u)(−1/(1 − u))
  = (3/u − 1)(1 + u + u^2 + u^3 + ⋯)
  = 3/u + 2 + 2u + 2u^2 + 2u^3 + ⋯
  = 3/(z + 3) + 2 + 2(z + 3) + 2(z + 3)^2 + 2(z + 3)^3 + ⋯

whenever |u| < 1 and u ≠ 0; that is, whenever 0 < |z + 3| < 1.

(e) We want a series in powers of (z − 4), so let u = z − 4. Then

1/(z^2(z − 4)^2) = 1/((4 + u)^2 u^2)
  = 1/(4^2 u^2 (1 + u/4)^2)
  = (1/(16u^2))(1 + u/4)^{−2}

We now set w = u/4 and n = −2 in the expansion

(1 + w)^n = 1 + nw + n(n − 1)w^2/2! + n(n − 1)(n − 2)w^3/3! + ⋯   |w| < 1

(see the end of section 2.5.4) to get

1/(z^2(z − 4)^2) = (1/(16u^2))(1 + u/4)^{−2}
  = (1/(16u^2))[1 + (−2)(u/4) + ((−2)(−3)/2!)(u/4)^2 + ((−2)(−3)(−4)/3!)(u/4)^3 + ⋯]
  = (1/(16u^2))[1 − 2(u/4) + 3(u/4)^2 − 4(u/4)^3 + ⋯]
  = 1/(16u^2) − 1/(32u) + 3/256 − u/256 + ⋯
  = 1/(16(z − 4)^2) − 1/(32(z − 4)) + 3/256 − (z − 4)/256 + ⋯

For this to make sense we of course need z − 4 ≠ 0. In addition, for the binomial expansion used above to converge, we need |u/4| < 1 (or equivalently |z − 4| = |u| < 4). Thus the series converges for 0 < |z − 4| < 4.
Example 2.5.15. Determine the Laurent expansion of

f(z) = 1/((z + 1)(z + 3))

valid for each of the regions

(a) 1 < |z| < 3,
(b) |z| > 3,
(c) 0 < |z + 1| < 2,
(d) |z| < 1.

Solutions: First resolve f(z) into partial fractions to get

f(z) = (1/2)(1/(z + 1)) − (1/2)(1/(z + 3)).

Having done this we may then try to solve the problem through an application of the Binomial series.

(a) Since |z| > 1 (or equivalently |1/z| < 1) we may set w = 1/z and n = −1 in

(1 + w)^n = 1 + nw + n(n − 1)w^2/2! + n(n − 1)(n − 2)w^3/3! + ⋯   |w| < 1

to get

1/(1 + z) = (1/z)(1 + 1/z)^{−1} = (1/z)[1 − 1/z + (1/z)^2 − (1/z)^3 + ⋯].

Similarly, since |z| < 3 (equivalently |z/3| < 1), setting w = z/3 yields

1/(z + 3) = (1/3)(1 + z/3)^{−1} = (1/3)[1 − z/3 + (z/3)^2 − (z/3)^3 + ⋯].

Thus we get

f(z) = (1/2)(1/(z + 1)) − (1/2)(1/(z + 3))
  = (1/2z)[1 − 1/z + (1/z)^2 − (1/z)^3 + ⋯] − (1/6)[1 − z/3 + (z/3)^2 − (z/3)^3 + ⋯]
  = ⋯ − 1/(2z^4) + 1/(2z^3) − 1/(2z^2) + 1/(2z) − 1/6 + z/18 − z^2/54 + z^3/162 − ⋯
(b) The region |z| > 3 lies inside the region |z| > 1. Hence we still have

1/(1 + z) = (1/z)[1 − 1/z + (1/z)^2 − (1/z)^3 + ⋯].

However, since in this region we have |3/z| < 1, we now set w = 3/z in the Binomial expansion given above to get

1/(z + 3) = (1/z)(1 + 3/z)^{−1} = (1/z)[1 − 3/z + (3/z)^2 − (3/z)^3 + ⋯].

Thus here

f(z) = (1/2)(1/(z + 1)) − (1/2)(1/(z + 3))
  = (1/2z)[1 − 1/z + (1/z)^2 − (1/z)^3 + ⋯] − (1/2z)[1 − 3/z + (3/z)^2 − (3/z)^3 + ⋯]
  = 1/z^2 − 4/z^3 + 13/z^4 − ⋯
(c) We need a series in powers of (z + 1), so let u = z + 1. Once again using the Binomial series, we have that

1/((z + 1)(z + 3)) = 1/(u(u + 2))
  = (1/2u)(1 + u/2)^{−1}
  = (1/2u)[1 − u/2 + (u/2)^2 − (u/2)^3 + ⋯]
  = 1/(2u) − 1/4 + u/8 − u^2/16 + ⋯
  = 1/(2(z + 1)) − 1/4 + (z + 1)/8 − (z + 1)^2/16 + ⋯

(Note that the series expansion for (1 + u/2)^{−1} used above will converge when |u/2| < 1, that is, when |z + 1| = |u| < 2. Thus the series obtained above is indeed the one valid for 0 < |z + 1| < 2.)

(d) Since the domain |z| < 1 is contained in |z| < 3, we still have

1/(z + 3) = (1/3)[1 − z/3 + (z/3)^2 − (z/3)^3 + ⋯].

However, for the factor 1/(z + 1) we now have that

1/(z + 1) = (1 + z)^{−1} = 1 − z + z^2 − z^3 + ⋯

Thus

f(z) = (1/2)(1/(z + 1)) − (1/2)(1/(z + 3))
  = (1/2)[1 − z + z^2 − z^3 + ⋯] − (1/6)[1 − z/3 + (z/3)^2 − (z/3)^3 + ⋯]
  = 1/3 − (4/9)z + (13/27)z^2 − (40/81)z^3 + ⋯
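For annuli that reach out to infinity, such as region (b) above, one informal way to let a computer algebra system reproduce the expansion is to substitute w = 1/z and expand about w = 0. A sketch, assuming sympy is available (not part of the original solution):

    import sympy as sp

    z, w = sp.symbols('z w')
    f = 1/((z + 1)*(z + 3))

    # For |z| > 3 we have |1/z| < 1/3, so expand f(1/w) about w = 0 ...
    g = sp.simplify(f.subs(z, 1/w))
    s = sp.series(g, w, 0, 6).removeO()
    # ... and substitute back to get the series in negative powers of z:
    print(sp.expand(s.subs(w, 1/z)))   # 1/z**2 - 4/z**3 + 13/z**4 - 40/z**5 + ...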
3 9 27 81


2.5.5.1. General hints on computing Laurent series. We next present some suggestions on computing Laurent series. If approached correctly, then at least for rational functions this should be a reasonably straightforward process. In short the idea runs as follows: If f(z) = q(z)/p(z), where p and q are polynomials, needs to be expanded as a Laurent series centred at a (i.e. in powers of z − a), we can use Taylor's theorem to rewrite q(z) in powers of (z − a). Next we can use partial fractions to decompose 1/p(z) into a sum of terms of the form K/(z − b)^n. Each of these terms is then separately written in Laurent form, and finally all the various expansions for q(z) and the K/(z − b)^n's are combined algebraically to get the Laurent expansion for f(z) = q(z)/p(z). So all that remains is to explain how to write terms like 1/(z − b)^n in Laurent form.

However, before doing so we note that a function f may have several Laurent expansions centred at z = a, depending on the number of singularities it has. A careful look at Laurent's theorem reveals that if a particular form of the Laurent expansion in powers of (z − a) holds at a point (say w₀), then this form will converge on the largest annulus of the form S < |z − a| < R (where 0 ≤ S < R) which contains w₀ and on which f is still analytic! As soon as we leave the annulus and move beyond a point where f is NOT analytic, the expansion changes. For example, if f has two singularities at say z₀ and z₁ with

0 < R₀ = |z₀ − a| < R₁ = |z₁ − a|,

then f will be analytic on the annuli

|z − a| < R₀,   R₀ < |z − a| < R₁   and   R₁ < |z − a| < ∞,

but not on the circles |z − a| = R₀ and |z − a| = R₁ as such, since these contain z₀ and z₁. So by Laurent's theorem, on each of the annuli |z − a| < R₀, R₀ < |z − a| < R₁ and R₁ < |z − a| < ∞, f will have some Laurent expansion; however, the expansion may be different on each of these sets!
We finally indicate how a term like 1/(z − b) (where b ≠ a) may be written as a Laurent series centred at z = a. The expansions for 1/(z − b)^2, 1/(z − b)^3, … may then be obtained from this one by differentiating the expansion for 1/(z − b). (Note for example that 1/(z − b)^4 = −(1/6) d^3/dz^3 [1/(z − b)].) Now we know that

1/(1 − w) = 1 + w + w^2 + ⋯ = Σ_{n=0}^∞ w^n   (|w| < 1).

If therefore |(z − a)/(b − a)| < 1 (i.e. |z − a| < |b − a|) we may write

1/(z − b) = −1/[(b − a)(1 − (z − a)/(b − a))]

and set w = (z − a)/(b − a) to get

1/(z − b) = (−1/(b − a)) Σ_{n=0}^∞ (z − a)^n/(b − a)^n = Σ_{n=0}^∞ (−1/(b − a)^{n+1})(z − a)^n.

If on the other hand |(b − a)/(z − a)| < 1 (i.e. |z − a| > |b − a|) we set w = (b − a)/(z − a) to get

1/(z − b) = (1/(z − a)) · 1/(1 − (b − a)/(z − a))
  = (1/(z − a)) Σ_{n=0}^∞ ((b − a)/(z − a))^n
  = Σ_{m=1}^∞ (b − a)^{m−1}/(z − a)^m.

Suppose for example that we are asked to compute the Laurent series of (z^2 − 8z)/((z − 2)^2(z + 1)) in each of the following regions: (i) |z − 1| < 1, and (ii) 1 < |z − 1| < 2. So for each of these regions we must find an expansion in powers of (z − 1) that converges on the given region.

First of all note that by means of partial fractions we can see that

(z^2 − 8z)/((z − 2)^2(z + 1)) = 1/(z + 1) − 4/(z − 2)^2.

We first consider the term 1/(z + 1). If |z − 1| < 2 (equivalently |(z − 1)/(−2)| < 1), then

1/(z + 1) = (1/2) · 1/(1 − (z − 1)/(−2))
  = (1/2) Σ_{n=0}^∞ ((z − 1)/(−2))^n
  = Σ_{n=0}^∞ (−1)^n (z − 1)^n/2^{n+1}
  = 1/2 − (z − 1)/2^2 + (z − 1)^2/2^3 − ⋯

If however |z − 1| > 2 (equivalently |−2/(z − 1)| < 1), then

1/(z + 1) = (1/(z − 1)) · 1/(1 − (−2)/(z − 1))
  = (1/(z − 1)) Σ_{n=0}^∞ (−2/(z − 1))^n
  = Σ_{n=0}^∞ (−1)^n 2^n/(z − 1)^{n+1}
  = 1/(z − 1) − 2/(z − 1)^2 + 2^2/(z − 1)^3 − ⋯

Now consider the term 1/(z − 2). If |z − 1| < 1, then

1/(z − 2) = (−1) · 1/(1 − (z − 1))
  = (−1) Σ_{n=0}^∞ (z − 1)^n
  = −1 − (z − 1) − (z − 1)^2 − ⋯

and if |z − 1| > 1 (equivalently |1/(z − 1)| < 1), then

1/(z − 2) = (1/(z − 1)) · 1/(1 − 1/(z − 1))
  = (1/(z − 1)) Σ_{n=0}^∞ (1/(z − 1))^n
  = 1/(z − 1) + 1/(z − 1)^2 + 1/(z − 1)^3 + ⋯

On differentiating these two expressions and changing the sign we get

1/(z − 2)^2 = −d/dz [1/(z − 2)]
  = 1 + 2(z − 1) + 3(z − 1)^2 + ⋯
  = Σ_{n=0}^∞ (n + 1)(z − 1)^n   whenever |z − 1| < 1,

and

1/(z − 2)^2 = −d/dz [1/(z − 2)]
  = 1/(z − 1)^2 + 2/(z − 1)^3 + 3/(z − 1)^4 + ⋯
  = Σ_{n=0}^∞ (n + 1)/(z − 1)^{n+2}   whenever |z − 1| > 1.

In the region |z − 1| < 1 it will of course also hold that |z − 1| < 2. So in this region

(z^2 − 8z)/((z − 2)^2(z + 1)) = 1/(z + 1) − 4/(z − 2)^2
  = Σ_{n=0}^∞ (−1)^n (z − 1)^n/2^{n+1} − 4 Σ_{n=0}^∞ (n + 1)(z − 1)^n
  = (1/2 − 4) + (−1/4 − 8)(z − 1) + (1/8 − 12)(z − 1)^2 + ⋯
  = −3.5 − 8.25(z − 1) − 11.875(z − 1)^2 − ⋯

In the region 1 < |z − 1| < 2 we have 1 < |z − 1| AND |z − 1| < 2. So in this region

(z^2 − 8z)/((z − 2)^2(z + 1)) = 1/(z + 1) − 4/(z − 2)^2
  = Σ_{n=0}^∞ (−1)^n (z − 1)^n/2^{n+1} − 4 Σ_{n=0}^∞ (n + 1)/(z − 1)^{n+2}
  = ⋯ − 8/(z − 1)^3 − 4/(z − 1)^2 + 1/2 − (z − 1)/2^2 + (z − 1)^2/2^3 − ⋯
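The coefficients found above can also be sanity-checked numerically from the coefficient formula c_n = (1/2πj) ∮_C f(z)/(z − 1)^{n+1} dz of Laurent's theorem, by integrating around a circle lying inside the relevant annulus. The sketch below (not part of the original text, and assuming numpy is available) uses the circle |z − 1| = 1.5, which lies inside the annulus 1 < |z − 1| < 2:

    import numpy as np

    def laurent_coeff(f, a, r, n, m=4000):
        # On z = a + r*e^{j*theta}, the coefficient formula reduces to
        # c_n = (1/2*pi) * integral of f(z)/(z-a)**n dtheta, i.e. the mean
        # of f(z)/(z-a)**n over the circle.
        theta = np.linspace(0.0, 2*np.pi, m, endpoint=False)
        z = a + r*np.exp(1j*theta)
        return np.mean(f(z)/(z - a)**n)

    f = lambda z: (z**2 - 8*z)/((z - 2)**2*(z + 1))
    for n in (-3, -2, -1, 0, 1):
        print(n, np.round(laurent_coeff(f, 1.0, 1.5, n), 6))
    # expected: c_{-3} = -8, c_{-2} = -4, c_{-1} = 0, c_0 = 0.5, c_1 = -0.25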
2.5.6. LAURENT SERIES AND ISOLATED SINGULARITIES.
With the theory of Laurent series now at our disposal, we are able to further develop
the theory of isolated singularities. Ultimately this theory will help us to develop
a very general theory of integration for complex functions with many important
applications.
Definition 2.5.16. Let f be a function with an isolated singularity at z₀. Then f is analytic on a domain of the form 0 < |z − z₀| < r₁ centred at z₀. Let

f(z) = Σ_{n=0}^∞ a_n (z − z₀)^n + Σ_{n=1}^∞ b_n/(z − z₀)^n

be the Laurent series expansion of f valid for that domain. In this Laurent expansion we will call the portion Σ_{n=0}^∞ a_n (z − z₀)^n the analytic part of the series, and Σ_{n=1}^∞ b_n/(z − z₀)^n the principal part. The coefficient b₁ of the 1/(z − z₀) term in the principal part is called the residue of f at z₀. We will denote the residue of f at z₀ by Res_{z=z₀}(f).
Let f be as in the above definition, and let C be a positively oriented simple closed contour around z₀ which lies completely in 0 < |z − z₀| < r₁. From Laurent's theorem we now have that

b₁ = (1/2πj) ∮_C f(z)/(z − z₀)^{−1+1} dz = (1/2πj) ∮_C f(z) dz.

Therefore ∮_C f(z) dz = 2πj b₁. Residues can therefore be used to compute certain integrals, and this is in fact the primary reason for our interest in residues. However, before we start using residues to compute integrals, we need to see how Laurent series can be used to classify singularities, and also develop algorithms for computing residues. The structure of the principal part of the Laurent series of f on 0 < |z − z₀| < r₁ actually tells us what type of singularity f has at z₀. When looking at the principal part, there are three cases to consider: the case where all the b_n's are zero, the case where only finitely many of the b_n's are non-zero, and the case where infinitely many of the b_n's are non-zero. Each of these cases corresponds to a different type of singularity. We specifically get the following:

• If all the b_n's in the principal part are zero, f has a removable singularity at z₀.
• If for some m ∈ N we have that b_m ≠ 0 with b_k = 0 for every k > m, then f has a pole of order m at z₀.
• If infinitely many of the b_n's are non-zero, f has an essential singularity at z₀.

Let's try to get some idea of why the above is true. If all the b_n's are zero, the principal part of the series vanishes, and so in this case we have that

f(z) = Σ_{n=0}^∞ a_n (z − z₀)^n = a₀ + a₁(z − z₀) + a₂(z − z₀)^2 + ⋯

on 0 < |z − z₀| < r₁. But then lim_{z→z₀} f(z) exists. In fact

lim_{z→z₀} f(z) = a₀.

So by definition f then has a removable singularity at z₀.

Now suppose that for some m ∈ N we have that b_m ≠ 0 with b_k = 0 for every k > m. In this case the principal part is of the form

Σ_{n=1}^m b_n/(z − z₀)^n = b₁/(z − z₀) + b₂/(z − z₀)^2 + ⋯ + b_m/(z − z₀)^m.

Therefore on 0 < |z − z₀| < r₁ we then have that

f(z) = Σ_{n=0}^∞ a_n (z − z₀)^n + b₁/(z − z₀) + b₂/(z − z₀)^2 + ⋯ + b_m/(z − z₀)^m.

But then

(z − z₀)^m f(z) = Σ_{n=0}^∞ a_n (z − z₀)^{m+n} + b₁(z − z₀)^{m−1} + b₂(z − z₀)^{m−2} + ⋯ + b_m,

which in turn ensures that

lim_{z→z₀} (z − z₀)^m f(z) = b_m ≠ 0.

Thus by definition, f will then have a pole of order m at z₀.


Example 2.5.17. For each of the following functions, classify the singularity at the given point, and determine the residue of f at that point:

(a) e^{2z}/(z − 2)^3,   z = 2;
(b) (z − 2) sin(1/(z + 3)),   z = −3;
(c) (z − sin z)/z^3,   z = 0;
(d) z/((z + 2)(z + 3)),   z = −3;
(e) 1/(z^2(z − 4)^2),   z = 4.

Solutions: The hard work of computing the Laurent expansions of each of these functions in a region of the form 0 < |z − z₀| < r₁ has already been done in example 2.5.14. We will therefore freely use the results of that example to classify the given singularities.

(a) We saw earlier that

e^{2z}/(z − 2)^3 = e^4/(z − 2)^3 + 2e^4/(z − 2)^2 + 2^2e^4/(2!(z − 2)) + 2^3e^4/3! + 2^4e^4(z − 2)/4! + ⋯

Thus z = 2 must be a pole of order 3. The residue at z = 2 (i.e. the coefficient of the 1/(z − 2) term) is 2^2e^4/2! = 2e^4.

(b) In this case we have that

(z − 2) sin(1/(z + 3)) = 1 − 5/(z + 3) − 1/(6(z + 3)^2) + 5/(6(z + 3)^3) + 1/(120(z + 3)^4) − 1/(24(z + 3)^5) + ⋯

whenever 0 < |z + 3| < ∞. Thus we have an essential singularity at z = −3. The residue (coefficient of the 1/(z + 3) term) is −5.

(c) Recall that

(z − sin z)/z^3 = 1/3! − z^2/5! + z^4/7! − ⋯

Since the principal part is zero, we must have a removable singularity at 0. In addition, since the coefficient b₁ is zero, the residue at 0 is 0.

(d) Here

z/((z + 2)(z + 3)) = 3/(z + 3) + 2 + 2(z + 3) + 2(z + 3)^2 + 2(z + 3)^3 + ⋯

So we have a simple pole (pole of order 1) at z = −3, with a residue of 3.

(e) Since

1/(z^2(z − 4)^2) = 1/(16(z − 4)^2) − 1/(32(z − 4)) + 3/256 − (z − 4)/256 + ⋯

there is a pole of order 2 at z = 4 with a residue of −1/32.
Example 2.5.18. Classify the singularity at 0 of the function f(z) = sin(√z)/√z.

Solution: At first sight it appears as if f(z) = sin(√z)/√z is a multivalued function with a branch point at 0. However this is not actually the case. To see this we use the Maclaurin series of sin. Recall that

sin(w) = w − w^3/3! + w^5/5! − w^7/7! + ⋯

But then

sin(w)/w = 1 − w^2/3! + w^4/5! − w^6/7! + ⋯

Therefore if we replace w by √z, we get

sin(√z)/√z = 1 − (√z)^2/3! + (√z)^4/5! − (√z)^6/7! + ⋯
  = 1 − z/3! + z^2/5! − z^3/7! + ⋯

Thus f(z) = sin(√z)/√z is in fact a single-valued function, defined by the series 1 − z/3! + z^2/5! − z^3/7! + ⋯ at each z ≠ 0. Therefore z = 0 is an isolated singularity which is removable.

2.5.7. COMPUTING RESIDUES. We already pointed out the connection of residues to evaluating certain complex integrals. Here we will present an algorithm for computing residues at poles. The task of explaining in more detail how residues can be used to compute integrals will be dealt with in the next unit.

Suppose first that f(z) has a simple pole (pole of order 1) at z = z₀. In some domain of the form 0 < |z − z₀| < r₁, the Laurent series of f(z) will then be

f(z) = b₁/(z − z₀) + a₀ + a₁(z − z₀) + a₂(z − z₀)^2 + ⋯

so that

(z − z₀)f(z) = b₁ + a₀(z − z₀) + a₁(z − z₀)^2 + a₂(z − z₀)^3 + ⋯

Therefore in this case

lim_{z→z₀} (z − z₀)f(z) = b₁ = Res_{z=z₀} f(z)

More generally, suppose that f has a pole of order m at z₀. Then in some domain of the form 0 < |z − z₀| < r₁, the Laurent series of f(z) will have the form

f(z) = Σ_{n=0}^∞ a_n (z − z₀)^n + b₁/(z − z₀) + b₂/(z − z₀)^2 + ⋯ + b_m/(z − z₀)^m.

But then

(z − z₀)^m f(z) = Σ_{n=0}^∞ a_n (z − z₀)^{m+n} + b₁(z − z₀)^{m−1} + ⋯ + b_{m−1}(z − z₀) + b_m

Our task is to compute b₁. On differentiating we get

d/dz [(z − z₀)^m f(z)] = Σ_{n=0}^∞ (m + n)a_n(z − z₀)^{m+n−1} + (m − 1)b₁(z − z₀)^{m−2} + ⋯ + b_{m−1}.

Differentiating again gives

d^2/dz^2 [(z − z₀)^m f(z)] = Σ_{n=0}^∞ (m + n)(m + n − 1)a_n(z − z₀)^{m+n−2} + (m − 1)(m − 2)b₁(z − z₀)^{m−3} + ⋯ + 2b_{m−2}.

After differentiating m − 1 times, this expression becomes

d^{m−1}/dz^{m−1} [(z − z₀)^m f(z)]
  = Σ_{n=0}^∞ a_n[(m + n)(m + n − 1)(m + n − 2)⋯(n + 2)](z − z₀)^{n+1} + [(m − 1)(m − 2)⋯1]b₁

Therefore

lim_{z→z₀} d^{m−1}/dz^{m−1} [(z − z₀)^m f(z)] = [(m − 1)(m − 2)⋯1]b₁ = (m − 1)! b₁.

We therefore obtain the following formula for the residue b₁:

Theorem 2.5.19. If f has a pole of order m at z₀, then

(1/(m − 1)!) lim_{z→z₀} d^{m−1}/dz^{m−1} [(z − z₀)^m f(z)] = Res_{z=z₀} f(z).

(Here we agree that 0! = 1 and that d^0/dz^0 g(z) = g(z).)

Warning: When using this formula to compute the residue at a pole, it is


very important to have the correct value for the order m of the pole. So before
computing the residue at a pole, you must first correctly classify the pole!

Example 2.5.20. Determine the residues of

f(z) = (z^2 − 2z)/((z + 1)^2(z^2 + 9))

at each of its poles in the complex plane.

Solution: We may factorise (z^2 + 9) and rewrite this function as

f(z) = (z^2 − 2z)/((z + 1)^2(z − 3j)(z + 3j))

It is then easy to see that f(z) has poles of order 1 at z = 3j and z = −3j, and a pole of order 2 at z = −1. To see this note for example that

lim_{z→−1} (z + 1)^2 f(z) = lim_{z→−1} (z^2 − 2z)/((z − 3j)(z + 3j)) ≠ 0,

and

lim_{z→3j} (z − 3j)f(z) = lim_{z→3j} (z^2 − 2z)/((z + 1)^2(z + 3j)) ≠ 0,

etc. Thus by the formula for the residue, we have that

Res_{z=3j} f(z) = lim_{z→3j} (z − 3j)(z^2 − 2z)/((z + 1)^2(z − 3j)(z + 3j))
  = lim_{z→3j} (z^2 − 2z)/((z + 1)^2(z + 3j))
  = (−9 − 6j)/((3j + 1)^2(6j))
  = (9 + 6j)/(12(3 + 4j))
  = [(9 + 6j)/(12(3 + 4j))] · [(3 − 4j)/(3 − 4j)]
  = 0,17 − j0,06

Similarly

Res_{z=−3j} f(z) = lim_{z→−3j} (z + 3j)(z^2 − 2z)/((z + 1)^2(z + 3j)(z − 3j))
  = (−9 + 6j)/((1 − 3j)^2(−6j))
  = 0,17 + j0,06

Finally at z = −1 (the pole of order m = 2) we have that

Res_{z=−1} f(z) = (1/1!) lim_{z→−1} d/dz [(z + 1)^2 (z^2 − 2z)/((z + 1)^2(z + 3j)(z − 3j))]
  = lim_{z→−1} d/dz [(z^2 − 2z)/(z^2 + 9)]
  = lim_{z→−1} [(2z − 2)(z^2 + 9) − 2z(z^2 − 2z)]/(z^2 + 9)^2
  = [−4(10) + 2(3)]/100
  = −0,34.
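The three residues just computed can be cross-checked with sympy's built-in residue routine, which internally carries out the same kind of limits as Theorem 2.5.19. A sketch, assuming sympy is available (not part of the original solution):

    import sympy as sp

    z = sp.symbols('z')
    f = (z**2 - 2*z)/((z + 1)**2*(z**2 + 9))

    print(sp.residue(f, z, 3*sp.I))   # 17/100 - 3*I/50  (= 0,17 - j0,06)
    print(sp.residue(f, z, -3*sp.I))  # 17/100 + 3*I/50  (= 0,17 + j0,06)
    print(sp.residue(f, z, -1))       # -17/50           (= -0,34)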

Example 2.5.21. Determine the residues of each of the following functions at the indicated points.

(a) e^z/(1 + z^2)^2   (z = j)
(b) (sin(z)/z^2)^3   (z = 0)
(c) z^4/(z + 1)^3   (z = −1)
Solution:

(a) Since 1 + z^2 = (z − j)(z + j), we may rewrite this function as

e^z/(1 + z^2)^2 = e^z/((z − j)^2(z + j)^2).

From this form it is then clear that it has a pole of order m = 2 at z = j (since lim_{z→j} (z − j)^2 e^z/((z − j)^2(z + j)^2) = e^j/(2j)^2 ≠ 0). Therefore

Res_{z=j} f(z) = (1/1!) lim_{z→j} d/dz [e^z/(z + j)^2]
  = lim_{z→j} [e^z(z + j)^2 − 2e^z(z + j)]/(z + j)^4
  = [e^j(2j)^2 − 2e^j(2j)]/(2j)^4
  = e^j(−4 − 4j)/16
  = (e^j/4)(−1 − j)
(b) Notice that

(sin z/z^2)^3 = (1/z^3)(sin z/z)^3

Since

lim_{z→0} (sin z)/z = 1,

it therefore follows that

lim_{z→0} z^3 (sin z/z^2)^3 = lim_{z→0} (sin z/z)^3 = 1.

Therefore the given function has a pole of order m = 3 at z = 0. But then

Res_{z=0} f(z) = (1/2!) lim_{z→0} d^2/dz^2 [z^3 (sin z/z^2)^3]
  = (1/2) lim_{z→0} d^2/dz^2 [(sin z)^3/z^3]
  = (1/2) lim_{z→0} d/dz {[3(sin z)^2 cos z · z^3 − (sin z)^3 · 3z^2]/z^6}
  = (3/2) lim_{z→0} d/dz [(z sin^2 z cos z − sin^3 z)/z^4]
  = (3/2) lim_{z→0} (1/z^5)[−6z sin^2 z cos z + 2z^2 sin z cos^2 z − z^2 sin^3 z + 4 sin^3 z]

This limit is of the form 0/0, so that L'Hôpital's rule must be applied, and this has to be done several times! Whilst it is certainly possible to compute this limit, it may not be the quickest way to compute the residue. As an alternative we can also try to write down the Laurent series expansion valid for some domain of the form 0 < |z| < r, and simply read off the coefficient of the 1/z term. We have that

(sin z/z^2)^3 = (1/z^2)^3 (z − z^3/3! + z^5/5! − ⋯)^3
  = (1/z − z/3! + z^3/5! − ⋯)^3
  = 1/z^3 + 3(−1/3!)(1/z) + [3(1/3!)^2 + 3(1/5!)]z + ⋯
  = 1/z^3 − 1/(2z) + (13/120)z + ⋯

Hence the residue is −1/2.


(c) The function z^4/(z + 1)^3 has a pole of order m = 3 at z = −1. Therefore

Res_{z=−1} f(z) = (1/2!) lim_{z→−1} d^2/dz^2 [z^4]
  = (1/2) lim_{z→−1} 12z^2
  = 6

Let us check by comparing the above value to the coefficient of 1/(z + 1) in the Laurent expansion. Let u = z + 1. Then

z^4/(z + 1)^3 = (u − 1)^4/u^3
  = (1 − 4u + 6u^2 − 4u^3 + u^4)/u^3
  = 1/u^3 − 4/u^2 + 6/u − 4 + u
  = 1/(z + 1)^3 − 4/(z + 1)^2 + 6/(z + 1) − 4 + (z + 1)

Hence the residue is indeed 6.
EXERCISE 2.5.

1. Find the Laurent series expansions of the function
   1/(z − j)
   valid for each of the following regions:
   (a) |z| < 1
   (b) |z| > 1
   (c) |z − 1 − j| < 1
2. (a) Write the function
       f(z) = 1/(z^2 + 1)
       as a power series in the open disc |z| < 1.
   (b) Now use this power series expansion to find power series expansions for each of
       (i) 1/(z^2 + 1)^2,
       (ii) 1/(z^2 + 1)^3.
       Hint: Use the result on differentiation of power series.
3. Find the first four non-zero terms of the Taylor series of the following functions about the indicated points, and determine the radius of convergence in each case.
   (a) 1/(1 + z)   (z = 1)
   (b) 1/(z(z − 4j))   (z = 2j)
   (c) 12/z^2   (z = 1 + j)
4. Find the Maclaurin series of
   1/(1 + z + z^2)
   up to the z^3 term.
5. (a) Determine the Laurent series of f(z) = 1/(z^2(z − 1)) valid for a region of the form 0 < |z − z₀| < r where
       (i) z₀ = 0
       (ii) z₀ = 1
       In each case state the precise region of convergence, as well as the residue.
   (b) Determine the Laurent series of f(z) = z^2 sin(1/z) about z = 0. State the region of convergence as well as the residue.
   (c) Determine the Laurent series of f(z) = z/((z − 1)(2 − z)) valid for each of the following regions:
       (i) |z| < 1
       (ii) 1 < |z| < 2
       (iii) |z| > 2
       (iv) |z − 1| > 1
       (v) 0 < |z − 2| < 1
6. Classify the singularities of the following functions. Also determine the zeros, if any.
   (a) cos z/z^2
   (b) z/(z^4 − 1)
   (c) sin z/(z^2 + π^2)
   (d) (z + j)/((z + 2)^3(z − 3))
   (e) coth z
   (f) e^{z/(1−z)}
7. Expand each of the following functions in a Laurent series about z = 0, valid for a domain of the form 0 < |z| < r, and for those that have a singularity at 0, state what type it is.
   (a) (1 − cos z)/z^2
   (b) e^z/z^3
   (c) z^{−1} cosh(z^{−1})
   (d) tan^{−1}(z^2 + 2z + 2)
8. Determine all the poles of each of the following functions, and compute the residues at each of these poles.
   (a) (2z + 1)/(z^2 − z − 2)
   (b) (3z^2 + 2)/((z − 1)(z^2 + 9))
   (c) (z^3 − z^2 + z − 1)/(z^3 + 4z)
   (d) ((z + 1)/(z − 1))^2
   (e) (3 + 4z)/(z^3 + 3z^2 + 2z)
9. Determine the residues of each of the following functions at the indicated pole.
   (a) cos z/z   (z = 0)
   (b) sin z/(z^4 + z^2 + 1)   (z = e^{jπ/3})
   (c) 1/(z^2 + 1)^2   (z = j)

UNIT 6: RESIDUES AND INTEGRATION REVISITED


2.6.1. OBJECTIVE. To present Cauchy’s residue theorem, and see how it
may be used to compute a wide variety of complex contour integrals, as well as
several classes of important real integrals.

2.6.2. OUTCOMES. At the end of this unit the student should

• Be familiar with Cauchy's residue theorem, and be able to apply it to compute contour integrals of complex functions around closed contours;
• Be able to use residue theory to compute real integrals involving sines and cosines, and real integrals of the form ∫_0^∞ f(x) dx (where f(x) is a rational function).

2.6.3. THE RESIDUE THEOREM. This theorem draws together aspects of the theories of differentiation, series representation, and integration of a complex function. It is concerned with the evaluation of ∮_C f(z) dz, where C is a simple closed positively oriented contour, and where f(z) has a finite number of isolated singularities.

First let f be a function which is analytic throughout a domain of the form 0 < |z − z₀| < r, with an isolated singularity at z₀. We saw from Laurent's theorem that within this domain f may be represented as a series of the form

f(z) = Σ_{n=0}^∞ a_n (z − z₀)^n + Σ_{n=1}^∞ b_n/(z − z₀)^n,

and then noted in the discussion following definition 2.5.16 that the formula for the coefficients b_n in fact implies that

∮_C f(z) dz = 2πj b₁ = 2πj Res_{z=z₀} f(z)

where C is here a positively oriented simple closed contour enclosing z₀, which lies completely inside the domain 0 < |z − z₀| < r.

Now suppose that f is analytic inside and on some given positively oriented simple closed contour C, except for a few isolated singularities z₁, z₂, …, z_n inside the contour C. Each of these singularities z_k may of course be enclosed by a small positively oriented circle C_k centred at z_k, in such a way that all the C_k's lie inside C, and that no two distinct C_k's intersect each other. Since each singularity inside C will lie inside one of these C_k's, the function f will of course be analytic on the region which lies outside the C_k's, but inside C. But then by remark 2.4.10, we will have that

∮_C f(z) dz = ∮_{C₁} f(z) dz + ∮_{C₂} f(z) dz + ⋯ + ∮_{C_n} f(z) dz

But since for each fixed k the circle C_k contains only the singularity z_k, it follows from what we noted earlier that

∮_{C_k} f(z) dz = 2πj Res_{z=z_k} f(z).

Thus the above formula for the integral then becomes

∮_C f(z) dz = Σ_{k=1}^n ∮_{C_k} f(z) dz = 2πj {sum of residues inside C}.

Thus we arrive at the following conclusion:

Theorem 2.6.1 (Cauchy's Residue Theorem). Let f be analytic inside and on some positively oriented simple closed contour C, except for a finite number of isolated singularities z₁, z₂, …, z_n inside the contour C. Then

∮_C f(z) dz = 2πj Σ_{k=1}^n Res_{z=z_k} f(z).

In the above theorem it is very important to note that ONLY the residues of singularities inside C contribute to the value of the integral ∮_C f(z) dz. Thus although f may have many more singularities than just z₁, z₂, …, z_n, those other singularities play no role whatsoever in the computation of ∮_C f(z) dz if they lie outside C. The value of the integral ∮_C f(z) dz therefore depends only on the behaviour of the function f inside and on C.

Example 2.6.2. Use Cauchy's residue theorem to evaluate the integral

∮_C e^{3z}/(z^2(z^2 + 2z + 2)) dz

where C is the circle |z| = 3.

Solution: Notice that since z^2 + 2z + 2 = 0 whenever z = −1 ± j, we may write

e^{3z}/(z^2(z^2 + 2z + 2)) = e^{3z}/(z^2(z − (−1 + j))(z − (−1 − j))).

From this form it is now clear that the integrand f(z) has a double pole (order m = 2) at z = 0 and simple poles (order m = 1) at each of z = −1 ± j. All of these poles are inside C. (To see this note that |0| < 3 and |−1 ± j| < 3.) Therefore

∮_C e^{3z}/(z^2(z^2 + 2z + 2)) dz = 2πj [Res_{z=0} f(z) + Res_{z=−1+j} f(z) + Res_{z=−1−j} f(z)].

It remains to compute the residues.

Res_{z=0} f(z) = lim_{z→0} d/dz [e^{3z}/(z^2 + 2z + 2)]
  = lim_{z→0} [3e^{3z}(z^2 + 2z + 2) − e^{3z}(2z + 2)]/(z^2 + 2z + 2)^2
  = 1.

Res_{z=−1+j} f(z) = lim_{z→−1+j} e^{3z}/(z^2(z + 1 + j))
  = e^{3(−1+j)}/((−1 + j)^2(2j))
  = (1/(−4j^2)) e^{3(−1+j)}
  = (1/4) e^{3(−1+j)}

Res_{z=−1−j} f(z) = lim_{z→−1−j} e^{3z}/(z^2(z + 1 − j))
  = e^{3(−1−j)}/((−1 − j)^2(−2j))
  = (1/(−4j^2)) e^{3(−1−j)}
  = (1/4) e^{3(−1−j)}

Thus

∮_C e^{3z}/(z^2(z^2 + 2z + 2)) dz = 2πj [1 + (1/4)e^{3(−1+j)} + (1/4)e^{3(−1−j)}]
  = 2πj [1 + (1/4)e^{−3}(e^{3j} + e^{−3j})]
  = 2πj [1 + (1/2)e^{−3} cos 3].

(In the above we used the fact that cos 3 = (1/2)(e^{3j} + e^{−3j}).)
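Since this is a closed-contour integral, the result can also be checked by brute force: parametrise |z| = 3 and integrate numerically. A sketch (not part of the original solution, and assuming numpy is available); the trapezoidal rule is extremely accurate here because the integrand is smooth and periodic on the circle:

    import numpy as np

    theta = np.linspace(0.0, 2*np.pi, 20000, endpoint=False)
    z = 3*np.exp(1j*theta)
    dz = 3j*np.exp(1j*theta)                        # dz/dtheta on the circle
    f = np.exp(3*z)/(z**2*(z**2 + 2*z + 2))

    print(np.mean(f*dz)*2*np.pi)                    # numerical contour integral
    print(2j*np.pi*(1 + 0.5*np.exp(-3)*np.cos(3)))  # value from the residue theorem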

Example 2.6.3. Evaluate the integral

∮_C (z^3 − z^2 + z − 1)/(z^3 + 4z) dz

where C is the positively oriented circle
(i) |z| = 3,
(ii) |z| = 1.

Solution: Since z^3 + 4z = z(z^2 + 4) = z(z − 2j)(z + 2j), we have that

(z^3 − z^2 + z − 1)/(z^3 + 4z) = (z^3 − z^2 + z − 1)/(z(z − 2j)(z + 2j)).

The integrand therefore clearly has poles of order 1 at each of 0, 2j and −2j. We proceed to compute the residues at these poles.

Res_{z=0} f(z) = lim_{z→0} (z^3 − z^2 + z − 1)/(z^2 + 4) = −1/4

Res_{z=2j} f(z) = lim_{z→2j} (z^3 − z^2 + z − 1)/(z(z + 2j))
  = ((2j)^3 − (2j)^2 + 2j − 1)/(2j(4j))
  = (−8j + 4 + 2j − 1)/(−8)
  = −3/8 + j(3/4).

Res_{z=−2j} f(z) = lim_{z→−2j} (z^3 − z^2 + z − 1)/(z(z − 2j))
  = ((−2j)^3 − (−2j)^2 + (−2j) − 1)/(−2j(−4j))
  = (8j + 4 − 2j − 1)/(−8)
  = −3/8 − j(3/4).

(i) |z| = 3 encloses all three poles. Thus

∮_C (z^3 − z^2 + z − 1)/(z^3 + 4z) dz = 2πj [−1/4 + (−3/8 + j3/4) + (−3/8 − j3/4)] = −2πj

(ii) |z| = 1 encloses only z = 0. Thus in this case

∮_C (z^3 − z^2 + z − 1)/(z^3 + 4z) dz = 2πj (−1/4) = −(1/2)πj.

2.6.4. EVALUATION OF DEFINITE REAL INTEGRALS. By means of a few ingenious tricks, residue theory can be used to compute many definite real integrals in a fairly straightforward way. Often these integrals are very difficult to compute by other means. Thus residue theory is not just important for complex functions, but has many useful applications to the theory of real functions as well. We are going to consider two of the most common types of real integrals that can be evaluated using residue theory.

2.6.4.1. Type 1: Real integrals of the form ∫_{−∞}^∞ f(x) dx where f(x) is a rational function of the real variable x. Let f be a rational function for which ∫_{−∞}^∞ f(x) dx converges, and suppose that the complex rational function f(z) has no singularities on the real axis. Now consider the contour integral ∮_{C_R} f(z) dz, where C_R is made up of the line segment on the real axis from −R to R, and the semicircle Γ.

Figure 2.43.

Since z = x on the real axis, we have

∮_{C_R} f(z) dz = ∫_{−R}^R f(x) dx + ∫_Γ f(z) dz

Notice that lim_{R→∞} ∫_{−R}^R f(x) dx = ∫_{−∞}^∞ f(x) dx. So in situations where ∫_Γ f(z) dz tends to 0 as R increases, we should have lim_{R→∞} ∮_{C_R} f(z) dz = ∫_{−∞}^∞ f(x) dx. But when does ∫_Γ f(z) dz tend to 0 as R → ∞? To get some idea of when this will work, we proceed to estimate this integral.

We know from section 2.4.3 that

|∫_Γ f(z) dz| ≤ ML

where L is the length of Γ, and M is an upper bound for |f(z)| on Γ. Since Γ is a half-circle of radius R, the length of Γ is just πR. So if for some constant M₀ > 0 and some k > 1 we have that |f(z)| ≤ M₀/R^k for all z on Γ, then we will have that

lim_{R→∞} |∫_Γ f(z) dz| ≤ lim_{R→∞} M₀πR/R^k = lim_{R→∞} πM₀/R^{k−1} = 0

so that

lim_{R→∞} ∫_Γ f(z) dz = 0.

Thus in this case we will then have that

lim_{R→∞} ∮_{C_R} f(z) dz = ∫_{−∞}^∞ f(x) dx

Let's suppose this limit formula holds. Since f is a rational function (with no singularities on the real axis), it will have finitely many singularities in the upper half-plane (i.e. lying above the real axis). Now whenever R > 0 is big enough for C_R to enclose all of the singularities in the upper half-plane, we will have that

∮_{C_R} f(z) dz = 2πj [sum of residues in upper half-plane].

Combining this with the limit formula above, we will then have that

∫_{−∞}^∞ f(x) dx = 2πj [sum of residues in upper half-plane].

In summary, we need to execute the following steps when using residue theory to compute integrals of the form ∫_{−∞}^∞ f(x) dx:

• Define a suitable contour C_R in the manner described above.
• Estimate |f(z)| on Γ and check whether ∫_Γ f(z) dz → 0 as R → ∞.
• If this works, deduce from ∮_{C_R} f(z) dz = ∫_{−R}^R f(x) dx + ∫_Γ f(z) dz that ∫_{−∞}^∞ f(x) dx = 2πj [sum of residues in upper half-plane].
• Compute the residues in the upper half-plane, and write down the value of the integral.
Remark 2.6.4. (i) The semicircular path Γ can be parametrised by z = Re^{jθ} (0 ≤ θ ≤ π) (in which case dz = jRe^{jθ} dθ). So

∫_Γ f(z) dz = ∫_0^π f(Re^{jθ}) jRe^{jθ} dθ

For this to tend to 0 as R → ∞, |f(Re^{jθ}) jRe^{jθ}| = |f(Re^{jθ})|·R must decrease to zero as R → ∞; hence the condition that |f(Re^{jθ})| ≤ M₀/R^k for some k > 1. Thus if the degree of the denominator of the rational function f(x) is at least two more than the degree of the numerator, this approach will work.

(ii) If f(x) is an even function [i.e. f(−x) = f(x)], the same approach can be used to compute integrals of the form ∫_0^∞ f(x) dx, since then

∫_{−∞}^∞ f(x) dx = 2 ∫_0^∞ f(x) dx

Example 2.6.5. Use contour integration to show that

∫_{−∞}^∞ dx/(x^2 + 4)^2 = π/16

Solution: The degree of the denominator is 4 more than that of the numerator; hence contour integration may be used. Consider the integral

∮_{C_R} dz/(z^2 + 4)^2

where C_R is the contour as shown in Figure 2.43. Then

∮_{C_R} dz/(z^2 + 4)^2 = ∫_{−R}^R dx/(x^2 + 4)^2 + ∫_Γ dz/(z^2 + 4)^2.

Since

1/(z^2 + 4)^2 = 1/[(z − 2j)(z + 2j)]^2 = 1/((z − 2j)^2(z + 2j)^2),

the integrand has double poles (order m = 2) at z = ±2j. However, only z = 2j will be inside C_R for R big enough (−2j is below the x-axis). Thus if R is big enough for C_R to include 2j (this will happen when R > 2), then

∮_{C_R} dz/(z^2 + 4)^2 = 2πj Res_{z=2j} f(z).

Since

Res_{z=2j} f(z) = lim_{z→2j} d/dz [1/(z + 2j)^2]
  = lim_{z→2j} −2/(z + 2j)^3
  = −2/(64j^3)
  = −j/32,

we therefore have that

∫_{−R}^R dx/(x^2 + 4)^2 + ∫_Γ dz/(z^2 + 4)^2 = ∮_{C_R} dz/(z^2 + 4)^2 = 2πj(−j/32) = π/16

whenever R > 2.

For any z on Γ we have that |z| = R, so that |z^2 + 4| ≥ |z^2| − |4| = |z|^2 − 4 = R^2 − 4 on Γ. It follows that |1/(z^2 + 4)^2| ≤ 1/(R^2 − 4)^2 for all z on Γ. Since the length of Γ is L = πR, it follows that

|∫_Γ dz/(z^2 + 4)^2| ≤ ML = πR/(R^2 − 4)^2.

Since πR/(R^2 − 4)^2 → 0 as R → ∞, it follows that lim_{R→∞} ∫_Γ dz/(z^2 + 4)^2 = 0, and hence that

∫_{−∞}^∞ dx/(x^2 + 4)^2 = lim_{R→∞} [π/16 − ∫_Γ dz/(z^2 + 4)^2] = π/16.

Note: In this particular case we could also have used elementary calculus to solve this problem. If we set x = 2 tan θ where −π/2 < θ < π/2, then of course dx = 2 sec^2 θ dθ with x^2 + 4 = (2 tan θ)^2 + 4 = 4(tan^2 θ + 1) = 4 sec^2 θ. Thus

∫_{−∞}^∞ dx/(x^2 + 4)^2 = ∫_{−π/2}^{π/2} 2 sec^2 θ dθ/(16 sec^4 θ)
  = (1/8) ∫_{−π/2}^{π/2} cos^2 θ dθ
  = (1/8) ∫_{−π/2}^{π/2} (1/2)(1 + cos 2θ) dθ
  = (1/16)[θ + (1/2) sin 2θ]_{−π/2}^{π/2}
  = π/16


Example 2.6.6. Use contour integration to evaluate

∫_0^∞ dx/(x^6 + 1)

Solution: Consider the integral

∮_{C_R} dz/(z^6 + 1)

where C_R is the contour as shown in Figure 2.43. Now z^6 + 1 has six distinct roots, namely the 6th roots of −1. In exponential form −1 = e^{j(π+2kπ)}. So the roots of z^6 = −1 = e^{j(π+2kπ)} are

z_k = e^{j((π/6)+k(π/3))},   k = 0, 1, 2, 3, 4, 5.

Since

1/(z^6 + 1) = 1/((z − z₀)(z − z₁)⋯(z − z₅)),

each z_k is a pole of order 1. Of these, only z₀ = e^{jπ/6}, z₁ = e^{jπ/2} and z₂ = e^{j5π/6} lie within C_R for R big enough. (The others are all below the x-axis.) Thus for R big enough

∫_{−R}^R dx/(x^6 + 1) + ∫_Γ dz/(z^6 + 1) = ∮_{C_R} dz/(z^6 + 1)
  = 2πj [Res_{z=z₀} f(z) + Res_{z=z₁} f(z) + Res_{z=z₂} f(z)]

For any z on Γ we will have that |z^6 + 1| ≥ |z|^6 − 1 = R^6 − 1. So if R > 1, this in turn ensures that

|1/(z^6 + 1)| ≤ 1/(R^6 − 1)

for all z on Γ. But then

|∫_Γ dz/(z^6 + 1)| ≤ [1/(R^6 − 1)]·πR → 0

as R → ∞, which means that lim_{R→∞} ∫_Γ dz/(z^6 + 1) = 0. Thus

∫_{−∞}^∞ dx/(x^6 + 1) = lim_{R→∞} ∫_{−R}^R dx/(x^6 + 1)
  = 2πj [Res_{z=z₀} f(z) + Res_{z=z₁} f(z) + Res_{z=z₂} f(z)] − lim_{R→∞} ∫_Γ dz/(z^6 + 1)
  = 2πj [Res_{z=z₀} f(z) + Res_{z=z₁} f(z) + Res_{z=z₂} f(z)]

Since

Res_{z=e^{jπ/6}} f(z) = lim_{z→e^{jπ/6}} (z − e^{jπ/6})/(z^6 + 1)
  = lim_{z→e^{jπ/6}} 1/(6z^5)   (L'Hôpital's rule)
  = (1/6)e^{−j5π/6},

Res_{z=e^{jπ/2}} f(z) = lim_{z→e^{jπ/2}} (z − e^{jπ/2})/(z^6 + 1)
  = lim_{z→e^{jπ/2}} 1/(6z^5)
  = (1/6)e^{−j5π/2}
  = (1/6)e^{−jπ/2}·e^{−j2π}
  = (1/6)e^{−jπ/2},

and

Res_{z=e^{j5π/6}} f(z) = lim_{z→e^{j5π/6}} (z − e^{j5π/6})/(z^6 + 1)
  = lim_{z→e^{j5π/6}} 1/(6z^5)
  = (1/6)e^{−j25π/6}
  = (1/6)e^{−jπ/6}·e^{−j4π}
  = (1/6)e^{−jπ/6},

it therefore follows that

∫_{−∞}^∞ dx/(x^6 + 1) = (2πj/6)[e^{−j5π/6} + e^{−jπ/2} + e^{−jπ/6}]
  = (2πj/6)[(−√3/2 − j/2) + (−j) + (√3/2 − j/2)]
  = 2π/3.

But

f(x) = 1/(x^6 + 1)

is an even function, and hence

∫_0^∞ dx/(x^6 + 1) = (1/2) ∫_{−∞}^∞ dx/(x^6 + 1) = π/3.
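Both Type 1 results above can be confirmed symbolically. A sketch, assuming sympy is available (not part of the original examples):

    import sympy as sp

    x = sp.symbols('x', real=True)
    print(sp.integrate(1/(x**2 + 4)**2, (x, -sp.oo, sp.oo)))  # pi/16 (Example 2.6.5)
    print(sp.integrate(1/(x**6 + 1), (x, 0, sp.oo)))          # pi/3  (Example 2.6.6)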

2.6.4.2. Type 2: Real integrals involving sines and cosines. Here we consider definite real integrals of the form

I = ∫_0^{2π} G(sin θ, cos θ) dθ

where G is a function of sin θ and cos θ.

Let C be the positively oriented circle |z| = 1. This circle can be parametrised by z(θ) = e^{jθ} where 0 ≤ θ ≤ 2π. From the definitions of the complex versions of sin and cos, it is clear that

sin θ = (e^{jθ} − e^{−jθ})/2j,   cos θ = (e^{jθ} + e^{−jθ})/2.

But since z = e^{jθ} on C, it therefore follows that

sin θ = (1/2j)(z − 1/z),   cos θ = (1/2)(z + 1/z),

with dz = je^{jθ} dθ, or equivalently

dθ = dz/(jz).

On making these substitutions, it is clear that any integral of the form

∫_0^{2π} G(sin θ, cos θ) dθ

(where G is an integrable function of sin and cos) may be rewritten in the form

∫_0^{2π} G(sin θ, cos θ) dθ = ∮_C G((1/2j)(z − 1/z), (1/2)(z + 1/z)) dz/(jz)

where C is the unit circle |z| = 1.

Remark 2.6.7. Let C be the circle |z| = 1 parametrised by z(θ) = e^{jθ} (0 ≤ θ ≤ 2π). When dealing with terms like sin(mθ) and cos(mθ) where m ∈ N, the same argument as above shows that

sin(mθ) = (e^{jmθ} − e^{−jmθ})/2j = (1/2j)(z^m − 1/z^m)

and

cos(mθ) = (e^{jmθ} + e^{−jmθ})/2 = (1/2)(z^m + 1/z^m).

Example 2.6.8. Use contour integration to evaluate

∫_0^{2π} dθ/(2 + cos θ)

Solution: Let C be the circle |z| = 1 parametrised by z(θ) = e^{jθ}, where 0 ≤ θ ≤ 2π. Then

cos θ = (1/2)(z + 1/z)   and   dθ = dz/(jz)

Thus

∫_0^{2π} dθ/(2 + cos θ) = ∮_C [1/(2 + (1/2)(z + 1/z))] dz/(jz) = ∮_C 2/(j(z^2 + 4z + 1)) dz.

The integrand of the complex contour integral has simple poles where z^2 + 4z + 1 = 0 (i.e. at z = −2 ± √3), of which only z = −2 + √3 is inside |z| < 1. (Notice that |−2 − √3| = 2 + √3 > 1.) Thus

∮_C 2/(j(z^2 + 4z + 1)) dz = 2πj Res_{z=−2+√3} f(z)

where

Res_{z=−2+√3} f(z) = (2/j) lim_{z→−2+√3} (z + 2 − √3) · 1/((z + 2 − √3)(z + 2 + √3))
  = (2/j) lim_{z→−2+√3} 1/(z + 2 + √3)
  = (2/j) · 1/(2√3)
  = 1/(j√3)

Therefore

∫_0^{2π} dθ/(2 + cos θ) = ∮_C 2/(j(z^2 + 4z + 1)) dz = 2πj · 1/(j√3) = 2π/√3.

EXERCISE 2.6.

1. Show that
   (1/2πj) ∮_C e^{zt}/(z^2 + 1) dz = sin t,
   where t > 0, and C is the circle |z| = 3.
2. Using the residue theorem, evaluate the following integrals.
   (a) ∮_C (3z^2 + 2)/((z − 1)(z^2 + 4)) dz, where C is the circle:
       (i) |z − 2| = 2,   [2πj]
       (ii) |z| = 4.   [6πj]
   (b) ∮_C (z^2 − 2z)/((z + 1)^2(z^2 + 4)) dz, where C is the circle:
       (i) |z| = 3,   [0]
       (ii) |z + j| = 2.   [2π(1 − 7j)/25]
   (c) ∮_C dz/((z + 1)^3(z^2 − 3z + 2)), where C is the circle:
       (i) |z| = 1/2,   [0]
       (ii) |z + 1| = 1.   [19πj/108]
   (d) ∮_C (z − 1)/((z^2 − 4)(z + 1)^4) dz, where C is
       (i) the circle |z| = 1/2,   [0]
       (ii) the circle |z + 3/2| = 2,   [−πj/162]
       (iii) the triangle with vertices at 3 and −3/2 ± j.   [−3πj/2]
3. Using a suitable contour integral, evaluate the following real integrals:
   (a) ∫_{−∞}^∞ x^2 dx/((x^2 + 1)^2(x^2 + 2x + 2))   [7π/50]
   (b) ∫_{−∞}^∞ dx/(x^2 + x + 1)   [2π/√3]
   (c) ∫_0^∞ dx/((x^2 + 1)(x^2 + 4)^2)   [5π/288]
   (d) ∫_{−∞}^∞ dx/(x^2 + 4x + 5)^2   [π/2]
   (e) ∫_{−∞}^∞ dx/(x^4 + 1)   [π√2/2]
4. Using contour integration, evaluate
   (a) ∫_0^{2π} dθ/(3 − 2 cos θ + sin θ),   [π]
   (b) ∫_0^{2π} dθ/(5 + 4 sin θ),   [2π/3]
   (c) ∫_0^{2π} (cos 3θ)/(5 − 4 cos θ) dθ,   [π/12]
   (d) ∫_0^{2π} dθ/(5 − 3 sin θ)^2.   [5π/32]

A RESUMÉ OF COMPLEX INTEGRATION


When integrating complex functions on a given contour, the elegance with which we are able to integrate depends entirely on the extent to which the given function is analytic. If the function is very badly behaved and not at all analytic on or near the contour, then our only real option is to parametrise the contour and compute the integral from "first principles". If however the integrand, say f, does behave well in the sense of being analytic everywhere except for some isolated points, we have many more options available to us.

If the contour C is not closed and f is analytic on some domain containing C, we may then use antiderivatives to compute ∫_C f(z) dz. If C is closed and f is analytic on and inside C, then of course ∮_C f(z) dz = 0. (Note that if we use Cauchy's theorem to conclude that ∮_C f(z) dz = 0, this does not mean that f must have no singularities anywhere, but just that it has no singularities on or inside C. For example although f(z) = 1/z has a singularity at z = 0, this is outside |z − 2| = 1 and so we still have

∮_{|z−2|=1} (1/z) dz = 0

by Cauchy's theorem.)
by Cauchy’s theorem.)
If C is a positively oriented closed contour and f is analytic on C with, in addition, only finitely many singularities inside C, we still have several options available to us for computing the integral. Firstly there is the powerful residue theorem we may use in this case. Again it is only those residues that correspond to points inside C that are relevant here. For example let f(z) = sin z/(z(z − 2)) and let C be the circle |z| = 1 in the positive sense. Although here f has singularities at both z = 0 and z = 2, only z = 0 is inside C, and so here ∮_C sin z/(z(z − 2)) dz = 2πj Res_{z=0} f(z). Of course if we are to use this result effectively we must be adept at computing residues and Laurent series. (We will come back to this point later.)
If now f is of the form f(z) = g(z)/p(z) where g is analytic inside and on C and p is a polynomial, we can avoid the use of residues altogether and rather use the Cauchy integral formulas to compute the integral ∮_C f(z) dz. The idea is as follows: If g(z)/p(z) is a rational function, we can use partial fractions to decompose it into a sum of simpler terms to which we may apply the integration formulas. However even if g is not a polynomial, we can still use partial fractions to decompose 1/p(z) into a sum of terms of the form K/(z − a)^n, and hence decompose f into a sum of terms of the form Kg(z)/(z − a)^n. Then use the Cauchy integration formulas to compute the integral of each Kg(z)/(z − a)^n and add the results to get ∮_C f(z) dz. For example if f(z) = sin z/(z(z − 2)) and C is the positively oriented circle |z| = 3, then since

1/(z(z − 2)) = 1/(2(z − 2)) − 1/(2z)

we can write

sin z/(z(z − 2)) = (sin z/2)/(z − 2) − (sin z/2)/z

and use the integration formulas to get

∮_C sin z/(z(z − 2)) dz = ∮_C [(sin z)/2]/(z − 2) dz − ∮_C [(sin z)/2]/z dz
  = 2πj [(sin z)/2]_{z=2} − 2πj [(sin z)/2]_{z=0}
  = πj sin 2.

(If of course we were integrating over the positively oriented circle |z| = 1 instead of |z| = 3, we would not have needed partial fractions, since in this case only the singularity z = 0 lies inside |z| = 1. In particular this means that sin z/(z − 2), although not differentiable at z = 2, is nevertheless analytic inside and on |z| = 1. We can then directly see that

∮_{|z|=1} sin z/(z(z − 2)) dz = ∮_{|z|=1} [sin z/(z − 2)]/z dz = 2πj [sin z/(z − 2)]_{z=0} = 0.)

So in principle any student who masters the above simple guidelines should be able to integrate any rational function over a given closed contour C. Being able to do so in turn puts us in a position to compute a large number of non-trivial improper real integrals and even a whole host of integrals of trigonometric functions.
MODULE 3

LAPLACE TRANSFORMS–
CONTINUOUS SIGNALS AND SYSTEMS

UNIT 1: DEFINITIONS AND PROPERTIES


3.1.1. OBJECTIVE. To revise the work done previously on the Laplace transform, and to extend these results to possibly complex variables. Also to introduce the initial value and final value theorems, to describe the Laplace transform of periodic functions, and to introduce the convolution theorem.

3.1.2. OUTCOMES. At the end of this unit the student should


• Be familiar with the basic definitions and properties of the Laplace Trans-
form as presented in MAT301W;
• Understand how this theory can be extended to a complex variable;
• Understand and be able to apply the initial value and final value theorems
(the student should in particular also be able to determine when these
theorems are applicable and when not);
• Be able to use tables of Laplace transforms to compute both Laplace
transforms and inverse transforms;
• Understand the convolution theorem and be able to apply it in computing
inverse transforms.

3.1.3. DEFINITIONS. Given an integrable function f on [0, ∞), the Laplace transform of f is formally defined to be

L{f(t)} = F(p) = ∫_{0⁻}^∞ e^{−pt} f(t) dt,

where p may be taken to be a real or complex parameter. (In earlier courses, p was taken to be real only.) In the above integral formula, the lower limit 0⁻ is shorthand for lim_{ε→0⁺} ∫_{−ε}^∞. Where there is no danger of confusion we will simply write 0 for 0⁻. Of course if for example f is continuous at 0, there is no difference anyway between ∫_{0⁻}^∞ e^{−pt} f(t) dt and ∫_0^∞ e^{−pt} f(t) dt. It is usual to refer to {f(t), F(p)} as a Laplace transform pair.

A function f on R is defined to be causal if f(t) = 0 for t < 0. When studying the Laplace transform we will generally restrict ourselves to causal functions. The reason for this is that we will for the most part be applying these techniques to causal systems, where we essentially have no information about the system prior to time t = 0. Any function on R can of course be turned into a causal function by simply multiplying it with the so-called Heaviside unit step function H(t). This function is defined by

H(t) = { 0,  t < 0;   1,  t ≥ 0 }

For any f we clearly have that

f(t)H(t) = { 0,  t < 0;   f(t),  t ≥ 0 }

which is causal. By replacing a function f with f(t)H(t), we effectively discard the part of the function defined on the negative real line. As noted above, if f is continuous at 0, there is no difference at all between ∫_{0⁻}^∞ e^{−pt} f(t) dt and ∫_0^∞ e^{−pt} f(t) dt. Thus for such functions we don't really lose anything by replacing f(t) with f(t)H(t), since the behaviour of the function on the negative real line (−∞, 0) makes no contribution to the computation of the Laplace transform at all. So it is only when we specifically wish to accommodate peculiarities that may occur at 0 that we need to be careful about passing from f(t) to f(t)H(t). If f has a jump discontinuity at 0, then the function f(t)H(t) may not be continuous on [0, ∞). If we are not too worried about what happens at 0, we could simply "remove" this discontinuity by redefining the value of f(t)H(t) at 0 to be f(0⁺) = lim_{t→0⁺} f(t). If however we are dealing with something like an impulse applied at t = 0, where the behaviour at t = 0 is very significant, it becomes very important to use ∫_{0⁻}^∞ e^{−pt} f(t) dt rather than ∫_0^∞ e^{−pt} f(t) dt to compute the transform. Unless otherwise stated, all functions f(t) will for the remainder of this chapter be assumed to be causal.
For a function f to have a well-defined Laplace transform, the improper integral

∫_0^∞ e^{−pt} f(t) dt

must exist for all p, i.e.

lim_{T→∞} ∫_0^T e^{−pt} f(t) dt

must exist. Thus if we want to identify a large class of functions which do have Laplace transforms, we really need to identify a large class of functions for which these integrals converge. Clearly the factor e^{−pt} acts as a convergence factor. What we mean by that is that if we have a function f for which f(t) grows a lot faster than e^{pt}, the values f(t) will eventually "overwhelm" the values e^{−pt}, causing the integral ∫_0^∞ e^{−pt} f(t) dt to diverge. Thus in order to be able to formulate sufficient conditions for the existence of L{f(t)}, we need to find conditions which will exclude functions which grow too fast as t → ∞. To do this we introduce the notion of exponential order.

Definition 3.1.1. A function f(t) is said to be of exponential order σ as t → ∞ if there exists a real number σ and positive constants M and T such that

|f(t)| < M e^{σt}   for all t > T

Fortunately most of the more commonly used functions are of exponential order. An example of a function which is not of exponential order is e^{t²}. The problem here is that e^{t²} will always grow a lot faster than M e^{σt}, no matter how large M and σ may be.

Example 3.1.2. Show that f(t) = t^3 (t ≥ 0) is of exponential order with σ > 0.

Solution: Since

e^{αt} = 1 + αt + (αt)^2/2! + (αt)^3/3! + ⋯,

it follows that for any α > 0 and any t > 0 we have that (αt)^3/3! < e^{αt}. In other words

t^3 < (3!/α^3) e^{αt}

for any α > 0 and any t > 0, so that t^3 is of exponential order with σ > 0.

Example 3.1.3. The function f(t) = e^{4t} is of exponential order with σ ≥ 4.

These two examples clearly show that the value of σ for which a given function is of exponential order is not unique. For this reason we define σ_c to be the greatest lower bound of the set of possible values of σ for which a given function f is of exponential order, and refer to this value as the abscissa of convergence of f(t). For f(t) = t^3, σ_c = 0, and for f(t) = e^{4t}, σ_c = 4.

To ensure the existence of ∫_0^∞ e^{−pt} f(t) dt, we would also like to exclude functions which have too many discontinuities, as this can also cause the integral to diverge. Hence we will restrict ourselves to functions f(t) which are piecewise continuous on all subintervals of [0, ∞) of the form [a, b] (i.e. on any interval of the form [a, b] the function must be bounded and have a finite number of discontinuities). We will call such functions piecewise regular.

Now suppose that f is piecewise regular, and also of exponential order with abscissa of convergence σ_c. Then of course we can find T > 0 so that |f(t)| ≤ M e^{σ_c t} for all t > T. Therefore

|∫_T^∞ e^{−pt} f(t) dt| ≤ ∫_T^∞ |e^{−pt}||f(t)| dt

Now suppose that p is a complex number with p = σ + jω, where σ and ω are real. Then

|e^{−pt}| = |e^{−(σ+jω)t}| = |e^{−σt}||e^{−jωt}| = e^{−σt}

since |e^{−jωt}| = 1 and e^{−σt} > 0. Consequently

|∫_T^∞ e^{−pt} f(t) dt| ≤ ∫_T^∞ |e^{−pt}||f(t)| dt
  = ∫_T^∞ e^{−σt}|f(t)| dt
  ≤ M ∫_T^∞ e^{−σt} e^{σ_c t} dt
  = M ∫_T^∞ e^{−(σ−σ_c)t} dt.

But this last integral will converge whenever σ > σ_c. Hence if f is piecewise regular and of exponential order, ∫_T^∞ e^{−pt} f(t) dt will exist whenever Re(p) = σ > σ_c. But if f is piecewise regular, ∫_0^T e^{−pt} f(t) dt will always exist, since e^{−pt} f(t) will then be piecewise continuous on [0, T]. Thus it follows that if f is piecewise regular and of exponential order,

L{f(t)} = F(p)
  = ∫_0^T e^{−pt} f(t) dt + ∫_T^∞ e^{−pt} f(t) dt
  = ∫_0^∞ e^{−pt} f(t) dt

will exist for all p with Re(p) > σ_c.

It should be stressed that the conditions used above (f(t) being piecewise regular and of exponential order) are sufficient for the Laplace transform to exist, but not always necessary. Even if these conditions are not satisfied, the integral ∫_0^∞ e^{−pt} f(t) dt may still sometimes exist.
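For readers who want to experiment, sympy implements the defining integral directly. A sketch (not part of the original text), assuming sympy is available; laplace_transform returns the transform together with the abscissa of convergence and any side conditions:

    import sympy as sp

    t, p = sp.symbols('t p')
    print(sp.laplace_transform(sp.exp(4*t), t, p))  # (1/(p - 4), 4, True): sigma_c = 4
    print(sp.laplace_transform(t**3, t, p))         # (6/p**4, 0, True):   sigma_c = 0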

3.1.4. BASIC PROPERTIES OF THE LAPLACE TRANSFORM. Unless clearly otherwise stated, we will throughout this section assume that f and g are functions which are piecewise regular and of exponential order. This ensures that both L{f(t)} and L{g(t)} will exist. We present some basic properties of the Laplace transform. Further properties regarding periodic functions, as well as step and impulse functions, will be presented in Unit 3.4.

3.1.4.1. The linearity property.

Theorem 3.1.4. Let f, g be any two functions for which the Laplace transform exists. Then for any two complex numbers α, β, the Laplace transform of αf(t) + βg(t) also exists. Moreover

L{αf(t) + βg(t)} = αL{f(t)} + βL{g(t)}.

If σ_c, σ₁ and σ₂ are the abscissas of convergence of αf + βg, f and g respectively, we have that σ_c ≤ max(σ₁, σ₂).

Proof: Notice that

L{αf + βg}(p) = ∫_0^∞ e^{−pt}[αf(t) + βg(t)] dt
  = ∫_0^∞ e^{−pt} αf(t) dt + ∫_0^∞ e^{−pt} βg(t) dt
  = α ∫_0^∞ e^{−pt} f(t) dt + β ∫_0^∞ e^{−pt} g(t) dt
  = αL{f} + βL{g}.

Regarding the region of convergence: if f(t) and g(t) have abscissas σ₁ and σ₂ respectively, then (for t large enough) |f(t)| ≤ M₁e^{σ₁t} and |g(t)| ≤ M₂e^{σ₂t}. Thus

|αf(t) + βg(t)| ≤ |α||f(t)| + |β||g(t)|
  ≤ |α|M₁e^{σ₁t} + |β|M₂e^{σ₂t}
  ≤ (|α|M₁ + |β|M₂)e^{σt}

where

σ = max(σ₁, σ₂).

Clearly σ_c ≤ max(σ₁, σ₂).

Example 3.1.5. Given that


        L{e^{bt}} = 1/(p − b),

where ℜ(p) > ℜ(b), use this fact to determine the Laplace transforms of sin at and cos at.

Solution: For any p with ℜ(p) > ℜ(ja) = −ℑ(a), we have

        L{e^{jat}}(p) = 1/(p − ja).

Next recall that cos(at) = ½(e^{jat} + e^{−jat}) and sin at = (1/2j)(e^{jat} − e^{−jat}). Therefore

        L{cos at} = ½ [L{e^{jat}} + L{e^{−jat}}]
                  = ½ [1/(p − ja) + 1/(p + ja)]
                  = ½ · [(p + ja) + (p − ja)] / [(p + ja)(p − ja)]
                  = p/(p^2 + a^2).

Similarly

        L{sin at} = a/(p^2 + a^2).
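As an illustrative aside, these two transforms are easy to double-check with a computer algebra system. The short Python sketch below is not part of the original derivation and assumes the SymPy library is available:

    import sympy as sp

    t, p, a = sp.symbols('t p a', positive=True)

    # L{cos at} and L{sin at}; noconds=True suppresses the convergence conditions
    print(sp.laplace_transform(sp.cos(a*t), t, p, noconds=True))  # p/(a**2 + p**2)
    print(sp.laplace_transform(sp.sin(a*t), t, p, noconds=True))  # a/(a**2 + p**2)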

3.1.4.2. First translation or shifting property.

Theorem 3.1.6. If L{f(t)} = F(p), then L{e^{bt} f(t)} = F(p − b). If σc is the abscissa of convergence of f, then σc + ℜ(b) will be the abscissa of convergence of e^{bt} f(t).

Example 3.1.7. We have that


        L{cos at}(p) = p/(p^2 + a^2)    whenever ℜ(p) > 0.

Thus

        L{e^{bt} cos at} = (p − b)/((p − b)^2 + a^2)    whenever ℜ(p) > ℜ(b).

3.1.4.3. Second translation or shifting property.

Theorem 3.1.8. If
L {f (t)} = F (p)
and
        f(t − k) H(t − k) = { f(t − k)   t ≥ k
                            { 0          t < k
then
        L{f(t − k) H(t − k)} = e^{−kp} L{f(t)} = e^{−kp} F(p).

Example 3.1.9. Given that


        L{t^4} = 4!/p^5,

the Laplace Transform of

        (t − 5)^4 H(t − 5) = { (t − 5)^4   t ≥ 5
                             { 0           t < 5

will then be

        L{(t − 5)^4 H(t − 5)} = e^{−5p} L{t^4} = e^{−5p} · 4!/p^5.

3.1.4.4. Change of scale property.


Theorem 3.1.10. If L{f(t)} = F(p) then

        L{f(at)} = (1/a) F(p/a).

Example 3.1.11. Given that

        L{sin t} = 1/(p^2 + 1),

it follows that

        L{sin at} = (1/a) · 1/((p/a)^2 + 1)
                  = a/(p^2 + a^2).

3.1.4.5. Laplace transforms of derivatives.


Theorem 3.1.12. If f, f′, f″, . . . , f^(n−1), f^(n) are all of exponential order, with f^(n) piecewise regular and f, f′, f″, . . . , f^(n−1) continuous, then

        L{f^(n)(t)} = p^n F(p) − p^(n−1) f(0) − p^(n−2) f′(0) − . . . − p f^(n−2)(0) − f^(n−1)(0)

where L{f(t)} = F(p). For example

        L{f‴(t)} = p^3 F(p) − p^2 f(0) − p f′(0) − f″(0).

3.1.4.6. Laplace transforms of integrals.


Theorem 3.1.13. If L{f(t)} = F(p) then

        L{∫_0^t f(u) du} = F(p)/p.

Example 3.1.14. Since

        L{sin 4t} = 4/(p^2 + 16),

the Laplace Transform of the function g(t) = ∫_0^t sin 4u du will be

        L{∫_0^t sin 4u du} = 4/(p(p^2 + 16)).

3.1.4.7. Multiplication by t^n.

Theorem 3.1.15. Let f be piecewise regular and of exponential order, and let L{f(t)} = F(p). Then F^(n)(p) exists for each n ∈ N, and

        L{t^n f(t)} = (−1)^n d^n/dp^n F(p) = (−1)^n F^(n)(p).

Example 3.1.16. Since


        L{e^{4t}} = 1/(p − 4),

we have

        L{t^2 e^{4t}} = (−1)^2 d^2/dp^2 [1/(p − 4)]
                      = d/dp [−1/(p − 4)^2]
                      = 2/(p − 4)^3.
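As a quick illustrative cross-check of the multiplication-by-t^n property (a sketch assuming SymPy is available; it is not part of the original text), one can compare the derivative formula with a direct computation of the transform:

    import sympy as sp

    t, p = sp.symbols('t p', positive=True)

    F = 1/(p - 4)                         # F(p) = L{e^{4t}}
    print(sp.simplify(sp.diff(F, p, 2)))  # (-1)^2 F''(p) = 2/(p - 4)**3
    # direct transform of t^2 e^{4t} for comparison (valid for Re(p) > 4)
    print(sp.laplace_transform(t**2*sp.exp(4*t), t, p, noconds=True))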

3.1.4.8. Division by t.
Theorem 3.1.17. If L{f(t)} = F(p) then

        L{f(t)/t} = ∫_p^∞ F(u) du,

provided that
        lim_{t→0} f(t)/t
exists.

Example 3.1.18. Since


        L{sin 3t} = 3/(p^2 + 9)

and

        lim_{t→0} (sin 3t)/t = 3,

we have

        L{(sin 3t)/t} = ∫_p^∞ 3/(u^2 + 9) du
                      = lim_{b→∞} ∫_p^b 3/(u^2 + 9) du
                      = lim_{b→∞} [tan^{−1}(u/3)]_p^b
                      = lim_{b→∞} [tan^{−1}(b/3) − tan^{−1}(p/3)]
                      = π/2 − tan^{−1}(p/3).

(Here we made use of the fact that lim_{t→∞} tan^{−1}(t) = π/2.) Now let y = tan^{−1}(p/3). Since tan(y) = p/3 we will have cot(y) = 3/p. But then tan(π/2 − y) = cot(y) = 3/p. This means that π/2 − y = tan^{−1}(3/p); in other words

        L{(sin 3t)/t} = π/2 − tan^{−1}(p/3) = tan^{−1}(3/p).

3.1.4.9. Behaviour of L{f(t)} = F(p) as p → ∞.


Theorem 3.1.19. If f is a piecewise regular function of exponential order with
L {f (t)} = F (p), then
        lim_{p→∞} F(p) = 0.

3.1.4.10. Initial value theorem. The initial and final value theorems are two
useful theorems that enable us to predict the behaviour of f as t → 0 and t → ∞,
from the behaviour of L{f (t)} = F (p).
Theorem 3.1.20. If f and f 0 are both piecewise regular and of exponential
order (so both have Laplace Transforms), then
        lim_{t→0+} f(t) = lim_{p→∞} p F(p).

It is important to note that the initial–value theorem does not give the initial
value f (0− ) used when determining the Laplace transform, but rather gives the
value of f (0+ ) = limt→0+ f (t). This distinction is highlighted in example 3.3.3. In
addition this theorem holds even when f is not causal! We will outline the proof
for this very general case.

Outline of proof: Suppose that f and f 0 are not necessarily causal functions
which satisfy the hypothesis of the theorem. For any ε > 0 we have that

        ∫_{−ε}^∞ e^{−pt} f′(t) dt = lim_{b→∞} [e^{−pt} f(t)]_{−ε}^b − (−p) ∫_{−ε}^∞ e^{−pt} f(t) dt
                                  = lim_{b→∞} [e^{−pb} f(b) − e^{pε} f(−ε)] + p ∫_{−ε}^∞ e^{−pt} f(t) dt
                                  = −e^{pε} f(−ε) + p ∫_{−ε}^∞ e^{−pt} f(t) dt.

Letting ε → 0+ then gives

        L{f′(t)} = p F(p) − f(0−).


We may write

        ∫_{0−}^∞ e^{−pt} f′(t) dt = ∫_{0−}^{0+} e^{−pt} f′(t) dt + ∫_{0+}^∞ e^{−pt} f′(t) dt,

where

        ∫_{0+}^∞ e^{−pt} f′(t) dt = lim_{ε→0+} ∫_ε^∞ e^{−pt} f′(t) dt

and

        ∫_{0−}^{0+} e^{−pt} f′(t) dt = lim_{ε→0+} ∫_{−ε}^ε e^{−pt} f′(t) dt.

Since f′(t) is of exponential order, we have that

        lim_{p→∞} ∫_{0+}^∞ e^{−pt} f′(t) dt = 0.

(This can be proved by using the inequalities at the end of subsection 3.1.3.) The final step in the proof is to show that

        ∫_{0−}^{0+} e^{−pt} f′(t) dt = lim_{ε→0+} ∫_{−ε}^ε e^{−pt} f′(t) dt → f(0+) − f(0−).

We will not work out the details of this step, but merely note that it then follows from the above formulas that

        f(0+) − f(0−) = lim_{p→∞} ∫_{0−}^∞ e^{−pt} f′(t) dt = lim_{p→∞} [p F(p) − f(0−)],

or in other words that

        f(0+) = lim_{p→∞} p F(p). □


There is an interesting but simple extension of this result which we now for-
mulate.
Corollary 3.1.21. Suppose that the initial value theorem holds for both f and g, and that lim_{t→0+} g(t) ≠ 0. If lim_{t→0+} f(t)/g(t) = 1, then lim_{p→∞} F(p)/G(p) = 1.

To prove this corollary one merely needs to use the initial value theorem to note that lim_{t→0+} f(t)/g(t) = lim_{p→∞} pF(p)/(pG(p)) = lim_{p→∞} F(p)/G(p). What this corollary essentially tells us is that under suitable conditions F(p) will be close to G(p) as p → ∞, whenever f(t) is close to g(t) as t → 0+.
3.1.4.11. Final value theorem.
Theorem 3.1.22. If f and f′ are both piecewise regular and of exponential order, and if the abscissa of convergence of f′ satisfies σc < 0, then

        lim_{t→∞} f(t) = lim_{p→0} p F(p)

whenever both these limits exist.


Outline of proof: For the sake of simplicity we prove only the case where both f and f′ are continuous. In this case we may apply property 3.1.12 to conclude that

        L{f′(t)} = ∫_0^∞ e^{−pt} f′(t) dt = p F(p) − f(0).

Taking limits and using the fact that f′ has negative abscissa of convergence, we have that

        lim_{p→0} [p F(p) − f(0)] = lim_{p→0} ∫_0^∞ e^{−pt} f′(t) dt
                                  = ∫_0^∞ lim_{p→0} e^{−pt} f′(t) dt
                                  = ∫_0^∞ f′(t) dt
                                  = lim_{b→∞} [f(t)]_0^b
                                  = lim_{t→∞} f(t) − f(0).

Thus
        lim_{t→∞} f(t) = lim_{p→0} p F(p)
as required. □
Remark 3.1.23. The restriction that lim_{t→∞} f(t) must exist means that the theorem does not apply to functions such as e^t, which tends to infinity as t → ∞, or sin(ωt), whose limit is undefined. It is important that the theorem be used with caution, since the existence of lim_{p→0} p F(p) does not imply the existence of lim_{t→∞} f(t). In practical applications it will often be L{f(t)} = F(p), rather than f, that is known. Thus it will be useful to have conditions for the use of this theorem that depend on F(p), and not on f. Fortunately such conditions do exist. It is possible to show that this theorem may be applied whenever there are no complex numbers p0 with nonnegative real part at which pL{f(t)} = pF(p) is unbounded. Thus if all the singularities of pF(p) are to the left of the imaginary axis, we may safely use the theorem. Otherwise not.
As was the case with the initial value theorem, the final value theorem too has
an interesting corollary. In this case we end up with conditions which ensure that
if f (t) is close to g(t) as t → ∞, then F (p) will be close to G(p) as p → 0+ .

Corollary 3.1.24. Suppose that the final value theorem holds for both f and g, and that lim_{t→∞} g(t) ≠ 0. If lim_{t→∞} f(t)/g(t) = 1, then lim_{p→0} F(p)/G(p) = 1.
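To get a feel for the two value theorems, it helps to test them on a concrete function. The following illustrative sketch (assuming SymPy is available; the choice f(t) = 1 + 3e^{−t} sin 2t is ours, borrowed from exercise 12 below) confirms that pF(p) has the right limits at both ends:

    import sympy as sp

    t, p = sp.symbols('t p', positive=True)

    f = 1 + 3*sp.exp(-t)*sp.sin(2*t)
    F = sp.laplace_transform(f, t, p, noconds=True)

    print(sp.limit(p*F, p, sp.oo))  # 1 = lim_{t->0+} f(t)  (initial value theorem)
    print(sp.limit(p*F, p, 0))      # 1 = lim_{t->oo} f(t)  (final value theorem)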

Example 3.1.25. Determine the Laplace Transform of e−kt sin at.


Solution: We know that

        L{sin at} = a/(p^2 + a^2),    ℜ(p) > 0.

On applying the first shifting property to this formula, it follows that

        L{e^{−kt} sin at} = a/((p − (−k))^2 + a^2)
                          = a/((p + k)^2 + a^2),    ℜ(p) > −k.

Example 3.1.26. Determine L {t cosh 2t} .
Solution: Given that

        L{cosh 2t} = p/(p^2 − 4),

we have that

        L{t cosh 2t} = (−1) d/dp [p/(p^2 − 4)]
                     = (−1) [(p^2 − 4) − p(2p)]/(p^2 − 4)^2
                     = (p^2 + 4)/(p^2 − 4)^2.

Example 3.1.27. Determine

        L{∫_0^t (u^3 + sin 2u) du}.

Solution: We know that

        L{t^3 + sin 2t} = L{t^3} + L{sin 2t} = 3!/p^4 + 2/(p^2 + 4).

Therefore

        L{∫_0^t (u^3 + sin 2u) du} = (1/p) [3!/p^4 + 2/(p^2 + 4)]
                                   = 6/p^5 + 2/(p(p^2 + 4)).


Example 3.1.28. Given that

        L{(sin t)/t} = tan^{−1}(1/p),

find
        L{(sin at)/t}
using the change of scale property.

Solution: If L{f(t)} = F(p) then

        L{f(at)} = (1/a) F(p/a).

Therefore

        L{(sin at)/(at)} = (1/a) tan^{−1}(a/p).

So

        L{(sin at)/t} = tan^{−1}(a/p).

Example 3.1.29. (a) Suppose that L{f(t)} = F(p) exists for all p ≥ 0. Prove that then

        ∫_0^∞ f(t)/t dt = ∫_0^∞ F(u) du.

(b) Given that the integral converges, use the result of part (a) to determine ∫_0^∞ (sin t)/t dt.

Solution:
(a) The division by t property states that if L{f(t)} = F(p) then

        L{f(t)/t} = ∫_p^∞ F(u) du.

From the definition of a Laplace transform it follows that

        L{f(t)/t} = ∫_0^∞ e^{−pt} f(t)/t dt.

Consequently

        ∫_0^∞ e^{−pt} f(t)/t dt = ∫_p^∞ F(u) du.

Letting p → 0+ now yields

        ∫_0^∞ f(t)/t dt = ∫_0^∞ F(u) du

as required.

(b) If f(t) = sin t, then

        F(p) = L{sin t} = 1/(p^2 + 1).

Thus by part (a),

        ∫_0^∞ (sin t)/t dt = ∫_0^∞ 1/(u^2 + 1) du
                           = lim_{b→∞} ∫_0^b 1/(u^2 + 1) du
                           = lim_{b→∞} [tan^{−1} u]_0^b
                           = π/2.

(Here we used the fact that lim_{b→∞} tan^{−1} b = π/2.)


Example 3.1.30. Evaluate

(a) ∫_0^∞ t e^{−2t} cos t dt;
(b) ∫_0^∞ (e^{−t} − e^{−3t})/t dt.

Solution:
(a) Observe that

        ∫_0^∞ t e^{−pt} cos t dt = L{t cos t}
                                 = (−1) d/dp L{cos t}
                                 = −d/dp [p/(p^2 + 1)]
                                 = (p^2 − 1)/(p^2 + 1)^2.

If now we set p = 2, it follows that

        ∫_0^∞ t e^{−2t} cos t dt = 3/25.

(b) If we set f(t) = e^{−t} − e^{−3t}, then

        F(p) = L{e^{−t} − e^{−3t}} = 1/(p + 1) − 1/(p + 3).

For this specific function f, the identity

        L{f(t)/t} = ∫_p^∞ F(u) du

then becomes

        ∫_0^∞ e^{−pt} (e^{−t} − e^{−3t})/t dt = ∫_p^∞ [1/(u + 1) − 1/(u + 3)] du
                                              = lim_{b→∞} [ln(u + 1) − ln(u + 3)]_p^b
                                              = lim_{b→∞} [ln((u + 1)/(u + 3))]_p^b
                                              = ln 1 − ln((p + 1)/(p + 3))
                                              = ln((p + 3)/(p + 1)).

On letting p → 0+, it therefore follows that

        ∫_0^∞ (e^{−t} − e^{−3t})/t dt = ln 3.


Example 3.1.31. Find L{f(t)} if

        f(t) = { 4t   0 ≤ t < 6
               { 3    6 ≤ t

Solution: By using

        H(t − 6) = { 0   t < 6
                   { 1   6 ≤ t,

f(t) may be written as

        f(t) = 4t − 4t H(t − 6) + 3 H(t − 6)
             = 4t − 4(t − 6) H(t − 6) − 21 H(t − 6).

Using linearity and the second shifting property, we conclude that

        L{f(t)} = 4 L{t} − 4 e^{−6p} L{t} − 21 e^{−6p} L{1} = 4/p^2 − 4 e^{−6p}/p^2 − 21 e^{−6p}/p.

3.1.4.12. The convolution theorem for Laplace transforms. The convolution of two piecewise regular functions f(t) and g(t) on ℝ, denoted by f ∗ g(t), is defined to be

        f ∗ g(t) = ∫_{−∞}^∞ f(φ) g(t − φ) dφ.

In the particular case where f(t) and g(t) are both causal functions, we will of course have that f(φ) = g(φ) = 0 for φ < 0. We will then also have that g(t − φ) = 0 for t − φ < 0. Thus in this case g(t − φ) = 0 on the interval (t, ∞), and f(φ) = 0 on (−∞, 0). The integral formula for convolution will in this case therefore simplify to

(3.1.1)        f ∗ g(t) = ∫_0^t f(φ) g(t − φ) dφ.

Convolution turns out to be commutative; that is in general

        f ∗ g(t) = g ∗ f(t).

For the case of causal functions this means that

(3.1.2)        ∫_0^t f(φ) g(t − φ) dφ = ∫_0^t f(t − φ) g(φ) dφ.

It is not too difficult to prove this. Specifically if we set φ1 = t − φ, then of course −dφ1 = dφ, with φ1 = t when φ = 0 and φ1 = 0 when φ = t. Thus

        f ∗ g(t) = ∫_0^t f(φ) g(t − φ) dφ
                 = ∫_t^0 f(t − φ1) g(φ1) (−dφ1)
                 = ∫_0^t f(t − φ1) g(φ1) dφ1
                 = g ∗ f(t).
Theorem 3.1.32 (Convolution theorem for Laplace Transforms). Let f(t) and g(t) be piecewise-regular and of exponential order, with Laplace transforms F(p) and G(p) respectively. Then

        L{f ∗ g(t)} = F(p) G(p);

that is

        L{∫_0^t f(φ) g(t − φ) dφ} = F(p) G(p).

Proof. From the definition of the Laplace transform we have that

        F(p) G(p) = ∫_0^∞ e^{−px} f(x) dx · ∫_0^∞ e^{−py} g(y) dy.

Since x and y are independent variables we can rewrite this as

        F(p) G(p) = ∫_0^∞ ∫_0^∞ e^{−p(x+y)} f(x) g(y) dx dy.

(Here we made use of the fact that e^{−p(x+y)} = e^{−px} e^{−py}.) Since 0 ≤ x < ∞ and 0 ≤ y < ∞, we can write this repeated integral as a double integral over the first quadrant R in the (x, y)–plane. The equation then becomes

        F(p) G(p) = ∫∫_R e^{−p(x+y)} f(x) g(y) dx dy.

Now change variables by setting x + y = t and y = φ. Notice that the possible values of t = x + y range from 0 to ∞, and that t = x + y ≥ y = φ. Thus in changing from (x, y) to (t, φ), the region R = {(x, y) | 0 ≤ x < ∞, 0 ≤ y < ∞} gets mapped onto R1 = {(t, φ) | 0 ≤ t < ∞, 0 ≤ φ ≤ t}.

Thus

        F(p) G(p) = ∫∫_R e^{−p(x+y)} f(x) g(y) dx dy
                  = ∫∫_{R1} e^{−pt} f(t − φ) g(φ) dφ dt
                  = ∫_0^∞ ∫_0^t e^{−pt} f(t − φ) g(φ) dφ dt
                  = ∫_0^∞ e^{−pt} [∫_0^t f(t − φ) g(φ) dφ] dt
                  = ∫_0^∞ e^{−pt} (g ∗ f(t)) dt
                  = ∫_0^∞ e^{−pt} (f ∗ g(t)) dt
                  = L{f ∗ g(t)}. □


3.1.5. THE INVERSE LAPLACE TRANSFORM. Given a complex


function F (p), if there exists a causal function f whose Laplace transform is F (p),
we define the inverse Laplace transform of F (p) (denoted by L−1 {F (p)}) to be the
function f (t). That is

L−1 {F (p)} = f (t) ⇔ L {f (t)} = F (p)

For example since

        L{sinh t} = 1/(p^2 − 1),

we have that

        L^{−1}{1/(p^2 − 1)} = sinh t.
In the above definition we are implicitly assuming that for any F (p) there will
be only one f (t) whose Laplace transform is F (p). This is not strictly true. If
two piecewise regular functions of exponential order f and g are identical, except
at a finite number of isolated points, we will still have that L{f (t)} = L{g(t)}.
Hence it is possible for two slightly different functions to have the same transform.
However whenever two piecewise regular functions f and g do have the same Laplace
transform, they can only differ at the discontinuities. Hence they are not really all
that different from each other. It is for this reason that we can make sense of the
definition given above.
When computing inverse transforms of rational functions in practice, we need
to first use partial fractions to break the rational function up into a sum of simpler
terms, then write each term in a form resembling the transform of some function,
and finally refer to a table of Laplace transforms to compute the inverse transform
of each term. Let's see how this works.

Example 3.1.33. Find


 
        L^{−1}{1/((p + 1)(p + 2)(p^2 + 2p + 10))}

Solution: Let

        1/((p + 1)(p + 2)(p^2 + 2p + 10)) = A/(p + 1) + B/(p + 2) + (Cp + D)/(p^2 + 2p + 10).

Then

        1 = A(p + 2)(p^2 + 2p + 10) + B(p + 1)(p^2 + 2p + 10) + (Cp + D)(p + 1)(p + 2).

Substituting different values of p we see that

        1 = 9A                 when p = −1
        1 = −10B               when p = −2
        1 = 20A + 10B + 2D     when p = 0.

On comparing the coefficients of the p^3 term on the left and right hand side of the equation we also get

        0 = A + B + C.

From the first two equations it is clear that A = 1/9 and B = −1/10. But then D = ½(1 − 20/9 + 1) = −1/9 and C = −A − B = −1/90. Thus

        1/((p + 1)(p + 2)(p^2 + 2p + 10))
                = (1/9) · 1/(p + 1) − (1/10) · 1/(p + 2) − (1/90) · (p + 10)/(p^2 + 2p + 10)
                = (1/9) · 1/(p + 1) − (1/10) · 1/(p + 2) − (1/90) · [(p + 1) + 9]/((p + 1)^2 + 9)
                = (1/9) · 1/(p + 1) − (1/10) · 1/(p + 2) − (1/90) · (p + 1)/((p + 1)^2 + 3^2)
                  − (1/30) · 3/((p + 1)^2 + 3^2).

Therefore

        L^{−1}{1/((p + 1)(p + 2)(p^2 + 2p + 10))}
                = (1/9) L^{−1}{1/(p + 1)} − (1/10) L^{−1}{1/(p + 2)}
                  − (1/90) L^{−1}{(p + 1)/((p + 1)^2 + 3^2)} − (1/30) L^{−1}{3/((p + 1)^2 + 3^2)}
                = (1/9) e^{−t} − (1/10) e^{−2t} − (1/90) e^{−t} cos 3t − (1/30) e^{−t} sin 3t.

Another very useful technique when inverting the Laplace transform is con-
volution. If say we want to compute the inverse transform of F (p)G(p) where
L{f (t)} = F (p) and L{g(t)} = G(p), then the convolution theorem tells us that

L−1 {F (p) G (p)} = f ∗ g (t)

The following example demonstrates how this technique is applied in practice.

Example 3.1.34. Determine


        L^{−1}{1/((p − 2)^2 p^2)}

Solution: We could try to solve this problem by using partial fractions, but let's see how to do it with convolution. Since L{t} = 1/p^2 and L{t e^{2t}} = −d/dp [1/(p − 2)] = 1/(p − 2)^2, we have that

        L^{−1}{1/((p − 2)^2 p^2)} = f ∗ g(t)

where f(t) = t e^{2t} and g(t) = t. Therefore with f and g as before

        L^{−1}{1/((p − 2)^2 p^2)} = f ∗ g(t)
                = ∫_0^t f(φ) g(t − φ) dφ
                = ∫_0^t φ e^{2φ} (t − φ) dφ
                = ∫_0^t e^{2φ} (tφ − φ^2) dφ
                = [½ e^{2φ} (tφ − φ^2)]_0^t − ½ ∫_0^t e^{2φ} (t − 2φ) dφ
                = 0 − ½ ([½ e^{2φ} (t − 2φ)]_0^t + ∫_0^t e^{2φ} dφ)
                = −½ [½ e^{2t}(−t) − ½ t] − ¼ (e^{2t} − 1)
                = ¼ (t e^{2t} + t) − ¼ (e^{2t} − 1)
                = ¼ [e^{2t}(t − 1) + t + 1].
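The convolution integral above can be verified symbolically. A minimal sketch, assuming SymPy is available:

    import sympy as sp

    t, phi = sp.symbols('t phi', positive=True)

    conv = sp.integrate(phi*sp.exp(2*phi)*(t - phi), (phi, 0, t))
    answer = (sp.exp(2*t)*(t - 1) + t + 1)/4
    print(sp.simplify(conv - answer))  # 0, confirming the computation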


EXERCISE 3.1.
1. Use the definition of the Laplace transform to determine the transform of each of the following functions:
   (a) cosh 4t,
   (b) t^2,
   (d) t e^{−2t}.

2. What are the abscissas of convergence for the following functions?
   (a) e^{−4t},
   (b) sin 4t,
   (c) t^5,
   (d) 3e^{2t} − 2e^{−2t} + sin 2t.

3. Determine the Laplace transforms of the following functions, each time stating their regions of convergence:
   (a) 7 − 4t,
   (b) t^2 sin 3t,
   (c) (t^2 + 1)^2,
   (d) (sin t − cos t)^2.

4. Find L{f(t)} if
   (a) f(t) = { 0   0 < t < 2
              { 4   t ≥ 2,
   (b) f(t) = { 2t  0 ≤ t ≤ 5
              { 1   t > 5.

5. Prove that
        L{t^n} = n!/p^{n+1}.

6. Investigate the existence of the Laplace transform for each of the following functions:
   (a) 1/(t + 1),
   (b) e^{t^2 − t},
   (c) cos t^2.

7. Find L{e^{−t} sin^2 t}.




8. Consider the function

        f(t) = { 2t   0 ≤ t ≤ 1
               { t    t ≥ 1.

   (a) For f as above, compute each of the following Laplace transforms:
       (i) L{f′(t)},
       (ii) L{f″(t)}.
   (b) Does the formula L{f′(t)} = p L{f(t)} − f(0) hold in this case? Clearly justify your assertions.

9. Verify directly that

        L{∫_0^t (u^2 − u + e^{−u}) du} = (1/p) L{t^2 − t + e^{−t}}.

10. Show that
    (a) L{t^2 sin t} = (6p^2 − 2)/(p^2 + 1)^3,
    (b) ∫_0^∞ t e^{−3t} sin t dt = 3/50.
11. Verify the initial–value theorem for the functions
    (a) 2 − 3 cos t,
    (b) (3t − 1)^2.

12. Verify the final–value theorem for the functions
    (a) 1 + 3e^{−t} sin 2t,
    (b) t^2 e^{−2t}.

13. Use the initial– and final–value theorems to find the jump at t = 0 and the limiting value as t → ∞ for the solution of the initial–value problem

        7 dy/dt + 5y = 4 + e^{−3t} + 2δ(t)    with y(0) = −1.

    [2/7, 4/5]
14. Evaluate each of the following integrals:
    (a) ∫_0^∞ t^3 e^{−t} sin t dt,
    (b) ∫_0^∞ e^{−3t} (sin t)/t dt.

15. Determine the inverse Laplace transforms of each of the following functions:
    (a) (p + 4)/(p^2 (p^2 + 45p + 8)),
    (b) p/(p^2 + a^2)^2,
    (c) ln(1 + 1/p^2),
    (d) (1/p) ln(1 + 1/p^2).

16. Determine
        L{∫_0^t (u^2 + sin 2u) du}.

17. (a) Determine
            L^{−1}{1/(p^2 (p + 1)^2)}
        by using the convolution theorem.
    (b) Use the convolution theorem to solve the equation

            y(t) = t + 2 ∫_0^t y(u) cos(t − u) du.

UNIT 2: THE LAPLACE METHOD FOR SOLVING DIFFERENTIAL EQUATIONS
3.2.1. OBJECTIVE. To show how Laplace transforms may be used to solve
systems of ordinary linear constant-coefficient differential equations.
3.2.2. OUTCOMES. At the end of this unit the student should
• Understand and be able to apply the Laplace method for solving linear
constant coefficient differential equations as presented in MAT301W;
• Know how this same method can be used to solve sets of simultaneous
differential equations.
3.2.3. ORDINARY DIFFERENTIAL EQUATIONS. Laplace
transforms may be used to solve any order of ordinary linear constant-coefficient
DE’s. Given such a DE, the basic idea is to use Laplace transforms to convert the
DE into an algebraic equation, from which the transform of the dependent variable
can then be found by algebraic methods. Once this transform is known, the DE can
be solved by computing the inverse transform. We illustrate the general method
for a second order DE. Suppose we want to solve a DE of the form

        a y″(t) + b y′(t) + c y(t) = f(t),

with initial conditions

        y(0) = k,    y′(0) = m.

On taking Laplace transforms we get

        a [p^2 Y(p) − p y(0) − y′(0)] + b [p Y(p) − y(0)] + c Y(p) = F(p),

where L{y(t)} = Y(p) and L{f(t)} = F(p). After substituting the initial conditions, and gathering all terms containing the factor Y(p) on the left, the equation becomes

        Y(p) (a p^2 + b p + c) = F(p) + a p k + a m + b k.

Thus

        L{y(t)} = Y(p) = (F(p) + a p k + b k + a m)/(a p^2 + b p + c),

and hence

        y = L^{−1}{(F(p) + a p k + b k + a m)/(a p^2 + b p + c)}.
Example 3.2.1. Solve the DE y″ + 5y′ + 6y = 2e^{−t}, (t ≥ 0), subject to the initial conditions y(0) = 1 and y′(0) = 0.

Solution: On taking Laplace transforms we get

        [p^2 Y(p) − p y(0) − y′(0)] + 5 [p Y(p) − y(0)] + 6 Y(p) = 2/(p + 1).

After substituting the initial conditions, this becomes

        Y(p) (p^2 + 5p + 6) = 2/(p + 1) + p + 5.

Thus

        Y(p) = [2/(p + 1) + p + 5]/(p^2 + 5p + 6)
             = [2 + (p + 1)(p + 5)]/[(p + 1)(p^2 + 5p + 6)]
             = (p^2 + 6p + 7)/[(p + 1)(p + 2)(p + 3)].

To find y(t) we must now compute the inverse transform of this rational function. If

        (p^2 + 6p + 7)/[(p + 1)(p + 2)(p + 3)] = A/(p + 1) + B/(p + 2) + C/(p + 3),

then

        p^2 + 6p + 7 = A(p + 2)(p + 3) + B(p + 1)(p + 3) + C(p + 1)(p + 2).

We get

        2 = 2A      when p = −1
        −1 = −B     when p = −2
        −2 = 2C     when p = −3.

So A = 1, B = 1 and C = −1. In other words

        Y(p) = 1/(p + 1) + 1/(p + 2) − 1/(p + 3),

thus

        y(t) = L^{−1}{1/(p + 1) + 1/(p + 2) − 1/(p + 3)} = e^{−t} + e^{−2t} − e^{−3t}.
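The whole procedure – transform, solve algebraically for Y(p), invert – can be mirrored step by step in a computer algebra system. An illustrative sketch of this example, assuming SymPy is available:

    import sympy as sp

    t = sp.symbols('t', positive=True)
    p, Y = sp.symbols('p Y')

    # the transformed equation with y(0) = 1 and y'(0) = 0 already substituted
    eq = sp.Eq((p**2*Y - p) + 5*(p*Y - 1) + 6*Y, 2/(p + 1))
    Ysol = sp.solve(eq, Y)[0]
    print(sp.apart(Ysol, p))                         # 1/(p+1) + 1/(p+2) - 1/(p+3)
    print(sp.inverse_laplace_transform(Ysol, p, t))  # (e^-t + e^-2t - e^-3t)*Heaviside(t)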

Example 3.2.2. Solve the DE x‴ + 5x″ + 17x′ + 13x = 1 (t ≥ 0) subject to the initial conditions x(0) = x′(0) = 1 and x″(0) = 0.

Solution: Taking Laplace transforms yields

        [p^3 X(p) − p^2 x(0) − p x′(0) − x″(0)] + 5 [p^2 X(p) − p x(0) − x′(0)]
                + 17 [p X(p) − x(0)] + 13 X(p) = 1/p.

Substituting the initial conditions will then give

        X(p) (p^3 + 5p^2 + 17p + 13) = 1/p + p^2 + p + 5p + 5 + 17 = (p^3 + 6p^2 + 22p + 1)/p,

whence

        X(p) = (p^3 + 6p^2 + 22p + 1)/[p (p^3 + 5p^2 + 17p + 13)].

Substituting p = −1 into p^3 + 5p^2 + 17p + 13 gives −1 + 5 − 17 + 13 = 0. Thus p + 1 must be a factor of p^3 + 5p^2 + 17p + 13. Long division by p + 1 then yields the remaining factor:

        p^3 + 5p^2 + 17p + 13 = (p + 1)(p^2 + 4p + 13).

Therefore

        X(p) = (p^3 + 6p^2 + 22p + 1)/[p(p + 1)(p^2 + 4p + 13)].

If we set

        (p^3 + 6p^2 + 22p + 1)/[p(p + 1)(p^2 + 4p + 13)] = A/p + B/(p + 1) + (Cp + D)/(p^2 + 4p + 13),

then

        A(p + 1)(p^2 + 4p + 13) + Bp(p^2 + 4p + 13) + (Cp + D) p (p + 1) = p^3 + 6p^2 + 22p + 1.

We get

        1 = 13A       when p = 0
        −16 = −10B    when p = −1.

From comparing the coefficients on the left and the right of the p^2 and p^3 terms, we further have that

        6 = 5A + 4B + C + D
        1 = A + B + C.

But then A = 1/13, B = 8/5, C = 1 − 1/13 − 8/5 = −44/65, and D = 6 − 5/13 − 32/5 + 44/65 = −7/65. Hence

        X(p) = (1/13) · 1/p + (8/5) · 1/(p + 1) − (1/65) · (44p + 7)/(p^2 + 4p + 13)
             = (1/13) · 1/p + (8/5) · 1/(p + 1) − (1/65) · [44(p + 2) − 27(3)]/((p + 2)^2 + 3^2)
             = (1/13) · 1/p + (8/5) · 1/(p + 1) − (44/65) · (p + 2)/((p + 2)^2 + 3^2)
               + (27/65) · 3/((p + 2)^2 + 3^2).

Therefore

        x(t) = 1/13 + (8/5) e^{−t} − (1/65) e^{−2t} [44 cos 3t − 27 sin 3t].

Example 3.2.3. A system is characterised by the differential equation x″ + 5x′ + 6x = u + 3u′. Determine the response of the system if the forcing function u(t) = e^{−t} is applied at time t = 0, given that the system was initially in a quiescent state.

Solution: Since initial values are zero, taking Laplace transforms will yield

        X(p) (p^2 + 5p + 6) = U(p) (1 + 3p),

whence

        X(p) = (3p + 1)/(p^2 + 5p + 6) · U(p).

The input at time t = 0 is u(t) = e^{−t}, and hence

        U(p) = 1/(p + 1).

Therefore

        X(p) = (3p + 1)/[(p + 2)(p + 3)(p + 1)].

Using the theory of partial fractions we may write this as

        X(p) = 5/(p + 2) − 4/(p + 3) − 1/(p + 1).

After taking inverse transforms, we therefore get that

        x(t) = 5e^{−2t} − 4e^{−3t} − e^{−t}.

Warning: One might be tempted to substitute u(t) = e^{−t} into x″ + 5x′ + 6x = u + 3u′, giving x″ + 5x′ + 6x = e^{−t} − 3e^{−t} = −2e^{−t}. The solution of this last differential equation is x(t) = 2e^{−2t} − e^{−3t} − e^{−t}, which differs from the previous solution. So why this apparent contradiction? The problem is that in the second approach we have ignored the important fact that we are dealing with causal functions! At time t = 0 the input is u(t) = e^{−t}, but before time t = 0, the

forcing function has not been applied yet, which means that effectively u(t) = 0 for
t ≤ 0. Thus we are really dealing with u (t) = e−t H (t) here, and not e−t . Setting
u(t) = e−t H(t) in x00 + 5x0 + 6x = u + 3u0 will lead to the answer obtained in the
example above, namely x (t) = 5e−2t − 4e−3t − e−t . You may verify this yourself.
From the above it is clear that if in practice we wish to obtain reliable solutions
of differential equations, it is very important to be clear about the nature of the
forcing function – whether it is causal or not.

3.2.4. SIMULTANEOUS DIFFERENTIAL EQUATIONS. Essentially


the same approach that was demonstrated in the previous section can be used to
solve systems of ordinary linear constant coefficient DE’s. The only difference with
what we did previously, is that after applying the Laplace transform, we will end up
with a whole set of simultaneous algebraic equations (rather than just one equation)
from which we will need to determine the transforms of the dependent variables.
Once these transforms have been found, then as before we can find the final solution
by simply computing the inverse transform. We pause to demonstrate the essential
technique by considering an example of a system of two simultaneous DE’s. For
more complicated systems involving several DE’s, it may be necessary to combine
the Laplace technique with matrix methods. We will however not deal with matrix
techniques just yet, but rather come back to this question at the point where the
state space approach is presented.
Example 3.2.4. Solve the simultaneous DE's

        dx/dt + dy/dt + 5x + 3y = e^{−t}
        2 dx/dt + dy/dt + x + y = 3

given that x(0) = 2 and y(0) = 1.

Solution: Upon taking Laplace transforms we get

        [p X(p) − x(0)] + [p Y(p) − y(0)] + 5 X(p) + 3 Y(p) = 1/(p + 1)
        2 [p X(p) − x(0)] + [p Y(p) − y(0)] + X(p) + Y(p) = 3/p.

After substituting the initial conditions, this becomes

        X(p) [p + 5] + Y(p) [p + 3] = 1/(p + 1) + 3
        X(p) [2p + 1] + Y(p) [p + 1] = 3/p + 5.

From this set of equations we need to compute both X(p) and Y(p). To do this we eliminate either X or Y. The remaining term can then be computed. Once this one is known, the other can be found from either of the above equations. For the sake of argument let's find X by eliminating Y. The first equation is multiplied by (p + 1) and the second by (p + 3) to give

        X(p) [p + 5] [p + 1] + Y(p) [p + 3] [p + 1] = 3p + 4
        X(p) [2p + 1] [p + 3] + Y(p) [p + 1] [p + 3] = 5p + 18 + 9/p.

Taking the difference of these two equations now leads to

        X(p) [(2p + 1)(p + 3) − (p + 5)(p + 1)] = (5p + 18 + 9/p) − (3p + 4).

Now simplify and solve for X(p) to get

        X(p) = (2p^2 + 14p + 9)/[p (p^2 + p − 2)] = (2p^2 + 14p + 9)/[p(p + 2)(p − 1)].

To find the inverse transform, we first use partial fractions to write the right hand side of this equation as a sum of simpler terms. Specifically we set

        (2p^2 + 14p + 9)/[p(p + 2)(p − 1)] = A/p + B/(p + 2) + C/(p − 1).

But then

        A(p + 2)(p − 1) + Bp(p − 1) + Cp(p + 2) = 2p^2 + 14p + 9.

Successively substituting the values p = 0, p = −2, and p = 1 now yields 9 = −2A, −11 = 6B and 25 = 3C. Thus A = −9/2, B = −11/6 and C = 25/3, or rather

        X(p) = −(9/2) · 1/p − (11/6) · 1/(p + 2) + (25/3) · 1/(p − 1).

Now take the inverse transform to get

        x(t) = L^{−1}{X(p)} = −9/2 − (11/6) e^{−2t} + (25/3) e^t.

To determine y we can either similarly eliminate X(p) to find Y(p), and then compute the inverse transform of Y(p) to find y(t), OR we can see if from the original DE's we can find a formula for y in terms of x and dx/dt, and then use x(t) to find y(t). Let's consider the original set of DE's:

        dx/dt + dy/dt + 5x + 3y = e^{−t}
        2 dx/dt + dy/dt + x + y = 3.

Subtracting the second DE from the first yields

        −dx/dt + 4x + 2y = e^{−t} − 3.

Therefore

        y = ½ (dx/dt − 4x + e^{−t} − 3).

Substituting the solution for x(t) now leads to

        y(t) = ½ [d/dt(−9/2 − (11/6)e^{−2t} + (25/3)e^t) − 4(−9/2 − (11/6)e^{−2t} + (25/3)e^t) + e^{−t} − 3]
             = 7,5 + 5,5e^{−2t} − 12,5e^t + 0,5e^{−t}.
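For simultaneous DE's the only new step is solving a small linear system for the transforms. A sketch of the algebra of this example, assuming SymPy is available:

    import sympy as sp

    t = sp.symbols('t', positive=True)
    p, X, Y = sp.symbols('p X Y')

    # the transformed equations with the initial values already substituted
    eq1 = sp.Eq((p + 5)*X + (p + 3)*Y, 1/(p + 1) + 3)
    eq2 = sp.Eq((2*p + 1)*X + (p + 1)*Y, 3/p + 5)
    sol = sp.solve([eq1, eq2], [X, Y])
    print(sp.apart(sol[X], p))  # -9/(2p) - 11/(6(p + 2)) + 25/(3(p - 1))
    print(sp.inverse_laplace_transform(sol[X], p, t))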


EXERCISE 3.2. Use Laplace transforms to solve the following differential


equations and systems of simultaneous differential equations.
1. dx/dt + 3x = e^{−2t};    x(0) = 2.

2. 3 dx/dt − 4x = sin 2t;    x(0) = 1/3.

3. d²y/dt² + 2 dy/dt + y = 4 cos 2t;    y(0) = 0; y′(0) = 2.

4. d³y/dt³ − 2 d²y/dt² − dy/dt + 2y = 2 + t;    y(0) = y″(0) = 0; y′(0) = 1.

5. dx/dt + dy/dt + y = t  and  dx/dt + 4 dy/dt + x = 1;    x(0) = 1; y(0) = 0.

6. d²x/dt² = y − 2x  and  d²y/dt² = x − 2y;
   x(0) = 4; y(0) = 2; x′(0) = y′(0) = 0.

UNIT 3: ENGINEERING APPLICATIONS


3.3.1. OBJECTIVE. To demonstrate how Laplace transforms may be used to solve differential equations arising from engineering problems.

3.3.2. OUTCOMES. At the end of this unit the student should have some
idea of the importance of this theory and insight into how this theory is applied to
engineering problems.

3.3.3. ELECTRICAL CIRCUITS. We pause to revise some of the basics


regarding electrical circuits as presented in section 2.1.4. These are needed to be
able to write down a differential equation describing a given RLC–circuit.
• The instantaneous electrical charge (or quantity of electricity) present in the capacitor at any time t is given by q(t) = ∫_0^t i(s) ds, or alternatively i = dq/dt.
• The voltage drop across a resistor with resistance R is iR.
• The voltage drop across an inductor with inductance L is L di/dt.
• The voltage drop across a capacitor with capacitance C is (1/C) ∫_0^t i(s) ds = q/C.

The analysis of electrical circuits is based on the very important laws of Kirchhoff.
These are essential tools in setting up a differential equation describing a given
RLC–circuit. The two laws of Kirchoff are as follows:
• Kirchhoff’s Current (First) Law – the algebraic sum of currents entering
any junction (node) is zero.
• Kirchhoff’s Voltage (Second) Law – the algebraic sum of the voltage drops
around any closed loop is zero; in other words the voltage impressed on
a closed loop is equal to the sum of the voltage drops in the rest of the
loop.
Now consider the circuit shown in the figure below.

Figure 3.1.

Using Kirchhoff's voltage (second) law for the given circuit, we have

        L di/dt + Ri + (1/C) ∫_0^t i ds = e(t).

On setting i = dq/dt, this can be written as

        L d²q/dt² + R dq/dt + q/C = e(t).

Example 3.3.1. Determine the charge q(t) on the capacitor and the resulting current i(t) in the circuit of figure 3.1 at time t, given that R = 160 Ω, L = 1 H, C = 10^{−4} F and e(t) = 20 V. Prior to closing the switch, the charge on the capacitor and the current in the circuit are zero.

Solution: Substituting the given values leads to

        d²q/dt² + 160 dq/dt + 10^4 q = 20.

Thus if L{q} = Q(p), then on taking Laplace transforms we get

        [p^2 Q − p q(0) − q′(0)] + 160 [pQ − q(0)] + 10^4 Q = 20/p.

By assumption the charge on the capacitor and the current in the circuit are zero at time t = 0. Thus q(0) = 0 and q′(0) = i(0) = 0. The equation therefore becomes

        Q (p^2 + 160p + 10^4) = 20/p.

Hence

        Q = 20/[p (p^2 + 160p + 10^4)].

If we set

        20/[p (p^2 + 160p + 10^4)] = A/p + (Bp + C)/(p^2 + 160p + 10^4),

then

        A (p^2 + 160p + 10^4) + (Bp + C) p = 20.

For p = 0 this yields 10^4 A = 20. Thus

        A = 1/500.

Comparing the coefficients of the p^2 and p terms leads to

        A + B = 0    and    160A + C = 0.

So

        B = −A = −1/500    and    C = −160A = −160/500 = −8/25.

Therefore

        Q = (1/500) [1/p − (p + 160)/(p^2 + 160p + 10^4)]
          = (1/500) [1/p − ((p + 80) + 80)/((p + 80)^2 + 60^2)]
          = (1/500) [1/p − (p + 80)/((p + 80)^2 + 60^2) − (4/3) · 60/((p + 80)^2 + 60^2)].

Taking inverse transforms, we see that

        q(t) = (1/500) [1 − e^{−80t} (cos 60t + (4/3) sin 60t)].

The resulting current in the circuit is then

        i = dq/dt
          = (1/500) [0 − (−80 e^{−80t})(cos 60t + (4/3) sin 60t) − e^{−80t}(−60 sin 60t + 80 cos 60t)]
          = (1/3) e^{−80t} sin 60t.
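Such circuit computations are easy to confirm independently. The following sketch (assuming SymPy is available) solves the same initial-value problem directly with dsolve and recovers the current found above:

    import sympy as sp

    t = sp.symbols('t', positive=True)
    q = sp.Function('q')

    ode = sp.Eq(q(t).diff(t, 2) + 160*q(t).diff(t) + 10**4*q(t), 20)
    sol = sp.dsolve(ode, q(t), ics={q(0): 0, q(t).diff(t).subs(t, 0): 0})
    print(sp.simplify(sol.rhs))          # the charge q(t) obtained above
    print(sp.simplify(sol.rhs.diff(t)))  # i(t) = exp(-80*t)*sin(60*t)/3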


Example 3.3.2. In the parallel network shown in figure 3.2 below, there is no current in any of the loops prior to time t = 0 when the switch is closed. Determine the currents i1 and i2 at time t.

Figure 3.2. (Source e(t) = 200 V; main branch with R1 = 20 Ω and L1 = 0,5 H carrying i = i1 + i2; parallel branches with R2 = 8 Ω carrying i1, and L2 = 1 H in series with R3 = 10 Ω carrying i2.)

Solution: By Kirchhoff's first law i = i1 + i2. For the two loops, it respectively follows from Kirchhoff's second law that

        L1 d(i1 + i2)/dt + R1 (i1 + i2) + R2 i1 = 200

and

        L2 di2/dt + R3 i2 − R2 i1 = 0.

On substituting the given values the first equation becomes

        0,5 (di1/dt + di2/dt) + 20 (i1 + i2) + 8 i1 = 200,

or rather

        di1/dt + di2/dt + 56 i1 + 40 i2 = 400,

and the second

        di2/dt + 10 i2 − 8 i1 = 0.

On taking Laplace transforms these equations become

        p I1 − i1(0) + p I2 − i2(0) + 56 I1 + 40 I2 = 400/p

and

        p I2 − i2(0) + 10 I2 − 8 I1 = 0.

Now insert the given initial values to get

        I1 (p + 56) + I2 (p + 40) = 400/p

and

        I1 (−8) + I2 (p + 10) = 0.

From the last equation we get I1 = I2 (p + 10)/8. Substitute this into the previous equation to get

        I2 (p + 56)(p + 10)/8 + I2 (p + 40) = 400/p.

This simplifies to

        I2 (p^2 + 74p + 880) = 3200/p,

which means that

        I2 = 3200/[p (p^2 + 74p + 880)].

Now set

        3200/[p (p^2 + 74p + 880)] = A/p + (Bp + C)/(p^2 + 74p + 880).

But for this to hold we must have A(p^2 + 74p + 880) + (Bp + C)p = 3200. Setting p = 0 yields 880A = 3200, whence A = 40/11. Comparing the coefficients of the p^2 and p terms leads to A + B = 0 and 74A + C = 0. Thus B = −A = −40/11 and C = −74A = −74 · 40/11. Therefore

        I2 = (40/11) [1/p − (p + 74)/(p^2 + 74p + 880)]
           = (40/11) [1/p − (p + 37)/((p + 37)^2 − 489) − (37/√489) · √489/((p + 37)^2 − 489)].

Now take inverse transforms to get

        i2(t) = (40/11) [1 − e^{−37t} (cosh √489 t + (37/√489) sinh √489 t)].

Substitute into di2/dt + 10 i2 − 8 i1 = 0 (that is, 8 i1 = di2/dt + 10 i2) to get

        8 i1 = (40/11) e^{−37t} [37 (cosh √489 t + (37/√489) sinh √489 t)
                                 − (√489 sinh √489 t + 37 cosh √489 t)]
               + (400/11) [1 − e^{−37t} (cosh √489 t + (37/√489) sinh √489 t)].

Thus

        i1(t) = (2550/(11√489)) e^{−37t} sinh √489 t + (50/11) [1 − e^{−37t} cosh √489 t].

Using these formulas it can now be shown that i1 → 50/11 A and i2 → 40/11 A as t → ∞.

Example 3.3.3. The circuit consists of a resistance R and a capacitor C connected in series together with a constant voltage source E, as shown in the figure below. Prior to closing the switch at time t = 0, the charge on the capacitor and the current in the circuit are zero. Determine the current in the circuit at any instant t after the switch is closed, and investigate the use of the initial–value theorem.

Figure 3.3

Solution: We have

        Ri + (1/C) ∫_0^t i dt = E.

Thus

        R I(p) + (1/C) I(p)/p = E/p,

whence

        I(p) = E/[p (R + 1/(Cp))] = (E/R)/(p + 1/(RC)).

Now take inverse transforms to get

        i(t) = (E/R) e^{−t/(RC)}.

It therefore clearly follows that

        i(t) = (E/R) e^{−t/(RC)} → E/R    as t → 0+.

This confirms the initial–value theorem, which predicts that

        lim_{t→0+} i(t) = lim_{p→∞} p I(p)
                        = lim_{p→∞} (p E/R)/(p + 1/(RC))
                        = lim_{p→∞} (E/R)/(1 + 1/(pRC))
                        = E/R.

Note that i(0+) = E/R is however not the same as the initial condition i(0) = 0! This means that there is a step change in i(t) at t = 0. □
3.3.4. MECHANICAL OSCILLATIONS. Mechanical translational sys-
tems involve three basic elements:
• Springs with a stiffness coefficient of k (N m−1 ).
• Masses with mass m kg attached to those springs.
• Dampers with a damping coefficient of β (N s m−1 ).
The associated variables are displacement from equilibrium x (t) m, and force
F (t) N. Suppose we have a single spring, with one end of the spring fixed and a
mass of mass m kg attached to the other end. If at time t the mass is displaced
by x(t) units from equilibrium, then assuming we are dealing with ideal springs
and dampers (i.e. they behave linearly), the relationship between the forces and
displacements at time t are:
• Total force exerted on the mass: F_m = ma, where a = x″. (Newton's Law)
• Force exerted by the spring: F_s = −kx (Hooke's Law)
• Damping force: F_d = −βx′.
The “minus” sign in the expression for the force exerted by the spring and the
damping force, is due to the fact that these forces are restoring forces acting to
restore the system to equilibrium.
But what if both ends of the spring are displaced? What do the formulas for the
forces look like then? If the mass-end of the spring is displaced from equilibrium by
x1 units and the opposite end of the spring by x2 units, then comparing the mass-
end of the spring to the opposite end, the mass-end is effectively being displaced
by x1 − x2 units from the other end. So on setting x = x1 − x2 and applying the
above laws, we may conclude that in this case:
• The force exerted on the mass by the spring is F_s = −k(x1 − x2) = k(x2 − x1).
• The damping force exerted on the mass is F_d = −β(x1′ − x2′) = β(x2′ − x1′).

Figure 3.4. Mass

Figure 3.5. Spring

Figure 3.6. Damper

Example 3.3.4.

Figure 3.7

The mass of the above system is subjected to an externally applied periodic force
F (t) = 4 sin ωt at time t = 0. For each of the two cases

(a) ω = 2
(b) ω = 5
determine the resulting displacement x(t) of the mass at time t, given that x(0) = x′(0) = 0.
What would happen to the response in the case ω = 5, if the damper was
missing?
Solution: As indicated in the figure, the forces acting on the mass m are the
applied forces F (t) and restoring forces F1 and F2 due to the spring and damper
respectively. Thus the total force being exerted on the mass is F + F1 + F2 . By
Newton's law

        m x″(t) = F(t) + F1(t) + F2(t).

Using the facts that m = 1, F1 = −kx(t) = −25x, F2 = −βx′ = −6x′ and F(t) = 4 sin ωt, the force equation becomes

        x″ + 6x′ + 25x = 4 sin ωt.
Taking Laplace transforms leads to

        [p^2 X − p x(0) − x′(0)] + 6 [pX − x(0)] + 25X = 4ω/(p^2 + ω^2).

Inserting the initial conditions then yields

        X (p^2 + 6p + 25) = 4ω/(p^2 + ω^2),

or rather

        X = 4ω/[(p^2 + 6p + 25)(p^2 + ω^2)].

(a) With ω = 2 we get

        X = 8/[(p^2 + 6p + 25)(p^2 + 4)].
If now we write

        8/[(p^2 + 6p + 25)(p^2 + 4)] = (Ap + B)/(p^2 + 6p + 25) + (Cp + D)/(p^2 + 4),

we must have

        (Ap + B)(p^2 + 4) + (Cp + D)(p^2 + 6p + 25) = 8.

Setting p = 0 and comparing the coefficients of the p^3, p^2 and p terms leads to the equations

        p = 0:   4B + 25D = 8          ...(i)
        p^3:     A + C = 0             ...(ii)
        p^2:     B + 6C + D = 0        ...(iii)
        p:       4A + 25C + 6D = 0     ...(iv)

From (i) it is clear that

        B = 2 − (25/4) D.              ...(v)

By (ii) we have C = −A. Substituting this into (iv) gives 21C + 6D = 0, or equivalently

        C = −(2/7) D.                  ...(vi)

If now we insert (v) and (vi) into (iii), we get (2 − (25/4)D) − (12/7)D + D = 0, which in turn simplifies to

        D = 56/195.

But then

        C = −(2/7) · 56/195 = −16/195,    A = 16/195    and    B = 2 − (25/4) · 56/195 = 40/195.

Thus

        X = (1/195) [(16p + 40)/(p^2 + 6p + 25) − (16p − 56)/(p^2 + 4)]
          = (1/195) [16 (p + 3)/((p + 3)^2 + 4^2) − 2 · 4/((p + 3)^2 + 4^2)
            − 16 p/(p^2 + 2^2) + 28 · 2/(p^2 + 2^2)].

Now take inverse transforms to get

        x(t) = (1/195) [e^{−3t} (16 cos 4t − 2 sin 4t) − 16 cos 2t + 28 sin 2t].

(b) If ω = 5 the Laplace transform X(p) = L{x(t)} is

        X = 20/[(p^2 + 6p + 25)(p^2 + 5^2)].

As before we set

        20/[(p^2 + 6p + 25)(p^2 + 5^2)] = (Ap + B)/(p^2 + 6p + 25) + (Cp + D)/(p^2 + 5^2),

and then solve for A, B, C and D, to get

        X = 20/[(p^2 + 6p + 25)(p^2 + 5^2)]
          = (1/15) [(2p + 12)/(p^2 + 6p + 25) − 2p/(p^2 + 5^2)]
          = (1/15) [2 (p + 3)/((p + 3)^2 + 4^2) + (3/2) · 4/((p + 3)^2 + 4^2) − 2 p/(p^2 + 5^2)].

Taking inverse transforms leads to

        x(t) = (1/15) [e^{−3t} (2 cos 4t + (3/2) sin 4t) − 2 cos 5t].

If the damping term F2 = −6x′ was missing, the system would've been described by the differential equation

        x″ + 25x = 4 sin ωt.

In this case we would've obtained the formula

        X = 4ω/[(p^2 + 25)(p^2 + ω^2)]

for the Laplace transform. Thus in the present case where ω = 5, we would have

        X = 20/[(p^2 + 25)(p^2 + 5^2)]
          = (4/5) [5/(p^2 + 5^2)]^2.

We may use the convolution theorem to compute the inverse transform. Since

        L{sin 5t} = 5/(p^2 + 5^2),

it follows from the convolution formula that

        X = (4/5) L{sin 5t ∗ sin 5t}

and hence that

        x(t) = (4/5) sin 5t ∗ sin 5t
             = (4/5) ∫_0^t sin 5(t − θ) sin 5θ dθ
             = (2/5) ∫_0^t [cos 5(t − 2θ) − cos 5t] dθ
             = (2/5) [−(1/10) sin(5t − 10θ) − θ cos 5t]_0^t
             = (2/5) [−(1/10)(sin(−5t) − sin 5t) − t cos 5t]
             = (2/25) [sin 5t − 5t cos 5t].

Notice that in this particular case the presence of the t cos 5t term will force x(t) to be unbounded as t → ∞. The reason why this happens is because without damping, the applied force F(t) = 4 sin 5t will be in resonance with the system (that is, with the vibrating mass). □
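It is easy to confirm that this resonant response really does satisfy the undamped equation with zero initial data. A minimal sketch, assuming SymPy is available:

    import sympy as sp

    t = sp.symbols('t')
    x = sp.Rational(2, 25)*(sp.sin(5*t) - 5*t*sp.cos(5*t))

    print(sp.simplify(x.diff(t, 2) + 25*x - 4*sp.sin(5*t)))  # 0: the DE holds
    print(x.subs(t, 0), sp.simplify(x.diff(t)).subs(t, 0))   # 0 0: x(0) = x'(0) = 0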

Figure 3.8.

Example 3.3.5. The system shown in figure 3.8 above consists of two masses
m1 = 1 kg and m2 = 2 kg, each attached to a separate fixed base by springs having
stiffness coefficients of k1 = 1 and k3 = 2 respectively, and attached to each other
by a third spring having a stiffness coefficient of k2 = 2. The system is released
from rest at time t = 0 from a position in which m1 is displaced 1 unit to the left
of its equilibrium position and m2 is displaced 2 units to the right of its equilibrium
position. Neglecting all frictional effects, determine the position of the masses at
time t.

Solution:

Figure 3.9.

Let x1 (t) and x2 (t) be the respective displacements of masses m1 and m2 to


the right of their equilibrium positions. By assumption x1 (0) = −1 and x2 (0) = 2.
(The reason we have x1 (0) = −1 is because x1 (t) represents right displacement of
m1 at time t, and at t = 0 the mass m1 has been displaced 1 unit to the left.)
Since at time t = 0 both m1 and m2 are released from a stationary position, we
have x1′(0) = 0 and x2′(0) = 0.
The force exerted by the left-hand spring on mass m1 is F1 = −k1 x1 . To
compute the force exerted by the central spring on m1 note that since in this case
we are specifically interested in m1 , m1 will here represent the mass-end and m2
the opposite end of the spring. So the force exerted on m1 by this spring must be
F2 = k2 (x2 − x1 ). Since frictional forces are neglected, the only forces acting on
the masses are the restoring forces due to the springs, as shown in figure 3.9. Thus
the total force on m1 is F1 + F2 . If we apply Newton’s law to the mass m1 , we get
        m1 x1″ = F2 + F1
               = k2 (x2 − x1) − k1 x1.

Coming to the mass m2: when considering the central spring, in this case m2 represents the mass-end of the spring and m1 the opposite end. So here the force exerted by the central spring on m2 will be k2 (x1 − x2) = −k2 (x2 − x1) = −F2. The force exerted by the right hand spring on m2 is F3 = −k3 x2, and the total force on m2 is therefore F3 − F2. Thus by Newton's law

        m2 x2″ = F3 − F2
               = −k3 x2 − k2 (x2 − x1).

Substituting the given values leads to

        x1″ = 2 (x2 − x1) − x1    and    2 x2″ = −2 x2 − 2 (x2 − x1).

In other words

        x1″ + 3 x1 − 2 x2 = 0    and    x2″ − x1 + 2 x2 = 0.

Taking Laplace transforms leads to

        [p^2 X1 − p x1(0) − x1′(0)] + 3 X1 − 2 X2 = 0
        [p^2 X2 − p x2(0) − x2′(0)] − X1 + 2 X2 = 0.

Substituting initial conditions then gives

        p^2 X1 + p + 3 X1 − 2 X2 = 0
        p^2 X2 − 2p − X1 + 2 X2 = 0.

Thus

        X1 (p^2 + 3) − 2 X2 = −p
        X2 (p^2 + 2) − X1 = 2p.

From the second of these equations we have

        X1 = X2 (p^2 + 2) − 2p.

If we substitute this into the first equation, we get

        [X2 (p^2 + 2) − 2p] (p^2 + 3) − 2 X2 = −p.

Therefore

        X2 [(p^2 + 2)(p^2 + 3) − 2] = −p + 2p (p^2 + 3).

This can be rewritten as

        X2 (p^2 + 1)(p^2 + 4) = 2p^3 + 5p,

which means that

        X2 = (2p^3 + 5p)/[(p^2 + 1)(p^2 + 4)].

If we set

        (2p^3 + 5p)/[(p^2 + 1)(p^2 + 4)] = (Ap + B)/(p^2 + 1) + (Cp + D)/(p^2 + 4),

then

        (Ap + B)(p^2 + 4) + (Cp + D)(p^2 + 1) = 2p^3 + 5p.

Thus if p = 0 we have 4B + D = 0. On comparing the coefficients of the p, p^2 and p^3 terms, it is clear that we also have 4A + C = 5, B + D = 0, and A + C = 2. It is not difficult to conclude from these equations that

        A = 1, B = 0, C = 1, D = 0.

(We leave the verification of this as an exercise.) Consequently

        X2 = (2p^3 + 5p)/[(p^2 + 1)(p^2 + 4)] = p/(p^2 + 1) + p/(p^2 + 4).

On taking inverse transforms we therefore get

        x2(t) = cos t + cos 2t.

The formula for x1 may be computed by substituting the above formula for x2 into x2″ − x1 + 2 x2 = 0. Specifically

        x1 = x2″ + 2 x2
           = −cos t − 4 cos 2t + 2 cos t + 2 cos 2t
           = cos t − 2 cos 2t. □

EXERCISE 3.3.
1. Determine i2 (t) if, initially, i1 (0) = i2 (0) = q1 (0) = 0.

Figure 3.10

2. Determine i1 (t) for a constant applied voltage E0 = 10V if, prior to


closing the switch, there is no charge on the capacitor and no current
flowing through the inductances.

Figure 3.11

3. Determine the displacement of the masses m1 and m2 shown in figure 3.12


at time t if m1 = m2 = 1 and k1 = 1, k2 = 3 and k3 = 9. What are the
natural frequencies of the system?

Figure 3.12.

4. When testing the landing gear of a spacecraft, drop tests are carried out. Figure 3.13 represents a schematic model of the unit when it first touches the ground. At this instant the spring is fully extended and the velocity of the mass is √(2gh), where h is the height from which the unit has been dropped. Obtain the equation representing the displacement of the mass at time t > 0 when m = 50 kg, β = 180 N s m⁻¹ and k = 474,5 N m⁻¹. Investigate the effects of different dropping heights h. Take g = 9,8 m s⁻².

Figure 3.13.

UNIT 4: STEP AND IMPULSE FUNCTIONS


3.4.1. OBJECTIVE. To introduce step and impulse functions, and demon-
strate their usefulness in the theory of Laplace transforms.

3.4.2. OUTCOMES. At the end of this unit the student should
• Be familiar with the definition of Heaviside step and Dirac impulse functions;
• Be able to use Heaviside step functions to write down concise formulations of piecewise-continuous functions, and from this formulation to compute the Laplace transform of such functions;
• Be familiar with and able to use the theory of Laplace transforms of periodic functions;
• Understand and be familiar with the sifting property of impulse functions;
• Understand and be familiar with the relationship between step and impulse functions.

3.4.3. THE HEAVISIDE STEP FUNCTION. We recall some definitions


from section 3.1. A function f (t) on [0, ∞) is said to be piecewise regular if on
every subinterval of [0, ∞) of the form [a, b], it is bounded and has a finite number
of discontinuities. If a piecewise regular function on [0, ∞) has finitely many discon-
tinuities, we say that it is piecewise continuous. In many engineering applications
the function is frequently discontinuous, but still piecewise regular; for example
a square wave from an on/off switch. The so-called Heaviside unit step function
proves to be an effective tool for analysing such functions.

Figure 3.14

This function is defined by

        H(t) = { 0   t < 0
               { 1   t ≥ 0

(see also section 3.1). The translated Heaviside step function

        H(t − a) = { 0   t < a
                   { 1   t ≥ a

may be interpreted as a device for "switching on" the function f(t) at t = a. To see what we mean by this, let f be a given function on [0, ∞), and consider the product f(t) H(t − a). We have that

        f(t) H(t − a) = { 0      t < a
                        { f(t)   t ≥ a

Thus the effect of the product f(t) H(t − a) is to "switch f on" at t = a. Now suppose we are given 0 < a < b. If we take the product of f with the so-called "top hat function"

        H(t − a) − H(t − b) = { 0   0 ≤ t < a
                              { 1   a ≤ t < b
                              { 0   b ≤ t

instead of just H(t − a), we get

        f(t) [H(t − a) − H(t − b)] = { 0      t < a
                                     { f(t)   a ≤ t < b
                                     { 0      b ≤ t

Having switched f on at t = a, the effect of subtracting f(t) H(t − b) from f(t) H(t − a) is therefore to "switch f off again" at t = b. By in this way switching various functions on and off at predetermined points, the Heaviside unit step function may be used to write a concise formulation of piecewise–continuous functions. To see how this works, consider for example the function

        f(t) = { f1(t),   0 ≤ t < t1
               { f2(t),   t1 ≤ t < t2
               { f3(t),   t ≥ t2.

Figure 3.15

The above piecewise continuous function is produced by means of the following


process:
• Switch f1 (t) on at t = 0 and off at t = t1 .
• Switch f2 (t) on at t = t1 and off at t = t2 .
• Switch f3 (t) on at t = t2 .
We can now use the Heaviside step function to switch the functions f1 , f2 and f3
on and off at the appropriate places. Using this approach, the original function f
can now be written as
f (t) = f1 (t) [H (t) − H (t − t1 )] + f2 (t) [H (t − t1 ) − H (t − t2 )] + f3 (t) H (t − t2 ) .

Example 3.4.1. Sketch and express the following piecewise–continuous function in terms of unit step functions.

        f(t) = { 0   t < 1
               { 1   1 ≤ t < 3
               { 3   3 ≤ t < 5
               { 2   5 ≤ t < 6
               { 0   t ≥ 6

Solution:

Figure 3.16

We obtain

        f(t) = 1 · [H(t − 1) − H(t − 3)] + 3 · [H(t − 3) − H(t − 5)] + 2 · [H(t − 5) − H(t − 6)]
             = H(t − 1) + 2 H(t − 3) − H(t − 5) − 2 H(t − 6).

Example 3.4.2. Sketch and express the following piecewise–continuous function in terms of unit step functions:

        f(t) = { 2t^2    0 ≤ t < 3
               { t + 4   3 ≤ t < 5
               { 9       t ≥ 5

Solution:

Figure 3.17

        f(t) = 2t^2 [H(t) − H(t − 3)] + (t + 4) [H(t − 3) − H(t − 5)] + 9 H(t − 5)
             = 2t^2 H(t) + (t + 4 − 2t^2) H(t − 3) − (t − 5) H(t − 5).

Now suppose that after having used the Heaviside step function to write a
piecewise regular function as a combination of simpler functions, we wish to com-
pute the Laplace transform of such a function. At this point it is important to
distinguish between the functions f (t) H (t − a) and f (t − a) H (t − a) .
The function f(t) H(t − a) simply represents a function which switches f(t) on at time t = a. By contrast if f is causal, then

        f(t − a) H(t − a) = { 0          t < a
                            { f(t − a)   t ≥ a

represents a translation of f(t), a units to the right. Hence we may think of f(t − a) H(t − a) as representing the function f(t) delayed by a units.

Figure 3.18

In the theory of Laplace transforms, the second translation property (see subsection 3.1.4.3) tells us how to compute the Laplace transform of f(t − a) H(t − a), but not of f(t) H(t − a)! So if we need to compute the Laplace transform of something like f(t) H(t − a), we first need to use a bit of ingenuity to try and rewrite it as a combination of functions of the form g(t − a) H(t − a). Having done that, we can then simply apply the formula L{g(t − a) H(t − a)} = e^{−ap} G(p). (Note that since g(t − a) H(t − a) is just the function g delayed by a units, we may think of the factor e^{−ap} in its Laplace transform e^{−ap} G(p) as some sort of delay operation on G(p). Since many practically important systems have some form of delay inherent in their behaviour, it is clear that this theoretical model is potentially very useful.)

Coming back to the problem of computing Laplace transforms of functions like f(t) H(t − a): in some cases it is fairly easy to rewrite such a function in terms of functions like g(t − a) H(t − a), and then find the Laplace transform. In such cases we can simply do the rewriting by inspection, as in the next example. In cases where the function is more complicated, we have to work a bit harder to rewrite something like f(t) H(t − a) as a combination of terms like g(t − a) H(t − a). We will then revisit example 3.4.2 to see how we may compute the Laplace transform of that function.
Example 3.4.3. Determine L{f(t)} where

        f(t) = { t   0 ≤ t < 4
               { 0   t ≥ 4

Solution: We have that

        f(t) = t [H(t) − H(t − 4)] = t H(t) − t H(t − 4).

Here the term t H(t − 4) is not in the form g(t − 4) H(t − 4) for some g. But we can fix this by simply replacing t H(t − 4) with (t − 4) H(t − 4) + 4 H(t − 4) to get

        f(t) = t H(t) − (t − 4) H(t − 4) − 4 H(t − 4).

By the second translation property we then have that

        L{f(t)} = L{t} − e^{−4p} L{t} − 4 e^{−4p} L{1} = 1/p^2 − e^{−4p}/p^2 − 4 e^{−4p}/p.

Example 3.4.4. Determine L{f(t)} where

        f(t) = { 2t^2    0 ≤ t < 3
               { t + 4   3 ≤ t < 5
               { 9       t ≥ 5

Solution: We saw in example 3.4.2 that

        f(t) = 2t^2 H(t) + (t + 4 − 2t^2) H(t − 3) − (t − 5) H(t − 5).

Here the expression (t + 4 − 2t^2) H(t − 3) is not in the correct form. What we need to do is to rewrite t + 4 − 2t^2 in terms of powers of t − 3. So let z = t − 3 (that is t = z + 3). Then

        t + 4 − 2t^2 = (z + 3) + 4 − 2(z + 3)^2
                     = −2z^2 − 11z − 11
                     = −2(t − 3)^2 − 11(t − 3) − 11.

Thus

        f(t) = 2t^2 H(t) − [2(t − 3)^2 + 11(t − 3) + 11] H(t − 3) − (t − 5) H(t − 5).

Taking Laplace transforms then yields

        L{f(t)} = 2 L{t^2} − e^{−3p} L{2t^2 + 11t + 11} − e^{−5p} L{t}
                = 4/p^3 − e^{−3p} [4/p^3 + 11/p^2 + 11/p] − e^{−5p} · 1/p^2.
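Rewritings like the one above can also be checked mechanically. The following illustrative sketch assumes a SymPy version whose laplace_transform supports shifted Heaviside terms (an assumption on our part; it is not part of the original notes):

    import sympy as sp

    t, p = sp.symbols('t p', positive=True)
    H = sp.Heaviside

    f = 2*t**2*H(t) + (t + 4 - 2*t**2)*H(t - 3) - (t - 5)*H(t - 5)
    F = sp.laplace_transform(f, t, p, noconds=True)
    target = 4/p**3 - sp.exp(-3*p)*(4/p**3 + 11/p**2 + 11/p) - sp.exp(-5*p)/p**2
    print(sp.simplify(F - target))  # 0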

Example 3.4.5. Determine

        L^{−1}{e^{−πp} (p + 3)/(p (p^2 + 1))}.

Solution: Using partial fractions we can write

        (p + 3)/(p (p^2 + 1)) = 3/p + (1 − 3p)/(p^2 + 1).

Thus

        L^{−1}{e^{−πp} (p + 3)/(p (p^2 + 1))} = L^{−1}{e^{−πp} [3/p + (1 − 3p)/(p^2 + 1)]}
                = L^{−1}{3 e^{−πp}/p + e^{−πp}/(p^2 + 1) − 3 e^{−πp} p/(p^2 + 1)}
                = [3 + sin(t − π) − 3 cos(t − π)] H(t − π)
                = [3 − sin t + 3 cos t] H(t − π). □

3.4.4. APPLICATION TO PERIODIC FUNCTIONS. We have already


determined the Laplace transforms of continuous periodic functions such as sin t and
cos t. In this section we will establish a formula for computing the Laplace trans-
form of a general periodic function. The Heaviside unit step function will prove to
be an important tool in obtaining this formula. Let's first look at a few examples
to try and get some idea of how one may approach the problem of computing the
Laplace transform of a periodic function. Some examples of discontinuous periodic
functions are given in figures 3.19 to 3.22 below.

Figure 3.19.

Figure 3.20

Figure 3.21

Figure 3.22.

Let's suppose that we wish to determine the Laplace transform of the square wave function in figure 3.19. Using the Heaviside unit step function we may write this function as

        f(t) = k [H(t) − H(t − T/2)] − k [H(t − T/2) − H(t − T)]
               + k [H(t − T) − H(t − 3T/2)] − . . .
             = k H(t) − 2k H(t − T/2) + 2k H(t − T) − 2k H(t − 3T/2)
               + 2k H(t − 2T) − . . .
             = 2k [H(t) − H(t − T/2) + H(t − T) − H(t − 3T/2) + H(t − 2T) − . . .]
               − k H(t).

Therefore

        F(p) = L{f(t)}
             = 2k [1/p − (1/p) e^{−pT/2} + (1/p) e^{−pT} − (1/p) e^{−3pT/2} + (1/p) e^{−2pT} − . . .] − k/p
             = (2k/p) [1 − e^{−pT/2} + (e^{−pT/2})^2 − (e^{−pT/2})^3 + (e^{−pT/2})^4 − . . .] − k/p.

The series in brackets is an infinite geometric series with constant ratio −e^{−pT/2}. The sum of this series is therefore

        1/(1 − (−e^{−pT/2})) = 1/(1 + e^{−pT/2}).

Hence

        L{f(t)} = (2k/p) · 1/(1 + e^{−pT/2}) − k/p.

This can be written in a more compact form by noting that

        L{f(t)} = (k/p) [2/(1 + e^{−pT/2}) − 1]
                = (k/p) (1 − e^{−pT/2})/(1 + e^{−pT/2})
                = (k/p) (e^{pT/4} − e^{−pT/4})/(e^{pT/4} + e^{−pT/4})
                = (k/p) tanh(pT/4).
But what about more general periodic functions? Can we use the same ideas
to get a formula for the Laplace transform in this more general case? The good
news is that indeed we can.

Theorem 3.4.6. If f(t) is a piecewise regular periodic function with period T (that is f(t + T) = f(t) for all t), then

        L{f(t)} = [∫_0^T e^{−pt} f(t) dt]/(1 − e^{−pT}).

Alternatively if f1 is the function

        f1(t) = { f(t)   if 0 ≤ t < T
                { 0      otherwise

then

        L{f(t)} = L{f1(t)}/(1 − e^{−pT}).
Proof: Suppose we are given some periodic function f on [0, ∞) with period
T.

Figure 3.23

If, as illustrated in the above figure, f (t) is periodic and piecewise-regular,


then it is also bounded on [0, ∞), and hence of exponential order σ = 0. Thus the
Laplace transform exists. Since f has period T , we have that f (t + T ) = f (t) for
all t, and hence that f (t + nT ) = f (t) for all t and all positive integers n. The
Laplace transform can be expressed as a series of integrals over successive periods:

        L{f(t)} = ∫_0^∞ e^{−pt} f(t) dt
                = ∫_0^T e^{−pt} f(t) dt + ∫_T^{2T} e^{−pt} f(t) dt + ∫_{2T}^{3T} e^{−pt} f(t) dt
                  + . . . + ∫_{nT}^{(n+1)T} e^{−pt} f(t) dt + . . .

In each of the integrals of the form ∫_{nT}^{(n+1)T} e^{−pt} f(t) dt, we make the substitution

        t = s + nT,    dt = ds.

Using the fact that f(s + nT) = f(s), we get that

        ∫_{nT}^{(n+1)T} e^{−pt} f(t) dt = ∫_0^T e^{−p(s+nT)} f(s + nT) ds
                                        = e^{−pnT} ∫_0^T e^{−ps} f(s) ds.

Thus

        L{f(t)} = Σ_{n=0}^∞ ∫_{nT}^{(n+1)T} e^{−pt} f(t) dt
                = Σ_{n=0}^∞ e^{−pnT} ∫_0^T e^{−ps} f(s) ds
                = ∫_0^T e^{−ps} f(s) ds · Σ_{n=0}^∞ e^{−pnT}.

The series

        Σ_{n=0}^∞ e^{−pnT} = 1 + e^{−pT} + e^{−2pT} + . . .

is an infinite geometric series with constant ratio e^{−pT}. The sum of this series is therefore

        1/(1 − e^{−pT}).

Hence

        L{f(t)} = 1/(1 − e^{−pT}) · ∫_0^T e^{−ps} f(s) ds.

But s is of course just a dummy variable. We may therefore simply replace s by t to get

        L{f(t)} = [∫_0^T e^{−pt} f(t) dt]/(1 − e^{−pT}).

Finally let f1(t) assume the values f(t) on 0 ≤ t < T, and be 0 elsewhere. That is, let

        f1(t) = f(t) [H(t) − H(t − T)].

Then of course

        L{f1(t)} = ∫_0^∞ e^{−pt} f1(t) dt
                 = ∫_0^T e^{−pt} f(t) dt + ∫_T^∞ 0 dt
                 = ∫_0^T e^{−pt} f(t) dt.

Thus we get

        L{f(t)} = [∫_0^T e^{−pt} f(t) dt]/(1 − e^{−pT}) = L{f1(t)}/(1 − e^{−pT})

as required. □

Now let's get back to our square wave function illustrated in figure 3.19 and see what we get when we apply the above formula. This function had a period of T, with the values in the first period given by

        f(t) = { k    if 0 ≤ t < T/2
               { −k   if T/2 ≤ t < T.

So here

        f1(t) = { k    if 0 ≤ t < T/2
                { −k   if T/2 ≤ t < T
                { 0    otherwise.

By the theorem

        L{f(t)} = L{f1(t)}/(1 − e^{−pT}).

To find L{f(t)} we must therefore find L{f1(t)}. Notice that

        f1(t) = k [H(t) − H(t − T/2)] − k [H(t − T/2) − H(t − T)]
              = k [H(t) − 2 H(t − T/2) + H(t − T)].

Therefore

        L{f1(t)} = k [1/p − 2 e^{−pT/2}/p + e^{−pT}/p]
                 = (k/p) [1 − e^{−pT/2}]^2.

Now notice that the expression 1 − e^{−pT} can also be written as 1 − e^{−pT} = (1 − e^{−pT/2})(1 + e^{−pT/2}). Consequently, as before we get that

        L{f(t)} = L{f1(t)}/(1 − e^{−pT})
                = (k/p) (1 − e^{−pT/2})^2 / [(1 − e^{−pT/2})(1 + e^{−pT/2})]
                = (k/p) (1 − e^{−pT/2})/(1 + e^{−pT/2})
                = (k/p) (e^{pT/4} − e^{−pT/4})/(e^{pT/4} + e^{−pT/4})
                = (k/p) tanh(pT/4).
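The theorem reduces the transform of a periodic function to a single finite integral, which makes numerical spot-checks easy. A sketch for the square wave with k = 1 and T = 2 (assuming SymPy is available; not part of the original notes):

    import sympy as sp

    t = sp.symbols('t', positive=True)
    p = sp.symbols('p', positive=True)

    k, T = 1, sp.Integer(2)
    f1 = sp.Piecewise((k, t < T/2), (-k, t < T), (0, True))
    F = sp.integrate(sp.exp(-p*t)*f1, (t, 0, T))/(1 - sp.exp(-p*T))
    # compare with (k/p) tanh(pT/4) at an arbitrary sample point
    diff = F - (k/p)*sp.tanh(p*T/4)
    print(sp.N(diff.subs(p, sp.Rational(13, 10))))  # approximately 0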

Example 3.4.7. Determine the Laplace transform of the rectified half–wave defined by

f(t) = { sin 100πt  if 0 ≤ t < 1/100
       { 0          if 1/100 ≤ t < 2/100

where f(t) = f(t + 2/100) for all t (equivalently f(t) = f(t + 2n/100) for all t and all positive integers n).
Solution: The function f₁ which has the value f(t) on the first period and 0 elsewhere is defined by

f₁(t) = sin(100πt) [H(t) − H(t − 1/100)].

To get this into a form where we can actually compute the Laplace transform, all terms must be of the form g(t − a)H(t − a). Using the fact that sin(θ − π) = −sin θ, we may write

f₁(t) = sin(100πt) H(t) − sin(100πt) H(t − 1/100)
      = sin(100πt) H(t) + sin(100πt − π) H(t − 1/100)
      = sin(100πt) H(t) + sin 100π(t − 1/100) H(t − 1/100)

Hence

L{f₁(t)} = 100π/(p² + (100π)²) + e^{-p/100} · 100π/(p² + (100π)²)
         = [1 + e^{-p/100}] · 100π/(p² + (100π)²)
Therefore by the theorem

L{f(t)} = L{f₁(t)} / (1 − e^{-pT})
        = [1/(1 − e^{-2p/100})] · [1 + e^{-p/100}] · 100π/(p² + (100π)²)
        = 1/[(1 − e^{-p/100})(1 + e^{-p/100})] · [1 + e^{-p/100}] · 100π/(p² + (100π)²)
        = 100π / [(1 − e^{-p/100})(p² + (100π)²)].

EXERCISE 3.4.
1. Express the following piecewise-continuous causal functions in terms of
Heaviside unit step functions and determine the Laplace transforms of
these functions.

(a) f(t) = { t  if 0 ≤ t < 1
           { 0  if t ≥ 1
    [f(t) = tH(t) − tH(t − 1);  F(p) = (1 − e^{-p})/p² − e^{-p}/p]


(b) f(t) = { t      if 0 ≤ t < 1
           { 2 − t  if 1 ≤ t < 2
           { 0      if t ≥ 2
    [f(t) = t − 2(t − 1)H(t − 1) + (t − 2)H(t − 2);  F(p) = 1/p² − 2e^{-p}/p² + e^{-2p}/p²]

2. Determine the inverse Laplace transforms of the following.
   (a) e^{-5p}/(p − 2)⁴
       [(1/6)(t − 5)³ e^{2(t−5)} H(t − 5)]
   (b) (p + 1)e^{-p}/[p²(p² + 1)]
       [[t − cos(t − 1) − sin(t − 1)] H(t − 1)]
   (c) e^{-p}(1 − e^{-p})/[p²(p² + 1)]
       [[(t − 1) − sin(t − 1)]H(t − 1) − [(t − 2) − sin(t − 2)]H(t − 2)]

3. Given that x(0) = 0, obtain the solution of the differential equation x′ + x = f(t), where f(t) is the function defined in 1(a). Sketch the graph of the solution.
   [x(t) = e^{-t} + (t − 1)(1 − H(t − 1))]

4. Given that x(0) = 1 and x′(0) = 0, obtain the solution of the differential equation x″ + x′ + x = f(t) where f(t) is the function defined in 1(b).
   [x(t) = 2e^{-t/2} cos((1/2)√3 t) + t − 1
        − 2H(t − 1)[t − 2 + e^{-(t−1)/2}(cos((1/2)√3(t − 1)) − (1/√3) sin((1/2)√3(t − 1)))]
        + H(t − 2)[t − 3 + e^{-(t−2)/2}(cos((1/2)√3(t − 2)) − (1/√3) sin((1/2)√3(t − 2)))]]

5. Express the function
   f(t) = { 3       if 0 ≤ t < 4
          { 2t − 5  if t ≥ 4
   in terms of Heaviside unit step functions and obtain its Laplace transform. Hence, obtain the response of the harmonic oscillator x″ + x = f(t) to such a forcing function if x = 1 and x′ = 0 when t = 0.
   [f(t) = 3 + 2(t − 4)H(t − 4);  F(p) = 3/p + 2e^{-4p}/p²;
    x(t) = 3 − 2 cos t + 2[t − 4 − sin(t − 4)]H(t − 4)]

6. A periodic function f(t) with period 4 units is defined within the interval [0, 4] by
   f(t) = { 3t  if 0 ≤ t < 2
          { 6   if 2 ≤ t < 4
   Sketch a graph of the function for 0 ≤ t < 12 and obtain its Laplace transform.
   [(3 − 3e^{-2p} − 6pe^{-4p}) / (p²(1 − e^{-4p}))]

3.4.5. THE IMPULSE FUNCTION. In many engineering applications


we are interested in seeking the response of systems to forcing functions that are
applied suddenly and for a very short period of time, e.g. hitting a nail with
a hammer. These functions are known as impulse forces. Mathematically, such
forcing functions are idealised by the impulse function, which is a function whose
total value is concentrated at one point. To develop a mathematical formulation
of the impulse function and obtain some insight into its physical interpretation,
consider the pulse function φ (t) defined by
φ(t) = { 0    if 0 ≤ t < a − (1/2)T
       { A/T  if a − (1/2)T ≤ t < a + (1/2)T
       { 0    if t ≥ a + (1/2)T

This is illustrated in figure 3.24. Since the height of the pulse is A/T and its duration is T, the area under the pulse is A, that is

∫_0^∞ φ(t) dt = ∫_{a−T/2}^{a+T/2} (A/T) dt = A.

Figure 3.24.

If we now let the duration of the pulse T approach zero in such a way that the
area under the pulse remains A, we obtain a formulation of an impulse of magnitude
A occurring instantaneously at time t = a. It is important to remember that the
magnitude of the impulse function is measured by its area. If the magnitude is unity,
the impulse function is called the unit impulse function or Dirac delta function. The
Dirac delta function with a pulse of magnitude 1 at point a is denoted by δ (t − a)
and is formally defined by
(3.4.1)  δ(t − a) = { 0 if t ≠ a; ∞ if t = a },   ∫_{−∞}^∞ δ(t − a) dt = 1

An impulse function of magnitude A may now be expressed as Aδ (t − a) and


represented graphically as in the figure below.

Figure 3.25

It is important to realise that this is NOT a function in the true sense of the
word, but part of a class of objects known as generalised functions (which may be
analysed using the theory of generalised calculus). Their properties are such that,
used with care, the theory can lead to results that have physical significance and
which in many cases cannot easily be obtained by any other method. However we
should always keep in mind that δ(t − a) only has meaning in a limiting sense. It
should therefore be seen as a kind of shorthand for limit formulas. Thus when we say that ∫_{−∞}^∞ δ(t − a) dt = 1, what we really mean is that lim_{T→0} ∫_{−∞}^∞ δ_T(t − a) dt = 1, where

δ_T(t − a) = { 0    if 0 ≤ t < a − (1/2)T
             { 1/T  if a − (1/2)T ≤ t < a + (1/2)T
             { 0    if t ≥ a + (1/2)T
Similarly, when we write expressions like L{δ(t − a)} and ∫_0^∞ f(t)δ(t − a) dt, what we really have in mind are lim_{T→0} L{δ_T(t − a)} and lim_{T→0} ∫_0^∞ f(t)δ_T(t − a) dt respectively. Although in the above discussion we expressed δ(t − a) as the limit of the functions δ_T(t − a), it is important to realise that the actual shape of the graph of these limiting pulses is not really that important, as long as the area under the curve remains a constant 1 as the duration of the pulse approaches zero, with the approximating functions tending to zero at all t with t ≠ a. We could therefore equally well have regarded δ(t − a) as some sort of limit, as T → 0, of the so-called “tent-functions” φ₁(t) represented in the figure below:

Figure 3.26
The unit impulse function at t = 0 will be denoted by δ (t) (rather than δ(t−0))
and as above may be regarded as the limiting case of the pulse φ2 (t) illustrated in
the figure below, as T → 0. (Note that the functions φ2 used to define δ(t) are not
causal.) The unit impulse function has the formal properties that
δ(t) = { 0 if t ≠ 0; ∞ if t = 0 },   ∫_{−∞}^∞ δ(t) dt = 1.

Figure 3.27

3.4.5.1. The sifting property.


Theorem 3.4.8. If f(t) is integrable on [α, β] and continuous at t = a where α < a < β, then

(3.4.2)  ∫_α^β f(t) δ(t − a) dt = f(a).

This is referred to as the sifting property because it provides a method of


isolating, or sifting out, the value of a function at any particular point. For example
∫_0^{2π} cos t · δ(t − π/3) dt = cos(π/3) = 1/2.
Although we used finite limits of integration in equation 3.4.2, the formula still
holds if we replace α by −∞, and/or β by ∞.
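The limiting process behind the sifting property is easy to see numerically. In the hedged sketch below (not part of the original text; the choices f = cos and a = π/3 simply mirror the example above), δ(t − a) is replaced by the pulse δ_T(t − a) of height 1/T, and the integral is seen to approach f(a) = 1/2 as T shrinks.

    import numpy as np
    from scipy.integrate import quad

    a = np.pi / 3               # the point being sifted out
    for T in (0.5, 0.1, 0.01):
        # integral of cos(t) * delta_T(t - a): height 1/T over [a - T/2, a + T/2]
        val, _ = quad(np.cos, a - T / 2, a + T / 2)
        print(T, val / T)       # approaches cos(pi/3) = 0.5 as T -> 0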

3.4.5.2. Laplace transforms of impulse functions.


Theorem 3.4.9. For any a ≥ 0 we have that

L{δ(t − a)} = e^{-ap}

or in inverse form

(3.4.3)  L^{-1}{e^{-ap}} = δ(t − a).

In particular

L{δ(t)} = 1  and  L^{-1}{1} = δ(t).
Proof: In the above theorem, the case a > 0 is a simple consequence of the sifting property applied to the function f(t) = e^{-pt}. For any 0 < a < ∞ we have by this property that

L{δ(t − a)} = ∫_0^∞ e^{-pt} δ(t − a) dt = e^{-ap}.

The case where a = 0 is a bit more complex. Notice that given an integral of the form ∫_0^∞ f(t)δ(t − a) dt, the sifting property only applies in the case where 0 < a < ∞. So we cannot directly apply the sifting property in the case where a = 0. However recall that for any f, the Laplace transform was formally defined to be

L{f(t)} = ∫_{0⁻}^∞ e^{-pt} f(t) dt = lim_{ε→0⁺} ∫_{−ε}^∞ e^{-pt} f(t) dt.

For each ε > 0, we will have −ε < 0 < ∞, and hence we may indeed apply the sifting property in the integral below to conclude that

∫_{−ε}^∞ e^{-pt} δ(t) dt = e^{-0p} = 1

whenever ε > 0. Thus

L{δ(t)} = lim_{ε→0⁺} ∫_{−ε}^∞ e^{-pt} δ(t) dt = lim_{ε→0⁺} 1 = 1.  □


Example 3.4.10. Determine

L^{-1}{p²/(p² + 4)}.

Solution: By performing long division of polynomials, we obtain

p²/(p² + 4) = (p² + 4 − 4)/(p² + 4) = 1 − 2 · 2/(p² + 4).

Thus

L^{-1}{p²/(p² + 4)} = L^{-1}{1} − 2L^{-1}{2/(p² + 4)} = δ(t) − 2 sin 2t.

Example 3.4.11. Solve the differential equation x″ + 3x′ + 2x = 1 + δ(t − 4) subject to x(0) = x′(0) = 0.

Solution: Taking Laplace transforms yields

p²X − px(0) − x′(0) + 3[pX − x(0)] + 2X = 1/p + e^{-4p}.

After substituting the initial conditions, this becomes

X(p² + 3p + 2) = 1/p + e^{-4p},

or equivalently

X = 1/[p(p + 1)(p + 2)] + e^{-4p}/[(p + 1)(p + 2)].

By means of partial fractions this can be written as

X = (1/2)[1/p − 2/(p + 1) + 1/(p + 2)] + e^{-4p}[1/(p + 1) − 1/(p + 2)].

Therefore

x(t) = (1/2)[1 − 2e^{-t} + e^{-2t}] + [e^{-(t−4)} − e^{-2(t−4)}]H(t − 4)
     = { (1/2)(1 − 2e^{-t} + e^{-2t})                              if 0 ≤ t < 4
       { (1/2)(1 − 2e^{-t} + e^{-2t}) + e^{-(t−4)} − e^{-2(t−4)}   if t ≥ 4
     = { (1/2)(1 − 2e^{-t} + e^{-2t})                 if 0 ≤ t < 4
       { 1/2 + (e⁴ − 1)e^{-t} + (1/2 − e⁸)e^{-2t}     if t ≥ 4
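As an independent cross-check of this example (not part of the original text), the transform X(p) can be inverted with a computer algebra system; the sketch below assumes sympy is available.

    import sympy as sp

    t, p = sp.symbols('t p', positive=True)
    X = 1 / (p * (p + 1) * (p + 2)) + sp.exp(-4 * p) / ((p + 1) * (p + 2))
    x = sp.inverse_laplace_transform(X, p, t)
    sp.pprint(sp.simplify(x))
    # expected (up to rearrangement):
    #   1/2 - exp(-t) + exp(-2*t)/2
    #   + (exp(-(t - 4)) - exp(-2*(t - 4)))*Heaviside(t - 4)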

3.4.5.3. Relationship between Heaviside step and impulse functions. Although
the Heaviside step function H(t − a) is not differentiable at t = a and δ(t − a) not
a function in the true sense of the word, when it comes to the theory of Laplace
transforms, the generalised function δ(t − a), will play the role of the derivative of
H(t − a). Recall that the formal rules of operation for δ(t − a) are that

δ(t − a) = { 0 if t ≠ a; ∞ if t = a },   ∫_{−∞}^∞ δ(t − a) dt = 1.

Still arguing formally, if s < a, then δ(t − a) = 0 on all of (−∞, s], which suggests that ∫_{−∞}^s δ(t − a) dt = ∫_{−∞}^s 0 dt = 0. If on the other hand s ≥ a, then the rule that δ(t − a) = 0 on (s, ∞) suggests that 1 = ∫_{−∞}^∞ δ(t − a) dt = ∫_{−∞}^s δ(t − a) dt. Thus in a formal sense

H(s − a) = { 0 if s < a; 1 if a ≤ s } = ∫_{−∞}^s δ(t − a) dt.

Similarly we also have that

H(s) = ∫_{−∞}^s δ(t) dt

in a formal sense.

EXERCISE 3.4 (continued).
7. Obtain the inverse Laplace transforms of the following functions:
   (a) (2p² + 1)/[(p + 2)(p + 3)].
       [2δ(t) + 9e^{-2t} − 19e^{-3t}]
   (b) (p² − 1)/(p² + 4).
       [δ(t) − (5/2) sin 2t]
   (c) (p² + 2)/(p² + 2p + 5).
       [δ(t) − e^{-t}(2 cos 2t + (1/2) sin 2t)]
8. Solve the following differential equations for t ≥ 0:
   (a) x″ + 7x′ + 12x = 2 + δ(t − 2) subject to x = x′ = 0 at t = 0.
       [x(t) = 1/6 − (2/3)e^{-3t} + (1/2)e^{-4t} + [e^{-3(t−2)} − e^{-4(t−2)}]H(t − 2)]
   (b) x″ + 6x′ + 13x = δ(t − 2π) subject to x(0) = x′(0) = 0.
       [x(t) = (1/2)e^{6π}e^{-3t} H(t − 2π) sin 2t]
   (c) x″ + 7x′ + 12x = δ(t − 3) subject to x = x′ = 1 at t = 0.
       [x(t) = 5e^{-3t} − 4e^{-4t} + [e^{-3(t−3)} − e^{-4(t−3)}]H(t − 3)]

9. Solve the differential equation x″ + 7x′ + 10x = 2u + 3u′ for t ≥ 0, subject to x(0) = 0 and x′(0) = 2, given that u(t) = e^{-2t}H(t).
   [x(t) = (10/9)e^{-2t} − (10/9)e^{-5t} − (4/3)te^{-2t}]

10. A periodic function f(t) is made up of an infinite train of unit impulses at
    t = 0, t = T, t = 2T, . . . , t = nT, . . .
    (a) Show that
        L{f(t)} = 1/(1 − e^{-pT}).
    (b) The response of a harmonic oscillator to such a periodic stimulus is determined by the differential equation x″ + ω²x = f(t), t ≥ 0. Determine the response and sketch the responses from t = 0 to t = 6π/ω for
        (i) T = π/ω,
        (ii) T = 2π/ω.
    [For any 0 ≤ t < T, we have x(t + nT) = (1/ω) sin(ωt) H(t), (n = 0, 1, 2, . . .). Alternatively, if nT ≤ t < (n + 1)T, then x(t) = (1/ω) sin ω(t − nT) H(t − nT), (n = 0, 1, 2, . . .)]

11. An impulse voltage Eδ(t) is applied at time t = 0 to a series circuit consisting of an inductor L, a resistor R and a capacitor C. Prior to the impulse, the current in the circuit and the charge on the capacitor were zero. Determine the charge on the capacitor and the resulting current in the circuit at time t.
    [q(t) = (E/Lm) e^{-μt} sin(mt),  i(t) = (E/Lm) e^{-μt}(m cos mt − μ sin mt), where μ = R/2L and m² = 1/LC − R²/4L².]

UNIT 5: TRANSFER FUNCTIONS


3.5.1. OBJECTIVE. To introduce the notion of a transfer function, and
indicate the usefulness and applicability of this concept.

3.5.2. OUTCOMES. At the end of this unit the student should


• Understand the concept of a transfer function and be able to compute
transfer functions of systems governed by linear constant coefficient dif-
ferential equations;
• Be able to use the transfer function of a system to compute the response
of the system to a given input;
• Know how the stability of a system can be described in terms of the poles
of the transfer function;
• Know the Routh-Hurwitz criterion and be able to use it to test the stability
of a given transfer function;
• Know how the applicability of the final value theorem depends on the
stability of a system;
• Know the meaning of the term “impulse response” and be familiar with
the relationship between the impulse response and the transfer function;
• Know the meaning of the term “step response” and be familiar with the
relationship between the step response, impulse response, and the transfer
function;
• Know how to use convolution with the impulse response to compute the
response of a system to an arbitrary input.

3.5.3. DEFINITIONS. Let's consider a simple system described by the dif-


ferential equation
x″(t) + 2x′(t) + x(t) = u(t)

where x′(0) = x(0) = 0. In the preceding units we have seen how for a given forcing
function u(t), one can use Laplace transforms to solve this system. But what if we
want to solve the same system for different forcing functions? If we use the above
approach this means that for each forcing function we would have to do all the
calculations all over again. This is far too time consuming. The challenge for us
then is to find a more efficient way of capturing the essence of the system – a way
that will provide us with a simple way of solving the system for any given forcing
function. Let's get back to the differential equation above. If we take Laplace
transforms we ultimately get to
X(p) = [1/(p + 1)²] U(p).

The function G(p) = 1/(p + 1)² seems to be the key here. To get from the Laplace

transform of the input (forcing function) to the transform of the output, all we
need to do is multiply by G(p)! So once we know what G(p) looks like, we can
then solve the system for any input (forcing function) u(t) by going through the
following steps:
• Compute the transform U (p) = L{u(t)}.
• Multiply by G(p) to get L{x(t)} = X(p) = G(p)U (p).
• Find x(t) by taking the inverse transform of L{x(t)}.
When dealing with a large number of different forcing functions for the same system,
such an approach is clearly more efficient. These ideas then lead us to the notion
of a transfer function.
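To illustrate the three-step recipe concretely, the sketch below (not from the original text; it assumes scipy is available) encodes G(p) = 1/(p + 1)² once and reuses it for two different forcing functions.

    import numpy as np
    from scipy.signal import lti, lsim

    G = lti([1], [1, 2, 1])       # G(p) = 1/(p^2 + 2p + 1) = 1/(p + 1)^2
    t = np.linspace(0, 10, 501)
    for u in (np.ones_like(t), np.sin(t)):   # two different forcing functions
        _, x, _ = lsim(G, U=u, T=t)          # X(p) = G(p)U(p), inverted numerically
        print(x[-1])

The point is exactly the one made above: the system is described once, and only the input changes from run to run.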
The transfer function G(p) of a linear time–invariant system is defined as the
ratio of the Laplace transform X(p) of the output (response function) to the Laplace

transform U (p) of the input (forcing function) where all initial conditions are zero.
The transfer function therefore clearly satisfies the relationship
X(p) = G(p)U (p).
Given a system described by the differential equation
an x(n) + an−1 x(n−1) + . . . + a0 x = bm u(m) + bm−1 u(m−1) + . . . + b0 u
with input u(t), output x(t), and all initial conditions zero, taking Laplace trans-
forms will result in
(an pn + an−1 pn−1 + . . . + a0 )X(p) = (bm pm + bm−1 pm−1 + . . . + b0 )U (p).
Thus for such a system the transfer function will then be
G(p) = X(p)/U(p) = (b_m p^m + b_{m−1} p^{m−1} + . . . + b₀)/(a_n p^n + a_{n−1} p^{n−1} + . . . + a₀).
The equation an pn +an−1 pn−1 +. . .+a0 = 0 is called the characteristic equation
of the system and its order the order of the system. If m > n, then X (p) =
G (p) U (p) will involve powers of p. Since formally
L{δ^{(k)}(t)} = p^k L{δ(t)} = p^k,   k ∈ N
it is clear that x (t) will then contain impulse functions and that to solve such
systems we would need to use the theory of generalised functions to give meaning to
not just the Dirac delta function, but also its derivatives of all orders. In practice,
an overall system may be made up of a number of components, each with its
own transfer function. The overall system transfer function is then obtained by
combining the transfer functions of the subsystems, using rules of block diagrams.
In factorised form G(p) may be written

G(p) = b_m(p − z_m)(p − z_{m−1}) . . . (p − z₁) / [a_n(p − v_n)(p − v_{n−1}) . . . (p − v₁)].
The zi ’s are the zeros and the vi ’s are the poles. A plot of the poles and zeros
is often used as an aid in the graphical analysis of the transfer function. In such a
plot the position of a zero is marked by a circle (o) and that of a pole by a cross
(x). If the coefficients of the differential equation are real, the coefficients of the
numerator and denominator of the transfer function will trivially also be real. In
such a situation all complex roots will occur in conjugate pairs, i.e. if p = α + jβ
is a root, then α − jβ will also be a root. Thus for such systems the pole–zero plot
of the transfer function will be symmetrical about the real axis.
Example 3.5.1. The response x (t) of a system to an input u (t) is determined
by the differential equation 9x″ + 12x′ + 13x = 2u′ + 3u.
(a) Determine the transfer function of the system.
(b) Write down the characteristic equation of the system. What is the order
of the system?
(c) Determine the transfer function poles and zeros and plot them in the p–
plane.
Solution:
(a) Taking Laplace transforms with all initial conditions zero yields

X(p)(9p² + 12p + 13) = U(p)(2p + 3)

so that

G(p) = X(p)/U(p) = (2p + 3)/(9p² + 12p + 13).
(b) The characteristic equation is 9p2 + 12p + 13 = 0 and is of order two.
(c) The poles are the roots of 9p² + 12p + 13 = 0, i.e.

p = [−12 ± √(144 − 4·9·13)] / 18 = −2/3 ± j.

The zeros are the roots of 2p + 3 = 0, i.e. p = −3/2.

3.5.4. STABILITY. A stable system is one that will remain at rest, unless
it is excited by an external source, and which will return to rest when the external
source is removed. In mathematical terms this means that a bounded input must
produce a bounded output. This latter property is often taken as the definition of
stability for a linear system.
It is important to note that the stability of a system is a property of the system
itself and does not depend on the input. If U (p) is the transform of the input and
G(p) the transfer function of the system, then X(p) = G(p)U (p) is the transform
of the output. Thus the nature of the transfer function G(p) will affect the nature
of X(p), and hence of the output x(t). As it turns out, the stability of a system in
fact depends on both the order of the system, and the nature of the poles of the
transfer function. Let's look at this a bit more carefully, and try to understand why
this is the case.
Suppose we have a transfer function of the form

G(p) = (b_m p^m + b_{m−1} p^{m−1} + . . . + b₀)/(a_n p^n + a_{n−1} p^{n−1} + . . . + a₀)
where all the coefficients are real. If m ≥ n, then a partial fraction expansion of G(p) will contain terms of the form Ap^k where 0 ≤ k ≤ m − n. The expression for X(p) will then contain terms like Ap^k U(p) = AL{u^{(k)}(t)}, and hence the output x(t) will contain terms like Au^{(k)}(t). But even if u(t) is bounded, there is no reason why Au^{(k)}(t) should then also be bounded! For example even though u(t) = sin(t²) is bounded, u′(t) = 2t cos(t²) is not. Thus in a stable system we must have m < n.
So let's assume that m < n, and see how the position of the poles of the transfer
function affect stability. In this case the poles of the transfer function correspond to
the roots of the characteristic equation. Since the coefficients of the characteristic
equation are real, the roots are either real, or come in complex conjugate pairs. We
proceed to look at these two cases:
Case 1: A transfer function with poles at some real number α.
A pole at p = α corresponds to a factor of the form (p − α) in the denominator of the transfer function. This means that for an input like u(t) = H(t), the partial fraction expansion of X(p) = G(p)U(p) = G(p)·(1/p) will contain terms like c/(p − α)^k where k ∈ N. (If α is not a repeated root, there will only be a term of the form c/(p − α).) But this means that the response x(t) must contain terms like [c/(k − 1)!] t^{k−1} e^{αt} H(t). If α < 0, the pole lies in the left half of the p–plane and the response term [c/(k − 1)!] t^{k−1} e^{αt} H(t) will tend to zero as t → ∞. If on the other hand α > 0, then [1/(k − 1)!] t^{k−1} e^{αt} H(t) → ∞ as t → ∞. Finally consider the case α = 0. This means that the partial fraction expansion of G(p) contains terms of the form c/p^k where k ∈ N. For an input like u(t) = H(t) (with U(p) = L{u(t)} = 1/p) the partial fraction expansion of X(p) = G(p)U(p) = G(p)·(1/p) will then contain terms of the form c/p^{k+1}. Thus the output will in this case contain terms of the form (c/k!) t^k H(t). All such terms are however clearly unbounded. Therefore it is only when α < 0 (i.e. when this real pole lies in the left half-plane) that the system could possibly be stable.

Case 2: A pair of complex conjugate poles at p = α ± jβ.
(Here α and β are of course real.) For the sake of simplicity let's suppose that these complex conjugate poles are both poles of order 1. The denominator of G(p) must then contain a factor of the form

(p − (α + jβ))(p − (α − jβ)) = (p − α)² + β².

For an input like u(t) = H(t), the partial fraction expansion of X(p) = G(p)U(p) = G(p)·(1/p) will then contain terms of the form

[c(p − α) + dβ] / [(p − α)² + β²]

leading to terms of the form

e^{αt}(c cos βt + d sin βt)

in the response x(t). This term will be unbounded if α > 0, and will tend to zero as t → ∞ whenever α < 0. So for a stable system we can't have α > 0. But what if α = 0? The denominator of G(p) will then contain a factor of the form p² + β², which means that the partial fraction expansion of G(p) will contain a term of the form (cp + dβ)/(p² + β²). For a bounded input like u(t) = sin(βt) we of course have U(p) = β/(p² + β²). Thus in this case X(p) = G(p)U(p) = G(p)·β/(p² + β²) will then contain a term of the form (cp + dβ)/(p² + β²)². But then the response will contain a term of the form

L^{-1}{(cp + dβ)/(p² + β²)²} = (c/2β) L^{-1}{−(d/dp)[β/(p² + β²)]} + (d/β) L^{-1}{[β/(p² + β²)]²}
                             = (c/2β) t sin(βt) + (d/β) sin(βt) ∗ sin(βt)
                             = (c/2β) t sin(βt) − (d/2β) t cos(βt) + (d/2β²) sin(βt).

This term is clearly unbounded. Hence for a system to be stable, all the poles of the transfer function that are of the form α + jβ must be in the left half-plane (that is, we must have α < 0 for stability).

Armed with the above information we can now make the following definition:

Definition 3.5.2. A physically realisable causal linear system with transfer function of the form

G(p) = (b_m p^m + b_{m−1} p^{m−1} + . . . + b₀)/(a_n p^n + a_{n−1} p^{n−1} + . . . + a₀)

where n ≥ m, is said to be stable if all the poles of G(p) are to the left of the imaginary axis in the p–plane.
If some of the poles are on the imaginary axis and the rest to the left of this
axis, we say that the system is marginally stable.
If some or all of the poles are to the right of the imaginary axis, the system is
said to be unstable.
There is one more thing we can learn from the poles of a transfer function.
Formally we expect the transfer function G(p) to be the Laplace transform of some
real-valued causal function g(t) = L−1 [G(p)]. But what sort of function is g(t)? In
terms of the poles of the transfer function, its abscissa of convergence σc corresponds
to the real part of the pole which is located furthest to the right in the p–plane.
For example, if

G(p) = (p + 3) / [(p + 2)(p + 4)]

the abscissa of convergence will be σ_c = −2. For any stable system the pole of the transfer function which is furthest to the right will have a real part α < 0 (since all the poles are to the left of the imaginary axis in the p–plane). Thus for such a system the abscissa of convergence of g will satisfy σ_c < 0. Hence for this g the region of convergence of the Laplace transform includes the imaginary axis, so that G(p) exists when p = a + jb, with a ≥ 0.

To prove stability of a system, we have to show that all the roots of the char-
acteristic equation of the transfer function have negative real parts. In cases where
the characteristic equation is fairly simple, this can be done by inspection. How-
ever in more complex systems we need some reliable means of checking whether an
expression like
0 = an pn + an−1 pn−1 + . . . + a1 p + a0
has roots with negative real parts or not. Using the Routh–Hurwitz criterion,
stability can be proved without having to solve the characteristic equation.
Theorem 3.5.3 (Routh–Hurwitz criterion). Let a_n p^n + a_{n−1} p^{n−1} + . . . + a₁p + a₀ be a polynomial with real coefficients and with a_n > 0. A necessary and sufficient condition for all the roots of the equation 0 = a_n p^n + a_{n−1} p^{n−1} + . . . + a₁p + a₀ to have negative real parts is that ∆_i > 0, i = 1, 2, 3, . . . , n, where ∆_i is the i × i determinant

∆_i = | a_{n−1}       a_n           a_{n+1}       . . .  a_{n+i−2} |
      | a_{n−3}       a_{n−2}       a_{n−1}       . . .  a_{n+i−4} |
      | a_{n−5}       a_{n−4}       a_{n−3}       . . .  a_{n+i−6} |
      |     ⋮                                              ⋮       |
      | a_{n−(2i−1)}  a_{n−(2i−2)}  a_{n−(2i−3)}  . . .  a_{n−i}   |

and where we agree that a_k = 0 whenever either k < 0 or k > n.

Example 3.5.4. Show, without factorising the polynomial, that all the roots of

p⁴ + 9p³ + 33p² + 51p + 26

have negative real parts.

Solution: In this case n = 4. Thus here

a₄ = 1, a₃ = 9, a₂ = 33, a₁ = 51, a₀ = 26

with all other a_k's zero. Hence
∆₁ = a₃ = 9,

∆₂ = | a₃  a₄ |  =  | 9   1  |  = 9(33) − 1(51) = 246,
     | a₁  a₂ |     | 51  33 |

∆₃ = | a₃   a₄  a₅ |     | 9   1   0  |
     | a₁   a₂  a₃ |  =  | 51  33  9  |  = 9[(33)(51) − (9)(26)] − 1[(51)(51) − 0] = 10 440,
     | a₋₁  a₀  a₁ |     | 0   26  51 |

and

∆₄ = | a₃   a₄   a₅   a₆ |     | 9   1   0   0  |
     | a₁   a₂   a₃   a₄ |  =  | 51  33  9   1  |  = 26∆₃ > 0.
     | a₋₁  a₀   a₁   a₂ |     | 0   26  51  33 |
     | a₋₃  a₋₂  a₋₁  a₀ |     | 0   0   0   26 |

Since ∆_i > 0 for i = 1, 2, 3, 4, we may conclude from the Routh–Hurwitz criterion that all the roots of p⁴ + 9p³ + 33p² + 51p + 26 = 0 have negative real parts. □
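Because the determinant arithmetic is easy to get wrong by hand, a quick numerical cross-check (not part of the original text; numpy assumed) is worthwhile: the polynomial can simply be factored and its roots inspected.

    import numpy as np

    roots = np.roots([1, 9, 33, 51, 26])    # p^4 + 9p^3 + 33p^2 + 51p + 26
    print(roots)                            # -3 ± 2j, -2, -1
    print(all(r.real < 0 for r in roots))   # True, as the criterion predicts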
Example 3.5.5. The steady motion of a steam engine governor is modelled by the differential equations

(3.5.1)  mμ″ + bμ′ + cμ − dω = 0

and

(3.5.2)  I₀ω′ = −fμ,

where μ is a small fluctuation in the angle of inclination, ω a small fluctuation in the angular velocity of rotation and m, b, c, d, f and I₀ are all positive constants.
Show that the motion of the governor is stable if

bc/m > df/I₀.
Solution: Differentiating equation 3.5.1 and substituting ω′ from equation 3.5.2 leads to

mμ‴ + bμ″ + cμ′ + (df/I₀)μ = 0.

The above equation is just a special case of the differential equation

mμ‴ + bμ″ + cμ′ + (df/I₀)μ = u(t).

(The first equation corresponds to the case where the forcing function is u(t) = 0.) Taking Laplace transforms of the second equation (with all initial conditions zero) leads to

(mp³ + bp² + cp + df/I₀) M(p) = U(p)

where L{μ(t)} = M(p) and L{u(t)} = U(p). The transfer function of this system is therefore

G(p) = M(p)/U(p) = 1/(mp³ + bp² + cp + df/I₀).

To see when the system is stable, we need to see when the roots of the characteristic equation mp³ + bp² + cp + df/I₀ = 0 all lie to the left of the imaginary axis. We again use the Routh–Hurwitz criterion to do this. Here n = 3. So

a₃ = m, a₂ = b, a₁ = c, a₀ = df/I₀, with a_i = 0 if either i < 0 or i > 3.
Thus

∆₁ = |a₂| = b > 0,

∆₂ = | a₂  a₃ |  =  | b      m |  = bc − mdf/I₀,
     | a₀  a₁ |     | df/I₀  c |

and

∆₃ = | a₂   a₃   a₄ |     | b      m  0     |
     | a₀   a₁   a₂ |  =  | df/I₀  c  b     |  = (df/I₀) | b      m |  = (df/I₀)(bc − mdf/I₀).
     | a₋₂  a₋₁  a₀ |     | 0      0  df/I₀ |            | df/I₀  c |
We already have that ∆₁ > 0. Since df/I₀ > 0, we see from the above formulas that ∆₂ and ∆₃ will also be positive if and only if bc − mdf/I₀ > 0; or equivalently if and only if

bc/m > df/I₀.

Thus the system described by

mμ‴ + bμ″ + cμ′ + (df/I₀)μ = u(t)

will be stable if

bc/m > df/I₀.

In particular for the bounded forcing function u(t) = 0 (the case corresponding to the steam governor), the response of the system will also be bounded. Thus the motion of the governor will be stable if bc/m > df/I₀. □
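The determinant bookkeeping in this example can be mirrored symbolically. The sketch below is merely illustrative (sympy assumed; not part of the original text) and reproduces ∆₂ and ∆₃.

    import sympy as sp

    m, b, c, d, f, I0 = sp.symbols('m b c d f I0', positive=True)
    # coefficients of the characteristic equation m p^3 + b p^2 + c p + df/I0
    a3, a2, a1, a0 = m, b, c, d * f / I0
    D2 = sp.Matrix([[a2, a3], [a0, a1]]).det()
    D3 = sp.Matrix([[a2, a3, 0], [a0, a1, a2], [0, 0, a0]]).det()
    print(sp.factor(D2))   # equals b*c - m*d*f/I0
    print(sp.factor(D3))   # equals (d*f/I0) * (b*c - m*d*f/I0)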

3.5.5. STABILITY AND THE FINAL VALUE THEOREM. Given a


causal function f , we noted in remark 3.1.23 that the final value theorem may safely
be applied to f if all the singularities of pL{f (t)} = pF (p) are to the left of the
imaginary axis. Now recall that a linear system described by a transfer function
G(p), is stable if and only if all the poles of G(p) are to the left of the imaginary
axis. Thus if g is the inverse transform of such a transfer function, we may use the
final value theorem to predict the behaviour of limt→∞ g(t), whenever the system
is stable! If the system is not stable, we may run into problems. We illustrate this
point in the following example:

Example 3.5.6. Investigate the applicability of the final–value theorem to the transfer function

G(p) = 1/[(p + 2)(p − 3)].

Solution: Observe that

lim_{p→0} pG(p) = lim_{p→0} p/[(p + 2)(p − 3)] = 0.

By means of partial fractions we have that

1/[(p + 2)(p − 3)] = (1/5)[1/(p − 3) − 1/(p + 2)].

Therefore inverting G(p) gives

g(t) = L^{-1}{(1/5)[1/(p − 3) − 1/(p + 2)]} = (1/5)(e^{3t} − e^{-2t}).

Notice that

g(t) → ∞ as t → ∞.

Thus lim_{t→∞} g(t) is clearly not the same as lim_{p→0} pG(p). The reason for this is that the pole p = 3 of G(p) is not in the left half of the p–plane. □

3.5.6. IMPULSE RESPONSE. Suppose we have a stable linear system


described by a transfer function G(p). Then of course X (p) = G (p) U (p), where
U(p) is the transform of the input, and X(p) the transform of the response. If the
input is the impulse function u (t) = δ (t), then of course U (p) = L{δ(t)} = 1. So
in this case X(p) = G(p) · 1 = G(p). On taking inverse transforms, we get that the
response will then be x (t) = L−1 [G (p)]. Thus from a physical point of view the
inverse transform g(t) of G(p) is just the response of the system to a unit impulse
applied at time t = 0, when all initial conditions are zero. In view of the above we
make the following definition:
Definition 3.5.7. Let G(p) be the transfer function of a stable linear time–
invariant system. If the inverse transform exists, we define the impulse response of
the system with all initial conditions zero, to be g(t) = L−1 {G(p)}.
Since the impulse response is the inverse Laplace transform of the transfer
function, it follows that both the impulse response and the transfer function carry
the same information about the dynamics of the system. Either one may therefore

be used to characterise the system. Therefore given a system capable of respond-


ing to an excitation, it should in principle be possible to determine the complete
information about the system by exciting it with an impulse and measuring the
response. For this reason, it is common practice in engineering to think of the
transfer function as the Laplace transform of the impulse response.
When characterising a system by its impulse response (as opposed to its transfer
function), we can say that the system is stable if the value of its impulse response
tends to zero when t → ∞.
Example 3.5.8. Determine the impulse response of a linear system whose response x(t) to an input u(t) is determined by x″ + 5x′ + 6x = 5u(t).

Solution: Recall that the impulse response g(t) is the system response to u(t) = δ(t) when all the initial conditions are zero. Thus

g″ + 5g′ + 6g = 5δ(t)

subject to g(0) = g′(0) = 0. Taking Laplace transforms leads to (p² + 5p + 6)G(p) = 5. Thus

G(p) = 5/[(p + 2)(p + 3)].

Since this is just the transfer function of the system, it follows that the system is stable since the poles p = −2 and p = −3 both lie to the left of the imaginary axis in the p–plane. Now notice that

G(p) = 5/[(p + 2)(p + 3)] = 5/(p + 2) − 5/(p + 3).

Thus on taking inverse transforms we get g(t) = 5(e^{-2t} − e^{-3t}). Clearly g(t) = 5(e^{-2t} − e^{-3t}) → 0 as t → ∞, once again proving stability of the system. □
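A quick numerical confirmation of this impulse response (not part of the original text; scipy assumed):

    import numpy as np
    from scipy.signal import lti, impulse

    G = lti([5], [1, 5, 6])                       # G(p) = 5/((p + 2)(p + 3))
    t, g = impulse(G, T=np.linspace(0, 5, 200))
    # compare the simulated impulse response with 5(e^{-2t} - e^{-3t})
    print(np.max(np.abs(g - 5 * (np.exp(-2 * t) - np.exp(-3 * t)))))   # ~ 0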

3.5.7. UNIT STEP RESPONSE. In the previous section we saw how the
transfer function of a stable linear system may be realised as the Laplace transform
of the impulse response of the system with all initial conditions zero. There are
however some philosophical challenges inherent in this approach. The unit impulse
function is not a function in the usual sense of the word, and for us to be able to
give meaning to the idea of an impulse function, we had to pass to the theory of
generalised functions. To analyse the response of a system to such a “function”,
we then properly need to use generalised calculus to compute the response. For
those who would prefer to avoid such challenges, there is an alternative – the step
response.
Once again let G(p) be the transfer function of some stable linear system. The
transform of the Heaviside unit step function is L{H(t)} = 1/p. So if, when all initial conditions are zero, we let our forcing function be u(t) = H(t), we may conclude from the formula X(p) = G(p)U(p) that the transform of the response will then be X(p) = G(p)U(p) = G(p)·(1/p). Equivalently we will have pX(p) = G(p). But from property 3.1.12 we have L{x′(t)} = pL{x(t)} = pX(p) when all initial conditions are zero. Therefore in the above situation where the system is excited by means of the Heaviside unit step function (initial conditions zero), the response x(t) should satisfy L{x′(t)} = G(p) (or equivalently x′(t) = g(t), where g(t) = L^{-1}[G(p)]).
We therefore arrive at the following principle:
Theorem 3.5.9. Consider a stable linear time–invariant system with transfer function G(p) and impulse response g(t). The response x(t) of the system to a unit step input with all initial conditions zero satisfies

x′(t) = g(t)  and  L{x′(t)} = G(p).

That is, with all initial conditions zero, the response of the system to a unit impulse is the derivative of the response to a unit step function.

In principle the system’s response to a unit step input can therefore also be
used to compute the transfer function of the system.
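Numerically the principle is easy to observe. The sketch below (not from the original text; scipy assumed) reuses the system of example 3.5.8 above, G(p) = 5/((p + 2)(p + 3)), and compares the derivative of its step response with its impulse response.

    import numpy as np
    from scipy.signal import lti, step, impulse

    G = lti([5], [1, 5, 6])
    t = np.linspace(0, 5, 2000)
    _, xstep = step(G, T=t)                 # response to H(t)
    _, g = impulse(G, T=t)                  # response to delta(t)
    print(np.max(np.abs(np.gradient(xstep, t) - g)))   # small: x'(t) = g(t)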
We proceed to investigate one further application of the step response of a
system. With all initial conditions zero, a stable system is said to be in a steady
state if a constant input u(t) = u0 results in a constant response x(t) = x0 . For a
stable system in such a steady state, the Steady State Gain of the system (or just
SSG for short) is defined to be the quotient SSG = x₀/u₀. But what if the system is not in a steady state? How may we then compute the SSG of the system? Once again it is the step response that comes to our rescue. Let's see how.
For the moment let's consider stable electrical systems. If with all initial con-
ditions zero, such a system is excited by a well-behaved periodic forcing function,
the response will generally contain some terms that decay exponentially as t → ∞
(and therefore become negligibly small very quickly), and some other terms that
continue indefinitely in the same pattern as time goes on. The part of the response
that decays exponentially is called the transient part, whereas the part that con-
tinues indefinitely, is the steady state part. A concrete example may help to clarify
these ideas. Consider the results of example 3.3.1. The system described there was
excited by the constant input e(t) = 20V . The charge q(t) of that system represents
the response of the system to this input. This charge satisfies the equation
  
q(t) = (1/500)[1 − e^{-80t}(cos 60t + (4/3) sin 60t)],   t > 0.

In the above equation, the term −(1/500)e^{-80t}(cos 60t + (4/3) sin 60t) decays to zero as t → ∞, whereas 1/500 continues indefinitely. Thus in this case −(1/500)e^{-80t}(cos 60t + (4/3) sin 60t) is the transient part of the charge, and 1/500 the steady state part.
Coming back to the problem of finding the SSG of a stable system, for a constant
input u(t) = u0 we expect that the transient part of the response x(t) will decay
exponentially as t → ∞, whereas the steady state part should continue indefinitely.
So with such a constant input (with all initial conditions zero), it should then be
possible to compute the “gain” of the steady-state part (i.e. the SSG) by means of the formula lim_{t→∞} x(t)/u₀.
Suppose we have a stable linear time-invariant system with transfer function
G(p). Let x(t) be the response of the system to the unit step input u(t) = H(t).
In view of the above discussion, we have that SSG = lim_{t→∞} x(t)/1 = lim_{t→∞} x(t). If we now use the final value theorem and the fact that the transform of the step response satisfies pX(p) = G(p), we get that

SSG = lim_{t→∞} x(t) = lim_{p→0} pX(p) = lim_{p→0} G(p) = G(0).

We have therefore managed to prove the following result:

Theorem 3.5.10. The Steady State Gain of a stable linear time-invariant sys-
tem with transfer function G(p), is given by

SSG = G(0).
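A numerical illustration of this theorem (not part of the original text; scipy assumed) using the stable system of example 3.5.8, for which G(0) = 5/6:

    import numpy as np
    from scipy.signal import lti, step

    G = lti([5], [1, 5, 6])                      # G(p) = 5/((p + 2)(p + 3))
    t, x = step(G, T=np.linspace(0, 10, 400))
    print(x[-1], 5 / 6)                          # the step response settles at G(0)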

3.5.8. SYSTEM RESPONSE TO AN ARBITRARY INPUT. The im-


pulse response of a linear time–invariant system enables us to obtain the response
of the system to an arbitrary input using the convolution theorem. This provides
a powerful tool for analysing dynamical systems.
Suppose we have a linear system characterised by the impulse response g(t).
With all initial conditions zero, we wish to determine the system response x (t) to an
arbitrary continuous input u (t). The Laplace transform of the impulse response is
of course G(p), the transfer function of the system. By the definition of the transfer
function we have X(p) = G(p)U (p) where X(p) is the Laplace transform of the out-
put, and U (p) the transform of the input. But by the convolution theorem (theorem
3.1.32), G(p)U(p) = L{g(t) ∗ u(t)}. Therefore L{x(t)} = X(p) = L{g(t) ∗ u(t)}. On taking inverse transforms, it follows that x(t) = g(t) ∗ u(t) = ∫_0^t u(φ) g(t − φ) dφ.
Thus we have managed to prove the following fact:
Theorem 3.5.11. If the impulse response of a stable linear time–invariant sys-
tem is g (t), then with all initial conditions zero, the system response x (t) to an
arbitrary input u (t), is given by
Z t Z t
x (t) = u (ϕ) g (t − ϕ) dϕ = g (ϕ) u (t − ϕ) dϕ.
0 0
(Here we have used the fact that the convolution is commutative.)
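The sketch below (not part of the original text) applies this theorem numerically to the system treated in example 3.5.12 below, approximating the convolution integral by a discrete sum; the step size dt is an arbitrary choice.

    import numpy as np

    dt = 0.001
    t = np.arange(0.0, 10.0, dt)
    g = 0.5 * np.exp(-t) * np.sin(2 * t)    # impulse response of example 3.5.12
    u = np.ones_like(t)                     # unit step input H(t)
    x = np.convolve(u, g)[:len(t)] * dt     # discrete approximation of g * u
    print(x[-1])                            # approaches 2/10 = 0.2, i.e. G(0)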
Example 3.5.12. The response φo (t) of a system to a driving force φi (t) is
given by the differential equation
d2 φo 2dφo
+ + 5φo = φi
dt2 dt
where φi is the input, and φo is the output. Determine:
(a) The impulse response of the system.
(b) Use the convolution theorem to determine the system response to a unit step input at time t = 0 if the system was initially in its quiescent state.
(c) Confirm the result by solving the differential equation by the Laplace transform method.

Solution:

(a) The impulse response g(t) is the solution of

d²g/dt² + 2 dg/dt + 5g = δ(t)

subject to the initial conditions g(0) = g′(0) = 0. Taking Laplace transforms yields

(p² + 2p + 5) G(p) = L{δ(t)} = 1

so that

G(p) = 1/(p² + 2p + 5) = 1/[(p + 1)² + 2²].

Now take inverse transforms to get

g(t) = (1/2) e^{-t} sin 2t

(b) Using the convolution theorem, the response to a unit step input φᵢ(t) = H(t) is given by

φₒ(t) = g(t) ∗ H(t) = ∫_0^t g(s) H(t − s) ds = ∫_0^t g(s) ds.

Therefore

φₒ(t) = (1/2) ∫_0^t e^{-s} sin 2s ds

with

∫_0^t e^{-s} sin 2s ds = (1/2){[−e^{-s} cos 2s]_0^t − ∫_0^t e^{-s} cos 2s ds}
                       = (1/2){(−e^{-t} cos 2t + 1) − (1/2)[e^{-s} sin 2s|_0^t + ∫_0^t e^{-s} sin 2s ds]}

Thus

(5/4) ∫_0^t e^{-s} sin 2s ds = (1/4)(2 − 2e^{-t} cos 2t − e^{-t} sin 2t),

or equivalently

∫_0^t e^{-s} sin 2s ds = (1/5)(2 − 2e^{-t} cos 2t − e^{-t} sin 2t).

So

φₒ(t) = (1/10)(2 − 2e^{-t} cos 2t − e^{-t} sin 2t).

Observe that if we differentiate the step response we get

dφₒ/dt = (1/10)[0 − 2(−e^{-t} cos 2t − 2e^{-t} sin 2t) − (−e^{-t} sin 2t + 2e^{-t} cos 2t)]
       = (1/2) e^{-t} sin 2t
       = g(t)

as predicted by the results in section 3.5.7.
(c) Taking Laplace transforms of

d²φₒ/dt² + 2 dφₒ/dt + 5φₒ = H(t)

with all initial conditions zero yields

(p² + 2p + 5) Φₒ(p) = 1/p.

Thus

Φₒ(p) = 1/[p(p² + 2p + 5)].

Using partial fractions we can write this as

Φₒ(p) = (1/5)[1/p − (p + 2)/(p² + 2p + 5)]
      = (1/5)[1/p − (p + 2)/((p + 1)² + 2²)]
      = (1/5)[1/p − ((p + 1) + 1)/((p + 1)² + 2²)]
      = (1/10)[2/p − 2(p + 1)/((p + 1)² + 2²) − 2/((p + 1)² + 2²)].

Thus

φₒ(t) = (1/10)(2 − 2e^{-t} cos 2t − e^{-t} sin 2t),

which confirms the result obtained in (b).

3.5.9. STEADY–STATE ERROR (SSE). We finally come to the concept


of the steady state error of a linear time–invariant system. This concept has partic-
ular application in control theory where a very specific output is desired. In such
a situation the system would then need to be adjusted to ensure that the actual
output is as close as possible to the desired output. Included in the system would be
a controller. The error between the desired output and actual output would be fed
into the controller, which then adjusts the system to try and minimise this error.
With time the transient parts of both the actual output and the desired output
will become negligibly small. So if the steady state parts of the desired output and
actual output are the same, then as t increases, the actual output will indeed start
looking more and more like the desired output. If of course the steady state parts
of the actual and desired output are not the same, we have a problem since even for
large t there is no guarantee that the error will be small. The difference between the
steady state part of the actual output and of the desired output, is what we call the
Steady State Error (or just SSE for short). If x(t) is the actual output, and r(t) the
required or desired output, the error is e(t) = r(t) − x(t), and the steady state error
would then be limt→∞ (r(t) − x(t)) = limt→∞ e(t). In the diagram below we give
a block diagram representation of such a system in terms of Laplace transforms.
(Here G(p) is the transfer function of the system, X(p) = L{x(t)}, R(p) = L{r(t)}
and E(p) = L{e(t)}.)

Figure 3.28.

An example may help to clarify the concept. Suppose we have a car that has
been fitted with cruise control, which is a device designed to maintain a constant
vehicle speed. This constant speed is the desired or reference speed, which is set
by the driver. The system in this case is the vehicle, and the actual speed (which
may differ slightly from the desired speed) is the actual output of the system. The
control variable by means of which the speed may be adjusted, is the engine’s
throttle position.
The desired speed is fed into the system’s computer by the driver. In a closed-
loop control system, a sensor monitors the output (the vehicle’s actual speed) and
feeds this data to the computer. The computer then computes the difference be-
tween the desired and actual speed, and based on this information, continuously
adjusts the control input (the throttle) as necessary to try and keep the error to
a minimum (i.e. to maintain the desired speed). Feedback on how the system is
actually performing allows the controller (the vehicle’s on board computer) to dy-
namically compensate for disturbances to the system, such as changes in slope of
the ground or wind speed. An ideal feedback control system cancels out all errors,
effectively mitigating the effects of any forces that may or may not arise during
operation and producing a response in the system that perfectly matches the user’s
wishes.
Now suppose we have a stable system of this type. Based on the block–diagram above, we have that

G(p)E(p) = X(p) = R(p) − E(p).

Thus

E(p)[1 + G(p)] = R(p)

so that

E(p) = R(p)/(1 + G(p)).

Using the final–value theorem, we conclude that

SSE = lim_{t→∞} e(t) = lim_{p→0} pE(p) = lim_{p→0} pR(p)/(1 + G(p)).
Thus we have proved the following:
Theorem 3.5.13. For a stable control system of the type described above, the
steady state error is given by the formula
pR (p)
SSE = lim
p→0 1 + G (p)

where G(p) is the transfer function of the system, and R(p) the Laplace transform
of the required output.
Example 3.5.14. Determine the steady–state error for the system in figure 3.28 if r(t) is a step of magnitude 5 and

G(p) = 20(1 + 3p)/(p² + 7p + 10).

Also determine the steady–state gain.

Solution: From the formula in section 3.5.7, the steady state gain of the system is precisely

SSG = G(0) = 20(1 + 0)/(0 + 0 + 10) = 2.

The required output of the system is given by r(t) = 5H(t) (a step of magnitude 5). The Laplace transform of this required output is R(p) = 5L{H(t)} = 5/p. Thus by the formula proved above, we have that

SSE = lim_{p→0} pR(p)/(1 + G(p)) = lim_{p→0} p(5/p)/(1 + G(p)) = 5/(1 + G(0)) = 5/(1 + 2) = 5/3. □
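The limit in this example can also be confirmed symbolically; the following sketch (sympy assumed; not part of the original text) evaluates the SSE formula directly.

    import sympy as sp

    p = sp.symbols('p', positive=True)
    G = 20 * (1 + 3 * p) / (p**2 + 7 * p + 10)
    R = 5 / p                                 # transform of a step of magnitude 5
    print(sp.limit(p * R / (1 + G), p, 0))    # 5/3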


3.5.10. ENGINEERING APPLICATION: FREQUENCY RESPONSE.


Consider a basic RLC circuit as in the sketch below.

Figure 3.29

Using Kirchoff’s laws (see section 3.3.3) it is clear that this electrical circuit is
described by the differential equation

L di/dt + Ri + (1/C) ∫_0^t i(s) ds = e(t).

If with all initial conditions zero we take Laplace transforms, we get

(pL + R + 1/(pC)) I(p) = E(p).

By definition the transfer function corresponding to this differential equation is

G(p) = I(p)/E(p) = 1/(pL + R + 1/(pC)).

Now consider 1/G(p), and replace p by jω to get

1/G(jω) = jωL + R + 1/(jωC) = R + j(ωL − 1/(ωC)).
But this is exactly the complex impedance Z of the circuit as defined in section
2.1.4. The same holds true for more general electrical circuits. In fact in general
we get the following property:
Proposition 3.5.15. If G(p) is the transfer function of some linear differential equation describing an RLC circuit (with output the current i(t)), then 1/G(jω) is the complex impedance of the circuit.
When it comes to electrical circuits there is therefore a clear connection between
the impedance of the circuit, and transfer functions.
But even in more general linear systems which may have absolutely nothing to
do with electricity, the transfer function of the system can be used in the same way
that one would use the complex impedance in an electrical circuit. In particular on
replacing p with jω, one can use the transfer function of the system to compute the
effect of the system on the phase and amplitude of a sinusoidal input. We proceed
to give a concrete example of how this works in practice.
Suppose we have a stable linear system with transfer function of the form

G(p) = K(p − z₁)(p − z₂) . . . (p − z_m) / [(p − a₁)(p − a₂) . . . (p − a_n)]   (m ≤ n).

Figure 3.30

If the input u(t) = A sin ωt is applied at time t = 0, the Laplace transform of the system response x(t) for t ≥ 0 is determined by

X(p) = G(p) L{A sin ωt}
     = G(p) Aω/(p² + ω²)
     = KAω(p − z₁)(p − z₂) . . . (p − z_m) / [(p − a₁)(p − a₂) . . . (p − a_n)(p + jω)(p − jω)]
     = α₁/(p + jω) + α₂/(p − jω) + Σ_{i=1}^{n} βᵢ/(p − aᵢ)

This last expression has been obtained by partial fractions, where for the sake of simplicity we have assumed that none of the roots of the denominator are repeated. (α₁, α₂ and the βᵢ's are of course constants.) Taking inverse Laplace transforms leads to

x(t) = α₁e^{-jωt} + α₂e^{jωt} + Σ_{i=1}^{n} βᵢ e^{aᵢt}
For stable systems, the first two terms will constitute the steady–state response,
and the remaining terms the transient part of the response, which decays exponen-
tially to zero. To see this, recall that if the system is stable, all the poles ai of
the transfer function lie in the left half of the p–plane. But if this is the case then
βᵢe^{aᵢt} → 0 as t → ∞. Thus, for stable systems
x_ss(t) = α₁e^{-jωt} + α₂e^{jωt}

where we write x_ss(t) for the steady–state part. If now we solve for α₁ and α₂ from

X(p) = G(p)U(p) = G(p) Aω/(p² + ω²) = AωG(p)/[(p + jω)(p − jω)]
     = α₁/(p + jω) + α₂/(p − jω) + Σ_{i=1}^{n} βᵢ/(p − aᵢ)

we obtain

α₁ = AωG(−jω)/(−2jω) = AG(−jω)/(−2j)

and

α₂ = AωG(jω)/(2jω) = AG(jω)/(2j).

Hence

x_ss(t) = (A/2j) G(jω) e^{jωt} − (A/2j) G(−jω) e^{-jωt}.
Now write G(jω) and G(−jω) in exponential form (see module 2) to get

G(jω) = |G(jω)| e^{j arg G(jω)}

and

G(−jω) = |G(jω)| e^{−j arg G(jω)}.

Note that both the argument and modulus depend on the frequency ω. Substituting these expressions into the formula for x_ss leads to

x_ss(t) = (A/2j)|G(jω)| e^{j arg G(jω)} e^{jωt} − (A/2j)|G(jω)| e^{-j arg G(jω)} e^{-jωt}
        = (A/2j)|G(jω)| [e^{j(ωt + arg G(jω))} − e^{-j(ωt + arg G(jω))}]
        = A|G(jω)| sin[ωt + arg G(jω)]

The last equality follows from the fact that sin z = (1/2j)(e^{jz} − e^{-jz}) for any z. (See module two.) This indicates that if a stable linear system with transfer function
G (p) is subjected to sinusoidal input A sin(ωt), then
• the steady–state system response is also sinusoidal with the same fre-
quency ω as the input.
• The amplitude of the response is |G (jω)| times the amplitude A of the
input. If |G (jω)| > 1, the input is amplified and if |G (jω)| < 1, the input
is attenuated.
• The phase shift between input and output is arg G (jω). The system leads
if arg G (jω) > 0 and lags if arg G (jω) < 0.
The variations in both the amplitude |G (jω)| and the argument arg G (jω) caused
by a variation in the frequency ω of the input, is called the frequency response of
the system. Here |G (jω)| represents the amplitude gain and arg G (jω) the phase
shift.
In practice it is usual to represent the information contained in the system’s
frequency response by two graphs: one showing how the amplitude gain |G (jω)|
varies, and the other how the phase shift arg G (jω) varies with frequency ω.
In summary then, if for a linear time–invariant system we are primarily in-
terested in describing the response of the system in terms of the time variable,
an appropriate approach would be to compute the output x(t) by computing the
convolution of the input u(t) with the impulse response g(t) to get
x(t) = g(t) ∗ u(t).
If however our interest lies primarily with the frequency response of the system to
some sinusoidal input, then in view of the above it would be more appropriate to
use the transfer function G(p) to compute the Laplace transform of the response
X(p) from the transform of the input U (p) by means of the formula
X(p) = G(p)U (p).
Example 3.5.16. Determine the frequency response of the RC filter shown in
the figure below. Sketch the amplitude and phase shift plots. The input–output
relationship is given by
Eₒ(p) = [1/(RCp + 1)] Eᵢ(p)
where Eo (p) is the Laplace transform of the output and Ei (p) the transform of the
input.
Solution: The transfer function G (p) of the system is
G(p) = Eₒ(p)/Eᵢ(p) = 1/(RCp + 1).

Figure 3.31

Thus

G(jω) = 1/(1 + jωRC)
      = (1 − jωRC)/[(1 + jωRC)(1 − jωRC)]
      = (1 − jωRC)/(1 + ω²R²C²).

But then the frequency response of the system is given by

|G(jω)| = √[(1/(1 + ω²R²C²))² + (ωRC/(1 + ω²R²C²))²] = √[1/(1 + ω²R²C²)]

and

arg G(jω) = tan^{-1}(−ωRC) = −tan^{-1}(ωRC).
Notice that if ω = 0, then |G (jω) | = 1 and arg G (jω) = 0. Moreover if ω → ∞,
then |G (jω)| → 0 and arg G (jω) → − π2 .

Figure 3.32



For the above simple transfer function, plotting the amplitude and phase shift
characteristics is fairly straightforward. For higher order transfer functions use is
made of Bode plots. This however falls outside the scope of the course, and will
therefore not be considered.
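The amplitude and phase plots themselves are generated directly from G(jω). The sketch below (not from the original text; the value RC = 1 is an arbitrary assumption) tabulates a few points of the frequency response of the RC filter.

    import numpy as np

    RC = 1.0                               # assumed time constant for illustration
    for w in np.logspace(-2, 2, 5):        # frequencies 0.01 ... 100
        G = 1.0 / (1.0 + 1j * w * RC)      # G(jw) for the RC filter
        print(w, abs(G), np.angle(G))      # gain -> 0 and phase -> -pi/2 as w grows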
EXERCISE 3.5.
1. The response x (t) of a system to a forcing function u (t) is determined by
the differential equation model
x″ + 2x′ + 5x = 3u′ + 2u.
(a) Determine the transfer function characterising the system.
(b) Write down the characteristic equation of the system. What is the
order of the system?
(c) Determine the transfer function poles and zeros, and illustrate them
in the p–plane.
 
[(3p + 2)/(p² + 2p + 5);  p² + 2p + 5 = 0;  order 2;  poles −1 ± j2;  zero −2/3]

2. Which of the following transfer functions represent stable– and which


unstable systems?
(a) (p − 1)/[(p + 2)(p² + 4)]
(b) (p + 2)(p − 2)/[(p² − 1)(p + 4)]
(c) (p − 4)/[(p + 3)(p + 6)]
[marginally stable, unstable, stable]

3. Which of the following characteristic equations represent stable systems?


(In each case you may assume that we are dealing with a transfer function
for which the order of the denominator is not less than the order of the
numerator.)
(a) p2 − 4p + 13 = 0
(b) 5p3 + 13p2 + 13p + 15 = 0
(c) p3 + p2 + p + 1 = 0
(d) 24p4 + 11p3 + 26p2 + 45p + 36 = 0
(e) p3 + 2p2 + 2p + 1 = 0
[unstable, stable, marginally stable, unstable, stable]

4. The behaviour of a system having a gain controller is characterised by a


transfer function having the characteristic equation
p4 + 2p3 + (K + 2) p2 + 7p + K = 0
where K is the controller gain. Show that the system is stable provided
that K > 1, 5. (You may assume that we are dealing with a transfer
function for which the order of the denominator is not less than the order
of the numerator.)

5. A feedback control system has a transfer function with characteristic equa-


tion
p3 + 25Kp2 + (2K − 1) p + 5K = 0
where K is a constant gain factor. Determine the range of positive values
of K for which the system will be stable. (You may assume that we are

dealing with a transfer function for which the order of the denominator is
not less than the order of the numerator.)
 
[K > 3/5]

6. Determine the impulse responses x (t) of the following linear systems


whose response to an input u (t) is determined by the following differ-
ential equations:
(a) x″ + 15x′ + 56x = 3u(t)
(b) x″ + 8x′ + 25x = u(t)
In each case also comment on the stability of each of the systems.
 
[3e^{-7t} − 3e^{-8t};  (1/3)e^{-4t} sin 3t]

7. The response of a system to a unit step u (t) = H (t) is given by


7 3 1
x (t) = 1 − e−t + e−2t − e−4t
3 2 6
Find the transfer function of the system.
 
p+8
(p + 1) (p + 2) (p + 4)

8. The output x (t) from a stable linear control system with input sin ωt and
transfer function G (p) is determined by the relationship
X(p) = G(p) L{sin ωt}

Show that, as t → ∞, the output

x(t) → ℜ[e^{jωt} G(jω)/j]

9. Consider the feedback system in the figure below, where K is a constant


feedback gain.

Figure 3.33

(a) Is the system stable in the absence of feedback (i.e. when K = 0)?
(b) Write down the transfer function G (p) for the overall feedback sys-
tem.
(Hint: If U (p) is the transform of the input into the overall system,
then U (p) − KX(p) is the transform of the input into the controller.)
 
[No;  1/(p² + 2p + K − 3)]

10. For the feedback control system in the figure below, it is known that the
impulse response is g (t) = 2e−2t sin t. Use this information to determine
the value of the parameter α.

Figure 3.34

UNIT 6: STATE SPACE EQUATIONS


In the following we will assume that the reader is familiar with the elementary
aspects of matrix theory as presented in Appendix A. If it has been some time since
you worked with matrices, please take the time now to review the material in this
appendix.

3.6.1. OBJECTIVE. To introduce state–space equations and present algo-


rithms for their solution.

3.6.2. OUTCOMES. At the end of this unit the student should


• Know what is meant by a “state space equation” and know how such
equations can be used to give a concise expression for both MIMO and
SISO systems;
• Know how SISO systems may be written in state space form. In particular
given a transfer function of a system, know how to obtain an equivalent
state space formulation of that same system;
• Be able to solve state space equations using both matrix-valued functions
and the Laplace transform method.

3.6.3. INTRODUCTION. For systems described by a linear differential


equation, the Laplace transform method is a very powerful way of computing the
response of the system to a specific input. However for higher order linear differential equations, this method can be rather time consuming and tedious.
In a case like this it is often more efficient to combine the Laplace transform method
with matrix methods, to compute the response.
In a more complicated system where there may be several inputs and outputs,
there will of course not be a single differential equation describing the system, but
rather a whole system of differential equations. (Such systems are commonly called
multi–input; multi–output (MIMO) systems.) Here too we need to incorporate
matrix methods to effectively describe and solve such systems.
The “matrix methods” referred to above are the so-called state–space approach. When using the state–space approach we are faced with the prospect of writing out a whole matrix of derivatives, so we adopt the convention of writing ẋ for dx/dt. The main reason for this is economy of space. But what exactly is the state–space
approach? Suppose we have a MIMO system with k inputs u1 (t), u2 (t), . . . , uk (t)
and r outputs y1 (t), y2 (t), . . . , yr (t). In state–space form the dependence of the
outputs on the inputs is described by a matrix equation of the form

ẋ = Ax + Bu
(3.6.1)
y = Cx + Du.

Here A is an n × n matrix, B an n × k matrix, and C and D are respectively r × n


and r × k matrices. The n × 1 matrix

x = [x₁(t), x₂(t), . . . , x_{n−1}(t), x_n(t)]ᵀ

is the so-called n-state vector, and its elements xi (i = 1, 2, . . . , n) the state variables.
Here u and y are respectively the inputs and outputs written in column vector form
as

u = [u₁, u₂, . . . , u_{k−1}, u_k]ᵀ,   y = [y₁, y₂, . . . , y_{r−1}, y_r]ᵀ.
The n–dimensional space over which x(t) ranges, is referred to as the state–space.
As t increases, x(t) will trace out a path in the state–space, which we call the
trajectory of the state vector. In two dimensions the state–space is of course a
plane, and so in this case we sometimes refer to it as a state–plane. The matrix A
is called the system matrix, B is the input or control matrix, and C and D output
matrices. Because the way in which the state vector varies with time is described by
the matrix equation 3.6.1, we will refer to these equations as the dynamic equations
of the system.
Example 3.6.1. Obtain the state–space representation characterising the two–
input, one–output network in the figure below, where x1 , x2 , x3 , and u1 and u2 are
as indicated in the figure. The output y is the voltage drop across the inductor L1 .

Figure 3.35

Solution: Applying Kirchhoff's laws to the left loop yields

R1 i1 + L1 (di1/dt) + vC = e1.
Thus given that x1 = i1 , x3 = vC , and u1 = e1 , this can be rewritten as R1 x1 +
L1 ẋ1 + x3 = u1 ; or equivalently
ẋ1 = −(R1/L1) x1 − (1/L1) x3 + (1/L1) u1.
For the right loop we get that
L2 (di2/dt) + R2 i2 + vC = e2.
Since x2 = i2 and e2 = u2 , this can be rewritten in the form L2 ẋ2 + R2 x2 + x3 = u2 ;
i.e.
ẋ2 = −(R2/L2) x2 − (1/L2) x3 + (1/L2) u2.
Finally note that Kirchhoff's laws also inform us that
x3 = vC = (1/C) ∫_0^t (i1 + i2) ds = (1/C) ∫_0^t (x1 + x2) ds,
so that
ẋ3 = (1/C)(x1 + x2).

The output y is the voltage drop across the inductor L1. Thus y = L1 di1/dt. But since we have that R1 i1 + L1 di1/dt + vC = e1 for the left loop, it is clear that

y = L1 di1/dt = −R1 i1 − vC + e1 = −R1 x1 − x3 + u1.
Summarising we have that
ẋ1 = −(R1/L1) x1 − (1/L1) x3 + (1/L1) u1
ẋ2 = −(R2/L2) x2 − (1/L2) x3 + (1/L2) u2
ẋ3 = (1/C)(x1 + x2)

y = −R1 x1 − x3 + u1
This can be written in matrix form as

[ẋ1; ẋ2; ẋ3] = [−R1/L1  0  −1/L1 ; 0  −R2/L2  −1/L2 ; 1/C  1/C  0] [x1; x2; x3] + [1/L1  0 ; 0  1/L2 ; 0  0] [u1; u2]

y = [−R1  0  −1] [x1; x2; x3] + [1  0] [u1; u2]
which is of the required form

ẋ = Ax + Bu
y = Cx + Du.


3.6.4. SINGLE–INPUT, SINGLE–OUTPUT SYSTEMS (SISO). A system needing a single input u(t) and yielding a single response y(t) to that input,
is called a single–input; single–output (SISO) system. (Up to now all the systems
we have considered have been SISO systems.) The state space form of such a system
will be

ẋ = Ax + Bu
(3.6.2)
y = Cx + Du.
Here u and y are not vectors, but single functions. Hence they are not written in
boldface.
Up to now all SISO systems we have looked at were either described by a
linear differential equation, or by a transfer function. So how do these approaches
compare to the state space approach? Are they different, or is there a natural way
of rewriting these descriptions in a state space form?
Suppose we are given a SISO system which is described by the following linear
differential equation:
(3.6.3) y (n) + an−1 y (n−1) + . . . + a2 y 00 + a1 y 0 + a0 y = u
Note:
• Here u = u (t) is the input and y = y (t) the output.
• The coefficient of y (n) can be taken to be 1, for if it is not the case, we
can simply divide the differential equation by a suitable factor to reduce
the coefficient of y (n) to 1.

To write this in a state space form let


x1 = y
x2 = y 0 = ẋ1
x3 = y 00 = ẋ2
..
.
xn−1 = y (n−2) = ẋn−2
xn = y (n−1) = ẋn−1
If we substitute these equations into equation 3.6.3, we get
ẋn = y (n) = − (an−1 ) xn − (an−2 ) xn−1 − . . . − a2 x3 − a1 x2 − a0 x1 + u.
In matrix form these sets of equations may be written as

[ẋ1; ẋ2; . . . ; ẋn−1; ẋn] = [0 1 0 . . . 0 0 ; 0 0 1 . . . 0 0 ; . . . ; 0 0 0 . . . 0 1 ; −a0 −a1 −a2 . . . −an−2 −an−1] [x1; x2; . . . ; xn−1; xn] + [0; 0; . . . ; 0; 1] u
The output is y = x1, which may be written in matrix form as

y = [1  0  0  . . .  0] [x1; x2; . . . ; xn−1; xn]
Thus on setting

A = [0 1 0 . . . 0 0 ; 0 0 1 . . . 0 0 ; . . . ; 0 0 0 . . . 0 1 ; −a0 −a1 −a2 . . . −an−2 −an−1],   B = [0; 0; . . . ; 0; 1],
C = [1 0 0 . . . 0] ,
it is clear that the linear differential equation we started with, may be written in
matrix form as 
ẋ = Ax + Bu
y = Cx
Here A is the system matrix, and in cases where it is derived from a given DE in
the above manner, it is sometimes called the companion matrix of the DE.
Example 3.6.2. Find the state–space representation for the system charac-
terised by the differential equation
2 d³y/dt³ − 6 d²y/dt² + 4 dy/dt + 10y = 4 sin t
Solution: The leading coefficient of the differential equation is 2 and not 1.
Hence to get this equation into a form where we can apply our algorithm, we first
need to normalise the differential equation by dividing with 2. We get
d³y/dt³ − 3 d²y/dt² + 2 dy/dt + 5y = 2 sin t

Now let x1 = y, x2 = y 0 = ẋ1 and x3 = y 00 = ẋ2 . If we substitute these equations


into the above differential equation and use the fact that y 000 = ẋ3 , we get
ẋ3 = −5x1 − 2x2 + 3x3 + 2 sin t.
In matrix form these formulas for ẋ1, ẋ2 and ẋ3 may be written as

[ẋ1; ẋ2; ẋ3] = [0 1 0 ; 0 0 1 ; −5 −2 3] [x1; x2; x3] + [0; 0; 1] (2 sin t)
with the output y being given by

y = Cx = [1 0 0] [x1; x2; x3]
since y = x1 . 

But what about more complex SISO systems where the associated differential
equation involves derivatives of the input? Is there a way of reformulating this in
terms of state spaces? For example suppose we have a system characterised by the
differential equation
y^(n) + an−1 y^(n−1) + . . . + a1 y′ + a0 y = bm u^(m) + bm−1 u^(m−1) + . . . + b1 u′ + b0 u,   m ≤ n.
We found earlier that when all initial conditions are zero, one can use Laplace
transforms to show that this system can also be described by means of the following
transfer function:
G(p) = (bm p^m + bm−1 p^{m−1} + . . . + b1 p + b0)/(p^n + an−1 p^{n−1} + . . . + a1 p + a0)
Using this transfer function as a starting point, our task is therefore to find a
state space formulation of the system which leads to the same transfer function!
Such a formulation would then be a valid state space representation of the system.
However let us first clarify what we mean by the “transfer function” of a state space
representation. Given a SISO system with input u and output y which is described
by

ẋ = Ax + Bu
y = Cx + Du
then with all initial conditions zero, the transfer function of the system is defined
to be
G(p) = Y(p)/U(p)
where U (p) = L{u(t)} and Y (p) = L{y(t)}. The following theorem now comes to
our rescue providing exactly what we need for passing from the transfer function
approach to the state space approach.
Theorem 3.6.3 (First Kalman Form). Suppose we are given a system initially in a quiescent state described by the transfer function

G(p) = (bm p^m + bm−1 p^{m−1} + . . . + b1 p + b0)/(p^n + an−1 p^{n−1} + . . . + a1 p + a0),   m < n.
Then with all initial conditions zero, the state space equation

ẋ = Ax + Bu
(3.6.4)
y = Cx + Du

where

A = [0 1 0 . . . 0 0 ; 0 0 1 . . . 0 0 ; . . . ; 0 0 0 . . . 0 1 ; −a0 −a1 −a2 . . . −an−2 −an−1],   B = [0; 0; . . . ; 0; 1],

C = [b0 b1 ... bm 0 ... 0] and D=0


will yield exactly the same transfer function. (The reason why there may be some
zeros in the matrix C, is because m < n.) This state space representation of the
system is referred to as the First Kalman Form.
Proof: Using the specific matrices given above, matrix multiplication in equa-
tion 3.6.4 will lead to the equations
ẋ1 = x2,  ẋ2 = x3,  . . . ,  ẋn−1 = xn,

ẋn = −a0 x1 − a1 x2 − · · · − an−1 xn + u,


and
y = b0 x1 + b1 x2 + · · · + bm xm+1 .
From the second set of equations it is clear that u = a0 x1 +a1 x2 +· · ·+an−1 xn + ẋn .
Taking Laplace transforms, we get
U (p) = a0 X1 (p) + a1 X2 (p) + · · · + an−1 Xn (p) + pXn (p)
and
Y (p) = b0 X1 (p) + b1 X2 (p) + · · · + bm Xm+1 (p).
Since x2 = ẋ1 we have that
X2 (p) = pX1 (p) .
Similarly x3 = ẋ2 leads to
X3 (p) = pX2 (p) = p(pX1 (p)) = p2 X1 (p).
Continuing in this manner, we can show that
Xk (p) = p(k−1) X1 (p) for all 1 ≤ k ≤ n.
If we substitute these equations into the formulas for Y (p) and U (p), we get that
U (p) = (a0 + a1 p + · · · + an−1 p(n−1) + pn )X1 (p)
and
Y (p) = (b0 + b1 p + · · · + bm pm )X1 (p).
Thus as required
Y (p) bm pm + bm−1 pm−1 + . . . + b1 p + b0
= = G(p).
U (p) pn + an−1 pn−1 + . . . + a1 p + a0


Remark 3.6.4. It is important to realise that starting from a given transfer


function, there may be several state space representations which all yield this same
transfer function. Hence the First Kalman Form is by no means the only way of
passing from the transfer function approach to the state space approach. It just

happens to be a very convenient and easy way of doing it! For example with all
initial conditions zero, it is possible to show that if instead we have

A = [0 0 0 . . . 0 −a0 ; 1 0 0 . . . 0 −a1 ; 0 1 0 . . . 0 −a2 ; . . . ; 0 0 0 . . . 1 −an−1],   B = [b0; b1; . . . ; bm; 0; . . . ; 0],
C = [0 0 ... 0 1] and D = 0,
then the state space equations

ẋ = Ax + Bu
y = Cx + Du
will also yield the transfer function
G(p) = (bm p^m + bm−1 p^{m−1} + . . . + b1 p + b0)/(p^n + an−1 p^{n−1} + . . . + a1 p + a0),   m < n.
The above alternative state space formulation of the system is referred to as the
Second Kalman Form.

Example 3.6.5. Suppose we have a system characterised by the differential equation

d³y/dt³ + 4 d²y/dt² + 3 dy/dt + y = 2 d²u/dt² + du/dt + 3u
with all initial conditions zero. Obtain
(a) a transfer function model,
(b) a state–space model
for the system.

Solution:
(a) Taking Laplace transforms leads to
(p³ + 4p² + 3p + 1) Y(p) = (2p² + p + 3) U(p).

Thus the transfer function of the system is given by

G(p) = Y(p)/U(p) = (2p² + p + 3)/(p³ + 4p² + 3p + 1).
(b) For a system with a transfer function as in part (a), the First Kalman
Form will be

[ẋ1; ẋ2; ẋ3] = [0 1 0 ; 0 0 1 ; −1 −3 −4] [x1; x2; x3] + [0; 0; 1] u(t)

y = [3 1 2] [x1; x2; x3].
Thus a suitable representation would be

ẋ = Ax + Bu
y = Cx

where A = [0 1 0 ; 0 0 1 ; −1 −3 −4], B = [0; 0; 1]
and
C = [3 1 2] .
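For readers who wish to experiment, the representation above can be checked with a computer algebra system. The following is a minimal Python sketch (assuming the sympy library is available; it is not part of the prescribed material) which recomputes C(pI − A)^{−1}B and should reproduce the transfer function of part (a):

import sympy as sp

# Check the First Kalman Form of Example 3.6.5: C (pI - A)^(-1) B should
# equal G(p) = (2p^2 + p + 3)/(p^3 + 4p^2 + 3p + 1).
p = sp.symbols('p')
A = sp.Matrix([[0, 1, 0],
               [0, 0, 1],
               [-1, -3, -4]])
B = sp.Matrix([0, 0, 1])
C = sp.Matrix([[3, 1, 2]])

G = sp.cancel((C * (p * sp.eye(3) - A).inv() * B)[0, 0])
print(G)   # (2*p**2 + p + 3)/(p**3 + 4*p**2 + 3*p + 1)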


3.6.5. SOLVING STATE SPACE EQUATIONS 1: MATRIX-VALUED FUNCTIONS. Now that we hopefully have a better idea of what state space
equations are, and how one can reformulate other descriptions in a state space for-
mat, we come to the next major question which is how to actually solve state space
equations. All the effort of learning how to describe systems in state space form is
wasted if we don’t know how to solve such equations.
When faced with a linear differential equation we can either try to use function
theory to solve the equation directly, or else use the Laplace Transform method.
In principle the same is true for state space equations. One can either use function
theory to try and solve such equations directly, or try to use Laplace Transforms.
We will discuss the function theory approach in this section, leaving the Laplace
transform method for the next section.
So what then are matrix-valued functions and how can we use them to solve
state space equations? One of the main advantages of the state space method is
that complicated systems of linear differential equations can be written in a compact
matrix form as a “first order linear differential equation”. Specifically we end up
with an equation of the form

ẋ = Ax + Bu
y = Cx + Du.
If in the above set of equations we can solve for x in ẋ = Ax + Bu, we may then
obtain the output y by simply substituting x into y = Cx + Du. So the challenge
is to solve ẋ = Ax + Bu. Had we been working with functions and constants
instead of vectors and matrices, this would have been a straightforward exercise.
Let's revise this simpler case to try and get some ideas on how to solve the state
space equation.
Given a first order linear differential equation
dx/dt = ax + bu,   x(t0) = x0
(a and b are constants) the solution is of the form x(t) = f(t) + g(t), where f(t) is the general solution of dx/dt = ax with x(t0) = x0, and g(t) a particular solution of dx/dt = ax + bu with x(t0) = 0. To find the solution to the unforced homogeneous
equation
dx/dt = ax,   x(t0) = x0,
we simply write (1/x) dx = a dt, and then integrate to get ln(x) = at + c; or equivalently x = Ke^{at} where K = e^c. Since we require that x(t0) = x0 it follows that x0 = Ke^{at0}; i.e. K = x0 e^{−at0}. Thus x = x0 e^{a(t−t0)} is the solution to dx/dt = ax with x(t0) = x0.
To then find a particular solution to dx/dt = ax + bu with x(t0) = 0, we rewrite this as dx/dt − ax = bu, and multiply throughout by the so-called integrating factor e^{∫−a dt} = e^{−at} to get

(d/dt)(e^{−at} x) = e^{−at} dx/dt − a e^{−at} x = e^{−at} bu.

Now integrate from t0 to t to get e^{−at} x(t) = ∫_{t0}^{t} e^{−as} bu(s) ds, or equivalently

x(t) = e^{at} ∫_{t0}^{t} e^{−as} bu(s) ds = ∫_{t0}^{t} e^{a(t−s)} bu(s) ds.

Thus the solution to our original equation dx/dt = ax + bu with x(t0) = x0, is

x(t) = x0 e^{a(t−t0)} + ∫_{t0}^{t} e^{a(t−s)} bu(s) ds.

Now if for a square matrix A we can give meaning to the idea of eAt , and if
this matrix-valued function behaves much like its real-valued cousin eat , then by
following essentially the same steps, we can show that
x(t) = e^{A(t−t0)} x0 + ∫_{t0}^{t} e^{A(t−s)} Bu(s) ds

is the solution of

ẋ = Ax + Bu.

Here the first part eA(t−t0 ) x0 corresponds to the solution of the unforced system
and is called the complementary function. The convolution integral depending on
the forcing vector u (t), is called the particular integral.
Thus for this method to work, the primary challenge is to give meaning to
something like eAt for square matrices A. To get some idea of how to do this, lets
consider a specific example. Suppose we are given an n × n matrix of the form

J = [λ 1 0 . . . 0 0 ; 0 λ 1 . . . 0 0 ; . . . ; 0 0 0 . . . λ 1 ; 0 0 0 . . . 0 λ].

Now consider (J − λI)^k where

J − λI = [0 1 0 . . . 0 0 ; 0 0 1 . . . 0 0 ; . . . ; 0 0 0 . . . 0 1 ; 0 0 0 . . . 0 0].

It is not difficult to see that as (J − λI) gets raised to higher and higher powers, the “second diagonal” of 1's moves “upwards” until finally (J − λI)^k = 0 for all k ≥ n. Try to verify this in the case where n = 3 by showing that

[0 1 0 ; 0 0 1 ; 0 0 0]² = [0 0 1 ; 0 0 0 ; 0 0 0]   and   [0 1 0 ; 0 0 1 ; 0 0 0]³ = [0 0 0 ; 0 0 0 ; 0 0 0].
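The verification can also be done numerically. Here is a small Python sketch (assuming the numpy library is available) of the n = 3 case:

import numpy as np

# The super-diagonal of 1's moves up with each power, and the third power
# of this 3x3 matrix is the zero matrix, so (J - lambda*I)^k = 0 for k >= 3.
N = np.array([[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]])
print(N @ N)       # a single 1 remains, in the top-right corner
print(N @ N @ N)   # the zero matrix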
Now suppose we are given a function f for which the Taylor series at λ converges
to f for all x. That is
f(x) = f(λ) + (f′(λ)/1!)(x − λ) + (f′′(λ)/2!)(x − λ)² + · · · + (f^{(k)}(λ)/k!)(x − λ)^k + . . .

for all x. Using this series as a starting point we may formally define the matrix
f (J) to be
f(J) = f(λ)I + (f′(λ)/1!)(J − λI) + (f′′(λ)/2!)(J − λI)² + · · · + (f^{(k)}(λ)/k!)(J − λI)^k + . . . .
But since (J − λI)k = 0 for all k ≥ n, this expression simplifies to
(3.6.5)   f(J) = f(λ)I + (f′(λ)/1!)(J − λI) + (f′′(λ)/2!)(J − λI)² + · · · + (f^{(n−1)}(λ)/(n−1)!)(J − λI)^{n−1}.
Thus to compute f (J) we need only add the n non-zero matrices on the right to
get
f(J) = [f(λ)  f′(λ)/1!  f′′(λ)/2!  . . .  f^{(n−2)}(λ)/(n−2)!  f^{(n−1)}(λ)/(n−1)! ; 0  f(λ)  f′(λ)/1!  . . .  f^{(n−3)}(λ)/(n−3)!  f^{(n−2)}(λ)/(n−2)! ; . . . ; 0  0  0  . . .  f(λ)  f′(λ)/1! ; 0  0  0  . . .  0  f(λ)].
Note that in equation 3.6.5 we needed only the n-th Taylor polynomial to compute
f (J). Thus any other function g for which the n-th Taylor polynomial is the same,
will yield the same value as before for g(J). In particular for any polynomial g for
which f (k) (λ) = g (k) (λ) for all 0 ≤ k ≤ n − 1 we will have that f (J) = g(J).
So what has all this got to do with a general n × n matrix A? It is a fact that
for any such a matrix we can find an invertible matrix S such that A = S −1 BS,
where B is an n × n matrix which has submatrices J of the above form arranged
on its diagonal, with zeros elsewhere. (This form is the famous Jordan canonical
form of A.) If now we combine this fact with the preceding discussion, we arrive at
the following algorithm:
Definition 3.6.6. Let A be an n × n matrix with k distinct eigenvalues λ1 , λ2 ,
. . . , λk with each λj repeated rj times. Let f be a real-valued function such that
f (s) (λj ) exists for all 0 ≤ s ≤ rj − 1, 1 ≤ j ≤ k.
Then
f(A) = p(A) = α0 I + α1 A + α2 A² + · · · + αn−1 A^{n−1}
where
p(x) = α0 + α1 x + · · · + αn−1 xn−1
is the polynomial such that
p(s) (λj ) = f (s) (λj ) for all 0 ≤ s ≤ rj − 1, 1 ≤ j ≤ k.
In particular if
p^{(s)}(λj) = v^s e^{vλj} for all 0 ≤ s ≤ rj − 1, 1 ≤ j ≤ k,
then p(A) = evA .
We are of course interested in evaluating eAt rather than eA . If now A is as in
the definition, then to evaluate eAt what we really need is an expression
pt (x) = α0 (t) + α1 (t)x + · · · + αn−1 (t)xn−1
depending on both x and t for which
(d^s p_t/dx^s)(λj) = t^s e^{λj t}   for all 0 ≤ s ≤ rj − 1, 1 ≤ j ≤ k.
In this case we will have pt (A) = eAt by the above algorithm. Thus we arrive at the
following algorithm for finding eAt :

Let A be an n×n matrix with k distinct eigenvalues λ1, λ2, . . . , λk, with each λj repeated rj times. Then

(3.6.6) eAt = α0 (t) I + α1 (t) A + α2 (t) A2 + . . . + αn−1 (t) An−1

where the expressions αj(t) are determined by solving the n simultaneous equations

(3.6.7)   t^s e^{λj t} = α0(t) + α1(t) λj + α2(t) λj² + . . . + αn−1(t) λj^{n−1}

for all 0 ≤ s ≤ rj − 1, 1 ≤ j ≤ k.

Example 3.6.7. A system is characterised by the state equation

[ẋ1; ẋ2] = [1 2 ; 0 3] [x1; x2] + [2; 1] u(t)

with initial conditions x1(0) = x2(0) = 1 and with input

u(t) = H(t) = { 0 if t < 0 ; 1 if t ≥ 0 }

(a) Use matrix valued functions to solve the state equation ẋ (t) = Ax (t) +
Bu(t). (Of course here A = [1 2 ; 0 3] and B = [2; 1].)

(b) Confirm the result by solving the two first order differential equations.

Solution:

(a) Before computing eAt , we first find the eigenvalues of A. Hence let
|λI − A| = 0. Thus

0 = det[λ−1  −2 ; 0  λ−3] = (λ − 1)(λ − 3).

Thus the eigenvalues are λ1 = 1 and λ2 = 3. Since A is a 2 × 2 matrix, it


follows from our algorithm that eAt = α0 (t) I + α1 (t) A, where α0 (t) and
α1 (t) must satisfy the equations

e^t = α0(t) + α1(t)
e^{3t} = α0(t) + 3α1(t)

(see equation 3.6.7). If we subtract the first equation from the second, we
get 2α1 (t) = e3t − et ; that is
α1(t) = (1/2)(e^{3t} − e^t).
Notice that α0 (t) = et − α1 (t) by the first equation. Thus if we substitute
the value for α1 (t), we get
α0(t) = e^t − (1/2)(e^{3t} − e^t) = (1/2)(3e^t − e^{3t}).

From these values it now follows that


e^{At} = α0(t) I + α1(t) A
= (1/2)(3e^t − e^{3t}) I + (1/2)(e^{3t} − e^t) A
= (1/2)(3e^t − e^{3t}) [1 0 ; 0 1] + (1/2)(e^{3t} − e^t) [1 2 ; 0 3]
= [e^t   e^{3t} − e^t ; 0   e^{3t}]
The complementary function is given by

e^{At} x0 = [e^t   e^{3t} − e^t ; 0   e^{3t}] [1; 1] = [e^{3t}; e^{3t}]
and the particular integral by

∫_{t0}^{t} e^{A(t−ϕ)} Bu(ϕ) dϕ = ∫_0^t [e^{t−ϕ}   e^{3(t−ϕ)} − e^{t−ϕ} ; 0   e^{3(t−ϕ)}] [2; 1] u(ϕ) dϕ.

Since u(t) = H(t) = 1 for t ≥ 0, the above integral becomes

∫_0^t [e^{t−ϕ} + e^{3(t−ϕ)} ; e^{3(t−ϕ)}] dϕ
= [−e^{t−ϕ} − (1/3)e^{3(t−ϕ)} ; −(1/3)e^{3(t−ϕ)}] evaluated from ϕ = 0 to ϕ = t
= [−4/3 + e^t + (1/3)e^{3t} ; −1/3 + (1/3)e^{3t}]
Thus

x(t) = e^{At} x0 + ∫_0^t e^{A(t−ϕ)} Bu(ϕ) dϕ
= [e^{3t}; e^{3t}] + [−4/3 + e^t + (1/3)e^{3t} ; −1/3 + (1/3)e^{3t}]
= [−4/3 + e^t + (4/3)e^{3t} ; −1/3 + (4/3)e^{3t}],
and therefore
x1(t) = −4/3 + e^t + (4/3)e^{3t}
and
x2(t) = −1/3 + (4/3)e^{3t}.

(b) The state equation may be written in expanded notation as

[ẋ1; ẋ2] = [1 2 ; 0 3] [x1; x2] + [2; 1] · 1   for t ≥ 0.

This corresponds to the following set of DE’s:


dx1/dt = ẋ1 = x1 + 2x2 + 2
dx2/dt = ẋ2 = 3x2 + 1.
The second DE
dx2/dt − 3x2 = 1
has an integrating factor of e−3t . Therefore
e^{−3t} x2(t) = ∫ e^{−3t} · 1 dt = −(1/3)e^{−3t} + k1,
and hence
x2(t) = −1/3 + k1 e^{3t}
But since x2(0) = 1, we must have 1 = −1/3 + k1; i.e. k1 = 4/3. Consequently

x2(t) = −1/3 + (4/3)e^{3t}
Substituting this answer into dx1/dt = x1 + 2x2 + 2 now yields dx1/dt = x1 − 2/3 + (8/3)e^{3t} + 2, whence

dx1/dt − x1 = 4/3 + (8/3)e^{3t}.
In this case the integrating factor is e−t . Consequently
e^{−t} x1(t) = ∫ ((4/3)e^{−t} + (8/3)e^{2t}) dt = −(4/3)e^{−t} + (4/3)e^{2t} + k2.
Therefore
x1(t) = −4/3 + (4/3)e^{3t} + k2 e^t.
But since x1(0) = 1, we must have 1 = −4/3 + 4/3 + k2; i.e. k2 = 1. Thus

x1(t) = −4/3 + (4/3)e^{3t} + e^t
and
x2(t) = −1/3 + (4/3)e^{3t},
which agrees with the result of part (a). 
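Both routes can also be cross-checked numerically. The following Python sketch (assuming the numpy and scipy libraries are available) compares the closed-form answer against e^{At}x0 plus a numerically integrated particular integral at a sample time:

import numpy as np
from scipy.linalg import expm

# Numerical cross-check of Example 3.6.7 at t = 0.5 with the step input u = 1.
A = np.array([[1.0, 2.0], [0.0, 3.0]])
B = np.array([2.0, 1.0])
x0 = np.array([1.0, 1.0])
t = 0.5

comp = expm(A * t) @ x0                                # complementary function
s = np.linspace(0.0, t, 2001)
vals = np.array([expm(A * (t - si)) @ B for si in s])  # integrand with u = 1
part = np.trapz(vals, s, axis=0)                       # particular integral

x_closed = np.array([-4/3 + np.exp(t) + (4/3) * np.exp(3*t),
                     -1/3 + (4/3) * np.exp(3*t)])
print(comp + part)   # should agree with x_closed to integration accuracy
print(x_closed)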

3.6.6. SOLVING STATE SPACE EQUATIONS 2: LAPLACE TRANSFORMS. Matrix-valued functions are not the only way to solve state space equa-
tions. The Laplace transform technique extends directly to matrix equations. If we
combine this technique with inversion of matrices, we end up with an alternative
way of solving a state space equation. Let's see how this works. If we are given
ẋ = Ax (t) + Bu (t)
y = Cx + Du,
then on taking Laplace transforms, we get
pX (p) − x0 = AX (p) + BU (p) ,
from which follows
[pI − A] X (p) = x0 + BU (p) .
Therefore if we can compute [pI − A]^{−1}, then X(p) will be given by

(3.6.8)   X(p) = [pI − A]^{−1} x0 + [pI − A]^{−1} BU(p).
Computing the inverse transform of the solution to the above equation, gives the
required formula for x (t) (the first part will give the complementary function and
the second the particular integral). Of course once we have x(t), it is then easy to
compute y from the equation y = Cx + Du.
Example 3.6.8. Do part (a) of example 3.6.7 by the Laplace transform method.
Solution: We have that

pI − A = p[1 0 ; 0 1] − [1 2 ; 0 3] = [p−1  −2 ; 0  p−3].

The adjoint is [p−3  2 ; 0  p−1] with |pI − A| = (p − 1)(p − 3). Therefore

[pI − A]^{−1} = (1/((p − 1)(p − 3))) [p−3  2 ; 0  p−1] = [1/(p−1)   2/((p−1)(p−3)) ; 0   1/(p−3)].

Then

[pI − A]^{−1} x0 = [1/(p−1)   2/((p−1)(p−3)) ; 0   1/(p−3)] [1; 1] = [1/(p−3); 1/(p−3)].

(The inverse transform of this will give the complementary function.) Since
L{u(t)} = L{H(t)} = 1/p,

we have that

[pI − A]^{−1} BU(p) = [1/(p−1)   2/((p−1)(p−3)) ; 0   1/(p−3)] [2; 1] (1/p)
= [(2(p−3)+2)/(p(p−1)(p−3)) ; 1/(p(p−3))]
= [(2p−4)/(p(p−1)(p−3)) ; 1/(p(p−3))].

(The inverse transform of this will give the particular integral.) Thus
X(p) = [pI − A]^{−1} x0 + [pI − A]^{−1} BU(p)
= [1/(p−3); 1/(p−3)] + [(2p−4)/(p(p−1)(p−3)) ; 1/(p(p−3))]
= [(p(p−1)+2p−4)/(p(p−1)(p−3)) ; (p+1)/(p(p−3))]
= [(p²+p−4)/(p(p−1)(p−3)) ; (p+1)/(p(p−3))].

On resolving these expressions into partial fractions, we get


(p² + p − 4)/(p(p − 1)(p − 3)) = −(4/3)(1/p) + 1/(p − 1) + (4/3)(1/(p − 3))
(p + 1)/(p(p − 3)) = −(1/3)(1/p) + (4/3)(1/(p − 3)).
Therefore
X(p) = [−(4/3)(1/p) + 1/(p−1) + (4/3)(1/(p−3)) ; −(1/3)(1/p) + (4/3)(1/(p−3))].

Thus on taking inverse transforms we get

x(t) = [−4/3 + e^t + (4/3)e^{3t} ; −1/3 + (4/3)e^{3t}],
which agrees with the earlier result. 
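The Laplace route lends itself to symbolic computation as well. Here is a minimal sketch (assuming the sympy library is available; note that sympy may attach Heaviside(t) factors, which equal 1 for t > 0):

import sympy as sp

# Symbolic version of the Laplace route in Example 3.6.8.
p, t = sp.symbols('p t', positive=True)
A = sp.Matrix([[1, 2], [0, 3]])
B = sp.Matrix([2, 1])
x0 = sp.Matrix([1, 1])
U = 1 / p                                   # L{H(t)} = 1/p

X = (p * sp.eye(2) - A).inv() * (x0 + B * U)
x = X.applyfunc(lambda F: sp.simplify(sp.inverse_laplace_transform(F, p, t)))
print(x)   # expect [-4/3 + exp(t) + 4*exp(3*t)/3, -1/3 + 4*exp(3*t)/3]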

3.6.7. TRANSFER FUNCTIONS OF MIMO SYSTEMS. The Laplace transform method may also be used to give meaning to the idea of a transfer function for a MIMO system. However this is not quite as straightforward as in the
SISO case. Suppose we have a MIMO (multi-input, multi-output) system described
by the state space equations
ẋ (t) = Ax (t) + Bu (t)
y (t) = Cx (t) + Du (t)
with zero initial conditions x0 = 0. The problem is that both the input u and
the output y are vectors rather than functions. Therefore the Laplace transforms
U(p) = L{u(t)} and Y(p) = L{y(t)} are also vectors. One can’t really divide
one vector by another, and so we can’t simply divide Y(p) by U(p) to get the
transfer function. What we do instead is to define the system transfer function to
be a matrix which by means of matrix multiplication will map the transform of

the input vector U(p) onto the transform of the output vector Y(p). That is the
transfer “function” is a matrix G(p) for which
Y(p) = G(p)U(p).
Now if our system is described by the above state space equations (with x0 = 0),
then by equation 3.6.8 we will have that
X(p) = [pI − A]^{−1} BU(p).
If now we insert this into the equation
Y (p) = CX (p) + DU (p) ,
we get that
Y(p) = C [pI − A]^{−1} BU(p) + DU(p) = (C [pI − A]^{−1} B + D) U(p).
Thus for a MIMO system, the system transfer matrix is given by
(3.6.9)   G(p) = C [pI − A]^{−1} B + D.
Example 3.6.9. Consider the circuit in the figure below.

Figure 3.36

(a) Express ẋ1, ẋ2, y1 and y2 in terms of x1, x2 and u, where y1 = iL and y2 = iC.
(b) Find the state–space representation of the system in the form
ẋ (t) = Ax (t) + Bu (t)
y (t) = Cx (t) + Du (t)
given that
R1 = 2 Ω,  R2 = 5,5 Ω,  L = 2/15 H  and  C = 1/15 F.
Note: Since there is only one input, u (t) is not a column matrix.
(c) Use Laplace transforms to find the transfer matrix relating the outputs
y1 (t) and y2 (t), to the input u (t) = u when all initial conditions are zero.
(d) Find the system response to the unit step input u (t) = H (t) if the circuit
is initially in its quiescent state.
Solution:
(a) Let i be the total current, iC the current in the capacitor, and iL the
current in the inductor. Applying Kirchhoff's second law to the outer loop gives

u = iR1 + vC + iC R2

and to the left–hand loop gives


u = iR1 + L (diL/dt).
Since i = iC + iL , these equations can be rewritten as
u = (iC + iL )R1 + vC + iC R2
u = (iC + iL)R1 + L (diL/dt).
In the above we have that x1 = vC and x2 = iL . Next notice that the
voltage drop across the capacitor is given by the formula
vC = (1/C) ∫_0^t iC(s) ds.
Thus on differentiating, we get
iC = C v̇C = C ẋ1 .
Therefore the two circuit equations can be rewritten as
u = (C ẋ1 + x2 )R1 + x1 + C ẋ1 R2
u = (C ẋ1 + x2)R1 + Lẋ2.
From the first of these equations we get
ẋ1 = (1/(C[R1 + R2])) (−x1 − R1 x2 + u),
which when inserted into the second equation, yields
u = (R1/(R1 + R2)) (−x1 − R1 x2 + u) + x2 R1 + Lẋ2,
which can be rewritten as

ẋ2 = (R1/(L[R1 + R2])) (x1 − R2 x2 + (R2/R1) u).
The outputs are given by
y1 = x2
and
y2 = iC = C v̇C = C ẋ1 .

(b) In terms of the values given for R1, R2, L and C, we get that

ẋ1 = (1/(C[R1 + R2])) (−x1 − R1 x2 + u)
= (15/7,5) (−x1 − 2x2 + u)
= 2(−x1 − 2x2 + u)
(3.6.10)   = −2x1 − 4x2 + 2u.
Since
R1/(L[R1 + R2]) = 2 · (15/2) · (1/7,5) = 2,

we also have that

ẋ2 = (R1/(L[R1 + R2])) (x1 − R2 x2 + (R2/R1) u)
= 2(x1 − 5,5 x2 + (5,5/2) u)
= 2x1 − 11x2 + 5,5u.
Thus with x = [x1; x2], we have that

(3.6.11)   ẋ = [−2  −4 ; 2  −11] x + [2; 5,5] u
Recall that
y1 = x2 and y2 = C ẋ1 .
On substituting equation 3.6.10, we get that
y2 = (1/15)(−2x1 − 4x2 + 2u) = −(2/15)x1 − (4/15)x2 + (2/15)u.
Thus with y = [y1; y2], we have that

(3.6.12)   y = [0  1 ; −2/15  −4/15] x + [0; 2/15] u
Equations 3.6.11 and 3.6.12 together give the required representation.

(c) In part (b) we obtained the state space formulation


ẋ (t) = Ax (t) + Bu (t)
y (t) = Cx (t) + Du (t)
where

A = [−2  −4 ; 2  −11],  B = [2; 11/2],  C = [0  1 ; −2/15  −4/15]  and  D = [0; 2/15].
With all initial conditions zero, we then have by equation 3.6.9, that the
transfer matrix will be of the form
G(p) = C [pI − A]^{−1} B + D.
Our first task is to compute [pI − A]^{−1}. Notice that

pI − A = p[1 0 ; 0 1] − [−2  −4 ; 2  −11] = [p+2  4 ; −2  p+11]
with

|pI − A| = (p + 2)(p + 11) + 8 = p² + 13p + 30 = (p + 3)(p + 10).

The adjoint matrix of (pI − A) is [p+11  −4 ; 2  p+2], and therefore

(pI − A)^{−1} = (1/((p + 3)(p + 10))) [p+11  −4 ; 2  p+2].
First note that

(pI − A)^{−1} B = (1/((p + 3)(p + 10))) [p+11  −4 ; 2  p+2] [2; 11/2] = (1/((p + 3)(p + 10))) [2p ; (11/2)p + 15].
Consequently

G(p) = C (pI − A)^{−1} B + D
= [0  1 ; −2/15  −4/15] (1/((p + 3)(p + 10))) [2p ; (11/2)p + 15] + [0; 2/15]
= (1/((p + 3)(p + 10))) [(11/2)p + 15 ; −(26/15)p − 4] + [0; 2/15]
= [(11p + 30)/(2(p + 3)(p + 10)) ; (−26p − 60)/(15(p + 3)(p + 10)) + 2/15].

(d) We are given that the system is initially in its quiescent state (i.e. x0 = 0)
and that the input is u (t) = H (t). Thus
U(p) = L{H(t)} = 1/p.
It therefore follows from Y (p) = G (p) U (p) that
" 11p+30
#
2(p+3)(p+10) 1
Y (p) = −26p−60 2
15(p+3)(p+10) + 15 p
" 11p+30
#
2p(p+3)(p+10)
= −26p−60 2
15p(p+3)(p+10) + 15p

11 1 1 4 1
2 p + 14 p+3 − 7 p+10
 
= 2 1 2 1 4 1 2
− 15 p − 35 p+3 + 21 p+10 + 15p
11 1 1 4 1
14 p+3 − 7 p+10
 
2p +
= 2 1 4 1
− 35 p+3 + 21 p+10
Now take inverse transforms to get

y1(t) = 1/2 + (1/14)e^{−3t} − (4/7)e^{−10t}
and
y2(t) = −(2/35)e^{−3t} + (4/21)e^{−10t}.

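The transfer matrix of part (c) can also be recomputed symbolically. Here is a minimal Python sketch (assuming the sympy library is available) applying equation 3.6.9 directly:

import sympy as sp

# Transfer matrix of Example 3.6.9 via G(p) = C (pI - A)^(-1) B + D.
p = sp.symbols('p')
A = sp.Matrix([[-2, -4], [2, -11]])
B = sp.Matrix([2, sp.Rational(11, 2)])
C = sp.Matrix([[0, 1], [sp.Rational(-2, 15), sp.Rational(-4, 15)]])
D = sp.Matrix([0, sp.Rational(2, 15)])

G = (C * (p * sp.eye(2) - A).inv() * B + D).applyfunc(sp.cancel)
# First entry: (11p + 30)/(2(p + 3)(p + 10)); the second entry combines to
# 2p^2/(15(p + 3)(p + 10)), which is the simplified form of the sum above.
print(G)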

EXERCISE 3.6.
1. Obtain the state–space form of the differential equation
d⁴y/dt⁴ + 2 d²y/dt² + 4 dy/dt = 5u(t)

[ẋ = [0 1 0 0 ; 0 0 1 0 ; 0 0 0 1 ; 0 −4 −2 0] x + [0; 0; 0; 5] u,   y = [1 0 0 0] x]

2. With all initial conditions zero, obtain the dynamic equations in state–
space form for the system which is described by the transfer function
G(p) = (p² + 3p + 5)/(p³ + 6p² + 5p + 7).

[ẋ = [0 1 0 ; 0 0 1 ; −7 −5 −6] x + [0; 0; 1] u,   y = [5 3 1] x]

3. Obtain the state–space model of the network shown in the figure below.

Figure 3.37

[ẋ = [−R1/L1  −R1/L1  −1/L1 ; −R1/L1  −(R1 + R2)/L2  −1/L2 ; 1/C  1/C  0] x + [1/L1; 1/L2; 0] u,
 y = [R1  R1  0 ; 0  R2  0] x]

4. A system is characterised by the state equation


ẋ (t) = Ax (t) + Bu (t) t≥0
where A = [3  4 ; 2  1] and B = [0  1 ; 1  1].
(a) Determine the state x(t) of the system for an input of u(t) = [4; 3]
subject to x1 (0) = 1, x2 (0) = 2.
(b) Confirm the result of (a) by using the Laplace transform method.
[x1(t) = −5 + (1/3)(8e^{−t} + 10e^{5t}),   x2(t) = 3 + (1/3)(−8e^{−t} + 5e^{5t})]

5. Determine the response y = x1 of the system characterised by

ẋ1 + 2x2 = u1 − u2
ẋ2 + 3x2 − x1 = u1 + u2,   t ≥ 0,

to an input u1 = 1, u2 = t, subject to the initial conditions x1(0) = 0 and x2(0) = 1. Use the Laplace transform approach.

[y(t) = (1/4)(15 − 10t + 9e^{−2t} − 24e^{−t})]
MODULE 4

Z-TRANSFORMS–
DISCRETE SIGNALS AND SYSTEMS

UNIT 1: DEFINITIONS AND PROPERTIES


4.1.1. OBJECTIVE. Introduction of discrete time systems, the Z–transform,
and its properties.

4.1.2. OUTCOMES. At the end of this unit the student should


• Be familiar with the basic definitions and properties of the Z-transform;
• Understand and be able to apply the initial value theorem, final value
theorem, and convolution theorem.

4.1.3. INTRODUCTION. In module 3 the Laplace transform was used to


analyse continuous time systems. However when faced with say a digital rather
than an analog process, the evolution of the system is no longer continuous, but in
discrete step-wise increments. Examples of such systems include systems which op-
erate on digital signals generated by sampling a continuous time signal. Although
real-valued functions of the form f : R → R were a useful tool in modelling con-
tinuous systems, when faced with a system like the above where the evolution is
discrete rather than continuous, it would be more appropriate to use functions with
a discrete domain space like the integers Z = {0, ±1, ±2, . . . }, i.e. func-
tions of the form g : Z → R. But such functions are nothing but two-sided infinite
sequences written in a different form! To see this notice that for any g : Z → R the
set of function values g(k) = xk where k ∈ Z, form an infinite two-sided sequence
. . . , x−2 , x−1 , x0 , x1 , x2 , . . . . When passing from continuous to discrete systems,
time evolution will no longer be described by differential equations, but rather by
difference equations. In addition when trying to solve such difference equations
Laplace transforms are not the best way to go, as they were invented with continu-
ous systems in mind. Rather what we should do, is to try and use a discrete analog
of the Laplace transform. This discrete analog is the so-called Z-transform.
Thus in summary, with the goal of being able to model and solve discrete
systems, we will in this module try to repeat the programme of module 3, but with
• functions f : R → R replaced by sequences . . . , u−2 , u−1 , u0 , u1 , u2 , . . . ,
• differential equations replaced by difference equations, and
• the Laplace transform replaced by the Z-transform.

4.1.4. BASIC CONCEPTS. Given a two-sided infinite sequence like


. . . , x−2 , x−1 , x0 , x1 , x2 , . . . , we define the Z-transform of this sequence to be the
complex function

(4.1.1)   X(z) = Z({xk}|∞−∞) = Σ_{k=−∞}^{∞} xk/z^k

and understand the domain of X(z) to be the region of convergence of the Laurent series Σ_{k=−∞}^{∞} xk/z^k.

If we have that xk = 0 for all k < 0, the sequence is called a causal sequence.
For such causal sequences the formula for the Z–transform simplifies to

(4.1.2)   X(z) = Z({xk}|∞−∞) = Σ_{k=0}^{∞} xk/z^k
A sequence together with its Z-transform, will be known as a Z-transform pair. In
the case of Laplace transforms we noted that there are some functions for which
the Laplace transform does not exist. Similarly
here there are some sequences for which the associated Laurent expansion Σ_{k=−∞}^{∞} xk/z^k will not converge for any
z. Such sequences do not have a Z-transform. We will not go into this issue of
existence in any measure of detail, but rather content ourselves with making the
reader aware of this fact, and then treat each sequence on a case by case basis. As
we did with Laplace transforms we will for the rest of this module restrict
ourselves to causal sequences only. As before the reason for this is that for
the most part we will be dealing with causal systems where we essentially have no
information about the system prior to time k = 0.
We proceed to demonstrate the technique of computing Z-transforms by means
of some illustrative examples. Ultimately we wish to follow the same approach
as with Laplace transforms. Specifically we will build up a bank of known Z-
transforms of some standard sequences, and then use this bank of known Z-transforms
to compute the transforms of sequences that are a combination of these standard
sequences.
Example 4.1.1. Determine the Z-transform Z({xk}|∞0) of each of the following causal sequences, where a is a real or complex constant:
(a) {xk} = {a^k} = {1, a, a², a³, . . . }.
(b) {xk} = {(−1/2)^k} = {1, −1/2, (−1/2)², (−1/2)³, . . . }.
(c) {xk} = {ka^{k−1}} = {0, 1, 2a, 3a², . . . } (here a ≠ 0).
(d) {xk} = {ak} = {0, a, 2a, 3a, . . . }.
(e) {xk} = {1, 0, 0, 0, . . . }.
Solution:
(a) For {xk} = {1, a, a², a³, . . . } we have

Z({a^k}) = Σ_{k=0}^{∞} a^k/z^k = 1 + (a/z) + (a/z)² + (a/z)³ + . . .

This is a geometric progression with common ratio a/z and leading term 1, which by example 2.5.3 converges to

X(z) = 1/(1 − a/z) = z/(z − a)

whenever |a/z| < 1, i.e. whenever |z| > |a| (all points outside the circle in the z–plane with radius |a| and centre the origin). Thus

{a^k}|∞k=0 and X(z) = z/(z − a), |z| > |a|
form a Z–transform pair.

(b) This is a simple consequence of part (a). We merely need to set a = −1/2 in part (a) to see that

X(z) = z/(z − (−1/2)) = 2z/(2z + 1),   |z| > 1/2

is the Z-transform of {(−1/2)^k}|∞k=0 = {1, −1/2, (−1/2)², (−1/2)³, . . . }.

(c) Observe that if term for term we differentiate the sequence {1, a, a², a³, . . . } with respect to a, we get

{(d/da) a^k} = {0, 1, 2a, 3a², . . . } = {ka^{k−1}}.

Since in this way {ka^{k−1}} can be obtained from {a^k} by means of differentiation, we will in a similar way attempt to obtain the Z-transform of {ka^{k−1}} from that of {a^k} by means of differentiation. Regarding z as a constant and differentiating

1 + (a/z) + (a/z)² + (a/z)³ + . . . = z/(z − a),   |z| > |a|

with respect to a, yields

0 + 1/z + 2a/z² + 3a²/z³ + . . . = (d/da)(z/(z − a)) = z/(z − a)²

(where of course we still have |z| > |a|). Thus

{ka^{k−1}}|∞k=0 and X(z) = z/(z − a)², |z| > |a|
form a Z–transform pair.

(d) We will use the result of part (c). If in part (c) we set a = 1, we get that

{k}|∞k=0 = {0, 1, 2, . . . } and X(z) = z/(z − 1)², |z| > 1

form a Z–transform pair; that is

0 + 1/z + 2/z² + 3/z³ + . . . = z/(z − 1)²,   |z| > 1.

If now we multiply throughout by a, we get that

0 + a/z + 2a/z² + 3a/z³ + . . . = az/(z − 1)²,   |z| > 1.

Thus

{ak}|∞k=0 = {0, a, 2a, . . . } and X(z) = az/(z − 1)², |z| > 1
form a Z–transform pair.

(e) For {xk } = {1, 0, 0, 0, . . .} we have by definition that


Z({1, 0, 0, 0, . . . }) = 1 + 0/z + 0/z² + 0/z³ + . . . = 1.
Thus
{xk}|∞k=0 = {1, 0, 0, 0, . . . } and X(z) = 1
form a Z-transform pair. 
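A quick numeric illustration of the pair in part (a) may be reassuring. The following plain Python snippet (an informal check, not part of the theory) compares a partial sum of the series against the closed form z/(z − a) at a point with |z| > |a|:

# Partial sums of sum_k a^k/z^k should approach z/(z - a) when |z| > |a|.
a, z = 0.5, 2.0
partial = sum((a / z) ** k for k in range(200))
print(partial, z / (z - a))   # both approximately 1.33333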

Definition 4.1.2. We will refer to the sequence {1, 0, 0, 0, . . . } as the impulse


sequence (or unit pulse sequence) and denote it by
{δk } = {1, 0, 0, . . .} .
As we demonstrated in the previous example, we have that
(4.1.3) Z({δk }) = 1.

4.1.5. SAMPLING: THE RELATION BETWEEN LAPLACE AND Z-TRANSFORMS. Although Laplace transforms apply to continuous–time sig-
Z-TRANSFORMS. Although Laplace transforms apply to continuous–time sig-
nals and Z–transforms to discrete–time signals, one may observe many similarities
between the two. We are now going to establish a relationship between the two.
Suppose we are given a continuous function f that an instrument samples every
T units of time. (Thus sampling is done at time 0, T , 2T , etc.) This process is
represented by the figure below.

Figure 4.1

If the duration of each sampling event (width of the pulse) is ε > 0, then the function for which the area under the graph corresponds to that of the columns in the figure, is given by

Σ_{n=0}^{∞} f(nT) [H(t − nT + ε/2) − H(t − nT − ε/2)].
If now we divide by ε and let ε → 0 (thus taking the duration of the sampling events to be arbitrarily small), we get to what is called an ideal sampler which is the generalised function

g(t) = Σ_{n=0}^{∞} f(nT) δ(t − nT)
where δ (t − nT ) is the Dirac delta function at nT . (Thus our ideal sampler is
just a series of pulses for which the “area under the n-th pulse” is f (nT ).) If we
argue formally and use the fact that L{δ (t − nT )} = e−pnT , then taking Laplace
transforms will lead to
X∞
L {g (t)} = L{ f (kT ) δ (t − kT )}
k=0

X
= f (kT ) L{δ (t − kT )}
k=0
X∞
f (kT ) · e−kT p
 
=
k=0

If now we make the substitution z = e^{pT}, then the above becomes

Σ_{k=0}^{∞} f(kT)/z^k,

which is just the Z-transform of the causal sequence {f(kT)}|∞k=0 = {f(0), f(T), f(2T), . . . , f(nT), . . . }. Thus on setting z = e^{pT} (i.e. p = (1/T) ln z) the

Z–transform of a sequence of samples in discrete time is the same as the Laplace


transform of the ideal sampling function g (t); i.e.

L{Σ_{k=0}^{∞} f(kT) δ(t − kT)} = Z({f(nT)})

whenever z = e^{pT}.
Example 4.1.3. The signal f (t) = e−t H (t) is sampled at intervals T. Deter-
mine the Z–transform of the resulting sequence.
(Note: Since e^{−t} ≠ 0 for t < 0, the factor H(t) is there to ensure that f(t) is
causal.)
Solution: The sequence of sampled values is given by
{f(kT)} = {f(0), f(T), f(2T), . . . , f(nT), . . . } = {1, e^{−T}, e^{−2T}, . . . , e^{−nT}, . . . }.


If now we let a = e−T in part (a) of example 4.1.1, it clearly follows that then
Z({f(nT)}) = Σ_{k=0}^{∞} (e^{−T})^k/z^k = z/(z − e^{−T}),   |z| > e^{−T}.

Thus we see that the region of convergence (domain) of this Z-transform, depends
on the sampling interval. 
4.1.6. PROPERTIES OF THE Z–TRANSFORM.

4.1.6.1. The linearity property.


Theorem 4.1.4. If {xk } and {yk } have Z–transforms X (z) and Y (z) respec-
tively, then for any two constants α and β, we have that
Z (αxk + βyk ) = αX (z) + βY (z)
where the region of convergence of Z (αxk + βyk ) will be at least as big as the region
common to the Z–transforms X (z) and Y (z).
Proof:

Z({αxk + βyk}) = Σ_{k=0}^{∞} (αxk + βyk)/z^k = α Σ_{k=0}^{∞} xk/z^k + β Σ_{k=0}^{∞} yk/z^k = αX(z) + βY(z)

Example 4.1.5. Determine the Z–transform of the sequence {sin ωkT } that
arises when f (t) = sin ωtH (t) is sampled at intervals T. (Here ω is an arbitrary
constant.)
Solution: Using the fact that
sin ωkT = (e^{jωkT} − e^{−jωkT})/(2j),

it follows from the linearity property that

Z({sin ωkT}) = (1/(2j)) [Z({e^{jωkT}}) − Z({e^{−jωkT}})]
= (1/(2j)) [Z({(e^{jωT})^k}) − Z({(e^{−jωT})^k})]

So if in part (a) of example 4.1.1 we respectively set a = e^{jωT} and a = e^{−jωT}, it will follow that

Z({sin ωkT}) = (1/(2j)) [Z({e^{jωkT}}) − Z({e^{−jωkT}})]
= (1/(2j)) [z/(z − e^{jωT}) − z/(z − e^{−jωT})]
= (1/(2j)) (z² − ze^{−jωT} − z² + ze^{jωT})/(z² − z(e^{jωT} + e^{−jωT}) + 1)
= z ((e^{jωT} − e^{−jωT})/(2j)) / (z² − 2z ((e^{jωT} + e^{−jωT})/2) + 1)
= z sin ωT / (z² − 2z cos ωT + 1).

Since |e^{jkωT}| = |e^{−jkωT}| = 1, the region of convergence will at least include |z| > 1.
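The formula can be sanity-checked numerically. Here is a plain Python snippet (an informal check under assumed sample values of ω, T and z, with |z| > 1 so the series converges):

import math

# Partial sum of sum_k sin(w*k*T)/z^k versus the closed form just derived.
w, T, z = 1.3, 0.2, 1.5
series = sum(math.sin(w * k * T) / z ** k for k in range(2000))
closed = z * math.sin(w * T) / (z ** 2 - 2 * z * math.cos(w * T) + 1)
print(series, closed)   # should agree to many decimal places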


4.1.6.2. The first shift property (Delaying).

Theorem 4.1.6. Let {yk } be a delayed version of {xk } with yk = xk−k0 where
k0 is the number of steps in the delay. For example if k0 = 3, then {yk } is effectively
just {xk } moved forward by 3 steps in the sense that y0 = x−3 , y1 = x−2 , y2 = x−1 ,
and so on. If both {yk } and {xk } are causal, then

Z({yk}) = (1/z^{k0}) Z({xk}).

In other words

Z({xk−k0}) = (1/z^{k0}) Z({xk}).
In the sketch below we represent the effect of such a delay process on a sequence
of sample values.

Figure 4.2

Proof: Notice that

Z({yk}) = Σ_{k=0}^{∞} yk/z^k
= Σ_{k=0}^{∞} x_{k−k0}/z^k
= Σ_{p=−k0}^{∞} xp/z^{p+k0}
= (1/z^{k0}) Σ_{p=−k0}^{∞} xp/z^p

where p = k − k0 . Since {xp } is a causal sequence, we have that xp = 0 when p < 0.


Thus the above simplifies to

Z({yk}) = (1/z^{k0}) Σ_{p=0}^{∞} xp/z^p = (1/z^{k0}) X(z).


Example 4.1.7. Determine Z({xk−3}) where xk = (1/2)^k for k ≥ 0 and xk = 0 when k < 0. (So {xk} is causal.)

Solution: From part (a) of example 4.1.1 we have that

Z({(1/2)^k}|∞k=0) = z/(z − 1/2) = 2z/(2z − 1),   |z| > 1/2.

Thus by the first shift property

Z({xk−3}) = (1/z³) · 2z/(2z − 1) = 2/(z²(2z − 1)),   |z| > 1/2.
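Again a quick numeric check is possible. The following plain Python snippet (informal, with an assumed sample value of z) sums the delayed series directly:

# The series for the delayed sequence should equal (1/z^3) * 2z/(2z - 1).
z = 3.0
lhs = sum(0.5 ** (k - 3) / z ** k for k in range(3, 300))   # x_{k-3}, causal
rhs = (1 / z ** 3) * (2 * z / (2 * z - 1))
print(lhs, rhs)   # should agree closely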


4.1.6.3. The second shift property (Advancing).


Theorem 4.1.8. Let {yk } be a causal sequence generated by advancing {xk }
by k0 steps, i.e. yk = xk+k0 . Then
Z({yk}) = Z({xk+k0}) = z^{k0} Z({xk}) − Σ_{n=0}^{k0−1} xn z^{k0−n}.

Proof: Observe that


Z({yk}) = Z({xk+k0})
= Σ_{k=0}^{∞} yk/z^k
= Σ_{k=0}^{∞} x_{k+k0}/z^k
= z^{k0} Σ_{k=0}^{∞} x_{k+k0}/z^{k+k0}
= z^{k0} Σ_{n=k0}^{∞} xn/z^n

where in the last line we have set n = k + k0. Thus

Z({yk}) = z^{k0} Σ_{n=k0}^{∞} xn/z^n
= z^{k0} [Σ_{n=0}^{∞} xn/z^n − Σ_{n=0}^{k0−1} xn/z^n]
= z^{k0} [X(z) − Σ_{n=0}^{k0−1} xn/z^n]
as required. 

4.1.6.4. Multiplication by k n where n ∈ N.


Theorem 4.1.9.

Z({k^n xk}) = (−z d/dz)^n Z({xk}).

Proof: It is an exercise to see that (−z)(d/dz) z^{−k} = k z^{−k}. Thus if we repeat this process n times, we get

(−z d/dz)^n z^{−k} = k^n z^{−k}.
Using this fact, it is now clear that

Z({k^n xk}) = Σ_{k=0}^{∞} k^n xk/z^k
= Σ_{k=0}^{∞} xk (−z d/dz)^n (1/z^k)
= (−z d/dz)^n Σ_{k=0}^{∞} xk/z^k
= (−z d/dz)^n Z({xk}).


4.1.6.5. Multiplication by ak where a is a constant.


Theorem 4.1.10. If
Z({xk }) = X (z)
then
Z({a^k xk}) = X(a^{−1}z).

(See if you can prove this! It is not too difficult.)

4.1.6.6. Initial–value theorem.


Theorem 4.1.11. If
Z({xk }) = X (z)
then
lim_{z→∞} X(z) = x0.

Proof: If in the definition of the Z-transform


X(z) = x0 + x1/z + x2/z² + x3/z³ + . . .
we let z → ∞, then all terms in the series except x0 will tend to 0. Hence the result
follows. 

4.1.6.7. Final–value theorem.


Theorem 4.1.12. If
lim_{k→∞} xk exists, then

lim_{k→∞} xk = lim_{z→1} (z − 1) X(z)
where
Z({xn }) = X (z) .
Proof: By the second shift property we have that
Z({xk+1 }) = zX(z) − zx0 .
So
zX(z) − zx0 − X(z) = Z({xk+1}) − Z({xk}) = lim_{n→∞} Σ_{k=0}^{n} (xk+1 − xk)/z^k.

If now we let z → 1, we get that


lim_{z→1} (z − 1) X(z) − x0 = lim_{n→∞} Σ_{k=0}^{n} (xk+1 − xk)
= lim_{n→∞} [(x1 − x0) + (x2 − x1) + · · · + (xn−1 − xn−2) + (xn − xn−1)]
= lim_{n→∞} [xn − x0]
= (lim_{n→∞} xn) − x0

as required. (The second last line follows by cancelling terms where possible.) 
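As an illustration, consider the sequence xk = 1 − (1/2)^k, whose limit is clearly 1; from the pairs established earlier, X(z) = z/(z − 1) − z/(z − 1/2). The following Python sketch (assuming the sympy library is available) evaluates the limit in the theorem:

import sympy as sp

# Final-value theorem for x_k = 1 - (1/2)^k.
z = sp.symbols('z')
X = z / (z - 1) - z / (z - sp.Rational(1, 2))
print(sp.limit((z - 1) * X, z, 1))   # 1, matching lim x_k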

4.1.6.8. The convolution theorem for Z-transforms.


Theorem 4.1.13. Given two causal sequences {un } and {vn }, the convolution
product of these sequences is defined to be {wn } = {un ∗ vn } where
(4.1.4)   wn = un ∗ vn = Σ_{k=0}^{n} uk v_{n−k} = Σ_{k=0}^{n} u_{n−k} vk.

If U (z) and V (z) are the Z-transforms of {un } and {vn } respectively, then
Z({un ∗ vn }) = U (z)V (z).
Observe that by equation 4.1.4, {un ∗ vn } = {vn ∗ un }.

Proof: We have that

Z({un ∗ vn}) = Σ_{n=0}^{∞} [Σ_{k=0}^{n} uk v_{n−k}] (1/z^n).

Since {vn } is causal, we have that vn−k = 0 when k = n + 1, n + 2, . . . So without


changing anything, the above formula may be rewritten as

"∞ #
X X 1
Z({un ∗ vn }) = uk vn−k n .
n=0
z
k=0
Now first reverse the order of summation and then set m = n − k (i.e. n = m + k)
to get

Z({un ∗ vn}) = Σ_{n=0}^{∞} Σ_{k=0}^{∞} uk v_{n−k} (1/z^n)
= Σ_{k=0}^{∞} Σ_{n=0}^{∞} uk v_{n−k} (1/z^n)
= Σ_{k=0}^{∞} Σ_{m=−k}^{∞} uk vm (1/z^{m+k})
= Σ_{k=0}^{∞} Σ_{m=0}^{∞} uk vm (1/z^{m+k})
= [Σ_{k=0}^{∞} uk/z^k] [Σ_{m=0}^{∞} vm/z^m]
= U(z)V(z).
(To pass from the third to the fourth line we once again use the causality of {vn }
to note that vm = 0 when m < 0.) 
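The theorem is easy to test numerically on short sequences. The following plain Python snippet (an informal check on two arbitrarily chosen causal sequences, zero after the listed terms) compares the transform of the convolution with the product of the transforms at a sample point:

u = [1, 2, 3]
v = [4, 5]
# w_n = sum_k u_k v_{n-k}, the convolution product of equation 4.1.4.
w = [sum(u[j] * v[n - j] for j in range(n + 1)
         if j < len(u) and 0 <= n - j < len(v)) for n in range(5)]

z = 2.0
Uz = sum(uk / z ** k for k, uk in enumerate(u))
Vz = sum(vk / z ** k for k, vk in enumerate(v))
Wz = sum(wk / z ** k for k, wk in enumerate(w))
print(Wz, Uz * Vz)   # both 17.875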
EXERCISE 4.1.
1. Determine the Z–transform of the following sequences and state the region
of convergence in each case:
(a) {(1/3)^k}
(b) {4^k}
(c) {(−3)^k}
(d) {−3^k}
(e) {5k}

2. The signal f (t) = e−3λt , where λ is a real constant, is sampled at intervals


T when t ≥ 0. Write down the general term of the sequence of samples
and calculate the Z–transform of this sequence.

3. Show that
Z({cos ωkT}) = z(z − cos ωT)/(z² − 2z cos ωT + 1),   |z| > 1

4. Use the first shift property to determine Z({yk}) if

yk = { 0 if k < 2 ; xk if k ≥ 2 }

where {xk} is a causal sequence with xk = (1/3)^k.
Confirm your result by using equation 4.1.2 to compute the Z–transform.

5. Determine the Z–transforms of
(a) {(−1/4)^k},
(b) {sin kπ}.

6. Determine Z({k (1/2)^k}).

7. Determine
(a) Z({sinh kβ}),
(b) Z({cosh kβ}).

8. Sequences are generated by sampling a causal continuous time signal u (t)


at regular intervals T. Write down an expression for the general term uk
of the sequence of samples, and calculate the Z–transform if u (t) is
(a) e^{−3t},
(b) cos t,
(c) sin 4t.

UNIT 2: THE INVERSE Z–TRANSFORM


4.2.1. OBJECTIVE. To introduce the concept of an inverse Z–transform,
and develop an algorithm for computing inverse Z-transforms of complex rational
functions.
4.2.2. OUTCOMES. At the end of this unit the student should understand
the meaning of the term “inverse Z-transform” and be able to use tables of Z-
transforms to compute inverse Z-transforms.
4.2.3. THE INVERSE Z–TRANSFORM. Let X(z) be a given complex
function. If there exists a sequence {xk } so that
Z({xk }) = X(z)
on some region, then by definition this sequence is the inverse transform of X(z)
on that region, i.e.
Z −1 [X(z)] = {xk } .
For example the complex function
G(z) = (z³ + 4z² − 5z)/z³

can be rewritten as

G(z) = 1 + 4/z − 5/z².
In this form it clearly follows from the definition of the Z-transform that this
function can be obtained by taking the transform of {1, 4, −5, 0, 0, . . .}. Hence
Z −1 [G (z)] = {1, 4, −5, 0, 0, . . .} .
The above definition is however a bit more tricky than one may think. Consider
for example the function
1/(1 − z).
If |z| < 1, then using the theory of geometric series, we can show that
1/(1 − z) = 1 + z + z² + z³ + . . .
Thus on the region |z| < 1 this function is the Z-transform of the non-causal
sequence . . . , u−2, u−1, u0, u1, u2, . . . with

uk = { 1 if k ≤ 0 ; 0 if k > 0 }.

If however we restrict ourselves to the region |z| > 1 (equivalently |1/z| < 1), we may use the binomial expansion to show that

1/(1 − z) = −(1/z) · 1/(1 − 1/z)
= −(1/z)(1 + 1/z + 1/z² + 1/z³ + . . . )
= −1/z − 1/z² − 1/z³ − . . .
Thus on this region 1/(1 − z) is the Z-transform of the causal sequence {0, −1, −1, −1, . . . }.
From this simple example it is clear to see that if we want our inverse Z-transform
to be well-defined, we either need to say up front which region we are interested in,
or else restrict ourselves to causal sequences. We avoid these difficulties by choosing
to restrict ourselves to causal sequences.

In the examples in the previous unit, we showed that

Z({a^k}|∞k=0) = z/(z − a),   |z| > |a|.

Thus the inverse Z-transform of z/(z − a) is of course {a^k}|∞k=0. This simple example can
be expanded to a technique that will allow us to compute the inverse Z-transform
of any rational function for which the order of the denominator is not less than the
order of the numerator. Here is how it works:
Let F(z) be a given complex rational function for which the order of the denominator is not less than the order of the numerator. If the order of the denominator is the same as that of the numerator, use long division of polynomials to reduce to the case where we have a constant plus a rational function for which the order of the numerator is less than that of the denominator. The inverse transform of such a rational function F(z) is then computed as follows:
• Step 1: Firstly use partial fractions to write F(z)/z as a sum of terms of the form K/(z − b)^n.
• Step 2: Multiply throughout by z to write F(z) as a sum of terms of the form Kz/(z − b)^n.

• Step 3: In the above expansion write down the inverse Z-transform of each term of the form z/(z − b)^n by referring to the given table of Z-transforms. (Most of these were computed in the previous unit.) The associated linear combination of these inverse Z-transforms is the inverse Z-transform of F(z).
Suppose for example we want to compute the inverse Z-transform of F(z) = 2z²/((z + 2)(z + 1)²). We then firstly use partial fractions to see that

F(z)/z = 2z/((z + 2)(z + 1)²) = 4/(z + 1) − 2/(z + 1)² − 4/(z + 2).

Now multiply throughout by z to get

F(z) = 4 (z/(z + 1)) − 2 (z/(z + 1)²) − 4 (z/(z + 2)).

From the list of known Z-transforms we can now conclude that

Z^{−1}[F(z)] = 4 Z^{−1}[z/(z + 1)] − 2 Z^{−1}[z/(z + 1)²] − 4 Z^{−1}[z/(z + 2)]
= 4 Z^{−1}[z/(z − (−1))] − 2 Z^{−1}[z/(z − (−1))²] − 4 Z^{−1}[z/(z − (−2))]
= 4(−1)^k − 2(k(−1)^{k−1}) − 4(−2)^k
= (4 + 2k)(−1)^k − 4(−2)^k   for k ≥ 0.
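The partial-fraction step can be automated. Here is a minimal Python sketch (assuming the sympy library is available) of Step 1 for the example just worked:

import sympy as sp

# sympy's apart() performs the partial-fraction decomposition of F(z)/z.
z = sp.symbols('z')
F = 2 * z ** 2 / ((z + 2) * (z + 1) ** 2)
print(sp.apart(F / z, z))   # 4/(z + 1) - 2/(z + 1)**2 - 4/(z + 2)
# Multiplying each term by z and using the table of transform pairs then
# gives the inverse transform found above.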
Example 4.2.1. Determine

Z^{−1}[z/(z − 5)]

Solution: From part (a) of example 4.1.1 with a = 5, it follows that

Z^{−1}[z/(z − 5)] = {5^k},   k ≥ 0.


Example 4.2.2. Determine

Z^{−1}[z/((z − 3)(z − 4))]

Solution: Here F(z) = z/((z − 3)(z − 4)). We need to resolve

F(z)/z = 1/((z − 3)(z − 4))

into partial fractions. Hence let

1/((z − 3)(z − 4)) = A/(z − 3) + B/(z − 4).

But then A(z − 4) + B(z − 3) = 1. Successively substituting z = 3 and z = 4 leads to A = −1 and B = 1. Thus

1/((z − 3)(z − 4)) = −1/(z − 3) + 1/(z − 4).

Now multiply throughout by z to get

z/((z − 3)(z − 4)) = −z/(z − 3) + z/(z − 4).

Using part (a) of example 4.1.1 and the linearity of the Z-transform then yields

Z^{−1}[z/((z − 3)(z − 4))] = Z^{−1}[−z/(z − 3) + z/(z − 4)]
= Z^{−1}[−z/(z − 3)] + Z^{−1}[z/(z − 4)]
= {4^k} − {3^k}
= {4^k − 3^k},   k ≥ 0.

Example 4.2.3. Determine

Z^{−1}[(3z + 1)/((z + 2)(z − 1))]

Solution: We first divide the given rational function by z and then resolve the result into partial fractions to get the form

(3z + 1)/(z(z + 2)(z − 1)) = A/z + B/(z + 2) + C/(z − 1).

But then 3z + 1 = A(z + 2)(z − 1) + Bz(z − 1) + Cz(z + 2). Successively substituting z = 0, z = −2 and z = 1 leads to A = −1/2, B = −5/6 and C = 4/3. Thus

(3z + 1)/(z(z + 2)(z − 1)) = −(1/2)(1/z) − (5/6)(1/(z + 2)) + (4/3)(1/(z − 1)).

Multiplying by z yields

(3z + 1)/((z + 2)(z − 1)) = −1/2 − (5/6)(z/(z + 2)) + (4/3)(z/(z − 1)).

Using the results of example 4.1.1 we see that

Z^{−1}[(3z + 1)/((z + 2)(z − 1))] = Z^{−1}[−1/2 − (5/6)(z/(z + 2)) + (4/3)(z/(z − 1))]
= −(1/2) Z^{−1}[1] − (5/6) Z^{−1}[z/(z + 2)] + (4/3) Z^{−1}[z/(z − 1)]
= −(1/2){δk} − (5/6){(−2)^k} + (4/3){1^k},   k ≥ 0;

that is, the sequence with value 0 at k = 0 and value 4/3 − (5/6)(−2)^k for k > 0.

Example 4.2.4. Invert the Z–transform

Y(z) = z/(z² + a²)

where a is a real constant.

Solution: Observe that we may write z² + a² = (z − ja)(z + ja). Thus Y(z)/z needs to be resolved into partial fractions by setting

Y(z)/z = 1/(z² + a²) = A/(z − ja) + B/(z + ja).

This leads to A(z + ja) + B(z − ja) = 1 from which we get that A = 1/(j2a) and B = −1/(j2a). (To see this successively substitute z = ja and z = −ja.) Thus

Y(z)/z = (1/(j2a)) [1/(z − ja) − 1/(z + ja)]

whence

Y(z) = (1/(j2a)) [z/(z − ja) − z/(z + ja)].

But then

Z^{−1}[Y(z)] = (1/(j2a)) [Z^{−1}[z/(z − ja)] − Z^{−1}[z/(z + ja)]]
= (1/(j2a)) [{(ja)^k} − {(−ja)^k}],   k ≥ 0
= {(a^{k−1}/(2j)) (j^k − (−j)^k)},   k ≥ 0.

In principle we are now done. However using the theory of complex numbers, we can write the above sequence in a simpler form. Since e^{jπ/2} = j and e^{−jπ/2} = −j, the above solution can be written as

Z^{−1}[Y(z)] = {(a^{k−1}/(2j)) (e^{jkπ/2} − e^{−jkπ/2})} = {a^{k−1} sin(kπ/2)},   k ≥ 0.

In the above we made use of the fact that sin w = (1/(2j))(e^{jw} − e^{−jw}). 

Example 4.2.5. Invert the Z–transform

F(z) = z(1 − e^{−bT})/((z − 1)(z − e^{−bT}))

where b and T are positive constants.

Solution: We need to write F (z)/z in the form


F (z) 1 − e−bT A B
= −bT
= + .
z (z − 1)(z − e ) z − 1 z − e−bT
This leads to A(z − e−bT ) + B(z − 1) = 1 − e−bT from which (by successively
substituting z = 1 and z = e−bT ) we get that A = 1 and B = −1. Thus
F (z) 1 1
= −
z z − 1 z − e−bT
or equivalently
z z
F (z) = −
z − 1 z − e−bT
Taking inverse Z-transforms now leads to
Z −1 [F (z)] = {1k − (e−bT )k } = {1 − e−kbT } k ≥ 0.

It is important to be aware of the fact that using partial fractions in this way is not the only way to compute the inverse Z-transform. It just happens to be one
which is convenient and suited to our purposes. There are several other ways to do
this inversion. For example one could also use the convolution theorem to compute
inverse Z-transforms. We are not going to go into the convolution method in any
amount of detail, but will merely demonstrate the technique with one example.
Example 4.2.6. Use the convolution theorem to find the inverse transform of

X(z) = z/((z − 3)(z + 1)).

Solution: We may write X(z) = U(z)V(z) where

U(z) = z/(z − 3) = Z({3^k}|∞k=0),

and

V(z) = 1/(z + 1) = (1/z)(z/(z + 1)) = (1/z) Z({(−1)^k}|∞k=0).

By the first shift property, we have that

Z^{−1}[V(z)] = {v0, v1, v2, . . . }

with v0 = 0 and vk = (−1)^{k−1} when k ≥ 1. If now we write {uk} = Z^{−1}[U(z)] = {3^k}, it follows from the convolution theorem that

Z^{−1}[X(z)] = {wk}|∞k=0 = {uk ∗ vk}|∞k=0

where

wn = u0 vn + u1 v_{n−1} + u2 v_{n−2} + · · · + u_{n−1} v1 + un v0
= (−1)^{n−1} + (−1)^{n−2} · 3 + (−1)^{n−3} · 3² + · · · + 3^{n−1}
= (−1)^{n−1} [1 + (−3) + (−3)² + · · · + (−3)^{n−1}]
= (−1)^{n−1} ((−3)^n − 1)/((−3) − 1)
= (3^n − (−1)^n)/4.

(In the third equality above we used the fact that (−1)^m (−1)^k = (−1)^{m−k}.) 

EXERCISE 4.2.
1. Invert the following Z–transforms. Give the general term of the sequence
in each case.
(a) z/(z + 1)
(b) z/(4z + 1)
(c) z/(z − j)
(d) 1/(z − 1)
(e) (z + 2)/(z − 1)

2. Determine Z^{−1}[F(z)] where F(z) is given by


(a) z/((z − 3)(2z + 1))
(b) z²/((z − 1)(2z + 1))
(c) 2z/(2z² + z − 1)
(d) (5z + z³ − 3z⁶)/z⁶
(e) (1 + z)/z³ + 4z/(4z + 1)
(f) (2z² − 7z + 7)/((z − 1)²(z − 2))

UNIT 3: THE Z-TRANSFORM METHOD FOR SOLVING


DIFFERENCE EQUATIONS
4.3.1. OBJECTIVE. To show how difference equations can be used to describe discrete processes, and how such difference equations can then be solved
using the theory of Z-transforms.

4.3.2. OUTCOMES. At the end of this unit the student should


• Know how block diagrams can be used to get a better understanding of a
given system;
• Understand and be able to apply the Z-transform method for solving
linear constant coefficient difference equations.

4.3.3. INTRODUCTION. Linear continuous–time systems can be described


by linear differential equations. In module 3 we demonstrated the usefulness of the
Laplace transform in analysing such systems. Specifically the Laplace transform
not only provides us with a technique for solving the differential equations describ-
ing the system, but also provides us with a way of obtaining the transfer function
describing the system. In this section we shall briefly discuss the idea of a linear
discrete–time system and show how such systems can be described by difference
equations. We will also see how the Z-transform can be used to solve such difference
equations.
An expression of the form
αn yk+n + αn−1 yk+(n−1) + · · · + α0 yk = βm uk+m + βm−1 uk+(m−1) + · · · + β0 uk
is called a linear constant coefficient difference equation. Here α0 , . . . , αn and
β0 , . . . , βm are fixed constants, {uk } is some given sequence of inputs, and {yk }
the sequence of outputs. A solution of a difference equation is a formula for the
general term yk of the outputs (depending on both k and the input sequence {xk }),
that satisfies the above equation. If say αn 6= 0 and βm 6= 0, then to be sure of
obtaining a unique solution, we need to specify the values of at least y0 , y1 , . . . , yn−1
and u0 , u1 , . . . , um−1 at the outset.
But where and how are such difference equations used to model (discrete-time)
digital processes? Let's consider a simple example to get some idea of how this
relationship works.
Suppose that a sequence of signals {xk } (where xk is the signal at the time step
k) is processed using a discrete–time feedback system. Once a signal xk enters the
system at time k, it is first combined with a feedback signal, and then delayed or
held back until the “clock” advances one step, to step k + 1. The feedback signal
mentioned above is produced by scaling the output yk of the delay process by the
factor α, and then subtracting it from the input signal xk . (This constant scaling
factor in the feedback process is sometimes called the feedback-gain of the system.)
Let {rk } denote the sequence of input signals to the delay process. If the output
of this delay process at step k is yk , then owing to the delay action of this process,
the inputs must be
yk+1 = rk .
However the input rk to the delay process comes from the feedback action which
subtracts αyk from xk . So we also have
rk = xk − αyk .
Thus the system is represented by the difference equation
yk+1 = xk − αyk .

Equivalently
yk+1 + αyk = xk .
This is a simple example of a system described by a first–order difference equation.
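Such a difference equation is trivial to simulate step by step. The following plain Python snippet (an informal sketch, with an assumed feedback gain, an assumed step input, and an assumed zero initial output) generates the output sequence directly from the recursion:

# Simulate y_{k+1} + alpha*y_k = x_k for a few steps.
alpha = 0.5
x = [1.0] * 10            # unit-step input sequence x_0, x_1, ...
y = [0.0]                 # assume the delay output starts at y_0 = 0
for k in range(len(x) - 1):
    y.append(x[k] - alpha * y[k])
print(y)                  # oscillating convergence towards 1/(1 + alpha)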
4.3.4. BLOCK DIAGRAMS: AN AID TO UNDERSTANDING. Sig-
nal flow diagrams and schematic block diagrams are often used as an aid in analysing
a control system. Such diagrams don’t actually help to solve the system, but they
can be a useful aid in understanding the interactions between the various inputs
and outputs of each of the subsystems. It is important to note that block diagrams
are not unique in the sense that there may be more than one block diagram repre-
senting a given system. These diagrams should therefore be used with a measure
of caution. We are of course primarily interested in how block diagrams may be
applied to discrete-time processes. With this in mind we will in this section very
briefly indicate how to pass from a difference equation description to a block diagram description, and back.
Our block diagrams will basically be made up of nodes or error wheels, and
branches. For example if signal a is subtracted from signal b to produce signal c = b − a, then this action can be represented by an error wheel as in the sketch below.

Figure 4.3

By inserting the appropriate sign in each case, we can of course have as many
signals as we like being added or subtracted at such a node. In its simplest form a
branch will consist of an input r, and output y, and an operator F which transforms
r to y. Such a branch will typically be represented as in the diagram below:

Figure 4.4

When working with difference equations and discrete-time systems, there are
typically two types of operators that feature prominently; namely delay blocks and
scaling blocks. A delay block will receive an input signal rk at time step k, and
then hold the input signal until the “clock” advances one step, before releasing it
back into the system at step k + 1. The action of such a delay block on a sequence
{rk } of input signals will be represented as in the sketch below.
Observe that in this delay system the output yk at time k is just the input rk−1 from the (k − 1)-th time step. Thus we have that rk−1 = yk, or equivalently rk = yk+1. Thus delay blocks may more accurately be sketched as below:

Figure 4.5.

Figure 4.6.

A scaling block is simply a subsystem which multiplies its input signal a with
some given constant m to produce the output ma. We will typically use circles
rather than squares to represent such a scaling operator. Thus the system which
sends a to ma may be sketched as follows:

Figure 4.7

If in the above diagram we take a to be acceleration and m mass, then it


represents the process whereby the input of some acceleration a results in a force
F = ma being exerted on the mass m.
There are of course some basic rules according to which branches and nodes are
combined to form a more complex system. These may essentially be summarised
as follows:
• All signals flow unilaterally in the direction of the arrows.
• A block representing a subsystem is assumed not to load any of the sub-
systems preceding it.
• A signal entering a branch is operated on by the branch operator.
• The variable at any given node is the signal produced by adding and
subtracting the incoming signals according to the signs indicated at the
node.
• All branches leaving the node transmit the variable represented by that
node.
With these basic notions behind us, let's see how this works in practice. As a
start we will have another look at the simple system discussed in the introduction.
Recall that there we assumed we were dealing with a discrete–time feedback sys-
tem which processed a sequence of input signals {xk } by first combining it with a
feedback signal, and then delaying the signal until the “clock” advances one step.
The feedback signal itself is produced by scaling the output yk of the delay process
by the factor α, and then subtracting it from the input signal xk .
There must therefore be at least one delay block in our diagram (see figure 4.5).
However to model the system described above, we need to introduce a tap-off point
that takes the output {yk } of our delay block to a scaling block (scaled by a factor

α) and from there to an error wheel where it is subtracted from the system input
{xk }. The output from this node is then fed into the delay block. After making
these modifications to our sketch of the delay block, we get to

Figure 4.8

This then is the block diagram representation of this system. Given such a
block diagram, we can deduce the associated difference equation from the diagram.
In the above example we can do this by looking at what happens at the error wheel.
Recall that owing to the delay action of D, we have that
yk+1 = rk
Since the node variable at the error wheel is xk − αyk , and since the branch leaving
this node transmits this variable, we must also have that
rk = xk − αyk .
Thus as before we get
yk+1 = xk − αyk ,
(or equivalently yk+1 + αyk = xk .)
Example 4.3.1. Find a difference equation to represent the system shown in
the figure below, having input {xk } and output {yk }. Here D is the unit delay block,
and α and β are constant feedback gains.

Figure 4.9

Solution: Our final difference equation should be in terms of the system input
{xk } and system output {yk } only. Thus we need to describe each of {vk } and
{rk } in terms of these two sequences. Because of the delay action of the two delay
blocks, the inputs of the two delay blocks must satisfy
yk+1 = vk and vk+1 = rk .

(See figure 4.6 and the discussion preceding it.) The variable at the node S is
xk − αvk + βyk . Since the branch leaving that node must carry this variable, we
also have that
rk = xk − αvk + βyk .
From the first two equations we have that
yk+2 = vk+1 = rk .
By substituting the second equation into the third we also get that
rk = xk − αyk+1 + βyk .
Thus we obtain
yk+2 = xk − αyk+1 + βyk ,
or equivalently
yk+2 + αyk+1 − βyk = xk
(a second–order difference equation with constant coefficients). 

We close with some examples showing how to go from a difference equation to


a block diagram.
Example 4.3.2. Draw a block diagram to represent the system modelled by the
difference equation
yk+2 + 3yk+1 − yk = uk .
Solution: The figure below illustrates how the sequence {yk } is generated from
{yk+2 } by means of two delay blocks.

Figure 4.10

If now we rewrite our difference equation as


yk+2 = −3yk+1 + yk + uk
it is clear that the input yk+2 into the first delay block, is produced by tapping off
yk and yk+1 , and then scaling yk+1 by a factor 3, before combining yk and 3yk+1
with the input uk to produce yk+2 . In block diagram form this means that the
above diagram needs to be modified to produce the diagram below.

Figure 4.11

This then is the required block diagram representation of the system. 


Example 4.3.3. Draw a block diagram to represent the system modelled by the
difference equation
yk+2 + 3yk+1 + 2yk = uk+1 − uk .

Solution: This difference equation has a more complex right–hand side than
the equation in the previous example. Hence although it should be possible to
draw a block diagram using only 2 delay blocks, actually doing this will not be as
straightforward as before. We start by first considering a system for which the right
hand-side is of a simpler form, namely
(4.3.1) rk+2 + 3rk+1 + 2rk = uk .
Arguing as in the previous example we can now construct a block diagram for the
new equation to get

Figure 4.12.

From equation 4.3.1 it is clear that


uk+1 − uk = [rk+3 + 3rk+2 + 2rk+1 ] − [rk+2 + 3rk+1 + 2rk ]
= [rk+3 − rk+2 ] + 3[rk+2 − rk+1 ] + 2[rk+1 − rk ].
If we compare this equation to
(4.3.2) uk+1 − uk = yk+2 + 3yk+1 + 2yk
it is clear that in order to be able to pass from equation 4.3.1 to equation 4.3.2, we
need to have
yk = rk+1 − rk .
(Then also yk+1 = rk+2 − rk+1 and yk+2 = rk+3 − rk+2 .) Thus to get the required
block diagram, we need to tap off rk and rk+1 in figure 4.12 and combine them
according to the prescription rk+1 − rk to produce the output yk . Having done this,
we then end up with the required block diagram, namely

Figure 4.13



4.3.5. THE SOLUTION OF DIFFERENCE EQUATIONS. The Z–


transform method is based upon the second shift property and is a technique much
like that of the Laplace transform method for solving differential equations. Given
a difference equation like

αn yk+n + αn−1 yk+(n−1) + · · · + α0 yk = βm uk+m + βm−1 uk+(m−1) + · · · + β0 uk

with suitable initial conditions imposed on y0 , y1 , . . . , yn−1 , one first uses Z-transforms
to convert the difference equation into an algebraic equation involving U (z) =
Z ({uk }) and Y (z) = Z ({yk }). From this equation we then solve for Y (z). Once
Y (z) = Z ({yk }) is known, it is then a simple matter of computing the inverse
transform of Y (z) to find the solution.
Consider for example the difference equation

yk+2 + yk+1 − 2yk = uk k≥0

with input {uk} = {1, 1, 1, ...}. If we use the linearity property and the fact that
\[ Z\left(\left\{1^k\right\}\right) = \frac{z}{z-1}, \]
then on taking Z–transforms this becomes
\[ Z(\{y_{k+2}\}) + Z(\{y_{k+1}\}) - 2Z(\{y_k\}) = \frac{z}{z-1}. \]
But by the second shift property this can be rewritten as
\[ \left[z^2 Y(z) - z^2 y_0 - z y_1\right] + \left[z Y(z) - z y_0\right] - 2Y(z) = \frac{z}{z-1} \]
where Y(z) = Z({yk}). Equivalently
\[ Y(z) = \frac{1}{z^2 + z - 2}\left[\frac{z}{z-1} + z^2 y_0 + z(y_0 + y_1)\right]. \]
Notice that the value of Y(z) depends on y0 and y1. Thus if we are to obtain a unique solution, the values of y0 and y1 need to be specified. This corresponds to knowing y(0) and y′(0) in the Laplace transform method. Suppose now we want to solve the difference equation under the assumption that y0 = 0 and y1 = 1. Then the formula for Y(z) becomes
\[ Y(z) = \frac{1}{z^2 + z - 2}\left[\frac{z}{z-1} + z\right] = \frac{z^2}{(z^2 + z - 2)(z-1)} = \frac{z^2}{(z+2)(z-1)^2}. \]
To obtain the solution, we now simply find the inverse transform of Y(z). Dividing Y(z) by z, and then resolving into partial fractions, yields
\[ \frac{Y(z)}{z} = \frac{z}{(z+2)(z-1)^2} = -\frac{2}{9}\,\frac{1}{z+2} + \frac{2}{9}\,\frac{1}{z-1} + \frac{1}{3}\,\frac{1}{(z-1)^2}. \]
Thus
\[ Y(z) = -\frac{2}{9}\,\frac{z}{z+2} + \frac{2}{9}\,\frac{z}{z-1} + \frac{1}{3}\,\frac{z}{(z-1)^2}. \]

Using the linearity property combined with parts (a) and (c) of example 4.1.1, now leads to
\[ \{y_k\} = Z^{-1}[Y(z)] = -\frac{2}{9}\,Z^{-1}\left[\frac{z}{z+2}\right] + \frac{2}{9}\,Z^{-1}\left[\frac{z}{z-1}\right] + \frac{1}{3}\,Z^{-1}\left[\frac{z}{(z-1)^2}\right] \]
\[ = \left\{-\frac{2}{9}(-2)^k + \frac{2}{9}\,1^k + \frac{1}{3}\,k\right\} = \left\{\frac{1}{9}\left[2\left(1 - (-2)^k\right) + 3k\right]\right\} \qquad k \ge 0.
\]
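As an informal numerical check (not part of the formal method), the closed form just obtained can be compared with values generated directly from the recursion yk+2 = uk − yk+1 + 2yk. A minimal Python sketch:

# Recursion y[k+2] = 1 - y[k+1] + 2*y[k] with y0 = 0, y1 = 1 and u[k] = 1.
y = [0.0, 1.0]
for k in range(10):
    y.append(1.0 - y[k + 1] + 2.0 * y[k])

# Closed form from the worked example above.
closed = [(2 * (1 - (-2) ** k) + 3 * k) / 9 for k in range(12)]
assert all(abs(a - b) < 1e-9 for a, b in zip(y, closed))
print(y[:6])   # [0.0, 1.0, 0.0, 3.0, -2.0, 9.0]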
Example 4.3.4. Solve the difference equation 8yk+2 − 6yk+1 + yk = 9, k ≥ 0, given that y0 = 1 and y1 = 3/2.
Solution: Taking Z-transforms leads to
\[ 8\left[z^2 Y(z) - z^2 y_0 - z y_1\right] - 6\left[z Y(z) - z y_0\right] + Y(z) = 9Z(\{1\}) = \frac{9z}{z-1} \]
where Y(z) = Z({yk}). On inserting the given initial conditions, this becomes
\[ \left(8z^2 - 6z + 1\right) Y(z) = \frac{9z}{z-1} + 8z^2 + 12z - 6z. \]
If now we solve for Y(z) and divide by z, this expression becomes
\[ \frac{Y(z)}{z} = \frac{1}{8z^2 - 6z + 1}\left[\frac{9}{z-1} + 8z + 6\right] = \frac{8z^2 - 2z + 3}{8(z-1)\left(z - \frac{1}{4}\right)\left(z - \frac{1}{2}\right)}. \]
Once resolved into partial fractions, this can be rewritten as
\[ \frac{Y(z)}{z} = \frac{3}{z-1} + \frac{2}{z - \frac{1}{4}} - \frac{4}{z - \frac{1}{2}}, \]
whence
\[ Y(z) = \frac{3z}{z-1} + \frac{2z}{z - \frac{1}{4}} - \frac{4z}{z - \frac{1}{2}}. \]
Thus
\[ \{y_k\} = \left\{3 + 2\left(\frac{1}{4}\right)^k - 4\left(\frac{1}{2}\right)^k\right\} \qquad k \ge 0. \]


Example 4.3.5. Solve the difference equation yk+2 + 2yk = 0, k ≥ 0, given that y0 = 1 and y1 = √2.
Solution: Taking Z-transforms leads to
\[ \left[z^2 Y(z) - z^2 y_0 - z y_1\right] + 2Y(z) = 0. \]
In view of the initial conditions, this ensures that
\[ \left(z^2 + 2\right) Y(z) = z^2 + z\sqrt{2}, \]
whence
\[ \frac{Y(z)}{z} = \frac{z + \sqrt{2}}{z^2 + 2} = \frac{z + \sqrt{2}}{\left(z + j\sqrt{2}\right)\left(z - j\sqrt{2}\right)}. \]

Once resolved into partial fractions, this becomes
\[ \frac{Y(z)}{z} = \frac{1+j}{2}\,\frac{1}{z + j\sqrt{2}} + \frac{1-j}{2}\,\frac{1}{z - j\sqrt{2}}, \]
and hence
\[ Y(z) = \frac{1+j}{2}\,\frac{z}{z + j\sqrt{2}} + \frac{1-j}{2}\,\frac{z}{z - j\sqrt{2}}. \]
Taking inverse transforms now leads to
\[ \{y_k\} = \left\{\frac{1+j}{2}\left(-j\sqrt{2}\right)^k + \frac{1-j}{2}\left(j\sqrt{2}\right)^k\right\} = \left\{\frac{\left(\sqrt{2}\right)^k}{2}\left[\left(j^k + (-j)^k\right) - j\left(j^k - (-j)^k\right)\right]\right\} \qquad k \ge 0. \]
Now replace j by e^{jπ/2} and −j by e^{−jπ/2} to get
\[ \{y_k\} = \left\{\frac{\left(\sqrt{2}\right)^k}{2}\left[\left(e^{jk\pi/2} + e^{-jk\pi/2}\right) - j\left(e^{jk\pi/2} - e^{-jk\pi/2}\right)\right]\right\}. \]
Finally use the facts that 2 cos(kπ/2) = e^{jkπ/2} + e^{−jkπ/2} and 2j sin(kπ/2) = e^{jkπ/2} − e^{−jkπ/2} to rewrite this as
\[ \{y_k\} = \left\{\frac{\left(\sqrt{2}\right)^k}{2}\left[2\cos\frac{k\pi}{2} - j\cdot 2j\sin\frac{k\pi}{2}\right]\right\} = \left\{\left(\sqrt{2}\right)^k\left[\cos\frac{k\pi}{2} + \sin\frac{k\pi}{2}\right]\right\} \qquad k \ge 0. \]
Note: The solution is a real–valued sequence since the coefficients of the dif-
ference equation and the “initial” values involved only real numbers. This kind of
observation is sometimes a useful check on algebraic computations involving com-
plex partial fractions. 
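The same kind of informal check works here: iterating yk+2 = −2yk and comparing with the real-valued closed form. A minimal Python sketch:

import math

# Recursion y[k+2] = -2*y[k] with y0 = 1, y1 = sqrt(2).
y = [1.0, math.sqrt(2.0)]
for k in range(10):
    y.append(-2.0 * y[k])

closed = [math.sqrt(2.0) ** k *
          (math.cos(k * math.pi / 2) + math.sin(k * math.pi / 2))
          for k in range(12)]
assert all(abs(a - b) < 1e-9 for a, b in zip(y, closed))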
EXERCISE 4.3.
1. Find the difference equations representing the discrete–time systems shown.
(a)

(b)

2. Solve the following difference equations.


(a) yk+2 + 4yk = 0 subject to y0 = 0, y1 = 1.

(b) 2yk+2 − 5yk+1 − 3yk = 0 subject to y0 = 3, y1 = 2.

3. Solve the following difference equations.


(a) 6yk+2 + 5yk+1 − yk = 5 subject to y0 = y1 = 0.
(b) yk+2 − 5yk+1 + 6yk = 12^k subject to y0 = y1 = 0.
(c) yk+2 − 3yk+1 + 3yk = 1 subject to y0 = 1, y1 = 0.
(d) 2yn+2 − 3yn+1 − 2yn = 6n + 1 subject to y0 = 1, y1 = 2.

4. A person’s capital at the beginning and expenditure during a given year k


are denoted by Ck and Ek respectively, and satisfy the difference equations
Ck+1 = 1, 6Ck − Ek
Ek+1 = 0, 2Ck + 0, 4Ek
At the beginning of year 0 and year 1, the capital was respectively R10
000 and R7 630.
(a) Show that as k increases, the capital growth will approach 40% per
annum.
(b) Find the year in which the expenditure is a minimum and also find
the capital at the beginning of that year.

5. The dynamics of a discrete–time system is determined by the difference


equation yk+2 − 5yk+1 + 6yk = uk subject to y0 = 0, y1 = 1. Determine
the response of the system to the unit–step input

\[ u_k = \begin{cases} 0 & k < 0 \\ 1 & k \ge 0 \end{cases} \]

UNIT 4: TRANSFER FUNCTIONS AND STABILITY


4.4.1. OBJECTIVE. To define the concept of transfer function for Z–transforms,
to interpret it as the impulse response of a discrete-time system, and to investigate
the use of this construct in describing the stability of a system.

4.4.2. OUTCOMES. At the end of this unit the student should


• Understand the concept of a Z transfer function and be able to com-
pute transfer functions of systems governed by linear constant coefficient
difference equations;
• Be able to use the transfer function of a system to compute the response
of the system to a given input;
• Know the meaning of the term “impulse response” and be familiar with
the relationship between the impulse response and the transfer function;
• Know how to use convolution with the impulse response to compute the
response of a system to an arbitrary input;
• Know how the stability of a system can be described in terms of the poles of the transfer function;
• Know the Jury stability criterion and be able to use it to test the stability of a given transfer function.

4.4.3. Z-TRANSFER FUNCTIONS. Suppose we have a linear time-invariant


discrete-time system with causal input sequence {uk } and causal output sequence
{yk }. By analogy with the Laplace transform case we define the transfer function
G(z) of a such a system to be the ratio of the Z-transform Y (z) = Z({yk }) of the
output sequence to the Z-transform U (z) = Z({uk }) of the input sequence where
all starting values are assumed to be zero. The transfer function then clearly satisfies
the relationship
(4.4.1) Y (z) = G(z)U (z).
As was the case with continuous-time systems, a transfer function approach is a
useful way of dealing with a situation where we need to solve the same system for
a number of different input sequences. Once the transfer function G(z) is known,
then given a causal input sequence {uk }, all we need to do to find the Z-transform
of the corresponding output sequence, is to simply compute the Z-transform of
{uk }, and multiply by G(z).
Suppose now that our linear time-invariant discrete-time system is described
by a general linear constant-coefficient difference equation of the form
an yk+n + an−1 yk+n−1 + . . . + a0 yk = bm uk+m + bm−1 uk+m−1 + . . . + b0 uk .
(Here k ≥ 0, n and m are positive integers, and ai and bj are constants with an 6= 0.)
For the system to be well-posed we need to have n ≥ m. To ensure that each fixed
input yields a unique output, we need to specify starting values for y0, y1, ..., yn−1.
If all these starting values are 0, we say that the system is in a quiescent state. In
this particular case this would mean that
y0 = y1 = . . . = yn−1 = 0
and
u0 = u1 = ... = um−1 = 0.
Assuming that the system is indeed initially in a quiescent state, taking Z–
transforms will then give
\[ \left(a_n z^n + a_{n-1} z^{n-1} + \dots + a_0\right) Y(z) = \left(b_m z^m + b_{m-1} z^{m-1} + \dots + b_0\right) U(z) \]

where as before Y(z) = Z({yk}) and U(z) = Z({uk}). In such a case the system Z–transfer function G(z) is then described by the formula
\[ G(z) = \frac{Y(z)}{U(z)} = \frac{b_m z^m + b_{m-1} z^{m-1} + \dots + b_0}{a_n z^n + a_{n-1} z^{n-1} + \dots + a_0}. \]
This formula is normally written such that the coefficient of z^n in the denominator is 1.
On writing
\[ P(z) = b_m z^m + b_{m-1} z^{m-1} + \dots + b_0 \]
and
\[ Q(z) = a_n z^n + a_{n-1} z^{n-1} + \dots + a_0, \]
G(z) may be expressed as
\[ G(z) = \frac{P(z)}{Q(z)}. \]
As was the case for Laplace transforms, Q (z) = 0 is called the characteristic
equation, its order n is the order of the system, and its roots are the poles of the
transfer function. The roots of P (z) = 0 are the zeros of the transfer function.
Example 4.4.1. For each of the following cases, find the transfer function of
the system described by the given difference equation:
(a) yk+2 + 3yk+1 − yk = uk k ≥ 0;
(b) yk+2 + 3yk+1 + 2yk = uk+1 − uk .
Solution:
(a) If we take Z-transforms of yk+2 + 3yk+1 − yk = uk (k ≥ 0) with all starting values zero, we obtain
\[ z^2 Y(z) + 3z Y(z) - Y(z) = \left(z^2 + 3z - 1\right) Y(z) = U(z) \]
where Y(z) = Z({yk}) and U(z) = Z({uk}). Thus the transfer function is given by
\[ G(z) = \frac{Y(z)}{U(z)} = \frac{1}{z^2 + 3z - 1}. \]

(b) With all starting values zero, taking Z-transforms of the equation yk+2 + 3yk+1 + 2yk = uk+1 − uk leads to
\[ \left(z^2 + 3z + 2\right) Y(z) = (z - 1) U(z). \]
Thus the transfer function is given by
\[ G(z) = \frac{Y(z)}{U(z)} = \frac{z - 1}{z^2 + 3z + 2}. \]


4.4.4. TRANSFER FUNCTIONS AND BLOCK DIAGRAMS. In sec-


tion 4.3.4 we described how for a system characterised by a given difference equa-
tion, one could use the difference equation to draw a block diagram for the system.
One could equally well use the transfer function of a system to draw block dia-
grams. For a given causal sequence {yk}, the second shift property tells us that Z({yk+1}) = zZ({yk}) when y0 = 0. Thus on setting Y(z) = Z({yk}), the process of passing from {yk+1} to {yk} translates to passing from zY(z) to Y(z) once Z-transforms have been taken. Thus the transformed version of a delay block becomes

Figure 4.14

The rest of the techniques remain fairly much the same as before and hence we
will not repeat the guidelines given earlier in section 4.3.4. Instead we will content
ourselves with considering a couple of examples.
Example 4.4.2. Draw a block diagram for the system with transfer function
\[ G(z) = \frac{1}{z^2 + 3z - 1}. \]
Solution: Observe that the given function is the transfer function for the
system described by the equation
yk+2 + 3yk+1 − yk = uk .
In example 4.3.2 we already drew a block diagram for this system using the difference equation approach, and hence all things being equal, this alternative approach should result in the same diagram. Now if Y(z) is the Z-transform of the output and U(z) the Z-transform of the input, then
\[ \frac{Y(z)}{U(z)} = G(z) = \frac{1}{z^2 + 3z - 1} \]
can be rewritten as (z² + 3z − 1)Y(z) = U(z). Thus at some point in our block diagram we will need to pass from z²Y(z) to Y(z) by means of two delay blocks (see the sketch below).

Figure 4.15

However notice that the equation (z² + 3z − 1)Y(z) = U(z) can be written as
\[ z^2 Y(z) = U(z) - 3zY(z) + Y(z). \]
Thus in our system z²Y(z) is produced by tapping off zY(z) and Y(z), scaling zY(z) by a factor 3, and then combining U(z), Y(z) and 3zY(z) according to the prescription U(z) − 3zY(z) + Y(z). On making these modifications to the above diagram, we end with the required "transformed version" of the block diagram for our system. As expected it has the same structure as the one in example 4.3.2.

Figure 4.16


Example 4.4.3. Draw a block diagram for the system with transfer function
\[ G(z) = \frac{z - 1}{z^2 + 3z + 2}. \]
Solution: Once again there is nothing really new about this example. As
we showed in example 4.4.1, this transfer function is just the transfer function of
the system considered in example 4.3.3. Hence we should obtain a similar block
diagram here as before. With Y(z) denoting the Z-transform of the output and U(z) the Z-transform of the input,
\[ \frac{Y(z)}{U(z)} = G(z) = \frac{z - 1}{z^2 + 3z + 2} \]
can be rewritten as
\[ (4.4.2) \qquad \left(z^2 + 3z + 2\right) Y(z) = zU(z) - U(z). \]
This equation has a more complex right–hand side than the one in the pre-
vious example. As we did in example 4.3.3, we get around this problem by first
introducing an intermediate equation with a simplified right-hand side, namely
\[ (4.4.3) \qquad \left(z^2 + 3z + 2\right) R(z) = U(z). \]
Using the same approach as in the previous example, this equation now leads to
the following diagram:

Figure 4.17

To find out how R(z) is related to Y(z), we simply substitute equation 4.4.3 into equation 4.4.2 to get
\[ \left(z^2 + 3z + 2\right) Y(z) = zU(z) - U(z) = \left(z^2 + 3z + 2\right)\left(zR(z) - R(z)\right). \]

Thus
Y (z) = zR (z) − R (z) .
Thus to produce Y (z), we need to tap off zR(z) and R(z), and combine them
according to the above prescription. Making these modifications now leads to the
required block diagram, namely

Figure 4.18

Again this diagram has the same structure as the one obtained in example
4.3.3. 
4.4.5. THE IMPULSE RESPONSE. Suppose we have a linear discrete-
time system (with all starting values zero) described by the transfer function G(z).
For any causal input {uk }, the Z-transform of the (causal) output {yk }, is related
to that of {uk } by means of the equation
Y (z) = G (z) U (z) .
Now recall that the Z-transform of the impulse sequence {δk } = {1, 0, 0, 0, . . .} is
just Z({δk }) = 1. Thus if we take the input sequence {uk } to be this impulse
sequence {δk } and denote the associated output sequence by {yδk }, then
Z({yδk }) = G (z) .1 = G(z).
Thus the transfer function is nothing but the Z-transform of the response of the system to a unit impulse input. Since by taking transforms and inverse transforms
we may freely pass between {yδk } and Z {yδk } = G (z), these two objects are
essentially equivalent, and we may think of either one as capturing the essence
of the system. In the following we will consistently write {yδk } for the impulse
response of a discrete-time system, and denote its Z-transform by Yδ (z). From the
above discussion it is clear that
Yδ (z) = G(z)
and hence that in general
Y (z) = Yδ (z)U (z).
Example 4.4.4. Find the impulse response of the system with transfer function
\[ G(z) = \frac{z}{z^2 + 3z + 2}. \]
Solution: From the preceding discussion it is clear that
\[ Y_\delta(z) = G(z) = \frac{z}{(z+1)(z+2)}. \]
To find {yδk}, we must therefore compute the inverse transform of this function. Resolving
\[ \frac{Y_\delta(z)}{z} = \frac{1}{(z+1)(z+2)} \]

into partial fractions, leads to
\[ \frac{Y_\delta(z)}{z} = \frac{1}{z+1} - \frac{1}{z+2}, \]
whence
\[ Y_\delta(z) = \frac{z}{z+1} - \frac{z}{z+2}. \]
Thus
\[ \{y_{\delta k}\} = Z^{-1}\left[\frac{z}{z+1} - \frac{z}{z+2}\right] = \left\{(-1)^k - (-2)^k\right\} \qquad k \ge 0. \]

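The impulse response can also be recovered numerically by expanding G(z) in powers of 1/z through polynomial long division. The sketch below (Python; the helper and its name are our own illustrative construction, not part of the prescribed method) confirms the closed form just obtained:

def impulse_response(num, den, n):
    """First n coefficients of the 1/z power series of num/den.

    Both polynomials are given highest power first; the series
    coefficients are the impulse-response samples."""
    num = [0.0] * (len(den) - len(num)) + [float(c) for c in num]
    y, rem = [], num[:]
    for _ in range(n):
        q = rem[0] / den[0]              # next series coefficient
        y.append(q)
        rem = [r - q * d for r, d in zip(rem, den)][1:] + [0.0]
    return y

samples = impulse_response([1, 0], [1, 3, 2], 8)   # G(z) = z/(z^2+3z+2)
assert all(abs(s - ((-1) ** k - (-2) ** k)) < 1e-9
           for k, s in enumerate(samples))
print(samples)   # [0.0, 1.0, -3.0, 7.0, -15.0, ...]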

Example 4.4.5. A system has the impulse response sequence {yδk} = {a^k − (0,5)^k} where a > 0 is a real constant. What is the nature of this response when
(a) a = 0,3;
(b) a = 1,4?
Find the step response in both cases (that is the response of the system to the unit step sequence {hk} = {1, 1, 1, ...}).
Solution: If {yδk} = {a^k − (0,5)^k} is the impulse response sequence of the system, the transfer function is given by
\[ G(z) = Y_\delta(z) = Z\left(\left\{a^k - (0,5)^k\right\}\right) = \frac{z}{z-a} - \frac{z}{z-0,5}. \]
Now if the input sequence is the unit step sequence {hk} = {1, 1, 1, ...}, then using the fact that
\[ H(z) = Z(\{h_k\}) = \frac{z}{z-1}, \]
the transform of the step response of the system is given by
\[ Y(z) = G(z)H(z) = \left[\frac{z}{z-a} - \frac{z}{z-0,5}\right]\frac{z}{z-1}. \]
But then
\[ \frac{Y(z)}{z} = \frac{z}{(z-a)(z-1)} - \frac{z}{(z-0,5)(z-1)}. \]
After resolving this into partial fractions, and then multiplying by z, we get
\[ Y(z) = \frac{a}{a-1}\,\frac{z}{z-a} + \frac{z}{z-0,5} + \left(\frac{1}{1-a} - 2\right)\frac{z}{z-1}. \]
Hence, the step response is
\[ \{y_k\} = \left\{\frac{a}{a-1}\,a^k + (0,5)^k + \left(\frac{1}{1-a} - 2\right)\right\}. \]
(a) When a = 0,3 both (0,3)^k and (0,5)^k tend to zero as k → ∞ so that yδk → 0 as k → ∞. Similarly the step response will then tend to 1/(1 − 0,3) − 2 = −4/7.

(b) If a = 1,4 then (1,4)^k → ∞ as k → ∞, so that yδk → ∞. For the step response we similarly have that yk → ∞ as k → ∞.
4.4.6. SYSTEM RESPONSE TO AN ARBITRARY INPUT. Suppose that as before we have a linear discrete-time system (with all starting values zero). In the previous section we saw that the Z-transform of the causal response {yk} of the system to an arbitrary causal input {uk} is related to the Z-transform of this input by means of the equation
\[ Y(z) = Y_\delta(z) U(z) \]
where Yδ (z) is the Z-transform of the impulse response of the system. But by the
convolution theorem (see subsection 4.1.6.8) this means that
{yk } = {uk ∗ yδk }.
Thus we arrive at the following fact:
Theorem 4.4.6. With all starting values zero, the response {yk } of a causal
linear time–invariant discrete-time system to an arbitrary causal input {uk }, is
given by
{yk } = {uk ∗ yδk }
where {yδk } is the system response to the unit impulse sequence {δk } = {1, 0, 0, 0, . . . }.
Example 4.4.7. Suppose we are given a discrete-time system with transfer
function
\[ G(z) = \frac{z}{z+2}. \]
Assuming that all starting values are zero, find the response {yk } of the system to
the unit step input {hk } = {1, 1, 1, . . . } by
(1) finding the Z-transform Z({hk}) = H(z), and then computing the inverse transform of Y(z) = G(z)H(z),
(2) computing the impulse response {yδk}, and then taking the convolution product {uk ∗ yδk}.
Solution:
(a) The Z-transform of the system step response is given by
Y (z) = G (z) Z({hk })
where
\[ Z(\{h_k\}) = \frac{z}{z-1}. \]
Thus
\[ Y(z) = \frac{z}{z+2}\cdot\frac{z}{z-1}. \]
On resolving Y(z)/z into partial fractions we get
\[ \frac{Y(z)}{z} = \frac{z}{(z+2)(z-1)} = \frac{2/3}{z+2} + \frac{1/3}{z-1}. \]
Therefore
\[ Y(z) = \frac{1}{3}\left[\frac{2z}{z+2} + \frac{z}{z-1}\right], \]

and hence taking inverse Z-transforms will yield
\[ \{y_k\} = \left\{\frac{1}{3}\left[2(-2)^k + 1\right]\right\} = \left\{\frac{1}{3}\left[(-1)^k 2^{k+1} + 1\right]\right\}. \]

(b) As demonstrated in section 4.4.5, the impulse response is given by
\[ \{y_{\delta k}\} = Z^{-1}[G(z)] = Z^{-1}\left[\frac{z}{z+2}\right] = \left\{(-2)^k\right\} \qquad k \ge 0. \]
Thus by the theorem above, the response of the system to a unit step input {hk} is given by
\[ \{y_k\} = \{h_k * y_{\delta k}\} = \left\{h_k * (-2)^k\right\}. \]
But by the convolution formula each yk then equals
\[ y_k = \sum_{n=0}^{k} (-2)^n h_{k-n} = (-2)^0 h_k + (-2)^1 h_{k-1} + \dots + (-2)^k h_0 = 1 + (-2)^1 + (-2)^2 + \dots + (-2)^k. \]
But this is just a finite geometric progression with k + 1 terms and a common ratio of −2. Thus for each k ≥ 0
\[ y_k = \frac{1 - (-2)^{k+1}}{1 - (-2)} = \frac{1}{3}\left(1 + (-1)^k 2^{k+1}\right). \]
As expected this confirms the result of part (a).
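Theorem 4.4.6 can likewise be verified numerically for this example by forming the convolution sum directly; a minimal Python sketch (the helper function is our own illustrative construction):

def causal_convolve(u, g):
    """y[k] = sum_{n=0}^{k} g[n] * u[k-n] for causal sequences u and g."""
    return [sum(g[n] * u[k - n] for n in range(k + 1)) for k in range(len(u))]

N = 10
step = [1.0] * N                              # unit step {h_k}
y_delta = [(-2.0) ** n for n in range(N)]     # impulse response from part (b)
y = causal_convolve(step, y_delta)

closed = [(1 + (-1) ** k * 2 ** (k + 1)) / 3 for k in range(N)]
assert all(abs(a - b) < 1e-9 for a, b in zip(y, closed))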
4.4.7. STABILITY. In example 4.4.5 we observed that for a = 0,3 the impulse response decayed to zero with increasing k, whilst the step response approached a constant limiting value. By contrast, for a = 1,4 both the impulse and step response became unbounded. Thus although in both cases our input was a bounded sequence, in the first case the output was also bounded, whereas in the second case it clearly was not.
As in the case of continuous time, the systems for which a bounded input leads to a bounded output are the so-called stable systems, and as before the question of stability is related to the roots of the characteristic equation of the transfer function. Suppose we have a linear discrete-time system characterised by a transfer function of the form
\[ G(z) = \frac{b_m z^m + b_{m-1} z^{m-1} + \dots + b_0}{a_n z^n + a_{n-1} z^{n-1} + \dots + a_0} \]
where m < n and an ≠ 0. The denominator Q(z) = an z^n + an−1 z^{n−1} + ... + a0 may then be factorised as follows:
\[ Q(z) = a_n (z - \alpha_1)(z - \alpha_2)(z - \alpha_3)\dots(z - \alpha_n). \]
Here the αk ’s (the roots of the characteristic equation) are the poles of the transfer
function. For simplicity’s sake suppose that none of these roots are repeated. This
means that if we resolve G(z) into partial fractions, we should get an expression of
the form
\[ G(z) = C_1\,\frac{1}{z - \alpha_1} + C_2\,\frac{1}{z - \alpha_2} + \dots + C_n\,\frac{1}{z - \alpha_n}. \]

If now we combine the first shift property with the fact that Z({r^k}) = z/(z − r), it follows that for a fixed m
\[ Z^{-1}\left[\frac{1}{z - \alpha_m}\right] = Z^{-1}\left[\frac{1}{z}\,\frac{z}{z - \alpha_m}\right] = \{0, 1, \alpha_m, (\alpha_m)^2, \dots\}. \]
Thus if (with all starting values 0) we compute the impulse response by taking the inverse transform of G(z), it will be a combination of sequences of the form {0, 1, αm, (αm)², ...}. However such a sequence will be unbounded if |αm| > 1, since then |(αm)^k| = |αm|^k → ∞ as k → ∞. Thus if any of the poles of the above transfer function lie outside the circle |z| = 1, then the system will be unstable in the sense that a bounded input like the unit impulse sequence, will lead to an unbounded output. By contrast if |αm| < 1, then |(αm)^k| = |αm|^k → 0 as k → ∞.
Armed with these insights, we now make the following definition:
Definition 4.4.8. A linear discrete-time system characterised by a transfer function of the form
\[ G(z) = \frac{b_m z^m + b_{m-1} z^{m-1} + \dots + b_0}{a_n z^n + a_{n-1} z^{n-1} + \dots + a_0} \]
where m < n and an ≠ 0, is said to be stable if all the poles of the transfer function lie within the circle |z| = 1. If some of the poles lie on the circle, and the rest are all inside the circle, the system is said to be marginally stable. If one or more of the poles are outside this circle, we say that the system is unstable.
Example 4.4.9. Which of the following discrete-time systems, each specified
by their transfer functions G (z), are stable?
(a) G(z) = 1/(z − 0,5)
(b) G(z) = (z + 2)/(z² − z + 1)
(c) G(z) = (z − 3)/(z² − z + 0,5)
(d) G(z) = (2z² − 4)/(2z³ − z² + 8z − 4)
Solution:
(a) This transfer function has only one pole at z = 0, 5. Since |0, 5| < 1,
z = 0, 5 is inside the circle |z| = 1, and hence the system is stable.

(b) This transfer function has poles where z² − z + 1 = 0; that is at the points
\[ z = \frac{1 \pm \sqrt{1 - 4}}{2} = \frac{1}{2} \pm j\frac{\sqrt{3}}{2}. \]
But since
\[ \left|\frac{1}{2} \pm j\frac{\sqrt{3}}{2}\right|^2 = \left(\frac{1}{2}\right)^2 + \left(\frac{\sqrt{3}}{2}\right)^2 = 1, \]
both of these poles actually lie on the circle |z| = 1. Hence the system is marginally stable.

(c) In this case the poles of the transfer function are at the roots of z² − z + 0,5 = 0; in other words at the points
\[ z = \frac{1 \pm \sqrt{1 - 2}}{2} = \frac{1}{2} \pm j\frac{1}{2}. \]
Since
\[ \left|\frac{1}{2} \pm j\frac{1}{2}\right| = \sqrt{\left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^2} = \frac{1}{\sqrt{2}} < 1, \]
it is clear that both poles are inside the circle |z| = 1, and hence that the system is stable.

(d) We may factorise the characteristic equation to get
\[ 2z^3 - z^2 + 8z - 4 = 2z\left(z^2 + 4\right) - \left(z^2 + 4\right) = \left(z^2 + 4\right)(2z - 1) = (z - j2)(z + j2)(2z - 1). \]
The roots of this equation are therefore at z = ±j2 and z = 1/2. Since both ±j2 lie outside the circle |z| = 1, the system is unstable.
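In practice one often simply computes the poles numerically and inspects their moduli. A minimal sketch, assuming numpy is available, applied to the characteristic polynomials of parts (b), (c) and (d) above:

import numpy as np

def classify(char_poly, tol=1e-9):
    """char_poly: coefficients of Q(z), highest power first."""
    moduli = np.abs(np.roots(char_poly))
    if np.all(moduli < 1 - tol):
        return "stable"
    if np.all(moduli <= 1 + tol):
        return "marginally stable"
    return "unstable"

print(classify([1, -1, 1]))      # z^2 - z + 1         -> marginally stable
print(classify([1, -1, 0.5]))    # z^2 - z + 0,5       -> stable
print(classify([2, -1, 8, -4]))  # 2z^3 - z^2 + 8z - 4 -> unstable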
All four of the characteristic equations in the previous examples could easily be factorised, and hence in all four of the above cases it was possible to directly compute the roots of this equation and check their location. But what if we have a higher order system with a more complicated characteristic equation which cannot be factorised so readily? How do we check such systems for stability? In other words,
given a polynomial of the form
\[ Q(z) = a_n z^n + a_{n-1} z^{n-1} + \dots + a_0 \]
with an ≠ 0, is there a way of checking the location of the roots without actually factorising it? The good news is that several tests have been developed. Of these we are going to discuss the so-called Jury stability criterion introduced by E.I. Jury in 1964 in his book Theory and Application of the z-Transform Method. This criterion
gives necessary and sufficient conditions for the characteristic equation to have its
roots inside |z| = 1. The version presented below is a slightly modified one involving
fewer computations.
The Jury criterion works as follows:
Suppose we are given a polynomial of the form
\[ Q(z) = a_0 z^n + a_1 z^{n-1} + a_2 z^{n-2} + \dots + a_{n-1} z + a_n, \]
with real coefficients and with a0 > 0. (If we had a0 < 0, we could simply multiply
throughout by -1 to reduce to the case where a0 > 0.) Observe that in the above
polynomial we have numbered the coefficients so that their subscripts INCREASE
as the powers of z DECREASE. The reason why we do this will become clear later.
At the outset we can perform a simple preliminary test, by checking if
Q(1) > 0 and (−1)n Q(−1) > 0.
If these inequalities do not hold, there must be at least one root outside |z| = 1.
However if the inequalities DO hold, that still does not guarantee that all the roots
are inside |z| = 1. In such a case we have a lot more work to do before we can be
sure where the roots are. So if the inequalities do hold, the next thing we do is to
draw up the so-called Jury table shown below. The way this is done is as follows:
Each odd-numbered row is paired off with the even-numbered row immediately
following it. Notice also that each even-numbered row is basically the same as
the odd-numbered row before it, but with the terms written in reverse order. To
each pair of rows there is associated a constant βk numbered from n to 1. To
find the constant for a given pair of rows simply divide the last term of the top
row in the pair, by the first term of the same row. Each pair of rows is used to
compute the odd-numbered row immediately following it. So row 3 is computed
from rows 1 and 2, row 5 from rows 3 and 4, and so on. The way we do this is to

multiply the second row in any given pair by the constant for that pair, and then
simply subtract the result from the top row. So (Row 3) = (Row 1) − βn (Row 2),
(Row 5) = (Row 3) − βn−1 (Row 4), etc. This means that the bk ’s will be computed
from the ak ’s by the formula
bk = ak − βn an−k ,
the ck ’s from the bk ’s by the formula
ck = bk − βn−1 bn−k−1 ,
and so on. As you do this the pairs of rows will get progressively shorter, since at each step the last term cancels. Carry on in this way until you get to a row with only one entry.
Row  | z^n         z^{n-1}   z^{n-2}  ...  z^{n-k}   ...  z^2    z^1    z^0  |  β
 1   | a0          a1        a2       ...  ak        ...  an−2   an−1   an   |
 2   | an          an−1      an−2     ...  an−k      ...  a2     a1     a0   |  βn = an/a0
 3   | b0 = ∆1     b1        b2       ...  bk        ...  bn−2   bn−1        |
 4   | bn−1        bn−2      bn−3     ...  bn−k−1    ...  b1     b0          |  βn−1 = bn−1/b0
 5   | c0 = ∆2     c1        c2       ...  ck        ...  cn−2               |
 6   | cn−2        cn−3      cn−4     ...  cn−k−2    ...  c0                 |  βn−2 = cn−2/c0
 ⋮
2n−1 | r0 = ∆n−1   r1  |
 2n  | r1          r0  |  β1 = r1/r0
2n+1 | s0 = ∆n |

The reason we numbered the coefficients of the polynomial in the


way we did, was to ensure that we ended up with a uniform formula for
computing each of the odd-numbered rows shown above. Once the table
has been drawn up, set ∆1 = b0 , ∆2 = c0 , . . . , ∆n = s0 as shown above. If we have
that ∆k > 0 for every k = 1, 2, . . . , n, the roots of the polynomial will all be inside
|z| = 1! But in fact more is true.
Let
\[ Q(z) = a_0 z^n + a_1 z^{n-1} + a_2 z^{n-2} + \dots + a_{n-1} z + a_n \]
be a polynomial with a0 > 0. If all the ∆k's in the Jury table are non-zero, the roots of the polynomial will lie inside the circle |z| = 1 if and only if both the following two conditions are satisfied:
(i) Q(1) > 0 and (−1)^n Q(−1) > 0.
(ii) ∆i > 0 for each i = 1, 2, ..., n.
If indeed there are roots outside |z| = 1, the number of roots outside |z| = 1 is the same as the number of negative ∆k's.
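For experimentation the table-building recipe can be automated. The sketch below (Python; the function names are our own, and the degenerate case of a vanishing ∆ is not guarded against) reproduces the computations of example 4.4.10, which follows:

def jury_deltas(coeffs):
    """Return [Delta_1, ..., Delta_n] for Q(z) given highest power first."""
    row = [float(c) for c in coeffs]
    deltas = []
    while len(row) > 1:
        beta = row[-1] / row[0]
        # (next row) = (row) - beta * (row reversed); the last entry cancels.
        row = [a - beta * b for a, b in zip(row, row[::-1])][:-1]
        deltas.append(row[0])
    return deltas

def jury_stable(coeffs):
    n = len(coeffs) - 1
    q1 = sum(coeffs)                                              # Q(1)
    qm1 = sum(c * (-1) ** (n - k) for k, c in enumerate(coeffs))  # Q(-1)
    if not (q1 > 0 and (-1) ** n * qm1 > 0):
        return False        # preliminary test already fails
    return all(d > 0 for d in jury_deltas(coeffs))

print(jury_stable([1, 1/3, -1/4, -1/12]))   # True: all roots inside |z| = 1
print(jury_stable([1, -3, -1/4, 3/4]))      # False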
Example 4.4.10. Use the Jury criterion to determine which of the following
equations have all their roots inside |z| = 1. In cases where some roots are outside
|z| = 1, say how many roots are outside |z| = 1.
(a) Q(z) = z³ + (1/3)z² − (1/4)z − 1/12 = 0
(b) Q(z) = z³ − 3z² − (1/4)z + 3/4 = 0
Solution:
(a) Firstly note that
\[ Q(1) = 1 + \frac{1}{3} - \frac{1}{4} - \frac{1}{12} = 1 > 0 \]

and
\[ (-1)^3 Q(-1) = -1\left(-1 + \frac{1}{3} + \frac{1}{4} - \frac{1}{12}\right) = 0,5 > 0. \]
Thus the first condition is satisfied. We proceed to draw up the required table:

Row | z³          z²          z¹          z⁰     |  β
 1  | 1           1/3         −1/4        −1/12  |
 2  | −1/12       −1/4        1/3         1      |  β3 = −1/12
 3  | 143/144     5/16        −2/9               |
 4  | −2/9        5/16        143/144            |  β2 = (−2/9)/(143/144) = −32/143
 5  | 0,9433...   0,3824...                      |
 6  | 0,3824...   0,9433...                      |  β1 = (0,3824...)/(0,9433...) = 0,4054...
 7  | 0,7883...

From the above table it is clear that
\[ \Delta_1 = \frac{143}{144} > 0, \qquad \Delta_2 = 0,9433\dots > 0, \qquad \Delta_3 = 0,7883\dots > 0. \]
Thus both the conditions of the Jury stability criterion are satisfied, and hence all the roots must be inside |z| = 1.
Here the third row of the table was of course obtained by means of the formulas
\[ b_0 = a_0 - \beta_3 a_3 = 1 - \left(-\frac{1}{12}\right)\left(-\frac{1}{12}\right) = \frac{143}{144} \]
\[ b_1 = a_1 - \beta_3 a_2 = \frac{1}{3} - \left(-\frac{1}{12}\right)\left(-\frac{1}{4}\right) = \frac{5}{16} \]
\[ b_2 = a_2 - \beta_3 a_1 = -\frac{1}{4} - \left(-\frac{1}{12}\right)\left(\frac{1}{3}\right) = -\frac{2}{9} \]
the fifth row by
\[ c_0 = b_0 - \beta_2 b_2 = \frac{143}{144} - \left(-\frac{32}{143}\right)\left(-\frac{2}{9}\right) = 0,9433\dots \]
\[ c_1 = b_1 - \beta_2 b_1 = \frac{5}{16} - \left(-\frac{32}{143}\right)\left(\frac{5}{16}\right) = 0,3824\dots \]
and the last row by
\[ d_0 = c_0 - \beta_1 c_1 = 0,9433\dots - (0,4054\dots)(0,3824\dots) = 0,7883\dots \]

(b) Notice that here n = 3 with a0 = 1 > 0. However
\[ Q(1) = 1 - 3 - \frac{1}{4} + \frac{3}{4} = -1,5 < 0. \]
Thus the first condition is not satisfied, which means that at least one
of the roots is outside the circle |z| = 1. To see how many are outside
|z| = 1, we need to calculate the value of the ∆i ’s. We proceed to draw

up the required table:

Row | z³          z²          z¹        z⁰   |  β
 1  | 1           −3          −1/4      3/4  |
 2  | 3/4         −1/4        −3        1    |  β3 = 3/4
 3  | 7/16        −45/16      2              |
 4  | 2           −45/16      7/16           |  β2 = 2/(7/16) = 32/7
 5  | −975/112    1125/112                   |
 6  | 1125/112    −975/112                   |  β1 = (1125/112)/(−975/112) = −1125/975
 7  | 2,8846...
The sequence ∆1, ∆2, ∆3 therefore corresponds to 7/16, −975/112, 2,8846.... Since there is one negative term, only one root of the polynomial is outside |z| = 1.
As before the third row of the table was computed using the formulas
\[ b_0 = a_0 - \beta_3 a_3 = 1 - \left(\frac{3}{4}\right)\left(\frac{3}{4}\right) = \frac{7}{16} \]
\[ b_1 = a_1 - \beta_3 a_2 = -3 - \left(\frac{3}{4}\right)\left(-\frac{1}{4}\right) = -\frac{45}{16} \]
\[ b_2 = a_2 - \beta_3 a_1 = -\frac{1}{4} - \left(\frac{3}{4}\right)(-3) = 2 \]
the fifth row by
\[ c_0 = b_0 - \beta_2 b_2 = \frac{7}{16} - \left(\frac{32}{7}\right)(2) = -\frac{975}{112} \]
\[ c_1 = b_1 - \beta_2 b_1 = -\frac{45}{16} - \left(\frac{32}{7}\right)\left(-\frac{45}{16}\right) = \frac{1125}{112} \]
and the last row by
\[ d_0 = c_0 - \beta_1 c_1 = -\frac{975}{112} - \left(-\frac{1125}{975}\right)\left(\frac{1125}{112}\right) = 2,8846\dots \]
To see the reliability of this result notice that
\[ Q(z) = z^3 - 3z^2 - \frac{1}{4}z + \frac{3}{4} = 0 \]
can be factorised to get
\[ \left(z - \frac{1}{2}\right)\left(z + \frac{1}{2}\right)(z - 3) = 0, \]
from which it is clear that the roots are z = ±1/2 and z = 3. Of these ±1/2 are inside |z| = 1 and only 3 is outside.
Remark 4.4.11. There is an alternative to the Jury stability criterion. In a rather sneaky way we can actually use the Routh-Hurwitz criterion to test a transfer function G(z) for stability in the Z-transform sense. The way we do this is to use the following map from the z-plane to the p-plane:
\[ z \mapsto p = \frac{z+1}{z-1}. \]
This map has some very interesting properties (which we will not prove). Firstly it is its own inverse. (So z ↦ p = (z+1)/(z−1) is just the inverse of p ↦ z = (p+1)/(p−1).) Secondly it will map the interior of the circle |z| = 1 onto the left half-plane in the p-plane. So

if in our original transfer function G(z) we set z = (p + 1)/(p − 1), after some simplification we will get a new transfer function
\[ F(p) = G\left(\frac{p+1}{p-1}\right) \]
which will also be a quotient of polynomials. The poles that G(z) had inside the circle |z| = 1 in the z-plane will be mapped onto poles of F(p) that are in the left half of the p-plane. Thus G(z) is stable in the Z-transform sense if and only if F(p) is stable in the Laplace-transform sense. So if
\[ F(p) = G\left(\frac{p+1}{p-1}\right) \]
satisfies the Routh-Hurwitz criterion, G(z) will be stable in the Z-transform sense.
Consider for example the transfer function
\[ G(z) = \frac{(z-1)^3}{60z^4 - 92z^3 + 48z^2 - 8z}. \]
If we set z = (p + 1)/(p − 1), we get
\[ F(p) = G\left(\frac{p+1}{p-1}\right) = \frac{\left(\frac{p+1}{p-1} - 1\right)^3}{60\left(\frac{p+1}{p-1}\right)^4 - 92\left(\frac{p+1}{p-1}\right)^3 + 48\left(\frac{p+1}{p-1}\right)^2 - 8\left(\frac{p+1}{p-1}\right)} \]
\[ = \frac{\left[(p+1) - (p-1)\right]^3 (p-1)}{60(p+1)^4 - 92(p+1)^3(p-1) + 48(p+1)^2(p-1)^2 - 8(p+1)(p-1)^3} = \frac{p-1}{p^4 + 9p^3 + 33p^2 + 51p + 26}. \]
But in example 3.5.4 we used the Routh-Hurwitz criterion to show that all the roots of p⁴ + 9p³ + 33p² + 51p + 26 are in the left half of the p-plane. Thus F(p) is stable in the Laplace transform sense. Therefore G(z) is stable in the Z-transform sense.
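The substitution in this remark is routine but tedious by hand; a computer algebra system does it directly. A minimal sketch, assuming sympy is available:

import sympy as sp

z, p = sp.symbols("z p")
G = (z - 1) ** 3 / (60 * z**4 - 92 * z**3 + 48 * z**2 - 8 * z)
F = sp.cancel(G.subs(z, (p + 1) / (p - 1)))   # the z -> p substitution
print(F)   # (p - 1)/(p**4 + 9*p**3 + 33*p**2 + 51*p + 26)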
4.4.8. THE RELATION BETWEEN LAPLACE AND Z-TRANSFORMS
REVISITED. In section 4.1.5 we investigated the relationship between the use
of Laplace transforms in studying continuous-time systems, and Z-transforms in
studying discrete-time systems. There we showed that given a continuous function f which is being sampled with a sampling interval of T units of time, then on setting z = e^{pT}, the Z-transform of the sequence {f(0), f(T), f(2T), ..., f(nT), ...} agrees with the Laplace transform of the so-called ideal sampler
\[ g(t) = \sum_{n=0}^{\infty} f(nT)\,\delta(t - nT). \]

With that as background, let us finally compare the criteria for stability in
the Laplace case, with the criteria for stability in the Z-transform case. As we
saw in module 3 a linear continuous-time system characterised by some transfer
function, is stable if the poles of the transfer function are in the left half of the
p–plane. (That is that region of the p-plane for which <(p) < 0.) By contrast a
discrete–time system described by some transfer function is stable if all the poles
of the transfer function lie in the region |z| < 1. Lets take a closer look at the
change of variable z = epT that we used to identify the Laplace transform of the
ideal sampler, with the Z-transform of the sequence of sampled values. Suppose

that p = q + jr where q and r are real variables. Then
\[ z = e^{pT} = e^{(q+jr)T} = e^{qT}\cdot e^{jrT} = e^{qT}(\cos rT + j\sin rT). \]
Thus
\[ |z|^2 = \left(e^{qT}\right)^2\left(\cos^2 rT + \sin^2 rT\right) = e^{2qT}. \]
But then
\[ |z| < 1 \iff e^{qT} < 1 \iff qT < \ln 1 = 0 \iff \Re(p) = q < 0. \]
The mapping z = epT will therefore map the left half of the p–plane onto the
interior of the circle |z| < 1 in the z–plane. Thus if the Laplace transform of the
ideal sampler has singularities in left-half of the p-plane, the Z-transform of the
sequence of sampled values will have its singularities inside the circle |z| = 1 in the
z-plane. This behaviour is clearly consistent with the apparently different criteria
for stability in the continuous-time and discrete-time cases.
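A quick numeric illustration of this correspondence (the sampling interval T and the sample points p below are arbitrary choices):

import cmath

T = 0.1
for p in [complex(-2.0, 5.0), complex(-0.5, -30.0), complex(1.0, 3.0)]:
    z = cmath.exp(p * T)
    # |z| < 1 exactly when Re(p) < 0
    print(p, abs(z), abs(z) < 1)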
EXERCISE 4.4.
1. Draw a block diagram to illustrate a time–domain realisation of the system with transfer function
\[ G(z) = \frac{z}{z^2 + 0,3z + 0,02}. \]
Also find a second structure that implements the system. (Hint: Use partial fractions.)

2. Find the transfer functions of each of the following discrete–time systems,


given that the system is initially in its quiescent state.
(a) yk+2 − 3yk+1 + 2yk = uk
(b) yk+2 − 3yk+1 + 2yk = uk+1 − uk
 
\[ \left[\frac{1}{z^2 - 3z + 2}; \qquad \frac{z-1}{z^2 - 3z + 2}\right] \]

3. Draw a block diagram representing the discrete–time system


yk+2 + 0, 5yk+1 + 0, 25yk = uk
Hence, find a block diagram representation of the system
yk+2 + 0, 5yk+1 + 0, 25yk = uk − 0, 6uk+1

Figure 4.19. Answer for question 3



4. Find the impulse response for the systems with Z–transfer function
(a) G(z) = z/(8z² + 6z + 1)
(b) G(z) = (5z² − 12z)/(z² − 6z + 8)
\[ \left[\left\{\frac{1}{2}\left[\left(-\frac{1}{4}\right)^k - \left(-\frac{1}{2}\right)^k\right]\right\}; \qquad \left\{4^{k+1} + 2^k\right\}\right] \]

5. Obtain the impulse response for the systems of 2(a) and (b).
\[ \left[\begin{cases} 0 & k = 0 \\ 2^{k-1} - 1 & k > 0 \end{cases}; \qquad \begin{cases} 0 & k = 0 \\ 2^{k-1} & k > 0 \end{cases}\right] \]
6. Which of the following systems are stable?
(a) 9yk+2 + 9yk+1 + 25yk = uk
(b) 2yk+2 + 3yk+1 − yk = uk
(c) 4yk+2 − 3yk+1 − yk = uk+1 − 2uk
[unstable, unstable, marginally stable]

7. A sampled data system described by the difference equation yn+1 − yn =


un is controlled by making the input un proportional to the previous error according to
\[ u_n = K\left(\left(\frac{1}{2}\right)^n - y_{n-1}\right) \qquad \text{(where } K \text{ is a positive gain).} \]
Determine the range of values of K for which the system is stable. Taking K = 2/9, determine the system response given that y0 = y1 = 0.
\[ \left[y_n = -4\left(\frac{1}{2}\right)^n + 2\left(\frac{1}{3}\right)^n + 2\left(\frac{2}{3}\right)^n\right] \]


8. Show that the system yn+2 + 2yn+1 + 2yn = un+1 has a transfer function
\[ F(z) = \frac{z}{z^2 + 2z + 2}. \]
Show that the poles of the system are at z = −1 ± j. Hence, show that the impulse response of the system is given by
\[ f_n = 2^{n/2}\sin\frac{3n\pi}{4}. \]

9. Use the convolution formula in theorem 4.4.6 to find the response of the systems with transfer functions as shown below, to the unit step input {hk} = {1, 1, 1, ...}. You may assume that all starting values are zero.
(a) G(z) = z/(z + 1/2)
(b) G(z) = z/(z − 1/2)
\[ \left[\{y_k\} = \left\{\frac{1}{3}\left[2 + \left(-\frac{1}{2}\right)^k\right]\right\}; \qquad \{y_k\} = \left\{2 - \left(\frac{1}{2}\right)^k\right\}\right] \]

UNIT 5: STATE–SPACE EQUATIONS


4.5.1. OBJECTIVE. To introduce the state space approach for the discrete-
time case, and develop algorithms for solving state space equations in this setting.

4.5.2. OUTCOMES. At the end of this unit the student should


• Know what is meant by a “state space equation”;
• Know how SISO systems may be written in state space form. In particular, given a transfer function of a system, know how to obtain an equivalent state space formulation of that same system;
• Be able to solve state space equations using the Z-transform method.

4.5.3. THE STATE–SPACE MODEL FOR DISCRETE–TIME SYS-


TEMS. As is the case with continuous time systems, in discrete time systems one
may sometimes also have to deal with more complicated systems with several in-
puts and outputs, governed by not just one, but rather a whole system of difference
equations. Such discrete-time MIMO (multi–input, multi–output) systems are better written in matrix form. Suppose we have such a MIMO system with m input sequences {u1(k)}, {u2(k)}, ..., {um(k)} and r outputs {y1(k)}, {y2(k)}, ..., {yr(k)}.
In state–space form the dependence of the outputs on the inputs is described by a
matrix equation of the form

\[ (4.5.1) \qquad \begin{aligned} \mathbf{x}(k+1) &= A\mathbf{x}(k) + B\mathbf{u}(k) \\ \mathbf{y}(k) &= C\mathbf{x}(k) + D\mathbf{u}(k). \end{aligned} \]
Here A is an n × n matrix, B an n × m matrix, and C and D are respectively r × n
and r × m matrices. The column matrices u(k), x(k) and y(k) are of the form
\[ \mathbf{x}(k) = \begin{bmatrix} x_1(k) \\ x_2(k) \\ \vdots \\ x_{n-1}(k) \\ x_n(k) \end{bmatrix}, \qquad \mathbf{u}(k) = \begin{bmatrix} u_1(k) \\ u_2(k) \\ \vdots \\ u_{m-1}(k) \\ u_m(k) \end{bmatrix}, \qquad \mathbf{y}(k) = \begin{bmatrix} y_1(k) \\ y_2(k) \\ \vdots \\ y_{r-1}(k) \\ y_r(k) \end{bmatrix}. \]
The vector x(k) is the so-called state vector, whilst u(k) and y(k) are respectively the input and output sequences written in column vector form. In the set of equations 4.5.1, the first equation describes the evolution of the system as k increases, whereas the second describes the dependence of the outputs on the state vector and the inputs. The equation x(k + 1) = Ax(k) + Bu(k) is of course nothing but a whole system of difference equations of the form
\[ x_j(k+1) = a_{j1}x_1(k) + a_{j2}x_2(k) + \dots + a_{jn}x_n(k) + b_{j1}u_1(k) + b_{j2}u_2(k) + \dots + b_{jm}u_m(k) \]
written in matrix form.
Having defined the state space approach for discrete systems, two challenges
face us:
• Firstly we need to see how for SISO systems the difference equation and
transfer function approaches may be reformulated in state space form.
• Secondly we need to develop algorithms for solving state space equations.
For both of these challenges we will follow essentially the same approach as we
did in the continuous-time case. Because of this similarity with the work done in
module 3, we will not go into too much detail here.
Let's first see how a simple difference equation may be written in state space form. Suppose that in a discrete-time system, the sequence {yk} is the response

to the input sequence {uk }, with the dependence of the output on the input being
governed by the equation
\[ (4.5.2) \qquad y_{k+n} + a_{n-1} y_{k+n-1} + \dots + a_1 y_{k+1} + a_0 y_k = b_0 u_k. \]
As in the case of continuous–time systems, we introduce state variables xi (k) (i =
1, 2, . . . , n) by setting
x1 (k) = yk , x2 (k) = yk+1 , ..., xn (k) = yk+n−1
Using these variables, the above equations may then be written as
\[ \begin{aligned} x_1(k+1) &= y_{k+1} = x_2(k) \\ x_2(k+1) &= y_{k+2} = x_3(k) \\ &\;\;\vdots \\ x_{n-1}(k+1) &= y_{k+n-1} = x_n(k) \\ x_n(k+1) &= y_{k+n} = -a_{n-1} y_{k+n-1} - \dots - a_1 y_{k+1} - a_0 y_k + b_0 u_k \\ &= -a_{n-1} x_n(k) - \dots - a_1 x_2(k) - a_0 x_1(k) + b_0 u_k. \end{aligned} \]
In matrix form these equations then become
\[ \mathbf{x}(k+1) = A\mathbf{x}(k) + Bu_k \]
where
\[ \mathbf{x}(k) = \begin{bmatrix} x_1(k) \\ x_2(k) \\ \vdots \\ x_n(k) \end{bmatrix}, \qquad A = \begin{bmatrix} 0 & 1 & 0 & \dots & 0 \\ 0 & 0 & 1 & \dots & 0 \\ \vdots & & & & \vdots \\ 0 & 0 & 0 & \dots & 1 \\ -a_0 & -a_1 & -a_2 & \dots & -a_{n-1} \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ b_0 \end{bmatrix}. \]
The output {yk} can be obtained from yk = x1(k). So in state space form equation 4.5.2 becomes
\[ (4.5.3) \qquad \begin{aligned} \mathbf{x}(k+1) &= A\mathbf{x}(k) + Bu_k \\ y_k &= C\mathbf{x}(k) \end{aligned} \]
where A and B are as above, and
\[ C = [\,1 \;\; 0 \;\; 0 \;\; \dots \;\; 0\,]. \]
For equations with a more complex right-hand side, we pass to the transfer
function model, and then argue as in the continuous-time case to obtain the follow-
ing theorem:
Theorem 4.5.1 (First Kalman Form revisited). Suppose we are given a discrete-time system initially in a quiescent state, described by the transfer function
\[ G(z) = \frac{b_m z^m + b_{m-1} z^{m-1} + \dots + b_1 z + b_0}{z^n + a_{n-1} z^{n-1} + \dots + a_1 z + a_0} \qquad m < n. \]
Then with all starting values zero, the state space equation
\[ (4.5.4) \qquad \begin{aligned} \mathbf{x}(k+1) &= A\mathbf{x}(k) + Bu_k \\ y_k &= C\mathbf{x}(k) + Du_k \end{aligned} \]
where
\[ A = \begin{bmatrix} 0 & 1 & 0 & \dots & 0 & 0 \\ 0 & 0 & 1 & \dots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \dots & 0 & 1 \\ -a_0 & -a_1 & -a_2 & \dots & -a_{n-2} & -a_{n-1} \end{bmatrix}, \qquad B = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}, \]
\[ C = [\,b_0 \;\; b_1 \;\; \dots \;\; b_m \;\; 0 \;\; \dots \;\; 0\,] \quad \text{and} \quad D = 0 \]
will yield exactly the same transfer function. (Here yk and uk are not written in bold since each corresponds to the general term of a single sequence rather than a column vector.)
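The construction in the theorem is entirely mechanical, and can be captured in a short helper. The sketch below (Python; the function name and the coefficient ordering are our own conventions) builds the matrices for the transfer function G(z) = (z − 1)/(z² + 5z + 4) treated in example 4.5.3 below:

def first_kalman_form(a, b):
    """a = [a0, ..., a_{n-1}] (monic denominator, constant term first),
    b = [b0, ..., bm] (numerator). Returns (A, B, C, D) as nested lists."""
    n = len(a)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        A[i][i + 1] = 1.0                 # ones on the super-diagonal
    A[n - 1] = [-float(c) for c in a]     # last row: -a0, ..., -a_{n-1}
    B = [[0.0] for _ in range(n - 1)] + [[1.0]]
    C = [list(map(float, b)) + [0.0] * (n - len(b))]
    return A, B, C, 0.0

A, B, C, D = first_kalman_form([4, 5], [-1, 1])
print(A)   # [[0.0, 1.0], [-4.0, -5.0]]
print(C)   # [[-1.0, 1.0]]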

Example 4.5.2. Determine the state–space representation of the system char-


acterised by the difference equation

yk+2 − 5yk+1 + 6yk = 5uk .

Solution: Following the hints in the discussion above, we set
\[ x_1(k) = y_k, \qquad x_2(k) = y_{k+1}. \]
Then
\[ (4.5.5) \qquad x_1(k+1) = y_{k+1} = x_2(k) \]
and
\[ x_2(k+1) = y_{k+2}. \]
Using the fact that the difference equation may be written as yk+2 = 5yk+1 − 6yk + 5uk, we may now conclude from the above equations that
\[ (4.5.6) \qquad x_2(k+1) = 5x_2(k) - 6x_1(k) + 5u_k. \]
If now we combine equations 4.5.5 and 4.5.6 with the fact that yk = x1(k), then in matrix form this may be written as
\[ \begin{bmatrix} x_1(k+1) \\ x_2(k+1) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -6 & 5 \end{bmatrix}\begin{bmatrix} x_1(k) \\ x_2(k) \end{bmatrix} + \begin{bmatrix} 0 \\ 5 \end{bmatrix} u_k \]
\[ y_k = [\,1 \;\; 0\,]\begin{bmatrix} x_1(k) \\ x_2(k) \end{bmatrix}. \]


Example 4.5.3. Determine the state–space representation of the system char-


acterised by the transfer function
\[ G(z) = \frac{z-1}{z^2 + 5z + 4}. \]
Solution: The block diagram of this system is shown in the figure below.

Figure 4.20

In this case we use the theorem regarding the First Kalman Form. According to this theorem, a state space formulation will be given by
\[ \begin{bmatrix} x_1(k+1) \\ x_2(k+1) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -4 & -5 \end{bmatrix}\begin{bmatrix} x_1(k) \\ x_2(k) \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(k) \]
\[ y_k = [\,-1 \;\; 1\,]\begin{bmatrix} x_1(k) \\ x_2(k) \end{bmatrix}. \]
It is in fact possible to deduce the above state space formulation directly from the block diagram (so without using the theorem). To do this notice that we can choose x1(k) as the output of the second delay block and x2(k) as the output of the first delay block. From the action of the second delay block we conclude that
\[ x_1(k+1) = x_2(k). \]
Since the variable on any branch leaving the first error wheel must be the same as the variable at this node, we also have that
\[ x_2(k+1) = -5x_2(k) - 4x_1(k) + u(k). \]
Applying the same reasoning to the second error wheel yields
\[ y_k = -x_1(k) + x_2(k). \]
If now we write these equations in matrix form, we again get the above state space formulation of the system.
4.5.4. SOLUTION OF THE STATE–SPACE EQUATION. Before com-
mencing this section the student is advised to revise sections 3.6.4 and 3.6.5 of
module 3. As before there are essentially two ways in which we can try to solve a
state space equation: we can either use the theory of matrix-valued functions, or
we can use the Z-transform approach. We will look at each of these in turn.

4.5.4.1. The matrix-valued function approach. Consider a state space equation


of the form
x (k + 1) = Ax (k) + Bu (k)
y(k) = Cx (k) + Du (k) .
It can be shown that the solution to
\[ \mathbf{x}(k+1) = A\mathbf{x}(k) + B\mathbf{u}(k) \]
is given by
\[ \mathbf{x}(k) = A^k\mathbf{x}(0) + \sum_{j=0}^{k-1} A^{k-j-1}B\mathbf{u}(j). \]
(Here the first term on the right hand side plays the role of the complementary
function, and the second term the role of the particular integral.) Once this solution
has been computed, one then simply substitutes it into the equation
y(k) = Cx (k) + Du (k)
to get the output. From the above formula for x(k), it is clear that we need a
formula for the value of the matrix function Ak , to be able to actually compute the
value of x(k).
For this we can use definition 3.6.6 of module 3. To find A^k we simply apply this definition to the function t ↦ t^k. If A is an n × n matrix with m distinct eigenvalues λ1, λ2, ..., λm (with each λj repeated rj times), it follows from this definition that we will have
\[ A^k = \alpha_0(k)I + \alpha_1(k)A + \alpha_2(k)A^2 + \dots + \alpha_{n-1}(k)A^{n-1} \]
where for each k,
\[ p_k(t) = \alpha_0(k) + \alpha_1(k)t + \alpha_2(k)t^2 + \dots + \alpha_{n-1}(k)t^{n-1} \]
is that polynomial for which the coefficients αi(k) satisfy
\[ (4.5.7) \qquad (\lambda_j)^k = p_k(\lambda_j) \quad \text{and} \quad k(k-1)\dots(k-s+1)(\lambda_j)^{k-s} = p_k^{(s)}(\lambda_j) \quad \text{for all } 1 \le s \le r_j - 1,\; 1 \le j \le m. \]
Of course if the eigenvalues are not repeated, the second of these conditions does not apply.
We will briefly demonstrate this technique for the homogeneous (unforced)
system
x (k + 1) = Ax (k) ,
before going on to the Z-transform approach. Here {uk } = 0, so the solution will
be of the form
x (k) = Ak x (0) .
Example 4.5.4. Solve the state–space equation
\[ \mathbf{x}(k+1) = \begin{bmatrix} 0 & 1 \\ -4 & -5 \end{bmatrix}\mathbf{x}(k) \]
subject to x1(0) = x2(0) = 1.
Solution: We need to find a formula for A^k. To this end we first find the eigenvalues of A. Let
\[ |\lambda I - A| = \begin{vmatrix} \lambda & -1 \\ 4 & \lambda + 5 \end{vmatrix} = 0. \]
This yields λ(λ + 5) + 4 = 0, which may be rewritten as (λ + 1)(λ + 4) = 0. The eigenvalues are therefore λ1 = −1 and λ2 = −4. From equation 4.5.7 above, for a second order system like this one with no repeated eigenvalues, we need to find α0(k) and α1(k) so that
\[ \lambda_j^k = \alpha_0(k) + \alpha_1(k)\lambda_j \qquad (j = 1, 2); \]
that is
\[ (-1)^k = \alpha_0(k) - \alpha_1(k) \]
\[ (-4)^k = \alpha_0(k) - 4\alpha_1(k). \]
If we subtract the second equation from the first we get (−1)^k − (−4)^k = 3α1(k), from which it follows that
\[ \alpha_1(k) = \frac{1}{3}\left[(-1)^k - (-4)^k\right]. \]
Now notice that the first equation may be rewritten as α0(k) = (−1)^k + α1(k). If we insert the above formula for α1(k) into this equation, we get
\[ \alpha_0(k) = (-1)^k + \frac{1}{3}\left[(-1)^k - (-4)^k\right] = \frac{1}{3}\left[4(-1)^k - (-4)^k\right]. \]
Therefore
\[ A^k = \alpha_0(k)I + \alpha_1(k)A = \frac{1}{3}\left[4(-1)^k - (-4)^k\right]\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + \frac{1}{3}\left[(-1)^k - (-4)^k\right]\begin{bmatrix} 0 & 1 \\ -4 & -5 \end{bmatrix} \]
\[ = \frac{1}{3}\begin{bmatrix} 4(-1)^k - (-4)^k & (-1)^k - (-4)^k \\ -4(-1)^k + 4(-4)^k & -(-1)^k + 4(-4)^k \end{bmatrix}. \]

Thus the required solution is
\[ \mathbf{x}(k) = A^k\mathbf{x}(0) = \frac{1}{3}\begin{bmatrix} 4(-1)^k - (-4)^k & (-1)^k - (-4)^k \\ -4(-1)^k + 4(-4)^k & -(-1)^k + 4(-4)^k \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \frac{1}{3}\begin{bmatrix} 5(-1)^k - 2(-4)^k \\ -5(-1)^k + 8(-4)^k \end{bmatrix}. \]

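The closed form for A^k can be checked numerically; a minimal sketch assuming numpy is available:

import numpy as np

A = np.array([[0.0, 1.0], [-4.0, -5.0]])

def A_power(k):
    """Closed form for A^k from the example above."""
    m1, m4 = (-1.0) ** k, (-4.0) ** k
    return (1.0 / 3.0) * np.array([[4 * m1 - m4, m1 - m4],
                                   [-4 * m1 + 4 * m4, -m1 + 4 * m4]])

for k in range(6):
    assert np.allclose(np.linalg.matrix_power(A, k), A_power(k))

print(A_power(3) @ np.array([1.0, 1.0]))   # x(3)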

4.5.4.2. The Z-transform approach. In a manner similar to the Laplace trans-
form method for solving continuous–time state–space equations, the Z–transform
approach may be used to solve discrete–time state-space equations. Consider for
example the equation
\[ \begin{aligned} \mathbf{x}(k+1) &= A\mathbf{x}(k) + B\mathbf{u}(k) \\ \mathbf{y}(k) &= C\mathbf{x}(k) + D\mathbf{u}(k). \end{aligned} \]
The key task is to find x(k). Once this is known, it is a simple matter to compute the output from the equation y(k) = Cx(k) + Du(k). To find x(k), take the Z-transform of
\[ \mathbf{x}(k+1) = A\mathbf{x}(k) + B\mathbf{u}(k) \]
to get
\[ z\mathbf{X}(z) - z\mathbf{x}(0) = A\mathbf{X}(z) + B\mathbf{U}(z). \]
Thus we have that
\[ (zI - A)\mathbf{X}(z) = z\mathbf{x}(0) + B\mathbf{U}(z), \]
whence
\[ \mathbf{X}(z) = (zI - A)^{-1}\left[z\mathbf{x}(0) + B\mathbf{U}(z)\right]. \]
We may therefore use the above formula to compute Z({x(k)}) = X(z), and then simply take inverse transforms of the result to find x(k).
Example 4.5.5. Using the Z–transform method, determine x(k) for the system described by
\[ \begin{aligned} x_1(k+1) &= 2x_1(k) + 5x_2(k) + u_k \\ x_2(k+1) &= -3x_1(k) - 6x_2(k) + u_k \end{aligned} \qquad k \ge 0 \]
subject to x1(0) = 1 and x2(0) = −1, where uk is the unit step function
\[ u_k = \begin{cases} 0 & k < 0 \\ 1 & k \ge 0. \end{cases} \]
Solution: In matrix form the state–space equation becomes
\[ \mathbf{x}(k+1) = \begin{bmatrix} 2 & 5 \\ -3 & -6 \end{bmatrix}\begin{bmatrix} x_1(k) \\ x_2(k) \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \end{bmatrix} u_k = A\mathbf{x}(k) + Bu_k. \]
Taking Z–transforms yields
\[ z\mathbf{X}(z) - z\mathbf{x}(0) = A\mathbf{X}(z) + BU(z). \]
By the discussion preceding this example, it then follows that
\[ \mathbf{X}(z) = (zI - A)^{-1}z\mathbf{x}(0) + (zI - A)^{-1}BU(z). \]

We proceed to compute the inverse of
\[ zI - A = \begin{bmatrix} z-2 & -5 \\ 3 & z+6 \end{bmatrix}. \]
Notice firstly that
\[ \begin{vmatrix} z-2 & -5 \\ 3 & z+6 \end{vmatrix} = (z-2)(z+6) + 15 = z^2 + 4z + 3 = (z+1)(z+3). \]
The adjoint of (zI − A) is given by
\[ \operatorname{adj}(zI - A) = \begin{bmatrix} z+6 & 5 \\ -3 & z-2 \end{bmatrix}. \]
Thus
\[ (zI - A)^{-1} = \frac{1}{(z+1)(z+3)}\begin{bmatrix} z+6 & 5 \\ -3 & z-2 \end{bmatrix}. \]

But then
\[ z(zI - A)^{-1}\mathbf{x}(0) = \frac{z}{(z+1)(z+3)}\begin{bmatrix} z+6 & 5 \\ -3 & z-2 \end{bmatrix}\begin{bmatrix} 1 \\ -1 \end{bmatrix} = \frac{z}{(z+1)(z+3)}\begin{bmatrix} z+1 \\ -(z+1) \end{bmatrix} \]
\[ (4.5.8) \qquad = \begin{bmatrix} \dfrac{z}{z+3} \\[2mm] \dfrac{-z}{z+3} \end{bmatrix}. \]
Next notice that the Z-transform of {uk} = {1, 1, 1, ...} (k ≥ 0) is U(z) = z/(z − 1). Thus
\[ (zI - A)^{-1}BU(z) = \frac{1}{(z+1)(z+3)}\begin{bmatrix} z+6 & 5 \\ -3 & z-2 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix}\frac{z}{z-1} = \frac{z}{(z+1)(z+3)(z-1)}\begin{bmatrix} z+11 \\ z-5 \end{bmatrix} \]
\[ (4.5.9) \qquad = z\begin{bmatrix} -\dfrac{5}{2}\,\dfrac{1}{z+1} + \dfrac{1}{z+3} + \dfrac{3}{2}\,\dfrac{1}{z-1} \\[2mm] \dfrac{3}{2}\,\dfrac{1}{z+1} - \dfrac{1}{z+3} - \dfrac{1}{2}\,\dfrac{1}{z-1} \end{bmatrix}. \]

If now we add equations 4.5.8 and 4.5.9, we get
\[ \mathbf{X}(z) = (zI - A)^{-1}z\mathbf{x}(0) + (zI - A)^{-1}BU(z) = \begin{bmatrix} -\dfrac{5}{2}\,\dfrac{z}{z+1} + \dfrac{2z}{z+3} + \dfrac{3}{2}\,\dfrac{z}{z-1} \\[2mm] \dfrac{3}{2}\,\dfrac{z}{z+1} - \dfrac{2z}{z+3} - \dfrac{1}{2}\,\dfrac{z}{z-1} \end{bmatrix}. \]

Taking inverse Z-transforms yields
\[ \mathbf{x}(k) = \begin{bmatrix} -\dfrac{5}{2}(-1)^k + 2(-3)^k + \dfrac{3}{2} \\[2mm] \dfrac{3}{2}(-1)^k - 2(-3)^k - \dfrac{1}{2} \end{bmatrix}. \]

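As a final informal check, the closed form for x(k) can be compared with direct iteration of the state equation; a minimal sketch assuming numpy is available:

import numpy as np

A = np.array([[2.0, 5.0], [-3.0, -6.0]])
B = np.array([1.0, 1.0])
x = np.array([1.0, -1.0])              # x(0)

for k in range(8):
    closed = np.array([-2.5 * (-1.0) ** k + 2 * (-3.0) ** k + 1.5,
                        1.5 * (-1.0) ** k - 2 * (-3.0) ** k - 0.5])
    assert np.allclose(x, closed)
    x = A @ x + B                      # u_k = 1 for every k >= 0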
EXERCISE 4.5.
1. (a) Express the difference equation yk+2 − yk+1 − yk = 0 in state–space
form by taking
x1 (k) = yk , x2 (k) = yk+1
(b) Obtain a solution for yk given that y0 = 0 and y1 = 1. Hence show that
\[ \lim_{k \to \infty} \frac{y_{k+1}}{y_k} = \frac{1}{2}\left(1 + \sqrt{5}\right). \]
2. (a) Express the difference equation
yk+2 − 5yk+1 + 6yk = 5
in state–space form by taking x1 (k) = yk , x2 (k) = yk+1 .
(b) Obtain a solution for yk given that y0 = 0 and y1 = 1 by using
(i) the matrix-valued function approach;
(ii) the Z–transform approach.
\[ \left[y_k = \frac{5}{2} - 6\left(2^k\right) + \frac{7}{2}\left(3^k\right)\right] \]
  
APPENDIX A

MATRIX THEORY

A.1. OBJECTIVE
The objective of this unit is to revise work done previously on matrices so as to
equip the student with the necessary knowledge to cope with unit 8 on state–space
equations.
Note: The student will not be examined on this unit.

A.2. BASIC CONCEPTS
A matrix is a rectangular array of elements $a_{ij}$, where the subscript “ij” indicates the
element from the i-th row and the j-th column. These elements can for example be
taken from a given set of numbers, functions or operators. In expanded notation
such a matrix will commonly be written in the form
\[
A = \begin{bmatrix}
a_{11} & a_{12} & \dots & a_{1n} \\
a_{21} & a_{22} & \dots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \dots & a_{mn}
\end{bmatrix}
\]

Rows are therefore numbered from top to bottom, and columns from left to right.
Here the elements ak1 , ak2 , ak3 , . . . , akn form the k-th row and a1k , a2k , a3k , . . . , amk
the k-th column. For a general matrix of the above type, we will often simply write
A = [aij ]. Matrices can of course have different sizes depending on the number of
rows and columns they have. To be able to give meaning to the idea of the “size”
of a matrix, we introduce the notion of order. The order of a matrix depends on
its number of rows and columns. A matrix of the above type with m rows and n
columns is said to be a matrix of order m × n, or just an m × n matrix. Consider
for example the matrix A where
\[
A = \begin{bmatrix}
1 & 4 & 6 & 8 \\
-9 & 2 & 5 & -6 \\
-1 & 7 & 3 & 7
\end{bmatrix}
\]
(columns numbered 1 to 4 from the left, rows numbered 1 to 3 from the top).
Here 5 is the element a23 in the second row and third column of the matrix.
This particular matrix is of order 3 × 4 (3 by 4).
Two matrices are said to be equal if they are of the same order and all the
corresponding elements are equal to each other.
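Most numerical software follows exactly this row–column convention, except that indices usually start at 0 rather than 1. A quick illustration (a sketch assuming Python with numpy):

```python
import numpy as np

A = np.array([[ 1, 4, 6,  8],
              [-9, 2, 5, -6],
              [-1, 7, 3,  7]])

print(A.shape)   # (3, 4): a matrix of order 3 x 4
print(A[1, 2])   # 5, i.e. the element a_23 (numpy counts rows and columns from 0)
```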
A.3. TYPES OF MATRICES
The transpose $A^t$ of a matrix A is defined to be the matrix obtained from A
by interchanging the rows and columns. For example if A is the matrix
\[
A = \begin{bmatrix}
1 & 4 & 6 & 8 \\
-9 & 2 & 5 & -6 \\
-1 & 7 & 3 & 7
\end{bmatrix}
\]
then the transposed matrix $A^t$ is
\[
A^t = \begin{bmatrix}
1 & -9 & -1 \\
4 & 2 & 7 \\
6 & 5 & 3 \\
8 & -6 & 7
\end{bmatrix}
\]
A matrix consisting of a single row is called a row matrix, and one consisting of a
single column, a column matrix. For example
\[
\begin{bmatrix} 3 & 5 & -4 & 2 \end{bmatrix}
\]
is a row matrix, and
\[
\begin{bmatrix} 6 \\ -9 \\ 8 \end{bmatrix}
\]
a column matrix.
When all the elements of a matrix are zero, it is called a null matrix. A null
matrix of order m × n will be denoted by $0_{mn}$, or simply just 0 if there is no danger
of confusion. For example
\[
\begin{bmatrix} 0&0&0&0 \\ 0&0&0&0 \\ 0&0&0&0 \end{bmatrix} = 0_{34} = 0, \qquad
\begin{bmatrix} 0&0&0&0 \\ 0&0&0&0 \\ 0&0&0&0 \\ 0&0&0&0 \end{bmatrix} = 0_{44} = 0, \qquad
\begin{bmatrix} 0&0&0&0 \end{bmatrix} = 0_{14} = 0, \text{ etc.}
\]
A.3.1. Square matrices. A matrix is said to be a square matrix if it has the
same number of rows and columns (i.e. m = n). When speaking of the order of a
square matrix A with n rows and n columns, we will sometimes just say that A is
a square matrix of order n (rather than n × n).
The diagonal elements of a square matrix A of order n are the elements $a_{ij}$ for
which i = j, namely $a_{11}, a_{22}, \dots, a_{nn}$. If in fact $a_{ij} = 0$ whenever $i \neq j$, we say that
A is a diagonal matrix. For example, the following matrix is a diagonal matrix of
order 3:
\[
A = \begin{bmatrix} 5 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 7 \end{bmatrix}
\]
If we have a square matrix for which all the elements below the diagonal are zero, we
say that the matrix is upper–triangular. If on the other hand all the elements above
the diagonal are zero, we say that the matrix is lower–triangular. For example
\[
\begin{bmatrix} 5 & 0 & 3 \\ 0 & 4 & 0 \\ 0 & 0 & 7 \end{bmatrix} \text{ is upper–triangular, and }
\begin{bmatrix} 8 & 0 & 0 \\ -4 & 4 & 0 \\ 7 & 2 & 7 \end{bmatrix} \text{ is lower–triangular.}
\]
A diagonal matrix for which all the diagonal elements are equal to 1 is called an
identity matrix. We will use the notation $I_n$ for an identity matrix of order n. If
there is no danger of confusion, we will often simply write I. For example
\[
I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad
I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]
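The special matrices of this section all have direct counterparts in numerical libraries. A short sketch (again assuming Python with numpy) for the examples above:

```python
import numpy as np

A = np.array([[ 1, 4, 6,  8],
              [-9, 2, 5, -6],
              [-1, 7, 3,  7]])

print(A.T)                 # the transpose A^t
print(np.zeros((3, 4)))    # the null matrix 0_34
print(np.eye(3))           # the identity matrix I_3
print(np.diag([5, 4, 7]))  # the diagonal matrix of the example

U = np.array([[5, 0, 3],
              [0, 4, 0],
              [0, 0, 7]])
print(np.allclose(U, np.triu(U)))  # True: no nonzero entries below the diagonal
```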
A.4. DETERMINANTS

The determinant of the square matrix
\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix}
\]
is defined to be
\[
\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc.
\]
To evaluate determinants of higher–order square matrices, we introduce the
concept of minors and cofactors. For an n × n matrix A, the minor $M_{ij}$ of an entry
$a_{ij}$ in the matrix A is the determinant of the (n − 1) × (n − 1) matrix obtained by
deleting all elements in the i-th row and j-th column. For example the minor of
$a_{32} = 8$ in
\[
\begin{bmatrix} 4 & 7 & 3 \\ 6 & 1 & 5 \\ 2 & 8 & 9 \end{bmatrix}
\]
is
\[
M_{32} = \begin{vmatrix} 4 & 3 \\ 6 & 5 \end{vmatrix}.
\]
All elements in row 3 and column 2 have been deleted. The cofactor $C_{ij}$ of $a_{ij}$ is
simply the product of the minor of $a_{ij}$ with the factor $(-1)^{i+j}$. Thus if i + j is
even, the cofactor is $M_{ij}$ and if i + j is odd, the cofactor is $-M_{ij}$. For example the
cofactor of $a_{32} = 8$ in the above example is
\[
C_{32} = -M_{32} = -\begin{vmatrix} 4 & 3 \\ 6 & 5 \end{vmatrix}
\]
(since 3 + 2 = 5 is odd). When it comes to computing cofactors from minors, the
effect of the factor $(-1)^{i+j}$ is just to force a sign change where necessary. It is clear
that the structure for the sign changes of the minors is
\[
\begin{bmatrix}
+ & - & + & - & + & \dots \\
- & + & - & + & - & \dots \\
+ & - & + & - & + & \dots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{bmatrix}
\]
For a higher order square matrix
\[
A = \begin{bmatrix}
a_{11} & a_{12} & \dots & a_{1n} \\
a_{21} & a_{22} & \dots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \dots & a_{nn}
\end{bmatrix}
\]
the determinant is defined to be
\[
|A| = a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13} + \dots + a_{1n}C_{1n}
= a_{11}M_{11} - a_{12}M_{12} + a_{13}M_{13} - \dots + (-1)^{1+n}a_{1n}M_{1n}.
\]
It is a fact that
\[
|A| = (-1)^{i+1}a_{i1}M_{i1} + (-1)^{i+2}a_{i2}M_{i2} + \dots + (-1)^{i+n}a_{in}M_{in}
= (-1)^{1+j}a_{1j}M_{1j} + (-1)^{2+j}a_{2j}M_{2j} + \dots + (-1)^{n+j}a_{nj}M_{nj}
\]
for any row ai1 , ai2 , ai3 , . . . , ain and any column a1j , a2j , a3j , . . . , anj . Thus the
value of a determinant may be obtained by expanding along any row or column.
The elements of a row (column) are multiplied by their cofactors and added.
Example A.4.1. Determine
\[
|A| = \begin{vmatrix} 4 & -7 & 3 \\ 6 & 1 & 5 \\ 2 & 8 & -9 \end{vmatrix}
\]
using the elements of
(a) the first row,
(b) the second column.
Solution:
(a)
\[
|A| = 4\begin{vmatrix} 1 & 5 \\ 8 & -9 \end{vmatrix}
- (-7)\begin{vmatrix} 6 & 5 \\ 2 & -9 \end{vmatrix}
+ 3\begin{vmatrix} 6 & 1 \\ 2 & 8 \end{vmatrix}
= 4(-9-40) + 7(-54-10) + 3(48-2) = -506
\]
(b)
\[
|A| = -(-7)\begin{vmatrix} 6 & 5 \\ 2 & -9 \end{vmatrix}
+ 1\begin{vmatrix} 4 & 3 \\ 2 & -9 \end{vmatrix}
- 8\begin{vmatrix} 4 & 3 \\ 6 & 5 \end{vmatrix}
= 7(-54-10) + (-36-6) - 8(20-18) = -506
\]
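The row/column expansion translates directly into a recursive procedure. The sketch below is illustrative only (assuming Python with numpy; for large matrices np.linalg.det, which uses a factorisation method, is far more efficient). It expands along the first row and reproduces the value −506:

```python
import numpy as np

def det_by_cofactors(A):
    # Determinant by cofactor expansion along the first row.
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j
        minor = np.delete(np.delete(A, 0, axis=0), j, axis=1)
        total += (-1) ** j * A[0, j] * det_by_cofactors(minor)
    return total

A = np.array([[4, -7, 3], [6, 1, 5], [2, 8, -9]])
print(det_by_cofactors(A))       # -506
print(round(np.linalg.det(A)))   # -506
```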
A.5. OPERATIONS WITH MATRICES
A.5.1. Addition and subtraction. Only matrices of the same order may
be added or subtracted. It is done by adding or subtracting the corresponding
elements of the matrices. For example if
\[
A = \begin{bmatrix} 7 & 0 \\ 4 & -1 \\ 6 & 2 \end{bmatrix} \quad\text{and}\quad
B = \begin{bmatrix} 3 & 9 \\ -3 & 2 \\ 7 & 1 \end{bmatrix},
\]
then
\[
A + B = \begin{bmatrix} 7+3 & 0+9 \\ 4+(-3) & (-1)+2 \\ 6+7 & 2+1 \end{bmatrix}
= \begin{bmatrix} 10 & 9 \\ 1 & 1 \\ 13 & 3 \end{bmatrix}.
\]
A.5.2. Multiplication. For a matrix A = [a_{ij}] and a scalar λ, the scalar
multiple is λA = [λa_{ij}]. That is, every element of A is multiplied by the
scalar λ. For example
\[
5\begin{bmatrix} 3 & 9 \\ -3 & 2 \\ 7 & 1 \end{bmatrix}
= \begin{bmatrix} 15 & 45 \\ -15 & 10 \\ 35 & 5 \end{bmatrix}
\]
The product C = A · B of one matrix with another is only defined if the number of
columns of A is the same as the number of rows of B. Thus if A has order m × n
and B has order n × r, then A · B is defined, and the product is of order m × r.
Notice that if A is of order m × n and B of order n × r, then the rows of A have
the same number of elements as the columns of B. We can therefore take the dot
product of any row of A with any column of B. In fact the product C = A · B is
defined to be that matrix for which the entry in the i-th row and j-th column is
the dot product of the i-th row of A and the j-th column of B. That is, if
$a_{i1}, a_{i2}, a_{i3}, \dots, a_{in}$ is the i-th row of A, and $b_{1j}, b_{2j}, b_{3j}, \dots, b_{nj}$ the
j-th column of B, then the entry in the i-th row and j-th column of C = A · B is
\[
c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + a_{i3}b_{3j} + \dots + a_{in}b_{nj}.
\]
Example A.5.1. Where possible, multiply the following two matrices with each
other:
\[
A = \begin{bmatrix} 3 & 9 & 4 \\ -3 & 2 & -1 \\ 7 & 1 & 3 \end{bmatrix} \qquad
B = \begin{bmatrix} 3 & 9 \\ -3 & 2 \\ 7 & 1 \end{bmatrix}
\]
Solution:
\[
C = A \cdot B =
\begin{bmatrix}
3(3)+9(-3)+4(7) & 3(9)+9(2)+4(1) \\
-3(3)+2(-3)+(-1)(7) & -3(9)+2(2)+(-1)(1) \\
7(3)+1(-3)+3(7) & 7(9)+1(2)+3(1)
\end{bmatrix}
= \begin{bmatrix} 10 & 49 \\ -22 & -24 \\ 39 & 68 \end{bmatrix}
\]
Note that B · A is impossible in this case. Can you see why? 
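The order rule is enforced automatically by numerical libraries. A brief sketch (assuming Python with numpy, whose @ operator performs matrix multiplication):

```python
import numpy as np

A = np.array([[ 3, 9,  4],
              [-3, 2, -1],
              [ 7, 1,  3]])
B = np.array([[ 3, 9],
              [-3, 2],
              [ 7, 1]])

print(A @ B)     # 3x3 times 3x2 gives the 3x2 matrix computed above
try:
    B @ A        # 3x2 times 3x3: inner dimensions (2 and 3) disagree
except ValueError as e:
    print("B @ A is undefined:", e)
```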
A.6. INVERSE MATRICES

Let A be a square matrix. If there exists another square matrix B with the
property that
\[
AB = BA = I,
\]
then we say that B is the inverse of A and denote it by $B = A^{-1}$. It is possible to
show that if only one of the properties BA = I or AB = I holds, then the other
also holds. Thus if only one of BA = I or AB = I holds, then B is the inverse of
A.
It is important to realise that not all square matrices have an inverse. For a square
matrix A to have an inverse we must have that $|A| \neq 0$. To find the inverse of a
matrix, we first determine the so-called adjoint of A (denoted by adj(A)). For the
square matrix
\[
A = \begin{bmatrix}
a_{11} & a_{12} & \dots & a_{1n} \\
a_{21} & a_{22} & \dots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \dots & a_{nn}
\end{bmatrix}
\]
the adjoint is defined to be the transpose of the matrix obtained by replacing each
element with its cofactor. That is
\[
\operatorname{adj}(A) = \begin{bmatrix}
C_{11} & C_{21} & \dots & C_{n1} \\
C_{12} & C_{22} & \dots & C_{n2} \\
\vdots & \vdots & & \vdots \\
C_{1n} & C_{2n} & \dots & C_{nn}
\end{bmatrix}.
\]
If indeed the determinant of A is nonzero, the inverse matrix $A^{-1}$ may be obtained
from adj(A) by simply multiplying by $\frac{1}{|A|}$, that is
\[
A^{-1} = \frac{1}{|A|}\operatorname{adj}(A).
\]
Example A.6.1. Determine $B^{-1}$ if
\[
B = \begin{bmatrix} 1 & 3 & 4 \\ 1 & 3 & 3 \\ 1 & 2 & 4 \end{bmatrix}
\]
Solution: Firstly note that
\[
|B| = \begin{vmatrix} 1 & 3 & 4 \\ 1 & 3 & 3 \\ 1 & 2 & 4 \end{vmatrix}
= 1\begin{vmatrix} 3 & 3 \\ 2 & 4 \end{vmatrix}
- 3\begin{vmatrix} 1 & 3 \\ 1 & 4 \end{vmatrix}
+ 4\begin{vmatrix} 1 & 3 \\ 1 & 2 \end{vmatrix}
= 6 - 3 - 4 = -1.
\]
Thus since $|B| \neq 0$, the inverse of B does indeed exist. To find this inverse, we first
compute the matrix given by replacing each element with its cofactor. (Remember
that the cofactor of $a_{ij}$ is the product of the minor with $(-1)^{i+j}$.) This matrix is
\[
C = \begin{bmatrix} C_{11} & C_{12} & C_{13} \\ C_{21} & C_{22} & C_{23} \\ C_{31} & C_{32} & C_{33} \end{bmatrix}
= \begin{bmatrix}
\begin{vmatrix} 3 & 3 \\ 2 & 4 \end{vmatrix} & -\begin{vmatrix} 1 & 3 \\ 1 & 4 \end{vmatrix} & \begin{vmatrix} 1 & 3 \\ 1 & 2 \end{vmatrix} \\[2ex]
-\begin{vmatrix} 3 & 4 \\ 2 & 4 \end{vmatrix} & \begin{vmatrix} 1 & 4 \\ 1 & 4 \end{vmatrix} & -\begin{vmatrix} 1 & 3 \\ 1 & 2 \end{vmatrix} \\[2ex]
\begin{vmatrix} 3 & 4 \\ 3 & 3 \end{vmatrix} & -\begin{vmatrix} 1 & 4 \\ 1 & 3 \end{vmatrix} & \begin{vmatrix} 1 & 3 \\ 1 & 3 \end{vmatrix}
\end{bmatrix}
= \begin{bmatrix} 6 & -1 & -1 \\ -4 & 0 & 1 \\ -3 & 1 & 0 \end{bmatrix}
\]
To get adj(B) from the above matrix, we still need to find the transpose of that
matrix. Therefore
\[
\operatorname{adj}(B) = \begin{bmatrix} 6 & -4 & -3 \\ -1 & 0 & 1 \\ -1 & 1 & 0 \end{bmatrix}
\]
Hence
\[
B^{-1} = \frac{1}{|B|}\operatorname{adj}(B)
= (-1)\begin{bmatrix} 6 & -4 & -3 \\ -1 & 0 & 1 \\ -1 & 1 & 0 \end{bmatrix}
= \begin{bmatrix} -6 & 4 & 3 \\ 1 & 0 & -1 \\ 1 & -1 & 0 \end{bmatrix}.
\]
We demonstrate the reliability of the algorithm for determining $B^{-1}$ by computing
the product of B and the above matrix, and showing that it is indeed I as it should
be:
\[
\begin{bmatrix} 1 & 3 & 4 \\ 1 & 3 & 3 \\ 1 & 2 & 4 \end{bmatrix}
\begin{bmatrix} -6 & 4 & 3 \\ 1 & 0 & -1 \\ 1 & -1 & 0 \end{bmatrix}
= \begin{bmatrix}
1(-6)+3(1)+4(1) & 1(4)+3(0)+4(-1) & 1(3)+3(-1)+4(0) \\
1(-6)+3(1)+3(1) & 1(4)+3(0)+3(-1) & 1(3)+3(-1)+3(0) \\
1(-6)+2(1)+4(1) & 1(4)+2(0)+4(-1) & 1(3)+2(-1)+4(0)
\end{bmatrix}
= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]
Exercise A.6.2. Given that
\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 1 & 5 \\ 6 & 0 & 2 \end{bmatrix},
\]
determine $A^{-1}$.
\[
\left[\text{The inverse should be } \frac{1}{28}\begin{bmatrix} 2 & -4 & 7 \\ 22 & -16 & 7 \\ -6 & 12 & -7 \end{bmatrix}\right]
\]
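The adjoint construction can be turned into a short program, which is a convenient way to check Example A.6.1 and the exercise above. This is an illustrative sketch only (assuming Python with numpy; the helper name is ours). For real work one would call np.linalg.inv directly.

```python
import numpy as np

def inverse_by_adjoint(A):
    # A^{-1} = adj(A)/|A|, where adj(A) is the transposed matrix of cofactors.
    n = A.shape[0]
    C = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T / np.linalg.det(A)

B = np.array([[1, 3, 4], [1, 3, 3], [1, 2, 4]])
print(np.round(inverse_by_adjoint(B)))                    # matches Example A.6.1
print(np.allclose(B @ inverse_by_adjoint(B), np.eye(3)))  # True: B B^{-1} = I
```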
A.7. APPLICATION: THE SOLUTION OF SYSTEMS OF SIMULTANEOUS LINEAR EQUATIONS
When solving a system of n equations in m unknowns, in the majority of cases we
need to use a process called Gaussian elimination. However, if we have n equations
in n unknowns, then sometimes inverse matrices provide an effective way of solving
such a system. To see how this works, consider the system of equations
\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 &= b_1 \\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 &= b_2 \\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 &= b_3
\end{aligned}
\]
This system may be represented in matrix form as
\[
\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}
\]
That is AX = B where
\[
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}, \quad
X = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \quad
B = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}.
\]
If $A^{-1}$ exists we can solve the system of equations by noting that
\[
X = A^{-1}AX = A^{-1}B.
\]
Example A.7.1. Solve the following system of simultaneous equations using
an inverse matrix:
\[
\begin{aligned}
3x_1 + 4x_2 - x_3 &= -5 \\
-2x_1 + x_2 + x_3 &= -3 \\
x_1 - 2x_2 + 2x_3 &= 12
\end{aligned}
\]
Solution: In matrix form the system may be written as
\[
\begin{bmatrix} 3 & 4 & -1 \\ -2 & 1 & 1 \\ 1 & -2 & 2 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} -5 \\ -3 \\ 12 \end{bmatrix}
\]
Setting
\[
A = \begin{bmatrix} 3 & 4 & -1 \\ -2 & 1 & 1 \\ 1 & -2 & 2 \end{bmatrix} \quad\text{and}\quad
B = \begin{bmatrix} -5 \\ -3 \\ 12 \end{bmatrix},
\]
these equations may be written as AX = B. Now notice that
\[
|A| = 3\begin{vmatrix} 1 & 1 \\ -2 & 2 \end{vmatrix}
- 4\begin{vmatrix} -2 & 1 \\ 1 & 2 \end{vmatrix}
- 1\begin{vmatrix} -2 & 1 \\ 1 & -2 \end{vmatrix}
= 3(4) - 4(-5) - 1(3) = 29.
\]
Since $|A| \neq 0$, $A^{-1}$ exists. It therefore follows from A · X = B that X = A^{-1} · B.
Thus to find X, we must first find $A^{-1}$, and then compute $A^{-1}B$. Firstly note that
\[
\operatorname{adj}(A) = \begin{bmatrix} C_{11} & C_{21} & C_{31} \\ C_{12} & C_{22} & C_{32} \\ C_{13} & C_{23} & C_{33} \end{bmatrix}
= \begin{bmatrix} 4 & -6 & 5 \\ 5 & 7 & -1 \\ 3 & 10 & 11 \end{bmatrix}.
\]
Dividing by |A| = 29 now yields
\[
A^{-1} = \frac{1}{|A|}\operatorname{adj}(A)
= \frac{1}{29}\begin{bmatrix} 4 & -6 & 5 \\ 5 & 7 & -1 \\ 3 & 10 & 11 \end{bmatrix}.
\]
Thus
\[
X = \frac{1}{29}\begin{bmatrix} 4 & -6 & 5 \\ 5 & 7 & -1 \\ 3 & 10 & 11 \end{bmatrix}
\begin{bmatrix} -5 \\ -3 \\ 12 \end{bmatrix}
= \frac{1}{29}\begin{bmatrix} 58 \\ -58 \\ 87 \end{bmatrix}
= \begin{bmatrix} 2 \\ -2 \\ 3 \end{bmatrix},
\]
so that
\[
x_1 = 2, \quad x_2 = -2, \quad x_3 = 3.
\]
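In numerical practice the same solution is obtained without forming $A^{-1}$ explicitly. A minimal sketch (assuming Python with numpy; np.linalg.solve factorises A internally rather than inverting it):

```python
import numpy as np

A = np.array([[ 3,  4, -1],
              [-2,  1,  1],
              [ 1, -2,  2]])
b = np.array([-5, -3, 12])

print(np.linalg.solve(A, b))   # [ 2. -2.  3.]
print(np.linalg.inv(A) @ b)    # same answer via the inverse, as in the text
```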
A.8. EIGENVALUES AND EIGENVECTORS
An m × n matrix A may be viewed as a linear map from $R^n$ to $R^m$. The way
this works is as follows: any vector $x = (x_1, x_2, \dots, x_n)$ in $R^n$ may be rewritten
as the n × 1 matrix $X = [x_1, x_2, \dots, x_n]^t$. Because of this correspondence, we will
often refer to n × 1 matrices as column vectors of order n. The product AX is then
an m × 1 matrix, which of course corresponds to a vector in $R^m$. Thus the process
X → AX induces a mapping from $R^n$ to $R^m$. When we say that this mapping is
linear, we mean that for any two n × 1 matrices X and Y, and any two scalars λ
and µ, we have that A(λX + µY) = λAX + µAY.
In the case where A is a square matrix of order n, it induces a map from Rn back
into itself. In such a case there may be some vectors on which A has a particularly
simple action. Specifically if A maps a non-zero vector onto a scalar multiple of
itself, that vector is called an eigenvector. More precisely, the values of λ for which
we can find a non-zero column vector such that AX = λX, are called eigenvalues.
Any non-zero vector X for which we have that AX = λX, is called an eigenvector
associated with λ.
Notice that if X is an eigenvector associated with λ, we must have that (A −
λI)X = 0. That is for an eigenvalue λ, A − λI can’t be invertible since it will
not be 1-1 (it will map both 0 and X onto 0). So to find eigenvalues, we must
look for those λ’s for which A − λI is not invertible; that is for those λ’s for which
|A − λI| = 0. Once we know that λ is an eigenvalue, we can find the associated
eigenvector by solving for X from the equation (A − λI) X = 0. (Why do we use
λI in this equation and not simply λ? The answer is that λ is not a matrix, whereas
λI is, and only a matrix may be subtracted from a matrix.)
Example A.8.1. Determine the eigenvalues and eigenvectors of
\[
\text{(a)}\quad A = \begin{bmatrix} 4 & -1 \\ 2 & 1 \end{bmatrix}
\qquad\qquad
\text{(b)}\quad A = \begin{bmatrix} 2 & 3 & -2 \\ 1 & 4 & -2 \\ 2 & 10 & -5 \end{bmatrix}
\]
Solution:
(a) To find the eigenvalues we need to find all λ's for which |A − λI| = 0. Now
\[
A - \lambda I = \begin{bmatrix} 4 & -1 \\ 2 & 1 \end{bmatrix} - \lambda\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 4 & -1 \\ 2 & 1 \end{bmatrix} - \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix}
= \begin{bmatrix} 4-\lambda & -1 \\ 2 & 1-\lambda \end{bmatrix}.
\]
(Thus λ is subtracted from each diagonal element of A.) Since
\[
\begin{vmatrix} 4-\lambda & -1 \\ 2 & 1-\lambda \end{vmatrix}
= (4-\lambda)(1-\lambda) - (-1)(2) = \lambda^2 - 5\lambda + 6,
\]
the equation |A − λI| = 0 leads to
\[
\lambda^2 - 5\lambda + 6 = 0, \quad\text{that is,}\quad (\lambda - 2)(\lambda - 3) = 0.
\]
Thus λ1 = 2 and λ2 = 3 are the required eigenvalues. To determine the
corresponding eigenvectors X1 and X2, we respectively substitute λ1 = 2
and λ2 = 3 into (A − λI)X = 0, and solve for X.
Case 1 (λ1 = 2):
We need to find
\[
X = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
\]
so that
\[
\begin{bmatrix} 4-2 & -1 \\ 2 & 1-2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \end{bmatrix}.
\]
It follows from the above that
\[
2x_1 - x_2 = 0, \quad\text{equivalently}\quad x_2 = 2x_1.
\]
Thus if x1 = k, then x2 = 2k. The eigenvector corresponding to λ1 = 2 is
therefore of the form
\[
X_1 = \begin{bmatrix} k \\ 2k \end{bmatrix} = k\begin{bmatrix} 1 \\ 2 \end{bmatrix}.
\]
Case 2 (λ2 = 3):
In this case we have that
\[
\begin{bmatrix} 4-3 & -1 \\ 2 & 1-3 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \end{bmatrix},
\]
which leads to
\[
x_1 - x_2 = 0 \quad\text{and}\quad 2x_1 - 2x_2 = 0,
\]
equivalently x2 = x1. Thus if x1 = k then also x2 = k. The eigenvector
corresponding to λ2 = 3 is therefore of the form
\[
X_2 = \begin{bmatrix} k \\ k \end{bmatrix} = k\begin{bmatrix} 1 \\ 1 \end{bmatrix}.
\]
Any non-zero multiple of an eigenvector is again an eigenvector, so
it is enough to give the eigenvectors for a specific value of k. All other
possibilities can then be recovered from this one by simply multiplying
with a scalar. Hence for the sake of simplicity, we may as well restrict
ourselves to the case where k = 1. Thus the eigenvectors respectively
corresponding to λ1 = 2 and λ2 = 3 are
\[
X_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix} \quad\text{and}\quad X_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
\]
(b) As before to find the eigenvalues, we need to find all those λ's for which
|A − λI| = 0. In other words we need to solve for λ, given that
\[
\begin{vmatrix} 2-\lambda & 3 & -2 \\ 1 & 4-\lambda & -2 \\ 2 & 10 & -5-\lambda \end{vmatrix} = 0.
\]
If we expand the determinant along the first row, we get that
\[
\begin{vmatrix} 2-\lambda & 3 & -2 \\ 1 & 4-\lambda & -2 \\ 2 & 10 & -5-\lambda \end{vmatrix}
= (2-\lambda)\left[(4-\lambda)(-5-\lambda) - (-2)(10)\right]
- 3\left[1(-5-\lambda) - (-2)(2)\right]
+ (-2)\left[1(10) - (4-\lambda)(2)\right]
\]
\[
= (2-\lambda)\left(\lambda^2 + \lambda\right) - 3[-\lambda - 1] - 2[2\lambda + 2]
= (\lambda+1)\left[(2-\lambda)\lambda + 3 - 4\right]
= -(\lambda+1)\left(\lambda^2 - 2\lambda + 1\right)
= -(\lambda+1)(\lambda-1)^2.
\]
Thus the equation |A − λI| = 0 becomes
\[
-(\lambda+1)(\lambda-1)^2 = 0.
\]
The eigenvalues are therefore λ1 = −1 and λ2 = λ3 = 1.
Case 1 (λ1 = −1):
In this case (A − λ1 I)X = 0 becomes
\[
\begin{bmatrix} 3 & 3 & -2 \\ 1 & 5 & -2 \\ 2 & 10 & -4 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]
After performing the matrix multiplication, this leads to the equations
\[
3x_1 + 3x_2 - 2x_3 = 0 \quad\text{and}\quad x_1 + 5x_2 - 2x_3 = 0.
\]
(Note that the third row of the matrix A − λ1 I leads to the same equation
as the second row.) If now we subtract the second equation from the first
one, we get
\[
2x_1 - 2x_2 = 0,
\]
which means that x1 = x2. On substituting this equality into the first
equation, we get
\[
6x_1 = 2x_3;
\]
that is x3 = 3x1. Therefore if x1 = 1, then x2 = 1 and x3 = 3. Thus
\[
X_1 = \begin{bmatrix} 1 \\ 1 \\ 3 \end{bmatrix}
\]
is an eigenvector corresponding to λ1.
Case 2 (λ2 = λ3 = 1):
Here (A − I)X = 0 leads to
\[
\begin{bmatrix} 1 & 3 & -2 \\ 1 & 3 & -2 \\ 2 & 10 & -6 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.
\]
Matrix multiplication then yields the equations
\[
x_1 + 3x_2 - 2x_3 = 0 \quad\text{and}\quad 2x_1 + 10x_2 - 6x_3 = 0.
\]
If we multiply the first equation by 3, and subtract the result from the
second, we get
\[
-x_1 + x_2 = 0.
\]
Equivalently x1 = x2. If now we substitute this equality into the first
equation, we get that 4x1 = 2x3; that is x3 = 2x1. Thus if x1 = 1, then
x2 = 1 and x3 = 2. Therefore
\[
X_2 = \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}
\]
is an eigenvector corresponding to λ2 = λ3.
Part (b) of the above example illustrates an important fact about square ma-
trices. Although we had a square matrix of order 3, we could only find two general
types of eigenvectors. Thus it is possible for a square matrix of order n to have fewer
than n linearly independent eigenvectors.
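This deficiency is easy to observe numerically. In the sketch below (assuming Python with numpy) the eigenvector routine returns three columns, but the two belonging to the repeated eigenvalue 1 are numerically parallel, so only two of them are linearly independent:

```python
import numpy as np

A = np.array([[2, 3, -2], [1, 4, -2], [2, 10, -5]])
w, V = np.linalg.eig(A)
print(np.round(w.real, 6))   # eigenvalues -1, 1, 1 (1 is repeated)

# Rescale each eigenvector so its first entry is 1; the columns are then
# multiples of (1, 1, 3) and (1, 1, 2) as found above, and the two columns
# for the repeated eigenvalue (nearly) coincide.
print(np.round((V / V[0]).real, 3))
```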
APPENDIX B

TABLE OF LAPLACE TRANSFORMS

\[
\mathcal{L}\{f(t)\}(p) = \int_0^{\infty} e^{-pt} f(t)\,dt = F(p)
\]

\[
\begin{array}{rll}
 & f(t) & \mathcal{L}\{f(t)\} = F(p) \\[1ex]
1. & t^0 = 1 & \dfrac{1}{p} \\[1ex]
2. & t^n & \dfrac{n!}{p^{n+1}} \\[1ex]
3. & e^{bt} & \dfrac{1}{p-b} \\[1ex]
4. & \sin at & \dfrac{a}{p^2+a^2} \\[1ex]
5. & \cos at & \dfrac{p}{p^2+a^2} \\[1ex]
6. & \sinh at & \dfrac{a}{p^2-a^2} \\[1ex]
7. & \cosh at & \dfrac{p}{p^2-a^2} \\[1ex]
8. & e^{bt} f(t) & F(p-b) \\[1ex]
9. & e^{bt}\, t^n & \dfrac{n!}{(p-b)^{n+1}} \\[1ex]
10. & e^{bt} \sin at & \dfrac{a}{(p-b)^2+a^2} \\[1ex]
11. & e^{bt} \cos at & \dfrac{p-b}{(p-b)^2+a^2} \\[1ex]
12. & e^{bt} \sinh at & \dfrac{a}{(p-b)^2-a^2} \\[1ex]
13. & e^{bt} \cosh at & \dfrac{p-b}{(p-b)^2-a^2} \\[1ex]
14. & H(t-k) = \begin{cases} 1 & t \ge k \\ 0 & t < k \end{cases} & \dfrac{e^{-kp}}{p} \\[1ex]
15. & f(t-k)\,H(t-k) & e^{-kp}\,\mathcal{L}\{f(t)\} \\[1ex]
16. & \delta(t-k) & e^{-kp} \\[1ex]
17. & \delta(t) & 1 \\[1ex]
18. & \dfrac{dy}{dt} & pY(p) - y(0) \\[1ex]
19. & \dfrac{d^2y}{dt^2} & p^2Y(p) - py(0) - y'(0) \\[1ex]
20. & \dfrac{d^3y}{dt^3} & p^3Y(p) - p^2y(0) - py'(0) - y''(0) \\[1ex]
21. & \dfrac{d^4y}{dt^4} & p^4Y(p) - p^3y(0) - p^2y'(0) - py''(0) - y'''(0) \\[1ex]
22. & \displaystyle\int_0^t f(u)\,du & \dfrac{F(p)}{p}
\end{array}
\]
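Several of these entries can be verified symbolically. A small spot-check sketch (assuming Python with the sympy library; the transform variable is called p to match the table):

```python
import sympy as sp

t, p, a, b = sp.symbols('t p a b', positive=True)

# Spot-check instances of entries 2, 5 and 10 of the table
print(sp.laplace_transform(t**3, t, p, noconds=True))         # 6/p**4
print(sp.laplace_transform(sp.cos(a*t), t, p, noconds=True))  # p/(a**2 + p**2)
print(sp.laplace_transform(sp.exp(b*t) * sp.sin(a*t), t, p, noconds=True))
                                                              # a/(a**2 + (p-b)**2)
```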
APPENDIX C

TABLE OF Z-TRANSFORMS

All sequences in the table below are assumed to be causal – that is $x_k = 0$
whenever k < 0.

\[
\begin{array}{ll}
\{x_k\} & X(z) = Z(\{x_k\}) = \displaystyle\sum_{k=0}^{\infty} \frac{x_k}{z^k} \\[2ex]
\{a^k\} & \dfrac{z}{z-a} \\[1ex]
\{k a^{k-1}\} & \dfrac{z}{(z-a)^2} \\[1ex]
\{\delta_k\} = \{1, 0, 0, 0, \dots\} & 1 \\[1ex]
\{h_k\} = \{1, 1, 1, \dots\} & \dfrac{z}{z-1} \\[1ex]
\{k\} & \dfrac{z}{(z-1)^2} \\[1ex]
\{x_{k-k_0}\} & \dfrac{1}{z^{k_0}}\, Z\{x_k\} \\[1ex]
\{x_{k+k_0}\} & z^{k_0}\, Z\{x_k\} - \displaystyle\sum_{n=0}^{k_0-1} x_n z^{k_0-n} \\[2ex]
\{k^n x_k\} & \left(-z\dfrac{d}{dz}\right)^n X(z) \\[1ex]
\{a^k x_k\} & X\left(a^{-1}z\right) \\[1ex]
\{\sin \omega kT\} & \dfrac{z \sin \omega T}{z^2 - 2z\cos\omega T + 1} \\[1ex]
\{\cos \omega kT\} & \dfrac{z(z - \cos\omega T)}{z^2 - 2z\cos\omega T + 1}
\end{array}
\]
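For a quick numerical sanity check of any entry, one can compare a truncated sum of the defining series with the closed form at a point of convergence. A minimal sketch (assuming Python with numpy; the helper name is ours):

```python
import numpy as np

def X_truncated(x, z, N=200):
    # Truncated defining series: sum_{k=0}^{N} x_k / z^k
    k = np.arange(N + 1)
    return np.sum(x(k) * z ** (-k))

z = 3.0   # any point with |z| large enough for convergence
a = 0.5
print(X_truncated(lambda k: a ** k, z), z / (z - a))   # both ~ 1.2
print(X_truncated(lambda k: k, z), z / (z - 1) ** 2)   # both ~ 0.75
```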