
CRANFIELD UNIVERSITY

SCHOOL OF AEROSPACE,
TRANSPORT and MANUFACTURING

PRE-MASTERS COURSE
ACADEMIC YEAR 2014 - 15

MATHEMATICS 1

Term 1

Contents
Chapter 1: Preliminaries
Quadratic expressions
Polynomials and Rational Functions
Algebraic Division
Indices and Logarithms
The Binomial theorem
Partial fractions
Arithmetic and Geometric series
Some important functions
Natural Logarithmic function
Exponential Functions
Trigonometric Functions
Hyperbolic Functions
Trigonometric and Hyperbolic Identities
Functions (general)
Numerical Methods and Errors
How errors combine
Relative error
Types of growth of errors
Chapter 2: Matrices and Theory of Systems of Linear Equations
Definitions and Operations involving Matrices
The Inverse Matrix and its Calculation
Linear Systems of Simultaneous Equations
Solution (Regular Case)
Solution - more general and irregular cases
Homogeneous and Inhomogeneous Systems
Linear Dependence and Independence
Rank of a Matrix
Iterative Methods of Solution for Linear Systems
Appendix: Theorems governing Solutions of Linear Systems
Chapter 3: Differentiation
Definitions and Properties
Rules for Differentiation
Chain Rule
Product Rule
Quotient Rule
Use of Natural Logarithms
Inverse Functions
Implicit Differentiation
Parametric Differentiation


Chapter 4: Series, Limits and Functions


Definitions: Sequences, Series
Limits
Convergence of Series
The Taylor Series and Theorem
The Maclaurin Series
L'Hôpital's Rule
Stationary Point of Functions: Maxima, Minima, Points of Inflexion
Sketching Curves of Functions
Appendix A: Formal Definition of a Limit
Appendix B: Some Useful Limits
Appendix C: Properties of Convergent Series
Appendix D Checking Convergence: Tests and examples
Appendix E: Derivation of the Taylor Series


Chapter 5: Complex Numbers


Definitions: The Imaginary Number; Complex Numbers
Real and Imaginary Parts
The Argand Diagram
Modulus and Argument
Exponential Form of a Complex Number
Algebraic Properties and Operations
De Moivre's Theorem
Roots of Complex Numbers
Locus Problems


Chapter 6: Polynomials
Definitions
Theorem 1: The Remainder Theorem
Theorem 2: The Fundamental Theorem of Algebra
Theorem 3: Number of Zeros of a Polynomial
Theorem 4: Multiplicity of Zeros of a Polynomial
Theorem 5: Complex Zeros of a Real Polynomial
Relationships between Coefficients and Zeros of Polynomials
Numerical Solution of Equations: the Newton-Raphson Method
Estimation of Location of Real Roots
Polynomial Interpolation
The Lagrange Interpolation Formula
Divided Differences and Newton's Interpolation Formula
Curve Fitting
The Least Squares Method
Appendix: Proofs of Theorems


Chapter 7: Integration
Definitions and Properties of Integrals
Methods for finding Integrals
Class 1: Use of Direct Derivatives
Standard Forms
Substitution
Use of Partial Fractions
Powers and Products of Trig and Hyperbolic Functions
Class 2: Integration by Parts
Reduction Formulae
Numerical Integration
The Trapezium Rule
Simpson's Rule
Appendix A: Proof of the Fundamental Theorem of the Calculus
Appendix B: Gaussian Quadrature


Chapter 8: Functions of More than One Variable; Partial Differentiation I

Definitions and Illustrations
The Taylor Series for a Function of Two Variables
Small Increments and Errors
Implicit Functions


Chapter 9: Ordinary Differential Equations (D.E.s)

Definitions
First Order Differential Equations
Separation of Variables
Homogeneous D.E.s
Exact D.E.s
Linear 1st order D.E.s
Numerical Methods of Solution of 1st Order Differential Equations
Euler's Method
More Accurate Methods
Modified Euler Method (Heun's Method)
Runge-Kutta Method of order 4
Appendix: Differential Equations with Linear Coefficients
The Bernoulli equation
Runge-Kutta methods general (order m)


Exercises
Formulae Sheets

Chapter 1.
PRELIMINARIES

1. Quadratic Expressions
(a) Factorising $ax^2 + bx + c$

We use the facts (probably subconsciously!) that:

Sum of roots $= -\dfrac{b}{a}$
Product of roots $= \dfrac{c}{a}$

(b) Solving a quadratic equation $ax^2 + bx + c = 0$

Either by factorising: if $ax^2 + bx + c = (px + q)(rx + s)$, then

$ax^2 + bx + c = 0 \;\Rightarrow\; (px + q)(rx + s) = 0$
$\Rightarrow\; px + q = 0$ or $rx + s = 0$
$\Rightarrow\; x = -\dfrac{q}{p}$ or $x = -\dfrac{s}{r}$

or by using the formula:

$x = \dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a}$
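The formula, together with the sum and product of roots, can be checked numerically. The following is a minimal sketch in Python; the function name `solve_quadratic` and the sample coefficients are illustrative choices, not part of the notes.

```python
import cmath

def solve_quadratic(a, b, c):
    """Return both roots of a*x**2 + b*x + c = 0 (a must be non-zero)."""
    if a == 0:
        raise ValueError("not a quadratic: a must be non-zero")
    root_disc = cmath.sqrt(b * b - 4 * a * c)  # complex sqrt also copes with b^2 < 4ac
    return (-b + root_disc) / (2 * a), (-b - root_disc) / (2 * a)

# Sample quadratic 2x^2 + 5x - 12 = (2x - 3)(x + 4)
x1, x2 = solve_quadratic(2, 5, -12)
print(x1, x2)            # the two roots, printed as complex numbers
print(x1 + x2, x1 * x2)  # compare with -b/a and c/a
```

Using `cmath.sqrt` rather than `math.sqrt` means the same function also returns the complex roots when $b^2 < 4ac$.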

(c) Completing the square

It is sometimes useful to be able to write the quadratic expression $ax^2 + bx + c$ in the form

$a[(x + h)^2 \pm k^2]$

The algebraic steps taken to achieve this alternative expression are:

i. Take out the coefficient of $x^2$ (a) as a factor;
ii. Halve the coefficient of x; this becomes the h of the above expression;
iii. Adjust the constant term so that $h^2 \pm k^2 = \dfrac{c}{a}$

EXAMPLE
i. $x^2 - 7x + 10 =$
ii. $2x^2 + 5x - 12 =$
iii. $x^2 - y^2 =$
iv. $12 + x - 6x^2 = -(6x^2 - x - 12) =$

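EXAMPLE
(i) $f_1(x) = x^2 + 4x + 29$

(Step (i) not needed here)
Coefficient of x is +4
So h = +2
$f_1 = (x + 2)^2 - 4 + 29 = (x + 2)^2 + 25$   or   $(x + 2)^2 + 5^2$

(ii) $f_2(x) = 4 + 3x - 2x^2 = -2\left(x^2 - \tfrac{3}{2}x - 2\right)$

Coefficient of x is $-\tfrac{3}{2}$
So $h = -\tfrac{3}{4}$
$f_2(x) = -2\left[\left(x - \tfrac{3}{4}\right)^2 - \tfrac{9}{16} - 2\right] = -2\left[\left(x - \tfrac{3}{4}\right)^2 - \tfrac{41}{16}\right]$   or   $\tfrac{41}{8} - 2\left(x - \tfrac{3}{4}\right)^2$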

2. Polynomials and Rational Functions


A polynomial function P(x) is an algebraic expression of the type:

$P(x) = a_n x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + \dots + a_2 x^2 + a_1 x + a_0$

where n is an integer $\geq 0$ and the $a_n, a_{n-1}, \dots, a_0$ are constants, $a_n \neq 0$.
n is the degree of the polynomial.

A rational function R(x) has the form

$R(x) = \dfrac{N(x)}{D(x)}$

where N(x) and D(x) are both polynomial functions.


3. Algebraic Division
If the degree of N(x) is equal to or greater than the degree of D(x), then R(x) can be
expressed as the sum of a polynomial and a rational function:

$R(x) = P_Q(x) + \dfrac{N_R(x)}{D(x)}$

where the degree of $N_R(x)$ is less than the degree of D(x). It is often necessary,
for integration, for graph sketching, and various other purposes, to express a rational
function in this form. We obtain this form by algebraic division, dividing out the
rational function. The method is illustrated in the example on the adjacent page.

4. Indices and Logarithms


Indices

$a^n \times a^m = a^{n+m}$          $a^0 = 1$
$a^n \div a^m = a^{n-m}$            $a^{-n} = \dfrac{1}{a^n}$
$(a^n)^m = a^{nm}$                  $a^{1/n} = \sqrt[n]{a}$

Logarithms  The logarithm of a number (x) is the power (y) to which the base of
the logarithm (a) must be raised in order to equal the given number (x).
i.e.   $y = \log_a x \;\Leftrightarrow\; a^y = x$

Since logarithms are powers (indices), the rules for combining them are basically the
same as those for indices:
$\log x + \log y = \log xy$
$\log x - \log y = \log \dfrac{x}{y}$
$y \log x = \log(x^y)$

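EXAMPLE
(i) $4x^4 + 5x^3 - 8x^2 - 2x + 5$ is a polynomial of degree 4.

(ii) $x^7$ is a polynomial of degree 7.

(iii) $\dfrac{2x + 7}{x^3 - 4x^2 + 6x - 1}$ is a rational function which cannot be divided out.

(iv) $\dfrac{3x^3 + 11x^2 - 12x - 50}{x^2 + 2x - 8}$ is a rational function which can be divided out by
algebraic division, as shown below:

EXAMPLE
Dividing $3x^3 + 11x^2 - 12x - 50$ by $x^2 + 2x - 8$:

$3x \times (x^2 + 2x - 8) = 3x^3 + 6x^2 - 24x$; subtracting leaves $5x^2 + 12x - 50$
$5 \times (x^2 + 2x - 8) = 5x^2 + 10x - 40$; subtracting leaves $2x - 10$

$\dfrac{3x^3 + 11x^2 - 12x - 50}{x^2 + 2x - 8} = 3x + 5 + \dfrac{2x - 10}{x^2 + 2x - 8}$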
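The division procedure can be sketched in a few lines of Python. This is an illustrative implementation under the assumption that polynomials are stored as coefficient lists, highest power first; the helper name `poly_divmod` is hypothetical.

```python
from fractions import Fraction

def poly_divmod(num, den):
    """Divide polynomials given as coefficient lists, highest power first.

    Returns (quotient, remainder) with deg(remainder) < deg(den).
    """
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    if not den or den[0] == 0:
        raise ValueError("leading coefficient of the divisor must be non-zero")
    quotient = []
    while len(num) >= len(den):
        factor = num[0] / den[0]      # next coefficient of the quotient
        quotient.append(factor)
        for i, d in enumerate(den):   # subtract factor * den from the leading terms
            num[i] -= factor * d
        num.pop(0)                    # the leading term has been cancelled
    return quotient, num

# (3x^3 + 11x^2 - 12x - 50) divided by (x^2 + 2x - 8)
q, r = poly_divmod([3, 11, -12, -50], [1, 2, -8])
print(q, r)   # quotient coefficients of 3x + 5, remainder coefficients of 2x - 10
```

Exact `Fraction` arithmetic is used so that non-integer coefficients do not pick up rounding errors.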

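N.B. These laws of indices imply that it is possible to factorise expressions such as $a^{x+y}$
into the factors $a^x a^y$.

EXAMPLE
(i) $e^{1+\sqrt{2}} = e^{1}\, e^{\sqrt{2}}$

(ii) $e^{\cos\theta - i\sin\theta} = e^{\cos\theta}\, e^{-i\sin\theta} = \dfrac{e^{\cos\theta}}{e^{i\sin\theta}}$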

The rules for logarithms can change the form of an expression:

EXAMPLE
$\log\left(\dfrac{(x + 2)\sqrt{3x - 2}}{(x + 4)^3}\right) = \log(x + 2) + \log\sqrt{3x - 2} - \log(x + 4)^3$
$= \log(x + 2) + \dfrac{1}{2}\log(3x - 2) - 3\log(x + 4)$

5. The Binomial Theorem


This gives the expansion of the expression $(a + x)^n$.

(i) When n is a positive integer:

$(a + x)^n = a^n + n a^{n-1} x + \dfrac{n(n-1)}{2!} a^{n-2} x^2 + \dfrac{n(n-1)(n-2)}{3!} a^{n-3} x^3 + \dots + \dfrac{n(n-1)(n-2)\dots(n-(r-1))}{r!} a^{n-r} x^r + \dots + x^n$

This can alternatively be written:

$(a + x)^n = a^n + \dbinom{n}{1} a^{n-1} x + \dbinom{n}{2} a^{n-2} x^2 + \dots + \dbinom{n}{r} a^{n-r} x^r + \dots + x^n; \qquad \dbinom{n}{r} = \dfrac{n!}{(n-r)!\,r!}$

An alternative way to find the Binomial Coefficients: Pascal's Triangle

This is the triangle of numbers:

              1
            1   1
          1   2   1
        1   3   3   1
      1   4   6   4   1
    1   5  10  10   5   1

If we need to expand $(a + x)^n$, providing n is a fairly small integer, we find the row
of Pascal's triangle which starts 1  n, and this row will give all the necessary
coefficients for the Binomial expansion.
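The rows of Pascal's triangle can also be generated directly from the formula for $\binom{n}{r}$. A short sketch in Python follows; `pascal_row` is an illustrative name, and `math.comb` (Python 3.8 or later) evaluates $n!/((n-r)!\,r!)$.

```python
from math import comb

def pascal_row(n):
    """Row of Pascal's triangle that starts 1, n: the coefficients of (a + x)**n."""
    return [comb(n, r) for r in range(n + 1)]

for n in range(6):
    print(pascal_row(n))

# Coefficients needed to expand (a + x)^6
print(pascal_row(6))
```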
(ii) When n is negative or fractional

In these cases it is safer to consider only expressions of the form $(1 + x)^n$, taking out
a factor $a^n$ before expanding where necessary (see below). For $(1 + x)^n$, we have

$(1 + x)^n = 1 + nx + \dfrac{n(n-1)}{2!} x^2 + \dfrac{n(n-1)(n-2)}{3!} x^3 + \dots + \dfrac{n(n-1)\dots(n-(r-1))}{r!} x^r + \dots$

There is no last term in these cases, and the series is therefore infinite. Clearly, if these
individual terms are too large, the sum of the series will be infinite and the expansion
cannot be valid. (The value of $(1 + x)^n$ for finite n and x, $x \neq -1$ for negative n,
must be finite.) It can be shown that

The series expansion is valid provided $|x| < 1$

For $(a + x)^n$, write as $\left[a\left(1 + \dfrac{x}{a}\right)\right]^n = a^n\left(1 + \dfrac{x}{a}\right)^n$; the expansion is then valid if $\left|\dfrac{x}{a}\right| < 1$.

EXAMPLE

(i) $(a + x)^4 = a^4 + 4a^3 x + \dfrac{4 \cdot 3}{2!} a^2 x^2 + \dfrac{4 \cdot 3 \cdot 2}{3!} a x^3 + x^4$
$= a^4 + 4a^3 x + 6a^2 x^2 + 4ax^3 + x^4$

(ii) The term in $x^5$ in the expansion of $(a + x)^9$ (so that n = 9, r = 5) is:

$\dfrac{9 \cdot 8 \cdot 7 \cdot 6 \cdot 5}{5!} a^{9-5} x^5$,  or  $\dfrac{9!}{4!\,5!} a^4 x^5$,  $= 126 a^4 x^5$
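EXAMPLE   The use of Pascal's Triangle

(i) $(a + x)^6 = a^6 + 6a^5 x + 15a^4 x^2 + 20a^3 x^3 + 15a^2 x^4 + 6a x^5 + x^6$

(ii) $(1 - 2x)^3$, in which a = 1, n = 3, and the standard x is replaced by $-2x$, is:

$= 1^3 + 3 \cdot 1^2 \cdot (-2x) + 3 \cdot 1 \cdot (-2x)^2 + 1 \cdot (-2x)^3$
$= 1 - 6x + 12x^2 - 8x^3$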

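EXAMPLE   Negative and fractional values of n

(i) $(1 + x)^{-2} = 1 + (-2)x + \dfrac{(-2)(-3)}{2!} x^2 + \dfrac{(-2)(-3)(-4)}{3!} x^3 + \dots$
$= 1 - 2x + 3x^2 - 4x^3 + \dots$
Valid only for $|x| < 1$ (or $-1 < x < 1$)

(ii) $\sqrt{2 - x} = (2 - x)^{1/2} = 2^{1/2}\left(1 - \dfrac{x}{2}\right)^{1/2}$

Then expanding with $n = \dfrac{1}{2}$ and x replaced by $-\dfrac{x}{2}$,

$\sqrt{2 - x} = \sqrt{2}\left[1 + \dfrac{1}{2}\left(-\dfrac{x}{2}\right) + \dfrac{\left(\frac{1}{2}\right)\left(-\frac{1}{2}\right)}{2!}\left(-\dfrac{x}{2}\right)^2 + \dots\right]$
$= \sqrt{2}\left[1 - \dfrac{x}{4} - \dfrac{x^2}{32} - \dfrac{x^3}{128} - \dots\right]$

Valid for $\left|-\dfrac{x}{2}\right| < 1 \;\Rightarrow\; \dfrac{|x|}{2} < 1 \;\Rightarrow\; |x| < 2$, or $-2 < x < 2$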
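A numerical check makes the role of the validity condition concrete. The sketch below sums the first few terms of the series for $(1 + x)^n$ and compares them with the exact values; the helper name `binomial_series` and the test value x = 0.5 are illustrative assumptions.

```python
from math import sqrt

def binomial_series(n, x, terms):
    """Partial sum of (1 + x)**n = 1 + n x + n(n-1)/2! x^2 + ... (needs |x| < 1)."""
    total, term = 0.0, 1.0
    for r in range(terms):
        total += term
        term *= (n - r) * x / (r + 1)   # builds the next term from the previous one
    return total

x = 0.5
# sqrt(2 - x) = sqrt(2) * (1 - x/2)^(1/2), using four terms of the series
approx = sqrt(2) * binomial_series(0.5, -x / 2, 4)
print(approx, sqrt(2 - x))

# (1 + x)^(-2) with many terms, for a value inside |x| < 1
print(binomial_series(-2, 0.1, 60), 1 / 1.1**2)
```

With only four terms the two values for $\sqrt{2 - x}$ agree to about three decimal places; adding terms improves the agreement only while $|x| < 2$.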

6. Partial Fractions
The aim is to express a rational function f(x) = N(x)/D(x), N(x) and D(x) being
polynomials, as the sum of simpler rational functions.
In order to obtain a partial fraction representation of f(x):

1. f(x) must be such that the degree of N(x) < the degree of D(x). If this is not
the case, then f(x) must be divided out (see Algebraic Division, p.4), giving

$f(x) = P_Q(x) + \dfrac{N_R(x)}{D(x)}$

$N_R(x)$ being the remainder after division, the degree of $N_R(x)$ being less than the degree
of D(x). We then apply the partial fraction techniques to the fraction $N_R(x)/D(x)$.

2. D(x) is factorised as far as possible into linear and quadratic factors.

3. f(x) (or $N_R(x)/D(x)$) is then expressed as the sum of fractions with, as yet,
unknown numerators, the denominators being the factors of D(x), as follows:
Any factor which occurs only once in D(x) is used in a single fraction with this
factor as its denominator;
Any factor which is repeated in D(x), say m times, is used in m fractions whose
denominators are powers of the factor concerned, the powers increasing from 1 in the
first of these fractions to m in the mth.

4. For the numerator of each fraction in the sum, we choose
A constant if the denominator is a linear factor or a power of a linear factor;
A linear expression if the denominator is a quadratic factor or a power of a quadratic
factor;
In general, if the denominator of the fraction is a polynomial of degree k (or a power
of such a polynomial), the numerator chosen must be a general polynomial of degree
k − 1.

5. We now have a sum of algebraic fractions which is equivalent to f(x) (or to
$N_R(x)/D(x)$). This identity is multiplied by D(x) so that the resulting identity contains
no fractions. The coefficients in the numerators of the new fractions can then be
calculated by:
(a) Substituting convenient values of x into the identity;
or
(b) Equating coefficients of like powers of x from each side of the identity.
The equations thus formed can be solved to find the coefficients in the new numerators.

EXAMPLE 1.   $f_1(x) = \dfrac{3x^3 + 11x^2 - 12x - 50}{x^2 + 2x - 8}$

Here the degree of N(x) > the degree of D(x). So $f_1(x)$ must be divided out, giving

$f_1(x) = 3x + 5 + \dfrac{2x - 10}{x^2 + 2x - 8}$   (see Example p.5)

$D(x) = x^2 + 2x - 8 = (x + 4)(x - 2)$ when factorised. $N_R(x) = 2x - 10$. The necessary
form of partial fractions for this $N_R(x)/D(x)$ is

$\dfrac{2x - 10}{(x + 4)(x - 2)} \equiv \dfrac{A}{x + 4} + \dfrac{B}{x - 2}$

Multiplying this identity through by D(x), we obtain
$2x - 10 \equiv A(x - 2) + B(x + 4)$
Substituting, in turn, x = 2 and then x = −4 in this, we obtain B = −1, A = 3.

Thus $f_1(x) = 3x + 5 + \dfrac{3}{x + 4} - \dfrac{1}{x - 2}$

EXAMPLE 2.   $f_2(x) = \dfrac{2x^3 + x^2 + 20x + 12}{(x^2 + 5)(x + 1)^2}$

Here division is unnecessary (degree of N(x) = 3 < degree of D(x) = 4). D(x) is
already in factored form. The necessary form of partial fractions for $f_2(x)$ is

$\dfrac{2x^3 + x^2 + 20x + 12}{(x^2 + 5)(x + 1)^2} \equiv \dfrac{Ax + B}{x^2 + 5} + \dfrac{C}{x + 1} + \dfrac{D}{(x + 1)^2}$

since D(x) contains the repeated factor $(x + 1)^2$. Multiplying through by D(x):
$2x^3 + x^2 + 20x + 12 \equiv (Ax + B)(x + 1)^2 + C(x^2 + 5)(x + 1) + D(x^2 + 5)$
Substituting x = −1 gives:   $-2 + 1 - 20 + 12 = 0 + 0 + 6D \;\Rightarrow\; D = -\dfrac{3}{2}$

We now need to multiply out the R.H.S. so that like powers of x can be equated.
$2x^3 + x^2 + 20x + 12 \equiv Ax^3 + 2Ax^2 + Ax + Bx^2 + 2Bx + B + Cx^3 + Cx^2 + 5Cx + 5C + Dx^2 + 5D$
Equating coefficients of $x^3$, $x^2$, x, constant terms, we obtain four equations which can
be solved for A, B, C, D:

2 = A + C
1 = 2A + B + C + D
20 = A + 2B + 5C
12 = B + 5C + 5D

Since D is known, only three of these are needed; solving gives $A = -\dfrac{3}{2}$, B = 2, $C = \dfrac{7}{2}$
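A quick way to check a partial fraction result is to evaluate both sides of the identity at a few convenient values of x. The sketch below does this for Example 2 with exact rational arithmetic; the function names `f2` and `f2_partial` and the sample points are illustrative.

```python
from fractions import Fraction as F

def f2(x):
    """Original rational function of Example 2."""
    return (2 * x**3 + x**2 + 20 * x + 12) / ((x**2 + 5) * (x + 1) ** 2)

def f2_partial(x):
    """Partial fraction form with A = -3/2, B = 2, C = 7/2, D = -3/2."""
    A, B, C, D = F(-3, 2), F(2), F(7, 2), F(-3, 2)
    return (A * x + B) / (x**2 + 5) + C / (x + 1) + D / (x + 1) ** 2

# Any values except x = -1 (where both sides are undefined) will do
samples = [F(0), F(1), F(2), F(-3), F(1, 2)]
print(all(f2(x) == f2_partial(x) for x in samples))
```

Because the numerators are polynomials of degree at most 3, agreement at more than three points confirms the identity.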

7. Arithmetic and Geometric Series

(A.P.s and G.P.s)

(a) Arithmetic Series

The terms of an arithmetic series (an Arithmetic Progression, A.P.) are such that
each is formed from the previous one by adding a constant called the common
difference, d. The first term is usually denoted by a.

The series is:   $a + (a + d) + (a + 2d) + (a + 3d) + \dots$

The nth term, $a_n$ or $\ell$, is:   $a + (n - 1)d = \ell$

The sum, $S_n$, of the first n terms is:   $S_n = \dfrac{n}{2}\,(2a + (n - 1)d) = \dfrac{n}{2}\,(a + \ell)$

An A.P. never has an infinite number of terms. Its sum would be infinite.

(b) Geometric Series

The terms of a geometric series (a Geometric Progression, G.P.) are such that
each is formed by multiplying the previous one by a constant called the common
ratio, r or x. Again the first term is usually denoted by a.

The series is:   $a + ar + ar^2 + ar^3 + \dots$

The nth term, $a_n$, is:   $ar^{n-1}$

The sum, $S_n$, of the first n terms is:   $S_n = \dfrac{a(1 - r^n)}{1 - r}$

It is possible for a G.P. to have infinitely many terms and yet to have a finite
sum. The condition for this to be possible is
$|r| < 1$
In this case the sum to infinity, denoted by $S_\infty$, is given by
$S_\infty = \dfrac{a}{1 - r}$
In this case, the series has no last term.
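The sum formulae translate directly into code. The following Python sketch uses illustrative function names (`ap_sum`, `gp_sum`, `gp_sum_to_infinity`) and sample values chosen only for demonstration; it shows the partial sums of a G.P. with |r| < 1 settling towards a/(1 − r).

```python
def ap_sum(a, d, n):
    """Sum of the first n terms of an A.P.: n/2 * (2a + (n - 1)d)."""
    return n * (2 * a + (n - 1) * d) / 2

def gp_sum(a, r, n):
    """Sum of the first n terms of a G.P. with r != 1: a(1 - r**n)/(1 - r)."""
    return a * (1 - r**n) / (1 - r)

def gp_sum_to_infinity(a, r):
    """Sum to infinity a/(1 - r); only meaningful when |r| < 1."""
    if abs(r) >= 1:
        raise ValueError("no finite sum to infinity unless |r| < 1")
    return a / (1 - r)

print(ap_sum(3, 4, 10))              # 3 + 7 + 11 + ... (10 terms)
for n in (5, 10, 20):
    print(n, gp_sum(1, 0.5, n))      # partial sums approach the limit
print(gp_sum_to_infinity(1, 0.5))
```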

EXAMPLE
The sum of the first n natural numbers
1 + 2 + 3 + 4 + 5 + .... + n
is an A.P. with a = 1 and d = 1. The sum, $S_n$, is therefore

$\dfrac{n}{2}\,(2 \cdot 1 + (n - 1) \cdot 1) = \dfrac{n(n + 1)}{2}$

e.g. If n = 7, then:   $1 + 2 + 3 + 4 + 5 + 6 + 7 = \dfrac{7 \cdot 8}{2} = 28$

N.B. If we are given that three numbers are consecutive terms of an A.P., we can
obviously express them as
a, a + d, a + 2d
However, it is sometimes more convenient for the purposes of calculation to express
them in a symmetrical form as:
a − d, a, a + d

EXAMPLE 1

The series

$1 + \dfrac{1}{2} + \dfrac{1}{4} + \dfrac{1}{8} + \dfrac{1}{16} + \dots$

is a geometric series with a = 1, $r = \dfrac{1}{2}$.
Since |r| < 1, this series has a finite sum when taken to $\infty$:

$S_\infty = \dfrac{1}{1 - \frac{1}{2}} = 2$

i.e. the sum of the terms of the series $1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \dots$ approaches 2 as we add
more and more terms, but it can never exceed 2 however many terms are included.

EXAMPLE 2

The series
1 ex + e2x e3x + e4x .....

is a geometric series with a = 1, r =        . Where the sum to infinity exists, therefore,
its value is
$S_\infty =$

This is only valid for |r| < 1; in this case:

8. Some Important Elementary Functions


(a) The Natural Logarithm function   ln x

This function can be defined as

$\ln x = \displaystyle\int_1^x \dfrac{1}{t}\,dt$

i.e. the area under the graph of the curve $y = \dfrac{1}{t}$ between the limits t = 1 and t = x.

[Figure: the curve y = 1/t]

The function is found to obey the rules of logarithms, and the value of x for which the area
concerned has the value 1 is defined as the exponential number e (value 2.71828182...).
Hence ln e = 1, and therefore the base of this natural logarithm function is e.
i.e. $\ln x = \log_e x$

ln x is defined as a real function only for x > 0.

Unlike many other elementary functions, it cannot be expressed as an infinite
series in powers of x.
(b) The Exponential function   $e^x$

From the definition of a logarithm (as a power, see p.4), this is the inverse of
the natural log function. It can be expressed as a power series in powers of x as:

$e^x = 1 + x + \dfrac{x^2}{2!} + \dfrac{x^3}{3!} + \dfrac{x^4}{4!} + \dfrac{x^5}{5!} + \dots$

This expansion is valid for all x.

N.B. $e^x > 0$ for all real x.

Since ln x and $e^x$ are inverse functions, the graph of one is the reflection of the
other in the line y = x.

[Figure: graphs of $y = e^x$ and $y = \ln x$, together with the line y = x]

Also:
$e^{\ln x} = x$ for x > 0
$\ln(e^x) = x$ for all x
$e^1 = e \;\Leftrightarrow\; \ln e = 1$
$e^0 = 1 \;\Leftrightarrow\; \ln 1 = 0$
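Both definitions can be explored numerically. The sketch below is illustrative only: `exp_series` sums the power series for $e^x$, and `ln_by_area` estimates the area under y = 1/t with the midpoint rule (a simple numerical integration choice made here, not a method taken from this section).

```python
import math

def exp_series(x, terms=20):
    """Partial sum 1 + x + x^2/2! + ... with the given number of terms."""
    total, term = 0.0, 1.0
    for r in range(terms):
        total += term
        term *= x / (r + 1)
    return total

def ln_by_area(x, strips=100_000):
    """Midpoint-rule estimate of the area under y = 1/t from t = 1 to t = x (x > 0)."""
    width = (x - 1) / strips
    return sum(width / (1 + (i + 0.5) * width) for i in range(strips))

print(exp_series(1.0), math.e)
print(ln_by_area(2.0), math.log(2.0))
print(ln_by_area(math.e))   # the area that defines e; should be close to 1
```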

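EXAMPLE
One of the rules of logarithms (to any base) is
log x + log y = log xy
Therefore if the function ln x defined by the given integral is to obey this law, we
need
ln x + ln y = ln xy
In terms of the definition, this would be:

$\displaystyle\int_1^x \dfrac{1}{t}\,dt + \int_1^y \dfrac{1}{t}\,dt = \int_1^{xy} \dfrac{1}{t}\,dt$

We make a substitution in the second integral: let $t = \dfrac{u}{x}$. Then:

$\dfrac{1}{t} = \dfrac{x}{u}$,   $dt = \dfrac{du}{x}$;   $t = 1 \Rightarrow u = x$;   $t = y \Rightarrow u = xy$

Then $\displaystyle\int_{t=1}^{y} \dfrac{1}{t}\,dt = \int_{u=x}^{xy} \dfrac{x}{u} \cdot \dfrac{du}{x} = \int_{u=x}^{xy} \dfrac{1}{u}\,du$, which can be written $\displaystyle\int_x^{xy} \dfrac{1}{t}\,dt$,
since u is a dummy variable: it doesn't matter what letter is used.

So $\displaystyle\int_1^x \dfrac{1}{t}\,dt + \int_1^y \dfrac{1}{t}\,dt = \int_1^x \dfrac{1}{t}\,dt + \int_x^{xy} \dfrac{1}{t}\,dt = \int_1^{xy} \dfrac{1}{t}\,dt$   as required

N.B. In order to simplify expressions such as $e^{-\ln x}$, $e^{2\ln x}$, $e^{2 + \frac{1}{2}\ln x}$ etc.,
the rules of logs must be used.

$e^{2\ln x} = e^{\ln(x^2)} = x^2$
Similarly, $e^{-\ln x} = e^{\ln(x^{-1})} = \dfrac{1}{x}$
$e^{2 + \frac{1}{2}\ln x} = e^2 \cdot e^{\ln(x^{1/2})} = e^2\sqrt{x}$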

N.B. $\pi$ radians = 180°

(c) Trigonometric Functions

Trig. ratios:
sin x;   $\mathrm{cosec}\,x = \dfrac{1}{\sin x}$
cos x;   $\sec x = \dfrac{1}{\cos x}$
tan x;   $\cot x = \dfrac{1}{\tan x}$
sin x and cos x are also known as the circular functions

Possible values:
$-1 \leq \cos x,\ \sin x \leq 1$
$\mathrm{cosec}\,x,\ \sec x \geq 1$ or $\leq -1$

Relationships between sin x, cos x, tan x:

$\tan x = \dfrac{\sin x}{\cos x}$;   $\cot x = \dfrac{\cos x}{\sin x}$

Also, crucially:
$\sin^2 x + \cos^2 x = 1$
Dividing this respectively by (a) $\cos^2 x$; (b) $\sin^2 x$, we obtain:
(a) $\tan^2 x + 1 = \sec^2 x$;   (b) $1 + \cot^2 x = \mathrm{cosec}^2 x$

Series Expansions of sin x, cos x, tan x

$\sin x = x - \dfrac{x^3}{3!} + \dfrac{x^5}{5!} - \dfrac{x^7}{7!} + \dfrac{x^9}{9!} - \dots$
$\cos x = 1 - \dfrac{x^2}{2!} + \dfrac{x^4}{4!} - \dfrac{x^6}{6!} + \dfrac{x^8}{8!} - \dots$
$\tan x = x + \dfrac{x^3}{3} + \dfrac{2x^5}{15} + \dots$

The expansions of sin x and cos x are valid for all x (that of tan x only for $|x| < \pi/2$),
provided x is expressed in radians.
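Useful triangles for finding trig. ratios of 30°, 60°, 45° and associated angles

(i) Equilateral, side 2 units.
[Figure: half of the equilateral triangle, with angles 30° and 60° and sides 1, $\sqrt{3}$ and 2]
Then trig ratios of 30°, 60° can be read off.

(ii) Right-angled isosceles, equal sides of length 1
[Figure: right-angled triangle with angle 45°, equal sides 1 and hypotenuse $\sqrt{2}$]
Then trig ratios of 45° can be read off.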

Graphs of sin x, cos x, tan x

[Figures: graphs of y = sin x and y = cos x, and of y = tan x]

These graphs demonstrate the facts that:

$\sin(-x) = -\sin x$   (odd function)
$\cos(-x) = \cos x$   (even function)
$\tan(-x) = -\tan x$   (odd function)

They show the range of values of x for which the trig functions are positive or
negative.
[Figure: quadrant diagram showing which functions are positive: A (all) in the first quadrant, S (sin) in the second, T (tan) in the third, C (cos) in the fourth]

The various symmetries of the graphs enable calculation of trig functions of
angles outside the range $0 \leq x \leq \dfrac{\pi}{2}$:
e.g. $\sin(\pi - x) = \sin x$,   $\cos(\pi - x) = -\cos x$,   $\sin(x + \pi) = -\sin x$, etc.

They show the periodic nature of the functions:
sin x, cos x periodic with period $2\pi$;   tan x periodic with period $\pi$

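From these triangles, we obtain:

           30°                     45°                     60°
sin        $\dfrac{1}{2}$          $\dfrac{1}{\sqrt{2}}$   $\dfrac{\sqrt{3}}{2}$
cos        $\dfrac{\sqrt{3}}{2}$   $\dfrac{1}{\sqrt{2}}$   $\dfrac{1}{2}$
tan        $\dfrac{1}{\sqrt{3}}$   $1$                     $\sqrt{3}$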

(d) Hyperbolic Functions

sinh x;   $\mathrm{cosech}\,x = \dfrac{1}{\sinh x}$
cosh x;   $\mathrm{sech}\,x = \dfrac{1}{\cosh x}$
tanh x;   $\coth x = \dfrac{1}{\tanh x}$

Definitions

$\sinh x = \dfrac{e^x - e^{-x}}{2}$;   $\cosh x = \dfrac{e^x + e^{-x}}{2}$

Definitions of the other hyperbolic functions in terms of exponential functions
follow from these two.

In particular, since $\tanh x = \dfrac{\sinh x}{\cosh x}$, we have $\tanh x = \dfrac{e^x - e^{-x}}{e^x + e^{-x}}$

Another important identity:   $\cosh^2 x - \sinh^2 x = 1$

Graphs:
[Figure: graphs of y = cosh x and y = sinh x]

Inverse Hyperbolic functions

$\cosh^{-1} x$, $\sinh^{-1} x$, $\tanh^{-1} x$ etc.: the function whose cosh is x, etc.
It is sometimes useful to be able to express these inverse hyperbolic functions in
a different form. The procedure uses the exponential definitions of hyperbolic
functions, and is illustrated, for $\sinh^{-1} x$, as follows:

Let $y = \sinh^{-1} x \;\Rightarrow\; \sinh y = x$
i.e. $x = \dfrac{e^y - e^{-y}}{2} \;\Rightarrow\; 2x = e^y - e^{-y}$
or $e^{2y} - 2x e^y - 1 = 0$: a quadratic equation in $e^y$

Solving, $e^y = \dfrac{2x \pm \sqrt{4x^2 + 4}}{2} = \dfrac{2x \pm 2\sqrt{x^2 + 1}}{2} = x \pm \sqrt{x^2 + 1}$

In fact $e^y = x + \sqrt{x^2 + 1}$, since $e^y$ must be +ve

$\Rightarrow\; y = \sinh^{-1} x = \ln\left(x + \sqrt{x^2 + 1}\right)$
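The logarithmic form can be compared with a library implementation of the inverse function. This is a small illustrative check in Python; `asinh_via_log` is a hypothetical helper name, while `math.asinh` and `math.sinh` are standard library functions.

```python
import math

def asinh_via_log(x):
    """sinh^-1 x written as ln(x + sqrt(x^2 + 1)), using the positive root for e^y."""
    return math.log(x + math.sqrt(x * x + 1))

for x in (-2.0, 0.0, 0.5, 3.0):
    y = asinh_via_log(x)
    # y should agree with math.asinh(x), and sinh(y) should return x
    print(x, y, math.asinh(x), math.sinh(y))
```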

Comments on the graphs:

The curve y = cosh x is the curve formed by a heavy flexible rope or chain etc.
with each end held at the same horizontal level and allowed to hang under its
own weight. We shall see later that the gradient of this curve is sinh x. These
hyperbolic functions therefore do have practical meaning.
The least value of cosh x is 1.
$\cosh(-x) = \cosh x$   (even function)
$\sinh(-x) = -\sinh x$   (odd function)
$\tanh(-x) = -\tanh x$   (odd function)

As $x \to \infty$,

Similarly, suppose $y = \tanh^{-1} x$

We can write: $\tanh y = x \;\Rightarrow\; x = \tanh y =$

TRIGONOMETRIC AND HYPERBOLIC IDENTITIES

$\cos^2 A + \sin^2 A = 1$;   $1 + \tan^2 A = \sec^2 A$;   $\cot^2 A + 1 = \mathrm{cosec}^2 A$

$\sin(A \pm B) = \sin A\cos B \pm \cos A\sin B$
$\sin 2A = 2\sin A\cos A$
$\cos(A \pm B) = \cos A\cos B \mp \sin A\sin B$
$\cos 2A = \cos^2 A - \sin^2 A$
$\tan(A \pm B) = \dfrac{\tan A \pm \tan B}{1 \mp \tan A\tan B}$
$\tan 2A = \dfrac{2\tan A}{1 - \tan^2 A}$

If $t = \tan\dfrac{x}{2}$, then:   $\sin x = \dfrac{2t}{1 + t^2}$,   $\cos x = \dfrac{1 - t^2}{1 + t^2}$

$\sin A\cos B = \dfrac{1}{2}[\sin(A + B) + \sin(A - B)]$
$\cos A\cos B = \dfrac{1}{2}[\cos(A + B) + \cos(A - B)]$
$\sin A\sin B = \dfrac{1}{2}[\cos(A - B) - \cos(A + B)]$
$\cos^2 A = \dfrac{1}{2}(1 + \cos 2A)$
$\sin^2 A = \dfrac{1}{2}(1 - \cos 2A)$

Relationships with exponential functions:

$\cosh x = \dfrac{e^x + e^{-x}}{2}$;   $\sinh x = \dfrac{e^x - e^{-x}}{2}$
$\cos x = \dfrac{e^{ix} + e^{-ix}}{2}$;   $\sin x = \dfrac{e^{ix} - e^{-ix}}{2i}$

As a result:

$\cos ix = \cosh x$;   $\cosh ix = \cos x$
$\sin ix = i\sinh x$;   $\sinh ix = i\sin x$

Inverse hyperbolic functions:

$\cosh^{-1} x = \ln\left(x \pm \sqrt{x^2 - 1}\right)$
$\sinh^{-1} x = \ln\left(x + \sqrt{x^2 + 1}\right)$
$\tanh^{-1} x = \dfrac{1}{2}\ln\left(\dfrac{1 + x}{1 - x}\right)$

For relationships between hyperbolic functions, replace the trig function by the
corresponding hyperbolic function, with the proviso that if any term contains the
product (or implied product) of two sines, then the sign of this term is reversed
for the hyperbolic relationship. This is basically because
$\sin ix = i\sinh x$ and therefore $\sin^2 ix = i^2\sinh^2 x = -\sinh^2 x$
where $i = \sqrt{-1}$ and therefore $i^2 = -1$.

So, to illustrate with various examples:

1. $\cos^2 A + \sin^2 A = 1 \;\Rightarrow\; \cosh^2 A - \sinh^2 A = 1$

2. $\sin 2A = 2\sin A\cos A \;\Rightarrow\; \sinh 2A = 2\sinh A\cosh A$

3. $\cos 2A = \cos^2 A - \sin^2 A \;\Rightarrow\; \cosh 2A = \cosh^2 A + \sinh^2 A$

4. $\tan(A + B) = \dfrac{\tan A + \tan B}{1 - \tan A\tan B} \;\Rightarrow\; \tanh(A + B) = \dfrac{\tanh A + \tanh B}{1 + \tanh A\tanh B}$
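The sign-change rule is easy to spot-check numerically. The following Python sketch evaluates both sides of a few of the relationships; the test values of `x`, `A` and `B` are arbitrary choices made for illustration.

```python
import cmath
import math

x, A, B = 0.7, 0.4, 1.1   # arbitrary test values

# sin(ix) = i sinh x and cos(ix) = cosh x
print(cmath.sin(1j * x), 1j * math.sinh(x))
print(cmath.cos(1j * x), math.cosh(x))

# Hyperbolic counterparts, with the sign reversed on products of two sinhs
identity_1 = math.cosh(A) ** 2 - math.sinh(A) ** 2
lhs_3, rhs_3 = math.cosh(2 * A), math.cosh(A) ** 2 + math.sinh(A) ** 2
lhs_4 = math.tanh(A + B)
rhs_4 = (math.tanh(A) + math.tanh(B)) / (1 + math.tanh(A) * math.tanh(B))
print(identity_1, lhs_3, rhs_3, lhs_4, rhs_4)
```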

9. Functions (general)
(a) Odd and Even functions
i. A function f is odd if
$f(-x) = -f(x)$
The graph of an odd function has point symmetry about the origin: the
graph is unchanged when rotated through 180° about the origin.
Examples are:
sin x, tan x, sinh x,
Any polynomial containing only odd powers of x.

If $f_O$ is an odd function, then   $\displaystyle\int_{-a}^{a} f_O(x)\,dx = 0$

ii. A function f is even if
$f(-x) = f(x)$
The graph of an even function is symmetrical about the y-axis.
Examples are:
cos x, cosh x,
Any polynomial containing only even powers of x.

If $f_E$ is an even function, then   $\displaystyle\int_{-a}^{a} f_E(x)\,dx = 2\int_0^a f_E(x)\,dx$

Most functions are neither odd nor even. However all functions can be expressed
as the sum of an odd function + an even function. If the given function is f, and
functions $f_1$ and $f_2$ are defined as

$f_1(x) = \dfrac{1}{2}\,(f(x) + f(-x))$;   $f_2(x) = \dfrac{1}{2}\,(f(x) - f(-x))$,   then

$f_1$ is even, $f_2$ is odd, and clearly $f(x) = f_1(x) + f_2(x)$.
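The decomposition into even and odd parts can be written as a pair of small helper functions. This is an illustrative Python sketch; the names `even_part` and `odd_part` and the sample polynomial are assumptions made for the demonstration.

```python
def even_part(f):
    """Return f1 with f1(x) = (f(x) + f(-x)) / 2."""
    return lambda x: 0.5 * (f(x) + f(-x))

def odd_part(f):
    """Return f2 with f2(x) = (f(x) - f(-x)) / 2."""
    return lambda x: 0.5 * (f(x) - f(-x))

def f(x):
    return x**3 + 2 * x**2 + 1   # neither odd nor even

f1, f2 = even_part(f), odd_part(f)
for x in (0.5, 2.0):
    # f1 should reproduce 2x^2 + 1, f2 should reproduce x^3
    print(f1(x), 2 * x**2 + 1, f2(x), x**3, f1(x) + f2(x), f(x))
```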

(b) Periodic functions

A function f is said to be periodic with period T if
$f(x + nT) = f(x)$, where n is an integer
This implies that, if f is periodic and if f(x) can take the value, say, k for some
value of x, there will be infinitely many other values of x for which f(x) = k.
This implies that the inverse function, $f^{-1}(x)$, is many-valued.
The graph of a periodic function repeats at intervals of T.
All trig functions are periodic.

EXAMPLE
Consider $f(x) = x^2\cos x - 2x\sin x$
To decide whether f(x) is odd or even, we consider f(−x).
$f(-x) = (-x)^2\cos(-x) - 2(-x)\sin(-x) = x^2\cos x - 2x\sin x = f(x)$

Thus in this example, f(x) is even.

Products of odd and even functions   From the definitions of odd and even functions, we can deduce:
even × even = even
odd × odd = even
odd × even = odd

Graphs: if we know that a function is either odd or even, we only need to know the
shape of the graph on one side of the y-axis to be able to draw the complete graph:
[Figure: sketches of an odd function and an even function]

EXAMPLE

If $f(x) = e^x$ (neither odd nor even), then

$f_1(x) = \dfrac{e^x + e^{-x}}{2} = \cosh x$;   $f_2(x) = \dfrac{e^x - e^{-x}}{2} = \sinh x$
Then $e^x = \cosh x + \sinh x$   (even + odd)

The circular functions, sin x and cos x, are both periodic with period $2\pi$.
The function tan x is periodic with period $\pi$.
(See graphs)
In the case of functions such as sin px, cos px, these repeat when the argument, in this
case px, increases by $2\pi$.
Therefore if the period is T, then

$pT = 2\pi \;\Rightarrow\; T = \dfrac{2\pi}{p}$

Then, e.g., the period of sin 3x is $\dfrac{2\pi}{3}$;   the period of $\cos\dfrac{x}{2}$ is

10. Numerical Methods and Errors


We all know of problems we need to solve where an analytic solution is either impossible
to find or very difficult and not worth the effort. Examples might be:
2

ex dx

Evaluate
1

Find the roots of:   $x^4 - 3x^3 + 7x^2 + x - 5 = 0$


Solve the equation : cos x cosh x = 1

A basic description of a numerical method might be:

A set of rules for solving a problem, or problems of a particular type,
involving only the operations of arithmetic.

This statement needs qualifying a little. These operations, strictly speaking,
mean +, −, ×, ÷. We also need to include evaluating certain functions, e.g. $\sqrt{x}$, cos x,
ln x, $e^x$, etc. for specific values of x, but these can ultimately be reduced to operations
involving only the 4 operations stated (e.g. by expressing the function as a Taylor
series to the accuracy required).
The set of rules may also need to include some kind of test to see whether a certain
condition has been satisfied, to check whether the required accuracy has been achieved,
or whether the process has gone far enough in some other way. If not, the process is
carried out again. In these cases we have an iterative process.

ERRORS
An error is the difference between the exact value, n, of a number and the approximation, $n^*$, to that number. If the error is denoted by $\varepsilon$, then

$\varepsilon = n^* - n$   (or $\varepsilon = n - n^*$ if preferred)

Four sources of errors have been identified as follows:

(i) Idealised models being used for a calculation instead of a real situation;

(ii) Parameters, used in most mathematical formulations, only being available from
measurement and therefore only being accurate to within certain limits;

(iii) If an infinite process is needed, this cannot, by definition, be complete. The
error resulting is a truncation error; i.e. all the terms after, say, the Nth term of an
infinite series representing a number are rejected. The obvious case of this occurs when
all the digits after a chosen decimal place in a number are rejected without correction.

(iv) In digital computation, only a finite number of digits can be given as the result
of a calculation. The operator aims to minimise this error by rounding to the nearest
available power of 10. This is called the round-off error.

Clearly we cannot evaluate the error arising from any of these sources. If we could,
we could eliminate it. However, we do need to try to set, in some cases, or identify, in
others, bounds within which the error must lie. Thus a judgement can eventually be
made about the likely accuracy of the calculation.
In general, if a decimal number is approximated to k decimal places, then:
If the number has been truncated, the truncation error $\varepsilon_1$ is such that
$|\varepsilon_1| \leq 10^{-k}$
If the number has been rounded, the round-off error $\varepsilon_2$ is such that
$|\varepsilon_2| \leq \dfrac{1}{2} \times 10^{-k}$

Example   The decimal number representing the fraction $n = \dfrac{1}{7}$ is

n = 0.14285714285714285...   (recurring)

This can be thought of as the sum of the infinite series

$1 \times 10^{-1} + 4 \times 10^{-2} + 2 \times 10^{-3} + 8 \times 10^{-4} + 5 \times 10^{-5} + 7 \times 10^{-6} + \dots$

A truncation would occur if all terms after the 4th were rejected;
then   $\dfrac{1}{7} \approx 0.1428 = n_1^*$

If the decimal number is rounded to four decimal places,
then   $\dfrac{1}{7} \approx 0.1429 = n_2^*$
since the exact number is closer to 0.1429 than 0.1428.

The truncation error, $\varepsilon_1 = n_1^* - n = -0.000057142...$, so that $|\varepsilon_1| < 10^{-4}$
The round-off error, $\varepsilon_2 = n_2^* - n = 0.000042857...$, so that $|\varepsilon_2| < \frac{1}{2} \times 10^{-4}$ as predicted.
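The two error bounds can be reproduced with Python's `decimal` module, which supports both truncation and rounding to a fixed number of places. The sketch below is illustrative; the variable names and the working precision of 30 digits are arbitrary choices.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP, getcontext

getcontext().prec = 30                           # working precision for 1/7
n = Decimal(1) / Decimal(7)
step = Decimal("0.0001")                         # k = 4 decimal places

n1 = n.quantize(step, rounding=ROUND_DOWN)       # truncated value
n2 = n.quantize(step, rounding=ROUND_HALF_UP)    # rounded value

print(n1, n1 - n)   # truncation error, magnitude below 10^-4
print(n2, n2 - n)   # round-off error, magnitude below 0.5 * 10^-4
```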

In most numerical calculations there are likely to arise:

Propagated errors
These are present at the start of the computation due to errors of measurement, inexact
representation of constants, parameter values etc. involved in any formulae. These
carry through the computation. It is important to know how they are propagated
through the process and to what extent they render the results uncertain.

Generated errors
At every step there is the possibility of new errors arising and combining with errors already propagated, through rounding processes, conversion of numbers from decimal to
binary and back by the computer performing the calculation, etc. They can sometimes
be minimised by the order in which a calculation is carried out.

Residual errors
At the end of the computation, a truncation error might arise and further enlarge the
region of uncertainty.

Roughly speaking, propagated and residual errors depend to a large extent on the
mathematical formulation procedure; generated errors are more dependent on the ordering of the computational steps.

e.g. Even in the simple calculation of
$(k \times \ell) \div m$   or   $\dfrac{k\ell}{m}$,
we may obtain an answer by:
$(k \div m) \times \ell$   or   $k \times (\ell \div m)$

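How errors combine under arithmetic operations

We will suppose that the exact values of two positive numbers are $n_1$ and $n_2$. They are
approximated (represented) by $n_1^*$ and $n_2^*$, the errors being $\varepsilon_1$, $\varepsilon_2$ respectively (where
$\varepsilon_1$, $\varepsilon_2$ may be positive or negative), so that
$n_1 = n_1^* + \varepsilon_1$,   $n_2 = n_2^* + \varepsilon_2$

Addition
$n_1 + n_2 = (n_1^* + \varepsilon_1) + (n_2^* + \varepsilon_2) = (n_1^* + n_2^*) + (\varepsilon_1 + \varepsilon_2)$
The result of the addition is seen to be $n_1^* + n_2^*$, and the error is now $\varepsilon_1 + \varepsilon_2$. If these
have the same sign, they will reinforce so that the errors build up. In general we do
not know the sign of the error, so we need to anticipate the worst case and allow for it
when deciding whether our answer is acceptably accurate.

Subtraction
$n_1 - n_2 = (n_1^* + \varepsilon_1) - (n_2^* + \varepsilon_2) = (n_1^* - n_2^*) + (\varepsilon_1 - \varepsilon_2)$
The result in this case is apparently $n_1^* - n_2^*$, with error $\varepsilon_1 - \varepsilon_2$. In this case if $\varepsilon_1$ and $\varepsilon_2$
have opposite signs, we have a larger error than for either of the individual numbers.
Again, not knowing this information, we must anticipate the worst case.

Multiplication
$n_1 \times n_2 = (n_1^* + \varepsilon_1) \times (n_2^* + \varepsilon_2) = n_1^* n_2^* + \varepsilon_1 n_2^* + \varepsilon_2 n_1^* + \varepsilon_1\varepsilon_2$
Since the initial errors were small enough for $n_1^*$ and $n_2^*$ to be considered good enough
to represent $n_1$ and $n_2$ with sufficient accuracy for our purposes, the last term of this
expression, $\varepsilon_1\varepsilon_2$, can be neglected in comparison with the others. Hence $n_1^* n_2^*$ is taken
as the result of the multiplication, the error being $\varepsilon_1 n_2^* + \varepsilon_2 n_1^*$. Again in this case,
these will reinforce if the signs of $\varepsilon_1$ and $\varepsilon_2$ are the same.

Division
$\dfrac{n_1}{n_2} = \dfrac{n_1^* + \varepsilon_1}{n_2^* + \varepsilon_2} = \dfrac{n_1^* + \varepsilon_1}{n_2^*\left(1 + \frac{\varepsilon_2}{n_2^*}\right)} = \dfrac{1}{n_2^*}\,(n_1^* + \varepsilon_1)\left(1 + \dfrac{\varepsilon_2}{n_2^*}\right)^{-1}$
$= \dfrac{1}{n_2^*}\,(n_1^* + \varepsilon_1)\left(1 - \dfrac{\varepsilon_2}{n_2^*} + O(\varepsilon^2)\right) = \dfrac{n_1^*}{n_2^*} + \dfrac{\varepsilon_1}{n_2^*} - \dfrac{\varepsilon_2 n_1^*}{{n_2^*}^2} + O(\varepsilon^2)$

Apparent value $\dfrac{n_1^*}{n_2^*}$,   error $\dfrac{\varepsilon_1}{n_2^*} - \dfrac{\varepsilon_2 n_1^*}{{n_2^*}^2}$;   terms reinforce if $\varepsilon_1$, $\varepsilon_2$ have opposite signs.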

Examples
Suppose that two numbers, $n_1$ and $n_2$, are given as $n_1^* = 4.63$, $n_2^* = 2.15$, where these
numbers have been corrected to two decimal places. The errors in each could therefore
have absolute values of up to $\frac{1}{2} \times 10^{-2} = 0.005$ (see p.23). The errors arising when
performing the following calculations could be as large as shown:

$n_1^* + n_2^* = 6.78$        |error| ≤ 0.005 + 0.005 = 0.01
$n_1^* - n_2^* = 2.48$        |error| ≤ 0.005 + 0.005 = 0.01
$n_1^* \times n_2^* = 9.9545$     |error| ≤ 0.005 × 2.15 + 0.005 × 4.63 = 0.0339
$\dfrac{n_1^*}{n_2^*} = 2.1534..$      |error| ≤ $\dfrac{0.005 \times 2.15 + 0.005 \times 4.63}{2.15^2}$ = 0.0073..

If, in fact, these numbers are exactly $n_1 = 4.628$, $n_2 = 2.146$, $\varepsilon_1$ and $\varepsilon_2$ are both
negative;
$\varepsilon_1 = 4.628 - 4.63 = -0.002$,   $\varepsilon_2 = 2.146 - 2.15 = -0.004$
so the errors resulting from the addition and multiplication operations will in this
case be larger than those resulting from subtraction and division. In particular, this
means that if the answers are required to be accurate to two decimal places, we cannot
guarantee this if the numbers are corrected to two decimal places before the calculation
is made.

$n_1 + n_2 = 6.774$         |error| = 0.006       (Answer = 6.77 corr. to 2 d.p.)
$n_1 - n_2 = 2.482$         |error| = 0.002       (Answer = 2.48 corr. to 2 d.p.)
$n_1 \times n_2 = 9.931688$     |error| = 0.022812    (Answer = 9.93 corr. to 2 d.p.)
$\dfrac{n_1}{n_2} = 2.15657..$      |error| = 0.0031...   (Answer = 2.16 corr. to 2 d.p.)

Relative error
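The importance of an error can often be better appreciated if it is compared with the quantity being considered.
We therefore define relative error as
\[ \text{Relative error} = \frac{\varepsilon}{n} \approx \frac{\varepsilon}{\bar{n}} \]
since in practice we cannot use $n$, which is not known.

A relative error is often given as a percentage.
e.g. We may be told that $\bar{n}$ is accurate to within $r\%$. Then
\[ \varepsilon = \pm\frac{r}{100}\,n \quad \left(\text{or } \pm\frac{r}{100}\,\bar{n}\right) \quad \text{so that} \quad n = \bar{n}\left(1 \pm \frac{r}{100}\right) \]
As an example, consider the product of two numbers $n_1$ and $n_2$, approximated by $\bar{n}_1$ and $\bar{n}_2$ with errors within $r_1\%$, $r_2\%$ respectively.
\[ n_1 n_2 = \bar{n}_1\left(1 \pm \frac{r_1}{100}\right)\bar{n}_2\left(1 \pm \frac{r_2}{100}\right) = \bar{n}_1\bar{n}_2\left(1 \pm \frac{r_1}{100} \pm \frac{r_2}{100} \pm \frac{r_1 r_2}{10000}\right) \]
\[ = \bar{n}_1\bar{n}_2\left(1 \pm \frac{r_1 + r_2}{100} + O\!\left(\frac{r^2}{10000}\right)\right) \quad \text{in the worst case, i.e. the errors having the same sign.} \]
Thus the number calculated as the product, $\bar{n}_1\bar{n}_2$, is within $(r_1 + r_2)\%$ of the true value.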


We have discussed the accuracy of a number from the point of view of numbers of decimal places given. The concept of relative error introduces the accuracy implied by numbers of significant figures, which may be a more appropriate measure in some cases.
As an example, consider the value of
\[ T = 1 - \cos x \]
when $x = 1^\circ$, all numbers being corrected to 4 d.p., so
\[ T = 1 - 0.9998 = 0.0002 \]
\[ |\text{Error in } T| \le 0.5 \times 10^{-4} = 0.00005 \]
i.e. $0.00015 \le T \le 0.00025$. So
\[ \frac{0.00005}{0.00025} \le |\text{relative error in } T| \le \frac{0.00005}{0.00015} \]
or
\[ 0.2 \le |\text{relative error in } T| \le 0.3333 \]
i.e. the relative error in $T$ lies between $-20\%$ and $+33\%$.

Although the error 0.00005 sounds small, it is not small relative to the value of $T$, giving, as we see, an error of between -20 and 33% in the value.

Type of Growth of Errors as a result of Calculation


Suppose after a calculation involving rounded numbers the error is $R(\varepsilon)$, and $N$ operations have been involved. Then:
If $R(\varepsilon) \propto N\varepsilon$
we say that the growth of the error is linear. This is normal, unavoidable and doesn't usually matter too much: we can cope with it, and make the necessary allowances.
If $R(\varepsilon) \propto k^N\varepsilon$
where $k$ is some constant > 1, we say that the growth is exponential. This can be disastrous and should be avoided. A procedure which shows exponential growth is called unstable.


An Example of an unstable calculation


A function $f_n(x)$ is defined as
\[ f_n(x) = n!\left[e^x - \left(1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \dots + \frac{x^n}{n!}\right)\right] \]
For any values of $x$ and $n$, this can be computed from the given definition.
However, it can be shown that $f_n(x)$ satisfies the recurrence relation
\[ f_{n+1}(x) = (n + 1)f_n(x) - x^{n+1} \]
It is interesting to compare the values of $f_n(x)$ for a given value of $x$, say $x = 1$, obtained for increasing values of $n$, by using (a) the definition of the function and (b) the recurrence relation.
N.B. We know that $e^1 = 2.718281828$ correct to 9 decimal places.
Values of $f_n(1)$:

n | from definition, corr. to 5 d.p. | from recurrence relation, corr. to 4 d.p. | from recurrence relation, corr. to 5 d.p. | from recurrence relation, corr. to 6 d.p.
0 | 1.71828 | 1.7183 | 1.71828 | 1.718282
1 | 0.71828 | 0.7183 | 0.71828 | 0.718282
2 | 0.43656 | 0.4366 | 0.43656 | 0.436564
3 | 0.30969 | 0.3098 | 0.30968 | 0.309692
4 | 0.23876 | 0.2392 | 0.23872 | 0.238768
5 | 0.19382 | 0.1960 | 0.19360 | 0.193840
6 | 0.16292 | 0.1760 | 0.16160 | 0.163040
7 | 0.14042 | 0.2320 | 0.13120 | 0.141280
8 | 0.12332 | 0.8560 | 0.04960 | 0.130240
9 | 0.10991 | 6.7040 | -0.55360 | 0.172160
10 | 0.09911 | 66.0400 | -6.53600 | 0.721600
11 | 0.09023 | | | 6.937600
12 | 0.08281 | | | 82.2512
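The recurrence columns of the table can be reproduced with the sketch below. The function names `f_recurrence` and `f_tail` are hypothetical. Exact rational arithmetic is used so that the only error present is the rounding of the starting value $f_0(1) = e - 1$; the reference values use the rearrangement $f_n(1) = n!\sum_{k>n} 1/k!$, which is an assumption of this sketch (it follows from the definition and the series for $e$) chosen because it avoids subtracting nearly equal numbers.

```python
from fractions import Fraction
from math import factorial


def f_recurrence(start, n_max):
    """Apply f_{n+1}(1) = (n + 1) f_n(1) - 1 exactly, from a rounded f_0(1)."""
    values = [Fraction(start)]          # e.g. "1.7183" is e - 1 to 4 d.p.
    for n in range(n_max):
        values.append((n + 1) * values[-1] - 1)
    return values


def f_tail(n, terms=30):
    """Evaluate f_n(1) as n! * sum over k > n of 1/k! (truncated series)."""
    return sum(factorial(n) / factorial(k) for k in range(n + 1, n + 1 + terms))


for start in ("1.7183", "1.71828", "1.718282"):
    values = f_recurrence(start, 10)
    print(start, float(values[10]), round(f_tail(10), 5))
```

Each step multiplies the inherited error by $(n + 1)$, so after $n$ steps the initial rounding error has grown by a factor of $n!$. This is why one extra decimal place in the starting value only delays the breakdown by a step or two.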


Chapter 2.
MATRICES
AND THEORY OF
SYSTEMS OF LINEAR EQUATIONS


1. Matrices
A matrix is a rectangular array of numbers or elements.
A matrix is usually denoted by a capital letter. The elements are denoted by lower case letters, sometimes with subscripts indicating the position of the element within the matrix. The first subscript is the number of the row containing that element; the second is the number of the column containing the element.
The order of a matrix is (the number of rows) × (the number of columns).
2. Algebraic Operations between matrices
Consider matrices $A = (a_{ij})$, order $n_A \times m_A$;
$B = (b_{ij})$, order $n_B \times m_B$;
$C = (c_{ij})$, order $n_C \times m_C$.
Equality of two matrices: $A = B$ if and only if:
(i) $n_A = n_B$ and $m_A = m_B$ (the orders of $A$ and $B$ are equal);
and (ii) $a_{ij} = b_{ij}$ for all relevant $i, j$.

Addition (Subtraction): Only possible to calculate $A + B$ (or $A - B$) if order($A$) = order($B$).
Then if $C = A + B$, $\; c_{ij} = a_{ij} + b_{ij}$
(if $C = A - B$, $\; c_{ij} = a_{ij} - b_{ij}$)

Multiplication by a scalar
If $C = \lambda A$, then $c_{ij} = \lambda a_{ij}$ for all relevant $i, j$

Corollary: If each element of a matrix contains a common factor, then this is a factor of the matrix itself.
Product of two matrices

Only possible to define and calculate $AB$ if $m_A = n_B$.

If $C = AB$, then $c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + a_{i3}b_{3j} + \dots + a_{i,m_A}b_{n_B,j}$
i.e. The $ij$th element of $C$ is calculated from the $i$th row of $A$ and the $j$th column of $B$.
Note 1: In general, matrix multiplication is not commutative, i.e.
$AB \neq BA$
Note 2: Matrix multiplication is associative, i.e.
$A(BC) = (AB)C$
The expression ABC, written without brackets, is therefore valid.
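The row-by-column rule for the product can be sketched in a few lines. The function name `matmul` and the list-of-rows representation are assumptions made for illustration, and the two matrices are illustrative values chosen so that $AB$ and $BA$ both exist but have different orders.

```python
def matmul(A, B):
    """Multiply matrices stored as lists of rows: c_ij = sum_k a_ik * b_kj."""
    if len(A[0]) != len(B):
        raise ValueError("number of columns of A must equal number of rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


A = [[1, 2], [4, 3], [7, 1]]        # order 3 x 2
B = [[2, 1, 3], [1, 0, -1]]         # order 2 x 3
AB = matmul(A, B)                   # order 3 x 3
BA = matmul(B, A)                   # order 2 x 2
print(AB)
print(BA)
```

Here $AB$ and $BA$ do not even have the same order, which is the simplest way to see that matrix multiplication is not commutative.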

EXAMPLES

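(i) $\begin{pmatrix} 1 & 2 & 0 & 3\\ 4 & 1 & 0 & 5\\ 2 & 1 & 7 & 6 \end{pmatrix}$ is a $3 \times 4$ matrix.

(ii) $A = \begin{pmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23} \end{pmatrix}$ is a $2 \times 3$ matrix. This shows how the subscripts are used.

(iii) If $A = \begin{pmatrix} 4 & 1\\ -6 & 0\\ -2 & 3 \end{pmatrix}$, $B = \begin{pmatrix} 1 & 8\\ 7 & 3\\ 4 & 2 \end{pmatrix}$,
then $A + B = \begin{pmatrix} 5 & 9\\ 1 & 3\\ 2 & 5 \end{pmatrix}$; $A - B = \begin{pmatrix} 3 & -7\\ -13 & -3\\ -6 & 1 \end{pmatrix}$

(iv) $5\begin{pmatrix} 3 & 1\\ 4 & 0 \end{pmatrix} = \begin{pmatrix} 15 & 5\\ 20 & 0 \end{pmatrix}$; $\quad \begin{pmatrix} x^2 & xy\\ xy^2 & x^3 \end{pmatrix} = x\begin{pmatrix} x & y\\ y^2 & x^2 \end{pmatrix}$

(v) If $A = \begin{pmatrix} 1 & 2\\ 4 & 3\\ 7 & 1 \end{pmatrix}$, $B = \begin{pmatrix} 2 & 1 & 3\\ 1 & 0 & -1 \end{pmatrix}$, then $AB = \begin{pmatrix} 4 & 1 & 1\\ 11 & 4 & 9\\ 15 & 7 & 20 \end{pmatrix}$
Order: $(3 \times 2)(2 \times 3) \rightarrow 3 \times 3$

But $BA = \begin{pmatrix} 2 & 1 & 3\\ 1 & 0 & -1 \end{pmatrix}\begin{pmatrix} 1 & 2\\ 4 & 3\\ 7 & 1 \end{pmatrix} = \begin{pmatrix} 27 & 10\\ -6 & 1 \end{pmatrix}$
Order: $(2 \times 3)(3 \times 2) \rightarrow 2 \times 2$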

3. The Transpose of a Matrix


The matrix, denoted by $A^T$ (or $A'$), formed from $A$ by writing the rows of $A$ as the columns of $A^T$.
Properties:

$(A + B)^T = A^T + B^T$
$(A^T)^T = A$
$(AB)^T = B^T A^T$

4. Some Special Matrices


A Row Matrix (or row vector): a matrix of order $(1 \times m)$

A Column Matrix (or column vector): a matrix of order $(n \times 1)$

A Zero Matrix (denoted by $0$ or $\mathbf{0}$): a matrix in which every element is zero.

N.B. If $AB = 0$, this does not necessarily imply that either $A = 0$ or $B = 0$.

Corollary: If $AB = AC$, and $A \neq 0$, we cannot necessarily conclude that $B = C$.
If $AB = AC$, then
$AB - AC = 0$
$\Rightarrow A(B - C) = 0$
But from the above, this does not necessarily imply that $(B - C) = 0$.

A Symmetric Matrix: the matrix $A$ is symmetric if $a_{ij} = a_{ji}$,
i.e. it has symmetry about the leading diagonal.

An Antisymmetric Matrix: the matrix $A$ is antisymmetric if $a_{ij} = -a_{ji}$.
This implies that $a_{ii} = 0$.

The Identity, or Unit Matrix: a square matrix, denoted $I$, in which
$a_{ij} = 0$ if $i \neq j$; $\; a_{ii} = 1$
i.e. The elements in the leading diagonal all have value 1; all others have value 0.
$I$ has the property that, for any square matrix $A$ of the same order as $I$:
$AI = IA = A$

EXAMPLES

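(i) $A = \begin{pmatrix} 3 & 0\\ 2 & 1\\ 4 & 6 \end{pmatrix}$; $\quad A^T = \begin{pmatrix} 3 & 2 & 4\\ 0 & 1 & 6 \end{pmatrix}$

(ii) A matrix with three rows and a single column is a $3 \times 1$ column matrix; $(1, 2, 5, 7)$ is a $1 \times 4$ row matrix.

(iii) If $A = \begin{pmatrix} 4 & 0\\ 0 & 0 \end{pmatrix}$ and $B = \begin{pmatrix} 0 & 0\\ 0 & 3 \end{pmatrix}$, then $AB = \begin{pmatrix} 0 & 0\\ 0 & 0 \end{pmatrix} = 0$
However, neither $A$ nor $B$ is the zero matrix.

(iv) $\begin{pmatrix} a & h & g\\ h & b & f\\ g & f & c \end{pmatrix}$ is symmetric; i.e. $a_{12} = a_{21}$, $a_{13} = a_{31}$, $a_{23} = a_{32}$.
$\begin{pmatrix} 0 & 2 & 3\\ -2 & 0 & k\\ -3 & -k & 0 \end{pmatrix}$ is antisymmetric for any value of $k$; i.e. $a_{11} = a_{22} = a_{33} = 0$, $a_{12} = -a_{21}$, $a_{13} = -a_{31}$, $a_{23} = -a_{32}$.

(v) $\begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix}$; $\begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix}$; $\begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}$
Sometimes these are denoted by $I_2$, $I_3$, $I_4$. However, the context usually decides the size.
$\begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix}\begin{pmatrix} 8 & 9\\ 1 & 3 \end{pmatrix} = \begin{pmatrix} 8 & 9\\ 1 & 3 \end{pmatrix}\begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 8 & 9\\ 1 & 3 \end{pmatrix}$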
5. The Inverse Matrix


This is only defined for a square matrix.
If the square matrix in question is $A$, its inverse, where this exists, is denoted by $A^{-1}$.
Where it exists, it is defined so that
$AA^{-1} = A^{-1}A = I$
If $A$ is a square matrix for which an inverse cannot be found, $A$ is said to be singular.
A non-singular matrix therefore has an inverse.
Test for a singular matrix: $A$ is singular if the determinant of $A$ is zero.

Properties of an inverse matrix

1. The inverse of a non-singular square matrix is unique.

2. $(A^{-1})^{-1} = A$

3. If $A$ and $B$ are both non-singular square matrices of the same order, then the product $AB$ is also non-singular, and
$(AB)^{-1} = B^{-1}A^{-1}$

Proof: $(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I$, and similarly $(B^{-1}A^{-1})(AB) = I$, so $B^{-1}A^{-1}$ satisfies the definition of the inverse of $AB$.

Jordan's Method for calculating the inverse of a matrix, using row operations on an augmented matrix.
Suppose $A$ is the square matrix whose inverse we require. We form an augmented matrix; the L.H.S. of this is $A$ and the R.H.S. is the unit matrix of the same order as $A$. The L.H.S. and R.H.S. are then treated as the two sides of a set of simultaneous equations, so that the following operations may be carried out:
a complete row may be multiplied by a non-zero constant;
a complete row may be added to or subtracted from another;
two complete rows may be interchanged.
We aim to perform these row operations systematically on the augmented matrix, so that the L.H.S. becomes $I$. The R.H.S. will then be $A^{-1}$.
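The procedure just described can be sketched as follows. This is a minimal illustration rather than a production routine: the function name `invert` is hypothetical, exact fractions are used to avoid rounding, each pivot is reduced to 1 before it is used, and rows are interchanged whenever a zero pivot is met. The test matrix is an illustrative value.

```python
from fractions import Fraction


def invert(matrix):
    """Invert a square matrix by row operations on the augmented matrix [A : I]."""
    n = len(matrix)
    # Build [A : I] with exact rational entries.
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Look for a non-zero pivot on or below the leading diagonal.
        pivot_row = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot_row is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        # Reduce the pivot to 1, then clear the rest of the column.
        pivot = aug[col][col]
        aug[col] = [x / pivot for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    # The L.H.S. is now I, so the R.H.S. is the inverse.
    return [row[n:] for row in aug]


print(invert([[2, 5], [1, 3]]))
```

If no non-zero pivot can be found in some column, the L.H.S. cannot be reduced to $I$ and the sketch reports the matrix as singular.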


EXAMPLES
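(i) If $A = \begin{pmatrix} 2 & 5\\ 1 & 3 \end{pmatrix}$, then $A^{-1} = \begin{pmatrix} 3 & -5\\ -1 & 2 \end{pmatrix}$ since $\begin{pmatrix} 2 & 5\\ 1 & 3 \end{pmatrix}\begin{pmatrix} 3 & -5\\ -1 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix}$

(ii) Consider the matrix $B = \begin{pmatrix} 6 & -4\\ 9 & -6 \end{pmatrix}$, and suppose $B^{-1}$, if it exists, is $\begin{pmatrix} a & b\\ c & d \end{pmatrix}$

Then $BB^{-1} = I$ to comply with the definition of an inverse matrix, so that
\[ \begin{pmatrix} 6 & -4\\ 9 & -6 \end{pmatrix}\begin{pmatrix} a & b\\ c & d \end{pmatrix} = \begin{pmatrix} 6a - 4c & 6b - 4d\\ 9a - 6c & 9b - 6d \end{pmatrix} = \begin{pmatrix} 1 & 0\\ 0 & 1 \end{pmatrix} \]
$6a - 4c = 1$ and $9a - 6c = 0$: not possible.
Also $6b - 4d = 0$ and $9b - 6d = 1$: not possible.

Therefore $B$ has no inverse since there are no values of $a, b, c, d$ which will satisfy the necessary equations. Thus
$B$ is singular
[N.B. the determinant of $B = (6 \times -6) - (9 \times -4) = 0$, as expected]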

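(iii) Demonstration of Jordan's method

EXAMPLE 1
Find $A^{-1}$, where $A = \begin{pmatrix} 1 & 1 & 1\\ 2 & 0 & 2\\ 2 & -2 & 1 \end{pmatrix}$

Form the augmented matrix:
\[ \left(\begin{array}{ccc|ccc} 1 & 1 & 1 & 1 & 0 & 0\\ 2 & 0 & 2 & 0 & 1 & 0\\ 2 & -2 & 1 & 0 & 0 & 1 \end{array}\right) \]
We use the numbers in the leading diagonal (called pivots) in turn to produce zeros where required in the rest of that column.

$R_2 - 2R_1$, $R_3 - 2R_1$:
\[ \left(\begin{array}{ccc|ccc} 1 & 1 & 1 & 1 & 0 & 0\\ 0 & -2 & 0 & -2 & 1 & 0\\ 0 & -4 & -1 & -2 & 0 & 1 \end{array}\right) \]
$R_2 \div (-2)$, then $R_1 - R_2$, $R_3 + 4R_2$:
\[ \left(\begin{array}{ccc|ccc} 1 & 0 & 1 & 0 & \frac{1}{2} & 0\\ 0 & 1 & 0 & 1 & -\frac{1}{2} & 0\\ 0 & 0 & -1 & 2 & -2 & 1 \end{array}\right) \]
$R_3 \div (-1)$, then $R_1 - R_3$:
\[ \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & 2 & -\frac{3}{2} & 1\\ 0 & 1 & 0 & 1 & -\frac{1}{2} & 0\\ 0 & 0 & 1 & -2 & 2 & -1 \end{array}\right) \]
Thus
\[ A^{-1} = \begin{pmatrix} 2 & -\frac{3}{2} & 1\\ 1 & -\frac{1}{2} & 0\\ -2 & 2 & -1 \end{pmatrix} \]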

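EXAMPLE 2
Find $B^{-1}$, where $B = \begin{pmatrix} 1 & 0.5 & 3\\ -1 & -3 & 1.5\\ 6 & 7 & 8 \end{pmatrix}$

With more awkward numbers in the matrix, it can be useful to reduce the pivot to the value 1 before combining the pivotal row with other rows. Any computer program written to execute Jordan's method would use this procedure.
The augmented matrix is:
\[ \left(\begin{array}{ccc|ccc} 1 & 0.5 & 3 & 1 & 0 & 0\\ -1 & -3 & 1.5 & 0 & 1 & 0\\ 6 & 7 & 8 & 0 & 0 & 1 \end{array}\right) \]
$R_2 + R_1$, $R_3 - 6R_1$:
\[ \left(\begin{array}{ccc|ccc} 1 & 0.5 & 3 & 1 & 0 & 0\\ 0 & -2.5 & 4.5 & 1 & 1 & 0\\ 0 & 4 & -10 & -6 & 0 & 1 \end{array}\right) \]
Divide $R_2$ by -2.5 so that the next pivot to be used (in $R_2$) = 1:
\[ \left(\begin{array}{ccc|ccc} 1 & 0.5 & 3 & 1 & 0 & 0\\ 0 & 1 & -1.8 & -0.4 & -0.4 & 0\\ 0 & 4 & -10 & -6 & 0 & 1 \end{array}\right) \]
$R_1 - \frac{1}{2}R_2$, $R_3 - 4R_2$:
\[ \left(\begin{array}{ccc|ccc} 1 & 0 & 3.9 & 1.2 & 0.2 & 0\\ 0 & 1 & -1.8 & -0.4 & -0.4 & 0\\ 0 & 0 & -2.8 & -4.4 & 1.6 & 1 \end{array}\right) \]
Divide $R_3$ by -2.8 so that the next pivot to be used (in $R_3$) = 1:
\[ \left(\begin{array}{ccc|ccc} 1 & 0 & 3.9 & 1.2 & 0.2 & 0\\ 0 & 1 & -1.8 & -0.4 & -0.4 & 0\\ 0 & 0 & 1 & 1.5714 & -0.5714 & -0.3571 \end{array}\right) \]
$R_1 - 3.9R_3$, $R_2 + 1.8R_3$:
\[ \left(\begin{array}{ccc|ccc} 1 & 0 & 0 & -4.9285 & 2.4285 & 1.3927\\ 0 & 1 & 0 & 2.4285 & -1.4285 & -0.6428\\ 0 & 0 & 1 & 1.5714 & -0.5714 & -0.3571 \end{array}\right) \]
Thus
\[ B^{-1} = \begin{pmatrix} -4.9285 & 2.4285 & 1.3927\\ 2.4285 & -1.4285 & -0.6428\\ 1.5714 & -0.5714 & -0.3571 \end{pmatrix} \]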


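EXAMPLE 3
Find $C^{-1}$, where $C = \begin{pmatrix} 0 & 1 & 0 & 0\\ 0 & 0 & 2 & 0\\ 5 & 0 & 0 & 1\\ 0 & 0 & 0 & -1 \end{pmatrix}$

In this case, the pivot needed for the first stage (and perhaps some of the others) is zero. Clearly a zero pivot cannot be used in a linear combination with another number to give zero. We need to change the pivot. This is done by interchanging complete rows of the augmented matrix. In this case, appropriate interchanges will almost complete the inversion process.
The augmented matrix is:
\[ \left(\begin{array}{cccc|cccc} 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 2 & 0 & 0 & 1 & 0 & 0\\ 5 & 0 & 0 & 1 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & -1 & 0 & 0 & 0 & 1 \end{array}\right) \]
Interchange complete rows appropriately to eliminate zero pivots:
\[ \left(\begin{array}{cccc|cccc} 5 & 0 & 0 & 1 & 0 & 0 & 1 & 0\\ 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 2 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & -1 & 0 & 0 & 0 & 1 \end{array}\right) \]
$R_1 + R_4$:
\[ \left(\begin{array}{cccc|cccc} 5 & 0 & 0 & 0 & 0 & 0 & 1 & 1\\ 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 2 & 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & -1 & 0 & 0 & 0 & 1 \end{array}\right) \]
Finally reduce the L.H.S. to $I$ by division as necessary: $R_1 \div 5$, $R_3 \div 2$, $R_4 \div (-1)$:
\[ \left(\begin{array}{cccc|cccc} 1 & 0 & 0 & 0 & 0 & 0 & \frac{1}{5} & \frac{1}{5}\\ 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 & \frac{1}{2} & 0 & 0\\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & -1 \end{array}\right) \]
Thus
\[ C^{-1} = \begin{pmatrix} 0 & 0 & \frac{1}{5} & \frac{1}{5}\\ 1 & 0 & 0 & 0\\ 0 & \frac{1}{2} & 0 & 0\\ 0 & 0 & 0 & -1 \end{pmatrix} \]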

6. Linear Systems of Simultaneous Equations


(a) Matrix Representation of the system
Consider the equations:
\[ \begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \dots + a_{1n}x_n &= b_1\\ a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + \dots + a_{2n}x_n &= b_2\\ a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + \dots + a_{3n}x_n &= b_3\\ &\;\;\vdots\\ a_{m1}x_1 + a_{m2}x_2 + a_{m3}x_3 + \dots + a_{mn}x_n &= b_m \end{aligned} \]
This is a system of $m$ linear simultaneous equations in $n$ unknowns.

If the matrix with elements $a_{ij}$ is $A$, the $n \times 1$ column vector $(x_1, x_2, x_3, \dots, x_n)^T$ is $x$, and the $m \times 1$ column vector $(b_1, b_2, b_3, \dots, b_m)^T$ is $b$, then the system can be written as $Ax = b$.
(b) Solution: The Regular case
Here $A$ is a square non-singular matrix, so that $n = m$ (same number of equations as unknowns) and $A^{-1}$ exists. In this case the solution is unique.
Possible Methods of Solution
i. Elimination of variables by addition and/or subtraction of scalar multiples of equations.

ii. Associated with i., elimination of variables using elementary row operations on an augmented matrix whose L.H.S. is $A$ and R.H.S. is $b$. The L.H.S. is reduced to upper triangular form and the variables found by back-substitution in the simplified equations represented by the final augmented matrix. This method is known as Gaussian Elimination.

iii. Pre-multiplication of the matrix equation $Ax = b$ by $A^{-1}$:
$A^{-1}Ax = A^{-1}b$
$\Rightarrow Ix = A^{-1}b$
$\Rightarrow x = A^{-1}b$.

iv. Cramer's Rule. (See Maths II notes)

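EXAMPLE: Method ii. Gaussian Elimination

Solve the equations
\[ \begin{aligned} x_1 + 4x_2 - 2x_3 &= 3\\ 2x_1 - 2x_2 + x_3 &= 1\\ 3x_1 + x_2 + 2x_3 &= 11 \end{aligned} \]
Possible row operations:
1. Multiply a row by a scalar $\neq 0$;
2. Add one row to another;
3. Interchange two complete rows.

Form the augmented matrix:
\[ \left(\begin{array}{ccc|c} 1 & 4 & -2 & 3\\ 2 & -2 & 1 & 1\\ 3 & 1 & 2 & 11 \end{array}\right) \]
Use the pivots (leading diagonal elements) in turn to produce the necessary zeros in column 1, then in column 2, etc., resulting in an upper triangular matrix.

$R_2 - 2R_1$, $R_3 - 3R_1$:
\[ \left(\begin{array}{ccc|c} 1 & 4 & -2 & 3\\ 0 & -10 & 5 & -5\\ 0 & -11 & 8 & 2 \end{array}\right) \]
$R_2 \div 5$, then $2R_3 - 11R_2$:
\[ \left(\begin{array}{ccc|c} 1 & 4 & -2 & 3\\ 0 & -2 & 1 & -1\\ 0 & 0 & 5 & 15 \end{array}\right) \]
Now $R_3$ of the matrix represents the equation: $5x_3 = 15 \Rightarrow x_3 = 3$
And $R_2$ represents the equation: $-2x_2 + x_3 = -1 \Rightarrow x_2 = 2$
We finally use the first equation to find $x_1$: $x_1 + 4(2) - 2(3) = 3 \Rightarrow x_1 = 1$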
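Gaussian elimination followed by back-substitution can be sketched as below for the regular case (square, non-singular $A$). The function name `gaussian_solve` is hypothetical, and exact fractions are used so that no rounding error enters; the system solved is $x_1 + 4x_2 - 2x_3 = 3$, $2x_1 - 2x_2 + x_3 = 1$, $3x_1 + x_2 + 2x_3 = 11$.

```python
from fractions import Fraction


def gaussian_solve(A, b):
    """Solve Ax = b for square non-singular A: eliminate, then back-substitute."""
    n = len(A)
    # Augmented matrix [A : b] with exact rational entries.
    aug = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Interchange rows if the pivot is zero.
        pivot_row = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot_row is None:
            raise ValueError("no unique solution: matrix is singular")
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        # Produce zeros below the pivot.
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    # Back-substitution, starting from the last equation.
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


solution = gaussian_solve([[1, 4, -2], [2, -2, 1], [3, 1, 2]], [3, 1, 11])
print(solution)
```

The sketch deliberately stops with an error when the matrix is singular; the more general cases need the parameterised treatment rather than a single solution vector.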

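EXAMPLE: Method iii. Use of the Inverse Matrix

Solve the equations
\[ \begin{aligned} x_1 + x_2 + x_3 &= 2\\ 2x_1 + 2x_3 &= 12\\ 2x_1 - 2x_2 + x_3 &= 15 \end{aligned} \]
This system of equations can be written as
\[ Ax = \begin{pmatrix} 2\\ 12\\ 15 \end{pmatrix}, \quad \text{where } A = \begin{pmatrix} 1 & 1 & 1\\ 2 & 0 & 2\\ 2 & -2 & 1 \end{pmatrix} \text{ and } x = \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix} \]
We have already found
\[ A^{-1} = \begin{pmatrix} 2 & -\frac{3}{2} & 1\\ 1 & -\frac{1}{2} & 0\\ -2 & 2 & -1 \end{pmatrix} \]
Then
\[ x = A^{-1}\begin{pmatrix} 2\\ 12\\ 15 \end{pmatrix} = \begin{pmatrix} 2 & -\frac{3}{2} & 1\\ 1 & -\frac{1}{2} & 0\\ -2 & 2 & -1 \end{pmatrix}\begin{pmatrix} 2\\ 12\\ 15 \end{pmatrix} = \begin{pmatrix} 1\\ -4\\ 5 \end{pmatrix} \]
Thus $x_1 = 1$, $x_2 = -4$, $x_3 = 5$.

Note also that $A$ is the inverse of $A^{-1}$ and that $(A^T)^{-1} = (A^{-1})^T$. If, therefore, we need also to solve a system of equations whose coefficient matrix is $A^{-1}$ or $A^T$, we already have the necessary inverse matrices.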

(c) Solution: More General Systems


We have thus considered the regular case of an $n \times n$ system of simultaneous equations which has a unique solution. We now look at the problem more generally.
We consider the cases of
Homogeneous systems: $Ax = 0$;
Inhomogeneous systems: $Ax = b$, where $b \neq 0$.

We ask: Is there always a solution?

Consider first Homogeneous Systems
The vector $x = 0$ (i.e. $x_1 = 0, x_2 = 0, \dots, x_n = 0$) is clearly always a solution of these equations, since $A \cdot 0 = 0$. So the answer is, "Yes, there is always a solution". Not, on the whole, a very interesting solution: it is known as the trivial solution.
So we ask a supplementary question: Is there ever a non-trivial solution?
Consider the following examples.
Case I
\[ \begin{aligned} x_1 + 4x_2 + x_3 &= 0\\ 2x_1 - x_2 + 2x_3 &= 0\\ -x_1 - 2x_2 + 5x_3 &= 0 \end{aligned} \]
These are represented by:
\[ \left(\begin{array}{ccc|c} 1 & 4 & 1 & 0\\ 2 & -1 & 2 & 0\\ -1 & -2 & 5 & 0 \end{array}\right) \]
and Gaussian elimination ($R_2 - 2R_1$, $R_3 + R_1$) leads to:
\[ \left(\begin{array}{ccc|c} 1 & 4 & 1 & 0\\ 0 & -9 & 0 & 0\\ 0 & 2 & 6 & 0 \end{array}\right) \]
$R_2$ now represents the equation $-9x_2 = 0 \Rightarrow x_2 = 0$;
Substituting in equ.3 gives $2 \cdot 0 + 6x_3 = 0 \Rightarrow x_3 = 0$;
Substituting in equ.1 gives $x_1 + 4 \cdot 0 + 0 = 0 \Rightarrow x_1 = 0$
This is the only (the unique) solution.

Case II
\[ \begin{aligned} x_1 + 4x_2 + x_3 &= 0\\ 2x_1 - x_2 + 2x_3 &= 0\\ x_1 - 14x_2 + x_3 &= 0 \end{aligned} \]
These are represented by:
\[ \left(\begin{array}{ccc|c} 1 & 4 & 1 & 0\\ 2 & -1 & 2 & 0\\ 1 & -14 & 1 & 0 \end{array}\right); \quad \begin{array}{l} R_2 - 2R_1\\ R_3 - R_1 \end{array} \left(\begin{array}{ccc|c} 1 & 4 & 1 & 0\\ 0 & -9 & 0 & 0\\ 0 & -18 & 0 & 0 \end{array}\right); \quad R_3 - 2R_2 \left(\begin{array}{ccc|c} 1 & 4 & 1 & 0\\ 0 & -9 & 0 & 0\\ 0 & 0 & 0 & 0 \end{array}\right) \]
The system has been reduced to two equations. From equ.2 we have, as before, $x_2 = 0$, but the only information we can obtain from equ.1 is the relationship $x_1 + x_3 = 0 \Rightarrow x_1 = -x_3$. There is no more information contained in the equations from which we can derive a unique solution. So the system has infinitely many solutions which can be expressed in terms of a parameter $t$ as
$x_1 = t$, $x_2 = 0$, $x_3 = -t$.

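Now Inhomogeneous Systems: Do these always have a solution?

Consider the following three examples.

Case I
\[ \left(\begin{array}{ccc|c} 1 & 4 & -2 & 3\\ 2 & -2 & 1 & 1\\ 3 & 1 & 2 & 11 \end{array}\right) \]
This system has already been solved; the last stage in the Gaussian elimination process is:
\[ \left(\begin{array}{ccc|c} 1 & 4 & -2 & 3\\ 0 & -2 & 1 & -1\\ 0 & 0 & 5 & 15 \end{array}\right) \]
The solution is:
equ.3: $x_3 = 3$
equ.2: $-2x_2 + 1 \cdot 3 = -1 \Rightarrow x_2 = 2$
equ.1: $x_1 + 4 \cdot 2 - 2 \cdot 3 = 3 \Rightarrow x_1 = 1$

This is the only (the unique) solution.

Case II
\[ \left(\begin{array}{ccc|c} 1 & 4 & -2 & 3\\ 2 & -2 & 1 & 1\\ 3 & -28 & 14 & -11 \end{array}\right) \]
These are represented by:
\[ \left(\begin{array}{ccc|c} 1 & 4 & -2 & 3\\ 2 & -2 & 1 & 1\\ 3 & -28 & 14 & -11 \end{array}\right); \quad \begin{array}{l} R_2 - 2R_1\\ R_3 - 3R_1 \end{array} \left(\begin{array}{ccc|c} 1 & 4 & -2 & 3\\ 0 & -10 & 5 & -5\\ 0 & -40 & 20 & -20 \end{array}\right); \quad \begin{array}{l} R_2 \div 5\\ R_3 - 4R_2 \end{array} \left(\begin{array}{ccc|c} 1 & 4 & -2 & 3\\ 0 & -2 & 1 & -1\\ 0 & 0 & 0 & 0 \end{array}\right) \]
The system is reduced to two equations and cannot be solved uniquely.

All we can do is choose an arbitrary value (a parameter, say $t$) for one of the unknowns, say $x_2$, and express the other two in terms of this. So we have infinitely many solutions which can be expressed in terms of one parameter, $t$.
If $x_2 = t$, from equ.2 we have $x_3 = 2t - 1$
From equ.1 we have $x_1 = 3 - 4t + 2(2t - 1) = 1$
This can be written as:
\[ x = \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix} = \begin{pmatrix} 1\\ t\\ 2t - 1 \end{pmatrix} = \begin{pmatrix} 1\\ 0\\ -1 \end{pmatrix} + t\begin{pmatrix} 0\\ 1\\ 2 \end{pmatrix} \]
The two vector terms of this solution can be thought of as the Particular Solution and the Complementary Function (cf. Linear differential equations). The Particular Solution is just one of the possible solutions of the given equations (obtained when $t = 0$) and the Complementary Function is a solution of the homogeneous system $Ax = 0$, satisfying $Ax = 0$ for all values of $t$.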

Case III
\[ \left(\begin{array}{ccc|c} 1 & 4 & -2 & 3\\ 2 & -2 & 1 & 1\\ 3 & -28 & 14 & 10 \end{array}\right) \]
As in Case II, except for $b_3$:
\[ \left(\begin{array}{ccc|c} 1 & 4 & -2 & 3\\ 2 & -2 & 1 & 1\\ 3 & -28 & 14 & 10 \end{array}\right) \rightarrow \left(\begin{array}{ccc|c} 1 & 4 & -2 & 3\\ 0 & -10 & 5 & -5\\ 0 & -40 & 20 & 1 \end{array}\right) \rightarrow \left(\begin{array}{ccc|c} 1 & 4 & -2 & 3\\ 0 & -2 & 1 & -1\\ 0 & 0 & 0 & 21 \end{array}\right) \]
$R_3$ now represents the equation
$0x_1 + 0x_2 + 0x_3 = 21$
This clearly cannot be satisfied by any values of $x_1, x_2, x_3$: there is therefore no solution.
To summarise:
Homogeneous Equations $Ax = 0$.
Always have a solution, $x = 0$ (the trivial solution), but may have a non-trivial solution as in Case II. Where a non-trivial solution exists, it always represents infinitely many solutions, being dependent on at least one arbitrary parameter.
Inhomogeneous Equations $Ax = b$.
Do not always have a solution. Three cases arise: there may be a unique solution, as in Case I; there may be infinitely many solutions dependent on at least one arbitrary parameter, as in Case II, when the equations are said to be consistent; or there is no solution, as in Case III, when the equations contain contradictory information and are said to be inconsistent.
When we set out to solve a system of equations, it is not obvious which of these cases, Case I, Case II or Case III, will apply. The examples have demonstrated that the use of Gaussian Elimination will identify the system concerned with one of these, without any pre-knowledge being necessary. If the equations are consistent, the Gaussian Elimination process also gives us the means of expressing a solution in terms of parameters as necessary. This is much more helpful than just stating, "There are infinitely many solutions".
The solution of equations by Gaussian Elimination is therefore to be recommended!


The next question is, "Why do these different situations arise?" This depends on the nature of the coefficient matrix, $A$:
$A$ non-singular: Solution is unique. (Case I)
$A$ singular: either the equations are consistent, with infinitely many solutions (Case II), or the equations are inconsistent, with no solution (Case III).


Finally, What determines whether a matrix is singular or non-singular?

In a singular matrix, at least one row is a linear combination of the others. In our examples:
Homogeneous equations, Case II: Here $R_3 = 2R_2 - 3R_1$
i.e. $(1, -14, 1) = 2(2, -1, 2) - 3(1, 4, 1)$

Inhomogeneous equations, Cases II and III: Here $R_3 = 4R_2 - 5R_1$
i.e. $(3, -28, 14) = 4(2, -2, 1) - 5(1, 4, -2)$

If we consider rows of the augmented matrix (the $b_i$ elements as well as the $a_{ij}$) in the inhomogeneous systems, we find:
Case II: $R_3 = 4R_2 - 5R_1$
i.e. $(3, -28, 14 : -11) = 4(2, -2, 1 : 1) - 5(1, 4, -2 : 3)$
Same combination as the $a_{ij}$. Hence consistency, and solutions.
But Case III: $R_3 \neq 4R_2 - 5R_1$ since
$(3, -28, 14 : 10) \neq 4(2, -2, 1 : 1) - 5(1, 4, -2 : 3)$
Hence inconsistency, and no solution.

We say that the rows of $A$ in cases II and III are Linearly Dependent: i.e. at least one row can be formed by taking a linear combination of some or all of the other rows.
A matrix whose rows are linearly dependent is always singular.
If it is not possible to express any row of a matrix as a linear combination of the other rows, they are said to be Linearly Independent.
A matrix whose rows are linearly independent is always non-singular.


The Rank of a Matrix


It can be shown that if the rows of a square matrix are linearly dependent, then so are the columns of the matrix.
Suppose that $A$ is an $n \times n$ matrix whose rows are linearly dependent, and that there are $r$ ($< n$) linearly independent rows. Then there will also be $r$ linearly independent columns.
The number $r$ is called the Rank of the matrix.
The simplest way to find the rank of a matrix is to perform row operations on the matrix, as in Gaussian elimination, to produce an upper triangular matrix. The number of non-zero rows at the end of the operation is the number of linearly independent rows, and therefore the rank of the matrix. The rank of a matrix is not altered by performing these elementary row operations on it. In our example Case II of the Inhomogeneous equations:
\[ \begin{pmatrix} 1 & 4 & -2\\ 2 & -2 & 1\\ 3 & -28 & 14 \end{pmatrix} \rightarrow \begin{pmatrix} 1 & 4 & -2\\ 0 & -10 & 5\\ 0 & -40 & 20 \end{pmatrix} \rightarrow \begin{pmatrix} 1 & 4 & -2\\ 0 & -2 & 1\\ 0 & 0 & 0 \end{pmatrix} \]
There are two non-zero rows in the final matrix which are certainly linearly independent; the rank of the matrix is therefore 2.
The augmented matrix representing a set of equations is often denoted: $A|b$.
If, for a system of equations, rank($A$) = rank($A|b$), then the equations are consistent and have solutions.
If rank($A$) < rank($A|b$), then the equations are inconsistent and have no solution.
Where there is consistency in the system, the number of parameters necessary to give a general solution of the system is:
$n - \text{rank}(A)$
In our Case II, $n = 3$, rank($A$) = 2
$\Rightarrow$ number of parameters needed for the general solution = $3 - 2 = 1$.
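The rank test for consistency can be sketched as follows. The function name `rank` is hypothetical; it counts the non-zero rows left after reduction to echelon form with exact fractions. The coefficient matrix and right-hand sides are those of the inhomogeneous Cases II and III, with $b_3 = -11$ and $b_3 = 10$ respectively.

```python
from fractions import Fraction


def rank(matrix):
    """Count the non-zero rows remaining after reduction to echelon form."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rank_count = 0
    for col in range(len(rows[0])):
        # Find a usable pivot in this column among the rows not yet used.
        pivot_row = next((r for r in range(rank_count, len(rows))
                          if rows[r][col] != 0), None)
        if pivot_row is None:
            continue
        rows[rank_count], rows[pivot_row] = rows[pivot_row], rows[rank_count]
        for r in range(rank_count + 1, len(rows)):
            factor = rows[r][col] / rows[rank_count][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank_count])]
        rank_count += 1
        if rank_count == len(rows):
            break
    return rank_count


A = [[1, 4, -2], [2, -2, 1], [3, -28, 14]]
case_ii = [row + [rhs] for row, rhs in zip(A, [3, 1, -11])]    # augmented A|b
case_iii = [row + [rhs] for row, rhs in zip(A, [3, 1, 10])]    # augmented A|b
print(rank(A), rank(case_ii), rank(case_iii))   # prints: 2 2 3
```

For Case II the two ranks agree, so the system is consistent and needs $3 - 2 = 1$ parameter; for Case III the augmented matrix has the larger rank, so there is no solution.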


(d) Iterative Methods of Solution for Linear Simultaneous Equations


These methods are based on the idea of rewriting the $n$ equations so that in turn each of the $n$ unknowns, $x_i$, $i = 1, 2, 3, \dots, n$, is made the subject of the equation.
i.e. If the set of equations is given as
\[ \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n}\\ a_{21} & a_{22} & \cdots & a_{2n}\\ \vdots & \vdots & & \vdots\\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}\begin{pmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{pmatrix} = \begin{pmatrix} b_1\\ b_2\\ \vdots\\ b_n \end{pmatrix} \]
then the rearrangement is
\[ \begin{aligned} x_1 &= \frac{1}{a_{11}}(b_1 - a_{12}x_2 - a_{13}x_3 - \dots - a_{1n}x_n)\\ x_2 &= \frac{1}{a_{22}}(b_2 - a_{21}x_1 - a_{23}x_3 - \dots - a_{2n}x_n)\\ &\;\;\vdots\\ x_n &= \frac{1}{a_{nn}}(b_n - a_{n1}x_1 - a_{n2}x_2 - \dots - a_{n,n-1}x_{n-1}) \end{aligned} \]
Clearly this requires that $a_{ii} \neq 0$: the diagonal elements are non-zero. If necessary, the equations can be rearranged, not only to ensure that $a_{ii} \neq 0$, but also to try to arrange values as large as possible for the $a_{ii}$.
We then aim to use these equations as iteration formulae. The process will, clearly, have to be convergent to be of use. This will depend on the values of the coefficients: there are tests which can be applied. (See end of notes)
A set of initial values of the $x_i$ are chosen, often all as zero unless there is a priori information about the approximate size of the solutions. We shall call these $x_i^{(0)}$, and successive iterations $x_i^{(k)}$, where $k$ is the number of the iteration.


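Jacobi's Method
Jacobi's method involves the most basic iteration process of the type:
\[ \begin{aligned} x_1^{(k+1)} &= \frac{1}{a_{11}}\left(b_1 - a_{12}x_2^{(k)} - a_{13}x_3^{(k)} - \dots - a_{1n}x_n^{(k)}\right)\\ x_2^{(k+1)} &= \frac{1}{a_{22}}\left(b_2 - a_{21}x_1^{(k)} - a_{23}x_3^{(k)} - \dots - a_{2n}x_n^{(k)}\right)\\ &\;\;\vdots\\ x_n^{(k+1)} &= \frac{1}{a_{nn}}\left(b_n - a_{n1}x_1^{(k)} - a_{n2}x_2^{(k)} - \dots - a_{n(n-1)}x_{n-1}^{(k)}\right) \end{aligned} \]
We see that the set of values $\{x_i^{(k)}\}$ is used in the R.H.S. of each equation until the whole iteration is completed.
Gauss-Seidel Method
The Gauss-Seidel method involves updating the subdiagonal variables as the computation proceeds. The iteration process is:
\[ \begin{aligned} x_1^{(k+1)} &= \frac{1}{a_{11}}\left(b_1 - a_{12}x_2^{(k)} - a_{13}x_3^{(k)} - \dots - a_{1n}x_n^{(k)}\right)\\ x_2^{(k+1)} &= \frac{1}{a_{22}}\left(b_2 - a_{21}x_1^{(k+1)} - a_{23}x_3^{(k)} - \dots - a_{2n}x_n^{(k)}\right)\\ &\;\;\vdots\\ x_n^{(k+1)} &= \frac{1}{a_{nn}}\left(b_n - a_{n1}x_1^{(k+1)} - a_{n2}x_2^{(k+1)} - \dots - a_{n(n-1)}x_{n-1}^{(k+1)}\right) \end{aligned} \]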


Example of the Iteration Method


Solve
\[ \begin{aligned} 10x_1 - x_2 + x_3 &= 20\\ x_1 - 20x_2 + 2x_3 &= -16\\ 2x_1 - x_2 + 20x_3 &= 23 \end{aligned} \]
Rewrite as
\[ \begin{aligned} x_1 &= \frac{1}{10}(20 + x_2 - x_3)\\ x_2 &= \frac{1}{20}(16 + x_1 + 2x_3)\\ x_3 &= \frac{1}{20}(23 - 2x_1 + x_2) \end{aligned} \]
Note: $A$ is diagonally dominant. Therefore convergence is likely because $1/a_{ii}$ is small.

Jacobi iteration
\[ \begin{aligned} x_1^{(k+1)} &= (20 + x_2^{(k)} - x_3^{(k)})/10\\ x_2^{(k+1)} &= (16 + x_1^{(k)} + 2x_3^{(k)})/20\\ x_3^{(k+1)} &= (23 - 2x_1^{(k)} + x_2^{(k)})/20 \end{aligned} \]
Taking $x^{(0)} = 0$ gives
$x^{(0)} = (0, 0, 0)^T$
$x^{(1)} = (2.0, 0.8, 1.15)^T$
$x^{(2)} = (1.965, 1.015, 0.990)^T$
$x^{(3)} = (2.0025, 0.99725, 1.00425)^T$
$x^{(4)} = (1.9993, 1.00055, 0.9996125)^T$


Using Gauss-Seidel iteration, we get
\[ \begin{aligned} x_1^{(k+1)} &= (20 + x_2^{(k)} - x_3^{(k)})/10\\ x_2^{(k+1)} &= (16 + x_1^{(k+1)} + 2x_3^{(k)})/20\\ x_3^{(k+1)} &= (23 - 2x_1^{(k+1)} + x_2^{(k+1)})/20 \end{aligned} \]
Taking $x^{(0)} = 0$ gives
$x^{(0)} = (0, 0, 0)^T$
$x^{(1)} = (2.0, 0.9, 0.995)^T$
$x^{(2)} = (1.9905, 0.999025, 1.00090125)^T$
$x^{(3)} = (1.999812375, 1.000080744, 1.000022856)^T$
$x^{(4)} = (2.0000058, 1.0000026, 0.9999995)^T$
c.f. exact solution: $x = (2, 1, 1)^T$
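Both iterations for this system can be sketched in a few lines. The function names `jacobi_step`, `gauss_seidel_step` and `iterate` are hypothetical; the update formulae are the rearranged equations $x_1 = (20 + x_2 - x_3)/10$, $x_2 = (16 + x_1 + 2x_3)/20$, $x_3 = (23 - 2x_1 + x_2)/20$.

```python
def jacobi_step(x):
    """One Jacobi sweep: every update uses only the previous iterate."""
    x1, x2, x3 = x
    return ((20 + x2 - x3) / 10,
            (16 + x1 + 2 * x3) / 20,
            (23 - 2 * x1 + x2) / 20)


def gauss_seidel_step(x):
    """One Gauss-Seidel sweep: each new value is used as soon as it is known."""
    x1, x2, x3 = x
    x1 = (20 + x2 - x3) / 10
    x2 = (16 + x1 + 2 * x3) / 20
    x3 = (23 - 2 * x1 + x2) / 20
    return (x1, x2, x3)


def iterate(step, x0, count):
    """Return the list of iterates x(0), x(1), ..., x(count)."""
    history = [x0]
    for _ in range(count):
        history.append(step(history[-1]))
    return history


jacobi = iterate(jacobi_step, (0.0, 0.0, 0.0), 4)
seidel = iterate(gauss_seidel_step, (0.0, 0.0, 0.0), 4)
for k, (xj, xg) in enumerate(zip(jacobi, seidel)):
    print(k, xj, xg)
```

The only difference between the two functions is whether the freshly computed values overwrite the old ones within a sweep, which is why Gauss-Seidel needs no second copy of the vector.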


General Formulation of these Methods


These methods can be written in matrix form as
\[ x^{(k+1)} = Mx^{(k)} + c \]
If we write the coefficient matrix $A$ as $A = L + D + U$,
i.e. a strictly lower triangular matrix + a diagonal matrix + a strictly upper triangular matrix

[Figure: the matrix $A$ divided into its strictly upper triangular part $U$, its diagonal part $D$ and its strictly lower triangular part $L$.]

Then for the Jacobi Method,
\[ M = -D^{-1}(L + U), \qquad c = D^{-1}b \]
These are usually referred to as $M_J$ and $c_J$.

For the Gauss-Seidel Method,
\[ M = -(D + L)^{-1}U, \qquad c = (D + L)^{-1}b \]
These are usually referred to as $M_G$ and $c_G$.
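As a short worked derivation (a sketch using only the splitting $A = L + D + U$ and the component formulae of the two methods, and assuming $D$ and $D + L$ are invertible, i.e. all $a_{ii} \neq 0$), the matrix forms follow directly from $Ax = b$:

```latex
\begin{aligned}
\text{Jacobi:}\quad
  & D\,x^{(k+1)} = b - (L + U)\,x^{(k)}
  &&\Rightarrow\quad
  x^{(k+1)} = -D^{-1}(L + U)\,x^{(k)} + D^{-1}b \\[4pt]
\text{Gauss-Seidel:}\quad
  & D\,x^{(k+1)} = b - L\,x^{(k+1)} - U\,x^{(k)}
  &&\Rightarrow\quad
  (D + L)\,x^{(k+1)} = b - U\,x^{(k)} \\
  & &&\Rightarrow\quad
  x^{(k+1)} = -(D + L)^{-1}U\,x^{(k)} + (D + L)^{-1}b
\end{aligned}
```

In the Gauss-Seidel case the term $L\,x^{(k+1)}$ appears on the right because the subdiagonal variables have already been updated when each row is evaluated.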


The Methods will converge to a solution if
(a) the matrix $A$ is diagonally dominant (this is a sufficient condition)
or, more generally,
(b) the absolute values of all the eigenvalues of the iteration matrix $M$ are less than 1. This is a necessary and sufficient condition.
Iteration Methods are particularly useful when the matrix $A$ is large and sparse (i.e. has many zero elements). Problems involving sets of equations such as these occur, for instance, in Finite Element and Finite Difference Analysis.
Other iteration methods involve weighting constants in the iteration formulae, and are known as Under- or Over-Relaxation methods. On the whole they can converge more quickly than either Jacobi or Gauss-Seidel, provided the best weighting constants are found, and this is usually a matter of trial and error.


APPENDIX
Theorems governing Solutions of Linear Systems of Equations
1. If the homogeneous system $Ax = 0$ has the solution $x = u$, then for any constant $\lambda$, another (more general) solution is
$x = \lambda u$

2. If the homogeneous system $Ax = 0$ has two solutions $x = u_1$ and $x = u_2$ then a third solution is
$x = u_1 + u_2$

Combining these two theorems, we can state that a more general solution is
$x = \lambda u_1 + \mu u_2$
where $\lambda$ and $\mu$ are any two constants.
3. If the inhomogeneous equation $Ax = b$ has a solution $x = v$ and the corresponding homogeneous equation $Ax = 0$ has the solution $x = u$, then for any constant $\lambda$, a more general solution of $Ax = b$ is
$x = \lambda u + v$



Chapter 3.
DIFFERENTIATION


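1. Definition of the Derivative, or Differential Coefficient, of a Function

Given a function $y = f(x)$:
\[ \frac{dy}{dx} = f'(x) = \lim_{\delta x \to 0}\left\{\frac{f(x + \delta x) - f(x)}{\delta x}\right\} \]
Also
\[ \frac{d^2y}{dx^2} = \frac{d}{dx}\left(\frac{dy}{dx}\right) = f''(x) = \lim_{\delta x \to 0}\left\{\frac{f'(x + \delta x) - f'(x)}{\delta x}\right\} \]
and similarly for higher derivatives.

[Figure: graph of $y = f(x)$ showing the points $x$ and $x + \delta x$ on the $x$-axis, the values $f(x)$ and $f(x + \delta x)$ on the $y$-axis, and the increment $f(x + \delta x) - f(x)$.]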

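2. Derivatives of a few important functions

$f(x)$ | $f'(x)$
$x^n$ | $nx^{n-1}$
$e^x$ | $e^x$
$\ln x$ | $\dfrac{1}{x}$
$\sin x$ | $\cos x$
$\cos x$ | $-\sin x$
$\tan x$ | $\sec^2 x$

3. Differentiation is a Linear Operation

Hence:

(i) $\dfrac{d}{dx}\{f(x) + g(x)\} = \dfrac{d}{dx}f(x) + \dfrac{d}{dx}g(x)$;

(ii) $\dfrac{d}{dx}\{\lambda f(x)\} = \lambda\dfrac{d}{dx}f(x)$, where $\lambda$ is constant.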

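EXAMPLES

The use of the definition to find two derivatives.

(i) $f(x) = x^2$
\[ \begin{aligned} f'(x) &= \lim_{\delta x \to 0}\left\{\frac{(x + \delta x)^2 - x^2}{\delta x}\right\}\\ &= \lim_{\delta x \to 0}\left\{\frac{x^2 + 2x\,\delta x + \delta x^2 - x^2}{\delta x}\right\}\\ &= \lim_{\delta x \to 0}\left\{\frac{\delta x(2x + \delta x)}{\delta x}\right\} = 2x \end{aligned} \]
(ii) $f(x) = e^x$
\[ \begin{aligned} f'(x) &= \lim_{\delta x \to 0}\left\{\frac{e^{x + \delta x} - e^x}{\delta x}\right\} = \lim_{\delta x \to 0}\left\{\frac{e^x e^{\delta x} - e^x}{\delta x}\right\} = \lim_{\delta x \to 0}\left\{\frac{e^x(e^{\delta x} - 1)}{\delta x}\right\}\\ &= \lim_{\delta x \to 0}\left\{\frac{e^x\left(1 + \delta x + \frac{\delta x^2}{2!} + \frac{\delta x^3}{3!} + \dots - 1\right)}{\delta x}\right\}\\ &= \lim_{\delta x \to 0}\left\{e^x\left(1 + \frac{\delta x}{2!} + \frac{\delta x^2}{3!} + \dots\right)\right\}\\ &= e^x \end{aligned} \]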
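The limiting process in the definition can be illustrated numerically. The sketch below evaluates the difference quotient for $x^2$ at $x = 3$ and for $e^x$ at $x = 1$ with a shrinking increment; the function name `difference_quotient` and the chosen evaluation points are assumptions made only for illustration.

```python
import math


def difference_quotient(f, x, dx):
    """Return (f(x + dx) - f(x)) / dx, the quantity whose limit is f'(x)."""
    return (f(x + dx) - f(x)) / dx


for dx in (1e-1, 1e-3, 1e-5):
    print(dx,
          difference_quotient(lambda t: t ** 2, 3.0, dx),   # approaches 2x = 6
          difference_quotient(math.exp, 1.0, dx))           # approaches e^1
```

Making the increment very much smaller than this does not keep improving the answer on a computer: the numerator becomes a difference of nearly equal rounded numbers, the same loss of relative accuracy seen with $1 - \cos x$.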

Use of these, together with the linear properties, to find derivatives of $\sinh x$ and $\cosh x$
(i) If $y = \sinh x = \dfrac{e^x - e^{-x}}{2}$
Then
\[ \frac{dy}{dx} = \frac{d}{dx}\left(\frac{e^x - e^{-x}}{2}\right) = \frac{e^x - (-e^{-x})}{2} = \frac{e^x + e^{-x}}{2} = \cosh x \]
(ii) If $y = \cosh x = \dfrac{e^x + e^{-x}}{2}$
Then
\[ \frac{dy}{dx} = \frac{d}{dx}\left(\frac{e^x + e^{-x}}{2}\right) = \frac{e^x + (-e^{-x})}{2} = \frac{e^x - e^{-x}}{2} = \sinh x \]
General implications of the linear properties

Any function which consists of sums of constant multiples of elementary functions can be differentiated term by term:
\[ f(x) = 4x^3 + 3\sin x - \frac{2}{x} + 8 \]
\[ f'(x) = 4 \cdot 3x^2 + 3\cos x - 2 \cdot (-1)x^{-2} + 0 = 12x^2 + 3\cos x + \frac{2}{x^2} \]

4. Rules for Differentiation

(a) Function of a Function; Chain Rule
If $y = y(t)$ and in turn $t = t(x)$, then:
\[ \frac{dy}{dx} = \frac{dy}{dt} \cdot \frac{dt}{dx} \]
(b) Product Rule
If $y = u \cdot v$ where $u = u(x)$, $v = v(x)$, then:
\[ \frac{dy}{dx} = \frac{du}{dx} \cdot v + \frac{dv}{dx} \cdot u \]
(c) Quotient Rule
If $y = \dfrac{u}{v}$ where $u = u(x)$, $v = v(x)$, then:
\[ \frac{dy}{dx} = \frac{v\dfrac{du}{dx} - u\dfrac{dv}{dx}}{v^2} \]
OR: Treat $\dfrac{u}{v}$ as $u \cdot v^{-1}$ and use the product rule.


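EXAMPLES
i. $y = e^{5x+3}$
Let $t = 5x + 3$
Then $y = e^t$ and $t = 5x + 3$
$\Rightarrow \dfrac{dy}{dt} = e^t$ and $\dfrac{dt}{dx} = 5$
$\Rightarrow \dfrac{dy}{dx} = \dfrac{dy}{dt} \cdot \dfrac{dt}{dx} = e^t \cdot 5 = 5e^{5x+3}$

ii. $y = \sin^3(x^2)$
Let $u = x^2$ and $t = \sin u$
Then $y = t^3$, $t = \sin u$, $u = x^2$
$\Rightarrow \dfrac{dy}{dt} = 3t^2$, $\dfrac{dt}{du} = \cos u$, $\dfrac{du}{dx} = 2x$
$\Rightarrow \dfrac{dy}{dx} = \dfrac{dy}{dt} \cdot \dfrac{dt}{du} \cdot \dfrac{du}{dx} = 3t^2 \cdot \cos u \cdot 2x = 6x\sin^2(x^2)\cos(x^2)$

EXAMPLE
$y = (5x^2 + 3x - 1)e^{5x}$
This is the product of two functions of $x$; we need to use the product rule.
We can take: $u(x) = 5x^2 + 3x - 1$; $v(x) = e^{5x}$, or vice versa.
Then: $\dfrac{du}{dx} = 10x + 3$; $\dfrac{dv}{dx} = 5e^{5x}$
$\Rightarrow \dfrac{dy}{dx} = (10x + 3)e^{5x} + 5e^{5x}(5x^2 + 3x - 1) = e^{5x}(25x^2 + 25x - 2)$

EXAMPLE
$y = \dfrac{\sin x}{\sqrt{x}}$
Here we need the quotient rule.
Here there is no choice of $u$ and $v$.
Here: $u(x) = \sin x$; $v(x) = \sqrt{x} = x^{1/2}$
Then: $\dfrac{du}{dx} = \cos x$; $\dfrac{dv}{dx} = \dfrac{1}{2}x^{-1/2} = \dfrac{1}{2\sqrt{x}}$
\[ \Rightarrow \frac{dy}{dx} = \frac{\cos x \cdot \sqrt{x} - \dfrac{1}{2\sqrt{x}}\sin x}{x} = \frac{2x\cos x - \sin x}{2x^{3/2}} \]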

5. Use of Natural Logarithms


We take the natural log of a function we wish to differentiate and then use the properties of logs to simplify the differentiation process in the following cases:
When the function consists of a complicated combination of products and quotients. In this case, the use of logs is helpful.
When the function contains some function of $x$ which is raised to a power which is also a function of $x$ and is therefore variable. In this case the use of logs is essential (well, almost!).
We then differentiate the natural log of the function w.r. to $x$, remembering to use the chain rule where necessary. In particular, there is now a term $\ln y$ in the expression, and
\[ \frac{d}{dx}\ln y = \frac{1}{y}\frac{dy}{dx} \]
using the chain rule.

6. Inverse Functions
In these cases, $y$ is given as an inverse function of $x$. We need to rewrite this as a direct function; usually this means expressing $x$ as a function of $y$. We then differentiate the new function w.r. to $x$ (again remembering that the chain rule will almost certainly be necessary). Finally we need to use the given relationship between $x$ and $y$ to express the derivative in terms of the required variable, usually $x$.
Given $y = f^{-1}(x)$
\[ \Rightarrow f(y) = x \quad \Rightarrow \quad f'(y) \cdot \frac{dy}{dx} = 1 \quad \Rightarrow \quad \frac{dy}{dx} = \frac{1}{f'(y)} \]
Finally $f'(y)$ has to be expressed in terms of the independent variable, $x$.

Remember that here $f'(y)$ means $\dfrac{df}{dy}$; the $'$ representing the derivative of a function always implies differentiation w.r. to the variable in which the function is expressed.


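EXAMPLES

(i) $y = \dfrac{(x^2 + 1)^3\sqrt{2x - 1}}{1 + 2x - 3x^2}$
Taking logs:
\[ \begin{aligned} \ln y &= \ln(x^2 + 1)^3 + \ln\sqrt{2x - 1} - \ln(1 + 2x - 3x^2)\\ &= 3\ln(x^2 + 1) + \frac{1}{2}\ln(2x - 1) - \ln(1 + 2x - 3x^2) \end{aligned} \]
Differentiating w.r. to $x$:
\[ \frac{1}{y} \cdot \frac{dy}{dx} = 3 \cdot \frac{1}{x^2 + 1} \cdot 2x + \frac{1}{2} \cdot \frac{1}{2x - 1} \cdot 2 - \frac{1}{1 + 2x - 3x^2} \cdot (2 - 6x) \]
\[ \Rightarrow \frac{dy}{dx} = y\left(\frac{6x}{x^2 + 1} + \frac{1}{2x - 1} - \frac{2 - 6x}{1 + 2x - 3x^2}\right) = \frac{(x^2 + 1)^3\sqrt{2x - 1}}{1 + 2x - 3x^2}\left(\frac{6x}{x^2 + 1} + \frac{1}{2x - 1} - \frac{2 - 6x}{1 + 2x - 3x^2}\right) \]
(ii) $y = x^x$
Taking logs: $\ln y = \ln(x^x) = x \cdot \ln x$
Differentiating w.r. to $x$:
$\dfrac{1}{y} \cdot \dfrac{dy}{dx} = 1 \cdot \ln x + x \cdot \dfrac{1}{x}$, using the product rule
$\Rightarrow \dfrac{dy}{dx} = y(\ln x + 1) = x^x(1 + \ln x)$

EXAMPLE
$y = \tan^{-1} x$ $\; (= \arctan x)$
Rewrite as: $\tan y = x$
Differentiating w.r. to $x$:
\[ \sec^2 y \cdot \frac{dy}{dx} = 1 \quad \Rightarrow \quad \frac{dy}{dx} = \frac{1}{\sec^2 y} = \frac{1}{1 + \tan^2 y} = \frac{1}{1 + x^2} \]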


7. Implicit Differentiation
A function which can be expressed in the form y = f(x) is said to give y explicitly in terms of x.
A function in which the relationship between x and y is given in the form f(x, y) = 0 and which cannot be rearranged to express y solely in terms of x is said to be an implicit function.
To find the gradient $\frac{dy}{dx}$ of a curve whose equation is an implicit relationship between x and y, we differentiate the implicit function term by term w.r. to x, remembering that the chain rule is necessary for terms in y.

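8. Parametric Differentiation
A functional relationship between x and y is sometimes expressed in parametric form: x and y are each given in terms of a third variable, the parameter, say t or θ.
i.e. $x = x(t), \quad y = y(t)$

To find the gradient of the curve showing the relationship between x and y in these cases, we use the chain rule as follows:
$$\frac{dy}{dx} = \frac{dy}{dt}\cdot\frac{dt}{dx} = \frac{dy/dt}{dx/dt} = \frac{\dot{y}}{\dot{x}}$$
where the dot represents differentiation w.r. to t.

For the second derivative:
$$\begin{aligned}
\frac{d^2y}{dx^2} &= \frac{d}{dx}\left(\frac{dy}{dx}\right) \quad\text{by definition} \\
 &= \frac{d}{dt}\left(\frac{dy}{dx}\right)\cdot\frac{dt}{dx} \quad\text{since } \frac{dy}{dx} \text{ is in terms of } t \\
 &= \frac{\dfrac{d}{dt}\left(\dfrac{dy}{dx}\right)}{\dfrac{dx}{dt}}
\end{aligned}$$
This can also be expressed as:
$$\frac{d^2y}{dx^2} = \frac{\ddot{y}\dot{x} - \ddot{x}\dot{y}}{\dot{x}^3}$$
using the quotient rule to obtain $\frac{d}{dt}\left(\frac{dy}{dx}\right)$.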

EXAMPLE
$$x^3 + 4x^2y + y^3 = 0$$
Differentiate w.r. to x:
$$3x^2 + 4\left(2x\cdot y + x^2\cdot\frac{dy}{dx}\right) + 3y^2\cdot\frac{dy}{dx} = 0$$
$$\Rightarrow\; \frac{dy}{dx}(4x^2+3y^2) = -3x^2 - 8xy$$
$$\Rightarrow\; \frac{dy}{dx} = \frac{-3x^2-8xy}{4x^2+3y^2}$$

If the second derivative of y is required, we can differentiate this, using the quotient and the chain rules:
$$\frac{d^2y}{dx^2} = \frac{\left(-6x - 8\left(1\cdot y + x\cdot\frac{dy}{dx}\right)\right)(4x^2+3y^2) - \left(8x + 6y\frac{dy}{dx}\right)(-3x^2-8xy)}{(4x^2+3y^2)^2}$$
and we can then substitute for $\frac{dy}{dx}$ if the second derivative is required in terms of x and y only.

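EXAMPLE
$$x = \cos t, \quad y = \cos 2t$$
$$\frac{dx}{dt} = -\sin t, \qquad \frac{dy}{dt} = -2\sin 2t$$
Then
$$\frac{dy}{dx} = \frac{-2\sin 2t}{-\sin t} = \frac{2\sin 2t}{\sin t}$$
Simplifying, this becomes:
$$\frac{dy}{dx} = \frac{2\cdot 2\sin t\cos t}{\sin t} = 4\cos t$$
$$\frac{d^2y}{dx^2} = \frac{\dfrac{d}{dt}(4\cos t)}{\dfrac{dx}{dt}} = \frac{-4\sin t}{-\sin t} = 4$$

OR, using
$$\frac{d^2y}{dx^2} = \frac{\ddot{y}\dot{x} - \ddot{x}\dot{y}}{\dot{x}^3}, \quad\text{and}\quad \ddot{x} = -\cos t,\; \ddot{y} = -4\cos 2t:$$
$$\begin{aligned}
\frac{d^2y}{dx^2} &= \frac{-4\cos 2t\,(-\sin t) - (-\cos t)(-2\sin 2t)}{(-\sin t)^3} \\
 &= \frac{4(\cos^2 t - \sin^2 t)\sin t - 2\cos t\cdot 2\sin t\cos t}{-\sin^3 t} \\
 &= 4
\end{aligned}$$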
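As a numerical check of the example above, the second-derivative formula can be evaluated at a few parameter values. This is a sketch only; the helper names are hypothetical and the sample values of t are arbitrary (chosen so that sin t is not zero).

```python
import math

def second_derivative(xd, yd, xdd, ydd):
    """d2y/dx2 = (ydd*xd - xdd*yd) / xd**3 for a parametric curve."""
    return (ydd * xd - xdd * yd) / xd ** 3

def d2y_dx2(t):
    # x = cos t, y = cos 2t, with their first and second t-derivatives
    xd, xdd = -math.sin(t), -math.cos(t)
    yd, ydd = -2.0 * math.sin(2.0 * t), -4.0 * math.cos(2.0 * t)
    return second_derivative(xd, yd, xdd, ydd)

for t in (0.3, 1.0, 2.5):
    print(t, d2y_dx2(t))
```

Each printed value should be close to 4, consistent with y = cos 2t = 2x² − 1 on this curve.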

63

64

Chapter 4.
SERIES, LIMITS
and FUNCTIONS

65

1. Definitions
A Sequence is a set of numbers, or elements, stated in a given order, each element formed according to some pattern.
$$a_1, a_2, a_3, \ldots, a_n, \ldots$$

A Series is the sum of the elements in a sequence.
$$a_1 + a_2 + a_3 + \ldots + a_n + \ldots$$

A Finite Series is a series having a finite number (say N) of (finite) terms. It must always, therefore, have a finite sum.
$$S_N = a_1 + a_2 + a_3 + \ldots + a_N = \sum_{n=1}^{N} a_n$$

An Infinite Series has no last term.
$$S = a_1 + a_2 + a_3 + \ldots + a_n + \ldots = \sum_{n=1}^{\infty} a_n$$

In this case we can think of series for which:
(i) S is infinite (cannot be defined);
(ii) S can be uniquely defined and is finite;
(iii) Neither (i) nor (ii) applies.
In Case (ii) we say that the series is Convergent.
In Cases (i) and (iii) we say that the series is Divergent.

A Power Series is an infinite series which takes the form:
$$a_0 + a_1x + a_2x^2 + a_3x^3 + \ldots + a_nx^n + \ldots = \sum_{n=0}^{\infty} a_nx^n$$
where x may be a variable or a known quantity.

66

EXAMPLES
A Sequence:
$$1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \ldots, \frac{1}{n}, \ldots$$

A Series:
$$1 + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \ldots + \frac{1}{n!} + \ldots = \sum_{n=1}^{\infty}\frac{1}{n!} = e - 1$$

A Finite Series:
$$1 + 5a + 10a^2 + 10a^3 + 5a^4 + a^5 = \sum_{n=0}^{5}\frac{5!}{(5-n)!\,n!}a^n = (1+a)^5$$

Infinite Series
(i) $1 + 2 + 3 + 4 + 5 + \ldots + n + \ldots$
The sum is infinite: this is a Divergent series.

(ii) $1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \ldots + \frac{1}{2^{n-1}} + \ldots = 2$
This is a geometric series with a = 1 and r = 1/2 and therefore has a finite sum, i.e. is Convergent.

(iii) $(-1)^1 + (-1)^2 + (-1)^3 + (-1)^4 + \ldots + (-1)^n + \ldots$
In this case $S_N = -1$ when N is odd, and 0 when N is even. Therefore although these sums are finite, the sum does not converge towards a single value. The series is therefore Divergent.

Power Series
(i) $1 + 2x + 3x^2 + 4x^3 + \ldots + (n+1)x^n + \ldots = \sum_{n=0}^{\infty}(n+1)x^n$

(ii) $2 - \frac{4}{3} + \frac{8}{5} - \frac{16}{7} + \ldots + \frac{(-1)^n 2^{n+1}}{(2n+1)} + \ldots = \sum_{n=0}^{\infty}\frac{(-1)^n 2^{n+1}}{(2n+1)}$
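The three kinds of behaviour of an infinite series can be seen by tabulating partial sums. The sketch below is illustrative only (the helper `partial_sums` is a name introduced here); a finite table suggests the behaviour but does not prove convergence or divergence.

```python
def partial_sums(term, n_terms):
    """Return [S_1, S_2, ..., S_N] for terms a_n = term(n), n = 1..N."""
    sums, total = [], 0
    for n in range(1, n_terms + 1):
        total += term(n)
        sums.append(total)
    return sums

growing = partial_sums(lambda n: n, 6)                  # 1 + 2 + 3 + ...
geometric = partial_sums(lambda n: 0.5 ** (n - 1), 30)  # 1 + 1/2 + 1/4 + ...
oscillating = partial_sums(lambda n: (-1) ** n, 6)      # (-1)^1 + (-1)^2 + ...

print(growing)        # [1, 3, 6, 10, 15, 21]
print(geometric[-1])  # very close to 2
print(oscillating)    # [-1, 0, -1, 0, -1, 0]
```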

67

2. Limits
Informal Definitions of Limits (For a formal definition, see Appendix A.)
(a) For a function
If the values f(x) of a function f approach a value L (which must be a finite number), as x approaches a value c from either direction, we say that f has Limit L as x approaches c, and we write:
$$\lim_{x\to c}\{f(x)\} = L$$
N.B. The limiting value of the function f is not defined as the value of f at x = c. The limit, if it exists at all, is entirely determined by the values of f(x) as x draws closer to c, but not by any value that f may have at x = c. The value of the function may not even be defined at x = c, and yet the limit may still exist.
Left- and Right-handed Limits
The values f(x) approached by a function f as x approaches c from the left and from the right are respectively denoted:
$$\lim_{x\to c^-}\{f(x)\} \quad\text{and}\quad \lim_{x\to c^+}\{f(x)\}$$
These values may be different.
Only if they both exist and are equal is the limit of f(x) as x approaches c said to exist.
(b) For a sequence
If the values of the terms $a_n$ of a sequence approach a value L (which again must be finite) as n becomes very large, we say that the sequence has a limit L, and we write:
$$\lim_{n\to\infty}\{a_n\} = L$$
The terms $a_n$ of a sequence are often expressed as a rational function of n. In order to find the limit L, we divide top and bottom of the fraction by the highest power of n occurring in the denominator and then let $n\to\infty$.

68

EXAMPLE

(a) For a continuous function f(x), $\lim_{x\to c} f(x) = f(c)$.

[Figure: graph of y = f(x) against x, marking the value f(c) approached as x tends to c from the left and from the right.]

However, when there is a discontinuity, or where the function is not defined, there may be a problem.

e.g. $f(x) = \dfrac{\sin x}{x}$ is not defined at x = 0. However:

f(0.1) = 0.99833; f(0.01) = 0.999983; f(0.001) = 0.99999983
f(−0.1) = 0.99833; f(−0.01) = 0.999983; f(−0.001) = 0.99999983

Clearly $f(x)\to 1.0000$ as $x\to 0$ from either direction. Therefore conditions are satisfied for
$$\lim_{x\to 0} f(x) = 1$$
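The table of values of sin x / x can be reproduced with a few lines of code. This is a minimal sketch; the sample points are those used in the example, and x must be in radians.

```python
import math

def f(x):
    """f(x) = sin(x)/x; not defined at x = 0, so x = 0 is never passed in."""
    return math.sin(x) / x

# Approach 0 from the right (x) and from the left (-x)
for x in (0.1, 0.01, 0.001):
    print(x, f(x), f(-x))
```

Both columns approach 1 as x shrinks, which is the numerical evidence for the limit; it is not a proof.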

e.g. The Heaviside Function or Unit Step Function

This is defined as
$$H(t) = \begin{cases} 1 & \text{for } t > 0 \\ 0 & \text{for } t < 0 \end{cases}$$

[Figure: graph of y = H(t) against t.]

So the Left-Hand limit is $\lim_{t\to 0^-} H(t) = 0$; the Right-Hand limit is $\lim_{t\to 0^+} H(t) = 1$.

Both of these limits exist, although H(t) is not defined at t = 0.
However, since the L-H and R-H limits are not the same, $\lim_{t\to 0} H(t)$ does not exist.

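EXAMPLE

(b) Consider the sequence
$$1, \frac{2}{3}, \frac{3}{5}, \frac{4}{7}, \frac{5}{9}, \ldots, a_n, \ldots \quad\text{where } a_n = \frac{n}{2n-1}$$
Dividing top and bottom of $a_n$ by n (the highest power of n occurring in the denominator), we can write
$$a_n = \frac{1}{2-\frac{1}{n}} \to \frac{1}{2} \text{ as } n\to\infty; \quad\text{i.e. } \lim_{n\to\infty}\{a_n\} = \frac{1}{2}$$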

3. Convergence of a series
An intuitive idea of convergence can be given as follows:
We have said that if all terms of an infinite series add up to a consistent finite number, then the series is said to be convergent.
In practice, this means that if we take the sum $S_N$ of all the terms up to and including the Nth term, where N is very large, the value of $S_N$ is more or less the same as the value of $S_{N+1}$, the sum of the first (N + 1) terms, or the value of $S_{N+2}$, etc.
i.e. the sequence formed from these sums,
$$\{S_N, S_{N+1}, S_{N+2}, S_{N+3}, \ldots\}$$
seems to be approaching some limit.
More formally, the series is convergent if
$$\lim_{N\to\infty}\{S_N\} = S \quad\text{where } S_N = \sum_{n=1}^{N} a_n \text{ and is called the Nth partial sum.}$$
Clearly, if $S_N, S_{N+1}, S_{N+2}, \ldots$ all take roughly the same value, then the individual terms $a_{N+1}, a_{N+2}, a_{N+3}, \ldots$ must all be of negligible size. This, then, is a pre-requisite of any series to be convergent, i.e., that
$$a_n \to 0 \text{ as } n\to\infty$$
N.B. Although this condition is necessary for a series to be convergent (and should always be checked as a preliminary test before any other tests for convergence are tried), it is not sufficient to ensure that the series is convergent.
i.e. If we find that the terms do become negligible in size as the series progresses, this merely tells us that it is possible for this series to converge; it does not guarantee convergence. Other tests have to be used to confirm convergence or otherwise.
Absolute and Conditional Convergence
If the series $\sum_{n=1}^{\infty}|a_n|$ is convergent, then so is $\sum_{n=1}^{\infty} a_n$.
Here we say that $\sum_{n=1}^{\infty} a_n$ is absolutely convergent.
If the series $\sum_{n=1}^{\infty}|a_n|$ is divergent, but $\sum_{n=1}^{\infty} a_n$ is convergent, then we say that $\sum_{n=1}^{\infty} a_n$ is conditionally convergent.

70

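Formal Proof
Since $S_N = \sum_{n=1}^{N} a_n$ and $S_{N-1} = \sum_{n=1}^{N-1} a_n$, then:
$$S_N - S_{N-1} = a_N$$
But if the series converges to S, then by the definition of convergence for a sequence:
$$S = \lim_{N\to\infty}\{S_N\} \quad\text{and}\quad S = \lim_{N\to\infty}\{S_{N-1}\}$$
So:
$$\lim_{N\to\infty}\{S_N - S_{N-1}\} = \lim_{N\to\infty}\{S_N\} - \lim_{N\to\infty}\{S_{N-1}\} = S - S = 0$$
i.e. $\lim_{N\to\infty}\{a_N\} = 0$

EXAMPLE
The series $1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \ldots + \frac{1}{n} + \ldots$ is divergent (for proof see p.62), although $a_n = \frac{1}{n}$ so that $\lim_{N\to\infty}\{a_N\} = 0$.

EXAMPLE: a conditionally convergent series
$$1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \ldots + \frac{(-1)^{n+1}}{n} + \ldots = \sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n}$$
is convergent (for proof see p.87), whereas $|1| + |-\frac{1}{2}| + |\frac{1}{3}| + |-\frac{1}{4}| + |\frac{1}{5}| + |-\frac{1}{6}| + \ldots$ is divergent. The given series is therefore conditionally convergent, the condition being that the terms must be added in the given order.
e.g. $(1 + \frac{1}{3} + \frac{1}{5} + \frac{1}{7} + \ldots) - (\frac{1}{2} + \frac{1}{4} + \frac{1}{6} + \frac{1}{8} + \ldots)$ is not defined (i.e. a sum cannot be found) since both series are divergent.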
71

4. The Most Important Power Series: the Taylor Series

This series expresses the values f(x) of a function f as a power series in ascending powers of (x − a), where a is a fixed value. We say that:
The function f has been expanded about the point x = a.
The series is:
$$f(x) = f(a) + (x-a)f'(a) + \frac{(x-a)^2}{2!}f''(a) + \frac{(x-a)^3}{3!}f'''(a) + \ldots$$
For a function f(x) to possess a Taylor expansion about the point a, f(x) and all its derivatives must be properly and uniquely defined at x = a.
For the Taylor expansion of f(x) to be equivalent to the function value at x (i.e. for the expansion to be valid), the series must be convergent for this value of x.
By letting x = a + h, the Taylor series for f(x) about the point x = a can alternatively be expressed as:
$$f(a+h) = f(a) + hf'(a) + \frac{h^2}{2!}f''(a) + \frac{h^3}{3!}f'''(a) + \ldots$$
h may be positive or negative, so that f(x) is evaluated at a point to the right or to the left respectively of a.
Taylor's Theorem
The Taylor series, as given above, is an infinite series. There is also a finite version of the series which therefore has a last term and is known as Taylor's Theorem. The statement of the theorem is as follows:
Provided the function f and the first (n + 1) of its derivatives exist and are continuous at the point a, there exists a number c lying between x and a such that
$$f(x) = f(a) + (x-a)f'(a) + \frac{(x-a)^2}{2!}f''(a) + \ldots + \frac{(x-a)^n}{n!}f^{(n)}(a) + \frac{(x-a)^{n+1}}{(n+1)!}f^{(n+1)}(c)$$
The last term is called the Error or Remainder term because it represents the error made in approximating the value of f(x) by just the first n + 1 terms in the Taylor series.
We usually do not know the value of c, but sometimes by evaluating the maximum possible value of the error term, taking c as the value giving the worst case, we can ascertain whether the approximation is acceptable for our purpose.
72

EXAMPLE

Part (i) If f(x) = sin x, expand f(x) about the point $a = \frac{\pi}{3}$.

We need the value of f and its derivatives at $x = \frac{\pi}{3}$:

$f(x) = \sin x$, so $f\left(\frac{\pi}{3}\right) = \frac{\sqrt{3}}{2}$
$f'(x) = \cos x$, so $f'\left(\frac{\pi}{3}\right) = \frac{1}{2}$
$f''(x) = -\sin x$, so $f''\left(\frac{\pi}{3}\right) = -\frac{\sqrt{3}}{2}$

So $f(x) = \frac{\sqrt{3}}{2} + \frac{1}{2}\left(x - \frac{\pi}{3}\right) - \frac{\sqrt{3}}{4}\left(x - \frac{\pi}{3}\right)^2 + \ldots$

Part (ii) Estimate the maximum size of the error made by approximating sin 62° by the first three terms of this series.
If only the first three terms are used (i.e. up to and including the term in $f''\left(\frac{\pi}{3}\right)$), then the error term is:
$$\frac{(x-a)^3}{3!}f'''(c), \quad\text{with } a = \frac{\pi}{3}$$
First, x = 62° must be expressed in radians: $(x - a) = (62 - 60)° = 2\times\frac{\pi}{180}$ radians.
Next, c is unknown. However, since $f'''(x) = -\cos x$, the max. value of $|f'''(x)| = 1$.
The error term is of the order of
$$\left(\frac{2\pi}{180}\right)^3\cdot\frac{1}{3!} = 0.0000071 \text{ (max)}$$
We therefore expect the first three terms to give at least 4 decimal places correct in the value of sin 62°.
In fact the numerical value given by these three terms is: sin 62° ≈ 0.88295108; calculator value = 0.88294759
|Error| = 0.00000349, < 0.0000071 as expected.

N.B. By taking the case n = 0 in Taylor's theorem, we obtain:
$$f(x) = f(a) + (x-a)f'(c), \quad\text{or:}\quad f'(c) = \frac{f(x)-f(a)}{x-a}, \quad a < c < x$$
The R.H.S. of this expression for $f'(c)$ is the gradient of the straight line joining the points at x and a on the curve y = f(x), and is called the Mean Value of $f'(x)$ in this interval.
The equation is the statement of the Mean Value Theorem which states that, provided f is continuous and differentiable on the interval (a, x), there is at least one value of x, x = c, in this interval for which
$f'(c)$ = the mean value of $f'(x)$ in the interval (a, x)
73

5. The Maclaurin Series

This is a special case of the Taylor series in which we take a = 0.
Substituting a = 0 in the Taylor series for f(x) gives:
$$f(x) = f(0) + xf'(0) + \frac{x^2}{2!}f''(0) + \frac{x^3}{3!}f'''(0) + \frac{x^4}{4!}f^{(4)}(0) + \ldots$$
Clearly, a function f can only be expanded as a Maclaurin series if the values $f(0), f'(0), f''(0), \ldots$ are uniquely defined. The function ln x, for example, has no Maclaurin series since ln 0 is not defined, and none of the derivatives of ln x are defined at x = 0.
The Maclaurin series representing some functions are valid (converge to the value of the function) for all values of x; others are valid only for a limited range of values of x. We can test the series as required using the various tests for convergence.
The following are the Maclaurin expansions of various important elementary functions:
The Exponential Function
$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \ldots$$
The Circular Functions: sin x, cos x (N.B. x must be in radians)
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \ldots$$
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \ldots$$
These three are valid for all x.
The Binomial Series (any n)
$$(1+x)^n = 1 + nx + \frac{n(n-1)}{2!}x^2 + \frac{n(n-1)(n-2)}{3!}x^3 + \ldots$$
ln(1 + x)
$$\ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \ldots$$
These last two are valid only for |x| < 1, or −1 < x < 1.

74

EXAMPLE

Find the Maclaurin series for tan x, up to the term in $x^3$.

$f(x) = \tan x$, so $f(0) = 0$
$f'(x) = \sec^2 x$, so $f'(0) = 1^2 = 1$
$f''(x) = 2\sec x\cdot\sec x\tan x = 2\sec^2 x\tan x$, so $f''(0) = 0$
$f'''(x) = 4\sec x\cdot\sec x\tan x\cdot\tan x + 2\sec^2 x\cdot\sec^2 x$, so $f'''(0) = 2$

Thus
$$\tan x = 0 + x\cdot 1 + \frac{x^2}{2!}\cdot 0 + \frac{x^3}{3!}\cdot 2 + \ldots$$
$$\tan x = x + \frac{x^3}{3} + \ldots$$
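The truncated series for tan x can be compared with the library value for small x. A minimal sketch follows; `tan_maclaurin` is a name introduced here for illustration.

```python
import math

def tan_maclaurin(x):
    """Maclaurin approximation of tan x up to and including the x**3 term."""
    return x + x ** 3 / 3.0

for x in (0.05, 0.1, 0.2):
    print(x, math.tan(x), tan_maclaurin(x))
```

The agreement worsens as x grows, since the neglected terms begin with a multiple of x⁵.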

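EXAMPLE

Test the convergence of the Maclaurin series for cos x.

We need to use the Ratio Test. (This is a power series, the coefficients contain factorials.)
$$a_n = (-1)^n\cdot\frac{x^{2n}}{(2n)!}$$
So
$$a_{n+1} = (-1)^{n+1}\frac{x^{2(n+1)}}{[2(n+1)]!} = (-1)^{n+1}\frac{x^{2n+2}}{(2n+2)!}$$
$$\left|\frac{a_{n+1}}{a_n}\right| = \left|\frac{x^{2n+2}}{(2n+2)!}\cdot\frac{(2n)!}{x^{2n}}\right| = \frac{x^2}{(2n+2)(2n+1)}$$
This tends to 0 as $n\to\infty$ for every fixed x, so K = 0 < 1 and the series converges for all x.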

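EXAMPLE

Derive the Maclaurin series for ln(1 + x) and test for convergence.

$f(x) = \ln(1+x)$, so $f(0) = \ln 1 = 0$
$f'(x) = \frac{1}{1+x}$, so $f'(0) = \frac{1}{1} = 1$
$f''(x) = \frac{-1}{(1+x)^2}$, so $f''(0) = \frac{-1}{1^2} = -1$
$f'''(x) = \frac{1\cdot 2}{(1+x)^3}$, so $f'''(0) = \frac{2}{1^3} = 2!$
$f^{(4)}(x) = \frac{-1\cdot 2\cdot 3}{(1+x)^4}$, so $f^{(4)}(0) = \frac{-3!}{1^4} = -3!$

Similarly, $f^{(5)}(0) = 4!$, $f^{(6)}(0) = -5!$, ...

$$f(x) = 0 + x\cdot 1 + \frac{x^2}{2!}(-1) + \frac{x^3}{3!}(2!) + \frac{x^4}{4!}(-3!) + \ldots$$
Thus:
$$\ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \frac{x^5}{5} - \ldots$$

To test convergence, using the Root test:
$$|a_n| = \left|\frac{x^n}{n}\right| = \frac{|x|^n}{n}$$
Then
$$\sqrt[n]{|a_n|} = \frac{|x|}{\sqrt[n]{n}}$$
So:
$$\lim_{n\to\infty}\sqrt[n]{|a_n|} = \frac{|x|}{1} = |x|$$
Therefore:
Abs. conv. if |x| < 1;
Divergent if |x| > 1.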
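The conclusion of the Root test for the ln(1 + x) series can be illustrated by computing partial sums inside and outside |x| < 1. This is only a numerical sketch (the function name and the sample values x = 0.5 and x = 2 are choices made here); it illustrates, but does not prove, the result.

```python
import math

def ln1p_series(x, n_terms):
    """Sum of the first n_terms of x - x^2/2 + x^3/3 - x^4/4 + ..."""
    return sum((-1) ** (n + 1) * x ** n / n for n in range(1, n_terms + 1))

# |x| < 1: the partial sums settle on ln(1 + x)
print(ln1p_series(0.5, 40), math.log(1.5))

# |x| > 1: the partial sums oscillate with growing size
print([ln1p_series(2.0, n) for n in (5, 10, 15)])
```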

Alternative Methods of obtaining Maclaurin series

Where a Maclaurin series can be found to represent a function, it is unique. It can therefore be found by any valid means that is convenient, e.g. by making use of previously known Maclaurin series for factors involved in the function. The most convenient way may not be by using the basic definition of the Maclaurin series, which inevitably involves a considerable amount of differentiation.

EXAMPLE

Find the Maclaurin series for f(x), where
$$f(x) = \frac{\sin x}{1-x}.$$
To differentiate f(x) several times will become very complicated. Therefore alternatively, we write
$$f(x) = \sin x\cdot(1-x)^{-1}$$
since we already know the Maclaurin series for sin x and $(1-x)^{-1}$ (or can find the second of these using the Binomial theorem). All we need to do then is to multiply the two series together, obtaining as many terms as are required in ascending powers of x.
Thus:
$$f(x) = \left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \ldots\right)\left(1 + x + x^2 + x^3 + x^4 + \ldots\right)$$
The series is valid for:
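Multiplying two series term by term is mechanical, so it can be sketched in code. The example below multiplies truncated coefficient lists for sin x and (1 − x)⁻¹ using exact fractions; the helper `multiply_series` and the truncation at x⁵ are assumptions made for illustration.

```python
from fractions import Fraction
from math import factorial

def multiply_series(a, b):
    """Multiply two power series given as coefficient lists (ascending
    powers of x), keeping only as many terms as the shorter list."""
    n = min(len(a), len(b))
    return [sum(a[k] * b[m - k] for k in range(m + 1)) for m in range(n)]

N = 6  # keep powers x^0 .. x^5

# sin x = x - x^3/3! + x^5/5! - ...
sin_coeffs = [Fraction(0)] * N
for k in range(1, N, 2):
    sin_coeffs[k] = Fraction((-1) ** (k // 2), factorial(k))

# (1 - x)^(-1) = 1 + x + x^2 + ...
geom_coeffs = [Fraction(1)] * N

product = multiply_series(sin_coeffs, geom_coeffs)
print(product)
```

Reading off the list gives x + x² + (5/6)x³ + (5/6)x⁴ + (101/120)x⁵ as the first terms of the product.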


In other cases the function, say y(x), whose Maclaurin series is required may not itself be known, but it is known to satisfy a certain differential equation (D.E.). Provided enough initial conditions are given, we can differentiate the D.E. repeatedly to obtain enough derivatives of y, finding their values at x = 0, to use in the Maclaurin series formula.

EXAMPLE

Find the Maclaurin series representing the function y(x), where
$$\frac{dy}{dx} = xy^2 \quad\text{and } y = 1 \text{ when } x = 0$$

76

6. The Use of Taylor Series in deriving two Mathematical Procedures

(a) L'Hôpital's Rule
This rule is used to solve the following problem:
Find $\lim_{x\to a}\{f(x)\}$ where $f(x) = \dfrac{g(x)}{h(x)}$ and $g(a) = h(a) = 0$.

In this case, $f(a) = \frac{0}{0}$, and is therefore not defined.
However, when the limit does exist, it can be found in the following way:
We expand both g(x) and h(x) as Taylor series about x = a. Then:
$$\begin{aligned}
f(x) &= \frac{g(a) + (x-a)g'(a) + \frac{(x-a)^2}{2!}g''(a) + \ldots}{h(a) + (x-a)h'(a) + \frac{(x-a)^2}{2!}h''(a) + \ldots} \\
 &= \frac{(x-a)g'(a) + \frac{(x-a)^2}{2!}g''(a) + \ldots}{(x-a)h'(a) + \frac{(x-a)^2}{2!}h''(a) + \ldots} \quad\text{since } g(a) = h(a) = 0 \\
 &= \frac{g'(a) + \frac{(x-a)}{2!}g''(a) + \ldots}{h'(a) + \frac{(x-a)}{2!}h''(a) + \ldots}
\end{aligned}$$
Thus
$$\lim_{x\to a}\{f(x)\} = \frac{g'(a) + 0 + 0 + 0 + \ldots}{h'(a) + 0 + 0 + 0 + \ldots} = \frac{g'(a)}{h'(a)}$$
This is L'Hôpital's Rule.

If ALSO $g'(a) = h'(a) = 0$, then
$$\lim_{x\to a}\{f(x)\} = \frac{g''(a)}{h''(a)}, \text{ etc.}$$

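EXAMPLES

1. Find $\lim_{x\to 0}\left\{\dfrac{\sin x}{x}\right\}$

$g(x) = \sin x \Rightarrow g(0) = 0$
$h(x) = x \Rightarrow h(0) = 0$
$$\Rightarrow\; \frac{g(x)}{h(x)} \to \frac{0}{0} \text{ as } x\to 0$$
So L'Hôpital's Rule gives:
$$\lim_{x\to 0}\left\{\frac{\sin x}{x}\right\} = \lim_{x\to 0}\left\{\frac{\cos x}{1}\right\} = 1$$

2. Find $\lim_{x\to 1}\left\{\dfrac{1 - \sin\frac{\pi x}{2}}{(1-x)^2}\right\}$

Again, this $\to \frac{0}{0}$, and we apply L'Hôpital's Rule:
Limit = $\lim_{x\to 1}$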

(b) Identification of types of Stationary Points

For a function f:
The point x = a is a local maximum point of f if, for every x in a neighbourhood of a,
$$f(x) < f(a) \;\Rightarrow\; f(x) - f(a) < 0$$
The point x = b is a local minimum point of f if, for every x in a neighbourhood of b,
$$f(x) > f(b) \;\Rightarrow\; f(x) - f(b) > 0$$

[Figure: graph of y against x showing a local maximum (loc. max) and a local minimum (loc. min).]

N.B. f(a) and f(b) may not be the largest and smallest values which f(x) can take.
A stationary point of f is a point for which $f'(x) = 0$.
It is evident, then, that if f is a smooth (differentiable) continuous function, local max. and min. points are also stationary points, so that in this case $f'(a) = 0$, $f'(b) = 0$.
A Point of Inflexion, say c, is a point at which $f''$ changes sign. If f is smooth and continuous, this implies that $f''(c) = 0$.
However, if $f''(c) = 0$, this does not necessarily imply a point of inflexion. We use a Taylor expansion of f(x) to explain.
Let x be a point close to a so that if we write x = a + h, h is small. Then:
$$f(x) = f(a+h) = f(a) + hf'(a) + \frac{h^2}{2!}f''(a) + \frac{h^3}{3!}f'''(a) + \ldots$$
$$\begin{aligned}
\text{i.e. } f(x) - f(a) &= hf'(a) + \frac{h^2}{2!}f''(a) + \frac{h^3}{3!}f'''(a) + \ldots \\
 &= \frac{h^2}{2!}f''(a) + \frac{h^3}{3!}f'''(a) + \ldots \quad\text{since } f'(a) = 0
\end{aligned}$$
For small values of h, the R.H.S. will be dominated by the first non-zero term, and will therefore take the sign of this term. The L.H.S. will therefore also have this sign.
So: if $f''(a) < 0 \Rightarrow$ L.H.S. < 0 ($h^2 > 0 \;\forall h$) $\Rightarrow$ maximum point;
if $f''(a) > 0 \Rightarrow$ L.H.S. > 0 ($h^2 > 0 \;\forall h$) $\Rightarrow$ minimum point.

But if $f'(a) = 0$ and $f''(a) = 0$, then the term $\frac{h^3}{3!}f'''(a)$, now the dominant term, is positive on one side of a and negative on the other, since h and therefore $h^3$ are positive or negative when x lies respectively to the right or to the left of a. We therefore have neither a local max. nor min., but a point of inflexion.

Now, however, suppose that $f'''(a)$ is also zero. The new dominant term of the R.H.S., $\frac{h^4}{4!}f^{(4)}(a)$, contains an even power of h as a factor, so the sign of the dominant term will take the sign of $f^{(4)}(a)$ whether h is positive or negative, i.e. to the right or left of a. The situation at the point a is the same as when $f''(a) \neq 0$, i.e.
a maximum if $f^{(4)}(a) < 0$; a minimum if $f^{(4)}(a) > 0$

Generalising, the nature of the stationary point at x = a is dependent on the first non-zero derivative of f at x = a:
if this is an even derivative, we have a maximum or minimum point;
if it is an odd derivative, we have a point of inflexion.

EXAMPLE

Find and identify the types of stationary points of f where
$$f(x) = x^5 + 5x^4 + 2$$
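The "first non-zero derivative" rule can be sketched for polynomials, where derivatives are easy to compute from the coefficients. The code below applies it to f(x) = x⁵ + 5x⁴ + 2, whose stationary points (from f'(x) = 5x³(x + 4) = 0) are x = −4 and x = 0. The helper names and the coefficient-list representation are assumptions made for this illustration.

```python
def derivative(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...] (ascending powers)."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's scheme."""
    total = 0
    for c in reversed(coeffs):
        total = total * x + c
    return total

def classify(coeffs, a):
    """Classify x = a using the first non-zero derivative of order >= 2."""
    d = derivative(coeffs)            # f'
    if evaluate(d, a) != 0:
        return "not stationary"
    order = 1
    while d:
        d = derivative(d)             # next higher derivative
        order += 1
        value = evaluate(d, a)
        if value != 0:
            if order % 2 == 1:        # odd derivative: point of inflexion
                return "point of inflexion"
            return "maximum" if value < 0 else "minimum"
    return "undetermined"

f = [2, 0, 0, 0, 5, 1]                # 2 + 5x^4 + x^5
print(classify(f, -4), classify(f, 0))
```

Integer arithmetic keeps the zero tests exact here; with floating-point coefficients a tolerance would be needed instead of `!= 0`.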

79

7. Sketching curves of functions

It is often useful to be able to sketch the general shape of the graph of a function, y = f(x), over the whole domain (the set of values of x for which the function is defined) and range (the set of resulting y values).
If we restrict the graph by merely calculating and plotting a few points, we may not find the overall picture or trends of the graph.
The following suggestions will help to obtain the general shape of the graph. It may not be necessary to check all these points, especially if we have some idea of the shape to start with, as may be the case if f is some polynomial or trig function.
(a) Try to find the points in which the curve meets the axes; i.e. find y when x = 0; find x when y = 0.
(b) Investigate the behaviour of f(x) as $x\to\pm\infty$, and the behaviour of x as $y\to\pm\infty$. In doing this we are looking for asymptotes, where they exist.
Vertical asymptotes occur where a finite value of x gives an infinite value of y. This usually occurs when f(x) is a rational function of x, at points where the denominator becomes zero.
Horizontal asymptotes occur where infinite values of x give finite constant values of y. We let $x\to\pm\infty$ in the expression for f(x), if necessary dividing top and bottom of a rational function by the highest power of x occurring in the denominator.
Oblique asymptotes are straight lines not parallel to either axis, to which the graph becomes asymptotic.
In general, suppose f(x) is the rational function N(x)/D(x), N(x) and D(x) being polynomials of degree n, d respectively.
If n < d, y = f(x) has the horizontal asymptote y = 0.
If n = d, y = f(x) has a horizontal asymptote y = C, C ≠ 0.
If n − d = 1, y = f(x) has an oblique line as an asymptote.
If n − d ≥ 2, y = f(x) is asymptotic to a polynomial curve.
In all these cases, the equation of the asymptote is found by dividing out the rational function and letting $x\to\pm\infty$.
(c) Investigate the gradient; in particular, find any stationary points. It can also be useful to find the ranges of values of x for which the gradient is respectively positive or negative.
(d) Consider the sign of y for various ranges of values of x.
(e) Look for any symmetry; perhaps f(x) is odd or even.
(f) If f(x) = g(x) sin x or f(x) = g(x) cos x, for some function g(x), the graph of y = f(x) will oscillate between the curves y = g(x) and y = −g(x).
80

EXAMPLE 1.

$$y = \frac{3-2x}{x+3}$$

EXAMPLE 2.

y = ex sin x

81

EXAMPLE 3.

$$y = \frac{x^2-5}{x-3}$$
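For Example 3 the degrees satisfy n − d = 1, so an oblique asymptote is expected, and it is found by "dividing out". A minimal sketch of polynomial long division follows; `poly_divmod` is a helper written for this illustration, with coefficients listed in descending powers.

```python
from fractions import Fraction

def poly_divmod(num, den):
    """Divide polynomials given as coefficient lists in DESCENDING powers.
    Returns (quotient, remainder) as lists of Fractions."""
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    quotient = []
    while len(num) >= len(den):
        factor = num[0] / den[0]
        quotient.append(factor)
        # Subtract factor * den, aligned with the leading term, then drop it
        for i, d in enumerate(den):
            num[i] -= factor * d
        num.pop(0)
    return quotient, num

# y = (x^2 - 5) / (x - 3)
q, r = poly_divmod([1, 0, -5], [1, -3])
print(q, r)
```

The quotient x + 3 and remainder 4 give y = x + 3 + 4/(x − 3), so the oblique asymptote is y = x + 3 and there is a vertical asymptote at x = 3.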

82

APPENDIX A: The Formal Definition of a Limit

The function f is said to approach the limit L as x tends to a if, given any positive number ε, there exists a corresponding positive number δ such that
$$|f(x) - L| < \varepsilon$$
for every value of x satisfying the inequality
$$0 < |x - a| < \delta$$
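APPENDIX B: Some Useful Limits

1. $\lim_{n\to\infty}\{\sqrt[n]{a}\} = 1$, where a is a positive constant.

2. $\lim_{n\to\infty}\{\sqrt[n]{n}\} = 1$.

3. $\lim_{n\to\infty}\left\{\dfrac{n^k}{x^n}\right\} = 0$, where k > 0 and |x| > 1.
An alternative form of this limit is: $\lim_{n\to\infty}\{n^k x^n\} = 0$, where k > 0 and |x| < 1.

4. $\lim_{n\to\infty}\left\{\left(1+\dfrac{1}{n}\right)^n\right\} = e$; $\quad\lim_{n\to\infty}\left\{\left(1-\dfrac{1}{n}\right)^n\right\} = e^{-1}$.

APPENDIX C: Properties of Convergent Series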

i. If each term of a series is multiplied by a non-zero constant so that a new series is formed, the new series is convergent or divergent as the original.
i.e. If $\sum_{n=1}^{\infty} a_n$ is convergent, with sum S, then $\sum_{n=1}^{\infty} ka_n$ is also convergent, with sum kS.

ii. The convergence or divergence of a series is not affected by adding, subtracting or changing a finite number of (finite) terms.
e.g. If $\sum_{n=1}^{\infty} a_n$ is convergent, then so is $\sum_{n=5}^{\infty} a_n$.

iii. Two convergent series may be added (or subtracted) term by term, the result being another convergent series.
i.e. If $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are both convergent, then so are $\sum_{n=1}^{\infty}(a_n + b_n)$ and $\sum_{n=1}^{\infty}(a_n - b_n)$.

83

APPENDIX D: Checking Convergence

In some cases we can go back to first principles and check the convergence of a given series by using the definition of convergence. In these cases we find an expression for the Nth partial sum of the series, $S_N$, in terms of N, and investigate the convergence of the sequence $\{S_N, S_{N+1}, S_{N+2}, \ldots\}$.

Usually, however, this is not possible. We then have to use a convergence test. There are several of these, the most important of which will now be outlined.
Some tests for convergence of series
Test 1: The Comparison Test
(a) If $|a_n| \le |b_n|$ for all n > some chosen N and if $\sum_{n=1}^{\infty}|b_n|$ is convergent, then $\sum_{n=1}^{\infty}|a_n|$ is also convergent;
(b) If $|a_n| \ge |b_n|$ for all n > some chosen N and if $\sum_{n=1}^{\infty}|b_n|$ is divergent, then $\sum_{n=1}^{\infty}|a_n|$ is also divergent.

For comparison purposes, we need to know about the convergence of some known series. Sometimes a geometric series can be used.
Otherwise, a particularly useful class of series are the p-series:
$$1 + \frac{1}{2^p} + \frac{1}{3^p} + \frac{1}{4^p} + \frac{1}{5^p} + \frac{1}{6^p} + \ldots + \frac{1}{n^p} + \ldots = \sum_{n=1}^{\infty}\frac{1}{n^p}$$
Different series are produced by taking different values of p.

Important Result
It can be shown that:
A p-series is convergent if p > 1;
A p-series is divergent if p ≤ 1.
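The contrast between p > 1 and p ≤ 1 can be seen by tabulating partial sums for p = 2 and p = 1. This is an illustration only, with an assumed helper name; slow divergence (as for p = 1) can never be established from a finite table.

```python
def p_series_partial_sum(p, n_terms):
    """S_N = 1 + 1/2^p + 1/3^p + ... + 1/N^p."""
    return sum(1.0 / n ** p for n in range(1, n_terms + 1))

for n_terms in (10, 1000, 100000):
    print(n_terms,
          p_series_partial_sum(2, n_terms),   # settles below 1.645
          p_series_partial_sum(1, n_terms))   # keeps growing, roughly like ln N
```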

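EXAMPLE

Investigate the convergence of the series
$$S = \frac{1}{1\cdot 2} + \frac{1}{2\cdot 3} + \frac{1}{3\cdot 4} + \frac{1}{4\cdot 5} + \ldots + \frac{1}{n(n+1)} + \ldots$$
The nth term
$$a_n = \frac{1}{n(n+1)} = \frac{1}{n} - \frac{1}{n+1} \quad\text{by partial fractions}$$
Therefore $S = \left(1 - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \left(\frac{1}{3} - \frac{1}{4}\right) + \ldots$
Therefore $S_N = 1 - \dfrac{1}{N+1}$, all the intermediate terms cancelling in pairs, and
$$\lim_{N\to\infty}\{S_N\} = \lim_{N\to\infty}\left\{1 - \frac{1}{N+1}\right\} = 1$$
Therefore since the Nth partial sum, $S_N$, has a limit 1, the series is convergent with sum 1.

EXAMPLE of Comparison Test (a).

Consider the series
$$\frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \frac{1}{9} + \frac{1}{17} + \ldots = \sum_{n=1}^{\infty} a_n$$
Here $a_n = \dfrac{1}{2^{n-1}+1}$. If we choose $b_n = \dfrac{1}{2^{n-1}}$, then $|a_n| < |b_n|$ for all n.
But $\sum_{n=1}^{\infty} b_n = 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \ldots$, the geometric series with a = 1, r = $\frac{1}{2}$.
It is therefore convergent (sum = 2).
So by comparison, $\frac{1}{2} + \frac{1}{3} + \frac{1}{5} + \frac{1}{9} + \frac{1}{17} + \ldots$ is also convergent.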
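For the series with $a_n = 1/(n(n+1))$ examined above, the partial sums can be computed exactly with rational arithmetic and compared with N/(N + 1) = 1 − 1/(N + 1). A short sketch, with a function name chosen here for illustration:

```python
from fractions import Fraction

def partial_sum(n_terms):
    """Exact S_N for the series with a_n = 1 / (n (n + 1))."""
    return sum(Fraction(1, n * (n + 1)) for n in range(1, n_terms + 1))

for n_terms in (1, 2, 3, 10, 100):
    print(n_terms, partial_sum(n_terms))
```

The printed fractions are 1/2, 2/3, 3/4, 10/11 and 100/101, approaching 1 as N increases.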

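EXAMPLE of Comparison Test (b).

Consider the series
$$\sum_{n=1}^{\infty} a_n = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} + \frac{1}{9} + \ldots + \frac{1}{16} + \frac{1}{17} + \ldots + \frac{1}{32} + \ldots$$
If
$$\sum_{n=1}^{\infty} b_n = 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{16} + \ldots + \frac{1}{16} + \frac{1}{32} + \ldots + \frac{1}{32} + \ldots$$
then $|a_n| \ge |b_n|$ for all n.
But
$$\sum_{n=1}^{\infty} b_n = 1 + \frac{1}{2} + \left(\frac{1}{4} + \frac{1}{4}\right) + \left(\frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8}\right) + \ldots = 1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \ldots$$
which has an infinite sum and is therefore divergent.
Therefore by comparison, the series $1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \ldots$ is divergent.
This series is called the Harmonic Series, and is the p-series with p = 1.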
85

Test 2: D'Alembert's Ratio Test

Given the series $\sum_{n=1}^{\infty} a_n$, we consider:
$$\lim_{n\to\infty}\left|\frac{a_{n+1}}{a_n}\right| = K$$
If K < 1, the series converges;
If K > 1, the series diverges;
If K = 1, the test is inconclusive.

Test 3: Cauchy's Root Test

Given the series $\sum_{n=1}^{\infty} a_n$, we consider:
$$\lim_{n\to\infty}\{\sqrt[n]{|a_n|}\} = L$$
If L < 1, the series converges;
If L > 1, the series diverges;
If L = 1, the test is inconclusive.

And then, for series whose terms alternate in sign:

Test 4: Leibnitz' Test for Alternating Series

Given the series $\sum_{n=1}^{\infty} a_n$, if:
(i) the terms alternate in sign;
(ii) $|a_n| \ge |a_{n+1}|$ for all n > some N;
(iii) $\lim_{n\to\infty}|a_n| = 0$
then the series is convergent.

Other tests include Raabe's Test and the Integral Test.

86

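EXAMPLE

The use of the Ratio Test

Given the series
$$\sum_{n=0}^{\infty} a_n = \sum_{n=0}^{\infty}\frac{x^n}{n!} \qquad\text{[N.B. 0! = 1 by definition]}$$
Then
$$a_n = \frac{x^n}{n!}, \quad\text{so that}\quad a_{n+1} = \frac{x^{n+1}}{(n+1)!}$$
So:
$$\left|\frac{a_{n+1}}{a_n}\right| = \left|\frac{x^{n+1}}{(n+1)!}\cdot\frac{n!}{x^n}\right| = \left|\frac{x^{n+1}\cdot n(n-1)(n-2)\ldots 3\cdot 2\cdot 1}{(n+1)n(n-1)(n-2)\ldots 3\cdot 2\cdot 1\cdot x^n}\right| = \frac{|x|}{n+1}$$
Thus
$$\lim_{n\to\infty}\left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty}\left\{\frac{|x|}{n+1}\right\} = 0 \quad\text{for all } x$$
So K = 0, and 0 < 1 $\Rightarrow$ Series is convergent for all x.

N.B. This series is the expansion of the function $e^x$. We have therefore proved that the expansion of $e^x$ is valid for all x.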
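Convergence of the exponential series for all x can be illustrated by summing it, building each term from the previous one through the ratio x/(n + 1) found above. This is a sketch; the choice of 60 terms and of the sample x values is arbitrary.

```python
import math

def exp_series(x, n_terms):
    """Sum of the first n_terms of sum_{n>=0} x^n / n!."""
    total, term = 0.0, 1.0          # term starts at x^0 / 0! = 1
    for n in range(n_terms):
        total += term
        term *= x / (n + 1)         # a_{n+1} = a_n * x / (n + 1)
    return total

for x in (-3.0, 1.0, 10.0):
    print(x, exp_series(x, 60), math.exp(x))
```

For larger |x| more terms are needed before the ratio |x|/(n + 1) drops below 1 and the terms start to shrink.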

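EXAMPLE

The use of the Root Test

Given the series
$$\sum_{n=1}^{\infty} a_n = \frac{1}{3} + \frac{2}{9} + \frac{3}{27} + \frac{4}{81} + \ldots = \sum_{n=1}^{\infty}\frac{n}{3^n}$$
Then $|a_n| = \dfrac{n}{3^n}$
So:
$$\lim_{n\to\infty}\{\sqrt[n]{|a_n|}\} = \lim_{n\to\infty}\left\{\frac{\sqrt[n]{n}}{3}\right\} = \frac{1}{3}$$
using $\lim_{n\to\infty}\{\sqrt[n]{n}\} = 1$ (Appendix B).
So L = $\frac{1}{3}$; since L = $\frac{1}{3}$ < 1, by the Root test, the series is convergent.

EXAMPLE

The use of Leibnitz' Test for Alternating Series

Given the series
$$\sum_{n=1}^{\infty} a_n = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \ldots = \sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n}$$
(i) The terms alternate in sign;
(ii) $|a_n| = \dfrac{1}{n}$, $|a_{n+1}| = \dfrac{1}{n+1}$ $\Rightarrow$ $|a_n| > |a_{n+1}|$ for all n;
(iii) $\lim_{n\to\infty}\left\{\dfrac{1}{n}\right\} = 0$.

So by Leibnitz' Test, the series is convergent.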

Which test to use?

This is not always obvious! However, there are certain types of series for which a particular test is usually the one to try.
First, we assume that we have made the preliminary check, that
$$a_n \to 0 \text{ as } n\to\infty$$
If this is not the case, then there is no possibility that the series will converge.
Then:
i. If the terms alternate in sign, we try Leibnitz' Test. This assumes that we are not investigating the series for absolute convergence, in which case we need to consider the absolute (positive) value of the terms and therefore need one of the other tests.
ii. If the general term $a_n$ of the series is given as a rational function of n, we try the Comparison Test, comparing the series with a p-series with an appropriate choice of p.
iii. If the series is a power series, we try the Ratio Test or the Root Test. Usually we can choose which of these to use; the exception is when the term $a_n$ includes n! (or a similar factorial) as a factor. Since the limiting value of $\sqrt[n]{n!}$ is not known, we cannot use the Root Test in these cases.
N.B. If one of these two tests gives an inconclusive result (i.e. K = 1 from the Ratio test, or L = 1 from the Root test), so will the other!
If the power series concerned is given in powers of x where x is variable, we may find that there is some positive number R such that:
if |x| < R, the series converges;
if |x| > R, it diverges.
R is called the Radius of Convergence.
If x = ±R we usually have a problem, since the Ratio test will give an inconclusive result when either of these two values of x is used to create the series. We need to use other considerations to discover whether or not the series is convergent in these cases. We substitute x = ±R into the series and then start again to choose an appropriate test.
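The radius of convergence can be estimated numerically from the coefficients of a power series. The sketch below takes, as an assumed illustrative series, $\sum 2^n x^n/(n+1)$, and tabulates $|c_n/c_{n+1}|$, which tends to R whenever the Ratio Test limit exists; the function names are hypothetical.

```python
from fractions import Fraction

def coefficient(n):
    """Coefficient c_n of x^n in the illustrative series sum 2^n x^n / (n + 1)."""
    return Fraction(2 ** n, n + 1)

def radius_estimate(n):
    """|c_n / c_{n+1}|, which tends to R when the Ratio Test limit exists."""
    return coefficient(n) / coefficient(n + 1)

for n in (1, 10, 100, 1000):
    print(n, float(radius_estimate(n)))
```

The estimates approach 0.5, suggesting R = 1/2 for this series. The behaviour at x = ±1/2 is not decided by this calculation and needs a separate test.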

88

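EXAMPLES
Preliminary Test
Given the series
$$\sum_{n=1}^{\infty} a_n = 1 + \frac{2}{3} + \frac{3}{5} + \frac{4}{7} + \frac{5}{9} + \ldots = \sum_{n=1}^{\infty}\frac{n}{2n-1}$$
We have already shown that
$$\lim_{n\to\infty}\left\{\frac{n}{2n-1}\right\} = \lim_{n\to\infty}\left\{\frac{1}{2-\frac{1}{n}}\right\} = \frac{1}{2}$$
Therefore, since $a_n$ does not $\to 0$ as $n\to\infty$, the given series is divergent.

i. See previous page for an example showing the use of Leibnitz' test.
ii. When $a_n$ is a rational function of n:
A. Strictly:
$$\sum_{n=1}^{\infty} a_n = \frac{1}{2} + \frac{1}{5} + \frac{1}{10} + \frac{1}{17} + \ldots = \sum_{n=1}^{\infty}\frac{1}{n^2+1}$$
So $a_n = \dfrac{1}{n^2+1} < \dfrac{1}{n^2}$ for all n, and by comparison with the p-series with p = 2 (convergent), the given series is convergent.
B. Less strictly:
$$\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty}\frac{n^2+2n}{3n^3+4n+1}$$
For large values of n, $a_n \approx \dfrac{n^2}{3n^3} = \dfrac{1}{3n}$.
The series therefore behaves as
$$\sum_{n=1}^{\infty}\frac{1}{3n} = \frac{1}{3}\sum_{n=1}^{\infty}\frac{1}{n}$$
which is divergent since it is $\frac{1}{3}\times$ (harmonic series, p-series with p = 1).
Therefore the given series is also divergent.
iii. When the series is a Power Series
$$\sum_{n=0}^{\infty} a_n = 1 + \frac{2x}{2} + \frac{4x^2}{3} + \frac{8x^3}{4} + \ldots = \sum_{n=0}^{\infty}\frac{2^n x^n}{n+1}$$
So $|a_n| = \dfrac{2^n|x|^n}{n+1}$
Then $|a_{n+1}| = \dfrac{2^{n+1}|x|^{n+1}}{n+2}$
$$\left|\frac{a_{n+1}}{a_n}\right| = \frac{2|x|(n+1)}{n+2} \to 2|x| \text{ as } n\to\infty$$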

89

APPENDIX E: Demonstration of the Derivation of the Taylor Series

Suppose we are given a function f which is defined at x = a and is differentiable an infinite number of times at x = a.
We should like to be able to express f(x) as a power series in ascending powers of (x − a), i.e. to write
$$f(x) = c_0 + c_1(x-a) + c_2(x-a)^2 + c_3(x-a)^3 + c_4(x-a)^4 + \ldots \qquad (1)$$
We need to find the coefficients $c_n$.
Substituting x = a in equation (1), we obtain:
$$f(a) = c_0 + 0 + 0 + 0 + \ldots \;\Rightarrow\; c_0 = f(a)$$
Differentiating equation (1) gives:
$$f'(x) = 0 + c_1 + c_2\cdot 2(x-a) + c_3\cdot 3(x-a)^2 + c_4\cdot 4(x-a)^3 + \ldots \qquad (2)$$
Substituting x = a in equation (2), we obtain:
$$f'(a) = c_1 + 0 + 0 + 0 + \ldots \;\Rightarrow\; c_1 = f'(a)$$
Differentiating equation (2) gives:
$$f''(x) = 0 + 2c_2 + c_3\cdot 3\cdot 2(x-a) + c_4\cdot 4\cdot 3(x-a)^2 + \ldots \qquad (3)$$
Substituting x = a in equation (3), we obtain:
$$f''(a) = 2c_2 + 0 + 0 + 0 + \ldots \;\Rightarrow\; c_2 = \frac{f''(a)}{2!}$$
Continuing the process of successive differentiation and substitution gives:
$$c_3 = \frac{1}{3!}f'''(a), \quad c_4 = \frac{1}{4!}f^{(4)}(a), \quad c_5 = \frac{1}{5!}f^{(5)}(a), \quad\text{etc.}$$
Substituting the $c_n$ values into equation (1) gives the Taylor series:
$$f(x) = f(a) + (x-a)f'(a) + \frac{(x-a)^2}{2!}f''(a) + \frac{(x-a)^3}{3!}f'''(a) + \ldots$$

Chapter 5.
COMPLEX NUMBERS

91

1. The Imaginary Number

The imaginary number was first introduced because certain algebraic equations, such as $x^2 + 1 = 0$, were found to have no solution in terms of real numbers. This seemed to be an unwelcome exception to the theory that polynomial equations of degree 2 have two roots: the very similar equation $x^2 - 1 = 0$, for instance, has the two roots $x = \pm 1$. So the new number,
$$i = \sqrt{-1} \quad (\text{or } j = \sqrt{-1})$$
was introduced to remove the exceptions to the accepted rule. The solution of the equation $x^2 + 1 = 0$ can then be given as
$$x = \pm i$$
Any multiple of i is an imaginary number; the solutions, therefore, of similar equations such as $x^2 + 9 = 0$ are also imaginary:
$$x^2 = -9 = 9\times -1 \;\Rightarrow\; x = \pm 3i$$
Powers of i:
$$i^2 = -1, \quad i^3 = -i, \quad i^4 = 1, \quad \frac{1}{i} = -i$$
The higher powers of i must each take one of the values ±1 or ±i.
e.g. $i^{15} = i^{12}\times i^3 = (i^4)^3\times i^3 = 1\times(-i) = -i$


2. Complex numbers
These arise, after defining $i$, when we seek solutions of general quadratic equations.
e.g. If $x^2 + 6x + 12 = 0$, then
$$x = \frac{-6 \pm \sqrt{36 - 4\cdot 1\cdot 12}}{2} = \frac{-6 \pm \sqrt{-12}}{2} = \frac{-6 \pm \sqrt{12}\sqrt{-1}}{2} = \frac{-6 \pm 2\sqrt{3}\sqrt{-1}}{2} = -3 \pm \sqrt{3}\,i$$
Thus we see that in order to give a solution for this equation, we need a combination of a real number and an imaginary number. We define this as a complex number: in general this takes the form
$$x + iy$$
where $x$ and $y$ are real.
It can be shown that every solution of a polynomial equation of degree $n$, where $n$ is any positive integer, can be expressed as a complex number (real and imaginary numbers being special cases of complex numbers in which $y$, $x$ respectively are zero); there is no need to define any other new number.

3. Real and Imaginary Parts


If the complex number $z$ is defined as $z = x + iy$, then:
$x = \Re(z)$ is called the Real Part of $z$;
$y = \Im(z)$ is called the Imaginary Part of $z$.

Complex Conjugate
If $z = x + iy$, then $\bar{z} = x - iy$ is called the Complex Conjugate of $z$.

Independence of x and y
$x$ and $y$ are independent.
Therefore $z = 0$ if and only if $x = 0$ and, independently, $y = 0$.
Consequences of this independence
(a) Equality of Complex Numbers
Suppose $z_1 = x_1 + iy_1$, $z_2 = x_2 + iy_2$ are two complex numbers such that $z_1 = z_2$. Then $z_1 - z_2 = 0$,
i.e. $(x_1 - x_2) + i(y_1 - y_2) = 0$
This can only be true if
$$x_1 - x_2 = 0 \quad\text{and}\quad y_1 - y_2 = 0 \implies x_1 = x_2 \quad\text{and}\quad y_1 = y_2$$
(b) Graphical Representation of Complex Numbers
It is not possible, as with real numbers, to represent a complex number z by a
point on a single line, i.e. only involving one dimension, since some arithmetic
operation combining the real and imaginary parts of z would be implied.
We therefore need a two-dimensional space (plane) to represent z, to allow for the
independence of x and y. This is called the Complex Plane and the diagram
used is called the Argand Diagram, as shown below:
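[Figure: Argand diagram. Horizontal axis: real ($x$); vertical axis: imag. ($y$). The point $P$, at distance $r$ from the origin $O$ and at angle $\theta$ to the positive real axis, represents $z = x + iy$; its reflection $\bar{P}$ in the real axis represents $\bar{z} = x - iy$.]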

4. Modulus and Argument


The diagram shows that the position of $P$ can also be uniquely specified by its polar coordinates, $r$ and $\theta$. $r$ is the length of $OP$ and $\theta$ is the angle between $OP$ and the positive direction of the real ($x$-) axis, a positive angle being defined, as usual, in the anticlockwise sense.
$r = |z|$ is called the Modulus of $z$;
$\theta = \arg(z)$ is called the Argument of $z$.
N.B.1 $\arg(z)$ is not uniquely defined for any complex number since any integer multiple of $2\pi$ added to $\theta$ will give the same position for $z$ on the diagram. We therefore define the Principal Value of $\arg(z)$ to be the value of $\theta$ which satisfies
$$-\pi < \theta \le \pi$$
N.B.2 The modulus and argument of $\bar{z}$ satisfy:
$|\bar{z}| = |z| = r$;
$\arg \bar{z} = -\arg(z) = -\theta$.
5. Relationships between x, y, r, θ
From the diagram:
$$x = r\cos\theta \quad\text{and}\quad y = r\sin\theta$$
$$r = \sqrt{x^2 + y^2} \quad\text{and}\quad \theta = \tan^{-1}\left(\frac{y}{x}\right)$$
Therefore
$$z = x + iy = r(\cos\theta + i\sin\theta)$$
$$\bar{z} = x - iy = r(\cos\theta - i\sin\theta) = r(\cos(-\theta) + i\sin(-\theta))$$

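EXAMPLES

(i) $z_1 = 1 + \sqrt{3}\,i$ (Argand diagram: $z_1$ lies in the 1st quadrant.)
$|z_1| = \sqrt{1^2 + 3} = \sqrt{4} = 2$
$\arg(z_1) = \tan^{-1}\left(\frac{\sqrt{3}}{1}\right)$ in 1st quadrant $= \frac{\pi}{3}\ (+2k\pi)$

(ii) $z_2 = -1 - \sqrt{3}\,i$ (Argand diagram: $z_2$ lies in the 3rd quadrant.)
$|z_2| = \sqrt{(-1)^2 + (-\sqrt{3})^2} = \sqrt{4} = 2$
$\arg(z_2) = \tan^{-1}\left(\frac{-\sqrt{3}}{-1}\right)$ in 3rd quadrant $= -\frac{2\pi}{3}$ (or $\frac{4\pi}{3}$) $(+2k\pi)$

(iii) $z_3 = -4$ (Argand diagram: $z_3$ lies on the negative real axis.)
$|z_3| = 4$
$\arg(z_3) = \pi\ (+2k\pi)$

(iv) $z_4 = -3i$ (Argand diagram: $z_4$ lies on the negative imaginary axis.)
$|z_4| = 3$
$\arg(z_4) = -\frac{\pi}{2}$ (or $\frac{3\pi}{2}$) $(+2k\pi)$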
These examples illustrate the facts that
the modulus of a complex number is never negative;
although $\theta$ can be expressed as $\tan^{-1}\left(\frac{y}{x}\right)$, this formula, taken on its own, is ambiguous since there are two quadrants in which $\tan\theta$ has the same value, so $\theta$ could lie in either. [See examples (i) and (ii), where in both cases $\tan\theta = \sqrt{3}$.] However, if the position of the point on the Argand diagram is also taken into account, there is no confusion.
Sometimes it is easier to obtain the value of $\theta$ directly from the diagram rather than trying to use the $\tan^{-1}$ formula. [See examples (iii) and (iv)].
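The quadrant ambiguity can also be checked numerically. The following is a minimal sketch using Python's standard `cmath` module, whose `phase` function returns the principal value of the argument by taking the signs of both $x$ and $y$ into account; the helper name `modulus_and_argument` is an assumption made for illustration.

```python
import cmath
import math

def modulus_and_argument(z):
    """Return (|z|, principal arg z), with the argument in (-pi, pi]."""
    # cmath.phase works like atan2(y, x), so the quadrant of z is respected.
    return abs(z), cmath.phase(z)

z1 = complex(1, math.sqrt(3))     # example (i), 1st quadrant
z2 = complex(-1, -math.sqrt(3))   # example (ii), 3rd quadrant
for z in (z1, z2, complex(-4, 0), complex(0, -3)):
    r, theta = modulus_and_argument(z)
    print(z, r, theta / math.pi)  # argument shown as a multiple of pi

# The bare inverse tangent gives the same angle for z1 and z2:
print(math.atan(z1.imag / z1.real), math.atan(z2.imag / z2.real))
```

The last line shows why the inverse tangent alone is not enough: both quotients equal $\sqrt{3}$, so the quadrant has to be supplied separately.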
(v) Find the real and imaginary parts of $z_5$ if:
$|z_5| = 3$, $\arg(z_5) = \frac{3\pi}{4}$.
Use $z_5 = r(\cos\theta + i\sin\theta) =$

6. Exponential Form of a Complex Number


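Consider again the polar representation of $z$:
$$z = r(\cos\theta + i\sin\theta)$$
Now
$$\cos\theta = 1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \frac{\theta^6}{6!} + \dots$$
and
$$\sin\theta = \theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \frac{\theta^7}{7!} + \dots$$
Therefore
$$\cos\theta + i\sin\theta = 1 + i\theta - \frac{\theta^2}{2!} - i\frac{\theta^3}{3!} + \frac{\theta^4}{4!} + i\frac{\theta^5}{5!} - \dots$$
$$= 1 + (i\theta) + \frac{(i\theta)^2}{2!} + \frac{(i\theta)^3}{3!} + \frac{(i\theta)^4}{4!} + \frac{(i\theta)^5}{5!} + \dots \qquad (1)$$
since $i^2 = -1$, $i^3 = -i$, $i^4 = 1$, $i^5 = i$, ...
Now compare this with the exponential series
$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \dots$$
If we use $x = i\theta$ in this series, we obtain the series (1), which we have found to be equivalent to $\cos\theta + i\sin\theta$. Therefore we can write
$$\cos\theta + i\sin\theta = e^{i\theta}$$
Similarly,
$$\cos\theta - i\sin\theta = e^{-i\theta}$$
SUMMARY
We have three ways of expressing a complex number:
$$z = x + iy = r(\cos\theta + i\sin\theta) = re^{i\theta}$$
$$\bar{z} = x - iy = r(\cos\theta - i\sin\theta) = re^{-i\theta}$$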

7. Algebraic Properties and Operations


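Each property is given in terms of Real and Imaginary Parts, and in terms of Modulus and Argument.

Let $z_1 = x_1 + iy_1 = r_1(\cos\theta_1 + i\sin\theta_1)$
Let $z_2 = x_2 + iy_2 = r_2(\cos\theta_2 + i\sin\theta_2)$

Equality
$z_1 = z_2$ if and only if:
$x_1 = x_2$ and $y_1 = y_2$ (real and imaginary parts);
$r_1 = r_2$ and $\theta_1 = \theta_2 + 2k\pi$ (modulus and argument).

Addition and Subtraction
$$z_1 \pm z_2 = (x_1 \pm x_2) + i(y_1 \pm y_2)$$
$$= (r_1\cos\theta_1 \pm r_2\cos\theta_2) + i(r_1\sin\theta_1 \pm r_2\sin\theta_2)$$

[Figure: Argand diagram showing the points $P_1(z_1)$, $P_2(z_2)$, $Q(z_1 + z_2)$ and $R(z_1 - z_2)$; $OP_1QP_2$ is a parallelogram.]

Geometric Representation
On the Argand diagram, these operations can be regarded as vector addition and subtraction of the vectors $\vec{OP_1}$ and $\vec{OP_2}$, $z_1 + z_2$ giving $\vec{OQ}$ and $z_1 - z_2$ giving $\vec{OR}$.
N.B. Since $\vec{OR} = \vec{P_2P_1}$,
$|\vec{P_2P_1}| = |z_1 - z_2|$,
i.e. $|z_1 - z_2|$ = the distance $P_1P_2$,
and the direction of $\vec{P_2P_1}$ = $\arg(z_1 - z_2)$.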

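Multiplication
In terms of real and imaginary parts:
$$z_1z_2 = (x_1 + iy_1)(x_2 + iy_2) = x_1x_2 + ix_1y_2 + ix_2y_1 + i^2y_1y_2 = x_1x_2 - y_1y_2 + i(x_1y_2 + x_2y_1)$$
In terms of modulus and argument:
$$z_1z_2 = (r_1e^{i\theta_1})(r_2e^{i\theta_2}) = r_1r_2e^{i(\theta_1 + \theta_2)}$$
This demonstrates that in order to multiply two complex numbers given in terms of modulus and argument, we
multiply their moduli to obtain the modulus of the product;
add their arguments to obtain the argument of the product.
In particular, if $r_2 = 1$, then the effect of multiplying $z_1$ by $z_2$ is to rotate $z_1$ through the angle $\theta_2$ about the origin.

Multiplication by the Complex Conjugate
In terms of real and imaginary parts:
$$z\bar{z} = (x + iy)(x - iy) = x^2 + ixy - ixy - i^2y^2 = x^2 + y^2$$
In terms of modulus and argument:
$$z\bar{z} = (re^{i\theta})(re^{-i\theta}) = r^2e^0 = r^2$$
Thus $z\bar{z}$ is real and is equal to $|z|^2 = r^2$.

Division
In terms of real and imaginary parts:
$$\frac{z_1}{z_2} = \frac{x_1 + iy_1}{x_2 + iy_2}$$
Since we require a real denominator, and since $z\bar{z}$ is real, we multiply top and bottom of the fraction by $\bar{z}_2$:
$$\frac{z_1}{z_2} = \frac{(x_1 + iy_1)(x_2 - iy_2)}{(x_2 + iy_2)(x_2 - iy_2)} = \frac{x_1x_2 + y_1y_2}{x_2^2 + y_2^2} + i\,\frac{y_1x_2 - x_1y_2}{x_2^2 + y_2^2}$$
In terms of modulus and argument:
$$\frac{z_1}{z_2} = \frac{r_1e^{i\theta_1}}{r_2e^{i\theta_2}} = \frac{r_1}{r_2}e^{i(\theta_1 - \theta_2)}$$
This demonstrates that in order to divide $z_1$ by $z_2$ we
divide $r_1$ by $r_2$ to obtain the modulus of the quotient;
subtract $\theta_2$ from $\theta_1$ to obtain the argument of the quotient.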
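The modulus and argument rules for products and quotients can be checked numerically. The sketch below uses `cmath.rect` and `cmath.polar` from Python's standard library; the particular moduli and arguments are arbitrary values chosen for illustration.

```python
import cmath

z1 = cmath.rect(2.0, 0.5)   # modulus 2, argument 0.5 rad
z2 = cmath.rect(4.0, 1.2)   # modulus 4, argument 1.2 rad

prod_r, prod_theta = cmath.polar(z1 * z2)
quot_r, quot_theta = cmath.polar(z1 / z2)

print(prod_r, prod_theta)   # moduli multiplied, arguments added
print(quot_r, quot_theta)   # moduli divided, arguments subtracted
```

Note that `cmath.polar` always returns the principal value of the argument, so for other inputs the sum or difference of arguments may differ from the printed value by a multiple of $2\pi$.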


EXAMPLES
Find the real and imaginary parts of $z$, and hence $|z|$ and $\arg(z)$, if

(i) $z = \dfrac{2 + i}{4 - 3i}$

$z = \dfrac{2 + i}{4 - 3i} =$

$|z| =$ ; $\arg(z) = \tan^{-1}(\quad)$ in the ____ quadrant $=$

(ii) If $z = x + iy$ and if $w$ is a purely imaginary number, show that $z$ is real when
$$w = \frac{z + 1}{iz}$$
We need to find the real and imaginary parts of $w$, so we need to substitute $z = x + iy$:
$$w = \frac{x + iy + 1}{i(x + iy)} = \frac{((x + 1) + iy)(-y - ix)}{(-y + ix)(-y - ix)} = \frac{-xy - ix^2 - y - ix - iy^2 + xy}{y^2 + x^2} = \frac{-y - i(x^2 + x + y^2)}{y^2 + x^2}$$
Now since $w$ is purely imaginary, $\Re(w) = 0$:
$$\frac{-y}{y^2 + x^2} = 0 \implies y = 0$$
But if $y = 0$, then since $z = x + iy$, $z = x$, which is real.
Hence $z$ is real.
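(iii) Find the real and imaginary parts of $(-1 + i)^{15}$.

We can either expand $(-1 + i)^{15}$ by the Binomial theorem(!), or use the fact that raising a complex number to a power is very easy when the number is written in exponential form. We choose the latter option.
Let $z = -1 + i$. We need first to find $|z|$ and $\arg(z)$. (N.B. Draw Argand diagram)
$$|z| = r = \sqrt{(-1)^2 + 1^2} = \sqrt{2}; \qquad \arg(z) = \tan^{-1}\left(\frac{1}{-1}\right) \text{ in the 2nd quadrant} = \frac{3\pi}{4}\ (+2k\pi).$$
Then $z = \sqrt{2}\,e^{i\frac{3\pi}{4}}$ and
$$z^{15} = (\sqrt{2})^{15}\,e^{15i\frac{3\pi}{4}} = 2^7\sqrt{2}\left(\cos\frac{45\pi}{4} + i\sin\frac{45\pi}{4}\right)$$
$$= 128\sqrt{2}\left(\cos\left(10\pi + \frac{5\pi}{4}\right) + i\sin\left(10\pi + \frac{5\pi}{4}\right)\right)$$
$$= 128\sqrt{2}\left(\cos\frac{5\pi}{4} + i\sin\frac{5\pi}{4}\right) = 128\sqrt{2}\left(-\frac{1}{\sqrt{2}} - \frac{1}{\sqrt{2}}\,i\right)$$
$$= -128 - 128i$$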
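The calculation can be cross-checked numerically. This is a minimal sketch, assuming the number in the example is $z = -1 + i$ (modulus $\sqrt{2}$, argument $3\pi/4$); it raises the modulus to the 15th power and multiplies the argument by 15, then compares with direct exponentiation.

```python
import cmath

z = complex(-1, 1)
r, theta = cmath.polar(z)                   # r = sqrt(2), theta = 3*pi/4
power = cmath.rect(r ** 15, 15 * theta)     # modulus^15, 15 * argument

print(power)      # polar-form result, subject to rounding error
print(z ** 15)    # direct exponentiation for comparison
```

Both printed values should agree with $-128 - 128i$ to within floating-point rounding.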


8. De Moivre's Theorem
This theorem states that, for any real number $k$:
$$(\cos\theta + i\sin\theta)^k = \cos k\theta + i\sin k\theta$$
Proof
$$(\cos\theta + i\sin\theta)^k = (e^{i\theta})^k = e^{ik\theta} = e^{i(k\theta)} = \cos k\theta + i\sin k\theta$$
Therefore if $z = r(\cos\theta + i\sin\theta)$, then:
$$z^k = r^k(\cos k\theta + i\sin k\theta).$$
We use De Moivre's theorem in particular for:
(i) Expressing the sine or cosine of a multiple of $\theta$ in terms of powers of $\cos\theta$ and $\sin\theta$;
(ii) Expressing a power of $\cos\theta$ or $\sin\theta$ in terms of cosines (or sines) of multiples of $\theta$ (useful for integration);
(iii) Finding roots of complex numbers.

EXAMPLES
(i) Express $\cos 3\theta$ in terms of powers of $\cos\theta$.

To use De Moivre's theorem for such problems, we do not just consider $\cos 3\theta$, but $\cos 3\theta + i\sin 3\theta$:
$\cos 3\theta + i\sin 3\theta =$ (by De Moivre's theorem)
$=$

(ii) Express $\sin^6\theta$ in terms of cosines of multiples of $\theta$.

First: Let $z = e^{i\theta} = \cos\theta + i\sin\theta$
Then:
$z^m = e^{im\theta} =$ (2)
$z^{-m} = e^{-im\theta} =$ (3)

Adding: (2) + (3) gives: (4)

Subtracting: (2) $-$ (3) gives: (5)

We require $\sin^6\theta$.
Putting $m = 1$ in equation (5) gives:
Then $\sin^6\theta =$

(iii) Roots of Complex Numbers

Given the equation
$$x^n = a$$
where $n$ is a positive integer and $x$ and $a$ are both real, there are only 0, 1 or 2 solutions, depending on $n$ being odd or even and on $a$ being positive or negative.
However, if we do not restrict the solution to real numbers (or $a$ to having to be a real number), we have a more regular situation.

Theorem
If $n$ is any positive integer and $\alpha$ is any non-zero complex number, then the equation
$$z^n = \alpha$$
has exactly $n$ distinct solutions.

Proof
Let $|\alpha| = R$, $\arg(\alpha) = \phi + 2k\pi$. Then $\alpha = Re^{i(\phi + 2k\pi)}$.
Let $z = re^{i\theta}$. Then the equation becomes:
$$r^ne^{in\theta} = Re^{i(\phi + 2k\pi)}$$
$$\implies r^n = R, \quad\text{i.e. } r = R^{\frac{1}{n}}$$
and:
$$n\theta = \phi + 2k\pi, \quad\text{or: } \theta = \frac{\phi}{n} + \frac{2k\pi}{n}$$
By letting $k = 0, 1, 2, \dots, (n - 1)$ in turn, we obtain $n$ different values for $\theta$, giving $n$ different values of $z$.
But increasing $k$ once more, to $k = n$, gives
$$\theta = \frac{\phi}{n} + 2\pi$$
and this will give the same value for $z$ as $k = 0$, since
$$\cos\left(\frac{\phi}{n} + 2\pi\right) = \cos\left(\frac{\phi}{n}\right) \quad\text{and}\quad \sin\left(\frac{\phi}{n} + 2\pi\right) = \sin\left(\frac{\phi}{n}\right)$$
Therefore by letting $k$ take integer values $\ge n$, we simply generate values of $\theta$ which reproduce values of $z$ already found.
Thus the equation has exactly $n$ distinct roots.

We follow the procedure outlined in this proof to find roots of complex numbers.
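The procedure in the proof translates directly into a short routine. The sketch below is illustrative only: the function name `nth_roots` is an assumption, and the equation $z^4 = -1$ is chosen because it has no real solution yet still has four complex ones.

```python
import cmath
import math

def nth_roots(alpha, n):
    """Return the n distinct solutions of z**n = alpha, for alpha != 0."""
    R, phi = cmath.polar(alpha)      # modulus and principal argument of alpha
    r = R ** (1.0 / n)               # modulus of every root
    # k = 0, 1, ..., n-1 gives arguments phi/n + 2*k*pi/n
    return [cmath.rect(r, (phi + 2 * math.pi * k) / n) for k in range(n)]

roots = nth_roots(-1, 4)
for z in roots:
    print(z, z ** 4)                 # each z**4 should be close to -1
```

The four roots lie on the unit circle, equally spaced at intervals of $\pi/2$.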
EXAMPLES
$x^2 = 2$ has two real solutions: $x = \pm\sqrt{2}$
$x^3 = 27$ has one real solution: $x = 3$
$x^4 = -1$ has no real solution.

EXAMPLE

Find the 4th roots of $8 + 8\sqrt{3}\,i$.

From the theorem, we expect exactly four distinct roots. We are, in effect, being asked to solve the equation
$$z^4 = 8 + 8\sqrt{3}\,i$$
Method: The number whose roots we require, $8 + 8\sqrt{3}\,i$, is at present in the wrong form for us to be able to find these roots. The number must always be expressed in complex exponential form. This means that it is first necessary to find its modulus and argument.
As always, we show the position of the number on an Argand diagram, to help find modulus and argument.
[Figure: blank Argand diagram with axes Re and Im and origin O.]

Let $\alpha = 8 + 8\sqrt{3}\,i$. Then:
$|\alpha| =$
$\arg(\alpha) =$
Then in the form $re^{i\theta}$, we have
$\alpha =$

9. Locus Problems
The locus of a variable point $P$ on a plane or in space is defined as:
The set of all points which can be occupied by $P$.
The point $P$ in question usually has to satisfy one or more conditions. These can be expressed in a variety of ways,
e.g. a verbal condition: The point $P$ is always 3 cm from the origin.
Or by an equation: $x^2 + y^2 = 9$

Both these conditions result in the set of points which lie on a circle, centre the origin, radius 3 cm.
We are used to seeing locus equations expressed in terms of a Cartesian equation, as in the case of this circle, and in many cases we can view the equation and know what sort of curve to expect. When we are looking for the locus of the point representing a complex number, it is therefore tempting (and often necessary) to express the condition to be satisfied by the point in terms of Cartesian coordinates, $x$ and $y$.

EXAMPLE of converting a locus given in terms of $z$ into a Cartesian equation.

Find the locus of the point $z$ satisfying the condition:
$$|z + 3| = 2|z - 1|$$
This is an equation which is not easy to interpret at sight as a geometric shape. We therefore use $z = x + iy$ and convert the equation to Cartesian form, which we may recognise. On substituting for $z$ in terms of $x$ and $y$, we obtain
$$|x + iy + 3| = 2|x + iy - 1|, \quad\text{or}\quad |(x + 3) + iy| = 2|(x - 1) + iy|$$
i.e.
$$\sqrt{(x + 3)^2 + y^2} = 2\sqrt{(x - 1)^2 + y^2}$$
Squaring both sides,
$$(x + 3)^2 + y^2 = 4((x - 1)^2 + y^2)$$
$$x^2 + 6x + 9 + y^2 = 4x^2 - 8x + 4 + 4y^2$$
$$3x^2 + 3y^2 - 14x - 5 = 0$$
$$x^2 + y^2 - \frac{14}{3}x - \frac{5}{3} = 0$$
Completing the square gives:
$$\left(x - \frac{7}{3}\right)^2 + y^2 = \frac{64}{9}:$$
a circle, centre $\left(\frac{7}{3}, 0\right)$, radius $\frac{8}{3}$.
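As a sanity check on the algebra, points taken on the circle with centre $\left(\frac{7}{3}, 0\right)$ and radius $\frac{8}{3}$ should satisfy $|z + 3| = 2|z - 1|$. The sketch below tests eight equally spaced points; the helper name `residual` is an assumption for illustration.

```python
import cmath
import math

centre, radius = complex(7 / 3, 0), 8 / 3

def residual(z):
    """Left side minus right side of |z + 3| = 2|z - 1|."""
    return abs(z + 3) - 2 * abs(z - 1)

# Sample eight points z = centre + radius * e^{i t} around the circle.
points = [centre + cmath.rect(radius, 2 * math.pi * k / 8) for k in range(8)]
for z in points:
    print(z, residual(z))   # residuals should be zero up to rounding
```

A point off the circle, such as $z = 0$, gives a non-zero residual ($3 - 2 = 1$), as expected.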

It is not, however, always necessary to express the equation in terms of Cartesian coordinates to obtain the locus. We remember, for instance, that the distance $d$ of the point $A$, $(x_1, y_1)$ from the point $B$, $(x_2, y_2)$ is
$$d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$$
If the points $A$ and $B$ represent the complex numbers $z_1$ and $z_2$, then this length can be expressed as
$$|z_2 - z_1|$$
In the previous example, where the condition to be satisfied by point $z$ is given as $|z + 3| = 2|z - 1|$, the expressions $|z + 3|$, $|z - 1|$ represent the distance of $z$ from the point $-3$ and the distance of $z$ from the point $1$ respectively. So the condition represented by the equation can be stated as
The distance of $z$ from the point $-3$ is twice the distance of $z$ from the point $1$.
Similarly, we also know that the expression $\arg(z - \alpha)$ represents the angle made by the line joining $\alpha$ to $z$ with the positive direction of the real ($x$) axis. [See 7. Algebraic Properties and Operations]. So the interpretation of an equation such as
$$\arg(z - 1 + i) = \frac{\pi}{3}$$
is a straight line through the point $(1 - i)$ that makes an angle $\frac{\pi}{3}$ with the positive real axis. We therefore do not always need to convert a locus equation to Cartesian coordinates in order to interpret it. So we look at the use of $(z_2 - z_1)$ in locus equations.

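The use of $(z_2 - z_1)$ in locus equations

Reminder:
$|z_2 - z_1|$ = the distance of $z_2$ from $z_1$;
$\arg(z_2 - z_1)$ = the angle made by the line joining $z_1$ and $z_2$ with the positive direction of the real axis.

Given:
two fixed complex numbers $\alpha$, $\beta$;
a fixed real positive number $r$;
a fixed angle $\theta$;

$$|z - \alpha| = r$$
is the equation of the circle, centre $\alpha$, radius $r$.
[Figure: Argand diagram showing the circle with centre $\alpha$ and radius $r$, with $z$ on the circle.]
Another equation for this circle, using the complex exponential form of $z$, is:
$$z = \alpha + re^{i\theta}, \qquad 0 \le \theta < 2\pi$$

$$\arg(z - \alpha) = \theta\ (+k\pi)$$
is the equation of the straight line through $\alpha$, making an angle of $\theta$ with the positive real axis.
[Figure: Argand diagram showing the straight line through $\alpha$ and $z$.]

$$|z - \alpha| = |z - \beta|$$
is the equation of the straight line which is the perpendicular bisector of the line joining $\alpha$ and $\beta$.
[Figure: Argand diagram showing $\alpha$, $\beta$ and a point $z$ equidistant from both.]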

EXAMPLES using the standard equations of circles and straight lines defined on the previous page.
1. $|z - i| = 3$ is:

2. $|z + 2 - 4i| = 1$: write as $|z - (-2 + 4i)| = 1$ to conform as required, is:

3. $\arg(z - 1) = \frac{\pi}{6} + k\pi$ is:

4. $\arg(z) = \frac{\pi}{2}$ is:

5. $|z + 1| = |z + i|$, again, more conveniently written as $|z - (-1)| = |z - (-i)|$, is:

Chapter 6.
POLYNOMIALS


1. Definition of a polynomial of degree n

A polynomial of degree $n$ is a function of the form
$$P(z) = a_nz^n + a_{n-1}z^{n-1} + a_{n-2}z^{n-2} + \dots + a_2z^2 + a_1z + a_0$$
in which $n$ is a non-negative integer and the $a_i$ are constants, $a_n \ne 0$.
If the coefficients $a_0, a_1, a_2, \dots, a_n$ are all real, $P(z)$ is said to be a real polynomial.
2. Zeros and Roots
If $P(\alpha) = 0$, then the number $\alpha$ is said to be:
a zero of the polynomial $P(z)$;
a root of the polynomial equation $P(z) = 0$.

3. Theorems concerning polynomial functions

Theorem 1 The Remainder Theorem
If $\alpha$ is any constant and $P(z)$ is a polynomial, then when $P(z)$ is divided by $(z - \alpha)$, the value of the remainder is $P(\alpha)$.
Corollary to Theorem 1
If $\alpha$ is a zero of $P(z)$, then $P(\alpha) = 0$, so the remainder is zero and $(z - \alpha)$ is a factor of $P(z)$.

Theorem 2 The Fundamental Theorem of Algebra
A polynomial of degree $n \ge 1$ has at least one zero.

If $P(z) = z^3 + iz^2 + (2 - i)z - 1$
then $z = i$ is a zero of $P(z)$ since
$$P(i) = i^3 + i\cdot i^2 + (2 - i)i - 1 = -i - i + 2i + 1 - 1 = 0$$

Consider the polynomial $P(z) = z^3 + z^2 + 2z - 4$

(a) when divided by $(z + 2)$
Here we expect the remainder to be
$$P(-2) = (-2)^3 + (-2)^2 + 2(-2) - 4 = -8 + 4 - 4 - 4 = -12$$
The division gives the quotient $z^2 - z + 4$:
subtracting $z^2(z + 2) = z^3 + 2z^2$ from $z^3 + z^2 + 2z - 4$ leaves $-z^2 + 2z - 4$;
subtracting $-z(z + 2) = -z^2 - 2z$ leaves $4z - 4$;
subtracting $4(z + 2) = 4z + 8$ leaves the remainder $-12$, as expected.

(b) when divided by $(z - 1)$
Here we expect the remainder to be
$$P(1) = 1^3 + 1^2 + 2(1) - 4 = 1 + 1 + 2 - 4 = 0$$
Since the remainder = 0, we expect $(z - 1)$ to be a factor of $P(z)$.
Division of $z^3 + z^2 + 2z - 4$ by $(z - 1)$ gives:

$P(z)$ can therefore be expressed as
$$(z - 1)(z^2 + 2z + 4)$$
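Division by a linear factor $(z - \alpha)$ can be carried out compactly by synthetic division, a standard shortcut that is not described in these notes. The sketch below (the function name `divide_by_linear` is an assumption) applies it to $P(z) = z^3 + z^2 + 2z - 4$; by the Remainder Theorem, the final value produced is $P(\alpha)$.

```python
def divide_by_linear(coeffs, alpha):
    """Divide a polynomial by (z - alpha) using synthetic division.

    coeffs lists the coefficients from the highest power down.
    Returns (quotient coefficients, remainder); the remainder equals P(alpha).
    """
    values = [coeffs[0]]
    for c in coeffs[1:]:
        # Bring down the next coefficient and add alpha times the running value.
        values.append(c + alpha * values[-1])
    return values[:-1], values[-1]

P = [1, 1, 2, -4]                  # z^3 + z^2 + 2z - 4
print(divide_by_linear(P, -2))     # division by (z + 2)
print(divide_by_linear(P, 1))      # division by (z - 1)
```

The first call reproduces the quotient $z^2 - z + 4$ with remainder $-12$; the second gives $z^2 + 2z + 4$ with remainder $0$.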


Theorem 3

Any polynomial $P(z)$ of degree $n \ge 1$ has exactly $n$ zeros.

N.B. We have already proved this theorem for the case of a polynomial equation of the type
$$z^n = a \qquad (\text{or } z^n - a = 0)$$
whose solutions are the $n$th roots of the number $a$. In that case we found that there were exactly $n$ distinct (different) roots.
For a more general polynomial, the zeros are not necessarily all distinct. If $P(z)$ contains some repeated linear factors, then the corresponding zeros will be repeated.

Multiplicity
The repeated zeros are classified according to their multiplicity: i.e. if the factor $(z - \alpha)$ occurs $m$ times amongst the linear factors of $P(z)$, then $\alpha$ is a zero of multiplicity $m$.
Theorem 4
If $\alpha$ is a zero of $P(z)$ of multiplicity $m$, where $m \ge 2$, then $\alpha$ is also a zero of $P'(z)$, of multiplicity $m - 1$.

EXAMPLE

Consider the polynomial $P(z)$, where
$$P(z) = z^6 - 5z^5 + 5z^4 + 9z^3 - 14z^2 - 4z + 8$$
It is found that $P(z)$ can be factorised, giving
$$P(z) = (z - 2)^3(z + 1)^2(z - 1)$$
so that, as expected, $P(z)$ has 6 linear factors:
$(z - 2)$ is repeated 3 times $\implies z = 2$ is a zero of multiplicity 3
$(z + 1)$ is repeated twice $\implies z = -1$ is a zero of multiplicity 2
$(z - 1)$ occurs only once $\implies z = 1$ is a simple zero.

Then differentiating w.r. to $z$,
$$P'(z) = 3(z - 2)^2(z + 1)^2(z - 1) + 2(z + 1)(z - 2)^3(z - 1) + (z - 2)^3(z + 1)^2$$
$$= (z - 2)^2(z + 1)\left(3(z + 1)(z - 1) + 2(z - 2)(z - 1) + (z - 2)(z + 1)\right)$$
$$= (z - 2)^2(z + 1)(6z^2 - 7z - 1)$$
Let $(6z^2 - 7z - 1) = Q(z)$
$Q(2) = 24 - 14 - 1 = 9 \ne 0$, therefore $(z - 2)$ is not a factor of $Q(z)$.
$Q(-1) = 6 + 7 - 1 = 12 \ne 0$, therefore $(z + 1)$ is not a factor of $Q(z)$.
$Q(1) = 6 - 7 - 1 = -2 \ne 0$, therefore $(z - 1)$ is not a factor of $Q(z)$.
Therefore $P'(z)$ has exactly two factors $(z - 2)$, one factor $(z + 1)$ and no factors $(z - 1)$,
i.e.
$z = 2$ is a zero of multiplicity 2 of $P'(z)$;
$z = -1$ is a simple zero of $P'(z)$;
$z = 1$ is not a zero of $P'(z)$.
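Applying Theorem 4 repeatedly suggests a simple way to measure multiplicity: count how many of $P, P', P'', \dots$ vanish at the point in question. The sketch below does this with exact integer arithmetic for the polynomial of this example; the helper names are assumptions for illustration, and the exact comparison with zero is only safe because the coefficients and test points are integers.

```python
def poly_eval(coeffs, z):
    """Evaluate a polynomial (coefficients from highest power down) by Horner's rule."""
    result = 0
    for c in coeffs:
        result = result * z + c
    return result

def poly_derivative(coeffs):
    """Coefficients of the derivative, highest power first."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def multiplicity(coeffs, alpha):
    """Count how many of P, P', P'', ... vanish at alpha in succession."""
    m = 0
    while len(coeffs) > 1 and poly_eval(coeffs, alpha) == 0:
        m += 1
        coeffs = poly_derivative(coeffs)
    return m

P = [1, -5, 5, 9, -14, -4, 8]   # z^6 - 5z^5 + 5z^4 + 9z^3 - 14z^2 - 4z + 8
print([multiplicity(P, a) for a in (2, -1, 1)])
```

The printed list gives the multiplicities of $z = 2$, $z = -1$ and $z = 1$ in that order, which can be compared with the factorised form above.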
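[Figure: sketch of the graph of $y = P(x)$ against $x$, with $x = -1$ marked on the $x$-axis.]

Zeros of multiplicity $\ge 2$ imply stationary points on the $x$-axis on the graph of a real polynomial such as $P(x)$: even multiplicity $\implies$ max or min; odd multiplicity $\implies$ point of inflexion.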

The above theorems all apply to real or complex polynomials with real or complex
zeros.
The next theorem concerns complex zeros of real polynomials.
Theorem 5
If $\alpha = u + iv$, where $u$, $v$ are real and $v \ne 0$, and if $\alpha$ is a zero of the real polynomial $P(z)$, then $\bar{\alpha} = u - iv$ is also a zero of $P(z)$.
Corollary 1
A real polynomial may always be expressed as a product of real linear and real quadratic
factors.
Corollary 2
A real polynomial whose degree is odd must have at least one real zero.


EXAMPLE 1
In the case of an equation of the form
$$z^n - a = 0$$
where $a$ is real and positive, the roots we require are the $n$th roots of $a$. Remembering the standard method of finding these roots, we need to find the modulus and argument of $a$, and then to express $a$ in its complex exponential form. Since $a$ is real and positive:
$$|a| = a, \qquad \arg(a) = 0 + 2k\pi \ (\text{where } k \text{ is an integer}), \qquad a = ae^{i(0 + 2k\pi)}$$
$\implies$ the $n$th roots of $a$ are
$$\sqrt[n]{a}\;e^{i\frac{2k\pi}{n}}$$
By taking $k = 0$ we obtain one root $= \sqrt[n]{a}\,e^0 = \sqrt[n]{a}$, real and positive.
The other roots are then equally spaced around the circle of radius $\sqrt[n]{a}$, centre $O$, and the symmetry of this result shows that the complex roots occur in conjugate pairs.
[Figure: the case $n = 5$: five roots equally spaced around a circle centred at $O$, adjacent roots separated by the angle $\frac{2\pi}{5}$.]

EXAMPLE 2
Find the real 4th degree polynomial P (z) which has zeros at z = 2 + i and z = 3i,
and for which P (0) = 90.


4. Relationships between coefficients and zeros of a polynomial

Since every polynomial may be expressed in terms of its linear factors, we now have two alternative ways of writing a polynomial:
(1) $P(z) = a_nz^n + a_{n-1}z^{n-1} + a_{n-2}z^{n-2} + \dots + a_2z^2 + a_1z + a_0$

(2) $P(z) = a_n(z - \alpha_1)(z - \alpha_2)(z - \alpha_3)\cdots(z - \alpha_n)$

where $\alpha_i$, $i = 1, 2, 3, \dots, n$ are the zeros of $P(z)$.

When the expression (2) is multiplied out, leading to an expression denoted (3), it will take the same form as the expression (1), and must therefore be identical to it.
Therefore by equating the coefficients of identical powers of $z$ from each of (1) and (3), we will obtain relationships between the coefficients (the $a_i$) and the zeros (the $\alpha_i$).
(a) Polynomials of degree 2 (quadratic)

Denote by $P_2(z)$.

Form (1): $P_2(z) = a_2z^2 + a_1z + a_0$

Form (2): $P_2(z) = a_2(z - \alpha_1)(z - \alpha_2) = a_2(z^2 - (\alpha_1 + \alpha_2)z + \alpha_1\alpha_2)$

Equating coefficients of $z$ gives: $a_1 = -a_2(\alpha_1 + \alpha_2)$
Equating the constant terms gives: $a_0 = a_2(\alpha_1\alpha_2)$

These equations lead to the relationships:
$$\text{Sum of zeros: } \alpha_1 + \alpha_2 = -\frac{a_1}{a_2}; \qquad \text{Product of zeros: } \alpha_1\alpha_2 = \frac{a_0}{a_2}$$
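EXAMPLE involving a quadratic equation

If $\alpha_1$ and $\alpha_2$ are the roots of the equation
$$z^2 + 3z + 8 = 0,$$
find the quadratic equation whose roots are $\alpha_1^2$ and $\alpha_2^2$.
Using the relationships derived on the previous page, we have:
$$\alpha_1 + \alpha_2 = -\frac{3}{1} = -3 \quad (1); \qquad \alpha_1\alpha_2 = \frac{8}{1} = 8 \quad (2)$$
The required equation will be:
$$z^2 - (\alpha_1^2 + \alpha_2^2)z + \alpha_1^2\alpha_2^2 = 0$$
From equation (2), we have: $\alpha_1^2\alpha_2^2 = (\alpha_1\alpha_2)^2 = 64$.
We can write $(\alpha_1^2 + \alpha_2^2)$ as: $(\alpha_1 + \alpha_2)^2 - 2\alpha_1\alpha_2 = (-3)^2 - 2 \times 8 = -7$

The required equation is therefore:
$$z^2 + 7z + 64 = 0$$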
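The result can be confirmed by computing the two roots of $z^2 + 3z + 8 = 0$ explicitly and forming the sum and product of their squares. This is only a numerical check, written under the assumption that the quadratic formula is applied with complex arithmetic; the variable names are illustrative.

```python
import cmath

# Roots of z^2 + 3z + 8 = 0 from the quadratic formula (complex in this case).
disc = cmath.sqrt(3 ** 2 - 4 * 1 * 8)
alpha1, alpha2 = (-3 + disc) / 2, (-3 - disc) / 2

sum_of_squares = alpha1 ** 2 + alpha2 ** 2
product_of_squares = (alpha1 * alpha2) ** 2
print(sum_of_squares, product_of_squares)
```

The new equation is $z^2 - (\text{sum})z + (\text{product}) = 0$, so a sum of $-7$ and a product of $64$ correspond to $z^2 + 7z + 64 = 0$.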


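(b) Polynomials of degree 3 (cubic)

Denote by $P_3(z)$.

Form (1): $P_3(z) = a_3z^3 + a_2z^2 + a_1z + a_0$

Form (2): $P_3(z) = a_3(z - \alpha_1)(z - \alpha_2)(z - \alpha_3)$
$$= a_3\left(z^3 - (\alpha_1 + \alpha_2 + \alpha_3)z^2 + (\alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3)z - \alpha_1\alpha_2\alpha_3\right)$$
Equating coefficients of $z^2$ gives: $a_2 = -a_3(\alpha_1 + \alpha_2 + \alpha_3)$
Equating coefficients of $z$ gives: $a_1 = a_3(\alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3)$
Equating the constant terms gives: $a_0 = -a_3(\alpha_1\alpha_2\alpha_3)$

These equations lead to the relationships:
$$\text{Sum of zeros: } \alpha_1 + \alpha_2 + \alpha_3 = -\frac{a_2}{a_3}; \qquad \text{Product of zeros: } \alpha_1\alpha_2\alpha_3 = -\frac{a_0}{a_3}$$
$$\text{Also, products of all possible pairs of zeros: } \alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3 = \frac{a_1}{a_3}$$

Polynomials of degree n (the general case)

The polynomial is $P_n(z)$ with $n$ linear factors.
Multiplying out the factors and equating coefficients of $z^{n-1}, z^{n-2}, \dots, z$ and the constant terms gives:
$$\text{Sum of the zeros} = \sum_{k=1}^{n}\alpha_k = (-1)\frac{a_{n-1}}{a_n}$$
$$\text{Sum of products of all possible pairs of zeros} = \sum_{j<k}\alpha_j\alpha_k = (-1)^2\frac{a_{n-2}}{a_n}$$
$$\text{Sum of products of all possible triples of zeros} = \sum_{i<j<k}\alpha_i\alpha_j\alpha_k = (-1)^3\frac{a_{n-3}}{a_n}$$
........
$$\text{Product of all the zeros} = \prod_{k=1}^{n}\alpha_k = (-1)^n\frac{a_0}{a_n}$$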
EXAMPLE involving a cubic equation

It is given that $\alpha$ and $\dfrac{1}{\alpha}$ are both roots of the equation
$$3z^3 - 7z^2 - 7z + 3 = 0$$
Find $\alpha$ and hence solve the equation completely.

We expect three roots. Call the third root $\beta$.
We can write down three equations for $\alpha$ and $\beta$ using the relationships between roots (zeros of the polynomial) and the coefficients:

N.B.1 In this case the two possible values of $\alpha$ obtained have led to the same three roots of the equation, so there is no ambiguity. In other cases (not illustrated here), two different sets of possible roots may result from solving the equations. Since the solutions of a polynomial must form a unique set of values, one of these apparent possibilities is false.
We resolve the ambiguity by using the 3rd of the equations relating coefficients and zeros as a check: we have in fact only used two of the equations to find $\alpha$ and $\beta$, since there were only two unknowns.
N.B.2 At first sight, it may seem as though using these relationships is a way to find the roots of a polynomial of degree 3. However, these relationships are in general not linear. Therefore unless extra information about the roots is given, when the relationship equations are manipulated to eliminate some of the unknowns, the result will be a polynomial of the same degree as the original equation, in many cases the original equation itself!
Therefore in order to be of use, we also need extra information about the roots.

5. Numerical Solution of Non-Linear Equations: The Newton-Raphson Method

The equation to be solved must be expressed in the form
$$f(x) = 0$$
where $f(x)$ is a function which is continuous within the relevant range of values of $x$.
Geometric Explanation of the Method
The equation is $f(x) = 0$.
The exact root required is $x = \bar{x}$.
Let $x_0$ be a first approximation to $\bar{x}$.
The tangent to the curve $y = f(x)$ at $x_0$ may be expected to cut the $x$-axis at a point $x = x_1$ closer to $\bar{x}$ than $x_0$.
[Figure: the curve $y = f(x)$ crossing the $x$-axis at $\bar{x}$; the tangent at the point $(x_0, f(x_0))$ meets the $x$-axis at $x_1$, making the angle $\theta$ with it.]

To calculate $x_1$ from $x_0$:
$$f'(x_0) = \text{gradient of the tangent to the curve at } x_0 = \tan\theta = \frac{f(x_0)}{x_0 - x_1}$$
$$\text{Therefore } x_0 - x_1 = \frac{f(x_0)}{f'(x_0)}$$
$$\text{Thus } x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}$$
The process may be repeated, using the tangent to the curve at the new point, $x_1$. This will lead to the next improved point, $x_2$.
We have developed an iterative process, given by the Newton-Raphson formula as:
$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}$$
We perform as many iterations as are necessary to achieve the accuracy required. This is indicated when two successive iterations give the same value of $x$ to the required number of significant figures/decimal places.

EXAMPLE

Find a real root of the equation
$$x^3 - x^2 + 2x - 1 = 0$$
and give your answer correct to 6 decimal places.

For this equation, $f(x) = x^3 - x^2 + 2x - 1$.
Since $f(x)$ is a continuous function, any real root must occur between two values of $x$ which give function values having opposite signs. To find a suitable starting value $x_0$, we usually calculate function values at consecutive integer values of $x$, until we find a change of sign.
In this case, starting with $x = 0$:
$f(0) = -1$
$f(1) = 1 - 1 + 2 - 1 = 1$
Since $f(0)$ and $f(1)$ have opposite signs, a root lies between $x = 0$ and $x = 1$.
Therefore choose, say, $x_0 = 0.5$
[Figure: sketch of $y = f(x)$ rising from $y = -1$ at $x = 0$ to $y = 1$ at $x = 1$.]
For the Newton-Raphson formula, we need:
$f(x) = x^3 - x^2 + 2x - 1$
$f'(x) = 3x^2 - 2x + 2$
Then the 1st iteration gives:
$$x_1 = x_0 - \frac{f(x_0)}{f'(x_0)} = 0.5 - \left[\frac{(0.5)^3 - (0.5)^2 + 2(0.5) - 1}{3(0.5)^2 - 2(0.5) + 2}\right] = 0.57142857$$
Repeating the process:
$$x_2 = x_1 - \frac{f(x_1)}{f'(x_1)} = 0.56984127$$
$$x_3 = x_2 - \frac{f(x_2)}{f'(x_2)} = 0.56984030$$
$$x_4 = x_3 - \frac{f(x_3)}{f'(x_3)} = 0.56984029$$
We see that $x_3$ and $x_4$ are the same when given correct to the accuracy required (6 d.p.). We have therefore performed enough iterations, and found the required root of the equation to be:
$x = 0.569840$ to 6 d.p.
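The iteration is straightforward to program. The following is a minimal sketch rather than a robust solver: the function name `newton_raphson`, the tolerance `tol` and the iteration cap `max_iter` are assumptions chosen for illustration, and the stopping rule simply compares successive iterates as described above.

```python
def newton_raphson(f, df, x0, tol=1e-10, max_iter=50):
    """Iterate x_{n+1} = x_n - f(x_n)/f'(x_n) until successive values agree to tol."""
    x = x0
    for _ in range(max_iter):
        slope = df(x)
        if slope == 0:
            raise ZeroDivisionError("f'(x) vanished; try a different starting value")
        x_new = x - f(x) / slope
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

def f(x):
    return x**3 - x**2 + 2*x - 1

def df(x):
    return 3*x**2 - 2*x + 2

root = newton_raphson(f, df, 0.5)
print(round(root, 6))
```

Starting from $x_0 = 0.5$, the routine follows the same sequence of iterates as the hand calculation, only carried to more decimal places.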

Comments on the Newton-Raphson Process


A suitable starting value $x_0$ is usually chosen in this way, by finding two consecutive integer values of $x$ between which $f(x)$ changes sign. Since the function is continuous in the region, a change of sign indicates that the value of $f(x)$ must become zero in order for its sign to change, and the point at which this occurs is the point we are seeking.
Usually the method converges quickly. In most cases, we can expect the number of correct decimal places roughly to double at each iteration.
Provided any arithmetic mistakes made during the iteration process are small, the method is self-correcting.
The method will also find complex roots of the equation if a suitable complex starting value $x_0$ is used. However, it is usually difficult to find a suitable starting value, and the process may be extremely lengthy!
Because the formula involves division by $f'(x_i)$, if the value of $f'(x_i)$ is very small we shall be dividing by a number tending to zero, and convergence may be very slow; other problems may also occur. In particular, this situation occurs at a multiple (double, triple, etc.) root, though in fact even these can be found (slowly!) by the N-R process.
The diagram illustrates what could occur when a root is close to a point where $f'(x) = 0$.
[Figure: a curve with a turning point near the required root. The tangent at the initial choice $x_0$ gives $x_1$; the tangent at $x_1$ gives $x_2$, beyond the turning point; the next iterate $x_3$ leads to the wrong root.]

Two possible (though unlikely) causes of lack of convergence are demonstrated in the following diagrams. The remedy is to try a different starting value $x_0$.

6. Estimation of the location of real roots


In order to use the Newton-Raphson method, we need to find a starting value for $x$ not too far from the required root. The following suggestions may be of use.

Change of sign of the function
Already considered in Comments on the N-R method.

Descartes' Rule of Signs
The Rule: The number of positive real roots of the polynomial equation $P(x) = 0$ is either equal to, or is less than by an even number, the number of sign changes in the coefficients of $P(x)$, taken in order.
Corollary: The number of negative real roots of the same equation is either equal to, or is less than by an even number, the number of sign changes in the coefficients of $P(-x)$, taken in order.

Example 1 In the equation $x^3 - x^2 + 2x - 1 = 0$ used as the example of the N-R method:
$$f(x) = x^3 - x^2 + 2x - 1$$
The coefficients taken in order are $1, -1, 2, -1$, so $+, -, +, -$. We see that there are three changes of sign in the coefficients of consecutive terms.
By Descartes' Rule of Signs, we deduce that the equation has 3 or 1 positive real roots.
$$f(-x) = (-x)^3 - (-x)^2 + 2(-x) - 1 = -x^3 - x^2 - 2x - 1$$
We see that there are no changes of sign between consecutive coefficients.
Again by Descartes' Rule of Signs we deduce that the equation has no real negative roots. Therefore in our search for a suitable starting value, $x_0$, for the N-R method, there would be no point trying negative values of $x$.
Example 2
$$x^5 + x^3 + 1 = 0$$
Here $f(x) = x^5 + x^3 + 1$: no sign changes $\implies$ no real positive roots.
$f(-x) = (-x)^5 + (-x)^3 + 1 = -x^5 - x^3 + 1$: 1 sign change $\implies$ 1 negative real root.
(Zero coefficients are ignored, or bypassed.)
$x = 0$ is not a solution of this equation since the $a_0$ coefficient is not zero. Therefore the equation has only one real root and this must be negative. Since the polynomial is of degree 5, so that the equation must have 5 roots, 4 of them must be complex. Since the polynomial is real, these complex roots will be two pairs of complex conjugates.
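Counting sign changes is easy to automate. In the sketch below, `sign_changes` and `reflect` are hypothetical helper names: the first counts sign changes while skipping zero coefficients, and the second forms the coefficients of $P(-x)$ by negating the coefficients of odd powers.

```python
def sign_changes(coeffs):
    """Count sign changes in a coefficient list, skipping zero coefficients."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def reflect(coeffs):
    """Coefficients of P(-x), given those of P(x) from the highest power down."""
    n = len(coeffs) - 1
    return [c * (-1) ** (n - i) for i, c in enumerate(coeffs)]

P1 = [1, -1, 2, -1]            # x^3 - x^2 + 2x - 1
P2 = [1, 0, 1, 0, 0, 1]        # x^5 + x^3 + 1
for P in (P1, P2):
    # (upper bound on positive roots, upper bound on negative roots)
    print(sign_changes(P), sign_changes(reflect(P)))
```

The counts are upper bounds only: by the Rule, the actual number of roots of each sign may be smaller by an even number.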


Rolle's Theorem

If $\alpha$ and $\beta$ are real roots of the equation $f(x) = 0$, where $\alpha < \beta$, and $f(x)$ is smooth and continuous, then the equation
$$f'(x) = 0$$
must have at least one real root in the interval $\alpha < x < \beta$,
i.e. the function $f(x)$ must have at least one turning point (max. or min.) in the interval $\alpha < x < \beta$.
[Figure: demonstration of the theorem: a curve crossing the $x$-axis at $\alpha$ and $\beta$, with a maximum between them.]

Conversely, if we know that one real root of $f(x) = 0$ is $x = \alpha$, and the first max or min point beyond $\alpha$ occurs at $x = \gamma$, then the next root, $\beta$, must be $> \gamma$.
Furthermore, if there are no real turning points beyond $x = \alpha$, then there can be no real roots of the equation $f(x) = 0$ which are greater than $\alpha$.
Example
Consider again the equation $x^3 - x^2 + 2x - 1 = 0$

The turning points are given by $f'(x) = 0$, i.e. $3x^2 - 2x + 2 = 0$
This equation has roots $x = \dfrac{2 \pm \sqrt{4 - 24}}{6}$, and these are complex.
Therefore the function has no real turning points, so the curve cannot turn round and meet the axis again.
We have therefore found the one real root of the equation $x^3 - x^2 + 2x - 1 = 0$ to be $x = 0.569840$, and the other two roots are complex. They will be complex conjugates of each other.

Division by a linear factor after one root has been found

Suppose we have found that $x = \alpha$ is a root of the equation $P(x) = 0$, where $P(x)$ is a polynomial of degree $n$. This implies that $(x - \alpha)$ is a factor of the polynomial $P(x)$.
We can then divide $P(x)$ by $(x - \alpha)$. The resulting polynomial equation, $Q(x) = 0$, of degree $n - 1$ and therefore simpler than $P(x) = 0$, is satisfied by the remaining roots of $P(x) = 0$. We can therefore work on the equation $Q(x) = 0$ instead of $P(x) = 0$.
Example
Returning again to the N-R example, $x^3 - x^2 + 2x - 1 = 0$, we have obtained the root $x = 0.569840$.
$(x - 0.569840)$ is therefore a factor of $f(x)$.
Dividing $f(x)$ by this factor, we obtain
$$x^2 - 0.43016x + 1.75488 = 0$$
Since this equation is quadratic, we can use the formula to solve it, and we obtain
$$x = 0.21508 \pm 1.30714\,i \quad \text{(complex, as predicted).}$$

When all else fails, plot some points and get some idea of the curve, and hope that this will help!

7. Polynomial Interpolation
Given a table of n + 1 distinct values of an independent variable xi and corresponding
function values f (xi ) = fi , i = 0, 1, 2, 3, ...., n, interpolation is the process of determining a suitable function value corresponding to some intermediate, non-tabular value
of x. (If the non-tabular value of x lies outside the range given, the process is called
extrapolation).
Since polynomial functions are so convenient for many purposes, we often look for a
polynomial function to interpolate the given points. A polynomial of degree n will
interpolate n + 1 points.
There are various methods of obtaining interpolating polynomials. Two of these will be explained here. These two methods can both be used to find the interpolating polynomial, whether the values of x are equally spaced or not.
(a). The Lagrange Formula
Suppose we are given $(n + 1)$ points $(x_0, f_0), (x_1, f_1), \ldots, (x_n, f_n)$. We wish to fit a polynomial through these.

Consider the formula:

\[
P(x) = f_0 \frac{(x - x_1)(x - x_2)\cdots(x - x_n)}{(x_0 - x_1)(x_0 - x_2)\cdots(x_0 - x_n)}
 + f_1 \frac{(x - x_0)(x - x_2)\cdots(x - x_n)}{(x_1 - x_0)(x_1 - x_2)\cdots(x_1 - x_n)}
 + \cdots + f_n \frac{(x - x_0)(x - x_1)\cdots(x - x_{n-1})}{(x_n - x_0)(x_n - x_1)\cdots(x_n - x_{n-1})}
\]

This is certainly a polynomial of degree (at most) $n$.

Consider $x = x_k$, where $k$ is an integer such that $0 \le k \le n$, i.e. $x_k$ is one of the given values of $x$.
Then

\[
P(x_k) = 0 + 0 + \cdots + f_k \frac{(x_k - x_0)(x_k - x_1)\cdots(x_k - x_{k-1})(x_k - x_{k+1})\cdots(x_k - x_n)}{(x_k - x_0)(x_k - x_1)\cdots(x_k - x_{k-1})(x_k - x_{k+1})\cdots(x_k - x_n)} + \cdots + 0 = f_k \cdot 1 = f_k
\]

The point $(x_k, f_k)$, therefore, lies on this curve,
i.e. the polynomial $P(x)$ interpolates all the given points.
$P(x)$ is the required interpolating polynomial.
Example Use Lagrange interpolation to find the quadratic function interpolating the points (2,-1), (3,7), (6,3).
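A minimal sketch of the Lagrange formula in Python may help when checking a hand calculation. The function name `lagrange_eval` is hypothetical, and exact fractions are used (an assumption made here so that the check is free of rounding error).

```python
from fractions import Fraction

def lagrange_eval(points, x):
    """Evaluate the Lagrange interpolating polynomial through `points` at x."""
    total = Fraction(0)
    for k, (xk, fk) in enumerate(points):
        term = Fraction(fk)
        for j, (xj, _) in enumerate(points):
            if j != k:
                # factor (x - x_j) / (x_k - x_j) of the k-th term
                term *= Fraction(x - xj) / Fraction(xk - xj)
        total += term
    return total

points = [(2, -1), (3, 7), (6, 3)]
for x in (2, 3, 6, 0, 1):
    print(x, lagrange_eval(points, x))
```

Evaluating at the three given values of x should reproduce the given function values; evaluating at other points gives values of the interpolating quadratic that can be compared with a hand-derived formula.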


(b) Divided Differences and Newton's Interpolation Formula

From the given table of values of $x_i$ and $f_i$, a table of Divided Differences can be formed. These differences can be considered to represent the mean values of the derivatives of $f(x)$ with respect to $x$ at the points concerned. We define divided differences as follows:
Definition
If a function $f(x)$ takes values $f_i$ at a set of points $\{x_i\}$, then the First Divided Difference (1st D.D.) between the points at $x_i$ and $x_j$ is
\[ f[x_i, x_j] = \frac{f_j - f_i}{x_j - x_i} \]

The Second Divided Difference depends on the 1st D.D.:
\[ f[x_i, x_j, x_k] = \frac{f[x_j, x_k] - f[x_i, x_j]}{x_k - x_i} \]

etc. The nth divided difference involves $n + 1$ points, say those at $x_0, x_1, x_2, \ldots, x_n$:
\[ f[x_0, x_1, x_2, \ldots, x_n] = \frac{f[x_1, x_2, \ldots, x_n] - f[x_0, x_1, \ldots, x_{n-1}]}{x_n - x_0} \]

N.B. 1. The points $x_i, x_j, x_k$ etc. do not have to be taken in any particular order. Often they will be given in ascending order, but this is not necessary.
N.B. 2. $f[x_i, x_j] = f[x_j, x_i]$
N.B. 3. $f[x_i, x_j]$ is numerically equal to the gradient of the line joining the two points.

We then form a table of divided differences for the function.


Example Form a divided difference table from the data points
(-2,11), (-1,4), (2,19), (4,119), (5,214).

x_i      f_i      1st D.D.      2nd D.D.      3rd D.D.

These points were, in fact, all taken from the function
\[ y = x^3 + 4x^2 - 2x - 1 \]
i.e. a cubic polynomial, and we see that the 3rd D.D., representing the mean value of the 3rd derivative, is constant.
The Newton Interpolating Polynomial Formula is:
\[
P(x) = f_0 + (x - x_0) f[x_0, x_1] + (x - x_0)(x - x_1) f[x_0, x_1, x_2] + \cdots + (x - x_0)(x - x_1)(x - x_2)\cdots(x - x_{n-1}) f[x_0, x_1, x_2, \ldots, x_n]
\]

If we use this formula on the above table, we would expect to obtain the cubic function
from which the points were calculated.

This formula has an advantage over the Lagrange formula. If an extra point needs to
be interpolated after the initial calculation of the function has been done, we merely
add it to the table, calculate the next D.Ds, and add one more term to the Newton
formula.
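The construction of the table and the use of Newton's formula can be sketched as follows. This is an illustrative sketch only: the names `divided_differences` and `newton_eval` are hypothetical, and exact fractions are assumed so that the constant 3rd D.D. shows up without rounding noise.

```python
from fractions import Fraction

def divided_differences(xs, fs):
    """Return the columns of the divided difference table.

    table[0] holds the f values, table[1] the 1st D.D.s, and so on.
    """
    table = [[Fraction(f) for f in fs]]
    for order in range(1, len(xs)):
        prev = table[-1]
        col = [(prev[i + 1] - prev[i]) / Fraction(xs[i + order] - xs[i])
               for i in range(len(prev) - 1)]
        table.append(col)
    return table

def newton_eval(xs, table, x):
    """Evaluate Newton's formula using the top entry of each column."""
    result = Fraction(0)
    product = Fraction(1)          # running product (x - x0)(x - x1)...
    for order, col in enumerate(table):
        result += product * col[0]
        product *= Fraction(x) - xs[order]
    return result

xs = [-2, -1, 2, 4, 5]
fs = [11, 4, 19, 119, 214]
table = divided_differences(xs, fs)
for order, col in enumerate(table):
    print(order, [str(v) for v in col])
print(newton_eval(xs, table, 3))   # compare with 3^3 + 4*3^2 - 2*3 - 1
```

Adding an extra data point only appends one entry to each column and one further term to the formula, which is the advantage described above.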


8. Curve Fitting
So far we have considered a set of data points, experimental or calculated, and fitted a curve through these points by interpolation. We know that a polynomial of degree n will interpolate (n + 1) points. If, therefore, there are more than (n + 1) points, there is no guarantee that those not used in the interpolation procedure will lie on the curve, or that the error will be particularly small.
Curve-fitting is essentially an averaging process, aiming to average out and minimise errors by assuming that, behind the possibly erratic experimentally-obtained values, there is some orderly process which basically follows some relatively simple equation.
The data points may be such that none of them will actually lie on the curve, but they should be near enough for the curve to be a genuine representation of the function within reasonable error. The curve found may be more useful and appropriate than a complicated polynomial which interpolates all the given points. For example, given 10 points, we could interpolate these by a polynomial of degree 9, but this could have up to 8 turning points, and we may need a smoother curve than this.
Given a set of data points, we have to choose the type of curve we feel is best fitted to them: e.g. a polynomial of a certain degree, an exponential curve, a trigonometric curve, etc. To make an intelligent decision, we should attempt to find out where the given data points came from, since we may be able to guess the expected type of relationship from this information.

[Figure: an example of a set of data points, with graphs of 3 possible functions which might fit this data.]


Consideration of the errors involved


Let the function we have chosen to approximate the relationship shown by the data points be $f(x)$, where
\[ f(x) = a_0 v_0(x) + a_1 v_1(x) + \cdots + a_m v_m(x) \]
where the $v_i(x)$ are functions of $x$ and are called basis functions, and the $a_i$ are constants, $i = 0, 1, 2, \ldots, m$.
We have $n + 1$ data points: $(x_0, y_0), (x_1, y_1), \ldots, (x_n, y_n)$.

$f(x_k)$ is the value of the function $f(x)$ at $x = x_k$, $(k = 0, 1, 2, \ldots, n)$, and we would not expect $f(x_k)$ to be necessarily exactly equal to $y_k$. However, we do require that the difference between $f(x_k)$ and $y_k$ should be small, otherwise $f(x)$ would be of little use as a function approximating the curve.
Let $\varepsilon_k = f(x_k) - y_k$.
Considering the accumulation of errors, there would be little point in insisting that $\sum \varepsilon_k$ was as small as possible unless we knew that $\varepsilon_k$ was always +ve or always -ve, and this is unlikely. Very large +ve and -ve errors could cancel each other out and give $\sum \varepsilon_k = 0$.
More sensible would be to minimise $\sum |\varepsilon_k|$.

However, there are difficulties in performing algebraic operations on this expression.

We therefore use the Least Squares Method, and minimise
\[ E = \varepsilon_0^2 + \varepsilon_1^2 + \varepsilon_2^2 + \cdots + \varepsilon_n^2 \]
in which no term can be negative.


A variation of this method is the Weighted Least Squares fitting in which we aim to minimise the expression
\[ w_0 \varepsilon_0^2 + w_1 \varepsilon_1^2 + w_2 \varepsilon_2^2 + \cdots + w_n \varepsilon_n^2 \]
This technique is useful when dealing with experimental data in which we are more
convinced about the accuracy of some points than others. These points are therefore
more crucial to the function. If we apply a larger weighting factor to these points in
the above expression, their importance will be increased over that of the others.

The Least Squares Method

\[ \varepsilon_k = f(x_k) - y_k = a_0 v_0(x_k) + a_1 v_1(x_k) + \cdots + a_m v_m(x_k) - y_k \]

We need to choose the coefficients $a_i$ so that $E$ is minimised.

Now
\[ E = \sum_{k=0}^{n} \varepsilon_k^2 = \sum_{k=0}^{n} \left(a_0 v_0(x_k) + a_1 v_1(x_k) + \cdots + a_m v_m(x_k) - y_k\right)^2 \]

If $E$ is to be minimised, then
\[ \frac{\partial E}{\partial a_i} = 0 \quad \text{for } i = 0, 1, 2, \ldots, m \]
i.e.
\[ \sum_{k=0}^{n} 2\left(a_0 v_0(x_k) + a_1 v_1(x_k) + \cdots + a_m v_m(x_k) - y_k\right) v_i(x_k) = 0 \quad \text{for } i = 0, 1, 2, \ldots, m. \]
So we have $m + 1$ linear equations to solve for $a_0, a_1, \ldots, a_m$. These are called the normal equations.

We solve these equations and thus find the required best-fit function $f(x)$.
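The normal equations can be assembled and solved mechanically. The sketch below is one possible way of doing this, not a prescribed method: the names `solve_linear` and `least_squares` are hypothetical, Gauss-Jordan elimination is assumed as the solver, and exact fractions are used so that small hand-checkable cases come out exactly. It assumes the normal equations have a unique solution.

```python
from fractions import Fraction

def solve_linear(A, b):
    """Solve A a = b by Gauss-Jordan elimination with exact fractions."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # find a row with a non-zero entry in this column and swap it up
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[-1] for row in M]

def least_squares(basis, xs, ys):
    """Build and solve the normal equations for f(x) = sum of a_i v_i(x)."""
    # entry (i, j) is the sum over the data of v_i(x_k) v_j(x_k)
    A = [[sum(vi(x) * vj(x) for x in xs) for vj in basis] for vi in basis]
    # right-hand side i is the sum over the data of v_i(x_k) y_k
    b = [sum(vi(x) * y for x, y in zip(xs, ys)) for vi in basis]
    return solve_linear(A, b)

# straight line f(x) = a0 + a1 x through three sample points
line_basis = [lambda x: 1, lambda x: x]
coeffs = least_squares(line_basis, [0, 1, 2], [1, 1, 2])
print(coeffs)
```

Changing the list of basis functions (for example adding `lambda x: x * x`) fits a different type of curve without altering the rest of the code.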


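Example 1 Find the least-squares straight line that would best fit the three points (0,1), (1,1), (2,2).
$f(x)$ must take the form: $f(x) = a_0 + a_1 x$

We wish to minimise $E$ where
\[ E = \sum_{k=1}^{3} (f(x_k) - y_k)^2 = \sum_{k=1}^{3} (a_0 + a_1 x_k - y_k)^2 \]

This will occur when $\dfrac{\partial E}{\partial a_0} = 0$, $\dfrac{\partial E}{\partial a_1} = 0$, i.e.
\[ \sum_{k=1}^{3} 2(a_0 + a_1 x_k - y_k)\cdot 1 = 0 \;\Rightarrow\; a_0 \sum_{k=1}^{3} 1 + a_1 \sum_{k=1}^{3} x_k - \sum_{k=1}^{3} y_k = 0; \qquad (1) \]
\[ \sum_{k=1}^{3} 2(a_0 + a_1 x_k - y_k)\cdot x_k = 0 \;\Rightarrow\; a_0 \sum_{k=1}^{3} x_k + a_1 \sum_{k=1}^{3} x_k^2 - \sum_{k=1}^{3} x_k y_k = 0; \qquad (2) \]

So we need:
\[ \sum_{k=1}^{3} 1 = 1 + 1 + 1 = 3 \]
\[ \sum_{k=1}^{3} x_k = 0 + 1 + 2 = 3 \]
\[ \sum_{k=1}^{3} x_k^2 = 0 + 1 + 4 = 5 \]
\[ \sum_{k=1}^{3} y_k = 1 + 1 + 2 = 4 \]
\[ \sum_{k=1}^{3} x_k y_k = 0 + 1 + 4 = 5 \]

Then equation (1) is: $3a_0 + 3a_1 - 4 = 0$
equation (2) is: $3a_0 + 5a_1 - 5 = 0$
\[ \Rightarrow\; a_1 = \frac{1}{2}, \quad a_0 = \frac{5}{6} \]

The required line is: $f(x) = \dfrac{5}{6} + \dfrac{1}{2}x$

[Figure: the points P1, P2, P3, the fitted line and the point C, plotted on x-y axes]

The diagram shows that none of the given points lie on the line: the point C is their centre of gravity, and this point does lie on the line.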

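Example 2 Fit a suitable polynomial to the given set of data points in which the $y_k$ are known to be approximate.

k     x_k     y_k
1     -2      -1.2
2     -1      -1.5
3      0      -1.3
4      1       1.9
5      2       7.1

[Figure: plot of the five data points on x-y axes]

We shall first plot the points and consider what order polynomial might be suitable. Clearly a straight line would not be suitable, but a parabola (2nd degree polynomial) might be appropriate.

We therefore choose
\[ f(x) = a_0 + a_1 x + a_2 x^2 \]
We need to minimise $E$ where
\[ E = \sum_{k=1}^{5} (f(x_k) - y_k)^2 = \sum_{k=1}^{5} (a_0 + a_1 x_k + a_2 x_k^2 - y_k)^2 \]

This gives:
\[ \frac{\partial E}{\partial a_0} = 2\sum_{k=1}^{5} (a_0 + a_1 x_k + a_2 x_k^2 - y_k)\cdot 1 = 0 \;\Rightarrow\; 0 = a_0 \sum_{k=1}^{5} 1 + a_1 \sum_{k=1}^{5} x_k + a_2 \sum_{k=1}^{5} x_k^2 - \sum_{k=1}^{5} y_k \qquad (3) \]
\[ \frac{\partial E}{\partial a_1} = 2\sum_{k=1}^{5} (a_0 + a_1 x_k + a_2 x_k^2 - y_k)\cdot x_k = 0 \;\Rightarrow\; 0 = a_0 \sum_{k=1}^{5} x_k + a_1 \sum_{k=1}^{5} x_k^2 + a_2 \sum_{k=1}^{5} x_k^3 - \sum_{k=1}^{5} x_k y_k \qquad (4) \]
\[ \frac{\partial E}{\partial a_2} = 2\sum_{k=1}^{5} (a_0 + a_1 x_k + a_2 x_k^2 - y_k)\cdot x_k^2 = 0 \;\Rightarrow\; 0 = a_0 \sum_{k=1}^{5} x_k^2 + a_1 \sum_{k=1}^{5} x_k^3 + a_2 \sum_{k=1}^{5} x_k^4 - \sum_{k=1}^{5} x_k^2 y_k \qquad (5) \]

We need to evaluate the sums involved in these equations.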

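\[ \sum_{k=1}^{5} 1 = 1 + 1 + 1 + 1 + 1 = 5 \]
\[ \sum_{k=1}^{5} x_k = -2 - 1 + 0 + 1 + 2 = 0 \]
\[ \sum_{k=1}^{5} x_k^2 = 4 + 1 + 0 + 1 + 4 = 10 \]
\[ \sum_{k=1}^{5} x_k^3 = -8 - 1 + 0 + 1 + 8 = 0 \]
\[ \sum_{k=1}^{5} x_k^4 = 16 + 1 + 0 + 1 + 16 = 34 \]
\[ \sum_{k=1}^{5} y_k = -1.2 - 1.5 - 1.3 + 1.9 + 7.1 = 5 \]
\[ \sum_{k=1}^{5} x_k y_k = 2.4 + 1.5 + 0 + 1.9 + 14.2 = 20 \]
\[ \sum_{k=1}^{5} x_k^2 y_k = -4.8 - 1.5 + 0 + 1.9 + 28.4 = 24 \]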

Using these results, we obtain the normal equations as:

Equation (3) becomes: $5a_0 + 0 + 10a_2 = 5$
Equation (4) becomes: $0 + 10a_1 + 0 = 20$
Equation (5) becomes: $10a_0 + 0 + 34a_2 = 24$

Note that the matrix of coefficients in this system of equations is symmetric. This will always be the case for a system of normal equations derived from the use of the Least Squares method.
In this case the solutions can easily be found to be
\[ a_0 = -1; \quad a_1 = 2; \quad a_2 = 1 \]
so that the required quadratic function is $f(x) = -1 + 2x + x^2$

The values of $f(x)$ corresponding to the given values of $x$ are:   -1    -2    -1    2    7
the given values of $y$ were:                                        -1.2  -1.5  -1.3  1.9  7.1
So we have a reasonable fit.
[A better approximation in this case is, in fact, not a polynomial function but a function of the type
\[ f(x) = a_0 + a_1 e^{-x} + a_2 e^{x} \]
and it can be shown, again by using the same Least Squares method, that the best fit function of this type is
\[ f(x) = -2.235 + 0.110e^{-x} + 1.283e^{x} \]
However, unless we have good reason to suppose that a function of a different type will give a better fit to the given points, it is best to assume a polynomial function of an appropriate degree to approximate the curve.]

APPENDIX

Proofs of theorems

Theorem 1
The Remainder Theorem
If $\alpha$ is any constant and $P(z)$ is a polynomial, then when $P(z)$ is divided by $(z - \alpha)$, the value of the remainder is $P(\alpha)$.
Proof
Since $(z - \alpha)$ is a linear factor, the division gives a quotient $Q(z)$ and a constant remainder, $R$. Then $P(z) = (z - \alpha).Q(z) + R$.
Putting $z = \alpha$ into this equation, we have: $P(\alpha) = 0.Q(\alpha) + R$
i.e. $R = P(\alpha)$

Corollary
If $\alpha$ is a zero of $P(z)$, then $P(\alpha) = 0$.
But $P(\alpha) = R$, the remainder.
Therefore, since there is no remainder, $P(z)$ is divisible by $(z - \alpha)$.

Theorem 2
The Fundamental Theorem of Algebra
A polynomial of degree $n \ge 1$ has at least one zero.
(The proof is beyond the scope of this course)

Theorem 3
Any polynomial $P(z)$ of degree $n \ge 1$ has exactly $n$ zeros.
Proof
By the Fundamental Theorem of Algebra, $P(z)$ must have at least one zero. Let this be $\alpha_1$.
Then $P(z) = (z - \alpha_1).Q(z)$ where $Q(z)$ is a polynomial of degree $n - 1$.
Therefore $Q(z)$ must also have at least one zero, say $\alpha_2$, for $n \ge 2$.
Then $Q(z) = (z - \alpha_2).R(z)$ where $R(z)$ is a polynomial of degree $n - 2$.
So $P(z) = (z - \alpha_1)(z - \alpha_2)R(z)$
Repeating this argument $(n - 1)$ times, we find that $P(z)$ can be factorised into the product of $n - 1$ linear factors $(z - \alpha_1)(z - \alpha_2)\ldots(z - \alpha_{n-1})$ and a polynomial of degree $(n - (n - 1)) = 1$, which is itself a linear factor, i.e. $P(z)$ is a product of $n$ linear factors.
If any one of these linear factors is zero, then $P(z) = 0$
i.e. $P(z)$ has exactly $n$ zeros.

Theorem 4
If $\alpha$ is a zero of $P(z)$ of multiplicity $m$, where $m \ge 2$, then $\alpha$ is also a zero of $P'(z)$, of multiplicity $m - 1$.
Proof
$P(z) = (z - \alpha)^m .Q(z)$, where $Q(\alpha) \ne 0$

Then
\[ P'(z) = m(z - \alpha)^{m-1}.Q(z) + (z - \alpha)^m .Q'(z) = (z - \alpha)^{m-1}\left(m.Q(z) + (z - \alpha).Q'(z)\right) \]
And since $(m.Q(z) + (z - \alpha).Q'(z)) \ne 0$ when $z = \alpha$, $P'(z)$ has exactly $m - 1$ factors $(z - \alpha)$: hence result.

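Theorem 5
If $\alpha = u + iv$, where $u, v$ are real and $v \ne 0$, and if $\alpha$ is a zero of the real polynomial $P(z)$, then $\bar{\alpha} = u - iv$ is also a zero of $P(z)$.
Proof
Let $\alpha = u + iv = re^{i\theta}$, then $\bar{\alpha} = re^{-i\theta}$.
Now $P(z) = \sum_{k=0}^{n} a_k z^k$.

Since $P(\alpha) = 0$,
\[ \sum_{k=0}^{n} a_k r^k e^{ik\theta} = 0, \quad \text{i.e.} \quad \sum_{k=0}^{n} a_k r^k (\cos k\theta + i \sin k\theta) = 0 \]

The real and imaginary parts of this must separately be zero. Since all the coefficients $a_k$ are real, this implies that:
\[ \sum_{k=0}^{n} a_k r^k \cos k\theta = 0 \quad \text{and} \quad \sum_{k=0}^{n} a_k r^k \sin k\theta = 0 \]
\[ \Rightarrow\; \sum_{k=0}^{n} a_k r^k (\cos k\theta - i \sin k\theta) = 0 \]

But this is
\[ \sum_{k=0}^{n} a_k r^k e^{-ik\theta} = \sum_{k=0}^{n} a_k (re^{-i\theta})^k = \sum_{k=0}^{n} a_k (\bar{\alpha})^k = P(\bar{\alpha}) = 0 \]

i.e. If $\alpha$ is a zero of the polynomial $P(z)$, so also is $\bar{\alpha}$.
Corollary 1
A real polynomial may always be expressed as a product of real linear and real quadratic factors.
If $\alpha$ is a complex root of $P(z)$, then another is $\bar{\alpha}$
$\Rightarrow$ $(z - \alpha)$ and $(z - \bar{\alpha})$ are both factors of $P(z)$.
But $(z - \alpha)(z - \bar{\alpha}) = z^2 - (\alpha + \bar{\alpha})z + \alpha\bar{\alpha}$
And $(\alpha + \bar{\alpha}) = 2u$ (real); $\alpha\bar{\alpha} = u^2 + v^2$ (real)
$\Rightarrow$ $z^2 - (\alpha + \bar{\alpha})z + \alpha\bar{\alpha}$ is a real quadratic factor of $P(z)$.
Corollary 2
A real polynomial whose degree is odd must have at least one real zero.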

Chapter 7.
INTEGRATION


1. Definitions
Integration is:
(a) A process reversing differentiation.
The result of this process is called the Indefinite Integral or the Anti-derivative and is a function.
(b) A process giving the area between the graph of a function, the x-axis, and two lines parallel to the y-axis.
The result of this process is called the Definite Integral and is a number.

[Figure: the shaded area under the curve y = f(x), between two lines parallel to the y-axis]

We write (a) as $\int f(x)\,dx$; (b) as $\int_a^b f(x)\,dx$

In the case of (b), the definite integral, the answer is a number, the variable x is called a dummy variable and can be replaced by any convenient letter.

2. The Fundamental Theorem of the Calculus

If $f(x)$ is continuous for $a \le x \le b$, then:
\[ \frac{d}{dx} \int_a^x f(t)\,dt \]
exists and is equal to $f(x)$.

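3. Properties of Definite Integrals

Linear Properties:

(i)
\[ \int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx \]
\[ \int_a^b \lambda f(x)\,dx = \lambda \int_a^b f(x)\,dx, \quad \lambda \text{ constant} \]

[Area below the x-axis is considered negative]

(ii)
\[ \int_a^b f(x)\,dx = -\int_b^a f(x)\,dx \]
\[ \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx \quad \text{where } a < c < b \]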

4. Methods for finding Integrals: two broad classes of method


Class 1
We use direct integration, the integrand being the exact derivative of a function;
OR
We re-write the integrand in some way so that it can be seen to be the exact derivative
of a function, or the sum of several exact derivatives, which can then be integrated
directly. This re-writing process may include the use of:
Substitution;
Partial Fractions;
Completing the Square;
Expressing products of trig functions as sums of other trig functions;
etc.
Class 2
We use integration by parts.
This is sometimes necessary when the integrand is a product of two functions, and the process can be considered as inverting the process of differentiating a product.

5. Techniques based on Direct Integration

(a) Some Standard Forms of Integrals

(i) Elementary Functions of x (Basic list)

\[ \int x^n\,dx = \frac{x^{n+1}}{n+1} + c, \quad \text{provided } n \ne -1 \]
\[ \int \frac{1}{x}\,dx = \ln x + c \]
\[ \int e^x\,dx = e^x + c \]
\[ \int \sin x\,dx = -\cos x + c \]
\[ \int \cos x\,dx = \sin x + c \]
etc. See formula sheets

(ii) Elementary Functions of (ax + b) (i.e. of linear functions of x)

\[ \int (ax + b)^n\,dx = \frac{1}{a}\,\frac{(ax + b)^{n+1}}{n+1} + c, \quad \text{provided } n \ne -1 \]
\[ \int \frac{1}{ax + b}\,dx = \frac{1}{a}\ln(ax + b) + c \]
\[ \int e^{(ax+b)}\,dx = \frac{1}{a}e^{(ax+b)} + c \]
\[ \int \sin(ax + b)\,dx = -\frac{1}{a}\cos(ax + b) + c \]
\[ \int \cos(ax + b)\,dx = \frac{1}{a}\sin(ax + b) + c \]
etc.
All these can be transformed into the basic list by substituting $(ax + b) = t \Rightarrow a.dx = dt$

(iii) Elementary Functions of another Function of x, say f(x)

\[ \int [f(x)]^n .f'(x)\,dx = \frac{[f(x)]^{n+1}}{n+1} + c, \quad \text{provided } n \ne -1 \]
\[ \int \frac{f'(x)}{f(x)}\,dx = \ln[f(x)] + c \]
\[ \int e^{f(x)}.f'(x)\,dx = e^{f(x)} + c \]
\[ \int \sin[f(x)].f'(x)\,dx = -\cos[f(x)] + c \]
\[ \int \cos[f(x)].f'(x)\,dx = \sin[f(x)] + c \]
etc.

All these can be transformed into the basic list by substituting $f(x) = t \Rightarrow f'(x).dx = dt$

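EXAMPLES
(i) Basic Elementary Functions

\[
I_1 = \int \left( x^4 + \frac{1}{x^2} + 2\sqrt{x} + \sin x + \frac{4}{x} \right) dx
    = \int \left( x^4 + x^{-2} + 2x^{1/2} + \sin x + 4.\frac{1}{x} \right) dx
\]
\[
    = \frac{x^5}{5} + \frac{x^{-1}}{-1} + \frac{2x^{3/2}}{3/2} - \cos x + 4.\ln x + c
    = \frac{x^5}{5} - \frac{1}{x} + \frac{4}{3}x^{3/2} - \cos x + 4\ln x + c
\]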

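(ii) Elementary Functions of Linear Functions

Find
\[ I_2 = \int \left( \frac{1}{3x - 2} + \frac{1}{\sqrt{1 - x}} \right) dx \]

$(3x - 2)$ and $(1 - x)$ are both linear functions of $x$. We can therefore treat $(3x - 2)$ and $(1 - x)$ as simple variables, remembering, when we integrate each term, to divide by the coefficient of $x$.
\[ I_2 = \frac{1}{3}\ln(3x - 2) + \frac{1}{-1}\,\frac{(1 - x)^{1/2}}{1/2} + c = \frac{1}{3}\ln(3x - 2) - 2\sqrt{1 - x} + c \]

OR By making the suggested substitutions:

Let $t = 3x - 2 \Rightarrow dt = 3\,dx$:
\[ \int \frac{1}{3x - 2}\,dx = \int \frac{1}{t}\,\frac{dt}{3} = \frac{1}{3}\ln t \]
Let $u = 1 - x \Rightarrow du = -dx$:
\[ \int \frac{1}{\sqrt{1 - x}}\,dx = -\int u^{-1/2}\,du = -\frac{u^{1/2}}{1/2} \]
Hence result.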

(iii) Elementary Functions of other Functions of x

(a)
\[ I_3 = \int \frac{2x}{x^2 + 5}\,dx = \int \frac{1}{x^2 + 5}.2x\,dx \]
Here, $x^2 + 5 = f(x)$ and so $f'(x) = 2x$.
$I_3$ therefore takes the form $\int \dfrac{f'(x)}{f(x)}\,dx = \ln[f(x)] + c$;
Thus $I_3 = \ln(x^2 + 5) + c$

OR By making the substitution $t = x^2 + 5 \Rightarrow 2x.dx = dt$,
\[ I_3 = \int \frac{dt}{t} = \ln t + c. \]

(b)
\[ I_4 = \int e^{\sin x}\cos x\,dx; \]
Here $f(x) = \sin x \Rightarrow f'(x) = \cos x$
$I_4$ therefore takes the form $\int e^{f(x)}.f'(x)\,dx = e^{f(x)} + c$;
$\Rightarrow I_4 = e^{\sin x} + c.$

(b) Substitution

The Aim: To simplify the integrand, or at least express it in a form that we can integrate directly, by writing it in terms of a different variable.
The choice is largely inspirational!
It may not work!
There are some integrands for which a particular substitution is recommended; some of these are listed below. Otherwise, we consider what is making the integral particularly awkward and try substituting for this.

Remember:
(i) to substitute for dx as well as for x;
(ii) to change the limits of a definite integral so that they represent the corresponding limits on the new variable.

Some standard substitutions to use in certain specific cases

(i) If the integrand involves $(a^2 + x^2)$, we try $x = a\tan\theta$, the new variable being $\theta$.
Then: $dx = a\sec^2\theta\,d\theta$;
also: $(a^2 + x^2) = (a^2 + a^2\tan^2\theta) = a^2(1 + \tan^2\theta) = a^2\sec^2\theta$.

(ii) If the integrand contains $\sin x$ and/or $\cos x$: we try $t = \tan\frac{x}{2}$
\[ \Rightarrow\; dt = \frac{1}{2}\sec^2\frac{x}{2}\,dx = \frac{1}{2}(1 + t^2)\,dx \;\Rightarrow\; dx = \frac{2\,dt}{1 + t^2} \]
N.B. We also need:
\[ \sin x = \frac{2t}{1 + t^2}, \qquad \cos x = \frac{1 - t^2}{1 + t^2}. \]

(iii) If the integrand involves $\sin^2 x$ and/or $\cos^2 x$, we try $t = \tan x$.

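EXAMPLE    A general (inspirational?) choice of substitution

\[ I_5 = \int \frac{x}{\sqrt{1 - x}}\,dx \]

This is awkward because of the denominator, $\sqrt{1 - x}$. We try a substitution for it:

Let $t = \sqrt{1 - x} \Rightarrow 1 - x = t^2 \Rightarrow x = 1 - t^2 \Rightarrow dx = -2t\,dt$

Then
\[ I_5 = \int \frac{1 - t^2}{t}(-2t)\,dt = -2\int (1 - t^2)\,dt = -2\left( t - \frac{t^3}{3} \right) + c = -2\left( \sqrt{1 - x} - \frac{1}{3}(1 - x)^{3/2} \right) + c \]

If $I_6$ is the corresponding definite integral, $I_6 = \int_0^{1/2} \dfrac{x}{\sqrt{1 - x}}\,dx$, we make the same substitution as for $I_5$, but also need to change the limits:

When $x = 0$, $t = 1$; when $x = \frac{1}{2}$, $t = \frac{1}{\sqrt{2}}$

Then
\[ I_6 = -2\int_1^{1/\sqrt{2}} (1 - t^2)\,dt = -2\left[ t - \frac{t^3}{3} \right]_1^{1/\sqrt{2}} = -2\left[ \left( \frac{1}{\sqrt{2}} - \frac{1}{3.2\sqrt{2}} \right) - \left( 1 - \frac{1}{3} \right) \right] = \frac{4}{3} - \frac{5}{3\sqrt{2}} \]

There is now no need to substitute back for x.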

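EXAMPLE    Use of substitution (i) for:
\[ I_7 = \int \frac{dx}{a^2 + x^2} \]
Using $x = a\tan\theta$,
\[ I_7 = \int \frac{a\sec^2\theta\,d\theta}{a^2\sec^2\theta} = \frac{1}{a}\int d\theta = \frac{1}{a}\theta + c = \frac{1}{a}\tan^{-1}\frac{x}{a} + c \]

EXAMPLE    Use of substitution (ii) for:
\[ I_8 = \int_0^{\pi/2} \frac{dx}{5 + 3\cos x} \]
Using $t = \tan\frac{x}{2}$ as suggested:
When $x = 0$, $t = \tan 0 = 0$; when $x = \frac{\pi}{2}$, $t = \tan\frac{\pi}{4} = 1$

Then
\[ I_8 = \int_0^1 \frac{1}{5 + \frac{3(1 - t^2)}{1 + t^2}}.\frac{2\,dt}{1 + t^2} = \int_0^1 \frac{2\,dt}{5(1 + t^2) + 3(1 - t^2)} = \int_0^1 \frac{2\,dt}{8 + 2t^2} \]
\[ = \int_0^1 \frac{dt}{4 + t^2} = \left[ \frac{1}{2}\tan^{-1}\frac{t}{2} \right]_0^1 = \frac{1}{2}\tan^{-1}\frac{1}{2} \]
making use of the result from $I_7$.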

(iv) Much use is made of the relationships
\[ \cos^2 x + \sin^2 x = 1 \quad \text{and} \quad \cosh^2 x - \sinh^2 x = 1 \]
to introduce certain trig and hyperbolic substitutions when the integrand involves square roots of certain expressions. The following table shows the expressions concerned, the substitution suggested, and simplification in the expression resulting from the appropriate substitution.

Expression occurring      Substitution        Expression becomes      dx =
$\sqrt{a^2 - x^2}$        $x = a\sin\theta$   $a\cos\theta$           $a\cos\theta\,d\theta$
$\sqrt{a^2 + x^2}$        $x = a\sinh u$      $a\cosh u$              $a\cosh u\,du$
$\sqrt{x^2 - a^2}$        $x = a\cosh u$      $a\sinh u$              $a\sinh u\,du$

In particular, these lead to the standard integrals:
\[ \int \frac{dx}{\sqrt{a^2 - x^2}} = \sin^{-1}\frac{x}{a} + c \]
\[ \int \frac{dx}{\sqrt{a^2 + x^2}} = \sinh^{-1}\frac{x}{a} + c \]
\[ \int \frac{dx}{\sqrt{x^2 - a^2}} = \cosh^{-1}\frac{x}{a} + c \]

If the variable $x$ in any of these is replaced by a linear function of $x$ (written here as $(bx + c)$), the usual rules for integrating functions of linear functions apply: i.e. treat $(bx + c)$ as a simple variable and divide the result by $b$, the coefficient of $x$.

e.g. Since
\[ \int \frac{dx}{\sqrt{a^2 - x^2}} = \sin^{-1}\frac{x}{a} + k \]
we can write
\[ \int \frac{dx}{\sqrt{a^2 - (bx + c)^2}} = \frac{1}{b}\sin^{-1}\frac{(bx + c)}{a} + k \]

However, the quadratic expression involved may not be given in the correct form; i.e. as the sum or difference of two squares, as in the example. We may first need to complete the square to obtain the correct form.
e.g.
\[ 10x - x^2 - 16 = -(x^2 - 10x + 16) = -((x - 5)^2 - 25 + 16) = 9 - (x - 5)^2 \]

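EXAMPLE
\[ I_9 = \int \frac{dx}{x\sqrt{4 - x^2}} \]

Since the integrand contains the expression $\sqrt{4 - x^2}$, which has the form $\sqrt{a^2 - x^2}$ with $a = 2$, we use the suggested substitution:
$x = 2\sin\theta \Rightarrow dx = 2\cos\theta\,d\theta$

Then
\[ I_9 = \int \frac{2\cos\theta\,d\theta}{2\sin\theta\sqrt{4 - 4\sin^2\theta}} = \int \frac{\cos\theta\,d\theta}{\sin\theta.2\cos\theta} = \frac{1}{2}\int \mathrm{cosec}\,\theta\,d\theta \]
\[ = -\frac{1}{2}\ln(\mathrm{cosec}\,\theta + \cot\theta) + c = -\frac{1}{2}\ln\left( \frac{2}{x} + \frac{\sqrt{4 - x^2}}{x} \right) + c \]

EXAMPLE    showing the derivation of a standard form

Using the suggested substitution $x = a\sinh u$ for the second of these standard form integrals,
\[ \int \frac{dx}{\sqrt{a^2 + x^2}} \quad \text{becomes:} \quad \int \frac{a\cosh u\,du}{\sqrt{a^2 + a^2\sinh^2 u}} = \int \frac{a\cosh u\,du}{\sqrt{a^2(1 + \sinh^2 u)}} \]
\[ = \int \frac{a\cosh u\,du}{a\cosh u} = \int du = u + c = \sinh^{-1}\frac{x}{a} + c \]

Similarly for the other standard form integrals.

EXAMPLE    When a linear function of x is involved.
\[ I_{10} = \int \frac{dx}{\sqrt{9 - (x - 5)^2}} = \sin^{-1}\frac{(x - 5)}{3} + c \]

or: it is more likely that the integral will appear in the form
\[ I_{10} = \int \frac{dx}{\sqrt{10x - 16 - x^2}} \]

The quadratic needs to be expressed as the sum or difference of two squares, using the completion of the square process shown on the previous page. Then, since
\[ 10x - 16 - x^2 = 9 - (x - 5)^2, \]
the integral $I_{10}$ can be written as given above and solved as a standard form, or by making the substitution $x - 5 = 3\sin\theta$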

(c) The Use of Partial Fractions in Integration

When the integrand is a rational function and its denominator will factorise, then it is usually possible to express the integrand as the sum of simpler fractions, each of which may integrate separately.
Remember:
(i) if the degree of the numerator $\ge$ the degree of the denominator, it is necessary to divide the fraction out first.
(ii) not all algebraic fractions integrate to give a natural log function!

e.g.
\[ \int \frac{dx}{(x - 2)^2} = \int (x - 2)^{-2}\,dx = \frac{(x - 2)^{-1}}{-1} + c = -\frac{1}{x - 2} + c \]

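EXAMPLE 1
\[ I_{11} = \int \frac{x^3 + 3x^2 - 2x - 1}{x^2 + 4x + 3}\,dx \]

Division is necessary since the degree of numerator = 3, degree of denominator = 2.

Division gives the integrand as:
\[ x - 1 + \frac{-x + 2}{x^2 + 4x + 3} = x - 1 + \frac{-x + 2}{(x + 3)(x + 1)} \]

$\dfrac{-x + 2}{(x + 3)(x + 1)}$ takes partial fractions of the form:
\[ \frac{A}{x + 3} + \frac{B}{x + 1} = \frac{-\frac{5}{2}}{x + 3} + \frac{\frac{3}{2}}{x + 1} \]

Then
\[ I_{11} = \int \left( x - 1 - \frac{5}{2(x + 3)} + \frac{3}{2(x + 1)} \right) dx = \frac{x^2}{2} - x - \frac{5}{2}\ln(x + 3) + \frac{3}{2}\ln(x + 1) + c \]

EXAMPLE 2
\[ I_{12} = \int \frac{2x^3 + 3x^2 - 2x + 5}{(x^2 + 1)(x - 1)^2}\,dx \]

Division is not necessary since the degree of numerator = 3, degree of denominator = 4.

Form of partial fractions required is:
\[ \frac{Ax + B}{x^2 + 1} + \frac{C}{x - 1} + \frac{D}{(x - 1)^2} \]

On evaluating these constants, we obtain:
\[ \frac{x + 2}{x^2 + 1} + \frac{1}{x - 1} + \frac{4}{(x - 1)^2} \]

Therefore
\[ I_{12} = \int \left( \frac{x + 2}{x^2 + 1} + \frac{1}{x - 1} + \frac{4}{(x - 1)^2} \right) dx \]

For the integration, the first term needs to be split into two terms: $\dfrac{x}{x^2 + 1}$ and $\dfrac{2}{x^2 + 1}$

Then:
\[ I_{12} = \int \left( \frac{x}{x^2 + 1} + \frac{2}{x^2 + 1} + \frac{1}{x - 1} + \frac{4}{(x - 1)^2} \right) dx = \frac{1}{2}\ln(x^2 + 1) + 2\tan^{-1}x + \ln(x - 1) - \frac{4}{x - 1} + c \]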

(d) Powers and Products of Trig (and Hyperbolic) Functions

To integrate $\sin^2 x$ and $\cos^2 x$ we use the trig identities:
\[ 2\sin^2 x = 1 - \cos 2x \quad \text{and} \quad 2\cos^2 x = 1 + \cos 2x \]

Then:
\[ \int \sin^2 x\,dx = \frac{1}{2}\int (1 - \cos 2x)\,dx = \frac{1}{2}\left( x - \frac{\sin 2x}{2} \right) + c \]
\[ \int \cos^2 x\,dx = \frac{1}{2}\int (1 + \cos 2x)\,dx = \frac{1}{2}\left( x + \frac{\sin 2x}{2} \right) + c \]

To integrate odd powers of $\sin x$ and $\cos x$, we can use the identity
\[ \cos^2 x + \sin^2 x = 1 \]
to convert the integral into a sum of exact integrals, as shown in the example on the following page.
However, this is only practicable for reasonably small odd powers, and the technique cannot be used for even powers of $\sin x$ and $\cos x$. In these cases, it is best to obtain a reduction formula for a general power $n$ and then apply it to the particular situation. (See Reduction Formulae).

To integrate products of sines and cosines, we use the following identities to convert products into sums or differences of trig (hyperbolic) functions which can then be integrated term by term.

\[ \sin A\cos B = \frac{1}{2}[\sin(A + B) + \sin(A - B)] \]
\[ \cos A\cos B = \frac{1}{2}[\cos(A + B) + \cos(A - B)] \]
\[ \sin A\sin B = \frac{1}{2}[\cos(A - B) - \cos(A + B)] \]

The same processes can be applied to products of hyperbolic functions, with the usual proviso about changing trig identities into hyperbolic form.

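EXAMPLE    Integration of an odd power of cos x

\[ \int \cos^5 x\,dx = \int \cos^4 x.\cos x\,dx = \int (1 - \sin^2 x)^2.\cos x\,dx = \int \left( \cos x - 2\sin^2 x\cos x + \sin^4 x\cos x \right) dx \]

All three terms in the integral are now exact derivatives and can be integrated directly.
If $f(x) = \sin x$, then $f'(x) = \cos x$, so that the second and third terms of the integral are:
\[ -2\int [f(x)]^2.f'(x)\,dx \quad \text{and} \quad \int [f(x)]^4.f'(x)\,dx \]
or
\[ -2\int \sin^2 x.d(\sin x) \quad \text{and} \quad \int \sin^4 x.d(\sin x) \]

Thus
\[ \int \left( \cos x - 2\sin^2 x\cos x + \sin^4 x\cos x \right) dx = \sin x - 2\frac{\sin^3 x}{3} + \frac{\sin^5 x}{5} + c \]

EXAMPLES    Some integrals of products

(i)
\[ \int \sin 3x.\cos 7x\,dx = \frac{1}{2}\int \left( \sin(3x + 7x) + \sin(3x - 7x) \right) dx = \frac{1}{2}\int (\sin 10x - \sin 4x)\,dx = \frac{1}{2}\left( -\frac{\cos 10x}{10} + \frac{\cos 4x}{4} \right) + c \]

(ii)
\[ \int \sinh(x + a).\sinh(x - a)\,dx = -\frac{1}{2}\int (\cosh 2a - \cosh 2x)\,dx \quad (a \text{ constant}) \]
\[ = \frac{1}{4}\sinh 2x - \frac{1}{2}x\cosh 2a + c \]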

6. Integration by Parts
This process is necessary to integrate certain products of two functions.

\[ \int v\frac{du}{dx}\,dx = uv - \int u\frac{dv}{dx}\,dx \quad \text{or} \quad \int v\,du = uv - \int u\,dv \]

This formula is obtained from the formula for the derivative of a product: since
\[ \frac{d}{dx}\{uv\} = \frac{du}{dx}v + u\frac{dv}{dx}, \]
integration of both sides w.r. to x gives:
\[ uv = \int \frac{du}{dx}v\,dx + \int u\frac{dv}{dx}\,dx \]
which rearranges into the formula for integration by parts, shown above.

When using Integration by Parts, we treat one of the factors in the integrand as $v$: we need its derivative for the formula.
We treat the other as $\dfrac{du}{dx}$: we need its integral for the formula.

N.B. if one factor is a polynomial, this one is usually treated as $v$ so that it gives a simpler function in the result.
UNLESS the other factor is $\ln x$ (or a natural log of any function of x) or an inverse function such as $\sin^{-1}x$. In these cases, we need to treat a factor of either of these types as $v$ and use its derivative to obtain the result.

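EXAMPLE 1
\[ I_{13} = \int x\sin x\,dx \]
Let $v = x \Rightarrow \dfrac{dv}{dx} = 1$; let $\dfrac{du}{dx} = \sin x \Rightarrow u = -\cos x$
\[ \int x\sin x\,dx = x(-\cos x) - \int (-\cos x).1\,dx = -x\cos x + \int \cos x\,dx = -x\cos x + \sin x + c \]

EXAMPLE 2
\[ I_{14} = \int x^2\ln x\,dx \]
Let $v = \ln x \Rightarrow \dfrac{dv}{dx} = \dfrac{1}{x}$; let $\dfrac{du}{dx} = x^2 \Rightarrow u = \dfrac{x^3}{3}$
\[ \int x^2\ln x\,dx = (\ln x)\frac{x^3}{3} - \int \frac{x^3}{3}.\frac{1}{x}\,dx = \frac{x^3}{3}\ln x - \int \frac{x^2}{3}\,dx = \frac{x^3}{3}\ln x - \frac{x^3}{9} + c \]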

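EXAMPLE 3
\[ I_{15} = \int_0^{\pi/2} e^{2x}\cos x\,dx \]

Here there is no particular advantage in choosing either factor to be $v$. We will need to use the integration by parts technique twice.
Let $v = \cos x \Rightarrow \dfrac{dv}{dx} = -\sin x$; let $\dfrac{du}{dx} = e^{2x} \Rightarrow u = \dfrac{1}{2}e^{2x}$
\[ I_{15} = \left[ \frac{1}{2}e^{2x}.\cos x \right]_0^{\pi/2} - \int_0^{\pi/2} \frac{1}{2}e^{2x}(-\sin x)\,dx = 0 - \frac{1}{2} + \frac{1}{2}\int_0^{\pi/2} e^{2x}\sin x\,dx \]
Using integration by parts again, with $v = \sin x \Rightarrow \dfrac{dv}{dx} = \cos x$; $\dfrac{du}{dx} = e^{2x} \Rightarrow u = \dfrac{1}{2}e^{2x}$,
\[ I_{15} = -\frac{1}{2} + \frac{1}{2}\left\{ \left[ \frac{1}{2}e^{2x}\sin x \right]_0^{\pi/2} - \int_0^{\pi/2} \frac{1}{2}e^{2x}\cos x\,dx \right\} = -\frac{1}{2} + \frac{1}{2}\left\{ \frac{1}{2}e^{\pi} - 0 - \frac{1}{2}I_{15} \right\} \]
\[ \Rightarrow\; I_{15}\left( 1 + \frac{1}{4} \right) = -\frac{1}{2} + \frac{e^{\pi}}{4} \;\Rightarrow\; I_{15} = \frac{1}{5}(e^{\pi} - 2) \]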

7. Reduction Formulae
These formulae help us to integrate functions that involve the nth power of some function.
We denote the given integral by $I_n$.
We aim to derive a formula (called a Reduction Formula) which expresses $I_n$ in terms of $I_{n-1}$ or $I_{n-2}$ etc.
Thus we reduce the power involved in the integrand until it is manageable.
Usually reduction formulae are obtained by using integration by parts. The standard exceptions to this statement are reduction formulae for
\[ \int \tan^n x\,dx \quad \text{and} \quad \int \cot^n x\,dx \]
and the method for obtaining the reduction formulae necessary for these will be shown in Example 2.

EXAMPLE 1
\[ I_n = \int x^n e^x\,dx \]
Let $v = x^n \Rightarrow dv = nx^{n-1}\,dx$
Let $du = e^x\,dx \Rightarrow u = e^x$

Then
\[ I_n = uv - \int u\,dv = e^x.x^n - \int e^x.nx^{n-1}\,dx = x^n e^x - n\int e^x x^{n-1}\,dx \]
But $\int e^x x^{n-1}\,dx$ is $I_{n-1}$.
Therefore $I_n = x^n e^x - nI_{n-1}$

This is the required reduction formula, expressing $I_n$ in terms of $I_{n-1}$.
Using it we can then also express $I_{n-1}$ in terms of $I_{n-2}$, etc. until: $I_1$ in terms of $I_0$.

Now $I_0 = \int x^0 e^x\,dx = \int e^x\,dx = e^x + c$

Then we work back through the formulae until we reach $I_n$.

e.g. To find $\int x^3 e^x\,dx = I_3$, we start by using $n = 3$ in the reduction formula.
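Repeated use of the reduction formula can be mimicked in code. The sketch below is illustrative only: it assumes the result is written as $I_n = e^x p_n(x) + c$, so that the formula $I_n = x^n e^x - nI_{n-1}$ becomes $p_n(x) = x^n - n\,p_{n-1}(x)$ with $p_0(x) = 1$; the function name `reduction_polynomial` is hypothetical.

```python
def reduction_polynomial(n):
    """Coefficients (constant term first) of p_n, where I_n = e^x p_n(x) + c.

    Uses I_n = x^n e^x - n I_{n-1}, starting from I_0 = e^x + c.
    """
    coeffs = [1]                                   # p_0(x) = 1
    for k in range(1, n + 1):
        # p_k(x) = x^k - k * p_{k-1}(x)
        coeffs = [-k * a for a in coeffs] + [1]
    return coeffs

p3 = reduction_polynomial(3)
print(p3)   # constant term first, then the coefficients of x, x^2, x^3
```

The printed list can be compared with the result of working back through the formulae by hand from $I_0$ to $I_3$.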


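EXAMPLE 2    An example of the exceptional cases in which integration by parts is not necessary.

\[ I_n = \int \tan^n x\,dx \]

In this case we start by separating out two of the $\tan x$ factors and write
\[ I_n = \int \tan^{n-2}x.\tan^2 x\,dx \]
Now $1 + \tan^2 x = \sec^2 x \Rightarrow \tan^2 x = \sec^2 x - 1$
So
\[ I_n = \int \tan^{n-2}x(\sec^2 x - 1)\,dx = \int \tan^{n-2}x\sec^2 x\,dx - \int \tan^{n-2}x\,dx \]
The first integral is exact, since $\sec^2 x = \dfrac{d}{dx}(\tan x)$; the second is $I_{n-2}$. Hence
\[ I_n = \frac{\tan^{n-1}x}{n - 1} - I_{n-2} \]

Thus we have obtained a reduction formula giving $I_n$ in terms of $I_{n-2}$.

Therefore:
if n is odd, we shall eventually need $I_1 = \int \tan x\,dx = \ln(\sec x) + c$;
if n is even, we shall eventually need $I_0 = \int 1.dx = x + c$
If the integral is a definite integral, it is advisable to substitute the limits at the earliest possible stage. This usually simplifies the formula, sometimes considerably.

e.g. Suppose $I_n = \int_0^{\pi/4} \tan^n x\,dx$
Then proceeding as above, we obtain:
\[ I_n = \left[ \frac{\tan^{n-1}x}{n - 1} \right]_0^{\pi/4} - \int_0^{\pi/4} \tan^{n-2}x\,dx = \frac{(\tan\frac{\pi}{4})^{n-1}}{n - 1} - \frac{(\tan 0)^{n-1}}{n - 1} - I_{n-2} \]
\[ \Rightarrow\; I_n = \frac{1}{n - 1} - I_{n-2} \]

If we require $\int_0^{\pi/4} \tan^5\theta\,d\theta$, then we apply this formula with $n = 5$ and then with $n = 3$.
We shall also need
\[ I_1 = \int_0^{\pi/4} \tan x\,dx = [\ln(\sec x)]_0^{\pi/4} = \ln(\sqrt{2}) - 0. \]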
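The definite-integral form of this reduction formula lends itself to a short recursive sketch. The function name `tan_power_integral` is hypothetical; the limits 0 and $\pi/4$ and the starting values $I_0 = \pi/4$ and $I_1 = \ln\sqrt{2}$ are those of the definite integral discussed above.

```python
import math

def tan_power_integral(n):
    """Integral of tan^n x from 0 to pi/4, using I_n = 1/(n-1) - I_{n-2}."""
    if n == 0:
        return math.pi / 4                 # I_0 = [x] from 0 to pi/4
    if n == 1:
        return math.log(math.sqrt(2))      # I_1 = [ln(sec x)] from 0 to pi/4
    return 1 / (n - 1) - tan_power_integral(n - 2)

i5 = tan_power_integral(5)
print(i5)   # equals 1/4 - 1/2 + ln(sqrt(2))
```

For odd n the recursion ends at $I_1$ and for even n at $I_0$, as described above.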


EXAMPLE 3    Here we again need to use integration by parts.

\[ I_n = \int \cos^n x\,dx \]

For most integrands that involve a power of a trig function, we separate out just one of the trig factors. In this case:
\[ I_n = \int \cos^{n-1}x\cos x\,dx \]
Let $v = \cos^{n-1}x \Rightarrow dv = (n - 1)\cos^{n-2}x.(-\sin x)\,dx$
Let $du = \cos x\,dx \Rightarrow u = \sin x$

\[ I_n = uv - \int u\,dv = \sin x.\cos^{n-1}x - \int \sin x\left( (n - 1)\cos^{n-2}x.(-\sin x) \right) dx \]
\[ = \sin x.\cos^{n-1}x + (n - 1)\int \cos^{n-2}x\sin^2 x\,dx \]
\[ = \sin x.\cos^{n-1}x + (n - 1)\int \cos^{n-2}x(1 - \cos^2 x)\,dx \]
\[ = \sin x.\cos^{n-1}x + (n - 1)\int \left( \cos^{n-2}x - \cos^n x \right) dx \]
\[ = \sin x.\cos^{n-1}x + (n - 1)[I_{n-2} - I_n] \]

So $I_n + (n - 1)I_n = \sin x.\cos^{n-1}x + (n - 1)I_{n-2}$
$\Rightarrow nI_n = \sin x.\cos^{n-1}x + (n - 1)I_{n-2}$
or
\[ I_n = \frac{1}{n}\left( \sin x\cos^{n-1}x + (n - 1)I_{n-2} \right) \]
Again we have $I_n$ expressed in terms of $I_{n-2}$, and therefore:
If n is odd, we eventually need $I_1 = \int \cos x.dx = \sin x + c$;
If n is even, we eventually need $I_0 = \int 1.dx = x + c$
And, as an example of a definite integral, suppose $I_n = \int_0^{\pi/2} \cos^n x\,dx$; proceeding as before:
\[ I_n = [\sin x\cos^{n-1}x]_0^{\pi/2} + (n - 1)(I_{n-2} - I_n) = 0 - 0 + (n - 1)(I_{n-2} - I_n) \]
\[ \Rightarrow\; I_n = \frac{n - 1}{n}I_{n-2} \]
Also, in this case, $I_1 = [\sin x]_0^{\pi/2} = 1$; $I_0 = [x]_0^{\pi/2} = \dfrac{\pi}{2}$
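As a brief sketch, the definite reduction formula $I_n = \frac{n-1}{n}I_{n-2}$ for $\int_0^{\pi/2}\cos^n x\,dx$ can be applied recursively. The function name `cos_power_integral` is hypothetical.

```python
import math

def cos_power_integral(n):
    """Integral of cos^n x from 0 to pi/2, using I_n = (n-1)/n * I_{n-2}."""
    if n == 0:
        return math.pi / 2     # I_0
    if n == 1:
        return 1.0             # I_1
    return (n - 1) / n * cos_power_integral(n - 2)

for n in range(6):
    print(n, cos_power_integral(n))
```

Even powers produce multiples of $\pi$ and odd powers produce rational numbers, reflecting the two different starting values.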

8. Numerical (Approximate) Integration

Many functions cannot be integrated analytically, and of those which can be, the result is sometimes so complicated that the effort seems hardly worth while.
For a definite integral, therefore, which gives a numerical value for the area under a curve between given limits, a useful approach to finding an approximate value for the integral would be to approximate this area. We shall look at two simple approaches; there are many more. (Not all simple!)

Required: an approximate value of $I = \int_a^b f(x)\,dx$

[Figure: the curve y = f(x) between x = a and x = b, with ordinates y_0, y_1, y_2, ..., y_{n-1}, y_n at x_0 = a, x_1, x_2, ..., x_{n-1}, x_n = b]

The Trapezium Rule

The area is divided into n strips each of width h, each approximating to a trapezium in shape. The values of x at the points of division are referred to as
\[ x_0 = a, x_1, x_2, x_3, \ldots, x_{n-1}, x_n = b \]
These give rise to corresponding values of y from the equation $y = f(x)$:
\[ y_0, y_1, y_2, y_3, \ldots, y_{n-1}, y_n \]
Then the required area representing the value of I is given approximately by
\[ I \approx \frac{h}{2}(y_0 + y_1) + \frac{h}{2}(y_1 + y_2) + \frac{h}{2}(y_2 + y_3) + \cdots + \frac{h}{2}(y_{n-1} + y_n) \]
\[ = \frac{h}{2}\left( y_0 + 2(y_1 + y_2 + y_3 + \cdots + y_{n-1}) + y_n \right) \]

This is the Trapezium Rule. h is called the step-length.
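The rule translates directly into a few lines of code. This is a minimal sketch: the function name `trapezium` is hypothetical, and the integrand $\sin x/(1 + x)$ on $[0, \pi/2]$ with 6 strips is used here simply as a sample case.

```python
import math

def trapezium(f, a, b, n):
    """Trapezium Rule estimate of the integral of f from a to b with n strips."""
    h = (b - a) / n                              # step-length
    ys = [f(a + i * h) for i in range(n + 1)]    # y_0, y_1, ..., y_n
    return h / 2 * (ys[0] + 2 * sum(ys[1:-1]) + ys[-1])

estimate = trapezium(lambda x: math.sin(x) / (1 + x), 0, math.pi / 2, 6)
print(estimate)
```

Increasing n (i.e. reducing the step-length h) generally improves the estimate, at the cost of more function evaluations.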



Simpson's Rule
This rule aims at a better approximation to the area, and therefore to the value of the
integral, by approximating the curve y = f(x) by a series of parabolic curve segments
rather than a series of straight lines as in the Trapezium Rule. It uses the fact that we
can always find a parabola which passes through any three non-collinear points.
The range of values of x is divided into equal steps of length h, as for the Trapezium
Rule. Let three points on the curve y = f(x) corresponding to adjacent steps be:
$$(-h, y_0),\ (0, y_1),\ (h, y_2) \quad\text{where}\quad y_0 = f(-h),\ y_1 = f(0),\ y_2 = f(h)$$

Suppose the parabola on which they lie has equation: $y = ax^2 + bx + c$.

We can find the values of a, b, c by insisting that the coordinates of these three points
satisfy the equation. Thus
$$y_0 = ah^2 - bh + c \qquad (6)$$
$$y_1 = 0 + 0 + c \qquad (7)$$
$$y_2 = ah^2 + bh + c \qquad (8)$$

From (7), $c = y_1$. Adding (6) + (8) gives:
$$y_0 + y_2 = 2ah^2 + 2c \quad\Rightarrow\quad a = \frac{y_0 + y_2 - 2y_1}{2h^2}$$

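The area under this parabolic curve segment is given by the integral $\int_{-h}^{h} (ax^2 + bx + c)\,dx$.

This is the approximation to the required integral. Thus

$$\text{Area} = \left[a\frac{x^3}{3} + b\frac{x^2}{2} + cx\right]_{-h}^{h}$$
$$= \left(a\frac{h^3}{3} + b\frac{h^2}{2} + ch\right) - \left(-a\frac{h^3}{3} + b\frac{h^2}{2} - ch\right)$$
$$= 2a\frac{h^3}{3} + 2ch$$
$$= \frac{2h^3}{3}\cdot\frac{(y_0 + y_2 - 2y_1)}{2h^2} + 2y_1 h$$
$$= \frac{h(y_0 + y_2 - 2y_1 + 6y_1)}{3} = \frac{h}{3}(y_0 + 4y_1 + y_2)$$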

To use this result, then, for Simpson's Rule, we divide the area required into an even
number, 2n, of strips of width h and approximate the curve bounding each pair of
strips by a parabolic section as described. Then
$$I \approx \frac{h}{3}(y_0 + 4y_1 + y_2) + \frac{h}{3}(y_2 + 4y_3 + y_4) + \ldots + \frac{h}{3}(y_{2n-2} + 4y_{2n-1} + y_{2n})$$
$$= \frac{h}{3}\left(y_0 + y_{2n} + 4(y_1 + y_3 + y_5 + \ldots + y_{2n-1}) + 2(y_2 + y_4 + \ldots + y_{2n-2})\right)$$
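The composite formula can be sketched in Python as follows. The function name `simpsons_rule` and the cubic test integrand are illustrative assumptions; the only requirement carried over from the text is that the number of strips is even.

```python
from typing import Callable


def simpsons_rule(f: Callable[[float], float], a: float, b: float, strips: int) -> float:
    """Approximate the integral of f over [a, b] using an even number of strips."""
    if strips < 2 or strips % 2 != 0:
        raise ValueError("Simpson's Rule needs an even number of strips (at least 2)")
    h = (b - a) / strips
    ys = [f(a + i * h) for i in range(strips + 1)]
    odd_sum = sum(ys[1:-1:2])    # y_1 + y_3 + ... + y_(2n-1), weight 4
    even_sum = sum(ys[2:-1:2])   # y_2 + y_4 + ... + y_(2n-2), weight 2
    return (h / 3) * (ys[0] + ys[-1] + 4 * odd_sum + 2 * even_sum)


# Illustration: the integral of x^3 over [0, 2] is exactly 4.
print(simpsons_rule(lambda x: x ** 3, 0.0, 2.0, 2))
```

Although the rule is built from parabolas, it also integrates cubics exactly, which is why two strips already suffice in this illustration.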


EXAMPLE

Find estimates for the value of $\int_0^{\pi/2} \frac{\sin x}{1+x}\,dx$
(a) using the Trapezium Rule; (b) using Simpson's Rule, in both cases using 6 sub-intervals.

The range of values of x over which the integration is required is $\frac{\pi}{2} - 0 = \frac{\pi}{2}$

$\Rightarrow h = \frac{\pi}{12}$, since there are 6 sub-intervals.

The necessary values of x are shown in the table.

Calculation of the y values:
$$y_0 = \frac{\sin x_0}{1 + x_0} = \frac{\sin 0}{1 + 0} = 0$$
$$y_1 = \frac{\sin x_1}{1 + x_1} = \frac{\sin\frac{\pi}{12}}{1 + \frac{\pi}{12}} = \frac{0.2588..}{1 + 0.2618..} = 0.2051..$$
etc.

    x                  y
    x_0 = 0            y_0 = 0
    x_1 = π/12         y_1 = 0.2051
    x_2 = 2π/12        y_2 = 0.3282
    x_3 = 3π/12        y_3 = 0.3960
    x_4 = 4π/12        y_4 = 0.4230
    x_5 = 5π/12        y_5 = 0.4183
    x_6 = π/2          y_6 = 0.3890

    Sums:  y_0 + y_6 = 0.3890;  y_1 + y_3 + y_5 = 1.0194;  y_2 + y_4 = 0.7512

Using these values, the Trapezium Rule gives:

$$I \approx \frac{1}{2}\cdot\frac{\pi}{12}\,(0.3890 + 2(1.0194 + 0.7512)) = 0.5144 = 0.514 \text{ to 3 d.p.}$$

Simpson's Rule gives:

$$I \approx \frac{1}{3}\cdot\frac{\pi}{12}\,(0.3890 + 4 \times 1.0194 + 2 \times 0.7512) = 0.5209 = 0.521 \text{ to 3 d.p.}$$

The analytic result, correct to 6 d.p., is:

I = 0.521032

As expected, Simpson's Rule gives the better estimate in this case.


A more accurate method that involves calculating fewer function values than either the
Trapezium Rule or Simpson's Rule is called Gaussian Quadrature, and is described
in the appendix.

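APPENDIX A: Proof of the Fundamental Theorem of the Calculus

Define $F(x) = \int f(x)\,dx$, the indefinite integral of f(x), so that
$$\frac{dF}{dx} = f(x)$$

Define $A(x) = \int_a^x f(t)\,dt$, the definite integral.

Then $A(x + \delta x) = \int_a^{x+\delta x} f(t)\,dt$

The area of the shaded strip $= A(x + \delta x) - A(x) \approx f(x)\,\delta x$.

Then
$$\frac{A(x + \delta x) - A(x)}{\delta x} \approx f(x)$$

But $\lim_{\delta x \to 0}\left\{\frac{A(x + \delta x) - A(x)}{\delta x}\right\}$ is defined as $\frac{dA}{dx}$.

Therefore we have $\frac{dA}{dx} = f(x)$.

Since we now have $\frac{dF}{dx}$ and $\frac{dA}{dx}$ both equal to f(x), we have
$$\frac{dF}{dx} = \frac{dA}{dx} \quad\text{and}\quad F(x) = A(x) + \text{a possible constant}$$

We see therefore that the processes involved in finding an indefinite and a definite
integral are the same.

[Figure: the curve y = f(t) plotted against t, with the strip of width δx between t = x and t = x + δx shaded.]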

APPENDIX B. Gaussian Quadrature

Quadrature is the general term for numerical methods of integration. We have so
far met the Trapezium Rule and Simpson's Rule, which are both methods depending
on the value of the function at equally spaced points.
Gauss developed a method, in the first instance depending on the area of a trapezium.
However, in this case he did not merely take the two end points on the segment of the
curve in question and join them by a straight line, as in the Trapezium Rule. His two
points were chosen within the segment of the curve under which the area was required,
and chosen so that the approximate area had the best chance of giving an accurate
answer.

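[Figure: the curve y = f(x), with a straight line drawn through two interior points C and D on the curve to form the trapezium.]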

Clearly, Gauss' Method has more chance of giving an accurate value of $\int_a^b f(x)\,dx$ if
points C and D can be found so that the area left out under the curve is compensated
by other areas included in the calculation which are actually outside the required area.
In the Trapezium Rule, we have
$$\text{Area} = p\,f(a) + q\,f(b), \quad\text{where } p = q = \frac{b-a}{2}$$

In Gauss' Method, we wish to write
$$\text{Area} = w_1 f(c) + w_2 f(d)$$
Since x = c, d are not the boundaries of the trapezium, $w_1$, $w_2$ will not have the values
$\frac{d-c}{2}$ (or $\frac{b-a}{2}$).
We therefore have four unknowns, c, d, $w_1$, $w_2$, which we aim to choose so as to ensure
that the evaluation of the integral is as accurate as possible.

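Derivation of the Method

We shall assume that the interval of integration is [-1, 1]. If this is not the case,
then we use a substitution for the variable.

Since we have four unknowns, we can assume that the method ought to give an exact
answer for the integral of any polynomial function of degree ≤ 3. In particular, it
should give exact answers for the integrals of $1, x, x^2, x^3$ over the interval [-1, 1].
For these four functions, we have

$$I_0 = \int_{-1}^{1} 1\,dx = [x]_{-1}^{1} = 2$$
$$I_1 = \int_{-1}^{1} x\,dx = \left[\frac{x^2}{2}\right]_{-1}^{1} = 0$$
$$I_2 = \int_{-1}^{1} x^2\,dx = \left[\frac{x^3}{3}\right]_{-1}^{1} = \frac{2}{3}$$
$$I_3 = \int_{-1}^{1} x^3\,dx = \left[\frac{x^4}{4}\right]_{-1}^{1} = 0$$

[Figure: the curve over the interval [-1, 1], with the points C and D on the curve at x = -d and x = d.]

We are looking for Area to be of the form
$$\text{Area} = w_1 f(c) + w_2 f(d)$$
Therefore,
for the function 1: $\quad A_0 = w_1 \cdot 1 + w_2 \cdot 1 = I_0 = 2 \qquad (1)$
for the function x: $\quad A_1 = w_1 c + w_2 d = I_1 = 0 \qquad (2)$
for the function $x^2$: $\quad A_2 = w_1 c^2 + w_2 d^2 = I_2 = \frac{2}{3} \qquad (3)$
for the function $x^3$: $\quad A_3 = w_1 c^3 + w_2 d^3 = I_3 = 0 \qquad (4)$

Equation (1) gives: $w_1 + w_2 = 2$

If we choose $w_1 = w_2$, and $c = -d$, we see that equations (2) and (4) are satisfied.
Therefore, choose $w_1 = w_2 = 1$ (to satisfy equation (1)) and substitute in equation (3):
$$(-d)^2 + d^2 = \frac{2}{3} \quad\Rightarrow\quad 2d^2 = \frac{2}{3} \quad\Rightarrow\quad d^2 = \frac{1}{3} \quad\Rightarrow\quad d = \pm\frac{1}{\sqrt{3}}$$

So the Gaussian 2-point method of quadrature is:
$$\int_{-1}^{1} f(x)\,dx \approx 1 \cdot f\!\left(-\frac{1}{\sqrt{3}}\right) + 1 \cdot f\!\left(\frac{1}{\sqrt{3}}\right)$$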

3-Point Gaussian Quadrature

Here we look for a formula to give the approximate area in the form:
$$A = w_1 f(x_1) + w_2 f(x_2) + w_3 f(x_3)$$

We now choose 3 points, C, D, E, on the curve with x-values $x_1, x_2, x_3$, and put a
parabola through these in such a way that the overlap areas compensate for each other,
as in the 2-point process.

[Figure: the curve over the interval [-1, 1], with the abscissae $x_1$, $x_2$, $x_3$ marked on the x-axis.]

Since 6 unknowns are involved this time, we would expect the process to give exact
answers for integrals of polynomial functions of order ≤ 5, and in particular, for
$1, x, x^2, x^3, x^4, x^5$ over the interval [-1, 1].
As before, we have
$$I_0 = 2,\quad I_1 = 0,\quad I_2 = \frac{2}{3},\quad I_3 = 0, \quad\text{and, in addition,}\quad I_4 = \frac{2}{5},\quad I_5 = 0.$$
This leads to the 6 equations:

$$A_0 = w_1 \cdot 1 + w_2 \cdot 1 + w_3 \cdot 1 = I_0 = 2 \qquad (1)$$
$$A_1 = w_1 x_1 + w_2 x_2 + w_3 x_3 = I_1 = 0 \qquad (2)$$
$$A_2 = w_1 x_1^2 + w_2 x_2^2 + w_3 x_3^2 = I_2 = \frac{2}{3} \qquad (3)$$
$$A_3 = w_1 x_1^3 + w_2 x_2^3 + w_3 x_3^3 = I_3 = 0 \qquad (4)$$
$$A_4 = w_1 x_1^4 + w_2 x_2^4 + w_3 x_3^4 = I_4 = \frac{2}{5} \qquad (5)$$
$$A_5 = w_1 x_1^5 + w_2 x_2^5 + w_3 x_3^5 = I_5 = 0 \qquad (6)$$

Suppose, as in the 2-point case, that $x_1, x_2, x_3$ are symmetric about 0, then
$$x_2 = 0, \quad\text{and}\quad x_1 = -x_3$$

These would satisfy equations (2), (4) and (6), provided $w_1 = w_3$. Equations (3) and
(5) then become:
$$2w_1 x_1^2 = \frac{2}{3}, \quad 2w_1 x_1^4 = \frac{2}{5} \quad\Rightarrow\quad x_1^2 = \frac{3}{5}, \quad w_1 = \frac{5}{9} = w_3$$
Then from equation (1), $w_2 = 2 - \frac{10}{9} = \frac{8}{9}$

Thus the 3-point Gaussian Quadrature formula is:
$$\int_{-1}^{1} f(x)\,dx \approx \frac{5}{9} f\!\left(-\sqrt{\frac{3}{5}}\right) + \frac{8}{9} f(0) + \frac{5}{9} f\!\left(\sqrt{\frac{3}{5}}\right)$$

n-Point Gaussian Quadrature

The method can be extended to as many points as we wish to choose. For n points
the method gives exact solutions to the integrals of polynomials of degree ≤ (2n - 1)
over [-1, 1]. For other functions, the accuracy increases as we increase n.
The following table gives the values of the $w_i$ constants and the x-values to be used in
calculating the values of f(x) for the quadrature for values of n ≤ 5. In every case,
$\sum_{i=1}^{n} w_i = 2$.

    n   Coefficients w_i                   Abscissae x_i
    2   w_1 = w_2 = 1                      -x_1 = x_2 = 1/√3 = 0.57735
    3   w_1 = w_3 = 5/9 = 0.55556          -x_1 = x_3 = √0.6 = 0.77460
        w_2 = 8/9 = 0.88889                x_2 = 0
    4   w_1 = w_4 = 0.34785                -x_1 = x_4 = 0.86114
        w_2 = w_3 = 0.65214                -x_2 = x_3 = 0.33998
    5   w_1 = w_5 = 0.23692                -x_1 = x_5 = 0.90618
        w_2 = w_4 = 0.47863                -x_2 = x_4 = 0.53847
        w_3 = 0.56889                      x_3 = 0

EXAMPLE

$$I = \int_{-1}^{1} e^x\,dx = [e^x]_{-1}^{1} = e^1 - e^{-1} = 2.350402..$$

By Simpson's rule with
h = 1 (2 sub-intervals), $I \approx \frac{1}{3}(e^{-1} + 4e^0 + e^1) = 2.362054$
h = 0.5 (4 sub-intervals), $I \approx \frac{0.5}{3}(e^{-1} + 4e^{-0.5} + 2e^0 + 4e^{0.5} + e^1) = 2.351195$

By trapezium rule with
h = 0.5 (4 sub-intervals), $I \approx \frac{0.5}{2}(e^{-1} + 2e^{-0.5} + 2e^0 + 2e^{0.5} + e^1) = 2.399166$
h = 0.2 (10 sub-intervals), $I \approx \frac{0.2}{2}(e^{-1} + 2e^{-0.8} + 2e^{-0.6} + 2e^{-0.4} + 2e^{-0.2} + 2e^0 + 2e^{0.2} + 2e^{0.4} + 2e^{0.6} + 2e^{0.8} + e^1) = 2.358232$

Here, the absolute errors are respectively 0.01165.., 0.000792.., 0.0487.., 0.007829..

By Gaussian Quadrature:
2-point: $I \approx e^{-1/\sqrt{3}} + e^{1/\sqrt{3}} = 2.342696$
3-point: $I \approx \frac{5}{9}\left(e^{-\sqrt{0.6}} + e^{\sqrt{0.6}}\right) + \frac{8}{9}e^0 = 2.350337$
5-point: $I \approx 0.23692(e^{-0.90618} + e^{0.90618}) + 0.47863(e^{-0.53847} + e^{0.53847}) + 0.56889e^0 = 2.350402$

In these cases, the absolute errors are 0.00770.., 0.000065.., 0.00000..
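The 2-point and 3-point Gaussian results in this example can be checked with a few lines of Python. This is a sketch under stated assumptions: the names `GAUSS_RULES` and `gauss_quadrature` are illustrative, and the linear substitution x = (a + b)/2 + t(b - a)/2 used to map [-1, 1] onto a general interval [a, b] is one common choice of the "substitution for the variable" that the notes mention without specifying.

```python
import math

# Abscissae and weights on [-1, 1] for the 2-point and 3-point rules.
GAUSS_RULES = {
    2: ([-1 / math.sqrt(3), 1 / math.sqrt(3)], [1.0, 1.0]),
    3: ([-math.sqrt(0.6), 0.0, math.sqrt(0.6)], [5 / 9, 8 / 9, 5 / 9]),
}


def gauss_quadrature(f, a: float, b: float, points: int) -> float:
    """Approximate the integral of f over [a, b] with a 2- or 3-point Gauss rule."""
    ts, ws = GAUSS_RULES[points]
    half = (b - a) / 2   # scale factor dx/dt of the assumed linear substitution
    mid = (a + b) / 2
    return half * sum(w * f(mid + half * t) for t, w in zip(ts, ws))


exact = math.e - 1 / math.e
for n in (2, 3):
    approx = gauss_quadrature(math.exp, -1.0, 1.0, n)
    print(n, round(approx, 6), abs(approx - exact))
```

Only two or three function evaluations are used here, compared with five for Simpson's rule with h = 0.5.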


Chapter 8.
FUNCTIONS OF MORE THAN
ONE VARIABLE:
PARTIAL DIFFERENTIATION, PART I


Functions of two or more variables

So far, discussion about functions has been confined to functions dependent on a single
variable, f(x), for instance. We can demonstrate the way in which f varies with x by
means of a 2-dimensional graph. The set of all points satisfying y = f(x) forms some
straight line or curve.
Many functions, however, depend on more than one variable.
We write:

z = f(x, y), for instance,
where x and y are the two independent variables on which f depends;

φ = φ(x, y, z, t), as another example,
where φ is a function of 3-D space (needing the three independent
co-ordinate variables x, y, z to specify position) and time t.

Provided f depends on only two independent variables, say x and y, so that the
dependent variable z is given as
z = f(x, y),
then the relationship between z, x and y can be represented on a 3-D graph. Here the
set of all points satisfying the functional relationship z = f(x, y) forms a surface.

Suppose we take a plane section (cut) through the surface z = f(x, y). The section
will produce a straight line or curve.
In particular, if the section is a plane parallel to the x-z plane, everywhere on this
plane y takes a constant value, so the curve produced from the surface demonstrates
how z varies with x for a chosen constant value of y. We have a curve whose equation
can be expressed as
$z = f_1(x)$
where $f_1$ is some function.
Similarly, a plane section parallel to the z-y plane produces a curve, all of whose points
have a constant x coordinate, showing the way in which z varies with y for a chosen
constant value of x. Here we have a curve whose equation will take the form, for some
function $f_2$,
$z = f_2(y)$
A constant value of z gives a curve parallel to the x-y plane. These curves are
usually known as contours.

EXAMPLE of a function of one variable

$$y = f(x) = x + \frac{1}{x}$$

The graph illustrates how f(x) varies with x; the gradient $\frac{dy}{dx}$ shows the rate of change
of f(x) with x.

EXAMPLE of a function of two variables

$$z = f(x, y) = 4 - x^2 - 4y^2$$
For any independently chosen $x = x_1$ and $y = y_1$, we can then calculate the corresponding
value of $z = z_1$, giving the point whose coordinates are $(x_1, y_1, z_1)$ lying on
the surface representing the function.
e.g. When x = 1 and y = $\frac{1}{2}$,
$$z = 4 - 1 - 4\left(\frac{1}{2}\right)^2 = 2$$

So $(1, \frac{1}{2}, 2)$ is a point on the surface.

Sections: Take a constant value of y, e.g. y = $\frac{1}{2}$.
The resulting section through the surface is a curve whose equation is
$$z = 4 - x^2 - 4\left(\frac{1}{2}\right)^2 \quad\Rightarrow\quad z = 3 - x^2; \text{ a parabola}$$

In general, taking a section $y = C_1$ gives a parabolic curve with equation
$$z = 4 - 4C_1^2 - x^2.$$
Similarly, taking a section parallel to the z-y plane, $x = C_2$ (constant),
we obtain another series of parabolic curves with equation $z = 4 - C_2^2 - 4y^2$.

If z is taken as constant (say $z = C_3$) a series of ellipses is formed:
$$x^2 + 4y^2 = 4 - C_3$$
These are the contours of the surface.

Rates of change of functions

Trying to find a suitable way of showing how these functions vary as both x and y vary
simultaneously is clearly bound to be awkward. Since x and y are independent, there
is no reason why they should necessarily vary at the same rate, or, for that matter,
at the same time. We therefore consider the variation of one variable at a time; we
consider f as a function of one variable, keeping all the others constant for the time
being, and consider the rate of change of f with respect to that particular variable.
We shall see later how we can combine rates of change to look at the overall change,
or the rate of change, of f when more than one of the variables are changing simultaneously.
Consider first the rate at which f varies with x while y is kept at a constant value.
This is represented by the gradient of the curve created by the section parallel to the
z-x plane, and is therefore calculated by differentiating f w.r. to x, treating y as a
constant. We do not want to write this derivative as $\frac{df}{dx}$ since this would look like an
ordinary derivative and be confusing. A different notation is therefore used:
$$\frac{\partial f}{\partial x}$$
is defined as the rate of change of f with respect to x, all other variables being kept
constant.
Similarly
$$\frac{\partial f}{\partial y}$$
is the rate of change of f w.r. to y, while all other variables remain constant. These
values give the gradients at points on the curves created by sections of the surface
parallel to the z-y plane.

$\frac{\partial f}{\partial x}$ and $\frac{\partial f}{\partial y}$ are called the first partial derivatives of f(x, y).

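EXAMPLE
In the case of the function $f(x, y) = 4 - x^2 - 4y^2$, represented by the surface
$z = f(x, y) = 4 - x^2 - 4y^2$,
we can write
$$\frac{\partial f}{\partial x} = 0 - 2x - 0 = -2x; \qquad \frac{\partial f}{\partial y} = 0 - 0 - 8y = -8y$$

Thus the gradients of the respective curves parallel to the z-x and z-y planes are
$$\frac{\partial z}{\partial x} = -2x; \qquad \frac{\partial z}{\partial y} = -8y$$

At the point $(1, \frac{1}{2}, 2)$ on the surface, for instance, the curves obtained from the sections
are
$$z = 4 - x^2 - 4\left(\frac{1}{2}\right)^2 \Rightarrow z = 3 - x^2 \qquad\text{and}\qquad z = 4 - (1)^2 - 4y^2 \Rightarrow z = 3 - 4y^2$$

and the respective gradients are:
$$\frac{\partial z}{\partial x} = -2; \qquad \frac{\partial z}{\partial y} = -4$$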
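The idea of varying one variable while holding the other fixed can be checked numerically. The sketch below estimates both first partial derivatives of this surface at (1, 1/2); the helper names `partial_x` and `partial_y` and the use of a central difference (rather than a one-sided one) are illustrative choices.

```python
def f(x: float, y: float) -> float:
    """The surface z = 4 - x^2 - 4y^2 used in the example."""
    return 4 - x ** 2 - 4 * y ** 2


def partial_x(func, x: float, y: float, dx: float = 1e-6) -> float:
    """Estimate the partial derivative w.r. to x, holding y constant."""
    return (func(x + dx, y) - func(x - dx, y)) / (2 * dx)


def partial_y(func, x: float, y: float, dy: float = 1e-6) -> float:
    """Estimate the partial derivative w.r. to y, holding x constant."""
    return (func(x, y + dy) - func(x, y - dy)) / (2 * dy)


print(partial_x(f, 1.0, 0.5))  # close to -2x = -2
print(partial_y(f, 1.0, 0.5))  # close to -8y = -4
```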

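Formal definitions of the Partial Derivatives of f(x, y)

$$\frac{\partial f(x, y)}{\partial x} = \frac{\partial z}{\partial x} = \lim_{\delta x \to 0}\left\{\frac{f(x + \delta x, y) - f(x, y)}{\delta x}\right\}$$
$$\frac{\partial f(x, y)}{\partial y} = \frac{\partial z}{\partial y} = \lim_{\delta y \to 0}\left\{\frac{f(x, y + \delta y) - f(x, y)}{\delta y}\right\}$$
provided that these limits exist at (x, y).

Second and higher order derivatives are defined in the same way:
e.g.
$$\frac{\partial^2 f(x, y)}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \lim_{\delta x \to 0}\left\{\frac{\frac{\partial f}{\partial x}(x + \delta x, y) - \frac{\partial f}{\partial x}(x, y)}{\delta x}\right\}$$

Note that we can also differentiate:
$\frac{\partial f}{\partial x}$ partially w.r. to y,
and: $\frac{\partial f}{\partial y}$ partially w.r. to x.
We then obtain, respectively,
$$\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) \qquad\text{and}\qquad \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right)$$
the mixed second partial derivatives.

The example illustrates the following general result:

Theorem
If $\frac{\partial^2 f}{\partial x\,\partial y}$ and $\frac{\partial^2 f}{\partial y\,\partial x}$ both exist and are continuous functions of x and y in some region
containing the point $(x_0, y_0)$, then
$$\frac{\partial^2 f}{\partial x\,\partial y}(x_0, y_0) = \frac{\partial^2 f}{\partial y\,\partial x}(x_0, y_0)$$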

EXAMPLE
Find all the first and second partial derivatives of f(x, y), where
f (x, y) = x3 y + exy 2 cos(3x + 4y)

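$\frac{\partial f}{\partial x} =$

$\frac{\partial f}{\partial y} =$

$\frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left\{\frac{\partial f}{\partial x}\right\} =$

$\frac{\partial^2 f}{\partial y^2} = \frac{\partial}{\partial y}\left\{\frac{\partial f}{\partial y}\right\} =$

Finally the two mixed second derivatives:

$\frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial}{\partial y}\left\{\frac{\partial f}{\partial x}\right\} =$

$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left\{\frac{\partial f}{\partial y}\right\} =$

Note:

N.B. Another notation sometimes used for partial differentiation involves the use of
subscripts. We can write:
$$\frac{\partial f}{\partial x} = f_x; \qquad \frac{\partial f}{\partial y} = f_y$$
$$\frac{\partial^2 f}{\partial x^2} = f_{xx}; \qquad \frac{\partial^2 f}{\partial y^2} = f_{yy}; \qquad \frac{\partial^2 f}{\partial x\,\partial y} = f_{xy} \text{ etc.}$$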

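Taylor's Series for a Function of Two Variables

A reminder:

The Taylor series for a function of one variable, say f(x), about the point
x = a with δx = h is:

$$f(a + h) = f(a) + hf'(a) + \frac{h^2}{2!}f''(a) + \frac{h^3}{3!}f'''(a) + \ldots + \frac{h^n}{n!}f^{(n)}(a) + \ldots$$

provided the function f and all the necessary derivatives of f are defined at the point
x = a.

For a function of two variables, f(x, y), we have a comparable series. If derivatives
are to be involved, they must be partial derivatives of f. We might also expect the
derivatives involved to be symmetric in terms of x and y since there is no reason for
any priority of one variable over the other.
So we have the Taylor expansion of f(x, y) about the point (a, b), with δx = h, δy = k:

$$f(a + h, b + k) = f(a, b) + \left\{h\frac{\partial f}{\partial x} + k\frac{\partial f}{\partial y}\right\}_{\text{at } (a,b)}$$
$$+ \frac{1}{2!}\left\{h^2\frac{\partial^2 f}{\partial x^2} + 2hk\frac{\partial^2 f}{\partial x\,\partial y} + k^2\frac{\partial^2 f}{\partial y^2}\right\}_{\text{at } (a,b)}$$
$$+ \frac{1}{3!}\left\{h^3\frac{\partial^3 f}{\partial x^3} + 3h^2k\frac{\partial^3 f}{\partial x^2\,\partial y} + 3hk^2\frac{\partial^3 f}{\partial x\,\partial y^2} + k^3\frac{\partial^3 f}{\partial y^3}\right\}_{\text{at } (a,b)}$$
$$+ \ldots + \frac{1}{n!}\left\{h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right\}^n f(x, y)_{\text{at } (a,b)} + \ldots$$

N.B. $\left\{h\frac{\partial}{\partial x} + k\frac{\partial}{\partial y}\right\}^n f(x, y)$ is a shorthand notation for

$$h^n\frac{\partial^n f}{\partial x^n} + \binom{n}{1}h^{n-1}k\frac{\partial^n f}{\partial x^{n-1}\,\partial y} + \binom{n}{2}h^{n-2}k^2\frac{\partial^n f}{\partial x^{n-2}\,\partial y^2} + \ldots + k^n\frac{\partial^n f}{\partial y^n}$$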

Derivation
First: Treat f(x, y) as a function of x only, x varying from a to a + h, while y is
kept constant at y = b + k.
Since we are now dealing with a function of one variable, x, we expand it about x = a
using the Taylor series for a function of one variable. We obtain:
$$f(a + h, b + k) = f(a, b + k) + h\frac{\partial f}{\partial x}(a, b + k) + \frac{h^2}{2!}\frac{\partial^2 f}{\partial x^2}(a, b + k) + \ldots \qquad (9)$$

Then: Treat each term on the R.H.S. of (9) as a function of y only, y varying from
b to b + k, while x is kept at the constant value a. We expand each of these terms by
a Taylor series for a function of one variable, this time y. Thus:
$$f(a, b + k) = f(a, b) + k\frac{\partial f}{\partial y}(a, b) + \frac{k^2}{2!}\frac{\partial^2 f}{\partial y^2}(a, b) + \ldots$$
$$h\frac{\partial f}{\partial x}(a, b + k) = h\left[\frac{\partial f}{\partial x}(a, b) + k\frac{\partial^2 f}{\partial y\,\partial x}(a, b) + \frac{k^2}{2!}\frac{\partial^3 f}{\partial y^2\,\partial x}(a, b) + \ldots\right]$$
$$\frac{h^2}{2!}\frac{\partial^2 f}{\partial x^2}(a, b + k) = \frac{h^2}{2!}\left[\frac{\partial^2 f}{\partial x^2}(a, b) + k\frac{\partial^3 f}{\partial y\,\partial x^2}(a, b) + \frac{k^2}{2!}\frac{\partial^4 f}{\partial y^2\,\partial x^2}(a, b) + \ldots\right]$$
.....

Then substitute all these terms back in equation (9).

Small Increments and Errors

Suppose now that in the Taylor series for f(x, y), expanded about the point (a, b) with
δx = h, δy = k, both h and k are small enough for $h^2$, $k^2$ and higher powers of these
to be neglected.
Then the Taylor series is reduced to the approximate relationship:
$$f(a + h, b + k) \approx f(a, b) + h\frac{\partial f}{\partial x}(a, b) + k\frac{\partial f}{\partial y}(a, b) \qquad (10)$$

Now [f(a + h, b + k) - f(a, b)] represents the change in f resulting from the values of
x and y being increased from a and b to (a + h) and (b + k) respectively.
So, using the usual notation, we write: f(a + h, b + k) - f(a, b) = δf
Then rearranging equation (10), this can be written:
$$\delta f \approx h\frac{\partial f}{\partial x}(a, b) + k\frac{\partial f}{\partial y}(a, b)$$

Reverting to the δx, δy notation, this is usually written:
$$\delta f \approx \delta x\frac{\partial f}{\partial x} + \delta y\frac{\partial f}{\partial y} \qquad (11)$$

The value of δf is expected to be small relative to the value of f at the point in
question, provided that δx and δy are small relative to x and y, so that no terms of
any significant size are neglected.
This equation gives us a means of calculating (approximately, at any rate) the change
in f induced when both the variables x and y vary simultaneously. δf is called the
total differential of f.
This very useful approximate expression for δf can be generalised for cases in which f
is a function of more than two variables.
e.g. f = f(x, y, z, t)

Then the total differential δf is given approximately by
$$\delta f \approx \delta x\frac{\partial f}{\partial x} + \delta y\frac{\partial f}{\partial y} + \delta z\frac{\partial f}{\partial z} + \delta t\frac{\partial f}{\partial t}$$

EXAMPLE

Consider the volume of a cylinder $V = \pi r^2 h$

If small errors δr and δh are made in the measurements of r and h, then an error δV
will inevitably result in the calculation of V from this formula. We can use the total
differential result to estimate this error:
$$\delta V \approx \delta r\frac{\partial V}{\partial r} + \delta h\frac{\partial V}{\partial h} = 2\pi r h\,\delta r + \pi r^2\,\delta h$$

Case (i) Suppose r and h are measured as 10.0 cm and 15.0 cm respectively, both
measurements being made to the nearest mm. We are required to find the maximum
possible error which could arise in the calculation of the volume using the given values
of r and h.
The maximum error will clearly occur when the errors made in the measurements of
r and h are as large as they can be, and when the terms giving δV reinforce each
other.
If r = 10 cm = 100 mm, the value given correct to the nearest mm, the value of r can
actually lie anywhere between: 99.5 mm < r < 100.5 mm.
Therefore, the max. value of |δr| = 0.5 mm.
Similarly, 149.5 mm < h < 150.5 mm, so that max |δh| = 0.5 mm.

Then max. possible error in the calculation of V is given by
$$\delta V \approx 2\pi(100)(150)(0.5) + \pi(100)^2(0.5) = 20000\pi \text{ mm}^3 \approx 62\,800 \text{ mm}^3$$

Case (ii) We may be more interested in the percentage error in the calculation of V.
We are often given the likely percentage errors in the measurement of variables, rather
than the actual errors.
i.e. We know $\frac{\delta r}{r}$ and $\frac{\delta h}{h}$, and are looking for $\frac{\delta V}{V}$.

Suppose that r and h can be measured to an accuracy of 0.3% and 0.2% respectively. This means that
$$\left|\frac{\delta r}{r}\right|_{\max} = 0.3\%, \qquad \left|\frac{\delta h}{h}\right|_{\max} = 0.2\%$$

Since $\delta V \approx 2\pi r h\,\delta r + \pi r^2\,\delta h$, we can divide this by $V = \pi r^2 h$, giving
$$\frac{\delta V}{V} \approx 2\frac{\delta r}{r} + \frac{\delta h}{h}$$
i.e. The maximum possible percentage error in V is 2(0.3%) + 0.2% = 0.8%.
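The total differential estimate for the cylinder can be evaluated with a short script. This is a sketch under stated assumptions: the function names are illustrative, and "measured to the nearest mm" is read as a maximum error of 0.5 mm in each measurement.

```python
import math


def max_volume_error(r: float, h: float, dr: float, dh: float) -> float:
    """Worst-case total differential of V = pi r^2 h for errors of size dr, dh."""
    dV_dr = 2 * math.pi * r * h   # partial derivative of V w.r. to r
    dV_dh = math.pi * r ** 2      # partial derivative of V w.r. to h
    # Absolute values make the two terms reinforce each other (worst case).
    return abs(dV_dr) * abs(dr) + abs(dV_dh) * abs(dh)


def max_percentage_error(pct_r: float, pct_h: float) -> float:
    """Worst-case percentage error in V, from dV/V = 2 dr/r + dh/h."""
    return 2 * abs(pct_r) + abs(pct_h)


r_mm, h_mm = 100.0, 150.0
print(max_volume_error(r_mm, h_mm, 0.5, 0.5))   # in cubic millimetres
print(max_percentage_error(0.3, 0.2))           # in percent
```

The factor 2 on the radius term reflects the fact that r appears squared in the formula, so a given percentage error in r counts twice.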


N.B.1 If angles are involved in any formula used to calculate a total differential, any
small change or error in an angle must be measured in radians.
N.B.2 If the errors δx, δy, ... in the variables x, y, ... can be positive or negative, consideration must be given to the signs involved in the formula for δf when a maximum
value (worst case) of δf or of δf/f is being sought.

EXAMPLE

A function f is to be calculated from the formula

f (x, y, z) = x2

y
z

If x, y, z can only be measured to a guaranteed accuracy of 0.2%, 0.2%, 0.5%
respectively, find the maximum possible % error in the calculation of f.

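Implicit Functions
Here we use the approximate result (11) for a case which contains a function f(x, y),
but where x and y are not independent: y can be considered as a function of x and
therefore the rate of change $\frac{dy}{dx}$ can be defined. Note that this would be an ordinary,
not partial, derivative, since x would be the only independent variable. [We have met
this idea in Chapter 3, P.62, under Implicit Functions].
Since y is a function of x, a small increment δx in x will result in a small increment δy
in y. Dividing equation (11) by δx gives
$$\frac{\delta f}{\delta x} \approx \frac{\partial f}{\partial x}\frac{\delta x}{\delta x} + \frac{\partial f}{\partial y}\frac{\delta y}{\delta x}$$

As δx → 0, $\lim_{\delta x \to 0}\left\{\frac{\delta f}{\delta x}\right\} = \frac{df}{dx}$,

i.e.
$$\frac{df}{dx} = \frac{\partial f}{\partial x}\cdot\frac{dx}{dx} + \frac{\partial f}{\partial y}\cdot\frac{dy}{dx}$$
$$\Rightarrow\quad \frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\cdot\frac{dy}{dx} \qquad (12)$$

If the relationship between x and y is given in the form of the equation f(x, y) = 0,
equation (12) gives us an easy means of finding $\frac{dy}{dx}$, since $\frac{df}{dx}$ must = 0.
Equation (12) becomes $0 = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\cdot\frac{dy}{dx}$, which rearranges to give
$$\frac{dy}{dx} = -\frac{\partial f/\partial x}{\partial f/\partial y}$$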
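The rule dy/dx = -(∂f/∂x)/(∂f/∂y) can be tried out numerically. The sketch below uses a circle, $x^2 + y^2 - 25 = 0$, chosen purely for illustration (it is not an example from these notes); the function name `implicit_slope` and the central-difference estimates of the partial derivatives are likewise assumptions.

```python
def implicit_slope(f, x: float, y: float, d: float = 1e-6) -> float:
    """Estimate dy/dx on the curve f(x, y) = 0 as -(df/dx)/(df/dy)."""
    f_x = (f(x + d, y) - f(x - d, y)) / (2 * d)   # partial w.r. to x
    f_y = (f(x, y + d) - f(x, y - d)) / (2 * d)   # partial w.r. to y
    if f_y == 0:
        raise ZeroDivisionError("df/dy is zero here; the tangent is vertical")
    return -f_x / f_y


def circle(x: float, y: float) -> float:
    """Illustrative implicit relationship: x^2 + y^2 - 25 = 0."""
    return x ** 2 + y ** 2 - 25


# At the point (3, 4) on the circle the slope is -x/y = -0.75.
print(implicit_slope(circle, 3.0, 4.0))
```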

EXAMPLE Implicit Functions

The relationship between x and y is given as
$$x^3 - xy + y^3 = 0;$$
find $\frac{dy}{dx}$.

Chapter 9.
ORDINARY FIRST ORDER
DIFFERENTIAL EQUATIONS


1. Definitions
A Differential Equation (D.E.) is a relationship between a function, some of its
derivatives, and the variable(s) upon which it depends.
An Ordinary Differential Equation (O.D.E.) is a differential equation, as defined
above, but in which the function is dependent only on a single variable and therefore
no partial derivatives arise.
The Order of a differential equation is the order of the highest derivative it contains.
The Degree of a differential equation is the power to which the highest derivative
in the equation is raised.
A Solution of a differential equation is a relationship between the function and
its variable(s) which satisfies the equation but contains no derivatives. It may be an
explicit or an implicit relationship.

EXAMPLE

The equation
$$\frac{d^3y}{dx^3} + 2y\left(\frac{d^2y}{dx^2}\right)^2 + x\frac{dy}{dx} - y^3 = \sin x$$
is an O.D.E. in which x is the independent variable, y(x) is the function; the equation
is of order 3 and degree 1.

2. First Order Differential Equations (of degree 1)

These may be expressed in either of the forms
$$\frac{dy}{dx} = f(x, y) \qquad\text{or}\qquad P(x, y) + Q(x, y)\frac{dy}{dx} = 0$$

These are clearly interchangeable: by re-arranging the second form, we see that
$$f(x, y) = -\frac{P(x, y)}{Q(x, y)}$$

In order to solve the D.E., it is sometimes more convenient to express it in the first of
these forms, sometimes in the second. This depends on the nature of the D.E.

EXAMPLE

In the D.E. $x^2 + 4y^2 - 3xy\frac{dy}{dx} = 0$, $\quad P(x, y) = x^2 + 4y^2$; $\quad Q(x, y) = -3xy$

It can be rewritten as:
$$3xy\frac{dy}{dx} = x^2 + 4y^2 \quad\Rightarrow\quad \frac{dy}{dx} = \frac{x^2 + 4y^2}{3xy}$$

3. Solution of 1st Order Differential Equations

First: the general solution of any first order differential equation will contain one
arbitrary constant.
There are two classes of methods of solving these D.E.s:
(i) Those based on separation of variables;
(ii) Those based on exact solution: making (or finding the equation to be) the
exact derivative of some function, and then integrating it.

(i) Methods based on the Technique of Separation of Variables

(a) Separable Equations

To identify these, write the D.E. in the form:
$$\frac{dy}{dx} = f(x, y)$$
If we can write:
$$f(x, y) = f_1(x)\cdot f_2(y)$$
i.e. if f(x, y) is the product of two factors $f_1(x)$ and $f_2(y)$, functions of x only and
y only respectively, then the D.E. is separable. We can then write the equation
as
$$\frac{dy}{dx} = f_1(x)\cdot f_2(y) \quad\Rightarrow\quad \frac{1}{f_2(y)}\frac{dy}{dx} = f_1(x),$$
and integrate w.r. to x:
$$\int \frac{1}{f_2(y)}\frac{dy}{dx}\,dx = \int f_1(x)\,dx \qquad\text{or:}\qquad \int \frac{1}{f_2(y)}\,dy = \int f_1(x)\,dx$$

EXAMPLE 1

$$\frac{dy}{dx} = 2y$$

EXAMPLE 2

$$xy + y + x(y - 1)\frac{dy}{dx} = 0$$

Rearrange as:
$$xy + y = -x(y - 1)\frac{dy}{dx} \quad\Rightarrow\quad \frac{dy}{dx} = -\frac{xy + y}{x(y - 1)}$$

(b) Homogeneous Differential Equations

First we define a homogeneous function.
Definition: A function f(x, y) is said to be homogeneous of degree n if, for any
non-zero constant λ,
$$f(\lambda x, \lambda y) = \lambda^n f(x, y)$$
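The defining property can be spot-checked numerically. The sketch below compares both sides at a few sample points and scale factors; the function name `looks_homogeneous`, the sample functions and the sample values are illustrative assumptions, and only positive scale factors are used so that the square-root example behaves as expected. A check of this kind can refute homogeneity but cannot prove it.

```python
import math


def looks_homogeneous(func, degree: float,
                      samples=((1.0, 2.0), (0.5, -3.0), (2.0, 0.7)),
                      scales=(0.5, 2.0, 3.0)) -> bool:
    """Compare f(lam*x, lam*y) with lam**degree * f(x, y) at sample points."""
    for x, y in samples:
        for lam in scales:
            lhs = func(lam * x, lam * y)
            rhs = lam ** degree * func(x, y)
            if not math.isclose(lhs, rhs, rel_tol=1e-9, abs_tol=1e-12):
                return False
    return True


def f(x, y):
    return x ** 4 + 3 * x ** 2 * y ** 2 + 5 * x * y ** 3


def g(x, y):
    return x + math.sqrt(x ** 2 + y ** 2)


print(looks_homogeneous(f, 4))   # every term of f has total degree 4
print(looks_homogeneous(g, 1))
print(looks_homogeneous(lambda x, y: x + y ** 2, 1))   # mixed degrees
```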

We can now define a homogeneous 1st order D.E.

The D.E.
$$P(x, y) + Q(x, y)\frac{dy}{dx} = 0$$
is said to be homogeneous if P(x, y) and Q(x, y) are both homogeneous functions
of the same degree.

Then
$$\frac{dy}{dx} = -\frac{P(x, y)}{Q(x, y)}$$
which has dimension 0 and can be expressed as $f\!\left(\frac{y}{x}\right)$.

We therefore make the substitution $v = \frac{y}{x}$
and substitute for y and $\frac{dy}{dx}$, so that we have a D.E. in terms of v and x instead of y and x.

N.B. If we define v by $v = \frac{y}{x}$, or y = vx, then in terms of v and x, by the product rule,
$$\frac{dy}{dx} = v + x\frac{dv}{dx}$$

EXAMPLES of homogeneous functions

$$f(x, y) = x^4 + 3x^2y^2 + 5xy^3$$
$$f(\lambda x, \lambda y) = (\lambda x)^4 + 3(\lambda x)^2(\lambda y)^2 + 5(\lambda x)(\lambda y)^3$$
$$= \lambda^4 x^4 + 3\lambda^4 x^2y^2 + 5\lambda^4 xy^3$$
$$= \lambda^4(x^4 + 3x^2y^2 + 5xy^3)$$
$$= \lambda^4 f(x, y)$$

$$g(x, y) = x + \sqrt{x^2 + y^2}$$
$$g(\lambda x, \lambda y) = \lambda x + \sqrt{(\lambda x)^2 + (\lambda y)^2}$$
$$= \lambda x + \sqrt{\lambda^2(x^2 + y^2)}$$
$$= \lambda\left(x + \sqrt{x^2 + y^2}\right)$$
$$= \lambda\,g(x, y)$$

Thus f(x, y) is homogeneous, of degree 4; g(x, y) is homogeneous of degree 1.

EXAMPLE of a homogeneous differential equation:

$$2xy\frac{dy}{dx} = y^2 - x^2$$

The functions 2xy and $y^2 - x^2$ are both homogeneous of degree 2. We therefore
have a homogeneous 1st order D.E. of degree 2.
Using the standard procedure: Let $v = \frac{y}{x}$, or y = vx. Then $\frac{dy}{dx} = v + x\frac{dv}{dx}$.

Substituting for y and for $\frac{dy}{dx}$ in the D.E.:

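(ii) Differential Equations based on Exact Derivatives

(a) Exact Equations
Suppose a relationship between x and y is given as
$$f(x, y) = C \qquad\text{(C constant)}$$
Then
$$\frac{df}{dx} = 0$$

But the implicit differentiation rule gives
$$\frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx}$$
Therefore
$$\frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx} = 0$$

Compare this last equation with the standard form of a 1st order D.E.:
$$P(x, y) + Q(x, y)\frac{dy}{dx} = 0$$

If we can show that P(x, y) and Q(x, y) are the partial derivatives w.r. to x and
y respectively of some function f(x, y), then reversing the differentiation process
above, the solution of the D.E. will be
$$f(x, y) = C$$

Now if this function f exists, so that $P = \frac{\partial f}{\partial x}$, then $\frac{\partial P}{\partial y} = \frac{\partial^2 f}{\partial y\,\partial x}$

And if $Q = \frac{\partial f}{\partial y}$, then $\frac{\partial Q}{\partial x} = \frac{\partial^2 f}{\partial x\,\partial y}$

But we know that, in general,
$$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x}$$
Therefore the test for P and Q to be the respective x and y partial derivatives of
the same function f, so that the D.E. is shown to be exact, is:
$$\frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}$$
The solution is then f(x, y) = C, where
$$\frac{\partial f}{\partial x} = P \;\Rightarrow\; f(x, y) = \int P(x, y)\,dx \ \text{(y const)}; \qquad \frac{\partial f}{\partial y} = Q \;\Rightarrow\; f(x, y) = \int Q(x, y)\,dy \ \text{(x const)}$$

Comparison of the results of these two integrals gives f(x, y).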

EXAMPLE

$$x^3 + \sin y + (x\cos y + y^3)\frac{dy}{dx} = 0$$

Find the general solution, and also the particular solution for which y(0) = 2
[i.e. y = 2 when x = 0]

P(x, y) =

Q(x, y) =
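The exactness test for this equation, and a candidate for the function f(x, y), can be checked numerically. The sketch below is hedged: the names `is_exact_at` and `candidate_f` are illustrative, the candidate $f = \frac{x^4}{4} + x\sin y + \frac{y^4}{4}$ is the result of integrating P w.r. to x and Q w.r. to y as described in the method above, and the derivatives are estimated by central differences rather than found analytically.

```python
import math


def P(x, y):
    return x ** 3 + math.sin(y)


def Q(x, y):
    return x * math.cos(y) + y ** 3


def is_exact_at(P, Q, x, y, d=1e-5, tol=1e-6) -> bool:
    """Check dP/dy == dQ/dx at one point using central differences."""
    dP_dy = (P(x, y + d) - P(x, y - d)) / (2 * d)
    dQ_dx = (Q(x + d, y) - Q(x - d, y)) / (2 * d)
    return abs(dP_dy - dQ_dx) < tol


def candidate_f(x, y):
    """Candidate f(x, y) whose x- and y-partials should be P and Q."""
    return x ** 4 / 4 + x * math.sin(y) + y ** 4 / 4


points = [(0.0, 2.0), (1.0, -1.0), (-0.5, 0.3)]
print(all(is_exact_at(P, Q, x, y) for x, y in points))
# The initial condition y(0) = 2 fixes the constant C = f(0, 2).
print(candidate_f(0.0, 2.0))
```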


(b) Linear Equations

These can always be expressed in the form
$$\frac{dy}{dx} + p(x)y = q(x)$$
p(x) and q(x) are functions of x only, and the L.H.S. is a linear combination of
$\frac{dy}{dx}$ and y.
As it stands, the L.H.S. is not usually the exact derivative of a function, but
multiplication by a certain function of x, called the Integrating Factor (I.F.),
will result in the L.H.S. becoming the exact derivative of a product. So we just
need to find the correct choice of I.F. for each equation.
Let the I.F. be h(x).
Multiplying the D.E. through by h(x) gives
$$h(x)\frac{dy}{dx} + h(x)p(x)y = h(x)q(x)$$
We require the L.H.S. to be an exact derivative. Identifying
$$P(x, y) = h(x)p(x)y, \qquad Q(x, y) = h(x), \qquad\text{we need}\qquad \frac{\partial P}{\partial y} = \frac{\partial Q}{\partial x}$$

Now
$$\frac{\partial P}{\partial y} = \frac{\partial}{\partial y}\{h(x)p(x)y\} = h(x)p(x)\cdot 1 = h(x)p(x)$$
And
$$\frac{\partial Q}{\partial x} = \frac{\partial}{\partial x}h(x) = \frac{dh}{dx} \quad\text{since h is a function of x only}$$

Therefore, for the L.H.S. to be an exact derivative, we require
$$\frac{dh}{dx} = h(x)p(x).$$

This is a separable D.E. in h(x) and x. We separate the variables and integrate:
$$\int \frac{1}{h}\,dh = \int p(x)\,dx \quad\Rightarrow\quad \ln h = \int p(x)\,dx \quad\Rightarrow\quad h(x) = e^{\int p(x)\,dx}$$
For simplicity we choose the constant of integration resulting from the integration
of p(x) to be zero, since any factor which makes the L.H.S. into an exact derivative
is acceptable.

Thus we have
$$\text{I.F.} = e^{\int p(x)\,dx}$$

And when the D.E. is multiplied by the I.F., the L.H.S. becomes $\frac{d}{dx}\{\text{I.F.} \times y\}$.

EXAMPLE 1

Find the particular solution for which y(1) = 1 of the D.E.

$$x^2\frac{dy}{dx} - 2xy = \frac{1}{x}$$

This is a linear 1st order D.E., but it is not at present in the correct form to start
the solution process, since the coefficient of $\frac{dy}{dx}$ must be unity.
Therefore

EXAMPLE 2

Find the general solution of the D.E.

$$\frac{dy}{dx} - y\tan x = x$$

This is again a linear D.E., this time having the correct form with the coefficient
of $\frac{dy}{dx}$ being 1. We can therefore directly identify $p(x) = -\tan x$, $q(x) = x$.

For the I.F. we need
$$\int p(x)\,dx = -\int \tan x\,dx = -\ln(\sec x) = \ln(\cos x)$$

Then the I.F. $= e^{\int p(x)\,dx} = e^{\ln(\cos x)} = \cos x$
Multiplying the D.E. through by the I.F., we obtain
$$\cos x\frac{dy}{dx} - \cos x \cdot y\tan x = x\cos x \quad\Rightarrow\quad \cos x\frac{dy}{dx} - y\sin x = x\cos x$$
or
$$\frac{d}{dx}\{y\cos x\} = x\cos x$$

So, integrating gives:
$$y\cos x = x\sin x - \int 1\cdot\sin x\,dx = x\sin x + \cos x + C$$
$$\Rightarrow\quad y = x\tan x + 1 + \frac{C}{\cos x}$$
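A general solution of this kind can be sanity-checked by substituting it back into the D.E. The sketch below does this numerically for a few values of x and C; the function names and the central-difference estimate of dy/dx are illustrative assumptions, and x is kept away from odd multiples of π/2, where cos x = 0.

```python
import math


def y_general(x: float, C: float) -> float:
    """General solution y = x tan x + 1 + C / cos x."""
    return x * math.tan(x) + 1 + C / math.cos(x)


def residual(x: float, C: float, d: float = 1e-6) -> float:
    """Evaluate dy/dx - y tan x - x, which should vanish for a solution."""
    dy_dx = (y_general(x + d, C) - y_general(x - d, C)) / (2 * d)
    return dy_dx - y_general(x, C) * math.tan(x) - x


for C in (-2.0, 0.0, 3.0):
    worst = max(abs(residual(x, C)) for x in (-1.0, -0.3, 0.0, 0.4, 1.0))
    print(C, worst)   # each value should be very close to zero
```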

4. Numerical Methods of Solution of 1st order Differential Equations

The analytic methods of solving 1st order ODEs shown so far can be used only for
a very restricted number of cases. Many ODEs that occur in engineering situations
cannot be solved exactly by analytic methods. We therefore need to seek methods to
obtain approximate solutions, and then perhaps look to see how the solution values
obtained can be improved. In these methods, we do not seek the solution y as a
function of x, but we seek numerical values of y for certain specific values of x within
the range concerned. If intermediate values of y are needed, we can try to improve the
accuracy of the numerical method used, or use an interpolation process as appropriate.
We start by looking at the ideas behind the methods used.

Given: the differential equation
$$\frac{dy}{dx} = f(x, y)$$
and an initial point $(x_0, y_0)$, i.e. $y = y_0$ when $x = x_0$.

What does this tell us? If we aim to draw a graph representing the solution function
(or at least find points on the solution function), at present we know one point on the
graph, and the gradient of the graph at that point (obtained by evaluating $f(x_0, y_0)$).

[Figure: the point $(x_0, y_0)$ on the solution curve, with the direction of the tangent to the curve at $(x_0, y_0)$ given by $f(x_0, y_0)$.]

A crude solution might be obtained by assuming that, over a small interval $x_0$ to $x_0 + h$,
the gradient remains roughly constant, and thus we can find an approximate value of
y (= $y_1$) at $x = x_0 + h$ (= $x_1$):
$$y_1 = y_0 + h\tan\theta = y_0 + hf(x_0, y_0)$$
where θ is the angle which the tangent makes with the x-axis.
Having obtained a second point $(x_1, y_1)$, we can evaluate the gradient at this point and
repeat the process.

Hence we obtain Euler's Method, given by the formula:

$$y_{n+1} = y_n + hf(x_n, y_n)$$

EXAMPLE

Use Euler's method to estimate numerical solutions for the D.E.

$$\frac{dy}{dx} = x + y$$

given that y(0) = 0. Use the step-length h = 0.2 to find an estimate for the value of
y(1).

Comparing this equation with the basic equation $\frac{dy}{dx} = f(x, y)$, we identify f(x, y) with
x + y.

The Euler formula for this equation is:
$$y_{n+1} = y_n + hf(x_n, y_n) = y_n + h(x_n + y_n)$$

Thus:
$y_1 = y_0 + h(x_0 + y_0) = 0 + 0.2(0 + 0) = 0$
$y_2 = y_1 + h(x_1 + y_1) = 0 + 0.2(0.2 + 0) = 0.04$
$y_3 = y_2 + h(x_2 + y_2) = 0.04 + 0.2(0.4 + 0.04) = 0.128$
$y_4 = y_3 + h(x_3 + y_3) = 0.128 + 0.2(0.6 + 0.128) = 0.2736$
$y_5 = y_4 + h(x_4 + y_4) = 0.2736 + 0.2(0.8 + 0.2736) = 0.4883$

i.e., When x = 1, y ≈ 0.488.

Now this dierential equation is a linear rst order D.E., and we can nd its analytic
solution. Writing the equation as
dy
y = x,
dx
the integrating factor is

1dx

= ex

Multiplying through the D.E. by this I.F., we obtain


d
(yex ) = xex
dx
=

yex = ex .x
=

ex .1dx

xex ex + C

And since we were given y(0) = 0, we calculate that C = 1.


The analytic solution, then, tting the initial condition, is
y = x 1 + ex
From this, we calculate that y(1) = 1 1 + e1 = 0.718
The Euler approximate solution, y(1) = 0.488, compares unfavourably with this correct value!
The seemingly obvious remedy would be to decrease the size of the step length $h$ when
using Euler's Method. This would clearly increase the number of iterations of the
process necessary to reach the required value of $x$, in this case $x = 1$. The results of
Euler's Method to obtain $y(1)$ for different values of $h$ are tabulated below:

h        y(1)      No. of iterations necessary
0.2      0.488     5
0.1      0.594     10
0.05     0.6533    20
0.02     0.6916    50
0.01     0.7048    100
0.005    0.7115    200
0.002    0.7156    500
0.001    0.7169    1000
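The table can be regenerated with the short sketch below, which also prints the error against the analytic value $y(1) = e - 2$. The helper name `euler_y_at` is an illustrative assumption. The error roughly halves each time $h$ is halved, which is why so many iterations are needed for modest accuracy.

```python
import math

def euler_y_at(f, x0, y0, x_end, n_steps):
    """Euler's Method from x0 to x_end in n_steps equal steps; returns the final y."""
    h = (x_end - x0) / n_steps
    x, y = x0, y0
    for i in range(n_steps):
        y += h * f(x, y)
        x = x0 + (i + 1) * h
    return y

exact = math.e - 2.0  # analytic y(1) for dy/dx = x + y, y(0) = 0
rows = []
for n in (5, 10, 20, 50, 100, 200, 500, 1000):
    y1 = euler_y_at(lambda x, y: x + y, 0.0, 0.0, 1.0, n)
    rows.append((1.0 / n, y1, exact - y1))
    print(f"h = {1.0 / n:<6g} y(1) = {y1:.4f}  error = {exact - y1:.4f}  steps = {n}")
```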


Possibilities for a More Accurate Method


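(i) We could take the first three terms of the Taylor expansion of $y(x)$ instead of the
first two. Then

$$y(x_0 + h) \approx y_0 + h\left[\frac{dy}{dx}\right]_{x_0} + \frac{h^2}{2!}\left[\frac{d^2y}{dx^2}\right]_{x_0}$$

with the error term being

$$\frac{h^3}{3!}\left[\frac{d^3y}{dx^3}\right]_{x_0 + th}, \qquad 0 \le t \le 1.$$

Now $\dfrac{dy}{dx} = f(x, y)$, so that, as before,

$$\left[\frac{dy}{dx}\right]_{x_0} = f(x_0, y_0)$$

and

$$\frac{d^2y}{dx^2} = \frac{d}{dx} f(x, y)$$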

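$f(x, y)$ is unlikely to be merely a function of $x$, since in this case we could find an approximation for $y$ by a numerical integration process such as Simpson's Rule. Therefore
to find $\dfrac{d}{dx} f(x, y)$ we need the Chain Rule for an implicit function:

$$\frac{d}{dx} f(x, y) = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\, f(x, y)$$

so that

$$\left[\frac{d}{dx} f(x, y)\right]_{x_0} = \left[\frac{\partial f}{\partial x}\right]_{x_0} + \left[\frac{\partial f}{\partial y}\right]_{x_0} f(x_0, y_0)$$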

Therefore we have:

$$y(x_0 + h) \approx y_0 + h f(x_0, y_0) + \frac{h^2}{2!}\left(\left[\frac{\partial f}{\partial x}\right]_{x_0} + \left[\frac{\partial f}{\partial y}\right]_{x_0} f(x_0, y_0)\right)$$

This is sometimes known as the Extended Euler Method. An alternative method,
giving the same numerical accuracy as the Extended Euler Method, is the Modified
Euler Method, or Heun's Method.
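As a rough illustration of the Extended Euler Method, the sketch below applies the three-term Taylor step to $\frac{dy}{dx} = x + y$, $y(0) = 0$. The function name `extended_euler` and the convention of passing the partial derivatives as the callables `fx` and `fy` are assumptions made for this example.

```python
def extended_euler(f, fx, fy, x0, y0, h, n_steps):
    """Three-term Taylor step: y + h*f + (h^2/2)*(df/dx + df/dy * f)."""
    x, y = x0, y0
    for i in range(n_steps):
        slope = f(x, y)
        second = fx(x, y) + fy(x, y) * slope   # d2y/dx2 from the chain rule
        y = y + h * slope + 0.5 * h * h * second
        x = x0 + (i + 1) * h
    return y

# f = x + y has partial derivatives df/dx = 1 and df/dy = 1
y_at_1 = extended_euler(lambda x, y: x + y,
                        lambda x, y: 1.0,
                        lambda x, y: 1.0,
                        0.0, 0.0, 0.2, 5)
print(round(y_at_1, 4))  # 0.7027
```

With the same step length $h = 0.2$ this is much closer to the analytic value 0.718 than the basic Euler estimate 0.488, at the cost of having to supply the partial derivatives of $f$.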


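(ii) The Modified Euler Method

Recall that the D.E. is

$$\frac{dy}{dx} = f(x, y), \qquad \text{with} \qquad y(x_0) = y_0$$

Integrating:

$$\int_{x_0}^{x_1} \frac{dy}{dx}\,dx = \int_{x_0}^{x_1} f(x, y)\,dx$$

i.e.

$$\big[y\big]_{x=x_0}^{x=x_1} = \int_{x_0}^{x_1} f(x, y)\,dx
\quad\Rightarrow\quad
y_1 - y_0 = \int_{x_0}^{x_1} f(x, y)\,dx$$

We shall try to approximate the R.H.S. by using the Trapezium Rule. Then

$$y_1 - y_0 \approx \frac{h}{2}\left[f(x_0, y_0) + f(x_1, y_1)\right]$$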

So we still have a problem, since we need to calculate $f(x_1, y_1)$, but we still have no
value for $y_1$.
We therefore use the approximate value of $y_1$ found by the basic Euler Method,
which we now call the predicted value of $y_1$. We denote this by $y_1^*$ and calculate
$f(x_1, y_1^*)$, thus aiming at a better approximation for $y_1$.
We now have a process with two stages to each iteration: a Predictor Stage, during
which we calculate $y_1^*$, and a Corrector Stage, during which we improve the value of
$y_1$ from $y_1^*$. These are:

$$\begin{aligned}
\text{Predictor:}\quad & y_{n+1}^* = y_n + h f(x_n, y_n) \\
\text{Corrector:}\quad & y_{n+1} = y_n + \frac{h}{2}\left[f(x_n, y_n) + f(x_{n+1}, y_{n+1}^*)\right]
\end{aligned}$$

The corrector process can be considered as a variation of the Euler Method in which
an average gradient over the interval [xn , xn+1 ] is used instead of the gradient at xn .


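Using the Modified Euler Method for the same D.E. as before:

$$\frac{dy}{dx} = x + y, \qquad y(0) = 0$$

giving $f(x, y) = x + y$, and taking $h = 0.2$ (predicted values are marked with an asterisk):

$$\begin{aligned}
\text{Pred.}\quad y_1^* &= y_0 + h f(x_0, y_0) = 0 + 0.2(0 + 0) = 0 \\
\text{Corr.}\quad y_1 &= y_0 + \tfrac{h}{2}\left[f(x_0, y_0) + f(x_1, y_1^*)\right] = 0 + 0.1\left[(0 + 0) + (0.2 + 0)\right] = 0.02 \\
\text{Pred.}\quad y_2^* &= y_1 + h f(x_1, y_1) = 0.02 + 0.2(0.2 + 0.02) = 0.064 \\
\text{Corr.}\quad y_2 &= y_1 + \tfrac{h}{2}\left[f(x_1, y_1) + f(x_2, y_2^*)\right] = 0.02 + 0.1\left[(0.2 + 0.02) + (0.4 + 0.064)\right] = 0.0884 \\
\text{Pred.}\quad y_3^* &= y_2 + h f(x_2, y_2) = 0.0884 + 0.2(0.4 + 0.0884) = 0.18608 \\
\text{Corr.}\quad y_3 &= y_2 + \tfrac{h}{2}\left[f(x_2, y_2) + f(x_3, y_3^*)\right] = 0.0884 + 0.1\left[(0.4 + 0.0884) + (0.6 + 0.18608)\right] = 0.2158
\end{aligned}$$

Continuing, we find

$$y_4^* = 0.3790, \quad y_4 = 0.4153; \qquad y_5^* = 0.6584, \quad y_5 = 0.7027$$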

These are, in fact, the same values as have been obtained for this D.E. by the Extended
Euler Method. This would not usually be the case (it would depend on the nature
of the function $f(x, y)$), but it can be shown that the Modified Euler Method is a good
approximation to the Extended Euler Method. Both give errors of the order of $h^3$,
and so the Modified Euler Method is usually preferred since no derivatives have to be
found. It is one of a series of Predictor-Corrector Methods.
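The predictor and corrector stages translate directly into code. The following is a minimal sketch; the function name `modified_euler` and the choice to return every (predicted, corrected) pair are illustrative only.

```python
def modified_euler(f, x0, y0, h, n_steps):
    """Heun's method: Euler predictor followed by a trapezium-rule corrector.

    Returns a list of (predicted, corrected) values, one pair per step.
    """
    x, y = x0, y0
    history = []
    for i in range(n_steps):
        x_next = x0 + (i + 1) * h
        slope = f(x, y)
        y_pred = y + h * slope                            # predictor (basic Euler)
        y = y + 0.5 * h * (slope + f(x_next, y_pred))     # corrector (average gradient)
        history.append((y_pred, y))
        x = x_next
    return history

history = modified_euler(lambda x, y: x + y, 0.0, 0.0, 0.2, 5)
for step, (pred, corr) in enumerate(history, start=1):
    print(f"step {step}: predicted {pred:.4f}, corrected {corr:.4f}")
```

The last line printed corresponds to the hand-computed pair 0.6584 and 0.7027.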


Greater Accuracy ???


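The obvious course to follow would seem to be to use additional terms of the Taylor
series for $y(x)$; adding in one more term, for instance, would give

$$y(x_0 + h) \approx y_0 + h\left[\frac{dy}{dx}\right]_{x_0} + \frac{h^2}{2!}\left[\frac{d^2y}{dx^2}\right]_{x_0} + \frac{h^3}{3!}\left[\frac{d^3y}{dx^3}\right]_{x_0},$$

the error then being of the order of $h^4$.

However, we should need to express $\dfrac{d^3y}{dx^3}$ in terms of $f(x, y)$ and its partial derivatives,
since

$$\frac{d^3y}{dx^3} = \frac{d^2}{dx^2} f(x, y),$$

and this turns out to be

$$\frac{\partial^2 f}{\partial x^2} + 2f\frac{\partial^2 f}{\partial x\,\partial y} + \frac{\partial f}{\partial x}\cdot\frac{\partial f}{\partial y} + f\left(\frac{\partial f}{\partial y}\right)^2 + f^2\cdot\frac{\partial^2 f}{\partial y^2}\ !$$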
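As a quick sanity check of this expression (an illustrative addition, not part of the original derivation), it can be evaluated for the running example $f(x, y) = x + y$, where both first partial derivatives equal 1 and all second partial derivatives vanish:

```latex
\frac{d^3y}{dx^3}
  = \underbrace{0}_{\partial^2 f/\partial x^2}
  + \underbrace{2f \cdot 0}_{2f\,\partial^2 f/\partial x\,\partial y}
  + \underbrace{1 \cdot 1}_{(\partial f/\partial x)(\partial f/\partial y)}
  + \underbrace{f \cdot 1^2}_{f\,(\partial f/\partial y)^2}
  + \underbrace{f^2 \cdot 0}_{f^2\,\partial^2 f/\partial y^2}
  = 1 + x + y .
```

This agrees with the analytic solution $y = -x - 1 + e^{x}$, whose third derivative is $e^{x} = 1 + x + y$.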
We hope, therefore, to be able to find alternative methods which give greater accuracy
than the Modified Euler Method, but without having to cope with all these derivatives
and their evaluation at the various points. This leads us to a consideration of

Runge-Kutta Methods
all of which are derived from a Taylor series for $y(x)$, but in such a way that the benefits
of including more Taylor series terms are preserved without the actual need to find and
evaluate all these derivatives.



The Runge-Kutta Formula of order 4

This is the most frequently used Runge-Kutta method. It gives a high degree of accuracy, is reasonably stable, and is not too
difficult to program or to use in hand calculations.
The formula is

$$y_{n+1} = y_n + \frac{1}{6}\left(k_0 + 2k_1 + 2k_2 + k_3\right)$$

where

$$\begin{aligned}
k_0 &= h f(x_n, y_n) \\
k_1 &= h f\left(x_n + \tfrac{1}{2}h,\; y_n + \tfrac{1}{2}k_0\right) \\
k_2 &= h f\left(x_n + \tfrac{1}{2}h,\; y_n + \tfrac{1}{2}k_1\right) \\
k_3 &= h f(x_n + h,\; y_n + k_2)
\end{aligned}$$

The derivation is somewhat long and difficult!

However, returning again to the D.E.

$$\frac{dy}{dx} = f(x, y)$$

and integrating w.r. to $x$ between the limits $x_0$ and $x_1$, this time using Simpson's Rule
(step length $\frac{h}{2}$) to approximate the integral of the R.H.S. instead of the Trapezium
Rule which we used when seeking the Modified Euler Method, we have

$$y_1 - y_0 \approx \frac{1}{3}\cdot\frac{h}{2}\left[f(x_0, y_0) + 4f\left(x_0 + \tfrac{1}{2}h,\; y(x_0 + \tfrac{1}{2}h)\right) + f(x_1, y_1)\right]$$

$k_1$ and $k_2$ are both estimates of $h f\left(x_0 + \tfrac{1}{2}h,\; y(x_0 + \tfrac{1}{2}h)\right)$, and $k_3$ is an estimate of
$h f(x_1, y_1)$.
The Runge-Kutta Formula of order 4 can therefore be considered as a
numerical integral of the D.E. obtained by using a Simpson's Rule idea.


Use of the 4th order Runge-Kutta method on the same D.E.

$$\frac{dy}{dx} = x + y$$

Again, we take $h = 0.2$, and are given $y(0) = 0$, $f(x, y) = x + y$.
To find $y_1$: starting with $x_0 = 0$, $y_0 = 0$, $x_1 = 0.2$.

$$\begin{aligned}
k_0 &= h f(x_0, y_0) = 0.2(0 + 0) = 0 \\
k_1 &= h f\left(x_0 + \tfrac{1}{2}h,\; y_0 + \tfrac{1}{2}k_0\right) = 0.2 f(0.1, 0) = 0.2(0.1 + 0) = 0.02 \\
k_2 &= h f\left(x_0 + \tfrac{1}{2}h,\; y_0 + \tfrac{1}{2}k_1\right) = 0.2 f(0.1, 0.01) = 0.2(0.1 + 0.01) = 0.022 \\
k_3 &= h f(x_0 + h,\; y_0 + k_2) = 0.2 f(0.2, 0.022) = 0.2(0.2 + 0.022) = 0.0444
\end{aligned}$$

Then $y_1 = 0 + \frac{1}{6}(0 + 2 \times 0.02 + 2 \times 0.022 + 0.0444) = 0.0214$

To find $y_2$: starting with $x_1 = 0.2$, $y_1 = 0.0214$, $x_2 = 0.4$.

$$\begin{aligned}
k_0 &= h f(x_1, y_1) = 0.2(0.2 + 0.0214) = 0.04428 \\
k_1 &= h f\left(x_1 + \tfrac{1}{2}h,\; y_1 + \tfrac{1}{2}k_0\right) = 0.2\left[0.3 + \left(0.0214 + \tfrac{0.04428}{2}\right)\right] = 0.068708 \\
k_2 &= h f\left(x_1 + \tfrac{1}{2}h,\; y_1 + \tfrac{1}{2}k_1\right) = 0.2\left[0.3 + \left(0.0214 + \tfrac{0.068708}{2}\right)\right] = 0.07115 \\
k_3 &= h f(x_2,\; y_1 + k_2) = 0.2\left(0.4 + (0.0214 + 0.07115)\right) = 0.09851
\end{aligned}$$

Then $y_2 = 0.0214 + \frac{1}{6}\left[0.04428 + 2(0.068708 + 0.07115) + 0.09851\right] = 0.09182$

Continuing the process, we find that the values, corrected to 4 significant figures, are
calculated as:

$$y_3 = 0.2221, \qquad y_4 = 0.4255, \qquad y_5 = 0.7183$$

These agree, to 4 significant figures, with the solutions obtained analytically.
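The fourth-order formula is short to program. The sketch below (the name `rk4` and its signature are illustrative assumptions) repeats the calculation for $\frac{dy}{dx} = x + y$, $y(0) = 0$, $h = 0.2$ and compares the final value with the analytic $y(1) = e - 2$.

```python
import math

def rk4(f, x0, y0, h, n_steps):
    """Fourth-order Runge-Kutta; returns [y_0, y_1, ..., y_n]."""
    x, y = x0, y0
    values = [y]
    for i in range(n_steps):
        k0 = h * f(x, y)
        k1 = h * f(x + 0.5 * h, y + 0.5 * k0)
        k2 = h * f(x + 0.5 * h, y + 0.5 * k1)
        k3 = h * f(x + h, y + k2)
        y = y + (k0 + 2.0 * k1 + 2.0 * k2 + k3) / 6.0
        x = x0 + (i + 1) * h
        values.append(y)
    return values

values = rk4(lambda x, y: x + y, 0.0, 0.0, 0.2, 5)
print([round(v, 4) for v in values])
print("difference from analytic y(1):", abs(values[-1] - (math.e - 2.0)))
```

Only five steps are needed here to match the analytic value to 4 significant figures, whereas the basic Euler Method had not reached that accuracy even with 1000 steps.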


APPENDIX
1. D.E.s with Linear Coefficients
These equations can also be converted to separable equations by means of appropriate
substitutions. They have the form:

$$(a_1 x + b_1 y + c_1) + (a_2 x + b_2 y + c_2)\frac{dy}{dx} = 0$$

Case (i) $\dfrac{a_1}{a_2} \ne \dfrac{b_1}{b_2}$

i.e. the lines whose equations are $a_1 x + b_1 y + c_1 = 0$ and $a_2 x + b_2 y + c_2 = 0$ are not
parallel.
We start by solving these two equations, to find the point on the graph where the lines
meet.
If this is the point $x = h$, $y = k$, we make the substitution: $X = x - h$, $Y = y - k$.
This represents a translation of the $x$- and $y$-axes to the point $(h, k)$, so that the lines
meet at the point $X = 0$, $Y = 0$ in the new co-ordinate system. However, since no
directional or scaling transformations are made, there are no changes to any gradients,
and therefore

$$\frac{dY}{dX} = \frac{dy}{dx}.$$

Then substitution of $x = X + h$, $y = Y + k$ into the D.E. gives a homogeneous D.E.
of degree 1 in terms of the new variables $X$ and $Y$, and the D.E. can then be solved
using the techniques for homogeneous D.E.s.

Case (ii) $\dfrac{a_1}{a_2} = \dfrac{b_1}{b_2}$

In this case the lines mentioned above are parallel, so the equations representing the
straight lines have no solution.
We make the substitution: $z = a_1 x + b_1 y$
Differentiating this w.r. to $x$ gives:

$$\frac{dz}{dx} = a_1 + b_1\frac{dy}{dx}$$

Substitution into the D.E. gives a separable D.E. in $z$ and $x$.

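EXAMPLE

$$(4x - y + 9) + (x - y + 3)\frac{dy}{dx} = 0 \qquad \text{Case (i)}$$

First, solving the equations $4x - y + 9 = 0$ and $x - y + 3 = 0$ gives $x = -2$, $y = 1$.

Then let $X = x - (-2) = x + 2$, $Y = y - 1$, so that

$$\frac{dY}{dX} = \frac{dy}{dx}; \qquad x = X - 2, \quad y = Y + 1$$

Substitution of these into the D.E. gives:

$$\big(4(X - 2) - (Y + 1) + 9\big) + \big((X - 2) - (Y + 1) + 3\big)\frac{dY}{dX} = 0$$

or:

$$(4X - Y) + (X - Y)\frac{dY}{dX} = 0 \qquad \text{Homogeneous D.E., degree 1.}$$

Now let $Y = vX$, so that $\dfrac{dY}{dX} = v + X\dfrac{dv}{dX}$.

Then substitution gives: $(4X - vX) + (X - vX)\left(v + X\dfrac{dv}{dX}\right) = 0$

Dividing by $X$: $(4 - v) + (1 - v)\left(v + X\dfrac{dv}{dX}\right) = 0$

$$4 - v + v - v^2 = -(1 - v)X\frac{dv}{dX} \quad\Rightarrow\quad 4 - v^2 = (v - 1)X\frac{dv}{dX}$$

This is now separable, so:

$$\int \frac{1}{X}\,dX = \int \frac{v - 1}{4 - v^2}\,dv = \int \left(\frac{1}{4(2 - v)} - \frac{3}{4(2 + v)}\right)dv$$

Integrating, we obtain:

$$\ln X = -\tfrac{1}{4}\ln(2 - v) - \tfrac{3}{4}\ln(2 + v) + C$$

Substituting back:

$$\ln(x + 2) + \frac{1}{4}\ln\left(\frac{2x - y + 5}{x + 2}\right) + \frac{3}{4}\ln\left(\frac{2x + y + 3}{x + 2}\right) = C$$

EXAMPLE

$$(x + 2y + 1) + (2x + 4y - 3)\frac{dy}{dx} = 0 \qquad \text{Case (ii)}$$

Lines are parallel: let $z = x + 2y$

$$\frac{dz}{dx} = 1 + 2\frac{dy}{dx} \qquad \text{i.e.} \qquad \frac{dy}{dx} = \frac{1}{2}\frac{dz}{dx} - \frac{1}{2}$$

The D.E. becomes:

$$(z + 1) + (2z - 3)\left(\frac{1}{2}\frac{dz}{dx} - \frac{1}{2}\right) = 0$$

$$\frac{1}{2}\frac{dz}{dx}(2z - 3) = -z - 1 + 2z\cdot\frac{1}{2} - \frac{3}{2} = -\frac{5}{2}$$

$$\int (2z - 3)\,dz = -\int 5\,dx \quad\Rightarrow\quad z^2 - 3z = -5x + C$$

Substitute back:

$$(x + 2y)^2 - 3(x + 2y) + 5x = C$$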
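One way to gain confidence in the Case (ii) result is to integrate the D.E. numerically and confirm that $(x + 2y)^2 - 3(x + 2y) + 5x$ stays constant along the solution. The sketch below does this with a fourth-order Runge-Kutta step; the starting point $(0, 0)$, the step length and the function names are all assumptions chosen for illustration.

```python
def slope(x, y):
    # dy/dx rearranged from (x + 2y + 1) + (2x + 4y - 3) dy/dx = 0
    return -(x + 2.0 * y + 1.0) / (2.0 * x + 4.0 * y - 3.0)

def invariant(x, y):
    # Left-hand side of the implicit solution (x + 2y)^2 - 3(x + 2y) + 5x = C
    z = x + 2.0 * y
    return z * z - 3.0 * z + 5.0 * x

def rk4_step(f, x, y, h):
    k0 = h * f(x, y)
    k1 = h * f(x + 0.5 * h, y + 0.5 * k0)
    k2 = h * f(x + 0.5 * h, y + 0.5 * k1)
    k3 = h * f(x + h, y + k2)
    return y + (k0 + 2.0 * k1 + 2.0 * k2 + k3) / 6.0

h, n_steps = 0.01, 20           # integrate from x = 0 to x = 0.2
x, y = 0.0, 0.0                 # assumed starting point, giving C = 0
c_start = invariant(x, y)
for i in range(n_steps):
    y = rk4_step(slope, x, y, h)
    x = (i + 1) * h
drift = abs(invariant(x, y) - c_start)
print(x, y, drift)
```

The range is kept short on purpose: along this particular solution the denominator $2x + 4y - 3$ approaches zero near $x = 0.45$, where the gradient becomes infinite.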

2. The Bernoulli Equation is an important example of an equation that is not linear
as given, but can be made linear by judicious choice of substitution. It takes the form

$$\frac{dy}{dx} + p(x)y = q(x)y^n$$

The standard technique for solving a Bernoulli D.E. is:
(i) Divide the equation through by $y^n$
(ii) Substitute $v = y^{1-n}$ and also make the corresponding substitution for $\dfrac{dy}{dx}$ in terms of $x$ and $v$, obtained by
differentiating $v = y^{1-n}$ w.r. to $x$.

EXAMPLE

$$\frac{dy}{dx} + y = e^{x} y^3$$

This is a Bernoulli equation with $p(x) = 1$, $q(x) = e^{x}$ and $n = 3$.

(i) Divide through by $y^3$:

$$\frac{1}{y^3}\frac{dy}{dx} + \frac{1}{y^2} = e^{x} \qquad (1)$$

(ii) Substitute $v = y^{1-n} = y^{(1-3)} = y^{-2} = \dfrac{1}{y^2}$:

$$\frac{dv}{dx} = -2y^{-3}\frac{dy}{dx} = -2\,\frac{1}{y^3}\frac{dy}{dx}
\quad\Rightarrow\quad
\frac{1}{y^3}\frac{dy}{dx} = -\frac{1}{2}\frac{dv}{dx}$$

So, substituting for the two terms on the L.H.S. of equation (1), we obtain

$$-\frac{1}{2}\frac{dv}{dx} + v = e^{x} \qquad \text{Now linear in } v \text{ and } x.$$

Multiplying through by $(-2)$ to obtain the coefficient of 1 for $\dfrac{dv}{dx}$, we have

$$\frac{dv}{dx} - 2v = -2e^{x}$$

The I.F. $= e^{\int -2\,dx} = e^{-2x}$.

Multiplying through by the I.F. gives

$$e^{-2x}\frac{dv}{dx} - 2e^{-2x}v = -2e^{-x} \qquad \text{or} \qquad \frac{d}{dx}\left(e^{-2x}v\right) = -2e^{-x}$$

Integrating both sides, we have

$$e^{-2x}v = 2e^{-x} + C \qquad \text{or} \qquad v = 2e^{x} + Ce^{2x}$$

Substituting back for $v$: since $v = \dfrac{1}{y^2}$, we have

$$\frac{1}{y^2} = 2e^{x} + Ce^{2x} \quad\Rightarrow\quad y = \pm\frac{1}{\sqrt{2e^{x} + Ce^{2x}}}$$
3. The general Runge-Kutta idea of order m is to use

$$y_{n+1} = y_n + \alpha k_0 + \beta k_1 + \gamma k_2 + \cdots + \omega k_m$$

This is a Runge-Kutta formula of order $m$, and the $k_i$ take the form

$$\begin{aligned}
k_0 &= h f(x_n, y_n) \\
k_1 &= h f(x_n + c_1 h,\; y_n + d_{1,0}k_0) \\
k_2 &= h f(x_n + c_2 h,\; y_n + d_{2,0}k_0 + d_{2,1}k_1) \\
k_3 &= h f(x_n + c_3 h,\; y_n + d_{3,0}k_0 + d_{3,1}k_1 + d_{3,2}k_2) \\
&\ \ \text{etc.}
\end{aligned}$$

The constants are such that $0 < c_1, c_2, \ldots, c_{m-1} < 1$ and $c_m = 1$. This means that the
values of $x$ at which $f(x, y)$ is evaluated in the $k_i$ terms, $x = x_n + c_1 h$, $x = x_n + c_2 h$,
etc., lie between $x_n$ and $x_{n+1}$, with $x_n + c_m h$ being $x_{n+1}$.
The $k_i$ terms are increments in the value of $y$. The constants $d_{1,0}$, $d_{2,0}$, $d_{2,1}$ etc.
are chosen to give reasonable estimates for $y$ corresponding to the $x$-values $x_n$, $x_n + c_1 h$,
$x_n + c_2 h$ etc.
The sum of the constants $\alpha$, $\beta$, $\gamma$, etc. is 1.

This means that the expression

$$\alpha k_0 + \beta k_1 + \gamma k_2 + \cdots + \omega k_m$$

is $h \times$ a weighted average of estimates of the gradient of the function $y(x)$ at a series
of points in the interval $[x_n, x_{n+1}]$. If these estimates are good, we can see that this
refines the idea used in the Modified Euler Method, and would be expected to give
good results.
The Runge-Kutta Formula of order 1 is clearly just the basic Euler Method. We have
seen that this is unsatisfactory on its own, but it gives a start for the refining process
to build on.
Consider the Runge-Kutta formula of order 2.

$$y_{n+1} = y_n + \alpha k_0 + \beta k_1 = y_n + \alpha h f(x_n, y_n) + \beta h f(x_n + c_1 h,\; y_n + d_{1,0}k_0)$$

If we take $\alpha = \beta = \dfrac{1}{2}$, $c_1 = 1$, $d_{1,0} = 1$, we obtain

$$y_{n+1} = y_n + \frac{h}{2}\left[f(x_n, y_n) + f\big(x_n + h,\; y_n + h f(x_n, y_n)\big)\right]$$

or

$$y_{n+1} = y_n + \frac{h}{2}\left[f(x_n, y_n) + f(x_{n+1}, y_{n+1}^*)\right]$$

where $y_{n+1}^* = y_n + h f(x_n, y_n)$ is the predicted value, i.e. the Modified Euler Method.
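To make the role of the constants concrete, here is a minimal sketch of a general two-stage Runge-Kutta step in which the weights and the constants $c_1$, $d_{1,0}$ are passed in as parameters. The names `rk2_step`, `solve`, `alpha`, `beta`, `c1` and `d10` are illustrative assumptions. With both weights equal to $\frac{1}{2}$ and $c_1 = d_{1,0} = 1$ it reproduces the Modified Euler value for $\frac{dy}{dx} = x + y$, $y(0) = 0$, $h = 0.2$.

```python
def rk2_step(f, x, y, h, alpha, beta, c1, d10):
    """One step of y_{n+1} = y_n + alpha*k0 + beta*k1 for a two-stage formula."""
    k0 = h * f(x, y)
    k1 = h * f(x + c1 * h, y + d10 * k0)
    return y + alpha * k0 + beta * k1

def solve(f, x0, y0, h, n_steps, alpha, beta, c1, d10):
    """Repeat rk2_step n_steps times, starting from (x0, y0); returns the final y."""
    y = y0
    for i in range(n_steps):
        y = rk2_step(f, x0 + i * h, y, h, alpha, beta, c1, d10)
    return y

# Weights 1/2 and 1/2 with c1 = d10 = 1: the Modified Euler Method
y_end = solve(lambda x, y: x + y, 0.0, 0.0, 0.2, 5, 0.5, 0.5, 1.0, 1.0)
print(round(y_end, 4))  # 0.7027
```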
