
MATH49111/MATH69111

Mini Projects Part 1


Contents

Guidelines

Deadlines

1 Ordinary Differential Equations
1.1 Solving ODE Boundary Value Problems
1.1.1 The initial value problems for ODEs
1.1.2 Boundary Value Problems
1.1.3 Taking a guess
1.1.4 Newton's shooting method
1.2 Coding, Examples and Exercises
1.2.1 Creating a Math vector from the standard library
1.2.2 Protocol for ODE function
1.2.3 ODE solver function
1.2.4 Implementing the Newton shooting method

2 Volatility
2.1 Background Theory
2.1.1 Statistical Estimates
2.1.2 Generating Option Prices
2.1.3 Implied Volatility
2.2 Importing Data
2.2.1 Creating a Math vector from the standard library
2.2.2 Importing data sets
2.3 Numerical Integration Methods for Option Pricing
2.3.1 Integration
2.3.2 Quadrature
2.3.3 Object Orientated Approach
2.4 Implied Volatilities
2.5 Report

3 CG Method
3.1 Iterative methods for solving a Matrix equation
3.1.1 The Conjugate Gradient Method
3.1.2 Formulation
3.2 Coding, Examples and Exercises
3.2.1 Creating a Math vector from the standard library
3.2.2 More Useful Vector Functions
3.2.3 Creating a Matrix class
3.2.4 Implementing the Conjugate Gradient method
3.2.5 A New Type of Matrix
3.3 Huge Systems and Sparse Matrices
3.4 Report

4 Sorting
4.1 Sorting and searching
4.1.1 The sorting problem
4.1.2 Bubblesort
4.1.3 Quicksort
4.1.4 Heapsort
4.2 Coding, Examples and Exercises
4.2.1 Creating our own vector class from the standard library
4.2.2 Creating initial data
4.2.3 Implementing the sorting algorithms
4.2.4 Abstracting the sorting algorithms
4.2.5 Sorting in the STL (optional)
4.3 The game of life using a CoordinateArray (optional)
4.4 Report

Guidelines

Choose only one project from the four possible projects in this booklet to complete and hand
in. If you are on the MSc in Applied it is recommended that you choose either project 1 or
project 3. If you are specialising in Numerical Analysis, then project 3 is recommended. If you
are on the MSc QFFE you are encouraged to choose project 2, Historical vs. Implied Volatility.
Projects are written to be self-contained but you may contact the lecturer (Dr. Martin Lotz)
or the author (Dr. Paul Johnson) for clarification on the material. Some initial code has been
published on the website to help you get started. It is recommended that you study this code
in detail. You may also try to solve the problems without object-oriented concepts first, and
apply these later to see their benefits.
This is not a group work assignment and all code and text must be written by you individually.
Both the report and the code will be run through the TurnItIn system to automatically check
for similarities between your work and that of others on the course. Please see the university
guidelines on plagiarism.
All reports should be submitted in typed form using LaTeX (preferred) or a word processing
package and contain a title page with your student ID number. The submission should take
place online at the appropriate section of the university Blackboard system. Any code used to
generate results in the report should be included in the appendix; you may also be asked to
supply electronic copies of the code. Anonymous marking is used for marking the reports so
please do not include any names which may be used to identify the author of the report. The
written part of the report should not exceed 25 pages in length and the font size should not be
less than 11pt.
The report should be well structured containing

• an introduction;

• a section discussing the problem formulation;

• a description of the numerical techniques used;

• a results and conclusions section.

Any articles cited need to be included in a references section. Figures can be included
anywhere in the text, but not before they are referenced, and should be clearly labelled. All
figures in the report need to be cited in the text somewhere. Small portions of code listings may
be included in the text for illustrative purposes but the full code should be left in the appendices.

The report will be marked in accordance with the criteria as described in the attached marking
scheme handout. Plagiarism is taken very seriously, so unless the project description mentions
group working, you need to work independently. Do not copy/paste code or text from
other people's work. Reports containing results that do not match what the listed code
would produce will be severely penalised.

Mark Scheme
Presentation (max 20)
The report should be well structured with an adequate introduction of the problem being solved,
a problem formulation section, results and conclusions section. It should be largely free from
typographical and grammatical errors. The figures and tables should be properly referenced
and labelled with suitable captions. References should be given in detail.

Content (max 45)


The report should be factually accurate and the mathematical language precise and clear. Is
there a clear description of the project with a clear outcome to be achieved? Any conclusions
made need to be well supported and coherent. For example, if the solution obtained is claimed
to be accurate, what grid size checks have been made, or what evidence is presented to validate
the code? Is the original problem solved correctly, i.e., are the results credible? Does the report
give evidence of the methods used to solve the problem and of their correct implementation? Is
there evidence of an understanding and competent application of the range of techniques and
methods used in the project, and evidence of technical skills? Does the student understand the
results and interpret them in the correct manner? Overall, does the candidate understand the
meaning, context and significance of the work being presented, or are there huge gaps?

Coding (max 35)


Is the code easy to understand, with explanatory comments? Is the code easy to read, in
separate files with short functions? Are there any obvious memory leaks? Is there evidence
of OO programming, code reuse, use of standard/external libraries? Has the problem been
approached in a novel manner with original ideas?

Deadlines

• Your code should be tested at the lab session, 12-2pm on Friday 15th November 2013.

• The written report for the first mini project is due in by 5pm on Monday 18th November 2013.

Mini Project 1

Ordinary Differential Equations

Author: Dr. Paul Johnson, Paul.Johnson-2@manchester.ac.uk

In this project we shall use the Newton shooting method to solve ODEs. We will be looking
at ODEs that have boundary conditions at both ends, making them boundary value problems.
It will be assumed that students have a working code to solve initial value ODE problems using
the Euler method, and that they can output values to a file and plot results.
The aim of the project is to use OO programming to write generic code that can solve
a given boundary value problem. We shall use concepts such as encapsulation, inheritance and
polymorphism. Other techniques developed here will include using a layered approach to make
use of a standard template, and making a protocol class.

1.1 Solving ODE Boundary Value Problems


1.1.1 The initial value problems for ODEs

The initial value problem solves an ordinary differential equation of the type

\[ \frac{dy}{dx} = f(x, y), \qquad a \le x \le b, \]

subject to an initial condition

\[ y(a) = \alpha. \]

Any higher order ODE may be reduced to a set of first order ODEs. As such the general system
may be written

\[ \frac{dY}{dx} = F(x, Y), \qquad a \le x \le b, \tag{1.1} \]

where

\[ Y = (y_1(x), y_2(x), \ldots, y_n(x))^T, \qquad F = (f_1(x, Y), f_2(x, Y), \ldots, f_n(x, Y))^T, \]

with initial data

\[ Y(a) = \alpha, \qquad \alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n)^T. \tag{1.2} \]

1.1.2 Boundary Value Problems


All ODEs and PDEs require boundary conditions in order that a unique solution may be
determined. In initial value problems, the boundary conditions are all on one side, but this is
not the case for every problem. For instance, take the following:

\[ \frac{d^2 y}{dx^2} + \kappa \frac{dy}{dx} + xy = 0, \tag{1.3} \]

with the boundary conditions

\[ y(0) = 0, \qquad y(1) = 1. \tag{1.4} \]

Clearly we now have a problem with conditions at both ends. However, we do not want to
abandon all the methods for solving initial value problems, some of which are extremely accurate
and efficient. So how can we match conditions at both ends?
First let us rewrite the problem above as a system of first order ODEs

\begin{align}
Y_1 &= y(x); \tag{1.5} \\
Y_2 &= \frac{dy}{dx}; \tag{1.6} \\
\frac{dY_1}{dx} &= Y_2; \tag{1.7} \\
\frac{dY_2}{dx} &= -\kappa Y_2 - x Y_1, \tag{1.8}
\end{align}

so that the boundary conditions are now written:

\[ Y_1(x = 0) = 0, \qquad Y_1(x = 1) = 1. \]

In order to solve the problem by marching through x we need to assign a value to Y2 (x = 0).
But how to choose a value of Y2 ? Well we know that our choice must satisfy the boundary
condition at x = 1. The Newton shooting method gives us an iterative algorithm to find the
perfect guess.

1.1.3 Taking a guess


Let us start by making a guess, g, to Y2(0) so that the initial conditions now become

\[ Y(0) = \begin{pmatrix} Y_1(0) \\ Y_2(0) \end{pmatrix} = \begin{pmatrix} 0 \\ g \end{pmatrix}. \]

Then we may solve (1.7) and (1.8) with these initial conditions using your favourite method, to
get a solution at x = 1

\[ Y(1) = \begin{pmatrix} Y_1(1) \\ Y_2(1) \end{pmatrix} = \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}. \]

It is now possible, by comparing the value β1 to our boundary condition, to see how good our
guess at the initial condition was.
In order to make our guess better, we want to know whether we have shot above or below.
Let us define the amount by which we have shot above or below the boundary condition as

\[ \phi(g) = Y_1(x = 1; g) - Y_1^{BC}(x = 1) = \beta_1 - 1, \tag{1.9} \]

where β1 is our solution and 1 is the required boundary condition. Since φ is just a function of
g (remember g = Y2(x = 0)) and the boundary condition is satisfied when φ = 0, the problem
reduces to the classic root finding problem. We should already know of an algorithm to solve
this problem: Newton's root finding algorithm. We also know that this method converges
quadratically near a simple root and is easy to implement.
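
As a reminder of the root finding step, here is a minimal sketch of Newton's method for a scalar function. The name newtonRoot, its arguments and the default tolerance are illustrative assumptions, not part of the course code; in the shooting method the two callables would be replaced by values obtained from an ODE solve.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Newton's method for a scalar problem phi(g) = 0 (illustrative sketch).
// phi and dphi are supplied by the caller; on entry g holds the first guess,
// on exit the latest iterate. Returns true if |phi(g)| < tol was reached.
bool newtonRoot(const std::function<double(double)>& phi,
                const std::function<double(double)>& dphi,
                double& g, double tol = 1e-12, int maxIter = 50)
{
    assert(maxIter > 0);
    for (int n = 0; n < maxIter; n++)
    {
        const double value = phi(g);
        if (std::fabs(value) < tol) return true;  // converged
        const double slope = dphi(g);
        if (slope == 0.0) return false;           // Newton step undefined
        g = g - value / slope;                    // g_{n+1} = g_n - phi/phi'
    }
    return std::fabs(phi(g)) < tol;
}
```

For example, calling newtonRoot with phi(t) = t*t - 2 and dphi(t) = 2*t from the guess g = 1 should return true with g close to the square root of 2.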

1.1.4 Newton's shooting method

Newton's shooting method combines the root finding algorithm with an initial value ODE solver
to calculate the solution to boundary value problems. After starting with some initial guess at
the initial condition, the formula to find a new guess may be written as

\[ g_{n+1} = g_n - \frac{\phi(g_n)}{\phi'(g_n)}. \]

We have demonstrated above that once a guess at the initial condition has been made, it is
possible to generate the function φ(g). But we still need to know what φ'(g) is. If we differentiate
(1.9) with respect to g, we get

\[ \frac{d\phi}{dg} = \left. \frac{dY_1}{dg} \right|_{x=1}. \tag{1.10} \]

So one method would be to differentiate the original ODE with respect to g to get a new initial
value problem for φ'.
Consider that ODE (1.3) may be written as:

\begin{align}
y'' &= -\kappa y' - xy, \tag{1.11} \\
y'' &= F(x, y, y'). \tag{1.12}
\end{align}

We can differentiate (1.12) with respect to the guess g, using the chain rule

\[ \frac{dy''}{dg} = \frac{\partial F}{\partial x}\frac{dx}{dg} + \frac{\partial F}{\partial y}\frac{dy}{dg} + \frac{\partial F}{\partial y'}\frac{dy'}{dg}. \tag{1.13} \]

Now define Z1 = dy/dg and Z2 = dy'/dg; then the set of first order ODEs and initial conditions
satisfied by Z1 and Z2 are

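\begin{align}
\frac{dZ_1}{dx} &= Z_2, \tag{1.14} \\
\frac{dZ_2}{dx} &= -\kappa Z_2 - x Z_1, \tag{1.15}
\end{align}

and

\begin{align}
Z_1(x = 0) &= \frac{d}{dg} Y_1(x = 0) = 0, \tag{1.16} \\
Z_2(x = 0) &= \frac{d}{dg} Y_2(x = 0) = 1. \tag{1.17}
\end{align}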

Then we may recover φ' from the solution to the initial value problem for Z since

\[ \phi'(g) = \left. \frac{dY_1}{dg} \right|_{x=1} = Z_1(x = 1). \]
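
The step from the chain rule (1.13) to the system (1.14)-(1.15) can be spelled out for this particular F. The following is a worked sketch, assuming that the solution depends smoothly on g so that derivatives with respect to x and g may be interchanged:

```latex
\begin{align*}
\frac{\partial F}{\partial y} &= -x, \qquad
\frac{\partial F}{\partial y'} = -\kappa, \qquad
\frac{dx}{dg} = 0, \\
\frac{dZ_1}{dx} &= \frac{d}{dx}\frac{dy}{dg} = \frac{dy'}{dg} = Z_2, \\
\frac{dZ_2}{dx} &= \frac{d}{dx}\frac{dy'}{dg} = \frac{dy''}{dg}
  = -x\,\frac{dy}{dg} - \kappa\,\frac{dy'}{dg} = -\kappa Z_2 - x Z_1 .
\end{align*}
```

Since x is the independent variable and does not depend on the guess, the first term of (1.13) drops out and only the derivatives of F with respect to y and y' contribute.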

1.2 Coding, Examples and Exercises


1.2.1 Creating a Math vector from the standard library

Here we shall use the standard vector class to create a new vector class so that we can add
vectors and multiply them by scalars. Putting extra work into making this class will enable our
integrator methods to be written as we would write them in maths.
Copy the following class definition for the new class MVector into a header file or at the
top of your main code.

    #include <vector>

    // class MVector contains arrays that can work with doubles
    class MVector
    {
        // storage for the new vector class
        std::vector<double> v;
    public:
        // constructor
        explicit MVector() {}
        explicit MVector(int n) : v(n) {}
        explicit MVector(int n, double x) : v(n, x) {}
        // equate vectors;
        MVector& operator=(const MVector& X)
        { if (&X == this) return *this; v = X.v; return *this; }
        // access data in vector
        double& operator[](int index) { return v[index]; }
        // access data in vector (const)
        double operator[](int index) const { return v[index]; }
        // size of vector
        int size() const { return v.size(); }
    }; // end class MVector

So far so good. The class MVector will act in exactly the same way as a std::vector, except
that we do not have access to all the public functions of the std::vector, and we have explicitly
chosen double as the data type stored in the array.
Now for this to be of any use we must overload the operators +-/* to work with our new
MVector class. We shall place the function declaration outside the class definition but inside
the header file. A typical declaration will look like

    // scalar mult vector
    MVector operator*(const double& lhs, const MVector& rhs);

and the implementation can be placed in a different file

    MVector operator*(const double& lhs, const MVector& rhs)
    {
        MVector temp(rhs);
        for (int i = 0; i < temp.size(); i++) temp[i] *= lhs;
        return temp;
    }

Tasks:

1. There are five operators we need. Remember that we can multiply/divide vector by a
scalar, add/subtract vectors, but can’t add/subtract a scalar to a vector. What are the
five operators that we need?

2. Write the function definitions and implementations into your code.

3. Check that the code is working by evaluating the following using MVectors to represent
u, v, w and x:

\[ u = 4.7v + 1.3w - 6.7x \]

where v = (0.1, 4.8, 3.7), w = (3.1, 8.5, 3.6) and x = (5.8, 7.4, 12.4).

4. Try other combinations of additions/multiplications and see what happens. What happens
when you try to add a double to a vector? What happens if you try again but remove the
explicit keyword from the constructors?

5. When adding two vectors check they conform and exit with an error if they do not.

6. You could also try overloading the << operator to output a vector in the form (v[0], v[1], ..., v[n-1]):

    // Overload the << operator to output MVectors to screen or file
    std::ostream& operator<<(std::ostream& os, const MVector& v)
    {
        int n = v.size();
        os << "(";               // write to os, not cout, so files work too
        for (int i = 0; i < n; i++)
        {
            os << v[i];
            if (i < n - 1) os << ", ";
        }
        os << ")";
        return os;
    }

7. Think about error checking. What happens if the vectors we try to add are not the same
size?
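
To illustrate the conformity check mentioned in the tasks, here is one possible sketch of vector addition alongside the scalar multiplication from the text. The compact MVector below mirrors the class above; the bounds assertion in operator[] and the wording of the error message are assumptions added for illustration, and this is only one of several reasonable ways to report the error.

```cpp
#include <cassert>
#include <cstdlib>
#include <iostream>
#include <vector>

// Compact copy of the MVector interface from the text, with an added
// bounds assertion (an assumption, not in the original listing).
class MVector
{
    std::vector<double> v;
public:
    explicit MVector() {}
    explicit MVector(int n) : v(n) {}
    explicit MVector(int n, double x) : v(n, x) {}
    double& operator[](int index)
    { assert(index >= 0 && index < size()); return v[index]; }
    double operator[](int index) const
    { assert(index >= 0 && index < size()); return v[index]; }
    int size() const { return static_cast<int>(v.size()); }
};

// vector + vector: exits with an error message if the sizes differ
MVector operator+(const MVector& lhs, const MVector& rhs)
{
    if (lhs.size() != rhs.size())
    {
        std::cerr << "MVector operator+: size mismatch ("
                  << lhs.size() << " vs " << rhs.size() << ")\n";
        std::exit(EXIT_FAILURE);
    }
    MVector temp(lhs);
    for (int i = 0; i < temp.size(); i++) temp[i] += rhs[i];
    return temp;
}

// scalar * vector, as in the text
MVector operator*(const double& lhs, const MVector& rhs)
{
    MVector temp(rhs);
    for (int i = 0; i < temp.size(); i++) temp[i] *= lhs;
    return temp;
}
```

With these two operators an expression such as a + 2.0*b compiles and reads as it would in maths, while adding vectors of different lengths stops the program with a message rather than reading past the end of the shorter vector.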

1.2.2 Protocol for ODE function
In this section we develop a protocol for the function F (x, Y ) from (1.1) using pure virtual
functions. The class MFunction will basically be a definition of the function used to provide an
interface. The class is defined entirely as follows:
    struct MFunction {
        virtual MVector operator()(const double& x,
                                   const MVector& y) = 0;
    };

• This is the C++ replacement for function pointers.

• A struct is a class where all members are public by default.

• The definition of operator() is a pure virtual definition, because of the syntax "=0" at
the end of the line.

• We can only inherit from classes with pure virtual functions; we cannot declare objects of
such a class, since the pure virtual functions have no implementation.

Example:

Use inheritance to generate a new class that implements the following

\[ F(x, Y) = \begin{pmatrix} Y_1 + xY_2 \\ xY_1 - Y_2 \end{pmatrix}, \]

and evaluate the following

\[ v = F(2, Y) \]

where

\[ Y = \begin{pmatrix} 1.4 \\ -5.7 \end{pmatrix}. \]

Solution:

The function class is written as:

    class TestFunction : public MFunction
    {
    public:
        // function
        MVector operator()(const double& x, const MVector& y)
        {
            MVector temp(2);
            temp[0] = y[0] + x*y[1];
            temp[1] = x*y[0] - y[1];
            return temp;
        }
    };

In the main function we have (assuming that << has been overloaded)

    MVector v, y(2);           // initialise y with 2 elements
    TestFunction f;            // f has order 2 by definition
    y[0] = 1.4; y[1] = -5.7;   // assign element values in y
    v = f(2., y);              // evaluate function f as required
    std::cout << "v:: " << v << " y:: " << y << "\n";

and the output is

    v:: (-10, 8.5) y:: (1.4, -5.7)

Tasks:

1. Copy the program above and get it to compile and run. If you have not overloaded <<
you will have to output v and y element by element.

2. Declare another MVector u with 2 elements and set them to 1 and 2 respectively. Now
let v be defined by the expression

\[ v = u + F(2, Y). \]

Can this be written as seen (i.e. v = u + f(2.,y))? Calculate the result by hand to check
your code.

3. Now declare doubles h = 0.1 and x = 0.5, and evaluate

\[ v = u + hF(x, u + hY). \]

Again try to write this in one line of code. Calculate the result by hand to check your
code.

1.2.3 ODE solver function


Below is the declaration of a function that can be used to solve ODEs. In order to solve an
initial value ODE problem we need to know the initial conditions, the start point in x, the
number of steps, and the function f (x, y) for which we are solving. On entry the arguments to
this function contain all of those elements, and on return the solution can be stored inside the
vector y. In this section you must complete the definition of this function.
    // Declaration of an Euler scheme ODE solver function
    int eulerSolve(int steps, double a, double b, MVector &y, MFunction &f);

On entry to the function

• steps :- number of steps in the problem

• a :- initial value of x

• b :- final value of x

• y :- the initial value of y(x = a)

• f :- the function defining the problem we are solving

On exit from the function:

• y :- the solution y(x = b)

• return value :- integer that can give information about any errors that have occurred.

Tasks:

1. Write the declaration for this function and an empty definition.

2. Now fill in the definition of the function. This piece of code should carry out the following
algorithm:

(a) Declare and initialise the value of x.

(b) Declare and calculate the step size h.

(c) Loop over the number of steps and update x and Y according to the algorithm

\[ x_i = a + ih, \qquad Y_{i+1} = Y_i + hF(x_i, Y_i), \]

for i = 0, 1, ..., steps − 1.

3. Write a new function inheriting MFunction to evaluate the following

\[ F(x, Y) = \begin{pmatrix} x \\ Y_2 \end{pmatrix}. \]

Then use the function eulerSolve to solve the initial value problem

\[ \frac{dY}{dx} = F(x, Y), \quad \text{with} \quad Y(x = 0) = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \]

on the interval x ∈ [0, 1]. The exact solution is

\[ Y(x = 1) = \begin{pmatrix} 0.5 \\ e \end{pmatrix}. \]

Create a table containing your values of Y1(x = 1) and Y2(x = 1) for different numbers of
steps from n = 10 up to n = 100 in steps of ten.

4. Next write functions to solve the ODE using the midpoint method and 4th order Runge-
Kutta method.

(a) Use the previous example as a template for your declaration and definition.

(b) The midpoint method is given by the recurrence relation

\[ x_i = a + ih, \qquad Y_{i+1} = Y_i + hF\left(x_i + \tfrac{1}{2}h,\; Y_i + \tfrac{1}{2}hF(x_i, Y_i)\right), \]

for i = 0, 1, ..., steps − 1.

(c) The 4th order Runge-Kutta integrator method may be expressed as

\begin{align*}
x_i &= a + ih, \\
k_1 &= hF(x_i, Y_i), & k_2 &= hF\left(x_i + \tfrac{h}{2},\; Y_i + \tfrac{k_1}{2}\right), \\
k_3 &= hF\left(x_i + \tfrac{h}{2},\; Y_i + \tfrac{k_2}{2}\right), & k_4 &= hF(x_i + h,\; Y_i + k_3), \\
Y_{i+1} &= Y_i + \tfrac{1}{6}\left[k_1 + 2k_2 + 2k_3 + k_4\right],
\end{align*}

for i = 0, 1, ..., steps − 1.

5. Test the methods against each other on the test problem.

6. Now consider the following ODE:

\[ \frac{d^2 y}{dx^2} = \frac{1}{8}\left(32 + 2x^3 - y\frac{dy}{dx}\right), \tag{1.18} \]

on the interval x ∈ [1, 3] with the initial conditions

\[ y(x = 1) = 17, \qquad y'(x = 1) = 1. \]

(a) Write the ODE as a system of first order ODEs. (Hint: Write Y1 = y and Y2 = y'.)
(b) Derive the function F, and write a new function (which inherits MFunction) to
represent it.

7. Think about error checking in your code. What happens if the sizes of y and f are different?
What can you do?

8. Now include an optional print statement within the solver functions to output values of
Yi, xi for all i to a file.

Report:

• For the ODE stated in (1.18), in your report briefly state the problem, and comment on
the accuracy of the numerical methods on the solution of this equation.
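
To make the difference in accuracy between the schemes concrete, the sketch below applies the Euler and 4th order Runge-Kutta recurrences to the test problem F(x, Y) = (x, Y2)^T. It is written with plain std::vector and std::function rather than the MVector and MFunction interface, and the names State, Rhs, axpy, eulerSketch, rk4Sketch and testRhs are all illustrative assumptions; treat it as one possible arrangement, not the required solution.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

typedef std::vector<double> State;
typedef std::function<State(double, const State&)> Rhs;

// returns y + c*k (stands in for the overloaded MVector operators)
State axpy(const State& y, double c, const State& k)
{
    assert(y.size() == k.size());
    State out(y);
    for (std::size_t i = 0; i < out.size(); i++) out[i] += c * k[i];
    return out;
}

// Euler: Y_{i+1} = Y_i + h F(x_i, Y_i); y holds Y(a) on entry, Y(b) on exit
void eulerSketch(int steps, double a, double b, State& y, const Rhs& f)
{
    assert(steps > 0);
    const double h = (b - a) / steps;
    for (int i = 0; i < steps; i++)
    {
        const double x = a + i * h;
        y = axpy(y, h, f(x, y));
    }
}

// classical RK4; here k1..k4 are slopes, so the factor h appears in the update
void rk4Sketch(int steps, double a, double b, State& y, const Rhs& f)
{
    assert(steps > 0);
    const double h = (b - a) / steps;
    for (int i = 0; i < steps; i++)
    {
        const double x = a + i * h;
        const State k1 = f(x, y);
        const State k2 = f(x + 0.5 * h, axpy(y, 0.5 * h, k1));
        const State k3 = f(x + 0.5 * h, axpy(y, 0.5 * h, k2));
        const State k4 = f(x + h, axpy(y, h, k3));
        y = axpy(y, h / 6.0, k1);
        y = axpy(y, h / 3.0, k2);
        y = axpy(y, h / 3.0, k3);
        y = axpy(y, h / 6.0, k4);
    }
}

// test problem from the text: F(x, Y) = (x, Y2)^T
State testRhs(double x, const State& y)
{
    State out(2);
    out[0] = x;
    out[1] = y[1];
    return out;
}
```

Starting both solvers from Y(0) = (0, 1)^T on [0, 1] and comparing Y(1) with the exact values (0.5, e) for a range of step counts gives the kind of table the tasks ask for. The Euler error should shrink roughly in proportion to h, and the Runge-Kutta error much faster.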

1.2.4 Implementing the Newton shooting method


Example:

Solve the BVP defined in (1.3) and (1.4) with κ = 1.

Solution:

The function F is given by

    class TestFunction : public MFunction
    {
        double kappa;
    public:
        // constructor to initialise kappa
        TestFunction() { kappa = 1.; }
        // function
        MVector operator()(const double& x, const MVector& y)
        {
            MVector temp(4);
            temp[0] = y[1];
            temp[1] = -kappa*y[1] - x*y[0];
            temp[2] = y[3];
            temp[3] = -kappa*y[3] - x*y[2];
            return temp;
        }
        void setKappa(double k) { kappa = k; } // change kappa
    };

and in the main code we have something like


    TestFunction f;
    MVector y(4);
    // starting guess and tolerance below are illustrative values only
    double guess = 0., phi, phidash, tol = 1.e-8;
    for (int newton = 0; newton < 100; newton++)
    {
        // set up initial conditions
        y[0] = 0.; y[1] = guess; y[2] = 0.; y[3] = 1.;
        rungeKuttaSolve(100, 0., 1., y, f);  // solve
        phi = y[0] - 1.;                     // check against BC
        phidash = y[2];                      // phidash = z_1(x=1)
        if (std::fabs(phi) < tol) break;     // exit if condition satisfied
        guess = guess - phi/phidash;
    }

You will require the cmath library to access the std::fabs function. (An unqualified abs may
resolve to the integer version on some compilers, which would truncate phi.)
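
For reference, the loop above can be assembled into a self-contained sketch that does not depend on MVector or MFunction. Everything here (State4, rhs, axpy, integrate, shoot, the tolerance and the step count) is an illustrative assumption; it solves only the linear example (1.3)-(1.4) and is not a template for the assessed problem.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

typedef std::array<double, 4> State4;   // (Y1, Y2, Z1, Z2)

// right-hand side of the augmented system (1.7), (1.8), (1.14), (1.15)
State4 rhs(double x, const State4& y, double kappa)
{
    State4 out;
    out[0] = y[1];
    out[1] = -kappa * y[1] - x * y[0];
    out[2] = y[3];
    out[3] = -kappa * y[3] - x * y[2];
    return out;
}

// returns y + c*k
State4 axpy(const State4& y, double c, const State4& k)
{
    State4 out;
    for (std::size_t i = 0; i < out.size(); i++) out[i] = y[i] + c * k[i];
    return out;
}

// march from x = a to x = b with classical RK4 (k's are slopes)
State4 integrate(int steps, double a, double b, State4 y, double kappa)
{
    assert(steps > 0);
    const double h = (b - a) / steps;
    for (int i = 0; i < steps; i++)
    {
        const double x = a + i * h;
        const State4 k1 = rhs(x, y, kappa);
        const State4 k2 = rhs(x + 0.5 * h, axpy(y, 0.5 * h, k1), kappa);
        const State4 k3 = rhs(x + 0.5 * h, axpy(y, 0.5 * h, k2), kappa);
        const State4 k4 = rhs(x + h, axpy(y, h, k3), kappa);
        for (std::size_t j = 0; j < y.size(); j++)
            y[j] += h / 6.0 * (k1[j] + 2.0 * k2[j] + 2.0 * k3[j] + k4[j]);
    }
    return y;
}

// Newton shooting for y(0) = 0, y(1) = 1. Returns true on convergence and
// leaves the converged slope y'(0) in guess.
bool shoot(double& guess, double kappa, double tol = 1e-10, int maxIter = 100)
{
    for (int newton = 0; newton < maxIter; newton++)
    {
        const State4 y0 = {{0.0, guess, 0.0, 1.0}};
        const State4 y1 = integrate(100, 0.0, 1.0, y0, kappa);
        const double phi = y1[0] - 1.0;     // miss distance at x = 1
        const double phidash = y1[2];       // Z1(x = 1)
        if (std::fabs(phi) < tol) return true;
        if (phidash == 0.0) return false;   // cannot take a Newton step
        guess = guess - phi / phidash;
    }
    return false;                           // loop exhausted without converging
}
```

Because (1.3) is linear in y, φ is a linear function of g, so a single Newton update should already land on the root up to rounding. Nonlinear problems generally need several iterations, which is why the loop and the failure return value are still worth having.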

Tasks:

Consider now ODE (1.18) with the boundary conditions

\[ y(x = 1) = 17, \qquad y(x = 3) = \frac{43}{3}. \]

1. Consider that (1.18) may be written as:

\begin{align}
y'' &= \frac{1}{8}\left(32 + 2x^3 - yy'\right), \tag{1.19} \\
y'' &= F(x, y, y'). \tag{1.20}
\end{align}

(a) Differentiate (1.18) with respect to the guess g, using the chain rule.

\[ \frac{dy''}{dg} = \frac{\partial F}{\partial x}\frac{dx}{dg} + \frac{\partial F}{\partial y}\frac{dy}{dg} + \frac{\partial F}{\partial y'}\frac{dy'}{dg}. \tag{1.21} \]

(Hint: dx/dg = 0.)
(b) Let us set z = dy/dg, then write down the set of first order ODEs and initial conditions
satisfied by z.

\[ \frac{d}{dx} z = f(x, y, z) \]

(c) Alter your code so as to solve for z and y simultaneously. (Hint: You now have a 4
element system as in the example.)
(d) Alter your code to iterate toward the correct solution, using the Newton method,
given by

\[ g_{n+1} = g_n - \frac{\phi(g_n)}{\phi'(g_n)}, \qquad \phi'(g_n) = z(3; g_n). \]

2. Check your code against the exact solution, y(x) = x^2 + 16/x and y'(x = 1) = −14.

3. Think about error checking. What happens if we reach the end of the loop and a solution
has not been found? What information can you give back to the user?

Marks will be awarded for clarity and correctness of code as well as answers to the questions
and discussion.

Mini Project 2

Historic vs Implied Volatility

Author: Dr. Paul Johnson, Paul.Johnson-2@manchester.ac.uk

In the first part of the project we shall learn how to input data into your C++ programs
from a *.csv file, then process the data with some simple statistical algorithms. In the
second part, we shall use analytical formulae along with the Newton-Secant method to derive
implied volatilities for option prices. How closely do the implied volatilities match the historical
ones from the data?
The aim of the project is to use OO programming to write generic code that can manipulate
data, solve for option prices and calibrate parameters. We shall use concepts such as
encapsulation, inheritance and polymorphism.

2.1 Background Theory

2.1.1 Statistical Estimates

Here we give a brief definition of some of the statistical properties calculated.

The Moving Average

Given a vector x = {x0, x1, ..., xi, ..., xn−1} with n values let us define the moving average
with m points to the left and p points to the right as:

\[ \bar{x}_i = \frac{1}{m + p + 1} \sum_{j=-m}^{p} x_{i+j} \tag{2.1} \]

If we wish to add weights to the averaging then we will have:

\[ \bar{x}_i = \frac{1}{\sum_{j=-m}^{p} w_j} \sum_{j=-m}^{p} w_j x_{i+j} \tag{2.2} \]
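
The unweighted average (2.1) can be sketched as follows. The text does not say what to do when the window runs off either end of the data, so this version simply truncates the window to the points that exist; that boundary treatment, like the function name movingAverage and the use of std::vector, is an assumption made for illustration.

```cpp
#include <cassert>
#include <vector>

// Simple moving average (2.1) with m points to the left and p to the right.
// Near the ends the window is truncated to the available points, which is
// an assumed boundary treatment, not one specified in the text.
std::vector<double> movingAverage(const std::vector<double>& x, int m, int p)
{
    assert(m >= 0 && p >= 0);
    const int n = static_cast<int>(x.size());
    std::vector<double> avg(n);
    for (int i = 0; i < n; i++)
    {
        const int lo = (i - m < 0) ? 0 : i - m;          // first index in window
        const int hi = (i + p > n - 1) ? n - 1 : i + p;  // last index in window
        double sum = 0.0;
        for (int j = lo; j <= hi; j++) sum += x[j];
        avg[i] = sum / (hi - lo + 1);                    // divide by points used
    }
    return avg;
}
```

Away from the ends the divisor is m + p + 1 as in (2.1). Other boundary choices, such as returning a shorter vector, are equally defensible and worth comparing in the report.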

The Exponential Moving Average

Given a vector x = {x0, x1, ..., xi, ..., xn−1} with n values let us define the exponential moving
average as:

\[ \bar{x}_0 = x_0, \qquad \bar{x}_i = \alpha x_{i-1} + (1 - \alpha)\bar{x}_{i-1}, \]

where α ∈ [0, 1] is the smoothing factor.
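
A minimal sketch of this recurrence is given below. It follows the formula exactly as stated above, using the previous data point x_{i-1}; some references define the exponential average with the current point x_i instead. The function name exponentialAverage and the use of std::vector are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Exponential moving average following the recurrence in the text:
// avg[0] = x[0], avg[i] = alpha*x[i-1] + (1 - alpha)*avg[i-1].
std::vector<double> exponentialAverage(const std::vector<double>& x, double alpha)
{
    assert(alpha >= 0.0 && alpha <= 1.0);
    std::vector<double> avg(x.size());
    if (x.empty()) return avg;
    avg[0] = x[0];
    for (std::size_t i = 1; i < x.size(); i++)
        avg[i] = alpha * x[i - 1] + (1.0 - alpha) * avg[i - 1];
    return avg;
}
```

With alpha = 0 the average stays frozen at x0, and with alpha = 1 it reproduces the data shifted by one sample, which gives two quick sanity checks for any implementation.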

Historical Volatility

We shall use a naive algorithm to estimate the volatility of sample data, because it will be
simple to implement. Assume we have a data set S = {S0, S1, ..., Si, ..., Sn−1} that samples
from data driven by the following SDE (in discrete form):

\[ \frac{dS_i}{S_i} = \mu\,dt + \sigma\sqrt{dt}\,\phi_i \]

where S_{i+1} = S_i + dS_i and φi ∼ N(0, 1). Here we are assuming that the stock price follows a
log-normal random walk.
Given a random variable X the variance is calculated as:

\[ \mathrm{var}(X) = E[X^2] - E[X]^2 \]

So for the sample data x = {x0, x1, ..., xi, ..., xn−1} we have:

\[ \mathrm{var}(X) = \frac{1}{n-1}\left(\sum_{i=0}^{n-1} x_i^2\right) - \frac{1}{n^2 - n}\left(\sum_{i=0}^{n-1} x_i\right)^2 \tag{2.3} \]

Then we can calculate the variance σ² of S as follows:

1. Generate the vector d ln(S) = {ln(S1) − ln(S0), ln(S2) − ln(S1), ..., ln(Sn−1) − ln(Sn−2)}
by taking logs of the original data and evaluating the difference between successive data
points.

2. Calculate the variance of d ln(S) from equation (2.3).

3. \( \sigma = \sqrt{\mathrm{var}(d\ln(S))} \)
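
The three steps above can be sketched as follows, using plain std::vector instead of MVector. The function names sampleVariance, logReturns and historicalVolatility are assumptions for illustration. Note that the result is the volatility per sampling interval; going by the SDE above, comparing it with a σ quoted per unit time would need a further division by the square root of dt.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// sample variance following equation (2.3)
double sampleVariance(const std::vector<double>& x)
{
    assert(x.size() >= 2);
    const double n = static_cast<double>(x.size());
    double sum = 0.0, sumSq = 0.0;
    for (std::size_t i = 0; i < x.size(); i++)
    {
        sum += x[i];
        sumSq += x[i] * x[i];
    }
    return sumSq / (n - 1.0) - sum * sum / (n * n - n);
}

// step 1: differences of successive logs, one entry shorter than S
std::vector<double> logReturns(const std::vector<double>& S)
{
    assert(S.size() >= 2);
    std::vector<double> d(S.size() - 1);
    for (std::size_t i = 0; i + 1 < S.size(); i++)
        d[i] = std::log(S[i + 1]) - std::log(S[i]);
    return d;
}

// steps 2 and 3: sigma = sqrt(var(d ln S)), per sampling interval
double historicalVolatility(const std::vector<double>& S)
{
    const double var = sampleVariance(logReturns(S));
    return std::sqrt(var > 0.0 ? var : 0.0);   // guard against round-off
}
```

A series that grows by a constant factor, such as 1, 2, 4, 8, has identical log differences and so should give a volatility of zero up to rounding, which makes a convenient first test.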

2.1.2 Generating Option Prices


Consider the well-known Black and Scholes (1973) partial differential equation for an option
with an underlying asset following geometric Brownian motion:

\[ \frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2 \frac{\partial^2 V}{\partial S^2} + rS\frac{\partial V}{\partial S} - rV = 0, \tag{2.4} \]

where V(S, t) is the price of the derivative product, S the current value of the underlying asset,
t is time, T is the time to maturity, r the risk-free interest rate, σ the volatility of the underlying
asset and X is the exercise price of the option.
Next make the following standard transformations in (2.4)

\begin{align}
x &= \log(S_0/X), \tag{2.5} \\
y &= \log(S_T/X). \tag{2.6}
\end{align}

The final conditions for a European option expiring at time T with V(y, T) are transformed in
straightforward fashion; e.g. the payoff for a call option V(S, T) = max(S_T − X, 0) becomes

\[ V(y, T) = \max(Xe^{y} - X, 0). \tag{2.7} \]

The value of the option at time t = 0 on an underlying asset S0 has an exact solution given by

\[ V(x, 0) = A(x)\int_{-\infty}^{\infty} B(x, y)V(y, T)\,dy, \tag{2.8} \]

where

\[ A(x) = \frac{1}{\sqrt{2\sigma^2\pi T}}\, e^{-\frac{1}{2}kx - \frac{1}{8}\sigma^2 k^2 T - rT}, \tag{2.9} \]

and

\[ B(x, y) = e^{-\frac{(x-y)^2}{2\sigma^2 T} + \frac{1}{2}ky} \tag{2.10} \]

and

\[ k = \frac{2r}{\sigma^2} - 1. \tag{2.11} \]

Henceforth the integrand here will be denoted as f(x, y), so that (2.8) becomes

\[ V(x, t = 0) = A(x)\int_{-\infty}^{\infty} f(x, y)\,dy \tag{2.12} \]

where the integrand is given by

\[ f(x, y) = B(x, y)V(y, T). \tag{2.13} \]

2.1.3 Implied Volatility


Let us define V̂(S, t; X, T) as the observed price for a European call (or put) option on the
market, with S the observed stock price at time t, maturity date T and exercise price X, and
V(S, t; X, T, r, σ) be the price calculated from the Black-Scholes model with extra parameters r
the interest rate and σ the volatility. Then if we know what the risk-free interest rate r is, we
only need to know what σ is to generate a Black-Scholes price. We find the implied volatility
by choosing the value σ = σ_im such that the Black-Scholes price equals the market
price, or:

\[ V(S, t; X, T, r, \sigma_{im}) = \hat{V}(S, t; X, T). \]

Now we can make use of the Newton-Secant root-finding method defined by the algorithm:

\[ \sigma_n = \sigma_{n-1} - g(\sigma_{n-1})\,\frac{\sigma_{n-1} - \sigma_{n-2}}{g(\sigma_{n-1}) - g(\sigma_{n-2})} \tag{2.14} \]

to find the roots σ* such that g(σ*) = 0.
Then to find the implied volatility we simply set

\[ g(\sigma) = V(S, t; X, T, r, \sigma) - \hat{V}(S, t; X, T), \quad \text{and} \quad g(\sigma_{im}) = 0. \]
Two suitable guesses must be supplied to start the algorithm going. There is no guarantee that
the algorithm will converge.
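
The iteration (2.14) can be sketched for a generic function g as below. The name secantRoot, its argument list and the default tolerance are illustrative assumptions; for implied volatility, g would wrap a Black-Scholes pricing routine, which is not shown here.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Newton-Secant iteration (2.14) for g(s) = 0, starting from guesses s0, s1.
// Returns true and stores the estimate in root if |g| < tol is reached;
// returns false if the secant is flat or the iteration limit is hit.
bool secantRoot(const std::function<double(double)>& g,
                double s0, double s1, double& root,
                double tol = 1e-10, int maxIter = 100)
{
    assert(maxIter > 0);
    double g0 = g(s0), g1 = g(s1);
    for (int n = 0; n < maxIter; n++)
    {
        if (std::fabs(g1) < tol) { root = s1; return true; }
        const double denom = g1 - g0;
        if (denom == 0.0) return false;          // flat secant, cannot continue
        const double s2 = s1 - g1 * (s1 - s0) / denom;
        s0 = s1; g0 = g1;                        // shift the two-point history
        s1 = s2; g1 = g(s2);
    }
    if (std::fabs(g1) < tol) { root = s1; return true; }
    return false;
}
```

Returning a flag rather than just the latest iterate lets the calling code detect the non-convergent cases mentioned above and report them, instead of silently using a meaningless volatility.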

2.2 Importing Data
2.2.1 Creating a Math vector from the standard library
Here we shall use the standard vector class to create a new vector class and write functions that
can evaluate simple statistical properties of the data stored in the vector.
Copy the following class definition for the new class MVector into a header file or at the
top of your main code.

    #include <vector>

    // class MVector contains arrays that can work with doubles
    class MVector
    {
        // storage for the new vector class
        std::vector<double> v;
    public:
        // constructor
        explicit MVector() {}
        explicit MVector(int n) : v(n) {}
        explicit MVector(int n, double x) : v(n, x) {}
        // equate vectors;
        MVector& operator=(const MVector& X)
        { if (&X == this) return *this; v = X.v; return *this; }
        // access data in vector
        double& operator[](int index) { return v[index]; }
        // access data in vector (const)
        double operator[](int index) const { return v[index]; }
        // size of vector
        int size() const { return v.size(); }
    }; // end class MVector

So far so good. The class MVector will act in exactly the same way as a std::vector, except
that we do not have access to all the public functions of the std::vector, and we have explicitly
chosen double as the data type stored in the array.
Now let us add a function to the class so that we can add data to the vector on input from
a file. To do this we need to allow access to more of the functionality of the std::vector. Add
the following function to your class:

    // add data to your vector
    void push_back(double x) { v.push_back(x); }

Example: Using the MVector

Below is an excerpt from some code which inserts elements into the vector x and then prints
them out to the screen.
// create an MVector
MVector x;
// add elements to the vector
x.push_back(1.3);
x.push_back(3.5);
// x now contains 1.3 and 3.5
// print x
cout << "x:= ( ";
for (int i = 0; i < x.size(); i++) cout << x[i] << " ";
cout << ")\n";

Tasks:

1. Now add a function inside your class returning the average of a vector. It will return a
double, but won’t need any arguments as it will already have access to the data. Test
your results on small sample sets of data.

2. Now add a function that returns a vector containing the running average for a given
vector. The function might look something like:
// return the running average for a vector
MVector runningAverage(const MVector &x, /* other arguments */)
{
    // create vector same size as x
    MVector temp(x.size());
    /* running average algorithm in here */
    // return temp vector containing running average
    return temp;
}

Test your function on small data samples.

3. Create a function to return a vector containing the exponential average. Try different
values for α. Can you describe any qualitative or quantitative differences between the two
different types of average?

4. Create similar functions to return

\[ y = y_i = \log(x_i), \qquad 0 \le i \le n-1 \]

and

\[ y = y_i = x_{i+1} - x_i, \qquad 0 \le i \le n-2. \]

Note that in the second case the vector has one fewer entry than before.

5. Using the functions created above (or otherwise) generate a function to return the variance
of the data stored in a vector x assuming that the data you have follows a log normal
process, see the section on historical volatilities.

2.2.2 Importing data sets


Example: Read/write numbers to/from a file

Some example code of how one might write numbers to a file then read them back.

// open a file stream and write some data to it
std::ofstream output("file.dat");
output << 2. * x << " " << 2. * y << "\n";
output.close();
// reopen the file to read back data
std::ifstream input("file.dat");
input >> x >> y;
input.close();
std::cout << x << " " << y << "\n";

Tasks:

• Open a new C++ project in your IDE and try the example code. The values
should have doubled after being read back from the file.

• Download historical stock prices from the web (available at http://finance.yahoo.com)
to a *.csv file. To make things easy, open the file in Excel, then copy and paste the column
with the stock prices into a new file (using Notepad on Windows, gedit on Unix). Save the
file as a text file (very important on Windows).

• Write a program to read the first 5 entries from the text file and output them to screen.
Check that the values are correct.

• The input stream has a function eof() which returns the value true when the end of the
file has been reached. Use this to create a while loop that reads in all values from the file
regardless of how big it is.

• Instead of outputting values to screen, create an MVector to store the data and add that
data to the vector (you will need to store the data temporarily to do this).

• Think about error checking. What happens if the data is not of the correct format (i.e.
column heading left in, or no data entry available)?

• Finally use your functions from the first section to generate the variance of your data set.
You have now generated the variance per day. How can this be converted to the variance
per year and hence the volatility?
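
The reading and error-checking tasks above can be sketched as follows. This is only one possible approach, not the required solution: the function name readColumn is a hypothetical choice, and it stores the data in a std::vector<double> rather than an MVector so that the sketch is self-contained. Instead of looping on eof(), it tests whether each read succeeded, because eof() only becomes true after a read has already failed. Lines that do not start with a number (a column heading or an empty line, for example) are skipped.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: read one number per line from the stream `in`.
// Lines that do not begin with a number (e.g. a heading or a blank
// line) are skipped rather than treated as data.
std::vector<double> readColumn(std::istream& in)
{
    std::vector<double> data;
    std::string line;
    // std::getline fails at end of file, which ends the loop
    while (std::getline(in, line))
    {
        std::istringstream fields(line);
        double value;
        // keep the line only if a number could be extracted from it
        if (fields >> value) data.push_back(value);
    }
    return data;
}
```

To read from a file, open a std::ifstream (from the <fstream> header) on your text file and pass it to readColumn; each returned value can then be added to an MVector with push_back.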

2.3 Numerical Integration Methods for Option Pricing


Note that the following section is independent of the other sections and you can start afresh
here if you have struggled with them. We shall evaluate the price of a European option using the
analytical formulae. Using inheritance, we can allow for common elements of an option (such
as the parameters) to be shared. Please review the examples sheets and notes for examples of
implementing integration in C++.

2.3.1 Integration
The first task is to create a function to integrate another function. The algorithm we choose to
implement is Simpson's rule, given by:
 
\[
\int_a^b f(y)\,dy \approx \frac{h}{3}\left[ f(a) + 2\sum_{j=1}^{n/2-1} f(x_{2j}) + 4\sum_{j=1}^{n/2} f(x_{2j-1}) + f(b) \right] \tag{2.15}
\]

where xj = a + jh and h = (b − a)/n. The integral we wish to evaluate is


\[
I = \int_{y=0}^{1} X e^{-r\sigma T} e^{Ty}\,dy \tag{2.16}
\]

where X = 10, r = 0.05, σ = 0.1 and T = 0.5 are constants.

Tasks:

• First, evaluate I by hand. Write down your solution to 6 s.f.

• Create a new program, and enter the following code above your main function:
// integrand function
double f(double y)
{
    return 0; // fill in appropriate function here
}
// function to implement integration
double integrate(double a, double b, int n)
{
    double sum = 0;
    // fill in Simpson's algorithm here
    return sum;
}

Edit the code so that f(y) returns the value of the integrand in (2.16). For the time being
declare the constants inside the function. Edit the code so that integrate(a,b,n) returns
the integrated value of f(y) between the limits a and b with n divisions.

• Calculate the solution for the integral I using different values of n. Create a table
comparing your results with the exact solution. Roughly what value of n is needed to get
a solution accurate to 6 s.f.?

• Now change your functions so that the constants X, r, σ and T may be passed in as
arguments to the integrate function. In one program run, calculate the value of the
integral with T = 0, 0.25, 0.5, 0.75 and 1.

2.3.2 Quadrature
In this section we shall evaluate the price of a European call option using a procedural approach,
before repeating the process with an object orientated approach.

Tasks:

• Add three extra functions into your code, declaring their type and arguments correctly:

– payoff(y,X) evaluates the transformed payoff from (2.7);


– A(x,r,sigma,T,k) evaluates function A from (2.9);
– B(x,y,sigma,T,k) evaluates function B from (2.10).

Using the parameter values from part 2.3.1, y = 0.5, and S0 = 10, check that each of
your functions is working by evaluating it by hand and comparing answers. You will
need to calculate the value of x (from the transformation, equation 2.5) and the value of
k (from equation 2.11) from the other parameters before calling the functions.

• Now change your function f to evaluate B multiplied by payoff as in (2.13). You will need
to make sure that all the required parameters are passed into the function f as arguments.
The definitions for f and integrate should now look something like:
// integrand function
double f(double x, double y, double X, double r,
         double sigma, double T, double k)
{
    // fill in appropriate function here
}
// function to implement integration
double integrate(double a, double b, int n, double x,
                 double X, double r, double sigma, double T, double k)
{
    double sum = 0;
    // fill in Simpson's algorithm here
    return sum;
}

• Now calculate the value of a Call option with S0 = 97, X = 100, r = 0.06, σ = 0.2,
T = 0.75 using your functions A and integrate as in (2.12). The limits of integration
can be set to a = −10 and b = 10. Test your code by producing a table of your solution
against the exact value (from some other source) for different values of n.

2.3.3 Object Orientated Approach


In this section we develop a protocol for the payoff function of an option using pure virtual
functions. The class Payoff will basically be a definition of the function used to provide an
interface. The class is defined entirely as follows:
// the option payoff class
struct Payoff
{
    virtual double operator()(double y) = 0;
};

• This is the C++ replacement for function pointers.

• A struct is a class where all members are public.

• The definition of operator() is a pure virtual definition, because of the syntax “=0” at
the end of the line.

• A class with a pure virtual function cannot be declared as an object directly, since the
function has no implementation; we can only inherit from it.

Example: Share Value

Create a class that inherits Payoff and returns the value of the share as a payoff $F(S) = S$.
Under the transformed variables it may be written as $f(y) = S_0 e^{y}$. The code is as follows:
// a share value payoff implementation
class ShareValue : public Payoff
{
    double S0;
public:
    ShareValue(double S0) : S0(S0) {}
    double operator()(double y) {
        return S0 * exp(y);
    }
};

Declare an object of the class in the code as

// Declare a ShareValue with scaled value S0=10
ShareValue share(10.);
// output from this line will be 10, since
// 10.*exp(0)=10.
cout << share(0.) << endl;
// output 3.67879
cout << share(-1.) << endl;
// output 27.1828
cout << share(1.) << endl;
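
The point of the Payoff interface is that other functions can accept any payoff without knowing its concrete type. The sketch below illustrates this under some assumptions: the helper evaluatePayoff is hypothetical and not part of the course code, and a virtual destructor has been added to Payoff as a common safety measure that the original definition does not include.

```cpp
#include <cmath>

// The payoff interface, as in the text, plus a virtual destructor
// (an addition in this sketch, not in the original definition).
struct Payoff
{
    virtual double operator()(double y) = 0;
    virtual ~Payoff() {}
};

// The ShareValue payoff f(y) = S0 * exp(y) from the example above.
class ShareValue : public Payoff
{
    double S0;
public:
    ShareValue(double S0_) : S0(S0_) {}
    double operator()(double y) { return S0 * std::exp(y); }
};

// Hypothetical helper: works with any class that inherits Payoff,
// because the call payoff(y) is resolved at run time.
double evaluatePayoff(Payoff& payoff, double y)
{
    return payoff(y);
}
```

Calling evaluatePayoff(share, 0.) with the ShareValue object declared above gives 10, the same value as share(0.). Passing the payoff by reference matters here: passing a Payoff by value would not compile, because an abstract class cannot be copied into an object of its own type.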

Tasks:

• Copy the code for the Payoff class given above to the top of your program and add a
definition for a call option underneath (using the example for ShareValue as a template) and
complete the payoff function for the call option (in transformed variables, see equation
2.7). Check that your code still compiles and runs.

• Create a new function f_object by copying and pasting the function f in your code and
renaming it f_object. Add Payoff &payoff to the argument list in f_object. Since
the payoff function we use inside the copied function is from Payoff class, it will just
take y as an argument rather than y and X. Test the function f_object by calling it from

your main function and compare it with the values from f. You will need to declare a
CallOption object to pass in as the argument (since it inherits the payoff function) and
set the value of the variable X in the object with the declaration statement.

• Create a new function integrate_object by copying and pasting the function integrate
and renaming it. Add Payoff &payoff to the argument list in integrate_object. Inside the
function integrate_object, now use f_object instead of f as the integrand. You must
pass the payoff function as an argument on every function call. Test the function
integrate_object by calling it from your main function. Do you get the same answers
as before?

• Try making a PutOption class that inherits the Payoff class and generate values of the
option.

• Write a single function to return the value of an option. Your function definition should
look something like:
double optionValue(double S, double X, double r, double sigma,
                   double T, Payoff &payoff)

You can calculate the value of x and k within the function.

• Try making a class that stores all the methods and parameters required to generate option
prices.

2.4 Implied Volatilities


Example: Newton’s root finding algorithm
Use Newton’s method to find roots of f (x) = 0 where

\[ f(x) = 9x^4 - 42x^3 - 1040x^2 + 5082x - 5929. \]

Solution:

Create a function to find the root of


// function to find root of
double poly(double x) {
    return (((9. * x - 42.) * x - 1040.) * x + 5082.) * x - 5929;
}

its derivative
// derivative of the function to find root of
double poly_deriv(double x) {
    return ((36. * x - 126.) * x - 2080.) * x + 5082.;
}

and a simple root finding function defined by:
// find the root of a function
// On input: x is initial guess, On return x is the root
// maxiter is the max no of its, tol is the accuracy of the root
// 0 will be returned if no root is found
// 1 if it is successful
int find_root(double& x, int maxiter, double tol) {
    // find root
}

The algorithm to find the root is given by:


\[ x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \]
so we may write the code inside the function as
// find root
int iter;
for (iter = 0; iter < maxiter; iter++) {
    x = x - poly(x) / poly_deriv(x);
    if (std::abs(poly(x)) < tol) break;
}
if (iter == maxiter) return 0;
else return 1;

Example use of the program is


int iter = 100;
double x, tol = 1.e-6;
std::cout << "Enter initial guess\n";
std::cin >> x;
if (find_root(x, iter, tol))
    std::cout << "There is a root at " << x << "\n";
else
    std::cout << "No root found.\n";
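
For reference, the pieces of the example can be assembled into one self-contained unit. The version below is a sketch that deviates from the example in two labelled ways: the function and its derivative are passed in as std::function arguments (so the hypothetical name newton is used instead of find_root), and the iteration stops with failure if the derivative is exactly zero, which avoids a division by zero. The polynomial factorises as (x² − 121)(3x − 7)², so its roots are x = ±11 and the repeated root x = 7/3.

```cpp
#include <cmath>
#include <functional>

// The example polynomial and its derivative, as in the text.
double poly(double x) {
    return (((9. * x - 42.) * x - 1040.) * x + 5082.) * x - 5929.;
}
double poly_deriv(double x) {
    return ((36. * x - 126.) * x - 2080.) * x + 5082.;
}

// Sketch of Newton's method for a general f with derivative df.
// On input x is the initial guess; on return x is the last iterate.
// Returns 1 if |f(x)| < tol was reached, 0 otherwise.
int newton(const std::function<double(double)>& f,
           const std::function<double(double)>& df,
           double& x, int maxiter, double tol)
{
    for (int iter = 0; iter < maxiter; ++iter)
    {
        if (std::abs(f(x)) < tol) return 1;   // converged
        const double slope = df(x);
        if (slope == 0.0) return 0;           // cannot take a Newton step
        x -= f(x) / slope;
    }
    return std::abs(f(x)) < tol ? 1 : 0;
}
```

Starting from the guess x = 12, the iterates decrease towards the root at x = 11. Convergence to the repeated root 7/3 is slower, because the derivative also vanishes there.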

Tasks:

• Create a new project and go through the example above.

• Edit your code so that it solves the example problem using the secant method – check it
works.

• Write a function to return g(σ), the difference between an option value and the market
value. You will need to add in all the relevant parameters for the option in the argument
list, including the payoff. Edit your secant method algorithm to find the root of the
function g(σ); again, all relevant parameters must be added to the argument list.

• Consider a European call option with a value today of $0.05 on an underlying whose price
today (S(t = 0)) is $1. Expiry is in 6 months time, the risk-free rate is 5% (this may

be assumed fixed), and the strike is $1. Using the secantMethod function evaluate the
implied volatility of this option.

2.5 Report
Write a project report as a connected piece of prose. You may use any suitable reference sources,
but these must be clearly identified in the report at the points of use.

1. State the problem, method used to solve, and give a description of any classes used.

2. For a selection of stocks of your choice, plot the time series of the value, the running
average and the exponential average. Make comments on the differences between them
and how well they can spot trends in the data.

3. Using the same selection of stocks as above, find the implied volatility for a selection of
different options. You may obtain your option-price data from any (published) source
(but this must be clearly stated in your report). The FT is one source, but another (more
comprehensive) source is

http://www.marketwatch.com/quotes/

which has delayed option price data. The website is a bit busy but looks quite user friendly.
If you type in the ticker symbol for a stock (e.g. IBM) and then click on ‘options’ on the
tab it gives you quotes for option prices. The codes are a bit tricky to decipher but there
is a description at:

http://www.marketwatch.com/tools/quotes/lookup.asp

The first three digits identify the company (e.g. IBM) and the fourth digit identifies the
expiry month i.e. all April expirations have xxxDx and May xxxEx etc. and the last digit
indicates the strike price.
Another useful site is

http://www.liffe.com/reports/eod?item=Daily

You may, of course, use other (better?) sources, but these should be stated clearly in your
report.

4. Plot the implied volatility of options against different strike prices for fixed time to ma-
turity and interest rate.

5. Include tables of your historical volatilities versus implied volatilities, and describe any
differences.

Marks will be awarded for clarity and correctness of code as well as answers to the questions
and discussion.

Mini Project 3

The Conjugate Gradient Method

Author: Dr. Paul Johnson, Paul.Johnson-2@manchester.ac.uk

In this project we shall solve the matrix equation Ax = b where x is the unknown. We shall
use an iterative solver, the conjugate gradient method, to solve the equation. Do not worry if
you do not follow the theory as the implementation is far more important for the purposes of
this course, and by creating vector and matrix classes that can be added/multiplied together,
the algorithm should be very easy to implement.
We shall also look at different storage methods for a matrix, and show how the same algo-
rithm can be used to solve for different types of matrix. This is an important component of
programming in C++.

3.1 Iterative methods for solving a Matrix equation

3.1.1 The Conjugate Gradient Method

The conjugate gradient method was first put forward by Hestenes and Stiefel (1952) and was
reinvented and popularised in the 1970s. The method has one important property: it
converges to the exact solution of the linear system in a finite number of iterations, in the absence
of roundoff error. Used on its own the method has a number of drawbacks, but combined with
a good preconditioner it is an effective tool for solving linear equations iteratively. Consider

\[ Ax = b, \tag{3.1} \]

where A is an n × n symmetric and positive definite matrix. Recall A is symmetric if $A^T = A$
and positive definite if $x^T A x > 0$ for any non-zero vector x.
Let us define $\langle x, y\rangle$ to be the dot product of two vectors. Now consider the quadratic
functional
\[ \phi(y) = \langle Ay, y\rangle - \langle b, y\rangle - \langle y, b\rangle. \tag{3.2} \]

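Then for vectors p and x and any scalar α

\[
\begin{aligned}
\phi(x + \alpha p) &= \langle A(x + \alpha p),\, x + \alpha p\rangle - \langle b,\, x + \alpha p\rangle - \langle x + \alpha p,\, b\rangle \\
&= \phi(x) + \langle Ax - b,\, \alpha p\rangle + \langle \alpha p,\, Ax - b\rangle + \alpha^2 \langle Ap,\, p\rangle.
\end{aligned}
\tag{3.3}
\]

Differentiating (3.3) with respect to α and equating to zero to find a minimum gives

\[ 0 = \langle Ax - b, p\rangle + \langle p, Ax - b\rangle + 2\alpha \langle Ap, p\rangle, \]

which gives
\[ \alpha = \frac{\langle b - Ax, p\rangle}{\langle Ap, p\rangle}, \tag{3.4} \]
since $\langle x, y\rangle = \langle y, x\rangle$. For this value of α the functional attains its minimum over α, namely

\[ \phi(x + \alpha p) = \phi(x) - \frac{\langle b - Ax, p\rangle^2}{\langle Ap, p\rangle}. \tag{3.5} \]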
This shows that x is a solution of Ax = b if and only if x minimizes the functional φ(y). The
proof is as follows. If x is a minimizer of φ(y), then Ax = b, since otherwise we could choose p
such that the inner product of b − Ax and p is non-zero, and by (3.5) there would exist
y = x + αp with α ≠ 0 such that φ(y) < φ(x), so φ(x) would not be a minimum.
Conversely, if Ax = b, then by (3.3) φ(x) ≤ φ(x + αp) for any α and p, since A is positive definite,
so x is a minimizer of φ(y). So solving Ax = b is the same as minimizing φ(y).
This suggests an iterative algorithm:
\[ x_{k+1} = x_k + \alpha_k p_k, \quad \text{with} \quad \alpha_k = \frac{\langle b - Ax_k, p_k\rangle}{\langle Ap_k, p_k\rangle}, \tag{3.6} \]
where $p_k$ is some vector that we must generate.
Now assume that the vectors $p_0, p_1, p_2, \ldots, p_n$ that we generate are conjugate with respect
to the matrix A, which is defined as

\[ \langle Ap_i, p_j\rangle = 0, \quad \text{for } i \neq j; \tag{3.7} \]

then the iterates $x_0, x_1, x_2, \ldots, x_n$ generated by the iterative process (3.6) reach the solution
of Ax = b in at most n steps, for an arbitrary initial point $x_0$.
The important thing to note here is that if this theory holds we have a finite limit n on the
number of iterations this scheme will take to converge. The convergence rate will depend on
the condition of the matrix. In order to speed up convergence, a preconditioner may be used,
but that is not investigated in this task.

3.1.2 Formulation
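Let us assume that we choose conjugate vectors $p_k$ as

\[ p_{k+1} = r_{k+1} + \beta_k p_k, \tag{3.8} \]

where $r_k = b - Ax_k$ is the residual and $\beta_k$ is given by the formula

\[ \beta_k = \frac{\langle r_{k+1}, r_{k+1}\rangle}{\langle r_k, r_k\rangle}. \tag{3.9} \]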

Now that we have the conjugate vectors, we can make some simplifications, namely that

\[ \langle b - Ax_k, p_k\rangle = \langle r_k, r_k\rangle. \]

Then following the theory the conjugate gradient algorithm may be written as follows:

Algorithm - Conjugate Gradient method:

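• Initialise the method with a guess $x_0$. Then set

\[ r_0 = b - Ax_0, \qquad p_0 = r_0. \tag{3.10} \]

• Now construct a loop to iterate through $x_k$, for k = 0, 1, 2, . . . ,

\[ \alpha_k = \frac{\langle r_k, r_k\rangle}{\langle Ap_k, p_k\rangle}, \tag{3.11} \]
\[ x_{k+1} = x_k + \alpha_k p_k, \tag{3.12} \]
\[ r_{k+1} = r_k - \alpha_k Ap_k, \tag{3.13} \]
\[ \beta_k = \frac{\langle r_{k+1}, r_{k+1}\rangle}{\langle r_k, r_k\rangle}, \tag{3.14} \]
\[ p_{k+1} = r_{k+1} + \beta_k p_k. \tag{3.15} \]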
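
To make the steps (3.10) to (3.15) concrete, the following is a minimal sketch of the algorithm written with plain std::vector storage instead of the MVector and Matrix classes developed later in this project. The type names Vec and Mat, the helper names dot and multiply, and the convention of returning the iteration count (or −1 on failure) are all assumptions of this sketch. It assumes A is symmetric and positive definite and performs no checks on sizes.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

typedef std::vector<double> Vec;
typedef std::vector<Vec> Mat;   // dense storage, Mat[i][j] = a_{i,j}

// dot product <a, b>
double dot(const Vec& a, const Vec& b)
{
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// matrix-vector product A x
Vec multiply(const Mat& A, const Vec& x)
{
    Vec y(A.size(), 0.0);
    for (std::size_t i = 0; i < A.size(); ++i)
        for (std::size_t j = 0; j < x.size(); ++j) y[i] += A[i][j] * x[j];
    return y;
}

// Sketch of conjugate gradients. On input x is the initial guess, on
// return it is the approximate solution. Returns the number of
// iterations used, or -1 if the L2 norm of the residual is still
// above tol after maxiter iterations.
int conjugateGradient(const Mat& A, const Vec& b, Vec& x,
                      int maxiter, double tol)
{
    const std::size_t n = b.size();
    Vec r(n), Ax = multiply(A, x);
    for (std::size_t i = 0; i < n; ++i) r[i] = b[i] - Ax[i];   // (3.10)
    Vec p = r;
    double rr = dot(r, r);
    for (int k = 0; k < maxiter; ++k)
    {
        if (std::sqrt(rr) < tol) return k;
        const Vec Ap = multiply(A, p);
        const double alpha = rr / dot(Ap, p);                  // (3.11)
        for (std::size_t i = 0; i < n; ++i)
        {
            x[i] += alpha * p[i];                              // (3.12)
            r[i] -= alpha * Ap[i];                             // (3.13)
        }
        const double rrNew = dot(r, r);
        const double beta = rrNew / rr;                        // (3.14)
        for (std::size_t i = 0; i < n; ++i)
            p[i] = r[i] + beta * p[i];                         // (3.15)
        rr = rrNew;
    }
    return std::sqrt(rr) < tol ? maxiter : -1;
}
```

As a small check, for the 2 × 2 system with rows (4, 1) and (1, 3), right-hand side b = (1, 2) and initial guess x = (0, 0), the iterates reach x = (1/11, 7/11) after two steps, in line with the finite termination property described above.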

3.2 Coding, Examples and Exercises


3.2.1 Creating a Math vector from the standard library
Here we shall use the standard vector class to create a new vector class so that we can add
vectors together and multiply them by scalars. Putting extra work into making this class will enable our
conjugate gradient method to be written as we would write it on paper.
Copy the following class definition for the new class MVector into a header file or at the
top of your main code.
// class MVector contains arrays that can work with doubles
class MVector
{
    // storage for the new vector class
    vector<double> v;
public:
    // constructor
    explicit MVector() {}
    explicit MVector(int n) : v(n) {}
    explicit MVector(int n, double x) : v(n, x) {}
    // equate vectors;
    MVector& operator=(const MVector& X)
    { if (&X == this) return *this; v = X.v; return *this; }
    // access data in vector
    double& operator[](int index) { return v[index]; }
    // access data in vector (const)
    double operator[](int index) const { return v[index]; }
    // size of vector
    int size() const { return v.size(); }
}; // end class MVector

So far so good. The class MVector will act in exactly the same way as a std::vector, except
that we do not have access to all the public functions of the std::vector, and we have explicitly
chosen double as the data type stored in the array.
Now for this to be of any use we must overload the operators +, -, / and * to work with our new
MVector class. We shall place the function definition outside the class definition but inside
the header file. A typical definition will look like

// scalar mult vector
MVector operator*(const double& lhs, const MVector& rhs);

and the implementation can be placed in a different file

MVector operator*(const double& lhs, const MVector& rhs)
{
    MVector temp(rhs);
    for (int i = 0; i < temp.size(); i++) temp[i] *= lhs;
    return temp;
}

Tasks:

1. There are five operators we need. Remember that we can multiply/divide a vector by a
scalar, add/subtract vectors, but can’t add/subtract a scalar to a vector. What are the
five operators that we need?

2. Add the function definitions and implementations into your code.

3. Check that the code is working by evaluating the following using MVectors to represent
u, v, w and x:
u = 4.7v + 1.3w − 6.7x

where v = (0.1, 4.8, 3.7), w = (3.1, 8.5, 3.6) and x = (5.8, 7.4, 12.4).

4. Try other combinations of additions and multiplications and see what happens. What happens
when you try to add a double to a vector? What happens if you try again but remove the
explicit keyword from the constructors?

5. When adding two vectors check they conform and exit with an error if they do not.

Additional Tasks:

1. Try adding a function to resize the vector using the resize(int n,double x) member
function of the standard vector class on v. Within your function, you should call the
standard clear() member function before the resize member function.

2. You could also try overloading the << operator to output a vector in the form (v[0], v[1], ..., v[n]).

3.2.2 More Useful Vector Functions


When working with vectors it will be useful to specify a further two member functions to calculate
the norm of a vector. We wish to implement the following two norms, the infinity norm

\[ \|x\|_\infty = \max(|x_0|, |x_1|, \ldots, |x_n|), \tag{3.16} \]

and the L2 norm

\[ \|x\|_2 = \sqrt{x_0^2 + x_1^2 + \cdots + x_n^2}. \tag{3.17} \]

Also we would like a function to calculate the dot product of two vectors. It may be defined as
\[ x \cdot y = \sum_{i=0}^{n} x_i y_i. \tag{3.18} \]

Example:

Add a member function definition


double maxNorm() const;

to the MVector class that returns the maximum absolute element in the vector. Now write
the implementation into your code. You may use the std::max function. Remember that the
std::abs function requires the <cmath> library.

Solution:

The function may be written as follows:


double maxNorm() const
{
    // calculate the maximum value
    // initialise temp with first element
    double temp = std::abs(v[0]);
    // set temp to be the maximum of temp and the next element
    for (int i = 1; i < size(); i++)
    {
        temp = std::max(std::abs(v[i]), temp);
    }
    return temp;
}

Tasks:

1. Add the source code above to your program and check that it compiles and runs.

2. Write your own member function to evaluate the L2 norm.

3. Add a function outside the class, with the definition:


double dot(const MVector& lhs, const MVector& rhs);

Fill in the implementation for the function.

4. Test your dot product function by evaluating

\[ \alpha = \frac{u \cdot u}{v \cdot w} \]

where u = (1.5, 1.3, 2.8), v = (6.5, 2.7, 2.9) and w = (0.1, −7.2, 3.4). The value of α
should be −1.31915 (to 6 s.f.).

5. Test your L2 norm function on the three vectors. You should find $\|u\|_2 = 3.4322$, $\|v\|_2 = 7.61249$, and
$\|w\|_2 = 7.96304$.

3.2.3 Creating a Matrix class


Now in order to solve the matrix equation we shall need a matrix class, as given below. For the
time being do not worry about the way the data is stored. We have used a vector of vectors, and
storage issues and data access functions are taken care of within the definition.
// class Matrix is an array that has
// matrix vector multiply function
class Matrix
{
    unsigned int N, M;
    // storage for the new matrix class
    vector<vector<double> > A;

public:

    // constructor
    explicit Matrix() : N(0), M(0) {}
    explicit Matrix(int n, int m) : N(n), M(m), A(n, vector<double>(m)) {}
    explicit Matrix(int n, int m, double x) : N(n), M(m), A(n, vector<double>(m, x)) {}
    // equate matrices (the dimensions must be copied along with the data)
    Matrix& operator=(const Matrix& rhs) {
        if (&rhs == this) return *this;
        N = rhs.N; M = rhs.M; A = rhs.A;
        return *this;
    }
    // set all values equal to a double
    Matrix& operator=(double x) {
        for (int i = 0; i < rows(); i++) for (int j = 0; j < cols(); j++) A[i][j] = x;
        return *this;
    }
    // access data in matrix (const)
    double operator()(int i, int j) const { return A[i][j]; }
    // access data in matrix
    double& operator()(int i, int j) { return A[i][j]; }
    // size of matrix
    int rows() const { return N; }
    int cols() const { return M; }
}; // end class Matrix

Please note that to access elements in the matrix we now use (i,j) rather than [i][j]. The
matrix class as listed here is almost complete. All that is left to do is write a function to
implement Matrix × MVector. The definition is as follows:

MVector operator*(const Matrix& A, const MVector& x);

Tasks:

1. Declare a new 4 × 3 matrix in your code. Now assign values to the matrix (use (i,j) to
access elements) and print them out to the screen.

2. Add the definition for Matrix × MVector multiplication into your code. Complete the
implementation of the function.

3. Let $A = a_{i,j}$ be a 4 × 3 matrix, such that $a_{i,j} = 3i + j$ for 0 ≤ i ≤ 3 and 0 ≤ j ≤ 2 and
let x = (0.5, 1.6, 3.2). Then test your function by evaluating

b = Ax

Check your solution for b by hand or by using MATLAB.

Additional Tasks:

1. Create a resize function for Matrix as you have done with the MVector class. The
definition should be something like resize(int n,int m,double x). You must use a
clear() function on A before resizing. See the constructors in the class for the syntax
on how to resize.

2. You could try overloading the << operator to output the matrix in the form

(A[0][0], A[0][1], ..., A[0][m])
(A[1][0], A[1][1], ..., A[1][m])
...
(A[n][0], A[n][1], ..., A[n][m])

3.2.4 Implementing the Conjugate Gradient method

Let A be an n × n matrix and let x and b be n-element vectors. Then solve

Ax = b.

The particular problem we shall solve here is derived from a one dimensional finite difference
scheme, but we shall just present the matrix itself without derivation. The matrix may be
written:
\[
A = a_{i,j} =
\begin{cases}
2 & \text{if } i = j \\
-1 & \text{if } |i - j| = 1 \\
0 & \text{otherwise}
\end{cases}, \tag{3.19}
\]
and the vector $b = b_i = \frac{2}{(n+1)^2}$. We can use $x = x_i = 0$ as our initial guess.

Tasks:

1. Declare a matrix A in your code with n = 5 rows and n = 5 columns.

2. Assign each element in the matrix a value according to (3.19).

3. Print out your matrix to the screen or a file and check that it is properly assigned.

4. Calculate the residual r = b − Ax.

Now we must implement the conjugate gradient algorithm as stated in (3.10) and (3.11–
3.15). Your code will look something like:
// initialise vectors
// fill this in...
for (iter = 0; iter < maxiter; iter++)
{
    // calculate new values for x and r
    // fill this in...
    // check if solution is satisfied
    if (r.l2norm() < tolerance) break;
    // calculate new conjugate vector p
    // fill this in...
}

Tasks:

1. Implement the conjugate gradient (CG) method, with maxiter = 1000 and tolerance =
$10^{-6}$. With n = 25 the method should take 12 iterations to converge.

2. Plot $x_i$ versus i for n = 10, 25, 100. Try plotting $x_i$ against $\frac{i+1}{n+1}$ where i runs between 0
and n − 1 so that they lie on top of each other. What do you notice?

3. Alter the code to output the number of iterations needed for convergence. Create a table
showing the number of elements n against the iterations for convergence. Also add a
column showing the computation time (see website for more information).

4. Change the matrix A to be

\[
A = a_{i,j} =
\begin{cases}
2(i+1)^2 + m & \text{if } i = j \\
-(i+1)^2 & \text{if } |i - j| = 1 \\
0 & \text{otherwise}
\end{cases}, \tag{3.20}
\]

with the vector $b = b_i = 2.5$ and m = 10, and rerun your code. What happens if you
vary the value of m, particularly if m is small or even 0? Can you find out (from external
sources) why this might happen?

Report:

• Include annotated code of the MVector and Matrix classes, and your CG implementation

• Include the figure of xi against i for different values of n.

• Include the table of iterations needed for convergence.

• Include a discussion of the results, and what happens when A is changed.

3.2.5 A New Type of Matrix


You may notice when printing out the matrix A from above that most of the elements in the
matrix, especially as n gets large, are zero. A more efficient way of storing the matrix would be
to just store the diagonals, rather than the whole matrix. We call this a banded matrix.
Assume B is an n × n banded matrix. Now let l be the number of non-zero diagonals to the
left of center, and r the number of non-zero diagonals to the right. Then the size of the storage
will have to be n × (l + r + 1). If, for instance, a 5 × 5 matrix A with 4 bands is given by

\[
A =
\begin{pmatrix}
a_{0,0} & a_{0,1} & a_{0,2} & & \\
a_{1,0} & a_{1,1} & a_{1,2} & a_{1,3} & \\
 & a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4} \\
 & & a_{3,2} & a_{3,3} & a_{3,4} \\
 & & & a_{4,3} & a_{4,4}
\end{pmatrix}
=
\begin{pmatrix}
1 & 6 & 10 & & \\
13 & 2 & 0 & 11 & \\
 & 14 & 3 & 8 & 12 \\
 & & 0 & 4 & 9 \\
 & & & 16 & 5
\end{pmatrix}
\]

then we can express it as the 5 × 4 matrix B

\[
B =
\begin{pmatrix}
b_{0,0} & b_{0,1} & b_{0,2} & b_{0,3} \\
b_{1,0} & b_{1,1} & b_{1,2} & b_{1,3} \\
b_{2,0} & b_{2,1} & b_{2,2} & b_{2,3} \\
b_{3,0} & b_{3,1} & b_{3,2} & b_{3,3} \\
b_{4,0} & b_{4,1} & b_{4,2} & b_{4,3}
\end{pmatrix}
=
\begin{pmatrix}
 & 1 & 6 & 10 \\
13 & 2 & 0 & 11 \\
14 & 3 & 8 & 12 \\
0 & 4 & 9 & \\
16 & 5 & &
\end{pmatrix}.
\]

If we recall that l is the number of diagonals left of the center we can express the relation
between the matrices as
\[ a_{i,j} = b_{i,\,j+l-i}. \]
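
The index relation can be illustrated with two small helper functions. Both names, bandColumn and inBand, are hypothetical and are not part of the Banded class below. The sketch only shows the arithmetic of the mapping: an element a(i,j) lies inside the band when i − l ≤ j ≤ i + r, and it is then stored in column j + l − i of row i.

```cpp
// Hypothetical helper: column of the storage matrix B that holds
// a(i,j), for a band with l diagonals to the left of the center.
int bandColumn(int i, int j, int l)
{
    return j + l - i;
}

// Hypothetical helper: true if a(i,j) lies inside the band, i.e. its
// storage column j + l - i lies between 0 and l + r inclusive.
bool inBand(int i, int j, int l, int r)
{
    return j >= i - l && j <= i + r;
}
```

For the 5 × 5 example above, where l = 1 and r = 2, the diagonal element a(0,0) = 1 is stored in column 1 of row 0, and a(2,1) = 14 is stored in column 0 of row 2. The element a(0,3) lies outside the band, so it is not stored and is implicitly zero.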

In the following class definition we define such a banded matrix:


// class Banded stores only elements in the diagonals
class Banded
{
    unsigned int N, M;
    // storage for the banded matrix
    vector<vector<double> > A;
    int l, r; // number of left/right diagonals
public:
    // constructor
    explicit Banded() : N(0), M(0) {}
    explicit Banded(int n, int m, int lband, int rband)
        : N(n), M(m), A(n, vector<double>(lband + rband + 1)), l(lband), r(rband) {}
    explicit Banded(int n, int m, int lband, int rband, double x)
        : N(n), M(m), A(n, vector<double>(lband + rband + 1, x)), l(lband), r(rband) {}
    // equate matrices;
    Banded& operator=(const Banded& rhs)
    { N = rhs.N; M = rhs.M; A = rhs.A; l = rhs.lband(); r = rhs.rband(); return *this; }
    // access data in Banded matrix (const)
    double operator()(int i, int j) const { /* fill this in */ }
    // access data in Banded matrix
    double& operator()(int i, int j) { /* fill this in */ }
    // size of matrix
    int rows() const { return N; }
    int cols() const { return M; }
    // total number of bands
    int bands() const { return r + l + 1; }
    int lband() const { return l; } // number of left bands
    int rband() const { return r; } // number of right bands
}; // end class Banded

Example:

• Overload the << operator to output the matrix.

Solution:

First define the operator as follows:


std::ostream& operator<<(std::ostream& output, const Banded& banded)

Then the function implementation will be


// overload output streams
// (requires #include <iostream> and #include <algorithm>)
std::ostream& operator<<(std::ostream& output, const Banded& banded)
{
    for (int i=0; i<banded.rows(); i++)
    {
        output << "( ";
        // calculate position of lower and upper band
        int jmin = std::max(std::min(i-banded.lband(), banded.cols()), 0);
        int jmax = std::min(i+banded.rband()+1, banded.cols());
        // zeros to the left of the band
        for (int j=0; j<jmin; j++) output << 0 << "\t";
        // stored entries inside the band
        for (int j=jmin; j<jmax; j++)
            output << banded(i,j) << "\t";
        // zeros to the right of the band
        for (int j=jmax; j<banded.cols(); j++)
            output << 0 << "\t";
        output << ")\n";
    }
    return output;
}

Tasks:
1. Copy the code for the banded matrix along with the function to output the matrix to
the screen. Now fill in the access functions to elements in the matrix. Think carefully
about how to do this. If you have added any range-checking, state why you have done
this. Assign some values to a matrix and check they have been assigned correctly by printing
to the screen.

2. Now make the matrix A from (3.19) a banded matrix, for which l = 1 and r = 1.
Set up the matrix, and print it to the screen to check that it has worked.

3. Create a function with the definition


MVector operator*(const Banded& A, const MVector& x);

to implement matrix vector multiplication for Banded matrices. Complete the definition
and make sure that it is efficient (i.e. doesn’t multiply by zero entries).

4. Using the values for A, x and b from §3.2.4, calculate the residual r. Check your answer
against previous values.

5. Now run the (CG) algorithm on the banded matrix. Note here that because we have
overloaded the operators you should not need to change anything in the code.

6. Create a table to show the number of iterations and computation time for different values
of n.
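The banded product asked for in Task 3 can be sketched on the raw storage array rather than on the Banded class itself. This is only an illustration under stated assumptions: the matrix is square, the storage follows a_{i,j} = b_{i,j+l-i}, and the name banded_multiply is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sketch: y = A x for a square banded matrix held in n x (l + r + 1)
// storage b, where a_{i,j} = b[i][j + l - i]. Only in-band columns are
// visited, so no multiplications by the zero entries are performed.
std::vector<double> banded_multiply(const std::vector<std::vector<double> >& b,
                                    int l, int r, const std::vector<double>& x)
{
    const int n = static_cast<int>(b.size());
    assert(static_cast<int>(x.size()) == n);
    std::vector<double> y(n, 0.0);
    for (int i = 0; i < n; ++i)
    {
        assert(static_cast<int>(b[i].size()) == l + r + 1);
        const int jmin = std::max(i - l, 0);     // first in-band column
        const int jmax = std::min(i + r + 1, n); // one past the last
        for (int j = jmin; j < jmax; ++j)
            y[i] += b[i][j + l - i] * x[j];
    }
    return y;
}
```

Each row costs at most l + r + 1 multiplications, so the whole product is O(n(l + r + 1)) rather than O(n²).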

Report:
• Include annotated code of the Banded class.

• Include the table showing number of iterations and computation time for the two matrix
classes.

• Compare and contrast the different methods for storing a matrix.

3.3 Huge Systems and Sparse Matrices


Let u be the solution to the equation

\[ \nabla^2 u(s,t) = 2. \tag{3.21} \]

Now define the n^2-element vector x as an approximation of u such that

\[ x_{k+nl} \approx u(s_k, t_l). \]

Then x is an approximation to the solution of (3.21) if it solves the matrix equation

\[ Ax = b, \]

where A is an n^2 × n^2 matrix, x and b are n^2-element vectors, and the entries a_{i,j} of A may be written:

\[
a_{i,j} =
\begin{cases}
4  & \text{if } i = j, \\
-1 & \text{if } |i-j| = n, \\
-1 & \text{if } |i-j| = 1 \text{ and } (i+j) \bmod (2n) \neq 2n-1, \\
0  & \text{otherwise.}
\end{cases}
\tag{3.22}
\]

The vector b has entries b_i = 2/(n+1)^2, and we take x_i = 0 for our initial guess.
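The entries in (3.22) translate almost directly into code. The sketch below is storage-independent, so it could be used to fill any of the matrix classes; the function names poisson_entry and poisson_rhs are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Sketch: entry a_{i,j} of the n^2 x n^2 matrix defined in (3.22).
double poisson_entry(int i, int j, int n)
{
    assert(n > 0);
    if (i == j) return 4.0;
    const int d = std::abs(i - j);
    if (d == n) return -1.0;
    // neighbours within a grid row, excluding pairs that straddle two rows
    if (d == 1 && (i + j) % (2 * n) != 2 * n - 1) return -1.0;
    return 0.0;
}

// Sketch: right-hand side with every entry equal to 2/(n+1)^2.
std::vector<double> poisson_rhs(int n)
{
    return std::vector<double>(n * n, 2.0 / ((n + 1.0) * (n + 1.0)));
}
```

With n = 5, for example, the pair (i, j) = (4, 5) gives zero because i + j = 9 = 2n − 1, whereas (3, 4) gives −1.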

Tasks:

1. Input the matrix A into a Matrix class, with n = 5. Print out to screen to check that it
has been entered correctly. You may ask to see a solution to check against your values.
Solve the problem using n = 5 with your CG solver.

2. Input the matrix A into a Banded class, with n = 5. Print out to screen to check that it
has been entered correctly. Note here that you are still storing a lot of zero entries. You
can check against your print out from the Matrix class. Solve the problem using n = 5
with your CG solver.

3. Now solve the system with all your available storage classes for values of n up to 100 or
more. Create a table or figure showing iterations to converge and computation times for
each class.
WARNING: Before running calculations on large matrices, make sure you have optimised
your programs. If you cannot obtain a result for some value of n, state this in your
report and try to explain why.

4. Try plotting the results in 3D by outputting your data

• in columns (for gnuplot)

    i    j    x_{in+j}

  for 0 ≤ i < n and 0 ≤ j < n.

• in a matrix format (for matlab)

    x_0           x_1             ...   x_{n-1}
    x_n           x_{n+1}         ...   x_{2n-1}
    ...           ...             ...   ...
    x_{n(n-1)}    x_{n(n-1)+1}    ...   x_{n^2-1}
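A minimal sketch of the column format, assuming that both indices start at zero (as in the matrix layout) and using the hypothetical function name gnuplot_rows. It builds the text in memory so that it can be written to any output stream.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Sketch: one "i j x_{in+j}" line per grid point, with a blank line after
// each block of constant i so that gnuplot's splot reads the data as a grid.
std::string gnuplot_rows(const std::vector<double>& x, int n)
{
    assert(static_cast<int>(x.size()) == n * n);
    std::ostringstream out;
    for (int i = 0; i < n; ++i)
    {
        for (int j = 0; j < n; ++j)
            out << i << " " << j << " " << x[i * n + j] << "\n";
        out << "\n";
    }
    return out.str();
}
```

The returned string can be sent straight to a std::ofstream and the resulting file plotted with gnuplot's splot command.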

3.4 Report
Write a project report as a connected piece of prose. You may use any suitable reference sources,
but these must be clearly identified in the report at the points of use.

1. Briefly state the theory behind the conjugate gradient method.

2. Give a detailed description of any classes used, alongside code listings, describing what
they are used for and what benefits they bring.

3. Include tables or figures of convergence and computation times for the problems stated in
the report.

4. Try to explain why certain problems are more difficult (or impossible) to solve.

5. Include some 3D plots of the solutions in §3.3.

6. Compare and contrast your three matrix storage classes for the problems stated in
§3.3.

Marks will be awarded for clarity and correctness of code as well as answers to the questions
and discussion.

Mini Project 4

Sorting and Searching

Author: Dr. Andrew Hazel, Andrew.Hazel@manchester.ac.uk

Sorting and/or searching through data is something that is extremely common when
programming. It is an essential part of many algorithms, but the basic sorting problem is interesting
in its own right. In this project, we shall implement some standard sorting algorithms with
increasing levels of abstraction. The sorting methods will be used to implement the game of life
using very little memory. It will be assumed that students are familiar with objects and inheritance,
and can output data to files.

4.1 Sorting and searching


4.1.1 The sorting problem
The sorting problem is to find a permutation π of a sequence of n elements a_0, a_1, ..., a_{n−1} such
that a_{π(0)} ≤ a_{π(1)} ≤ ... ≤ a_{π(n−1)}, where ≤ is a suitably defined partial order. Note that the
indexing is from 0 to be consistent with standard C++ convention. For our purposes, a partial
order is defined as a relation ≤ on a set S, such that for a, b, c ∈ S:

• ≤ is reflexive — a ≤ a is true.

• ≤ is transitive — a ≤ b and b ≤ c ⇒ a ≤ c.

• ≤ is antisymmetric — a ≤ b and b ≤ a ⇒ a = b.

An important point is that we must be able to define such a relation if it is to be possible to
sort our set.

4.1.2 Bubblesort
Bubblesort is a very simple sorting algorithm. The idea is to move the greatest element to the n-th
(final) position in the sequence, then to move the greatest element in the subsequence a_0, ..., a_{n−2}
to the (n − 1)-th position, and so on. In each (sub)sequence, the method used to move elements is
to compare two neighbouring elements a_i and a_{i+1} and interchange them if a_{i+1} < a_i. Assuming
that the sequence is stored in an MVector, see §4.2.1, the algorithm used to move the greatest
element to the final position of a sequence of length n would be

for(i=0;i<n-1;++i)
{
if(a[i+1] < a[i]) {a.swap(i,i+1);}
}

where swap(i,i+1) is a member function of the MVector class that exchanges the entries a[i]
and a[i+1].
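Repeating that pass on ever shorter subsequences gives the complete sort. The following is a minimal sketch that works on a std::vector<double> and uses std::swap in place of the MVector member function; the name bubble_sort is an assumption for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sketch of the full bubblesort: pass number `pass` bubbles the greatest
// remaining element up to position n - 1 - pass.
void bubble_sort(std::vector<double>& a)
{
    const std::size_t n = a.size();
    for (std::size_t pass = 0; pass + 1 < n; ++pass)
        for (std::size_t i = 0; i + 1 < n - pass; ++i)
            if (a[i + 1] < a[i]) std::swap(a[i], a[i + 1]);
}
```

Only the < comparison and an exchange of neighbours are needed, which is worth bearing in mind when the algorithms are abstracted later in the project.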

4.1.3 Quicksort

Quicksort is an alternative sorting algorithm that runs much more quickly than bubblesort in
most cases and can be defined recursively. The idea is to choose a random element from the
sequence; suppose it has the value x. The sequence is then divided into three subsequences

• S1 : all elements with values less than x.

• S2 : all elements with values equal to x.

• S3 : all elements with values greater than x.

The quicksort algorithm is then applied to the subsequences S1 and S3 and the sorted output
consists of (sorted) S1 followed by S2 followed by (sorted) S3 . When writing recursive algorithms
it is extremely important to have a termination criterion (otherwise the algorithm will continue
forever). The termination criterion in this case is that if the sequence contains one or no
elements there is no sorting to be done, so it just returns the sequence.
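The description above can be followed literally by copying elements into three new sequences. This sketch is deliberately not in place, and it takes the middle element as x instead of a random one (an assumption made so that the example is deterministic); the name quick_sort is illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch: quicksort by explicit division into S1, S2 and S3.
std::vector<double> quick_sort(const std::vector<double>& s)
{
    if (s.size() <= 1) return s; // termination criterion
    const double x = s[s.size() / 2];
    std::vector<double> s1, s2, s3;
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        if (s[i] < x) s1.push_back(s[i]);      // less than x
        else if (x < s[i]) s3.push_back(s[i]); // greater than x
        else s2.push_back(s[i]);               // equal to x
    }
    std::vector<double> sorted = quick_sort(s1);
    sorted.insert(sorted.end(), s2.begin(), s2.end());
    const std::vector<double> tail = quick_sort(s3);
    sorted.insert(sorted.end(), tail.begin(), tail.end());
    return sorted;
}
```

Because S2 always contains at least x itself, S1 and S3 are strictly shorter than the original sequence, so the recursion is guaranteed to reach the termination criterion.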

4.1.4 Heapsort

A heap is a special type of binary tree in which the value at each vertex is always greater than
or equal to the values at its sons, see Figure 4.1(b). Arranging a sequence into a heap allows a
particularly fast sort to be performed.

Binary Trees

A binary tree is a particular type of directed, acyclic graph that is a very convenient data
structure, see http://en.wikipedia.org/wiki/Binary_tree or many textbooks for full
details. The easiest way to understand a tree is to look at a picture and Figure 4.1(a) shows a
simple binary tree.

[Figure 4.1, reading each tree row by row: (a) a binary tree with rows 0 / 1 2 / 3 4 5 6;
(b) a heap with rows 6 / 4 5 / 3 1 0 2.]

Figure 4.1: (a) a binary tree with three levels: the root is the node labelled 0 with two sons, 1
and 2. Each son has two sons of its own (3 and 4, and 5 and 6) and these are the leaves. (b) the
tree rearranged into a heap in which the value at a vertex is always greater than or equal to the
values of its sons.

Making a heap from a tree

A recursive algorithm for creating a heap from a tree is relatively easy to construct. Consider
a tree consisting of a root with only two sons: if either of the values at the sons is greater than
the value at the root, then exchanging that value with the value at the root will make a local
heap. For a general tree, we apply this algorithm to every vertex of the tree starting from the
bottom and working upwards. (We cannot apply the algorithm if a vertex has no sons, so we
actually start from the penultimate row.) If an exchange takes place then we must check that
the lower portion of the tree remains a heap. Thus, we apply the algorithm recursively to the
(sub)tree with the recently-exchanged son as the root.
The algorithm applied to the tree in Figure 4.1(a) yields the heap in Figure 4.1(b). We shall
illustrate the algorithm by representing the tree in a linear array reading across each row in
turn; the initial configuration is 0123456 and the final configuration (heap) is 6453102.

• Vertex 2: greatest-son-value is 6, exchange 2 and 6 to give 0163452.

• Vertex 1: greatest-son-value is 4, exchange 1 and 4 to give 0463152.

• Vertex 0: greatest-son-value is 6, exchange 0 and 6 to give 6403152.

• Apply algorithm recursively to exchanged son

– Vertex 0: greatest-son-value is 5, exchange 0 and 5 to give 6453102

To understand this algorithm it really helps to draw your own tree and draw out the moves
made to generate the heap.
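The worked example can also be traced in code. The sketch below assumes the row-by-row array layout used above, in which the sons of entry i are entries 2i + 1 and 2i + 2; the names sift_down and make_heap_from_tree are hypothetical.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Sketch: make a heap of the subtree rooted at index i, treating only the
// first n entries of h as part of the tree.
void sift_down(std::vector<int>& h, int i, int n)
{
    const int left = 2 * i + 1, right = 2 * i + 2;
    int greatest = i;
    if (left < n && h[greatest] < h[left]) greatest = left;
    if (right < n && h[greatest] < h[right]) greatest = right;
    if (greatest != i)
    {
        std::swap(h[i], h[greatest]);
        sift_down(h, greatest, n); // re-check the subtree that was disturbed
    }
}

// Sketch: apply sift_down to every vertex that has sons, bottom row first.
void make_heap_from_tree(std::vector<int>& h)
{
    const int n = static_cast<int>(h.size());
    for (int i = n / 2 - 1; i >= 0; --i) sift_down(h, i, n);
}
```

Starting from 0123456, the intermediate arrays are 0163452, 0463152, 6403152 and finally 6453102, matching the steps listed above.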

Sorting the heap

Once we have a heap then we can easily find the element with greatest value ... it’s at the top
of the heap! The idea of heapsort is to remove this top value, replace it by the value of one
of the leaves and then apply the recursive heap-generation algorithm to the root, which moves
the element with the next greatest value to the top of the heap, and so on. Thus we build
up the sorted data by removing the top of the heap each time.
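One way to carry this out without extra storage is to exchange the top of the heap with the last leaf and then shrink the heap by one entry. The sketch below assumes the array layout in which the sons of entry i are entries 2i + 1 and 2i + 2, and uses an iterative form of the heap-restoring step; the names restore_heap and heap_sort are illustrative.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Sketch: restore the heap property below index i, using only the first n
// entries of h (iterative rather than recursive).
void restore_heap(std::vector<double>& h, int i, int n)
{
    while (true)
    {
        int greatest = i;
        const int left = 2 * i + 1, right = 2 * i + 2;
        if (left < n && h[greatest] < h[left]) greatest = left;
        if (right < n && h[greatest] < h[right]) greatest = right;
        if (greatest == i) return;
        std::swap(h[i], h[greatest]);
        i = greatest;
    }
}

// Sketch: in-place heapsort into ascending order.
void heap_sort(std::vector<double>& h)
{
    const int n = static_cast<int>(h.size());
    for (int i = n / 2 - 1; i >= 0; --i) restore_heap(h, i, n); // build heap
    for (int end = n - 1; end > 0; --end)
    {
        std::swap(h[0], h[end]); // top of the heap joins the sorted tail
        restore_heap(h, 0, end); // the heap now occupies entries 0..end-1
    }
}
```

The sorted values accumulate at the back of the array, so the result is in ascending order even though the greatest element is removed first.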

4.2 Coding, Examples and Exercises


Initially, we shall write our algorithms to sort the data in a vector class.

4.2.1 Creating our own vector class from the standard library

You should use the following class definition for the new class MVector in a header file or at
the top of your main code.
// class MVector contains arrays that can work with doubles
class MVector
{
    // storage for the new vector class
    vector<double> v;
public:
    // constructors
    explicit MVector() {}
    explicit MVector(int n) : v(n) {}
    explicit MVector(int n, double x) : v(n, x) {}
    // equate vectors
    MVector& operator=(const MVector& X)
    { if (&X==this) return *this; v=X.v; return *this; }
    // access data in vector
    double& operator[](int index) { return v[index]; }
    // access data in vector (const)
    double operator[](int index) const { return v[index]; }
    // size of vector
    int size() const { return v.size(); }
}; // end class MVector

The class MVector should act in exactly the same way as a std::vector, except that we do
not have access to all the public functions of the std::vector, and we have explicitly chosen
double as the data type stored in the array. N.B. You will have to include the <vector>
header in order to use the standard vector class.

Tasks:

1. Write a very simple main program that creates an MVector of size 10, stores whatever
data you like and outputs that data to a file.

2. Add range-checking errors to your square-bracket [ ] access functions so that if the index
is out of range then the program exits with an error. Test that the range-checking works!
N.B. This can save your life when developing code, but will make your code run more
slowly, so you should remove it in final “production runs”.

3. Add an additional member function to the MVector class

void MVector::swap(int i, int j);

that interchanges the values of the i-th and j-th entries of the MVector.

4. If you like, overload the << operator to produce pretty output of the vector in the form
(v[0], v[1], . . . , v[n − 1]).

4.2.2 Creating initial data


Example:

Add a member function

void MVector::initialise_random(double xmin, double xmax);

to the MVector class that assigns a random (double) value, between the limits xmin and xmax,
to each entry in the vector. You may use the C++ rand() function defined in the cstdlib
library.

Solution:

The random number generator requires an initial seed, set using the function std::srand(...).
If this is not set then the same pseudo-random sequence will be generated on every run of the
program, so the initial data won’t be very random at all! We shall modify the constructor of the
MVector to set the seed using the system time, which requires the inclusion of the time.h header.
Thus, we need to include two headers at the top of our code.
#include <time.h>
#include <cstdlib>

The constructor is modified as follows

explicit MVector(int n) : v(n)
{ srand(static_cast<unsigned>(time(NULL))); }

We then fill in the random entries using the rand() function. The standard random number
generator returns an integer between 0 and RAND_MAX, so we scale this value to lie within our
desired range.
void MVector::initialise_random(double xmin, double xmax)
{
    const unsigned n = this->size();
    for (unsigned i=0; i<n; i++)
    {
        // Scale the random number to lie between xmin and xmax
        v[i] = xmin + rand()*(xmax - xmin) /
               static_cast<double>(RAND_MAX);
    }
}

Tasks:

1. Add the source code above to your program and make sure that it compiles and runs.

2. Make sure that you modify all constructors to set the seed of the random number
generator.

3. Check the functionality by generating a number of vectors of varying lengths and filling them
with random initial values. How can you test the randomness of the entries? (This is a
hard question that you may like to think about further).
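As an aside, C++11 provides the <random> header as an alternative to rand(). The sketch below is not the approach taken in this project: it fills a std::vector<double> from a generator supplied by the caller, so that a single seed controls every vector in the program. The name fill_random is an assumption for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Sketch: fill v with doubles drawn uniformly between xmin and xmax using
// a caller-supplied Mersenne Twister engine.
void fill_random(std::vector<double>& v, double xmin, double xmax,
                 std::mt19937& engine)
{
    assert(xmin < xmax);
    std::uniform_real_distribution<double> dist(xmin, xmax);
    for (std::size_t i = 0; i < v.size(); ++i) v[i] = dist(engine);
}
```

Two engines constructed with the same seed produce identical data, which is convenient when a timing experiment needs to be repeated on exactly the same input.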

4.2.3 Implementing the sorting algorithms


Bubblesort

Tasks:

1. Add a function to your program in a (global) namespace called Sort


namespace Sort
{
    void bubble(MVector &v);
}

that implements the bubblesort algorithm described in §4.1.2. The function should return
the sorted data in the MVector passed into the function, i.e. the code
MVector v(3);
v[0] = 5.5; v[1] = 2.0; v[2] = 1.0;
Sort::bubble(v);
std::cout << v;

should produce the output

(1.0, 2.0, 5.5)

assuming that you have overloaded the << operator; if not, why not do it now?

2. Write a program that computes the average sort time for a randomly-initialised vector of
length n. You might want to use the following function to make your timings:

// Return the processor time used by the program so far, in seconds
// (requires #include <ctime>)
double timer()
{
    clock_t t = clock();
    return static_cast<double>(t) /
           static_cast<double>(CLOCKS_PER_SEC);
}
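The clock() function reports processor time. If wall-clock time is preferred, the C++11 <chrono> header offers an alternative; the following is a sketch only, and the helper name average_seconds is hypothetical.

```cpp
#include <cassert>
#include <chrono>

// Sketch: average wall-clock time, in seconds, of `repeats` calls to f(),
// measured with std::chrono::steady_clock.
template <class F>
double average_seconds(F f, int repeats)
{
    assert(repeats > 0);
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    for (int k = 0; k < repeats; ++k) f();
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() / repeats;
}
```

Note that f must re-create the random data on every call, otherwise all but the first call would be sorting an already sorted vector; the cost of generating the data can be measured separately and subtracted.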

Report:

Your report for this section should contain

• A hard-copy of your function bubble(...).

• A graph of the average sort time as a function of n. Can you deduce the functional form
of the curve?

• An analysis of the expected form of the curve from theoretical considerations, i.e. what
is the complexity of the algorithm?

Quicksort

There are two subtleties here: 1) How does one choose the “random” value in the sequence?
2) How does one divide the sequence into the three subsequences? For the first of these, you
could always use the random number generator used to generate the random initial data. The
second requires a bit more thought. If you think about it carefully you will find that you can
divide the sequence “in place” (i.e. without using any significant extra memory) by using a
series of carefully chosen exchanges. You will need to keep track of the starts and ends of each
subsequence. That said, it is always better to first write a version that works and then think
about how to make it more efficient. You may find that you need to add additional functions
to the MVector class.

Tasks:

1. Add two functions to the Sort namespace


namespace Sort
{
    void quick_recursive(MVector &v, int start, int end);
    void quick(MVector &v) { quick_recursive(v, 0, v.size()-1); }
}

to implement the quicksort algorithm described in §4.1.3. The quick_recursive(...)
function should perform a quicksort on the section of the MVector between the indices
start and end. The function quick(...) is a simple “wrapper” to the recursive function
that does the actual work. Be very careful when implementing the recursive algorithm to
ensure that it has an appropriate termination criterion.

2. Write a program that calculates the average sort time for a randomly-initialised vector of
length n.

Report:

Your report for this section should contain

• A hard-copy of your function quick_recursive(...).

• A graph of the average sort time as a function of n. Can you deduce the functional form
of the curve?

• An analysis of the expected form of the curve from theoretical considerations, i.e. what
is the complexity of the algorithm?

Heapsort

Heapsort requires some thought about how to represent the heap structure. One way, not the
only way, is to store the heap in a vector-like structure with the indexing as shown in Figure
4.1(a). In other words, the root is at h[0] and the sons of the vertex h[i] are located at h[2 ∗ i + 1]
and h[2 ∗ i + 2]. Draw lots of pictures and convince yourself that this scheme really does work!

Tasks:

1. Using the heap-storage scheme suggested above, find the criterion that determines whether
a vertex is a leaf (i.e. has no sons).

2. Add two functions to the Sort namespace


namespace Sort
{
    void heap_from_root(MVector &v, int i, int n);
    void heap(MVector &v);
}

The first of these should make a heap with the i-th vertex as the root, assuming that only
the first n values stored in the vector make up the heap data. The second should implement
the heapsort algorithm described in §4.1.4. It should use the function heap_from_root(...)
to build the heap and then to perform the sorting. If you are careful, you should be able
to perform the sorting efficiently without significant additional memory. Once again, the
most effective way to write code is to get the thing working first and then worry about
efficiency.

3. Write a program that calculates the average sort time for a randomly-initialised vector of
length n.

Report:

Your report for this section should contain

• A hard-copy of your functions heap_from_root(...) and heap(...).

• A graph of the average sort time as a function of n. Can you deduce the functional form
of the curve?

• An analysis of the expected form of the curve from theoretical considerations, i.e. what
is the complexity of the algorithm?

4.2.4 Abstracting the sorting algorithms

We have now implemented sorting algorithms for MVector objects, but we can use OO
techniques to write much more general algorithms to sort any objects with certain properties. Have
a careful look at your algorithms and think about the operations used. You should find that
you can write all the algorithms using only three operations on the data stored in the MVector:
size(), swap(i,j) and cmp(i,j), where cmp(i,j) is a comparison operator, e.g. <. You do
not need an equality operator.

Tasks:

1. Work out how to write a test for the equality of two elements in the vector using only the
comparison operator.

2. Copy your code into a new file, keeping a backup of the previous version.

3. In the new file, add a member function


bool MVector::cmp(int i, int j);

that returns true if the value v[i] is less than the value v[j] and false otherwise.

4. In the new file, rewrite your sorting algorithms to use the cmp member function, rather
than <, where appropriate. Test that your algorithms still work.

5. If you’ve done everything correctly you should be able to change the program to sort the
values in descending order by changing a single < to a >. Can you see where to do this?
[Hint: it’s in the MVector class].

6. Copy your code into yet another new file and add a protocol class SortableContainer that
contains all interfaces required for sorting data stored within the class.

7. Modify your sorting algorithms so that they work with SortableContainers rather than
MVectors, and modify the MVector class so that it inherits from SortableContainer.
Test your algorithm.
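To illustrate what such a protocol class might look like, here is a minimal sketch built around the three operations size(), swap(i,j) and cmp(i,j). The exact interface is for you to design: declaring cmp as const and the small container IntList are assumptions made purely for this example.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Sketch of a protocol (abstract base) class exposing the three operations.
class SortableContainer
{
public:
    virtual ~SortableContainer() {}
    virtual int size() const = 0;
    virtual void swap(int i, int j) = 0;
    virtual bool cmp(int i, int j) const = 0; // true if entry i < entry j
};

// Bubblesort written purely in terms of the protocol.
void bubble(SortableContainer& c)
{
    const int n = c.size();
    for (int pass = 0; pass < n - 1; ++pass)
        for (int i = 0; i < n - 1 - pass; ++i)
            if (c.cmp(i + 1, i)) c.swap(i, i + 1);
}

// Hypothetical container used only to exercise the sketch.
class IntList : public SortableContainer
{
    std::vector<int> v;
public:
    explicit IntList(const std::vector<int>& data) : v(data) {}
    int size() const { return static_cast<int>(v.size()); }
    void swap(int i, int j) { std::swap(v[i], v[j]); }
    bool cmp(int i, int j) const { return v[i] < v[j]; }
    int operator[](int i) const { return v[i]; }
};
```

The sorting function never sees the stored data type, so the same code sorts any container that implements the interface.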

Sorting a two-dimensional array of coordinates

Now that we have abstracted the sorting algorithms we can create any number of new SortableContainers.
One object that will be useful later is an array of coordinates. Firstly, we define a
two-dimensional coordinate structure
struct IntegerCoordinate
{
    unsigned X;
    unsigned Y;
};

Tasks:

1. Create a class CoordinateArray that inherits from SortableContainer and stores a vector
of IntegerCoordinate objects.
std::vector<IntegerCoordinate> v;

Make sure that you include the necessary access functions so that you can access any
coordinates stored in the array. You should also include a resize(int n) function.

2. Write a comparison operator


bool CoordinateArray::cmp(int i, int j);

that returns true if the coordinate v[i] is lexicographically less than coordinate v[j] and
false otherwise. A lexicographical ordering is defined by

(X, Y ) < (A, B) if X < A or [X = A and Y < B] .

3. Test your new class by sorting some CoordinateArrays of your choice.

Report:

Your report for this section should contain

• A hard-copy of your abstracted sorting functions and your CoordinateArray class.

• A comparison of average times taken to sort n doubles and n IntegerCoordinates and an
interpretation of the results.

4.2.5 Sorting in the STL (optional)


The C++ standard template library (STL) has a number of sorting and searching functions.
These are implemented at a yet further level of abstraction because we have so far assumed
that all our SortableContainers can be indexed by an integer and that we only want one type
of comparison operator for each container. This is too much of a restriction, as far as the STL

is concerned. The two ideas are that: 1) the entries in a container can all be accessed using
an iterator: an object that iterates through the data; and 2) the comparison is performed by a
special comparison functor: an object that behaves like a function.
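Both ideas appear in the following sketch, which sorts coordinates with std::sort. The iterators are those returned by begin() and end(), and the functor implements a lexicographic comparison; the names LexicographicLess and sort_coordinates are illustrative assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct IntegerCoordinate
{
    unsigned X;
    unsigned Y;
};

// Comparison functor: an object whose operator() behaves like a function.
struct LexicographicLess
{
    bool operator()(const IntegerCoordinate& p,
                    const IntegerCoordinate& q) const
    {
        return p.X < q.X || (p.X == q.X && p.Y < q.Y);
    }
};

// Sketch: sort the range [begin, end) using the functor above.
void sort_coordinates(std::vector<IntegerCoordinate>& v)
{
    std::sort(v.begin(), v.end(), LexicographicLess());
}
```

Passing a different functor changes the ordering without touching the container, which is exactly the flexibility that a single cmp member function lacks.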

Tasks

1. Read up on sorting and searching in the STL using available web resources.

2. Convert your program so that the sorting is performed by the STL sort() algorithm.

4.3 The game of life using a CoordinateArray (optional)


The game of life is a cellular automaton which obeys a very simple set of rules. The classic
game is played on a two-dimensional square grid of cells and each cell is either live or dead.
As time advances the state of the cells changes according to various rules that depend on the
states of the eight neighbouring cells. The classic rules are:

• If a dead cell has exactly 3 live neighbours it becomes live, otherwise it remains dead.

• If a live cell has 2 or 3 live neighbours it remains live, otherwise it dies.

These rules can be summarized by the notation B3/S23, indicating that a dead cell becomes
live with 3 neighbours and a live cell stays alive with 2 or 3 neighbours.
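The rule B3/S23 amounts to a very small function of a cell's current state and its number of live neighbours. A minimal sketch, with the hypothetical name next_state:

```cpp
#include <cassert>

// Sketch of rule B3/S23: state of a cell at the next time step, given its
// current state and its number of live neighbours (0 to 8).
bool next_state(bool live, int live_neighbours)
{
    assert(live_neighbours >= 0 && live_neighbours <= 8);
    if (live) return live_neighbours == 2 || live_neighbours == 3; // survival
    return live_neighbours == 3;                                   // birth
}
```

The difficult part of the project is not this rule but counting the live neighbours efficiently when only the live cells are stored.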
One simple way to represent the game grid is to construct a two-dimensional array of
booleans, but we must then store one boolean for every point in the grid. A more
memory-efficient [1] storage method is to store only the locations of the live cells in some sort of data
structure. The problem is that the state of the neighbours of any live points can only be found
by searching through the data structure.

[1] The trade-off is worth considering. Imagine that we are playing on a grid of dimension 2^32 × 2^32
(the standard maximum of an unsigned integer). Then we will require 2^64 bits to store the entire grid
as booleans. To convert this into bytes we divide by 2^3, so we have 2^61 bytes = 2^51 kB = 2^41 MB =
2^31 GB, which is an awful lot of storage! Storing a cell’s coordinates requires two integers (usually
8 bytes) per live point, which means that we can hold 2^27 live points in 1 GB of RAM.

We shall create a Life class that stores the coordinates of the live cells as a CoordinateArray.

Tasks

1. Write a class Life with the following prototype

class Life
{
public:
    // Storage for the coordinates of the live cells
    CoordinateArray LiveCells;

    // Constructor
    Life();

    // Advance time one unit
    void tick();
};

You can use the lexicographical sorting on the CoordinateArray to allow relatively fast
determination of the live neighbours of the cell and to determine the dead cells that have
live neighbours. You may also feel free to investigate other methods to evolve the game
of life and may use the STL if you wish. The only restriction is that the data structure
that represents the state of the game must consist only of a container that stores the
coordinates of the live cells.

2. Evolve the following two initial configurations of live cells

(a) {(1, 1), (1, 2), (2, 1), (2, 2)},


(b) {(5, 5), (5, 4), (5, 6)}.

3. Evolve the following initial configuration (the acorn) of live cells:

{(10001, 10001), (10002, 10001), (10002, 10003),

(10004, 10002), (10005, 10001), (10006, 10001), (10007, 10001)}.

4.4 Report
Write a project report as a connected piece of prose. You may use any suitable reference sources,
but these must be clearly identified in the report at the points of use.
Your report must include:

• code listings for the three sorting algorithms: bubblesort, quicksort and heapsort;

• graphs of the average sort time as a function of the number of objects being sorted for
the three algorithms;

• a comparison of the data in your graphs with the theoretical complexity of the algorithms;

• a discussion of which sorting algorithm is “best” in your opinion with reasons for your
choice;

• a code listing for your game of life object;

• a representation (figure) of the state of the game after 5026 generations (time steps) of
the acorn pattern described in Task 3 above.

• answers to the following questions:

– what happens when you start with the initial configurations in Task 2 above?

– what is the total number of live cells after 5026 generations of the acorn pattern?

Marks will be awarded for clarity and correctness of code as well as answers to the questions
and discussion. The majority of the marks for this project (80%) will be awarded for the work
described in §4.2, with only 20% of the marks available for the investigation of the game
of life, §4.3.
