
EEE484 COMPUTATIONAL METHODS Course Notebook

February 3, 2009
[Title-page figure: a character plot of the numerical solution for the potential around a dipole.]

http://www1.gantep.edu.tr/andrew/eee484/
Dr Andrew Beddall (andrew@gantep.edu.tr), Department of Electrical and Electronics Engineering, University of Gaziantep, Turkey.

Preamble

This notebook presents notes, exercises and example exam questions for the course EEE484. Fortran and C++ solutions can be found in the downloads section of the course web-site. The content of this document is automatically built from the course web-site (this build is dated Tue Feb 3 11:45:18 EET 2009). You can download the latest version, in postscript or pdf format, from the course web-site. Only 8 topics are present in this version (more coming soon):

Lecture 1 - Numerical Truncation, Precision and Overflow
Lecture 2 - Numerical Differentiation
Lecture 3 - Roots, Maxima, Minima (closed methods)
Lecture 4 - Roots, Maxima, Minima (open methods)
Lecture 5 - Numerical Integration: Trapezoidal and Simpson's formulae
Lecture 6 - Solution of D.E.s: Runge-Kutta, and Finite-Difference
Lecture 7 - Random Variables and Frequency Experiments
Lecture 8 - Monte-Carlo Methods

Title page figure: Numerical solution for the potential around a dipole.

Contents

1 Numerical Truncation, Precision and Overflow
  1.1 Topics Covered
  1.2 Lecture Notes
  1.3 Lab Exercises
  1.4 Lab Solutions
  1.5 Example exam questions

2 Numerical Differentiation
  2.1 Topics Covered
  2.2 Lecture Notes
  2.3 Lab Exercises
  2.4 Lab Solutions
  2.5 Example exam questions

3 Roots, Maxima, Minima (closed methods)
  3.1 Topics Covered
  3.2 Lecture Notes
  3.3 Lab Exercises
  3.4 Lab Solutions
  3.5 Example exam questions

4 Roots, Maxima, Minima (open methods)
  4.1 Topics Covered
  4.2 Lecture Notes
  4.3 Lab Exercises
  4.4 Lab Solutions
  4.5 Example exam questions

5 Numerical Integration: Trapezoidal and Simpson's formulae
  5.1 Topics Covered
  5.2 Lecture Notes
  5.3 Lab Exercises
  5.4 Lab Solutions
  5.5 Example exam questions

6 Solution of D.E.s: Runge-Kutta, and Finite-Difference
  6.1 Topics Covered
  6.2 Lecture Notes
  6.3 Lab Exercises
  6.4 Lab Solutions
  6.5 Example exam questions

7 Random Variables and Frequency Experiments
  7.1 Topics Covered
  7.2 Lecture Notes
  7.3 Lab Exercises
  7.4 Lab Solutions
  7.5 Example exam questions

8 Monte-Carlo Methods
  8.1 Topics Covered
  8.2 Lecture Notes
  8.3 Lab Exercises
  8.4 Lab Solutions
  8.5 Example exam questions

Linux Tutorial

1 Numerical Truncation, Precision and Overflow

1.1 Topics Covered

o Introduction to numerical methods; Taylor's expansion and truncation errors; round-off errors, overflow; precision of data types in Fortran and C++.

1.2 Lecture Notes

Introduction

It is important to understand that, in general, numerical methods are not exact; neither are the machines (computers) that perform the numerical calculations for us. In this lecture we will look at the nature of truncation errors and round-off errors. An understanding of these sources of error in numerical methods is as important as an understanding of the methods themselves.

Numerical Methods

We apply numerical techniques to solve numerical problems when analytical solutions are difficult or inconvenient. A simple example is the computation of the first derivative of a function f(x). Calculus gives us an analytical method for forming an expression for the derivative; however, such analysis for some functions may be difficult, impossible, or inconvenient. A simple numerical solution uses the Forward-Difference Approximation (FDA), which approximates the derivative by taking the gradient of the function f(x) in the region x to x+h:

FDA = ( f(x+h) - f(x) ) / h

where h is small but not zero. For example, if f(x) = 2x^2 + 4x + 6 and we wish to determine the first derivative evaluated at x=3, the FDA (using h=0.01) gives:

( (2(3.01)^2 + 4(3.01) + 6) - (2(3.00)^2 + 4(3.00) + 6) ) / 0.01 = 16.02

Of course this is only an approximation (the true value, by calculus, is 16). The function can be plotted with:

gnuplot> plot [0:4] 2*x**2+4*x+6

Truncation Errors

The error in the above approximation can be written as FDA - f'(x) = 16.02 - 16 = 0.02. This is called a truncation error as it is due to the truncation of higher orders in the exact expression for the first derivative. We can see the form of the truncation error in the FDA by considering Taylor's expansion:

f(x+h) = f(x) + h f'(x)/1! + h^2 f''(x)/2! + h^3 f'''(x)/3! + ...

Rearrange for the FDA:

( f(x+h) - f(x) ) / h = f'(x) + h f''(x)/2 + h^2 f'''(x)/6 + ...

=> FDA = f'(x) + (h/2) f''(x) + O(h^2)

where f'(x) is the derivative and (h/2) f''(x) + O(h^2) is the truncation error in the FDA.

We see that the FDA gives the first derivative plus some extra terms in the series. The error in the approximation, FDA - f'(x), is therefore (h/2) f''(x) + O(h^2). This can be checked numerically with the above example: (h/2) f''(x) = (0.01/2) x 4 = 0.02 (as found above). The truncation error in the FDA is proportional to h; the FDA is therefore called a first-order approximation. Higher-order methods have truncation errors that are proportional to higher powers of h and therefore yield smaller truncation errors (when h is less than one). We will investigate the round-off error in the above calculation at the end of the next section.

Computer Precision (Round-off Errors)

Numerical methods are implemented in computer programs where the numerical calculations can be performed quickly and conveniently. However, numbers are stored in computer memory with a limited precision; the loss of precision of a value is called a round-off error. Round-off errors can occur when a value is initially assigned, and can be compounded when values are combined in arithmetic operations. Iteration is common in computational methods and so it is important to minimise compound round-off. As round-off errors can be a significant source of error in a numerical method (in addition to the truncation error) we will look more closely at the nature of the round-off error and how it can be reduced.

A binary representation is used to store numbers in computer memory. For example, the binary number 11.011 represents exactly the decimal number 3.375:

1x2 + 1x1 + 0x1/2 + 1x1/4 + 1x1/8 = 3.375

Similarly the decimal value 0.3125 can be expanded to 0.25 + 0.0625 = 1/4 + 1/16, which can be stored exactly in binary as 0.0101. However, given a limited number of binary digits, it is possible that even a rational decimal number might not be stored precisely in binary. For example there is no precise representation for 0.3; the nearest representation with 8 bits is 0.01001101, which gives 0.30078125. The precision increases as more binary digits are used, but there is always a round-off error. In general, the only real numbers that can be represented exactly in the computer's memory are those that can be written in the form m/2^k where m and k are integers; however, again there is a limit to the set of numbers that are included in this group due to the limited number of binary digits used to store the value.

Floating-Point Representation

Computers store REAL numbers (as opposed to INTEGER numbers) in the floating-point representation

value = m x b^e

where m is the mantissa, b is the base (= 2 in computers) and e is the exponent. In Fortran, a type real number is stored in 32 binary bits (4 bytes) [this is equivalent to a float in C/C++]. To allow for a large exponent range, the binary bits available for storage are shared between the mantissa and the exponent of the number. For example the number 413.26 is represented by a mantissa part and an exponent part as 0.41326 x 10^3. The division of the 32 binary bits is as follows: 8 bits are used to store the exponent, 1 bit for the sign of the exponent, and 23 bits for the mantissa. The precision of the storage of real data is therefore limited by the 23 bits used to store the mantissa.

In Fortran the number of binary bits used to store type real numbers can be increased from the default 32 to 64 or 128 by declaring the type real data with the kind specifier. The default single-precision data has kind=4, where each data item is stored in 4 bytes (32 binary bits) of memory.
Double-precision data (kind=8) is allocated 8 bytes (64 binary bits) [this is equivalent to a double in C/C++] and quad-precision (kind=16) 16 bytes (128 binary bits). Double precision has about twice the precision of single precision and a much larger range in the exponent; quad precision has more than four times the precision and a very large range in the exponent. The three real kinds are illustrated in the table below.

+------------------+---------------------+------------------+---------+----------------+
| Type and Kind    | Memory allocation   | Precision        | Range   | C/C++          |
+------------------+---------------------+------------------+---------+----------------+
| real (kind=4 )*  | 4 bytes ( 32 bits)  | 7 s.f. (Single)  | 10^38   | "float"        |
| real (kind=8 )   | 8 bytes ( 64 bits)  | 15 s.f. (Double) | 10^308  | "double"       |
| real (kind=16)   | 16 bytes (128 bits) | 34 s.f. (Quad)   | 10^4931 | "long double"+ |
+------------------+---------------------+------------------+---------+----------------+
* default kind in Fortran.  s.f. = "significant figures".  + only on 64-bit platforms.

A limitation is also placed on the range of values that can be stored; this is illustrated for single-precision type real data below:

 overflow <------------------------->  underflow  <-------------------------> overflow
        -10^38                  -10^-45        +10^-45                  +10^38

If a number exceeds the permitted range, for example -10^38 to +10^38, then it cannot be stored; such a situation results in the program continuing with wrong values or terminating with an overflow error. There is also a limit to the representation of very small real numbers: the smallest magnitudes for single-precision real data are about 10^-45; attempting to store a value smaller than this results in an underflow error.

Similarly, integer type data can be stored in 1, 2, 4 or 8 bytes, each giving a larger range of values that can be represented. The default integer kind in Fortran is 4 bytes (kind=4) [int in C/C++]. As an integer number is exact there is no corresponding precision; the only limitation is then that of range (integer overflow). The four kind types for integers, and the corresponding ranges, are summarised in the table below.
+------------------+-------------------+----------------------------+----------------+
| Type and Kind    | Memory allocation | Range                      | C/C++ (signed) |
+------------------+-------------------+----------------------------+----------------+
| integer (kind=1) | 1 byte ( 8 bits)  | -128 to 127                | "char"         |
| integer (kind=2) | 2 bytes (16 bits) | -32768 to 32767            | "short"        |
| integer (kind=4)*| 4 bytes (32 bits) | -2147483648 to 2147483647  | "int"          |
| integer (kind=8) | 8 bytes (64 bits) | about +- 9x10^18           | "long"+        |
+------------------+-------------------+----------------------------+----------------+
* default kind in Fortran.  + only on 64-bit platforms.

kind specification in Fortran [you can investigate the C/C++ equivalent in your own time]

Examples of the declaration of data of different kinds:

real            :: A   ! Default (single precision)
real (kind=4)   :: B   ! Single precision
real (kind=8)   :: C   ! Double precision
real (kind=16)  :: D   ! Quad precision

Examples of assignments:

A = 1.2345678_4 or simply 1.2345678          (Single precision)
C = 1.234567890123456_8                      (Double precision)
D = 1.2345678901234567890123456789012345_16  (Quad precision)

Note that the underscore symbol is used to define the precision of the constant; if this is not used then some precision might be lost, or unpredictable values assigned to some of the least significant digits. For example:

real(kind=8) :: C = 1.11111111111111_8   assigns C with 1.11111111111111
real(kind=8) :: C = 1.11111111111111     assigns C with 1.11111116409302
real(kind=8) :: C = 1.111111             assigns C with 1.11111104488373

In the last two cases the final 7 or 8 digits have been assigned garbage from residual values in memory. The E symbol can be used for exponentiation:

A = 1.234568E38 or A = 1.234568E38_4
C = 1.23456789012346E308_8
D = 1.234567890123456789012345678901235E4931_16

We will see later in the course how double- and quad-precision can greatly reduce round-off errors in numerical methods. Remember: although double- and quad-precision can reduce round-off errors, they have no effect on the size of truncation errors; truncation errors are inherent to the numerical method and not to the internal representation of numbers in a computer.

Examples

1. The expression ( (a+b)^2 - 2ab - b^2 ) / a^2 reduces to a^2 / a^2 = 1. But computed in a machine with limited precision it can give unexpected results:

[Fortran]
real(kind=8) :: a=0.00001_8, b=88888.0_8, c
c = ( (a+b)**2 - 2*a*b - b**2 ) / a**2
print *, c
end

[C++]
#include <iostream>
int main() {
  double a=0.00001, b=88888, c;
  c = ( (a+b)*(a+b) - 2*a*b - b*b ) / (a*a);
  std::cout << c << std::endl;
}

The result is 4.65661 in both cases! This is an extreme example of a calculation that is sensitive to round-off. Note that quad precision gives the correct result 1.000000000000017.

2. test for precision

The following programs (in Fortran and C++) implement the Forward-Difference Approximation algorithm using single precision.

[Fortran]
real :: h = 0.1
print *, "FDA = ", (f(3.0+h)-f(3.0))/h
contains
  real function f(x)
    real :: x
    f = 2*x**2 + 4*x + 6
  end function f
end

[C++]
#include <iostream>
float f(float x) { return 2*x*x + 4*x + 6; }
int main() {
  float h = 0.1;
  std::cout << "FDA = " << (f(3.0+h)-f(3.0))/h << std::endl;
}

Running the above programs for decreasing values of h reveals a decreasing truncation error (t.e.) but an increasing round-off error (r.e.). Remember that the correct result should be 16.

h=0.1       FDA = 16.199990   t.e. = 0.2,       r.e. = -0.000011444092
h=0.01      FDA = 16.019821   t.e. = 0.02,      r.e. = -0.00017929077
h=0.001     FDA = 16.002655   t.e. = 0.002,     r.e. = 0.0006542206
h=0.0001    FDA = 15.983582   t.e. = 0.0002,    r.e. = -0.016618729
h=0.00001   FDA = 16.021729   t.e. = 0.00002,   r.e. = 0.021709442
h=0.000001  FDA = 15.258789   t.e. = 0.000002,  r.e. = -0.74121284

The optimal value occurs when h=0.001, where both truncation and round-off errors are relatively small.

3. tests for overflow and underflow:

! test integer overflow
integer :: i, j=1e9
do i = 1, 5
  print *, i, j
  j = j * 2
end do
end

Result:
1   1000000000
2   2000000000
3   -294967296
4   -589934592
5   -1179869184

! test real overflow
integer :: i
real :: r=1.0E37
do i = 1, 5
  print *, i, r
  r = r * 10.
end do
end

Result:
1   1.E+37
2   1.E+38
3   +Inf
4   +Inf
5   +Inf

! test real underflow
integer :: i
real :: r=1.0E-42
do i = 1, 6
  print *, i, r
  r = r / 10.
end do
end

Result:
1   1.E-42
2   1.E-43
3   1.E-44
4   1.E-45
5   0.
6   0.

Some compilers provide options that give different behavior with respect to overflow and underflow. For example in the g95 compiler (www.g95.org) the following environment variables can be set: G95_FPU_OVERFLOW=1 and G95_FPU_UNDERFLOW=1

In this case the above two programs abort with a Floating point exception message instead of continuing with bogus values.

1.3 Lab Exercises

Task 1 - Truncation Errors

The first derivative (gradient) of a function can be approximated by the Forward Difference Approximation:

FDA = ( F(x+h) - F(x) ) / h

where h is small but not zero. In theory, the truncation error in this approximation is given by:

Error = (h/2) F''(x) + O(h^2)

Write a program that computes, using the FDA, at x=4.7, with h=0.01, the first derivative of the function:

F(x) = 3.4 + 18.7x - 1.6x^2

Hint: use double precision to avoid significant round-off errors confusing the results: for example real(kind=8) :: h, and h=0.01_8

Questions: Compare your result with the exact result determined by calculus. Compare the error in the result with the predicted truncation error. Do your comparisons make sense?

Task 2 - Precision (Round-off Errors)

1. What is the result of your FDA program with all the variables in single precision?

2. Write a Fortran program that declares variables A, B, C and D as type double precision real, and determine the result of the assignments:

A = 1.11111111111111_8
B = 1.11111111111111
C = 1.111111
D = 1.111111_8

Explain your findings.

3. What do you expect to be the output of the following program? Run the program to see if you are right, and explain your findings. Hint: Press [Ctrl][C] to break out of a program that does not terminate.

real :: a=0.0
do
  a = a + 0.1
  print *, a
  if ( a == 1.0 ) exit
end do
end

Task 3 - Series expansion

Write a program that computes e^x by the series expansion:

e^x = 1 + x + x^2/2! + x^3/3! + x^4/4! + ... + x^i/i! + ...

Terminate the expansion when a term is less than 0.000001. Check your results against the library function exp(x) or use your pocket calculator. Hint: Factorials can be problematic due to integer overflow; you can avoid factorials by observing that the (i+1)th term in the series is equal to the (i)th term times x/i.

1.4 Lab Solutions

Task 1 - Truncation Errors

The first derivative (gradient) of a function can be approximated by the Forward Difference Approximation:

FDA = ( F(x+h) - F(x) ) / h

where h is small but not zero. In theory, the truncation error in this approximation is given by:

Error = (h/2) F''(x) + O(h^2)

Write a program that computes, using the FDA, at x=4.7, with h=0.01, the first derivative of the function:

F(x) = 3.4 + 18.7x - 1.6x^2

Hint: use double precision to avoid significant round-off errors confusing the results: for example real(kind=8) :: h, and h=0.01_8

Questions: Compare your result with the exact result determined by calculus. Compare the error in the result with the predicted truncation error. Do your comparisons make sense?

Solution

Program: eee484ex1a (see the downloads page)

The output is:

fda        =  3.644000
true       =  3.660000
error_fda  = -0.016000
true error = -0.016000

Note that double precision variables are used, real(kind=8), otherwise the results will include significant round-off errors making the analysis less clear. From the output, we can see that the fda value is similar to the true value, but not exactly the same as it is an estimate only. According to theory the truncation error in this estimate is (h/2) F''(x) = (0.01/2)*(-3.2) = -0.016; this is the same as the true error = fda - true = 3.644 - 3.660 = -0.016. The result is: the expression for the truncation error is correct.

Task 2 - Precision (Round-off Errors)

1. What is the result of your FDA program with all the variables in single precision?

Solution

Simply replace kind=8 with kind=4, and underscore 8 with underscore 4 [or double with float] and rerun the program; the result is

fda        =  3.64423
true       =  3.66000
error_fda  = -0.01600
true error = -0.01577

The result for the fda is different in the fourth decimal place - as well as truncation error there is now an additional round-off error.

2. Write a program that declares variables A, B, C and D as type double precision real, and determine the result of the assignments:

A = 1.11111111111111_8
B = 1.11111111111111
C = 1.111111
D = 1.111111_8

Explain your findings.

Solution

A = 1.11111111111111_8 correctly assigns the value 1.11111111111111 to A.
B = 1.11111111111111 assigns 1.11111116409302 to B because the assignment is equivalent to 1.11111111111111_4 = 1.1111111, and so the last 7 digits contain garbage.
C = 1.111111 assigns 1.11111104488373 for the same reason as in B.
D = 1.111111_8 correctly assigns 1.11111100000000 to D.

3. What do you expect to be the output of the following program? Run the program to see if you are right, and explain your findings. Hint: Press [Ctrl][C] to break out of a program that does not terminate.

real :: a=0.0
do
  a = a + 0.1
  print *, a
  if ( a == 1.0 ) exit
end do
end

Solution

You might expect the program to output the numbers 0.1, 0.2, ..., 0.9, 1.0 and then terminate. But you might actually find that, due to round-off errors, A does not take exactly the value 1.0 and therefore the program fails the test (A==1.0) and continues to count without end (press [Ctrl][C] to stop the program). A fix for this would be to replace the equality == with >= (greater than or equal to); the loop might then end with A=1.0 or A=1.1.

Task 3 - Series expansion

Write a program that computes e^x by the series expansion:

e^x = 1 + x + x^2/2! + x^3/3! + x^4/4! + ... + x^i/i! + ...

Terminate the expansion when a term is less than 0.000001. Check your results against the library function exp(x) or use your pocket calculator. Hint: Factorials can be problematic due to integer overflow; you can avoid factorials by observing that the (i+1)th term in the series is equal to the (i)th term times x/i.

Solution

Program: eee484ex1b (see the downloads page)

1.5 Example exam questions

Question

a) Explain the term truncation error, and give an example. How can truncation errors be reduced?
b) Explain the term round-off error. How can round-off errors be reduced?
c) Explain the terms overflow and underflow. How can overflow and underflow be avoided?

2 Numerical Differentiation

2.1 Topics Covered

o Numerical Differentiation:
  - Forward Difference Approximation (first derivative): FDA
  - Central Difference Approximation (first derivative): CDA
  - Richardson Extrapolation (first derivative): REA
  - Central Difference Approximation (second derivative): CDA2
  - Richardson Extrapolation (second derivative): REA2

The student should be able to derive (or prove) the FDA, CDA, REA, CDA2 and REA2 (the formulae are given) from Taylor's Expansion, and show the form of the error in each approximation. The student should be able to use the formulae by hand, and implement them in a computer program. A basic understanding of the meaning of Truncation Error and Round-off Error is expected.

2.2 Lecture Notes

Introduction

When an analytical solution to the derivative of a given function is difficult or inconvenient, a numerical method can be used to provide an approximate solution. It is important, however, to understand the truncation and round-off errors involved in these numerical methods. We will look first at the most basic method for the first derivative of a function, the FDA, and then move on to higher-order methods, the CDA and REA. We will also look at approximations for the second derivative: the CDA2 and REA2.

The Forward-Difference Approximation (FDA)

The FDA method for the numerical differentiation of a function can be derived by considering differentiation from first principles, or alternatively by considering Taylor's Expansion.

1. Differentiation from first principles:

f'(x) =  limit   ( f(x+dx) - f(x) ) / dx
        dx -> 0

As a computer cannot divide by zero, the computed (finite) version of this expression is an approximation where dx is small but not zero; I denote this value by h. Now f'(x) is approximated by:

(f(x+h)-f(x))/h

This is the Forward-Difference Approximation for the numerical derivative of a function; it has the most basic form for a numerical derivative and is the least accurate:

+--------------------------------+
| FDA = ( f(x+h) - f(x) ) / h    |
+--------------------------------+

h is small but not zero

Example: Compute the first derivative of f(x) = 3x^3 + 2x^2 + x at x=3 and x=10 using the FDA with h = 0.01

FDA(3)  = ( f(3.01) - f(3) ) / 0.01   = 94.2903
by calculus f'(3)  = 94.0000  => error is 0.2903 (0.3%)

FDA(10) = ( f(10.01) - f(10) ) / 0.01 = 941.9203
by calculus f'(10) = 941.0000 => error is 0.9203 (0.1%)

2. Taylor's Expansion:

The FDA can also be obtained by rearranging the Taylor Expansion:

f(x+h) = f(x) + h f'(x)/1! + h^2 f''(x)/2! + h^3 f'''(x)/3! + ...

Rearrange for the FDA:

( f(x+h) - f(x) ) / h = f'(x) + h f''(x)/2 + h^2 f'''(x)/6 + ...

The left-hand side is the FDA, so

FDA = f'(x) + (h/2) f''(x) + O(h^2)

where f'(x) is the derivative and (h/2) f''(x) + O(h^2) is the truncation error in the FDA.

Consider the example of the numerical first derivative of f(x) = 3x^3 + 2x^2 + x, at x=3 and x=10, with h = 0.01. We obtained the results:

FDA(3)  = 94.2903  and error = 0.2903
FDA(10) = 941.9203 and error = 0.9203

We can check that the error is (h/2) f''(x): here f''(x) = 18x + 4, so error(x) = h(9x+2), giving error(3) = 0.2900 as above and error(10) = 0.9200 as above. The small difference between the results is due to the omission of the O(h^2) term in the expression for the error.

Summary:

o The first derivative of a function f(x) is approximated by FDA = ( f(x+h) - f(x) ) / h, where h is small but not zero.
o The error is approximately (h/2) f''(x), i.e. proportional to h - to minimise the error choose a small value of h.
o The error in the FDA is called a truncation error as it is due to the truncation of the higher-order terms in the Taylor expansion.

Note that h should not be too small as round-off errors in the machine arithmetic increase as h decreases (always use double precision variables! the kind=8 specifier in Fortran, and the double declaration in C/C++).

The Central-Difference Approximation (CDA)

The CDA gives an improved (higher-order) method:

+--------------------------------------+
| CDA = ( f(x+h) - f(x-h) ) / (2h)     |
+--------------------------------------+

h is small but not zero

It can be shown (see the lecture) from Taylor's Expansion that

CDA = f'(x) + (h^2/6) f'''(x) + O(h^4)

where f'(x) is the derivative and (h^2/6) f'''(x) + O(h^4) is the truncation error in the CDA. The CDA is a higher-order method than the FDA as it gives an error which is proportional to h^2 (the error is therefore much smaller).
Also, the error is proportional to the third derivative, f'''(x), which may further reduce the error with respect to the FDA error, which has an f''(x) dependence.

Richardson Extrapolation Approximation (REA)

A higher-order method is given by the Richardson Extrapolation Approximation:

+-------------------------------------------------+
| REA = (f(x-2h)-8f(x-h)+8f(x+h)-f(x+2h))/(12h)   |
+-------------------------------------------------+

h is small but not zero

It can be shown from Taylor's expansion that

REA = f'(x) - (h^4/30) f^(5)(x) + O(h^6)

where f'(x) is the 1st derivative and -(h^4/30) f^(5)(x) + O(h^6) is the truncation error in the REA. The truncation error is proportional to the fifth derivative and to h^4.

The results below compare the performance of the above three methods.

f(x) = 3x^3 + 2x^2 + x , first derivative at x=3 and x=10 , h = 0.01

+------------------------------------+--------------------------------------+
| FDA(3)=94.290300 (error=0.290300)  | FDA(10)=941.920300 (error=0.920300)  |
| CDA(3)=94.000300 (error=0.000300)  | CDA(10)=941.000300 (error=0.000300)  |
| REA(3)=94.000000 (error=0.000000)  | REA(10)=941.000000 (error=0.000000)  |
+------------------------------------+--------------------------------------+

The results illustrate that the CDA can give reasonably accurate results and so is worth implementing as a simple method. The REA in this case is exact as the truncation error is proportional to the fifth derivative, which is zero here.

Implementation

Implementation of the above methods is simple. The program requires a definition of f(x) and the two inputs h and x.

Algorithm

! Program to compute the first derivative of a function f(x).
! The FDA, CDA and REA methods are implemented for comparison.

input "input the value of x ", x
input "input the value of h ", h

fda = (f(x+h)-f(x))/h
cda = (f(x+h)-f(x-h))/(2*h)
rea = (f(x-2*h)-8*f(x-h)+8*f(x+h)-f(x+2*h))/(12*h)

output "FDA = ", fda
output "CDA = ", cda
output "REA = ", rea

function definition f(x) = 3x^3 + 2x^2 + x

Note: you should use double precision variables to avoid large round-off errors.

The Central-Difference Approximation for a Second Derivative (CDA2)

The second derivative of a function f(x) can be approximated by:

+----------------------------------------------+
|  CDA2 = ( f(x-h) - 2f(x) + f(x+h) ) / h^2    |
+----------------------------------------------+
h is small but not zero

It can be shown from Taylor's expansion that

CDA2 = f''(x) + (h^2/12) f^(4)(x) + O(h^4)
          |                  |
  the 2nd derivative   the truncation error in the CDA2

The Richardson Extrapolation Approximation for a Second Derivative (REA2)

The second derivative of a function f(x) can be approximated by:

+----------------------------------------------------------------+
|  REA2 = (-f(x-2h)+16f(x-h)-30f(x)+16f(x+h)-f(x+2h)) / (12h^2)  |
+----------------------------------------------------------------+
h is small but not zero

It can be shown from Taylor's expansion that

REA2 = f''(x) - (h^4/90) f^(6)(x) + O(h^6)
          |                  |
  the 2nd derivative   the truncation error in the REA2

Summary of Methods

FDA  = (f(x+h)-f(x))/h                                      = f'(x)  + (h/2) f''(x)      + ....
CDA  = (f(x+h)-f(x-h))/(2h)                                 = f'(x)  + (h^2/6) f'''(x)   + ....
REA  = (f(x-2h)-8f(x-h)+8f(x+h)-f(x+2h))/(12h)              = f'(x)  - (h^4/30) f^(5)(x) + ....
CDA2 = (f(x-h)-2f(x)+f(x+h))/h^2                            = f''(x) + (h^2/12) f^(4)(x) + ....
REA2 = (-f(x-2h)+16f(x-h)-30f(x)+16f(x+h)-f(x+2h))/(12h^2)  = f''(x) - (h^4/90) f^(6)(x) + ....

Errors - the truncation error and the round-off error

The approximation methods FDA, CDA, and REA can be used to demonstrate the effect of truncation errors and round-off errors. The error, for example the (h^2/6).f'''(x) term inherent to the CDA method, is an example of a truncation error; i.e. by truncating higher-order terms in the Taylor expansion the method becomes only approximate. Another source of error exists when the FDA, CDA or REA are computed; this is the round-off error due to limited precision in numerical arithmetic (numerical values are stored in the computer with a limited number of binary bits). Round-off errors are compounded in arithmetic operations. The total error is therefore a combination of the two error sources:

Total Error = Truncation Error + Round-off Error


The important parameter here is the value of h: the truncation error increases with increasing h, while the round-off error decreases with increasing h. Given a particular method, for example the CDA, the most accurate computed derivative is obtained by minimising the total error; this corresponds to finding the optimal value of h. This optimal value will differ depending on:

1. The numerical method (FDA, CDA, REA, etc).
2. The function being differentiated, and the value of x.
3. The precision of the arithmetic (single-, double-, quad-precision).

To arrive at the optimal value some study of the output of your program is needed. The total error in the CDA is given by:

|--------------------------|
|  Error = CDA(x) - f'(x)  |   where f'(x) is the
|--------------------------|   unknown first derivative.

log(|Error|)
 |
 | \                   /
 |  \                 /
 |   \               /
 |    \ _           /
 |________________________________
   -10    -8    -6    -4    -2      log(h)

A plot of |Error| against h will have the form indicated qualitatively in the figure. The rise on the right (as h increases) is due to the truncation error, which has the form of h^2; the rise on the left (as h decreases) is due to round-off errors.

A minimum error exists at some intermediate value of h, corresponding to a minimum in the plot. If f'(x) is unknown we can only plot CDA versus h; but, as f'(x) is a constant, the plot will have the same shape (only shifted up or down). In this case again a minimum (or stationary) value in the plot will be observed, corresponding to a minimum error. The situation is less clear when the round-off error has the opposite sign to the truncation error; the CDA may vary erratically about the true value, though again a relatively stable stationary value in a plot of CDA versus h corresponds to a solution close to a minimum error. This procedure of outputting the CDA with different values of h and then interpreting the results is an example of step 8 in the section Errors and Problem Solving. The lab exercise will require you to perform such an analysis.


2.3

Lab Exercises

We will estimate the derivative of the function f(x) = -1/x at x=3 and check the result against the exact result f'(x) = 1/x^2, which gives f'(3) = 1/3^2 = 0.111 111 111 111..., i.e. Error = Estimate - 0.111 111 111 111.

Task 1 Compare the accuracy of FDA(3), CDA(3) and REA(3). For this use h = 0.01, and double precision data (kind=8 or double).

Task 2 For CDA(3) investigate the effect of varying h; try h = 10^-1, 10^-2, 10^-3, ...., 10^-12. Use double precision data (kind=8 or double). Which value of h gives the most accurate estimate?

Task 3 Repeat task 2 replacing CDA with REA. Which value of h gives the most accurate estimate?

Task 4 Repeat task 2 with single-, double-, and quad-precision. Comment on the results.

Additional Tasks Investigate the CDA2 and REA2 expressions for finding the second derivative of a function.


2.4

Lab Solutions

We will estimate the derivative of the function f(x) = -1/x at x=3 and check the result against the exact result f'(x) = 1/x^2, which gives f'(3) = 1/3^2 = 0.111 111 111 111..., i.e. Error = Estimate - 0.111 111 111 111.

Task 1 Compare the accuracy of FDA(3), CDA(3) and REA(3). For this use h = 0.01, and double precision data (kind=8 or double).

Solution Program: eee484ex2a (see the downloads page).

FDA = 0.110741971207   Err = -0.000369139904
CDA = 0.111112345693   Err =  0.000001234582
REA = 0.111111111056   Err = -0.000000000055
Tru = 0.111111111111

For the same value of h and x, the accuracy increases as the order of the method increases. Remember that the truncation errors for the FDA, CDA, and REA are proportional to h, h^2, and h^4 respectively.

Task 2 For CDA(3) investigate the effect of varying h; try h = 10^-1, 10^-2, 10^-3, ...., 10^-12. Use double precision data (kind=8 or double). Which value of h gives the most accurate estimate?

Solution Program: eee484ex2b (see the downloads page).

h               CDA(3)           Error=CDA(3)-Tru
0.100000000000  0.111234705228    0.000123594117
0.010000000000  0.111112345693    0.000001234582
0.001000000000  0.111111123457    0.000000012346
0.000100000000  0.111111111235    0.000000000124
0.000010000000  0.111111111113    0.000000000002
0.000001000000  0.111111111123    0.000000000012
0.000000100000  0.111111110900   -0.000000000211
0.000000010000  0.111111110929   -0.000000000182
0.000000001000  0.111111129574    0.000000018463
0.000000000100  0.111111212868    0.000000101757
0.000000000010  0.111112045942    0.000000934831
0.000000000001  0.111108659166   -0.000002451946

As h decreases in size the truncation error is seen to decrease in the expected form; i.e. the error is proportional to h^2, so each decrease in h by a factor of 10 decreases the error by a factor of 10^2. The minimum error is obtained with h=0.00001, after which round-off error dominates. The round-off error increases as h decreases in size. Remember that the total error is the sum of the truncation error and the round-off error. Further studies reveal that the optimal value of h depends on the function and the value of x.

Task 3 Repeat task 2 replacing CDA with REA. Which value of h gives the most accurate estimate?

Solution Program: eee484ex2c (see the downloads page).

h               REA(3)           Error=REA(3)-Tru
0.100000000000  0.111110559352   -0.000000551759
0.010000000000  0.111111111056   -0.000000000055
0.001000000000  0.111111111111   -0.000000000000
0.000100000000  0.111111111112    0.000000000000
0.000010000000  0.111111111113    0.000000000002
0.000001000000  0.111111111116    0.000000000005
0.000000100000  0.111111110725   -0.000000000386
0.000000010000  0.111111113438    0.000000002327
0.000000001000  0.111111107981   -0.000000003130
0.000000000100  0.111110996954   -0.000000114157
0.000000000010  0.111109886799   -0.000001224312
0.000000000001  0.111104541456   -0.000006569655

Replacing CDA with REA results in much smaller truncation errors. The round-off errors, however, are similar to those generated in the CDA method. Consequently, the optimal value of h occurs earlier, at about h = 0.001.

Task 4 Repeat task 2 with single, double, and quad precision. Comment on the results.

Solution Program: eee484ex2d (see the downloads page).

Errors in CDA(3), Error = CDA(3) - Tru:

h               Error (kind=4, float)   Error (kind=8, double)   Error (kind=16, long double)
0.100000000000       0.000123433769          0.000123594117       0.0001235941169200346063527
0.010000000000       0.000000916421          0.000001234582       0.0000012345816188081102136
0.001000000000      -0.000009402633          0.000000012346       0.0000000123456803840879439
0.000100000000      -0.000079058111          0.000000000124       0.0000000001234567902606310
0.000010000000       0.000647775829          0.000000000002       0.0000000000012345679012483
0.000001000000      -0.003491610289          0.000000000012       0.0000000000000123456790123
0.000000100000      -0.160781651735         -0.000000000211       0.0000000000000001234567901
0.000000010000      -0.607816457748         -0.000000000182       0.0000000000000000012345679
0.000000001000      -5.078164577484          0.000000018463       0.0000000000000000000123457
0.000000000100      -49.78163909912          0.000000101757       0.0000000000000000000001235
0.000000000010      -496.8164062500          0.000000934831       0.0000000000000000000000010
0.000000000001      -4967.164062500         -0.000002451946       0.0000000000000000000000106

Single precision (kind=4 or float) is clearly not appropriate for numerical differentiation; the round-off error dominates early and so the optimal value of h is large, resulting in poor accuracy. Double precision (kind=8 or double) performs well, but one should be careful not to choose a value of h that is very small, as this will result in significant round-off errors. Quad precision (kind=16 or long double) again dramatically reduces round-off errors, the errors becoming significant only at h = 10^-12 for this case. The use of quad-precision, however, is not common as double precision is often sufficient and quad-precision arithmetic takes significantly longer to compute on 32-bit platforms [64 bit becoming common? - this statement is outdated?].

Conclusion The FDA method gives poor results and should not be used. The CDA method gives reasonable results if you require a simple (easy to remember) method and do not require a very high precision. The REA gives the best result of the three methods and should be used in applications where precision is important. In this kind of numerical work it is advisable to use double precision data to avoid large round-off errors. Although quad precision is sometimes available (depending on the platform and compiler), it is not (yet) a commonly used precision. The choice of the value of h can be important; the optimal value will depend on the function, where it is being evaluated, and the method used; one should not choose an arbitrary value.

Additional Tasks Investigate the CDA2 and REA2 expressions for finding the second derivative of a function.

Solution This is left to the student. Feel free to discuss your results with your teacher.


2.5

Example exam questions

Question
a) Write a computer program to evaluate the first derivative of a function f(x) using the Central Difference Approximation method:
   CDA = ( f(x+h) - f(x-h) ) / 2h
b) Using Taylor's expansion show that the truncation error in this approximation is given by:
   error = (h^2/6).f'''(x) + O(h^4)
c) Theoretically, how can the error in the CDA be minimised? In practice, what other type of error exists in this method?
d) i.  Using the CDA with h=0.1, evaluate the first derivative of f(x) = x^4 at x=3.2
   ii. Using calculus, determine the value for the error in your result and show that it equals (h^2/6).f'''(x)

Question
a) Write a computer program to evaluate the first derivative of a function f(x) using the Richardson Extrapolation Approximation method:
   REA = ( f(x-2h) - 8f(x-h) + 8f(x+h) - f(x+2h) ) / 12h
b) Using Taylor's expansion show that the truncation error in this approximation is given by:
   error = -(h^4/30).f^(5)(x) + O(h^6)
c) Theoretically, how can the error in the REA be minimised? In practice, what other type of error exists in this method?
d) i.  Using the REA with h=0.1, evaluate the first derivative of f(x) = x^6 at x=3.2
   ii. Using calculus, determine the value for the error in your result and show that it equals -(h^4/30).f^(5)(x)

Question
a) Write a computer program to evaluate the second derivative of a function f(x) using the Central Difference Approximation method:
   CDA2 = ( f(x-h) - 2f(x) + f(x+h) ) / h^2
b) Using Taylor's expansion show that the truncation error in this approximation is given by:
   error = (h^2/12).f^(4)(x) + O(h^4)
c) Theoretically, how can the error in the CDA2 be minimised? In practice, what other type of error exists in this method?
d) i.  Using the CDA2 with h=0.1, evaluate the second derivative of f(x) = x^5 at x=3.2
   ii. Using calculus, determine the value for the error in your result and show that it equals (h^2/12).f^(4)(x)


Question
a) Write a computer program to evaluate the second derivative of a function f(x) using the Richardson Extrapolation Approximation method:
   REA2 = (-f(x-2h)+16f(x-h)-30f(x)+16f(x+h)-f(x+2h))/(12h^2)
b) Using Taylor's expansion show that the truncation error in this approximation is given by:
   error = -(h^4/90).f^(6)(x) + O(h^6)
c) Theoretically, how can the error in the REA2 be minimised? In practice, what other type of error exists in this method?
d) i.  Using the REA2 with h=0.1, evaluate the second derivative of f(x) = x^7 at x=1.5
   ii. Using calculus, determine the value for the error in your result and show that it equals -(h^4/90).f^(6)(x)


3
3.1

Roots, Maxima, Minima (closed methods)


Topics Covered

gnuplot> plot [0:10] exp(-x)*(x**3-6*x**2+8*x)
http://www1.gantep.edu.tr/~andrew/eee484/images/extrema-test-function.gif

o The sequential search method for finding roots; the student should remember the method, and be able to derive an expression for the number of iterations required to obtain a given accuracy.
o The bisection method for finding roots; the student should remember the method, and be able to derive an expression for the number of iterations required to obtain a given accuracy.
http://www1.gantep.edu.tr/~andrew/eee484/images/bisection_method.png
o The sequential search method for maxima and minima; the student should remember the method, and be able to derive an expression for the number of iterations required to obtain a given accuracy.
http://www1.gantep.edu.tr/~andrew/eee484/images/extrema_example.png

3.2

Lecture Notes

Introduction Numerical methods for finding roots and extrema (maxima and minima) of functions are used when analytical solutions are difficult (or impossible), or when a calculation is part of a larger numerical algorithm. We will study a number of basic numerical methods, starting from very simple (and inefficient) sequential searches up to very powerful Newton's methods. The algorithms are divided into two groups: closed methods [this week] (where the solution is initially bracketed), and open methods [next week] (where the solution is not bracketed).

Root finding (closed methods)

Definition: The root, x0, of a function f(x) is such that f(x0) = 0. For example if f(x) = x^3 - 28, then the root is x0 = 28^(1/3) = 3.0365889718756625194208095785...

In general we will not find an exact solution, especially given that roots tend to be irrational. Our strategy will be to define how accurate we want the solution to be and then compute the result approximately to this accuracy. This is called a tolerance, for example: Tolerance = 0.0001; this means that the root is required to be correct within plus or minus 0.0001 (four decimal place accuracy). For this to work we also need to be able to determine an error estimate with which the tolerance is compared; the algorithm terminates when

error estimate < tolerance

The sequential search method (closed method)

In the sequential search first the position of the root is estimated such that a bracket a,b can be formed placing a lower- and upper-bound on the root. For this some initial analysis of the function is required. Note that if a single root (or odd number of roots) is bracketed by a and b then there will be a sign change between f(a) and f(b). During this search (scan) of the function we can identify a root as follows:

Search the function in the range a <= x <= b in steps of dx until we see a sign change. An estimate of the root can then be given as the centre of the last inspected step, with a maximum error of dx/2 (and a mean error of about dx/4).

                                sign change
 i=0 i=1 i=2 i=3 i=4 ...            /              ...           i=n
-----|---|---|---|---|---|---|---o---|---|---|---|---|---|---|---- x
 a    dx                        root                             b

if the sign change occurs between x and x+dx then
root estimate = x + dx/2, maximum error = dx/2

Algorithm 3a Sequential search method for finding the root of f(x). All roots between a and b are found.

input a, b, tolerance
dx = 2*tolerance
n = nint((b-a)/dx)
do i = 1, n
  x = a + i*dx
  if ( f(x)*f(x+dx) < 0 ) output "root = ", x+dx/2
end do
function definition f = x**3-28

Results for a=3.0, b=3.1 and different values of tolerance:

root = 3.03           error = -0.66E-2   tolerance = 1.E-2   n = 5
root = 3.037          error =  0.41E-3   tolerance = 1.E-3   n = 50
root = 3.0365         error = -0.89E-4   tolerance = 1.E-4   n = 500
root = 3.03659        error =  0.10E-5   tolerance = 1.E-5   n = 5000
root = 3.036589       error =  0.28E-6   tolerance = 1.E-6   n = 50000
root = 3.0365889      error = -0.72E-7   tolerance = 1.E-7   n = 500000
root = 3.03658897     error = -0.19E-8   tolerance = 1.E-8   n = 5000000
root = 3.036588971    error = -0.88E-9   tolerance = 1.E-9   n = 50000000  (in 0.25 seconds!)

We obtain a high accuracy (tolerance = 1e-9) in less than one second (50 million steps). However, if the initial bracket were a=0, b=10 then we would need 5 billion steps, taking about 30 seconds. We can see that the error is proportional to 1/n and the run-time is proportional to 1/tolerance. We can do this much more efficiently using the following bisection method.

The Bisection method (closed method)

In the Bisection method first the position of the root is estimated such that a bracket can be formed placing a lower- and upper-bound on the root. For this some initial analysis of the function is required. A first estimate of the root is then computed as the mid-point between the two bounds

  LowerBound                    UpperBound
------x------------o----------------x------
                MidPoint

MidPoint = ( LowerBound + UpperBound ) / 2

Consider the function F(x) = x^3 - 28; the root lies somewhere between x=3.0 and x=3.1. This can be shown by evaluating the function at these two values: F(3.0) = 27 - 28 = -1 and F(3.1) = 29.791 - 28 = +1.791; the function changes sign, implying that the root is bracketed between x=3.0 and x=3.1. The first estimate of the root is then MidPoint = (3.0+3.1)/2 = 3.05. We can improve on this estimate by determining which side of MidPoint the root lies and then moving the bracket accordingly and re-evaluating MidPoint:

if F(LowerBound) . F(MidPoint) is negative then
  the root is to the left of MidPoint => move UpperBound to MidPoint
else
  the root is to the right of MidPoint => move LowerBound to MidPoint

  LowerBound                    UpperBound
-----x-------------o----------------x-----
                MidPoint
  -ve             +ve              +ve
       <- root is this way

MidPoint is recalculated and the procedure iterated until HalfBracket is less than Tolerance, where HalfBracket = (UpperBound-LowerBound)/2 is the maximum possible error in our estimate. Each iteration halves (bisects) the bracket (and therefore halves the maximum possible error), hence the term Bisection. The following algorithm represents a Bisection search for the root of F(x) = x^3 - 28.

Algorithm 3b

input lb, ub, tolerance
do
  hb = (ub-lb)/2                ! the error estimate
  mp = (ub+lb)/2                ! the new root estimate
  output mp, hb
  if ( hb < tolerance ) exit    ! terminate if tolerance is satisfied
  if ( f(lb)*f(mp) < 0 ) ub=mp else lb=mp
end do
function definition f = x**3-28

For inputs 3.0, 3.1, 0.001 the result of the algorithm is:

MidPoint   HalfBracket
3.0500     0.0500
3.0250     0.0250
3.0375     0.0125
3.0313     0.0062
3.0344     0.0031
3.0359     0.0016
3.0367     0.0008

The algorithm terminates after six iterations when the value of HalfBracket (the error estimate) is smaller than the value of Tolerance; i.e. 0.0008 is less than 0.001. The final value of MidPoint (the root estimate) for this tolerance is 3.037. A trace of values is shown below:

iteration   LowerB.   MidPoint   UpperB.   F(MidPoi.)   F(L)*F(M)   HalfBracket
    0       3.0000    3.0500     3.1000     0.3726        -ve         0.0500
    1       3.0000    3.0250     3.0500    -0.3194        +ve         0.0250
    2       3.0250    3.0375     3.0500     0.0252        -ve         0.0125
    3       3.0250    3.0313     3.0375    -0.1474        +ve         0.0062
    4       3.0313    3.0344     3.0375    -0.0612        +ve         0.0031
    5       3.0344    3.0359     3.0375    -0.0180        +ve         0.0016
    6       3.0359   *3.0367*    3.0375     0.0036        -ve         0.0008

Notice that each iteration halves the size of the search region, hence the term Bisection. The table below gives the number of iterations required to satisfy a given tolerance (the values in brackets are explained later).

Tolerance   MidPoint(root)   HalfBracket    true error      iterations
10^-1       3.0500000000     0.0500000000   +0.0134110281    0 (-1.0)
10^-2       3.0312500000     0.0062500000   -0.0053389719    3 ( 2.3)
10^-3       3.0367187500     0.0007812500   +0.0001297781    6 ( 5.6)
10^-4       3.0366210937     0.0000976562   +0.0000321219    9 ( 9.0)
10^-5       3.0365905762     0.0000061035   +0.0000016043   13 (12.3)
10^-6       3.0365882874     0.0000007629   -0.0000006845   16 (15.6)
10^-7       3.0365889549     0.0000000954   -0.0000000170   19 (18.9)
10^-8       3.0365889728     0.0000000060   +0.0000000009   23 (22.3)
10^-9       3.0365889720     0.0000000007   +0.0000000002   26 (25.6)

As expected, a greater number of iterations is required to achieve a greater accuracy; for this method the convergence is exponential (3 or 4 iterations increase the accuracy by a factor of ten). The value of HalfBracket is the largest possible error in the calculated root. This is illustrated in the table by comparing this value with the true error = MidPoint - 28^(1/3); the true error is similar to, but always smaller than, the value of HalfBracket. An expression for the relationship between the error and the number of iterations can be derived as follows: Given an initial error e_i, after one iteration the error is e_i/2 and after n iterations the error is e_i/2^n = e_f (the final error); taking logs and rearranging for n we have: n = log(e_i / e_f) / log(2). This is the number of iterations required to achieve an accuracy of e_f given an initial accuracy of e_i. In the above example the initial accuracy is (3.1-3.0)/2 = 0.05, and the final accuracy e_f must be less than

the tolerance. The expression becomes: n = log(0.05/Tolerance) / log(2). The results from this expression are shown in the brackets in the above table. The number of iterations performed by the algorithm is the same as that indicated by the above expression (rounded up to the nearest integer). We see that the error is proportional to 2^-n and the run-time is proportional to log(1/tolerance). The Bisection method is similar to the way we search for a word in a dictionary. The upper and lower bounds are the first and last page respectively, and we open the book at the centre page. The word lies either to the left or the right of the current page; if it is to the right then we turn to the page half way through the book to the right (bisecting the pages to the right). We continue the search in the appropriate direction, converging exponentially towards the required page. In this way a page can be found in a 1000-page dictionary in only n = log(500/1) / log(2) = 9 bisections (the tolerance here is 1 page, the initial HalfBracket is 1000/2 = 500 pages). Try it for yourself.

Maxima and Minima [extremum] (closed methods)

[See the figure given in the lecture (URL)]

For some functions we can use differential calculus to find extrema; we know that a minimum occurs when f'(x)=0 and f''(x)>0, and a maximum when f'(x)=0 and f''(x)<0. For example f(x) = x^2 - 8x + 19 and so f'(x) = 2x - 8 = 0, and so an extremum occurs at x = 4. And f''(x) = 2 (+ve), so this is a minimum. Also, by inspection f(x) = x^2 - 8x + 19 = (x-4)^2 + 3 and so f(4) is a minimum. However, often it may be difficult, or impossible, to treat a function analytically, and we must use a numerical method for finding extrema. Also, we must be careful not to mistake local extrema for global extrema.
[See the figure given in the lecture (URL)] We can attempt to avoid making this mistake by inspecting the function graphically (or equivalently performing a sequential search), or by re-running our algorithm a number of times with a broad variety of different inputs. For investigating methods for finding extrema, our test function is: f(x) = e^-x (x^3 - 6x^2 + 8x), and we are interested in x >= 0.
http://www1.gantep.edu.tr/~andrew/eee484/images/extrema-test-function.gif

Sequential Search (closed method)

If we plot our test function, say in the range 0 < x < 10, then we are performing a sequential search. During this search (scan) of the function we can identify extrema as follows: Search the function in the range a <= x <= b in steps of dx; with x as the current position, the following conditions are tested:

if [f(x) > f(x-dx) and f(x) > f(x+dx)] then f(x) is a local or global maximum
if [f(x) < f(x-dx) and f(x) < f(x+dx)] then f(x) is a local or global minimum

[See the figure given in the lecture (URL)]

Here dx defines the tolerance, and the number of points inspected is nint[(b-a)/dx] + 1; i.e. we loop over i = 0 to n and define x_i = a + i dx.

Algorithm 3c Sequential search method for finding minima and maxima of f(x). All global and local minima and maxima between a and b are found.


input a, b
input dx
n = nint((b-a)/dx)
do i=0,n
  x = a + i*dx
  if [f(x) < f(x-dx) and f(x) < f(x+dx)] output "minima ", f(x), "at x = ", x
  if [f(x) > f(x-dx) and f(x) > f(x+dx)] output "maxima ", f(x), "at x = ", x
end do
end
define f(x) = exp(-x) (x^3 - 6x^2 + 8x)

With a=0, b=10 and dx = 10^-6 the output of this algorithm is:

maxima  1.592547 at x = 0.510 711   (actual error is 0.4x10^-6)
minima -0.165150 at x = 2.710 831   (actual error is 0.4x10^-6)
maxima  0.120121 at x = 5.778 457   (actual error is 0.1x10^-6)

Note that the results are given to 6 decimal places as this is the limit of the accuracy defined by dx = 10^-6. To achieve this accuracy the algorithm needs to inspect 10,000,001 points; this takes about 5 seconds on my 2.4 GHz CPU, which may be considered much too slow for general purposes (to increase the accuracy to 10^-9 the algorithm would take 5000 seconds!). We can see that the error is proportional to 1/n and the run-time is proportional to 1/tolerance. A more efficient method is the Golden Section search; this however is much more complex to derive and implement. We will look at another powerful (but simple) extremum finder in the next lecture (open methods).


3.3

Lab Exercises

Task 1 Implement algorithms 3a, 3b, 3c given in the lecture into computer programs. Check the outputs of your programs against the solutions given in the lecture.

Task 2 Using the bisection method, evaluate to at least 6 decimal place accuracy the root of the following function: f(x) = x^2 + log_e(x) - 3.73

gnuplot> plot [1:2] x**2 + log(x) - 3.73
http://www1.gantep.edu.tr/~andrew/eee484/images/lab3-fig1.gif

For each method write down:
- The evaluated root (to the appropriate number of decimal places).
- The estimated error; explain how you arrive at your value.
- The number of iterations performed.
- The theoretically expected number of iterations for the required accuracy.

Task 3 Using any of your computer programs, find all extrema and roots of the function f(x) = e^x - 3x^2 (for 0 < x < 4) to 6 decimal place accuracy.

gnuplot> plot [0:4] exp(x) - 3*x**2
http://www1.gantep.edu.tr/~andrew/eee484/images/lab3-fig2.gif

If you have time, experiment with some more functions.


3.4

Lab Solutions

Task 1 Implement algorithms 3a, 3b, 3c given in the lecture into computer programs. Check the outputs of your programs against the solutions given in the lecture.

Solutions See eee484ex3a, eee484ex3b, eee484ex3c in the course downloads page.

Task 2 Using the bisection method, evaluate to at least 6 decimal place accuracy the root of the following function: f(x) = x^2 + log_e(x) - 3.73

gnuplot> plot [1:2] x**2 + log(x) - 3.73
http://www1.gantep.edu.tr/~andrew/eee484/images/lab3-fig1.gif

For each method write down:
- The evaluated root (to the appropriate number of decimal places).
- The estimated error; explain how you arrive at your value.
- The number of iterations performed.
- The theoretically expected number of iterations for the required accuracy.

Solutions With an initial approximate analysis, the root is determined to be between 1.0 and 2.0, i.e. F(1.0) = -2.73 and F(2.0) = +0.96.

Program eee484ex3b (see the downloads page)

MidPoint       HalfBracket
1.500 000 00   0.500 000 00   - initial estimate
1.750 000 00   0.250 000 00   - iteration 1
1.875 000 00   0.125 000 00
1.812 500 00   0.062 500 00
1.781 250 00   0.031 250 00
1.765 625 00   0.015 625 00
1.773 437 50   0.007 812 50
1.777 343 75   0.003 906 25
1.775 390 62   0.001 953 12
1.776 367 19   0.000 976 56
1.775 878 91   0.000 488 28
1.776 123 05   0.000 244 14
1.776 245 12   0.000 122 07
1.776 306 15   0.000 061 04
1.776 336 67   0.000 030 52
1.776 351 93   0.000 015 26
1.776 359 56   0.000 007 63
1.776 355 74   0.000 003 81
1.776 353 84   0.000 001 91
1.776 354 79   0.000 000 95   - iteration 19

The program terminates after 19 iterations because HalfBracket (the error estimate) is less than 0.000 001

(the tolerance). The result for the root is the final value of MidPoint = 1.776 355 (6dp accuracy). The theoretical number of iterations required is log(e_i/e_f)/log(2) = log(0.5/0.000001)/log(2) = 18.9, which rounds up to 19 (as above).

Task 3 Using any of your computer programs, find all extrema and roots of the function f(x) = e^x - 3x^2 (for 0 < x < 4) to 6 decimal place accuracy.

gnuplot> plot [0:4] exp(x) - 3*x**2
http://www1.gantep.edu.tr/~andrew/eee484/images/lab3-fig2.gif

Solutions The plot indicates that there is a maximum at about 0.2, a root at about 1.0, a minimum at about 2.8 and a second root at about 3.7. We will use eee484ex3a and eee484ex3c to perform a sequential search in the range 0,4 with a tolerance of 0.000 001. Results:

eee484ex3a
root = 0.910007
root = 3.733079

check with eee484ex3b with two brackets
{0.90, 0.92} root = 0.910008
{3.73, 3.74} root = 3.733079

eee484ex3c
maxima at x = 0.204481
minima at x = 2.833148


3.5

Example exam questions

Question 1 (Bisection Method)
a) Show that, for the bisection root-finding method, the number of iterations, n, required to reduce the error from an initial value of e_i to a final value of e_f is given by:
   n = log( e_i / e_f ) / log(2)
b) Given that a root of the function f(x) = e^x - 3x^2 is near 1.0, estimate the number of iterations required to achieve an accuracy of at least 6 decimal places.

Question 2 (Sequential search: root)
Explain how a sequential search can be performed to find roots of a function f(x). Include in your answer an explanation of the relationship between the number of function evaluations and the accuracy of the solution.

Question 3 (Sequential search: extremum)
Explain how a sequential search can be performed to find maxima and minima of a function f(x). Include in your answer an explanation of the relationship between the number of function evaluations and the accuracy of the solution.


4
4.1

Roots, Maxima, Minima (open methods)


Topics Covered

gnuplot> plot [0:10] exp(-x)*(x**3-6*x**2+8*x)
http://www1.gantep.edu.tr/~andrew/eee484/images/extrema-test-function.gif

o The Newton-Raphson root-finding method; the student should be able to derive the iterative formula, write a computer program implementing it, and use the formula to find the root of a function by hand (using a pocket calculator).
o Newton's Square-root; the student should be able to derive the Newton's Square-root iterative formula from the Newton-Raphson iterative formula, write a computer program, and use the method to calculate by hand (using a pocket calculator) the square-root of a positive number.
o The Secant root-finding method.
o Newton's method and the modified Newton's method for finding extrema.

4.2

Lecture Notes

Introduction Continuing from last week, we now look at open methods (not requiring the solution to be bracketed) for finding roots and extrema of functions.

The Newton-Raphson root finding method (open method)

The Newton-Raphson method for finding the root of a function f(x) uses information about the first derivative f'(x) to estimate how far (and in which direction) the root lies from the current position.

Theory: Let x be an approximation to the root x0; the error e is defined as: e = x - x0, and so we can write the root as x0 = x - e. Taylor's expansion gives:

f(x0) = f(x-e) = f(x) - e f'(x)/1! + e^2 f''(x)/2! - e^3 f'''(x)/3! + ....

Ignoring powers of e^2 and higher we arrive at an approximation to the root:

f(x0) = f(x) - e f'(x) (approximately).

f(x0) = 0 and so we can write: 0 = f(x) - e f'(x) and so e = f(x) / f'(x) (approximately); i.e. we have an estimate of the error in the approximation x. We can now correct the root estimate x for this error and arrive at a value closer to the root: x0 = x - e and so x0 = x - f(x) / f'(x) (approximately). This is the Newton-Raphson improved estimate of the root x0 given an initial estimate x. This improved estimate is still not exact as we have not included all the terms in the Taylor expansion (it has a truncation error), but by iterating the procedure we can repeat the improvements; the iteration is illustrated as follows:

x_(i+1) = x_i - f(x_i) / f'(x_i)

where x_i is the current estimate and x_(i+1) is the next, improved, estimate. This is the Newton-Raphson iterative formula for the root of a function; the method can be represented by the following algorithm.

31

Algorithm 4a
input x                                 ! input the initial root estimate
input Tolerance                         ! input the tolerance (required accuracy)
do
  Error = f(x) / f'(x)                  ! the error estimate
  output x, Error                       ! output current estimates
  if ( |Error| < Tolerance ) terminate  ! terminate if tolerance is satisfied
  x = x - Error                         ! subtract the error estimate
end do
define f(x)  = x^3 - 28
define f'(x) = 3 x^2

Notes:
- The algorithm is very simple!
- It terminates when a tolerance is satisfied.
- No bracket is required, only an initial estimate of the root.
- The algorithm requires the first derivative of the function.
- The error estimate can be negative so the absolute value is compared to the tolerance.
- If f'(x) is close to zero (i.e. near a turning point in the function f(x)) then the estimate of the error, f(x) / f'(x), can be very large, launching the solution far away from the root. The algorithm might crash with an overflow error or take a long time to recover.

Convergence for Newton-Raphson is very rapid. The error in the error estimate is proportional to the square of the error; this vanishes quickly for an error << 1: on each iteration the number of correct significant figures doubles. This is demonstrated by executing the above algorithm for f(x) = x^3 - 28, f'(x) = 3x^2, Tolerance = 10^-12, and an initial estimate of x = 3.0:

iteration   Root estimate (x)        Error estimate
0           3.000000000000000000     -0.037 037 037 037 037 037
1           3.037037037037037037      0.000 447 999 059 936 399
2           3.036589037977100638      0.000 000 066 101 436 680
3           3.036588971875663958      0.000 000 000 000 001 439
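As a cross-check, Algorithm 4a can be sketched in Python (a hypothetical rendering with illustrative names; the course's own programs are written in Fortran):

```python
# Sketch of Algorithm 4a: Newton-Raphson iteration x <- x - f(x)/f'(x).
def newton_raphson(f, fp, x, tol=1e-12, max_iter=50):
    for i in range(max_iter):
        error = f(x) / fp(x)          # the error estimate e = f(x)/f'(x)
        if abs(error) < tol:
            return x, i               # converged root and iteration count
        x = x - error                 # subtract the error estimate
    raise RuntimeError("did not converge")

f  = lambda x: x**3 - 28              # f(x)  = x^3 - 28
fp = lambda x: 3 * x**2               # f'(x) = 3x^2
root, its = newton_raphson(f, fp, 3.0)
print(root)                           # close to 28**(1/3) = 3.036588971...
```

Starting from x = 3.0 with a tolerance of 10^-12 this converges in 3 iterations, matching the table above.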

The program converges in just 3 iterations giving an accuracy of the order of 10^-15.

Newton's Square-Root
A special case of the Newton-Raphson method can be written for the square-root of a positive number p: let x = p^(1/2); then the root of the function f(x) = x^2 - p gives the square-root of p. In the Newton-Raphson iterative formula x[i+1] = x[i] - f(x[i]) / f'(x[i]), the first derivative is simply 2x and so the formula can be written as:

x[i+1] = x[i] - (x[i]^2 - p) / 2x[i]   or   x[i+1] = x[i] - (x[i] - p/x[i])/2

Algorithm 4b
input p                                 ! input the number we want the sqrt of
input Tolerance                         ! input the tolerance (required accuracy)
x = p                                   ! initial sqrt estimate
do
  Error = (x - p/x)/2                   ! the error estimate
  output x, Error                       ! output current estimates
  if ( |Error| < Tolerance ) terminate  ! terminate if tolerance is satisfied
  x = x - Error                         ! subtract the error estimate
end do

32

Notes:
1. The estimate x is initially set to p, though this could be p/2 (but not zero).
2. We need a tolerance to provide a termination condition.
3. The functions f(x) = x^2 - p and f'(x) = 2x do not need to be defined; they are absorbed directly into the error expression.

Example
For p=2 and a tolerance of 10^-9 the output of the algorithm is:

x                    Error estimate
2.000 000 000 000    0.500 000 000 000
1.500 000 000 000    0.083 333 333 333
1.416 666 666 667    0.002 450 980 392
1.414 215 686 275    0.000 002 123 900
1.414 213 562 375    0.000 000 000 002
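Algorithm 4b translates directly into a few lines of Python (a sketch with illustrative names, not the course's Fortran program eee484ex4c):

```python
# Sketch of Algorithm 4b: Newton's square root of a positive number p.
def newton_sqrt(p, tol=1e-9):
    x = p                              # initial estimate (any positive value works)
    while True:
        error = (x - p / x) / 2.0      # the error estimate
        if abs(error) < tol:
            return x - error           # subtract the final error estimate
        x = x - error                  # subtract the error estimate and repeat

print(newton_sqrt(2.0))                # 1.414213562...
```

Subtracting the final error estimate on return gives the extra accuracy noted in the text (1.414213562373 rather than just 1.414213562 for p=2).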

The square root of 2 can therefore be written as 1.414213562 (9 dp) or, subtracting the final error estimate, as 1.414213562373 (12 dp).

The Secant Method
The main disadvantage of the Newton-Raphson method is that it requires a knowledge of the first derivative. If the first derivative is not known, or is inconvenient to implement, then it can be approximated numerically by the iterative form of the Forward Difference Approximation; this leads to the Secant Method: we have the Newton-Raphson iterative formula

x[i+1] = x[i] - f(x[i]) / f'(x[i])

and replacing f'(x[i]) with ( f(x[i]) - f(x[i-1]) ) / ( x[i] - x[i-1] ) we have

x[i+1] = x[i] - f(x[i]) ( x[i] - x[i-1] ) / ( f(x[i]) - f(x[i-1]) )

where x[i-1] is the previous estimate, x[i] is the current estimate, and x[i+1] is the next, improved, estimate. This is the Secant iterative formula for the root of a function; the method can be represented by the following algorithm.

Algorithm 4c
input x0                                  ! input the lower bracket (previous estimate)
input x1                                  ! input the upper bracket (current estimate)
input Tolerance                           ! input the tolerance (required accuracy)
do
  Error = f(x1) * (x1-x0) / (f(x1)-f(x0)) ! the error estimate
  output x1, Error                        ! output the current values
  if ( |Error| < Tolerance ) terminate    ! terminate if the tolerance is satisfied
  x0 = x1                                 ! reassign the previous estimate
  x1 = x1 - Error                         ! subtract the error estimate
end do
define f(x) = x^3 - 28

Notes:
- The algorithm is similar to Newton-Raphson, but does not require a knowledge of the first derivative.
- As with the Bisection method a lower and upper bracket is required, but this bracket does not necessarily need to contain the root.
- Unlike the Bisection method, convergence is not guaranteed.

Convergence for the Secant method is very rapid, almost as fast as Newton-Raphson. This is demonstrated by executing the above algorithm for f(x) = x^3 - 28, Tolerance = 10^-12, and initial bracket x0=3.0 and x1=3.1:

33

iteration   Root estimate (x)        Error estimate
0           3.100000000000000000      0.064 170 548 190 612 684
1           3.035829451809387316     -0.000 743 875 474 020 715
2           3.036573327283408032     -0.000 015 648 505 989 373
3           3.036588975789397405      0.000 000 003 913 755 049
4           3.036588971875642356     -0.000 000 000 000 020 164

The program converges in 4 iterations giving an accuracy of the order of 10^-14.

Conclusion
Below is a comparison of the Bisection, Newton-Raphson, and Secant methods for finding the root of f(x) = x^3 - 28 with a tolerance of 10^-12.

method           root estimate          error estimate   true error   number of iterations
Bisection        3.036 588 971 876 336  728E-15          673E-15      36
Secant           3.036 588 971 875 642   20E-15           20E-15       4
Newton-Raphson   3.036 588 971 875 664    1E-15            1E-15       3
True root        3.036 588 971 875 663
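The Secant entry in this comparison can be checked with a short Python sketch of Algorithm 4c (an illustrative rendering; the course programs are Fortran):

```python
# Sketch of Algorithm 4c: the Secant method, applied to f(x) = x^3 - 28.
def secant(f, x0, x1, tol=1e-12, max_iter=100):
    for i in range(max_iter):
        error = f(x1) * (x1 - x0) / (f(x1) - f(x0))  # the error estimate
        if abs(error) < tol:
            return x1, i               # converged root and iteration count
        x0, x1 = x1, x1 - error        # shift the previous estimate, correct the current one
    raise RuntimeError("did not converge")

f = lambda x: x**3 - 28
root, its = secant(f, 3.0, 3.1)
print(root, its)
```

With the bracket x0=3.0, x1=3.1 and a tolerance of 10^-12 this terminates at iteration 4, in agreement with the comparison table.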

The Newton-Raphson method has the fastest convergence with the Secant method a close second. An additional advantage of the Newton-Raphson method over the Bisection and Secant methods is that it does not require upper and lower bounds as inputs. However, a disadvantage is that it requires a knowledge of the first derivative (which is not always available). Also, the Newton-Raphson method can fail at or close to turning points in the function (why?). The Bisection method guarantees convergence whereas both the Newton-Raphson and Secant methods can fail to converge on a root.

More than one root?
Functions may contain more than one root; the algorithms discussed above will only find one root at a time and so the user will need to guide the root-finder to find the other roots. This involves giving different brackets (Bisection and Secant cases) or initial values of x (Newton-Raphson case) until all expected roots are found.

Hybrid algorithms
By combining the two methods a hybrid algorithm can be constructed which contains the rapid convergence of the Newton-Raphson or Secant method with the robustness of the Bisection method. Try this for yourself and think about the advantages of your hybrid program.

Extrema (open methods)
Last week we studied the Sequential Search method for finding extrema (minima and maxima) of a function; this is a closed method, i.e. it requires the extremum to be bracketed. This week we continue the study using open methods.

Newton's method (open method)
This method is very closely related to the Newton-Raphson root finding method. It provides very fast convergence. However, it is an open method and so involves some uncertainty about which extremum is found. In the Newton-Raphson method we have a target root xo = x - e, where x is a root estimate and e is the error. An estimate of e is obtained by truncating Taylor's expansion: f(x-e) = f(x) - e f'(x) = 0 (condition for a root) and so e = f(x)/f'(x).
The estimate x is then improved iteratively with x[i+1] = x[i] - f(x[i])/f'(x[i]). Similarly, differentiating the truncated Taylor expansion we have f'(x-e) = f'(x) - e f''(x) = 0 (condition for an extremum) and so e = f'(x)/f''(x). The estimate x is then improved iteratively with x[i+1] = x[i] - f'(x[i])/f''(x[i]). This is Newton's iterative formula for finding extrema of f(x).

34

Clearly we need the first and second derivatives of f(x); for example, for our test function:

f(x)   = e^-x (x^3 - 6x^2 + 8x)         and so [using d/dx(u.v) = u.dv/dx + v.du/dx]
f'(x)  = e^-x (-x^3 + 9x^2 - 20x + 8)
f''(x) = e^-x (x^3 - 12x^2 + 38x - 28)

Algorithm 4d
Newton's method for finding accurately and quickly one minimum or maximum of f(x).
input x, tol                  ! initial estimate and required accuracy
do
  e = f'(x) / f''(x)          ! the error estimate
  output x, e                 ! output the current values
  if ( |e| < tol ) exit       ! terminate if tolerance is satisfied
  x = x - e                   ! subtract the error estimate
end do
if [f''(x) < 0] output "maxima ", f(x), "at x = ", x
if [f''(x) > 0] output "minima ", f(x), "at x = ", x
end
define f(x)   = exp(-x) (x^3 - 6x^2 + 8x)
define f'(x)  = exp(-x) (-x^3 + 9x^2 - 20x + 8)
define f''(x) = exp(-x) (x^3 - 12x^2 + 38x - 28)

With tol=10^-9 the output of this algorithm is:

[x=0.5] maxima  1.592547 at x = 0.510 711 428 189 916
[x=2.7] minima -0.165150 at x = 2.710 831 453 551 690
[x=5.7] maxima  0.120121 at x = 5.778 457 118 251 383
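Algorithm 4d can be sketched in Python as follows (an illustrative rendering using the lecture's test function; the course programs are Fortran):

```python
from math import exp

# Sketch of Algorithm 4d: Newton's method for an extremum of
# f(x) = e^(-x)(x^3 - 6x^2 + 8x), iterating x <- x - f'(x)/f''(x).
f  = lambda x: exp(-x) * (x**3 - 6*x**2 + 8*x)
f1 = lambda x: exp(-x) * (-x**3 + 9*x**2 - 20*x + 8)   # f'(x)
f2 = lambda x: exp(-x) * (x**3 - 12*x**2 + 38*x - 28)  # f''(x)

def newton_extremum(x, tol=1e-9):
    while True:
        e = f1(x) / f2(x)              # the error estimate e = f'(x)/f''(x)
        if abs(e) < tol:
            break
        x = x - e                      # subtract the error estimate
    kind = "maxima" if f2(x) < 0 else "minima"  # classify via the second derivative
    return x, kind

print(newton_extremum(0.5))            # maximum near x = 0.5107
```

Starting from x = 0.5, 2.7 and 5.7 reproduces the three extrema listed above.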

The solutions converge in just 4 iterations with an error << 10^-9.

Newton's modified method (an open method)
The requirement that we need to know the first and second derivative is a major disadvantage of Newton's method. However, as with the Secant root finding method, the derivatives can be replaced with numerical approximations. The CDA approximation for the first derivative requires two x values, x0 and x1. The CDA2 approximation of the second derivative requires three values; in this case the third value m is taken as the mean of the first two, i.e. m = (x0 + x1)/2.

[See the figure given in the lecture (URL)]

We can write

CDA  = ( f(x+dx) - f(x-dx) ) / (2 dx)        = ( f(x1) - f(x0) ) / (x1 - x0)
CDA2 = ( f(x-dx) + f(x+dx) - 2 f(x) ) / dx^2 = 4 ( f(x0) + f(x1) - 2 f(m) ) / (x1 - x0)^2

The error estimate in Algorithm 4d can be rewritten with the above approximations as:

        ( x1 - x0 )( f(x1) - f(x0) )
e = -------------------------------
      4 ( f(x0) + f(x1) - 2 f(m) )

Applying m = m - e improves the estimate, and then, after the reassignments x0 = m - e and x1 = m + e, the procedure is iterated, causing m to converge quickly on an extremum.

35

    |<- e ->|<- e ->|
----+-------+-------+---> x
    x0      m       x1

Algorithm 4e
Newton's modified method for finding accurately and quickly one minimum or maximum of f(x).
input x0, x1                  ! initial estimates bracketing the extremum
input tol                     ! required accuracy
do
  m = (x0+x1)/2
  e = (x1-x0)*(f(x1)-f(x0))/(f(x0)+f(x1)-2*f(m))/4
  output f(m), m, e           ! output the current values
  if ( |e| < tol ) exit       ! terminate if tolerance is satisfied
  m = m - e                   ! improve the extremum estimate
  x0 = m - e                  ! modify x0
  x1 = m + e                  ! and x1
end do
define f(x) = exp(-x) (x^3 - 6x^2 + 8x)

You can define f''(x) to determine whether the solution is a maximum or a minimum. With tol=10^-9 the output of this algorithm is:

[x0=0.4 x1=0.6] f(m) =  1.592547   x = 0.510 711 428 296   [actual error 0.1x10^-9]
[x0=2.7 x1=2.8] f(m) = -0.165150   x = 2.710 831 454 653   [actual error 1.1x10^-9]
[x0=5.7 x1=5.8] f(m) =  0.120121   x = 5.778 457 122 806   [actual error 4.6x10^-9]

The solutions converge in about 7 iterations (slightly slower than Newton's method) with an error of the order of 10^-9. The modified Newton's method avoids the need for derivatives but at the expense of less accuracy.
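The derivative-free iteration of Algorithm 4e can be sketched in Python as follows (an illustrative rendering; names and structure are assumptions, not the course's Fortran program):

```python
from math import exp

# Sketch of Algorithm 4e: modified Newton's method, with the derivatives
# replaced by difference approximations built from x0, x1 and midpoint m.
f = lambda x: exp(-x) * (x**3 - 6*x**2 + 8*x)

def modified_newton(x0, x1, tol=1e-9, max_iter=100):
    for _ in range(max_iter):
        m = (x0 + x1) / 2.0
        e = (x1 - x0) * (f(x1) - f(x0)) / (f(x0) + f(x1) - 2*f(m)) / 4.0
        if abs(e) < tol:
            return m                   # converged extremum estimate
        m = m - e                      # improve the extremum estimate
        x0, x1 = m - e, m + e          # re-centre the bracket on the new m
    raise RuntimeError("did not converge")

print(modified_newton(0.4, 0.6))       # close to 0.510711428
```

With the inputs 0.4 and 0.6 this converges on the maximum near x = 0.5107, as in the lecture output.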

36

4.3

Lab Exercises

Task 1
Implement the Newton-Raphson and Secant root-finding algorithms given in the lecture into computer programs. Using your computer programs, evaluate to at least 6 decimal place accuracy the root of the following function:

f(x) = x^2 + log_e(x) - 3.73

gnuplot> plot [1:2] x**2 + log(x) - 3.73
http://www1.gantep.edu.tr/~andrew/eee484/images/lab4-fig1.gif

For each method write down:
- The evaluated root (to the appropriate number of decimal places).
- The estimated error; explain how you arrive at your value.
- The number of iterations performed.
- The theoretically expected number of iterations for the required accuracy.

Task 2
Implement the Newton's Square-Root algorithm given in the lecture into a Fortran program. Use your program to evaluate, to at least 9 decimal place accuracy, the square root of 2, 3, 4, ..., 10. Check the results against your pocket calculator.

Task 3
Implement Newton's method and the modified Newton's method for finding extrema of a function given in the lecture into computer programs. Using your computer programs, find all extrema of the function f(x) = e^x - 3x^2 (for 0 < x < 4) to 6 decimal place accuracy.

gnuplot> plot [0:4] exp(x) - 3*x**2
http://www1.gantep.edu.tr/~andrew/eee484/images/lab4-fig2.gif

If you have time, experiment with some more functions.

37

4.4

Lab Solutions

Task 1
Implement the Newton-Raphson and Secant algorithms given in the lecture into computer programs. Use your programs to evaluate, to at least 6 decimal place accuracy, the root of the following function:

f(x) = x^2 + log_e(x) - 3.73

gnuplot> plot [1:2] x**2 + log(x) - 3.73
http://www1.gantep.edu.tr/~andrew/eee484/images/lab4-fig1.gif

For each method write down:
- The evaluated root (to the appropriate number of decimal places).
- The estimated error; explain how you arrive at your value.
- The number of iterations performed.
- The theoretically expected number of iterations for the required accuracy.

Solutions
With an initial approximate analysis, the root is determined to be between 1.0 and 2.0, i.e. f(1.0) = -2.73 and f(2.0) = +0.96.

Newton-Raphson Method
Program eee484ex4a (see the downloads page)
Here the root estimate is the current value of x and the error estimate is f(x)/f'(x).

x              Error estimate
1.500 000 000  -0.293 054 971   - initial estimate
1.793 054 971   0.016 643 345   - iteration 1
1.776 411 626   0.000 056 771   - iteration 2
1.776 354 855   0.000 000 001   - iteration 3

With an initial root estimate of 1.5 and a tolerance of 10^-6 the program terminates after 3 iterations; the final root estimate is 1.776355 (6 dp accuracy). Double precision (kind=8) is used to avoid round-off errors. The theoretical number of iterations is estimated as follows: for the Newton-Raphson method the number of correct significant figures is said to double on each iteration; we start with one correct s.f. and want 7 correct s.f.; 2^2.8 = 7, which implies about 3 iterations (as above).

Secant Method
Program eee484ex4b (see the downloads page)
Here we provide an initial bracket x0=1, x1=2; the initial root estimate is x1=2.

x2             Error estimate
2.000 000 000   0.260 793 067   - initial estimate
1.739 206 933  -0.035 492 822   - iteration 1
1.774 699 754  -0.001 667 737   - iteration 2
1.776 367 491   0.000 012 641   - iteration 3
1.776 354 850  -0.000 000 004   - iteration 4

With an initial root estimate of 2.0 and a tolerance of 10^-6 the program terminates after 4 iterations, the

38

final root estimate is 1.776355 (6 dp accuracy). Double precision (kind=8) is used to avoid round-off errors. Theoretically, the number of required iterations is similar to that from the Newton-Raphson method, plus one or two more iterations.

Discussion
The Newton-Raphson method converges much more rapidly than the bisection method and tends to overshoot the required tolerance, giving a much higher accuracy than requested. Also, this method does not require an initial bracket; only an initial estimate of the root is necessary. However, the bisection method does not need a knowledge of the first derivative - this is an advantage when the first derivative is difficult to derive. The Secant method also does not require the first derivative and is much faster than the Bisection method.

Task 2
Implement the Newton's Square-Root algorithm given in the lecture into a Fortran program. Use your program to evaluate, to at least 9 decimal place accuracy, the square root of 2, 3, 4, ..., 10. Check the results against your pocket calculator.

Solution
Newton's Square-Root Program eee484ex4c (see the downloads page)
With a tolerance of 10^-9 the results are summarised below:

P      Estimate       True square-root
2.0    1.414213562    1.414213562
3.0    1.732050808    1.732050808
4.0    2.000000000    2.000000000
5.0    2.236067977    2.236067977
6.0    2.449489743    2.449489743
7.0    2.645751311    2.645751311
8.0    2.828427125    2.828427125
9.0    3.000000000    3.000000000
10.0   3.162277660    3.162277660

All final estimates are accurate when compared to the true root values. This is not surprising as the computations are based on the powerful Newton-Raphson algorithm. To obtain a 9 decimal place accuracy, 11 significant figures are required; single precision is not sufficient for this and so double precision (kind=8) is employed.

Task 3
Implement Newton's method and the modified Newton's method for finding extrema of a function given in the lecture into computer programs. Using your computer programs, find all extrema of the function f(x) = e^x - 3x^2 (for 0 < x < 4) to 6 decimal place accuracy.

gnuplot> plot [0:4] exp(x) - 3*x**2
http://www1.gantep.edu.tr/~andrew/eee484/images/lab4-fig2.gif

Solutions
The plot indicates that there is a maximum at about 0.2 and a minimum at about 2.8. We will use eee484ex4d and eee484ex4e with a tolerance of 0.000 001. Results:

39

eee484ex4d (Newton's method)
input 0.2: maxima at x = 0.204481 (in 2 iterations)
input 2.8: minima at x = 2.833148 (in 3 iterations)

eee484ex4e (modified Newton's method)
inputs 0.2,0.3: maxima at x = 0.204481 (in 3 iterations)
inputs 2.8,2.9: minima at x = 2.833148 (in 3 iterations)

40

4.5

Example exam questions

Question 1 (Newton-Raphson Method)
a) Using Taylor's expansion derive the following Newton-Raphson iterative formula for finding the root of a function f(x):
   x[i+1] = x[i] - f(x[i]) / f'(x[i])
b) Write a computer program to implement the Newton-Raphson method for the evaluation of the root of f(x) = e^x - 3x^2. Your program should include a tolerance as an input.
c) Using the Newton-Raphson method evaluate the root of the function f(x) = e^x - 3x^2, which is near 1.0, to an accuracy of at least 6 decimal places. Show the result of each iteration.

Question 3 (Secant Method)
a) Using Taylor's expansion derive the following Secant iterative formula for finding the root of a function f(x):
   x[i+1] = x[i] - f(x[i]) ( x[i] - x[i-1] ) / ( f(x[i]) - f(x[i-1]) )
b) Write a computer program to implement the Secant method for the evaluation of the root of f(x) = e^x - 3x^2. Your program should include a tolerance as an input.
c) Using the Secant method evaluate the root of the function f(x) = e^x - 3x^2, which is near 1.0, to an accuracy of at least 6 decimal places. Show the result of each iteration.

Question 5 (Newton's Square-root)
a) Using the Newton-Raphson iterative formula for the root of a function f(x):
   x[i+1] = x[i] - f(x[i]) / f'(x[i])
show that the iterative formula:
   x[i+1] = x[i] - (x[i] - p/x[i])/2
converges to the square-root of p.
b) Write a computer program to implement the above formula. Your program should include a tolerance as an input.
c) Using the above formula evaluate the square-root of 45.6 to an accuracy of at least 6 decimal places. Show the result of each iteration.
d) Generalise Newton's Square-root to compute the nth root of a number p.

Question 6
a) Using Taylor's expansion, derive Newton's iterative formula for finding the extremum of a function f(x):
   x[i+1] = x[i] - f'(x[i]) / f''(x[i])

41

b) Write a computer program to implement Newton's method. Your program should include a tolerance as an input.
c) Using Newton's iterative formula find the maximum of the function f(x) = e^x - 3x^2, which is near 0.2, to an accuracy of at least 6 decimal places. Show the result of each iteration.

42

5
5.1

Numerical Integration: Trapezoidal and Simpsons formulae


Topics Covered

o Numerical Integration: the Extended Trapezoidal Formula (ETF) and the Extended Simpson's Formula (ESF). The student should remember the formulae for the ETF and ESF, be able to compute results by hand, and implement the formulae in computer programs. The student should understand the significance of the 1/n^2 and 1/n^4 terms in the truncation error.

5.2

Lecture Notes

Introduction
In this lecture we investigate numerical integration using the Newton-Cotes formulas (also called the Newton-Cotes rules). These are a group of formulas for numerical integration based on evaluating the integrand at n+1 equally-spaced points. They are named after Isaac Newton and Roger Cotes. From the Newton-Cotes group of rules the two simplest will be studied: the Trapezoid rule and Simpson's rule. The aim is to perform, numerically, the integral of a function F(x) over the limits a to b. The basic idea is to evaluate the function at equally spaced locations between the limits; summing the values in an appropriate manner will give an approximation to the integral. The approach taken here is to take simple (and therefore less accurate) formulae and implement them in an intelligent way to form basic but practical integration algorithms.

The Trapezoidal Rule (the building block)
Consider integrating a known function F(x) over the interval x = a to b. The Trapezoidal Rule gives the following expression for the exact integral I:

I = (h/2) ( F(a) + F(b) ) - (h^3/12) F''(z) + higher order terms

where h is the interval b-a, and F''(z) is the second derivative of the function evaluated at some unknown point z between a and b. You can find the derivation of this rule elsewhere. Rearranging the Trapezoidal Rule gives:

(h/2) ( F(a) + F(b) ) = I + (h^3/12) F''(z) + higher order terms

The expression to the left of the equality can be calculated numerically; it approximates the true integral with a truncation error O(h^3). The value of F''(z) is generally not known (z is unknown though bounded between a and b) and so this term is omitted from the solution when we perform the numerical calculation.

The Extended Trapezoidal Rule (ETF)
We will now extend the Trapezoidal Rule to increase the accuracy of the numerical integral. The expression on the left-hand side gives our numerical approximation for the integral I (which is what we want to know). This expression is simply the area under the straight line between the points [a,F(a)] and [b,F(b)]. Note that two function evaluations are performed. Of course we do not expect this straight line to give an exact representation of the curve F(x) (unless the curve is a straight line; then the function has the form F(x) = m x + c, and so F''(x) = 0 and so the LHS is exact). The right hand side is the exact integral I plus the unknown term which represents the truncation error in the approximation. To increase the accuracy of the numerical integration we can divide the single interval up into n intervals and perform n Trapezoidal Rules (n+1 function evaluations). For example for n = 5 intervals:

43

  (h/2) ( F(x0) + F(x1) ) = I1 + (h^3/12) F''(z1)
+ (h/2) ( F(x1) + F(x2) ) = I2 + (h^3/12) F''(z2)
+ (h/2) ( F(x2) + F(x3) ) = I3 + (h^3/12) F''(z3)
+ (h/2) ( F(x3) + F(x4) ) = I4 + (h^3/12) F''(z4)
+ (h/2) ( F(x4) + F(x5) ) = I5 + (h^3/12) F''(z5)   + higher order terms

where h = (b-a)/n = (b-a)/5
x0 = a, x1 = a+h, x2 = a+2h, x3 = a+3h, x4 = a+4h, x5 = a+5h = b
(i.e. xi = a + i*h , for i = 0 to 5)

The expression reduces to:

h ( F(x0)/2 + F(x1) + F(x2) + F(x3) + F(x4) + F(x5)/2 ) = I + 5 (h^3/12) F''(z)

where I = I1 + I2 + I3 + I4 + I5 is the exact integral over the full range, and F''(z) = F''(z1) + F''(z2) + F''(z3) + F''(z4) + F''(z5) is unknown and represents the error in the numerical integration. For n intervals the expression becomes:

+-------------------------------------------------------------------+
| ETF = h ( F(x0)/2 + F(x1) + F(x2) + .... + F(x[n-1]) + F(xn)/2 )  |
|                                                                   |
|     = I + n.(h^3/12).F'' + higher order terms                     |
|                                                                   |
| where h = (b-a)/n , xi = a + i*h , for i = 0 to n                 |
+-------------------------------------------------------------------+
Extended (Compound) Trapezoidal Formula (ETF) for n intervals.

It is useful to rearrange the formula as follows:

ETF = (h/2) ( F(x0) + F(xn) ) + h.( F(x1) + F(x2) + ... + F(x[n-1]) )

where x0 = a and xn = b. Replacing h with (b-a)/n in the right hand side gives:

n (h^3/12) F'' = (b-a)^3 F''/(12 n^2)

giving for n intervals the numerical integral

+-------------------------------------------------------+
| ETF = h ( F(a) + F(b) ) / 2                           |
|     + h ( F(x1) + F(x2) + ... + F(x[n-1]) )           |
|                                                       |
|     = I + (b-a)^3 F''/(12 n^2) + higher order terms   |
|                                                       |
| where h=(b-a)/n , xi=a+i*h , for i = 1 to n-1         |
+-------------------------------------------------------+
Extended (Compound) Trapezoidal Formula for n intervals

Inspecting the truncation error term we can now see that the accuracy of the numerical integral can be increased by increasing the number of intervals n; the truncation error in the ETF is inversely proportional

44

to the square of the number of intervals, i.e. doubling the number of intervals gives four times the accuracy. Note that the value of (b-a) is constant (the region of integration). Also, if the second derivative of F(x) is small then the error is small; if F'' is zero then the formula is exact, i.e. for the form F(x) = m x + c the error is zero (as expected).

Example
Using the Extended Trapezoidal Formula (ETF) integrate the function F(x) = x^3 - 3x^2 + 5 over the range x = 0.0 to 2.5 using 5 intervals.

Solution: First we can see that the second derivative = 6x - 6 is not zero and so we expect the ETF to contain a non-zero truncation error.

h = (b-a)/n = (2.5-0.0)/5 = 0.5
xi = 0.0 + i*0.5

n=5     i    x     F(x)
-----------------------
x0=a    0    0.0   5.000
x1      1    0.5   4.375
x2      2    1.0   3.000
x3      3    1.5   1.625
x4      4    2.0   1.000
x5=b    5    2.5   1.875

ETF = 0.25*( 5.000 + 1.875 ) + 0.5*( 4.375 + 3.000 + 1.625 + 1.000 ) = 6.718750.

Compare with the analytical result 6.640625; we see the error E5 = 0.078125. Repeating the integral over 10 intervals (n=10) gives ETF = 6.660156; the error is E10 = 0.019531 and so E5/E10 = 4.0000512, as predicted by the form of the error term in the Trapezoidal formula. Note that the 1/n^2 relation is not exact due to higher order terms in the truncation error.

Implementation
First consider a basic implementation. The inputs to the numerical integration are:
1. The function, F(x), to be integrated
2. The limits of the integration, a and b
3. The number of intervals of the integration, n.

Algorithm 5a
input a, b                    ! input the lower and upper limits
input n                       ! input the number of intervals
h = (b-a) / n                 ! the interval size
etf = ( f(a) + f(b) ) / 2     ! sum the end points
do i = 1 to n-1
  x = a + i*h                 ! calculate the evaluation position
  etf = etf + f(x)            ! sum over the remaining points
end do
etf = etf*h                   ! complete the ETF
print "The integral = ", etf  ! output the result
end
define the function f(x) = x^3 - 3*x^2 + 5

45
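Algorithm 5a can be sketched in Python as follows (an illustrative rendering; the course programs themselves are Fortran):

```python
# Sketch of Algorithm 5a: the Extended Trapezoidal Formula (ETF).
def etf(f, a, b, n):
    h = (b - a) / n                    # the interval size
    total = (f(a) + f(b)) / 2.0        # the two end points carry weight 1/2
    for i in range(1, n):
        total += f(a + i * h)          # interior points carry weight 1
    return total * h                   # complete the ETF

f = lambda x: x**3 - 3*x**2 + 5
print(etf(f, 0.0, 2.5, 5))             # 6.71875 (analytic value 6.640625)
```

Increasing n from 5 to 1000 reduces the error roughly as 1/n^2, as in the output table below.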

Note: use double precision variables to avoid round-off errors. This algorithm gives the following outputs for the indicated inputs (the true error is given in brackets):

a=0.0, b=2.5, n=   5   integral = 6.718750 (0.08)
a=0.0, b=2.5, n=  10   integral = 6.660156 (0.02)
a=0.0, b=2.5, n= 100   integral = 6.640820 (0.0002)
a=0.0, b=2.5, n=1000   integral = 6.640627 (0.000002)

As expected, the error in the approximation reduces as the square of the number of intervals, n. Here the error is calculated by comparing the numerical result with the analytical evaluation; of course the analytical result might not be available in practice.

Tolerance and Error Estimation
A desirable property of a numerical integration algorithm is that the accuracy of the result is determined by the user, i.e. a tolerance is an input to the algorithm. In our algorithm (above):

replace:  input n          ! input the number of intervals
with:     input tolerance  ! the required accuracy of the result

The algorithm then has to decide what value of n corresponds to the required accuracy (tolerance). For this the algorithm needs to form an estimate E of the error in the ETF and terminate when abs(E) is less than Tolerance. We use abs(E) because E can be negative.

An error estimate
An error estimate can be formulated by making use of the fact that the error has a 1/n^2 form. Consider n intervals giving an approximation ETF_n and error E_n. Now run the algorithm again with 2n intervals. The result is ETF_2n with an error E_2n = E_n/4 (to a good approximation). The difference between the two results is (ETF_n - ETF_2n) = (E_n - E_2n) = (4E_2n - E_2n) = 3 E_2n, and so we can write E_2n = (ETF_n - ETF_2n)/3. We therefore have an estimate of the error in the final result ETF_2n. This error estimate can be used as a termination condition in the algorithm: repeat the ETF, doubling the number of intervals, until abs(E_2n) is less than the value of Tolerance. Also we can use the error estimate to improve our final approximation: ETF_improved = ETF_2n - E_2n (this is applied after the termination condition). The estimated error is subtracted from the final approximation. This works well as long as the assumption that E_2n = E_n/4 is accurate.

Example using the above data:
E_10 = (ETF_5 - ETF_10)/3 = (6.718750 - 6.660156)/3 = 0.019531
ETF_improved = ETF_10 - E_10 = 6.660156 - 0.019531 = 6.640625 (exact to 6dp).

The modification of Algorithm 5a for the input of a tolerance is left to the student (see lab exercise).

Simpson's Rule (a higher-order method)
The second method in the Newton-Cotes group is Simpson's Rule, which provides a higher-order method:

I = (h/3) ( F(a) + 4 F((a+b)/2) + F(b) ) - (h^5/90) F''''(z) + higher order terms

Rearranging Simpson's Rule gives:

(h/3) ( F(a) + 4 F((a+b)/2) + F(b) ) = I + (h^5/90) F''''(z) + higher order terms

The left hand side can be calculated; there are three function evaluations (i.e. n=2). Extending Simpson's Rule for n intervals (n must be even) gives:

46

[n=2]     (h/3) ( F(x0) + 4 F(x1) + F(x2) ) = I + (h^5/90) F''''(z) + higher order terms

[n=4]     (h/3) ( F(x0) + 4 F(x1) + F(x2) )
        + (h/3) ( F(x2) + 4 F(x3) + F(x4) )
        = (h/3) ( F(x0) + 4 F(x1) + 2 F(x2) + 4 F(x3) + F(x4) ) = I + 2 (h^5/90) F''''(z)

[n even]  (h/3) ( F(x0)     + 4 F(x1)     + F(x2) +
                  F(x2)     + 4 F(x3)     + F(x4) +
                  F(x4)     + 4 F(x5)     + F(x6) +
                  .
                  .
                  F(x[n-2]) + 4 F(x[n-1]) + F(x[n]) ) = I + (n/2)(h^5/90) F''''(z)

and combining terms gives:

+-------------------------------------------------------------+
| ESF = (h/3) ( F(x0) + 4 F(x1) + 2 F(x2) + 4 F(x3) + ....    |
|             + 2 F(x[n-2]) + 4 F(x[n-1]) + F(x[n]) )         |
|                                                             |
|     = I + (n/2)(h^5/90) F'''' + higher order terms          |
|                                                             |
| where h = (b-a)/n , xi = a + i*h , for i = 0 to n           |
+-------------------------------------------------------------+
Extended Simpson's Formula (ESF) for n (even) intervals.

Replacing h with (b-a)/n,

(n/2)(h^5/90) F'''' = (b-a)^5 F''''/(180 n^4)

and forming summation series, we have

+-------------------------------------------------------------+
| ESF = (h/3) ( F(x0) + F(x[n]) )                             |
|     + 4 (h/3) ( F(x1) + F(x3) + .... + F(x[n-1]) )          |
|     + 2 (h/3) ( F(x2) + F(x4) + .... + F(x[n-2]) )          |
|                                                             |
|     = I + (b-a)^5 F''''/(180 n^4) + higher order terms      |
|                                                             |
| where h = (b-a)/n , xi = a + i*h , for i = 0 to n           |
+-------------------------------------------------------------+
Extended Simpson's Formula (ESF) for n (even) intervals.

Inspecting the truncation error for the ESF we see that the error is proportional to 1/n^4 and to the fourth derivative of F. We can therefore expect a much smaller truncation error than for the ETF.

Example (using the previous ETF example)
Using the Extended Simpson's Formula (ESF) integrate the function F(x) = x^3 - 3x^2 + 5 over the range x = 0.0 to 2.5 using 2 intervals.

Solution: First we can see that the fourth derivative is zero and so even with just n=2 we expect the truncation error to be zero.

47

h = (b-a)/n = (2.5-0.0)/2 = 1.25

n=2     i    x      F(x)
------------------------
x0=a    0    0.00   5.000
x1      1    1.25   2.265625
x2=b    2    2.50   1.875

ESF = (h/3) ( F(x0) + 4 F(x1) + F(x2) )
    = 1.25/3 * ( 5.000 + 4*2.265625 + 1.875 ) = 6.640625

The result is exact as expected. To test the ESF further we need a function that has a non-zero fourth derivative. For example, integrate the function F(x) = x^5 over the same interval:

h = (b-a)/n = (2.5-0.0)/2 = 1.25

n=2     i    x      F(x)
------------------------
x0=a    0    0.00   0.00
x1      1    1.25   3.0517578125
x2=b    2    2.50   97.65625

ESF = (h/3) ( F(x0) + 4 F(x1) + F(x2) )
    = 1.25/3 * ( 0 + 4*3.0517578125 + 97.65625 ) = 45.7763671875

The exact result is 2.5^6/6 = 40.690104166666667 and so the error is

E2 = 45.7763671875 - 40.690104166666667 = 5.086263020833333

Repeating for n=4 we get ESF = 41.00799560547 and so

E4 = 41.00799560547 - 40.690104166666667 = 0.3178914388021

We expect an error proportional to 1/n^4, i.e. E2/E4 = 2^4 = 16, and 5.086263020833333/0.3178914388021 = 16.00000, which is consistent with the 1/n^4 expectation.

Implementation
The inputs to the numerical integration are:
1. The function, F(x), to be integrated;
2. The limits of the integration, a and b;
3. The number of intervals of the integration, n.

Algorithm 5b
input a, b                    ! input the lower and upper limits
input n                       ! input the number of intervals
h = (b-a) / n                 ! the interval size
esf = f(a) + f(b)             ! sum the end points
do i = 1, n-1, 2
  x = a + i*h                 ! calculate the evaluation position
  esf = esf + 4 * f(x)        ! sum the odd points
end do
do i = 2, n-2, 2
  x = a + i*h                 ! calculate the evaluation position
  esf = esf + 2 * f(x)        ! sum the even points
end do
esf = esf * h/3
print "The integral = ", esf  ! output the result
end
define the function f(x) = x^5
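Algorithm 5b can be sketched in Python as follows (an illustrative rendering; the course programs are Fortran, and n is assumed even):

```python
# Sketch of Algorithm 5b: the Extended Simpson's Formula (ESF), n even.
def esf(f, a, b, n):
    h = (b - a) / n                    # the interval size
    total = f(a) + f(b)                # end points carry weight 1
    for i in range(1, n, 2):
        total += 4 * f(a + i * h)      # odd points carry weight 4
    for i in range(2, n - 1, 2):
        total += 2 * f(a + i * h)      # even points carry weight 2
    return total * h / 3.0

f = lambda x: x**5
print(esf(f, 0.0, 2.5, 2))             # 45.7763671875
```

With n=2 and n=4 this reproduces the hand-calculated values 45.7763671875 and 41.00799560547 above.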


This ESF algorithm is compared with the ETF algorithm for F(x) = x^5,
a=0.0, b=2.5 :

intervals   ETF            error         ESF            error
n=  4       46.968460083   6.278355916   41.007995605   0.317891439
n=  8       42.274594307   1.584490140   40.709972382   0.019868215
n= 16       41.087158024   0.397053858   40.691345930   0.001241763
n= 32       40.789425838   0.099321672   40.690181777   0.000077610
n=320       40.691097575   0.000993409   40.690104174   0.000000008

The ESF method is clearly more accurate than the ETF method.

Tolerance and Error Estimation

As discussed earlier for the ETF algorithm, it is desirable to replace the
input n with a tolerance; for this the algorithm needs to form an estimate E
for the error in the ESF and terminate the algorithm when abs(E) is less
than the value of Tolerance.

An error estimate

An error estimate can be formulated by making use of the fact that the
truncation error has a 1/n^4 form. Consider n intervals giving an
approximation ESFn and error En. Now run the algorithm again with 2n
intervals. The result is ESF2n with an error E2n = En/16 (to a good
approximation). The difference between the two results is

(ESFn - ESF2n) = (En - E2n) = (16E2n - E2n) = 15 E2n

and so we can write E2n = (ESFn - ESF2n)/15. We therefore have an estimate
of the error in the final result ESF2n. This error estimate can be used as a
termination condition in the algorithm: repeat the ESF, doubling the number
of intervals, until abs(E2n) is less than the value of Tolerance.

Also we can use the error estimate to improve our final approximation:
ESFimproved = ESF2n - E2n (this is applied after the termination condition).
The estimated error is subtracted from the final approximation.

Example using the above data:

E32 = (ESF16 - ESF32)/15 = (40.691345930-40.690181777)/15 = 0.000077610.
ESFimproved = ESF32 - E32 = 40.690181777 - 0.000077610 = 40.690104167
(exact to 9dp).

However in some cases, where the higher derivatives of a function are
significant, this error estimate is not accurate. In these cases it is
better to use an alternative error estimate formed simply as
E2n = ESFn - ESF2n; in this way we repeat the calculations until the
difference between the previous result and the new result is smaller than a
tolerance. This avoids the possible underestimation of the above error
estimate (see the lab exercise).
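The doubling procedure with the (ESFn - ESF2n)/15 estimate and the final improvement step can be sketched as follows (Python; the function names are illustrative, not those of the course programs):

```python
def esf(f, a, b, n):
    """Extended Simpson's Formula over [a, b] with n (even) intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n, 2):
        total += 4 * f(a + i * h)
    for i in range(2, n - 1, 2):
        total += 2 * f(a + i * h)
    return total * h / 3

def esf_tolerance(f, a, b, tol):
    """Double n until abs(E2n) = abs((ESFn - ESF2n)/15) < tol."""
    n = 2
    prev = esf(f, a, b, n)
    while True:
        n *= 2
        cur = esf(f, a, b, n)
        err = (prev - cur) / 15          # estimate of the error in cur
        if abs(err) < tol:
            return cur - err             # improved final approximation
        prev = cur

result = esf_tolerance(lambda x: x**5, 0.0, 2.5, 1e-6)
print(result)    # converges to the exact value 40.690104166666667
```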
Adaptive methods

The Newton-Cotes formulas we have considered in this lecture perform
numerical integration based on evaluating the integrand at n+1
equally-spaced points. This may not be an effective approach in cases where
the function varies greatly over the region of integration. For example, a
Gaussian function has long (infinite) tails that reduce asymptotically
towards zero. In such cases the spacing between function evaluations needs
to vary. This is the subject of adaptive numerical integration methods. You
can read about these methods elsewhere.

Summary

We have investigated two numerical methods for integrating a function F(x)
over the range x = a to b based on evaluating the integrand at n+1
equally-spaced points. The Extended Trapezoidal Formula (ETF) has a
truncation error proportional to the second derivative and 1/n^2. The
Extended Simpson's Formula (ESF) has a truncation error proportional to the
fourth derivative and 1/n^4. Error estimates can be formed by considering
the 1/n^2 and 1/n^4 forms for the truncation error, or simply by comparing
the difference between the previous result and the new result as the number
of intervals is increased. These error estimates can be used as a
termination condition in the numerical integration algorithm.

5.3  Lab Exercises

The task

Write a computer program to integrate the following functions to an accuracy
of at least 6 decimal places using the Extended Trapezoidal Formula and the
Extended Simpson's Formula.

F(x) = ( 1-x^2 )^(1/2) over the range x=-1 to x=1
F(t) = 894 t / ( 1.76 + 3.21 t^2 )^3 over the range t=0 to t=1.61

Estimate the integrals from a rough sketch of the function.

Questions

1. What are the results of your program - do they look reasonable?
2. What is the approximate error in the numerical integral?

Note: To determine the integral to at least 6 decimal places you should run
the program with say 16 intervals, then run it again with 32 intervals and
64 and 128 etc until the error estimate is less than 0.000001. But be
careful with the way you choose the error estimate - you might run into
problems! Also, if you can think of a way to instruct the computer to
perform this procedure automatically then you will save yourself a lot of
time (and have a useful program!).


5.4  Lab Solutions

Write a computer program to integrate the following functions to an accuracy
of at least 6 decimal places using the Extended Trapezoidal Formula and the
Extended Simpson's Formula.

F(t) = 894 t / ( 1.76 + 3.21 t^2 )^3 over the range t=0 to t=1.61
F(x) = ( 1-x^2 )^(1/2) over the range x=-1 to x=1

Estimate the integrals from a rough sketch of the function.

1. What are the results of your program - do they look reasonable?
2. What is the approximate error in the numerical integral?

Solutions:

Programs: eee484ex5a and eee484ex5b (see download page).

To determine the integral to at least 6 decimal places you should run the
program with say 16 intervals, then run it again with 32 intervals and 64
and 128 etc until the error estimate is less than 0.000001. The results
shown below use the following error estimates: E2n = (ETFn-ETF2n)/3 and
E2n = (ESFn-ESF2n)/15. We have already seen in the lecture notes that these
estimates can be very accurate. However, we will see that they are not so
good for the second function given in this exercise and so later we will try
again with the error estimates E2n = ETFn-ETF2n and E2n = ESFn-ESF2n.

First it is good practice to estimate the integral from a rough sketch to
make sure that the computed result is not completely wrong (due to a
programming error).

Results for F(t) = 894 t / ( 1.76 + 3.21 t^2 )^3

First a rough sketch of the integral shows that we expect the result to be
approximately 21.8. Next, computing the ETF and ESF for n=16 and then
doubling until the tolerance is satisfied, gives:

          ETF: estimate = (ETFn-ETF2n)/3        ESF: estimate = (ESFn-ESF2n)/15

    n      ETF        Estimate    True E.       ESF         Estimate      True E.
   16   21.650237                             21.795657
   32   21.756923   -0.035562   -0.035367     21.792484    0.000211      0.000195
   64   21.783457   -0.008845   -0.008833     21.792302    0.000012      0.000012
  128   21.790082   -0.002208   -0.002208    *21.792290    0.0000007     0.0000007
  256   21.791738   -0.000552   -0.000552     21.792290    0.00000005    0.00000005
  512   21.792152   -0.000138   -0.000138     21.792290    0.000000003   0.000000003
 1024   21.792255   -0.000034   -0.000034     21.792290    0.000000000   0.000000000
 2048   21.792281   -0.000009   -0.000009
 4096   21.792287   -0.000002   -0.000002
 8192  *21.792289   -0.0000005  -0.0000005

Here the Error Estimate and True Error are compared; in this case to 6dp the
error estimates are accurate. Iterations are terminated (*) when the error
estimate is less than 10^-6. For the ETF a 6 decimal place accuracy is
achieved for n=8192 (the error estimate is -0.000 000 5) with the numerical
integral = 21.792 289. The actual error by calculus is also 0.000 000 5; in
this case the error estimate is accurate. The ESF gives the same result but
in fewer iterations. Again the error estimate is accurate.


Results for F(x) = ( 1-x^2 )^(1/2)

First a rough sketch of the integral shows that we expect the result to be
approximately 1.57.

          ETF: estimate = (ETFn-ETF2n)/3        ESF: estimate = (ESFn-ESF2n)/15

    n      ETF       Estimate    True E.       ESF        Estimate      True E.
   16   1.544910                             1.560595
   32   1.561627   -0.005572   -0.009170     1.567199   -0.000440     -0.003597
   64   1.567551   -0.001975   -0.003245     1.569526   -0.000155     -0.001270
  128   1.569648   -0.000699   -0.001148     1.570348   -0.000055     -0.000449
  256   1.570390   -0.000247   -0.000406     1.570638   -0.000019     -0.000159
  512   1.570653   -0.000087   -0.000144     1.570740   -0.000007     -0.000056
 1024   1.570746   -0.000031   -0.000051     1.570777   -0.000002     -0.000020
 2048   1.570778   -0.000011   -0.000018    *1.570789   -0.0000009    -0.0000070
 4096   1.570790   -0.000004   -0.000006     1.570794   -0.0000003    -0.0000025
 8192   1.570794   -0.000001   -0.000002     1.570795   -0.0000001    -0.0000009
16384  *1.570796   -0.0000005  -0.0000008    1.570796   -0.00000004   -0.00000031

For the ETF, 6 decimal place accuracy is achieved with n=16384, giving a
numerical integral = 1.570796. The error estimate is -0.000 000 5 while the
actual error by calculus is 0.000 000 9, about twice the estimated error but
still less than 10^-6. For this function the error estimate is not very
accurate but still serves well for the termination condition.

The ESF performs only slightly better than the ETF for this function, except
for the error estimate which is of the order of 10 times less than the true
error. Consequently the algorithm terminates too soon, giving the result
1.570 789 that has only 5 dp accuracy! This problematic behavior can be
explained by the fact that the higher derivatives of this function are
significant. The solution to this problem is to define the error estimate in
a more reliable way as E2n = ESFn - ESF2n; in this case the algorithm
terminates at n=16384 giving the result 1.570796 that is now correct to 6
decimal places (see below).

Discussion

The error estimates (ETFn-ETF2n)/3 and (ESFn-ESF2n)/15 in some cases give
very good estimates of the truncation error and in other cases not so good;
this varies from function to function (and where the function is being
evaluated). A much more reliable error estimate is E2n = ESFn - ESF2n. This
will guarantee the correct result!

It would be practical to start with, for example, n=1024; the solution is
then only a three or four program runs away. Alternatively you could write
the program such that it automatically repeats the ETF (doubling n each
time) until the termination condition is met. This is implemented in
eee484ex5a-auto and eee484ex5b-auto (see the downloads page). The more
reliable error estimate of E2n = ESFn - ESF2n is also used in these
programs.

Results for F(t) = 894 t / ( 1.76 + 3.21 t^2 )^3
    n      ETF         Error        ESF          Error
   32   21.756923   -0.106686   21.792484     0.003172
   64   21.783457   -0.026534   21.792302     0.000183
  128   21.790082   -0.006625   21.792290     0.000011
  256   21.791738   -0.001656  *21.792290    <0.000001
  512   21.792152   -0.000414
 1024   21.792255   -0.000103
 2048   21.792281   -0.000026
 4096   21.792287   -0.000006
 8192   21.792289   -0.000002
16384  *21.792289   -0.000000


Results for F(x) = ( 1-x^2 )^(1/2) over the range x=-1 to x=1.

    n      ETF         Error        ESF         Error
   32   1.561627    -0.016717   1.567199    -0.006604
   64   1.567551    -0.005925   1.569526    -0.002327
  128   1.569648    -0.002097   1.570348    -0.000821
  256   1.570390    -0.000742   1.570638    -0.000290
  512   1.570653    -0.000262   1.570740    -0.000103
 1024   1.570746    -0.000093   1.570777    -0.000036
 2048   1.570778    -0.000033   1.570789    -0.000013
 4096   1.570790    -0.000012   1.570794    -0.000005
 8192   1.570794    -0.000004   1.570795    -0.000002
16384   1.570796    -0.000001  *1.570796   <-0.000001
32768  *1.570796   <-0.000001

In this case the 6 decimal place accuracy is guaranteed.

Conclusion

We can integrate functions numerically to a predefined accuracy.
Higher-order methods do not necessarily give more accurate results! The
Extended Trapezoidal Formula and Extended Simpson's Formula are simple to
implement and work well when used intelligently with a termination condition
based on an error estimate and a tolerance. Be careful when forming error
estimates; E2n = ESFn - ESF2n is safer.
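The auto-terminating scheme with the safer estimate can be sketched as follows (a Python illustration, not the course's eee484ex5a-auto/eee484ex5b-auto programs; the names are mine). It is applied to the semicircle function from the lab exercise, whose exact integral is pi/2:

```python
import math

def esf(f, a, b, n):
    """Extended Simpson's Formula over [a, b] with n (even) intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n, 2):
        total += 4 * f(a + i * h)
    for i in range(2, n - 1, 2):
        total += 2 * f(a + i * h)
    return total * h / 3

def esf_auto(f, a, b, tol=1e-6, n=1024):
    """Repeat, doubling n, until the safer estimate abs(ESFn - ESF2n) < tol."""
    prev = esf(f, a, b, n)
    while True:
        n *= 2
        cur = esf(f, a, b, n)
        if abs(prev - cur) < tol:    # E2n = ESFn - ESF2n
            return cur
        prev = cur

semicircle = esf_auto(lambda x: math.sqrt(1.0 - x * x), -1.0, 1.0)
print(semicircle)    # converges towards pi/2 = 1.5707963...
```

Starting at n=1024, as suggested in the discussion, keeps the number of repeats small.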


5.5  Example exam questions

Question

a) Using the ETF (or ESF) perform the following integral by dividing the
   region of integration into ten equally spaced intervals:

   Integral of ( 6 x^2 - e^x ) dx from 1.0 to 5.0

b) Write a computer program to implement this method for 100 intervals.

Answers

a) The ETF is given by:

   I = h ( f0/2 + f1 + f2 + ..... + fn-1 + fn/2 )

   where fi = f(xi) and xi = a + i*h and h = (b-a)/n. Note that there are
   (n+1) function evaluations. For ten intervals (n=10), and a range of 1.0
   to 5.0, evaluating f(x) = 6x^2 - e^x with a pocket calculator gives:

   sum = 252.51921 ; I = 0.4*sum = 101.00768

   Repeating with the ESF:

   I = h/3 ( f0 + 4 f1 + 2 f2 + ..... + 2 fn-2 + 4 fn-1 + fn )

   Evaluating f(x) = 6x^2 - e^x with a pocket calculator gives:

   sum = 767.1359 ; I = 0.4/3*sum = 102.285

   A quick check with the analytical solution 102.305 indicates that the
   answer seems reasonable.

b) See the lab exercise.
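For part b), a sketch of such a program (Python here rather than Fortran; the structure follows the ETF formula in part a), with the 100 intervals asked for):

```python
import math

def f(x):
    return 6 * x**2 - math.exp(x)

a, b, n = 1.0, 5.0, 100
h = (b - a) / n

# ETF: I = h ( f0/2 + f1 + f2 + ... + f(n-1) + fn/2 )
etf = (f(a) + f(b)) / 2
for i in range(1, n):
    etf += f(a + i * h)
etf *= h

print(etf)   # close to the analytical value 102.305
```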


6  Solution of D.E.s: Runge-Kutta, and Finite-Difference

6.1  Topics Covered

o Runge-Kutta; the student should be able to write down Euler (first-order
  RK) steps representing the time evolution of a given (simple) physical
  system, and calculate the truncation error by considering Taylor's series.
  The student should be able to write a computer program implementing the
  formulae for the given physical system.

o Laplace and Jacobi Relaxation; the student should be able to derive the
  finite difference form for Laplace's equation using the CDA2, and solve
  for the potential V(x,y,z).

6.2  Lecture Notes

Introduction

The numerical solution of differential equations is a very large subject
spanning many types of problems and solutions. In this lecture we will look
at just two simplified topics: the solution of ordinary differential
equations using Euler's method (first-order Runge-Kutta), and the solution
of partial differential equations using the finite-difference method and
Jacobi Relaxation. You can read about many other types of problems and
solutions in your course text book (Ordinary Differential Equations and
Partial Differential Equations).

The Euler Method (first-order Runge-Kutta)

Many physical systems can be expressed in terms of first-order or
second-order differential equations. The time-evolution of these systems can
be approximated with a Euler method. We will apply the Euler method to
simulate a body in free-fall, the charging and discharge of a simple R-C
circuit, and the motion of a mass-on-a-spring system.

To introduce the ideas of Euler methods we will first look at a simple
freshman physics problem of a body in free-fall; here air resistance is
ignored and the acceleration due to gravity is a constant g = 9.81 m/s^2.
First the problem is solved analytically and then we will develop Euler
methods to solve the problem numerically. We will then study Euler methods
further.

Free-fall - analytical solution

We will determine the displacement of a body in free-fall. The boundary
conditions for the solution are that at t=0 the initial displacement, y, and
the initial velocity, v, are both zero. The system is governed by the
second-order differential equation y'' = -g ; integrating with the above
boundary conditions gives y' = -g t and so y = -0.5 g t^2.
For t = 10 seconds we therefore have a displacement of
-0.5 x 9.81 x 10^2 = -490.500 m.

Free-fall - numerical solution (by the Euler method)

We have the second-order differential equation y'' = -g; this can be written
as two first-order equations: dv/dt = -g and dy/dt = v. Rearranging these
equations we have dv = -g dt and dy = v dt, which can be written as:

v(t+dt) = v(t) - g dt
y(t+dt) = y(t) + v(t) dt

These expressions simply state that the velocity a time dt later is the
current velocity advanced by acceleration multiplied by dt, and the
displacement a time dt later is the current displacement advanced by
velocity multiplied by dt. The expressions are exact if dt takes the
calculus form of dt tends to zero. However, for a numerical solution dt is
small but not zero; with a finite value for dt the above equations become
Euler's method, and the expressions now, in general, contain truncation
errors (see later). Writing the expressions in the form of an algorithm:

v1 = v0 - g dt     ! Euler step evolving velocity and
y1 = y0 + v0 dt    ! displacement in a finite time dt

Three versions of the Euler step exist:

Simple Euler        Euler-Cromer        Improved Euler
v1 = v0 - g dt      v1 = v0 - g dt      v1 = v0 - g dt
y1 = y0 + v0 dt     y1 = y0 + v1 dt     y1 = y0 + (v0+v1)/2 dt

Each version uses a different velocity to evolve y:

Simple Euler   : v0        - generally poor accuracy
Euler-Cromer   : v1        - works well for oscillating systems
Improved Euler : (v0+v1)/2 - exact for the free-fall system.

Algorithm 6a

Implementation of the Simple Euler method to evolve the displacement of a
body under free-fall (g=9.81) for 10 seconds. The system evolves in time
steps of 0.1 seconds (100 iterations).

dt = 0.1    ! Time step is 0.1 seconds
t  = 0      ! Time is initially zero
y  = 0      ! Displacement is initially zero
v1 = 0      ! Velocity is initially zero

do 100 iterations
  v0 = v1              ! Record the previous velocity
  t  = t + dt          ! Evolve time
  v1 = v0 - 9.81*dt    ! Evolve velocity
  y  = y + v0*dt       ! Evolve displacement (Simple Euler method)
end do

output y               ! Output the result
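Algorithm 6a in runnable form (a Python sketch; variable names follow the pseudocode). Switching the displacement update to the Improved Euler form recovers the exact result:

```python
g, dt = 9.81, 0.1

# Simple Euler: evolve y with the old velocity v0
t, y, v1 = 0.0, 0.0, 0.0
for _ in range(100):
    v0 = v1               # record the previous velocity
    t += dt               # evolve time
    v1 = v0 - g * dt      # evolve velocity
    y += v0 * dt          # evolve displacement (Simple Euler)
y_simple = y              # -485.595 m, as in the notes

# Improved Euler: evolve y with the average of old and new velocity
t, y, v1 = 0.0, 0.0, 0.0
for _ in range(100):
    v0 = v1
    t += dt
    v1 = v0 - g * dt
    y += (v0 + v1) / 2 * dt
y_improved = y            # -490.500 m, the exact result

print(y_simple, y_improved)
```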

Result: displacement at t = 10.000 seconds is -485.595 m

The exact result (from calculus) is -g t^2/2 = -9.81 x 10^2/2 = -490.500 m.
This difference, +4.905 m, between the numerical result and the analytical
result is due to truncation errors in the Simple Euler method. We can
investigate this by considering Taylor's expansion:

y(t+dt) = y(t) + y'(t) dt + y''(t) dt^2/2 + y'''(t) dt^3/6 + ...

In free-fall y'(t)=v(t), y''(t)=-g, and y'''(t)=0, and so we can write:

y(t+dt) = y(t) + v(t) dt - g dt^2/2

The first two terms in the above equation form the Euler step for evolving
displacement (with the evolution of velocity v(t+dt) = v(t) - g dt being
exact). The last term, g dt^2/2, represents the truncation error in each
Simple Euler step. In the above algorithm the Euler step is iterated
10s/0.1s = 100 times, which implies the total error is
100 g dt^2/2 = 100 x 9.81 x 0.1^2/2 = 4.905 m, as seen in the results of the
algorithm.

Again by considering Taylor's expansion it can be shown that the Improved
Euler method is exact for the case of free-fall (see homework). Replace
y = y + v0 dt with y = y + (v0+v1)/2 dt in the above algorithm to convert it
to the Improved Euler method.

Euler methods give a general tool for solving systems governed by first- or
second-order differential equations - in many cases analytical solutions are
not available. Euler methods are generally not exact, though with careful
choice of the version of the method, and by using a small enough value for
dt, Euler methods can give good results.

The general form for the Simple Euler method:

First-order:  dy/dt = f(),  Euler step: y1 = y0 + f() dt
where f() is any function of t and y, and t,y can represent any parameters.

Example, a marble falling in oil: dv/dt = g - bv/m
Euler step: v = v + (g - bv/m) dt    (only velocity is evolved)

Second-order:  y'' = f()
where f() is any function of t and y, and t,y can represent any parameters.
Replace this with two first-order equations: dv/dt = f() and dy/dt = v
Euler steps: v1 = v0 + f() dt  and  y = y + v0 dt

Example, a simple pendulum: let y = theta and v = omega
=> theta'' = -g Sin(theta)/L
Replace with two first-order equations:
d(omega)/dt = -g Sin(theta)/L  and  d(theta)/dt = omega
Euler steps: omega1 = omega0 - (g Sin(theta)/L) dt
and theta = theta + omega0 dt

For systems governed by second-order differential equations the Euler-Cromer
and Improved Euler methods can be obtained with simple modifications of the
above formulae.

Summary so far

It has been shown that a Euler method can be implemented to evolve, in time,
the motion of a free-falling body. No analytical inputs are required and so
this method can be employed in the study of more complex systems where
analytical solutions are difficult or impossible. The type of Euler method
should be chosen carefully; the Improved Euler method gives an exact result
in the case of free-fall, the Euler-Cromer method is more suitable for
oscillating systems.

Examples

The following are examples of employing the Simple Euler method to solve
some physical systems (each system has an analytical solution with which you
can check the result of the numerical solution).

A first-order system - a charging R-C circuit

A simple R-C circuit is governed by the 1st-order D.E. i = dq/dt, where i is
the current in the circuit:

i = (V0-V)/R

V0 is the charging voltage, and V is the potential difference across the
capacitor, V = q/C.

The analytical result for the potential difference across the capacitor as
it charges is given by:

V = V0 (1 - exp(-t/RC))

The Simple Euler simulation is as follows. The system should be initialised:


R  = 1000    ! Circuit resistance (Ohms)
C  = 1E-6    ! Circuit capacitance (Farads)
V0 = 12      ! Charging voltage (Volts)
V  = 0       ! Initial potential of the capacitor (Volts)
q  = 0       ! Initially uncharged (Coulombs)

dt = 1E-5    ! Euler time step (seconds)
t  = 0       ! Start at t=0 seconds

The Simple Euler steps are

i = (V0-V)/R    ! Calculate the circuit current
t = t + dt      ! Advance the time
q = q + i dt    ! Add a small amount of charge dq = i dt
V = q/C         ! Recalculate the voltage
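Iterating these steps (a Python sketch using the component values above) tracks the analytical charging curve V0(1 - exp(-t/RC)) closely:

```python
import math

R, C, V0 = 1000.0, 1e-6, 12.0    # Ohms, Farads, Volts (values from the text)
V, q = 0.0, 0.0                  # capacitor initially uncharged
dt, t = 1e-5, 0.0                # Euler time step (seconds)

for _ in range(100):             # 100 steps = one time constant, RC = 1 ms
    i = (V0 - V) / R             # calculate the circuit current
    t += dt                      # advance the time
    q += i * dt                  # add a small amount of charge dq = i dt
    V = q / C                    # recalculate the voltage

analytic = V0 * (1.0 - math.exp(-t / (R * C)))
print(V, analytic)               # both near 7.6 V after t = RC
```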

The system evolves by iterating the above Euler steps.

A second-order system - the displacement of a mass-on-a-spring

The restoring force of a mass on a spring is given by F = -k x
=> the motion of the body is governed by the 2nd-order D.E.:

x'' = -kx/m

where k is the spring constant (N/m), x is the displacement, and m is the
inertial mass. Two 1st-order D.E.s are formed:

dv/dt = -kx/m   and   dx/dt = v

The analytical result for the displacement is:

x = x0 Cos(w t)

where x0 is the amplitude and w is the angular frequency = SQRT(k/m)

The Euler simulation is as follows (here I show the Euler-Cromer method - it
is more accurate for oscillating systems and is more concise to write down).
The system should be initialised:

m = 0.1    ! Mass (kg)
k = 1.0    ! Spring constant (N/m)
x = 0.1    ! Initial displacement (amplitude) (m)
v = 0      ! The mass is initially at rest.


dt = 0.01    ! Time step (seconds)
t  = 0       ! Start at t=0

The Euler steps are

t = t + dt     ! Advance the time
a = -kx/m      ! Calculate the acceleration
v = v + a dt   ! Advance the speed
x = x + v dt   ! Advance the displacement (using the new speed)

The system evolves by iterating the above Euler steps.

Algorithm 6b

As an example, a simulation of the displacement of a mass-on-a-spring is
implemented in the algorithm below. The 0.1 kg mass is initially at rest and
at a displacement of 0.1 m. The spring constant is 1 N/m and the simulation
evolves in time steps of 0.01 seconds. The algorithm terminates after 100
iterations (1 second).

m  = 0.1    ! Mass             !
k  = 1.0    ! Spring constant  ! Parameters
x0 = 0.1    ! Amplitude        !
dt = 0.01   ! Time step        !

x = x0      ! Initial displacement  !
v = 0.0     ! Initial velocity      ! Initial state
t = 0.0     ! Initial time          !

do 100 iterations
  t = t + dt      ! Advance time
  a = -k x/m      ! Calculate the acceleration
  v = v + a dt    ! Advance the velocity
  x = x + v dt    ! Advance the displacement (Euler-Cromer)
  ! Output, and compare the displacement with the exact expression
  output x, x0*COS(SQRT(k/m)*t)
end do

4th-Order Runge-Kutta

For serious calculations, higher-order methods are employed. One very
popular method is the 4th-order Runge-Kutta method; the truncation error is
greatly reduced though at the expense of a more complex algorithm. You can
read more about this in your course text book.

Finite-Difference Methods

Many physical systems can be represented by partial differential equations.
Such systems can be solved numerically using finite-difference methods;
i.e. the PDEs are replaced with finite-difference approximations. For
example, dV/dt can be approximated by the forward-difference approximation
( V(t+dt) - V(t) ) / dt, and d^2 V/dx^2 can be approximated by the
central-difference approximation ( V(x-dx) - 2V(x) + V(x+dx) ) / dx^2. In
this lecture we will look at just one such method; you can refer to the
course text book for an introduction to a number of other finite-difference
methods.
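Algorithm 6b as a Python sketch, comparing the final displacement with the analytical expression x0 Cos(SQRT(k/m) t):

```python
import math

m, k, x0, dt = 0.1, 1.0, 0.1, 0.01   # parameters, as in Algorithm 6b
x, v, t = x0, 0.0, 0.0               # initial state

for _ in range(100):
    t += dt                  # advance time
    a = -k * x / m           # calculate the acceleration
    v += a * dt              # advance the velocity
    x += v * dt              # advance the displacement (Euler-Cromer)

exact = x0 * math.cos(math.sqrt(k / m) * t)
print(x, exact)              # numerical and analytical displacements agree closely
```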


Finite-Difference: Laplace

For a region of space which does not contain any electric charge, the
electric potential V(x,y,z) in that region must obey Laplace's equation:

d^2 V/dx^2 + d^2 V/dy^2 + d^2 V/dz^2 = 0

where d^2/dx^2 etc are partial derivatives. Laplace's equation can be solved
analytically for simple (symmetric) configurations; however, if a more
complex configuration is to be solved then a numerical method must be
employed. The numerical solution involves four steps:

1. The region of space is represented by a three-dimensional lattice where
   the potential is defined at discrete points.
2. Laplace's Equation is written numerically as a finite difference
   equation.
3. The numerical equation is solved.
4. A relaxation algorithm is employed to apply the solution until Laplace's
   equation is satisfied. The simplest method for this is the Jacobi Method.

1. A 3-d lattice

The potential V(x,y,z) in a region of space can be mapped by a lattice
V(i,j,k) where i, j, and k specify points in the lattice:

x = i.dx , y = j.dy , z = k.dz

where dx, dy, dz are the spacing between the lattice points in the x-, y-,
and z-direction respectively. The lattice is initialised with the boundary
conditions (which are fixed) and with an initial approximation to the
solution (which will be relaxed to the solution).

Note: as the lattice spacing tends to zero the lattice becomes continuous
space. However, the lattice spacing must be finite in this computed
solution; the model is therefore not exact.

2. A Finite Difference equation for Laplace's Equation
Recall that the central difference approximation for the second derivative
of a function F(x) is given by:

CDA2 = ( F(x-h) - 2F(x) + F(x+h) ) / h^2

and so we can write (approximately):

d^2 V/dx^2 = ( V(x-dx,y,z) - 2V(x,y,z) + V(x+dx,y,z) ) / dx^2
d^2 V/dy^2 = ( V(x,y-dy,z) - 2V(x,y,z) + V(x,y+dy,z) ) / dy^2
d^2 V/dz^2 = ( V(x,y,z-dz) - 2V(x,y,z) + V(x,y,z+dz) ) / dz^2

and Laplace's equation becomes

[ V(x-h,y,z) - 2V(x,y,z) + V(x+h,y,z)
+ V(x,y-h,z) - 2V(x,y,z) + V(x,y+h,z)
+ V(x,y,z-h) - 2V(x,y,z) + V(x,y,z+h) ] / h^2 = 0

where, for convenience, the lattice spacings are set equal, i.e.
dx = dy = dz = h.

3. Solution

Solving for V(x,y,z) [see homework] gives

V(x,y,z) = [ V(x-dx,y,z) + V(x+dx,y,z)
           + V(x,y-dy,z) + V(x,y+dy,z)
           + V(x,y,z-dz) + V(x,y,z+dz) ] / 6

The equation simply states that the value of the potential at any point is
the average of the potential at neighboring points. The solution for
V(x,y,z) is the function that satisfies this condition at all points
simultaneously and satisfies the boundary conditions.

4. Jacobi Relaxation

Applying the above solution to the region of space modifies the field so
that it is in better agreement with Laplace's equation. The solution must be
applied many times, each iteration giving a better agreement with Laplace's
equation. The solution is satisfied when further iterations yield
insignificant modifications to the potential field. The difference between
the old and new lattice can be expressed as:

Delta = sum |V1-V2| / (number of points)

Iteration can be terminated when Delta is less than some small value that
corresponds to an insignificant change in the solution.

This method of relaxation of the potential field is one of many techniques
which can be used to solve for V(x,y,z). The Jacobi relaxation method is the
simplest form of relaxation; other methods are employed to speed up the
relaxation process, especially for large lattices.

Algorithm 6c

The following algorithm implements Jacobi Relaxation for a dipole potential.
It employs a 33-by-33 two-dimensional lattice and iterates until Delta is
less than 10^-6. Note that the matrix is centered at (0,0).

Declare matrix V1(-16:+16,-16:+16)
Declare matrix V2(-16:+16,-16:+16)

V2 = 0.0           ! Set all grid points to zero potential
V2(-6,0) = -1.0    ! Create the -ve pole
V2(+6,0) = +1.0    ! Create the +ve pole

do
  V1 = V2          ! Make a copy of the old lattice

  ! Apply the solution to Laplace's equation
  ! but don't modify the boundary.
  do i = -15,+15
    do j = -15,+15
      V2(i,j) = ( V1(i-1,j) + V1(i+1,j) + V1(i,j-1) + V1(i,j+1) )/4
    end do
  end do

  V2(-6,0) = -1.0  ! Reset the dipoles
  V2(+6,0) = +1.0

  ! Compute the difference between the old and new solution
  Delta = sum(abs(V1-V2)) / (33*33)

  if (Delta < 0.000001) exit   ! Terminate if Delta is small
end do
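Algorithm 6c translates directly into Python (a sketch with plain lists; the pole coordinates here are the 0-based lattice indices of the points (-6,0) and (+6,0) on the 33-by-33 grid):

```python
N = 33                           # 33-by-33 lattice, indices 0..32; centre at (16,16)
NEG, POS = (16, 10), (16, 22)    # lattice indices of the poles (-6,0) and (+6,0)

V2 = [[0.0] * N for _ in range(N)]   # set all grid points to zero potential
V2[NEG[0]][NEG[1]] = -1.0            # create the -ve pole
V2[POS[0]][POS[1]] = +1.0            # create the +ve pole

while True:
    V1 = [row[:] for row in V2]      # make a copy of the old lattice
    for j in range(1, N - 1):        # relax interior points; boundary stays fixed
        for i in range(1, N - 1):
            V2[j][i] = (V1[j][i - 1] + V1[j][i + 1]
                        + V1[j - 1][i] + V1[j + 1][i]) / 4
    V2[NEG[0]][NEG[1]] = -1.0        # reset the poles
    V2[POS[0]][POS[1]] = +1.0
    # difference between the old and new solution
    delta = sum(abs(V1[j][i] - V2[j][i])
                for j in range(N) for i in range(N)) / (N * N)
    if delta < 1e-6:                 # terminate if delta is small
        break

print(V2[16][16])   # ~0 on the symmetry plane between the poles
```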

The following are results for a dipole and a parallel-plate capacitor;
values of potential are represented by characters:

-zyxwvutsrqponmlkjihgfedcba.ABCDEFGHIJKLMNOPQRSTUVWXYZ+
|             |             |             |           |
-1V          -0.5V          0V          +0.5V       +1.0V


A Dipole (dipole.f90)

Initial field

................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ...................--......................++................... ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ ................................................................ 
................................................................ ................................................................ ................................................................

Final field after 473 iterations


................................................................ ................................................................ ................................................................

A parallel plate capacitor (capacitor.f90)

For a capacitor simply replace:

V2(-6,0)=-1.0
V2(+6,0)=+1.0

Initial field
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
...................--......................++...................
...................--......................++...................
...................--......................++...................
...................--......................++...................
...................--......................++...................
...................--......................++...................
...................--......................++...................
...................--......................++...................
...................--......................++...................
...................--......................++...................
...................--......................++...................
...................--......................++...................
...................--......................++...................
...................--......................++...................
...................--......................++...................
...................--......................++...................
...................--......................++...................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................
................................................................

with:

V2(-6,-8:+8)=-1.0
V2(+6,-8:+8)=+1.0

Final field after 330 iterations

................................................................
.....aaaaaaaaaaaaaaaaaaaaaaaa......AAAAAAAAAAAAAAAAAAAAAAAA.....
...aaaabbbbbbccccccccccbbbbaaaa..AAAABBBBCCCCCCCCCCBBBBBBAAAA...
.aaaabbccccddddeeeeeeddddccbbaa..AABBCCDDDDEEEEEEDDDDCCCCBBAAAA.
.aabbccccddeeffffggggffeeddccaa..AACCDDEEFFGGGGFFFFEEDDCCCCBBAA.
.aabbccddffgghhiiiiiihhggeeddbb..BBDDEEGGHHIIIIIIHHGGFFDDCCBBAA.
.aaccddeegghhjjkkllllkkiiggeebb..BBEEGGIIKKLLLLKKJJHHGGEEDDCCAA.
.bbcceeffhhjjllnnppqqoolliiffcc..CCFFIILLOOQQPPNNLLJJHHFFEECCBB.
.bbcceeggiikkmmpptt--ssnnjjggcc..CCGGJJNNSS++TTPPMMKKIIGGEECCBB.
.bbddffhhjjlloorruu--ttookkggdd..DDGGKKOOTT++UURROOLLJJHHFFDDBB.
.bbddffhhkkmmppssvv--uuppllhhdd..DDHHLLPPUU++VVSSPPMMKKHHFFDDBB.
.bbddffiikknnppssww--uuqqllhhdd..DDHHLLQQUU++WWSSPPNNKKIIFFDDBB.
.bbddggiillnnqqttww--uuqqmmhhdd..DDHHMMQQUU++WWTTQQNNLLIIGGDDBB.
.bbeeggiillooqqttww--vvqqmmhhdd..DDHHMMQQVV++WWTTQQOOLLIIGGEEBB.
.bbeeggjjllooqqttww--vvqqmmiidd..DDIIMMQQVV++WWTTQQOOLLJJGGEEBB.
.bbeeggjjlloorrttww--vvqqmmiidd..DDIIMMQQVV++WWTTRROOLLJJGGEEBB.
.bbeeggjjlloorrttww--vvqqmmiidd..DDIIMMQQVV++WWTTRROOLLJJGGEEBB.
.bbeeggjjlloorrttww--vvqqmmiidd..DDIIMMQQVV++WWTTRROOLLJJGGEEBB.
.bbeeggjjllooqqttww--vvqqmmiidd..DDIIMMQQVV++WWTTQQOOLLJJGGEEBB.
.bbeeggiillooqqttww--vvqqmmhhdd..DDHHMMQQVV++WWTTQQOOLLIIGGEEBB.
.bbddggiillnnqqttww--uuqqmmhhdd..DDHHMMQQUU++WWTTQQNNLLIIGGDDBB.
.bbddffiikknnppssww--uuqqllhhdd..DDHHLLQQUU++WWSSPPNNKKIIFFDDBB.
.bbddffhhkkmmppssvv--uuppllhhdd..DDHHLLPPUU++VVSSPPMMKKHHFFDDBB.
.bbddffhhjjlloorruu--ttookkggdd..DDGGKKOOTT++UURROOLLJJHHFFDDBB.
.bbcceeggiikkmmpptt--ssnnjjggcc..CCGGJJNNSS++TTPPMMKKIIGGEECCBB.
.bbcceeffhhjjllnnppqqoolliiffcc..CCFFIILLOOQQPPNNLLJJHHFFEECCBB.
.aaccddeegghhjjkkllllkkiiggeebb..BBEEGGIIKKLLLLKKJJHHGGEEDDCCAA.
.aabbccddffgghhiiiiiihhggeeddbb..BBDDEEGGHHIIIIIIHHGGFFDDCCBBAA.
.aabbccccddeeffffggggffeeddccaa..AACCDDEEFFGGGGFFFFEEDDCCCCBBAA.
.aaaabbccccddddeeeeeeddddccbbaa..AABBCCDDDDEEEEEEDDDDCCCCBBAAAA.
...aaaabbbbbbccccccccccbbbbaaaa..AAAABBBBCCCCCCCCCCBBBBBBAAAA...
.....aaaaaaaaaaaaaaaaaaaaaaaa......AAAAAAAAAAAAAAAAAAAAAAAA.....
................................................................

The Fortran programs for these two simulations can be found in the downloads section of the course website. Results for various configurations of charges, and for larger matrix sizes, can be found at: http://www1.gantep.edu.tr/~andrew/eee484/downloads/laplace/ The program sources are also available at that URL.

6.3 Lab Exercises

Task 1: Investigation of the potential fields for various configurations

First download the program source codes dipole.f90 and capacitor.f90 from the course web site downloads page. Compile and run them. If you prefer, try to translate these programs to C or C++ etc. The two programs are the same except for the definitions of the dipole and capacitor plates. The dipole potential is defined with the assignments:

V2(-6,0)=-1.0
V2(+6,0)=+1.0

The capacitor plates are defined with the assignments:

V2(-6,-8:+8)=-1.0
V2(+6,-8:+8)=+1.0

Note that the assignments appear twice in each program. To simulate other configurations of potentials the above assignments are simply replaced by the appropriate potential distribution (the rest of the program remains unchanged). For example a + (plus) shape arrangement can be achieved with the assignments:

V2(0,-8:+8)=-1.0
V2(-8:+8,0)=+1.0
----------------++----------------
----------------++----------------
----------------++----------------
----------------++----------------
----------------++----------------
----------------++----------------
----------------++----------------
----------------++----------------
++++++++++++++++++++++++++++++++++
----------------++----------------
----------------++----------------
----------------++----------------
----------------++----------------
----------------++----------------
----------------++----------------
----------------++----------------
----------------++----------------

Modify the dipole program for the following configurations, run your programs and check the outputs:

1. A dipole with equally signed potentials
2. A monopole
3. A quadrupole
4. A box potential

After you have investigated the above configurations try some others.

Task 2: Implementation of Euler simulations of a charging R-C circuit and a mass-on-a-spring

Implement the following Euler simulations given in the lecture into computer programs and compare the outputs with the analytical results. If you have time, compare the performance of the Simple Euler, Euler-Cromer and Improved Euler methods for these systems (the formulations can be tricky!).

1. A charging R-C circuit governed by the 1st-order D.E. dq/dt = (Vo-V)/R, where R = 1000 Ohms, C = 1 micro Farad, and the charging voltage Vo is 12 Volts. With the capacitor initially uncharged and a time step dt = 10^-5 seconds, evolve the system for 1 millisecond. Compare your result for the circuit voltage with the analytical solution: V(t) = Vo (1 - e^(-t/RC))

2. A mass-on-a-spring governed by the 2nd-order D.E. x'' = -kx/m, where m = 0.1 kg and k = 1 N/m. With an initial displacement xo = 0.1 m and the body initially at rest, using a time step of 0.01 s, evolve the system for 10 seconds. Compare your result for the displacement of the mass with the analytical solution: x(t) = xo Cos(wt) where w = sqrt(k/m).


6.4 Lab Solutions

Task 1: Investigation of the potential fields for various configurations

Modify the dipole program for the following configurations, run your programs and check the outputs:

1. A dipole with equally signed potentials
2. A monopole
3. A quadrupole
4. A box potential

Solutions (see the downloads page):

Sources:  eee484ex6a1      eee484ex6a2      eee484ex6a3      eee484ex6a4
Outputs:  eee484ex6a1.out  eee484ex6a2.out  eee484ex6a3.out  eee484ex6a4.out

The potential definitions are:

A (+ +) dipole:
V2(-6,0)=+1.0
V2(+6,0)=+1.0

A monopole:
V2(0,0)=+1.0

A quadrupole:
V2(-6,-6)=+1.0
V2(+6,+6)=+1.0
V2(-6,+6)=-1.0
V2(+6,-6)=-1.0

A box potential:
V2(-8,-8:+8)=+1.0
V2(+8,-8:+8)=+1.0
V2(-8:+8,-8)=+1.0
V2(-8:+8,+8)=+1.0

Task 2: Implementation of Euler simulations of a charging R-C circuit and a mass-on-a-spring

1. A charging R-C circuit governed by the 1st-order D.E. dq/dt = (Vo-V)/R, where R = 1000 Ohms, C = 1 micro Farad, and the charging voltage Vo is 12 Volts. With the capacitor initially uncharged and a time step dt = 10^-5 seconds, evolve the system for 1 millisecond. Compare your result for the circuit voltage with the analytical solution V(t) = Vo (1 - e^(-t/RC)).

Solution: eee484ex6b1 (see the downloads page). For this Simple Euler simulation a 10 micro-second time step yields, initially, a 0.5 percent error; in this case the error reduces slightly as the system evolves (to 0.3 percent after 1 ms). Reducing the time-step to 1 micro-second reduces this initial error to 0.05 percent. Again, it can be shown that the error is proportional to the size of the time-step. A much more accurate simulation is gained with the Improved Euler method, eee484ex6b1improved; here the initial error of 0.5 percent is quickly reduced to near zero.

2. A mass-on-a-spring governed by the 2nd-order D.E. x'' = -kx/m, where m = 0.1 kg and k = 1 N/m. With an initial displacement xo = 0.1 m and the body initially at rest, using a time step of 0.01 s, evolve the system for 10 seconds. Compare your result for the displacement of the mass with the analytical solution x(t) = xo Cos(wt) where w = SQRT(k/m).

Solution: eee484ex6b2 (see the downloads page). With the Simple Euler simulation, although the period of the system is well reproduced, the amplitude of the system increases (thereby not conserving energy); you can check this by comparing the simulated displacement with the expected theoretical displacement over a few periods. The Euler-Cromer method, eee484ex6b2cromer.f90, is much more accurate; it reproduces well both the period and the amplitude of the system. Euler-Cromer is generally better for oscillating systems.
Conclusion

For the above simulations, and in general, we see that the Simple Euler method does not perform well. The Improved Euler method can give much greater accuracy, except in the case of oscillating systems where the Euler-Cromer method is the preferred choice. Errors are proportional to the size of the time-step, dt, i.e. the error in each simulation can be reduced by reducing the value of dt. However, reducing dt makes it necessary to perform more iterations, increasing the run-time of the simulation; also round-off errors may become large (in this case double precision should be used).

Final note

In practice higher-order methods such as the 4th-order Runge-Kutta method are often employed. See rkshm.f90 in the downloads page.


6.5 Example exam questions

Question 1

Using the central-difference approximation to the second derivative of a function F(x):

CDA2 = ( F(x-h) - 2F(x) + F(x+h) ) / h^2

show that for a region of space that does not contain any electric charge the following expression satisfies Laplace's equation:

V(i,j,k) = [ V(i-1,j,k) + V(i+1,j,k) + V(i,j-1,k) + V(i,j+1,k) + V(i,j,k-1) + V(i,j,k+1) ] / 6

where the matrix V represents the potential at discrete points i,j,k in a three-dimensional lattice. Your answer should include an explanation of the mapping of x, y, and z space onto i, j, and k points in the lattice.

Hint: d^2V/dx^2 + d^2V/dy^2 + d^2V/dz^2 = 0 (Laplace's equation for a chargeless region of space).

Question 2

Using the central-difference approximation to the second derivative of a function F(x):

CDA2 = ( F(x-dx) - 2F(x) + F(x+dx) ) / dx^2

and the forward-difference approximation to the first derivative of a function F(t):

FDA = ( F(t+dt) - F(t) ) / dt

show that the finite-difference solution to the heat equation for a thin rod is given by:

U(i,j+1) = r U(i-1,j) + (1-2r) U(i,j) + r U(i+1,j)

where the matrix U represents the temperature at discrete points i,j in a two-dimensional lattice. Your answer should include an explanation of the mapping of x and t space onto i and j points in the lattice.

Hint: dU/dt = c d^2U/dx^2 (the heat equation for a one-dimensional conductor).

Question 3

a) Write down the formulae representing the following methods for the numerical evolution of the displacement, y(t), of a body governed by the second-order differential equation y''(t) = -g:

i. Simple Euler method
ii. Improved Euler method
iii. Euler-Cromer method


b) Show that the Improved Euler method yields an exact result.

c) Write a computer program implementing the Improved Euler method for the above system.

Question 4

a) A simple R-C circuit is governed by the 1st-order D.E. i = dq/dt, where i is the current in the circuit, i = V/R, and V is the p.d. across the capacitor, V = q/C. Write down Euler steps representing the time-evolution of the potential difference across the capacitor. Implement your formulae in a computer program.

b) A simple R-C circuit is governed by the 1st-order D.E. i = dq/dt, where i is the current in the circuit, i = (V0-V)/R, V0 is the charging voltage, and V is the potential difference across the capacitor, V = q/C.

Write down Euler steps representing the time-evolution of the potential difference across the capacitor. Implement your formulae in a computer program.

c) The restoring force in a mass-on-a-spring system is given by F = -k.x, so the motion of the body is governed by the 2nd-order D.E. x'' = -k.x/m, where k is the spring constant (N/m), x is the displacement, and m is the inertial mass. Two 1st-order D.E.s can be formed:

dv/dt = -k.x/m and dx/dt = v

Write down Euler steps representing the time-evolution of the displacement of the mass. Implement your formulae in a computer program.

d) (In this question you are not given the differential equations describing the system, so you need to build them yourself.) Write down Euler steps representing the time-evolution of the voltage V recorded by the voltmeter in the system shown below. Implement your formulae in a computer program.

The diagram is an R-C circuit

         | |
 +-------| |-------+
 |       | |       |
 |        C        |
 |                 |
 |     +------+    |
 +-----|  R   |----+
 |     +------+    |
 |                 |
 +-------(V)-------+

C = 1 micro Farad
R = 1000 Ohms
At t = 0, V = 12 volts

(V) is a voltmeter; you can assume it has infinite internal resistance.


7 Random Variables and Frequency Experiments

7.1 Topics Covered

o Review of Probability and Random Variables: the student should be familiar with the topics taught in EEE 283 (operations on pdfs and pmfs, probabilities and conditional probabilities).

o Generation of pseudo-random numbers: the student should know how to generate random numbers (e.g. by using the rand() function) and write computer programs to perform frequency experiments.

o Transformation of a uniform pdf to a non-uniform pdf: given a uniform pdf fX(x)=1 with 0<x<1 and a transformation function y=T(x), the student should be able to determine and sketch the resultant non-uniform pdf fY(y); the student should also be familiar with the rejection method for generating a non-uniform pdf from a uniform pdf.

7.2 Lecture Notes

Introduction

Probability, random variables and random processes are important topics in science and engineering. These topics are covered in the course EEE 283 (check out my web site for that course). Key to this subject are the ideas of the random variable, probability density functions (pdfs) and probability mass functions (pmfs), and operations on them. This is treated theoretically in EEE 283. However, all the results given in that course can be reproduced experimentally by taking a frequency interpretation of probability. [A brief summary of Probability and Random Variables is given in class.]

For example, consider the tossing of a coin; we know that the probability of the outcome being heads is 0.5; in probability theory this is written P(heads) = 0.5. The frequency interpretation is P(heads) = nheads/n in the limit as n goes to infinity, where n is the number of tosses of the coin (the number of trials) and nheads is the number of outcomes that give heads, i.e. we perform an experiment where the coin is tossed an infinite number of times and we count the number of times the coin comes up heads. In reality a good approximation to the probability can be obtained with a large finite value of n.

Such frequency experiments allow us to verify that a theoretical result is true (a great help when writing exam questions for EEE 283!) and to find results for cases where the theoretical calculations are difficult to evaluate (i.e. many problems in the real world). To perform a frequency experiment one needs to generate a large number of trials. Often we have to do this by hand (e.g. to test the effect of a new drug a clinical trial is performed where a large number of patients is given the drug while another large number of patients is not; the outcomes are analysed statistically). However, if we know the underlying probabilities that govern a system (e.g. P(heads) = 0.5) then we can simulate an experiment using a computer.
For this we need to be able to generate a probability density function (pdf), i.e. lists of numbers X = (x1, x2, x3, ..., xn) that are randomly distributed according to some function fX(x). The most basic pdf is fX(x)=1 with 0<x<1, i.e. a uniform distribution.

Generating Random Numbers

In Fortran 90 the intrinsic subroutine random_number provides the programmer with lists of random numbers uniformly distributed in the range 0<r<1. In the following example, array r is filled with random numbers:

Algorithm 7a1 (Fortran syntax)

real :: r(8)
call random_number(r)
print *, r


Example result:

0.983900 0.699951 0.275312 0.661102 0.809842 0.910005 0.304463 0.484259

The equivalent in C++, using the intrinsic rand() function, is:

Algorithm 7a2 (C++ syntax)

for (int i=1; i<=8; i++) {
  double r = rand()/(double(RAND_MAX)+1);
  cout << r << " ";
}
cout << endl;

Example result:

0.840188 0.394383 0.783099 0.798440 0.911647 0.197551 0.335223 0.768230

These programs output 8 pseudo-random numbers. Random number generators create a sequence of pseudo-random numbers usually distributed uniformly between 0 and 1. The numbers are not truly random; they are created by a deterministic algorithm, hence the term pseudo-random. There are various algorithms for producing large sequences of random numbers with varying qualities. The quality of a random number generator relates to four main properties:

1. The apparent randomness of the sequence.
2. The size of the period of the sequence, i.e. how many numbers are generated before the sequence repeats; this varies from about 10^9 in a minimal standard generator to 10^43 or more in high quality generators.
3. The uniformity of the distribution of random numbers; is the distribution flat? does it have gaps?
4. The distribution should pass some statistical/spectral tests.

|_________________|
|                 |
|                 |
+-----------------+- R
0                 1

Uniform (flat) distribution of a set of random numbers

|________-___-____|
|                 |
|                 |
+-----------------+- R
0                 1

Distribution of a set of random numbers with some non-uniformity

|________ ___ ____|
|                 |
|                 |
+-----------------+- R
0                 1

Distribution of a set of random numbers with gaps.


A popular primitive algorithm is the multiplicative linear congruential generator, first used in 1948; with carefully chosen constants this provides a good basic generator:

R(i+1) = ( a R(i) + b ) MOD m

where MOD means modulo. Constants a, b and m are chosen carefully such that the sequence of numbers becomes chaotic and evenly distributed. Park and Miller proposed a minimal standard with which more complex generators can be compared; the constants are taken as: a = 7^5 = 16807, b = 0, and m = 2^31 - 1 = 2147483647. The range of values is 1 to m (divide by m to convert to 0<r<1). The period of this generator is m-1, about 2 billion. A computer implementation of this algorithm using 32-bit integers is not straightforward, as a times R can be out of the integer range; we have to apply a trick (approximate factorisation of m). The algorithm is implemented below in Fortran 90 (see also the downloads page on the course website for the Fortran 77, C and C++ versions of this program). The algorithm is in the form of a function ran() to which a seed is passed. The function returns a random number; the seed returns modified so that the next call of the function returns the next random number in the sequence. Before the first call to the function the seed needs to be initialised with any value 1 to 2147483647 (not zero). Different initial seed values result in different sequences of random numbers.

ran.f90
integer :: i, iseed
real    :: r
iseed = 314159265       ! Initialise the seed
do i = 1, 10
  r = ran(iseed)        ! ran() is a function that
  print *, r            ! returns a random number.
end do

contains  ! ran() is defined as an internal function:

real function ran(iseed)
  !--------------------------------------------------------------
  ! Returns a uniform random deviate between 0.0 and 1.0.
  ! Based on: Park and Miller's "Minimal Standard" random number
  ! generator (Comm. ACM, 31, 1192, 1988)
  !--------------------------------------------------------------
  implicit none
  integer, intent(inout) :: iseed
  integer, parameter :: IM=2147483647, IA=16807, IQ=127773, IR=2836
  real,    parameter :: AM=128.0/IM
  integer :: K
  K = iseed/IQ
  iseed = IA*(iseed-K*IQ) - IR*K
  if (iseed < 0) iseed = iseed + IM
  ran = AM*(iseed/128)
end function ran
end

For the given initial seed 314159265 the program gives the following sequence of values:

0.7264141 0.0418309 0.8427828 0.0521500 0.6508798
0.4857842 0.3372238 0.5747718 0.7214876 0.1894919

As mentioned above, the period of this algorithm is m-1 = 2147483646. This is actually not large; for example my 2.4 GHz cpu takes only 23 seconds to generate the complete sequence of random numbers! Compare this to, for example, simulations of high energy particle reactions, where farms of computers generate datasets over days; the period of this generator is clearly not sufficient. Improved algorithms are available providing uniform distributions of random numbers with periods of 10^12, 10^18, 10^43 and even 10^171; these algorithms, however, are much more complex.

Frequency Experiments

We now have a method for generating large numbers of (pseudo) random numbers: call random_number(r) in Fortran 90, and double r = rand()/(double(RAND_MAX)+1); in C++. We can now return to the frequency experiments. Consider again the tossing of a coin. We can create an experiment by generating a large number of random values (uniformly distributed between 0 and 1), calling any value less than 0.5 a head, and counting the number of times this occurs. This is illustrated in the algorithm below, where a coin is tossed one million times (n=1000000) and the fraction nheads/n is output.

Algorithm 7b (Fortran syntax [mostly])

n = 1000000
nheads = 0
do i = 1, n
  call random_number(r)
  if (r<0.5) nheads = nheads+1
end do
output nheads/n

The second (concise) form of the algorithm makes use of Fortran whole-array processing and intrinsics:

real :: r(1000000)
call random_number(r)
print *, count(r<0.5)/1000000.
end

Example output: 0.499687

The result is close to, but not exactly, the expected value of 0.5 because the process of generating outcomes is random. Repeating the above experiment with different sample sizes, n, gives the following results:

n         nheads/n
100       0.47
1000      0.499
10000     0.5030
100000    0.50067
1000000   0.499687

Note that the difference between the value of nheads/n and 0.5 gets smaller as n increases, i.e. the experiment becomes more accurate as the statistics increase.

Operations on Random Variables

Continuing the subject of random variables, an important topic is that of operations on random variables. Basic operations include the calculation of the expectation value and variance of a probability density function. The expectation value E[X] is the first moment about the origin (denoted by m1); it can be viewed as the center of mass or arithmetic mean of a distribution, and is defined as E[X] = m1 = the integral of the product x f(x). The variance is the second moment about the mean (denoted by mu2); it represents a measure of the size of the spread of the distribution about the mean m1, and is defined as E[(X-m1)^2] = mu2 = the integral of the product (x-m1)^2 f(x). The variance can also be equated by simple algebraic arguments as mu2 = m2 - m1^2, where m2 is the second moment about the origin, defined as E[X^2] = m2 = the integral of the product x^2 f(x). We will now calculate the expectation value and variance for the uniform pdf both theoretically and via a frequency experiment as follows.


Theory: We have the random variable X with a pdf f(x)=1 in the range 0<x<1. E[X] = m1 = the integral of the product x f(x) = 1/2 (see class notes for the integral), and the variance E[(X-1/2)^2] = mu2 = the integral of the product (x-1/2)^2 f(x) = 1/12 (see class notes for the integral). Alternatively mu2 = m2 - m1^2; with m2 = the integral of the product x^2 f(x) = 1/3, mu2 = 1/3 - (1/2)^2 = 1/12.

Experiment: We now generate n random variables X = (x1, x2, x3, ..., xn) from a set of uniformly distributed values 0<x<1, and calculate experimentally the expectation value and the variance. For this we can use directly the random number generator intrinsic to the Fortran or C++ compiler. In this frequency experiment the calculation of the expectation value E[X] = m1 becomes the sum of the values of x normalised to the number of values; i.e. m1 = (x1+x2+x3+...+xn)/n, which is simply the arithmetic mean. For the variance mu2, it is convenient to use the equality mu2 = m2 - m1^2, which requires m2 = (x1^2+x2^2+x3^2+...+xn^2)/n.

Algorithm 7c (Fortran syntax [mostly])

n = 1000000
m1 = 0, m2 = 0
do i = 1, n
  call random_number(x)
  m1 = m1 + x
  m2 = m2 + x^2
end do
m1 = m1/n
m2 = m2/n
output " mean, m1 = ", m1
output "variance, mu2 = ", m2 - m1^2

The second (concise) form of the algorithm makes use of Fortran whole-array processing and intrinsics:

integer, parameter :: n = 1000000
real(kind=8) :: x(1000000), m1, m2
call random_number(x)
m1 = sum(x)/n
m2 = sum(x**2)/n
print *, " mean, m1 = ", m1
print *, "variance, mu2 = ", m2 - m1**2
end

Example output (for n = 1,000,000):

    mean, m1  = 0.5000423 [1/1.9998]
variance, mu2 = 0.0832316 [1/12.015]

Increasing the number of trials to n = 1,000,000,000 (the program takes less than 1 minute to run!) gives:

    mean, m1  = 0.500005976 [1/1.99998]
variance, mu2 = 0.083332809 [1/12.00008]

While we do not obtain the exact values, it is clear that the expectation value and variance tend toward the theoretical values for large n. The above demonstration can be repeated for the triangular pdf f(x) = 2x with 0<x<1. Here, E[X] = m1 = the integral of the product x 2x = 2/3 (see class notes for the integral), and the variance E[(X-2/3)^2] = mu2 = the integral of the product (x-2/3)^2 2x = 1/18 (see class notes for the integral). Alternatively mu2 = m2 - m1^2; with m2 = the integral of the product x^2 2x = 1/2, mu2 = 1/2 - (2/3)^2 = 1/18. For the experiment, Algorithm 7c only needs to be modified such that the random numbers are distributed in the form of a triangular pdf. This is obtained with the transformation x=sqrt(x); i.e. replace call random_number(x) with call random_number(x); x=sqrt(x) [the transformation of distributions will be studied later in this topic]. The result for n = 1,000,000,000 is:

    mean, m1  = 0.6666715428590047 [2/2.999978]
variance, mu2 = 0.055555029965474394 [1/18.013]

Again the experimental results are in agreement with the theoretical results.


Total Probability and Conditional Probability

Total probabilities are obtained by integrating the pdf. The probability of obtaining a value between a and b is defined as P(a<X<b) = the integral of f(x) over the limits a to b. For example, for the triangular pdf f(x) = 2x with 0<x<1, the probability of obtaining a value between 0.5 and 0.9 is P(0.5<X<0.9) = the integral of 2x over the limits 0.5 to 0.9 = 0.56. Experimentally, the probability is obtained by simply counting the number of values that appear within the given range. This is illustrated below:

Algorithm 7d1 (Fortran syntax [mostly])

n = 1000000
m = 0
do i = 1, n
  call random_number(x); x=sqrt(x)
  if (x>0.5 .and. x<0.9) m = m+1
end do
output "P{0.5<X<0.9} = ", m/n

The second (concise) form of the algorithm makes use of Fortran whole-array processing and intrinsics:

integer, parameter :: n = 1000000
real :: x(n), m
call random_number(x); x=sqrt(x)
m = count(x>0.5 .and. x<0.9)/real(n)
print *, "P{0.5<X<0.9} = ", m

(Remember that the statement x=sqrt(x) transforms the uniform pdf to a triangular pdf.) The result is shown below for various values of n.

n = 1000          P{0.5<X<0.9} = 0.574
n = 1000000       P{0.5<X<0.9} = 0.560993
n = 1000000000    P{0.5<X<0.9} = 0.560003037

For a large number of trials the probability tends toward the theoretical result. From probability theory, the conditional probability P(A|B) = P(A intersect B)/P(B), where P(A|B) reads "the probability of A given that B has occurred". For example P(0.5<X<0.9|X>0.6) = P(0.5<X<0.9 intersect X>0.6) / P(X>0.6) = P(0.6<X<0.9) / P(X>0.6) = 0.45/0.64 = 0.703125. See the lecture for the full calculations. Experimentally, the probability is obtained by simply counting the number of values that appear within the given range after first requiring that x>0.6. This is illustrated below:

Here, cycle means return to the top of the loop, and exit means drop out of the loop. The result is shown below for various values of n.

n = 1000          P{0.5<X<0.9|X>0.6} = 0.691
n = 1000000       P{0.5<X<0.9|X>0.6} = 0.703951
n = 1000000000    P{0.5<X<0.9|X>0.6} = 0.703116218

As the number of trials increases, the result moves closer to the theoretical value obtained from the rule P(A|B) = P(A intersect B)/P(B).

The Binomial probability mass function

If the probability of success for a single trial is p, then the probability of k successes out of n trials is given by the Binomial pmf f(k) = [n k] p^k (1-p)^(n-k), where the binomial coefficient is [n k] = n! / k!(n-k)!. A program to generate this pmf can be found on the download page (binomial-pmf.f90). For example, if p=0.25 and n=6 the pmf is:

k    P{k}       Experimental
0    0.177979   0.177988
1    0.355957   0.356018
2    0.296631   0.296520
3    0.131836   0.131854
4    0.032959   0.032981
5    0.004395   0.004395
6    0.000244   0.000244

The values in the column labelled Experimental are obtained with the following algorithm.

Algorithm 7d3 (Fortran syntax [mostly])

In this algorithm m is an integer vector with 7 elements indexed 0 to 6, and x is a real vector with 6 elements.

n = 100000000
real x(6); integer m(0:6) = 0
do i = 1, n
  call random_number(x)    ! generate 6 random values
  k = count(x<0.25)        ! count how many are < 0.25
  m(k) = m(k)+1
end do
output "P{k} = ", m/n

Note that the output statement contains 7 values from the vector array m (see the above table). For a large number of trials the probability tends toward the theoretical result. This experimental result has two consequences. First, it demonstrates the correctness of the expression for the binomial distribution, and second, it demonstrates a computational method for simulating stochastic processes. In the next topic we will look in more detail at performing computational simulations of systems that involve stochastic processes; this field of study is called Monte Carlo Simulation. Before we can do this we need to know how to generate pdfs of any required form.

Generation of Non-uniform Random Distributions

In simulations of random processes we often require a non-uniform distribution of random numbers. For example, radioactive decay is characterised by an exponential pdf: fX(x) = a e^(-ax) with x>=0. There are two useful methods for generating such non-uniform distributions:

1. The transformation method.
2. The rejection method.

The aim of both these methods is to convert a uniform distribution of random numbers of the form fX(x)=1 with 0<x<1 into a non-uniform distribution of the form fY(y) with a<y<b; this is illustrated below.

fx|
  |_________________|
  |                 |
  |                 |
  +-----------------+- x
  0                 1

Uniform (flat) distribution of a set of random numbers.


fy|      _____
  |     /     \
  |    /       \______
  |   /               \
  +-+-----------------+-- y
    a                 b

Non-uniform distribution of a set of random numbers. The shape of the pdf is arbitrary.

1. The Transformation Method

Consider a collection of variables X = (x1, x2, x3, ...) that are distributed according to the pdf fX(x); then the probability of finding a value that lies between x and x+dx is fX(x) dx. If y is some function of x then we can write:

|fX(x) dx| = |fY(y) dy|

where fY(y) is the pdf that describes the collection Y = (y1, y2, y3, ...). Now let fX(x)=1 with 0<x<1, e.g. the uniformly distributed random numbers that we generate via the random_number() intrinsic subroutine in Fortran; then we can write

dx = |fY(y) dy|   and so   fY(y) = |dx/dy|

fx|
 1|_________________
  |                 |
  |                 |
  +-----------------+-- x
  0                 1

fx(x)=1 and so fy(y) = |dx/dy|

And so in order to obtain a sequence characterised by the distribution fY(y) we must find a transformation function y = T(x) that satisfies:

|dx/dy| = fY(y)

Example 1: Consider that we want the exponential distribution fY(y) = a e^(-ay); the transformation function is then y = T(x) = -ln(x)/a.
Proof: y = -ln(x)/a and so x = e^(-ay) and so |dx/dy| = |-a e^(-ay)| = a e^(-ay) = fY(y)

Example 2: Consider that we want the distribution fY(y) = 2y

fx|
 1|________
  |        |
  +--------+-- x
  0        1

fy|
 2|    /
  |   /
  |  /
  | /
  |/
  +----+-- y
  0    1

the transformation function is then y = T(x) = x^(1/2).
Proof: y = x^(1/2) and so x = y^2 and so dx/dy = 2y = fY(y)
A quick check for the correctness of the transformation is that the integrals of the two distributions are equal:
integral[fX(x) dx] = 1 * 1 = 1;   integral[fY(y) dy] = 0.5 * 1 * 2 = 1
i.e. the integrals of the two distributions are both 1 (remember that any pdf must always integrate to unity, i.e. the total probability is one).

Example 3: We wish to obtain the pdf fY(y) = 0.5 Sin(y) with 0<y<pi. The transformation function is y = T(x) = Cos^-1(1-2x).
Proof: y = Cos^-1(1-2x) and so x = 0.5 (1-Cos(y)) and dx/dy = 0.5 Sin(y) = fY(y).
The range of y is T(0)<y<T(1), i.e. 0<y<pi as required. One check for correctness is that the integral should be one: the integral of 0.5 Sin(y) over the limits 0<y<pi is 1.

Algorithm 7e (Fortran syntax)
The following algorithm implements the transformation given in Example 2.

real :: x(8000), y(8000)
call random_number(x); y = sqrt(x)

Array x is filled with random numbers from a uniform distribution in the range 0<x<1. Array y is assigned the transformation of these numbers, giving a distribution fY(y) = 2y with the range sqrt(0)=0 < y < sqrt(1)=1. We can use the minval and maxval intrinsic functions to inspect the ranges:

print *, minval(x), maxval(x), minval(y), maxval(y)

gives: 0.0000973344 0.99984 0.0098658195 0.99992
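The transformations of Examples 2 and 3 can also be cross-checked numerically; this is only an illustrative sketch in Python (the course code is Fortran), checking the range of the generated values and their sample means (2/3 for fY(y)=2y, and pi/2 for fY(y)=0.5 Sin(y)):

```python
import random
import math

random.seed(1)
n = 200_000

# Example 2: y = sqrt(x) gives fY(y) = 2y on 0 < y < 1, with mean 2/3
y2 = [math.sqrt(random.random()) for _ in range(n)]
mean2 = sum(y2) / n

# Example 3: y = acos(1-2x) gives fY(y) = 0.5*sin(y) on 0 < y < pi, with mean pi/2
y3 = [math.acos(1 - 2 * random.random()) for _ in range(n)]
mean3 = sum(y3) / n

print(round(mean2, 3), round(mean3, 3))
```

For large n both sample means settle near the theoretical expectations, just as the Fortran experiments in the lab solutions do.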

The two distributions are illustrated in the histograms below. First the uniform distribution of random numbers from fX(x); here, 8000 random numbers have been generated and placed into a 20-bin histogram (the histogram is turned on its side).

                      1
 0+-------------------+---> fx(x)
  |#####################
  |####################     The range of values is from 0 to 1.
  |####################     Each # symbol represents 20 numbers.
  |###################      The average number of entries per bin
  |####################     is 8000 values / 20 bins = 400 values
  |######################   => 400/20 = 20 #s.
  |###################
  |####################     Given that the values are created
  |#####################    randomly, statistical theory tells us
  |###################      that we expect a variation from
  |####################     bin-to-bin of: sigma = sqrt(n)
  |#####################    = sqrt(400) = 20 = 1 #
  |####################
  |####################     i.e. we expect a variation of 1 or 2 #
  |####################     as is seen in this histogram.
  |######################
  |###################
  |###################
  |#####################
  |###################
 1+
  |
  x

Next, each value x is transformed to y = T(x) = x^(1/2); the resultant distribution, fY(y), is shown in the histogram below.

                                      2
 0+-----------------------------------+----> fy(y)
  |#
  |###
  |#####
  |########
  |#########
  |############                    The distribution is the required
  |#############                   fy(y)=2y in the range 0 < y < 1.
  |###############
  |################                The number of entries is 8000
  |###################             (as each y value corresponds to
  |#######################         an x value).
  |######################
  |#########################       The function is not exact due
  |##########################      to statistical variations in
  |#############################   the distribution fx(x).
  |################################
  |################################
  |#####################################
  |###################################
  |#######################################
 1+
  |
  y

2. The Rejection method

The transformation method is useful when the transformation function y=T(x) can be derived easily. The rejection method provides an alternative to the transformation method; it has the advantage of being able to create any required distribution. In this method a sequence of random numbers X = (x1, x2, x3, ..., xn) is generated with a uniform distribution in the range of interest, ymin to ymax. Now suppose that our goal is to produce a sequence of numbers distributed according to the function fY(y):

fy(y)
  |
  |_____________________________________  fmax
  |               /       \
  |      ________/         \
  |     /                   \
  |  __/                     \__
  | /                           \
  |/                             \
  +--+-------------------------------+--- y
     y_min                         y_max

We proceed through the sequence (x1, x2, x3, ..., xn) and accept values with a probability proportional to

fY(x). This is achieved as follows: for each value of x a new random number, ptest, distributed uniformly in the range 0<ptest<fmax, is generated. If fY(x) is greater than ptest then the number x is kept (added to set Y); otherwise it is removed (rejected) from the sequence. The probability of number xi passing the test is proportional to fY(xi). The resultant set Y = (y1, y2, y3, ..., ym) [m <= n] is therefore distributed according to the function fY(y).

Algorithm 7f (Fortran syntax)
Consider that we want a distribution fY(y) = 0.5 + Sin^2(y) in the range pi<y<3pi. fmax is therefore 0.5 + 1.0 = 1.5, and so ptest is generated from 0 to 1.5.

integer :: i, m=0, n=8000
real :: x(n), y(n), Pi=3.141593, fmax=1.5, Ptest
call random_number(x)
x = pi+2*pi*x                     ! x is random and uniform in the range pi < x < 3pi
do i = 1, n
  call random_number(Ptest)
  Ptest = Ptest*fmax              ! range 0 < Ptest < fmax
  if (0.5+sin(x(i))**2 > Ptest) then
    m = m+1                       ! entry x(i) passed the test
    y(m) = x(i)                   ! so we record it in y(m)
  end if
end do

Result:

Before Rejection
      0
  pi +----------------------> fx(x)
     |#####################
     |###################
     |####################
     |####################
     |###################
     |####################
     |###################
     |####################
     |#####################
     |###################
     |###################
     |####################
     |#####################
     |####################
     |#####################
     |#####################
     |####################
     |###################
     |#####################
     |###################
 3pi +
     x

The original sequence contains 8000 entries distributed uniformly in the range pi < x < 3pi


After Rejection
  pi +---------------------> fy(y)
     |#######
     |#########
     |#############
     |#################
     |###################
     |####################
     |################
     |#############
     |##########
     |#######
     |########
     |##########
     |##############
     |#################
     |####################
     |####################
     |#################
     |############
     |##########
     |######
 3pi +
     y

The distribution of sequence y (after rejection of some entries) shows a sine-squared function of amplitude 1.0 on a base of 0.5. The number of entries is 5321, which is approximately equal to:

      integral fy(y) dy [y=pi,3pi]           2pi
8000 ------------------------------ = 8000 -------- = 5333
          fmax * (3pi-pi)                  1.5*2pi

The number of rejected entries is 8000-5321 = 2679.
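The acceptance fraction predicted above, 2pi/(1.5*2pi) = 2/3, can be reproduced quickly; this is a hedged Python sketch of the Algorithm 7f procedure (the course program itself is Fortran):

```python
import random
import math

random.seed(2)
n = 100_000
fmax = 1.5
pi = math.pi

accepted = []
for _ in range(n):
    x = pi + 2 * pi * random.random()    # uniform in pi < x < 3pi
    ptest = fmax * random.random()       # uniform in 0 < ptest < fmax
    if 0.5 + math.sin(x) ** 2 > ptest:   # keep x with probability fY(x)/fmax
        accepted.append(x)

fraction = len(accepted) / n
print(round(fraction, 3))                # should be close to 2/3
```

With n = 100000 trials the accepted fraction comes out near 2/3, in line with the entry count 5333/8000 computed above.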

Discussion: The rejection method is less efficient than the transformation method, as it requires two random numbers to be generated for each entry and some numbers are wasted (rejected). Implementation is also more difficult. However, the advantage is that any distribution can be generated, whereas the transformation method is limited to distributions for which the transformation function can be calculated.

Monte Carlo

Computer simulations of systems that involve some random process are called Monte Carlo simulations. In such simulations random numbers are generated with the appropriate distribution corresponding to the physical random process. There is a vast array of Monte Carlo applications in science and engineering. We will look at some basic Monte Carlo simulations next week.


7.3 Lab Exercises

1. Implement Algorithm 7b into a computer program, compile and run the program. Check that P(heads) = nheads/n tends to 0.5 for large n. Repeat this experiment replacing the coin with two dice. Determine the probability that a double six is thrown. Compare this with the theoretically expected outcome.

2. Implement Algorithm 7c (with the transformation to a triangular pdf) into a computer program, compile and run the program. Check that m1 = 2/3 and mu2 = 1/18. Modify the experiment to verify the identity E[20X+30] = 20 E[X] + 30.

3. Modify the above program to compute the expectation value and variance of the pdf fX(x) = 0.5 Sin(x) for 0<x<pi. Compare your results with the theoretical solutions. Hint: the transformation function can be found in the lecture notes.

4. A nucleus of an atom has a probability of decaying that is described by the pdf fT(t) = 0.1 exp(-0.1t), where t>=0 is the time in seconds. Write a program to perform a stochastic experiment to determine a) the probability that the nucleus survives 20 seconds, b) assuming the nucleus has already survived 10 seconds, the probability that the nucleus survives 20 seconds. Compare your results with the theoretical solutions. Hint: Algorithm 7d1 will be helpful for part a, and Algorithm 7d2 for part b; the required transformation function can be found in the lecture notes.


7.4 Lab Solutions

1. Implement Algorithm 7b into a computer program, compile and run the program. Check that P(heads) = nheads/n tends to 0.5 for large n. Repeat this experiment replacing the coin with two dice. Determine the probability that a double six is thrown. Compare this with the theoretically expected outcome.

Solution: eee484ex7a (see the downloads page). This program counts the number of times two random numbers are both < 1/6, repeating this for 100,000,000 trials. The result is 2780456/100000000 = 0.02780456 = 1/35.965; the theoretical expectation is 1/6 * 1/6 = 1/36.

2. Implement Algorithm 7c (with the transformation to a triangular pdf) into a computer program, compile and run the program. Check that m1 = 2/3 and mu2 = 1/18. Modify the experiment to verify the identity E[20X+30] = 20 E[X] + 30.

Solution: eee484ex7b (see the downloads page). The result for [n=100000000 and call random_number(x); x=sqrt(x)] is E[X] = 0.6666696399773584

To find E[20X+30] for this pdf we simply transform each generated value of x as follows: x = 20x + 30. The result for [n=100000000 and call random_number(x); x=sqrt(x); x = 20*x + 30] is E[20X+30] = 43.33339279955304

Note that 20 E[X] + 30 = 20 * 0.6666696399773584 + 30 = 43.33339279954717, and so E[20X+30] = 20 E[X] + 30 is shown experimentally. If you are not convinced then you can repeat the experiment with different values and different pdfs.

3. Modify the above program to compute the expectation value and variance of the pdf fX(x) = 0.5 Sin(x) for 0<x<pi. Compare your results with the theoretical solutions. Hint: the transformation function can be found in the lecture notes.

Solution: eee484ex7c (see the downloads page). From the lecture notes, the transformation function is y = T(x) = Cos^-1(1-2x); i.e. to generate the pdf fX(x) = 0.5 Sin(x) for 0 < x < pi the Fortran code is:

call random_number(x)   ! x is uniform {0 < x < 1}
x = acos(1-2*x)         ! x has pdf 0.5 Sin(x) {0 < x < pi}

The output of the program is:

mean, m1      = 1.5707902846841562 = pi/2 - 0.000006
variance, mu2 = 0.4673394386775112 = pi^2/4 - 2 - 0.00006

From theory: m1 = pi/2, m2 = pi^2/2 - 2, and so mu2 = m2 - m1^2 = pi^2/2 - 2 - (pi/2)^2 = pi^2/4 - 2 = 0.4674011002723394; the experimental results are in agreement with the theory.

4. A nucleus of an atom has a probability of decaying that is described by the pdf fT(t) = 0.1 exp(-0.1t), where t>=0 is the time in seconds. Write a program to perform a stochastic experiment to determine a) the probability that the nucleus survives 20 seconds, b) assuming the nucleus has already survived 10 seconds, the probability that the nucleus survives 20 seconds. Compare your results with the theoretical solutions. Hint: Algorithm 7d1 will be helpful for part a, and Algorithm 7d2 for part b; the required transformation function can be found in the lecture notes.


Solution: eee484ex7d1, eee484ex7d2 (see the downloads page). From the lecture notes, the transformation is x = -log(x)/0.1. Part a asks for P(X>20 seconds); the theoretical result is 1/e^2, and the output of the program for n = 100000000 is P(X>20) = 0.13533197 = 1/e^2 - 0.000003. Part b asks for P(X>20 | X>10); the theoretical result is 1/e, and the output of the program for n = 100000000 is P(X>20 | X>10) = 0.36785103 = 1/e - 0.00003. The experimental results are in agreement with the theoretical values.
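The same stochastic experiment can be sketched outside Fortran; this Python version (an illustrative cross-check, not the eee484ex7d1/7d2 programs themselves) generates decay times via the exponential transformation and estimates both probabilities:

```python
import random
import math

random.seed(3)
n = 200_000

# Decay times with pdf 0.1*exp(-0.1*t), via the transformation t = -ln(x)/0.1
times = [-math.log(random.random()) / 0.1 for _ in range(n)]

# part a: P(T > 20)
p_survive_20 = sum(t > 20 for t in times) / n

# part b: P(T > 20 | T > 10), using only the nuclei that survived 10 seconds
survived_10 = [t for t in times if t > 10]
p_20_given_10 = sum(t > 20 for t in survived_10) / len(survived_10)

print(round(p_survive_20, 3), round(p_20_given_10, 3))
```

Part a settles near 1/e^2 and part b near 1/e, illustrating the memoryless property of the exponential pdf.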


7.5 Example exam questions

Question 1
For the following transformation functions transforming a uniform pdf fx(x) in the range 0 < x < 1 into a non-uniform pdf fy(y):

a) y=SQRT(3+x),   b) y=ArcCosine(1-2x),   c) y=1/(x+0.5)

i. Determine the transformed probability density functions fy(y).
ii. Write down the range of y values.
iii. Sketch the distribution fy(y).
iv. Show that the integral of the transformed pdf is equal to the integral of the original uniform pdf.

Question 2 Write a computer program that performs a frequency experiment to determine the probability that the outcome of throwing a coin and die is (a "head" and a "3").


8 Monte-Carlo Methods

8.1 Topics Covered

o Monte Carlo integration; the student should be able to write a computer program to integrate a function f(x) using the Monte Carlo method. o Monte Carlo simulation; the student should be able to write a simple Monte Carlo simulation to solve a problem that involves some underlying random process.

8.2 Lecture Notes

Introduction

Armed with methods that allow us to generate any pdf we can now attempt to simulate less trivial physical processes. Such simulations are called Monte Carlo simulations. But first, we will use the Monte Carlo method as an alternative method for integration (Monte Carlo Integration).

Monte Carlo Integration

In Monte Carlo integration numerical integration of a function is performed by making use of random numbers. We will see that for simple one-dimensional functions MC integration is not as effective as other numerical integration methods (though for high-dimensional integrals the MC method can be more efficient). To introduce Monte Carlo integration we will compute the area of a circle and hence a value for pi as follows: generate n pairs of random numbers, x and y, each uniformly distributed between 0 and 1; count the number of pairs m which satisfy the condition x^2 + y^2 < 1 (i.e. they lie inside a circle of unit radius); then the ratio n/m = 1/(pi/4) and so pi = 4m/n.

Algorithm 8a (Fortran syntax [mostly])

m=0, n=10000000
do i = 1, n
  call random_number(x)
  call random_number(y)
  if ( x**2+y**2 < 1.0 ) m = m+1
end do
output n, 4*m/real(n)

Results for different values of n are tabulated below; estimates for pi are given in column 2 and the errors in column 3.

n              4*m/n       Error = 4*m/n - pi
1,000          3.156        0.014407259
10,000         3.1308      -0.010792741
100,000        3.14768      0.006087259
1,000,000      3.143304     0.001711259
10,000,000     3.1420207    0.000428059
100,000,000    3.1417096    0.00011685899
1,000,000,000  3.1415832   -0.000009637012

The method works, but clearly it is not computationally efficient as it requires one billion random numbers to achieve only 4 or 5 decimal places of accuracy. As n increases the accuracy of the computed value of pi improves (apart from some statistical variations). It can be shown, by repeating the experiment many times (with different seeds), that the error is generally proportional to 1/n^(1/2).

We will now look at a simple MC method for the integral I of a function f(x); the method is as follows:

1. Enclose the function in a box of area A and determine ymax.

   y=f(x)
    |
    |  ______________________________   y_max
    | |          /        \          |
    | |  A   ___/          \         |
    | |     /               \        |
    | | ___/                 \       |
    | |/          I           \_     |
    | |                         \    |
  0 +--+----------------------------+--- x
       a                            b

2. Uniformly populate the box with n random points: generate two random numbers r1, r2; a random point in the box is then x = a + (b-a) r1 and y = ymax r2.
3. Count the number of points m that lie below the curve f(x).
4. The integral is then estimated from I/A = m/n, and so I = A (m/n) = (b-a) ymax m/n.

Example: in a previous lecture (Numerical Integration) we used the Extended Trapezoidal Formula to integrate the function f(x) = x^3 - 3x^2 + 5 over the range x = 0.0 to 2.5. With n=1000 (1000 intervals) the result is 6.640627 (error = 0.000002). The MC integration is as follows: for this function, in the integration range, we have turning points at x=0.0 (maximum) and x=2 (minimum), and so ymax = f(0.0) = 5. So we have a = 0, b = 2.5, ymax = 5.0.

Algorithm 8b (Fortran syntax [mostly])
The second program is a complete, concise Fortran source using whole arrays.

m = 0
input a, b, Ymax, n
do i = 1, n
  call random_number(r1)
  call random_number(r2)
  x = a + (b-a)*r1
  y = Ymax*r2
  if ( y < f(x) ) m = m+1
end do
output (b-a)*Ymax*m/real(n)
define function F(x) = x**3 - 3*x**2 + 5

The result for n=1000 is 6.450 (error = -0.190625).
Repeating for increasing values of n gives:

n              m           integral    Error
1,000          516         6.450000    -0.190625
10,000         5320        6.650000     0.009375
100,000        53194       6.649250     0.008625
1,000,000      530692      6.633650    -0.006975
10,000,000     5309750     6.637187    -0.003437
100,000,000    53121655    6.640207    -0.000418
1,000,000,000  531243958   6.640550    -0.000076

integer, parameter :: n=100000000
integer :: m
real :: a=0.0, b=2.5, Ymax=5.0
real :: x(n), y(n)
call random_number(x); x = a + (b-a)*x
call random_number(y); y = Ymax*y
m = count(y < x**3-3*x**2+5)
print *, (b-a)*Ymax*m/real(n)
end
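The same estimate can be reproduced quickly in another language; this is a hedged Python sketch of the Algorithm 8b procedure (the exact integral of x^3 - 3x^2 + 5 over [0, 2.5] is 6.640625):

```python
import random

random.seed(4)

def f(x):
    return x**3 - 3 * x**2 + 5

a, b, ymax = 0.0, 2.5, 5.0
n = 200_000
m = 0
for _ in range(n):
    x = a + (b - a) * random.random()   # random point in the box, x-coordinate
    y = ymax * random.random()          # random point in the box, y-coordinate
    if y < f(x):                        # point lies below the curve
        m += 1

estimate = (b - a) * ymax * m / n       # I = A * (m/n)
print(round(estimate, 3))
```

As with the Fortran runs tabulated above, the estimate converges on 6.640625 with an error that shrinks roughly as 1/sqrt(n).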


Again the error reduces as 1/n^(1/2) (you have to perform many experiments, with different seeds, to see this more clearly). The accuracy of this simple Monte Carlo integration method is not good when compared to other numerical methods such as Trapezoidal or Simpson integration; however, the MC integration method can be refined to give improved accuracy (see text books for the details). Moreover, for high-dimensional integrations the MC method can be more efficient than other integration methods. MC integration is also useful for discontinuous shapes, for example a torus with a slice cut out of it; the shape is expressed more easily in a MC program than in Trapezoidal or Simpson integration algorithms.

Monte Carlo Simulation

We will look at two example simulations: 1. a binary communication system, and 2. propagation of errors through a measurement system.

1. A binary communication system

A binary communication system consists of a transmitter that sends a binary 0 or 1 over a channel to a receiver. On average, the transmitter transmits a 0 with a probability of 0.4 and a 1 with a probability of 0.6. The channel occasionally causes errors to occur, flipping a 0 to a 1 and a 1 to a 0; the probability of this error occurring is 0.1. Using this information, we wish to calculate the following:
a) The probability of a 1 being transmitted without error.
b) The probability of a 0 being transmitted without error.
c) If a 1 is observed at the receiver, the probability of it being correct.
d) If a 0 is observed at the receiver, the probability of it being correct.
The first two calculations are trivial, but the second two require Bayes theorem. The solutions to these questions are given below.
P(A0)=0.4   P(A1)=0.6

Bayes Theorem: P(Y|X) = P(X|Y) P(Y) / P(X)

a) asks for P(B1|A1) = 0.90000
b) asks for P(B0|A0) = 0.90000
c) asks for P(A1|B1) = P(B1|A1) P(A1) / P(B1)
                     = 0.9 * 0.6 / (0.9 P(A1) + 0.1 P(A0))
                     = 0.54 / (0.9 * 0.6 + 0.1 * 0.4)
                     = 0.54 / (0.54 + 0.04) = 0.93103
d) asks for P(A0|B0) = P(B0|A0) P(A0) / P(B0)
                     = 0.9 * 0.4 / (0.9 P(A0) + 0.1 P(A1))
                     = 0.36 / (0.9 * 0.4 + 0.1 * 0.6)
                     = 0.36 / (0.36 + 0.06) = 0.85714

Answers: a) 0.90000   b) 0.90000   c) 0.93103   d) 0.85714

         Transmitter (A)
         A0             A1
         +               +
         |\             /|
         | \           / |
P(B0|A0) |  \         /  | P(B1|A1)
 = 0.9   |   \       /   |  = 0.9
         |    \     /    |
         |    Channel    |
         |    /     \    |
         |   /       \   |
         |  /         \  |
         | /           \ |
         |/             \|
         +               +
         B0             B1
         Receiver (B)

Note that the probability of observing a correct 0 is less than that of a 1, because more 1s flip to 0s as P(A1) > P(A0).


The above questions can be solved via a MC simulation; the simulation generates the correct fractions of zeros and ones, flips them with a probability of 10 percent, and counts the resulting numbers of zeros and ones and their history.

Algorithm 8c (Fortran syntax [mostly])

n = 10000000
correct0=0, correct1=0
incorrect0=0, incorrect1=0
do i = 1, n
  call random_number(r)      ! generate a binary "1" or "0"
  if (r<0.4) then
    bit=0                    ! P{0}=0.4
  else
    bit=1                    ! P{1}=0.6
  end if
  call random_number(r)      ! give a 10% probability of an error
  if (r<0.1) then            ! flip the bit
    if (bit==0) incorrect1=incorrect1+1
    if (bit==1) incorrect0=incorrect0+1
  else                       ! no error
    if (bit==0) correct0=correct0+1
    if (bit==1) correct1=correct1+1
  end if
end do
output "a) P(of a 1 being transmitted without error)", correct1/(correct1+incorrect0)
output "b) P(of a 0 being transmitted without error)", correct0/(correct0+incorrect1)
output "c) P(1 is observed correctly)", correct1/(correct1+incorrect1)
output "d) P(0 is observed correctly)", correct0/(correct0+incorrect0)

The output for 10 million trials is:

a) P(1 being transmitted without error)  0.90003
b) P(0 being transmitted without error)  0.90008
c) P(1 is observed correctly)            0.93116
d) P(0 is observed correctly)            0.85707

Note that correct1+incorrect0 is the number of 1s at the transmitter, and correct1+incorrect1 is the number of 1s at the receiver. The values obtained for parts a and b provide a basic validation of the simulation; the values obtained for parts c and d provide a validation of Bayes theorem.
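The channel simulation translates directly into other languages; this Python sketch of Algorithm 8c (an illustrative cross-check, not the course program) reproduces the Bayes-theorem answers for parts c and d:

```python
import random

random.seed(5)
n = 500_000
correct0 = correct1 = incorrect0 = incorrect1 = 0

for _ in range(n):
    bit = 0 if random.random() < 0.4 else 1   # transmit: P{0}=0.4, P{1}=0.6
    if random.random() < 0.1:                 # channel error: the bit is flipped
        if bit == 0:
            incorrect1 += 1                   # a 0 received as 1
        else:
            incorrect0 += 1                   # a 1 received as 0
    else:                                     # no error
        if bit == 0:
            correct0 += 1
        else:
            correct1 += 1

p_c = correct1 / (correct1 + incorrect1)      # P(A1|B1), expected 0.93103
p_d = correct0 / (correct0 + incorrect0)      # P(A0|B0), expected 0.85714
print(round(p_c, 3), round(p_d, 3))
```

The observed fractions agree with the Bayes-theorem values 0.93103 and 0.85714 to within the statistical uncertainty of the run.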


2. Propagation of errors through a measurement system

Consider, for example, a pressure measurement system. Such a system may have several components that process a signal that is initially generated by a pressure transducer. For example, the system may contain the following components:

                        dT1             dVs             dT2
                         |               |               |
                         v               v               v
     +------------+   +------------+   +-----------+   +----------+
---->| Pressure   |-->| Deflection |-->| Amplifier |-->| Recorder |---->
  P  | transducer | R | bridge     | V1|           | V2|          |  Pm
     +------------+   +------------+   +-----------+   +----------+

A resistance R (Ohms) is output from the pressure transducer in response to an input pressure P (Pascals). The deflection bridge converts the resistance into a voltage V1 (mV) which in turn is amplified to the voltage V2 (mV) by the amplifier. Finally the recorder outputs a reading Pm (Pascals). Suppose that the response of the last three components is affected by random variations in the environment: variations from standard ambient temperature, dT, affect the gain of the deflection bridge and create a bias in the recorder output; variations from the standard supply voltage, dVs, cause a bias in the output of the amplifier. The output of each component can be modeled as follows:

R  = 0.0001 P
V1 = ( 0.04 + 0.00003 dT1 ) R
V2 = 1000 V1 + 0.13 dVs
Pm = 250 V2 + 2.7 dT2

Here dT has a Gaussian distribution centered at zero with a standard deviation sT = 3.0 C, and dVs has a Gaussian distribution centered at zero with a standard deviation sVs = 0.23 V [see sketches in the class]. We can first consider the output of the system for various inputs given standard environmental conditions, i.e. with dT and dVs set to their average values of zero. The model of the measurement system becomes:

R  = 0.0001 P
V1 = 0.04 R
V2 = 1000 V1
Pm = 250 V2 = 250 (1000 V1) = 250000 * 0.04 R = 10000 * 0.0001 P = P

i.e. Pm = P and so the system is perfectly calibrated. However, dT and dVs are randomly non-zero, causing a random variation in the outputs of each component. These random variations propagate through the system to the final output. Assuming Gaussian random variations, the standard deviation sO in the output O of a component for an input I is given by (here the d denote partial derivatives):

sO^2 = (sI dO/dI)^2 + (sA dO/dA)^2 + (sB dO/dB)^2 + (sC dO/dC)^2 + ...

Here, O is dependent on the input I and the random variables A, B, C, ... . Note that input I has a standard deviation sI due to random errors in the output of the previous component; the random errors therefore propagate through the system to the final output. Including these random errors, the response of the system is shown below for an input of 5000 Pa.

R = 0.0001 P = 0.0001 * 5000 = 0.5 Ohms

V1 = ( 0.04 + 0.00003 dT1 ) * 0.5
   = 0.04 * 0.5 + 0.00003 * dT1 * 0.5
   = 0.02 mV + 0.000015 dT1


dT1 is Gaussian with sT = 3.0 and so

sV1^2 = (sT dV1/dT1)^2 = ( 3.0 * 0.000015 )^2

and so sV1 = 0.000045 mV with V1 = 0.02 mV (with dT1 = 0 on average)

V2 = 1000 V1 + 0.13 dVs = 1000 * 0.02 + 0.13 dVs = 20 mV + 0.13 dVs

dVs is Gaussian with sVs = 0.23 and we also have sV1 = 0.000045 and so

sV2^2 = (sV1 dV2/dV1)^2 + (sVs dV2/dVs)^2
      = (0.000045 * 1000)^2 + (0.23 * 0.13)^2
      = (0.045)^2 + (0.029)^2

and so sV2 = 0.05403 mV with V2 = 20 mV (with dVs = 0 on average)

Pm = 250 V2 + 2.7 dT2 = 250 * 20 + 2.7 dT2 = 5000 Pa + 2.7 dT2
[here we assume this dT2 is independent of the above dT1]

dT2 is Gaussian with sT = 3.0 and so

sPm^2 = (sV2 dPm/dV2)^2 + (sT dPm/dT2)^2
      = (0.05403 * 250)^2 + (3.0 * 2.7)^2

and so sPm = 15.75 Pa with Pm = 5000 Pa (with dT2 = 0 on average).

Note that again the system is perfectly calibrated, with an average output of 5000 Pa for an input of 5000 Pa, but the output is Gaussian distributed with a standard deviation of 15.75 Pa. We can verify this calculation by performing a Monte Carlo simulation of the system. For this we need to be able to transform a uniform random variable into a Gaussian (normal) random variable. This can be achieved by applying the Box-Muller transformation (see rnrm.f90 and rnrm.c++ in the downloads page):

real function rnrm()
!----------------------------------------------------------------------------
! Returns a normally distributed deviate with zero mean and unit variance.
! The routine uses the Box-Muller transformation of uniform deviates.
!----------------------------------------------------------------------------
real :: r, x, y
do
  call random_number(x)
  call random_number(y)
  x = 2.0*x - 1.0
  y = 2.0*y - 1.0
  r = x**2 + y**2
  if (r<1.0) exit
end do
rnrm = x*sqrt(-2.0*log(r)/r)
end function rnrm
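The rnrm routine can be mirrored in Python to check that it really produces zero mean and unit variance; this sketch follows the polar Box-Muller loop above (the extra guard r > 0 is our addition, to avoid log(0) in the vanishingly rare case r = 0):

```python
import random
import math

random.seed(6)

def rnrm():
    # Polar Box-Muller: accept (x, y) inside the unit circle, then transform
    while True:
        x = 2.0 * random.random() - 1.0
        y = 2.0 * random.random() - 1.0
        r = x * x + y * y
        if 0.0 < r < 1.0:
            return x * math.sqrt(-2.0 * math.log(r) / r)

sample = [rnrm() for _ in range(100_000)]
mean = sum(sample) / len(sample)
var = sum(v * v for v in sample) / len(sample) - mean**2
print(round(mean, 2), round(var, 2))
```

For 100000 deviates the sample mean is close to 0 and the sample variance close to 1, as required of a standard normal generator.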

The above measurement system is simulated in the following algorithm:

Algorithm 8d (Fortran syntax [mostly])

m1=0, m2=0, n = 100000000
sT=3.0, sVs=0.23, P=5000.
do i = 1, n
  ! random variables
  dT1 = sT * rnrm()    ! Gaussian pdf with standard deviation sT
  dT2 = sT * rnrm()    ! Gaussian pdf with standard deviation sT
  dVs = sVs * rnrm()   ! Gaussian pdf with standard deviation sVs
  ! model of the measurement system
  R  = 0.0001 * P                      ! the pressure transducer
  V1 = ( 0.04 + 0.00003 * dT1 ) * R   ! the deflection bridge
  V2 = 1000 * V1 + 0.13 * dVs         ! the amplifier
  Pm = 250 * V2 + 2.7 * dT2           ! the recorder
  ! statistical variables
  m1 = m1+Pm       ! E[Pm]
  m2 = m2+Pm**2    ! E[Pm^2]
end do
m1 = m1/n
m2 = m2/n
output "mean, m1 = ", m1
output "sd = sqrt(mu2) = ", sqrt(m2-m1**2)

The output for n = 100000000 is:

mean, m1       = 4999.99994
sd = sqrt(mu2) =   15.75136
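For a quick cross-check of the theoretical sd = 15.75 Pa, here is an illustrative Python sketch of Algorithm 8d (using Python's random.gauss in place of the rnrm() routine; this is not the course program):

```python
import random

random.seed(7)
n = 200_000
sT, sVs, P = 3.0, 0.23, 5000.0
m1 = m2 = 0.0

for _ in range(n):
    dT1 = random.gauss(0.0, sT)              # Gaussian, sd sT
    dT2 = random.gauss(0.0, sT)              # Gaussian, sd sT (independent of dT1)
    dVs = random.gauss(0.0, sVs)             # Gaussian, sd sVs
    R  = 0.0001 * P                          # the pressure transducer
    V1 = (0.04 + 0.00003 * dT1) * R          # the deflection bridge
    V2 = 1000 * V1 + 0.13 * dVs              # the amplifier
    Pm = 250 * V2 + 2.7 * dT2                # the recorder
    m1 += Pm
    m2 += Pm**2

mean = m1 / n
sd = (m2 / n - mean**2) ** 0.5
print(round(mean, 1), round(sd, 2))
```

Even with a much smaller n than the Fortran run, the mean settles near 5000 Pa and the standard deviation near 15.75 Pa, in agreement with the error-propagation calculation.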

Notes:
1. The simulation is very simple and can be expanded easily to include more components and environmental effects.
2. The result is in very good agreement with the theoretical result (given large enough n).
3. Although the environmental effects dT1, dVs, dT2 are included in the models, and they are non-zero, the net result is a zero contribution, giving a mean output of 5000.
4. In this treatment (and in the theoretical treatment) changes in the ambient temperature (dT1 and dT2) are considered independent of each other. However, in reality these quantities are the same, i.e. dT1=dT2, and so are fully correlated; the actual standard deviation should therefore be larger. It is trivial to incorporate this situation in the algorithm (replace dT2 with dT1, and remove dT2); see the lab exercise.


8.3 Lab Exercises

Part A:

1. MC estimation of the volume of a sphere
The following Fortran program estimates the area of a circle of unit radius.

implicit none
integer :: i, m=0, n=10000000
real :: x, y
do i = 1, n
  call random_number(x)
  call random_number(y)
  if ( x**2+y**2 < 1.0 ) m=m+1
end do
print *, 4.0*m/real(n)
end

Copy the program (or rewrite it in your language of choice), run and check it, and then modify it to estimate the volume of a sphere of unit radius. How many random trials (n) does it take to achieve an accuracy of three decimal places?

2. MC integration of a function f(x)
a) Write a Monte Carlo integration program to integrate the function f(x) = (1-x^2)^(1/2) over the range x = -1 to x = 1.
b) Sketch the integral and determine ymax.
c) Write, compile and run your program.
d) Compare your computed result with the analytical result: pi/2.
e) How many trials (n) are required to achieve an accuracy of three decimal places?

Part B: Propagation of errors through a measurement system
a) Code Algorithm 8d into a computer program, compile and run the program, and check that the output agrees with the theoretical result.
b) Verify that the mean output of the measurement system is equal to the input for the input values 1000 Pa, 2000 Pa, 4000 Pa, and 8000 Pa; comment on the size of the standard deviation for each input value.
c) In this model we assume that dT is independent for the deflection bridge and the recorder; modify your program such that dT is not independent (this is more realistic). What is the effect of this on the standard deviation of the output?


8.4 Lab Solutions

Part A:

1. MC estimation of the volume of a sphere
The following Fortran program estimates the area of a circle of unit radius.

implicit none
integer :: i, m=0, n=10000000
real :: x, y
do i = 1, n
  call random_number(x)
  call random_number(y)
  if ( x**2+y**2 < 1.0 ) m=m+1
end do
print *, 4.0*m/real(n)
end

Copy the program (or rewrite it in your language of choice), run and check it, and then modify it to estimate the volume of a sphere of unit radius. How many random trials (n) does it take to achieve an accuracy of three decimal places?

Solution eee484ex8a (see the downloads page). We simply add another dimension z and the test

x^2 + y^2 + z^2 < 1

This defines one octant (an eighth of the volume) of the sphere, so the full volume (4pi/3 for unit radius) is estimated by 8m/n. The error in the estimate is therefore 8m/n - 4pi/3, which can be inspected for increasing values of n:

n            8*m/n       Error = 8*m/n - 4pi/3
10           2.4         -1.7887903
100          4.4          0.21120968
1000         4.104       -0.08479032
10000        4.1768      -0.011990322
100000       4.20456      0.01576968
1000000      4.191408     0.0026176786
10000000     4.1885104   -0.00027992134  | 3 d.p.
100000000    4.1886897   -0.00010080135  | accuracy

It appears that we need of the order of n = 10^7 trials to gain a 3 decimal place accuracy! The number of trials required depends on the initial seed of the generator - there are significant statistical fluctuations.
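The modification described above can be sketched as follows; this Python version is only an illustrative cross-check of eee484ex8a's approach, not the program itself:

```python
import random

random.seed(8)
n = 500_000
m = 0
for _ in range(n):
    x, y, z = random.random(), random.random(), random.random()
    if x * x + y * y + z * z < 1.0:   # point lies inside the unit-radius octant
        m += 1

estimate = 8.0 * m / n                # full sphere volume, expected 4*pi/3
print(round(estimate, 3))
```

The estimate converges on 4pi/3 = 4.18879, with the same 1/sqrt(n) statistical behaviour seen in the table above.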

2. MC integration of a function f(x)
a) Write a Monte Carlo integration program to integrate the function f(x) = (1-x^2)^(1/2) over the range x = -1 to x = 1.
b) Sketch the integral and determine ymax.
c) Write, compile and run your program.
d) Compare your computed result with the analytical result: pi/2.
e) How many trials (n) are required to achieve an accuracy of three decimal places?


Solution eee484ex8b (see the downloads page). We have a=-1.0, b=+1.0, Ymax=1.0 (by simple inspection), and the integral estimate (b-a) Ymax m/n. The error in the estimate is (b-a) Ymax m/n - pi/2; we can investigate the effect of varying n as follows:

n              m          Estimate    Error
10             5          1.000000    -0.570796
100            66         1.320000    -0.250796
1,000          793        1.586000     0.015204
10,000         7855       1.571000     0.000204
100,000        78674      1.573480     0.002684
1,000,000      785723     1.571446     0.000650  | 3 d.p.
10,000,000     7856866    1.571373     0.000577  | accuracy
100,000,000    78547805   1.570956     0.000160
1,000,000,000  785407068  1.570814     0.000018

So it appears (after investigating other seeds) that one requires of the order of n = 10^7 trials to gain a 3 decimal place accuracy!

Part B: Propagation of errors through a measurement system

a) Code Algorithm 8c into a computer program, compile and run the program, and check that the output agrees with the theoretical result.
b) Verify that the mean output of the measurement system is equal to the input for the input values 1000 Pa, 2000 Pa, 4000 Pa, and 8000 Pa; comment on the size of the standard deviation for each input value.
c) In this model we assume that dT is independent for the deflection bridge and the recorder; modify your program such that dT is not independent (this is more realistic). What is the effect of this on the standard deviation of the output?

Solution eee484ex8c (see the downloads page).

a) For n=10000000 and P=5000 the output is: mean = 5000.005, sd = 15.758

b) For n=10000000:
P=1000 the output is: mean = 1000.001, sd = 11.251
P=2000 the output is: mean = 2000.002, sd = 11.909
P=4000 the output is: mean = 4000.004, sd = 14.237
P=8000 the output is: mean = 8000.008, sd = 21.118
The relative error reduces as the input increases; this can be seen by inspecting the mathematical model of the measurement system.

c) To make dT not independent, simply replace dT2 = sT * rnrm() with dT2 = dT1; the result is an increase in the output standard deviation from 15.75 to 20.75 (when dT is independent the two values sometimes partially cancel, if they have opposite signs).
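The effect in part c) can be illustrated without the full Algorithm 8c model. The following Python sketch is a hypothetical toy (two stages each contributing a Gaussian temperature error of scale sT; the value sT = 10.0 and all names are made up for illustration) showing why a shared dT inflates the output spread:

```python
import math
import random

def error_sd(n, shared_dT, seed=0):
    """Toy model (NOT Algorithm 8c): two stages each add an error
    term dT1, dT2 of scale sT.  When shared_dT is True, the same
    draw is used for both stages, modelling a common ambient
    temperature; the errors then always add with the same sign."""
    rng = random.Random(seed)
    sT = 10.0  # hypothetical temperature error scale
    errors = []
    for _ in range(n):
        dT1 = sT * rng.gauss(0.0, 1.0)
        dT2 = dT1 if shared_dT else sT * rng.gauss(0.0, 1.0)
        errors.append(dT1 + dT2)  # total error from the two stages
    mean = sum(errors) / n
    var = sum((e - mean) ** 2 for e in errors) / n
    return math.sqrt(var)

if __name__ == "__main__":
    print(error_sd(100000, shared_dT=False))  # ~ sT*sqrt(2) = 14.1
    print(error_sd(100000, shared_dT=True))   # ~ 2*sT = 20.0
```

Independent errors add in quadrature (sqrt(2)*sT), while a shared error adds linearly (2*sT) - the same mechanism that raises the sd from 15.75 to 20.75 in the actual exercise.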


8.5 Example exam questions

Question 1 Write a computer program to perform a Monte Carlo integration of the function y = 5 sqrt(x) - x over the interval x=1.0 to x=9.0.
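One possible answer (a sketch, not an official solution) applies the same hit-or-miss method as in Part A. On [1, 9] the maximum of f(x) = 5*sqrt(x) - x is at x = 6.25 (set f'(x) = 5/(2*sqrt(x)) - 1 = 0), giving ymax = f(6.25) = 6.25:

```python
import math
import random

def question1(n=1000000, seed=0):
    """Hit-or-miss Monte Carlo integration of y = 5*sqrt(x) - x
    over x = 1.0 to x = 9.0 (f is positive on this interval)."""
    a, b, ymax = 1.0, 9.0, 6.25
    rng = random.Random(seed)
    m = 0
    for _ in range(n):
        x = a + (b - a) * rng.random()
        y = ymax * rng.random()
        if y < 5.0 * math.sqrt(x) - x:
            m += 1
    return (b - a) * ymax * m / n

if __name__ == "__main__":
    print(question1())  # exact value is 140/3 = 46.666...
```

In an exam the program would naturally be written in Fortran, as in the lecture examples; the structure is identical.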


Linux Tutorial

Linux Tutorial - Simple data manipulation programs under Linux

Your central interactive computer account uses the Linux operating system. Linux is one of several variants of Unix. Many basic though very useful data manipulation programs exist under Linux. We will look at the following programs:

cat    concatenate files
sort   sort lines of text files
uniq   remove duplicate lines from a sorted file
diff   find differences between two files
echo   display a line of text
sed    a Stream EDitor
tr     translate or delete characters
grep   search for lines in a file matching a pattern
head   output the first part of files
tail   output the last part of files
wc     print the number of bytes, words, and lines in files
cut    remove sections from each line of files

For detailed information about these commands type: man command, info command, or command --help

Additionally, we can use "output redirection" (the > symbol) to redirect the output of a program to a file instead of the screen, "append" (the >> symbol) to add (append) to files, and "piping" (the | symbol) to "pipe" the output of one program into another program.

Copy the following data files to your file space; you can do this with the command get-EEE484-unix. Starting with the files file1.dat, file2.dat and file3.dat we can perform the following operations (you can view the contents of a file with the command "cat filename"). The file contents are:

file1.dat    file2.dat    file3.dat
---------    ---------    ---------
cake         house        bed
hat          pen          fence
pool         fish         tool
ten          cake         one
tool                      comb

Join the contents into a single file:
$ cat file1.dat file2.dat file3.dat > file4.dat
The output is sent to the file file4.dat; view it with cat file4.dat

Sort into alphabetical order:
$ sort file4.dat > file5.dat
Output to file5.dat; view it with cat file5.dat


Remove multiple entries:
$ uniq file5.dat > file6.dat

We can combine the last two operations with:
$ sort file4.dat | uniq > file6.dat
(note that no file5.dat is required)

Look at the difference:
$ diff file5.dat file6.dat

Append the word dog to the file:
$ echo dog >> file6.dat

Again sort into alphabetical order:
$ sort file6.dat > file7.dat

Replace the word house with the word home:
$ sed "s/house/home/g" file7.dat > file8.dat

Translate all letters between a and z to their upper-case values:
$ cat file8.dat | tr a-z A-Z > file9.dat

Search for the word home in the file:
$ grep home file9.dat

Search for the word HOME in the file:
$ grep HOME file9.dat

Search for the word home ignoring case:
$ grep -i home file9.dat

And again, giving the line number of any occurrences:
$ grep -i -n home file9.dat

Display the first 8 lines:
$ head -n8 file9.dat

Display the last 5 lines:
$ tail -n5 file9.dat

Count the number of lines, words and characters in the file:
$ wc file9.dat

Display the first three characters of each line:
$ cut -b1-3 file9.dat

Display the second to fourth characters of each line:
$ cut -b2-4 file9.dat


Lab Exercise - Linux

Download the file lep.dat and perform the following analysis.
1. Use wc to determine how many particles there are in the list.
2. Use cut, sort and uniq to form a unique list of particle species; how many species of particles are there?
3. Use grep and wc to determine how many particles there are of each species.
4. Use cut, sort, head, and tail to determine the maximum and minimum particle momenta. Which particles are they?
5. Use piping (i.e. don't waste time creating intermediate files) to repeat the exercise (except part 3) with the file lep2.dat; the file contains many more particles.

Solution for Lab Exercise Linux

1. Use wc to determine how many particles there are in the list.
Answer:
$ wc lep.dat
26 52 442 lep.dat
26 lines implies 26 particles in the list.

2. Use cut, sort and uniq to form a unique list of particle species; how many species of particles are there?
Answer:
$ cut -b1-7 lep.dat | sort | uniq
KAON-
NEUTRON
PHOTON
PION+
PION-
PROTON
There are six particle species in the list.

3. Use grep and wc to determine how many particles there are of each species.
Answer: We search for each particle type (6 of them) and count.
$ grep KAON- lep.dat | wc
1 2 17
$ grep NEUTRON lep.dat | wc
2 4 34
$ grep PHOTON lep.dat | wc
10 20 170
$ grep PION+ lep.dat | wc
7 14 119
$ grep PION- lep.dat | wc
5 10 85


$ grep PROTON lep.dat | wc
1 2 17
The result is: 1 Kaon, 2 Neutrons, 10 Photons, 7 positive Pions, 5 negative Pions, and 1 Proton (26 in total).

4. Use cut, sort, head, and tail to determine the maximum and minimum particle momenta. Which particles are they?
Answer:
$ cut -b11-16 lep.dat | sort -n | head -n1
0.120
$ cut -b11-16 lep.dat | sort -n | tail -n1
22.284
The minimum momentum is 0.120 GeV/c, the maximum is 22.284 GeV/c.
$ grep 0.120 lep.dat
PHOTON    0.120
$ grep 22.284 lep.dat
PION-    22.284
The minimum momentum belongs to a Photon. The maximum momentum belongs to a negative Pion.

5. Use piping (i.e. don't waste time creating intermediate files) to repeat the exercise (except part 3) with the file lep2.dat; the file contains many more particles.
Answer:
$ wc -l lep2.dat
980 lep2.dat
There are 980 particles in the list.
$ cut -b1-8 lep2.dat | sort | uniq | wc -l
87
There are 87 particle species in the list.
$ cut -b11-19 lep2.dat | sort -n | head -n1
0.001432
$ cut -b11-19 lep2.dat | sort -n | tail -n1
40.19147
The minimum momentum is 0.001 GeV/c, the maximum is 40.191 GeV/c.
$ grep 0.001432 lep2.dat
GAMMA    0.00143232674
$ grep 40.191 lep2.dat
B*-    40.1914749
The minimum momentum belongs to a GAMMA. The maximum momentum belongs to a B*-.

