
Applied numerical analysis summary part I:

2018-2019 edition
Based on Applied Numerical Analysis by R. Klees and

R.P. Dwight

Sam van Elsloo

February - April 2017

Version 2.0


Preface
Please note: you’ll find three documents in this dropbox folder:
1. The main summary, which contains all of the theory. However, the theory itself is rather short, so I’ve
included the official homework problems as examples, each time following the theory. These should
clarify a lot with regards to what you actually have to know, so I can really recommend doing them whilst
studying (and not first studying all of the theory and only then start making exercises). There are also a few
old quizzes available (the ones of 2013 and 2015-2018), but I have not used any of those questions, so that
you can practice them without being spoiled. Furthermore, formulas in red boxes are really important this
time; I’ve only highlighted them because they occurred so often in the homework problems/old quizzes.
2. There’s a solution manual for just the homework problems: these are exactly the same solutions as
found in this summary, in case you prefer having them in a separate document. There’s absolutely
no difference content-wise.
3. Of course, there’s the solution manual to the old quizzes (the ones of 2013, 2015 and 2016).
Furthermore, I personally find the reader a bit too brief for my liking, so sometimes this summary is a bit
longer than the reader itself, but that’s just because I like my summaries to be actually understandable. Finally, if
you find the theory vague, just look at the examples, as they are rather clarifying in my opinion.
Finally, since you can start registering for it now, I can really recommend going to the Revolutionary Aerospace
Women event (previously known as Aerospace Women’s Day). During the event, there will be two inspiring
talks by leading women from the aerospace industry, an (optional) workshop, a free (!) dinner during which you
can network with all kinds of important people, and a panel discussion. Furthermore, one of the main topics of
the evening will be bridging the gap between men and women in the aerospace industry, and so also if you’re a
guy like me you’re very much encouraged to sign up for it (and quite a number of guys already signed up for
it)!
You can sign up via this link: https://vsv.tudelft.nl/ticketshop/register/499. The event is on the
19th of March (a Tuesday), starting at 15:00. Your next exam would only be in three weeks, and this semester’s
project feels like going back to kindergarten compared to Systems Design1 , so you really have no reason to miss
this event.
P.S.: I was NOT responsible for choosing the front page colour this time.

1 Seriously, in the past, some project groups have gone to McDonald’s during project sessions because they had so little to do. Although if
you had space during Systems Design you could also have gone on holiday for two weeks during each work package and the
report wouldn’t have suffered from it, so if you had space maybe you’ll still complain that Test, Analysis & Simulation is oh so much work. But
then count yourself lucky that whilst others were designing something a slight bit more complicated, you were ‘learning’ how to read a
10-step problem solving guide on how to design a hinge.



Contents

1 Preliminaries: Motivation, Computer Arithmetic, Taylor Series
    1.1 Numerical analysis motivation
    1.2 Computer representation of numbers
        1.2.1 Integers
        1.2.2 Real numbers - fixed-point arithmetic
        1.2.3 Real numbers - floating point arithmetic
    1.3 Taylor series review

2 Iterative Solutions of Non-linear Equations
    2.1 Recursive bisection
    2.2 Fixed-point iteration
    2.3 Newton's method

3 Polynomial interpolation in 1d
    3.1 The monomial basis
    3.2 Why interpolation with polynomials?
    3.3 Newton polynomial basis
    3.4 Lagrange polynomial basis
    3.5 Chebychev polynomials
        3.5.1 Interpolation error
        3.5.2 Chebychev's polynomial



1 Preliminaries: Motivation, Computer Arithmetic, Taylor Series

1.1 Numerical analysis motivation

Numerical analysis comes down to finding solutions by purely arithmetic operations, i.e. +, −, × and ÷. The
reason why numerical analysis is important is that many problems cannot be solved analytically (i.e. by methods
of Calculus courses etc.); e.g. you can’t find the exact integral of

    ∫₀^π √(1 + cos²x) dx

In this course, we’ll see how we can find the value of this integral nevertheless (amongst other things).

1.2 Computer representation of numbers

As we use computers for numerical analysis (usually), it’s kinda important to understand how a computer saves
numbers. We need to represent infinite fields (such as the real numbers ℝ) with a finite and small number of bits
(even individual numbers such as 𝜋 can require an infinite decimal representation).

1.2.1 Integers

You know this already. Suppose you have a 4-bit system, then the integer 6 would be represented as 0110. If you
don’t entirely remember how to represent an integer in bits, the easiest way to do it is by simply making a
nice table; suppose you are asked to write 175 in bits (in an 8-bit system), then it’s easiest to write it out as done
in table 1.1 to get 10101111. Please note: we start counting at bit 0, not at bit 1.

Table 1.1: Writing out 175 in bits.

    i                    7           6        5         4        3       2      1      0
    2^i                  128         64       32        16       8       4      2      1
    Remainder ≥ 2^i?     1           0        1         0        1       1      1      1
    Remainder            175−128=47  47−0=47  47−32=15  15−0=15  15−8=7  7−4=3  3−2=1  1−1=0

Possible errors are:

• Overflow: trying to represent a number that’s too large: an N-bit system can only represent integers smaller
than or equal to 2^N − 1. For example, in a 32-bit system, you cannot represent numbers larger than
2^32 − 1 = 4294967295. If you try to represent 4294967295 + 1, it wraps around to zero again. If a system also
wants to represent signs, then one bit is reserved for the sign, so that the maximum number now becomes
2^(N−1) − 1. If you’d try (2^31 − 1) + 1, then it’ll wrap around to simply −2^31 (though it can also return to 0,
depending on the system used).

More mathematically, we can write this as follows:

7
CHAPTER 1. PRELIMINARIES: MOTIVATION, COMPUTER ARITHMETIC, TAYLOR SERIES 8

INTEGERS IN    Assume we have N bits,
BINARY
                   b = (b_0, b_1, ⋯, b_{N−1})                        (1.1)

               taking values 0 or 1. A given b then represents the natural number (integer)

                   z = Σ_{i=0}^{N−1} b_i · 2^i                        (1.2)
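
To make the table-based procedure concrete, here is a minimal Python sketch (my own, not from the reader) that builds the bit pattern of a non-negative integer exactly as in table 1.1, testing each power of two from the highest bit down:

    def to_bits(z, n_bits=8):
        # Represent a non-negative integer as bits b_{n_bits-1} ... b_0,
        # following the subtraction procedure of table 1.1.
        if not 0 <= z <= 2**n_bits - 1:
            raise OverflowError("value does not fit in this many bits")
        bits, remainder = [], z
        for i in reversed(range(n_bits)):   # from bit n_bits-1 down to bit 0
            if remainder >= 2**i:           # "remainder >= 2^i?"
                bits.append(1)
                remainder -= 2**i
            else:
                bits.append(0)
        return bits

    print(to_bits(175))   # [1, 0, 1, 0, 1, 1, 1, 1], i.e. 10101111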

1.2.2 Real numbers - fixed-point arithmetic

FIXED-POINT    One integer is assigned to each real number with a fixed interval h:
ARITHMETIC
                   r = h · Σ_{i=0}^{N−1} b_i · 2^i                    (1.3)

For example, if we have an interval h of 1 × 10⁻⁴, then the first representable number equals 0.0000, the second 0.0001,
the third 0.0002, etc. If a calculation would result in, for example, 0.00013, then this would be rounded to
0.0001, etc. If we’d have a 32-bit system with this interval length, we’d be able to represent numbers between
±2^31 · 1 × 10⁻⁴ ≈ ±200000 with a resolution of 0.0001¹. This accuracy and size is rather limited. Two possible
errors are:

• Overflow.

• Accumulation of rounding error: in the above system, we’d have Σ_{i=1}^{10} 0.00011 = 0.0010 rather than
0.0011, as each 0.00011 first gets rounded to 0.0001 (see the sketch below).
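
To see the accumulation happen, here is a small Python sketch of my own (the interval h = 1 × 10⁻⁴ is the one from the example above) that rounds every intermediate result to the nearest representable fixed-point value:

    def fix(r, h=1e-4):
        # Round a real number to the nearest multiple of the interval h.
        return round(r / h) * h

    total = 0.0
    for _ in range(10):
        total = fix(total + fix(0.00011))   # each 0.00011 is rounded to 0.0001 first
    print(f"{total:.4f}")                   # 0.0010, not the exact 0.0011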

1.2.3 Real numbers - floating point arithmetic

The problem with the previous method was the size and accuracy: although a resolution of 0.0001 may seem
quite good to you, it’s quite inefficient if you think about it: for very large numbers (in the order of millions and
billions), you probably don’t really care about such precision, whereas for very small numbers, you’d actually
need a better precision than this. So, perhaps we can make it more "efficient" by changing the distribution of the
representable numbers a bit: very close to zero, they’ll be close together, increasing accuracy in that region; very far away from zero,
they’ll be spaced very far apart, increasing the total range of the system. This sounds a bit like it uses logarithms in
some form, and indeed it does:

1 Please note: the resolution is thus the size of the steps, not the maximum rounding error.

©Sam van Elsloo


9 1.2. COMPUTER REPRESENTATION OF NUMBERS

FLOATING- A real number 𝑥 is written as a combination of a mantissa 𝑠 and exponent 𝑒, given a fixed base 𝑏:
POINT
ARITHMETIC 𝑥 = 𝑠 × 𝑏𝑒 (1.4)

In particular:
• 𝑏 - base: usually 2 or 10, fixed for the system.
• s - significand (or mantissa), 1 ≤ s < b, with n digits - a fixed-point number. For example, in base
10 with 5 digits (so that the smallest increment is 0.0001), we have 1 ≤ s < 10.0000.
• 𝑒 - exponent, which is an integer 𝑒min ≤ 𝑒 ≤ 𝑒max .

Note how this would work for example for a system with base 10, 3-digits, and −8 ≤ 𝑒 ≤ 8: then, using 𝑒 = 0,
we could represent the numbers

1.00, 1.01, 1.02, ..., 9.98, 9.99

If we’d want to go to 10 or higher, we’d be able to represent the numbers

1.00 · 10¹, 1.01 · 10¹, 1.02 · 10¹, ..., 9.98 · 10¹, 9.99 · 10¹

i.e.

10.0, 10.1, 10.2, ..., 99.8, 99.9

So you lose one decimal, but you keep the three significant digits. Note that this system does not contain zero,
this needs to be added explicitly. Furthermore, one bit is reserved for the sign. Possible errors are:

• Overflow
• Underflow: trying to represent a number closer to zero than 1 × b^(e_min).
• Undefined operation: e.g. if you divide by zero, or if you take the square root of a negative number. Other
words to describe this are special values or not a number (abbreviated to NaN).
• Rounding error: in the system above, with 3 digits, 1 + 0.0001 = 1. We define the machine epsilon
ε_machine as the smallest number which, when added to 1, gives a number distinct from 1. In the above system,
ε_machine = 0.01.

To elaborate a bit more on a binary base, the industry standard for 32-bit systems is IEEE754, which uses the
following format:

      0      01111100      01000000000000000000000
    sign   exponent, 8 bits     mantissa, 23 bits

where the base equals 2. Note that 23 bits are reserved for the mantissa, so there are 2^23 "steps" in the mantissa.
As only bit number 21 is "on"², and because we count between 1 and 2 as the base is 2, the mantissa equals

    m = 1 + 2^21 / 2^23 = 1.25

The exponent seems to equal 124. However, again, as we have 8 bits, we can represent 2^8 = 256 numbers; we
don’t want our exponent to range from 0 to 255, but rather from −127 to +128. Therefore, we subtract 127 from
124, so that we have e = 124 − 127 = −3. This means our actual number equals

    1.25 · 2⁻³ = 0.15625

It goes without saying that you don’t need to know this system by heart (absolutely not), but it just goes to show
what you can do with it.

2 The final bit is called bit 0, the second-to-last bit is bit 1, etc. If we have 23 bits, then the first bit is bit number 22, and thus the second
bit is number 21.
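
As a quick check of the worked example (a sketch of my own, not part of the reader), the same decoding can be done in a few lines of Python, with the standard struct module reinterpreting the raw 32 bits as a float:

    import struct

    bits = 0b0_01111100_01000000000000000000000   # sign | exponent | mantissa

    sign     = bits >> 31
    exponent = (bits >> 23) & 0xFF                # 0b01111100 = 124
    mantissa = bits & ((1 << 23) - 1)             # only bit 21 is "on"

    value = (-1)**sign * (1 + mantissa / 2**23) * 2.0**(exponent - 127)
    print(value)                                              # 0.15625
    print(struct.unpack(">f", bits.to_bytes(4, "big"))[0])    # 0.15625 as well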


Homework 1: Question 1
Consider the positive floating point number system s × b^e, where b = 10, s is a 5-digit significand
1.0000 ≤ s ≤ 9.9999 and −8 ≤ e ≤ 8. The system is completed with the number 0. What is the
machine epsilon (i.e. the smallest number which, added to 1, is distinct from 1)?

A: 1 × 10−4 B: 9.9999 × 10−5 C: 1 × 10−5 D: 1 × 10−8

The correct answer is A: if we have 5 digits, our steps are of the size 0.0001 = 10−4 . This step size is
equal to the machine epsilon. If we’d try 0.99999 × 10−4 (answer B), it’d simply get rounded down
again. The same applies for answers C and D.

Homework 1: Question 2
What is the total number of distinct numbers in the number system described in Question 1, including
zero?

A: 99999 B: 1530001 C: 1710001 D: 999990001

The correct answer is B: if we have the numbers

1.0000, 1.0001, ..., 9.9998, 9.9999

in the mantissa, then we have


    (9.9999 − 1.0000)/0.0001 + 1 = 90000
distinct numbers in our mantissa. As we have 17 different exponents (−8, −7, ..., 0, ..., 7, 8), and we
explicitly add 0 as well, the total number of distinct numbers equals

17 ⋅ 90000 + 1 = 1530001

Homework 1: Question 3
What is the result of the following algorithm, where 𝑥 is computed using the number system described
in Question 1:
1. x ← x_0
2. for i = 1 to 100 do
3.     x ← √x
4. end for
5. for i = 1 to 100 do
6.     x ← x²
7. end for
for the case when x_0 > 1 and 0 < x_0 < 1, respectively? (Rounding: assume that √x and x² are
performed in exact arithmetic and then rounded to the nearest representable number.)

A: 0, 0 C: 1, 0 E: 0, Overflow G: 0, 𝑥
B: 0, 1 D: 1, 1 F: 1, Overflow H: 𝑥, 𝑥

The correct answer is C: this question may be slightly confusing in the way the program language is
written down; first, you should simply read the ← as =; secondly, you should remember that this for
loop simply tells the system to do this specific computation 100 times. In other words, you plug in a
certain value for 𝑥0 , then the system sets 𝑥 equal to 𝑥0 , then the system computes the square root of this
value, then takes the square root again, then takes it again, etc., a hundred times in total. It then squares
it a hundred times. So, from a pure mathematical standpoint, you should end up at precisely the same


value as you started (because we take the square root a hundred times and then square this a hundred
times). However, as we’re dealing with a computer system here, it’s slightly different.

First, let’s see what happens for x_0 > 1: let’s just try the largest number we can plug in: x_0 = 9.9999 · 10⁸.
If you take the square root a few times with your calculator, you quickly see that we approach 1 at
a fast rate. Note that all computations are rounded to 5 digits. Therefore, we’ll eventually end up
at √1.0001 = 1.00004999..., which will be rounded to 1.0000. This means that if you start squar-
ing it again, you stay stuck at 1, so the final result is 1 as well (it does not "remember" its previous values).

For 0 < x_0 < 1, let’s try the smallest number: 1.0000 · 10⁻⁸: again, take the square root a few times
with your calculator and we quickly approach 1. This time, √0.99999 = 0.999995 (at least, that’s what
my TI-84+ gives). However, you should know that the more precise value is something like 0.99999498
or so, meaning it is actually rounded down to 0.99999. This means that taking the square root again
results in 0.99999, meaning you’re stuck in a loop on 0.99999. Repeatedly squaring will then inevitably
result in a number that’ll be smaller than 1.0000 · 10⁻⁸, meaning it’ll be rounded to 0. What if you didn’t
realize that √0.99999 was actually slightly smaller than 0.999995? It’d suck, as you simply have to
remember that

    √x < (x + 1)/2

for x ≠ 1.
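
If you want to see this happen without a calculator, here is a small Python sketch of my own that emulates the 5-digit system by rounding every intermediate result to 5 significant digits (with underflow to 0 below 1.0000 · 10⁻⁸):

    def fl(x):
        # Round to 5 significant digits; underflow to 0 below 1.0000e-8.
        if abs(x) < 1e-8:
            return 0.0
        return float(f"{x:.4e}")

    for x0 in (9.9999e8, 1.0000e-8):
        x = x0
        for _ in range(100):
            x = fl(x ** 0.5)   # 100 square roots
        for _ in range(100):
            x = fl(x ** 2)     # then 100 squarings
        print(x0, "->", x)     # 9.9999e8 -> 1.0 and 1e-8 -> 0.0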

1.3 Taylor series review

Taylor series play a very important role in this course: remember that we wanted to make everything simply a
matter of +, −, × and ÷. Taylor series assist greatly in this: if you remember correctly (but you probably don’t),
Taylor series were capable of rewriting everything to polynomials. How exactly did it work?

TAYLOR    If a function f(x) is infinitely many times differentiable on the interval [x_0, x], then the Taylor expansion of
SERIES    f(x) about x_0 is

              f(x) = Σ_{n=0}^{∞} [f^(n)(x_0) / n!] (x − x_0)^n         (1.5)

          where f^(n) is the nth derivative of f (where f^(0) is the undifferentiated function).

For example, what would be the Taylor series expansion for f(x) = sin x around x_0 = 0? We have

    f(0) = sin(0) = 0
    f′(0) = cos(0) = 1
    f′′(0) = −sin(0) = 0
    f′′′(0) = −cos(0) = −1

etc. such that

    f(x) = Σ_{n=0}^{∞} [f^(n)(x_0) / n!] (x − x_0)^n
         = [f(0)/0!](x − 0)⁰ + [f′(0)/1!](x − 0)¹ + [f′′(0)/2!](x − 0)² + [f′′′(0)/3!](x − 0)³ + ...
         = 0 + x + 0 − x³/3! + ... = x − x³/3! + x⁵/5! − x⁷/7! + ...


Similarly, for f(x) = cos x around x_0 = 0 we get

    f(0) = cos(0) = 1
    f′(0) = −sin(0) = 0
    f′′(0) = −cos(0) = −1
    f′′′(0) = sin(0) = 0

etc. such that

    f(x) = [1/0!](x − 0)⁰ + [0/1!](x − 0)¹ − [1/2!](x − 0)² + [0/3!](x − 0)³ + ... = 1 − x²/2! + x⁴/4! − x⁶/6! + ...
What could we do exactly with these Taylor series? The entire Taylor series (which includes infinitely many
terms) is an exact representation of the actual function, for every value of 𝑥 (it gives the exact same result).
However, it’s obviously impossible to include infinitely many terms; fortunately, if x does not differ too significantly
from x_0, it is still a really accurate approximation if you only include the first few terms, as (x − x_0)^n becomes
really small for larger values of n if x − x_0 is small. This is rather nice: if we have a certain function which we
would need to integrate between 1.5 and 2.5, but we don’t know how to integrate it, then we can also just make
a Taylor series expansion around x_0 = 2, include the first few terms and then integrate this between 1.5 and
2.5.
Furthermore, please note: these Taylor series for sine and cosine are so common and fundamental that it is
highly advisable to just learn them by heart: just remember that sin gets the odd exponents and cos gets the
even exponents, and that plus and minus signs simply alternate (and that each term is divided by the factorial of
the exponent). If you can’t remember which one gets the odd exponents and which one the even, just plug in
𝑥 = 0.
Now, as more or less said before, your computer doesn’t have time to compute infinitely many terms in the
Taylor expansion, so in numerical analysis, we chop off the terms after an arbitrary number of terms. This is
called truncation ("inkorting" in Dutch). So, we adjust our definition a bit:

TAYLOR    If a function f(x) is N + 1 times differentiable on the interval [x_0, x], then the Nth order Taylor expansion
SERIES    of f(x) about x_0 is

              f(x) = Σ_{n=0}^{N} [f^(n)(x_0) / n!] (x − x_0)^n + O((x − x_0)^(N+1))         (1.6)

          where f^(n) is the nth derivative of f (where f^(0) is the undifferentiated function) and O is the truncation
          error. This can alternatively be written as

              f(x) = Σ_{n=0}^{N} [f^(n)(x_0) / n!] (x − x_0)^n + [f^(N+1)(ξ) / (N + 1)!] (x − x_0)^(N+1)         (1.7)

          which is called the Lagrange form of the remainder. The value of ξ is never known in practice.

In other words: if we have an Nth order Taylor expansion (which contains terms up to order N), then the
order of magnitude of this error will be O((x − x_0)^(N+1))³. Now, this truncation error can also be written slightly
differently, as shown in the second equation: we write it as

    [f^(N+1)(ξ) / (N + 1)!] (x − x_0)^(N+1)
What does this mean? Well, this tells us that if we knew the value of ξ (which lies somewhere between x
and x_0), we could compute the exact value of the truncation error by simply calculating the (N + 1)th derivative,
plugging in the value of ξ for x, dividing by (N + 1)! and then multiplying by (x − x_0)^(N+1). However, we
3 Why does it not include anything like O((x − x_0)^(N+2))? Because in applied numerical analysis, we’ll be smart and only use small values
of x − x_0, so that (x − x_0)^(N+2) already becomes of a smaller order of magnitude than (x − x_0)^(N+1) and thus the order of magnitude of the
error is simply O((x − x_0)^(N+1)).


unfortunately never know the value of 𝜉, but it’s a nice way of writing stuff. Now, finally, we often write 𝑥 − 𝑥0
as
ℎ ≡ 𝑥 − 𝑥0
where ℎ is called the step-size, which means that

FORMULA    f(x_0 + h) = Σ_{n=0}^{N} [f^(n)(x_0) / n!] h^n + [f^(N+1)(ξ) / (N + 1)!] h^(N+1)         (1.8)

which again is simply a nice way of writing stuff. So, ℎ is more or less the distance from 𝑥0 , and this series
representation is accurate as long as ℎ is small.
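
A quick numerical sanity check of equation (1.8) (a sketch of my own, not from the reader): take f(x) = e^x about x_0 = 0 and keep terms up to h² (so N = 2); halving h should then cut the truncation error by roughly 2³ = 8:

    import math

    def taylor_exp(h):
        return 1 + h + h**2 / 2          # terms n = 0, 1, 2 of e^h about 0

    prev = None
    for h in (0.4, 0.2, 0.1, 0.05):
        err = abs(math.exp(h) - taylor_exp(h))
        ratio = "" if prev is None else f"   ratio = {prev / err:.2f}"
        print(f"h = {h:4.2f}   error = {err:.3e}{ratio}")
        prev = err                       # the ratios approach 8, i.e. the error is O(h^3)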

Homework 1: Question 4
What is the third non-zero term in the Taylor expansion of cos (𝑥) about 𝑥 = 0?
A: (1/4!) x          C: (1/4!) cos(x)          E: −(1/2!) x²
B: (1/4!) x⁴         D: (1/4!) cos(x) x⁴

The correct answer is B: the Taylor expansion of the cosine is

    cos(x) = 1 − x²/2! + x⁴/4! − x⁶/6! + ...

so answer B is correct. Note that even if you did not remember the Taylor expansion of the cosine and
you didn’t feel like deriving it at the exam, you can already deduce that answers A, C and D must be
wrong. The third term in a Taylor series expansion already contains (x − x_0)², so the third non-zero term
must at least be a polynomial of the second degree or higher, meaning A is wrong. Furthermore, there
cannot appear a cos (or sin, √x, e^x or whatever) in a Taylor series, meaning C and D must be wrong as
well.

Homework 1: Question 5
Represent 𝑓 (𝑥) = 𝑒𝑥 by a 3-term truncated Taylor expansion about 𝑥 = 1. What is the first term in the
truncation error?
A: [(x − 1)³/3] e          C: [(x − 1)/6] e          E: (x − 1)²/2          G: (x − 1)/3
B: [(x − 1)/3] e           D: [(x − 1)³/6] e         F: (x − 1)³/3          H: (x − 1)/6

The correct answer is D: let’s just derive the Taylor series for e^x here. Note that f(1) = f′(1) =
f′′(1) = f′′′(1) = ... = e¹ = e; thus, we have

    f(x) = Σ_{n=0}^{∞} [f^(n)(x_0) / n!] (x − x_0)^n = [e/0!](x − 1)⁰ + [e/1!](x − 1)¹ + [e/2!](x − 1)² + [e/3!](x − 1)³ + ...

Thus, if we include only the first three terms in the expansion, then the first term in the truncation error
is simply (e/6)(x − 1)³, so answer D is correct.

Homework 1: Question 6
The function 𝑓 (𝑥) = exp (2𝑥) is written as a 3-term Taylor expansion 𝑃 (𝑥) about 𝑥 = 𝑥0 , plus an exact
remainder term 𝑅 (𝑥), so that:

𝑓 (𝑥) = 𝑃 (𝑥) + 𝑅 (𝑥)


What is the Lagrange form of the remainder R(x)? (Where in the following ξ ∈ [x_0, x], and h = x − x_0.)

A: (1/3) exp(2ξ) (x_0 + h)³          C: (1/6) exp(2h) ξ³
B: (4/3) exp(2ξ) h³                  D: (8/6) exp(hξ)³

The correct answer is B: again, you can derive the entire Taylor series, but it is not really necessary, as we
know that all terms in the Taylor series are non-zero and therefore, if we have a 3-term Taylor expansion,
R(x) is associated with the fourth term, which will be (compare with equation (1.8); remember that it is
the third derivative because the Taylor series starts counting terms at the zeroth derivative (the original
function))

    R(x) = [f^(3)(ξ) / 3!] h³

We have that f^(3)(ξ) follows from

    f′(ξ) = 2e^(2ξ)
    f′′(ξ) = 4e^(2ξ)
    f′′′(ξ) = 8e^(2ξ)

and thus

    R(x) = [8e^(2ξ) / (3 · 2 · 1)] h³ = (4/3) exp(2ξ) h³

so answer B is correct.

Homework 1: Question 7
Write sin(x) as a truncated Taylor series expansion about x = 0 with two non-zero terms. What is the
magnitude of the first non-zero term in the truncation error at x = π/2?

A: 0.07969 B: 0.02 C: 0.008727 D: 0.008333

The correct answer is A: the Taylor series expansion for a sine is

    f(x) = x − x³/3! + x⁵/5! − ...

Thus, the first non-zero term in the truncation error is

    x⁵/5!  →  (π/2)⁵ / (5 · 4 · 3 · 2 · 1) = 0.07969

Homework 1: Question 8
Approximate e^(−x²) by a 2-term Taylor series expansion about x = 1. What is the magnitude of the first
term in the truncation error at x = 0?

A: e⁻¹          B: 2e⁻¹          C: (1/2) e⁻¹          D: (1/3) e⁻¹

The correct answer is A: the Taylor series for e^(−x²) around x = 1 is probably not immediately obvious,
so let’s just derive it here. We have

    f(x) = Σ_{n=0}^{∞} [f^(n)(x_0) / n!] (x − x_0)^n

where we’re interested only in the first three terms:

    f(x) = e^(−x²)                       →  f(1) = e^(−1²) = e⁻¹
    f′(x) = −2x e^(−x²)                  →  f′(1) = −2 · 1 · e⁻¹ = −2e⁻¹
    f′′(x) = −2e^(−x²) + 4x² e^(−x²)     →  f′′(1) = −2e⁻¹ + 4 · 1² · e⁻¹ = 2e⁻¹

Thus, the Taylor series evaluated at x = 0 is

    f(0) = [e⁻¹/0!](0 − 1)⁰ − [2e⁻¹/1!](0 − 1)¹ + [2e⁻¹/2!](0 − 1)²

so that the magnitude of the first term in the truncation error is

    (2e⁻¹/2)(−1)² = e⁻¹

and thus answer A is correct.



2 Iterative Solutions of Non-linear Equations
We often want to find the roots of an equation, i.e. the solutions x of f(x) = 0. In this chapter, we’ll discuss
three such methods; all of them are iterative in nature. If the "real" solution is x̃, and the sequence of estimates
of the solution is generated as x_0, x_1, ..., x_N, we rather obviously get that in the limit,

    lim_{N→∞} x_N = x̃

The iterative nature means that three things are of importance in our discussion (next to how they work of
course):
1. Under what conditions does the algorithm converge; i.e. under what conditions will it actually give a
solution.
2. A bound on the error of the estimate x_N; i.e., how large the error will still be for the Nth estimate (after
all, we don’t have time to iterate infinitely many times, so we want to know how accurate our Nth iteration
is).
3. How rapidly the algorithm converges (the rate at which the error in 𝑥𝑁 decreases).

2.1 Recursive bisection


Recursive bisection is a method you actually already learned during the Python course. We start with an interval
[a, b] in which we know a root exists. Then we halve the interval, choosing the half which contains the root;
then halve this interval, choosing the half which contains the root, ad infinitum. To make it more concrete,
consider the polynomial

    f(x) = x³ − x − 1 = 0

We have f(1) = 1³ − 1 − 1 = −1 and f(2) = 2³ − 2 − 1 = 5, and thus we know that the root must lie somewhere
between x = 1 and x = 2. What we do then is simply plug in x = 1.5 (the middle of the interval): we have
f(1.5) = 1.5³ − 1.5 − 1 = 0.875. From this, we can deduce that the root must lie between 1 and 1.5; we thus
try 1.25, for which f(1.25) = 1.25³ − 1.25 − 1 = −0.6875. Thus, we know that the root must lie between 1.25
and 1.5, and we would try 1.375 on our next try, but I think the process is clear now.
Now, this method will always converge, as long as the initial end points are of opposite signs. The upper bound
on the error¹ is simply half the length of the interval: if the interval is [1, 2], then we guessed x = 1.5 first, but
as we know that the root must be between 1 and 2, the root must be within 0.5 of 1.5 (obviously). Finally, if the
"real" error after the ith iteration is

FORMULA    ε_i ≡ |x_i − x̃|         (2.1)

then we know that on the first iteration, ε_0 ≤ (b − a)/2 (as (b − a)/2 is the upper bound on the error), and as
the interval size halves for every subsequent iteration, we have

    ε_N ≤ E_N = (b − a) / 2^(N+1)
Note that this means that the error at each iteration is reduced by a constant factor of 0.5; this is the rate of
convergence. This is an example of a linear rate of convergence; "linear" comes from the fact that the convergence
curve when plotted on an iteration-log error graph is linear, as shown in figure 2.1.
Two final notes:
1 I.e., the maximum value the error could be, the upper-bound.


Figure 2.1: Convergence of recursive bisection (linear - left) and Newton (quadratic - right) (will be discussed
later).

• If your function is discontinuous in the interval, the recursive bisection method may fail to work. For
example, if you have 𝑓 (𝑥) = 1∕𝑥 = 0, then it’ll find a root at 𝑥 = 0, which is obviously bullshit.
• If you have multiple roots within your interval, then it will converge to one of the roots, but it is difficult
to predict to which one it’ll converge.

For fun: this method is guaranteed to converge (under the conditions mentioned above) and possesses a strong
upper bound on the error; the major limitation is that it does not work for vector problems.
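
As a concrete illustration, here is a minimal Python sketch of recursive bisection (my own, not from the reader), applied to the example f(x) = x³ − x − 1 on [1, 2]:

    def bisect(f, a, b, n_steps):
        # Recursive bisection: f(a) and f(b) must have opposite signs.
        # After n_steps halvings the error is at most (b - a) / 2**(n_steps + 1).
        if f(a) * f(b) > 0:
            raise ValueError("f(a) and f(b) must have opposite signs")
        for _ in range(n_steps):
            m = (a + b) / 2
            if f(a) * f(m) <= 0:    # the root is in the left half
                b = m
            else:                   # the root is in the right half
                a = m
        return (a + b) / 2          # centre of the remaining interval

    print(bisect(lambda x: x**3 - x - 1, 1, 2, 30))   # ~1.32472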

Homework 1: Question 9
What is the approximation of the root of the function 𝑓 (𝑥) = 𝑒𝑥 − 1, if three steps of repeated bisection
are applied on a starting interval [𝑥1 , 𝑥2 ] = [−2, 1]? (The root approximation is the center of the
remaining interval.)

A: -0.5 C: -0.125 E: 0.1331


B: -0.39346 D: 0 F: 0.75

The correct answer is C: let’s just do it three times:

    x_0 = (−2 + 1)/2 = −0.5
    f(−0.5) = e^(−0.5) − 1 = −0.39347
    x_1 = (−0.5 + 1)/2 = 0.25
    f(0.25) = e^(0.25) − 1 = 0.28403
    x_2 = (−0.5 + 0.25)/2 = −0.125

Note that three steps mean three times taking the center of the remaining interval.

Homework 1: Question 10
Assume that a function 𝑓 (𝑥) has multiple roots in the interval [𝑎, 𝑏], and 𝑓 (𝑎) > 0, 𝑓 (𝑏) < 0. Repeated
bisection is applied, starting on this interval. How will the iteration behave? (Hint: perform the
algorithm graphically on a suitable curve.) It will:

A: Take no steps.
B: Fail to converge.
C: Converge to one of the roots.
D: Converge to more than one root.
E: Terminate with a solution which is not a root.


F: Terminate with an interval containing all roots.

The correct answer is C: repeated bisection will always converge to one of the roots. Converging to
more than one root just doesn’t make sense semantically. Answers E and F are also just wrong.

2.2 Fixed-point iteration


You probably don’t know fixed-point iteration yet, but it’s actually pretty logical. If we again have the equation

    x³ − x − 1 = 0

then another way of writing this would be one of the following two:

    x = ∛(x + 1)
    x = x³ − 1

Now, suppose we use the first one: a way of finding x now is to first assume a certain value for x, say x = 2,
and plug this in, to get x = ∛(2 + 1) ≈ 1.4422; plug this in, and get x = ∛(1.4422 + 1) ≈ 1.3467, then x = 1.3289,
x = 1.3255, etc. Pretty straightforward: you just rewrite the equation to a function of x, try a value for x and
then keep re-iterating the function. However, there’s a small problem: suppose we’d have used x = x³ − 1: if
we then tried x = 2 initially, we’d have gotten the numbers x = 7, x = 342, x = 40001687; it is clear that this
quickly diverges. We don’t want this, so how can we know whether our way of rewriting actually leads to results?
We have a handy theorem for this: if f(x) and f′(x) are continuous on the interval between x = a and
x = b, then there exists a ξ ∈ [a, b] such that

    f′(ξ) = [f(b) − f(a)] / (b − a)
In other words: between a and b, there exists a value ξ such that the derivative at ξ is equal to the average slope
between a and b. How can we apply this here? Note how we can actually write our method of iteration as

    x_{i+1} = φ(x_i)

where φ(x_i) is the rewritten function, e.g. ∛(x + 1). We can then rewrite this a fair bit (x̃ is the real solution;
note that we must have φ(x̃) = x̃):

    x_{i+1} = φ(x_i)
    x_{i+1} − x̃ = φ(x_i) − x̃
    x_{i+1} − x̃ = φ(x_i) − φ(x̃)
    x_{i+1} − x̃ = {[φ(x_i) − φ(x̃)] / (x_i − x̃)} (x_i − x̃)
    x_{i+1} − x̃ = φ′(ξ_i)(x_i − x̃)

Now, the error is e_i ≡ |x_i − x̃|, and thus

FORMULA    e_{i+1} = φ′(ξ_i) e_i         (2.2)

Finally: we want our error to decrease with each iteration, so we want |φ′(ξ_i)| < 1. If |φ′(ξ_i)| > 1, the error
grows (divergence). If −1 < φ′(ξ) < 0, then the error oscillates around the root; i.e., the first time your solution
will be larger than x̃, the second time it will be smaller than x̃, the third time it will be larger than x̃, etc.


Now, you may be wondering: but what is ξ_i? Actually, it’s not a known number, and it also changes with each
iteration, so how do you then know for what value you need to check φ′(ξ_i)? Look carefully at the derivation
shown: you need to check it for every value between x̃ and the initial guess x_i. For x̃, you need to guess a bit
where it’ll be. Let’s do two examples to clarify a bit:

• For φ(x) = ∛(x + 1): we can guess more or less that x̃ will be between 1 and 2²; furthermore, φ′(x) =
(1/3)(x + 1)^(−2/3). This is smaller than 1 for all x ∈ [1, 2] and thus we’re good (as our initial guess for x is 2
as well).
• For φ(x) = x³ − 1: again, x̃ will be between 1 and 2. However, this time, φ′(x) = 3x², which is larger
than 1 for all values of x between 1 and 2, so it’s not valid.
To round this section up, our error bound will be as follows: assume that |φ′(ξ_i)| < K < 1 for all i; then the
error bound after i iterations is

    ε_i < Kε_{i−1} < K²ε_{i−2} < ⋯ < K^i ε_0

Again, we have linear convergence: the error is reduced by a constant factor at each iteration.
For fun: FPIs (fixed-point iterations) are used almost universally in physics (including CFD); they can be very
efficient for vector problems, but convergence is rarely guaranteed, and often slow.
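
A minimal Python sketch of fixed-point iteration (my own), using the convergent rewriting x = ∛(x + 1) from the example above:

    phi = lambda x: (x + 1) ** (1 / 3)   # the convergent rewriting of x^3 - x - 1 = 0

    x = 2.0                              # initial guess
    for i in range(6):
        x = phi(x)
        print(i + 1, x)                  # 1.4422, 1.3467, 1.3289, 1.3255, ...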

Homework 1: Question 11
Rearrange the function f(x) = e^x − 5x² (without adding terms) into a suitable format for fixed-point
iteration. Make sure the iteration converges to a root, starting at an initial guess of x_0 = 10. What is the
estimate of the root after two iterations of your method? (You may need to try more than one choice
of fixed-point iteration.)
A: 0 C: 1.156 E: 5.263
B: 0.447 D: 4.708 F: 9.572

The correct answer is E: we have two options; let’s just see the results of both of them:

    5x² = e^x
    x = φ(x) = √(e^x / 5)
    x_1 = φ(10) = √(e^10 / 5) = 66.37
    x_2 = φ(66.37) = √(e^66.37 / 5) = 1.156 · 10^14

which obviously diverges, so let’s try the other:

    e^x = 5x²
    x = φ(x) = ln(5x²)
    x_1 = φ(10) = ln(5 · 10²) = 6.2146
    x_2 = φ(6.2146) = ln(5 · 6.2146²) = 5.263

so answer E is correct.

Homework 1: Question 12
Given a particular fixed-point iteration x_{i+1} = g(x_i) which is known to converge, how do you expect
the errors ε of consecutive iterations to be related? (K is some constant with |K| < 1.)

2 Same line of reasoning as why we picked those values during recursive bisection: you just guess a few values and see that there is a

change of sign between 𝑥 = 1 and 𝑥 = 2.


A: ε_{i+1} < Kε_i          C: ε_{i+1} > Kε_i          E: ε_{i+1} < K + ε_i
B: ε_{i+1} < Kε_i³         D: ε_{i+1} > Kε_i²         F: ε_{i+1} > K + ε_i²

The correct answer is A: just basic knowledge, really.

2.3 Newton’s method


Again, a beautifully elegant method. It’s basically depicted in figure 2.2. We see what happens there: we guess
an initial value x_0, compute f(x_0) and the derivative f′(x_0) at that point and solve

    0 = f(x_0) + f′(x_0)(x_1 − x_0)

Figure 2.2: Progression of Newton’s method.

Rearranging gives

    x_1 = [f′(x_0) x_0 − f(x_0)] / f′(x_0) = x_0 − f(x_0)/f′(x_0)

or more generally,

    x_n = x_{n−1} − f(x_{n−1})/f′(x_{n−1}),    n = 1, 2, ...

This method is called Newton’s method. It is actually a special case of an FPI with iteration function

FORMULA    φ(x) = x − f(x)/f′(x)         (2.3)

Now, again, does it converge? How rapidly? Note that we have

    φ′(x) = 1 − [f′(x) f′(x) − f(x) f′′(x)] / (f′(x))² = f(x) f′′(x) / (f′(x))²

So, φ′(x̃) = 0 (as f(x̃) = 0); therefore, if f is twice continuously differentiable, |φ′(x)| < 1 for all x close to
x̃. Therefore, Newton’s method will converge provided that the initial guess x_0 is "close enough" to x̃. This is
called local convergence. Such a starting point can often be found by first using several iterations of an FPI in the
hope that a suitably small interval is obtained.


Now, what is the decrease in error? Again, we have ε_i ≡ x_i − x̃, and we expect this to be small:

    x_{i+1} = φ(x_i) = φ(x̃ + ε_i)

Now, for this, we can actually use a Taylor series about x̃:

    x_{i+1} = φ(x̃ + ε_i) = φ(x̃) + φ′(x̃) ε_i + [φ′′(x̃)/2] ε_i² + O(ε_i³) ≈ x̃ + 0 + [φ′′(x̃)/2] ε_i²

If this went a little too fast for you: remember that the Taylor series for a function f(x) around x_0 is given
by

    f(x) = [f(x_0)/0!](x − x_0)⁰ + [f′(x_0)/1!](x − x_0)¹ + [f′′(x_0)/2!](x − x_0)² + O((x − x_0)³)

Now, replace f with φ, x_0 with x̃ and x with x̃ + ε_i; the terms between brackets now reduce to x − x_0 =
x̃ + ε_i − x̃ = ε_i. Furthermore, to go to the final equation, remember that φ(x̃) must simply equal x̃, and
φ′(x̃) = 0 as explained before. Now, rearrange to get

    x_{i+1} = x̃ + [φ′′(x̃)/2] ε_i²
    x_{i+1} − x̃ = [φ′′(x̃)/2] ε_i²

FORMULA    ε_{i+1} = [φ′′(x̃)/2] ε_i²         (2.4)

Thus, the new error is proportional to the square of the previous one: this is called quadratic convergence, as
shown in figure 2.1.
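
To see the quadratic convergence numerically, here is a short Python sketch of my own, running Newton’s method on the earlier example f(x) = x³ − x − 1 and printing the error against the known root:

    f  = lambda x: x**3 - x - 1
    df = lambda x: 3 * x**2 - 1

    x_true = 1.3247179572447460     # the root, used only to measure the error
    x = 2.0
    for i in range(6):
        x = x - f(x) / df(x)
        print(i + 1, x, abs(x - x_true))   # the error roughly squares each step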
Homework 1: Question 13
Apply Newton’s method to g(x) = 8^x − 8x³. Algebraically there is a root at x = 1. What is the absolute
error to this root after a single iteration, using an initial guess of x_1 = 0.5?

A: −15.436          B: 0          C: 0.435          D: 0.565          E: 14.936

The correct answer is E: just remember the formula

    φ(x) = x − f(x)/f′(x)
    f(x) = 8^x − 8x³
    f′(x) = ln(8) · 8^x − 24x²
    φ(x) = x − (8^x − 8x³) / (ln(8) · 8^x − 24x²)
    x_2 = φ(0.5) = 0.5 − (8^0.5 − 8 · 0.5³) / (ln(8) · 8^0.5 − 24 · 0.5²) = 15.936
    |x_2 − x̃| = |15.936 − 1| = 14.936

So answer E is correct.

Homework 1: Question 14
Does the Newton iteration in Question 13 converge using the given initial guess?

A: Yes, the iterations converge to 𝑥 = 1.


B: Yes, although it doesn’t converge to 𝑥 = 1.
C: No, Newton’s method diverges using this guess.
D: No, the iterations oscillate forever.


E: More information is needed to answer this question.

The correct answer is B: the best way to do these kinds of questions is by just letting your calculator do the
iteration a few times; the fastest way to do this is by simple use of ans (at least that’s what it’s called on
my TI-30XB):

    ans − (8^ans − 8 · ans³) / (ln(8) · 8^ans − 24 · ans²)

If you do this like 10 times (which is just 10 times pressing enter, so it’s not that much work at all), you
quickly see that it converges to x = 2.

Homework 1: Question 15
Consider the two-variable problem consisting of two scalar equations:

    x² + y² = 1,    xy = 1/4

Newton’s method is applied to solve this system. Which of the following represents the first iteration of
Newton’s method for this system, with an initial guess of x_0 = 2, y_0 = −1? (Where Δx_0 = x_1 − x_0,
and similarly for y.) [Hint: For multiple equations and variables the derivatives form a matrix.]

A: (4Δx_0 − 1Δy_0)/4 = 0,    (−2Δx_0 + 2Δy_0)/(−2.25) = 0
B: (4Δx_0 − 1Δy_0)/Δx_0 = −2.25,    (−2Δx_0 + 2Δy_0)/Δy_0 = 4
C: 4Δx_0 − 1Δy_0 = 4,    −2Δx_0 + 2Δy_0 = −2.25
D: 4Δx_0 − 2Δy_0 = −4,    −1Δx_0 + 2Δy_0 = 2.25

The correct answer is D: there is little chance you were able to solve this all by yourself, though
(at least I wasn’t). Remember that Newton’s method was

    x_1 = x_0 − f(x_0)/f′(x_0)

but how does that work for a system of equations? Then x_1, x_0 and f(x_0) all become vectors:

    x_1 = ⎡x_1⎤        x_0 = ⎡x_0⎤
          ⎣y_1⎦              ⎣y_0⎦

    F = ⎡f(x, y)⎤ = ⎡x² + y² − 1⎤ = ⎡2² + (−1)² − 1⎤ = ⎡ 4  ⎤
        ⎣g(x, y)⎦   ⎣xy − 1/4   ⎦   ⎣2 · −1 − 1/4  ⎦   ⎣−9/4⎦

Note that for F, I set the equations equal to zero (as required). Now, what happens with f′(x_0)? That
becomes the Jacobian matrix:

    F′ = ⎡f_x(x, y)  f_y(x, y)⎤ = ⎡2x  2y⎤ = ⎡2 · 2  2 · −1⎤ = ⎡ 4  −2⎤
         ⎣g_x(x, y)  g_y(x, y)⎦   ⎣ y   x⎦   ⎣ −1      2   ⎦   ⎣−1   2⎦

where f_x is the partial derivative of f w.r.t. x, etc. Newton’s method then becomes

    x_1 = x_0 − F′⁻¹ F

which may be rewritten to

    Δx_0 = −F′⁻¹ F

Now, computing inverses is generally not something nice, so we rather write this as

    F′ Δx_0 = −F

Writing all of this out leads to:

    ⎡ 4  −2⎤ ⎡Δx_0⎤ = −⎡ 4  ⎤
    ⎣−1   2⎦ ⎣Δy_0⎦    ⎣−9/4⎦

    4Δx_0 − 2Δy_0 = −4
    −1Δx_0 + 2Δy_0 = 2.25

and so answer D is correct. Please note: you could have solved directly for Δx_0 by computing F′⁻¹, but
that would lead to explicit solutions for Δx_0 and Δy_0: you’d need to plug in these values in each of the
answers to see which set of equations corresponds to this set of solutions.
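
The same computation in a few lines of numpy (a sketch of my own): solving F′ Δx_0 = −F with a linear solver rather than computing the inverse, exactly as recommended above:

    import numpy as np

    def F(v):
        x, y = v
        return np.array([x**2 + y**2 - 1, x * y - 0.25])

    def J(v):                       # the Jacobian matrix F'
        x, y = v
        return np.array([[2 * x, 2 * y],
                         [y,     x   ]])

    v = np.array([2.0, -1.0])             # initial guess (x0, y0)
    delta = np.linalg.solve(J(v), -F(v))  # solve F' dv = -F
    print(delta)                          # (dx0, dy0)
    v = v + delta                         # the first Newton iterate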

Homework 1: Question 16
What is the behaviour of Newton’s method for the equations given in Question 15, and an initial
condition for which 𝑥0 = 𝑦0 ?

A: Converges to a root with a quadratic rate of convergence.


B: Converges to a root with only a linear rate of convergence.
C: Convergence in 1 iteration.
D: Converges to a false root at 𝑥 = 0, 𝑦 = 0.
E: Diverges in 1 iteration.
F: Diverges in several iterations.

The correct answer is E: there are two ways to come to this answer. First of all, in the previous question,
we found

    F′ = ⎡2x  2y⎤
         ⎣ y   x⎦

and Δx_0 = −F′⁻¹F. Now, if x = y, then F′ becomes singular as the determinant 2x² − 2y² equals zero,
so then everything breaks down in the very first iteration. If you didn’t see this, you can also use your reading
comprehension skills to arrive at answer E: first of all, Newton’s method only converges if x_0 is
close enough to x̃. The actual value of x_0 is never specified in this question (only that it equals y_0);
thus there must be a value for x_0 we can pick ourselves which is large enough so that it does not
converge any more (if it ever did converge). So, there must be a value for x_0 for which it diverges,
and thus we conclude that answers A-D must be wrong, as they imply that it is always convergent,
no matter what value x_0 takes. That leaves us with answers E and F, which is mostly a matter of
semantics: an algorithm either diverges or it does not, but it’s not as if it converges for the first few itera-
tions, and then suddenly thinks fuck this shit I’m gonna diverge outta here. Therefore, answer E is correct.

Do note, it can be correct to say something diverges only after several iterations; however, you then need
to be given information on when something is called divergence (for example, you could say that an
algorithm starts to diverge when its absolute error becomes larger than 10). However, if such information
is not given to you, then something either diverges or converges, but it cannot diverge only after several
iterations.



3 Polynomial interpolation in 1d
The process of constructing a smooth function which passes exactly through specified data is called interpolation.
The data points are denoted by (x_i, f_i) for i = 0, 1, ..., n. An interpolating function is called an interpolant
and is a linear combination of prescribed basis functions (such as x, x², etc.). If the basis functions are φ_0(x),
φ_1(x), ..., φ_n(x), then the interpolant will have the form

    φ(x) = a_0 φ_0 + a_1 φ_1 + ⋯ + a_n φ_n = Σ_{i=0}^{n} a_i φ_i(x)

where the a_i are the interpolation coefficients and are constant. We are thus looking for a function φ(x) =
Σ_{i=0}^{n} a_i φ_i which satisfies the interpolation conditions φ(x_i) = f_i for i = 0, ..., n. Note that we have n + 1
interpolation conditions to satisfy, and also n + 1 degrees of freedom a_i that we can change to satisfy the
conditions. If we add more interpolation conditions, the system is overdetermined and unsolvable (unless you’re
lucky); if we have fewer interpolation conditions, the system is underdetermined and there’s no unique solution. Note
that this leads to the system of equations

    a_0 φ_0(x_0) + ⋯ + a_n φ_n(x_0) = f_0
    a_0 φ_0(x_1) + ⋯ + a_n φ_n(x_1) = f_1
                 ⋮
    a_0 φ_0(x_n) + ⋯ + a_n φ_n(x_n) = f_n

where a_0, a_1 etc. are the unknowns (you first decide on appropriate functions φ_0 etc. yourself). This means we
can write this as a matrix equation

    Aa = ⎡φ_0(x_0)  ⋯  φ_n(x_0)⎤ ⎡a_0⎤   ⎡f_0⎤
         ⎢   ⋮      ⋱     ⋮    ⎥ ⎢ ⋮ ⎥ = ⎢ ⋮ ⎥ = f
         ⎣φ_0(x_n)  ⋯  φ_n(x_n)⎦ ⎣a_n⎦   ⎣f_n⎦

for which the solution is simply a = A⁻¹f if A is invertible (that is, det A ≠ 0). det A depends on the chosen
basis functions {φ} and on the data locations {x_i}, but not on the actual data values {f_i}. If det A ≠ 0 for
every selection of n + 1 distinct data points, then the system of basis functions {φ_j(x)} is called unisolvent, a
very nice property indeed. Note that if x_i = x_j (with i ≠ j), then two rows of A will be identical and therefore
det A = 0.
First, we’ll concentrate on the 1d case, i.e. f is only dependent on x (later on, we’ll see what happens if we
have data that is dependent on two variables (e.g. x and y)).

3.1 The monomial basis

This is the most straightforward one. You assume the basis for polynomials

    φ_0(x) = 1
    φ_1(x) = x
    φ_2(x) = x²
        ⋮
    φ_n(x) = x^n

The resulting matrix A is simply

    A = ⎡1  x_0  x_0²  ⋯  x_0^(n−1)  x_0^n⎤
        ⎢⋮   ⋮    ⋮         ⋮         ⋮   ⎥ = V
        ⎣1  x_n  x_n²  ⋯  x_n^(n−1)  x_n^n⎦


and this particular form of the matrix that results from a monomial basis has a special name: the Vandermonde
matrix. It is denoted by the letter V. For example, suppose we have the following problem: we want to construct
a quadratic approximation of f(x) = sin(x) on the interval [0, π].
To do this, we must first choose the nodal locations (the points where the polynomial must go through): we
can only use three points for a quadratic approximation, and we use the interval [0, π], so the best choice is to
use x_0 = 0, x_1 = π/2 and x_2 = π. We then have

    V = ⎡1  x_0  x_0²⎤   ⎡1  0    0   ⎤
        ⎢1  x_1  x_1²⎥ = ⎢1  π/2  π²/4⎥
        ⎣1  x_2  x_2²⎦   ⎣1  π    π²  ⎦

    ⎡1  0    0   ⎤ ⎡a_0⎤   ⎡sin(0)  ⎤   ⎡0⎤
    ⎢1  π/2  π²/4⎥ ⎢a_1⎥ = ⎢sin(π/2)⎥ = ⎢1⎥
    ⎣1  π    π²  ⎦ ⎣a_2⎦   ⎣sin(π)  ⎦   ⎣0⎦

From the first row, we get a_0 = 0. Then we are left with

    ⎡π/2  π²/4⎤ ⎡a_1⎤   ⎡1⎤
    ⎣π    π²  ⎦ ⎣a_2⎦ = ⎣0⎦

Row-reducing the augmented system:

    ⎡π/2  π²/4  1⎤   ⎡π  π²/2   2⎤   ⎡π   0   4⎤
    ⎣π    π²    0⎦ ∼ ⎣0  π²/2  −2⎦ ∼ ⎣0  π²  −4⎦

and thus a_1 = 4/π and a_2 = −4/π², so that (with a_0 = 0) the approximating function is

    p(x) = (4/π) x − (4/π²) x²
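
The same example in a few lines of numpy (my own sketch), building the Vandermonde matrix and solving for the coefficients:

    import numpy as np

    x = np.array([0.0, np.pi / 2, np.pi])   # nodal locations
    f = np.sin(x)

    V = np.vander(x, increasing=True)        # columns 1, x, x^2
    a = np.linalg.solve(V, f)                # coefficients a0, a1, a2
    print(a)                                 # [0, 4/pi, -4/pi^2]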

Homework 2: Question 13
The following data of the velocity of a body is given as a function of time.

    Time (s)        0   15   18   22   24
    Velocity (m/s)  22   24   37   25  123
Approximate the velocity at 16s by interpolating with a linear polynomial.

A: 22.33 C: 25.33 E: 27.33


B: 24.33 D: 26.33 F: 28.33

The correct answer is F: as we are supposed to use a linear polynomial, we are only allowed to use two
nodes. However, five data points are given, so which to use? Well, most logically, we’ll just use t = 15
and t = 18. Then, from interpolation, we find

    v(16) = 24 + [(37 − 24)/(18 − 15)] · (16 − 15) = 28.33

and thus answer F is correct.

3.2 Why interpolation with polynomials?


Various reasons:
• Polynomials can be evaluated using +, − and × only (easy for computer);
• Derivatives and indefinite integrals of polynomials are easy to compute and are polynomials themselves;
• Polynomials are always continuous and infinitely differentiable;
• Univariate polynomial interpolation is always uniquely solvable: there is exactly one polynomial of degree
≤ 𝑛 that passes through 𝑛 + 1 points.


Please note that the set of n + 1 nodes {x_0, x_1, ..., x_n} is called a grid X.

Homework 2: Question 3
A set of 𝑛 data points (𝑥𝑖 , 𝑓𝑖 ), 𝑖 = 0, ..., 𝑛 − 1 is given. What is the minimum degree of polynomial
which is guaranteed to be able to interpolate all these points?

A: 𝑛 + 2 C: 𝑛 E: 𝑛 − 2
B: 𝑛 + 1 D: 𝑛 − 1 F: ∞

The correct answer is D: we have 𝑛 − 1 + 1 = 𝑛 conditions to satisfy, so we must use a polynomial of


degree 𝑛 − 1 to have enough coefficients to satisfy these (if you don’t fully understand what I’m saying,
suppose we have 𝑖 = 0, 1, 2, 3. Then we must use 𝑝 (𝑥) = 𝑎𝑥3 + 𝑏𝑥2 + 𝑐𝑥 + 𝑑 to make a system with
four equations and four unknowns). Thus, the correct answer is D.

Homework 2: Question 4
What is the minimum order of the polynomial that interpolates the following points?

    i      0   1   2   3
    x_i   −2  −1   0   1
    f_i    4   1   0   1
A: 0 C: 2 E: 4
B: 1 D: 3 F: ∞

The correct answer is C: you may be inclined, based on the previous question, to answer D (that a
polynomial of degree 3 is necessary), but question 3 is the minimum number that guarantees you to
be able to interpolate all these points. Looking at this dataset, we actually see that 𝑝 (𝑥) = 𝑥2 perfectly
matches the data: thus, we only need a polynomial of degree 2, and C is the correct answer.

Homework 2: Question 5
Which of the following functions cannot be used as a basis for interpolation?

A: polynomials C: trigonometric
B: rational functions D: all of the above can be used

The correct answer is D: for interpolation, you can use any of these function classes as a basis. However, polynomials
are simply much easier and are therefore used most often.

3.3 Newton polynomial basis

For the monomial basis, we had to invert the matrix V, which we never like to do. Furthermore, if we add a new
node to the data-set, everything has to be re-evaluated, which is inefficient. Therefore, Newton came up with
his own basis, called the Newton basis. For this, we have the basis functions

    π_0(x) = 1,    π_k(x) = ∏_{j=0}^{k−1} (x − x_j),    k = 1, 2, ..., n

so that e.g.

    π_3 = (x − x_0)(x − x_1)(x − x_2)


With coefficients, this can be written as p_n(x) = d_0 π_0(x) + d_1 π_1(x) + ... + d_n π_n(x). What is the advantage of
this? The matrix A now becomes

    A = U = ⎡π_0(x_0)  π_1(x_0)  ⋯  π_n(x_0)⎤
            ⎢   ⋮         ⋮     ⋱     ⋮    ⎥
            ⎣π_0(x_n)  π_1(x_n)  ⋯  π_n(x_n)⎦

          = ⎡1  (x_0 − x_0)  (x_0 − x_0)(x_0 − x_1)  ⋯  (x_0 − x_0) ⋯ (x_0 − x_{n−1})⎤
            ⎢1  (x_1 − x_0)  (x_1 − x_0)(x_1 − x_1)  ⋯  (x_1 − x_0) ⋯ (x_1 − x_{n−1})⎥
            ⎢⋮       ⋮                 ⋮             ⋱               ⋮               ⎥
            ⎣1  (x_n − x_0)  (x_n − x_0)(x_n − x_1)  ⋯  (x_n − x_0) ⋯ (x_n − x_{n−1})⎦

A lot of these entries reduce to zero: everything above the diagonal contains one factor x_j − x_j,
and thus it reduces to zero, meaning we end up with

    U = ⎡1       0                  0           ⋯           0            ⎤
        ⎢1  (x_1 − x_0)             0           ⋯           0            ⎥
        ⎢⋮       ⋮                  ⋮           ⋱           ⋮            ⎥
        ⎣1  (x_n − x_0)  (x_n − x_0)(x_n − x_1)  ⋯  ∏_{j=0}^{n−1}(x_n − x_j)⎦
This matrix makes the linear system particularly easy to solve. For example, consider again f(x) = sin(x) on
the interval [0, π] with nodes at x = (0, π/2, π). The Newton basis then depends only on the nodes x_i:

    π_0(x) = 1
    π_1(x) = (x − x_0) = (x − 0) = x
    π_2(x) = (x − x_0)(x − x_1) = (x − 0)(x − π/2) = x(x − π/2)

and we then have

    U = ⎡1  x_0  x_0(x_0 − π/2)⎤   ⎡1  0    0 · (0 − π/2)      ⎤   ⎡1  0    0   ⎤
        ⎢1  x_1  x_1(x_1 − π/2)⎥ = ⎢1  π/2  π/2 · (π/2 − π/2)⎥ = ⎢1  π/2  0   ⎥
        ⎣1  x_2  x_2(x_2 − π/2)⎦   ⎣1  π    π · (π − π/2)      ⎦   ⎣1  π    π²/2⎦

and thus we merely need to solve

    ⎡1  0    0   ⎤ ⎡d_0⎤   ⎡sin(0)  ⎤   ⎡0⎤
    ⎢1  π/2  0   ⎥ ⎢d_1⎥ = ⎢sin(π/2)⎥ = ⎢1⎥
    ⎣1  π    π²/2⎦ ⎣d_2⎦   ⎣sin(π)  ⎦   ⎣0⎦

Now, from the first row, we clearly have d_0 = 0. For the second row, we then simply have d_1 = 1/(π/2) = 2/π,
and from the third row, we then have

    π · (2/π) + (π²/2) d_2 = 0
    d_2 = −4/π²

and thus

    p(x) = d_0 π_0(x) + d_1 π_1(x) + d_2 π_2(x) = 0 + (2/π) x − (4/π²) x(x − π/2) = (2/π) x − (4/π²) x² + (2/π) x = (4/π) x − (4/π²) x²

This is exactly the same polynomial that the monomial interpolation gave us: it does not matter whether you use
monomial interpolation or Newton’s interpolation; as long as you choose the same nodes, you end up with the
same polynomial.
Note that Newton’s interpolation is the easiest to do by hand (e.g. on the exam). In general, just remember
that

FORMULAS       The degree-2 polynomial is written as
FOR
NEWTON’S IN-       p_2(x) = d_0 + d_1 (x − x_0) + d_2 (x − x_0)(x − x_1)         (3.1)
TERPOLATION
               The matrix equation is

                   ⎡1       0                  0           ⎤ ⎡d_0⎤   ⎡f_0⎤
                   ⎢1  (x_1 − x_0)             0           ⎥ ⎢d_1⎥ = ⎢f_1⎥         (3.2)
                   ⎣1  (x_2 − x_0)  (x_2 − x_0)(x_2 − x_1)⎦ ⎣d_2⎦   ⎣f_2⎦
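
A small Python sketch of my own that builds the lower-triangular matrix U of equation (3.2) for any number of nodes and solves for the coefficients d, using the sin(x) example to check:

    import numpy as np

    def newton_coeffs(x, f):
        # Build the lower-triangular matrix U and solve U d = f.
        n = len(x)
        U = np.zeros((n, n))
        for i in range(n):
            U[i, 0] = 1.0
            for k in range(1, i + 1):
                U[i, k] = np.prod(x[i] - x[:k])   # pi_k evaluated at x_i
        return np.linalg.solve(U, f)

    x = np.array([0.0, np.pi / 2, np.pi])
    print(newton_coeffs(x, np.sin(x)))   # [0, 2/pi, -4/pi^2]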

Homework 2: Question 6
Three data points (𝑥𝑖 , 𝑓𝑖 ) are given in the table below. The Newton representation of the interpolation
polynomial is 𝑝 (𝑥) = 𝑑0 + 𝑑1 𝑥 + 𝑑2 𝑥 (𝑥 − 1). Determine the coefficient 𝑑2 .

𝑖 0 1 2
𝑥𝑖 0 1 2
𝑓𝑖 4 3 1
A: 𝑑2 = 4 C: 𝑑2 = −1∕3 E: 𝑑2 = −2
B: 𝑑2 = 3 D: 𝑑2 = 1∕3 F: 𝑑2 = −1∕2

The correct answer is F: remember that we must solve

    ⎡1       0                  0           ⎤ ⎡d_0⎤   ⎡f_0⎤
    ⎢1  (x_1 − x_0)             0           ⎥ ⎢d_1⎥ = ⎢f_1⎥
    ⎣1  (x_2 − x_0)  (x_2 − x_0)(x_2 − x_1)⎦ ⎣d_2⎦   ⎣f_2⎦

    ⎡1    0          0     ⎤ ⎡d_0⎤   ⎡4⎤
    ⎢1  (1 − 0)      0     ⎥ ⎢d_1⎥ = ⎢3⎥
    ⎣1  (2 − 0)  (2 − 0)(2 − 1)⎦ ⎣d_2⎦   ⎣1⎦

    ⎡1  0  0⎤ ⎡d_0⎤   ⎡4⎤
    ⎢1  1  0⎥ ⎢d_1⎥ = ⎢3⎥
    ⎣1  2  2⎦ ⎣d_2⎦   ⎣1⎦

From the first row, we see d_0 = 4. Then the second row gives the equation 1 · 4 + 1 · d_1 = 3 and thus
d_1 = −1. Then, from the third row, we have

    1 · 4 + 2 · −1 + 2 · d_2 = 1

and thus d_2 = −1/2 and thus answer F is correct.

Homework 2: Question 14
The following data of the velocity of a body is given as a function of time.

    Time (s)        0   1   3   4   5
    Velocity (m/s)  10  11  14  25  77

Approximate the velocity at 2 s by interpolating with a quadratic polynomial using the first three data
points. [Tip: The calculation is easiest with a Newton basis.]

A: 12.1333 C: 12.3333 E: 12.5333


B: 12.2333 D: 12.4333 F: 12.6333

The correct answer is C: we have p(x) = d_0 + d_1 (x − x_0) + d_2 (x − x_0)(x − x_1) = d_0 + d_1 x +
d_2 x(x − 1). Furthermore, remember that we must solve

    ⎡1       0                  0           ⎤ ⎡d_0⎤   ⎡f_0⎤
    ⎢1  (x_1 − x_0)             0           ⎥ ⎢d_1⎥ = ⎢f_1⎥
    ⎣1  (x_2 − x_0)  (x_2 − x_0)(x_2 − x_1)⎦ ⎣d_2⎦   ⎣f_2⎦

    ⎡1    0          0     ⎤ ⎡d_0⎤   ⎡10⎤
    ⎢1  (1 − 0)      0     ⎥ ⎢d_1⎥ = ⎢11⎥
    ⎣1  (3 − 0)  (3 − 0)(3 − 1)⎦ ⎣d_2⎦   ⎣14⎦

    ⎡1  0  0⎤ ⎡d_0⎤   ⎡10⎤
    ⎢1  1  0⎥ ⎢d_1⎥ = ⎢11⎥
    ⎣1  3  6⎦ ⎣d_2⎦   ⎣14⎦

From the first row, we easily see that d_0 = 10. From the second row, we then straightforwardly
have d_1 = 1. From the final row, we must then solve

    1 · 10 + 3 · 1 + 6d_2 = 14

and thus d_2 = 0.1667. Now we have

    p_2(x) = 10 + x + 0.1667 x(x − 1)
    p_2(2) = 10 + 2 + 0.1667 · 2 · (2 − 1) = 12.3333

and thus answer C is correct.

Homework 2: Question 16
The following x, y data is given:

    x   15  18  22
    y   24  37  25

The value of the Newton coefficient d_2 is

A: −1.0476          B: −4.3333          C: −3.0000          D: −0.1429

The correct answer is A: we have to solve the matrix equation

    ⎡1       0                  0           ⎤ ⎡d_0⎤   ⎡f_0⎤
    ⎢1  (x_1 − x_0)             0           ⎥ ⎢d_1⎥ = ⎢f_1⎥
    ⎣1  (x_2 − x_0)  (x_2 − x_0)(x_2 − x_1)⎦ ⎣d_2⎦   ⎣f_2⎦

    ⎡1      0            0       ⎤ ⎡d_0⎤   ⎡24⎤
    ⎢1  (18 − 15)        0       ⎥ ⎢d_1⎥ = ⎢37⎥
    ⎣1  (22 − 15)  (22 − 15)(22 − 18)⎦ ⎣d_2⎦   ⎣25⎦

    ⎡1  0   0⎤ ⎡d_0⎤   ⎡24⎤
    ⎢1  3   0⎥ ⎢d_1⎥ = ⎢37⎥
    ⎣1  7  28⎦ ⎣d_2⎦   ⎣25⎦

From the first row, we obviously have d_0 = 24. From the second one, we then have d_1 = 13/3 = 4.3333.
We then have for the third row that

    1 · 24 + 7 · 4.3333 + 28 d_2 = 25

and thus d_2 = −1.0476 and thus answer A is correct.


3.4 Lagrange polynomial basis

There’s another polynomial basis we can use; again, this leads to exactly the same polynomial as the two
methods before. It can be written as follows: the interpolant equals

    p_n(x) = fᵀ l(x)

i.e. simply the dot product of f and l(x). Now, what are those vectors? The entries of f are simply f(x_i). l(x)
consists of the polynomials formed by

    l_i(x) = ∏_{j=0, j≠i}^{n} (x − x_j)/(x_i − x_j),    i = 0, ..., n

As you probably have absolutely no idea what this means, let’s just do an example. Suppose we have the data
shown in table 3.1. Find the Lagrange interpolation polynomial which agrees with this data set, and use it to
estimate f(2.5).

Table 3.1: Some data set.

    i        0  1  2  3
    x_i      0  1  3  4
    f(x_i)   3  2  1  0

We then rather simply have

    p(x) = f(x_0) l_0(x) + f(x_1) l_1(x) + f(x_2) l_2(x) + f(x_3) l_3(x) = 3 l_0(x) + 2 l_1(x) + 1 l_2(x) + 0 l_3(x)

Now we want to know l_0 etc. We have

    l_0(x) = ∏_{j=0, j≠0}^{n} (x − x_j)/(x_0 − x_j)

x_0 = 0, and j = 1, 2, 3. We then have

    l_0(x) = [(x − x_1)/(0 − x_1)] [(x − x_2)/(0 − x_2)] [(x − x_3)/(0 − x_3)] = (x − 1)(x − 3)(x − 4)/(−1 · −3 · −4) = −(1/12)(x − 1)(x − 3)(x − 4)

Similarly, we have

    l_1(x) = [(x − 0)/(1 − 0)] [(x − 3)/(1 − 3)] [(x − 4)/(1 − 4)] = (1/6) x(x − 3)(x − 4)
    l_2(x) = [(x − 0)/(3 − 0)] [(x − 1)/(3 − 1)] [(x − 4)/(3 − 4)] = −(1/6) x(x − 1)(x − 4)
    l_3(x) = [(x − 0)/(4 − 0)] [(x − 1)/(4 − 1)] [(x − 3)/(4 − 3)] = (1/12) x(x − 1)(x − 3)

Plugging all of this in leads to

    p(x) = (−x³ + 6x² − 17x + 36)/12
    p(2.5) = 1.28125

These are exactly the same results as the other two methods would have given you.
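
The same worked example in Python (a sketch of my own), evaluating the Lagrange form directly from its definition:

    import numpy as np

    def lagrange_eval(x_nodes, f_nodes, x):
        total = 0.0
        for i, (xi, fi) in enumerate(zip(x_nodes, f_nodes)):
            # l_i(x): product over all j != i of (x - x_j) / (x_i - x_j)
            li = np.prod([(x - xj) / (xi - xj)
                          for j, xj in enumerate(x_nodes) if j != i])
            total += fi * li
        return total

    print(lagrange_eval([0, 1, 3, 4], [3, 2, 1, 0], 2.5))   # 1.28125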


Homework 2: Question 15
Determine the Lagrange interpolation polynomial given the following data set:

    i        0   1   2    3
    x_i     −1   1   3    5
    f(x_i)  −6   0  −2  −12

A: p(x) = −x² + 3x − 2
B: p(x) = −4x² + 10x − 6
C: p(x) = (1/8)x³ − (9/8)x² + (23/8)x − 15/8
D: p(x) = (5/24)x³ − (39/24)x² + (67/24)x − 33/24

The correct answer is A: the easy way to find the answer is by simply checking all of the formulas
whether they lead to the correct data points; doing so will lead to A being the only correct polyno-
mial. Alternatively, you can just do it the hard way (there are four nodes, thus you need a third order
polynomial):


\[
p_3(x) = f_0 \prod_{j=0,\, j \neq 0}^{3} \frac{x - x_j}{x_0 - x_j} + f_1 \prod_{j=0,\, j \neq 1}^{3} \frac{x - x_j}{x_1 - x_j} + f_2 \prod_{j=0,\, j \neq 2}^{3} \frac{x - x_j}{x_2 - x_j} + f_3 \prod_{j=0,\, j \neq 3}^{3} \frac{x - x_j}{x_3 - x_j}
\]

where 𝑓0 = −6, 𝑓1 = 0, 𝑓2 = −2 and 𝑓3 = −12. Working out the products (as 𝑓1 = 0, we don’t need to
deal with the second product):


\[
\prod_{j=0,\, j \neq 0}^{3} \frac{x - x_j}{x_0 - x_j} = \frac{(x-1)(x-3)(x-5)}{(-1-1)(-1-3)(-1-5)} = \frac{x^3 - 9x^2 + 23x - 15}{-48}
\]
\[
\prod_{j=0,\, j \neq 2}^{3} \frac{x - x_j}{x_2 - x_j} = \frac{(x+1)(x-1)(x-5)}{(3+1)(3-1)(3-5)} = \frac{x^3 - 5x^2 - x + 5}{-16}
\]
\[
\prod_{j=0,\, j \neq 3}^{3} \frac{x - x_j}{x_3 - x_j} = \frac{(x+1)(x-1)(x-3)}{(5+1)(5-1)(5-3)} = \frac{x^3 - 3x^2 - x + 3}{48}
\]
and thus
\[
p_3(x) = -6 \cdot \frac{x^3 - 9x^2 + 23x - 15}{-48} - 2 \cdot \frac{x^3 - 5x^2 - x + 5}{-16} - 12 \cdot \frac{x^3 - 3x^2 - x + 3}{48}
\]
\[
= \frac{x^3 - 9x^2 + 23x - 15}{8} + \frac{x^3 - 5x^2 - x + 5}{8} - \frac{x^3 - 3x^2 - x + 3}{4} = \frac{2x^3 - 14x^2 + 22x - 10}{8} - \frac{2x^3 - 6x^2 - 2x + 6}{8} = \frac{-8x^2 + 24x - 16}{8} = -x^2 + 3x - 2
\]
but honestly, you should just find the answer by trial and error (a short script automating this check follows below), or by seeing that the polynomial must be quadratic, since the differences between successive data values decrease linearly (+6, −2, −10), so the derivative must be linear; then realize that it must have a line of symmetry at 𝑥 = 1.5 and construct the appropriate function (but this is still harder than just trial and error).
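A minimal sketch of that trial-and-error check, assuming numpy (the dictionary of candidates is of course just this question's answer options):

```python
import numpy as np

x = np.array([-1.0, 1.0, 3.0, 5.0])
f = np.array([-6.0, 0.0, -2.0, -12.0])

# Candidate answers as coefficient lists, highest degree first (np.polyval order)
candidates = {
    "A": [-1, 3, -2],
    "B": [-4, 10, -6],
    "C": [1/8, -9/8, 23/8, -15/8],
    "D": [5/24, -39/24, 67/24, -33/24],
}
for name, coeffs in candidates.items():
    if np.allclose(np.polyval(coeffs, x), f):
        print(name, "reproduces the data")  # only A does
```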

3.5 Chebychev polynomials

Chebychev polynomials are the last set of polynomials you need to understand. However, before we dive into them, we need to understand why we're even bothering with them, because this method is actually objectively better

than the other methods so far. So, what makes the previous three methods bad? Did you study them all for
nothing?

3.5.1 Interpolation error

Figure 3.1: Interpolation error.

Look at figure 3.1: the black dots are the datapoints we use to determine the polynomial for any of the previous three methods. Intuitively, you'd say that the more datapoints, the better: the polynomial can better approximate the real function. However, this is actually untrue: the more datapoints you add, the wilder the behaviour near the ends of the interval becomes (as already visible in figure 3.1). This is obviously undesirable, and we want to keep that error as small as possible. For that, let's first analyse the interpolation error. For this, we define the interpolation error following Cauchy:

DEFINITION (CAUCHY ERROR): If 𝑓 ∈ 𝐶^{𝑛+1}([𝑎, 𝑏]) (that is, on the interval [𝑎, 𝑏], it is 𝑛 + 1 times continuously differentiable), then for any grid 𝑋 of 𝑛 + 1 nodes and for any 𝑥 ∈ [𝑎, 𝑏], the interpolation error at 𝑥 is
\[
R_n(f; x) = f(x) - p_n(x) = \frac{f^{(n+1)}(\xi)}{(n+1)!}\,\omega_{n+1}(x) \tag{3.3}
\]
where 𝜉 is a certain value on the interval [𝑎, 𝑏] and 𝜔𝑛+1(𝑥) is the nodal polynomial associated with the grid 𝑋, i.e.
\[
\omega_{n+1}(x) = \prod_{i=0}^{n} \left(x - x_i\right) \tag{3.4}
\]

This is nice and all, but what's the use of this? It tells us that if we want to know the maximum value (the upper bound) of the error, we simply take the value of 𝜉 that maximizes it:
\[
\left|R_n(f; x)\right| \leq \max_{\xi \in [a,b]} \left|f^{(n+1)}(\xi)\right| \frac{\left|\omega_{n+1}(x)\right|}{(n+1)!}
\]
Now, we want this to be as small as possible, and how can we do that? 𝑓^(𝑛+1) is a derivative of the "true" function, so we can't adjust it ourselves. The only thing we can change is 𝜔𝑛+1:

\[
\omega_{n+1}(x) = \prod_{i=0}^{n} \left(x - x_i\right)
\]
By picking different values for 𝑥𝑖 (this assumes that we can choose the 𝑥𝑖 ourselves and that they are not just some fixed data set we are given), we can actually minimize 𝜔𝑛+1(𝑥) and thus the error. For this, we use Chebychev polynomials: they allow us to choose the 𝑥𝑖 such that ‖𝜔𝑛+1‖∞ is minimized (‖𝜔𝑛+1‖∞ means the maximum absolute value that 𝜔𝑛+1 takes). We then use one of the previous three methods to find the polynomial that fits the associated data points. In other words, Chebychev polynomials are not themselves used to form a basis for the interpolant; rather, they are used to find the data points that minimize the interpolation error, where the interpolation itself is performed using one of the previous methods. The sketch below illustrates the effect.
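To see the effect numerically, here's an illustrative Python sketch (not from the reader) comparing the maximum interpolation error on equidistant and Chebychev-Gauss grids, using Runge's classic test function 1∕(1 + 25𝑥²): the equidistant error grows as nodes are added, while the Chebychev error shrinks.

```python
import numpy as np

def runge(x):
    return 1.0 / (1.0 + 25.0 * x**2)   # Runge's classic test function

def lagrange_eval(x_nodes, f_nodes, x):
    """Evaluate the Lagrange interpolant through the nodes at the points x."""
    p = np.zeros_like(x)
    for i, xi in enumerate(x_nodes):
        li = np.ones_like(x)
        for j, xj in enumerate(x_nodes):
            if j != i:
                li *= (x - xj) / (xi - xj)
        p += f_nodes[i] * li
    return p

x_fine = np.linspace(-1.0, 1.0, 2001)
for n in (5, 10, 15, 20):
    x_eq = np.linspace(-1.0, 1.0, n + 1)                # equidistant nodes
    i = np.arange(1, n + 2)
    x_ch = np.cos((2 * i - 1) / (2 * (n + 1)) * np.pi)  # Chebychev-Gauss nodes
    err_eq = np.max(np.abs(lagrange_eval(x_eq, runge(x_eq), x_fine) - runge(x_fine)))
    err_ch = np.max(np.abs(lagrange_eval(x_ch, runge(x_ch), x_fine) - runge(x_fine)))
    print(f"n = {n:2d}:  equidistant {err_eq:9.2e},  Chebychev {err_ch:9.2e}")
```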

3.5.2 Chebychev’s polynomial

Now, first, what the hell is a Chebychev polynomial? Actually, it's wonderfully easy:
\[
T_n(x) = \cos\left(n \arccos(x)\right)
\]


We can also find them recursively. We have 𝑇0(𝑥) = cos(0 · arccos(𝑥)) = 1 and 𝑇1(𝑥) = cos(1 · arccos(𝑥)) = 𝑥; the higher-degree Chebychev polynomials can be obtained by use of the

RECURSIVE FORMULA FOR CHEBYCHEV POLYNOMIALS:
\[
T_{n+1}(x) = 2x\,T_n(x) - T_{n-1}(x), \qquad n = 1, 2, \ldots \tag{3.5}
\]
For example:
\[
T_2(x) = 2x\,T_1(x) - T_0(x) = 2x \cdot x - 1 = 2x^2 - 1
\]
\[
T_3(x) = 2x\,T_2(x) - T_1(x) = 2x\left(2x^2 - 1\right) - x = 4x^3 - 2x - x = 4x^3 - 3x
\]
and so on. Note that 𝑇𝑛 (𝑥) is a polynomial of degree 𝑛. Now, how do we find the values 𝑥𝑖 that we must use to
minimize the interpolation error? First, note that 𝑇𝑛 has 𝑛 distinct zeros, which are all located inside the interval
[−1, 1].

FORMULA (ZEROS OF A 𝑇𝑛(𝑥) POLYNOMIAL): For a 𝑇𝑛(𝑥) polynomial, the zeros are given by
\[
\xi_i = \cos\left(\frac{2i - 1}{2n}\,\pi\right), \qquad i = 1, \ldots, n \tag{3.6}
\]
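A minimal Python sketch (my own, using numpy's polynomial helpers) that generates 𝑇𝑛 via recursion (3.5) and confirms that formula (3.6) indeed gives its zeros:

```python
import numpy as np
from numpy.polynomial import polynomial as P

def chebyshev_T(n):
    """Coefficient array of T_n (lowest degree first), built with the
    recursion T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x)."""
    T_prev, T_curr = np.array([1.0]), np.array([0.0, 1.0])  # T_0 = 1, T_1 = x
    if n == 0:
        return T_prev
    for _ in range(n - 1):
        T_prev, T_curr = T_curr, P.polysub(P.polymulx(2.0 * T_curr), T_prev)
    return T_curr

n = 4
print(chebyshev_T(n))                          # [ 1.  0. -8.  0.  8.]  ->  8x^4 - 8x^2 + 1
i = np.arange(1, n + 1)
zeros = np.cos((2 * i - 1) / (2 * n) * np.pi)  # formula (3.6)
print(np.allclose(P.polyval(zeros, chebyshev_T(n)), 0.0))  # True
```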

Now, the big problem so far is that Chebychev polynomials are only valid on the interval [−1, 1], but our data will usually be on a different interval, call it [𝑎, 𝑏]. Then, how do we determine what 𝑥0, 𝑥1, ..., 𝑥𝑛 we must use (these lie on the interval [𝑎, 𝑏])? For this, we use the

TRANSFORMATION OF ROOTS:
\[
x_{n+1-i} = \frac{b + a}{2} + \frac{b - a}{2}\,\xi_i, \qquad i = 1, \ldots, n + 1 \tag{3.7}
\]
Please note: this equation is slightly different from what the reader says; the reader simply uses 𝑛 instead of 𝑛 + 1 (both in the subscript of 𝑥 and in the counting of 𝑖). The reason I have written it like this is the following: suppose you have 𝑥0, ..., 𝑥𝑛. If you left out the +1, it would seem as if you could only compute up to 𝑥𝑛−1, since 𝑖 = 1 is the minimum value for 𝑖. Adding 1 negates this problem.

Let’s just do an example to clarify everything: suppose we want to interpolate a function 𝑓 (𝑥) on the interval
[6, 10] by a degree-4 univariate polynomial using a Chebychev-Gauss grid (but to set up the final polynomial,
we’ll use maybe Newton’s polynomial basis; this is merely setting up what values 𝑥0 , 𝑥1 ,...𝑥𝑛 we must use to
make the best approximation). Compute the grid nodes.
Now, this means that we need to find 𝑥0 , 𝑥1 etc. As we are asked for a degree-4, i.e. 𝑛 = 4, polynomial, we
need five nodes, thus we must use 𝑇5 (𝑥)1 . Now, the formula 𝑇5 (𝑥) itself does not matter: we already know that
the zeros equal
( )
2𝑖 − 1
𝜉𝑖 = cos 𝜋 , 𝑖 = 1, ..., 5
10
Then, these zeros are transformed into [6,10] using
10 + 6 10 − 6
𝑥4+1−𝑖 = + 𝜉𝑖 = 8 + 2𝜉𝑖 , 𝑖 = 1, ..., 4 + 1
2 2
Thus, for example, the first zero equals
( )
2⋅1−1
𝜉1 = cos 𝜋 = 0.951
10
and the corresponding 𝑥4+1−1 = 𝑥4 equals
𝑥4 = 8 + 2𝜉1 = 8 + 2 ⋅ 0.951 = 9.902
Doing this for the other four nodes results in the data shown in table 3.2. and thus we have 𝑥0 = 6.098,
1 Although the question asks for a degree-4 polynomial, this does not mean we need to use 𝑇 ! We use Chebychev’s polynomial merely
4
to find the 5 nodes, for which 5 zeros are necessary, for which 𝑇5 is necessary.


Table 3.2: 5-point Chebychev-Gauss grid on [6,10].

𝑖     𝜉𝑖       𝑥4+1−𝑖
1     0.951    9.902
2     0.588    9.176
3     0.000    8.000
4    −0.588    6.824
5    −0.951    6.098

Thus we have 𝑥0 = 6.098, 𝑥1 = 6.824, 𝑥2 = 8.000, 𝑥3 = 9.176 and 𝑥4 = 9.902. In practice, we could then continue by applying e.g.
Newton polynomial basis to find an approximation based on these nodes. Now, there are two small problems
with Chebychev-Gauss grids:
• Extrapolation is even more disastrous than using equidistant nodes.
• In practice, it may be difficult to obtain the data 𝑓𝑖 measured at the Chebychev points (for example if you
are simply given a data set rather than a function). Therefore, in general, if it is not possible to choose the
Chebychev-Gauss grid, choose the grid in such a way that there are more nodes towards the endpoints of
the interval [a,b] to minimize the interpolation error.
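The whole grid computation fits in a few lines of Python; this sketch (the function name is my own) reproduces table 3.2:

```python
import numpy as np

def chebyshev_gauss_grid(a, b, n):
    """Return the n+1 nodes x_0 < ... < x_n on [a, b]: the zeros of
    T_{n+1}, mapped with x = (b + a)/2 + (b - a)/2 * xi (formula (3.7))."""
    i = np.arange(1, n + 2)                           # i = 1, ..., n+1
    xi = np.cos((2 * i - 1) / (2 * (n + 1)) * np.pi)  # zeros of T_{n+1}
    return np.sort((b + a) / 2 + (b - a) / 2 * xi)

print(chebyshev_gauss_grid(6.0, 10.0, 4))
# [6.098 6.824 8.    9.176 9.902]   (table 3.2)
```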

Homework 2: Question 1
The problem of univariate polynomial interpolation using a grid 𝑁 of distinct points

A: Has no solution
B: Has always a unique solution
C: Has a unique solution if the Chebyshev grid is used
D: Has a unique solution if the underlying function 𝑓 is continuously differentiable
E: Has a unique solution if the Chebyshev grid is used and the underlying function is continuously
differentiable

The correct answer is B: for a grid of distinct nodes, the Vandermonde matrix is invertible, so the interpolating polynomial always exists and is unique, regardless of the grid or the smoothness of 𝑓.

Homework 2: Question 2
Univariate polynomial interpolation converges

A: Never
B: Always
C: Only for equidistant grids 𝑋 if the underlying function is sufficiently smooth
D: Only for non-equidistant grids 𝑋 if the underlying function is sufficiently smooth
E: Always for a Chebyshev grid
F: Always for a Chebyshev grid if the underlying function is sufficiently smooth

The correct answer is F: convergence requires both a suitable grid (such as a Chebyshev grid) and a sufficiently smooth underlying function.

Homework 2: Question 7
Given the function 𝑓(𝑥) = cos(𝜋𝑥) for 𝑥 ∈ [0, 1∕2]. Let 𝑝(𝑥) be the polynomial which interpolates 𝑓(𝑥) at 𝑥0 = 0 and 𝑥1 = 1∕2. Determine the upper bound of the error 𝑅(𝑓; 𝑥) = 𝑓(𝑥) − 𝑝(𝑥) at 𝑥 = 1∕4. Hint: suppose 𝑓 ∈ 𝐶^{𝑛+1}([𝑎, 𝑏]). For any grid 𝑋 of 𝑛 + 1 nodes with 𝑎 = 𝑥0 < 𝑥1 < ... < 𝑥𝑛 = 𝑏 the interpolation error is
\[
R(f; x) = \frac{f^{(n+1)}(\xi)}{(n+1)!} \prod_{j=0}^{n} \left(x - x_j\right), \qquad \min_i\left(x_i, x\right) < \xi < \max_i\left(x_i, x\right)
\]

samvanelsloo97@icloud.com
CHAPTER 3. POLYNOMIAL INTERPOLATION IN 1D 36

A: 0.31    B: 0.62    C: 4.93    D: 0.03    E: 0.22    F: 0.02

The correct answer is A: from the text, we have that 𝑛 = 1 (our last node is 𝑥1 after all). For 𝑓(𝑥) = cos(𝜋𝑥), we have
\[
f''(\xi) = -\pi^2 \cos(\pi \xi)
\]
Furthermore, for 𝑥 = 1∕4, we have
\[
\prod_{j=0}^{1} \left(x - x_j\right) = \left(x - x_0\right)\left(x - x_1\right) = \left(\frac{1}{4} - 0\right)\left(\frac{1}{4} - \frac{1}{2}\right) = -0.0625
\]
Thus, the interpolation error is given by
\[
R\left(\cos \pi x;\, \frac{1}{4}\right) = \frac{f''(\xi)}{(1+1)!} \cdot (-0.0625) = -0.03125 \cdot \left(-\pi^2 \cos(\pi \xi)\right) = 0.3084 \cos(\pi \xi)
\]
The maximum value for this is attained at 𝜉 = 0, for which the interpolation error is 0.31, thus answer A is correct (note: the only restriction on 𝜉 is that it must be between 0 and 0.5).
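A quick numerical sanity check of this maximization (a sketch, not part of the official solution):

```python
import numpy as np

# R(cos(pi x); 1/4) = -0.03125 * f''(xi) = 0.3084 cos(pi xi),  0 <= xi <= 1/2
xi = np.linspace(0.0, 0.5, 1001)
R = -0.03125 * (-np.pi**2 * np.cos(np.pi * xi))
print(np.max(np.abs(R)))  # 0.3084..., attained at xi = 0
```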

Homework 2: Question 8
When interpolating a smooth function with polynomials on an interval [−1, 1], a grid 𝑋 = (𝑥0 , 𝑥1 , ..., 𝑥𝑛 )
is chosen. Which of the following choices of grid 𝑥𝑖 do you expect to give a result with minimum
interpolation error for (i) a Lagrange basis, and (ii) a monomial basis?
1. scattered
2. equidistant
3. a higher concentration of points around the center of the domain
4. a higher concentration of points near the edges of the domain
A: (i) 1, (ii) 3. C: (i) 2, (ii) 2. E: (i) 3, (ii) 2. G: (i) 4, (ii) 3.
B: (i) 2, (ii) 1. D: (i) 2, (ii) 3. F: (i) 4, (ii) 2. H: (i) 4, (ii) 4.

The correct answer is H: this is literally what this entire section was about: concentrate your datapoints near the edges of the domain. Furthermore, the Lagrange and monomial bases result in exactly the same polynomial, so it would make no sense at all for them to require different choices of grid.

Homework 2: Question 9
Suppose we interpolate a function 𝑓(𝑥) on the interval [1, 5] using a cubic polynomial. The grid 𝑋 is given by 1 < 𝑥0 < 𝑥1 < ... < 𝑥𝑛 < 5. We want to use a Chebyshev grid. Compute the node 𝑥0. Hint: the zeros of the Chebyshev polynomial of degree 𝑚, 𝑇𝑚(𝑥), are given by
\[
\xi_i = \cos\left(\frac{2i - 1}{2m}\,\pi\right), \qquad i = 1, \ldots, m
\]

A: −0.924    B: 2.618    C: 4.848    D: 2.235    E: 1.152    F: 1.000    G: 1.098    H: 1.268

The correct answer is E: we want a cubic polynomial, so we need four nodes, and thus four zeros, and thus we need the fourth-degree Chebychev polynomial, i.e. 𝑇4(𝑥) and thus 𝑚 = 4. For 𝑥0, we must then use the fourth root as well, so we have
\[
\xi_4 = \cos\left(\frac{2 \cdot 4 - 1}{2 \cdot 4}\,\pi\right) = -0.92388
\]

©Sam van Elsloo


37 3.5. CHEBYCHEV POLYNOMIALS

We then have
\[
x_0 = \frac{b + a}{2} + \frac{b - a}{2}\,\xi_4 = \frac{5 + 1}{2} + \frac{5 - 1}{2} \cdot (-0.92388) = 1.152
\]
and thus E is the correct answer.

Homework 2: Question 10
Consider a function 𝑓 (𝑥) defined on the interval [0,1], which is approximated by Lagrange interpolant
constructed on a uniform grid with 𝑛 + 1 points 𝑥𝑖 = 𝑖∕𝑛. Which of the following statements is true in
this situation?
1. Any number of data points 𝑛 can be interpolated exactly
2. The interpolating polynomial passes through the given data points.
3. As 𝑛 → ∞ we can be certain that the error |𝑝 (𝑥) − 𝑓 (𝑥) | → 0 for all 𝑥 ∈ ℝ.
4. As 𝑛 → ∞ we can be certain that the error |𝑝 (𝑥) − 𝑓 (𝑥) | → 0 for all 𝑥 ∈ [0, 1].
A: 1, 2 C: 1, 4 E: 1, 2, 3 G: 2, 3, 4
B: 2, 3 D: 3, 4 F: 1, 3, 4 H: 1, 2, 3, 4

The correct answer is A: statements 1 and 2 simply say that the interpolant passes exactly through the given data points, which is always true. The error, however, does not converge to zero if you keep on increasing the number of data points; that was basically the point of this section.

Homework 2: Question 11
A set of data points (𝑥𝑖 , 𝑓𝑖 ) is interpolated with polynomial interpolation. Which of the following has an
influence on the interpolating polynomial?
1. The choice of basis (monomial, Lagrange, Newton, etc.)
2. The ordering of the points 𝑥𝑖 .
3. The locations 𝑥𝑖 .
4. The values 𝑓𝑖 .
A: 1, 2 C: 1, 4 E: 1, 2, 3 G: 2, 3, 4
B: 2, 3 D: 3, 4 F: 1, 3, 4 H: 1, 2, 3, 4

The correct answer is D: 2 is obviously wrong, 4 is obviously correct. 1 is wrong: whatever basis you choose, you end up with exactly the same polynomial. 3 is correct: choosing the 𝑥𝑖 closer to the endpoints of the interval will result in a smaller approximation error (that was the entire point this section was making).

Homework 2: Question 12
Consider the function 𝑓(𝑥) ∶= cos(𝜋𝑥) defined for 𝑥 ∈ [0, 1∕2]. Let 𝑝1(𝑥) be a first-order polynomial interpolating 𝑓(𝑥) at the nodes 𝑥0 = 0 and 𝑥1 = 1∕2. What is the exact error in the interpolant 𝜖(𝑥) = |𝑓(𝑥) − 𝑝1(𝑥)| at 𝑥 = 1∕4?

A: 0.2071    B: 0.6169    C: 4.9348    D: 0.0312    E: 0.2181    F: 0.0221

The correct answer is A: note that this time you do not have to use that fancy formula, as it merely provides an upper bound on the interpolation error over the interval, not the exact error for a certain value of 𝑥. Interpolating with a first-order polynomial is easy: we have 𝑓(0) = cos(𝜋 · 0) = 1 and 𝑓(1∕2) = cos(𝜋 · 1∕2) = 0. Thus, the interpolation function is simply 𝑝1(𝑥) = 1 − 2𝑥 (because

samvanelsloo97@icloud.com
CHAPTER 3. POLYNOMIAL INTERPOLATION IN 1D 38

this formula goes through those two points). We have


\[
f\left(\frac{1}{4}\right) = \cos\left(\frac{\pi}{4}\right) = 0.7071, \qquad p_1\left(\frac{1}{4}\right) = 1 - 2 \cdot \frac{1}{4} = 0.5
\]
\[
\epsilon\left(\frac{1}{4}\right) = \left|f\left(\frac{1}{4}\right) - p_1\left(\frac{1}{4}\right)\right| = |0.7071 - 0.5| = 0.2071
\]
and thus answer A is correct.
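This is easily verified numerically (again a sketch, not part of the official solution); note how the exact error 0.2071 indeed stays below the Cauchy upper bound of 0.31 found in Question 7:

```python
import numpy as np

f = lambda x: np.cos(np.pi * x)  # the underlying function
p1 = lambda x: 1.0 - 2.0 * x     # line through (0, 1) and (1/2, 0)

print(abs(f(0.25) - p1(0.25)))   # 0.2071..., the exact error at x = 1/4
```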

Index

Base, 9
Basis, 25
Degrees of freedom, 25
Exponent, 9
Fixed-point arithmetic, 8
Floating-point arithmetic, 9
Grid, 27
Integers in binary, 8
Interpolation, 25
Lagrange form of the remainder, 12
Machine epsilon, 9
Mantissa, 9
Newton basis, 27
Newton's method, 21
Quadratic convergence, 22
Significand, 9
Taylor series, 11
Truncation error, 12
Unisolvent, 25
Vandermonde matrix, 26