
INDIAN INSTITUTE OF TECHNOLOGY MADRAS

Department of Chemical Engineering


CH 2061 Numerical Techniques
Homework #1

• Due Wednesday, 2 Feb, 2 PM.


• All answers should be written legibly in the answer book. Do not rewrite the questions.
• Do not attach a print out of the questions.
• Do not attach any unnecessary plastic folders/binders etc.
• Use both sides of the paper.
• Be as concise as possible. Unnecessarily long answers will be penalized.
• You are free to discuss, but all submitted work must be your own.

1. Fixed precision. Rounding to three significant digits at each step, perform the following two additions,
(200 + 0.49 + 0.49) and (10.0 + 0.333 + 0.333 + 0.333), working from left to right. Repeat by adding from
right to left. Compare the percentage relative errors in all cases. Note: You should round to 3 significant
digits, not to 3 decimal places.
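The rounding procedure in Problem 1 can be mimicked in a few lines of Python. This is only a sketch for checking hand calculations: the helper names `round_sig`, `rounded_sum` and `percent_error` are made up for illustration, and the rounding relies on Python's `g` format, which may break ties differently from the convention used in class.

```python
def round_sig(x, digits=3):
    """Round x to the given number of significant digits (not decimal places)."""
    return float(f"{x:.{digits}g}")


def rounded_sum(terms, digits=3):
    """Add the terms in the order given, rounding after every addition."""
    total = round_sig(terms[0], digits)
    for term in terms[1:]:
        total = round_sig(total + round_sig(term, digits), digits)
    return total


def percent_error(approx, exact):
    """Percentage relative error of approx with respect to exact."""
    return abs(approx - exact) / abs(exact) * 100.0


terms = [200.0, 0.49, 0.49]
exact = sum(terms)                         # double precision reference value
left_to_right = rounded_sum(terms)         # large term first
right_to_left = rounded_sum(terms[::-1])   # small terms first

print("left to right:", left_to_right, percent_error(left_to_right, exact))
print("right to left:", right_to_left, percent_error(right_to_left, exact))
```

Adding the small terms first lets them accumulate before they meet the large term, which is why the order of summation matters in fixed precision.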

2. Quadratic formula. Consider the equation $x^2 - bx + 1 = 0$. Define $r = \sqrt{b^2 - 4}$, the square
root of the discriminant. We saw the dangerous effect of rounding errors in computing one of the roots
when $b \gg 4$, even when using the exact formulae:
$$x_1 = \frac{b + r}{2}, \qquad x_2 = \frac{b - r}{2}.$$
Consider an alternate expression for the roots:
$$x_1 = \frac{2}{b - r}, \qquad x_2 = \frac{2}{b + r}.$$
Explain how these expressions are derived. Hint: Multiply and divide by an appropriate expression
and use the fact that $r^2 = b^2 - 4$. Use the alternate expression to compute the roots, rounding
to 4 significant digits at each step, for $b = 80 + n$, where $n$ is the last digit of your roll number.
Compute the “true” roots (using a calculator/computer with as much precision as possible) and compute
the percentage relative errors. Generalize the expression for $ax^2 + bx + c = 0$.
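The effect described in Problem 2 can be reproduced with a short Python sketch. It assumes b = 80 (the case n = 0) purely for illustration, rounds every intermediate result to 4 significant digits with a made-up helper `round_sig`, and takes the double precision value as the "true" root. The function names are hypothetical, not part of the course material.

```python
import math


def round_sig(x, digits=4):
    """Round x to the given number of significant digits."""
    return float(f"{x:.{digits}g}")


def rounded_r(b, digits=4):
    """r = sqrt(b^2 - 4), rounding after each arithmetic step."""
    b_squared = round_sig(b * b, digits)
    return round_sig(math.sqrt(round_sig(b_squared - 4.0, digits)), digits)


def small_root_standard(b, digits=4):
    """x2 = (b - r) / 2: subtracts two nearly equal numbers."""
    r = rounded_r(b, digits)
    return round_sig(round_sig(b - r, digits) / 2.0, digits)


def small_root_alternate(b, digits=4):
    """x2 = 2 / (b + r): no subtraction of nearly equal numbers."""
    r = rounded_r(b, digits)
    return round_sig(2.0 / round_sig(b + r, digits), digits)


b = 80.0  # assumed value for illustration (n = 0)
true_x2 = 2.0 / (b + math.sqrt(b * b - 4.0))  # double precision reference

for name, value in [("standard", small_root_standard(b)),
                    ("alternate", small_root_alternate(b))]:
    rel_err = abs(value - true_x2) / true_x2 * 100.0
    print(f"{name}: x2 = {value}, percentage relative error = {rel_err:.3g}")
```

Only the smaller root is shown, since that is the one affected by the subtraction b - r; the larger root behaves well in the standard formula.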

3. Floating point. Recall the tutorial problem of representing a number in floating point with a total of 7
binary bits. The first bit represents the sign of the number (1 indicates a negative number), the next
three the mantissa and the last three the signed exponent. Unlike in class, we will not store the first
bit $d_1 = 1$ explicitly, and hence we implicitly have 4 bits in the mantissa. As in class, compute the
smallest positive number that can be exactly represented, the largest positive number and the machine
precision if rounding is used. How many numbers (in total) can be exactly represented? Note: We
are using rounding now!

4. IEEE double precision. IEEE double precision uses a 64-bit binary representation for a number, with
11 bits for the signed exponent and the remaining 53 for the signed mantissa. Show that the range
of positive numbers is approximately from $2^{-1024}$ to $2^{1024}$.
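As a quick sanity check of the range in Problem 4, the limits of the machine's floating-point type can be inspected from Python. This sketch assumes the interpreter's `float` is an IEEE 754 double, which holds on common platforms but is not guaranteed by the language. The reported exponents differ slightly from the approximate figure of 1024 because the standard reserves some exponent values and also supports subnormal numbers, which the estimate in the question ignores.

```python
import math
import sys

largest = sys.float_info.max            # largest finite double
smallest_normal = sys.float_info.min    # smallest positive normalized double

# Express both limits as powers of two for comparison with the estimate.
print("log2(largest finite)  =", math.log2(largest))
print("log2(smallest normal) =", math.log2(smallest_normal))
print("mantissa bits (including the implicit leading bit) =",
      sys.float_info.mant_dig)
```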

5. Taylor’s theorem. We will prove Taylor’s theorem step by step. Fill in the arguments. For notation and
technical requirements, refer to the earlier document.

$$f(a + h) = f(a) + h f'(a) + \frac{h^2}{2} f''(a) + \frac{h^3}{3!} f'''(a + t_3),$$

for some $t_3 \in [0, h]$.

• Define the following functions/numbers:

$$P(t) = f(a) + t f'(a) + \frac{t^2}{2} f''(a),$$
$$M = \frac{f(a + h) - P(h)}{h^3},$$
$$g(t) = f(a + t) - P(t) - M t^3.$$

• Show that $g(0) = 0$ and $g(h) = 0$.
• Apply the MVT to $g(t)$ and hence $g'(t_1) = 0$ for some $t_1 \in [0, h]$.
• Show that $g'(0) = 0$. From above, $g'(t_1) = 0$, and hence, applying the MVT to $g'(t)$, we have
$g''(t_2) = 0$ for some $t_2 \in [0, t_1]$.
• Again, show that $g''(0) = 0$; from above, $g''(t_2) = 0$. Hence, applying the MVT to $g''(t)$, we have
that $g'''(t_3) = 0$ for some $t_3 \in [0, t_2]$.
• Show that $g'''(t_3) = f'''(a + t_3) - 1 \cdot 2 \cdot 3\, M$ and hence $M = f'''(a + t_3)/3!$.
• Hence,
$$f(a + h) = f(a) + h f'(a) + \frac{h^2}{2} f''(a) + \frac{h^3}{3!} f'''(a + t_3),$$
for some $t_3 \in [0, h]$.
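The conclusion of Problem 5 can be checked numerically for a simple function. The sketch below assumes $f(x) = e^x$, $a = 0$ and $h = 0.5$; these are illustrative choices, not part of the assignment. Because every derivative of $e^x$ is $e^x$, the point $t_3$ can be solved for in closed form and compared with the interval $[0, h]$. This is a check of one example, not a proof.

```python
import math

a, h = 0.0, 0.5   # illustrative values, not from the assignment
f = math.exp      # f = f' = f'' = f''' for the exponential


def P(t):
    """Second-order Taylor polynomial of f about a, evaluated at a + t."""
    return f(a) + t * f(a) + t**2 / 2.0 * f(a)


M = (f(a + h) - P(h)) / h**3

# The theorem claims M = f'''(a + t3) / 3! for some t3 in [0, h].
# For f = exp this means exp(a + t3) = 6 M, which can be solved directly.
t3 = math.log(6.0 * M) - a

print("M  =", M)
print("t3 =", t3, "lies in [0, h]:", 0.0 <= t3 <= h)
```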

6. Truncation error estimation. In class, we obtained $e^{-10}$ using the first 20 terms in the Taylor
expansion of $e^{-x}$. Assume that infinite precision is available, i.e., that it is possible to store and
compute with numbers with unlimited precision. What is the maximum possible error when
using a truncated series with the first 20 terms?
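A bound of the kind asked for in Problem 6 can be compared with the actual truncation error on a small case. The sketch below assumes that "the first n terms" means the powers $x^0$ to $x^{n-1}$ of the expansion about 0, and uses the Lagrange form of the remainder together with $e^{-\xi} \le 1$ for $\xi \ge 0$. The values x = 1 and n = 5 are illustrative only, and the function names are hypothetical.

```python
import math


def partial_sum(x, n_terms):
    """Sum of the first n_terms terms of the series for exp(-x) about 0."""
    return sum((-x) ** k / math.factorial(k) for k in range(n_terms))


def remainder_bound(x, n_terms):
    """Lagrange remainder bound x^n / n! for x >= 0 (uses exp(-xi) <= 1)."""
    return x ** n_terms / math.factorial(n_terms)


x, n_terms = 1.0, 5   # illustrative values, not the assigned ones
error = abs(math.exp(-x) - partial_sum(x, n_terms))
bound = remainder_bound(x, n_terms)

print("actual truncation error:", error)
print("remainder bound:        ", bound)
```

In double precision the round-off in this small case is negligible next to the truncation error, so the comparison reflects the truncation bound itself.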

7. Subtractive cancellation. Recall the effort that we spent in class to evaluate $e^{10}$ and $e^{-10}$. Use your
favourite programming language/environment (e.g., Fortran, MATLAB, VB, Excel etc.) to evaluate
$e^{11+n}$ and $e^{-(11+n)}$, where $n$ is the last digit of your roll number, using the Taylor series with 25
terms. What arithmetic is the code using (e.g., single, double etc.)? Calculate the percentage
relative error.
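One possible starting point for Problem 7 is sketched below in Python, whose `float` is normally an IEEE double. It uses x = 10 from the class example rather than the assigned 11 + n, and the function name `taylor_exp` is made up. The reported errors combine truncation and round-off; the sketch does not separate the two. The reciprocal of the series for $e^{x}$ is included as an alternative that avoids summing terms of alternating sign.

```python
import math


def taylor_exp(x, n_terms=25):
    """Sum the first n_terms terms of the Taylor series of exp(x) about 0."""
    term = 1.0    # current term x^k / k!, starting at k = 0
    total = 1.0
    for k in range(1, n_terms):
        term *= x / k          # build each term from the previous one
        total += term
    return total


def percent_error(approx, exact):
    return abs(approx - exact) / abs(exact) * 100.0


x = 10.0  # class example, not the assigned value
positive = taylor_exp(x)             # all terms have the same sign
negative = taylor_exp(-x)            # alternating signs, large terms cancel
reciprocal = 1.0 / taylor_exp(x)     # exp(-x) computed as 1 / exp(x)

print("exp(x)  series:     ", percent_error(positive, math.exp(x)))
print("exp(-x) series:     ", percent_error(negative, math.exp(-x)))
print("exp(-x) as 1/series:", percent_error(reciprocal, math.exp(-x)))
```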

8. Extra credit. Are there functions that are differentiable, but whose derivative is not continuous
everywhere? If you read the answer somewhere, be sure to mention the source.
