Ps 1 Solution NNNNN

MACM 316
Assignment 1 Solutions
1.2 Finite Precision Arithmetic

1.2:6e Rounding Arithmetic
Use four-digit rounding arithmetic to perform the following calculation. Compute the absolute error and relative
error with the exact value determined to within at least 5 digits.

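\begin{align*}
\mathrm{fl}\!\left(\frac{\frac{13}{14}-\frac{6}{7}}{2e-5.4}\right)
&= \mathrm{fl}\!\left(\frac{\mathrm{fl}\!\left(\mathrm{fl}\!\left(\frac{13}{14}\right)-\mathrm{fl}\!\left(\frac{6}{7}\right)\right)}{\mathrm{fl}\!\left(\mathrm{fl}(2\,\mathrm{fl}(e))-\mathrm{fl}(5.4)\right)}\right)
= \mathrm{fl}\!\left(\frac{\mathrm{fl}(0.9286-0.8571)}{\mathrm{fl}(\mathrm{fl}(2(2.718))-5.400)}\right) \tag{1}\\
&= \mathrm{fl}\!\left(\frac{\mathrm{fl}(0.9286-0.8571)}{\mathrm{fl}(5.436-5.400)}\right) \tag{2}\\
&= \mathrm{fl}\!\left(\frac{0.0715}{0.036}\right) \tag{3}\\
&= 1.986 \tag{4}
\end{align*}
Note that the subtractions in Eq. (3) left us with fewer than 4 digits.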
However, repeating the same calculation in 16-digit rounding arithmetic gives
\[
\frac{\frac{13}{14}-\frac{6}{7}}{2e-5.4} \approx 1.953540139286012 \tag{5}
\]
Then the absolute error is
\[
a = |1.953540139286012 - 1.986| \approx 3.2 \times 10^{-2} \tag{6}
\]
and the relative error is
\[
r = \frac{|1.953540139286012 - 1.986|}{1.953540139286012} \approx 1.7 \times 10^{-2} = 1.7\% \tag{7}
\]
Note the choice to represent the error in two digits is somewhat arbitrary, but should be sufficient to give you an
idea of how accurate the approximation is.
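As a cross-check, the four-digit computation above can be imitated with Python's `decimal` module. This is only a sketch: the names `FOUR` and `fl` are ours, and we assume that round-half-up at four significant digits is an adequate model of the rounding arithmetic the exercise intends. The reference value is computed in ordinary double precision (about 16 digits).

```python
from decimal import Decimal, Context, ROUND_HALF_UP
import math

# Assumed model of four-digit rounding arithmetic (names are illustrative).
FOUR = Context(prec=4, rounding=ROUND_HALF_UP)

def fl(x):
    """Round a number to four significant digits."""
    return FOUR.plus(Decimal(str(x)))

# Numerator: fl(fl(13/14) - fl(6/7))
num = FOUR.subtract(FOUR.divide(Decimal(13), Decimal(14)),
                    FOUR.divide(Decimal(6), Decimal(7)))
# Denominator: fl(fl(2 fl(e)) - fl(5.4))
den = FOUR.subtract(FOUR.multiply(Decimal(2), fl(math.e)), fl(5.4))
approx = FOUR.divide(num, den)

# Reference value in double precision.
exact = (13 / 14 - 6 / 7) / (2 * math.e - 5.4)
abs_err = abs(exact - float(approx))
rel_err = abs_err / abs(exact)

print(approx)                           # four-digit result
print(f"{abs_err:.1e}  {rel_err:.1e}")  # absolute and relative error
```

Every operation goes through the four-digit context, so the intermediate values 0.0715 and 0.036 are produced exactly as in Eqs. (2) and (3).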
1.2:14a Chopping Arithmetic and the Quadratic Formula

Use four-digit chopping arithmetic and the formulae of Example 5 to find the most accurate approximations to the
roots of the following quadratic equations. Compute the absolute and relative errors.
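\[
\frac{1}{3}x^2 - \frac{123}{4}x + \frac{1}{6} = 0 \tag{8}
\]
Applying the quadratic formula
\[
x_\pm = \frac{\frac{123}{4} \pm \sqrt{\left(\frac{123}{4}\right)^2 - 4\left(\frac{1}{3}\right)\left(\frac{1}{6}\right)}}{2\left(\frac{1}{3}\right)} \tag{9}
\]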
but we want to avoid subtracting nearly equal numbers of the same sign (or, equivalently, adding numbers of opposite sign and similar size), so we split this up into the two roots, rewrite the problematic case, and only then do the finite precision arithmetic.
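\begin{align*}
x_+ &= \frac{\frac{123}{4} + \sqrt{\left(\frac{123}{4}\right)^2 - 4\left(\frac{1}{3}\right)\left(\frac{1}{6}\right)}}{2\left(\frac{1}{3}\right)} \tag{10}\\
x_- &= \frac{\frac{123}{4} - \sqrt{\left(\frac{123}{4}\right)^2 - 4\left(\frac{1}{3}\right)\left(\frac{1}{6}\right)}}{2\left(\frac{1}{3}\right)} \tag{11}\\
&= \frac{2\left(\frac{1}{6}\right)}{\frac{123}{4} + \sqrt{\left(\frac{123}{4}\right)^2 - 4\left(\frac{1}{3}\right)\left(\frac{1}{6}\right)}} \tag{12}
\end{align*}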
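Now doing the calculations in finite precision
\begin{align*}
x_+ \approx x_+^* &= \mathrm{fl}\!\left(\frac{\mathrm{fl}\!\left(\mathrm{fl}\!\left(\frac{123}{4}\right) + \mathrm{fl}\!\left(\sqrt{\mathrm{fl}\!\left(\mathrm{fl}\!\left(\mathrm{fl}\!\left(\frac{123}{4}\right)^2\right) - \mathrm{fl}\!\left(\mathrm{fl}\!\left(4\,\mathrm{fl}\!\left(\frac{1}{3}\right)\right)\mathrm{fl}\!\left(\frac{1}{6}\right)\right)\right)}\right)\right)}{\mathrm{fl}\!\left(2\,\mathrm{fl}\!\left(\frac{1}{3}\right)\right)}\right) \tag{13}\\
&= \mathrm{fl}\!\left(\frac{\mathrm{fl}\!\left(30.75 + \mathrm{fl}\!\left(\sqrt{\mathrm{fl}\!\left(\mathrm{fl}\!\left((30.75)^2\right) - \mathrm{fl}\!\left(\mathrm{fl}(4(0.3333))(0.1666)\right)\right)}\right)\right)}{\mathrm{fl}(2(0.3333))}\right) \tag{14}\\
&= \mathrm{fl}\!\left(\frac{\mathrm{fl}\!\left(30.75 + \mathrm{fl}\!\left(\sqrt{\mathrm{fl}\!\left(945.5 - \mathrm{fl}((1.333)(0.1666))\right)}\right)\right)}{0.6666}\right) \tag{15}\\
&= \mathrm{fl}\!\left(\frac{\mathrm{fl}\!\left(30.75 + \mathrm{fl}\!\left(\sqrt{\mathrm{fl}(945.5 - 0.2220)}\right)\right)}{0.6666}\right) \tag{16}\\
&= \mathrm{fl}\!\left(\frac{\mathrm{fl}\!\left(30.75 + \mathrm{fl}\!\left(\sqrt{945.2}\right)\right)}{0.6666}\right) \tag{17}\\
&= \mathrm{fl}\!\left(\frac{\mathrm{fl}(30.75 + 30.74)}{0.6666}\right) \tag{18}\\
&= \mathrm{fl}\!\left(\frac{61.49}{0.6666}\right) \tag{19}\\
&= 92.24 \tag{20}
\end{align*}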
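Similarly, reusing some of our intermediate results from above,
\begin{align*}
x_- \approx x_-^* &= \mathrm{fl}\!\left(\frac{\mathrm{fl}\!\left(2\,\mathrm{fl}\!\left(\frac{1}{6}\right)\right)}{\mathrm{fl}\!\left(\mathrm{fl}\!\left(\frac{123}{4}\right) + \mathrm{fl}\!\left(\sqrt{\mathrm{fl}\!\left(\mathrm{fl}\!\left(\mathrm{fl}\!\left(\frac{123}{4}\right)^2\right) - \mathrm{fl}\!\left(\mathrm{fl}\!\left(4\,\mathrm{fl}\!\left(\frac{1}{3}\right)\right)\mathrm{fl}\!\left(\frac{1}{6}\right)\right)\right)}\right)\right)}\right) \tag{21}\\
&= \mathrm{fl}\!\left(\frac{\mathrm{fl}(2(0.1666))}{\mathrm{fl}(30.75 + 30.74)}\right) \tag{22}\\
&= \mathrm{fl}\!\left(\frac{\mathrm{fl}(2(0.1666))}{61.49}\right) \tag{23}\\
&= \mathrm{fl}\!\left(\frac{0.3332}{61.49}\right) \tag{24}\\
&= 0.005418 \tag{25}
\end{align*}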
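Repeating these calculations with 16-digit chopping arithmetic, the solution is
\begin{align*}
x_+ &= 92.24457962731231\\
x_- &= 0.005420372687697272 \tag{26}
\end{align*}
The absolute errors are
\begin{align*}
a_+ &= |x_+ - x_+^*| \approx 4.6 \times 10^{-3}\\
a_- &= |x_- - x_-^*| \approx 2.4 \times 10^{-6} \tag{27}
\end{align*}
and the relative errors are
\begin{align*}
r_+ &= \frac{|x_+ - x_+^*|}{|x_+|} \approx 5.0 \times 10^{-5} = 0.0050\%\\
r_- &= \frac{|x_- - x_-^*|}{|x_-|} \approx 4.4 \times 10^{-4} = 0.044\% \tag{28}
\end{align*}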
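The chopped computation of both roots can be replayed with Python's `decimal` module. This is a sketch under assumptions: the names `CHOP4`, `WIDE` and `chop_sqrt` are ours, four-digit chopping is modelled by `ROUND_DOWN` at precision 4, and the square root is taken at a wider scratch precision and then chopped, because `decimal`'s own square root may not honour the chopping mode.

```python
from decimal import Decimal, Context, ROUND_DOWN

# Assumed model of four-digit chopping: every operation is truncated toward zero.
CHOP4 = Context(prec=4, rounding=ROUND_DOWN)
WIDE = Context(prec=30)  # scratch precision, used only before chopping a square root

def chop_sqrt(x):
    """Square root computed in wide precision, then chopped to four digits."""
    return CHOP4.plus(WIDE.sqrt(x))

third = CHOP4.divide(Decimal(1), Decimal(3))   # fl(1/3)
sixth = CHOP4.divide(Decimal(1), Decimal(6))   # fl(1/6)
b = CHOP4.divide(Decimal(123), Decimal(4))     # fl(123/4)

# Discriminant: fl(fl(b^2) - fl(fl(4 fl(1/3)) fl(1/6)))
disc = CHOP4.subtract(
    CHOP4.multiply(b, b),
    CHOP4.multiply(CHOP4.multiply(Decimal(4), third), sixth),
)
root = chop_sqrt(disc)
two_a = CHOP4.multiply(Decimal(2), third)

x_plus = CHOP4.divide(CHOP4.add(b, root), two_a)
# Rewritten (rationalized) form of the small root, as in Eq. (12):
x_minus = CHOP4.divide(CHOP4.multiply(Decimal(2), sixth), CHOP4.add(b, root))
# Direct form of the small root, Eq. (11), shown only for comparison:
x_minus_naive = CHOP4.divide(CHOP4.subtract(b, root), two_a)

print(x_plus, x_minus, x_minus_naive)
```

The comparison value `x_minus_naive` shows why the rewrite matters: the direct formula subtracts 30.74 from 30.75 and keeps only one significant digit, so the result is far from the 16-digit value of the small root.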
1.2:18 Finite Precision Taylor Series

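We wish to approximate $e^{-5}$ using the 9th order Taylor polynomial.
\[
e^{-5} \approx \sum_{i=0}^{n} \frac{(-5)^i}{i!} \approx \mathrm{sum}_n \tag{29}
\]
where $\mathrm{sum}_n$ is defined by the recursion relation
\[
\mathrm{sum}_{-1} = 0 \tag{30}
\]
\[
\mathrm{sum}_n = \mathrm{fl}\!\left(\mathrm{sum}_{n-1} + \mathrm{fl}\!\left(\frac{\mathrm{fl}\!\left((-5)^n\right)}{\mathrm{fl}(n!)}\right)\right) \tag{31}
\]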
Note that there are two approximations here. The first is the fact that we do not take the limit $n \to \infty$. The error associated with this approximation is called truncation error. This should get smaller as we add terms to the series.
The second approximation is the fact that we use finite precision arithmetic to evaluate this truncated series. The error associated with this approximation is called roundoff error.
Thus, when we evaluate the relative error in $\mathrm{sum}_i$,
\[
r_i = \frac{|e^{-5} - \mathrm{sum}_i|}{e^{-5}} \tag{32}
\]
we should expect contributions from both the roundoff and truncation error.
The table below shows the intermediate results in computing $\mathrm{sum}_9$. Since $r_i$ grows over the first several terms and is still of order $10^2$ at $i = 9$, we know our calculation is being dominated by roundoff error, because the truncation error should be decreasing as we add terms.

 i   fl((-5)^i)   fl(i!)      fl(fl((-5)^i)/fl(i!))   sum_i        r_i
 0    1.00e+00    1.00e+00         1.00e+00            1.00e+00    1.5e+02
 1   -5.00e+00    1.00e+00        -5.00e+00           -4.00e+00    5.9e+02
 2    2.50e+01    2.00e+00         1.25e+01            8.50e+00    1.3e+03
 3   -1.25e+02    6.00e+00        -2.08e+01           -1.23e+01    1.8e+03
 4    6.25e+02    2.40e+01         2.60e+01            1.37e+01    2.0e+03
 5   -3.12e+03    1.20e+02        -2.60e+01           -1.23e+01    1.8e+03
 6    1.56e+04    7.20e+02         2.16e+01            9.30e+00    1.4e+03
 7   -7.81e+04    5.04e+03        -1.55e+01           -6.20e+00    9.2e+02
 8    3.90e+05    4.03e+04         9.67e+00            3.47e+00    5.1e+02
 9   -1.95e+06    3.62e+05        -5.38e+00           -1.91e+00    2.8e+02
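The reason that this is happening is that the sign of $\mathrm{sum}_{i-1}$ is always opposite the sign of $\mathrm{fl}\!\left(\frac{\mathrm{fl}((-5)^i)}{\mathrm{fl}(i!)}\right)$, so Eq. (31) is adding two numbers of opposite sign at each step. As we know, this is prone to loss of precision.
The solution is to compute the series in a different way which avoids adding numbers of opposite sign. This can be achieved by instead computing
\[
e^{-5} = \frac{1}{e^{5}} \approx \frac{1}{\sum_{i=0}^{n} \frac{5^i}{i!}} \approx \frac{1}{\mathrm{sum}_n} \tag{33}
\]
where $\mathrm{sum}_i$ is defined by the recursion relation
\[
\mathrm{sum}_{-1} = 0 \tag{34}
\]
\[
\mathrm{sum}_i = \mathrm{fl}\!\left(\mathrm{sum}_{i-1} + \mathrm{fl}\!\left(\frac{\mathrm{fl}\!\left(5^i\right)}{\mathrm{fl}(i!)}\right)\right) \tag{35}
\]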
The table below shows the intermediate results in computing $1/\mathrm{sum}_9$ by this method. The first thing to notice is that there are no negative numbers in this table. The second thing to notice is that $r_i$ decreases as we add terms, suggesting that truncation error is now more important.
 i   fl(5^i)      fl(i!)      fl(fl(5^i)/fl(i!))   1/sum_i      r_i
 0   1.00e+00    1.00e+00        1.00e+00          1.00e+00    1.5e+02
 1   5.00e+00    1.00e+00        5.00e+00          1.67e-01    2.4e+01
 2   2.50e+01    2.00e+00        1.25e+01          5.40e-02    7.0e+00
 3   1.25e+02    6.00e+00        2.08e+01          2.54e-02    2.8e+00
 4   6.25e+02    2.40e+01        2.60e+01          1.53e-02    1.3e+00
 5   3.12e+03    1.20e+02        2.60e+01          1.10e-02    6.3e-01
 6   1.56e+04    7.20e+02        2.16e+01          8.93e-03    3.3e-01
 7   7.81e+04    5.04e+03        1.55e+01          7.87e-03    1.7e-01
 8   3.90e+05    4.03e+04        9.67e+00          7.35e-03    9.1e-02
 9   1.95e+06    3.62e+05        5.38e+00          7.09e-03    5.3e-02
In conclusion, the second method is a significant improvement over the first, as the relative errors it generates are far smaller. We can also reduce these errors further by adding more terms to the series, whereas it wasn't clear whether adding terms led to appreciable improvements in the first case.
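Both recursions can be compared side by side in a few lines of Python. This is a sketch, not a reproduction of the tables: we assume three-digit chopping (which is what the tabulated values suggest), the names `CHOP3`, `partial_sum` and `rel_err` are ours, and individual intermediate entries may differ from the tables in the last digit.

```python
from decimal import Decimal, Context, ROUND_DOWN
import math

# Assumption: three-digit chopping arithmetic.
CHOP3 = Context(prec=3, rounding=ROUND_DOWN)

def partial_sum(x, n):
    """sum_n = fl(sum_{n-1} + fl(fl(x^i) / fl(i!))), starting from sum_{-1} = 0."""
    total = Decimal(0)
    for i in range(n + 1):
        power = CHOP3.plus(Decimal(x ** i))             # fl(x^i)
        fact = CHOP3.plus(Decimal(math.factorial(i)))   # fl(i!)
        total = CHOP3.add(total, CHOP3.divide(power, fact))
    return total

def rel_err(approx):
    """Relative error of an approximation to e^{-5}."""
    exact = math.exp(-5)
    return abs(exact - float(approx)) / exact

direct = partial_sum(-5, 9)                           # alternating series, Eq. (31)
recip = CHOP3.divide(Decimal(1), partial_sum(5, 9))   # 1 / (series for e^5), Eq. (35)

print("direct    :", direct, rel_err(direct))
print("reciprocal:", recip, rel_err(recip))
```

The direct sum comes out negative, which is impossible for $e^{-5}$, while the reciprocal form lands within a few percent of the true value.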
1.3 Convergence
1.3:6b Convergence as n → ∞
Find the rate of convergence of
\[
\lim_{n \to \infty} \sin\!\left(\frac{1}{n^2}\right) = 0 \tag{36}
\]
By making the substitution $h = \frac{1}{n}$, we have the related problem of finding the rate of convergence of
\[
\lim_{h \to 0} \sin\!\left(h^2\right) = 0 \tag{37}
\]
Thus we have a function
\[
f(h) = \sin\!\left(h^2\right) \tag{38}
\]
that trivially converges to $f(0) = 0$.

We want to find the largest value of $p$ such that for some constant $K$
\[
|f(h) - f(0)| = |f(h)| \le K\,|h^p| \quad \text{for small } h \tag{39}
\]
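We know from Taylor's theorem (1.14) that
\begin{align*}
f(h) &= f(0) + h f'(0) + \frac{h^2}{2} f''(c) \quad \text{for some $c$ between 0 and $h$} \tag{40}\\
&= \sin(x^2)\Big|_{x=0} + h\left[2x\cos(x^2)\right]_{x=0} + \frac{h^2}{2}\left[-4\sin(c^2)c^2 + 2\cos(c^2)\right] \tag{41}\\
&= h^2\left[-2\sin(c^2)c^2 + \cos(c^2)\right] \tag{42}
\end{align*}
So
\begin{align*}
|f(h)| &= \left|h^2\left[-2\sin(c^2)c^2 + \cos(c^2)\right]\right| \tag{43}\\
&\le h^2\left[2c^2\left|\sin(c^2)\right| + \left|\cos(c^2)\right|\right] \tag{44}\\
&\le h^2\left[2c^2 \cdot c^2 + 1\right] \tag{45}\\
&\le h^2\left[2h^4 + 1\right] \tag{46}\\
&\le h^2(2 + 1) \quad \text{for } h \le 1 \tag{47}\\
&= 3h^2 \tag{48}
\end{align*}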
Which tells us that Eq. (39) holds for $K = 3$, $p = 2$, and by small $h$ we mean $h \le 1$.
To show that 2 is indeed the largest value of $p$ for which we can make Eq. (39) hold, we simply expand the Taylor series to one more order
\begin{align*}
f(h) &= f(0) + h f'(0) + \frac{h^2}{2} f''(0) + \frac{h^3}{6} f^{(3)}(c) \quad \text{for some $c$ between 0 and $h$} \tag{49}\\
&= 0 + h^2\left[-2\sin(x^2)x^2 + \cos(x^2)\right]_{x=0} + \frac{h^3}{6} f^{(3)}(c) \tag{50}\\
&= h^2 + \frac{h^3}{6} f^{(3)}(c) \tag{51}
\end{align*}
In other words, the $h^2$ term does not vanish like the ones before it, so we can't do the same thing to show that
\[
|f(h)| \le K\,|h^3|. \tag{52}
\]
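So transforming
\[
|f(h) - f(0)| = |f(h)| \le 3\left|h^2\right|, \quad h \le 1 \tag{53}
\]
back to our original problem Eq. (36) via $h = \frac{1}{n}$ we have
\[
\left|\sin\!\left(\frac{1}{n^2}\right) - \lim_{n \to \infty} \sin\!\left(\frac{1}{n^2}\right)\right| = \left|\sin\!\left(\frac{1}{n^2}\right)\right| \le 3\left|\frac{1}{n^2}\right|, \quad 1 \le n \tag{54}
\]
so the sequence converges like $\frac{1}{n^2}$, or
\[
\sin\!\left(\frac{1}{n^2}\right) = O\!\left(\frac{1}{n^2}\right) \tag{55}
\]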
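A quick numerical sanity check of this rate is sketched below in double precision. The sample values of $n$ and the variable names are our own choice; the check simply compares $|\sin(1/n^2)|$ with the bound $3/n^2$ and prints the scaled quantity $n^2 \sin(1/n^2)$.

```python
import math

# Compare |sin(1/n^2)| with the bound K/n^2 for K = 3, and track n^2 * sin(1/n^2).
K = 3.0
ns = [1, 2, 5, 10, 100, 1000]
values = [abs(math.sin(1.0 / n**2)) for n in ns]
bounds = [K / n**2 for n in ns]
ratios = [v * n**2 for v, n in zip(values, ns)]

for n, v, bnd, r in zip(ns, values, bounds, ratios):
    print(f"n={n:5d}  |sin(1/n^2)|={v:.3e}  3/n^2={bnd:.3e}  n^2*sin(1/n^2)={r:.6f}")
```

The scaled quantity settles near 1 rather than shrinking to 0, which is consistent with $p = 2$ being the largest exponent that works.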
1.3:7c Convergence as h → 0
Find the rate of convergence of
\[
\lim_{h \to 0} \frac{\sin(h) - h\cos(h)}{h} = 0 \tag{56}
\]
We want to examine the behaviour of the function
\[
f(h) = \frac{\sin(h) - h\cos(h)}{h} \tag{57}
\]
near $h = 0$. But we need to be careful since we have $h$ in the denominator, so we expand just the numerator in a Taylor series.
The lowest order nontrivial Taylor series about $h = 0$ for sin and cos are
\begin{align*}
\sin(h) &= h - \frac{h^3}{6}\cos(c_1) \quad \text{for $c_1$ between 0 and $h$} \tag{58}\\
\cos(h) &= 1 - \frac{h^2}{2}\cos(c_2) \quad \text{for $c_2$ between 0 and $h$} \tag{59}
\end{align*}
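So we have
\begin{align*}
|f(h) - f(0)| = |f(h)| &= \left|\frac{h - \frac{h^3}{6}\cos(c_1) - h\left(1 - \frac{h^2}{2}\cos(c_2)\right)}{h}\right| \tag{60}\\
&= \left|1 - \frac{h^2}{6}\cos(c_1) - 1 + \frac{h^2}{2}\cos(c_2)\right| \tag{61}\\
&= \left|\left(-\frac{1}{6}\cos(c_1) + \frac{1}{2}\cos(c_2)\right)h^2\right| \tag{62}\\
&\le \left(\left|\frac{1}{6}\cos(c_1)\right| + \left|\frac{1}{2}\cos(c_2)\right|\right)h^2 \tag{63}\\
&\le \left(\frac{1}{6} + \frac{1}{2}\right)h^2 \tag{64}\\
&= \frac{2}{3}h^2 \tag{65}
\end{align*}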
So
\[
\frac{\sin(h) - h\cos(h)}{h} = O\!\left(h^2\right) \tag{66}
\]
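The $O(h^2)$ behaviour can be checked numerically by tabulating $f(h)/h^2$ for shrinking $h$. The sketch below uses double precision and sample values of $h$ chosen by us. From the leading Taylor terms, $\sin(h) - h\cos(h) \approx h^3/3$, so we would expect the ratio to sit near $1/3$, comfortably below the bound $2/3$ derived above.

```python
import math

def f(h):
    """The function from Eq. (57)."""
    return (math.sin(h) - h * math.cos(h)) / h

# Ratio f(h) / h^2 for a few decreasing step sizes.
ratios = {h: f(h) / h**2 for h in [0.5, 0.1, 0.01, 0.001]}

for h, r in ratios.items():
    print(f"h={h:<6}  f(h)={f(h):.6e}  f(h)/h^2={r:.6f}")
```

Much smaller values of $h$ are best avoided here, because the numerator itself subtracts two nearly equal numbers.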
1.3:14 Orders of convergence

Make a table listing $h$, $h^2$, $h^3$ and $h^4$ for $h = 0.5, 0.1, 0.01, 0.001$ and discuss the varying rates of convergence.

 h           h^2         h^3         h^4
 5.00e-01    2.50e-01    1.25e-01    6.25e-02
 1.00e-01    1.00e-02    1.00e-03    1.00e-04
 1.00e-02    1.00e-04    1.00e-06    1.00e-08
 1.00e-03    1.00e-06    1.00e-09    1.00e-12
Clearly, the higher the power, the faster the convergence (i.e., $h > h^2 > h^3 > h^4$ for $0 < h < 1$). For every order of magnitude by which we decrease $h$, $h^p$ decreases by $p$ orders of magnitude.
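The table is easy to regenerate. The following is a minimal sketch in double precision; the names `hs`, `powers` and `table` are ours.

```python
hs = [0.5, 0.1, 0.01, 0.001]
powers = [1, 2, 3, 4]

# Header row: h^1 ... h^4, right-aligned in 12-character columns.
print("".join(f"{'h^' + str(p):>12}" for p in powers))

table = {}
for h in hs:
    row = [h**p for p in powers]   # h, h^2, h^3, h^4
    table[h] = row
    print("".join(f"{v:12.2e}" for v in row))
```

Reading down any column shows the pattern stated above: going from $h = 0.01$ to $h = 0.001$ shrinks $h^p$ by a factor of $10^p$.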
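Now suppose that $0 < q < p$ and that $F(h) = L + O(h^p)$. Show that $F(h) = L + O(h^q)$.
We know that
\[
|h^p| \le |h^q| \quad \text{for } q \le p,\ |h| \le 1 \tag{67}
\]
And the statement that $F(h) = L + O(h^p)$ means that
\[
|F(h) - L| \le K\,|h^p| \quad \text{for small } h \tag{68}
\]
So we can say
\begin{align*}
|F(h) - L| &\le K\,|h^p| \le K\,|h^q| \quad \text{for } |h| \le 1 \tag{69}\\
|F(h) - L| &\le K\,|h^q| \tag{70}
\end{align*}
Which translates back into
\[
F(h) = L + O\!\left(h^q\right) \tag{71}
\]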
6.1 Linear Algebra

6.1:10 Singular matrices
Given the linear system
\[
\begin{pmatrix} 1 & -1 & \alpha \\ -1 & 2 & -\alpha \\ \alpha & 1 & 1 \end{pmatrix} x = \begin{pmatrix} -2 \\ 3 \\ 2 \end{pmatrix} \tag{72}
\]
For what values of $\alpha$ does the system have no solutions, and for what values does it have infinitely many solutions?
Finding the values of $\alpha$ for which the determinant vanishes will only tell us that we have either infinitely many or no solutions, but will not distinguish between the two cases. Furthermore, part c asks us to solve the system for general $\alpha$, so we'll go ahead and do that first and see what values of $\alpha$ might give us problems.
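\begin{align*}
\left(\begin{array}{ccc|c} 1 & -1 & \alpha & -2 \\ -1 & 2 & -\alpha & 3 \\ \alpha & 1 & 1 & 2 \end{array}\right)
&\mapsto \left(\begin{array}{ccc|c} 1 & -1 & \alpha & -2 \\ 0 & 1 & 0 & 1 \\ 0 & 1+\alpha & 1-\alpha^2 & 2(1+\alpha) \end{array}\right) \tag{73}\\
&\mapsto \left(\begin{array}{ccc|c} 1 & -1 & \alpha & -2 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1-\alpha^2 & 1+\alpha \end{array}\right) \tag{74}\\
&\mapsto \left(\begin{array}{ccc|c} 1 & -1 & \alpha & -2 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1-\alpha & 1 \end{array}\right), \quad \alpha \ne -1 \tag{75}\\
&\mapsto \left(\begin{array}{ccc|c} 1 & -1 & 0 & -2 - \frac{\alpha}{1-\alpha} \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & \frac{1}{1-\alpha} \end{array}\right) \tag{76}\\
&\mapsto \left(\begin{array}{ccc|c} 1 & 0 & 0 & -1 - \frac{\alpha}{1-\alpha} \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & \frac{1}{1-\alpha} \end{array}\right) \tag{77}\\
&= \left(\begin{array}{ccc|c} 1 & 0 & 0 & -\frac{1}{1-\alpha} \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & \frac{1}{1-\alpha} \end{array}\right) \tag{78}
\end{align*}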
So
\begin{align*}
x_1 &= -\frac{1}{1-\alpha} \tag{79}\\
x_2 &= 1 \tag{80}\\
x_3 &= \frac{1}{1-\alpha} \tag{81}
\end{align*}
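We can see in the third row of Eq. (74) that $\alpha = \pm 1$ will cause problems. If $\alpha = 1$ we have an inconsistent system (no solutions):
\[
\left(\begin{array}{ccc|c} 1 & -1 & 1 & -2 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 2 \end{array}\right) \tag{82}
\]
If $\alpha = -1$, we have no constraint on $x_3$ (infinitely many solutions):
\[
\left(\begin{array}{ccc|c} 1 & -1 & -1 & -2 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{array}\right) \tag{83}
\]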
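These conclusions can be checked with exact rational arithmetic. The sketch below is our own illustration rather than part of the assigned solution: it assumes the system as written in Eq. (72), classifies a given $\alpha$ by comparing the rank of the coefficient matrix with the rank of the augmented matrix, and tests the closed-form solution of Eqs. (79) to (81) by substituting it back. The helper names `rank`, `classify`, `solution` and `residual` are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of lists of Fractions) by Gaussian elimination."""
    m = [row[:] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def classify(alpha):
    """Return 'unique', 'none' or 'infinitely many' for the system of Eq. (72)."""
    alpha = Fraction(alpha)
    A = [[Fraction(1), Fraction(-1), alpha],
         [Fraction(-1), Fraction(2), -alpha],
         [alpha, Fraction(1), Fraction(1)]]
    b = [Fraction(-2), Fraction(3), Fraction(2)]
    aug = [row + [rhs] for row, rhs in zip(A, b)]
    r_a, r_aug = rank(A), rank(aug)
    if r_a == 3:
        return "unique"
    return "infinitely many" if r_a == r_aug else "none"

def solution(alpha):
    """Closed-form solution of Eqs. (79)-(81), valid for alpha != 1 and alpha != -1."""
    alpha = Fraction(alpha)
    return (-1 / (1 - alpha), Fraction(1), 1 / (1 - alpha))

def residual(alpha):
    """Left side minus right side of each equation, evaluated at the closed-form solution."""
    alpha = Fraction(alpha)
    x1, x2, x3 = solution(alpha)
    return (x1 - x2 + alpha * x3 + 2,
            -x1 + 2 * x2 - alpha * x3 - 3,
            alpha * x1 + x2 + x3 - 2)

for a in (1, -1, 2, Fraction(1, 2)):
    print(a, classify(a))
```

Using `Fraction` avoids any roundoff in the elimination, so the rank comparison is exact.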
