f_n(x_1, x_2, x_3, ..., x_n) = 0
Linear Algebraic Equations
a11 x1 + a12 x2 + a13 x3 + ... + a1n xn = b1
a21 x1 + a22 x2 + a23 x3 + ... + a2n xn = b2
...
an1 x1 + an2 x2 + an3 x3 + ... + ann xn = bn
In matrix form:

[ a11 a12 a13 ... a1n ]   { x1 }   { b1 }
[ a21 a22 a23 ... a2n ]   { x2 }   { b2 }
[ a31 a32 a33 ... a3n ] x { x3 } = { b3 }
[  :   :   :       :  ]   {  : }   {  : }
[ an1 an2 an3 ... ann ]   { xn }   { bn }
        (n x n)           (n x 1)  (n x 1)

or simply

[A]{x} = {b}
- Jacobi Method
- Gauss - Seidel Method
- Conjugate Gradient Method
The system

[ a11 a12 ... a1n ] { x1 }   { b1 }
[ a21 a22 ... a2n ] { x2 } = { b2 }
[  :   :       :  ] {  : }   {  : }
[ an1 an2 ... ann ] { xn }   { bn }

is represented compactly by the augmented matrix:

[ a11 a12 ... a1n | b1 ]
[ a21 a22 ... a2n | b2 ]
[  :   :       :  |  : ]
[ an1 an2 ... ann | bn ]
After forward elimination the system is upper triangular, [U]{x} = {b}:

[ u11 u12 u13 ... u1n ] { x1 }   { b1 }
[  0  u22 u23 ... u2n ] { x2 }   { b2 }
[  0   0  u33 ... u3n ] { x3 } = { b3 }
[  :   :   :       :  ] {  : }   {  : }
[  0   0   0  ... unn ] { xn }   { bn }

Back substitution:

xn = bn / unn
DO i = n-1, 1, -1
    xi = ( bi - Σ(j=i+1 to n) uij xj ) / uii
ENDDO
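The back-substitution loop above can be written directly in Python (a minimal sketch; the function name and the sample upper-triangular system are my own):

```python
def back_substitute(U, b):
    """Solve [U]{x} = {b} for upper-triangular U, working from the last row up."""
    n = len(b)
    x = [0.0] * n
    x[n - 1] = b[n - 1] / U[n - 1][n - 1]      # x_n = b_n / u_nn
    for i in range(n - 2, -1, -1):             # rows n-1 down to 1
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / U[i][i]
    return x

# A small upper-triangular system (the triangularized 3x3 example used later)
U = [[50.0, 1.0, 2.0],
     [0.0, 40.0, 4.0],
     [0.0, 0.0, 29.0]]
x = back_substitute(U, [1.0, 2.0, 2.7])
```

Each x_i uses only the already-computed x_{i+1}, ..., x_n, which is why the loop must run upward from the last row.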
Example:

[ 50  1  2 ] { x1 }   { 1 }
[  1 40  4 ] { x2 } = { 2 }
[  2  6 30 ] { x3 }   { 3 }

To 2 significant figures, the exact solution is:

x_true = { 0.016, 0.041, 0.091 }
Forward elimination, carrying 2 significant figures. Eliminating the first column (the multipliers 1/50 and 2/50 barely change the rows at this precision):

[ 50  1  2 | 1 ]
[  0 40  4 | 2 ]
[  0  6 30 | 3 ]

Eliminating the second column (30 - (6/40)(4) = 29, 3 - (6/40)(2) = 2.7):

[ 50  1  2 | 1   ]
[  0 40  4 | 2   ]
[  0  0 29 | 2.7 ]
Back substitution:

x3 = 2.7 / 29 = 0.093
x2 = (2 - 4 x3) / 40 = 0.040
x1 = (1 - 2 x3 - x2) / 50 = 0.016
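The same elimination can be sketched in full double precision (a minimal naive implementation, no pivoting; the function name is my own):

```python
def gauss_eliminate(A, b):
    """Naive Gaussian elimination (no pivoting) followed by back substitution."""
    n = len(b)
    A = [row[:] for row in A]        # work on copies
    b = b[:]
    for k in range(n - 1):           # forward elimination
        for i in range(k + 1, n):
            factor = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= factor * A[k][j]
            b[i] -= factor * b[k]
    x = [0.0] * n                    # back substitution
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x

A = [[50.0, 1.0, 2.0], [1.0, 40.0, 4.0], [2.0, 6.0, 30.0]]
x = gauss_eliminate(A, [1.0, 2.0, 3.0])   # close to {0.016, 0.041, 0.091}
```

At full precision the answer agrees with x_true; the errors in the hand computation above come from carrying only 2 significant figures.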
Now reorder the same equations so that the large coefficients sit off the diagonal:

[  2  6 30 | 3 ]
[ 50  1  2 | 1 ]
[  1 40  4 | 2 ]

After forward elimination (2 significant figures), we obtain:

[ 2    6    30 |   3 ]
[ 0 -150  -750 | -74 ]
[ 0    0  -200 | -19 ]

Now backsolve:

x3 = 0.095   (vs. x_true: 0.091)
With the good ordering, each diagonal coefficient dominates its row, and the rough estimates xi ≈ bi/aii are close to the true solution:

[ 50  1  2 | 1 ]      x1 ≈ 1/50 = 0.02
[  1 40  4 | 2 ]      x2 ≈ 2/40 = 0.05
[  2  6 30 | 3 ]      x3 ≈ 3/30 = 0.10

With the bad ordering, the same estimates are far off:

[  2  6 30 | 3 ]      x1 ≈ 3/2 = 1.50
[ 50  1  2 | 1 ]      x2 ≈ 1/1 = 1.00
[  1 40  4 | 2 ]      x3 ≈ 2/4 = 0.50
Goals
1. Best accuracy (i.e. minimize error)
2. Parsimony (i.e. minimize effort)
Possible Problems:
A. A zero on the diagonal
B. Many floating point operations (flops) cause numerical precision problems and propagation of errors.
C. The system may be ill-conditioned, i.e. det[A] ≈ 0
D. No solution or an infinite # of solutions, i.e. det[A] = 0
Possible Remedies:
A. Carry more significant figures (double precision)
B. Pivot when the diagonal is close to zero.
C. Scaling to reduce round-off error.
PIVOTING (C&C 4th, 9.4.2, p. 250)
A. Row pivoting (Partial Pivoting). In any good routine: at each step i, find
max_k | a_ki | for k = i, i+1, ..., n
and move the corresponding row into the pivot position.
(i) Avoids a zero a_ii
(ii) Keeps the numbers in the table small and minimizes round-off
(iii) Uses an equation with large | a_ki | to find x_i
B. Column pivoting. Reorder the remaining variables x_j, j = i, ..., n, so as to get the largest | a_ij |.
Column pivoting changes the order of the unknowns x_i and thus adds complexity to the algorithm. It is not usually done.
C. Complete or Full pivoting. Perform both row and column pivoting.
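Row (partial) pivoting can be sketched as follows (a minimal illustration; the function name is my own, and no scaling or singularity checks are included):

```python
def gauss_partial_pivot(A, b):
    """Gaussian elimination with row (partial) pivoting: at step i, the row
    k >= i with the largest |A[k][i]| is swapped into the pivot position."""
    n = len(b)
    A = [row[:] for row in A]
    b = b[:]
    for i in range(n - 1):
        p = max(range(i, n), key=lambda k: abs(A[k][i]))  # choose pivot row
        if p != i:
            A[i], A[p] = A[p], A[i]
            b[i], b[p] = b[p], b[i]
        for k in range(i + 1, n):
            factor = A[k][i] / A[i][i]
            for j in range(i, n):
                A[k][j] -= factor * A[i][j]
            b[k] -= factor * b[i]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x

# Equations deliberately listed in a bad order; pivoting reorders them.
x = gauss_partial_pivot([[2.0, 6.0, 30.0], [50.0, 1.0, 2.0], [1.0, 40.0, 4.0]],
                        [3.0, 1.0, 2.0])
```

Even with the bad initial ordering, pivoting restores the well-conditioned elimination sequence.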
How to fool pivoting: make one equation have very large coefficients. Multiply the third equation by 100; after pivoting it becomes the first row:

[ 200 600 3000 | 300 ]
[  50   1    2 |   1 ]
[   1  40    4 |   2 ]

Forward elimination (2 significant figures) yields:

[ 200  600  3000 | 300 ]
[   0 -150  -750 | -74 ]
[   0    0  -200 | -19 ]

Backsolution yields:

x3 = 0.095
x2 = 0.020
x1 = 0.000

Compared with x_true = {0.016, 0.041, 0.091}, the results are poor: pivoting was tricked into choosing the artificially scaled row as the first pivot.
Scaling: divide each equation by its largest coefficient,

a_ij  <--  a_ij / max_j | a_ij |

Example: suppose the first unknown of the earlier system is expressed in different units, so that the first column is multiplied by 1000:

[ 50000  1  2 | 1 ]
[  1000 40  4 | 2 ]
[  2000  6 30 | 3 ]

Scaling yields:

[ 1  0.00002  0.00004 | 0.00002 ]
[ 1  0.04     0.004   | 0.002   ]
[ 1  0.003    0.015   | 0.0015  ]

Which equation is used to determine x1??? After scaling, all three rows have the same leading coefficient, so pivoting is no longer fooled by the artificial factor of 1000.
Why bother to scale?
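Scaling need not be applied destructively; it can be used only to choose the pivot row. A minimal sketch of that idea (function name mine):

```python
# Choose the pivot row by comparing coefficients scaled by each row's
# largest element, so artificially large equations do not win the pivot.
def scaled_pivot_row(A, i):
    """Return the row index k >= i maximizing |A[k][i]| / max_j |A[k][j]|."""
    def scaled(k):
        return abs(A[k][i]) / max(abs(v) for v in A[k])
    return max(range(i, len(A)), key=scaled)

# The example system with its third equation multiplied by 100:
A = [[50.0, 1.0, 2.0], [1.0, 40.0, 4.0], [200.0, 600.0, 3000.0]]
p = scaled_pivot_row(A, 0)   # scaled sizes: 1.0, 0.025, 0.0667 -> row index 0
```

The scaled comparison still selects the 50 x1 equation as the first pivot, which plain partial pivoting (comparing 200 against 50) would not.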
OPERATION COUNTING (C&C 4th, 9.2.1, p.242)
Number of multiplies and divides often determines the CPU time.
One floating point multiply/divide and associated adds/subtracts is called
a FLOP: FLoating point OPeration.
Some useful identities for counting FLOPS:

1)  Σ(i=1 to m) c f(i) = c Σ(i=1 to m) f(i)

2)  Σ(i=1 to m) [ f(i) + g(i) ] = Σ(i=1 to m) f(i) + Σ(i=1 to m) g(i)

3)  Σ(i=1 to m) 1 = 1 + 1 + ... + 1 = m

4)  Σ(i=k to m) 1 = m - k + 1

5)  Σ(i=1 to m) i = 1 + 2 + 3 + 4 + ... + m = m(m+1)/2 = m²/2 + O(m)

6)  Σ(i=1 to m) i² = 1² + 2² + ... + m² = m(m+1)(2m+1)/6 = m³/3 + O(m²)
Example: count the FLOPS in this doubly nested loop.

DO i = 1 to n
    Y(i) = X(i)*X(i) + 1
    DO j = i to n
        Z(j) = [ Y(j) / X(i) ]*Y(j) + X(i)
    ENDDO
ENDDO
With nested loops, always start from the innermost loop.

[Y(j)/X(i)]*Y(j) + X(i) represents 2 FLOPS, so the inner j-loop costs:

Σ(j=i to n) 2 = 2(n - i + 1) FLOPS

For the outer i-loop, X(i)*X(i) + 1 represents 1 FLOP:

Σ(i=1 to n) [ 1 + 2(n - i + 1) ] = Σ(i=1 to n) (2n + 3) - 2 Σ(i=1 to n) i
                                 = n(2n + 3) - 2 n(n+1)/2
                                 = 2n² + 3n - n² - n
                                 = n² + 2n = n² + O(n)
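The count can be checked empirically by tallying one FLOP per multiply/divide (with its associated adds/subtracts), a small sketch of the derivation above:

```python
# Tally the FLOPS in the nested loops and compare with n^2 + 2n.
def count_flops(n):
    flops = 0
    for i in range(1, n + 1):
        flops += 1           # Y(i) = X(i)*X(i) + 1
        for j in range(i, n + 1):
            flops += 2       # Z(j) = [Y(j)/X(i)]*Y(j) + X(i)
    return flops

for n in (5, 50, 500):
    assert count_flops(n) == n * n + 2 * n
```

The agreement for several n confirms the closed-form count n² + 2n.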
Back Substitution:

X(n) = C(n)/A(n,n)
DO i = n-1 to 1 by -1
    SUM = 0
    DO j = i+1 to n
        SUM = SUM + A(i,j)*X(j)
    ENDDO
    X(i) = [C(i) - SUM]/A(i,i)
ENDDO
Forward Elimination. At step k, each of the rows i = k+1, ..., n costs 2 + (n - k) FLOPS: one to form the factor, n - k for the row update, and one for the right-hand side.

Inner loop:   Σ(j=k+1 to n) 1 = n - (k+1) + 1 = n - k

Second loop:  Σ(i=k+1 to n) [ 2 + (n - k) ] = [ (2 + n) - k ](n - k)
                                            = (n² + 2n) - 2(n + 1)k + k²

Outer loop:   Σ(k=1 to n-1) [ (n² + 2n) - 2(n + 1)k + k² ]
              = (n² + 2n)(n - 1) - 2(n + 1) (n - 1)n/2 + (n - 1)(n)(2n - 1)/6
              = n³/3 + O(n²)
Back Substitution.

Inner loop:   Σ(j=i+1 to n) 1 = n - (i+1) + 1 = n - i

Outer loop:   Σ(i=1 to n-1) [ 1 + (n - i) ] = (1 + n)(n - 1) - Σ(i=1 to n-1) i
              = (1 + n)(n - 1) - (n - 1)n/2
              = n²/2 + O(n)

Total flops = Forward Elimination + Back Substitution
            = n³/3 + O(n²) + n²/2 + O(n)
            = n³/3 + O(n²)
Starting from

[ a11 a12 a13 ... a1n ] { x1 }   { b1 }
[ a21 a22 a23 ... a2n ] { x2 }   { b2 }
[ a31 a32 a33 ... a3n ] { x3 } = { b3 }
[  :   :   :       :  ] {  : }   {  : }
[ an1 an2 an3 ... ann ] { xn }   { bn }

perform both backward and forward elimination until:

[ 1 0 0 ... 0 ] { x1 }   { b1' }
[ 0 1 0 ... 0 ] { x2 }   { b2' }
[ 0 0 1 ... 0 ] { x3 } = { b3' }
[ :  :  :    : ] {  : }   {  :  }
[ 0 0 0 ... 1 ] { xn }   { bn' }

so that the solution can be read off directly: xi = bi'.
Operation count:

Gauss-Jordan:        n³/2 + O(n²)
Gauss elimination:   n³/3 + O(n²)

Gauss-Jordan is 50% slower than Gauss elimination.
Example (2 significant figures):

[ 50  1  2 | 1 ]
[  1 40  4 | 2 ]
[  2  6 30 | 3 ]

Normalize row 1 and eliminate the first column:

[ 1 0.02 0.04 | 0.02 ]
[ 0 40   4    | 2    ]
[ 0 6    30   | 3    ]

Normalize row 2 and eliminate the second column, above and below the diagonal:

[ 1 0 0.038 | 0.019 ]
[ 0 1 0.1   | 0.05  ]
[ 0 0 29    | 2.7   ]

Normalize row 3 and eliminate the third column:

[ 1 0 0 | 0.015 ]
[ 0 1 0 | 0.041 ]
[ 0 0 1 | 0.093 ]

=>  x1 = 0.015,  x2 = 0.041,  x3 = 0.093
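Gauss-Jordan elimination is compact to express in code (a minimal sketch in full precision; the function name is my own, and no pivoting is included):

```python
def gauss_jordan(A, b):
    """Gauss-Jordan: normalize each pivot row, then eliminate the pivot
    column both below AND above it, leaving [ I | x ]."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix [A | b]
    for i in range(n):
        piv = M[i][i]
        M[i] = [v / piv for v in M[i]]             # normalize pivot row
        for k in range(n):
            if k != i:                             # eliminate in every other row
                f = M[k][i]
                M[k] = [vk - f * vi for vk, vi in zip(M[k], M[i])]
    return [row[n] for row in M]                   # last column is {x}

x = gauss_jordan([[50.0, 1.0, 2.0], [1.0, 40.0, 4.0], [2.0, 6.0, 30.0]],
                 [1.0, 2.0, 3.0])
```

The extra work relative to Gauss elimination is the elimination above the diagonal, which is what raises the count to n³/2.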
{x} = [A]-1{b}
[A] [A]-1 = [ I ]
1) Create the augmented matrix: [ A | I ]
2) Apply Gauss-Jordan elimination:
==> [ I | A-1 ]
[A | I] =
[ 50  1  2 | 1 0 0 ]
[  1 40  4 | 0 1 0 ]
[  2  6 30 | 0 0 1 ]

Normalize row 1 and eliminate the first column:

[ 1 0.02 0.04 |  0.02 0 0 ]
[ 0 40   4    | -0.02 1 0 ]
[ 0 6    30   | -0.04 0 1 ]

Continuing the Gauss-Jordan sweeps through columns 2 and 3 (still carrying 2 significant figures):

[ 1 0 0 |  0.020  -0.00037 -0.0013 ]
[ 0 1 0 | -0.0029  0.026   -0.0054 ]   =   [ I | A⁻¹ ]
[ 0 0 1 | -0.0014 -0.0036   0.036  ]
MATRIX INVERSE [A-1]
CHECK: multiply [A] by the computed [A]⁻¹ and compare with [ I ]:

[ 50  1  2 ]   [  0.020  -0.00037 -0.0013 ]
[  1 40  4 ] x [ -0.0029  0.026   -0.0054 ]
[  2  6 30 ]   [ -0.0014 -0.0036   0.036  ]

The product is only approximately the identity (diagonal entries of about 0.997 and 1.016, off-diagonal entries as large as 0.13): the 2-significant-figure arithmetic has visibly degraded the inverse.
Once [A]⁻¹ is known, each new right-hand side costs only a matrix-vector multiply:

[ A ]⁻¹ { c } = { x }

[  0.020  -0.00037 -0.0013 ] { 1 }   { 0.015 }
[ -0.0029  0.026   -0.0054 ] { 2 } = { 0.033 }
[ -0.0014 -0.0036   0.036  ] { 3 }   { 0.099 }
Compare:

x_true               = { 0.016, 0.041, 0.091 }
Gaussian elimination = { 0.016, 0.040, 0.093 }
a) Decompose [A] into [L][U]
b) Forward substitution: [L]{d} = {b}, i.e. [ L | b ] ==> { d }
c) Backward substitution: [U]{x} = {d} ==> { x }
Doolittle LU Decomposition (See C&C 4th, 10.1.2-3, p. 266)

[U] is just the upper triangular matrix left by Gaussian elimination:
[ A | b ] ==> [ U | b' ]

[L] has ones on the diagonal (i.e., it is a "unit lower triangular matrix" and can therefore be denoted [L1]); its elements below the diagonal are just the factors used to scale rows when doing Gaussian elimination, e.g.,

l_i1 = a_i1 / a_11   for i = 2, 3, ..., n
[ a11 a12 a13 ... a1n ]   [ 1   0   0   ... 0 ] [ u11 u12 u13 ... u1n ]
[ a21 a22 a23 ... a2n ]   [ l21 1   0   ... 0 ] [ 0   u22 u23 ... u2n ]
[ a31 a32 a33 ... a3n ] = [ l31 l32 1   ... 0 ] [ 0   0   u33 ... u3n ]
[  :   :   :       :  ]   [  :   :   :      : ] [ :    :   :       :  ]
[ an1 an2 an3 ... ann ]   [ ln1 ln2 ln3 ... 1 ] [ 0   0   0   ... unn ]

then [L1]{d} = {b}  ===>  [U]{x} = {d}, in which {d} is synonymous with {b'}
Basic Approach (C&C Figure 10.1): consider [A]{x} = {b}

a) Use Gauss-type "decomposition" of [A] into [L][U]     n³/3 flops
b) Forward substitution:  [L]{d} = {b}                   n²/2 flops
c) Back substitution:     [U]{x} = {d}                   n²/2 flops
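The three steps above can be sketched in Python (a minimal Doolittle decomposition without pivoting; function names are my own):

```python
def doolittle_lu(A):
    """Doolittle decomposition A = [L1][U]: L has a unit diagonal and stores
    the Gauss elimination factors; U is the eliminated upper triangle."""
    n = len(A)
    U = [row[:] for row in A]
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]      # factor l_ik = a_ik / a_kk
            for j in range(k, n):
                U[i][j] -= L[i][k] * U[k][j]
    return L, U

def lu_solve(L, U, b):
    n = len(b)
    d = [0.0] * n
    for i in range(n):                        # forward:  [L]{d} = {b}
        d[i] = b[i] - sum(L[i][j] * d[j] for j in range(i))
    x = [0.0] * n
    for i in range(n - 1, -1, -1):            # backward: [U]{x} = {d}
        x[i] = (d[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x

L, U = doolittle_lu([[50.0, 1.0, 2.0], [1.0, 40.0, 4.0], [2.0, 6.0, 30.0]])
x = lu_solve(L, U, [1.0, 2.0, 3.0])
```

Once [L] and [U] are stored, each additional right-hand side reuses them and costs only the two n²/2-flop substitution sweeps.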
LU Decomposition Variations

Doolittle    [L1][U]     General [A]       C&C 10.1.2-3
Crout        [L][U1]     General [A]       C&C 10.1.4
Cholesky     [L][L]ᵀ     PD Symmetric      C&C 11.1.2

Crout uses each element of [A] only once, so the same array can be used for [A] and [L\U1], saving computer memory! (The 1's of [U1] are not stored.)
Solving [A]{x} = {b} by computing [A]⁻¹ and then multiplying [A]⁻¹{b} costs

n³/3 + n³ = 4n³/3 + O(n²) flops

Don't do it.

Summary of operation counts:

Gauss elimination (1 r.h.s.)          n³/3 + O(n²)
Gauss-Jordan (1 r.h.s.)               n³/2 + O(n²)
LU decomposition                      n³/3 + O(n²)
Each additional r.h.s. with LU        n²
Cholesky (symmetric, pos. def.)       n³/6 + O(n²)
Solution via [A]⁻¹                    4n³/3 + O(n²)
Inverse by Gauss-Jordan on [A | I]    n³ + O(n²)
Cramer's rule                         ~ n!
[Figure: the two equations a11 x1 + a12 x2 = b1 and a21 x1 + a22 x2 = b2 plotted as straight lines in the (x1, x2) plane, with intercepts b1/a11, b1/a12, b2/a21, b2/a22. Their intersection is the solution. When the lines are nearly parallel, the intersection is poorly defined and a small shift in either line produces a large uncertainty in x2: the system is ill-conditioned.]
Vector norms:

|| x ||_p = ( Σ(i=1 to n) | x_i |^p )^(1/p)        (p-norm)

|| x ||_1 = Σ(i=1 to n) | x_i |                    (sum of magnitudes)

|| x ||_∞ = max_i | x_i |                          (maximum magnitude)

|| x ||_2 = ( Σ(i=1 to n) x_i² )^(1/2)             (Euclidean length)

Properties:

|| x + y || ≤ || x || + || y ||          "Triangle Inequality"

| x · y | = || x || || y || | cos θ |  ≤  || x || || y ||

Matrix norms are induced by vector norms:

|| A || = max of || A x ||_p over all x with || x ||_p = 1
1. Column-sum norm:   || A ||_1 = max_j Σ(i=1 to n) | a_ij |

2. Row-sum norm:      || A ||_∞ = max_i Σ(j=1 to n) | a_ij |

3. Spectral norm:     || A ||_2 = ( μ_max )^(1/2), where μ_max is the largest eigenvalue of [A]ᵀ[A]

4. || A + B || ≤ || A || + || B ||      "Triangle Inequality"
   || A B || ≤ || A || || B ||
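The two summation norms are one-liners in Python (a minimal sketch; function names are my own):

```python
def norm_1(A):
    """Column-sum norm: max over j of sum_i |a_ij|."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

def norm_inf(A):
    """Row-sum norm: max over i of sum_j |a_ij|."""
    return max(sum(abs(v) for v in row) for row in A)

A = [[50.0, 1.0, 2.0], [1.0, 40.0, 4.0], [2.0, 6.0, 30.0]]
# column sums are 53, 47, 36; row sums are 53, 45, 38
```

For this matrix both norms happen to equal 53, because the first row and the first column have the same entries.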
Why are norms important?
Norms permit us to express the accuracy of the solution {x} as a single number, || δx || / || x ||.
Norms allow us to bound the magnitude of the product [A]{x} and the associated errors.
From [A]{x} = {b} and the lemma || A x || ≤ || A || || x ||:

|| b || ≤ || A || || x ||   or   || x || ≥ || b || / || A ||

Now an error in {b} yields a corresponding error in {x}:

[A]{x + δx} = {b + δb}
[A]{x} + [A]{δx} = {b} + {δb}

Subtracting [A]{x} = {b} yields:

[A]{δx} = {δb}   ==>   {δx} = [A]⁻¹{δb}

Taking norms and using the lemma:

|| δx || ≤ || A⁻¹ || || δb ||

Dividing by || x || ≥ || b || / || A ||:

|| δx || / || x ||  ≤  || A || || A⁻¹ || · || δb || / || b ||

The quantity κ = cond[A] = || A || || A⁻¹ || is the condition number of [A].
This implies that if || δb || / || b || ~ 10⁻ᵖ, then

|| δx || / || x || ≤ κ · 10⁻ᵖ ~ 10⁻ˢ,   with   s = p - log₁₀(κ)

log₁₀(κ) is the loss in decimal precision, i.e., we start with p significant figures and end up with s significant figures. (This idea is expressed in words at the bottom of p. 280 of C&C.)
One does not necessarily need to find [A]⁻¹ to estimate κ = cond[A]. For example, if [A] is symmetric positive definite, κ = λ_max / λ_min, and one can bound λ_max by any matrix norm and calculate λ_min using the LU decomposition of [A] and a method called inverse vector iteration.

Programs such as MATLAB have built-in functions to calculate κ = cond[A] or the reciprocal condition number 1/κ (cond and rcond).
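For a 2x2 system the condition number can be computed exactly from the explicit inverse, which makes the nearly-parallel-lines picture quantitative (a minimal sketch; the function name and the sample coefficients are my own):

```python
def cond_inf_2x2(a11, a12, a21, a22):
    """kappa = ||A||_inf * ||A^-1||_inf for a 2x2 matrix, using the
    explicit inverse  A^-1 = [[a22, -a12], [-a21, a11]] / det."""
    det = a11 * a22 - a12 * a21
    inv = [[a22 / det, -a12 / det], [-a21 / det, a11 / det]]

    def row_sum_norm(M):
        return max(abs(M[0][0]) + abs(M[0][1]), abs(M[1][0]) + abs(M[1][1]))

    return row_sum_norm([[a11, a12], [a21, a22]]) * row_sum_norm(inv)

kappa_bad = cond_inf_2x2(1.0, 1.0, 1.0, 1.0001)   # two nearly parallel lines
kappa_id = cond_inf_2x2(1.0, 0.0, 0.0, 1.0)       # identity matrix
```

For the nearly parallel pair, κ is on the order of 10⁴, so by s = p - log₁₀(κ) roughly four to five decimal digits are lost; for the identity, κ = 1 and nothing is lost.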
Iterative Solution Methods -- Jacobi

Solve each equation for one unknown, then iterate with the old values on the right-hand side:

x1 = [ b1 - (a12 x2 + a13 x3 + ... + a1n xn) ] / a11
x2 = [ b2 - (a21 x1 + a23 x3 + ... + a2n xn) ] / a22
x3 = [ b3 - (a31 x1 + a32 x2 + ... + a3n xn) ] / a33
 :
xn = [ bn - (an1 x1 + an2 x2 + ... + a n,n-1 x n-1) ] / ann
Matrix Derivation

To solve [A]{x} = {b}, separate [A] into: [A] = [Lo] + [D] + [Uo]

[D]  = diagonal(a_ii)
[Lo] = lower triangular with zeros on the diagonal
[Uo] = upper triangular with zeros on the diagonal

Rewrite the system:   [D]{x} = {b} - ([Lo] + [Uo]){x}

Iterate:   [D]{x}^(j+1) = {b} - ([Lo] + [Uo]){x}^(j)

which is effectively:

x1^(j+1) = [ b1 - a12 x2^(j) - a13 x3^(j) - ... - a1n xn^(j) ] / a11
x2^(j+1) = [ b2 - a21 x1^(j) - a23 x3^(j) - ... - a2n xn^(j) ] / a22
x3^(j+1) = [ b3 - a31 x1^(j) - a32 x2^(j) - ... - a3n xn^(j) ] / a33
 :
xn^(j+1) = [ bn - an1 x1^(j) - an2 x2^(j) - ... - a n,n-1 x n-1^(j) ] / ann

Note that, although the new estimate of x1 was known, we did not use it to calculate the new x2.
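The Jacobi update can be sketched compactly (a minimal illustration on the diagonally dominant example system; the function name and iteration count are my own):

```python
def jacobi(A, b, x0, iters):
    """Jacobi iteration: every component of the new x is computed from the
    OLD x only (the new x1 is NOT used when updating x2)."""
    n = len(b)
    x = x0[:]
    for _ in range(iters):
        x = [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
             for i in range(n)]
    return x

A = [[50.0, 1.0, 2.0], [1.0, 40.0, 4.0], [2.0, 6.0, 30.0]]
x = jacobi(A, [1.0, 2.0, 3.0], [0.0, 0.0, 0.0], 20)
```

Because this system is strongly diagonally dominant, 20 sweeps are more than enough to reach the direct-method answer.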
Iterative Solution Methods -- Gauss-Seidel (C&C 4th, 11.2, p. 289)
In most cases using the newest values on the right-hand side equations will provide better estimates of
the next value. If this is done, then we are using the Gauss-Seidel Method:
( [L0] + [D] ) {x}^(j+1) = {b} - [U0] {x}^(j)

or explicitly:

x1^(j+1) = [ b1 - a12 x2^(j) - a13 x3^(j) - ... - a1n xn^(j) ] / a11
x2^(j+1) = [ b2 - a21 x1^(j+1) - a23 x3^(j) - ... - a2n xn^(j) ] / a22
 :
xn^(j+1) = [ bn - an1 x1^(j+1) - an2 x2^(j+1) - ... - a n,n-1 x n-1^(j+1) ] / ann
Convergence (for both Jacobi and Gauss-Seidel):

Iterative methods will not converge for all systems of equations, nor for all possible rearrangements. If the system is diagonally dominant, i.e.

| a_ii | > Σ(j=1 to n, j≠i) | a_ij |

then convergence is guaranteed, because each equation solved for its unknown,

x_i = c_i / a_ii - (a_i1 / a_ii) x_1 - (a_i2 / a_ii) x_2 - ... - (a_in / a_ii) x_n

has all | a_ij / a_ii | < 1.0, i.e., small slopes.
Relaxation: weight each new estimate with the previous value,

x_i^(new)  <--  λ x_i^(new) + (1 - λ) x_i^(old),    0 < λ < 2

The choice of λ is highly problem-dependent and empirical, so relaxation is usually only used for often-repeated calculations of a particular class.
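Gauss-Seidel with the relaxation weighting above can be sketched as follows (a minimal illustration; the function name and the default λ = 1, which recovers plain Gauss-Seidel, are my own choices):

```python
def gauss_seidel(A, b, x0, iters, lam=1.0):
    """Gauss-Seidel with optional relaxation factor lam.
    New components are used as soon as they are available."""
    n = len(b)
    x = x0[:]
    for _ in range(iters):
        for i in range(n):
            new = (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            x[i] = lam * new + (1.0 - lam) * x[i]   # weight with the old value
    return x

A = [[50.0, 1.0, 2.0], [1.0, 40.0, 4.0], [2.0, 6.0, 30.0]]
x = gauss_seidel(A, [1.0, 2.0, 3.0], [0.0, 0.0, 0.0], 15)
```

On this diagonally dominant system Gauss-Seidel converges in fewer sweeps than Jacobi because each update immediately benefits from the components already refined in the same sweep.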
Why Iterative Solutions?

You often need to solve [A]{x} = {b} where n = 1000's:
- Description of a building or airframe,
- Finite-difference approximation to a PDE.

Most of A's elements will be zero; for finite-difference approximations to the Laplace equation there are five a_ij ≠ 0 in each row of A.

Direct method (Gaussian elimination):
- Requires n³/3 flops (n = 5000: n³/3 ≈ 4 x 10¹⁰ flops)
- Fills in many of the n² - 5n zero elements of A

Iterative methods (Jacobi or Gauss-Seidel):
- Never store A (for n = 5000, that avoids storing 4n² bytes = 100 megabytes)
- Only need to compute (A - B)x and to solve B x^(t+1) = b - (A - B) x^(t)

Effort:
- Suppose B is diagonal; solving B v = b costs n flops
- Computing (A - B) x costs 4n flops
- For m iterations: 5mn flops