Algebraic Eigenvalue Problems
Eigenvalue analysis is an important practice in many fields of engineering and physics. We have also seen in this course that eigenvalue analysis plays an important role in the performance of many numerical algorithms. A general eigenvalue problem is stated as follows:

Definition 5.0.1 Given n × n matrices A and B, find numbers λ such that the equation

    Ax = λBx    (5.1)

is satisfied for some nontrivial vector x ≠ 0.

If B is invertible, then (5.1) can be reduced to

    Cx = λx    (5.2)

with C := B⁻¹A.
Even if both A and B are real-valued, it is likely that λ and x are complex-valued. Finding the solution of an eigensystem is a fairly complicated procedure. It is at least as difficult as finding the roots of polynomials. Therefore, any numerical method for solving eigenvalue problems is expected to be iterative in nature. Algorithms for solving eigenvalue problems include the power method, subspace iteration, the QR algorithm, the Jacobi method, the Arnoldi method and the Lanczos algorithm. Some major references in this field are given at the end of this note.
5.1 Perturbation Theory

Theorem 5.1.1 (Bauer–Fike) Suppose A is diagonalizable, say P⁻¹AP = diag(λ1, …, λn), and let μ be an eigenvalue of the perturbed matrix A + E. Then

    min_{1≤i≤n} |λi - μ| ≤ ‖P‖ ‖P⁻¹‖ ‖E‖.    (5.3)
Example. Let

    A = [ 101  -90 ]        E = [ 0  -ε ]
        [ 110  -98 ],           [ 0   ε ].

The eigenvalues of A are {1, 2}. The eigenvalues of A + E are

    { (3 + ε ± √(1 - 838ε + ε²)) / 2 }.

When ε = 0.001, the eigenvalues are approximately {1.299, 1.702}. Thus min |λi - μi| ≈ 0.298. This example shows that a small perturbation E can lead to a relatively large perturbation in the eigenvalues of A.
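For readers who want to experiment, here is a minimal NumPy sketch of the example above (A and E as written in the example):

    import numpy as np

    eps = 1e-3
    A = np.array([[101.0, -90.0],
                  [110.0, -98.0]])
    E = np.array([[0.0, -eps],
                  [0.0,  eps]])

    print(np.sort(np.linalg.eigvals(A)))      # [1. 2.]
    print(np.sort(np.linalg.eigvals(A + E)))  # approximately [1.299 1.702]

Note that the entries of E are of size 0.001 while the eigenvalues move by almost 0.3.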
Remark. When A is a normal matrix, i.e., when AA* = A*A (this class of matrices includes symmetric matrices, orthogonal matrices, Hermitian matrices, etc.), it is known that P may be chosen to be a unitary matrix (i.e., P*P = I). If we use the L₂-norm, then ‖P‖₂ = ‖P⁻¹‖₂ = 1. In this case, we have min |λi - μ| ≤ ‖E‖₂. This implies that eigenvalue problems for normal matrices are well-conditioned.
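A quick numerical illustration of this bound; the random test matrix and the perturbation size below are arbitrary choices, a sketch only:

    import numpy as np

    # For a symmetric (hence normal) A, every eigenvalue of A + E
    # lies within ||E||_2 of some eigenvalue of A, by (5.3) with ||P||_2 = 1.
    rng = np.random.default_rng(1)
    S = rng.standard_normal((5, 5))
    A = (S + S.T) / 2.0                      # symmetric test matrix
    E = 1e-3 * rng.standard_normal((5, 5))   # small perturbation

    lam = np.linalg.eigvalsh(A)              # eigenvalues of A (real)
    worst = max(np.min(np.abs(lam - mu)) for mu in np.linalg.eigvals(A + E))
    print(worst <= np.linalg.norm(E, 2))     # True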
Remark. When talking about the perturbation of eigenvectors, some caution should be taken.
(1) Eigenvectors are unique only up to multiplicative constants. Continuity should be discussed only under the assumption that all vectors are normalized in the same way.
(2) There are difficulties associated with multiple eigenvalues. For example, the matrix

    [ 1 + ε   0 ]
    [   0     1 ]

has eigenvectors [1, 0]ᵀ and [0, 1]ᵀ, whereas the matrix I has eigenvectors everywhere in R².
Examples. (1) Consider the matrix

    A = [ 1  1  0  0  0 ]
        [ 0  1  1  0  0 ]
        [ 0  0  1  1  0 ]
        [ 0  0  0  1  1 ]
        [ 0  0  0  0  1 ]

(shown here for n = 5), which is already in Jordan canonical form. It is not diagonalizable and has λ = 1 as an n-fold eigenvalue. Consider the slightly perturbed matrix

    A + E = [ 1  1  0  0  0 ]
            [ 0  1  1  0  0 ]
            [ 0  0  1  1  0 ]
            [ 0  0  0  1  1 ]
            [ ε  0  0  0  1 ]

whose only perturbation is the entry ε in the (n, 1) position. The eigenvalues of A + E are the roots of p(λ) = (1 - λ)ⁿ - (-1)ⁿ ε = 0. So

    λi(ε) - 1 = (ε^{1/n})_i,

where (ε^{1/n})_i is the i-th of the n-th roots of ε. Note that |λi(ε) - λi| = |ε^{1/n}|. With n = 10 and ε = 10⁻¹⁰, we find that |λi(ε) - λi| = 0.1. Note also that the matrix A has only one eigenvector whereas A + E has a complete set of eigenvectors.
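This behavior is easy to reproduce numerically; a minimal sketch with the perturbed Jordan block above:

    import numpy as np

    n, eps = 10, 1e-10
    A = np.eye(n) + np.diag(np.ones(n - 1), k=1)  # n x n Jordan block, eigenvalue 1
    E = np.zeros((n, n))
    E[n - 1, 0] = eps                             # perturbation in the (n,1) position

    mu = np.linalg.eigvals(A + E)
    print(np.abs(mu - 1.0))                       # all close to 0.1 = eps**(1/n)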
(2) Consider the matrix

    A = [ 2  10¹⁰ ]
        [ 0    2  ].

A is not diagonalizable. The exact eigenvalues of A are {2, 2}. Suppose that λ = 1 is an approximate eigenvalue with eigenvector x = [1, -10⁻¹⁰]ᵀ. Then we find the residual r := Ax - λx = [0, -10⁻¹⁰]ᵀ. Obviously, the residual is misleading.
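The misleading residual can be checked directly; a minimal sketch:

    import numpy as np

    A = np.array([[2.0, 1e10],
                  [0.0, 2.0]])
    x = np.array([1.0, -1e-10])
    lam = 1.0
    r = A @ x - lam * x
    print(r)  # approximately [0, -1e-10]: tiny residual, yet lam = 1 is far from 2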
5.2 The Power Method

Assume that A has eigenvalues λ1, …, λn with |λ1| > |λ2| ≥ … ≥ |λn| and corresponding linearly independent eigenvectors x1, …, xn. Starting from a vector x(0), the power method iterates

    w(k) := A x(k-1),
    x(k) := w(k) / ‖w(k)‖.

Observe that

    x(k) = A x(k-1) / ‖A x(k-1)‖ = A² x(k-2) / ‖A² x(k-2)‖ = … = A^k x(0) / ‖A^k x(0)‖.

Writing x(0) = Σ_{i=1}^n αi xi, we have

    A^k x(0) = Σ_{i=1}^n αi λi^k xi = λ1^k ( α1 x1 + Σ_{i=2}^n αi (λi/λ1)^k xi ).

It is clear that as k → ∞, the vector A^k x(0) behaves like α1 λ1^k x1, in the sense that the contribution from x2, …, xn becomes less and less significant. Normalization makes x(k) ≈ (λ1/|λ1|)^k x1/‖x1‖. That is, provided α1 ≠ 0, x(k) converges in direction to an eigenvector associated with λ1, and for any component j with (x1)j ≠ 0,

    (A x(k))j / (x(k))j → λ1.
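A straightforward NumPy sketch of the iteration just described; the test matrix, starting vector, and fixed iteration count are arbitrary choices:

    import numpy as np

    def power_method(A, x0, iters=100):
        # assumes a unique dominant eigenvalue |lambda_1| > |lambda_2| >= ...
        x = x0 / np.linalg.norm(x0)
        for _ in range(iters):
            w = A @ x                      # w(k) := A x(k-1)
            x = w / np.linalg.norm(w)      # x(k) := w(k) / ||w(k)||
        return (A @ x) @ x, x              # Rayleigh quotient estimate of lambda_1

    A = np.array([[2.0, 1.0],
                  [1.0, 3.0]])
    lam1, x1 = power_method(A, np.array([1.0, 0.0]))
    print(lam1)  # approximately 3.618 = (5 + sqrt(5))/2, the dominant eigenvalue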
One might hope to find other eigenvalues by applying the power method to a shifted matrix A - bI, whose eigenvalues are λ1 - b, …, λn - b.

[Figure: three real eigenvalues λ1 < λ2 < λ3 on the number line; a vertical bar at the midpoint of λ1 and λ3 separates the shifts b for which the shifted power method converges to λ3 from those for which it converges to λ1.]

For any choice of b, the shifted power method will converge to either λ1 or λ3 (the two extreme eigenvalues), but not to any other eigenvalue. The idea is plausible, however, in the following setting.
Suppose b is a fixed number that is not an eigenvalue of A. The matrix (A - bI)⁻¹ has eigenvalues

    1/(λ1 - b), …, 1/(λn - b),

and the eigenvalue of largest modulus is 1/(λj - b), where λj is the eigenvalue of A closest to b. We apply the power method to (A - bI)⁻¹, obtaining the shifted inverse power method:

    w(k) := (A - bI)⁻¹ x(k-1),
    x(k) := w(k) / ‖w(k)‖.

In practice, w(k) is computed by solving the linear system (A - bI) w(k) = x(k-1) rather than by forming the inverse.

[Figure: eigenvalues on the number line; vertical bars separate the regions of b for which the shifted inverse power method will be able to find, respectively, the different eigenvalues.]
Remark. There are several ways to choose the shift b:
(1) If some estimate of λi has been found, we may use it for b.
(2) We may generate b(0) at random and update the shift as the iteration proceeds. With w(k) denoting the current iterate, define

    b(k+1) := ((w(k))ᵀ A w(k)) / ((w(k))ᵀ w(k)),    (5.4)

and use this Rayleigh quotient as the shift for the next step,

    w(k+1) := (A - b(k+1) I)⁻¹ x(k).    (5.5)
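The following sketch implements the fixed-shift version; w(k) is obtained by solving a linear system, never by forming the inverse explicitly. The test matrix and shift are arbitrary choices:

    import numpy as np

    def shifted_inverse_power(A, b, x0, iters=50):
        # fixed shift b: converges to the eigenvalue of A closest to b
        n = A.shape[0]
        x = x0 / np.linalg.norm(x0)
        for _ in range(iters):
            w = np.linalg.solve(A - b * np.eye(n), x)  # (A - bI) w = x
            x = w / np.linalg.norm(w)
        return (A @ x) @ x, x

    A = np.diag([-1.0, 1.0, 2.0, 4.0])      # toy matrix with known eigenvalues
    lam, _ = shifted_inverse_power(A, 1.8, np.ones(4))
    print(lam)  # approximately 2.0, the eigenvalue closest to b = 1.8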
5.3 The QR Algorithm

The basic idea behind the QR algorithm is to form a sequence of matrices {Ai} that are isospectral to the original matrix A1 := A and that converge to a special form from which the eigenvalues can readily be read off. For simplicity, we shall restrict our discussion to real matrices in the sequel.
Algorithm 5.3.1 (Basic QR algorithm)
Given A ∈ R^{n×n}, define A1 := A.
For k = 1, 2, …, do:
    Calculate the QR decomposition Ak = Qk Rk;
    Define Ak+1 := Rk Qk.
Remark. We observe that Ak+1 = Rk Qk = Qkᵀ Ak Qk. So Ak+1 is orthogonally similar to Ak. By induction, the sequence {Ak} is isospectral to A.
Definition 5.3.1 We say a matrix A ∈ R^{n×n} is upper Hessenberg if and only if Aij = 0 for all i ≥ j + 2.
The QR algorithm is expensive because of the large number of computations that must be performed at each step. To reduce this overhead, the matrix A is usually first simplified by orthogonal similarity transformations to an upper Hessenberg matrix. This can be accomplished as follows. Given A, let U2 be the unitary matrix

    U2 = [ 1   0  …  0 ]
         [ 0            ]
         [ ⋮     Û2     ]
         [ 0            ]

where Û2 is formed by the Householder transformation for the column vector [a21, …, an1]ᵀ. Thus

    U2 A = [ x  x  …  x ]   (first row unchanged)
           [ x  x  …  x ]
           [ 0  x  …  x ]
           [ ⋮  ⋮      ⋮ ]
           [ 0  x  …  x ].

It is clear that the right multiplication in U2 A U2ᵀ does not change the first column of U2 A. Continuing this procedure, A is transformed by orthogonal similarity transformations into an upper Hessenberg matrix.
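A sketch of this reduction; the helper name to_hessenberg is ours, and SciPy users can compare the result with scipy.linalg.hessenberg:

    import numpy as np

    def to_hessenberg(A):
        # Householder similarity transformations H <- U H U^T, U = I - 2 v v^T
        H = np.array(A, dtype=float)
        n = H.shape[0]
        for j in range(n - 2):
            v = H[j + 1:, j].copy()              # entries below the diagonal, column j
            norm_v = np.linalg.norm(v)
            if norm_v == 0.0:
                continue
            v[0] += np.copysign(norm_v, v[0])    # sign chosen for numerical stability
            v /= np.linalg.norm(v)
            H[j + 1:, :] -= 2.0 * np.outer(v, v @ H[j + 1:, :])   # left multiply
            H[:, j + 1:] -= 2.0 * np.outer(H[:, j + 1:] @ v, v)   # right multiply
        return H

The result agrees with A up to orthogonal similarity, so the eigenvalues are preserved while all entries below the first subdiagonal become (numerically) zero.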
Working with an upper Hessenberg matrix, the QR decomposition can be carried out by a sequence of Givens rotations. Let Ωj,j+1 denote the rotation matrix

    Ωj,j+1 = [ 1                          ]
             [    ⋱                       ]
             [       cos θ   sin θ        ]  ← j-th row
             [      -sin θ   cos θ        ]
             [                      ⋱     ]
             [                         1  ],

which differs from the identity matrix only in rows and columns j and j + 1. For the first rotation Ω12, choosing

    cos θ = a11 / √(a11² + a21²),    sin θ = a21 / √(a11² + a21²)

annihilates the (2,1) entry. For a 4 × 4 upper Hessenberg matrix, the elimination sequence looks like

    A1 = [ x x x x ]    Ω12 A1 = A2 = [ x x x x ]
         [ x x x x ]                  [ 0 x x x ]
         [ 0 x x x ]                  [ 0 x x x ]
         [ 0 0 x x ]                  [ 0 0 x x ]

    Ω23 A2 = A3 = [ x x x x ]    Ω34 A3 = A4 = [ x x x x ] = R.
                  [ 0 x x x ]                  [ 0 x x x ]
                  [ 0 0 x x ]                  [ 0 0 x x ]
                  [ 0 0 x x ]                  [ 0 0 0 x ]
Working with upper Hessenberg matrices, the QR decomposition will only take O(n²) flops, while for general matrices the QR decomposition requires O(n³) flops.
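A sketch of one QR step exploiting the Hessenberg structure; only the two rows or columns touched by each rotation are updated, which is where the O(n²) count comes from:

    import numpy as np

    def hessenberg_qr_step(H):
        # one unshifted QR step H -> R Q on an upper Hessenberg matrix
        n = H.shape[0]
        R = np.array(H, dtype=float)
        cs = []
        for j in range(n - 1):                     # zero the subdiagonal entry (j+1, j)
            a, b = R[j, j], R[j + 1, j]
            r = np.hypot(a, b)
            c, s = (1.0, 0.0) if r == 0.0 else (a / r, b / r)
            G = np.array([[c, s], [-s, c]])        # the rotation Omega_{j,j+1}
            R[j:j + 2, j:] = G @ R[j:j + 2, j:]
            cs.append((c, s))
        for j, (c, s) in enumerate(cs):            # form R Q by right-applying Omega^T
            G = np.array([[c, -s], [s, c]])
            R[:j + 2, j:j + 2] = R[:j + 2, j:j + 2] @ G
        return R                                   # again upper Hessenberg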
Remark. It is important to note that if A1 = A is upper Hessenberg, then the orthogonal matrix Q1 in the QR decomposition A1 = Q1 R1 is also upper Hessenberg (prove this!). Therefore, A2 = R1 Q1 remains upper Hessenberg. It follows that the upper Hessenberg structure is maintained throughout the QR algorithm.
We have seen that

    Ak+1 = Qkᵀ Ak Qk = Qkᵀ ⋯ Q1ᵀ A Q1 ⋯ Qk.    (5.6)

Let us call

    Pk := Q1 ⋯ Qk,    (5.7)
    Uk := Rk ⋯ R1.    (5.8)

Then Ak+1 = Pkᵀ A Pk and, by induction,

    Pk Uk = A^k.    (5.9)

Suppose now that A = P Λ P⁻¹ with Λ = diag(λ1, …, λn) and |λ1| > |λ2| > ⋯ > |λn| > 0, that P = QR is the QR decomposition of P, and that P⁻¹ has an LU factorization P⁻¹ = LU with L unit lower triangular. Then

    A^k = P Λ^k P⁻¹ = QR Λ^k LU = QR (Λ^k L Λ^-k) Λ^k U.

The (i, j) entry of Λ^k L Λ^-k below the diagonal is lij (λi/λj)^k → 0 for i > j, so we may write Λ^k L Λ^-k = I + Ek with Ek → 0. Hence

    A^k = Q (I + R Ek R⁻¹) R Λ^k U = Q Q̃k R̃k R Λ^k U,

where Q̃k R̃k is the QR decomposition of I + R Ek R⁻¹; note that Q̃k → I and R̃k → I as k → ∞.
We recall the issue of uniqueness of the QR decomposition of a matrix. Suppose a nonsingular matrix B has two QR decompositions, say B = QR = Q̃R̃. Then Qᵀ Q̃ = R R̃⁻¹. But Qᵀ Q̃ is orthogonal whereas R R̃⁻¹ is upper triangular. An orthogonal and upper triangular matrix is diagonal with absolute value 1 along its diagonal. If we call R R̃⁻¹ =: D, then Q̃ = QD and R̃ = D⁻¹R. So the QR decomposition of B is unique if the diagonal elements of R are required to be positive.
Both Pk Uk and (Q Q̃k)(R̃k R Λ^k U) are QR-type factorizations of A^k: Pk and Q Q̃k are orthogonal, while Uk and R̃k R Λ^k U are upper triangular. By the uniqueness discussion above, there are diagonal matrices Dk with |Dk| = I such that

    Pk = Q Q̃k Dk,    Uk = Dk⁻¹ R̃k R Λ^k U.

Therefore

    Ak+1 = Pkᵀ A Pk = (Q Q̃k Dk)ᵀ A (Q Q̃k Dk)
         = Dkᵀ Q̃kᵀ (Qᵀ A Q) (Q̃k Dk)
         = Dkᵀ Q̃kᵀ (R P⁻¹ A P R⁻¹) (Q̃k Dk)
         = Dkᵀ Q̃kᵀ (R Λ R⁻¹) (Q̃k Dk).    (5.10)

Since Q̃k → I and R Λ R⁻¹ is upper triangular with λ1, …, λn along its diagonal, Ak+1 converges, up to the signs carried in Dk, to an upper triangular matrix whose diagonal elements are the eigenvalues of A.
Remark. From the above proof, we see that the rate of convergence depends on the ratios |λi+1/λi|. Convergence can be accelerated by shifting: given a shift ck, calculate the QR decomposition

    Ak - ck I = Qk Rk,    (5.11)

and define

    Ak+1 := Rk Qk + ck I.    (5.12)

Remark. We note that Rk = Qkᵀ(Ak - ck I). So Ak+1 = Qkᵀ(Ak - ck I)Qk + ck I = Qkᵀ Ak Qk. That is, Ak+1 and Ak are orthogonally similar.
Definition 5.3.3 Suppose Ak is upper Hessenberg. The shift factor ck may be determined from the eigenvalues, say λk and μk, of the bottom 2 × 2 submatrix of Ak. If both λk and μk are real, we take ck to be λk or μk according to whether |λk - ann^(k)| or |μk - ann^(k)| is smaller. If λk = μ̄k (a complex conjugate pair), then we choose ck = Re λk. Such a choice of shift factor is known as the Wilkinson shift.
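A sketch of one shifted QR step with this choice of ck; the helper names are ours:

    import numpy as np

    def wilkinson_shift(H):
        # shift from the eigenvalues of the bottom 2 x 2 submatrix of H (real case)
        a, b = H[-2, -2], H[-2, -1]
        c, d = H[-1, -2], H[-1, -1]
        tr, det = a + d, a * d - b * c
        disc = tr * tr - 4.0 * det
        if disc >= 0.0:                       # real eigenvalues: the one closer to a_nn
            mu1 = (tr + np.sqrt(disc)) / 2.0
            mu2 = (tr - np.sqrt(disc)) / 2.0
            return mu1 if abs(mu1 - d) < abs(mu2 - d) else mu2
        return tr / 2.0                       # complex pair: the real part

    def shifted_qr_step(H):
        n = H.shape[0]
        ck = wilkinson_shift(H)
        Q, R = np.linalg.qr(H - ck * np.eye(n))   # (5.11)
        return R @ Q + ck * np.eye(n)             # (5.12)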
Theorem 5.3.2 Given any matrix A and the relationship B = Qᵀ A Q, suppose B is upper Hessenberg with positive subdiagonal elements and suppose Q is orthogonal. Then the matrices Q and B are uniquely determined by the first column of Q.
Remark. Let P be an orthogonal matrix whose first column agrees with that of Q, and let U1, …, Un-2 be Householder transformations, none of which touches the first column, chosen so that (P U1 ⋯ Un-2)ᵀ A (P U1 ⋯ Un-2) becomes an upper Hessenberg matrix (recall how this is done!). Let Q̃ := P U1 ⋯ Un-2. Then we have Q̃ᵀ A Q̃ = B̃. Note that the first column of Q̃ is the same as that of Q. From Theorem 5.3.1, we conclude that B̃ = B.
Suppose Ak is upper Hessenberg. There is an easier way to compute Ak+1 = Qkᵀ Ak Qk from Ak, based on the above theorem and remark. Since Ak is upper Hessenberg, the first column of Qk is known: it is the vector [c, s, 0, …, 0]ᵀ with

    c := (a11^(k) - ck) / √( (a11^(k) - ck)² + (a21^(k))² ),
    s := a21^(k) / √( (a11^(k) - ck)² + (a21^(k))² ).

So the first column of Pᵀ is known. Take (illustrating with a 5 × 5 matrix)

    Pᵀ := [ c  -s  0  0  0 ]
          [ s   c  0  0  0 ]
          [ 0   0  1  0  0 ]
          [ 0   0  0  1  0 ]
          [ 0   0  0  0  1 ],

whose first column agrees with that of Qk. Then

    P Ak Pᵀ = [ x  x  x  x  x ]
              [ x  x  x  x  x ]
              [ x  x  x  x  x ]
              [ 0  0  x  x  x ]
              [ 0  0  0  x  x ]

is upper Hessenberg except for the extra nonzero "bulge" entry in position (3,1). This bulge can be chased down and off the matrix by further orthogonal similarity transformations that do not affect the first column; by Theorem 5.3.2, the resulting upper Hessenberg matrix must be Ak+1.
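The appearance of the bulge can be observed directly; a small sketch, with a random Hessenberg matrix and the (n, n) entry as an arbitrary shift:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 5
    H = np.triu(rng.standard_normal((n, n)), k=-1)   # random upper Hessenberg matrix
    ck = H[n - 1, n - 1]                             # some shift, here simply a_nn

    r = np.hypot(H[0, 0] - ck, H[1, 0])
    c, s = (H[0, 0] - ck) / r, H[1, 0] / r
    P = np.eye(n)
    P[:2, :2] = [[c, s], [-s, c]]                    # first column of P^T is [c, s, 0, ..., 0]^T

    B = P @ H @ P.T
    print(np.round(B, 2))   # upper Hessenberg except for the bulge at B[2, 0]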
5.4 Exercises

1. Consider the simultaneous (orthogonal) iteration

    Zk := A Qk-1,
    Qk Rk := Zk    (QR factorization).

Investigate how this iteration is related to the QR algorithm.
Bibliography

[1] D. K. Faddeev and V. N. Faddeeva, Computational Methods of Linear Algebra, W. H. Freeman and Co., San Francisco, 1963.
[2] B. N. Parlett, The Symmetric Eigenvalue Problem, Prentice-Hall, Englewood Cliffs, N.J., 1980.
[3] B. T. Smith et al., Matrix Eigensystem Routines: EISPACK Guide, Lecture Notes in Computer Science, Vol. 6, Springer-Verlag, Berlin, 1974.
[4] G. W. Stewart and Ji-guang Sun, Matrix Perturbation Theory, Academic Press, Boston, 1990.
[5] D. S. Watkins, Understanding the QR algorithm, SIAM Review, 24 (1982), 427-440.
[6] J. H. Wilkinson and C. Reinsch, Linear Algebra, Handbook for Automatic Computation, Vol. II, ed. F. L. Bauer, Springer-Verlag, Berlin, 1971.
[7] J. H. Wilkinson, The Algebraic Eigenvalue Problem, Clarendon Press, Oxford, 1988.