
November 22, 2010

EIGENVALUES AND EIGENVECTORS


RODICA D. COSTIN
Contents
1. An example: linear differential equations
2. Eigenvalues and eigenvectors: definition and calculation
2.1. Definitions
2.2. The characteristic equation
2.3. Geometric interpretation of eigenvalues and eigenvectors
2.4. Digression: the fundamental theorem of algebra
2.5. Solutions of the characteristic equation
2.6. Diagonal matrices
2.7. Similar matrices have the same eigenvalues
2.8. Projections
2.9. Trace, determinant and eigenvalues
2.10. Eigenvectors corresponding to different eigenvalues are independent
2.11. Diagonalization of matrices with linearly independent eigenvectors
2.12. Real matrices with complex eigenvalues: decomplexification
2.13. Eigenspaces
2.14. Multiple eigenvalues, Jordan normal form
3. Solutions of linear differential equations with constant coefficients
3.1. Diagonalizable matrix
3.2. Repeated eigenvalues
3.3. Higher order linear differential equations; companion matrix
3.4. Stability in differential equations
4. Difference equations (Discrete dynamical systems)
4.1. Linear difference equations with constant coefficients
4.2. Solutions of linear difference equations
4.3. Stability
4.4. Why does the theory of discrete equations parallel that of differential equations?
4.5. Example: Fibonacci numbers
4.6. Positive matrices
4.7. Markov chains
5. More functional calculus
5.1. Functional calculus for diagonalizable matrices
5.2. Commuting matrices
1. An example: linear differential equations
One important motivation for the study of eigenvalues is solving linear differential equations.
Consider the simple equation

du/dt = λu

which is linear, homogeneous, with constant coefficients, and whose unknown u has dimension one. Its general solution is, as is well known, u(t) = Ce^{λt}.
Consider next an equation which is also linear, homogeneous, with constant coefficients, but the unknown u has dimension two:

(1)   du₁/dt = (4a₁ − 3a₂)u₁ + (−2a₁ + 2a₂)u₂
      du₂/dt = (6a₁ − 6a₂)u₁ + (−3a₁ + 4a₂)u₂

where a₁, a₂ are constants, or, in matrix form,

(2)   du/dt = Mu

where

(3)   M = [ 4a₁ − 3a₂   −2a₁ + 2a₂ ]        u = [ u₁ ]
          [ 6a₁ − 6a₂   −3a₁ + 4a₂ ] ,          [ u₂ ]
For the system of equations (1), the solution is not immediately clear. Inspired by the one dimensional case we can look for exponential solutions. Substituting u(t) = e^{λt}x (where λ is a scalar and x is a constant vector) in (2) we obtain that the scalar λ and the vector x must satisfy

(4)   λx = Mx

or

(5)   (M − λI)x = 0

If the null space of the matrix M − λI is trivial, then the only solution of (5) is x = 0, which gives the (trivial!) identically zero solution of the differential system (2).
If, however, we can find special values of λ for which N(M − λI) is not trivial, then we have found a nontrivial solution of (2). Such values of λ are called eigenvalues of the matrix M, and vectors x ∈ N(M − λI) are called eigenvectors corresponding to the eigenvalue λ.
Of course, the necessary and sufficient condition for N(M − λI) ≠ {0} is that

(6)   det(M − λI) = 0

Solving (6) for the matrix (3) gives λ₁ = a₁ and λ₂ = a₂. For each eigenvalue we need the corresponding eigenvectors. Substituting λ = a₁ (then λ = a₂) in (4) and solving for x we obtain that the eigenvector x is any scalar multiple of x₁ (respectively x₂) where

(7)   x₁ = [ 2 ]        x₂ = [ 1 ]
           [ 3 ] ,           [ 2 ]
We obtained the solutions u₁(t) = e^{a₁t}x₁ and u₂(t) = e^{a₂t}x₂ of (1). Since the system is linear, any linear combination of solutions is again a solution, and we found the general solution of (1) as

(8)   u(t) = C₁u₁(t) + C₂u₂(t) = C₁e^{a₁t}x₁ + C₂e^{a₂t}x₂

To complete the problem, we should also check that for any initial conditions we can find constants C₁, C₂ for which the solution (8) satisfies these initial data. However, we leave this discussion for a course in differential equations. Furthermore, we will see in what follows that by carefully studying the algebraic properties of eigenvalues and eigenvectors this information comes for free!
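The computation above is easy to check numerically. The following short Python sketch (not part of the original notes; the sample values a₁ = 1, a₂ = −1 are an arbitrary choice) verifies the eigenvalues and the eigenvectors (7) of the matrix (3) with numpy:

```python
import numpy as np

a1, a2 = 1.0, -1.0   # arbitrary sample values for the constants a_1, a_2
M = np.array([[4*a1 - 3*a2, -2*a1 + 2*a2],
              [6*a1 - 6*a2, -3*a1 + 4*a2]])

# numpy returns the eigenvalues and (column) eigenvectors of M
eigvals, eigvecs = np.linalg.eig(M)
print(eigvals)                                # a1 and a2 (here 1 and -1), in some order

# check that x1 = (2,3)^T and x2 = (1,2)^T are eigenvectors
x1, x2 = np.array([2.0, 3.0]), np.array([1.0, 2.0])
print(np.allclose(M @ x1, a1 * x1), np.allclose(M @ x2, a2 * x2))   # True True
```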
2. Eigenvalues and eigenvectors: definition and calculation
2.1. Definitions. Denote the set of n×n (square) matrices with entries in F by

M_n(F) = {M | M = [M_ij]_{i,j=1,...,n}, M_ij ∈ F}

A matrix M ∈ M_n(F) defines a linear transformation on the vector space Fⁿ (over the scalars F) by the usual multiplication x ↦ Mx.
Note that a matrix with real entries can also act on Cⁿ, since for any x ∈ Cⁿ we can define Mx ∈ Cⁿ. But a matrix with complex, non-real entries cannot act on Rⁿ, since for x ∈ Rⁿ the image Mx may not belong to Rⁿ (while certainly Mx ∈ Cⁿ).

Definition 1. Let M be an n×n matrix acting on the vector space V = Fⁿ. A scalar λ ∈ F is an eigenvalue of M if for some nonzero vector x ∈ V, x ≠ 0, we have

(9)   Mx = λx

The vector x is called an eigenvector corresponding to the eigenvalue λ.
2.2. The characteristic equation. Equation (9) can be rewritten as Mx − λx = 0, or (M − λI)x = 0, which means that the nonzero vector x belongs to the null space of the matrix M − λI; in particular this matrix is not invertible. Using the theory of matrices, we know that this is equivalent to the fact that its determinant is zero:

det(M − λI) = 0

The determinant has the form

det(M − λI) = | M₁₁−λ   M₁₂     ...  M₁ₙ   |
              | M₂₁     M₂₂−λ   ...  M₂ₙ   |
              | ...                        |
              | Mₙ₁     Mₙ₂     ...  Mₙₙ−λ |

This is a polynomial in λ, which is easy to see by expansion of the determinant (and rigorously proved by induction on n).
Examples.
For n = 2 the characteristic polynomial is

| M₁₁−λ   M₁₂   |
| M₂₁     M₂₂−λ |  = (M₁₁ − λ)(M₂₂ − λ) − M₁₂M₂₁ = polynomial in λ of degree 2
For n = 3 the characteristic polynomial is

| M₁₁−λ   M₁₂     M₁₃   |
| M₂₁     M₂₂−λ   M₂₃   |
| M₃₁     M₃₂     M₃₃−λ |

and expanding along, say, row 1,

= (−1)^{1+1}(M₁₁−λ) det[ M₂₂−λ  M₂₃ ; M₃₂  M₃₃−λ ]
  + (−1)^{1+2} M₁₂ det[ M₂₁  M₂₃ ; M₃₁  M₃₃−λ ]
  + (−1)^{1+3} M₁₃ det[ M₂₁  M₂₂−λ ; M₃₁  M₃₂ ]
= polynomial in λ of degree 3
It is easy to show by induction that

det(M − λI) = polynomial in λ of degree n

Definition 2. The polynomial det(M − λI) is called the characteristic polynomial of the matrix M, and the equation det(M − λI) = 0 is called the characteristic equation of M.
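As a quick numerical aside (not part of the original notes), numpy can produce the coefficients of the characteristic polynomial directly; np.poly computes det(λI − M), which differs from det(M − λI) only by the sign (−1)ⁿ. Reusing the sample matrix from the earlier sketch (a₁ = 1, a₂ = −1):

```python
import numpy as np

M = np.array([[7.0, -4.0],
              [12.0, -7.0]])          # the matrix (3) with a1 = 1, a2 = -1

# coefficients of det(lambda*I - M), highest power first
print(np.poly(M))                     # approximately [1, 0, -1], i.e. lambda^2 - 1
print(np.roots(np.poly(M)))           # approximately [1, -1]: the eigenvalues
```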
2.3. Geometric interpretation of eigenvalues and eigenvectors. Let M be an n×n matrix, and let T : Rⁿ → Rⁿ defined by T(x) = Mx be the corresponding linear transformation.
If v is an eigenvector corresponding to an eigenvalue λ of M, that is Mv = λv, then T expands or contracts v (and any vector in its direction) λ times (and it does not change its direction!).
If the eigenvalue/eigenvector are not real, a similar fact is true, only that multiplication by a complex (not real) scalar cannot be easily called an expansion or a contraction (there is no ordering of the complex numbers); see the example of rotations, §2.5.1.
The special directions of the eigenvectors are called principal axes of the linear transformation (or of the matrix).
2.4. Digression: the fundamental theorem of algebra.
2.4.1. Polynomials of degree two: roots and factorization. Consider polynomials of degree two, with real coefficients: p(x) = ax² + bx + c. It is well known that p(x) has the real roots

x_{1,2} = (−b ± √(b² − 4ac)) / (2a)    if b² − 4ac ≥ 0

(where x₁ = x₂ when b² − 4ac = 0), and p(x) has no real roots if b² − 4ac < 0.
When the roots are real, the polynomial factors as ax² + bx + c = a(x − x₁)(x − x₂).
In particular, if x₁ = x₂ then p(x) = a(x − x₁)² and x₁ is called a double root. It is convenient to say that also in this case p(x) has two roots.
Note: if we know the two zeroes of the polynomial, then we know each factor of degree one: x − x₁ and x − x₂; the number a in front of (x − x₁)(x − x₂) takes care that the coefficient of x² equals a.
On the other hand, if p(x) has no real roots, then it cannot be factored within real numbers; it is called irreducible (over the real numbers).
2.4.2. Complex numbers and factorization of polynomials of degree two. If p(x) = ax² + bx + c is irreducible this means that b² − 4ac < 0, and we cannot take the square root of this quantity in order to calculate the two roots of p(x). However, writing b² − 4ac = (−1)(4ac − b²) and introducing the symbol i for √−1, we can write the zeroes of p(x) as

x_{1,2} = (−b ± i√(4ac − b²)) / (2a) = −b/(2a) ± i √(4ac − b²)/(2a)  ∈  R + iR = C

Considering the two roots x₁, x₂ complex, we can now factor ax² + bx + c = a(x − x₁)(x − x₂), only now the factors have complex coefficients. Within complex numbers every polynomial of degree two is reducible!
Note that the two roots of a quadratic polynomial with real coefficients are complex conjugate: if a, b, c ∈ R and x_{1,2} ∉ R then x₂ = x̄₁.
2.4.3. The fundamental theorem of algebra. It is absolutely remarkable that any polynomial can be completely factored using complex numbers:

Theorem 3. The fundamental theorem of algebra
Any polynomial p(x) = aₙxⁿ + a_{n−1}x^{n−1} + ... + a₀ with coefficients a_j ∈ C can be factored

(10)   aₙxⁿ + a_{n−1}x^{n−1} + ... + a₀ = aₙ(x − x₁)(x − x₂) ... (x − xₙ)

for a unique set of complex numbers x₁, x₂, ..., xₙ (the roots of the polynomial p(x)).
Remark. With probability one, the zeroes x₁, ..., xₙ of a polynomial p(x) are distinct. Indeed, if x₁ is a double root (or has higher multiplicity) then both relations p(x₁) = 0 and p′(x₁) = 0 must hold. This means that there is a relation between the coefficients a₀, ..., aₙ of p(x) (the multiplet (a₀, ..., aₙ) belongs to an n-dimensional surface in C^{n+1}).
2.4.4. Factorization within real numbers. If we want to restrict ourselves only to real numbers then we can factor any polynomial into factors of degree one or two:

Theorem 4. Factorization within real numbers
Any polynomial of degree n with real coefficients can be factored into factors of degree one or two with real coefficients.

Theorem 4 is an easy consequence of the (deep) Theorem 3. Indeed, first factor the polynomial in complex numbers as in (10). Then note that the zeroes x₁, x₂, ..., xₙ come in pairs of complex conjugate numbers, since if z satisfies p(z) = 0, then its complex conjugate z̄ also satisfies p(z̄) = 0. Then each pair of factors (x − z)(x − z̄) must be replaced in (10) by its expanded value:

(x − z)(x − z̄) = x² − (z + z̄)x + |z|²

which is an irreducible polynomial of degree 2, with real coefficients. □
2.5. Solutions of the characteristic equation. Let M be an n×n matrix. Its characteristic equation det(M − λI) = 0 is a polynomial equation of degree n, therefore it has n solutions in C (some of them may be equal).
If M is considered acting on vectors in Cⁿ then this is just fine, but if M is a matrix with real entries acting on Rⁿ, then the eigenvectors corresponding to complex, nonreal eigenvalues are also nonreal.
For an n×n matrix with real entries, if we want to have n guaranteed eigenvalues, then we have to accept working in Cⁿ. Otherwise, if we want to restrict ourselves to working only with real vectors, then we have to accept that we may have fewer (real) eigenvalues, or perhaps none, as in the case of rotations in R², shown below.
2.5.1. Rotation in the plane. Consider a rotation matrix

R_θ = [ cos θ   −sin θ ]
      [ sin θ    cos θ ]

To find its eigenvalues calculate

det(R_θ − λI) = | cos θ − λ    −sin θ     |
                | sin θ         cos θ − λ |  = (cos θ − λ)² + sin²θ = λ² − 2λ cos θ + 1

hence the solutions of the characteristic equation det(R_θ − λI) = 0 are λ_{1,2} = cos θ ± i sin θ = e^{±iθ}. It is easy to see that v₁ = (i, 1)ᵀ is an eigenvector corresponding to λ₁ = e^{iθ} and v₂ = (−i, 1)ᵀ is an eigenvector corresponding to λ₂ = e^{−iθ}.
Note that both eigenvalues are not real (in general) and, since the entries of R_θ are real, λ₂ = λ̄₁: the two eigenvalues are complex conjugate. Also, so are the two eigenvectors: v₂ = v̄₁.
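A quick numerical illustration (not from the original notes; the angle is an arbitrary choice):

```python
import numpy as np

theta = 0.7                                     # an arbitrary rotation angle
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

eigvals, eigvecs = np.linalg.eig(R)
print(eigvals)                                  # cos(theta) +/- i sin(theta)
print(np.allclose(sorted(eigvals, key=np.imag),
                  sorted([np.exp(-1j*theta), np.exp(1j*theta)], key=np.imag)))   # True
```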
2.6. Diagonal matrices. Let D be a diagonal matrix:

(11)   D = [ d₁  0   ...  0  ]
           [ 0   d₂  ...  0  ]
           [ ...             ]
           [ 0   0   ...  dₙ ]

Diagonal matrices are associated to dilations in Rⁿ (or Cⁿ): each axis x_j is stretched (or compressed) d_j times.
To find its eigenvalues, calculate

det(D − λI) = | d₁−λ   0      ...  0    |
              | 0      d₂−λ   ...  0    |
              | ...                     |
              | 0      0      ...  dₙ−λ |  = (d₁ − λ)(d₂ − λ) ... (dₙ − λ)

The eigenvalues are precisely the diagonal elements, and the eigenvector corresponding to d_j is e_j (as is easy to check). The principal axes of diagonal matrices are the coordinate axes x₁, ..., xₙ.
Diagonal matrices are easy to work with: you can easily check that any power D^k of D is the diagonal matrix having d_j^k on the diagonal.
Furthermore, let p(t) be a polynomial

p(t) = a_k t^k + a_{k−1} t^{k−1} + ... + a₁ t + a₀

Then for any n×n matrix M it makes sense to define p(M) as

p(M) = a_k M^k + a_{k−1} M^{k−1} + ... + a₁ M + a₀ I

If D is the diagonal matrix (11) then p(D) is the diagonal matrix having p(d_j) on the diagonal. (Check!)
Diagonal matrices can be viewed as the collection of their eigenvalues!
2.7. Similar matrices have the same eigenvalues. It is very easy to work with diagonal matrices, and a natural question arises: which linear transformations have a diagonal matrix in a well chosen basis? This is the main topic we will be exploring for many sections to come.
Recall that if the matrix A represents the linear transformation T : V → V in some basis B_V of V, and the matrix A′ represents the same linear transformation T, only in a different basis B′_V, then the two matrices are similar: A′ = SAS⁻¹ (where S is the matrix of change of basis).
Eigenvalues are associated to the linear transformation (rather than to its matrix representation):
Proposition 5. Two similar matrices have the same eigenvalues: if A, A′, S are n×n matrices, and A′ = SAS⁻¹, then the eigenvalues of A′ and of A are the same.

This is very easy to check, since

det(A′ − λI) = det(SAS⁻¹ − λI) = det( S(A − λI)S⁻¹ ) = det S · det(A − λI) · det S⁻¹ = det(A − λI)

so A′ and A have the same characteristic equation. □
2.8. Projections. Recall that projections satisfy P² = P (we saw this for projections in dimension two, and we will prove it in general).

Proposition 6. Let P be a square matrix satisfying P² = P. Then the eigenvalues of P can only be zero or one.

Proof. Let λ be an eigenvalue; this means that there is a nonzero vector v so that Pv = λv. Applying P to both sides of the equality we obtain P²v = P(λv) = λPv = λ²v. Using the fact that P²v = Pv = λv it follows that λv = λ²v, so (λ − λ²)v = 0, and since v ≠ 0 then λ − λ² = 0, so λ ∈ {0, 1}. □

Example. Consider the projection of R³ onto the x₁x₂ plane. Its matrix

P = [ 1  0  0 ]
    [ 0  1  0 ]
    [ 0  0  0 ]

is diagonal, with eigenvalues 1, 1, 0.
2.9. Trace, determinant and eigenvalues.

Definition 7. Let M be an n×n matrix, M = [M_ij]_{i,j=1,...,n}. The trace of M is the sum of its elements on the principal diagonal:

Tr M = Σ_{j=1}^{n} M_jj

The following theorem shows that the trace and the determinant of any matrix can be found as if the matrix were diagonal (its entries being, of course, its eigenvalues):

Theorem 8. Let M be an n×n matrix, and let λ₁, ..., λₙ be its n eigenvalues (complex, with multiplicities counted). Then

(12)   det M = λ₁λ₂ ... λₙ

and

(13)   Tr M = λ₁ + λ₂ + ... + λₙ

In particular, the traces of similar matrices are equal, and so are their determinants.
Corollary 9. A matrix M is invertible if and only if none of its eigenvalues is zero.

Sketch of the proof.
On one hand it is not hard to see that for any factored polynomial

(14)   p(x) ≡ (x − λ₁)(x − λ₂) ... (x − λₙ)
            = xⁿ − (λ₁ + λ₂ + ... + λₙ)x^{n−1} + ... + (−1)ⁿ(λ₁λ₂ ... λₙ)

In particular, p(0) = (−1)ⁿ λ₁λ₂ ... λₙ.
On the other hand, let p(x) = (−1)ⁿ det(M − xI) (recall that the coefficient of λⁿ in the characteristic polynomial is (−1)ⁿ); then p(0) = (−1)ⁿ det M, which proves (12).
To show (13) we expand the determinant det(M − λI) using minors and cofactors, keeping track of the coefficient of λ^{n−1}. As seen in the examples of §2.2, only the first term in the expansion contains the power λ^{n−1}, and, continuing to expand to lower and lower dimensional determinants, we see that the only term containing λ^{n−1} comes from

(M₁₁ − λ)(M₂₂ − λ) ... (Mₙₙ − λ)
      = (−1)ⁿ λⁿ − (−1)ⁿ (M₁₁ + M₂₂ + ... + Mₙₙ) λ^{n−1} + lower powers of λ

which, compared to (14), gives (13). □
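Relations (12) and (13) are easy to check numerically; here is a small Python sketch (not part of the original notes) on a random matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))        # a random 4x4 real matrix

lam = np.linalg.eigvals(M)             # its 4 (possibly complex) eigenvalues
print(np.isclose(np.prod(lam), np.linalg.det(M)))   # det M = product of eigenvalues
print(np.isclose(np.sum(lam),  np.trace(M)))        # Tr M  = sum of eigenvalues
```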
2.10. Eigenvectors corresponding to different eigenvalues are independent.

Theorem 10. Let M be an n×n matrix, and λ₁, ..., λ_k a set of distinct eigenvalues (λ_j ≠ λ_i for all i ≠ j, i, j = 1, ..., k). Let v_j be eigenvectors corresponding to the eigenvalues λ_j, for j = 1, ..., k.
Then the set v₁, ..., v_k is linearly independent.
In particular, if all the eigenvalues of M are in F and are distinct, then the set of corresponding eigenvectors forms a basis for Fⁿ.
Proof.
Assume, to obtain a contradiction, that the eigenvectors are linearly dependent: there are c₁, ..., c_k ∈ F, not all zero, such that

(15)   c₁v₁ + ... + c_k v_k = 0

Step I. We can assume that all c_j are not zero, otherwise we just remove those v_j from (15) and we have a similar equation with a smaller k. Then we can solve (15) for v_k:

(16)   v_k = c′₁v₁ + ... + c′_{k−1}v_{k−1}    where c′_j = −c_j / c_k

Applying M to both sides of (16) we obtain

(17)   λ_k v_k = c′₁λ₁v₁ + ... + c′_{k−1}λ_{k−1}v_{k−1}

Multiplying (16) by λ_k and subtracting from (17) we obtain

(18)   0 = c′₁(λ₁ − λ_k)v₁ + ... + c′_{k−1}(λ_{k−1} − λ_k)v_{k−1}

Step II. There are two possibilities.
1) Either all c′_j(λ_j − λ_k) are zero, which implies that all c′₁, ..., c′_{k−1} are zero, therefore all c₁, ..., c_{k−1} are zero, which from (15) implies that c_k v_k = 0, and since v_k ≠ 0 then also c_k = 0, which is a contradiction.
Or 2) not all c′_j are zero, in which case we go to Step I with a lower k.
The procedure decreases k, therefore it must end, and we have a contradiction. □
2.11. Diagonalization of matrices with linearly independent eigenvectors. Suppose that the n×n matrix M has n independent eigenvectors v₁, ..., vₙ.
Note that, by Theorem 10, this is the case if we work in F = C and all the eigenvalues are distinct (recall that this happens with probability one). Also, this is the case if we work in F = R and all the eigenvalues are real and distinct.
Let S be the matrix with columns v₁, ..., vₙ:

S = [v₁, ..., vₙ]

which is invertible, since v₁, ..., vₙ are linearly independent.
Since Mv_j = λ_j v_j, then

(19)   M[v₁, ..., vₙ] = [λ₁v₁, ..., λₙvₙ]

The left side of (19) equals MS. To identify the matrix on the right side of (19) note that since Se_j = v_j then S(λ_j e_j) = λ_j v_j, so

[λ₁v₁, ..., λₙvₙ] = S[λ₁e₁, ..., λₙeₙ] = SΛ,   where   Λ = [ λ₁  0   ...  0  ]
                                                           [ 0   λ₂  ...  0  ]
                                                           [ ...             ]
                                                           [ 0   0   ...  λₙ ]

Relation (19) is therefore

MS = SΛ,   or   S⁻¹MS = Λ = diagonal

Note that the matrix S which diagonalizes a matrix is not unique. For example, we can replace any eigenvector by a scalar multiple of it. Also, we can use different orders for the eigenvectors (this will result in a diagonal matrix with the same values on the diagonal, but in different positions).
Example 1. Consider the matrix (3), for which we found the eigenvalues λ_{1,2} = a_{1,2} and the corresponding eigenvectors (7). Taking

S = [ 2  1 ]
    [ 3  2 ]

we have

S⁻¹MS = [ a₁  0  ]
        [ 0   a₂ ]
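For a quick numerical confirmation of Example 1 (a sketch, not part of the original notes; again a₁ = 1, a₂ = −1 are sample values):

```python
import numpy as np

a1, a2 = 1.0, -1.0
M = np.array([[4*a1 - 3*a2, -2*a1 + 2*a2],
              [6*a1 - 6*a2, -3*a1 + 4*a2]])
S = np.array([[2.0, 1.0],
              [3.0, 2.0]])             # columns are the eigenvectors x1, x2 from (7)

Lambda = np.linalg.inv(S) @ M @ S      # S^{-1} M S
print(np.round(Lambda, 10))            # diag(a1, a2) = diag(1, -1)
```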
Not all matrices are diagonalizable, only those with distinct eigenvalues, and some matrices with multiple eigenvalues.

Example 2. The matrix

(20)   N = [ 0  1 ]
           [ 0  0 ]

has eigenvalues λ₁ = λ₂ = 0 (easy to calculate) but only one (up to a scalar multiple) eigenvector, v₁ = e₁.
Multiple eigenvalues are not guaranteed to have an equal number of independent eigenvectors!
To show that N is not diagonalizable, assume the contrary, to arrive at a contradiction. Suppose there exists an invertible matrix S so that S⁻¹NS = Λ where Λ is diagonal; then Λ has the eigenvalues of N on its diagonal, and therefore it is the zero matrix: S⁻¹NS = 0, which, multiplied by S to the left and by S⁻¹ to the right, gives N = 0, which is a contradiction.

Remark about the zero eigenvalue. A matrix M has an eigenvalue equal to zero if and only if its null space N(M) is nontrivial. Moreover, the matrix M has dim N(M) linearly independent eigenvectors corresponding to the eigenvalue zero.

Example 3. Examples of matrices that can be diagonalized even if they have multiple eigenvalues can be found as M = SΛS⁻¹ where Λ is diagonal, with some entries equal, and S is any invertible matrix.

Remark. If a matrix has real entries, and distinct eigenvalues some of which are not real, we cannot diagonalize within real numbers, but we can do it using complex numbers and complex vectors; see for example rotations, §2.5.1.
2.12. Real matrices with complex eigenvalues: decomplexification. Suppose the n×n matrix M has real entries, eigenvalues λ₁, ..., λₙ and n independent eigenvectors v₁, ..., vₙ. Then M is diagonalizable: if S = [v₁, ..., vₙ] then S⁻¹MS = Λ where Λ is a diagonal matrix with λ₁, ..., λₙ on its diagonal.
Suppose that some eigenvalues are not real. Then the matrices S and Λ are not real either, and the diagonalization of M must be done in Cⁿ. Recall that the nonreal eigenvalues and eigenvectors of real matrices come in pairs of complex-conjugate ones.
Suppose that we want to work in Rⁿ only. This can be achieved by replacing diagonal 2×2 blocks in Λ

[ λ_j   0   ]
[ 0     λ̄_j ]

by a 2×2 matrix which is not diagonal, but has real entries.
To see how this is done, suppose λ₁ ∈ C \ R and λ₂ = λ̄₁, v₂ = v̄₁.
Splitting into real and imaginary parts, write λ₁ = α₁ + iβ₁ and v₁ = x₁ + iy₁. Then from M(x₁ + iy₁) = (α₁ + iβ₁)(x₁ + iy₁), identifying the real and imaginary parts, we obtain

Mx₁ + iMy₁ = (α₁x₁ − β₁y₁) + i(β₁x₁ + α₁y₁)

Then in the matrix S for diagonalization, replace the first two columns v₁, v₂ = v̄₁ by x₁, y₁ (which are vectors in Rⁿ): using the matrix S̃ = [x₁, y₁, v₃, ..., vₙ] instead of S we have MS̃ = S̃Λ̃ where

Λ̃ = [ α₁    β₁   0    ...  0  ]
     [ −β₁   α₁   0    ...  0  ]
     [ 0     0    λ₃   ...  0  ]
     [ ...                     ]
     [ 0     0    0    ...  λₙ ]

We can similarly replace any pair of complex conjugate eigenvalues with 2×2 real blocks.
2.13. Eigenspaces. Consider an n×n matrix M with entries in F, with eigenvalues λ₁, ..., λₙ in F.

Definition 11. The set

V_{λ_j} = {x ∈ Fⁿ | Mx = λ_j x}

is called the eigenspace of M associated to the eigenvalue λ_j.

Exercise. Show that V_{λ_j} is the null space of the transformation M − λ_j I, and that V_{λ_j} is a subspace of Fⁿ.

Note that all the nonzero vectors in V_{λ_j} are eigenvectors of M corresponding to the eigenvalue λ_j.

Definition 12. A subspace V is called an invariant subspace for M if M(V) ⊆ V (which means that if x ∈ V then Mx ∈ V).

Exercise. Show that V_{λ_j} is an invariant subspace for M. Show that V_{λ_j} ∩ V_{λ_k} = {0} if λ_j ≠ λ_k.
Denote by λ₁, ..., λ_k the distinct eigenvalues of M (in other words, {λ₁, ..., λ_k} = {λ₁, ..., λₙ}).
Show that M is diagonalizable in Fⁿ if and only if V_{λ₁} ⊕ ... ⊕ V_{λ_k} = Fⁿ, and explain why this sum is a direct sum.
2.14. Multiple eigenvalues, Jordan normal form. We noted in §2.13 that a matrix is similar to a diagonal matrix if and only if the dimension of each eigenspace V_{λ_j} equals the order of multiplicity of the eigenvalue λ_j. Otherwise, there are fewer than n independent eigenvectors; such a matrix is called defective.
2.14.1. Jordan normal form. Defective matrices cannot be diagonalized, but we will see that they are similar to almost diagonal matrices, called Jordan normal forms, which have the eigenvalues on the diagonal, 1 in selected places above the diagonal, and zero everywhere else. In §2.14.2 we will see how to construct the matrix S which conjugates a defective matrix to its Jordan normal form.

Example 0. The matrix (20) is defective, and it is in Jordan normal form.

Example 1. The matrix

[ λ + 1/3   −1/3     ]
[ 1/3        λ − 1/3 ]

is defective: it has eigenvalues λ, λ and only one independent eigenvector, (1, 1)ᵀ. The matrix is not similar to a diagonal matrix (it is not diagonalizable), but we will see that it is similar to a 2×2 matrix with the eigenvalues λ, λ on the diagonal, 1 above the diagonal, and zeroes in the rest, the Jordan block:

(21)   J₂(λ) = [ λ  1 ]
               [ 0  λ ]
Example 2. Suppose M is a 3×3 matrix with characteristic polynomial det(M − λI) = (2 − λ)³. Therefore M has the eigenvalues 2, 2, 2. Suppose that M has only one independent eigenvector v (the dimension of the eigenspace V₂ is 1). This matrix is defective; it cannot be diagonalized, but it is similar to the 3×3 Jordan block

(22)   J₃(2) = [ 2  1  0 ]
               [ 0  2  1 ]
               [ 0  0  2 ]

In general, if an eigenvalue λ is repeated p times, but there are only k < p independent eigenvectors corresponding to the eigenvalue λ, then the Jordan normal form will have a (k − 1)-dimensional diagonal block having λ on the diagonal and a Jordan block J_{p−k+1}(λ) ((p − k + 1)-dimensional, having λ on the diagonal, 1 above the diagonal, and zeroes in the rest).

Example 3. Suppose M is a 5×5 matrix with characteristic polynomial det(M − λI) = (2 − λ)⁴(3 − λ). Therefore M has the eigenvalues 3, 2, 2, 2, 2. Denote by v₁ the eigenvector corresponding to the eigenvalue 3. Suppose that M has only two linearly independent eigenvectors v₂, v₃ associated to the eigenvalue 2 (the dimension of the eigenspace V₂ is 2, while the multiplicity of the eigenvalue 2 is four). This matrix is similar to its Jordan normal form

S⁻¹MS = [ 3  0  0  0  0 ]
        [ 0  2  0  0  0 ]
        [ 0  0  2  1  0 ]
        [ 0  0  0  2  1 ]
        [ 0  0  0  0  2 ]

Note that the diagonal contains all the eigenvalues. Note the blocks that form this matrix:

[ 3 | 0 | 0  0  0 ]
[ 0 | 2 | 0  0  0 ]
[ 0   0 | 2  1  0 ]
[ 0   0 | 0  2  1 ]
[ 0   0 | 0  0  2 ]

All the blocks are along the diagonal. There are two 1×1 blocks, corresponding to the eigenvectors v₁ and v₂, and one 3×3 block for the eigenvalue 2, which is defective by two eigenvectors.
In general, defective matrices are similar to a matrix which is block-diagonal, having Jordan blocks on its diagonal.
2.14.2. Generalized eigenvectors. If M is a defective matrix, the matrix S which conjugates M to a Jordan normal form is constructed as follows. First choose a basis for each eigenspace V_{λ_j}; these vectors will form a linearly independent set. For each eigenvalue λ_j for which dim V_{λ_j} is smaller than the order of multiplicity of λ_j we add a number of independent vectors, called generalized eigenvectors, constructed as follows.
Let λ be a defective eigenvalue. Let v be an eigenvector corresponding to λ. Let

(23)   x₁ = v

and solve iteratively

(24)   (M − λI)x_k = x_{k−1},   k = 2, 3, ...

Note that

0 = (M − λI)v = (M − λI)x₁ = (M − λI)²x₂ = (M − λI)³x₃ = ...

This produces enough linearly independent vectors:
Lemma 13. Assume that the characteristic polynomial of the matrix M has the root λ with multiplicity p, but the dimension of the eigenspace V_λ equals k with k < p (this means that M has only k < p linearly independent eigenvectors associated to the eigenvalue λ).
Then x₁, ..., x_{p−k+1} given by (23), (24) are linearly independent.

We will not include here the proof of the Lemma.
Example.
The matrix

M = [ 1  −2   3 ]
    [ 1   2  −1 ]
    [ 0  −1   3 ]

has eigenvalues 2, 2, 2 and only one independent eigenvector, v = (1, 1, 1)ᵀ.
Let x₁ = (1, 1, 1)ᵀ. Then solving (M − 2I)x₂ = x₁ we obtain x₂ = (1, −1, 0)ᵀ, and solving (M − 2I)x₃ = x₂ we obtain x₃ = (0, 1, 1)ᵀ. For S = [x₁, x₂, x₃] we have

(25)   S⁻¹MS = [ 2  1  0 ]
               [ 0  2  1 ]
               [ 0  0  2 ]
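The example above is easy to check numerically; the following Python sketch (not part of the original notes, and using the sign pattern of M as reconstructed above) verifies the generalized eigenvector chain and the similarity (25):

```python
import numpy as np

M = np.array([[1.0, -2.0,  3.0],
              [1.0,  2.0, -1.0],
              [0.0, -1.0,  3.0]])
x1 = np.array([1.0,  1.0, 1.0])        # eigenvector for the triple eigenvalue 2
x2 = np.array([1.0, -1.0, 0.0])        # generalized eigenvector: (M-2I)x2 = x1
x3 = np.array([0.0,  1.0, 1.0])        # generalized eigenvector: (M-2I)x3 = x2
S = np.column_stack([x1, x2, x3])

print(np.allclose((M - 2*np.eye(3)) @ x2, x1))   # True
print(np.allclose((M - 2*np.eye(3)) @ x3, x2))   # True
print(np.round(np.linalg.inv(S) @ M @ S, 10))    # the Jordan block J_3(2)
```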
3. Solutions of linear differential equations with constant coefficients

In §1 we saw an example which motivated the theory of eigenvalues and eigenvectors. We can now solve in a similar way general linear first order systems of differential equations with constant coefficients:

(26)   du/dt = Mu

where M is an m×m constant matrix and u is an m-dimensional vector.
3.1. Diagonalizable matrix. Assume that M is diagonalizable: there are m independent eigenvectors v₁, ..., v_m. As in §1 we find m purely exponential solutions u_j(t) = e^{λ_j t}v_j for j = 1, ..., m, where v_j is the eigenvector associated to the eigenvalue λ_j of M.
3.1.1. Fundamental matrix solution. Since the differential system is linear, any linear combination of solutions is again a solution:

(27)   u(t) = a₁u₁(t) + ... + a_m u_m(t)
            = a₁e^{λ₁t}v₁ + ... + a_m e^{λ_m t}v_m,   a_j arbitrary constants

The matrix U(t) = [u₁(t), ..., u_m(t)] is called a fundamental matrix solution. The formula (27) is

(28)   u(t) = U(t)a,   where a = (a₁, ..., a_m)ᵀ
The constants a are found so that the initial condition is satisfied: u(0) = u₀ implies U(0)a = u₀, therefore a = U(0)⁻¹u₀ (the matrix U(0) = [v₁, ..., v_m] is invertible since v₁, ..., v_m are linearly independent).
In conclusion, any initial value problem has a unique solution, and (28) is the general solution of the differential system.
3.1.2. The matrix e^{Mt}. In fact it is preferable to work with a matrix of independent solutions rather than with a set of independent solutions. Note that the m×m matrix U(t) satisfies

(29)   d/dt U(t) = M U(t)

If the differential equation (29) were a scalar one (with U and M scalars), then U(t) = e^{Mt} would be a solution and U(t) = e^{Mt}a the general solution. Can we define the exponential of a matrix?
Recall that the exponential function is the sum of its power series, which converges for all x ∈ C:

e^x = 1 + x/1! + x²/2! + ... + xⁿ/n! + ... = Σ_{n=0}^{∞} xⁿ/n!

We can then define, for any matrix M, the exponential e^{tM} using the series

(30)   e^{tM} = I + tM/1! + t²M²/2! + ... + tⁿMⁿ/n! + ... = Σ_{n=0}^{∞} (tⁿ/n!) Mⁿ
provided we show that the series converges. If, furthermore, we can differentiate term by term, then this matrix is a solution of (29), since

(31)   d/dt e^{tM} = d/dt Σ_{n=0}^{∞} (tⁿ/n!) Mⁿ = Σ_{n=0}^{∞} (1/n!) (d/dt tⁿ) Mⁿ = Σ_{n=1}^{∞} (n/n!) t^{n−1} Mⁿ = M e^{tM}
To establish uniform convergence of (30) the easiest way is to diagonalize M. If v₁, ..., v_m are independent eigenvectors corresponding to the eigenvalues λ₁, ..., λ_m of M, let S = [v₁, ..., v_m]. Then M = SΛS⁻¹ with Λ the diagonal matrix with entries λ₁, ..., λ_m.
Note that

M² = (SΛS⁻¹)² = SΛS⁻¹SΛS⁻¹ = SΛ²S⁻¹

then

M³ = M²M = (SΛ²S⁻¹)(SΛS⁻¹) = SΛ³S⁻¹

and so on; for any power

Mⁿ = SΛⁿS⁻¹

Then the series (30) is

(32)   e^{tM} = Σ_{n=0}^{∞} (tⁿ/n!) Mⁿ = Σ_{n=0}^{∞} (tⁿ/n!) SΛⁿS⁻¹ = S ( Σ_{n=0}^{∞} (tⁿ/n!) Λⁿ ) S⁻¹
For

(33)   Λ = [ λ₁  0   ...  0   ]
           [ 0   λ₂  ...  0   ]
           [ ...              ]
           [ 0   0   ...  λ_m ]

it is easy to see that

(34)   Λⁿ = [ λ₁ⁿ  0    ...  0    ]
            [ 0    λ₂ⁿ  ...  0    ]
            [ ...                 ]
            [ 0    0    ...  λ_mⁿ ]    for n = 1, 2, 3, ...

therefore

(35)   Σ_{n=0}^{∞} (tⁿ/n!) Λⁿ = [ Σ_{n=0}^{∞} (tⁿ/n!) λ₁ⁿ   0    ...   0                        ]
                                [ 0    Σ_{n=0}^{∞} (tⁿ/n!) λ₂ⁿ   ...   0                        ]
                                [ ...                                                           ]
                                [ 0    0    ...   Σ_{n=0}^{∞} (tⁿ/n!) λ_mⁿ                      ]

(36)                          = [ e^{tλ₁}   0         ...  0         ]
                                [ 0         e^{tλ₂}   ...  0         ]
                                [ ...                                ]
                                [ 0         0         ...  e^{tλ_m}  ]  = e^{tΛ}

and (32) becomes

(37)   e^{tM} = S e^{tΛ} S⁻¹

which shows that the series defining the matrix e^{tM} converges uniformly for all t, and (37), (36) give a formula for its value. Furthermore, (31) shows that U(t) = e^{tM} is a solution of the differential equation (29). Obviously, U(t) = e^{tM}C is also a solution for any constant matrix C.
Note that the general vector solution of (26) is

(38)   u(t) = e^{tM}b,   with b an arbitrary constant vector
To see how the two forms of the general solution, (38) and (28), are connected, write M = SΛS⁻¹ in (26), and multiply by S⁻¹:

S⁻¹ du/dt = Λ S⁻¹u

Denoting y = S⁻¹u, the equation becomes y′ = Λy, with general solution y(t) = e^{tΛ}a. Therefore

u(t) = Sy(t) = Se^{tΛ}a

which is exactly (27).
On the other hand, (38) is u(t) = Se^{tΛ}S⁻¹b, so (38) and (28) are the same for a = S⁻¹b.
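Formula (37) is also a convenient way to compute e^{tM} in practice. Here is a small Python sketch (not part of the original notes; the matrix and the value of t are arbitrary choices) comparing a partial sum of the series (30) with S e^{tΛ} S⁻¹:

```python
import math
import numpy as np

M = np.array([[7.0, -4.0],
              [12.0, -7.0]])           # sample matrix with eigenvalues 1 and -1
t = 0.3

# partial sum of the defining series (30): sum_n t^n M^n / n!
E_series = sum(np.linalg.matrix_power(M, n) * t**n / math.factorial(n)
               for n in range(30))

# formula (37): e^{tM} = S e^{t Lambda} S^{-1}
lam, S = np.linalg.eig(M)
E_diag = S @ np.diag(np.exp(t * lam)) @ np.linalg.inv(S)

print(np.allclose(E_series, E_diag))   # True
```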
Theorem 14. Fundamental facts on linear differential systems
Let M be an n×n matrix (diagonalizable or not).
(i) The matrix differential problem

(39)   d/dt U(t) = M U(t),   U(0) = U₀

has a unique solution, namely U(t) = e^{Mt}U₀.
(ii) Let W(t) = det U(t). Then

(40)   W′(t) = Tr M · W(t)

therefore

(41)   W(t) = W(0) e^{t Tr M}

(iii) If U₀ is an invertible matrix, then the matrix U(t) is invertible for all t, and the columns of U(t) form an independent set of solutions of the system

(42)   du/dt = Mu

(iv) Let u₁(t), ..., uₙ(t) be solutions of the system (42). If the vectors u₁(t), ..., uₙ(t) are linearly independent at some t, then they are linearly independent at any t.

Proof.
We give here the proof in the case when M is diagonalizable. The general case is quite similar; see §3.2.
(i) Clearly U(t) = e^{Mt}U₀ is a solution, and it is unique by the general theory of differential equations: (39) is a linear system of n² differential equations in n² unknowns.
(ii) Using (37) it follows that

W(t) = det U(t) = det(Se^{tΛ}S⁻¹U₀) = det e^{tΛ} det U₀ = e^{t Σ_{j=1}^{n} λ_j} det U₀ = e^{t Tr M} det U₀ = e^{t Tr M} W(0)

which is (41), implying (40).
(iii), (iv) are immediate consequences of (41). □
3.2. Repeated eigenvalues. All we did for diagonalizable matrices M can be worked out for matrices which are not diagonalizable, by replacing the diagonal matrix Λ by a Jordan normal form J; see §2.14.
In formula (32) we need to understand what the exponential of a Jordan normal form is. Since such a matrix is block diagonal, its exponential will be block diagonal as well, with the exponentials of each Jordan block.
3.2.1. Example: 2×2 blocks. For

(43)   J = [ λ  1 ]
           [ 0  λ ]

direct calculations give

(44)   J² = [ λ²  2λ ]       J³ = [ λ³  3λ² ]       ...,   J^k = [ λ^k  kλ^{k−1} ]
            [ 0   λ² ] ,          [ 0   λ³  ] ,                  [ 0    λ^k      ] ,  ...

and then

(45)   e^{tJ} = Σ_{k=0}^{∞} (t^k/k!) J^k = [ e^{λt}   t e^{λt} ]
                                           [ 0        e^{λt}   ]
3.2.2. Example: 3×3 blocks. For

(46)   J = [ λ  1  0 ]
           [ 0  λ  1 ]
           [ 0  0  λ ]

direct calculations give

J² = [ λ²  2λ  1  ]       J³ = [ λ³  3λ²  3λ  ]       J⁴ = [ λ⁴  4λ³  6λ² ]
     [ 0   λ²  2λ ] ,          [ 0   λ³   3λ² ] ,          [ 0   λ⁴   4λ³ ]
     [ 0   0   λ² ]            [ 0   0    λ³  ]            [ 0   0    λ⁴  ]

Higher powers can be calculated by induction; it is clear that

(47)   J^k = [ λ^k   kλ^{k−1}   (k(k−1)/2) λ^{k−2} ]
             [ 0     λ^k        kλ^{k−1}           ]
             [ 0     0          λ^k                ]

Then

(48)   e^{tJ} = Σ_{k=0}^{∞} (t^k/k!) J^k = [ e^{λt}   t e^{λt}   (t²/2) e^{λt} ]
                                           [ 0        e^{λt}     t e^{λt}      ]
                                           [ 0        0          e^{λt}        ]
In general, if S is the matrix whose columns are an independent set of eigenvectors and generalized eigenvectors of M such that M = SJS⁻¹, where J is a Jordan normal form, then

(49)   e^{tM} = S e^{tJ} S⁻¹

is a fundamental matrix solution of (26), and the general solution of (26) is

u(t) = e^{tM}b
3.2.3. General solution for 2×2 Jordan blocks. Let M be a 2×2 matrix, with eigenvalues λ, λ and only one independent eigenvector v. Find a generalized eigenvector: let x₁ = v and find x₂ a solution of (M − λI)x₂ = x₁ (note that this means that Mx₂ = x₁ + λx₂). If S = [x₁, x₂] then S⁻¹MS is the Jordan block (43) (check!).
Then, as in the diagonalizable case, the general vector solution is

u(t) = e^{tM}b = S e^{tJ} S⁻¹ b = S e^{tJ} a = a₁ e^{λt} x₁ + a₂ ( t e^{λt} x₁ + e^{λt} x₂ )
3.2.4. General solution for 3×3 Jordan blocks. Let M be a 3×3 matrix, with eigenvalues λ, λ, λ and only one independent eigenvector v. We need two generalized eigenvectors: let x₁ = v, find x₂ a solution of (M − λI)x₂ = x₁, then x₃ a solution of (M − λI)x₃ = x₂ (note that this means that Mx₂ = x₁ + λx₂ and Mx₃ = x₂ + λx₃). If S = [x₁, x₂, x₃] then S⁻¹MS is the Jordan block (46) (check!).
Then the general vector solution is

u(t) = e^{tM}b = S e^{tJ} S⁻¹ b = S e^{tJ} a
     = a₁ e^{λt} x₁ + a₂ ( t e^{λt} x₁ + e^{λt} x₂ ) + a₃ ( (t²/2) e^{λt} x₁ + t e^{λt} x₂ + e^{λt} x₃ )

Note the practical prescription: if λ is a repeated eigenvalue of multiplicity p then we need to look for solutions of the type u(t) = e^{λt} q(t) where q(t) are polynomials in t of degree at most p − 1.
3.3. Higher order linear differential equations; companion matrix. Consider scalar linear differential equations, with constant coefficients, of degree n:

(50)   y^{(n)} + a_{n−1} y^{(n−1)} + ... + a₁ y′ + a₀ y = 0

where y(t) is a scalar function and a₀, a₁, ..., a_{n−1} are constants.
Such equations can be transformed into linear systems of first order equations. The substitution

(51)   u₁ = y,  u₂ = y′,  ...,  uₙ = y^{(n−1)}

transforms (50) into the system

(52)   u′ = Mu,   where   M = [ 0     1     0    ...  0        ]
                              [ 0     0     1    ...  0        ]
                              [ ...                            ]
                              [ 0     0     0    ...  1        ]
                              [ −a₀   −a₁   −a₂  ...  −a_{n−1} ]

The matrix M is called the companion matrix of the differential equation (50).
To find its eigenvalues, an easy method is to search for λ so that the linear system Mx = λx has a solution x ≠ 0:

x₂ = λx₁,  x₃ = λx₂,  ...,  xₙ = λx_{n−1},  −a₀x₁ − a₁x₂ − ... − a_{n−1}xₙ = λxₙ

which implies that

(53)   λⁿ + a_{n−1}λ^{n−1} + ... + a₁λ + a₀ = 0

which is the characteristic equation of M. Note that we can obtain this equation by searching for solutions of (50) which are purely exponential: y(t) = e^{λt}.
Using the results obtained for first order linear systems, and looking just at the first component u₁(t) (which is y(t)) of the vector u(t), we find the following:
(i) if the characteristic equation (53) has n distinct solutions λ₁, ..., λₙ then the general solution is a linear combination of purely exponential solutions

y(t) = a₁e^{λ₁t} + ... + aₙe^{λₙt}

(ii) if λ_j is a repeated eigenvalue of multiplicity p_j then there are p_j independent solutions of the type e^{λ_j t} q(t) where q(t) are polynomials in t of degree at most p_j − 1; therefore they can be taken to be e^{λ_j t}, t e^{λ_j t}, ..., t^{p_j − 1} e^{λ_j t}.
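For instance (a numerical sketch, not from the original notes, with an arbitrarily chosen equation), the equation y″′ − 2y″ − y′ + 2y = 0 has characteristic equation λ³ − 2λ² − λ + 2 = 0, whose roots can be read off as the eigenvalues of its companion matrix:

```python
import numpy as np

# y''' - 2y'' - y' + 2y = 0, i.e. a2 = -2, a1 = -1, a0 = 2 in (50)
a0, a1, a2 = 2.0, -1.0, -2.0
M = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [-a0, -a1, -a2]])        # companion matrix (52)

print(np.linalg.eigvals(M))            # approximately -1, 1, 2 (in some order)
# so the general solution is y(t) = c1 e^{-t} + c2 e^{t} + c3 e^{2t}
```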
3.3.1. The Wronskian. Let y₁(t), ..., yₙ(t) be n solutions of the equation (50). The substitution (51) produces n solutions u₁, ..., uₙ of the system (52). Let U(t) = [u₁(t), ..., uₙ(t)] be the matrix solution of (52). Theorem 14 applied to the companion system yields:

Theorem 15. Let y₁(t), ..., yₙ(t) be solutions of equation (50).
(i) Their Wronskian

W(t) = | y₁(t)           ...  yₙ(t)          |
       | y₁′(t)          ...  yₙ′(t)         |
       | ...                                 |
       | y₁^{(n−1)}(t)   ...  yₙ^{(n−1)}(t)  |

satisfies

W(t) = e^{−t a_{n−1}} W(0)

(ii) y₁(t), ..., yₙ(t) are linearly independent if and only if their Wronskian is not zero.
3.3.2. Decomplexification. Suppose a_j ∈ R and there are nonreal eigenvalues, say λ_{1,2} = α₁ ± iβ₁ (recall that they come in pairs of conjugate ones). Then there are two independent solutions y_{1,2}(t) = e^{t(α₁ ± iβ₁)} = e^{tα₁}(cos(tβ₁) ± i sin(tβ₁)). If real valued solutions are needed (or desired), note that Sp(y₁, y₂) = Sp(y_c, y_s) where

y_c(t) = e^{tα₁} cos(tβ₁),   y_s(t) = e^{tα₁} sin(tβ₁)

and y_c, y_s are two independent solutions (real valued).
Furthermore, any solution in Sp(y_c, y_s) can be written as

(54)   C₁ e^{tα₁} cos(tβ₁) + C₂ e^{tα₁} sin(tβ₁) = A e^{tα₁} sin(tβ₁ + B)

where

A = √(C₁² + C₂²)

and B is the unique angle in [0, 2π) so that

cos B = C₂ / √(C₁² + C₂²),   sin B = C₁ / √(C₁² + C₂²)
3.4. Stability in differential equations. A linear, first order system of differential equations

(55)   du/dt = Mu

always has a particular solution, the zero solution: u(t) = 0 for all t. The point 0 is called an equilibrium point of the system (55). More generally,

Definition 16. An equilibrium point of a differential equation u′ = f(u) is a point u₀ for which the constant function u(t) = u₀ is a solution, therefore f(u₀) = 0.

It is important in applications to know how solutions behave near an equilibrium point.
An equilibrium point u₀ is called stable if any solution which starts close enough to u₀ remains close to u₀ for all t > 0. (This definition can be made more mathematically precise, but that will not be needed here, and it is beyond the scope of these lectures.)
Definition 17. An equilibrium point u₀ is called asymptotically stable if

lim_{t→∞} u(t) = u₀   for any solution u(t)

It is clear that an asymptotically stable point is stable, but the converse is not necessarily true.
An equilibrium point which is not stable is called unstable.
We can easily characterize the nature of the equilibrium point u₀ = 0 of linear differential equations in terms of the eigenvalues of the matrix M.
We saw that solutions of a linear system (55) are linear combinations of exponentials e^{λ_j t}, where λ_j are the eigenvalues of the matrix M, and, if M is not diagonalizable, also of t^k e^{λ_j t} for 0 < k ≤ (multiplicity of λ_j) − 1.
Recall that

lim_{t→∞} t^k e^{λ_j t} = 0   if and only if   Re λ_j < 0

Therefore:
(i) if all λ_j have negative real parts, then any solution u(t) of (55) converges to zero, lim_{t→∞} u(t) = 0, and 0 is asymptotically stable.
(ii) If all Re λ_j ≤ 0, and some real parts are zero, and the eigenvalues with zero real part have the dimension of the eigenspace equal to the multiplicity of the eigenvalue (this means that if Re λ_j = 0 then there are no solutions q(t)e^{λ_j t} with nonconstant q(t)), then 0 is stable.
(iii) If any eigenvalue has a positive real part, then 0 is unstable.
As examples, let us consider 2 by 2 systems with real coefficients.
Example 1: an asymptotically stable case, with all eigenvalues real.
For

M = [ −5   2 ]       M = SΛS⁻¹,  with  Λ = [ −3   0 ]       S = [ 1   2 ]
    [  1  −4 ] ,                            [  0  −6 ] ,         [ 1  −1 ]

The figure shows the field plot (a representation of the linear transformation x ↦ Mx of R²). The trajectories are tangent to the line field, and they are clearly going towards the origin. Along the directions of the two eigenvectors of M, Mx is parallel to x (seen here from the fact that the arrow points towards O). The point 0 is a hyperbolic equilibrium point.
Example 2: an asymptotically stable case, with nonreal eigenvalues.
For

M = [ −1  −2 ]       with  Λ = [ −2+i    0   ]       S = [ 1+i   1−i ]
    [  1  −3 ]                  [  0    −2−i ] ,          [  1     1  ]

The figure shows the field plot. The trajectories are tangent to the line field, and they are clearly going towards the origin, though rotating around it. The equilibrium point 0 is hyperbolic.
Example 3: an unstable case, with one negative eigenvalue and one positive.
For

M = [ 3  6 ]       with  Λ = [ −3  0 ]       S = [  1  2 ]
    [ 3  0 ]                  [  0  6 ] ,         [ −1  1 ]

The figure shows the field plot. Note that there is a stable direction (the direction of the eigenvector corresponding to the negative eigenvalue), and an unstable one (the direction of the second eigenvector, corresponding to the positive eigenvalue). The equilibrium point 0 is a saddle point.
Example 4: an unstable point, with both eigenvalues positive.
For

M = [  5  −2 ]       with  Λ = [ 3  0 ]       S = [ 1   2 ]
    [ −1   4 ]                  [ 0  6 ] ,         [ 1  −1 ]

The figure shows the field plot. Note that the arrows have directions opposite to those in Example 1. The trajectories go away from the origin.
Example 5: the equilibrium point 0 is stable, but not asymptotically stable.
For

M = [ 1  −2 ]       with  Λ = [ i   0 ]       S = [ 1+i   1−i ]
    [ 1  −1 ]                  [ 0  −i ] ,         [  1     1  ]

The trajectories rotate around the origin on ellipses, with axes determined by the real part and the imaginary part of the eigenvectors.
4. Difference equations (Discrete dynamical systems)

4.1. Linear difference equations with constant coefficients. A first order difference equation, linear, homogeneous, with constant coefficients, has the form

(56)   x_{k+1} = M x_k

where M is an n×n matrix, and the x_k are n-dimensional vectors. Given an initial condition x₀, the solution of (56) is uniquely determined: x₁ = Mx₀, then we can determine x₂ = Mx₁, then x₃ = Mx₂, etc. Clearly the solution of (56) with the initial condition x₀ is

(57)   x_k = M^k x₀
A second order difference equation, linear, homogeneous, with constant coefficients, has the form

(58)   x_{k+2} = M₁ x_{k+1} + M₀ x_k

A solution of (58) is uniquely determined if we give two initial conditions, x₀ and x₁. Then we can find x₂ = M₁x₁ + M₀x₀, then x₃ = M₁x₂ + M₀x₁, etc.
Second order difference equations can be reduced to first order ones: let y_k be the 2n-dimensional vector

y_k = [ x_k     ]
      [ x_{k+1} ]

Then y_k satisfies the recurrence

y_{k+1} = M y_k   where   M = [ 0    I  ]
                              [ M₀   M₁ ]

which is of the type (56), and has a unique solution if y₀ is given.
More generally, a difference equation of order p which is linear, homogeneous, with constant coefficients, has the form

(59)   x_{k+p} = M_{p−1} x_{k+p−1} + ... + M₁ x_{k+1} + M₀ x_k

which has a unique solution if the initial p values are specified: x₀, x₁, ..., x_{p−1}. The recurrence (59) can be reduced to a first order one for a vector of dimension np.
To understand the solutions of linear difference equations it then suffices to study the first order ones, (56).
4.2. Solutions of linear difference equations. Consider the equation (56). If M has n independent eigenvectors v₁, ..., vₙ (i.e. M is diagonalizable), let S = [v₁, ..., vₙ]; then M = SΛS⁻¹ with Λ the diagonal matrix with entries λ₁, ..., λₙ. The solution (57) can be written as

x_k = M^k x₀ = S Λ^k S⁻¹ x₀

and, denoting S⁻¹x₀ = b,

x_k = S Λ^k b = b₁ λ₁^k v₁ + ... + bₙ λₙ^k vₙ

hence solutions x_k are linear combinations of λ_j^k multiples of the eigenvectors v_j.
If M is not diagonalizable, then, just as in the case of differential equations, consider a matrix S so that S⁻¹MS is in Jordan normal form.
Consider the example of a 2×2 Jordan block: M = SJS⁻¹ with J given by (43). Let S = [y₁, y₂] where y₁ is the eigenvector of M corresponding to the eigenvalue λ and y₂ is a generalized eigenvector (as in §3.2.3, only with a different notation). Using (44) we obtain the general solution

x_k = [y₁, y₂] [ λ^k   kλ^{k−1} ] [ b₁ ]  = b₁ λ^k y₁ + b₂ ( kλ^{k−1} y₁ + λ^k y₂ )
               [ 0     λ^k      ] [ b₂ ]

and the recurrence has two linearly independent solutions of the form λ^k y₁ and q(k)λ^k where q(k) is a polynomial in k of degree one.
In a similar way, for p×p Jordan blocks there are p linearly independent solutions of the form q(k)λ^k where q(k) are polynomials in k of degree at most p − 1, one of them being constant, equal to the eigenvector.
4.3. Stability. Clearly the constant zero sequence x_k = 0 is a solution of any linear homogeneous discrete equation (56): 0 is an equilibrium point (or a steady state).
As in the case of differential equations, an equilibrium point of a difference equation is called asymptotically stable, or an attractor, if solutions starting close enough to the equilibrium point converge towards it.
For linear difference equations this means that lim_{k→∞} x_k = 0 for all solutions x_k. This clearly happens if and only if all the eigenvalues λ_j of M satisfy |λ_j| < 1.
If all eigenvalues have either |λ_j| < 1 or |λ_j| = 1, and for the eigenvalues of modulus 1 the dimension of the eigenspace equals the multiplicity of the eigenvalue, then 0 is a stable point (or neutral).
In all other cases the equilibrium point is called unstable.
4.4. Why does the theory of discrete equations parallel that of differential equations?
An example: compounding interest
If a principal P₀ is deposited in a bank, earning an interest of r% compounded annually, then after the first year the amount in the account is P₁ = (1 + r/100)P₀, after the second year it is P₂ = (1 + r/100)P₁, and so on: after the nth year the amount is

(60)   Pₙ = (1 + r/100)P_{n−1},   therefore   Pₙ = (1 + r/100)ⁿ P₀

Now assume the same principal P₀ is deposited in a bank earning the same interest of r% per year, compounded biannually. This means that every half year an r/2% interest is added. After the first year, the amount in the account is P₁^[2] = (1 + r/200)² P₀, after the second year it is P₂^[2] = (1 + r/200)² P₁^[2], and so on: after the nth year the amount is

(61)   Pₙ^[2] = (1 + r/200)² P_{n−1}^[2],   therefore   Pₙ^[2] = (1 + r/200)^{2n} P₀

Suppose now that the principal P₀ is deposited in a bank earning r% interest per year, but compounded more often, k times a year. After the first year the amount in the account is P₁^[k] = (1 + r/(100k))^k P₀, and so on: after the nth year the amount is

(62)   Pₙ^[k] = (1 + r/(100k))^k P_{n−1}^[k]

therefore

(63)   Pₙ^[k] = (1 + r/(100k))^{nk} P₀

Now assume the interest is compounded more and more often, continuously. This means k → ∞, and taking this limit in (63) and using the well known fact that

lim_{x→0} (1 + x)^{1/x} = e,   therefore   lim_{x→0} (1 + cx)^{1/x} = e^c

we obtain

(64)   Pₙ^[∞] = e^{nr/100} P₀

Relations (60), (61), (62) are difference equations, while (64) satisfies a differential equation:

d/dn Pₙ^[∞] = (r/100) Pₙ^[∞]
More generally:
Suppose a function y(t) satisfies a differential equation; take for simplicity y(t) scalar, and a linear, first order equation

(65)   dy/dt = a y(t),   y(0) = y₀,   with solution   y(t) = e^{at} y₀

Let us consider a discretization of t: let h be small, and consider only the values tₙ = nh of t. Using the approximation

(66)   y(t_{n+1}) = y(tₙ) + y′(tₙ)h + O(h²)

then

h y′(tₙ) ≈ y(t_{n+1}) − y(tₙ)

(read: the shift by h equals, in a first approximation, h d/dt), which used in (65) gives the difference equation

(67)   y(t_{n+1}) − y(tₙ) = ah y(tₙ),   y(0) = y₀,   with solution   y(tₙ) = (1 + ah)ⁿ y₀

The values y(tₙ) are approximations of y(t), and in the limit h → 0 they equal y(t):

y(tₙ) = (1 + ah)^{tₙ/h} y₀ → e^{a tₙ} y₀   as h → 0

More abstractly:
It appears that the argument above relies on a (limit) connection between the shift operator (which takes a function y(t) to its shift y(t + h)) and the differentiation operator (taking a function y(t) to its derivative d/dt y(t)). Let us explore this connection.
Writing the whole Taylor series expansion in (66):

y(t + h) = y(t) + h dy/dt + (1/2!) h² d²y/dt² + ... + (1/n!) hⁿ dⁿy/dtⁿ + ...
         = ( 1 + h d/dt + (1/2!) h² d²/dt² + ... + (1/n!) hⁿ dⁿ/dtⁿ + ... ) y(t) = e^{h d/dt} y(t)

therefore

y(t + h) = e^{h d/dt} y(t)

The difference operator Δ_h defined by (Δ_h y)(t) = y(t + h) − y(t) satisfies

Δ_h = e^{h d/dt} − 1

(of course, justification is needed, and it is done using operator theory).
4.5. Example: Fibonacci numbers.

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ...

This is one of the most famous sequences of numbers, studied for more than two millennia (it first appeared in ancient Indian mathematics), and it describes countless phenomena in nature, in art and in the sciences.
The Fibonacci numbers are defined by the recurrence relation

(68)   F_{k+2} = F_{k+1} + F_k

with the initial condition F₀ = 0, F₁ = 1.
Denoting

x_k = [ F_k     ]
      [ F_{k+1} ]

then x_k satisfies the recurrence x_{k+1} = M x_k where M is the companion matrix

M = [ 0  1 ]
    [ 1  1 ]

As in the case of differential equations, the characteristic equation can be found either using the matrix M, or by directly substituting F_k = λ^k into the recurrence (68), yielding λ² = λ + 1.
The eigenvalues are

λ₁ = (1 + √5)/2 = φ = the golden ratio,   λ₂ = (1 − √5)/2

F_k is a linear combination of λ₁^k and λ₂^k: F_k = c₁λ₁^k + c₂λ₂^k. The values of c₁, c₂ can be found from the initial conditions F₀ = 0, F₁ = 1: they give c₁ = −c₂ = 1/√5, hence F_k = (λ₁^k − λ₂^k)/√5.
Note that the ratio of two consecutive Fibonacci numbers converges to the golden ratio:

lim_{k→∞} F_{k+1}/F_k = φ
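Both descriptions of F_k, the matrix recurrence and the closed formula, are easy to compare numerically; a small Python sketch (not part of the original notes):

```python
import numpy as np

M = np.array([[0, 1],
              [1, 1]], dtype=float)    # companion matrix of F_{k+2} = F_{k+1} + F_k
x = np.array([0.0, 1.0])               # (F_0, F_1)

for _ in range(10):                     # iterate x_{k+1} = M x_k ten times
    x = M @ x
print(x)                                # (F_10, F_11) = (55, 89)

phi  = (1 + np.sqrt(5)) / 2
lam2 = (1 - np.sqrt(5)) / 2
k = 10
print((phi**k - lam2**k) / np.sqrt(5))  # the closed formula also gives F_10 = 55
```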
4.6. Positive matrices.

Definition 18. A positive matrix is a square matrix whose entries are all positive numbers.

Caution: this is not to be confused with positive definite self adjoint matrices, which will be studied later.
Positive matrices have countless applications and very special properties.
Notations:
x ≥ 0 denotes a vector with all components x_j ≥ 0;
x > 0 denotes a vector with all components x_j > 0.

Theorem 19. Perron-Frobenius Theorem
Let P be a positive matrix: P = [P_ij]_{i,j=1,...,n}, P_ij > 0.
P has a dominant eigenvalue (or Perron root, or Perron-Frobenius eigenvalue) r(P) = λ₁ with the following properties:
(i) λ₁ > 0 and the associated eigenvector v₁ is positive: v₁ > 0.
(ii) λ₁ is a simple eigenvalue.
(iii) All other eigenvalues have smaller modulus: |λ_j| < λ₁ for all eigenvalues λ_j of P, j > 1.
(iv) All other eigenvectors of P are not nonnegative: v_j ≱ 0 (they have at least one negative or nonreal entry).
(v) λ₁ satisfies the following maximin property: λ₁ = max T where

T = {t ≥ 0 | Px ≥ tx, for some x ≥ 0, x ≠ 0}

(vi) λ₁ satisfies the following minimax property: λ₁ = min S where

S = {t ≥ 0 | Px ≤ tx, for some x > 0}

(vii) Also

min_i Σ_j P_ij ≤ λ₁ ≤ max_i Σ_j P_ij

The proof of the Perron-Frobenius theorem will not be given here.
4.7. Markov chains. Markov processes model random chains of events: events whose likelihood depends on, and only on, what happened last.

4.7.1. Example. Suppose that it was found that every year 1% of the US population living in coastal areas moves inland, and 2% of the US population living inland moves to coastal areas (these are not real figures; real data on this topic was not available). Denote by x_k and y_k the number of people living in coastal areas, respectively inland, at year k. We are interested in understanding how the population distribution among these areas evolves in the future.
Assuming the US population remains the same, in the year k + 1 we find that x_{k+1} = .99x_k + .02y_k and y_{k+1} = .01x_k + .98y_k, or

(69)   x_{k+1} = M x_k   where   x_k = [ x_k ]       M = [ .99  .02 ]
                                       [ y_k ] ,          [ .01  .98 ]

Relation (69), modeling our process, is a first order difference equation.
Note that the entries of the matrix M are nonnegative (they represent a percentage, or a probability), and that its columns add up to 1, since the whole population is subject to the process: any person of the US population is in one of the two regions.
Question: what happens in the long run, as k → ∞? Would the whole population eventually move to coastal areas?
To find the solution x_k of (69) we need the eigenvalues and eigenvectors of M: it is easily calculated that there is one eigenvalue equal to 1, corresponding to v₁ = (2, 1)ᵀ, and an eigenvalue .97, corresponding to v₂ = (1, −1)ᵀ. (Note that M is a positive matrix, and the Perron-Frobenius Theorem applies: the dominant eigenvalue is 1, and its eigenvector has positive components, while the other eigenvector has components of both signs.)
Then

x_k = c₁ v₁ + c₂ (.97)^k v₂

and

x_∞ = lim_{k→∞} x_k = c₁ v₁

The limit is an eigenvector corresponding to the eigenvalue 1!
In fact this is not a big surprise if we reason as follows: assuming that x_k converges (which is not guaranteed without information on the eigenvalues of M), then taking the limit k → ∞ in the recurrence relation (69) we find that x_∞ = M x_∞, hence the limit x_∞ is an eigenvector of M corresponding to the eigenvalue 1, or else the limit is 0, which is excluded by the interpretation that x_k + y_k = const = the total population.
Note: all the eigenvectors corresponding to the eigenvalue 1 are steady states: if the initial population distribution was x₀ = a v₁ then the population distribution remains the same, x_k = x₀ for all k (since Mv₁ = v₁).
Exercise. What is the type of the equilibrium points c₁v₁ (asymptotically stable, stable, unstable)?
In conclusion, in the long run the population becomes distributed with twice as many people living in coastal areas as inland.
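A quick simulation confirms the conclusion; the Python sketch below (not part of the original notes; the initial split of 100 coastal and 200 inland is an arbitrary choice) iterates the recurrence (69):

```python
import numpy as np

M = np.array([[0.99, 0.02],
              [0.01, 0.98]])
x = np.array([100.0, 200.0])             # arbitrary initial split (coastal, inland)

for _ in range(500):                      # iterate x_{k+1} = M x_k
    x = M @ x
print(x)                                  # approximately (200, 100): a 2:1 ratio
print(x.sum())                            # 300: the total population is (numerically) unchanged
```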
4.7.2. Markov matrices. More generally, a Markov process is governed by an equation (69) where the matrix M has the two properties summarized as follows.

Definition 20. An n×n matrix M = [M_ij] is called a Markov matrix (or a stochastic matrix) if:
(i) all M_ij ≥ 0, and
(ii) each column adds up to 1: Σ_i M_ij = 1.

Theorem 21. Properties of Markov matrices
If M is a Markov matrix then:
(i) λ = 1 is an eigenvalue.
(ii) All the eigenvalues satisfy |λ_j| ≤ 1. If all the entries of M are positive, then |λ_j| < 1 for j > 1.
(iii) If for some k all the entries of M^k are positive, then λ₁ = 1 has multiplicity 1 and all the other eigenvalues satisfy |λ_j| < 1 for j = 2, ..., n.

Proof.
(i) The matrix M − I is not invertible, since all the columns of M add up to 1, and therefore the columns of M − I add up to zero. Therefore det(M − I) = 0 and 1 is an eigenvalue.
(ii) and (iii) follow from the Perron-Frobenius Theorem 19. □

Note that for general Markov matrices all eigenvectors corresponding to the eigenvalue 1 are steady states.
5. More functional calculus

5.1. Functional calculus for diagonalizable matrices. Let M be a square n×n matrix, assumed diagonalizable: it has n independent eigenvectors v₁, ..., vₙ corresponding to the eigenvalues λ₁, ..., λₙ, and if S = [v₁, ..., vₙ] then S⁻¹MS = Λ, a diagonal matrix with λ₁, ..., λₙ on its diagonal.
5.1.1. Polynomials of M. We looked at positive integer powers of M, and we saw that M^k = SΛ^kS⁻¹, where the power k is applied to each diagonal entry of Λ. To be consistent we clearly need to define M⁰ = I.
Recall that M is invertible if and only if none of its eigenvalues is zero. Assume this is the case. Then we can easily check that M⁻¹ = SΛ⁻¹S⁻¹, where the power −1 is applied to each diagonal entry of Λ. We can then define any negative integer power of M.
If p(t) = aₙtⁿ + ... + a₁t + a₀ is a polynomial in t, we can easily define

p(M) = aₙMⁿ + ... + a₁M + a₀I = S p(Λ) S⁻¹

where p(Λ) is the diagonal matrix with p(λ_j) on the diagonal.
5.1.2. The exponential e^M. We defined the exponential e^M using its Taylor series

e^M = Σ_{k=0}^{∞} (1/k!) M^k

and e^M = Se^ΛS⁻¹, where e^Λ is the diagonal matrix with e^{λ_j} on the diagonal.
5.1.3. The resolvent. For which numbers z ∈ C does the matrix zI − M have an inverse, and what are its eigenvalues? Clearly the matrix zI − M is invertible for all z which differ from the eigenvalues of M (in the infinite dimensional case things are not quite so).
The matrix valued function R(z) = (zI − M)⁻¹, defined for all z ≠ λ₁, ..., λₙ, is called the resolvent of M. The resolvent has many uses, and is particularly useful in infinite dimensions.
Let z be different from all the eigenvalues. If M is diagonalizable then (zI − M)⁻¹ = S (zI − Λ)⁻¹ S⁻¹, where (zI − Λ)⁻¹ is the diagonal matrix with (z − λ_j)⁻¹ on the diagonal.
Here is another formula, very useful in the infinite dimensional case: if M is diagonalizable, with all the eigenvalues satisfying |λ_j| < 1, then

(70)   (I − M)⁻¹ = I + M + M² + M³ + ...

which follows from the fact that

1/(1 − λ_j) = 1 + λ_j + λ_j² + λ_j³ + ...   if |λ_j| < 1

The resolvent is extremely useful for nondiagonalizable cases as well. In infinite dimensions the numbers z for which the resolvent does not exist form the spectrum of the linear transformation (they may or may not be eigenvalues), and they play the role of the eigenvalues in finite dimensions.
Returning to finite dimensions, if M is not diagonalizable the justification above for formula (70) no longer applies (consider for example a 2-dimensional Jordan block with zero eigenvalue). We will see that (70) does hold for matrices of norm less than 1; this is a good motivation for introducing the notion of the norm of a matrix later on.
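A numerical sanity check of (70) (a sketch, not from the original notes, on an arbitrary small matrix whose eigenvalues have modulus less than 1):

```python
import numpy as np

M = np.array([[0.5, 0.2],
              [0.1, 0.3]])               # both eigenvalues have modulus < 1
print(np.abs(np.linalg.eigvals(M)))      # approximately [0.57, 0.23]

# partial sum of the geometric series I + M + M^2 + ...
neumann = sum(np.linalg.matrix_power(M, k) for k in range(200))
print(np.allclose(neumann, np.linalg.inv(np.eye(2) - M)))   # True
```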
5.1.4. The square root of M. Given the diagonalizable matrix M, can we find matrices R so that R² = M, and what are they?
Using the diagonal form of M we have R² = SΛS⁻¹, which is equivalent to S⁻¹R²S = Λ, and therefore (S⁻¹RS)² = Λ.
Assuming that S⁻¹RS is diagonal, then S⁻¹RS = diag(±λ₁^{1/2}, ..., ±λₙ^{1/2}) and therefore

(71)   R = S [ ε₁√λ₁   ...   0     ] S⁻¹,   ε_j ∈ {−1, 1}
             [ ...                 ]
             [ 0       ...   εₙ√λₙ ]

There are 2ⁿ such matrices!
But there are also matrices R with S⁻¹RS not diagonal. Take for example M = I, and find all the matrices R with R² = I. Then besides the four diagonal solutions

R = [ ε₁  0  ]        ε_{1,2} ∈ {−1, 1}
    [ 0   ε₂ ] ,

there is the two parameter family of solutions

R = [ √(1 − ab)     a          ]
    [ b             −√(1 − ab) ]

Some of these matrices have nonreal entries!
5.1.5. Functional calculus for diagonalizable matrices. What other functions of M can we define? If M is diagonalizable, it seems that given a function f(t) we can define f(M) provided that all f(λ_j) are defined (a careful construction is needed).
Diagonalizable matrices are thus very user friendly. Later on we will see that there is a quick test to see which matrices are diagonalizable by a unitary matrix (that is, have an orthonormal basis of eigenvectors), and which are not. It will be proved that M has this property if and only if it commutes with its adjoint: MM* = M*M. Such matrices are called normal, and this is the gateway to generalizing functional calculus to linear operators in infinite dimensions.
5.1.6. Working with Jordan blocks. The calculations done for 2- and 3-dimensional Jordan blocks in §3.2 can be done in a tidy way for general n×n blocks using functional calculus.
First note that any n×n Jordan block with eigenvalue λ can be written as

J = λI + N

where N is a matrix whose only nonzero entries are 1's just above the diagonal. A short calculation shows that the only nonzero entries of N² are a sequence of 1's at distance two above the diagonal, and so on: each additional power of N pushes the slanted line of 1's toward the upper right corner. Eventually Nⁿ = 0. For example, in dimension four:

N = [ 0 1 0 0 ]      N² = [ 0 0 1 0 ]      N³ = [ 0 0 0 1 ]      N⁴ = 0
    [ 0 0 1 0 ]           [ 0 0 0 1 ]           [ 0 0 0 0 ]
    [ 0 0 0 1 ]           [ 0 0 0 0 ]           [ 0 0 0 0 ]
    [ 0 0 0 0 ]           [ 0 0 0 0 ]           [ 0 0 0 0 ]

Since λI and N commute, we can use the binomial formula, which gives

J^k = (λI + N)^k = Σ_{j=0}^{k} C(k, j) λ^{k−j} N^j

(with C(k, j) the binomial coefficients), which for k > n − 2 equals Σ_{j=0}^{n−1} C(k, j) λ^{k−j} N^j. See (44), (47) for n = 2 and n = 3.
Also, because λI and N commute,

e^J = e^{λI + N} = e^{λI} e^N = e^λ Σ_{k=0}^{n−1} (1/k!) N^k

See (45), (48) for n = 2 and n = 3.
Exercise. What is (I − J)⁻¹ for J an n×n Jordan block with eigenvalue λ?
5.1.7. The Cayley-Hamilton Theorem. Here is a beautiful fact:

Theorem 22. The Cayley-Hamilton Theorem. Let M be a square matrix, and let p(λ) = det(M − λI) be its characteristic polynomial.
Then p(M) = 0.

Note that if M is n×n then it follows in particular that Mⁿ is a linear combination of the earlier powers I, M, M², ..., M^{n−1}.

Proof of the Cayley-Hamilton Theorem.
Assume first that M is diagonalizable: M = SΛS⁻¹. Then p(M) = p(SΛS⁻¹) = S p(Λ) S⁻¹, where p(Λ) is the diagonal matrix having p(λ_j) on the diagonal. Since p(λ_j) = 0 for all j, then p(Λ) = 0 and the theorem is proved.
In the general case M = SJS⁻¹ where J is a Jordan normal form. Then p(J) is a block diagonal matrix, the blocks being p applied to the standard Jordan blocks. Let J₁ be any one of these blocks, with eigenvalue λ₁ and dimension p₁. Then the characteristic polynomial of M contains the factor (λ₁ − λ)^{p₁}. Since (λ₁I − J₁)^{p₁} = (−N₁)^{p₁} = 0, then p(J₁) = 0. As this is true for each Jordan block composing J, the theorem follows. □
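The theorem is easy to test numerically; a Python sketch (not part of the original notes) evaluates the characteristic polynomial at a random matrix. Here np.poly returns the coefficients of det(λI − M), which differs from det(M − λI) only by the sign (−1)ⁿ, so the conclusion p(M) = 0 is the same:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))

# coefficients of det(lambda*I - M), highest power first
c = np.poly(M)

# evaluate the polynomial at the matrix M itself
P = sum(ck * np.linalg.matrix_power(M, len(c) - 1 - i) for i, ck in enumerate(c))
print(np.allclose(P, np.zeros((4, 4)), atol=1e-10))   # True: p(M) = 0
```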
5.2. Commuting matrices. The beautiful world of functional calculus with matrices is marred by noncommutativity. For example e^A e^B equals e^{A+B} only if A and B commute, and the square (A + B)² = A² + AB + BA + B² cannot be simplified to A² + 2AB + B² unless A and B commute.
When do two matrices commute?

Theorem 23. Let A and B be two diagonalizable matrices.
Then AB = BA if and only if they have the same matrix of eigenvectors S (they are simultaneously diagonalizable).

Proof.
Assume that A = SΛS⁻¹ and B = SΛ′S⁻¹ with Λ, Λ′ diagonal. Then, since diagonal matrices commute,

AB = SΛS⁻¹ SΛ′S⁻¹ = SΛΛ′S⁻¹ = SΛ′ΛS⁻¹ = SΛ′S⁻¹ SΛS⁻¹ = BA

Conversely, assume AB = BA and let S = [v₁, ..., vₙ] be the matrix diagonalizing A, with Av_j = λ_j v_j. Then BAv_j = λ_j Bv_j, so A(Bv_j) = λ_j (Bv_j), which means that both v_j and Bv_j are eigenvectors of A corresponding to the same eigenvalue λ_j.
If all the eigenvalues of A are simple, then this means that Bv_j is a scalar multiple of v_j, so S diagonalizes B as well.
If A has multiple eigenvalues then we may need to change S a little bit (within each set of eigenvectors of A corresponding to the same eigenvalue), to accommodate B.
First replace A by a diagonal matrix: AB = BA is equivalent to SΛS⁻¹B = BSΛS⁻¹, therefore Λ(S⁻¹BS) = (S⁻¹BS)Λ. Let C = S⁻¹BS, which satisfies ΛC = CΛ.
We can assume that the multiple eigenvalues of Λ are grouped together, so that Λ is built of diagonal blocks of the type λ_jI of dimensions d_j, with distinct λ_j.
A direct calculation shows that ΛC = CΛ is equivalent to the fact that C is block diagonal, with blocks of dimensions d_j. Since C is diagonalizable, each block can be diagonalized: C = TΛ′T⁻¹ with Λ′ diagonal and T block diagonal, and this conjugation leaves Λ invariant: TΛT⁻¹ = Λ.
Then the matrix ST diagonalizes both A and B. □
Examples. Any two functions of a matrix M, f(M) and g(M), commute.
