EE448/528 Version 1.0
John Stensby
Chapter 10
Function of a Matrix
Let f(z) be a complex-valued function of a complex variable z. Let A be an n×n complex-valued matrix. In this chapter, we give a definition for the n×n matrix f(A). Also, we show how
f(A) can be computed. Our approach relies heavily on the Jordan canonical form of A, an
important topic in Chapter 9. We do not strive for maximum generality in our exposition; instead,
we want to get the job done as simply as possible, and we want to be able to compute f(A) with a
minimum of fuss. In the literature, a number of equivalent approaches have been described for
defining and computing a function of a matrix. The concept of a matrix function has many
applications, especially in control theory and, more generally, differential equations (where
exp(At) and ln(A) play prominent roles).
Function of an n×n Matrix
Let f(z) be a function of the complex variable z. We require that f(z) be analytic in the disk |z| < R. Basically (and for our purposes), this is equivalent to saying that f(z) can be represented as a convergent power series (Taylor series)

    f(z) = c0 + c1 z + c2 z^2 + ⋯ .    (10-1)

Formally, then, we define f(A) by substituting matrix A for z in this series:

    f(A) = c0 I + c1 A + c2 A^2 + ⋯ .    (10-2)
We say that this matrix series is convergent (to an n×n matrix f(A)) if all n^2 scalar series that make up f(A) are convergent. Now recall Theorem 4-1, which says, in part, that every element of a matrix has an absolute value that is bounded above by the 2-norm of the matrix. Hence, each element in f(A) is a series that is bounded in magnitude by the norm ||f(A)||2. But the norm of a
sum is less than, or equal to, the sum of the norms. This leads to the conclusion that (10-2)
converges if
    ||f(A)||2 = ||c0 I + c1 A + c2 A^2 + ⋯||2

              ≤ |c0| ||I||2 + |c1| ||A||2 + |c2| ||A^2||2 + |c3| ||A^3||2 + ⋯    (10-3)

              ≤ |c0| ||I||2 + |c1| ||A||2 + |c2| ||A||2^2 + |c3| ||A||2^3 + ⋯
converges. Now, the last series on the right of (10-3) is an "ordinary" power series; it converges for all n×n matrices A with ||A||2 < R. Hence, we have argued that f(A) can be represented by a series of matrices if ||A||2 is in the region of convergence of the scalar series (10-1).
At the end of this chapter, we will get a little more sophisticated. We will argue that the series for f(A) converges (i.e., f(A) exists) if all eigenvalues of A lie in the region of convergence of (10-1). That is, f(A) converges if |λk| < R, where λk, 1 ≤ k ≤ n, are the eigenvalues of A. In addition, the series (10-2) diverges if one, or more, eigenvalues of A lie outside the disk |z| < R.
Our method of defining f(A) requires that the function f(z) be analytic (so that it has a Taylor series expansion) in some disk centered at the origin; here, we limit ourselves to working with "nice" functions. This excludes from our analysis a number of interesting functions like f(z) = z^(1/n), n > 1, and f(z) = ln(z), both of which have branch points at z = 0. While we do not cover it here, a matrix-function theory for these "more complicated" functions is available.
Sometimes, f(A) is "exactly what you think it should be". Roughly speaking, if the scalar function f(z) has a Taylor series that converges in the disk |z| < R containing the eigenvalues of A, then f(A) can be calculated by "substituting" matrix A for variable z in the formula for f(z). For example, if f(z) = (1+z)/(1−z), then f(A) = (I + A)(I − A)⁻¹, at least for matrices that have their eigenvalues inside a unit disk centered at the origin (actually, as can be shown by analytic continuation, this example is valid for all matrices A that do not have an eigenvalue equal to one). While this "direct substitution" approach works well with rational (and other simple) functions, it does not always apply directly. More generally, a matrix function is computed from its defining power series; common examples include
    e^A = I + Σ_{k=1}^∞ A^k / k!

    cos(A) = I + Σ_{k=1}^∞ (−1)^k A^(2k) / (2k)!    (10-4)

    sin(A) = Σ_{k=1}^∞ (−1)^(k−1) A^(2k−1) / (2k−1)! ,
to cite just a few. Also, as can be verified from the basic definition given above, many identities in the variable z remain valid for a matrix variable. For example, the common Euler's identity

    e^(jA) = cos(A) + j sin(A)
(10-5)
holds for every square matrix A. As an example of computing a matrix function from its series, consider

    A = [ 0.5   1  ]
        [ 0    0.6 ] .
The MatLab program listed below computes a ten-term approximation and compares the result
with expm(A), an accurate routine internal to MatLab.
% Enter the matrix A
> A = [.5 1;0 .6]
A =
    0.5000    1.0000
         0    0.6000
% Set Up Working Matrix B = A
> B = A;
% Set Matrix f to the Identity Matrix
> f = [1 0; 0 1];
% Sum Ten Terms of the Taylor Series
> for i = 1:10
>    f = f + B;
>    B = A*B/(i+1);
> end
> f
% Print-Out the Ten-Term Approximation to EXP[A]
f =
    1.6487    1.7340
         0    1.8221
% Use MatLab's Exponential Matrix Function to Calculate EXP(A)
> expm(A)
ans =
    1.6487    1.7340
         0    1.8221
Theorem 10-1
Let A be an n×n matrix, and let P be any nonsingular n×n matrix. Then

    f(P A P⁻¹) = P f(A) P⁻¹ .    (10-6)

Proof: This result is obvious once the Taylor series of f(PAP⁻¹) is written down.
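This similarity property can be spot-checked numerically. The Python/NumPy sketch below is ours (the notes themselves use MatLab); it uses f = exp via SciPy's expm:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
P = rng.standard_normal((4, 4))   # a generic random P is almost surely nonsingular
Pinv = np.linalg.inv(P)

# f(P A P^-1) should equal P f(A) P^-1
lhs = expm(P @ A @ Pinv)
rhs = P @ expm(A) @ Pinv
print(np.allclose(lhs, rhs, atol=1e-8))  # True
```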
Theorem 10-2
Let A be a block diagonal matrix
    A = [ A1                  ]
        [      A2             ]
        [           ⋱         ]
        [               Ap    ] ,    (10-7)
where Ak is an nk×nk square matrix (a block can be any size, but it must be square). Then we have
    f(A) = [ f(A1)                      ]
           [         f(A2)              ]
           [                ⋱           ]
           [                    f(Ap)   ] .    (10-8)
Combining Theorems 10-1 and 10-2 with the Jordan decomposition A = P J P⁻¹ of Chapter 9 gives

    f(A) = P [ f(J1)                      ]  P⁻¹ ,    (10-9)
             [         f(J2)              ]
             [                ⋱           ]
             [                    f(Jp)   ]
where Jk, 1 ≤ k ≤ p, are the Jordan blocks of the Jordan form for A. Hence, to compute f(A), all we need to know is the Jordan form of A (i.e., the blocks Jk, 1 ≤ k ≤ p) and how to compute f(Jk), where Jk is a Jordan block.
Computing f(J), the Function of a Jordan Block
As suggested by (10-9), function f(A) can be computed once we know how to compute a function of a Jordan block. Since a function of a block can be expressed as an infinite series of powers of the block (think of its Taylor series), we really need a simple formula for integer powers of a block (i.e., we need a simple representation for J^p, the pth power of block J).
But first, consider computing H^p, where the n1×n1 matrix H has the form

    H = [ 0  1  0  ⋯  0  0 ]
        [ 0  0  1  ⋯  0  0 ]
        [ ⋮           ⋱  ⋮ ]
        [ 0  0  0  ⋯  0  1 ]
        [ 0  0  0  ⋯  0  0 ] ,    (10-10)

that is, the n1×n1 matrix H has 1s on its first superdiagonal and zeros everywhere else.
To see the general trend, let's compute a few integer powers of H. When n1 = 4, we can compute easily
    H   = [ 0  1  0  0 ]          H^2 = [ 0  0  1  0 ]
          [ 0  0  1  0 ]                [ 0  0  0  1 ]
          [ 0  0  0  1 ]                [ 0  0  0  0 ]
          [ 0  0  0  0 ] ,              [ 0  0  0  0 ] ,

    H^3 = [ 0  0  0  1 ]          H^p = [ 0  0  0  0 ]
          [ 0  0  0  0 ]                [ 0  0  0  0 ]
          [ 0  0  0  0 ]                [ 0  0  0  0 ]
          [ 0  0  0  0 ] ,              [ 0  0  0  0 ] ,  p ≥ 4 .    (10-11)
Notice that the string of 1s lies on the pth superdiagonal of H^p. For the 4×4 example illustrated by (10-11), the result becomes zero for p ≥ 4. The general case can be inferred from this example. Consider raising the n1×n1 matrix H (given by (10-10)) to an integer power p. For p < n1, the result has all 1s on the pth superdiagonal and zeros elsewhere. For p ≥ n1, the result is the n1×n1 all-zero matrix.
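This shifting behavior of the nilpotent matrix H is easy to confirm numerically (a Python/NumPy sketch of ours; np.eye(n, k=p) places 1s on the pth superdiagonal):

```python
import numpy as np

n1 = 5
H = np.eye(n1, k=1)              # 1s on the first superdiagonal, zeros elsewhere

for p in range(1, n1):
    Hp = np.linalg.matrix_power(H, p)
    # the 1s have moved out to the pth superdiagonal
    assert np.array_equal(Hp, np.eye(n1, k=p))

# for p >= n1 the power is identically zero (H is nilpotent)
assert np.count_nonzero(np.linalg.matrix_power(H, n1)) == 0
print("verified for n1 =", n1)
```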
For an integer p > 0, we can compute J^p, where J is an n1×n1 Jordan block associated with eigenvalue λ,

    J = λI + H ,    (10-12)
where I is an n1×n1 identity matrix, and H is given by (10-10). Apply the binomial expansion

    (x + y)^p = Σ_{k=0}^{p} C(p,k) x^(p−k) y^k ,    C(p,k) = p! / (k! (p−k)!) ,    (10-13)

with x = λI and y = H to obtain

    J^p = (λI + H)^p = Σ_{k=0}^{p} C(p,k) λ^(p−k) H^k .    (10-14)
In this expansion, the 1st term supplies the main diagonal, the 2nd term the first superdiagonal, the 3rd term the second superdiagonal, and so forth. If p < n1, the last nonzero term lies on the pth superdiagonal; on the other hand, if p ≥ n1, the terms with k ≥ n1 are zero. Finally, note that (10-14) can be
be written as
    J^p = [ λ^p   p λ^(p−1)   (1/2!) p(p−1) λ^(p−2)   ⋯   (1/(n1−1)!) p(p−1)⋯(p−[n1−2]) λ^(p−[n1−1]) ]
          [ 0     λ^p         p λ^(p−1)               ⋯   (1/(n1−2)!) p(p−1)⋯(p−[n1−3]) λ^(p−[n1−2]) ]
          [ 0     0           λ^p                     ⋯   (1/(n1−3)!) p(p−1)⋯(p−[n1−4]) λ^(p−[n1−3]) ]
          [ ⋮                                         ⋱   ⋮                                           ]
          [ 0     0           0                       ⋯   p λ^(p−1)                                   ]
          [ 0     0           0                       ⋯   λ^p                                         ] .    (10-15)

That is, J^p has λ^p on its main diagonal, p λ^(p−1) on the first superdiagonal, (1/2!) p(p−1) λ^(p−2) on the second superdiagonal, (1/3!) p(p−1)(p−2) λ^(p−3) on the third superdiagonal, and so on. Substituting these powers into the series
    f(J) = c0 I + c1 J + c2 J^2 + ⋯    (10-16)
yields

    f(J) = [ f(λ)   f'(λ)   f''(λ)/2!   ⋯   f^(n1−1)(λ)/(n1−1)! ]
           [ 0      f(λ)    f'(λ)       ⋯   f^(n1−2)(λ)/(n1−2)! ]
           [ 0      0       f(λ)        ⋯   f^(n1−3)(λ)/(n1−3)! ]
           [ ⋮                          ⋱   ⋮                    ]
           [ 0      0       0           ⋯   f'(λ)                ]
           [ 0      0       0           ⋯   f(λ)                 ] ,    (10-17)
an n1×n1 matrix that has f(λ) on its main diagonal, f'(λ) on its first superdiagonal, f''(λ)/2! on its second superdiagonal, and so on (primes denote derivatives, and f^(k) denotes the kth derivative).
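As a cross-check of (10-17), we can build f(J) for f(x) = e^x from the derivative formula and compare it against a general-purpose matrix exponential. This Python/NumPy sketch is ours; the notes themselves use MatLab:

```python
import math
import numpy as np
from scipy.linalg import expm

lam, n1 = 2.0, 4
J = lam * np.eye(n1) + np.eye(n1, k=1)   # Jordan block: lam on diagonal, 1s above

# Build f(J) from (10-17) for f(x) = exp(x): every derivative of exp is exp,
# so the kth superdiagonal of f(J) is exp(lam)/k!.
fJ = sum(np.exp(lam) / math.factorial(k) * np.eye(n1, k=k) for k in range(n1))

print(np.allclose(fJ, expm(J)))  # True
```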
Example
Calculate f(J) for the function f(λ) = e^(λt). Direct application of (10-17) produces
    f(J) = exp[Jt] = [ e^(λt)   t e^(λt)   t^2 e^(λt)/2!   ⋯ ]
                     [ 0        e^(λt)     t e^(λt)        ⋯ ]
                     [ 0        0          e^(λt)          ⋯ ]
                     [ ⋮                                   ⋱ ] .
Example
Consider the matrix

    A = [ λ1  1   0   0   0  ]
        [ 0   λ1  1   0   0  ]
        [ 0   0   λ1  0   0  ]
        [ 0   0   0   λ2  1  ]
        [ 0   0   0   0   λ2 ] .
If f(λ) = e^(λt), find f(A). Note that A contains two Jordan blocks. Hence, the previous example can be applied twice, once to each block. The result is
    f(A) = e^(At) = [ e^(λ1 t)   t e^(λ1 t)   t^2 e^(λ1 t)/2!   0           0          ]
                    [ 0          e^(λ1 t)     t e^(λ1 t)        0           0          ]
                    [ 0          0            e^(λ1 t)          0           0          ]
                    [ 0          0            0                 e^(λ2 t)    t e^(λ2 t) ]
                    [ 0          0            0                 0           e^(λ2 t)   ] .
Example
Consider a 6×6 matrix A given in terms of its Jordan decomposition,

    A = P [ J1          ]  P⁻¹ ,
          [     J2      ]
          [         J3  ]

where

    J1 = [ 2  1  0 ]          J2 = [ 2  1 ]          J3 = [ 0 ] ,
         [ 0  2  1 ] ,             [ 0  2 ] ,
         [ 0  0  2 ]

and

    P = [ 2   1   0   0   0   0 ]
        [ 2  -1   0   0   0   0 ]
        [ 0   0   1   2   1   0 ]
        [ 0   0   0  -2  -1   0 ]
        [ 0   0   0   0   1   1 ]
        [ 0   0   0   0   1  -1 ] .
Let's compute exp(A) by using the information given above. The answer is
    exp(A) = P [ exp(J1)                    ]  P⁻¹ ,
               [          exp(J2)           ]
               [                   exp(J3)  ]

where

    exp(J1) = [ e^2  e^2  e^2/2 ]          exp(J2) = [ e^2  e^2 ]          exp(J3) = 1 .
              [ 0    e^2  e^2   ] ,                  [ 0    e^2 ] ,
              [ 0    0    e^2   ]
Example
For the A matrix considered in the last example, we can calculate sin(A). In terms of the
transformation matrix P and Jordan form given in the previous example, the answer is sin(A) =
P sin(J) P⁻¹, where J = diag(J1, J2, J3) and

    sin(J) = [ sin(2)  cos(2)  -sin(2)/2     0        0        0    ]
             [   0     sin(2)    cos(2)      0        0        0    ]
             [   0       0       sin(2)      0        0        0    ]
             [   0       0         0       sin(2)   cos(2)     0    ]
             [   0       0         0         0      sin(2)     0    ]
             [   0       0         0         0        0     sin(0)  ] .
A final numerical result can be obtained by using MatLab to do the messy work.
% Enter the 6x6 matrix P into MatLab
>P = [2 1 0 0 0 0;2 -1 0 0 0 0;0 0 1 2 1 0;0 0 0 -2 -1 0;0 0 0 0 1 1;0 0 0 0 1 -1];
% Enter the 6x6 matrix sin[Jordan Form] (i.e., sin of the Jordan Canonical Form) into MatLab
>s = sin(2);
>c = cos(2);
>SinJ = [s c -s/2 0 0 0;0 s c 0 0 0;0 0 s 0 0 0;0 0 0 s c 0;0 0 0 0 s 0;0 0 0 0 0 0];
% Calculate sin(A)
>sinA = P*SinJ*inv(P)
sinA =
    0.4932    0.4161   -1.3254   -1.3254         0         0
   -0.4161    1.3254   -0.4932   -0.4932         0         0
         0         0    0.9093         0   -0.4161   -0.4161
         0         0         0    0.9093    0.4161    0.4161
         0         0         0         0    0.4546    0.4546
         0         0         0         0    0.4546    0.4546
% To verify this result, let's calculate sin(A) by using MatLab's built-in functions to compute
% SinA = imag(expm(i*A))
>SinA = imag(expm(i*A))
SinA =
    0.4932    0.4161   -1.3254   -1.3254         0         0
   -0.4161    1.3254   -0.4932   -0.4932         0         0
         0         0    0.9093         0   -0.4161   -0.4161
         0         0         0    0.9093    0.4161    0.4161
         0         0         0         0    0.4546    0.4546
         0         0         0         0    0.4546    0.4546
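The verification trick SinA = imag(expm(i*A)) works because, for a real matrix A, the matrix form of Euler's identity (10-5) gives exp(jA) = cos(A) + j sin(A). A quick check in Python/NumPy (our sketch, not from the notes; scipy.linalg.sinm computes the matrix sine directly):

```python
import numpy as np
from scipy.linalg import expm, sinm

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))

# For real A, expm(1j*A) has real part cos(A) and imaginary part sin(A),
# which is the matrix version of Euler's identity (10-5).
sinA = np.imag(expm(1j * A))
print(np.allclose(sinA, sinm(A)))  # True
```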
At the beginning of this chapter, we argued that f(A) can be represented by a series of matrices if ||A||2 is in the region of convergence of (10-1). Now, we get a little more sophisticated. We argue that the series for f(A) converges if all the eigenvalues of A lie in the region of convergence of (10-1).
Theorem 10-3
If f(z) has a power series representation

    f(z) = Σ_{k=0}^∞ ck z^k    (10-18)

that converges in the disk |z| < R, and if every eigenvalue λk of A satisfies |λk| < R, then the matrix series

    f(A) = Σ_{k=0}^∞ ck A^k    (10-19)

converges.
Proof: We prove this theorem for n×n matrices that are similar to a diagonal matrix (the more general case follows by adapting this proof to the Jordan form of A). Let transformation matrix P diagonalize A; that is, let D = diag(λ1, … , λn) = P⁻¹AP. By Theorem 10-2 it follows that
    f(A) = P diag( f(λ1), … , f(λn) ) P⁻¹ = P diag( Σ_{k=0}^∞ ck λ1^k , … , Σ_{k=0}^∞ ck λn^k ) P⁻¹

         = P ( Σ_{k=0}^∞ ck D^k ) P⁻¹ = Σ_{k=0}^∞ ck (P D P⁻¹)^k = Σ_{k=0}^∞ ck A^k .    (10-20)
If (10-18) diverges when evaluated at the eigenvalue λi (as would be the case if |λi| > R), then series (10-19) diverges. Hence, if one (or more) eigenvalue falls outside of |z| < R, then (10-19) diverges.
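For the diagonalizable case treated in the proof, (10-20) is easy to check numerically (a Python/NumPy sketch of ours): diagonalize a symmetric matrix, apply f to the eigenvalues, and compare against the series-based matrix exponential.

```python
import numpy as np
from scipy.linalg import expm

# A symmetric matrix is always diagonalizable, with real eigenvalues
# and an orthogonal eigenvector matrix P (so P^-1 = P^T).
rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2

w, P = np.linalg.eigh(A)                  # A = P diag(w) P^T
fA = P @ np.diag(np.exp(w)) @ P.T         # f(A) via the eigenvalues, as in (10-20)
print(np.allclose(fA, expm(A)))  # True
```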