Академический Документы
Профессиональный Документы
Культура Документы
Matrix Algebra:
Exercises and Solutions
" Springer
David A. Harville
Mathematical Sciences Department
IBM TJ. Watson Research Center
Yorktown Heights, NY 10598-0218
USA
9 8 765 432 I
ISBN 978-0-387-95318-2
Preface
This book comprises well over three-hundred exercises in matrix algebra and their
solutions. The exercises are taken from my earlier book Matrix Algebra From a
Statistician's Perspective. They have been restated (as necessary) to make them
comprehensible independently of their source. To further insure that the restated
exercises have this stand-alone property, I have included in the front matter a
section on terminology and another on notation. These sections provide definitions,
descriptions, comments, or explanatory material pertaining to certain terms and
notational symbols and conventions from Matrix Algebra From a Statistician's
Perspective that may be unfamiliar to a nonreader of that book or that may differ in
generality or other respects from those to which he/she is accustomed. For example,
the section on terminology includes an entry for scalar and one for matrix. These
are standard terms, but their use herein (and in Matrix Algebra From a Statistician's
Perspective) is restricted to real numbers and to rectangular arrays of real numbers,
whereas in various other presentations, a scalar may be a complex number or more
generally a member of a field, and a matrix may be a rectangular array of such
entities.
It is my intention that Matrix Algebra: Exercises and Solutions serve not only
as a "solution manual" for the readers of Matrix Algebra From a Statistician's
Perspective, but also as a resource for anyone with an interest in matrix algebra
(including teachers and students of the subject) who may have a need for exercises
accompanied by solutions. The early chapters of this volume contain a relatively
small number of exercises-in fact, Chapter 7 contains only one exercise and
Chapter 3 only two. This is because the corresponding chapters of Matrix Alge-
bra From a Statistician's Perspective cover relatively standard material, to which
many readers will have had previous exposure, and/or are relatively short. It is
vi Preface
the final ten chapters that contain the vast majority of the exercises. The topics
of many of these chapters are ones that may not be covered extensively (if at all)
in more standard presentations or that may be covered from a different perspec-
tive. Consequently, the overlap between the exercises from Matrix Algebra From
a Statistician's Perspective (and contained herein) and those available from other
sources is relatively small.
A considerable number of the exercises consist of verifying or deriving results
supplementary to those included in the primary coverage of Matrix Algebra From
a Statistician's Perspective. Thus, their solutions provide what are in effect proofs.
For many of these results, including some of considerable relevance and interest
in statistics and related disciplines, proofs have heretofore only been available (if
at all) through relatively high-level books or through journal articles.
The exercises are arranged in 22 chapters and within each chapter, are numbered
successively (starting with 1). The arrangement, the numbering, and the chapter
titles match those in Matrix Algebra From a Statistician's Perspective. An exercise
from a different chapter is identified by a number obtained by inserting the chapter
number (and a decimal point) in front of the exercise number.
A considerable effort was expended in designing the exercises to insure an
appropriate level of difficulty-the book Matrix Algebra From a Statistician's
Perspective is essentially a self-contained treatise on matrix algebra, however it
is aimed at a reader who has had at least some previous exposure to the subject
(of the kind that might be attained in an introductory course on matrix or linear
algebra). This effort included breaking some of the more difficult exercises into
relatively palatable parts and/or providing judicious hints.
The solutions presented herein are ones that should be comprehensible to those
with exposure to the material presented in the corresponding chapter of Matrix
Algebra From a Statistician's Perspective (and possibly to that presented in one
or more earlier chapters). When deemed helpful in comprehending a solution,
references are included to the appropriate results in Matrix Algebra From a Statis-
tician's Perspective-unless otherwise indicated a reference to a chapter, section,
or subsection or to a numbered result (theorem, lemma, corollary, "equation",
etc.) pertains to a chapter, section, or subsection or to a numbered result in Matrix
Algebra From a Statistician's Perspective (and is made by following the same con-
ventions as in the corresponding chapter of Matrix Algebra From a Statistician's
Perspective). What constitutes a "legitimate" solution to an exercise depends of
course on what one takes to be "given". If additional results are regarded as given,
then additional, possibly shorter solutions may become possible.
The ordering of topics in Matrix Algebra From a Statistician's Perspective is
somewhat nonstandard. In particular, the topic of eigenvalues and eigenvectors is
deferred until Chapter 21, which is the next-to-Iast chapter. Among the key results
on that topic is the existence of something called the spectral decomposition. This
result if included among those regarded as given, could be used to devise alternative
solutions for a number of the exercises in the chapters preceding Chapter 21.
However, its use comes at a "price"; the existence of the spectral decomposition
can only be established by resort to mathematics considerably deeper than those
Preface vii
Preface v
1 Matrices
6 Geometrical Considerations 21
8 Inverse Matrices 29
9 Generalized Inverses 35
10 Idempotent Matrices 49
x Contents
13 Determinants ...................................................................................... 69
References 265
Index 267
Some Notation
{xd A row or (depending on the context) column vector whose ith element
is Xi
{aij} A matrix whose ijth element is aij (and whose dimensions are arbitrary
or may be inferred from the context)
A' The transpose of a matrix A
AP The pth (for a positive integer p) power of a square matrix A; i.e., the
matrix product AA ... A defined recursively by setting A0 = I and taking
Ak = AA k - 1 (k = 1, ... , p)
C(A) Column space of a matrix A
R(A) Row space of a matrix A
R mxn The linear space comprising all m x n matrices
Rn The linear space R nx I comprising all n-dimensional column vectors
or (depending on the context) the linear space R I xn comprising all n-
dimensional row vectors
speS) Span of a finite set S of matrices; Sp({AI, ... , Ad), which represents
the span of the set {AI, ... , Ad comprising the k matrices AI, ... , Ab
is generally abbreviated to Sp(AI, ... , Ak)
C Writing SeT (or T ~ S) indicates that a set S is a (not necessarily
proper) subset of a set T
dim (V) Dimension of a linear space V
rank A The rank of a matrix A
rank T The rank of a linear transformation T
xii Some Notation
~
aXiaXj An alternative [to Drjf or DrJ(x)] notation for the ijth (second-order)
partial derivative of a function 1 of an m x 1 vector x = (XI, ... , x m ) ' -
this notation extends in a straightforward way to third- and higher-order
partial derivatives
HI The Hessian matrix of a function 1 - accordingly, Hf(c) represents
the value of Hf at an interior point c of the domain of f
Djf The p x 1 vector (Dj iJ, ... , D J!p)', whose ith element is the jth partial
derivative D j Ii of the ith element fi of a p x 1 vector f = (fl, ... , I p )'
of functions, each of whose domain is a set S in nmxl_ similarly,
Djf(c} = [DjiJ (c), ... , DJ!p(c}]', where c is an interior point of S
g~ The p x q matrix whose stth element is the partial derivative afsr/axj
of the stth element of a p x q matrix F = {fsd of functions of a vector
x = (XI, ... ,xm )' of m variables
a2 F
aXi ax j The p x q matrix whose stth element is the second-order partial derivative
a2fsr/aXiaXj of the stth elementofa p x q matrix F = {fsd offunctions
of a vector x = (XI, ... ,xm )' of m variables-this notation extends in a
straightforward way to a p x q matrix whose s tth element is one of the
third- or higher-order partial derivatives of the stth element of F
Df The Jacobian matrix (Dlf, ... , Dmf) of a vector f = (fl,"" fp)' of
functions, each of whose domain is a set S in nm
x I - similarly, Df(c}
= [Dlf(c), ... , Dmf(c}], where c is an interior point of S
:: An alternative [to Df or Df(x}] notation for the Jacobian matrix of a vector
f = (iJ, ... , fp)' of functions of an m x I vector x = (XI, ... ,xm )' -
af/ax' is the p x m matrix whose ijth element is ali/axj
ar'
ax An alternative [to (Df)' or (Df(x»'] notation for the gradient (matrix)
of a vector f = (fJ, ... , fp)' of functions of an m x 1 vector x =
(XI, ... , xm)' - ar' lax is the m x p matrix whose jith element is
af;/axj
ax
aaaf The derivative of a function f of an m x n matrix X of mn "independent"
variables or (depending on the context) of an n x n symmetric matrix X
- the matrix af/aX' is identical to (af/aX)'
unV The intersection of 2 sets U and V of matrices-this notation extends in
an obvious way to the intersection of 3 or more sets
U UV The union of 2 sets U and V of matrices (of the same dimensions )-this
notation extends in an obvious way to the union of 3 or more sets
U +V The sum of 2 nonempty sets U and V of matrices (of the same dimen-
sions)-this notation extends in an obvious way to the sum of 3 or more
nonempty sets
U EI1 V The direct sum of 2 (essentially disjoint) linear spaces U and V in xn nm
Some Notation xv
Y E nn) that, for some m x n matrix A = {aij} (called the matrix of the
bilinear form), is expressible as x'Ay = Li,j aijXiYj - the bilinear form
is said to be symmetric if m = n and x' Ay = y' Ax for all x and all y or
equivalently if the matrix A is symmetric,
o 0 Arr
of whose off-diagonal blocks are null matrices) is said to be block-diagonal
and may be expressed in abbreviated notation as diag(AIl, A22, ... , Arr).
All AJ2 ... Air)
(
o A22 ... A2r
block-triangular A partitioned matrix of the form: ... : or
o 0 Arr
(!~: A~2
Arl Ar2
~
Arr
) is respectively upper or lower block-triangular-
full column rank An m x n matrix A is said to have full column rank if rank(A)
=n.
full row rank An m x n matrix A is said to have full row rank ifrank(A) = m.
generalized eigenvalue problem The generalized eigenvalue problem consists of
finding, for a symmetric matrix A and a symmetric positive definite matrix
B, the roots of the polynomial IA - ABI (i.e., the solutions for A to the
equation IA - ABI = 0).
generalized inverse A generalized inverse of an m x n matrix A is any n x m
matrix G such that AGA = A - if A is nonsingular, its only generalized
inverse is A-I; otherwise, it has infinitely many generalized inverses.
geometric multiplicity The geometric multiplicity of an eigenvalue, say A, of an
n x n matrix A is (by definition) dim[N(A - AI)] (i.e., the dimension of the
eigenspace of A).
gradient (or gradient matrix) The gradient of a vector f = (h, ... , Ip)' of
functions, each of whose domain is a set in nm xl, is the m x p matrix
[(DI!)', ... , (Dip)'], whose jith element is Ddi-the gradient off is the
transpose of the Jacobian matrix of f.
gradient vector The gradient vector of a function I, with domain in n mx1 , is
the m-dimensional column vector (D f)', whose jth element is the partial
derivative D j I of I
Hessian matrix The Hessian matrix of a function I, with domain in nm xl, is the
m x m matrix whose ijth element is the ijth partial derivative D;j I of I
homogeneous linear system A linear system (in a matrix X) of the fonn AX = 0;
i.e., a linear system whose right side is a null matrix.
idempotent A (square) matrix A is idempotent if A2 = A.
identity transformation An identity transfonnation is a transfonnation from a
linear space V onto V defined by T (X) = X.
indefinite A square (symmetric or nonsymmetric) matrix or a quadratic fonn is
(by definition) indefinite if it is neither nonnegative definite nor non positive
° °
definite-thus, an n x n matrix A and the quadratic fonn x' Ax (in an n x 1
vector x) are indefinite if x' Ax < for some x and x' Ax > for some
(other) x.
inner product The inner product A • B of an arbitrary pair of matrices A and B in
a linear space V is the value assigned to A and B by a designated function
having the following 4 properties: (1) A· B = B· A; (2) A· A :::: 0, with
equality holding if and only if A = 0; (3) (kA) • B = k(A· B) (where k is
an arbitrary scalar); (4) (A + B) • C = (A· C) + (B· C) (where C is an
arbitrary matrix in V)-the quasi-inner product A· B is defined in the same
way as the inner product except that Property (2) is replaced by the weaker
property (2') A· A :::: 0, with equality holding if A = O.
inner product (usual) The usual inner product of a pair of matrices A and B in a
linear space is tr(A'B) (which in the special case of a pair of column vectors
xxii Some Tenninology
nonnull x.
neighborhood A neighborhood of an m x n matrix C is a set of the general form
{X E nmxn: IIX-CII < r}, wherer is a positive number called the radius
of the neighborhood (and where the norm is the usual norm).
nonhomogeneous linear system A linear system whose right side (which is a
column vector or more generally a matrix) is nonnull.
nonnegative definite An n x n (symmetric or nonsymmetric) matrix A and the
quadratic form x' Ax (in an n x 1 vector x) are (by definition) nonnegative
definite if x' Ax :::: 0 for every x in nn.
nonpositive definite An n x n (symmetric or nonsymmetric) matrix A and the
quadratic form x' Ax (in an n x 1 vector x) are (by definition) nonpositive
definite if -x' Ax is a nonnegative definite quadratic form (or equivalently
if -A is a nonnegative definite matrix)-thus, A and x' Ax are nonpositive
definite if x' Ax :::: 0 for every x in nn.
nonnull matrix A matrix having 1 or more nonzero elements.
nonsingular A matrix is nonsingular if it has both full row rank and full column
rank or equivalently if it is square and its rank equals its order.
norm The norm of a matrix A in a linear space V is (A· A)1/2-the use of this
term is limited herein to norms defined in terms of an inner product; in the
case of a quasi-inner product, (A· A) 1/2 is referred to as the quasi norm.
normal equations A linear system (or the equations comprising the linear system)
of the form X/Xb = X' y (in a p x 1 vector b), where X is an n x p matrix
and y an n x 1 vector.
null matrix A matrix all of whose elements are o.
null space (of a matrix) The null space of an m x n matrix A is the solution
space of the homogeneous linear system Ax = 0 (in an n-dimensional
column vector x), or equivalently is the set {x E nnxl : Ax = O} .
null space (of a transformation) The null space-also known as the kernel- of
a linear transformation T from a linear space V into a linear space W is the
set (X E V : T(X) = O}, which is a subspace of V.
one to one A transformation T from a set V into a set W is said to be 1-1 (one to
one) if each member of the range of T is the image of only one member of
V.
onto A transformation T from a set V into a set W is said to be onto if T (V) = W
(i.e., if the range of T is all of W), in which case T may be referred to as a
transformation from V onto W.
open set A set S of m x n matrices is an open set if every matrix in S is an interior
point of S.
order A (square) matrix of dimensions n x n is said to be of order n.
orthogonal complement The orthogonal complement of a subspace U of a linear
space V is the set comprising all matrices in V that are orthogonal to U -
note that the orthogonal complement of U depends on V as well as U (and
xxvi Some Terminology
t t
singular value decomposition An m x n matrix A of rank r is expressible as
(column) vee'or (1} where (fo, i = 1.2 •...• n).; = (aii. ai+l.i.···.
and is the subvector of the i th column of A obtained by striking out its first
i - I elements.
vee-permutation matrix The mn x mn vee-permutation matrix is the unique
permutation matrix, denoted by the symbol K mn , such that, for every m x n
matrix A, vec(A') = Kmn vee (A) - the vee-permutation matrix is also
known as the commutation matrix.
zero transformation The linear transformation from a linear space V into a linear
space W that assigns to every matrix in V the null matrix (in W) is called
the zero transformation.
1
Matrices
EXERCISE 1. Show that, for any matrices A, B, and C (of the same dimensions),
(A + B) +C = (C + A) + B.
Solution. Since matrix addition is commutative and associative,
(A + B) +C = C + (A + B) = (C + A) + B.
~ (~>ijbjkCks)
~ (~aijbjkCks) = ~ (~aijbjk)CkS'
and Lk(Lj aijbjk)cks equals the isth element of (AB)e. Since each element of
A(BC) equals the corresponding element of (AB)C, we conclude that A(BC) =
(AB)C.
(b) Observing that the j kth element of B + C equals b j k + Cj k, we find that the
ikth element of A(B + C) equals
Further, observing that Lj aijbjk is the ikth element of AB and that Lj aijCjk
is the ikth element of AC, we find that Lj aijbjk + Lj aijCjk equals the ikth
element of AB + Ae. Since each element of A(B + C) equals the corresponding
element of AB + BC, we conclude that A(B + C) = AB + Be.
(E.1)
Solution. (a) The jth element of the vector Ax is Lk=l ajkxk. Thus, upon re-
garding BAx as the product of B and Ax, we find that the ith element of BAx is
Lj=l bij Lk=l a jkXk·
(b) The i rth element of BAX is
m n
Lbij LajkXkr,
j=l k=l
as is evident from Part (a) upon regarding the irth element of BAX as the ith
element of the product of BA and the rth column of X.
(c) According to Part (a), the sth element of the vector BAx is
m n
Lbsj Lajkxk.
j=l k=l
Thus, upon regarding CBAx as the product of C and BAx, we find that the i th
element of CBAx is
(d) The i th element of the row vector y'BA is the same as the i th element of the
column vector (y'BA)' = A'B'y. Thus, according to Part (a), the ith element of
y'BA is
m p
Laji LbkjYk.
j=l k=l
(A + B)(A - B) = A2 - B2
Thus,
(A + B)(A - B) = A 2 - B2
if and only if -AB + BA = 0 or equivalently if and only if AB = BA (i.e., if and
only if A and B commute).
Solution. (a) Since A and B are symmetric, (AB)' = B' A' = BA. Thus, if AB
is symmetric, that is, if AB = (AB)', then AB = BA, that is, A and B commute.
Conversely, if AB = BA, then AB = (AB)'.
AB = (~ ~) # (~ ~) = BA.
EXERCISE 7. Verify (a) that the transpose of an upper triangular matrix is lower
triangular and (b) that the sum of two upper triangular matrices (of the same order)
is upper triangular.
Solution. Let A = {aij} represent an upper triangular matrix of order n. Then,
by definition, the ijth element of A' is aji. Since A is upper triangular, aji = 0
for i < j = 1, ... , n or equivalently for j > i = 1, ... , n. Thus, A' is lower
triangular, which verifies Part (a).
Let B = {bij} represent another upper triangular matrix of order n. Then, by
definition, the ijth element of A + B is aij + bij. Since both A and B are upper
triangular, aij = 0 and bij = 0 for j < i = 1, ... , n, and hence aij + bij = 0 for
j < i = 1, ... , n. Thus, A + B is upper triangular, which verifies Part (b).
(a) Show that, for i = 1, ... , n and j = 1, ... , min(n, i + p - 1), the ijth
element of AP equals zero.
(b) Show that, for i :::: n - p + 1, the ith row of AP is null.
(c) Show that, for p :::: n, AP = O.
Solution. For i, k = 1, ... , n, let bik represent that ikth element of AP.
(a) The proof is by mathematical induction. Clearly, for i = 1, ... , n and j =
1, ... , min (n, i + 1 - 1), the i j th element of A I equals zero. Now, suppose that, for
i = I, ... , nand j = I, ... , min(n, i + p - I), the ijth element of AP equals zero.
Then, to complete the induction argument, it suffices to show that, for i = I, ... , n
and j = 1, ... , min(n, i + p), the ijth element of AP+! equals zero. Observing
that AP+l = APA, we find that, fori = I, ... ,nandj = l, ... ,min(n,i+p),
the ijth element of AP+l equals
min(n,i+p-l) n
= L Oakj + L bikakj
k=l k=i+p
(where, if i > n - p, the sum Lk=i+p bikakj
is degenerate and is to be interpreted as 0)
1. Matrices 5
= 0
(since, for k ~ j, akj = 0).
(b)Fori ~ n-p+1,min(n, i+p-1) = n(sincei ~ n-p+1 <=> i+p-1 ~ n).
Thus, for i ~ n - p + 1, it follows from Part (a) that all n elements of the ith row
of AP equal zero and hence that the ith row of AP is null.
(c) Clearly, for p ~ n, n - p + 1 S 1. Thus, for p ~ n, it follows from Part (b)
that all n rows of AP are null and hence that AP = O.
2
Submatrices and Partitioned Matrices
Solution. Let it, ... , i: (it < ... < in represent those r of the first m posi-
tive integers that are not represented in the sequence iI, ... , i m- r . Similarly, let
H, ... , g (j{ < ... < js*) represent those s of the first n positive integers that
are not represented in the sequence h, ... , jn-s. Denote by aij and bij the ijth
elements of A and A', respectively. Then,
a" .• a" .•
'Ilt 'ilt
A'* - .
- :
( (
a'.
'T lt'* a·* '*
'I Js
b .•.• b·.·.
Ji'l lt'T
(
=
b·.··
Jsli
b·.·.
JS'T
EXERCISE 3. Let
...... A2r
AIr)
o Arr
represent an n x n upper block-triangular matrix whose ijth block Ai} is of di-
mensions ni x n j (j ~ i = I, ... , r). Show that A is upper triangular if and only
if each of its diagonal blocks All, A22, ... , Arr is upper triangular.
Solution. Let ats represent the t sth element of A (t, s = 1, ... , n). Then,
(j ~ i = 1, ... ,r).
Suppose that A is upper triangular. Then, by definition, ats = 0 for s < t =
1, ... , n. Thus, anl+.+ni-l+k,nl+,,+ni-l+l (which is the kith element of the ith
diagonal block Aii) equals zero for I < k = 1, ... , ni, implying that Aii is upper
block-triangular (i = 1, ... , r).
Conversely, suppose that All, A22, ... ,Arr are upper triangular. Let t and s
represent any integers (between 1 and n, inclusive) such that ats i= O. Then,
clearly, for some integers i and j ~ i, ats is an element of the submatrix Aij, say
the kith element, in which case t = nl + .. ·+ni-l +k ands = nl + .. ·+nj-l +1.
If j > i, then (since k :::: ni) t < s. Moreover, if j = i, then (since Aii is upper
triangular) k :::: I, implying that t :::: s. Thus, in either case, t :::: s. We conclude
that A is upper triangular.
EXERCISE 4. Let
A= Cl
A2!
.
Ar!
Al2
A22
Ar2 ...
Ak)
A2c
Arc
2. Submatrices and Partitioned Matrices 9
...
...
A~I)
A~2
A~c
in other words, verify that A' can be expressed as a partitioned matrix, comprising
c rows and r columns of blocks, the ijth of which is the transpose Aji of the jith
block A ji of A. And, letting
BIV)
B2v
. ,
... B~v
represent a partitioned P x q matrix whose ijth block Bij is of dimensions Pi x qj,
verify also that if c = u and nk = Pk (k = 1, ... , c) [in which case all of the
products AikBkj (i = 1, ... , r; j = 1, ... , v; k = 1, ... , c), as well as the product
AB, exist], then
...
AB=
(Fll. F21
F12
F22 ... Fl")
F2v
. ,
Frl Fr2 F rv
Hll H12
H21 H22
A'= ( .
HcI
Clearly, H;j is the submatrix of A' obtained by striking out the first n 1 + ... + ni-I
and last ni+1 + ... + nc rows of A' and the first ml + ... + mj_1 and last
mj+l + ... + mr columns of A'; and Aji is the submatrix of A obtained by
striking out the first m 1 + ... + m j -I and last m j + 1 + ... + mr rows of A and the
first n 1 + ... + ni -1 and last ni +1 + ... + nc columns of A. Thus, it follows from
result (1.1) that
10 2. Submatrices and Partitioned Matrices
Then, for w = 1, ... , mi and z = 1, ... , qj, the wzth element of Sij is
Sml +··+mi-I +w,ql +"+qj_1 +~
nl+-·+n c
L aml+-·+mi_l+w.l bl,ql+··+qj_l+z
l=1
c nl+··+nk_l+nk
L L aml+-··+mi_l+w,l be,ql+··+qj_l+z
k=1 l=nl+··+nk_I+1
c nk
= LL aml+-·+mi_l+w,nl+-·+nk_l+t bnl+··+nk_l+t.ql+-·+qj_l+z,
k=1 t=1
nk
L
1=1
ami +-·+mi-I +W,nl +··+nk-I +t b nl +-·+nk-I +t,ql +-'+qj-I +z
is the wzth element of AikBkj and hence that sml+-.+mi_l+w.ql+ .. +qj_I+Z is the
wzth element of Fij. Thus,
3
Linear Dependence and Independence
EXERCISE 1. For what values of the scalar k are the three row vectors (k, 1, 0),
(1, k, 1), and (0, 1, k) linearly dependent, and for what values are they linearly
independent? Describe your reasoning.
Solution. Let Xl, x2, and X3 represent any scalars such that
xlk +X2 = 0,
Xl +X2k +X3 = 0,
X2 +X3k = 0,
Thus, there exist values of XI, X2, and X3 other than XI = X2 = X3 = 0 if and only
if k = 0 or k = ±J2. And, we conclude that the three vectors (k, 1,0), (1, k, 1),
and (0, 1, k) are linearly dependent if k = 0 or k = ±J2, and linearly independent,
otherwise.
EXERCISE 1. Which of the following two sets are linear spaces: (a) the set of all
n x n upper triangular matrices; (b) the set of all n x n nonsymmetric matrices?
Solution. Clearly, the sum of two n x n upper triangular matrices is upper triangular.
And, the matrix obtained by multiplying any n x n upper triangular matrix by any
scalar is upper triangular. However, the sum of two n x n nonsymmetric matrices is
not necessarily nonsymmetric. For example, if A is an n x n nonsymmetric matrix,
then -A and A' are nonsymmetric, yet the sums A + (-A) = 0 and A + A' are
symmetric. Also, the product of the scalar 0 and any n x n matrix is the null matrix,
which is symmetric. Thus, the set of all n x n upper triangular matrices is a linear
space, but the set of all n x n nonsymmetric matrices is not.
(2) IfR(A') = R(B'), then R(A') c R(B') and R(B') c R(A'), implying [in
light of Part (1)] thatC(A) C C(B) andC(B) c C(A) and hence thatC(A) = C(B).
Similarly, if C(A) = C(B), then C(A) c C(B) and C(B) C C(A), implying that
R(A') c R(B') and R(B') c R(A') and hence that R(A') = R(B'). Thus,
C(A) = C(B) if and only if R(A') = R(B').
EXERCISE 4. Let A, B, and C represent three matrices (having the same dimen-
sions) such that A + B + C = O. Show that sp(A, B) = sp(A, C).
Solution. Let E represent an arbitrary matrix in sp(A, B). Then, E = dA + kB for
some scalars d and k, implying (since B = -A - C) that
Thus, sp(A, B) C sp(A, C). And, it follows from an analogous argument that
sp(A, C) C sp(A, B). We conclude that sp(A, B) = sp(A, C).
EXERCISE 5. Let AI, ... , Ak represent any matrices in a linear space V. Show
that sp(A I, ... , A k ) is a subspace of V and that, among all subspaces of V that
contain AI, ... Ak, it is the smallest [in the sense that, for any subspace U (of V)
that contains AI, ... , Ab SP(AI, ... , Ak) c U].
Solution. Let U represent any subspace of V that contains AI, ... , Ak. It suffices
(since V itself is a subspace of V) to show that SP(AI, ... , Ak) is a subspace of U.
Let A represent an arbitrary matrix in Sp(AI, ... , Ak). Then, A = xlAI + ... +
XkAk for some scalars XI, ... , Xk, implying that A E U. Thus, Sp(AI, ... , Ak)
is a subset of U, and, since Sp(AI, ... ,Ak) is a linear space, it follows that
Sp(AI, ... , Ak) is a subspace of U.
4. Linear Spaces: Rowand Column Spaces 15
EXERCISE 6. Let AI, ... , Ap and BI, ... , Bq represent matrices in a linear
space V. Show that if the set {AI, ... , Ap} spans V, then so does the set {AI, ... ,
A p , BI, ... , B q }. Show also that if the set {AI, ... , A p , BI, ... , B q } spans V and
if BI, ... ,Bq are expressible as linear combinations of AI, ... , A p , then the set
{AI, ... , Ap} spans V.
Solution. It suffices (as observed in Section 4.3c) to show that if BI, ... ,Bq are
expressible as linear combinations of AI, ... , A p , then any linear combination
of the matrices AI, ... , A p , BI, ... , Bq is expressible as a linear combination of
AI, ... ,Ap and vice versa. Suppose then that there exist scalars klj, ... ,kpj such
that B j = Li kijAi (j = 1, ... , q). Then, for any scalars Xl, ... ,x p' YI, ... , Yq'
which verifies that any linear combination of AI, ... ,A p , BI, ... ,Bq is express-
ible as a linear combination of AI, ... , Ap. That any linear combination of AI, ... ,
Ap is expressible as a linear combination of AI, ... , A p , BI, ... , Bq is obvious.
EXERCISE 7. Suppose that {A 1, ... , Ak} is a set of matrices that spans a linear
space V but is not a basis for V. Show that, for any matrix A in V, the representation
of A in terms of AI, ... , Ak is nonunique.
Solution. Let Xl, ... , Xk represent any scalars such that A = L~= 1 Xi Ai. [Since
Sp(AI, ... ,Ak) = V, such scalars necessarily exist.] Since the set {AI, ... , Ak}
spans V but is not a basis for V, it is linearll dependent and h~nce there exist
scalars ZI, ... , Zk. not all zero, such that Li=l ZiAi = o. Lettmg Yi = Xi +
Zi (i = 1, ... ,k), we obtain a representation A = L~=l YiAi different from the
representation A = L~=I XiAi.
EXERCISE 8. Let
A-
- (~ 0
1
-2
2
-3 2)
o0 6 2
5 2 .
2
o -4 -2 1 0
(a) Show that each of the two column vectors (2, -1,3, -4)' and (0, 9, -3, 12)'
is expressible as a linear combination of the columns of A [and hence is in C(A)].
(b) A basis, say S*, for a linear space V can be obtained from any finite set S
that spans V by successively applying to each of the matrices in S the following
algorithm: include the matrix in S* if it is nonnull and if it is not expressible as a
linear combination of the matrices already included in S*. Use this algorithm to
find a basis for C(A). (In applying the algorithm, take the spanning set S to be the
set consisting of the columns of A.)
(c) What is the value of rank(A)? Explain your reasoning.
16 4. Linear Spaces: Rowand Column Spaces
(d) A basis for a linear space V that includes a specified set, say T, of r linearly
independent matrices in V can be obtained by applying the algorithm described
in Part (b) to the set S whose first r elements are the elements of T and whose
remaining elements are the elements of any finite set U that spans V. Use this
generalization of the procedure from Part (b) to find a basis for C(A) that includes
the two column vectors from Part (a). (In applying the generalized procedure, take
the spanning set U to be the set consisting of the columns of A.)
Solution. (a) Clearly,
and
(b) The basis obtained by applying the algorithm comprises the following 3
(j} (j). (0
vectors:
(c) Rank A = 3. The number of vectors in a basis for C(A) equals 3 [as is
evident from Part (b)], implying that the column rank of A equals 3.
(d) The basis obtained by applying the generalized procedure comprises the
cn
following 3 vectors:
(j). (~D·
EXERCISE 9. Let A represent a q x p matrix, B a p x n matrix, and C an m x q
matrix. Show that (a) if rank(CAB) = rank(C), then rank(CA) = rank(C) and
(b) ifrank(CAB) = rank(B), then rank(AB) = rank(B).
Solution. (a) Suppose that rank(CAB) = rank(C). Then, it follows from Corollary
4.4.5 that
rank(C) ~ rank(CA) ~ rank(CAB) = rank(C)
and hence that rank(CA) = rank(C).
4. Linear Spaces: Rowand Column Spaces 17
(b) Similarly, suppose that rank (CAB) = rank(B). Then, it follows from Corol-
lary 4.4.5 that
rank(B) ~ rank(AB) ~ rank(CAB) = rank(B)
and hence that rank(AB) = rank(B).
(b) Confirm that rank(C) ~ rank ( ~). with equality holding if and only if
R(A) c R(C).
Solution. (a) Suppose that R(A) c R(C). Then, according to Lemma 4.2.2, there
exists an m x q matrix L such that A = LC and hence such that ( ~) = (~) C.
Thus, R( ~) c R(C), implying [since R(C) C R(~)l that R(C) = R( ~).
Conversely, suppose that R(C) = R(~). Then, since R(A) C R(~)'
R(A) c R(C). Thus, we have established that R(C) = R(~) {:} R(A) c
R(C) .
(b) Since (according to Lemma 4.5. 1) R(C) C R(~) , it follows from Theorem
rank ( ~). And, conversely, if rank(C) = rank ( ~). then since R(C) C R( ~) ,
it follows from Theorem 4.4.6 that R(C) = R( ~) and hence [in light of Part (a)
or of Lemma 4.5.1] that R(A) C R(C). Thus, rank(C) ::: rank ( ~). with equality
holding if and only if R(A) c R(C).
5
Trace of a (Square) Matrix
A* = G~). B* = (_ ~ ~ ). C* = G_~ ).
20 5. Trace of a (Square) Matrix
we find that
Solution. (a) Making use ofresuIts (2.3) and (1.5), we find that
EXERCISE 1. Use the Schwarz inequality to show that, for any two matrices A
and B in a linear space V,
II A + B II :<:: II A II + II B II,
II A + B 112 = (A + B) (A + B)
0
or equivalently that
IIA+BII:<:: IIAII + IIBII·
For this inequality to hold as an equality, it is necessary and sufficient that both
of inequalities (S.l) and (S.2) hold as equalities. Recalling (from, for instance,
Theorem 6.3.1) the conditions under which the Schwarz inequality holds as an
equality, we find that inequalities (S.l) and (S.2) both hold as equalities if and
only ifB = 0 or A = kB with k ::: O.
22 6. Geometrical Considerations
8(B, A) = II B - A II
= II (-I)(A - B) II = 1- 11 II A - B II = II A - B II = 8(A, B).
(b)
8(A, B) = II A - B I
= II (A - C) + (C - B) II
::::: II A - Cil + II C - B II = 8(A, C) + 8(C, B).
(d)
EXERCISE 3. Let w'J , w;, and w; represent the three linearly independent 4-
dimensional row vectors (6, 0, -2,3), (-2,4,4,2), and (0,5, -1, 2), respectively,
in the linear space R4, and adopt the usual definition of inner product.
(a) Use Gram-Schmidt orthogonalization to find an orthonormal basis for the
linear space sp(w;, w;, w;).
6. Geometrical Considerations 23
n
(b) Find an orthonormal basis for 4 that includes the three orthonormal vectors
from Part (a). Do so by extending the results of the Gram-Schmidt orthogonaliza-
tion [from Part (a)] to a fourth linearly independent row vector such as (0, 1,0,
0).
Solution. (a) The 3 orthogonal vectors obtained by applying the formulas (for
Gram-Schmidt orthogonalization) of Theorem 6.4.1 are:
y~ = w~ = (6,0, -2,3),
y~ = w~ - (-2/7)y~ = (1/7)(-2,28,24,20),
y; = w; - (13/21)y~ - (8/49)y~ = (1/147)(-118,371, -411, -38).
By normalizing y~, y~, and y;, we obtain a basis for sp(w;, w~, w;) consisting of
the following 3 vectors:
z; = (1/7)y; = 0/7)(6,0, -2,3),
z~ = (1/6)y~ = 0/42)(-2,28,24,20),
z; = (21609/321930)1/2y; = (321930)-1/2(-118,371, -411, -38).
EXERCISE 4. Let {AI, ... , Ak} represent a nonempty (possibly linearly depen-
dent) set of matrices in a linear space V.
(a) Generalize the results underlying Gram-Schmidt orthogonalization (which
are for the special case where the set (AI, ... ,Ad is linearly independent) by
showing (1) that there exist scalars Xij (i < j = 1, ... , k) such that the set
comprising the k matrices
BI =AI,
B2 = A2 - X12BI,
24 6. Geometrical Considerations
is orthogonal; (2) that, for j = I, ... , k and for those i < j such that Bi is nonnull,
Xij is given uniquely by
and (3) that the number of non null matrices among BI, ... , Bk equals dim[sp(AI,
... , Ak)].
(b) Describe a procedure for constructing an orthonormal basis for Sp(AI, ... ,
Ad.
Solution. (a) The proof of (I) and (2) is by mathematical induction. Assertions
(1) and (2) are clearly true for k = 1. Suppose now that they are true for a set
of k - 1 matrices. Then, there exist scalars xij (i < j = 1, ... , k - 1) such
that the set comprising the k - 1 matrices BI, ... , Bk-I is orthogonal, and, for
j = 1, ... , k - I and for those i < j such that Bi is nonnull, Xij is given uniquely
by
AjoBi
xij = - - .
BioBi
Moreover, for i = I, ... , k - 1, we find (as in the proof of the results underlying
Gram-Schmidt orthogonalization in Theorem 6.4.1) that Bk ° Bi = 0 if and only if
AkoBi
Xik=--·
BioBi
This completes the induction argument, thereby establishing (I) and (2).
Consider now Assertion (3). Each of the matrices BI, ... , Bk can (by repeated
substitution) be expressed as a linear combination of AI, ... , Ak. Conversely, each
of the matrices AI, ... , Ak can be expressed as a linear combination ofBI , ... , Bk.
Thus, Sp(BI, ... , Bk) = SP(AI, ... , Ak). Since the set {BI, ... , Bd is orthogonal,
we conclude - in light of Lemma 6.2.1 and Theorem 4.3.2 - that the nonnull
matrices among BI, ... , Bk form a basis for Sp(AI, ... , Ak) and hence that the
number of such matrices equals dim[sp(AI, ... , Ak)].
(b) An orthonormal basis for Sp(AI, ... , Ad can be constructed by making use
of the formulas for BI, ... , Bk from Part (a). The basis consists of those matrices
obtained by normalizing the nonnull matrices among BI, ... , Bk.
Solution. Denote the first, ... , kth columns of A by aI, ... , at, respectively. Then,
according to the results of Exercise 4, there exist scalars xij (i < j = I, ... , k)
such that the k column vectors bl, ... , bk defined recursively by the equalities
bl = aI,
b2 = a2 - Xl2bl,
al = bl
a2 = b2 + X12bl,
form an orthogonal set. Further, r of the vectors bl, ... , bt, say the SI th, ... , srth
of them, are nonnull, and, for j = I, ... , k and for those i < j such that b i is
nonnull, Xij is given uniquely by
Now, let B represent the m x k matrix whose first, ... , kth columns are bl, ... ,
bk, respectively, and let X represent the k x k unit upper triangular matrix whose ij th
element is (for i < j = I, ... , k) Xij. Then, observing that the first column ofBX is
bI and that (for j = 2, ... ,k)the jthcolumn ofBX is bj+Xj-I,jbj-1 +. ,,+xljbI
and recalling result (2.2.9), we find that
where BI is the m x r submatrix (of B) whose columns are the slth, ... , srth
columns of B and Xl is the r x k submatrix (of X) whose rows are the slth,
... ,srth rows of X.
And, the decomposition A = BIXI can be reexpressed as
26 6. Geometrical Considerations
where Q = BID, with D = diag(1I b s [ II-I, ... , II bSr II-I), and Rl = EXl, with
E = diag( II bs [ II, ... ,II bSr II), or equivalently where Q is the m x r matrix with
jth column II b Sj 11-1 b Sj and RI = {rij} is the r x k matrix with
(b) Let A and B represent m x n matrices. (1) Show that if C is an r x q matrix and
Daqxmmatrixsuchthatrank(CD) = rank (D) , then CDA = CDBimpliesDA =
DB. {Hint. To show that DA = DB, it suffices to show that rank [D(A - B)] = O.}
(2) Similarly, show that if C is an n x q matrix and D a q x p matrix such that
rank (CD) = rank(C), then ACD = BCD implies AC = BC.
Solution. (a) It is clear from Corollary 4.2.3 that R(ACB) c R(CB) and C(ACB)
c C(AC).
Now, suppose that rank(AC) = rank(C). Then, according to Corollary 4.4.7,
R(AC) = R(C), and it follows from Lemma 4.2.2 that C = LAC for some matrix
L. Thus,
R(CB) = R(LACB) c R(ACB),
implying that R(ACB) = R(CB) [which implies, in tum, that rank(ACB) =
rank(CB)].
Similarly, if rank(CB) = rank(C), then C(CB) = C(C), in which case C =
CBR for some matrix R, implying that C(AC) = C(ACBR) c C(ACB) and
hence that C(ACB) = C(AC) [and rank(ACB) = rank(AC)].
28 7. Linear Systems: Consistency and Compatibility
(b) Let F = A-B. (1) Suppose that rank(CD) = rank(D). Then, if CDA =
CDB, we find, in light of Part (a), that
Thus,
(I - A) (I + A) = 0 <=> 1- A2 =0 <=> A2 = I.
(b) Clearly,
A2 = (a 2 +be ab +bd)
be +d 2
ae+ed .
30 8. Inverse Matrices
Thus, rank(TB) = r.
Solution. (a) To establish results (E. I) and (E.2), it suffices to observe that if A is
invertible and A -I is partitioned as A -I = (:~). then
and
8. Inverse Matrices 31
(b) Results (E.3) and (E.4) can be obtained as a special case of results (E.l) and
(E.2) by observing that if A is orthogonal, then A is invertible and A-I = A' =
(!D·
EXERCISE 5. Let A represent an m x n non null matrix of rank r. Show that
there exists an m x m orthogonal matrix whose first r columns span C(A).
Solution. According to Theorem 6.4.3, there exist r m-dimensional vectors that
are orthonormal with respect to the usual inner product for n mxI and form a
basis for C(A). And, according to Theorem 6.4.5, there exist m - r additional
m-dimensional vectors, say b r+1, ... , b m, such that bI, ... , b r , b r+1, ... , bm are
orthonormal with respect to the usual inner product for n'" x I and form a basis for
nmx 1. Clearly, the m x m matrix whose first, ... , rth, (r + l)th, ... , mth columns
are respectively bI, ... , b r , b r+ 1, ... , b m is orthogonal, and its first r columns
span C(A).
EXERCISE 7. Let
..
.. ,, AIr)
A2r (BII
B2I
., B= .
o A" Brl
FIr)
F2r
· , B
_I
=
(Gil
G2I
·· ...
o F" GrI
32 8. Inverse Matrices
where
j
Fii = Ail I , Fij = -Ail l L AikFkj (j > i = 1, ... ,r) , (*)
~r+-I
Gii = Bill, Gij = -Bill LBikGkj (j < i = 1, ... , r) . (**)
k=j
Show that the submatrices Fij (j ::: i = 1, ... , r) and Gij (j :::: i = 1, ... , r) are
also expressible as
j-I
Fjj = Aj}, Fij = -(LFikAkj)Aj/ (i < j = 1, ... , r), (E.5)
k=ri
Gjj = Bj/, Gij = -( L GikBkj)Bj/ (i > j = 1, ... , r) . (E.6)
k=j+1
Do so by applying results (**) and (*) to A' and B', respectively.
(b) Formulas (*) form the basis for an algorithm for computing A -I in r steps: the
first step is to compute the matrix F rr = A;,I; the (r - i + l)th step is to compute
the matrices F ii , Fi,i+I, ... , Fir from formulas (*) (i = r - 1, r - 2, ... ,1).
Similarly, formulas (**) form the basis for an algorithm for computing B- 1 in r
steps: the first step is to compute the matrix GIl = B1/; the ith step is to compute
the matrices Gil, Gi2, ... , Gi i from formulas (**) (i = 2, ... , r). Describe how
formulas (E.5) and (E.6) in Part (a) can be used to devise r-step algorithms for
computing A -I and B- 1, and indicate how these algorithms differ from those
based on formulas (*) and (**).
Solution. (a) Clearly, it suffices to show that
:}
G;I
%)
C C'
F' F;2 G;2 G~2
(A')-I = : 12 (8')-' ~ ~ . ,
i-I
F;i = (A;i)-I, Fji = -(A;i)-I(LAkiFjk) (j < i = 1, ... , r),
k=j
8. Inverse Matrices 33
j
G;i = (B;i)-I, Gji = -(B;i)-I( L B~iGjk) (j > i = 1, ... , r).
k=i+1
Upon observing that
C'
A' A;2
A' = : 12
A~r A;r o
the validity of these fonnulas for (A')-1 and (B')-1 is seen to be an immediate
consequence of fonnulas (**) and (*), respectively.
(b) To compute A-I, we can employ an r-step algorithm, whose first step is to
compute Fl1 = All and whose jth step is to compute the matrices Flj, F2j' ... ,
Fjj fromfonnulas (E.5)(j = 2, .... r). To compute B- 1, we can employ an r-step
algorithm, whose first step is to compute G rr = B;:;.1 and whose (r - j + l)th
step is to compute the matrices G jj, G 1+ I.j, ... , G r j from fonnulas (E.6) (j =
r - 1, r - 2, ... , 1). These algorithms differ from those based on fonnulas (*) and
(**) in that they generate A-I and B- 1 one "column" of blocks at a time, rather
than one "row" at a time.
9
Generalized Inverses
AXA=AI=A
36 9. Generalized Inverses
AA'[L'(TT')-IL]AA' = BTT'B'L'(TT')-ILBTT'B'
= BTT'(LB)'(TT')-IITT'B'
= BTT'I'B' = BTT'B' = AA'.
(b) Clearly,
A 2[R(TB)-IL]A 2 = BTBTR(TB)-ILBTBT
= BTBI(TB)-IITBT = BTBT = A2.
(of A) save rows il,i2, ... ,i, and columns h,h, ... ,j" and (3) taking (for
s = 1,2, ... , rand t = 1,2, ... , r) the jsitth element of G to be the stth element
ofB l / and taking its other (n - r)(m - r) elements to be O. Use this approach to
find a generalized inverse of the matrix
Solution. The second and third columns of A are linearly independent (as can
be easily verified), implying (since the first column is null) that r = 2. Choose,
for example, the linearly independent rows and linearly independent columns so
that il = 2 and i2 = 4 - clearly, the second and fourth rows of A are linearly
independent- and h = 2 and h = 3. Then,
Bll = (~ D·
Applying formula (8.1.2) for the inverse of a 2 x 2 nonsingular matrix, we find
that
Bl/ = (1/6) (_~ -2)4 .
Thus, one generalized inverse of A is
~)
0 0
G = (1/6) (~ 3
-3
0
0
-24
EXERCISE 5. Let A represent an m x n nonnull matrix of rank r. Take Band K
to be nonsingular matrices (of orders m and n, respectively) such that
G -- K- 1 (I,V U)B- 1
W
(E.1)
where H II is of dimensions r x r.
( Ir 0) H (Ir 0) _ (Ir 0)
0000-00'
with V = H12, V = H21, and W = H22, so that G is expressible in the form (E.l).
We conclude that G is a generalized inverse of A if and only if G is expressible in
the form (E. I).
and C a q x n matrix. Do so by showing (I) that, for any generalized inverse (~~)
of the partitioned matrix (A, B)(where Gll is of dimensions n x m), (k~~2) is
a generalized inverse of (A, kB) and (2) that, for any generalized inverse (HI, H2)
9. Generalized Inverses 39
Thus, it follows from Part (2) of Lemma 9.2.4 that the matrix
Thus, it follows from Part (1) of Lemma 9.2.4 that the matrix
(HI, H2) (
I
0' 0
kIq
)-1 = (HI, H2)
(1m
0
is a generalized inverse ofthe matrix (~ ~). Generalize the result of Part (a) by
showing that, unless T and Q are both nonsingular, there exist generalized inverses
of (~ ~) that are not of the form (*). [Hint. Use Part (a), together with the
result that, for any r x s matrix B, any r x r nonsingular matrix A, and any s x s
40 9. Generalized Inverses
Solution. (a) Making use of the result cited in the hint, we find that, for any p x n
matrix X and q x m matrix Y, the partitioned matrix
is a generalized inverse of (! ~). 1fT is not nonsingular, then either 1- T-T =1=
oor I - TT- =1= 0, as is evident from Corollary 8.1.2. Moreover, if 1- T-T =1= 0,
then X can be chosen so that (I - T-T) X =1= 0, and similarly if 1- TT- =1= 0,
then V can be chosen so that V(I - TT-) =1= o. We conclude that if T is not
nonsingular, then there exists a generalized inverse of (! ~) that is not of
the form (TO- ~_). It follows from an analogous argument that if W is not
nonsingular, then again there exists a generalized inverse of (! ~) that is not
of the form (T- 0)
0 W-·
(b) Suppose that either Tor Q is not nonsingular, and assume (for purposes of
establishing a contradiction) that every generalized inverse of (~ ~) is of the
form (*). Upon observing that (in light of Lemma 9.3.5)
and that (according to Lemma 8.5.2) the matrices (V~- ~) and (! - T;U)
are nonsingular and upon applying the result cited in the hint and making use of
formulas (8.5.6), we find that every generalized inverse of (! g) is of the form
(0 1 -T-U)-I (T- + T-UQ-VT-
I -Q-VT-
-T-UQ-)
Q-
(I 0)-1
- VT- I
= (Io T-U) (T- + T-UQ-VT-
I -Q-VT-
(I
- T-UQ-)
Q- VT-
0)
1
- (T-0 0)Q- ,
9. Generalized Inverses 41
which contradicts Part (a). We conclude that, unless T and Q are both nonsingular,
there exist generalized inverses of (~ ~) that are not of the fonn (*).
U - (I - TT-)U(I - Q-Q»)
W .
°
(b) Suppose that C (U) C C(T) and R(V) c R(T). Then, it follows from Lemma
9.3.5 that (I - TT-)U = and V(I - T-T) = 0, and hence that Conditions (1)
- (3) are satisfied.
°
(c) Take T = and U = 0, take W to be an arbitrary nonnull matrix, and take V
to be any nonnull matrix such thatC(V) C C(W). Conditions (1) and (3) are clearly
satisfied. Moreover, Q = W, and (in light of Lemma 9.3.5) (I - QQ-)V = 0, so
that condition (2) is also satisfied. On the other hand, the condition R(V) c R(T)
is obviously not satisfied.
and that C(AI2) c C(AII) and R(A2t> C R(AII)' Take Q to be the Schur
42 9. Generalized Inverses
(where Qll, Q12, Q21' and Q22 are of the same dimensions as A22, A23, A32, and
A33, respectively), so that QII = A22 - A21AlI AJ2, QJ2 = A23 - A21AlIAI3,
Q21 = A32 - A3IAlIAJ2, and Q22 = A33 - A3IAlIA!3. Let
Show that (1) G is a generalized inverse of T; (2) the Schur complement Q22 -
Q21 Q ll QI2 of QI I in Q relative to Q ll equals the Schur complement A33 - VGU
of T in A relative to G; and (3)
GU = (AlIAI3 - ~IIAI2QlIQ12) ,
QII QI2
VG = (A3IAII - Q21 Q lIA2I All ' Q21 Q ll)'
(b) Let A represent an n x n matrix (where n ::: 2), let nl, ... , nk represent
positive integers such that n I + ... + nk = n (where k ::: 2), and (for i = 1, ... , k)
let n7 = nl + '" + ni. Define (for i = 1, ... , k) Ai to be the leading principal
submatrix of A ofordern7 and define (fori = 1, ... , k-I)Ui tobethen7x(n-n7)
matrix obtained by striking out all of the rows and columns of A except the first n7
rows and the last n - n7 columns, Vi to be the (n - nn x n7 matrix obtained by
striking out all of the rows and columns of A except the last n- n7 rows and first
n7 columns, and Wi to be the (n - n7) x (11 - n7) submatrix obtained by striking
out all of the rows and columns of A except the last n - n7 rows and columns, so
that (for i = 1, ... , k - 1)
A= (~:
Suppose that (for i = 1, ... , k - 1) C(U i ) C C(Ai) and R(Vi) C R(Ai). Let
i)
B (i) _ ( BI\
- (i)
B21
9. Generalized Inverses 43
.
B(l) _
(B(i-I)
II
+ X(i-I)Q-(i-l)y(i-I)
I II I
_X(i-I)Q-(i-I»)
I 11
11 - _Q-(i-I)y(i-I) Q -(i-I) ,
II I II
" (X(i-I) _ X(i-I)Q-(i-I)Q(i-I»)
B(l) _ 2 I II 12
12 - Q-(i-I)Q(i-I) ,
II 12
B(i) _ (y(i-l) _ Q(i-I)Q-(i-I)y(i-l) Q(i-I)Q-(i-I»
21 - 2 21 11 I' 21 I '
B(i) _ Q(i-I) _ Q(i-I)Q-(i-I)Q(i-l)
22 - 22 21 II 12·
Show that (1) B~? is a generalized inverse of Ai (i = 1, ... , k); (2) B~d is the
Schur complement of Ai in A relative to B~ii (i = 1, ... , k -1); (3) B~id = B~iivi
and B~i = ViB~ii (i = 1, ... , k - 1).
[Note. The recursive formulas given in Part (b) for the sequence of matrices B(I) ,
... , B(k-I), B(k) can be used to generate B(k-I) in k - 1 steps or to generate B(k)
in k steps - the formula for generating B(i) from B(i-l) involves a generalized
inverse of the ni x ni matrix Q~\-I). The various parts of B(k-I) consist of a
generalized inverse B~~-I) of Ak-I, the Schur complement Bi~-l) of Ak-I in A
relative to B~~-I), a solution B~~-I) of the linear system Ak-IX = Vk-I (in X),
and a solution Bi~-l) of the linear system YAk-1 = Vk-I (in V). The matrix B(k)
is a generalized inverse of A. In the special case where ni = 1, the process of
generating the elements of the n x n matrix B(i) from those of the n x n matrix
B(i-I) is called a sweep operation - see, e.g., Goodnight (1979).]
Solution. (a) (1) That G is a generalized inverse of T is evident upon setting
T = All, V = Al2, V = A2J, and W = A22 in formula (6.2a) of Theorem 9.6.1 [or
equivalently in formula (*) of Exercise 7 or 8]-the conditions C(AI2) c C(All)
and R(A21) c R(A II ) insure that this formula is applicable.
(2)
and
It can be established in similar fashion that Y I = A31Aii - Q21 Qil A21Ali and
Y2 = Q2IQil'
(b) The proof of results (1), (2), and (3) is by mathematical induction. By defi-
nition, BW is a generalized inverse of AI, Bi~ is the Schur complement of Al in
ArelativetoBii),andBi~ = Bg)VI andBW =VIBW·
Suppose now that B;\-I) is a generalized inverse of Ai-I, that B~2-1) is the
Schurcompemento
I . A relahveto
f A i-I m . B(i-I)
II ,and th at B(i-l)
12 = B(i-I)V
II i-I
and B~I-I) = Vi-IBi\-I) (where 2 :: i :: k - 1). Partition Ai, Vi, and Vi as
. _ (A i-I A (i-I»)
12 · - (A(i-I) A(i-I»
AI - A(i-I) A(i-I) , an d V I - 31 ' 32
21 22
(were
h 13 has n *i _ 1 rows an d A(i-I)
A (i-I) 31 has n *i _ 1 coI umns.
) Then, cIearI y,
B (i)
Ii ' andB(i)
12 = B(i)V
11 i an
dB(i)
21 = V i B(i)
II. Moreover, SInce
. B(k-I)
11
. a generaI·Ized
IS
inverse of Ak-I, since Qi~-l) = Bi~-I) and Bi~-l) is the Schur complement of
. A I· B(k-I). X(k-I) B(k-I) B(k-I)V
Ak-I In re atlve to 11 ,SInce I = 12 = 11 k-I and y(k-I) I =
Bi~-l) = Vk_IB~~-l), and since Ak = A, it is evident upon setting T = Ak-I,
V = Uk-I, V = Vk-I, and W = Wk-I in formula (6.2a) of Theorem 9.6.1 [or
equivalently in formula (*) of Exercise (7) or (8)] that B~~) is a generalized inverse
of Ak.
o W0)
(T = A = AGA = (TG ll
WG21
Thus, TGII T = T (i.e., Gil is a generalized inverse of T), WG22 W = W (i.e.,
G22 is a generalized inverse of W), TG12 W = 0, and WG21 T = O.
I
( - VT- 0) (T
I V
V) (I0
W
-T-V) = (T
I 0
0)
Q
and of then using the result cited in the hint for Part (b) of Exercise 7, along with
the result of Exercise 10.
Solution. Observing (in light of Lemma 9.3.5) that V - VT-T = 0 and that
V - TT-V = 0, we find that
-T-V) _ (T
I - 0
V)
Q 0
(I
= (! ~).
Moreover, recalling Lemma 8.5.2, it follows from the result cited in the hint for
Part (b) of Exercise 7, or (equivalently) from Part (3) of Lemma 9.2.4, that the
46 9. Generalized Inverses
matrix
G = (Gil G12).
G21 G22
9. Generalized Inverses 47
where GIl = T-, G22 = 0, and GI2 and G2I are arbitrary. Then, clearly G is a
generalized inverse of A. Further,
Suppose, for example, that G12, GZi, and G2I are chosen so that the (I, I)th ele-
ment of GI2 G 22 G2I is nonzero (which - since any n x q matrix is a generalized in-
verseofG22 -is clearly possible). Then, the (1, I)th elementofTGI2G22G2I T is
nonzero and hence TG12G22G2I Tis nonnull. We conclude that Gil -G12G22G2I
is not a generalized inverse of T.
10
Idempotent Matrices
EXERCISE 3. Show that, for any symmetric idempotent matrix A, the matrix
I - 2A is orthogonal.
Solution. Clearly,
(I-2A)'(1-2A) = (1-2A)(I-2A) = 1-2A-2A+4A2 = I-2A-2A+4A = I.
50 10. Idempotent Matrices
AA'A = AI = A
and hence that
(AA')2 = (AA' A)A' = AA'.
EXERCISE 5. Let A represent a symmetric matrix and k an integer greater than
or equal to 1. Show that if Ak+ I = Ak, then A is idempotent.
Solution. It suffices to show that, for every integer m between k and 2 inclusive,
A m+ 1 = Am implies Am = Am-I. (If Ak + 1 = Ak butA 2 were not equal to A, then
there would exist an integer m between k and 2 inclusive such that Am+ I = Am
but Am "I Am-I.)
Suppose that Am+ 1 = Am. Then, since A is symmetric,
A'AA m- 1 = A'AAm- 2
(where A0 = I), and it follows from Corollary 5.3.3 that
AA m- 1 = AA m- 2
or equivalently that
Solution. Let A = (~ ~). and define G as in Part (a) of Exercise 9.8. Sup-
pose that Conditions (1) - (3) are satisfied. Then, in light of Exercise 9.8, G is
52 10. Idempotent Matrices
a generalized inverse of A, and, making use of the result that (for any matrix B)
rank(B) = tr (BB - ) [which is part of result (2.1)], we find that
(E.l)
where
T- - T-U(I - Q*Q)X-ET
- FTY-(I - QQ*)VT- FTY-(I - QQ*)
- FTY-(I - QQ*)QX-ET
and
10. Idempotent Matrices 53
is a generalized inverse of A.
(b) (Meyer 1973, Theorem 4.1) Show that
[Hint. Use Part (a), together with the result that (for any matrix B) rank(B) =
tr (B-B) = tr (BB-).]
(c) Show that if C(U) c C(T) and R(V) c R(T), then formula (E.2) for
rank(A) reduces to the formula
which is reexpressible as
(b) Taking G to be the generalized inverse (E. 1), it is easy to show that
Thus, making use of the result that (for any matrix B) rank (B) = tr (BB-) [which
is part of result (2.1)], we find that
rank(A) = tr (AG)
= tr (TT-) + tr (XX-ET) + tr (YY-) + tr (EyQQ*)
= rank(T) + tr (XX-ET) + rank(Y) + tr (EyQQ*).
Moreover,
and similarly
Solution. Letting x represent a column vector (whose dimension equals the number
of rows in A), it follows from Corollary 9.3.6 that x E C(A) if and only if x =
AA -x, or equivalently if and only if (I - AA -)x = 0, and hence if and only if
x E N(I - AA -). We conclude that C(A) = N(I - AA -).
AZ = (Ax)k' = Ok' = O.
(S.l)
Consequently,
11. Linear Systems: Solutions 57
(S.2)
and hence [in light of equality (S.l)] that L:f=l kiZi = O. Since ZI,"" Zs are
linearly independent, we have that kl = ... = ks = 0, which, together with
equality (S.2), further implies that ko = 0. We conclude that Xo, Xl, ... , Xs are
linearly independent.
(b) Let X* represent any solution to AX = B. Then, according to Theorem
11.2.3, X* = Xo + Z* for some solution Z* to AZ = O. Since ZI, ... , Zs form
a basis for the solution space of AZ = 0, Z* = L:f=1 kiZi for some scalars
ki' ... , k s . Thus,
X* s
= Xo + {;kiZi =(
1-s ) Xo + (;kiXi'
{;ki s
so that X* is a solution to AX = B.
To complete the proof, it suffices to show that X* is not expressible as X* = GB
for any generalized inverse G of A. Assume the contrary, that is, assume that
X* = GB for some generalized inverse G of A. Then, since
we have that
yj = X*kl = GBkl = 0,
which (since, by definition, yj is nonnull) establishes a contradiction.
implying that
n - rank(A) < n - rank(CA). (S.3)
Let Xo represent any particular solution to AX = B. According to Theorem
11.2.3, the solution set of AX = B is comprised of every n x p matrix X* that is
expressible as
X* = Xo +Z*
for some solution Z* to the homogeneous linear system AZ = 0 (in an n x p
matrix Z). Similarly, since Xo is also a solution to CAX = CB, the solution set of
CAX = CB is comprised of every matrix X* that is expressible as
X* = Xo+Z*
e
Alternatively, suppose that
consistent. Show that the value of AX is the same for every solution to CAX = B
if and only if rank:(CA) = rank(A).
Solution. It suffices (in light of Theorem ILl 0.1) to show that rank(CA) =
rank(A) if and only if R(A) c R(CA) or equivalently [since R(CA c R(A)]
if and only if R(A) = R(CA). If R(A) = R(CA), then it follows from the
very definition of the rank of a matrix that rank(CA) = rank(A). Conversely, if
rank(CA) = rank:(A), then it follows from Corollary 4.4.7 that R(A) = R(CA).
(in X and L), then X* is a solution to the linear system AX = B (in X), and
L * = K'X*. and, conversely, if X* is any solution to AX = B, then X* and K'X*
are the first and second parts, respectively, of some solution to linear system (*);
and (2) (restricting attention to the special case where m = n) that If X* and L *
are the first and second parts, respectively, of any solution to the linear system
(in X and L), then X* is a solution to the linear system AX = B (in X) and
L * = K'X*, and, conversely, if X* is any solution to AX = B, then X* and K'X*
are the first and second parts, respectively, of some solution to linear system (**).
Solution. (1) Suppose that X* and L* are the first and second parts, respectively,
of any solution to linear system (*). Then, clearly,

−K'X* + L* = 0,

so that X* is a solution to AX = B and L* = K'X*. Conversely, if X* is any
solution to AX = B, then clearly X* and K'X* are the first and second parts,
respectively, of some solution to linear system (*).
(2) Suppose that X* and L * are the first and second parts, respectively, of any
solution to linear system (**). Then, clearly
-K'X* + L* = 0,
so that X* is a solution to AX = B.
Conversely, suppose that X* is a solution to AX = B. Then, clearly,
( A + KK'   −K ) ( X*   )   ( (A + KK')X* − KK'X* )   ( AX* )   ( B )
(   −K'      I ) ( K'X* ) = (    −K'X* + K'X*     ) = (  0  ) = ( 0 ),
so that X* and K'X* are the first and second parts, respectively, of some solution
to linear system (**).
12
Projections and Projection Matrices
Thus, U ⊥ W.
Solution. Let r = dim(U) and s = dim(V). And, let {A_1, ..., A_r} and {B_1, ...,
B_s} represent bases for U and V, respectively. Further, define H = {h_ij} to be the
r x s matrix whose ijth element equals A_i ∘ B_j.
Now, suppose that s > r. Then, since rank(H) ≤ r < s, there exists an s x 1
nonnull vector x = {x_j} such that Hx = 0.
Let C = x_1B_1 + ⋯ + x_sB_s. Then, C is nonnull. Moreover, for i = 1, ..., r,

A_i ∘ C = ∑_j x_j (A_i ∘ B_j) = ∑_j h_ij x_j.

Since ∑_j h_ij x_j is the ith element of the vector Hx, ∑_j h_ij x_j = 0, and hence
A_i ∘ C = 0 (i = 1, ..., r). Thus, C is orthogonal to each of the matrices
A_1, ..., A_r. We conclude on the basis of Lemma 12.1.1 (or equivalently the result
of Exercise 1) that C is orthogonal to U.
Solution. Let y_i represent the ith column of Y, and take v_i to be the projection
(with respect to the usual inner product) of y_i on U (i = 1, ..., n). Define V =
(v_1, ..., v_n). Then, by definition, (y_i − v_i)'w = 0 for every vector w in U, so that,
for every matrix W = (w_1, ..., w_n) in M,

tr[(Y − V)'W] = ∑_{i=1}^n (y_i − v_i)'w_i = 0,

implying that Z = V.
Now, suppose that B* is a solution to X'XB = X'Y. Then, for i = 1, ..., n, the
ith column b_i* of B* is clearly a solution to the linear system X'Xb_i = X'y_i (in
was determined to be the vector (3, 22/5, 44/5)', and it was observed that x_1 and
x_2 are linearly independent and that x_3 = x_2 − (1/3)x_1, with the consequence that
dim(U) = 2. Recompute the projection of y on U (in this special case) by taking
X to be the 3 x 2 matrix
and carrying out the following two steps: (1) compute the solution to the normal
equations X'Xb = X'y; and (2) postmultiply X by the solution you computed in
Step (1).
z = Xb* = (3, 22/5, 44/5)'.
EXERCISE 5. Let X represent any n x p matrix. If a p x n matrix B* is a solution
to the linear system X'XB = X' (in B), then B* is a generalized inverse of X and
XB* is symmetric. Show that, conversely, if a p x n matrix G is a generalized
inverse of X and if XG is symmetric, then X'XG = X' (i.e., G is a solution to
X'XB = X').
EXERCISE 6. Using the result of Part (b) of Exercise 9.3 (or otherwise), show
that, for any nonnull symmetric matrix A,
where B is any matrix of full column rank and T any matrix of full row rank such
that A = BT. (That TB is nonsingular follows from the result of Exercise 8.3.)
Solution. Let L represent a left inverse of B and R a right inverse of T. Then,
according to Part (b) of Exercise 9.3, the matrix R(TB)⁻¹L is a generalized inverse
of A² or equivalently (since A is symmetric) of A'A. Thus,
B_1 = A_1,
B_2 = A_2 − x_12B_1,
Moreover, upon observing that the set {C_1, ..., C_{j−1}} is orthonormal and applying
result (1.1), we find that ∑_{i=1}^{j−1}(A_j · C_i)C_i is the projection of A_j on W_{j−1}.
Thus, it follows from Theorem 12.5.8 that B_j is the projection of A_j on W_{j−1}^⊥.
And, since (in light of the discussion of Section 6.4b) W_{j−1} = sp(A_1, ..., A_{j−1}),
we conclude that B_j is the projection of A_j on the orthogonal complement of the
subspace (of V) spanned by A_1, ..., A_{j−1}.
13
Determinants
EXERCISE 1. Let
(a) Write out all of the pairs that can be formed from the four boxed elements
of A.
(b) Indicate which of the pairs from Part (a) are positive and which are negative.
(c) Use the formula

σ_n(1, i_1; ...; n, i_n) = σ_n(i_1, 1; ...; i_n, n) = φ_n(i_1, ..., i_n)

(in which i_1, ..., i_n represents an arbitrary permutation of the first n positive
integers) to compute the number of pairs from Part (a) that are negative, and check
that the result of this computation is consistent with your answer to Part (b).
Solution. (a) and (b)

Pair            "Sign"
a_14, a_21        −
a_14, a_33        −
a_14, a_42        −
a_21, a_33        +
a_21, a_42        +
a_33, a_42        −
"Recall" that
lSI = IRI
for any n x n matrix R and for any matrix S formed from R by adding to anyone
of its rows or columns, scalar mUltiples of one or more other rows or columns; and
use this result to show that
(Hint. Add the last n - 1 columns of A to the first column, and then subtract the
first row of the resultant matrix from each of the last n - 1 rows).
Solution. The matrix obtained from A by adding the last n − 1 columns of A to
the first column is

B = ( nx+λ    x    x   ⋯    x
      nx+λ   x+λ   x   ⋯    x
      nx+λ    x   x+λ  ⋯    x
        ⋮     ⋮    ⋮   ⋱    ⋮
      nx+λ    x    x   ⋯   x+λ ).

The matrix obtained from B by subtracting the first row of B from each of the
next i rows is

C_i = ( nx+λ   x   ⋯   x    x   ⋯    x
          0    λ   ⋯   0    0   ⋯    0
          ⋮        ⋱        ⋮        ⋮
          0    0   ⋯   λ    0   ⋯    0
        nx+λ   x   ⋯   x   x+λ  ⋯    x
          ⋮    ⋮       ⋮        ⋱    ⋮
        nx+λ   x   ⋯   x    x   ⋯   x+λ ),

where the i rows following the first each contain λ as their only nonzero element
and the last n − 1 − i rows coincide with the corresponding rows of B.
A = ( 0   4   0   5
      1   0  −1   0
      0   3   0  −2
      0   0  −6   0 ).
Do so in each of the following two ways:
(a) by finding and summing the nonzero terms in the expression
(where i_1, ..., i_n or j_1, ..., j_n is a permutation of the first n positive integers and
the summation is over all such permutations);
(b) by repeated expansion in terms of cofactors: use the (general) formula

|A| = ∑_{j=1}^n a_ij α_ij   or   |A| = ∑_{i=1}^n a_ij α_ij

(where i or j, respectively, is any integer between 1 and n inclusive and where α_ij is
the cofactor of a_ij) to expand |A| (in the special case) in terms of the determinants
(b)

|A| = (1)(−1)^{2+1} |4 0 5; 3 0 −2; 0 −6 0|
    = (−1)³(−6)(−1)^{3+2} |4 5; 3 −2|
    = (−1)³(−6)(−1)⁵[4(−1)^{1+1}(−2) + 5(−1)^{1+2}(3)]
    = (−6)(−8 − 15)
    = 138.
Suppose now that A is singular but nonnull, in which case A contains a nonnull
row, say the ith row a_i'. Since A is singular, |A| = 0, and it follows from Theorem
13.5.3 that A adj(A) = 0 and hence that

a_i' adj(A) = 0,

implying (since a_i' is nonnull) that the rows of adj(A) are linearly dependent. We
conclude that adj(A) is singular.
(b) Making use of Theorems 13.3.4 and 13.5.3, Corollary 13.2.4, and result
(1.9), we find that
Alternatively, if A is singular, then it follows from Part (a) that adj(A) is singular
and hence that
|adj(A)| = 0 = |A|^{n−1}.
Use formula (*) to verify that, for any 2 x 2 nonsingular matrix

A = ( a_11  a_12
      a_21  a_22 ),

Solution. Let a_ij represent the ijth element of a 2 x 2 matrix A, and let α_ij
represent the cofactor of a_ij (i, j = 1, 2). Then, as a special case of formula (*)
[or equivalently formula (5.7)], we have that
Moreover, it follows from the very definition of a cofactor and from formulas (1.3)
and (1.4) that
Upon substituting these expressions in formula (S.2), we obtain formula (**) [or
equivalently formula (8.1.2)].
EXERCISE 9. Let

A = (  2   0  −1
      −1   3   1
       0  −4   5 ).

α_11 = (−1)^{1+1} |3 1; −4 5| = 19,    α_12 = (−1)^{1+2} |−1 1; 0 5| = 5,
α_13 = (−1)^{1+3} |−1 3; 0 −4| = 4,    α_21 = (−1)^{2+1} |0 −1; −4 5| = 4,    ...,
α_33 = (−1)^{3+3} |2 0; −1 3| = 6.
(b) Expanding |A| in terms of the cofactors of the elements of the second row
of A gives

|A| = (−1)(4) + 3(10) + 1(8) = 34.

Expanding |A| in terms of the cofactors of the elements of the second column of
A gives

|A| = 0(5) + 3(10) + (−4)(−1) = 34.
(c) Substituting from Parts (a) and (b) in formula (*) of Exercise 8 [or equiva-
lently in formula (5.7)], we find that

A⁻¹ = (1/34) ( 19   4   3
                5  10  −1
                4   8   6 ).
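The computed inverse is easily confirmed numerically; a short Python sketch with numpy (such a check is not part of the original solution) verifies both the value of |A| and the inverse:

    import numpy as np

    A = np.array([[2.0, 0.0, -1.0],
                  [-1.0, 3.0, 1.0],
                  [0.0, -4.0, 5.0]])
    print(np.isclose(np.linalg.det(A), 34.0))       # True
    Ainv = np.array([[19.0, 4.0, 3.0],
                     [5.0, 10.0, -1.0],
                     [4.0, 8.0, 6.0]]) / 34.0
    print(np.allclose(np.linalg.inv(A), Ainv))      # True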
EXERCISE 10. Let A = {a_ij} represent an n x n matrix (where n ≥ 2), and let
α_ij represent the cofactor of a_ij.
(a) Show [by, for instance, making use of the result of Part (b) of Exercise 11.3]
that if rank(A) = n − 1, then there exists a scalar c such that adj(A) = cxy',
where x = {x_j} and y = {y_i} are any nonnull n-dimensional column vectors such
that Ax = 0 and A'y = 0. Show also that c is nonzero and is expressible as
c = α_ij/(y_i x_j) for any i and j such that y_i ≠ 0 and x_j ≠ 0.
(b) Show that if rank(A) ≤ n − 2, then adj(A) = 0.
Solution. (a) Suppose that rank(A) = n − 1. Then, det(A) = 0 and hence (accord-
ing to Theorem 13.5.3) A adj(A) = adj(A)A = 0. Thus, it follows from the result
of Part (b) of Exercise 11.3 that there exists a scalar c such that adj(A) = cxy' [or
equivalently such that (adj A)' = cyx'] and hence such that (for all i and j)
(S.3)
Clearly, the cofactor of the ijth element of A_j is the same as the cofactor of the
ijth element of A (i = 1, ..., n), so that, according to Theorem 13.5.1, the jth
element of A⁻¹b is |A_j|/|A|.
EXERCISE 12. Let c represent a scalar, let x and y represent n x 1 vectors, and
let A represent an n x n matrix.
(b) Show that, in the special case where A is nonsingular, result (E.1) can be
reexpressed as

|A y; x' c| = |A|(c − x'A⁻¹y),

in agreement with the more general result that, for any n x n nonsingular matrix
T, n x m matrix U, m x n matrix V, and m x m matrix W,
Solution. (a) Denote by x_i the ith element of x, and by y_i the ith element of y.
Let A_j represent the n x (n − 1) submatrix of A obtained by striking out the jth
column, let A_ij represent the (n − 1) x (n − 1) submatrix of A obtained by striking
out the ith row and the jth column, and let α_ij represent the cofactor of the ijth
element of A.
Expanding |A y; x' c| in terms of the cofactors of the last row of ( A  y
                                                                    x'  c ), we
obtain

|A y; x' c| = ∑_{i,j} y_i x_j (−1)^{2n+1+i+j} |A_ij| + c|A|
            = c|A| − ∑_{i,j} y_i x_j (−1)^{i+j} |A_ij|
            = c|A| − ∑_{i,j} y_i x_j α_ij
            = c|A| − x' adj(A) y.
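The identity |A y; x' c| = c|A| − x' adj(A)y is readily checked numerically; in the following Python sketch (numpy; an illustration, not part of the original solution), adj(A) is computed as det(A)·A⁻¹, which is valid for the nonsingular test matrix used here:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 4
    A = rng.standard_normal((n, n))
    x = rng.standard_normal(n)
    y = rng.standard_normal(n)
    c = rng.standard_normal()
    adjA = np.linalg.det(A) * np.linalg.inv(A)   # adj(A) for nonsingular A
    M = np.block([[A, y[:, None]],
                  [x[None, :], np.array([[c]])]])
    print(np.isclose(np.linalg.det(M),
                     c * np.linalg.det(A) - x @ adjA @ y))   # True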
V = ( 1   x_1   x_1²   ⋯   x_1^{n−1}
      1   x_2   x_2²   ⋯   x_2^{n−1}
      ⋮    ⋮     ⋮           ⋮
      1   x_n   x_n²   ⋯   x_n^{n−1} )

(where x_1, x_2, ..., x_n are arbitrary scalars) obtained by striking out the kth row
and the nth (last) column (of V). Show that
Solution. Let V* represent the n x n matrix whose first, ..., (k − 1)th rows are
respectively the first, ..., (k − 1)th rows of V, whose kth, ..., (n − 1)th rows
are respectively the (k + 1)th, ..., nth rows of V, and whose nth row is the kth
row of V. Then, V* (like V) is an n x n Vandermonde matrix, and V_k equals
the (n − 1) x (n − 1) submatrix of V* obtained by striking out the last row
and the last column (of V*). Moreover, V can be obtained from V* by n − k
successive interchanges of pairs of rows; specifically, V can be obtained from
V* by successively interchanging the nth row of V* with the (n − 1)th, ..., kth
rows of V*. Thus, making use of Theorem 13.2.6 and of result (6.4), we find that
adj(AB) = adj(B) adj(A).
(Hint. Use the Binet-Cauchy formula to establish that the ijth element of adj(AB)
equals the ijth element of adj(B) adj(A).)
Solution. Let A_j represent the (n − 1) x n submatrix of A obtained by striking out
the jth row of A, and let B_i represent the n x (n − 1) submatrix of B obtained by
striking out the ith column of B. Further, let A_js represent the (n − 1) x (n − 1)
submatrix of A obtained by striking out the jth row and the sth column of A,
and let B_si represent the (n − 1) x (n − 1) submatrix of B obtained by striking
out the sth row and the ith column of B. Then, application of formula (8.3) (the
Binet-Cauchy formula) gives

|A_jB_i| = ∑_{s=1}^n |A_js||B_si|,
implying that

(−1)^{i+j}|A_jB_i| = ∑_{s=1}^n (−1)^{s+i}|B_si| (−1)^{j+s}|A_js|.    (S.6)
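For nonsingular test matrices, the conclusion adj(AB) = adj(B) adj(A) can be checked numerically; here is a minimal sketch in Python with numpy (not part of the original solution), using adj(M) = det(M)·M⁻¹:

    import numpy as np

    def adj(M):
        # adjugate of a nonsingular matrix, via adj(M) = det(M) * inv(M)
        return np.linalg.det(M) * np.linalg.inv(M)

    rng = np.random.default_rng(2)
    A = rng.standard_normal((4, 4))
    B = rng.standard_normal((4, 4))
    print(np.allclose(adj(A @ B), adj(B) @ adj(A)))   # True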
EXERCISE 2. Show that corresponding to any quadratic form x' Ax (in the n-
dimensional vector x) there exists a unique upper triangular matrix B such that
x' Ax and x'Bx are identically equal, and express the elements of B in terms of the
elements of A.
Solution. Let a_ij represent the ijth element of A (i, j = 1, ..., n). When B = {b_ij}
is upper triangular, the conditions a_ii = b_ii and a_ij + a_ji = b_ij + b_ji (j ≠
i = 1, ..., n) of Lemma 14.1.1 are equivalent to the conditions a_ii = b_ii and
a_ij + a_ji = b_ij (j > i = 1, ..., n). Thus, it follows from the lemma that there
exists a unique upper triangular matrix B such that x'Ax and x'Bx are identically
equal, namely, the upper triangular matrix B = {b_ij}, where b_ii = a_ii and b_ij =
a_ij + a_ji (j > i = 1, ..., n).
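The construction of B is easy to carry out numerically; the following Python sketch (numpy; the 4 x 4 matrix A is an arbitrary illustration, not part of the original solution) builds B and confirms that the two quadratic forms agree:

    import numpy as np

    rng = np.random.default_rng(3)
    A = rng.standard_normal((4, 4))
    # b_ii = a_ii, and b_ij = a_ij + a_ji for j > i (zero below the diagonal)
    B = np.triu(A + A.T) - np.diag(np.diag(A))
    x = rng.standard_normal(4)
    print(np.isclose(x @ A @ x, x @ B @ x))   # True: x'Ax = x'Bx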
Solution. Consider the two n x n matrices ( 1  0'
                                            0  0 ) and ( 0    0'
                                                         0  I_{n−1} ). Clearly, both
of these two matrices are positive semidefinite; however, their sum is the n x n
identity matrix I_n, which is positive definite.
A = ( 1  2  0  ⋯  0
      0  1  0  ⋯  0
      0  0  0  ⋯  0
      ⋮  ⋮  ⋮      ⋮
      0  0  0  ⋯  0 ).

For an arbitrary n-dimensional vector x = (x_1, x_2, x_3, ..., x_n)', we find that

x'Ax = x_1² + 2x_1x_2 + x_2² = (x_1 + x_2)² ≥ 0.
Solution. As in results (1)-(3) (of the exercise or equivalently of Theorem 14.2.9),
let A represent an n x n matrix and P an n x m matrix. Upon applying results (1)-(3)
with −A in place of A, we find that (1') if −A is nonnegative definite, then −P'AP
is nonnegative definite; (2') if −A is nonnegative definite and rank(P) < m, then
−P'AP is positive semidefinite; and (3') if −A is positive definite and rank(P) = m,
then −P'AP is positive definite. These three results can be restated as follows:
= ∑_{k=1}^n b_ki ∑_{s=1}^n b_sj (y_s · y_k)

= ∑_{k=1}^n b_ki b_kj.
Moreover, ∑_{k=1}^n b_ki b_kj is the ijth element of the r x r matrix B'B, where B is
the n x r matrix whose kjth element is b_kj (and hence where B' is the r x n
matrix whose ikth element is b_ki). Thus, A = B'B, and since B'B is symmetric
(and in light of Corollary 14.2.14) nonnegative definite, the solution of Part (a) is
complete.
Now, consider Part (b). For j = 1, ..., r, let b_j = (b_1j, ..., b_nj)'. Then,
since clearly y_1, ..., y_n are linearly independent, it follows from Lemma 3.2.4
that X_1, ..., X_r are linearly independent if and only if b_1, ..., b_r are linearly
independent. Thus, since b_1, ..., b_r are the columns of B, X_1, ..., X_r are linearly
independent if and only if rank(B) = r or equivalently (in light of Corollary 7.4.5)
if and only if rank(B'B) = r. And, since A = B'B, we conclude that X_1, ..., X_r
are linearly independent if and only if A is nonsingular.
with equality holding if and only if, for all j ≠ i, a_ij = 0.
Solution. Let U = (u_1, U_2), where u_1 is the ith column of I_n and U_2 is the
submatrix of I_n obtained by striking out the ith column, and observe that U is a
permutation matrix.
Define R = U'AU and S = R⁻¹. Partition R and S as
and

r' = u_1'AU_2 = (a_i1, a_i2, ..., a_{i,i−1}, a_{i,i+1}, ..., a_{i,n−1}, a_in),    (S.2)
It follows from Corollary 14.2.10 that R is positive definite, implying (in light
of Corollary 14.2.12) that R* is positive definite and hence (in light of Corollary
14.2.11) that R* is invertible and that R*⁻¹ is positive definite. Thus, making use
of Theorem 8.5.11, we find [in light of results (S.1) and (S.3)] that

b_ii = (a_ii − r'R*⁻¹r)⁻¹

and also that r'R*⁻¹r ≥ 0, with equality holding if and only if r = 0. Since b_ii > 0
(and hence a_ii − r'R*⁻¹r > 0), we conclude that b_ii ≥ 1/a_ii, with equality holding
if and only if r = 0 or equivalently [in light of result (S.2)] if and only if, for all
j ≠ i, a_ij = 0.
EXERCISE 11. Show that if an n x n matrix A is such that x'Ax ≠ 0 for every
n x 1 nonnull vector x, then A is either positive definite or negative definite.
Solution. Let A represent an n x n matrix such that x'Ax ≠ 0 for every n x 1
nonnull vector x.
Define B = (1/2)(A + A'). Then, x'Bx = x'Ax for every n x 1 vector x.
Moreover, B is symmetric, implying (in light of Corollary 14.3.5) that there exists
a nonsingular matrix P and a diagonal matrix D = diag(d_1, ..., d_n) such that B =
P'DP. Thus, (Px)'DPx = x'Ax for every n x 1 vector x and hence (Px)'DPx ≠ 0
for every n x 1 nonnull vector x.
There exists no i such that d_i = 0 [since, if d_i = 0, then, taking x to be
the nonnull vector P⁻¹e_i, where e_i is the ith column of I_n, we would have that
(Px)'DPx = e_i'De_i = d_i = 0]. Moreover, there exist no i and j such that d_i > 0
and d_j < 0 [since, if d_i > 0 and d_j < 0, then, taking x to be the (nonnull)
vector P⁻¹y, where y is the n x 1 vector with ith element 1/√d_i, jth element
1/√(−d_j), and all other elements zero, we would have that (Px)'DPx = y'Dy = 1 − 1 = 0].
It follows that the n scalars d_1, ..., d_n are either all positive, in which case B
is (according to Corollary 14.2.15) positive definite, or all negative, in which case
−B is positive definite and hence B is negative definite. We conclude (on the basis
of Corollary 14.2.7) that A is either positive definite or negative definite.
A if and only if B has the same rank and the same index of inertia as A. This result
is called Sylvester's law of inertia, after James Joseph Sylvester (1814-1897).
(d) Let A represent an n x n symmetric matrix of rank r and with index of inertia
m. Show that A is nonnegative definite if and only if m = r and is positive definite
if and only if m = r = n.
for j = 1, 2, ..., m_1, and similarly take k_1, k_2, ..., k_n to be a permutation such that
d_{k_j}^{(2)} > 0 for j = 1, 2, ..., m_2. Further, take U_1 to be the n x n permutation matrix whose
first, second, ..., nth columns are respectively the i_1th, i_2th, ..., i_nth columns
of I_n and U_2 to be the n x n permutation matrix whose first, second, ..., nth
columns are respectively the k_1th, k_2th, ..., k_nth columns of I_n, and define D_1* =
U_1'D_1U_1 and D_2* = U_2'D_2U_2. Then, D_1* = diag(d_{i_1}^{(1)}, d_{i_2}^{(1)}, ..., d_{i_n}^{(1)}) and D_2* =
diag(d_{k_1}^{(2)}, d_{k_2}^{(2)}, ..., d_{k_n}^{(2)}).
Suppose, for purposes of establishing a contradiction, that m_1 < m_2, and observe
that

D_2* = U_2'(P_2⁻¹)'AP_2⁻¹U_2 = U_2'(P_2⁻¹)'P_1'D_1P_1P_2⁻¹U_2
     = U_2'(P_2⁻¹)'P_1'U_1D_1*U_1'P_1P_2⁻¹U_2 = R'D_1*R,

∑_{j=1}^{m_2} d_{k_j}^{(2)} x_j² = (x', 0') D_2* (x', 0')'
                                = (x', 0') R'D_1*R (x', 0')'
                                = (0', (R_21x)') D_1* (0', (R_21x)')'.    (S.4)
Moreover,

and, since the last n − m_1 diagonal elements of D_1* are either zero or negative,

∑_{j=m_1+1}^n d_{i_j}^{(1)} y_{j−m_1}² ≤ 0.
These two inequalities, in combination with equality (S.4), establish the sought-
after contradiction.
We conclude that m_1 ≥ m_2. It can be established, via an analogous argument,
that m_2 ≥ m_1. Together, those two inequalities imply that m_2 = m_1.
Consider now the number of negative diagonal elements in the diagonal matrix
D. According to Lemma 14.3.1, the number of nonzero diagonal elements in D
equals r. Thus, the number of negative diagonal elements in D equals r - m.
(b) According to Corollary 14.3.5, there exist an n x n nonsingular matrix P* and
an n x n diagonal matrix D = {d_i} such that A = P*'DP*. Take i_1, i_2, ..., i_n to be
any permutation of the first n positive integers such that, for some integers m and
r (0 ≤ m ≤ r ≤ n), d_{i_j} > 0 for j = 1, ..., m, d_{i_j} < 0 for j = m + 1, ..., r,
and d_{i_j} = 0 for j = r + 1, ..., n. Further, take U to be the n x n permutation
matrix whose first, second, ..., nth columns are respectively the i_1th, i_2th, ..., i_nth
columns of I_n, and define D* = U'DU. Then, D* = diag(d_{i_1}, d_{i_2}, ..., d_{i_n}).
We find that

A = P*'UU'DUU'P* = (U'P*)'D*U'P*.

And, taking Δ to be the diagonal matrix whose first m diagonal elements are
√d_{i_1}, ..., √d_{i_m}, whose (m + 1)th, (m + 2)th, ..., rth diagonal elements
are √(−d_{i_{m+1}}), √(−d_{i_{m+2}}), ..., √(−d_{i_r}), and whose last n − r diagonal elements equal
one, we have that
Conversely, suppose that A and B have the same rank, say r, and the same index
of inertia, say m. Then, according to Part (b), A = P' diag(I_m, −I_{r−m}, 0) P and
B = Q' diag(I_m, −I_{r−m}, 0) Q for some n x n nonsingular matrices P and Q. Thus,
and consequently

B = Q'(P⁻¹)'AP⁻¹Q = P*'AP*,

where P* = P⁻¹Q. Clearly, P* is nonsingular. We conclude that B is congruent
to A.
(d) According to Part (b),

A = P' diag(I_m, −I_{r−m}, 0) P
X = P_X X = P_B X.
(b) Since rank(B'B) = rank(B) = r, B'B (which is of dimensions r x r) is
invertible. Thus, it follows from Part (a) that
(S.5)

Q_1Q_1' = (B'B)⁻¹B'XX'B(B'B)⁻¹
        = (B'B)⁻¹B'AB(B'B)⁻¹
        = (B'B)⁻¹B'BB'B(B'B)⁻¹ = I,

so that the rows of the r x m matrix Q_1 are orthonormal (with respect to the usual
inner product).
It follows from Theorem 6.4.5 that there exists an (m − r) x m matrix Q_2 whose
rows, together with the rows of Q_1, form an orthonormal (with respect to the usual
inner product) basis for R^{m x 1}. Take Q = ( Q_1
                                               Q_2 ). Then, clearly, Q is orthogonal.
Further, (B, 0)Q = BQ_1, implying, in light of result (S.5), that X = (B, 0)Q.
Solution. Suppose that the symmetric matrix A has a nonnegative definite gener-
alized inverse, say G. Then, A = AGA = A'GA, implying (in light of Theorem
14.2.9) that A is nonnegative definite.
Solution. Making use of Theorem 13.2.11 and of Corollary 13.1.2, we find that
Solution. Let us restrict attention to Part (a); Part (b) can be proved in essentially
the same way as Part (a).
Suppose, for purposes of establishing a contradiction, that, for some i (1 ≤
i ≤ n − 1), d_i = 0. Take L* to be a unit lower triangular matrix and U* a unit upper
triangular matrix that are identical to L and U, respectively, except that, for some
j (j > i), the ijth element of U* differs from the ijth element of U and/or the jith
element of L* differs from the jith element of L. Then, according to Theorem
14.5.5, A = L*DU* is an LDU decomposition of A. Since this decomposition
differs from the supposedly unique LDU decomposition A = LDU, we have
arrived at the desired contradiction. We conclude that d_i ≠ 0 (i = 1, ..., n − 1).
And, since (in light of Lemma 14.3.1) A is nonsingular if and only if all n diagonal
elements of D are nonzero, we further conclude that A is nonsingular if and only
if d_n ≠ 0.
EXERCISE 19. Let A represent an n x n matrix (where n ≥ 2). By, for instance,
using the results of Exercises 16, 17, and 18, show that if A has a unique LDU
decomposition or (in the special case where A is symmetric) a unique U'DU de-
composition, then the leading principal submatrices (of A) of orders 1, 2, ..., n − 1
are nonsingular and have unique LDU decompositions.
Solution. In light of the result of Exercise 17, it suffices to restrict attention to the
case where A has a unique LDU decomposition, say A = LDU.
For i = 1, 2, ..., n − 1, let A_i, L_i, U_i, and D_i represent the ith-order leading
principal submatrices of A, L, U, and D, respectively. Then, according to Theorem
14.5.3, an LDU decomposition of A_i is A_i = L_iD_iU_i, and, according to the result
of Exercise 16, D_i is nonsingular. Thus, A_i is nonsingular and, in light of the result
of Exercise 18, has a unique LDU decomposition.
PAQ = ( B_11  B_12
        B_21  B_22 )
(b) Let B = ( B_11  B_12
              B_21  B_22 ) represent any m x n nonnull matrix of rank r such that
B_11 is an r x r nonsingular matrix whose leading principal submatrices (of orders
1, 2, ..., r − 1) are nonsingular. Show that there exists a unique decomposition of
B of the form

B = ( L_1
      L_2 ) D (U_1, U_2),
Solution. (a) The matrix A contains r linearly independent rows, say rows i_1,
i_2, ..., i_r. For k = 1, ..., r, denote by A_k the k x n matrix whose rows are
respectively rows i_1, i_2, ..., i_k of A.
There exists a subset j_1, j_2, ..., j_r of the first n positive integers such that,
for k = 1, ..., r, the matrix, say A_k*, whose columns are respectively columns
j_1, j_2, ..., j_k of A_k, is nonsingular. As evidence of this, let us outline a recursive
scheme for constructing such a subset.
Row i_1 of A is nonnull, so that j_1 can be chosen in such a way that a_{i_1 j_1} ≠ 0
and hence in such a way that A_1* = (a_{i_1 j_1}) is nonsingular. Suppose now that
j_1, j_2, ..., j_{k−1} have been chosen in such a way that A_1*, A_2*, ..., A_{k−1}* are non-
singular. Since A_{k−1}* is nonsingular, columns j_1, j_2, ..., j_{k−1} of A_k are linearly
independent, and, since rank(A_k) = k, A_k has a column that is not expressible as
a linear combination of columns j_1, j_2, ..., j_{k−1}. Thus, it follows from Corollary
3.2.3 that j_k can be chosen in such a way that A_k* is nonsingular.
Take P to be any m x m permutation matrix whose first r rows are respectively
rows i_1, i_2, ..., i_r of I_m, and take Q to be any n x n permutation matrix whose
first r columns are respectively columns j_1, j_2, ..., j_r of I_n. Then,
PAQ = ( B_11  B_12
        B_21  B_22 ),

B_11 = L_1DU_1,
B_12 = L_1DU_2,
B_21 = L_2DU_1, and
B_22 = L_2DU_2.
It follows from Corollary 14.5.7 that there exist a unique unit lower triangular
matrix L_1, a unique unit upper triangular matrix U_1, and a unique diagonal matrix
D such that B_11 = L_1DU_1; by definition, B_11 = L_1DU_1 is the unique LDU
decomposition of B_11. Moreover, D is nonsingular (since B_11 = L_1DU_1 is non-
singular). Thus, there exist unique matrices L_2 and U_2 such that B_21 = L_2DU_1
and B_12 = L_1DU_2, namely, L_2 = B_21U_1⁻¹D⁻¹ and U_2 = D⁻¹L_1⁻¹B_12. Finally,
it follows from Lemma 9.2.2 that
EXERCISE 21. Show, by example, that there exist n x n (nonsymmetric) positive
semidefinite matrices that do not have LDU decompositions.
Solution. Let

A = (    0      −1_{n−1}'
      1_{n−1}    I_{n−1}  ).

Then, for an arbitrary n x 1 vector x = (x_1, x_2')' (where x_2 has n − 1 elements),
x'Ax = x_2'x_2. Thus, x'Ax ≥ 0 for all x, with equality holding when, for example, x_1 = 1 and
x_2 = 0, so that x'Ax is a positive semidefinite quadratic form and hence A is a
positive semidefinite matrix. Moreover, since the leading principal submatrix of
A of order two is ( 0  −1
                    1   1 ), and since −1 ∉ C(0), it follows from Part (2) of
Theorem 14.5.4 that A does not have an LDU decomposition.
EXERCISE 23. Let A represent an m x k matrix of full column rank. And, let
A = QR represent the QR decomposition of A; that is, let Q represent the unique
m x k matrix whose columns are orthonormal with respect to the usual inner
product and let R represent the unique k x k upper triangular matrix with positive
diagonal elements such that A = QR. Show that A'A = R'R (so that A'A = R'R
is the Cholesky decomposition of A'A).
Solution. Since the inner product with respect to which the columns of Q are
orthonormal is the usual inner product, Q'Q = I_k, and consequently

A'A = (QR)'QR = R'Q'QR = R'R.
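A numerical illustration in Python with numpy (not part of the original solution): numpy's reduced QR factorization need not have positive diagonal elements in R, but the identity A'A = R'R is unaffected by such sign changes:

    import numpy as np

    rng = np.random.default_rng(4)
    A = rng.standard_normal((6, 3))        # full column rank with probability 1
    Q, R = np.linalg.qr(A)                 # reduced QR: Q is 6 x 3, R is 3 x 3
    print(np.allclose(A.T @ A, R.T @ R))   # True: A'A = R'R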
Solution. Suppose that the inner product with respect to which the columns of Q
are orthonormal is the usual inner product. Then, Q'Q = I_r. Thus, recalling result
(2.2.9), we find that
EXERCISE 25. Let A = {a_ij} represent an n x n matrix that has an LDU decom-
position, say A = LDU. And, define G = U⁻¹D⁻L⁻¹ (which is a generalized
inverse of A).
(a) Show that
(b) For i = 1, ..., n, let d_i represent the ith diagonal element of the diagonal
matrix D; and, for i, j = 1, ..., n, let ℓ_ij, u_ij, and g_ij represent the ijth elements
of L, U, and G, respectively. Take D⁻ = diag(d_1⁻, ..., d_n⁻), where d_i⁻ = 1/d_i, if
d_i ≠ 0, and d_i⁻ is an arbitrary scalar, if d_i = 0. Show that
g_ii = d_i⁻ − ∑_{k=i+1}^n u_ik g_ki = d_i⁻ − ∑_{k=i+1}^n g_ik ℓ_ki    (E.1)

and that

(where the degenerate sums ∑_{k=n+1}^n g_ik ℓ_ki and ∑_{k=n+1}^n u_ik g_ki are to be inter-
preted as 0).
(c) Devise a recursive procedure that uses the formulas from Part (b) to generate
a generalized inverse of A.
if j = i ,
if j > i,
(c) The formulas from Part (b) can be used to generate a generalized inverse of A
in n steps. During the first step, the nth diagonal element g_nn is generated from the
formula g_nn = d_n⁻, and then g_{n−1,n}, ..., g_{1n} and g_{n,n−1}, ..., g_{n1} (the off-diagonal
elements of the nth column and row of G) are generated recursively using formulas
(E.2b) and (E.2a), respectively. During the (n − s + 1)th step (s = n − 1, n − 2, ..., 2), the
sth diagonal element g_ss is generated from g_{s+1,s}, ..., g_ns or alternatively from
g_{s,s+1}, ..., g_sn using result (E.1), and then g_{s−1,s}, ..., g_{1s} and g_{s,s−1}, ..., g_{s1} are
generated recursively using formulas (E.2b) and (E.2a), respectively. During the
nth (and final) step, the first diagonal element g_11 is generated from the last n − 1
elements of the first column or row using result (E.1).
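The fact that G = U⁻¹D⁻L⁻¹ is a generalized inverse of A, which underlies the recursion, can be illustrated numerically; in this Python sketch (numpy; the particular L, D, and U are arbitrary, and 0 is used as one arbitrary choice of d_i⁻ where d_i = 0):

    import numpy as np

    rng = np.random.default_rng(5)
    n = 4
    L = np.tril(rng.standard_normal((n, n)), -1) + np.eye(n)  # unit lower triangular
    U = np.triu(rng.standard_normal((n, n)), 1) + np.eye(n)   # unit upper triangular
    d = np.array([2.0, 0.0, -1.5, 0.0])                       # D has some zero elements
    A = L @ np.diag(d) @ U
    d_minus = np.where(d != 0, 1.0 / np.where(d != 0, d, 1.0), 0.0)  # d_i^- = 1/d_i or 0
    G = np.linalg.inv(U) @ np.diag(d_minus) @ np.linalg.inv(L)
    print(np.allclose(A @ G @ A, A))   # True: G is a generalized inverse of A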
Since b_ji is the ijth element of B' and −b_ij the ijth element of −B, we conclude
that B' = −B.
EXERCISE 27. (a) Show that the sum of skew-symmetric matrices is skew-
symmetric.
(b) Show that the sum A_1 + A_2 + ⋯ + A_k of n x n nonnegative definite
matrices A_1, A_2, ..., A_k is skew-symmetric if and only if A_1, A_2, ..., A_k are
skew-symmetric.
(c) Show that the sum A_1 + A_2 + ⋯ + A_k of n x n symmetric nonnegative
definite matrices A_1, A_2, ..., A_k is a null matrix if and only if A_1, A_2, ..., A_k are
null matrices.
Solution. (a) Let A_1, A_2, ..., A_k represent n x n skew-symmetric matrices. Then,

(∑_i A_i)' = ∑_i A_i' = ∑_i (−A_i) = −∑_i A_i,

so that ∑_i A_i is skew-symmetric.
(b) If the nonnegative definite matrices A_1, A_2, ..., A_k are skew-symmetric,
then it follows from Part (a) that their sum ∑_i A_i is skew-symmetric.
Conversely, suppose that ∑_i A_i is skew-symmetric. Let d_ij represent the jth
diagonal element of A_i (i = 1, ..., k; j = 1, ..., n). Since (according to the
definition of skew-symmetry) the diagonal elements of ∑_i A_i equal zero, we have
that

d_1j + d_2j + ⋯ + d_kj = 0    (j = 1, ..., n).

Since A_1, A_2, ..., A_k are nonnegative definite, their diagonal elements are non-
negative, leading to the conclusion that d_1j, d_2j, ..., d_kj equal zero (j = 1, ..., n). Thus,
it follows from Lemma 14.6.4 that A_1, A_2, ..., A_k are skew-symmetric.
(c) Since (according to Lemma 14.6.1) the only n x n symmetric matrix that is
skew-symmetric is the n x n null matrix, Part (c) is a special case of Part (b).
EXERCISE 28. (a) Let A_1, A_2, ..., A_k represent n x n nonnegative definite
matrices. Show that tr(∑_{i=1}^k A_i) ≥ 0, with equality holding if and only if ∑_{i=1}^k A_i
is skew-symmetric or equivalently if and only if A_1, A_2, ..., A_k are skew-symmet-
ric. [Note. That ∑_{i=1}^k A_i being skew-symmetric is equivalent to A_1, A_2, ..., A_k
being skew-symmetric is the result of Part (b) of Exercise 27.]
(b) Let A_1, A_2, ..., A_k represent n x n symmetric nonnegative definite matrices.
Show that tr(∑_{i=1}^k A_i) ≥ 0, with equality holding if and only if ∑_{i=1}^k A_i = 0 or
equivalently if and only if A_1, A_2, ..., A_k are null matrices.
Solution. (a) According to Corollary 14.2.5, ∑_{i=1}^k A_i is nonnegative definite.
Thus, it follows from Theorem 14.7.2 that tr(∑_{i=1}^k A_i) ≥ 0, with equality holding
EXERCISE 29. Show, via an example, that (for n > 1) there exist n x n (non-
symmetric) positive definite matrices A and B such that tr(AB) < 0.
Solution. Take A = {a_ij} to be the n x n matrix such that

a_ij = 1,   for j = i,
     = 2,   for j = i + 1, ..., n,
     = −2,  for j = 1, ..., i − 1,

and take B = A. Then, (1/2)(A + A') = I_n, which is a positive definite matrix, implying (in light
of Corollary 14.2.7) that A (and likewise B) is positive definite. Moreover, all n diagonal elements of
AB equal 1 − 4(n − 1), which (for n > 1) is a negative number. Thus, tr(AB) < 0.
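For n = 3, the example can be verified numerically (a Python sketch with numpy, not part of the original solution):

    import numpy as np

    n = 3
    A = np.eye(n) + 2 * np.triu(np.ones((n, n)), 1) - 2 * np.tril(np.ones((n, n)), -1)
    print(np.allclose((A + A.T) / 2, np.eye(n)))   # True: the symmetric part is I_n
    print(np.trace(A @ A))                         # -21.0 = 3[1 - 4(3 - 1)] < 0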
EXERCISE 30. (a) Show, via an example, that (for n > 1) there exist n x n
symmetric positive definite matrices A and B such that the product AB has one
or more negative diagonal elements (and hence such that AB is not nonnegative
definite).
(b) Show, however, that the product of two n x n symmetric positive definite
matrices cannot be nonpositive definite.
Solution. (a) Take A = diag(A_11, I_{n−2}) and B = diag(B_11, I_{n−2}), where

A_11 = (  2  −1
         −1   2 )   and   B_11 = ( 12  3
                                    3  1 ),

and consider the quadratic forms x'Ax and x'Bx in the n-dimensional vector x =
(x_1, x_2, ..., x_n)'. We find that

x'Ax = x_1² + (x_1 − x_2)² + x_2² + x_3² + ⋯ + x_n²

and

x'Bx = 3x_1² + (3x_1 + x_2)² + x_3² + ⋯ + x_n²,

so that both quadratic forms are
positive definite, and hence, by definition, the matrices A and B of the quadratic
forms are positive definite.
Consider now the product AB. We find that AB = diag(A_11B_11, I_{n−2}) and that

A_11B_11 = ( 21   5
             −6  −1 ),

thereby revealing that the second diagonal element of AB
equals the negative number −1.
(b) Let A and B represent n x n symmetric positive definite matrices. Suppose,
for purposes of establishing a contradiction, that AB is nonpositive definite. Then,
by definition, −AB is nonnegative definite, implying (in light of Theorem 14.7.2)
that tr(−AB) ≥ 0 and hence that
Thus,

= ∑_{i,j} a_ij h_ji = tr(AH),
Clearly, the matrix H is symmetric nonnegative definite, implying (in light of The-
orem 14.7.6) that if A is nonnegative definite, then tr(AH) ≥ 0 and consequently
x'Cx ≥ 0.
Consider now the special case where B is symmetric positive definite. In this
special case, rank(F) = rank(B) = n, implying that the columns of F are linearly
independent and hence nonnull. Thus, unless x = 0, G is nonnull and hence H is
nonnull. It follows (in light of Theorem 14.7.4) that if A is positive definite, then,
unless x = 0, tr(AH) > 0 and consequently x'Cx > 0.
We conclude that if A is nonnegative definite and B is symmetric nonnegative
definite, then C is nonnegative definite and that if A is positive definite and B is
symmetric positive definite, then C is positive definite.
EXERCISE 32. Let A_1, A_2, ..., A_k and B_1, B_2, ..., B_k represent n x n sym-
metric nonnegative definite matrices. Show that tr(∑_{i=1}^k A_iB_i) ≥ 0, with equality
holding if and only if, for i = 1, 2, ..., k, A_iB_i = 0, thereby generalizing the
results of Part (b) of Exercise 28.
Solution. According to Corollary 14.7.7, tr(A_iB_i) ≥ 0 (i = 1, 2, ..., k). Thus,
with equality holding if and only if, for i = 1, 2, ..., k, tr(A_iB_i) = 0 or equiva-
lently (in light of Corollary 14.7.7) if and only if, for i = 1, 2, ..., k, A_iB_i = 0.
EXERCISE 33. Let A represent a symmetric nonnegative definite matrix that has
been partitioned as

A = ( T  U
      V  W ),

where T (and hence W) is square. Show that VT⁻U and UW⁻V are symmetric
and nonnegative definite.
Solution. According to Lemma 14.8.1, there exist matrices R and S such that
Thus, making use of Parts (6) and (3) of Theorem 12.3.4, we find that
dimensions n x m, for which C(U) ⊄ C(T) and/or R(V) ⊄ R(T), the expression
rank(T) + rank(W − VT⁻U) does not necessarily equal rank(A), and the formula

( T⁻ + T⁻UQ⁻VT⁻   −T⁻UQ⁻
      −Q⁻VT⁻          Q⁻   )
(b) Let A = {a_ij} represent an n x n diagonally dominant matrix, partition A as

A = ( A_11    a
       b'   a_nn )

[so that A_11 is of dimensions (n − 1) x (n − 1)], and let C = {c_ij} = A_11 − (1/a_nn)ab'.

≤ |c_ii| − |a_in| + ∑_{j=1}^{n−1} |a_in a_nj/a_nn|
diagonally dominant matrix. It suffices to show that A is nonsingular.
Partition A as

A = ( A_11    a
       b'   a_nn )

[so that A_11 is of dimensions (n − 1) x (n − 1)]. Since A is diagonally dominant,
a_nn ≠ 0. Let C = A_11 − (1/a_nn)ab'. It follows from Part (b) that C is diagonally
dominant, so that by supposition C [which is of dimensions (n − 1) x (n − 1)] is
nonsingular. Based on Theorem 8.5.11, we conclude that A is nonsingular.
(d) It follows from Part (a) that every principal submatrix of a diagonally dom-
inant matrix is diagonally dominant and hence, in light of Part (c), nonsingular.
We conclude, on the basis of Corollary 14.5.7, that a diagonally dominant matrix
has a unique LDU decomposition.
(e) The proof is by mathematical induction. Clearly, any 1 x 1 diagonally dom-
inant matrix with a positive (diagonal) element is positive definite. Suppose now
that any (n − 1) x (n − 1) symmetric diagonally dominant matrix with positive
diagonal elements is positive definite. Let A = {a_ij} represent an n x n symmetric
diagonally dominant matrix with positive diagonal elements. It suffices to show
that A is positive definite.
Partition A as

A = ( A_11    a
       a'   a_nn )

[so that A_11 is of dimensions (n − 1) x (n − 1)], and let C = A_11 − (1/a_nn)aa'
represent the Schur complement of a_nn. It follows from Part (b) that C is diagonally
dominant. Moreover, the ith diagonal element of C is
A = ( a_11  a_12
      a_12  a_22 )

A = ( A_*    a
      a'   a_nn )
[where A_* is of dimensions (n − 1) x (n − 1)]. Then, in light of the discussion of
Section 14.8a, it follows from Theorem 13.3.8 that

|A| = |A_*|(a_nn − a'A_*⁻¹a).    (S.6)

And, it follows from Corollary 14.8.6 and Lemma 14.9.1 that |A_*| > 0 and a_nn −
a'A_*⁻¹a > 0.
In the case where A_* is diagonal, we have (since A is not diagonal) that a ≠ 0,
implying (since A_*⁻¹ is positive definite) that a'A_*⁻¹a > 0 and hence that a_nn >
a_nn − a'A_*⁻¹a, so that [in light of result (S.6)]
|A| < |A_*|a_nn = a_nn ∏_{i=1}^{n−1} a_ii = ∏_{i=1}^n a_ii.
In the alternative case where A_* is not diagonal, we have that a'A_*⁻¹a ≥ 0, implying
that a_nn ≥ a_nn − a'A_*⁻¹a, and we have, by supposition, that |A_*| < ∏_{i=1}^{n−1} a_ii, so
that

|A| ≤ |A_*|a_nn < a_nn ∏_{i=1}^{n−1} a_ii = ∏_{i=1}^n a_ii.
(a) Show that A is positive definite if and only if a > 0, d > 0, and |b + c|/2 < √(ad).
(b) Show that, in the special case where A is symmetric (i.e., where c = b), A
is positive definite if and only if a > 0, d > 0, and |b| < √(ad).
Solution. (a) Let

B = (      a       (b + c)/2
      (b + c)/2        d     ).
Observe that
det(B) = ad − [(b + c)/2]²    (S.7)
and (in light of Corollary 14.2.7) that A is positive definite if and only if B is
positive definite.
Suppose that A is positive definite (and hence that B is positive definite). Then,
it follows from Corollary 14.2.13 that a > 0 and d > 0, and [in light of equality
(S.7)] it follows from Lemma 14.9.1 that ad − [(b + c)/2]² > 0, or equivalently
that [(b + c)/2]² < ad, and hence that |b + c|/2 < √(ad).
Conversely, suppose that a > 0, d > 0, and |b + c|/2 < √(ad), in which case
[(b + c)/2]² < ad or equivalently [in light of equality (S.7)] det(B) > 0.
Then, it follows from Theorem 14.9.5 that B is positive definite and hence that A
is positive definite.
(b) Part (b) follows from Part (a) upon observing that, in the special case where
c = b, the condition |b + c|/2 < √(ad) simplifies to the condition |b| < √(ad).
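The criterion of Part (a) is easy to test numerically; in the sketch below (Python with numpy; the test matrices are arbitrary illustrations, not drawn from the original exercise), positive definiteness of A is checked via the eigenvalues of its symmetric part B = (1/2)(A + A'):

    import numpy as np

    def is_pd(a, b, c, d):
        # A = [[a, b], [c, d]] is positive definite iff B = (A + A')/2 is
        B = np.array([[a, (b + c) / 2], [(b + c) / 2, d]])
        return bool(np.all(np.linalg.eigvalsh(B) > 0))

    # criterion: a > 0, d > 0, and |b + c|/2 < sqrt(ad)
    print(is_pd(2.0, 1.0, 2.0, 2.0), abs(1.0 + 2.0) / 2 < np.sqrt(2.0 * 2.0))  # True True
    print(is_pd(2.0, 3.0, 2.0, 2.0), abs(3.0 + 2.0) / 2 < np.sqrt(2.0 * 2.0))  # False False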
EXERCISE 39. By, for example, making use of the result of Exercise 38, show
that if an n x n matrix A = {a_ij} is symmetric positive definite, then, for j ≠ i =
1, ..., n,

|a_ij| < √(a_ii a_jj) ≤ max(a_ii, a_jj).
By, for example, expanding |A_*| in terms of the cofactors of the three elements of
the first row of A_*, we find that |A_*| = 1. Thus, the determinants of the leading
principal submatrices of A_* (of orders 1, 2, and 3) are 0, 0, and 1, respectively, all
of which are nonnegative; and A_* is nonsingular. However, A_* is not nonnegative
definite (since, e.g., one of its diagonal elements is negative).
Finally, for n ≥ 4, consider the n x n symmetric matrix

A = ( A_*     0
       0   I_{n−3} ).

Clearly, the leading principal submatrices of A of orders 1, 2, and 3 are the same
as those of A_*, so that their determinants are 0, 0, and 1, respectively. Moreover, it
follows from results (13.3.5) and (13.1.9) that the determinants of all of the leading
principal submatrices of A of order 4 or more equal |A_*| and hence equal 1. Thus,
the determinants of all n leading principal submatrices of A are nonnegative, and
A is nonsingular. However, A is not nonnegative definite (since, e.g., one of its
diagonal elements is negative).
(b) Show that g is an inner product (for V) if and only if there exists an r x r
symmetric positive definite matrix W such that (for all x and y in V)

x ∗ y = x'L'WLy.

(c) Show that g is an inner product (for V) if and only if there exists an n x n
symmetric positive definite matrix W such that (for all x and y in V)

x ∗ y = x'Wy.
Solution. (a) Suppose that, for some f, x ∗ y = (Lx) ∘ (Ly) (for all x and y in V).
Then,

(1) x ∗ y = (Lx) ∘ (Ly) = (Ly) ∘ (Lx) = y ∗ x;
s ∘ t = (Bs) ∗ (Bt).

We find that

(1) s ∘ t = (Bs) ∗ (Bt) = (Bt) ∗ (Bs) = t ∘ s;

(2) s ∘ s = (Bs) ∗ (Bs) ≥ 0, with equality holding if and only if Bs = 0 or
equivalently (since the columns of B are linearly independent) if and only if
s = 0;

(3) (ks) ∘ t = (kBs) ∗ (Bt) = k[(Bs) ∗ (Bt)] = k(s ∘ t);

(4) (s + t) ∘ u = (Bs + Bt) ∗ (Bu) = [(Bs) ∗ (Bu)] + [(Bt) ∗ (Bu)] = (s ∘ u) + (t ∘ u)

(where s, t, and u represent arbitrary vectors in R^{r x 1} and k represents an arbitrary
scalar). Thus, j is an inner product (for R^{r x 1}).
Now, set f = j. Then, letting x and y represent arbitrary vectors in V and
defining s and t to be the unique vectors that satisfy Bs = x and Bt = y (so that
s = I_r s = LBs = Lx and similarly t = Ly), we find that
(b) Let f represent an arbitrary inner product for R^{r x 1}, and denote by s ∘ t the
value assigned by f to an arbitrary pair of r-dimensional vectors s and t. According
to Part (a), g is an inner product (for V) if and only if there exists an f such that
(for all x and y in V) x ∗ y = (Lx) ∘ (Ly). Moreover, according to the discussion of
Section 14.10a, every inner product for R^{r x 1} is expressible as a bilinear form, and
a bilinear form (in r-dimensional vectors) qualifies as an inner product for R^{r x 1}
if and only if the matrix of the bilinear form is symmetric and positive definite.
Thus, g is an inner product (for V) if and only if there exists an r x r symmetric
positive definite matrix W such that (for all x and y in V) x ∗ y = (Lx)'WLy.
(c) Suppose that there exists an n x n symmetric positive definite matrix W such
that (for all x and y in V) x ∗ y = x'Wy. According to the discussion of Section
14.10a, the function that assigns the value x'Wy to an arbitrary pair of vectors x
and y in R^{n x 1} is an inner product for R^{n x 1}. Thus, it follows from the discussion
of Section 6.1b that g is an inner product (for V).
Conversely, suppose that g is an inner product. According to Theorem 4.3.12,
there exist n − r n-dimensional column vectors b_{r+1}, ..., b_n such that b_1, ..., b_r,
b_{r+1}, ..., b_n form a basis for R^{n x 1}. Let C = (b_{r+1}, b_{r+2}, ..., b_n), and define
F = (B, C).
Partition F⁻¹ as

MB = 0.
represent the value assigned by a quasi-inner product to any pair of matrices A and
B (in V). Show that the set

U = {A ∈ V : A ∘ A = 0},

which comprises every matrix in V with a zero quasi norm, is a linear space.
Solution. Let A and B represent arbitrary matrices in U, and let k represent an
arbitrary scalar.
Since (by definition) ‖A‖ = 0, it follows from the discussion in Section 14.10c
that A ∘ B = 0. Thus,
with equality holding if and only if QAP' = 0 or, equivalently, if and only if
A = 0.
Solution. (a) Suppose that rank(CAB) = rank(C). Then, it follows from Corollary
4.4.7 that C(CAB) = C(C) and hence that C = CABR for some matrix R. Thus,
Solution. (a) In light of Theorem 14.12.18, it suffices to show that this condition
is equivalent to the condition (I − P_{X,W})'VP_{X,W} = 0.
Suppose that V = P_{X,W}'VP_{X,W} + (I − P_{X,W})'V(I − P_{X,W}). Then,

(I − P_{X,W})'VP_{X,W}
    = [P_{X,W}(I − P_{X,W})]'VP_{X,W}² + [(I − P_{X,W})']²V(I − P_{X,W})P_{X,W}.
(b) Suppose that P_{X,W}y is the same for every y in R^n as P_{X,V}y. Then, Condition
(a) of the exercise is satisfied, so that

C(LZ) = V,

where Z is any n x q matrix whose columns span W. Use results (*) and (**) to
show that, for any n x n symmetric nonnegative definite matrix W, y ⊥_W U if and
only if X'Wy = 0.
WP_{U,W} = WP_{X,W},
Solution. (a) (1) According to Lemma 4.2.2, there exists a matrix F such that
U = XF. Thus, making use of Parts (1) and (5) of Theorem 14.12.25, we find that
(a) Show that A is a projection matrix for U with respect to W if and only if

A = P_{X,W} + (I − P_{X,W})XK

A = P_{X,W} + (I − P_{X,W})XK

Ay = P_{X,W}y + (I − P_{X,W})X(Ky).

Thus, it follows from Corollary 14.12.27 that A is a projection matrix for U with
respect to W.
Conversely, suppose that A is a projection matrix for U with respect to W. Then,
Ay ∈ U for every y in R^n, so that C(A) ⊂ U = C(X) and hence A = XF for
some matrix F. Moreover, it follows from Parts (2) and (3) of Theorem 14.12.26
that WAy = WX(X'WX)⁻X'Wy for every y in R^n [since one solution to linear
system (12.4) is (X'WX)⁻X'Wy], implying that

WA = WX(X'WX)⁻X'W

F = (X'WX)⁻X'W + [I − (X'WX)⁻X'WX]K

A = P_{X,W} + (I − P_{X,W})XK

for some matrix K. Thus, making use of Part (1) of Theorem 14.12.25, we find
that

WA = WP_{X,W} + (WX − WP_{X,W}X)K = WP_{X,W},
Then,
Solution. (a) It follows from the result of Exercise 46 that a vector z (in U) is a
projection of a vector y (in R^n) on U with respect to V if and only if

X'V(y − z) = 0.

Further, it follows from Corollary 14.12.27 that every projection of y on U with
respect to W is a projection (for every y in R^n) of y on U with respect to V if and
only if, for every y and every vector-valued function k(y),
or, equivalently, if and only if, for every y and every vector-valued function k(y),
Thus, it suffices to show that condition (S.9) is satisfied for every y and every
vector-valued function k(y) if and only if X'V(I − P_{X,W}) = 0.
If X'V(I − P_{X,W}) = 0, then condition (S.9) is obviously satisfied. Conversely,
suppose that condition (S.9) is satisfied for every y and every vector-valued function
k(y). Then, since one choice for k(y) is k(y) ≡ 0,

X'V(I − P_{X,W})y = 0

VX = P_{X,W}'VX.    (S.10)
VX = WX[(X'WX)⁻]'X'VX = WXQ
(a) By, for example, making use of the result of Exercise 46, show that

C_W^⊥(X) = N(X'W) = C(I − P_{X,W}),

(c) By, for example, making use of the result of Exercise 46, show that, for any
solution b* to the linear system X'WXb = X'Wy (in b), the vector y − Xb* is a
projection of y on C_W^⊥(X) with respect to W.
Solution. (a) It follows from the result of Exercise 46 that an n-dimensional column
vector y and C(X) are orthogonal with respect to W if and only if X'Wy = 0. Thus,
C_W^⊥(X) = N(X'W). Moreover, since [according to Part (5) of Theorem 14.12.25]
X(X'WX)⁻ is a generalized inverse of X'W, we have (in light of Corollary 11.2.2)
that N(X'W) = C(I − P_{X,W}).
(b) Making use of Part (a), together with Part (10) of Theorem 14.12.25 and
Corollary 4.4.5, we find that

dim[C_W^⊥(X)] = dim[C(I − P_{X,W})]
             = rank(I − P_{X,W})
             = n − rank(WX) ≥ n − rank(X) = n − dim[C(X)].
EXERCISE 1. Using the result of Part (c) of Exercise 6.2, verify that every
neighborhood of a point x in R^{m x 1} is an open set.
Solution. Take the norm for R^{m x 1} to be the usual norm, let N represent the
neighborhood of x of radius r, and let y represent an arbitrary point in N. Further,
take M to be the neighborhood of y of radius

s = r − ‖y − x‖,
and let z represent an arbitrary point in M. Then, using the result of Part (c) of
Exercise 6.2, we find that

‖z − x‖ ≤ ‖z − y‖ + ‖y − x‖
        < s + ‖y − x‖
        = r − ‖y − x‖ + ‖y − x‖
        = r,

implying that z ∈ N. It follows that M ⊂ N and hence that y is an interior point
of N. We conclude that N is an open set.
‖x − c‖ = [(x − c)'(x − c)]^{1/2} = [∑_{i,j} (x_ij − c_ij)²]^{1/2}
∂g^{−k}/∂x_j = −kg^{−k−1} ∂g/∂x_j.

∂(1/g)/∂x_j = −(1/g)² ∂g/∂x_j,
Solution. In light of the results cited in the statement of the exercise (or equiva-
lently in light of Lemma 15.2.2 and the ensuing discussion), we have that 1/g is
continuously differentiable at c and that, as a consequence, (1/g)^k or equivalently
g^{−k} is continuously differentiable at c. Moreover, using results (**) and (*) [or,
equivalently, results (2.16) and (2.8)], we find that

∂g^{−k}/∂x_j = k(1/g)^{k−1} ∂(1/g)/∂x_j
            = −k(1/g)^{k+1} ∂g/∂x_j
            = −kg^{−k−1} ∂g/∂x_j.    (S.1)
or equivalently

F⁻ (∂F/∂x_j) F = 0.
(j = 1, ..., m). Moreover, since ∂(gf)/∂x_j, (∂g/∂x_j)f, and g(∂f/∂x_j) are the
jth columns of ∂(gf)/∂x', f(∂g/∂x'), and g(∂f/∂x'), respectively, we have that
∂tr(AXB)/∂X = A'B'.

[Hint. Observe that tr(AXB) = tr(BAX).]
(2) Show that, for any m- and n-dimensional column vectors a and b,

∂(a'Xb)/∂X = ab'.

[Hint. Observe that a'Xb = tr(a'Xb).]
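Formulas of this kind are conveniently checked with central finite differences; the following Python sketch (numpy; an illustration, not part of the original exercise) verifies that ∂tr(AXB)/∂X = A'B' for arbitrary test matrices:

    import numpy as np

    rng = np.random.default_rng(6)
    A = rng.standard_normal((2, 3))
    B = rng.standard_normal((4, 2))
    X = rng.standard_normal((3, 4))
    f = lambda X: np.trace(A @ X @ B)
    eps = 1e-6
    grad = np.zeros_like(X)
    for i in range(3):
        for j in range(4):
            E = np.zeros_like(X); E[i, j] = eps
            grad[i, j] = (f(X + E) - f(X - E)) / (2 * eps)   # d tr(AXB) / d x_ij
    print(np.allclose(grad, A.T @ B.T))   # True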
(b) Suppose now that X is a symmetric (but otherwise unrestricted) matrix (of
dimensions m x m).

∂tr(AXB)/∂X = C + C' − diag(c_11, c_22, ..., c_mm),

∂(a'Xb)/∂X = ab' + ba' − diag(a_1b_1, a_2b_2, ..., a_mb_m).
(2) Observing that a'Xb = tr(a'Xb) and applying Part (1) (with A = a' and
B = b), we find that

∂(a'Xb)/∂X = ab'.
∂tr(AXB)/∂X = ∂tr(BAX)/∂X = C + C' − diag(c_11, c_22, ..., c_mm).
(2) Observing that a'Xb = tr(a'Xb) and applying Part (1) (with A = a' and
B = b), we find that

∂(a'Xb)/∂X = ba' + ab' − diag(a_1b_1, a_2b_2, ..., a_mb_m).
∂tr[(AX)²]/∂X = 2(AXA)'.

(b) Let X = {x_st} represent an m x m symmetric (but otherwise unrestricted)
matrix of variables. Show that, for any m x m matrix of constants A,

∂tr[(AX)²]/∂X = 2[B + B' − diag(b_11, b_22, ..., b_mm)],
Thus, making use of results (6.3), (5.3), and (5.2.3), we find that

∂tr[(AX)²]/∂x_ij = ∂tr[(AXA)X]/∂x_ij
                 = tr(AXAu_iu_j') + tr(XAu_iu_j'A)
                 = 2 tr(u_j'AXAu_i)
                 = 2 u_j'AXAu_i.

Since u_j'AXAu_i is the jith element of AXA or, equivalently, the ijth element of
(AXA)', we find that

∂tr[(AX)²]/∂X = 2(AXA)'.
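A finite-difference check of Part (a) (Python with numpy; arbitrary test matrices, not part of the original solution):

    import numpy as np

    rng = np.random.default_rng(7)
    m = 3
    A = rng.standard_normal((m, m))
    X = rng.standard_normal((m, m))
    f = lambda X: np.trace(A @ X @ A @ X)   # tr[(AX)^2]
    eps = 1e-6
    grad = np.zeros_like(X)
    for i in range(m):
        for j in range(m):
            E = np.zeros_like(X); E[i, j] = eps
            grad[i, j] = (f(X + E) - f(X - E)) / (2 * eps)
    print(np.allclose(grad, 2 * (A @ X @ A).T))   # True: the gradient is 2(AXA)'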
(b) For purposes of differentiating tr[(AX)²], tr[(AX)²] is interpreted as a
function of an m(m + 1)/2-dimensional column vector x whose elements are x_ij
(j ≤ i = 1, ..., m).
According to results (4.7), (5.6), and (5.7),

∂(AXA)/∂x_ij = A (∂X/∂x_ij) A = { Au_iu_i'A,             if j = i,
                                  A(u_iu_j' + u_ju_i')A,  if j < i.
∂tr[(AX)²]/∂x_ij = tr(AXA ∂X/∂x_ij) + tr[X ∂(AXA)/∂x_ij].
Thus, making use of results (5.6), (5.7), and (5.2.3), we find that

∂tr[(AX)²]/∂x_ii = tr(AXAu_iu_i') + tr(XAu_iu_i'A)
                 = 2 tr(u_i'AXAu_i)
                 = 2 u_i'Bu_i

and that (for j < i)

∂tr[(AX)²]/∂x_ij = tr[AXA(u_iu_j' + u_ju_i')] + tr[XA(u_iu_j' + u_ju_i')A]
                 = 2 tr[AXA(u_iu_j' + u_ju_i')]
                 = 2 tr(AXAu_iu_j') + 2 tr(AXAu_ju_i')
                 = 2 tr(u_j'AXAu_i) + 2 tr(u_i'AXAu_j)
                 = 2 u_j'Bu_i + 2 u_i'Bu_j.
Since (for i, j = 1, ..., m) u_j'Bu_i is the jith element of B or, equivalently, the
ijth element of B' and since u_i'Bu_j is the ijth element of B, we conclude that

∂tr[(AX)²]/∂X = 2[B + B' − diag(b_11, b_22, ..., b_mm)].
Solution. Let u_j represent the jth column of I_m. Then, making use of results (6.1),
(4.8), (5.2.3), and (5.3), we find that

∂tr(X^k)/∂x_ij = tr(∂X^k/∂x_ij)
              = k tr(X^{k−1} ∂X/∂x_ij)
              = k tr(X^{k−1}u_iu_j')
              = k tr(u_j'X^{k−1}u_i)
              = k u_j'X^{k−1}u_i.
Moreover, u_j'X^{k−1}u_i equals the jith element of X^{k−1} or equivalently the ijth
element of (X^{k−1})'. Since (X^{k−1})' = (X')^{k−1}, we conclude that ∂tr(X^k)/∂x_ij,
which is the ijth element of ∂tr(X^k)/∂X, equals the ijth element of k(X')^{k−1} and
hence that

∂tr(X^k)/∂X = k(X')^{k−1}.
(c) Show (in the special case where n = m) that, for any m x m matrix of
constants A,

∂tr(XAX)/∂X = (AX)' + (XA)'.

(d) Use the results of Parts (a)-(c) to devise simple formulas for ∂tr(X'X)/∂X,
∂tr(XX')/∂X, and (in the special case where n = m) ∂tr(X²)/∂X.
Solution. Let u_j represent the jth column of an identity matrix (of unspecified
dimensions).
(a) Since tr(X'AX) = tr(AXX'), it follows from result (6.2) that

∂tr(X'AX)/∂x_ij = tr[A ∂(XX')/∂x_ij].

Thus, making use of results (4.3), (4.10), (5.3), and (5.2.3), we find that

∂tr(X'AX)/∂x_ij = tr(AXu_ju_i') + tr(Au_iu_j'X')
                = tr(u_i'AXu_j) + tr(u_j'X'Au_i)
                = u_i'AXu_j + u_j'X'Au_i.

Since u_i'AXu_j is the ijth element of AX and since u_j'X'Au_i is the jith element
of X'A or equivalently the ijth element of (X'A)', we conclude that

∂tr(X'AX)/∂X = AX + (X'A)'.
Thus, making use of results (4.3), (5.3), and (5.2.3), we find that
Since u_j'AX'u_i is the jith element of AX' or equivalently the ijth element of
(AX')' and since u_i'XAu_j is the ijth element of XA, we conclude that
∂tr(XAX)/∂x_ij = tr(AX ∂X/∂x_ij) + tr(A ∂X/∂x_ij X)
               = tr(AXu_iu_j') + tr(Au_iu_j'X)
               = u_j'AXu_i + u_j'XAu_i.

Since u_j'AXu_i is the jith element of AX or equivalently the ijth element of (AX)'
and since u_j'XAu_i is the jith element of XA or equivalently the ijth element of
(XA)', we conclude that

∂tr(XAX)/∂X = (AX)' + (XA)'.
∂tr(X'X)/∂X = ∂tr(XX')/∂X = 2X

and (in the special case where n = m)

∂tr(X²)/∂X = 2X'.
∂l/∂X = ∂g/∂X + (∂g/∂X)' − diag(∂g/∂x_11, ∂g/∂x_22, ..., ∂g/∂x_mm).

Since u_i'(∂g/∂X)'u_i is the ith diagonal element of (∂g/∂X)' (or equivalently the
ith diagonal element of ∂g/∂X) and since u_j'(∂g/∂X)'u_i and u_i'(∂g/∂X)'u_j are
the ijth elements of ∂g/∂X and (∂g/∂X)', respectively, it follows that (at x = c)

∂l/∂X = ∂g/∂X + (∂g/∂X)' − diag(∂g/∂x_11, ∂g/∂x_22, ..., ∂g/∂x_mm).
= ∑_{i=1}^n [u_i(c) D²_{sj}h_i(c) + D_s u_i(c) D_j h_i(c)]

= ∑_{i=1}^n D_i g[h(c)] D²_{sj}h_i(c) + ∑_{k=1}^n ∑_{i=1}^n D²_{ki} g[h(c)] D_s h_k(c) D_j h_i(c)

= ∑_{i=1}^n D_i g[h(c)] D²_{sj}h_i(c) + [D_s h(c)]'Hg[h(c)]D_j h(c).

To complete the argument, observe that D²_{sj}h_i(c) is the sjth element of Hh_i(c)
and that [D_s h(c)]'Hg[h(c)]D_j h(c) is the sjth element of [Dh(c)]'Hg[h(c)]Dh(c)
∂|X|^k/∂X = k|X|^{k−1}[adj(X)]'.
Solution. For purposes of differentiation, rearrange the elements of X in the form
of an m²-dimensional column vector x, and reinterpret f as a function of x (in
which case the domain of f comprises all of R^{m²}). Let h represent a function of
x defined (on R^{m²}) by h(x) = det(X), let g represent a function of a variable y
defined (for all y) by g(y) = y^k, and express f as the composite of g and h, so
that f(x) = g[h(x)].
The function g is continuously differentiable at every y, and

∂h/∂x_ij = ξ_ij,

where ξ_ij is the cofactor of the ijth element x_ij of X. Thus, it follows from the
chain rule that f is continuously differentiable at every x (or equivalently at every
X) and that

∂f/∂x_ij = k|X|^{k−1}ξ_ij,

or equivalently that

∂f/∂X = k|X|^{k−1}[adj(X)]'.
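A finite-difference check of this formula (Python with numpy; not part of the original solution; adj(X) is computed as det(X)·X⁻¹, valid since the random test matrix is nonsingular with probability 1):

    import numpy as np

    rng = np.random.default_rng(9)
    m, k = 3, 3
    X = rng.standard_normal((m, m))
    adjX = np.linalg.det(X) * np.linalg.inv(X)
    f = lambda X: np.linalg.det(X) ** k
    eps = 1e-6
    grad = np.zeros_like(X)
    for i in range(m):
        for j in range(m):
            E = np.zeros_like(X); E[i, j] = eps
            grad[i, j] = (f(X + E) - f(X - E)) / (2 * eps)
    print(np.allclose(grad, k * np.linalg.det(X) ** (k - 1) * adjX.T, atol=1e-6))  # True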
EXERCISE 15. Let F = {f_is} represent a p x p matrix of functions, defined on
a set S, of a vector x = (x_1, ..., x_m)' of m variables. Let c represent any interior
point (of S) at which F is continuously differentiable. Use the results of Exercise
13.10 to show that (a) if rank[F(c)] = p − 1, then (at x = c)

∂det(F)/∂x_j = ky' (∂F/∂x_j) z,
where z = {z_s} and y = {y_i} are any nonnull p-dimensional vectors such that
F(c)z = 0 and [F(c)]'y = 0 and where [letting φ_is represent the cofactor of
f_is(c)] k is a scalar that is expressible as k = φ_is/(y_i z_s) for any i and s such that
y_i ≠ 0 and z_s ≠ 0; and (b) if rank[F(c)] ≤ p − 2, then (at x = c)

∂det(F)/∂x_j = 0.
∂det(F)/∂x_j = tr[adj(F) ∂F/∂x_j].
(a) Suppose that rank[F(c)] = p − 1. Then, according to the result of Part (a)
of Exercise 13.10,

adj[F(c)] = kzy',

so that (at x = c)

∂det(F)/∂x_j = tr(kzy' ∂F/∂x_j) = k tr(y' ∂F/∂x_j z) = ky' ∂F/∂x_j z.
(b) Suppose that rank[F(c)] ≤ p − 2. Then, it follows from the result of Part
(b) of Exercise 13.10 that (at x = c)

∂det(F)/∂x_j = tr(0 ∂F/∂x_j) = 0.
Moreover, making use of results (8.6), (4.6), (4.10), (5.3), and (5.2.3) and letting
u_j represent the jth column of I_m or I_n, we find that (at x = c)

∂ log det(X'AX)/∂x_ij = tr[(X'AX)⁻¹ ∂(X'AX)/∂x_ij]
    = tr{(X'AX)⁻¹[X'A (∂X/∂x_ij) + (∂X/∂x_ij)'AX]}
    = tr[(X'AX)⁻¹X'Au_iu_j'] + tr[(X'AX)⁻¹u_ju_i'AX]
    = u_j'(X'AX)⁻¹X'Au_i + u_i'AX(X'AX)⁻¹u_j.
Upon observing that u_i'AX(X'AX)⁻¹u_j and u_j'(X'AX)⁻¹X'Au_i are the ijth ele-
ments of AX(X'AX)⁻¹ and [(X'AX)⁻¹X'A]', respectively, we conclude that (at
x = c)

∂ log det(X'AX)/∂X = AX(X'AX)⁻¹ + [(X'AX)⁻¹X'A]'.
EXERCISE 17. (a) Let X represent an m x n matrix of "independent" variables,
let A and B represent q x m and n x q matrices of constants, and suppose that the
range of X is a set S comprising some or all X-values for which det(AXB) > 0.
Show that log det(AXB) is continuously differentiable at any interior point C of
S and that (at X = C)

∂ log det(AXB)/∂X = [B(AXB)⁻¹A]'.
(b) Suppose now that X is an m x m symmetric matrix; that A and B are q x m and
m x q matrices of constants; that, for purposes of differentiating any function of X,
the function is to be interpreted as a function of the column vector x whose elements
are x_ij (j ≤ i = 1, ..., m); and that the range of x is a set S comprising some
or all x-values for which det(AXB) > 0. Show that log det(AXB) is continuously
differentiable at any interior point c (of S) and that (at x = c)

∂ log det(AXB)/∂X = K + K' − diag(k_11, k_22, ..., k_mm),
Moreover, in light of results (8.6), (4.7), (5.3), and (5.2.3), we have that (at
x = c)

∂ log det(AXB)/∂x_ij = tr[(AXB)⁻¹ ∂(AXB)/∂x_ij] = tr[(AXB)⁻¹A (∂X/∂x_ij) B]
    = tr[(AXB)⁻¹Au_iu_j'B]
    = u_j'B(AXB)⁻¹Au_i

and hence {since u_j'B(AXB)⁻¹Au_i is the ijth element of [B(AXB)⁻¹A]'} that (at
x = c)

∂ log det(AXB)/∂X = [B(AXB)⁻¹A]'.
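A finite-difference check of Part (a) (Python with numpy; not part of the original solution; B is taken to be A' and X = I so that det(AXB) > 0 at the point of evaluation):

    import numpy as np

    rng = np.random.default_rng(10)
    q, m = 3, 4
    A = rng.standard_normal((q, m))
    B = A.T                 # ensures AXB = AA' is positive definite at X = I
    X = np.eye(m)
    f = lambda X: np.log(np.linalg.det(A @ X @ B))
    eps = 1e-6
    grad = np.zeros_like(X)
    for i in range(m):
        for j in range(m):
            E = np.zeros_like(X); E[i, j] = eps
            grad[i, j] = (f(X + E) - f(X - E)) / (2 * eps)
    print(np.allclose(grad, (B @ np.linalg.inv(A @ X @ B) @ A).T, atol=1e-5))  # True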
(b) By employing essentially the same reasoning as in Part (a), it can be estab-
lished that log det(AXB) is continuously differentiable at the interior point c and
that (at x = c)

∂ log det(AXB)/∂x_ij = tr[(AXB)⁻¹A (∂X/∂x_ij) B].

Since u_i'Ku_i is the ith diagonal element of K and since u_i'Ku_j and u_j'Ku_i are the
ijth elements of K and K', respectively, it follows that (at x = c)

∂ log det(AXB)/∂X = K + K' − diag(k_11, k_22, ..., k_mm).
∂ log det(AF⁻¹B)/∂x_j = tr[(AF⁻¹B)⁻¹ ∂(AF⁻¹B)/∂x_j]
                      = tr[(AF⁻¹B)⁻¹(−AF⁻¹ (∂F/∂x_j) F⁻¹B)]

∂ log det(AX⁻¹B)/∂x_ij = −tr[X⁻¹B(AX⁻¹B)⁻¹AX⁻¹ ∂X/∂x_ij].
(b) By employing essentially the same reasoning as in Part (a), it can be estab-
lished that log det(AX⁻¹B) is continuously differentiable at the interior point c
and that (at x = c)
Since u_i'Ku_i is the ith diagonal element of K and since u_i'Ku_j and u_j'Ku_i are the
ijth elements of K and K', respectively, it follows that (at x = c)

∂ adj(F)/∂x_j = 0.
Solution. Let φ_si represent the cofactor of f_si and hence the isth element of adj(F),
and let F_si represent the (p − 1) x (p − 1) submatrix of F obtained by striking
out the sth row and the ith column (of F). Then, as discussed in Section 15.8, φ_si
is continuously differentiable at c and (at x = c)
∂ adj(F)/∂x_j = 0.
where y_i represents the ith column and z_j' the jth row of X⁻¹.
(b) Suppose now that X is an m x m symmetric matrix; that, for purposes of
differentiating a function of X, the function is to be interpreted as a function of the
column vector x whose elements are x_ij (j ≤ i = 1, ..., m); and that the range of
x is a set S comprising some or all x-values for which X is nonsingular. Show that
X⁻¹ is continuously differentiable at any interior point c of S and that (at x = c)
∂X⁻¹/∂x_ij = −y_iy_i',            if j = i,
           = −y_iy_j' − y_jy_i',  if j < i.
∂X⁻¹/∂x_ii = −X⁻¹ (∂X/∂x_ii) X⁻¹ = −X⁻¹u_iu_i'X⁻¹ = −y_iy_i',
and similarly (for j < i)
ax-
aXij
I
= -X
-I'
(UiUj
,-I
+ UjUi)X
, ,
= -YiYj - YjYi'
(b) Suppose that det(X) > 0 for every X in S. Show that log det(X) is twice
continuously differentiable at C and that (at X = C)

∂² log det(X)/∂x_ij ∂x_st = −y_ti y_js.
(b) Similarly, based on the results of Section 15.9 (along with Lemma 5.2.1),
we conclude that log det(X) is twice continuously differentiable at c and that (at
x = c)
where T_1, ..., T_r are r nonempty mutually exclusive and exhaustive subsets of
{j_1, ..., j_k} (and where the second summation is over all possible choices for
T_1, ..., T_r).
(b) Suppose that det(F) > 0 for every x in S, and denote by c any interior point
(of S) at which F is k times continuously differentiable. Show that log det(F) is k
times continuously differentiable at c and that (at x = c)
ak log det(F)
aXil ... aXik
L L
k
= (-l)r+l tr [F- I O(T))F- I O(T2)" .F-IO(Tr )], (E.2)
r=1 T], ... ,Tr
where T1, ... , Tr are r nonempty mutually exclusive and exhaustive subsets of
{jt, ... , Jd with Jk E Tr (and where the second summation is over all possible
choices for T1, ... , Tr).
Solutiou. (a) The proof is by mathematical induction. For k = I and k = 2, it
follows from the results of Sections 15.8 and 15.9 that F- i is k times continuously
differentiable and formula (E.I) valid at any interior point at which F is k times
continuously differentiable.
Suppose now that, for an arbitrary value of k, F- I is k times continuously
differentiable and formula (E.I) valid at any interior point at which F is k times
continuously differentiable. Denote by c* an interior point at which F is k + I
times continuously differentiable. Then, it suffices to show that F- i is k + I times
15. Matrix Differentiation 133
ak+IF- 1
aXil'" aXjk+l
k+l
= L L (-WF- 1D(Tt)F- 1D(T2*)·· .F-1D(Tr*)F- 1, (S.2)
r=1 Tt, ... ,Tr*
where jk+ 1 is an integer between 1 and m, inclusive, and where Tt, ... , T,* are r
nonempty mutually exclusive and exhaustive subsets of UI, ... , A+l}.
The matrix F is k times continuously differentiable at c* and hence at every
point in some neighborhood N of c*. By supposition, F- 1 is k times continuously
differentiable and formula (E.1) valid at every point in N. Moreover, all partial
derivatives of F of order less than or equal to k are continuously differentiable
at c*. Thus, it follows from results (4.8) and (8.15) that akF-I/axh ... aXjt is
continuously differentiable at c* and that (at x c*) =
aXil'" aXjt+l
k
=L L (-1)'
X [-F-l~F-ID(TdF-ID(T2)'" F-1D(Tr)F- 1
aXjk+l
aF
-F- 1D(TJ}F- 1D(T2)'" F-1D(Tr)F-l_-F- 1
aXjt+l
+F-1D(TI u Uk+lDF- 1D(T2)'" F-1D(Tr)F- 1
+F- 1D(TJ}F- 1D(T2 U Uk+l})" .F-1D(Tr)F- 1
The terms of sum (S.3) can be put into one-to-one correspondence with the terms
of sum (S.2) (in such a way that the corresponding terms are identical), so that
formula (S.2) is valid and the mathematical induction argument is complete.
(b) The proof is by mathematical induction. For k = 1 and k = 2, it follows
from the results of Sections 15.8 and 15.9 that logdet(F) is k times continuously
differentiable and formula (E.2) valid at any interior point at which F is k times
continuously differentiable.
134 15. Matrix Differentiation
Suppose now that, for an arbitrary value of k, log det(F) is k times continuously
differentiable and formula (E.2) valid at any interior point at which F is k times
continuously differentiable. Denote by c* an interior point at which F is k + 1
times continuously differentiable. Then, it suffices to show that log det(F) is k + 1
times continuously differentiable at c* and that (at x = c*)
ak 10gdet(F)
aXil'" aXjk+1
HI
=L L (-1)'+ltr[F- 10(Tt)F- 10(T2*)··· F-10(Tr*)], (S.4)
r=l Tt •...• Tr·
where Tt, . .. , Tr* are r nonempty mutually exclusive and exhaustive subsets of
{iI,···, A+u with iHI E T/.
The matrix F is k times continuously differentiable at c* and hence at every point
in some neighborhood N of c* . By supposition, log det(F) is k times continuously
differentiable and formula (E.2) valid at every point in N. Moreover, all partial
derivatives of F of order less than or equal to k are continuously differentiable at
c*, and F- 1 is continuously differentiable at c*. Thus, it follows from results (4.S)
and (S.15) that a k 10gdet(F)/axh ... aXjk is continuously differentiable at c* and
that (at x = c*)
aH1logdet(F)
aXil'" aXjk+1
k
=L L (-W+ l
r=l TI ..... Tr
The terms of sum (S.5) can be put into one-to-one correspondence with the terms
of sum (S.4) (in such a way that the corresponding terms are identical), so that
formula (S.4) is valid and the mathematical induction argument is complete.
15. Matrix Differentiation 135
02(W - WPx,w)
OZiOZj
I a2 w
= (I - Pxw)--(I - P xw )
'OZiOZj ,
oW, ,oW
(I - Pxw)-X(X wxrx -a
I
(I - Px,w)
, o~ 0
oW, ,oW ,
[(I - Pxw)-X(X wxrx - ( I - PX,w)].
I
, OZi OZj
Pxw)-(I - px.w).
. aZj
Further, px.w and aw /aZj are continuously differentiable at c.
Thus, a(W - wpx.w)/aZj is continuously differentiable at c, and hence W -
wpx .w is twice continuously differentiable at c. Moreover, making use of results
(4.6) and (11.1) [along with Part (3') of Theorem 14.12.11], we find that (at z = c)
+ (I -
I
Pxw)--(I - px.w)
a2w
· aZiaZj
apxw)' aw
- ( - ' - --(I - px.w)
aZi aZj
ap )'aw
= - [( ~ -(I-Px.w)
aZi aZj
J'
+ (I - px.w)'--(I - px.w)
a2w
aZiaZj
apxw)' aw
- ( - ' - --(I-Px.w)
aZi aZj
aw, ,aw ,
= -[(I - Pxw)-X(X
. aZi
I
wxrx - ( I - px.w))
aZj
a2 w
+ (I -
I
Pxw)--(I - px.w)
aZiaZj ·
aw, I ,aw
- (I - Pxw)-X(X WX)-X - ( I - px.w).
· aZi aZj
apxw )'
( --'- , aw ax
WPx,w = (I - Pxw)-Px,w + W(I - px.w)-a B.
a0 . a0 0
Solution. According to result (**) [or, equivalently, according to Part (6') of The-
orem 14.12.11], WPx,w = p~.wWPx,w. Thus, it follows from results (4.6) and
4.10) that
Substituting from result (*) [or equivalently from result (11.16)], we find that (for
any p x n matrix B such that X'WXB = X'W)
a(WPxw ) , aw ax ,
- - - '- = [(I - Pxw)-Px,w + W(I - Px.w)-B]
aZj . aZj aZj
, aw , aw
+ Px,w aZj px.w + (I - Px,w) aZj Px,w
ax
+ W(I - PX,w)-B
aZj
, aw ax ,
= Pxw-(I - Px,w) + [W(I - PX,w)-B]
, aZj aZj
aw ax
+ -Px,w + W(I - Px,w)-a B.
aZj Zj
it is clear that
a (WPx,w)
---'-----"=-:...
aZj
= -aw
aZj
- ,
,
aw
(I - Pxw)-(I - Px,w)
aZj
ax ax ,
+ W(I - PX,w)-B + [W(I - PX,w)-B].
aZj aZj
16
Kronecker Products and the Vec and
Vech Operators
EXERCISE 1. (a) Verify that, for any m x n matrices A and B and any p x q
matrices C and D,
(b) Verify that, for any m x n matrices AI, A2, ... ,Ar and p x q matrices
Bl, B2, ... , Bs ,
A® (t Bj )~ t<A®Bj). (S.l)
s*+!
= L(A®B j ),
j=!
which indicates that result (S.l) is valid for s = s* + I, thereby completing the
induction argument. Moreover, it can be shown in analogous fashion that, for any
p x q matrix B,
EXERCISE 3. Show that, for any m x 1 vector a and any p x 1 vector b, (1)
a®b = (a ® Ip)b and (2) a' ® b' = b'(a' ® Ip).
Solution. Making use of results (1.20) and (1.1), we find (1) that
a ® b = (a ® Ip)(1 ® b) = (a ® Ip)b
so that (A' A) - ® (B'B) - is a generalized inverse of X'X. And, again making use
of result (1.19), it follows that
P x = X(X'X)-X'
= (A ® B)[(A'A)- ® (B'B)-](A' ® B')
= [A(A'A)- A'] ® [B(B'B)-B']
= PA ®PB .
Solution. (a) Suppose that A and B are both symmetric nonnegative definite. Then,
according to Corollary 14.3.8, there exist matrices P and Q such that A = P'P and
B = Q'Q. Thus, making use of results (1.19) and (1.15), we find that
A ® B = (p' ® Q')(P® Q) = (P® Q)'(P® Q).
We conclude (in light of Corollary 14.3.8 or 14.2.14) that A ® B is symmetric
nonnegative definite.
Alternatively, if A and B are both symmetric non positive definite, then -A
and -B are symmetric and (by definition) nonnegative definite, and the proof
[of Part (a)] is complete upon observing [in light of result (1.10)] that A ® B =
(-A) ® (-B).
(b) Suppose now that A and B are both symmetric positive definite. Then,
according to Corollary 14.3.13, there exist nonsingular matrices P and Q such
that A = P'P and B = Q'Q. Further, A ® B = (P ® Q)'(P ® Q), and P ® Q is
nonsingular. We conclude (in light of Corollary 14.3.13 or 14.2.14) that A ® B is
symmetric positive definite.
Alternatively, if A and B are both symmetric negative definite, then -A and-B
are symmetric and (by definition) positive definite, and the proof is complete upon
observing that A ® B = (-A) ® (-B).
IIA®BII=IIAIIIIBII·
Solution. Making use of results (1.15), (1.19), and (1.25), we find that
I
IIA ® B II = {tr[(A ® B)'(A ® B)]P
= {tr[(A' ® B')(A ® B)]P
1
I
= {tr[(A' A) ® (B'B)]P
I I
= {tr(A'A) P {tr(B'B) P
= IIAIIIIBII·
All AJ2
... AlC)
... A2c
( A2l A22
A= . .
and a p x q matrix B,
that is, A ® B equals the mp x nq matrix obtained by replacing each block Aij of
A with the Kronecker product of Aij and B.
Solution. For i = 1, ... , r, let mi represent the number ofrows in Ail, Ai2, ... ,
Aic; and, for j = 1, ... , c, let n j represent the number of columns in Alj, A2j,
... , A rj . Define F = A ® B; and partition F as
...
FJ2
Flo)
C
F2l F22 ... F2c
F= . . ,
Frl Fr2 F rc
144 16. Kronecker Products and the Vec and Vech Operators
that is, partition F into r rows and c columns of blocks, the ijth of which is
of dimensions mi P x n jq and is denoted by Fij. Then (for i = 1, ... , rand
j = 1, ... , c), Fij equals a partitioned matrix comprising mi rows and n j columns
of P x q dimensional blocks, the uvth of which is
EXERCISE 13. Let AI, A2, ... , Ak represent k matrices (of the same dimen-
16. Kronecker Products and the Vec and Vech Operators 145
sions). Show that AI, A2, ... , Ak are linearly independent if and only ifvec(Aj),
vec(A2), ... , vec(Ak) are linearly independent.
Solution. It suffices to show that AI, A2, ... ,Ak are linearly dependent if and
only if vec(A I), vec(A2), ... , vec(Ak) are linearly dependent.
Suppose that A I, A2, ... , Ak are linearly dependent. Then, there exist scalars
q, C2, ..• , q, not all zero, such that L:7=1 CiAi = O. Since [in light of result (2.6)]
k k
L Ci vee (Ai ) = vec(L Ci Ai) = vec(O) = 0,
i=1 i=1
we conclude that vec(Aj), vec(A2), ... , vec(Ak) are linearly dependent.
Conversely, suppose that vec(AI), vec(A2), ... , vec(Ak) are linearly depen-
dent. Then, there exist scalars CI, C2, ..• , q, not all zero, such that
k
L Ci vee (Ai ) = 0,
i=1
or equivalently [in light of result (2.6)] such that vec(L:7=1 CiAi) = 0, and hence
such that L:7=1 CiAi = O. We conclude thatAI , A2, ... , Ak are linearly dependent.
EXERCISE 14. Let m represent a positive integer, let ei represent the ith column
oflm (i = 1, ... , m), and (for i, j = 1, ... , m) let Vi} = eiej (in which case Vi)
is an m x m matrix whose ijth element is 1 and whose remaining m 2 - 1 elements
are 0).
(a) Show that
m
vec(Im) = Lei ® ei·
i=1
(b) Show that (for i, j, r, S = 1, ... , m)
vec(Vr;)[vec(Vsj)]' = Vi} ® Vrs .
(c) Show that
m m
L L Vi} ® Vi} = vee (1m Hvec(Im)]'.
i=1 j=1
Solution. (a) Making use of results (2.4.4), (2.6), and (2.3), we find that
(b) Making use of results (2.3), (1.15), and (1.19), we find that
146 16. Kronecker Products and the Vec and Vech Operators
(c) Making use of Part (b) and results (2.6) and (2.4.4), we find that
= Lvee(U;;)[Lvee(Ujj)]'
j
= vee(LUii)[vee(LUjj))'
i j
= vee(Im)[vee(Im)]'.
Solution. (a) If A is orthogonal, then, making use of result (2.14), we find that
(vee A)'vee A = tr(A' A) = tr(In ) = n.
(b) If A is idempotent, then, making use of result (2.14) and Corollary 10.2.2,
we find that
[vee(A')]'vee A = tr(AA) = tr(A) = rank(A).
EXERCISE 17. (a) Let V represent a linear space of m x n matrices, and let g
represent a function that assigns the value A * B to each pair of matrices A and B
in V. Take U to be the linear space of mn x 1 vectors defined by
U = {x E nmnx 1 : x = vee (A) for some A E V},
16. Kronecker Products and the Vec and Vech Operators 147
and let x y represent the value assigned to each pair of vectors x and y in U by
0
an arbitrary inner product f. Show that g is an inner product (for V) if and only if
there exists an f such that (for all A and B in V)
A * B = vec(A)ovec(B).
(b) Let g represent a function that assigns the value A * B to an arbitrary pair
of matrices A and B in nmxn. Show that g is an inner product (for nmxn) if and
only if there exists an mn x mn partitioned symmetric positive definite matrix
(where each submatrix is of dimensions m x m) such that (for alIA andB in nmxn)
A*B = L3;Wijbj,
i,j
where 31,32, ... ,3n and bl, b2, ... , b n represent the first, second, ... , nth col-
umns of A and B, respectively.
(c) Let g represent a function that assigns the value x' * y' to an arbitrary pair
of (row) vectors in n I xn. Show that g is an inner product (for n I xn) if and only
if there exists an n x n symmetric positive definite matrix W such that (for every
pair of n-dimensional row vectors x' and y')
A *B = vec(A)ovec(B)
(for all A and B in V). Then,
(1) A *B = vec(A)ovec(B) = vec(B)ovec(A) = B * A;
(2) A * A = vec(A) 0vec(A) 2: 0, with equality holding if and only if vec(A) =
oor equivalently if and only if A = 0 ;
(3) (kA) * B = vec(kA) °vec(B) = [k vec(A)) ovec(B)
=k [vec(A) vec(B)] = k(A * B) ;
0
where X and Yare the (unique) m x n matrices such that x = vec(X) and y =
vec(Y). Then, letting x, y, and z represent arbitrary vectors in U, taking X, Y, and
Z to be m x n matrices such that x = vec(X), y = vec(Y), and z = vec(Z), and
denoting by k an arbitrary scalar, we find that
(1) x * y = X * Y = Y * X = Y * x;
(2) x * x = X * X ~ 0, with equality holding if and only if X = 0 or equivalently
if and only if x = 0 ;
(3) (kx) * Y = (kX) * Y = k(X * Y) = k(x * y);
(4) (x + y) *Z = (X + Y) * Z = (X * Z) + (Y * Z) = (x *Z) + (y *Z).
Thus, 1 is an inner product (for U). Moreover, for f = 1, we have that
A *B = vec(A) * vec(B) = vec(A) ·vec(B)
A *B = vec(A) ·vec(B) .
Moreover, according to the discussion of Section 14. lOa, every inner product for
n mn x 1 is expressible as a bilinear form, and a bilinear form (in mn x 1 vectors)
qualifies as an inner product for nmn x I if and only if the matrix of the bilinear
form is symmetric and positive definite. Thus, g is an inner product (for nm xn) if
and only if there exists an mn x mn symmetric positive definite matrix W such
that (for all A and B in V)
A * B = (vec A)'Wvec B.
Further, partitioning W as
Win)
W2n
. ,
Wnn
16. Kronecker Products and the Vec and Vech Operators 149
and denoting by ai, a2, ... , an and bl, b2, ... , b n the first, second, ... , nth col-
umns of A and B, respectively, we find that
n
(c) It follows from Part (b) that g is an inner product (for I x n) if and only if
there exists an n x n symmetric positive definite matrix W = {Wij} such that (for
every pair of n-dimensional row vectors x' = {Xi} and y' = {Yi})
where A* is the (m - 1) x n matrix whose rows are respectively the first, ... ,
(m - l)th rows of A and r' is the mth row of A [and hence where A = (~*) and
A' = (A:, r)) .
Solution. (a) (1) Since vec (A:) = Km-l,nvec A*, we have [in light of the defining
relation (3.1)] that
Thus,
K a -_
mn
(Km-l.n
0
0)
In
Pa
150 16. Kronecker Products and the Vec and Vech Operators
for every mn-dimensional column vector a, implying (in light of Lemma 2.3.2)
that
K _ (Km-I,n
mn - 0
O)p
In .
(2) The vector r is the n x 1 vector whose first, second, ... , nth elements
are respectively the mth, (2m )th, ... , (nm)th elements of vec A, and vec A* is
the (m - l)n-dimensional subvector of vec A obtained by striking out those n
elements. Accordingly, P = (~~). where P2 is the n x mn matrix whose first,
second, ... ,nth rows are respectively the mth, (2m)th, ... , (nm)th rows of I mn ,
and PI is the (m - 1)n x mn submatrix of Imn obtained by striking out those n
rows. Now, applying Lemma 13.1.3, we find that IPI = (-1)/fJ, where
IKmn I = (_1)m(m-l)n(n-I)/4.
IKmml = (_I)[m(m-I)/212 .
Since the product of two odd numbers is odd and the product of two even numbers
even, we conclude that
IKmml = (_1)m(m-I)/2.
Solution. Making use of Corollary 16.3.3 and of results (1.16) and (1.4), we find
that
EXERCISE 20. Let m and n represent positive integers, and let ei represent the ith
column ofIm (i = 1, ... , m) and U j represent the jth column ofIn (j = 1, ... , n).
Show that
n m
Kmn = Luj ®Im ®Uj = Lei ®In ®e;.
j=l i=l
Solution. Starting with result (3.3) and using results (1.4) and (2.4.4), we find that
= Luj®(Leie;)®uj = Luj®Im®Uj.
j i j
Similarly,
EXERCISE 21. Let m, n, and p represent positive integers. Using the result of
Exercise 20, show that
152 16. Kronecker Products and the Vec and Vech Operators
Kmp,n = L u j ® Imp ® Uj
j
=L uj ® (Ip ® 1m) ® Uj
j
(b) Making use of Part (a) and result (3.6), we find that
Kmp,nKnp,mKmn.p = Kp.mnKm.npKnp,mKmn,p
= Kp.mn I Kmn.p = Kp.mnKmn.p = I.
(c) Making use of Part (b) and result (3.6), we find that
(0 Making use of Parts (c) and (a) and result (3.6), we find that
Km,npKmp,n = (Kmp,nKnm,p)(Kp,mnKm.np)
= Kmp,nKmn,pKp.mnKm,np = Kmp,nKm.np .
EXERCISE 22, Let A represent an m x n matrix, and define B = Kmn (A' ® A).
Show (a) that B is symmetric, (b) that rank(B) =
[rank(A)f, (c) that B2 =
(AA') ® (A' A), and (d) that tr(B) = tr(A' A).
Solution. (a) Making use ofresults (1.15), (3.6), and (3.9), we find that
EXERCISE 23. Show that, for any m x n matrix A and any p x q matrix B,
Solution. Making use of results (1.20), (1.1), and (1.8), we find that
Now, substituting these expressions [for vec(A) ® vec(B)] into formula (3.16)
and making use of result (1.19), we obtain
and
vech A = Ln vec A
for every n x n matrix A (symmetric or not). [The matrix Ln is one choice for the
matrix H n , i.e., for a left inverse of Gn . It is referred to by Magnus and Neudecker
(1980) as the elimination matrix-the effect of premultiplying the vec of an n x n
matrix A by Ln is to eliminate (from vec A) the "supradiagonal" elements of A.]
(a) Write out the elements of LJ, L2, and L3.
(b) For an arbitrary positive integer n, describe Ln in terms of its rows.
Solution. (a) LJ = (1),
1 0 0 0 0 0 0 0 0
0 1 0 0 0 0 0 0 0
0 0
~= (~ 1 0
0 0 ~). and L3 = 0
0
0
0
0
0
1
0
0
0
0
0
0
1
0
0
0
1
0
0
0
0 0
0 0
0 0
0 0 0 0 0 0 0 0
(b) For i ::: j, the [(j - 1)(n - j /2) + i]th row of Ln is the [(j - 1)n + i]th
row ofln2.
Solution. (a) Since (1/2)(A + A') is an n x n symmetric matrix, we have [in light
of result (4.17)] that, for Hn = (G~Gn)-lG~,
(c) Using the results of Parts (a) and (b), we find that
EXERCISE 28. Let A represent a square matrix of order n. Show that, for Hn =
(G'n G n )-IGn'
'
GnHn (A ® A)H~ = (A ® A)H~ .
Solution. Let us use mathematical induction to show that for any n x n upper
triangular matrix A = {aij}, Hn(A ® A)Gn is upper triangular with diagonal
elements aiiajj (i = 1, ... , n; j = i, ... , n).
For every 1 x 1 upper triangular matrix A = (all), H!(A®A)Gl is the 1 x 1 ma-
trix (ail)' which is upper triangular with diagonal element aiia jj (i = 1; j = 1).
Suppose now that, for every n x n upper triangular matrix A = {aij}, Hn (A®A)G n
is upper triangular with diagonal elements a;;a jj (i = 1, ... , n; j = i, ... , n), and
let B = {bi}} represent an (n + 1) x (n + 1) upper triangular matrix. Then, to com-
plete the induction argument, it suffices to show that Hn+! (B ® B)Gn+! is upper
triangular with diagonal elements biibjj (i = 1, ... , n + 1; j = i, ... , n + I).
For this purpose, partition B as
B = (: ~)
(where A is n x n with ijth element b;+1.j+J}. Then (since B is upper triangular),
a = 0, and it follows from result (4.29) that
Thus, Hn+ I (B ® B)Gn+ I is upper triangular. And, its diagonal elements are c 2 =
bil' cbjj = bllbjj(j = 2, ... , n+l),andbiibjj(i = 2, ... , n+l; j = i, ... , n+
1); that is, its diagonal elements are biibjj{i = 1, ... , n + 1; j = i, ... , n + 1).
It can be established via an analogous argument that, for any n x n lower
triangular matrix A = {aij}, Hn (A ® A)G n is lower triangular with diagonal
elements aiiajj (i = 1, ... , n; j = i, ... , n).
Finally, note that if an n x n matrix A = {aij} is diagonal, then A is both upper and
lower triangular, in which case Hn (A ® A)G n is both upper and lower triangular,
and hence diagonal, with diagonal elements aiiajj (i = 1, ... , n; j = i, ... , n).
EXERCISE 30. Let AI, ... , Ak, and B represent m x n matrices, and let b =
vec B.
(a) Show that the matrix equation L7=1 Xi Ai = B (in unknowns XI, ... , Xk) is
equivalent to a linear system of the form Ax = b, where x = (XI, ... , xd' is a
vector of unknowns.
(b) Show that if A I, ... , Ak, and B are symmetric, then the matrix equation
L7=1 xjAj = B (in unknowns XI, ... ,Xk) is equivalent to a linear system of
the form A *x = b*, where b* = vech B and x = (XI, ... , Xk)' is a vector of
unknowns.
Ax = GnA*x = Gnb* = b,
avee(Fk) = ve{a(Fk)]
aXj aXj
= vel Fk-I~ +Fk-2~F+ ... + ~Fk-I)
'\ aXj aXj aXj
= vee(t Fk - S aF.~_I)
s=1 ax;
= t
s=1 ax;
'\
~-I)
velFk- s aF.
= t [(~-I)'
s=1 ,\ax;
® Fk-s]vel aF.)
= ~ k aveeF
~ [(p-I)' ® F -S] _ _ ,
s=1 aXj
implying that
avee(Fk) _ ~ ~_I' k-s avee F
a' - ~ [( ) ®F ] a' .
x s=1 X
(d) Show that, in the special case where x' = [(vee X)', (vee V)'], F(x) = X,
and G(x) = Y for some p x q and r x s matrices X and Y of variables, the formula
in Part (c) simplifies to
avee(X ® Y)
ax' = (Iq ® Ksp ® Ir )[Ipq ® (vee V), (vee X) ® Irs].
16. Kronecker Products and the Vee and Vech Operators 159
Solution. (a) Partition each of the three matrices a(F®G)/ax j, F®(aG/aXj), and
(aF lax j) ® G into p rows and q columns of r x s dimensional blocks. Then, for
i = 1, ... , p ands = 1, ... , q, the isth blocks of a(F®G)/axj' F® (aG/aXj),
and (aF /aXj) ®G are respectively aUisG)/8xj, fis(aG/aXj), and (8fis/axj)G,
implying [in light ofresult(15.4.9)] that the isth block of a(F ® G)/aXj equals
the sum of the isth blocks ofF® (8G/8xj) and (8F/axj) ®G. We conclude that
(b) Making use of Part (a) and Theorem 16.3.5, we find that
8 vee (F ® G)
aXj
= vec[a(F ® G)]
aXj
= vec(F ® 8G) +
aXj
vec(~
aXj
® G)
(c) In light of result (1.28), (vee F) ® [a(vec G)/aXj] is the jth column of
(vee F) ® [8 (vee G)/ax']. And, in light of result (1.27), [8(vec F)/8x j] ® (vee G)
is the j th column of [a (vee F) / ax'] ® (vee G). Thus, it follows from Part (b) that
8 vee (F ® G)
ax' = (Iq ® Ksp ® Ir) [ 8 vee G avee F ]
(vee F) ® ax' + ax;- ® (vee G) .
(vee F) ®
avecG = [0, (vee F) ® Irs]
ax'
and that
avecF
- - ® (vee G) = [lpq ® (vee G), 0],
ax'
so that it follows from Part (c) that
avee (F®G)
---, --'- = (Iq ® Ksp ® IrHlpq ® (vee G), (vee F) ® Irs].
ax
17
Intersections and Sums of Subspaces
Y E (U + W)..L ~ Y 1- (U + W)
~ Y 1- U and Y 1- W
~ Y E U..L and Y E W..L
~ YE(U..LnW..L).
EXERCISE 5. Let UI, U2, ... ,Uk represent subspaces of Rmxn. Show that if,
for j = 1,2, ... , k, Uj is spanned by a (finite nonempty) set of (m x n) matrices
(j) (j)
VI , ... , V rj ,then
Solution. Suppose that, for j = 1,2, ... ,k, Uj is spanned by the set {V~j), ... ,
VW}. The proof that UI +U2 + ... +Uk is spanned by ViI), ... , V~:), vi2), ... ,
(2) ' .•. , V(k)
V r2 I ' · .. , V(k). b th . I· d .
rk IS Y ma ematIca 10 uctIon.
It follows from Lemma 17.1.1 that
Then, the proof is complete upon observing (in light of Lemma 17.1.1) that
(E. 1)
iSU1=",=Uk=O.
(a) Show that UI, ... ,Uk are independent if and only if, for i = 2, ... ,k, Ui
and UI + ... + Ui -I are essentially disjoint.
(b) Show that UI, ... , Uk are independent if and only if, for i = 1, ... , k, Ui
and UI + ... + Ui-I + Ui+1 + ... + Uk are essentially disjoint.
(c) Use the results of Exercise 3 [along with Part (a) or (b)] to show that if
UI, ... , Uk are (pairwise) orthogonal, then they are independent.
(d) Assuming that UI, Uz, ... , Uk are of dimension one or more and letting
{Ui j ), ... , UW} represent any linearly independent set of matrices in Uj (j =
1,2, ... , k), show that if UI, U2, ... ,Uk are independent, then the combined set
{U (l) UO) (2) (2) (k) (k) • . I' d d
1 , ... , r l ' U 1 , ... , U r2 ' ••• , U I , ... , U rk } IS lInear y III epen ent.
(e) Assuming that UI, U2, ... , Uk are of dimension one or more, show that UI,
U2, ' .. ,Uk are independent if and only if, for every nonnull matrix UI in UI, every
nonnull matrix U2 in U2, ... , and every nonnull matrix Uk in Uk. UI, U2, ... , Uk
are linearly independent.
(f) For j = 1, ... , k, let Pj = dim(Uj), and let Sj represent a basis forUj (j =
1, ... , k). Define S to be the set of 2:)=1 Pj matrices obtained by combining all
of the matrices in SI, ... , Sk into a single set. Use the result of Exercise 5 [along
with Part (d)] to show that (1) if UI, ... , Uk are independent, then S is a basis for
UI + ... + Uk; and (2) if UI, ... , Uk are not independent, then S contains a proper
subset that is a basis for UI + ... + Uk.
(g) Show that (1) if U I, ... , Uk are independent, then
Solution. (a) It suffices to show that UI, ... , Uk are not independent if and only
if, for some i (2 SiS k), U i and UI + ... + Ui-I are not essentially disjoint.
Suppose thatUI, ... , Uk are not independent. Then, by definition, equation (E.l)
has a solution, say UI = Ur, ... , Uk = Uk' other than UI = ... = Uk = O. Let
r represent the largest value of i for which Ur is nonnull. (Clearly, r 2: 2.) Since
Ur + ... + U; = Ur + ... + Uk = 0,
According to Part (a), Uj+1 and UI + ... + Uj are essentially disjoint. And,
(1) V(1) (J). b
c IearIy, V I " ' " rr'
(2) (2) (j)
VI , ... , Vrz ' .... VI , ... , Vrj are m the su space
UI + U2 + ... + U j . Thus, it follows from Lemma 17.1.3 that the set (V~I), ... ,
el) V(2)
V rr' V(2) (J+I) (J+I) . . .
I , ... , r2" .. , V I ' ...• Vrj+r } IS lmearly mdependent.
(e) Suppose that UI. U2 • ... , Uk are independent. Let VI, V2 • ... , Vk represent
nonnull matrices in UI, U2 • ... , Uk, respectively. Then, it follows from Part (d)
that the set {VI, V2, ... , Vd is linearly independent.
Conversely, suppose that, for every nonnull matrix V I in UI, every nonnull ma-
trix V 2 in U2, ... , and every nonnull matrix V k in Uk. V I, V 2, ... , V k are linearly
independent. If UI, U2, ... ,Uk were not independent, then, for some nonempty
subset {JI, ... , ir} of the first k positive integers, there would exist nonnull matri-
ces V jr ' ... , V jr in Ujr' ...• U jr , respectively, such that
and the set {VI, V2, ... , Vk} (where, for i rf. {h, ... , ir}, Vj is an arbitrary
nonnull matrix in Uj) would be linearly dependent, which would be contradictory.
Thus, UI, U2, ... , Uk are independent.
(f) It is clear from the result of Exercise 5 that S spans UI + ... + Uk.
(1) Now, suppose that UI, ... , Uk are independent. Then, it is evident from Part
(d) that S is a linearly independent set. Thus, S is a basis for UI + ... + Uk.
(2) Alternatively, suppose that UI, ... , Uk are not independent. Then, for some
(nonempty) subset {Jr, ... , ir} of the first k positive integers, there exist nonnull
matrices V jr ' ... , V jr in UJr' ... , Ujr' respectively, such that
VJr +",+V jr =0.
h
werec (m) (m) . b ) d V(m) V(m)
1 , ... ,cpjmarescalars(notallofwhichcan ezero an 1 , ... , Pjm
are the matrices in S jm' Thus,
implying that S is a linearly dependent set. We conclude that S itself is not a basis
and consequently (in light of Theorem 4.3.11) that S contains a proper subset that
is a basis for Ul + ... + Uk.
(g) Part (g) is an immediate consequence of Part (f).
EXERCISE 7. Let AI, ... , Ak represent matrices having the same number of
rows, and let BI, ... , Bk represent matrices having the same number of columns.
17. Intersections and Sums of Subspaces 167
Adopting the terminology of Exercise 6, use Part (g) of that exercise to show (a)
that if C(Al), ... , C(Ak) are independent, then
that
(a) C[(I - AA -)B] = C(I - AA -) n C(A, B) and
168 17. Intersections and Sums of Subspaces
Solution. (a) According to result (*) [or, equivalently, the first part of Corollary
17.2.9],
C(A, B) = C(A) + C[(I - AA -)B].
Thus, observing thatC[(1 - AA -)B] c C(I - AA -) and making use of the result
of Part (c)-(2) of Exercise 4 (and also of Lemma 17.2.7), we find that
(b) According to Part (1) of Theorem 12.3.4, (A' A)- A' is a generalized inverse
of A. Substituting this generalized inverse for A-in the result of Part (a) and
making use of Lemma 12.5.2, we find that
R(V) are not necessarily essentially disjoint] and (b) the special case where R(T)
and R(V) are essentially disjoint.
Solution. (a) IfC(T) and C(U) are essentially disjoint, then {since C[T(I - V-V)]
c C(T)} C[T(I - V-V)] and C(U) are essentially disjoint, and (in light of Corollary
17.2.10) formula (*) [or, equivalently, formula (2.15)] simplifies to
(b) If R(T) and R(V) are essentially disjoint, then it follows from an analogous
line of reasoning that formula (*) [or, equivalently, formula (2.15)] simplifies to
represent a generalized inverse of the partitioned matrix (~ ~). Show that (a)
if GIl is a generalized inverse of T and Gl2 a generalized inverse of V, then R(T)
and R(V) are essentially disjoint, and (b) if GIl is a generalized inverse of T and
G21 a generalized inverse of U, then C(T) and C(U) are essentially disjoint.
Solution. Clearly,
= (~ ~) (g~~ (S.I)
EXERCISE 13. (a) Generalize the result that, for any two subspaces U and V of
Rmxn,
dim(U + V) = dim(U) + dim (V) - dim(U n V),
Do so by showing that, for any k subspaces Ul, ... , Uk.
dim(UI + ... + Uk) = dim(Ul) + ... + dim(Uk)
k
- L dim[(UI + ... + Ui-l) n U;1. (E.2)
i=2
17. Intersections and Sums of Subspaces 171
(b) Generalize the result that, for any m x n matrix A, m x p matrix B, and
q x n matrix C,
Do so by showing that, for any matrices AI, ... , Ak having the same number of
rows,
and, for any matrices BI, ... , Bk having the same number of columns,
Solution. (a) The proof is by mathematical induction. In the special case where
k = 2, equality (E.2) reduces to the equality
which is equivalent to equality (*) and whose validity was established in Theorem
17.4.1.
Suppose now that equality (E.2) is valid for k = k' (where k' ::: 2). Then,
making use of Theorem 17.4.1, we find that
(b) Applying Part (a) with U\ = C(Ad, ... ,Uk = C(Ad [and recalling result
(1.6)], we find that
i=2
= rank(Ad + ... + rank(Ak)
- L dim[C(Al, ... , Ai-I> n C(Ai )].
k
i=2
And, similarly, applying Part (a) with Ul = R(Bd, ... , Uk = R(Bk) [and
recalling result (1.7)], we find that
4~}mm[RG}
= dim[R(Bl) + ... + R(Bk)]
= dim[R(BJ)] + ... + dim[R(Bk)]
;=2
Solution. Making use of equality (*) [or equivalently equality (5.8)], we find that
rank(ACB) = rank[A(CB)]
= rank (A) + rank(CB) - n
+ rank{[1 - CB(CB)-](I - A-A)}. (S.3)
And upon equating expression (**) [or equivalently expression (5.5)] to expression
(S.3), we find that
Solution. Suppose that A is the projection matrix for U along V (where U EEl V =
Rnxl). Then, according to Theorem 17.6.14, A is idempotent, U = C(A), and
V = C(I - A). And, since (according to Lemma 10.1.2) A' is idempotent, it
follows from Theorem 17.6.14 that A' is the projection matrix for C(A') along
N(A' ). Moreover, making use of Corollary 11.7.2 and of Lemma 12.5.2, we find
that
C(A') = N(I - A') = Cl.(1 - A) = Vl.
and that
EXERCISE 16. Show that, for any n x p matrix X, XX- is the projection matrix
forC(X) alongN(XX-).
Solution. Suppose that UI, ... ,Uk are independent and that UI + ... + Uk = V.
(a) It follows from the very definition of a sum (of subspaces) that there exist
matrices ZI, ... , Zk in UI, ... ,Uk. respectively, such that Y = ZI + ... + Zk. For
purposes of establishing the uniqueness of Z" ... , Zk. let Zr, ... , Zk represent
matrices (potentially different from Z" ... , Zk) in UI, ... ,Uk, respectively, such
that Y = Zr + ... + ZZ. Then,
EXERCISE 18. Let U and W represent essentially disjoint subspaces (of R n X,)
whose sum is R nx I, and let U represent any n x s matrix such that C(U) = U and
W any n x t matrix such that C(W) = W.
(a) Show that the n x (s + t) partitioned matrix (U, W) has a right inverse.
(b) Taking R to be an arbitrary right inverse of (U, W) and partitioning R as
R = (:~) (where RI has s rows), show that the projection matrix for U along
W equals URI and that the projection matrix for W along U equals WR2.
Solution. (a) In light of result (1.4), we have that
Si rows), and let H = B'B or (more generally) let H represent any matrix of the
form
(E.3)
Solution. (a) Clearly, dim(UI + ... + Uk) = dim(Rn x I) = n. Thus, making use
of Part (g)-( 1) of Exercise 6, we find that
(b) Clearly, H = B' diag(AI, ... , Ak)B. Thus, since (according to Lemma
14.8.3) diag(AI, ... , Ak) is positive definite, it follows from Corollary 14.2.10
that H is positive definite.
(c) In light of Lemma 14.12.1, it suffices to show that (for j i= i)U;HUj = O.
178 17. Intersections and Sums of Subspaces
By definition,
BIUI BIU2
... BIUk)
( B2UI B2U2 ... B2Uk
·· .. .
.
.
.
· .
BkUI BkU2 ... BkUk
(d) (1) According to Part (c), Ui is orthogonal to UI, ... ,Ui-I, Ui+I, ... , Uk.
Thus, making repeated (k - 2 times) use of Part (a)-(2) of Exercise 3, we find that
Uj is orthogonal to UI + ... +Ui-I +UHI + ... +Uk. We conclude (on the basis
of Lemma 17.7.2) that UI + ... +Ui-I +UHI + ... +Uk = U/".
(2) That the projection of yon Uj along Ul + ... + Ui-l + Ui+l + ... + Uk
equals the orthogonal projection of y on Ui (with respect to H) is [in light of Part
(1)] evident from Theorem 17.6.6.
(e) Suppose that, for j t= i = 1, ... , k, Uj and Uj are orthogonal with respect to
some symmetric positive definite matrix H*. Then, according to Corollary 14.3.13,
there exists an n x n nonsingular matrix P such that H* = P'P. Further,
I~ ~I = I~ ~I = ITIIW - VT-1UI.
or equivalently
IR + STUI = IRIIT + TUR-ISTI/ITI.
Similarly, by applying formula (*) to the right side of equality (S.2) [i.e., by
applying formula (*) with 1m and TU in place of T and U, respectively]' we find
that
III = 1 = IAI.
Thus, it follows from Corollary 18.1.7 (specifically from the special case of Corol-
lary 18.1.7 where C = I) that 1= A.
ICI::: IC-BI,
18. Sums (and Differences) of Matrices 181
A= (~, ~).
where T is of dimensions m x m and W of dimensions n x n (and where U is of
dimensionsm xn). And, defineQ = W - U'T-U (which is the Schur complement
ofT).
(a) Using the result that the symmetry and nonnegative definiteness of A imply
the nonnegative definiteness of Q and the result of Exercise 14.33 (or otherwise),
show that
Solution. (a) According to the result of Exercise 14.33, U'T-U is symmetric and
nonnegative definite. Further, W is symmetric. And, in light of the result that the
symmetry and nonnegative definiteness of A imply the nonnegative definiteness of
Q = W - U'T-U [a result that is implicit in Parts (1) and (2) of Theorem 14.8.4],
it follows from Corollary 18.1.8 that
and
IWIITI = IUl 2 ¢> IWI = IU'T-1UI.
Moreover, in light of Theorem 8.5.10,
EXERCISE 6. Show that, for any n x p matrix X and any symmetric positive
definite matrix W,
(E.1)
[Hint. Begin by showing that the matrices X'X(X'WX)-X'X and X'W-1X -
X'X(X'WX)-X'X are symmetric and nonnegative definite.]
Solution. Let A = X'X(X'WX)-X'X and C = X'W-1X. Then, making use of
Part (6') of Theorem 14.12.11, we find that
A = X'W-lWPx.wW-lX = X'W-lp~.wWPx.wW-lX
= (Px.wW-lX)'W(Px.wW-lX),
C- A = X'W-1W(I - px.w)W-1X
= X, W- l ' 1
(I - px.w) WeI - px.w)W- X
= [(I - px.w)W-1X]'W[(I- px.w)W-1X],
(S.3)
Ifrank(X) = p, then (in light of Theorem 14.2.9 and Lemma 14.9.1) IX'WXI
> 0 and (in light of Theorems 13.3.4 and 13.3.7)
18. Sums (and Differences) of Matrices 183
IA + BI 2: IAI,
II+q 2: 1,
with equality holding if and only if C = 0 or equivalently if and only ifB = O. The
proof is complete upon observing that, since IA + BI = IPI 2 1I + q and IAI = IPI 2
184 18. Sums (and Differences) of Matrices
(b) To what does the result of Part (a) simplify in the special case where R = In?
Solution. (a) It follows from Theorem 4.4.8 that there exist n-dimensional column
vectors sand u such that B = su'. Then, as a consequence of Corollary 18.2.10,
we find that R + B is nonsingular if and only if u' R -I S =1= -1. Moreover, upon
applying result (5.2.6) (with b = u and a = R-1s), we obtain
(R + B)-I = R- 1 - (1 + u'R-1s)R-1su'R- 1
= R- 1 - [1 + tr(R-1B)rIR-IBR- I.
(b) In the special case where R = In, the result of Part (a) can be restated as
follows: In + B is nonsingular if and only if tr(B) =1= -1, in which case
R+STU = R+ (ST)lmU.
Then, applying the cited result (or equivalently Theorem 18.2.8) with ST and 1m
in place of Sand T, respectively, we find that R + STU is nonsingular if and only
ifIm + UR- 1ST is nonsingular, in which case
R + STU = R + Slm(TU).
Then, applying the cited result (or equivalently Theorem 18.2.8) with 1m and TV
in place of T and U, respectively, we find that R + STU is nonsingular if and only
if 1m + TUR -1 S is nonsingular, in which case
R- - R-ST(l p + UR-ST)-UR-
R + STU = R + (ST)lpU
and also as
R + STU = R + Slm(TU).
Suppose now that R(STU) C R(R) andC(STU) C C(R). Then, upon applying
Theorem 18.2.14 with ST and Ip in place of Sand T, respectively, we find that
and R(A12) C R(A22) and for any generalized inverse (~~: ~~~) of A (where
CII is of the same dimensions as A;I)' CII is a generalized inverse of the matrix
All - A 12 A 22 A21.
(b)LetER = I-RR-,FR = I-R-R,X = ERST, Y = TUFR,Ey = I-YY-,
Fx = 1- X-X, Q = T+TUR-ST, Z = EyQFx, andQ* = FxZ-Ey. Use the
result of Part (a) of Exercise 10.10 to show that the matrix
(b) Let G represent the generalized inverse of the matrix (T~ -~T) obtained
by applying formula (1O.E.I). Partition Gas G = (~~: ~~~) (where Gil is of
dimensions q x n), and assume that [in applying formula (lO.E.l)] the generalized
inverse of -X is set equal to -X- [in which case Fx = 1- (-X-)( -X)]. Then,
GIl equals the matrix (E.2), and we conclude on the basis of Part (a) (of the current
exercise) that the matrix (E.2) is a generalized inverse of the matrix R + STU.
(c) Suppose that R(TU) c R(R) and C(ST) c C(R). Then, it follows from
Lemma 9.3.5 that X = OandY = o(sothatFx = landEy = I and consequently
18. Sums (and Differences) of Matrices 187
EXERCISE 12. Let AI, A2, ... represent a sequence of m x n matrices, and let
A represent another m x n matrix.
(a) Using the result of Exercise 6.1 (i.e., the triangle inequality), show that if
IIAk-AIi -+ 0, then IIAkll -+ IIAII.
(b) Show that if Ak -+ A, then IIAkll -+ IIAII (where the nonns are the usual
nonns).
Solution. (a) Making use of the triangle inequality, we find that
and that
Thus,
and
implying that
I IIAk II - IIAII I ~ IIAk - All·
Suppose now that IIAk - All -+ 0. Then, corresponding to each positive scaler
E, there exists a positive integer p such that, for k > p, IIAk - All < E and hence
such that, for k > p, IIIAkll - IIAIII < E. We conclude that IIAkll -+ IIAII.
(b) In light of Lemma 18.2.20, Part (b) follows from Part (a).
(I - A)-I - Sk = (p-+oo
lim Sp) - Sk = p-+oo
lim (Sp - Sd
LAm,
p
= lim
p-+oo m=k+1
188 18. Sums (and Differences) of Matrices
(SA)
Moreover, making repeated use of the result of Exercise 6.1 (i.e., of the triangle
inequality) and of Lemma 18.2.21, we find that (for p ::: k + I)
p p p p-k-I
II L Amll S L IIAmll S L IIAli m = IIAIlk+1 L IIAlim. (S.5)
m=k+ I m=k+ I m=k+ I m=O
It follows from a basic result on geometric series [which is example 34.8(c) in
Bartle's (1976) book] that L~=o IIAli m = 1/(1 - IIAII). Thus, combining result
(S.5) with result (SA), we find that
= IIAIlk+1 L
00
Solution. Suppose that IIFII < 1. Then, since B - A = B(I - F) (and since B - A
is nonsingular), I - F is nonsingular, and
(B - A)-I = (I - F)-IB- I.
Thus, making use of Lemma 18.2.21 and the result of Exercise 13, we find that
EXERCISE 17. Let Ai, ... , Ak represent n x n symmetric matrices, and de-
fine A = Ai + ... + Ak. Suppose that A is idempotent. Suppose further that
190 18. Sums (and Differences) of Matrices
A I, ... , Ak-I are idempotent and that Ak is nonnegative definite. Using the result
of Exercise 16 (or otherwise), show that AiAj = 0 (for j "I i = 1, ... , k), that
Ak is idempotent, and that rank(Ak) = rank(A) - L7::l rank(Ai).
so that
and hence
L tr(AiAj) = tr(A) - Ltr(A;). (S.9)
i,di
Suppose now that AI, ... , Ak are nonnegative definite and also that tr(A) S
Li tr(A;). Then, it follows from result (S.9) that
L tr(AiAj) SO.
i,di
And, since (according to Corollary 14.7.7, which is the result cited in the hint)
tr(AiAj) ::: 0 (for all i and j "I i), we have that tr(AiAj) = 0 (for all i and
j "I i). We conclude (on the basis of Corollary 14.7.7) that A;Aj = 0 (for all j
18. Sums (and Differences) of Matrices 191
and j f. i). And, in light of Theorem 18.4.1 (and the symmetry of AI, ... ,Ak),
we further conclude that A I, ... , Ak are idempotent.
EXERCISE 19. Let AI, ... , Ak represent n x n symmetric matrices such that
Al + ... + Ak = I. Show that if rank(Ad + ... + rank (Ad = n, then, for
any (strictly) positive scalars CI, ... , q, the matrix clAI + ... + qAk is positive
definite.
Solution. Suppose that rank(Ad + ... + rank(Ad = n. Then, it follows from
Theorem 18.4.5 that AI, ... , Ak are idempotent. Thus,
nonnegative definite. Thus, it follows from Corollary 14.2.5 that col + 2:7=1 CiAi
is positive definite (and hence, in light of Lemma 14.2.8, nonsingu1ar).
That (col + 2:7=1 CiAi)-1 = dol + 2:7=1 diAi is clear upon observing that
= (A - LAi)(A - LAi)
(A - LAi)'(A - LAi)
i i i i
= A2 - LAiA- LAAi + LAf+ L AiAj
i i i iJf=i
=A- LAi - LAi + LAi +0
i i i
=A-LAi.
i
Thus,
or equivalently that
rank(BA) + rank(1 - BA) = n.
Thus, it follows from Lemma 18.4.2 that BA is idempotent. Further,
implying that rank(BA) = rank(A). We conclude (on the basis of Theorem 10.2.7)
that B is a generalized inverse of A.
implying [in light of result (4.5.5)] that C(A + B) = C(A, B) and (in light of
Theorem 17.2.4) that C(A) and C(B) are essentially disjoint. Thus, in light of
result (17.1.4), it follows that
projection matrix for V along U, it follows from Part (a) that (I - A) + B is the
projection matrix for some subspace £* along some subspace M* if and only if
B(I - A) = (I - A)B = 0,
BA=AB=B,
EXERCISE 24. (a) Let B represent an n x n symmetric matrix, and let W represent
an n x n symmetric nonnegative definite matrix. Show that
(SB)'SBWBW = (SB)'SBW,
Thus, it follows from Lemma 5.3.1 that SBS' - SBWBS' = 0, or equivalently that
SBWBS' = SBS', so that
(b) Suppose that AI, ... , Ak are symmetric (in which case A is also symmetric).
Then, applying the results of Part (a) (with A in place of B and V in place of W),
we find that
Similarly, applying the results of Part (a) (with Ai in place of B and V in place of
W), we find that
R
( TU -ST)( I 0) _ (R + STU -ST)
T -U I - 0 T
18. Sums (and Differences) of Matrices 197
and making use of Lemma 8.5.2 and Corollary 9.6.2, we find that
R
rank(TU -ST)
T _- r
ank(R
O + STU
T -ST) = rankeR + STU) + rank(T)
and hence that
Or, alternatively, equality (E.3) can be validated by making use of result (9.6.1)
- observing that C(TU) c C(T) and 'R( -ST) c 'R(T), we find that
=
(b) Upon observing thatrank(X) rank( -X), that -X- is a generalized inverse
of -X, and that Fx = 1- (-X-)( -X), it follows from the result of Part (b) of
Exercise 10.10 that
R
rank ( TU -ST)
T = rankeR) + rank (X) + rank(Y) + rank(Z).
We conclude, on the basis of Part (a) (of the current exercise) that
EXERCISE 27. Show that, for any n x n symmetric nonnegative definite matrices
A andB,
C(A +B) = C(A, B), R(A + B) = R(:),
rank(A + B) = rank(A, B) = rank ( :).
Solution. According to Corollary 14.3.8, there exist matrices Rand S such that
A = R'R and B = S'S. And, upon observing that
A+B = (~)(~)
and recalling Corollaries 7.4.5 and 4.5.6, we find that
(b) Show that (1) ifR(A) and R(B) are essentially disjoint, then C(A) c C(A +
B) and (2) if C(A) and C(B) are essentially disjoint, then R(A) c R(A + B).
Solution. (a)(l) Suppose that rank (A , B) = rank(A+B). Then, since (according to
Lemma 4.5.8) C(A + B) c C(A, B), it follows from Theorem 4.4.6 that C(A, B) =
C(A + B). Then, since C(A) C C(A, B), we have that C(A) C C(A + B).
Conversely, suppose that C(A) C C(A + B). Then, according to Lemma 4.2.2,
there exists a matrix F such that A = (A + B)F. Further, B = (A + B) - A =
18. Sums (and Differences) of Matrices 199
(1) If R(A) and R(B) are essentially disjoint (or equivalently if d = 0), then
(as a consequence of Theorem 18.5.6) rank(H) = 0 = d, implying [in light of
result (5.14)] that
rank (A + B) = rank(A, B),
and hence [in light of part (a)-(l)] that C(A) c C(A + B).
(2) Similarly, if C(A and C(B) are essentially disjoint (or equivalently if c = 0),
then (as a consequence of Theorem 18.5.6) rank(H) = 0 = c, implying [in light
of result (5.18)] that
rank(A + B) = rank(:)
and hence [in light of Part (a)-(2)] that R(A) c R(A + B).
EXERCISE 29. Let A and B represent m x n matrices. Show that each of the
following five conditions is necessary and sufficient for rank additivity [i.e., for
rank(A + B) = rank(A) + rank(B)]:
where
(b) Show that A and B are rank subtractive [in the sense that rank(A - B) =
rank(A) - rank(B)] if and only ifrank(A, B) = rank(~) = rank(A) and D = O.
(c) Show that if rank(A, B) = rank(~) = rank(A), then (1) (A -,0) and
( : -) are generalized inverses of (~) and (A, B), respectively, and (2) for
(e) Using the result of Exercise 29 (or otherwise), show that rank (A - B) =
rank(A) - rank(B) if and only if rank (A - B) = rank[A(I - B-B)] = rank[(I-
BB-)A].
Solution. (a) Clearly,
Thus, it follows from Parts (1) and (2) of Lemma 9.2.4 that ( A)-(I
B Om 0
-1m )-1
is a generalized inverse of (_:) and that (~n o )-1(A, B) -
_ In is a generalized
inverse of (A, -B). Further,
Since all three terms of the left side of equality (S.15) are nonnegative, we conclude
that rank(A - B) = rank(A) - rank(B) if and only if rank(A, B) - rank(A) =
0, rank(!) - rank(A) = 0, and rank (H) = 0 or, equivalently, if and only if
rank(A, B) = rank(!) = rank(A) and H = O.
202 18. Sums (and Differences) of Matrices
and (2) upon setting (~r and (A, B) - equal to (A - , 0) and (: -), respectively,
we obtain
(d) (1) Suppose that rank(A, B) = rank (~) = rank (A) and that BA-B = B.
Then, it follows from Part (c) that (A - ,0) and (: -) are generalized inverses
of (~) and (A, B), respectively, and that, for (~r = (A -,0) and (A, B)- =
of (~r and (A, B)-]. And, observing [in light of Part (c)] that (A -,0) and
(:-) are generalized inverses of (~) and (A, B), respectively, and that, for
EXERCISE 31. Let AI, ... , Ak represent m x n matrices. Adopting the termi-
nology of Exercise 17.6, use Part (a) of that exercise to show that if R(Ad, ... ,
R(Ak) are independent and C(Ad, ... , C(Ak) are independent, then rank(AI +
... + Ak) = rank(Ad + ... + rank(Ak).
Solution. Suppose that R(Ad, ... , R(Ak) are independent and also that C(AI),
... , C(Ak) are independent. Then, as a consequence of Part (a) of Exercise 17.6,
we have that, for i = 2, ... , k, R(A;) and R(Ad + ... + R(A;-d are essentially
disjoint and C(A;) and C(AI) + ... + C(A;-d are essentially disjoint or equiva-
lently [in light of results (17.1.7) and (17.1.6)] that, for i = 2, ... , k, R(A;) and
R (~I ) are essentially disjoint and C(A;) and C(AI, ... ,A;-d are essentially
A;_I
disjoint. Moreover, it follows from results (4.5.9) and (4.5.8) that (for i = 2, ... ,k)
Thus, for i = 2, ... , k, R(A;) and R(A I + ... + Ai -I) are essentially disjoint
and C(Ai) and C(AI + ... + Ai-I) are essentially disjoint, implying (in light of
Theorem 18.5.7) that (for i = 2, ... , k)
(E.4)
= (~ ~).
18. Sums (and Differences) of Matrices 205
Further,
(~ ~) = (! ~) + (~ ~).
And, since n(!) = neT) and n(~) = n(Y) and since (according to Corol-
lary 17.2.8) neT) and n(Y) are essentially disjoint, n(!) and n (~) are es-
sentially disjoint, and hence it follows from Corollary 17.2.16 that n(! ~)
and n(~ ~) are essentially disjoint. Similarly, c(! ~) and c(~ ~) are
essentially disjoint. Thus, making use of Theorem 18.5.7, we find that
(0 X) (I T-U) _(0 X)
YK 01 -YQ
and hence that
rank(~ ~) = rank(~ ~). (S.18)
(S.19)
Moreover, applying result (E.4) (in the special case where W = 0) and again
making use of Lemma 8.5.1, we find that
0
rank ( U V)
T = rank(T) + rank (-VT-U
X Y)
O·
(c) Applying result (* ) [or equivalently result (17.2.15), which is part of Theorem
17.2.17] with -VT-U, Y, and X in place of A, B, and C, respectively (or T, U,
and V, respectively), we find that
rank (
-VT-U
X !) = rank(Y) + rank(X)
+rank[(1 - YY-)(-VT-U)(I - X-X)]
= rank(Y) + rank(X) + rank(EyVT-UFx). (S.21)
Solution. Upon observing that rank(A) = rank( -A), that -A-is a generalized
inverse of -A, and that
rank[(1 - AA -)XQ-Y(I - B-B)]
= rank[ -(I - AA -)XQ-Y(I - B-B)]
= rank{[1 - (-A)(-A-)](-X)Q-Y(I - B-B»),
18. Sums (and Differences) of Matrices 207
R -ST)
rank ( TU T = rankeR) + rank(Q) + rank(A) + rank(B)
+ rank[(I - AA-)XQ-Y(I - B-B)].
Thus, limk--doo !(kx) = -00, and hence there exists a scalar k* such that
!(k*x) < c, so that, for a* = k*x, !(a*) < c.
Or, suppose that b fj C(V). According to Theorem 12.5.11, C(V) contains a
(unique) vector b I and Cl- (V) [or equivalently N (V')] contains a (unique) vector
b2 such that b = bl + b2. Clearly, b2 f. 0 [since otherwise we would arrive at a
contradiction of the supposition that b fj C(V)]. Moreover, for any scalar k,
Thus, limk-->oo !(kb2) = -00, and hence there exists a scalar k* such that
!(k*b2) < c, so that, for a* = k*b2, !(a*) < c.
Solution. (1) According to Lemma 19.3.4, C(V, X) = C(V +XUX'), implying (in
light of Lemma 4.5.1) that C(V) c C(V + XUX') and hence (in light of Lemma
9.3.5) that (V + XUX')(V + XUX')-V = V.
(2) According to Lemma 19.3.4, C(V, X) = C(V + XU'X'), implying that
C(V) c C(V + XU'X') and hence (in light of Lemma 4.2.5) that
R*=T*+UD
and
A* = WB - WXT* + [I - W(V + XUX')]L
for some solution T * to the (consistent) linear system
X'WXT = X'WB - D
H= (H"
H2l
H12)
H22 ' S = (SI)
S2 ' and Y = (YI)
Y2 '
and if
then the matrix Y* = (iD is a solution to the linear system HY = S if and only
if Y~ is a solution to the linear system
Solution. According to Lemma 19.3.2, A* and R* are the first and second parts
of a solution to linear system (*) [or equivalently linear system (3.14)] if and only
if A* and R* - UD are the first and second parts of a solution to the linear system
(S.l)
(in T) and
(V + XUX')A* + XT * = B. (S.3)
Note that linear system (S.2) is equivalent to linear system (**) [which is iden-
tical to linear system (3.17)]. Note also that condition (S.3) is equivalent to the
condition
(V + XUX')A* = B - XT*. (S.4)
And, since (in light of Lemma 19.3.4) C(B - XT*) c C(V + XUX'), it follows
from Theorem 11.2.4 that condition (S.4) is equivalent to the condition that
for some solution T* to linear system (**) [or equivalently linear system (3.17)]
and for some matrix L.
space of V. Show that C(X) c C(V + XUX') and that C(V) and C(XUX') are
essentially disjoint and R(V) and R(XUX') are essentially disjoint (even in the
absence of any assumption that V is nonnegative definite).
Solution. Making use of Corollaries 7.4.5, 4.5.6, and 17.2.14, we find that
Further,
C(XX'T) = C[(XX'T)(XX'T)'J = C(XUX').
Thus, again making use of Corollary 4.5.6, we have that
And, in light of Corollary 17.2.14, C(V) and C(XUX') are essentially disjoint.
Moreover, since XUX' is clearly symmetric, it follows from Lemma 17.2.1 that
R(V) and R(XUX') are also essentially disjoint.
Finally, in light of Theorem 18.5.6, it follows from result (5.13) or (5.14) that
rank (V + XUX') = rank(V, XUX'), so that rank (V + XUX') = rank(V, X) or
equivalently (in light of Lemma 19.3.4) C(X) c C(V + XUX').
(E.l)
for some p x n matrix R*. Then, X'H' = X'. And, VH' = X(-R*), implying
that C(VH') c C(X) and hence (in light of Corollaries 12.5.5 and 12.1.2) that
Z'VH'=O.
(b) In light of Part (a), it suffices to show that H' is the first part of a solution to
linear system (E.l) if and only if
(E.2)
and
X'WX(X'WX)-X' = X'. (S.6)
And, by following the same line of reasoning as in the latter part of the proof
of Theorem 19.5.1, we find that conditions (S.5) and (S.6) are equivalent to the
conditions that C(VWX) c C(X) and rank(X'WX) = rank(X).
W = XX-W(X')-X' = XUX',
where U = X-W(X')-. Thus,
We conclude that, for a such that X'a = d, g(a) differs from I(a) only by an
additive constant and hence that g(a) and I (a) attain their minimum values (under
the constraint X' a = d) at the same points.
and
rank(X'WX) = rank(X). (E.5)
19. Minimization of a Second-Degree Polynomial 215
Solution. It follows from Theorem 19.2.1 that a'Va - 2b' a has a minimum at
W(I - PX,w)f + WX(X'WXrd under the constraint X'a = d [where dE C(X')]
if and only if VW (I - Px,w)f + VWX(X'WX) - d + Xr = b for some vector rand
X'W (I - PX,w)f + X'WX(X'WX) - d = d, or equivalently if and only if VW (I -
PX,w)f-b+VWX(X'WX)-d E C(X)andX'W(I-Px.w)f+X'WX(X'WX)-d =
d. Thus, forW(I-Px.w)f+ WX(X'WX)-d to be a solution, for every dE C(X'),
to the problem of minimizing a'Va - 2b' a subject to X' a = d, it is necesary and
sufficient that, for every n x I vector u, VW (I - Px.w)f - b+ VWX(X'WX) -X'u E
C(X) and X'W (I - PX,w)f+ X'WX(X'WX) -X' u = X' u, a requirement equivalent
to a requirement that
X'WPx,w = X'WX(X'WX)-X'W,
and
and
X'W(I - PX,w)f + X'WX(X'WX)-X'u = X'u. (S.Il)
Then, since conditions (S.lO) and (S.ll) are satisfied in particular for u = 0, we
have that
VW(I - PX,w)f - bE C(X)
and
X'W(I - Px.w)f = O.
Further, for every u, VWX(X'WX)-X'u E C(X) and X'WX(X'WX)-X'u =
X'u, implying that
C[VWX(X'WX)-X'] c C(X)
and
X'WX(X'WX)-X' = X'.
216 19. Minimization of a Second-Degree Polynomial
Solution. Since (V- I PX,w,)' = P~,WI V-I, V-I PX,w' is symmetric if and only if
V-Ipx,w ' = P~,wIV-I. Moreover, ifV-1px,w' = P~,wIV-I, then
-I I _I I
PX,w,V = V(V PX,w')V = V(Pxw'V
, )V = VPXW"
,
19. Minimization of a Second-Degree Polynomial 217
V-I Px,w' = V-I (Px,w' V) V-I = V-I (VP~,w') V-I = P~,w' V-I.
Thus, V-Ipx,w' is symmetric if and only if PX,W'V = VP~,w" and the neces-
sity and sufficiency of V-I Px,w' being symmetric and rank(X'WX) = rank(X)
follows from Theorem 19.5.4.
To complete the proof, it suffices to show that (I - P~,w,)V-Ipx,w' = 0 and
rank(X'WX) = rank(X), or equivalently that V-Ipx.w' = P~,w,V-Ipx,w' and
rank(X'WX) = rank(X), if and only if V-I p x.w' = P~,w' V-I and rank(X'WX)
= rank(X).
Suppose that V-IpX,w' = P~,w,V-1 and rank(X'WX) = rank(X). Then, since
X'W'X = (X'WX)', rank(X'W'X) = rank(X), and it follows from Part (3) of
Lemma 19.5.5 that pi,w' = PX,w" Thus,
Thus,
C(V) nC(Px) c C(PxV). (S.12)
Now, suppose that X(X'X)-d is a solution, for every dE C(X'), to the problem
of minimizing a'Va subject to X'a = d. Then, Condition (c) is satisfied (Le.,
PxV = VPx ), implying [since, clearly, C(PxV) c C(P x ) and C(VPx ) c C(V)]
that
C(VPx ) c C(V) n C(Px )
and also [in light of result (S.l2)] that
Now, observe (in light of Corollary 14.3.8) that there exists a matrix C such that
V = CC'. Thus,
Moreover,
(I - p x .w' )CC'WX = 0,
so that
(I - px.w')CC'p~.w' = (I - px.w')CC'WX[(X'W'X)-]'X' = 0
and
px.w,CC'(1 - p~.w') = [(I - px.w')CC'p~.w']' = O.
We conclude that
, ' I '
V = pX.W'CC p x .w' + (I - px.w')CC (I - p x .w')
= XR,X' + (I - P x .w,)R2(1 - p x .w')"
220 19. Minimization of a Second-Degree Polynomial
where Rl = Sl and R2 = BS2B'. Thus, Conditions (a) and (b) are equivalent.
Assume now that W is nonsingular [and continue to assume that rank(X'WX)
= rank(X)]. And [for purposes of establishing the necessity of Condition (c)]
suppose that WX(X'WX)-d is a solution, for every dE C(X'), to the problem of
minimizing a'Va subject to X'a = d. Then, Condition (b) is satisfied, in which
case
v
= tW- 1 + XTIX' + KT2K',
where t = 0, Tl = Sl, and T2 = S2.
Conversely, suppose that Condition (c) is satisfied. Then, recalling that K =
(I - PX,w,)B for some matrix B, we find that
Thus, C(VWX) c C(X), and it follows from Theorem 19.5.1 that Condition (c)
is sufficient (as well as necessary) for WX(X'WX)-d to be a solution, for every
dE C(X'), to the problem of minimizing a'Va subject to X'a = d.
20
The Moore-Penrose Inverse
EXERCISE 1. Show that, for any m x n matrix B of full column rank and for
any n x p matrix C of full row rank,
(BC)+ = C+B+.
(BC) + = C+B+.
EXERCISE 2. Show that, for any m x n matrix A, A + = A' if and only if A' A
is idempotent.
Solution. Suppose that A' A is idempotent or equivalently that
A'A = A'AA'A. (S.l)
A+(A+)'A'AA+ = A+(A+)'A'AA'AA+.
Moreover,
A+(A+)'A'AA+ = A+(AA+)'AA+ = A+AA+AA+ = A+AA+ = A+,
222 20. The Moore-Penrose Inverse
and
A+(A+)'A'AA'AA + = A+(AA +)'AA'(AA +)'
= A+AA+A(AA+A)'
= A+AA' = (A+A)'A' = (AA+A)' = A'.
Thus, A+ = A'.
Conversely, suppose that A+ = A'. Then, clearly, A' A = A+ A, implying (in
light of Lemma 10.2.5) that A' A is idempotent.
( T- + T-UQ-VT- -T-UQ-)(T U)
-Q-VT- Q- V W
(
T- + T-UQ-VT-
x -Q-VT-
= (T- + T-UQ-VT- - T-UQ-)( TT-
-Q-VT- Q- (I - QQ-)VT-
Now, if the generalized inverses T- and Q- are both reflexive (i.e., if T-TT-
= T- and Q-QQ- = Q-), then partitioned matrix (S.2) simplifies to partitioned
20. The Moore-Penrose Inverse 223
matrix (*). We conclude that if the generalized inverses T- and Q- are both reflex-
ive [and if if C(U) c C(T) and R(V) c R(T)], then the generalized inverse (*)
of (~ ~) [or equivalently the generalized inverse given by expression (9.6.2)]
is reflexive. And, it can be shown in similar fashion that if the generalized inverses
T- and Q- are both reflexive [and if if C(U) c C(T) and R(V) c R(T)], then
the generalized inverse (**) of (~ ~) [or equivalently the generalized inverse
given by expression (9.6.3)] is reflexive.
EXERCISE 6. (a) Show that, for any m × n matrices A and B such that A'B = 0 and BA' = 0, (A + B)+ = A+ + B+.
(b) Let A₁, A₂, ..., A_k represent m × n matrices such that, for j > i = 1, ..., k − 1, A_i'A_j = 0 and A_jA_i' = 0. Generalize the result of Part (a) by showing that (A₁ + A₂ + ⋯ + A_k)+ = A₁+ + A₂+ + ⋯ + A_k+.
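Before the formal solution, a numerical sanity check of Part (a), assuming NumPy; A and B are built with mutually orthogonal column spaces and mutually orthogonal row spaces so that A'B = 0 and BA' = 0:

```python
import numpy as np

rng = np.random.default_rng(3)
P, _ = np.linalg.qr(rng.standard_normal((6, 6)))
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
# Orthogonal column spaces and orthogonal row spaces give A'B = 0, BA' = 0:
A = P[:, :2] @ rng.standard_normal((2, 2)) @ Q[:, :2].T
B = P[:, 2:4] @ rng.standard_normal((2, 2)) @ Q[:, 2:4].T

print(np.allclose(A.T @ B, 0), np.allclose(B @ A.T, 0))    # True True
print(np.allclose(np.linalg.pinv(A + B),
                  np.linalg.pinv(A) + np.linalg.pinv(B)))  # True
```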
Solution. (a) Let X represent any n × m matrix such that (A + B)'(A + B)X = (A + B)' and Y any m × n matrix such that (A + B)(A + B)'Y = A + B. Then, since B'A = (A'B)' = 0 and AB' = (BA')' = 0, we have that
Thus, upon observing that C(A') = C(A'A), C(B') = C(B'B), C(A) = C(AA'),
and C(B) = C(BB'), it follows from Theorem 18.2.7 that
and that
Moreover, in light of Lemma 14.11.2 and the result of Exercise 8, we have that
rank(V+X) ≥ rank(VV+X) = rank(V+VX) ≥ rank(X'VV+VX) = rank(X'VX) = rank(VX),
for any matrix T of full row rank (and with n columns) such that A = T'T. Thus, A+ is symmetric and (in light of Corollary 14.2.14) nonnegative definite. And, since (T+)' has n columns and since [in light of Part (1) of Theorem 20.5.1] rank[(T+)'] = rank(T+) = rank(T), it follows from Corollary 14.2.14 that A+ is positive semidefinite if rank(T) < n or equivalently if A is positive semidefinite and that A+ is positive definite if rank(T) = n or equivalently if A is positive definite.
and
(CB)+ = (CB)'[CB(CB)']+ = B'C'[CB(CB)']+.
Thus,
and
for some n × 1 vector w. Do so by, for instance, using the results of Exercise 11 in combination with the result that, for any n × k matrix Z whose columns span N(X'), f(a) attains its minimum value (subject to the constraint X'a = d) at a point a* if and only if
We conclude that f(a) attains its minimum value (under the constraint X'a = d) at a point a* if and only if
R(Q) ⊂ R(P)
and hence that there exists a matrix K (having r columns) such that
Q = KP.
Thus,
B − A = P'P − Q'Q = P'(I − K'K)P.
Moreover, according to Lemma 8.1.1, P has a right inverse R, so that
implying (since clearly rank(K) ≤ r) that rank(K) = r and hence (in light of Corollary 14.2.14) that K'K is positive definite. Thus, it follows from one of the cited results (a result encompassed in Theorem 18.3.4) that (K'K)⁻¹ − I is nonnegative definite. Moreover, upon observing that A = P'(K'K)P, it follows from the result of Exercise 1 that
and from another of the cited results (a result covered by Theorem 20.4.5) that
so that
A+ − B+ = P+[(K'K)⁻¹ − I](P+)'.
And, in light of Theorem 14.2.9, we conclude that A+ − B+ is nonnegative definite.
Conversely, suppose that A+ − B+ is nonnegative definite. Then, it follows from the result of Exercise 18.15 that C(B+) ⊂ C(A+), implying that rank(B+) ≤ rank(A+) and hence [in light of Part (1) of Theorem 20.5.1] that rank(B) ≤ rank(A). Since [in light of result (S.5)] rank(A) ≤ rank(B), we conclude that rank(A) = rank(B).
21
Eigenvalues and Eigenvectors
0 ≤ λ² = −x'A'Ax/x'x ≤ 0,
where K₀, K₁, K₂, ..., K_{n−1} are n × n matrices (that do not vary with λ).
(b) Letting T₀ = AK₀, T_n = −K_{n−1}, and (for s = 1, ..., n − 1) T_s = AK_s − K_{s−1}, show that (for λ ∈ R)
[Hint. It follows from a fundamental result on adjoint matrices that (for λ ∈ R)
B(λ)H(λ) = |B(λ)|I_n = p(λ)I_n.]
(c) Show that, for s = 0, 1, ..., n, T_s = c_sI.
(d) Show that
Solution. (a) Let h_ij(λ) represent the ijth element of H(λ). Then, h_ij(λ) is the cofactor of the jith element of B(λ). And, it is apparent from the definition of a
cofactor and from the definition of a determinant [given by formula (13.1.2)] that h_ij(λ) is a polynomial (in λ) of degree n − 1 or n − 2. Thus,
for some scalars k_ij^(0), k_ij^(1), k_ij^(2), ..., k_ij^(n−1) (that do not vary with λ). And, it follows that
H(λ) = K₀ + λK₁ + λ²K₂ + ⋯ + λ^{n−1}K_{n−1},
where (for s = 0, 1, 2, ..., n − 1) K_s is the n × n matrix whose ijth element is k_ij^(s).
(b) In light of Part (a), we have that (for λ ∈ R)
Consequently,
t_ij^(s) = c_s, if j = i, and t_ij^(s) = 0, if j ≠ i.
T₀ + λT₁ + λ²T₂ + ⋯ + λⁿT_n
= c₀I + λ(c₁I) + λ²(c₂I) + ⋯ + λⁿ(c_nI) = p(λ)I.
Moreover,
EXERCISE 4. Let c₀, c₁, ..., c_{n−1}, c_n represent the respective coefficients of the characteristic polynomial p(λ) of an n × n matrix A [so that p(λ) = c₀ + c₁λ + ⋯ + c_{n−1}λ^{n−1} + c_nλⁿ (for λ ∈ R)]. Using the result of Exercise 3 (the Cayley-Hamilton theorem), show that if A is nonsingular, then c₀ ≠ 0, and
A⁻¹ = −(1/c₀)(c₁I + c₂A + ⋯ + c_nA^{n−1}).
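A numerical check of this formula, assuming NumPy; note that np.poly returns the coefficients of det(λI − A) (highest degree first), which differs from the book's p(λ) = |A − λI| only by the factor (−1)ⁿ, a common factor that cancels in the formula:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4
A = rng.standard_normal((n, n))      # nonsingular with probability 1

c = np.poly(A)                       # coefficients of det(lambda*I - A),
                                     # highest degree first: c[0] = 1
# Cayley-Hamilton: c[0]A^n + c[1]A^{n-1} + ... + c[n]I = 0, and c[n] != 0
# for nonsingular A, so:
Ainv = -sum(c[k] * np.linalg.matrix_power(A, n - 1 - k)
            for k in range(n)) / c[n]
print(np.allclose(Ainv, np.linalg.inv(A)))   # True
```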
EXERCISE 7. Let A = (1, 0; 0, 1) and B = (1, 1; 0, 1). Show that B has the same rank, determinant, trace, and characteristic polynomial as A, but that, nevertheless, B is not similar to A.
CB = (c₁₁, c₁₁ + c₁₂; c₂₁, c₂₁ + c₂₂) and AC = C = (c₁₁, c₁₂; c₂₁, c₂₂),
so that, if CB = AC, then c₁₁ + c₁₂ = c₁₂ and c₂₁ + c₂₂ = c₂₂, implying that c₁₁ = 0 and c₂₁ = 0 and hence that C is singular.
Thus, there exists no 2 × 2 nonsingular matrix C such that CB = AC. And, we conclude that B is not similar to A.
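A NumPy sketch confirming both halves of the exercise: the four invariants agree, and every solution C of CB = AC has a zero first column and hence is singular:

```python
import numpy as np

A = np.eye(2)
B = np.array([[1.0, 1.0], [0.0, 1.0]])

# A and B share rank, determinant, trace, and characteristic polynomial:
for M in (A, B):
    print(np.linalg.matrix_rank(M), np.linalg.det(M), np.trace(M), np.poly(M))

# CB = AC is the linear system (B' kron I - I kron A) vec(C) = 0
# (column-stacking vec); its solution space contains only singular C:
M = np.kron(B.T, np.eye(2)) - np.kron(np.eye(2), A)
_, s, Vt = np.linalg.svd(M)
for v in Vt[s < 1e-10]:              # basis of the solution space
    C = v.reshape(2, 2, order="F")   # un-vec
    print(np.allclose(C[:, 0], 0))   # True: first column is 0, so C is singular
```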
Solution. Suppose that A = In, and suppose that B is a triangular matrix, all of
whose diagonal elements equal 1. Then, in light of Lemma 13.1.1 and Corollary
8.5.6, B has the same determinant, rank, trace, and characteristic polynomial as A.
However, for B to be similar to A, it is necessary (and sufficient) that there exist an
n × n nonsingular matrix C such that B = C⁻¹AC or equivalently (since A = I_n) that B = I_n. Thus, it is only in the special case where B = I_n that B is similar to A.
EXERCISE 12. Let γ₁ represent the algebraic multiplicity and ν₁ the geometric multiplicity of 0 when 0 is regarded as an eigenvalue of an n × n (singular) matrix A. And let γ₂ represent the algebraic multiplicity and ν₂ the geometric multiplicity of 0 when 0 is regarded as an eigenvalue of A². Show that if ν₁ = γ₁, then ν₂ = γ₂ = ν₁.
Solution. For any n × 1 vector x such that Ax = 0, we find that A²x = AAx = A0 = 0. Thus,
N(A) ⊂ N(A²).   (S.2)
There exists an n × ν₁ matrix U whose columns form an orthonormal (with respect to the usual inner product) basis for N(A). Then, AU = 0 = U·0. And, it follows from Theorem 21.3.2 that there exists an n × (n − ν₁) matrix V such that the n × n matrix (U, V) is orthogonal and, taking V to be any such matrix, that
(U, V)'A(U, V) = (0, U'AV; 0, V'AV),
so that
(U, V)'A²(U, V) = (0, U'AV; 0, V'AV)² = (0, U'AVV'AV; 0, (V'AV)²),
EXERCISE 14. Let A represent an n × n matrix, and suppose that there exists an n × n nonsingular matrix Q such that Q⁻¹AQ = D for some diagonal matrix D = {d_j}. Further, for i = 1, ..., n, let r_i' represent the ith row of Q⁻¹. Show (a) that A' is diagonalized by (Q⁻¹)', (b) that the diagonal elements of D are the (not necessarily distinct) eigenvalues of A', and (c) that r₁, ..., r_n are eigenvectors of A' (with r_j corresponding to the eigenvalue d_j).
Solution. The validity of Part (a) is evident upon observing that
[(Q⁻¹)']⁻¹A'(Q⁻¹)' = Q'A'(Q')⁻¹ = (Q⁻¹AQ)' = D' = D.
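A numerical illustration of Parts (a) and (c), assuming NumPy and a randomly chosen diagonalizable A:

```python
import numpy as np

rng = np.random.default_rng(7)
Q = rng.standard_normal((4, 4))          # nonsingular with probability 1
d = np.array([2.0, -1.0, 0.5, 3.0])
A = Q @ np.diag(d) @ np.linalg.inv(Q)    # Q^{-1} A Q = D = diag(d)

Qi = np.linalg.inv(Q)
# (a) A' is diagonalized by (Q^{-1})':
print(np.allclose(np.linalg.inv(Qi.T) @ A.T @ Qi.T, np.diag(d)))  # True
# (c) the rows r_i' of Q^{-1} are eigenvectors of A':
for i in range(4):
    print(np.allclose(A.T @ Qi[i], d[i] * Qi[i]))                 # True
```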
Solution. Let ν₁, ..., ν_k represent the geometric multiplicities of λ₁, ..., λ_k, respectively. Then, according to result (1.1), ν_i = n − rank(A − λ_iI) (i = 1, ..., k). Thus,
Moreover, since the columns of R_m are linearly independent and the columns of Q_m are linearly independent, y₂ = 0 ⇔ R_my₂ = 0 (or equivalently y₂ ≠ 0 ⇔ R_my₂ ≠ 0) and y₁ = 0 ⇔ Q_my₁ = 0. Thus,
Q_m'x = 0 ⇔ y₁ = 0 ⇔ x = R_my₂.
It follows that x ∈ S_m if and only if x = R_my₂ for some (n − m + 1) × 1 nonnull vector y₂.
It is now clear (since R_m'R_m = I) that
Show that r = k and that there exists a permutation t₁, ..., t_r of the first r positive integers such that (for j = 1, ..., r) τ_j = λ_{t_j} and F_j = E_{t_j}.
(2) Observe that (for j = 1, ..., k) E_j = Q_jQ_j' and Q_j'Q_j = I. Then, clearly, E_j is symmetric. And, Q_j is nonnull, implying (in light of Corollary 5.3.2) that E_j is nonnull. Further,
so that E_j is idempotent.
(3) and (4) In light of Theorem 18.4.1, Parts (3) and (4) follow from Parts (1)
and (2).
(b) Observe (in light of Theorem 18.4.1) that, for t ≠ j = 1, ..., r, F_tF_j = 0. Then, for j = 1, ..., r, we find that
C(E_{t_j}) = C(Q_{t_j}Q_{t_j}') = C(Q_{t_j}) = N(A − λ_{t_j}I) = N(A − τ_jI),
implying [in light of Part (a) and equality (5.5)] that (for j = 1, ..., r)
EXERCISE 19. Let A represent an n × n symmetric matrix, and let d₁, ..., d_n represent the (not-necessarily-distinct) eigenvalues of A. Show that lim_{k→∞} A^k = 0 if and only if, for i = 1, ..., n, |d_i| < 1.
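A numerical illustration, assuming NumPy; the same orthogonal eigenvector matrix is combined with eigenvalues inside and outside the unit interval:

```python
import numpy as np

rng = np.random.default_rng(8)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))   # orthonormal eigenvectors

for eigs in ([0.9, -0.5, 0.3, 0.1], [1.1, -0.5, 0.3, 0.1]):
    A = Q @ np.diag(eigs) @ Q.T                    # symmetric, eigenvalues eigs
    Ak = np.linalg.matrix_power(A, 200)
    # A^k -> 0 exactly when every |d_i| < 1:
    print(max(abs(d) for d in eigs) < 1, np.abs(Ak).max())
```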
Since A and A+ are triangular, it is easy to see that the distinct eigenvalues of A are 0 and 1, while those of A+ are 0 and 1/n. Obviously, 1/n is not (for n ≥ 2) the reciprocal of 1.
EXERCISE 21. Show that, for any positive integer n that is divisible by 2, there
exists an n x n orthogonal matrix that has no eigenvalues. [Hint. Find a 2 x 2
orthogonal matrix Q that has no eigenvalues, and then consider the block-diagonal
matrix diag(Q, Q, ... , Q).]
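Following the hint, a NumPy sketch with Q the rotation through 90 degrees (whose characteristic polynomial λ² + 1 has no real roots):

```python
import numpy as np

# The rotation through 90 degrees is orthogonal with no real eigenvalues:
Q2 = np.array([[0.0, -1.0], [1.0, 0.0]])

n = 6                                        # any n divisible by 2
Q = np.kron(np.eye(n // 2), Q2)              # diag(Q2, Q2, ..., Q2)
print(np.allclose(Q.T @ Q, np.eye(n)))       # True: Q is orthogonal
w = np.linalg.eigvals(Q)
print(np.all(np.abs(w.imag) > 1e-8))         # True: no eigenvalue is real
```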
EXERCISE 22. Let Q represent an n × n orthogonal matrix, and let p(λ) represent the characteristic polynomial of Q. Show that (for λ ≠ 0)
p(λ) = ±λⁿp(1/λ).
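A numerical check of the identity at a single (arbitrarily chosen) point λ = 0.7, assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(10)
n = 5
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # a random orthogonal matrix

# np.poly gives the coefficients of det(lambda*I - Q), highest degree first;
# this differs from the book's p(lambda) = |Q - lambda*I| by at most a sign.
p = np.polynomial.Polynomial(np.poly(Q)[::-1])     # ascending order

lam = 0.7
print(p(lam) / (lam**n * p(1.0 / lam)))            # +-1: p(l) = +-l^n p(1/l)
```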
Thus, making use of Theorem 13.3.4, Lemma 13.2.1, and Corollaries 13.2.4 and 13.3.6, we find that
EXERCISE 23. Let A represent an n × n matrix, and suppose that the scalar 1 is an eigenvalue of A of geometric multiplicity ν. Show that ν ≤ rank(A) and that if ν = rank(A), then A is idempotent.
|A|A⁻¹, it follows from the results of Section 21.10 that |A|/λ is an eigenvalue of adj(A) and that x is an eigenvector of adj(A) corresponding to |A|/λ.
EXERCISE 25. Let A represent an n × n matrix, and let p(λ) represent the characteristic polynomial of A. And, let λ₁, ..., λ_k represent the distinct eigenvalues of A, and γ₁, ..., γ_k represent their respective algebraic multiplicities, so that (for λ ∈ R)
p(λ) = (−1)ⁿ q(λ) ∏_{j=1}^{k} (λ − λ_j)^{γ_j}
for some polynomial q(λ) (of degree n − Σ_{j=1}^{k} γ_j) that has no real roots. Further, define B = A − λ₁UV', where U = (u₁, ..., u_{γ₁}) is an n × γ₁ matrix whose columns u₁, ..., u_{γ₁} are (not necessarily linearly independent) eigenvectors of A corresponding to λ₁.
[Hint. Since the left and right sides of equality (E.1) are polynomials (in λ), it suffices to show that they are equal for all λ other than λ₁, ..., λ_k.]
(b) Show that in the special case where U'V = cI for some nonzero scalar c, the distinct eigenvalues of B are either λ₂, ..., λ_{s−1}, λ_s, λ_{s+1}, ..., λ_k with algebraic multiplicities γ₂, ..., γ_{s−1}, γ_s + γ₁, γ_{s+1}, ..., γ_k, respectively, or (1 − c)λ₁, λ₂, ..., λ_k with algebraic multiplicities γ₁, γ₂, ..., γ_k, respectively [depending on whether or not (1 − c)λ₁ equals one of λ₂, ..., λ_k, say λ_s].
is an eigenvector of A corresponding to λ.
Solution. (a) Let λ represent any scalar other than λ₁, ..., λ_k. And, observe that
r(λ) = |A − λI − λ₁UV'|
= |A − λI| |I − λ₁(A − λI)⁻¹UV'|
= p(λ) ∏_{i=1}^{γ₁} [1 + λ₁(λ − λ₁)⁻¹v_i'u_i]
= (−1)ⁿ q(λ) ∏_{j=2}^{k} (λ − λ_j)^{γ_j} ∏_{i=1}^{γ₁} (λ − λ₁ + λ₁v_i'u_i)
= (−1)ⁿ q(λ) ∏_{i=1}^{γ₁} [λ − (1 − v_i'u_i)λ₁] ∏_{j=2}^{k} (λ − λ_j)^{γ_j}.
(b) Part (b) follows from Part (a) upon observing that, in this special case,
∏_{i=1}^{γ₁} [λ − (1 − v_i'u_i)λ₁] = [λ − (1 − c)λ₁]^{γ₁}.
(2) Suppose that γ₁ = 1, and let d = λ₁(λ₁ − λ)⁻¹v₁'x. Since (by definition) Bx = λx, we have that
Solution. Upon applying Theorem 21.12.1 with A' in place of A (and with n and m in place of m and n, respectively) and writing P = (P₁, P₂) for Q = (Q₁, Q₂) and Q = (Q₁, Q₂) for P = (P₁, P₂), we find that
for v = i,
for v ≠ i.
Thus,
so that
And, for t ≠ j,
and
U_tU_j' = (Σ_{v∈L_t} p_vq_v')(Σ_{i∈L_j} p_iq_i')' = Σ_{v∈L_t} Σ_{i∈L_j} p_vq_v'q_ip_i' = 0,
so that Q₁'Q₂ = 0. Thus,
rank(Q₂) = n − r = n − rank(A).
let A₁, ..., A_k represent k diagonalizable matrices of arbitrary order n that commute in pairs. Then, to complete the induction argument, it suffices to show that A₁, ..., A_k can be simultaneously diagonalized.
Let λ₁, ..., λ_r represent the distinct eigenvalues of A_k, and let ν₁, ..., ν_r represent their respective geometric multiplicities. Take Q_j to be an n × ν_j matrix whose columns are linearly independent eigenvectors of A_k corresponding to the eigenvalue λ_j or, equivalently, whose columns form a basis for the ν_j-dimensional linear space N(A_k − λ_jI) (j = 1, ..., r). And, define Q = (Q₁, ..., Q_r).
Since A_k is diagonalizable, it follows from Corollary 21.5.4 that Σ_{j=1}^{r} ν_j = n, so that (in light of Theorem 21.4.2) Q has n linearly independent columns and hence is nonsingular. Further,
(S.4)
(S.8)
(S.9)
Define S = diag(S₁, ..., S_r) and P = QS. Then, clearly, S is nonsingular, and hence P is also nonsingular. Further, using results (S.6), (S.9), and (S.4), we find that, for i = 1, ..., k − 1,
and that
P⁻¹A_kP = S⁻¹Q⁻¹A_kQS = diag(λ₁I_{ν₁}, ..., λ_rI_{ν_r}),
so that all k of the matrices A₁, ..., A_k are simultaneously diagonalized by the nonsingular matrix P.
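A numerical illustration in the symmetric case, assuming NumPy: two commuting symmetric matrices are built from a common orthogonal Q, and the eigenvectors of the first (whose eigenvalues are distinct) diagonalize both:

```python
import numpy as np

rng = np.random.default_rng(11)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
A1 = Q @ np.diag([1.0, 2.0, 3.0, 4.0]) @ Q.T    # distinct eigenvalues
A2 = Q @ np.diag([5.0, 5.0, -1.0, 2.0]) @ Q.T
print(np.allclose(A1 @ A2, A2 @ A1))            # True: A1 and A2 commute

_, P = np.linalg.eigh(A1)                       # orthogonal eigenvector matrix
for M in (A1, A2):
    D = P.T @ M @ P                             # P diagonalizes both matrices
    print(np.allclose(D, np.diag(np.diag(D))))  # True
```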
Solution. (a) Recall from Part (3) of Theorem 12.3.4 that P x is symmetric. Then,
as a consequence of Corollary 21.13.2, there exists an orthogonal matrix that
for Q = SDT. And, it follows from the results of Exercise 19.11 that X(X'X)⁻d is a solution [for every d ∈ C(X')] to the problem of minimizing a'Va subject to X'a = d.
Conversely, suppose that X(X'X)⁻d is a solution [for every d ∈ C(X')] to the problem of minimizing a'Va subject to X'a = d. Then, it follows from Part (a) that there exists an n × n orthogonal matrix Q that simultaneously diagonalizes P_X and V. That is, there exists an n × n orthogonal matrix Q such that Q'P_XQ = diag(d₁, ..., d_n) and Q'VQ = diag(f₁, ..., f_n) for some scalars d₁, ..., d_n and f₁, ..., f_n.
Further, it follows from Theorem 21.5.1 that the (not necessarily distinct) eigenvalues of P_X are d₁, ..., d_n and the (not necessarily distinct) eigenvalues of V are f₁, ..., f_n and that the first, ..., nth columns of Q are eigenvectors of P_X corresponding to d₁, ..., d_n, respectively, and are also eigenvectors of V corresponding to f₁, ..., f_n, respectively. And, since [according to Part (8) of Theorem 12.3.4] rank(P_X) = r, n − r of the eigenvalues of P_X are (in light of Lemma 21.1.1) equal to 0.
Now, let Q₁ represent the n × r matrix obtained from Q by deleting those columns that are eigenvectors of P_X corresponding to 0. Then, in light of Theorem 21.4.3 and Part (7) of Theorem 12.3.4, C(Q₁) = C(P_X) = C(X). And, since the columns of Q₁ are (orthonormal) eigenvectors of V, there exist r orthonormal eigenvectors of V that form a basis for C(X).
λ_min ≤ x'Ax/x'Bx ≤ λ_max
for every nonnull vector x in Rⁿ.
Solution. Let S represent any n × n nonsingular matrix such that B = S'S, let R = (S⁻¹)' (so that B⁻¹ = R'R), and let C = RAR'. Then, in light of result (14.7), λ_max and λ_min are, respectively, the largest and smallest eigenvalues of C.
λ_min ≤ y'Cy/y'y ≤ λ_max
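A numerical check of these bounds, assuming NumPy; B is made positive definite so that a factorization B = S'S with S nonsingular exists:

```python
import numpy as np

rng = np.random.default_rng(12)
n = 4
A = rng.standard_normal((n, n)); A = (A + A.T) / 2            # symmetric
M = rng.standard_normal((n, n)); B = M.T @ M + n * np.eye(n)  # positive definite

L = np.linalg.cholesky(B)      # B = LL', i.e., B = S'S with S = L'
R = np.linalg.inv(L)           # R = (S^{-1})', so C = RAR' as in the solution
d = np.linalg.eigvalsh(R @ A @ R.T)

# Every Rayleigh quotient x'Ax/x'Bx lies in [lambda_min, lambda_max]:
q = []
for _ in range(1000):
    x = rng.standard_normal(n)
    q.append((x @ A @ x) / (x @ B @ x))
print(d[0] <= min(q) and max(q) <= d[-1])   # True
```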
Solution. Let d₁, ..., d_n represent the n (not necessarily distinct) roots of |A − λB|. And, let S represent any n × n nonsingular matrix such that B = S'S, let R = (S⁻¹)' (so that B⁻¹ = R'R), and let C = RAR'. Then, in light of result (14.7), the (not necessarily distinct) eigenvalues of C are d₁, ..., d_n, and it follows from Corollary 21.5.9 that there exists an n × n orthogonal matrix P such that P'CP = D, where D = diag(d₁, ..., d_n).
Now, take Q = S'P. Then, according to results (14.2) and (14.1), A = QDQ'
and B = QQ'. Thus,
U and Y in W. Further, define X * Z = T(X) · T(Z) for all matrices X and Z in V. Show that the "*-operation" satisfies the four properties required of an inner product for V.
Solution. Observe (in light of Lemma 22.1.3) that N(T) = {0} and hence that T(X) = 0 if and only if X = 0. Then, letting X, Z, and Y represent arbitrary matrices in V and letting k represent an arbitrary scalar, we find that
(1) X * Z = T(X) · T(Z) = T(Z) · T(X) = Z * X;
(2) X * X = T(X) · T(X) > 0, if T(X) ≠ 0 or, equivalently, if X ≠ 0,
= 0, if T(X) = 0 or, equivalently, if X = 0;
(3) (kX) * Z = T(kX) · T(Z)
= [kT(X)] · T(Z) = k[T(X) · T(Z)] = k(X * Z); and
(4) (X + Z) * Y = T(X + Z) · T(Y)
= [T(X) + T(Z)] · T(Y)
= [T(X) · T(Y)] + [T(Z) · T(Y)] = (X * Y) + (Z * Y).
Solution. (a) Let c₁, ..., c_r represent any scalars such that Σ_{i=1}^{r} c_iT(X_i) = 0. Then, T(Σ_{i=1}^{r} c_iX_i) = Σ_{i=1}^{r} c_iT(X_i) = 0, implying that Σ_{i=1}^{r} c_iX_i is in N(T) and hence (since clearly Σ_{i=1}^{r} c_iX_i ∈ U) that Σ_{i=1}^{r} c_iX_i is in U ∩ N(T). Thus, Σ_{i=1}^{r} c_iX_i = 0, and (since X₁, ..., X_r are linearly independent) it follows that c₁ = ⋯ = c_r = 0. And, we conclude that T(X₁), ..., T(X_r) are linearly independent.
(b) Suppose that r = dim(U) and that U ⊕ N(T) = V. Then, making use of Theorem 22.1.1 and of Corollary 17.1.6, we find that
And, in light of Theorem 4.3.9 and the result of Part (a), it follows that T(X₁), ..., T(X_r) form a basis for T(V).
An alternative proof [of the result of Part (b)] can be obtained [in light of the result of Part (a)] by showing that the set {T(X₁), ..., T(X_r)} spans T(V). Continue to suppose that r = dim(U) and that U ⊕ N(T) = V. And, let Z₁, ..., Z_s represent any matrices that form a basis for N(T). Further, observe, in light of Theorem 17.1.5, that the r + s matrices X₁, ..., X_r, Z₁, ..., Z_s form a basis for V.
Now, let Y represent an arbitrary matrix in T(V). Then, Y = T(X) for some
(T + S)(X + Z) = T(X + Z) + S(X + Z)
= T(X) + T(Z) + S(X) + S(Z)
= T(X) + S(X) + T(Z) + S(Z)
= (T + S)(X) + (T + S)(Z),
and
Using results (2.12), (3.1), (3.3), (2.2), and (2.1), we find that
(kT)((1/k)T⁻¹) = (1/k)((kT)T⁻¹)
= (1/k)(k(TT⁻¹)) = (1/k)(kI) = [(1/k)k]I = 1·I = I.
(b) In light of the results of Exercise 7, it suffices to show that
And, upon comparing this expression for T(V_i) with expression (4.3), we find that the matrix representation of T with respect to B and C is the m × n matrix whose ijth element is T(V_j) · W_i.
EXERCISE 10. Let T represent the linear transformation from R^{m×n} into R^{n×m} defined by T(X) = X'. And, let C represent the natural basis for R^{m×n}, comprising the mn matrices U₁₁, U₂₁, ..., U_{m1}, ..., U_{1n}, U_{2n}, ..., U_{mn}, where (for i = 1, ..., m and j = 1, ..., n) U_{ij} is the m × n matrix whose ijth element equals 1 and whose remaining mn − 1 elements equal 0; and similarly let D represent the natural basis for R^{n×m}. Show that the matrix representation for T with respect to the bases C and D is the vec-permutation matrix K_{mn}.
Solution. Making use of results (4.11) and (16.3.1), we find that, for any m x n
matrix X,
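A NumPy sketch that constructs K_{mn} exactly as the permutation matrix carrying vec(X) to vec(X'); the helper names vec and K are ours, not the book's:

```python
import numpy as np

def vec(X):
    return X.flatten(order="F")              # stack the columns of X

def K(m, n):
    # The mn x mn permutation matrix with vec(X') = K(m, n) vec(X): entry
    # x_ij sits at position j*m + i of vec(X) and at i*n + j of vec(X').
    Kmn = np.zeros((m * n, m * n))
    for i in range(m):
        for j in range(n):
            Kmn[i * n + j, j * m + i] = 1.0
    return Kmn

X = np.arange(12.0).reshape(3, 4)
print(np.allclose(K(3, 4) @ vec(X), vec(X.T)))   # True
```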
EXERCISE 11. Let W represent the linear space of all p × p symmetric matrices, and let T represent a linear transformation from R^{m×n} into W. Further, let B represent the natural basis for R^{m×n}, comprising the mn matrices U₁₁, U₂₁, ..., U_{m1}, ..., U_{1n}, U_{2n}, ..., U_{mn}, where (for i = 1, ..., m and j = 1, ..., n) U_{ij} is the m × n matrix whose ijth element equals 1 and whose remaining mn − 1 elements equal 0. And, let C represent the usual basis for W.
(a) Show that, for any m x n matrix X,
(b) Show that the matrix representation of T (with respect to B and C) equals the p(p + 1)/2 × mn matrix
[vech T(U₁₁), ..., vech T(U_{m1}), ..., vech T(U_{1n}), ..., vech T(U_{mn})].
(G_n'G_n)⁻¹G_n'
(where G_n is the duplication matrix).
Solution. (a) Making use of result (3.5), we find that, for any mn x 1 vector x,
(b) For any m × n matrix X = {x_ij}, we find [using Part (a)] that
vech[T(X)] = vech[T(Σ_{i,j} x_ijU_{ij})]
= vech[Σ_{i,j} x_ijT(U_{ij})]
= Σ_{i,j} x_ij vech[T(U_{ij})]
= [vech T(U₁₁), ..., vech T(U_{m1}), ..., vech T(U_{1n}), ..., vech T(U_{mn})]vec(X).
And, in light of result (4.7), it follows that the matrix representation of T (with respect to B and C) equals the p(p + 1)/2 × mn matrix
[vech T(U₁₁), ..., vech T(U_{m1}), ..., vech T(U_{1n}), ..., vech T(U_{mn})].
(c) Using Part (a) and results (16.4.6), (16.3.1), and (16.4.15), we find that, for
any n x n matrix X,
And, in light of result (4.7), it follows that the matrix representation of T (with
respect to B and C) equals (G_n'G_n)⁻¹G_n'.
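A NumPy sketch of Part (c)'s ingredients; the duplication matrix G_n is built column by column (the helper names vech and G are ours), and (G_n'G_n)⁻¹G_n' is verified to carry vec(X) to vech(X) for symmetric X:

```python
import numpy as np

def vech(X):
    # Stack the on-and-below-diagonal elements of X, column by column:
    return np.concatenate([X[j:, j] for j in range(X.shape[0])])

def G(n):
    # Duplication matrix: G(n) @ vech(X) = vec(X) for symmetric n x n X.
    cols = []
    for j in range(n):
        for i in range(j, n):
            E = np.zeros((n, n))
            E[i, j] = E[j, i] = 1.0
            cols.append(E.flatten(order="F"))
    return np.array(cols).T                  # n^2 x n(n+1)/2

n = 3
X = np.arange(9.0).reshape(n, n); X = X + X.T        # symmetric
Gn = G(n)
print(np.allclose(Gn @ vech(X), X.flatten(order="F")))       # True
H = np.linalg.inv(Gn.T @ Gn) @ Gn.T                  # (Gn'Gn)^{-1} Gn'
print(np.allclose(H @ X.flatten(order="F"), vech(X)))        # True
```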
Solution. Lemma 3.2.4 implies that the set C is linearly independent and hence that C is a basis for V. Then, clearly, A⁻¹ is the matrix representation of the identity transformation I (from V onto V) with respect to C and B. And, it follows from Corollary 22.4.3 that (A⁻¹)⁻¹ is the matrix representation of I⁻¹ with respect to B and C and hence [since (A⁻¹)⁻¹ = A and I⁻¹ = I] that A is the matrix representation of I with respect to B and C.
EXERCISE 13. Let T represent the linear transformation from R^{4×1} into R^{3×1} defined by
where x = (x₁, x₂, x₃, x₄)'. Further, let B represent the natural basis for R^{4×1} (comprising the columns of I₄), and let E represent the basis (for R^{4×1}) comprising the four vectors (1, −1, 0, −1)', (0, 0, 1, 1)', (0, 0, 0, 1)', and (1, 1, 0, 0)'. And, let C represent the natural basis for R^{3×1} (comprising the columns of I₃), and F represent the basis (for R^{3×1}) comprising the three vectors (1, 0, 1)', (1, 1, 0)', and (−1, 0, 0)'.
(a) Find the matrix representation of T with respect to B and C.
(b) Find (1) the matrix representation of the identity transformation from R^{4×1} onto R^{4×1} with respect to E and B and (2) the matrix representation of the identity transformation from R^{3×1} onto R^{3×1} with respect to C and F.
(c) Find the matrix representation of T with respect to E and F via each of two
approaches: (1) a direct approach, using the equality
Solution. (a) Let A represent the matrix representation of T with respect to B and C, and denote the first, ..., 4th columns of I₄ by e₁, ..., e₄, respectively. Then, in light of the discussion in Part 1 of Section 4b, we find that
(b) (1) In light of the discussion in Part 3 of Section 4b, the matrix representation of the identity transformation from R^{4×1} onto R^{4×1} with respect to E and B is the 4 × 4 matrix whose first, ..., 4th columns are the vectors that form E, that is, the 4 × 4 matrix
( 1   0   0   1 )
( −1  0   0   1 )
( 0   1   0   0 )
( −1  1   1   0 ).
(2) Let S represent the 3 × 3 nonsingular matrix whose inverse S⁻¹ is the matrix representation of the identity transformation from R^{3×1} onto R^{3×1} with respect to C and F. Then, in light of Part 3 of Section 4b,
( 1  1  −1 )
( 0  1   0 ) S⁻¹ = I₃,
( 1  0   0 )
so that
S⁻¹ = ( 0   0  1 )
      ( 0   1  0 )
      ( −1  1  1 ).
(c) (1) Using the equality given in the statement of Part (c) and the results of Parts (a) and (b), direct multiplication yields the matrix representation H of T with respect to E and F. [The displays of this computation, including the entries of the 3 × 4 matrix H, are garbled in the source.]
(2) The matrix S [from Part (b)] is (in light of Corollary 22.4.3) the matrix representation of the identity transformation from R^{3×1} onto R^{3×1} with respect to F and C. Thus, in light of the results of Parts (a) and (b), it follows from the
[The displays of this computation are likewise garbled in the source; it yields the same 3 × 4 matrix H as in Part (1).]
(d) It is clear from Part (c) that the rank of the matrix representation of T with respect to E and F equals 2. Thus, it follows from the first result cited (which is Theorem 22.5.2) that rank(T) = 2 and from the second result cited (which is part of Corollary 22.5.3) that dim[N(T)] = 4 − 2 = 2.
X"(kS)(Y) = X"[kS(Y)]
= k[X"S(Y)] = k[T(X) * Y] = [kT(X)] * Y = (kT)(X) * Y.
Thus, kS is the dual transformation of kT.
(e) For every matrix X in V and every matrix Y in W,
and
implying that
b_ji = X_j · S(Y_i) = T(X_j) * Y_i = a_ij
and hence that the jith element of B equals the jith element of A'. Thus, B = A'.
(a) Show that [S(W)]⊥ = N(T) (i.e., that the orthogonal complement of the range space of S equals the null space of T).
(b) Using the result of Part (c) of Exercise 17 (or otherwise), show that [N(S)]⊥ = T(V) (i.e., that the orthogonal complement of the null space of S equals the range space of T).
(c) Show that rank(S) = rank(T).
Solution. (a) Write X · Z for the inner product of arbitrary matrices X and Z in V and U * Y for the inner product of arbitrary matrices U and Y in W.
Let X represent an arbitrary matrix in V. Suppose that X ∈ N(T). Then, for every matrix Y in W,
Thus, X ∈ [S(W)]⊥.
Conversely, suppose that X ∈ [S(W)]⊥. Then, since T(X) ∈ W,
(c) Making use of Corollary 22.5.3 and Theorem 12.5.12 [together with the result of Part (a)], we find that